1. Introduction
Proteins are hardworking macromolecules that perform a variety of functions in cells. They are made up of chains of 50 to 2000 amino acids, usually folded into a well-defined three-dimensional structure. Since proteins interact with each other by forming transient protein-protein complexes, their shape (such as shape complementarity at the interface) plays an important role in their functions.
In this paper, three mathematical models of proteins are presented in the following order: the loop model, the flow model, and the cohomological model (Figure 1).
First, in Section 2, we explain the problem intuitively using a loop model. In particular, we give the intuitive definition of protein interactions and allosteric triplets. For example, “protein interaction is defined as fusions of loops”.
Next, in Section 3, we consider conditions for protein interaction and allosteric regulation using the flow model. The flow model is a differential geometric model [1], where proteins are represented as a closed trajectory of a flow of triangles (2D model) or tetrahedra (3D model). Then, we can define conditions for protein interaction and allosteric regulation using the language of differential geometry [2]. For example, “two proteins interact if the corresponding local flow is integrable”.
Finally, in Section 4, we rephrase the conditions obtained in Section 3 using the cohomological model. For example, “two proteins interact if the cohomology class of the corresponding vector field is zero”. Cohomology classes of vector fields are defined using exterior derivative operators.
“Allosteric regulation” and “post-translational modification” are briefly introduced in Subsections 2.3 and 3.6, respectively. In the flow/cohomological model, “allosteric regulation” corresponds to the integrability of a flow, and “post-translational modification” corresponds to a resolution of singularities of a flow (Figure 2).
As for previous work, the author is unaware of any other geometrical models of allosteric regulation or of any applications of cohomology to protein structure analysis. For an overview of the history of allostery, see [3]. For an application of cohomology to biological time series, see [4].
For the sake of clarity, we mainly consider loops of triangles and triangular meshes. In the following, we denote the set of all integers by Z.
Figure 1. The mathematical models. (a) Protein (Polyline); (b) The loop model; (c) The flow/cohomological model.
Figure 2. Advantages of the flow/cohomological model. (a) Integrability of a flow. The arrow indicates a conflict between two distal triangles across two “not-form-a-loop” triangles; (b) Resolution of singularities of a flow. By splitting each of the “not-form-a-loop” triangles into two, a loop of length four (dark grey) is obtained.
2. Loop Model of Proteins
In the loop model (Figure 1(b)), amino acid sequences are represented as a closed chain of triangles (2D model) or tetrahedra (3D model).
2.1. Loops and Their Normal Edges
A chain u of triangles is a series of triangles connected by a common edge, i.e., for some interval
,
(1)
A closed chain u is called a loop.
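Let $u$ be a chain of triangles. The underlying mesh $M(u)$ of $u$ is a triplet defined by
$M(u) = (T(u), E(T(u)), V(T(u)))$, (2)
where $T(u)$ is the set of all triangles in $u$, $E(T(u))$ is the set of all edges of the triangles in $T(u)$, and $V(T(u))$ is the set of all vertices of the triangles in $T(u)$. (3)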
In this paper, we consider only the chains such that (1) two triangles in T(u) do not intersect except at an edge, and (2) the edges in E(T(u)) are shared by at most two triangles in T(u). In particular,
, (4)
where
denotes the set of edges in
shared by k triangles in X. Edges in
are called the boundary edges of
.
Let
be a chain of triangles. Let
. The normal edge N(t) of t is the edge that is not shared with the adjacent triangles along u. (In Figure 1(c), the normal edges are drawn with thick line segments.) If u is a loop, each triangle in u has exactly one normal edge. If u is not a loop, the endpoint triangles have two normal edges. If u consists of one triangle, say
, t has three normal edges. We denote the set of all normal edges of triangles in u by N(u):
. (5)
Remark 2.1. The normal edge to a chain is a discrete version of the “normal vector” to a curve.
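The definition of normal edges can be made concrete with a short Python sketch. It assumes a triangle is encoded as a triple of vertex labels and a chain as a list of such triples; the function names `edges_of` and `normal_edges` and the hexagonal example are illustrative choices introduced here, not notation from the model.

```python
from itertools import combinations


def edges_of(triangle):
    """Return the three undirected edges of a triangle given as a vertex triple."""
    return {frozenset(pair) for pair in combinations(triangle, 2)}


def normal_edges(chain, closed=False):
    """For each triangle in the chain, return the edges that are not shared
    with its neighbours along the chain (its normal edges)."""
    n = len(chain)
    result = []
    for i, tri in enumerate(chain):
        neighbours = []
        if closed or i > 0:
            neighbours.append(chain[(i - 1) % n])
        if closed or i < n - 1:
            neighbours.append(chain[(i + 1) % n])
        shared = set()
        for other in neighbours:
            shared |= edges_of(tri) & edges_of(other)
        result.append(edges_of(tri) - shared)
    return result


# Hypothetical example: six triangles arranged as a fan around vertex 0.
hexagon = [(0, k, k % 6 + 1) for k in range(1, 7)]
loop_normals = normal_edges(hexagon, closed=True)        # a loop
open_normals = normal_edges(hexagon[:3], closed=False)   # an open chain
single_normals = normal_edges(hexagon[:1], closed=False) # one triangle
```

In this example every triangle of the loop has exactly one normal edge (its outer edge), the endpoint triangles of the open chain have two, and the single triangle has three, in agreement with the cases listed above.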
The pair
of
and
is called the local flow of u. Note that we can recover u uniquely from
by connecting triangles in
along the edges in
, i.e.,
Lemma 2.2. Let
be a chain of triangles. Then, there is a one-to-one correspondence
. (6)
In the same way, a chain u of tetrahedra is defined as a series of tetrahedra connected by a common face. The normal edge of a tetrahedron in a chain is defined as the edge that is not shared with the two adjacent tetrahedra along the chain.
2.2. Loop Interaction
Let
and
be two chains of triangles. Binary set operations between
and
are defined by
(7)
where
stands for a set operation such as
, and others. Binary relations between
and
are defined by
if and only if
(8)
where
stands for a binary relation such as
, and others.
Remark 2.3. In this paper, we suppose that
for k > 2. For example, we consider
only for
and
with no overlap.
Definition 2.4. (Reaction Intermediate) Let
be loops of triangles.
is called loop-transitive if there are loops
such that
(9)
Then, we write
. (10)
u is called a reaction intermediate generated from
.
Remark 2.5. If u is a reaction intermediate generated from
, then
has the same contour as
, but has internal holes occupied by
of loops
.
Remark 2.6.
is called strictly loop-transitive if
, i.e., there is a loop u such that
.
In nature, proteins interact with each other by forming “transient protein complexes”. In the loop model, interactions between loops are defined as fusions of loops.
Definition 2.7. (Loop Interaction) Loops
are called interactable if there is a reaction intermediate generated from
.
Example 2.8. In Figure 3,
and
are interactable and their reaction intermediate is shown in the upper middle of the figure. On the other hand,
and
are not interactable. For example, the “intermediate” shown in the lower middle of the figure encloses an “open” trajectory of length 6.
2.3. Allosteric Regulation in the Loop Model
“Allosteric regulation” is a type of interaction between two distal sites of a protein. For example, in the case of allosteric enzymes or receptors, the interaction of a protein with a molecule or protein (called “regulator”) at an “active” site is affected by the binding of another molecule (called “effector”) at a remote “allosteric” site. Allosteric sites have attracted attention as targets for safer drug discovery [5]. However, allosteric site prediction remains challenging [6]. Shown in Figure 3 is a schematic diagram of allosteric regulation with three loops, where
interacts with “regulator”
only after interaction with “effector”
.
Remark 2.9. In the loop model, molecules are also represented as a loop of triangles or tetrahedra.
Definition 2.10. (Allosteric Triplet) Let
,
, and
be loops of triangles. The triplet
of loops is called an allosteric triplet if there are two reaction intermediates
such that
(11)
Figure 3. Allosteric triplet.
and
generate a reaction intermediate
(upper middle).
and
do not generate any reaction intermediate (lower middle).
and
form a reaction intermediate
(right).
Example 2.11. Loops
,
,
in Figure 3 form an allosteric triplet.
Remark 2.12. When effectors bind to proteins, they often change the conformation of the protein. Since the loop model is a “topological” model, such conformational changes are considered to be absorbed into deformations of the triangles in the mesh.
3. Flow Model of Proteins
In the flow model (Figure 1(c)), amino acid sequences are described as closed trajectories in a flow of triangles (2D model) or tetrahedra (3D model).
3.1. Meshes
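In this paper, a triangular mesh is a collection of triangles connected by a common edge. We write a triangular mesh M as a triplet of sets, i.e.,
$M = (T, E(T), V(T))$, (12)
where $T$ is a set of triangles, $E(T)$ is the set of all edges of the triangles in $T$, and $V(T)$ is the set of all vertices of the triangles in $T$. (13)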
In this paper, we consider only the meshes such that (1) two triangles in T do not intersect except at an edge, and (2) the edges in E(T) are shared by at most two triangles in T. In particular,
, (14)
Edges in
are called the boundary edges of M. Binary set operations and binary relations between
and
are defined in the same way as for
and
.
Definition 3.1. (Reaction Intermediate) Let
be a triangular mesh. M is called loop-transitive if there are loops
such that
(15)
Then, we write
. (16)
u is called a reaction intermediate generated from M.
Remark 3.2. M is called strictly loop-transitive if
, i.e., there is a loop u such that
.
Example 3.3. Shown in the upper middle of Figure 3 is a reaction intermediate generated from
. Shown in the right of Figure 3 is a reaction intermediate generated from
.
In the same way, a tetrahedral mesh is a set of tetrahedra connected by a common face. We write a tetrahedral mesh M as a quartet of sets, i.e.,
$M = (F, T(F), E(F), V(F))$, (17)
where F is a set of tetrahedra, T(F) is the set of all faces (i.e., triangles) of the tetrahedra in F, E(F) is the set of all edges of the tetrahedra in F, and V(F) is the set of all vertices of the tetrahedra in F.
3.2. Directed Elements of a Mesh
Let
be a triangular mesh. To specify flows on M, we consider directed elements of M, i.e., directed vertices, directed edges, and directed triangles.
Since vertices have no direction, the set
of all directed vertices in M is defined by
. (18)
Edges in M are denoted by two endpoints, i.e., the edge joining vertices
and
is denoted by
. We make a distinction between two edges
and
, where
is an edge with the direction from vertex
to vertex
. The corresponding edge in
is denoted by
, i.e.,
. Then, the set
of all directed edges in M is defined by
. (19)
Triangles in M are denoted by three vertices, i.e., the triangle with three vertices
,
, and
is denoted by
. We make a distinction between triangles with different order of vertices. For example,
, where
is a triangle with the ordered triplet
. The corresponding triangle in T is denoted by
, i.e.,
. Then, the set
of all directed triangles in M is defined by
. (20)
3.3. Local Flows on a Mesh
Let M be a triangular mesh. A local flow on M is defined as a subset N of
. Elements of N are called the normal edges of the local flow. We often denote a local flow N on M by
(i.e., as a pair of M and N). Binary relations between
and
are defined by
if and only if
and
. (21)
where
stands for a binary relation such as
,
, and others.
Example 3.4. Shown in Figure 1(c) is part of a local flow, where the normal edges are drawn with thick line segments.
Remark 3.5. Recall that a chain u of triangles can be recovered from the underlying mesh
by giving the set
of its normal edges (Lemma 2.2).
To perform “differential geometric analysis” of a local flow
, we assign “gradient” to the edges in
. First, we assign a “height” to the vertices in
. Then, the “gradient” of the edges in
is computed as the difference of the height along the edge. Finally, the normal edge of a triangle in
is defined as the “steepest” edge of the triangles.
The height
of a vertex
is an integer-valued function defined on
, i.e.,
. (22)
Remark 3.6. Vertices are considered to be lifted vertically from the mesh M through to the “height”.
The gradient
of a directed edge
with respect to
is the difference in the height function
along the edge, i.e.,
(23)
Note that
.
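As a small illustration of the gradient, the sketch below stores the height as a Python dictionary and a directed edge as a `(tail, head)` pair. The sign convention (height at the head minus height at the tail) and the names `gradient` and `height` are assumptions made for this example.

```python
def gradient(height, edge):
    """Difference of the integer height along a directed edge (tail, head)."""
    tail, head = edge
    return height[head] - height[tail]


# Hypothetical heights on the three vertices of one triangle.
height = {"a": 0, "b": 2, "c": 1}
forward = gradient(height, ("a", "b"))
backward = gradient(height, ("b", "a"))
```

Reversing the direction of an edge flips the sign of its gradient, so the gradient is an anti-symmetric function on directed edges.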
Let
be a Z-valued function on
. The normal edge
of
with respect to
is defined as the steepest (positive) edge of the triangle with respect to
, i.e.,
,
(24)
where
.
Remark 3.7. We select edges with positive values as the normal edge of a triangle.
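The selection of the steepest positive edge can be sketched as follows. This is one possible reading, assuming that the candidates are the six directed edges of the triangle and that every edge attaining the largest positive value counts as a normal edge; the helper names `gradient_of` and `normal_edges_of_triangle` are hypothetical.

```python
from itertools import permutations


def gradient_of(height):
    """Build an edge function from a height (assumed convention: head minus tail)."""
    def beta(edge):
        tail, head = edge
        return height[head] - height[tail]
    return beta


def normal_edges_of_triangle(beta, triangle):
    """Return the directed edges of `triangle` on which beta attains its
    largest value, provided that value is positive; otherwise return []."""
    directed = list(permutations(triangle, 2))  # all six directed edges
    steepest = max(beta(e) for e in directed)
    if steepest <= 0:
        return []
    return [e for e in directed if beta(e) == steepest]


triangle = ("a", "b", "c")
regular = normal_edges_of_triangle(gradient_of({"a": 0, "b": 2, "c": 1}), triangle)
tied = normal_edges_of_triangle(gradient_of({"a": 0, "b": 1, "c": 1}), triangle)
flat = normal_edges_of_triangle(gradient_of({"a": 0, "b": 0, "c": 0}), triangle)
```

Under these assumptions, the first triangle has a single steepest edge, the second has two edges tied for steepest, and the third has none because no edge has a positive value.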
We denote the set of all normal edges of M with respect to
by
, i.e.,
. (25)
The corresponding edges in
are denoted by
, i.e.,
. (26)
A triangle with exactly one normal edge is called regular. A triangle with no normal edge or with two or more normal edges is called singular. A local flow N is called regular if every triangle is regular. For example, a local flow corresponding to a loop (i.e., $N = N(u)$ for some loop $u$) is regular.
Remark 3.8. Triangles may have multiple normal edges because some edges are shared by two triangles.
A local flow
is called differentiable if there is a Z-valued function
on
such that
.
is called the vector field of N and denoted by
.
A differentiable local flow
is called integrable if there is a Z-valued function
on
such that
.
is called a potential function of N and denoted by
.
A differentiable local flow
is called 2-bounded if
. (27)
Proposition 3.9. Let
be a 2-bounded differentiable local flow. Then,
is regular if
. (28)
Proof. Let
. Since
is differentiable,
,
, and
for some
such that
. That is,
. On the other hand,
and
are not contained in
because
. ∎
Remark 3.10.
is called the circulation of
around a triangle
.
Example 3.11. By piling unit cubes up diagonally in the direction of
, we obtain a regular local flow of triangles on the surface of the piled cubes (Figure 4). The normal edges are the vertical diagonals of the unit cubes. That is, each upper face of a unit cube is divided into two triangles by the vertical diagonal. Then, connecting triangles along the vertical diagonals, we obtain a flow on the surface of piled cubes. A triangular mesh M resides on the hyperplane
and the height of a point
over
(29)
is given by
. (30)
Figure 4. Examples of local flows. (a) Top view of five local flows obtained by piling unit cubes. The normal edges are drawn in thick lines. If one more cube is put on the surface, the local flow will change as indicated by the arrows; (b) The regular flow shown in the upper middle of (a); (c) The non-regular flow shown in the lower middle of (a); (d) The singular triangles indicated by the S-shaped arrows in (a) and (c). Note that they are not obtained by dividing faces of a unit cube.
Note that the local flow of Figure 4(c) is differentiable but not 2-bounded because of the triangles pointed to by the S-shaped arrow. On the other hand, the local flow of Figure 4(b) is differentiable and 2-bounded.
3.4. Interaction of Loops in a Flow
Let
be a local flow. A trajectory of
is a chain u of triangles such that
and
. (31)
A closed trajectory of
is called a loop of
. A trajectory is called maximal if it cannot be extended further within M. Let
be a set of trajectories of
.
is called a flow of
if
. (32)
Let
be a local flow, where
.
is called closed if
, i.e., the boundary edges of T are normal edges.
is called finite if T consists of finitely many triangles.
Proposition 3.12. Let
be a closed finite local flow. Then,
(33)
for some loops
of
if N is regular.
Proof. Since
, trajectories of
do not cross the boundary of M. Since N is regular, maximal trajectories of
have no endpoint. The result follows immediately. ∎
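The argument of Proposition 3.12 can be checked on small examples with the following Python sketch. It assumes that trajectories are obtained by gluing triangles across every shared edge that is not a normal edge; the encoding of triangles as vertex triples and the names `trajectories` and `is_loop` are illustrative and not taken from the text.

```python
from itertools import combinations


def edges_of(triangle):
    """Return the three undirected edges of a triangle given as a vertex triple."""
    return {frozenset(pair) for pair in combinations(triangle, 2)}


def trajectories(triangles, normal):
    """Group triangle indices into maximal trajectories by gluing neighbours
    across every shared edge that is not a normal edge."""
    owners = {}  # non-normal edge -> indices of the triangles containing it
    for index, tri in enumerate(triangles):
        for edge in edges_of(tri) - normal:
            owners.setdefault(edge, []).append(index)
    adjacent = {index: set() for index in range(len(triangles))}
    for pair in owners.values():
        if len(pair) == 2:
            a, b = pair
            adjacent[a].add(b)
            adjacent[b].add(a)
    seen, components = set(), []
    for start in adjacent:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacent[node] - component)
        seen |= component
        components.append(component)
    return components, adjacent


def is_loop(component, adjacent):
    """A trajectory is closed when every triangle is glued to two neighbours."""
    return all(len(adjacent[i]) == 2 for i in component)


# Hypothetical closed regular flow: a hexagonal fan whose outer edges are normal.
hexagon = [(0, k, k % 6 + 1) for k in range(1, 7)]
outer = {frozenset({k, k % 6 + 1}) for k in range(1, 7)}
loops, glue = trajectories(hexagon, outer)

# Moving one normal edge from the boundary to a spoke makes the flow non-regular.
broken = (outer - {frozenset({1, 2})}) | {frozenset({0, 1})}
paths, broken_glue = trajectories(hexagon, broken)
```

With the outer edges as normal edges the six triangles form a single closed trajectory. After one normal edge is moved to a spoke, the flow is neither closed nor regular, and the trajectory acquires two endpoints.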
Definition 3.13. (Loop Interaction) Let
be loops of
.
are called interactable if there is a reaction intermediate generated from
.
Example 3.14. Shown in the upper middle of Figure 3 is a reaction intermediate generated from
. Shown in the right of Figure 3 is a reaction intermediate generated from
.
3.5. Allosteric Regulation in the Flow Model
Definition 3.15. (Reaction Precursor) Let
be a triangular mesh. M is called pre-loop-transitive if there is a loop u such that
(34)
Then, we write
. (35)
u is called a reaction precursor generated from M. By definition, reaction intermediates are reaction precursors.
Remark 3.16. If u is a reaction precursor of M, then M(u) may have internal holes with singular triangles inside.
Example 3.17. Shown in the lower middle of Figure 3 is a reaction precursor generated from
. It has a hole with two singular triangles, i.e., the endpoints of the “open” trajectory of length 6.
Proposition 3.18. Let M be a triangular mesh. Suppose that there is a reaction precursor u generated from M of finite length, i.e.,
. Let
be a 2-bounded differentiable local flow such that
. (36)
Then, u is a reaction intermediate (i.e., M is loop-transitive) if
is integrable.
Proof. Since
is integrable, there is a potential function
such that
. Then,
. (37)
The result follows from Proposition 3.9 and Proposition 3.12 immediately. ∎
Definition 3.19. (Pre-Allosteric Triplet) Let
,
, and
be loops of
. The triplet
of loops is called a pre-allosteric triplet if there are two reaction precursors
such that
(38)
Corollary 3.20. (Conditions for loop interaction) Let
be a local flow. Let
and
be loops of
. Suppose that there is a reaction precursor u generated from
and
of finite length, i.e.,
. (39)
Let
be a 2-bounded differentiable local flow such that
. (40)
Then,
and
are interactable if
is integrable.
Corollary 3.21. (Conditions for allosteric triplet) Let
be a pre-allosteric triplet such that
and
(41)
where
and
are reaction precursors of finite length. Let
and
be 2-bounded differentiable local flows such that
and
. (42)
Then,
is an allosteric triplet if
and
are integrable.
3.6. Post-Translational Modification in the Flow Model
“Post-translational modifications (PTMs)” are biochemical modifications of the side chains of amino acids within a protein after their biosynthesis. They have a significant impact on the structure and function of proteins. For example, they play critical roles in regulating the stability of the 3D structure of proteins and their interactions with other molecules and proteins. In particular, the analysis of PTMs is important for the study of diseases, such as heart disease, cancer, and diabetes [7].
In the flow model, the effect of PTMs can be understood as a resolution of singularities of a local flow. That is, PTMs control the stability of a loop by inducing a resolution of the nearby singularity of the local flow.
It is helpful to consider the curvature of vertices to understand intuitively the geometric effects of singularity resolutions. Let $M = (T, E(T), V(T))$ be a triangular mesh. Let $v \in V(T)$. The curvature $K(v)$ of $v$ is defined by
$K(v) = k - 6$, (43)
where k is the number of edges incident on v.
Remark 3.22. In comparison to the continuous version of geometry, K corresponds to the “Gaussian curvature” of a surface. For example, a “saddle point” has a negative curvature, and a “point on a hemisphere” has a positive curvature.
Example 3.23. For triangular meshes obtained by piling unit cubes (Example 3.11), the curvature is zero at all vertices (Figure 4).
Example 3.24. In Figure 1(c), a loop encloses two singular triangles. By splitting each of the singular triangles into two, we obtain a loop of length four (Figure 2(b), dark grey). The curvatures of the vertices of the singular triangles increase by 1 to become positive or zero (i.e., saddle-point). On the other hand, the new vertex obtained in the center has negative curvature −2 (i.e., point on a hemisphere).
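The vertex curvature can be computed directly from a list of triangles. The sketch below assumes that the curvature of a vertex with k incident edges is k − 6, which is the reading suggested by the value −2 reported for the new center vertex in Example 3.24; the function name `curvature` and the two fan-shaped meshes are hypothetical.

```python
from itertools import combinations


def curvature(triangles):
    """Return k - 6 for every vertex, where k counts the edges incident on it
    (assumed form of the curvature)."""
    edges = set()
    for tri in triangles:
        edges |= {frozenset(pair) for pair in combinations(tri, 2)}
    degree = {}
    for edge in edges:
        for vertex in edge:
            degree[vertex] = degree.get(vertex, 0) + 1
    return {vertex: k - 6 for vertex, k in degree.items()}


# Center vertex 0 surrounded by six triangles (flat) and by four triangles.
hexagon = [(0, k, k % 6 + 1) for k in range(1, 7)]
square = [(0, k, k % 4 + 1) for k in range(1, 5)]
flat_center = curvature(hexagon)[0]
cone_center = curvature(square)[0]
```

Only the values at interior vertices are meaningful in these small examples, since the outer vertices lie on the boundary of the mesh and are missing some of their incident edges.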
4. Cohomological Model of Proteins
In the cohomological model (Figure 1(c)), amino acid sequences are also described as closed trajectories in a flow of triangles (2D model) or tetrahedra (3D model) as in the case of the flow model. Here, we define “cohomology classes” of vector fields on a triangular mesh and rephrase conditions for “interaction” and “allosteric regulation” of loops using the language of cohomology.
4.1. Functions on a Mesh
Let M be a triangular mesh. We denote the set of all “anti-symmetric” assignments of integers to the vertices, edges, and triangles in M by
,
, and
, respectively, i.e.,
, (44)
, (45)
(46)
Elements of
are called scalar functions (or potential functions) on M. Elements of
are called vector fields on M.
Remark 4.1. In comparison to the continuous version of geometry,
corresponds to “scalar functions (or potential functions)” on a space and
corresponds to “vector fields” on a space.
Let
be a subset of
defined by
, (47)
where
(if M is a triangular mesh) or
(if M is a tetrahedral mesh). A vector field
is called n-bounded if
.
Remark 4.2. In comparison to the continuous version of geometry,
corresponds to “differentiable vector fields” on a space.
4.2. Exterior Derivative Operator and Cohomology
Let M be a triangular mesh. Now let’s define “differentials” of functions on M.
The discrete exterior derivative
is a mapping from
to
defined by
, (48)
(49)
(Figure 5). Note that
and
.
Remark 4.3.
computes the difference of
along an edge. On the other hand,
computes the circulation of
around a triangle.
Let
.
is called integrable if
for some
.
Lemma 4.4.
for any
.
Proof. It follows immediately from the definitions. ∎
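Lemma 4.4 can be checked numerically on a single triangle. The sketch below assumes the usual conventions suggested by Remark 4.3: the derivative of a vertex function is its difference along a directed edge, and the derivative of an edge function is its circulation around a directed triangle. The Python names `d0` and `d1` and the sample values are illustrative.

```python
def d0(alpha):
    """Derivative of a vertex function: difference along a directed edge (tail, head)."""
    return lambda edge: alpha[edge[1]] - alpha[edge[0]]


def d1(beta):
    """Derivative of an edge function: circulation around a directed triangle."""
    def circulation(triangle):
        v0, v1, v2 = triangle
        return beta((v0, v1)) + beta((v1, v2)) + beta((v2, v0))
    return circulation


triangle = ("a", "b", "c")

# An edge function that comes from a potential has zero circulation.
alpha = {"a": 3, "b": -1, "c": 7}
exact_circulation = d1(d0(alpha))(triangle)

# A hypothetical "rotating" edge function does not.
rotating = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1}
rotating_circulation = d1(lambda edge: rotating[edge])(triangle)
```

The three differences of a potential cancel around any triangle, which is the content of the lemma, while the rotating example shows an edge function that cannot be written as a difference of heights.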
The discrete exterior co-derivative
is a mapping from
to
defined by
, (50)
Figure 5. Discrete exterior derivative/co-derivative operators. In the figure,
represents edge
and
represents triangle
.
, (51)
where
and
are the triangles that share the edge
(Figure 5). Note that
.
Remark 4.5.
computes the divergence of
at a vertex, i.e., the sum over the outbound arrows. On the other hand,
computes the difference of
at a common edge.
Example 4.6. Figure 6(a) is a computation of
for α on the left. Figure 6(b) is a computation of
and
for β in the center.
Remark 4.7. To learn more about discrete exterior derivative/co-derivative operators, see [8] or [9].
To define cohomology of vector fields on M, we consider a short sequence of sets given by
. (52)
Two subsets of
are defined using the sequence, i.e.,
, (53)
. (54)
Lemma 4.8.
Proof. It follows immediately from Lemma 4.4. ∎
Because of Lemma 4.8, we can consider the quotient set of
by
.
Definition 4.9. (Cohomology Class) The (first) cohomology set
is defined by
. (55)
Let
. The equivalence class
is called the cohomology class of β.
Figure 6. Computation examples of discrete exterior derivative/co-derivative operators. In the figure,
represents edge
and
represents triangle
.
Lemma 4.10. If
, then
for some
.
Proof. It follows immediately from the definitions. ∎
4.3. Cohomological Conditions for Allosteric Regulation
Let’s rephrase some of the definitions given in Subsection 3.3. Let M be a triangular mesh. Then,
1) A local flow
is differentiable if
for some
,
2) A differentiable local flow
is integrable if β is integrable (i.e.,
for some
),
3) A differentiable local flow
is 2-bounded if
.
Lemma 4.11. Let
be a differentiable local flow. Then,
if
is integrable.
Proof. It follows immediately from Lemma 4.4. ∎
Then, we obtain a cohomological description of conditions for loop interaction.
Proposition 4.12. Let M be a triangular mesh. Suppose that there is a reaction precursor u generated from M of finite length, i.e.,
. Let
be a 2-bounded differentiable local flow such that
. (56)
Then, u is a reaction intermediate (i.e., M is loop-transitive) if the cohomology class of β is zero.
Proof. It follows immediately from Proposition 3.18. ∎
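In practice, deciding whether the cohomology class of a vector field is zero amounts to searching for a potential function. The following Python sketch uses a standard approach that is not taken from the text: it integrates the edge values outward from a base vertex by breadth-first search and then verifies every edge. The function name `find_potential`, the dictionary encoding of the vector field, and the choice of 0 as the base value are assumptions made for illustration.

```python
from collections import deque


def find_potential(vertices, beta):
    """Try to find alpha with beta[(v, w)] == alpha[w] - alpha[v] on every edge.

    `beta` maps directed edges (v, w) to integers; giving one direction per
    edge is enough. Returns alpha as a dict, or None if no potential exists.
    """
    # Complete beta anti-symmetrically so that edges can be walked both ways.
    full = dict(beta)
    for (v, w), value in beta.items():
        full.setdefault((w, v), -value)
    neighbours = {v: [] for v in vertices}
    for (v, w) in full:
        neighbours[v].append(w)
    alpha = {}
    for root in vertices:  # one base vertex per connected component
        if root in alpha:
            continue
        alpha[root] = 0
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for w in neighbours[v]:
                if w not in alpha:
                    alpha[w] = alpha[v] + full[(v, w)]
                    queue.append(w)
    consistent = all(alpha[w] - alpha[v] == value
                     for (v, w), value in full.items())
    return alpha if consistent else None


vertices = ["a", "b", "c"]
exact = {("a", "b"): 2, ("b", "c"): -1, ("c", "a"): -1}
rotating = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1}
potential = find_potential(vertices, exact)
no_potential = find_potential(vertices, rotating)
```

A potential is unique only up to an additive constant on each connected component, so a non-`None` result is one representative among many.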
Corollary 4.13. (Conditions for loop interaction)
Let
be a local flow. Let ua and ub be loops of
. Suppose that there is a reaction precursor u generated from ua and ub of finite length, i.e.,
. (57)
Let
be a 2-bounded differentiable local flow such that
. (58)
Then, ua and ub are interactable if the cohomology class of β is zero.
Corollary 4.14. (Conditions for allosteric triplet)
Let
be a pre-allosteric triplet such that
and
. (59)
Let
and
be 2-bounded differentiable local flows such that
and
. (60)
where
and
are reaction precursors of finite length.
Then,
is an allosteric triplet if the cohomology classes of
and
are zero.
5. Conclusions
We have considered protein interactions from the viewpoint of cohomology theory, using two-dimensional toy models of proteins. As a specific example, cohomological conditions for allosteric regulation are presented. In this paper, proteins are represented as loops of triangles and protein interactions are represented as fusions of loops. Then, cohomology classes of vector fields on proteins (i.e., a triangular mesh) are defined using discrete exterior operators.
Cohomological conditions for loop-interaction (i.e., protein interaction) are obtained as follows. First, we define reaction intermediates and their precursors generated from a given set of loops. By definition, loops interact if there is a reaction intermediate generated from the loops. Conditions for a precursor to be a reaction intermediate are then given using the language of “differential geometry”. That is, a precursor is a reaction intermediate if the local flow of the precursor is integrable. Finally, the cohomological conditions are obtained by rephrasing the differential geometric conditions using the language of “cohomology”. That is, a precursor is a reaction intermediate if the cohomology class of the vector field on the precursor is zero.
6. Discussion
Currently, since models of allosteric regulation only provide explanations of existing allostery, computer simulations are required to detect unknown allostery. However, the flow/cohomological model, despite its simplicity, is capable of explaining not only the existence of allostery but also its non-existence [10]. In particular, the model can predict the behavior of proteins: “if we remove the obstacle of allostery, we can obtain a new allosteric protein”.
In protein science, when considering protein-protein interactions, only local properties such as local shape complementarity are considered, mainly due to insufficient computer power. However, even local convexities on the surface of proteins are not formed locally, but as a result of global folding. One of the strengths of the flow/cohomological model is that we can consider both the shape of a protein and its folding structure at once. Then, the global properties of proteins can be described using the language of cohomology.
A drawback of the model is that it predicts nothing about the actual protein because it is a 2D model. Therefore, future research directions include the study of the 3D model, where a chain of tetrahedra would be a “backbone” (i.e., streamline) of a flow, rather than a “cage” (i.e., surface of a finite region) of a flow. Then, by detecting “turbulence in a flow” using the language of cohomology, we can predict the behavior of a protein.
Another direction is the study of surface flows, i.e., triangular flows on the surface of folded chains of tetrahedra induced by a tetrahedral flow. As for the 2D model, the study of weaker conditions for loop interaction is also required.