
Discrete Differential Geometry and the Structural Study of Protein Complexes

PP. 148-164
DOI: 10.4236/ojdm.2017.73014

ABSTRACT

This paper proposes a novel four-dimensional approach to the structural study of protein complexes. In this approach, the surface of a protein molecule is to be described using the intersection of a pair of four-dimensional triangular cones (with multiple top vertexes). As a mathematical toy model of protein complexes, we consider complexes of closed trajectories of n-simplices (n = 2, 3, 4, ...), where the design problem of protein complexes corresponds to an extended version of the Hamiltonian cycle problem. The problem is to find a set of closed trajectories of n-simplices that fills the n-dimensional region defined by a given pair of (n+1)-dimensional triangular cones. Here we give a solution to the extended Hamiltonian cycle problem in the case of n = 2 using the discrete differential geometry of triangles (i.e., 2-simplices).

1. Introduction

Proteins are called the workhorse molecules of life, playing a crucial role in essentially every activity of living organisms. A protein molecule is made from one or more long chains of amino acids, which normally fold into a well-defined three-dimensional structure. It is the precise shape of the folded structure that determines the function of a protein in a cell.

Most cellular processes are not carried out by random collisions between freely diffusing proteins. Proteins usually interact with other proteins and assemble into complexes to carry out their function [1] [2] [3]. It is therefore crucial to understand and control the formation of protein complexes in order to understand biological activity in the cell. In particular, structural characterization of the components of complexes, such as shape complementarity at protein-protein interfaces, is the key to understanding the function of proteins.

In the last two decades, a huge number of protein structures have been determined experimentally via high-throughput structural genomics pipelines. However, experimental determination of their functions lags far behind because the process is labor-intensive and time-consuming. Improved computational approaches to predicting the function of proteins with known structure are urgently needed [4].

It is, however, extremely difficult to describe the shape of proteins without visual inspection on a three-dimensional display. The fundamental question is how to describe the geometry of a shape as complicated as that of a protein.

In most previous studies, the surface of proteins is described using concepts developed in computational geometry and topology, such as Voronoi diagrams, Delaunay simplices, and the alpha shape representation [5] [6] [7]. As for protein complexes, the topological arrangement of their subunits is usually represented as a graph [8] [9].


In this paper, we propose a novel mathematical toy model which is intended for the structural study of protein complexes. While physics and mathematics have inspired each other throughout their long relationship, a comparable relationship between biology and mathematics is still to come. In our case, it is the relation between real protein complexes and the new mathematical toy model. That is, it is critical to justify why such new toy models are indeed relevant and practically useful.

To justify the usefulness of mathematical tools in biology, we mention the case of the Estrada index introduced by Ernesto Estrada [10] in 2000. The Estrada index was originally proposed as a molecular structure descriptor, and protein structure has been investigated extensively in mathematics over the past decade by using the Estrada index and the normalized Laplacian Estrada index [11]. The Estrada indices have also found a range of applications in chemistry and complex networks. More recently, a dynamic version of the Estrada indices has been proposed [12] to study large-scale time-evolving networks, which arise naturally in a variety of areas from peer-to-peer telecommunication to online human social behavior to neuroscience.

As for other mathematical approaches to protein structure analysis, most of them are applications of known mathematical techniques to the structural study of proteins, such as distance geometry [13], knot theory [13], and persistent homology [14]. Differential geometric techniques are also applied to the analysis of the backbone structure of proteins [15].

In our model, instead of open chains of amino acids, we consider closed trajectories of n-simplices using the discrete differential geometry of n-simplices (n = 2, 3, ...) [16] [17]. Then, interaction of open chains of amino acids (i.e., proteins) is mimicked with “recombination”, such as fusion and fission, of closed chains of n-simplices. The advantage of our model lies in the correspondence between the shape of a complex of closed trajectories of n-simplices and (a projection image of) the intersection of a pair of (n+1)-dimensional cones.

Using the mathematical toy model, we will consider the problem of designing protein complexes from scratch (de novo design of protein complexes [18] [19] [20]). That is, we will consider the problem of finding a set of closed trajectories of n-simplices that forms a specified n-dimensional shape: de novo design of complexes of closed trajectories of n-simplices. For simplicity, we consider the case of n = 2 only.

2. Problem

The problem we consider here is an extended version of the Hamiltonian cycle problem on a regular triangular mesh. A Hamiltonian cycle of triangles (i.e., 2-simplices) is a closed trajectory through a given triangular mesh which visits each triangle exactly once, where the trajectory passes from one triangle to the next through a common edge. As shown in Figure 1(a), meshes are given as a region in a two-dimensional regular triangular lattice. In this case, a Hamiltonian cycle is obtained as shown in Figure 1(b).

To study the formation of a complex of closed trajectories of triangles, we consider not only a single closed trajectory but also multiple closed trajectories of triangles that cover the given region. In the case of Figure 2, two closed trajectories are required.

In what follows, we will propose a novel method for finding all the sets of closed trajectories which cover a given region of triangles.

3. Differential Structure on the Mesh

To define a differential structure on a regular triangular mesh, we stack unit cubes diagonally in the three-dimensional Euclidean space $E^3$ (Figure 3(a)).

By piling up unit cubes orderly in the direction of $(1, 1, 1)$ in $E^3$, a “mountain range-like shape object” consisting of multiple triangular cones is obtained as shown in the upper part of Figure 3(c). If we draw a thick straight line diagonally on the three upper faces of each unit cube, we will obtain a “drawing” on the slope of the mountain range-like shape object (Figure 3(a) and Figure 3(d)). It is the drawing which specifies a flow of “slant” triangles (along the thick polygonal lines) on the slope.

Figure 1. The Hamiltonian cycle problem on a regular triangular mesh. (a) A region in a regular triangular lattice; (b) A Hamiltonian cycle through the region.

Then, we define a flow of “flat” triangles on a plane which is perpendicular to the direction of $(1, 1, 1)$ in $E^3$ by projecting the flow of “slant” triangles on the plane (the lower part of Figure 3(c)). In the case of Figure 3(c), we obtain a closed trajectory of flat triangles of length 30 and others. In this section we give the precise definition of the differential structure on a regular triangular mesh.

For space-saving purposes, we use monomials in the indeterminates $x_0$, $x_1$ and $x_2$ to represent the coordinates of points in the three-dimensional Euclidean space. For example, the point $p = (l, m, n) \in \mathbb{Z}^3$ is identified with the monomial $x_0^l x_1^m x_2^n$, where $\mathbb{Z}$ denotes the set of all integers. Then, the points $(l + k, m, n)$, $(l, m + k, n)$ and $(l, m, n + k)$ are represented by the monomials $p x_0^k$, $p x_1^k$ and $p x_2^k$, respectively. (Note that $x_i x_j = x_j x_i$ for all pairs of $i$ and $j$.)
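The monomial notation amounts to storing exponent vectors and adding them when monomials are multiplied. The following is a minimal Python sketch of that bookkeeping; the helper names `x` and `mul` are illustrative choices and do not come from the paper.

```python
from typing import Tuple

# A lattice point (l, m, n), read as the monomial x0^l * x1^m * x2^n.
Point = Tuple[int, int, int]


def x(i: int, k: int = 1) -> Point:
    """Exponent vector of the monomial x_i^k (hypothetical helper)."""
    e = [0, 0, 0]
    e[i] = k
    return (e[0], e[1], e[2])


def mul(p: Point, q: Point) -> Point:
    """Multiplying two monomials adds their exponents componentwise."""
    return (p[0] + q[0], p[1] + q[1], p[2] + q[2])


p = (2, -1, 3)                   # the point (l, m, n) = (2, -1, 3)
shifted = mul(p, x(0, 4))        # p * x0^4, i.e. the point (l + 4, m, n)
commutes = mul(x(0), x(1)) == mul(x(1), x(0))   # x0 x1 = x1 x0
```

Negative exponents are allowed, so `x(2, -1)` stands for $x_2^{-1}$.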

3.1. Triangle Tiles

Shown in the upper part of Figure 3(a) is a unit cube with a thick straight line drawn diagonally on each of the upper three faces, which is located at the origin $O$ of a three-dimensional Cartesian coordinate system defined by three axes $\varphi_0$, $\varphi_1$ and $\varphi_2$. Let $P_a = 1$, $P_b = x_0$, $P_c = x_0 x_1$, and $P_d = x_1 \in \mathbb{Z}^3$. Then, the upper face $P_a P_b P_c P_d$ on the $\varphi_0 \varphi_1$-plane is divided into two “slant triangles”, $P_a P_b P_c$ and $P_a P_d P_c$, by the line segment $P_a P_c$. The other upper faces are also divided into two “slant triangles” similarly.

Figure 2. An extended version of the Hamiltonian cycle problem on a regular triangular mesh: de novo design of complexes of closed trajectories of triangles. Shown are all the three sets of closed trajectories of triangles which cover the specified region. In this case, the region has no Hamiltonian cycle.

Figure 3. Differential structure on a regular triangular mesh. (a) A unit cube and its projection on a plane perpendicular to the direction of $(1, 1, 1)$ in $E^3$; (b) The projection of “slant triangles” onto a “flat triangle”; (c) A “mountain range-like shape object” obtained by piling up unit cubes orderly along the diagonal direction, whose peaks are P0 = (1, 0, 0), P1 = (0, 0, 1), P2 = (2, 2, 1), P3 = (0, 3, 0) and P4 = (2, 2, 1); (d) A “drawing” on the slope of the mountain range-like shape object of (c).

Shown in the lower part of Figure 3(a) is the projected image of the unit cube on a plane which is perpendicular to the direction of $(1, 1, 1)$ in $E^3$. The unit cube at $O$ is projected onto a hexagon, which is divided into six “flat triangles” by the image of the three thick line segments on the cube.

The schematic drawing of Figure 3(b) shows the projection of slant triangles onto a flat triangle. Using the projection, we will define a discrete differential structure on the set of flat triangles, i.e., a regular triangular mesh.

Let $\mathrm{Sym}_3$ be the symmetric group on a finite set of three symbols. For $a \in \mathbb{Z}^3$ and $\rho \in \mathrm{Sym}_3$, let $a[x_{\rho(0)} x_{\rho(1)}]$ denote the convex hull of the three points $a$, $a x_{\rho(0)}$ and $a x_{\rho(0)} x_{\rho(1)} \in \mathbb{Z}^3$, i.e.,

$$a[x_{\rho(0)} x_{\rho(1)}] := \{ a^{\lambda_0} (a x_{\rho(0)})^{\lambda_1} (a x_{\rho(0)} x_{\rho(1)})^{\lambda_2} \mid \lambda_0, \lambda_1, \lambda_2 \in \mathbb{R},\ \lambda_0, \lambda_1, \lambda_2 \ge 0,\ \lambda_0 + \lambda_1 + \lambda_2 = 1 \} \subset \mathbb{R}^3,$$

where $\mathbb{R}$ denotes the set of all real numbers.

For example, the “slant triangle” $P_a P_d P_c$ defined above is denoted by $[x_1 x_0] = a[x_{\rho(0)} x_{\rho(1)}]$, where $a = x_0^0 x_1^0 x_2^0 = 1$ and $\rho = (0, 1)$, i.e., the transposition with $\rho(0) = 1$ and $\rho(1) = 0$.

Definition 3.1 We define the set $S_2$ of all slant (triangle) tiles by

$$S_2 := \{ a[x_{\rho(0)} x_{\rho(1)}] \mid a \in \mathbb{Z}^3,\ \rho \in \mathrm{Sym}_3 \}.$$

The set $B_2$ of all flat (triangle) tiles is defined as the quotient of $S_2$ by the “shift operator” $\sigma$, i.e.,

$$B_2 := S_2 / \sigma,$$

where $\sigma(a[x_{\rho(0)} x_{\rho(1)}]) := a x_{\rho(0)} [x_{\rho(1)} x_{\rho(2)}]$.

We identify $B_2$ with the projection image of “slant triangles” on a plane perpendicular to the vector $(1, 1, 1)$ mentioned above. Then, the schematic drawing of Figure 3(b) shows the equivalence class of a slant tile $s \in S_2$ and the corresponding flat tile $s \bmod \sigma \in B_2$.
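The shift operator can be made concrete with a small sketch. A slant tile $a[x_{\rho(0)} x_{\rho(1)}]$ is stored here as a pair `(a, rho)`; this encoding, the helper names, and the oblique chart $(z_0 - z_2, z_1 - z_2)$ used for the projection along $(1, 1, 1)$ are assumptions made for illustration, not notation from the paper.

```python
from typing import FrozenSet, List, Tuple

Point = Tuple[int, int, int]
Perm = Tuple[int, int, int]      # (rho(0), rho(1), rho(2))
Slant = Tuple[Point, Perm]       # encodes a[x_rho(0) x_rho(1)]


def step(a: Point, i: int, k: int = 1) -> Point:
    """Multiply the point a by x_i^k."""
    b = list(a)
    b[i] += k
    return (b[0], b[1], b[2])


def vertices(s: Slant) -> List[Point]:
    """The three vertices a, a*x_rho(0), a*x_rho(0)*x_rho(1)."""
    a, rho = s
    b = step(a, rho[0])
    return [a, b, step(b, rho[1])]


def sigma(s: Slant) -> Slant:
    """Shift operator: a[x_r0 x_r1] -> a*x_r0 [x_r1 x_r2]."""
    a, rho = s
    return (step(a, rho[0]), (rho[1], rho[2], rho[0]))


def flat(s: Slant) -> FrozenSet[Tuple[int, int]]:
    """Projected vertex set; the chart (z0 - z2, z1 - z2) kills (1, 1, 1)."""
    return frozenset((v[0] - v[2], v[1] - v[2]) for v in vertices(s))


s = ((0, 0, 0), (1, 0, 2))                 # the slant tile [x1 x0]
same_flat_tile = flat(sigma(s)) == flat(s)
three_shifts = sigma(sigma(sigma(s)))      # translate of s by x0 x1 x2
```

In this encoding, applying `sigma` never changes the projected triangle, and three applications translate the tile by $x_0 x_1 x_2$, which is why the classes modulo $\sigma$ behave as flat tiles.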

3.2. Tangent Space at a Flat Triangle Tile

A tangent bundle-like local structure $TB_2$ is defined on $B_2$ by

Definition 3.2 $(TB_2, B_2, \pi)$

$$TB_2 := S_2 / \sigma^3, \qquad \pi : TB_2 \to B_2, \qquad \pi(s \bmod \sigma^3) = s \bmod \sigma.$$

Let $s \in S_2$. Then, we obtain

$$\pi^{-1}(s \bmod \sigma) = \{ s \bmod \sigma^3,\ \sigma(s) \bmod \sigma^3,\ \sigma^2(s) \bmod \sigma^3 \}.$$

Definition 3.3 (Tangent space) For $s \in S_2$, we call $\pi^{-1}(s \bmod \sigma)$ the tangent space of $B_2$ at $s \bmod \sigma$.

Definition 3.4 For $s = a[x_{\rho(0)} x_{\rho(1)}] \in S_2$, the gradient $Ds$ of $s$ is defined by

$$Ds := x_{\rho(0)} x_{\rho(1)} \ (= x_{\rho(1)} x_{\rho(0)}).$$

Then, we can identify $TB_2$ with

$$B_2 \times \{ x_0 x_1,\ x_1 x_2,\ x_0 x_2 \}$$

via the one-to-one correspondence

$$s \bmod \sigma^3 \sim (s \bmod \sigma,\ Ds).$$

Note that the monomial $Ds$ of $s \in S_2$ corresponds to the direction of the thick line on the “slant triangle” $s$ which is described in subsection 3.1 above (Figure 3).
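As a quick check of the identification above, the three elements of a tangent space carry three different gradients. The sketch below stores a slant tile as `(a, rho)` and a gradient as the unordered pair of indices $\{\rho(0), \rho(1)\}$; both encodings and all function names are illustrative assumptions.

```python
from typing import FrozenSet, List, Tuple

Point = Tuple[int, int, int]
Perm = Tuple[int, int, int]      # (rho(0), rho(1), rho(2))
Slant = Tuple[Point, Perm]       # encodes a[x_rho(0) x_rho(1)]


def sigma(s: Slant) -> Slant:
    """Shift operator: a[x_r0 x_r1] -> a*x_r0 [x_r1 x_r2]."""
    a, rho = s
    b = list(a)
    b[rho[0]] += 1
    return ((b[0], b[1], b[2]), (rho[1], rho[2], rho[0]))


def gradient(s: Slant) -> FrozenSet[int]:
    """Ds = x_rho(0) x_rho(1), stored as the index pair {rho(0), rho(1)}."""
    return frozenset(s[1][:2])


def fibre(s: Slant) -> List[Slant]:
    """Representatives of the tangent space at the flat tile s mod sigma."""
    return [s, sigma(s), sigma(sigma(s))]


s = ((0, 0, 0), (1, 0, 2))                      # the slant tile [x1 x0]
gradients = {gradient(t) for t in fibre(s)}     # one gradient per fibre element
```

For `s` equal to $[x_1 x_0]$ the fibre gradients are $x_0 x_1$, $x_0 x_2$ and $x_1 x_2$, one for each element of $\{x_0 x_1, x_1 x_2, x_0 x_2\}$.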

3.3. Vector Field on B2

Having defined a tangent bundle-like structure $(TB_2, B_2, \pi)$ on a set of triangles, we now consider the inverse of the projection map $\pi$.

Definition 3.5 A section $\gamma$ of $(TB_2, B_2, \pi)$ is a map $B_2 \to TB_2$ such that

$$\pi(\gamma(t)) = t \quad \text{for all } t \in B_2.$$

For a section $\gamma$ of $(TB_2, B_2, \pi)$, the value of $\gamma$ on $t \in B_2$ is given by

$$\gamma(t) = s \bmod \sigma^3 \in TB_2$$

for some $s = a[x_{\rho(0)} x_{\rho(1)}] \in S_2$. Let $s_D$ and $s_U$ be two adjacent slant tiles of $s$ in $S_2$ defined by

$$s_D := a x_{\rho(0)} [x_{\rho(1)} x_{\rho(0)}] \in S_2, \qquad s_U := a [x_{\rho(0)} x_{\rho(2)}] \in S_2.$$

Figure 4. Local trajectory. (a) The local trajectory specified by $s = [x_1 x_0] \in S_2$, where $s_D = x_1 [x_0 x_1]$ and $s_U = [x_1 x_2]$; (b) The smoothness condition on a section $\gamma$. Colored gray is $\gamma([x_1 x_0] \bmod \sigma)$ and colored white is $\gamma([x_1 x_2] \bmod \sigma)$. Shown above are the gradients of the white tile. The gradient of the gray tile is $x_0 x_1$; (c) Smooth sections of $(TB_2, B_2, \pi)$ on a hexagonal region composed of six flat tiles; (d) Sections of $(TB_2, B_2, \pi)$ which do not satisfy the smoothness condition. The corresponding singular flat tiles are colored gray in the lower part.

Then, the set of three slant tiles, $\{ s_D, s, s_U \}$, makes up a “continuous mountain path” along the thick polygonal line (i.e., along the gradient $Ds$) at $s$ in $S_2$ (Figure 4(a)). By projecting these slant tiles on $B_2$, we obtain a trajectory of flat tiles of length three at $s \bmod \sigma$.

To consider the “smoothness” of the section $\gamma$, we first define a local trajectory passing through $t \in B_2$ as follows.

Definition 3.6 Let $s \in S_2$. The local trajectory specified by $s$ is the set

$$\{ s_D \bmod \sigma,\ s \bmod \sigma,\ s_U \bmod \sigma \} \subset B_2$$

of three consecutive flat tiles passing through $s \bmod \sigma$.
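The two neighbours in a local trajectory can be computed directly from the formulas for $s_D$ and $s_U$. The sketch below stores a slant tile as `(a, rho)`; the encoding and the function names `down`, `up` and `shared` are illustrative assumptions. It reproduces the example of Figure 4(a) and checks that each neighbour shares an edge (two vertices) with $s$.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]
Perm = Tuple[int, int, int]      # (rho(0), rho(1), rho(2))
Slant = Tuple[Point, Perm]       # encodes a[x_rho(0) x_rho(1)]


def step(a: Point, i: int) -> Point:
    """Multiply the point a by x_i."""
    b = list(a)
    b[i] += 1
    return (b[0], b[1], b[2])


def vertices(s: Slant) -> List[Point]:
    a, rho = s
    b = step(a, rho[0])
    return [a, b, step(b, rho[1])]


def down(s: Slant) -> Slant:
    """s_D = a*x_rho(0) [x_rho(1) x_rho(0)]."""
    a, rho = s
    return (step(a, rho[0]), (rho[1], rho[0], rho[2]))


def up(s: Slant) -> Slant:
    """s_U = a [x_rho(0) x_rho(2)]."""
    a, rho = s
    return (a, (rho[0], rho[2], rho[1]))


def shared(s: Slant, t: Slant) -> int:
    """Number of vertices two slant tiles have in common."""
    return len(set(vertices(s)) & set(vertices(t)))


s = ((0, 0, 0), (1, 0, 2))       # [x1 x0]
s_down = down(s)                 # x1 [x0 x1]
s_up = up(s)                     # [x1 x2]
```

Projecting `s_down`, `s` and `s_up` along $(1, 1, 1)$ then gives the three consecutive flat tiles of the local trajectory.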

Let $\gamma$ be a section on $B_2$. Then, $\gamma(t)$ ($t \in B_2$) can assume one of the three values of the corresponding tangent space $\pi^{-1}(t)$. For example, $\gamma(s_U \bmod \sigma)$ can assume one of the three values of

$$\pi^{-1}(s_U \bmod \sigma) = \{ x_0^{-1} [x_0 x_1] \bmod \sigma^3,\ [x_1 x_2] \bmod \sigma^3,\ x_1 [x_2 x_0] \bmod \sigma^3 \},$$

where $s = [x_1 x_0] \in S_2$ and $s_U = [x_1 x_2]$.

However, some of the slant tiles are not connected smoothly to $\gamma(s \bmod \sigma)$ in $TB_2$. In this case,

$$x_1 [x_2 x_0] \bmod \sigma^3 \sim (s_U \bmod \sigma,\ x_0 x_2)$$

is not connected smoothly to $\gamma(s \bmod \sigma) = s \bmod \sigma^3$, as shown in Figure 4(b).

To obtain a “smooth” trajectory, we will impose a condition on sections of $(TB_2, B_2, \pi)$.

Definition 3.7 (Smoothness condition) Let $\gamma$ be a section of $(TB_2, B_2, \pi)$ and $t \in B_2$. Let $\gamma(t) = s \bmod \sigma^3$, where $s = a[x_{\rho(0)} x_{\rho(1)}] \in S_2$. The smoothness condition on $\gamma$ at $t$ is defined by

$$\begin{cases} D(\gamma(s_U \bmod \sigma)) = x_{\rho(0)} x_{\rho(1)} \text{ or } x_{\rho(0)} x_{\rho(2)}, \\ D(\gamma(s_D \bmod \sigma)) = x_{\rho(0)} x_{\rho(1)} \text{ or } x_{\rho(1)} x_{\rho(2)}. \end{cases}$$

In what follows, we will only consider the sections of $(TB_2, B_2, \pi)$ which satisfy the smoothness condition at every flat tile of $B_2$.

Remark $x_{\rho(0)}$ corresponds to (the direction of) the contact edge between $s$ and $s_U$. $x_{\rho(1)}$ corresponds to (the direction of) the contact edge between $s$ and $s_D$.

Definition 3.8 (Vector field) A vector field on $B_2$ is a section of $(TB_2, B_2, \pi)$ which satisfies the smoothness condition at every flat tile of $B_2$.
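The smoothness condition is a finite check on gradients, so it is easy to test on the example above. In this sketch a slant tile is stored as `(a, rho)` and a gradient as an unordered index pair; the encoding and the names `allowed_gradients` and `is_smooth_step` are illustrative assumptions rather than the paper's notation.

```python
from typing import Dict, FrozenSet, Set, Tuple

Point = Tuple[int, int, int]
Perm = Tuple[int, int, int]      # (rho(0), rho(1), rho(2))
Slant = Tuple[Point, Perm]       # encodes a[x_rho(0) x_rho(1)]


def gradient(s: Slant) -> FrozenSet[int]:
    """Ds = x_rho(0) x_rho(1), stored as the index pair {rho(0), rho(1)}."""
    return frozenset(s[1][:2])


def allowed_gradients(s: Slant) -> Dict[str, Set[FrozenSet[int]]]:
    """Gradients permitted at the U and D neighbours of s by Definition 3.7."""
    r0, r1, r2 = s[1]
    keep = frozenset({r0, r1})
    return {
        "U": {keep, frozenset({r0, r2})},
        "D": {keep, frozenset({r1, r2})},
    }


def is_smooth_step(s: Slant, neighbour_value: Slant, direction: str) -> bool:
    """True if the value chosen at the neighbour is compatible with s."""
    return gradient(neighbour_value) in allowed_gradients(s)[direction]


s = ((0, 0, 0), (1, 0, 2))                 # [x1 x0]
candidates = [
    ((-1, 0, 0), (0, 1, 2)),               # x0^{-1} [x0 x1]
    ((0, 0, 0), (1, 2, 0)),                # [x1 x2]
    ((0, 1, 0), (2, 0, 1)),                # x1 [x2 x0]
]
smooth = [is_smooth_step(s, c, "U") for c in candidates]
```

The three candidates are the elements of $\pi^{-1}(s_U \bmod \sigma)$ listed in the text; only the last one, with gradient $x_0 x_2$, fails the condition.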

Shown in Figure 4(c) are all the six types of “local” smooth sections of $(TB_2, B_2, \pi)$ on a hexagonal region composed of six flat tiles of $B_2$. By patching these “local” sections together, we will obtain a “global” section of $(TB_2, B_2, \pi)$.

Note that some of the “global” sections do not satisfy the smoothness condition, as shown in Figure 4(d). A singular flat tile of a section $\gamma$ of $(TB_2, B_2, \pi)$ is a flat tile where $\gamma$ does not satisfy the smoothness condition. A singular flat tile is assigned either no gradient (i.e., no thick edge), two gradients (i.e., two thick edges), or three gradients.

Let $\mu = \{ t[i] \mid i \in I \} \subset B_2$ be a trajectory of a vector field $V$, where $I$ is a subset of the set of natural numbers. Then, we can define the second derivative of the trajectory as follows.

Definition 3.9 The second derivative $D^2 V(t[i])$ of $V$ along $\mu$ is a binary-valued ($U$ or $D$) function defined by

$$D^2 V(t[i+1]) := \begin{cases} D^2 V(t[i]), & \text{if } DV(t[i+1]) = DV(t[i]), \\ -D^2 V(t[i]), & \text{otherwise,} \end{cases}$$

where $-D := U$ and $-U := D$.
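The recursion can be sketched as a single pass over the gradients along a trajectory. The code below assumes the reading in which the value is kept when two consecutive gradients agree and flipped otherwise; the `flip_on_change` switch, the starting value and the function name are hypothetical and only serve to make that assumption explicit.

```python
from typing import FrozenSet, List, Sequence


def second_derivative(
    gradients: Sequence[FrozenSet[int]],
    start: str = "U",
    flip_on_change: bool = True,   # assumed convention, see the lead-in
) -> List[str]:
    """U/D sequence along a trajectory, given the gradient of each tile."""
    flip = {"U": "D", "D": "U"}
    values = [start]
    for prev, cur in zip(gradients, gradients[1:]):
        changed = cur != prev
        last = values[-1]
        values.append(flip[last] if changed == flip_on_change else last)
    return values


g01 = frozenset({0, 1})          # gradient x0 x1
g12 = frozenset({1, 2})          # gradient x1 x2
ud = second_derivative([g01, g01, g12, g12, g01])
```

Only changes of gradient matter, so the result is determined up to the choice of the starting value.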

In [16] , the conformation of a protein backbone structure is encoded into a 16-valued sequence using the second derivative of trajectories of tetrahedrons (i.e., 3-simplices).

3.4. Vector Fields Induced by Tangent Cones

In the beginning of this section, we constructed a “mountain range-like shape object” by piling up unit cubes diagonally. (Using the terminology defined above, it is a section of $(TB_2, B_2, \pi)$.)

Unit cubes are piled up to form a union of triangular cones, which can be identified by its top vertexes. For example, the object shown in the upper part of Figure 3(c) is identified by five peaks $P_0$, $P_1$, $P_2$, $P_3$, and $P_4$.

Definition 3.10 For $A \subset \mathbb{Z}^3$, a tangent cone $\mathrm{Cone}\,A \subset \mathbb{Z}^3$ is defined by

$$\mathrm{Cone}\,A := \{ p\, x_0^{l_0} x_1^{l_1} x_2^{l_2} \mid p \in A \text{ and } l_0, l_1, l_2 \ge 0 \}.$$

We denote the set of all the “top vertexes” of $\mathrm{Cone}\,A$ by $p(\mathrm{Cone}\,A)$.

Then, the mountain range-like shape object of Figure 3(c) is given by

C o n e { x 0 , x 2 , x 0 2 x 1 2 x 2 , x 1 3 , x 0 2 x 1 2 x 2 1 } .

For a tangent cone $c$, let $dc$ be the set of all the slant tiles on the surfaces of $c$, i.e.,

$$dc := \{ a[x_{\rho(0)} x_{\rho(1)}] \in S_2 \mid a,\ a x_{\rho(0)} \text{ and } a x_{\rho(0)} x_{\rho(1)} \text{ are on the surfaces of } c \}.$$

For $z \in \mathbb{Z}^3$ and $c = \mathrm{Cone}\,A$ ($A \subset \mathbb{Z}^3$), set

$$l_c(z) := \max_{p \in c} \{ \min\{ l_0, l_1, l_2 \} \mid z = x_0^{l_0} x_1^{l_1} x_2^{l_2}\, p \}.$$

Then, $dc$ is given as follows.

Lemma 3.11 For $c = \mathrm{Cone}\,A$ ($A \subset \mathbb{Z}^3$),

$$dc = \{ a[x_{\rho(0)} x_{\rho(1)}] \in S_2 \mid l_c(a) = l_c(a x_{\rho(0)}) = l_c(a x_{\rho(0)} x_{\rho(1)}) = 0 \}.$$

Proof. Let $z, p \in \mathbb{Z}^3$. Then, $z = x_0^{l_0} x_1^{l_1} x_2^{l_2}\, p$ for some $(l_0, l_1, l_2) \in \mathbb{Z}^3$. $(l_0, l_1, l_2)$ is the coordinate of $z$ with respect to the “origin” $p$. In particular,

$$z \in \mathrm{Cone}\{p\} \iff l_0, l_1, l_2 \ge 0, \qquad z \in d(\mathrm{Cone}\{p\}) \iff \min\{ l_0, l_1, l_2 \} = 0.$$

The result follows immediately.
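The function $l_c$ is straightforward to evaluate when the cone is given by finitely many top vertexes. The sketch below rests on two assumptions: the cone extends in the direction of non-negative exponents (which matches the condition $\min\{l_0, l_1, l_2\} = 0$ in the proof), and the maximum is taken over the top vertexes only, which gives the same value as the maximum over all points of the cone because every point of the cone lies componentwise above some top vertex. The function names are illustrative.

```python
from typing import Iterable, Tuple

Point = Tuple[int, int, int]


def l_c(z: Point, tops: Iterable[Point]) -> int:
    """max over top vertexes p of min(l0, l1, l2), where z = x^l * p."""
    return max(min(zi - pi for zi, pi in zip(z, p)) for p in tops)


def on_surface(z: Point, tops: Iterable[Point]) -> bool:
    """Lattice points on the surface of the cone are those with l_c(z) = 0."""
    return l_c(z, tops) == 0


single = [(0, 0, 0)]                     # Cone{1}
two_peaks = [(1, 0, 0), (0, 0, 1)]       # Cone{x0, x2}

checks = {
    "edge point": on_surface((0, 1, 0), single),
    "interior point": on_surface((1, 1, 1), single),
    "outside point": on_surface((0, 0, -1), single),
    "two peaks, surface": on_surface((1, 0, 1), two_peaks),
    "two peaks, interior": on_surface((2, 1, 1), two_peaks),
}
```

A slant tile then belongs to $dc$ exactly when `on_surface` holds for each of its three vertices.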

The surfaces of a tangent cone $c$ induce a vector field of $(TB_2, B_2, \pi)$ as follows.

Definition 3.12 For $c = \mathrm{Cone}\,A$ ($A \subset \mathbb{Z}^3$), a vector field $V_c$ induced by $c$ is defined by

$$V_c(t) := s \bmod \sigma^3 \quad (t \in B_2),$$

where $s \in dc$ is such that $t = s \bmod \sigma$. The value $V_c(t)$ is uniquely determined at every flat tile of $B_2$.
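For a cone with finitely many top vertexes, the lift of a flat tile to the surface can be found by moving along its orbit under $\sigma$. The sketch below is one possible way to do this, not the paper's procedure: it assumes the cone extends toward non-negative exponents, stores a slant tile as `(a, rho)`, first slides the tile along $(1, 1, 1)$ until its base vertex is on the surface, and then tries at most two inverse shifts. All names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]
Perm = Tuple[int, int, int]      # (rho(0), rho(1), rho(2))
Slant = Tuple[Point, Perm]       # encodes a[x_rho(0) x_rho(1)]


def step(a: Point, i: int, k: int = 1) -> Point:
    b = list(a)
    b[i] += k
    return (b[0], b[1], b[2])


def vertices(s: Slant) -> List[Point]:
    a, rho = s
    b = step(a, rho[0])
    return [a, b, step(b, rho[1])]


def sigma_inv(s: Slant) -> Slant:
    """Inverse shift: a[x_r0 x_r1] -> a*x_r2^{-1} [x_r2 x_r0]."""
    a, rho = s
    return (step(a, rho[2], -1), (rho[2], rho[0], rho[1]))


def l_c(z: Point, tops: Sequence[Point]) -> int:
    return max(min(zi - pi for zi, pi in zip(z, p)) for p in tops)


def induced(s: Slant, tops: Sequence[Point]) -> Slant:
    """Representative of V_c(s mod sigma): the orbit element lying in dc."""
    a, rho = s
    h = l_c(a, tops)
    cand = ((a[0] - h, a[1] - h, a[2] - h), rho)   # base vertex now has l_c = 0
    for _ in range(3):
        if all(l_c(v, tops) == 0 for v in vertices(cand)):
            return cand
        cand = sigma_inv(cand)
    raise ValueError("no surface representative found")


on_single = induced(((5, 5, 5), (1, 0, 2)), [(0, 0, 0)])
on_two_peaks = induced(((0, 0, 0), (0, 1, 2)), [(1, 0, 0), (0, 0, 1)])
```

Three candidates suffice because $l_c$ does not decrease along the vertices of a tile and grows by exactly one under a shift by $x_0 x_1 x_2$, so exactly one of the three candidates has all vertices at level zero.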

For example, in Figure 3(c), the thick polygonal lines on the surfaces of the tangent cone C o n e { x 0 , x 2 , x 0 2 x 1 2 x 2 , x 1 3 , x 0 2 x 1 2 x 2 1 } shows the vector field induced by the tangent cone.

Note that all the smooth sections shown in Figure 4(c) are induced by a tangent cone as indicated in the figure.

Proposition 3.13 For any vector field $V$, there exists a tangent cone $c$ such that $V = V_c$.

Proof. Let $V$ be a vector field on $B_2$ and let $E = \{ U_i \mid i \in \mathbb{N} \}$ be a decomposition of $B_2$ into hexagonal regions of six flat tiles:

$$B_2 = \bigcup \{ U_i \mid i \in \mathbb{N} \}.$$

For $U_i \in E$, we let $V|_{U_i}$ denote the restriction of $V$ to the hexagon $U_i$.

Because of the smoothness condition, $V$ is locally induced by a tangent cone as shown in Figure 4(c). That is, there exists a tangent cone $c_i$ for each $U_i \in E$ such that $V|_{U_i} = V_{c_i}|_{U_i}$, i.e.,

$$V(t) = V_{c_i}(t) \quad \text{for } t \in U_i.$$

Moreover, by considering all combinations, we can assume

$$V|_{U_j \cup U_k} = V_{c_j \cup c_k}|_{U_j \cup U_k} \qquad (1)$$

for any pair of adjacent hexagons $U_j$ and $U_k$, where $c_j \cup c_k$ denotes the union of the two tangent cones $c_j$ and $c_k$, i.e., $\mathrm{Cone}\,(p(c_j) \cup p(c_k))$.

Suppose that $V \ne V_c$ for $c = \bigcup_i c_i$. Then,

$$\exists\, U_a \in E \text{ such that } V_{c_a}|_{U_a} \ne V_c|_{U_a}.$$

In particular, $\exists\, b$ such that

$$\exists\, t_0 \in U_a \text{ such that } V_c(t_0) = V_{c_b}(t_0).$$

In other words, the tangent cone $c_a$ is (partially) covered by the tangent cone $c_b$ on $U_a$.

Then, there exists a circular loop $\Theta$ of hexagons of $E$ around $U_b$ such that

$$\exists\, U_e \in \Theta \text{ such that } V_{c_e}|_{U_e} \ne V_{c_e \cup c_b}|_{U_e} \text{ and } V_{c_i}|_{U_i} = V_{c_i \cup c_b}|_{U_i} \text{ for } U_i \in \Omega,$$

where $\Omega$ is the set of all the hexagons of $E$ contained in the circular region surrounded by $\Theta$. Because of the shape of the tangent cones, $\exists\, U_f \in \Omega$ such that $U_f$ is adjacent to $U_e$ and $c_e$ is (partially) covered by $c_f$ on $U_e$, i.e.,

$$\exists\, t_1 \in U_e \text{ such that } V_c(t_1) = V_{c_f}(t_1) \ne V_{c_e}(t_1).$$

In particular,

$$V_{c_e}|_{U_e} \ne V_{c_e \cup c_f}|_{U_e},$$

which contradicts Equation (1).

4. The Boundary of a Closed Trajectory

We now return to the problem described in Section 2. Using the terminology given in Section 3, the problem is stated as follows.

Problem 4.1. (De novo design of complexes of closed trajectories of triangles) Given a region in $B_2$, find all the vector fields on $B_2$ which give a decomposition of the region into closed trajectories.

If there exists such a vector field, we can describe the boundary of the region using a pair of three-dimensional cones as explained in this section.

The cones are defined in another lattice which is associated with $\mathbb{Z}^3$. Recall that the three-dimensional lattice $\mathbb{Z}^3$ is generated by $x_0$, $x_1$ and $x_2$. The associated lattice is defined as follows.

Definition 4.2. The conjugate lattice $L^3$ is the lattice which is generated by $x_0 x_1$, $x_1 x_2$ and $x_0 x_2$.

Note that the gradient of a slant tile corresponds to one of the three coordinate axes of the conjugate lattice $L^3$. In particular, a trajectory of slant tiles corresponds to a zig-zag walk (with gaps) on the grid of $L^3$.

Two types of cones are defined in $L^3$:

Definition 4.3. For $A \subset L^3$, a cotangent cone $\mathrm{Cone}^* A \subset L^3$ is defined by

$$\mathrm{Cone}^* A := \{ p\, (x_1 x_2)^{l_0} (x_0 x_2)^{l_1} (x_0 x_1)^{l_2} \mid p \in A \text{ and } l_0, l_1, l_2 \ge 0 \}.$$

For $A \subset L^3$, a cotangent roof $\mathrm{Roof}^* A \subset L^3$ is defined by

$$\mathrm{Roof}^* A := \{ p \in L^3 \mid \exists\, N > 0 \text{ such that } p\,(x_1 x_2)^N,\ p\,(x_0 x_2)^N,\ p\,(x_0 x_1)^N \in \mathrm{Cone}^* A \}.$$

In other words, $\mathrm{Roof}^* A$ is obtained by putting as many unit cubes as possible on $\mathrm{Cone}^* A$.
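Membership in a cotangent cone and in its roof can be tested coordinate-wise. The sketch below assumes that points of the conjugate lattice are written as exponent triples $(m_0, m_1, m_2)$ with respect to the generators $x_1 x_2$, $x_0 x_2$, $x_0 x_1$, and that the cone extends toward non-negative exponents. Under that reading the cone is closed under positive shifts, so "there exists $N > 0$" reduces to ignoring one coordinate at a time. The function names are illustrative.

```python
from typing import Sequence, Tuple

# Exponents (m0, m1, m2) with respect to the generators x1x2, x0x2, x0x1.
LPoint = Tuple[int, int, int]


def in_cone_star(p: LPoint, tops: Sequence[LPoint]) -> bool:
    """p lies in Cone* A iff p is componentwise >= some q in A."""
    return any(all(pi >= qi for pi, qi in zip(p, q)) for q in tops)


def in_roof_star(p: LPoint, tops: Sequence[LPoint]) -> bool:
    """p lies in Roof* A iff, for each axis i, a large shift along i enters Cone* A.

    A large shift along axis i makes coordinate i irrelevant, so only the
    other two coordinates have to dominate some q in A.
    """
    return all(
        any(all(p[j] >= q[j] for j in range(3) if j != i) for q in tops)
        for i in range(3)
    )


notch = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

origin_in_cone = in_cone_star((0, 0, 0), notch)
origin_in_roof = in_roof_star((0, 0, 0), notch)
below_in_roof = in_roof_star((-1, 0, 0), notch)
```

With the three top vertexes of `notch`, the origin is not in the cone but is in the roof: it is the corner of the single unit cube that can be placed on the cone.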

For example, shown in Figure 5(b) is

(a) (b) (c)

Figure 5. Cotangent roofs associated with a closed trajectory on B 2 . (a) The boundary of the closed trajectory of Figure 3(c); (b) The cotangent roof of the region, where Q 0 = x 2 1 , Q 1 = x 0 1 and Q 2 = x 0 2 x 1 ; (c) The inverted cotangent roof of the region, where Q 3 = x 0 3 x 1 3 x 2 and Q 4 = x 0 2 x 1 4 x 2 3 .

R o o f * { x 0 , x 2 , x 0 2 x 1 2 x 2 , x 1 3 , x 0 2 x 1 2 x 2 1 } ( = C o n e * { x 2 1 , x 0 1 , x 0 2 x 1 } ) .

Inverted cones are defined similarly:

Definition 4.4. For $A \subset L^3$, an inverted cotangent cone $\mathrm{ICone}^* A \subset L^3$ is defined by

$$\mathrm{ICone}^* A := \{ p\, (x_1 x_2)^{l_0} (x_0 x_2)^{l_1} (x_0 x_1)^{l_2} \mid p \in A \text{ and } l_0, l_1, l_2 \le 0 \}.$$

For $A \subset L^3$, an inverted cotangent roof $\mathrm{IRoof}^* A \subset L^3$ is defined by

$$\mathrm{IRoof}^* A := \{ p \in L^3 \mid \exists\, N < 0 \text{ such that } p\,(x_1 x_2)^N,\ p\,(x_0 x_2)^N,\ p\,(x_0 x_1)^N \in \mathrm{ICone}^* A \}.$$

For example, shown in Figure 5(c) is

I R o o f * { x 0 , x 2 , x 0 2 x 1 2 x 2 , x 1 3 , x 0 2 x 1 2 x 2 1 } ( = I C o n e * { x 0 3 x 1 3 x 2 , x 0 2 x 1 4 x 2 3 } ) .

Then, the boundary of a closed trajectory of a vector field on $B_2$ can be described using a pair of a cotangent roof and an inverted cotangent roof as shown below.

Let $w$ be a cotangent cone. We denote by $(w)$ the set of all the lattice points of $L^3$ which reside on the surface of the cone $w$. $(w)$ is called the boundary lattice points of $w$. The boundary lattice points of an inverted cotangent cone are defined in the same manner.

Proposition 4.5. Let $V_c$ be a vector field of $(TB_2, B_2, \pi)$ induced by a tangent cone $c$ whose top vertexes $p(c)$ are in $L^3$. Let $\mu = \{ t[i] \mid i \in I \}$ ($I \subset \mathbb{N}$) be a closed trajectory of $V_c$. Let $R_\mu$ be the region swept by the trajectory $\mu$.

Then, there exist a cotangent cone $w$ and an inverted cotangent cone $iv$ such that the boundary of $R_\mu$ is uniquely specified by the intersection of $(w)$ and $(iv)$.

The pair $(w, iv)$ is called a boundary pair (of the region $R_\mu$) and the specified region is denoted by $R(w, iv)$, i.e., $R(w, iv) = R_\mu$.

Proof. Let $\Lambda = \{ s[i] \mid i \in I \}$ be a subset of $S_2$ such that

$$V_c(t[i]) = s[i] \bmod \sigma^3 \quad (i \in I).$$

Because of the smoothness condition, we may assume that the slant tiles of $\Lambda$ are connected “smoothly” in $S_2$ without any gap. Let $A$ be the set of all vertexes of the slant tiles of $\Lambda$. Define cones $w$ and $iv$ by

$$w = \mathrm{Roof}^* A, \qquad iv = \mathrm{IRoof}^* A.$$

Then, the boundary of $R_\mu$ is obtained by connecting the points of $\pi((w) \cap (iv))$ on $B_2$, where $\pi$ denotes the projection of the lattice points of $\mathbb{Z}^3$ on the corresponding vertexes of flat tiles of $B_2$.

Remark $\mathrm{Roof}^* p(c)$ is not defined if $p(c) \not\subset L^3$.

For example, in the case of the closed trajectory given in Figure 3(d), the boundary of R μ is uniquely specified by

$$w = \mathrm{Cone}^* \{ Q_0, Q_1, Q_2 \}, \qquad iv = \mathrm{ICone}^* \{ Q_3, Q_4 \}$$

as shown in Figure 5.

Corollary 4.6. Let $R$ be a region in $B_2$. Then, $R$ has a closed-trajectory decomposition if and only if there exists a pair of a cotangent cone $w$ and an inverted cotangent cone $iv$ such that $R = R(w, iv)$. The pair $(w, iv)$ is also called a boundary pair (of $R$).

Proof. ($\Rightarrow$) Let $\{ \mu_i \mid i \in I \}$ ($I \subset \mathbb{N}$) be a closed-trajectory decomposition of $R$ and let $\{ (w_i, iv_i) \mid i \in I \}$ be their boundary pairs. Set

$$w = \bigcup_{i \in I} w_i, \qquad iv = \bigcup_{i \in I} iv_i.$$

Then, $R = \bigcup_{i \in I} R_{\mu_i} = \bigcup_{i \in I} R(w_i, iv_i)$.

($\Leftarrow$) A closed-trajectory decomposition of $R$ is induced by $\mathrm{Cone}\,((w) \cap (iv))$.

Remark Let T be the set of all tangent cones. Let be the set of all cotangent cones. Let L be the set of all the regions in B 2 which are defined by boundary pairs. Then, an L -valued function is defined on T × by C o n e A , C o n e * B := “the region in B 2 which is specified by the intersection of C o n e A and C o n e * B “. In particular,

C o n e ( w ) ( i v ) , w = R ( w , i v ) ,

for a boundary pair ( w , i v ) .

5. Extended Hamiltonian Cycle Problem on B2

5.1. Problem

By Corollary 4.6, we can paraphrase Problem 4.1 as follows.

Problem 5.1. (De novo design of complexes of closed trajectories of triangles) Given a boundary pair $(w, iv)$, find all the tangent cones which induce a vector field that gives a decomposition of the region $R(w, iv)$ into closed trajectories (Figure 6).

Figure 6. The extended Hamiltonian cycle problem on $B_2$ (see also Figure 1). (a) A pair of a cotangent cone and an inverted cotangent cone which specifies the boundary of a region: $\mathrm{Roof}^* A$ and $\mathrm{IRoof}^* B$, where $A, B \subset L^3$; (b) A tangent cone which induces a vector field whose trajectories don’t traverse the specified boundary: $\mathrm{Cone}\,C$, where $C = (\mathrm{Roof}^* A) \cap (\mathrm{IRoof}^* B)$.

One of the solutions to the problem is obtained immediately, i.e., $\mathrm{Cone}\,((w) \cap (iv))$ (Figure 6(b)). In this section, we consider how to find all solutions to the problem.

5.2. Closed-Trajectory Decomposition

Definition 5.2. For $A \subset \mathbb{Z}^3$, a tangent roof $\mathrm{Roof}\,A \subset \mathbb{Z}^3$ is defined by

$$\mathrm{Roof}\,A := \{ p \in \mathbb{Z}^3 \mid \exists\, N > 0 \text{ such that } p\,x_0^N,\ p\,x_1^N,\ p\,x_2^N \in \mathrm{Cone}\,A \}.$$

In other words, $\mathrm{Roof}\,A$ is obtained by putting as many unit cubes as possible on $\mathrm{Cone}\,A$.

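Definition 5.3. For $A \subset \mathbb{Z}^3$, a (tangent) ceiling $\mathrm{Ceil}\,A \subset \mathbb{Z}^3$ is defined by

$$\mathrm{Ceil}\,A := \mathrm{Cone}\,C,$$

where

$$C = \bigcup \{ B \mid \mathrm{Roof}^* B = \mathrm{Roof}^* A \text{ and } \mathrm{IRoof}^* B = \mathrm{IRoof}^* A \} \subset \mathbb{Z}^3.$$

For $A \subset \mathbb{Z}^3$, a (tangent) floor $\mathrm{Floor}\,A \subset \mathbb{Z}^3$ is defined by

$$\mathrm{Floor}\,A := \mathrm{Cone}\,C,$$

where

$$C = \bigcap \{ B \mid \mathrm{Roof}^* B = \mathrm{Roof}^* A \text{ and } \mathrm{IRoof}^* B = \mathrm{IRoof}^* A \} \subset \mathbb{Z}^3.$$

It follows immediately that

$$\mathrm{Floor}\,A \subset \mathrm{Ceil}\,A, \qquad \mathrm{Roof}\,p(\mathrm{Floor}\,A) = \mathrm{Roof}\,A, \qquad \mathrm{Roof}\,p(\mathrm{Ceil}\,A) = \mathrm{Roof}\,A,$$

where $p(c)$ denotes the set of all the “top vertexes” of a cone $c$.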

For a boundary pair $(w, iv)$, let $W(w, iv)$ be the set of all the tangent cones $c$ such that

$$\mathrm{Floor}\,C \subset c \subset \mathrm{Ceil}\,C,$$

where $C = (w) \cap (iv)$.

Now all solutions to Problem 5.1 are obtained as follows:

Proposition 5.4. $W(w, iv)$ induces all decompositions of $R(w, iv)$ into closed trajectories.

Proof. First, let $V_F$ be the set of all the vector fields whose trajectories don’t traverse the boundary of $R(w, iv)$. Then,

$$V_c \in V_F \quad \text{for any } c \in W(w, iv)$$

because $(\mathrm{Roof}^* p(c),\ \mathrm{IRoof}^* p(c)) = (w, iv)$. In particular, $V_c$ induces a decomposition of $R(w, iv)$ into closed trajectories.

Conversely, suppose that a decomposition of $R(w, iv)$ into closed trajectories is given. Then, it can be extended to a vector field $V$ on $B_2$. For example, a flow of triangles on $B_2 \setminus R(w, iv)$ is induced by

$$V_{\mathrm{Cone}\,((w) \cap (iv))}.$$

Then, there exists a tangent cone $c$ such that $V = V_c$ by Proposition 3.13. Then

$$(\mathrm{Roof}^* p(c),\ \mathrm{IRoof}^* p(c)) = (w, iv)$$

because the trajectories of $V_c$ don’t traverse the boundary of $R(w, iv)$.

For example, Figure 7 shows all solutions to the problem for the boundary pair of Figure 5.

5.3. Fusion and Fission of Closed Trajectories

For a vector field $V$ on $B_2$ and a region $R$ of $B_2$, let $\mathrm{Dec}(V, R)$ be the set of all closed trajectories of $V$ in $R$. $\mathrm{Dec}(V, R)$ gives a closed-trajectory decomposition of $R$ if it exists. The number of the closed trajectories of $\mathrm{Dec}(V, R)$ is denoted by $\#\mathrm{Dec}(V, R)$.

For a boundary pair $(w, iv)$, let $c_0$ and $c_1$ ($c_0 \ne c_1$) be two tangent cones of $W(w, iv)$. Then, the vector fields $V_{c_0}$ and $V_{c_1}$ induce two different decompositions of $R(w, iv)$: $\mathrm{Dec}(V_{c_0}, R(w, iv))$ and $\mathrm{Dec}(V_{c_1}, R(w, iv))$. The correspondence

$$\mathrm{Dec}(V_{c_0}, R(w, iv)) \longrightarrow \mathrm{Dec}(V_{c_1}, R(w, iv))$$

gives a “recombination” of closed trajectories from one to the other. In particular, it gives a “fusion” and “fission” of closed trajectories on $R(w, iv)$ if

$$\#\mathrm{Dec}(V_{c_0}, R(w, iv)) > 1 \quad \text{and} \quad \#\mathrm{Dec}(V_{c_1}, R(w, iv)) = 1.$$

In the case of Figure 7, the region $R(w, iv)$ has three decompositions but no Hamiltonian cycle:

$$\begin{cases} \#\mathrm{Dec}(V_{\mathrm{Cone}\,C},\ R(w, iv)) = 2 & (a) \\ \#\mathrm{Dec}(V_{\mathrm{Cone}\,(C \cup \{ x_1^2 \})},\ R(w, iv)) = 2 & (b) \\ \#\mathrm{Dec}(V_{\mathrm{Cone}\,(C \cup \{ x_1 \})},\ R(w, iv)) = 2 & (c) \end{cases}$$

Moreover, it is not difficult to show the following proposition (in the case of closed trajectories of triangles):

(a) (b) (c)

Figure 7. All solutions to the extended Hamiltonian cycle problem for a boundary pair ( w , i v ) , where ( w ) ( i v ) = { x 0 , x 2 , x 0 2 x 1 2 x 2 , x 1 3 , x 0 2 x 1 2 x 2 1 , x 0 x 1 3 x 2 1 } (See also Figure 2). Set C = ( w ) ( i v ) . (a) The vector field induced by F l o o r C ; (b) The vector field induce by C o n e C { x 1 2 } which is obtained by putting a cube (with top vertex x 1 2 ) on F l o o r C ; (c) The vector field induced by C e i l C = C o n e C { x 1 } , which is obtained by putting a cube (with top vertex x 1 ) on the vector field of (b).

Proposition 5.5. When a closed trajectory is merged with a closed trajectory of length 6 (which occupies a hexagonal region), they don’t fuse together to form a single closed trajectory.

In other words, closed trajectories always split when they interact with a hexagon.

See Table 1 for the distribution of the length of closed trajectories of n-simplices (n = 2, 3, 4).

Table 1. The distribution of the length of closed trajectories of n-simplices ($n = 2, 3, 4$). Two closed trajectories are identified if and only if their sequences of the second derivative coincide with each other by rotational shift, inversion, or reversion.
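The identification rule used for this table can be sketched as a canonical-form computation. The sketch below rests on assumptions that are not stated in this section: the second derivative of a closed trajectory is encoded as a cyclic sequence of +1 and -1, “inversion” is read as flipping every sign, and “reversion” as reading the sequence backwards. The helper names `canonical_form` and `same_trajectory` are hypothetical.

```python
def canonical_form(seq):
    """Smallest representative of a cyclic +1/-1 sequence under
    rotational shift, sign inversion, and reversal (assumed encoding)."""
    seq = tuple(seq)
    n = len(seq)
    if n == 0:
        return seq
    variants = []
    for base in (seq, seq[::-1]):                       # reversion
        for signed in (base, tuple(-s for s in base)):  # inversion
            for k in range(n):                          # rotational shift
                variants.append(signed[k:] + signed[:k])
    return min(variants)  # tuples compare lexicographically


def same_trajectory(a, b):
    """True if two sequences are identified under the three operations."""
    return len(a) == len(b) and canonical_form(a) == canonical_form(b)


print(same_trajectory([1, 1, -1], [-1, 1, 1]))        # related by a rotation
print(same_trajectory([1, -1, 1, -1], [1, 1, -1, -1]))
```

Counting distinct canonical forms among all admissible sequences of a given length would then give one column entry of such a table, provided the encoding assumptions above match the paper's definitions.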

6. Conclusions

We have considered an extended version of a two-dimensional Hamiltonian cycle problem in a three-dimensional setting, where the boundary of a two-dimensional region is uniquely specified by a pair of three-dimensional cones, i.e., a boundary pair. Using the discrete differential geometry of triangles, all decompositions of the region into closed trajectories of triangles are obtained immediately from the intersection of the boundary pair.

In the structural study of protein complexes, it is essential to characterize surface features of protein molecules such as bumps (convexity) and dents (concavity). However, mathematical surface characterization, which usually studies the surface of protein molecules in a three-dimensional setting, has not produced satisfactory results so far.

This paper proposes a novel mathematical approach to the structural study of protein complexes, i.e., an approach from a four-dimensional setting, where the surface of protein molecules is to be described by a pair of four-dimensional cones (with multiple top vertexes) as in the case of complexes of closed trajectories of triangles.

In our approach, protein molecules are to be represented as closed trajectories of tetrahedrons, where shape complementarity is expressed inherently. In particular, we could define fusion and fission of molecules (i.e., closed trajectories) naturally.

As a future research subject, we are considering whether there exist (algebraic) equations that a given boundary pair satisfies. If there exists a set of such equations that specifies the given boundary pair, it is possible to represent the shape of a protein molecule as a solution of a system of equations. In particular, we would obtain another protein molecule of the same function if a given set of equations has more than one solution.

Cite this paper

Morikawa, N. (2017) Discrete Differential Geometry and the Structural Study of Protein Complexes. Open Journal of Discrete Mathematics, 7, 148-164. doi: 10.4236/ojdm.2017.73014.

References

[1] Alberts, B. (1998) The Cell as a Collection of Protein Machines: Preparing the Next Generation of Molecular Biologists. Cell, 92, 291-294.
https://doi.org/10.1016/S0092-8674(00)80922-8
[2] Ponstingl, H., Kabir, T., Gorse, D. and Thornton, J.M. (2005) Morphological Aspects of Oligomeric Protein Structures. Progress in Biophysics and Molecular Biology, 89, 9-35.
[3] Pereira-Leal, J.B., Levy, E.D. and Teichmann, S.A. (2006) The Origins and Evolution of Functional Modules: Lessons from Protein Complexes. Philosophical Transactions of the Royal Society B: Biological Sciences, 361, 507-517.
https://doi.org/10.1098/rstb.2005.1807
[4] Lee, D., Redfern, O. and Orengo, C. (2007) Predicting Protein Function from Sequence and Structure. Nature Reviews Molecular Cell Biology, 8, 995-1005.
https://doi.org/10.1038/nrm2281
[5] Edelsbrunner, H. (2001) Geometry and Topology for Mesh Generation. Cambridge University Press, Cambridge.
https://doi.org/10.1017/CBO9780511530067
[6] Gerstein, M. and Richards, F.M. (2001) Protein Geometry: Volumes, Areas, and Distances. In: Rossman, M.G. and Arnold, E., Eds., The International Tables for Crystallography, Vol. F, Chap. 22, Kluwer, Dordrecht, 531-539.
[7] Ban, Y.-E.A., Edelsbrunner, H. and Rudolph, J. (2006) Interface Surfaces for Protein-Protein Complexes. Journal of the ACM, 53, 361-378.
https://doi.org/10.1145/1147954.1147957
[8] Levy, E.D., Pereira-Leal, J.B., Chothia, C. and Teichmann, S.A. (2006) 3D Complex: A Structural Classification of Protein Complexes. PLoS Computational Biology, 2, e155.
https://doi.org/10.1371/journal.pcbi.0020155
[9] Perica, T., Marsh, J.A., Sousa, F.L., Natan, E., Colwell, L.J., Ahnert, S.E. and Teichmann, S.A. (2012) The Emergence of Protein Complexes: Quaternary Structure, Dynamics and Allostery. Biochemical Society Transactions, 40, 475-491.
https://doi.org/10.1042/BST20120056
[10] Estrada, E. (2000) Characterization of 3D Molecular Structure. Chemical Physics Letters, 319, 713-718.
[11] Li, J., Guo, J. and Shiu, W.C. (2014) The Normalized Laplacian Estrada Index of a Graph. Filomat, 28, 365-371.
https://doi.org/10.2298/FIL1402365L
[12] Shang, Y. (2015) Laplacian Estrada and Normalized Laplacian Estrada Indices of Evolving Graphs. PLoS ONE, 10, e0123426.
https://doi.org/10.1371/journal.pone.0123426
[13] Taylor, W.R. and Aszodi, A. (2004) Protein Geometry, Classification, Topology and Symmetry: A Computational Analysis of Structure. Taylor & Francis, UK.
[14] Edelsbrunner, H., Letscher, D. and Zomorodian, A. (2002) Topological Persistence and Simplification. Discrete & Computational Geometry, 28, 511-533.
https://doi.org/10.1007/s00454-002-2885-2
[15] Goriely, A., Hausrath, A. and Neukirch, S. (2008) The Differential Geometry of Proteins and Its Applications to Structure Determination. Biophysical Reviews and Letters, 3, 77-101.
https://doi.org/10.1142/S1793048008000629
[16] Morikawa, N. (2014) Discrete Differential Geometry of n-Simplices and Protein Structure Analysis. Applied Mathematics, 5, 2458-2463.
https://doi.org/10.4236/am.2014.516237
[17] Morikawa, N. (2016) Discrete Differential Geometry of Triangles and Escher-Style Trick Art. Open Journal of Discrete Mathematics, 6, 161-166.
https://doi.org/10.4236/ojdm.2016.63013
[18] Woolfson, D.N., Bartlett, G.J., Burton, A.J., Heal, J.W., Niitsu, A., Thomson, A.R. and Wood, C.W. (2015) De Novo Protein Design: How Do We Expand into the Universe of Possible Protein Structures? Current Opinion in Structural Biology, 33, 16-26.
[19] Huang, P.-S., Boyken, S.E. and Baker, D. (2016) The Coming of Age of de Novo Protein Design. Nature, 537, 320-327.
https://doi.org/10.1038/nature19946
[20] Norn, C.H. and Andre, I. (2016) Computational Design of Protein Self-Assembly. Current Opinion in Structural Biology, 39, 39-45.

  

Copyright © 2017 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.