Elaboration of Simplest Folding Structures in 2-Dimensional Lattice with Delta-Hemolysin and Its Variants in HP Model

Although advanced 3-dimensional structure measurements provide more and more detailed structures in the Protein Data Bank, the simplest 2-dimensional lattice model still looks meaningful because 2-dimensional structures play a complementary role with respect to 3-dimensional structures. In this study, the folding structures of delta-hemolysin and its six variants were studied on a 2-dimensional lattice, and the amino acid contacts in their folding structures were considered according to the HP model with the aid of the normalized amino acid hydrophobicity index. The results showed that: 1) because of symmetry, delta-hemolysin or any of its variants could find any of its folding structures within one eighth of the 1,129,718,145,924 folding structures, which reduces the time required for folding; 2) the impact of pH on folding structures varies and is associated directly with the amino acid sequence itself; 3) the changes in folding structures of the variants differed case by case; and 4) assigning a hydrophobicity index to each amino acid was a way to distinguish folding structures at the same native state. This study can help to understand the structure of delta-hemolysin, and such an analysis can shed light on the NP problem listed among the Millennium Prize Problems because HP folding on a lattice belongs to a sub-problem of the NP problem.


INTRODUCTION
With the advance of technologies in 3-dimensional structure measurements, more and more detailed structures are documented in the Protein Data Bank [1,2]. Therefore, protein folding models appear pale compared with real structures, especially the 2-dimensional model, whose results are deemed meaningless and doubtful. However, why do we need to study Newton's physics after having Einstein's theory? Why should we not destroy all photo devices, which record 2-dimensional pictures while our world is 3-dimensional? Actually, the simplest 2-dimensional lattice model looks more meaningful nowadays than before because: 1) 2-dimensional structures do not exist in reality and thus play a complementary role with respect to 3-dimensional structures; 2) folding on a 2-dimensional lattice shows the ways in which a protein can fold, since 3-dimensional data document only a limited number of those ways, if any; 3) the ways in which a protein folds provide a clue to how a protein folds itself within a very short interval of time; 4) it is more efficient and effective to optimize computing algorithms on a 2-dimensional lattice than on a 3-dimensional lattice; 5) the exhaustive computation of folding structures would be a good measure of how computing power advances; 6) models can tell us about the future whereas data only record the past; and 7) our daily experience indicates that we are more comfortable with a family album than with a collection of 3-dimensional statues of family members, i.e., 2-dimensional data are easier to store than 3-dimensional data.
At this point, the hydrophobic-polar (HP) model comes to mind, perhaps with some scorn, because it folds a protein along 2-dimensional and 3-dimensional lattices [3,4]. The HP model is widely considered too simple compared with real-life cases, because it just classifies amino acids as either hydrophobic (H) or polar (P), converts an amino acid sequence into an HP sequence, folds the HP sequence along a lattice, and finally counts the number of H-H contacts. However, at the native state of a folding structure, the number of possible contact types between any two amino acids can reach 400 ($20^2$), e.g., AA contact, AR contact, VV contact, so the computation is huge if one wants to find a native state defined by particular amino acid contacts. From this viewpoint, studies of protein folding on a 2-dimensional lattice should not be abandoned.
As a matter of fact, any sequence should fold in the same way on a lattice, no matter whether it is an HP sequence or an amino acid sequence: in a self-avoiding way in which each step goes left, ahead, or right. Actually, HP folding on a lattice belongs to a sub-problem of the NP problem listed among the Millennium Prize Problems [5][6][7]. This means that if humans want to solve the NP problem, we have to study HP folding despite the view held by some that this type of study is out of fashion. The truth is that we do not know much about folding structures on a 2-dimensional lattice because of the lack of computing power.
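The left/ahead/right rule can be illustrated with a minimal Python sketch that enumerates self-avoiding placements of a short chain on the square lattice. The function name `self_avoiding_walks` and the coordinate convention (first residue at the origin) are assumptions made for illustration; this is not the program used in this study.

```python
def self_avoiding_walks(n):
    """Yield every self-avoiding placement of n >= 2 residues on the square lattice.

    Each placement is a tuple of (x, y) sites. The first residue sits at the
    origin, the first bond takes one of 4 directions, and every later bond
    turns left, goes ahead, or turns right relative to the previous bond.
    """
    def extend(path, heading):
        if len(path) == n:
            yield tuple(path)
            return
        dx, dy = heading
        # left turn, straight ahead, right turn (relative to the current heading)
        for ndx, ndy in ((-dy, dx), (dx, dy), (dy, -dx)):
            nxt = (path[-1][0] + ndx, path[-1][1] + ndy)
            if nxt not in path:  # self-avoidance: a site is used at most once
                path.append(nxt)
                yield from extend(path, (ndx, ndy))
                path.pop()

    for first in ((1, 0), (0, 1), (-1, 0), (0, -1)):
        yield from extend([(0, 0), first], first)


# A 4-residue chain cannot collide with itself yet; a 5-residue chain can
# close a unit square, which removes 8 of the 4 * 3**3 = 108 move sequences.
print(sum(1 for _ in self_avoiding_walks(4)))  # 36
print(sum(1 for _ in self_avoiding_walks(5)))  # 100
```

The sketch stores the path as a list and tests membership linearly, which is adequate for short chains only; an exhaustive run for 26 residues would need a far more economical representation.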
Staphylococcus aureus is a Gram-positive bacterium that can be found in the upper respiratory tract and on the skin [8]. S. aureus can secrete hemolysins to cause cell death [9]. Hemolysins include at least α-, β-, γ- and δ-hemolysins, and each plays a different role in damaging the cell membrane as determined by 3-dimensional structures [10][11][12]. However, their 2-dimensional structures have yet to be studied with the HP model. Because α-, β-, and γ-hemolysins are each composed of more than 300 amino acids, it is only practical to study δ-hemolysin, which is composed of 26 amino acids [13]. Although δ-hemolysin has been studied for nearly 50 years, its high potency against Legionella was reported only recently [14,15], and its anti-pathogenic activity may inhibit quorum sensing pathways [16]. The mechanism of antimicrobial and cytolytic peptides in model membranes suggested that the Gibbs energy of binding to the membrane is the primary determinant of peptide activity [17].
With the advance of computing power, it is possible to analyze all of δ-hemolysin's folding structures on a 2-dimensional lattice because it is short, being composed of 26 amino acids. Even so, the number of its possible folding structures on a 2-dimensional lattice is astonishing, i.e., $4 \times 3^{n-2} = 4 \times 3^{26-2} = 1{,}129{,}718{,}145{,}924$. A ThinkPad laptop with a 2 GHz dual-core CPU can compute 200,000 folding structures per second, so it needs about 65 days (1,129,718,145,924/200,000 ≈ 5,648,591 seconds) to compute all the possible folding structures. Each mutant also has the same number of folding structures. Therefore, researchers concentrate on developing optimal algorithms in order to reduce the computation as much as possible [5,18-23]. This study will analyze all possible folding structures of δ-hemolysin from S. aureus and its variants.
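The arithmetic above can be checked with a few lines of Python. The function name `lattice_walk_count` is an assumption made for illustration; the throughput of 200,000 structures per second is the figure quoted in the text. Note that the formula counts move sequences (4 choices for the first bond, then 3 for each of the remaining n − 2 bonds), so the sequences that collide with themselves are included in the total.

```python
def lattice_walk_count(n_residues):
    """Number of left/ahead/right move sequences for a chain of n_residues."""
    return 4 * 3 ** (n_residues - 2)


SECONDS_PER_DAY = 24 * 60 * 60

total = lattice_walk_count(26)
rate = 200_000                    # structures per second, as quoted in the text
seconds = total / rate
days = seconds / SECONDS_PER_DAY

print(f"{total:,}")               # 1,129,718,145,924
print(round(seconds))             # 5648591
print(round(days, 1))             # 65.4
```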

Data
Amino acid sequences of δ-hemolysin and its variants were obtained from UniProt [24] under the accession number P0C1V1. Six natural variants were found in canine, including the variants at position 3 J. Biomedical Science and Engineering

Sequence in 2-Dimensional Lattice
The HP model is simple because it classifies each amino acid either as hydrophobic (H) or as polar (P), although several amino acids are neutral. If those neutral amino acids are dealt with properly, the HP model would work in real-life cases. For this purpose the normalized amino acid hydrophobicity index (Table 1) was used, in which only glycine is considered a neutral amino acid [26]. This leads each amino acid sequence to have four HP sequences, according to: 1) whether glycine was treated as a hydrophobic or as a polar amino acid, and 2) whether the amino acid sequence was at pH 2 or at pH 7. In this way the amino acid sequences of δ-hemolysin and its six variants were converted, giving the 28 HP sequences (7 sequences × 4) listed in Table 2.
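The conversion into four HP sequences can be sketched as follows. The residue sets in `HYDROPHOBIC_AT` and the peptide `AKGLD` are placeholders invented for illustration; they are not the values of Table 1 and not a sequence from this study. Only the logic (one classification per pH, glycine assigned separately) follows the description above.

```python
# Placeholder classification, NOT Table 1: which one-letter codes count as
# hydrophobic at each pH. Glycine is deliberately left out and handled apart.
HYDROPHOBIC_AT = {
    "pH2": set("ALD"),
    "pH7": set("AL"),
}


def to_hp(sequence, hydrophobic, glycine_as):
    """Convert an amino acid sequence to an HP string.

    hydrophobic: set of one-letter codes treated as H.
    glycine_as:  'H' or 'P', the label given to the neutral glycine.
    """
    labels = []
    for residue in sequence:
        if residue == "G":
            labels.append(glycine_as)
        else:
            labels.append("H" if residue in hydrophobic else "P")
    return "".join(labels)


peptide = "AKGLD"  # hypothetical toy peptide
for ph, hydrophobic in HYDROPHOBIC_AT.items():
    for glycine_as in ("H", "P"):
        print(ph, f"G={glycine_as}", to_hp(peptide, hydrophobic, glycine_as))
```

With seven amino acid sequences (the wild type and six variants), running the two loops for each sequence gives the 28 HP sequences mentioned above.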

RESULTS AND DISCUSSION
As has been seen, the merit of the HP model is that it divides amino acids into hydrophobic (H) and polar (P), so it is easy to find the native state of a folding structure, i.e., the one with the maximal number of H-H contacts, whereas it is quite laborious to search for contacts between every possible combination of two amino acids. Of the 1,129,718,145,924 folding structures, we are only interested in the structures at the native state defined by H-H contacts. Figure 1 shows several characteristics of eight folding structures of δ-hemolysin. First, δ-hemolysin folds from position 1 to position 26 no matter whether it is treated as an HP sequence or as an amino acid sequence. Second, each non-sequential H-H contact contributes one unit of negative energy according to the definition in the HP model; this definition could also be based on any other contact, which would certainly result in different folding structures. Third, symmetry holds for the eight folding structures, i.e., 1) the folding structures on the left-hand side are vertically symmetric to those on the right-hand side; 2) the folding structures I and VI, II and V, III and VIII, and IV and VII are horizontally symmetric; and 3) the folding structures I and V, II and VI, III and VII, and IV and VIII are related by a 180-degree rotation. If we carefully examine the eight folding structures in Figure 1, those structures are identical but are reached through eight different pathways in the 2-dimensional lattice. This means that a protein needs far less time to fold itself than previously assumed [27]. Actually, such symmetric structures are chiral structures in terms of 3-dimensional structures, which lead to different recognition mechanisms in enzymatic functions. This shows the complementary role of 2-dimensional structures to 3-dimensional structures because it can explain the chiral center in terms of the folding of amino acids.
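The energy definition and the eightfold symmetry can be illustrated with a small Python sketch. The names `hh_energy`, `SYMMETRIES`, and `symmetric_images`, and the four-residue example, are assumptions made for illustration rather than the structures of Figure 1. The sketch counts one unit of negative energy per non-sequential H-H contact and applies the eight symmetries of the square lattice (four rotations and four reflections) to one folding structure.

```python
def hh_energy(hp, coords):
    """Return minus the number of non-sequential H-H contacts.

    hp:     string of 'H'/'P', one label per residue.
    coords: sequence of (x, y) lattice sites, one per residue, in chain order.
    """
    index_at = {site: i for i, site in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if hp[i] != "H":
            continue
        # Look only right and up, so that every neighbouring pair is seen once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and hp[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# The eight symmetries of the square lattice: 4 rotations, then 4 reflections.
SYMMETRIES = (
    lambda x, y: (x, y),
    lambda x, y: (-y, x),
    lambda x, y: (-x, -y),
    lambda x, y: (y, -x),
    lambda x, y: (-x, y),
    lambda x, y: (x, -y),
    lambda x, y: (y, x),
    lambda x, y: (-y, -x),
)


def symmetric_images(coords):
    """Return the eight images of one folding structure."""
    return [tuple(f(x, y) for x, y in coords) for f in SYMMETRIES]


hp = "HPPH"                                   # hypothetical toy sequence
fold = ((0, 0), (1, 0), (1, 1), (0, 1))       # residues 1 and 4 touch
images = symmetric_images(fold)
print(len(set(images)))                       # 8
print({hh_energy(hp, image) for image in images})  # {-1}
```

Because rotations and reflections preserve lattice adjacency, all eight images have the same contacts and the same energy, which is why only one representative per symmetry class needs to be evaluated. A completely straight chain is its own mirror image, so it has four distinct images rather than eight.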
As glycine is treated either as hydrophobic (H) or as polar (P) according to the normalized amino acid hydrophobicity index (Table 1), Figure 2 presents the influence of this choice on the folding structure of δ-hemolysin. As can be seen, the sole glycine at position 10 was subject to both considerations. When the glycine was classified as a polar amino acid, the minimal energy changed to -11 from -12 (bottom panels vs. middle panels in Figure 2). Another issue is whether pH could influence the folding structure. In this particular case, no influence of pH on the folding structure was found because both the folding structure and the minimal energy were the same in the left and right panels of Figure 2. On the other hand, the pH influence could be seen in the summed values of the normalized hydrophobicity index of the amino acids that formed H-H contacts (the values in parentheses). Figure 4 and Figure 6, and particularly the mutation led the glycine to be absent in Figure 4. Also, the minimal energy changed in Figure 4, Figure 6 and Figure 7 at different pH, which also shows an influence on misfolding [28,29]. Therefore, whether pH affects H-H contacts should be determined case by case (Figures 2-8).
Moreover, the structures in those figures suggested that a native state could have several different structures. Table 3 shows the detailed analyses, where pH 2 and pH 7 are the references used to determine whether an amino acid is hydrophobic (H) or polar (P) in the normalized amino acid hydrophobicity index (Table 1), and G = H and G = P indicate whether glycine was considered hydrophobic (H) or polar (P). In the example of δ-hemolysin (first entries, Table 3), the negative values (-12 and -11) are the minimal energies determined by the number of non-sequential H-H contacts. As can be seen, a native state can have many folding structures, for example, 2160 and 7552 folding structures at pH 2 with G = H and G = P. As there are so many structures at a native state, this suggests that δ-hemolysin might have sufficient structures to deal with various situations. Still, Table 3 indicates that assigning a hydrophobicity index to each amino acid can help to distinguish folding structures at the same native state.
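How one native state can contain several folding structures, and how a per-residue index can tell them apart, can be sketched in Python on a toy chain. Everything named here is an assumption made for illustration: the helper names, the peptide `IKDL`, the placeholder `index` values (which are not those of Table 1), and the choice to sum the index over both residues of each H-H contact. The first bond is fixed to point east, which removes the four rotations from the count.

```python
def conformations(n):
    """Self-avoiding placements of n >= 2 residues, first bond pointing east."""
    def extend(path, heading):
        if len(path) == n:
            yield tuple(path)
            return
        dx, dy = heading
        for ndx, ndy in ((-dy, dx), (dx, dy), (dy, -dx)):  # left, ahead, right
            nxt = (path[-1][0] + ndx, path[-1][1] + ndy)
            if nxt not in path:
                path.append(nxt)
                yield from extend(path, (ndx, ndy))
                path.pop()

    yield from extend([(0, 0), (1, 0)], (1, 0))


def hh_contacts(hp, coords):
    """List the (i, j) residue pairs that form non-sequential H-H contacts."""
    index_at = {site: i for i, site in enumerate(coords)}
    pairs = []
    for i, (x, y) in enumerate(coords):
        if hp[i] != "H":
            continue
        for neighbour in ((x + 1, y), (x, y + 1)):  # each pair is seen once
            j = index_at.get(neighbour)
            if j is not None and hp[j] == "H" and abs(i - j) > 1:
                pairs.append((min(i, j), max(i, j)))
    return pairs


def native_states(hp):
    """Return (minimal energy, list of folding structures reaching it)."""
    best, found = 0, []
    for coords in conformations(len(hp)):
        energy = -len(hh_contacts(hp, coords))
        if energy < best:
            best, found = energy, [coords]
        elif energy == best:
            found.append(coords)
    return best, found


def contact_index_sum(sequence, hp, coords, index):
    """Sum the per-residue index over both partners of every H-H contact."""
    return sum(index[sequence[i]] + index[sequence[j]]
               for i, j in hh_contacts(hp, coords))


sequence, hp = "IKDL", "HPPH"                      # hypothetical toy peptide
index = {"I": 100, "K": -20, "D": -50, "L": 90}    # placeholder values
energy, structures = native_states(hp)
print(energy, len(structures))                     # -1 2
print([contact_index_sum(sequence, hp, s, index) for s in structures])  # [190, 190]
```

In this toy case the two native structures are mirror images with the same contact, so their index sums coincide; for longer chains, native structures with different contact pairs would in general receive different sums, which is the distinction drawn in Table 3.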
To our knowledge, the hydrophobic-hydrophilic-neutral (BPN) model is a comparable 2-dimensional model, which determines the optimal pathway for folding to the native structure by enumerating all the possible folding pathways [30]. The energy landscape model, in contrast, deals with a funnel-like landscape biased toward the native structure [31,32]. Finally, the folding intermediate model is related to the stability of, and the activation energy barriers between, folding intermediates [33]. However, a detailed comparison among these four models is beyond the scope of this article, and we hope to address this issue in the near future.

CONCLUSION
In conclusion, this study takes a step forward from our previous studies [34][35][36] in the following points: 1) because of symmetry, a protein can find any of its folding structures within one eighth of all folding structures, which reduces the time required for folding; 2) the impact of pH on folding structures varies and is associated directly with the amino acid sequence itself; 3) the changes of folding structures in the variants differed case by case; and 4) assigning a hydrophobicity index to each amino acid is a way to distinguish folding structures at the same native state.

CONFLICTS OF INTEREST
The authors declare no conflicts of interest regarding the publication of this paper.