Paper Menu >>
Journal Menu >>
![]() J. Biomedical Science and Engineering, 2009, 2, 587-593 doi: 10.4236/jbise.2009.28085 Published Online December 2009 (http://www.SciRP.org/journal/jbise/ JBiSE ). Published Online December 2009 in SciRes. http://www.scirp.org/journal/jbise Fitting evolutionary process of matrix protein 2 family from influenza A virus using analytical solution of differential equation Shao-Min Yan1, Zhen-Chong Li1, Guang Wu2* 1National Engineering Research Center for Non-food Biorefinery, Guangxi Academy of Sciences, 98 Daling Road, Nanning, Guangxi, China; 2Computational Mutation Project, DreamSciTech Consulting, 301, Building 12, Nanyou A-zone, Jiannan Road, Shenzhen, Guang- dong, China. Email: hongguanglishibahao@yahoo.com Received 22 August 2009; revised 3 September 2009; accepted 4 September 2009. ABSTRACT The evolution of protein family is a process along the time course, thus any mathematical methods that can describe a process over time could be possible to de- scribe an evolutionary process. In our previously con- cept-initiated study, we attempted to use the differen- tial equation to describe the evolution of hemaggluti- nins from influenza A viruses, and to discuss various issues related to the building of differential equation. In this study, we attempted not only to use the dif- ferential equation to describe the evolution of matrix protein 2 family from influenza A virus, but also to use the analytical solution to fit its evolutionary process. The results showed that the fitting was pos- sible and workable. The fitted model parameters provided a way to further determine the evolutionary dynamics and kinetics, a way to more precisely pre- dict the time of occurrence of mutation, and a way to figure out the interaction between protein family and its environment. Keywords: Amino-Acid Pair Predictability; Differential Equation; Evolution; Fitting; Influenza A Virus; Matrix Protein 2 1. INTRODUCTION Very recently, we explored the possibility to use a dif- ferential equation to describe the evolution of hemagglu- tinin proteins from influenza A virus [1]. However, there are ten types of proteins from influ- enza A viruses, it is necessary to explore whether or not this differential description can be applied to other pro- teins from influenza A viruses. Also, it is intriguing to use the analytic solution of differential equation to fit the evolution of proteins from influenza A viruses. This is so, because the mathematical modeling is gen- erally the ending point of empiric experiments, and more importantly the mathematical modeling can provide us the tool for predicting the future of evolution. As the evolution is a process along the time course, we at first needed to represent an evolutionary subject along the time course, and then we could consider how to apply the mathematical modeling to this process. Now we are particularly interested in the evolution of proteins. However, a protein generally is a sequence of letters, which represent amino acids. Thus we need a method to represent a protein family along the time course before modeling [2-5]. In general, the evolution is a process of exchanging substances between a living subject and its environment. In this context, the differential equation is quite suitable because we defined outputinput d t dy for exchang- ing substance between a protein family and its environ- ment along the time course [1]. As we know that the evolution of proteins goes through mutations, which bring in new mutating amino acids and take out mutated amino acids. This again re- quires the conversion of amino acids into numbers to represent the exchange [2-5]. Among ten types of proteins from influenza A virus, the matrix protein 2 (M2) is important because it con- structs a proton channel in the virion and it is essential for infection [6]. Thus, the M2 protein was the target for anti-influenza drugs, and the M2 ion channel blockers was approved to treat influenza virus infections [7,8], but their use is limited by high frequencies of the resis- tance among currently circulating strains [9,10]. Also, a vaccine was designed basing on the conserved ectodo- main of M2 protein, which could match multiple influ- enza virus strains including multiple subtypes [11]. In this study, our effort was made to apply the differ- ential description to the evolution of M2 protein family from influenza A virus, and to use the analytical solution of differential equations to fit the evolutionary process of ![]() 588 S. M. Yan et al. / J. Biomedical Science and Engineering 2 (2009) 587-593 SciRes Copyright © 2009 IALS AND METHODS 2 proteins of influenza A virus sampled ers ins into ABF- 01 acids (E) and 11 le BF01755 M2 pr cession number ABF- 01 bove computation, the difference in pr tial Equation 2 proteins would have a JBiSE M2 proteins. 2. MATER 2.1. Data 5926 full-length M from 1959 to 2008 were obtained from the influenza virus resources [12]. After excluded identical sequences, 1084 M2 proteins were actually used in this study. 2.2. Conversion of Proteins into Numb For two purposes, we needed to convert M2 prote numbers: 1) we needed a single number for a single M2 protein so that we could present the evolution of M2 protein family over time, and 2) we needed a number to present mutation, which resulted in numerical exchange between M2 protein and its environment. We used the amino-acid pair predictability to do this job [2-5]. For example, an M2 protein (accession number 755) from an avian influenza virus, strain A/chi- cken/Magetan/BBVW/2005(H5N1), had 97 amino acids. The first and second amino acids could be counted as an amino-acid pair, the second and third as another amino- acid pair, the third and fourth, until the 96th and 97th, thus there were 96 amino-acid pairs. This M2 protein had 10 glutamic ucines (L): if the permutation could predict the ap- pearance of amino-acid pair EL, it would appear once (10/9711/9696=1.13); actually it did appear once, so the pair EL was predictable. By contrast, this M2 protein had 7 isoleucines (I): if the permutation could predict the appearance of amino-acid pair IL, it would appear once (7/9711/9696=0.79); however, it appeared three times in realty, so the pair IL was unpredictable. In this way, all amino-acid pairs in A otein were classified as predictable and unpredictable, which were 17.71% and 82.29%. Taking another M2 protein (ac 771) as example, this M2 protein had only one amino acid different from ABF01755 M2 protein at position 65. However, its predictable and unpredictable portions were 20.83% and 79.17%. Thus, the amino-acid pair predict- ability distinguished the difference between M2 proteins in numbers rather than in letters that represented amino acids in proteins. Based on the a edictable portion between ABF01755 and ABF01771 M2 proteins was −3.12% (17.71%−20.83%), which was regarded as the exchange between M2 protein and its environment. 2.3. Differen If ABF01755 and ABF01771 M direct relation due to a single mutation, the difference between them was outputinput dy , where y was the difference in prede time re- quired for mutation, input was the predictable portion brought in by mutating amino acid, output was the pre- dictable portion taken away by mutated amino acid. Unfortunately, we had no way to know if ABF01 dt icable portion, t was th 755 an sing the pe est and Mann-Whitney U-test were used 3. RESULTS AND DISCUSSION utation rela- d ABF01771 M2 proteins had a direct mutation rela- tionship although both were sampled in 2005. As the predictable portion was determined u rmutation based on random principle, thus this ex- change was in fact the exchange of randomness between M2 proteins and their environment, more accurately was the exchange of entropy between M2 proteins and their environment [1]. 2.4. Statistics The Student’s t-t to compare the difference between uphill and downhill half-life, and P<0.05 was considered statistically sig- nificant. The SigmaPlot for Windows was used for fit- ting [13]. Ideally, we would hope to have a direct m tionship for all 1084 M2 proteins from 1959 to 2008 involved in this study, because then we would have outputinput d t dy for each mutation relationship be- M2 proteins. Although it was impossible to fi tween any related two nd out such a rela- tio hand, our computations on predictable po Figure 1 presented th al Solution ted that the possi- nship among all of these M2 proteins sampled every- where in the world, the real mutation relationship worked in this way no matter if we had sampled them or not. Thus, we had a system of differential equations for all M2 proteins. On the other rtions of 1084 M2 proteins provided us with the most update evolutionary process in Figure 1, which was read as follows. For example, the solid curve in the top panel presented the evolution of 1084 M2 proteins from 1959 to 2008, and each point was the mean value of predict- able portions of all M2 proteins in given year with its standard deviation (vertically grey line). The similar reading was applied to other panels. So the fluctuating solid curves in e evolutionary process over time. If we could use the differential equation to describe these solid curves, it would mean that we were able to model the evolutionary process of M2 proteins. 3.1. Possibly Analytic These fluctuating solid curves sugges ![]() S. M. Yan et al. / J. Biomedical Science and Engineering 2 (2009) 587-593 589 SciRes Copyright © 2009 12 16 20 24 12 16 20 24 12 16 20 24 12 16 20 24 Predictable portion of amino-acid pairs in present types (%) 12 16 20 24 12 16 20 24 12 16 20 24 12 16 20 24 12 16 20 24 12 16 20 24 H1 subtype ( n = 210 ) H3 subtype ( n = 308 ) H5 subtype ( n = 253 ) H7 subtype ( n = 66 ) H9 subtype ( n = 91 ) 12 16 20 24 12 16 20 24 1959 1964 1969 1974 1979 1984 1989 1994 1999 2004 2009 12 16 20 24 12 16 20 24 N2 subtype ( n = 510 ) N1 subtype ( n = 443 ) 1959 1964 1969 1974 1979 1984 1989 1994 1999 2004 2009 14 16 18 20 22 24 26 14 16 18 20 22 24 26 All M2 ( n = 1084 ) Figure 1. Evolutionary process of M2 proteins from influenza A viruses from 1959 to 2008 in terms of predictable portions with respect to different subtypes. The data are presented as mean±SD. The solid and dotted lines are the actual evolutionary process and fitted evolutionary process with analytical solution of differential equation. JBiSE ![]() 590 S. M. Yan et al. / J. Biomedical Science and Engineering 2 (2009) 587-593 SciRes Copyright © 2009 ly analyticerential equations would ating solid curve representin olution governed a decaying trend with al solution for n diffb be a sum of decaying exponential and sinusoidal func- tions CteAty n tk i i cos , where y was the fluctug the predictable por- tion over time, A, α and k were parameters, t was time, was phase difference, and C was a constant [14]. 3.2. Half-Life i ii 1 JBiSE This analytical s fluctuating solid curve because of negative exponential. Hence, we were able to determine the half-life of de- caying phase of the fluctuating solid curve immediately. With decaying exponential, the half-life was k k T696.02ln 2/1 , where erval troughpeak yy klnln , which alf-life. Sre able tint was the downhill hymmetrically, we we to compute the uphill half-life too because the uphill owed the computed half-life for all possible st uggested that we were able to use the an able 1. Half-life for all and each subtype of M2 proteins from influenza A viruses. ) Half-life (years) phase suggested that mutations led M2 proteins to be- come more predictable whereas the downhill phase sug- gested that mutations led M2 proteins to become less predictable. Table 1 sh ratified peaks and troughs, and Figure 2 compared the uphill half-life with the downhill one. As no statistical difference was found in Figure 2, it indicated that the uphill half-life was not different from the downhill half- life in principle. These results s alytical solution to fit the solid curve, because the un- solved problems in our previous study were that we were not able to determine the input function for this differen- tial equation and were not able to determine if this evo- lutionary process was at steady-state. T M2 Period Length Predictable portion (% Se Peak/TrTrough Uphihill ubtypYear Years ough Peak/ll Down All 192 60-1963 20.83 23.44 18 1962-1963 2 23.44 18.23 6 1963-1966 4 18.23 23.26 1 17 58 20 12 33 51 19 38 59 25 35 22 32 1 H 14 1 1 1 18 12 H 12 14 1979-1982 4 19.53 21.67 27 1 1966-1967 2 23.26 21.46 1967-1970 4 21.46 22.51 1970-1973 4 22.51 19.59 1973-1974 2 19.59 21.93 1974-1978 5 21.93 19.73 1978-1981 4 19.73 20.83 1981-1983 3 20.83 17.14 11 1983-1987 5 17.14 20.50 1987-1991 5 20.50 18.69 1991-1997 7 18.69 20.30 1997-1998 2 20.30 19.19 1998-2001 4 19.19 20.76 2001-2004 4 20.76 18.32 2004-2007 4 18.32 19.97 2007-2008 2 19.97 18.58 9 1 1976-1979 4 21.35 18.75 21 1979-1980 2 18.75 22.92 7 1980-1983 4 22.92 16.15 8 1983-1987 5 16.15 20.63 1987-1990 4 20.63 15.63 0 1990-1992 3 15.63 18.75 1 1992-1994 3 18.75 15.63 11 1994-1996 3 15.63 20.83 7 1996-1999 4 20.83 17.07 4 1999-2000 2 17.07 22.94 5 2000-2004 5 22.94 18.85 2004-2008 5 18.85 25.00 3 1972-1973 2 22.20 19.79 1973-1975 3 19.79 25.00 9 1975-1979 5 25.00 19.53 ![]() S. M. Yan et al. / J. Biomedical Science and Engineering 2 (2009) 587-593 591 SciRes Copyright © 2009 JBiSE 1982-1985 4 21.67 17.80 14 1985-1987 3 17.80 22.92 8 1987-1988 2 22.92 18.75 7 1988-1989 2 18.75 22.92 7 1989-1990 2 22.92 17.90 6 1990-1991 2 17.90 22.92 6 1991-1998 8 22.92 17.19 19 1998-1999 2 17.19 19.68 10 1999-2003 5 19.68 17.65 32 2003-2008 6 17.65 19.59 40 H H H N 35 18 N 21 5 1994-1996 3 21.35 17.71 11 1996-1997 2 17.71 20.60 9 1997-1999 3 20.60 16.67 10 1999-2000 2 16.67 23.96 4 2000-2007 8 23.96 16.48 15 7 1996-1999 4 18.75 20.31 35 1999-2000 2 20.31 18.06 12 2000-2002 3 18.06 20.57 16 2002-2006 5 20.57 17.19 19 2006-2007 2 17.19 18.75 16 9 1997-1999 3 23.62 18.06 8 1999-2001 3 18.06 23.24 8 2001-2004 4 23.24 18.93 14 2004-2005 2 18.93 21.42 11 2005-2008 4 21.42 17.01 12 1 1976-1978 3 21.46 20.21 1978-1979 2 20.21 21.88 1980-1983 4 21.88 16.15 9 1983-1985 3 16.15 20.14 9 1985-1986 2 20.14 17.23 9 1986-1987 2 17.23 20.63 8 1987-1990 4 20.63 15.63 10 1990-1993 4 15.63 19.15 14 1993-1994 2 19.15 15.63 7 1994-1997 4 15.63 21.13 9 1997-1999 3 21.13 19.18 22 1999-2000 2 19.18 21.37 13 2000-2004 5 21.37 17.98 20 2006-2007 2 18.07 20.70 10 2007-2008 2 20.70 18.52 13 2 1961-1962 2 20.83 23.44 12 1966-1967 2 23.26 21.35 16 1967-1972 6 21.35 23.34 47 1972-1973 2 23.34 19.59 8 1973-1976 4 19.59 23.57 15 1976-1979 4 23.57 19.79 16 1982-1983 2 21.67 15.89 4 1983-1987 5 15.89 18.75 1987-1988 2 18.75 17.19 16 1989-1990 2 17.19 19.89 10 1990-1992 3 19.89 16.67 12 1992-1994 3 16.67 19.87 12 1994-1995 2 19.87 18.85 26 1995-2001 7 18.85 21.36 39 2001-2004 4 21.36 18.15 17 2004-2006 3 18.15 19.17 38 ![]() 592 S. M. Yan et al. / J. Biomedical Science and Engineering 2 (2009) 587-593 SciRes Copyright © 2009 AllH1H3H5H7H9N1N2 Half-life (years) 0 10 20 30 40 50 60 Uphill Downhill Figure 2. Comparison of uphill half-life with downhill half- life in all and different subtypes of M2 proteins from influ- enza A viruses. The data are presented as mean±SD. e 2 proteins because the process of finding Accordingeneral fitting principle, were able to deterthe goodnes our fittingrough se fit Subtype All H1 H3 H5 H7 H9 N1 N2 to the g we mine s of th veral ways, for example, 1) the Akaike’s information criterion [15], 2) the plotting of residuals versus fitted predictable portion [16], 3) the plotting of residuals ver- sus time [16], 4) the R or squared R value between fitted and actual data [13], etc. We mainly used the squared R value (Table 2) and Akaike’s information criterion to determine if the difference between solid and dotted lines was acceptable. This was so because the sampled influenza A viruses were very unbalanced due to the practical difficulty in sampling, thus the solid lines could be biased on this account. One possibility with this analytical solution was that the fluctuations would become less intensive as the time went on. This was possible because the evolutionary speed was becoming slower as less and less functional units needed to evolve. In fact, influenza viruses became more and more adapt to their environments after long- time evolution, thus they did not need to mutate a lot to suit for the changes in environments. This adaptation would lead the evolutionary speed of influenza A virus to be slower over time. For another example, the appen- dix in human could have very little speed for its evolu- tion because its function is very much limited in general. The use of differential equation to describe the evolu- tion of proteins from influenza A viruses not only ad- va 3.3. Fitting t was possible to use the analytical solution to fit thI evolution of M half-life provided the initial estimate for the parameter ki in exponential terms. The dotted lines in Figure 1 were fitted lines using the analytical solution, and Table 2 listed the fitted parame- ters for the analytical solution. As seen, the dotted lines generally were quite approximate to the evolutionary trend presented by the solid curve, indicating that the analytical solution was able to present the evolutionary process of M2 proteins from influenza A virus. Table 2. Parameters obtained after using the analytical solution to nced our modeling ability in this field, but also pro- vided us the tool to predict future mutations of influenza A viruses. For prevention of possible epidemic/pandemic, it is very important how to time mutations in proteins the evolutionary process of M2 proteins in Figure 1. A1 -1.0962 1.4086 -4.0515 -2.3573 1.5353 -3.9688 7.5911 -5.9992 k1 0. 0 0 0. 0 0. 0. 0. 0292.0000.10870762.0000136021280677 a1 0.9733 1.1999 1.0116 1.6356 1.2230 0.7309 0.6387 -0.1797 1 -0.6025 -6.8726 1.1214 -1.9029 -2.5712 2.1098 3.8746 4.6835 A- K 2 60.1439 46.9288 17.0142 -2.3072 -0.5694 -1.6362 2.1476 -2.0694 2 0.5698 0.0157 0.1206 0.0312 0.0000 0.0000 0.0557 0.0356 a2 3.0209 6.3110 0.0685 -21.0433 1.9395 1.5054 1.5402 1.2495 2 -10.7687 -4.0610 1.6425 - A k 12.2725 3.0707 -3.1941 4.7661 0.7923 3 -2.5565 -1.4254 -2.6202 -1.2754 1.8712 3.3257 1.2562 1.1683 3 0.0687 0.0395 0.0589 0.0000 0.0000 0.0596 0.0198 0.0452 a3 1.2499 2.4881 1.2671 2.8407 1.1729 0.2823 1.2257 4.2995 3 -0.5466 3.2902 -2.6996 -0.1830 0.5379 4.8288 -1.5888 -10.0636 C R 20.02 50.7341 18.4548 18.8510 19.1760 20.3824 18.9722 19.2776 0.5619 0.7951 0.8123 0.7793 0.8057 0.9601 0.7297 0.8772 R2 0.3158 0.6322 0.6599 0.6074 0.6491 0.9217 0.5325 0.7695 JBiSE ![]() S. M. Yan et al. / J. Biomedical Science and Engineering 2 (2009) 587-593 593 SciRes Copyright © 2009 from iuenza As. In tt we e fast Fourier transformis jo17-20]near ture we are abhe an Sciences (0701 and nd G. Wu. (2009) Describing evolution of he- magglutinins from influenza A viruses using a diff in Pept. Lett., 16, 794–804. n. (2002) Randomness in the primar maggluti- of influenza A virus pro- 5. nel, iviral J. Neyts. (2007) Avian influenza Developing new antiviral agents for infleatme doesre hol In- fect 48, S3–S13. 1] M.ert, Milette, s and . (2009) Universal M2 ectodomain-based influenza A vac- JBiSE nfl virusehe pasused th to do th le to use t b [3,5, alytical so . In the lution with fu fitted parameters to time the mutations. 4. ACKNOWLEDGEMENTS This study was partly supported by Guangxi Science Foundation (0907016 and 0991080) and Guangxi Academy of 09YJ17SW07). REFERENCES [1] S. Yan a eren- tial equation, Prote [2] G. Wu and S. Yay structure of protein: Methods and implications, Mol. Biol. Today, 3, 55–69. [3] G. Wu and S. Yan. (2006) Mutation trend of he nin of influenza A virus: A review from computational mutation viewpoint, Acta Pharmacol. Sin., 27, 513–526. [4] G. Wu and S. Yan. (2006) Fate teins, Protein Pept. Lett., 13, 377–384. [5] G. Wu and S. Yan. (2008) Lecture notes on computational mutation, Nova Science Publishers, New York. [6] T. Betakova. (2007) M2 protein-a proton channel of in- fluenza A virus, Curr. Pharm. Des., 13, 3231–323 [7] L. H. Pinto and R. A. Lamb. (2007) Controlling influenza virus replication by inhibiting its proton chanMol. Biosyst., 3, 18–23 [8] J. Beigel and M. Bray. (2008) Current and future ant therapy of severe seasonal and avian influenza, Antiviral Res., 78, 91–102. [9] E. De Clercq and A (H5N1) infection: Targets and strategies for chemothera- peutic intervention, Trends Pharmacol. Sci., 28, 280–285. [10] F. Hayden. (2009) uenza trnt: What the futud? Clin. . Dis., Schotsa[1 . De FW. FierX. Saelens cines: Preclinical and clinical developments, Expert Rev. Vaccines, 8, 499–508. [12] Influenza virus resources, (2009) http://www.ncbi.nlm.nih.gov/genomes/FLU/Database/mu ltiple.cgi. [13] SPSS Inc., (2002) Sigm [14] aPlot for Windows, Version 8.02. Bayesian and nonlinear hys. Res. Commun., 333, 70–78. G. Wu. (1996) Fit fluctuating blood drug concentration: A beginner’s first note, Pharmacol. Res., 33, 379–383. [15] K. Yamaoka, T. Nakagawa and T. Uno. (1978) Applica- tion of Akaike’s information criterion (AIC) in the eva- luation of linear pharmacoketic equations, J. Pharma- cokinet. Biopharm., 6, 165–175. [16] G. Wu, P. Cossettini and M. Furlanut. (1996) Prediction of blood cyclosporine concentrations in haematological patients with multidrug resistance by one-, two- and three-compartment models using least squares methods, Pharmacol. Res., 34, 47–57. [17] G. Wu and S. Yan. (2006) Timing of mutation in hemag- glutinins from influenza A virus by means of amino-acid distribution rank and fast Fourier transform, Protein Pept. Lett., 13, 143–148. [18] G. Wu and S. Yan. (2005) Timing of mutation in hemag- glutinins from influenza A virus by means of unpredict- able portion of amino-acid pair and fast Fourier trans- form, Biochem. Biop [19] G. Wu and S. Yan. (2005) Searching of main cause lead- ing to severe influenza A virus mutations and conse- quently to influenza pandemics/epidemics, Am. J. Infect. Dis., 1, 116–123. [20] S. Yan and G. Wu. (2009) Prediction of mutation position, mutated amino acid and timing in hemagglutinins from North America H1 influenza A virus, J. Biomed. Sci. Eng, 2, 190–199. |