Prediction of protein folding rates from primary sequence by fusing multiple sequential features


We have developed a web server for predicting the folding rate of a protein from its amino acid sequence alone. The web server is called Pred-PFR (Predicting Protein Folding Rate). Pred-PFR fuses multiple individual predictors, each built on one specific feature derived from the protein sequence. The resulting ensemble predictor outperforms the individual ones: it achieves a higher correlation coefficient and a lower root mean square deviation between predicted and observed results when examined by jackknife cross-validation on a recently constructed benchmark dataset. As a user-friendly web server, Pred-PFR is freely accessible to the public at Rate/.
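The evaluation scheme described in the abstract (fuse several single-feature predictors, then score the ensemble by jackknife cross-validation using the correlation coefficient and root mean square deviation) can be illustrated with a minimal sketch. The abstract does not specify Pred-PFR's features, individual predictors, or fusion rule, so everything here is an assumption made for illustration: each individual predictor is a one-variable least-squares line, fusion is an equal-weight average, and the function names and toy numbers are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for one sequence-derived feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def jackknife_ensemble(features, rates):
    """Leave-one-out predictions from an equal-weight fusion (an assumed rule)
    of one linear predictor per feature column."""
    n, n_feat = len(rates), len(features[0])
    preds = []
    for i in range(n):
        train = [j for j in range(n) if j != i]  # hold out protein i
        ys = [rates[j] for j in train]
        individual = []
        for k in range(n_feat):
            xs = [features[j][k] for j in train]
            slope, intercept = fit_line(xs, ys)
            individual.append(slope * features[i][k] + intercept)
        preds.append(sum(individual) / n_feat)  # fuse by simple averaging
    return preds

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def rmsd(a, b):
    """Root mean square deviation between predicted and observed values."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

if __name__ == "__main__":
    # Made-up toy data: two hypothetical features per protein, and made-up
    # observed folding rates (e.g. ln kf). Not taken from the paper.
    toy_features = [[0.10, 12.0], [0.25, 9.5], [0.40, 8.0],
                    [0.55, 6.1], [0.70, 4.2], [0.85, 2.9]]
    toy_rates = [1.2, 2.9, 4.1, 6.3, 7.8, 9.4]
    predicted = jackknife_ensemble(toy_features, toy_rates)
    print("correlation:", pearson(predicted, toy_rates))
    print("RMSD:", rmsd(predicted, toy_rates))
```

Because each protein is predicted by models trained without it, the two scores reflect performance on unseen sequences rather than goodness of fit, which is why the jackknife test is the basis for comparing the ensemble with its individual members.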

Share and Cite:

Shen, H., Song, J. and Chou, K. (2009) Prediction of protein folding rates from primary sequence by fusing multiple sequential features. Journal of Biomedical Science and Engineering, 2, 136-143. doi: 10.4236/jbise.2009.23024.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.