MetalloPred: A tool for hierarchical prediction of metal ion binding proteins using cluster of neural networks and sequence derived features

Abstract

Given a protein sequence, how can we determine whether it is a metalloprotein? If it is, to which main functional class and subclass does it belong? These are important biological questions because the answers are closely related to the biological function of an uncharacterized protein. With the avalanche of protein sequences generated in the post-genomic era, and because conventional experimental techniques are time-consuming and expensive, it is highly desirable to develop an automated method that answers these questions quickly and accurately. Here we present a top-down predictor, MetalloPred, which performs three levels of hierarchical classification using a cascade of neural networks operating on sequence-derived features. The first layer of the prediction engine identifies whether a query protein is a metalloprotein; the second layer predicts its main functional class; and the third layer predicts its sub-functional class. The overall success rates for all three layers are higher than 60%, as obtained through rigorous cross-validation tests on stringent benchmark datasets in which no protein has 30% or more sequence identity to any other protein in the same class or subclass. With these prediction accuracies, MetalloPred could complement experimental approaches for the identification of metal-binding proteins. MetalloPred is freely available for in-house use as a standalone tool and is accessible at http://www.juit.ac.in/assets/Metallopred/.
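The top-down, three-layer flow described in the abstract can be illustrated with a minimal Python sketch. It is not the MetalloPred implementation. The feature used here (amino acid composition), the function names, the placeholder labels such as "class_A", and the toy stand-in models are all assumptions made for illustration; in the actual tool each layer would be a trained neural network.

```python
from typing import Callable, Dict, List, Optional, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# A classifier is any callable mapping a feature vector to a label.
# In practice each would wrap a trained neural network (assumption).
Classifier = Callable[[List[float]], str]


def amino_acid_composition(sequence: str) -> List[float]:
    """Fraction of each standard residue: one possible sequence-derived feature."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def cascade_predict(
    sequence: str,
    layer1: Classifier,
    layer2: Classifier,
    layer3: Dict[str, Classifier],
) -> Tuple[str, Optional[str], Optional[str]]:
    """Run the three layers top-down, stopping early for non-metalloproteins."""
    features = amino_acid_composition(sequence)

    # Layer 1: metalloprotein or not.
    if layer1(features) != "metalloprotein":
        return ("non-metalloprotein", None, None)

    # Layer 2: main functional class.
    main_class = layer2(features)

    # Layer 3: subclass, using the model dedicated to the predicted main class.
    sub_model = layer3.get(main_class)
    sub_class = sub_model(features) if sub_model is not None else None
    return ("metalloprotein", main_class, sub_class)


# Hypothetical stand-ins for trained networks, for demonstration only.
def toy_layer1(features: List[float]) -> str:
    cys_his = features[AMINO_ACIDS.index("C")] + features[AMINO_ACIDS.index("H")]
    return "metalloprotein" if cys_his > 0.10 else "non-metalloprotein"


def toy_layer2(features: List[float]) -> str:
    return "class_A"  # placeholder label, not a real MetalloPred class


toy_layer3: Dict[str, Classifier] = {"class_A": lambda features: "subclass_A1"}

if __name__ == "__main__":
    print(cascade_predict("CCHHAAAA", toy_layer1, toy_layer2, toy_layer3))
    print(cascade_predict("AAAAGGGG", toy_layer1, toy_layer2, toy_layer3))
```

Routing layer 3 through a dictionary keyed by main class reflects one plausible reading of "hierarchical": each main class gets its own subclass model, so an error at an upper layer propagates downward, which is why the abstract reports success rates for each layer.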

Share and Cite:

Naik, P. , Ranjan, P. , Kesari, P. and Jain, S. (2011) MetalloPred: A tool for hierarchical prediction of metal ion binding proteins using cluster of neural networks and sequence derived features. Journal of Biophysical Chemistry, 2, 112-123. doi: 10.4236/jbpc.2011.22014.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.