Towards a Comprehensive Search of Putative Chitinases Sequences in Environmental Metagenomic Databases


Chitinases catalyze the hydrolysis of chitin, a linear homopolymer of β-(1,4)-linked N-acetylglucosamine. The broad range of applications of chitinolytic enzymes makes their identification and study promising. Metagenomic approaches offer access to functional genes in uncultured representatives of the microbiota and hold great potential for the discovery of novel enzymes, but tools to explore these data extensively are still scarce. In this study, we develop a chitinase mining pipeline to facilitate the comprehensive search for these enzymes in environmental metagenomic databases and to explore phylogenetic relationships among the retrieved sequences. UniProtKB fungal and bacterial chitinase sequences belonging to glycoside hydrolase (GH) families 18, 19 and 20 were used to generate 15 reference datasets, which were then aligned with the MAFFT program to produce high-quality seed alignments. Profile Hidden Markov Models (pHMMs) were built from each seed alignment using the hmmbuild program of the HMMER v3.0 package. The best-hit sequences returned by hmmsearch against two environmental metagenomic databases, Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis (CAMERA) and Integrated Microbial Genomes (IMG/M), were retrieved and further analyzed. The neighbor-joining (NJ) trees generated for each chitinase dataset showed some variability in the catalytic domain region of the metagenomic sequences and revealed sequence patterns common to all the trees. Scanning the retrieved metagenomic sequences for conserved chitinase domains and signatures with both the InterPro and RPS-BLAST tools confirmed the efficacy and sensitivity of our pHMM-based approach in detecting putative chitinase sequences. These analyses provide insight into the potential reservoir of novel molecules in metagenomic databases while supporting the chitinase mining pipeline developed in this work.
By using our chitinase mining pipeline, a larger number of previously unannotated metagenomic chitinase sequences can be classified, enabling further studies on these enzymes.
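
The align, build and search steps described above can be sketched in Python as follows. This is only an illustrative outline, not the authors' implementation: the file names, the E-value cutoff of 1e-5, and the rule of keeping the lowest E-value per target sequence as the "best hit" are all assumptions made for the example. The command-line forms used (`mafft --auto`, `hmmbuild <hmm> <alignment>`, `hmmsearch --tblout`) are standard MAFFT and HMMER 3 usage, and depending on the HMMER version the MAFFT output may need converting to an alignment format that hmmbuild accepts.

```python
from typing import Dict, List, Tuple


def mafft_command(fasta_in: str) -> List[str]:
    """Command to align one reference dataset; MAFFT writes the alignment to stdout."""
    return ["mafft", "--auto", fasta_in]


def hmmbuild_command(hmm_out: str, alignment: str) -> List[str]:
    """Command to build a profile HMM from a seed alignment."""
    return ["hmmbuild", hmm_out, alignment]


def hmmsearch_command(hmm: str, seq_db: str, tblout: str,
                      evalue: float = 1e-5) -> List[str]:
    """Command to search a protein database; the cutoff value is an assumption."""
    return ["hmmsearch", "--tblout", tblout, "-E", str(evalue), hmm, seq_db]


def best_hits(tblout_text: str) -> Dict[str, Tuple[str, float]]:
    """Parse hmmsearch --tblout text (possibly concatenated from several models).

    Returns, for each target sequence, the query model and full-sequence
    E-value of its lowest-E-value hit (an assumed definition of "best hit").
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/footer comments
        fields = line.split()
        # tblout columns: 0 = target name, 2 = query name, 4 = full-sequence E-value
        target, query, evalue = fields[0], fields[2], float(fields[4])
        if target not in best or evalue < best[target][1]:
            best[target] = (query, evalue)
    return best


if __name__ == "__main__":
    # Hypothetical file names; the commands are printed, not executed.
    print(" ".join(mafft_command("gh18_bacteria.fasta")), "> gh18_bacteria.aln")
    print(" ".join(hmmbuild_command("gh18_bacteria.hmm", "gh18_bacteria.aln")))
    print(" ".join(hmmsearch_command("gh18_bacteria.hmm", "metagenome.faa",
                                     "gh18_bacteria.tbl")))
```

Keeping the command construction separate from the parsing makes it straightforward to repeat the same three steps for each of the 15 reference datasets and then merge the resulting tables before selecting hits.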

Share and Cite:

Romão-Dumaresq, A. , Fróes, A. , Cuadrat, R. , Silva, F. and Dávila, A. (2014) Towards a Comprehensive Search of Putative Chitinases Sequences in Environmental Metagenomic Databases. Natural Science, 6, 323-337. doi: 10.4236/ns.2014.65034.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2021 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.