Favorable and unfavorable amino acid residues in water-soluble and transmembrane proteins


We analyzed the amino acid residues in the water-soluble and transmembrane proteins of 6 thermophilic and 6 mesophilic species from the domains Archaea and Eubacteria, and characterized each residue as favorable or unfavorable. The characterization compared the observed number of each amino acid residue with the expected number calculated from the percentage of nucleotides present in each gene. Amino acids that were more abundant than expected were considered favorable, and those that were less abundant than expected were considered unfavorable. Comparisons of amino acid compositions indicated that water-soluble proteins were rich in charged residues such as Glu, Asp, Lys, and His, whereas hydrophobic residues such as Trp, Phe, and Leu were abundant in transmembrane proteins. Interestingly, although Trp was abundant in transmembrane proteins, it was not defined as favorable by our calculations, indicating that a high abundance of a particular amino acid does not necessarily make it a favorable residue. Amino acids with high G + C content such as Ala, Gly, and Pro were frequently observed as favorable in species with low G + C content. Conversely, amino acids with low G + C content such as Phe, Tyr, Lys, Ile, and Met were frequently observed as favorable in species with high G + C content. In these cases, the supply of an amino acid is increased beyond what the nucleotide composition alone would provide. Amino acids with neutral G + C content, i.e., Glu and Asp, were favorable in water-soluble proteins from all species analyzed, and Cys was unfavorable in both water-soluble and transmembrane proteins. These results indicate that amino acid compositions are essentially determined by the nucleotide sequences of the genes, and that the amino acid content is then adjusted by deviations from this expectation.
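The observed-versus-expected comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the following rests on stated assumptions: each codon position is drawn independently from the gene-wide base frequencies, the standard genetic code applies, and stop codons are excluded before normalizing. The names `expected_fractions` and `classify` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expected_fractions(gene):
    """Expected amino acid fractions under the independence assumption.

    Assumption (not from the paper): a codon's probability is the product
    of the gene-wide frequencies of its three bases.
    """
    gene = gene.upper()
    base_freq = {b: gene.count(b) / len(gene) for b in BASES}
    expected = Counter()
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons do not contribute residues
        expected[aa] += base_freq[codon[0]] * base_freq[codon[1]] * base_freq[codon[2]]
    total = sum(expected.values())
    return {aa: p / total for aa, p in expected.items()}


def classify(gene):
    """Return {amino acid: (observed, expected, label)} for one coding sequence."""
    gene = gene.upper()
    usable = len(gene) - len(gene) % 3
    codons = [gene[i:i + 3] for i in range(0, usable, 3)]
    # Count residues, skipping stop codons and codons with ambiguous bases.
    observed = Counter(
        CODON_TABLE[c] for c in codons if CODON_TABLE.get(c, "*") != "*"
    )
    n_residues = sum(observed.values())
    result = {}
    for aa, fraction in expected_fractions(gene).items():
        expected_count = fraction * n_residues
        obs = observed.get(aa, 0)
        if obs > expected_count:
            label = "favorable"
        elif obs < expected_count:
            label = "unfavorable"
        else:
            label = "as expected"
        result[aa] = (obs, expected_count, label)
    return result


if __name__ == "__main__":
    # Toy sequence encoding Met-Glu-Glu-Glu-Lys followed by a stop codon.
    for aa, (obs, exp, label) in sorted(classify("ATGGAAGAAGAAAAATGA").items()):
        print(f"{aa}: observed={obs}, expected={exp:.2f}, {label}")
```

In this sketch an amino acid is labeled favorable only relative to the gene's own base composition, which is why a residue can be abundant in absolute terms and still not count as favorable, as the abstract reports for Trp in transmembrane proteins. A real analysis would aggregate over all genes of a species and would likely need a statistical threshold rather than a bare greater-than comparison.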

Share and Cite:

Nakashima, H., Yoshihara, A. and Kitamura, K. (2013) Favorable and unfavorable amino acid residues in water-soluble and transmembrane proteins. Journal of Biomedical Science and Engineering, 6, 36-44. doi: 10.4236/jbise.2013.61006.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2022 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.