Characterization of the sequence spectrum of DNA based on the appearance frequency of the nucleotide sequences of the genome——A new method for analysis of genome structure

DOI: 10.4236/jbise.2010.34047


The nucleotide (base) sequence of a genome might reflect biological information beyond the coding sequences. The appearance frequencies of successive base sequences (key sequences) were calculated for entire genomes. Based on the appearance frequencies of the key sequences in the genome, any DNA sequence in the genome could be expressed as a sequence spectrum together with its adjoining base sequences, and this spectrum could be used to study the corresponding biological phenomena. In this paper, we used the 64 successive three-base sequences (triplets) as the key sequences, and determined and compared the spectra of specific genes with those of the chromosome, and of specific genes with those of tRNA genes, in Saccharomyces cerevisiae, Schizosaccharomyces pombe and Escherichia coli. In these analyses, a gene and its corresponding position on the chromosome showed highly similar spectra at the same fold enlargement (approximately 400-fold) in the S. cerevisiae, S. pombe and E. coli genomes. In addition, structural homology was observed between protein-coding genes and the appropriate tRNA gene(s) in the genome. This analytical method might faithfully reflect the encoded biological information, that is, the conservation of base sequences would correspond to the conservation of the translated amino acid sequence in the coding region, and it might be universally applicable to other genomes, even those that consist of multiple chromosomes.
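As a rough illustration of the idea described in the abstract, the following Python sketch counts the appearance frequency of each of the 64 triplets across a genome and then maps a DNA sequence to a list of those genome-wide frequencies, one value per position. This is only a minimal sketch under stated assumptions: the abstract does not specify whether triplets are counted in overlapping windows, how the frequencies are normalized, or how the "fold enlargement" is applied, so the overlapping windows, the relative-frequency normalization, and the function names `triplet_frequencies` and `sequence_spectrum` are all hypothetical choices made for this example and are not taken from the paper.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
# All 64 possible three-base key sequences (triplets).
TRIPLETS = ["".join(p) for p in product(BASES, repeat=3)]


def triplet_frequencies(genome: str) -> dict[str, float]:
    """Relative appearance frequency of each triplet in `genome`.

    Assumption: triplets are counted in overlapping windows that
    advance one base at a time; windows containing characters other
    than A, C, G, T are ignored.
    """
    genome = genome.upper()
    counts = Counter(genome[i:i + 3] for i in range(len(genome) - 2))
    total = sum(counts[t] for t in TRIPLETS)
    if total == 0:
        raise ValueError("genome contains no valid triplets")
    return {t: counts[t] / total for t in TRIPLETS}


def sequence_spectrum(sequence: str, freqs: dict[str, float]) -> list[float]:
    """Map each position of `sequence` to the genome-wide frequency
    of the triplet starting there (0.0 for unrecognized triplets).

    Assumption: this per-position list stands in for the paper's
    "sequence spectrum"; the published definition may differ.
    """
    sequence = sequence.upper()
    return [
        freqs.get(sequence[i:i + 3], 0.0)
        for i in range(len(sequence) - 2)
    ]


if __name__ == "__main__":
    toy_genome = "ACGTACGT"  # toy input, not real genomic data
    freqs = triplet_frequencies(toy_genome)
    print(sequence_spectrum("ACGT", freqs))
```

Under these assumptions, the spectra of a gene and of a chromosomal region could then be compared, for example by plotting both lists or by computing a correlation between them.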

Share and Cite:

Nakahara, M. and Takeda, M. (2010) Characterization of the sequence spectrum of DNA based on the appearance frequency of the nucleotide sequences of the genome——A new method for analysis of genome structure. Journal of Biomedical Science and Engineering, 3, 340-350. doi: 10.4236/jbise.2010.34047.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2020 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.