Analysis and prediction of exon, intron, intergenic region and splice sites for A. thaliana and C. elegans genomes

PP. 367-373. DOI: 10.4236/jbise.2009.26053


ABSTRACT

Although a great deal of research has been undertaken on the annotation of gene structure, predictive techniques are still not fully developed. In this paper, based on the base composition of sequences and the conservation of nucleotides at exon/intron splice sites, a least increment of diversity algorithm (LIDA) is developed for studying and predicting three kinds of coding exons, introns and intergenic regions. First, the 64 trinucleotide compositions and 120 position parameters of the four bases are selected as informational parameters, and coding exon, intron and intergenic sequences are predicted. The results show that the overall prediction accuracies are 91.1% and 88.4% for the A. thaliana and C. elegans genomes, respectively. Subsequently, based on the position frequencies of the four bases in regions near the intron/coding exon boundary and the translation initiation and termination sites, 12 position parameters are selected as the diversity source, and the three kinds of coding exons are predicted using the LIDA. The prediction success rates are higher than 80%. These results can be used in sequence annotation.
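The abstract does not give the LIDA formulas, so the following is only a minimal sketch of the general idea, assuming the standard increment of diversity definition: for a count vector with total N, D = N ln N - sum(n_i ln n_i), and ID(X, Y) = D(X + Y) - D(X) - D(Y). A query sequence is assigned to the class whose diversity source gives the least increment. Only the 64 trinucleotide counts are used here; the 120 and 12 position parameters are omitted, and all function names and toy sequences are hypothetical illustrations, not the authors' implementation.

```python
from collections import Counter
from itertools import product
from math import log

# The 64 possible trinucleotides over the alphabet A, C, G, T.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def trinucleotide_counts(seq):
    """Count overlapping trinucleotides; windows with non-ACGT letters are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    return [counts.get(t, 0) for t in TRINUCLEOTIDES]


def diversity(counts):
    """Diversity measure D = N ln N - sum(n_i ln n_i), with 0 ln 0 taken as 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * log(total) - sum(n * log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); small values mean similar composition."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def classify(seq, class_sources):
    """Return the class name whose diversity source has the least ID with seq."""
    query = trinucleotide_counts(seq)
    return min(
        class_sources,
        key=lambda name: increment_of_diversity(query, class_sources[name]),
    )


# Toy diversity sources (assumed data, for illustration only).
sources = {
    "exon": trinucleotide_counts("ATGATGATGATG"),
    "intron": trinucleotide_counts("TTTTTTTTTTTT"),
}
label = classify("ATGATGATG", sources)
```

In practice each diversity source would be built from the pooled counts of a training set for that class rather than from a single short sequence.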

Share and Cite:

Lin, H., Li, Q. and Chen, C. (2009) Analysis and prediction of exon, intron, intergenic region and splice sites for A. thaliana and C. elegans genomes. Journal of Biomedical Science and Engineering, 2, 367-373. doi: 10.4236/jbise.2009.26053.

Copyright © 2024 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.