TITLE:
Accurate Plant MicroRNA Prediction Can Be Achieved Using Sequence Motif Features
AUTHORS:
Malik Yousef, Jens Allmer, Waleed Khalifa
KEYWORDS:
MicroRNA Prediction, Plant, Bioinformatics, Machine Learning, Sequence Motifs
JOURNAL NAME:
Journal of Intelligent Learning Systems and Applications, Vol.8 No.1, December 28, 2015
ABSTRACT: MicroRNAs (miRNAs) are short (~21 nt) nucleotide sequences that are either co-transcribed during the production of mRNA or organized in intergenic regions transcribed by RNA polymerase II. Drosha in animals and DCL1 in plants recognize pre-miRNAs, which set themselves apart by their characteristic stem-loop (hairpin) structure. This structure appears to be important for their recognition during maturation, the process that yields functional mature miRNAs. A large body of research is available on computational pre-miRNA detection in animals, but less on detection within the plant kingdom. Pre-miRNAs are usually predicted with machine learning approaches, which require that each pre-miRNA be converted into a set of computable features; many such features have been described. Here, we select a subset of the previously described features and add sequence motifs as new features. The resulting model, which we call MotifmiRNAPred, was tested on known pre-miRNAs listed in miRBase, and its accuracy was compared to that of existing approaches in the field. With an accuracy of 99.95% for the generalized plant model, it distinguishes itself from previously published results, which reach average accuracies between 74% and 98%. We believe that our approach is useful for predicting pre-miRNAs in plants without per-species adjustment.
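
The idea of using sequence motifs as features can be illustrated with a short sketch. This is not the MotifmiRNAPred implementation: the motif list, the function names `motif_features` and `to_vector`, and the choice of raw occurrence counts as the feature encoding are all assumptions made for the example, since the abstract does not state how motifs are discovered or encoded.

```python
from typing import Dict, List

# Hypothetical motifs, chosen for illustration only; they are not taken from the paper.
EXAMPLE_MOTIFS: List[str] = ["UGGA", "GCUC", "AAUU"]


def motif_features(sequence: str, motifs: List[str]) -> Dict[str, int]:
    """Count occurrences (overlaps included) of each motif in a hairpin sequence."""
    # Normalise case and convert DNA-style input (T) to RNA (U).
    seq = sequence.upper().replace("T", "U")
    features: Dict[str, int] = {}
    for motif in motifs:
        count = 0
        start = seq.find(motif)
        while start != -1:
            count += 1
            # Advance by one position so overlapping matches are counted too.
            start = seq.find(motif, start + 1)
        features[motif] = count
    return features


def to_vector(sequence: str, motifs: List[str]) -> List[int]:
    """Return the motif counts as a fixed-order vector for a classifier."""
    feats = motif_features(sequence, motifs)
    return [feats[m] for m in motifs]
```

Under these assumptions, each candidate hairpin yields one vector with a fixed number of entries, which could be concatenated with previously described features before training a classifier.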