Identification of Regulatory Sequence Signatures in MicroRNA Precursors Implicated in Neurological Disorders

MicroRNAs have emerged as one of the major classes of non-coding RNAs. Recent reports indicate that they are highly abundant in the nervous system, where they play key roles in development. Neurological disorders such as Parkinson's disease, Alzheimer's disease and Huntington disease have also been studied, and several microRNAs associated with disease pathogenesis have been identified. Many of these findings indicate differential expression of the microRNAs concerned. Such changes in expression levels point not only to control of microRNA biogenesis but also to critical yet unelucidated roles of regulatory proteins, which probably act in concert to control the production or maturation of these molecules. In this work, a collection of overrepresented regulatory motif signatures was identified in the DNA and RNA sequences of the precursor microRNAs. The identification of such regulatory sequence signatures promises to provide new insights into many facets of microRNA regulation and neurological disorders.


INTRODUCTION
Along with the steady rise in human life expectancy, there has been an increase in the prevalence of neurodegenerative disorders. Limited knowledge of the molecular processes involved in disease occurrence and progression has proved to be a major hurdle in the design of suitable drugs and other therapeutic strategies.
Since their discovery in C. elegans [1], microRNAs have emerged as key cellular regulators which possess sequence-specific inhibitory functions [2] and regulate a large number of cellular processes. Recent studies have indicated the presence of a large number of microRNAs in the brain and spinal cord [3,4], many of which have been reported to exhibit brain-specific expression patterns. Many works have also indicated the expression of a large number of microRNAs in neurons, astrocytes and oligodendritic cells [5][6][7][8][9][10][11][12][13][14][15]. Experimental data from such reports reveal the preferential expression levels of the various microRNAs. MicroRNA expression profiling has been the standard procedure for exploring differentially expressed microRNAs in neurodegenerative diseases. Table 1 summarizes the information on various miRNAs and their disease associations.
The most common form of dementia is Alzheimer's disease [AD]. It is a progressive degenerative neurological disorder and is generally sporadic in nature. Short β-amyloid [Aβ] peptides form plaques which progressively produce the disease condition [16][17][18]. Increased BACE1 expression is the most dominant risk factor in the brain, as BACE1/β-secretase cleavage of APP is the rate-limiting step for Aβ peptide production [19,20]. Many microRNAs have been observed to exhibit altered expression levels during AD progression, and most of them regulate BACE1 [19]. Findings further suggest that loss of the specific microRNAs miR-107, miR-29a and miR-29b-1 contributes to increased BACE1 and Aβ levels in sporadic AD. In contrast, the expression levels of miR-9, miR-125b and miR-128 have been found to increase in the brain during Alzheimer's disease progression [21]. Other studies shed light on the control of brain inflammatory responses by miR-146a, which regulates complement factor H, indicating a possible role for miRNA control in the neuroinflammatory process associated with deposition of the Aβ peptide. Taken together, the evidence suggests that microRNA regulation may play a role in AD pathogenesis.
Parkinson's disease [PD], on the other hand, is characterized by the progressive neurodegeneration of dopaminergic neurons in the substantia nigra. Evidence has accumulated for the role of miR-133b and its regulation of Pitx3, a paired-like homeodomain transcription factor. A disruption of this negative feedback mechanism may be instrumental in causing the multitude of symptoms associated with PD [22][23][24]. Disruption of the miR-433 binding site has been reported to result in increased expression of fibroblast growth factor 20 [FGF20], which had earlier been identified as an important risk factor for PD and correlated with overexpression of α-synuclein [25]. miR-7 has been reported to have neuron-specific expression and has also been shown to reduce α-synuclein protein levels [11].
The third most common neurological disorder is Huntington disease [HD], which is a dominantly inherited neurodegenerative disorder. It is primarily caused by a trinucleotide repeat expansion in the gene encoding Huntingtin [Htt]. In non-affected individuals, the transcriptional repressor protein REST is retained in the cytoplasm through its interaction with Htt. The trinucleotide repeat expansion renders REST incapable of binding Htt, so that REST accumulates in the nucleus. In the nucleus, REST recruits CoREST and inactivates neuron-specific genes [26]. Recent studies have revealed miR-124a and miR-132 to be dysregulated under REST repression. In vitro studies have also shown that miR-9 and miR-9* have the ability to target REST and CoREST, respectively [18]. The above background suggests the presence of a complex regulatory circuit of microRNAs whose altered expression contributes to a large number of neurological disorders. The main aim of this work is to identify possible regulatory signatures in the microRNA precursor sequences which could provide insight into the regulatory control of mature microRNA production.

MATERIALS AND METHOD
Information on the microRNAs associated with neurological disorders was collected from the existing literature and validated against the Phenomir database. The miRNAs common to both were then selected and their precursor sequences were obtained from miRBase. The precursor sequences were converted to their complementary DNA form using a basic Perl script, and the sequences were divided into two sets: the complementary DNA dataset and the miRNA precursor dataset [RNA form]. The total dataset [Set 1 + Set 2] comprised 400 sequences from human and mouse. To test for the presence of regulatory motifs, a large number of position-specific weight matrices were obtained from established transcription factor databases such as JASPAR [27] and DFTBS [28]. For the analysis of RNA regulatory elements, position-specific weight matrices were obtained from RegRNA [29]. These weight matrices were compared with the weight matrices obtained through the method described by Wasserman and Sandelin (2004) [30] and Sandelin et al. (2004) [31]. Matching results were then validated using the MELINA server [32] for DNA motif identification. Following identification of the motifs, a multiple sequence alignment was constructed using the UGENE tool [33] and the motifs were mapped to the conserved positions. Figure 1 represents the workflow for the analyses.
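The two computational steps described above, conversion of an RNA precursor to its complementary DNA form and scanning with a position-specific weight matrix, can be illustrated with a minimal Python sketch. It is not the Perl script or the database pipeline used in this work. The function names, the toy count matrix, the made-up precursor fragment, the pseudocount, the uniform background frequency and the score threshold are all assumptions chosen for illustration, and the complementary strand is assumed to be written as the reverse complement.

```python
import math

# Watson-Crick pairing from an RNA base to the complementary DNA base.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def rna_to_complementary_dna(rna):
    """Return the DNA strand complementary to an RNA sequence, read 5'->3'.

    Assumption: "complementary DNA form" is taken to mean the reverse complement.
    """
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))


def counts_to_log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into a log2-odds weight matrix.

    counts is a list with one dict per motif position, keyed by A, C, G, T.
    The pseudocount and the uniform background are illustrative assumptions.
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window whose score is >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous characters
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical 4-column count matrix that favours the word "TGCA".
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 0, "C": 8, "G": 0, "T": 0},
    {"A": 8, "C": 0, "G": 0, "T": 0},
]

precursor_rna = "GGAUGCAUCC"  # made-up fragment, not a real precursor
cdna = rna_to_complementary_dna(precursor_rna)
pwm = counts_to_log_odds(toy_counts)
hits = scan(cdna, pwm, threshold=4.0)
print(cdna, hits)
```

In practice the threshold would be chosen relative to the maximum attainable score of each matrix, and the RNA-level matrices would be scanned against the unconverted precursor sequence.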

RESULTS AND DISCUSSION
The results of the analyses indicate the presence of cis-acting regulatory element signatures in the complementary DNA sequences of the precursors. When RNA regulatory elements were analyzed, a large number of such elements were identified in the precursor sequences as well. This suggests that our understanding of the myriad mechanisms controlling the biogenesis of microRNAs is still at a nascent stage: each microRNA family may possess specific modular regulatory elements at both the DNA and RNA levels, indicative of a failsafe loop which makes biogenesis specific and reduces the error rate in the production of mature functional microRNAs. We have earlier identified secondary structural motifs in the precursors of microRNAs [34][35][36][37][38][39] and have correlated them with their minimum free energy variations.

DNA REGULATORY MOTIFS
It is interesting to note that the occurrence of the sequence signatures is not restricted to any specific type of secondary structural motif, apart from the ComA motif, which is observed to occur predominantly in loop regions (Figure 2). A large number of splicing regulatory motifs were identified in the RNA sequence motif discovery pipeline (Table 2). Transcription factor binding sites [TFBS] are generally defined as sites on DNA which interact with specific transcription factors and regulate nearby genes. In our study we identified three conserved motifs in all the sequences of the dataset (Figure 3). The presence of these motifs is significant: FNR is a fumarate and nitrate reductase protein in prokaryotic systems that shares homology with the eukaryotic cytochrome P450s. HrcA is a repressor protein which controls gene expression as part of the heat shock response [28] and shares its helix-loop-helix DNA-binding motif with eukaryotic proteins, while the GlnR protein shares homology with numerous eukaryotic helix-turn-helix proteins. It should further be noted that these regulatory motifs were conserved in miRNAs which were differentially regulated under disease conditions (Phenomir DB; Figure 3).

RNA REGULATORY MOTIFS
Several workers have reported that alternative splicing is a conserved regulatory phenomenon leading to neural protein production. Licatalosi and Darnell (2006) [40] have stated that alternative splicing enables "the cell to fine tune its protein composition", which helps it to adapt to the changes associated with different stimuli. Mutations in these splicing regulatory motifs can result in neurologic disorders such as phakomatoses as well as muscular dystrophies. The presence of such motifs therefore raises the possibility that alternative splicing may occur in precursors as well, which might explain their ability to produce multiple mature microRNAs (Table 3).

PHYLOGENETIC CONSERVEDNESS
Phylogenetically conserved positions (Figures 3-5) identified through multiple sequence alignment, and the fact that these conserved positions form an integral part of the regulatory elements, further support the assumption that the regulatory modules are indeed functional and hence have retained their position specificity throughout the events of divergence and occurrence of this particular family of microRNAs.

TCCGCT motif: involved in binding of human epidermal growth factor.
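The idea of mapping a motif onto phylogenetically conserved alignment positions can be sketched in Python as follows. This is only an illustration and not the UGENE procedure used in this work. The function names, the toy alignment and the identity cutoff are assumptions, and a column is treated as conserved when a single non-gap residue is shared by at least the given fraction of sequences.

```python
def conserved_columns(alignment, min_identity=1.0):
    """Return 0-based indices of alignment columns that count as conserved.

    A column is conserved when its most common non-gap residue occurs in at
    least min_identity (a fraction) of all aligned sequences.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    for col in range(length):
        counts = {}
        for seq in alignment:
            residue = seq[col]
            if residue != "-":  # gaps never count towards identity
                counts[residue] = counts.get(residue, 0) + 1
        if counts and max(counts.values()) / len(alignment) >= min_identity:
            conserved.append(col)
    return conserved


def motif_conservation(motif_start, motif_width, conserved):
    """Return the fraction of a motif's alignment columns that are conserved."""
    conserved_set = set(conserved)
    columns = range(motif_start, motif_start + motif_width)
    return sum(col in conserved_set for col in columns) / motif_width


# Hypothetical three-sequence alignment of short RNA fragments.
toy_alignment = [
    "ACGU-GCA",
    "ACGUAGCA",
    "UCGUAGCC",
]

cols = conserved_columns(toy_alignment)
print(cols)
print(motif_conservation(1, 3, cols))
```

A motif whose columns are all conserved scores 1.0 under this measure; lowering `min_identity` relaxes the definition for larger and more divergent alignments.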

POSSIBLE SIGNIFICANCE OF MOTIF OCCURRENCE
It is well known that upstream DNA regulatory elements enable the proper transcription of genes and exhibit modular control. The question that remains is why they would occur in clusters at phylogenetically conserved positions in an RNA molecule that is already a product of transcription and has passed through a primary processing step. Information on the temporal regulation of many microRNA products, and on how their expression levels are controlled, is still scanty. It is therefore possible that these regulatory motifs serve as binding sites for cellular regulatory factors and enable regulation of the production of mature microRNAs from their precursors. Our observations are in accordance with those of Piriyapongsa et al. (2011) [41], who identified numerous transcription factor binding sites in pre-miRNAs. Song Gao et al. (2010) [42] have also reported the presence of atypical promoter elements in microRNA precursor fragments. Many proteins have been identified which bind to the precursor microRNA and regulate its function, for example the NF90-NF45 complex, which binds directly to the stem-loop region and may also interact with DGCR8 via NF90. Other proteins such as R-Smads, KSRP, hnRNPA1 and LIN28 have also been observed to bind specific pre-miR stem or loop regions. The identification of such conserved elements indicates that there may be many more protein factors, or factors with multiple binding abilities, which act in unison to regulate the expression of the microRNAs associated with neurological disorders.

CONCLUSION
From the results obtained we can conclude that a large number of regulatory elements and transcription factor binding site signatures exist in the precursor sequences of microRNAs implicated in neurological disorders such as Alzheimer's disease, Parkinson's disease and Huntington disease. Whether these regulatory sites are cryptic is difficult to predict computationally; however, many cellular proteins exist which have high affinity for such conserved signature motifs. The differential expression levels of the various microRNAs implicated in these disorders provide evidence that these small regulators are themselves controlled by regulatory interactions, much of which remains unelucidated.

Figure 2. Intronic and Exonic regulatory motifs and their consensus motif structure which indicate phylogenetic conservedness.

Figure 3. Intronic and Exonic regulatory motifs and their phylogenetically conserved positions as identified through multiple sequence alignment using the UGENE tool.
… promoter; element found from −50 to −20 of the TSS | 219
2 | 5' UTR Py-rich stretch | Region in the 5' UTR conferring high transcription levels without the need for other upstream cis elements except for a TATA box | 94
3 | CAAT box | Promoter element found in the upstream region and reported to be associated with enhancer function | 41

Figure 4. Conserved ubiquitous TF signatures identified in microRNAs implicated in various neurological disorders.

Figure 5. Conserved ubiquitous ComA signatures identified in microRNAs implicated in various neurological disorders and their predominant loop-specific occurrence.

Table 3. Identified RNA regulatory elements that were overrepresented.
Regulatory Motif | Description
Terminal Oligopyrimidine Tract [TOP]

Table 1. Summary of identified microRNAs implicated in the three major neurological disorders.

Table 2. Identified DNA regulatory elements that were overrepresented.