Computational Molecular Analysis of the Sequences of BMP 15 Gene of Ruminants and Non-Ruminants

Bone morphogenetic protein 15 (BMP15) is a member of the transforming growth factor β (TGFβ) superfamily that is expressed by oocytes and plays key roles in granulosa cell development and fertility in animals. This study used computational methods to investigate the molecular genetic variation of the BMP15 gene in selected livestock species, with a view to providing relevant genetic information for breeding and selection programmes in those species. A total of thirty-seven (37) BMP15 nucleotide sequences from goats (18), sheep (6), cattle (6), swine (4) and chickens (3) were retrieved from GenBank. Sequence alignment, translation and comparison of the BMP15 gene of the various species were done with ClustalW. A high degree of polymorphism of the BMP15 gene was observed among the studied species. The significant value (P < 0.01) for the relative proportions of non-synonymous substitutions per non-synonymous site (dN) and synonymous substitutions per synonymous site (dS) indicated that non-synonymous sites evolved faster than synonymous sites and that positive selection overshadowed purifying selection. Functional analysis of missense mutations using PROVEAN showed that twelve amino acid substitutions in goats (L10S, W13A, E20L, V28S, P31R, P31G, P40Q, L42W, Q46N, A52V, R58C and G64T), nine in sheep (H21R, S32R, I33A, A39W, Q46W, E51A, G54S, R61D and E72A), six in cattle (Q30M, T41W, E50R, I62R, H65E and E72S), seven in swine (I7L, T9I, V33I, L35H, C40P, R46P and Q61R) and five in chickens (A20H, L27H, W43L, A47P and G50Y) were predicted to be neutral (non-deleterious). The phylogenetic trees from nucleotide and amino acid sequences revealed the close relatedness of members of the Bovidae family (goat, sheep and cattle). The present information could guide future efforts involving [...]


Introduction
One of the strides towards increased productivity of protein of animal origin is an in-depth knowledge of the genes associated with reproduction, especially those related to ovulation and litter size. Identifying genes of major effect is also a prerequisite that offers the opportunity to improve production efficiency, product quality and product diversity in the livestock industry by utilizing them in breeding programs [1]. Most mutations in proteins are associated with diseases; only a small number of examples show heterozygote advantage. A good example is the bone morphogenetic protein 15 (BMP15) gene, which has been shown to be related to female fecundity in several animals [2]. BMP15 is a member of the transforming growth factor β (TGFβ) superfamily that is expressed by oocytes and, along with growth differentiation factor 9 (GDF9), plays key roles in granulosa cell development and fertility in animal models. It is one of the major causal mechanisms underlying either the highly prolific or the infertile phenotypes of several sheep breeds [3] and of other mammalian species, including man [4]. Since BMP15 plays an important role in prolificacy and fertility, attention should be paid, using both dry and wet lab approaches, to the sequences of the gene, which are usually responsible for the observed phenotypic differences. A good knowledge of the sequences of this gene will help in identifying the variants responsible for the various effects attributed to the gene. The objective of this study, therefore, was to investigate, using in silico methods, the molecular genetic variation of the BMP15 gene of some ruminant (cattle, sheep and goats) and non-ruminant (swine and chicken) species, with a view to providing relevant genetic information for marker-assisted selection in the studied species.

Sequence Alignment and Translation
Sequence alignment, translation and comparison of the BMP15 gene of the various species were done with ClustalW as described by Larkin et al. [5], using the IUB substitution matrix, a gap open penalty of 15 and a gap extension penalty of 6.66.
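The alignment settings above can be written down as a command line. The following is a minimal sketch, assuming the ClustalW 2.x command-line program is installed under the name `clustalw2`; the binary name and the input and output file names are hypothetical and are not taken from the study. The script only assembles and prints the command, so it can be inspected before being run.

```python
from typing import List


def clustalw_command(infile: str, outfile: str,
                     gap_open: float = 15.0, gap_ext: float = 6.66,
                     binary: str = "clustalw2") -> List[str]:
    """Build a ClustalW 2.x argument list for a DNA multiple alignment.

    The matrix and gap penalties mirror the settings reported in the text;
    the binary name and file names are assumptions for illustration.
    """
    return [
        binary,
        f"-INFILE={infile}",
        "-ALIGN",                   # perform a full multiple alignment
        "-TYPE=DNA",                # treat the input as nucleotide sequences
        "-DNAMATRIX=IUB",           # IUB substitution matrix
        f"-GAPOPEN={gap_open:g}",   # gap open penalty
        f"-GAPEXT={gap_ext:g}",     # gap extension penalty
        f"-OUTFILE={outfile}",
    ]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run to execute it.
    cmd = clustalw_command("bmp15_sequences.fasta", "bmp15_aligned.aln")
    print(" ".join(cmd))
```

Keeping the parameters in one function makes it easy to rerun the alignment with identical settings for each species.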

Substitution Analysis
The number of non-synonymous substitutions per non-synonymous site (dN) and the number of synonymous substitutions per synonymous site (dS) of the BMP15 coding sequences of the various species were computed by the bootstrap method (1000 replicates) using the modified Nei-Gojobori method (assumed transition/transversion bias = 2) [6]. The ratio of non-synonymous to synonymous divergence (dN/dS) was tested for departure from the neutral expectation of unity using the codon-based Z-test, as implemented for the modified Nei-Gojobori method with Jukes-Cantor correction.
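The two calculations named above can be sketched as follows. This is an illustrative outline, not the software used in the study: it applies the standard Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3), to an observed proportion of differences p, and forms a Z statistic from dN, dS and their variances (which would come from the bootstrap replicates). The function names and the numbers in the example are hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def z_statistic(dn: float, ds: float, var_dn: float, var_ds: float) -> float:
    """Z = (dN - dS) / sqrt(Var(dN) + Var(dS)); variances from bootstrap."""
    return (dn - ds) / math.sqrt(var_dn + var_ds)


def two_sided_p(z: float) -> float:
    """Two-sided P-value of Z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical proportions of differences and bootstrap variances.
    dn = jukes_cantor(0.12)
    ds = jukes_cantor(0.10)
    z = z_statistic(dn, ds, var_dn=0.0001, var_ds=0.0003)
    print(f"dN={dn:.4f} dS={ds:.4f} Z={z:.3f} P={two_sided_p(z):.4f}")
```

A positive Z means dN exceeds dS, and a negative Z means the reverse; the P-value indicates whether the departure from dN = dS is larger than expected by chance.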

Functional Analysis
In silico functional analysis of missense mutations was carried out using PROVEAN with a threshold value of −2.5. PROVEAN collects a set of homologous and distantly related sequences from the NCBI NR protein database (released August 2011) using BLASTP (ver. 2.2.25) with an E-value threshold of 0.1. The sequences are clustered at a sequence identity of 80% using the CD-HIT program (ver. 4.5.5) to remove redundancy [7]. If the PROVEAN score is smaller than or equal to the given threshold, the variation is predicted to be deleterious; otherwise it is predicted to be neutral [8].
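The decision rule can be illustrated with a short sketch. It parses substitution labels of the form used in this paper (for example L10S: wild-type residue, position, mutant residue) and applies the −2.5 threshold. The helper names are hypothetical and the scores in the example are invented for illustration; they are not PROVEAN output from the study.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str
    position: int
    mutant: str


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'L10S' into wild type, position and mutant."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a single amino acid substitution: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def classify(score: float, threshold: float = -2.5) -> str:
    """Scores at or below the threshold are deleterious, otherwise neutral."""
    return "deleterious" if score <= threshold else "neutral"


if __name__ == "__main__":
    # Invented scores, used only to show both outcomes of the rule.
    example_scores = {"L10S": -0.8, "M70F": -3.1}
    for label, score in example_scores.items():
        sub = parse_substitution(label)
        print(sub.wild_type, sub.position, sub.mutant, score, classify(score))
```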

Phylogenetic Trees Analysis
Neighbor-Joining (NJ) trees were constructed using the maximum composite likelihood method and pairwise deletion as the gap/missing data treatment, as described by Saitou and Nei [9] and Tamura et al. [10]. The trees were built on the basis of genetic distances and depict the phylogenetic relationships among the BMP15 nucleotide and amino acid sequences of the investigated species. The reliability of the trees was assessed by bootstrap confidence values [11] with 1000 bootstrap iterations, using MEGA 6.0 software [12]. For the amino acid sequences, however, the p-distance method of Nei and Kumar [13] was used in place of the maximum composite likelihood method. Similarly, UPGMA trees for the BMP15 gene were constructed from consensus nucleotide and amino acid sequences, using the same models as for the NJ trees. All the nucleotide sequences were trimmed to an equal length of 222 bp, corresponding to the same region of the gene, before generating the trees.
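The p-distance with pairwise deletion can be sketched as follows. This is a minimal illustration of the distance definition (proportion of differing sites among the sites compared), not the MEGA implementation; the function names, the choice of gap symbols and the toy sequences are assumptions.

```python
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str, missing: str = "-?") -> float:
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a site is skipped only when either of the two
    sequences being compared has a gap or missing symbol at that site.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def distance_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """p-distance for every unordered pair of named aligned sequences."""
    names = sorted(aligned)
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


if __name__ == "__main__":
    # Toy aligned amino acid fragments, not BMP15 data.
    toy = {"goat": "MKT-LL", "sheep": "MKTALL", "swine": "MRTALV"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, round(dist, 3))
```

A matrix of this kind is the input that distance-based methods such as NJ and UPGMA cluster into a tree.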

Results
The sequence length of the BMP15 gene within and among species ranged between 222 bp and 6648 bp (Table 1). Across the species, five sequences (3 from goats and 2 from cattle) have 1185 bp, while two sequences each from goat and sheep have 1182 bp. Within goats, 2, 3, 3 and 3 sequences share lengths of 465, 1185, 1230 and 6648 bp, respectively, while 2 sequences each from sheep, cattle, swine and chicken share lengths of 222, 1185, 988 and 1875 bp, respectively. The rest of the sequences do not share the same length within their species. In goats, the sequences JQ350892.1, JQ350891.1, JQ350890.1, GU732196.1, JN860304.1 and JN655670.1 are complete coding sequences (CDS) from DNA and are longer than six thousand base pairs (>6000 bp), while the sequences HM462258.1, HM462254.1, HM462255, JX860305.1, JX860304.1 and EU095935.1 are partial coding sequences (partial CDS) shorter than one thousand base pairs (<1000 bp). The same explanation holds for sheep, cattle, swine and chicken.
The predicted amino acid sequences of goat, sheep, cattle, swine and chicken BMP15 showed varying degrees of amino acid substitution in the studied species (Figure 1).
Means of dS, dN and dN/dS with their corresponding Z-scores and P-values are presented in Table 2. Swine has the highest dN/dS value of 1.20, while sheep has the lowest value of 0.97. The dN/dS pattern indicated positive selection for all species except sheep, which has a dN/dS value of less than one.
The results of the functional analysis of coding non-synonymous single nucleotide polymorphisms (nsSNPs) of the BMP15 gene for goats, sheep, cattle, swine and chicken are presented in Tables 3-7, respectively. Fourteen amino acid substitutions of the wild-type alleles located in the putative peptide coding region were obtained from the alignment of the deduced amino acid sequences of goats. Of these, twelve amino acid substitutions (L10S, W13A, E20L, V28S, P31R, P31G, P40Q, L42W, Q46N, A52V, R58C and G64T) were predicted to be neutral, indicating that the substitutions did not impair protein function, while the remaining two (M70F and Y74M) were predicted to be deleterious, indicating that the substitutions were harmful. In sheep, nine amino acid substitutions (H21R, S32R, I33A, A39W, Q46W, E51A, G54S, R61D and E72A) were predicted to be neutral, indicating that they did not impair protein function, while the remaining substitution (M70V) was predicted to be harmful. Of the seven amino acid substitutions in cattle, six (Q30M, T41W, E50R, I62R, H65E and E72S) were predicted to be neutral, while the remaining one (Y74Q) was predicted to be harmful.
The phylogenies based on nucleotide and amino acid sequences of BMP15 (Figure 2 and Figure 3) show that sequences clustered by species, although there was some intermingling between species. The genetic relationship of BMP15 among the studied species using UPGMA showed that members of the Bovidae family (sheep, goat and cattle) were closer at this locus than swine and chicken (Figure 4 and Figure 5).

Discussion
BMP15, also known as GDF9B, is an X-linked gene that is expressed in the oocyte and plays a key role in ovarian folliculogenesis. It is a polymorphic gene whose polymorphisms have been reported to be associated with increased ovulation rate, sterility and litter size in farm animals [14] [15]. The length variation observed within and across species in this study might be due to differences in the genomic regions from which the sequences were obtained and to whether the sequences were complete coding sequences (CDS) or partial CDS. The variation in sequence length within and among species might also result from evolution and differentiation [16]. There are cases where variability might result from DNA duplication, DNA rearrangement, short tandem repeats (STR), or insertion or deletion of sequences [17]. Insertions and deletions can cause frameshift mutations, which often result in a completely different translation from the original protein and are also likely to introduce a premature stop codon that truncates further protein synthesis [18]. This variability might give rise to unique structures among individual members, conferring different biological activities [19].
The ratio of the number of non-synonymous substitutions per non-synonymous site (dN; amino acid altering) to the number of synonymous substitutions per synonymous site (dS; silent), also known as omega (ω = dN/dS), is a useful estimate of the selective pressure on a gene [20]. An omega value greater than 1 (ω > 1) is an indication of positive selection [21]. In the present study, the omega values obtained were greater than one except in sheep, which had a value of 0.97, indicating an excess of non-synonymous over synonymous nucleotide substitutions over the entire sequences in all species except sheep. This indicates that non-synonymous sites evolved faster than synonymous sites and that positive selection overshadowed purifying selection. It implies that balancing selection (or positive Darwinian selection) favoured new variants and increased allelic polymorphism [22], which in turn might induce a change in the conformation of the protein structure, thereby affecting the signaling pathway during follicle differentiation and ovulation [23]. The omega value of 0.97 observed for sheep in this study signifies purifying selection, i.e. selection against non-synonymous mutations because of their deleterious effect.
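The interpretation of omega described above can be written as a small decision rule. This is only an illustrative sketch with hypothetical function names; the two omega values in the example are the ones reported in the Results for swine and sheep. The rule labels the direction of the ratio only, and whether a value close to 1 differs significantly from neutrality still depends on the accompanying Z-test.

```python
def omega(dn: float, ds: float) -> float:
    """Return dN/dS; undefined when there are no synonymous substitutions."""
    if ds <= 0.0:
        raise ValueError("dS must be positive for dN/dS to be defined")
    return dn / ds


def selection_regime(w: float) -> str:
    """Label the direction of selection implied by an omega value."""
    if w > 1.0:
        return "positive selection"
    if w < 1.0:
        return "purifying selection"
    return "neutral evolution"


if __name__ == "__main__":
    # Omega values reported in the Results for two of the species.
    reported = {"swine": 1.20, "sheep": 0.97}
    for species, w in reported.items():
        print(species, w, selection_regime(w))
```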
One of the key issues in biology is understanding how natural selection drives gene functional diversification across different species and lineages [24]. The varying substitutions of amino acids within and across species might be a result of separate divergence from their common ancestor. This agrees with earlier submissions [20] [25] that as orthologs diverge from their most recent common ancestor, their different evolutionary trajectories lead to divergence in the selective constraints on homologous sites. This information may aid the selection of superior animals within and between breeds for genetic improvement of desired traits.
The main goal in animal breeding is to select individuals that have high breeding values for traits of interest as parents to produce the next generation, and to do so as quickly as possible. To date, most programs rely on statistical analysis of large databases of phenotypes from breeding populations, using linear mixed model methodology to estimate the breeding values of selection candidates. However, there is a long history of research on the use of genetic markers to identify quantitative trait loci and on their use in marker-assisted selection, but with limited implementation in practical breeding programs [26].
The neutral or beneficial amino acid substitutions are those that help in maintaining the structural integrity of cells and tissues. They also positively affect the functional roles of proteins involved in signal transduction of visual, hormonal and other stimulants. The harmful amino acid substitutions, however, could alter protein function, which may lead to susceptibility to disease. They may modify enzyme activity, destabilize protein structures or disrupt protein interactions. The beneficial nsSNPs obtained in the present study therefore offer hope for future genetic improvement of goats, sheep, cattle, swine and chicken at the BMP15 locus. This is because nsSNPs have been reported to be linked to economically important traits and disease development [17]. Some nsSNPs are beneficial because the altered proteins result in increased sensitivity of granulosa cells to follicle stimulating hormone (FSH), which leads to accelerated follicular development and precocious ovulation of small follicles [27], while the reverse is the case for harmful or deleterious amino acid mutations [28]. According to Tariq et al. [29], the prediction of SNP status is promising in modern genetic analysis and breeding programmes, as SNPs have been used to identify animals with higher breeding value. Gene polymorphism between and among species in the present study appeared to be diverse because BMP15 is a fecundity gene [14] with very strong divergence and rapid evolution [30] [31]. Therefore, the BMP15 gene, as a genetic marker closely linked to the litter size trait, can be used in marker-assisted selection (MAS) for high litter size production and productivity in ruminants and non-ruminants.
The dendrograms were constructed to compare the common ancestral nucleotide and amino acid sequences of the species, since each tree may give useful information for a proper understanding of the evolutionary relationships. The phylogenetic trees revealed that clustering was species-wise, with some level of intermingling between species. This is evidence of trans-species evolution, which might be attributed to the coding nature of the sequences. The UPGMA consensus trees based on nucleotide and amino acid sequences revealed that goat, sheep and cattle were closest, followed by swine, while chicken was farther apart. This is in accordance with classical classification, as goat, sheep and cattle are members of the Bovidae family, and swine shares the order Artiodactyla with the ruminants. The findings of this study also agree with the submissions of Misra et al. [32] and Ugbo et al. [33], who observed similar clustering of members of the Bovidae family and order Artiodactyla [2].

Conclusion
The study revealed that BMP15 is a polymorphic gene with numerous mutations, which can be beneficial or harmful. In the present study, the substitutions predicted to be neutral or beneficial outnumbered the deleterious ones. The omega values obtained indicated that non-synonymous sites evolved faster than synonymous sites and that positive selection overshadowed purifying selection. Also, the phylogeny based on nucleotide and amino acid sequences revealed the close relatedness of members of the Bovidae family (goat, sheep and cattle), an indication of their closeness on the evolutionary timescale. The information emanating from this study would be relevant for further genotype-phenotype research, especially the selection of markers of fecundity to genetically improve livestock species in Nigeria, Sub-Saharan Africa.

Figure 1. Amino acid prediction of goat, sheep, cattle, swine and chicken BMP15 using ClustalW. Dots indicate identical amino acids, numbers on the right-hand side represent site numbers, and asterisks indicate indels.

Figure 5. Phylogenetic tree of goat, sheep, cattle, swine and chicken BMP15 consensus amino acid sequences using UPGMA.

Table 1. Accession number and sequence length variation of BMP15 gene of goats, sheep, cattle, swine and chicken.

Table 2. Mean number of nucleotide substitutions per synonymous site (dS) and per nonsynonymous site (dN) with their ratio among goats, sheep, cattle, swine and chicken.

Table 3. Functional analysis of coding nsSNP of the BMP15 gene of goats using PROVEAN.

Table 4. Functional analysis of coding nsSNP of the BMP15 gene of sheep using PROVEAN.
Default threshold is −2.5; that is, variants with a PROVEAN score equal to or below −2.5 are considered "deleterious", while variants with a PROVEAN score above −2.5 are considered "neutral".

Table 5. Functional analysis of coding nsSNP of the BMP15 gene of cattle using PROVEAN.

Table 6. Functional analysis of coding nsSNP of the BMP15 gene of swine using PROVEAN.

Table 7. Functional analysis of coding nsSNP of the BMP15 gene of chicken using PROVEAN.