Mitochondrial DNA Control Region Variation from Bangladesh: Sequence Analysis for the Establishment of a Forensic Database

Mitochondrial DNA sequences of the entire control region and three coding regions were analyzed in 108 unrelated individuals from three regions of Bangladesh. Sequencing was performed with validated primers, and comparison of the combined sequences identified 14 different haplotypes characterized by 37 variable polymorphic sites. The Bangladeshi sequences exhibited high variation and a low random match probability, indicating their suitability for forensic application. The mean pairwise difference between individuals was 9.698 ± 1.8658 nucleotides (95% CI 9.67-9.69), compared with a mean pairwise difference of 9.890 ± 4.189 nucleotides reported from Northeast Asia, suggesting significant differences in the mtDNA composition of the various populations. The sequence diversity of the 108 Bangladeshi Bengali samples (n = 216 chromosomes) was estimated to be 0.8475 ± 0.13406. This study reports for the first time that the comparison of closely related mtDNA sequences can be very useful for improving mtDNA database quality, and it provides haplotype information for forensic studies in the mainstream population of Bangladesh.


Introduction
Sequence analysis of human mitochondrial DNA (mtDNA) has been demonstrated to be a valid and reliable tool for the genetic characterization of an individual [1]. Typically, mtDNA typing is attempted on samples for which nuclear DNA typing is not likely to be successful, such as hair shafts, bones, teeth and other samples that are severely decomposed or have been exposed to substantial environmental abuse. Mitochondrial DNA is less prone to degradation than nuclear DNA because of the circular nature of the mtDNA genome and its subcellular sequestration [2] [3]. Although mtDNA comprises less than 1% of total cellular DNA, the high copy number of mtDNA per cell enables successful typing of samples containing only trace amounts of amplifiable nuclear DNA [4]. In the forensic community, the sequence variations used for identification are localized to the D-loop region because this region appears to have a higher rate of mutation.
The noncoding displacement loop (D-loop) is an approximately 1100 bp segment situated between the mitochondrial tRNA-Pro and tRNA-Phe genes and contains two hypervariable regions, HVR-I and HVR-II. mtDNA also evolves five to ten times faster than chromosomal DNA, and this relatively higher mutation rate gives rise to more polymorphic sites [5]. The hypervariable regions can be sequenced to provide a high degree of information for discriminating between unrelated individuals of Bangladesh. Simultaneous construction of haplogroups is the most common approach in the forensic field, and coding region information is indispensable for phylogenetic analysis of mtDNA [6]. We applied direct sequencing to the samples in this study. The most commonly observed haplogroups in the mainstream population of Bangladesh were M5, M3, M7, R, A4 and U2b.

Population
Blood samples of about 3-5 mL were collected from 108 healthy, unrelated, anonymous individuals belonging to three districts of Bangladesh. Genealogical information for each sampled individual was recorded for at least two previous generations. The research aim of the project was explained in Bangla (the local language) to each individual, and informed consent forms in both Bangla and English were signed. The project was approved by the ethical committee of the University of Dhaka, Bangladesh. Genomic DNA was extracted by the standard phenol-chloroform method [7].

Amplification of mtDNA
The hypervariable regions (HVR-I and HVR-II) and three coding regions of mtDNA were amplified from 10 ng of template DNA using 10 pM of each primer, 100 µM dNTPs, 1.5 mM MgCl2 and 1 U of Taq polymerase. In general, 35 cycles were performed using an optimized PCR protocol, together with positive and negative controls. The annealing temperature and time were slightly modified for a few primer sets. The reactions were carried out in a Bio-Rad C1000 thermal cycler.

Sequencing of the PCR Products
PCR products were checked by 1.8% agarose gel electrophoresis and directly sequenced using the BigDye™ Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) on an ABI Prism 3130 Genetic Analyzer following the manufacturer's protocol. The mtDNA sequences were aligned and compared with the revised Cambridge Reference Sequence (rCRS) [8] using the Autoassembler software v1.4.0. Mitochondrial haplogroups were assigned to all samples as described previously [9] [10].

Data Analysis
All sequences were compared to the reference standard described by Anderson et al. [11]. Sequence and nucleotide diversity were calculated using the Arlequin package [12]. All sequences were aligned and trimmed to a greatest common range of nt 16024-16565 and nt 73-340 as well as 3 diagnostic coding regions.
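As an illustration of the diversity measures named above, the following minimal Python sketch computes the random match probability as the sum of squared haplotype frequencies and the sequence (haplotype) diversity with the usual n/(n - 1) sample-size correction. It assumes these standard estimators; it is not the Arlequin implementation used in this study, and the function names and the small list of haplotype labels are hypothetical.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies (chance that two random samples match)."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    counts = Counter(haplotypes)
    return sum((count / n) ** 2 for count in counts.values())


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    return n / (n - 1) * (1.0 - random_match_probability(haplotypes))


# Hypothetical haplotype labels, one per individual (not data from this study).
sample = ["M5", "M5", "M3", "M7", "R", "U2b"]

rmp = random_match_probability(sample)
diversity = haplotype_diversity(sample)
print(f"random match probability: {rmp:.4f}")
print(f"haplotype diversity: {diversity:.4f}")
```

In practice each label would be the full haplotype string of one individual, so that two individuals count as a match only if all scored positions agree.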

Results
This study presents, for the first time in the mainstream population of Bangladesh, the complete sequences of the HVR-I and HVR-II regions and the diagnostic coding regions of all 108 individuals, which is the most informative way to display the data. A low sequence diversity (0.8475 ± 0.0134) and a mean number of pairwise differences of 9.698 ± 1.8658 were calculated from the data. Consequently, the random match probability is 8.09%.
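The mean number of pairwise differences can be illustrated with a short Python sketch that counts mismatched positions for every pair of aligned sequences and averages the counts. This is a simplified sketch under stated assumptions: sequences are already aligned to equal length, every mismatching character (including a gap) counts as one difference, and the function names and toy sequences are hypothetical. Arlequin may treat gaps and missing data differently.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mean_pairwise_differences(sequences):
    """Average number of differences over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / len(pairs)


# Toy aligned fragments (hypothetical, not data from this study).
toy_alignment = ["ACGT", "ACGA", "TCGA"]
print(mean_pairwise_differences(toy_alignment))
```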
A total of 31 polymorphic sites were observed in this study population. We report a total of 14 haplotypes in HVR-I and 8 in HVR-II. Over 79% of the individuals analyzed exhibited between 2 and 8 polymorphisms in the control region. Nucleotide substitutions (42%) were the most common, compared with insertions (10.41%) or deletions (0%). The most common transitions scored were G → A, T → C and C → T. The C → G transversion accounted for 6.24%.
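The distinction between transitions and transversions used above follows from base chemistry: a transition exchanges a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T), whereas a transversion exchanges one class for the other. A minimal Python sketch of this classification is given below; the function name is hypothetical and the code is illustrative rather than part of the analysis pipeline of this study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(reference_base, observed_base):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref = reference_base.upper()
    obs = observed_base.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or obs not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == obs:
        raise ValueError("reference and observed bases are identical")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, obs} <= PURINES or {ref, obs} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


for ref, obs in [("G", "A"), ("T", "C"), ("C", "T"), ("C", "G")]:
    print(f"{ref} -> {obs}: {classify_substitution(ref, obs)}")
```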
Forensic parameters and diversity measures were calculated in order to compare the mainstream population of Bangladesh with the tribal population. The mtDNA control region sequences of this population showed a lower sequence diversity (0.8475 ± 0.0134) than the tribal Indian-specific diversity of 0.989 (95% CI: 0.975, 0.969), indicating that gene flow is shared more with East and Southeast Asian-specific diversity than with Indian diversity [19]. The random match probability is 8.09% with 43 substitutions. Only one sample has a "C" stretch and five samples have 319A-ins, which is frequent in the tribal population of Bangladesh (Table 2).
Haplogroups C and U each occurred at frequencies lower than 3%. We found no remarkable new lineages in the mainstream population of Bangladesh, and the haplotypes are less complex than those of the tribal population of the Chittagong Hill Tracts of Bangladesh. A detailed list of diversity comparisons is given in Table 3.

Discussion
The mitochondrial DNA (mtDNA) haplogroup is a very useful tool not only for phylogenetic and clinical studies but also for forensic studies. The direct sequencing of samples in this study has provided a preliminary database for the initiation of mass population screening. This study reports the first analysis of the mtDNA control region and three coding region sequences in the mainstream population of Bangladesh; the data were generated according to the standard operating procedure (SOP) of the laboratory. Our analysis of the entire mtDNA control region of 108 individuals proved useful for determining the ancestry of the samples examined, by contributing to the confirmation and, on occasion, even to the refinement of the haplogroup assignment. In fact, precise haplogroup classification is an issue of increasing importance when mtDNA analysis is applied to forensic casework.
The distributions of mtDNA haplogroups in the mainstream population of Bangladesh are less complex than those of the tribal populations of the Chittagong Hill Tracts [19] and West Bengal in India [20]. The low sequence diversity of the mainstream Bangladeshi population indicates that this population is distinct from those of Thailand, Myanmar, and Northeast India. As the land between Northeast India and Bangladesh is of lower elevation, it is not clear why the Bangladeshi population appears to be bottlenecked (low genetic diversity and limited distribution of haplotypes across the haplotype network). It could be due to the small sample size (n = 108) considered in this study. The diversity estimates for the mtDNA control region and the sequence diversity values indicate that mitochondrial DNA is very informative for forensic casework, since the probability of discriminating between two maternal lineages is high.
The random match probability of the mtDNA sub-haplotypes was <9%. Thus, the panel of 37 variable polymorphic sites identified in this study represents a useful forensic tool to further resolve the identity of individuals from mitochondrial DNA (mtDNA).
The M5 and M3 haplogroups were the most common in this population. Individuals from Uttar Pradesh in India, Rajputs of Bihar and Muslims of Karnataka have also been found to harbor these haplogroups [21]-[23]. Furthermore, haplogroup M is very common in Bangladesh; M2a is the oldest haplogroup found in the Indian subcontinent and is the third most common in Bangladesh. Other haplogroups (M7, M21, M45, A4, N9, U2b and R21) show genetic affinity between Bangladesh and East/Southeast Asia, which was also concluded for the Tibeto-Burman population of Bangladesh.
Haplogroup U2 is most common in South Asia, and its subclade U2b was found in the Bangladeshi population. While U2 is typically found in India, it is also present in the Nogai people, descendants of various Mongolic and Turkic tribes who formed the Nogai Horde [24], indicating the admixture of India-specific branches of the haplogroup in Bangladesh. Haplogroup M7 is found in East Asia, especially in Japan, Southern China, Vietnam and Laos, which explains the gene flow from East Asia. It also suggests China, near Tibet, as a possible origin, from where its carriers entered Myanmar around the 6th-7th century AD and where they still live in large numbers. Some groups might then have migrated to Bangladesh and Thailand [25].
Knowing the ancestry (geographic origin and/or ethnicity) of a sample may be the first important step in forensic cases for which only a very limited amount of DNA is available. In cases where direct information from coding regions is limited, one can at least anticipate the haplogroup from the mutations in the HVR-I and HVR-II regions. In these latter cases, implementing the analysis of the entire mtDNA control region as a routine procedure in forensic investigation would be advisable [26].

Conclusion
In all, we can conclude that in isolated human groups, where mitochondrial genome diversity may be substantially low, the improvement in the power of discrimination between sequences gained by analyzing the entire mtDNA control region seems to be somewhat limited, yet this procedure might constitute a very useful tool for reliable, unambiguous haplogroup identification. Because of the feasibility of this system and the need for only small amounts of DNA, we can anticipate its use in studies of mtDNA-associated disorders, individual/species identification, and maternity testing in forensic genetics for the general population of Bangladesh.

Table 1. mtDNA haplogroup frequencies among the Bengali population of Bangladesh.

Table 2. Diversity and forensic parameters computed from entire mtDNA control region.

Table 3. Comparison of diversity measures from Northeast India.