The Uniparental Genetic Landscape of Modern Slavic Speaking Populations

Slavic speaking populations are the most numerous Indo-European ethnolinguistic group in Europe. They show great variety and fall into three groups: West, East and South Slavic populations. To contribute to the understanding of the correlation between the linguistic and genetic affiliation of Slavic populations, we have analyzed for the first time their matrilineal and patrilineal relationships and illustrated their position in the European uniparental genetic landscape. For this purpose, we have collected previously published data for the frequencies of mitochondrial DNA (mtDNA) and Y-chromosome haplogroups in Slavic and other European populations and compared them by Principal Component Analysis (PCA). In the inter-Slavic population comparisons, West and East Slavs lie closer to each other, whereas South Slavic populations group on their own. In the European context, South Slavic populations are positioned closer to neighboring Balkan non-Slavic and North Italian populations than to other Slavic populations. When considering the uniparental diversity of Slavic speaking populations, one should also take into account the prevalence of Y-chromosome haplogroup N among East Slavs (comprising almost half of the paternal gene pool in some instances), which is almost absent among the other groups (not exceeding 2%-3%). In conclusion, the data in the present study indicate that West-East and South Slavic speaking populations behave as separate groups based on their uniparental genetic structure, which shows that they do not share substantial common genetic ancestry and that there is great genetic variety within the Slavic linguistic unity.


Introduction
Slavs are the most numerous Indo-European ethno-linguistic group in Europe (Kipfer, 2000). Their proposed homeland is in the middle Dnieper basin: the area north-east of the Carpathians and the upper reaches of the rivers Bug and Dniester, mainly around the Pripet river (Mutafchiev, 1943; Rebala et al., 2007). From the 6th century AD they spread to inhabit the whole of Eastern Europe and parts of Central and South-Eastern Europe.
Extant Slavic languages show great variety and fall into three groups: West (Czech, Slovakian and Polish), East (Russian, Belorussian and Ukrainian) and South (Serbian, Croatian, Bulgarian and Slovenian). The Bulgarian literary language differs from other Slavic languages in the almost complete loss of grammatical case; the development of a definite article for nouns (a suffix added to the stem); analytical comparative and superlative forms (built with particles); and a complex tense system in which the infinitive is completely lost (Aepli, von Waldenfels, & Samardzic, 2014; Kushniarevich et al., 2015; Raykov, 2005).
Orthodox Christian Slavs use the Cyrillic alphabet, while Roman Catholic Slavs and Bosniaks use the Latin alphabet.
Genetic and genomic analyses of Slavs from different countries have been the object of many previously published studies; however, comparative genetic investigations among different Slavic groups are few (Grzybowski et al., 2007; Malyarchuk et al., 2008; Mielnik-Sikorska et al., 2013; Rebala et al., 2007). To the best of our knowledge, the most comprehensive study of Slavic genetic heritage to date (Kushniarevich et al., 2015) reveals that Slavic genetic diversity was formed through assimilation of preexisting regional genetic components and in situ gene pool shaping. It also identifies an apparent genetic homogeneity of the majority of West and East Slavs and a substantial genetic difference between them and South Slavs.
In order to further contribute to the understanding of the correlation between language and genetic origin in Slavs, the present study analyzes for the first time the matrilineal and patrilineal relationships among European Slavic-speaking countries' populations and also illustrates their position in the European uniparental genetic landscape.

Materials and Methods
We have collected previously published data for the frequencies of mitochondrial DNA (mtDNA) and Y-chromosome haplogroups in Slavic speaking and other European populations. We have included 18 and 27 (sub-)populations in the mtDNA and Y-chromosome haplogroup analyses of Slavic speaking populations (Supplementary Table S1 and Table S2), and 41 and 42 (sub-)populations in the mtDNA and Y-chromosome haplogroup analyses of European populations (Supplementary Table S3 and Table S4), respectively.
The mtDNA and Y-chromosome relationships among the populations were depicted by Principal Component Analysis (PCA) performed using XLSTAT.
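As an illustration of this kind of analysis, the following is a minimal sketch of a PCA on a populations-by-haplogroups frequency matrix. It is not the XLSTAT procedure used in the study, whose settings are not stated here: it assumes a covariance-type PCA computed with NumPy's singular value decomposition, and the population names and frequencies are hypothetical toy values, not data from the supplementary tables.

```python
import numpy as np


def pca_scores(freqs, n_components=2):
    """Return PC scores and explained-variance shares.

    freqs: rows are populations, columns are haplogroup frequencies.
    """
    X = np.asarray(freqs, dtype=float)
    Xc = X - X.mean(axis=0)            # center each haplogroup column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                     # population coordinates on the PCs
    explained = s**2 / np.sum(s**2)    # share of total variance per PC
    return scores[:, :n_components], explained[:n_components]


# Hypothetical toy frequencies (assumption, for illustration only).
populations = ["pop_A", "pop_B", "pop_C", "pop_D"]
freqs = [
    [0.50, 0.30, 0.20],
    [0.45, 0.35, 0.20],
    [0.20, 0.30, 0.50],
    [0.25, 0.25, 0.50],
]

scores, explained = pca_scores(freqs)
for name, (f1, f2) in zip(populations, scores):
    print(f"{name}: F1={f1:.3f}, F2={f2:.3f}")
print("variance share of F1 and F2:", np.round(explained, 3))
```

Populations with similar haplogroup profiles (here pop_A and pop_B, or pop_C and pop_D) receive similar F1 coordinates, which is how clusters in such a plot are read. The sign of each axis is arbitrary.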

Results and Discussion
The plot of the PCA performed on mtDNA haplogroup frequencies in Slavic speaking populations is presented in Figure 1.
Proto-Bulgarians established the Danubian Bulgarian state (Dobrev, 2016; Mutafchiev, 1943).

Conclusion
In conclusion, as illustrated by the PC analysis of mtDNA and Y-chromosome haplogroup frequencies, West-East and South Slavic speaking populations, traditionally called "Slavs" (a term introduced in the 16th century AD) (Šafařik, 1848), are heterogeneous in their uniparental genetic diversity, which shows that they do not share substantial common genetic ancestry and that there is great genetic variety within the Slavic linguistic unity.

Supplementary
Table S2. Absolute frequencies of Y-chromosome haplogroups in Slavic speaking populations.
S. Karachanak-Yankova et al. DOI: 10.4236/aa.2017.74018. Advances in Anthropology.
In the published source studies of Slavic speaking and other European populations, the mtDNA haplogroup assignment was based on partial or entire control region sequences and/or coding region markers, and the Y-chromosome haplogroup classification was performed by genotyping of informative biallelic markers. To analyze comparable results for a larger number of populations, the data were normalized to the highest possible level of phylogenetic resolution.
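The normalization step is not described in detail, so the following is only a sketch of one way it could be done. It collapses each study's haplogroup labels to a shared set of levels and converts absolute counts to frequencies. The labels, counts and the set `shared_levels` are hypothetical, and matching by label prefix is a simplifying assumption: real mtDNA and Y-chromosome phylogenies do not always follow prefix nomenclature (for example, mtDNA haplogroup K is nested within U), so a real analysis would use an explicit phylogenetic tree.

```python
from collections import Counter


def collapse(haplogroup, allowed):
    """Map a label to its longest prefix found in `allowed`, else 'other'."""
    for end in range(len(haplogroup), 0, -1):
        if haplogroup[:end] in allowed:
            return haplogroup[:end]
    return "other"


def to_frequencies(counts, allowed):
    """Collapse absolute counts to shared labels and return frequencies."""
    collapsed = Counter()
    for label, n in counts.items():
        collapsed[collapse(label, allowed)] += n
    total = sum(collapsed.values())
    return {label: n / total for label, n in collapsed.items()}


# Hypothetical shared resolution and per-study counts (assumptions).
shared_levels = {"H", "U", "U5", "J", "T"}
study_counts = {"H1a": 20, "H": 30, "U5a": 10, "U4": 5,
                "J1c": 15, "T2": 10, "K": 10}

freqs = to_frequencies(study_counts, shared_levels)
for label in sorted(freqs):
    print(f"{label}: {freqs[label]:.2f}")
```

In this toy case K falls into "other" because it has no matching prefix, which shows why prefix matching alone is not sufficient for real data.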

Figure 1. PCA plot of Slavic speaking populations based on mtDNA haplogroup frequencies. The variance of the first and second principal components (F1 and F2, respectively) is given in brackets. Symbols denote West, East and South Slavic speaking populations.

Figure 2. PCA plot of Slavic speaking populations based on Y-chromosome haplogroup frequencies. The variance retained by the first and second principal components (F1 and F2, respectively) is shown in brackets. Symbols denote West, East and South Slavic speaking populations.
The position of the mtDNA haplogroup frequency profiles of Slavic speaking populations in the European context is represented in Figure 3. In this PCA plot, South Slavs (Bulgarians, Slovenians, Serbians, Croats and populations of Bosnia and Herzegovina) are grouped alongside Balkan (Northern Greeks and Romanians) and Northern Italian populations. West Slavic populations (Slovaks, Czechs and most of the Poles) are adjacent to some North European non-Slavic populations (from Finland and Sweden). The majority of East Slavic populations (from European Russia and Belarus) are scattered, whereas some European Russians (from Vladimir and Yaroslavl) and Ukrainians are close to West Slavic populations. Furthermore, Germanic and Romance speaking populations (from Germany and Austria, and from Iberia, France and Italy, respectively) are located separately, in the positive part of PC1.
The comparison of the Y-chromosome haplogroup frequencies in Slavic speaking and remaining European populations performed by PC analysis is given in Figure 4. Compared to the PCA of mtDNA haplogroup frequencies, it again shows a grouping of most South Slavs (Serbs, Bulgarians, Croats, Bosnia-Croats, Bosnia-Serbs, Bosnians) with neighboring populations from the Balkans (Romanians, Greeks and Macedonian Greeks), and this grouping is more clear-cut. On the other hand, West (Czechs, Poles and Western Slovaks) and East Slavs (Ukrainians, Belarusians and European Russians) are located separately. Among the non-Slavic populations, Swedish Saami and Finns are almost outliers; the two populations from Germany almost overlap, whereas Italian populations form a cluster that also includes Catalonia.
In general, the obtained results show that, based on the distribution of mtDNA and Y-chromosome haplogroups, West and East Slavic speaking populations are located separately from South Slavic populations. Furthermore, in the European uniparental landscape, South Slavic speaking populations are positioned closer to neighboring Balkan non-Slavic populations and North Italian populations than to other Slavic populations. This hints that the linguistic resemblance of South Slavic speaking populations to East and West Slavic groups is not paralleled to a similar extent by a genetic one, which is in line with previous findings demonstrating that the basis of the gene pool of West-East and South

Figure 3. PCA plot of European populations based on mtDNA haplogroup frequencies. The variance captured by the first and second principal components (F1 and F2, respectively) is given in brackets. NE: northeast; NW: northwest. Symbols denote West, East and South Slavic, and non-Slavic populations.

Figure 4. PCA plot of European populations based on their Y-chromosome haplogroup frequencies. The variance explained by the first and second principal components (F1 and F2, respectively) is shown in brackets. Symbols denote West, East and South Slavic speaking, and non-Slavic populations.
Table S1. Number of individuals belonging to each mtDNA haplogroup in European Slavic speaking populations.