The Major Y-Chromosome Haplogroup R1b-M269 in West-Europe, Subdivided by the Three SNPs S21/U106, S145/L21 and S28/U152, Shows a Clear Pattern of Geographic Differentiation

More than 2600 unrelated males from West-Europe were analysed by molecular hybridization experiments for the p49a,f TaqI polymorphisms. A total of 895 subjects (34%), belonging to haplogroup M269, were identified and further analysed for the three SNPs S21/U106, S145/L21 and S28/U152; these three SNPs define the Northwest, West and South European sub-haplogroups, respectively. These sub-haplogroups showed quite different frequency distribution patterns within West-Europe, with frequency peaks in Northern Europe, in Brittany in France, and in Northern Italy/Southern France, respectively.


Introduction
Analysis of Y-chromosome sequence variation has provided major insights into human evolution and dispersals. The first published tools permitting such a study (Lucotte & Ngo, 1985) were the informative Y-chromosome-specific DNA probes p49a,f, mapped to the non-recombining (NRY) Yq11.2 region (Quack et al., 1988). Because the non-recombining region of the Y chromosome is transmitted uniparentally and escapes recombination, its variation arises only through the sequential accumulation of rare new mutations along radiating paternal lineages.
Our most recent study on haplogroup M269 (Diéterlen & Lucotte, 2005) concerned a large sample of males from Europe, the Middle East and North Africa. We now concentrate on more than 2600 unrelated males originating from West-Europe, in order to construct a geographical map of this region. Our sample of males belonging to haplogroup M269 was also typed for the Y-chromosome markers S21/U106, S145/L21 and S28/U152, to better understand the pattern of their distribution in West-Europe.

Subjects and Methods Used
The population sample (29 populations) consisted of 2618 unrelated adult males from West-Europe (Table 1), originating from 13 countries. All blood samples were collected from volunteer donors with informed consent; subjects were classified according to their grandfathers' birthplaces. The geographic location of the populations analysed is shown in Figure 1. Genomic DNA was extracted from whole blood by a classic method (Gautreau et al., 1983), using proteinase K and several successive phenol/chloroform extractions.
At least 5 µg of genomic DNA was digested with the TaqI enzyme and separated by electrophoresis on a 1.5% agarose gel. The digested DNA was then transferred to Hybond N+ membranes by the Southern blot (SB) method and hybridized with two probes (the 2.8 kb p49f EcoRI insert first, and the 0.9 kb p49a XbaI and BamHI second), according to Lucotte et al. (1994). The TaqI fragments revealed in genomic DNA (named alphabetically A-Q in order of decreasing size) are mostly male-specific; among them, the A, C, D, F and I bands can be either present or absent in individuals (variants or zero). Haplogroup M269 corresponds to the pattern A3, C1, D2, F1, I1.
All the SNPs (Single Nucleotide Polymorphisms) were tested by PCR (Polymerase Chain Reaction) in the following order: M269, S127, S116, S21, S145 and S28. Table 2 gives the primers used for S21/U106 (rs16981293), S145/L21 (rs11799226) and S28/U152 (rs1236440). Samples were amplified in a standard PCR reaction and analysed with the SNaPshot Multiplex System (Life Technologies Corp., Carlsbad, California, USA). The SNP markers S127/L11, S116/P312 and S250/DF27 were also typed in our DNA sample of 2618 subjects. However, because very few or no individuals carried these markers without the downstream markers S21, S28 and S145 (that is, L11*, P312* and DF27*, respectively), the corresponding results are not tabulated in the present study.
Haplotype and SNP maps were produced with the Spatial Analyst program (ArcView software) using the classical Kriging procedure (Diéterlen & Lucotte, 2005). We used inverse distance weighting (IDW), which performs well with scarce data. The IDW method was computed for the five nearest neighbors (the grid has 250 rows and 355 columns) with a power of 2 (so that the influence of distant points is greater than it would be with a higher power).
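The IDW step can be illustrated with a minimal sketch, assuming planar coordinates and plain Euclidean distances. It is not the Spatial Analyst implementation: the function names `idw_estimate` and `idw_grid` and the sample points are hypothetical, chosen only to show how the five-nearest-neighbor rule and the power parameter enter the estimate.

```python
import math


def idw_estimate(x, y, samples, k=5, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples is a list of (x, y, value) tuples. Only the k nearest
    samples contribute, each weighted by 1 / distance**power.
    """
    if not samples:
        raise ValueError("at least one sample point is required")
    by_distance = sorted(
        ((math.hypot(sx - x, sy - y), value) for sx, sy, value in samples),
        key=lambda pair: pair[0],
    )
    nearest = by_distance[:k]
    # A target that coincides with a sample takes that sample's value.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / dist ** power for dist, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)


def idw_grid(samples, bounds, rows, cols, k=5, power=2.0):
    """Evaluate idw_estimate at the centre of every cell of a rows x cols grid.

    bounds is (xmin, ymin, xmax, ymax) in the same units as the samples.
    """
    xmin, ymin, xmax, ymax = bounds
    dx = (xmax - xmin) / cols
    dy = (ymax - ymin) / rows
    return [
        [
            idw_estimate(xmin + (c + 0.5) * dx, ymin + (r + 0.5) * dy,
                         samples, k, power)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical sample points (x, y, frequency in %); not data from Table 1.
points = [(0.0, 0.0, 10.0), (3.0, 0.0, 30.0)]
print(idw_estimate(1.0, 0.0, points, power=2.0))
print(idw_estimate(1.0, 0.0, points, power=1.0))
```

In this toy case the power-2 estimate stays closer to the value of the nearer point than the power-1 estimate does. This is the trade-off described above: a lower power gives distant points more influence, a higher power gives them less. A grid like the one used here would correspond to calling `idw_grid` with `rows=250` and `cols=355`.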
In the north of France, haplotype frequencies vary between 39% in Paris and 479% in Lille. For north European continental countries, haplogroup frequencies vary between 49% in the Netherlands and 41.5% in Belgium; values are 44% in Germany and 39% in Switzerland.
Haplogroup frequencies decrease from Central Italy (37.5%) to Calabria (18%), Sardinia (14%) and Sicily (13%). On the Mediterranean coast of France, values are 53.5% in Montpellier, 34% in Perpignan and 34% in Grasse; in Corsica, the mean frequency is 24%. Haplogroup M269 frequencies decrease in Iberia: in Portugal from the North (39%) to the South (29%), and in Spain (41% for the Catalans of Barcelona and 30% for Sevilla and Andalusia).
Figure 2 represents the current haplogroup M269 distribution in West-Europe; the focus of this haplotype is the Basque region in France and in Spain (Diéterlen & Lucotte, 2005). From this center of origin, haplogroup M269 frequencies decrease to the East in Continental West-Europe, and to the South in Italy and in Iberia.
This characteristic geographical pattern of haplogroup M269 in West-Europe corresponds to the map of percentages of Basque words in local dialects in the south of France, in Spain and in Italy (Figure 3). In our data (Table 1), 204 of the 895 haplogroup M269 subjects (23%) belong to sub-haplogroup R-S21/U106. The highest R-S21 frequency (79%) is found in the Netherlands. R-S21 frequencies are 77% in Germany, 74% in Denmark, 73% in Belgium, 70% in Austria and 60% in Switzerland. In the Czech Republic the R-S21 frequency is 53%. The maximal values of R-S21 are found in Nancy for France (58%) and in Torino for Italy (26%).
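The headline percentages follow directly from the counts reported in the text (2618 subjects, 895 of them in haplogroup M269, 204 of those in R-S21). A minimal sketch of that arithmetic, with the hypothetical helper name `share`:

```python
def share(count, total):
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts taken from the text; rounded to whole percentages as reported.
print(round(share(895, 2618)))  # M269 among all subjects -> 34
print(round(share(204, 895)))   # R-S21 among M269 subjects -> 23
```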

Y-Chromosome SNPs Studies
Figure 4 shows the map of R-S21 frequencies in West-Europe. As the peak of S21 frequencies concerns Germany and surrounding northern areas, we propose to name R-S21 "the North-Western R1b-haplogroup" (the "germanic" haplotype); Great Britain has an intermediate value. The map shows a gradient of decreasing S21 frequencies from the focus towards the south-west in France and towards south-eastern Italy; there is another relatively important focus of S21 frequencies among the Czechs.
One hundred and twenty subjects (14% of those of haplogroup M269) belong to sub-haplogroup R-S145/L21 (Table 1). The highest R-S145 frequency (62.5%) corresponds to Brittany in France. S145 frequencies are 46% in Great Britain, 45% in Paris and 29% in the north of Portugal.
The relatively higher R-S145 concentrations in people of the western part of Europe (Bretons in the north-west of France, the southern English in Great Britain and Galicians to the north of Portugal), shown in Figure 5, are the reason why we propose to name this haplogroup "the Western R1b-haplogroup" (the "celtic" haplotype). The high peak of R-S145 in Paris in this sample could be explained by the massive immigration of Bretons to the capital since the beginning of the 20th century.
Figure 6 shows the R-S28 isofrequency map in West-Europe. Starting from a zone of high S28 concentration in southwestern France and in the north of Italy, the frequency of this haplogroup decreases progressively towards the north, west and east in the rest of West-Europe. Because the highest S28 values are concentrated in a region located in the European South, we propose to name the corresponding haplogroup "the Southern European R1b haplogroup" (the "alpine" haplotype).

Anthropological Considerations and Dating
Results reported here confirm those recently obtained by three groups of authors (Myres et al., 2011; Cruciani et al., 2011; Busby et al., 2013). The isofrequency maps presented by these three groups using the SNP M269 (haplogroup R1b1b2 = R-M269) are very similar to one another and correspond to that shown in the present study (Figure 2) for the "Basque subclade" of haplogroup M269 (see Figure 3 for the map of Basque words). The term "Basque subclade" is adopted in this article for the reasons set out here (in the strict current sense it could be inadequate). Initial studies (Malaspina et al., 2000; Semino et al., 2000) had suggested that the observed R1b1b2 frequency cline in Europe (from frequencies greater than 70% in Western Europe, decreasing eastward) is due to population expansion from a French-Iberian Ice-Age refugium after the Last Glacial Maximum (LGM). Morelli et al. (2010) calculated the TMRCA (Time to the Most Recent Common Ancestor) from Sardinian and Anatolian Y chromosomes and, employing the "population mutation rate", estimated that the R-M269 lineage originated 25,000-80,700 years ago. Such a huge variance in the date points to a problem with the calculation methodology. Another calculation (Klyosov, 2012) estimates that haplogroup R1b came to Europe only around 5000 years ago.
It is of great interest to try to understand how the vast majority of Western European men (more than 100 million) came to carry Y chromosomes that belong to haplogroup R-M269. But it is only relatively recently (Niederstätter et al., 2008; Myres et al., 2009) that the R-M269 haplogroup was first sub-structured; use of the SNP U106 (which determines the R1b1a2a1a1 sub-haplogroup) showed a high frequency of this haplogroup in northern and central Europe: Austria 18.5%-23%, Denmark 18%, England 22%, Germany 20.5%, Netherlands 37%. Our isofrequency S21 map (Figure 4) is very similar to those of the three groups of authors (Myres et al., 2011; Cruciani et al., 2011; Busby et al., 2013) using the SNP U106/S21. The preliminary expansion time estimate for haplogroup R1b1a2a1a1-U106 (Cruciani et al., 2011), based on seven STRs and the "population mutation rates" (8.3 ky; 95% CI 5.8-10.9 ky), is compatible with both Neolithic and post-glacial expansion within Europe. So, the North-Western R1b haplogroup is well defined in space and time; we still need to subdivide it further by downstream SNPs, such as Z381.
A similar characterisation exists for the South European haplogroup studied by S28/U152, our isofrequency map (Figure 6) being similar to those of the three groups of authors using SNP U152/S28. The expansion time estimate (Cruciani et al., 2011) for the corresponding sub-haplogroup R1b1a2a1a2b, again employing the "population mutation rates", is 7.4 ky (95% CI 5.3-10.2 ky), also compatible with both Neolithic and post-glacial expansion within Europe. Klyosov (2012) has determined that haplogroups U106 and U152 arose 4175 ± 430 and 4125 ± 450 years before present, respectively. The U106 and U152 subclades showed frequency clines towards the south and the north, respectively.
Our isofrequency map concerning the Western European R1b haplogroup studied by S145 (Figure 5) is also very similar to those of Myres et al. (2011) and of Busby et al. (2013) using SNP M529/S145.
The majority (597 of 895 = 67%, Table 1) of the Basque M269 chromosomes from West-Europe were found to be ancestral for the S21 North-Western sub-haplogroup, the S145 Western sub-haplogroup and the S28 Southern sub-haplogroup (all three are included in the paragroup R1b1b2*).

Conclusion
About one third (34% in this study) of the males originating from West-Europe belong to Y-chromosome haplogroup M269. The isofrequency map of West-Europe for this haplogroup shows a peak in frequencies for the Basque country and clines of decreasing frequencies from this peak towards the south and the west. When analysed for SNPs S21, S145 and S28, the haplogroup Y-chromosomes show a clear pattern of geographic differentiation: 23% of them are of the R-S21 sub-haplogroup, 14% of the R-S145 sub-haplogroup and 30% of the R-S28 sub-haplogroup. The isofrequency map for R-S21 (Figure 4) shows a peak in frequencies for northern Europe and a gradient of decreasing frequencies from this peak towards the south of West-Europe. The isofrequency map for R-S145 (Figure 5) shows peaks for Brittany in France and for the south of England (and a minor peak in Galicia). The isofrequency map for R-S28 (Figure 6) shows maximal frequencies in Southern France and Northern Italy, and decreasing frequencies towards the west, north and east of West-Europe; the highest R-S28 values are concentrated within the Alps in continental Italy. Together, the three sub-haplogroups subdivide the geographic distribution of haplogroup M269 in a highly structured fashion.

Figure 1. Sample populations (numbers refer to Table 1) and their locations in West-Europe.
Myres et al. (2011) published an important study on 10,355 male subjects (from 118 West Asian and European populations) typed for the M269 SNP, common in West-Europe. The M269+ European subjects were subsequently typed with the S127/L11 and S116/P312 SNPs; the first one appears to be the most common (>70%) Y-chromosome haplogroup in Western Europe. Myres et al. also described several new SNP mutations downstream of R-M269, showing strong geographical structuring in a large sample of 2043 R-M269+ subjects.

Figure 2. Map of haplogroup M269 (haplotype XV) in West-Europe and in the Mediterranean basin. The various shades of grey correspond to artificial discontinuities, with density percentages as indicated.

Figure 3. Map of the percentages of Basque words in local dialects in the west and south of continental Europe. The various shades of red correspond to discontinuities, with density percentages as indicated (F. Diéterlen, personal communication). This map shows nine gradations of red, for more than 127 words (log. scale).

Figure 6. Isofrequency map of R-S28 in West-Europe; the various shades of red depict the structured pattern of variation for S28 frequencies.

Table 1. Frequencies of haplogroup M269 (haplotype XV) and of S21, S145 and S28 in the 29 study populations (N: number of subjects); the highest values are in bold.
