Mapping Human Genetic Diversity on the Japanese Archipelago

The Japanese are one of the most important populations for studying the origin and diversification of East Asian populations. Because the Japanese are an island population, their routes of migration have long been controversial. Archeological evidence suggests that there were at least two waves of migration to the Japanese archipelago in prehistory: the Paleolithic and Neolithic Jomonese and the Aeneolithic Yayoiese. However, the contributions of the Jomonese and Yayoiese to the contemporary Japanese population remain unclear. In this article, we use evidence from human genetics as a new approach to this topic. We first introduce the history of human migration to the Japanese archipelago, as well as the materials and methods that human geneticists use. We then test three distinct population expansion models using evidence from recent human genetic studies on Japanese, East Asian, and Siberian populations. Finally, we conclude that the contemporary main island Japanese are the result of admixture among Jomonese, Yayoiese, and Han Chinese, which is consistent with the Admixture model.


Introduction
There were at least two waves of migration to the Japanese archipelago in prehistory. The first wave, represented by the Minatogawa Man in Okinawa (Hisao et al., 1998), began at 50,000 years BP and reached its climax at about 10,000 years BP, giving rise to the Jomon culture. More recently, a second wave of migration reached the Japanese archipelago at 2300 years BP, giving rise to the Yayoi culture. Based on fossil records and human remains, the Yayoiese soon dominated the Japanese archipelago and completed their expansion at about 300 AD (Chard, 1974). By that time, evidence of agriculture could be found at almost all human settlements on the Japanese archipelago except the northernmost areas of Hokkaido and the southernmost areas of Okinawa. This is consistent with the dominance of the Yayoiese in that period. However, the evidence from cranial morphology does not support a complete replacement of the Jomonese by the Yayoiese (Hanihara, 1984). Therefore, several different theories about the origin and diversification of the Japanese population have been proposed since the 19th century (Mizoguchi, 1986).
Human genetics and molecular anthropology advanced rapidly after the invention of the polymerase chain reaction (PCR) in the 1980s. Despite the long dispute over the origin of anatomically modern humans, Homo sapiens sapiens, evidence from mitochondrial DNA and Y chromosome analysis has confirmed an African origin of all modern humans (Vigilant et al., 1991; Bowcock et al., 1994), augmented with moderate gene flow from Neanderthals and possibly Denisovan archaic humans among non-African groups. Since then, Y chromosome and mitochondrial DNA analyses have become powerful tools in human evolutionary studies. The most influential example is the identification of the African origins of the East Asian population, despite the large number of hominid fossils that have been found in East Asia (Ke et al., 2001). Most recently, autosomal DNA analyses have been developed to test more complex population expansion models. Here, we use these methodologies to draw conclusions about the origin of contemporary Japanese from both paternal and maternal lineages.

Y Chromosome and Mitochondrial DNA: Powerful Tools for Studying Human Evolution
Compared to historical, archeological, or osteological studies, molecular anthropological evidence is in some senses more reliable, since the genetic material used by molecular anthropology is continuous and maintains its integrity as it is passed from generation to generation. Historical, archeological, and osteological studies are based on material culture and fossilized human remains, which can be discontinuous, strongly modified by the environment, and frequently hard to identify. Among all the materials available for molecular anthropological study, the Y chromosome and mtDNA are the most powerful because of their abundance and ease of extraction. Their advantages include lack of recombination, small effective population size, and population-specific haplotype distribution (Zhang et al., 2007).

The Y Non-Recombining Region, Y-SNPs, Y-STRs, and the Paternal Transmission
Unlike the autosomes and the X chromosome, the non-recombining region of the Y chromosome (NRY) is passed from generation to generation without recombination (Wang et al., 2010). Its effective population size (Ne) is also much smaller than that of the autosomes, since only males carry a Y chromosome. Therefore, the effects of genetic drift on Y chromosome genetic structure are more pronounced than on autosomal genetic structure.
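The difference in effective population size can be illustrated with a minimal sketch based on the standard idealized (Wright-Fisher) approximation. It assumes equal numbers of breeding males and females, in which case the Y chromosome has one quarter of the autosomal number of gene copies. The function names and the example population of 500 males and 500 females are illustrative assumptions, not values from the studies cited here.

```python
def y_effective_copies(n_males):
    """Effective number of Y chromosome copies: one per breeding male."""
    return float(n_males)


def autosomal_effective_copies(n_males, n_females):
    """Effective number of autosomal gene copies, 2 * Ne,
    with Ne = 4 * Nm * Nf / (Nm + Nf) for unequal sex numbers."""
    ne = 4.0 * n_males * n_females / (n_males + n_females)
    return 2.0 * ne


def diversity_retained(copies, generations):
    """Expected fraction of gene diversity left after drift alone:
    (1 - 1/copies) ** generations."""
    return (1.0 - 1.0 / copies) ** generations


# Hypothetical population: 500 breeding males and 500 breeding females.
y_copies = y_effective_copies(500)
auto_copies = autosomal_effective_copies(500, 500)
ratio = y_copies / auto_copies

# Diversity remaining after 100 generations of drift (no mutation, no migration).
y_left = diversity_retained(y_copies, 100)
auto_left = diversity_retained(auto_copies, 100)
```

Under these assumptions the Y chromosome loses diversity faster than the autosomes, which is why drift leaves a stronger, more population-specific signal on the NRY.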
Y chromosome single nucleotide polymorphisms (Y-SNPs) are highly informative and can therefore be used in population classification. For example, in a single study, Underhill et al. (1997) identified more than 160 bi-allelic polymorphisms on the NRY using HPLC. Moreover, more than 15,000 Y-SNPs have recently been identified by sequencing the whole Y chromosome with next-generation sequencing technologies (unpublished results).
The base substitution rate of the Y chromosome is relatively low, which suggests that Y-SNPs alone may be unable to reflect the most recent population expansion events. However, Y chromosome short tandem repeats (Y-STRs), which mutate much faster, can be used as an alternative to construct detailed evolutionary networks and to estimate the divergence times of Y-SNP lineages.
To conclude, by combining Y-SNP and Y-STR information, the NRY can be used to reveal the migration and expansion history of modern human populations.
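How Y-STR variation within a Y-SNP haplogroup translates into an age can be sketched with the average squared distance (ASD) estimator, which under a strict stepwise mutation model grows linearly with time. This is a deliberately simple stand-in and not the YMRCA algorithm of Stumpf et al. (2001) used in the studies discussed later. The haplotypes, the founder haplotype, the mutation rate of 0.002 per locus per generation, and the 25-year generation time are all hypothetical values chosen for illustration.

```python
from statistics import mean


def asd_to_founder(haplotypes, founder):
    """Average squared difference in repeat counts between each sampled
    haplotype and the assumed founder haplotype, averaged over loci."""
    return mean(
        mean((allele - ancestral) ** 2 for allele, ancestral in zip(hap, founder))
        for hap in haplotypes
    )


def tmrca_years(haplotypes, founder, mu_per_locus_per_gen, years_per_gen):
    """Age estimate under the stepwise mutation model: E[ASD] = mu * t."""
    generations = asd_to_founder(haplotypes, founder) / mu_per_locus_per_gen
    return generations * years_per_gen


# Hypothetical repeat counts at three Y-STR loci for four sampled males.
founder = (14, 12, 23)
sample = [(14, 12, 23), (15, 12, 23), (14, 13, 22), (14, 12, 24)]

asd = asd_to_founder(sample, founder)
age = tmrca_years(sample, founder, mu_per_locus_per_gen=0.002, years_per_gen=25)
```

The estimate scales inversely with the assumed mutation rate, so the wide age ranges reported for Y chromosome haplogroups largely reflect uncertainty in that rate and in the sample.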

The Mitochondrial DNA (mtDNA) and the Maternal Transmission
Human mtDNA is 16,569 base pairs in length and consists predominantly of coding DNA, with an 1100 bp-long non-coding control region (Anderson et al., 1981; Pakendorf et al., 2005). The control region mutates rapidly and is referred to as the hypervariable region (HVR). However, the exceptionally high mutation rate in the HVR and the possibility of recombination may lead to recurrent mutation and could cause bias (Arctander, 1999; Yao et al., 2000).
During fertilization, the mitochondria carried by the sperm are generally not passed on to the embryo, which means that paternal mtDNA is not normally transmitted to the next generation. It is now widely accepted that mtDNA is maternally inherited and that the population genetic characteristics of mtDNA are similar to those of the NRY.

Paleolithic and Neolithic Jomon Period Migrations
The first wave of migration to the Japanese archipelago started around 50,000 years BP. However, human fossil remains in Japan are rare. The 18,000-year-old Minatogawa Man found in Okinawa is considered the oldest hominid remains in the Japanese archipelago (Matsu'ura et al., 2010). Morphological findings on the Minatogawa I male skull suggest that it is closer to the Liujiang Man from Guangxi, China, than to the Upper Cave Man of northern China (Suzuki, 1982). This finding suggests that the Jomonese might be the direct descendants of Minatogawa Man (Hisao et al., 1989).
The Jomon culture lasted from the final stage of the last ice age to around 2300 years ago and, at its climax, was distributed widely across the Japanese archipelago, from southernmost Okinawa to northernmost Hokkaido. The Japanese archipelago was not separated from mainland Asia until the early Jomon period, so the migration route of the Jomonese remains an unsolved puzzle (Palmer, 2007).
Based on craniological research, the Jomonese differ from both contemporary and Neolithic Chinese populations. Correspondingly, a study (Hanihara, 1984) based on a Q-mode correlation matrix from nine cranial measurements in 21 populations (Figure 1) reveals that, apart from the Ainu and Ryukyuans, who have been shown to be direct descendants of the Jomonese, the Polynesians and Micronesians are the populations closest to the Jomonese among the ancient Chinese, contemporary Chinese, Koreans, contemporary Japanese, Ainu, Ryukyuans, North Asians, Yayoiese, Polynesians, and Micronesians.
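The idea behind a Q-mode correlation matrix can be sketched as follows: each measurement is first standardized across populations, and correlations are then computed between populations (rows) rather than between measurements (columns). This is a generic illustration and may differ in detail from the procedure of Hanihara (1984). The four populations and three measurements below are hypothetical numbers, whereas the study used nine measurements in 21 populations.

```python
import numpy as np


def q_mode_correlation(measurements):
    """Correlation between populations (rows) across standardized
    measurements (columns)."""
    x = np.asarray(measurements, dtype=float)
    # Standardize each measurement so that no single variable dominates.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # np.corrcoef treats each row as one variable, here one population.
    return np.corrcoef(z)


# Hypothetical cranial measurements (mm): rows are populations A to D.
data = [
    [180.0, 140.0, 135.0],  # population A
    [182.0, 141.0, 139.0],  # population B
    [175.0, 146.0, 131.0],  # population C
    [176.0, 145.0, 129.0],  # population D
]
r = q_mode_correlation(data)
```

A dendrogram such as the one in Figure 1 is then typically obtained by hierarchical clustering of the populations using this matrix as the similarity measure.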

Aeneolithic Yayoi Period Migrations
The Yayoiese were the second wave of migration to the Japanese archipelago, entering from the Korean Peninsula around 2300 years BP and completing their expansion in the 3rd century AD. The Yayoiese brought wet rice agriculture, weaving, and metalworking to the Japanese archipelago (Chard, 1974). In contrast to the hunter-gatherer lifestyle of the Jomonese, the introduction of these tools and crops apparently increased productivity.
In addition to the increase in productivity, the political system and the style of human settlements changed significantly in the Yayoi Period, which played an important role in the later process of nation formation. Before the arrival of the Yayoiese, the Jomonese lived in relatively small communities, estimated at about 24 individuals per settlement. In contrast, Yayoi settlements were larger, at about 57 individuals, more than twice the size of those of the Jomon Period (Suzuki, 1982; Koyama, 1978).
The origin of the Yayoiese and the driving force of their migration also remain controversial. Archeological evidence and human remains suggest that the migration started at the end of the Jomon Period, in the 3rd century BC, and was followed by a rapid population expansion over the next 600 years. The time of migration coincided with a cooling climate and widespread civil disturbance in central and northern China and Korea. Therefore, this wave of migration to the Japanese archipelago might have been stimulated by war and chaos in neighboring countries, as well as by climate change. Furthermore, the discovery of Northeast Asian style metal tools in Yayoiese settlements may indicate a northeastern origin of the Yayoiese.
Correspondingly, cranial morphological studies also demonstrate a close relationship of the Yayoiese to Mongolians, Siberians, and northeastern Chinese. Notably, all these populations are adapted to extremely cold climates. This evidence suggests that the Yayoiese may have entered Japan from Sakhalin via northernmost Hokkaido instead of from the Korean Peninsula.
After the Yayoi Period, during the Kofun Period (3rd century AD to 6th century AD), the number of migrants from mainland Asia increased owing to a set of policies of the Imperial Court that encouraged the importation of more advanced culture and techniques from China and Korea. Chinese written characters were introduced into Japan in the 5th century AD, which is important evidence of the implementation of these Imperial Court policies (Hanihara, 1991).

Three Competing Theories on the Origin of Contemporary Japanese
Similar to the debates on human origins in other regions, the competing hypotheses on the origin of contemporary Japanese can be classified into three models: Replacement, Admixture, and Transformation. These models involve different genetic contributions of the Jomonese and Yayoiese to the modern Japanese population (Figure 2).

Replacement Model
The main argument of the Replacement model is that, since the Yayoiese dominated the Japanese archipelago after the 3rd century AD, the Jomonese genetic lineages were completely replaced by those of the Yayoiese (Howells, 1966; Turner, 1976).

Transformation Model
In contrast to the Replacement model, the Transformation model claims that the incoming Yayoiese did not affect the gene pool of the Jomonese, which implies that contemporary Japanese are the direct descendants of the Jomonese with no Yayoiese contribution (Suzuki, 1981; Mizoguchi, 1986).

Admixture Model
The Admixture model was proposed as a compromise between the Replacement and Transformation models. Based on both archeological and anthropological evidence, the Admixture model suggests that modern Japanese are an admixed population of Jomonese, Yayoiese, and more recent migrants, which accounts for the variation observed among contemporary Japanese (Hanihara, 1991).
All three models have been supported by evidence from genetics and archeology, e.g., Cavalli-Sforza et al. (1994) for the Replacement model, Nei (1995) for the Transformation model, and Hammer et al. (1995), Horai et al. (1996), Omoto et al. (1997), Sokal et al. (1998), and Tajima et al. (2002) for the Admixture model. More recent evidence favors the Admixture model.

Haplogroup O
The frequency of haplogroup O ranges from 46.2% to 62.3% in main island Japanese and is 37.8% in Okinawa Japanese, but the haplogroup is absent from the northern Hokkaido Ainu population. There are two major sub-haplogroups of haplogroup O in Japanese: O2, defined by SNP P31, and O3, defined by SNP M122. Within sub-haplogroup O2, the Japanese/Korean-specific sub-haplogroup O2b1, defined by SNP 47z, exhibits higher frequencies on the main islands than in Okinawa, while no Japanese-specific O3 sub-haplogroup has been identified.
Estimated from Y-STR data with the Y chromosome most recent common ancestor (YMRCA) algorithm (Stumpf et al., 2001), the age of haplogroup O2b1 is 5720-12,630 years (Hammer et al., 2006). This age is consistent with the Yayoi Period.
Haplogroup O3 is another major haplogroup of contemporary Japanese and is also the largest haplogroup in contemporary Han Chinese (Yan et al., 2011). The estimated age of the Japanese O3 haplogroups is much younger (~500 years BP or less) and could be explained by recent migrants from mainland East Asia.

Haplogroup D
The second most frequent haplogroup in Japanese is haplogroup D, accounting for about 34.7% of main island Japanese. Haplogroup D is absent outside Asia and is most common in Japan and Tibet, which suggests that Tibetans and Japanese may be closely related.
D2-P37.1 is Japanese-specific and is rarely found in Koreans, in contrast to haplogroup O2b1, which occurs at high frequencies in both Japanese and Koreans. Haplogroup D exhibits a very different pattern from haplogroup O in Japanese (Figure 3 and Figure 4). For example, among all Japanese populations, the frequency of Haplogroup D is highest in the northern Hokkaido Ainu and the Okinawans.
The age of Haplogroup D2 is estimated at 14,060-31,050 years (Hammer et al., 2006), older than that of Haplogroup O2b1. This old age is consistent with the earliest peopling of the Japanese archipelago, 50,000 to 10,000 years BP. Therefore, Haplogroup D and Haplogroup O appear to have come from two distinct waves of migration to the Japanese archipelago.

Haplogroups C and N
The third and fourth most frequent haplogroups among the Japanese are C and N, which account for 8.5% and 1.5% of the Japanese population, respectively. Within Haplogroup C, a Japanese-specific sub-haplogroup, C1-M8, is found. Interestingly, sub-haplogroup C1-M8 is absent among the Ainu and most frequent in people from Tokushima, exhibiting a pattern different from those of Haplogroups D2 and O2b1. The estimated age of Haplogroup C1-M8 is 8460-18,690 years (Hammer et al., 2006).

Evidence from Mitochondrial and Autosomal DNA Analyses

Evidence from mtDNA Analyses
According to full mitochondrial genome sequencing data (Tanaka et al., 2004), contemporary Japanese are most closely related to Koreans, which is consistent with migration from Korea and the subsequent rapid expansion in the Yayoi and Kofun Periods. Furthermore, evidence of admixture between ancient southern and northern migrants was found. For example, Haplogroup M12 may be a mitochondrial counterpart of the Y chromosome Haplogroup D lineage, and is found at high frequency and diversity in both Japanese and Tibetans.
Frequencies of most mtDNA haplogroups were estimated from sequencing data of the hypervariable region (HVR) of mtDNA, offering a comprehensive comparison among ancient Japanese, contemporary Japanese, other East Asians, and Siberian populations (Table 2). Similar to Haplogroups C1 and D2 of the Y chromosome, mtDNA Haplogroups N9b and M7a can be identified as Jomonese-specific haplogroups (Kivisild et al., 2002; Umetsu et al., 2005; Tanaka et al., 2004). Among the three Japanese populations in Table 2, the frequencies of Haplogroups N9b and M7a in the Ainu and Okinawa Islanders are much higher than in main island Japanese, while the frequencies of Haplogroups A and D (excluding D1), which are frequent among contemporary Chinese and Koreans, are much lower than in main island Japanese. Since the sea level rise did not isolate the Japanese archipelago from mainland Asia until the early Jomon period, the high frequency of Haplogroups N9b and M7a in

Evidence from Autosomal Analyses
Autosomes are ideal material for testing complex population genetic hypotheses, since they carry much more information than the Y chromosome and mtDNA. The most comprehensive autosomal study of East Asians (The HUGO Pan-Asian SNP Consortium, 2009) showed that the genetic components in Ryukyuans are clearly different from those of any other population. Furthermore, Ryukyuans lack East Asian-specific components and have an abundant common component shared by East Asians, Central Asians, Southeast Asians, and Oceanians. This evidence suggests a different origin of Ryukyuans compared with the main island Japanese, which is consistent with the Jomon origin of Ryukyuans.
Furthermore, principal component analysis (PCA) and genetic distance calculations for various Japanese populations using 140,387 autosomal SNPs were performed by Yamaguchi-Kabata et al. (2009). In the PCA plot, two distinct clusters of Japanese, both different from Han Chinese, can be identified, while another small group of Japanese shows a closer relationship with Han Chinese. The genetic distance between the Ryukyu cluster and any main island Japanese population (Hondo cluster) is significantly higher than the genetic distances within the main island Japanese population. This evidence supports a different origin of Okinawa Islanders from the main island Japanese populations, and could also be solid evidence for the Admixture model.
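How PCA separates populations from SNP genotypes can be sketched with a plain singular value decomposition of a centered genotype matrix. This is a generic illustration, not the pipeline of Yamaguchi-Kabata et al. (2009). The two groups of 20 individuals, the 200 SNPs, and the allele frequencies of 0.1 and 0.9 are simulated, hypothetical values chosen so that the separation is obvious.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """PCA of an individuals-by-SNPs matrix of 0/1/2 allele counts."""
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)            # center each SNP
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)      # fraction of variance per component
    return scores, explained[:n_components]


# Simulated data: two hypothetical groups with very different allele frequencies.
rng = np.random.default_rng(0)
freq_a = np.full(200, 0.1)
freq_b = np.full(200, 0.9)
group_a = rng.binomial(2, freq_a, size=(20, 200))
group_b = rng.binomial(2, freq_b, size=(20, 200))

scores, explained = genotype_pca(np.vstack([group_a, group_b]))
pc1_a, pc1_b = scores[:20, 0], scores[20:, 0]
```

In real data the allele frequency differences between clusters such as Hondo and Ryukyu are far smaller than in this toy example, which is why many thousands of SNPs are needed to resolve them.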
However, there are many problems in using autosomal data to interpret population structure. Recombination occurs frequently on autosomes, making it hard to trace the ancestry of a population. The output of the STRUCTURE analysis (The HUGO Pan-Asian SNP Consortium, 2009) and the inter-population genetic distances (Yamaguchi-Kabata et al., 2009) cannot be assigned to specific ancestral components. It is therefore not possible to calculate the admixture ratio of Jomonese to Yayoiese from these autosomal data.

Conclusion and Prospects
According to studies on the population structure of East Asians, Southeast Asians, and South Asians (Jobling et al., 2003; Tu et al., 1993; Wen et al., 2004; Thangaraj et al., 2003; Deng et al., 2004; Karafet et al., 1999), and based on Y chromosomal evidence, we can draw the following conclusions.
The frequency distribution of Haplogroup D is U-shaped, while the distribution of Haplogroup O is inverted U-shaped (Figure 4). Before the discovery of detailed markers on the Y chromosome, Sokal et al. (1998) predicted that if the origin of the Japanese is consistent with the Admixture model, the distribution of certain characteristics will be U-shaped or inverted U-shaped.
The diversity of Y-STRs and downstream Y-SNPs of Haplogroups O2 and O3 is apparently higher in Koreans than in Japanese, which supports the view that the Yayoiese came to the Japanese archipelago by the southern route (Korea) rather than the northern route (Sakhalin).
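Whether a frequency cline is U-shaped or inverted U-shaped can be checked, as a rough sketch, by fitting a parabola to frequency against distance and reading the sign of the quadratic coefficient. This is an illustrative test of our own choosing, not the analysis performed by Hammer et al. (2006). The distances and frequencies below are hypothetical and are not the values plotted in Figure 4.

```python
import numpy as np


def curvature(distance, frequency):
    """Quadratic coefficient of a least-squares parabola fitted to
    frequency against distance: positive means U-shaped,
    negative means inverted U-shaped."""
    a, _, _ = np.polyfit(distance, frequency, deg=2)
    return a


# Hypothetical values for illustration only.
# Distance is in thousands of km from Kyushu (negative = southwest, positive = northeast).
distance = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
freq_d = np.array([0.60, 0.35, 0.25, 0.40, 0.75])  # high at both ends of the archipelago
freq_o = np.array([0.30, 0.50, 0.60, 0.50, 0.20])  # high near the assumed entry point

curv_d = curvature(distance, freq_d)
curv_o = curvature(distance, freq_o)
```

A positive coefficient for one haplogroup together with a negative coefficient for the other is the pattern expected if an incoming population diluted an older one most strongly near its point of entry.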
Ruling out sampling error and the uneven distribution across locations, and based on data for Haplogroups D and O, the genetic contributions of the Jomonese and Yayoiese have been calculated: the contribution of the Jomonese is 40.3%, while that of the Yayoiese is 51.9% (Hammer et al., 2006).
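The logic of estimating such contributions can be sketched with the simplest two-source linear mixing model, in which the frequency of a marker in the admixed population is a weighted average of its frequencies in the two source populations. This is a generic textbook estimator and is not claimed to be the method used by Hammer et al. (2006). The marker frequencies in the example are hypothetical; only the two percentages in the last line come from the text above.

```python
def admixture_proportion(p_mixed, p_source_a, p_source_b):
    """Ancestry fraction from source A under a two-source linear mixing model:
    p_mixed = m * p_source_a + (1 - m) * p_source_b."""
    if p_source_a == p_source_b:
        raise ValueError("marker is not informative: source frequencies are equal")
    return (p_mixed - p_source_b) / (p_source_a - p_source_b)


# Hypothetical frequencies of a marker that is common in source A and absent in source B.
m_a = admixture_proportion(p_mixed=0.35, p_source_a=0.85, p_source_b=0.0)

# The two contributions reported in the text sum to less than 100%.
reported_total = round(40.3 + 51.9, 1)
```

The reported contributions add up to 92.2%; the text does not state how the remainder is attributed.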
The evidence from mtDNA analysis also supports the Admixture model of Japanese origins. However, the pattern is not as clear as that provided by the Y chromosome. For example, the Jomonese contribution calculated from mtDNA haplogroups (~80%, unpublished data) is higher than that calculated from Y chromosome haplogroups (~40%, Hammer et al., 2006), and Haplogroup M12, the mtDNA counterpart of Y chromosome Haplogroup D, is present at low frequency in Koreans. This could be explained by wars and other interactions between populations. During a war, males from the defeated side are more likely to be killed, while females from the defeated side are more likely to be abducted as war trophies and may go on to bear children with the conquering males. A similar phenomenon can be observed among Micronesians, in whom the Japanese-specific Y Haplogroup D2b1 is found at low frequency owing to the prolonged Japanese dominance of Micronesia before the end of World War II.
Furthermore, the information carried by the mitochondrial DNA hypervariable regions is very limited, which results in large standard errors, so calculating the admixture ratio from the HVR is problematic. Whole mitochondrial genome sequencing of different populations is required to increase the reliability of admixture ratios and other inferences based on mitochondrial evidence.
The evidence from autosomal analyses is even weaker; we can only conclude that gene flow between different populations has occurred in the Japanese population throughout history. Nevertheless, an increasing number of recent studies based on Y chromosome, mitochondrial DNA, and autosomal analyses suggest that the Admixture model is the best-fit model.
Although the rough outline of the migration routes of contemporary Japanese has been drawn, many small puzzles remain unsolved. For example, is the extremely high Jomonese admixture ratio inferred from mitochondrial DNA reliable? Did Y Haplogroups C and D arrive in the Japanese archipelago at the same time? With the development of next-generation sequencing and improved sampling, these puzzles are expected to be addressed in the near future.

Figure 1. Dendrogram showing affinities of population groups based on a Q-mode correlation matrix from nine cranial measurements within 21 populations (modified from Hanihara, 1984).
Hammer et al. (2006) compared the Y chromosome haplogroup frequencies of the Japanese with those of Northeast Asian, Southeast Asian, Central Asian, South Asian, and Oceanian populations (Table 1). These populations carry Y chromosome Haplogroups C, D, NO, N, O, etc. Haplogroups C, D, N, and O account for 98.9% of contemporary Japanese.

Figure 2. Three competing hypotheses on the origin of contemporary Japanese. The rounded rectangles are gene pools and the arrows are gene flow.

Figure 3. Frequencies of Haplogroups D and O in different Japanese populations. Arrows represent possible migration routes.
also evince the first wave of migration to Japan. However, the different frequencies of Haplogroup D1 in the Funadomari Jomon and Kanto Jomon may suggest multiple routes of migration to the Japanese archipelago in the Jomon period, which is consistent with the different distribution patterns and ages of Y chromosome Haplogroups C and D. To conclude, consistent with the Y chromosome data, the presence of the Jomonese-specific Haplogroups N9b and M7a and of the common contemporary East Asian Haplogroups A and D (xD1) in main island Japanese provides solid evidence of population admixture between the Jomonese and the subsequent Yayoiese. The higher frequencies of N9b and M7a in the Hokkaido Ainu and Okinawa Islanders suggest a smaller Yayoiese contribution to the local Jomonese, which is also consistent with evidence from Y chromosome research among these populations.

Figure 4. The U-shaped and inverted U-shaped patterns of haplogroup frequency vs. distance from Kyushu (modified from Hammer et al., 2006).
Thanks to the large sample sizes of Y chromosome studies, the U-shaped and inverted U-shaped patterns predicted under the Admixture model have been shown, which supports the Admixture model. The age estimates of Haplogroups C and D in the Japanese archipelago are consistent with the range of the Jomon Period, while the age of Haplogroup O is consistent with the Yayoi Period. The absence of Haplogroup C1-M8 in the Haplogroup D-rich Ainu and Okinawans also remains an unsolved puzzle. Although the age estimates of Haplogroup C1 (8460-18,690 years) and Haplogroup D2 (14,060-31,050 years) differ, both are older than that of Haplogroup O2b1, so we can conclude that Haplogroups C1/D2 and O2b1 arose from two different waves of migration. However, given the different geographic distributions of Haplogroups C and D, it remains to be seen whether they were brought by the same wave of migration or by two different waves with different migration routes.

Table 1. Frequencies of the Y chromosome haplogroups in Japanese and the reference populations (%).

Table 2. Frequencies of selected mtDNA haplogroups in Jomonese and the reference populations (%).