Microsatellite-Based Genetic Structure and Differentiation of Goldfish (Carassius auratus) with Sarcoma

Ten microsatellite loci were used to analyze six populations of goldfish (Carassius auratus) with sarcoma. Genetic diversity was highest in the white oranda with red cap (RC) population and lowest in the white tigerhead (WT) population, and outcrossing was detected in every population. WT was strongly differentiated from all other populations except red tigerhead (RT). The average observed heterozygosity (HO) per population ranged from 0.3571 to 0.7381. Significant genetic differentiation (FCT = 0.1891, P = 0.0186) was found among goldfish varieties when the populations were classified into three groups (RT, WT; RL, BL, YC; RC). Principal coordinate analysis (PCA) and STRUCTURE both revealed significant genetic differences between the RC population and the other five populations.


Introduction
The goldfish is a very important ornamental fish in China, as well as a popular pet around the world, owing to its variety of color patterns and morphological characteristics. The species is thought to have been derived from wild crucian carp (C. auratus) under the combined forces of natural selection and domestication, especially artificial breeding [1]. As a consequence, goldfish have been developed into many varieties or strains, such as the dragon-eye, moor and ryukin strains [2] [3]. During human-mediated domestication, breeders have paid close attention to goldfish variation, hoping to develop goldfish with high ornamental value and to maintain their substantial variation [4]. Based on the dorsal fin, two groups of goldfish can be clearly defined: goldfish with a dorsal fin (such as shubunkin and kurodemekin) and goldfish without a dorsal fin (such as ranchu, chotengan and Chinese ranchu). The goldfish without a dorsal fin are thought to have originated from those with a dorsal fin [5].
The traditional biology of the goldfish has been widely studied [6] [7]. However, its genetics has rarely been investigated [8]-[10]. Because the species is rich in strains with characteristic colors and morphologies, it is important to investigate its population structure and genetic relationships in order to better understand the historical and current forces shaping its morphological characteristics. Furthermore, only 5 microsatellite markers have been developed for C. auratus so far [11], and the relatively low number of loci and level of polymorphism limit their application in the population genetic evaluation of this species. More polymorphic novel microsatellite markers are therefore urgently needed for investigating population structure and evaluating strains of goldfish.
Goldfish with sarcoma have increasingly tended to be more expensive than other goldfish [2] [12]-[14], and more strains are likely to be developed within this group. However, the genetic relationships and structure among the strains of goldfish with sarcoma have not been investigated. A genetic analysis of this group is needed to better guide breeding programs and the development of ornamental strains. In this paper, we carried out a genetic investigation of six representative populations of sarcoma goldfish using ten newly developed microsatellite markers. The objectives of the current study were to understand the genetic structure and differentiation among these populations and to evaluate the utility of the novel microsatellite markers.

Samples Collection
A total of 243 individuals were collected from six populations of goldfish with sarcoma at the Shanghai Nanhui fish farm in China. According to the classification of goldfish variation [15], they included lionhead, tigerhead and oranda. The six representative populations were red tigerhead (RT), white tigerhead (WT), red lionhead (RL), black lionhead (BL), white oranda with red cap (RC) and white oranda with yellow cap (YC). The sample information is listed in Table 1. A small piece of the caudal fin was excised from each specimen and stored in 95% ethanol.

DNA Extraction and SSR Methods
Genomic DNA was extracted from a small piece of the caudal fin using the standard phenol-chloroform method [16]. Goldfish (C. auratus) microsatellites were isolated by magnetic-bead enrichment [17]. A total of 56 primer pairs successfully amplified products, but only 18 (32.1%) were polymorphic in the tested specimens. These 18 microsatellites were deposited in GenBank under Accession No. GU181334-GU181348. None of the 18 loci deviated from Hardy-Weinberg equilibrium (HWE). The results demonstrate that these 18 microsatellite loci might be useful for assessing genetic variation and population structure in goldfish.

Microsatellite Amplification
A total of 10 polymorphic microsatellite loci (Table 2), selected from the 18 loci newly developed by our laboratory, were amplified by polymerase chain reaction (PCR). The 10 μL reaction mixture contained 1 μL genomic DNA (20 ng/μL), 5 μL buffer, 0.2 μM dNTPs, 1.5 μM MgCl2, 0.5 μM Taq DNA polymerase (Tiangen, China), 1 μL primers (0.5 μM each) and 3 μL distilled water. PCR was conducted in an Eppendorf thermocycler (Eppendorf, Germany) under the following conditions: initial denaturation for 5 min at 94˚C; 35 cycles of 30 s at 94˚C, 30 s at the optimal annealing temperature (Table 2) and 30 s at 72˚C; and a final extension for 10 min at 72˚C. The PCR products were electrophoresed in 8% polyacrylamide gel, and fragment sizes (bp) were recorded with Gel-Pro Analyzer (Media Cybernetics, USA) using pBR322 as a ladder marker.
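As a bench aid, the per-reaction volumes above can be scaled to a shared master mix. The sketch below is only illustrative: it uses the four volumes stated in the text (which sum to 10 μL), assumes the template DNA is added to each tube separately, and introduces a 10% pipetting overage that is an assumption rather than a value from this study. The function and variable names are hypothetical.

```python
# Per-reaction volumes (uL) as stated in the protocol; the template DNA is
# assumed to be added to each tube individually, so it is kept out of the mix.
PER_REACTION_UL = {
    "buffer": 5.0,
    "primers": 1.0,
    "distilled water": 3.0,
}
TEMPLATE_UL = 1.0  # genomic DNA, 20 ng/uL


def reaction_volume():
    """Total volume of one reaction (uL), template included."""
    return TEMPLATE_UL + sum(PER_REACTION_UL.values())


def master_mix(n_reactions, overage=0.10):
    """Scale the shared components for n reactions plus pipetting overage.

    The default 10% overage is an assumption, not a value from the paper.
    """
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


if __name__ == "__main__":
    print("single reaction (uL):", reaction_volume())
    print("master mix for 48 reactions (uL):", master_mix(48))
```

Each tube would then receive 9 μL of the mix plus 1 μL of template DNA.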

Data Analysis
Genetic diversity of the six populations was estimated as allelic richness (AR), observed (HO) and expected (HE) heterozygosity, and the inbreeding coefficient (FIS) using FSTAT version 2.9.3.2 [18] [19]. Departure from Hardy-Weinberg equilibrium (HWE) was tested for each population using POPGENE 3.2 [20]. Genetic variability within and among populations was estimated by analysis of molecular variance (AMOVA), and pairwise FST values between populations were compared using ARLEQUIN 3.5 [19] [21] with 1000 permutations.
Two further methods were used to reveal population differentiation in the studied samples. First, a principal coordinate analysis (PCA) was performed using GENALEX version 6.1 [22] to reveal the internal population structure and to visualize population discreteness. Second, Bayesian clustering analysis implemented in STRUCTURE 2.2 [23] was performed to estimate the most likely number of genetic clusters (K) and to assign individuals to those clusters without prior information about their sample origin. The admixture model was employed with a burn-in of 20,000 iterations followed by 1,000,000 Markov chain Monte Carlo (MCMC) iterations. To identify the most probable K, the simulation was run with the number of clusters (K) increasing from two to four, and a plateau in the posterior probability was taken to indicate the most likely K [19] [24]. For each value of K, the inferred clusters were visualized as colored bar plots using the DISTRUCT program [25].
Genetic bottlenecks in the six populations of goldfish with sarcoma were tested with the software Bottleneck 1.202 using two methods. The first used the stepwise mutation model (SMM) and the two-phase mutation model (TPM) [26] [27], because the mutation process of most microsatellite loci is closer to the SMM than to the infinite allele model (IAM). The parameters were as follows: the variance for the TPM was 10%; the proportion of SMM in the TPM was 90%, 95% and 98%, respectively; and the number of replicates was 10,000. Statistical significance was assessed by the Wilcoxon signed-rank test. The second method was the mode-shift indicator: a population that has not experienced a bottleneck shows a normal L-shaped allele frequency distribution, which is close to mutation-drift equilibrium, whereas a bottlenecked population shows a shifted mode [19] [28].
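To make the diversity statistics concrete, the following minimal sketch computes HO, HE and FIS for a single locus in a single population. It is a simplified illustration, not the FSTAT implementation: HE is Nei's unbiased gene diversity and FIS is the plain ratio 1 - HO/HE, whereas FSTAT uses its own estimators and averages over loci. The function name and the allele sizes in the example are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (HO, HE, FIS) for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    HE is Nei's unbiased gene diversity; FIS is the simple ratio 1 - HO/HE.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies over the 2n gene copies sampled.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    sum_p2 = sum((c / total) ** 2 for c in counts.values())
    # Expected heterozygosity with the small-sample correction 2n / (2n - 1).
    he = (total / (total - 1)) * (1.0 - sum_p2)
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, fis


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) for five individuals at one locus.
    sample = [(152, 156), (152, 156), (152, 160), (156, 160), (152, 152)]
    print(locus_stats(sample))
```

A negative FIS from this function means that more heterozygotes were observed than expected under random mating.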

Genetic Variations within Populations
A total of 243 alleles were observed across all samples at the 10 loci used (Table 2). The RC population had the highest allelic richness (AR), expected heterozygosity (HE) and observed heterozygosity (HO), whereas the WT population had the lowest values of all three. Interestingly, the FIS values of all six populations were significantly negative.
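As a rough check on the sign of FIS, the simple relation between the inbreeding coefficient and the two heterozygosities can be applied to the extremes of the reported ranges (HO from 0.3571 to 0.7381, HE from 0.3262 to 0.5950). This is only a back-of-the-envelope sketch: it assumes that the highest HO and HE both belong to RC and the lowest to WT, as stated above, and it uses the plain ratio rather than the estimator implemented in FSTAT, so the values in Table 1 take precedence.

```latex
\begin{align}
F_{IS} &\approx 1 - \frac{H_O}{H_E}, \\
F_{IS}^{\mathrm{RC}} &\approx 1 - \frac{0.7381}{0.5950} \approx -0.24, \\
F_{IS}^{\mathrm{WT}} &\approx 1 - \frac{0.3571}{0.3262} \approx -0.09 .
\end{align}
```

Both approximations are negative, in line with the heterozygote excess reported for the populations.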

Genetic Differentiation and Relationship among Populations
The AMOVA attributed about 98.65% of the genetic variation to differences within individuals and only 16.35% to differences among populations (Table 3), but the differences among the six populations were highly significant (P < 0.001). Furthermore, the highest level of differentiation among groups (FCT = 0.1891, P = 0.0186) was obtained when the six populations were classified into three groups ((1) RT, WT; (2) RL, BL, YC; (3) RC).
The pairwise FST values between populations are presented in Table 4. All pairwise FST values, which ranged from 0.0120 to 0.3045, were significant (P < 0.01). The WT population had higher FST values with every other population except RT, indicating stronger genetic differentiation between WT and the four populations RL, BL, YC and RC. WT was least differentiated from RT.
Relationships among populations were further illustrated by a two-dimensional PCA scatter plot based on the squared Euclidean genetic distance (Figure 1). RT and WT were close to each other and showed no connection with the other four populations, whereas the populations RL, BL, YC and RC showed some connection with one another. There was also obvious genetic differentiation between RC and the other five populations. The closest genetic relationship was found among the YC, RL and BL populations.
The genetic relationships of the six populations were also analyzed with Structure 2.3.2, which clearly identified three clusters (RT, WT; RL, BL, YC; RC) (Figure 2). The WT population was genetically closest to RT, and both were distant from the other four populations. The YC, RL and BL populations were genetically closest to one another. These results are similar to those of the PCA.
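For readers less familiar with hierarchical AMOVA, the fixation indices are ratios of variance components. The sketch below assumes the standard three-level layout (among groups, among populations within groups, within populations); the symbols \sigma_a^2, \sigma_b^2 and \sigma_c^2 are notation introduced here for illustration and do not appear in the original tables.

```latex
\begin{align}
\sigma_T^2 &= \sigma_a^2 + \sigma_b^2 + \sigma_c^2, \\
F_{CT} &= \frac{\sigma_a^2}{\sigma_T^2}, \qquad
F_{SC} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_c^2}, \qquad
F_{ST} = \frac{\sigma_a^2 + \sigma_b^2}{\sigma_T^2}, \\
1 - F_{ST} &= (1 - F_{SC})(1 - F_{CT}).
\end{align}
```

Under this reading, FCT = 0.1891 means that about 18.9% of the total variance lies among the three groups.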
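The idea behind a pairwise FST can be illustrated with a small sketch that compares the heterozygosity of two populations with that of the pooled pair. This is a simplified Nei-style estimate for one locus with equal population weights and no sample-size correction; it is not the estimator used by ARLEQUIN, and the function name and example frequencies are hypothetical.

```python
def pairwise_fst(freqs_a, freqs_b):
    """Nei-style FST = (HT - HS) / HT for two populations at one locus.

    freqs_a, freqs_b: dicts mapping allele -> frequency (each summing to 1).
    Equal population weights are assumed; no sample-size correction is applied.
    """
    alleles = set(freqs_a) | set(freqs_b)
    # Mean within-population expected heterozygosity (HS).
    hs_a = 1.0 - sum(p * p for p in freqs_a.values())
    hs_b = 1.0 - sum(p * p for p in freqs_b.values())
    hs = (hs_a + hs_b) / 2.0
    # Expected heterozygosity of the pooled pair (HT), from mean allele frequencies.
    ht = 1.0 - sum(
        ((freqs_a.get(x, 0.0) + freqs_b.get(x, 0.0)) / 2.0) ** 2 for x in alleles
    )
    return (ht - hs) / ht if ht > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical allele frequencies (allele size in bp -> frequency).
    pop_1 = {152: 0.6, 156: 0.4}
    pop_2 = {152: 0.2, 156: 0.5, 160: 0.3}
    print(round(pairwise_fst(pop_1, pop_2), 4))
```

Values near 0 indicate that the two populations share similar allele frequencies, and values near 1 indicate that they are fixed for different alleles.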

The Test of Genetic Bottleneck
Genetic bottlenecks in the six populations of goldfish with sarcoma were tested using the software Bottleneck 1.202 (Table 5). Under the TPM and SMM models, significant bottleneck signals were detected in the RL, RC and YC populations, and no significant bottleneck signals were detected in the RT, WT and BL populations. The distribution of allele frequencies gave the same result as the TPM and SMM models.
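The allele-frequency (mode-shift) check can be sketched as a simple histogram test: alleles are binned by frequency, and the distribution is called L-shaped when the rarest class contains the most alleles. The code below is an illustrative approximation, not the Bottleneck 1.202 implementation; the use of ten equal-width classes and the tie rule are assumptions, and the function name is hypothetical.

```python
def mode_shift(allele_freqs, n_classes=10):
    """Classify an allele-frequency distribution as 'L-shaped' or 'shifted'.

    allele_freqs: frequencies of all alleles pooled over loci in one population.
    Alleles are binned into equal-width frequency classes. If the lowest class
    (p < 1 / n_classes) holds at least as many alleles as any other class, the
    distribution is called L-shaped; otherwise the mode has shifted.
    Returns (label, counts_per_class).
    """
    counts = [0] * n_classes
    for p in allele_freqs:
        if not 0.0 < p <= 1.0:
            raise ValueError("frequencies must lie in (0, 1]")
        # p == 1.0 is placed in the last class instead of overflowing.
        idx = min(int(p * n_classes), n_classes - 1)
        counts[idx] += 1
    label = "L-shaped" if counts[0] == max(counts) else "shifted"
    return label, counts


if __name__ == "__main__":
    # Hypothetical pooled allele frequencies for two populations.
    stable = [0.02, 0.05, 0.05, 0.08, 0.30, 0.50]
    bottlenecked = [0.05, 0.25, 0.25, 0.25, 0.25, 0.45, 0.50]
    print(mode_shift(stable)[0])
    print(mode_shift(bottlenecked)[0])
```

In the hypothetical example, the first population has most alleles in the rare class, while the second has lost its rare alleles, which is the pattern expected after a recent bottleneck.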

Discussion
Genetic variation is non-randomly distributed among populations, species and higher taxa [19] [29]. This distribution of alleles and genotypes in space or in time is often referred to as the genetic structure of a population [30].
In the present study, observed heterozygosity ranged from 0.3571 to 0.7381 and expected heterozygosity from 0.3262 to 0.5950, demonstrating that the six goldfish populations detected by the 10  [32]. Such populations are expected to have relatively high evolutionary potential and a stronger capacity to adapt to environmental conditions, as well as advantages in growth and disease resistance [33] [34]. The results suggest that the genetic diversity of WT was clearly lower than that of the other five populations, and the WT population might have experienced a founder effect during its later development [35].
The AMOVA attributed about 98.65% of the genetic variation to differences within individuals and only 16.35% to differences among populations. The differences among the six populations were highly significant (P < 0.001), and there was significant differentiation (FCT = 0.1891, P = 0.0186) among goldfish varieties classified into three groups (RT, WT; RL, BL, YC; RC). This may be due to the lack of gene exchange among the populations. The pairwise FST values indicated stronger genetic differentiation between WT and four of the other populations, whereas WT was least differentiated from RT (FST = 0.0299). Outcrossing may also occur within the six populations [36]. These patterns might be due to human activity affecting genetic differentiation within goldfish populations. As an important ornamental fish with numerous varieties, goldfish are inevitably subject to frequent human intervention, and previous studies have indicated that human activity can affect genetic diversity and differentiation [37]-[39].
Relationships among populations were further illustrated by a two-dimensional PCA scatter plot based on the squared Euclidean genetic distance, which showed obvious genetic differentiation between RC and the other five populations. This result can provide a reference for breeding fine varieties of goldfish. Under the taxonomic system, the RC and YC populations belong to the fish with head growth only on the top of the head; the RT and WT populations belong to the fish with head growth extending to the opercula and even covering the eyes; and the RL and BL populations belong to the fish without a dorsal fin, with a short, egg-shaped body and head growth extending to the opercula and even the eyes [15] [40]. The PCA and STRUCTURE analyses indicated an obvious genetic difference between RC and the other five populations. The WT population was genetically closest to the RT population, and both were distant from the other four populations. The YC, RL and BL populations were genetically closest to one another. These results agree with those of the AMOVA. They suggest that there is no obvious correspondence between the genetic differentiation of the populations and their classification into sarcoma-bearing varieties, which differs from the traditional taxonomic system. Further work should therefore examine the genetic basis of goldfish classification, and may eventually identify the genes that control color differences in the goldfish sarcoma.
Owing to changes in the living environment and to artificial selection, the genotypic differentiation of goldfish tends to change as genes are exchanged among populations. This is especially so because goldfish, as ornamental fish, are frequently traded. The WT, RT and BL populations may have actively recovered the numbers they had lost. Breeders also pay close attention to goldfish variation and hope to develop goldfish of high ornamental value by maintaining that variation, which might be the most important reason these populations retain an active gene pool.
In the present study, the average observed heterozygosity was higher than the expected heterozygosity in the six populations, showing that heterozygote excess was general within the populations. This phenomenon often appears in material from relatively small or closed populations. For example, when the filial generation of a breeding population is produced from a limited number of parents, founder and bottleneck effects can lead to linkage disequilibrium [41]. This might be related to the fact that most goldfish varieties today are cultured populations. At present, RC is the most typical and stable variety among the goldfish with different head characteristics [42]. The appearance of the egg-fish goldfish was a leap forward in the centuries-long development of the goldfish. Theoretically, the farther a goldfish variety has diverged from its ancestral species, the more highly evolved it is; in this sense, the lack of a dorsal fin is a positive step in the evolutionary process. Whether the aim is to recover goldfish varieties or to improve them, long and hard exploration and research are still needed.

Figure 1. Principal coordinate analysis (PCA) of allele frequencies at ten microsatellite loci. Note: The letters mark the scatter points, which are based on the squared Euclidean genetic distance matrix.

Figure 2. Population structure differences of the six populations of goldfish based on ten microsatellite loci using Bayesian clustering analysis (K = 3).

Table 1. Summary statistics of genetic parameters in the six populations of goldfish.
FN, full name; N, number of specimens; AR, mean allelic richness; HO, mean observed heterozygosity over all loci; HE, mean expected heterozygosity over all loci; FIS, inbreeding coefficient.

Table 2. Sequences, annealing temperatures of PCR amplification and sizes of detected alleles for the ten microsatellite markers in Carassius auratus.

Table 3. Analysis of molecular variance (AMOVA) for the six populations of goldfish.

Table 4. Pairwise population differentiation FST values between the six populations of goldfish based on 10 microsatellite loci (SSR).

Table 5. P-values of bottleneck tests for the six populations of goldfish using the two-phase mutation model (TPM), the stepwise mutation model (SMM) and the mode-shift indicator.