1. Introduction
Rosa rugosa, a deciduous shrub of the genus Rosa, has prickly stems, pinnately compound leaves, and variously colored, often fragrant flowers. The rugged rose is native to northeastern China and adjacent areas of North Korea, Japan and eastern Russia [1] [2]. In China, it is naturally distributed on the coast and islands of southern Liaoning Province, eastern Shandong Province, and the Tumen River estuary in Jilin Province, and is classified as an endangered species [1]. R. rugosa is economically important in many fields because its petals are a source of rose essential oil, also known as “liquid gold”, a very valuable natural raw material used especially in perfumes, cosmetics, aromatherapy, spices, and nutrition [3]. As R. rugosa is hardier than most cultivated roses, its germplasm is an important resource for rose cultivation. The species is also important ecologically, as it can grow on sandy substrates and thus protect delicate coastal areas from erosion. However, in recent years, many of the coastal areas that make up the native habitat of the rugged rose have been subject to increasing industrial development, tourism, aquaculture, and other detrimental human activities, resulting in a decrease in suitable habitat [4]. To better protect the native populations of this ecologically important plant in China, it is necessary to understand the genetic variation among rugged rose populations in depth and to establish a core collection from them. Collard and Mackill developed a new molecular marker technique named CDDP (conserved DNA-derived polymorphism), which is an effective supplement to marker methods such as RAPD, ISSR, TRAP, CoRAP, and SCoT, and is potentially useful in many applications [5].
In 1984, Frankel proposed the concept of the core collection [6] [7]: a subset of accessions, selected from an entire collection of genetic resources, that represents the genetic diversity of the whole collection with minimum duplication, thereby facilitating the preservation, evaluation and utilization of germplasm resources. Frankel and Brown further developed the concept into a new approach for the in-depth evaluation, effective protection and utilization of germplasm resources, and it has gradually become a focus of international research on plant germplasm resources [8]. Core collections have been established for many crops, such as Triticum aestivum [9], Glycine max [10] and Medicago sativa [11]. At present, molecular markers have been applied to the establishment of core collections of Malus sieversii [12] [13], Prunus persica [14], Ginkgo biloba [15] and many other species. The aim of this study is to establish a core collection of rugged rose germplasm using CDDP molecular marker technology.
2. Materials and Methods
2.1. Plant Materials
We collected fresh leaves from 120 different R. rugosa plants in various locations in northeast China (Table 1). All specimens were preserved in silica gel at −20˚C.
2.2. DNA Extraction and CDDP Marker Analysis
We extracted the total genomic DNA of each plant from the leaves (n = 120) using the cetyl trimethyl ammonium bromide (CTAB) method (Murray and Thompson, 1980) with minor modifications. We measured DNA quantity with a spectrophotometer and DNA quality using agarose gel electrophoresis. All DNA samples were stored at −20˚C before use. We tested 21 CDDP primers (Collard et al. 2009; synthesized by Shanghai Biotechnology Engineering Co., Ltd).
![]()
Table 1. The 120 R. rugosa samples used in this study.
![]()
Table 2. The 13 primers used for CDDP marker analysis of rugged rose.
Of these, 13 produced clear, stable and polymorphic bands in preliminary tests. These 13 primers (Table 2) were selected for PCR amplification. Each 20 µL PCR mixture contained 10 µL 2× Taq Master Mix (containing dye), 1 µL DNA template (40 ng/µL), 1 µL primer (10 pmol/µL), and 8 µL ddH2O. A standard PCR program was used: an initial denaturation at 94˚C for 3 min; 35 cycles of 94˚C for 1 min, 50˚C for 1 min and 72˚C for 2 min; and a final extension at 72˚C for 5 min. PCR products were stored at 4˚C. They were separated by electrophoresis on a 2% polyacrylamide gel in a vertical gel apparatus in 1× TAE at a constant 120 V for 1.5 - 2 h, with a DL5000 marker as the size standard. Amplified fragments were stained by the silver-staining method and photographed with a digital camera. All amplifications were repeated at least twice.
2.3. Data Scoring and Statistical Analysis
Based on the electropherogram, we selected only clear bands for statistical analysis. Bands that were clear and reproducible in all replicates were scored as present (1) in a data matrix; bands that were unclear, or that appeared in only one replicate, were scored as absent (0). We calculated the total number of bands, the number of polymorphic bands, and the percentage of the total bands that were polymorphic or specific using Microsoft Excel 2007. We used POPGENE 1.32 to compute the number of effective loci, the percentage of polymorphic loci, Shannon’s information index (I), the observed number of alleles (Na), expected heterozygosity (He), the effective number of alleles (Ne), and Nei’s gene diversity (H). We constructed a UPGMA dendrogram using SAHN and drew the cluster analysis tree with Tree Plot (Xu 1994).
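As an illustration of the scoring step, the following minimal sketch counts total and polymorphic bands in a 0/1 data matrix. The function name `band_summary` and the toy matrix are hypothetical and are not part of the original analysis, which was carried out in Microsoft Excel; a band is treated here as polymorphic when it is present in some samples and absent in others.

```python
from typing import Dict, List


def band_summary(matrix: List[List[int]]) -> Dict[str, float]:
    """Summarize a 0/1 band matrix (rows = samples, columns = band positions)."""
    n_samples = len(matrix)
    n_bands = len(matrix[0])
    polymorphic = 0
    for j in range(n_bands):
        present = sum(row[j] for row in matrix)
        # Polymorphic: present in at least one sample but not in all of them.
        if 0 < present < n_samples:
            polymorphic += 1
    return {
        "total_bands": n_bands,
        "polymorphic_bands": polymorphic,
        "percent_polymorphic": 100.0 * polymorphic / n_bands,
    }


# Toy matrix (hypothetical): 3 samples scored at 4 band positions.
toy = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
summary = band_summary(toy)
print(summary)
```

The same function can be applied per primer by passing only the columns that belong to that primer.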
The calculation methods of the above parameters are as follows:
Effective number of alleles: $N_e = 1/\sum_i P_i^2$, where $P_i$ is the frequency of the $i$-th allele. Nei’s gene diversity: $H = 1 - \sum_i P_{ik}^2$, where $P_{ik}$ is the frequency of the $i$-th allele at the $k$-th locus. Shannon’s information index: $I = -\sum_i P_i \ln P_i$, where $P_i$ is the frequency of the $i$-th allele.
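The three formulas above can be checked numerically for a single locus. This is a minimal sketch that assumes the allele frequencies are already known; the way POPGENE estimates allele frequencies from dominant band data is not reproduced here, and the function names are hypothetical.

```python
import math
from typing import Sequence


def effective_alleles(freqs: Sequence[float]) -> float:
    """Ne = 1 / sum(P_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def nei_diversity(freqs: Sequence[float]) -> float:
    """H = 1 - sum(P_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs: Sequence[float]) -> float:
    """I = -sum(P_i * ln P_i); alleles with zero frequency contribute nothing."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical locus with two equally frequent alleles.
freqs = [0.5, 0.5]
ne = effective_alleles(freqs)
h = nei_diversity(freqs)
i_index = shannon_index(freqs)
print(ne, h, i_index)
```

For two equally frequent alleles, Ne reaches its maximum of 2 and H its maximum of 0.5; both fall as one allele becomes dominant.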
2.4. Construction and Evaluation of Core Collection
The accessions were clustered with NTSYS software and grouped at an appropriate genetic similarity coefficient. The number of accessions taken from each group into the core collection was determined by the sampling proportion, and eight proportions were tested (80%, 70%, 60%, 50%, 40%, 30%, 20% and 10%). Within each group, accessions were compared in pairs on the basis of their genetic similarity coefficients, and the poorer accession of each pair was removed. This procedure was repeated until the remaining representative accessions met the required proportion and constituted the core subset.
The genetic diversity parameters of the original germplasm and of each candidate core collection were compared by t test in SPSS. The significance of the t value indicated whether a candidate core collection differed significantly from the original germplasm, and this was used to judge whether the core collection was reasonable.
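The idea of repeatedly thinning the most similar accessions can be sketched as follows. This is a simplified, greedy illustration and not the NTSYS-based procedure used in the study: it assumes a simple matching coefficient as the similarity measure, ignores the grouping step, and uses a hypothetical rule (drop the member of the most similar pair with fewer bands) to decide which accession is removed.

```python
from itertools import combinations
from typing import Dict, List


def simple_matching(a: List[int], b: List[int]) -> float:
    """Share of band positions at which two 0/1 profiles agree (assumed coefficient)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def reduce_to_core(profiles: Dict[str, List[int]], fraction: float) -> List[str]:
    """Greedily remove one member of the most similar pair until the target size is reached."""
    target = max(1, round(len(profiles) * fraction))
    kept = dict(profiles)
    while len(kept) > target:
        # Most similar pair among the remaining accessions.
        n1, n2 = max(
            combinations(sorted(kept), 2),
            key=lambda pair: simple_matching(kept[pair[0]], kept[pair[1]]),
        )
        # Hypothetical removal rule: drop the accession carrying fewer bands.
        drop = n1 if sum(kept[n1]) < sum(kept[n2]) else n2
        del kept[drop]
    return sorted(kept)


# Toy band profiles (hypothetical accession codes and data).
profiles = {
    "C1": [1, 1, 0, 0],
    "C2": [1, 1, 0, 1],
    "C3": [0, 0, 1, 1],
    "C4": [0, 0, 1, 0],
}
core = reduce_to_core(profiles, 0.5)
print(core)
```

In this toy case one accession is kept from each of the two similar pairs, which is the behavior a stepwise approach aims for: redundancy is removed while both clusters stay represented.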
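The comparison can be illustrated with a two-sample t statistic computed by hand. This sketch assumes a pooled-variance Student's t test on per-locus values; the text does not state which variant SPSS was run with, and the function name and data below are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence


def two_sample_t(x: Sequence[float], y: Sequence[float]) -> float:
    """Pooled-variance two-sample t statistic (assumed test variant)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1.0 / nx + 1.0 / ny))


# Hypothetical per-locus gene diversity values for the two collections.
original_h = [0.30, 0.35, 0.40, 0.45]
core_h = [0.32, 0.36, 0.41, 0.47]
t_value = two_sample_t(original_h, core_h)
print(round(t_value, 3))
```

The absolute value of t is then compared with the critical value for the chosen significance level; a value below the critical value indicates no significant difference between the candidate core collection and the original germplasm.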
3. Results
3.1. Genetic Diversity Analysis of R. rugosa Population
Among the 128 fragments generated from the 13 selected CDDP primers, 121 were polymorphic (94.53%), an average of 9.85 bands per primer, of which 9.31 were polymorphic. This suggests that rugged roses are rich in genetic diversity. At the population level, the genetic diversity indices were highest in the Hunchun population (A) and lowest in the Chengshan population (F) (Table 3). The Hunchun population in Northeast China therefore has the highest genetic diversity of the populations sampled.
3.2. Construction of Wild Rose Core Collection
The UPGMA dendrogram showed that the R. rugosa samples analyzed could be divided into six major groups at a genetic similarity coefficient of 0.81 (Figure 1). The core collection was screened according to these groups.
![]()
Table 3. Genetic diversity of the six populations of R. rugosa.
The results of stepwise cluster analysis showed that the number of polymorphic loci (NPL) decreased gradually as the sampling proportion decreased, while the values of Ne, H and I increased gradually (Table 4). A core collection of R. rugosa should represent more than 95% of the genetic diversity of the original germplasm with fewer resources. It is generally considered best for the core germplasm to comprise 20% - 30% of the original germplasm [16]. On balance, a sampling ratio of 20% was selected for the core germplasm. The codes of the selected samples are: C1, C7, C15, C19, C21, C25, C31, C32, C42, C44, C46, C50, C58, C60, C61, C63, C68, C76, C85, C88, C91, C93, C103, C104, C107, C117.
3.3. Evaluation of Core Collection
The genetic similarity coefficient (GS) among the 26 core samples ranged from 0.67 to 0.97 (Figure 2). Compared with the original germplasm (0.71 - 1.00), the accessions in the core collection are more distantly related to one another. The 26 core accessions of R. rugosa were selected from the initial collection at the 20% sampling ratio; their retention ratios for polymorphic loci, effective number of alleles (Ne), Nei’s genetic diversity (H) and Shannon information index (I) were 97.52%, 104.16%, 108.38% and 106.18%, respectively. A t test on the genetic parameters showed that the core germplasm did not differ significantly from the original germplasm (Table 5).
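A retention ratio is simply the value of a parameter in the core collection expressed as a percentage of its value in the original germplasm. The sketch below uses hypothetical parameter values, not those of Tables 4 and 5, and a hypothetical function name.

```python
def retention_ratio(core_value: float, original_value: float) -> float:
    """Return the core-collection value as a percentage of the original value."""
    if original_value == 0:
        raise ValueError("original value must be non-zero")
    return 100.0 * core_value / original_value


# Hypothetical parameter values, not taken from Table 4 or Table 5.
original = {"NPL": 100, "Ne": 1.50, "H": 0.30, "I": 0.45}
core = {"NPL": 96, "Ne": 1.56, "H": 0.33, "I": 0.48}

ratios = {name: retention_ratio(core[name], original[name]) for name in original}
for name, value in ratios.items():
    print(f"{name}: {value:.2f}%")
```

Ratios above 100% for Ne, H and I are possible because removing highly similar accessions can make band frequencies more even, whereas the number of polymorphic loci can only stay the same or fall.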
4. Discussion
4.1. Method for Constructing Core Collection
The selection of sampling methods has always been the focus of the construction of core germplasm because it determines which germplasm is eligible for core collection. According to the objective of core collection, core germplasm should represent the greatest genetic diversity of the entire genetic resource. In this study, the core collection of wild roses was constructed with the method of stepwise clustering. This method has been used by many researchers [17] [18] [19] .
Core collections have mostly been established on the basis of morphological traits. However, morphological characters are easily affected by the ecological environment in which the plant grows and by various climatic factors, so a core germplasm constructed solely from agronomic or morphological data is not representative of the genetic resources of all species [18] [19]. Molecular markers, in contrast, are not affected by environmental influences on traits [20]. A variety of molecular marker techniques have been applied to core collection construction. The ISSR molecular marker technique was successfully applied to the core germplasm screening of Morinda citrifolia [21]. A core collection retaining 35% of Ginkgo biloba germplasm was constructed using AFLP markers [15]. A core collection of French apple germplasm was constructed on the basis of SSR markers [22]. In this study, we adopted CDDP molecular marker technology, which is rich in polymorphism and information and highly versatile [23]. With the CDDP technique, a core collection comprising 20% of the R. rugosa germplasm was screened out.
![]()
Figure 1. UPGMA dendrogram of genetic similarity among the 120 rugged rose samples.
![]()
Figure 2. UPGMA dendrogram of genetic similarity among the 26 core samples of rugged rose.
![]()
Table 4. Genetic diversity of the core germplasm constructed by the stepwise clustering method.
![]()
Table 5. Comparison of genetic diversity between the core collection and the original germplasm of R. rugosa.
α = 0.05, tα = 1.960
4.2. Reasonable Sampling Proportion
The purpose of a core collection is to represent the diversity of the entire genetic resource to the greatest extent with a minimum number of accessions. The sampling proportion is therefore the key point. Brown considers that a core collection sampling 5% to 10% of the original germplasm resources can represent more than 70% of the genetic variation of the whole germplasm [8]. In core collection construction, the core germplasm generally accounts for 5% ~ 40% of the original germplasm [24] [25]. Li considers that the sampling ratio should be determined from the genetic structure and the number of accessions of the specific species: the proportion of core germplasm is relatively large when the original germplasm is small, and smaller when the original germplasm is large [26]. Zhang screened 30 samples to construct the core collection of 100 cultivars of Osmanthus fragrans [27]; Liu selected 25 samples from 110 pummelo germplasm resources to construct the core germplasm of pummelo [28]. Because the original germplasm in this study contains only 120 accessions, and in order to minimize the loss of genetic variation and ensure that every sampling site is represented in the core collection, we finally obtained a collection of 26 core accessions from the different populations, at a sampling proportion of 20% of the original germplasm. The results show that the genetic diversity and genetic structure of the original population are well preserved. In addition, according to the t test, the genetic diversity of the core collection is not significantly different from that of the original germplasm. Since not all rugged rose resources are included in this study, the core collection needs to be further supplemented with new germplasm resources to ensure the long-term conservation and utilization of the diversity and resources of the species.
4.3. Conservation Strategies for R. rugosa
With the development of industry, the population size of the rugged rose is decreasing. As a result, it is urgent to develop reasonable strategies for the conservation and breeding of its germplasm resources. In most plant germplasm collections, the large number of accessions, together with a duplication rate of up to 60%, has become the major obstacle to the effective use of the collected material [29]. We therefore propose the following approach. First, conservation should focus on areas that are rich in genetic diversity (Hunchun, Jilin Province). Second, the rugged rose core germplasm resources selected in this study should be deposited in the germplasm database for protection and research.
5. Conclusion
In this study, the core germplasm resources of rugged rose in China were constructed by stepwise clustering based on CDDP molecular markers. The core collection provides a theoretical basis for the protection and utilization of wild rose germplasm resources. It also provides a reference for the application of CDDP molecular marker technology to other plants.
Acknowledgements
This work was supported by National Natural Science Foundation of China (NSFC) under Grant No. 30972406 and Shandong agricultural seeds engineering major issue under No. 20106.