Determination of the Genetic Structure of the Oleaginous Lagenaria siceraria of the Nangui Abrogoua University Germplasm Collection

Thirty accessions of Lagenaria siceraria from the Nangui Abrogoua University germplasm collection were analyzed using three microsatellite markers. The average Polymorphism Information Content (PIC) value was 0.61. The average observed heterozygosity (Ho = 0.631) did not differ significantly from the average expected heterozygosity (He = 0.645) in the selected accessions, which suggested random mating in the set of accessions. The within-accession inbreeding estimate (FIS) was 40% and was not significantly different from zero. The reduction of heterozygotes was likely the result of the presence of null alleles. Analysis of Molecular Variance (AMOVA) within and among the 30 accessions of L. siceraria revealed that 39% of the total variation resides among accessions and 61% within accessions. The accession structuring pattern derived from Bayesian clustering analysis revealed two clusters. Based on the genetic structure of the accessions analyzed, a sampling strategy to collect and conserve genetic resources of L. siceraria was suggested.


Introduction
A germplasm collection is a means of preserving the genetic diversity of a cultivated species before that diversity is lost as a result of implementing high-input crop monoculture systems. Such collections serve as a genetic bank from which valuable genes can be selected. An important source of germplasm is the gene pool of landraces that farmers' fields constitute in terms of specific ecological adaptations, usefulness in breeding programs and/or crop improvement. However, to understand the dynamics of diversity in agroecosystems, genetic variability must be investigated [1]. Therefore, for maintenance of the diversity and identification of valuable genes, evaluation of collections is essential. Such studies should not be neglected, particularly for minor or orphan crops such as indigenous edible-seeded cucurbits.
Cucurbits are present in both the New and Old World and are among the most important plant families that supply humans with edible products and useful fibers. Cucurbits are divided into five sub-families: Fevilleae, Melothrieae, Cucurbitaceae, Sicyoideae, and Cyclanthereae.
Lagenaria siceraria is a member of the cucurbit family (Cucurbitaceae), which includes several other economically important species such as cucumber and melon, which belong to the genus Cucumis, squash and pumpkin, which belong to the genus Cucurbita, and watermelon, which belongs to the genus Citrullus [2].
Cultivated L. siceraria (Molina) Standley is commonly known as the white-flowered bottle gourd, but is called "Bebu" in Côte d'Ivoire and "Egusi" in Benin, Nigeria, and Ghana. This species is a diploid (2n = 22) belonging to the genus Lagenaria. Worldwide, L. siceraria is grown for its fruit, which is either harvested young and used as a vegetable or harvested mature and used as a bottle, utensil, or pipe. In West Africa, oleaginous cucurbits are cultivated for their seeds and are reported to make an important social and cultural contribution [3]. Another recent use of L. siceraria is as a rootstock for watermelon against soil-borne diseases and low soil temperature [4] [5]. This plant was one of the first crops to be domesticated. Based on archeological evidence, L. siceraria is presumed to have been domesticated in Africa [6] and might have dispersed to the New World by ocean currents or by human migration in prehistoric times [7]. Africa is believed to be the centre of genetic diversity for bottle gourd, although wild progenitors of bottle gourd have not been identified there [8].
The oleaginous and nutritious seeds of Lagenaria siceraria are important in the social and cultural lives of several peoples [3]. For example, dried, slightly toasted and ground seeds of the indigenous L. siceraria are used as a soup thickener. Achu et al. [9] showed that egusi (L. siceraria) had a high nutritional value (protein, 34.19% ± 0.85%; fat, 50.08% ± 1.23%) and provided good-quality oil and good groundcover. In addition, the plant is commonly found in many traditional cropping systems and is well adapted to extremely divergent agro-ecosystems and to various cropping systems characterized by minimal inputs [10] [11]. L. siceraria thus represents an excellent plant model for which the implementation of improved cropping systems could ensure the economic prosperity of rural women in tropical Africa. To our knowledge, no detailed study has been devoted to its genetic diversity and reproductive biology. However, investigations reported for other species suggest that the cucurbit family is predominantly outcrossing [12]. Such expectations are based on the fact that indigenous edible-seeded cucurbits are generally monoecious and entomophilous [13]. Neither the occurrence of self-incompatibility nor the reproductive mechanisms of this plant have been clearly demonstrated.
The regeneration of cucurbits in gene banks is mainly done through botanical seeds. Unfortunately, the seeds quickly lose their capacity to germinate and cannot be preserved for more than one year [14]. Thus, regular plant regeneration is required to avoid genetic resource depletion. Also, maintaining the true-type plant is a major problem, because it is both hard and time-consuming due to the numerous practical precautions required by the mating system. For Lagenaria siceraria, the task is also complicated by the creeping behavior of the target species, which makes appropriate harvesting tedious. Therefore, there is a need to identify a reduced number of L. siceraria accessions that can be managed efficiently.
Based on fruit shape, two distinct cultivars have been described [11]. The first one, which is round-fruited, is characterized by the presence of a cap on the distal side of the seeds. The second cultivar has elongated fruits whose seeds have no cap. In spite of the nutritional and agronomic potential of L. siceraria, in-depth basic investigations on the crop are scant [15]. For example, to our knowledge, the only genetic characterization of L. siceraria accessions from the Nangui Abrogoua University collection was done using isozymes [15]. That study, which used few accessions, did not detect any genetic structure in the material analyzed. The weakness of isozyme markers is that the proteins being scored may not all be expressed in the same tissue or at the same time in development. Therefore, several samplings of the genetic population need to be made. To refine these studies and understand the mechanisms responsible for genetic variance at both inter- and intra-accession levels, we used more accessions and microsatellite markers, which are widely employed in the analysis of genetic diversity in L. siceraria [16] and have proven to be polymorphic.
The objectives of this study were 1) to estimate the amount of genetic diversity within and among accessions and 2) to determine the genetic structure among accessions and cultivars (or morphotypes) of the oleaginous L. siceraria.

Plant Material
Plant materials were selected from a collection of Lagenaria siceraria maintained at Nangui Abrogoua University (Abidjan, Côte d'Ivoire). The seed samples of thirty accessions were collected mainly in five geographical zones (South, East, Northeast, North, and Centre) of Côte d'Ivoire. The selected accessions were representative of two cultivars. The first one (C), with round fruits, is characterized by the presence of a cap on the distal side of the seeds; the second cultivar (SC), with elongated fruits, is characterized by seeds without a cap (Figure 1).
The cultivar "C" contained 22 accessions and the cultivar "SC" 8 accessions, according to seed availability in each cultivar. A complete description of these accessions is given in Table 1. Ten seeds per accession, 300 seeds in total, were analyzed.

DNA Extraction
The young leaves of each seedling were collected and stored at −80˚C until use. These samples were used for DNA isolation and PCR analysis. DNA isolation was carried out according to the procedure described by Levi and Thomas [17] with a few modifications. Fresh leaf (0.1 g) tissue was finely ground in 1.

PCR Conditions
The microsatellite markers had been set up for Lagenaria siceraria previously [16]. The PCR conditions were as follows: genomic DNA samples (15 ng) were amplified in a 15 µl reaction volume containing 1× ThermoPol Reaction buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 mM KCl, 0.1% Triton X-100, pH 8.8 at 25˚C), 0.2 mM each of the four dNTPs, 2 mM MgCl2, 0.5 mM of each forward and reverse primer, and 0.5 U of Taq polymerase (New England BioLabs Inc.). The amplifications were performed in a thermocycler (Biometra) programmed as follows: an initial cycle at 94˚C for 3 min, followed by 40 cycles at 94˚C for 30 s, 52˚C-55˚C for 30 s and 72˚C for 1 min. Cycling was followed by a final extension at 72˚C for 8 min, and a soak at 4˚C.

Electrophoresis
PCR products were separated in denaturing 6% polyacrylamide gels prepared using an acrylamide/bisacrylamide ratio of 19:1, 0.5× TBE (Tris-borate-EDTA) buffer, 0.1% ammonium persulfate (APS), and 8.33% tetramethylethylenediamine (TEMED). Polyacrylamide gels were cast in a vertical gel casting plate. Immediately after addition of APS, 70 ml of the gel solution was poured directly into the gel casting plate. The plate with the gel solution was then kept at room temperature for approximately 1.5 hours to allow polymerization. The amplified DNAs were mixed with 20 µl of formamide dye (98% formamide, 10 mM EDTA pH 8.0, 1% xylene cyanol and 1% bromophenol blue) before denaturation by heating for 3 min at 90˚C. Three microliters of each denatured DNA mixture were loaded onto a pre-warmed polyacrylamide gel. Electrophoresis was performed at 55 W for 2 hours. The separated DNA bands were revealed using a slightly modified version of the silver staining method described by Creste et al. [18].

Genetic Data Analysis
Fourteen primer pairs used to evaluate 44 entries of Chinese bottle gourd were tested to estimate the genetic diversity among Lagenaria siceraria accessions in the present study.
The genetic diversity was evaluated based on genotype and allele frequencies, using the 0.95 criterion for polymorphism. Under this criterion, a polymorphic locus should have at least two alleles, each with a frequency of at least 0.05. When one allele has a frequency above 0.95 and all the other alleles have frequencies below 0.05, the locus cannot be considered polymorphic. To evaluate the informativeness of each marker, the polymorphic information content (PIC) of each SSR locus was calculated from the allele frequencies [19]. The number of alleles per locus, estimates of observed and expected heterozygosity, and Shannon's Information Index were calculated for each population and each locus using GenAlEx v. 6.1 [20]. Observed and expected heterozygosities were compared with the Mann-Whitney U test using the software STATISTICA version 7.1 [21]. The fixation indices were estimated at each polymorphic locus and tested for significant deviation using an exact test performed by the software Genepop [22]. Within each accession, null allele frequencies were estimated using the maximum likelihood estimator based on the EM algorithm of Dempster et al. [23], as implemented in Genepop 4.0 [22].
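The per-locus indices described above can be illustrated with a minimal Python sketch. It is not the GenAlEx or Genepop implementation: the PIC formula shown is the commonly used one of Botstein et al., which is assumed here because the formula of reference [19] is not restated in the text; the expected heterozygosity carries no small-sample correction; and the fixation index is the simple ratio 1 − Ho/He rather than the Weir and Cockerham estimator. The function names and the genotype encoding (pairs of allele sizes) are hypothetical.

```python
import math
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of diploid genotypes (a, b)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def is_polymorphic(freqs, threshold=0.95):
    """0.95 criterion: the most common allele must not exceed the threshold."""
    return max(freqs.values()) <= threshold


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), without small-sample correction (assumption)."""
    return 1.0 - sum(p * p for p in freqs.values())


def observed_heterozygosity(genotypes):
    """Ho = proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2 (Botstein-type formula, assumed)."""
    p = list(freqs.values())
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1.0 - sum(x * x for x in p) - cross


def shannon_index(freqs):
    """I = -sum(p_i ln p_i); equals 0 for a monomorphic locus."""
    return -sum(p * math.log(p) for p in freqs.values() if p > 0)


def fixation_index(ho, he):
    """F = 1 - Ho/He; returns None when He is 0 (monomorphic locus)."""
    return None if he == 0 else 1.0 - ho / he


if __name__ == "__main__":
    # Hypothetical genotypes at one SSR locus (allele sizes in bp).
    genotypes = [(150, 150), (150, 154), (154, 158), (150, 158), (154, 154)]
    freqs = allele_frequencies(genotypes)
    ho = observed_heterozygosity(genotypes)
    he = expected_heterozygosity(freqs)
    print("polymorphic:", is_polymorphic(freqs))
    print("Ho:", ho, "He:", he, "F:", fixation_index(ho, he))
    print("PIC:", pic(freqs), "I:", shannon_index(freqs))
```

With these definitions, an accession fixed for a single allele yields He = 0 and I = 0, so its fixation index is undefined rather than zero.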
Analysis of molecular variance (AMOVA) was calculated for the sampled accessions to estimate the partitioning of genetic variation at different levels and then to investigate the hierarchical level to which genetic variation can be attributed. Significance of AMOVA was tested using a nonparametric permutation approach with 999 permutations [24].
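The permutation approach can be sketched as follows for a single hierarchical level (individuals within accessions). This is a simplified illustration, not the GenAlEx routine: it assumes a precomputed matrix of squared genetic distances between individuals, and the function names, the toy data, and the one-sided comparison are assumptions. The among-group variance component is obtained from the usual sums of squares, and group labels are shuffled to build the null distribution.

```python
import random


def amova_phi(dist2, labels):
    """One-level AMOVA: share of variance among groups (Phi statistic).

    dist2  -- square matrix of squared distances between individuals
    labels -- group (accession) label of each individual
    """
    n = len(labels)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    g = len(groups)

    # Total and within-group sums of squares from pairwise squared distances.
    ss_total = sum(dist2[i][j] for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for members in groups.values():
        pair_sum = sum(
            dist2[a][b] for k, a in enumerate(members) for b in members[k + 1:]
        )
        ss_within += pair_sum / len(members)
    ss_among = ss_total - ss_within

    ms_among = ss_among / (g - 1)
    ms_within = ss_within / (n - g)
    # Average group size corrected for unequal sampling.
    n0 = (n - sum(len(m) ** 2 for m in groups.values()) / n) / (g - 1)

    var_within = ms_within
    var_among = (ms_among - ms_within) / n0
    return var_among / (var_among + var_within)


def permutation_p_value(dist2, labels, n_perm=999, seed=0):
    """Shuffle group labels n_perm times; return (observed Phi, p-value)."""
    rng = random.Random(seed)
    observed = amova_phi(dist2, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if amova_phi(dist2, shuffled) >= observed:
            hits += 1
    # The observed value is counted as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: two hypothetical accessions of four individuals each.
    coords = [0, 1, 0, 1, 5, 6, 5, 6]
    labels = ["acc1"] * 4 + ["acc2"] * 4
    dist2 = [[(a - b) ** 2 for b in coords] for a in coords]
    print(permutation_p_value(dist2, labels))
```

Because the observed value is counted among the permutations, the smallest p-value attainable with 999 permutations is 1/1000 = 0.001.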
A model-based clustering algorithm was used to search for the most likely number of genetic clusters among the sampled accessions. This algorithm, implemented in the STRUCTURE program [25], assigns individuals to clusters and also assesses the heterogeneity of accessions. The STRUCTURE analysis was conducted for K (the assumed number of subpopulations) ranging from 1 to 10, with a burn-in period of 50,000 followed by 100,000 Markov Chain Monte Carlo (MCMC) repetitions, using the admixture model. Each value of K was run five times to check the repeatability of the results.
A UPGMA tree based on Nei's genetic distance matrix was constructed with the PHYLIP package version 3.6 [26]. Cluster analysis was used to describe the relationships among and within the different L. siceraria accessions. First, the dataset was bootstrapped 1000 times with the SEQBOOT program to assess confidence in the dataset. Then, biased genetic distances were computed from gene frequencies with the GENDIST program [27]. The cluster analysis tree was produced with the NEIGHBOR program, which uses a matrix of pairwise distances (based on gene frequency genetic distances) between all pairs of accessions, and with the CONSENSE program.
Confidence in tree topology was assessed by bootstrapping over loci (1000 iterations) and the phylogenetic tree was visualized in TREEVIEW 1.6.6 [28].
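The two computational steps behind such a tree, a genetic distance between accessions and UPGMA clustering, can be illustrated with a short Python sketch. It is not the PHYLIP code: it assumes Nei's standard (1972) distance computed from per-locus allele frequencies, omits bootstrapping, and returns the tree as nested tuples without branch lengths. All names and the toy values are hypothetical.

```python
import math


def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance D = -ln(Jxy / sqrt(Jx * Jy)).

    pop_x, pop_y -- lists with one {allele: frequency} dict per locus.
    """
    jxy = jx = jy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
    if jxy == 0:
        return math.inf  # no shared allele at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


def upgma(names, matrix):
    """UPGMA clustering; returns the topology as nested tuples."""
    clusters = [(name, 1) for name in names]  # (subtree, number of leaves)
    d = [row[:] for row in matrix]
    while len(clusters) > 1:
        n = len(clusters)
        # Closest pair of clusters.
        i, j = min(
            ((a, b) for a in range(n) for b in range(a + 1, n)),
            key=lambda ab: d[ab[0]][ab[1]],
        )
        (tree_i, size_i), (tree_j, size_j) = clusters[i], clusters[j]
        keep = [k for k in range(n) if k not in (i, j)]
        # Size-weighted average distance from the merged cluster to the others.
        new_row = [
            (size_i * d[i][k] + size_j * d[j][k]) / (size_i + size_j) for k in keep
        ]
        d = [[d[a][b] for b in keep] for a in keep]
        for row, value in zip(d, new_row):
            row.append(value)
        d.append(new_row + [0.0])
        clusters = [clusters[k] for k in keep]
        clusters.append(((tree_i, tree_j), size_i + size_j))
    return clusters[0][0]


if __name__ == "__main__":
    # Hypothetical allele frequencies at one locus for three accessions.
    pops = {
        "acc1": [{150: 0.5, 154: 0.5}],
        "acc2": [{150: 0.6, 154: 0.4}],
        "acc3": [{150: 0.1, 158: 0.9}],
    }
    names = list(pops)
    matrix = [[nei_distance(pops[a], pops[b]) for b in names] for a in names]
    print(upgma(names, matrix))
```

In a bootstrap analysis, the same two steps would be repeated on each resampled dataset and the resulting trees summarized in a consensus tree.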

Estimation of the Informativeness of SSR Markers
Among the fourteen primer pairs tested, only three showed polymorphism in this study. The allelic composition of each marker in each genotype was determined to calculate a PIC value. The average PIC value was 0.61, with a maximum of 0.65 observed with LSR030 and a minimum of 0.55 observed with LSR020 (Table 2).

Genetic Diversity and Accession-Level Heterozygosity
The population statistics generated by the three microsatellites are summarized in Table 3. A total of 116 alleles were detected across the three loci. The effective number of alleles per locus (A) varied from 1 (Ls020) to 5.56 (Ls166), with a mean of 2.88 (Table 3). The average observed heterozygosity (Ho) was 0.631, ranging from 0 (Ls020) to 1 (Ls207), and the average expected heterozygosity (He) was 0.645, ranging from 0 (Ls020) to 0.863 (Ls166) (Table 3). The Mann-Whitney U test indicated no significant difference (p > 0.05) between observed and expected heterozygosity. Indeed, of the 89 inbreeding coefficients calculated, only 36 (40.5%) were significantly different from zero (p < 0.05). Eighteen inbreeding coefficients were negative. The average inbreeding coefficient (FIS = 0.040) was significantly different from zero (p < 0.05) for the analyzed accessions. Null allele frequency estimates ranged from 0 (Ls005) to 34.15% (Ls147) and were consistent with the FIS estimates. Overall, the three markers investigated appeared to be affected by at least one null allele. The mean accession diversity according to the Shannon Information Index (I) was 1.113. Accession Ls166 was the most diverse (I = 1.805) and Ls020 the least diverse (I = 0).
Analysis of molecular variance (AMOVA) within and among the 30 accessions of Lagenaria siceraria revealed that 39% of the total variation resides among accessions and 61% within accessions (Table 4). Calculations carried out separately for differentiation among cultivars exhibited trends similar to those of the AMOVA with no prior grouping of accessions. We found more genetic diversity within cultivars (90%) than among cultivars (10%).

Cluster and Assignment Analysis
The Bayesian analysis using the software STRUCTURE indicated the presence of two main clusters in the entire set of accessions. The highest value of ΔK, the rate of change in the log probability of the data between successive potential numbers of clusters, was obtained for K = 2. The estimated log probability of the data was higher under K = 2 (−3576.96) than under K = 1 (−4170.1). High proportions of admixed individuals were observed in six accessions, with assigned membership differing from one individual to another. The results were plotted to evaluate the geographical relationships of the accessions and the cultivars in the different genetic clusters (Figure 2). The first cluster was composed of ten accessions, all of which are characterized by the presence of a cap on the distal side of the seeds. The second cluster was composed of 20 accessions. Twelve of these accessions are characterized by the presence of a cap on the distal side of the seeds, and eight by its absence. Accessions from the southern part of Côte d'Ivoire are exclusively members of the first cluster, and accessions from the northeastern, northern and central parts of Côte d'Ivoire are exclusively members of the second cluster. Only accessions from the eastern part of Côte d'Ivoire are members of both clusters.
The UPGMA phylogenetic tree based on Nei's genetic distance matrix from SSR data is shown in Figure 3. The dendrogram consisted of two major clusters. The branch separating these two clusters was well supported (bootstrap = 1000). The first cluster (I) contained a group of twenty-seven accessions; the second one (II) consisted of three accessions from the eastern part of Côte d'Ivoire (one from Gontougo and the two others from Moronou).
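As an illustration of how ΔK is obtained from replicate STRUCTURE runs, the sketch below uses the widely used summary form of the statistic: the absolute second-order difference of the mean log probability, divided by the standard deviation of the log probability across replicates at that K. Apart from the K = 1 value and the K = 2 mean reported above, the replicate values are invented for illustration and are not results of this study; the function name is hypothetical.

```python
from statistics import mean, stdev


def delta_k(runs):
    """Return {K: deltaK} for every K that has both neighbours K-1 and K+1.

    runs -- {K: [ln P(D) of each replicate run]}, with consecutive K values.
    """
    ks = sorted(runs)
    result = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        second_diff = abs(
            mean(runs[next_k]) - 2 * mean(runs[k]) + mean(runs[prev_k])
        )
        spread = stdev(runs[k])
        result[k] = second_diff / spread if spread > 0 else float("inf")
    return result


if __name__ == "__main__":
    # Illustrative replicate values only (five runs per K, as in the design).
    runs = {
        1: [-4170.1] * 5,
        2: [-3577.0, -3576.5, -3577.4, -3576.9, -3577.0],
        3: [-3560.0, -3575.0, -3550.0, -3580.0, -3565.0],
        4: [-3555.0, -3590.0, -3540.0, -3600.0, -3570.0],
    }
    values = delta_k(runs)
    print(values)
    print("K with highest deltaK:", max(values, key=values.get))
```

Since the statistic needs the neighbouring values K − 1 and K + 1, it is undefined for K = 1, which is why a direct comparison of the log probabilities under K = 1 and K = 2 remains useful alongside ΔK.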

Estimation of the Informativeness of SSR Markers
The PIC value is a measure of the polymorphism level detected by a particular marker and is dependent on the number of alleles detected and their distribution in the population. The number of alleles (1 to 7) and the mean PIC value (0.61) observed in this study suggested that the SSR markers selected were highly informative, with sufficient discriminatory power.
Indeed, based on the PIC values, all the markers were highly informative (PIC > 0.50). Such a result indicated the high utility of the set of markers used for genetic diversity analysis. A similar PIC value was reported in eight C. lanatus (another cucurbit species) accessions collected in Zimbabwe [29] using nine SSR primers. A similar trend was observed in investigations carried out by Ram et al. [30] in five germplasm lines of bottle gourd (Lagenaria siceraria) using six RAPD primers. These results also indicated profound polymorphism in bottle gourd landraces.
The microsatellites used, which had a wide range of heterozygosity, reduced the risk of overestimating genetic variability, which might occur with microsatellites exhibiting only high heterozygosity. Although varying across the loci, the mean values of observed heterozygosity were lower than the mean expected heterozygosity values. However, the absence of significant differences between observed and expected heterozygosities according to the Mann-Whitney U test (p > 0.05) suggested random mating in Lagenaria siceraria.
The number of alleles at different marker loci serves as a measure of genetic variability and has a direct impact on the differentiation of accessions. The allelic variation in this study was lower than that obtained by Gürcan et al. [31] in 60 Turkish bottle gourd accessions. This discrepancy could be attributed to the methods used in sample genotyping. Indeed, Gürcan et al. [31] used capillary electrophoresis for allele determination, whereas we used direct reading on polyacrylamide gels. Capillary electrophoresis is known to be the most efficient approach in sample genotyping [32]. The gene diversity indices obtained in the present study were higher than those (He = 0.073; Ho = 0.053) reported for 30 L. siceraria accessions from the Nangui Abrogoua University germplasm using allozyme markers [15]. This discrepancy could be due to the fact that molecular markers are more polymorphic than isozymes. Significant deviations from Hardy-Weinberg expectations due to heterozygote deficits were observed in 11 accessions, confirming random mating in the plant material studied. The same results had been reported by Koffi et al. [15]. A relatively high prevalence of null alleles was observed in the accessions concerned. Null alleles affect population parameter estimates: the observed heterozygosity would be largely underestimated [33].
Genetic differentiation among accessions was significant but relatively low (39%), whereas 61% of the estimated genetic variation was found within accessions (AMOVA, P = 0.001). The results of this study differed from the findings of Minsart et al. [34] in Citrullus lanatus, with 88% of the variation among accessions and 12% within them. These results show that the structuring of genetic diversity could depend on the mode of sampling. For the present study, we selected ten seeds per accession and a relatively high number of accessions (30), while in the study conducted by Minsart et al. [34], 20 seeds were randomly chosen per accession and only three accessions were selected. Such contrasting sampling schemes may have resulted in the inverted trends in accession genetic structure revealed by the AMOVAs. The accession genetic structure observed in this study was similar to that reported in a previous study [29]. Performing AMOVA within and among seven accessions of watermelon divided into two major groups (cow-melons and sweet watermelons), the authors demonstrated that only 0.8% of the total variation resided between the two groups, 10% between accessions within groups and 89.2% within accessions.
Overall, results on accession genetic structure showed that L. siceraria maintained a high level of variability within cultivars, in accordance with its mating system, coupled with farmers' seed management approaches. Indeed, at the collecting sites a few seeds are usually saved from the previous season's harvest, or obtained from neighboring farmers or local markets, resulting in the gradual depletion of genetic variability. Similar results have been reported in Cucumeropsis mannii, another oleaginous cucurbit cultivated in Côte d'Ivoire [35].

Cluster and Assignment Analysis
The accession structuring pattern derived from Bayesian clustering analysis revealed two clusters.
The data collected show a clustering according to geographical location. This result indicated that, besides forces such as exchange of genetic stock, genetic drift, spontaneous variation, and natural and artificial selection, geographical origin also contributes to genetic diversity.
The dendrogram established from SSR genotyping of the 30 accessions also detected clustering by geographical location, which is not in agreement with Yetişir et al. [36], in which clustering of bottle gourd accessions from Turkey was based on fruit morphology much more than on geographical origin. No significant grouping pattern based on cultivar was observed in group I; on the other hand, in group II, all the accessions are characterized by the presence of a cap on the distal side of the seeds. However, according to the findings of Koffi et al. [15], the UPGMA cluster analysis of morphological differentiation among cultivars of Lagenaria siceraria showed that the two cultivars used were well separated. Consistent with the results of Uluturk et al. [37], morphological and molecular genetic diversity are distinct factors and must be considered separately in germplasm characterization. This is especially important for crops like cucurbits, which have limited molecular genetic diversity.

Conclusion and Orientations for Future Research
Microsatellite markers have proven to be useful tools in this study for estimating the genetic variation within and among Lagenaria siceraria accessions. The relatively high level of genetic diversity within accessions and cultivars was also in accordance with the mating system of L. siceraria, and it suggested that these accessions can be regarded as potential sources of genetic material for in situ conservation. Furthermore, a clear understanding of the genetic diversity, with explicit analyses of the genetic structure of L. siceraria accessions, is important; it can help in understanding the remarkable morphological diversity existing among edible-seeded L. siceraria genetic resources, especially in terms of fruit and seed characters. An assessment of genetic diversity based only on morpho-agronomic traits might be biased, because distinct morphotypes can result from only a few mutations while they share a common genetic base. Therefore, molecular markers have the potential to complement already existing estimations of diversity, and to be used to construct core collections for effective genebank management. The relatively high genetic diversity within accessions supported the sampling scheme proposed.

Figure 1. Seeds from the two cultivars of Lagenaria siceraria oleaginous type. Round fruit (a); Seeds with a cap (b); Elongated fruit (c); Seeds without a cap (d).

Figure 2. Genetic structure of microsatellites across 300 individuals from 30 accessions. Bar plot showing clustering of individuals by Structure with K = 2 [25]. Each color represents one accession, each accession is represented by a vertical bar, and each individual is represented by a single vertical line broken into K colored segments, with lengths proportional to each of the K inferred clusters. E, East; C, presence of a cap on the distal side of seeds; Ce, Center; Ls, Lagenaria siceraria; N, North; NE, Northeast; S, South; SC, absence of a cap.

Figure 3. UPGMA dendrogram showing relationships among 30 accessions of Lagenaria siceraria oleaginous type based on three microsatellite loci. CE, presence of a cap, Eastern part; CN, presence of a cap, Northern part; CNE, presence of a cap, Northeastern part; SCCe, absence of a cap, Central part; SC, absence of a cap, Northern part.

Table 1. Characteristics of 30 Lagenaria siceraria accessions used for genetic diversity analysis.

Table 2. Simple sequence repeat markers selected, their motif, primer sequence and calculated PIC value.

Table 3. Effective number of alleles, observed (Ho) and expected (He) heterozygosities, fixation index (FIS, following Weir and Cockerham [33]), and estimated frequency of null alleles (Fnull) per locus and accession, and Shannon's Information Index of 30 accessions of Lagenaria siceraria oleaginous type from SSR markers analysis.
a p-value of the score test for heterozygote deficiency, with: * 1% < p-value < 5%; ** 0.1% < p-value < 1%; *** p-value < 0.1%; b 95% confidence intervals (CI) for null allele frequencies provided by Genepop. A, effective number of alleles per polymorphic locus; Ho, observed heterozygosity; He, expected heterozygosity under Hardy-Weinberg equilibrium; FIS, fixation index; Fnull, null allele frequencies per locus and accession; I, Shannon's Information Index; SE, standard error of sample mean.

Table 4. Partitioning of genetic variation using AMOVA on SSR data taking into account (a) no prior grouping of accessions; (b) among cultivars. PA, partitioning all accessions; PCu, partitioning per cultivar; df, degrees of freedom; SS, sum of squares; MS, mean square; Est. Var., estimated variance; %D, distribution of total variance. The probability was estimated by computing 999 permutations.