Quantitative Trait Loci Underlying Seed Sugars Content in “MD 96-5722” by “Spencer” Recombinant Inbred Line Population of Soybean

Sucrose, raffinose, and stachyose are important soluble sugars in soybean [Glycine max (L.) Merr.] seeds. Seed sucrose is a desirable trait for taste and flavor. Raffinose and stachyose are undesirable in diets of monogastric animals, acting as anti-nutritional factors that cause flatulence and abdominal discomfort. Therefore, reducing raffinose and stachyose biosynthesis is considered a key quality trait goal in the soy food and feed industries. The objective of this study was to identify genomic regions containing quantitative trait loci (QTL) controlling sucrose, raffinose, and stachyose in a set of 92 F5:7 recombinant inbred lines (RILs) derived from a cross between the lines “MD 96-5722” and “Spencer” by using 5376 Single Nucleotide Polymorphism (SNP) markers from the Illumina Infinium SoySNP6K BeadChip array. Fourteen significant QTL were identified and mapped on eight different linkage groups (LGs) and chromosomes (Chr). Three QTL for seed sucrose content were identified on LGs N (Chr3), K (Chr9), and E (Chr15). Seven QTL were identified for raffinose content on LGs D1a (Chr1), N (Chr3), C2 (Chr6), K (Chr9), B2 (Chr14), and J (Chr16). Four QTL for stachyose content were identified on LGs D1a (Chr1), C2 (Chr6), H (Chr12), and B2 (Chr14). Selection for beneficial alleles of these QTL could facilitate breeding strategies to develop soybean lines with higher concentrations of sucrose and lower levels of raffinose and stachyose.


Introduction
Soybean [Glycine max (L.) Merr.] seeds are used in many prepackaged traditional foods such as soy milk, tofu, tofu skin, soy sauce, bean paste, natto, and tempeh. These foods are produced from special soybean lines that are ideally low in oil but high in sucrose and protein. Soybean seeds contain approximately 20% oil, 40% protein, 35% carbohydrates, and 5% ash by dry weight. The di-, tri-, and tetra-saccharides sucrose, raffinose, and stachyose are the most abundant soluble sugars in soybean seeds [1], accounting for 5%, 1.5%, and 3%, respectively, of the total carbohydrate portion (35%) [2]-[4]. Sucrose is easily digestible, raffinose is not easily digested, and stachyose is difficult to digest by monogastric animals, including humans, pigs, chickens, and fish [1]-[5]. Besides its nutritional value, sucrose is transported from source to sink organs in plants during the process of photosynthesis and functions as a storage reserve, compatible solute, and signal metabolite [5]. Raffinose and stachyose serve as transport carbohydrates in the phloem [6], as storage reserves, and as cryoprotectants in frost-hardy plant organs [7] [8], and they accumulate in maturing seeds, where they play a role in the acquisition of desiccation tolerance, seed hardiness, storability, and germination [9]. Raffinose and stachyose have previously been thought to be involved in seed protection during the desiccation process [10] [11] by stabilizing the cell membrane [12]. However, when a monogastric animal diet contains raffinose and stachyose, they act as anti-nutritional factors that cause flatulence and abdominal discomfort [13] [14], diarrhea, and reduced energy metabolism and digestion [15] [16]. Therefore, many attempts have been made to modulate their contents in crop seeds [17]. Seed sugar content is an important trait in food-grade soybeans, and up to 10% sucrose and less than 2.5% total raffinose and stachyose contents are considered advantageous [4].
In a recent study, a germplasm line, "V99-5089", with high sucrose, low raffinose, and low stachyose contents was developed for use as a parent in food-grade soybean breeding programs [4]. To date, there are limited studies on genetic and QTL mapping of soluble sugars in soybean seed. In SoyBase: Soybean Breeders Toolbox (USDA-ARS) (verified in December 2014) [18], thousands of QTL have been reported for many agronomic and quality traits, but only 28 were for seed sucrose. These 28 QTLs were mapped on LGs A1, E, three QTLs on A2, I, F, three QTLs on L, and M [19]; B1, two QTLs on L, D1b, and seven QTLs on L [20]; and B2, D1b, E, H, and J [21]. No QTLs for raffinose or stachyose were found in SoyBase, indicating that more research is needed in this area. Although several QTL associated with soybean seed soluble sugars have been identified [20]-[24], it is still necessary to validate these QTL or discover new ones in different genetic backgrounds by using different populations and genetic mapping approaches. Additional information regarding the genetic basis of sugar-related traits could accelerate the discovery of new genes and the development of cultivars or germplasm with high sucrose and low raffinose and stachyose contents.
The increased use of different types of molecular markers to generate dense genetic linkage maps can facilitate the discovery of genomic regions containing genes involved in the biosynthesis of sugars in soybean and other crops [20]-[23]. Different types of molecular markers have been used to construct linkage maps to identify QTL in soybean, such as restriction fragment length polymorphism (RFLP), simple sequence repeat (SSR), amplified fragment length polymorphism (AFLP), random amplified polymorphic DNA (RAPD), and single nucleotide polymorphism (SNP) markers [25]-[31]. In spite of the identification of genomic regions affecting soybean seed sucrose content [20]-[22], limited information is available on the genomic locations controlling sucrose, raffinose, or stachyose accumulation [32] and their inheritance [21].
Genetic analysis of seed sucrose, raffinose, and stachyose contents in several Arabidopsis accessions in a recombinant inbred line population revealed one major QTL responsible for the monogenic segregation of stachyose content, and this locus also affected sucrose and raffinose content [33]. The authors also found two candidate genes, encoding galactinol synthase and raffinose synthase, located within the genomic region near this major QTL. In addition, three smaller-effect QTLs were found, and each of them affected the content of the other sugars [33]. In soybean, the genomic regions associated with sucrose, raffinose, and stachyose were identified in segregating F2:F10 RILs [20]. That study found two independent QTLs related to oligosaccharides, near marker satt546 on linkage group (LG) D1b and near satt278 on LG L. It also found four other QTLs for sucrose, located on LGs B1 (satt197), D1b (satt546), and L (satt523 and satt278). Another two QTLs, on LGs D1b and L, were associated with sucrose, raffinose, and stachyose. It was suggested that a single locus could have pleiotropic effects on the biosynthesis pathways of sucrose, raffinose, and stachyose [32]. A similar phenomenon was also reported in soybean, where a mutation at a single locus conferred a low-stachyose phenotype [34]. The genetic variability of seed sugars is significant [35] [36], indicating allelic differences in the genes controlling the biosynthetic enzymes. Other researchers were able to identify a low-stachyose line [37] for increasing seed sucrose content. The allele responsible for low stachyose was found to be stc1a, a recessive allele that may have been caused by loss of function in a synthase enzyme involved in the raffinose family oligosaccharide (RFO) biosynthetic pathway. Recently, other researchers [32], using QTL analysis, were able to detect a locus of a major gene that contributes to low stachyose on LG C2.
The candidate gene containing this mutation (Rsm1) in soybean was suggested to be a galactosyltransferase gene that shares close homology with raffinose and stachyose synthase genes from other plant species. It is suggested that although the putative enzyme encoded by Rsm1 is involved in stachyose biosynthesis, further research on gene expression and enzyme activity is needed to clarify the role of Rsm1 in seed sucrose and raffinose accumulation. In a very recent study published in Crop Science, QTLs associated with seed sucrose content were identified using SSR and SNP markers in a cross between a low-sucrose line, MFS-553, and a high-sucrose plant introduction, PI 243545 [38]. The authors developed an F2-derived QTL mapping population and genotyped 220 F2:3 lines with polymorphic SSR markers. They also genotyped a total of 94 F3:4 lines derived from the F2:3 population with 5361 SNP markers spanning 20 chromosomes, of which 2016 were polymorphic. After screening F2:3, F3:5, and F3:6 lines for sucrose, they found three novel QTLs for seed sucrose on chromosomes 5, 9, and 16, which accounted for 46%, 10%, and 8%, respectively, of the phenotypic variation for sucrose content. Using SSR and SNP markers [38], it was possible to identify a sucrose QTL on chromosome 5, positioned between Sat_344 and Sat_407, and the authors concluded that the QTL identified by both marker types could be the same QTL for sucrose. They also found a sucrose QTL on chromosome 16, identified by SSR markers in the F2:3 population and positioned between Sat366 and Satt431. Using SNP markers in an advanced F3-derived population, they showed that SNP marker ss249186914 was linked to the sucrose QTL on chromosome 16. The three QTLs explained a total of 64% of the variation in seed sucrose content; because this total was less than the heritability estimate (0.74), the authors concluded that other minor QTL were involved.
It was concluded that the SSR analysis identified only a few markers associated with sucrose, whereas the use of dense SNP markers significantly improved the detection of marker associations with sucrose [38]. The QTLs and associated markers could be used for marker-assisted selection (MAS) for soybean seed sucrose content.
Since limited research has been done on QTL mapping for seed sugars, the objective of this study was to identify more QTLs related to seed sucrose, raffinose, and stachyose based on natural variation of seed sugars. We hypothesized that the current population had adequate natural variation to detect further QTLs. In this study, the "MD 96-5722" by "Spencer" genetic linkage map of soybean [39] was used to identify the genomic regions containing QTL that control sugar content in soybean seeds using 5376 Single Nucleotide Polymorphism (SNP) markers from the Illumina Infinium SoySNP6K BeadChip array [39] [40].

Plant Material and Seed Analysis for Sugars
Ninety-two F5:7 recombinant inbred lines (RILs) developed by crossing the line MD 96-5722 with Spencer were used to generate phenotypic and genotypic data. Detailed information on the development of the RIL population can be found in [39]. The population was grown in pots in a greenhouse at Fayetteville State University, Fayetteville, NC in 2012 with 25 cm plant-to-plant spacing. The pots were filled with Growmix™, a ready-made, peat-based growing mix containing Canadian sphagnum peat moss, perlite, vermiculite, limestone, and a wetting agent (The Canadian Sphagnum Peat Moss Association; http://www.peatmoss.com/) [41]. The plants were kept in the greenhouse at 25 °C ± 10 °C under natural daylight from May 1st to October 4th. No additional fertilizers or insecticides were used. At maturity, seeds of the parents and RILs were harvested and analyzed for the content (%) of the three major sugars: sucrose, raffinose, and stachyose. These sugars were quantified using near infrared reflectance (NIR) with an AD 7200 array feed analyzer (Perten, Springfield, IL) as described previously [36]. Briefly, about 25 g of seeds from each line were ground using a Laboratory Mill 3600 (Perten, Springfield, IL). The updated calibration equation was developed from a sufficient number of samples to give sufficiently accurate estimates of sugar concentrations. Sugar concentrations were determined on a seed dry matter basis [36] [42] [43]. Determination of soybean seed sugars (sucrose, raffinose, and stachyose) using NIR technology is well established, well accepted, and accurate, and has been used in breeding selection for seed composition, including seed protein, oil, fatty acids, amino acids, and sugars [36] [42]-[45].

Genetic Map and QTL Identification
The MD 96-5722 by Spencer recombinant inbred lines were genotyped using the SoySNP6K Illumina Infinium BeadChip array, which produced 5,376 SNPs. A dense SNP-based genetic linkage map [39] [40] was constructed using JoinMap 4 [46], and composite interval mapping (CIM) was used to detect QTL from the genotypic and phenotypic data using WinQTLCart 2.5 software (http://statgen.ncsu.edu/qtlcart/WQTLCart.htm) [47]. The significance threshold was set using 1,000 permutations at a 0.05 significance level, with a walking speed of 2 cM. WinQTLCart was run with Model 6 and its default parameters: backward stepwise regression method, a window size of 10 cM, and five control markers [47].
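The idea behind a permutation-based significance threshold can be illustrated with a minimal Python sketch. This is not the WinQTLCart procedure used in the study: it substitutes single-marker regression for CIM Model 6, works on simulated, unlinked markers with genotypes coded 0/1 for the two parental alleles, and the function names (`marker_lod`, `permutation_threshold`, `simulate_population`) are hypothetical. It only shows how shuffling phenotypes and taking the 95th percentile of the genome-wide maximum LOD yields an empirical 0.05 threshold.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD from regressing the phenotype on one marker (genotypes coded 0/1)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)  # null model: one mean
    if rss0 == 0.0:
        return 0.0  # no phenotypic variation to explain
    groups = {0: [], 1: []}
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)
    rss1 = 0.0  # marker model: one mean per genotype class
    for ys in groups.values():
        if ys:
            class_mean = sum(ys) / len(ys)
            rss1 += sum((y - class_mean) ** 2 for y in ys)
    if rss1 == 0.0:
        return math.inf
    return (n / 2.0) * math.log10(rss0 / rss1)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical LOD threshold: (1 - alpha) quantile of the maximum LOD
    across all markers over n_perm random shuffles of the phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        max_lods.append(max(marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return max_lods[index]


def simulate_population(n_lines=92, n_markers=20, qtl_index=10, seed=1):
    """Simulated RIL data (assumption): independent markers, one true QTL."""
    rng = random.Random(seed)
    markers = [[rng.randint(0, 1) for _ in range(n_lines)] for _ in range(n_markers)]
    phenotypes = [5.0 + markers[qtl_index][i] + rng.gauss(0.0, 0.5)
                  for i in range(n_lines)]
    return markers, phenotypes


if __name__ == "__main__":
    markers, phenotypes = simulate_population()
    threshold = permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05)
    significant = [i for i, m in enumerate(markers)
                   if marker_lod(m, phenotypes) > threshold]
    print("LOD threshold:", round(threshold, 2), "significant markers:", significant)
```

Because the threshold is taken from the maximum LOD over the whole marker set, it controls the genome-wide rather than the per-marker error rate, which is the reason permutation thresholds are preferred over a fixed LOD cutoff.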

Results
Analysis of variance, mean values, and standard deviations of the parents and RILs (lines) for sucrose, raffinose, and stachyose content (%) are presented in Table 1. Variation among RILs was narrower for stachyose (CV = 6.73%) than for raffinose (CV = 11.11%) and sucrose (CV = 12.37%). The frequency distributions of seed sucrose, raffinose, and stachyose concentrations of the RILs differed from each other (Figure 1). However, the skewness and kurtosis values for these traits were <1.00 (Table 1). Positive but non-significant correlations were observed between sucrose and raffinose (r = 0.32), sucrose and stachyose (r = 0.13), and raffinose and stachyose (r = 0.03) (Table 1). Additive effects (negative or positive) indicated positive or negative interactions between these sugars and their biochemical synthesis pathways.
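The summary statistics reported above (CV, skewness, kurtosis, and pairwise correlation) can be computed as in the following minimal Python sketch. The source does not state which estimators were used, so the choices here are assumptions: the CV uses the sample standard deviation, skewness and excess kurtosis are moment-based, and the correlation is Pearson's r. The function names and the input values are hypothetical, not data from this study.

```python
import math


def mean(values):
    return sum(values) / len(values)


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    m = mean(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return 100.0 * sd / m


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal curve)."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up sugar contents (%) for six lines, for illustration only.
    sucrose = [4.8, 5.6, 6.1, 5.2, 6.9, 5.5]
    raffinose = [0.9, 1.0, 0.8, 1.1, 1.0, 0.9]
    print("CV sucrose (%):", round(coefficient_of_variation(sucrose), 2))
    print("skewness, excess kurtosis:", skewness_and_excess_kurtosis(sucrose))
    print("r(sucrose, raffinose):", round(pearson_r(sucrose, raffinose), 2))
```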

Discussion
In contrast to the results here, bimodal distributions for sucrose, raffinose, and stachyose contents with single-gene segregation ratios were previously found in the V99-5089-derived population [4]. Normal distributions of soybean sucrose content were reported in different breeding populations [19]-[21]. A normal distribution was found for stachyose content in this study, but not for sucrose and raffinose. The positive non-significant correlation between sucrose and raffinose found in this study is supported by previous reports [20] [21] of a positive correlation between high oligosaccharides (raffinose and stachyose) and high sucrose, which indicates that developing soybean lines with high sucrose, low raffinose, and low stachyose will not be easily achieved. Unlike the correlations observed in this study, others [4] [32] reported strong negative correlations between sucrose and raffinose (r = −0.88) and between sucrose and stachyose (r = −0.96) concentrations. On that basis, selection of soybean lines with high sucrose, low raffinose, and low stachyose contents in seed should be possible.
The seven QTLs associated with seed raffinose are new. One region underlying a seed stachyose QTL detected in this study, on LG C2 (Chr6), was on the same LG where two stachyose QTL were detected by others [32] [50], but the positions of the QTL were different. The other three regions underlying seed stachyose QTL identified in this study are additions to the previously detected QTLs identified by others [32] [50] on LGs A1 (Chr5), M (Chr7), B1 (Chr11), and I (Chr20). Only two QTL underlying seed sucrose and stachyose contents in this study were previously mapped. The percentage contribution of most QTLs (12 QTLs), on different LGs and chromosomes, to the phenotypic variation ranged from low (8%) to moderate (26%). The exceptions were two sucrose QTLs (qSUC001 on LG N/Chr3 and qSUC003 on LG E/Chr15), for which the percentage contribution to the phenotypic variation was high (more than 70%). The differences between the QTLs reported for the same sugars here and elsewhere could arise because many of the previously reported QTLs were discovered through simple linear regression methods (SIM) and not by composite interval mapping (CIM), or because different types of marker systems or different populations were used. Therefore, the QTLs detected in the present study and their associated molecular markers provide additional knowledge that can help breeders adopt breeding strategies to develop cultivars or germplasm with desirable soybean seed quality traits such as higher sucrose and lower raffinose and stachyose.
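For readers less familiar with how a QTL's additive effect and percentage contribution to phenotypic variation are defined, the simplest case can be sketched in Python. This is an illustration under stated assumptions, not the CIM estimates produced by WinQTLCart: it uses a single marker in a RIL population with no heterozygotes, genotypes coded 0 and 1 for the two parental alleles, an arbitrary sign convention (a positive effect means allele 1 increases the trait), a hypothetical function name, and made-up numbers.

```python
def additive_effect_and_r2(genotypes, phenotypes):
    """Return (additive effect, R^2) for one marker with genotypes coded 0/1.

    Additive effect: half the difference between the two homozygous class means.
    R^2: share of phenotypic variation explained by the marker classes.
    """
    class0 = [y for g, y in zip(genotypes, phenotypes) if g == 0]
    class1 = [y for g, y in zip(genotypes, phenotypes) if g == 1]
    if not class0 or not class1:
        raise ValueError("both genotype classes must be present")
    mean0 = sum(class0) / len(class0)
    mean1 = sum(class1) / len(class1)
    grand_mean = sum(phenotypes) / len(phenotypes)
    total_ss = sum((y - grand_mean) ** 2 for y in phenotypes)
    residual_ss = (sum((y - mean0) ** 2 for y in class0)
                   + sum((y - mean1) ** 2 for y in class1))
    additive = (mean1 - mean0) / 2.0
    r2 = 1.0 - residual_ss / total_ss if total_ss > 0 else 0.0
    return additive, r2


if __name__ == "__main__":
    # Made-up sucrose contents (%) for eight lines, for illustration only.
    genotypes = [0, 0, 0, 0, 1, 1, 1, 1]
    sucrose = [4.9, 5.3, 5.0, 5.4, 6.1, 5.8, 6.4, 6.0]
    additive, r2 = additive_effect_and_r2(genotypes, sucrose)
    print("additive effect:", round(additive, 3), "R^2 (%):", round(100 * r2, 1))
```

In this single-marker setting, R^2 multiplied by 100 is the percentage of phenotypic variation explained; values from CIM are conditioned on the control markers and so can differ from this simple estimate.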

Conclusion
Our research shows that, out of the 20 LG/Chr of soybean, we identified 14 QTL on 8 different LG/Chr that underlie the three main sugars in the MD 96-5722 by Spencer RIL population of soybean. Since higher seed sucrose is a desirable trait and higher seed raffinose and stachyose are undesirable for seed quality in soybean, selection for beneficial alleles of these QTL could facilitate breeding strategies to develop soybean lines with higher concentrations of sucrose and lower levels of raffinose and stachyose. It is important to note that the lack of a high correlation among the sugars suggests that it could be possible to genetically manipulate the content of these sugars independently. Expansion of our existing linkage map would be needed for a more comprehensive examination of the genetic basis of these traits. Since seed sugar content is influenced by growing conditions, greenhouse and field experiments will be conducted to further study the stability of these traits under drought and high heat. We believe that the current research contributes valuable knowledge and additional QTLs associated with seed composition, especially seed sugars.