Genetic Map of Cotton with Molecular Markers

Cotton (Gossypium spp.) is the most important natural fiber in the world, and its seeds are also used as a food source. Breeding cotton for traits of interest, such as fiber production and processing, will help ensure that this renewable natural product remains competitive with synthetic fibers derived from petroleum. Mapping the cotton genome for traits of interest can therefore provide the basis for its subsequent use in breeding programs. This work consists of a literature review, with the aim of bringing together works from different research groups working with the mapping of the cotton genome with molecular markers.


Introduction
Genetic mapping of a species can be conducted based on the frequency of recombination between genes or by means of molecular markers. Molecular marker mapping is possible because of the presence of heterozygous loci in the genome of each individual. These loci act as landmarks that can be positioned along the chromosomes [1].
However, two factors have limited the use of molecular markers for the analysis of quantitatively inherited traits and for assisted selection: the limited number of suitable markers available and the lack of knowledge of how these markers are associated with economically important traits [2]. With the increasing use of genotyping by sequencing, the number of available markers is now virtually unlimited.
This study gathers relevant and recent research on the genetic mapping of cotton with molecular markers, with the aim of making this information easier for cotton breeders to find.
Cotton (Gossypium spp.) is one of the most intensively cultivated species worldwide; it is grown in more than 80 countries and in varying climatic conditions [3]. Cotton is one of the most important economic crops, providing natural textile fiber and edible oil throughout the world [4].
Cotton is the most widely grown fiber-producing crop and the third most widely grown oilseed crop worldwide (Figure 1), with the three largest producers being China, India and the United States (Figure 2) [5].
The genus Gossypium belongs to the family Malvaceae, which contains approximately 90 genera. Species of this genus are shrubs or small trees up to seven feet tall, with entire, alternate leaves bearing stipules. The flowers are usually large, showy, cyclic and perfect. The petals are fused at the base. The androecium has numerous stamens whose filaments are partially fused into a tube surrounding the pistil. The anthers have one theca and spiny pollen. The ovary is superior, pentacarpellary and pentalocular, with approximately two ovules per locule. The seeds are hairy [6] [7].
Gossypium hirsutum and G. barbadense are classic allopolyploids resulting from the merger of two formerly isolated diploid genomes, containing one genome similar to those found in the Old World (A-genome diploids) and a second genome similar to those of the New World (D-genome diploids). In the simplest case, allopolyploids have one complete diploid set of chromosomes derived from each parental species; thus, they contain a doubled complement of genes (homeologs). This history may have promoted morphological, ecological, and physiological adaptation, mediated by natural selection on a greatly enhanced level of variation resulting from an instantaneously doubled complement of genes [8].
Cotton breeding programs have continuously aimed to improve productivity and increase the quality of the lint, consequently increasing the length, strength, fineness and maturity of the fibers. However, there are more specific characteristics of interest, which are defined by soil and climate peculiarities, plant health and the production system of each region. These characteristics include regional or national disease resistance and features of the production system, such as cultivation scale, intensity of machine use and planting system. Since the beginning of small-scale cotton production on family farms, the characteristics of the production system have influenced the cultural practices and products of different business systems, such as soil management, pest and disease control, production of colored fiber and manual harvesting [9].
This work consists of a literature review, with the aim of bringing together works from different research groups working with the mapping of the cotton genome with molecular markers.

Genetic Mapping of Molecular Markers
Most nucleotide-level variation is not visible in the phenotype. However, this variation can be explored using genetic markers. Molecular markers have developed rapidly in the last decade and represent one of the most powerful tools for genome analysis, allowing for the study of heritable traits based on variations in the DNA sequence. Microsatellites, also known as simple sequence repeats (SSRs), and single nucleotide polymorphisms (SNPs) are the main markers applied in genetic analysis [10] and in marker-assisted selection (MAS) [11]. The study of the genetic control of quantitatively inherited traits is of fundamental importance for agriculture because most characteristics of economic interest have a quantitative inheritance, with a significant environmental influence on the expression of the phenotype, which makes selection based solely on the expressed phenotype difficult.
A detailed view of the genetic architecture of quantitative traits can be obtained through a genetic map saturated with molecular markers. The parameters that quantify the genetic architecture of a quantitative trait include the number and positions of genomic loci, their actions and genetic interactions, and their responses to biotic and abiotic factors. Thus, given the availability of single nucleotide and other polymorphic markers, quantitative trait loci (QTLs) can be identified [12].
The mapping of molecular markers has also proven useful for the identification of genetic characteristics that change the phenotype of an organism over time or in response to other variables, mainly influences from the environment [13].
Genetic mapping is an expensive procedure, and only a few species have been mapped. The first plant species to have its genome fully mapped was Arabidopsis thaliana. Therefore, this species is considered to be a model organism for the study of plant genetics, and studies of its genome are often compared with studies of other species.
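The linkage maps reviewed in the following sections report distances in centimorgans (cM), which are derived from observed recombination fractions. As a minimal sketch of that conversion, the Python functions below implement the two standard mapping functions, Haldane (no crossover interference) and Kosambi (with interference). The function names are illustrative, and the reviewed studies do not necessarily state which mapping function they used.

```python
import math


def _check_fraction(r: float) -> None:
    """Recombination fractions are only meaningful in [0, 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")


def haldane_cm(r: float) -> float:
    """Map distance in cM from recombination fraction r (Haldane, no interference)."""
    _check_fraction(r)
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Map distance in cM from recombination fraction r (Kosambi, with interference)."""
    _check_fraction(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Compare the two functions for a few recombination fractions.
for r in (0.01, 0.10, 0.30):
    print(f"r = {r:.2f}: Haldane {haldane_cm(r):.2f} cM, Kosambi {kosambi_cm(r):.2f} cM")
```

For tightly linked markers the two functions nearly agree; they diverge as the recombination fraction approaches 0.5, which is why the choice of mapping function matters mainly for sparse maps.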

Genetic Mapping by Molecular Markers in Cotton
In 1994 the first molecular genetic map of cotton was published [14]. However, even with the full range of molecular markers now available, it has been argued that marker-based genetic maps are still too incomplete to be useful for breeding, that such studies remain only attempts by researchers, and that the development of an efficient breeding program using genetic maps requires additional dialogue among breeders regarding this technology [15].
A backcross generation between the G. hirsutum cultivar Tamcot 2111 and the G. barbadense cultivar Pima S6 was studied, together with a set of secondary genes, with the aim of increasing the value of cotton. The analysis of variance found 22 genes that do not overlap with the QTLs and are distributed across 15 different chromosomes [16].
Functional molecular markers of Gossypium raimondii were selected and characterized. These markers were based on 58,906 non-redundant expressed sequence tags (ESTs) deposited at NCBI. From these ESTs, 2818 EST-SSRs were obtained, and 300 EST-SSR primer pairs were randomly selected to analyze polymorphism and build linkage groups in allotetraploid cotton cultivars [11]. In addition, 119 EST-SSRs were obtained from 98 ESTs of a complementary DNA (cDNA) library of developing fibers of G. barbadense cv. Pima 3-79 [4].
To study chromosomes 12 and 26, a total of 118 markers (28 SSR markers and 90 sequence-tagged site (STS) markers) were developed. The data indicated that 46 (38.9%) of the 118 markers had only one locus, and the remaining 72 (61.1%) were composed of several loci. On chromosome 26, 65 genetic markers (11 SSR and 54 STS) were detected. Twenty-six markers (40%) behaved as single gene copies and 39 markers (60%) had multiple copies in the genome of cotton, which is an allotetraploid species. The authors found a homeolog similarity of 40% on both chromosomes analyzed [17].
SSR loci were used in hybrids involving Gossypium barbadense L. and Gossypium tomentosum. Chromosomal associations were determined for 123 SSR loci, of which 90, 106 and 73 loci were polymorphic in G. tomentosum, G. barbadense, and both sets, respectively [18].
Amplified fragment length polymorphism (AFLP) markers were used in a set of aneuploid cotton populations, and 608 polymorphic AFLP markers distributed across 22 chromosomes were observed. Based on the study of 16 characteristics of interest, the authors found 13 linked genes on 8 different chromosomes; however, they recommend further studies with SSR markers [19].
SSR markers were used for identification and phylogenetic analysis of the major cotton cultivars of Greece. Twenty-nine cultivars of Gossypium hirsutum and an interspecific hybrid (G. hirsutum × G. barbadense) were analyzed using 11 pairs of SSR markers. Overall, 17 loci containing polymorphic markers were identified, and the level of polymorphism ranged from 0 to 0.548, with an average of 0.293 [3].
SSR markers were also used to evaluate the genetic diversity of Gossypium barbadense. In total, 237 SSR markers commonly mapped to cotton genomes were used to analyze 56 genotypes of Sea Island cotton. A total of 218 polymorphic primer pairs (91.98%) amplified 361 loci, with an average of 1.66 loci per primer. The level of polymorphism ranged from 0.035 to 0.862, with an average of 0.320 [20].
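The summary figures quoted for the Sea Island cotton survey [20] can be re-derived from the reported counts, as the short Python sketch below shows. It also includes a function for the polymorphism information content (PIC) of a locus. This rests on an assumption: the reviewed studies report a "level of polymorphism" without naming the statistic, and PIC is only one common choice for SSR data. The allele frequencies passed to `pic` are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus, given its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Counts reported in the text for reference [20].
primers_tested = 237
polymorphic_primers = 218
amplified_loci = 361

pct_polymorphic = round(100.0 * polymorphic_primers / primers_tested, 2)
loci_per_primer = round(amplified_loci / polymorphic_primers, 2)
print(pct_polymorphic, loci_per_primer)

# Hypothetical two-allele locus with equal frequencies (not data from the studies).
print(pic([0.5, 0.5]))
```

A monomorphic locus gives a value of 0, which is consistent with the lower bound of 0 reported for the Greek cultivars [3] if that study used a PIC-type measure.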
Using 346 SNP markers, the first genetic linkage map of G. hirsutum based entirely on SNP markers was constructed; the same study reported the discovery of over 151,000 putative SNPs in non-transcribed sequences and hundreds of putative microsatellites in allotetraploid cotton [21].
A very-high-density whole-genome marker map (WGMM) for cotton was constructed by using 18,597 DNA markers corresponding to 48,958 loci that were aligned to a consensus genetic map and a reference genome sequence. The WGMM has a density of one locus per 15.6 kb. Hotspots for quantitative trait loci and resistance gene analog clusters were aligned to the map, and DNA markers were identified for targeting these regions, which are of high practical importance for breeders and for genome evolution studies [22].

Pest and Disease Resistance
With regard to marker-assisted selection in cotton, few studies have been conducted. One successful experiment with assisted selection was a project which used 200 markers and obtained two contrasting parental varieties and an F3 segregating family resistant to the blue disease virus, a pathogen that attacks cotton crops. However, because of the shortage of studies, more research on marker-assisted selection in cotton is necessary to obtain cultivars resistant to pathogens [23].
For resistance to nematodes, the existence of molecular markers closely linked to the GB713 gene, which confers a high level of resistance to the cotton reniform nematode (RN), was investigated. Three hundred plants of the F2 generation were genotyped with SSR markers that covered most of the cotton chromosomes. Two QTLs were found on chromosome 21, and one QTL was found on chromosome 18. One of the QTLs on chromosome 21 occupied position 168.6 on the map, and the other occupied position 182.7. Chromosome 21 had 61 microsatellite markers covering 219 cM. The two QTLs on chromosome 21 showed additive and dominance effects [24].
A study also investigated populations of cotton using molecular markers and aimed to find genes for resistance to reniform nematode in the progeny of a trihybrid cross between Gossypium arboreum, G. hirsutum and G. aridum. Twenty markers were associated with a resistance locus. Because the SSR fragments associated with resistance were found in G. aridum, this species is most likely the source of this resistance. The resistance was also found to have simple inheritance and to be controlled by a single dominant gene [25].
Fusarium wilt is among the diseases that cause the most damage to cotton producers around the world, causing yellowing, wilting, defoliation and damage to the vascular tissue, and possibly resulting in the death of the plant. The identification of molecular markers linked to resistance genes is therefore of fundamental importance. One study evaluated an F2 population of Gossypium hirsutum L., developed by crossing a highly resistant cultivar, Zhongmiansuo 35 (ZMS35), with a susceptible cultivar, Junmian 1, for SSR markers closely linked to the resistance gene, which showed monogenic inheritance (3:1). This gene was closely linked with the SSR marker JESPR304 and was located on chromosome 17 [26].
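A 3:1 segregation ratio of this kind is usually checked with a chi-square goodness-of-fit test on the F2 phenotype counts. The Python sketch below shows the calculation under stated assumptions: the plant counts are hypothetical (the review does not give the counts from [26]), and 3.841 is the standard 5% critical value for one degree of freedom.

```python
def chi_square_3_to_1(resistant: int, susceptible: int) -> float:
    """Chi-square statistic for observed F2 counts against a 3:1 expectation."""
    total = resistant + susceptible
    if total <= 0:
        raise ValueError("at least one plant must be scored")
    observed = (resistant, susceptible)
    expected = (0.75 * total, 0.25 * total)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 5% critical value of the chi-square distribution with 1 degree of freedom.
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical F2 counts, for illustration only (not data from the cited study).
stat = chi_square_3_to_1(resistant=152, susceptible=48)
fits = stat < CRITICAL_5PCT_DF1
print(f"chi-square = {stat:.3f}; consistent with 3:1 at the 5% level: {fits}")
```

A statistic below the critical value means the data do not reject single-gene dominant inheritance; it does not by itself prove it, so such tests are normally combined with marker linkage evidence.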
A linkage map consisting of 882 simple sequence repeat, single nucleotide polymorphism, and resistance gene analog-amplified fragment length polymorphism marker loci was constructed in a population of Upland cotton. A total of 21 QTLs associated with Verticillium wilt (VW) resistance were identified on 11 chromosomes and two linkage groups. The authors provide useful information for the development of VW-resistant Upland cotton lines via genomics-assisted breeding. However, the complex genetic basis of VW resistance, which involves many genetic factors, makes it difficult to select molecular markers [27].
One hundred and fifty-eight elite cotton (Gossypium hirsutum L.) germplasm accessions from all over the world were used for association mapping of Verticillium wilt resistance. A total of 42 marker loci associated with Verticillium wilt resistance were identified, widely distributed among 15 chromosomes. Of these, 10 were consistent with previously identified QTLs and 32 were previously unreported. QTL clusters for Verticillium wilt resistance on chromosome 16 were also confirmed, consistent with the strong linkage on this chromosome. These results contribute to association mapping and supply candidate markers for marker-assisted selection of Verticillium wilt resistance in cotton [28].
Marker-assisted selection and genetic transformation are expected to be the most widely used techniques in the improvement of cotton [29].

Agronomic Qualities
Four populations of cotton were assessed for the genetic mapping of traits related to fiber quality and cotton yield using restriction fragment length polymorphism (RFLP) markers [2]. The results showed the presence of 63 QTLs on five different chromosomes (03, 07, 09, 10, 12) and 29 QTLs on three other chromosomes (14, 20 and the long arm of chromosome 26). Chromosome 3 contained 26 linked genes for quantitative traits, covering 117 cM with 54 RFLP loci. The long arm of chromosome 26 contained 19 genes for quantitative traits, representing 77.6 cM with 27 RFLP loci. The authors concluded that approximately 49% of the supposed 92 QTLs for fiber quality and agronomic traits are linked on these two chromosomes, indicating that cotton chromosomes may have islands of high and low meiotic recombination, similar to other eukaryotic organisms.
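The 49% figure follows directly from the counts quoted in this paragraph, as the short Python check below shows. The variable names are illustrative; all numbers are those reported in the text for reference [2].

```python
# QTL counts as reported in the text for reference [2].
qtls_on_five_chromosomes = 63   # chromosomes 03, 07, 09, 10, 12
qtls_on_three_chromosomes = 29  # chromosomes 14, 20 and 26 (long arm)
total_qtls = qtls_on_five_chromosomes + qtls_on_three_chromosomes

qtls_chr3 = 26
qtls_chr26_long_arm = 19

# Share of all QTLs that sit on chromosome 3 and the long arm of chromosome 26.
share_pct = 100.0 * (qtls_chr3 + qtls_chr26_long_arm) / total_qtls
print(f"{total_qtls} QTLs in total; {share_pct:.1f}% on the two chromosomes")
```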
Zhang et al. [30] constructed a genetic linkage map with 70 loci (55 SSR, 12 AFLP and three morphological loci), evaluating 117 plants derived from a cross between two cotton cultivars, Yumian 1 and T586, which have relatively high levels of polymorphism. The linkage map had 20 linkage groups covering 525 cM, with an average distance of 7.5 cM between two markers, or approximately 11.8% of the total recombination length of the cotton genome. This map was used to identify QTLs related to fiber percentage and fiber quality traits. The researchers identified sixteen QTLs for fiber percentage and fiber quality in six linkage groups.
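The map statistics quoted for [30] can be cross-checked with a few lines of Python. This sketch assumes the reported 7.5 cM average was computed as map length divided by number of loci; the variable names and the alternative per-interval figure are illustrative additions, not values from the source.

```python
# Values reported in the text for reference [30].
map_length_cm = 525.0
loci = 70
linkage_groups = 20
coverage_fraction = 0.118  # reported share of the cotton genome covered

# Average spacing as map length per locus, which reproduces the reported figure.
avg_cm_per_locus = map_length_cm / loci

# Alternative convention: each linkage group with k loci has k - 1 intervals.
intervals = loci - linkage_groups
avg_cm_per_interval = map_length_cm / intervals

# Total genome length implied by the reported coverage.
implied_genome_cm = map_length_cm / coverage_fraction

print(avg_cm_per_locus, avg_cm_per_interval, round(implied_genome_cm))
```

The per-interval convention gives a larger spacing because markers in different linkage groups are not adjacent, so the two averages should not be compared across studies without checking which one was used.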
Shen et al. [31], aiming to obtain information about molecular markers in cotton for a future assisted selection program, evaluated an F2 population derived from a cross between Gossypium hirsutum L. (7235) and Gossypium hirsutum (TM-1). Using SSR markers, they found 25 major QTLs for fiber quality and 28 QTLs for yield components.
In an effort to develop molecular marker systems to search for polymorphisms associated with high yield and fiber quality in cotton, Pang et al. [32] developed a method that specifically targets the regulatory regions of the cotton genome. Degenerate primers were designed to target promoter sequences, and the application of the system was tested in combination with 10-mer random amplified polymorphic DNA (RAPD) primers to characterize 40 cotton genotypes relative to their genetic and geographical origins. The amplified markers are called promoter anchored amplified polymorphism based on RAPD (PAAP-RAPD).
Rakshit et al. [33] also studied molecular markers linked to QTLs contributing to agronomic traits of interest. Using F2 and F3 populations, they analyzed fiber quality characteristics with AFLP and SSR markers and found that thirty-two AFLP markers and four SSR primers may be useful for genotyping F2 individuals and, consequently, for genetic mapping of such individuals, because different loci were observed for the five fiber quality characteristics evaluated. In this study, the markers explained 41% of the phenotypic variation for individual characteristics, suggesting the clustering of QTLs for quality traits of the cotton fiber.
Another important feature related to the quality of the plant is its architecture. Song and Zhang [34] studied an interspecific population, derived from a cross of Gossypium hirsutum and Gossypium barbadense, to identify QTLs associated with architectural features of plants. A single QTL that controls seven traits of plant architecture (branch angle, angle of fruits, plant height, leaf size main fruiting, internode length and length between branches) was identified.
Additional work from Rakshit et al. [33] examined a cross between Gossypium hirsutum and Gossypium barbadense to identify QTLs related to the production of chlorophyll by the leaf in three stages of plant development. Twenty-four QTLs were identified in a population of 140 plants. The main QTL that controls the production of chlorophyll a was detected on chromosome A12 (QCA-A12-2) and was expressed in all three phases of development. Two QTLs found on chromosome A06 (QCB-A6-1 for chlorophyll b and QTC-A6-1 for total chlorophyll content) were found in only two of the phases studied.
Liu et al. [35] used cDNA-AFLP markers to construct a linkage map for the cotton hybrid Xiangzamian 2 (Gossypium hirsutum L.) from 171 lines of the immortalized F2 generation. A total of 302 fragments were mapped in 26 linkage groups, and an average distance of 8.23 cM was detected between two markers. Through this study, 71 QTLs associated with productivity and crop value were mapped.
In the research discussed above, no study maps all of the genes of the cotton genome, which in the cultivated allotetraploid species comprises 26 chromosome pairs (2n = 4x = 52). Most features mapped by scientists are desirable agronomic characteristics, especially those related to fiber quality and productivity. Therefore, only some chromosomes have been mapped, and a complete ideogram has not yet been constructed for this species. No single preferred type of genetic marker for cotton mapping has emerged either: the experiments found in the literature used SSR, SNP, AFLP, RFLP and STS markers.
Liang et al. [36] constructed a high-density genetic linkage map to facilitate marker-assisted selection for fiber quality traits in Upland cotton (Gossypium hirsutum L.). The authors obtained a genetic linkage map using an F2 population derived from the GX1135 × GX100-2 cross, comprising 421 loci and covering approximately 73.35% of the cotton genome. In the map, 44 of 49 linkage groups were assigned to the 26 chromosomes, and 39 QTLs were detected corresponding to five fiber quality traits: 12 for fiber length, 5 for fiber uniformity, 9 for fiber strength, 7 for fiber elongation, and 6 for fiber micronaire.
Wang et al. [37] analyzed the association of 3 yield component traits and 5 fiber quality traits of 55 Gossypium barbadense accessions in 2009 and 2010 using 170 SSRs and 258 sequence-related amplified polymorphisms (SRAPs).A total of 72 loci were detected, including 28 loci of SSRs and 44 loci of SRAPs; 26 of these loci were related to yield component traits, and 46 of these loci were related to fiber quality traits.
Cao et al. [38] demonstrated the first practical use of chromosome segment introgression lines (CSILs) for the transfer of fiber quality QTLs into Upland cotton cultivars using SSR markers without detrimentally affecting desirable agronomic characteristics. They developed a set of five CSILs associated with QTLs for superior fiber qualities, including length, strength and fineness of fiber. TM-1, the genetic standard of G. hirsutum, was used as the recipient parent and the long staple cotton G. barbadense cv. Hai7124 was used as the donor parent in marker-assisted selection (MAS). In diverse field trials, these QTLs consistently and significantly offered additive effects on the target phenotype. These CSILs have great potential to improve fiber qualities in Upland cotton MAS breeding programs.

Conclusion
A genetic map of cotton is extremely important because it provides fundamental information about the genetic content and genome organization of this species. In addition, it allows for the development of a strategy for improving cotton production. The research that has been conducted thus far has not mapped all cotton genes, and scientists have prioritized genes associated with traits of interest, especially those related to fiber quality. Moreover, no record of an ideogram of the species was found in the literature, only the genetic map of some chromosomes.

Figure 1. Comparison of the world cotton crop for the production of fiber (a) and oleaginous seeds (b). Source: FAO (2012).