The Tale of Cotton Plant: From Wild Type to Domestication, Leading to Its Improvement by Genetic Transformation

Cotton is a major cash crop worldwide. It earns substantial foreign exchange through its valuable products, including fiber, lint, cottonseed oil and hulls, and it supports the textile and seed oil industries. This review summarizes cotton biology, diversity and domestication, genome assembly, constraints on production, and methods for improving the cotton plant to meet the needs of the textile and oil industries. Cotton faces severe biotic and abiotic stresses, of which insect pests are the most prominent, and the massive damage caused by insects must be controlled to sustain productive cotton crops. Conventional breeding approaches are largely limited to improving a single trait at a time and take approximately 7-8 years to integrate stable genes into the plant genome. Improved biotechnological procedures have opened new ways to target genes specifically and to improve cotton germplasm in less time than conventional breeding.


Introduction
Cotton is considered the world's most important fiber-producing crop. It supports the textile industry by providing fiber and the oil industry by producing high-quality oil. It is also a major source of protein (30%-40%) for animal feed. The United States of America is the third-largest cotton producer at 3553 thousand metric tons annually (Figure 2). Arizona, California, Florida, Mississippi and Texas are the major cotton-producing states. Cotton is cultivated on nearly 5 Mha/yr in the United States, exceeding the area planted in all crops except wheat, maize and soybean [2]. The cotton fiber grown annually in the United States is worth ~6 billion dollars, and the added value from cottonseed oil and meal is ~500 million dollars. Cotton fiber exports are worth four billion dollars annually. The business revenue generated by the crop is estimated at about 120 billion dollars [3]. The United States is increasingly dependent on the world market for cotton to counteract the high demand for artificial fiber [2].
Pakistan is a prominent cotton producer and consumer, and its economy depends heavily on the cotton and textile industry (Table 1). The cotton and textile sectors make up almost half of the country's industrial base, and cotton is the principal cash crop of Pakistan, providing critical income to the country's households. The cotton-textile sectors account for 11% of GDP and 60% of export earnings. Cotton is grown on 15% of the agricultural land from May to August, during the monsoon season. It is also grown on a small scale from February to April.
Pakistan ranks fourth in world cotton production, behind China, India and the United States [4]. It ranks third in raw cotton exports and fourth in cotton consumption, and it is the largest exporter of cotton yarn [5]. However, the future of this essential part of the national economy is uncertain. The sector faces challenges from unstable prices and increased competition arising from worldwide liberalization of the multidimensional cotton and textile business. Therefore, Pakistan's economic circumstances are unstable [6]. As a cash crop, cotton brings in substantial foreign exchange for Pakistan's economy, earning it the name "white gold of Pakistan". The textile industry earned $10.22 billion in foreign exchange from July 2016 to March 2017. The cotton crop was grown on 2,961,000 ha in 2017-18, an increase of 5.5% over the previous year, and cotton production in 2017-18 was 11.8% higher than in the previous year [7]. The shift from cotton to rice and maize in some districts of Punjab contributed to the decreased acreage of cotton in Pakistan (Figure 3).
flower's creamy white or yellow color to pinkish-red, then the flower wilts and falls off, uncovering a small, green, immature cotton boll. A cotton boll is a segmented fruit pod with thirty-two immature seeds from which the fibers originate. It enlarges to the size of a small fig as the fibers develop and thicken within the boll. At this stage, the mature cotton fibers are thickened with their chief component, the carbohydrate cellulose, which is also the major component of the cell walls of higher plants. Each plant may bear up to 100 bolls, and the average boll contains ~500,000 cotton fibers [11]. About 140 days after planting, or nearly 45 days after the bolls appear, the cotton boll starts to split naturally along its segments, or carpels, and begins to dry out, uncovering the underlying cotton segments, called locks. The dried carpels are called the bur; when fully dried and fluffed, the bur holds the locks of cotton firmly in place, ready for easy picking [10].
Cotton belongs to the genus Gossypium, in the tribe Gossypieae of the mallow family, Malvaceae (Table 2). Cotton occurs naturally in arid to semiarid areas of subtropical and tropical regions in both the New and Old World. The genus Gossypium consists of ~50 species [12], making it the largest in the tribe Gossypieae. The name of the genus is derived from the Arabic word goz, which means soft material [13]. The genus originated ~5 to 10 million years ago [14]. Its species are extremely varied, both morphologically and physiologically, ranging from fire-adapted, herbaceous perennials in Australia to tree species in Mexico [12].
The life cycle varies among cotton species, but the symmetry of fruit production remains almost the same. Insects, weather and moisture can drastically alter the conditions for cotton growth, and it is the farmer's task to manage these conditions and so influence yield [9]. Cultivated cotton is a perennial shrub that is most often grown as an annual crop. Plants are 1-1.5 m high under modern cropping patterns and sometimes taller in conventional multi-cropping patterns. The leaves are wide and have 4-5 lobes. The cottonseeds, located in the capsule-shaped boll, are enclosed by two types of fibers, which are detached from the seed by ginning. First, the longer fibers, called staples, are detached and twisted together to produce yarn for making thread and weaving into high-quality fabric. Second, the linters are detached and woven into poor-quality fabric that contains the eponymous lint. Modern machines harvest cotton bolls without damaging the plant. Cotton production is boosted further by a favorable environment. Spinning machines and power looms were early but enduring innovations of modern industry, and they help maintain the consistent quality and quantity of cotton products [1].
The most common commercial species of cotton are G. hirsutum (>90% of world cotton production), G. barbadense (3%-4%), and G. arboreum and G. herbaceum (2%) [15]. Most cotton varieties were derived through conventional breeding techniques such as selection and hybridization. Current breeding programs seek to transfer traits such as insect and disease resistance and drought tolerance from wild cotton species into the major commercially cultivated species. Cotton fibers occur naturally in white, green, brown and some blends of these shades [16]. Cotton is native to tropical and subtropical regions of the globe, mainly America, Australia and Africa [8]. Seventeen native species of Gossypium, distant relatives of cultivated cotton, are part of the Australian flora [12]. Currently, 52 species of cotton are placed in the genus Gossypium. The Mayan civilization in Mexico first developed Gossypium hirsutum as a cultivated species [8].
Cotton is a popular natural fiber. Cultivated cotton is also an important oilseed crop and a major protein source for animal feed. This makes cotton a major player in the world economy, central to the industry, agriculture and employment of many subtropical and tropical regions in South America, Africa and Asia. The genus Gossypium has therefore long been the object of research [16]. All parts of the cotton plant are useful. Seeds are used for oil or as animal feed. Fiber is used in the textile industry to produce thread and fabric, and the remaining parts are mulched. Linters (small fibers removed from the seed after ginning) are a good source of cellulose and are used to make products such as cotton balls [8]. Among cotton products, lint (the fiber detached from the seed) is the major product; other products include textile and yarn products, automobile tire cord, plastic reinforcing and cordage. Cotton hulls are used for fuel, fertilizer and packing material, and the stalk fiber is used for pressed paper and cardboard [8]. Cottonseed oil is now considered a chief byproduct of the cotton plant and has emerged as a distinct industry since its development in the latter half of the 19th century. It is gradually becoming more important to cotton farmers as natural cotton fiber meets competition from cheaper, stronger synthetic fibers. Cottonseed contains ~20% oil. After the linters are detached, the seeds are shelled, pressed and crushed, and the crude oil is extracted with solvents. Cottonseed oil is used as a cooking or salad oil, in margarine and shortening and, in a highly refined form, in cosmetics. It is used as a semidrying oil in paint. Candles, soap, detergents, oilcloth, artificial leather and many other commodities are manufactured from its less refined state [8]. Cotton is easily combined with linen to make velvet. It is less expensive than silk and can be printed more easily than wool. 
Its low market price makes it acceptable to the general public. The British midlands became a profitable center in the 1770s due to innovative techniques like the spinning jenny, water frame and spinning mule. British cotton export reached 15.6% from 1794 to 1796 and 42.3% from 1804 to 1806 [17].
Cotton is grown all around the world: 75 countries produce cotton for different purposes [18]. It is grown on both sides of the equator, to roughly 45 degrees north and 35 degrees south, and comprises ~31.7% of global agricultural production [19]. Management improvements are an ongoing challenge as cotton producers face market realities. A producer's understanding and knowledge of the crop and ability to read the plant are critical for developing strategies to meet anticipated needs. An integrated management approach that increases the efficiency of every production input and output is an essential element of a successful enterprise. Cotton producers will be expected to produce quality fiber and cotton products under increasing demands for environmental stewardship. Integrating management practices into an efficient system is the best approach to sustaining the future of cotton production [21].

Domestication of Cotton
No one knows when cotton was first domesticated. Cotton boll fragments and remnants of cloth some 7000 years old have been found in Mexico, and this cotton was similar to that grown in America today. Cotton has been grown, spun and woven in the Indus valley of Pakistan since 3000 BC. At the same time, the inhabitants of the Nile valley in Egypt were also wearing cotton clothes. Arabs brought cotton to Europe in 800 AD [19]. Columbus found cotton in the Bahamas during his exploration of America in 1492. After 1500, cotton was known worldwide. Florida started growing cotton in 1556 and Virginia in 1607. Farmers were cultivating cotton in Virginia along the James River by 1616. The first spinning of cotton by machine was done in England in 1730. Cotton ginning and the industrial revolution paved the way for the worldwide significance of cotton today. Recent research comparing seeds and bolls from cultivated G. hirsutum and its wild relatives indicates that they are the same species, initially cultivated on the Yucatan Peninsula [12]. G. arboreum is a tropical plant, which has restricted its spread from southern Asia. It is grown in the Persian Gulf and some parts of North Africa. G. arboreum was recently found in Karatape, Uzbekistan [24]. G. herbaceum is less familiar than G. arboreum. It was cultivated in the forests and plains of Africa. Its wild plants were taller, with small fruits and a thick testa. No archeological remains of it have been recovered, but it has spread toward North Africa and the Near East [24]. G. hirsutum and G. barbadense are considered New World cotton species [22]; they were cultivated in Mexico and Peru, respectively. Some archeologists maintain that the most primitive form of cotton was domesticated from G. barbadense and was first cultivated in Mesoamerica. Others believe that G. hirsutum alone was domesticated in Mesoamerica [12]. 
Either way, cotton became an important cash crop and a valuable exchange element in Mesoamerica. Maya and Aztec merchants exchanged cotton articles and precious woven colored blankets. Aztec kings gave gifts of cotton items to their guests and army leaders [24]. Cotton remains from Ancon dating to 4200 BC provide the earliest evidence for domestication of G. barbadense. By 1000 BC, Peruvian cotton bolls were still different from modern cultivated varieties of G. barbadense. The evidence of this form of cotton was found in some regions of Ecuador and on the middle coast of Peru [12].
Pakistan is among the pioneer cotton cultivation regions: the earliest known traces of cotton were found in Mehrgarh, near Quetta city: a copper bead with threads of cotton was found in a Neolithic burial site dated to ~6000 BC. Metallurgical analysis of mineralized threads with light and scanning electron microscopy confirmed the presence of genus Gossypium [25]. Cotton threads have been recovered from archeological investigations of the Indus Valley civilization. Cultivation of cotton was extensive at the time of the Indus Valley civilization, covering areas of modern northwest India and eastern Pakistan [26]. Archeological indications of seeds from Mehrgarh have been dated back to 5000 BC. Cotton clothes were being used in Mohenjo-Daro and the Harappa Valley in 2500 BC. Cotton pollens were discovered at Balakot [27]. Evidence of cotton threads was found around mirror handles and copper razors dated to the mature Harappan period of ~2500 to 2000 BC. Other evidence of cotton was found in Balakot as pollen, in Banawali as seeds, and in Kanmer, Imlidhi Khurd, Kacchh, Sanghol and Gorakhpur as lint fibers [28].

Colored Cotton
Cotton varieties that produce colors other than off-white are important additions to modern marketable cultivated cotton. Green, red, and several shades of brown are the major natural colors of cotton varieties, which do not fade. The yield of colored cotton varieties is lower than commercially cultivated white cotton due to harvest constraints. Fiber is shorter and more fragile, but also softer than commercial white cotton. For better yield, specialized harvest technologies are required [29].
Sally Fox started postgraduate work on colored cotton in 1982. She was the first to develop long-fiber colored cotton and obtained patents for different shades of colored cotton, including coyote brown, green, palo verde green and buffalo brown, under Fox Fiber [30]. In 1984, Raymond Bird worked on naturally colored cotton to improve its quality [31]. Colored cotton has excellent sun-protection properties. The color does not fade even after laundering, and the fiber is environmentally friendly because it is not dyed, which also saves the capital investment for fabric dyeing. Naturally colored cotton is more expensive ($1.8 to $5.0 per pound) than cultivated white cotton ($0.75 to $1.75 per pound) [30]. Colored cotton is grown in the United States, China, Russia and Brazil. China produces ~61% of the world's colored cotton and exports it to Western Europe, North America, Southeast Asia and Eastern Europe [29]. The fiber of colored cotton could help reduce the incidence of ~50 somatic and psychosomatic disorders in human beings [32]. Dyed cotton fabrics can trigger skin allergies, and some dyes are carcinogenic, endangering textile workers [32]. Naturally colored cotton has shortcomings too. The fiber is short and is not harvested efficiently, causing great losses in yield. The color range is limited and sometimes unstable, and the fiber quality is low. Naturally colored cotton has very low market demand and few marketing facilities, but it could be a greener solution for cotton production and the textile industry. Demand for colored organic cotton may increase, which might offer the cotton industry a better price option. Colored cotton can be a great source of wealth for rural families and of women's empowerment [33]. Cotton plants have several key characteristics: okra-like leaves, nectriness, gossypol glands, reddish-brown stems and frego bracts [8]. 
Existing colored cotton varieties can be improved through conventional breeding programs, biotechnology, or gene pool diversification. Understanding the mechanisms underlying pigment formation is critical for such efforts [8]. A comparison of the economics of colored and white cotton is given (Table 3).

Cotton Genome
Phylogenetic analysis places the genera Gossypioides and Kokia closest to the genus Gossypium (Figure 5). The genus Gossypium has three major diploid lineages. The A, B, E and F genomes constitute the African-Asian clade, the D genome forms the New World clade, and the C, G and K genomes gave rise to the Australian clade. Worldwide expansion was accompanied by differentiation in genome size, morphology, ecology and chromosome pattern. Introgression and interspecific hybridization were the most common causes of speciation in Gossypium. Allopolyploidy occurred when an immigrant A-genome diploid cotton, as the female parent, hybridized with a native New World D-genome diploid resembling G. raimondii [14]. Wild cotton plants are diploid, but a group of five tetraploid species is native to the Pacific islands and the Americas as the result of a single hybridization event ~1.5 to 2 million years ago. The tetraploid species are G. hirsutum, G. mustelinum, G. tomentosum, G. darwinii and G. barbadense [8].
A significant goal in cotton research is to study the genome of cultivated cotton and its relatives. Sequencing the cotton genome will help decipher important genetic components of the genus Gossypium. Gossypium comprises 52 species: 46 diploid species distributed in eight groups (A, B, C, D, E, F, G and K) and six tetraploid species (AD genome). Hybridization and polyploidization between the A and D diploid genomes resulted in the AD genome of tetraploid cotton. This polyploidization led to a remarkable combination of high yield potential and superior fiber quality compared with the A genome of G. arboreum, which has poor fiber quality, and the D genome of G. raimondii, which does not produce a spinnable fiber [34]. Genomic research on cotton started as an analysis of the genetic diversity of diploid and allotetraploid species using SSR markers in A- and D-genome species of Gossypium (Guo et al., 2003). High polymorphism was found among Gossypium species with A- or D-genomes, and the molecular clustering was consistent with the previously defined Gossypium taxonomy (Fryxell, 1965). G. gossypioides, with a D-genome, was least similar to the other D-genome diploid species, emphasizing the significance of G. gossypioides as the original D-genome cotton species. To understand allopolyploidization in Gossypium, two allotetraploid cotton species were studied, but allotetraploid cotton species proved inappropriate for studying the evolution of the A and D genomes [35]. The formation of Gossypium polyploids and their role in the evolution of new species was examined using a polyploid of G. barbadense with an AD genome: 83 non-cross-hybridizing clones containing dispersed repeats that make up ~24% of nuclear DNA [36]. The A-genome encompasses 77% of nuclear DNA. FISH analysis showed that some A-genome repeats had spread to D-genome chromosomes in tetraploid cotton. Among the D-genome diploids, only G. gossypioides had appreciable levels of A-genome repeat sequences in addition to D-genome repeats. The spread of dispersed repeats in polyploids reflects the contribution of the diploid progenitors. Most DNA sequences in the clones did not match known DNA sequences: only four were linked to transposable elements, some had internal repeats and ~12 could hybridize to mRNAs. Cytogenetic and phylogenetic analysis of dispersed DNA repeats thus provided new insight into the evolution of polyploidy [36].
Two subgenomes are present in G. hirsutum: the Dt and At subgenomes [37]. Public sector research produced a superior draft genome sequence from reads created by all available sources: Sanger reads of bacterial artificial chromosomes (BACs), cosmids and plasmids, and 454 reads. These reads will be influential in ordering an initial draft D-genome [38]. Monsanto and Illumina sequenced the D-genome of G. raimondii to about 50× coverage with Illumina technology in 2010 [39]. The raw reads were donated to the community. Assembling the AD-genomes of cultivated cotton varieties requires first assembling the D-genome from the raw reads, a formidable task [13]. Gene Trek and BAC tagging techniques were used to identify the organization and configuration of the genome of allotetraploid cotton [40]. Analysis of BAC sequences showed 70,000 genes with copies in homeologous regions of the A- and D-subgenomes. Gene distribution was uneven, with both gene-rich and gene-poor sections. Among BACs, 21% lacked genes. Other gene islands averaged ~1.5 genes/island, with BAC gene density ranging from 0 to 33.2/100 kb. In the D-genome, 125 polymorphic loci were marked out of 166 loci. Thirty-seven BACs, 12 from the A-genome and ing that introns play no role in altering the size of subgenomes in cotton species [40]. The importance of polyploidy as a key factor for increased quality and fiber productivity was emphasized [41]. A five- to six-fold increase in ploidy level occurred in cotton ~60 million years ago, and allopolyploidy occurred one to two million years ago. The evolution of embryonic fiber before allopolyploidy is confirmed by the occurrence of spinnable and non-spinnable fibers in the A- and F-genome species G. herbaceum and G. longicalyx, respectively, and in the D-genome species G. raimondii. Several non-reciprocal interactions between genomes contribute to innovative properties in the G. hirsutum AD genome. The novel properties of G. hirsutum were obtained by recombining D- and A-genome alleles [41].
A draft genome was sequenced for G. raimondii, whose progenitor is the putative contributor of the D-subgenome to G. hirsutum and G. barbadense. About 73% of the assembled sequences were anchored on the thirteen chromosomes of G. raimondii. The genome contains 40,976 protein-coding genes, supported by transcriptome analysis. Evidence was found of the hexaploidization event shared by the eudicots and of a cotton-specific whole-genome duplication thirteen to twenty million years ago. The G. raimondii genome had ~2355 syntenic blocks, with ~40% of paralogous genes present in more than one block, suggesting the significance of chromosomal rearrangements during the evolution of cotton species. Phylogenetic analysis found the cadinene synthase (CDN) gene family, which is involved in gossypol biosynthesis, only in cotton and Theobroma cacao [38]. A whole-genome marker map (WGMM) of cotton was based on the sequenced D-genome of G. raimondii [42]. A WGMM of 48,959 loci was created for cotton [43], comparable to the rice and brassica genetic maps of 15,759 SNPs and 13,551 sequence-related amplified polymorphisms, respectively [44], [45]. This cotton WGMM aided targeted research for gene cloning, association mapping of cotton and other related genes, and genome-wide studies. The WGMM is a significant resource for understanding QTLs for cotton fiber development, association mapping, pest resistance gene analogue clusters, gene structure and variation [42]. The size and complexity of the allotetraploid cotton genome (AADD; 2n = 52) make genetic, genomic and functional analysis difficult. The genome of G. arboreum (AA; 2n = 26), an assumed donor of the A-subgenome, has been assembled [46]. Paired-end sequencing yielded 193.6 Gb of clean sequence, covering the genome 112.6 times. Subsequently, 90.4% of the assembled sequence was anchored on 13 pseudochromosomes, and 68.5% of the genome was found to consist of repetitive DNA sequences. Up to 41,330 protein-coding genes were defined in G. arboreum. G. raimondii and G. arboreum shared two whole-genome duplications before speciation. The difference in genome size stems largely from insertions of long terminal repeat retrotransposons over the past five million years [47].
An A-genome sequencing project for Gossypium was initiated in 2007 [15]. The goal was to sequence the entire genome of commercially cultivated allotetraploid cotton species. "Allotetraploid" indicates that these cotton genomes consist of two distinct subgenomes, the At and Dt (the "t" denotes tetraploid and distinguishes them from the A- and D-genomes of the related diploid species). The D-genome of G. raimondii, a diploid relative of allotetraploid cotton, was sequenced first. G. raimondii is a wild cotton species of South America (Peru, Ecuador), and its genome is smaller because it contains less repetitive DNA (primarily retrotransposons). The G. raimondii genome has about one-third as many bases as tetraploid AD cotton, and each chromosome is present only once. The Old World A-genome cotton species G. arboreum, cultivated in India, was chosen to be sequenced next. The genome of G. arboreum is about twice the size of that of G. raimondii. Once both genomes are completely sequenced, the genomes of cultivated tetraploid cotton varieties can be sequenced. This strategy is necessary because, if the tetraploid genome were sequenced without model diploid genomes, the euchromatic DNA sequences of the AD genomes would co-assemble and the repetitive elements would assemble independently into A and D sequences, respectively. The AD sequences can only be untangled by comparison with their diploid counterparts [46].
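The sequencing figures quoted above for G. arboreum (193.6 Gb of sequence covering the genome 112.6 times) imply a genome size through the usual depth relation, coverage = total bases / genome size. The following is a minimal sketch of that arithmetic, not part of the cited assembly workflow; the function name is hypothetical and the only inputs are the two numbers given in the text.

```python
def implied_genome_size_gb(total_bases_gb: float, fold_coverage: float) -> float:
    """Genome size (Gb) implied by sequencing yield and mean depth.

    Uses the standard relation: coverage = total bases / genome size.
    """
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_bases_gb / fold_coverage


# Figures quoted in the text for G. arboreum.
arboreum_size_gb = implied_genome_size_gb(193.6, 112.6)

# Portion of that size attributable to repeats, using the 68.5% quoted in the text.
arboreum_repeat_gb = arboreum_size_gb * 0.685

print(f"Implied genome size: {arboreum_size_gb:.2f} Gb")
print(f"Implied repetitive portion: {arboreum_repeat_gb:.2f} Gb")
```

Under these figures the implied genome size is about 1.72 Gb, of which roughly 1.18 Gb would be repetitive DNA.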

Genetic Diversity of Cotton
The sum of all genetic characters in a species' genetic makeup constitutes its genetic diversity. Genetic diversity differs from genetic variability, which refers to the differences among genetic characters. Species diversify as they adapt to a changing environment, when individuals with the best-suited alleles produce more offspring. Gossypium hirsutum, a cultivated upland cotton with an (AD)1 genome and extra-long staple length, and Gossypium barbadense, with an (AD)2 genome, evolved through whole-genome duplication as neopolyploids with different genomes. Gossypium barbadense produces the strongest fiber of any plant, with a long staple and pure cellulose composition [50]. Genetic diversity effects in cotton inbreds were observed in the Xiangzamian 2 (XZM2) hybrid cotton in China, a cross between Zhongmiansuo I2 and 8891. One hundred eighty recombinant lines were produced after nine generations [53]. Ten agronomic traits in the population were studied over two years. SSR markers were used to build the genetic map of the XZM2 hybrid, and single- and double-locus analyses were used to detect QTLs. The genetic base of cotton agronomic traits is influenced by additive and epistatic effects of QTLs [53]. Fiber quality traits were studied using association mapping and genetic diversity [51]. Linkage disequilibrium served as an alternative approach for exploiting the genetic diversity of Gossypium species. Genome-wide linkage disequilibrium was used to examine fiber quality characters. SSR markers showed significant linkage disequilibrium in 11% to 12% of marker pairs among 208 landraces and 77 cultivars, which had a significant similarity in population structure. This demonstrated the potential of association mapping in cotton cultivars once stratification and population structure are taken into account [52].
In the United States, 378 Gossypium hirsutum and three Gossypium barbadense accessions were characterized using 120 gene-specific microsatellites to identify population structure and genetic diversity of tetraploid cotton [55].
One hundred forty-one SSR loci were identified with 546 alleles, of which ~22% were unique. Population analysis by STRUCTURE distinguished five groups as belonging to the southwest, mid-south, southeastern and western cotton belts of United States; the three Gossypium barbadense lines formed a distinct group.
Low genetic diversity was observed among upland cotton genotypes at a 0.195 mean genetic distance between Gossypium hirsutum lines. Population structure and phylogenetic analysis results were consistent with pedigree evidence [55].
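A mean genetic distance such as the 0.195 reported above is an average over all pairs of lines. The sketch below illustrates the idea with a simple shared-allele (Jaccard) distance on SSR allele sets; this distance measure is an assumption chosen for illustration, since the text does not state which measure the cited study used, and the line names and allele labels are hypothetical toy data.

```python
from itertools import combinations


def jaccard_distance(alleles_a: set, alleles_b: set) -> float:
    """1 minus the fraction of alleles shared between two lines."""
    union = alleles_a | alleles_b
    if not union:
        return 0.0
    return 1.0 - len(alleles_a & alleles_b) / len(union)


def mean_pairwise_distance(profiles: dict) -> float:
    """Average distance over every unordered pair of lines."""
    pairs = list(combinations(profiles.values(), 2))
    if not pairs:
        raise ValueError("need at least two lines")
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical SSR allele calls (locus_fragment-size) for three lines.
profiles = {
    "line_A": {"ssr1_150", "ssr2_200", "ssr3_180"},
    "line_B": {"ssr1_150", "ssr2_200", "ssr3_184"},
    "line_C": {"ssr1_150", "ssr2_204", "ssr3_180"},
}

mean_distance = mean_pairwise_distance(profiles)
print(f"Mean pairwise distance: {mean_distance:.3f}")
```

A low mean value indicates that most pairs of lines share most of their alleles, which is the pattern described for the upland cotton genotypes.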
The genetic diversity of 40 releases from a Pakistani cotton breeding program from 1914 to 2005 was evaluated [33]. The genetic diversity of Pakistani cotton germplasm was relatively low over time in the previous releases, showing conservation of elite cotton genotypes for use in future breeding programs [33]. The genomic diversity of 20 cotton cultivars was examined using 31 microsatellite markers. Only two genotypes, K-68/9 and MNH-93, had the maximum significant similarity [56].

Insect Pests
A wide range of insect pests attack cotton. The most damaging include ash weevils, cotton aphids, cotton stem weevil, dusky cotton bug, fruit borer, leaf hopper, leaf roller, mealy bug, pink bollworm, spotted bollworm, shoot weevil, red cotton bug, stem borer, thrips, tobacco cutworm and white fly (Table 5). Integrated pest management is a critical step in boosting cotton production. Formerly, the most devastating cotton pest in North America was the cotton boll weevil. This pest was completely eradicated by the efforts of the Boll Weevil Eradication Program (BWEP) of the United States Department of Agriculture. Synthetic pesticide use was reduced with the introduction of Bt cotton, genetically modified against cotton bollworm and pink bollworm.

Diseases
Cotton production is greatly affected by diseases that cause yield loss and poor-quality seed and fiber. Cotton is affected by bacterial, viral, fungal, nematodal, phytoplasmal and spiroplasmal diseases [61]. Molecular biologists have struggled to understand the biology of cotton leaf curl virus (CLCuV) in order to combat the disease it causes [62]. These efforts are hindered by the complicated nature of the virus and its rapid evolution and gene recombination [63]. Currently, no variety of G. hirsutum is resistant to CLCuV. Current strategies include introducing resistance genes from G. arboreum into G. hirsutum.

Improvement of Cotton
Humanity has been fed and clothed by several dozen crops, which have been improved since the beginning of agriculture through both conscious and unconscious selection [69]. Drought tolerance has been achieved by recurrent selection under drought conditions. Selection of Gossypium barbadense under elevated temperature conditions has resulted in heat-tolerant varieties [70].
Natural genetic mutation is also a source of improvement in particular varieties, especially of oilseed crops. Rape has been bred into modern oilseed rape, and cotton varieties have likewise been bred, to eliminate hazardous chemicals [71], [72]. In classical plant breeding, linkage can carry deleterious genes from donors into the cultivated plant along with the insect or pest resistance being introduced. A morphological mutant of G. hirsutum var. RH-003, obtained by treatment with 15 kR of gamma rays, had more monopodial branches and also bore more bolls of larger size [73]. Induced mutagenesis has been reported to improve cotton characters such as earliness [74], compactness and dwarfism, boll weight [75], ginning percentage and fiber length [76], yield [77], seed oil content [78], disease resistance, insect resistance [79], and drought and salinity tolerance [80]. Cotton is an often cross-pollinated crop and does not suffer from inbreeding depression. Crossing (test cross and backcross) is considered an effective way to obtain a plant of interest and to develop plants with better traits. Mac7 was identified as a cultivar resistant to the Burewala strain of CLCuD, and it was released as a germplasm line by the USDA [81]. Multiple backcrosses between Gossypium barbadense L. and G. hirsutum L. were used for QTL analysis of fiber quality. After a cross between Guazuncho 2 (G. hirsutum) and VH8 (G. barbadense), three backcross generations were studied, the first (BC1) and the second (BC2 and BC2S1), which showed good fiber quality and fiber length [82].
In 1976, Konarev proposed that heterosis is manifested most strongly in the F1 generation and is also passed on to subsequent generations. Owing to recombination, emerging cultivars proved superior to their parents in disease resistance, yield and tolerance to environmental changes [83]. In India, commercial exploitation of hybrid vigor in cotton was achieved and popularized mainly through large-scale cultivation of Hybrid-4 and Varalaxmi [84]. Heterosis improved fiber quality, number of bolls, halo length and span length of cotton [85].
Hybridization based on combining ability generated larger progenies harboring new combinations [86]. Significant genetic variation due to general combining ability (GCA) and specific combining ability (SCA) was noted for different yield traits in Gossypium hirsutum [87] [88]. Combining ability plays a significant role in crop improvement because it reveals the nature and magnitude of gene action and its inheritance. Research on G. hirsutum L. involving crosses among 11 parents, four used as male lines (MCU5, MCU12, Surabhi and SVPR2) and seven high-oil-content accessions used as female lines (F776, F1861, SOCC11, SOCC17, TCH1641, TCH1644 and TCH1646), showed improved cotton yield, bundle strength and optimum seed protein production [89]. Non-additive gene action was observed for traits including boll weight, boll number, lint percentage and seed cotton yield [90] [91] [92], while additive gene action was observed for different traits in upland cotton, with genetic effects showing enough variability for yield parameters [93] [94].
Wild relatives of cotton are a critical source of novel genes for breeding programs, particularly for developing biotic and abiotic stress tolerance. The wild cotton species Gossypium arboreum L. has resistance genes for the Begomoviruses causing CLCuD [6], drought [95], heat [95], root rot, CLCuV [63] and insect pests [96]. Interspecific hybridization of cotton has been successful [97]. Nematode resistance was transferred into tetraploid G. hirsutum [98]. Novel genes for resistance to drought and cotton leaf curl disease were introduced into G. hirsutum from G. australe and G. stocksii. As interspecific hybridization of G. arboreum and G. hirsutum is difficult, some researchers used bridge crosses to introgress resistance genes from wild relatives [99].
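As an illustration of the combining ability estimates discussed in this section, the following minimal sketch shows how GCA and SCA effects are commonly computed from a table of cross means in a line × tester design. The line and tester names and all yield values are hypothetical and are not taken from the cited studies. Replication, error variance and significance testing are omitted.

```python
# Hypothetical mean seed cotton yield (g per plant) for each line x tester cross.
# Rows are female lines, columns are male testers; all names and values are invented.
lines = ["L1", "L2", "L3"]
testers = ["T1", "T2"]
means = [
    [52.0, 48.0],
    [60.0, 58.0],
    [44.0, 44.0],
]


def combining_ability(table):
    """Return (grand mean, line GCA, tester GCA, SCA matrix) for a table of cross means."""
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    # GCA: deviation of a parent's average cross performance from the grand mean.
    gca_lines = [m - grand for m in row_means]
    gca_testers = [m - grand for m in col_means]
    # SCA: what remains of a cross mean after removing both parental GCA effects.
    sca = [
        [table[i][j] - row_means[i] - col_means[j] + grand for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return grand, gca_lines, gca_testers, sca


grand, gca_lines, gca_testers, sca = combining_ability(means)
for name, effect in zip(lines, gca_lines):
    print(f"GCA {name}: {effect:+.2f}")
for name, effect in zip(testers, gca_testers):
    print(f"GCA {name}: {effect:+.2f}")
for i, line in enumerate(lines):
    for j, tester in enumerate(testers):
        print(f"SCA {line} x {tester}: {sca[i][j]:+.2f}")
```

In this simplified form, a large GCA effect points to mainly additive gene action carried by a parent, whereas large SCA effects for particular crosses point to non-additive gene action.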
Genetically modified cotton has started to reduce the strong dependence on pesticides. Bt toxin, a protein naturally produced by the bacterium Bacillus thuringiensis, is toxic to some insects, including flies, beetles, butterflies and moths [100] [101] [102]. A natural insecticide can be produced in cotton tissues by introducing the Bt gene into the cotton genome to make Bt cotton. Lepidopteran larvae die after eating leaves of Bt cotton, reducing pesticide use and allowing natural insect predators to dominate and manage the pests. Insecticides are still necessary to control pests that are not affected by Bt toxin, including stink bugs, plant bugs and aphids. A joint research project by the Chinese Academy of Sciences, the Center for Chinese Agricultural Policy and Cornell University emphasized the development of resistance in insects against Bt toxin [103]. This conclusion was later disputed when joint research by Stanford University, the Chinese Academy of Sciences and Rutgers University proposed that Bt cotton can control bollworm and that the massive increase in secondary pests was due to increased temperature and precipitation [104]. Pesticide use dropped by half after Bt cotton was introduced, increasing the numbers of beneficial insects such as lacewings, ladybirds and spiders [105] [106]. Bt cotton was grown on ~25 mha globally [41], or an estimated 69% of the total area cultivated in cotton. India is growing the most Bt cotton in the world, increasing from 50,000 to 10

Cotton Transformation
It is an essential and foremost priority of a researcher to develop an efficient and  [109].
The in planta transformation technique mainly encompasses Agrobacterium-mediated and pollen tube pathway methods, which have been applied to Arabidopsis thaliana, rice, cotton, wheat, Medicago truncatula, Jatropha curcas and other species. In planta methods have been successfully practiced using seeds, epicotyls, shoot apical nodes, flowers and fruits as recipient tissues, with greater efficiency than tissue culture-based protocols [112]. Cotton has been transformed by various protocols (Table 7).
Transformation and regeneration of cotyledonary tissues of cotton was first performed in 1987, regenerating over 80% of embryos from Agro-transformed calli. Antibiotic resistance, production of opines, immunoassay and Southern blot analysis confirmed positive transformation [113]. A particle bombardment protocol for cotton transformation was optimized using cotton meristems and a Bio-Rad PDS-1000-He gene gun [114]. The role of cytokinins in cotton shoot development was studied by treating the embryonic axis of cotton apical meristems with the cytokinin benzyladenine for 2-20 days and observing expansion; benzyladenine was effective at promoting development of shoots and buds [115]. A genotype-independent regeneration protocol for some elite varieties of Gossypium hirsutum was optimized. The high regeneration potential of Riata was due to introgression of potential regeneration alleles (it was a hybrid cross of a Roundup-ready transgenic cultivar and a cultivar with transgenic Maxxa genetic background). Max-R lines with elite genetic background were produced through increased regeneration selection pressure [116]. Apical meristems and cotyledonary nodes were used to induce multiple shoots in CIM-443 [117].
Somaclonal variation and somatic embryogenesis are often barriers to cotton plant regeneration. Genotype, explant, Agrobacterium strain and callus induction medium are the critical parameters for Agrobacterium-mediated transformation, and an optimized set of these parameters can produce transformed cotton plants in 8-10 months [118]. Agrobacterium-mediated transformation of green-colored cotton was reported from the Chinese Academy of Sciences [119]. Cotton variety G-9803 showed regeneration of embryogenic callus and was transformed with a gene specific for fiber expansion (GhExp-1) and taken through tissue culture. A transformation frequency of 17.8% was observed among 32 distinct regenerants produced within seven months. These findings represent pioneering work on genetic manipulation of green-colored cotton [119]. An Agrobacterium-mediated transformation protocol for Coker-312 used a cDNA (GUS and nptII genes) on two-month-old embryogenic calli derived from hypocotyls. Nearly 46.6% and 20% of explants showed GUS activity after vacuum infiltration and Agrobacterium-mediated transformation, respectively, and a transformation efficiency of 28.23% was achieved [120].
An efficient method with improved frequency of somatic embryogenesis and concomitant growth of somatic embryos was developed. The combined protocol of suspension and solid culture promoted synchronization of somatic embryogenesis and mass embryo development [121]. Cotton transformation was improved by minimizing tissue culture to avoid recalcitrance [122]. A high-efficiency cotton transformation protocol bombarded embryos with DNA-coated gold microparticles, gave an average transformation frequency of 0.55% and produced plants in 7-10 months [123]. A pistil drip method for cotton transformation used a solution carrying Agrobacterium with a plasmid conferring herbicide resistance and showed stable gene integration and heritability [124].
Agrobacterium-mediated transformation with antisense CLCuD coat protein RNA resulted in no symptoms of CLCuD in positively transformed plants [125]. Agrobacterium-mediated transformation and somatic embryogenesis were employed to insert the βC1 gene into Coker-310. Transformed plants developed no CLCuD symptoms throughout their life cycle and were deemed resistant to CLCuV [126]. The fiber expansion gene GhEXPA8 was introduced into NIAB846. Transgenic plants showed increased expression of GhEXPA8, with improved fiber length and micronaire value [127]. The pollen tube pathway has been used to develop transgenic cotton in order to overcome the regeneration problems caused by the recalcitrant nature of cotton [128] (Table 8).
Pollen tube-mediated gene transfer (PTT) delivers foreign DNA into the germ line by way of the pollen grain. PTT has potential benefits: it often avoids drawbacks such as reduced fertility, genotype dependence and genetic variation including mutation and methylation and, most importantly, it avoids the manipulation, identification and screening of transformants required by other protocols [129]. Pollen tube-mediated transformation is an efficient and simple alternative for producing transgenic plants that evades the requirement for tissue culture. The PTT method was first reported in cotton (Gossypium hirsutum L.) [130] and rice [131]. Two approaches were   cell and a generative cell and pass through mitosis for the formation of two male gametes. The transgene is incorporated into the generative cell, which fuses with the egg cell by way of the pollen tube, ultimately forming the zygote. For foreign DNA incorporation, vacuum infiltration or a gene gun can be employed [132]. Targeted pollen carrying the genes of interest is transferred to the recipient plant at the embryo-forming stage through pollination [133]. The second approach transforms the recipient plant directly: the stigma is removed from the style shortly after pollination and a solution of exogenous DNA is applied directly to the ovary, so that the transgene reaches the ovule at fertilization, leading to formation of the zygote. The whole process proceeds in a natural manner [134]. Multinational companies have transformed various cotton varieties for commercial use with traits including insect pest and herbicide resistance (Table 9).

Conclusion
In the current era, there is an intense need to develop improved cotton varieties to meet the ever-growing demand for fine-quality and sophisticated clothing. Crop improvement has shifted from conventional breeding approaches towards manipulating the genome of the cotton plant precisely, without leaving antibiotic resistance marker genes within the genome. This review has presented a broad view of the cotton plant to help researchers pinpoint traits and modify its genome precisely. Increased cotton production will not only fulfill the needs of the world's growing population but will also strengthen the textile economy.

Future Directions
Cotton transformation aimed at improving yield, quality and other traits will be the future goal of cotton breeders and biotechnologists. Marker-free plants will be produced to lessen the impact of antibiotics on the entire ecosystem.

Author's Contribution
Sabin Aslam wrote the main manuscript and drew all figures and tables. Sultan Habibullah Khan prepared the main outline and edited the primary draft of the manuscript. Abhaya M. Dandekar edited the manuscript extensively. Aftab Ahmed gave valuable suggestions to improve the manuscript. All authors read and approved the final manuscript.