Multivariate Cluster and Principal Component Analyses of Selected Yield Traits in Uzbek Bread Wheat Cultivars
1. Introduction
Bread wheat (Triticum aestivum L.) is one of the most important cereal crops in Uzbekistan, providing over 50% of dietary energy intake. The region has one of the highest average consumption rates of wheat products in the world (>185 kg/person) [1]. The world population is rapidly increasing, and there is a growing demand for products derived from wheat [2]. Over thousands of years under the severe conditions of Central Asia, bread wheat has adapted to local soil and climatic conditions [3]. In wheat-producing countries, much attention is paid to the selection of high-yielding, high-quality wheat varieties that are resistant to diseases, pests, and adverse environmental factors. Increasing yield requires wheat genotypes with maximum variability. Genetic variability means that genetic material differs between individuals of the same species, and it is used to detect genetic diversity in closely related species [4]. Yield traits have been successfully used to estimate genetic diversity because they provide a simple way of quantifying genetic variation [5] and because traits such as optimum plant height, grain number per spike, and thousand-kernel weight (TKW) contribute to wheat yield [6] [7] [8].
Several authors have suggested the use of cluster and principal component analyses to study the genetic diversity and relationships of wheat genotypes [9] [10] [11]. The advantage of cluster analysis (CA) is that varieties or samples are grouped on the basis of a complex of traits rather than a single character [12]. The high yield of winter wheat depends on many factors. Each factor has its own effect, but studying each factor separately is not sufficient for a complete analysis [13]. Principal component analysis (PCA) transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components [14]. Mujaju et al. [15] argued that PCA should be conducted before CA. PCA also facilitates the selection of potential parents for hybridization programmes [16]. Mustafa et al. [14] studied maize genotypes under drought stress conditions using PCA and CA. Their study revealed four principal components explaining 86.7% and 88.4% of the total variation. Ahmad et al. [4] evaluated the relationship between yield and its components in bread wheat (Triticum aestivum L.) using factor and cluster analyses. These analyses distributed all genotypes among three clusters and revealed strong relationships of yield with other traits [17].
Ahmadizadeh et al. [18] studied the agronomic and morphological traits of durum wheat landraces under drought stress in a greenhouse. Analysis of variance indicated great variation among landraces and genotypes. Cluster analysis divided the genotypes into three groups under normal and stress conditions. The authors concluded that under stress conditions, grain yield showed a positive and significant correlation with peduncle length, number of grains per spike, 1000-grain weight, biological yield and harvest index. Ibyatov [19] identified four components that explained 84% of the variability in traits among spring wheat genotypes. Cluster analysis based on the genetic diversity of yield traits can be used to assess genetic variation among plant genotypes and to identify high-yielding genotypes. This can be applied in plant breeding by using significant genotypes identified in different clusters [17]. Many studies have carried out preliminary selection of high-performing genotypes in order to determine the effectiveness of cluster analysis for evaluating selected T. aestivum lines for valuable economic traits and adaptation features [20] [21].
Through hybridisation between geographically distant parental forms, productivity-related genes responsible for enhanced traits can be transferred into the next generation. It is known that correlations among genes can be altered depending on the conditions under which plants are grown.
It is important to work out the interrelationship between yield and other related traits for efficient selection of improved genotypes. Moreover, similarity among the wheat genotypes was evaluated using cluster analysis of agro-morphological traits based on Euclidean distance. Other researchers have also used cluster analysis to study the morphological similarity among genotypes [22] [23].
Morphological traits and yield parameters are widely used to determine genetic diversity during breeding processes to produce new cultivars [24]. Multivariate statistical tools enable the analysis of genotypic stability and the creation of groups with distinct traits [25]. The purpose of this work was to study the genetic diversity among geographically distant wheat genotypes by using principal component and cluster analyses. In future, such diversity can be exploited in wheat breeding programmes to create new bread wheat varieties with high yield and enhanced grain quality.
2. Materials and Methods
2.1. Plant Material
The study was conducted at the Durmon Experimental Field Station of the Institute of Genetics and Plant Experimental Biology, Academy of Sciences of the Republic of Uzbekistan. The experimental material consisted of 25 wheat accessions of different geographic origins: the Bardosh, Ilgor, and Ezoz varieties, obtained from a breeding programme using CIMMYT germplasm; the Pakhlavon and Oq Marvarid varieties, obtained by hybridization of local varieties; and winter wheat cultivars obtained from the Krasnodar germplasm collection. Semi-arid accessions and landraces were taken from the CIMMYT germplasm and the local wheat landrace collection, respectively (Table 1).
2.2. Measuring Quantitative Traits
Yield traits were measured as described in earlier work by our group and following the experimental manual by Dospekhov [26] [27].
2.3. Statistical Analysis
Statistical analysis of quantitative traits was conducted by Ken Sayre’s method [23]. Descriptive statistics, ANOVA, cluster analysis and principal component analysis (PCA) were computed in STATGRAPHICS 18 software (https://www.statgraphics.com/). Cluster analysis was performed using K-means clustering, while a tree diagram based on Euclidean distances was constructed by Ward’s method [29].
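The grouping step described above can be illustrated with a minimal pure-Python sketch of Ward's agglomerative criterion on Euclidean distances. It is not the STATGRAPHICS procedure used in the study: the genotype labels G1 to G5 and all trait values are made up, and standardizing the traits before clustering is an assumption, since the text does not state how the traits were scaled.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical trait values (NOT data from this study).
# Columns: grain yield, plant height, TKW.
genotypes = {
    "G1": [720.0, 95.0, 40.0],
    "G2": [735.0, 97.0, 41.0],
    "G3": [540.0, 120.0, 46.0],
    "G4": [555.0, 118.0, 47.0],
    "G5": [640.0, 105.0, 38.0],
}

def standardize(rows):
    """Z-score each trait so that no unit dominates the Euclidean distance (assumed step)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def centroid(points):
    """Mean vector of a list of points."""
    return [mean(c) for c in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(names, points, k):
    """Repeatedly merge the cheapest pair of clusters until k clusters remain."""
    clusters = [([n], [p]) for n, p in zip(names, points)]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c[0]) for c in clusters)

names = list(genotypes)
z_scores = standardize(list(genotypes.values()))
groups = ward_clusters(names, z_scores, k=3)
print(groups)
```

Recording the order and cost of the merges, rather than stopping at k clusters, would give the information needed to draw a tree diagram.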
Table 1. Brief information on the wheat accessions used in this study.
aInternational bread wheat screening nursery, Mexico; bLeaf rust/yellow rust screening nursery, Mexico; cFacultative and winter wheat observation nursery-Semi Arid, Mexico; dFederal state budget scientific institution “National Center of Grain” named after P.P. Lukyanenko, Russian Federation.
3. Results and Discussion
The trait means were analysed by principal component analysis. Principal component analysis identifies the traits that contribute most to the total variation along each axis of differentiation. The resulting eigenvalues are often used to determine how many components to retain. When the analysis is based on the correlation matrix, the sum of the eigenvalues equals the number of variables [19] [30]. According to Chahal et al. [20], characters whose absolute loadings on the first principal component are close to unity influence the clustering more than those whose absolute loadings are close to zero.
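The role of the eigenvalues can be sketched in a few lines of pure Python. The example below assumes a correlation-matrix PCA and uses a made-up 3 × 3 correlation matrix R (not estimated from the study data). It extracts the eigenvalues by power iteration with deflation, checks that they sum to the number of variables, and applies the eigenvalue-greater-than-one retention rule.

```python
import math

# Hypothetical correlation matrix for three traits (NOT from this study).
R = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]

def mat_vec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def leading_eigenpair(m, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    v = [1.0 / (i + 1) for i in range(len(m))]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))  # Rayleigh quotient
    return eigenvalue, v

def eigenvalues(m):
    """Eigenvalues of a symmetric positive semi-definite matrix, largest first."""
    m = [row[:] for row in m]
    n = len(m)
    values = []
    for _ in range(n):
        lam, v = leading_eigenpair(m)
        values.append(lam)
        # Deflation: subtract the component that has just been found.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return values

eigs = eigenvalues(R)
retained = [e for e in eigs if e > 1.0]  # eigenvalue-greater-than-one rule
print([round(e, 3) for e in eigs], "sum =", round(sum(eigs), 3))
print("components retained:", len(retained))
```

Because the diagonal of a correlation matrix consists of ones, its trace, and therefore the sum of its eigenvalues, equals the number of traits.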
The yield data of the 25 wheat genotypes were analysed by principal component analysis. Of a total of eight components, three had eigenvalues > 1. These three principal components explained approximately 90.8% of the total variability, and the other five components explained only 9.2% of the variation among the wheat genotypes (Table 2). This is consistent with the work of Degewione and Alamerew [31].
Based on the results, the corresponding elements of productivity were identified for each of the main components, thus indicating the variance in the population characteristics. Table 2 shows that PC-1 explained 54.2%, PC-2 explained 23.3%, and PC-3 explained 13.2% of the total variance among different yield traits.
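As a quick arithmetic check of these proportions, the sketch below converts each reported share of variance into the eigenvalue it would imply. It assumes that the PCA was run on the correlation matrix of the eight traits, which the text does not state explicitly; under that assumption an eigenvalue equals the share of variance multiplied by eight.

```python
# Shares of total variance reported for the first three components.
explained = {"PC-1": 0.542, "PC-2": 0.233, "PC-3": 0.132}
n_traits = 8  # eight yield traits entered the analysis

# Assumption: correlation-matrix PCA, so eigenvalue = share * number of traits.
implied_eigenvalues = {pc: share * n_traits for pc, share in explained.items()}

cumulative = 0.0
for pc, share in explained.items():
    cumulative += share
    print(f"{pc}: implied eigenvalue {implied_eigenvalues[pc]:.2f}, "
          f"cumulative share {cumulative:.1%}")

# Under the same assumption, a component has an eigenvalue above 1
# only if it explains more than 1/8 of the total variance.
threshold = 1 / n_traits
```

All three implied eigenvalues exceed 1, in line with the retention of three components. The rounded shares add up to 90.7%, marginally below the reported 90.8%, presumably because of rounding.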
Table 3 shows the factor loadings for the various yield traits. According to Table 3, the first PC was related to yield and yield traits, with positive loadings for grain yield (0.45), spikes m−2 (0.42), and grains m−2 (0.46) and negative loadings for TKW (−0.22) and plant height (−0.11). The second PC exhibited a positive effect on biomass (0.47) and plant height (0.61) and a negative effect on the harvest index (−0.49), 1000-grain weight (−0.24), and the number of grains per spike (−0.23). The third PC explained variation among genotypes for 1000-grain weight (0.75), with a positive factor loading.
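The reading of the loadings above can be reproduced with a short sketch that picks, for each component, the trait with the largest absolute loading. Only the loadings quoted in the text are included, so the result is illustrative and could change if the unreported loadings were added; the dictionary layout and the helper name dominant_trait are hypothetical.

```python
# Loadings quoted in the text; loadings that are not reported are left out.
loadings = {
    "PC-1": {"grain yield": 0.45, "spikes m-2": 0.42, "grains m-2": 0.46,
             "TKW": -0.22, "plant height": -0.11},
    "PC-2": {"biomass": 0.47, "plant height": 0.61, "harvest index": -0.49,
             "TKW": -0.24, "grains per spike": -0.23},
    "PC-3": {"TKW": 0.75},
}

def dominant_trait(pc_loadings):
    """Trait with the largest absolute loading on one component."""
    return max(pc_loadings, key=lambda trait: abs(pc_loadings[trait]))

dominant = {pc: dominant_trait(values) for pc, values in loadings.items()}
for pc, trait in dominant.items():
    print(pc, "->", trait, loadings[pc][trait])
```

The sign of a loading gives the direction of the association, while its absolute value indicates how strongly the trait shapes the component.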
The positive and negative effects of factors indicate the association between components and varieties [8]. Therefore, the abovementioned positive and negative productivity elements also contributed to cluster formation. According to the principal component analysis, grains m−2 was selected for the first group, plant height for the second group, and 1000-grain weight for the third group. During the differentiation of genotypes into clusters, the contributions of these three major components were greater than those of the other components.
Twenty-five wheat genotypes of geographically distant origin were statistically analysed and clustered based on eight yield traits: the harvest index (HI), grain yield m−2 (GY), biomass (B), spikes m−2 (S), 1000-grain weight (TKW), grain number m−2 (G), grains per spike (GS) and plant height (PH).
All four clusters were analysed according to their means and standard deviations (Table 4). The mean values for grain yield (742.2), biomass (1756.7), grains/m2 (18,373.7), and grains/spike (45.3) were higher in cluster C than in the other clusters. Cluster D exhibited the maximum value for TKW (46.6). The dendrogram was constructed based on cluster analysis of yield traits (Figure 1).
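A per-cluster summary of this kind can be computed as in the following sketch. The records are invented for illustration and do not reproduce Table 4; cluster labels, trait values and the choice of the sample standard deviation are all assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records (cluster label, grain yield, TKW); NOT data from this study.
records = [
    ("A", 610.0, 39.5), ("A", 598.0, 40.1), ("A", 625.0, 38.8),
    ("C", 735.0, 41.0), ("C", 750.0, 42.2), ("C", 741.0, 40.6),
    ("D", 480.0, 46.0), ("D", 495.0, 47.3), ("D", 470.0, 46.5),
]

# Group the trait values by cluster label.
by_cluster = defaultdict(list)
for label, grain_yield, tkw in records:
    by_cluster[label].append((grain_yield, tkw))

# Mean and sample standard deviation of each trait within each cluster.
summary = {}
for label, rows in by_cluster.items():
    yields, tkws = zip(*rows)
    summary[label] = {
        "GY_mean": mean(yields), "GY_sd": stdev(yields),
        "TKW_mean": mean(tkws), "TKW_sd": stdev(tkws),
    }

best_yield = max(summary, key=lambda c: summary[c]["GY_mean"])
best_tkw = max(summary, key=lambda c: summary[c]["TKW_mean"])
print("highest mean grain yield:", best_yield)
print("highest mean TKW:", best_tkw)
```

Comparing cluster means in this way shows which group would be the most promising source of parents for a given trait.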
Cluster analysis showed that cluster A included 7 genotypes, cluster C contained 8 genotypes and clusters B and D each contained 5 genotypes. Cluster A includes a combination of genotypes (CIMMYT collection, local wheat varieties and landraces). Cluster A contains genotypes closely related to those in cluster B. Cluster B apparently included accessions from the Krasnodar collection (Krasnodar 99, Tanya, Kroshka, and Grom) and the semi-arid wheat collection (171,358). Clusters A and B displayed similar values for grain yield, biomass and spikes m−2.
Table 2. Principal component analysis of different yield traits in wheat.
Table 3. Factor loadings for various traits.
HI, harvest index; TKW, thousand-kernel weight.
Table 4. Means, standard deviations and variances for clusters based on morpho-physiological traits.
C-A, C-B, C-C and C-D are clusters. HI, harvest index; TKW, thousand kernel weight.
Figure 1. Tree diagram of 25 wheat genotypes based on different yield traits.
Although these two clusters have different geographic origins, they share similar yield indices, with close similarity between the intensive winter wheat varieties and the extensive local varieties (Oq Marvarid, Bardosh, Pahlavon, Sayhun, and Kayrattosh). All these varieties are early maturing. Early-maturing genotypes are important under late sowing conditions because they escape the effects of high temperature, especially during the reproductive stage [28] [32] [33].
The first group of cluster C included genotypes from the CIMMYT collection that are adapted to local conditions (Ilg’or, Ezoz and 1326). These varieties showed practical benefit as markers for yield traits. The second group of cluster C contained semi-arid drought-tolerant wheat varieties. The two groups of cluster C genotypes had the highest values for TKW, spikes m−2 and grain yield; however, they were grown under different conditions (well watered or drought stress).
Cluster D was composed of local spring wheat landraces adapted to rain-fed conditions. These varieties were characterized by the highest values for plant height and grain quality and a low value for grain yield. The dendrogram showed close relationships between the varieties Qorakiltik, Grekum, Surxak, Khivit and Oq bug’doy, which are considered local landraces [3] [26].
4. Conclusion
The results of the principal component analysis revealed the main components that contributed most to the evaluation of high yield in bread wheat genotypes. For the first group, grains/m2 had the largest loading for component one; plant height had the largest loading for the second component, and 1000-grain weight had the largest loading for the third component. Cluster analysis grouped genotypes with similar origins into four clusters. The results showed that genotypes with wide genetic diversity can be utilized in future breeding programmes to obtain high-yield genotypes/hybrids adapted to water-scarce areas.
Acknowledgements
We would like to express our gratitude to the members of Grain Genetics for their help in the fieldwork, to Dr. Khurshid Turakulov and Dr. Bakhodir Chinnikulov for their help in collecting wheat landraces, to the CIMMYT and ICARDA organizations for providing wheat accessions, and to the Ministry of Innovations for funding.