Multivariate Cluster and Principle Component Analyses of Selected Yield Traits in Uzbek Bread Wheat Cultivars


Investigating the genetic diversity of geographically distant wheat genotypes is a useful approach in wheat breeding for developing efficient crop varieties. This article presents multivariate cluster and principal component analyses (PCA) of selected yield traits of wheat, such as thousand-kernel weight (TKW), grain number, grain yield and plant height. Evaluating economically valuable attributes by their eigenvalues made it possible to determine the components that contribute significantly to the yield of common wheat genotypes. The PCA showed three principal components (PCs) with eigenvalues > 1, explaining approximately 90.8% of the total variability. The eigenvalue was greatest for PC-1 (4.33), followed by PC-2 (1.86) and PC-3 (1.01). Cluster analysis based on average linkage classified the 25 accessions into four diverse groups. Averages, standard deviations and variances for clusters based on morpho-physiological traits showed that the average values for grain yield (742.2), biomass (1756.7), grains per square meter (18,373.7), and grains per spike (45.3) were highest in cluster C. Cluster D exhibited the maximum TKW (46.6).

Share and Cite:

Adilova, S. , Qulmamatova, D. , Baboev, S. , Bozorov, T. and Morgunov, A. (2020) Multivariate Cluster and Principle Component Analyses of Selected Yield Traits in Uzbek Bread Wheat Cultivars. American Journal of Plant Sciences, 11, 903-912. doi: 10.4236/ajps.2020.116066.

1. Introduction

Bread wheat (Triticum aestivum L.) is one of the most important cereal crops in Uzbekistan, providing over 50% of dietary energy intake. The region is among the highest in the world in terms of average consumption of wheat products (>185 kg/person) [1]. The world population is rapidly increasing, and there is a growing demand for products derived from wheat [2]. Over thousands of years under the severe conditions of Central Asia, bread wheat has adapted to local soil and climatic conditions [3]. In wheat-producing countries, much attention is paid to the selection of high-yielding, high-quality wheat varieties that are resistant to diseases, pests and adverse environmental factors. To increase yield, wheat genotypes with maximum variability are required. Genetic variability means that genetic material differs between individuals of the same species, and it is used for the detection of genetic diversity in closely related species [4]. Yield traits have been successfully used for the estimation of genetic diversity since they provide a simple way of quantifying genetic variation [5], and traits such as optimum plant height, grain number per spike and TKW contribute to wheat yield [6] [7] [8].

Several authors have suggested the use of cluster and principal component analyses to study the genetic diversity and relationships of wheat genotypes [9] [10] [11]. The advantage of cluster analysis (CA) is that varieties or samples are grouped on the basis of complex traits rather than a single character [12]. The high yield of winter wheat depends on many factors. Each factor has its own effect, but studying the effect of each factor separately is not sufficient for a complete analysis [13]. Principal component analysis (PCA) transforms a number of possibly correlated variables into a smaller number of variables called principal components [14]. Mujaju et al. [15] argued that PCA should be conducted before CA. PCA also facilitates the selection of potential parents for hybridization programmes [16]. Mustafa et al. studied maize genotypes under drought stress conditions using PCA and CA [14]. Their study revealed four principal components explaining 86.7% and 88.4% of the total variation. Ahmad et al. [4] evaluated the relationship between yield and its components in bread wheat (Triticum aestivum L.) using factor and cluster analyses. These analyses distributed all genotypes among three clusters and revealed strong relationships of yield with other traits [17].

Ahmadizadeh et al. [18] studied the agronomic and morphological traits of durum wheat landraces under drought stress in a greenhouse. Analysis of variance indicated great variation among landraces and genotypes. Cluster analysis divided the genotypes into three groups under normal and stress conditions. The authors concluded that under stress conditions, grain yield showed a positive and significant correlation with peduncle length, number of grains per spike, 1000-grain weight, biological yield and harvest index. Ibyatov [19] identified four components that explained 84% of the variability in traits among spring wheat genotypes. Cluster analysis based on the genetic diversity of yield traits can be used to assess genetic variation among plant genotypes and to identify high-yielding genotypes. It can be applied in plant breeding by using promising genotypes drawn from different clusters [17]. Many studies have carried out preliminary selection of high-performing genotypes in order to determine the effectiveness of cluster analysis for evaluating selected T. aestivum lines for valuable economic traits and adaptation features [20] [21].

Through hybridisation between geographically distant parental forms, productivity-related genes responsible for enhanced traits can be transferred to the next generation. It is known that the correlation among genes can be altered depending on the conditions under which plants are grown.

It is important to work out the interrelationship of yield and other related traits for efficient selection of improved genotypes. Moreover, similarity among the wheat genotypes was evaluated using cluster analysis of the agro-morphological traits based on Euclidean distance. Other researchers have also used cluster analysis to study the morphological similarity among genotypes [22] [23].

Morphological traits and yield parameters are widely used to determine genetic diversity during breeding processes to produce new cultivars [24]. Multivariate statistical tools make it possible to analyse genotypic stability and to form groups with distinct traits [25]. The purpose of this work was to study the genetic diversity among geographically distant wheat genotypes by using principal component and cluster analyses. In future, such diversity can be exploited in wheat breeding programmes to create new bread wheat varieties with high yield and enhanced grain quality.

2. Materials and Methods

2.1. Plant Material

The study was conducted at the Durmon Experimental Field Station of the Institute of Genetics and Plant Experimental Biology, Academy of Sciences of the Republic of Uzbekistan. The experimental materials consisted of 25 wheat accessions with different geographic origins, namely, the Bardosh, Ilgor, and Ezoz varieties obtained from a breeding programme using CIMMYT germplasm, the Pakhlavon and Oq Marvarid varieties obtained by hybridization of local varieties, and winter wheat cultivars obtained from the Krasnodar germplasm collection. Semi-arid accessions and landraces were taken from the CIMMYT germplasm and the local wheat landrace collection, respectively (Table 1).

2.2. Measuring Quantitative Traits

Yield traits were measured as described in earlier work by our group [26] and following the experimental manual of Dospekhov [27].

2.3. Statistical Analysis

Statistical analysis of quantitative traits was conducted by Ken Sayre’s method [23]. The calculation of descriptive statistics, cluster analysis and principal component analysis (PCA) was performed using ANOVA in STATGRAPHICS 18 software. Cluster analysis was performed using K-means clustering, while a tree diagram based on Euclidean distances was developed by Ward’s method [29].
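The tree-building step described above can be sketched in Python with SciPy. This is only an illustration under stated assumptions, not the STATGRAPHICS procedure used in the study: the trait matrix is random placeholder data, the z-score standardization is an assumption (the text does not say whether traits were scaled), and the variable names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Placeholder data: 25 genotypes x 8 yield traits (random values, NOT the study's data).
rng = np.random.default_rng(0)
traits = rng.normal(size=(25, 8))

# Assumption: standardize each trait so that traits measured on large scales
# (e.g. grains per square meter) do not dominate the Euclidean distances.
z = (traits - traits.mean(axis=0)) / traits.std(axis=0, ddof=1)

# Ward's method on Euclidean distances; each row of `tree` records one merge
# and together the rows encode the tree diagram (dendrogram).
tree = linkage(z, method="ward", metric="euclidean")

# Cut the tree into at most four flat clusters (labels start at 1).
labels = fcluster(tree, t=4, criterion="maxclust")
print(labels)
```

Passing `tree` to `scipy.cluster.hierarchy.dendrogram` would draw the corresponding tree diagram. K-means clustering, which the text also mentions, is a separate partitioning method and is not covered by this sketch.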

Table 1. Brief information of different wheat accessions used in this study.

a, International bread wheat screening nursery, Mexico; b, Leaf rust/yellow rust screening nursery, Mexico; c, Facultative and winter wheat observation nursery-Semi Arid, Mexico; d, Federal state budget scientific institution “National Center of Grain” named after P.P. Lukyanenko, Russian Federation.

3. Results and Discussion

The average data were analysed by using principal component analysis. Principal component analysis reflects the importance of the largest contributor to the total variation along each axis of differentiation. The resulting eigenvalues are often used to determine how many factors to retain. The sum of the eigenvalues is usually equal to the number of variables [19] [30]; this holds exactly when the analysis is performed on the correlation matrix. According to Chahal et al. [20], characters with the largest absolute values, closer to unity, within the first principal component influence the clustering more than those with lower absolute values, closer to zero.
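The role of the eigenvalues can be illustrated with a short NumPy sketch. It assumes that PCA is run on the correlation matrix of the traits, which the text does not state explicitly, and it uses random placeholder data rather than the study's measurements; the retention rule shown (keep components with eigenvalue > 1) is the criterion applied in this article.

```python
import numpy as np

# Placeholder data: 25 genotypes x 8 traits with some induced correlation
# (random values, NOT the study's data).
rng = np.random.default_rng(1)
latent = rng.normal(size=(25, 3))
mixing = rng.normal(size=(3, 8))
data = latent @ mixing + 0.5 * rng.normal(size=(25, 8))

# Correlation matrix of the traits (columns are variables).
corr = np.corrcoef(data, rowvar=False)

# Eigenvalues of the symmetric correlation matrix, sorted in descending order.
eigvals = np.linalg.eigvalsh(corr)[::-1]

# Share of total variance per component and number of components retained
# under the eigenvalue > 1 rule.
explained = eigvals / eigvals.sum()
retained = int((eigvals > 1.0).sum())

print("sum of eigenvalues:", round(float(eigvals.sum()), 6))
print("components with eigenvalue > 1:", retained)
print("cumulative share of retained components:", float(explained[:retained].sum()))
```

Because each standardized trait contributes one unit of variance, the eigenvalues of an 8-trait correlation matrix sum to 8, and a component with an eigenvalue above 1 explains more variance than any single original trait.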

The yield data of 25 wheat genotypes were analysed by using principal component analysis. In this study, out of a total of eight components, three had eigenvalues > 1. These three principal components explained approximately 90.8% of the total variability. The other five components explained only 9.2% of the variation in the wheat genotypes (Table 2). This is consistent with the work of Degewione and Alamerew [31].

Based on the results, the corresponding elements of productivity were identified for each of the main components, thus indicating the variance in the population characteristics. Table 2 shows that PC-1 explained 54.2%, PC-2 explained 23.3%, and PC-3 explained 13.2% of the total variance among different yield traits.
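As a rough arithmetic check, the share of variance attributed to a component can be derived from its eigenvalue. The sketch below assumes a correlation-matrix PCA with eight traits, so that each share is the eigenvalue divided by 8, and it takes the three eigenvalues quoted in the abstract. Those eigenvalues are rounded and the exact software settings are not given, so the results only approximately match the percentages reported in the text.

```python
# Assumption: correlation-matrix PCA on 8 traits, so the eigenvalues sum to 8.
n_traits = 8

# Eigenvalues as quoted (rounded) in the abstract.
eigenvalues = {"PC-1": 4.33, "PC-2": 1.86, "PC-3": 1.01}

# Percentage of total variance attributed to each component.
shares = {pc: 100.0 * ev / n_traits for pc, ev in eigenvalues.items()}
cumulative = sum(shares.values())

for pc, share in shares.items():
    print(f"{pc}: {share:.1f}%")
print(f"cumulative: {cumulative:.1f}%")
```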

Table 3 shows factor loadings for various yield traits. According to Table 3, the first PC was related to yield and yield traits, i.e., grain yield (0.45), spikes m−2 (0.42), and grains m−2 (0.46), with positive loadings, and exhibited negative loadings for TKW (−0.22) and plant height (−0.11). The second PC exhibited a positive effect on biomass (0.47) and plant height (0.61) and a negative effect on the harvest index (−0.49), 1000-grain weight (−0.24), and the number of grains per spike (−0.23). The third PC explained variation among genotypes for 1000-grain weight (0.75), with a positive factor loading.

The positive and negative effects of factors indicate the association between components and varieties [8]. Therefore, the abovementioned positive and negative productivity elements also contributed to cluster formation. According to the principal component analysis, grains m−2 was selected for the first group, plant height was selected for the second group, and 1000-grain weight was selected for the third group. During the differentiation of genotypes into clusters, it was found that the contributions of the three major components were greater than those of the other components.
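Selecting one representative trait per component amounts to finding the trait with the largest absolute loading on that component. The sketch below is a minimal illustration on random placeholder data; it treats the loadings as unit-length eigenvector coefficients of the correlation matrix, which is an assumption about how the values in Table 3 were scaled, and the trait abbreviations follow those used in this article.

```python
import numpy as np

# Trait abbreviations as used in the article.
trait_names = ["HI", "GY", "B", "S", "TKW", "G", "GS", "PH"]

# Placeholder data: 25 genotypes x 8 traits (random values, NOT the study's data).
rng = np.random.default_rng(2)
data = rng.normal(size=(25, 8))

corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)

# Sort components by decreasing eigenvalue; columns of eigvecs are the components.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# For each of the first three components, pick the trait with the largest
# absolute loading; the sign only indicates the direction of the association.
dominant = []
for k in range(3):
    idx = int(np.argmax(np.abs(eigvecs[:, k])))
    dominant.append((f"PC-{k + 1}", trait_names[idx], float(eigvecs[idx, k])))

for pc, trait, loading in dominant:
    print(pc, trait, round(loading, 2))
```

The sign of an eigenvector is arbitrary, so only the relative signs of the loadings within one component carry meaning.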

Twenty-five geographically separated wheat genotypes were statistically analysed and clustered based on various yield traits: the harvest index (HI), grain yield m−2 (GY), biomass (B), spikes m−2 (S), 1000-grain weight (TKW), grain number m−2 (G), grains per spike (GS) and plant height (PH).

All four clusters were analysed according to their means and standard deviations (Table 4). The mean values for grain yield (742.2), biomass (1756.7), grains/m2 (18,373.7), and grains/spike (45.3) were higher in cluster C than in the other clusters. Cluster D exhibited the maximum value for TKW (46.6). The dendrogram was constructed based on cluster analysis of yield traits (Figure 1).
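The per-cluster summary behind a table of this kind can be computed as in the following sketch. The trait values are random placeholders rather than the study's measurements; only the cluster sizes (7, 5, 8 and 5 genotypes for clusters A to D) are taken from the text, and the use of sample statistics (`ddof=1`) is an assumption.

```python
import numpy as np

# Placeholder grain-yield values for 25 genotypes (random, NOT the study's data).
rng = np.random.default_rng(3)
grain_yield = rng.normal(loc=600.0, scale=80.0, size=25)

# Cluster membership with the cluster sizes reported in the text.
labels = np.array(list("A" * 7 + "B" * 5 + "C" * 8 + "D" * 5))

# Mean, sample standard deviation and sample variance per cluster.
summary = {}
for cluster in "ABCD":
    values = grain_yield[labels == cluster]
    summary[cluster] = (values.mean(), values.std(ddof=1), values.var(ddof=1))

# Cluster with the highest mean for this trait.
best = max(summary, key=lambda c: summary[c][0])

for cluster, (mean, sd, var) in summary.items():
    print(cluster, round(float(mean), 1), round(float(sd), 1), round(float(var), 1))
print("highest mean:", best)
```

Repeating the loop for each trait gives one block of the table per trait, which is how the cluster with the maximum mean value for a given trait can be identified.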

Cluster analysis showed that cluster A included 7 genotypes, cluster C contained 8 genotypes and clusters B and D each contained 5 genotypes. Cluster A included a combination of genotypes (CIMMYT collection, local wheat varieties and landraces). Cluster A contained genotypes closely related to those in cluster B. Apparently, this cluster included accessions from the Krasnodar collection (Krasnodar 99, Tanya, Kroshka, and Grom) and the semi-arid wheat collection (171,358). Clusters A and B displayed similar values for grain yield, biomass and spikes m−2.

Table 2. Principal component analysis of different yield traits in wheat.

Table 3. Factor loadings for various traits.

HI, harvest index; TKW, thousand-kernel weight.

Table 4. Means, standard deviations and variances for clusters based on morpho-physiological traits.

C-A, C-B, C-C and C-D are clusters. HI, harvest index; TKW, thousand kernel weight.

Figure 1. Tree diagram of 25 wheat genotypes based on different yield traits.

Although these two clusters have different geographic origins, they are similar in yield indices, with the intensive winter wheat varieties closely resembling the extensive local varieties (Oq marvarid, Bardosh, Pahlavon, Sayhun, and Kayrattosh). All these varieties are early maturing. Early-maturing genotypes are important under late sowing conditions because they escape the effects of high temperature, especially during the reproductive stage [28] [32] [33].

The first group of cluster C included genotypes from the CIMMYT collection that are adapted to local conditions (Ilg’or, Ezoz and 1326). These varieties showed practical benefit as markers for yield traits. The second group of cluster C contained semi-arid drought-tolerant wheat varieties. The two groups of cluster C genotypes had the highest values for TKW, spikes m−2 and grain yield; however, they were grown under different conditions (well watered or drought stress).

Cluster D was composed of local spring wheat landraces adapted to rain-fed conditions. These varieties were characterized by the highest values for plant height and grain quality and a low value for grain yield. The dendrogram showed close relationships between the varieties Qorakiltik, Grekum, Surxak, Khivit and Oq bug’doy, which are considered local landraces [3] [26].

4. Conclusion

The results of PC analysis revealed the main components that contributed most to the evaluation of high yield in bread wheat genotypes. Grains/m2 had the largest loading for the first component, plant height had the largest loading for the second component, and 1000-grain weight had the largest loading for the third component. Cluster analysis grouped genotypes with similar origins into four clusters. The results showed that genotypes with wide genetic diversity can be utilized in future breeding programmes to obtain high-yield genotypes/hybrids adapted to water-scarce areas.


Acknowledgements

We are grateful to the members of Grain Genetics for their help with the fieldwork, to Dr. Khurshid Turakulov and Dr. Bakhodir Chinnikulov for their help in collecting wheat landraces, to CIMMYT and ICARDA for providing wheat accessions, and to the Ministry of Innovations for funding.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.


References

[1] Dixon, J., Braun, H.J., Kosina, P. and Crouch, J. (2009) Wheat Facts and Futures. Preface, CIMMYT, Mexico, DF, p. 15.
[2] Baboev, S.K., Turakulov, Kh.S. and Khasanov, B.A. (2014) Genes for Wheat Resistance to Yellow Rust and the Role of Epiphytotics in the Emergence of New Races. Russian Journal of Genetics, 50, 261-266.
[3] Baboev, S., Morgounov, A. and Muminjanov, H. (2015) Wheat Landraces in Farmers’ Fields in Uzbekistan: National Survey, Collection, and Conservation. FAO, Ankara.
[4] Ahmad, H.M., Awan, S.I., Aziz, O. and Ali, M.A. (2019) Multivariative Analysis of some Metric Traits in Bread Wheat (Triticum aestivum L.). Journal of Pharmacognosy and Phytochemistry, 8, 4834-4839.
[5] Li, W. and Gill, B.S. (2004) Genomics for Cereal Improvement. In: Gupta, P.K. and Varshney, R.K., Eds., Cereal Genomics, Kluwer, Dordrecht, 585-634.
[6] Jamali, K.D. and Ali, S.A. (2008) Yield and Yield Components with Relation to Plant Height in Semidwarf Wheat. Pakistan Journal of Botany, 40, 1805-1808.
[7] Gupta, P.K., Rustgi, S. and Kumar, N. (2006) Genetic and Molecular Basis of Grain Size and Grain Number and Its Relevance to Grain Productivity in Higher Plants. Genome, 49, 565-571.
[8] Kasyanenko, A.N. (1989) The Use of Multivariate Statistical Analysis in Plant Breeding. Proceedings of the All-Sovjet-Union Consuation, Simferopol, Yalta, 26-28 September 1989, 38-39.
[9] Devesh, P., Moitra, P.K., Shukla, R.S. and Pandey, S. (2019) Genetic Diversity and Principal Component Analyses for Yield, Components and Quality Traits of Advanced Lines of Wheat. Journal of Pharmacognosy and Phytochemistry, 8, 4834-4839.
[10] Beheshtizadeh, H., Rezaie, A., Rezaie, A. and Ghandi, A. (2013) Principal Component Analysis and Determination of the Selection Criteria in Bread Wheat (Triticum aestivum L.) Genotypes. The International Journal of Agriculture and Crop Sciences, 5, 2024-2027.
[11] Lysenko, A.A. (2011) Comparative Productivity of Pea Varieties of Various Morphotypes and the Creation of a New Selection Material on Their Basis. Zernograd, 23.
[12] Brown-Guedira, G.L., Thompson, J.A., Nelson, R.L. and Warburton, M.L. (2000) Evaluation of Genetic Diversity of Soybean Introductions and North American Ancestors Using RAPD and SSR Markers. Crop Science, 40, 815-823.
[13] Ziegel, E. (2002) Editor’s Report on Encyclopedia of Environmetrics, Vols. 1-4. Technometrics, 44, 408-409.
[14] Mujaju, C. and Chakuya, E. (2008) Morphological Variation of Sorghum Landrace Accessions On-Farm in Semi-Arid Areas of Zimbabwe. International Journal of Botany, 4, 376-382.
[15] Akter, A., Hasan, M.J., Paul, A.K., Mutlib, M.M. and Hossain, M.K. (2009) Selection of Parent for Improvement of Restorer Line in Rice (Oryza sativa L.). SAARC Journal of Agriculture, 7, 43-50.
[16] Mustafa, H.S., Farooq, J., Ejaz-ul-Hasan, Bibi, T. and Mahmood, M. (2015) Cluster and Principle Component Analyses of Maize Accessions under Normal and Water Stress Conditions. Journal of Agricultural Sciences, 60, 33-48.
[17] Ahmadizadeh, M., Nori, A., Shahbazi, H. and Habibpour, M. (2011) Effects of Drought Stress on Some Agronomic and Morphological Traits of Durum Wheat (Triticum durum Desf.) Landraces under Greenhouse Condition. African Journal of Biotechnology, 10, 14097-14107.
[18] Ibyatov, R.I. (2016) Factor Analysis of the Data Effecting Wheat Productivity. In: Ibyatov, R.I., Shaykhutdinov, F.Sh. and Valiev, A.A., Eds., Collection of the Works of the International Science-Practical Conference. Agrarian Science of XXI Century. Urgent Research and Prospects, Kazan SAU, Kazan, 77-79.
[19] Hailegiorgis, D., Mesfin, M. and Genet, T. (2011) Genetic Divergence Analysis on Some Bread Wheat Genotypes Grown in Ethiopia. Journal of Central European Agriculture, 12, 344-352.
[20] Chekalin, N.M., Tishchenko, V.N. and Batashova, M.E. (2008) Selection and Genetics of Individual Cultures. FOP Govorov, S.V., Poltava, 368 p.
[21] Ali, Y., Khan, M.A., Hussain, M., Atiq, M. and Ahmad, J.N. (2019) An Assessment of the Genetic Diversity in Selected Wheat Lines Using Molecular Markers and PCA-based Cluster Analysis. Applied Ecology and Environmental Research, 17, 931-950.
[22] Awan, F.K., Khurshid, M.Y., Afzal, O., Ahmed, M. and Chaudhry, A.N. (2014) Agro-morphological Evaluation of Some Exotic Common Bean (Phaseolus vulgaris L.) Genotypes under Rainfed Conditions of Islamabad, Pakistan. Pakistan Journal of Botany, 46, 259-264.
[23] Yadav, S.K., Singh, A.K. and Malik, R. (2015) Genetic Diversity Analysis Based on Morphological Traits and Microsatellite Markers in Barley (Hordeum vulgare). Indian Journal of Agricultural Sciences, 85, 37-44.
[24] Fufa, H., Baenziger, P.S., Beecher, B.S., Dweikat, I., Graybosch, R.A. and Eskridge, K.M. (2005) Comparison of Phenotypic and Molecular-Based Classifications of Hard Red Winter Wheat Cultivars. Euphytica, 145, 133-146.
[25] Lin, C.S., Binns, M.R. and Lefkovitch, L.P. (1986) Stability Analysis: Where Do We Stand. Crop Science, 26, 894-900.
[26] Baboev, S.K., Buranov, A.K., Bozorov, T.A., Adilov, B.S. and Morgunov, A.I. (2017) Studying of Local Wheat Landraces of Uzbekistan. Journal of Agricultural Biology, 52, 553-560.
[27] Dospekhov, B.A. (1985) Metodika polevogo opyta. [Methods of Field Trials.] Moscow. (In Russian)
[28] Sayre, K.D., Rajaram, S. and Fischer, R.A. (1997) Yield Potential Progress in Short Bread Wheat in Northwest Mexico. Crop Science, 37, 36-42.
[29] Ward, J.H. (1963) Hierarchical Grouping to Optimize an Objective Function. Journal of the American Statistical Association, 58, 236-244.
[30] Khodadadi, M., Fotokian, M.H. and Miransari, M. (2011) Genetic Diversity of Wheat (Triticum aestivum L.) Genotypes Based on Cluster and Principal Component Analyses for Breeding Strategies. AJCS, 5, 17-24.
[31] Degewione, A. and Alamerew, S. (2013) Genetic Diversity in Bread Wheat (Triticum aestivum L.) Genotypes. Pakistan Journal of Biological Sciences, 16, 1330-1335.
[32] Mondal, S., Singh, R.P., Mason, E.R., Huerta-Espino, J., Autrique, E. and Joshi, A.K. (2016) Grain Yield, Adaptation and Progress in Breeding for Early-Maturing and Heat-Tolerant Wheat Lines in South Asia. Field Crops Research, 192, 78-85.
[33] Kaur, C. (2017) Performance of Wheat Varieties under Late and Very Late Sowing Conditions. International Journal of Current Microbiology and Applied Sciences, 6, 3488-3492.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.