Genetic Divergence in Mango and Obtaining Minimum Efficient Descriptors

Mangifera indica (mango) is a typically tropical fruit with considerable economic value. Brazil has a wide variety of mango cultivars, most of which are known under several different names. Indeed, the nomenclature of mango varieties is still quite confusing, and so far there has been no well-defined scientific principle to differentiate them. The objective of the present work is to compare different clustering methods in assessing genetic divergence among mango accessions, and to identify the minimum efficient descriptors for the crop. A total of 20 mango accessions in Cáceres, Mato Grosso state, Brazil were evaluated. To build the dissimilarity matrices, the descriptors were divided into the following groups: leaf, flower/inflorescence, fruit, seed and growth habit/ripening period. All combinations of these descriptor groups were then formed. The similarity index was used to obtain the dissimilarity matrices. The accessions were then clustered using the Tocher, Ward and UPGMA methods. The study observed that it was possible to reduce the number of descriptors from 64 to 35, and that the clustering methods were compatible with the study of the genetic diversity of mango.


Introduction
Mango (Mangifera indica L.) is a fruit species native to Asia and grown widely in tropical and subtropical countries. In Brazil, there are currently several mango varieties, obtained through breeding programs or by producer selection [1]. Mango is the sixth most important fruit tree in Brazil in terms of area, which extends over 75.2 thousand hectares, and it ranks third in export volume, which amounted to 124.6 thousand tons in 2010 [2].
The use of multivariate analysis techniques in the study of the genetic diversity of mango trees is of great importance, because it not only provides information on parent plants with the potential for use in breeding programs but also allows the characterization of accessions and thereby facilitates the identification of duplicates. These techniques make it possible to evaluate a set of traits jointly, taking existing correlations into account.
Morphological characters combined with multivariate techniques have been widely used to quantify genetic distance [3], for example in crops such as pepper [4], cassava [5], soybeans [6], goatweed [7] and turmeric [8].
To facilitate interpretation in multivariate analysis, clustering methods are used to divide an original group into subgroups to ensure homogeneity within, and heterogeneity among, subgroups [3]. According to Cruz and Carneiro [9], the clustering methods most commonly used by breeders are optimization and hierarchical methods, which can be differentiated based on both the type of result and the manner in which individuals are clustered.
Given the great economic importance of mango in Brazilian markets and the scarcity of research on this crop (especially on its genetic diversity), the present work aims to obtain the minimum efficient descriptors for the crop and to compare different clustering methods for assessing genetic divergence among mango tree accessions.
Twenty accessions were evaluated, from nine varieties known popularly as Banana, Bourbon, Coquinho, Espada, Haden, Keitt, Maçã, Rosa and Tommy Atkins. These accessions were collected in backyards within the municipality of Cáceres.
In each accession, 64 morpho-agronomic descriptors, regarded as essential for executing assays on distinctness, uniformity and stability (DUS) of Mangifera spp. accessions [11], were analyzed, comprising the following traits:
1) Leaf: anthocyanin-derived color; petiole length; position in relation to the stem; symmetry; length; width; ratio; predominant shape; margin undulation; base shape; apex shape.
2) Flower and Inflorescence: axis position; length; width; shape; color of the main and secondary rachis; rachis pubescence; leaf-shaped bracts; size; stamen placement in relation to the style; length of the fertile stamen in relation to the style; anthocyanin-derived color.
3) Fruit: length; width; length/width ratio; shape; epidermis color; waxiness; depth of stalk cavity; prominence at the base of the pedicel; base of the pedicel; shape of the ventral base; shape of the dorsal base; reentrance; reentrance depth; protuberance near the pistil scar; shape of the pistil scar; amount of latex in the peduncle; predominant epidermis color; distribution of peel color; conspicuity of lenticels; density of lenticels; size of lenticels; peel thickness; peel weight; peel attachment to pulp; main peel color; juiciness; pulp weight; pulp fibrousness; amount of fiber attached to the pit; amount of fiber in the pulp underneath the peel; pulp firmness; turpentine; soluble solids; acidity; soluble solids/acidity ratio.
4) Seed and Pit: endocarp surface relief; pit weight; kernel length in relation to the pit; kernel shape; embryo.

5) Ripening Period and Growth Habit:
To build the dissimilarity matrices, we divided all 64 descriptors into five groups, namely: growth habit and ripening period, leaf, flower and inflorescence, fruit, and seed/pit. The main matrix consisted of all 64 descriptors, and every possible combination of the five above-mentioned descriptor groups was formed, totaling 31 combinations (matrices).
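The count of 31 matrices follows from taking every non-empty subset of the five descriptor groups (2^5 - 1 = 31). A minimal Python sketch of that enumeration is shown here; the function and variable names are illustrative and are not taken from the original analysis, which was run in GENES.

```python
from itertools import combinations

# The five descriptor groups named in the text.
GROUPS = [
    "growth habit and ripening period",
    "leaf",
    "flower and inflorescence",
    "fruit",
    "seed/pit",
]


def group_combinations(groups):
    """Return every non-empty combination of the descriptor groups."""
    subsets = []
    for size in range(1, len(groups) + 1):
        subsets.extend(combinations(groups, size))
    return subsets


subsets = group_combinations(GROUPS)
print(len(subsets))  # 31, i.e. 2**5 - 1
```

The single combination that contains all five groups corresponds to the main matrix built from all 64 descriptors.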
The matrices were built using the multicategorical variables methodology, as per Cruz and Carneiro [9], through the equation:
$$d_{ii'} = \frac{D}{C + D}$$
in which d_{ii'} = dissimilarity considering a set of multicategorical variables; D = category disagreement; C = category concordance. To avoid indeterminacy in cases where the coefficient is equal to "0", the inverse of the similarity coefficient plus one unit was used [9].
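As an illustration of how such a matrix can be assembled, the sketch below assumes the index takes the simple-matching form D / (C + D), where D and C are the numbers of descriptors on which two accessions disagree and agree. The toy scores are hypothetical, the function names are illustrative, and the "inverse plus one unit" adjustment mentioned in the text is deliberately not applied here.

```python
def multicategorical_dissimilarity(a, b):
    """Return D / (C + D) for two equally long lists of category codes."""
    if len(a) != len(b) or not a:
        raise ValueError("accessions must share a non-empty descriptor list")
    disagreements = sum(1 for x, y in zip(a, b) if x != y)  # D
    return disagreements / len(a)  # len(a) equals C + D


def dissimilarity_matrix(accessions):
    """Build the symmetric pairwise dissimilarity matrix as a list of lists."""
    n = len(accessions)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = multicategorical_dissimilarity(accessions[i], accessions[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical category codes for three accessions scored on four descriptors.
toy_scores = [
    [1, 2, 3, 1],
    [1, 2, 3, 2],
    [3, 1, 2, 2],
]
matrix = dissimilarity_matrix(toy_scores)
print(matrix[0][1], matrix[0][2], matrix[1][2])  # 0.25 1.0 0.75
```

Restricting each accession's list to the descriptors of one group, or of a combination of groups, would give the matrix for that combination.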
The matrices of the multicategorical distances between the accessions were used as a measure of dissimilarity for cluster analysis of the accessions using the Tocher optimization method and the Ward and UPGMA hierarchical methods [9,12].
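To make the optimization step concrete, the following is a simplified sketch of Tocher-style grouping, not the GENES implementation. It assumes the usual description of the method: the threshold is the largest of the nearest-neighbour distances, each group is seeded with the closest remaining pair, and an individual joins a group while its mean distance to the group stays at or below the threshold. Treating leftover individuals as single-member groups when no remaining pair meets the threshold is a choice made for this sketch, and the toy distances are hypothetical.

```python
from itertools import combinations


def tocher(dist):
    """Group indices 0..n-1 from a symmetric distance matrix (list of lists)."""
    n = len(dist)
    if n < 2:
        return [list(range(n))] if n else []
    # Threshold: the largest of the nearest-neighbour distances.
    theta = max(min(dist[i][j] for j in range(n) if j != i) for i in range(n))
    remaining = set(range(n))
    groups = []
    while remaining:
        pairs = list(combinations(sorted(remaining), 2))
        seed = min(pairs, key=lambda p: dist[p[0]][p[1]]) if pairs else None
        if seed is None or dist[seed[0]][seed[1]] > theta:
            # Sketch-level choice: leftovers become single-member groups.
            groups.extend([i] for i in sorted(remaining))
            break
        group = list(seed)
        remaining -= set(seed)
        while remaining:
            # Candidate with the smallest mean distance to the current group.
            best = min(
                sorted(remaining),
                key=lambda i: sum(dist[i][g] for g in group) / len(group),
            )
            mean_distance = sum(dist[best][g] for g in group) / len(group)
            if mean_distance > theta:
                break
            group.append(best)
            remaining.remove(best)
        groups.append(sorted(group))
    return groups


# Hypothetical distances: individuals 0-2 are close; 3 and 4 are apart from them.
toy_dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.3, 0.9, 0.9],
    [0.2, 0.3, 0.0, 0.8, 0.9],
    [0.9, 0.9, 0.8, 0.0, 0.6],
    [0.8, 0.9, 0.9, 0.6, 0.0],
]
groups = tocher(toy_dist)
print(groups)  # [[0, 1, 2], [3, 4]]
```

In this toy case the threshold is 0.6, so individual 2 joins the seed pair (mean distance 0.25), while individuals 3 and 4 are too far from the first group and form their own.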
All genetic-statistical analyses were carried out using GENES software version 2011 [13].

Results and Discussion
Based on the genetic dissimilarity matrix obtained using all descriptors (main matrix), the Ward hierarchical method clustered the 20 accessions into four groups. Group I allocated the four accessions of Keitt mango; Group II consisted of the two accessions of Rosa mango and the two of Maçã mango; Group III allocated the two accessions of Tommy Atkins mango, the two of Coquinho mango and the Haden mango; and Group IV comprised the highest number of accessions, 35% of the total, formed by two accessions of Banana mango, three of Bourbon mango, and two of Espada mango.
In the dendrogram of the main matrix, the most distant accessions were a Keitt and a Bourbon accession, whereas two Keitt accessions were the most similar. When the efficiency of the clustering method was evaluated across all possible descriptor-group combinations, most combinations showed the respective minimum and maximum distances between accessions, but the fruit descriptors gave results most equivalent to those of the main matrix and can therefore be established as the minimum efficient descriptors.
For the clusters obtained with the UPGMA method using all descriptors, some accessions were allocated differently from the Ward dendrogram. With this method, the 20 evaluated accessions were clustered into four groups. Group I allocated the four accessions of Keitt mango and the two accessions of Tommy Atkins mango, totaling six accessions, that is, 30%. Group II was formed only by Haden mango. Group III combined the largest number of accessions, 35% of the total: two accessions of Banana mango, three of Bourbon mango and two of Espada mango. Lastly, Group IV had both accessions of Rosa mango, both of Coquinho mango and both of Maçã mango.
The Tocher method combined the 20 accessions into four different groups. Groups I and II were the largest, allocating respectively 50% and 35% of the 20 evaluated accessions. Group I allocated the two accessions of Banana mango, three of Bourbon mango, two of Espada mango, one of Coquinho mango, one of Maçã mango and one of Rosa mango. Group II combined the four accessions of Keitt mango, both accessions of Tommy Atkins mango and one of Coquinho mango; Group IV was formed only by Haden mango.
When the efficiency of the clustering methods was evaluated across all possible descriptor-group combinations, most combinations showed the respective maximum and minimum distances between accessions, but only the fruit descriptors were efficient in differentiating the accessions. Moreover, when the fruit descriptors, rather than all 64 descriptors recommended for mango, were used, the clusters obtained by the different methods were more equivalent.
Figure 1 shows the dendrogram obtained with the Ward method from the 35 fruit traits. Three groups were formed, which differs from the grouping obtained using all descriptors. The accessions that showed the highest similarity were 19 and 20 (Keitt mango) and 1 and 2 (Banana mango), while the most divergent were 19 (Keitt mango) and 3 (Bourbon mango).
In the dendrogram obtained using the UPGMA method (Figure 2), the 20 accessions were allocated into four groups, which differed little from those obtained using all descriptors. The groups formed by the UPGMA and Ward methods were quite equivalent, with the exception of accession 10 (Haden mango), which was isolated in the UPGMA dendrogram; the most similar and most divergent accessions were the same for both methods.
The clusters obtained in the Tocher method (Table 1) and UPGMA were the same. These two methods are often used in works on genetic diversity, and they most often show similar clusters, as in the work by Neitzke et al. [14], who assessed the genetic diversity of melons, and the study by Neto et al. [15], who evaluated the genetic divergence in castor bean using quantitative descriptors.

Table 1. Representation of the clusters generated by the Tocher optimization method based on the dissimilarity between the 20 accessions of Mangifera indica, using the 64 descriptors (main matrix) for the species.
Groups   Accessions   % of accession allocation

Each mango variety has a unique combination of traits, setting it apart from other varieties. These sets of distinct morphological traits form the basis on which the varieties can be told apart. Certain traits, however, are hardly informative for differentiating varieties by name [16]. In this study, it was the fruit traits that contributed most to the accession clusters.
The three methods made it possible to identify the genetic variability among the evaluated accessions. This variability was greatest between the different varieties studied, as accessions of the same variety remained within the same group. Several works on the genetic diversity of mango have used molecular markers in particular [17][18][19]. The evaluation of genetic diversity through morphological characteristics has also been widely employed [8,20]. This methodology is quite efficient, as demonstrated by Ramessur et al. [16], who evaluated mango tree diversity through RAPD markers and morphological traits. In that study, both molecular and morphological methods were efficient in reporting genetic variability.
In the present study, it is evident that, first, fruit traits alone make it possible to evaluate genetic diversity and, second, hierarchical and optimization methods are equivalent in forming the groups.

Conclusion
The use of only the 35 fruit descriptors, instead of all 64 descriptors, makes it possible to obtain, with greater efficiency, the genetic dissimilarity among accessions of Mangifera indica L. The Tocher, UPGMA and Ward methods were in agreement in allocating the 20 evaluated accessions.