Rapid Differentiation of Closely Related Citrus Genotypes by Fluorescence Spectroscopy

The differentiation of closely related Citrus genotypes is a meticulous, laborious, and time-consuming task that involves the assessment of complex traits such as growth, tolerance to stress, photosynthetic efficiency, and yield. It is generally accomplished either by analyzing specific features of adult plants or by applying molecular markers to young trees. On the one hand, distinct genotypes can be differentiated by fruit size, shape, taste, and number of seeds only after the plants start yielding. On the other hand, molecular markers are expensive and demand expertise and time when a large number of plants must be analyzed. For these reasons, techniques that could assist in an early, quick, and accurate differentiation of closely related Citrus varieties are of great importance. In this context, laser-induced fluorescence spectroscopy (LIFS) is a promising technique, since it is rapid, highly sensitive, and inexpensive. Previous studies showed that LIFS can differentiate varieties of sweet orange. The present study aimed to determine the accuracy of LIFS in differentiating and grouping four very closely related Sunki mandarin selections: Comum, Florida, Tropical, and Maravilha. Furthermore, we compared the results with those of ISSR and SSR molecular markers for the same selections. The LIFS technique distinguished the four selections with accuracy greater than 70%. The molecular markers clearly distinguished Tropical from Maravilha, but could not separate Comum from Florida. These results suggest that LIFS may be a sound tool for assisting in the identification of closely related Citrus varieties.
D. D. S. Santana-Vieira et al.


Introduction
Species of the genus Citrus (Linnaeus) are cultivated worldwide and have great economic importance. The great diversity of species, hybrids, and clones [1] observed in this genus originated from three base species, which are regarded as true species: mandarin (Citrus reticulata Blanco), pomelo [Citrus grandis (L.) Osbeck], and citron (Citrus medica L.) [2]. Among all Citrus species, the orange is probably one of the most important in the world due to its high consumption in many countries. Brazil is the world's largest producer of concentrated orange juice. Although the Brazilian production potential is much higher, it has been restricted in recent years by problems caused by different pathogens [3].
The difficulty in controlling these pathogens arises from the predominant use of a single rootstock, mainly the Rangpur lime (Citrus limonia Osbeck), in Citrus groves. A strategy to minimize the vulnerability of groves is the use of different rootstocks [4] [5], and a promising alternative for this purpose is the Sunki mandarin [Citrus sunki (Hayata) hort. ex Tanaka]. This species originates from southeastern China [6] and is compatible, as a rootstock, with several scion varieties, including sweet orange [Citrus sinensis (L.) Osbeck], grapefruit (Citrus paradisi Macfad.), and mandarins (several species) [7]. The use of Sunki mandarin as rootstock results in vigorous scions with high fruit yield; tolerance to Citrus tristeza virus, xyloporosis, psorosis, Citrus decline, and Citrus sudden death; adaptation to saline soils; and moderate tolerance to drought [7] [8]. However, the variety Common Sunki shows high susceptibility to Phytophthora root rot and a reduced number of seeds per fruit [9].
Two selections of Sunki mandarin were reported by the Citrus Breeding Program at EMBRAPA Mandioca e Fruticultura (Citrus PMG-CNPMF): Tropical [5], selected from Common Sunki, and Maravilha [10], selected from Florida Sunki. Both usually show maternal traits, but with higher numbers of seeds, a higher percentage of polyembryony, and resistance to Phytophthora root rot [10]. Morphological traits are currently used to identify these four selections (Common, Florida, Tropical, and Maravilha), but this approach is very costly because of the long time required for fruit production and subsequent evaluation.
Another widely used tool for the identification of Citrus species and varieties is molecular markers, such as SSRs (simple sequence repeats) [11]-[13] and ISSRs (inter-simple sequence repeats) [14]-[17]. These markers are advantageous because they allow the use of any plant tissue at any developmental stage. However, their use in breeding programs or by seedling producers that handle a large number of plants is unfeasible because of the high cost and time required for analysis [13] [18]. Alternative tools that can be complementary in this type of selection have been extensively researched. One promising possibility is the use of photonic techniques [19]-[26].
Studies using spectroscopic techniques have been carried out to differentiate species of forest trees [19], varieties of Syrian wheat [20], and commercial cultivars of strawberry [21]. A potential alternative is fluorescence spectroscopy, which does not require intensive sample treatment and is highly sensitive, inexpensive, and fast, and hence can be employed in large-scale analyses.
This technique has been applied to the diagnosis of biotic and abiotic stresses [22]-[24], including the differentiation of Citrus diseases such as Huanglongbing (HLB, formerly greening) and Citrus variegated chlorosis (CVC) [25] [26]. More recently, laser-induced fluorescence spectroscopy (LIFS) was used to identify Citrus varieties of sweet orange with a success rate close to 100% [27]. Based on these previous results, this study aimed to: 1) verify the accuracy of LIFS in differentiating very closely related varieties such as the Sunki selections; 2) differentiate the Sunki selections using SSR and ISSR molecular markers; and 3) compare the results obtained by both techniques using statistical tools, pattern recognition algorithms, and clustering methods to build dendrograms.

Samples
The Sunki mandarin selections analyzed in this study, Tropical (TSKTR), Maravilha (TSKMA), Florida (TSKFL), and Comum (TSKC), belong to the Citrus Germplasm Bank of EMBRAPA Mandioca e Fruticultura, Cruz das Almas, Bahia, Brazil. All Sunki plants selected at EMBRAPA and evaluated in this study were clones grafted on different rootstocks.
Each of these selections was grafted onto two different rootstocks. TSKC and TSKFL were grafted onto Cleopatra mandarin (Citrus reshni hort. ex Tanaka); TSKTR and TSKMA were grafted onto Santa Cruz Rangpur lime; and all selections were grafted onto Volkameriano lemon (Citrus volkameriana V. Ten. & Pasq.). For molecular analyses, three leaves of each Sunki selection were collected, all from plants on the same rootstock (Volkameriano lemon). For LIFS measurements, 10 leaves were collected from each selection on both of its rootstocks, as shown in Table 1.

Laser-Induced Fluorescence Spectroscopy-LIFS
Two different LIFS systems were used for the analysis of the four selections, both of which were assembled at the Laboratory of Optics and Photonics of Embrapa Instrumentação. The main difference between the two systems is the excitation source: a 561 nm COMPASS diode laser in one and a 405 nm Cube laser in the other, both produced by Coherent. The systems are henceforth referred to as LIFS-561 and LIFS-405, according to their excitation wavelengths. Spectroscopic measurements were performed using USB2000 mini-spectrometers manufactured by Ocean Optics, covering 194 to 894 nm for LIFS-405 and 500 to 1200 nm for LIFS-561. Figure 1 shows a schematic of the LIFS system. In both systems, a fluorescence probe composed of six outer optical fibers is used to excite the sample, whereas a central fiber captures the fluorescence emission signal. An adjustable optical filter and a notebook running a program specially designed to collect and process the data complete each system.
The adjustable spectrometer parameters were integration time, number of averages, and boxcar. The first is the time during which the spectrometer collects light emission for each spectrum, the second sets the number of spectra averaged per measurement, and the third is a smoothing method that averages adjacent spectral points. For the LIFS-405 system, the following acquisition parameters were adopted: integration time of 60 ms, 20 spectral averages, and boxcar 2. For the LIFS-561 system, the parameters were: integration time of 2 ms, 20 spectral averages, and boxcar 2. For each measurement, the spectrum was collected after a time interval of 2.5 seconds, which was sufficient for the intensity to reach its maximum value. LIFS measurements were performed on leaves in natura, at the bottom of the abaxial side next to the midrib.
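The effect of the averaging and boxcar settings can be illustrated with a minimal sketch. It assumes that the number of averages denotes a point-by-point mean over repeated scans and that a boxcar width w denotes a moving average over the 2w + 1 neighbouring points, which is the usual convention in Ocean Optics software. The function names, the truncated window at the spectrum edges, and the toy numbers are illustrative assumptions and do not reproduce the acquisition program used in this study.

```python
def average_scans(scans):
    """Point-by-point mean of several scans of equal length."""
    n = len(scans)
    return [sum(values) / n for values in zip(*scans)]


def boxcar_smooth(spectrum, width):
    """Moving average over up to 2*width + 1 neighbouring points.

    Near the ends of the spectrum the window is truncated
    (an assumption of this sketch, not a documented instrument behaviour).
    """
    smoothed = []
    for i in range(len(spectrum)):
        lo = max(0, i - width)
        hi = min(len(spectrum), i + width + 1)
        window = spectrum[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Toy data: three scans of a five-point spectrum (made-up numbers).
scans = [
    [0.0, 1.0, 4.0, 1.0, 0.0],
    [0.0, 3.0, 6.0, 3.0, 0.0],
    [0.0, 2.0, 5.0, 2.0, 0.0],
]
mean_spectrum = average_scans(scans)
smoothed = boxcar_smooth(mean_spectrum, 2)  # boxcar 2, as in the text
```

Averaging repeated scans reduces random noise without changing the spectral resolution, whereas a wider boxcar trades resolution for a smoother curve.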

DNA Extraction, ISSR and SSR Analysis
Total genomic DNA extraction was performed using the Invisorb Spin Plant Mini Kit (Invitek, Berlin, Germany). The concentration and quality of the samples were evaluated by electrophoresis in 0.8% agarose gels and with a NanoDrop™ 8000 (Thermo Fisher Scientific, Wilmington, DE, USA). The final concentration of all samples was standardized to 10 ng·μL⁻¹.
For DNA amplification, an Eppendorf Mastercycler Gradient thermocycler was employed. Six ISSR primers, presented in Table 2, were used; the PCR program and reaction conditions followed the procedure described in [28]. The amplified fragments were separated by electrophoresis in 2% agarose gels in 1X TBE, stained with ethidium bromide (0.5 µg/100 mL of 1X TBE), at 90 V for 2-3 h. The bands were visualized using a Loccus UV transilluminator.

Data Processing and Classification Models
To make them comparable, the LIFS spectra were processed by applying offset removal and area normalization. The offset of each spectrum was determined as its average intensity between 875 and 880 nm, and the normalization was performed by dividing each spectrum by its own area. Thus, the spectra differ only in how the intensity is distributed across wavelengths, not in their overall signal level.
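The two preprocessing steps can be sketched as follows. This is a minimal illustration that assumes each spectrum is available as two parallel lists (wavelengths in nm and intensities) and that the area is computed with the trapezoidal rule; the function name and the integration rule are assumptions of this sketch, since the text does not specify them.

```python
def preprocess(wavelengths, intensities, offset_range=(875.0, 880.0)):
    """Subtract the mean intensity inside offset_range, then divide by the area."""
    lo, hi = offset_range
    baseline = [y for x, y in zip(wavelengths, intensities) if lo <= x <= hi]
    if not baseline:
        raise ValueError("no spectral points inside the offset window")
    offset = sum(baseline) / len(baseline)
    corrected = [y - offset for y in intensities]

    # Area under the offset-corrected spectrum (trapezoidal rule, assumed here).
    area = 0.0
    for i in range(1, len(wavelengths)):
        dx = wavelengths[i] - wavelengths[i - 1]
        area += 0.5 * (corrected[i] + corrected[i - 1]) * dx
    if area == 0:
        raise ValueError("spectrum has zero area after offset removal")
    return [y / area for y in corrected]


# Toy spectrum (made-up numbers): the same shape recorded at two signal levels.
wavelengths = [600.0, 700.0, 800.0, 875.0, 880.0]
weak = [12.0, 32.0, 12.0, 2.0, 2.0]
strong = [3.0 * y + 7.0 for y in weak]  # higher gain and a different offset

norm_weak = preprocess(wavelengths, weak)
norm_strong = preprocess(wavelengths, strong)
```

In this toy case the two normalized spectra coincide, which is the purpose of the procedure: differences in gain or background level are removed, and only differences in spectral shape remain.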
To assess whether the leaf spectra of different selections grafted onto the same rootstock can be differentiated by LIFS, classification techniques [29] were employed. "Classify" means selecting, amongst various classes, the one that most closely resembles the tested element according to some previously defined criteria. From the set of processed spectra, one part is randomly chosen for constructing the classifier, whereas the other part is used for testing.
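The random partition and the accuracy figure can be sketched as below. The 70/30 proportion, the fixed seed, and the function names are hypothetical choices made only for illustration; the text does not state the proportion actually used.

```python
import random


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into a training part and a testing part."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def accuracy(true_labels, predicted_labels):
    """Percentage of test elements assigned to their true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)


# Toy usage: 10 leaves for each of two selections, identified by (label, index).
leaves = [("TSKC", i) for i in range(10)] + [("TSKFL", i) for i in range(10)]
train, test = split_train_test(leaves)
example_accuracy = accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])
```

Keeping the test leaves out of the model construction is what makes the reported accuracy an estimate of performance on leaves the classifier has not seen.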
The technique selected for building the classifier was classification via regression, using the partial least squares (PLS) method (e.g., [30]). This method has been widely used for evaluating the concentration of chemical compounds in unknown samples [30] and has the advantage of using the correlation between the input variables (the spectrum points) and the variable of interest, i.e., the concentration of certain compounds. To be applied to classification problems, the PLS regression method has to be adapted to correlate the input data with the classes, i.e., the different selections grafted onto the same rootstock. Since the PLS method was developed for numerical regression, the classes have to be represented by numbers, e.g., 0 for the class to be evaluated and 1 for the others [31]. Thus, one classifier is designed for every class, and the corresponding PLS regression determines the probability of each tested leaf belonging to that class. Other models are designed for the other classes, and the same leaf is tested on all of them. The tested leaf is then assigned to the class with the highest probability.
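The one-model-per-class scheme can be sketched with a deliberately reduced PLS regression. The sketch below fits a PLS1 model with a single latent component (real applications normally use several components and a validated library implementation), encodes the class under evaluation as 0 and the others as 1, as described above, and uses 1 minus the predicted value as a membership score. That score is an assumption of this sketch and is not a calibrated probability; all function names and the toy spectra are hypothetical.

```python
def fit_pls1_one_component(X, y):
    """Fit a PLS1 regression with one latent component (NIPALS first step)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0:
        raise ValueError("X and y are uncorrelated; no component can be extracted")
    w = [v / norm for v in w]
    # Scores of the training samples and regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, q


def predict(model, x):
    """Predicted numeric class code for one spectrum."""
    x_mean, y_mean, w, q = model
    score = sum((xj - mj) * wj for xj, mj, wj in zip(x, x_mean, w))
    return y_mean + q * score


def fit_one_vs_rest(X, labels):
    """One regression model per class: 0 for that class, 1 for the others."""
    models = {}
    for cls in sorted(set(labels)):
        y = [0.0 if lab == cls else 1.0 for lab in labels]
        models[cls] = fit_pls1_one_component(X, y)
    return models


def classify(models, x):
    """Assign x to the class whose model gives the highest membership score."""
    scores = {cls: 1.0 - predict(model, x) for cls, model in models.items()}
    return max(scores, key=scores.get)


# Toy "spectra" with three points each (made-up numbers, generic class names).
X = [
    [1.0, 0.1, 0.0], [0.9, 0.0, 0.1],   # class A
    [0.1, 1.0, 0.0], [0.0, 0.9, 0.1],   # class B
    [0.0, 0.1, 1.0], [0.1, 0.0, 0.9],   # class C
]
labels = ["A", "A", "B", "B", "C", "C"]
models = fit_one_vs_rest(X, labels)
predicted = classify(models, [0.95, 0.05, 0.05])
```

Because the target class is coded as 0, a prediction close to 0 indicates membership, which is why the score is taken as 1 minus the prediction before choosing the maximum.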
To construct the dendrogram that determines the hierarchy between classes, the agglomerative hierarchical clustering method [32] was adopted, with the Euclidean distance as the dissimilarity measure and the Ward method [33] as the criterion for merging classes. This analysis was performed in the R software package [34].
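The clustering step can be sketched in plain Python as follows. The study itself used R; this illustration only shows the logic of Ward's criterion, in which the two clusters whose fusion causes the smallest increase in the within-cluster sum of squared Euclidean distances are merged at each step. The function name, the generic labels, and the two-point toy profiles are hypothetical, and the merge cost reported here may be scaled differently from the heights drawn by a given statistical package.

```python
def ward_linkage(points, labels):
    """Agglomerative clustering with Ward's minimum-variance criterion.

    Returns a list of merges (members_a, members_b, cost), where cost is the
    increase in the within-cluster sum of squares caused by the merge.
    """
    clusters = [
        {"members": (lab,), "centroid": list(p), "size": 1}
        for lab, p in zip(labels, points)
    ]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                sq_dist = sum(
                    (u - v) ** 2 for u, v in zip(a["centroid"], b["centroid"])
                )
                cost = a["size"] * b["size"] / (a["size"] + b["size"]) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        a, b = clusters[i], clusters[j]
        size = a["size"] + b["size"]
        centroid = [
            (a["size"] * u + b["size"] * v) / size
            for u, v in zip(a["centroid"], b["centroid"])
        ]
        merges.append((a["members"], b["members"], cost))
        merged = {"members": a["members"] + b["members"],
                  "centroid": centroid, "size": size}
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Toy class profiles (made-up numbers): A and B are close, C and D are close.
labels = ["A", "B", "C", "D"]
points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.0, 1.2]]
merges = ward_linkage(points, labels)
```

Reading the merges in order reproduces the dendrogram from the leaves upward: the earliest merges join the most similar classes, and the last merge joins the two main groups.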
Differentiation between selections was performed in pairs, according to the number of leaves from each rootstock (see Table 1). The first comparison was carried out between the leaves from TSKC and TSKFL on the Cleopatra mandarin rootstock (CLEO), and the second between the leaves from TSKMA and TSKTR on the Santa Cruz Rangpur lime rootstock (LCRSTC). Finally, all selections on the Volkameriano lemon rootstock (LVK) were compared.
For constructing the dendrogram, 10 leaves of each selection on the Volkameriano lemon rootstock (LVK) were used. When more than 10 leaves were available for a selection, 10 were randomly chosen.
The molecular markers were analyzed by visual assessment. For ISSRs, a binary system was applied, where 1 denotes the presence and 0 the absence of a band; for SSRs, in turn, genotyping was carried out through allele identification. After this evaluation, Jaccard's coefficient was calculated, and the dendrogram was constructed using Ward's clustering method [33] with the Euclidean distance as the dissimilarity measure, in the R software package [34].
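The Jaccard coefficient for binary band profiles can be sketched as below. For two selections it is the number of bands present in both divided by the number of bands present in at least one of them. The band profiles and labels are made-up examples, not the scored gels of this study, and returning 1.0 when neither profile has any band is a convention assumed here.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient between two binary band profiles (1 = band present)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 1.0  # convention for two empty profiles


# Hypothetical profiles over five band positions.
bands = {
    "S1": [1, 1, 0, 1, 0],
    "S2": [1, 1, 0, 1, 0],
    "S3": [1, 0, 1, 1, 1],
}
sim_12 = jaccard_similarity(bands["S1"], bands["S2"])
sim_13 = jaccard_similarity(bands["S1"], bands["S3"])
dissimilarity_13 = 1.0 - sim_13
```

Two selections with identical band profiles have a coefficient of 1 and therefore a dissimilarity of 0, which is how indistinguishable genotypes end up at zero distance in a dendrogram.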

Results
Figure 2 shows typical LIFS spectra of the four Sunki mandarin selections on the LVK rootstock, after baseline correction and normalization by area. The spectral differences amongst the selections are small, especially between TSKC and TSKFL, in agreement with what is observed in the field, even in mature plants. Despite the small spectral differences, a properly designed classifier may be able to identify all selections. Table 4 shows the classification results for LIFS-561. The most successful classification was the one involving the scions TSKMA and TSKTR on the LCRSTC rootstock, with an accuracy of 83.52%, followed by TSKC and TSKFL on the CLEO rootstock, with 78.57%. When only TSKFL, TSKMA, and TSKTR, which had the same number of sampled leaves, were compared on the LVK rootstock, the accuracy was 75.8%; when all selections (TSKC, TSKFL, TSKTR, and TSKMA) were compared with only 10 leaves per selection, the accuracy was reduced to 70.68%.
The classification results for LIFS-405 are presented in Table 5. The best classification was again obtained for the TSKMA and TSKTR scions on the LCRSTC rootstock, with an accuracy of 98.9%. On the LVK rootstock, an accuracy of 73.13% was achieved for TSKFL, TSKMA, and TSKTR, whereas the accuracy was 71.64% for all scions. The lowest value was obtained for the scions TSKC and TSKFL on the CLEO rootstock, with an accuracy of 70.24%.
Forty-nine fragments were generated by the ISSR molecular markers, of which 17 were polymorphic, as shown in Table 6. These fragments made it possible to differentiate the TSKTR and TSKMA selections from each other and from their source selections, TSKC and TSKFL. However, the TSKC and TSKFL selections could not be distinguished using these markers. The same results were obtained with the SSR markers, shown in Table 7: the alleles found in the TSKC and TSKFL selections were the same, whereas differences in the alleles were found in the other selections.
Figure 3 presents the dendrograms generated from the analysis of the molecular markers ISSR (A) and SSR (B), and of the LIFS-405 (C) and LIFS-561 (D) systems, applied to all selections grafted onto Volkameriano lemon. Figures 3(A)-(C) show a similar clustering pattern for all Sunki mandarin selections, where TSKC and TSKFL were clustered in one group, and TSKTR and TSKMA in another. Figure 3(D), however, presents a distinct clustering pattern in which the TSKTR and TSKMA selections were placed in different groups. Note that the y-axis in this figure indicates the Euclidean distance between classes. Therefore, according to the dendrograms of the molecular markers, Figure 3(A) and Figure 3(B), the TSKC and TSKFL selections cannot be separated because the distance between them is zero.

Discussion
The classification obtained using the LIFS-561 and LIFS-405 equipment is quite consistent with morphology. From a morphological standpoint, the TSKMA and TSKTR selections are easier to differentiate from each other and from their respective source varieties, TSKFL and TSKC, because they have a greater number of seeds and higher polyembryony than the other Sunki selections [5] [10]. On the other hand, the differentiation between TSKC and TSKFL is far more complex because of the Sunki-specific features [10]. Despite this complexity, the LIFS technique was capable of differentiating TSKC and TSKFL with at least 70% accuracy. When the comparison was performed with LVK as the rootstock of the Sunki selections, the classification accuracy decreased, mainly as a consequence of the larger number of selections considered, which increases the difficulty of differentiating between them. These results indicate that LIFS is a promising technique for differentiating very closely related varieties: the analysis of fruit, seeds, and polyembryony requires plants to reach their reproductive stage, whereas the LIFS technique can be applied earlier than visual analysis, even before that stage.
The LIFS results presented here are consistent with another study that evaluated the biochemical and genetic variation of wheat varieties and compared near-infrared spectroscopy (NIRS) with RAPD and AFLP markers [20]. Both analyses provided a similar clustering pattern, although the markers evaluated the ploidy level of the varieties, whereas NIRS assessed their chemical composition. This result reveals that the genetic variation pattern reflects the chemical variation observed in the varieties. Likewise, NIRS identified specific genetic features of the wheat germplasm [35]. The ISSR and SSR molecular analysis showed that the morphological difference between TSKTR and TSKMA, already detected by the LIFS systems, has a genetic basis. Furthermore, these selections are differentiated from their source selections, TSKC and TSKFL, respectively. The non-distinction between the TSKC and TSKFL selections may be explained by the number of molecular markers employed. As these selections are highly similar in morphology, a greater number of markers should be employed to find their differences.
SSR markers have been used for various purposes involving the genus Citrus (L.), such as the differentiation between nucellar and zygotic embryos [36] and evaluation of diversity and structure of collections [37].ISSR has also been applied successfully in differentiating zygotic from nucellar plants in Citrus [16] [17], as well as evaluating the genetic similarity between wild orange and related wild species [38].Accordingly, even though ISSR and SSR markers could not separate TSKC from TSKFL selections, they achieved similar results and, therefore, were consistent.
Also, the dendrograms obtained using ISSR, SSR, and LIFS-405 yielded similar results. They clustered the Sunki selections in the same way, i.e., one cluster for TSKC and TSKFL and another for TSKTR and TSKMA, corroborating the morphological analysis described in [10], in which the latter two selections exhibited similar polyembryony and seed number but differed from their source selections. The dendrogram obtained with LIFS-561, however, separated TSKMA from TSKTR and identified the former as the most distinct of the four analyzed selections. A similar separation can be observed in the SSR marker analysis in Table 7, which shows distinct alleles in TSKMA and TSKTR that were not present in the other evaluated selections.

Conclusions
This study used two different techniques, LIFS and molecular markers, to identify four Sunki selections (TSKC, TSKFL, TSKMA, and TSKTR) by analyzing their leaves.
TSKTR and TSKMA are selections derived from TSKC and TSKFL, respectively. We employed two types of molecular markers, ISSR and SSR, and two types of LIFS systems, with excitation at 561 and 405 nm. The results obtained with the two techniques differed. Both LIFS systems differentiated all selections with accuracy above 70%. The molecular markers used, however, were only able to differentiate the TSKMA and TSKTR selections from each other and from their source selections, but not the TSKC selection from the TSKFL selection. It is worth mentioning that the differences between the TSKC and TSKFL selections are very subtle and very difficult to detect, even with many molecular markers or with the plants in the reproductive stage. Thus, the LIFS technique succeeded in differentiating the selections, although with moderate accuracy. The LIFS technique offers advantages such as low cost and fast measurements, which enable its use on a large scale. In this respect, the LIFS technique could help breeding programs or seedling producers to identify hybrids or natural mutants before the application of other, more accurate techniques. Future studies include validation of the LIFS technique on a larger number of plants per selection, application of the technique to other selections, and analyses with LIFS systems using excitation at other wavelengths or combining different excitations in the same equipment.

Figure 3. Dendrograms obtained from data generated with ISSR (A) and SSR (B) markers, and with laser-induced fluorescence spectroscopy (LIFS) at excitation wavelengths of 405 nm (C) and 561 nm (D), using the Ward clustering method. The y-axis indicates the Euclidean distance between selections, and the Agglomerative Coefficient, ranging from 0 to 1, measures the clustering structure.

Table 2. List of ISSR primers used in the analyses.

Table 3. List of the SSR primers, forward (F) and reverse (R) primer sequences, and expected fragment sizes. PCRs were performed with a temperature gradient from 60 °C to 50 °C during 10 cycles and a total of 25 cycles.

Table 7. Polymorphic primers for the Sunki mandarin [C. sunki (Hayata) hort. ex Tanaka] selections, number of alleles generated by each primer, and alleles present in all evaluated selections.