Omics Technologies Reveal Abundant Natural Variation in Metabolites and Transcripts among Conventional Maize Hybrids

In this report we have evaluated metabolite and RNA profiling technologies to begin to understand the natural variation in these biomolecules found in commercial-quality, conventional (non-GM) maize hybrids. Our analyses focus on mature grain, the article of commerce that is most typically subjected to the rigorous studies involved in the comparative safety assessment of GM products. We have used a population of conventionally-bred maize hybrids that derive from closely related inbred parents grown under standard field conditions across geographically similar locations. This study highlights the large amount of natural variation in metabolites and transcripts across conventional maize germplasm grown under normal field conditions, and underscores the critical need for further extensive studies before these technologies can be seriously considered for utility in the comparative safety assessment of GM crops.


Introduction
There is an urgent need to accelerate agricultural productivity on a global scale to feed our rapidly increasing human population. It is estimated that by the year 2050, global food production must double [1,2]. This is a tremendous challenge when coupled with ongoing pressures to preserve the environment and minimize the impact of global warming. Both traditional breeding and biotechnology methods of crop improvement must be utilized to meet these growing demands.
First-generation commercial genetically-modified (GM) crops were developed to begin to meet the critical need for increased productivity from current agricultural practices. These first-generation GM crops have focused on agronomic traits, including herbicide tolerance and insect resistance, and carry transgenes that impart new and easily measured biochemical properties to the plant [3,4]. The impending next generation of GM crops will address multigenic yield traits, such as drought tolerance or improved utilization of nitrogenous fertilizers, whose molecular genetic basis is only now beginning to be understood [5]. Due to the multigenic nature of these next-generation yield traits, GM crops may carry one or more transgenes required to modulate the expression of several endogenous plant genes or biochemical pathways.
The current safety assessment process for products improved through modern biotechnology includes in-depth studies of phenotypic, agronomic, morphological, and compositional profiles to identify potential harmful effects that could affect product safety [6]. The application of this safety assessment process has worked well to protect public safety. Since commercialization of the first GM crop in 1996 [4], farmers have planted more than 690 million hectares (1.7 billion acres) [7] without a single confirmed instance of health or environmental harm. We believe it is therefore appropriate that the safety assessment of the next generation of GM crops utilize the current well-established and proven regulatory processes [8-10]. This rational approach will enable the development and commercial use of new products that are critical to meeting the next generation's agricultural challenges.
Advances in technology, such as open-ended profiling technologies (i.e., Omics) to assess metabolite and gene expression profiles, raise the question of whether these methods should also be included as part of the safety assessment process. These methods have the ability to identify and quantify a wide variety of specific biomolecules to determine differences between GM and non-GM crops. Recent publications have suggested that the application of open-ended profiling technologies might be informative to the safety assessment of GM crops, particularly to identify potential unintended effects resulting from the transgenic modification [11-15]. Application of these powerful methods to the regulatory decision-making process requires that the data generated be reproducible and have probative value with respect to product safety.
Despite the potential of Omics technologies, many challenges must be addressed before these could add useful information to the current regulatory safety assessment. These challenges include the multiplicity of available Omics platforms, as well as the lack of standardized methods and protocols for sample preparation, data generation and data analysis. Recognizing these shortcomings, international efforts to standardize the technologies have recently been initiated [16-20].
Another significant challenge to the use of Omics technologies in safety assessment of GM crops is the rational determination of biologically meaningful differences in relation to control samples. To this end, the extent of the inherent natural variation in GM and non-GM crops must first be known to ascertain if changes detected by an analytical technology are due to the introduced transgene or are the result of changes due to genetic and environmental variability. To address this deficiency for composition data currently required for safety assessment, the International Life Sciences Institute (ILSI) formed a large consortium representing academics, private industry and government agencies to collect comprehensive compositional data from grain in a range of genotypes across several crops. The ILSI Compositional Database (www.cropcomposition.org) now provides a data-rich baseline of the natural variation in crops that is used as the current standard when assessing substantial equivalence of compositional data for GM crops [21]. To date, there are no standardized and universally accepted databases available that describe the natural variation in transcripts, proteins or metabolites.

A Preliminary Assessment of Natural Variation by Metabolite and Transcript Profiling Technologies
In this report we have evaluated metabolite (metabolomics) and RNA (transcriptomics) profiling technologies to begin to understand the natural variation in these biomolecules found in commercial-quality, conventional (non-GM) maize hybrids. Our analyses focus on mature grain, the article of commerce that is most typically subjected to the rigorous studies involved in the comparative safety assessment of GM products. We generated a population of 30 genetically-related maize hybrids by crossing 6 female inbred lines with 5 different male inbred lines to produce hybrid seeds. The female inbred lines can trace their lineages to one of the common Stiff-Stalk (BSSS) progenitors and have a ~72%-87% marker-based similarity to B73, while the male inbreds are of Non-Stiff Stalk (NSS) lineages and have a ~50% marker-based similarity to either B73 or MO17. To minimize environment and genotype variables, we have used this population derived from closely related inbred parents grown under standard field conditions across two geographically similar locations at Jerseyville and Jacksonville in Illinois, which represent typical commercial corn production environments. To our knowledge, this is the first use of Omics technologies to characterize the large amount of natural variability in transcripts and metabolites across maize germplasm, and it underscores the critical need for further extensive studies before these technologies can be seriously considered for utility in the comparative safety assessment of GM crops. We also conducted the standard composition analysis as a reference to compare the variability revealed by the Omics technologies.
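The crossing design described above can be laid out as a short sketch. The inbred names below are hypothetical placeholders (the text does not name the lines), and the sketch assumes that every female was crossed with every male, which is what 6 x 5 = 30 hybrids implies.

```python
from itertools import product

# Hypothetical placeholder names; the actual inbred identities are not given in the text.
females = [f"F{i}" for i in range(1, 7)]  # 6 female (Stiff-Stalk) inbreds
males = [f"M{j}" for j in range(1, 6)]    # 5 male (Non-Stiff Stalk) inbreds

# Full factorial cross: every female paired with every male.
hybrids = [f"{f}x{m}" for f, m in product(females, males)]

# Each hybrid is grown at both field locations named in the text.
locations = ["Jacksonville", "Jerseyville"]
samples = [(hybrid, location) for hybrid in hybrids for location in locations]

print(len(hybrids), len(samples))
```

Under these assumptions the design yields 30 hybrids and 60 hybrid-by-location grain samples.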

Compositional Analysis
Composition analysis of total protein, fat, ash, amino acids, and fatty acids from mature grain is a standard assay to establish equivalence and safety of GM crops (see the OECD guidelines, http://www.oecd.org). These compositional metabolites accumulate as the result of diverse metabolic activities, particularly during the late vegetative and reproductive stages of the life cycle. We used accepted protocols [22-24] to measure the variation in those standard composition metabolites in grain samples from the set of 30 hybrids used in this study (analysis performed by Covance, Madison, WI). Figure 1 graphically displays the amino acid composition data measured in mature grain samples among the 30 hybrids at each of the two locations tested. For comparison, we also list the known variation for each amino acid, as found in the ILSI composition database. It is interesting to note that the natural range of variation for some amino acids observed across the germplasm available in the ILSI database is broad; tryptophan, for example, has a variation of almost 700%. This is not surprising since the ILSI database is reflective of grain from many different genetic backgrounds grown in many different environments. As expected, a large amount of variation within our dataset was also observed in some key metabolites across the 30 hybrids tested at the two locations. However, not all metabolites exhibited such high variability, and the accumulation of the majority of metabolites varied by less than 100%. Despite the observed variation, all of the values determined for the 30 hybrids fell within the documented ranges reported by the ILSI database, indicating that the hybrids were compositionally equivalent to commercial hybrids. The range of data across the 30 hybrids within each single location also demonstrates the inherent variability of common metabolites observed even when the starting materials are genetically very similar. These compositional data will be submitted to the ILSI database for public access.
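The two checks used in this comparison, the percent variation of a metabolite and whether observed values fall inside a reference range, can be sketched as follows. The text does not state how percent variation was calculated; this sketch assumes it is (max - min) / min x 100, and all numbers are made up for illustration rather than taken from the study or the ILSI database.

```python
def percent_variation(values):
    """Spread of a set of measurements, assumed here to be (max - min) / min * 100."""
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("minimum must be positive to express variation as a percentage")
    return (hi - lo) / lo * 100.0


def within_reference(values, ref_min, ref_max):
    """True if every observed value lies inside the reference range (inclusive)."""
    return all(ref_min <= v <= ref_max for v in values)


# Illustrative, made-up numbers (not study data or ILSI values).
reference_range = (1.0, 8.0)      # hypothetical database minimum and maximum
observed = [2.0, 2.5, 3.0, 3.5]   # hypothetical values across hybrids at one location

print(percent_variation(reference_range))         # 700.0 for this made-up range
print(percent_variation(observed))                # 75.0
print(within_reference(observed, *reference_range))
```

Under this definition, a reference range whose maximum is eight times its minimum corresponds to a variation of 700%, and observed values can vary considerably among hybrids while still falling within that range.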

Metabolic Profiling
To understand the application of Omics technologies to characterize the nature and variability of additional metabolites observed in grain samples, we employed a commercially available, non-targeted metabolite profiling platform (Metabolon, Durham, NC). This platform uses gas and liquid chromatography in combination with mass spectrometry to identify and quantify small molecule metabolites [25,26]. We used this metabolite profiling platform on the identical 30 grain samples as used for standard compositional analysis. Of the >400 metabolites that were detected in at least 50% of the samples, it was surprising that only ~130 were identified as previously known compounds. This analysis suggests that the majority of metabolites in maize grain are currently uncharacterized and demonstrates that the current standard compositional analyses represent the most common and abundant metabolites.
Table 1 summarizes the metabolites analyzed and the number whose concentrations across grain samples from the two measured locations vary from 2-fold to 16-fold. The vast majority (>96%) of metabolites vary more than 2-fold when compared at each location tested. Furthermore, more than half of the metabolites vary more than 4-fold, ~20% vary by at least 8-fold and 6%-8% of metabolites vary as much as 16-fold. This result indicates that dramatic variations in metabolite concentrations occur in both known and uncharacterized metabolites even among genetically similar maize plants.
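A tally of this kind can be sketched as follows. This is only an illustration under stated assumptions: the fold variation of a metabolite is taken as the maximum divided by the minimum across hybrids at one location, non-detected samples are assumed to have been excluded beforehand, thresholds are treated as "at least N-fold", and the metabolite names and values are made up.

```python
def fold_range(values):
    """Fold variation of one metabolite: maximum divided by minimum across samples."""
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("values must be positive (non-detects are assumed to be excluded)")
    return hi / lo


def fraction_at_or_above(profiles, thresholds=(2, 4, 8, 16)):
    """Fraction of metabolites whose fold variation reaches each threshold."""
    ratios = [fold_range(values) for values in profiles.values()]
    return {t: sum(r >= t for r in ratios) / len(ratios) for t in thresholds}


# Made-up intensities for four hypothetical metabolites at a single location.
profiles = {
    "metabolite_A": [1.0, 1.5, 3.0],    # 3-fold
    "metabolite_B": [2.0, 5.0, 20.0],   # 10-fold
    "metabolite_C": [1.0, 1.2, 1.8],    # 1.8-fold
    "metabolite_D": [0.5, 4.0, 10.0],   # 20-fold
}

summary = fraction_at_or_above(profiles)
for threshold, fraction in summary.items():
    print(f">= {threshold}-fold: {fraction:.0%}")
```

Running the same tally separately for each location keeps genetic variation among hybrids apart from differences between sites.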
Application of Omics technology in a food safety assessment context requires that consistent platforms and validated methods be used so that reasonably skilled and trained individuals would obtain the same values for a specific sample. To evaluate this, we compared the results from the standard compositional analysis (Covance, Madison, WI) and the open-ended metabolite analysis (Metabolon, Durham, NC) performed on the 30 hybrid grain samples to determine the consistency in the measured variation of identical metabolites across the two platforms. Because each platform measures a different complement of metabolites, the comparison was made only between identically annotated metabolites that were measured across both of the analytical platforms. Table 2 lists the 19 common metabolites measured in grain samples of hybrids at each of two field locations. To eliminate large differences observed in absolute metabolite intensity values due to differences in the internal standards that were used across the platforms (data not shown), we report the range of values expressed as the maximum value divided by the minimum value for each metabolite.
The variation in metabolite accumulation within each profiling platform is defined as the maximum/minimum values measured across the 30 hybrids at each of two locations (JA: Jacksonville; JE: Jerseyville) in Illinois.
As can be seen in Table 2, 17 amino acids and 2 fatty acids were measured across both profiling platforms. In general, the range of values observed for the targeted profiling approach (Covance) is much smaller (up to 2.6-fold) than the open profiling approach (Metabolon; up to 49-fold). The larger range of variation and the large discrepancy for several metabolites (for example, arginine, lysine and oleate) may be due to an increased sensitivity in the open profiling methodology. However, these platform-based differences may also be due to several technical factors, including the metabolite extraction procedures and the technical variation of the separation and detection equipment. In summary, these data indicate that vastly different values can be obtained from the same sample depending on the method used to measure the metabolite.
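The cross-platform comparison can be sketched as follows. The sketch assumes that each platform's results are held as a mapping from metabolite annotation to measured values; the function names and all numbers are hypothetical and are not taken from Table 2. Because the maximum/minimum ratio is unitless, any constant scale factor, such as a difference in internal standards, cancels within a platform.

```python
def max_min_ratio(values):
    """Range of one metabolite expressed as maximum divided by minimum."""
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("values must be positive")
    return hi / lo


def compare_platforms(targeted, open_profile):
    """Max/min ratio per platform, restricted to metabolites annotated on both."""
    shared = sorted(set(targeted) & set(open_profile))
    return {
        name: (max_min_ratio(targeted[name]), max_min_ratio(open_profile[name]))
        for name in shared
    }


# Made-up values: the two platforms report on different, arbitrary scales.
targeted = {
    "lysine": [2.0, 2.5, 3.0],
    "oleate": [20.0, 25.0, 30.0],
    "ash": [1.0, 1.2, 1.4],             # measured only by the targeted platform
}
open_profile = {
    "lysine": [100.0, 400.0, 1200.0],
    "oleate": [1000.0, 3000.0, 8000.0],
    "unknown_123": [5.0, 50.0, 90.0],   # measured only by the open platform
}

comparison = compare_platforms(targeted, open_profile)
for name, (targeted_ratio, open_ratio) in comparison.items():
    print(f"{name}: targeted {targeted_ratio:.1f}-fold, open {open_ratio:.1f}-fold")
```

Rescaling all values from one platform by a constant leaves its ratios unchanged, which is why the ratio allows a comparison even when absolute intensities are not comparable.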

Transcriptomic Profiling
Expression profiling in maize [27 and references within], Arabidopsis [28] and rice [29,30] suggests that there is a strong correlation between genetic diversity and transcriptional variation. To measure the variation of gene expression across the 30 hybrid lines of maize, we used a custom-designed Affymetrix oligonucleotide microarray to analyze RNA from the grain samples collected from each hybrid. The custom microarray was designed to detect >54,000 unique mRNA transcripts based on available EST sequences from public and proprietary databases [31].
As shown in Table 3, a subset of ~30,000 unique EST sequences represented on our microarray showed expression levels above background intensity values in the grain samples. To determine the extent of gene expression variation due primarily to genetic factors, the data from individual locations are reported separately. Similar to results observed for metabolite profiling, the majority of expressed genes (58%-71%) have expression levels that vary by at least 2-fold at each location. Moreover, as many as 8%-12% of genes have expression levels that change more than 4-fold at each location, and the expression levels of hundreds of genes change by as much as 8-fold to 16-fold.
The large amount of variation in gene expression observed in grain samples across hybrid lines is consistent with similar large variation observed across metabolites in the same samples. Although microarray data are highly dependent on such factors as oligonucleotide probe design and the accuracy of the targeted EST sequences, the technical variation across microarrays in our platform is much less than 10% (data not shown). Thus, changes in gene expression among hybrids are largely due to genetic factors, but the moderate differences observed across sites indicate an additional significant role of environment on gene expression.

Discussion
As has been documented many times throughout the course of maize breeding, there are large variations among hybrid germplasm observed in basic yield components, due to genetic background as well as environment [32,33]. We have also seen this type of variation in yield and yield components within our set of genetically-similar hybrids (data not shown). Further, the set of 30 hybrids described here also showed large variation in some standard compositional metabolites in grain, using a widely accepted targeted analysis platform. Beyond these anticipated results, the study described here was designed to determine the extent of natural variation of less well characterized biomolecules, namely transcripts and metabolites, in maize hybrids. Our results demonstrate that widely differing levels of these biomolecules can also occur in mature grain, with the major differences observed due to genetic factors alone, although environmental factors contribute to further variation. This natural variation appeared to be common across the transcriptome and metabolome, with no molecular or biochemical pathway showing more or less variation than any other (data not shown).
Omics profiling technologies are being more widely applied to study the effects of transgenes in crops. The conclusions drawn in much of the published literature suggest that the presence of a transgene often results in less variation than what may be introduced by conventional breeding methods [12,15,34-38]. Given that these Omics technologies are still evolving, with increasing detection capabilities and sensitivity expected, it is reasonable to assume that differences in gene expression and metabolites will be identified between crops with and without a transgene. However, one of the many challenges Omics technologies present is the ability to interpret changes in the context of product safety. A key advancement to understand whether changes between GM and non-GM crops may impact product safety is to first understand the natural variation of a specific biomolecule in non-GM crops that is due to environmental and genetic factors. Based on results from this field-based study of 30 genetically similar maize hybrids, there exists a broad range of variation at both the level of gene expression and accumulated metabolites that is dependent on the genetic backgrounds and environment of the source material. This natural variation of key macromolecules emphasizes the need for comprehensive baseline databases to characterize the natural variation found within a species. These baseline databases would provide the appropriate context for interpreting changes in composition and the relevance of identified variation for safety assessment. As with the current compositional analysis used in a comparative safety assessment of biotech crops, if a change is detected but the level still falls within the natural range of variation, the change is most likely not due to the transgene per se, but rather is a function of the underlying genetic backgrounds or the environment. Importantly, to date there are no reports that demonstrate a direct correlation between the magnitude of a detected change in RNA, protein or metabolites and any adverse safety effect. At the same time, dozens of GM crops have been determined to be safe for use in food, feed and the environment. There continues to be a growing body of literature indicating that variation in metabolite content between GM and non-GM crops is routinely smaller than, and within, the range of natural variation of the crop. The results discussed in this report provide strong evidence that further extensive studies would be needed to address the existing challenges, including standardization of Omics methodologies and the establishment of baseline databases. This dataset will provide a solid foundation for the development of publicly available reference databases for natural variation of key macromolecules.

Figure 1. Variation in amino acid composition among conventional maize hybrids. Metabolite measurements for amino acids are represented as the range of values observed among 30 hybrids at each of two locations (JA: Jacksonville; JE: Jerseyville) in Illinois during the 2006 growing season. The variation reported for standardized metabolite data from the ILSI database (version 2.0) is also shown for comparison to the hybrids analyzed in this study.