Microarray Technology and Its Applicability in Soil Science — A Short Review

The GeoChip is a glass slide containing oligonucleotide probes that target genes conferring specific functions on microorganisms. It has been used to dissect the functional structure of microbial communities in environmental samples. The PhyloChip is a glass slide containing oligonucleotide probes for 16S rRNA genes, and it offers tremendous potential for monitoring microbial populations. Used in a high-throughput manner, these chips allow the below-ground microbial community to be linked to the above-ground plant community. This review seeks to determine the various roles of the GeoChip and the PhyloChip in soil microbial ecology studies. During biostimulation for uranium reduction in groundwater, the GeoChip linked microbial community dynamics to functional processes, and in global warming studies it has made it possible to follow microbial responses at the level of functional gene structure. The PhyloChip, on the other hand, provides a more comprehensive survey of microbial diversity, composition and structure, and is less susceptible to the influence of dominance in the microbial community. Some of the concerns regarding the use of compost in agricultural soils, i.e. the spread of human, animal and plant pathogens, were reduced when the PhyloChip was used to monitor composting.


Introduction
The soil environment is highly complex, and the diversity of soil microorganisms is extremely high. DNA re-association kinetics indicate that a gram of soil contains more than 4000 different genomes. Diverse systems such as soil may be more resilient to perturbation because, when a portion of the microbial components is removed or compromised, the components that remain can compensate. However, more diverse systems may also be less efficient, since a greater proportion of the available energy is spent countering competitive interactions between the various microbial components. Microorganisms play unique roles in ecosystem functions such as the biogeochemical cycling of carbon, nitrogen, sulfur, phosphorus and various metals. Microorganisms also regulate nitrous oxide emissions from soil. However, the precise roles of many microorganisms in these cycles are unknown [1]. Owing to their extremely high diversity and their as yet uncultivated status, the detection, characterization and quantification of microorganisms in natural systems are difficult, especially on a large scale and in a parallel, high-throughput manner. Moreover, owing to the high versatility and rapid adaptation of microbial populations, and the high heterogeneity and microscale diversity of soils, high-throughput methods have to be applied in order to understand population shifts at a finer level and to better link microbial diversity with ecosystem functioning.
Microarrays represent a powerful tool for the parallel, high-throughput identification of many microorganisms in different environmental samples. A microarray is made up of thousands of spots on a slide, with each spot containing multiple copies of a unique nucleic acid sequence that corresponds to a single gene. Microarray technology facilitates the detection of genetic sequences or expressed genes from particular samples in a high-throughput format. In the case of expressed genes, microarrays are popular because of their unique ability to query the mRNA expression levels of thousands of genes (potentially all of the genes of an organism) simultaneously and with relatively high specificity, providing a snapshot in time of the overall gene expression of the system under study. Compared with conventional membrane-based hybridization methods, microarrays offer the additional advantages of rapid detection, low cost, automation and low background levels [2].
Living organisms contain thousands of genes that control the activities of their cells. Studying one component of the cell at a time will not give a complete picture of cellular activity; a thorough understanding can be gained only through a comprehensive integration of the entire molecular machinery controlling the cell. Similarly, microarray technology can capture a comprehensive, integrated picture of the microbial activities in environmental samples such as soil, and thus provide a holistic view of the microbial community. Traditional soil microbiological approaches analyze material derived from microbial growth, such as liquid cultures or colonies obtained by plating. However, such methods have met with strong limitations, because only a small fraction of the soil microbiota (<0.01%) can be accessed by cultivation. A complete picture of the microbial community is therefore not obtained, and the great majority (about 99%) of the microbial population in soil remains unknown.
Some soil microbiologists have also focused on evidence of processes and activities, such as respiration and the enzymatic transformation of substrates added to soil. Measurements of soil processes give insight into microbially mediated transformations in soils. However, such measurements do not inform us of the mechanisms, or of the microbial functional composition and diversity, that underlie differences at the process level. Thus, relating microbial diversity and function to ecological processes remains a critical issue in the study of soil microbial ecology [1,3]. This review seeks to determine the various roles of the functional gene array (the GeoChip) and the PhyloChip in soil microbial ecology studies.

GeoChip and Its Relevance
The GeoChip is a glass slide carrying thousands of bound oligonucleotide probes (about 40 to 70 nucleotides long) that target genes conferring specific functions on microorganisms. One can thus monitor the levels of thousands of functional genes simultaneously, thereby gaining a window into the functions of the soil microbial community in an environmental sample. GeoChip 2.0 and GeoChip 3.0 have so far been developed for soil microbial ecology studies. GeoChip 2.0 contains over 24,000 probes covering more than 10,000 genes distributed among more than 150 functional groups involved in nitrogen, carbon and sulphur cycling, phosphorus utilization, metal resistance, metal reduction and organic contaminant degradation [4]. GeoChip 2.0 is useful for studying the biogeochemical processes and functional activities of microbial communities that are important to human health, agriculture, energy, global climate change, ecosystem management, and environmental clean-up and restoration. Because the arrays contain probes from genes with known biological functions, they are useful in linking microbial diversity to ecosystem processes and functions. This array allows a detailed analysis of the biogeochemical gene profiles of soil microbes and is ideal for understanding how these profiles change in response to environmental perturbations and experimentally imposed conditions [5,6]. To increase the confidence of detection, multiple probes were designed for each sequence or group of sequences on GeoChip 2.0. The positive controls consist of 16S rRNA gene probes (192 probes); the negative controls consist of 10 probes derived from human genes (960 spots) together with blanks [4]. Later experiments showed that the GeoChip 2.0 probes were highly specific to their corresponding targets when hybridized at 45°C to 50°C with 50% formamide.
GeoChip 3.0 is a more comprehensive microarray than GeoChip 2.0 and is currently available for microbial community studies. It can be used as a generic high-throughput tool to address various biological questions in different systems such as bioreactors, soils, groundwater, marine environments, sediments and animal guts. GeoChip 3.0 has about 28,000 probes covering 57,000 gene variants from 292 functional gene families involved in carbon, nitrogen, phosphorus and sulphur cycling, energy metabolism, antibiotic resistance, metal resistance and organic contaminant degradation. It has several other distinct features, one of which is the inclusion of the gyrB gene for phylogenetic analysis [7]. The gyrB gene encodes the β-subunit of DNA gyrase and has been used to differentiate closely related species and strains. A phylogenetic tree based on gyrB gives considerably higher resolution than a tree based on the 16S rRNA gene [8,9]. GeoChip 3.0 contains eight degenerate probes for 16S rRNA genes and, as negative controls, 672 unique probes designed from hypothetical genes of seven sequenced genomes of hyperthermophiles. In addition, a 50-mer common oligonucleotide reference standard (CORS) is mixed with all of these probes, including gene probes and controls, and co-spotted on GeoChip 3.0 as a common reference standard for data normalization and comparison [10].
In addition, because the GeoChip probes are selected from the coding sequences of functional genes, the GeoChip can be used to measure not only the abundance but also the expression of functional genes in a microbial community, provided that high-quality mRNA can be recovered from environmental samples. Probing mRNA with the GeoChip will thus provide valuable insight into the functions of genes and populations in critical geochemical and ecological processes. Such information will be useful in establishing mechanistic linkages between the diversity of microbial genes and populations and ecosystem functions.
The GeoChip has been used to analyze microbial community functional structure in both natural and contaminated environments [11-13]. Using GeoChip 2.0, Liang et al. [14] found a high abundance of genes involved in organic contaminant degradation at an oil-contaminated site, indicating the potential of the indigenous microorganisms to degrade oil contaminants. The presence across the site of key genes involved in hydrocarbon degradation (such as genes encoding alkane monooxygenase and benzene dioxygenase) also implied that stimulating indigenous microorganisms could be a valid option for remediating oil-contaminated sites. However, the degradation process might be limited by low nutrient levels [15,16]. The decrease in nitrogen-cycling genes with oil concentration may indicate a decrease in nitrogen-cycling activity, so nitrogen could be limiting. Adjusting the carbon/nitrogen ratio by adding nitrogen may therefore be important for in situ bioremediation of oil-contaminated fields.
He et al. [4] used the GeoChip to monitor microbial community dynamics in groundwater undergoing in situ biostimulation for uranium reduction. Their results showed that the GeoChip can reveal differences between microbial communities and can track bioremediation processes, linking microbial populations to functional processes. During the uranium reduction period, populations of both iron-reducing bacteria (FeRB) and sulphate-reducing bacteria (SRB) reached their highest levels, followed by a gradual decrease over 500 days. Because Geobacter-type FeRB and some SRB can use U(VI) as an electron acceptor to obtain energy for growth, uranium in the groundwater and sediments was reduced, and uranium concentrations in the groundwater decreased. The uranium concentrations in the groundwater were significantly correlated with the total abundance of c-type cytochrome genes from Geobacter-type FeRB and Desulfovibrio-type SRB, and with the total abundance of dsrAB (dissimilatory sulfite reductase) genes. The GeoChip results suggested that Geobacter-type FeRB and SRB played significant roles in uranium reduction, and hence that remediation using indigenous microorganisms could be a valid option at heavily uranium-contaminated sites.
Other types of microarrays have been developed for bioremediation studies. One example is the array developed by Rhee et al. [17], comprising 1662 unique and group-specific 50-mer probes that target most of the genes and pathways known to be involved in biodegradation and metal resistance. Its applicability was demonstrated in naphthalene-amended enrichment cultures as well as in microcosm experiments with soil containing polychlorinated biphenyl.
A three-year experimental field warming study (+0.5°C to 2°C), which used GeoChip microarray analyses to determine the microbial response to global warming, showed significant warming effects on functional communities, specifically on the N-cycling microorganisms [18]. The number of functional genes detected on the GeoChip was significantly lower in the plots subjected to higher temperature than in the controls. For a range of gene families (amoA, cellulase, chitinase, laccase, nifH, nirK, nirS, nosZ, pmoA and urease), the number of variants detected on the GeoChip was generally lower in the warmed plots than in the control plots.
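The comparison described above (number of gene variants detected per gene family, warmed versus control) can be sketched in a few lines. This is only an illustrative sketch: the signal values, the variant labels and the detection threshold are hypothetical and are not taken from the cited study, and real GeoChip pipelines apply their own signal-to-noise criteria.

```python
from collections import defaultdict


def variants_detected(signals, threshold):
    """Count, per gene family, the probes whose signal exceeds the threshold.

    `signals` maps (gene_family, variant_id) to a hybridization intensity.
    Families with no probe above the threshold are absent from the result.
    """
    counts = defaultdict(int)
    for (family, _variant), intensity in signals.items():
        if intensity > threshold:
            counts[family] += 1
    return dict(counts)


# Hypothetical intensities for a control plot and a warmed plot.
control = {
    ("nifH", "v1"): 1200.0,
    ("nifH", "v2"): 900.0,
    ("nirK", "v1"): 1500.0,
    ("amoA", "v1"): 700.0,
}
warmed = {
    ("nifH", "v1"): 1100.0,
    ("nifH", "v2"): 150.0,
    ("nirK", "v1"): 1300.0,
    ("amoA", "v1"): 90.0,
}

THRESHOLD = 500.0  # assumed detection cut-off, for illustration only

control_counts = variants_detected(control, THRESHOLD)
warmed_counts = variants_detected(warmed, THRESHOLD)

for family in sorted(control_counts):
    print(family, control_counts[family], warmed_counts.get(family, 0))
```

A per-family table of this kind is what allows statements such as "fewer nifH variants were detected in warmed plots"; whether the difference is significant still requires replicated plots and a statistical test.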
Understanding the factors that influence methanotroph diversity and activity is of great importance for adapting environmental remediation strategies to achieve optimal methane oxidation. Methanotrophs, bacteria that are capable of utilizing methane as their sole source of carbon and energy, play an essential role in mitigating the greenhouse effect by metabolizing most of the methane produced in, for example, landfill sites. The gene encoding particulate methane monooxygenase (pmoA), the key enzyme in methane oxidation, was chosen for the development of a microarray to identify methanotrophs [19]. The improved pmoA microarray contained 68 probes (18- to 28-mers) targeting all known methanotrophs, including uncultivated members, as well as the related ammonia monooxygenase (amoA) genes of ammonia-oxidizing bacteria. The pmoA microarray identified Methylocystis spp. as the dominant and efficient methane oxidizers.
The use of the functional gene array has provided insight into the forces driving important processes in terrestrial Antarctic nutrient cycling [20]. In Antarctica, denitrification genes were linked to higher soil temperatures, and N2 fixation genes were linked to plots vegetated mainly by lichens. The relative detection of cellulose degradation genes was correlated with temperature, and microbial carbon fixation genes were more common in plots that largely lacked vegetation. Yergeau et al. [20] also showed a significant correlation between cellulase activity and the number of cellulase gene variants detected by the functional gene array. In a similar study, Reeve et al. [21] observed significant correlations between cellulase gene signal intensity and cellulase activity in the soil (p < 0.01), between dehydrogenase gene signal intensity and dehydrogenase activity, between urease gene signal intensity and urease activity, and so forth, demonstrating that the functional gene array can to some extent complement soil process measurements.
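The kind of relationship reported above, between a gene's signal intensity on the array and the corresponding enzyme activity measured in soil, is typically summarized with a correlation coefficient. The following minimal sketch computes Pearson's r; the numbers are hypothetical and the function name is an illustrative choice, not something taken from the cited studies.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical paired measurements from six soil samples:
# summed cellulase gene signal intensity and measured cellulase activity.
cellulase_signal = [410.0, 520.0, 630.0, 700.0, 880.0, 950.0]
cellulase_activity = [1.1, 1.4, 1.5, 1.9, 2.3, 2.4]

r = pearson_r(cellulase_signal, cellulase_activity)
print(f"r = {r:.3f}")
```

A p-value for r would additionally require a significance test (for example a t-test on r or a permutation test), which is omitted here for brevity.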

Some Challenges in the Use of the GeoChip
1) High-quality community DNA is required to minimize experimental variation and so improve microarray-based quantitative accuracy. Community DNA contaminated with humic acid can interfere with amplification reactions. To increase sensitivity, a pre-amplification step using rolling circle amplification can be included in the methodology. This allows DNA from communities with low microbial biomass to be amplified before microarray hybridization, thereby increasing the signal levels obtained from such environmental samples [22]. Spermidine and single-strand binding protein are added to the reaction mix to facilitate amplification. The reactions are then incubated, the enzymatic reaction is stopped, and the amplification product is used for labeling.
2) The number of target sequences in public databases is increasing exponentially, so the GeoChip needs to be updated continuously. The quantity of data generated by microarray studies of environmental samples will consequently be enormous, yet rapidly processing, comparing and interpreting hybridization data remains difficult.
3) A large component of the soil microbial population may be inactive. Hybridization of soil DNA cannot differentiate between active and inactive microbial cells, and the potential contribution to signal intensity of inactive material, i.e. spores, dead biomass or damaged copies of genes, cannot be determined. For this reason, caution should be used when interpreting DNA-based functional gene array data [23]. To overcome this criticism, researchers are beginning to use RNA for environmental microarray analysis [17,24]. Analysis of mRNA would allow a more direct connection to activity to be drawn. Recent research on environmental samples using both mRNA and genomic DNA microarrays has shown that the dominant species identified by mRNA arrays are also the most abundant in terms of genomic DNA [25]. This suggests that the connection drawn between genomic DNA and biogeochemical cycles is reasonable. Yergeau et al. [20] confirmed their microarray-based results for a number of gene families using specific real-time PCR, enzymatic assays and process rate measurements, suggesting a quantitative relationship between microarray signals and environmental gene densities. The significant correlations between the enzymatic activities measured in soil and the microarray data provided some indication that the detected genes are also expressed in the soil system examined.

The PhyloChip
The PhyloChip is a slide to which thousands of oligonucleotide probes (about 50 nucleotides long) for 16S rRNA genes are attached. The PhyloChip microarray allows the molecular biologist to monitor the levels of thousands of 16S rRNA genes simultaneously, thereby giving an 'inner picture' of the microbial community in an environmental sample such as soil. The PhyloChip (G2) consists of 506,944 probe features, of which 297,851 are oligonucleotide perfect match (PM) or mismatch (MM) probes for 16S rRNA genes [26,27]. Depending on the type of probe set used, the PhyloChip allows the parallel detection of up to several thousand microbial strains, species, genera or higher taxonomic groups in a single experiment [19,28,29]. The parallel detection of numerous 16S rRNA genes makes the PhyloChip useful for environmental studies of phylogenetically diverse microbial groups.
The PhyloChip has been used to detect microorganisms in a variety of environments, such as contaminated sites [26,28], air [27], water [30] and soil [31-33]. In addition, the PhyloChip can detect many more bacterial taxa than the 16S rRNA gene-based clone library approach [28,34], suggesting that it provides more comprehensive surveys of microbial diversity, composition and structure. Furthermore, such microarray-based approaches are less susceptible to the influence of dominance in microbial communities, whereby sequences of the more abundant members mask the presence of other numerically significant taxa and rare species [34]. The PhyloChip has therefore been considered a powerful tool for the comprehensive and rapid analysis of microbial communities.

The Use of the PhyloChip in Soil Microbial Community Studies
Among the concerns regarding the composting process and the use of compost in agriculture and horticulture are the survival and spread of animal, human and plant pathogens. Any composting process must therefore be capable of eliminating any health risk that may be present in the end product. Microarray technology offers tremendous potential for monitoring pathogens and beneficial microbial populations during composting, which helps in managing the compost before it is sold to the public. A microarray targeting the species of microorganisms usually encountered in compost was designed, and it offered potential for process monitoring and for the detection of pathogens as well as beneficial microbes [35]. This microarray contained probes targeting actinomycetes and other organisms involved in the composting process, as well as 35 probes specific to pathogens. The use of this microarray reduced the concerns regarding the use of compost on agricultural soils and the spread of human, animal and plant pathogens.
He et al. [36] used the PhyloChip to determine the impact of elevated CO2 on the diversity and function of soil microbial communities. The richness of the soil microbial communities was determined at different taxonomic levels (phylum, class, order, family and subfamily). The taxonomic structure of the microbial communities was then related to soil and plant properties through Mantel and similar tests, to determine the extent to which soil and plant properties helped to shape that structure. Shifts in the richness, composition and structure of soil microbial communities under elevated carbon dioxide were observed [36]. As noted by Cheneby et al. [37] and Hallin et al. [38], however, shifts in diversity will not necessarily alter the ability of soil microbes to perform biogeochemical functions.
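The Mantel test mentioned above asks whether two distance matrices computed over the same samples (for example, community dissimilarity and differences in soil properties) are correlated, with significance assessed by permuting sample labels. The sketch below is a minimal pure-Python version under stated assumptions: both matrices are symmetric with zero diagonals, the test is one-sided, and the example matrices are hypothetical. It is not the procedure used in the cited study, whose details are not given here.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p-value."""
    n = len(dist_a)
    flat_b = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), flat_b)
    rng = random.Random(seed)
    order = list(range(n))
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of dist_a together (relabel the samples).
        permuted = [[dist_a[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), flat_b) >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Hypothetical distances among four soil samples.
community_dist = [
    [0.0, 0.2, 0.6, 0.7],
    [0.2, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.3],
    [0.7, 0.8, 0.3, 0.0],
]
soil_property_dist = [
    [0.0, 1.0, 3.1, 3.9],
    [1.0, 0.0, 2.4, 4.2],
    [3.1, 2.4, 0.0, 1.6],
    [3.9, 4.2, 1.6, 0.0],
]

r, p = mantel_test(community_dist, soil_property_dist)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

With only four samples there are just 24 distinct relabelings, so the p-value here is coarse; the example is meant to show the mechanics, and real analyses use many more samples.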
The PhyloChip allows the simultaneous detection of thousands of bacterial and archaeal taxa and has been shown to reveal a broader range of diversity than modestly sized 16S rRNA gene libraries for soil, water and aerosol samples [33]. PhyloChip analyses also offer the opportunity to link microbial community composition to analyses of enzyme activity, the density of functional gene families and the distribution of functional gene sequences related to nutrient cycling. It is possible to use both the GeoChip and the PhyloChip in one experiment [39]. To determine whether phylogenetic community structure, based on PhyloChip analysis, was related to the distribution of microbial genes involved in nutrient cycling, the PhyloChip data were compared with the GeoChip data. The results showed that communities with more similar taxonomic composition were also more closely related in their functional genes, supporting the notion that the functional genes detected in soils are strongly linked to community composition as determined by 16S rRNA gene-based analysis. Such analysis provides evidence for a strong link between community composition and functional gene distribution in Antarctic soils.
Like other high-throughput technologies, however, the PhyloChip has its limitations. For example, it detects only sequences that were already present in a database at the time of probe design, so the G2 PhyloChip may not fully cover the species richness of soil microbial communities. Another challenge is improving the sensitivity and selectivity of the analysis. To discover unknown 16S rRNA genes, future investigations may use high-quality, full-length sequencing as a complementary approach to further understand the taxonomic and phylogenetic diversity, composition, structure and function of soil microbial communities.
Integral to most methods of microbial community analysis is PCR amplification of small-subunit rRNA genes, undertaken primarily to obtain a sufficient mass of genetic material for analysis. This manipulation has well-known inherent biases and potentially unknown effects. The biggest bias is associated with multi-template PCR, in which the relative abundances of 16S rRNA gene signatures are distorted during amplification [40]. The choice of primer pairs and the number of amplification cycles strongly influence the ratios of amplicons in the final pool when mixed templates are amplified by PCR [41]. Uneven amplification of mixed templates precludes both accurate estimation of evenness in communities and estimates of fold change in response to perturbation or experimental manipulation. Other problems include the formation of chimeric amplicons, deletions and point mutations, and the amplification of contaminating DNA.

Methodology in the Use of the PhyloChip and the GeoChip
PhyloChip analysis includes three major steps: 1) amplification of the target genomic DNA using 16S rRNA gene primers; 2) hybridization of an amount of the amplified DNA (50 to 500 ng of PCR products) to the PhyloChip [26,27]; 3) processing of the hybridization data prior to statistical analysis. PCR amplification for microarray hybridization is carried out using bacterial-specific 16S rRNA gene primers, e.g. 27F1 and 1492R, and archaeal-specific 16S rRNA gene primers. Several independent PCRs are performed in a thermocycler at different annealing temperatures (e.g. 48°C, 51.9°C, 54.4°C and 58°C). The products are pooled per treatment and then concentrated to a smaller volume. The pooled PCR product of each sample is spiked with known concentrations of amplicons derived from yeast and bacterial metabolic genes, which serve as internal controls during normalization. This mixture is fragmented to 50 to 200 bp with DNase I and One-Phor-All buffer following the manufacturer's protocol. The mixture is normally labeled with biotin, Cy5 or Cy3. Next, the labeled DNA is denatured at high temperature (for instance, at 99°C) for 5 min and hybridized to custom-made GeneChips. PhyloChip washing and staining are performed according to the manufacturer's instructions. Each PhyloChip is scanned and recorded as a pixel image, and the initial data acquisition and intensity determination are performed using standard Affymetrix software (or the software of whichever platform is used). Background subtraction, data normalization and probe pair scoring are then carried out.
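The final step above, probe pair scoring, can be illustrated with a short sketch. The idea is that each perfect match (PM) probe is paired with a mismatch (MM) control, a pair counts as positive when the PM signal clearly exceeds the MM signal, and a taxon is called present when a large enough fraction of its probe pairs is positive. The review does not state the scoring rule or thresholds, so the criteria and default values below (`ratio_min`, `diff_factor`, `pf_min`) are assumptions chosen for illustration and should be treated as tunable placeholders; intensities are assumed to be background-subtracted already.

```python
def probe_pair_positive(pm, mm, noise, ratio_min=1.3, diff_factor=130.0):
    """Score one PM/MM probe pair.

    Assumed rule (illustrative): the pair is positive when PM exceeds
    ratio_min * MM and PM - MM is at least diff_factor * noise squared.
    """
    return pm > ratio_min * mm and (pm - mm) >= diff_factor * noise ** 2


def positive_fraction(pairs, noise, **thresholds):
    """Fraction of a taxon's probe pairs that score positive."""
    if not pairs:
        raise ValueError("a taxon needs at least one probe pair")
    positives = sum(
        probe_pair_positive(pm, mm, noise, **thresholds) for pm, mm in pairs
    )
    return positives / len(pairs)


def taxon_present(pairs, noise, pf_min=0.9, **thresholds):
    """Call a taxon present when enough of its probe pairs are positive."""
    return positive_fraction(pairs, noise, **thresholds) >= pf_min


# Hypothetical (PM, MM) intensities for the probe set of one taxon.
example_pairs = [(1000.0, 200.0), (900.0, 300.0), (150.0, 140.0)]
example_noise = 1.0

pf = positive_fraction(example_pairs, example_noise)
print(f"positive fraction = {pf:.2f}")
print("present" if taxon_present(example_pairs, example_noise) else "absent")
```

Keeping the thresholds as keyword arguments makes it easy to check how sensitive presence/absence calls are to the chosen cut-offs, which matters when comparing richness across treatments.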
To use the GeoChip, soil DNA is extracted after mechanical lysis in a CTAB buffer using a phenol-chloroform purification protocol [42]. Other similar methods of soil DNA extraction, such as that of Zhou et al. [43], have also been used. The genomic soil DNA can be labeled with cyanine-5 (Cy5) dye or Cy3 dye. Hybridization of the labeled soil genomic DNA to a custom-made GeoChip can be carried out on a hybridization station (for instance, one from TECAN US, Durham, NC, USA). A first wash is carried out, followed by the prehybridization, hybridization and post-hybridization washes. Scanning and imaging are then done.
Both biological and technical replication must be taken into consideration when performing experiments with the GeoChip and the PhyloChip. For environmental samples such as soil, the sources of biological variation include macro-environmental differences such as those caused by growth room or greenhouse effects (light, heat, humidity, location, etc.), watering and fertilizing programs, soil conditions, and pathogen or herbivore pressures. Sample pooling and replication are the primary methods used to account for biological variation. Biological replication is necessary: 1) to estimate the biological variation within an experiment for downstream statistical analysis; 2) to extend the generality of the conclusions beyond the tested samples to the untested population as a whole. Sources of technical variation include differences in labeling efficiency, amplification reactions and the methodologies involved in hybridization.

Conclusion
Soil has long been considered a black box. Especially earlier in the twentieth century, it was difficult to establish the link between microbial community structure and function, let alone to link them to the above-ground plant community. With the advent of microarrays for microbial ecology studies, such linkages can be established. The time is drawing closer when the soil will no longer be considered a black box.