Gene Expression: A Review on Methods for the Study of Defense-Related Gene Differential Expression in Plants

The plant genes involved in cellular signaling and metabolism have not been fully identified, and the functions of many of those that have been identified remain incompletely characterized. Gene expression analysis allows the identification of genes and the study of their relationship with cellular processes. There are several options available for studying gene expression, including the use of cDNA and microarray libraries and techniques such as suppressive subtractive hybridization (SSH), differential display (DD), RNA fingerprinting by arbitrary primed PCR (RAP-PCR), expressed sequence tags (EST), serial analysis of gene expression (SAGE), representational difference analysis (RDA), cDNA-amplified fragment length polymorphism (cDNA-AFLP) and RNA sequencing (RNA-Seq). Focusing on defense-related processes in plants, we present a brief review and examples of each of these methodologies and their advantages and limitations regarding the study of plant gene expression.


Introduction
Defense mechanisms in plants are complex. During infection, cellular metabolism can be subverted by the pathogen towards its own survival, suppressing the defense responses of the host and favoring compatibility [1,2]. Although the molecular aspects of the defense responses and infection process in plants are not well characterized, it is known that the defense mechanisms include not only physical barriers, such as trichomes and the closure of stomata, but also the accumulation of antimicrobial compounds and defense-related proteins as well as the hypersensitive response and increased activity of peroxidases [3].
The activation of defense processes in plants is initiated when host cells recognize molecules either within the membrane of pathogens or produced by them. This interaction involves protein phosphorylation, ion fluxes, formation of reactive oxygen species, activation of transcription factors and endogenous signals, such as hormones, and the expression of several genes related to defense [4,5]. The survival of the pathogen in the host not only involves several specific gene products but also depends on the regulation of virulence genes: for a successful infection, avirulence (avr) genes must be expressed at specific stages of infection [6,7].
Understanding the cellular and molecular mechanisms involved in resistance and susceptibility assists in the development of control strategies and in the identification of pathogens and of factors implicated in the progress of a disease. New methods of disease control must be developed, which requires a better understanding of the interactions between pathogens and their hosts, especially the genes expressed during the course of infection [8]. One of the strategies for the analysis and study of plant-pathogen interactions is the determination of differential gene expression [9]. Gene regulation via extracellular signals is essential for development, homeostasis, defense and adaptation in plants [10].
The first stage of gene expression is the transcription of genes to produce RNA, the complete set of RNA transcripts produced by the genome at any one time being known as the transcriptome. In the context of this paper, plant transcriptomics refers to the study of differential gene expression and the molecular mechanisms involved in plant-pathogen interactions. Such interactions may enable, disable, increase or decrease the expression of several defense-response related genes, the products of which may be involved in different cellular pathways [9,11-13]. The differences in RNA expression during infection can be determined at both the host and pathogen level [9].
There are several methodologies and strategies that can be used to study gene expression, the most important of which are as follows: complementary DNA (cDNA) libraries; cDNA-amplified fragment length polymorphism (cDNA-AFLP) analysis; microarray analysis; suppressive subtractive hybridization (SSH); differential display (DD); RNA fingerprinting by arbitrary primed PCR (RAP-PCR); expressed sequence tag (EST) sequencing; serial analysis of gene expression (SAGE); representational difference analysis (RDA); and RNA sequencing (RNA-Seq).

Complementary DNA (cDNA) Libraries
Reverse transcriptase is used to convert RNA to cDNA (Figure 1) to create cDNA libraries for use in gene expression studies based on complementarity, in which messenger RNA (mRNA) serves as a template for the identification of a coding gene and the study of its regulation and function [14,15]. This conversion makes use of an mRNA template, dNTPs, reverse transcriptase and oligo dT primers, which anneal to the poly-A tail of the mRNA and start the conversion process. The fragments obtained by this process are then linked to vectors, cloned and sequenced. These sequences can be analyzed for homology by searching existing databases or can be used in microarrays or differential display techniques for the evaluation of differential expression [16]. The sequences can also be used in gene cloning experiments or for the construction of peptide libraries [17,18]. Specific primers that anneal to specific regions of the transcript can also be used, but this requires knowledge of the target sequence to be amplified [19].
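The priming step described above can be illustrated in silico. The following Python sketch is only a toy model, not a laboratory protocol: it treats an mRNA as a string, checks for a poly-A tail to which an oligo dT primer could anneal, and returns the first-strand cDNA as the reverse complement of the template. The function name `first_strand_cdna` and the `min_tail` threshold are assumptions introduced for this example.

```python
# Base-pairing rules for copying an RNA template into DNA.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def first_strand_cdna(mrna, min_tail=10):
    """Return the first-strand cDNA (5'->3') for an mRNA given 5'->3'.

    The oligo dT primer is assumed to anneal only when the mRNA ends in
    at least `min_tail` adenines (a hypothetical threshold); otherwise
    None is returned to indicate that no priming occurred.
    """
    tail_length = len(mrna) - len(mrna.rstrip("A"))
    if tail_length < min_tail:
        return None
    # The cDNA is synthesized antiparallel to the template, so the
    # template is read backwards and each base is complemented.
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


if __name__ == "__main__":
    transcript = "AUGGCC" + "A" * 12  # toy mRNA with a 12-base poly-A tail
    print(first_strand_cdna(transcript))
```

In this toy model the poly-T stretch at the 5' end of the returned cDNA corresponds to the oligo dT primer itself.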
Various cDNA libraries have been used in the identification and study of gene structure and the elucidation of molecular mechanisms [20]. The differential expression of these sequences allows analysis of the spatial and temporal distribution of gene products in both the pathogen and the host [21].
A study of the compatible and incompatible interactions between wheat and Puccinia striiformis (Pst) produced a cDNA library of 5793 expressed sequences, 2743 of which were unique sequences related to the wheat-Pst compatible interaction [8]. This study also showed that differential expression of genes occurred during different stages of the infection and that these differences were highly dependent on whether the plant-pathogen interaction was compatible or incompatible.
Differential gene expression in soybean infected with Phakopsora pachyrhizi has been investigated using cDNA libraries obtained by mRNA sequencing (mRNA-Seq), with this methodology detecting, among other things, increased expression of Clostridium stercorarium subsp. stercorarium (CSS) copper chaperones, cytochrome P450, O-methyltransferases and reductases, class IV chitinase, β-1,3-glucanases, glutathione S-transferase, lipoxygenase 2, ATP-binding cassette transporters (ABC transporters), dienelactone hydrolases and EF-hand proteins [22]. Most of these genes are directly or indirectly related to defense mechanisms in plants.
For example, copper chaperones are metal receptor proteins that carry copper to the cytoplasm and intracellular compartments or to specific sites such as copper-dependent enzymes. In plants, three members (CCH, CCS and COX17) of this family have been identified and classified [23]. Copper is involved in several physiological processes, including those related to defense, such as oxidative stress and the synthesis of receptors for the plant hormone ethylene [24,25]. Another example is cytochrome P450, which is involved in cellular detoxification processes, such as the detoxification of herbicides, and in the biosynthesis of cutin and hormones such as brassinosteroids [26]. Class IV chitinases act as fungicides by cleaving glycosidic linkages in fungal cell walls [27].

cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP)
In this technique, RNA is converted to cDNA and subsequently digested with two restriction enzymes, one a rare cutter and the other a frequent cutter. Synthetic linkers are attached to the ends of the cDNA fragments, and primers complementary to these linkers are used to amplify the fragments, which are then visualized and compared on a gel (tester × control). Differently sized fragments, representing differentially expressed genes, can then be isolated and sequenced [28]. In a study in which wheat was infected with P. striiformis, cDNA-AFLP produced 255 transcripts with expression changes after infection, of which 161 were classified as basal because they were induced in both compatible and incompatible interactions, while 94 were preferentially expressed in the incompatible interaction [29]. These sequences showed homology to genes related to metabolism and photosynthesis, defense and signal transduction, transcription, transport, protein metabolism and cell structure.
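The double digestion that generates cDNA-AFLP fragments can be sketched computationally. The Python example below is a simplified illustration and makes several assumptions that are not in the text: the enzyme pair (EcoRI as the rare cutter and MseI as the frequent cutter, a combination often used in AFLP work), the function names, and the rule that only fragments with one end from each enzyme are treated as amplifiable. It models fragment generation only, not band intensity on a gel.

```python
import re

# Assumed enzymes, chosen only for illustration.
RARE_SITE = "GAATTC"    # EcoRI, cuts G^AATTC
FREQUENT_SITE = "TTAA"  # MseI, cuts T^TAA
CUT_OFFSET = 1          # both enzymes cut after the first base of the site


def cut_positions(seq, site, offset=CUT_OFFSET):
    """Return the positions at which `site` is cut within `seq`."""
    # The lookahead also finds overlapping occurrences of the site.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def double_digest(seq):
    """Digest `seq` with both enzymes.

    Returns (fragment, left_end, right_end) tuples, where each end is
    labeled "rare", "frequent" or "end" (an original cDNA terminus).
    """
    cuts = [(p, "rare") for p in cut_positions(seq, RARE_SITE)]
    cuts += [(p, "frequent") for p in cut_positions(seq, FREQUENT_SITE)]
    cuts.sort()
    bounds = [(0, "end")] + cuts + [(len(seq), "end")]
    return [
        (seq[a:b], left, right)
        for (a, left), (b, right) in zip(bounds, bounds[1:])
        if b > a
    ]


def amplifiable_sizes(seq):
    """Sizes of fragments carrying one rare-cutter and one frequent-cutter end."""
    return sorted(
        len(fragment)
        for fragment, left, right in double_digest(seq)
        if {left, right} == {"rare", "frequent"}
    )


if __name__ == "__main__":
    toy_cdna = "ACGTGAATTCACGTACGTTTAACCGG"  # made-up sequence
    print(amplifiable_sizes(toy_cdna))
```

Restricting amplification to fragments with mixed ends is one way to model how linker-specific primers reduce the number of bands on the gel.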

Microarrays
Microarray hybridization technology allows the study of a large number of genes from different species. Oligonucleotide arrays can be classified as high-density microelectronic arrays, macroarrays and microarrays [30]. Microarrays are based on a glass, plastic or nylon matrix to which specific gene probes are attached in such a manner that complementarity can be obtained between the attached nucleic acid (RNA or DNA) probes and free complementary target nucleic acid labeled with a fluorescent dye. Low-intensity or high-intensity fluorescence indicates low or high expression of a particular gene within a pathosystem or in plants subjected to a specific stress [31,32]. This method allows the simultaneous analysis of thousands of genes of interest and the identification of both their presence and differential expression, the latter allowing inferences to be made regarding the possible function of specific genes [11].
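The link between fluorescence intensity and expression level can be made concrete with a small calculation. The Python sketch below assumes a two-sample comparison in which each spot has one intensity for the treated sample and one for the control; the gene names, the intensity floor and the two-fold threshold (a log2 ratio of 1) are hypothetical choices for illustration and are not taken from the studies cited here.

```python
import math


def log2_ratio(treated, control, floor=1.0):
    """Return log2(treated / control), flooring intensities to avoid division by zero."""
    return math.log2(max(treated, floor) / max(control, floor))


def classify(spots, threshold=1.0):
    """Label each gene as "up", "down" or "unchanged".

    `spots` maps a gene name to a (treated, control) intensity pair;
    `threshold` is the absolute log2 ratio required for a call.
    """
    calls = {}
    for gene, (treated, control) in spots.items():
        ratio = log2_ratio(treated, control)
        if ratio >= threshold:
            calls[gene] = "up"
        elif ratio <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


if __name__ == "__main__":
    # Made-up intensities for three hypothetical genes.
    spots = {"geneA": (4000, 1000), "geneB": (500, 2000), "geneC": (1200, 1000)}
    print(classify(spots))
```

Real microarray analyses also involve background correction, normalization and replicate-based statistics, which this sketch omits.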
In diverse wheat tissues infected with Fusarium graminearum, this methodology found 185 up-regulated and 16 down-regulated genes, with the up-regulated sequences showing homology to plant genes related to stress and defense responses, such as β-1,3-glucanase and class I chitinase, as well as to genes involved in oxidative reactions, regulatory functions, protein synthesis and the phenylpropanoid pathway [33].
A study of wheat cultivars infected with F. graminearum was based on a cDNA microarray library of wheat expressed sequence tags (ESTs) obtained using suppressive subtractive hybridization and found 25 differentially expressed wheat UniGenes [34]. The authors reported that the chromosomal segments 2AL and 3BS of the wheat cultivar Sumai-3 were important in the activation of the defense mechanisms against F. graminearum; for example, the loss of segment 3BS reduced the activity of genes encoding the defense-related proteins PR-2 (β-1,3-glucanase), PR-4 (wheatwins) and PR-5 (thaumatin-like proteins).
The expression profile of wheat responding to Puccinia triticina leaf rust was studied by contrasting compatible and incompatible interactions with the isogenic wheat line RL6003, the most contrasting time-points being 6 hours and 24 hours post-inoculation [35]. Several genes associated with photosynthesis, oxidative reactions, signal transduction, ubiquitination and precursors to the shikimate-phenylpropanoid pathway showed differential expression between the compatible and incompatible interactions and appeared to represent key genes in the metabolism of defense [35]. The time-points coincided with the intracellular growth of the fungus and the formation of the septum that separates the infective hyphae from the newly formed primary haustorial mother cell, which had been described in a previous study [36].
Microarrays have been used to assess differential expression in two barley cultivars (L94, susceptible; Vada, partially resistant) infected or uninfected with Puccinia hordei, identifying 802 fragments related to plant genes responsive to this fungus that were present in inoculated plants but absent from uninoculated controls [37]. Of the genes detected, 584 were common to both cultivars, 34 were specific to L94, 24 were specific to Vada and 160 showed no homology to genes described in the literature and thus represented potential genes to be studied. When the authors compared the differential expression between the two cultivars they found that 1411 genes were differentially expressed, including, among others, genes for transcription factors, PR proteins, protein receptors, R genes, hormones and protein carriers.

Suppressive Subtractive Hybridization (SSH)
This method separates differentially expressed genes by hybridizing sequences from the sample under study (the "tester") with those from a control sample (the "driver"). In other words, tester sequences with no homology to the driver are separated from the pool as differentially expressed [38].
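The logic of the subtraction described above can be mimicked in silico. The following Python sketch is a crude analogy rather than a model of the wet-lab procedure: "homology" is approximated by a shared subsequence of length `k`, only one strand is considered, and the function names, the value of `k` and the toy sequences are all assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of all subsequences of length `k` in `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def subtract(tester, driver, k=8):
    """Return tester sequences that share no k-mer with any driver sequence.

    `tester` and `driver` map sequence names to sequences. A shared
    k-mer stands in for hybridization between tester and driver.
    """
    driver_kmers = set()
    for seq in driver.values():
        driver_kmers |= kmers(seq, k)
    return {
        name: seq
        for name, seq in tester.items()
        if not (kmers(seq, k) & driver_kmers)
    }


if __name__ == "__main__":
    # Made-up sequences: t1 also occurs in the driver, t2 does not.
    tester = {"t1": "ATGGCGTACGTTAGC", "t2": "TTTTCCCCGGGGAAAA"}
    driver = {"d1": "CCATGGCGTACGTTAGCAA"}
    print(sorted(subtract(tester, driver)))
```

Sequences present in both pools are removed, leaving only the tester-specific candidates, which is the outcome the hybridization step aims for.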
A study of genes involved in the pathogenicity mechanisms of F. graminearum identified four genes (Abc2, Lyp1, Rrr1 and Zbc1) potentially involved in the pathogenesis and development of this fungus in wheat [39].
Another study involving F. graminearum found that 24 genes were differentially expressed in the interaction between this fungus and wheat, of which 8 showed homology with genes from the pathogen and 16 with wheat genes, including those encoding cytochrome P450, actin depolymerizing factors, chitinase, histone H4, pyruvate decarboxylase and S-adenosylmethionine decarboxylase [40].
In a study of the drought response of Leymus secalinus (wild rye, wheatgrass), 16 sequences were differentially expressed in response to drought, 13 of which showed homology to genes previously described in the literature, such as betaine aldehyde dehydrogenase, rice heat shock protein 70, barley ribulose 1,5-bisphosphate carboxylase activator protein and maize ubiquitin conjugating enzyme E2-17Ka [41].

Differential Display (DD) and RNA Fingerprinting by Arbitrary Primed PCR (RAP-PCR)
Differential display and RAP-PCR are methodologies that involve the conversion of RNA to cDNA using arbitrary (RAP-PCR) or 3' oligo dT (DD) primers, followed by PCR amplification. After the conversion step, the cDNAs are amplified with a set of arbitrary primers used together with the 3' oligo dT or arbitrary primers employed in the conversion to cDNA [42]. The amplified products are separated on a gel, and bands that differ between sample and control are visualized, extracted and sequenced. The differential expression of such sequences can be confirmed by real-time polymerase chain reaction (RT-PCR) or microarray analysis [43,44].
A study of differential expression in wheat in response to yellow rust using DD showed the differential expression of 14 gene fragments, some of which showed homology with genes involved in the synthesis of antifungal cyclophilins and ubiquitins (Rad6) [45]. These authors also showed that these sequences were involved in the process of programmed cell death and in defense and resistance mechanisms.

Expressed Sequence Tag (EST) Sequencing and Serial Analysis of Gene Expression (SAGE)
In these techniques, differential expression is evaluated by the number of times a particular sequence, either randomly selected from a cDNA/EST library or a specific SAGE sequence, appears in a given library or libraries, or by its presence or absence [42]. In SAGE, the RNA is converted to cDNA using biotin-linked 3' oligo dT primers, and the cDNA is subsequently cleaved with restriction enzymes. The fragments generated are attached to adapters, linked and amplified using PCR. The resulting fragments are concatemers, generated by joining various fragments, which are then cloned and sequenced for the analysis of differential expression [46,47].
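The counting principle behind EST and SAGE comparisons can be sketched in a few lines. The Python example below assumes that each library has already been reduced to a list of tags; the four-letter tags, the function names and the status labels are placeholders invented for this illustration. Counts are converted to relative frequencies so that libraries of different sizes can be compared.

```python
from collections import Counter


def tag_frequencies(tags):
    """Return the relative frequency of each tag in one library."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def compare_libraries(tester_tags, control_tags):
    """Compare two libraries tag by tag.

    Returns a dict mapping each tag to a tuple
    (tester frequency, control frequency, status).
    """
    tester = tag_frequencies(tester_tags)
    control = tag_frequencies(control_tags)
    table = {}
    for tag in sorted(set(tester) | set(control)):
        t = tester.get(tag, 0.0)
        c = control.get(tag, 0.0)
        if c == 0.0:
            status = "tester only"
        elif t == 0.0:
            status = "control only"
        elif t > c:
            status = "higher in tester"
        elif t < c:
            status = "lower in tester"
        else:
            status = "equal"
        table[tag] = (t, c, status)
    return table


if __name__ == "__main__":
    # Made-up libraries; real SAGE tags are longer than these placeholders.
    tester = ["AAAC"] * 6 + ["GGTT"] * 2 + ["CCCA"] * 2
    control = ["AAAC"] * 2 + ["GGTT"] * 2 + ["TTTG"]
    for tag, row in compare_libraries(tester, control).items():
        print(tag, row)
```

A real analysis would add a statistical test for each tag, since small count differences between libraries can arise by chance.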
Research using EST libraries indicates that wheat responding to infection with Blumeria graminis f. sp. tritici showed differential expression of genes coding for the enzymes ferulate 5-hydroxylase (F5H), phenylalanine ammonia lyase (PAL), cinnamoyl-CoA reductase (CCR), caffeic acid O-methyltransferase (CAOMT), caffeoyl-CoA 3-O-methyltransferase (CCOAMT) and cinnamyl alcohol dehydrogenase (CAD). These enzymes are involved in the synthesis of monolignols during different stages of infection, indicating that monolignol biosynthesis is an important step in the wheat defense process [48].

Representational Difference Analysis (RDA)
The RDA technique combines subtractive libraries, as described by Lisitsyn et al. [49], with PCR amplification, resulting in the enrichment of differentially expressed fragments. In other words, the RNA is converted to cDNA and digested with restriction enzymes. Adapters are attached to the fragments, which are then amplified by PCR. The tester sample is digested again and linked to new adapters, complementary to those previously applied, whereas the driver sample is only digested with restriction enzymes. The two samples are then hybridized, and only the tester-tester hybrids (carrying the additional adapters) are exponentially amplified [50,51].
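The enrichment achieved by selective amplification can be illustrated with idealized arithmetic. The Python sketch below assumes perfect PCR efficiency and the standard RDA picture in which tester-tester hybrids carry adapters on both strands and double each cycle, tester-driver hybrids carry an adapter on one strand and accumulate linearly, and driver-driver hybrids are not amplified. The function name and the cycle number are hypothetical.

```python
def copies_after_pcr(hybrid, cycles, start=1):
    """Idealized copy number of one hybrid type after `cycles` PCR cycles."""
    if hybrid == "tester-tester":
        # Primer sites on both strands: exponential amplification.
        return start * 2 ** cycles
    if hybrid == "tester-driver":
        # Primer site on one strand only: one new copy per cycle.
        return start * (cycles + 1)
    if hybrid == "driver-driver":
        # No primer sites: no amplification.
        return start
    raise ValueError(f"unknown hybrid type: {hybrid}")


if __name__ == "__main__":
    cycles = 20  # assumed number of cycles
    for hybrid in ("tester-tester", "tester-driver", "driver-driver"):
        print(hybrid, copies_after_pcr(hybrid, cycles))
```

Under these assumptions the difference between exponential and linear growth is what enriches tester-specific fragments, and it also shows why an unsuitable number of cycles can distort the result.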
A study to evaluate the non-host interaction of Polymyxa graminis on beet confirmed that about 17 genes are up-regulated in this interaction [52]. These genes are related to metabolism (e.g. NADP-isocitrate dehydrogenase), synthesis and protein processing (e.g. ubiquitin extension protein), oxidative stress (e.g. class VII chitinase precursor), cell wall and development (e.g. glycine-rich protein) and signal transduction (e.g. serine/threonine protein kinase). A study to identify differentially expressed genes from Xanthomonas axonopodis pv. citri growing under various conditions on the leaves of sweet orange (Citrus sinensis) plants identified the differential expression of genes related to protein synthesis, cell metabolism, pathogenicity and mobile gene elements [53].

RNA Sequencing (RNA-Seq)
The sequencing of RNA enables the entire transcriptome of a species to be studied using only small amounts of RNA. The data obtained by RNA-Seq analysis can be analyzed using bioinformatics tools, and it has been reported that this methodology, coupled with real-time PCR (RT-PCR), is one of the most effective strategies to discover new genes [54]. As a sequencing-based methodology, it estimates gene expression from the numerical frequency of a given sequence in the library.
The RNA-Seq technique has several advantages over other methods, the most important of which are that it does not rely on prior knowledge of an organism's genes, can reveal the precise location of the connections between transcripts and the connectivity between exons, reveals the existence of single nucleotide polymorphisms (SNPs) and has high sensitivity and reproducibility [55]. However, this technology also has limitations, such as the effect of transcript size, whereby larger transcripts are detected more easily than smaller ones, and the dependence on the size/type of sequencing for the detection of genes showing lower expression [56]. The need for statistical programs capable of storing, evaluating and processing this huge amount of data has also become one of the major challenges encountered when applying this methodology [29,57].
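The transcript-size effect mentioned above is commonly handled by length normalization. The Python sketch below uses reads per kilobase of transcript per million mapped reads (RPKM), a widely used measure that is not discussed in the text itself; the gene names, read counts and lengths are invented for illustration.

```python
import math


def rpkm(read_counts, lengths_bp):
    """Return RPKM per gene.

    RPKM = reads * 1e9 / (total mapped reads * transcript length in bp).
    `read_counts` and `lengths_bp` are dicts keyed by gene name.
    """
    total_reads = sum(read_counts.values())
    return {
        gene: count * 1e9 / (total_reads * lengths_bp[gene])
        for gene, count in read_counts.items()
    }


if __name__ == "__main__":
    # Two hypothetical transcripts present at the same molar abundance:
    # the longer one yields four times as many reads.
    counts = {"long_gene": 4000, "short_gene": 1000}
    lengths = {"long_gene": 4000, "short_gene": 1000}
    values = rpkm(counts, lengths)
    print(values)
    print(math.isclose(values["long_gene"], values["short_gene"]))
```

Raw counts would suggest a four-fold difference between the two transcripts, whereas the length-normalized values are equal, which is the bias the normalization is meant to remove.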

Advantages and Limitations
Differential expression analysis is based on gene regulation; that is, under different physiological situations, or in response to different stimuli, the expression of genes is increased or decreased. Construction of cDNA libraries is one of the strategies used to obtain differentially expressed sequences and is useful for the identification of genes involved in contrasting situations [8]. The construction of such libraries by differential analysis can be performed using the total RNA and/or fragments obtained by subtractive hybridization techniques, always comparing the test sample with a control sample. These sequences are then used in a microarray or subjected to RT-PCR to confirm the differential expression of genes (tester × control) [22].
Microarrays enable the study of many genes simultaneously, in a semi-quantitative way, and have been widely used to address several biological, genetic and biochemical issues [37,58]. Normally this analysis is performed with previously known gene sequences inserted into an array. Nevertheless, this array can be supplied with sequences derived from RNA libraries constructed specifically for the situation under study. In this case the selected sequences should be informative and amenable to physiological or interaction study [58]. However, this type of analysis depends on prior knowledge of the genetic content of a species, although this information is often available in existing databases. Furthermore, the sequences available in databases are usually cultivar non-specific and are not PCR amplified, decreasing the sensitivity of this method [29].
The association between microarray methodology and RDA has the potential to increase the efficiency of both obtaining and analyzing differentially expressed sequences because the RNA sequences used are amplified and non-redundant. In addition, they are definitely related to the situation under study since they represent the result of enriched subtraction between the two contrasting situations (tester × control) [58]. This combination simplifies the interpretation of the results and the identification of differentially expressed genes, while allowing rare transcripts to be identified by amplification [29,59].
The cDNA-RDA technique allows the detection of relative (greater than 3 to 5 times) and absolute expression differentials. The caveat is that this methodology not only requires representative, high-quality mRNA but is also dependent on the affinity of the oligo (dT), requires cleavage with a varying number of restriction enzymes and can generate false positives if the tester:driver proportion is inadequate or the number of amplification cycles is suboptimal [29,59].
Suppressive subtractive hybridization, as with the cDNA-RDA technique, involves the separation of differentially expressed sequences, simplifying the evaluation of the library and reducing costs [60,61]. Studies using the SSH method permit comparisons not only of the function of genes involved in a particular disease but also of those involved in the development of plants and/or pathogens and of the differential expression of tissue-specific proteins, among others [38]. However, this method only allows the collection of sequences with increased expression or unisequences in a given population, requires a large amount of good quality RNA and involves multiple and repetitive steps [11].
RNA fingerprinting by arbitrary primed PCR and differential display are less laborious and have the potential to identify differentially expressed genes; however, there is a high chance of false positives [62,63]. This is because not only are gel bands difficult to analyze but the size of the fragment is not directly associated with a gene. In addition, a single band may represent more than one cDNA or may not be representative of the gene because, among other reasons, a primer may produce sequences with only a small portion of the coding region. There is also the possibility of contamination between samples when the fragment is cut from the gel. All these factors have been discussed in more detail elsewhere [58,64].
A further consideration is that rare transcripts may not be adequately amplified, thus reducing the sensitivity of the test [65]. Nevertheless, the DD technique is simple and relatively sensitive compared to other available methodologies, detects genes with increased and decreased expression and allows more than two samples to be compared at the same time [11,44].
The cDNA-AFLP method functions as a global analysis of differential expression because it requires no prior visual identification of differentially expressed transcripts, broadly covers the transcripts and is a tool for the simultaneous study of genes in both members (pathogen-host) of a pathosystem, thus allowing the characterization and study of the gene expression profile over time [9,28,29,66,67]. However, this technique has the same limitations inherent to DD and RNA fingerprinting by arbitrary primed PCR.
The SAGE methodology enables the assessment of differential gene expression based on the identification and quantification of SAGE sequences in a library. This technique allows the quantitative and cumulative evaluation of plants, pathogens and pathosystems and also generates a large amount of data. Nevertheless, this methodology requires an extensive input of both time and work [58]. Furthermore, SAGE requires a next-generation sequencing platform, and it can generate unreliable results owing to sequencing errors, the use of inadequate restriction enzymes or a specific gene being represented by multiple SAGE sequences [42].

Final Considerations
The analysis of differential expression is an effective and efficient strategy for the study of biological cellular pathways, because it allows the detection of genes and the elucidation of molecular mechanisms related to physiological events, signal transduction, primary and secondary metabolism, defense mechanisms, the stress response and other genetic and physiological factors.
There are various methodologies and strategies that can be applied to study differential expression and all the techniques discussed in this article can be used in this type of study, taking into account the limitations inherent to each (Table 1). The choice of technique and/or strategy to be used is determined by the available technology plus the cost and the limitations of the analysis. When choosing the techniques used in a particular investigation, these limitations should be minimized methodologically or statistically so that the results are reliable and representative of the situation under study.

Figure 1. (a) Scheme of the conversion of mRNA into cDNA. Adapted from Addison Wesley Longman, Inc.; (b) Strategies and methodologies to obtain cDNA libraries and perform gene expression studies.

Table 1. Advantages, critical points and inherent limitations of differential expression analysis methodologies.

Methodology: [...]
Advantages: [...]
Critical points and limitations: Requires large amount of good quality RNA; the affinity of the oligo dT is a determinant in the amplification; use of various restriction enzymes; the tester:driver proportion is crucial in the subtraction process; amplification cycles are crucial in obtaining differentially expressed sequences.

Methodology: [...]
Advantages: [...]
Critical points and limitations: Fragment may comprise more than one gene, or may not be representative of one gene; contamination between samples may occur during the excision of a fragment from the gel; rare fragments with low expression may not be adequately amplified and identified.

Methodology: SAGE
Advantages: Quantitative and cumulative analysis; large amount of data.
Critical points and limitations: High cost and time-consuming in data analysis; requires a next-generation sequencing platform; multiple sequences can represent a single gene.

Methodology: RNA-Seq
Advantages: Large amount of data; finds connections between exons and other transcripts; finds SNPs; sensitive and reproducible analysis.
Critical points and limitations: Size of the transcripts influences the amplification; requires a next-generation sequencing platform; requires a complex analysis using multiple computer programs to analyze the data.