Transcriptome Analysis of Reaction Wood in Gymnosperms by Next-Generation Sequencing

Special xylem tissue called “compression wood” is formed on the lower side of inclined stems when gymnosperms grow on a slope. We investigated the molecular mechanism of compression wood formation. Transcriptome analysis by next-generation sequencing (NGS) was applied to the xylem of Chamaecyparis obtusa to develop a catalog of general gene expression in differentiating xylem during compression and normal wood formation. The sequencing output generated 234,924,605 reads and 40,602 contigs (mean size = 529 bp). Based on a sequence similarity search with known proteins, 54.2% (22,005) of the contigs showed homology with sequences in the databases. Of these annotated contigs, 19,293 contigs were assigned to Gene Ontology categories. Differential gene expression between the compression and normal wood libraries was analyzed by mapping the reads from each library to the assembled contigs. In total, 2875 contigs were identified as differentially expressed, including 1207 that were up-regulated and 1668 that were down-regulated in compression wood. We selected 30 genes and compared the transcript abundance between compression and normal wood by quantitative polymerase chain reaction analysis to validate the NGS results. We found that 27 of the 30 genes showed the same expression patterns as the original NGS results.


Introduction
A special xylem tissue called reaction wood is formed in leaning stems when trees grow on an incline. Reaction wood includes both tension wood in angiosperms and compression wood in gymnosperms such as Chamaecyparis obtusa, which was used in this study. Tension wood is usually produced on the upper side of leaning angiosperm stems, and compression wood is usually produced on the lower side of leaning gymnosperm stems [1]. Compared with normal wood, compression wood shows accelerated cambial growth on the lower side of the inclined stem, thicker tracheid walls, higher lignin content, and larger microfibril angles in the S2 layer. The combination of these anatomical and chemical characteristics generates high compressive growth stress in the compression wood region, which acts to mechanically bend a leaning stem upward toward the vertical [2].
To clarify the molecular mechanism of compression wood formation, some studies have screened for genes and proteins that exhibit cumulative changes during compression wood formation [3]-[7]. However, most of the screened genes play an immediate role in cell wall formation or metabolism. Thus, the trigger for compression wood formation (i.e., which kind of stimulus is involved and how the stimulus reaches cells to start compression wood formation) remains unknown. In this study, we applied RNA-Seq to C. obtusa differentiating xylem and compared transcript abundance between compression and normal wood. RNA-Seq refers to transcriptome analysis by next-generation sequencing (NGS); it offers a higher dynamic range of detection than microarrays and a good ability to detect novel transcripts [8] [9]. Therefore, RNA-Seq is considered applicable to finding genes involved in compression wood formation that are expressed at low levels and thus cannot be detected by conventional methods. RNA-Seq analysis is applied primarily to organisms with complete reference genomes, but advances in de novo assembly programs have enabled its application to non-model organisms that lack reference sequences. To date, de novo assembly and transcript profiling have been conducted in various non-model plant species such as Eucalyptus, Taxus, rubber trees, and garlic [10]-[13].
In this study, we conducted a transcriptome analysis using NGS in C. obtusa to develop a catalog of general gene expression in the xylem, including both compression and normal wood. Gene expression in compression and normal wood was then compared based on the catalog. We discuss the mechanism by which the characteristics of compression wood are expressed based on the results.

Plant Material and RNA Extraction
Experiments were conducted from April to June 2012 in a field owned by Nagoya University, Japan. Six 3-year-old Japanese cypress (C. obtusa) saplings (about 120 cm in height and 10 mm in diameter) were planted in plastic pots filled with a mixture of red soil and compost. The saplings were loosely fixed to a stake using wire to maintain vertical stem growth. Three saplings were artificially inclined after cambial growth had begun in May. The angle of inclination was 30˚ from the vertical. The remaining three saplings were grown vertically. Sampling was conducted during the most active period of cambial growth in June. After removing the bark, differentiating xylem tissues were scraped from the stems. Tissue from the lower sides of the stems (compression wood) was collected from the inclined saplings, and tissue from both sides of the stems (normal wood) was collected from the vertical saplings. These tissues were immediately frozen in liquid nitrogen and stored at -80˚C until use. Total RNA was extracted from 200-300 mg of xylem sample using an RNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA) according to the manufacturer's protocol. The RNA samples were treated with DNase I (TaKaRa Bio, Otsu, Japan) to digest contaminating genomic DNA. Poly(A)-containing mRNAs were purified from the total RNA samples using the Dynabeads mRNA Purification Kit (Invitrogen, Carlsbad, CA, USA). The quality of the isolated mRNA was checked using an Agilent 2100 Bioanalyzer RNA chip (Agilent Technologies, Santa Clara, CA, USA).

Library Preparation and Sequencing
mRNA was fragmented and used to generate libraries for NGS. The fragment libraries were prepared using the SOLiD Total RNA-Seq Kit (Life Technologies, Carlsbad, CA, USA). The procedure for this kit is based on hybridization of adapters, followed by reverse transcription and library amplification by polymerase chain reaction (PCR). Barcoded SOLiD 3' Primers from the SOLiD RNA Barcoding Kit (Life Technologies) were used to distinguish the six samples. The cDNAs were size-selected using Agencourt AMPure XP (Beckman Coulter, Brea, CA, USA) before and after library amplification. The yield and size distribution of the amplified DNA were assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies). The pooled library was sequenced by multiplex paired-end sequencing on a flow cell using the 5500xl SOLiD system (Life Technologies). The sequence data were deposited in the DDBJ Sequence Read Archive under the accession number DRA001036.

De Novo Assembly
De novo sequence assembly was carried out on all SOLiD reads from all libraries using the CLC Genomics Workbench (CLC Bio, Aarhus, Denmark). The minimum contig size was 200 bp. The CLC Genomics Workbench performed scaffolding using paired-end read information. We use the term "contigs" to refer to both contigs and scaffolds.
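Assembly summary statistics such as the mean contig length and the N50 can be computed from a list of contig lengths. The following is a minimal sketch, not the procedure used by the CLC Genomics Workbench; the function name `n50` and the example lengths are hypothetical, and the 200 bp threshold mirrors the minimum contig size stated above.

```python
def n50(lengths, min_len=200):
    """Return the N50 of the contigs that are at least min_len bp long.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    kept = sorted((n for n in lengths if n >= min_len), reverse=True)
    if not kept:
        raise ValueError("no contigs pass the minimum length filter")
    half_total = sum(kept) / 2
    running = 0
    for n in kept:
        running += n
        if running >= half_total:
            return n


# Hypothetical contig lengths (bp); the 100 bp contig is filtered out.
example_lengths = [100, 200, 300, 400, 500]
example_n50 = n50(example_lengths)
example_mean = sum(n for n in example_lengths if n >= 200) / 4
```

Because N50 weights contigs by the bases they contain, it is usually larger than the mean length when the assembly contains many short contigs.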

BLAST Search against Presently Available C. obtusa Sequences
All 5897 expressed sequence tag (EST) sequences of C. obtusa were downloaded from PlantGDB [14] and used for a nucleotide BLAST search against the 40,602 contigs using an E-value cutoff of 10^-6.
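The E-value filter described above can be illustrated with a short sketch. It assumes, purely for illustration, that the BLAST results were written in the standard 12-column tabular format (query ID in the first column, subject ID in the second, E-value in the eleventh); the output format actually used in this study is not stated, and the function name and example rows are hypothetical.

```python
def matched_ids(tabular_lines, max_evalue=1e-6):
    """Collect query and subject IDs from BLAST tabular rows passing the E-value cutoff."""
    queries, subjects = set(), set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])  # eleventh column holds the E-value
        if evalue <= max_evalue:
            queries.add(fields[0])
            subjects.add(fields[1])
    return queries, subjects


# Hypothetical rows: EST1 hits two contigs; EST2 fails the cutoff.
rows = [
    "EST1\tcontig_5\t98.0\t300\t6\t0\t1\t300\t10\t309\t1e-50\t500",
    "EST2\tcontig_9\t80.0\t50\t10\t0\t1\t50\t1\t50\t0.01\t40",
    "EST1\tcontig_7\t95.0\t120\t6\t0\t1\t120\t5\t124\t1e-10\t150",
]
est_hits, contig_hits = matched_ids(rows)
```

Counting the distinct query IDs gives the number of ESTs matched by at least one contig, and counting the distinct subject IDs gives the number of contigs matched by at least one EST.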

Functional Annotation and Classification
All assembled contigs were used for homology searches against protein databases such as non-redundant (Nr) and Swiss-Prot with the BLASTX program (E-value cutoff 10^-6), and the alignment results were used to annotate the contigs. Functional annotation by gene ontology (GO) terms was performed using the Blast2GO program [15]. To complement the BLAST-based annotations with domain-based annotations, InterProScan was run and the resulting GO terms were merged with the existing annotations. GO-slim reduction was used to reduce the amount of functional information and to summarize the functional content of the dataset. The ESTScan program was used to detect potential coding regions in the transcript sequences obtained by assembly [16]. The Transeq package from EMBOSS was employed to obtain amino acid sequences [17]. The sequences were aligned to the eukaryotic Clusters of Orthologous Groups of proteins (COG) database (E-value cutoff 10^-6) to predict and classify possible functions [18]. Pathway assignments were performed according to the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (E-value cutoff 10^-6) [19]. The WebMGA server was used to perform the eukaryotic COG and KEGG classifications [20].

Normalization of Gene Expression Levels and Analysis of Differential Gene Expression
SOLiD reads from each library were mapped to the assembled contigs using the CLC Genomics Workbench (CLC Bio), and the reads mapped to each contig were counted. Normalization and differential expression analysis between compression and normal wood samples were performed using the R package TCC (ver. 1.0.0) with a false discovery rate (FDR) < 0.05 [21].
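To illustrate what an FDR threshold of 0.05 means in practice, the sketch below applies the Benjamini-Hochberg adjustment to a list of per-contig p-values. This is a generic illustration of one common FDR procedure and is not claimed to be the calculation that TCC performs internally; the function name and the p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four contigs.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = bh_adjust(pvals)
significant = [q < 0.05 for q in qvals]
```

A contig is called differentially expressed when its adjusted value falls below the chosen threshold, so that the expected proportion of false positives among the called contigs is kept below 5%.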

Quantitative Real-Time PCR (qPCR) Validation of the Differentially Expressed Genes
The transcription profiles of the 30 selected genes were analyzed by qPCR. Six C. obtusa saplings were grown from April to June 2013 for quantification. Three saplings were artificially inclined at an angle (compression wood), and the remaining three saplings were grown vertically (normal wood). Sampling of the compression and normal wood was conducted 21, 25, and 28 days following initiation of the inclination stimulus. Total RNA was extracted from each sapling and treated with DNase I as described above. Approximately 1 µg of DNase I-treated total RNA was converted into cDNA using PrimeScript RT Master Mix (TaKaRa Bio). The cDNA products were then diluted twelve-fold with nuclease-free water before being used as a template in the qPCR.
The quantitative reaction was performed on a StepOnePlus Real Time PCR System (Life Technologies) using the POWER SYBR Green PCR Master Mix (Life Technologies). The reaction mixture (20 µl) contained 2× POWER SYBR Master Mix, 0.2 µM each of the forward and reverse primers, and 2 µl of template cDNA. The primers were designed using Primer3Plus [22] and synthesized commercially. PCR amplification was performed under the following conditions: 95˚C for 10 min, followed by 40 cycles at 95˚C for 15 s and 58˚C for 60 s. A dissociation curve was generated after each PCR run to confirm product specificity and to rule out the production of primer dimers. Expression of the selected genes was normalized against ubiquitin (BW987637) as an internal reference gene. Relative gene expression was calculated using the 2^(-ΔΔCt) method [23]. All reactions were performed in triplicate, and each sample was also amplified without reverse transcription to check for genomic DNA contamination of the sample. Differences among saplings were determined using one- and two-factor analysis of variance (ANOVA). When the ANOVA was significant, differences among individual saplings were estimated using post hoc Tukey's tests at alpha = 0.01.
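The relative quantification step can be sketched as follows. This is a minimal illustration of the standard 2^(-ΔΔCt) calculation, assuming ubiquitin as the reference gene and NW1 as the calibrator sample (as indicated in the caption of Figure 6); the function name and the Ct values are hypothetical, and the method assumes similar, near-100% amplification efficiency for the target and reference genes.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample.
    ddCt = dCt(sample) - dCt(calibrator sample).
    """
    dct_sample = ct_target_sample - ct_ref_sample
    dct_calibrator = ct_target_calibrator - ct_ref_calibrator
    ddct = dct_sample - dct_calibrator
    return 2 ** (-ddct)


# Hypothetical mean Ct values: target gene and ubiquitin in a compression
# wood sample, and the same two genes in the calibrator (NW1).
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)

# The calibrator compared with itself is 1 by definition.
calibrator_value = relative_expression(26.0, 18.0, 26.0, 18.0)
```

With these hypothetical values the target reaches threshold two cycles earlier in the sample than in the calibrator after normalization, which corresponds to a four-fold higher relative transcript abundance.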

Tissue Observations
After harvesting differentiating xylem for RNA extraction, stem segments were harvested and fixed in 3% glutaraldehyde in 0.1 M phosphate buffer (pH 6.98) for 1 week at 4˚C. Transverse sections, 12 µm thick, were prepared from the segments using a sliding microtome. The sections were double-stained with safranin/astra blue and dehydrated in an increasing ethanol series. After soaking in xylene, the sections were mounted on glass slides with Entellan Neu (Merck, Darmstadt, Germany) and observed under a light microscope (BX60; Olympus, Tokyo, Japan). An intense red safranin stain indicated a high lignin content in the cell wall.

Anatomical Observations
After harvesting differentiating xylem for RNA extraction, the remaining xylem tissues were collected and observed by light microscopy (Figure 1). On the lower side of the saplings grown on an incline, the cell outlines were rounded, and thick cell walls and intercellular spaces were observed (Figure 1(a)). An intense red safranin stain was observed in the inclined saplings. In the vertical saplings, the cell outlines in cross section were rectangular or hexagonal, with no intercellular spaces (Figure 1(b)).

Sequencing and de Novo Assembly
RNA was extracted from both compression and normal wood samples to achieve a broad survey of genes associated with xylogenesis. Using a SOLiD paired-end sequencing platform, 234,924,605 reads were obtained, totaling 12,859,597,397 nucleotides (12.8 Gb). The datasets were submitted to the DDBJ database (accession number DRA001036). From these reads, 40,602 contigs with an average length of 529 bp and an N50 of 631 bp were assembled (Table 1). The size distribution of the contigs is shown in Figure 2. We used ESTScan [16] to detect potential coding regions in the sequences obtained by the assembly and found that 29,470 contigs (72.6%) contained a predicted coding region.

Comparison of Assembled Contigs with C. obtusa ESTs Deposited in the Database
To evaluate the quality of the assembled contigs, we downloaded all C. obtusa cDNA sequences available in PlantGDB as of December 2013, which comprised the 5897 ESTs submitted by Ujino-Ihara et al. [24] and Yamashita et al. [5]. These sequences were used in a BLAST search against the 40,602 assembled contigs.
Of the 5897 C. obtusa EST sequences in the database, 4437 (75.2%) could be matched with assembled contigs using an E-value cutoff of 10^-6. Of the 40,602 contig sequences, 4548 (11.2%) matched C. obtusa EST sequences.

Functional Annotation and Classification of Contigs
Contig sequences were searched using BLASTX against the Nr and Swiss-Prot databases with an E-value cutoff of 10^-6 for annotation. Using this approach, 22,005 sequences (54.2% of the contig sequences) returned BLAST hits that passed the cutoff. Longer contigs were more likely to have BLAST matches: 95.3% of the contigs > 1000 bp in length showed homologous matches, whereas only 34.0% of the contigs < 300 bp showed matches. GO assignments were used to classify the functions of the contigs, and 19,293 sequences (47.5% of the contig sequences) were functionally annotated. They were categorized into 30 functional groups belonging to three main GO-slim ontologies: "biological process", "molecular function", and "cellular component" (Figure 3). The assignments to "biological process" (46.7%) made up the largest share, followed by "cellular component" (29.1%) and "molecular function" (24.3%). Under the category of "biological process", "cellular process" (25.0%) and "metabolic process" (23.6%) were prominently represented. We performed a COG classification analysis to further evaluate the functions of the assembled contigs. Of the 29,470 contigs with a predicted coding region, 11,769 (39.9%) sequences were assigned to COG classifications (Figure 4). Among the 25 COG categories, the cluster for R "general function prediction only" (1392, 11.8%) was the largest group, followed by T "signal transduction mechanisms" (1365, 11.6%), O "posttranslational modification, protein turnover and chaperones" (1136, 9.7%), K "transcription" (677, 5.8%), and A "RNA processing and modification" (667, 5.7%).
The 29,470 contigs with predicted coding regions were mapped to KEGG pathways to identify active biological pathways in C. obtusa xylem; 11,355 (38.5%) contigs were assigned to 286 pathways. These pathways were divided into six groups, of which "metabolism" was the largest group (8196), followed by "organismal systems" (4191) and "genetic information processing" (2901). The "metabolism" group was classified into 11 subgroups, of which "carbohydrate metabolism" was the largest (1791), followed by "amino acid metabolism" (1069) and "lipid metabolism" (900). In particular, the proportion of "phenylalanine metabolism" in "amino acid metabolism" was high (134, 12.5%). To further compare the C. obtusa EST sequences deposited in the database with our assembled contigs, COG classification of the EST sequences was carried out (Figure 4). Of the 5897 EST sequences, 2483 were successfully annotated and classified into 24 functional categories. Each COG cluster contained more contigs than EST sequences, suggesting that the de novo-assembled contigs had wider transcriptome coverage than the EST sequences deposited in the database.

Transcript Difference between Compression and Normal Wood Samples
Normalization of gene expression and analysis of differential expression between compression and normal wood libraries were performed using the R package TCC [21] with an FDR < 0.05. This analysis found that 2875 contigs were differentially expressed, including 1207 up-regulated and 1668 down-regulated contigs in compression wood (Figure 5).
To validate the RNA-Seq results and to identify genes involved in compression wood formation, a qPCR analysis was performed using gene-specific primers for 30 contigs classified into the six GO categories of "binding", "transcription regulator activity", "transporter activity", "response to stimulus", "signal transduction", and "anatomical structure morphogenesis" (Figure 6). Three compression wood samples and three normal wood samples were used for quantification, and sampling of the compression and normal wood tissues was conducted at 21, 25, and 28 days following initiation of the inclination stimulus. A two-factor ANOVA at P < 0.01 revealed an interaction between the inclination stimulus (i.e., compression or normal wood) and the day of sampling in 16 of the 30 genes. Because of this interaction, the main effects could not be assessed precisely, but the following two tendencies were observed: the inclination stimulus had a significant effect on transcript abundance in 29 genes (the exception being histone acetyltransferase), and the day of sampling had a significant effect on transcript abundance in 15 genes. Notably, however, the qPCR results for WRKY transcription factor 1 and BEL1-type homeodomain protein showed greater transcript abundance in normal wood than in compression wood, which was contrary to the RNA-Seq results. Therefore, 27 genes (all except histone acetyltransferase, WRKY transcription factor 1, and BEL1-type homeodomain protein) exhibited the same expression profiles as in the original RNA-Seq results. These results suggest that the data obtained from the RNA-Seq analysis were credible. To examine differences in transcript abundance among the sapling conditions, inclination stimulus and day of sampling were paired into six combinations (i.e., compression wood on day 21, compression wood on day 25, compression wood on day 28, normal wood on day 21, normal wood on day 25, and normal wood on day 28), and a one-factor ANOVA at P < 0.01 was performed. The results revealed a significant difference in the mean transcript abundance of 27 genes (all except histone acetyltransferase, WRKY transcription factor 1, and plasma membrane intrinsic protein 2). The results of the post hoc tests for the significant differences among the conditions are shown in Figure 6. Of note, transcript abundance of the mechanosensitive ion channel protein and the R2R3-MYB transcription factor in the compression wood samples increased with increasing inclined growth period, i.e., day 21 < 25 < 28.

Anatomical Observations
After harvesting differentiating xylem for RNA extraction, cross sections were prepared from the remaining xylem tissues and observed by light microscopy (Figure 1). Compression wood is characterized by a thicker secondary wall, cells that are rounded in cross section with intercellular spaces, and a higher lignin content than normal wood [1]; the observed anatomy therefore shows that the vertical and inclined saplings formed normal and compression wood, respectively. The upper side of each photograph in Figure 1 corresponds to the position scraped for RNA extraction. The cells undergoing differentiation into compression and normal wood were successfully scraped and collected.

Sequencing and de Novo Assembly
The SOLiD system was used to generate 235 million paired-end reads (about 12.8 Gb). Although de novo assembly of short-read sequences without a known reference has been considered difficult, the development and optimization of de novo assembly methods have allowed cost-effective assembly of transcriptomes for non-model organisms with unknown genomes [11] [25]-[27]. In this study, sequence assembly was performed using the CLC Genomics Workbench, which produced 40,602 contigs with an average length of 529 bp. Based on the criteria of contig length distribution and reduction in redundancy, the CLC Genomics Workbench is considered the most suitable assembler for non-model transcriptome data [28].
To determine whether the short reads were correctly assembled, we performed a nucleotide BLAST search against C. obtusa EST sequences deposited in the database. Of the 5897 EST sequences, 4437 (75.2%) were matched with the assembled contigs. Although we used xylem samples only, the EST sequences in the database were derived not only from xylem but also from pollen and seed cones. The unmatched EST sequences (24.8%) probably resulted from this difference in the tissues sampled. Of the 40,602 contig sequences, 4548 (11.2%) matched the EST sequences. This suggests that the de novo-assembled contigs had wider xylem transcriptome coverage than the EST sequences, but it also suggests that about 90% of the assembled contigs either have not been sequenced previously or have not been assembled correctly. We performed xylem-specific transcriptomic sequencing and assembly. Some plant transcriptomic studies sequenced pooled cDNA samples from different tissues [25] [29] [30] or assembled transcriptomic data using sequencing reads from different tissues [10] [11] [31], whereas others performed tissue-specific transcriptomic sequencing and assembly [12] [26] [32]. Although more extensive transcriptomic data can be obtained using the former strategy, more accurate information can be produced using the latter, as alternative splicing may differ among tissues, which makes contig assembly difficult [33]. Therefore, the sequence lists obtained in this study will provide good reference data for gene expression profiling in C. obtusa xylem.

Functional Annotation and Classification of Contigs
In this study, 54.2% (22,005) of the contigs had BLAST hits in the Nr and Swiss-Prot databases. Wang et al. [34] and Wang et al. [27] showed that longer contigs are more likely to have BLAST matches in protein databases. Our results also showed that 95.3% of the contigs > 1000 bp in length had homologous matches, whereas only 34.0% of the contigs < 300 bp had matches.
Functional annotation and classification provide predictive information on the biological roles of genes. Many of the contigs were assigned to a wide range of GO, COG, and KEGG classifications, which indicated that our assembled contigs represented a wide diversity of transcripts. Among the three main GO domains, "cellular process" and "metabolic process", "binding" and "catalytic activity", and "cell" were the most abundant classes in "biological processes", "molecular functions", and "cellular components", respectively (Figure 3), which was consistent with the reports of Hao et al. [11], Li et al. [12], Xia et al. [25], and Gordo et al. [26]. Among the COG classifications, the second largest in our work was T "signal transduction mechanisms" (Figure 4), which differed from the reports by Hao et al. [11], Li et al. [12], and Xia et al. [25]. As half of the xylem samples we used were compression wood, the signaling system that follows perception of the inclination stimulus seems to have been more active than in those past reports. Among the KEGG pathways, the proportions of "carbohydrate metabolism" and "amino acid metabolism" were considerably high. In particular, the proportion of "phenylalanine metabolism" in "amino acid metabolism" was higher than that in other studies [12] [25] [32]. As phenylalanine is an amino acid necessary for monolignol biosynthesis [35], the high proportion of "phenylalanine metabolism" suggests active lignification in the cell wall.

Transcript Difference between Normal and Compression Wood Samples
Thirty genes were selected from the 2875 differentially expressed genes (FDR < 0.05), and qPCR was performed to validate the RNA-Seq results and to identify the genes involved in compression wood formation. To achieve the anatomical and chemical characteristics of compression wood, the cells must go through several steps, such as perception of the inclination stimulus, signal transduction, regulation of gene expression by transcription factor binding, and transport of materials necessary for metabolism or cell wall formation. Hence, 30 genes classified into the six GO categories of "response to stimulus", "signal transduction", "binding", "transcription regulator activity", "transporter activity", and "anatomical structure morphogenesis" were selected in this study. The results showed that most of the genes exhibited the same expression profiles as in the original RNA-Seq results (Figure 6), suggesting that the data obtained from the RNA-Seq were credible.
For example, transcript abundance of the ABC transporter B family member in compression wood was about twice that in normal wood (Figure 6). This transporter is involved in conveying various molecules such as phytohormones, secondary metabolites, and xenobiotic substances [36] [37]. The increase in this transporter in compression wood suggests that transport of auxin, which is a phytohormone, or monolignol, which is a secondary metabolite, became active. Transcript abundance of endoplasmic reticulum (ER)-type calcium-transporting ATPase and calreticulin in compression wood increased about 5-8 times and 3-5 times, respectively, compared to that in normal wood (Figure 6). An ER-type calcium-transporting ATPase takes calcium ions into the ER using the energy of ATP [38], and calreticulin binds calcium in the ER [39]. Allona et al. [40] also reported that calreticulin gene expression increases in the compression wood region. Du and Yamamoto [41] reported that calcium plays an important role in compression wood formation. Transcript abundance of the mechanosensitive ion channel in compression wood increased about 2-6 times compared to that in normal wood (Figure 6). This channel is thought to be able to perceive distortion of the membrane and change the concentration of intracellular calcium ions [42], which also suggests the importance of calcium in compression wood formation. Transcript abundance of kinesin-like calmodulin-binding protein in compression wood increased about 2-3 times compared to that in normal wood (Figure 6). This protein is involved in microtubule bundling during mitosis and the orientation of cortical microtubules during interphase [43] [44]. Here it seems to be involved in the orientation of cortical microtubules, as the samples collected in this study were differentiating xylem tissue that had finished cell division. As the orientation of cortical microtubules is associated with the orientation of cellulose microfibrils, which constitute the cell wall [45], the characteristics of compression-wood cell walls such as a large microfibril angle and helical cavities may be achieved by increased expression of this gene. The transcript abundance of R2R3-MYB, WRKY 2, and WRKY 3 in compression wood increased, and transcript abundance of the NAC domain-containing protein in compression wood decreased, compared to those in normal wood (Figure 6). Bedon et al. [46] reported that gene expression of R2R3-MYB increases in the compression wood region. Li et al. [47] used branches of radiata pine and reported that WRKY gene expression increases in the opposite wood region rather than in the compression wood region. In this study, WRKY 1 transcript abundance did not differ among the six conditions (Figure 6). The roles of the members thus appear to differ even though each gene belongs to the same WRKY family. The NAC domain-containing protein plays a role in activating the entire secondary wall biosynthetic program in poplar normal wood [48]. However, in this study, we found that NAC domain-containing protein expression decreased in the compression wood region, where secondary wall formation is active. The transcription factors necessary for activating secondary wall formation may therefore differ between compression and normal wood. The transcript abundance of expansin in compression wood decreased to about one-half to one-third of that in normal wood (Figure 6). This protein disrupts hydrogen bonding between cellulose microfibrils and hemicelluloses, leading to wall loosening and enlargement of cells [49]. Tracheids in the compression wood region are shorter in length and smaller in diameter compared to those in normal wood [1]. These anatomical characteristics may be related to the expression level of the expansin gene.
In the qPCR experiment, we used three compression wood samples and three normal wood samples, each collected at a different time point. In Figure 6, later sampling times are shown toward the right (i.e., 21, 25, and 28 days following initiation of the inclination stimulus). Many genes, such as the mechanosensitive ion channel protein and the R2R3-MYB transcription factor, showed a tendency toward increased transcript abundance with increasing inclination period in compression wood. The same tendency was observed in our former study, in which the transcript abundance of laccase in the compression wood region reached a peak at about 28 days [50].

Conclusion
In conclusion, we generated libraries for SOLiD paired-end sequencing from differentiating xylem of C. obtusa compression and normal wood; de novo assembly generated 40,602 contigs. The large number of GO, COG, and KEGG classifications assigned indicated that this dataset represents the most comprehensive expressed gene catalog for C. obtusa xylem. This is the first application of NGS technology to assess transcript differences between compression and normal wood. The dataset generated in this study will improve our understanding regarding the molecular mechanisms of xylogenesis and formation of reaction wood in gymnosperms.

Figure 1. Light microscopy of xylem derived from saplings grown inclined (a) or vertically (b). The differentiating xylem on the upper side of the photograph was scraped for RNA extraction. Bar = 50 μm.

Figure 2. Size distribution of the assembled contigs. The gray bars and black circles indicate the number of contigs and cumulative relative frequency, respectively.

Figure 3. Gene ontology (GO) classification of the assembled contigs. Results are summarized in the three main GO categories of "biological processes", "molecular functions", and "cellular components".

Figure 4. The eukaryotic Clusters of Orthologous Groups of proteins (COG) classifications for de novo-assembled contigs and Chamaecyparis obtusa expressed sequence tag (EST) sequences deposited in the database. A, RNA processing and modification; B, chromatin structure and dynamics; C, energy production and conversion; D, cell cycle control, cell division, chromosome partitioning; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; J, translation, ribosomal structure, and biogenesis; K, transcription; L, replication, recombination, and repair; M, cell wall/membrane/envelope biogenesis; N, cell motility; O, posttranslational modification, protein turnover, chaperones; P, inorganic ion transport and metabolism; Q, secondary metabolite biosynthesis, transport, and catabolism; R, general function prediction only; S, function unknown; T, signal transduction mechanisms; U, intracellular trafficking, secretion, and vesicular transport; V, defense mechanisms; W, extracellular structures; Y, nuclear structure; Z, cytoskeleton.

Figure 5. Differentially expressed genes in compression versus normal wood. Each dot represents a contig. Gray dots indicate contigs estimated as differentially expressed genes with a false discovery rate < 0.05. CW, compression wood; NW, normal wood. M = the log-ratios of expression, A = the log-intensity of each dot.
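The M and A coordinates of such a plot can be sketched as below, using the conventional definitions M = log2(CW) - log2(NW) and A = (log2(CW) + log2(NW)) / 2 for each contig. The sign convention (positive M meaning higher abundance in compression wood), the pseudocount used to avoid taking the logarithm of zero, and the example counts are assumptions for illustration and may differ from how the plotting software handled these details.

```python
import math


def ma_values(cw_count, nw_count, pseudocount=1.0):
    """Return (M, A) for one contig from normalized CW and NW counts.

    M is the log2 ratio (CW over NW); A is the mean log2 abundance.
    The pseudocount is a hypothetical safeguard against log2(0).
    """
    log_cw = math.log2(cw_count + pseudocount)
    log_nw = math.log2(nw_count + pseudocount)
    m = log_cw - log_nw
    a = (log_cw + log_nw) / 2
    return m, a


# Hypothetical normalized counts for one contig.
m_value, a_value = ma_values(7, 1)
```

Under this convention, contigs up-regulated in compression wood lie above M = 0 and down-regulated contigs lie below it, while A places weakly expressed contigs toward the left of the plot.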

Figure 6. Expression profiles of 30 selected genes. Quantitative real-time polymerase chain reaction analyses were performed to validate the RNA-Seq results. CW, compression wood; NW, normal wood. Sampling of CW1 and NW1, CW2 and NW2, and CW3 and NW3 was conducted 21, 25, and 28 days, respectively, following initiation of the inclination stimulus. The asterisk indicates that NW1 was used as the reference sample for relative quantification. Error bars show the standard deviation. Different letters denote a significant difference at P < 0.01. Multiple letters indicate that the mean value fell into more than one post hoc group.

Table 1. Overview of the sequencing and de novo assembly.