Identification and Characterization of Human Genomic Binding Sites for Retinoic Acid Receptor / Retinoid X Receptor Heterodimers

All-trans retinoic acid (ATRA) triggers a wide range of effects on vertebrate development by regulating cell proliferation, differentiation, and apoptosis. ATRA activates retinoic acid receptors (RARs), which heterodimerize with retinoid X receptors (RXRs). RAR/RXR heterodimers function as ATRA-dependent transcriptional regulators by binding to retinoic acid response elements (RAREs). To identify RAR/RXR heterodimer-binding sites in the human genome, we performed a modified yeast one-hybrid assay and identified 193 RAR/RXR heterodimer-binding fragments. The putative target genes included genes involved in developmental processes and cell differentiation. Gel mobility shift assays indicated that 160 putative RAREs could directly interact with the RAR/RXR heterodimer. Moreover, 19 functional regulatory single nucleotide polymorphisms (rSNPs) in the RAR/RXR-binding sequences were identified by analyzing differences in DNA-binding affinity. These results provide insights into the molecular mechanisms underlying the physiological and pathological actions of RAR/RXR heterodimers.


Introduction
All-trans retinoic acid (ATRA), a vitamin A derivative, affects physiological processes ranging from embryonic development to homeostasis of adult tissues and organs [1]. The actions of ATRA are mediated through the retinoic acid receptors (RARs), which are members of the nuclear receptor superfamily. RARs heterodimerize with retinoid X receptors (RXRs) and function as ATRA-dependent transcriptional regulators by binding to retinoic acid response elements (RAREs). Their target genes are involved in cell proliferation, differentiation, apoptosis, and homeostasis [2] [3]. Because synthetic retinoic acids are widely used and are known to have therapeutic effects in cancers and various other diseases [4] [5], identification of target genes may lead to the development of more specific and effective treatments.
Methods such as electrophoretic mobility shift assays (EMSAs), SELEX, DNase I footprinting assays, chromatin immunoprecipitation (ChIP) analysis, and reporter assays can be used to identify transcription factor-binding sites, but they tend to be laborious and poorly suited to screening large numbers of DNA elements [6]. Recently, data on RAR target genes have increased dramatically through high-throughput technologies such as microarrays, genome-wide ChIP analyses, and computational methods. However, microarrays cannot discriminate between primary and secondary target genes. Furthermore, the results of ChIP analyses include indirect transcription factor-DNA interactions [7]. In addition, these approaches face obstacles including high cost, limited reproducibility, high false-positive rates, and dependence on cell-specific context. Therefore, it is important to develop alternative methods to assess regulatory regions in the genome.
Recently, we developed a modified yeast one-hybrid system that enables rapid and efficient identification of genomic targets for DNA-binding proteins [8]. Using this system, we report here a functional screen for RAR/RXR heterodimer-binding sites in the human genome. We identified 193 genomic fragments containing RAR/RXR heterodimer-binding motifs. At least 160 genomic fragments were confirmed as direct binding sites for the RAR/RXR heterodimer. In addition, we identified 19 functional regulatory single nucleotide polymorphisms (rSNPs) in the identified RAR/RXR-binding sequences.

Plasmid Constructions
Human RXR alpha and human RAR alpha were amplified by the polymerase chain reaction (PCR) from uterus cDNA (PCR Ready-cDNA, Maxim Biotech, Inc., USA) and pituitary Cap site cDNA (Wako Pure Chemical Industries, Ltd., Japan), respectively. These cDNA fragments were cloned into pGADT7 (CLONTECH, USA) and reamplified by PCR with primers (Table 1, RXR-5', RXR-3', RAR-5) to generate the restriction sites for subcloning. The Hind III fragment including the nuclear localization signal (NLS) and GAL4 activation domain (GAL4 AD) was removed from pGADT7, and each cDNA was inserted into the plasmid. The resulting plasmids were named pADH1_RXR and pADH1_RAR. The fragment including the NLS and GAL4 AD was amplified from pGADT7 by PCR and cloned. The fragment was inserted into the Bgl II sites at the N-terminal end of RAR in pADH1_RAR and confirmed by sequencing. The resulting plasmid was named pADH1_NLS_GAL4AD_RAR. The foot-and-mouth disease virus (FMDV) 2A peptide sequence was generated by annealing synthetic oligonucleotides (Table 1, FMDV_2A-F and FMDV_2A-R) and reamplified by PCR with primers (Table 1, FMDV_2A-5' and FMDV_2A-3') to generate the restriction sites. The FMDV 2A fragment and NLS_GAL4AD_RAR were then inserted in tandem at the C-terminal end of RXR in pADH1_RXR. The resulting plasmid was named pADH1_RXR_2A_NLS_GAL4AD_RAR. pSUR (GenBank AB425277) was constructed previously and used as a reporter in the modified yeast one-hybrid system [8]. The positive control for the modified one-hybrid assay was constructed by inserting eight copies of the mouse cellular retinoic acid binding protein II (mCRABPII) RARE (Table 1, mCRABP2_F and mCRABP2_R) [9] upstream of the SPO13 promoter of pSUR. To generate the restriction sites and Kozak sequence for subcloning, the N-terminal side of RXR was amplified by PCR. The resulting PCR product was digested with Hind III and Sma I. The C-terminal side of RXR was obtained by digesting with Sma I and Xho I. Both fragments were inserted into the Hind III and Xho I sites of pSP64 Poly (A) (Promega, USA) for in vitro transcription/translation. The cDNA for human RAR alpha was also inserted into the Sal I and Xba I sites of pSP64 Poly (A).

A Modified Yeast One-Hybrid Assay
The human genomic library for a modified yeast one-hybrid assay was generated as previously described [8].

EMSAs
The RARE of rat cellular retinol binding protein type II (rCRBPII) was labeled with Cy5 and used as a probe (Table 1, rCRBPII-F and rCRBPII-R) [10]. Human RAR and RXR proteins were synthesized with the TNT SP6 High Yield System (Promega, USA). In vitro synthesized RAR and RXR proteins were mixed with 500 ng of calf thymus DNA (Invitrogen, USA) and 0.25 pmol of labeled oligonucleotide at 4˚C. In competition experiments, a 25- and 200-fold molar excess of unlabeled oligonucleotide was added to the reaction mixture. The competitor sequences used are listed in Table 2 and Table 3. The RARE of human RAR beta (Table 1, RARE-F and RARE-R) [11] and random sequences (Table 1, random_F and random_R) were used as positive and negative controls, respectively. The binding reaction was carried out in EMSA binding buffer containing 12 mM HEPES (pH 7.9), 60 mM KCl, 4 mM MgCl2, 1 mM EDTA, 12% glycerol, and 0.5% Nonidet P-40. The reaction mixtures were loaded directly onto 4% nondenaturing polyacrylamide gels made in 0.5× TBE. After electrophoresis at 4˚C, the gels were analyzed using a bio-imaging analyzer (FLA-7000, FUJIFILM).

Bioinformatics
To map the obtained sequences onto the human genome assembly (GRCh37), the sequences were analyzed using NCBI's BLAST and searched for AGGTCA motifs. For the stringency of the search, we allowed up to 2-bp mismatches. For each predicted RAR/RXR-binding site, the nearest gene and the distance from the center of the binding site to the transcriptional start site of that gene, within 1000 kb, were identified with GREAT (http://bejerano.stanford.edu/great/public/html/).
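The motif search described above (AGGTCA with up to 2-bp mismatches) can be illustrated with a minimal sketch. This is not the script used in the study, which is not described in detail; the function names (`reverse_complement`, `hamming`, `find_half_sites`) and the decision to scan both strands are assumptions made for illustration only.

```python
# Illustrative sketch only: hypothetical helper names, not the authors' code.
CONSENSUS = "AGGTCA"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_half_sites(seq, motif=CONSENSUS, max_mismatches=2):
    """Scan every window of len(motif) on both strands.

    Returns a list of (start, strand, window, mismatches) tuples for
    windows within max_mismatches of the motif ('+') or of its
    reverse complement ('-'). Start is 0-based on the given strand.
    """
    seq = seq.upper()
    k = len(motif)
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for strand, target in (("+", motif), ("-", rc_motif)):
            mm = hamming(window, target)
            if mm <= max_mismatches:
                hits.append((i, strand, window, mm))
    return hits
```

Because two mismatches in a 6-bp motif is a permissive threshold, a scan of this kind is expected to return many candidate half-sites, which is consistent with the need for experimental validation of the predicted sites.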

Identification of Human Genomic Binding Sites for the RAR/RXR Heterodimer
To identify binding sites for the RAR/RXR heterodimer, but not the RXR homodimer, with the modified yeast one-hybrid system, two transcription factors had to be expressed simultaneously in yeast. An internal ribosome entry site (IRES) has been widely used for this purpose, but it has a major limitation: the translation efficiency of a gene placed after the IRES is much lower than that of a gene located before it [12]. This limitation can be overcome by a 2A peptide, a small "self-cleaving" peptide that was identified in FMDV [13]. As 2A-mediated cleavage is a universal phenomenon in eukaryotic cells [14] [15], we constructed a single expression vector by placing the FMDV 2A segment between RXR and RAR. To minimize the effect of the RXR homodimer on reporter activity, RXR was expressed as an unfused protein, whereas RAR was expressed as a fusion to the GAL4 AD (Figure 1). To evaluate the function of the RAR/RXR heterodimer, yeast cells were transformed with these effectors and the indicated reporters (Figure 2). The transformants were grown on synthetic complete media including the RAR-specific ligand TTNPB [16] and the antagonist of RXR homodimers LG100754 [17] [18]. The transformants expressing either RAR or RXR alone could not grow, whereas the transformants expressing both RAR and RXR could grow in a RARE-dependent manner. These results indicated that the RAR/RXR heterodimer could activate the reporter gene via the RARE. The human genomic library, after 5FOA selection [8], was transformed with the human RAR/RXR expression vector (Figure 1). More than 1 × 10^7 library clones were screened for uracil prototrophy. Human genomic fragments were recovered from the validated colonies by colony-direct PCR and sequenced. Two hundred and eleven unique sequences were obtained from 364 clones.

Experimental Validation of the RAR/RXR-Binding Sites
RAREs are typically composed of two direct repeats of a core motif, RGKTCA [19]. The classical RARE is a 5 bp-spaced direct repeat (DR5). The heterodimers also bind to direct repeats separated by 1 bp (DR1) or 2 bp (DR2) [20]. To analyze the obtained genomic sequences, AGGTCA motifs were searched computationally. For the stringency of the search, we allowed up to 2-bp mismatches. As an initial test, we examined the direct interaction between in vitro synthesized RAR/RXR and a known RARE (rCRBPII RARE) (Figure 3). Incubation of the Cy5-labeled rCRBPII oligonucleotides with the combination of RAR and RXR produced retarded complexes, but no shifted band was observed with either receptor alone (Figure 3, lanes 2-4). The complexes represented a sequence-specific interaction between the rCRBPII probe and the RAR/RXR heterodimer, since their formation was specifically reduced by a molar excess of unlabeled competitors (Figure 3, lanes 5-7). Moreover, the addition of anti-RXR or anti-RAR antibody created a slower-migrating complex (Figure 3(a), lanes 7 and 8). No supershifted bands were observed with anti-HNF4 antibodies (Figure 3, lane 9). These results indicated that the sequence-specific binding complex contained both RAR and RXR, presumably as a heterodimer.
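The direct-repeat architecture described at the start of this section (two RGKTCA half-sites separated by 1, 2, or 5 bp) can be sketched in a few lines. This is an illustrative example rather than the procedure used in the study: the function name `direct_repeats` is hypothetical, only exact matches to RGKTCA are considered (no mismatches), and only the given strand is scanned, so the reverse complement would need to be scanned separately.

```python
import re

# IUPAC codes: R = A or G, K = G or T, so RGKTCA expands to [AG]G[GT]TCA.
# The lookahead lets the regex report overlapping half-sites as well.
HALF_SITE = re.compile(r"(?=([AG]G[GT]TCA))")
HALF_SITE_LENGTH = 6


def direct_repeats(seq, spacers=(1, 2, 5)):
    """Find same-strand pairs of exact RGKTCA half-sites.

    Returns (label, start, element) tuples, where label is 'DRn' for a
    spacer of n bp, start is the 0-based position of the first half-site,
    and element is the full repeat including the spacer.
    Hypothetical helper for illustration; exact matches only.
    """
    seq = seq.upper()
    starts = [m.start() for m in HALF_SITE.finditer(seq)]
    start_set = set(starts)
    found = []
    for s in starts:
        for n in spacers:
            second = s + HALF_SITE_LENGTH + n
            if second in start_set:
                element = seq[s:second + HALF_SITE_LENGTH]
                found.append((f"DR{n}", s, element))
    return found
```

For example, a made-up sequence consisting of AGGTCA, the 5-bp spacer ACGTA, and a second AGGTCA would be reported as a single DR5 element. Passing a wider `spacers` tuple would extend the same approach to the more widely spaced repeats discussed later in the paper.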
As a next step, we examined whether the 193 predicted RAREs could interact with the RAR/RXR heterodimer. At least 160 predicted RAREs in the obtained genomic sequences could directly interact with the RAR/RXR heterodimer (Table 2). These RAR/RXR-binding sites were located in or around genes with various functions, such as cytoskeleton (TUBB6, KIF16B, KRT18 and CTNNA1), extracellular matrix (LARGE), signal transduction (CREB5 and STAT3), transcription (TAF12, POLQ and TRERF1), translation (EIF2B3 and EIF5B) and development (INHBA, ISX, KIAA1715, LHX1, and PBX3). Remarkably, several genes previously known to be regulated by ATRA, including FBLN1 [21] and RAR beta [11] [22], were also included. Although the RARE of the human RAR beta gene was already reported, the site in exon 9 is a novel binding site. Furthermore, we confirmed that RAR/RXR could directly interact with non-canonical motifs, such as inverted repeats (IRs) and everted repeats (ERs) (Table 2).

Figure 3. Binding of the RAR/RXR heterodimer to the RARE in gel mobility shift assays. Cy5-labeled double-stranded rCRBPII RARE was incubated with in vitro transcribed/translated human RAR alpha (RAR) and human RXR alpha (RXR). In the competition assay, a 100-fold molar excess of unlabeled oligonucleotide (RARE or Random) was added to the reaction mixture. In the supershift experiment, the indicated antibodies were incubated in the reaction mixture. Binding reactions were resolved by electrophoresis on a 4% acrylamide gel in 0.5 × TBE.

Regulatory SNPs in RAR/RXR-Binding Sites
Functional rSNPs in transcription factor-binding sites may lead to differences in gene expression and may be associated with disease susceptibility. We therefore identified 23 SNPs in the RAR/RXR-binding sequences using NCBI's SNP database and examined their effect on RAR/RXR-binding affinity. As a result, 19 functional rSNPs were identified by analyzing differences in DNA-binding affinity (Table 3).

Discussion
We identified 211 human genomic fragments containing putative RAREs using the modified yeast one-hybrid system. These sequences contain canonical RAREs (DR1, DR2, and DR5) as well as other half-site arrangements. At least 160 putative RAREs in the obtained genomic fragments could directly interact with the RAR/RXR heterodimer. Interestingly, some putative RAREs that differed from canonical RAREs also bound directly to the RAR/RXR heterodimer. Furthermore, 19 functional rSNPs in the RAR/RXR-binding sites were identified by analyzing differences in DNA-binding affinity.
Most research has historically focused on promoter regions to find transcription factor-binding sites, but 90% of the identified RAR/RXR-binding sites were located more than 10 kb from the transcription start site of genes (Table 2). In this study, we could not identify all previously reported canonical RAR/RXR-binding sites [11] [23]- [25], although the modified yeast one-hybrid assay is an effective method for identifying transcription factor-binding sites on a genome-wide scale. There are several possible explanations for the missed RAR/RXR-DNA interactions. First, the quality of the library affects the efficiency with which protein-DNA interactions are identified. Second, some RAREs may lie adjacent to sequences recognized by yeast transcription factors and may have been discarded during the negative selection by 5FOA [8].
RAR/RXR heterodimers contribute to transactivation through widely spaced (up to 150 bp) DR elements [26]. A response element composed of a palindromic arrangement of consensus motifs with no spacer nucleotide between the two half-sites (IR0) is known to be bound and activated by RAR/RXR heterodimers [27] [28]. In this study, we confirmed by EMSA that several types of RAREs, including IR0 and DR elements spaced by more than 5 bp, directly interact with the RAR/RXR heterodimer (Table 2). These results are consistent with the previous studies cited above.
Genome-scale sequencing has led to the discovery of millions of human SNPs [29]. There are several examples of rSNPs associated with disease susceptibility [30]- [32]. In this study, we identified 19 functional rSNPs, including one in the region 50 kb downstream of the EPHA7 gene. A recent study revealed a strong correlation between expression of EPHA7 and glioblastoma multiforme patient survival [33]. ATRA induces cell differentiation and inhibits cell proliferation in a variety of cancer cell lines, including glioblastoma cell lines [34]- [37]. ATRA enhances the cytotoxicity of paclitaxel in glioblastoma xenografts and can be therapeutically useful against glioblastoma [38]- [40]. Our results suggest that the direct interaction of RAR/RXR with the sequence around the EPHA7 gene may affect the early stage of neural differentiation and the therapy of glioblastoma. Further studies will be necessary to clarify the relationship between the EPHA7 polymorphism (rs6907105) and the effects of ATRA therapy.
Recently, genome-wide ChIP analyses of RAR/RXR-binding sites were reported using several human cancer cell lines, but such studies are insufficient to give a complete overview of the target genes under their control [41] [42]. As mentioned above, ChIP techniques potentially include indirect transcription factor-DNA interactions [7]. In contrast, our strategy for identifying RAR/RXR sites in the human genome took a fundamentally different approach, based on the direct interaction between the RAR/RXR heterodimer and human genomic sequences in a yeast genetic selection. Our findings provide insights into the molecular mechanisms underlying the physiological and pathological actions of the RAR/RXR heterodimer.

Figure 1. Schematic of bicistronic protein expression from a single expression vector. The bicistronic expression vector contains the FMDV 2A peptide sequence (2A) between the RXR cDNA and the GAL4 activation domain-RAR fusion cDNA under the control of the ADH1 promoter. The plasmid also contains ColE1 and 2-micron origins for autonomous replication in E. coli and yeast, respectively, an ampicillin resistance gene, and a LEU2 selectable marker. A schematic of RXR and RAR expression via the 2A-mediated translational skip mechanism is presented.

Figure 2. A modified yeast one-hybrid system. Yeast cells were transformed with the indicated plasmids. The transformants were grown on synthetic complete media including uracil (Ura) or lacking uracil but containing the RAR-specific ligand (TTNPB) and the antagonist of RXR homodimers (LG100754). The plates were photographed after 2 days (+Ura) or 5 days (−Ura + TTNPB + LG100754) of growth at 30˚C.

Table 2. The validated RAR/RXR heterodimer-binding sites in the human genome.