In vitro examining the existing prognoses how TBP binds to TATA with SNP associated with human diseases

We examined in vitro the existing predictions of the dissociation constant, KD, between the TATA-binding protein (TBP) and TATA boxes carrying single nucleotide polymorphisms (SNPs) associated with human diseases. Five SNPs, in the genes for cytochrome P450 2A6 (associated with lung cancer), β-globin (associated with β-thalassemia), mannose binding lectin (associated with variable immunodeficiency), superoxide dismutase 1 (associated with amyotrophic lateral sclerosis) and triosephosphate isomerase (associated with anemia), fell within the range of –ln(KD;M/KD;WT) between –1.5 and –1 (here KD;WT and KD;M denote the dissociation constants for the normal TATA box and for the TATA box with the SNP, respectively). Measurements using the electrophoretic mobility shift assay (EMSA) demonstrated that: 1) all the predictions stating that the affinity between TBP and TATA boxes with SNPs would be reduced were correct; 2) the departures of three predictions from the measurements fell within the confidence interval; 3) all the predictions consistently underestimated the actual mutational damage caused to TATA boxes with SNPs (α < 0.05; binomial law), and two of these predictions did so significantly (α < 0.05, Student's t-test). This consistent underestimation seems to be associated with damage to the context that modulates TBP/TATA affinity, for example, the contact between the nucleosomal histone H3-H4 dimer and the core promoter in the immediate vicinity of the TATA box.


INTRODUCTION
The variome is largely composed of single nucleotide polymorphisms (SNPs). Consequently, no study of their role in ontogenesis or evolution can be efficient without computer-aided support, which facilitates searches for SNPs, their documentation and systematization, and prediction of their effects on the phenotype. Over the past 10 years of coordinated development of SNP databases and tools [1], every problem except phenotype prediction has been successfully addressed [2]. In particular, SNPs responsible for the propensity for diseases, susceptibility to therapy, sensitivity to regulatory signals, etc. have been identified.
Phenotype prediction has only been successful for SNPs located in coding gene regions. The common molecular mechanism that can be proposed for them is mutational damage to the gene product [3]. It is still difficult to propose a comparable mechanism for SNPs in regulatory gene regions because of the diversity of such regions and the multi-step sequence of assembly, rearrangement and degradation of DNA-protein complexes in them [4]. Verifying computer-aided methods developed on such diverse material requires experimental testing under standardized conditions (hereafter, standardized examination). Because different authors set different experimental conditions, analysis of the experimental results collected in databases is, on its own, of little help. Bioinformatic and experimental studies therefore need to be coordinated for each type of site. For that purpose we are conducting an integrated study of regulatory SNPs, especially those in TATA boxes.
The binding of the TATA-binding protein (TBP) to the TATA box initiates assembly of the pre-initiation complex on the TATA-containing promoters of eukaryotic genes, which is a critical step in transcription initiation [5]. We previously proposed [6] a method for in silico prediction of TBP/TATA affinity based on the equilibrium equation for TBP/TATA binding over four successive steps: 1) non-specific TBP/DNA binding [7]; 2) TBP sliding along DNA [8]; 3) molecular recognition of the TATA box by TBP [9]; and 4) stabilization of the TBP/TATA complex [10] by endothermic DNA rearrangement [11] with the helix axis bent at 90° [10]. This equation allows the relative affinity, Δ = -ln(KD;M/KD;WT), that is, the logarithm of the ratio of the dissociation constant of TBP and the normal TATA box (KD;WT) to that of TBP and the mutant TATA box (KD;M), to be estimated. The equation combines the commonly accepted criterion for TATA boxes in arbitrary DNA [9] (step 3), which alone accounts for only 33% of the variance of the measured TBP/TATA affinity [12], with original criteria based on TBP affinity estimates for single-stranded DNA [13] and double-stranded DNA [14] (steps 4 and 2, respectively) and an independent measurement of the non-specific affinity of TBP for DNA [7] (step 1). Stepwise binding of TBP to TATA, predicted by this equation [6], has now been confirmed experimentally [15].
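As a minimal illustration of the relative affinity Δ defined above, the sketch below evaluates Δ = -ln(KD;M/KD;WT) for the two dissociation constants reported in the Figure 1 caption (7 nM for the normal TATA box and 290 nM for the mutant one). The function name is hypothetical and is not part of the prediction method of [6]; the sketch only shows the sign convention, in which a negative Δ means that the SNP weakens TBP binding.

```python
import math

def relative_affinity(kd_mutant_nM, kd_wildtype_nM):
    """Return Delta = -ln(KD;M / KD;WT).

    Both constants must be in the same units; Delta < 0 means the mutant
    TATA box binds TBP more weakly than the normal one.
    """
    if kd_mutant_nM <= 0 or kd_wildtype_nM <= 0:
        raise ValueError("dissociation constants must be positive")
    return -math.log(kd_mutant_nM / kd_wildtype_nM)

# KD values quoted in the Figure 1 caption (triosephosphate isomerase gene)
delta = relative_affinity(kd_mutant_nM=290.0, kd_wildtype_nM=7.0)
print(round(delta, 2))  # -3.72
```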
Although the TATA box is a "semi-conservative" site, non-neutral SNPs in it are quite common in various species. Thus, in silico analysis of the current content of the GenBank database revealed 146 SNPs of the HIV-1 TATA box, of which 63 could significantly modify the replicative potential of the virus and were associated with the regional patterns of the AIDS pandemic in 70 countries [16]. Literature data suggest that 53 SNPs in the TATA boxes of various human genes are associated with the propensity for diseases [17], and 38 SNPs are associated with various animal and plant traits valuable for breeding [18]. In both cases we predicted in silico significant changes in TBP/TATA affinity. The aim of the present work was to perform a standardized experimental examination of the strongest reductions in TBP/TATA affinity predicted among the 53 disease-associated SNPs in human TATA boxes [19]. The measurements of KD;M and KD;WT using EMSA demonstrated that: 1) all the predictions stating that the affinity between TBP and TATA boxes with SNPs would be reduced were correct; 2) the departures of three predictions from the measurements fell within the confidence interval; 3) all the predictions consistently underestimated the actual mutational damage caused to TATA boxes with SNPs (α < 0.05; binomial law), and two of these predictions did so significantly (α < 0.05, Student's t-test).

EXPERIMENTAL PROCEDURES
For the standardized experimental examination of the predicted TBP/TATA affinity, we used recombinant human TBP expressed in E. coli BL21 (DE3) cells from plasmid pAR3038-hTBP (courtesy of Professor B. Pugh, Center for Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, USA). E. coli BL21 (DE3) transformation was performed as described by Peterson and co-workers [20]. TBP was expressed and purified as described in [21]. One strand of each 26-bp oligodeoxyribonucleotide (Biosset, Novosibirsk) was labeled with [γ-32P]ATP (Biosan, Novosibirsk, Russia) using T4 polynucleotide kinase (SibEnzyme, Novosibirsk, Russia), annealed with the unlabeled strand by heating to 95 °C and cooled slowly to room temperature. The equilibrium dissociation constants, KD, of the TBP/DNA complexes were measured using EMSA, by titrating a fixed amount of TBP with increasing concentrations of the oligonucleotide with isotopic dilution [22], as shown in Figure 1. In doing so, we used two standard tools: Gel-Pro Analyzer 3.1 for the densitometry of autoradiographs and OriginPro 8 for obtaining KD from the densitometry data (Figure 1).
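To make the last step concrete, the sketch below shows one way a KD could be recovered from a binding isotherm. It is not the OriginPro procedure used in this work. It assumes a simple one-site binding model, signal = Bmax·[DNA]/(KD + [DNA]), with free DNA approximated by total DNA, and it fits synthetic, noise-free data generated from an assumed KD of 7 nM. All names, the concentration series and the grid of trial KD values are hypothetical.

```python
def bound_signal(dna_nM, kd_nM, bmax=1.0):
    """One-site binding isotherm: complex signal at a given DNA concentration."""
    return bmax * dna_nM / (kd_nM + dna_nM)

def fit_kd(dna_nM, signal, kd_grid):
    """Least-squares fit of KD over a grid of trial values.

    For each trial KD the model is linear in Bmax, so the best Bmax has a
    closed form; the KD with the smallest residual sum of squares wins.
    """
    best = None
    for kd in kd_grid:
        x = [c / (kd + c) for c in dna_nM]
        bmax = sum(xi * yi for xi, yi in zip(x, signal)) / sum(xi * xi for xi in x)
        sse = sum((yi - bmax * xi) ** 2 for xi, yi in zip(x, signal))
        if best is None or sse < best[0]:
            best = (sse, kd, bmax)
    return best[1], best[2]

# Synthetic titration (assumed values, for illustration only)
dna = [1, 2, 5, 10, 20, 50, 100]              # nM
signal = [bound_signal(c, kd_nM=7.0) for c in dna]

kd_grid = [0.5 * k for k in range(1, 201)]    # trial KD values, 0.5 to 100 nM
kd_fit, bmax_fit = fit_kd(dna, signal, kd_grid)
print(kd_fit, bmax_fit)
```

With real densitometry data the signal would be noisy and the free-DNA approximation might not hold, so a dedicated nonlinear fitting tool remains the appropriate choice.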
The confidence intervals at the 5% significance level (α < 0.05) for each prediction were estimated in silico using Student's t-test as described in [19]. For all the in vitro measurements, the confidence interval for KD obtained with the above two standard tools was set at ±0.37 in relative natural-logarithm units, which corresponds to a confidence interval of ±30% of the KD value in nM, commonly accepted for EMSA measurements of the parameters of protein/DNA complexes.
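The text does not spell out how ±30% of KD translates into ±0.37 on the logarithmic scale. One reading that reproduces the figure, offered here purely as an assumption, is that the relative affinity depends on two independently measured constants (KD;M and KD;WT), each with a ±30% interval, so that the two logarithmic errors combine in quadrature. The sketch below checks this arithmetic; the function name and the quadrature rule are illustrative choices, not taken from the source.

```python
import math

def log_ratio_halfwidth(relative_error, n_constants=2):
    """Half-width of ln(K1/K2) when each constant carries the same relative error.

    Assumption (not stated in the source): the errors of the individual
    constants are independent and add in quadrature on the log scale.
    """
    single = math.log(1.0 + relative_error)   # ln(1.3) for a +30% error
    return math.sqrt(n_constants) * single

halfwidth = log_ratio_halfwidth(0.30)
print(f"{halfwidth:.2f}")  # 0.37
```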

RESULTS AND DISCUSSIONS
The relative affinity of TBP for TATA boxes containing SNPs, as predicted in silico [19] and as measured experimentally in vitro, is presented in Table 1 for the genes encoding cytochrome P450 2A6 (associated with lung cancer), β-globin (associated with β-thalassemia), superoxide dismutase 1 (associated with amyotrophic lateral sclerosis), mannose binding lectin (associated with variable immunodeficiency) and triosephosphate isomerase (associated with anemia). In all cases, the experiment confirmed the in silico predicted reduction in the affinity of TBP for the TATA box containing the SNP associated with the respective disease. For three of the five SNPs, namely those in the genes for cytochrome P450 2A6 (associated with lung cancer), β-globin (associated with β-thalassemia) and superoxide dismutase 1 (associated with amyotrophic lateral sclerosis), the departures of the predicted values from those measured experimentally fell within the confidence interval.
In the rightmost column of Table 1, we compared the affinity range from -3.72 ± 0.37 to -1.03 ± 0.14, which corresponds to the highest amounts of damage to the TATA boxes with SNPs evaluated in silico [19] and in vitro, with the affinity range from -8.60 ± 2.33 to -5.52 ± 2.31, which corresponds to the differences between the affinity of TBP for non-specific DNA [19] and the affinity of TBP for the five TATA boxes in the focus of this work [7]. The lack of overlap between these two ranges implies that not even the strongest damage to any TATA box by a SNP affects affinity as much as total destruction of that TATA box would. Nevertheless, we were surprised to observe that the predictions underestimated the effect of mutational damage to the TATA box for every one of the five SNPs (Table 1: α < 0.05; binomial law), and that for two of the five, namely mannose binding lectin (associated with variable immunodeficiency) and triosephosphate isomerase (associated with anemia), this underestimation reached significance (α < 0.05, t-test). Because the measurements were made under standard conditions, this underestimation cannot have been due to local natural factors (such as tissue specificity) or laboratory factors. Therefore, in all five genes, the SNPs not only affected TBP/TATA affinity but also damaged the nucleotide context, which modulates this affinity without directly affecting any of the TBP/TATA binding steps included in the equilibrium equation [6]. It has been shown [23] that whether or not TBP binds to the TATA box depends absolutely on the position of the TATA box relative to the histone octamer and, hence, that the promoter nucleosome must undergo a rearrangement to enable transcription. The universal context that is common to all genes, modulates their expression and interferes with transcription factor binding sites is therefore the nucleosomal context.
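The binomial-law statement in the paragraph above can be illustrated with a short calculation. Assuming, as a null hypothesis that the text does not spell out, that an unbiased prediction is equally likely to overestimate or underestimate each measurement, the chance that all five predictions err in the same pre-specified direction is 0.5 to the fifth power. The function name below is hypothetical.

```python
from math import comb

def binomial_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p): a one-sided sign test."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes, trials + 1))

# All five predictions underestimated the measured damage
p_value = binomial_tail(successes=5, trials=5)
print(p_value)  # 0.03125
```

With only five SNPs this is the smallest one-sided value attainable: had a single prediction erred in the opposite direction, the same calculation with `successes=4` would give 0.1875, above the 0.05 threshold.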
It is commonly considered that the optimum positioning site (145 bp) for the specific nucleosome of the core promoter of eukaryotic genes is at position -43 [24]. Upstream and downstream of the nucleosome center (between positions ±13 and ±17 relative to it) are located two 5-bp (A + T)-rich regions, which make contact with the two nucleosomal histone H3-H4 dimers [25]. Whichever of the two DNA/(H3-H4) contacts is closer to the transcription start site overlaps the commonly accepted optimum location of the TATA box, namely T(-30)A(-29)T(-28)A(-27)A(-26)A(-25)A(-24) [9]. Consequently, it is likely that SNPs in TATA boxes (Table 1: (A or T) → (G or C) substitutions) can damage not only the TATA box itself but also the (A + T)-rich context that forms the contact between the promoter and the nucleosomal histone H3-H4 dimer [26]. The significant and consistent underestimation, by in silico prediction, of the change in TBP affinity for mutant TATA boxes, revealed by our standardized experimental measurements, points to a likely cooperative influence of contextual damage to the DNA/(H3-H4) contact on the TBP/TATA complex that overlaps this contact (similar to that described previously for the composite element NFATp/AP-1 [27]). This suggests that eukaryotic promoters might possess the composite element TATA/(H3-H4)(H2A-H2B)2(H3-H4), which has been indicated experimentally [26] but is not yet considered in the tools intended for in silico analysis.
Thus, having performed the standardized experimental examination of the in silico predictions of TBP/TATA affinity, we found in all five cases that the non-mutant TATA boxes in these genes had a high TBP/TATA affinity (which suggests a high potential for expression); however, the relative affinity, Δ, was consistently underestimated. With rare exceptions, the evolutionarily important parameter is not the absolute level of gene expression but the breadth of the norm of reaction, that is, the ability to modify this level. Dynamical systems theory considers two modes of modification, external and parametric [28]. In the former case, any change is an unambiguous reflection of the impact made. This is consistent with the formation of a mosaic of transcription factors on the promoter, which allows expression to be finely regulated. However, this is a relatively slow process, which requires, at the very least, the presence of the pre-initiation complex. Typically, the phenotypic effects of polymorphisms that damage the mechanisms of fine transcriptional regulation are specific.
In the latter case, a change in the values of the parameters destabilizes the system, changing the probability of how it will function afterwards. This is consistent with disruption of the multistep regulatory process as a whole rather than disturbances in single steps [29][30][31]. The phenotypic effect of the polymorphisms that influence this variability is non-specific and general. Such a change in the norm of reaction may be adaptive for a large population under stressful conditions: if environmental changes occur very rapidly or are multiple (in which case some are often mutually exclusive), the level of expression in part of the population may, purely by chance, turn out to be adaptive.
In particular, a change in nucleosome packaging can non-specifically change the probability of gene expression in many tissues at once. A polymorphism that changes nucleosome packaging affects at least two parameters: the "layout" of the transcriptionally active genomic regions [23] and the order of binding and the affinity of transcription factors for DNA, owing to chemical modification of histones [32]. It is commonly accepted that housekeeping gene promoters have a special nucleosomal context, which ensures looser nucleosome packaging and thus makes the promoter accessible to various regulatory proteins in a large variety of tissues. Destruction of one of the few nucleosomes should considerably affect regulation. Indeed, it was a housekeeping gene, the only one in our test set, for which we observed the largest departure of Δ from the in silico prediction (Table 1: triosephosphate isomerase, associated with anemia [33]).
The MBL2 gene, too, is characterized by a large departure of Δ from the in silico prediction. Mannose binding lectin (MBL) is a key protein in the development of the innate immune response. Polymorphisms that reduce MBL expression are associated with variable immunodeficiency, which is a risk factor (especially in early childhood [34,35]) for a variety of infectious diseases [34][35][36]. In lower primates, both copies of the MBL gene are under stabilizing selection [37]. In the anthropoid lineage, one of the copies has undergone pseudogenization [38], and humans have additionally acquired a high frequency of polymorphisms that reduce the MBL level in tissues by disrupting folding (codon 52 Arg → Cys, codon 54 Gly → Asp, codon 57 Gly → Glu) or transcription (-550, H/L polymorphism; -221, X/Y polymorphism; -427; -349; -336; -70; +4, P/Q polymorphism; deletion from -324 to -329) [34,36,38]. A low level of MBL eases the after-effects of stroke [39] and pre-eclampsia [40]. Thus, for humans, with their actively working brain and difficult childbirth, it is adaptive to have strong variation in the within-tissue level of MBL across populations through combinations of these polymorphisms in heterozygotes.
In addition to the common polymorphisms mentioned above, local human populations may have other, independently fixed polymorphisms [41]. The effect of one of these, located in the area of the TATA box (T-35c, Table 1), is reported here. This polymorphism is likely to serve the same purpose as the common SNPs, namely to expand the norm of reaction, but it does so somewhat differently: by modulating a trigger-like regulator based on the composite TATA/(H3-H4)(H2A-H2B)2(H3-H4) unit.
Importantly, this modulation of the norm of reaction is mild. The polymorphisms that disrupt folding or transcription [34,36,38] inhibit MBL gene expression. Consequently, the norm of reaction widens only at the population-wide level. Individuals homozygous for such polymorphisms are vulnerable to infection at all times, that is, the individual norm of reaction is narrow and these individuals will not survive any attack by viruses or microbes. The lack of overlap between the affinity of TBP for the TATA box of the MBL gene with the T-35c polymorphism and the affinity of TBP for non-specific DNA indicates that the expression of this gene is not totally suppressed even in individuals homozygous for this polymorphism. A decrease in TBP affinity means a decrease in the probability that expression of the MBL gene will be initiated; however, if initiation does take place, the gene will be expressed to the same extent as in any wild-type individual. In other words, if some individuals in a population carry the T-35c polymorphism, the norm of reaction widens not only at the population-wide level but also at the individual level, specifically in those carriers. Thus, even under a viral or microbial attack, such individuals are given a chance.
Identification of a previously unidentified source of the consistent in silico underestimation of the damage caused by SNPs to the regulatory regions of genes leads us to expect that a standardized examination of all the 237 TATA box SNPs associated with human diseases [17], regional patterns of the AIDS pandemic [16], and animal and plant traits valuable for breeding [18], together with newly discovered mutations in TATA boxes, will allow us to look into the mechanisms of TBP/TATA binding and improve the quality of the computer-aided tools used for analysis and prediction of SNPs in the regulatory regions of human genes.

Figure 1. An example of the EMSA measurement of the mutation-induced change in the affinity of TBP for the TATA box in the gene for triosephosphate isomerase containing a SNP associated with anemia [33]. TBP/TATA binding isotherms inferred from electrophoregrams (insets): upper, for the -24T allele (unaffected), KD;WT = 7 nM; lower, for the -24g allele (anemia), KD;M = 290 nM. The result of the measurement is presented in Table 1: -ln(KD;M/KD;WT) = -ln(290/7) = -3.72.

Table 1. The existing in silico predictions [19] and the in vitro measurements of the change, Δ ± δ5%, in the TBP affinity for the known natural TATA boxes with SNPs associated with human diseases.
Asterisks denote significant differences between the in silico predictions [19] and the in vitro measurements [this work], α < 0.05 (Student's t-test).