The Polymorphisms in the Vitamin D Receptor Gene and Disease Severity in Sickle Cell Disease

Vitamin D is important in multiple aspects of health and its effects are mediated through the Vitamin D Receptor (VDR). We wanted to test the hypothesis that specific haplotypes of the VDR gene are associated with markers of disease severity, inflammation and bone health in Sickle Cell Disease (SCD). Genotyping was performed on DNA specimens from 1141 study participants in the NIH-funded Silent Infarct Transfusion (SIT) trial. We used the clinical and laboratory data to create separate endothelial dysfunction, vaso-occlusive severity scores and phenotype variables. Seventy-nine Single Nucleotide Polymorphisms (SNP) in the VDR gene and three associated genes— CYP27B1, VD binding protein, retinoid X receptor, were evaluated. The discovery cohort individuals had VDR haplotype information from a prior Genome-Wide Association Study (GWAS). The validation cohort was analyzed for SNPs that were significant in the discovery cohort. The phenotype data were obtained from the demographic and clinical information of the participants, and were used to create the severity scores, vaso-occlusive score, endothelial dysfunction severity, and overall severity score. Potential gene-gene interactions were analyzed for prediction of disease severity within each severity score. Two SNPs were associated with the overall severity score, 3 Corresponding author. E. L. J. Clay et al. 25 SNPs with the endothelial dysfunction severity score and 4 SNPs with the vaso-occlusive severity score. After permutation testing to correct for multiple comparisons, only one of the associations remained significant. SNP rs7965281 was found to be associated with the endothelial dysfunction severity score and remained significant after correcting for multiple comparisons using permutation testing. In the validation cohort, that SNP was again tested for association with each of the severity scores. There was no association with the endothelial or the overall severity score but a trend towards association with the vaso-occlusive severity score (p = 0.02). None of the known functional polymorphisms in the VDR gene were found to have an association with severity in sickle cell disease. Further work analyzing for gene-gene interaction using the same significant SNPs remains to be done in association with inflammatory markers and measure of bone health. Those studies may provide insight on the contribution of VDR polymorphisms to sickle cell disease severity.


Introduction
Sickle Cell Disease (SCD) is a hereditary disorder primarily affecting people of African ancestry [1].SCD is a single gene disorder and the complications of this disease affect every organ of the body including the vascular and immune systems along with bone metabolism.Vitamin D deficiency is commonly seen in patients with sickle cell disease especially those with SCD-SS [2] [3].Vitamin D and calcium are required for optimal bone health with close to 90% of required vitamin D synthesis coming from exposure to sunlight.Dark-skinned individuals usually require 5 -10 times more exposure to sunlight to produce the same amount of vitamin D3 in their skin [4].As common as vitamin D deficiency is in SCD, its contribution to disease manifestations is not yet known.Vitamin D is important in multiple aspects of health, including cardiovascular, immune and skeletal systems.The active metabolite of the pro-hormone vitamin D exerts its effect through interaction with the Vitamin D Receptor (VDR).A Single Nucleotide Polymorphism (SNP) is DNA sequence variation occurring commonly within a population in which a single nucleotide in the genome differs between members of a biological species or paired chromosome.The VDR is expressed in multiple cell types and there are specific VDR polymorphisms associated with diseases such as diabetes, hyperparathyroidism, and renal disease (to name a few) [5].Some of the polymorphisms such as bsmI, fokI, taqI have been found to be associated with bone health, inflammation and vascular disease, respectively [6] [7].Although vitamin D deficiency has been studied in sickle cell disease to some extent, the polymorphisms in the VDR gene have not been explored in sickle cell disease or in Africans or African Americans [8].Because of the presence of the VDR on cells in endothelial, inflammatory and bone systems, polymorphisms in this gene may be associated with disease manifestations in these organ systems in SCD.Our hypothesis is that specific VDR polymorphisms are associated with disease severity in sickle cell disease.Potentially some of the variation in the severity of sickle cell disease can be explained by differing polymorphisms in the VDR gene.

Materials and Methods
After approval of the study was obtained from the Biomedical Institutional Review Board of the University of North Carolina, genotyping was performed on DNA specimens from 1141 study participants in the NIH-funded Silent Infarct Transfusion (SIT) trial (WU-04-60/PO29892B).In this multi-center international trial, the participants were children ages 4 through 13 with SCD who were screened for the presence of silent cerebral infarction and had demographic and clinical data collected, as well as samples for a biologic repository which included DNA.We used the clinical and laboratory data to create separate endothelial dysfunction and vaso-occlusive severity scores for phenotype variables, following the current thinking on the subtypes of sickle cell disease expression [9].The initial 570 participants served as our discovery cohort, used to detect potential associations from all the variants tested.The subsequently enrolled 530 individuals formed our validation cohort to try to replicate/validate statistically significant findings from the discovery cohort.We evaluated 79 Single Nucleotide Polymorphisms (SNPs) in the VDR gene, three associated genes: CYP27B1, VD binding protein, retinoid X receptor, and tagging SNPs from the African American population from Hap map (Appendix 1).The discovery cohort individuals had VDR haplotype information from a prior Genome-Wide Association Study (GWAS) study, and analysis for additional VDR-related SNPs was performed using a specifically designed Sequenom assay.
The validation cohort was analyzed for SNPs that were significant in the discovery cohort.
The phenotype data was obtained from the demographic and clinical information of the participants, and was used to create severity scores.The vaso-occlusive severity score includes: number of hospitalizations for pain, number of hospitalizations for acute chest syndrome and avascular necrosis.The endothelial dysfunction severity score includes priapism, Transient Ischemic Attacks (TIA), silent cerebral infarct, systolic and diastolic blood pressure, transcranial doppler velocity, white count and baseline hemoglobin.The overall severity score includes all of the endothelial dysfunction and vaso-occlusive severity variables (Figures 1(a)-(c)).To derive the scores, the variables were transformed into quartiles.Each individual subject was assigned values of 1, 2, 3, or 4 for each variable with 1 representing lowest severity and 4, the highest.In addition, in concert with prior analyses of the SIT data, the variable for number of hospitalizations for pain was used alone as a severity measure.
Before association analysis, genotypes underwent quality control filters.First, markers with missing data rates greater that 5% were removed.Second, individuals with missing data rates greater than 5% were removed.Next, markers were tested for deviation from Hardy-Weinberg proportions using a Fisher's exact test and a Bonferroni corrected alpha =0.05 for the number of SNPs tested and markers with significant deviations were removed prior to analysis.This QC resulted in no markers of individuals needing to be removed.After QC, genetic markers were tested for association with each of the severity scores using a two stage approach.Initially, univariate tests for association were performed.Because severity scores were not normally distributed and were not totally continuous distributions, the Kruskal-Wallis test was used in association analysis (treating each genotype as a category).To correct for multiple comparisons, permutation testing was performed.One thousand permuted datasets (permuting the risk score outcomes while preserving the genotype matrix) were created and the Kruskal-Wallis tests were performed for each of the variants.The lowest p-value from the analysis of each permuted dataset was used to build an empirical distribution of p-values expected by chance, and the p-values for the original analyses were determined by their hypothetical percentile rank in the empirical distribution.Such a permutation approach is a well validated approach for multiple testing correction that is less conservative than a Bonferroni correction when there is correlation between predictor variables [10] [11].
To look for complex genetic models, including potential gene-gene interactions for prediction of disease severity, the Generalized Multifactor Dimensionality Reduction (GMDR) method was utilized, with repeat analyses performed for each severity score.GMDR introduces the concept of a score statistic into the Multifactor Dimensionality Reduction (MDR) framework to obtain an appropriate statistic to classify multifactor contingency table cells into two different groups.The score value is based on a generalized linear model of the phenotype variable, and allows for the use of covariates in the score approximation.Details of GMDR have been previously described in detail [12].
In step one, data was divided into a training set and an independent testing set for cross validation.Five-fold cross-validation was used, with 4/5 of the data used for training and 1/5 for testing.A set of n genetic factors were then selected.These factors and their multiple classes were divided in n-dimensional space.In the traditional MDR approach, the ratios of cases to controls were then calculated within each multifactor class.Each multifactor cell class was labeled "high risk" if the ratio exceeded 1.0, or "low risk" if less than 1.0, thus reducing n-dimensional space to one dimension with two levels.In GMDR, the ratio of cases to controls in each cell is replaced by the score values.The null hypothesis assumes there are no effects of the putative factors or their interactions, so the score values are the same for all different factor classifications.In the third step of GMDR, the cumulative score value was calculated within each multifactor cell and in the fourth step, each multifactor cell was labeled either as high-risk if the average score meets or exceeded a pre-assigned threshold T (e.g., 0), or as low-risk if the threshold was not exceeded.The collection of these multifactor classes comprised the GMDR model.The result is a set of models, one for each model size considered.The final model was chosen that minimizes prediction error while maximizing Cross Validation Consistency (CVC).The statistical significance of the final best model was determined through permutation testing, which involved creating 1000 permuted datasets by randomizing the value of the phenotype variable.The entire procedure was repeated for each, generating a distribution of 1000 prediction errors that could be expected by chance alone.The significance of the final model was determined by comparing the prediction error of the final model to the distribution.A p-value was extracted for the model by its theoretical location in the permutation distribution.
In the present study, GMDR analysis was performed with the same outcome variables listed above as the phenotypes, using stratification when necessary and all other genetic markers as potential predictor variables.Analysis was performed with 5-fold cross-validation, and single-variable through four-variable interactions were evaluated.The permutation testing was used to empirically define the prediction accuracies that are significant with a family-wise type I error rate of 0.05.
The study was designed to have at least 80% power to detect odds ratios of 2.15 for the univariant association of variants with minor allele frequencies of at least 0.25; 58% of our candidate genes fell within this range.These power calculations were performed assuming an F distribution with a Bonferroni corrected alpha of 0.05.SNPs that were significantly associated with severity in the discovery cohort were tested for association in the validation cohort.Each SNP was directly tested for association using a Kruskal-Wallis test after QC checks.Since this replication analysis represents a single statistical hypothesis, no correction for multiple testing was performed.

Results
By univariate testing in the discovery cohort, 2 SNPs (rs1491710, rs11574114) were nominally associated with the overall severity score, 3 SNPs (rs11829917, rs2853563, rs11574138) with the endothelial dysfunction severity score and 4 (rs7855881, rs12348547, rs11574114, rs7965281) with the vaso-occlusive severity score.One of the SNPs, rs11574114, was associated with both the overall and the vaso-occlusive severity scores.
Our severity scores, although based on current thinking about sickle cell disease, have not been validated in other work.The measure that has been validated as a measure severity in the SIT population (the source of the data for this study) is number of hospitalization for pain.When tested by univariate analysis, there was a nominal association with 2 of the SNIPS (rs7855881, rs34312136) with this variable.However, after permutation testing to correct for multiple comparisons, none of the associations remained significant.
Using MDR to test for significant gene-gene interactions in the discovery cohort, one single locus association between SNP rs7965281 and the endothelial dysfunction score was significant, and remained significant after permutation testing to correct for multiple comparisons.The mean endothelial dysfunction severity score for those homozygous for A and for G (the variant allele) was 15, and it is 13 for the heterozygotes (AG).The range for the endothelial dysfunction score is 7-23, with higher numbers corresponding to greater severity.
In the validation cohort, there was no univariate association with the one significant SNP from the discovery cohort, rs7965281.However, that SNP was significantly associated with the vaso-occlusive severity score in the validation cohort by MDR testing (p value = 0.02).

Discussion
Overall there does not appear to be a significant association between any of the currently described functional VDR genes, such as bsmI, fokI, taqI and severity in SCD.With the current study's sample size and power, only 58% of our candidate genes had a minor allele frequency sufficient for testing for an association with disease severity.The association between the VDR and the SCD severity could be further explored in a larger SCD population.In addition, a sickle cell disease population for which there are data on bone health, disease and inflammatory markers should be studied in relation to VDR haplotypes.
Concerning the single SNP that was found to be significantly associated with SCD severity, not much is known currently.In the literature, SNP rs7965281 is a tagging SNP whose significance is not fully clear.It is located in the regulatory region UTR + 3713: Chr position 45262658 as per dbSNP (HuRef).It was found to be associated with reduced risk for cutaneous melanoma in a large population [13].In a marker report, a trend analysis was done with this SNP and an association with systolic blood pressure was found in a British population, with an adjusted p value of 0.029.More work is being currently done on this particular SNP (personal communication with Dr. Orlow [13]).
While our study did not show a relationship between the well-described VDR gene polymorphisms and the SCD severity, one SNP was found which may be of importance.In this study with a two-stage approach to association analysis, the SNP found was associated with disease severity in both the discovery and the validation cohorts, but it was not associated with the same severity scores in the two cohorts, making the results somewhat less precise that one would like.However, in the current conception of the clinical expression of sickle cell disease, the vaso-occlusive and the endothelial dysfunction disease manifestations do overlap, so the association of SNP rs7965281 with both of the scores does not invalidate its significance.Clearly, an understanding of the biologic function of SNP rs7965281 is necessary if the results of the current study are to be built upon, and these results should be evaluated in other studies of the VDR and sickle cell disease severity to confirm this association.
National Institutes of Health to The Johns Hopkins University (PI: J.F. Casella, MD) (CIDR contract #: HHSN268200782096C).Additional assistance with the GWAS data was provided by Drs.D.E.Arking and P. Bhatnagar (JHU) and W.C. Hooper and C.J. Bean (CDC).
We also would like to acknowledge David Barrow, DDS, for his contribution in the early stages and planning of this project.

Figure 1 .
Figure 1.Histogram of severity score distributions in the discovery cohort.The relative density of the validation population for the (a) vaso-occlusive, (b) endothelial dysfunction and (c) overall severity score are shown.