Association between Polymorphisms of SNPs Located at the 3’-Untranslated Region of SET8 and Codon 72 of the TP53 with Breast Cancer among Cameroonian Women

In sub-Saharan Africa, breast cancer (BC) constitutes a serious public health problem, and the genetic basis of its development remains poorly understood. Although the SNPs at codon 72 of TP53 (rs1042522) and in the 3’-UTR of SET8 (rs16917496) have both been associated with BC development among Asian and European women, no data have been published for African populations. We herein report on the impact of these polymorphisms on the risk of BC among Cameroonian women. Blood samples were collected from 111 breast cancer patients and 224 controls. DNA was extracted from each sample and PCR-RFLP was used to genotype SNPs rs1042522 of TP53 and rs16917496 of SET8. Association studies were performed according to ethno-linguistic groups and menopausal status. The minor allele “T” of SET8 gene revealed a protective effect in premenopausal 95% CI, 0.24 - 0.91) with reduced disease risk. No significant association was observed between polymorphisms at the SET8 and TP53 loci and the clinicopathological features of BC. This study suggests significant associations between the SNPs located in the 3’-UTR of SET8 and at codon 72 of TP53 and the risk of breast cancer development among premenopausal women. The results also suggest an interaction between the TP53 and SET8 genes.


Introduction
Breast cancer (BC) is the most common cancer in women worldwide, with about 2.2 million new cases diagnosed in 2018 [1]. Although the incidence of BC is relatively low in developing countries, mortality rates are very high. According to the International Agency for Research on Cancer (IARC), BC incidence ranges from 28 per 100,000 women in Central Africa to more than 37 per 100,000 women in Western Africa [1]. More than 50% of BC-related deaths occur in low-income countries, probably owing to advanced stage at diagnosis and disease aggressiveness [2] [3] [4] [5]. Although enormous efforts have been made to better understand BC etiology, several aspects remain underexplored, especially in sub-Saharan Africa, where the disease shows distinct epidemiological features. About 53,917 new breast cancer cases have been reported in North Africa, compared with more than 114,707 in sub-Saharan Africa [1]. Moreover, among women aged 15 to 49 years, the incidence of breast cancer is lower in North Africa than in sub-Saharan African countries [1]. Women in sub-Saharan Africa also have a higher risk of early-onset, high-grade, node-positive and hormone receptor-negative disease [6]. Although lifestyle factors have been proposed to partially explain these features [7] [8] [9], studies addressing BC genetics have highlighted the role of genetic factors. In this context, polymorphisms at specific genetic markers such as single-nucleotide polymorphisms (SNPs) have been postulated to explain differences in BC outcome according to race and/or ethnicity [10] [11] [12].
Single-nucleotide polymorphisms are the most frequent type of variation in the human genome [13]. Several studies have shown that SNPs are important genetic variants that could help to predict individual susceptibility to various cancers and response to certain drugs [14] [15]. In some epidemiological investigations, SNPs in critical genes have been examined in order to unravel associations between specific alleles or genotypes and the risk of cancer development and/or the appearance of a specific pattern of cancer development [16] [17]. Recent investigations of the genetic basis of breast cancer revealed that one SNP of the TP63 gene was associated with a reduced risk of breast cancer development in Cameroonian women [18]. However, other SNPs that have shown associations with cancer development remain to be investigated in sub-Saharan African countries. For instance, the SNP (rs1042522) in codon 72 of TP53 showed no association with breast cancer in a Rwandese population [19]. For the same SNP, however, other studies reported an association with the risk of developing several cancers including breast cancer [20], highlighting its potential role in the development of breast cancer in other populations. This SNP produces two variants, G and C, with distinct biological and biochemical properties [21], and it has been reported to play an important role in mediating the apoptotic response [22]. Moreover, polymorphism at SNP rs16917496 T/C, located in the 3'-UTR of SET8, has been associated with BC risk in young Asian women [23]. Subsequent investigations have shown this SNP to be a susceptibility factor for a number of cancers including non-small cell lung cancer [24], childhood acute lymphoblastic leukemia and cervical cancer [25]. Remarkably, the TP53 and SET8 genes may interact at the molecular level.
For instance, as a methyltransferase, SET8 methylates the TP53 protein (p53) at Lys-382, which may affect its function [26]. Through this methylation, the SET8 and TP53 gene products interact, and polymorphisms in these genes could alter their function. Deletion of the SET8 gene increased the proapoptotic and checkpoint-activation functions of TP53 [27]. Thus, polymorphism in either the SET8 or the TP53 gene may lead to the loss of homeostatic control during human carcinogenesis [28] [29]. However, there is no evidence of a correlation between the SNP in the 3'-UTR of SET8 (rs16917496 C/T) and BC in sub-Saharan African populations. Although the aforementioned SNPs have been associated with the risk of BC in young Asian women [23], no published data show their involvement in the risk of developing BC in African women. Understanding the impact of these SNPs on the development of BC in sub-Saharan Africa may help in designing well-tailored preventive and sensitization measures.
We herein report on the association between polymorphisms at two SNPs of the SET8 and TP53 genes and the risk of BC in Cameroonian women, both as independent factors and in an interaction model.
[30]. Besides these three groups, some minor groups exist, such as the Baka, who generally speak Bantu languages but are not closely related to any of these three major groups [31].

Blood Sampling and DNA Extraction
About 5 ml of whole blood was collected by venipuncture into EDTA-coated tubes. After centrifugation at 3000 ×g for 5 minutes, the buffy coat was collected. From each buffy coat, DNA was extracted using phenol-chloroform-isoamyl alcohol (25:24:1) as described by Kerney [32] and then precipitated with isopropanol. The DNA pellets were washed twice with cold 70% ethanol and dried at room temperature. They were finally re-suspended in 50 µl of sterile ultrapure water and stored at −20˚C until use.

Genotyping of SNPs in SET8 and TP53 Genes
In this study, the SNPs in SET8 and TP53 were investigated by PCR-RFLP, in which a DNA fragment of each gene was amplified and subsequently digested with a specific restriction enzyme. The following primer pairs were used: SET8-Fow (5'-TGAGCTGAGGTGTGAGCCTA-3') and SET8-Rev (5'-AGAGTTCTGGGAAACACGCT-3') for SET8, and sense 5'-ATGGGACTGACTTTCTGCTCTTG-3' and anti-sense 5'-GGAAGCCAAAGGGTGAAGAGG-3' for TP53. These primers were designed using Primer-BLAST software as described by Ye et al. [33]. and 5 µL of 10-fold diluted genomic DNA extract and supplemented with sterile ultrapure water. The amplification program consisted of an initial denaturation step at 95˚C for 15 min, followed by 40 cycles of 95˚C for 45 s, 58˚C (SET8) or 57˚C (TP53) for 45 s, and 72˚C for 1 min, and a final extension step at 72˚C for 10 min. PCR products were resolved by electrophoresis on a 2% agarose gel, visualized under UV light and documented with UVItec (Cambridge, UK). All successfully amplified samples (a DNA fragment of 700 bp for SET8 or 500 bp for TP53) were selected and subjected to restriction digestion.
For this digestion, 10 µL of SET8 or TP53 PCR product was digested with SwaI or BstUI, respectively (New England BioLabs, Inc.). Digestion was performed overnight at 25˚C in NEBuffer 3.1 for SwaI and at 60˚C in NEBuffer CutSmart for BstUI. The digested products were resolved on a 2% agarose gel (FMC Bio Products) at 100 volts for 90 minutes and documented using a UVItec (Cambridge, UK) gel documentation system. The expected sizes of the DNA fragments resulting from digestion of the PCR products were determined using the online Restriction Mapper software (version 3) at http://www.restrictionmapper.org. This was done by simulating the digestion of each PCR product sequence with the corresponding restriction enzyme identified in previous studies (Table 1). For the SET8 and TP53 loci, three different profiles were expected (Table 1): 1) the homozygote wild type genotype, with one DNA fragment of 700 bp for SET8 and two DNA fragments of 286 and 214 bp for TP53; 2) the alternative homozygote genotype, with two DNA fragments of 203 and 497 bp for SET8 and one DNA fragment of 500 bp for TP53; and 3) the heterozygote genotype, with three DNA fragments of 203, 497 and 700 bp for SET8, and 214, 286 and 500 bp for TP53.
Amplicons from controls and BC patients were quantified before digestion. Equal amounts of amplicon were digested to minimize misestimation of heterozygote frequency that could result from partial digestion. For each series of amplifications and digestions, samples with known genotypes were included as internal controls to monitor reproducibility and digestion efficiency.
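The three expected banding profiles can be turned into a simple genotype-calling rule. The sketch below is only an illustration of that logic, not the procedure used in the study: the function name `call_profile`, the `EXPECTED` table layout and the 15 bp sizing tolerance are assumptions introduced here, while the fragment sizes come from the profiles described above. Lanes showing an incomplete pattern (for example, only one of the two digestion fragments) are flagged as ambiguous rather than forced into a genotype.

```python
# Expected fragment sizes (bp), copied from the three profiles described in the text.
# "uncut" is the intact amplicon; "cut" is the fragment pair left by complete digestion.
# "wild_type" records which pattern the text assigns to the wild-type homozygote.
EXPECTED = {
    "SET8": {"uncut": (700,), "cut": (203, 497), "wild_type": "uncut"},
    "TP53": {"uncut": (500,), "cut": (214, 286), "wild_type": "cut"},
}


def call_profile(locus, bands, tolerance=15):
    """Classify a gel lane from its observed band sizes (bp).

    `tolerance` is a hypothetical allowance for gel sizing error.
    """
    exp = EXPECTED[locus]
    all_sizes = exp["uncut"] + exp["cut"]

    def near(band, size):
        return abs(band - size) <= tolerance

    def present(size):
        return any(near(b, size) for b in bands)

    # Reject empty lanes and lanes containing a band that matches no expected fragment.
    if not bands or not all(any(near(b, s) for s in all_sizes) for b in bands):
        return "ambiguous"

    has_uncut = present(exp["uncut"][0])
    n_cut = sum(present(s) for s in exp["cut"])

    if has_uncut and n_cut == 2:
        return "heterozygote"
    if has_uncut and n_cut == 0:
        pattern = "uncut"
    elif not has_uncut and n_cut == 2:
        pattern = "cut"
    else:
        return "ambiguous"  # e.g. only one of the two digestion fragments is visible
    if pattern == exp["wild_type"]:
        return "wild-type homozygote"
    return "alternative homozygote"
```

For example, `call_profile("TP53", [500, 286, 214])` returns `"heterozygote"`, whereas a SET8 lane with bands at 700 and 203 bp only is reported as `"ambiguous"`, which is the situation the internal digestion controls are meant to catch.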

Power Calculation
The power of this study was calculated using the PGA modeller package in MATLAB [34]. It was estimated assuming an odds ratio (OR) or relative risk (RR) ≥ 2 and a disease allele frequency of 0.085 - 0.1791 for the two genotyped loci. The calculation also took into account a disease prevalence estimated at 0.1% in women aged 20 to 74 years according to WHO [35], a type 1 error rate of 5%, a linkage disequilibrium (LD) parameter (r²) of 0.8 [36], a case-control ratio of 1:2 and the sample size.

Association Analyses
Before the association studies, a Hardy-Weinberg equilibrium (HWE) test was undertaken on the entire population and on the subpopulations stratified by ethno-linguistic subgroup or menopausal status. Each population or subpopulation was considered to be in HWE when the p value (comparing the observed and expected heterozygote rates) was ≥0.05 using the PLINK v1.9 package. Associations between the polymorphisms at the SET8 and TP53 loci and the risk of BC development were investigated with a logistic regression model, used to estimate odds ratios (OR) with 95% confidence intervals (CI) in PLINK v1.9 [37]. These analyses were performed on the entire population as well as on the subpopulations defined by ethno-linguistic group and menopausal status. To avoid errors resulting from absent genotypes or alleles (cell counts of zero), a value of 0.5 was added to all cells as described previously [38] [39]. Pearson chi-square (χ²) tests and Fisher's exact test were used to compare categorical variables between participants, while Student's t-test was used to compare the mean values of continuous variables between subpopulations, using SPSS Software 22.0 (SPSS Inc., Chicago, Illinois, USA). A test was considered significant for a P value below 0.05. The Cochran-Mantel-Haenszel (CMH) test implemented in PLINK v1.9 was performed on allelic frequencies because this test can only be done with binary variables [37]. Used as an extension of the chi-square test, the CMH test allows the estimation of an odds ratio and 95% confidence interval across the stratified populations, represented here by the different ethno-linguistic groups and menopausal status groups. This test made it possible to assess the association between alleles and the probability of developing breast cancer within each stratified subpopulation.
The CMH2 test, also implemented in PLINK v1.9, was used to determine whether the allelic frequencies differed significantly between subpopulations. In addition to the genotypic and allelic tests, the Cochran-Armitage trend test across genotypes was performed on the entire population and on the different subpopulations to assess any association between polymorphism at a given locus and the risk of breast cancer development [40].
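To make the stratified allelic analysis concrete, the following is a minimal sketch of the textbook Cochran-Mantel-Haenszel calculation on 2 × 2 allele-count tables, one per stratum. It is not PLINK's implementation: the function name `cmh`, the table layout (a = minor allele in cases, b = major allele in cases, c = minor allele in controls, d = major allele in controls) and the absence of a continuity correction are assumptions made for illustration. The p-value uses the chi-square distribution with one degree of freedom, whose upper tail equals erfc(√(χ²/2)).

```python
import math


def chi2_sf_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def cmh(strata):
    """Pooled Mantel-Haenszel odds ratio and CMH chi-square across strata.

    `strata` is a list of (a, b, c, d) allele counts, one tuple per stratum
    (hypothetical layout: a, b = minor/major allele in cases; c, d = in controls).
    """
    or_num = or_den = 0.0   # numerator and denominator of the pooled odds ratio
    diff = var = 0.0        # summed (observed - expected) and variance of cell a
    for a, b, c, d in strata:
        n = a + b + c + d
        or_num += a * d / n
        or_den += b * c / n
        diff += a - (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    chi2 = diff ** 2 / var
    return or_num / or_den, chi2, chi2_sf_1df(chi2)
```

As a consistency check on the helper, `chi2_sf_1df(2.777)` evaluates to roughly 0.096, the same relationship between a one-degree-of-freedom χ² and its unadjusted p-value as reported for the SET8 minor allele in the stratified analysis.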
To confirm the results of the association studies generated by the CMH tests, the logistic regression model was applied to the subpopulations stratified by ethno-linguistic subgroup and menopausal status. Fisher's exact test was performed on samples from premenopausal women that were in HWE and that showed a significant association with polymorphisms at the TP53 and SET8 loci, in order to assess any association between the genotypic frequencies and the different clinico-pathological presentations of BC. It was also used to assess the implication of the combined polymorphism at the TP53 and SET8 loci in the risk of BC development [27].
(Table 2).
With a linkage disequilibrium parameter (r²) of 0.8, a disease prevalence of 0.1% in women aged 20 to 74, disease allele frequencies ranging from 0.085 to 0.1791 for the two genotyped loci and a sample size of 335 individuals (111 BC patients and 224 controls), the power of this study was estimated at 86%.

Amplification of SET8 and TP53 Genes
The DNA extracts from all 335 participants were successfully amplified for both the SET8 and TP53 genes. Figure 1(a) and Figure 1(b) illustrate the electrophoretic profiles obtained on agarose gel, showing the amplicons obtained from different DNA extracts. The quality and intensity of the bands observed on agarose gel attest not only to successful amplification, but also to the quality of the DNA extracts obtained with the phenol-chloroform-isoamyl alcohol extraction method.
Journal of Biosciences and Medicines

Genotyping of Different SNPs
According to SNPs that were investigated, different electrophoretic profiles were generated after digestion of PCR products. Figure 2 is an example of electro-

Association Study Performed on the Whole Population
At the SET8 gene locus, the overall population as well as the different subpopulations were in HWE (p = 1). No significant difference was observed at this locus when the allelic and genotypic frequencies were compared between patients and controls. (Table 3) (OR, 0.5; CI 95%, 0.311 - 0.798; p-value = 0.004). When the population was stratified into ethno-linguistic subgroups and according to menopausal status, the allelic frequencies were in HWE for the Bantu (p-value = 0.5859) and Semi-Bantu (p-value = 0.1428) ethno-linguistic groups, and for premenopausal (p-value = 0.1) and postmenopausal (p-value = 0.5841) women. The Sudano-Sao ethno-linguistic subgroup was not in HWE (p = 0.0373) (Table 5). Table 5 gives the detailed HWE results obtained when the population was stratified into ethno-linguistic groups. The heterogeneous nature of the studied population, which is made up of several ethno-linguistic subgroups, has an impact on HWE. For these reasons (the various ethno-linguistic groups and the deviation from HWE in the entire population), additional analyses were performed with the Cochran-Mantel-Haenszel (CMH) test, which takes population stratification into account. For these analyses, the population was stratified on the basis of ethno-linguistic group and menopausal status, and the Sudano-Sao subgroup was excluded.
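The HWE checks reported here compare observed and expected genotype counts. As a rough illustration only, the sketch below uses the classical one-degree-of-freedom chi-square goodness-of-fit test; this is an assumption for teaching purposes, and the p-values it gives may differ from those produced by PLINK v1.9, which reports an exact test. The function name `hwe_chi2` and the genotype labels AA, AB and BB are hypothetical.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions from genotype counts.

    Returns (expected heterozygote count, chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; HWE cannot be tested")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return expected[1], chi2, p_value
```

Under the criterion used in this study, a group would be treated as being in HWE when the returned p-value is at least 0.05.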

Association Study Performed on the Stratified Population
Data used in the CMH test included 328 participants (105 BC patients and 223 controls) from the Bantu and Semi-Bantu ethno-linguistic groups, the Sudano-Sao group being excluded. With the CMH test, no significant association was observed between the polymorphisms at the SNPs of the SET8 and TP53 genes and the risk of developing BC in the different ethno-linguistic groups. The minor allele T at the SET8 locus was not significantly associated (unadjusted p = 0.096, χ² = 2.777, adjusted p = 0.1913) with BC development. For the TP53 locus, the minor allele

Association Study Performed on Each Subpopulation
Because the CMH test did not show any significant association across the different subpopulations, each subpopulation was analyzed independently with the logistic regression model, considering only those that were in HWE.
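For a single binary predictor (for example, carrier versus non-carrier of a genotype), the odds ratio from a logistic regression coincides with the odds ratio of the corresponding 2 × 2 table, and its Wald confidence interval coincides with the log-based (Woolf) interval. The sketch below illustrates that calculation together with the 0.5 cell correction mentioned in the methods. It is an assumption-laden illustration rather than the PLINK analysis itself: the function name `odds_ratio_ci` and the cell layout are hypothetical, and the correction is applied here only when a cell is empty.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2 x 2 table.

    Hypothetical layout: a, b = carriers / non-carriers among cases;
    c, d = carriers / non-carriers among controls.
    """
    # Add 0.5 to every cell when any cell is empty, so the ratio and its
    # standard error remain defined.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that lies entirely below 1 corresponds to a protective association, and one entirely above 1 to an increased risk; an interval that spans 1 is not significant at the 5% level.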

Association Study Performed According to Menopausal Status
The CT genotype of SET8 gene was significantly associated with increased risk (30) development between women with and without breast cancer. Additional association analyses that took into account the clinical and pathological characteristics of the disease revealed no significant association between the polymorphisms at the SET8 and TP53 loci and the clinical presentation of breast cancer in the studied population (Table 8).

Frequencies of Combined Genotypes
The combination of the CC genotype of SET8 and the GC genotype of TP53 showed a significant protective effect against BC development (OR = 0.46, 95% CI: 0.24 - 0.91, p-value = 0.024), being significantly enriched in healthy controls compared to BC patients. The other genotype combinations did not show any association with BC development (Table 9).
results are in line with those reported in Caucasian patients, where polymorphism at the same locus seemed to increase the risk of BC among premenopausal women [62]. However, some studies have suggested that there is no association between the rs1042522 variant and the development of BC in Africa, whatever the menopausal status [19] [47]. The discrepancies between association studies involving this SNP could be explained by the genetic variability of the African population, which is made up of various ethno-linguistic groups with diverse genetic backgrounds [21] [62].
Although the CG genotype of TP53 has not been implicated in premenopausal BC susceptibility [48], the results of our study (adjusted p-value of 0.002 and an OR of 0.39) revealed its association with a reduced risk of developing BC in premenopausal women. These results contrast with those of Cherdyntseva et al. [61], who reported that the CG genotype seemed to increase the risk of BC in premenopausal Caucasian patients. Moreover, the GC genotype of TP53 showed a protective effect against retinoblastoma invasion [63]. The differences observed between these association studies could be related to differences in genotype frequencies between populations and to the type of cancer considered. In our study, both cases and controls showed a high prevalence of the C allele compared to the G allele, and the GG genotype was absent. Indeed, Brenna et al. [64] showed that the frequency of the G allele increases with latitude, while the C allele shows the opposite trend.
Moreover, several studies reported that the polymorphism at SNP rs1042522 is balanced by natural selection [65] [66]. They also reported that the frequency of the C allele increases linearly across populations as they approach the equator, reaching around 60% in people of African descent compared with 17% - 34% in those of Caucasian descent [65] [66]. These variations in allelic and genotypic frequencies according to the geographical position of the studied populations could partly explain the rarity of the G allele and the GG genotype in Cameroon and, therefore, their association with the risk of breast cancer development in Cameroonian premenopausal women.
In our study, the combination of the CC genotype of SET8 with the CG genotype of TP53 had a significant protective effect (OR = 0.46, 95% CI: 0.24 - 0.91, P = 0.024) against BC development in premenopausal women. These results do not agree with those obtained in a Chinese population, where individuals with the same combined genotypes had a high risk of developing BC at an early age [23]. They nevertheless suggest that SET8 and TP53 gene variants may interact in BC development, in line with the observations of Yang et al. [25], who provided evidence of a gene-gene interaction between SET8 and TP53 polymorphisms in the risk of cervical cancer. Indeed, past investigations revealed the contribution of cancer-related SET8 mutants, together with p53, to the establishment of DNA-damage signaling and senescence in primary human cells [67]. The TP53 protein is regulated by SET8-mediated monomethylation at K382, which might render it inert, in part by preventing acetylation at K382 [67]. Further studies with larger sample sizes are needed to confirm our findings.

Conclusion
This study showed a significant association between polymorphisms in the 3'-UTR of SET8 and at codon 72 of TP53 and the risk of developing BC in premenopausal Cameroonian women. The association of SET8 and TP53 polymorphisms with the risk of BC suggests a multiplicative gene-gene interaction. Further studies are warranted to elucidate the role of genetic polymorphisms in breast carcinogenesis in Cameroon.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.