Genetic Copy Number Variations in Colon Mucosa Indicating Risk for Colorectal Cancer

Background: Sporadic colorectal tumors probably carry genetic alterations that may be related to familial clusters, according to risk loci visualized by SNP arrays on normal tissues. The aim of the present study was therefore to search for DNA regions (copy number variations, CNVs) as biomarkers associated with genetic susceptibility for early risk prediction of colorectal cancer. Such sequence alterations could provide additional information on phenotypic grouping of patients. Material and Methods: High-resolution 105K oligonucleotide microarrays were used to search for CNV loci in DNA from tumor-free colon mucosa, collected at primary operations for colon cancer in 60 unselected patients, in comparison to DNA in buffy coat cells from 44 confirmed tumor-free and healthy blood donors. Array-detected CNVs were confirmed by multiplex ligation-dependent probe amplification (MLPA). Results: A total of 205 potential CNVs were present in DNA from colon mucosa. Of these, 184 (90%) had been identified earlier in mucosa DNA from healthy individuals as reported to the Database of Genomic Variants. The remaining 21 (10%) CNVs were potentially novel sites. Two CNVs (3q23 and 10q21.1) were significantly related to colon cancer, but were not confirmed in buffy coat DNA from the cancer patients. Conclusion: Our study reveals two CNVs that indicate increased risk for colon cancer; these DNA alterations may have


Introduction
Confirmed large-scale copy number DNA variations were observed in our previous analyses of colon mucosa tissue from colorectal cancer (CRC) patients with different clinical outcomes [1]. Patients with excellent survival displayed only reported copy number variations (CNVs), while both reported and unconfirmed DNA alterations occurred in patients with short postoperative survival. Large-scale CNVs in mucosa tissues have been emphasized by others [2] [3], and multiple genome-wide association studies (GWAS) have identified several susceptibility single-nucleotide polymorphism (SNP) loci proposed to predispose to colorectal cancer [4]-[7]. Hence, diagnostic markers based on genetic aberrations may support disease prevention through early diagnosis and thereby improve survival. Therefore, we intended to follow up on our previous observations on CNVs in DNA from patients who developed colon cancer.
DNA sequence alterations such as CNVs in mucosa tissue from patients with colorectal cancer may either be acquired, reflecting toxic environments to which colon epithelial cells are exposed during a lifetime and which predispose to carcinogenesis, or they may represent inborn host susceptibility to cancer development [8]. In recent years, it has become apparent that CNVs are common, involve a great proportion of the human genome and may be involved in cancer development [2] [3] [9]-[15]. Such findings may be utilized to improve public health care activities [5] [11] [16] [17], where DNA CNVs may provide additional information in screening of phenotypes compared to SNPs [8] [18] [19]. Therefore, the aim of this study was to search for CNVs in colon mucosa, as related to colon cancer, in 60 patients at their primary operation for colorectal cancer.

Patients and Sample Collection
A total cohort of 486 patients diagnosed with primary colorectal carcinoma was subgrouped into Dukes A, B, C and D tumor stages, corresponding to stage I-IV, according to histopathological findings at operation [20]. Fifteen patients per Dukes stage were consecutively selected, giving a total of 60 patients. Patients with Dukes A and B stage had at least 5 years of recurrence-free survival after the primary cancer operation, and the 30 patients with Dukes C or D stage died from colorectal cancer within a maximum of 38 months after primary surgery. The two groups with Dukes A and B comprised 7 females and 8 males each, while the two groups with Dukes C and D comprised 8 females and 7 males each. Twenty-three patients were operated on at Uppsala University Hospital, Sweden, between 1988 and 1992, and 37 were operated on at Uddevalla Hospital, Sweden, between 2001 and 2005. None of the patients experienced any additional perioperative cancer treatment according to local practice at the time and individual patient requirements. This study protocol was approved by the Board of Ethics at the University of Gothenburg (dnr 365-05) and all participants gave written informed consent. Biopsies from colon mucosa tissue far from the tumor (>15 cm), down to the serosa layer, were collected from each patient at operation for the primary tumor, snap frozen in liquid nitrogen and stored at −80 °C. Blood samples from patients were collected beyond 5 years postoperatively in 5 patients with Dukes A tumors, and blood from healthy blood donors was collected at the Blood Center at Sahlgrenska University Hospital. Buffy coat layers were stored at −80 °C until analysis.

Extraction of Genomic DNA from Normal Colon Mucosa and Blood
Tissue biopsies were homogenized and total genomic DNA extraction was performed with the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) according to instructions and quantified in a NanoDrop ND-1000 (NanoDrop Technologies Inc., USA). The buffy coat layer from blood was used for genomic leukocyte DNA extraction with the FlexiGene DNA Kit (Qiagen, Hilden, Germany).

105K Oligo Array CGH
DNA was hybridized in competition with the commonly used reference DNA NA10851 (Coriell Cell Repositories, Camden, NJ, USA) to 105K Whole Human Genome oligo arrays (Design 014698, Agilent Technologies, Palo Alto, CA, USA). The high-resolution 105K array contains approximately 99000 coding and noncoding human sequences distributed genome-wide with a median probe spacing of 21.7 kb. Genomic DNA derived from colon mucosa tissue in 5 patients was pooled into one analytical specimen, repeated to give triplicates for all Dukes A-D groups (n = 12 arrays). Genomic DNA from 5 individual blood samples from Dukes A patients was also pooled into one analytical specimen (n = 1 array). Thus, 12 analytical specimens of pooled DNA from colon mucosa and 1 specimen of pooled DNA from patients' blood cells were analyzed versus reference DNA. 500 ng of study and reference DNA were labeled with the Agilent Genomic DNA Labeling Kit PLUS, hybridized with the Agilent Oligo array CGH Hybridization Kit and washed with the Agilent Oligo Wash Buffer 1 & 2 set, following Agilent's standard processing recommendations. All labeled samples were checked in a NanoDrop prior to hybridization and arrays were scanned on an Agilent scanner (G2565AA, Agilent Technologies). Analyses of scanned images from two-color oligonucleotide CGH arrays were performed in Feature Extraction 9.5 (FE 9.5, Agilent Technologies), using default CGH settings. FE 9.5 result files were imported into CGH Analytics 3.4 and quality control was assessed by metrics provided by the software.

Multiplex Ligation-Dependent Probe Amplification (MLPA)
MLPA analysis was performed to confirm array CGH CNVs using a probe mixture (Salsa MLPA kit P300-A1 Human DNA Reference-2, MRC-Holland b.v., Amsterdam, the Netherlands) with 20 different probes and 11 specific probes. MLPA is an effective approach to cover several gene sequences in single DNA samples with confirmed quantitative information compared to results achieved by array CGH. The specific probes were designed to study specific known CNVs and were distributed on different chromosomes as shown in Table 1; these were interesting targets based on the prior CGH array experiments performed on colon mucosal DNA from cancer patients. Control DNA from the array experiments was re-analyzed by MLPA, as were samples from our cancer patients (39 individual colon mucosa DNA samples from patients with colorectal cancer and 5 blood samples from Dukes A patients) and a cohort of blood donors (44 healthy northern European citizens).
The MLPA analysis was performed according to the protocol provided by the supplier with minor changes; denaturation of DNA was prolonged to 10 min. Briefly, 250-500 ng DNA in 5 μl TE was denatured at 98 °C and subsequently hybridized overnight (16 hours) with a mix of probes, each consisting of two parts that recognize adjacent target sequences. After hybridization, the probes were ligated with a thermostable ligase. PCR was performed on ligated products with two universal PCR primers, amplifying all probe pairs in one reaction. 2 μl of the PCR products were diluted with 9 μl Hi-Di Formamide and 0.3 μl size standard LIZ (Applied Biosystems, Foster City, CA, USA). The amplification products were separated by electrophoresis using an ABI 3730 Genetic Analyzer (Applied Biosystems). The MLPA results were analyzed by GeneMapper software (v.3.7, Applied Biosystems) and the values were normalized against a normal genome.
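The normalization step is not detailed in the text. As a minimal sketch of one commonly used MLPA scheme (not necessarily the procedure applied here), each target peak can be divided by the summed reference-probe signal of the same sample and then compared with the same quantity in a normal control, giving a dosage quotient near 1.0 for two copies. The peak heights, probe names and the 0.7/1.3 cut-offs below are illustrative assumptions, not values from this study.

```python
def dosage_quotients(sample, control, reference_probes):
    """Return per-probe dosage quotients of a sample relative to a normal control.

    sample, control: dict mapping probe name -> peak height (arbitrary units).
    reference_probes: set of probe names assumed to be copy-number neutral.
    """
    # Intra-sample normalization: scale by the summed reference-probe signal.
    sample_ref = sum(sample[p] for p in reference_probes)
    control_ref = sum(control[p] for p in reference_probes)
    quotients = {}
    for probe, peak in sample.items():
        if probe in reference_probes:
            continue
        # Inter-sample normalization: compare with the same probe in the control.
        quotients[probe] = (peak / sample_ref) / (control[probe] / control_ref)
    return quotients


def classify(quotient, loss_cutoff=0.7, gain_cutoff=1.3):
    """Label a dosage quotient; the cut-offs are assumed, commonly quoted values."""
    if quotient < loss_cutoff:
        return "loss"
    if quotient > gain_cutoff:
        return "gain"
    return "normal"


# Hypothetical peak heights for two reference probes and two target probes.
control = {"ref1": 1000.0, "ref2": 1200.0, "3q23": 800.0, "10q21.1": 900.0}
sample = {"ref1": 500.0, "ref2": 600.0, "3q23": 200.0, "10q21.1": 450.0}

dq = dosage_quotients(sample, control, {"ref1", "ref2"})
calls = {probe: classify(value) for probe, value in dq.items()}
```

In this toy input the sample was loaded at half the intensity of the control, which the reference-probe scaling removes; only the probe whose relative signal is halved is labeled as a loss.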

Data Analysis and Statistics
The standard DNA reference sample NA10851 was used in comparison to study specimens. In CGH analysis, a study specimen contained pooled DNA from 5 patients. The impact of possible reference-specific CNVs was reduced by hybridization of the NA10851 reference DNA to another purchased reference sample, which is a pool of DNA from normal colon mucosa derived from six cancer-free human donors (Biochain Institute Inc., Hayward, CA, USA). This procedure was used to attenuate false positive calls.
For CGH analyses, the ADM-2 algorithm was used to determine statistically significant sequence alterations for identification of potential CNV intervals. Data were centralized and calls with average log2 ratios < 0.3 were excluded from analysis [8]. We used an ADM-2 threshold of 3.5 to ensure accurate detection of reference-specific variations in the data set. All potential CNV calls from any of the 13 specimens analyzed, located at or flanking a reference-specific probe locus, were excluded from analysis. A potential CNV call was considered when at least two out of twelve specimens showed gain or loss that comprised a single or multiprobe DNA

Copy Number Variants in Colon Mucosa and Blood
Genome-wide analyses of sequence variations in all patient groups defined by Dukes tumor stages revealed a total of 774 potential CNV loci calls in DNA from colon mucosa in 60 patients when compared to a reference DNA (NA10851). 205 calls were identified in at least two specimens, where 90% (185) were present in 2-6 specimens and 10% (20) in 7-12 specimens. 104 (51%) CNVs were single probe alterations and 101 (49%) were multiprobe alterations. In total, 21 potential novel CNV calls, not present in the Database of Genomic Variants (DGV), were discovered in the dataset at the time of analysis (Jan 2014, Table 1). All novel CNVs were significantly present in 2-6 specimens corresponding to 10-30 patients, where 1 CNV (22q12.3) was present in more than 75% of the specimens (45 patients). In total, 76% (16/21) were single probe variations; 71% (15/21) of potential novel CNVs covered regions that contained at least one gene. Of the CNVs from array CGH present in mucosa DNA, 20% (40 of 205) were also present in buffy coat DNA from Dukes A patients.
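The recurrence criterion behind these counts (a call retained when seen in at least two of the twelve pooled mucosa specimens, then grouped into 2-6 versus 7-12 specimens) can be sketched as follows. This is an illustration under stated assumptions, not the CGH Analytics workflow: the locus names and the toy call lists are hypothetical, and each specimen is represented simply as a list of called loci.

```python
from collections import Counter


def recurrent_calls(calls_per_specimen, min_specimens=2):
    """Count in how many specimens each locus was called; keep recurrent loci."""
    counts = Counter()
    for calls in calls_per_specimen:
        counts.update(set(calls))  # count each locus at most once per specimen
    return {locus: n for locus, n in counts.items() if n >= min_specimens}


def recurrence_bin(n_specimens_with_call):
    """Group a recurrent call as in the text: 2-6 or 7-12 of 12 specimens."""
    return "2-6" if n_specimens_with_call <= 6 else "7-12"


# Hypothetical calls for 12 pooled specimens (5 patients per specimen).
specimens = [[] for _ in range(12)]
for i in range(10):
    specimens[i].append("locus_A")   # called in 10 specimens
for i in range(3):
    specimens[i].append("locus_B")   # called in 3 specimens
specimens[0].append("locus_C")       # called once only, so it is dropped

recurrent = recurrent_calls(specimens)
bins = {locus: recurrence_bin(n) for locus, n in recurrent.items()}
```

Because each specimen pools DNA from 5 patients, a call seen in n specimens corresponds to 5n patients in the pooled design, which is how 2-6 specimens map to 10-30 patients.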

Copy Number Variant and Tumor Progression
Identified CNVs were present in more than 50% of the mucosa specimens in either early (Dukes A and B, n = 6) or late (Dukes C and D, n = 6) tumors. No significant correlation between mucosa CNV and tumor stage/progression was confirmed in our data set. However, three mucosa CNVs discriminated between early (Dukes A and B) and late (Dukes C and D) stages. One multi probe variation at 15q21.3 was present in early Dukes stages (2 Dukes A and 1 Dukes B specimen), and one single probe variation at 2p15 was present in early Dukes stages (2 Dukes A and 1 Dukes B). One single probe variation, identified at 10q21.1, was present in late Dukes stages (2 Dukes C and 1 Dukes D, Table 2).

Confirmation of Copy Number Variants with MLPA
Selected mucosa CNV loci from array CGH were confirmed and compared to DNA from healthy blood donors with multiplex ligation-dependent probe amplification (MLPA) analyses on an individual patient basis. Of 11 mucosa loci that were significantly changed with array CGH, two (3q23 and 10q21.1) were significantly more frequent in colorectal cancer patients compared to buffy coat DNA from healthy blood donors (p < 0.001). Neither of these mucosa CNVs was detected in blood from Dukes A cancer patients (Table 3), which may imply that the 3q23 and 10q21.1 CNVs were acquired alterations and not inborn.
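The comparison described here rests on a Bonferroni-corrected Fisher's exact test, which was run in SPSS. The sketch below is a plain-Python stand-in for that test, intended only to show the calculation: a two-sided Fisher's exact test on a 2x2 table of carriers and non-carriers, followed by multiplication of the p-value by the number of loci tested (11). The carrier counts are hypothetical, since per-locus counts are not given in the text; only the group sizes (39 patients, 44 donors) are taken from it.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, total)


def bonferroni(p_value, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts: deletion carriers among 39 patients and 44 donors.
patients_with, patients_without = 15, 24
donors_with, donors_without = 1, 43

p_raw = fisher_exact_two_sided(patients_with, patients_without,
                               donors_with, donors_without)
p_adjusted = bonferroni(p_raw, n_tests=11)  # 11 loci were tested by MLPA
```

With 11 loci tested, a raw p-value must fall below roughly 0.05/11 ≈ 0.0045 to remain significant at the 0.05 level after correction.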

Discussion
Colorectal cancer is divided into hereditary (~20%) and sporadic (~80%) forms. Among patients with sporadic disease, some may carry familial risk genotypes [17], which is not equivalent to hereditary disease. Predictive genetic loci of increased cancer risk are largely unknown, but sibling studies have estimated that approximately 35% of all colorectal cancer may be attributed to genetic susceptibility [22], with recent studies suggesting even higher rates based on rare predisposing genetic variants [23]. In the present study, we have used high-resolution oligonucleotide microarrays in genome-wide scans to search for significant DNA aberrations in tumor-free colon mucosa from patients at primary operation for colorectal cancer, following observations in our previous report [1]. Our present and previous approach of pooling DNA before array CGH hybridization cancels out chance variations among individuals in studies with a limited number of patients. A limitation is that significant but less frequent alterations may not be confirmed. Thus, analysis of pooled DNA promotes specificity but attenuates the sensitivity of our investigation. Therefore, MLPA analyses were also performed on 11 mucosa CNV loci not earlier related to colorectal cancer, as selected from our array CGH analyses. Patient data were compared to CNVs in buffy coat DNA from 44 healthy blood donors to exclude that the 11 CNVs were common, frequent loci in Northern Europeans.
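The loss of sensitivity from pooling can be made concrete with a small calculation. The sketch below assumes an idealized array with no ratio compression, a diploid reference with two copies, carriers with a heterozygous one-copy loss, and equal DNA contribution from each of the 5 pooled patients; it also assumes that the 0.3 exclusion limit from the methods applies to the absolute log2 ratio. None of these assumptions is stated in this form in the text.

```python
from math import log2


def pooled_log2_ratio(n_pool, n_carriers, carrier_copies=1, normal_copies=2):
    """Expected log2 ratio of a pooled specimen versus a two-copy reference."""
    pooled_copies = ((n_pool - n_carriers) * normal_copies
                     + n_carriers * carrier_copies)
    return log2(pooled_copies / (n_pool * normal_copies))


# Expected signal for 0 to 5 deletion carriers in a pool of 5 patients.
expected = {k: pooled_log2_ratio(5, k) for k in range(6)}

# Smallest number of carriers whose pooled signal exceeds the 0.3 limit.
min_detectable = min(k for k, ratio in expected.items() if abs(ratio) > 0.3)
```

Under these assumptions a single carrier in a pool of 5 gives log2(9/10), about −0.15, which stays below the limit, whereas two carriers give log2(8/10), about −0.32. A heterozygous deletion would therefore need to be present in at least 2 of the 5 pooled patients to be called, which illustrates why rare alterations can be missed.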
Our data revealed 184 known and 21 novel CNVs present in pooled DNA from array CGH based analysis in comparisons with the 109,863 CNV loci present in the Database of Genomic Variants (DGV). None of the 21 novel CNVs was discovered in all mucosa specimens, but one CNV (22q12.3) was found in more than 75% of the mucosa specimens, indicating that identified potential CNVs are either common loci in DNA from Northern Europeans or sequences related to risk for colorectal cancer. Most of the variants were detected by one single probe. Out of the 21 novel CNV intervals identified in our data set, 15 (71%) contained at least one gene. With few exceptions, these genes did not code for any known tumor-related gene product, which is in line with results from the majority of the GWAS performed to date [17]. In array CGH analyses we observed a number of potential copy number variations (CNVs) in DNA from patients with Dukes A-D tumors and correspondingly different expected survival. CNV regions in mucosa identified by array CGH were also identified in DNA from buffy coat in Dukes A patients. None of the patient-specific probe alterations in mucosa were found in their corresponding buffy coat DNA according to MLPA analyses. Thus, discrepant alterations in array CGH analyses based on pooled DNA and MLPA analyses on individual patient samples may imply that the 3q23 and 10q21.1 alterations had been acquired, perhaps by toxic influence on colon stem cells. The 10q21.1 region does not contain any genes as far as is presently known, but may contain other sequences for regulation of gene function or cell division.
From array CGH analyses we found three potential CNV regions (10q21.1, 15q21.3, 2p15) that may discriminate early (Dukes A and B) from late (Dukes C and D) tumor stage, present in 25%-30% of all analyzed mucosa specimens. One of these CNVs, located at 10q21.1, was only found in early tumor stages based on MLPA analyses, at variance with other observations [8]. The remaining two CNVs (15q21.3, 2p15) were identified in early tumor stages and were also found in DNA in buffy coat from the patients, indicating inborn variations. The implication of this information could be that tumor stages in colorectal cancer may not follow a strict order; they may rather represent a direct end-stage alteration following transformation. Thus, our present results agree with observations in our previous study, where we used tiling BAC arrays to monitor DNA aberrations in tumor and colon mucosa tissue from patients with poor and excellent survival following primary colon cancer operations [1].
In recent years, several genome-wide association studies (GWAS) have presented a number of potential and valid risk sites for each of four prevalent cancer types: breast, prostate, colorectal and lung cancer [17]. In colorectal cancer, 10 risk SNP loci were reported at chromosomal bands 8q24, 18q21 [25], 8q23.3, 10p14 [7], 11q23 [5] [6], 15q13.3 [7], 14q22.2, 16q22.1, 19q13.1 and 20p12.3 [5]. In these studies, large cohorts of colorectal cancer patients and cancer-free individuals were screened for novel and confirmed loci, and presentation of susceptibility loci should be a step toward blood-based screening for colorectal cancer risk [5] [16] [17]. Odds ratios generated in the studies ranged from 1.1 to 1.5, which was low even when several DNA loci were combined [16] [17] [26]. Hence, a number of additional biomarkers should be included in proposed panels of identified risk loci to allow for predictive use next to colonoscopy examination and detection markers for hereditary colon cancer variants such as HNPCC and FAP. It is therefore desirable to determine predictive biomarkers, including nonpolymorphic, structurally related biomarkers, in addition to SNPs. In situ synthesized oligonucleotide microarray technology is regarded as an appropriate tool for such purposes and has recently facilitated cost-efficient and high-resolution CNV screening [8]. Moreover, oligonucleotide arrays may provide genome-wide sequence coverage, which is not obtained by SNP arrays.

Conclusion
In conclusion, oligonucleotide-based array CGH appears to be a sensitive tool for screening and identification of CNVs related to tumor development, complementary to the use of SNP arrays. Two CNVs at 3q23 and 10q21.1 were identified and statistically validated on an individual basis as indicating significant risk for colorectal cancer. Our findings need further validation in large patient cohort studies.

Table 1.
CNV sites detected by array CGH analyses in tumor-free colon mucosa that were novel at the time of analysis according to the Database of Genomic Variants (13 January 2014, build 36, genome build 18).
aberration according to defined confidence limits. Each call was verified by comparison of the altered locus to all known variations in the Database of Genomic Variants (DGV, Build 36 (Mar. 2006), last update 3 Jan 2014; http://projects.tcag.ca/variation/) [21]. CNV calls were further analyzed when present in more than 50% (6/12) or 75% (9/12) of the DNA specimens. Potential CNVs were subdivided into two categories: single and multi probe variations. Multi probe variations are made up of >2 contiguous probes. A CNV was considered novel if the locus was not reported in the Database of Genomic Variants. Bonferroni-corrected Fisher's exact test in SPSS was used for significance (p < 0.05).

Table 2.
Eight CNV sites detected by array CGH analyses in tumor-free colon mucosa indicative of increased risk to develop colorectal cancer and 3 CNVs related to tumor stage and corresponding prognosis.
*CNVs were regarded as potential if present in more than 50% of the specimens and unrecognized in colorectal cancer at that time. a Colocalized with CNV sites detected by de Smith et al. [8]; b Confirmed in pooled DNA from blood cells (Dukes A); c Hung et al., Nature 2008 [24]; d Novel CNV; CNVs were regarded as novel if not present in the Database of Genomic Variants (last updated 3 Jan 2014, Build 36 (Mar. 2006); http://projects.tcag.ca/variation/).

Table 3.
Confirmation of array CGH data with MLPA analyses based on tumor DNA from cancer patients versus DNA in buffy coat cells from healthy blood donors. The results display two CNV loci that may be statistically related to colorectal cancer: two deletions at 3q23 and 10q21.1 (p < 0.0001 with Bonferroni-corrected Fisher's exact test versus healthy donors' DNA). An array CGH sample contained pooled DNA from mucosa tissue from 5 patients.