Clinical Applications of Molecular Profiling in Colorectal Cancer

Despite developments in diagnostic and management strategies, a considerable number of colorectal cancer (CRC) patients present with disease recurrence after curative surgery. Moreover, there are no reliable indicators to determine the prognosis and response of CRC patients to therapy. By harnessing recent technological advances in molecular profiling techniques, it is anticipated that greater insight into the various combinations of genetic events or alternative pathways underlying carcinogenesis will be gained. By carrying out a literature search, we were able to identify a comprehensive list of genes with high differential expression patterns in colorectal cancer that could serve as molecular markers to complement existing histopathological factors in diagnosis, follow-up and therapeutic strategies for individualized care of patients.


Acknowledgements
First and foremost I offer my sincerest gratitude to Professor Michael Kerin, who has supported me throughout, with his patience, knowledge and crucial contribution, which made him a backbone of this research and so to this thesis.
Professor Kerin's enthusiasm for the practice of surgery, patient care and translational research has been inspirational and his energy and insight have been highly motivational.

Colorectal carcinoma (CRC) is one of the most common types of cancer worldwide, with increasing incidence especially in developed countries [1].
Despite several advances in diagnosis and treatment, this disease remains a threat to life for a large number of people and approximately 20% of patients present with metastatic disease, and 30% of colorectal cancers recur [2]. In general, colorectal carcinoma is classified into three categories, based on increasing hereditary influence and cancer risk [3]. Sporadic CRC accounts for approximately 60% of patients and comprises patients with no notable family history and, by definition, with no identifiable inherited gene mutation that accelerates cancer development. Familial CRC accounts for approximately 30% of cases and refers to patients who have at least one blood relative with CRC or an adenoma, but with no specific germline mutation or clear pattern of inheritance.
True hereditary CRC syndromes, accounting for approximately 10% of cases, originate from inheritance of single gene mutations in highly penetrant cancer susceptibility genes. Although the latter group of cancers occurs with the lowest frequency, due to the clear patterns of inheritance and identification of key pathogenic genes, it has helped to elucidate the molecular mechanisms of carcinogenesis applicable to sporadic CRC.

Pathology
From initial diagnosis through to definitive treatment, pathological evaluation plays a central role in the care of patients with colorectal cancer. Pathological stage of disease is widely recognised as the most accurate predictor of survival and is used to determine the appropriate treatment. Many other pathological factors have been shown to have prognostic significance that is independent of stage, and they may help to further sub-stratify tumours.

Histological types:
For consistency and uniformity in pathological reporting, the histological classification of CRC proposed by the World Health Organisation (WHO) [4] is internationally accepted (Table 1):

Adenocarcinoma:
Adenocarcinoma is the most common tumour type (95%). Most are moderately differentiated and lack specific histological features, although colorectal tumours tend to show cribriform patterns with central necrosis, a feature that is useful if a metastatic tumour is encountered when no colorectal primary has been diagnosed.
Dysplasia in adjacent mucosa may be seen, but frequently the invasive tumour obliterates any pre-existing polyp from which it may have arisen.

Mucinous adenocarcinoma:
This is a subtype of adenocarcinoma that secretes extracellular mucin. At least 50% of the tumour must be mucinous in order to make this diagnosis. Mucinous adenocarcinomas are associated with microsatellite instability. Mucinous change may also be seen in ordinary adenocarcinomas treated with neoadjuvant chemoradiotherapy. Whether or not mucinous adenocarcinomas have a better prognosis is uncertain [5].

Medullary carcinoma
This is an important subtype of colorectal cancer, added to the World Health Organisation classification in 2000. It has a characteristic phenotype: right-sided tumours with sheets of cells and numerous tumour-infiltrating lymphocytes on microscopy [6]. This phenotype is associated with the Lynch cancer family syndrome (hereditary non-polyposis colorectal cancer). These colorectal tumours show a loss in expression of DNA mismatch repair proteins such as MLH1 (60% of cases) or MSH2 (30%).

Other tumours
Two specific tumours with a poor prognosis are signet ring cell and small cell carcinoma. Signet ring cell carcinoma is composed of at least 50% cells with intracytoplasmic mucin, resembling gastric signet ring cell tumours. Small cell carcinoma is a poorly differentiated neuroendocrine carcinoma. Occasionally, tumours from other sites involve the colorectum, and the pathologist should be aware, in particular, of direct extension into the rectum of prostate and bladder tumours. In most cases, morphology will distinguish these tumours from a primary colorectal neoplasm, although in some cases immunohistochemical stains may be necessary to identify a tumour of non-colorectal origin.

Tumour grade:
Since the Broders [7] and Dukes [8,9] classification schemes were reported, the representative criterion of tumour grade employed for colorectal cancer has been the degree of tumour differentiation, as gauged primarily by architectural features.
In the TNM classification, tumour grade is defined by tumour differentiation: grade 1 is well differentiated, grade 2 moderately differentiated, grade 3 poorly differentiated, and grade 4 undifferentiated. In the World Health Organization (WHO) classification, tumour grade is assessed based on the least differentiated component, with both well- and moderately differentiated adenocarcinomas being considered low-grade, and both poorly differentiated adenocarcinomas and undifferentiated carcinomas high-grade.
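The relationship between the two grading conventions described above can be illustrated with a minimal sketch. The names `TNMGrade` and `who_grade` are hypothetical and introduced only for illustration; the sketch assumes that each histological component of a tumour has already been assigned a TNM differentiation grade, and it applies the WHO rule of grading on the least differentiated component.

```python
from enum import IntEnum


class TNMGrade(IntEnum):
    """TNM differentiation grades as listed in the text (hypothetical names)."""
    WELL = 1
    MODERATE = 2
    POOR = 3
    UNDIFFERENTIATED = 4


def who_grade(components):
    """Return the WHO two-tier grade for a tumour.

    `components` is an iterable of TNM grades observed in the tumour.
    The WHO grade is taken from the least differentiated component,
    i.e. the highest numerical TNM grade.
    """
    components = list(components)
    if not components:
        raise ValueError("at least one graded component is required")
    least_differentiated = max(components)
    if least_differentiated <= TNMGrade.MODERATE:
        return "low-grade"
    return "high-grade"


if __name__ == "__main__":
    # A tumour that is mostly well differentiated but contains a poorly
    # differentiated focus is graded on that focus.
    print(who_grade([TNMGrade.WELL, TNMGrade.POOR]))
```

The sketch deliberately ignores how large the least differentiated component must be before it counts, since the text notes that this extent has not been standardised.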
Poorly differentiated tumours can be identified by the absence of tubular formation, and poor differentiation is an independent prognostic factor as it increases the risk of lymphatic spread from early stage tumours [10]. Although the relationship between histological grading based on tumour differentiation and disease prognosis has been well recognized [11,12], the existing grading systems have been criticized regarding the difficulty of making objective judgments. There are two causes of this problem. First, it is difficult to clarify the distinctions among individual categories because tumour differentiation is a continuous parameter and an apparent break does not exist, especially between well-differentiated and moderately differentiated adenocarcinoma. Second, the extent of the component that examiners judge to be the least differentiated component has not been standardized.

Tumour staging:
The colon and rectum are unique among organs in that invasion of the lamina propria (that is the part of the mucosa surrounding the colorectal crypts) is considered to be in situ disease [13]. Thus, invasion of the submucosa is required to make a diagnosis of invasive carcinoma. The rationale for this is that because the colorectal lamina propria lacks lymphatics, tumours that are limited to the lamina propria have no means by which to spread. This is supported by evidence of a lack of malignant potential for such tumours [14].
Accurate and consistent pathological staging of colorectal cancer is vital to correct management. The central factor in T staging is the extent of invasion of the tumour through the bowel wall and it is still the most accurate predictor of prognosis in colorectal cancer patients. Table 1 summarises the two staging systems currently in use in Ireland and indicates the relationship between them.
Historically, the Dukes' system has been valuable in clearly identifying patients who would benefit from postoperative chemotherapy (Dukes' C). It has always been apparent that the Dukes' B category is heterogeneous and includes patients who would also benefit from chemotherapy, especially with the advent of less toxic drug regimens. For these patients, the TNM staging system has advantages, as it identifies pT4 cases with a higher risk of local recurrence.

Stage III-A: T1-2, N1, M0. N1: metastasis to 1 to 3 regional lymph nodes; T1 or T2.
Stage III-C: any T, N2, M0. N2: metastasis to 4 or more regional lymph nodes; any T.
Stage IV: any T, any N, M1. M1: distant metastases present; any T, any N.
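The stage rows listed above can be expressed as a short lookup, shown here as a sketch only. The function names `n_category` and `stage_group` are hypothetical. The sketch covers just the three stage groups that appear in the text (III-A, III-C and IV) and returns `None` for any other combination; treating zero positive nodes as N0 is an assumption based on standard TNM usage rather than on the rows above.

```python
from typing import Optional


def n_category(positive_nodes: int) -> str:
    """Map a count of involved regional lymph nodes to an N category.

    N1 (1 to 3 nodes) and N2 (4 or more) follow the text; N0 for zero
    nodes is an assumption taken from standard TNM usage.
    """
    if positive_nodes < 0:
        raise ValueError("node count cannot be negative")
    if positive_nodes == 0:
        return "N0"
    if positive_nodes <= 3:
        return "N1"
    return "N2"


def stage_group(t: int, n: str, distant_metastasis: bool) -> Optional[str]:
    """Return the stage group for the rows given in the text, else None."""
    if distant_metastasis:
        return "IV"      # any T, any N, M1
    if n == "N2":
        return "III-C"   # any T, N2, M0
    if n == "N1" and t in (1, 2):
        return "III-A"   # T1-2, N1, M0
    return None          # combination not covered by the rows above


if __name__ == "__main__":
    # T2 tumour with two involved nodes and no distant metastasis.
    print(stage_group(2, n_category(2), distant_metastasis=False))
```

Returning `None` rather than guessing keeps the sketch honest about the stage groups that are not reproduced in this section.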

Tumour Markers:
Carcinoembryonic antigen (CEA) is a glycoprotein involved in cell adhesion. It is normally produced during fetal development, but the production of CEA stops before birth. Therefore, it is not usually present in the blood of healthy adults. Serum from individuals with colorectal, gastric, pancreatic, lung and breast carcinoma, as well as from individuals with medullary thyroid carcinoma, has been found to contain higher levels of CEA than serum from healthy individuals. CEA levels may also be raised in some non-neoplastic conditions such as ulcerative colitis, pancreatitis, cirrhosis, COPD and Crohn's disease, as well as in smokers. CEA measurement is mainly used as a tumour marker to identify recurrences after surgical resection, or to localize cancer spread through assay of biological fluids. The CEA blood test is not reliable for diagnosing cancer or as a screening test for early detection of cancer, as most types of cancer do not produce a high CEA. Elevated CEA levels should return to normal after successful surgical resection or within 6 weeks of starting treatment if cancer treatment is successful. CEA and related genes make up the CEA family belonging to the immunoglobulin superfamily. In humans, the carcinoembryonic antigen family consists of 29 genes, 18 of which are normally expressed [15].
bowel wall. In the 1970s, areas of failure found at re-operation following an initial curative resection for CRC were investigated, and the results showed that survival and disease recurrence rates are significantly related to the degree of bowel wall penetration and the extent of nodal disease [18]. This early work paved the way for the identification of those patients with high-risk disease.
Before the adoption of total mesorectal excision (TME) [19,20], surgery alone for transmural or node-positive rectal cancer was associated with local recurrence rates of up to 50% [21,22]. This provided the rationale for exploration of management plans to improve outcomes following resection. The first trial was conducted by the Gastrointestinal Tumour Study Group, which randomised patients to surgery alone vs. chemotherapy vs. pelvic radiation vs. chemoradiation [23]. The arm that combined chemotherapy and radiotherapy showed a significant improvement in local control and survival [24]. Following this, investigators at the Mayo Clinic/North Central Cancer Treatment Group compared postoperative radiotherapy alone with postoperative chemoradiation and found a significant reduction in local recurrence and cancer-related deaths in the chemoradiation group. Both trials set new standards for the postoperative management of high-risk rectal cancer. Once this new standard of care was established, ongoing studies sought to determine the best regimen [25][26][27].
While optimising the treatment regimen and rationale for postoperative adjuvant therapy, researchers were also questioning whether preoperative therapy would be even more beneficial. Many reasons were proposed to explain why treatment in the preoperative setting would be more efficacious [28]. The advantages of neoadjuvant therapy utilizing radiation are thought to be due to improved responsiveness of tissue not rendered hypoxic by previous surgery. Theoretically, ionizing radiation is more effective in virgin tissue because of the increased oxygen tension in this tissue. Accordingly, preoperative radiation and chemotherapy are more effective in producing tumour necrosis in the undisturbed pre-surgical tumour bed and in cancer cells of the tumour periphery than in the hypoxic post-surgical bed. Several other advantages of neoadjuvant therapy include less radiation-induced small bowel injury in the pelvis, which has not been repaired by previous surgery, and the ability to excise the irradiated rectal segment and perform an anastomosis to healthy, non-irradiated colon, resulting in improved postoperative function [29]. In addition, studies have shown that chemoradiation therapy in the preoperative setting results in fewer acute grade 3 and 4 toxic side effects and fewer long-term toxic effects than when it is given postoperatively [30]. Not surprisingly, patient compliance with chemotherapy regimens is lower in the postoperative period than in the preoperative period [30,31]. Taken together, the amalgamation of these modern regimens, including improved imaging, better chemotherapy, and more accurate and focused radiation, has resulted in an increased frequency of tumour down-staging, a higher likelihood of complete clinical and pathologic responses, and decreased local recurrence rates in stage II and III rectal cancer [30].
In addition, the utilization of neoadjuvant therapy in the management of stage IV disease has shown potential for prolonged survival.
In the 1990s, several institutions began evaluating the integration of a preoperative radiotherapy approach. In Europe, investigators focused on the delivery of a short course of higher-dose radiation therapy alone followed in 1 week by resection.
The Swedish Rectal trial reported an improvement in survival with such an approach in 1997 [32]. This study randomised patients with resectable cancer to surgery alone or to surgery following a 1-week course of pelvic radiotherapy delivering 25 Gy in 5 daily fractions, and the results showed that both local recurrence and 5-year survival were significantly improved. Moreover, the Dutch TME trial reported in 2001 [33] showed a higher local recurrence rate with TME alone without preoperative radiotherapy. In both the Swedish and Dutch trials the interval from the end of pelvic radiation to surgical intervention was 1 week. Lyon R90-01 [34] studied the influence of this interval on down-staging and sphincter preservation, and the results demonstrated that a longer interval between completion of radiotherapy and surgery was associated with increased tumour down-staging (26% vs. 10.3%, p = 0.005) and clinical tumour response (71.7% vs. 53.1%, p = 0.007). However, no significant differences were identified regarding morbidity, local recurrence or sphincter preservation.
Data from the Memorial Sloan-Kettering Cancer Center and the MD Anderson Cancer Center supported the benefits of combining a total dose of 50.4 Gy of pelvic radiotherapy fractionated over 5.5 weeks with concurrent chemotherapy [35]. Results from these series indicated an improvement in sphincter preservation rates. In addition, patients with low-lying T2 lesions who would otherwise be offered abdomino-perineal resection (APR) were shown to benefit from such therapy [28]. Moreover, the German Rectal Cancer Group [36] confirmed the efficacy of a preoperative combined modality approach over the traditional strategy of providing subsequent postoperative adjuvant therapy.
Additional trials have now shown that longer-course preoperative CRT significantly improves local control, tumour down-staging and down-sizing compared to radiotherapy alone [31,37,38,39].

Quantification of tumour response to neoadjuvant therapy:
Assessment of response after preoperative CRT is essential in detecting patients who obtained a complete pathological response and can therefore be considered for a less aggressive surgical approach.
-Pathological assessment of tumour response: Pathological complete response rates of 10-23% following neoadjuvant therapy have been reported. Although conflicting data exist, this suggests that good outcomes can be expected for patients with pathologic complete or near-complete response. Neoadjuvant CRT leads to characteristic histopathological changes in colorectal cancer. Grossly the tumour may be difficult to see, with in some cases no gross tumour visible in the mucosa. An area of scarring may be present in the bowel wall, or in surrounding fat, indicating treated tumour. To ensure consistency in reporting a complete response, in accordance with the protocol used in the CORE trial [40], the pathologist should extensively sample any areas of fibrosis seen in order to find any residual tumour.
Microscopically, these tumours display variable reduction in the volume of malignant cells, and an increase in the amount of stroma. The tumour cells may show phenotypic changes, such as mucinous metaplasia; the stroma may show fibrosis, atypical fibroblasts or calcification. The degree of fibrosis correlates with outcome: it has recently been shown to be prognostic in R0 cases [41] and can be quantified with the tumour regression grade. Simplifications of the original five grades have been proposed because of the inter-observer variation seen when using five categories [42,43]. Table 1.3 describes the two most commonly used tumour regression grading systems.

Table 1.3: Tumour regression grades
Mandard tumour regression grade [44,45]    Wheeler rectal cancer regression grade [42]

This is where the most accurate T and N stage before and after treatment, determined clinically (e.g. by magnetic resonance imaging (MRI) or trans-rectal ultrasound (TRUS)), is compared with the pathological T- and N-stage in the resected specimen [46,47]. This is a commonly used means of assessing response, but the accuracy of this technique may be flawed by limitations in these imaging modalities.
Several studies have examined the accuracy of different imaging techniques in assessing rectal cancer response and lymph node involvement after preoperative CRT. The overall accuracy of endorectal ultrasound (EUS) ranges from 62-92% for initial assessment of T-stage, compared to 66-88% for initial assessment of N-stage. Following CRT, however, it accurately identifies only 10 of 16 (63%) patients with pathological complete response [48]. Moreover, EUS is far more likely to accurately stage non-responders than good responders (82% vs. 29%) [49]. The limitations of EUS following preoperative CRT are probably attributable, in part, to its inability to differentiate between tumour and radiation-induced inflammation [48]. Other imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI) play a role in initial staging of rectal cancer patients. MRI performed with an endorectal coil seems to be the most useful technique, with sensitivity and specificity similar to that of EUS in assessing wall penetration, and comparatively greater accuracy in assessing nodal involvement [50]. The accuracy of MRI declines in terms of response quantification following CRT, mostly due to overstaging [51,52]. CRT. FDG-PET achieved a sensitivity of 100% and specificity ranging from 60-86% in predicting histopathological tumour response.
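The accuracy figures quoted above are proportions derived from counts of correctly and incorrectly classified patients. As a minimal sketch, the standard definitions can be written as follows; the function names are hypothetical, and the only worked figure taken from the text is the EUS result of 10 of 16 patients with pathological complete response correctly identified, which implies 6 missed.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of patients with the condition who are correctly identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of patients without the condition who are correctly excluded."""
    return true_neg / (true_neg + false_pos)


def accuracy(true_pos: int, true_neg: int, false_pos: int, false_neg: int) -> float:
    """Proportion of all patients who are classified correctly."""
    total = true_pos + true_neg + false_pos + false_neg
    return (true_pos + true_neg) / total


if __name__ == "__main__":
    # EUS after CRT: 10 of 16 complete responders identified, 6 missed.
    detected = sensitivity(true_pos=10, false_neg=6)
    print(f"{detected:.1%}")  # 62.5%, reported as 63% in the text
```

Specificity and overall accuracy cannot be recomputed from this section, because the corresponding counts for the other studies are not given here.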

Pathological and molecular predictors of response to neoadjuvant CRT:
It is not known why such large differences in rectal cancer response to neoadjuvant CRT occur between patients. In order to elucidate factors that may allow for response prediction, existing research has focussed primarily on histological and molecular assessment of pre-treatment tumour biopsy specimens.

Clinical and histological indices:
Thus far, this has not been systematically analysed in any single study. It has been indirectly addressed by multivariate analysis in four studies assessing molecular response predictors. These studies all concluded that pre-treatment T stage, N stage, grade, differentiation, age and gender could not predict histological response to CRT [46,57]. Whilst conventional factors may have no influence over tumour radiosensitivity, they may, however, influence rates of local recurrence.
Myerson et al. identified that tumour location <5 cm from the anal verge, circumferential lesions, obstruction and tethered/fixed tumours were all independent risk factors for local recurrence [58].

Cancer genetics
Oncogene and tumour-suppressor gene mutations all operate similarly at the physiological level: they drive the neoplastic process by increasing tumour cell number through the stimulation of cell birth or the inhibition of cell death or cell-cycle arrest. The increase can be caused by activating genes that drive the cell cycle, by inhibiting normal apoptotic processes or by facilitating the provision of nutrients through enhanced angiogenesis. A third class of cancer genes, called stability genes, promotes tumourigenesis in a completely different way when mutated. This class includes the mismatch repair (MMR), nucleotide-excision repair (NER) and base-excision repair (BER) genes responsible for repairing subtle mistakes made during normal DNA replication or induced by exposure to mutagens. Other stability genes control processes involving large portions of chromosomes, such as those responsible for mitotic recombination and chromosomal segregation (e.g., BRCA1, BLM and ATM). Stability genes keep genetic alterations to a minimum, and thus when they are inactivated, mutations in other genes occur at a higher rate [70]. All genes are potentially affected by the resultant increased rate of mutation, but only mutations in oncogenes and tumour-suppressor genes affect net cell growth and can thereby confer a selective growth advantage to the mutant cell. As with tumour-suppressor genes, both alleles of stability genes generally must be inactivated for a physiologic effect to result.
Mutations in these three classes of genes can occur in the germline, resulting in hereditary predispositions to cancer, or in single somatic cells, resulting in sporadic tumours. It is important to point out that a mutation is defined as any change in the sequence of the genome. These changes include those affecting single base pairs as well as those creating large or small deletions or insertions, amplifications or translocations. In the germline, the most common mutations are subtle (point mutations or small deletions or insertions), whereas all types of mutation can be found in tumour cells. In fact, cancers represent one of the few disease types in which somatic mutations occurring after birth are pathogenic. The first somatic mutation in an oncogene or tumour-suppressor gene that causes a clonal expansion initiates the neoplastic process [71]. Subsequent somatic mutations result in additional rounds of clonal expansion and thus in tumour progression [72]. Germline mutations of these genes cause cancer predisposition, not cancer per se. Such individuals therefore often develop multiple tumours that occur at an earlier age than in individuals whose cancer-gene mutations have all occurred somatically [73].

Genetic and epigenetic alterations in CRC
Colorectal cancer results from the progressive accumulation of genetic and epigenetic alterations that lead to the transformation of normal colonic epithelium to colon adenocarcinoma. From the analysis of the molecular genesis of colon cancer, four central tenets concerning the pathogenesis of cancer have been established. The first is that the genetic and epigenetic alterations that underlie colon cancer formation promote the cancer formation process because they provide a clonal growth advantage to the cells that acquire them. The second tenet is that cancer emerges via a multi-step progression at both the molecular and the morphologic levels [74]. The third is that loss of genomic stability is a key molecular step in cancer formation [75]. The fourth is that hereditary cancer syndromes frequently correspond to germ line forms of key genetic defects whose somatic occurrences drive the emergence of sporadic colon cancers [76].

Genetic alterations:
Much progress has been made in understanding the molecular mechanism of CRC

APC:
The adenomatous polyposis coli (APC) gene encodes a protein that possesses multiple functional domains that mediate oligomerization as well as binding to a variety of intracellular proteins including β-catenin, γ-catenin, glycogen synthase kinase (GSK)-3β, axin, tubulin, EB1, and hDLG [76]. Germline mutations in APC result in FAP or one of its variants: Gardner's syndrome, attenuated FAP, Turcot's syndrome, or the flat adenoma syndrome [77,78]. In addition, studies have shown that APC is mutated in up to 70% of all sporadic colon adenocarcinomas, a high mutation frequency that is unique to colorectal cancers [79,80]. These mutations are present from the earliest stages of colon-cancer formation and precede the other alterations observed during colon-cancer formation [81,82].

P53:
Tumour protein-53 (p53) was initially identified as a protein forming a stable complex with the SV40 large T antigen and was originally suspected to be an oncogene [87]. Subsequent studies demonstrated that P53 is a transcription factor with tumour suppressor activity, is located at chromosome 17p13.1, and is mutated in 50% of primary human tumours, including tumours of the gastrointestinal tract [88]. P53 is currently believed to be a transcription factor that is involved in maintaining genomic stability through the control of cell cycle progression and apoptosis in response to genotoxic stress [88]. In colon cancers, P53 mutations have not been observed in colon adenomas, but rather appear to be late events in the colon adenoma-carcinoma sequence that may mediate the transition from adenoma to carcinoma [82]. Furthermore, mutation of P53 coupled with loss of heterozygosity (LOH) of the wild-type allele was found to coincide with the appearance of carcinoma in an adenoma, thus providing further evidence of its role in the transition to malignancy [89,90]. The function of P53 to recognize DNA damage and induce cell cycle arrest and DNA repair or apoptosis has led to P53 being called the "guardian of the genome" [91]. Thus, P53 normally acts as a tumour suppressor gene by inducing genes that can cause cell cycle arrest or apoptosis and also by inhibiting angiogenesis through the induction of TSP1 [92]. Mutant P53 can block these functions by forming oligomers with wild-type TP53, thereby causing diminished DNA-binding specificity [93].

DCC:
Since it was first discovered in a colorectal cancer study in 1990 [94], DCC (Deleted in colorectal cancer) has been the focus of a significant amount of research. DCC held a controversial place as a tumour suppressor gene for many years, and is well known as an axon guidance receptor that responds to netrin-1 [95]. More recently DCC has been characterized as a dependence receptor, and theories have been put forward that have revived interest in DCC's candidacy as a tumour suppressor gene, as it may be a ligand-dependent suppressor that is frequently epigenetically silenced. One of the most frequent genetic abnormalities that occur in advanced colorectal cancer is loss of heterozygosity (LOH) of DCC in region 18q21. DCC elimination is not believed to be a key genetic change in tumour formation, but one of many alterations that can promote existing tumour growth.

Epigenetic alterations:
The finding of aberrant hMLH1 promoter methylation in sporadic MSI colon cancers dramatically illustrated the role of epigenetic changes as potential pathogenetic alterations in cancer [96][97][98][99]. The term DNA methylation refers to the methylation of cytosine residues (5-methylcytosine) at CpG sites found throughout the genome [100]. These epigenetic alterations are characteristically clustered in so-called CpG islands in gene promoter regions, and hypo- and hypermethylation of these regions are related to activation and inhibition of transcription, respectively. This type of gene regulation is essential to cell differentiation as well as embryological development [101]. Furthermore, DNA methylation is closely related to the mechanism by which one copy of a gene is preferentially silenced according to parental origin, generally referred to as genomic imprinting [102]. Aberrant methylation of the cancer genome, and associated silencing of the genes whose promoters demonstrate such methylation, has been well described at multiple genetic loci [103,104].
Reversion of the methylation using demethylating agents such as 5-deoxyazacytidine frequently restores expression of these genes, demonstrating that methylation does in fact induce gene silencing.

Hereditary CRC syndromes (10%) result from germline inheritance of mutations in highly penetrant cancer susceptibility genes. Although this last group is observed with the lowest frequency, it has been instrumental in the elucidation of molecular mechanisms of carcinogenesis applicable to sporadic CRC.

Sporadic CRC:
Sporadic colorectal cancers arise at a median age of 70-75 years. Seventy percent arise in the left side of the colon and there are differences in the age, sex and regional distribution of both adenomas and carcinomas between the two sides of the large bowel. Sporadic cancers are caused by the development of a series of genetic abnormalities in tumour suppressor genes and oncogenes that give cells an evolutionary advantage over their neighbours.

Hereditary and familial CRC syndromes:
Hereditary non-polyposis colorectal cancer (Lynch Syndrome): Hereditary non-polyposis colorectal cancer, also referred to as the Lynch  [117]. Papillary carcinoma of the thyroid and its rare cribriform-morular variant may be associated with FAP, and this could lead to detection of unsuspected FAP [118,119]. The risk of hepatoblastoma in children of patients with FAP is highly increased and new germline mutations can be identified in 10% of cases [120].
All of these syndromes are characterized by hamartomatous polyps and most of them are associated with increased risk of development of gastrointestinal and extraintestinal carcinomas [124].

-Peutz-Jeghers Syndrome (PJS):
Peutz-Jeghers syndrome (PJS) is characterized by mucocutaneous pigmentation and GI hamartomas, which occur anywhere from stomach to anus. It was first described by Peutz in 1921 [125] and Jeghers in 1944 [126]. It is inherited in an autosomal dominant fashion with no sex predilection [127]. A prototypic PJS polyp is a hamartoma of the muscularis mucosae. The core of the polyp therefore consists of smooth muscle covered by lamina propria and mature glandular epithelium [128,129], which gives the polyp its characteristic arborising smooth muscle core. Germline mutations in the serine/threonine kinase gene (STK11/LKB1) on chromosome 19p13.3 cause Peutz-Jeghers syndrome in about half of the affected families. Additional loss of the wild-type allele in hamartomas and adenocarcinomas suggests that STK11/LKB1 is a tumour suppressor gene [130].

-Juvenile polyposis coli (JP):
Juvenile polyposis (JP) coli is inherited in an autosomal dominant fashion in at least 30% of patients. Patients develop numerous hamartomatous colorectal polyps, which are characterized by dilated crypts [131]. The number of polyps is smaller than in FAP and the disease course is less malignant [132]. The diagnosis of juvenile polyposis syndrome is made when multiple (3-10) juvenile polyps are found in the gastrointestinal tract, although there is still some variation in the criteria used for diagnosis. Mutations of the SMAD4/MADH4 gene were described first and explain about 30% of cases [133]. Mutations in BMPR1A can also lead to juvenile polyposis in an additional 30% of patients [134,135].  [138].
Genes that possess such microsatellite-like repeats in their coding regions appear to be the targets relevant to carcinogenesis. Indeed, many genes that possess microsatellite repeats are frequently found to be mutated in MSI colon cancers.
The relationship between the microsatellite mutator pathway and other genetic alterations frequently found in colon cancer is only partially understood.
Alteration of the Wnt/Wingless pathway can be observed in tumours irrespective of MSI [139]. Mutations in APC and CTNNB1 can be found in 21% and 43% of MSI tumours, respectively [140,141]. In addition, the incidence of K-RAS mutations appears to be as high as 22-31%, which is similar to the incidence observed in microsatellite stable (MSS) colon cancers [142]. Mutations in P53, however, appear to be less frequent in MSI cancers than in MSS cancers. The mutation incidence in MSI colon cancers has been reported to range from 0 to 40%, whereas the incidence in MSS tumours is between 31 and 67% [140,142,143].
The MSI tumour formation process has been termed the microsatellite mutator phenotype and is a pathway to tumour formation that is distinct from that seen in colon cancers that are microsatellite stable [144][145][146].  [150][151][152][153]. The early recognition of Lynch syndrome is essential to identify patients at high risk who will require intensive surveillance.
Nevertheless, its diagnosis can be difficult to make because of incomplete family history information and the lack of a characteristic clinical phenotype. Although the Amsterdam criteria and Bethesda guidelines continue to be used widely, several studies have underscored the limitations of their accuracy in predicting the presence of MMR gene mutations [154,155]. Therefore, new strategies for the screening and diagnosis of Lynch syndrome need to be investigated.
In addition to screening for Lynch syndrome, testing for MSI is important because of its possible prognostic and therapeutic implications. Cancers with high microsatellite instability (H-MSI) were reported to have a more favourable clinical outcome than non-MSI tumours, and the survival advantage conferred by the MSI phenotype is independent of tumour stage and other clinicopathological variables [156][157][158]. Moreover, tumours with H-MSI are thought to be less responsive to 5-fluorouracil and other anticancer agents in vitro and in vivo [159][160][161].

Source of biological data
One of the major factors influencing the performance and accuracy of molecular profiling is the source and processing of patient samples. So far, reliable analysis is limited to fresh blood or fresh-frozen tissue samples. However, these samples may be unavailable from subsets of patients.
Routine histology processing uses formalin fixation to preserve the histological architecture of tissue specimens. Archival collections of formalin-fixed tissues, linked to clinical databases, provide a rich resource from which biological insights could be derived far more expeditiously than from the prospective collection of frozen samples. In addition, any biomarker developed from formalin-fixed paraffin-embedded (FFPE) samples could be more readily translated into clinical practice.
Unfortunately, RNA is degraded in tissues before, during, and after formalin fixation [162,163] and can continue to deteriorate even during storage, leading to shortened fragments of RNA [164]. Interestingly, miRNAs appear to be better preserved, perhaps because of their intrinsically shorter lengths. Therefore, it should be possible to perform genome-wide screening for miRNAs using FFPE tissues [165][166][167]. Indeed, this approach has been successful for colon and breast cancers; however, the technical robustness of these platforms has not been thoroughly investigated. in their gene expression patterns [169]. Subsequently they found a correlation between those subtypes and clinical outcome, suggesting that gene expression patterns of tumours have both a taxonomic and prognostic value [170].
CRC represents an interesting field of molecular profiling research for several reasons. CRC is considered a biological model of tumourigenesis, because clinical progression from adenoma through early-stage carcinoma to advanced-stage carcinoma seems to parallel distinctive molecular alterations [82]. In addition, traditional clinical and pathological parameters are not always sufficient to discriminate high-risk from low-risk CRC, and validated molecular markers with prognostic value are still not available. The studies on molecular profiling in CRC have mainly focused on the carcinogenesis process, prediction of disease prognosis, and therapeutic targets and response prediction.  Subsequently many studies reported other sets of genes that were differentially expressed between cancer and normal tissue and therefore potentially involved in the development of colorectal carcinogenesis [171][172][173][174][175]. In addition, some studies reported significant differences in gene expression profile between adenoma and normal mucosa, suggesting that different mechanisms of development of these precancerous lesions may exist [176,177]. Furthermore, in order to clarify the molecular modifications underlying the development of metastases, some studies compared the gene expression profile of primary tumours with that of their corresponding metastases [178][179][180][181][182]. Agrawal [184] Some studies have also investigated differences in gene expression between CRC of the right side and left side, because of their epidemiological, morphological and pathogenetic diversity, and found distinct profiles according to the anatomical stratification. Birkenkamp-Demtroder et al. [185] investigated the difference in gene expression between the caecum and the sigmoid and rectosigmoid, and identified 58 genes that were differentially expressed between the normal mucosa of the caecum and that of the sigmoid and rectosigmoid.

Therapeutic targets and treatment response prediction
While gene expression profiling has been widely applied to CRC for diagnosis, classification and prognosis prediction based on molecular patterns of expression, its application to predicting response to medical treatment still lacks reliable results because few studies are currently available [186][187][188][189][190]. In a panel of 30 colon carcinoma cell lines, Mariadason et al. identified 420 genes correlated with response to 5-fluorouracil (5-FU) and involved in two main biological processes, DNA replication and repair and protein processing/targeting [188]. The predictive value of the 50 genes best correlated with 5-FU response was subsequently validated using a leave-one-out cross-validation approach and was higher than that of traditional markers, such as thymidylate synthase, thymidine phosphorylase, mismatch repair and p53 status. Furthermore, they also found that the 149 genes best correlated with CPT-11-induced apoptosis significantly predicted response of colon cancer cell lines to this agent. In addition, Del Rio et al. analyzed the gene expression profile of 21 primary advanced CRC tissues in order to identify an expression pattern that could predict response to leucovorin, fluorouracil and irinotecan as first-line treatment: 14 genes were found to be differentially expressed between responders and non-responders and were able to predict treatment response with 95% accuracy [189]. In the same year, Khambata-Ford et al. investigated the gene expression pattern of metastatic biopsies from 80 advanced CRC patients treated with cetuximab to identify genes whose expression correlates with best clinical response [190]. They found that, among 629 genes differentially expressed between 25 patients with disease control and 55 non-responders, the top candidate markers based on lowest p value were epiregulin and amphiregulin, both ligands for the epidermal growth factor receptor (EGFR), suggesting that these markers could be used to select patients for cetuximab therapy.
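The leave-one-out cross-validation mentioned above can be illustrated with a minimal sketch. The classifier used in the cited study is not described here, so a simple nearest-centroid rule is substituted as a stand-in; the expression values, labels and function names are hypothetical and serve only to show how each sample is held out in turn and predicted from the remaining samples.

```python
from statistics import mean

def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose mean expression profile is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], sample))

def leave_one_out_accuracy(x, y):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return correct / len(x)

# Hypothetical two-gene expression profiles for six cell lines (illustration only).
expression = [[5.1, 1.0], [4.8, 1.2], [5.3, 0.9],
              [1.1, 4.9], [0.9, 5.2], [1.3, 5.0]]
response = ["responder"] * 3 + ["non-responder"] * 3

accuracy = leave_one_out_accuracy(expression, response)
```

In practice, any gene selection step should be repeated inside each held-out round rather than performed once on the full data set; otherwise the estimated accuracy is likely to be optimistic.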
Some studies evaluated the ability of gene expression profiling to predict the response of advanced rectal cancer (RC) to preoperative chemoradiotherapy [186][191][192][193]. Ghadimi  predicted complete and partial response with an 84% accuracy [191]. Similarly, another study identified a gene expression signature of 42 genes that was able to distinguish responders from non-responders among locally advanced RC patients with 71% accuracy [192]. Recently, Spitzner et al. were able to identify a gene expression signature for chemoradiosensitivity of colorectal cancer cells [193].
They exposed 12 colorectal cancer cell lines to 5-fluorouracil and radiation therapy. The differences in treatment sensitivity were then correlated with the pretherapeutic gene expression profiles of these cell lines. Their data suggested a potential relevance of the insulin and Wnt signalling pathways for treatment response, and they also identified STAT3, RASSF1, DOK3, and ERBB2 as potential therapeutic targets [193].
Although colorectal cancer (CRC) is still one of the leading causes of cancer-related death, the introduction of new therapeutic options such as oxaliplatin and irinotecan in addition to 5-fluorouracil, the standard therapeutic for CRC, has increased the overall survival of affected patients from 10 to 18-24 months.
Furthermore, the 'biological' therapeutics cetuximab, an IgG1 chimeric monoclonal antibody against epidermal growth factor receptor (EGFR), and bevacizumab, a monoclonal antibody against vascular endothelial growth factor (VEGF), have improved the course of the disease and ushered in a new era of targeted therapy against cancer-specific molecular pathways [194][195][196][197]. Although these biologicals have entered clinical routine because of their encouraging results, their effect has been shown to be limited by adaptation or pre-existing resistance of tumour cells. This has been clearly shown in patients with mutations of K-RAS, which lead to resistance against cetuximab. Therefore, several new pathways are currently being investigated for therapeutic targeting in CRC.

miRNA biology and functions
There are three steps in the maturation of miRNA: transcription of pri-miRNA, cleavage in the nucleus to form pre-miRNA, and a final cleavage in the cytoplasm to form mature miRNA [221,222]. Pri-miRNA is synthesized from DNA by cytoplasm, pre-miRNA is cleaved on the loop end by Dicer to form a miRNA: miRNA duplex that is unwound by a helicase to release two mature miRNAs, of which one or both may be active [227].
MiRNAs exert their functionality via sequence-specific regulation of posttranscriptional gene expression and it is estimated that they regulate up to 30% of all protein-coding genes [228]. The specific region important for mRNA target recognition is located in the 5'-end of the mature miRNA strand, from bases 2 to 8, often referred to as the 'seed-sequence' [229]. Governance of gene expression and protein translation by these noncoding RNA molecules occurs largely through one of two mechanisms, dependent upon the complementarity of the miRNA seed sequence with its target mRNA. Although remarkably small, miRNAs harbor enough sequence content to be relatively specific. Generally, if a miRNA-target duplex contains imperfect complementarity, protein expression is inhibited without target mRNA destruction. However, if the duplex has nearly perfect basepairing, then the mRNA target is marked for degradation [229,230]. The Argonaute proteins present in the RNA-induced silencing complex (RISC) appear to dictate the mode of regulation elicited by the miRNA-target duplex.
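The seed-based target recognition described above can be illustrated with a short sketch. It extracts bases 2 to 8 of a mature miRNA, derives the complementary site expected on the mRNA, and scans a 3' UTR for perfect seed matches. The sequences and function names are hypothetical and chosen purely for illustration; real target prediction tools also weigh additional features such as site context and conservation.

```python
# RNA base-pairing table (Watson-Crick pairs only; G:U wobble is ignored here).
COMPLEMENT = str.maketrans("AUGC", "UACG")

def seed_sequence(mirna, start=2, end=8):
    """Return the seed: bases `start`..`end` (1-based, inclusive) from the 5' end."""
    return mirna[start - 1:end]

def seed_target_site(mirna):
    """Return the mRNA site (written 5'->3') that pairs perfectly with the seed."""
    return seed_sequence(mirna).translate(COMPLEMENT)[::-1]

def find_seed_sites(mirna, utr):
    """Return the 0-based start positions of all perfect seed matches in `utr`."""
    site = seed_target_site(mirna)
    positions = []
    idx = utr.find(site)
    while idx != -1:
        positions.append(idx)
        idx = utr.find(site, idx + 1)
    return positions

# Hypothetical sequences for illustration only (not real miRNA or UTR sequences).
toy_mirna = "ACUGGAUCCAUGCAGUCAGUAA"
toy_utr = "GGCAGAUCCAGUUAC"

seed = seed_sequence(toy_mirna)
sites = find_seed_sites(toy_mirna, toy_utr)
```

A perfect seed match identifies only a candidate site; whether the outcome is translational repression or mRNA degradation depends on the complementarity of the remainder of the duplex, as described in the text.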
Specific Argonaute proteins, once recruited, can catalyze cleavage of mRNA sequences perfectly base-paired to the miRNA, or inhibit translation of mRNAs that form an imperfect duplex with the miRNA [231,232]. The recent explosion of miRNA research and discovery further underscores the importance of these regulatory molecules in many key biological processes, such as development, cellular differentiation, cell cycle control and apoptosis [233][234][235]. There is sufficient evidence that miRNAs are involved in human cancer [236,237]. It was suggested previously that miRNAs exert their function as oncogenes or tumour suppressor genes via several potential mechanisms. If a particular miRNA targets key tumour suppressor genes, it is regarded as an oncogene; if a miRNA targets an oncogene, it may be viewed as a tumour suppressor gene. However, the matter may be far more complicated than this simple view, because one particular miRNA can regulate the expression of up to several hundred mRNAs. We speculate that, to a large extent, the function of miRNAs is to fine-tune gene expression in response to acute changes in growth conditions rather than to act as a tumour suppressor or oncogene in the traditional sense. The first evidence that miRNAs may function as tumour suppressor genes came from a study by Calin et al. [220], which showed that patients with B-cell chronic lymphocytic leukemia (CLL) have frequent deletions or downregulation of two miRNA genes, miR-15a and miR-16-1. Cimmino et al. [238] showed that the anti-apoptotic gene BCL2 was negatively regulated by miR-15a and miR-16-1. This suggests that deletion or downregulation of miR-15a and miR-16-1 results in an elevated level of BCL2, which promotes leukaemogenesis and lymphomagenesis in haematopoietic cells. However, Borkhardt et al. [239] reported that, among 69 B-cell cases with 13q deletion, none showed mutations in miR-15a and miR-16-1. Fulci et al. [240] also reported that the down regulation of miR-15a and miR- 16- also been found to be upregulated in breast cancer, and this upregulation causes downregulation of 2 important targets: programmed cell death 4 (PDCD4) and tropomyosin 1 (TPM1) [244][245][246]. Differential expression of genes encoding some miRNAs seems to be associated with particular pathologic features of breast cancer. Mattie et al. [247] subsequently identified unique sets of miRNAs associated with breast tumours defined by their HER2/neu or ER/PR status. Moreover, Lowery et al. [248] have described 3 miRNA signatures predictive of ER, PR and HER2/neu receptor status, which were identified by applying artificial neural network analysis to miRNA microarray expression data. In addition, expression of the gene encoding miR-30 seems to correlate with estrogen receptor and progesterone receptor status; downregulation of this miRNA is found in estrogen receptor- and progesterone receptor-negative tumours [242]. MiR-206 has been found to target the 3' UTR of the estrogen receptor α transcript, leading to an inverse correlation between miR-206 concentration and estrogen receptor status [249,250]. Recently, Heneghan et al. miR-203 appear to correlate with tumour stage; increased expression of the genes encoding these miRNAs is found in higher-stage tumours [242]. Other miRNAs with prognostic value for breast cancer include miR-10b, miR-21, miR-145, miR-9-3 and let-7; levels of these miRNAs correlate with tumour grade, degree of vascular invasion, lymph node metastases, or metastatic potential [252].

miRNAs in Gastric Cancer:
An increasing number of studies show overexpression or downregulation of specific miRNAs in H. pylori-infected gastric mucosa and gastric cancer [253]. Dysregulated miRNAs include miR-21 and the miR-181 family, whereas let-7a expression is lower in gastric cancer patients [254]. High levels of miR-17 and miR-106a in the peripheral blood of gastric cancer patients have also been confirmed in another study [255]. These findings suggest that miRNAs may be useful biomarkers for the early diagnosis of gastric cancer.
Recent studies suggest that polymorphisms in the miRNA genes may serve as novel risk predictors for gastric cancer. Arisawa et al. [256] first reported in 2007 that a polymorphism of the miR-27a genome region is associated with a higher risk of developing gastric mucosal atrophy in Japanese men. Peng et al. [257] later reported an association of a miRNA-196a-2 gene polymorphism with gastric cancer risk in a Chinese population. Moreover, miRNAs have recently been used to predict the outcome of patients with gastric cancer. For example, a seven-miRNA signature (miR-10b, miR-21, miR-223, miR-338, let-7a, miR-30a-5p and miR-126) is closely associated with relapse-free and overall survival among patients with gastric cancer [258]. High expression levels of miR-20b or miR-150 [259] or downregulation of miR-451 [260] or miR-218 [261] are also associated with poor survival, whereas there is a correlation between miR-27a and lymph node metastasis [259]. In addition, Ueda et al. [262] recently reported that miR-125b, miR-199a and miR-100 represent a progression-related signature, whereas low expression of let-7g and high expression of miR-214 are associated with shorter overall survival independent of depth of invasion, lymph node metastasis and stage. These prognostic miRNAs could be applicable to future decisions concerning treatment.

miRNA expression and functions in CRC
In 2003 [264]. Subsequent mechanistic investigations provided evidence for the oncogenic role of miR-21 in CRC by demonstrating that it suppresses the cell cycle regulator CDC25A [268] and can also target and repress the tumour suppressor gene PDCD4, thus inducing invasion, intravasation and metastatic potential [269]. MiR-21 may also target PTEN and TPM1. In addition, miR-135a and miR-135b are upregulated, and this upregulation correlates with reduced expression of the APC gene [270]. Moreover, miR-143 and miR-145 are both downregulated in colorectal cancer. The genes encoding these miRNAs are both located on 5q23, and these miRNAs possibly originate from the same primary miRNA [263,265]. MiR-126 promotes cell proliferation through modulation of phosphatidylinositol 3-kinase signaling [271]. MiR-133b is also downregulated, and one of its putative targets is KRAS [272], a member of the Ras family of proteins, which regulates signaling pathways involved in cellular proliferation, differentiation, and survival. Moreover, overexpression of the oncogenic miR-17-92 cluster is also implicated in the etiology of CRC, specifically in adenoma to adenocarcinoma progression.

Diagnostic and prognostic value:
To test the function of miRNAs in the pathogenesis of CRC, expression of 156 miRNAs was measured in both tumour and normal tissues from patients with CRC and in cell lines [272].  [274] identified 37 miRNAs that were differentially expressed between CRC and normal tissues. They also reported that loss of miR-133a and gain of miR-224 are associated with tumour progression. Overexpression of miR-21 was shown in many reports to be associated with worse prognosis, lymph node and distant metastasis and poor response to chemotherapy in CRC. Moreover, Asangani et al. [269] reported that overexpression of miR-21 causes tumour cells to invade and metastasize more aggressively when implanted into mouse models. In addition, a study by Motoyama and colleagues [275] showed that expression of miR-31, miR-183, miR-17-5p, miR-18a and miR-92 was significantly higher in tumour tissues than in normal tissues, while expression of miR-143 and miR-145 was lower in cancer than in normal tissues. They also showed that miR-18a expression was associated with poor disease prognosis. Moreover, miR-31 expression was positively related to advanced TNM stage and tumour invasion, suggesting a role in CRC initiation and progression [276]. Of further interest, Lanza et al. [277] identified a molecular signature consisting of 27 differentially expressed genes, inclusive of 8 miRNAs, that can correctly distinguish high microsatellite instable (MSI-H) from microsatellite stable (MSS) colon cancers of

Therapeutic potential:
The synthesis and functions of miRNAs can be manipulated with various oligonucleotides that encode the sequences complementary to mature miRNAs [278]. Overexpression of miRNAs can be induced by using either synthetic miRNA mimics or chemically modified oligonucleotides [279]. Conversely, miRNAs can be silenced by antisense oligonucleotides and synthetic analogues of miRNAs [280,281].
Cross-sensitivity with endogenous miRNAs and lack of specificity for target miRNAs can cause non-specific side-effects with miRNA modulation therapy. However, the use of an effective delivery system and less toxic synthetic anti-miRNA oligonucleotides may minimize such side-effects. The role of miRNAs in the pathogenesis of cancer makes them important targets for therapeutic intervention. Gene therapies may be designed to treat colorectal cancers and to block the progression of precursor lesions by manipulating the tumour suppressor or promoter miRNAs [282]. Such manipulation may control the tumour growth rate and has potential as a new therapy for both early and advanced cancers. Studies have revealed that inhibition of miR-21 and miR-17-92 activity is associated with reduced tumour growth, invasion, angiogenesis and metastasis [283,284]. Targeting such miRNAs may help to prevent the recurrence of disease in high-risk tumours and may control the growth of advanced metastatic tumours.
Overexpression of miR-21 is associated with low sensitivity and poor response to chemotherapy [282]. Its inhibition may improve the response to chemotherapy. In addition, some drugs were found to alter the expression of miRNAs. Rossi et al.

Study Rationale
The involvement of certain molecules in the initiation and progression of human malignancy holds much potential for new developments in current diagnostic and therapeutic strategies in the management of CRC patients. While a number of miRNAs with a functional role in CRC have been identified and functionally characterised, the heterogeneity and molecular complexity of CRC make it likely that many more molecules are involved in the pathways that promote CRC progression and response to therapeutics. The identification of novel genes and miRNAs involved in colorectal carcinogenesis, and an understanding of their functional effects, particularly in relation to the current indicators, will improve our knowledge of the roles of these novel biomarkers in carcinogenesis and promises to open avenues for potential therapeutic intervention.
The purpose of this study was to investigate the role of mRNA, miRNA and MMR proteins by analysing their expression using the following approaches:

I-mRNA expression profile in CRC:
Analysis of gene expression patterns represents one of the most interesting topics in medical oncology, because it provides a global and detailed view of the molecular changes involved in tumour progression, leading to a better understanding of the carcinogenesis process and to the discovery of new prognostic markers and novel therapeutic targets. Although clinical and pathological parameters are available for the classification and prognostic stratification of cancer, they may be inadequate in everyday practice because of the great biological and genetic heterogeneity of this multiform disease. Therefore, we selected a panel of candidate genes, based on literature review, and quantified their expression in colorectal cancer using RQ-PCR in order to:
1. Determine the expression levels of candidate genes in tumour and tumour-associated normal colorectal tissue.
2. Investigate the correlation between serum carcinoembryonic antigen (CEA) and tissue CEACAM5 levels to identify a relationship that could further refine the role of CEACAM5 as a biomarker in CRC.
3. Correlate the expression levels of candidate mRNAs with a panel of miRNAs in order to identify miRNA/mRNA duplexes and to investigate the miRNA and target gene expression patterns in colorectal tissue samples.

III-MMR protein expression in CRC:
Information about MMR protein status in colorectal cancer is important because it will identify those patients most likely to have Lynch syndrome and those most likely to have microsatellite instability in their tumours, which has been shown to carry a better prognosis and may affect their treatment regimens in the future. We undertook this study to:
-Serum, plasma and whole blood samples retrieved from patients pre- and post-tumour resection.
-Serum, plasma and whole blood samples retrieved from non-cancer controls (Appendix 2: Specimen Request Form).
In accordance with the guidelines [287] including:  Biopsies were fixed and stored at room temperature until embedding for a minimum of 24 hours.

Paraffin embedding
After fixation, tissue samples (10mm×5mm×2mm) were removed from formalin and placed in open cassettes. The cassettes were then closed and placed in 250 mL of industrial methylated spirit (VWR) to wash the formalin from the tissue.
Next, the cassettes were removed, placed in a JFC beaker filled with JFC solution (Milestone) and processed in the histoprocessor (MicroMED) for 60 minutes (70°C).
Thereafter, the cassettes were transferred to paraffin wax (VWR)-filled beakers and placed in the histoprocessor (MicroMED) for 30 minutes. The cassettes were then removed from the wax beaker and the tissue was blocked out carefully. The blocks were left at 4°C until hard and then stored in the fridge or at room temperature until sectioning.

Sectioning
Sectioning of formalin-fixed paraffin-embedded tissues was carried out using a Slee microtome (LIS Ltd). Tissue blocks were inserted into the holder with the label facing downwards. Section thickness was set to 30 µm to pare the block down until even sections were being cut and the outer layer of wax was removed.
-Tumour grade, which represents the degree of differentiation as gauged primarily by architectural features and is defined based on the TNM classification [288], i.e., grade 1 as well-differentiated, grade 2 as moderately differentiated, grade 3 as poorly differentiated and grade 4 as undifferentiated [289].
-Pathological data, including perineural and lymphovascular invasion and mucin secretion.
-Response to neoadjuvant therapy, scored using the tumour regression score of Mandard [44].
-Dukes' [8] and American Joint Committee on Cancer (AJCC) [290] systems were used for disease classification and staging.
-Serum levels of tumour markers (CEA and CA 19.9).
A group of 65 patients, in whom the expression of a panel of miRNAs had previously been analysed at the surgical research laboratory, was selected for the miRNA:mRNA correlations in order to identify miRNA/target gene duplexes.

RNA extraction from fresh-frozen tissue
Two methods of RNA extraction were employed in the study, the total RNA extraction (co-purification) and the separate purification of mRNA and miRNA.
The co-purification method involves isolation of total RNA, with subsequent purification of mRNA or small RNA from the total RNA pool. The second method purifies mRNA and miRNA directly out of solution via poly-A isolation or sequence-specific isolation. The separate purification was used when miRNA analysis was required. To ensure that both methods were working properly, a correlation of the RNA extractions was carried out and showed good results (

Total RNA extraction (co-purification)
Approximately 50-100 mg of fresh-frozen colorectal tissue was homogenised using a hand-held homogenizer (Polytron PT1600E) in 1-2 mL of QIAzol reagent (Qiagen). To minimise variation in sample processing, tumour and TAN samples were homogenised separately, but on the same day. Total RNA was isolated from homogenised tissues using the RNeasy Plus Mini kit (Qiagen) according to the manufacturer's instructions. An Eppendorf Micromax refrigerated centrifuge was used throughout the RNA extraction process. An aliquot of 500 µL of the homogenate was transferred to sterile 1.5 mL tubes and centrifuged at 14000 rpm for 10 minutes at 4°C, before the addition of chloroform  (Qiagen). Enzyme was applied onto the membrane of the column and left at room temperature for 15 minutes. The buffer RW1 wash step was repeated. Two further wash steps, using 500 µL of buffer RPE, were carried out. The second of these steps had an increased centrifugation time of 2 minutes to dry the membrane. The large RNA was eluted from the RNeasy column by applying 50 µL of RNase-free water to the membrane and centrifuging at 12000×g for 1 minute at 4°C. A portion of the purified large and small RNA was aliquoted for quantitative and qualitative analysis using the NanoDrop ND-1000 Spectrophotometer (NanoDrop technologies) and the Agilent 2100 Bioanalyzer System (Agilent technologies), respectively. The remaining RNA was stored at -80°C until further use.
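Because the protocol above reports centrifugation both in rpm and in ×g, the following sketch shows the standard conversion between the two, RCF = 1.118 × 10⁻⁵ × r × rpm², where r is the rotor radius in centimetres. The rotor radius of the centrifuge used is not stated in the text, so the value below is a hypothetical placeholder and the resulting figures are illustrative only.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard constant for radius in cm and speed in rpm

def rcf_from_rpm(rpm, rotor_radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * rotor_radius_cm * rpm ** 2

def rpm_from_rcf(rcf, rotor_radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * rotor_radius_cm))

# Hypothetical rotor radius; the actual rotor geometry is not given in the text.
ASSUMED_RADIUS_CM = 8.4

force_at_14000_rpm = rcf_from_rpm(14000, ASSUMED_RADIUS_CM)
speed_for_12000_g = rpm_from_rcf(12000, ASSUMED_RADIUS_CM)
```

Reporting spin steps in ×g rather than rpm makes a protocol independent of the rotor, which is why a conversion of this kind is useful when the two units are mixed.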

Qiagen RNeasy FFPE kit
This method was employed using the RNeasy FFPE kit (Qiagen)  This was repeated until the entire sample had been drawn through the column. Two wash steps, using 500 µL of buffer RPE, were carried out. The second of these steps had an increased centrifugation time of 2 minutes to dry the membrane. The RNeasy MinElute column was placed in a new 2 mL collection tube, and the old collection tube with the flow-through was discarded. The lid of the spin column was opened and the column was centrifuged at full speed for 5 minutes. The collection tube with the flow-through was then discarded. The RNeasy MinElute column was then placed in a new 1.5 mL collection tube. The RNA was eluted from the RNeasy column by applying 30 µL of RNase-free water to the membrane and centrifuging at full speed for 1 minute at 20-25°C. A portion of the purified RNA was aliquoted for quantitative and qualitative analysis using the NanoDrop ND-1000 Spectrophotometer (NanoDrop technologies) and the Agilent 2100 Bioanalyzer System (Agilent technologies), respectively. The remaining RNA was stored at -80°C until further use.

Qiazol and chloroform protocol
Paraffin sections (3×10 µm) were prepared as previously described and placed in 2 mL microcentrifuge tubes. To each sample 1 mL of 100% xylene (Sigma-Aldrich) was added, and samples were vortexed vigorously for 10 seconds and centrifuged at was added to each sample prior to disruption using a needle and syringe before the addition of chloroform. Three and a half volumes of 100% ethanol were added to the upper aqueous phase, the entire volume was transferred to an RNeasy mini kit column, and the process was continued as for total RNA extraction from fresh-frozen tissues. A portion of the RNA was aliquoted for quantitative and qualitative analysis. The remaining RNA was stored at -80°C until further use.

TRI reagent RT-Blood protocol
Paraffin sections (3×10 µm) were placed in a 2 mL microcentrifuge tube. Xylene and 100% ethanol (Sigma-Aldrich) washes were carried out as described in the preceding section. After complete evaporation of the ethanol at room temperature for 10 minutes, 1 mL of QIAzol reagent (Qiagen) was added to the sample, which was then homogenized using a needle and syringe. To precipitate RNA, 80% of each aqueous phase (about 1 mL) was then transferred to a new 2 mL round tube and mixed with a similar volume of isopropanol (Sigma-Aldrich).
The mixture was stored at room temperature for 5 minutes and then centrifuged at bottom of the tube. Two wash steps using 75% ethanol were then carried out in order to improve the quality of the RNA (260/280 ratio). The RNA pellet was air-dried for 5 minutes before the RNA was dissolved in 30 µL of nuclease-free water. The solution was incubated at room temperature for 5 minutes, vortexed and spun down for 10 seconds. A volume of the RNA was aliquoted for quantitative and qualitative analysis. The remaining RNA was stored at -80°C until further use.

RNA extraction from blood
Total RNA was extracted from 1mL of whole blood using the Tri Reagent BD (http://www.mrcgene.com/rna.htm) and a modified protocol from that provided by the manufacturers. In brief, 1-bromo-4-methoxybenzene was used to augment the RNA phase separation and an additional ethanol (75%) wash was performed to improve the purity of RNA isolated as reflected in an improved 260/280 ratio.
RNA concentration and integrity were determined by spectrophotometry (NanoDrop technologies) and bioanalyzer (Agilent technologies).

RNA concentration and quality analysis
RNA concentration and purity were assessed in duplicate samples (1 µL) using a NanoDrop ND-1000 Spectrophotometer (NanoDrop technologies), while RNA integrity was evaluated using the RNA 6000 Nano Chip Kit (Series II) and the Agilent 2100 Bioanalyzer System (Agilent technologies).

NanoDrop Spectrophotometry:
Total and large RNA concentration and purity were assessed using the NanoDrop ND-1000 Spectrophotometer (NanoDrop technologies). An aliquot of 1 µL of RNA was pipetted onto the apparatus pedestal. The sample arm was used to compress the sample so that a sample column formed, held in place by surface tension.
Spectral measurements were made with a tightly controlled pathlength of 0.1 cm.

Agilent Bioanalyzer:
The large-RNA enriched fractions and the total RNA were also analysed using the RNA 6000 Nano LabChip series II Assay and the Agilent 2100 Bioanalyzer. An RNA integrity number (RIN) [291] was generated for each sample using the Agilent 2100 Expert Software (Version B.02.03), based on the ratio of ribosomal bands and on the presence or absence of degradation products on the electrophoretic and gel-like images. A threshold value of RIN ≥ 7 was applied, ensuring that only RNA of good integrity was used in these experiments.

NanoDrop Spectrophotometry:
The concentration and purity of small RNA were assessed using a NanoDrop ND-1000 Spectrophotometer (NanoDrop technologies). 'Other' was selected as the sample type and the wavelength-dependent extinction coefficient of 33 was used.
RNA integrity was assessed using the Small RNA Assay with the bioanalyzer (Agilent technologies).
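The arithmetic behind these spectrophotometric readings can be sketched as follows. This is an illustrative sketch only, not the instrument's software: it assumes absorbance values already normalised to a 10 mm pathlength, uses the factor of 33 stated above for the 'Other' sample type, and takes the conventional factor of 40 for RNA as an assumption (the text does not state it). The function names and the example readings are hypothetical.

```python
def concentration_ng_per_ul(a260, factor):
    """Concentration from absorbance at 260 nm (10 mm equivalent) and a conversion factor."""
    return a260 * factor


def purity_ratio(a260, a280):
    """260/280 ratio used as an indicator of RNA purity."""
    return a260 / a280


def dilution_volumes(stock_conc, target_conc, final_volume):
    """Volumes of stock and diluent needed to reach target_conc in final_volume."""
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    stock_volume = final_volume * target_conc / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical readings for one small-RNA sample (factor 33, as in the text).
small_rna_conc = concentration_ng_per_ul(a260=1.5, factor=33)
ratio = purity_ratio(a260=1.5, a280=0.75)

# Volumes needed to dilute that sample to 1 ng/µL in a 50 µL final volume.
stock_ul, water_ul = dilution_volumes(small_rna_conc, target_conc=1.0, final_volume=50.0)
print(small_rna_conc, ratio, round(stock_ul, 2), round(water_ul, 2))
```

The same dilution helper applies to the 1 ng/µL working dilution used for the Small RNA Assay described in the next section.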

Agilent Bioanalyzer
The Small RNA Assay was chosen for its high resolution in the 6-150 nucleotide range, allowing verification of small RNA retrieval and comparison of the small RNA component between tissue samples. The small RNA assay was carried out according to the Small RNA Assay kit guide. The electrodes were cleaned with RNase-free water prior to use. To prepare the gel, Small RNA gel matrix and dye concentrate were allowed to equilibrate at room temperature for 30 minutes. The complete volume of gel was spun at 10000×g for 15 minutes. The dye concentrate was vortexed for 10 seconds and briefly centrifuged. In a new 0.5 mL RNase-free tube, 2 µL of dye concentrate and 40 µL of the filtered gel were mixed thoroughly by careful pipetting. The gel/dye mix was then spun at 13000×g for 10 minutes at room temperature. Samples were diluted to 1 ng/µL, within the quantitative and qualitative range of the assay. The RNA samples and RNA ladder were denatured at 70°C for 2 minutes and then placed on ice prior to use. To prepare the chip, 9.0 µL of gel/dye mix was pipetted into a marked well before closing the chip priming station for 60 seconds. Another 9.0 µL of gel/dye mix was pipetted into the second marked well before discarding the rest of the mix. Small RNA conditioning solution was then drawn into the well marked CS. 5.0 µL of the Small RNA marker was pipetted into each of the 11 sample wells and the ladder well.
Then 1 µL of sample was drawn into each of the 11 sample wells and 1 µL of the ladder was pipetted into the ladder well. Before the chip was run in the Agilent 2100 bioanalyzer it was vortexed horizontally in the adaptor at 2000 rpm for 5 minutes.

mRNA Reverse transcription
Aliquots of large RNA equivalent to 2 µg were reverse transcribed using Superscript III reverse transcriptase (Invitrogen). After a gentle mix, the tubes were briefly centrifuged. The mixture (40 µL in total) was incubated as above in an AB9700 GeneAmp thermal cycler (Applied Biosystems). Samples were subsequently diluted to 100 µL in nuclease-free water and stored at -20ºC. An RT-negative control was included in each batch of reactions. Samples were incubated at 16ºC for 30 minutes, 42ºC for 30 minutes and finally 85ºC for 5 minutes to denature the strands. The reaction was performed using a Gene Amp PCR system 9700 thermal cycler (Applied Biosystems). An RT-negative control was included in each batch of reactions.

Real time quantitative PCR
RQ-PCR allows accumulating amplified DNA/cDNA to be detected and measured as the reaction progresses, i.e. in real time. The amount of amplified product can be detected by incorporating a DNA-binding dye or a fluorescently-labelled gene-specific probe in the reaction. The RQ-PCR reaction consists of an exponential phase, in which the amount of amplified product approximately doubles during each cycle of denaturation, primer annealing and template extension, and a non-exponential or plateau phase, in which depleted reagents limit the reaction. The point at which enough amplified product has accumulated to produce a detectable fluorescence signal is known as the threshold cycle or Ct; the greater the amount of starting template, the lower the Ct value.

Figure 2.3: RQ-PCR phases.
A basic PCR run can be broken up into three phases. Exponential: exact doubling of product occurs at every cycle, because all of the reagents are fresh and available. Linear: as the reaction progresses, some of the reagents are consumed as a result of amplification. Plateau: the reaction has stopped, no more products are being made and, if left long enough, the PCR products will begin to degrade. The RQ-PCR calculates two values. The Threshold line is the level of detection at which a reaction reaches a fluorescent intensity above background. The PCR cycle at which the sample reaches this level is called the Cycle Threshold, Ct.

Amplification efficiency
In a PCR reaction with optimised primer conditions, reagent concentration etc., the amplification efficiency should approach 100% in the exponential phase, i.e. a doubling of amplification product for each cycle. To determine the amplification efficiency of the RQ-PCR assay, serial dilutions (neat to $10^{-6}$) of cDNA template were prepared and amplified using the same conditions used for subsequent gene expression analysis. A dilution curve was constructed by plotting Ct versus the dilution factor of cDNA. Amplification efficiencies (E) were calculated for each RQ-PCR assay using the formula: $E = (10^{-1/\text{slope}} - 1) \times 100$, where slope is the slope of the dilution curve.
The $R^2$ value of the dilution curve represents the linearity of the data. The $R^2$ value should be ≥ 0.98 for each dilution curve. A threshold of 10% above and below 100% efficiency was applied to indicate a relatively robust and reproducible RQ-PCR assay. As before, standard fast thermal cycling conditions were used, consisting of 40 cycles at 95ºC for 15 seconds and 60ºC for 60 seconds. On each plate, an interassay control was included to account for any variations between runs.
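The efficiency calculation can be illustrated with a short sketch. It fits a straight line to Ct against the log10 dilution factor by ordinary least squares and applies the efficiency formula and the acceptance limits given in this section (efficiency within 10% of 100%, R² ≥ 0.98). The Ct values are invented for illustration and the function names are hypothetical; this is not the software used in the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(slope):
    """E (%) = (10^(-1/slope) - 1) * 100, with slope taken from the dilution curve."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def assay_is_acceptable(efficiency, r_squared):
    """Apply the limits stated in the text: 90-110 % efficiency and R^2 >= 0.98."""
    return 90.0 <= efficiency <= 110.0 and r_squared >= 0.98


# Hypothetical 10-fold dilution series (neat to 10^-6) with invented Ct values.
log10_dilution = [0, -1, -2, -3, -4, -5, -6]
ct_values = [18.1, 21.4, 24.8, 28.1, 31.5, 34.8, 38.2]

slope, intercept, r2 = fit_line(log10_dilution, ct_values)
efficiency = amplification_efficiency(slope)
print(round(slope, 3), round(efficiency, 1), round(r2, 4),
      assay_is_acceptable(efficiency, r2))
```

A slope of about -3.32 corresponds to exact doubling per cycle, which is why efficiencies close to 100% give slopes near that value.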

Endogenous control
Central to the reliable determination of gene expression is the choice of control gene with which to normalise real-time data from target genes. Normalisation can be achieved using endogenous or exogenous controls; however the use of endogenous control (EC) genes is the most widely adopted approach as it excludes variation associated with differences in amounts of template RNA. An ideal EC gene (or genes) should be stably expressed and unaffected by parameters such as disease status and in the case of CRC, should remain unaffected by whether a tissue was derived from normal, adenoma or carcinoma lesions.
B2M and PPIA were used as endogenous control (EC) genes to normalise gene expression levels in RQ-PCR reactions measuring gene expression levels [292].
This pair of genes was chosen on the basis that they had been validated as the most stably expressed genes in a large group of colorectal tissues, as will be described in detail in the following chapter. For miRNA expression analysis, the combined expression of miR-16 and miR-345 was used to normalise expression data, as previous work in the Department of Surgery had validated these miRNAs in colorectal tissue [293].

Relative quantity
Cycle threshold (Ct) is defined as the PCR cycle number at which the fluorescence generated from amplification of the target gene within a sample increases to a threshold value of 10 times the standard deviation of the baseline emission; it is inversely related to the starting amount of the target cDNA.
In order to correct for non-biological variation in gene expression potentially introduced during the RQ-PCR process, an endogenous control (EC) gene, which has verified stable expression across samples, is used. QBasePlus was used for calculation of expression levels of target genes relative to each of the EC genes. It applies the ΔΔCt method, where: $\Delta\Delta C_t = (C_{t,\,\text{target gene, test sample}} - C_{t,\,\text{endogenous control, test sample}}) - (C_{t,\,\text{target gene, calibrator sample}} - C_{t,\,\text{endogenous control, calibrator sample}})$.
Relative quantities were corrected for efficiency of amplification, and fold change in gene expression between groups was calculated as $E^{-\Delta\Delta C_t}$ ± s.e.m. Where more than one endogenous control was used, fold change estimates were calculated using the geometric mean of EC quantities relative to the calibrator sample, which could be the minimum, the maximum, a named sample or an average.
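The relative quantification arithmetic can be sketched as below. This is a simplified illustration, not the QBasePlus implementation: it assumes that the amplification base is derived from the percentage efficiency as 1 + E/100 (so 100% efficiency gives a base of 2), and all function names and Ct values are hypothetical.

```python
import math


def delta_delta_ct(ct_target_test, ct_ec_test, ct_target_cal, ct_ec_cal):
    """ddCt = (Ct target - Ct EC) in the test sample minus the same in the calibrator."""
    return (ct_target_test - ct_ec_test) - (ct_target_cal - ct_ec_cal)


def fold_change(ddct, base=2.0):
    """Fold change as base^(-ddCt); base 2.0 corresponds to 100 % efficiency (assumption)."""
    return base ** (-ddct)


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalised_quantity(ct_target, ct_target_cal, ec_cts, ec_cal_cts, base=2.0):
    """Target quantity relative to the calibrator, divided by the geometric
    mean of the endogenous control quantities relative to the calibrator."""
    rq_target = base ** (ct_target_cal - ct_target)
    rq_ecs = [base ** (cal - ct) for ct, cal in zip(ec_cts, ec_cal_cts)]
    return rq_target / geometric_mean(rq_ecs)


# Hypothetical Ct values: one target gene, two endogenous controls.
ddct = delta_delta_ct(ct_target_test=24.0, ct_ec_test=18.0,
                      ct_target_cal=26.0, ct_ec_cal=18.0)
print(ddct, fold_change(ddct))
print(normalised_quantity(24.0, 26.0, ec_cts=[18.0, 20.0], ec_cal_cts=[18.0, 21.0]))
```

With a single endogenous control and equal efficiencies, `normalised_quantity` reduces to the fold change from the ΔΔCt expression, which is a useful consistency check.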

Microarray analysis
Microarray analysis was carried out on total RNA extracted from sections of colorectal FFPE tissues using Megaplex pool A primers (Applied Biosystems) according to the manufacturer's protocol. These TaqMan microfluidic real-time PCR array cards (TLDAs) contained 384 TaqMan sequence-specific miRNA assays and were prepared in a two-step process as follows:

Megaplex RT reactions:
Total RNA was extracted from paraffin sections as described in section 2.3.2.1. Reactions were performed in total volumes of 7.5µL of total RNA and RT reaction mix. Thereafter, samples were incubated for 40 cycles at 16°C for 2 minutes, 42°C for 1 minute and 50°C for 1 second and finally left at 85°C for 5 minutes to denature the strands. The reaction was performed using a Gene Amp PCR system 9700 thermal cycler (Applied Biosystems).

TLDA RQ-PCR reactions:
Reaction mixes (900 µL in total) for TLDA RQ-PCR array profiling of samples using 384-well microfluidic cards were prepared by combining:
Nuclease-free water 444 µL
100 µL of the above pre-mix was dispensed into each port of the TLDA card, which was then centrifuged and sealed. Thermal cycling was performed using a

Figure 2.4: The steps in real-time PCR (the 5´ nuclease assay)
Each TaqMan MGB probe anneals specifically to its complementary sequence between the forward and reverse primer sites. When the hybridized probes are cleaved by AmpliTaq Gold® enzyme, the quencher is separated from the reporter dye, increasing the fluorescence of the reporter dye. Therefore, the fluorescence signal generated by PCR amplification indicates the gene expression level in the sample.

Artificial neural network
Algorithms and architecture: In this study, a three-layer multi-layer Perceptron (MLP) modified with a feedforward back-propagation algorithm and a sigmoidal transfer function was used.
The learning rate and momentum were respectively set at 0.1 and 0.5. An automatic pre-processing normalised the data between 0 and 1 for each variable.
The intensity values for the miRNA for each individual were represented in the input layer, the hidden layer contained 2 hidden nodes, and the class was represented in the output layer, coded as 0 for negative and 1 for positive. A randomly selected subset of the cases is presented to the network to train it (training data) while it is constantly monitored with a randomly selected subset of unseen cases (test data). These test data are used to stop the training process once the model has reached predetermined conditions, such as an optimal error value, preventing overtraining. Once training is stopped, the efficiency of the model is further assessed by presenting a third, randomly selected blind subset to the model to determine performance for unseen cases not involved in the training process. This subset selection process was repeated up to 50 times for randomly selected subsets, a process known as Monte Carlo Cross Validation (MCCV). The suite of 50 models produced was analysed and screened for model optimisation purposes.
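The resampling scheme described above can be sketched as follows. This is a minimal illustration of the data handling only (scaling each variable to the range 0 to 1 and drawing repeated random training, test and blind subsets); it does not train a network. The 60/20/20 split proportions, the fixed seed and the function names are assumptions, since the text does not state them.

```python
import random


def min_max_scale(column):
    """Rescale one variable to the range 0-1; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]


def mccv_splits(n_cases, n_repeats=50, train_frac=0.6, test_frac=0.2, seed=0):
    """Yield (train, test, blind) index lists for Monte Carlo cross validation.

    Each repeat reshuffles all cases, so subsets differ between repeats but
    never overlap within a repeat. The fractions are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_train = int(n_cases * train_frac)
    n_test = int(n_cases * test_frac)
    for _ in range(n_repeats):
        indices = list(range(n_cases))
        rng.shuffle(indices)
        train = indices[:n_train]
        test = indices[n_train:n_train + n_test]
        blind = indices[n_train + n_test:]
        yield train, test, blind


# Hypothetical usage: 20 cases, 50 random resamplings.
splits = list(mccv_splits(n_cases=20))
print(len(splits), [len(part) for part in splits[0]])
print(min_max_scale([2.0, 4.0, 6.0]))
```

In a full workflow each of the 50 splits would train one model on `train`, stop training by monitoring `test`, and report performance on `blind`.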

Model optimisation:
An additive stepwise approach was employed to identify an optimal set of markers explaining variations in the population for each question explored [294].
The stepwise approach consists of taking each single variable as an input to the ANN, and training 50 sub-models with MCCV. Each single-input model subset is then analysed and the median classification performance (based on predictive error for the blind test set) determined. The median performance for all single inputs is then analysed and the inputs ranked accordingly. The best predictor input (with the lowest error) is then selected and a second single variable added, creating a two-input model. This was repeated for all the variables in the dataset, and the best pair determined, again based on classification error. Further inputs are then added in the same stepwise fashion (generating 3-input models, 4-input models and so on), until no further improvement is obtained and an optimal model with the best predictive performance is generated.
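The additive stepwise procedure can be outlined in code as a greedy forward selection. The sketch below is an assumption-laden outline rather than the study's implementation: `median_error` stands in for "train 50 MCCV sub-models on these inputs and return the median blind-set error", and the toy scoring function and marker names are invented purely so that the example runs.

```python
def stepwise_select(variables, median_error):
    """Greedy additive selection of inputs.

    variables    -- candidate input names
    median_error -- callable taking a tuple of input names and returning the
                    median blind-set error of models built on those inputs
    Returns (selected inputs in order of addition, best error reached).
    """
    selected = []
    remaining = list(variables)
    best_error = float("inf")
    while remaining:
        # Score every model formed by adding one more candidate input.
        scored = [(median_error(tuple(selected + [v])), v) for v in remaining]
        error, variable = min(scored)
        if error >= best_error:
            break  # no further improvement: stop adding inputs
        selected.append(variable)
        remaining.remove(variable)
        best_error = error
    return selected, best_error


def toy_median_error(inputs):
    """Invented error surface: two informative markers plus a small size penalty."""
    gain = {"miR-a": 0.2, "miR-b": 0.1}
    return 0.5 - sum(gain.get(name, 0.0) for name in inputs) + 0.01 * len(inputs)


chosen, error = stepwise_select(["miR-a", "miR-b", "miR-c"], toy_median_error)
print(chosen, round(error, 2))
```

Passing the scoring step in as a callable keeps the selection logic independent of how the sub-models are trained.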

Protein analysis
Immunohistochemistry was used to examine the expression of the DNA mismatch repair (MMR) proteins hMLH1, hMSH2, hMSH6 and hPMS2 in colorectal cancer tissues.

Immunohistochemistry (IHC)
IHC is the localisation of antigens in fixed cells by the use of labelled antibodies as specific reagents through antigen-antibody interactions that are visualised by a marker such as an enzyme or a fluorescent label. In most routine IHC methods
10-Then the slides were washed twice for 2 minutes before disabling slide heater.
11-Counter-stain: Slides were warmed up to 3°C for 2 minutes, washed and then a drop of haematoxylin was applied to each slide and incubated for 2 minutes.
12-Post Counter-stain: Slides were then washed twice and Bluing reagent (Ventana) was applied to each slide and incubated for 2 minutes and again washed and blotted dry.
13-Once Ventana staining was complete, sections were washed in warm soapy water and then dehydrated in serial alcohol immersions as follows:
-Dipped many times in distilled water
-Immersed for 3 minutes in 70% ethanol.
14-A layer of DPX mounting medium (Sigma-Aldrich) was added to each slide, followed by the application of cover glass, taking care to avoid bubble formation. Slides were allowed to dry overnight and then examined.

UltraMap protocol:
The UltraMap anti-MS HRP detection system was used to determine the expression of hMLH1, hMSH2 and hPMS2 in colorectal cancer tissues. The steps were broadly similar to those of the DABMap system, with differences in reagents.

Extended UltraMap:
It was used for detection of hMLH1 and PMS2.
7-Counter-stain, post counter-stain, dehydration and DPX mounting were carried out as for the DABMap protocol.

Standard UltraMap:
It was used to detect hMSH2 protein expression. The steps were similar to those of the extended protocol, with the exception of cell conditioning, which was carried out in two cycles of (1×MCC, 2×CC, 1×MCC, 1×CC) compared to the three cycles in the extended UltraMap system.

IHC analysis
Changes in protein expression following transfection of colorectal tissues were observed in stained cells using an Olympus BX60 microscope and image analySIS software. Adjacent normal tissue served as an internal control for positive staining. As a negative control, staining was carried out without the primary antibody. MMR protein staining was considered negative when all of the tumour cell nuclei failed to react with the antibody.

mRNA target prediction
It is thought that functional characterisation of miRNAs will depend heavily on identification of their specific target mRNAs. However, experimental studies have touched on only a handful of the possible ranges of function of miRNAs, and numerous bioinformatics methods have been developed to allow high-throughput prediction of miRNA target genes. Results derived using these computational algorithms have been validated biologically, and feedback from validation results has greatly improved the performance of in silico miRNA target prediction algorithms.

Computational target predictions
There are several computational target prediction programmes available (table 2.6), all of which place emphasis on the seed region of the miRNA and the 3'UTR of the mRNA sequence. However, they differ in their exact scoring system.
Computational prediction of miRNA target sites consists of four main steps:
-Extraction of rules related to formation of miRNA-mRNA duplexes;
-Incorporation of those rules in computational algorithms;
-Prediction of novel miRNA target sites using those algorithms;
-Validation of the results, and thus the algorithm itself, using computational and experimental approaches.
For the purpose of this study, predicted targets of specific miRNAs were determined by searching miRBase, miRDB, PicTar and TargetScan for putative mRNAs with a known role in colorectal cancer or another cancer-associated signal cascade.
MiRBase [295][296][297] is a programme which predicts mRNA targets in vertebrates through a fully automated pipeline (figure 2.4), using the miRanda algorithm to identify potential binding sites for a given miRNA. The miRNA sequence is for G:U pairs), mismatched pairs get a negative score (e.g., -3), and there is a gap-opening and a gap-elongation penalty of -8 and -2 respectively. The scoring system is weighted for complementarity at the 5' end of the miRNA. An alignment score (S score) is calculated based on all of these factors. Next, the free energy score (ΔG score) of the resulting duplex is computed using the RNAlib package [298]. Cut-offs for the S and ΔG scores must be met before conservation of the 3'UTR target site is examined across species. For a site to be conserved it must be detected at the same position in a cross-species orthologous UTR alignment by a miRNA of the same family. The position of the target site can be shifted slightly (e.g., ±10 residues), and sequence identity does not have to be perfect (e.g., 90% identity may be required). Each target must be conserved in at least two species for inclusion in the database.
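The idea of a complementarity score weighted towards the 5' end of the miRNA can be illustrated with a deliberately simplified, ungapped sketch. It is not the miRanda implementation: only the mismatch score of -3 comes from the text above, while the match score (+5), the G:U wobble score (+2), the doubling of scores at nucleotides 2 to 8 and all names are assumptions chosen for illustration, and gap penalties are omitted.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_score(mirna_base, target_base, match=5, wobble=2, mismatch=-3):
    """Score one base pair; match and wobble values are illustrative assumptions."""
    if (mirna_base, target_base) in WATSON_CRICK:
        return match
    if (mirna_base, target_base) in WOBBLE:
        return wobble
    return mismatch


def alignment_score(mirna_5to3, site_3to5, seed=range(1, 8), seed_weight=2.0):
    """Ungapped complementarity score, up-weighting the seed (nucleotides 2-8).

    mirna_5to3 -- miRNA sequence read 5' to 3'
    site_3to5  -- candidate target site read 3' to 5', so that position i of
                  each string faces position i of the other
    """
    total = 0.0
    for i, (a, b) in enumerate(zip(mirna_5to3, site_3to5)):
        weight = seed_weight if i in seed else 1.0
        total += weight * pair_score(a, b)
    return total


# First eight nucleotides of let-7 against a perfectly complementary site
# and against a site with one mismatch inside the seed.
print(alignment_score("UGAGGUAG", "ACUCCAUC"))
print(alignment_score("UGAGGUAG", "ACUCCAUG"))
```

A real pipeline would follow such a score with a free energy calculation and a conservation filter, as described above; neither is attempted here.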
In determining putative mRNA targets for a miRNA, TargetScan [299][300][301] requires target site conservation in the human, mouse, rat, dog and chicken
MiRDB [303,304] is a free online database for miRNA target prediction and functional annotations. All the targets were predicted by a bioinformatics tool called MirTarget2, which was developed by analyzing thousands of genes downregulated by miRNAs with an SVM learning machine
The main steps in identifying miRNA target genes are shown. When miRNA and mRNA (3'UTR) sequences are provided as input data sets, similar data sets from related species are constructed using data on putative orthologs. After preparation of the data sets, miRNA binding sites are identified by determining the base pairing pattern of miRNAs and mRNAs according to the complementarity within specific regions (Step 1); determining the strength of the resulting miRNA-mRNA duplex by calculation of the free energy (Step 2); comparative sequence analysis (Step 3); and checking for the presence of multiple target sites per transcript (Step 4). [312]
Flowchart: mRNA (3'UTR) sequence and microRNA sequence; extraction of orthologous mRNA and miRNA sequences; Step 1: complete pairing at seed region, alignment score (S); Step 2: optimal thermodynamic stability, free energy score (ΔG); Step 3: conserved in related species; Step 4: multiple target sites per target mRNA; potential miRNA target.
Typically, the miRNA binds to a specific site or sites within the 3'UTR region of the mRNA sequence. According to thermodynamic analysis, some degree of complex formation occurs along the entire miRNA-mRNA duplexed region (A). Base pairing is particularly weak in the central region (B) and particularly strong at the 5' end (seed region) of the miRNA (C). These aspects are commonly used to identify putative novel binding sites. Base pairing between let-7 miRNA and hbl-1 mRNA in C. elegans is shown as an example [313].

Correlation of mRNA-miRNA expression levels
The expression levels of the examined mRNAs were quantitated by RQ-PCR from colorectal tissues and correlated with miRNA expression levels quantitated by stem-loop RQ-PCR from the same tumour samples. The correlated genes were then checked against the miRNA target databases to see whether any of them were designated targets worthy of further investigation.

Statistical Analysis
Statistical analysis was carried out with Minitab 15 (Minitab Ltd) and IBM SPSS Statistics 17.0 (SPSS Inc.). Data were tested for normal distribution graphically using histograms and also using the Kolmogorov-Smirnov, Shapiro-Wilk and Anderson-Darling tests. Parametric tests were used where appropriate. One-way ANOVA and the independent t-test were used to determine association and comparisons between independent groups. Correlation analysis used Spearman's Rho and Pearson's correlation coefficient for nonparametric and parametric data respectively. The correlation data were interpreted following Cohen's guidelines [314] (table 2.7). Univariate analysis and the paired t-test were used to assess related samples. The statistical significance of differences in survival between groups was determined by the log rank test, which compares differences along all points of the curve, and multivariate analysis was done using Cox regression. P values <0.05 were considered statistically significant.

Normalisation of RQ-PCR data
Chapter 3

Introduction:
The majority of colorectal tumours originate from adenomatous precursor lesions and develop along a well-defined adenoma-carcinoma sequence. According to this model, the culmination of mutational events including activation of oncogenes and loss of function of tumour suppressor genes results in the emergence of carcinomas [315]. Molecular profiling across the spectrum of normal-adenoma-tumour tissue types has yielded many candidate genes in the search for novel molecular diagnostic and prognostic markers and treatment strategies [316][317][318].
In recent years real-time quantitative (RQ-) PCR has become established as the gold standard for accurate, sensitive and rapid quantification of gene expression [319,320]. In comparison to alternative methods such as Northern blotting and Ribonuclease Protection Assays (RPA), RQ-PCR has been universally adopted as the transcriptomic method of choice due to its superiority with regard to speed, sensitivity, reproducibility and the wide range of instrumentation and reagents commercially available.
To accurately quantify an mRNA target by RQ-PCR, samples are assayed during the exponential phase of the PCR reaction during which the amount of target is assumed to double with each cycle of PCR without bias due to limiting reagents.
Analysis of cycle threshold (Ct), the cycle number at which signals are detected above background, can be used to estimate gene expression levels by relating Ct values either to a standard curve (absolute quantification) or to a control gene (relative quantification). The former method requires the generation of standard curves of known copy number for each target and so is limited by the logistical issues associated with the generation of standards in studies of multiple gene targets. Relative quantification is the most widely adopted approach and, as the name suggests, quantification of gene expression is based on the analysis of a target gene whose expression is normalised relative to the expression of a control gene. Central to the reliable determination of gene expression is the choice of control gene with which to normalise real-time data from target genes.
Normalisation can be achieved using endogenous or exogenous controls; however the use of endogenous control (EC) genes is the most widely adopted approach as it excludes variation associated with differences in amounts of template RNA.
An ideal EC gene (or genes) should be stably expressed and unaffected by parameters such as disease status and, in the case of CRC, should remain unaffected by whether a tissue was derived from normal, adenoma or carcinoma lesions. Traditionally GAPDH (glyceraldehyde phosphate dehydrogenase) has been widely used to normalise RQ-PCR data. A common feature of earlier studies was that the stability of reference gene expression between different sample types was assumed, with little consideration paid to validation of these EC genes as suitable normalisers. More recent studies have brought into question the stability of commonly used EC genes such as GAPDH, on the basis that gene expression levels have been found to vary in response to treatment or as a result of physiological, pathological or experimental changes. For example, alteration in oxygen tension and hypoxia were found to be associated with wide variation in GAPDH, B-ACTIN and CYCLOPHILIN expression [328]. In addition, GAPDH expression was found to be strongly upregulated in diabetic patients and downregulated in response to the administration of bisphosphonate compounds in the treatment of metastatic breast cancer [329]. Other evidence indicates that neoplastic growth can affect EC expression levels [330]. Goidin et al [331] found differences in the expression of GAPDH and B-ACTIN in two sub-populations of melanoma cells derived from a tumour in a single patient. Treatment agents such as dexamethasone, deprenyl and isatin also affect EC gene expression [332,333].
Schmittgen et al [334] reported increased expression of GAPDH, B2M, 18S rRNA and β-ACTIN in fibroblasts after the addition of serum: evidence of the effect of experimental conditions on EC expression. These findings were further supported by Wu et al [335] in their investigation of the effect of different skin irritants on GAPDH and PolyA+ RNA expression. GAPDH was found to be involved in age-induced apoptosis in mature cerebellar cells [336] and also to act as a tRNA binding protein present in the nuclei of HeLa cells [337].
As the use of unreliable ECs can result in inaccurate results, the identification of the most reliable gene or set of genes at the outset of an investigation is critical.
Thus far, a pervasive stably expressed gene (or genes) has yet to be identified across all tissue types [338,339]. This would indicate that the identification of robust ECs at the outset of transcriptomic analysis would yield more reliable and meaningful RQ-PCR data.

Aims
The aim of this study was to evaluate a panel of thirteen candidate EC genes from which to identify the most stably expressed gene (or genes) to normalise RQ-PCR data derived from primary colorectal tumour and tumour-associated normal (TAN) tissue. Six of the candidate EC genes were selected from the literature and represent the most frequently studied reference genes in cancer including, but not limited to, colorectal cancer. Each gene was previously reported as being constitutively expressed in various tissues. These EC genes included B2M (beta-2-microglobulin) [318], HPRT (hypoxanthine guanine phosphoribosyl transferase 1) [316,340], GAPDH [341], ACTB (beta-actin) [342], PPIA (peptidyl-prolyl isomerase A) [322] and MRPL19 (mitochondrial ribosomal protein L19) [322].
The remaining seven genes included HCRT, SLC25A23, DTX3, APOC4, RTDR1, KRTAP12-3, and CHRNB4. The latter candidates were selected from an unpublished whole genome microarray dataset of 20 human tumour specimens and represented the most stably expressed probes with a fold-change of 1.0-1.2 (p<0.05). CXCL12 [343], FABP1 [344], MUC2 [345] and PDCD4 were chosen as target genes against which to measure the effects of candidate EC expression, on the basis of their previously identified roles in tumourigenesis. In addition to its tumour suppressor properties, PDCD4 [346] also has diagnostic and prognostic utility and represents a promising target for anti-cancer therapy.

Study group
A study group of 64 biopsies of human colon tissue was gathered from consenting patients at the time of primary curative surgical resection at Galway University Hospital, Ireland. The cohort comprised 30 colorectal tumour specimens and 34 tumour-associated normal (TAN) tissues. Following excision, all samples were subject to histopathological review prior to immediate snap-freezing in liquid nitrogen and archival at -80ºC until further use.
Concomitant clinicopathological data on patients and specimens was obtained from the Department of Surgery Biobank, NUI Galway as detailed in Table 4.

Ethical approval for this study was granted by the Clinical Research Ethics Committee, Galway University Hospitals.

Candidate Endogenous Control Genes
Based on literature search six commonly used candidate endogenous control genes were selected for analysis: ACTB, GAPDH, HPRT, B2M, PPIA and MRPL19. An additional panel of seven genes: HCRT, SLC25A23, DTX3, APOC4, RTDR1, KRTAP12-3 and CHRNB4, was also selected for analysis (table 3.2). To our knowledge all genes have independent cellular functions and were assumed not to be co-regulated.
Negative control samples were included in each set of reactions. Reactions were incubated at 25ºC for 5 minutes followed by 50ºC for 1 hour and final denaturation at 72ºC for 15 minutes. Samples were subsequently diluted to 50 µL in nuclease-free water and stored at -20ºC. The expression of each EC gene was analysed by RQ-PCR using TaqMan gene expression assays on a 7900HT instrument (Applied Biosystems). All reactions were performed in 20 µL volumes, in triplicate within the same PCR run. Negative controls were included for each gene target under assay. On each plate, an interassay control was included to account for any variations between runs. For each well, 2 µL of cDNA from each sample was added to 18 µL of PCR reaction mix, which consisted of 10 µL TaqMan universal master mix, No AmpErase UNG, 7 µL nuclease-free water and 1 µL gene expression assay primer-probe mix (Applied Biosystems). The PCR reactions were initiated with a 10 minute incubation at 95ºC followed by 40 cycles of 95ºC for 15 seconds and 60ºC for 60 seconds, in accordance with the manufacturer's recommendations.

PCR Amplification Efficiency
Amplification efficiencies for each EC gene assay were calculated applying the formula $E = (10^{-1/\text{slope}} - 1) \times 100$, using the slope of the plot of Ct versus log input of cDNA (10-fold dilution series). A threshold of 10% above and below 100% efficiency was applied. PCR amplification efficiency for each candidate EC gene is shown in table 3.1.

Data Analysis
Cycle threshold (Ct) is defined as the PCR cycle number at which the fluorescence generated from amplification of the target gene within a sample increases to a threshold value of 10 times the standard deviation of the baseline emission; it is inversely related to the starting amount of the target cDNA.

Range of Expression of Candidate EC Genes
A range of C t values was observed across the candidate EC genes in tumour and

Identification of Optimal EC genes
Scaled expression levels across the remaining nine candidate ECs analysed (figure 3.1) indicated within-gene differences in expression between tumour and normal tissue groups in both SLC25A23 (p=0.040) and CHRNB4 (p=0.002) but not in the remaining genes (p>0.05) (figure 3.1A). Therefore, the SLC25A23 and CHRNB4 genes were excluded from further analysis. Significant differences in variance of EC expression were identified using Levene's test (p<0.001, figure 2B). These findings necessitated further evaluation of each candidate EC gene prior to their possible use to accurately quantitate expression levels of the target genes CXCL12, FABP1, MUC2 and PDCD4.
The stability of candidate EC genes was analysed using geNorm [321] and NormFinder [347] programmes. Stability was further evaluated using qBasePlus [321,348], a commercially available RQ-PCR data mining package. These programmes were used to calculate amplification efficiency-corrected relative quantities from raw fluorescence data. The ranking of candidate EC genes as determined by each of these programmes is illustrated in Table 3. In the case of GeNorm the variable V indicating the pairwise variation (Vn/Vn+1) between two sequential normalisation factors (NFn/NFn+1) indicated that three EC genes was the optimal number of genes for accurate normalisation (figure 3.2), however, target genes expression did not differ significantly if two rather than three EC genes were used (figure 3.3). Use of all three programmes confirmed that B2M and PPIA was the best combination of genes for normalising RQ-PCR data in CRC tissues (table 3.3). The Equivalence test [349] was used to examine the expression of candidate ECs. All genes were equivalently expressed between the normal and tumour colorectal tissues using a fold cut-off of 2 (figure 3.4).   The GeNorm programme calculates a normalisation factor (NF) which is used to determine the optimal number of EC genes required for accurate normalisation. This factor is calculated using the variable V as the pairwise variation (Vn/Vn + 1) between two sequential NFs (NFn and NFn + 1). To meet the recommended cut off V-value which is the point at which it is unnecessary to include additional genes in a normalisation strategy. The recommended limit for V value is 0.15 but it is not always achievable. In this instance, the GeNorm output file indicated that the optimal number of genes required for normalisation was three.    In the case of NormFinder, stability is calculated from inter-and intra-group variation. By grouping the tissues into tumour and normal the best combination of genes was identified. 
For geNorm, stability was based on the estimation of pairwise variation. QBasePlus, through its components geNorm and qBase, identified coefficient of variation (CV) and stability (M) values, and thereby the best combination of genes for normalisation, but only when more than one gene is used.
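The pairwise-variation idea behind the geNorm stability value M can be sketched as follows. This is a minimal illustration and not the geNorm or qBasePlus implementation: it assumes M for a gene is the mean, over all other candidates, of the standard deviation of the log2 expression ratios across samples. The function name, the placeholder gene names and the quantities are hypothetical and are not data from this study.

```python
import math
from statistics import stdev


def genorm_m_values(quantities):
    """Return a geNorm-style stability value M for each candidate gene.

    quantities maps a gene name to a list of relative quantities
    (linear scale, one value per sample, same sample order for all genes).
    """
    genes = list(quantities)
    m_values = {}
    for gene in genes:
        pairwise_sds = []
        for other in genes:
            if other == gene:
                continue
            # log2 ratio of the two genes in every sample
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(quantities[gene], quantities[other])
            ]
            # pairwise variation: standard deviation of those log ratios
            pairwise_sds.append(stdev(log_ratios))
        # M: mean pairwise variation against all other candidates
        m_values[gene] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Hypothetical relative quantities for three placeholder genes in four samples.
example = {
    "EC_A": [1.0, 2.0, 4.0, 8.0],
    "EC_B": [2.0, 4.0, 8.0, 16.0],  # always twice EC_A, so the ratio is constant
    "EC_C": [1.0, 1.0, 8.0, 2.0],   # varies independently of the other two
}
m = genorm_m_values(example)
ranking = sorted(m, key=m.get)  # lowest M (most stable) first
```

In this toy input the two genes with a constant ratio share the lowest M, while the independently varying gene ranks last, which mirrors how a stable pairing is singled out from a panel of candidates.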

Association between EC genes and target genes
There was a significant effect of the expression of the candidate EC genes on   Table 1 Additional files for Post Hoc tests. Error bars indicate 95% confidence intervals.

Non-normalised expression levels of target genes
To assess whether normalisation was necessary in a large cohort such as this in which the biological effect of the target genes is already established, we compared

Figure 3.6: Non-normalised Ct of target genes in CRC
Using this approach, the expression of CXCL12, FABP1, MUC2 and PDCD4 appeared to be down-regulated in tumours compared to normal tissues in the large cohort of patients (30 tumour and 34 normal tissue specimens), consistent with previously published reports of reduced expression in colorectal tumours. No significant differences in the expression levels of target genes were noted in the small cohort of patients (10 tumour and 10 normal tissue specimens) (2-sample t-test). This confirms the effect of sample size on findings based on non-normalised Ct values, and therefore the importance of normalisation, especially in studies of this type.

Discussion
Since its introduction in 1996 [350] [321] thereby clearly indicating the potential for superior accuracy when due consideration is paid to the choice of EC genes.
Many analytical programmes for relative quantification have been developed, some of which enable the identification of EC genes from a study population [349,352,353]. In the present study the stability of expression of candidate EC genes was determined using a pair-wise comparison model, geNorm [321], and an MS Excel ANOVA-based model, NormFinder [347]. No effect of disease status on EC gene expression was identified in colorectal tissue. Since both geNorm and NormFinder are based on the assumption that candidate genes are not differentially expressed between samples, this was an important first step prior to their continued use [322,323].
In this study geNorm was used to identify the most stably expressed EC genes from our panel of candidates and also provided a measure of the optimal number of EC genes. B2M and PPIA were identified as the most stable pairing. In order to achieve a pair-wise variation value (V) below the cut-off of 0.15, additional genes should theoretically be used; however, this cut-off point is not absolute [326] and may not always be achievable [354]. No significant difference in target gene expression was observed when the three most stable EC genes identified by geNorm were used rather than the top two, confirming that the use of a pair of genes may be more practicable given cost, workload and sample availability considerations.
NormFinder was designed to identify the EC genes with the lowest stability values; these values are calculated from intra- and inter-group variation. In this study NormFinder was used to define the best combination of genes using tumour and normal as group identifiers in the calculations. MRPL19 was selected as the most stable gene using these criteria; however, B2M and PPIA were highlighted as the best combination of genes, with an even lower stability value than MRPL19 alone. The qBasePlus real-time PCR data manager programme was developed from the geNorm and qBase [348] algorithms. QBasePlus was used to confirm our selection of the B2M and PPIA pairing as the best combination of ECs in colorectal tissue.
Equivalence testing was developed in biostatistics to address the situation where the aim is not to show a difference between groups, but rather to establish that two methods are equal to one another. In equivalence testing, the null hypothesis is that the two groups are not equivalent to one another, and hence rejection of the null indicates that the two groups are equivalent. As stated by Haller et al, relying on the absence of a significant difference alone carries a risk of accepting non-differentially expressed genes as suitable controls although they are not equivalently expressed [355]. Equivalence of expression between tumour and normal colorectal tissue was confirmed for all candidate EC genes using the equivalence test and a fold cut-off of 2. DTX3, B2M, MRPL19 and PPIA showed the least variability in the confidence interval and can therefore be used for normalisation.
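The logic of an equivalence test with a fold cut-off can be illustrated with a short sketch. It is not the procedure of reference [349] or [355]: it assumes log2-transformed expression values, uses a normal approximation for the confidence interval in place of the t distribution, and declares equivalence when the 90% interval for the tumour minus normal difference lies entirely within plus or minus log2 of the fold cut-off. The function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def equivalent_within_fold(log2_tumour, log2_normal, fold_cutoff=2.0, alpha=0.05):
    """Return (confidence interval, verdict) for a two one-sided style check.

    Inputs are log2 expression values for each group. The verdict is True
    when the whole interval lies inside +/- log2(fold_cutoff).
    """
    diff = mean(log2_tumour) - mean(log2_normal)
    # standard error of the difference between two independent group means
    se = math.sqrt(
        variance(log2_tumour) / len(log2_tumour)
        + variance(log2_normal) / len(log2_normal)
    )
    # one-sided alpha on each side gives a (1 - 2*alpha) interval, here 90%
    z = NormalDist().inv_cdf(1.0 - alpha)
    lower, upper = diff - z * se, diff + z * se
    bound = math.log2(fold_cutoff)
    return (lower, upper), (-bound < lower and upper < bound)


# Hypothetical log2 expression values, not data from this study.
tumour = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2]
normal = [0.0, 0.1, -0.1, 0.2, -0.2, 0.1]
interval, is_equivalent = equivalent_within_fold(tumour, normal)
```

A gene whose interval is narrow and centred near zero passes easily, which corresponds to the description above of candidates with the least variability in the confidence interval being preferred.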
In their study to identify EC genes to monitor enterocyte differentiation and to compare normal colon and colonic adenocarcinoma using microarray data, Dydensborg et al [318] recommended RPLP0 for normalising gene quantification in human intestinal epithelial cells and B2M for studying gene expression in human colon cancer. In addition, Blanquicett et al [356] analysed the extent of variability in gene expression between tumour and normal colorectal and liver tissues using two-tailed t-tests. They showed that 18S, S9 and GUS were the least variable genes in normal and metastatic liver specimens and were also appropriate for normal and tumour colorectal tissues. In the present study, we confirmed that more than one EC gene is required for optimal normalisation in colorectal tissue.
We used clinico-pathologically diverse tissues to systematically evaluate normalisation of gene expression data in colorectal tissues. We also conducted equivalence testing to confirm the equality of expression of each EC gene.
Thereby, the risks of incorrect rejection (type 1 error) and of false negativity (type 2 error) were minimised.
As stated above, significant differences in target gene expression were observed when using each of the EC genes and the combination of PPIA and B2M.
Moreover, a significant effect of the EC on the magnitude of error associated with estimation of target gene expression was also identified in this study. The findings reported in this study confirm that the use of two EC genes to normalise RQ-PCR data resulted in superior accuracy in the quantification of gene expression in colorectal tissue. The combined use of B2M and PPIA was validated as the optimal pair of EC genes with which to estimate the expression of all four target genes in colorectal cancer tissue. Although these ECs may not be ideal in other tissue types, the approach described herein could serve as a template to identify valid ECs in other tissue types.

Introduction
At the molecular level, activation of oncogenes and inactivation of tumour suppressor genes [359] are processes known to be involved in colorectal carcinogenesis. Additionally, abrogation of mismatch repair systems [360] contributes to some colorectal cancers.
On the other hand, real-time quantitative polymerase chain reaction (RQ-PCR) is a combination of the reverse transcriptase (RT)-dependent conversion of RNA into cDNA, the amplification of the cDNA using the PCR, and the detection and quantification of amplification products in real time [364]. It addresses the evident requirement for quantitative data analysis in molecular medicine and has become the gold standard method for the quantification of mRNA, and therefore for the validation of microarray data.

Candidate genes
To identify genes whose expression is deregulated in colorectal cancer, and which might therefore have a role in colorectal tumourigenesis, we carried out a detailed analysis of published colorectal cancer microarray data and identified the most prominent genes. Furthermore, a literature review was performed to identify mRNAs highly associated with cancer and to establish their role in colorectal cancer pathogenicity and progression [185,363,365].
[191,203]
Transforming growth factor beta 1; TGFB1; 19q13.1; Hs00998133_m1; 57; [185,365,381-383]
Transforming growth factor beta receptor type 1; TGFBR1; 9q22; Hs00610320_m1; 73; [384,385]
Transforming growth factor beta receptor type 2; TGFBR2; 3p22; Hs00234253_m1; 70; [381,386]
used as an indicator of circulating tumour cell load [387,388]. Quantification of CEACAM5 expression levels in tumour tissues may establish an alternative approach with higher sensitivity and specificity for the diagnosis and follow-up of colorectal cancer patients.
The chemokine CXCL12, also known as stromal cell-derived factor-1 (SDF-1), has been shown to play a significant role in tumourigenesis, promoting angiogenesis and tumour invasion and migration to metastatic sites [378,392-394]. These observed effects of CXCL12 were previously thought to be mediated entirely through its receptor CXCR4. However, the recently described receptor for CXCL12, CXCR7 [395,396], may require a re-examination of much of the previous work that presumed an exclusive effect of the CXCL12/CXCR4 axis.

Fatty acid-binding protein 1, liver (FABP1)
Liver fatty acid-binding protein (L-FABP, FABP1) is a member of a family of intracellular proteins that mediate the transport and utilisation of lipids. It is specifically expressed in hepatocytes and enterocytes and could serve as a sensitive marker of enterocyte differentiation [397,398]. It increases the solubility of fatty acids in the cell cytoplasm and facilitates their uptake and processing [399-401].
FABP1 plays an active part in several physiological functions including signal transduction, modulation of cell division, cell growth and differentiation, and regulation of gene expression [402,403]. All these functions are deregulated in tumourigenesis and tumour progression, supporting a possible role for this molecule in colorectal cancer development and progression.

Interleukin 8 (IL-8, CXCL8)
IL8 is a major mediator of the inflammatory response. It is secreted by leukocytes and tumour cells and functions as a chemoattractant and a potent angiogenic factor [404,405]. Increased IL-8 expression has been found in most metastatic and solid tumours, including breast cancer, melanoma and ovarian cancer [406-408]. In colorectal cancer, studies have found that IL-8 serum levels correlate with poorer prognosis, tumour progression and metastasis [409].

Mucin 2 (MUC2)
MUC2, an intestinal-type gel-forming secretory mucin, is produced and secreted by goblet cells and is a major constituent of mucus, which acts to lubricate and protect intestinal epithelial tissues [410]. In animals, inactivation of MUC2 caused intestinal tumour formation, which was accompanied by increased proliferation, decreased apoptosis and increased migration of the cells [411]. These alterations might relate primarily to the absence of MUC2 or could be secondary to inadequate protection of the intestinal mucosa. Reduced MUC2 expression was reported to be associated with the development and progression of colorectal cancer [412,413]; however, in tumours such as gastric and bladder cancer, overexpression of MUC2 was noted [414,415].

Programmed cell death 4 (PDCD4)
PDCD4 is a novel tumour suppressor that inhibits tumour promotion and progression in both cell lines and animal models. The main functions of the gene are to inhibit translation, suppress proliferation and cell cycle progression, and induce apoptosis [416,417]. It achieves these functions through interaction with many other molecules and pathways. PDCD4 is down-regulated in several human cancers including lung, ovarian and brain cancer [418-420]. In colorectal cell lines, downregulation of PDCD4 was found to promote invasion and metastasis.

TGFB1 and its receptors (TGFBR1 and TGFBR2)
Transforming growth factor-β1 (TGFB1), a multifunctional cytokine, mediates its effect on cells through a heteromeric receptor complex that consists of type I and type II components. Pathway signalling is initiated by binding of TGFB to the type II receptor (TGFBR2), which consequently recruits and phosphorylates the type I receptor (TGFBR1). This leads to stimulation of TGFBR1 protein kinase activity. Activated TGFBR1 then phosphorylates two downstream transcription factors, SMAD2 and SMAD3, allowing them to form a complex with SMAD4.
The complexes then translocate into the nucleus and interact with other transcription factors to regulate the transcription of TGFB1-responsive genes.
The TGFB1 pathway, one of the most commonly altered cellular pathways in human cancer, is involved in several physiological functions including cell proliferation, differentiation, migration and apoptosis [421]. It also stimulates angiogenesis, directly through induction of VEGF expression or indirectly by attracting monocytes, which release angiogenic cytokines [422]. In addition, TGFB1 is involved in the regulation of extracellular matrix production, cell adhesion and immune surveillance. These functions are an integral part of tissue homeostasis and represent logical targets for dysregulation in carcinogenesis.
In colorectal cancer, high serum levels of TGFB1 protein were found to be associated with advanced Dukes' stage, depth of tumour invasion and metastasis [428]. However, at an early stage of the disease it was found to suppress nonmetastatic tumour growth [429]. Regarding the TGFB1 receptors, the TGFBR1*6A polymorphism was linked to hereditary colorectal cancer [430,431], while TGFBR2 inactivation was identified in more than 90% of tumours with microsatellite instability [147].

Aims
The aim of the study was to quantify candidate gene expression in colorectal cancer tissues using RT-PCR in order to:
- Determine the expression levels of candidate genes in tumour and tumour-associated normal colorectal tissue
- Investigate the correlation between serum carcinoembryonic antigen (CEA) and tissue CEACAM5 levels
- Correlate candidate gene expression levels with clinicopathological variables.

RNA extraction and analysis
Tissue samples (50-100 mg) were homogenised using a hand-held homogeniser (Polytron PT1600E) in 1-2 mL of QIAzol reagent (Qiagen). Tumour and TAN samples were homogenised separately but on the same day. Two methods of RNA extraction were employed in the study: total RNA extraction (co-purification) and separate purification of mRNA and miRNA. To ensure both methods were working properly, correlations of RNA concentration and quality were carried out and showed good results.

Reverse transcription
First strand cDNA was synthesised using Superscript III reverse transcriptase (Invitrogen) and random primers (N9; 1 µg, MWG Biotech). Negative control samples were included in each set of reactions. Reactions were incubated at 25 °C for 5 minutes, followed by 50 °C for 1 hour and a final denaturation at 72 °C for 15 minutes. Samples were subsequently diluted to 100 µL in nuclease-free water and stored at -20 °C.

Amplification efficiency
In determining gene expression using RQ-PCR and relative quantification, it is important to consider the amplification efficiency of the assay in use. The PCR efficiency impacts greatly on the accuracy of the calculated expression result and is influenced by the components of the PCR reaction. At 100% efficiency the amount of DNA doubles at each cycle, while at 80% and 70% efficiency the amount of DNA increases from 1 to 1.8 and 1.7, respectively. Because this difference compounds over every cycle, a small difference in efficiency makes a large difference in the amount of the final product.
Amplification efficiencies for each EC gene assay were calculated by applying the formula $E = (10^{-1/\text{slope}} - 1) \times 100$, using the slope of the plot of Ct versus log input of cDNA (10-fold dilution series). A threshold of 10% above and below 100% efficiency was applied (Table 4.4).
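The efficiency calculation described above can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, the slope is obtained by ordinary least squares on Ct versus log10 input, and the dilution series is synthetic, constructed to behave as a perfectly doubling assay.

```python
import math


def fit_slope(log10_inputs, ct_values):
    """Least-squares slope of Ct plotted against log10 of cDNA input."""
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_inputs, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    return sxy / sxx


def amplification_efficiency(slope):
    """Percentage efficiency: E = (10^(-1/slope) - 1) * 100."""
    return (10.0 ** (-1.0 / slope) - 1.0) * 100.0


def within_threshold(efficiency, tolerance=10.0):
    """True when efficiency lies within +/- tolerance of 100%."""
    return 100.0 - tolerance <= efficiency <= 100.0 + tolerance


def yield_after_cycles(efficiency, cycles):
    """Fold increase in product after a number of cycles at a given efficiency."""
    return (1.0 + efficiency / 100.0) ** cycles


# Synthetic 10-fold dilution series (assumed values, not study data).
ideal_slope = -1.0 / math.log10(2.0)          # slope of a perfectly doubling assay
log10_inputs = [0.0, -1.0, -2.0, -3.0, -4.0]  # log10 of relative cDNA input
ct_values = [18.0 + ideal_slope * x for x in log10_inputs]

slope = fit_slope(log10_inputs, ct_values)
efficiency = amplification_efficiency(slope)
accepted = within_threshold(efficiency)
```

Calling `yield_after_cycles` with efficiencies of 100 and 80 over the same number of cycles shows how quickly the per-cycle difference between 2 and 1.8 compounds, which is why assays outside the tolerance band are treated with caution.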

Endogenous control
Relative quantification is the most widely adopted approach whereby quantification of gene expression is normalised relative to an endogenously qBasePlus. We determined that two genes were required for optimal normalisation and identified B2M and PPIA as the most stably expressed and reliable EC genes [292].

RQ-PCR of mRNA
The expression of each EC gene was analysed by RQ-PCR using TaqMan gene expression assays on a 7900HT instrument (Applied Biosystems). All reactions were performed in 20 µL volumes, in triplicate within the same PCR run.
Negative controls were included for each gene target under assay. On each plate, an interassay control was included to account for any variations between runs. For each well, 2 µL of cDNA from each sample was added to 18 µL of PCR reaction mix, which consisted of 10x TaqMan universal master mix, No AmpErase UNG, 7X nuclease-free water and 1X gene expression assay primer-probe mix (Applied Biosystems). The PCR reactions were initiated with a 10 minute incubation at 95 °C followed by 40 cycles of 95 °C for 15 seconds and 60 °C for 60 seconds, in accordance with the manufacturer's recommendations.

Relative quantification
Cycle threshold (Ct) is defined as the PCR cycle number at which the fluorescence generated from amplification of the target gene within a sample increases to a threshold value of 10 times the standard deviation of the baseline emission; it is inversely related to the starting amount of the target cDNA. QBasePlus was used to calculate candidate gene expression relative to the endogenous control genes. It applies the ΔΔCt method, where ΔΔCt = (Ct target gene, test sample - Ct endogenous control, test sample) - (Ct target gene, calibrator sample - Ct endogenous control, calibrator sample). Relative quantities were corrected for efficiency of amplification, and fold change in gene expression between groups was calculated as $E^{-\Delta\Delta C_t}$ ± s.e.m. The lowest expressed sample was used as a calibrator.
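The ΔΔCt arithmetic above can be written out as a short sketch. It is not the qBasePlus implementation. It assumes that E is expressed as the per-cycle amplification factor (2.0 for a 100% efficient assay) and that two endogenous controls are combined by averaging their Ct values; both are assumptions made for illustration, and the function names and Ct values are hypothetical.

```python
def mean_ct(ct_values):
    """Average Ct across endogenous controls (assumed way of combining two ECs)."""
    return sum(ct_values) / len(ct_values)


def delta_delta_ct(ct_target_test, ct_ec_test, ct_target_cal, ct_ec_cal):
    """ddCt = (Ct target - Ct EC) in test sample minus the same in calibrator."""
    return (ct_target_test - ct_ec_test) - (ct_target_cal - ct_ec_cal)


def fold_change(ddct, amplification_factor=2.0):
    """Relative expression E^(-ddCt); amplification_factor is 2.0 at 100% efficiency."""
    return amplification_factor ** (-ddct)


# Hypothetical Ct values, not data from this study.
ec_test = mean_ct([17.0, 19.0])        # two endogenous controls, test sample
ec_calibrator = mean_ct([17.0, 19.0])  # same controls, calibrator sample
ddct = delta_delta_ct(
    ct_target_test=25.0,
    ct_ec_test=ec_test,
    ct_target_cal=23.0,
    ct_ec_cal=ec_calibrator,
)
relative_expression = fold_change(ddct)
```

A target that crosses the threshold two cycles later than in the calibrator, with unchanged controls, comes out at one quarter of the calibrator level under the 100% efficiency assumption; passing a smaller amplification factor shows how an efficiency correction changes the estimate.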

Statistical analysis
Statistical analysis was carried out with IBM SPSS Statistics 17.0 (SPSS Inc.).
Data were tested for normal distribution graphically using histograms and also using the Kolmogorov-Smirnov and Shapiro-Wilk tests. Parametric tests were used where appropriate. One-way ANOVA and the independent t-test were used to determine associations and comparisons between independent groups. Correlation analysis used Spearman's rho and Pearson's correlation coefficient for nonparametric and parametric data, respectively. Univariate analysis and the paired t-test were used to assess related samples. The statistical significance of differences in survival between groups was determined by the log-rank test, which compares differences along all points of the curve, and multivariate analysis was performed using Cox regression. P values <0.05 were considered statistically significant.

In blood
The expression levels of all the candidate genes and the endogenous control genes were undetermined in blood using RQ-PCR despite the high concentration and the good quality of RNA extracted.

Cadherin 17 (CDH17)
The expression of CDH17 was significantly lower in colorectal cancer compared to TAN tissues (p<0.001, t-test,

CXCL12 and its receptors (CXCR4 and CXCR7)
A paired t-test was used to investigate the difference in gene expression between tumour and normal tissues from colorectal cancer patients in 101 paired tissue samples.

Mucin 2 (MUC2)
Again, a progressive change in expression from tumour, to polyp, to tumour-associated normal tissue was observed for MUC2 (p<0.001, Kruskal-Wallis test, Figure 4.16A). Further analysis confirmed a significant difference in MUC2 expression levels between tumour and TAN (p<0.001), but not between polyps and TAN (p=0.081) or between tumours and polyps (p=0.218).

Neoadjuvant therapy and colorectal cancer genes expression
In the cohort of rectal cancer patients (n=58) we analysed the differences in gene expression in patients who had neoadjuvant chemoradiation (n=25) compared to those who did not (n=33) using t-test. Univariate analysis of variance was further conducted to test for interaction effect and to control for confounding factors. We

Discussion:
The clinical and pathological parameters of colorectal cancer are still the basis of treatment options, classification and prognostic stratification; however, they may be inadequate in everyday practice due to the great biological and genetic heterogeneity of the disease. Furthermore, the molecular factors involved in prognosis and response to therapy in CRC are poorly understood.
Detection of disseminated tumour cells in the peripheral blood of colorectal cancer patients has been achieved primarily using immunocytological or flow cytometry-based techniques [432,433]. The clinical usefulness and high sensitivity of PCR in detecting cancer markers in circulating tumour cells have been confirmed previously [434,435]. In colorectal cancer, RQ-PCR was used to determine the expression levels of CEA, CK20 and CK19 mRNA in peripheral blood and proved a valuable tool for cancer staging and disease monitoring [387,436,437]. The undetermined expression of mRNA in peripheral blood in this study may be explained by the low concentration of these molecules in the blood. This problem could be overcome by optimising the extraction methods and applying RNA concentration techniques.
In this study we characterised the expression of a group of genes in colorectal cancer. Although their expression levels were undetermined in blood, we identified a comprehensive list of genes with highly differential expression patterns in colorectal cancer tissues that could serve as molecular markers to complement existing histopathological factors in diagnosis, follow up and therapeutic strategies for individualised care of patients.
CDH17 is a member of the cadherin superfamily of genes, which encode calcium-dependent membrane-associated glycoproteins that mediate cell-cell adhesion in the intestinal epithelium. The protein is a component of the gastrointestinal tract and pancreatic ducts, where it functions as an intestinal proton-dependent peptide transporter. The mechanism of the adhesive function of CDH17 is unclear, but it could be complementary to that of classical cadherins such as E-cadherin [438]. Cell-to-cell adhesion by CDH17 is apparently independent of any interaction with cytoskeletal components because its cytoplasmic domain is very short [439].
High expression levels of CDH17 were noted in hepatocellular carcinoma and gastric cancer and were found to be associated with intestinal-type gastric carcinoma, poor survival, tumour invasion and lymph node metastasis, while in pancreatic and colorectal cancer the reverse is true [370,372-374,440-443]. These observations in diverse types of tumours highlight tumour-specific expression patterns and presumably reflect tissue-specific regulatory mechanisms for CDH17.
In colorectal cancer, CDH17 expression was only investigated at protein level  [445]. Overexpression of CEACAM5 antigen was identified in the majority of carcinomas involving the gastrointestinal, respiratory and genitourinary tracts and in breast cancer [445,446]. Moreover, it has been reported to promote metastatic potential in some experimental tumours [447]. The antigenic characteristics of CEACAM5, in addition to its role in tumour biology and metastasis, make it a favourable target for immunotherapy [448-451].
We identified non-significant overexpression of CEACAM5 in tumour compared to tumour-associated normal colorectal tissues. Overexpression of CEACAM5 was significantly associated with moderately differentiated tumours and tumour invasion. In relation to CEA protein, although the expression of the gene was high in patients with raised CEA protein serum level, we failed to identify any correlation between the tumour CEACAM5 expression and the CEA serum level.
Previous reports have shown that increased CEA protein levels do not involve gene amplification or rearrangement but may be due to hypomethylation of upstream regions and factor changes leading to an altered rate of transcription [452,453].
Chemokines are low molecular weight proteins that share a high degree of structural homology and the ability to attract specific cell types such as leukocytes and tumour cells. They exert their biological effect by coupling to G protein-linked transmembrane receptors called chemokine receptors. Binding of chemokines to their receptors triggers activation of many signalling pathways, including activation of calcium fluxes and protein kinases [454]. Dysregulated chemokine expression has fundamental roles in tumour initiation and progression, and recent studies have shown that chemokines and chemokine receptors contribute to cancer biology in different ways. They are able to modulate the local inflammatory reaction, exerting pro- or anti-tumorigenic activity, and may promote angiogenesis and tumour cell proliferation, migration and survival [389,455,456].

CXCL12 (SDF-1), which binds and signals through the chemokine receptors CXCR4 and CXCR7, regulates many essential biological processes including angiogenesis, apoptosis, cell motility, migration and adhesion, and cardiac and neural development [457,458]. In mice with genetic deletion of CXCL12, early-stage embryos exhibit profound defects in vascular and brain development, haematopoiesis and cardiogenesis, which lead ultimately to embryonic lethality [459,460]. The recent evidence that CXCL12 also binds the CXCR7 receptor raised many questions about the potential contribution of the CXCL12/CXCR7 axis to processes that were previously attributed solely to the CXCL12/CXCR4 interaction.
While CXCR4 mRNA and protein were reported to be expressed on several cell types such as immune cells, epithelial cells and various types of cancer cells, CXCR7 protein is rarely expressed on the surface of normal nontransformed adult tissues [395,461-463]. Interestingly, nontransformed tissues that lack surface CXCR7 expression expressed CXCR7 mRNA, suggesting that CXCR7 could be regulated in a posttranslational manner [395]. Furthermore, in contrast to CXCR4, which only binds to CXCL12, CXCR7 is able to interact with two chemokines, CXCL11 (I-TAC) and CXCL12 [395,464]. Hence, the CXCL12-mediated response could potentially be modulated by I-TAC. These facts must always be considered when the biological effect of CXCL12 via these receptors is evaluated.
A considerable number of previous reports have demonstrated that overexpression of CXCR4 on tumour cells is associated with increased tumour growth and metastasis [465,466]. CXCR4 signalling was found to play a crucial role in the metastatic homing of breast [467], ovarian [457] and prostatic [468] cancer cells by inducing chemotactic and invasive responses. Muller et al reported that primary breast cancer cells expressed CXCR4, whereas CXCL12 was found at elevated levels in the metastatic sites of breast cancer such as bone marrow, lung, lymph nodes and liver. Neutralisation of CXCL12/CXCR4 interactions led to inhibition of breast cancer lymphatic and lung metastasis [467]. In addition, marked expression of CXCR7 was determined in a variety of tumour cell lines and in primary human tumours, and correlated with tumour aggressiveness, angiogenesis, metastasis and promotion of tumour growth [395,461-463].
Intestinal epithelium produces chemokines to regulate the trafficking of leukocytes into and out of the lamina propria [469]. Both CXCL12 and CXCR4 are normally expressed in these cells; however, in colorectal cancer cells CXCR4 is overexpressed while CXCL12 seems to be partially or irregularly expressed, as shown by immunohistochemistry and RQ-PCR [392,470]. In colorectal cancer, the overexpression of CXCR4 was significantly associated with advanced tumours and metastasis [393]. Furthermore, CXCR4 was found to induce stimulation of colon growth, VEGF release and ICAM-1 upregulation [471]. Similarly, CXCL12 was reported to increase VEGF expression and to induce cell proliferation, metastasis and migration in colorectal cancer cells [392]. On the other hand, CXCR7 in colorectal cancer was only investigated in colorectal cancer cell lines and animal colon cancer models, and was determined to regulate angiogenesis and induce proliferation and chemotaxis of cancer cells [377,379]. Although the role of CXCL12 and its receptors in colorectal cancer has been investigated before, there is only sparse information on their function in carcinogenesis in vivo. Moreover, the data reported were controversial and generated from small sample sizes. In this study we used RQ-PCR, a gold standard method for gene analysis, to determine the clinicopathological correlation of CXCL12, CXCR4 and CXCR7 mRNA in 107 tumour and tumour-associated normal tissues, which is the largest sample size to date.
We demonstrated a significant down-regulation of CXCL12 in tumour compared to tumour-associated normal colorectal tissues, in contrast to CXCR4, which showed non-significantly up-regulated expression levels in tumour tissues.
In summary, in our cohort of CRC patients we found a reciprocal pattern of CXCL12 and CXCR4 expression, in which increased expression of CXCR4 is associated with decreased expression of its ligand. Furthermore, we demonstrated a significant correlation between the expression of CXCL12, CXCR4 and CXCR7 in both tumour and normal colorectal tissues. We also addressed for the first time the significant association between clinicopathological variables such as tumour size, location, grade, invasion and lymph node status and the expression of the target genes. To our knowledge, this is the first study to report the prognostic significance of CXCR7 expression in cancer patients.
Liver fatty acid-binding protein (FABP1) is a specific marker of hepatocytes and enterocytes. In the gastrointestinal tract it exists only in the epithelial absorptive cells of the small intestine and the colon, but not in the oesophagus and stomach [473]. The expression of FABP1 has been investigated in numerous cancers. High expression was identified in hepatocellular carcinoma, lung cancer and gastric cancer [474-476].
Evidence of dysregulated FABP1 gene expression has been reported in colorectal gene expression array datasets [365,477]; however, little is known of its expression profile with regard to clinical data. Lawrie et al. used proteomics and immunohistochemistry to determine the changes in FABP1 in 20 colorectal tumours. They identified consistent loss of FABP1 in tumour compared to normal colon. They also noted an association between decreased protein expression and both poorly differentiated tumours and large adenomas [344]. Moreover, FABP1 expression was significantly associated with good prognosis after liver resection of colorectal cancer metastases in the study by Yamazaki et al., who investigated 68 liver metastases and 10 primary colorectal cancers using immunohistochemistry [478].
Although no statistically significant correlation between FABP1 expression and clinicopathological parameters was identified in this study, we observed that
In addition to its role in inflammation, IL8 may promote tumour progression, invasion and metastasis by stimulating neoangiogenesis and activation of matrix proteases [479,480]. IL8 was shown to directly modulate endothelial cell proliferation and migration, hence promoting angiogenesis [481,482]. It also exerts its effect on endothelial cells indirectly via increased secretion of vascular growth factors such as vascular endothelial growth factor and basic fibroblast growth factor [407]. Furthermore, inflammatory cells recruited by IL8 to the cancer site may contribute to tumour progression through the release of growth and angiogenic factors and promote invasion and distant metastasis [483]. Although high expression of IL8 in colorectal cancer has been noted before [484,485], no in vivo correlations with clinicopathological variables were identified.
In keeping with previous reports, we noted overexpression of IL8 in tumour compared to normal colorectal tissue. In addition, we identified a progressive increase in gene expression from normal tissue, to polyps, to tumour. The early dysregulation of IL8 in colorectal cancer suggests that the gene may play a role in carcinogenesis in addition to its confirmed role in tumour progression.
Correlations with clinicopathological parameters revealed a significant association between reduced IL8 expression and poor tumour differentiation, advanced nodal stage and disease recurrence. Although the significance of these findings is unclear, they should be considered when planning IL8-targeting therapy.
Mucus is a viscoelastic secretion produced by specialised epithelial cells. The human mucin family consists of at least 21 members, designated MUC1 to MUC21, which have been classified into secreted gel-forming and membrane-bound (transmembrane) forms [487,488].
The intestinal mucosa is covered by mucus, which consists partly of secreted mucins and provides a physical barrier that limits damage to the epithelium by luminal contents including bile, enzymes, ingested toxins and normal flora. The mucus consists of a less dense outer layer and a densely packed, highly enzyme-resistant inner layer. The inner layer is composed of uncleaved MUC2 and is free of bacterial colonisation. Therefore, loss of MUC2 expression could contribute to numerous colonic pathologies including, but not limited to, ulcerative colitis and carcinoma [488,489]. However, the role of MUC2 as a tumour suppressor may seem confusing, as MUC2 has been reported to be expressed at increased levels in certain malignancies, including those of the gastrointestinal tract [490][491][492]. This might reflect the origin of these tumours from cells that normally express MUC2, rather than a role in carcinogenesis.
Inactivation of MUC2 in mice caused tumour formation in the small intestine and colon. This was accompanied by increased proliferation, decreased apoptosis and increased migration of epithelial cells. It is unclear whether these changes are a primary response to loss of MUC2 or secondary to the inadequate protection of the intestinal epithelium [411]. In humans, loss of MUC2 expression was identified in non-mucinous colorectal cancer and shown to correlate with tumour progression [487][488][489][493]. Moreover, downregulation of MUC2 was noted to be associated with progression along the adenoma-carcinoma sequence [412,494]. On the other hand, overexpression of MUC2 was observed in mucinous-type colorectal cancer and found to be associated with poor prognosis and depth of invasion [495]. This might be due to the barrier formed by mucins secreted by tumour cells, which protects against recognition by anti-tumour immune effectors. In this study, in keeping with previous reports, we confirmed MUC2 mRNA downregulation in non-mucinous and upregulation in mucinous colorectal cancer.
We also showed a progressive decrease in MUC2 expression from tumour-associated normal tissue, to polyps, to tumours. No significant association between MUC2 and clinicopathological variables, other than CA19.9 serum levels, was identified in this study.
Programmed cell death 4 (PDCD4) is a tumour suppressor gene; its overexpression was found to inhibit chemically induced neoplastic transformation in vitro [496,497]. It was also shown to suppress tumour promotion and progression in animal models [498]. In the JB6 mouse epidermal clonal genetic variant cell system, PDCD4 was found to be highly expressed in transformation-resistant but not in transformation-susceptible cells. Moreover, reduction of PDCD4 expression in transformation-resistant cells was accompanied by acquisition of a transformation-susceptible phenotype, while its overexpression in transformation-susceptible cells rendered them resistant to tetradecanoyl phorbol acetate-induced transformation and inhibited the expression of the tumour phenotype [497].
Although the role of PDCD4 in suppressing different phenomena associated with cancer has been extensively explored, few studies have investigated the potential use of PDCD4 as a prognostic or diagnostic biomarker. Furthermore, the role of PDCD4 in tumour progression has mainly been based on studies that used cell lines. Mudduluru et al carried out the only prognostic study of PDCD4 in colorectal cancer [511]. They analysed PDCD4 expression in 71 colorectal cancers and 42 adenomas using immunohistochemistry and western blot, and noted a significant reduction in PDCD4 expression in tumours and polyps compared to tumour-associated tissues. They also identified loss of PDCD4 expression as an independent predictor of disease-free survival.
To our knowledge, we carried out the first study to characterise PDCD4 expression in colorectal cancer tissues using RQ-PCR. PDCD4 mRNA was significantly lower in tumour and polyp compared to tumour-associated tissue, in keeping with the protein expression levels described before [477,502,511].
Furthermore, we identified a novel association of reduced PDCD4 expression with disease recurrence and raised CA19.9 serum level. These findings suggest that PDCD4 is involved in both tumour promotion and tumour progression and represents a potential biomarker for evaluating the transition of normal colorectal tissue to adenoma and carcinoma. Reduced expression of PDCD4 in the proximal compared to the distal colon may indicate a potential role in microsatellite instability (MSI) and Lynch syndrome.
TGFB1 serves as a tumour suppressor in normal intestinal epithelial cells as they move out of the intestinal crypts to the tips of the villi, by inhibiting proliferation and inducing apoptosis [429,512]. However, during the late stage of carcinogenesis it acts as a tumour promoter and is usually highly expressed [422,426,513]. Experimentally, prolonged exposure of intestinal epithelium to TGFB1 promotes neoplastic transformation, and it stimulates proliferation and invasion of poorly differentiated and metastatic colon cancer cells [428,514,515]. The molecular changes that result in redirection of TGFB1 growth inhibition signals during tumourigenesis are essentially unknown. A subset of colorectal cancers has been shown to harbour mutations in, or downregulation of, the type I and type II receptors [516,517], SMAD2 [518] and SMAD4 [519], which increases the production of several mitogenic growth factors including TGFα, FGF and EGF [520]. The role of the TGFB1 signalling pathway is best illustrated by the presence of inactivating mutations in genes encoding TGFB receptors and SMADs in human cancer and by studies of tumour development in mouse models.
Silencing of TGFB receptors has been observed to promote establishment and progression of cancer [516]. Type I receptor inactivating mutations were described in ovarian and pancreatic cancers, T-cell lymphoma and metastatic breast cancer [422], while type II receptor gene inactivation has been identified in colon and head and neck cancers [425]. Transgenic mice that lack a copy of tgfbr2 have an increased susceptibility to developing cancer, and restoration of functional receptors reverses the malignant behaviour of several human cancer cell lines that lack functional TGFB receptors. These observations suggest that TGFB receptors might act as tumour suppressors in the development of cancer.
Many previous studies have examined the relationship between the TGFB pathway and disease progression in colorectal cancer. Nevertheless, this is the first study to explore the mRNA expression of TGFB1 and its receptors in colorectal cancer using RT-PCR. Moreover, the large cohort of patients gives this study a further advantage over other studies.
Although no significant differences were identified in TGFB receptor expression in colorectal tumours compared to normal tissue, TGFB1 expression levels were significantly lower in polyps and higher in cancer compared to tumour-associated normal tissues. This is in keeping with previous reports. Matsushita et al (1999) found that TGFB receptor mRNA was expressed mainly by normal and adenoma colorectal tissues, whereas TGFB1 was expressed by cancer [516]. Moreover, Daniel et al (2007), using immunohistochemistry, identified progressively higher TGFB1 protein expression from low-grade dysplastic polyps, to high-grade dysplastic polyps, to colorectal cancer [521]. The significant positive correlation between the expression levels of TGFB1 and its receptors in both tumour and normal colorectal tissues suggests that their role in colorectal cancer is more complex than simple ligand-receptor feedback.
Interestingly, we identified for the first time a relationship between the TGFB pathway and some established prognostic clinicopathological parameters. Low expression of TGFBR1 was found to be associated with raised CEA serum level and local tumour invasion. In addition, TGFBR2 downregulation was associated with local, perineural and lymphovascular invasion and advanced nodal stage. These findings further support the role of TGFB receptors as tumour suppressors. The downregulation of TGFBR2 in proximal compared to distal tumours has been described before and highlights the role of this gene in microsatellite-unstable tumours.
Tumours of the proximal and distal parts of the colon may form different but related groups of tumours because of their different embryological origin, different exposure to bowel contents and differences in clinical presentation, progression and possibly genetic and environmental epidemiology [522]. The concept that the proximal colon, distal colon and rectum represent different entities is supported by evidence that two different genetic mechanisms, microsatellite instability (MSI) and chromosomal instability (CIN), contribute unevenly to carcinogenesis in the different parts of the gastrointestinal tract [523]. The incidence of CIN is similar in the distal colon and the rectum, and CIN is associated with tumours located distal to the splenic flexure [524,525]. In sporadic colorectal cancer, MSI results from inactivation of DNA-mismatch repair (MMR) genes and secondary mutations of genes with coding microsatellites. Reports indicate that MSI tumours are located proximal to the splenic flexure [524] and suggest that, because MSI is rare in the lower part of the colon and rectum, its presence in rectal cancer strongly points to a hereditary predisposition.
Using oligonucleotide microarrays, Birkenkamp-Demtroder and colleagues investigated the differences in gene expression in colon cancer of the caecum versus the sigmoid and rectosigmoid [185]. They identified 58 genes to be differentially expressed between the normal mucosa and 16 genes differentially
Nevertheless, these benefits are seen only in a limited group of patients (10-30%). In addition, such therapy is expensive, is associated with an increased risk of second cancer in adjacent organs and increases the risk of postoperative mortality and morbidity. Therefore, accurate selection of patients who are suitable candidates for neoadjuvant therapy would significantly improve outcomes. Many molecular markers, including P53, p21, BCL2, BAX, EGFR, COX2, PTMA and ELF5a1, have been identified as predictors of response to this modality of treatment; however, their clinical application is still under evaluation [67,528]. Although no pre-treatment biopsies were used, the list of genes identified as dysregulated in response to neoadjuvant therapy in this study is consistent with previous reports.
Ambrosini-Spaltro et al. investigated 32 pre-treatment biopsies by immunohistochemistry and identified MUC2 as a predictor of poor response [529]. Moreover, stromal CXCL12 and CXCR4 expression was found to be associated with recurrence and poor survival after neoadjuvant therapy in 53 patients analysed using RQ-PCR and immunohistochemistry [528]. However, comparison of the expression levels of these genes in pre- and post-treatment biopsies is required to further validate their use as predictors of response.

Introduction
The increasing use of adjuvant and neoadjuvant therapy has led to improved outcomes in the management of colorectal cancer [530]. Post-operative adjuvant chemotherapy has been shown to improve the outcome in patients with Dukes' C tumours and is generally accepted as standard care [357]; however, only selected patients in the Dukes' B group would benefit from this treatment. Moreover, neoadjuvant chemoradiation is becoming the standard of care in the treatment of locally advanced rectal cancer. It is associated with significant improvements in downstaging of the disease, which correlates with improved rates of sphincter-sparing surgery, decreased regional recurrence and improved overall survival, as confirmed by the prospective randomized trials of the DCCG and the German Rectal Cancer Study Group [36,358]. The response to neoadjuvant therapy is quantified by tumour regression grade, which was originally described for tumours of the oesophagus [44]. Tumour regression grading (TRG) is a pathological grading system based on the histological degree of tumour regression and fibrosis present in the specimen after preoperative treatment [531]. It has proven to be of prognostic significance when assessed in multi-centre preoperative therapy trials [41]. Preoperative treatment is associated with an increased risk of second cancer and with considerable perioperative morbidity. Moreover, there are emerging opinions in favour of non-surgical management of patients with a complete response to neoadjuvant therapy [532][533][534]. These facts make the ability to predict response to neoadjuvant therapy of great importance in clinical settings.
Recently, post-transcriptional and translational control of protein-coding genes by miRNA has emerged as an interesting field of cancer research. Control mediated by miRNA provides the cell with a more precise, energy-efficient way of controlling the expression of proteins and greater flexibility in responding to numerous cytotoxic stresses. The exact function of miRNAs is only just emerging; however, their ability to regulate cell proliferation and cell death has been previously shown [535]. Due to their small size, miRNAs are more stable and more resistant to environmental, physical and chemical stresses than mRNAs. Therefore, their analysis in formalin-fixed paraffin embedded (FFPE) tissue samples is likely to replicate what would be observed in fresh tissues more accurately than analysis of mRNA species [167][536][537][538].
Relationships between radiosensitivity and the functions of several genes have been reported, including P53, BCL-2, BAX and P27 [67,539]; however, little is known about the clinical significance of these genes or their usefulness for estimating radiotherapy effectiveness. The potential role of miRNAs as good biomarkers for cancer diagnosis and prognosis has been confirmed before [540][541][542][543]. However, no previous study has investigated their role as predictors of response to neoadjuvant therapy in colorectal cancer.
FFPE tissue offers a widely available and rich archive of well characterised tissue specimens and patient data for comparative molecular and clinical retrospective studies [544]. New extraction methods have made it possible to retrieve total RNA from preserved tissue specimens to a level that can be quantified by RQ-PCR. However, the application of these methods to
algorithm. The models produced by ANNs have been shown to predict well for unseen data and to cope with complexity and nonlinearity within the dataset [545,546]. Thus ANNs have the potential to identify and model patterns in this type of data to address a particular question. These patterns can combine into a fingerprint that can accurately predict subgroups. For this reason, they have been widely applied to a range of domains including character and face recognition [547], stock market predictions and survival prognosis for trauma victims [548]. Consequently, ANNs have the ability to determine patterns or features (e.g. in genes or proteins) within a dataset that can discriminate between subgroups of a clinical population (e.g. disease and control, or disease grades) [549]. ANNs have already been successfully applied in a number of contexts where markers of biological relevance have been identified, including polycystic ovarian syndrome [549], melanoma [549], prostate cancer [550,551] and breast cancer [294,552]. In colorectal cancer they have been used to characterize the disease and to predict survival and outcomes of CRC patients [553][554][555][556][557][558]. Their application to the analysis of colorectal cancer microarray data was reported by Selaru et al [556], who evaluated the ability of ANNs based on complementary DNA (cDNA) microarray data to discriminate between sporadic colorectal adenomas and cancers (SAC) and inflammatory bowel disease (IBD)-associated dysplasias and cancers.
Signatures were identified and validated with 100% accuracy. Notably, significantly fewer genes were included in the signatures compared to signatures generated by other analysis methodologies. This study highlighted the potential application of ANN to microarray analysis and illustrated how this method should be exploited to provide a further understanding of CRC biology.

Aims
The objectives of this study were to optimise miRNA extraction methods for FFPE tissue samples and to systematically compare miRNA expression profiles between FFPE samples and fresh-frozen samples using RQ-PCR. A further objective was to characterise miRNA expression in tumour compared to tumour-associated normal (TAN) FFPE colorectal tissues. Moreover, we aimed to identify predictors of response to neoadjuvant chemoradiation therapy in colorectal cancer using FFPE tissues as the source of genetic material and microarray analysis as the investigative tool.

Study group
A group of 9 patients was selected for optimization of RNA extraction methods for FFPE tissues and to evaluate RNA quality in relation to RNA extracted from fresh-frozen tissues. Each patient in this group had both FFPE and fresh-frozen tumour and TAN tissues available. A group of 12 rectal cancer patients who had received neoadjuvant therapy and had pre-treatment biopsies available was then selected to examine the expression of miRNA by microarray analysis in order to identify predictors of response to neoadjuvant chemoradiation therapy. Response to neoadjuvant therapy was quantified using the Mandard tumour regression grade.

Formalin-fixed paraffin embedded (FFPE) tissues
A pair of tissues (tumour and TAN) was placed in 10% formalin (Lennox) for fixation prior to paraffin embedding. The 10% formalin mixture consisted of 4 g of sodium phosphate monobasic, 6.5 g of sodium phosphate dibasic, 100 mL of 37% formaldehyde and 900 mL of distilled water. Biopsies were fixed and stored at room temperature for a minimum of 24 hours until embedding. Tissue

RNA extraction and analysis
For the initial evaluation of extraction techniques, RNA was isolated from colorectal tissues using three previously described methods for RNA extraction: the Qiagen RNeasy FFPE method (Qiagen), the Qiazol and chloroform protocol (Qiagen) and the TRI reagent RT-Blood protocol (Qiagen), according to the manufacturer's instructions. Thereafter, the Qiagen RNeasy FFPE method was employed in the subsequent experiments. RNA concentration and purity were assessed in duplicate samples (1 µL) using a NanoDrop ND-1000 Spectrophotometer (NanoDrop technologies), while RNA integrity was evaluated using the RNA 6000 Nano Chip Kit (Series II) and the Agilent 2100 Bioanalyzer System (Agilent technologies).

Reverse transcription
Small RNA (5 ng or 100 ng) was reverse transcribed to cDNA using MultiScribe Reverse Transcriptase (Applied Biosystems). Each reaction was primed using a gene-specific stem-loop primer. Where sequences were available, primers were

Real-time quantitative PCR
The PCR reactions were carried out using a 7900 HT Fast Real-Time PCR System (Applied Biosystems). All reactions were performed in 20 µL volumes, in triplicate within the same PCR run. On each plate, an interassay control was included to account for any variation between runs. Reactions consisted of 1.33 µL cDNA, 10 µL 2× TaqMan universal PCR master mix (Applied Biosystems), 1 µL 0.2 µM TaqMan probe (Applied Biosystems), 3 µL 1.5 µM forward primer, 1.4 µL 0.7 µM reverse primer and 3.27 µL nuclease-free water. As before, the PCR reactions were initiated with a 10 minute incubation at 95°C followed by 40 cycles of 95°C for 15 seconds and 60°C for 60 seconds.
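The reaction composition listed above can be checked and scaled with a few lines of Python. This is only an illustrative sketch: the per-reaction volumes are taken from the text, whereas the function names, the idea of excluding cDNA from a shared master mix and the 10% overage default are assumptions made for the example and are not part of the described protocol.

```python
# Per-reaction volumes in microlitres, as listed in the text.
REACTION_UL = {
    "cDNA": 1.33,
    "2x TaqMan universal PCR master mix": 10.0,
    "TaqMan probe (0.2 uM)": 1.0,
    "forward primer (1.5 uM)": 3.0,
    "reverse primer (0.7 uM)": 1.4,
    "nuclease-free water": 3.27,
}


def total_volume(mix):
    """Sum of all component volumes, rounded to 0.01 uL."""
    return round(sum(mix.values()), 2)


def master_mix(mix, n_reactions, overage=0.10, exclude=("cDNA",)):
    """Scale the shared components for n_reactions.

    overage is a hypothetical pipetting allowance (not from the text);
    components in `exclude` are assumed to be added per well.
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in mix.items()
        if name not in exclude
    }


if __name__ == "__main__":
    print("Volume per reaction (uL):", total_volume(REACTION_UL))
    for name, volume in master_mix(REACTION_UL, n_reactions=3).items():
        print(f"{name}: {volume} uL")
```

The listed components add up to the stated 20 µL reaction volume, which is a useful sanity check before a plate is set up.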

Relative quantification
Cycle threshold ($C_t$) is defined as the PCR cycle number at which the fluorescence generated from amplification of the target gene within a sample rises to a threshold value of 10 times the standard deviation of the baseline emission; it is inversely related to the starting amount of the target cDNA. QBasePlus was used to calculate candidate expression relative to the endogenous control miRNA (miR-26). It applies the $\Delta\Delta C_t$ method, where

$$\Delta\Delta C_t = \left(C_t^{\text{target, test}} - C_t^{\text{control, test}}\right) - \left(C_t^{\text{target, calibrator}} - C_t^{\text{control, calibrator}}\right)$$

Here the superscripts denote the transcript (target gene or endogenous control) and the sample (test or calibrator). Relative quantities were corrected for efficiency of amplification, and the fold change in miRNA expression between groups was calculated as $E^{-\Delta\Delta C_t}$ ± s.e.m. The lowest expressed sample was used as the calibrator.
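The ΔΔCt calculation described above can be illustrated with a minimal Python sketch. It is not the QBasePlus implementation: the function names are hypothetical, a single amplification efficiency E is assumed for target and control (E = 2 corresponds to perfect doubling per cycle), and the Ct values in the example are invented.

```python
def delta_delta_ct(ct_target_test, ct_control_test,
                   ct_target_calibrator, ct_control_calibrator):
    """Ct difference of the test sample minus that of the calibrator sample."""
    delta_test = ct_target_test - ct_control_test
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    return delta_test - delta_calibrator


def fold_change(ddct, efficiency=2.0):
    """Relative quantity E^(-ddCt); efficiency=2.0 assumes perfect doubling."""
    return efficiency ** (-ddct)


if __name__ == "__main__":
    # Invented Ct values, for illustration only.
    ddct = delta_delta_ct(
        ct_target_test=25.0, ct_control_test=20.0,
        ct_target_calibrator=27.0, ct_control_calibrator=20.0,
    )
    print("ddCt:", ddct)
    print("Fold change relative to calibrator:", fold_change(ddct))
```

In this invented example the target crosses the threshold two cycles earlier in the test sample than in the calibrator after normalisation, which corresponds to a four-fold higher relative quantity when E = 2.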

Microarray analysis
Microarray analysis was carried out on total RNA extracted from FFPE tissues using Megaplex pool A (Applied Biosystems) according to the manufacturer's protocol. It consists of a matching primer pool and TaqMan arrays.

Megaplex RT reactions:
We used TaqMan
water (Applied Biosystems). Thereafter, samples were incubated for 40 thermal cycles at 16°C for 2 minutes, 42°C for 1 minute and 50°C for 1 second, and finally left at 85°C for 5 minutes to denature the strands. The reaction was performed using a GeneAmp PCR System 9700 thermal cycler (Applied Biosystems).

TLDA RQ-PCR reactions:
The DNA polymerase from TaqMan

Artificial neural network
Data were analysed using an Artificial Neural Network algorithm. A three-layer Multi-Layer Perceptron (MLP) modified with a feed-forward back-propagation algorithm and a sigmoidal transfer function [559] was employed for development of the model using randomly selected training and testing data sets. An additive stepwise approach [546] was employed to identify an optimal set of markers explaining variation in the population for each of the questions explored.
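To make the model architecture concrete, the following Python sketch shows a three-layer perceptron (input, one hidden layer, one output) with a sigmoidal transfer function trained by back-propagation. It is a toy illustration only and not the software used in this study: the class and function names, the learning rate, the number of hidden units and the data in TOY_DATA are all invented, and the additive stepwise marker selection of [546] is not reproduced.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """Input layer -> one hidden layer -> single sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight vector carries one trailing bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        inputs = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs)))
                  for row in self.w_hidden]
        output = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_step(self, x, target, lr=0.5):
        inputs = list(x) + [1.0]
        hidden, output = self.forward(x)
        # Squared-error gradient passed back through the output sigmoid.
        delta_out = (output - target) * output * (1.0 - output)
        # Hidden deltas are computed before the output weights are changed.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * v
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(inputs):
                row[i] -= lr * delta_hidden[j] * v


def train(model, data, epochs=2000, lr=0.5):
    for _ in range(epochs):
        for x, target in data:
            model.train_step(x, target, lr)


def mean_squared_error(model, data):
    return sum((model.forward(x)[1] - t) ** 2 for x, t in data) / len(data)


# Invented two-feature examples: label 1.0 = responder, 0.0 = non-responder.
TOY_DATA = [([0.9, 0.1], 1.0), ([0.8, 0.2], 1.0),
            ([0.1, 0.9], 0.0), ([0.2, 0.7], 0.0)]

if __name__ == "__main__":
    model = TinyMLP(n_in=2, n_hidden=3)
    print("Error before training:", mean_squared_error(model, TOY_DATA))
    train(model, TOY_DATA)
    print("Error after training:", mean_squared_error(model, TOY_DATA))
```

In practice the error would be assessed on held-out test samples rather than on the training data, in line with the randomly selected training and testing sets described above.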

Statistical analysis
Statistical analysis was carried out with IBM SPSS Statistics 17.0 (SPSS Inc.).
Data were tested for normal distribution. Parametric tests were used where appropriate. One-way ANOVA and the independent t-test were used to determine associations and comparisons between independent groups. Correlation analysis used Spearman's rho and Pearson's correlation coefficient for nonparametric and parametric data respectively. Univariate analysis and the paired t-test were used to assess related samples. P values <0.05 were considered statistically significant.
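The distinction between the two correlation coefficients mentioned above can be illustrated with a short Python sketch. The analyses in this study were performed in SPSS; the functions below are hypothetical stand-ins written with the standard library only, and the example values are invented. Spearman's rho is computed here as Pearson's coefficient applied to the ranks of the data, with tied values sharing their mean rank.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's coefficient of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Invented monotonic but non-linear relationship.
    x = [1, 2, 3, 4, 5]
    y = [1, 4, 9, 16, 25]
    print("Pearson:", pearson(x, y))
    print("Spearman:", spearman(x, y))
```

For a monotonic but non-linear relationship such as this one, the rank-based coefficient reaches 1 while Pearson's coefficient does not, which is why the rank-based measure is preferred when normality cannot be assumed.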

Extraction Methods
As an initial step towards identifying the optimal technique for RNA isolation from FFPE tissues, we isolated total RNA in duplicate from 4 tissue samples using three different extraction protocols: the Qiagen RNeasy FFPE method (Qiagen), the Qiazol and chloroform protocol (Qiagen) and the TRI reagent RT-Blood protocol (Qiagen). The comparison was based mainly on yield and purity of RNA

Number of Slices:
To determine the number of slices required for optimal analysis of miRNA expression in FFPE tissue, we analysed the RNA yield and the expression levels of 2 miRNAs in two colorectal samples using 1, 2, 3 or 4 × 10 micron slices under the same experimental conditions. The RNA concentration increased stepwise with the number of slices; however, this increase was not statistically significant (p>0.05) (Figure 6.1). Moreover, we examined the expression levels of

MiRNAs as predictors of neoadjuvant chemoradiation response in RC
Using ANN to analyse the miRNA profiling data, we identified distinct miRNA expression signatures predictive of response to neoadjuvant CRT in 12 FFPE pre-treatment rectal cancer tissue samples. These signatures consisted of three miRNA transcripts (miR-16, miR-590-5p and miR-153) to predict complete vs. incomplete response and two miRNA transcripts (miR-519c-3p and miR-561) to predict good vs. poor response, with a median accuracy of 100%. Details of the miRNAs in the signatures are presented in Table 3

Discussion
Various limitations are associated with the retrieval of fresh pre-treatment biopsies from patients undergoing neoadjuvant treatment. Therefore, an alternative source of genetic material should be investigated. A valuable, well characterised archival collection of FFPE tissue, linked to clinical databases, has been built up worldwide over more than a century. This archive provides a rich resource from which biological insight could be derived beyond the prospective collection of fresh-frozen samples. Moreover, biomarkers developed from FFPE samples could be more readily translated into clinical practice. The quality of mRNA and miRNA isolated from FFPE tissues has been evaluated extensively. miRNAs may be less prone to degradation and modification than mRNA, and good quality miRNAs have been extracted from tissues preserved for up to 12 years [538]. Direct comparison of mRNA profiling of FFPE versus fresh-frozen tissues has shown a correlation coefficient of only 0.28, compared to a coefficient above 0.9 for miRNA expression analysis [167,537]. Hence, the only mRNA experiments that can be conducted using FFPE samples are those measuring a previously determined transcript, which does not allow for the identification of novel biomarkers. These results provide a solid foundation for using miRNAs as biomarkers when FFPE samples are used in target discovery studies.
In 1993 the first miRNA, Lin-4, was discovered in C. elegans [216,560]
For the purpose of this study, we compared the performance of three RNA extraction methods and identified the Qiagen RNeasy FFPE kit as the preferred methodology. The main reasons why RNA extracted from FFPE tissues is of poor quality are RNA fragmentation and cross-linking with other molecules, including proteins [568]. The problem of fragmentation is addressed by choosing small fragments for detection by PCR-based methods [568,569]. The Qiagen RNeasy FFPE kit uses proteinase K at 55°C to break the cross-links formed between RNA and proteins.
Incubation at 80°C in buffer PKD is an important step in the RNA isolation process using this method. It partially reverses formaldehyde modification of nucleic acids, thereby improving the quality of the RNA harvested (chapter 2). To ensure that the recovery of miRNA was adequately assessed, it was crucial to select appropriate miRNA targets for analysis by RQ-PCR. miR-10b, miR-143, miR-145, miR-21 and miR-30a-3p were chosen because they have been intensively investigated in colorectal cancer [263,264,269,570,571]. Using FFPE and fresh-frozen tissue samples we were able to demonstrate the previously
Furthermore, to enable extraction of miRNA from FFPE tissue blocks with different cross-sectional areas in quantities adequate for multiple analyses of the purified miRNA, we determined the number of slices required for optimal RNA yield. The purified RNA yield increased stepwise when we used 1, 2, 3 or 4 slices; however, the changes in concentration were not statistically significant. [538]. For the tissue blocks with smaller cross-sectional area they observed a linear increase in RNA recovery, while for the blocks with larger area not all the tissue was digested in tubes containing more than 4 slices, resulting in yields that were lower than expected. To further evaluate the RNA recovered, we selected miR-143 and miR-145 isolated from 1, 2, 3 or 4 slices for analysis by RQ-PCR. The reactions were carried out in triplicate for each slice number.
Regardless of the number of slices used for miRNA extraction, the mean expression level of the miRNAs was stable, with a standard deviation of less than 0.3. This confirms the suitability of this method for RNA isolation from tissue as small as a colonic biopsy retrieved during an endoscopic procedure.
Microarray studies are frequently used to identify differential biosignatures that distinguish two or more groups. Routinely processed FFPE samples represent an extensive and valuable resource for large-scale, microarray-based molecular analysis. Major improvements have been achieved in RNA extraction techniques and in further RNA processing for microarray hybridisation, and recent reports have provided evidence of the validity and utility of conducting mRNA and miRNA microarray profiling using FFPE tissues [536,537,572]. Roberts et al. obtained mRNA expression data for colon and lung tumour and normal FFPE samples and matched frozen samples, and found significant agreement between the biosignatures identified by each sample group using microarray technology [572]. Their microarray results were further validated and confirmed using RQ-PCR. Moreover, Hui et al. employed TaqMan low density arrays (TLDA) to assess the expression levels of miRNAs in FFPE breast cancer and normal tissues. They identified high technical reproducibility, with intra-sample correlations above 0.9 and 92.8% accuracy in differential expression comparisons, indicating that such profiling studies are technically and biologically robust [537].
Several neoadjuvant treatment regimens have been described and established, including short-term radiotherapy and long-term radiotherapy alone or in combination with chemotherapy. The benefits of these therapeutic regimens have been examined and confirmed by numerous prospective trials [28,31,37,38][573][574][575]. Therefore, neoadjuvant therapy has become the preferred treatment modality for locally advanced rectal adenocarcinoma, with a complete pathological response observed in up to 30% of patients [59,576]. The ability to predict response to pretreatment chemoradiation may spare poorly responding patients from undergoing aggressive and severely toxic treatment [577,578] [579]. Other molecules including P21 [63], EGFR [64], COX2 [580], MUC2 [529] and growth hormone receptor [581] were also examined as potential markers; nevertheless, it seems unlikely that they will prove to be clinically useful response predictors.
While expression profiling with microarray technologies has been broadly applied to colorectal cancer for diagnosis, classification and prognostication based on patterns of expression, its application to the prediction of response to treatment remains unclear because few studies are currently available [186,191,192,206]. The first report using microarray for prediction of response to pre-treatment radiotherapy was published
reported to induce radiosensitivity in vivo and in vitro [582,585,586]. Although miR-16 was described as being stably expressed in both colorectal and breast tissues and has been highlighted as a good endogenous control for miRNA profiling in cancer research using RQ-PCR [293,323], several studies have confirmed its dysregulation in many cancers including CRC [587][588][589][590]. Moreover, Schaefer et al examined the expression of four putative reference genes, including miR-16, with regard to their use as normalizers in prostatic cancer and found that normalization to miR-16 can lead to biased results [591].
in tamoxifen-sensitive cells activated BCL2 expression and promoted tamoxifen resistance. Expression of miRNAs after ionizing radiation in human endothelial cells was investigated by Wagner-Ecker et al [598]. They reported that radiation up-regulates miR-16 expression levels. Their data also suggested that the miRNAs which are differentially expressed after radiation modulate the intrinsic radiosensitivity of endothelial cells in subsequent irradiations. This indicates that miRNAs are part of the innate response mechanism of the endothelium to radiation [598].
Although no report has determined the significance of miR-153 and miR-590 in CRC, their role in carcinogenesis has been highlighted before [599]. Shan et al. [599] investigated the role of miRNAs on the expression and regulation of transforming
As mentioned above, we identified a two-miRNA signature to predict good vs. poor response to neoadjuvant CRT in rectal cancer. None of the miRNAs identified in our study has been reported to be associated with CRC. However, the role of miR-519 in cancer has been documented before [603][604][605]. miR-519 was reported as a tumour suppressor and was found to reduce cell proliferation by lowering levels of the RNA-binding protein HuR [603]. It decreases HuR translation without influencing HuR mRNA abundance [603,604]. Abdelmohsen et al examined the levels of miR-519 and HuR in pairs of cancer and adjacent normal tissues from ovary, lung and kidney and reported significantly higher levels of HuR, unchanged HuR mRNA concentration and reduced miR-519 levels in cancer specimens compared to normal tissues [604]. They also found that tumour cells overexpressing miR-519 formed significantly smaller tumours, while those expressing reduced miR-519 gave rise to substantially larger tumours.
Taken together, therefore, using microarray analysis of pretreatment FFPE rectal cancer tissues we identified for the first time a group of miRNA predictors of response to neoadjuvant CRT. This, indeed, can lead to a significant improvement in patient selection criteria and personalized rectal cancer management. However, before these data are applied clinically, a validation study using a large cohort of patients needs to be designed. UTR and inhibit gene translation [606,607]. Moreover, they can also function by cleaving and degrading a target mRNA, in which case the miRNA may target sequences outside the 3′-UTR [608]. MiRNAs are crucial in eukaryotic gene regulation, especially in development and differentiation [609,610], and their expression in cancers has indicated that they may have a tumour suppressor or oncogenic function [611]. Functional characterisation of miRNAs will depend heavily on identification of their specific gene targets. In addition, a number of studies have shown that more than one miRNA can potentially bind to a single targeted gene; hence multiple miRNAs may cooperatively control the expression of target genes [305,306]. Numerous bioinformatics methods were developed for high-throughput prediction of miRNA target genes [295,300,301,302,304], although it is understood that the presumed targets have to be validated experimentally. 3-For prediction of multiple binding sites in one target, the appropriate potential binding sites have to be cut out and folded separately.
Therefore, these algorithms can result in prediction of false positives. Moreover, some targets may pass undetected. The false positive rates were estimated at 22%, 24% and 30% for TargetScan, miRanda and PicTar, respectively [312]. The PicTar and EMBL algorithms have a reported sensitivity of 70-80% [615], indicating that 20-30% of targets may go undetected. The experimental verification is usually based on demonstrating that [616]:
1-The target protein is down-regulated by the predicted miRNA.
2-A reporter gene expressing the 3′-UTR or the miRNA-binding sites of the targeted mRNA is also down-regulated.
3-The targeted protein is not down-regulated when the 3′-UTR is missing or blocked.
4-The miRNA has a biological function predicted by the biological function of the targeted protein.

Aims:
The aims of this study were to correlate the expression levels of candidate mRNAs with a panel of miRNAs in order to identify miRNA/mRNA duplexes, and to investigate the miRNA and target gene expression patterns in colorectal tissue samples using RQ-PCR.

Study group
A group of 58 consecutive patients undergoing surgical resection for CRC, in whom the expression levels of a panel of miRNAs had previously been determined in the surgical research laboratory, were selected for the miRNA:mRNA correlation study in order to determine miRNAs targeting a panel of genes (table 5.1).

RNA extraction and analysis
Tissue samples (50-100 mg) were homogenised using a hand-held homogenizer (Polytron PT1600E) in 1-2 mL of QIAzol reagent (Qiagen). Two methods of RNA extraction were employed in the study: total RNA extraction (copurification) and separate purification of mRNA and miRNA. RNA was extracted using the RNeasy Plus Mini Kit and RNeasy MinElute cleanup kit (Qiagen) according to the manufacturer's instructions. RNA concentration, purity and integrity were assessed in duplicate samples using a NanoDrop ND-1000 Spectrophotometer (NanoDrop technologies) and the Agilent 2100 Bioanalyzer System (Agilent technologies).

mRNA Reverse transcription
First strand cDNA was synthesised using Superscript III reverse transcriptase (Invitrogen) and random primers (N9; 1 µg, MWG Biotech). Negative control samples were included in each set of reactions. Reactions were incubated at 25 °C for 5 minutes followed by 50 °C for 1 hour and final denaturation at 72 °C for 15 minutes. Samples were subsequently diluted to 100 µL in nuclease-free water and stored at -20 °C.

miRNA Reverse transcription
First strand cDNA was synthesised using gene-specific stem-loop primers.
Primers were obtained from MWG Biotech (Germany) if sequences were available. Otherwise, assays containing the stem-loop primer were purchased from Applied Biosystems. All reagents were included in the High-capacity cDNA reverse transcription kit (Applied Biosystems). The reactions were performed using a GeneAmp PCR system 9700 thermal cycler (Applied Biosystems), with samples incubated at 16 °C for 30 minutes, 42 °C for 30 minutes and 85 °C for 5 minutes.
An RT-negative control was included in each batch of reactions.

Real-time quantitative PCR
The expression of each gene was analysed by RQ-PCR using TaqMan gene expression assays on a 7900HT instrument (Applied Biosystems). All reactions were performed in 20 µL volumes, in triplicate within the same PCR run. Percent PCR amplification efficiencies (E) for each assay were calculated as $E = (10^{-1/\text{slope}} - 1) \times 100$, using the slope of the semi-log regression plot of $C_t$ versus log input of cDNA (10-fold dilution series of five points). A threshold of 10% above or below 100% efficiency was applied.
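The efficiency calculation described above can be illustrated with a minimal sketch. It fits a least-squares line to Ct against log10 of cDNA input and applies the stated formula. The function names and the dilution-series values are hypothetical, chosen only for illustration; they are not data from this study, and the instrument software may compute the slope differently.

```python
def pcr_efficiency(log10_inputs, ct_values):
    """Return (slope, percent efficiency) from a standard-curve dilution series.

    The slope is the ordinary least-squares slope of Ct versus log10(input);
    efficiency is (10 ** (-1 / slope) - 1) * 100, as in the text.
    """
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_inputs, ct_values))
    slope = sxy / sxx
    efficiency = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return slope, efficiency


def within_threshold(efficiency, tolerance=10.0):
    """True if the efficiency lies within +/- tolerance of 100 %."""
    return abs(efficiency - 100.0) <= tolerance


if __name__ == "__main__":
    # Hypothetical five-point, 10-fold dilution series (illustrative values only).
    log10_inputs = [0.0, -1.0, -2.0, -3.0, -4.0]
    ct_values = [18.0, 21.32, 24.64, 27.96, 31.28]
    slope, eff = pcr_efficiency(log10_inputs, ct_values)
    print(f"slope = {slope:.2f}, efficiency = {eff:.1f} %, "
          f"accepted = {within_threshold(eff)}")
```

A slope close to -3.32 corresponds to an efficiency close to 100%, that is, a doubling of product per cycle.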

Relative quantification
Cycle threshold ($C_t$) is defined as the PCR cycle number at which the fluorescence generated from amplification of the target gene within a sample increases to a threshold value of 10 times the standard deviation of the baseline emission, and is inversely related to the starting amount of the target cDNA. QBasePlus was used for calculation of candidate expression relative to the endogenous control genes. It applies the $\Delta\Delta C_t$ method, where $\Delta\Delta C_t = (C_t^{\text{target gene, test sample}} - C_t^{\text{endogenous control, test sample}}) - (C_t^{\text{target gene, calibrator sample}} - C_t^{\text{endogenous control, calibrator sample}})$. Relative quantities were corrected for efficiency of amplification, and fold change in gene expression between groups was calculated as $E^{-\Delta\Delta C_t}$ ± s.e.m. The lowest expressed sample was used as a calibrator.
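As an illustration of the relative quantification described above, the following is a minimal sketch of the ΔΔCt calculation. It is not the QBasePlus implementation: it assumes a single amplification base E = 1 + efficiency/100 shared by target and control assays (E = 2 at 100% efficiency), whereas dedicated software may apply assay-specific efficiencies. The function names and Ct values are hypothetical.

```python
def delta_delta_ct(ct_target_test, ct_control_test,
                   ct_target_calibrator, ct_control_calibrator):
    """ddCt = (Ct target - Ct control) in test sample
              - (Ct target - Ct control) in calibrator sample."""
    delta_test = ct_target_test - ct_control_test
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    return delta_test - delta_calibrator


def fold_change(ddct, efficiency_percent=100.0):
    """Fold change E ** (-ddCt), with E = 1 + efficiency/100 (an assumption)."""
    base = 1.0 + efficiency_percent / 100.0
    return base ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target amplifies two cycles earlier in the
    # test sample than in the calibrator, relative to the endogenous control.
    ddct = delta_delta_ct(24.0, 20.0, 26.0, 20.0)
    print("ddCt =", ddct)
    print("fold change at 100 % efficiency =", fold_change(ddct))
    print("fold change at 90 % efficiency =", round(fold_change(ddct, 90.0), 2))
```

Using the lowest expressed sample as calibrator, as in the text, means that all other samples have a fold change of at least 1 relative to it.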

Statistical analysis
Statistical analysis was carried out with IBM SPSS Statistics 17.0 (SPSS Inc.

Discussion
It is known that miRNAs are key regulators of gene expression and that they are aberrantly expressed in diverse cancers, including colorectal cancer [248,272,540,561,617,618]. It is becoming apparent that miRNAs act as both tumour suppressors and oncogenes in the gene regulatory network and markedly contribute to tumourigenesis. Hence, identification of miRNA functions may help in understanding cancer pathogenesis, prognosis and response to treatment.
Prediction and recognition of miRNA target genes is the first step towards understanding the biology of miRNAs. CXCL12 has been found to play a critical role in tumourigenesis, angiogenesis and tumour cell migration through binding to its CXCR4 and CXCR7 receptors [378,455,456]. In colorectal cancer, CXCL12 was reported to be down-regulated, with expression increasing with tumour differentiation [392]. Its possible association with MSI and the adenoma-carcinoma sequence was discussed in the previous chapters.
Moreover, CXCL12 expression was found to correlate with tumour stage, lymph node status and survival [393,620]. Although a considerable number of previous studies have investigated CXCL12 and its receptors in cancer, the mechanisms by which it exerts its effects are not fully understood. Brand et al. postulated that CXCL12 activates ERK-1/2, SAPK/JNK kinases, AKT and matrix metalloproteinase-9, which mediate reorganization of the actin cytoskeleton, resulting in increased cell migration and invasion [392]. In addition, Yang et al. have demonstrated that stimulation of glioblastoma cells with CXCL12 contributes to the production of VEGF in vitro and thereby synergistically induces tumour angiogenesis [621,622]. The miRNAs identified to target CXCL12 in this study will help in further understanding the role of CXCL12 in carcinogenesis.  [344,475,476,623]. These differences in expression might highlight tumour-specific expression patterns. Expression levels of FABP1 were found to correlate with survival and grade of colorectal cancer [344,478]. Lee et al. analysed the expression profiles in the sequence of normal colon crypts, adenoma and early stage carcinoma using cDNA microarray analysis. They identified a group of genes, including PDCD4 and FABP1, to be down-regulated in the sequence [477].
PDCD4 is a tumour suppressor gene that inhibits neoplastic transformation, tumour promotion and progression and induces apoptosis in response to different oncogenic factors [497,499,500]. It exerts its functions through interaction with other molecules including eIF4A, eIF4G and p21 [416]. Down-regulation of PDCD4 was shown to lead to increased colon cancer cell invasion [624,625].
Moreover, its expression levels were found to correlate with poor survival and disease progression in colon and lung cancer [416,626]. miR-21 is one of the most prominent miRNAs implicated in the promotion and progression of human malignancies. It is over-expressed in different tumour types and has been implicated in promotion of growth, proliferation and inhibition of apoptosis [242,246,286,535,571,627]. miR-21 expression has been associated with advanced lymph node and disease stage and with tumour invasion and metastasis [571,628]. Moreover, high expression levels of miR-21 were reported to be associated with disease recurrence, prognosis and therapeutic outcome in colorectal cancer [264,629]. miR-21 has been shown to target and down-regulate the expression of tropomyosin 1, PTEN, SPRY2 and PDCD4 [245,269,286,630].
miR-17, miR-31 and miR-21 have all been found to be up-regulated in tumours, including colorectal cancer [275,276,588,631,632,633,634]. miR-17, a member of the miR-17-92 cluster, was reported to be overexpressed during colorectal adenoma to carcinoma progression and induced proliferation of lung cancer cells [635]. cluster [284]. On the other hand, miR-31 overexpression was noted to be associated with advanced tumour stage and local invasion [276]. The role of miR-31 in cell proliferation was investigated by Liu et al., who found that knockdown of the miRNA represses proliferation of both murine and human lung cancer cell lines [636]. Moreover, miR-31 was reported to be over-expressed in right-sided colon tumours and associated with microsatellite instability [637].
We identified a reciprocal pattern of expression of CXCL12 and miR-17 and miR-31 in tumour compared to normal colorectal tissues. Furthermore, the inverse relationship of CXCL12 and the miRNAs was also seen in association with tumour differentiation and tumour location. Dysregulation of CXCL12 and miR-31 expression in proximal compared to distal colonic cancer may support their role in MSI tumours. Moreover, inversely related expression levels were noted when comparing the up-regulated miRNAs miR-21 and miR-31 to their down-regulated putative targets PDCD4 and FABP1.
Interleukin 8 (IL8) has been reported to be overexpressed in cancer and to modulate proliferation and migration of tumour cells [406,638,639]. Evidence exists that IL8 is a critical angiogenic factor in a multitude of human cancers. Blocking the angiogenic activity of IL8 has proven effective in inhibiting angiogenesis, metastasis and tumour progression in murine models [640,641,642]. The two miRNAs identified in this study to target IL8, miR-10b and miR-145, are reported to be down-regulated in human cancer [242,263,272,274,275,276,543]. They concluded that miR-145 down-regulates IRS-1 protein and inhibits the growth of human cancer cells [616]. Moreover, type 1 insulin-like growth factor receptor (IGF-IR) is also confirmed as a miR-145 target gene [616,643]. No significant correlations of miR-145 with clinicopathological variables were previously identified. Regarding miR-10b, the available data are conflicting. Although downregulation of miR-10b was identified in many cancers, such as colorectal, breast and head and neck squamous cell carcinoma [242,263,543,644,645], other reports described over-expression of miR-10b in tumours and correlated its expression with poor prognostic features such as invasion and metastasis [243,646,647]. The significance of this apparent paradox is unclear but might highlight tumour-specific expression patterns of miR-10b. Our results, as shown in this and previous chapters, support the view that miR-10b is down-regulated in colorectal cancer. Both miR-10b and miR-145 might target IL8, and their down-regulation may cause its up-regulation and thereby potentiate its angiogenic effect.
Our results and the previous reports, in addition to the negative correlations between miRNAs and mRNAs, might support our hypothesis that the miRNA/mRNA duplexes identified above represent miRNA/target gene pairs. The identified miRNA/mRNA combinations will not only help in understanding the molecular pathology of colorectal cancer, but may also have therapeutic potential for the disease.
MMR proteins are nuclear enzymes which participate in the repair of base-base mismatches that occur during DNA replication in proliferating cells. The proteins form complexes (heterodimers) that bind to areas of abnormal DNA and initiate its removal. Loss of MMR proteins leads to an accumulation of DNA replication errors, particularly in areas of the genome with short repetitive nucleotide sequences, a phenomenon known as microsatellite instability (MSI) [360,648,649]. MSI can be identified in more than 90% of colorectal cancers that arise in patients with Lynch syndrome, while in sporadic colorectal cancer it occurs in 15% of cases [650].

Mechanisms for MSI
Alterations in at least six of the genes that encode proteins involved in the MMR system have been identified in either HNPCC or sporadic colon cancer. These HNPCC related colon cancers account for 3-6% of all colon cancers, and germline mutations in MSH2 and MLH1 have been found in 45-70% of families that meet the Amsterdam criteria for HNPCC [653,654]. Since inactivation of both alleles of MSH2 or MLH1 is required to generate MSI, the cancers that arise in HNPCC kindred frequently show loss of heterozygosity at the loci of these genes, or alternatively show somatic mutation of the sole wild-type MMR allele.
The germline mutations that occur in MSH2 and MLH1 are widely distributed throughout either gene and are missense, deletion, or insertion mutations. These mutations result in frame shifts (60% of hMSH2 mutations and 40% of MLH1 mutations), premature truncations (23% of MSH2 mutations), or missense mutations (31% of MLH1 mutations) [655]. The lack of a mutation hotspot has hampered the development of an inexpensive clinical assay to detect germline mutations in the genes known to cause HNPCC. Furthermore, because one wildtype allele is sufficient to maintain MMR activity, functional assays to detect MMR gene mutation carriers have not been developed for clinical use to date.
However, proof-of-principle studies have demonstrated that it may be possible to develop such an assay by forcing a cell to a haploid state in which case a mutant MMR allele could be detected [656,657]. Studies of the 15% of sporadic colon cancers that display MSI demonstrated these arose due to somatic inactivation of MMR genes and not due to germline MMR gene mutations with low penetrance.
While occasional somatic mutations of MSH2 and MLH1 were detected, the predominant mechanism for inactivating MMR unexpectedly proved to be the epigenetic silencing of the MLH1 promoter due to aberrant promoter methylation [98,99].

Clinical implications of MSI
The CRC microsatellite profile provides useful prognostic information [138,658], showing that patients with microsatellite-unstable neoplasms have a better overall survival rate and a modified response to conventional chemotherapy [161,659,660,661,662,663]. MSI also helps in predicting the treatment response of CRC [161,661,664], and could modify the chemotherapy protocols offered to patients in the future [161], but these results should be applied with caution until this predictive tool is verified.
Molecular markers as predictive factors in treatment decisions have been developed in the last few years. The initial studies in sporadic CRC showed that the retention of heterozygosity at one or more 17p or 18q alleles in microsatellite-stable CRCs, and mutation of the gene for the type II receptor for TGF-β1 in CRCs with high levels of microsatellite instability, correlated with a favorable outcome after adjuvant chemotherapy with fluorouracil-based regimens, especially for stage III CRC [661,664]. However, more recent studies have revealed that fluorouracil-based adjuvant chemotherapy benefited patients with stage II or stage III CRC with MSS tumors or tumors exhibiting low-frequency MSI, but not those with CRCs exhibiting high-frequency MSI [161]. The reasons for these responses may be related to the distinctive cell kinetics associated with MMR downregulation (significantly increased apoptosis and decreased proliferation), which can contribute to tumor cell resistance to conventional chemotherapy.

Testing for MSI and MMR defects: Clinical Criteria:
The recognition that certain types of cancers cluster in families with HNPCC, and that cancer develops at relatively early ages compared with the general population, provided the rationale for development of criteria that could be used to aid in the diagnosis. Two sets of criteria (the Amsterdam criteria and Bethesda guidelines), developed by a consensus of experts, have been most widely accepted and best studied.  [665,666,667,668], and a review of the literature reported that the sensitivity of the original Amsterdam criteria ranged from 54 to 91% [669]. Such a wide range of estimates leaves substantial uncertainty as to the role of the Amsterdam criteria as a screening test for mismatch repair mutations. In addition to the limitations regarding their predictive accuracy, there are practical problems with policies based on the implementation of these clinical criteria. Patients' reports of the family history may not be accurate, particularly for cancers other than colorectal that are potentially related to HNPCC [670]. Issues of uncertain paternity may also be relevant in some families, while some families may be too small or have insufficient contact among family members to obtain a clinically meaningful family history.
Original (Amsterdam I) [151]
-At least 3 relatives with colorectal cancer, one of whom must be a first degree relative of the other two
-Involvement of 2 or more generations
-At least 1 case diagnosed before age 50
-Familial adenomatous polyposis has been excluded
Revised (Amsterdam II) [150]
-At least 3 relatives with HNPCC-associated cancer
-One should be 1st degree relative of other two
-At least 2 successive generations affected
-At least 1 diagnosed before age 50
-Familial adenomatous polyposis excluded
-Tumors should be verified by pathologic examination

Aims
Information about MMR protein status in colorectal cancer is important because it will identify those patients most likely to have Lynch syndrome and those most likely to have microsatellite instability in their tumours, which has been associated with a better prognosis and may affect their treatment regimens in the future. We undertook this study to develop and optimise a protocol for MMR protein immunohistochemistry testing in colorectal cancer. We also aimed to analyse the proportion of patients with colorectal cancer with loss of immunostaining for MMR proteins (hMLH1, hPMS2, hMSH2 and hMSH6) in order to determine the feasibility of molecular screening for the loss of MMR proteins through the study of unselected patients with colorectal cancer.

Study group
A group of 33 patients with colorectal cancer was randomly selected from the department of surgery bio-bank to determine the expression of MMR proteins in their FFPE tumour tissues using immunohistochemistry techniques. The age of the patients at diagnosis of their cancers and their family history were collected by reviewing the medical charts.

FFPE tissues
Tumour tissues were collected at the time of surgery and placed in 10%

Immunohistochemistry
Immunostaining was carried out on 5 µm thick paraffin sections of tumour tissue from each patient, using mouse monoclonal antibodies specific for each of the four human MMR proteins and employing automated DABMap system (Ventana) for hMSH6 detection and UltraMap system (Ventana) to detect hMLH1, hMSH2, and hPMS2 proteins.

DABMap protocol:
It consisted of deparaffinization and cell conditioning, followed by addition of the primary antibody and incubation at room temperature for 1 hour. The secondary antibody was then added before counterstaining with haematoxylin and dehydration of the slides.

UltraMap protocol:
The standard UltraMap protocol was used to detect hMSH2. It again consisted of deparaffinization and cell conditioning, followed by primary antibody titration.
The tissue section was incubated with the primary antibody for 12 hours at 37°C. No secondary antibody was added. This was followed by counterstaining and dehydration in serial ethanol dilutions and xylene (Sigma).
The extended UltraMap protocol was used to determine the expression of hMLH1 and hPMS2. It differed from the standard protocol in that the cell conditioning was extended to three cycles of medium cell conditioner and cell conditioner, compared to two cycles in the standard protocol.

IHC analysis
Changes in protein expression following transfection of colorectal tissues were observed in stained cells using Olympus BX60 microscope and image analySIS software. Adjacent normal tissue served as an internal control for positive staining and a negative control staining was carried out without the primary antibody.
MMR protein staining was considered negative when all of the tumour cell nuclei failed to react with the antibody.

Optimization of MMR protein staining protocol
Tissue processing has the greatest single impact on the end result of IHC, and different tissue types often require slightly different pre-treatments for optimum results. To optimise staining protocols we employed the Closed Loop Assay Development (CLAD) for IHC (figure 3.1).

Figure 7.2: Closed Loop Assay Development (CLAD)
Optimal staining was achieved for hMSH6 using the DABMap system; however, acceptable staining for hMLH1, hMSH2 and hPMS2 was only achievable using the UltraMap system.
The index case was 38 years old when diagnosed with caecal cancer. One of her grandfathers was diagnosed with colorectal cancer (whether on the paternal or maternal side, the site of tumour and the age at diagnosis were not documented). Her mother died of breast cancer (age was not documented). One of her paternal cousins was diagnosed with breast cancer; the age at onset was also not documented.

Discussion
The or hPMS2 mutations [76,655]. On the other hand, about 10-15% of sporadic colorectal cancers also exhibit MSI, and loss of one or more of the MMR proteins has been found in these tumours [658,688]. Lack of expression of hMLH1 as the result of promoter methylation occurs in most sporadic MSI-positive tumours [97]. Loss of the other MMR proteins is rare in sporadic tumours, and in one study loss of either hMSH2 or hPMS2 was found in only 2% of tumours [689].
The major laboratory tests used in the evaluation of patients suspected to have Lynch syndrome include testing of tumour tissues using immunohistochemistry (IHC), MSI testing and germline testing for mismatch repair defects. IHC has an advantage over the other methods as the primary screening method, since it is less demanding to perform and is available as part of routine services in general pathology laboratories. In addition, IHC will determine which protein is affected and provides gene-specific information, thereby directing the genetic analysis and avoiding unnecessary tests that are exhausting and consume time and materials.
Nevertheless, while most mutations will result in total loss of protein expression, in some cases mutations only result in loss of function rather than loss of expression of the protein, which will still be detectable by IHC.  [698].
In evaluating the expression of MMR proteins using IHC, any tumour cell nuclear expression is considered positive due to the heterogeneity of expression and difficulties in test standardisation [136]. The intensity of staining in normal mucosa decreased towards the surface. Moreover, the normal enterocytes can serve as positive internal controls and should always be observed to determine the quality of staining [699]. In sporadic tumours due to hypermethylation of the promoter of hMLH1 there is consistent loss of the protein expression [700].

Discussion
Colorectal cancer is the fourth most common cancer in men and the third most common cancer in women worldwide [701]. In the USA, colorectal cancer is the second most common cause of cancer death among men aged 40 to 79 years and accounts for 9% of all cancer related deaths [702]. In Ireland, the National Cancer Registry predicts that the incidence of colorectal cancer will increase from 2111 cases in 2005 to 5537 in 2035 [703], indicating a more than 100% increase over the next 30 years. In this setting of increasing disease burden, translational research is of vital importance to clinical advancement. At the molecular level, activation of oncogenes and inactivation of tumour suppressor genes [359] are processes known to be involved in colorectal carcinogenesis.
Additionally, abrogation of mismatch repair systems [360]  The principle of an adenoma-carcinoma sequence, described in 1990, postulates that the transition from adenoma to carcinoma is associated with an accumulation of genetic events in key regulatory genes that confer a growth advantage to a clonal population of cells [74]. Since then, although molecular detection methods

Gene expression in colorectal cancer:
One of the primary aims of this study was to characterise the expression profiles of candidate genes in colorectal tissue. Rigorous evaluation of appropriate genes with which to normalise real-time quantitative PCR data identified PPIA and B2M as the most stably expressed genes in colorectal tissue samples. This enabled the development of a robust experimental approach which ensured that subsequent profiling of gene expression levels would be measured accurately and reproducibly in colorectal tissue. As a result, a comprehensive list of genes with highly differential expression patterns was derived.

CXCL12 and its receptors CXCR4 and CXCR7:
The first candidates to be examined were the chemokine CXCL12 and its receptors CXCR4 and CXCR7, whose gene expression levels were determined in 107 tumour and tumour associated normal colorectal tissues, the largest patient cohort reported to date. Significant down-regulation of CXCL12 in tumour compared to normal colorectal tissue was found, in contrast to CXCR4, which showed non-significant up-regulated expression levels in tumour tissues. The

TGFB1 and its receptors TGFBR1 and TGFBR2:
Although no significant differences were identified in gene expression levels of the receptor molecules TGFBR1 and TGFBR2 in tumour versus normal tissue, the expression of their ligand TGFB1 was found to be significantly lower in polyps and higher in tumours compared to normal tissue. These findings confirm previous work by Daniel et al. (2007), investigating TGFB1 protein expression by IHC in colorectal cancer. The authors demonstrated than in high-grade dysplastic polyps, than in low-grade dysplastic polyp [521]. Matsushita et al. (1999) found that TGFB receptor mRNA was expressed mainly by normal and adenoma colorectal tissues, whereas TGFB1 was expressed by cancer tissue [516].
Moreover, the significant positive correlation between TGFB1 and the expression levels of its receptors in both tumour and normal tissue suggests that their role in colorectal cancer is more complex than a simple ligand-receptor feedback.
Interestingly, we identified for the first time the relationship between the TGFB pathway and some established prognostic clinicopathological parameters. Low expression of TGFBR1 was found to be associated with raised CEA serum level and local tumour invasion. In addition, TGFBR2 down-regulation was associated with local, perineural and lymphovascular invasion and advanced nodal stage. These findings further support the role of TGFB receptors as tumour suppressors.
The down-regulation of TGFBR2 in proximal compared to distal tumours was described before and highlights the role of this gene in microsatellite instable tumours.
Tumours of proximal and distal parts of the colon may form different but related groups of tumours because of their different embryological origin, different exposure to bowel contents and differences in clinical presentation, progression and possible genetic and environmental epidemiology [522].
Many previous studies have examined the relationship between TGFB pathway and the disease progression in colorectal cancer. Nevertheless, this is the first study to explore the relation of TGFB1 and its receptors mRNA in colorectal cancer using RT-PCR. Moreover, the large cohort of patients in this study gives it further advantage compared to the other studies.
Other genes shown to be potential biomarkers in this study included CDH17, FABP1, IL8, MUC2 and PDCD4. In colorectal cancer, CDH17 expression was only investigated at protein level using IHC and immunoblotting. Hinoi et al. examined the protein expression in human colorectal cancer cell lines. In their study, CDH17 was not detected in cell lines showing dedifferentiated phenotypes [444]. This was further confirmed by Takamura et al., who examined CDH17 expression in four cell lines and 45 human primary colorectal carcinomas using monoclonal antibodies. In cell lines the protein was expressed in differentiated but not dedifferentiated phenotypes, while in tissues reduced CDH17 expression was associated with high tumour grade, advanced stage and lymphatic invasion and metastasis [373]. Moreover, Kwak et al. found reduced expression in 51% of the 207 colorectal cancers they studied using immunohistochemistry, and significantly correlated down-regulation of CDH17 with poor survival and lymph node metastasis [374]. To our knowledge, this is the first study to investigate CDH17 mRNA in colorectal cancer using RQ-PCR. Our findings support the above reports and confirm that down-regulation of CDH17 in colorectal cancer is associated with poor differentiation, raised CA19. as a marker for rectal cancer surgical management planning. In other words, a decreased level of CDH17 may indicate local invasion of the tumour, and therefore total mesorectal excision (TME) will be indicated.
Evidence of dysregulated FABP1 gene expression has been reported in colorectal gene expression array datasets [365,477], however, little is known of its expression profile with regard to clinical data. Lawrie et al. identified consistent loss of FABP1 in tumour compared to normal colon and also noted the association of decreased protein expression and poorly differentiated tumours and large adenomas [344]. Moreover, FABP1 expression was found to be associated with good prognosis after liver resection of colorectal cancer metastasis [478]. Although no statistically significant correlation between FABP1 expression and clinicopathological parameters was identified in this study, we observed that FABP1 is differentially expressed in normal-adenomacarcinoma sequence and its loss occurred early in colorectal cancer tumourogenesis. This indicates tumour suppressor function of FABP1 in colorectal cancer. The loss of FABP1 in colorectal cancer contrast with the findings in other tumours types which might be explained by the organ-specific distribution and the different role of FABP1 through distinct intracellular interacting molecules.
In keeping with previous reports, we noted overexpression of IL8 in tumour compared to normal colorectal tissue. In addition, we identified a progressive increase in gene expression from normal tissue, to polyps, to tumour. The early dysregulation of IL8 in colorectal cancer suggests that the gene may play a role in carcinogenesis in addition to its confirmed role in tumour progression.
Correlations with clinicopathological parameters revealed a significant association between reduced IL8 expression and poor tumour differentiation, advanced nodal stage and disease recurrence. Although the significance of these findings is unclear, they should be considered when planning IL8-targeting therapy.
Furthermore, we confirmed MUC2 mRNA down-regulation in non-mucinous and up-regulation in mucinous colorectal cancer. We also showed a progressive decrease in MUC2 expression from tumour-associated normal tissue, to polyps, to tumours. No significant association between MUC2 and clinicopathological variables, other than CA19.9 serum levels, was identified in this study.
PDCD4 mRNA expression was significantly lower in tumour and polyp compared to tumour-associated normal tissue, in keeping with the protein expression levels described previously [477,502,511]. Furthermore, we identified a novel association of reduced PDCD4 expression with disease recurrence and raised CA19.9 serum level. These findings suggest that PDCD4 is involved in both tumour promotion and tumour progression and represents a potential biomarker for evaluating the transition of normal colorectal tissue to adenoma and carcinoma. Reduced expression of PDCD4 in the proximal compared to the distal colon may indicate a potential role in microsatellite instability (MSI) and Lynch syndrome.
Measuring and quantifying tumour response to neoadjuvant CRT is important for elucidating factors that may allow response prediction and for planning the next step of treatment in rectal cancer patients.
Clinical response (cCR), pathological response (pCR) and tumour downstaging are the commonly used measures of response. Clinical response and tumour downstaging both compare tumour characteristics before and after treatment, clinically and with radiological tools such as magnetic resonance imaging (MRI) and trans-rectal ultrasound (TRUS), whereas pathological response (regression grade) stratifies response based on the biological effect of radiation on the tumour. The Mandard tumour regression grade, originally described for oesophageal cancer, is the most commonly used [44]. It consists of five grades based on the ratio of fibrosis to tumour. We identified, for the first time, a group of genes that can be used as markers to quantify tumour response following neoadjuvant therapy in rectal cancer patients.
Genes identified in the study as potential biomarkers for CRC screening, diagnosis and disease progression: biomarkers for diagnosis and screening; biomarkers of disease progression.
cancer [263]. They found reduced accumulation of specific miRNA in colorectal neoplasia and identified 28 different miRNA sequences between colonic cancer and normal mucosa. They also identified the human homologues of murine miR-  [538]. For the tissue blocks with smaller cross-sectional area they observed a linear increase in RNA recovery, while for the blocks with larger area not all the tissue was digested in tubes containing more than 4 slices, resulting in yields that were lower than expected. To further evaluate the RNA recovered, we selected miR-143 and miR-145 isolated from 1, 2, 3 or 4 slices for analysis by RQ-PCR. The reactions were carried out in triplicate for each slice number.
Regardless of the number of slices used for miRNA extraction, the mean expression level of the miRNAs was stable, with a standard deviation of less than 0.3.
This confirms the suitability of this method for RNA isolation from tissue as small as a colonic biopsy retrieved during an endoscopic procedure.
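The stability criterion described above can be illustrated with a short calculation. The following is a minimal sketch, assuming hypothetical triplicate Ct values for one miRNA measured from 1, 2, 3 or 4 slices (these are not data from this study). It averages each triplicate and then takes the sample standard deviation of those means across slice numbers; computing the deviation on per-slice means is an assumption, and only the 0.3 threshold comes from the text.

```python
from statistics import mean, stdev

# Hypothetical triplicate Ct values for one miRNA; illustrative only, not study data.
ct_by_slices = {
    1: [24.1, 24.3, 24.2],
    2: [24.0, 24.2, 24.1],
    3: [24.3, 24.4, 24.2],
    4: [24.2, 24.1, 24.3],
}

SD_THRESHOLD = 0.3  # stability threshold quoted in the text


def slice_means(ct_values):
    """Mean of the triplicate reactions for each slice number."""
    return {n: mean(reps) for n, reps in ct_values.items()}


def is_stable(ct_values, threshold=SD_THRESHOLD):
    """True if the SD of the per-slice means is below the threshold."""
    means = list(slice_means(ct_values).values())
    return stdev(means) < threshold


means = slice_means(ct_by_slices)
sd_across_slices = stdev(list(means.values()))
print(f"SD across slice numbers: {sd_across_slices:.3f}")
print("stable" if is_stable(ct_by_slices) else "not stable")
```

With real data, the same check would be repeated for each miRNA (here miR-143 and miR-145) separately.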
Neoadjuvant CRT has become the preferred treatment modality for locally advanced rectal adenocarcinoma, with a complete pathological response observed in up to 30% of patients [59,576]. The ability to predict response to pretreatment chemoradiation may spare poorly responding patients from undergoing aggressive and severely toxic treatment [577,578] from which they would derive no benefit. At present there is no reliable technique to predict clinical or pathological complete tumour regression after treatment, and limited data exist for each potential modality in this regard. Hence, many molecular markers have been assessed for their predictive value. Nevertheless, it seems unlikely that they will prove to be clinically useful response predictors.
Changes in miRNA expression profiles during cancer treatment could potentially provide a tool to predict and estimate the success of certain therapies.
By enabling tissue samples to be screened for multiple miRNAs simultaneously, microarrays have provided convincing evidence that a large number of miRNAs are deregulated in therapy-resistant or therapy-sensitive cancer cells. Changes in miRNA expression have been reported following anticancer treatment with various chemotherapeutic drugs in different cancer cell lines and patient samples [582].
To the author's knowledge, this is the first study to investigate the role of miRNAs as predictors of response to neoadjuvant CRT in rectal cancer.
Using ANN to analyse the miRNA profiling data, a distinct miRNA expression signature predictive of response to neoadjuvant CRT in 12 FFPE pre-treatment rectal cancer tissue samples was identified. These signatures consisted of three miRNA transcripts (miR-16, miR-590-5p and miR-153) to predict complete vs. incomplete response and two miRNA transcripts (miR-519c-3p and miR-561) to predict good versus poor response, with a median accuracy of 100%.
Although miR-16 was described as being stably expressed in both colorectal and breast tissues and has been highlighted as a good endogenous control for miRNA profiling in cancer research using RQ-PCR [293,323], several studies have confirmed its dysregulation in many cancers including CRC [587][588][589][590].
Moreover, Schaefer et al. examined the expression of four putative reference genes, including miR-16, with regard to their use as normalizers in prostate cancer and found that normalization to miR-16 can lead to biased results [591].
Although no report has determined the significance of miR-153 and miR-590 in CRC, their role in carcinogenesis has been highlighted before [599]. Shan et al. [599] investigated the role of miRNAs in the expression and regulation of transforming growth factor-beta1 (TGFB1), TGF-beta receptor type II (TGFBRII), and collagen production in vivo and in vitro. They found that nicotine produced significant upregulation of TGFB1 and TGFBRII expression at the protein level, and a decrease in the levels of miR-133 and miR-590. The role of miR-519 in cancer has also been documented [603][604][605]. miR-519 was reported to be a tumour suppressor and was found to reduce cell proliferation by lowering levels of the RNA-binding protein HuR [603]. It decreases HuR translation without influencing HuR mRNA abundance [603,604]. Abdelmohsen et al. examined the levels of miR-519 and HuR in pairs of cancer and adjacent normal tissues from ovary, lung and kidney and reported significantly higher levels of HuR, unchanged HuR mRNA concentration and reduced miR-519 levels in cancer specimens compared to normal tissues [604]. They also found that tumour cells overexpressing miR-519 formed significantly smaller tumours, while those expressing reduced miR-519 gave rise to substantially larger tumours.
Taken together, using microarray analysis of pretreatment FFPE rectal cancer tissues, a group of miRNA predictors of response to neoadjuvant CRT was identified for the first time. This could lead to a significant improvement in patient selection criteria and personalized rectal cancer management. However, before these data are applied clinically, a validation study using a large cohort of patients needs to be performed.

miRNA:mRNA correlations in colorectal cancer:
MiRNAs are crucial in eukaryotic gene regulation, especially in development and differentiation [609,610], and their expression in cancers has indicated that they may have a tumour suppressor or oncogenic function [611]. Functional characterisation of miRNAs will depend heavily on identification of their specific gene targets. In addition, a number of studies have shown that more than one miRNA can potentially bind to a single target gene; hence multiple miRNAs may cooperatively control the expression of target genes [305,306].
Numerous bioinformatic methods have been developed for high-throughput prediction of miRNA target genes [295,300,301,302,304], although it is understood that the presumed targets have to be validated experimentally. have many more targets than anticipated by conventional prediction methods [619]. In addition, these algorithms can predict false positives, and some targets may pass undetected. The false positive rates were estimated at 22%, 24% and 30% for TargetScan, miRanda and PicTar, respectively [312].
The PicTar and EMBL algorithms have a reported sensitivity of 70-80% [615] indicating 20-30% of targets may go undetected.
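The arithmetic behind these figures is simple: the fraction of true targets missed is one minus the sensitivity, and the expected number of spurious predictions is the false-positive rate multiplied by the number of predictions. The following is a minimal sketch using the rates quoted above. The count of 100 predicted targets is a hypothetical illustration, and the false-positive rate is assumed here to mean the fraction of predictions that are wrong.

```python
# False-positive rates quoted in the text [312], as fractions of predictions.
FALSE_POSITIVE_RATE = {"TargetScan": 0.22, "miRanda": 0.24, "PicTar": 0.30}


def missed_fraction(sensitivity):
    """Fraction of true targets an algorithm fails to predict (1 - sensitivity)."""
    return 1.0 - sensitivity


def expected_false_positives(n_predicted, false_positive_rate):
    """Expected number of spurious predictions among n_predicted targets."""
    return n_predicted * false_positive_rate


n_predicted = 100  # hypothetical number of predicted targets for one miRNA
for tool, rate in FALSE_POSITIVE_RATE.items():
    spurious = expected_false_positives(n_predicted, rate)
    print(f"{tool}: about {spurious:.0f} of {n_predicted} predictions expected to be spurious")

# Sensitivity range of 70-80% reported for PicTar and EMBL [615].
for sensitivity in (0.70, 0.80):
    print(f"sensitivity {sensitivity:.0%}: {missed_fraction(sensitivity):.0%} of targets missed")
```

Both error types argue for the experimental validation of predicted targets described in this section.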
To further understand the factors that control gene expression, and therefore protein biosynthesis, we performed bioinformatic analysis to search for putative miRNA/target gene duplexes among the panel of genes and miRNAs previously investigated by our research group in the Department of Surgery, NUI Galway.
In addition, correlation analysis between miRNA and mRNA expression identified novel miRNA:mRNA duplexes not previously identified by any of the computational approaches mentioned above. In this study, the in silico predicted relationships of the miR-21/PDCD4, miR-31/CXCL12 and miR-145/IL8 duplexes were confirmed by real-time PCR expression analysis.
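A correlation analysis of this kind can be sketched as follows. This is a minimal illustration, assuming paired, normalised expression values for one miRNA and one candidate target mRNA measured in the same samples. The numbers are hypothetical, and the choice of Pearson's coefficient is an assumption, since the statistic used is not stated in this section. An inverse (negative) correlation is what would be expected if the miRNA represses its target.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical log-scale relative expression in six paired samples (not study data).
mirna_expression = [1.2, 1.8, 2.5, 3.1, 3.6, 4.0]
mrna_expression = [3.9, 3.5, 2.8, 2.4, 1.9, 1.5]

r = pearson_r(mirna_expression, mrna_expression)
print(f"r = {r:.2f}")
if r < 0:
    print("inverse relationship: consistent with miRNA-mediated repression")
```

A rank-based coefficient such as Spearman's could be substituted when expression values are not approximately normally distributed; in either case a significance test would be needed before a pair is reported.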

Mismatch-repair (MMR) protein expression in colorectal cancer:
MMR proteins are nuclear enzymes that participate in the repair of base-base mismatches that occur during DNA replication in proliferating cells. The proteins form heterodimers that bind to areas of abnormal DNA and initiate its removal.
Loss of MMR proteins leads to an accumulation of DNA replication errors, particularly in areas of the genome with short repetitive nucleotide sequences, a phenomenon known as microsatellite instability (MSI) [360,648,649]. In addition to screening for Lynch syndrome, testing for MSI is important because of its possible prognostic and therapeutic implications. Cancers with high microsatellite instability (H-MSI) were reported to have a more favourable clinical outcome than non-MSI tumours, and the survival advantage conferred by the MSI phenotype is independent of tumour stage and other clinicopathological variables [156][157][158]. Moreover, tumours with H-MSI are thought to be less responsive to 5-fluorouracil and other anticancer agents in vitro and in vivo [159][160][161].
The major laboratory tests used in the evaluation of patients suspected to have Lynch syndrome include testing of tumour tissues using immunohistochemistry (IHC), MSI testing and germline testing for mismatch repair defects. As the primary screening method, IHC has the advantage over the other methods that it is less demanding to perform and is available as part of routine services in general pathology laboratories. In addition, IHC determines which protein is affected and so provides gene-specific information, thereby directing the genetic analysis and avoiding unnecessary tests that consume time and materials.
Nevertheless, while most mutations result in total loss of protein expression, in some cases mutations result only in loss of function rather than loss of expression, and the protein will still be detectable by IHC.
In this study, MMR protein expression was tested without considering family history in a prospective cohort of newly diagnosed colorectal cancer patients. This analysis identified three patients with loss of one or more MMR proteins. Our findings and the previous reports highlight the importance of molecular screening of patients with colorectal cancer for MSI using immunohistochemistry. This strategy identified mutations in patients who would otherwise not have been detected. Therefore, we recommend it as a policy for all newly diagnosed colorectal cancer patients because of its important prognostic implications.

Future Work
The study of gene expression in colorectal cancer has yielded interesting results and opened new avenues of exploration. The challenge we now face is the translation of new scientific knowledge into clinically applicable diagnostic, prognostic and therapeutic tools for use in the management of colorectal cancer. The microarray analysis of pretreatment FFPE rectal cancer tissues has identified a group of miRNA predictors of response to neoadjuvant CRT. This could lead to a significant improvement in patient selection criteria and personalized rectal cancer management. However, before these data are applied clinically, a validation study using a large cohort of patients needs to be designed.
Reciprocal expression was observed between several genes and their miRNA partners, suggestive of novel mechanisms which could become uncoupled in colorectal carcinogenesis. These findings support the hypothesis that these miRNA:mRNA duplexes may hold potential as therapeutic agents or targets in colorectal cancer. In vivo functional analysis is warranted to further investigate this potential. One possible direction would be the development of a model of colorectal cancer in which the effect of specific miRNA up- or down-regulation on gene expression, and therefore tumour behaviour, could be assessed. In this manner, the potential of these duplexes as therapeutic targets could be explored.
MMR protein analysis has highlighted the importance of molecular screening of patients with colorectal cancer for MSI using immunohistochemistry. Expansion of this analysis to a wider scale via microarray promises to identify novel biomarkers that could be used for prognostication and personalized patient treatment in colorectal cancer.