Identification of Candidate Targeted Genes in Molecular Subtypes of Gastric Cancer

Because gastric cancer is highly heterogeneous, it needs to be classified further for diagnosis and treatment, and subtype-specific biomarkers are important for precision medicine. Based on gene expression levels, we constructed genome-wide co-expression networks for the invasive, proliferative and metabolic subtypes of gastric cancer. Hierarchical clustering was used to obtain sub-networks, and a hub gene set for each subtype was derived by analyzing these sub-networks. A comparative analysis between subtypes then yielded unique differentially expressed genes as candidate targeted genes for each subtype. These genes may help improve diagnostic and therapeutic methods and support the development of new drugs for gastric cancer.


Introduction
Gastric cancer is a common tumor with high morbidity and mortality worldwide. It is a heterogeneous disease with multiple histopathologic features. Differences among subtypes at the clinical, pathological and molecular levels were identified as features of the subtypes. The classification was validated using independent gene expression profile data from patients whose features were supported clinically [1]. Although this classification can help the diagnosis and treatment of gastric cancer, it cannot support precise treatment because no targeted genes were identified in that study. We therefore considered it essential to carry out further analyses to identify targeted genes using genome-wide data for these subtypes.

Microarray Data
Gene expression profile data were downloaded from the GEO database (accession no. GSE35809). The data set includes genome-wide mRNA expression data for 70 primary gastric cancer patients from Australia. Lei et al. divided these patients into three subtypes when verifying the credibility of their classification method on this data set: 26 samples in the invasive subtype, 29 in the proliferative subtype and 15 in the metabolic subtype. The invasive subtype is referred to as the mesenchymal subtype in the paper by Lei et al. [1]. After preprocessing, 21,212 genes remained for constructing the co-expression networks.

Constructing Co-Expression Network and Sub-Networks
The co-expression network was constructed using the WGCNA (weighted gene co-expression network analysis) package in R [2]. After calculating the Pearson correlation coefficient between genes and choosing the threshold value that decides whether two genes are related, we obtained an adjacency function. The topological overlap matrix was then computed from the adjacency function, and the dissimilarity matrix was derived from the topological overlap matrix. Sub-networks of the co-expression network were obtained by hierarchical clustering based on the dissimilarity matrix. Sub-networks can be merged when the similarity among them is larger than a certain threshold value, called the height cutoff [2].
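
The chain from correlation to adjacency, topological overlap and dissimilarity can be sketched on toy data as follows. This is a minimal pure-Python illustration, not the WGCNA implementation used in the study. It assumes a hard threshold on the absolute Pearson correlation (an unsigned network) and the usual topological overlap formula, TOM_ij = (sum_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij). The function names and the expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def hard_adjacency(expr, threshold=0.6):
    """Assumed rule: genes i and j are linked when |r_ij| >= threshold."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(expr[i], expr[j])) >= threshold:
                adj[i][j] = adj[j][i] = 1.0
    return adj

def topological_overlap(adj):
    """Topological overlap: shared neighbours plus the direct link, normalised."""
    n = len(adj)
    k = [sum(row) for row in adj]  # node degrees (diagonal is zero)
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom

def tom_dissimilarity(adj):
    """Dissimilarity used for clustering: 1 - topological overlap."""
    return [[1.0 - v for v in row] for row in topological_overlap(adj)]

# Toy data: 4 hypothetical genes x 5 samples (not taken from GSE35809).
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [1.1, 2.0, 2.9, 4.2, 5.0],
    [3.0, 1.0, 4.0, 1.0, 5.0],
]
adj = hard_adjacency(expr, threshold=0.6)
diss = tom_dissimilarity(adj)
```

In this toy example the first three genes are mutually linked and the fourth is isolated, so the dissimilarity is low within the first group and maximal between that group and the fourth gene.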

Identifying Hub Gene Set of Subtype
A hub gene is one of the most important genes in a network. Here, a hub gene in a sub-network should satisfy the following two rules: (1) genes were ranked in descending order of degree, and the top 10 genes were selected as hub genes; (2) genes were ranked in descending order of the correlation coefficient between gene expression level and the module eigengene E, and the top 10 genes were selected. The module eigengene E is defined as the first principal component of a given module and can be considered representative of the gene expression profiles in that module [3]. The hub genes of all sub-networks were merged into the hub gene set of the subtype.
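
The two ranking rules can be illustrated with a small pure-Python sketch. Several choices here are assumptions rather than details given in the text: a gene is kept only if it is in both top lists (intersection), degree is counted with a hard threshold of 0.6 on the absolute correlation, the eigengene is approximated by power iteration on standardized expression, and genes are ranked by the absolute value of their eigengene correlation because the sign of a principal component is arbitrary. All names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def standardize(row):
    """Scale one gene's profile to mean 0 and standard deviation 1."""
    n = len(row)
    m = sum(row) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in row) / (n - 1))
    return [(v - m) / sd for v in row]

def module_eigengene(expr, iters=200):
    """First principal component across samples, found by power iteration."""
    z = [standardize(row) for row in expr]
    s = len(z[0])
    # Sample-by-sample cross-product matrix of the standardized data.
    c = [[sum(g[a] * g[b] for g in z) for b in range(s)] for a in range(s)]
    # Start vector must not be constant: constants lie in the null space.
    v = [math.sin(i + 1) for i in range(s)]
    for _ in range(iters):
        w = [sum(c[a][b] * v[b] for b in range(s)) for a in range(s)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def hub_genes(names, expr, threshold=0.6, top_n=10):
    """Genes in both the top-degree list and the top eigengene-correlation list."""
    n = len(expr)
    degree = [sum(1 for j in range(n)
                  if j != i and abs(pearson(expr[i], expr[j])) >= threshold)
              for i in range(n)]
    eig = module_eigengene(expr)
    eig_cor = [abs(pearson(row, eig)) for row in expr]
    by_degree = sorted(range(n), key=lambda i: -degree[i])[:top_n]
    by_eigengene = sorted(range(n), key=lambda i: -eig_cor[i])[:top_n]
    return {names[i] for i in by_degree} & {names[i] for i in by_eigengene}

# Toy module: 5 hypothetical genes x 6 samples; top_n is 3 instead of 10.
names = ["a", "b", "c", "d", "e"]
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    [1.0, 3.0, 2.0, 5.0, 4.0, 6.0],
    [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    [3.0, 1.0, 2.0, 3.0, 1.0, 2.0],
]
hubs = hub_genes(names, expr, threshold=0.6, top_n=3)
```

Ties in degree are broken by input order in this sketch; a real analysis would need an explicit tie-breaking rule.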

Gene Ontology and Pathway Analysis of the Subtypes
To understand the functions of the hub genes in each subtype, we performed enrichment analysis of gene ontology terms and KEGG pathways using DAVID (version 6.7).

Differential Expression Analysis of Genes
Differentially expressed genes between subtypes were screened by t-test and fold change (P-value ≤ 0.05, Fold change ≥ 2).
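
The screening rule can be sketched for a single gene as follows. The text does not say which t-test variant was used or whether expression values are on a log scale, so this illustration assumes a two-sample Student's t-test with pooled variance and log2-scale expression values (a fold change of 2 then corresponds to an absolute difference of 1 in mean log2 expression). The p-value is obtained by numerically integrating the t density with Simpson's rule, and all function names and values are hypothetical.

```python
import math

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t via Simpson integration of the density."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # probability mass on [0, |t|]
    return max(0.0, 1.0 - 2.0 * central)

def student_t_test(x, y):
    """Pooled-variance two-sample t-test; returns (t, two-sided p)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    df = nx + ny - 2
    pooled = (ssx + ssy) / df
    t = (mx - my) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, t_two_sided_p(t, df)

def is_differential(x, y, p_cut=0.05, fc_cut=2.0):
    """Apply both cutoffs to log2 expression values of one gene."""
    _, p = student_t_test(x, y)
    log2_fc = sum(x) / len(x) - sum(y) / len(y)
    return p <= p_cut and abs(log2_fc) >= math.log2(fc_cut)

# Hypothetical log2 expression of one gene in two subtypes.
subtype_a = [8.1, 8.4, 7.9, 8.2]
subtype_b = [6.0, 6.3, 5.8, 6.1]   # about 2.1 log2 units lower
subtype_c = [7.6, 7.9, 7.4, 7.7]   # only 0.5 log2 units lower
strong = is_differential(subtype_a, subtype_b)
weak = is_differential(subtype_a, subtype_c)
```

The second comparison fails the screen even though the difference is statistically significant, because its fold change is below 2.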

The Co-Expression Network of Each Subtype
Based on the preprocessed microarray gene expression data, the co-expression network of each subtype was constructed using WGCNA. A co-expression network should satisfy a scale-free topology, so an appropriate threshold value that decides whether two genes are related must be selected. When the Pearson correlation coefficient between two genes is larger than the threshold value, the genes are considered related. Here, we set the threshold to 0.6 because the network models of all three subtypes satisfy a scale-free topology under this condition.
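
One common way to check a scale-free topology is to regress log p(k) on log k, where p(k) is the fraction of genes with degree k, and to look for a negative slope with R² close to 1. The sketch below is a simplified illustration of that idea and is not the fit index computed by WGCNA, which, to our understanding, bins connectivities before fitting. The function name and the degree list are hypothetical.

```python
import math
from collections import Counter

def scale_free_fit(degrees):
    """Return (slope, r_squared) of the regression of log10 p(k) on log10 k."""
    counts = Counter(k for k in degrees if k > 0)  # isolated genes are skipped
    n = sum(counts.values())
    xs = [math.log10(k) for k in counts]
    ys = [math.log10(c / n) for c in counts.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = sxy * sxy / (sxx * syy)
    return slope, r_squared

# Hypothetical degree list in which the count of genes is proportional to k^-2.
degrees = [1] * 100 + [2] * 25 + [5] * 4 + [10] * 1
slope, r_squared = scale_free_fit(degrees)
```

In practice the fit would be repeated for several candidate thresholds, and a threshold would be accepted when the degree distribution of every subtype network is close to a power law.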

Sub-Networks of Each Subtype
Hierarchical clustering was used to divide each co-expression network into sub-networks (Figure 1). With a height cutoff of 0.75, there are 24 sub-networks in the invasive subtype, 24 in the proliferative subtype and 26 in the metabolic subtype. Figure 2 shows a co-expression sub-network of the invasive subtype, called invmodule24, which contains 52 genes.
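
How a height cutoff turns a hierarchical clustering into groups can be illustrated with a small pure-Python sketch. It assumes average linkage and a fixed-height cut, which is a simplification: WGCNA workflows often use a dynamic tree cut followed by merging of similar modules, and the text does not specify the exact procedure. The function name and the dissimilarity values are hypothetical.

```python
def average_linkage_cut(dist, height):
    """Merge clusters by average linkage until the closest pair exceeds `height`."""
    clusters = [[i] for i in range(len(dist))]

    def average_distance(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = average_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        if best[0] > height:
            break  # every remaining merge would happen above the cutoff
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]

# Hypothetical dissimilarity matrix for 4 genes: two tight pairs, far apart.
dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
groups = average_linkage_cut(dist, height=0.75)
```

With a cutoff of 0.75 the two pairs remain separate groups; raising the cutoff above 0.9 would merge them into one.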

Hub Gene Set in Subtype
The hub gene set contains 207 genes in the invasive subtype, 215 in the proliferative subtype and 204 in the metabolic subtype. To find the differences between subtypes, we compared their hub gene sets. No gene is common to all three sets. There are 13 common genes between the invasive and proliferative sets, 7 between the proliferative and metabolic sets, and 4 between the invasive and metabolic sets. The proportion of unique genes is 91.79% in the invasive set, 90.70% in the proliferative set and 94.61% in the metabolic set. This suggests that the hub gene set of a subtype may well represent the unique features of that subtype.
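
The reported proportions follow directly from the set sizes and pairwise overlaps. Because no gene is shared by all three sets, the genes a set shares with the other two do not overlap, so the unique count is simply the set size minus both pairwise overlaps. The short check below uses only the numbers given in the text; the variable and function names are illustrative.

```python
# Hub gene set sizes and pairwise overlaps as reported in the text.
sizes = {"invasive": 207, "proliferative": 215, "metabolic": 204}
shared = {
    ("invasive", "proliferative"): 13,
    ("proliferative", "metabolic"): 7,
    ("invasive", "metabolic"): 4,
}

def unique_percent(subtype):
    """Percentage of a subtype's hub genes not found in any other hub gene set."""
    overlap = sum(n for pair, n in shared.items() if subtype in pair)
    return 100.0 * (sizes[subtype] - overlap) / sizes[subtype]

percentages = {name: round(unique_percent(name), 2) for name in sizes}
```

For example, the invasive set shares 13 + 4 = 17 genes with the other sets, leaving 190 of 207 genes unique.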

Gene Ontology and Pathway Analysis Results
The hub gene set of each subtype was used for gene ontology and KEGG pathway analysis. The 10 most significantly enriched biological process terms are shown in Figure 3.
At present, research on gastric cancer focuses on identifying tumor biomarkers related to cell cycle regulation, apoptosis, tumor angiogenesis, tumor invasion and metastasis, and on their roles in pathophysiology [4]. Changes in the expression levels of growth factors and cytokines and abnormal regulation of the cell cycle are associated with the differentiation and survival of tumor cells. Mutant genes related to cell adhesion and angiogenesis are vital in the invasion and metastasis of gastric cancer cells [5]. Aberrant mitosis is the most common feature of cancer. NUSAP1 is a mitotic regulator, and its depletion in cells causes G2/M arrest and abnormalities in interphase nuclei [6]. Rho GTPases ac- [...] cytoskeleton and epithelial structures, so they are considered to regulate histological cell type, such as invasive activities of tumor cells [8]. Oocyte meiosis and vascular smooth muscle contraction are related to the proliferative subtype. The B cell receptor signaling pathway and the p53 signaling pathway are related to the metabolic subtype; the p53 signaling pathway plays an important role in cancers, and mutations of genes in this pathway are the most common genetic changes in cancers [9]. The enriched pathways related to a subtype are highly reliable.

Discussion
In the comparative analysis between each pair of the three sets, we identified unique differentially expressed genes as candidate targeted genes (Table 1). Differences in phenotype between subtypes may be caused by these genes. In Table 1, genes in bold have been reported in the literature to be associated with the development of gastric cancer. Notably, some unique differentially expressed genes appear in the comparisons of one subtype with both of the other subtypes: ARHGAP15, CAP2, COL14A1, DARC, FERMT2, FHL1, FLNA, RAB23, SMYD1, SPON1 and ZEB1 in the invasive subtype, and BUB1B, KIF11, KIF18B, NUSAP1 and SYNPO2 in the proliferative subtype. These genes are more suitable as specific target biomarkers for their subtypes. We did not find genes of this kind in the metabolic subtype. Some genes in Table 1 have been confirmed by biological experiments to be related to gastric cancer, which indirectly supports the reliability of our results. In the invasive subtype, the expression level of FHL1 mRNA was significantly lower in gastric carcinoma tissue than in adjacent normal tissue, and patients with high FHL1 expression showed significantly longer survival than those with low expression [4]. FLNA expression was downregulated in gastric carcinoma tissues and was related to tumor invasion, lymph node metastasis, clinical stage, tumor differentiation and poor prognosis [10]. Silencing RAB23 in gastric cancer cells significantly decreased cellular invasion and migration, whereas overexpression of RAB23 enhanced cellular invasion [11].
The protein expression of ZEB1 was significantly upregulated in gastric carcinoma tissues. Overexpression of ZEB1 was associated with differentiation, TNM stage and invasion in gastric cancer [12].
In the proliferative subtype, the polymorphisms rs1031963 (C > T) and rs1801376 (A > G) in the BUB1B gene were examined in advanced gastric cancer patients for their influence on gastric cancer risk [13]. KIF11 was overexpressed in gastric cancer, and knockdown of KIF11 inhibited sphere formation of gastric cancer stem cells, so KIF11 likely plays a vital role in gastric cancer [14].
In the metabolic subtype, because of mononucleotide repeats in its coding sequence, the HMCN1 gene could be a target for frameshift mutation in cancers with microsatellite instability. Frameshift mutation of genes that contain mononucleotide repeats is a feature of gastric cancer with microsatellite instability [15]. By immunohistochemistry, the expression of ISL1 was significantly higher in gastric adenocarcinoma and was associated with depth of invasion, lymph node metastasis, TNM stage and histological grade [16]. Thy1 was overexpressed both in human gastric cancer samples and in isolated cancer-associated fibroblasts [17]. RUNX3 can inhibit gastric cancer invasion and metastasis by upregulating TIMP1 to inactivate MMP9 [18]. Through regulation of CCKBR, HER2-negative gastric cancer cells are inhibited by trastuzumab and gastrin [19]. Mutations of the KIT gene have been detected in 20% to 92% of gastrointestinal stromal tumors; given this high mutation frequency, KIT has been considered a possible genetic biomarker for gastrointestinal stromal tumors [20]. The correlation between gene expression level and promoter methylation in the LTF gene may provide a new target for clinical diagnosis and treatment of gastric cancer [21]. The mRNA and protein expression of Nek2 in gastric cancer was significantly higher than in surgical margin tissues, and Nek2 expression correlated prominently with TNM stage, depth of invasion, differentiation and lymph node metastasis in gastric cancer [22].

Conclusion
Gastric cancer is one of the most common human cancers and is associated with genetic and epigenetic alterations. Research on changes in gene expression levels during the occurrence and development of gastric cancer is conducive to diagnosis and treatment of the disease. Here, we identified a number of candidate biomarkers in three subtypes of gastric cancer. They might represent specific genomic features of the subtypes, which may underlie the differences in phenotype between subtypes. Some of our KEGG pathway and GO results agree with those of Lei et al. For example, the focal adhesion pathway is enriched in the invasive subtype, the biological process term cell adhesion is enriched in the invasive subtype, and terms for induction of cells into M phase and mitosis are enriched in the proliferative subtype. Furthermore, we obtained some new features of the three subtypes, as described earlier, and the number of genes that represent the features of the subtypes is much smaller. Selecting candidate targeted genes of subtypes from hub gene sets is more effective, and it may be helpful for the diagnosis and treatment of gastric cancer.
For better diagnosis and treatment, gastric cancer should be classified into subtypes clinically. Tumor molecular classification was first proposed by the National Cancer Institute; it divides tumors into subtypes using molecular classification technology. Classification of tumors based on molecular expression characteristics is more useful for individualized therapy and more effective for prognosis than pathological classification. Lei et al. used a robust method of unsupervised clustering and consensus hierarchical clustering with iterative feature selection to analyze gene expression profiles of 248 patients with gastric tumors. They defined three subtypes of gastric cancer: proliferative, metabolic and mesenchymal.

Table 1. Unique differentially expressed genes between subtypes.