Bioinformatics Analysis of the Relationship between Dilated Cardiomyopathy and Chronic Heart Failure
1. Background
Dilated cardiomyopathy (DCM) is one of the common types of cardiomyopathy. It is a primary cardiomyopathy characterized by left ventricular or biventricular enlargement, ventricular wall thinning, and ventricular systolic dysfunction, and it is a common cause of heart failure (HF), arrhythmia and sudden death. Its incidence is 0.08‰ and its prevalence is 0.36‰. Once clinical symptoms appear, mortality is as high as 25% within 1 year and rises to 50% within 5 years [1]. The incidence in China is (13 - 84)‰ [2]. In 35% of cases, the etiology involves mutations in genes encoding cytoskeletal, sarcomere and nuclear membrane proteins. Acquired causes include myocarditis and exposure to alcohol, drugs and toxins, as well as metabolic and endocrine disorders. The most common symptoms are those of congestive heart failure, but the presentation may also include arrhythmia, thromboembolism and sudden death. Secondary neurohormonal changes may lead to reverse cardiac remodeling and persistent myocardial cell injury. Individuals with low ejection fraction or severe diastolic dysfunction have a poor prognosis [3]. DCM is often insidious in its early stage, with no obvious symptoms. By the time clinical symptoms are evident, the whole heart has usually dilated, cardiac systolic function has decreased significantly, and congestion of the systemic or pulmonary circulation has developed. Most DCM patients eventually die of congestive HF. DCM is the main cause of sudden cardiac death and HF [2] and the leading cause of heart transplantation [4], so effective treatment should be given in time. Because the early symptoms are not obvious and specific screening methods are lacking, many patients are admitted to hospital with heart failure in the middle and late stages. Therefore, the key to the treatment of DCM is to improve cardiac function, prevent malignant arrhythmia, and improve quality of life and survival rate [5].
Chronic heart failure (CHF) is a complex clinical syndrome that places a heavy burden on society in terms of both mortality and incidence. Because it is prone to acute exacerbation, it leads to repeated and prolonged hospitalization. Its treatment includes drugs that improve survival and reduce hospitalization. As the disease progresses, DCM is one of the causes of acute exacerbation of CHF. Related studies by B and Pawlak A [6] [7] have screened biomarkers for the diagnosis of CHF caused by DCM, such as hepatocyte growth factor and desmin, but whether the two diseases are linked at the gene level remains to be explored.
In this study, bioinformatics analysis was performed on gene chip data for DCM, CHF and their control populations from the Gene Expression Omnibus (GEO) database to obtain the common differentially expressed genes (cDEGs) of the two diseases. The key genes and their molecular networks were studied to explore the molecular biological functions that may be involved and to provide a theoretical reference for revealing the potential association between the two diseases and its molecular mechanisms.
2. Method
2.1. Data Acquisition
The original data of the DCM chip dataset GSE3585 and the CHF chip dataset GSE76701 were obtained by searching the public GEO database of the National Center for Biotechnology Information with “Dilated Cardiomyopathy” and “Chronic Heart Failure” as keywords. GSE3585 was generated on the GPL96 [HG-U133A] Affymetrix Human Genome U133A Array, and GSE76701 was generated on the GPL570 [HG-U133_plus_2] Affymetrix Human Genome U133 Plus 2.0 Array. GSE3585 includes 7 DCM patients and 5 non-DCM patients, and GSE76701 includes 4 CHF patients and 4 non-CHF patients.
Differentially expressed genes were screened with the limma package in R (https://www.r-project.org). Differential expression analysis was performed on the mRNA microarray data of DCM and CHF, with P < 0.05 and an absolute log fold change (FC) of |logFC| > 1 as the screening conditions for differentially expressed mRNAs (DEmRNAs). The common target genes of DCM and CHF were obtained by intersecting the differential genes of the two diseases.
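The screening and intersection steps can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study: it only applies the stated thresholds (P < 0.05, |logFC| > 1) to a table of already computed statistics and intersects the two gene sets. The function name `screen_degs`, the row layout, and all numeric values are hypothetical and chosen only for illustration.

```python
def screen_degs(rows, p_cut=0.05, lfc_cut=1.0):
    """Return the gene symbols that pass P < p_cut and |logFC| > lfc_cut."""
    return {
        r["gene"]
        for r in rows
        if r["p"] < p_cut and abs(r["logFC"]) > lfc_cut
    }

# Toy statistics (invented values, not taken from GSE3585 or GSE76701).
dcm_rows = [
    {"gene": "NPPA", "logFC": 2.1, "p": 0.001},
    {"gene": "MYH6", "logFC": -1.4, "p": 0.010},
    {"gene": "GENE_X", "logFC": 0.4, "p": 0.002},  # fails the |logFC| cut
]
chf_rows = [
    {"gene": "NPPA", "logFC": 1.8, "p": 0.003},
    {"gene": "MYH6", "logFC": -1.2, "p": 0.200},   # fails the P cut
    {"gene": "GENE_Y", "logFC": 1.5, "p": 0.010},
]

dcm_degs = screen_degs(dcm_rows)
chf_degs = screen_degs(chf_rows)

# Common DEGs are the set intersection of the two screened lists.
common = sorted(dcm_degs & chf_degs)
print(common)  # ['NPPA']
```

Using sets makes the intersection independent of gene order and removes duplicate probe-level symbols, which is one reasonable design choice when several probes map to the same gene.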
2.2. GO Function and KEGG Signaling Pathway Enrichment Analysis
The R clusterProfiler package was used for GO and KEGG enrichment analysis. GO functional enrichment was performed on the common DEmRNAs of the two diseases to analyze the biological processes in which they participate, together with KEGG signaling pathway enrichment analysis. The Metascape online analysis tool (http://metascape.org/gp/) was also used to analyze GO and KEGG enrichment of the differential genes and to screen the GO functional enrichment modules. Enrichment analysis of KEGG, Wiki Pathways, Reactome, Hallmark Gene Sets and other signaling pathways was performed at the same time to discover possible biological pathways. An adjusted P < 0.05 was used as the threshold to screen the main enriched functions and pathways of the differential genes.
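As an illustration of what an adjusted P threshold implies, the following Python sketch implements a generic over-representation test (hypergeometric upper tail) followed by Benjamini-Hochberg adjustment. This is an assumption about the kind of statistics behind such enrichment tools, not a reproduction of the clusterProfiler or Metascape computations, and the function names and toy numbers are hypothetical.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k): N genes drawn from M background genes, n of which are in the term."""
    total = comb(M, N)
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy numbers (invented): 20 background genes, a term containing 4 of them,
# and a query list of 5 genes that includes 3 members of the term.
p_term = hypergeom_sf(k=3, M=20, n=4, N=5)

# Two further raw P values stand in for other tested terms.
raw = [p_term, 0.20, 0.04]
adj = bh_adjust(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(round(p_term, 4), [round(q, 3) for q in adj], significant)
```

In this toy case the first term has a raw P below 0.05 but an adjusted P of about 0.06, so no term passes the adjusted threshold; this shows why the adjusted cutoff is stricter than the raw one.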
2.3. PPI and Core Module Analysis
The differential genes were analyzed with the STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) 11.0 online tool. The STRING results were imported into Cytoscape 3.9.1, and the plug-ins CytoNCA, CytoHubba and MCODE were used to draw the protein-protein interaction (PPI) network and to identify core genes and modules.
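The idea of ranking hub genes in an interaction network can be sketched in Python with the simplest centrality measure, node degree. This is only an illustration under stated assumptions: the MCC, DMNC and MNC algorithms named later in the article are not reimplemented here, and the edge list below is invented for the example rather than taken from STRING output.

```python
from collections import defaultdict

def degree_ranking(edges):
    """Rank nodes of an undirected network by degree (ties broken by name)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(
        ((gene, len(partners)) for gene, partners in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )

# Hypothetical interaction pairs (not real STRING results).
toy_edges = [
    ("NPPA", "NPPB"),
    ("NPPA", "MYH6"),
    ("NPPB", "MYH6"),
    ("FRZB", "SFRP4"),
    ("NPPA", "ASPN"),
]

ranking = degree_ranking(toy_edges)
print(ranking[:3])  # [('NPPA', 3), ('MYH6', 2), ('NPPB', 2)]
```

Storing neighbors in sets means that a duplicated edge is counted once, which matters when an exported edge table lists the same interaction in both directions.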
3. Results
3.1. Co-Morbidity mRNA Identification
Differential expression of the gene chips for the two diseases was analyzed with the R limma package. Most gene expression levels were basically consistent, indicating that the data were suitable for subsequent analysis (Figure 1).
Figure 1. Gene expression box plots. Note: (a) is the box plot of the GSE3585 dataset, (b) is the box plot of the GSE76701 dataset.
A total of 29 differential genes were identified between DCM patients and normal individuals, and a total of 219 differential genes were identified between CHF patients and normal individuals (Figure 2). The up-regulated and down-regulated genes were visualized separately, and heat maps were drawn according to |logFC| (Figure 3). In addition, we determined the intersection of these two datasets and obtained a total of 15 co-expressed genes, which are shown in the Venn diagram (Figure 4).
Figure 2. Volcano plots. Note: (a) is the volcano plot of the GSE3585 dataset, (b) is the volcano plot of the GSE76701 dataset; blue marks down-regulated genes and red marks up-regulated genes.
Figure 3. Heat maps. Note: (a) is the heat map of the GSE3585 dataset, (b) is the heat map of the GSE76701 dataset; blue marks down-regulated genes and red marks up-regulated genes.
Figure 4. Venn diagram of the DCM and CHF intersection genes.
3.2. GO and KEGG Enrichment Analysis of Intersection Genes
The R clusterProfiler package was used to perform GO and KEGG pathway enrichment analysis on the common target genes of DCM and CHF. Biological Processes (BP), Cellular Components (CC), Molecular Functions (MF) and KEGG pathways were analyzed (Figure 5). The results show that: 1) GO function is mainly enriched in cGMP metabolic process, cGMP biosynthetic process, receptor guanylyl cyclase signaling pathway, collagen-containing extracellular matrix, muscle myosin complex, serine-type peptidase complex, hormone activity, hormone receptor binding and Wnt-protein binding; 2) the KEGG pathway is mainly enriched in the cGMP-PKG signaling pathway, vascular smooth muscle contraction, etc. (Figure 5).
Figure 5. GO functional enrichment analysis and KEGG pathway analysis.
3.3. Construction and Analysis of Common Expression PPI
The STRING online tool was used to analyze the 248 DEmRNAs from the two datasets. The STRING results were imported into Cytoscape 3.9.1 to obtain the PPI network. The common expression gene module was obtained with the MCODE algorithm in the Cytoscape plug-in, and one core module with more than 4 points was identified (Figure 6). Nine hub genes were obtained with 10 algorithms in the CytoHubba plug-in, such as MCC, DMNC, and MNC: NPPB, NPPA, MYH6, FRZB, ASPN, SFRP4, RPS4Y1, DDX3Y, and HNRNPU (Figure 6).
Figure 6. PPI network of common expressed genes.
4. Discussion
DCM is understood as the final common response of the myocardium to various genetic and environmental insults. Its etiology is diverse and not fully understood, and includes genetic factors (primary dilated cardiomyopathy) and acquired factors (secondary dilated cardiomyopathy). Acquired factors include infection, toxins, cancer treatment, endocrine diseases, pregnancy, tachyarrhythmia and immune-mediated diseases. Pathogenic gene mutations may be present in 5% - 15% of patients with acquired DCM. Therefore, both genetic and acquired factors should always be considered in diagnostic tests and treatment methods [8]. In most cases DCM is progressive, and most patients eventually develop CHF. HF is the leading cause of death worldwide, affecting more than 37.7 million people, and DCM in particular is the most common type of systolic HF [9]. The progression of HF depends on the ejection fraction and the cause of the disease. Without transplantation, the long-term survival rate is very low. Related laboratory findings are, to a large extent, not specific to the underlying cause of cardiomyopathy but are typical of congestive HF [10]-[12]. At present, there is no effective treatment to prevent DCM from progressing to HF [13]. It is therefore particularly important to explore the relationship between DCM and CHF, whose specific molecular mechanism remains to be clarified.
In this study, bioinformatics methods were used to select appropriate gene chip datasets from the GEO database, and the GEO2R online differential gene analysis tool was used to screen the differentially expressed genes related to the pathogenesis of DCM and CHF. Volcano plots and heat maps were drawn for the differentially expressed genes of the two datasets, and their intersection, shown in a Venn diagram, yielded 15 cDEGs. Enrichment analysis of GO, KEGG, Wiki Pathways, Reactome and other signaling pathways showed that the differential genes were mainly enriched in cGMP metabolic process, cGMP biosynthetic process, collagen-containing extracellular matrix, muscle myosin complex, serine peptidase complex, hormone receptor binding, Wnt-protein binding, vascular smooth muscle contraction, receptor guanylate cyclase signaling pathway and cGMP-PKG signaling pathway. KEGG analysis showed that 2 main signaling pathways were involved in the differentially expressed genes, and 9 key differentially expressed genes were predicted, namely NPPB, NPPA, MYH6, FRZB, ASPN, SFRP4, RPS4Y1, DDX3Y and HNRNPU. These 9 genes may therefore serve as a basis for the diagnosis and treatment of long-term DCM patients with CHF.
NPPA and NPPB are cardiac genes encoding atrial natriuretic peptide (ANP, also called atrial natriuretic factor, ANF) and brain natriuretic peptide (BNP), respectively, and belong to the fetal gene program. Both genes are highly expressed in atrial and ventricular myocardium at the embryonic and fetal stages. After birth, the expression of NPPA in the ventricle is strongly down-regulated. When pressure is applied, the heart releases the propeptides, and the ventricular expression of NPPA and NPPB in cardiomyocytes increases strongly [14]. Both peptides are secreted by the heart and are involved in heart development, heart and kidney homeostasis, and the response to heart injury and stress. Their plasma levels are powerful diagnostic and prognostic biomarkers for heart disease [15]. Natriuretic peptides are also functionally related to cardiac hypertrophy, fibrosis, angiogenesis, and cardiomyocyte proliferation and viability. They regulate water and electrolyte balance, reduce cardiac afterload, and dilate blood vessels through natriuretic and diuretic effects [16]. ANP and BNP may have cardioprotective effects in patients with HF and acute myocardial infarction (MI). Among patients with HF, NPPB expression is higher in DCM, and patients with higher BNP levels have poorer cardiac function. Many studies have shown that NPPA and NPPB are closely related to HF caused by DCM [17] [18]. NPPA and NPPB are therefore closely related to the diagnosis, treatment and prognosis of DCM with HF, but the specific molecular mechanism linking the two remains to be explored, and further research can focus on the genetic level.
Human myocardium expresses two isoforms of myosin heavy chain (MyHC), α and β, whose genes lie in tandem on chromosome 14q12 [19]. Cardiac myosin and actin are the main components of the sarcomere, which is an integral part of the cardiac contractile system. Myosin is composed of two heavy chain subunits (α and β), two light chain subunits and two regulatory subunits. MYH6 encodes the α myosin heavy chain subunit (α-MHC); the gene is about 26,000 bp long and consists of 39 exons, 37 of which contain coding information. α-MHC consists of head, neck and tail domains and plays a crucial role in myofibril assembly and normal heart development. MYH6 mutations can lead to hypertrophic cardiomyopathy (HCM), DCM and congenital heart disease with incomplete penetrance [20]. Another study [21] showed that MYH6 p.S180Y is located in the motor domain of the myosin head, which is related to recovery of the power stroke and to the interaction with actin. The p.S180Y mutation reduces the hydrophobicity of α-MHC and abolishes potential phosphorylation at the corresponding amino acid site, which may weaken the strength of the interaction with actin and thereby contribute to the pathogenesis of DCM. MYH6 has been shown to be involved in the development of DCM; it is down-regulated in DCM and can serve as one of the important target genes in HF treatment. Left ventricular ejection fraction (LVEF) is significantly positively correlated with MYH6, and left ventricular end-diastolic diameter (LVIDD) is significantly negatively correlated with MYH6 [22]. MYH6 may therefore participate, to varying degrees, in the pathogenesis and development of DCM and HF by regulating myocardial actin and myosin.
A study [23] showed that FRZB (frizzled-related protein) acts as a Wnt signal regulator by directly interacting with Wnts and plays a role in regulating cell growth and differentiation of specific cell types. It is a hub gene of DCM, its high expression is significantly related to DCM, and it is expected to become a DCM biomarker.
A related study [24] shows that ASPN is one of the pathogenic hub genes of DCM. It is a member of the small leucine-rich proteoglycan (SLRP) family and plays an important role in tissue damage and regeneration. According to Liu [25], ASPN is the most highly expressed gene in keloids. Overexpression of ASPN inhibits fibroblast activity and differentiation into mature myofibroblasts, which rapidly changes the extracellular matrix and leads to keloid formation and invasion. Proteomic studies by Manuel Mayr [26] revealed the role of ASPN in cardiac remodeling and verified the findings in patients with ischemic cardiomyopathy. Huang et al. [27] found that ASPN mimetic peptides can prevent cardiac fibrosis caused by aortic constriction and protect normal cardiac function in mice. Down-regulation of ASPN can inhibit the abnormal development of myofibroblasts and the gene expression of fibroblasts in vitro, thus blocking or reversing the progression of pulmonary fibrosis [28]. In H9C2 cardiomyocytes, ASPN increased apoptosis, down-regulated Bcl-2, and up-regulated transforming growth factor-β1, Bax, type III collagen, fibronectin, and the phosphorylation of smad2 and smad3 [29]. These findings provide preliminary confirmation of the role of ASPN in cardiomyocytes, and ASPN may be a promising biomarker for heart failure [30] [31]. ASPN may therefore be one of the important core genes, possibly involved in the pathogenesis of DCM and CHF by inhibiting fibroblast activity and differentiation into mature myofibroblasts, causing myocardial fibrosis and myocardial apoptosis. In the future, ASPN can be verified in basic research as one of the important biomarkers.
SFRP4, like FRZB (also known as SFRP3), belongs to the SFRP family [32] and can act as a soluble regulator of Wnt signaling [33]. Specifically, SFRP4 contains a cysteine-rich domain homologous to the putative Wnt binding site and acts as an inhibitor of Wnt signaling. The expression of SFRP4 in ventricular myocardium is related to the expression of apoptosis-related genes. SFRP1-4 are expressed in cardiomyocytes, and the levels of SFRP3 and SFRP4 are increased during HF [34] [35]. SFRP4 has been identified as an upstream regulator that mediates the significant activation of multiple HF-related genes, and SFRP4 levels are elevated in DCM patients [36]. Therefore, SFRP4 may participate in the pathogenesis of DCM and HF by regulating Wnt signaling and the expression of apoptosis-related genes, and later research can start from Wnt signaling.
Heterogeneous nuclear ribonucleoproteins (HNRNPs) are a large family of RNA-binding proteins (RBPs) consisting of 33 core and minor members involved in many steps of RNA processing. Some of them, such as HNRNPU, are associated with neurodegenerative diseases, mainly through changes in expression or localization [37]. HNRNPU is a nuclear protein that plays a crucial role in various biological functions such as RNA splicing and chromatin organization. The activity of HNRNPU, also known as scaffold attachment factor A (SAF-A), is critical for regulating gene expression, DNA replication, genomic integrity, and mitotic fidelity. These functions are essential to ensure the robustness of developmental processes, especially those involved in shaping the human brain [38].
A study [39] shows that RPS4Y1 (ribosomal protein S4, Y-linked 1) is a member of the ribosomal protein S4E family. Dysregulation of RPS4Y1 expression impairs STAT3 signaling, thereby inhibiting trophoblast migration and invasion. RPS4Y1 is overexpressed in men and encodes ribosomal protein S4, which has a functionally interchangeable counterpart, RPS4X, on the X chromosome. As one of the potential key genes of HF, RPS4Y1 may help in the further search for new HF susceptibility biomarkers and therapeutic targets [40]. The human DDX3 homologs are located on the X chromosome (DDX3X) and in the non-recombining region Yq11 of the Y chromosome (DDX3Y or DBY), and DDX3Y is expressed only in male germ cells. Deletion of the DDX3Y gene can lead to azoospermia and cause human Sertoli cell-only syndrome (SCOS) [41]. DDX3Y is highly expressed in the HF population and is a potential regulator of apoptotic signaling proteins in male neurons. Knockdown of DDX3Y in neural progenitor cells leads to significant changes in the expression of RNA splicing, which regulates the cell cycle, thereby affecting apoptosis and other biological functions [42]. Bioinformatics analysis showed that DDX3Y and RPS4Y1 were highly expressed in the populations of three geographic databases, and they were identified as potential key genes of HF, together with some important pathways related to HF risk, suggesting that these genes may play an important role in the occurrence and development of HF [40]. It can be seen that RPS4Y1, as a low-expressed Y-linked gene, may be involved in the pathogenesis of HF by regulating ribosomal protein, while DDX3Y, as a potential regulator of male neurons, may contribute to HF by affecting apoptosis and other biological functions. Both suggest that DCM combined with CHF is genetically related, but the specific underlying mechanism and biological significance remain to be explored.
5. Summary and Prospect
In this study, nine hub genes closely related to the pathogenesis of DCM and CHF, namely NPPB, NPPA, MYH6, FRZB, ASPN, SFRP4, RPS4Y1, DDX3Y and HNRNPU, were screened out by mining DCM and CHF gene chip data and bioinformatics analysis. Many of them were found to be related to myocardial fibrosis, cardiac hypertrophy, angiogenesis, myocardial cell proliferation and viability, myocardial apoptosis and so on. Physiological processes and signaling pathways such as extracellular matrix remodeling, serine peptidase complex, hormone receptor binding, Wnt-protein binding, vascular smooth muscle contraction, the receptor guanylate cyclase signaling pathway and the cGMP-PKG signaling pathway play an important role in the pathogenesis of DCM and CHF, providing new early warning indicators and a theoretical basis for the molecular mechanism and therapeutic targets of DCM and CHF. In the future, relevant basic experiments are still needed to verify the specific mechanisms of the key genes in the occurrence and development of these diseases.