Type-2 Diabetes Mellitus and Glucagon-Like Peptide-1 Receptor toward Predicting Possible Association
1. Introduction
Type 2 Diabetes Mellitus (T2DM) is caused by the inability of the pancreatic beta cells to produce sufficient insulin to overcome insulin resistance [1] . It is a heterogeneous disorder of glucose metabolism characterized by both insulin resistance and pancreatic β-cell dysfunction, and it is considered multifactorial because many genetic and environmental factors act together in its pathophysiology [2] [3] . Before the development of T2DM, individuals develop hyperproinsulinemia, and elevated proinsulin levels have been significantly associated with diabetes [4] [5] . Globally, type 2 diabetes mellitus was diagnosed in 537 million adults aged 20 - 79 years in 2021 and is projected to affect 783 million adults by 2045 [6] .
Glucagon-like peptide-1 (GLP-1) is a 30-amino-acid polypeptide encoded by the proglucagon gene and secreted chiefly by the intestinal L cells in the distal small intestine and proximal colon [7] [8] . GLP-1 specifically targets the GLP-1 receptor (GLP1R). The human GLP1R gene is located on the short arm of chromosome 6 (chr 6p21) and contains 13 exons [9] . The encoded receptor is a member of the B1 family of G protein-coupled receptors, is expressed mainly in islet β cells, and consists of 463 amino acids with seven transmembrane domains. GLP-1 binds to GLP1R to activate adenylate cyclase, which triggers cyclic adenosine monophosphate (cAMP)-dependent second messenger pathways, including protein kinase A (PKA) and exchange protein activated by cAMP (Epac), thus increasing insulin release from β cells and the number of β cells that respond to glucose [10] . Guanine nucleotide-binding proteins (G proteins) and G protein-coupled receptors (GPCRs) are involved in the development of T2DM [11] . The G protein activates adenylyl cyclase (AC), which generates the “second messenger” cAMP; acting through PKA, cAMP regulates glucose homeostasis by controlling glucose uptake, insulin and glucagon secretion, and the synthesis and breakdown of glycogen. cAMP is one of the most important cellular signaling molecules in the regulation of insulin secretion by beta cells [12] [13] .
The objective of this study is to identify the non-synonymous single nucleotide polymorphisms (nsSNPs) in the GLP1R gene and to predict the effects they may have on protein structure and function using various computational tools.
2. Material and Methods
2.1. Data Retrieval
Data were retrieved from the SNP database (dbSNP) of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/snp). dbSNP was used to access the SNPs of the GLP1R gene (accessed June 2022). The primary sequence of the protein encoded by the human GLP1R gene (accession number: P43220) was obtained from the UniProtKB database (accessed June 2022).
2.2. GeneMANIA: (http://www.genemania.org)
It is a web interface that finds other genes related to an input gene, using a very large set of functional association data. Association data include protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity [14] .
2.3. Sorting Intolerant from Tolerant (SIFT): (https://sift.bii.a-star.edu.sg/)
SIFT predicts whether an nsSNP at a specific position affects the structure and function of the protein, based on sequence homology and the physicochemical characteristics of the substituted amino acid. SIFT computes a normalized probability score (SIFT score) for each substitution. The SIFT score ranges from 0.0 to 1.0: an amino acid substitution with a score greater than or equal to 0.05 (≥0.05) is predicted to be tolerated, whereas one with a score less than 0.05 (<0.05) is predicted to be damaging [15] .
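The cutoff rule described above can be written as a small function. This is only a sketch of the decision rule, not of how SIFT computes its score; the function name and the example scores are illustrative assumptions and are not taken from this study.

```python
SIFT_CUTOFF = 0.05  # threshold quoted in the text


def classify_sift(score: float) -> str:
    """Label a SIFT score as 'tolerated' (>= 0.05) or 'damaging' (< 0.05)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"SIFT score must lie in [0, 1], got {score}")
    return "tolerated" if score >= SIFT_CUTOFF else "damaging"


# Hypothetical scores, used only to show both sides of the cutoff.
for s in (0.00, 0.04, 0.05, 0.62):
    print(f"{s:.2f} -> {classify_sift(s)}")
```

Note that a score of exactly 0.05 falls on the tolerated side, as stated in the text.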
2.4. Protein Variation Effect Analysis (PROVEAN): (provean.jcvi.org/)
PROVEAN is another sequence homology-based predictor, used to assess the possible functional influence of nsSNPs on a protein. It predicts a variation as deleterious or neutral: if the functional impact score is less than or equal to −2.5 (≤−2.5), the variant is predicted to be deleterious; a score above −2.5 (>−2.5) is predicted to be neutral [16] .
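As a sketch of how this cutoff separates a batch of variants, the snippet below splits a score table into deleterious and neutral calls. The variant names and scores are hypothetical placeholders, and the helper names are assumptions made for illustration.

```python
PROVEAN_CUTOFF = -2.5  # threshold quoted in the text


def classify_provean(score: float) -> str:
    """Scores <= -2.5 are called deleterious, scores above -2.5 neutral."""
    return "deleterious" if score <= PROVEAN_CUTOFF else "neutral"


def split_by_call(scores):
    """Group variant labels by their PROVEAN call."""
    calls = {"deleterious": [], "neutral": []}
    for variant, score in scores.items():
        calls[classify_provean(score)].append(variant)
    return calls


# Hypothetical scores; not results from this study.
example = {"variant_a": -6.1, "variant_b": -2.5, "variant_c": -0.8}
print(split_by_call(example))
```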
2.5. Polymorphism Phenotyping Version2 (PolyPhen-2): (genetics.bwh.harvard.edu/pph2/)
PolyPhen-2 combines protein 3D structure with multiple homologous sequence alignment. It predicts the potential consequences of a single amino acid substitution on both protein function and structure. The prediction is reported as benign, possibly damaging or probably damaging according to the difference in position-specific independent count (PSIC) scores between the 2 variants (wild-type amino acid (aa1) and mutant amino acid (aa2)). The score ranges from 0.0 to 1.0. An amino acid substitution with a score of 0.0 to 0.49 is predicted as benign, one with a score of 0.5 to 0.89 as possibly damaging, and one with a score of 0.9 to 1 as probably damaging [17] [18] .
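The three score bands can be sketched as follows. The text leaves small gaps between bands (for example between 0.49 and 0.5), so this sketch assumes half-open intervals with boundaries at 0.5 and 0.9; that choice, the function name, and the label of the middle band ("possibly damaging", following the three categories named above) are assumptions.

```python
def classify_polyphen2(score: float) -> str:
    """Map a PolyPhen-2 score in [0, 1] to one of the three categories."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"PolyPhen-2 score must lie in [0, 1], got {score}")
    if score < 0.5:   # assumed band: 0.0 up to, but not including, 0.5
        return "benign"
    if score < 0.9:   # assumed band: 0.5 up to, but not including, 0.9
        return "possibly damaging"
    return "probably damaging"


# Hypothetical scores, one per band plus the two boundaries.
for s in (0.12, 0.5, 0.73, 0.9, 1.0):
    print(f"{s:.2f} -> {classify_polyphen2(s)}")
```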
2.6. SNPs & GO (Single Nucleotide Polymorphism & Gene Ontology)
SNPs & GO is a method that, starting from a protein sequence, predicts whether a mutation is disease-related by exploiting the protein's functional annotation. It collects, in a single framework, information derived from the protein sequence, evolutionary information, and function as encoded in Gene Ontology terms, and has been reported to outperform other available predictive methods [19] .
2.7. PHD-SNP: (Predictor of Human Deleterious SNP) (snps.biofold.org/phd-snp/phd-snp.html)
Predictor of human Deleterious Single Nucleotide Polymorphisms (PhD-SNP) is a support vector machine (SVM)-based server. It determines whether a given amino acid substitution is disease-related or neutral using protein sequence information, protein structure, conservation and solvent accessibility. The output is a probability index from 0.0 to 1.0; when the score is higher than 0.5, the substitution is predicted to be pathogenic [16] [20] .
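A minimal sketch of this rule is shown below. The text only states that scores higher than 0.5 are pathogenic, so treating a score of exactly 0.5 as neutral is an assumption, as are the function name and the example values.

```python
def classify_phd_snp(probability: float) -> str:
    """Call a substitution 'disease' when the probability index exceeds 0.5."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must lie in [0, 1], got {probability}")
    # Strict inequality: exactly 0.5 is treated as neutral (assumption).
    return "disease" if probability > 0.5 else "neutral"


# Hypothetical probability indices.
for p in (0.31, 0.5, 0.87):
    print(f"{p:.2f} -> {classify_phd_snp(p)}")
```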
2.8. Protein Stability Prediction
Two software tools were used to predict the effect of a missense mutation on protein stability.
2.8.1. I-Mutant 2.0 https://folding.biofold.org/cgi-bin/i-mutant2.0.cgi
(http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi)
This software automatically predicts protein stability changes upon single-site mutations, starting from the protein sequence alone or from the protein structure when available. Moreover, it can predict deleterious single nucleotide polymorphisms starting from the protein sequence alone [20] .
2.8.2. MUpro
(mupro.proteomics.ics.uci.edu): MUpro uses a support vector machine (SVM) to assess the change in protein stability caused by amino acid substitutions. The output is a confidence score between −1 and 1. A confidence score < 0 indicates that the substituted amino acid decreases the stability, and a score > 0 indicates that it increases the stability [21] .
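The sign rule for the MUpro confidence score can be sketched as below. The text does not say how a score of exactly 0 is reported, so the third label is an assumption; the function name and example values are also illustrative.

```python
def classify_mupro(confidence: float) -> str:
    """Interpret the sign of a MUpro confidence score in [-1, 1]."""
    if not -1.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must lie in [-1, 1], got {confidence}")
    if confidence < 0:
        return "decreases stability"
    if confidence > 0:
        return "increases stability"
    return "no predicted change"  # case not covered by the text (assumption)


# Hypothetical confidence scores.
for c in (-0.84, 0.0, 0.27):
    print(f"{c:+.2f} -> {classify_mupro(c)}")
```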
2.9. Prediction of Protein Modeling: (www.cmbi.ru.nl/hope/)
Project Have your Protein Explained (HOPE) is a web server that was used to investigate the impact of a missense mutation on the native protein structure. HOPE collects and combines available information from UniProtKB, the protein’s 3D structure and DAS servers. Because the exact 3D structures of some GLP1R protein isoforms are unknown, HOPE built models of them based on homologous structures. HOPE processes the gathered data and produces a report that includes schematic structures of the wild-type and mutant amino acids, the differences in their properties, and the impact of the substituted amino acid on the protein structure, along with figures and animations [22] .
3. Results
In this study, the GLP1R gene was found to be associated with 20 other genes. Among the most important is the GCG (glucagon) gene, which is also a transmembrane protein (Figure 1 and Table 1).
Table 1. Gene description rank using GLP1R gene.
Figure 1. GeneMANIA results for the GLP1R gene.
Overall, 7229 variants were found, and the 146 missense variants (nsSNPs) were selected for further analysis. When the nsSNPs were submitted to SIFT, 27 were predicted as deleterious and 119 as tolerated. Analysis with PROVEAN predicted 20 as deleterious and 7 as neutral (Table 3). Analysis using PolyPhen-2 predicted 17 as probably damaging, 2 as possibly damaging and 1 as benign. Using SNPs & GO, 14 SNPs were predicted to have a disease effect and 5 to be neutral, while using PhD-SNP, 17 SNPs were predicted to have a disease effect and 2 to be neutral (Table 2).
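The counts above are consistent with a stepwise filter in which each tool receives only the variants flagged by the previous one. The text does not state the input size of each step explicitly, so the input counts for PROVEAN (27), PolyPhen-2 (20), SNPs & GO (19) and PhD-SNP (19) below are inferred from the reported outcomes and should be read as an assumption. The sketch simply checks that the reported outcomes of each step add up to that assumed input.

```python
# (tool, assumed number of variants submitted, outcome counts reported in the text)
STEPS = [
    ("SIFT", 146, {"deleterious": 27, "tolerated": 119}),
    ("PROVEAN", 27, {"deleterious": 20, "neutral": 7}),
    ("PolyPhen-2", 20, {"probably damaging": 17, "possibly damaging": 2, "benign": 1}),
    ("SNPs & GO", 19, {"disease": 14, "neutral": 5}),
    ("PhD-SNP", 19, {"disease": 17, "neutral": 2}),
]


def check_counts(steps):
    """Return, per tool, whether the outcome counts sum to the assumed input."""
    return {tool: sum(outcomes.values()) == n_in for tool, n_in, outcomes in steps}


for tool, consistent in check_counts(STEPS).items():
    print(f"{tool}: {'consistent' if consistent else 'mismatch'}")
```

Under this reading, the 19 variants passed to SNPs & GO and PhD-SNP correspond to the 17 probably damaging plus 2 possibly damaging PolyPhen-2 calls.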
Disease-related mutations identified by SNPs & GO were submitted to I-Mutant and MUpro; the results showed an effect on protein stability with varying probabilities (Table 3 and Table 4).
Across the five tools used to study functional and structural effects (SIFT, PolyPhen-2, PROVEAN, SNPs & GO and PhD-SNP), a total of 14 nsSNPs were predicted to have a disease effect (Table 4). Regarding protein stability, 17 nsSNPs were predicted to decrease the stability of the protein by I-Mutant 2.0, whereas MUpro predicted that 19 SNPs decrease protein stability (Table 4).
The structural impact of the SNPs on protein structure and function was investigated using Project HOPE. Fourteen nsSNPs were analyzed (Table 5).
rs201231115 (C46Y): the mutant residue is bigger than the wild-type residue. The wild-type residue was buried in the core of the protein, so the bigger mutant residue will probably not fit. Mutation of a 100% conserved residue is usually damaging for the protein.
rs200765138 (R64W): the mutant residue is bigger than the wild-type residue, and because of this size difference the new residue is not in the correct position within the protein structure.
rs201634613 (W87R): the mutant residue is smaller than the wild-type residue. The mutated residue is located in a domain that is important for the activity of the protein, so the mutation can disturb this domain and thereby affect protein function. The wild-type and mutant residues also differ in charge.
Table 2. The results of different software.
Table 3. Results of SIFT, Provean and Polyphen-2 analysis.
Table 4. Results of SNPs & GO, PHD SNP and I-Mutant software.
Table 5. The effect of mutations on the protein using Project Hope software.
*Note: Grey colour is protein chains, green coloured atoms are the wild amino acid residues, while red coloured atoms are the mutated amino acid residues.
rs182447758 (R227H): the mutant residue is smaller than the wild-type residue. There is a difference in charge between the wild-type and mutant amino acids, which can cause loss of interactions with other molecules.
rs375865648 (I272T): the mutant residue is smaller and less hydrophobic than the wild-type residue. This may cause a loss of external interactions and an empty space in the core of the protein, and the difference in hydrophobicity can affect hydrophobic interactions with the membrane lipids.
rs200792917 (Y291C): the mutant residue is smaller than the wild-type residue. The mutant residue is more hydrophobic than the wild-type residue which might affect the function of the protein.
rs150729240 (D293Y): the mutant and wild-type residues differ in size and charge. The mutant residue is more hydrophobic than the wild-type residue, so interactions between domains could be disturbed by the mutation, which might affect the function of the protein.
rs149578908 (R310Q): the mutant residue is smaller than the wild-type residue. This mutation might occur in some rare cases, but it is more likely that the mutation is damaging to the protein. The difference in charge will disturb the ionic interaction made by the original, wild-type residue.
rs200118342 (R326W): the mutant residue is bigger than the wild-type residue and differs from it in charge. The residue is located on the surface of the protein, so the mutation can disturb interactions with other molecules or other parts of the protein and thereby affect protein function.
rs199783730 (T353M): the wild-type and mutant residues differ in their physicochemical properties. The mutant residue is bigger and more hydrophobic than the wild-type residue; based on conservation information, this mutation is probably damaging to the protein.
rs199818129 (P358L): the mutated residue is located in a domain that is important for the activity of the protein and in contact with residues in another domain. The wild-type residue was buried in the core of the protein. The mutant residue is bigger and probably will not fit.
rs201669667 (L360P): the mutant residue is smaller than the wild type which will cause an empty space in the core of the protein and loss of hydrophobic interactions.
rs148543734 (G361R): the mutant residue is bigger than the wild-type residue. The wild-type residue was buried in the core of the protein, so the mutant residue will probably not fit. The mutation into another residue will force the local backbone into an incorrect conformation and will disturb the local structure.
rs201899163 (Q394R): the wild-type and mutant amino acids differ in size; based on conservation scores, this mutation is probably damaging to the protein.
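The size, charge and hydrophobicity comparisons listed above can be roughly reproduced from standard residue properties. The sketch below is not how Project HOPE works internally; it uses approximate molecular weight as a proxy for size, side-chain charge at neutral pH (histidine treated as neutral), and the Kyte-Doolittle hydropathy index, all of which are choices made here for illustration. Only the residues that occur in the 14 variants are included.

```python
import re
from typing import NamedTuple


class Residue(NamedTuple):
    mass: float        # approximate molecular weight of the free amino acid (Da)
    charge: int        # side-chain charge at neutral pH (His treated as neutral)
    hydropathy: float  # Kyte-Doolittle hydropathy index


PROPS = {
    "C": Residue(121.16, 0, 2.5), "D": Residue(133.10, -1, -3.5),
    "G": Residue(75.07, 0, -0.4), "H": Residue(155.16, 0, -3.2),
    "I": Residue(131.17, 0, 4.5), "L": Residue(131.17, 0, 3.8),
    "M": Residue(149.21, 0, 1.9), "P": Residue(115.13, 0, -1.6),
    "Q": Residue(146.15, 0, -3.5), "R": Residue(174.20, 1, -4.5),
    "T": Residue(119.12, 0, -0.7), "W": Residue(204.23, 0, -0.9),
    "Y": Residue(181.19, 0, -1.3),
}


def parse_variant(label: str):
    """Split a label such as 'C46Y' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def compare(label: str) -> dict:
    """Compare the mutant residue with the wild-type residue."""
    wild, position, mutant = parse_variant(label)
    w, m = PROPS[wild], PROPS[mutant]
    if m.mass > w.mass:
        size = "bigger"
    elif m.mass < w.mass:
        size = "smaller"
    else:
        size = "same"
    return {
        "position": position,
        "size": size,
        "charge_changed": m.charge != w.charge,
        "more_hydrophobic": m.hydropathy > w.hydropathy,
    }


VARIANTS = ["C46Y", "R64W", "W87R", "R227H", "I272T", "Y291C", "D293Y",
            "R310Q", "R326W", "T353M", "P358L", "L360P", "G361R", "Q394R"]

for variant in VARIANTS:
    print(variant, compare(variant))
```

Because mass is only a proxy for side-chain size, the output is a rough cross-check of the descriptions rather than a substitute for the structural report.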
4. Discussion
Missense mutations that change an amino acid can disturb protein structure, stability, and activity, which can increase individual susceptibility to disease [23] . Investigating the impact of a missense mutation on a protein is a key step in understanding the effect of the mutation on the characteristics of the wild-type protein and the resulting phenotype [24] . The major effects of GLP-1 include the potentiation of glucose-stimulated insulin secretion, suppression of appetite, and slowing of gastric emptying [25] . Five GLP-1 mimetic agents are approved for the treatment of type 2 diabetes: exenatide, lixisenatide, liraglutide, dulaglutide, and semaglutide [26] .
In this study, a total of 14 nsSNPs were predicted by the different software tools to be deleterious, damaging and disease-related, and to affect protein function because of differences in charge, size, hydrophobicity, and conservation between the wild-type and mutant residues: rs201231115 (C46Y); rs200765138 (R64W); rs201634613 (W87R); rs182447758 (R227H); rs375865648 (I272T); rs200792917 (Y291C); rs150729240 (D293Y); rs149578908 (R310Q); rs200118342 (R326W); rs199783730 (T353M); rs199818129 (P358L); rs201669667 (L360P); rs148543734 (G361R); and rs201899163 (Q394R). These nsSNPs were not reported in the ClinVar database and have not previously been reported in the GLP1R gene.
The rs1042044 variant of the GLP1R gene is a tag SNP located in exon 7 of the gene. It is an nsSNP resulting in Phe260Leu according to its HapMap location (http://www.hapmap.org). The nsSNPs of the GLP1R gene affect the in vivo response to GLP-1 and are considered to be the primary cause of the inconsistent clinical efficacy of GLP-1 analogs in T2DM patients. Anderson et al. (2012) and Sathananthan et al. (2010) studied the effects of GLP1R nsSNPs on insulin secretion in response to exogenous GLP-1 and found that two sites of GLP1R, rs6923761 and rs3765467, could change the insulin-promoting effect of GLP-1 [27] [28] . A clinical study of rs3765467 indicated that patients with T2DM carrying the variant allele showed a more robust response to treatment with dipeptidyl peptidase-4 (DPP-4) inhibitors [29] . The rs3765467 and rs10305492 nsSNPs in the GLP1R gene have also been shown to exert a critical effect on the insulin secretory capacity of β cells and on β-cell mass by leading to β-cell dysfunction and apoptosis; they might also impair the interaction of GLP-1 with GLP1R [9] [30] . Another study found that the nsSNP rs367543060 is associated with T2DM susceptibility and that its expression can reduce receptor affinity and the intracellular signaling of GLP-1 [29] .
When the GLP-1R signaling pathway is disrupted, as in the case of a GLP-1R mutation, insulin secretion, insulin sensitivity and glucose effectiveness are impaired. So far, the N-terminal extracellular domain and three intracellular loops have been reported to be functionally important regions in terms of binding and signal transduction [9] . Some studies have shown that GLP-1 analogs increase β-cell mass both by differentiation and neogenesis of precursor cells and by replication of pre-existing β-cells [30] .
5. Conclusion
In this study, 14 nsSNPs were predicted to strongly affect the function of the GLP1R protein, which mediates the signaling that promotes the insulin release needed for glucose metabolism. These nsSNPs may therefore be associated with T2DM and may also affect the treatment of diabetic patients, because the protein is an important drug target. The findings of the present study provide a guideline for researchers on the role of these nsSNPs in the etiology of complex diseases. In vitro research is needed to confirm these results.