Systematic analysis of diabetes-and glucose metabolism-related proteins and its application to Alzheimer ’ s disease

Alzheimer disease has been defined as Type 3 Diabetes due to their shared metabolic profiles. Like our previously research, results of Alzheimer’s disease and other neurodegenerative diseases, systematic analysis of diabetesand glucose metabolism-related proteins also provides help in the treatment of Alzheimer’s patients. Some interesting results indicate that diabetes-related proteins (DRPs) are rich in Lys and the content of Trp can distinguish between type 1 and type 2 diabetes mellitus in particular, while glucose metabolism-related proteins (GMRPs) possess Leurich and Trp-poor character. Moreover, the usage biases of codons depend on GC contents to a great extent, in concord with all codons of the highly expressed genes with the terminal of C/G. Especially, the deficit of CpG dinucleotides is largely attributed to the hypermutability of methylated CpGs to UpGs by the mutational pressure. Besides a common node insulin receptor, there are some similar node proteins, such as glucose transporter member, protein tyrosine phosphatase, and adipose metabolism signal protein. The sharing proteins involve glucagon, amylin, insulin, PPARγ, angiopoietin, PC-1/ENPP1, and adiponectin mediated signal pathway. Meanwhile, the gene sequences of node proteins contained the binding sites of 37 transcription factors divide into four kinds of superclasses. Additionally, BAD complex can integrate pathways of glucose metabolism and apoptosis by BH3 domain of BAD directly interacting with GK as well as GK binding with the consensus motif [G]-[1]-[K]-[2]-[S/T] or [L/M]-[R/K]-[2]-[T] of PP1 or WAVE1. This facilitates the therapies for diabetes mellitus as well as Alzheimer’s disease.


INTRODUCTION
Alzheimer's disease (AD) and diabetes are both ageassociated diseases [1].AD is a late-onset neurological disorder characterized by progressive loss of memory and cognitive abilities as a result of excessive neurodegeneration in the hippocampus and cortex [2], accompanied by the accumulation of intracellular neurofibrillary tangles (NFTs) consisting of hyperphosphorylated tau protein and extracellular amyloid beta (Aβ) peptides containing senile plaques [3].Aβ peptides originate from an abnormal proteolytic processing of the amyloid precursor protein (APP) which is a large transmembrane type 1 (cytosolic C-terminal) glycoprotein.Cleavage of the APP ectodomain by β-secretase at the amino-terminus of Aβ is followed by cleavage of the β-secretase-generated carboxyl-terminal fragment (β-CTF, C99) at two sites of the carboxyl terminus of Aβ by γ-secretase [4], to produce either the Aβ residue 1-40 (Aβ 40) or a longer, more amyloidogenic form that contains the residues 1 -42 (Aβ42) [99].On the other hand, diabetes mellitus is a chronic metabolic disorder in the endocrine system and systemic underway disease with high blood glucose, specified to mammal and not to be effective cured until now [5].Insulin-dependent diabetes mellitus (IDDM, type 1), accounting for 5% -10% of diabetes, in which the body does not produce any insulin due to islet beta-cell apoptosis, most often occurs in children and young adults [6,7].Noninsulin-dependent diabetes mellitus (NIDDM, type 2), accounting for 90% -95%, in which the body does not produce enough, or properly use, insulin due to the damage of insulin gene, its receptor gene, glucokinase gene, and mitochondria gene, is nearing epidemic pro-portions, because of an increased number of elderly people, and a greater prevalence of obesity and sedentary lifestyles [8,9].Diabetes mellitus is associated with decreased body insulin production than required (type 1) and impaired insulin signaling (type 2/T2DM).Diabetes especially T2DM and AD follow the same pathological mechanisms resulting in misfolded proteins, insulin impairment, abnormal glucose metabolism, abnormal fatty acid metabolism, mitochondrial dysfunction, and high oxidative stress [1].These shared metabolic profiles and diabetes as extreme risk factor for AD lead to the assumption that AD may reflect type-3 diabetes.
We have previously finished construction of cell model for screening Alzheimer's drugs [10], codon usage biases in AD and other neurodegenerative diseases [11], and protein-protein interactions related to AD based on complex network [12].Similarly, we are to propose a computational analysis on codon biases of some proteins related to diabetes and glucose metabolism, hoping that the findings might be useful for developing new therapeutic treatment for diabetes as well as AD.Moreover, the proteins related to diabetes are located at signal transduction pathway [13,14], gene control pathway [15,16], glucose metabolic pathway [17,18], etc.A large number of research work on these proteins and their structures, functions, pathways, networks, and relations to diabetes have been conducted yet [19,20].It is strongly significant to pay close attention to the systematical integration of known diabetes-related proteins (DRPs) for treatment of diabetes mellitus and its complications, which helps to the treatment of Alzheimer's patients.With the application of bioinformatics and functional proteomics to drug research and development, we try to systematically analyze the property of glucose metabolism-related proteins (GMRPs) and DRPs, involving the amino acid compositions, protein-protein interaction network, codon biases, and transcript factors binding sites, etc.Up to now, there have not been enough effective ways to treat diabetes thoroughly as well as its complications, while the control and treatment of diabetes and its complications mainly depend on the chemical or biochemical agents, namely insulin, insulin-like growth factor, alpha-glycosidase inhibitors, aldose reductase inhibitor, sulfonylureas, biguanide, and others.
Additionally, mitochondrial dysfunction, as one of the shared metabolic profiles between diabetes and AD, results in increased oxidative stress and induces glyceroldehyde-derived advanced glycation end products (AGEs) [21,22].The progressive deposition of fibrillar Aβ in the brain can induce disruption of the mitochondrial membrane, resulting in the production of reactive oxygen species, the release of cytochrome c from mitochondria into the cytosol, and the activation of caspase-dependent apoptotic pathways [23].The PI3K/Akt-mediated interaction between Bad and Bcl(XL) plays an important role in maintaining mitochondrial integrity [24].Zeng KW et al. have demonstrated that hyperoside, a bioactive flavonoid compound from Hypericum perforatum, significantly inhibited Aβ (25)(26)(27)(28)(29)(30)(31)(32)(33)(34)(35)-induced cytotoxicity and apoptosis by reversing Aβ-induced mitochondrial dysfunction, including mitochondrial membrane potential decrease, reactive oxygen species production, and mitochondrial release of cytochrome c via PI3K/Akt/Bad/ Bcl(XL)-regulated mitochondrial apoptotic pathway [24].Additionally, Danial and co-workers have indicated that human BAD (a pro-apoptotic Bcl-2 family member) resides in a mitochondrial complex together with protein kinase A (PKA) and protein phosphatase 1 (PP1) catalytic units, Wiskott-Aldrich family member WAVE-1, and glucokinase (GK, hexokinase D, HXK4), integrating pathways of glucose metabolism and apoptosis [25].Glucose deprivation results in dephosphorylation of BAD and BAD-dependent cell death, while the phosphorylation status of BAD helps regulate glucokinase activity.In this study, we sought to discuss some proteins contributing to diabetes emergence and development.This present paper addresses the key attempts to propose ways for future discovery and validation of potential targets for diabetes as well as AD therapies.

Data
The amino acid sequences of all proteins come from SWISSPORT (http://au.expasy.org/)and their coding sequences were extracted from EMBL-EBI (http://www.ebi.ac.uk) linked to GenBank database (http://www.ncbi.nlm.nih.gov).Firstly, there are five groups of diverse proteins in various species for analysis of amino acid composition, composed of 211 DRPs, 31 proteins related to diabetes type I (DRPsI), 66 to diabetes type II (DRPsII), 223 GMRPs, 100 human GMRPs (hGMRPs) and 123 mouse GMRPs (mGMRPs), respectively.Secondly, a series of complete genes (including initiation and stop codons) were analyzed and are available for codon usage biases, such as 211 DRPs, 100 hGMRPs and 123 mGMRPs.And then some indices of codon biases in the coding sequences of DRPs and GMRPs were calculated and the frequency of optimal codons (FOP) was analyzed.A correlation between codon usage for DRPs and gene expression level was built using correspondence analysis (COA) while the codon usage pattern in the two data sets was compared using chi square test.Thirdly, protein-protein interaction networks for DRPs and GMRPs were built using Osprey software.Finally, BAD complex was analyzed and evaluated.

Amino acid Composition
codon to other synonymous codons coding the same amino acid) [29], and the frequency of optimal codons [30] were computed using the program CodonW version 1.4 (www.molbiol.ox.ac.uk/cu) and GCUA [31].Furthermore, the correlation analysis and cluster analysis were carried out by SPSS version 13.0.
According to our methods [26,27], the accumulative times   i c and the percentage of each amino acid in 211 known DRPs, 100 hGMRPs and 123 mGMRPs (Table 1) were analysized and calculated by symbolic statistic method as follows: We estimated codon frequencies from the observed codon counts of these protein-coding sequences mentioned above.64 codons are obtained using the sliding window method one by one from 5'-terminal of protein-coding sequences to 3'-teminal: a sequence of n residues gives rise to 3 n units [32].The accumulative times   l M of each codon in 211 DRPs and 223 GMRPs were analysized and calculated as follows:   .

Relative Synonymous Codon Usage
Codon usage, COA, GC3s (the frequency of codons ending in C and G, excluding Met, Trp and stop codons), A3s, T3s, G3s, C3s (the ratio of A/T/G/C content at the third position in codon to total gene bases), c (the "effective number of codons") [28], RSCU (relative synonymous codon usage, relative probability of certain where l express the times of 64 codons of each protein.The usage frequency of 64 codons respectively in each protein was calculated by using C ++ program written by us.

m
To normalize synonymous codon usage within dataset of different amino acid compositions, RSCU is calculated by dividing the observed frequency of a codon by the expected frequency assuming all synonymous codons are used equally [33].The RSCU values of different codons in each sequence were calculated as follows: Here, means relative usage probability of the codon at position in mRNA sequence; ij RSCU th j th i j N , the actual observed number of certain codon in some template sequence; aa , the actual observed number of other synonymous codons coding the same amino acid coded by certain codon in the template sequence; k, the number of synonymous codons coding the same amino acid.It is obvious that RSCU values close to 1.0 indicate a lack of bias for the corresponding codon [34].

Indices of Codon Biases
As Ghosh and co-worker [35] mentioned, some indices of codon biases were introduced and calculated for each gene, such as GC3s, c , A3s, T3s, G3s, and C3s.c whose value ranges from 20 to 61 was used to quantify the codon usage bias of a gene.This is a measure of general non-uniformity of synonymous codon usage.The expected c value under random synonymous codon usage can be calculated for any value of GC3s by the following formula: When all sense codons are used randomly, c takes a value of 61.Lower values of c indicate stronger bias, with an extreme value of 20 when only one synonym is used for each amino acid.The values of c would fall on the continuous curve described by the formula, if the G + C composition at the synonymous third position is the only determinant factor shaping the codon usage [36].
N N N

Statistical Analysis
COA was used to investigate the major trend in codon usage variation among genes [37].In this study, the RSCU values of genes in each proteins related to diabetes and glucose metabolism were plotted in a multidimensional space of 59 axes, according to their usage of the 59 sense codons (excluding Met, Trp and stop codons), and COA identified a series of new orthogonal axes accounting for the greatest variation among genes.The analysis yielded the coordinate of each gene on each new axis, and the fraction of the total variation was accounted for by each axis.By this means, the axis that contributes most to the total variation among different genes as well as the primary trend of codon usage can be identified.The analysis was performed using the RSCU values in order to minimize the effects of amino acid composition [38].COA of RSCU values was carried out to determine the major source of variation among genes.Hence, each sequence is described by a vector of 59 variables, which is the number of codons for which there are synonyms.COA plots these genes in a multidimensional space of 59 axes.Then a certain number of axes are identified.

The Relative Abundance of Dinucleotides
Based on DRPs 16 dinucleotide relative abundances (DRA) in DRPs' open reading frames (ORFs) were obtained using the method described by Karlin and Burge [39].The odds ratio , where f x denotes the frequency of the nucleotide X and f xy the frequency of the dinucleotide XY, etc., for each dinucleotide were calculated.As a conservative criterion, for (or < 0.78), the XY pair is considered to be of high (or low) relative abundance compared with a random association of mononucleotides.

1.23
xy P 

Protein-Protein Interaction Network
Many properties of complex system are decided by interaction compositions not by single.Life activity is a complex system while the research is dependent upon protein interaction network, which is beneficial to identifying and validating the drug targets and prediction of new functional proteins [40].Protein-protein interaction networks are mostly large network and actually a scalefree network, which follows power-law principle:   7 P k k   [41].Here, the network consists of nodes and edges, which express proteins and their interactions, respectively.The mark k donates link degree, namely the linking numbers of certain node protein.
displays the distributing coefficient of link degree that means the link degree of single node protein, namely the link degree divided by the numbers of nodes.The sign <k> presents the mean link degree of whole nodes in some network.The node proteins of the protein interaction network are maybe more conservative during molecular evolution [42].Search for the protein-protein interaction in IntAct database (

 
P k http://www.ebi.ac.uk/intact/index.jsp) of EMBL-EBI and map the protein-protein interaction network using Osprey software.

Cluster Analysis of Codon Usages of DRPs Based on Protein Interaction Network
Cluster analysis refers to assign a set of samples into groups according to the sample characteristics and the adjacency and similarity of pattern space.The basic operation process of hierarchial cluster analysis is as follows.First, respectively classified the n samples as a category, calculated the distances between samples, and then selected a couple of samples with the minimum distance into a new class.Second, calculated the distances between the new class and other classes and merged the two classes with the minimum distance into another new class.This moved in circles until all the samples are combined into one category.Here, the variable group composed of different RSCU value of a signal mRNA sequence of 11 node proteins in DRPs were regarded as a point in the multidimensional space.Tryptophan (Trp) and methionine (Met) because of their codons UUG and AUG with degeneracy of 1 were not considered as well as 3 termination signals (UGA, UAA and UAG).So, the space dimension is 59.Every gene sequence of 11 node proteins was viewed as the space vector consisting of 59 variable components.The Euclidean distance coefficient of codon usage bias between two genes i and k is calculated as follows [43]: Here, the group average linkage method was used for clustering by SPSS software.11 node proteins in DRPs are listed as follows: O43707 (ACTN4_human), P13987 (CD59_human), Q64521 (GPDM_mouse), P19357 (GTR4_rat), P20823 (HNF1A_human), P02545 (LMNA_human), P06213 (INSR_human, Q05329), P06858 (LIPL_human), P62845 (RS15_rat), Q16849 (PTPRN_human), and Q14191 (WRN_human).

Analysis of Transcription Factors at DNA Level Based on DRPs' Node Proteins
The regulation of a gene depends on the binding of transcription factors (TFs) to specific sites located in the regulatory region of the gene.Transcription regulation is carried out by the interactions between TFs and their binding sites in DNA.Transcription factor binding to specific DNA sequence of its gene promoter activates or represses transcription of genes, which plays a critical role in regulation of specific gene expression and specific stress response.Transcription factor binding sites (TFBS), short, degenerate nucleotide sequences (usually 6 -20 bps), are very important in drug research and development taking transcription factors as targets.Here, we search for the TFBS of DRPs interaction with transcription factors using TRANSFAC 7.0-Public database (http://www.gene-regulation.com/pub/databases.html).

Prediction of BAD Complex Proteins Based on the Theoretical Model
Analyze and calculate a series of parameters of BAD complex and assess whether these proteins might participate in diabetes pathogenesis in comparison with the results mentioned above.

RESULTS
In addition, motif is some similar steric shape or topology structure occurring in local region of various proteins, which displays structure conserved regions (SCR) of proteins during evolution.The important motifs of DRPs involve the immunity-related motifs (i.e.Ig-like, Ig_c1, and Ig_MHC), the transport-related (e.g.Sub_transporter, MFS, and Sugar_transpt), the transcript-and adjustment-related (such as Homeobox and Homeodomain-re), the phosphorylation-related (including TYR_phosphatase, Tyr_PP, and Prot_kinase), and the insulin-related (i.e.Ins/IGF/relax and Insulin-like).On the other hand, there are NAD (P)-bd motif and Glc-6-P_DHase motif with a percentage of 5.47% and 2.57%, respectively, in human GMRPs, while G6PDH motif, Glyco_trans_35 motif, Glycg_phsphrylas motif, and Znf_ring motif with a percentage of 9.62%, 6.54%, 6.54%, and 3.85% in turn, exist in mouse GMRPs.

Amino Acid Composition
20 standard kinds of amino acids in 211 known DRPs are ranged by their percentages as follows: Lys, Trp, His, Asn, Leu, Ser, Ala, Gln, Gly, Glu, Pro, Val, Arg, Thr, Asp, Ile, Phe, Tyr, Met, and Cys in turn (Table 1).NIDDM is the most familiar diabetes whose composition of amino acid residues is consistent with that of the whole.Comparison NIDDM with IDDM, the obvious difference is the percentage of Trp that the former is 9.52% and the latter is 1.00%, which is supported by the results that the maladjustment of Trp could lead to type 2 diabetes because aromatic residues have been identified as crucial in formation and stabilization of amyloid structures.Similarly, 20 amino acids in 100 hGMRPs are listed in descending order of their percentages as follows: Leu, Ala, Gly, Val, Ser, Glu, Arg, Pro, Asp, Thr, Lys, Ile, Phe, Gln, Asn, Tyr, His, Met, Cys, and Trp.The amino acids in 123 mGMRPs are arranged by descending as follows: Leu, Val, Gly, Glu, Ala, Ser, Lys, Asp, Arg, Ile, Pro, Thr, Phe, Asn, Gln, Tyr, Met, His, Cys, and Trp.
On the basis of amino acid composition, percentages of histidine (His), asparagine (Asn) and tryptophan (Trp) in DRPs are different from those of GMRPs (Figure 1).The percentages of His and Asn in DRPs are more than those of GMRPs whileas the percentage of Trp in DRPs is less than that of GMRPs.The Trp percentage of type I diabetes is similar to GMRPs.Table 2 also revealed that the content percentages of UGG coding Trp in DRPs, hGMRPs and mGMRPs were 1.74%, 1.57% and 1.46%, respectively, and more than those of UGG in healthy people.This means tryptophan disorders associated with diabetes mellitus by insulin, which is supported by Oxenkrug GF's results that tryptophan-kynurenine metabolism might be a new target for prevention and treatment of metabolic syndrome, age-associated neuroendocrine disorders [46].and AGA, show higher usage biases.The codon composition of those hGMRPs and DRPs are more close to those of normal person from reference [47].Comparison of RSCU values of 64 types of codons of all various proteins with their percentage reveals that all RSCU are adjacent to each other with the exception of the eight residues of mGMRPs mentioned above.On the other hand, the genes of most DRPs are GC-rich and the average of GC3s is 61.8%.Due to base composition constraints, it is expected that G and/or C containing codons will predominate in the coding regions.

Synonymous Codon Usage Biases
About codon usage biases, their percentage trendlines of 64 types of codons of various proteins in various species are basically consistent with each other except Ser, Pro, Thr, Ala, Cys, Gly, Arg and Asp residues of mGMRPs (Table 2).Especially Arg contains six codons showing diversity, and four codons, such as CGC, CGG, AGG  Note: " * " means the data from reference [46]." ** "RSCU values are average of all of proteins and N means the total times of each codon.The colored numbers show the higher values corresponding to codons of each amino acid (P < 0.01).
Although the overall RSCU values in a genome could unveil the codon usage pattern of a whole genome, there may be potential heterogeneity of codon usage among genes in DRPs: GC3s values both range from 24.6% to 92.2% with a mean of 61.8% and a standard deviation (S.D.) of 1.53%; while c values both range from 31.37 to 61.0 with a mean of 48.11 ± 7.06.Similarly, in human GMRPs: GC3s values both range from 23.75% to 87.73% with a mean of 61.21% and a standard deviation (S.D.) of 1.62%; while c values both range from 34.35 to 61.19 with a mean of 47.94 ± 6.60; in mouse GMRPs: GC3s range from 25.6% to 95.7% with a mean of 62.1% ± 1.76% while c from 31.25 to 61.0 with a mean of 47.25 ± 7.33.The plot of c against GC3s can be effectively used to explore the heterogeneity (Figure 2), and most genes fall within a restricted cloud, near the area of the axes with GC3s between 0.40 and 0.85 and c from 40 to 60.The points in the plot are quite spreaded out and the bulk of genes appear to be following a less slope than that of the theoretical curve, which suggests that there are possibly other contributors to the codon usage pattern in DRPs, hGMRPs, and mGMRPs besides the genomic composition.If the codon usage pattern of the genes has some influence other than the GC content, the comparison of the actual distribution of genes with the expected distribution under no selection could be indicative.In other words, if GC3s is the only determinant factor shaping the codon usage pattern, the values of c would fall on a continuous curve, which represents random codon usage [48].

A Correlation between Codon Usage and Gene
Expression Level A more extensive and quantitative analysis of the sources of codon usage variation among genes can be achieved using multivariate statistical analysis [49].In the present work, COA of codon usage in DRP genes which was performed on RSCU values shows that the first axis account for 34.59% of all variation among DRP genes and 43.65% of variation among human DRP genes, whereas the rest of the axes account for no more than 5.06% and 5.74% among all DRP and human DRP genes, respectively (Figure 3).The first principle trend apparently describes a quite large weight of the codon usage variation, which suggests that the major trend in codon bias is as strong as in other species [50].The plots of genes on the first twenty axes show the relative inertia and cumulative inertia.From the plots, it can be seen that most genes falling within the first axis, demonstrating that there is a single major trend in codon usage in genes related to diabetes, especially human DRPs.According to two-way chi square   2  contingency test, codon usage differences in the two data sets for diabetes genes at between high and low expresses level was compared  with the criterion of to assess significance, respectively.In order to find out whether translational selection is acting on the codon usage in diabetes, a new COA of the RSCU values of the genes located on the leading strand was conducted, since most of the highly expressed sequences are located on that strand, namely the first axis where synonymous codons usages of these genes exist.To investigate the expression trend in codon usage on axis 1, we selected 10 genes from the bottom of axis 1 (the High data set, mainly consisted of genes known to be expressed at high levels), and 10 genes from the other extreme (the Low data set, mainly including genes known to be expressed at low levels) (Figure 3).Statistical analysis showed that the position of gene on the first axis was high positive correlated with the GC3s content ( ; human ).Similarly, the position of gene on the first axis was also positive correlated with A3s ( r ; human ), T3s ( ) content, respectively.Thus, the value of GC3s plays a decisive influence in synonymous codon usage variation of DRPs.
Moreover, optimal codons are defined as those occurr significantly more often in highly expressed genes than they do in lowly expressed genes.27 codons for 18 amino acids in diabetes genes were identified as significantly   0.01 P  0.78 more frequent in the high set (Table 3).These codons are thought to be optimal for translation.16 codons of them terminate in C (59.3%), 11 in G (40.7%), but no in A/U.The contrast indicates that codon usage biases of DRPs are induced by the composing restriction of those high-expressed GC-rich genes to a great extent.This is consistent with the result that DRP genes generally full contain GC, which hints the compositional restriction plays an important role in boosting GC3s usages of DRP codons.Here, the highest frequent codon are Phe (UUC), Leu (CUG), Ile (AUC), Met (AUG), Val (GUG), Ser (AGC), Tyr (UAC), Pro (CCC), Thr (ACC), Ala (GCC), His (CAC), Trp (UGG), Cys (UGC), Gln (CAG), Asn (AAC), Gly (GGC), Arg (CGG), Asp (GAC), Glu (GAG), Lys (AAG), and STOP (UGA), in turn.Ser and Arg don't show certain bias but have lowest frequent codon, namely Ser (UCG) and Arg (CGU).COA results also revealed that the bigger RSCU value is, the higher codon usage bias is.The codon with RSCU value greater than 1.5 was Leu (CUG), Val (GUG), Ile (AUC), Thr (ACC), Ser (AGC), and Ala (GCC).Especially, RSCU values of Leu (CUG) and Val (GUG) were significantly higher than their synonymous codons, while the two amino acids had very high codon usage preference in diabetes for further study.Moreover, GC contents are evidently more than AU contents and the usage biases of codons depend on GC contents to a great extent.In high-expression gene, all codons show the terminal of C/G, which is consistent with the usage biases of codons, namely all biased codons with the terminal of C/G.This helps to treat and control diabetes at gene level by site mutant.

Effect of Relative Abundance of Dinucleotide
and CpG Suppression Based on DRPs It has been reported that dinucleotide biases can affect codon bias.Using the calculating method of a previous research about CpG under-representation in classical swine fever virus (CSFV) [51], we are supposed to determine whether the relative abundances of dinucleotides in human diabetes related genes affects codon usage.As a conservative criterion told above, for (or 1.23 ), the XY pair is considered to be of high (or low) relative abundance compared with a random association of mononucleotides.The frequencies of occurrence for  0.969 0.194

 
The relative abundances of CpG and UpA are lower than the "normal range" compared with a random association of mononucleotides, while those of UpG, CpU and CpA are higher than the "normal range".These observations indicated that the composition of dinucleotides, which are independent of the overall base composition but still the result of differential mutational pressure, also determines the variation in synonymous codon usage among different diabetes related proteins ORFs.
Based on the Euclidean distance coefficients (d ik ) of codon usage biases, cluster analysis (Figure 6) show that the 11 node proteins in DRPs mentioned above were divided into two groups.Group I contains six node proteins, O43707, P20823, P02545, P06213, P62845, and Q16849, while group II includes five proteins, P13987, Q64521, P19357, P06858, and Q14191.Here, a class of O43707 and P20823 hints cytoskeletal protein association with hepatocyte nuclear factor.The former related to anchor with diabetes nephropathy as a good marker protein to examine the relation between O-linked N-acetylglucosamine (O-GlcNAcylation) and diabetic nephropathy [52].The latter is a hepatocyte nuclear factor as key metabolic regulators of energy homeostasis pathways including GnT-4a (glycosyltransferase) glycosylation and glucose transporter expression [53].Elevated free fatty acid (FFA) concentrations impair the expression and function of FOXA2 and HNF1A transcription factors sufficiently in beta cells to deplete GnT-4a glycosylation and glucose transporter expression.FFAs induce the activation of one or more beta cell G protein-coupled FFA receptor such as GPR40, while the chronic elevation of FFAs increases mitochondrial oxidation and reactive oxygen species.The resulting dysfunction of beta cells leads to impaired glucose tolerance and failure of GSIS (glucose-stimulated insulin secretion) and further contributes to hyperglycemia, hepatic steatosis and systemic insulin resistance.The molecules that impinge upon this pathway to sustain beta cell GnT-4a activity and glucose transporter expression may suggest new therapeutic targets to achieve effective prevention and treatment of Here, red dots donate the proteins related to diabetes and the bigger ones do hub proteins of network, whilst yellow dots express the proteins whose anti-diabetes activities have not been reported until now, they nevertheless exist in the network.However, the smaller red dots at right corner under the figure don't exist in the above network although they are related to diabetes.Moreover, the blue lines show the interaction among diabetes-related proteins by the other proteins while the purple lines figure the direct interactions between diabetes-related proteins.And the indirect interactions of the diabetes-related proteins are connected by less than three other proteins during building of the interaction network.diabetes [53].A class of Q64521 and P06858 related to glycerin metabolism, indicating that adipose tissue dysfunction correlate with systemic insulin resistance and type 2 diabetes.Rogers C and co-worker have found that only ErbB1 expression was correlated with insulin sensitivity using ELISA and real-time PCR technology [54].Additionally, EGF receptor (ErbB1) levels correlated positively with PPARγ and several PPARγ-regulated genes including acyl-coenzyme A synthetase long-chain family member 1 (ACSL1), adiponectin, adipose tissue triacylglycerol lipase (ATGL), diacylglycerol acyl transferase 1 (DGAT1), glycerol-3-phosphate dehydrogenase 1 (GPD1), and lipoprotein lipase (LPL), but negatively with CD36 and fatty acid-binding protein 4 (FABP4).These findings demonstrate a key role for ErbB1 in adipogenesis and suggest that lower ErbB1 protein abun-dance may lead to adipose tissue dysfunction [54].A class of P02545 and P62845 hints ribosome protein may play a role in diabetes-related protein synthesis.The former related to Familial partial lipodystrophy (FPLD), an autosomal dominant disorder caused due to missense mutations in the lamin A/C (LMNA) gene encoding nuclear lamina proteins [55].The latter related to insulinoma [56].A class of P13987 and Q14191 concerned with diabetes by enzyme signal pathway.P13987 as membrane attack complex (MAC) inhibition factor plays an important role in diabetic macrovascular diseases associated with protein tyrosine kinase signal pathway [57].Q14191 as a ATP-and Mg 2+ -dependent DNA helicase participates in DNA repair, replication, recombinetion and telomere maintenance, which isolated from Werner's syndrome, a rare human autosomal recessive segmental progeroid syndrome clinically characterized by atherosclerosis, cancer, osteoporosis, type 2 diabetes mellitus and ocular cataracts [58].

Glucose Metabolism-Related Proteins
There are 330 node proteins and 296 links in the protein-protein network related to GMRPs (Figure 7) and its mean link degree <k> is 0.89.Some node proteins are surrounded by a large number of proteins whose link degrees are far greater than the average degree of the whole network.15 central node proteins exist in this network, P01308 (insulin), P02652 (apolipoprotein A-II), P06213 (insulin receptor), P11166 (solute carrier family 2, facilitated glucose transporter member 1), P13807 (glycogen synthase), P18031 (tyrosine-protein phosphatase non-receptor type 1), P31749 (RAC-alpha serine/ threonine-protein kinase), P35568 (insulin receptor substrate 1), P37231 (peroxisome proliferator-activated receptor gamma), P52790 (hexokinase-3), P60484 (phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase and dual-specificity protein), Q15118 (pyruvate dehy-drogenase kinase isozyme 1), Q9P1Z2 (calcium-binding and coiled coil domain-containing protein 1), Q9UH92 (Max-like protein X), and P04075 (fructose-bisphosphate aldolase A), are related to diabetes whose connect degrees (k) are 11, 6, 46, 9, 6, 25, 48, 48, 24, 5, 11, 6, 5, 6, and 5, respectively.Q86YI5 (dihydrolipoamide S-acetyltransferase) doesn't exist in GMRPs whose connect degrees (k) is 5. Specially, P31749, P35568 and P06213 play important roles in protein-protein interaction network.P31749 (RAC-alpha serine/threonine-protein kinase, protein kinase B, PKB) is one of closely related serine/ threonine-protein kinases (AKT1, AKT2 and AKT3) called the AKT kinase, and which regulate many processes including metabolism, proliferation, cell survival, growth and angiogenesis.This is mediated through serine and/or threonine phosphorylation of a range of downstream substrates.AKT is responsible of the regulation of glucose uptake by mediating insulin-induced translocation of the SLC2A4/GLUT4 glucose transporter to the cell surface.PTP1B is a protein tyrosine phosphatase that negatively regulates insulin sensitivity by dephosphorylating the insulin receptor [59].Phosphorylation of PTPN1 by AKT at Ser50 negatively modulates its phosphatase activity creating a positive feedback mechanism for insulin signaling, namely preventing dephosphorylation of the insulin receptor and the attenuation of insulin signaling.AKT phosphorylation of TBC1D4 (the Rab GTPase-activating protein, 160-kDa Akt substrate) triggers GLUT4 translocation and the binding of this effector to inhibitory 14-3-3 proteins, which is required for insulin-stimulated glucose transport [60].AKT regulates also the storage of glucose in the form of glycogen by phosphorylating GSK3A at Ser21 and GSK3B at Ser9, resulting in inhibition of its kinase activity.
P35568 (insulin receptor substrate 1, IRS1) functions as one of the key downstream signaling molecules in both the insulin receptor and the insulin-like growth factor-1 receptor signaling pathways (IGF-1R).Thus genetic changes in IRS-1 may potentially contribute toward the development of insulin resistance [61].After autophosphorylation of the insulin receptor, the receptor kinase is activated and phosphorylates IRS-1 and other intracellular substrates.This signaling molecule then acts as a docking protein for multiple Src homology-2 domain (SH2)-containing proteins, including phosphatidylinositol 3-kinase (PI3-k), Grb-2, and SHP2.The G972R polymorphism is found near the C terminus of IRS-1 flanked by two tyrosine phosphorylation consensus sites (EY 941 MLM and DY 989 MTM), which are known binding sites for the p85α regulatory subunit of PI 3-kinase.The G972R polymorphism impairs the ability of insulin to stimulate glucose transport, glucose transporter translocation, and glycogen synthesis by affecting the PI3K/ AKT1/GSK3 signaling pathway.The polymorphism at G972R may contribute to the in vivo insulin resistance observed in carriers of this variant.G972R could contribute to the risk for atherosclerotic cardiovascular diseases associated with non-insulin-dependent diabetes mellitus (NIDDM) by producing a cluster of insulin resistance-related metabolic abnormalities [62].In insulinstimulated human endothelial cells from carriers of the G972R polymorphism, genetic impairment of the IRS1/ PI3K/PDPK1/AKT1 insulin signaling cascade results in impaired insulin-stimulated nitric oxide (NO) release and suggested that this may be a mechanism through which the G972R polymorphism contributes to the genetic predisposition to develop endothelial dysfunction and cardiovascular disease.The G972R polymorphism not only reduces phosphorylation of the substrate but allows IRS1 to act as an inhibitor of PI3K, producing global insulin resistance [61].

Analysis of Transcription Factors at DNA Level Based on DRPs
The genes of 25 4).Sequence alignment shows that BP1 binding site presents a conserved motif of A-Py-AT-

Prediction of BAD Complex Proteins
Danial and co-workers indicates an unanticipated role for BAD in integrating pathways of glucose metabolism and apoptosis using proteomics, genetics and physiology [25].
In liver mitochondria, BAD (Q92934) resides in a functional holoenzyme complex together with PKA (Q4P0B3) and PP1 (Q9FSU8) catalytic units, WAVE-1 (Q92558), and GK (P35557).BAD complex includes phosphorylation motif (hexokinase and cAMP_kin), dephosphorylation motif (T_phtase_apaH), apoptosis regulator related motif (Bcl2_BH and BCL2_apoptsis), cyclic nucleotidebinding domain (cNMP_bd), cytoskeleton regulation related motif (WH2_actin_bd), and regulatory subunit portion of type II PKA R-subunit (RIIa).Comparison with the frequency of amino acids in BAD complex, these proteins basically accord with the amino acid composition, especially GK, WAVE1 with lowest Trp contents (Table 1).The Cys contents of BAD and PAK are the lowest but the Tyr content of PP1 is the lowest.Table 5 displays the usage biases of codons of BAD complex.Usage biases of codons reveal that GK are consistent with the results of DRPs while BAD and PKA nearly coincide with the codon biases.But WAVE1 and PP1 depart from the codon usage biases.The interaction between BAD complex and other proteins was searched in IntAct database.The B cell leukemia-2 gene product (Bcl-2) family of proteins can be divided into three different subclasses based on conservation of BCL-2 homology (BH1-4) domains: multidomain anti-apoptotic proteins (BCL-2, BCL-XL, MCL-1, BCL-W, and Bfl-1/A1), multidomain proapoptotic proteins (BAX and BAK), and BH3-only proapoptotic proteins (BID, BAD, BIM, PUMA, NOXA, and NBK/BIK) [44,63].Notably, BH3-only proteins are not able to kill cells that lack BAX and BAK, indicating that BH3-only proteins function upstream of and are dependent on BAX and BAK [64].IntAct database shows that BAD interacts with Bcl-2, Bcl-x, Bcl-w, RNA-binding protein EWS, and 14-3-3 protein sigma, respectively.BAK could contact the node protein Lamin-A/C by E1B protein of the following pathway, namely Q16611 (BAK_human, Bcl-2 homologous antagonist/killer)-P03247 (E1BS_ ADE02, E1B protein, small T-antigen)-P02545 (LMNA_ Human: Lamin-A/C).
On the other hand, gamma-enolase might associate the node protein alpha-endosulfine with HXK 1, viz.HXK 1 (P19367, Hexokinase-1)-P09104 (ENOG_human, Gammaenolase)-O43768 (ENSA_human, alpha-endosulfine).Figure 5 shows HXK1 (green dot) interaction with O43768 by P09104.O43768 is the endogenous ligand for sulfonylurea receptor, while endosulfine is the endogenous ligand for the ATP-dependent potassium channels that occupy a key position in the control of insulin release from the pancreatic beta cell by coupling cell polarity to metabolism.By inhibiting sulfonylurea from binding to the receptor, it reduces K + channel currents and thereby stimulates insulin secretion.P09104, neuron-specific enolase, has neurotrophic and neuroprotective properties on a broad spectrum of central nervous system (CNS) neurons, binding to cultured neocortical neurons in a calcium-dependent manner to promote cell survival.It is normally expressed in neural tissue and involves a freely reversible cytosolic reaction in glycolysis where one molecule each of phosphoenolpyruvate and water react to form one molecule of 2-phosphoglycerate.It is of possible clinical interest as a marker of some types of neuroendocrine and lung tumors.Moreover, HXK2 interacts with ubiquilin-1, while GK interacts with two consensus peptides, EYLSAIVAGPWP of GCKR1 (Glucokinase regulatory protein) and HGMK-VWTLPATS of PFKFB1 (F261, 6-phosphofructo-2kinase/fructose-2,6-biphosphatase 1) by phage display, respectively.The homology of WAVE1 and PKA by alignment of F261 or GCKR is more than 35%, respectively.Our results show that GK might distinguish from PP1 or WAVE1 in a special pattern by binding with the consensus motif or WAVE1, which is sup-ported by our previous results that GK distinguishing a nine-residue motif EGLKFYTNP (146-154) of WAVE1 construct the BAD-GK-PKAc-PP1c-WAVE1 complex [45].Additionally, BAD and GK play key roles because of BAD as a substrate for the PKA-PP1 pair and by BH3 domain directly interacting with GK.Apparently, this may be the reason of the BAD complex existing in liver mitochondria, regardless of phosphorylated and dephosphorylated BAD, integrating glycolysis and apoptosis.

Molecular Modeling of BAD Complex
Knowledge, both from the 3D structures of homologous proteins and from the general analysis of protein structure, is of value in modeling a protein of known sequence but unknown structure.Many models are constructed by homologous modeling on graphics devices.Our group has even built molecular models of HIV-1 co-receptor CCR5 [65], fibrinogen receptor [66], ADP receptor [27], and recombinant SCF-MCSF (Stem Cell Factor-Macrophage Colony Stimulating Factor) fusion proteins and their receptors [67] using the same method, especially integrin GPIIB and purinergic receptor P2Y12 were embodied into PDB database, and their PDB codes were 1UV9 and 1VZ1, respectively.Here we mainly discuss the mechanism of BAD complex using homologous modeling and molecular dynamics.Firstly, human BAD model was built using NMR structure of mouse Bid (PDB code: 1DDB) [68] as a template.Secondly, based on the sequence alignments between BAD and CAPKI and between PKAc and GK, taking X-ray crystal structure of PKAc interaction with its inhibitor (CAPKI) (PDB code: 1CTP) [69] as a template, a complex model of BAD with GK was constructed using Biopolymer module.Similarly, a complex model of PKAc, BAD, and GK was built using a complex model of human BAD with Bcl-xL and PKAc [70] built by us as a template.Thirdly, based on the structure of PP1c-MYPT1 (myosin phasphatase targeting subunit 1) complex (PDB code: 1S70) [71], a structural model of PP1c-GK complex was constructed using Biopolymer module.Combined with BAD complex with GK and PKAc, a complex model of PP1c, BAD, GK, and PKAc was built.Finally, the fourmember complex by interaction with the dimerization domain of PKA regulatory subunit could also connect with WAVE1 to construct the nine-member complex, namely (BAD-GK-PKAc-PP1c) 2 -WAVE1 complex.
The survey for their interactions in the complex with human BAD and GK focuses on a helical region (98 to 126 residues) of BAD and a shallow pocket at surface of GK (Figure 8).The direct interatomic contacts are made between BAD and GK by van der Waals contacts, hydrophobic interaction, electrostatic interaction, hydrogen bond, and salt bridge, respectively.Residues in contact are congregated in five scattered regions of human BAD: Gln12-Ser16 (loop, pink), Ala27 (loop, pink), Leu44-Glu53 (turn, yellow), Arg98-Lys126 (BH3 domain, red), and Gln152-Ser167 (loop and helix, green); and they spread over six segments of GK: Asp78-Met87 at a turn, two loops (Ser100-Glu112 and Gly170), two helices (Glu128-Phe133 and Arg327-Gln347), and a span from 415 to 446.Besides the interaction between BAD and GK mentioned above, other interactions exist in the complex with BAD, GK, PP1c and PKAc.Especially the amidogen hydrogen of Lys81 and Lys83 of PKAc make hydrogen bridges with the carbonyl oxygen of Leu314 and Pro359 of GK, respectively.Besides, the direct interatomic contacts are made between 7 residues of BAD and 6 residues of PKAc as follows: the active sites of BAD distributing over three segments: two loops (Glu68-Ser71 and Glu84-Glu88) and a turn (Tyr76-Gly79); and the active sites of GK over three parts: a helix (Glu248-Ser252) and two loop regions (Gln242 and Lys254-Arg256).Additionally, the direct interatomic contacts are made between 5 residues of GK and 5 residues of PP1c.Residues in contact are concentrated in three dispersed regions of GK: Gln18 residue at a helix, Gln24-Leu25-Gln26 motif and Glu421 residue at loops; and they are distributed over three segments of PP1c: Asp179-Gln181, Arg188 residue, and Typ216-Glu218 segment.
Our research results reveal that BAD-GK-PKAc-PP1c complex by GK distinguishing a nine-residue motif EGLKFYTNP (146-154) of WAVE1 construct the BAD-GK-PKAc-PP1c-WAVE1 complex, which is supported by Danial NN's results [25].This is a case study of how human BAD was phosphorylated and inactivated at Ser75 by PKAc.Another for dephosphorylated and activated BAD on Ser75, maybe substitute PP1c for PKAc in the complex where PP1c interacts with BAD to result in the dissociation of PKAc and its binding to the regulatory subunit of PKA.Thus, PKA by the dimerization domain of the regulatory subunit could interact with N-terminal peptides of WAVE1, which is supported by Newlon MG's results [72].Additionally, we have built a complex model of double PKA with a peptide from WAVE1 [73].Moreover, PKA by WAVE1 interacts with GK to form (BAD-GK-PKAc-PP1c) 2 -WAVE1 complex.

DISCUSSION
Diabetes is an important endocrine disorder whose pathogenesis is so complex and indefinite that it has never been reported that someone had recovered totally from diabetes.Genomic and proteomic data analysis is essential for understanding the underlying factors that are involved in human disease, such as AD previously reported by ourselves [11].By extracting significant genomic and proteomic biomarkers in controlled experiments, scientists come closer to understanding how bio-logical mechanisms contribute to human diseases such as neurodegenerative diseases and cardiovascular disorders, and how drug treatments interact with the nervous system.By integrating proteins associated with the pathology of AD into a database and mapped into a protein interaction network, we report a novel approach to predict protein-protein interactions and their potential interaction with GK (green), PKAc (cyan), and PP1c (red); b) a nine-member complex model of double four-member complex with a peptide of WAVE1, here, BAD colored by orange and blue, respectively, GK (green and purple), PKAc (cyan and gray), and PP1c (red and white).
sites, based on the characteristics of the free-scale network [74].These networks are constructed with the proteins being the "nodes" and an observation of interaction being an "edge".The scale free nature of the protein interaction network indicates that a limited number of proteins have a large number of interactions, and function as "hubs" [74].Thus, protein interaction networks provide an indication of which proteins are more likely to be critical for overall functioning, and to indicate groups of proteins sharing common functions.
Insulin is traditionally thought to reduce blood glucose levels by stimulating glucose uptake into skeletal muscle and adipose tissues via GLUT4 and by suppressing hepatic glucose production [82].The insulin receptor (ISR) recruits adaptor protein IRS (insulin receptor substrates) to connect with downstream signalling pathways [77].Insulin activates the ISR tyrosine kinase, which subsequently tyrosine phosphorylates IRS [83] (Figure 9).In the fed state, dietary carbohydrate increases plasma glucose and promotes insulin secretion from the pancreatic beta cells.In the skeletal muscle, insulin increases glucose transport, permitting glucose entry and glycogen synthesis.In the liver, insulin promotes glycogen synthesis and de novo lipogenesis while also inhibiting gluconeogenesis.In the adipose tissue, insulin suppresses lipolysis and promotes lipogenesis [83].A family of IRS proteins was defined based on three major common structural elements: amino-terminal PH and PTB domains (PH/PTB) that mediate protein-lipid or protein-protein interactions, mostly carboxy-terminal multiple tyrosine residues (multi-Tyr) that serve as binding sites for proteins that contain one or more SH2 domains, and serine/ threonine-rich regions (Ser/Thr-yich) which may be recognized by negative regulators of insulin action [77].Glucagon, a major insulin counterregulatory hormone, binds to specific G s protein-coupled receptors (GPCRs) to activate glycogenolytic and gluconeogenic pathways via adenylyl cyclase and phospholipase C (PLC) signaling pathways leading to increased intracellular cAMP and [Ca 2+ ] i , causing blood glucose levels to increase [75].Inappropriate increases in serum glucagon play a critical role in the development of insulin resistance and target organ damage in type 2 diabetes.Xiao C et al. have demonstrated that glucagon activated specific receptors to increase rat mesangial cell proliferation by stimulating MAPK ERK1/2 phosphorylation and that glucagon-induced ERK 1/2 phosphorylation requires receptor-mediated activation of cAMP-dependent PKA and PLC/ [Ca 2+ ] i -mediated signaling cascades [75].Hepatic glucagon action increases in response to accelerated metabolic demands and is associated with increased whole body substrate availability, including circulating lipids [84].The hypothesis that increases in hepatic glucagon action stimulate AMP-activated protein kinase (AMPK) signaling and peroxisome proliferator-activated receptor-α (PPARα) and fibroblast growth factor 21 (FGF21) expression in a manner modulated by fatty acids was tested in vivo.These findings demonstrate that glucagon exerts a critical regulatory role in liver to stimulate pathways linked to lipid metabolism in vivo and shows for the first time that effects of glucagon on PPARα and FGF21 expression are amplified by a physiological increase in circulating lipids [84].Moreover, accumulating evidence indicates a role for metabolic dysfunction in the pathogenesis of AD, with the evidence of insulin resistance in the AD brain [85].Thus, insulin-based therapies, especially targeting insulin signaling, have emerged as potential strategies to slow cognitive decline and to improve cognitive function in AD.
PC-1/ENPP1 has been shown to inhibit insulin signaling by inhibition of insulin-stimulated ISR tyrosine phosphorylation in cultured HEK293 cells in vitro and in transgenic mice in vivo when overexpressed [86].Moreover, knockdown of ENPP1 with siRNA significantly increases insulin-stimulated AKT phosphorylation in HuH7 human hepatoma cells, supporting the proposition that ENPP1 inhibition is a potential therapeutic approach for the treatment of type 2 diabetes [79].It is reported that disruption of Angiopoietin-1 (Ang-1)/Tie-2 signaling pathway by PTP contributes to the diabetes-associ- ated impairment of angiogenesis [78].Chen JX et al have found that protein tyrosine phosphatase-1 (SHP-1) bond to Tie-2 receptor and stimulation with Ang-1 led to SHP-1 dissociation from Tie-2 in mouse heart microvascular endothelial cell (MHMEC).Exposure of MHMEC to high glucose blunted Ang-1-mediated SHP-1/Tie-2 dissociation while treatment with PTP inhibitors restored Ang-1-induced AKT/eNOS phosphorylation and angiogenesis.Upregulation of Ang-1/Tie-2 signaling by targeting SHP-1 should be considered as a new therapeutic strategy for the treatment of diabetes-associated impairment of angiogenesis [78].Similarly, microvessels isolated from the brains of AD patients express a large number of angiogenic proteins, such as hypoxia inducible factor 1-alpha (HIF-1α), angiogenic proteins, angiopoietin-2 (Ang-2), and matrix metalloproteinase 2 (MMP2), and survival/apoptotic proteins (Bcl-xL, caspase 3), accompanied by Ang-2, MMP2 and caspase 3 elevated and the anti-apoptotic protein Bcl-xL decreased [87].Hypoxia is thought to contribute to AD pathogene-sis and the cerebro-microvasculature is an important target for the effects of hypoxia in the AD brain.
The prevalence of type 2 diabetes increases rapidly due to the obesity epidemic.Obesity is the primary risk factor for insulin resistance, a hallmark and a driving factor for NIDDM progression [82].The best-established connection between obesity and insulin resistance is the elevated and/or dysregulated levels of circulating free fatty acids that cause and aggravate insulin resistance, type 2 diabetes, cardiovascular disease and other hazardous metabolic conditions.Karki S and co-worker have indicated that palmitate, a major dietary saturated fatty acid, not only decreases adiponectin expression at the level of transcription by PPARγ and C/EBPα signal pathway and its intracellular metabolism via the acyl-CoA synthetase 1-mediated pathway, but may also stimulate lysosomal degradation of newly synthesized adiponectin by the intracellular sorting receptor sortilin [80].INT131 is a potent non-thiazolidinedione (TZD)-selective PPARγ modulator being developed for the treatment of type 2 diabetes [77].In preclinical studies and a phase II clinical trial, INT131 has been shown to lower glucose levels and ameliorate insulin resistance without typical TZD side effects.Lee DH and co-worker have suggested that a newly developed insulin-sensitizing agent, INT131, normalizes obesity-related defects in insulin action on PI3K signaling in insulin target tissues by a mechanism involved in glycemic control.If these data are confirmed in humans, INT131 could be used for treating type 2 diabetes without loss in bone mass [77].Moreover, nsulin-sensitive neurons in the brain, especially hypothalamus, also play a key role in the maintenance of normal body weight and glucose homeostasis.Morris DL et al. recently demonstrated that SH2B1, an SH2 and PH domain-containing adaptor molecule, links to human obesity and type 2 diabetes [82].Genetic deletion of SH2B1 results in severe leptin resistance, insulin resistance, hyperphagia, obesity, and type 2 diabetes in mice.Neuronal SH2B1 serves as an endogenous leptin sensitizer and directly enhances leptin signaling in hypothalamic neurons via a JAK2-dependent mechanism, whereas peripheral SH2B1 directly regulates glucose metabolism by enhancing insulin sensitivity.Furthermore, adipose SH 2B1 mediates insulin stimulation of GLUT4 trafficking and glucose uptake in adipocytes by a novel mechanism.SH2B1 appears to be evolutionally conserved to regulate body weight and glucose metabolism via the brain-adipose tissue axis [82].Nuclear receptors are attractive targets for the treatment of AD due to their ability to facilitate degradation of Aβ, affect microglial activation and suppress the inflammatory milieu of the brain [88].Countering neurovascular impairment with PPARγ agonists and acetylcholinesterase inhibitors (AChEi) may improve clinical outcome and delay progression to severe dementia [89].PPARγ agonists that act by enhancing insulin sensitivity are believed to derive from their ability as nuclear receptor ligands to regulate transcription of a wide variety of oxidative, inflammatory, fibrotic, and neuronal survival genes, although transcription independent effects may also be involved.PPARγ agonists thus have the ability to improve neuronal, glial, and cerebrovascular networks in AD, and consequently to rescue brain hemodynamics [89].
Obviously, more than 60 percent of the glucose streaming through our blood is consumed by the brain.Uncontrolled diabetes can also be indirectly linked to numerous maladies by interfering with cerebral glucose metabolism to ultimately affect brain functioning.Impairment and damage of brain in lack of enough glucose supply will induce Alzheimer's disease (AD), considered as the "brain-type diabetes" or a third form of diabetes [90].In fact, brain insulin has been shown to regulate both peripheral and central glucose metabolism, neurotransmission, learning, and memory and to be neuropro-tective.Insulin signaling in central nervous system (CNS) has emerged as a novel field of research since decreased brain insulin levels and/or signaling were associated to impaired learning, memory, and age-related neurodegenerative diseases [90].The formation of misfolded proteins, AGEs (advanced glycation end products), mitochondrial disorders, generation of oxidative stress, insulin resistance and abnormal glucose metabolism are main hallmarks of diabetes and AD [1].Insulin may constitute a promising therapy against diabetes-and age-related neurodegenerative disorders, as the potential missing link between diabetes and aging in CNS.By integration of the metabolic, neuromodulatory, and neuroprotective roles of insulin in two age-related pathologies: diabetes and AD, Duarte AI and co-worker have revealed that the two age prevalent diseases, AD and type 2 diabetes, share many common features including the deposition of amyloidogenic proteins, amyloid β protein (Aβ) and amylin (human islet amyloid polypeptide, hIAPP), respectively [90].Recent evidence suggests that both Aβ and amylin may express their effects through the amylin receptor although the precise mechanisms for this interaction at a cellular level are unknown [91].Aβ  and human amylin increase cytosolic cAMP and Ca 2+ , trigger multiple pathways involving the signal transduction mediators PKA, MAPK, Akt, and cFos.In the presence of human amylin, Aβ(1-42) effects on HEK293-AMY3 expressing cells are occluded suggesting a shared mechanism of action between the two peptides.Amylin receptor antagonist AC253 blocks increases in intracellular Ca(2+), activation of PKA, MAPK, Akt, cFos and cell death that occur upon amylin receptor-3 (AMY3) activation with human amylin, Aβ , or their coapplication.Fu W et al suggested that AMY3 plays an important role by serving as a receptor target for actions Aβ and thus may represent a novel therapeutic target for development of compounds to treat neurodegenerative conditions such as AD [91].Additionally, hIAPP as another hallmark of diabetes leads to pancreatic β-cells dysfunction.Neprilysin, a metallopeptidase known to degrade amyloid in AD, decreases islet amyloid deposition by inhibiting hIAPP fibril formation, rather than degrading hIAPP [92].Targeting the role of neprilysin in IAPP fibril assembly, in addition to IAPP cleavage by other peptidases, may provide a novel approach to reduce and/or prevent islet amyloid deposition in type 2 diabetes.The fact that some drugs developed to treat diabetes also have a positive effect on Alzheimer's patients, such as metformin, supports the view that there is a common metabolic basis for both disorders [1].
Furthermore, mitochondrial abnormalities like over oxidative stress and impaired calcium homeostasis are reported in AD and diabetes [1].Although mitochondria are capable of generating ROS/RNS in AD and diabetes by themselves, other sources/mechanisms like misfolded proteins (Aβ and hIAPP), accumulated NFTs (tau plaques), hyperglycemia and AGEs also promote ROS generation [93].Abundant evidence indicates that direct neurotoxic effects of Aβ involve activation of apoptosis pathways [94].Apoptosis is regulated by the B cell leukemia-2 gene product (Bcl-2) family (Bcl-2, Bcl-x, Bax, Bak and Bad) and the caspase family (ICH-1 and CPP32), with apoptosis being prevented by Bcl-2 and Bcl-x, and promoted by Bax, Bak, Bad, ICH-1 and CPP32 [95].Members of the Bcl-2 family are pivotal regulators of the apoptotic process and include both proteins that promote cell survival (e.g., Bcl-2, Bcl-xL, and Bcl-w) and others that antagonize it (e.g., Bax, Bad, Bak, Bik, Bid, BNIP3, and Bim).Previous work suggests that Aβ-induced apoptosis is characterized by decreased expression of the antiapoptotic Bcl-2, Bcl-xL, and/or increased expression of the proapoptotic Bax, Bim.Furthermore, overexpression of antiapoptotic Bcl-2 and Bcl-xL or suppression of Bim can attenuate Aβ toxicity [94].Kitamura Y et al. have indicated that Bak and Bad but not Bax may contribute to neuronal death in AD [95].The mechanism of Aβ-induced neuronal apoptosis sequentially involves JNK (c-Jun N-terminal kinase) activation, Bcl-w downregulation, and release of mitochondrial Smac (second mitochondrion-derived activator of caspase, an important precursor event to cell death), followed by cell death [94].Especially, the mitochondrial BAD complex composed of GK, PP1, PKA, WAVE-1 and BAD in integrating pathways of glucose metabolism and apoptosis [25].GK activation in the liver results in reduction of glucose concentrations by increasing glycogen synthesis [84].Dual role of BAD in insulin secretion and beta-cell survival is represented by the BH3 domain, which endows BAD with bifunctional activities to differentially control insulin secretion and apoptosis [86].Phosphorylation of serine 155 in the BH3 domain instructs BAD to assume a metabolic role by activating GK.The metabolic function of BAD controls insulin secretion, hepatic glucose sensing and overall glucose tolerance.When dephosphorylated, BAD BH3 targets BCL-XL to induce apoptosis.Apoptosis, together with other negative and positive regulators of beta-cell growth, proliferation and survival, contributes to the physiological control of beta-cell mass homeostasis.In addition, exposure to perfluorononanoic acid (PFNA), an increasingly persistent organic pollutant, has been demonstrated to cause hepatotoxicity in animals because PFNA exposure changed the expression levels of several genes related to hepatic glucose metabolism [96].The protein expression levels of GK, PI3Kca (phosphoinositide-3-kinase, catalytic, alpha polypeptide), phospho-insulin receptor substrate 1 (p-IRS1), p-PI3K, p-AKT and p-PDK1 (phosphoinositide-dependent kinase 1) were decreased in the livers of rats that received 5mg/kg/d PFNA, while that of glucose-6-phosphatase, GLUT2 and p-GSK3β (glycogen synthase kinase-3 beta) were increased, which explains the augment of hepatic glycogen [96].Hyperoside can activate the PI3K/Akt signaling pathway, resulting in inhibition of the interaction between Bad and Bcl(XL), without effects on the interaction between Bad and Bcl-2, and further inhibited mitochondria-dependent downstream caspase-mediated apoptotic pathway, such as that involving caspase-9, caspase-3, and poly ADP-ribose polymerase (PARP) [24].Hyperoside could be developed into a clinically valuable treatment for AD and other neuronal degenerative diseases associated with mitochondrial dysfunction.
The constructed model of BAD complex with GK, PKAc, PP1c and WAVE-1 integrating glycolysis and apoptosis using homology modeling method revealed that human BAD is phosphorylated and inactivated on Ser 75 in a BAD-Bcl-xL complex by PKA (targeted to mitochondria through association with WAVE1), resulting in the dissociation of BAD and its binding to GK [45].Moreover, GK can interact with PP1c and also distinguish WAVE1.On the other hand, BAD is dephosphorylated and activated on Ser75 by PP1c, leading to the separation of PKAc and its binding to the regulatory (R) subunit of PKA which by the dimerization domain of its R subunit connects with WAVE1 linked with GK of the complex.This may be the reason of the complex existing in liver mitochondria, regardless of phosphorylated and dephosphorylated BAD.Additionally, GK like PKA may also prevent Bcl-xL from rebinding to BAD by phosphorylating BAD at Ser 118.The BAD complex model reveals that BAD and GK play key roles because of BAD as a substrate for the PKA-PP1 pair and by BH3 domain directly interacting with GK.This is helpful for our development and research of the molecular mechanism of BAD.Our previously results have shown that PKA, targeted to mitochondria through association with AKAPs, phosphorylated and inactivated human BAD on Ser75 in a BAD-Bcl-xL complex resulting in the dissociation of BAD and its binding to 14-3-3, whereas PKA may also prevent Bcl-xL from rebinding to BAD by phosphorylating BAD at Ser118 of BH3 domain in a BAD-14-3-3E complex [97].The 14-3-3 family consists of homo-and heterodimeric proteins representing a novel type of "adaptor proteins" modulating the interaction between components of signal transduction pathways, especially 14-3-3 isoforms gamma and epsilon (14-3-3E) increased in several brain regions of AD patients [98].14-3-3E may reflect impaired signaling and/or apoptosis in the brain as several kinases (such as protein kinase C, Ras, mitogen-activated kinase MEK) involved in signaling and apoptotic factors due to BAD and BAG-1 binding to 14-3-3 motifs [98].
The activation of Wnt signaling and Akt is the main-tenance of mitochondrial membrane potential and the regulation of Bcl-xL, mitochondrial energy metabolism, and cytochrome c release [99].Loss of Wnt signaling also appears to play a role in more widespread neurodegenerative disorders, such as AD.Neurotoxicity of * β- amyloid deposition in hippocampal neurons has been linked to increased levels of GSK-3β * and loss of * β-catenin [99].Decreased production of Aβ * can occur during the enhancement of protein kinase C (PKC) activity which may be controlled by the Wnt pathway.In addition, PS1 has been shown to downregulate Wnt signaling and interact with β-catenin to promote its turnover [99].Disheveled (a downstream transducer of Wnt signaling) can promote non-amyloidogenic alpha-secretase cleavage of APP to yield secreted APP (sAPP) and inhibit GSK-3β to reduce the phosphorylation of tau.Thus, disheveled may increase neuronal protection during neurodegenerative disorders through sAPP production and reduction in tau phosphorylation [99].Wnt and Akt act upon downstream substrates, such as GSK-3β, Bad, and WISP-1 to block the induction of apoptotic cellular injury [99].EPO decreases the toxic effect of Aβ on microgliain through up-regulation of Wnt1 and activation of the PI3-K/Akt1/mTOR/p70S6K pathway-integration of the PI3-K/Akt1 pathways, mTOR, and mitochondrial related signaling of Bad, Bax, and Bcl-xL.EPO may in turn increases phosphorylation and cytosol trafficking of Bad, reduces the Bad/Bcl-xL complex and increases the Bcl-xL/Bax complex, thus preventing caspase 1 and caspase 3 activation and apoptosis.This may foster development of novel strategies to use cytoprotectants such as EPO for AD and other degenerative disorders [100].On the other hand, studies of synonymous codon usage in human genes can reveal some of human genome features, and the knowledge of codon usage pattern in diabetes or glucose metabolism related genes may assist mechanism researches of diabetes.Although the causes of codon usage bias are still under discussion, a lot of factors that influence the codon usage patterns are discovered in many researches: the GC content of the genome, especially the GC content at the third position of codons [101]; concentrations of corresponding acceptor tRNA molecules [102]; the functions or hydrophilicity of proteins expressed by a gene [103]; gene expression levels [104]; the amino acids composition of a protein [105]; the structure of proteins [106]; the mutational frequency and the method of mutation [107], etc.All these factors can be summarized as the influence of mutational pressure and translational selection (natural selection).Our results revealed that the bias of the codon usage in these proteins is primarily determined by the GC content and the third codons of the highly expressed genes are all G/C bases except part of mouse GMRPs.The synonymous codon usage biases in DRPs' genes and human GMRPs'genes are relatively strong, using G or C ended codons, which is consistent with that of human genes (table 2).12 amino acids prefer codons ended with C, including Asn (AAC), Asp (GAC), His (CAC), Tyr (UAC), Ala (GCC), Pro (CCC), Thr (ACC), Ser (AGC), Cys (UGC), Gly (GGC), Ile (AUC), and Phe (UUC), as well as Arg (CGC) for human genes.Besides Met (AUG) and Trp (UGG), 5 amino acids prefer codons ended with G, such as Gln (CAG), Glu (GAG), Lys (AAG), Leu (CUG), and Val (GUG), as well as Arg (CGG) for DRPs'gene.The compositional constraints (or mutational bias) and gene functions are the cause of the bias and the effect of natural selection is slight.All these analysis rendered some clue in discovering the mystery and curing the human diabetes disease.As analyzed in this study, CpG containing codons are markedly suppressed while UpG containing codons are over-represented.These results can be explained by mutational pressure.The frequencies of occurrence for dinucleotides were not randomly distributed and no dinucleotides were present at the expected frequencies.The codon usage in human diabetes related can be strongly influenced by underlying biases in dinucleotide frequencies.The global methylation pattern is a key feature of the methylation landscape of the human genome.Most of the gene bodies and intergenic sequences are globally methylated with the exception of regions called CpG islands (CGIs).CGIs are often unmethylated, but there are increasing number of CGIs being reported to be methylated in normal tissues [108].In mammals, the methylated form of cytosine (5-methylcytosine) is hypermutable.5-methylcytosine is formed by the enzyme DNA methyltransferase operating on a cytosine occurring immediately 5' of a guanine.One effect of methylation is to increase the rate of spontaneous deamination of 5-methylcytosine to form thymine.It has been estimated that transitions in the methylated CpG dinucleotide occur 8 -16 times faster than non-CpG transitions [109].The deficit of CpG dinucleotides in human diabetes related gene coding sequences is largely attributed to the hypermutability of methylated CpGs to UpGs (or CpAs in the complementary strand).Since, CpGs in CGIs are often unmethylated and their mutations are rare, the deficiency of CpG can be considered an influence of mutation, which is in accordance with the conclusion that the codon usage of these diabetes related genes are influenced by the cause mutational pressure.Our results also displayed that the synonymous codon usage biases in mouse GMRPs' genes are ordered by the third codon position as follows: C > G > U > A, which is in accord with Zeeberg's results [110].5 amino acids prefer codons ended with U, including Ala (GCU), Pro (CCU), Ser (UCU), Asp (GAU), and Cys (UGU).3 amino acids prefer codons ended with A, such as Gly (GGA), Arg (AGA), and Thr (ACA).This appears to be a manifestation of an evolutionary strategy for placement of genes in regions of the genome with a GC content that relates synonymous codon bias and protein folding.Similarly, codon usage biases in AD and other neurodegenerative diseases indicated that GCrich codons are mainly in charge of forming contracted conformation, especially the first nucleotide of codons plays a dominant role in translating the genomic GC signature into protein sequences and structures [11].Our previous research has revealed that conformation biases of amino acids are present in natural proteins and the corresponding biases of codons show an evident tendency in protein folding [26].
Meanwhile, accumulative evidences suggest that human virus infections can induce diabetes, including diabetes Type 1 and Type 2. The former is caused by progressive destruction of pancreatic β-cells, leading to insulin deficiency.There is extensive epidemiological evidence that viral infections, among other environmental agents, may contribute to the pathogenesis of Type 1 and several enteroviruses can infect human β-cells, resulting in functional impairment or cell death [111].Likewise, insulin resistance, caused by HCV infection, evolves to type 2 diabetes when superimposed on a high-fat diet and obesity.That means there exists a correlation exists between HCV infection and diabetes [112].For the virus infections that can induce diabetes, Coxsackievirus serotype B (CVB) and hepatitis C virus (HCV) will be discussed [113].CVB belongs to the species Human en- terovirus-B (HEV-B), a member of the family Picornaviridae.There are six serotypes of CVB and a large number of substrains of these small non-enveloped, single-stranded positive-sense RNA viruses.Up to now, the codon usage research of CVB is lacked.

CONCLUSION
Blood glucose concentration is very important, disorder of glucose metabolism will lead to a series of serious diseases, especially diabetes.DRPs represent Lys as the most abundant amino acid while GMRPs show Leu-rich and Trp-poor character.Especially, the percentage of Trp can distinguish between type 1and type 2 diabetes mellitus.Amongst the aromatic entities, Tryp-tophan was found to possess the most amyloidogenic potential.Therefore, targeting aromatic recognition interfaces by tryptophan could be a useful approach for inhibiting the formation of amyloids and a treatment strategy for several human diseases associated with amyloid formation, including Alzheimer's disease, Park-inson's disease, diabetes, etc.Moreover, the usage biases of codons in DRPs'genes depend on GC contents to a great extent, in concord with all codons of the highly expressed genes with the terminal of C/G.The deficit of CpG dinucleotides in diabetes related gene coding sequences is largely attributed to the hypermutability of methylated CpGs to UpGs by the mutational pressure.This helps to treat and control diabetes at gene level by site mutant.Additionally, besides a common node insulin receptor, there are some similar node proteins in DRP interaction network (DRPIN) and GMRP interaction network (GMRPIN), such as glucose transporter members GLUT4 and GLUT1, protein tyrosine phosphatases PTPRN and PTP1B, and adipose metabolism signal proteins LIPL and APOA2.Moreover, GMRPIN's node protein PPARγ could regulate expression of the following genes: ACSL1, adiponectin, ATGL, and DGAT1, as well as the expression of DRPIN's node proteins GPDM and LIPL.The sharing proteins involve the following signal pathway, such as glucagon receptor, amylin receptor, insulin, PPARγ, angiopoietin, PC-1/ ENPP1, and adiponectin mediated signal pathway.Moreover, most of gene sequences of 15 node proteins in DRPs' interaction network involved the binding sites of 37 transcription factors divide into four kinds of superclasses, such as basic domain (such as C/EBP), beta-Scaffold factors (i.e.NF-κB), helix-turn-helix (i.e.HNF), and zinc-coordinating DNA-binding domains (i.g.GATA-1).The leucine zipper factor family C/EBP distinguishes a nine-nucleotide motif of T-[1]-C- [1]-G/C-[1]-A-Pu-T, while Cys4 zinc finger family GATA-1 binding site displays two reverse symmetry motifs, such as PyTATC-A/T and A/T-GATA-Pu.Especially, BAD complex can integrate pathways of glucose metabolism and apoptosis by BH3 domain of BAD directly interacting with GK because of BAD as a substrate for the PKA-PP1 pair, regardless of phosphorylated and dephosphorylated BAD.
In conclusion, we try to systematically analyze the basic parameters, interactions, pathways, and networks of proteins related to diabetes and glucose metabolism on mutil-scale and mutil-level, including the amino acid compositions, codon biases, protein-protein interaction network, transcription factor binding sites, and BAD complex structure.This facilitates the discovery of treatment for diabetes mellitus and is helpful for the prediction and evaluation of potential diabetes targets.They are also conducive to the treatment of Alzheimer's disease.

Figure 1 .
Figure 1.Content percentage of 20 standard amino acids in sixteen types of proteins possessing various biology activities, including diabetes-related proteins and glucose metabolism-related proteins.

Figure 2 .Figure 3 .
Figure 2. The effective number of codons (N c ) used in a gene plotted against the G+C content at the synonymously variable third position (GC3s) for some genes.a) DRPs; b) human GMRPs; c) mouse GMRPs.The solid curve indicates the expected N c value if bias is due to GC3s alone when codens are random used; the dispersed points the corresponding actual N c value at certain GC 3s value.

Figure 5 .
Figure 5. Interaction network of HXK 1 (green dot) and the diabetes-related proteins by Osprey 1.2.0 (91).HXK1 interacts with O43768 (ENSA_HUMAN, Alpha-endosulfine) by P09104 (ENOG_HUMAN, gamma enolase, Neuron-specific enolase).Here, red dots donate the proteins related to diabetes and the bigger ones do hub proteins of network, whilst yellow dots express the proteins whose anti-diabetes activities have not been reported until now, they nevertheless exist in the network.However, the smaller red dots at right corner under the figure don't exist in the above network although they are related to diabetes.Moreover, the blue lines show the interaction among diabetes-related proteins by the other proteins while the purple lines figure the direct interactions between diabetes-related proteins.And the indirect interactions of the diabetes-related proteins are connected by less than three other proteins during building of the interaction network.

Figure 7 .
Figure 7. Protein-protein interaction network of GMRPs by Osprey 1.2.0.Here, red dots donate the proteins related to human glucose metabolism; bule dots don't exist in the above network.

Figure 8 .
Figure 8.Molecular modeling of BAD complex.a) a four-member complex model of human BAD (orange)with GK (green), PKAc (cyan), and PP1c (red); b) a nine-member complex model of double four-member complex with a peptide of WAVE1, here, BAD colored by orange and blue, respectively, GK (green and purple), PKAc (cyan and gray), and PP1c (red and white).

Figure 9 .
Figure 9. Potential signal pathway between glucose and BAD.

Table 1 .
Content percentage of 20 types of standard amino acids in a series of proteins distributed over various species.
Note: DRPs and GMRPs mean proteins related to diabetes and to glucose metabolism, respectively.(P < 0.01).

Table 2 .
Content percentage and RSCU of 64 types of codens in a series of proteins possessing various biology activities.

Table 3 .
Codon usages of high and low expression genes of diabetes-related proteins.a High and Low refer to subsets of genes from the extremities of the correspondence analysis axis 1. N=Number of codons; and RSCU=Relative synonymous codon usage values." * " codons are those more frequent (P<0.01) in highly expressed genes, designated as optimal codons.

Table 4 .
Classification of transcription factors.

Table 5 .
Usage biases of Codons of BAD complex.