Semi-Global Inference in Phenotype-Protein Network


Discovering the genetic basis of diseases is an important goal and a challenging problem in bioinformatics research. Inspired by the network-based global inference approach, a semi-global inference method is proposed to capture the complex associations between phenotypes and genes. The proposed method integrates phenotype similarities and protein-protein interactions and establishes profile vectors for phenotypes and proteins. The relevance between each candidate gene and the target phenotype is then evaluated. Candidate genes are ranked by relevance score, and genes potentially associated with the target disease are identified from this ranking. The model selects nodes in the integrated phenotype-protein network for inference by applying a Phenotype Similarity Threshold (PST), which sheds light on how to select similar phenotypes for the gene prediction problem. Different vector relevance metrics for computing the relevance scores of candidate genes are discussed. The performance of the model is evaluated on Online Mendelian Inheritance in Man (OMIM) data sets, and the experimental evaluation shows that the proposed semi-global method outperforms existing global inference methods.
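
The pipeline summarized in the abstract (threshold the phenotypes, build profile vectors, score each candidate, rank) can be illustrated with a minimal Python sketch. The abstract does not specify how the profile vectors are built or which relevance metric performs best, so everything here is an assumption made for illustration: the function names, the dictionary layouts, the use of cosine similarity as the relevance metric, and the toy numbers. It is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(target, phenotype_similarity, protein_profiles, candidates, pst):
    """Rank candidate genes for a target phenotype (hypothetical helper).

    phenotype_similarity[target][p]: similarity between target and phenotype p.
    protein_profiles[gene][p]: assumed closeness of the gene's protein to the
        proteins already associated with phenotype p in the interaction network.
    pst: Phenotype Similarity Threshold; phenotypes below it are ignored.
    """
    # Semi-global step: keep only phenotypes similar enough to the target.
    selected = [p for p, s in phenotype_similarity[target].items() if s >= pst]

    # Profile vector of the target phenotype over the selected phenotypes.
    phenotype_vec = [phenotype_similarity[target][p] for p in selected]

    scores = {}
    for gene in candidates:
        # Profile vector of the candidate over the same selected phenotypes.
        gene_vec = [protein_profiles[gene].get(p, 0.0) for p in selected]
        scores[gene] = cosine(phenotype_vec, gene_vec)

    # Highest relevance score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only.
phenotype_similarity = {"P0": {"P0": 1.0, "P1": 0.8, "P2": 0.1}}
protein_profiles = {
    "G1": {"P0": 0.9, "P1": 0.7, "P2": 0.0},
    "G2": {"P0": 0.0, "P1": 0.1, "P2": 0.9},
}

ranking = rank_candidates("P0", phenotype_similarity, protein_profiles,
                          ["G1", "G2"], pst=0.3)
print(ranking)
```

With `pst=0.3`, phenotype `P2` is excluded from the inference, so only `P0` and `P1` contribute to the profile vectors. Setting `pst=0.0` would use every phenotype, which corresponds to a fully global setting in this toy formulation.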

Share and Cite:

Xia, S. , Quan, G. , Zhao, Y. and Jia, X. (2013) Semi-Global Inference in Phenotype-Protein Network. Engineering, 5, 181-188. doi: 10.4236/eng.2013.510B039.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.