TITLE:
Semantic Similarity over Gene Ontology for Multi-Label Protein Subcellular Localization
AUTHORS:
Shibiao Wan, Man-Wai Mak, Sun-Yuan Kung
KEYWORDS:
Protein Subcellular Localization; Semantic Similarity; GO Terms; Multi-Label Classification
JOURNAL NAME:
Engineering, Vol.5 No.10B, October 25, 2013
ABSTRACT:
As one of the essential topics in proteomics and
molecular biology, protein subcellular localization has been extensively
studied in recent decades. However, most existing methods are limited to the
prediction of single-location proteins. In many studies, multi-location
proteins are either not considered or assumed not to exist. This paper proposes
a novel multi-label subcellular-localization predictor based on the semantic
similarity between Gene Ontology (GO) terms. Given a protein, the accession
numbers of its homologs are obtained via BLAST search. These homolog
accession numbers are then used as keys to search the
Gene Ontology annotation database and retrieve a set of GO terms. The semantic
similarity between GO terms is used to formulate semantic similarity vectors
for classification. A support vector machine (SVM) classifier with a new
decision scheme is proposed to classify the multi-label GO semantic
similarity vectors. Experimental results show that the proposed multi-label
predictor significantly outperforms state-of-the-art predictors such as
iLoc-Plant and Plant-mPLoc.
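
The pipeline described in the abstract (GO term sets, then semantic similarity vectors, then a multi-label decision) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, since the abstract does not give these details: the toy GO graph and its term IDs are hypothetical, the Jaccard overlap of ancestor sets stands in for whichever semantic similarity measure the paper uses, the best-match average is one common way to compare two term sets, and the decision rule (keep every class with a positive score, otherwise fall back to the top-scoring class) is only one plausible multi-label scheme, not necessarily the one proposed. The per-class scores would normally come from trained SVMs; here they are passed in directly.

```python
# Toy GO fragment: term -> set of parent terms.
# These IDs are hypothetical placeholders, not real GO accessions.
PARENTS = {
    "GO:A": set(),
    "GO:B": {"GO:A"},
    "GO:C": {"GO:A"},
    "GO:D": {"GO:B"},
    "GO:E": {"GO:B", "GO:C"},
}


def ancestors(term):
    """Return the term together with all of its ancestors in the toy graph."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(t1, t2):
    """Jaccard overlap of ancestor sets (an assumed stand-in measure)."""
    a1, a2 = ancestors(t1), ancestors(t2)
    return len(a1 & a2) / len(a1 | a2)


def set_similarity(terms_a, terms_b):
    """Best-match average between two sets of GO terms."""
    if not terms_a or not terms_b:
        return 0.0
    fwd = sum(max(term_similarity(a, b) for b in terms_b) for a in terms_a)
    bwd = sum(max(term_similarity(a, b) for a in terms_a) for b in terms_b)
    return (fwd + bwd) / (len(terms_a) + len(terms_b))


def similarity_vector(query_terms, training_term_sets):
    """One similarity value per training protein; this is the classifier input."""
    return [set_similarity(query_terms, terms) for terms in training_term_sets]


def multilabel_decision(scores):
    """Assumed rule: keep every class with a positive score, else the argmax."""
    positive = sorted(label for label, s in scores.items() if s > 0)
    if positive:
        return positive
    return [max(scores, key=scores.get)]
```

For example, `similarity_vector({"GO:D"}, [{"GO:D"}, {"GO:E"}])` compares a query protein annotated with one term against two training proteins, and `multilabel_decision({"chloroplast": 0.7, "mitochondrion": 0.2, "nucleus": -1.1})` would assign the protein to both chloroplast and mitochondrion. The fallback to the top-scoring class ensures every protein receives at least one location.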