Prediction of Peptides Binding to Major Histocompatibility Class II Molecules Using Machine Learning Methods

Abstract

In daily life, we are frequently attacked by infectious organisms such as bacteria and viruses. Major histocompatibility complex (MHC) molecules play an essential role in T-cell activation and in initiating an adaptive immune response. Developing methods for predicting MHC-peptide binding is therefore important in vaccine design and immunotherapy. In this study, we predict binding between peptides and MHC class II molecules. A support vector machine (SVM) and a multi-layer perceptron (MLP) are used for classification. Both classifiers operate on pseudo amino acid composition (PseAAC) features of the data, which we extracted from the PseAAC server. Because the dataset used in this work is imbalanced, we apply a pre-processing step that over-samples the minority class to overcome this problem. The results show that using pseudo amino acid composition together with over-sampling increases the performance of the predictor. Furthermore, the results demonstrate that combining PseAAC with an SVM is a successful method for predicting peptide binding to MHC class II molecules.
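The feature-extraction and over-sampling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the paper obtained its features from the PseAAC server, whereas this sketch computes a simplified Chou-style pseudo amino acid composition from a single property (the Kyte-Doolittle hydropathy scale, swapped in here as an assumption). The abstract does not name the over-sampling method, so a SMOTE-style interpolation between minority-class neighbours is assumed. The function names, the parameters `lam`, `w`, and `k`, and the toy peptides are all hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (an assumption; the paper used the PseAAC server).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mean = sum(scale.values()) / len(scale)
    std = math.sqrt(sum((v - mean) ** 2 for v in scale.values()) / len(scale))
    return {aa: (v - mean) / std for aa, v in scale.items()}


H = standardize(KD)


def pseaac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional PseAAC-style vector for one peptide.

    lam (number of sequence-order tiers) and w (their weight) are illustrative values.
    """
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared property difference between residues k positions apart.
        pairs = zip(seq, seq[k:])
        thetas.append(sum((H[a] - H[b]) ** 2 for a, b in pairs) / (n - k))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the chosen sample.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Made-up toy peptides standing in for the minority class (not from the paper's data).
binders = ["AKFVAAWTLKAAA", "GELIGILNAAKVPAD", "PKYVKQNTLKLAT"]
features = [pseaac(s) for s in binders]
extra = smote(features, n_new=4)
```

The original and synthetic vectors would then be passed, together with the majority-class vectors, to a classifier such as an SVM or MLP from a library of choice. Over-sampling should be applied to the training split only, so that synthetic points do not leak into evaluation.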

Share and Cite:

Faramarzi, F. , Beigi, M. , Botorabi, Y. and Mousavi, N. (2013) Prediction of Peptides Binding to Major Histocompatibility Class II Molecules Using Machine Learning Methods. Engineering, 5, 513-517. doi: 10.4236/eng.2013.510B105.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.