Text Classification Using Support Vector Machine with Mixture of Kernel

Abstract

Recent studies have shown that modern machine learning techniques, such as the support vector machine (SVM), have advantages over statistical models for text classification. In this study, we discuss the application of the support vector machine with mixture of kernel (SVM-MK) to the design of a text classification system. Unlike the standard SVM, the SVM-MK uses a 1-norm based objective function and adopts convex combinations of single-feature basic kernels. Only a linear programming problem needs to be solved, which greatly reduces the computational cost. More importantly, it is a transparent model, and the optimal feature subset can be obtained automatically. A real Chinese corpus from Fudan University is used to demonstrate the good performance of the SVM-MK.
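To make the phrase "convex combinations of single-feature basic kernels" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel on each individual feature and uses hand-picked weights; in the SVM-MK the weights would come out of the linear program, which is not reproduced here. The function names and the `gamma` parameter are hypothetical.

```python
import math

def feature_kernel(x, z, d, gamma=1.0):
    """Basic kernel that looks at feature d only.
    A Gaussian form is assumed here purely for illustration."""
    return math.exp(-gamma * (x[d] - z[d]) ** 2)

def mixed_kernel(x, z, weights, gamma=1.0):
    """Convex combination of the single-feature kernels:
    sum over d of weights[d] * k_d(x, z), with weights >= 0 summing to 1."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * feature_kernel(x, z, d, gamma)
               for d, w in enumerate(weights) if w > 0)

def selected_features(weights, tol=1e-12):
    """Indices of features whose kernel weight is non-zero."""
    return [d for d, w in enumerate(weights) if w > tol]

# Toy term-weight vectors for two documents (made-up numbers).
docs = [[0.0, 1.0, 3.0], [0.5, 1.0, 0.0]]

# Hand-picked weights; feature 1 has weight 0 and so plays no role.
weights = [0.5, 0.0, 0.5]

k = mixed_kernel(docs[0], docs[1], weights)
kept = selected_features(weights)
```

A feature whose weight is zero contributes nothing to the kernel, so the non-zero weights directly identify the feature subset in use. This is one way to read the abstract's claim that the model is transparent and selects features automatically.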

Share and Cite:

L. Wei, B. Wei and B. Wang, "Text Classification Using Support Vector Machine with Mixture of Kernel," Journal of Software Engineering and Applications, Vol. 5 No. 12B, 2012, pp. 55-58. doi: 10.4236/jsea.2012.512B012.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.