A Comparison of Classifiers in Performing Speaker Accent Recognition Using MFCCs


An algorithm based on Mel-Frequency Cepstral Coefficients (MFCCs) is presented for signal feature extraction in the task of speaker accent recognition. Several classifiers are then compared on the MFCC features. For each signal, the mean vector of its MFCC matrix is used as the input vector for pattern recognition. A sample of 330 signals, comprising 165 US voices and 165 non-US voices, is analyzed. In this comparison, k-nearest neighbors yields the highest average test accuracy over a cross-validation of size 500, and it also requires the least computation time.
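The pipeline summarized in the abstract (MFCC matrix, then per-signal mean vector, then k-nearest neighbors scored by average test accuracy) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. Several things here are assumptions: the MFCC matrices are synthetic Gaussian stand-ins (`synthetic_mfcc` is a hypothetical helper), the number of coefficients, `k`, and the test fraction are arbitrary, and "cross-validation of size 500" is read as repeated random train/test splits, reduced to 20 to keep the run short.

```python
import random
from collections import Counter
from math import dist


def mean_vector(mfcc_matrix):
    """Average an MFCC matrix (one row per frame) over frames, column by column."""
    n_frames = len(mfcc_matrix)
    return [sum(col) / n_frames for col in zip(*mfcc_matrix)]


def knn_predict(train_x, train_y, query, k=5):
    """Label a query vector by majority vote among its k nearest training vectors."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def average_test_accuracy(x, y, n_splits=500, test_fraction=0.3, k=5, seed=0):
    """Mean test accuracy over repeated random train/test splits (assumed scheme)."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    n_test = int(len(x) * test_fraction)
    accuracies = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_x, train_y, x[i], k) == y[i] for i in test_idx)
        accuracies.append(correct / n_test)
    return sum(accuracies) / n_splits


def synthetic_mfcc(rng, shift, n_frames=20, n_coeffs=12):
    """Hypothetical stand-in for a real MFCC matrix: Gaussian noise around `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(n_coeffs)] for _ in range(n_frames)]


rng = random.Random(42)
# 165 "US" and 165 "non-US" synthetic signals; only the sample sizes mirror the paper.
labels = ["US"] * 165 + ["non-US"] * 165
features = [mean_vector(synthetic_mfcc(rng, 0.0 if lab == "US" else 1.0)) for lab in labels]
accuracy = average_test_accuracy(features, labels, n_splits=20)
print(f"average test accuracy over 20 splits: {accuracy:.3f}")
```

With real recordings, `synthetic_mfcc` would be replaced by an actual MFCC extraction step, and the accuracy printed here says nothing about the results reported in the paper.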

Share and Cite:

Ma, Z. and Fokoué, E. (2014) A Comparison of Classifiers in Performing Speaker Accent Recognition Using MFCCs. Open Journal of Statistics, 4, 258-266. doi: 10.4236/ojs.2014.44025.

Conflicts of Interest

The authors declare no conflicts of interest.


Copyright © 2022 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.