Development of Application Specific Continuous Speech Recognition System in Hindi

Abstract

Application-specific voice interfaces in local languages will go a long way toward bringing the benefits of technology to rural India. The goal of this work is a continuous speech recognition system in Hindi tailored to aid the teaching of geometry in primary schools. This paper presents the preliminary work done toward that end. We used Mel Frequency Cepstral Coefficients (MFCCs) as speech features and Hidden Markov Models (HMMs) to model the acoustic features. The Hidden Markov Model Toolkit (HTK) 3.4 was used for both feature extraction and model generation. The Julius recognizer, which is language independent, was used for decoding. A speaker-independent system is implemented and results are presented.
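To illustrate the feature extraction step named in the abstract, the following is a minimal sketch of computing MFCCs for a single speech frame in plain Python. It is not HTK's implementation and does not reproduce its configuration: the sampling rate, frame length, number of filters, number of coefficients, and pre-emphasis factor are all assumed values chosen for illustration, and the function names are hypothetical.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert a mel value back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_filters, fft_size, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    num_bins = fft_size // 2 + 1
    top_mel = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge points: each filter spans three consecutive points.
    edges_hz = [mel_to_hz(top_mel * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    edges_bin = [hz * fft_size / sample_rate for hz in edges_hz]
    bank = []
    for m in range(1, num_filters + 1):
        left, centre, right = edges_bin[m - 1], edges_bin[m], edges_bin[m + 1]
        row = []
        for k in range(num_bins):
            if left < k <= centre:
                row.append((k - left) / (centre - left))    # rising slope
            elif centre < k < right:
                row.append((right - k) / (right - centre))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=20, num_ceps=12, pre_emphasis=0.97):
    """Return num_ceps cepstral coefficients (c1..c_num_ceps) for one frame."""
    n = len(frame)
    # Pre-emphasis boosts high frequencies (assumed factor 0.97).
    emphasised = [frame[0]] + [frame[i] - pre_emphasis * frame[i - 1]
                               for i in range(1, n)]
    # Hamming window reduces spectral leakage at the frame edges.
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(emphasised)]
    # Power spectrum via a direct DFT (slow, but dependency-free).
    power = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power.append(abs(acc) ** 2 / n)
    # Log mel filterbank energies, floored to avoid log(0).
    bank = mel_filterbank(num_filters, n, sample_rate)
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                    for row in bank]
    # DCT-II decorrelates the log energies; c0 is skipped.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(1, num_ceps + 1)]


# Usage: one 25 ms frame of a synthetic 440 Hz tone at an assumed 8 kHz rate.
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(200)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In a full front end, such a function would be applied to overlapping frames of the utterance, and the resulting vectors (often extended with delta coefficients) would serve as the observations for HMM training and decoding.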

Citation

G. Gaurav, D. Deiv, G. Sharma and M. Bhattacharya, "Development of Application Specific Continuous Speech Recognition System in Hindi," Journal of Signal and Information Processing, Vol. 3 No. 3, 2012, pp. 394-401. doi: 10.4236/jsip.2012.33052.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.