
Stand-Alone Intelligent Voice Recognition System

PP. 179-190
DOI: 10.4236/jsip.2014.54019

ABSTRACT

This paper presents an expert system for security based on biometric human features that can be obtained without any contact with the registering sensor. These features are extracted from the human voice, so the system is called a Voice Recognition System (VRS). The proposed system combines three stages: signal pre-processing, feature extraction using the Wavelet Packet Transform (WPT), and feature matching using Artificial Neural Networks (ANNs). The feature vectors are formed in two steps: first, the speech signal is decomposed at level 7 with the Daubechies 20-tap (db20) wavelet; second, the energy corresponding to each WPT node is calculated, and these energies are collected to form a feature vector. A 128-feature vector for each speaker is fed to the Feed Forward Back-propagation Neural Network (FFBPNN). The data used in this paper are drawn from the English Language Speech Database for Speaker Recognition (ELSDSR), which comprises audio files for training and separate files for testing. The performance of the proposed system is evaluated using the test files. The results show that the rate of correct recognition of the proposed system is about 100% for the training files and 95.7% when one test file per speaker from the ELSDSR database is used. The proposed method gave better results than the well-known Mel Frequency Cepstral Coefficient (MFCC) and Zak transform methods.
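
The feature extraction step described above can be illustrated with a minimal sketch. A full wavelet packet decomposition at level 7 has 2^7 = 128 terminal nodes, so taking one energy per node yields 128 features. Note the assumptions: to stay self-contained, this sketch substitutes the two-tap Haar filter pair for the db20 filters used in the paper, it applies no pre-processing, and the function names `haar_split` and `wpt_energy_features` are hypothetical, not taken from the paper.

```python
import math


def haar_split(x):
    """One analysis step: split x into approximation and detail halves.

    Assumption: Haar filters stand in for the paper's db20 filters.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wpt_energy_features(signal, level=7):
    """Return the energy of every wavelet packet node at `level`.

    Unlike the plain wavelet transform, the packet transform splits
    both the approximation and the detail branch at every level, so
    the result has 2**level entries (128 for level 7). Nodes are in
    natural (tree) order, not sorted by frequency band.
    """
    if len(signal) == 0 or len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be a positive multiple of 2**level")
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    # Node energy: sum of squared coefficients in that node.
    return [sum(c * c for c in node) for node in nodes]


# Synthetic two-tone signal standing in for a pre-processed speech frame.
frame = [math.sin(0.05 * n) + 0.5 * math.sin(0.9 * n) for n in range(1024)]
features = wpt_energy_features(frame, level=7)
print(len(features))
```

Because the Haar pair is orthonormal, the node energies sum to the energy of the input frame, which is a convenient sanity check. To follow the paper more closely, the db20 filter coefficients would replace the Haar pair (PyWavelets, for example, exposes them as `pywt.Wavelet('db20').dec_lo` and `dec_hi`), with boundary handling added for the longer filters. The resulting vector would then be passed to the neural network classifier.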

Conflicts of Interest

The authors declare no conflicts of interest.

Cite this paper

Saady, M. , El-Borey, H. , El-Dahshan, E. and Yahia, A. (2014) Stand-Alone Intelligent Voice Recognition System. Journal of Signal and Information Processing, 5, 179-190. doi: 10.4236/jsip.2014.54019.


  

Copyright © 2019 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.