Neural Network Based Missing Feature Method For Text-Independent Speaker Identification

Ying WANG; Wei LU

doi:10.4236/ijcns.2010.31005

International Journal of Communications, Network and System Sciences > Vol.3 No.1, January 2010


The first step of missing feature methods in text-independent speaker identification is to identify highly corrupted elements of the spectrographic representation of speech as missing features. Most mask estimation techniques rely on explicit estimation of the characteristics of the corrupting noise and usually fail when that noise estimate is inaccurate. We present a mask estimation technique that uses neural networks to determine the reliability of spectrographic elements. The method requires no prior knowledge of the noise or of the prior probability of speech; it exploits only the characteristics of the speech signal. Experiments were performed separately on speech corrupted by stationary F16 noise and by non-stationary babble noise, from 5 dB to 20 dB, using the cluster-based reconstruction missing feature method. The proposed method achieves higher recognition accuracy than conventional spectral subtraction mask estimation methods.
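
For readers unfamiliar with mask estimation, the following is a minimal sketch of the conventional baseline the abstract compares against: a binary reliability mask derived from spectral subtraction. It is not the authors' neural network method. The function name, the assumption that the leading frames contain noise only, and the 0 dB threshold are all illustrative choices, not details taken from the paper.

```python
import math


def spectral_subtraction_mask(power_spec, noise_frames=2, snr_threshold_db=0.0):
    """Return a binary reliability mask (1 = reliable, 0 = missing).

    power_spec: list of frames, each a list of non-negative band powers.
    Hypothetical helper for illustration; parameters are assumptions.
    """
    n_bands = len(power_spec[0])
    # Assumption: the first `noise_frames` frames contain noise only,
    # so their per-band average serves as the noise power estimate.
    noise = [
        sum(frame[b] for frame in power_spec[:noise_frames]) / noise_frames
        for b in range(n_bands)
    ]
    mask = []
    for frame in power_spec:
        row = []
        for b in range(n_bands):
            # Spectral subtraction: estimated clean power, floored at zero.
            clean = max(frame[b] - noise[b], 0.0)
            if clean > 0.0 and noise[b] > 0.0:
                snr_db = 10.0 * math.log10(clean / noise[b])
            elif clean > 0.0:
                snr_db = math.inf
            else:
                snr_db = -math.inf
            # An element is marked reliable when its estimated local SNR
            # exceeds the threshold.
            row.append(1 if snr_db > snr_threshold_db else 0)
        mask.append(row)
    return mask
```

Because the mask depends entirely on the noise estimate, an inaccurate estimate mislabels elements, which is the weakness the abstract attributes to noise-dependent mask estimation.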

Keywords

Speaker Identification, Missing Feature Reconstruction, Mask Estimation, Neural Network

Share and Cite:

Y. WANG and W. LU, "Neural Network Based Missing Feature Method For Text-Independent Speaker Identification," International Journal of Communications, Network and System Sciences, Vol. 3 No. 1, 2010, pp. 43-47. doi: 10.4236/ijcns.2010.31005.

Conflicts of Interest

The authors declare no conflicts of interest.

