Robust Speech Endpoint Detection in Airplane Cockpit Voice Background

Hongbing CHENG; Ming LEI; Guorong HUANG; Yan XIA

doi:10.4236/wsn.2009.15059

Wireless Sensor Network > Vol.1 No.5, December 2009


A method for robust speech endpoint detection against airplane cockpit voice background noise is presented. Based on an analysis of the characteristics of the background noise, a complex Laplacian distribution model is established directly for the noisy speech. A likelihood ratio test based on a binary hypothesis test is then carried out. Incorporating inter-frame correlation into the conventional maximum a posteriori decision criterion leads to two separate thresholds. The endpoint detection decision for each frame therefore depends on both the previous frame and the observed spectrum, and the speech endpoints are searched for on the basis of these decisions. Compared with typical algorithms, the proposed method operates robustly in the airplane cockpit voice background.
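The decision chain summarized above (per-bin likelihood ratio, a threshold selected by the previous frame's decision, then an endpoint search) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything here is an assumption: the density uses a common textbook form of the complex Laplacian model, p(X) = (1/σ²) exp(−2(|X_R| + |X_I|)/σ), with σ² equal to the noise variance under the noise-only hypothesis and to the noise-plus-speech variance under the noisy-speech hypothesis. The function names, the frame-level averaging, and the threshold values are hypothetical and are not taken from the paper.

```python
import math

def log_likelihood_ratio(re, im, noise_var, speech_var):
    """Log likelihood ratio of one DFT bin under an assumed complex Laplacian model."""
    s0 = math.sqrt(noise_var)               # H0: noise only
    s1 = math.sqrt(noise_var + speech_var)  # H1: noisy speech
    mag = abs(re) + abs(im)
    return 2.0 * math.log(s0 / s1) + 2.0 * mag * (1.0 / s0 - 1.0 / s1)

def frame_score(bins, noise_var, speech_var):
    """Mean log likelihood ratio over a frame; bins is a list of (re, im) pairs."""
    total = sum(log_likelihood_ratio(r, i, noise_var, speech_var) for r, i in bins)
    return total / len(bins)

def detect(scores, thr_after_noise, thr_after_speech):
    """Per-frame speech decisions; the threshold depends on the previous decision."""
    decisions = []
    prev_is_speech = False  # assumption: the recording starts in noise
    for score in scores:
        thr = thr_after_speech if prev_is_speech else thr_after_noise
        prev_is_speech = score > thr
        decisions.append(prev_is_speech)
    return decisions

def endpoints(decisions):
    """Return (start, end) inclusive frame indices of each run of speech frames."""
    segments = []
    start = None
    for idx, is_speech in enumerate(decisions):
        if is_speech and start is None:
            start = idx
        elif not is_speech and start is not None:
            segments.append((start, idx - 1))
            start = None
    if start is not None:
        segments.append((start, len(decisions) - 1))
    return segments

# Hypothetical frame scores and thresholds, chosen only to show the mechanism.
scores = [0.1, 0.6, 1.2, 0.7, 0.6, 0.2, 0.1]
decisions = detect(scores, thr_after_noise=1.0, thr_after_speech=0.5)
segments = endpoints(decisions)
```

In this sketch the threshold applied after a speech frame is lower than the one applied after a noise frame, so a score of 0.6 is rejected while the detector is in noise but accepted once speech has started. That is one plausible reading of how inter-frame correlation yields two thresholds; the paper's own derivation and threshold values may differ.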

Keywords

Complex Laplacian Model, Maximum A Posteriori Criterion, Likelihood Ratio Test, Speech Endpoint Detection, Airplane Cockpit Voice

Share and Cite:

H. CHENG, M. LEI, G. HUANG and Y. XIA, "Robust Speech Endpoint Detection in Airplane Cockpit Voice Background," Wireless Sensor Network, Vol. 1 No. 5, 2009, pp. 489-495. doi: 10.4236/wsn.2009.15059.

Conflicts of Interest

The authors declare no conflicts of interest.

