JSIP, Vol. 1 No. 1, November 2010

Real Time Prosody Modification

PP. 50-62
DOI: 10.4236/jsip.2010.11006
Author(s)
Krothapalli Sreenivasa Rao




Real time prosody modification involves changing prosody parameters such as the pitch, duration and intensity of speech in real time without affecting its intelligibility and naturalness. In this paper, prosody modification is performed using the instants of significant excitation (ISE) of the vocal tract system during the production of speech. In the conventional prosody modification system, the ISE are computed using the group delay function, which is a computationally intensive task. We propose computationally efficient methods to determine the ISE that are suitable for prosody modification in interactive (real time) applications. The overall computation time of prosody modification with the proposed methods is compared with that of the conventional method, which uses the group delay function to compute the ISE.
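The abstract does not describe the proposed methods in detail, but the keywords point to the Hilbert envelope of the linear prediction (LP) residual. The following is a minimal Python sketch of that general idea only, and should not be read as the paper's algorithm: it derives an LP residual with the autocorrelation method, takes its Hilbert envelope, and picks the strongest envelope peaks as candidate excitation instants. All function names, the synthetic test signal, the filter coefficients, the LP order and the peak threshold are illustrative assumptions, and the code makes no attempt at real-time operation.

```python
import numpy as np

def synth_voiced(n_samples, period, first):
    """Hypothetical test signal: impulse train through a 2-pole resonator."""
    e = np.zeros(n_samples)
    e[first::period] = 1.0                      # excitation instants
    y = np.zeros(n_samples)
    for n in range(n_samples):
        y[n] = e[n]
        if n >= 1:
            y[n] += 1.3 * y[n - 1]              # assumed filter coefficients
        if n >= 2:
            y[n] -= 0.8 * y[n - 2]
    return y

def lp_residual(x, order=4):
    """LP residual using the autocorrelation method (normal equations)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)            # tiny regularization
    a = np.linalg.solve(R, r[1:])               # predictor coefficients
    pred = np.zeros(n)
    for k in range(order):
        pred[k + 1:] += a[k] * x[:n - k - 1]    # pred[m] += a_k * x[m-k-1]
    return x - pred

def hilbert_envelope(e):
    """Magnitude of the analytic signal, computed with the FFT."""
    n = len(e)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(np.fft.fft(e) * h)   # zero out negative frequencies
    return np.abs(analytic)

def pick_peaks(env, min_dist, rel_thresh=0.5):
    """Greedily keep the largest envelope values that are min_dist apart."""
    thresh = rel_thresh * env.max()
    picked = []
    for idx in np.argsort(env)[::-1]:           # strongest samples first
        if env[idx] < thresh:
            break
        if all(abs(int(idx) - p) >= min_dist for p in picked):
            picked.append(int(idx))
    return sorted(picked)

if __name__ == "__main__":
    # 800 samples with one excitation every 80 samples (assumed values)
    speech = synth_voiced(800, period=80, first=40)
    envelope = hilbert_envelope(lp_residual(speech, order=4))
    print(pick_peaks(envelope, min_dist=40))
```

On real speech the residual is far noisier than in this synthetic case, so a fixed relative threshold and a fixed minimum distance would likely need to be replaced by something pitch-adaptive and restricted to voiced regions.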


Instants of Significant Excitation, Group Delay Function, Voiced Region Detection, Hilbert Envelope, Linear Prediction Residual, Real Time Prosody Modification

Cite this paper

K. Rao, "Real Time Prosody Modification," Journal of Signal and Information Processing, Vol. 1 No. 1, 2010, pp. 50-62. doi: 10.4236/jsip.2010.11006.

Conflicts of Interest

The authors declare no conflicts of interest.




Copyright © 2020 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.