Selection of Suitable Features for Modeling the Durations of Syllables - Journal of Software Engineering and Applications

Journal of Software Engineering and Applications

Volume 3, Issue 12 (December 2010)

ISSN Print: 1945-3116 ISSN Online: 1945-3124

Selection of Suitable Features for Modeling the Durations of Syllables

Pages: 1107-1117

DOI: 10.4236/jsea.2010.312129

Author(s)

Krothapalli S. Rao, Shashidhar G. Koolagudi

ABSTRACT

Acoustic analysis and synthesis experiments have shown that duration and intonation patterns are the two most important prosodic features responsible for the quality of synthesized speech. In this paper, a set of features that influence the duration patterns of a sequence of sound units is proposed. These features are derived from the results of a duration analysis, which provides a rough estimate of the features affecting the duration patterns. However, predicting durations from these features using either linear models or a fixed rulebase is not accurate. The analysis shows a gross trend in the durations of syllables with respect to syllable position in the phrase, syllable position in the word, word position in the phrase, syllable identity, and the context of the syllable (the preceding and following syllables). These features can be used to predict syllable durations more accurately by exploring various nonlinear models. For analyzing the durations of sound units, broadcast news data in Telugu is used as the speech corpus. The prediction accuracy of the duration models developed using rulebases and neural networks is evaluated using objective measures: the percentage of syllables predicted within a specified deviation, the average prediction error (µ), the standard deviation (σ), and the correlation coefficient (γ).
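
The objective measures named at the end of the abstract can be illustrated with a short sketch. The abstract does not give their formulas, so the definitions below are assumptions: deviation is taken as the absolute error relative to the actual duration, µ as the mean absolute error, σ as the standard deviation of the absolute errors, and γ as the Pearson correlation between actual and predicted durations. The function name `duration_metrics`, its parameters, and the threshold values are hypothetical and chosen only for illustration.

```python
import math

def duration_metrics(actual_ms, predicted_ms, thresholds=(10, 25, 50)):
    """Compute assumed versions of the four objective measures.

    actual_ms / predicted_ms: syllable durations in milliseconds.
    thresholds: deviation limits in percent (illustrative values).
    """
    n = len(actual_ms)
    if n == 0 or n != len(predicted_ms):
        raise ValueError("inputs must be non-empty and of equal length")

    # Absolute error per syllable and its size relative to the actual duration.
    abs_err = [abs(a - p) for a, p in zip(actual_ms, predicted_ms)]
    rel_dev = [100.0 * e / a for e, a in zip(abs_err, actual_ms)]

    # Percentage of syllables predicted within each deviation limit.
    within = {t: 100.0 * sum(d <= t for d in rel_dev) / n for t in thresholds}

    # Assumed definitions: mu = mean absolute error,
    # sigma = population standard deviation of the absolute errors.
    mu = sum(abs_err) / n
    sigma = math.sqrt(sum((e - mu) ** 2 for e in abs_err) / n)

    # Pearson correlation between actual and predicted durations.
    mean_a = sum(actual_ms) / n
    mean_p = sum(predicted_ms) / n
    cov = sum((a - mean_a) * (p - mean_p)
              for a, p in zip(actual_ms, predicted_ms)) / n
    sd_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual_ms) / n)
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted_ms) / n)
    gamma = cov / (sd_a * sd_p)  # undefined if either series is constant

    return {"within": within, "mu": mu, "sigma": sigma, "gamma": gamma}

# Toy durations, not data from the paper.
metrics = duration_metrics([100, 200, 150, 120], [110, 180, 150, 90])
```

Reporting the share of syllables within several deviation limits alongside µ, σ, and γ gives a fuller picture than any single number, since a model can have a low average error while still missing a noticeable fraction of syllables by a wide margin.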

KEYWORDS

Prosody, Syllable Duration, Syllable Position, Syllable Context, Syllable Identity, Feed Forward Neural Network

Share and Cite:

K. Rao and S. Koolagudi, "Selection of Suitable Features for Modeling the Durations of Syllables," Journal of Software Engineering and Applications, Vol. 3 No. 12, 2010, pp. 1107-1117. doi: 10.4236/jsea.2010.312129.

