Using Formants to Extract Short Vowels from Arabic Words with (Consonant Vowel)3 Structure


Arabic texts usually omit short vowels. Arabic speech recognition lags behind English speech recognition partly because these short vowels are not recognized. Arabic also differs from English in characteristics such as the number of vowels: English has more than 24 vowels that are close to each other in pronunciation, whereas Arabic has only three short vowels that are far apart in both articulation and measurement; the long vowels arise by lengthening the short ones. Earlier research has reported that vowels can be recognized from their formants. Because the formant measurements of the Arabic vowels are also far apart, the vowels can be told apart, which could make Arabic speech recognition more accurate. This paper applies the idea to the Phonemes of Arabic corpus. It uses the Euclidean distance between formant values to recognize Arabic short vowels in words with a CV3 structure, and it uses Linear Predictive Coding and MATLAB programs to extract the formants and to calculate the formant means of the short vowels from the corpus, which are then used to identify the short vowels within the corpus words. The results showed that higher recognition rates for the short vowels in words are achieved when highly qualified readers are chosen to read the Arabic text. The paper shows that some characteristics of a language can be exploited for vowel recognition or to enhance existing speech recognition methods.

Share and Cite:

Alshaari, M. and Kepuska, V. (2021) Using Formants to Extract Short Vowels from Arabic Words with (Consonant Vowel)3 Structure. Journal of Computer and Communications, 9, 1-9. doi: 10.4236/jcc.2021.95001.

1. Introduction

Arabic is the sixth most widely spoken language in the world. Today there are three kinds of Arabic: Classical Arabic, Modern Standard Arabic (SA), and the many Arabic dialects. Classical Arabic is used in holy texts such as Al Quran and in linguistic studies. SA is the formal language of all Arab countries; it is used for official communications, news, and writing in schools. Most Arab countries have many dialects, and this form of Arabic is not written except on social media sites such as Facebook. This paper focuses only on written Standard Arabic, which is usually non-diacritized: the words lack the marks that appear above and below the letters to express the short vowels. In other words, Arabic has short vowels that are usually omitted from the text. Arabic is a Semitic language and English is not, so the two differ in many respects, vowels among them: English has many more vowels than Arabic, which has only six (three short and three long). Therefore, applying English speech processing theories, methods, and tools to Arabic and expecting the same results leads to fundamental errors in transcribing Arabic speech.

2. The International Phonetic Alphabet (IPA)

The IPA [1] provides a notation for the speech sounds humans use in any language. The notation describes vowels by the position of the tongue (low or high), the shape of the lips, and the opening of the mouth. Phoneticians such as Daniel Jones [2] have represented all vowels on a triangle chart (Figure 1).

The /a/ vowel is produced with the tongue very low, at the bottom of the mouth. When the tongue is high, at the top of the mouth, the vowel produced is /i/. When the tongue is far back and very high and the lips are rounded, the vowel produced is /u/ (Figure 2).

Figure 1. Daniel Jones triangle chart.

Figure 2. Position and shape of vocal tract for /a/, /i/ and /u/ vowels.

3. Arabic Short Vowels

Arabic has only three short vowels [3]. On the cardinal vowel chart (Figure 1), the nearest vowels are those located at the edges of the chart: /a/, /i/, and /u/. The short vowels are illustrated in Table 1.

4. Linear Predictive Coding (LPC) and Formants

LPC is one of the most effective and valuable methods for speech analysis. It is used mostly in audio and speech signal processing and for encoding good-quality voice at a low bit rate, and it provides highly accurate estimates of speech parameters. “Several authors have therefore investigated formant frequencies as speech recognition features, using various methods for basic analysis, such as linear prediction.” [1] The formants, which are the resonant frequencies of the vocal tract, are the most important feature for classifying a specific vowel.
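To make the link between LPC and formants concrete, the following is a minimal Python sketch of one common textbook approach: fit an all-pole model with the autocorrelation method, then read candidate formants from the angles of the model's complex poles. It is not the authors' MATLAB program. The pre-emphasis coefficient (0.97), the Hamming window, the model-order rule of thumb, and the frequency and bandwidth thresholds are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC solved with the Levinson-Durbin recursion.

    Returns a[0..order] with a[0] = 1, the coefficients of the
    prediction-error filter A(z) = 1 + a1*z^-1 + ... + ap*z^-p.
    """
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def estimate_formants(frame, fs, order=None):
    """Return candidate formant frequencies (Hz), sorted in ascending order."""
    if order is None:
        order = 2 + int(fs // 1000)  # assumed rule of thumb, not from the paper
    frame = np.asarray(frame, dtype=float)
    # Pre-emphasis (assumed coefficient 0.97) followed by a Hamming window
    emphasized = np.append(frame[0], frame[1:] - 0.97 * frame[:-1])
    windowed = emphasized * np.hamming(len(emphasized))
    a = lpc_coefficients(windowed, order)
    # Keep one pole from each complex-conjugate pair
    poles = np.roots(a)
    poles = poles[np.imag(poles) > 0]
    freqs = np.angle(poles) * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(poles)) * fs / np.pi
    # Assumed thresholds: drop very low or very broad resonances
    keep = (freqs > 90.0) & (bandwidths < 400.0)
    return np.sort(freqs[keep])
```

Under these assumptions, the first two values returned for a vowel frame would serve as its F1 and F2 estimates. In practice the thresholds and the model order usually need tuning per speaker and sampling rate.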

5. Formants of Arabic Vowels

The Phonemes of Arabic corpus [4] was used to calculate the formant means of the Arabic vowels from isolated vowels. It was also used to extract vowels from Arabic words with CV3 structure (see Table 2).

6. Measuring the Mean of Short Vowels’ Formant Values for All the Corpus Readers

The mean Formant1 (F1) and Formant2 (F2) were calculated for short vowels [5]. See Figure 3. The standard deviation (SD) and coefficient of variation (CVar) were calculated to examine whether the formants are accurate and reliable, and the results are illustrated in Table 3.

Table 1. Arabic short vowels.

Table 2. List of Arabic CV3 words.

Figure 3. Formants’ mean of Arabic short vowels.

Table 3. The mean, SD and CVar of F1 and F2 for short vowels.

All CVar values are less than 1, which is considered low and indicates that the measurements tend to be close to the mean. Therefore, the results are precise and reliable. In addition, the formants of the Arabic short vowels /a/, /i/, and /u/ are divergent. Therefore, any Arabic short vowel can be recognized if its F1 and F2 are close to the mean F1 and F2 of /a/, /i/, or /u/.
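The three statistics discussed here can be computed as in the short Python sketch below. The measurements in the example are hypothetical and are not taken from Table 3. The sketch also assumes the sample standard deviation, since the paper does not say whether the sample or the population form was used.

```python
import statistics

def summarize(values):
    """Return (mean, SD, CVar) for a list of formant measurements in Hz."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption, see the lead-in
    cvar = sd / mean               # coefficient of variation = SD / mean
    return mean, sd, cvar

# Hypothetical F1 measurements of one vowel across several readers
f1_values = [690.0, 710.0, 655.0, 730.0, 700.0]
mean, sd, cvar = summarize(f1_values)
print(f"mean={mean:.1f} Hz, SD={sd:.1f} Hz, CVar={cvar:.3f}")
```

Because CVar divides the SD by the mean, it is unitless, which makes the spread of F1 and F2 comparable even though F2 values are numerically larger.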

7. Recognizing the SVs in Arabic Words with CV3 Structure

The CV3 structure has four patterns: /aaa/, /aia/, /aua/, and /uia/. The corpus contains 24 CV3 words, each recorded 3 times by each of 18 readers, which gives 72 recordings per reader and 1296 recordings in total [6]. MATLAB was used to develop a program that recognizes the short vowels (SVs) in the recorded words by measuring the distances between the words’ formants and the calculated mean F1 and F2 of the SVs (/a/, /i/, and /u/). The program proceeds as follows:

• Read the calculated means of F1 and F2 for /a/, /i/, and /u/, and the F1 and F2 values of the CV3 words.

• Calculate the distances between the mean F1 and F2 of /a/, /i/, and /u/ and the F1 and F2 values of every CV3 word.

• Put the distances related to each reader in a separate row.

• Find the minimum distance in each row, where the entries are the distances of one SV to the mean values; the minimum indicates that the SV is of the same type as that mean.
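The steps above amount to a nearest-mean classifier in the (F1, F2) plane. The Python sketch below illustrates that idea; it is not the authors' MATLAB program. The mean values in `VOWEL_MEANS` are hypothetical placeholders, not the values from Table 3, and the function names are invented for the example.

```python
import math

# Hypothetical mean (F1, F2) values in Hz for each short vowel.
# Replace them with the means measured from the corpus.
VOWEL_MEANS = {
    "a": (700.0, 1400.0),
    "i": (300.0, 2200.0),
    "u": (350.0, 900.0),
}

def classify_vowel(f1, f2, means=VOWEL_MEANS):
    """Return the vowel whose mean (F1, F2) is nearest in Euclidean distance."""
    distances = {
        vowel: math.hypot(f1 - m1, f2 - m2)
        for vowel, (m1, m2) in means.items()
    }
    return min(distances, key=distances.get)

def classify_word(formant_pairs, means=VOWEL_MEANS):
    """Classify each (F1, F2) pair of a CV3 word and return the vowel pattern."""
    return "".join(classify_vowel(f1, f2, means) for f1, f2 in formant_pairs)

# Three hypothetical (F1, F2) measurements, one per vowel of a CV3 word
print(classify_word([(680.0, 1350.0), (320.0, 2100.0), (720.0, 1450.0)]))  # prints "aia"
```

One design point worth noting: F2 spans a wider range in Hz than F1, so an unweighted distance is dominated by F2 differences. Whether the two formants should be scaled before measuring distance is left open here.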

8. The Results and the Analysis

All 1296 CV3 recordings were entered into the program. Table 4 shows the SV recognition percentage for each CV3 pattern.

Table 5 shows the recognition rates of each SV (/a/, /i/, and /u/) across all CV3 words.

Tables 6(a)-(d) show the recognition rates of the SV patterns in the CV3 words for each reader. The column titles are the CV3 words and their patterns, the row titles are the reader numbers, and each cell shows the recognition percentage of the SVs in the word named in the column title as recorded by the reader named in the row title.
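Rates of the kind reported in Tables 4 and 5 can be tallied from pairs of expected and recognized patterns, as in the Python sketch below. The paper does not spell out exactly how its percentages were computed, so both definitions here are assumptions: a per-pattern rate that counts a word as recognized only when all three vowels match, and a per-vowel rate counted position by position. The sample data is made up.

```python
from collections import defaultdict

def pattern_rates(results):
    """Percentage of words per expected pattern whose whole pattern matched.

    results: iterable of (expected, recognized) strings such as ("aia", "aua").
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for expected, recognized in results:
        totals[expected] += 1
        hits[expected] += int(expected == recognized)
    return {p: 100.0 * hits[p] / totals[p] for p in totals}

def vowel_rates(results):
    """Percentage of individual vowels recognized, grouped by expected vowel."""
    hits, totals = defaultdict(int), defaultdict(int)
    for expected, recognized in results:
        for exp_v, rec_v in zip(expected, recognized):
            totals[exp_v] += 1
            hits[exp_v] += int(exp_v == rec_v)
    return {v: 100.0 * hits[v] / totals[v] for v in totals}

# Made-up outcomes for four recordings
sample = [("aaa", "aaa"), ("aaa", "aaa"), ("aia", "aua"), ("aia", "aia")]
print(pattern_rates(sample))  # {'aaa': 100.0, 'aia': 50.0}
print(vowel_rates(sample))    # {'a': 100.0, 'i': 50.0}
```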

From the tables, the following results can be concluded:

- The recognition rates of SVs in words with the /aaa/ pattern are as follows:

• 90% were recognized at a rate of 100%.

• 96.3% were recognized.

• 4% were not recognized.

- The recognition rates of SVs in words with the /aia/ pattern are as follows:

• 36% were recognized at a rate of 100%.

• 68% were recognized.

• 32% were not recognized.

- The recognition rates of SVs in words with the /aua/ pattern are as follows:

• 28% were recognized at a rate of 100%.

• 66% were recognized.

• 34% were not recognized.

- The recognition rates of SVs in words with the /uia/ pattern are as follows:

• 25% were recognized at a rate of 100%.

• 52% were recognized.

• 48% were not recognized.

It is noted that the recognition was affected by the type of pattern and the skills of the readers. The pattern with the best recognition was /aaa/, and the reader with the best results was reader number 10. This highlights the importance of choosing a qualified reader when formants are used to add diacritics to Arabic texts.

Table 4. SVs recognition rate in every word’s pattern.

Table 5. Recognition rate of the SVs in all CV3 words.

Table 6. (a)-(d) Recognition rate of SVs’ patterns in all CV3 words assigned to reader.

9. Conclusions

Most Arabic texts lack SVs, which causes problems in learning Arabic as a second language. Arabic speech recognition is supposed to tackle that problem. However, existing systems use the same methods that are applied to English, so the results are not as accurate for Arabic as they are for English. Arabic has characteristics that Arabic speech recognition does not exploit: it has only three short vowels, and the F1 and F2 values of each are always divergent and follow fixed patterns.

This paper addressed the problem by building on previous studies, which reported that vowels can be identified by their formants, and on the characteristics mentioned above. To recognize the Arabic SVs, the Euclidean distance was used: the distances between the F1 and F2 of an Arabic input sound or CV3 word and the mean F1 and F2 of /a/, /i/, and /u/ are calculated, and the smallest distance shows to which type each input belongs.
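Written out, the decision rule described in this paragraph takes the following form. The notation is introduced here for illustration and does not appear in the paper: $F_1$ and $F_2$ are the formants measured for one input vowel, and $\bar{F}_{1,v}$ and $\bar{F}_{2,v}$ are the corpus means for vowel $v$.

```latex
d_v = \sqrt{\left(F_1 - \bar{F}_{1,v}\right)^2 + \left(F_2 - \bar{F}_{2,v}\right)^2},
\qquad v \in \{a, i, u\}

\hat{v} = \arg\min_{v \in \{a, i, u\}} d_v
```

For a CV3 word, the rule would be applied once per vowel, giving a three-symbol pattern such as /aia/.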

CV3 words were examined in the MATLAB environment. The necessary programs were designed, implemented, and then tested. The best results are as follows:

• CaCaCa pattern was recognized with a rate of 94.44%

• CaCiCa pattern was recognized with a rate of 55.56%

• CaCuCa pattern was recognized with a rate of 94.44%

• CuCiCa pattern was recognized with a rate of 88.89%

The results show that the idea and the method work and can be applied to other languages that have a limited number of short vowels with divergent formants.

Recognition is influenced by the type of pattern and the experience of the reader. This shows the importance of selecting qualified readers to pronounce Arabic text when formants are used to add diacritics to texts that lack them.

10. Future Work

Here are some directions for future work:

• Apply the idea to a corpus of children and women.

• See whether other languages have characteristics that allow them to benefit from the method applied in this research.

• Find a method to help segment words into letters, which would help considerably in giving more precise results. An artificial neural network (ANN) may be suitable for that goal.

• Combine the method applied in this research with statistical methods; doing so should give more precise results.

• Extend this research to a large vocabulary with different patterns, and develop the method to deal with continuous speech.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.


[1] The International Phonetic Alphabet (2005) IPA: Vowels.
[2] Jones, D. and Wells, J. (2015) Vowel Triangle, Cardinal Vowels.
[3] Mubark, A. (2013) Arabic Sounds from Alphabetical to Vocal Order. Damascus Univ. Mag, 29, no. 3+4.
[4] Alshaari, M., ElHarati, H. and Kepuska, V. (2020) Phonemes of Arabic.
[5] Kepuska, V. and Alshaari, M. (2020) Using Formants to Compare Short and Long Vowels in Modern Standard Arabic. Journal of Computer and Communications, 8, 96-106.
[6] Elharati, H.A., Alshaari, M. and Këpuska, V.Z. (2020) Arabic Speech Recognition System Based on MFCC and HMMs. Journal of Computer and Communications, 8, 28-34.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.