
[IEEE 2009 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE) - Dalian, China (2009.09.24-2009.09.27)] 2009 International Conference on Natural


Analyzing the Seventh Vowel of Classical Arabic

Mubark Obaid AlQahtany1, Yousef Ajami Alotaibi1, Sid-Ahmed Selouani2

1Department of Computer Engineering, King Saud University, Saudi Arabia

2LARIHS Lab. Université de Moncton, Campus de Shippagan, Canada [email protected], [email protected], [email protected]

Abstract – The sounds of any language are generally classified into two categories: vowels, which involve no major restriction of airflow through the vocal tract, and consonants, which involve a significant restriction and are therefore weaker in amplitude and often "noisier" than vowels. Modern Standard Arabic contains six basic vowels, but linguists and researchers disagree on the exact number of vowels in Classical Arabic. We believe that Classical and Quranic Arabic contain an extra vowel in addition to the basic ones. This study analyzes this extra vowel, which occurs in a specific verse of The Holy Quran. We call it “the 7th vowel in Classical Arabic”. The first, second, third, and fourth formant values of this vowel are investigated using more than one Quranic recitation and more than one narrator, chosen among reciters who recite The Holy Quran accurately. The vowel is analyzed in both the time and frequency domains and compared acoustically with the basic Arabic vowels. The results of this analysis should facilitate Arabic speech processing tasks such as vowel and speech recognition and classification.

Keywords: The Holy Quran (THQ), Modern Standard Arabic (MSA), Classical Arabic (CA), Quranic Arabic (QA).

INTRODUCTION

Most learning of The Holy Quran (THQ) is still handled manually, through reading skills acquired under direct supervision (the talaqqi and mushafahah methods). These methods are described as a face-to-face teaching process between students and teachers. The techniques used by the learners are mainly based on listening and repetition. A wrong recitation is corrected by attempting, through many repetitions, to reach the teacher's correct pronunciation. This learning method has been used from the beginning of the Islamic era until now. It is very efficient, since students learn how the sounds (phonemes) of THQ and the letters of the Arabic alphabet are pronounced by watching the teacher and listening to the teacher's correct pronunciation. It also allows teachers to hear and notice their students' mistakes and to correct them by leading the students to repeat the correct recitation and by describing the exact mistake and its correction. This process is possible only if the teachers and students follow the art, rules, and regulations of correct THQ recitation, known as the “rules of Tajweed” [1].

Modern Standard Arabic (MSA) possesses six vowels, namely /a/, /i/, /u/, /aa/, /ii/, and /uu/. Arabic is a quantitative language in which sound duration is phonemic and semantically relevant. The vocalic system is thus composed of three short vowels, /a/, /i/, /u/, and their three long counterparts, /aa/, /ii/, /uu/ [2]. Classical Arabic (CA) and Quranic Arabic (QA) contain many variations and vowel-like sounds that can be noticed when THQ is recited. Some linguists and researchers treat these variations and allophones as new phonemes that exist in CA and/or QA but cannot be found in MSA. This paper focuses on the acoustic analysis of the unique vowel found in the sixth word, “it moves” (مجريها), of verse 41 of Chapter 11, "Houd" [3].

The rest of the paper is organized as follows. This section continues with background subsections on the Arabic language and the THQ reading dialects. It also introduces the “7th vowel” of the Arabic language and the formant analysis used to characterize this vowel. The second section explains the experimental framework, with subsections on the database, the file coding system, and the methodology used throughout our analysis experiments. The third section discusses the results obtained, and the last section concludes this work and indicates directions for future work.

A. Arabic Language

Arabic is a Semitic language and one of the oldest languages in the world. Currently it is the second language in terms of number of speakers [4]. Arabic is the first language in the Arab world, e.g., Saudi Arabia, Jordan, Oman, Yemen, Egypt, Syria, and Lebanon. The Arabic alphabet is also used to write several other languages, such as Persian and Urdu. Standard Arabic has basically 34 phonemes, of which six are vowels and 28 are consonants [5]. A phoneme is the smallest element of speech that indicates a difference in meaning, word, or sentence. Arabic has fewer vowels than English: it has three long and three short vowels, while American English has twelve vowels [6].


Arabic phonemes contain two distinctive classes, named pharyngeal and emphatic phonemes. These two classes can be found only in Semitic languages like Hebrew [5], [7]. The allowed syllables in Arabic are CV, CVC, and CVCC, where V indicates a (long or short) vowel and C indicates a consonant. Arabic utterances can only start with a consonant [5]. All Arabic syllables must contain at least one vowel. Arabic vowels cannot occur word-initially; they occur either between two consonants or as the final phoneme of a word. Arabic syllables can be classified as short or long: the CV type is short, while all others are long. Syllables can also be classified as open or closed: an open syllable ends with a vowel, while a closed syllable ends with a consonant. In Arabic, a vowel always forms a syllable nucleus, and there are as many syllables in a word as vowels in it [7]. With very few exceptions, alphabet-to-sound conversion for Arabic is a simple one-to-one mapping between orthography and phonetic transcription, provided the correct diacritics are given [8].
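The syllable rules stated above can be illustrated with a minimal Python sketch. It is not part of the original study; the function names and the C/V skeleton notation as an input string are assumptions made for illustration only, and the classification follows the rules exactly as worded in this subsection.

```python
# Allowed syllable skeletons as stated in the text (C = consonant, V = vowel).
ALLOWED_SYLLABLES = ("CV", "CVC", "CVCC")


def classify_syllable(pattern):
    """Classify one syllable skeleton using the rules stated in the text."""
    if pattern not in ALLOWED_SYLLABLES:
        raise ValueError(f"{pattern!r} is not one of {ALLOWED_SYLLABLES}")
    return {
        # Only CV is short; every other allowed type is long.
        "length": "short" if pattern == "CV" else "long",
        # Open syllables end with a vowel, closed ones with a consonant.
        "kind": "open" if pattern.endswith("V") else "closed",
    }


def count_syllables(skeleton):
    """A word has as many syllables as vowels, since each vowel is a nucleus."""
    return skeleton.count("V")


for pattern in ALLOWED_SYLLABLES:
    print(pattern, classify_syllable(pattern))
```

Because every syllable starts with a consonant and contains exactly one vowel nucleus, counting the V symbols of a word skeleton is enough to count its syllables.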

B. The Holy Quran and its Reading Dialects

The word "Quran" in Arabic means that which is recited, or that which is dictated in memory form. As such, it is not a book, nor is it something that reaches us only in written form. There are no different versions of THQ in the Arabic language, only different translations, and none of these is considered to hold the value and authenticity of the original Arabic recitation. THQ is divided into 30 equal parts, each called a "Juza’" (chapter) in Arabic [9]. THQ can be read in ten different reading dialects: seven from the Shatebiah way (Nafea AlMadni, Ibn Kather, Abu Amero AlBassery, Ibn Aamer AlShami, Assem AlKoofi, Hamzah AlKoofi, AlKessai AlKoofi) and three from the AlDorrah AlModiah way (Abu Jaafer AlMadni, Yaqoob AlBassery, and Khalafe bin Hesham). Each Imam has two narrators.

Table 1 lists the ten reading dialects and their narrators. The two narrators of each Imam agree in their vowel pronunciation, except for the narrators of Nafea AlMadni (Qaloon and Warsh) and the narrators of Assem AlKoofi (Hafss and Shobah). We therefore have a total of twelve different types of reading dialects [10]. To clarify, Imam Nafea took his reading dialects from seventy readers among the scholars of Madinah. What two of these readers agreed on, he taught to others; a reading narrated by only one reader he kept to himself without teaching it to anyone. When Warsh came from Egypt, he presented one of his reading dialects to Nafea. Nafea found that Warsh's dialect matched one of the readings that had only a single reader. Nafea thus received a second confirmation for one of his personal reading dialects and decided to teach it publicly. As a result, the narrators of Nafea (Warsh and Qaloon) became slightly different [11].

TABLE 1

READING DIALECTS AND THEIR NARRATORS

Reading Dialect        Dialect Code   Narrator Name   Narrator Code   Way Code
Nafea AlMadani         01             Warsh           1               01
Nafea AlMadani         01             Qaloon          2               02
Ibn Kather             02             AlBazzie        1               02
Ibn Kather             02             Gounbul         2               02
Ibn Aamer AlShami      03             Hesham          1               02
Ibn Aamer AlShami      03             Ibn Thakwan     2               02
Yaqoob AlBassery       04             Ruwais          1               02
Yaqoob AlBassery       04             Roah            2               02
Abu Jaafer AlMadni     05             Ibn Wardan      1               02
Abu Jaafer AlMadni     05             Ibn Jamaz       2               02
Assem AlKoofi          06             Shoabah         1               02
Assem AlKoofi          06             Hafss           2               03
Hamzah AlKoofi         07             Khalafe         1               03
Hamzah AlKoofi         07             Khalad          2               03
AlKessai AlKoofi       08             Abu AlHarth     1               03
AlKessai AlKoofi       08             Hafss AlDory    2               03
Khalafe Bin Hesham     09             Isaac           1               03
Khalafe Bin Hesham     09             Edrees          2               03
Abu Amero AlBassery    10             AlDory          1               03
Abu Amero AlBassery    10             AlSosi          2               03

C. The 7th Vowel of Arabic Language

Most Arabic researchers and linguists agree that MSA has only six vowels, but we believe that classical Arabic has more. Our definition of classical Arabic includes old Arabic and the manner of reciting THQ. This difference in the vocalic system can be noticed when listening to the recitation of THQ in many chapters and verses. As shown in Table 1, THQ can be recited in several reading manners, which we refer to as reading dialects. One popular reading dialect is that of “Hafss Ann Assem”. All of the reading dialects are correct and recognized by Islamic literature and scholars. Each reader (who uses a specific dialect) has two narrators, as shown in Table 1. For any specific word or verse in THQ, the two narrators of a given reader may agree or disagree about the way of reading [12]. For example, Nafea AlMadani (Reading Dialect 01) has two narrators, Warsh and Qaloon, who disagree on the reading of the 7th vowel, the subject of this research: Warsh reads this vowel by Way 01, whereas Qaloon reads it by Way 02.

All vowels that exist in MSA also exist in CA. The exact number of vowels in CA is not clear and needs more research by both Arabic linguists and phoneticians. Researchers disagree on this subject; in this research we investigate a specific CA vowel that exists in CA but cannot be found in MSA. This vowel is present in the sixth word of the 41st verse of the Houd chapter of THQ (the word [مجراها], “its move”), as shown in Figure 1. In English translation, Prophet Noah, peace be upon him, said: “embark therein: in the name of Allah will be its (moving) course and its (resting) anchorage. Surely, my lord is Oft-forgiving, most Merciful” [3]. We call it in this research “the 7th vowel in CA”. This vowel is vocalized in different manners across readers and reading dialects. Specifically, the two narrators of any given reading dialect vocalize this vowel identically, except the narrators of Nafea AlMadni (Qaloon and Warsh) and the narrators of Assem AlKoofi (Hafss and Shobah). Therefore, we have twelve different types of reading dialects, as listed in Table 1. Depending on the reading dialect considered, this vowel is vocalized in three ways: first with the lowest tilt, second as the normal vowel /a/, and third with the highest tilt [10].

Figure 1: Verse 41 of Chapter 11 of THQ

D. Formant Analysis

Formant frequencies are defined as the resonance frequencies of the vocal tract. Formants are considered to be representative of the underlying phonetic knowledge of speech. It is well established that the first three formant frequencies, usually denoted F1, F2, and F3, are sufficient for perceptually identifying vowels. Even for the same phoneme, however, the formant frequencies vary widely, depending on the speaker and, in continuously spoken utterances, on the neighboring phonemes (the coarticulation effect) [8]. In our study the first three formants are used to characterize the vocalic system, including our assumed 7th vowel.
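The paper does not state which tool or algorithm was used to measure the formants. As an illustration only, the sketch below estimates resonance frequencies with autocorrelation-method linear prediction (LPC), a common textbook approach, using NumPy. The function names, the 8 kHz sampling rate, the 100 Hz bandwidth, and the synthetic test signal are all assumptions made for this example; the three target frequencies are the Way 01 averages reported later in the paper.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC. Returns a[0..order] with a[0] = 1."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:], with R[i, j] = r[|i - j|].
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    a = np.linalg.solve(r[lags], -r[1:])
    return np.concatenate(([1.0], a))


def formants_from_lpc(a, sample_rate):
    """Convert LPC polynomial roots to resonance frequencies in Hz."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]  # keep one root per conjugate pair
    return np.sort(np.angle(roots) * sample_rate / (2.0 * np.pi))


def synthetic_vowel(formants_hz, bandwidth_hz, sample_rate, n_samples):
    """Impulse response of an all-pole filter with the given resonances
    (a hypothetical stand-in for a real vowel recording)."""
    radius = np.exp(-np.pi * bandwidth_hz / sample_rate)
    poles = []
    for f in formants_hz:
        p = radius * np.exp(2j * np.pi * f / sample_rate)
        poles.extend([p, np.conj(p)])
    a = np.real(np.poly(poles))
    y = np.zeros(n_samples)
    for n in range(n_samples):
        acc = 1.0 if n == 0 else 0.0  # unit impulse input
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y


SAMPLE_RATE = 8000  # assumed for the example
signal = synthetic_vowel([565.0, 1676.0, 2665.0], 100.0, SAMPLE_RATE, 2000)
estimated = formants_from_lpc(lpc_coefficients(signal, order=6), SAMPLE_RATE)
print(np.round(estimated))
```

On real recordings one would normally apply pre-emphasis and a window to each frame, use a higher prediction order, and discard roots with very wide bandwidths; those steps are omitted here to keep the sketch short.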

EXPERIMENTAL FRAMEWORK

A. Database

We decided to use recorded THQ recitations because of the good vocalization of well-known reciters and the lack of a dependable CA corpus that contains the 7th vowel. Acoustic analysis was performed on 50 audio files downloaded from different official and authenticated websites [12], [13]. These audio files were recorded by many well-known male reciters, as listed in Table 2. All of these narrators recited the whole verse, in different recording environments, as can be noticed when playing back the audio files. An additional narrator was invited to record this verse in all of the different authentic reading dialects. His recitation was recorded with an IC recorder (SONY® ICD-UX71F). This vowel can be read in three ways in THQ: with the lowest tilt, as a normal vowel like /a/ in MSA, and with the highest tilt. All of these ways of reading were included in our audio files and covered evenly by the reciters.

The selected readers are among the best THQ reciters and have been approved and accepted by Islamic scholars. A THQ recitation normally passes a very tough and accurate certification before it is accepted for public use on authenticated websites. Such certification is also an essential requirement for use by TV and radio broadcasting stations around the world. All of these reciters are well-known prayer leaders in mosques in their countries and communities. All reading dialects listed in Table 1 are acceptable, and any reader can select one reading dialect in which to read THQ. This implies that once a recitation has been authenticated, the variation between repetitions by the same or any other reciter must be minimal. Because of these very strict and rigorous rules of pronunciation, an error in articulating any verse of THQ is considered a serious mistake and is easily identified and rejected immediately by listeners.

B. Files Coding

To organize the research and allow for its future expansion, the audio file names are coded in a specific format. Each audio file name consists of 11 digits of the form WWVVSSNTTRR.wav. The first pair of digits from the left (WW) represents the way of reading the intended vowel: lowest tilt is 01, normal vowel /a/ is 02, and highest tilt is 03. The second pair of digits (VV) represents the vowel targeted in the experiment. In our case this is the vowel in the sixth word of verse 41 of Chapter 11 of THQ, and it is coded as VV=01. The third pair of digits (SS) represents the reading dialect, and the following digit (N), the seventh, represents the narrator within that reading dialect. This digit is either 1 or 2 because every dialect of THQ has only two narrators. The next pair (TT) represents the trial of each reader for the same verse and dialect. The last pair of digits (RR) is the reference code of the reciter. Table 2 lists the readers and their assigned codes.
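The naming scheme lends itself to a small parser. The following Python sketch decodes a name of the form WWVVSSNTTRR.wav into its fields; the class name, function name, and the example file name are hypothetical and chosen only to illustrate the format described above.

```python
import re
from dataclasses import dataclass

# Way codes as defined in the text.
WAYS = {1: "lowest tilt", 2: "normal vowel /a/", 3: "highest tilt"}

# WW VV SS N TT RR, where the narrator digit N is restricted to 1 or 2.
_NAME_PATTERN = re.compile(r"^(\d{2})(\d{2})(\d{2})([12])(\d{2})(\d{2})\.wav$")


@dataclass(frozen=True)
class RecordingCode:
    way: int       # WW: way of reading the vowel
    vowel: int     # VV: targeted vowel (01 in this study)
    dialect: int   # SS: reading dialect code (Table 1)
    narrator: int  # N: narrator within the dialect (1 or 2)
    trial: int     # TT: trial number
    reciter: int   # RR: reciter reference code (Table 2)


def parse_recording_name(name):
    """Decode a WWVVSSNTTRR.wav file name into its six fields."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a WWVVSSNTTRR.wav name: {name!r}")
    return RecordingCode(*(int(group) for group in match.groups()))


# Hypothetical example: way 01, vowel 01, dialect 01, narrator 1,
# trial 01, reciter 02.
code = parse_recording_name("01010110102.wav")
print(code, WAYS[code.way])
```

Restricting the seventh digit to 1 or 2 in the pattern enforces the two-narrators-per-dialect constraint at parse time, so malformed names are rejected early.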

C. Methodology

In this research we consider the first three formants in studying the 7th vowel. The values of these formants in the various contexts of the vowel are analyzed and compared. The variations of the formant patterns (e.g., increasing, decreasing, and steady states) are also considered. In addition, changes in the fundamental frequency within the vowel under investigation are monitored. Any observations in frequency patterns other than the formants are also investigated in order to fully analyze this vowel. Our technique also makes it possible to find the configuration and alignment of the tongue (i.e., front, back, high, and/or low positioning) while this vowel is vocalized in each way of articulation. This characterization will allow us to conclude whether this vowel is a 7th vowel or just an allophone of one of the six basic vowels of MSA.

TABLE 2

READER NAMES AND ASSIGNED CODES

Reader Name                            Reader Code
Abdulrasheed Ibn AlShaekh Ali Sofi     01
Mahmoud Khaleel AlHossary              02
Yassen AlJazaeri                       03
Ali AlHothaifi                         04
Abdulbassed Abdulsamad                 05
Mohammad Hamdan                        06
Omar AlMagrebi                         07
Mohammad AlKintawy                     08
Yasser AlMazroei                       09
Mohammad Abu Sninah                    10
Mohammad Bin Saleem                    11
Ahmed Ali                              12
Mohammad Faroq                         13
Mohammad Ayoob                         14
Mostafa AlBana                         15
Abdullah Bassfer                       16
Adel AlKalbani                         17
Meshari AlAffasi                       18
Emad Zuhair Hafed                      19
Mohammad AlDwalki                      20

RESULTS

The goal is to investigate whether the 7th vowel exists from the acoustic and phonetic points of view. This vowel is compared to all MSA vowels in the frequency domain and the time domain, and by observing the formant values and trends.

As described by Quranic Arabic linguists, the ten different dialects of THQ realize the targeted vowel in three ways: with the lowest tilt (coded as 01), as a normal vowel /a/ (coded as 02), and with the highest tilt (coded as 03) [10]. The only reading that uses the lowest tilt is that of Warsh, narrator of Nafea. The readings that use the normal vowel /a/ are those of Qaloon, narrator of Nafea; AlBazzie and Gounbul, narrators of Ibn Kather; Hesham and Ibn Thakwan, narrators of Ibn Aamer AlShami; Ruwais and Roah, narrators of Yaqoob AlBassery; Ibn Wardan and Ibn Jamaz, narrators of Abu Jaafer AlMadni; and Shobah, narrator of Assem. The readings that use the highest tilt for the assumed 7th vowel are those of Hafss, narrator of Assem; Khalafe and Khalad, narrators of Hamzah AlKoofi; Abu AlHarth and Hafss AlDory, narrators of AlKessai AlKoofi; Isaac and Edrees, narrators of Khalafe bin Hesham; and AlDory and AlSosi, narrators of Abu Amero AlBassery [10].

Acoustically, we found that the major difference between these three ways of vocalization lies in the value of the second formant (F2). A prominent difference in the location of F2 was observed in all spectrograms of utterances from different reading dialects and readers. For the first manner of reading, where the lowest spectrum tilt is realized, F2 is at equal distance (in terms of frequency) from F1 and F3, as shown in Figure 2 (a). For the second type of reading, where the normal vowel /a/ is realized, F2 is close to F1, as shown in Figure 2 (b). For the third type of reading, where the highest spectrum tilt is realized, F2 is located near F3 and far from F1, as shown in Figure 2 (c).

Figure 2: Spectrogram plots of the 7th vowel for the ways of reading by Reader 02. Solid lines represent the formants’ variations (F1 is the lowest in frequency and F4 is the highest)

A. The Lowest Tilt (01)

Figure 3 shows formant values for several readers who use the lowest-tilt reading method, as given by the Warsh Ann Nafea dialect. The file name coding described above gives the full description of each audio sample in the figures. The average F2 is about 1676 Hz, which lies in the middle of the range between the average F1 and the average F3. The average values of F1 and F3 are 565 Hz and 2665 Hz, respectively. The average ratio of F2 to F1 (F2/F1) is about 2.97, and the average ratio of F3 to F2 is about 1.59. The average differences between F2 and F1 and between F3 and F2 are 1107 Hz and 1125 Hz, respectively.


Figure 3: Formant values (Hz) for lowest tilt by using selected samples of files

Figure 4: Formant values (Hz) for normal vowel /a/ by using selected samples of files

B. The Normal Vowel /a/ (02)

Figure 4 shows the formants for several readers who pronounced the normal vowel /a/. These readings are those of Qaloon Ann Nafea; AlBazzie and Gounbul, narrators of Ibn Kather; Hesham and Ibn Thakwan, narrators of Ibn Aamer AlShami; Ruwais and Roah, narrators of Yaqoob AlBassery; Ibn Wardan and Ibn Jamaz, narrators of Abu Jaafer AlMadni; and Shobah Ann Assem. The average value of F2 is 1070 Hz; it is closer to F1 and relatively far from F3. The average values of F1 and F3 are 606 Hz and 2853 Hz, respectively. The average ratio of F2 to F1 is 1.77, and the average ratio of F3 to F2 is about 2.67. The average differences between F2 and F1 and between F3 and F2 are 464 Hz and 1783 Hz, respectively.

C. The Highest Tilt (03)

Figure 5 shows the formants of readers who use the highest tilt. The identity of the readers can be inferred from the file naming system. The average value of F2 is about 2061 Hz; it approaches F3 and is far from F1. The average values of F1 and F3 are 493 Hz and 2748 Hz, respectively. The average ratio of F2 to F1 is 4.34, and the average ratio of F3 to F2 is 1.34. From these averages, the differences between F2 and F1 and between F3 and F2 are about 1568 Hz and 687 Hz, respectively.
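The position of F2 relative to F1 and F3 can be summarized with simple arithmetic on the average formant values quoted in subsections A to C. The sketch below is illustrative only; the function name and the normalized "position" measure, (F2 - F1) / (F3 - F1), are assumptions introduced here and are not defined in the paper.

```python
# Average F1, F2, F3 (Hz) per way of reading, as quoted in subsections A to C.
AVERAGE_FORMANTS_HZ = {
    "01 lowest tilt": (565.0, 1676.0, 2665.0),
    "02 normal /a/": (606.0, 1070.0, 2853.0),
    "03 highest tilt": (493.0, 2061.0, 2748.0),
}


def f2_summary(f1, f2, f3):
    """Ratios, differences, and the relative position of F2 between F1 and F3.

    position = 0 means F2 coincides with F1, 1 means it coincides with F3,
    and 0.5 means it is equidistant from both.
    """
    return {
        "F2/F1": f2 / f1,
        "F3/F2": f3 / f2,
        "F2-F1": f2 - f1,
        "F3-F2": f3 - f2,
        "position": (f2 - f1) / (f3 - f1),
    }


for way, (f1, f2, f3) in AVERAGE_FORMANTS_HZ.items():
    summary = f2_summary(f1, f2, f3)
    print(way, {key: round(value, 2) for key, value in summary.items()})
```

Note that these are ratios and differences of the averaged formants. They need not match averages of per-recording ratios exactly, so small deviations from the figures quoted in the text are expected.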

Figure 5: Formant values (Hz) for highest tilt by using selected samples of files


Figure 6: Formant ratios for all ways of reading by using selected samples of files

TABLE 3

MEAN AND STANDARD DEVIATIONS OF ALL WAYS OF READINGS (HZ)

Formant   Way 01           Way 02           Way 03
          Mean    Stdev    Mean    Stdev    Mean    Stdev
F1        566     40       609     40       493     94
F2        1673    162      1076    90       2061    173
F3        2698    120      2871    271      2748    132
F4        3739    134      3655    193      3847    270

DISCUSSION

Figure 6 depicts the ratio of F2 to F1 for the three reading methods as used by ten different readers. The average ratio is about 2.47 for the lowest tilt, about 1.77 for the normal vowel /a/, and 4.11 for the highest tilt. These results suggest that the F2/F1 ratio could constitute a relevant feature for discriminating between reading methods. According to Table 2, the readers coded 02, 11, and 12 use all three reading techniques (i.e., lowest tilt, normal vowel /a/, and highest tilt). We used their audio recordings in order to control for the effect of changing the speaker. The results obtained by analyzing the F2/F1 ratio for these speakers lead us to conclude that this feature is speaker-independent. In other words, for this specific vowel, the reading method can be identified by calculating the F2/F1 ratio alone. It is important to note that, for all three reading methods and for each speaker, we found that the fundamental frequency remains approximately constant. This implies that the pitch frequency plays no role in describing the different vocalizations of this 7th vowel across reading dialects. The realization of this particular vowel differs from that of the standard /a/ vowel, which confirms that it is possible to include the 7th vowel as an additional vowel in the Arabic vocalic system even if this vowel is very rare.

It is important to note that, owing to the high confidence in the quality of vocalization of THQ verses for any specific reading dialect, we consider a limited number of audio files and reciters sufficient to give complete information about the vocalic system. Table 3 gives statistics on the four formants for the three ways of reading. In addition, Figure 7 gives a histogram plot that allows an easy comparison of the first three formants for the three ways of vocalizing the 7th vowel.

Figure 7: Average F1, F2 & F3 values (Hz) for the three reading methods by using samples of reciters

CONCLUSION

In this paper, we have presented a new analysis of the classical Arabic vocalic system. The study was carried out by observing the formant structure of many utterances of the word “it moves” in the 41st verse of Chapter 11 (Houd) of The Holy Quran. Various recordings of famous narrators were analyzed for this purpose. This analysis leads us to conclude that a specific vowel is pronounced and validated by confirmed and experienced readers. This “very rare vowel” permits us to derive a feature that is relevant for discriminating between the Quran reading methods: the F2/F1 ratio. We have found that for the first manner of reading, where the lowest spectrum tilt is realized, F2 is at equal distance (in terms of frequency) from F1 and F3. For the second type of reading, where the normal vowel /a/ is realized, F2 is close to F1. For the third method of reading, where the highest tilt is realized, F2 is located near F3 and far from F1.

As future work, we are going to expand this research to compare this vowel acoustically and linguistically with the six MSA vowels. We also intend to incorporate the relevant feature we found for discriminating reading techniques into an automatic system that aims at evaluating the quality of Quran reading. This research may also be expanded to compare the three ways of vocalizing this 7th vowel with MSA vowels and with vowels in other languages such as English.
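As a rough illustration of how the F2/F1 feature might be used in such a system, the sketch below assigns a recording to the way of reading whose reported average F2/F1 ratio (about 2.47, 1.77, and 4.11 in the Discussion) is closest. The nearest-mean rule and the function name are assumptions made for this example; the paper does not specify a classifier.

```python
# Average F2/F1 ratio per way code, as reported in the Discussion.
MEAN_F2_F1_RATIO = {1: 2.47, 2: 1.77, 3: 4.11}

WAY_NAMES = {1: "lowest tilt", 2: "normal vowel /a/", 3: "highest tilt"}


def classify_way(f1_hz, f2_hz):
    """Return the way code whose mean F2/F1 ratio is nearest to f2_hz / f1_hz."""
    if f1_hz <= 0:
        raise ValueError("F1 must be positive")
    ratio = f2_hz / f1_hz
    return min(MEAN_F2_F1_RATIO, key=lambda way: abs(MEAN_F2_F1_RATIO[way] - ratio))


# Average F1 and F2 values (Hz) quoted in the Results for each way of reading.
for f1, f2 in [(565.0, 1676.0), (606.0, 1070.0), (493.0, 2061.0)]:
    way = classify_way(f1, f2)
    print(f"F1={f1:.0f} Hz, F2={f2:.0f} Hz -> way {way:02d} ({WAY_NAMES[way]})")
```

A practical system would need decision thresholds validated on held-out recordings and would have to account for the standard deviations in Table 3; the nearest-mean rule is only the simplest possible starting point.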

REFERENCES

[1] Zaidi Razak, Noor Jamaliah Ibrahim, Emran Mohd Tamil, Mohd Yamani Idna Idris, and Mohd. Yakub @ Zulkifli Bin Mohd Yusoff, “Quranic Verse Recitation Feature Extraction Using Mel-Frequency Cepstral Coefficient (MFCC),” in Proceedings of the 4th IEEE International Colloquium on Signal Processing and its Application (CSPA) 2008, 7-9 March 2008, Kuala Lumpur, Malaysia.

[2] M. M. Alghamdi, “A spectrographic analysis of Arabic vowels: A cross-dialect study,” Journal of King Saud University, Vol. 10, Arts(1), pp. 3-24, 1998.

[3] The Holy Quran and English Translation of the Meanings and Commentary, King Fahd Complex for the Printing of the Holy Quran, Madinah City, 1999.

[4] Muhammad Alkhouli, “Alaswaat Alaghawaiyah,” Daar Alfalah, Jordan, 1990 (in Arabic).

[5] J. Deller, J. Proakis, and J. H. Hansen, “Discrete-Time Processing of Speech Signal,” Macmillan, 1993.

[6] M. Elshafei, “Toward an Arabic Text-to-Speech System,” The Arabian Journal for Science and Engineering, vol. 16, no. 4B, pp. 565-83, Oct. 1991.

[7] R. Cole, M. Fanty, Y. Muthusamy, and M. Gopalakrishnan, “Speaker-Independent Recognition of Spoken English Letters,” International Joint Conference on Neural Networks (IJCNN), vol. 2, pp. 45-51, Jun. 1990.

[8] Sadaoki Furui, “Digital Speech Processing, Synthesis, and Recognition,” Marcel Dekker, Inc., 2001.

[9] Imanway Home Page, http://www.imanway1.com/

[10] Abdulfatah Alqadee, “AlBdoor AlZaherah in the Ten Authentic Quran Dialects from Shatebiah and AlDorah AlModiah,” Daar AlKetab Alarabi, Lebanon, 2005 (in Arabic).

[11] Website, http://www.quraat.com/

[12] Islamweb, http://www.islamweb.net/

[13] Islam Way Website, http://www.islamway.com/

[14] Hafiz Rizwan Iqbal, Mian M. Awais, Shahid Masud, and Shafay Shamail, “On Vowels Segmentation and Identification Using Formant Transitions in Continuous Recitation of Quranic Arabic,” chapter of “New Challenges in Applied Intelligence Technologies,” Springer Berlin, 2008, pp. 155-162.