By ...Dhaifah AL-ammari
Wafa AL-shehri
Speech Recognition for Arabic
Arabic linguistic varieties
Table 1: Some differences between Modern Standard Arabic (MSA) and Egyptian Colloquial Arabic (ECA)
GLOSS         SCRIPT    ECA          MSA
'summer'      صيف       se:f         saif
'he speaks'   يتكلم     yitkallim    yatakallam
'table'       طاولة     tarabeeza    Tawila
Writing System
Arabic is written in script, from right to left. The alphabet consists of twenty-eight letters, twenty-five of which represent consonants. The remaining three letters represent the long vowels of Arabic and, where applicable, the corresponding semivowels. Each letter can appear in up to four different shapes, depending on whether it occurs at the beginning, in the middle, or at the end of a word, or in isolation. Letters are mostly connected and there is no capitalization.
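The positional shapes described above can be inspected through the Unicode "Arabic Presentation Forms" names. The following is a minimal sketch, not part of the original slides: the helper name `contextual_forms` is hypothetical, and it relies only on Python's standard `unicodedata` module. It shows that BEH has all four shapes while ALEF, which does not connect to the following letter, has only two.

```python
import unicodedata

# Positional variants named in the Unicode presentation-form blocks.
POSITIONS = ("ISOLATED", "FINAL", "INITIAL", "MEDIAL")

def contextual_forms(letter_name):
    """Return {position: glyph} for the shapes Unicode defines for a letter.

    `letter_name` is the Unicode letter name, e.g. "BEH" or "ALEF".
    Positions that have no presentation form are simply omitted.
    """
    forms = {}
    for position in POSITIONS:
        name = f"ARABIC LETTER {letter_name} {position} FORM"
        try:
            forms[position] = unicodedata.lookup(name)
        except KeyError:
            # This letter has no separate shape in this position.
            pass
    return forms

beh = contextual_forms("BEH")    # connects on both sides: four shapes
alef = contextual_forms("ALEF")  # never connects to the left: two shapes

for position, glyph in beh.items():
    # NFKC normalization folds each shape back to the base letter.
    base = unicodedata.normalize("NFKC", glyph)
    print(position, glyph, "->", unicodedata.name(base))
```

In ordinary text only the base letter is stored and the rendering engine picks the shape, so a recognizer's output vocabulary does not need separate symbols per shape.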
Arabic diacritics
Morphology
Examples of MSA pronominal and possessive affixes (separated from stem by '-').
Error rates on conversational speech, by contrast, are unacceptably high. The best current error rate, 55.5%, is higher than those for comparable data in other languages.
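The slides do not say how the error rate is measured. Assuming it is the usual word error rate (substitutions, deletions and insertions divided by the number of reference words), a minimal sketch of the computation could look like this. The function name `word_error_rate` and the sample strings are illustrative only.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization, and equal cost (1) for each
    substitution, deletion and insertion.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing takes i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

# One substitution (b -> x) and one deletion (d) over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, this measure can exceed 100% on very poor output.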
Problems
The mismatch between spoken and written representation (missing pronunciation information in Arabic script);
the lack of conversational training data;
morphological complexity.
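The first problem can be made concrete with a short sketch: once the optional vowel diacritics are left out, as they are in ordinary written text, differently pronounced words share one spelling. The example below is an illustration rather than material from the slides. It uses the root k-t-b, and the helper `strip_diacritics` is a hypothetical name built on the standard `unicodedata` module.

```python
import unicodedata

def strip_diacritics(text):
    """Drop combining marks (Unicode category Mn), which include the Arabic
    short-vowel signs fatha, damma and kasra."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# Three vocalized words, written with escapes so the marks stay visible:
vocalized = {
    "kataba (he wrote)": "\u0643\u064E\u062A\u064E\u0628\u064E",
    "kutub (books)": "\u0643\u064F\u062A\u064F\u0628",
    "kutiba (it was written)": "\u0643\u064F\u062A\u0650\u0628\u064E",
}

written = {gloss: strip_diacritics(word) for gloss, word in vocalized.items()}
for gloss, form in written.items():
    print(gloss, "->", form)

# All three collapse to the same three consonant letters: kaf, teh, beh.
print(len(set(written.values())))
```

A recognizer trained on undiacritized transcripts therefore has to map several pronunciations onto one written token, or recover the missing vowels by some other means.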
Projects and contributions
IBM first built a system that recognizes spoken Arabic and converts it to text (on OS/2), followed by two versions for Windows. It then introduced ViaVoice Millennium, with speech recognition used to answer phone calls and respond to user voice commands. The problem was the need for a very large vocabulary: about 200,000 words to cover 97% of the language in modern use.
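The coverage figure above can be read as follows: sort word types by frequency and count how many of the most frequent ones are needed before they account for the target share of running words. A minimal sketch, assuming a pre-tokenized corpus (the function name `vocabulary_size_for_coverage` and the toy data are hypothetical):

```python
from collections import Counter

def vocabulary_size_for_coverage(tokens, target):
    """Smallest number of most-frequent word types whose occurrences make up
    at least `target` (a fraction between 0 and 1) of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    for size, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / total >= target:
            return size
    return len(counts)  # only reached for an empty corpus, where it returns 0

# Toy corpus: 10 tokens over 3 word types.
toy = ["a"] * 6 + ["b"] * 3 + ["c"]
print(vocabulary_size_for_coverage(toy, 0.90))
print(vocabulary_size_for_coverage(toy, 0.97))
```

Rich morphology inflates the number of distinct surface forms, which is one reason the required vocabulary is so large when whole words are the recognition unit.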
Recent Works
Alghamdi et al. (2009) developed an Arabic broadcast news transcription system.
Elmahdy et al. (2009) used acoustic models trained on a large MSA broadcast news speech corpus as multilingual or multi-accent models to decode colloquial Arabic.
Selouani and Alotaibi (2011) presented genetic algorithms to adapt HMMs for non-native speech in a large-vocabulary speech recognition system for MSA.
Saon et al. (2010) described an Arabic broadcast transcription system.
Kuo et al. (2010) studied various syntactic and morphological context features incorporated in a neural network language model (NNLM) for Arabic speech recognition.