
Challenges and Techniques for Dialectal Arabic Speech Recognition - PowerPoint PPT Presentation



  1. Challenges and Techniques for Dialectal Arabic Speech Recognition and Machine Translation
     Mohamed Elmahdy, Mark Hasegawa-Johnson, Eiman Mustafawi, Rehab Duwairi, Wolfgang Minker
     Nov. 21, 2011
     Qatar University, University of Illinois, Ulm University

  2. Introduction | Approaches | Experiments and results | Conclusions
     Arabic Language
     • Largest living Semitic language
     • 250+ million native speakers
     • Formal Arabic
       → Modern Standard Arabic (MSA)
       → Standardized
       → A lot of ASR and MT research
       → Not used in everyday life
     • Dialectal Arabic
       → Used in everyday life
       → Not standardized (mainly spoken)
       → Many different dialects
       → Very little ASR and MT research
     • Significant differences between MSA and dialectal Arabic
       → They are considered completely different languages
     Challenges and techniques for dialectal Arabic ASR and MT | Mohamed Elmahdy | Qatar | Doha | Nov. 21, 2011

  3. MSA Versus Dialectal Arabic
     • Egyptian Colloquial Arabic (ECA) is taken as a typical Arabic dialect
     • Phonological
       → /t/ or /s/ in ECA instead of /T/ in MSA, e.g. /tala:tah/ (three) in ECA versus /Tala:Tah/ in MSA
     • Lexical
       → /t'ArAbE:zA/ (table) in ECA versus /t'awila/ in MSA
     • Syntactic
       → SVO in ECA versus VSO in MSA

  4. Automatic Speech Recognition
     • High-level diagram of a state-of-the-art ASR system:
       Speech → Feature Extraction → Features $O$ → Decoder → Words $\hat{W}$
       The decoder draws on the acoustic model $P(O \mid W)$, the language model $P(W)$, and the pronunciation model
     • Decoding rule: $\hat{W} = \arg\max_{W \in L} P(O \mid W)\, P(W)$
     • For dialectal Arabic, only sparse and low-quality corpora are available
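The decision rule on this slide can be illustrated with a toy example. The sketch below is not the decoder from the talk: it scores a hand-made list of candidate transcriptions and returns the one that maximizes log P(O|W) + log P(W). The function name `decode` and all probability values are assumptions made up for illustration.

```python
import math

def decode(candidates, acoustic_logprob, lm_logprob):
    """Return the candidate W that maximizes log P(O|W) + log P(W)."""
    return max(candidates, key=lambda w: acoustic_logprob[w] + lm_logprob[w])

# Toy scores (assumed values, not results from the talk).
# The acoustic model slightly prefers the ECA form, the language model the MSA form.
acoustic_logprob = {"tala:tah": math.log(0.6), "Tala:Tah": math.log(0.4)}
lm_logprob = {"tala:tah": math.log(0.1), "Tala:Tah": math.log(0.3)}

best = decode(list(acoustic_logprob), acoustic_logprob, lm_logprob)
print(best)
```

Working in the log domain turns the product of the two models into a sum, which avoids numerical underflow on long utterances.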

  5. Statistical Machine Translation
     • High-level diagram of an SMT system:
       Arabic sentence $A$ → Decoder → English sentence $\hat{E}$
       The decoder draws on the translation model $P(A \mid E)$ and the language model $P(E)$
     • Decoding rule: $\hat{E} = \arg\max_{E} P(A \mid E)\, P(E)$
     • Large parallel corpora are required
     • For dialectal Arabic, parallel corpora are not available

  6. Objectives
     • ASR and MT for dialectal Arabic, where little data exists
     • Benefit from existing MSA speech data to improve dialectal Arabic ASR and MT
     • Ultimate goal: speech-to-text MT for dialectal Arabic

  7. Outline
     • Introduction
     • Approaches
     • Experiments and results
     • Conclusions and future directions

  8. Proposed Approaches for Dialectal Arabic ASR
     • Phonemic acoustic modeling
       → Dialectal speech data where phonetic transcription is available
     • Graphemic acoustic modeling
     • Unsupervised acoustic modeling
     • Arabic Chat Alphabet-based acoustic modeling

  9. Phonemic Cross-Lingual Acoustic Modeling
     • Benefit from existing large MSA speech corpora
     • Assumptions:
       → MSA is always a second language for any Arabic speaker
       → A large amount of MSA speech data (large number of speakers) implicitly covers all the acoustic features of the different Arabic dialects
     • Approach:
       → Train an acoustic model using a large amount of MSA speech data
       → Adapt the MSA acoustic models with a small amount of dialectal speech data

  10. Phonemic Cross-Lingual Acoustic Modeling (cont.)
      • State-of-the-art AM adaptation techniques include:
        → Maximum Likelihood Linear Regression (MLLR): $\hat{\mu}_{MLLR} = A\mu + b$
        → Maximum A-Posteriori (MAP): $\theta_{MAP} = \arg\max_{\theta} P(O \mid \theta)\, P(\theta)$
      • Requirement: the adaptation data and the AM have to share the same language and phoneme set
      • Egyptian Colloquial Arabic (ECA) is chosen as a typical dialect
      • Initially, MSA and ECA do not share the same phoneme inventory, so acoustic model adaptation is not possible
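The two adaptation techniques named on this slide can be sketched for a single Gaussian mean. This is a minimal illustration, not the authors' implementation. It assumes that the MLLR matrix `A` and bias `b` have already been estimated, that every adaptation frame is assigned to the Gaussian being updated, and that MAP uses the common conjugate-prior mean update with a prior weight `tau`. The function names and all numbers are hypothetical.

```python
def mllr_transform(mean, A, b):
    """Apply the affine MLLR transform mu' = A * mu + b to one Gaussian mean."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
            for row, b_i in zip(A, b)]

def map_update(prior_mean, frames, tau=10.0):
    """MAP re-estimate of a Gaussian mean from adaptation frames.

    Interpolates between the prior (e.g. MSA) mean and the sample mean of the
    adaptation (e.g. ECA) frames; a larger tau trusts the prior more.
    """
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(frame[d] for frame in frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]

# Toy two-dimensional example (assumed values).
msa_mean = [1.0, 2.0]
A = [[1.0, 0.0], [0.0, 2.0]]
b = [0.5, -1.0]
shifted = mllr_transform(msa_mean, A, b)

eca_frames = [[2.0, 4.0], [4.0, 6.0]]
adapted = map_update(msa_mean, eca_frames, tau=2.0)
print(shifted, adapted)
```

With very little adaptation data the MAP estimate stays close to the prior mean, and it moves toward the sample mean of the dialectal frames as more data is added.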

  11. Phonemic Cross-Lingual Acoustic Modeling (cont.)
      • Solution: phoneme set normalization, after which AM adaptation is possible
      • Phoneme set normalization
        → Several phone mapping rules are applied
        → ECA phonemes are mapped to their origins in MSA (even if they are acoustically different)
      • Figure: normalization phone mapping rules from ECA phones (/b/, /g/, /j/, /e/, /i/, /o/, /u/, /t/, ...) to MSA phones (/b/, /dZ/, /i/, /u/, /t/, ...)
      • Example: جزر (carrot) is /g/ /A/ /z/ /A/ /r/ in ECA and becomes /dZ/ /a/ /z/ /a/ /r/ after normalization
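Phoneme set normalization amounts to a lookup over each pronunciation. The sketch below is a minimal illustration that contains only the two rules visible in the slide's carrot example (/g/ to /dZ/ and /A/ to /a/); the full rule set used by the authors is not reproduced here, and the names `ECA_TO_MSA` and `normalize_phonemes` are hypothetical.

```python
# Rules taken from the slide's example only; the complete rule set is not shown here.
ECA_TO_MSA = {"g": "dZ", "A": "a"}

def normalize_phonemes(phonemes, rules=ECA_TO_MSA):
    """Map each ECA phoneme to its MSA origin; phonemes without a rule pass through."""
    return [rules.get(p, p) for p in phonemes]

carrot_eca = ["g", "A", "z", "A", "r"]
carrot_msa = normalize_phonemes(carrot_eca)
print(carrot_msa)
```

Applying the same function to every entry of the ECA pronunciation lexicon gives a corpus whose phoneme labels match those of the MSA acoustic model.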

  12. Phonemic Cross-Lingual Acoustic Modeling (cont.)
      • Block diagram of the proposed approach:
        ECA corpus → Phonemes normalization → Normalized ECA corpus → Training → ECA baseline model
        MSA corpus → Phonemes normalization → Normalized MSA corpus → Training → MSA acoustic model
        MSA acoustic model + Normalized ECA corpus → MLLR adaptation → MAP adaptation → ECA final model
      • The adapted ECA AM is evaluated against the ECA baseline AM

  13. Proposed Approaches for Dialectal Arabic ASR
      • Phonemic acoustic modeling
        → Dialectal speech data where phonetic transcription is available
      • Graphemic acoustic modeling
        → Phonetic transcription is not possible or is difficult
        → Short vowels are missing
        → Phonetic transcription is approximated by the letters of each word
      • Unsupervised acoustic modeling
        → Transcriptions are not available at all
        → Dialectal speech is automatically transcribed using an MSA model
      • Arabic Chat Alphabet-based acoustic modeling
        → Latin letters are used instead of Arabic ones
        → Includes short vowels that are missing in traditional Arabic orthography
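The graphemic approximation can be sketched as a lexicon in which each word's pronunciation is simply its sequence of letters. This is an illustrative sketch under the assumption of one acoustic unit per letter; the function names are hypothetical, and the only word used is the carrot example from the earlier slide.

```python
def graphemic_pronunciation(word):
    """Approximate a pronunciation by the word's letters, one unit per letter."""
    return [ch for ch in word if not ch.isspace()]

def build_graphemic_lexicon(words):
    """Build a word -> letter-sequence lexicon for graphemic acoustic modeling."""
    return {word: graphemic_pronunciation(word) for word in words}

# The word for "carrot" (jeem, zay, reh), written with escapes to keep letter order explicit.
CARROT = "\u062c\u0632\u0631"

lexicon = build_graphemic_lexicon([CARROT])
print(lexicon[CARROT])
```

Because short vowels are not written, the resulting units absorb the neighbouring vowel sounds, which is the trade-off that graphemic modeling accepts in exchange for not needing a phonetic transcription.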

  14. Outline
      • Introduction
      • Approaches
      • Experiments and results
      • Conclusions

  15. Phonemic Cross-Lingual Adaptation Results
      • ECA corpus:
        → 65% for training/adaptation
        → 35% for testing
      • Word Error Rate (WER): $\mathrm{WER} = \dfrac{\mathrm{Sub} + \mathrm{Ins} + \mathrm{Del}}{N}$
      • Figure: WER (%) of the ECA phonemic AM for four systems: ECA baseline, MSA only, MSA+ECA data pooling, and MSA+ECA adaptation
      • 41.8% relative reduction in WER
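The WER metric on this slide counts substitutions, insertions, and deletions against the number of reference words N. The sketch below computes it with a standard word-level edit distance and adds a helper for relative reduction. It is an illustration rather than the scoring tool used in the experiments, and the example sentences and WER values are made up.

```python
def word_error_rate(reference, hypothesis):
    """WER = (Sub + Ins + Del) / N, computed by word-level edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction of a new system with respect to a baseline."""
    return (baseline_wer - new_wer) / baseline_wer

# Made-up example: one substitution and one deletion over four reference words.
wer = word_error_rate("a b c d", "a x c")
print(wer, relative_reduction(0.20, 0.10))
```

Note that WER can exceed 100% when the hypothesis contains many insertions, since the denominator is the reference length only.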

  16. Effect of MSA Speech Data Amount
      • Varying the amount of MSA speech data
      • Effect on phonemic cross-lingual adaptation
      • Figure: WER (%) of MSA+ECA adaptation versus MSA speech amount (0.5, 1, 2, 4, 8, 16, and 32 hours)
      • Consistent decrease in WER

  17. Outline
      • Introduction
      • Approaches
      • Experiments and results
      • Conclusions

  18. Conclusions and Future Directions
      • Conclusions
        → Problems in ASR and MT for dialectal Arabic
        → Cross-lingual acoustic modeling for dialectal Arabic ASR
        → Improvements are observed in both phonemic and graphemic modeling
        → Consistent reduction in WER when more MSA data is added
      • Future directions
        → Data collection, with a focus on the Qatari dialect
        → Extension to all Arabic dialects
        → Dialectal Arabic MT and LM

  19. Thank you for your attention
