
Forest Rescoring: Faster Decoding with Integrated Language Models

Liang Huang, David Chiang. ACL 2007, Praha, Česká republika.


  1. Forest Rescoring: Faster Decoding with Integrated Language Models. Liang Huang, David Chiang. ACL 2007, Praha, Česká republika

  2. Statistical Machine Translation. Spanish/English bilingual text → statistical analysis → translation model (TM): competency. English text → statistical analysis → language model (LM): fluency. Spanish → TM → broken English → LM → English. Example: "Que hambre tengo yo" gives the candidates "What hunger have I", "Hungry I am so", "I am so hungry", "Have I that hunger", "How hunger have I", ..., and the output "I am so hungry". Huang and Chiang (Knight and Koehn, 2003) Forest Rescoring 2

  3. Statistical Machine Translation (build). The same diagram and example as slide 2, with the added label "n-best rescoring". (Knight and Koehn, 2003)

  4. Statistical Machine Translation. The TM/LM diagram without the example sentences: Spanish/English bilingual text → statistical analysis → translation model (TM): competency; English text → statistical analysis → language model (LM): fluency; Spanish → broken English → English.

  5. Statistical Machine Translation (build). Adds the labels "decoder" and "integrated decoder (LM-integrated)", with "Que hambre tengo yo" going in and "I am so hungry" coming out. Computationally challenging! ☹

  6. Statistical Machine Translation (build). Adds the model types: phrase-based or syntax-based TM, and n-gram LM. The integrated (LM-integrated) decoder remains computationally challenging! ☹

  7. Forest Rescoring. The same diagram (phrase-based or syntax-based TM, n-gram LM, integrated decoder), now with a packed forest added. Integrated (LM-integrated) decoding: computationally challenging! ☹

  8. Forest Rescoring (build). Adds "on-the-fly rescoring" alongside the packed forest.

  9. Forest Rescoring (build). Adds the forest rescorer (packed forest with on-the-fly rescoring) alongside the integrated (LM-integrated) decoder. Significant speed-up: 10 to 30 times faster! ☺

  10. The Forest Framework: unifying phrase- and syntax-based decoding

  11. Phrase-based Decoding. Source sentence: 与 沙龙 举行 了 会谈 (yu Shalong juxing le huitan). Source-side: coverage vector, e.g. _ _ ● ● ● once the last three source words have been translated as "held a talk". Target-side: grow hypotheses strictly left-to-right: _ _ _ _ _ (nothing covered) → _ _ ● ● ● "held a talk" → ● ● ● ● ● "held a talk with Sharon" ...
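
The coverage-vector bookkeeping on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `Hypothesis`, `extend`, and `show` are hypothetical, and model scores, distortion limits, and pruning are left out entirely.

```python
from typing import NamedTuple, Tuple


class Hypothesis(NamedTuple):
    coverage: Tuple[bool, ...]  # one flag per source word
    target: Tuple[str, ...]     # target words produced so far


def extend(hyp: Hypothesis, start: int, end: int, phrase: str) -> Hypothesis:
    """Translate source span [start, end) and append `phrase` on the right."""
    if any(hyp.coverage[start:end]):
        raise ValueError("span overlaps already-covered source words")
    cov = tuple(c or (start <= i < end) for i, c in enumerate(hyp.coverage))
    return Hypothesis(cov, hyp.target + tuple(phrase.split()))


def show(hyp: Hypothesis) -> str:
    """Render the coverage vector in the slide's notation."""
    return " ".join("●" if c else "_" for c in hyp.coverage)


source = "yu Shalong juxing le huitan".split()
h0 = Hypothesis((False,) * len(source), ())
h1 = extend(h0, 2, 5, "held a talk")   # covers "juxing le huitan"
h2 = extend(h1, 0, 2, "with Sharon")   # covers "yu Shalong"
print(show(h1), " ".join(h1.target))
print(show(h2), " ".join(h2.target))
```

The source words may be covered in any order, but the target string only ever grows on its right end, which is the "strictly left-to-right" constraint.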

  12. Syntax-based Translation • synchronous context-free grammars (SCFGs) • a context-free grammar in two dimensions • generates pairs of strings/trees simultaneously • each co-indexed nonterminal is further rewritten as a unit. Rules: VP → PP(1) VP(2), VP(2) PP(1); VP → juxing le huitan, held a meeting; PP → yu Shalong, with Sharon. Derivation: source tree VP → PP VP over "yu Shalong juxing le huitan"; target tree VP → VP PP over "held a meeting with Sharon".
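
How co-indexed nonterminals generate a pair of strings in lockstep can be sketched as follows. The encoding is an assumption made for illustration: `Rule` and `yield_pair` are hypothetical names, terminals are strings, and an integer `n` stands for the nonterminal with co-index `(n)`.

```python
from typing import List, NamedTuple, Sequence, Tuple, Union

Symbol = Union[str, int]  # str: terminal word; int: co-index of a nonterminal
Pair = Tuple[List[str], List[str]]


class Rule(NamedTuple):
    lhs: str
    src: Tuple[Symbol, ...]  # source-side right-hand side
    tgt: Tuple[Symbol, ...]  # target-side right-hand side


def yield_pair(rule: Rule, children: Sequence[Pair] = ()) -> Pair:
    """Expand both sides of `rule`; children[n-1] is the pair for co-index n."""
    def fill(side: Tuple[Symbol, ...], which: int) -> List[str]:
        out: List[str] = []
        for sym in side:
            if isinstance(sym, int):
                out.extend(children[sym - 1][which])  # same child on both sides
            else:
                out.append(sym)
        return out
    return fill(rule.src, 0), fill(rule.tgt, 1)


# The three rules from the slide.
r_top = Rule("VP", (1, 2), (2, 1))  # VP -> PP(1) VP(2), VP(2) PP(1)
r_vp = Rule("VP", ("juxing", "le", "huitan"), ("held", "a", "meeting"))
r_pp = Rule("PP", ("yu", "Shalong"), ("with", "Sharon"))

src, tgt = yield_pair(r_top, [yield_pair(r_pp), yield_pair(r_vp)])
print(" ".join(src))
print(" ".join(tgt))
```

Because each child is expanded once and reused on both sides, the reordering in the top rule swaps whole sub-translations rather than individual words.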

  13. Translation as Parsing • translation with SCFGs => monolingual parsing • parse the source input with the source projection • build the corresponding target sub-strings in parallel. Rules: VP → PP(1) VP(2), VP(2) PP(1); VP → juxing le huitan, held a meeting; PP → yu Shalong, with Sharon. Chart over "yu Shalong juxing le huitan": VP 1,6 built from PP 1,3 and VP 3,6.

  15. Translation as Parsing (build). Adds the target sub-strings built at each chart item: PP 1,3 "with Sharon"; VP 3,6 "held a talk"; VP 1,6 "held a talk with Sharon".
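
Parsing with the source projection while building target sub-strings in parallel can be sketched with a small chart parser. The tables `LEXICAL` and `BINARY` and the function `parse` are hypothetical names, the grammar is hard-coded to this one example, and the sketch keeps a single target string per chart item, whereas a real decoder keeps many scored alternatives. The target side "held a talk" follows the chart shown on the slide.

```python
from typing import Dict, List, Tuple

# Source phrase -> (label, target string); stands in for the lexical rules.
LEXICAL: Dict[Tuple[str, ...], List[Tuple[str, str]]] = {
    ("yu", "Shalong"): [("PP", "with Sharon")],
    ("juxing", "le", "huitan"): [("VP", "held a talk")],
}
# (lhs, left child, right child, target order of the two children)
BINARY = [("VP", "PP", "VP", (2, 1))]  # VP -> PP(1) VP(2), VP(2) PP(1)


def parse(words: List[str]) -> Dict[Tuple[str, int, int], str]:
    """Return a chart mapping (label, i, j) to a target string.

    Spans use the slide's 1-based fencepost positions, so the whole
    five-word sentence is the span 1,6.
    """
    n = len(words)
    chart: Dict[Tuple[str, int, int], str] = {}
    for i in range(n):
        for j in range(i + 1, n + 1):
            for label, tgt in LEXICAL.get(tuple(words[i:j]), []):
                chart[(label, i + 1, j + 1)] = tgt
    for width in range(2, n + 1):
        for i in range(1, n - width + 2):
            j = i + width
            for k in range(i + 1, j):
                for lhs, left, right, order in BINARY:
                    if (left, i, k) in chart and (right, k, j) in chart:
                        parts = [chart[(left, i, k)], chart[(right, k, j)]]
                        chart[(lhs, i, j)] = " ".join(parts[o - 1] for o in order)
    return chart


chart = parse("yu Shalong juxing le huitan".split())
for item in sorted(chart):
    print(item, chart[item])
```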

  16. Packed Forest • a compact representation of all translations • has the structure of a hypergraph (a graph is a special case). Phrase-based: a graph over the coverage vectors _ _ _ _ _, _ _ ● ● ●, and ● ● ● ● ●. Syntax-based: a hypergraph over VP 1,6, PP 1,3, and VP 3,6.

  17. Packed Forest (build). Labels the parts of both structures: nodes (coverage vectors such as _ _ ● ● ●, or labeled spans such as VP 1,6) and (hyper-)edges (connecting, for example, PP 1,3 and VP 3,6 to VP 1,6).
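
A minimal data-structure sketch shows why a graph is the special case of a hypergraph in which every edge has a single tail. The class and method names (`Hyperedge`, `Hypergraph`, `add`, `incoming`, `is_graph`) are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Hyperedge:
    head: str                # node being derived
    tails: Tuple[str, ...]   # nodes it is derived from


@dataclass
class Hypergraph:
    edges: List[Hyperedge] = field(default_factory=list)

    def add(self, head: str, *tails: str) -> None:
        self.edges.append(Hyperedge(head, tuple(tails)))

    def incoming(self, node: str) -> List[Hyperedge]:
        """All alternative ways of deriving `node`."""
        return [e for e in self.edges if e.head == node]

    def is_graph(self) -> bool:
        """True when every edge has at most one tail."""
        return all(len(e.tails) <= 1 for e in self.edges)


# Syntax-based: one hyperedge with two tails.
syntax = Hypergraph()
syntax.add("VP 1,6", "PP 1,3", "VP 3,6")

# Phrase-based: each edge extends exactly one earlier coverage vector.
phrase = Hypergraph()
phrase.add("_ _ ● ● ●", "_ _ _ _ _")
phrase.add("● ● ● ● ●", "_ _ ● ● ●")

print(syntax.is_graph(), phrase.is_graph())
```

Storing edges by head makes "all ways to derive this node" a single lookup, which is the operation the later slides on beam search rely on.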

  18. Adding a Bigram Model. Each node is split into +LM items that also record boundary words. Phrase-based graph: _ _ _ _ _; _ _ ● ● ● with "... meeting", "... talk", "... talks"; ● ● ● ● ● with "... Shalong", "... Sharon". Syntax-based hypergraph (VP 1,6 over PP 1,3 and VP 3,6), +LM items: PP 1,3: "with ... Sharon", "along ... Sharon", "with ... Shalong"; VP 3,6: "held ... talk", "held ... meeting", "hold ... talks".

  19. Adding a Bigram Model (build). Adds the phrase "with Sharon" to the phrase-based graph.

  20. Adding a Bigram Model (build). Adds the label "bigram" to the phrase-based graph, alongside "with Sharon".

  21. Adding a Bigram Model (build). Adds, at VP 1,6, the combination of the +LM items "held ... talk" and "with ... Sharon".

  22. Adding a Bigram Model (build). Adds the label "bigram" to the combination of "held ... talk" and "with ... Sharon" at VP 1,6.

  23. Adding a Bigram Model (build). Adds the resulting +LM item "held ... Sharon" at VP 1,6.

  24. Adding a Bigram Model (build). Adds the remaining +LM items at VP 1,6: "hold ... Sharon", "held ... Shalong", "hold ... Shalong", alongside "held ... Sharon".
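
Combining two +LM items under a bigram model can be sketched as below. The item layout (`LMItem` with `first`, `last`, `cost`), the function `combine`, and every number in the toy cost table are assumptions made for illustration and do not come from the slides. Only the boundary words are kept because a bigram model needs nothing else when two strings are joined.

```python
from typing import Callable, NamedTuple


class LMItem(NamedTuple):
    first: str   # leftmost target word
    last: str    # rightmost target word
    cost: float  # accumulated cost (lower is better)


def combine(left: LMItem, right: LMItem,
            bigram_cost: Callable[[str, str], float]) -> LMItem:
    """Concatenate left + right, paying for the one new bigram at the seam."""
    seam = bigram_cost(left.last, right.first)
    return LMItem(left.first, right.last, left.cost + right.cost + seam)


# Hypothetical toy costs; unseen bigrams get a flat penalty.
TOY = {("talk", "with"): 0.25}


def toy_bigram(a: str, b: str) -> float:
    return TOY.get((a, b), 5.0)


vp_items = [LMItem("held", "talk", 1.0), LMItem("held", "meeting", 1.5),
            LMItem("hold", "talks", 2.0)]
pp_items = [LMItem("with", "Sharon", 0.5), LMItem("along", "Sharon", 1.0),
            LMItem("with", "Shalong", 1.5)]

best = combine(vp_items[0], pp_items[0], toy_bigram)
print(best)

# The nine combinations collapse into four distinct +LM items at VP 1,6.
signatures = {(c.first, c.last)
              for c in (combine(v, p, toy_bigram)
                        for v in vp_items for p in pp_items)}
print(sorted(signatures))
```

The target order is VP then PP, so the only new bigram is the one spanning the last word of the VP item and the first word of the PP item.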

  25. Conventional Beam Search. Node VP 1,6 has several incoming hyperedges, with tails (PP 1,3, VP 3,6), (PP 1,4, VP 4,6), and (NP 1,4, VP 4,6); the figure shows the values 1.0, 2.3, 1.1, 4.6, 2.5, 7.2. • beam search: keep only the top-k +LM items at each node • but there are many ways to derive each node • can we avoid enumerating all combinations? • best-first enumeration?
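
One possible reading of "best-first enumeration" is sketched below for the simplest case: two tail nodes whose item costs are sorted, with the combined cost being just their sum. This is an illustration under those assumptions, not necessarily the procedure the talk goes on to present; `k_best_sums` and the toy cost lists are hypothetical. With an integrated LM the seam cost depends on the particular pair of items, so the ordering would no longer be guaranteed to be exact.

```python
import heapq
from typing import List, Tuple


def k_best_sums(a: List[float], b: List[float],
                k: int) -> List[Tuple[float, int, int]]:
    """Return the k smallest a[i] + b[j] as (cost, i, j), cheapest first.

    Both lists must be sorted in ascending order. Only the neighbors of
    already-popped cells are ever scored, not the full len(a) * len(b) grid.
    """
    if not a or not b:
        return []
    heap = [(a[0] + b[0], 0, 0)]
    seen = {(0, 0)}
    out: List[Tuple[float, int, int]] = []
    while heap and len(out) < k:
        cost, i, j = heapq.heappop(heap)
        out.append((cost, i, j))
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (a[ni] + b[nj], ni, nj))
    return out


# Hypothetical sorted item costs for the two tails of one hyperedge.
left_costs = [1.0, 2.0, 4.0]
right_costs = [0.5, 1.5, 3.0]
top3 = k_best_sums(left_costs, right_costs, 3)
print(top3)
```

Extracting the top k this way scores at most 2k + 1 combinations instead of all of them, which is the saving the slide's question points at.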
