
Generation in Machine Translation from Deep Syntactic Trees



  1. Generation in Machine Translation from Deep Syntactic Trees. Keith Hall (Johns Hopkins University), Petr Němec (Charles University in Prague)

  2. Outline
     ● Transfer-based MT
     ● Tectogrammatical Representation (TR) (deep syntax)
     ● Generation from English TR trees
       ● process
       ● models
     ● Empirical results
     SSST ’07 - Hall & Němec

  3-4. Transfer-based MT. [Figure: translation from a source language (Czech) to a target language (English).]

  5-6. Transfer-based MT. [Figure: the same source (Czech) and target (English) diagram, with an interlingua level added.]

  7. Transfer-based MT. [Figure: the same source (Czech) and target (English) diagram, with a tectogrammar level added.]

  8-14. Tecto Transfer-based MT. [Figure, built up step by step over slides 8-14: a transfer pipeline from a Czech sentence to an English sentence. Levels on the Czech side: Czech sentence, surface syntax, deep syntax (Czech Tecto). Levels on the English side: deep syntax (English Tecto), surface syntax, English sentence. Labeled steps: parsing, tree transduction, generation. The final build step adds a "?" to the diagram.]

  15. Transfer-based MT
      ● Allows us to explore deep syntactic representations
      ● Factored models are clear
      ● Need not be greedy one-best process
        ● although we present one-best generation/results

  16-21. Tectogrammatical Representation. “Now the network has opened a news bureau in the Hungarian capital”
      [Figure: tectogrammatical tree for the sentence. Slides 18-21 repeat the tree with callouts labeling, in turn, the lemma (LEMM), functor (FUNC), part-of-speech (POS), and tense & mood (T_M) attributes.]
      ● FORM: #2, LEMM: #, FUNC: SENT
        ● FORM: opened, LEMM: open, FUNC: PRED, POS: 'VBN', T_M: 'SIM'_'IND'
          ● FORM: network, LEMM: network, FUNC: ACT, POS: 'NN'
          ● FORM: Now, LEMM: now, FUNC: TWHEN, POS: 'RB'
          ● FORM: bureau, LEMM: bureau, FUNC: PAT, POS: 'NN'
            ● FORM: news, LEMM: news, FUNC: RSTR, POS: 'NN'
          ● FORM: capital, LEMM: capital, FUNC: LOC, POS: 'NN'
            ● FORM: Hungarian, LEMM: hungarian, FUNC: RSTR, POS: 'JJ'
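
The tree above can be held in a small recursive data structure. The following is a minimal sketch, not the authors' implementation: the class name `TectoNode`, its field names, and the helper `example_tree` are hypothetical, and the attachment of "news" under "bureau" and "Hungarian" under "capital" is an assumption read off the sentence.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TectoNode:
    """One node of a tectogrammatical tree (attributes as named on the slide)."""
    form: str
    lemma: str
    functor: str
    pos: Optional[str] = None          # no POS is shown on the technical root
    tense_mood: Optional[str] = None   # T_M, shown only on the verb
    children: List["TectoNode"] = field(default_factory=list)

    def walk(self, depth=0):
        """Yield (depth, node) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def example_tree() -> TectoNode:
    """Build the example tree; child attachments are assumed, see lead-in."""
    bureau = TectoNode("bureau", "bureau", "PAT", "NN",
                       children=[TectoNode("news", "news", "RSTR", "NN")])
    capital = TectoNode("capital", "capital", "LOC", "NN",
                        children=[TectoNode("Hungarian", "hungarian", "RSTR", "JJ")])
    verb = TectoNode("opened", "open", "PRED", "VBN", "SIM_IND", children=[
        TectoNode("network", "network", "ACT", "NN"),
        TectoNode("Now", "now", "TWHEN", "RB"),
        bureau,
        capital,
    ])
    return TectoNode("#2", "#", "SENT", children=[verb])


if __name__ == "__main__":
    # Print an indented view of lemmas and functors.
    for depth, node in example_tree().walk():
        print("  " * depth + f"{node.lemma} [{node.functor}]")
```

Note that function words ("the", "has", "a", "in") have no nodes of their own here; restoring them is the job of the insertion step described later in the deck.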

  22. Generation Process
      1. Insert syn-semantic (function) words
      2. Subtree reordering
      ● Intermediary surface syntax?
      ● Reordering constraints?
        ● maximum subtree size
        ● coordination
      [Figure: deep syntax (English Tecto), surface syntax, English sentence.]

  23-25. Generation Model
      $$\arg\max_{A,f} P(A, f \mid T) = \arg\max_{A,f} P(f \mid A, T)\, P(A \mid T) \approx \arg\max_{f} P\bigl(f \mid T, \arg\max_{A} P(A \mid T)\bigr)$$
      ● tecto nodes: $T = \{t_1, \ldots, t_i, \ldots, t_n\}$
      ● insertion string: $A = \{a_1, \ldots, a_i, \ldots, a_k\}$, $n \le k \le 2n$
      ● order mapping: $f : \{A \cup T\} \to \{1, \ldots, 2n\}$
      Slides 24 and 25 repeat this slide with the callouts "Insertion" and "Reordering".
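
The approximation in the generation model replaces a joint search over insertions and orderings with two stages: first choose the single best insertion set, then choose the best ordering given that choice. The sketch below illustrates only this control flow. The function `two_stage_decode`, its parameters, and the toy probabilities are hypothetical; the insertion set is simplified to a tuple of inserted words, and the ordering search is an exhaustive enumeration that is practical only for toy inputs.

```python
from itertools import permutations
from typing import Callable, Sequence, Tuple


def two_stage_decode(
    tecto_nodes: Sequence[str],
    candidate_insertions: Sequence[Tuple[str, ...]],
    p_insert: Callable[[Tuple[str, ...], Sequence[str]], float],
    p_order: Callable[[Tuple[str, ...], Tuple[str, ...], Sequence[str]], float],
):
    """Approximate argmax over (A, f) of P(A, f | T) in two greedy stages."""
    # Stage 1: A* = argmax_A P(A | T)
    best_a = max(candidate_insertions, key=lambda a: p_insert(a, tecto_nodes))
    # Stage 2: f* = argmax_f P(f | T, A*), over all orderings of T plus A*
    items = tuple(tecto_nodes) + best_a
    best_f = max(permutations(items),
                 key=lambda f: p_order(f, best_a, tecto_nodes))
    return best_a, best_f


if __name__ == "__main__":
    # Toy, made-up probabilities for illustration only.
    T = ["network", "open"]
    candidates = [(), ("the",), ("the", "has")]
    toy_p_insert = {(): 0.1, ("the",): 0.3, ("the", "has"): 0.6}
    target = ("the", "network", "has", "open")
    a, f = two_stage_decode(
        T, candidates,
        p_insert=lambda a, t: toy_p_insert[a],
        p_order=lambda f, a, t: 1.0 if f == target else 0.0,
    )
    print(a, f)
```

Because stage 1 commits to one insertion set before any ordering is scored, an ordering that would have favored a different insertion set is never considered; that is the cost of the one-best approximation.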

  26. Insertion Process. “Now the network has opened a news bureau in the Hungarian capital”
      [Figure: input tree, each node shown as lemma (functor, POS).]
      ● open (PRED, VBN, SIM_IND)
        ● network (ACT, NN)
        ● now (TWHEN, RB)
        ● bureau (PAT, NN)
          ● news (RSTR, NN)
        ● capital (LOC, NN)
          ● hungarian (RSTR, JJ)

  27. Insertion Process. “Now the network has opened a news bureau in the Hungarian capital”
      [Figure: the tree from slide 26 after insertion. Inserted function-word nodes: has (AUX), in (PP), the (DT), the (DT), a (DT).]

  28. Insertion Model
      $$P(A \mid T) = \prod_i P(a_i \mid a_1, \ldots, a_{i-1}, T) \approx \prod_i P(a_i \mid t_i, t_{g(i)})$$
      ● Insertion is dependent on local context:
        ● tecto node (includes: lemma, functor, POS)
        ● parent node
      ● Three independent models:
        ● articles
        ● prepositions and subordinating conjunctions
        ● modals (deterministic, given functor)
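
The local-context approximation can be illustrated with a simple count-based model, reading the second conditioning term in the formula as the parent of node i. This is a sketch under stated assumptions: the slides do not say how the probabilities are estimated, so the relative-frequency estimate, the class `LocalInsertionModel`, and the `NO_INSERT` label are all hypothetical.

```python
from collections import Counter, defaultdict

NO_INSERT = "<none>"  # hypothetical label meaning "insert nothing"


class LocalInsertionModel:
    """Relative-frequency estimate of P(a_i | node, parent)."""

    def __init__(self):
        # Maps a (node, parent) context to counts of observed insertions.
        self.counts = defaultdict(Counter)

    def observe(self, node, parent, inserted):
        self.counts[(node, parent)][inserted] += 1

    def prob(self, inserted, node, parent):
        ctx = self.counts.get((node, parent))
        if not ctx:
            return 0.0  # unseen context; a real model would need smoothing
        return ctx[inserted] / sum(ctx.values())

    def best(self, node, parent):
        ctx = self.counts.get((node, parent))
        if not ctx:
            return NO_INSERT
        return ctx.most_common(1)[0][0]

    def sequence_prob(self, insertions, nodes, parents):
        """P(A | T) under the independence approximation: a product over nodes."""
        p = 1.0
        for a, t, g in zip(insertions, nodes, parents):
            p *= self.prob(a, t, g)
        return p


if __name__ == "__main__":
    # Contexts are (lemma, functor, POS) triples; the counts are made up.
    node = ("capital", "LOC", "NN")
    parent = ("open", "PRED", "VBN")
    model = LocalInsertionModel()
    for article in ["the", "the", "the", "a"]:
        model.observe(node, parent, article)
    print(model.best(node, parent), model.prob("the", node, parent))
```

One instance of such a model per word class (articles; prepositions and subordinating conjunctions) would mirror the independence between models stated on the slide, while modals, being deterministic given the functor, would need only a lookup table.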

  29. Reordering Process. “Now the network has opened a news bureau in the Hungarian capital”
      [Figure: head open (PRED, VBN, SIM_IND) with children, in displayed order: network (ACT, NN), bureau (PAT, NN), capital (LOC, NN), now (TWHEN, RB), has (AUX).]

  30. Reordering Process. “Now the network has opened a news bureau in the Hungarian capital”
      [Figure: head open (PRED, VBN, SIM_IND) with children, in displayed order: network (ACT, NN), has (AUX), bureau (PAT, NN), capital (LOC, NN), now (TWHEN, RB).]

  31. Reordering Process. “Now the network has opened a news bureau in the Hungarian capital”
      [Figure: head and children shown on one level, in displayed order: network (ACT, NN), open (PRED, VBN, SIM_IND), bureau (PAT, NN), capital (LOC, NN), has (AUX), now (TWHEN, RB).]
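
Subtree reordering can be pictured as choosing a left-to-right order for a head and its children. The transcript ends before the reordering model itself is described, so the following is only a sketch of the search problem, not the authors' method: the function `reorder_subtree`, the `max_size` cutoff (motivated by the "maximum subtree size" constraint mentioned earlier), and the toy functor-bigram score are all hypothetical.

```python
from itertools import permutations


def reorder_subtree(head, children, score, max_size=6):
    """Return the highest-scoring left-to-right order of a head and its children.

    `score` is a caller-supplied function over a tuple of items. The search is
    exhaustive, so subtrees larger than `max_size` are left in input order.
    """
    items = [head] + list(children)
    if len(items) > max_size:
        return tuple(items)
    return max(permutations(items), key=score)


# Toy preference table over adjacent functor pairs (made up for illustration).
PREFERRED = {("TWHEN", "ACT"), ("ACT", "AUX"), ("AUX", "PRED"),
             ("PRED", "PAT"), ("PAT", "LOC")}


def toy_score(order):
    """Count adjacent pairs that appear in the preference table."""
    return sum((a, b) in PREFERRED for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    order = reorder_subtree("PRED", ["ACT", "PAT", "LOC", "TWHEN", "AUX"], toy_score)
    print(order)
```

With this toy table the selected order corresponds to "now / network / has / open / bureau / capital", the word order of the example sentence once articles, the preposition, and the modifiers are set aside.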
