Bilingual Markov Reordering Labels for Hierarchical SMT


  1. Bilingual Markov Reordering Labels for Hierarchical SMT
     Gideon Maillette de Buy Wenniger and Khalil Sima’an
     gemdbw AT gmail.com, k.simaan AT uva.nl
     http://staff.science.uva.nl/~gemaille/
     http://staff.science.uva.nl/~simaan/
     Statistical Language Processing and Learning Lab, Institute for Logic, Language and Computation, University of Amsterdam, the Netherlands
     October 25th, 2014
     G. Wenniger, K. Sima’an (ILLC) Bilingual Markov Reordering Labels October 25th, 2014 1

  2. The incoherence of translation reordering
     Source sentence: der handlungsspielraum der beiden betroffenen regierung ist also durch das internationale recht begrenzt .
     Reference: any action by the two governments concerned is therefore limited by this international law .
     Hiero (baseline): the margin for manoeuvre of two government is concerned by the international community limited .

  3. Hiero and Memento
     Question: what do they have in common?
     [Figure: Hiero derivation with root S 10 and unlabeled nonterminals X 11, X 12, X 13, X 14, X 17 over "we should tailor our policy accordingly" / "darauf müssen wir unsere politik ausrichten".]

  4. Lexicalization and Language model: the words are not enough

  5. Coherence demands (reordering) context
     Vision: Hierarchical Alignment Trees (HATs)

  6. Outline
     Part 1: Bilingual Phrase Reordering Labels
     Part 2: Label Substitution Features
     Part 3: Experiments
     Conclusions

  7. Part 1: Bilingual Phrase Reordering Labels

  8. NDT with Alignment structure
     [Figure: word-aligned sentence pair with its NDT.]
     English (positions 1-6): we should tailor our policy accordingly
     German (positions 1-6): darauf müssen wir unsere politik ausrichten
     NDT nodes shown on the slide: ([1,6], [1,6], 1), ([1,2], [2,3], 2), ([4,5], [4,5], 3), ([1,1], [3,3], 4), ([2,2], [2,2], 5), ([4,4], [4,4], 6), ([5,5], [5,5], 7)
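
The node spans listed for this slide can be reproduced with the standard phrase-pair consistency check over a word alignment. The sketch below is illustrative only. The alignment links are an assumption (the slide's links are not recoverable from this transcript), chosen so that "tailor" and "accordingly" each link to both "darauf" and "ausrichten". `phrase_pairs` and `ASSUMED_LINKS` are hypothetical names, not the authors' code.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]

# ASSUMPTION: (English position, German position) links, 1-based.
# we-wir, should-müssen, our-unsere, policy-politik, and
# tailor/accordingly each linked to both darauf and ausrichten.
ASSUMED_LINKS: Set[Tuple[int, int]] = {
    (1, 3), (2, 2), (4, 4), (5, 5),
    (3, 1), (3, 6), (6, 1), (6, 6),
}


def phrase_pairs(links: Set[Tuple[int, int]], src_len: int) -> List[Tuple[Span, Span]]:
    """Enumerate (English span, German span) pairs consistent with the links."""
    pairs = []
    for s_lo in range(1, src_len + 1):
        for s_hi in range(s_lo, src_len + 1):
            # German positions linked to the English span
            tgt = {t for (s, t) in links if s_lo <= s <= s_hi}
            if not tgt:
                continue
            t_lo, t_hi = min(tgt), max(tgt)
            # Consistency: no link may leave the candidate phrase pair
            if all(s_lo <= s <= s_hi for (s, t) in links if t_lo <= t <= t_hi):
                pairs.append(((s_lo, s_hi), (t_lo, t_hi)))
    return pairs


for english_span, german_span in phrase_pairs(ASSUMED_LINKS, 6):
    print(english_span, german_span)
```

Under the assumed links, the check yields exactly the seven span pairs listed on the slide; the node identifiers (third tuple element) are not computed here.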

  9. NDT with Alignment structure = HAT

  10. Reordering Labeled Grammar Extraction
      [Flowchart: Word Alignment -> Hierarchical Alignment Trees Chart -> Extract Reordering labels -> Label Chart -> Grammar Extractor -> SCFG]

  11. Bilingual Phrase Reordering label categories
      Phrase-Centric
      Parent-Relative

  12. Phrase-centric reordering labels
      Complexity relation between base phrase and children in HAT determines label.
      Five cases distinguished, ordered by increasing complexity.
      Examples (English / German / part order shown on the German side):
      Monotonic: this is an important matter / das ist ein wichtige angelegenheit / 1 2
      Inversion: we all agree on this / das sehen wir alle / 2 1
      Permutation: i want to stress two points / auf zwei punkte möchte ich hinweisen / 2 4 1 3
      Complex: we owe this to our citizens / das sind wir unsern burgern schuldig / 2 1 3
      Atomic: it would be possible / kann mann / 1

  13. Known labels from ITG and Phrase pair Theory

  14. Monotonic
      Monotonic: If the alignment can be split into two monotonically ordered parts.
      Example: this is an important matter / das ist ein wichtige angelegenheit (part order 1 2)

  15. Inverted
      Inverted: If the alignment can be split into two inverted parts.
      Example: we all agree on this / das sehen wir alle (part order 2 1)

  16. Atomic
      Atomic: If the alignment does not allow the existence of smaller (child) phrase pairs.
      Example: it would be possible / kann mann (a single part)

  17. New labels based on HATs

  18. Permutation
      Permutation: If the alignment can be factored as a permutation of more than 3 parts.
      Example: i want to stress two points / auf zwei punkte möchte ich hinweisen (part order 2 4 1 3)

  19. Complex
      Complex: The alignment cannot be factored as a permutation of parts, but it contains a smaller phrase pair (i.e., it is composite).
      Example: we owe this to our citizens / das sind wir unsern burgern schuldig (parts numbered 2 1 3 on the German side)
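
The five label definitions above can be read as a small decision procedure. The following is a minimal sketch, assuming each node is summarized by two hypothetical inputs that are not part of the slides: the number of child phrase pairs it contains, and the permutation of its parts (or `None` when the alignment does not factor as a permutation). The label strings follow the spelling used in the labeled derivation slides.

```python
from typing import Optional, Sequence


def phrase_centric_label(num_child_pairs: int,
                         permutation: Optional[Sequence[int]]) -> str:
    """Map a node summary to one of the five phrase-centric labels.

    ASSUMPTION: `permutation` lists the parts 1..n in their order on the
    other language side, e.g. [2, 4, 1, 3]; None means "not a permutation".
    """
    if num_child_pairs == 0:
        return "ATOMIC"        # no smaller (child) phrase pairs exist
    if permutation is None:
        return "COMPLEX"       # composite, but not a permutation of parts
    perm = list(permutation)
    if sorted(perm) != list(range(1, len(perm) + 1)):
        raise ValueError(f"not a permutation of 1..n: {perm}")
    if perm == [1, 2]:
        return "MONO"          # two monotonically ordered parts
    if perm == [2, 1]:
        return "INVERTED"      # two inverted parts
    if len(perm) > 3:
        return "PERMUTATION"   # more than 3 parts
    # Not covered by the five definitions on the slides.
    raise ValueError(f"unexpected node operator: {perm}")


print(phrase_centric_label(2, [1, 2]))        # MONO
print(phrase_centric_label(2, [2, 1]))        # INVERTED
print(phrase_centric_label(4, [2, 4, 1, 3]))  # PERMUTATION
```

Raising an error for other inputs is a design choice of this sketch; the slides only define the five cases listed.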

  20. Phrase-Centric labeled derivation
      [Figure, build 1 of 7: root nodes S 10 on both sides.]

  21. Phrase-Centric labeled derivation
      [Figure, build 2 of 7: adds COMPLEX 11 under S 10 on both sides.]

  22. Phrase-Centric labeled derivation
      [Figure, build 3 of 7: adds "tailor ... accordingly" / "darauf ... ausrichten" with nodes INVERTED 12 and MONO 13 under COMPLEX 11.]

  23. Phrase-Centric labeled derivation
      [Figure, build 4 of 7: adds "should" / "müssen" with node ATOMIC 14.]

  24. Phrase-Centric labeled derivation
      [Figure, build 5 of 7: adds "we" / "wir".]

  25. Phrase-Centric labeled derivation
      [Figure, build 6 of 7: adds "our" / "unsere" with node ATOMIC 17.]

  26. Phrase-Centric labeled derivation
      [Figure, build 7 of 7: adds "policy" / "politik", completing the labeled derivation with nodes S 10, COMPLEX 11, INVERTED 12, MONO 13, ATOMIC 14 and ATOMIC 17.]

  27. Parent-Relative reordering labels
      Describe the type of reordering relative to the embedding “parent” phrase.
      First-order view on reordering.
      (Details omitted due to time constraints.)

  28. Part 2: Label Substitution Features

  29. Label substitution features
      Basic features:
      Unique feature for every label pair ⟨Lα, Lβ⟩: marks that a specific LHS substitutes a specific gap.
      Two more coarse features: Match, Nomatch.
      [Figure: decoder chart showing a substituting rule, its LHS, nonterminals N1 and N2, and gaps GAP1 and GAP2.]
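
As a rough illustration of the features named on this slide, the sketch below fires one label-pair feature plus one coarse Match/Nomatch feature per substitution. The function name, the feature-name format and the use of 1.0 as the feature value are all assumptions made for this example, not details taken from the slides.

```python
from typing import Dict


def substitution_features(lhs_label: str, gap_label: str) -> Dict[str, float]:
    """Features fired when a rule with `lhs_label` fills a gap labeled `gap_label`."""
    # One unique feature per (substituting LHS label, gap label) pair.
    # ASSUMPTION: the naming scheme "subst_<lhs>_into_<gap>" is hypothetical.
    features = {f"subst_{lhs_label}_into_{gap_label}": 1.0}
    # Coarse feature: do the two labels agree or not?
    features["Match" if lhs_label == gap_label else "Nomatch"] = 1.0
    return features


print(substitution_features("MONO", "MONO"))
print(substitution_features("ATOMIC", "COMPLEX"))
```

Keeping the label-pair features separate from the coarse ones lets a tuned model weight agreement in general as well as individual label combinations.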

  30. Part 3: Experiments
