  1. Syntax-Based Decoding 2 (Philipp Koehn, Machine Translation lecture, 14 November 2017)

  2. Flashback: Syntax-Based Models

  3. Synchronous Context-Free Grammar Rules
     • Nonterminal rules: NP → DET 1 NN 2 JJ 3 | DET 1 JJ 3 NN 2
     • Terminal rules: N → maison | house; NP → la maison bleue | the blue house
     • Mixed rules: NP → la maison JJ 1 | the JJ 1 house
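The three rule types can be sketched as a small data structure. This is an illustrative encoding only; the class and function names (`SCFGRule`, `apply_rule`) are invented for the example, not part of any decoder.

```python
from dataclasses import dataclass

# Illustrative SCFG rule: a source right-hand side paired with a target
# right-hand side, with integers marking co-indexed non-terminal gaps.
@dataclass(frozen=True)
class SCFGRule:
    lhs: str       # left-hand-side non-terminal, e.g. "NP"
    source: tuple  # terminals (str) mixed with non-terminal indices (int)
    target: tuple  # same, indices refer back to the source non-terminals

# Mixed rule from the slide: NP -> la maison JJ-1 | the JJ-1 house
rule = SCFGRule("NP", ("la", "maison", 1), ("the", 1, "house"))

def apply_rule(rule, fillers):
    """Substitute a translation for each co-indexed non-terminal gap."""
    return [fillers[tok] if isinstance(tok, int) else tok for tok in rule.target]

print(apply_rule(rule, {1: "blue"}))   # ['the', 'blue', 'house']
```

A purely terminal rule is simply one with no integer gaps; a purely nonterminal rule has only integers on both sides.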

  4. Extracting Minimal Rules
     [figure: word-aligned parse of "I shall be passing on to you some comments" / "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen"]
     Extracted rule: S → X 1 X 2 | PRP 1 VP 2
     Note: one rule per alignable constituent

  5. Flashback: Decoding

  6. Chart Organization
     [figure: chart over "Sie will eine Tasse Kaffee trinken", with POS tags PPER VAFIN ART NN NN VVINF and constituents NP, VP, S]
     • The chart consists of cells that cover contiguous spans over the input sentence
     • For each span, a stack of (partial) translations is maintained
     • Bottom-up: a higher stack is filled once the underlying stacks are complete
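The chart organization can be sketched as follows. This is a minimal illustration of the data layout, not Koehn's actual decoder; the variable names are invented for the example.

```python
# A minimal chart for the example sentence: one stack of partial
# translations per contiguous span, filled bottom-up.
words = "Sie will eine Tasse Kaffee trinken".split()
n = len(words)

# chart[(i, j)] holds the stack of hypotheses covering words[i:j]
chart = {(i, j): [] for j in range(1, n + 1) for i in range(j)}

# Bottom-up order: shorter spans before longer ones, so every
# underlying stack is complete before a higher stack is filled.
spans_bottom_up = sorted(chart, key=lambda span: span[1] - span[0])

print(spans_bottom_up[:3])  # [(0, 1), (1, 2), (2, 3)]
```

For a sentence of length n there are n(n+1)/2 spans, so 21 cells here.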

  7. Prefix Tree for Rules
     [figure: prefix tree over source-side rule symbols (e.g. NP, DET, NN, des, das Haus); rules sharing a source prefix share a path, with target sides stored at the nodes]
     Highlighted rules:
     NP → NP 1 DET 2 NN 3 | NP 1 IN 2 NN 3
     NP → NP 1 | NP 1
     NP → NP 1 des NN 2 | NP 1 of the NN 2
     NP → NP 1 des NN 2 | NP 2 NP 1
     NP → DET 1 NN 2 | DET 1 NN 2
     NP → das Haus | the house
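The prefix tree can be sketched as a trie keyed on source-side symbols: rules sharing a source prefix share a path, so matching a span walks the trie once instead of scanning every rule. This is an illustrative structure with invented helper names, not the exact one in the decoder.

```python
# Each trie node stores the target sides of rules ending here, plus
# outgoing edges labelled with source symbols (terminals or labels).
def new_node():
    return {"rules": [], "next": {}}

root = new_node()

def insert(source_rhs, target_rhs):
    node = root
    for sym in source_rhs:
        node = node["next"].setdefault(sym, new_node())
    node["rules"].append(target_rhs)

insert(("das", "Haus"), ("the", "house"))
insert(("NP", "des", "NN"), ("NP-1", "of", "the", "NN-2"))

def lookup(source_rhs):
    node = root
    for sym in source_rhs:
        node = node["next"].get(sym)
        if node is None:
            return []
    return node["rules"]

print(lookup(("das", "Haus")))   # [('the', 'house')]
```

Note that an interior node such as ("das",) exists but carries no rules, so a lookup there returns the empty list.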

  8. CYK+ Parsing for SCFG
     [figure: CYK+ chart for "das Haus des Architekten Frank Gehry", building constituents bottom-up, e.g. DET: the / that, NN: house, NP: the house, NP: the NN, NP: DET NN, IN: of, NN: architect, NNP: Frank, NNP: Gehry]

  9. Processing One Span
     Extend the lists of dotted rules with cell constituent labels: the span's dotted-rule list (with the same start) is combined with a neighboring span's constituent labels of hypotheses (with the same end).
     Example sentence: das Haus des Architekten Frank Gehry
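The span-processing step can be sketched as follows: a dotted rule over words[i:k] is extended by each constituent label found in the neighboring cell over words[k:j], yielding dotted rules over the full span words[i:j]. The example data (spans and labels) are invented for illustration.

```python
# Dotted rules already matched, keyed by the span they cover.
dotted = {
    (0, 2): [("NP",)],     # "das Haus" matched as NP
}
# Constituent labels of hypotheses in each chart cell.
labels = {
    (2, 6): ["PP"],        # "des Architekten Frank Gehry" as PP
}

def extend_span(i, k, j):
    """Combine dotted rules over (i, k) with labels over (k, j)."""
    out = []
    for rule in dotted.get((i, k), []):
        for label in labels.get((k, j), []):
            out.append(rule + (label,))
    return out

print(extend_span(0, 2, 6))   # [('NP', 'PP')]
```

The same-start / same-end condition is what keeps the combination contiguous: the dotted rule ends exactly where the neighboring cell begins.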

  10. Pruning

  11. Where Are We Now?
     • We know which rules apply
     • We know where they apply (each non-terminal is tied to a span)
     • But there are still many choices:
       – many possible translations
       – each non-terminal may match multiple hypotheses
     → the number of choices is exponential in the number of non-terminals

  12. Rules with One Non-Terminal
     [figure: applicable rules PP → des X | ..., with target variants PP → of NP, PP → by NP, PP → in NP, PP → on to NP, and candidate NP hypotheses "the architect ...", "architect Frank ...", "the famous ...", "Frank Gehry"]
     • The non-terminal can be filled by any of h underlying matching hypotheses
     • Choice of t lexical translations
     ⇒ Complexity O(ht)
     (Note: we may not group rules by target constituent label, so a rule NP → des X | the NP would also be considered here.)

  13. Rules with Two Non-Terminals
     [figure: applicable rule NP → X 1 des X 2 | NP 1 ... NP 2, with target variants NP → NP of NP, NP → NP by NP, NP → NP in NP, NP → NP on to NP; candidate hypotheses "the architect ...", "architect Frank ...", "the famous ...", "Frank Gehry" for one gap and "a house", "a building", "the building", "a new house" for the other]
     • The two non-terminals can each be filled by any of h underlying matching hypotheses
     • Choice of t lexical translations
     ⇒ Complexity O(h²t), a three-dimensional "cube" of choices
     (Note: rules may also reorder differently.)
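The O(h²t) blow-up can be made concrete by naively enumerating every combination for the example above: with h = 4 hypotheses per gap and t = 4 lexical variants, 64 hypotheses are built before any pruning. The strings are taken from the slide; the enumeration itself is only an illustration of the complexity, not how the decoder works.

```python
import itertools

# Naive expansion of a rule with two non-terminal gaps:
# every pairing of h left fillers, h right fillers, and t lexical
# variants is built, i.e. h * h * t hypotheses in total.
left = ["a house", "a building", "the building", "a new house"]        # h = 4
right = ["the architect ...", "architect Frank ...",
         "the famous ...", "Frank Gehry"]                              # h = 4
lexical = ["%s of %s", "%s by %s", "%s in %s", "%s on to %s"]          # t = 4

hypotheses = [pat % (l, r)
              for l, r, pat in itertools.product(left, right, lexical)]
print(len(hypotheses))   # 64 = 4 * 4 * 4
```

Cube pruning, introduced on the next slides, avoids building this full cube.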

  14. Cube Pruning
     [figure: grid of choices; rows "a house" 1.0, "a building" 1.3, "the building" 2.2, "a new house" 2.6; columns "by architect ..." 1.7, "by the ..." 2.6, "of the ..." 3.2, "in the ..." 1.5]
     Arrange all the choices in a "cube" (here: a square; in general an orthotope, also called a hyperrectangle)

  15. Create the First Hypothesis
     [figure: the grid, with corner cell (0,0) scored 2.1]
     • Hypotheses created in cube: (0,0)

  16. Add ("Pop") Hypothesis to Chart Cell
     [figure: the grid; the popped hypothesis (0,0) = 2.1 now sits in the chart cell stack]
     • Hypotheses created in cube: ε
     • Hypotheses in chart cell stack: (0,0)

  17. Create Neighboring Hypotheses
     [figure: the grid, with new cells scored 2.5 and 2.7 next to (0,0)]
     • Hypotheses created in cube: (0,1), (1,0)
     • Hypotheses in chart cell stack: (0,0)

  18. Pop Best Hypothesis to Chart Cell
     [figure: the grid, with cells scored 2.1, 2.5, 2.7]
     • Hypotheses created in cube: (0,1)
     • Hypotheses in chart cell stack: (0,0), (1,0)

  19. Create Neighboring Hypotheses
     [figure: the grid, with additional cells scored 3.1 and 2.4]
     • Hypotheses created in cube: (0,1), (1,1), (2,0)
     • Hypotheses in chart cell stack: (0,0), (1,0)

  20. More of the Same
     [figure: the grid, with additional cells scored 3.0 and 3.8]
     • Hypotheses created in cube: (0,1), (1,2), (2,1), (2,0)
     • Hypotheses in chart cell stack: (0,0), (1,0), (1,1)
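The pop-and-expand walkthrough above can be sketched with a priority queue. In this sketch the score of a cell is simply the sum of its two part-costs (no language-model interaction), so the numbers differ from the slides'; the function name and budget are invented for the example.

```python
import heapq

rows = [1.0, 1.3, 2.2, 2.6]   # costs of hypotheses filling one gap
cols = [1.7, 2.6, 3.2, 1.5]   # costs of the lexical variants

def cube_prune(rows, cols, pops):
    """Pop the cheapest frontier cell, keep it, push its neighbors."""
    heap = [(rows[0] + cols[0], 0, 0)]   # start from corner (0, 0)
    seen = {(0, 0)}
    chart_cell = []
    for _ in range(pops):
        if not heap:
            break
        cost, i, j = heapq.heappop(heap)
        chart_cell.append(((i, j), round(cost, 1)))
        for ni, nj in ((i + 1, j), (i, j + 1)):   # create neighbors
            if ni < len(rows) and nj < len(cols) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (rows[ni] + cols[nj], ni, nj))
    return chart_cell

print(cube_prune(rows, cols, 3))
# [((0, 0), 2.7), ((1, 0), 3.0), ((0, 1), 3.6)]
```

Note that the cheap but distant cell (0, 3) (cost 2.5 here) is never reached within three pops, because a cell only becomes visible once a neighbor has been popped: cube pruning is an approximate search, not an exact one.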

  21. Queue of Cubes
     • Several groups of rules will apply to a given span
     • Each of them will have a cube
     • We can create a queue of cubes
     ⇒ Always pop off the most promising hypothesis, regardless of cube
     • May have separate queues for different target constituent labels

  22. Bottom-Up Chart Decoding Algorithm
     for all spans (bottom up) do
       extend dotted rules
       for all dotted rules do
         find group of applicable rules
         create a cube for it
         create first hypothesis in cube
         place cube in queue
       end for
       for specified number of pops do
         pop off best hypothesis of any cube in queue
         add it to the chart cell
         create its neighbors
       end for
       extend dotted rules over constituent labels
     end for

  23. Recombination and Pruning

  24. Dynamic Programming
     Applying a rule creates a new hypothesis:
     hypothesis NP+P: a cup of
     apply rule: NP → NP Kaffee | NP+P coffee
     → new hypothesis NP: a cup of coffee
     [figure: chart over "eine Tasse Kaffee trinken" (ART NN NN VVINF)]

  25. Dynamic Programming
     Another hypothesis reaches the same state:
     hypothesis NP: coffee
     apply rule: NP → eine Tasse NP | a cup of NP
     → new hypothesis NP: a cup of coffee
     Both hypotheses are indistinguishable in future search → they can be recombined

  26. Recombinable States
     Recombinable?
     NP: a cup of coffee
     NP: a cup of coffee
     NP: a mug of coffee

  27. Recombinable States
     Recombinable?
     NP: a cup of coffee
     NP: a cup of coffee
     NP: a mug of coffee
     Yes, iff at most a 2-gram language model is used

  28. Recombinability
     Hypotheses have to match in:
     • span of input words covered
     • output constituent label
     • first n-1 output words (not properly scored, since they lack context)
     • last n-1 output words (still affect scoring of subsequently added words, just as in phrase-based decoding)
     (n is the order of the n-gram language model)
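The four matching criteria above can be collected into a single recombination key. This is an illustrative sketch with an invented function name: two hypotheses with equal keys are indistinguishable in future search, so only the better-scoring one needs to be kept.

```python
# Recombination key under an n-gram language model.
def recombination_key(span, label, output_words, n):
    context = n - 1
    first = tuple(output_words[:context])   # not yet fully scored (no left context)
    last = tuple(output_words[-context:]) if context else ()  # scores future words
    return (span, label, first, last)

a = recombination_key((2, 6), "NP", ["a", "cup", "of", "coffee"], 2)
b = recombination_key((2, 6), "NP", ["a", "mug", "of", "coffee"], 2)
print(a == b)   # True: under a bigram LM, "cup" vs "mug" no longer matters
```

With a trigram model (n = 3) the first two and last two words enter the key, so the same pair of hypotheses is no longer recombinable, matching the "iff at most a 2-gram language model" answer on the previous slide.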
