Syntax-Based Decoding
Philipp Koehn
9 November 2017
Syntax-Based Models
Synchronous Context Free Grammar Rules

• Nonterminal rules
  NP → DET₁ NN₂ JJ₃ | DET₁ JJ₃ NN₂
• Terminal rules
  N → maison | house
  NP → la maison bleue | the blue house
• Mixed rules
  NP → la maison JJ₁ | the JJ₁ house
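To make the rule format concrete, here is a minimal sketch of how such synchronous rules might be represented in code; the SCFGRule class and its field names are illustrative assumptions for this sketch, not the notation of any particular toolkit.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# A symbol is either a terminal word (str) or a co-indexed nonterminal,
# written as a (label, index) pair so that DET_1, NN_2, ... can be linked
# across the two sides of the rule.
Symbol = Union[str, Tuple[str, int]]

@dataclass
class SCFGRule:
    lhs: str                 # left-hand-side nonterminal, e.g. "NP"
    source: List[Symbol]     # source-side symbols
    target: List[Symbol]     # target-side symbols

# Nonterminal rule:  NP -> DET_1 NN_2 JJ_3 | DET_1 JJ_3 NN_2
nonterminal_rule = SCFGRule("NP",
                            [("DET", 1), ("NN", 2), ("JJ", 3)],
                            [("DET", 1), ("JJ", 3), ("NN", 2)])

# Terminal rule:  NP -> la maison bleue | the blue house
terminal_rule = SCFGRule("NP",
                         ["la", "maison", "bleue"],
                         ["the", "blue", "house"])

# Mixed rule:  NP -> la maison JJ_1 | the JJ_1 house
mixed_rule = SCFGRule("NP",
                      ["la", "maison", ("JJ", 1)],
                      ["the", ("JJ", 1), "house"])
```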
Extracting Minimal Rules

(Figure: word-aligned sentence pair with English parse tree, node labels S, VP, PP, NP over PRP MD VB VBG RP TO PRP DT NNS.)
English: I shall be passing on to you some comments
German: Ich werde Ihnen die entsprechenden Anmerkungen aushändigen

Extracted rule: S → X₁ X₂ | PRP₁ VP₂
Done: one rule per alignable constituent.
Decoding
Syntactic Decoding

Inspired by monolingual syntactic chart parsing: during decoding of the source sentence, a chart with translations for the O(n²) spans has to be filled.

(Example input: Sie will eine Tasse Kaffee trinken, with POS tags PPER VAFIN ART NN NN VVINF and constituents NP, VP, S.)
Syntax Decoding

Worked example over the German input sentence with tree: Sie will eine Tasse Kaffee trinken (POS tags PPER VAFIN ART NN NN VVINF; constituents NP, VP, S).

➊ Purely lexical rule: filling a span with a translation (a constituent in the chart), here PRO she for Sie
➋ Purely lexical rule: NN coffee for Kaffee
➌ Purely lexical rule: VB drink for trinken
➍ Complex rule, matching underlying constituent spans and covering words: NP a cup of coffee (DET NN IN over a cup of, plus the NN hypothesis coffee)
➎ Complex rule with reordering: VP wants to drink a cup of coffee (VBZ wants, TO to, plus the VB and NP hypotheses)
➏ Final rule: S → PRO VP, covering the whole sentence: she wants to drink a cup of coffee
Bottom-Up Decoding

• For each span, a stack of (partial) translations is maintained
• Bottom-up: a higher stack is filled once the underlying stacks are complete
Chart Organization

(Example chart over the input Sie will eine Tasse Kaffee trinken, with POS tags PPER VAFIN ART NN NN VVINF and constituents NP, VP, S.)

• Chart consists of cells that cover contiguous spans over the input sentence
• Each cell contains a set of hypotheses¹
• Hypothesis = translation of a span with a target-side constituent

¹ In the book, they are called chart entries.
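As a rough illustration of this organization (the names and scores below are assumptions for this sketch), the chart can be a mapping from spans to stacks of hypotheses, each hypothesis carrying its target-side constituent label, partial translation, and score:

```python
from collections import defaultdict, namedtuple

# A hypothesis (chart entry): translation of a span with a target-side label.
Hypothesis = namedtuple("Hypothesis", ["label", "translation", "score"])

# The chart maps a contiguous span (start, end) over the input sentence
# to the stack (here simply a list) of hypotheses covering that span.
chart = defaultdict(list)

# For the example sentence "Sie will eine Tasse Kaffee trinken"
# (scores are made-up log-probabilities, purely for illustration):
chart[(0, 1)].append(Hypothesis("PRO", "she", -0.4))             # covers "Sie"
chart[(2, 5)].append(Hypothesis("NP", "a cup of coffee", -2.7))  # "eine Tasse Kaffee"
```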
Naive Algorithm

Input: foreign sentence f = f₁ … f_{l_f}, with syntax tree
Output: English translation e

for all spans [start, end] (bottom up) do
    for all sequences s of hypotheses and words in span [start, end] do
        for all rules r do
            if rule r applies to chart sequence s then
                create new hypothesis c
                add hypothesis c to chart
            end if
        end for
    end for
end for
return English translation e from best hypothesis in span [0, l_f]
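The pseudocode maps fairly directly onto the following deliberately naive Python sketch; sequences_over_span, rule_applies, and build_hypothesis are assumptions standing in for the real matching and scoring machinery, and hypotheses are assumed to carry a score attribute as in the chart sketch above.

```python
def sequences_over_span(words, chart, start, end):
    """Enumerate every sequence of items covering the span [start, end):
    each item is either a single input word or a hypothesis over a sub-span
    that is already in the chart."""
    if start == end:
        yield []
        return
    for mid in range(start + 1, end + 1):
        firsts = list(chart.get((start, mid), []))   # hypotheses over [start, mid)
        if mid == start + 1:
            firsts.append(words[start])              # or the bare input word
        for first in firsts:
            for rest in sequences_over_span(words, chart, mid, end):
                yield [first] + rest

def naive_decode(words, rules, rule_applies, build_hypothesis):
    """Naive bottom-up chart decoding: try every rule on every sequence
    over every span."""
    n = len(words)
    chart = {}
    for length in range(1, n + 1):                   # bottom up: short spans first
        for start in range(n - length + 1):
            end = start + length
            new_hypotheses = []
            for s in sequences_over_span(words, chart, start, end):
                for r in rules:
                    if rule_applies(r, s):           # does rule r match sequence s?
                        new_hypotheses.append(build_hypothesis(r, s))
            chart[(start, end)] = new_hypotheses
    # best hypothesis covering the whole sentence (None if there is none)
    return max(chart[(0, n)], key=lambda h: h.score, default=None)
```

Even this sketch makes the cost visible: the number of sequences per span grows exponentially, which is exactly the blow-up discussed below.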
Stack Pruning

• Number of hypotheses in each chart cell explodes
• Dynamic programming (recombination) is not enough
  ⇒ need to discard bad hypotheses, e.g., keep only the 100 best
• Different stacks for different output constituent labels?
• Cost estimates
  – translation model cost known
  – language model cost for internal words known → estimates for initial words
  – outside cost estimate? (how useful will an NP covering input words 3–5 be later on?)
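A minimal sketch of such histogram pruning for a single chart cell, assuming hypotheses with score and label attributes as in the earlier sketch; the limit of 100 follows the slide, everything else is illustrative:

```python
from collections import defaultdict

def prune_cell(hypotheses, max_size=100):
    """Keep only the max_size best-scoring hypotheses of one chart cell."""
    return sorted(hypotheses, key=lambda h: h.score, reverse=True)[:max_size]

def prune_cell_per_label(hypotheses, max_size=100):
    """Variant: maintain a separate stack per output constituent label,
    so that e.g. NP and VP hypotheses do not compete with each other."""
    stacks = defaultdict(list)
    for h in hypotheses:
        stacks[h.label].append(h)
    return {label: sorted(stack, key=lambda h: h.score, reverse=True)[:max_size]
            for label, stack in stacks.items()}
```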
Naive Algorithm: Blow-ups

• Many sub-span sequences: for all sequences s of hypotheses and words in span [start, end]
• Many rules: for all rules r
• Checking whether a rule applies is not trivial: rule r applies to chart sequence s

⇒ Unworkable
Solution

• Prefix tree data structure for rules
• Dotted rules
• Cube pruning
Storing Rules Efficiently
Storing Rules

• First concern: do they apply to the span?
  → have to match available hypotheses and input words
• Example rule
  NP → X₁ des X₂ | NP₁ of the NN₂
• Check for applicability
  – is there an initial sub-span with a hypothesis with constituent label NP?
  – is it followed by a sub-span over the word des?
  – is it followed by a final sub-span with a hypothesis with label NN?
• Sequence of relevant information
  NP • des • NN • NP₁ of the NN₂
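A small sketch of reducing a rule to exactly this matching sequence; the Rule tuple mirrors the SCFGRule sketch above (words as strings, nonterminals as (label, index) pairs) and is an assumption for illustration:

```python
from collections import namedtuple

# Same shape as the SCFGRule sketch above.
Rule = namedtuple("Rule", ["lhs", "source", "target"])

def matching_key(rule):
    """The information relevant for the applicability check: source words
    stay as words, and each source-side gap is replaced by the target-side
    constituent label that a chart hypothesis must carry to fill it."""
    required_label = {}
    for sym in rule.target:
        if isinstance(sym, tuple):           # co-indexed target nonterminal
            label, index = sym
            required_label[index] = label
    key = []
    for sym in rule.source:
        if isinstance(sym, tuple):           # a gap in the source side
            _, index = sym
            key.append(("NT", required_label[index]))
        else:                                # a source word to match literally
            key.append(("WORD", sym))
    return key

# NP -> X_1 des X_2 | NP_1 of the NN_2
rule = Rule("NP",
            [("X", 1), "des", ("X", 2)],
            [("NP", 1), "of", "the", ("NN", 2)])
print(matching_key(rule))   # [('NT', 'NP'), ('WORD', 'des'), ('NT', 'NN')]
```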
Rule Applicability Check

Trying to cover the six-word span das Haus des Architekten Frank Gehry with the rule NP • des • NN → NP: NP of the NN

• First: check for hypotheses with output constituent label NP
• Found an NP hypothesis (the house) in a cell: matched the first symbol of the rule
• Matched the word des: matched the second symbol of the rule
• Found an NN hypothesis (architect Frank Gehry) in a cell: matched the last symbol of the rule
• Matched the entire rule → apply it to create an NP hypothesis
• Look up the output words to create the new hypothesis: NP: the house of the architect Frank Gehry (note: there may be many matching underlying NP and NN hypotheses)
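This walk over the span can be written as a backtracking match over the key; as before, the key format and the chart layout are the assumptions from the sketches above, not the actual decoder implementation:

```python
def find_matches(key, words, chart, start, end):
    """Enumerate all ways the matching key (e.g. NP . des . NN) can cover
    the span [start, end); each match is the list of consumed items
    (hypotheses for nonterminal symbols, strings for matched words)."""
    def extend(pos, i, consumed):
        if i == len(key):
            if pos == end:                      # all symbols used, span covered
                yield consumed
            return
        kind, value = key[i]
        if kind == "WORD":
            if pos < end and words[pos] == value:
                yield from extend(pos + 1, i + 1, consumed + [value])
        else:  # "NT": need a chart hypothesis with the required label
            for mid in range(pos + 1, end + 1):
                for hyp in chart.get((pos, mid), []):
                    if hyp.label == value:
                        yield from extend(mid, i + 1, consumed + [hyp])
    yield from extend(start, 0, [])

# Example, using the Hypothesis namedtuple from the chart sketch above:
# words = "das Haus des Architekten Frank Gehry".split()
# chart = {(0, 2): [Hypothesis("NP", "the house", -1.1)],
#          (3, 6): [Hypothesis("NN", "architect Frank Gehry", -2.0)]}
# list(find_matches([("NT", "NP"), ("WORD", "des"), ("NT", "NN")],
#                   words, chart, 0, 6))
# -> one match: [NP "the house", "des", NN "architect Frank Gehry"]
```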
Checking Rules vs. Finding Rules

• What we showed:
  – given a rule
  – check if and how it can be applied
• But there are too many rules (millions) to check them all
• Instead:
  – given the underlying chart cells and input words
  – find which rules apply
Prefix Tree for Rules

(Figure: prefix tree over source-side symbol sequences such as NP, DET, NN, des, um, das Haus, with the corresponding target sides, e.g. NP₁ of the NN₂, stored at the nodes.)

Highlighted rules:
• NP → NP₁ DET₂ NN₃ | NP₁ IN₂ NN₃
• NP → NP₁ | NP₁
• NP → NP₁ des NN₂ | NP₁ of the NN₂
• NP → NP₁ des NN₂ | NP₂ NP₁
• NP → DET₁ NN₂ | DET₁ NN₂
• NP → das Haus | the house
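A minimal prefix-tree sketch for storing rules under their matching key, so that applicable rules are found by walking the tree rather than testing every rule; the class, its methods, and the stored rule format are illustrative assumptions:

```python
class RuleTrie:
    """Prefix tree over matching keys; each node stores all rules whose
    key ends exactly at that node."""
    def __init__(self):
        self.children = {}    # key symbol, e.g. ("WORD", "des") -> RuleTrie
        self.rules = []       # complete rules ending at this node

    def add(self, key, rule):
        node = self
        for symbol in key:
            node = node.children.setdefault(symbol, RuleTrie())
        node.rules.append(rule)

    def lookup(self, key):
        """Return all rules stored under exactly this key (empty if none)."""
        node = self
        for symbol in key:
            node = node.children.get(symbol)
            if node is None:
                return []
        return node.rules

# e.g. the highlighted rule  NP -> NP_1 des NN_2 | NP_1 of the NN_2
trie = RuleTrie()
trie.add([("NT", "NP"), ("WORD", "des"), ("NT", "NN")],
         ("NP", ["NP_1", "of", "the", "NN_2"]))
```

Roughly speaking, a partially matched key corresponds to a position inside this tree, which is what the dotted rules of the next section keep track of.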
Dotted Rules