A Polynomial-Time Dynamic Programming Algorithm for Phrase-Based Decoding with a Fixed Distortion Limit

Yin-Wen Chang (1)
Joint work with Michael Collins (1, 2)
(1) Google, New York
(2) Columbia University
July 31, 2017
Introduction

Background:
◮ Phrase-based decoding without further constraints is NP-hard
    ◮ Proof: reduction from the travelling salesman problem (TSP) [Knight(1999)]
◮ A hard distortion limit is commonly imposed in phrase-based machine translation (PBMT) systems

Question:
◮ Is phrase-based decoding with a fixed distortion limit NP-hard or not?
Introduction

A related problem: bandwidth-limited TSP
[Figure: cities 1, 2, ..., i, j, ... with the constraint $|i - j| \le d$]

This work: a new decoding algorithm
◮ Process the source words from left to right
◮ Maintain multiple “tapes” on the target side
◮ Run time: $O(n \, d! \, l \, h^{d+1})$
    n: source sentence length
    d: distortion limit
Overview of the proposed decoding algorithm

1    2     3       4      5              6
das  muss  unsere  sorge  gleichermaßen  sein

◮ Process the source words from left to right
◮ Maintain multiple “tapes” on the target side

Initially: $\pi_1 = \epsilon$, $\pi_2 = \epsilon$

Step 1: $\pi_1 \leftarrow \pi_1 \cdot (1, 2, \text{this must})$
    $\pi_1$ = (1, 2, this must)
    $\pi_2 = \epsilon$
Step 2: $\pi_2 \leftarrow \pi_2 \cdot (3, 4, \text{our concern})$
    $\pi_1$ = (1, 2, this must)
    $\pi_2$ = (3, 4, our concern)
Step 3: $\pi_1 \leftarrow \pi_1 \cdot (5, 5, \text{also})$
    $\pi_1$ = (1, 2, this must)(5, 5, also)
    $\pi_2$ = (3, 4, our concern)
Step 4: $\pi_1 \leftarrow \pi_1 \cdot (6, 6, \text{be}) \cdot \pi_2$
    $\pi_1$ = (1, 2, this must)(5, 5, also)(6, 6, be)(3, 4, our concern)
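The tape operations on the overview slides can be replayed in a few lines of Python. This is only an illustrative sketch of the four steps shown for the example sentence, not the decoding algorithm itself: the names `run_example_tapes` and `target_string` are hypothetical, and tapes are modelled as plain lists of (source start, source end, target text) triples.

```python
from typing import List, Tuple

# A phrase is (source_start, source_end, target_text), as written on the slides.
Phrase = Tuple[int, int, str]


def run_example_tapes() -> List[Phrase]:
    """Replay the four tape operations shown for the example sentence."""
    pi1: List[Phrase] = []  # tape 1 starts empty
    pi2: List[Phrase] = []  # tape 2 starts empty

    pi1 = pi1 + [(1, 2, "this must")]    # pi1 <- pi1 . (1, 2, this must)
    pi2 = pi2 + [(3, 4, "our concern")]  # pi2 <- pi2 . (3, 4, our concern)
    pi1 = pi1 + [(5, 5, "also")]         # pi1 <- pi1 . (5, 5, also)
    pi1 = pi1 + [(6, 6, "be")] + pi2     # pi1 <- pi1 . (6, 6, be) . pi2
    return pi1


def target_string(tape: List[Phrase]) -> str:
    """Concatenate the target sides of the phrases on a tape."""
    return " ".join(text for _, _, text in tape)


final_tape = run_example_tapes()
print(target_string(final_tape))
```

Note that the phrases are appended in source order (positions 1-2, 3-4, 5, 6), while the final tape reads in target order.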
Outline
◮ Introduction to the phrase-based decoding problem
◮ Target-side left-to-right: the usual decoding algorithm
◮ Source-side left-to-right: the proposed algorithm
◮ Time complexity of the proposed algorithm
◮ Conclusion and future work
Phrase-based decoding problem

Source:      das muss | unsere sorge | gleichermaßen | sein
Translated:  this must | our concern | also | be
Reordered:   this must | also | be | our concern

◮ Segment the German sentence into non-overlapping phrases
◮ Find an English translation for each German phrase
◮ Reorder the English phrases to get a better English sentence

Derivation: complete translation with phrase mappings
Sub-derivation: partial translation
Score a derivation

1    2     3       4      5              6
das  muss  unsere  sorge  gleichermaßen  sein

this must | also | be | our concern

◮ Phrase translation score: score(das muss, this must) + ···
◮ Language model score: score(<s> this must also be our concern </s>) = score(this | <s>) + score(must | this) + ···
◮ Reordering score: $\eta \cdot |2 + 1 - 5|$
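To make the language model decomposition concrete, the sketch below lists the bigram terms that the sum ranges over for the example translation. It assumes a bigram model, as the two terms written on the slide suggest; `bigram_terms` is a hypothetical helper and no actual scores are computed.

```python
from typing import List, Tuple


def bigram_terms(sentence: str) -> List[Tuple[str, str]]:
    """Return (word, previous_word) pairs, one per term score(word | previous_word)."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    return [(word, prev) for prev, word in zip(words, words[1:])]


terms = bigram_terms("this must also be our concern")
for word, prev in terms:
    print(f"score({word} | {prev})")
```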
Fixed distortion limit: distortion distance $\le d$

1    2     3       4      5              6
das  muss  unsere  sorge  gleichermaßen  sein

this must | also | be | our concern

◮ Distortion distance: $|2 + 1 - 5| = 2$
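The distortion distance on this slide can be computed for every pair of consecutive phrases in a derivation. The sketch below generalises the slide's single example $|2 + 1 - 5|$ to the pattern |end of previous phrase + 1 - start of next phrase|. That generalisation, the function names, and the choice to ignore any jump before the first phrase are assumptions made for illustration.

```python
from typing import List, Tuple

Phrase = Tuple[int, int, str]  # (source_start, source_end, target_text)


def distortion_distances(derivation: List[Phrase]) -> List[int]:
    """Distortion distance between each pair of consecutive phrases."""
    return [
        abs(prev_end + 1 - next_start)
        for (_, prev_end, _), (next_start, _, _) in zip(derivation, derivation[1:])
    ]


def within_limit(derivation: List[Phrase], d: int) -> bool:
    """True if every distortion distance is at most the limit d."""
    return all(dist <= d for dist in distortion_distances(derivation))


derivation = [(1, 2, "this must"), (5, 5, "also"), (6, 6, "be"), (3, 4, "our concern")]
distances = distortion_distances(derivation)
print(distances)
```

The first entry is the slide's $|2 + 1 - 5| = 2$; the jump back from position 6 to position 3 gives the largest distance in this derivation.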
Target-side left-to-right: the usual decoding algorithm

1    2     3       4      5              6
das  muss  unsere  sorge  gleichermaßen  sein

The translation is built in target order, one phrase at a time:
1. this must
2. this must | also
3. this must | also | be
4. this must | also | be | our concern
Target-side left-to-right: dynamic programming algorithm

1    2     3       4      5              6
das  muss  unsere  sorge  gleichermaßen  sein

Sub-derivation: (1, 2, this must)
DP state: (must, 2, 110000)

Sub-derivation: (1, 2, this must)(5, 5, also)
DP state: (also, 5, 110010)

Sub-derivation: (1, 2, this must)(5, 5, also)(6, 6, be)
DP state: (be, 6, 110011)

Sub-derivation: (1, 2, this must)(5, 5, also)(6, 6, be)(3, 4, our concern)
DP state: (concern, 4, 111111)
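The DP states on these slides can be reproduced from the derivation. The sketch below reads each state as (last target word, end position of the most recent source phrase, coverage bit-string over the source words). That reading is inferred from the example states rather than stated on the slides, and `target_side_states` is a hypothetical name.

```python
from typing import List, Tuple

Phrase = Tuple[int, int, str]  # (source_start, source_end, target_text)


def target_side_states(derivation: List[Phrase], n: int) -> List[Tuple[str, int, str]]:
    """Return the DP state reached after appending each phrase in target order."""
    covered = [0] * n  # covered[i] == 1 once source word i + 1 is translated
    states = []
    for start, end, text in derivation:
        for pos in range(start, end + 1):
            covered[pos - 1] = 1
        last_word = text.split()[-1]
        bits = "".join(str(bit) for bit in covered)
        states.append((last_word, end, bits))
    return states


derivation = [(1, 2, "this must"), (5, 5, "also"), (6, 6, "be"), (3, 4, "our concern")]
states = target_side_states(derivation, n=6)
for state in states:
    print(state)
```

Because the coverage bit-string is part of the state, the number of distinct states can grow with the number of possible coverage patterns.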
Source-side left-to-right: the proposed algorithm

1    2     3       4      5              6
das  muss  unsere  sorge  gleichermaßen  sein

The source phrases are translated in source order; the target side after each step:
1. this must
2. this must ... our concern
3. this must | also ... our concern
4. this must | also | be | our concern
Source-side left-to-right: dynamic programming state

1    2     3       4      5              6
das  muss  unsere  sorge  gleichermaßen  sein

Sub-derivation:
    $\pi_1$ = (1, 2, this must)
DP state: $j = 2$, $\sigma_1 = \langle 1, \text{this}, 2, \text{must} \rangle$

Sub-derivation:
    $\pi_1$ = (1, 2, this must)
    $\pi_2$ = (3, 4, our concern)
DP state: $j = 4$, $\sigma_1 = \langle 1, \text{this}, 2, \text{must} \rangle$, $\sigma_2 = \langle 3, \text{our}, 4, \text{concern} \rangle$
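The source-side DP state can be derived from the tapes in the same spirit. The sketch below assumes that each $\sigma$ summarises one tape as (source start of its first phrase, first target word, source end of its last phrase, last target word), and that $j$ is the rightmost source position covered so far. Both readings are inferred from the two example states on the slides, and the function names are hypothetical.

```python
from typing import List, Tuple

Phrase = Tuple[int, int, str]  # (source_start, source_end, target_text)
Signature = Tuple[int, str, int, str]


def signature(tape: List[Phrase]) -> Signature:
    """Summarise a non-empty tape by its two ends."""
    first_start, _, first_text = tape[0]
    _, last_end, last_text = tape[-1]
    return (first_start, first_text.split()[0], last_end, last_text.split()[-1])


def source_side_state(tapes: List[List[Phrase]]) -> Tuple[int, List[Signature]]:
    """Return (j, [sigma_1, sigma_2, ...]) for the given non-empty tapes."""
    j = max(end for tape in tapes for _, end, _ in tape)
    return j, [signature(tape) for tape in tapes]


pi1 = [(1, 2, "this must")]
pi2 = [(3, 4, "our concern")]
state_after_step_1 = source_side_state([pi1])
state_after_step_2 = source_side_state([pi1, pi2])
print(state_after_step_1)
print(state_after_step_2)
```

Under this reading the state keeps only the ends of each tape, not a coverage pattern over the whole source sentence.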