Faster Decoding for Phrases and Syntax
Kenneth Heafield
Translation is Expensive
“speed-up in tuning time but affects the performance”
“18 days using 12 cores” [Williams et al, WMT 2014]
“Time-sensitive BLEU score” [Chung and Galley, 2012]
“Due to time constraints, this procedure was not used” [Servan et al, WMT 2012]
⇒ Routine Quality Compromises
Introduction Problem Cube Pruning Incremental Conclusion
Blame the Language Model
“LM queries often account for more than 50% of the CPU” [Green et al, WMT 2014]
Faster queries (KenLM)
More effective queries
1 Decoding problem
2 Cube pruning
3 Incremental
Decoding Example: Input
Le garçon a vu l’homme avec un télescope
Decoding Example: Parse with SCFG
S:S
  X:NP  Le garçon
  X:VP
    X:VP
      X:V   a vu
      X:NP  l’homme
    X:PP  avec un télescope
Decoding Example: Read Target Side
Le garçon (X:NP): The boy | A boy
a vu (X:V): seen | saw | view
l’homme (X:NP): man | the man | some men
avec un télescope (X:PP): with the telescope | to an telescope | with a telescope
Decoding Example: One Constituent
X:VP → X:V X:NP, covering “a vu l’homme”
Hyp (X:V, a vu): seen | saw | view
Hyp (X:NP, l’homme): man | the man | some men
X:VP → X:V X:NP, covering “a vu l’homme”
Hyp (X:V): seen | saw | view
Hyp (X:NP): man | the man | some men
Hypothesis (X:VP): seen man | seen the man | seen some men | saw man | saw the man | saw some men | view man | view the man | view some men
X:VP → X:V X:NP, covering “a vu l’homme”

Hyp (X:V)   Score      Hyp (X:NP)   Score
seen        -3.8       man          -3.6
saw         -4.0       the man      -4.3
view        -4.0       some men     -6.3

Hypothesis (X:VP)   Score
seen man            -8.8
seen the man        -7.6
seen some men       -9.5
saw man             -8.3
saw the man         -6.9
saw some men        -8.5
view man            -8.5
view the man        -8.9
view some men       -10.8
X:VP hypotheses for “a vu l’homme”, sorted by score

Hypothesis (X:VP)   Score
saw the man         -6.9
seen the man        -7.6
saw man             -8.3
saw some men        -8.5
view man            -8.5
seen man            -8.8
view the man        -8.9
seen some men       -9.5
view some men       -10.8

Scores do not sum
Pruning is Approximate
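The two observations above can be checked directly against the numbers on the slides. The following is a minimal sketch, not part of the original talk: the dictionaries copy the slide scores, while the helper name `adjustment` and the "keep only the best child" pruning rule are assumptions made for illustration.

```python
# Scores copied from the slide tables (log-domain, higher is better).
V = {"seen": -3.8, "saw": -4.0, "view": -4.0}
NP = {"man": -3.6, "the man": -4.3, "some men": -6.3}
VP = {
    "seen man": -8.8, "seen the man": -7.6, "seen some men": -9.5,
    "saw man": -8.3, "saw the man": -6.9, "saw some men": -8.5,
    "view man": -8.5, "view the man": -8.9, "view some men": -10.8,
}


def adjustment(verb, obj):
    """Combined score minus the sum of the child scores (hypothetical helper)."""
    return round(VP[f"{verb} {obj}"] - (V[verb] + NP[obj]), 1)


# If scores summed, the best pair would be the best verb plus the best object.
best_by_sum = max(((v, o) for v in V for o in NP),
                  key=lambda pair: V[pair[0]] + NP[pair[1]])

# The hypothesis that actually scores best after combination.
best_actual = max(VP, key=VP.get)

print(best_by_sum, adjustment(*best_by_sum))   # best children, penalised when combined
print(best_actual, adjustment("saw", "the man"))  # a worse-looking pair that gains
```

Here the best children (`seen`, `man`) combine into a hypothesis that loses 1.4, while `saw the man` gains 1.4, so pruning each child list to its single best entry would discard the best combined hypothesis.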
Appending Strings
Hypotheses are built by string concatenation. Language model probability changes when this is done:

$$\frac{p(\text{saw the man})}{p(\text{saw})\,p(\text{the man})} = \frac{p(\text{the} \mid \text{saw})\,p(\text{man} \mid \text{saw the})}{p(\text{the})\,p(\text{man} \mid \text{the})}$$

Log probability is part of the score
⇒ Scores do not sum
⇒ Local decisions may not be globally optimal
⇒ Search is hard.
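A toy calculation may make the change in probability concrete. This is a sketch under stated assumptions: the conditional probabilities in `COND` are invented for illustration and do not come from any real language model, and `log_prob` is a hypothetical helper that scores a string in isolation with the chain rule.

```python
import math

# Hypothetical conditional probabilities p(word | context), invented for this example.
COND = {
    ((), "saw"): 0.01,
    ((), "the"): 0.05,
    (("saw",), "the"): 0.3,
    (("the",), "man"): 0.02,
    (("saw", "the"), "man"): 0.03,
}


def log_prob(words):
    """Chain-rule log10 probability of a string scored on its own."""
    return sum(math.log10(COND[(tuple(words[:i]), w)])
               for i, w in enumerate(words))


whole = log_prob(["saw", "the", "man"])
parts = log_prob(["saw"]) + log_prob(["the", "man"])

# The correction applied on concatenation: the words of "the man" are
# rescored with "saw" as left context instead of no context.
correction = whole - parts
expected = math.log10((0.3 * 0.03) / (0.05 * 0.02))
print(round(correction, 3), round(expected, 3))
```

The correction is nonzero whenever the new left context changes the conditional probabilities, which is why the score of a concatenation is not the sum of the scores of its parts.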
Beam Search [Lowerre, 1976; Chiang, 2005]

             man -3.6         the man -4.3         some men -6.3
seen -3.8    seen man -8.8    seen the man -7.6    seen some men -9.5
saw  -4.0    saw man -8.3     saw the man -6.9     saw some men -8.5
view -4.0    view man -8.5    view the man -8.9    view some men -10.8
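The grid above can be read as "score every cell, then keep the best k". A minimal sketch of that reading follows; the lookup table `VP_SCORE` stands in for a real scorer that would query the language model, and the call counter is an assumption added to show how many hypotheses get scored.

```python
import heapq

VERBS = ["seen", "saw", "view"]
OBJECTS = ["man", "the man", "some men"]

# Combined scores copied from the slide; a real decoder would compute these
# with language model queries rather than a lookup table.
VP_SCORE = {
    "seen man": -8.8, "seen the man": -7.6, "seen some men": -9.5,
    "saw man": -8.3, "saw the man": -6.9, "saw some men": -8.5,
    "view man": -8.5, "view the man": -8.9, "view some men": -10.8,
}

calls = 0  # number of hypotheses that were fully scored


def full_score(verb, obj):
    """Stand-in for the expensive scoring step; counts how often it runs."""
    global calls
    calls += 1
    return VP_SCORE[f"{verb} {obj}"]


def beam_search(left, right, k):
    """Score every combination, then keep the k best."""
    scored = [(full_score(l, r), f"{l} {r}") for l in left for r in right]
    return heapq.nlargest(k, scored)


top3 = beam_search(VERBS, OBJECTS, k=3)
print(top3, calls)
```

With three hypotheses on each side, all nine combinations are scored even though only three survive.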
Cube Pruning [Chiang, 2007]
Children: seen -3.8, saw -4.0, view -4.0 and man -3.6, the man -4.3, some men -6.3

Step 1. Scored so far: none
  Queue (Hypothesis: Sum)
    seen man: -3.8 - 3.6 = -7.4

Step 2. Scored so far: seen man -8.8
  Queue (Hypothesis: Sum)
    saw man: -4.0 - 3.6 = -7.6
    seen the man: -3.8 - 4.3 = -8.1

Step 3. Scored so far: seen man -8.8, saw man -8.3
  Queue (Hypothesis: Sum)
    view man: -4.0 - 3.6 = -7.6
    seen the man: -3.8 - 4.3 = -8.1
    saw the man: -4.0 - 4.3 = -8.3

Step 4. Scored so far: seen man -8.8, saw man -8.3, view man -8.5
  Queue (Hypothesis: Sum)
    seen the man: -3.8 - 4.3 = -8.1
    saw the man: -4.0 - 4.3 = -8.3
    view the man: -4.0 - 4.3 = -8.3
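The queue walk-through above can be sketched in a few lines. This follows the slides' presentation, where the queue is ordered by the sum of the child scores and a hypothesis is fully scored when it is popped; the function name `cube_prune`, the lookup table standing in for the scorer, and the tie-breaking by grid position are assumptions made for illustration.

```python
import heapq

# Child hypotheses sorted best first, with scores copied from the slides.
V = [("seen", -3.8), ("saw", -4.0), ("view", -4.0)]
NP = [("man", -3.6), ("the man", -4.3), ("some men", -6.3)]

# Combined scores from the slides, standing in for a real scorer.
VP_SCORE = {
    "seen man": -8.8, "seen the man": -7.6, "seen some men": -9.5,
    "saw man": -8.3, "saw the man": -6.9, "saw some men": -8.5,
    "view man": -8.5, "view the man": -8.9, "view some men": -10.8,
}


def cube_prune(left, right, full_score, k):
    """Pop k grid cells in order of summed child score, scoring each on pop."""
    # heapq is a min-heap, so the sum is negated to pop the best sum first.
    queue = [(-(left[0][1] + right[0][1]), 0, 0)]
    pushed = {(0, 0)}
    results = []
    while queue and len(results) < k:
        _, i, j = heapq.heappop(queue)
        hyp = f"{left[i][0]} {right[j][0]}"
        results.append((full_score[hyp], hyp))
        # Push the neighbours below and to the right of the popped cell.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in pushed:
                pushed.add((ni, nj))
                heapq.heappush(queue, (-(left[ni][1] + right[nj][1]), ni, nj))
    return sorted(results, reverse=True)


top3 = cube_prune(V, NP, VP_SCORE, k=3)
top5 = cube_prune(V, NP, VP_SCORE, k=5)
print(top3)
print(top5[0])
```

With k = 3 only `seen man`, `saw man` and `view man` are scored, matching the steps above, and the best hypothesis `saw the man` is missed. In this sketch it is reached only at k = 5, which illustrates the trade-off between the number of scored hypotheses and search accuracy.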
Beam Search: Make every dish. Keep the best k, throw the rest out.
Cube pruning: Combine the best ingredients. Only make k dishes.
Cube Pruning: Hypotheses are Atomic
[Figure: tables of hypothesis strings being concatenated, combining “is a” and “are a” with “countries that”, “countries which”, “country”, ...]
No notion that “a countries” is bad.