Representing Huge Translation Models Statistical Machine - PowerPoint PPT Presentation

Representing Huge Translation Models. Statistical Machine Translation pipeline: parallel text + alignment → extract rules → score rules ...


  2. Suffix Arrays
     Text T: it makes him and it mars him . it sets him on and it takes him off . #
     Positions: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
     Suffix Array SA: 3 12 2 15 10 6 0 4 8 13 1 5 16 11 9 14 7 17 18
     Query Pattern w: him and it
     Lookup cost: $O(|w| \log |T|)$, $O(|w| + \log |T|)$ (Manber & Myers, 93); $O(|w|)$ (Abouelhoda et al., 04)
     On baseline model: 0.009 seconds/sentence (not including extraction/scoring)
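
The lookup on this slide can be sketched in Python as follows. This is a minimal illustration of the plain binary-search variant, $O(|w| \log |T|)$, and not of the faster variants cited. The function names are hypothetical, and the token order (words alphabetically, then `.`, then `#`) is an assumption chosen so that the naive build reproduces the suffix array printed on the slide.

```python
# Example text from the slide, one token per position.
TEXT = "it makes him and it mars him . it sets him on and it takes him off . #".split()

# Assumed token order: words alphabetically, then ".", then "#".
_words = sorted({tok for tok in TEXT if tok.isalpha()})
RANK = {tok: i for i, tok in enumerate(_words + [".", "#"])}
IDS = [RANK[tok] for tok in TEXT]


def build_suffix_array(ids):
    """Naive build: sort suffix start positions by the suffix they begin."""
    return sorted(range(len(ids)), key=lambda i: ids[i:])


def find_occurrences(ids, sa, pattern_ids):
    """Return the text positions of the pattern, in suffix array order."""
    m = len(pattern_ids)

    def boundary(include_equal):
        # Binary search over the suffix array, comparing only the first m tokens.
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = ids[sa[mid]:sa[mid] + m]
            if prefix < pattern_ids or (include_equal and prefix == pattern_ids):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return sa[boundary(False):boundary(True)]


SA = build_suffix_array(IDS)


def lookup(phrase):
    """Convenience wrapper: look up a space-separated phrase in TEXT."""
    return find_occurrences(IDS, SA, [RANK[tok] for tok in phrase.split()])


print(SA)
print(lookup("him and it"))
```

Each binary search step compares at most $|w|$ tokens, which is where the $|w| \log |T|$ factor comes from.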

  3. Problem: Phrases with Gaps
     • Hierarchical phrase-based translation (Chiang 2005, 2007)
     • Quirk et al. 2005, Simard et al. 2005, DeNeefe et al. 2007
     Input: it persuades him and it disheartens him
     Source Phrase: it X him

  7. Hierarchical Phrases: Phrases with Gaps
     • Hierarchical phrase-based translation (Chiang 2005, 2007)
     • Quirk et al. 2005, Simard et al. 2005, DeNeefe et al. 2007
     Input: it persuades him and it disheartens him
     Source Phrase: it X and X him

  8. Problem Statement Given an input sentence, efficiently find all hierarchical phrase-based translation rules for that sentence in the training corpus.

  10. Pattern Matching for Hierarchical PBMT
      Input Pattern: it persuades him and it disheartens him
      Query Patterns (contiguous):
      1 word: it; persuades; him; and; disheartens
      2 words: it persuades; persuades him; him and; and it; it disheartens; disheartens him
      3 words: it persuades him; persuades him and; him and it; and it disheartens; it disheartens him
      4 words: it persuades him and; persuades him and it; him and it disheartens; and it disheartens him
      5 words: it persuades him and it; persuades him and it disheartens; him and it disheartens him
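
The contiguous query patterns listed on this slide can be generated with a short sketch like the one below. The function name is hypothetical, and the length limit of five tokens is an assumption inferred from the longest patterns shown on the slide; the slide does not state the limit.

```python
def contiguous_patterns(tokens, max_len=5):
    """Return the distinct contiguous subphrases of up to max_len tokens.

    max_len=5 is an assumed limit, inferred from the slide's list.
    """
    seen = []
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase not in seen:  # "it" and "him" occur twice in the input
                seen.append(phrase)
    return seen


patterns = contiguous_patterns("it persuades him and it disheartens him".split())
print(len(patterns))  # 23
```

Each of these patterns is then looked up in the suffix array as an ordinary contiguous phrase.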

  11. Pattern Matching for Hierarchical PBMT
      Input Pattern: it persuades him and it disheartens him
      Query Patterns (with gaps): it X and; it X it; it X disheartens; it X him; persuades X it; persuades X disheartens; persuades X him; it persuades X it; it persuades X disheartens; it persuades X him; it X and it; it X it disheartens; it X disheartens him; it X and X him; persuades him X disheartens; persuades him X him; persuades X it disheartens; persuades X disheartens him; him and X him; him X disheartens him; it persuades him X disheartens; it persuades him X him; it persuades X it disheartens; it persuades X disheartens him

  13. Pattern Matching for Hierarchical PBMT
      Input Pattern: it persuades him and it disheartens him
      Query Patterns (with gaps): it X and it disheartens; it X it disheartens him; persuades him and X him; persuades him X disheartens him; persuades X it disheartens him; it persuades him and X him; it persuades him X disheartens him; it persuades X it disheartens him; it X and it disheartens him
      This is a variant of approximate pattern matching (Navarro ‘01)
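
To make the size of the query set concrete, here is a hedged sketch that enumerates gapped query patterns for an input sentence. The function name and both limits (`max_gaps=2`, `max_terminals=5`) are assumptions for illustration; the slides do not state the exact constraints used. A gap `X` must cover at least one input token, and a pattern must begin and end with a word.

```python
def gapped_patterns(tokens, max_gaps=2, max_terminals=5):
    """Enumerate distinct query patterns, with and without gaps.

    max_gaps and max_terminals are assumed limits, not taken from the slides.
    """
    n = len(tokens)
    found = set()

    def extend(pos, pattern, gaps, terminals):
        # pos is the index just past the last input token used as a terminal.
        found.add(" ".join(pattern))
        if terminals >= max_terminals:
            return
        # Continue with the adjacent token (no gap).
        if pos < n:
            extend(pos + 1, pattern + [tokens[pos]], gaps, terminals + 1)
        # Open a gap covering tokens[pos:nxt], then resume at tokens[nxt].
        if gaps < max_gaps:
            for nxt in range(pos + 1, n):
                extend(nxt + 1, pattern + ["X", tokens[nxt]], gaps + 1, terminals + 1)

    for start in range(n):
        extend(start + 1, [tokens[start]], 0, 1)
    return found


all_patterns = gapped_patterns("it persuades him and it disheartens him".split())
with_gaps = sorted(p for p in all_patterns if " X " in p)
print(len(all_patterns), len(with_gaps))
```

Even for a seven-word input, the patterns with gaps far outnumber the contiguous ones, which is why the per-pattern lookup cost matters.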

  19. Pattern Matching with Gaps
      Query pattern α: him X it
      Subpatterns $w_i$: him, it
      Occurrences: $n_i$ for subpattern $w_i$
      Suffix array (position: suffix):
      3: and it mars him , it sets him ...
      12: and it takes him off . #
      2: him and it mars him . it sets ...
      15: him off . #
      10: him on and it takes him off . #
      6: him , it sets him on and it ...
      0: it makes him and it mars ...
      4: it mars him , it sets him on ...
      8: it sets him on and it takes ...
      13: it takes him off . #
      1: makes him and it mars him ...
      ...

  20. Pattern Matching with Gaps
      Occurrence lists read off the suffix array of the previous slide (in suffix array order):
      him: 2, 15, 10, 6
      it: 0, 4, 8, 13

  24. Pattern Matching with Gaps
      Occurrences of him: 2, 15, 10, 6
      Occurrences of it: 0, 4, 8, 13
      Matches of him X it: (2, 4), (2, 8), (2, 13), (6, 8), (6, 13), (10, 13)
      RILMS (Rahman et al., 06): linear in number of occurrences of subpatterns, $O(\sum_i n_i)$
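
The pairing step on this slide can be reproduced with the sketch below. It is not the RILMS algorithm of Rahman et al.; it is a simple enumeration, with a hypothetical function name, that pairs every occurrence of the first subpattern with every later occurrence of the second, requiring the gap to cover at least one token. Its cost grows with the number of pairs reported.

```python
from bisect import bisect_right


def collocate(first_occ, second_occ, first_len=1):
    """Pair occurrences of w1 and w2 so that w1 X w2 matches with a non-empty gap."""
    first_sorted = sorted(first_occ)
    second_sorted = sorted(second_occ)
    pairs = []
    for p in first_sorted:
        # w2 must start strictly after position p + first_len,
        # leaving at least one token for the gap X.
        k = bisect_right(second_sorted, p + first_len)
        pairs.extend((p, q) for q in second_sorted[k:])
    return pairs


him = [2, 15, 10, 6]  # occurrences of "him", in suffix array order
it = [0, 4, 8, 13]    # occurrences of "it"
matches = collocate(him, it)
print(matches)  # [(2, 4), (2, 8), (2, 13), (6, 8), (6, 13), (10, 13)]
```

Note that the occurrence lists arrive in suffix array order, so this sketch sorts them by text position before pairing.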

  25. Baseline Timing Result: 221 seconds per sentence. Compare: 0.009 seconds per sentence for contiguous phrases.

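  26. Complexity Analysis
      Contiguous patterns: $\sum_{w} (|w| + \log |T|)$
      Discontiguous patterns: $\sum_{\alpha = w_1 X \ldots X w_I} \sum_{i=1}^{I} (|w_i| + \log |T| + n_i)$
      Numeric annotations on the slide (placement not recoverable from the text): 137, 5, 27 (contiguous); 2825, 3, 5, 27, 82069 (discontiguous)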

  28. Exploiting Redundancy
      Input Pattern: it persuades him and it disheartens him
      Query Patterns (with gaps): it X and; it X it; it X disheartens; it X him; persuades X it; persuades X disheartens; persuades X him; it persuades X it; it persuades X disheartens; it persuades X him; it X and it; it X it disheartens; it X disheartens him; it X and X him; persuades him X disheartens; persuades him X him; persuades X it disheartens; persuades X disheartens him; him and X him; him X disheartens him; it persuades him X disheartens; it persuades him X him; it persuades X it disheartens; it persuades X disheartens him

  32. Exploiting Redundancy
      Query Pattern: it persuades X disheartens him
      Maximal Prefix: it persuades X disheartens (Zhang & Vogel 2005)
      Maximal Suffix: persuades X disheartens him
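
A small sketch of the maximal prefix and suffix of a query pattern follows. The slides do not spell out how the two are used; the pruning check here is an assumption, based on the observation that any occurrence of a pattern contains an occurrence of its prefix and of its suffix, so a pattern whose prefix or suffix is known to be absent cannot occur. All names are hypothetical.

```python
def maximal_prefix(pattern):
    """All symbols of the pattern except the last."""
    return pattern[:-1]


def maximal_suffix(pattern):
    """All symbols of the pattern except the first."""
    return pattern[1:]


def can_skip(pattern, known_absent):
    """True if the prefix or suffix is already known to have no occurrences."""
    return (maximal_prefix(pattern) in known_absent
            or maximal_suffix(pattern) in known_absent)


query = tuple("it persuades X disheartens him".split())
prefix = maximal_prefix(query)  # it persuades X disheartens
suffix = maximal_suffix(query)  # persuades X disheartens him
print(" ".join(prefix))
print(" ".join(suffix))
```

Patterns are represented as tuples so that they can be stored in a set of known-absent patterns.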

  33. Prefix Tree with Suffix Links (figure; labels recoverable from the slide: it, persuades, him, X)

  35. Timing Results (seconds/sentence): Baseline 221; Prefix Tree 177

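  36. Complexity Analysis
      Contiguous patterns: $\sum_{w} (|w| + \log |T|)$
      Discontiguous patterns: $\sum_{\alpha = w_1 X \ldots X w_I} \sum_{i=1}^{I} (|w_i| + \log |T| + n_i)$
      Numeric annotations on the slide (placement not recoverable from the text): 137, 5, 27 (contiguous); 2825, 3, 5, 27, 82069 (discontiguous)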

  38. Empirical Analysis (plot: cumulative time (s) against computations ranked by time)

  39. Distribution of Patterns in Training Data (plot: frequency against pattern types in descending order of frequency)

  41. Analysis of Problem • The expensive computations involve at least one frequent subpattern. There are two cases. • A frequent pattern paired with an infrequent pattern • Two frequent patterns paired with each other

  42. Frequent × Infrequent Subpatterns

  53. Double Binary Search (Baeza-Yates, 04)
      Queryset Q, Dataset D
      Complexity: $|Q| \log |D|$ (upper bound)
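
As a rough illustration of the idea, the sketch below intersects a small sorted query set Q with a large sorted data set D by binary-searching the median of Q in D and recursing on both halves. It shows the plain set-intersection form only; the talk applies the technique to pairing subpattern occurrences, which this sketch does not model. The function name is hypothetical, and both inputs are assumed sorted with no repeated values.

```python
from bisect import bisect_left


def double_binary_intersect(q, d):
    """Intersect sorted lists q (small) and d (large); result is sorted."""
    out = []

    def recurse(q_lo, q_hi, d_lo, d_hi):
        if q_lo >= q_hi or d_lo >= d_hi:
            return
        mid = (q_lo + q_hi) // 2
        x = q[mid]
        # Locate the median query element inside the current slice of d.
        pos = bisect_left(d, x, d_lo, d_hi)
        # Smaller query elements can only match to the left of pos.
        recurse(q_lo, mid, d_lo, pos)
        found = pos < d_hi and d[pos] == x
        if found:
            out.append(x)
        # Larger query elements can only match to the right of pos.
        recurse(mid + 1, q_hi, pos + 1 if found else pos, d_hi)

    recurse(0, len(q), 0, len(d))
    return out


common = double_binary_intersect([4, 13], [0, 2, 4, 8, 10, 13, 15])
print(common)  # [4, 13]
```

Each query element triggers at most one binary search in D, which gives the $|Q| \log |D|$ upper bound; the searches get cheaper as the slices of D shrink.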

  57. Obtaining Sorted Sets
      Sort via Stratified Tree (van Emde Boas et al. 1977)
      Problem: complexity increases to $O(|Q| \log |D| + (|Q| + |D|) \log \log |T|)$
      Solution: cache sorted set in prefix tree
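
The caching idea can be sketched as below. Two substitutions are made and should be read as assumptions: a plain dictionary stands in for the prefix tree node that holds the cached set, and Python's built-in `sorted` stands in for the stratified tree sort. The class name and the `lookup` callback are hypothetical.

```python
class SortedOccurrenceCache:
    """Sort each pattern's occurrence list once and reuse the result."""

    def __init__(self, lookup):
        self._lookup = lookup      # returns occurrences in suffix array order
        self._cache = {}           # stand-in for storage in prefix tree nodes
        self.sorts_performed = 0

    def get(self, pattern):
        if pattern not in self._cache:
            # sorted() replaces the stratified tree sort in this sketch.
            self._cache[pattern] = sorted(self._lookup(pattern))
            self.sorts_performed += 1
        return self._cache[pattern]


# Occurrence lists from the suffix array example, in suffix array order.
unsorted_occurrences = {"him": [2, 15, 10, 6], "it": [0, 4, 8, 13]}
cache = SortedOccurrenceCache(unsorted_occurrences.__getitem__)
first = cache.get("him")
second = cache.get("him")  # served from the cache, no second sort
print(first, cache.sorts_performed)
```

Because many query patterns for one sentence share subpatterns, the sort cost for a given subpattern is paid only the first time it is needed.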

  59. Timing Results (seconds/sentence): Baseline 221; Prefix Tree 177; + double binary 174

  60. Obtaining Sorted Sets Sort via Stratified Tree Problem: sort complexity is still very high for very frequent patterns

  61. Obtaining Sorted Sets Solution: precompute the inverted index for 1000 most frequent contiguous patterns
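
A minimal sketch of this precomputation is given below. The slide fixes the number of patterns at 1000; the maximum pattern length (`max_len=5`) and the function name are assumptions. Positions are collected in a left-to-right scan, so each list is already sorted by text position and needs no sorting at query time.

```python
from collections import Counter, defaultdict


def build_inverted_index(tokens, top_k=1000, max_len=5):
    """Map the top_k most frequent contiguous patterns to sorted positions.

    max_len is an assumed limit on pattern length.
    """
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    frequent = {pattern for pattern, _ in counts.most_common(top_k)}

    index = defaultdict(list)
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            pattern = tuple(tokens[i:i + n])
            if pattern in frequent:
                index[pattern].append(i)  # i increases, so lists stay sorted
    return dict(index)


text = "it makes him and it mars him . it sets him on and it takes him off . #".split()
index = build_inverted_index(text, top_k=2)
print(index[("him",)], index[("it",)])
```

On the example text with `top_k=2`, only `him` and `it` (four occurrences each) are indexed.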

  63. Timing Results (seconds/sentence): Baseline 221; Prefix Tree 177; + double binary 174; + inverted indices 44

  65. Frequent × Frequent Subpatterns Problem: There is no clever algorithm to solve this problem

  66. Solution: Precomputation
      Text: it makes him and it mars him . it sets him on and it takes him off . #
      Positions: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
