  1. Translating Neuralese Jacob Andreas, Anca Dragan, and Dan Klein

  2. Learning to Communicate [Wagner et al. 03, Sukhbaatar et al. 16, Foerster et al. 16]

  4. Neuralese: [1.0, 2.3, -0.3, 0.4, -1.2, 1.1]

  5. Translating neuralese: [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] → “all clear”

  6. Translating neuralese: [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] → “all clear” • Interoperate with autonomous systems • Diagnose errors • Learn from solutions [Lazaridou et al. 16]

  7. Outline: Natural language & neuralese • Statistical machine translation • Semantic machine translation • Implementation details • Evaluation

  13. A statistical MT problem: $\max_a \; p(z_0 \mid a)\, p(a)$, where $z_0$ = [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] is the neuralese message and $a$ is an English utterance like “all clear” [e.g. Koehn 10]

  14. A statistical MT problem: how do we induce a translation model?

  15. A statistical MT problem: $\max_a \; p(z_0 \mid a)\, p(a) \;\propto\; \max_a \sum_x p(z_0 \mid x)\, p(a \mid x)\, p(x)$; the two languages are aligned only through the shared contexts $x$ in which messages are produced.
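A minimal sketch of this induction step in Python (slides 13-15). Everything concrete here is an assumption for illustration: the `logs` of (context, neuralese, English) triples and the bucketing of neuralese vectors into discrete symbols like "z17" stand in for whatever data the trained agents would actually produce.

```python
from collections import Counter, defaultdict

def translate_stat_mt(logs, z_query):
    """Score English utterances a for a neuralese message z via
    p(a | z) proportional to sum_x p(a | x) p(z | x) p(x): the two
    languages are "aligned" only through shared contexts x."""
    ctx = Counter()                  # counts of contexts x
    a_in_ctx = defaultdict(Counter)  # a_in_ctx[x][a] = count of (x, a)
    z_in_ctx = defaultdict(Counter)  # z_in_ctx[x][z] = count of (x, z)
    for x, z, a in logs:
        ctx[x] += 1
        a_in_ctx[x][a] += 1
        z_in_ctx[x][z] += 1
    total = sum(ctx.values())
    scores = Counter()
    for x, n_x in ctx.items():
        p_z_given_x = z_in_ctx[x][z_query] / n_x
        for a, n_a in a_in_ctx[x].items():
            scores[a] += (n_a / n_x) * p_z_given_x * (n_x / total)
    return scores.most_common()

# Toy usage: contexts are grid cells; neuralese vectors pre-bucketed to symbols.
logs = [("cell_A", "z17", "all clear"),
        ("cell_A", "z17", "go ahead"),
        ("cell_B", "z03", "I'm going north")]
print(translate_stat_mt(logs, "z17"))
```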

  16. Strategy mismatch: $\zeta(s) = \frac{1}{\Gamma(s)} \int_0^\infty \frac{x^{s-1}}{e^x - 1}\, dx$

  17. Strategy mismatch: $\zeta(s) = \frac{1}{\Gamma(s)} \int_0^\infty \frac{x^{s-1}}{e^x - 1}\, dx$ → “not sure”

  18. Strategy mismatch: “not sure” / “dunno”

  19. Strategy mismatch: “not sure” / “dunno” / “yes”

  20. Strategy mismatch: “not sure” → yes · “dunno” → no · “yes” → yes

  21. Strategy mismatch: “not sure” → yes; $\sum_x p(z_0, x \mid \text{“not sure”})\, p(\text{“not sure”})$

  22. The stat MT criterion doesn’t capture meaning: “moving (0,3) → (1,4)” vs. “in the intersection”

  23. Outline: Natural language & neuralese • Statistical machine translation ✘ • Semantic machine translation • Implementation details • Evaluation

  24. A “semantic MT” problem: the meaning of an utterance is given by its truth conditions. “I’m going north” [Davidson 67]

  25. A “semantic MT” problem: the meaning of an utterance is given by its truth conditions. “I’m going north”: ✔ ✔ ✘ across example states [Davidson 67]

  26. A “semantic MT” problem: the meaning of an utterance is given by its truth conditions. “I’m going north” ⇒ (loc (goal blue) north)

  27. A “semantic MT” problem: the meaning of an utterance is given by the distribution over states in which it is uttered. “I’m going north”: 0.4, 0.2, 0.001 over example states [Beltagy et al. 14]

  28. A “semantic MT” problem: the meaning of an utterance is given by the distribution over states in which it is uttered, or equivalently, by the belief it induces in listeners. “I’m going north”: 0.4, 0.2, 0.001 [Frank et al. 09, A & Klein 16]

  29. Representing meaning: the meaning of an utterance is given by its truth conditions; by the distribution over states in which it is uttered; or, equivalently, by the belief it induces in listeners.

  30. Representing meaning: the meaning of an utterance is given by the distribution over states in which it is uttered, or equivalently, by the belief it induces in listeners. This distribution is well-defined even if the “utterance” is a vector rather than a sequence of tokens.
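A sketch of computing such a belief. The speaker model `likelihood(z, x)` ≈ p(z | x) and the candidate state set are assumptions (hypothetical helpers), but the point of slide 30 survives in code: nothing below cares whether z is a string or a vector.

```python
import numpy as np

def belief(z, states, likelihood, prior=None):
    """beta(z): the listener belief induced by message z, via Bayes' rule
    p(x | z) proportional to p(z | x) p(x) over candidate states."""
    if prior is None:
        prior = np.ones(len(states)) / len(states)  # uniform prior over states
    scores = np.array([likelihood(z, x) for x in states]) * np.asarray(prior)
    return scores / scores.sum()

# Toy usage with a hand-made likelihood over two candidate world states.
beta = belief("all clear", ["clear", "blocked"],
              lambda z, x: 0.9 if (z == "all clear") == (x == "clear") else 0.1)
print(beta)  # -> [0.9 0.1]
```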

  31. Translating with meaning: $z_0$ = [1.0, 2.3, -0.3, 0.4, -1.2, 1.1]

  32. Translating with meaning: [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] → “in the intersection”

  33. Translating with meaning: [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] → “I’m going north”

  34. Translating with meaning: [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] → “I’m going north”; compare $p(x \mid a)$ with $p(x \mid z_0)$

  35. Translating with meaning: [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] → “I’m going north”; compare $\beta(a)$ with $\beta(z_0)$

  36. Interlingua! $\beta(z_0)$ and $\beta(a)$: source text and target text meet in a shared belief space.

  37. Translation criterion: $\operatorname{argmin}_a \; \mathrm{KL}\!\left(\beta(z_0) \,\|\, \beta(a)\right)$

  41. Computing representations: $\operatorname{argmin}_a \; \mathrm{KL}\!\left(\beta(z_0) \,\|\, \beta(a)\right)$

  42. Computing representations: sparsity. The empirical estimates of $p(x \mid a)$ and $p(x \mid z_0)$ are sparse.

  43. Computing representations: smoothing. $\operatorname{argmin}_a \; \mathrm{KL}\!\left(\beta(z_0) \,\|\, \beta(a)\right)$; the agent policy produces actions & messages.

  44. Computing representations: smoothing. The agent policy produces actions & messages; an agent model is fit to them.

  45. Computing representations: smoothing. Human actions & messages.

  46. Computing representations: smoothing. The human produces actions & messages; a human model is fit to them.

  47. Computing representations: smoothing. Example model-smoothed beliefs for $a$ and $z_0$: 0.10, 0.08, 0.05, 0.01, 0.13, 0.22
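One way to realize this smoothing step in code, with an off-the-shelf logistic regression standing in for the learned agent/human models on these slides; the toy message vectors and state ids below are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: neuralese vectors (rows), each labeled with the state it was sent in.
Z = np.array([[1.0, 2.3, -0.3], [0.4, -1.2, 1.1],
              [1.1, 2.0, -0.5], [0.5, -1.0, 0.9]])
X = np.array([0, 1, 0, 1])  # state ids

# The fitted "listener model" gives a smooth p(x | z) even for unseen
# messages, replacing sparse empirical counts.
listener = LogisticRegression().fit(Z, X)
print(listener.predict_proba([[0.9, 2.1, -0.4]])[0])  # smoothed beta(z)
```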

  48. Computing KL: $\operatorname{argmin}_a \; \mathrm{KL}\!\left(\beta(z_0) \,\|\, \beta(a)\right)$

  49. Computing KL: $\mathrm{KL}(p \,\|\, q) = \mathbb{E}_p\!\left[\log \frac{p(x)}{q(x)}\right]$

  50. Computing KL: sampling. $\mathrm{KL}(p \,\|\, q) = \sum_i p(x_i) \log \frac{p(x_i)}{q(x_i)}$ over sampled states $x_i$
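The sampled estimate is a few lines of Python; the epsilon guard against zero probabilities is my addition, not something stated on the slide.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL(p || q) = sum_i p(x_i) log(p(x_i) / q(x_i)) over a shared sampled support."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)))

print(kl([0.9, 0.1], [0.7, 0.3]))  # small: beliefs roughly agree
print(kl([0.9, 0.1], [0.1, 0.9]))  # large: beliefs conflict
```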

  51. Finding translations: $\operatorname{argmin}_a \; \mathrm{KL}\!\left(\beta(z_0) \,\|\, \beta(a)\right)$

  52. Finding translations: brute force. Candidate KL scores: going north: 0.5 · crossing the intersection: 2.3 · I’m done: 0.2 · after you: 9.7

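Putting the pieces together, a sketch of the brute-force search. `beta` is assumed to be a one-argument closure over the earlier `belief` helper (states, likelihood, and prior already bound), and `kl` is the sampled estimate above; the per-candidate numbers on slide 52 are scores of exactly this kind.

```python
from functools import partial

def translate(z, candidates, beta, kl):
    """argmin_a KL(beta(z) || beta(a)): pick the candidate utterance whose
    induced belief best matches the belief induced by the neuralese."""
    beta_z = beta(z)
    return min(candidates, key=lambda a: kl(beta_z, beta(a)))

# e.g. beta = partial(belief, states=STATES, likelihood=model_likelihood)
# translate(z0, ["going north", "crossing the intersection"], beta, kl)
```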

  55. Outline: Natural language & neuralese • Statistical machine translation • Semantic machine translation • Implementation details • Evaluation

  56. Referring expression games: [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] ↔ “orange bird with black face”

  57. Evaluation: translator-in-the-loop. [1.0, 2.3, -0.3, 0.4, -1.2, 1.1] ↔ “orange bird with black face”

  59. Experiment: color references

  62. Experiment: color references (accuracy): Neuralese → Neuralese: 1.00 · English → English*: 0.83 · Neuralese → English*: 0.72 (statistical MT), 0.86 (semantic MT) · English → Neuralese: 0.70 (statistical MT), 0.73 (semantic MT)

  63. Experiment: color references, example translations: “magenta, hot, rose” → “magenta, hot, violet”; “olive, puke, pea”; “pinkish, grey, dull”

  69. Experiment: image references (accuracy): Neuralese → Neuralese: 95 · English → English*: 77 · Neuralese → English*: 72 (statistical MT), 86 (semantic MT) · English → Neuralese: 70 (statistical MT), 73 (semantic MT)

  70. Experiment: image references, example translations: “large bird, black wings, black crown” → “large bird, black wings, black crown”; “small brown, light brown, dark brown”

  71. Experiment: driving game (reward): Neuralese ↔ Neuralese: 1.93 · English ↔ English*: 1.54 · Neuralese ↔ English*: 1.35 (statistical MT), 1.49 (semantic MT)

  72. Conclusions • Classical notions of “meaning” apply even to un-language-like things (e.g. RNN states) • These meanings can be compactly represented without logical forms if we have access to world states • Communicating policies “say” interpretable things!
