Translating Neuralese
Jacob Andreas, Anca Dragan, and Dan Klein
Learning to Communicate [Wagner et al. 03, Sukhbaatar et al. 16, Foerster et al. 16]
Neuralese: 1.0 2.3 -0.3 0.4 -1.2 1.1
Translating neuralese: 1.0 2.3 -0.3 0.4 -1.2 1.1 → “all clear”
• Interoperate with autonomous systems
• Diagnose errors
• Learn from solutions [Lazaridou et al. 16]
Outline
• Natural language & neuralese
• Statistical machine translation
• Semantic machine translation
• Implementation details
• Evaluation
A statistical MT problem [e.g. Koehn 10]
Example: 1.0 2.3 -0.3 0.4 -1.2 1.1 → “all clear”
$\max_{a} \; p(z_0 \mid z_a)\, p(z_a)$
How do we induce a translation model?
$\max_{a} \; p(z_0 \mid z_a)\, p(z_a) \;\propto\; \max_{a} \sum_{x} p(z_a \mid x)\, p(z_0 \mid x)\, p(x)$
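The statistical criterion can be sketched in a few lines, reading it as: score each candidate utterance by summing, over world states, the probability of the utterance in that state, the probability of the observed message in that state, and the prior on the state. The function name, the state names, the message label `"z1"`, and every probability below are hypothetical values chosen for illustration; the slides do not give them.

```python
def stat_mt_translate(message, candidates, prior, p_message, p_utterance):
    """Pick the utterance a maximizing sum_x p(a | x) * p(message | x) * p(x)."""
    def score(a):
        return sum(
            p_utterance[a][x] * p_message[message][x] * prior[x]
            for x in prior
        )
    return max(candidates, key=score)


# Hypothetical toy world: two states with hand-picked probabilities.
prior = {"heading_north": 0.5, "in_intersection": 0.5}
# p_message[z][x]: probability that the agent emits message z in state x.
p_message = {"z1": {"heading_north": 0.9, "in_intersection": 0.1}}
# p_utterance[a][x]: probability that a human says a in state x.
p_utterance = {
    "I'm going north": {"heading_north": 0.8, "in_intersection": 0.1},
    "In the intersection": {"heading_north": 0.1, "in_intersection": 0.8},
}

best = stat_mt_translate("z1", list(p_utterance), prior, p_message, p_utterance)
print(best)
```

This criterion rewards utterances that tend to co-occur with the message, which is not the same as sharing its meaning.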
Strategy mismatch
$\zeta(s) = \frac{1}{\Gamma(s)} \int_0^\infty \frac{1}{e^x - 1}\, x^s\, \frac{dx}{x}$
“not sure”
Strategy mismatch: “not sure”, “dunno”, “yes”
Strategy mismatch: “not sure”, “yes”, “dunno”, “no”, “yes”, “yes”
Strategy mismatch: “not sure”, “yes”; Σ p( , | not sure ) p( not sure )
Stat MT criterion doesn’t capture meaning: “moving (0,3) → (1,4)” vs. “In the intersection”
Outline
• Natural language & neuralese
• Statistical machine translation ✘
• Semantic machine translation
• Implementation details
• Evaluation
A “semantic MT” problem
The meaning of an utterance is given by its truth conditions [Davidson 67].
Example: “I’m going north” is true in some states and false in others (✔ ✔ ✘), and can be written as the logical form (loc (goal blue) north).
The meaning of an utterance is also given by the distribution over states in which it is uttered [Beltagy et al. 14], or equivalently, the belief it induces in listeners [Frank et al. 09, A & Klein 16].
Example: “I’m going north” with values 0.4, 0.2, 0.001 for three states.
Representing meaning
The meaning of an utterance is given by its truth conditions, or by the distribution over states in which it is uttered, or equivalently, the belief it induces in listeners.
This distribution is well-defined even if the “utterance” is a vector rather than a sequence of tokens.
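A minimal sketch of the belief a message induces, assuming a small discrete set of states and applying Bayes' rule (belief in a state is proportional to the probability of the message in that state times the prior on the state). The state names `s1`..`s3`, the uniform prior, and the likelihood values are hypothetical; looking a vector up in a table stands in for whatever learned model would score a real-valued message.

```python
def belief(message, prior, likelihood):
    """Posterior over states: proportional to p(message | x) * p(x)."""
    weights = {x: likelihood[message][x] * prior[x] for x in prior}
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("message has zero probability under every state")
    return {x: w / total for x, w in weights.items()}


# Hypothetical three-state world with a uniform prior.
prior = {"s1": 1 / 3, "s2": 1 / 3, "s3": 1 / 3}
vector_message = (1.0, 2.3, -0.3, 0.4, -1.2, 1.1)
# likelihood[z][x]: probability of message z in state x (illustrative numbers).
likelihood = {
    "I'm going north": {"s1": 0.4, "s2": 0.2, "s3": 0.001},
    vector_message: {"s1": 0.35, "s2": 0.25, "s3": 0.002},
}

beta_text = belief("I'm going north", prior, likelihood)
beta_vec = belief(vector_message, prior, likelihood)
print(beta_text)
print(beta_vec)
```

The same function handles the token sequence and the vector, since it only needs the probability of the message in each state.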
Translating with meaning
Message: 1.0 2.3 -0.3 0.4 -1.2 1.1
Candidate translations: “In the intersection”, “I’m going north”
$\beta(z_a) = p(x \mid z_a)$, $\beta(z_0) = p(x \mid z_0)$
Interlingua! $\beta(z_0)$ (source text), $\beta(z_a)$ (target text)
Translation criterion: $\operatorname{argmin}_{a} \; \mathrm{KL}\big(\beta(z_0) \,\|\, \beta(z_a)\big)$
Computing representations
Sparsity: $p(x \mid z_a)$ and $p(x \mid z_0)$
Smoothing: agent policy (actions & messages) → agent model; human policy (actions & messages) → human model
Smoothing, example values for a and 0: 0.10 0.08 0.05 0.01 0.13 0.22
Computing KL
$\mathrm{KL}(p \,\|\, q) = \mathbb{E}_{p}\left[\log \frac{p(x)}{q(x)}\right]$
Computing KL: sampling
$\mathrm{KL}(p \,\|\, q) = \sum_{i} p(x_i) \log \frac{p(x_i)}{q(x_i)}$
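As a rough illustration of computing the divergence by sampling, the sketch below compares the exact sum over a small discrete support with a standard Monte Carlo estimate that averages the log-ratio over states drawn from p. The slides do not spell out the sampling scheme, so this estimator, the function names, and the two toy distributions are assumptions.

```python
import math
import random


def kl_exact(p, q):
    """Exact KL(p || q) over a shared discrete support (q must be nonzero where p is)."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


def kl_sampled(p, q, n_samples, rng):
    """Monte Carlo estimate: mean of log p(x)/q(x) for x drawn from p."""
    states = list(p)
    draws = rng.choices(states, weights=[p[x] for x in states], k=n_samples)
    return sum(math.log(p[x] / q[x]) for x in draws) / n_samples


# Hypothetical beliefs over three states.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.4, "b": 0.4, "c": 0.2}

exact = kl_exact(p, q)
estimate = kl_sampled(p, q, n_samples=20000, rng=random.Random(0))
print(exact, estimate)
```

Sampling matters when the state space is too large to enumerate; the estimate tightens as the number of samples grows.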
Finding translations: brute force
Score every candidate a and take the argmin:
going north: 0.5
crossing the intersection: 2.3
I’m done: 0.2
after you: 9.7
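The brute-force search can be sketched as a loop over a fixed candidate list, assuming each candidate's belief has already been computed as a distribution over the same discrete states. The state names and all belief values are hypothetical and are not meant to reproduce the scores on the slide.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions; infinite if q misses mass that p has."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf
        total += px * math.log(px / qx)
    return total


def translate(message_belief, candidate_beliefs):
    """Return the candidate whose belief is closest in KL to the message's belief."""
    return min(candidate_beliefs, key=lambda a: kl(message_belief, candidate_beliefs[a]))


# Hypothetical belief induced by the neuralese message.
message_belief = {"moving": 0.1, "in_intersection": 0.1, "finished": 0.8}
# Hypothetical beliefs induced by each candidate utterance.
candidate_beliefs = {
    "going north": {"moving": 0.7, "in_intersection": 0.2, "finished": 0.1},
    "crossing the intersection": {"moving": 0.2, "in_intersection": 0.7, "finished": 0.1},
    "I'm done": {"moving": 0.1, "in_intersection": 0.2, "finished": 0.7},
}

print(translate(message_belief, candidate_beliefs))
```

Enumeration is only practical for a modest candidate set, since every candidate requires one divergence computation.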
Referring expression games: message 1.0 2.3 -0.3 0.4 -1.2 1.1, description “orange bird with black face”
Evaluation: translator-in-the-loop: message 1.0 2.3 -0.3 0.4 -1.2 1.1, description “orange bird with black face”
Experiment: color references
[Bar chart, lowest marked value 0.50. Neuralese → Neuralese: 1.00. English → English*: 0.83. Statistical MT: 0.72 and 0.70. Semantic MT: 0.86 and 0.73. The translated pairings are Neuralese → English* and English → Neuralese.]
Experiment: color references: magenta, hot, rose; magenta, hot, violet; olive, puke, pea; pinkish, grey, dull
Experiment: image references
[Bar chart, lowest marked value 50. Neuralese → Neuralese: 95. English → English*: 77. Statistical MT and Semantic MT on Neuralese → English* and English → Neuralese: 86, 73, 72, 70.]
Experiment: image references: large bird, black wings, black crown; large bird, black wings, black crown; small brown, light brown, dark brown
Experiment: driving game
[Bar chart. Neuralese → Neuralese: 1.93. Statistical MT and Semantic MT on Neuralese ↔ English*: 1.54, 1.49, 1.35.]
Conclusions
• Classical notions of “meaning” apply even to un-language-like things (e.g. RNN states)
• These meanings can be compactly represented without logical forms if we have access to world states
• Communicating policies “say” interpretable things!