Reasoning about pragmatics with neural listeners and speakers
Jacob Andreas and Dan Klein
The reference game
The reference game: The one with the snake
The reference game: Mike is holding a baseball bat
The reference game: bat a is holding Mike baseball
The reference game: They are sitting by a picnic table
The reference game: There is a bat
The reference game: Why do we care about this game?
"Don’t you think it’s a little cold in here?" "Do you know what time it is?" "Some of the children played in the park."
Deriving pragmatics from reasoning: Mike is holding a baseball bat
Deriving pragmatics from reasoning: Jenny is running from the snake
How to win
DERIVED STRATEGY: Reason about listener beliefs ("There is a snake"?)
DIRECT STRATEGY: Imitate successful human play ("There is a snake")
How to win
DERIVED STRATEGY: Reason about listener beliefs [Golland et al. 2010; Smith et al. 2013; Vogel et al. 2013; Monroe and Potts 2015]
DIRECT STRATEGY: Imitate successful human play [FitzGerald et al. 2013; Kazemzadeh et al. 2014; Mao et al. 2015]
How to win
DERIVED STRATEGY: PRO: pragmatics "for free"; CON: past work needs hand-engineering
DIRECT STRATEGY: PRO: domain representation "for free"; CON: past work needs targeted data
How to win: use both
DIRECT STRATEGY: Learn base models for interpretation & generation without pragmatic context
DERIVED STRATEGY: Explicitly reason about base models to get novel behavior
Data
Abstract Scenes Dataset: 1000 scenes, 10k sentences, feature representations
Approach: literal speaker (sampler) + literal listener → reasoning speaker
A literal speaker (S0): Mike is holding a baseball bat
A literal speaker (S0): referent → referent encoder → referent decoder → "Mike is holding a baseball bat"
Module architectures
Referent encoder: referent features → FC → referent representation
Referent decoder: (referent, word n, words <n) → FC → ReLU → FC → Softmax → word n+1
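In code, these two modules are tiny. Below is a minimal PyTorch sketch (not the authors' implementation; the layer sizes, the bag-of-words prefix summary, and all names here are illustrative assumptions):

```python
import torch
import torch.nn as nn

class ReferentEncoder(nn.Module):
    """FC layer mapping a scene's indicator features to a dense vector."""
    def __init__(self, n_features: int, d_hidden: int):
        super().__init__()
        self.fc = nn.Linear(n_features, d_hidden)

    def forward(self, referent_features):
        return self.fc(referent_features)

class ReferentDecoder(nn.Module):
    """Feed-forward next-word predictor: scores word n+1 given the
    referent encoding and a summary of the words generated so far."""
    def __init__(self, vocab_size: int, d_hidden: int):
        super().__init__()
        # mode='sum' gives an order-invariant summary of the prefix
        self.prefix_embed = nn.EmbeddingBag(vocab_size, d_hidden, mode='sum')
        self.hidden = nn.Linear(2 * d_hidden, d_hidden)
        self.out = nn.Linear(d_hidden, vocab_size)

    def forward(self, referent_vec, prefix_ids):
        prefix_vec = self.prefix_embed(prefix_ids)               # words so far
        h = torch.relu(self.hidden(torch.cat([referent_vec, prefix_vec], dim=-1)))
        return torch.log_softmax(self.out(h), dim=-1)            # next-word log-probs
```

Sampling from S0 then amounts to repeatedly drawing the next word from this distribution until an end-of-sentence token.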
Training S0: Mike is holding a baseball bat
A literal speaker (S0) samples: "Mike is holding a baseball bat", "The sun is in the sky", "Jenny is standing next to Mike"
A literal listener (L0): Mike is holding a baseball bat
A literal listener (L0): description encoder + referent encoders → scorer → 0.87 / 0.13 over candidate scenes
Module architectures
Description encoder: ngram features → FC → description representation
Choice ranker: (description, referent) → Sum → ReLU → FC → Softmax → choice
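The listener-side modules, sketched under the same caveats (the Sum → ReLU → FC → Softmax path follows the diagram; everything else is an assumption):

```python
import torch
import torch.nn as nn

class DescriptionEncoder(nn.Module):
    """FC layer mapping a description's n-gram count features to a vector."""
    def __init__(self, n_ngram_features: int, d_hidden: int):
        super().__init__()
        self.fc = nn.Linear(n_ngram_features, d_hidden)

    def forward(self, ngram_features):
        return self.fc(ngram_features)

class ChoiceRanker(nn.Module):
    """Sum description and referent vectors, ReLU, FC to a scalar score,
    then softmax across candidates."""
    def __init__(self, d_hidden: int):
        super().__init__()
        self.score = nn.Linear(d_hidden, 1)

    def forward(self, desc_vec, referent_vecs):
        # desc_vec: (d,); referent_vecs: (n_candidates, d); broadcast sum
        h = torch.relu(desc_vec + referent_vecs)
        return torch.log_softmax(self.score(h).squeeze(-1), dim=-1)
```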
Training L0: "Mike is holding a baseball bat" → 0.87 for the target scene (vs. a random distractor)
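Training then reduces to a cross-entropy loss on which candidate is the true referent. A sketch assuming the hypothetical modules above, with the target scene at index 0 and the random distractor at index 1:

```python
import torch
import torch.nn.functional as F

def l0_step(desc_encoder, ref_encoder, ranker, ngram_feats, candidate_feats):
    """One training example: maximize P(target) among {target, distractor}."""
    desc_vec = desc_encoder(ngram_feats)           # description representation
    referent_vecs = ref_encoder(candidate_feats)   # (2, d): target then distractor
    log_probs = ranker(desc_vec, referent_vecs)    # softmax over the 2 candidates
    return F.nll_loss(log_probs.unsqueeze(0), torch.tensor([0]))  # target is index 0
```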
A literal listener (L0): Mike is holding a baseball bat → L0 → choice of scene
A reasoning speaker (S1): what to say? e.g. "Mike is holding a baseball bat"
A reasoning speaker (S1): the literal speaker proposes candidates, the literal listener scores them
"Mike is holding a baseball bat": speaker 0.05, listener 0.9 → 0.9^(1-λ) · 0.05^λ
"The sun is in the sky": speaker 0.09, listener 0.5 → 0.5^(1-λ) · 0.09^λ
"Jenny is standing next to Mike": speaker 0.08, listener 0.7 → 0.7^(1-λ) · 0.08^λ
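Once S0 and L0 exist, the reasoning speaker is a few lines of reranking. A sketch assuming hypothetical wrappers s0_sample, s0_logprob, and l0_logprob around the trained models, with the slide's listener^(1-λ) · speaker^λ weighting computed in log space:

```python
def reasoning_speaker(target, distractor, n_samples=100, lam=0.5):
    """S1: draw candidate descriptions from the literal speaker, keep the
    one maximizing listener^(1-lam) * speaker^lam."""
    candidates = [s0_sample(target) for _ in range(n_samples)]

    def combined(d):
        log_l = l0_logprob(d, target, distractor)  # log P(listener picks target | d)
        log_s = s0_logprob(d, target)              # log P(d | target): fluency
        return (1 - lam) * log_l + lam * log_s

    return max(candidates, key=combined)
```

λ trades informativeness (the listener score) against fluency (the speaker score), and the number of samples trades accuracy against speed, as the "How many samples?" plot shows.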
Experiments 34
Baselines
• Literal: the S0 model by itself
• Contrastive: a conditional LM trained on both the target image and a random distractor [Mao et al. 2015]
Results (test)
Literal: 64%   Reasoning: 81%   Contrastive: 69%
Accuracy and fluency 37
How many samples?
[Figure: listener accuracy (50–100%) against the number of samples drawn from the literal speaker (1–1000, log scale)]
Examples (a) the sun is in the sky [contrastive] 39
Examples (c) the dog is standing beside jenny [contrastive] 40
Examples (b) mike is wearing a chef’s hat [non-contrastive] 41
Conclusions
• Standard neural kit of parts for base models
• Probabilistic reasoning for high-level goals
• A little bit of structure goes a long way!
Thank you!
“Compiling” the reasoning model: what if we train the contrastive model on the output of the reasoning model?
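Read as code, the idea is distillation. A sketch, where reasoning_speaker is the S1 procedure above and train_contrastive is a hypothetical trainer for the Mao et al.-style direct model:

```python
def compile_reasoning_speaker(scene_pairs):
    """Distill S1 into a direct model: generate descriptions with the slow
    reasoning speaker, then fit the fast contrastive model to its outputs."""
    synthetic = [
        (target, distractor, reasoning_speaker(target, distractor))
        for target, distractor in scene_pairs
    ]
    return train_contrastive(synthetic)  # hypothetical trainer for the direct model
```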
Results (dev)
Literal: 66%   Reasoning: 83%   Compiled: 69%