Analogs of Linguistic Structure in Deep Representations
Jacob Andreas and Dan Klein
A game for humans
A set of objects is described in words, e.g. "everything but the blue shapes" or "orange square and non-squares", and the described objects are marked ✔. [FitzGerald et al. 2013]
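A game for RNNs
The description is a real-valued message vector (e.g. 1.0 2.3 -0.3 0.4 -1.2 1.1) instead of words, and the described objects are marked ✔. [e.g. Lazaridou et al. 2016]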
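Questions
1. Does the RNN employ a human-like communicative strategy? For example, does the message vector (1.0 2.3 -0.3 0.4 -1.2 1.1) mean the same thing as the utterance "everything but squares"?
2. Do RNN representations have interpretable compositional structure? For example, can something playing the role of "not" be applied to the vector for "red" to give another meaningful vector?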
Computing meaning representations
The utterances "not the red squares" and "not red or not square" have the logical forms λx.¬(sqr(x) ∧ red(x)) and λx.¬red(x) ∨ ¬sqr(x). The two forms mark the same objects (✔), so the two utterances have the same meaning.
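The equivalence of the two logical forms can be checked by listing the objects each one marks. The sketch below is illustrative only: the toy world of (color, shape) pairs and all function names are assumptions, not part of the talk.

```python
from itertools import product

# Hypothetical toy world: every object is a (color, shape) pair.
COLORS = ("red", "blue", "green")
SHAPES = ("square", "circle")
OBJECTS = list(product(COLORS, SHAPES))

def red(x):
    return x[0] == "red"

def sqr(x):
    return x[1] == "square"

def not_red_squares(x):
    # λx.¬(sqr(x) ∧ red(x))
    return not (sqr(x) and red(x))

def not_red_or_not_square(x):
    # λx.¬red(x) ∨ ¬sqr(x)
    return (not red(x)) or (not sqr(x))

def denotation(logical_form, objects):
    """Return the tuple of truth values a logical form assigns to the objects."""
    return tuple(logical_form(o) for o in objects)

# The two forms agree on every object (De Morgan's law).
same = denotation(not_red_squares, OBJECTS) == denotation(not_red_or_not_square, OBJECTS)
```

Representing a meaning as a tuple of truth values is what lets utterances and vectors be compared later without looking at their internal form.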
Message vectors are represented by the objects they mark. The vector (-0.1 1.3 0.5 -0.4) marks the same objects (✔) as λx.¬(sqr(x) ∧ red(x)), the logical form of "not the red squares".
Recording the objects that a vector such as (-0.1 1.3 0.5 -0.4 0.2 1.0) marks in one scene after another (✔ ✔ ✔ . . .) gives its meaning representation. Utterances such as "everything but squares" and "not the blue squares" are represented by their marks in the same scenes.
Translating
By comparing denotations from logical forms and the decoder model, we can find utterances and vectors with the same meaning. [A, Dragan & Klein 2013]
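A minimal sketch of translation by denotation matching, under loud assumptions: the scene encoding, the candidate utterances, the message vector and the `decode` function are all made up for illustration. In particular, `decode` is a stand-in linear scorer, not the trained decoder model from the talk.

```python
import numpy as np

# Hypothetical scene: each object is a feature row [is_red, is_blue, is_square].
SCENE = np.array([
    [1, 0, 1],   # red square
    [1, 0, 0],   # red circle
    [0, 1, 1],   # blue square
    [0, 1, 0],   # blue circle
])

# Candidate utterances with logical forms written as Python predicates.
CANDIDATES = {
    "all the red shapes": lambda o: o[0] == 1,
    "everything but red": lambda o: o[0] == 0,
    "everything but squares": lambda o: o[2] == 0,
}

def decode(message, scene):
    """Stand-in for the decoder model: mark objects with a positive score."""
    return tuple(bool(s > 0) for s in scene @ message)

def translate(message, scene, candidates):
    """Return the utterances whose denotation equals the decoder's output."""
    target = decode(message, scene)
    return [
        utterance
        for utterance, form in candidates.items()
        if tuple(bool(form(o)) for o in scene) == target
    ]

message = np.array([0.5, 0.2, -1.0])   # made-up message vector
matches = translate(message, SCENE, CANDIDATES)
```

Here the made-up vector marks exactly the non-square objects, so it is matched with "everything but squares". A real comparison would use many scenes rather than one.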
Questions
1. Does the RNN employ a human-like communicative strategy?
2. Do RNN representations have interpretable compositional structure?
Comparing strategies
The message vector (-0.1 1.3 0.5 -0.4 0.2 1.0) and the human utterance "everything but squares" are each reduced to the objects they mark (✔) across scenes. The question is whether the two sets of marks agree.
Evaluation: strategies
[Figure: agreement between sets of marked objects. Reported value: 92%; comparison values: 50% and 74%.]
Experiments
1. Does the RNN employ a human-like communicative strategy?
2. Do RNN representations have interpretable compositional structure?
Collecting translation data
Each utterance is paired with a logical form and a message vector:

Utterance             Logical form              Vector
all the red shapes    λx.red(x)                 0.1 -0.3 0.5 1.1
blue objects          λx.blu(x)                 -0.3 0.2 0.1 0.1
everything but red    λx.¬red(x)                1.4 -0.3 -0.5 0.8
green squares         λx.grn(x) ∧ sqr(x)        0.2 -0.2 0.5 -0.1
not green squares     λx.¬(grn(x) ∧ sqr(x))     0.3 -1.3 -1.5 0.1
Extracting related pairs
Pairs whose logical forms differ only by a negation are extracted together with their vectors:
λx.red(x) (0.1 -0.3 0.5 1.1) and λx.¬red(x) (1.4 -0.3 -0.5 0.8)
λx.grn(x) ∧ sqr(x) (0.2 -0.2 0.5 -0.1) and λx.¬(grn(x) ∧ sqr(x)) (0.3 -1.3 -1.5 0.1)
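One way to extract such pairs is to look up, for each logical form, whether its negation also occurs in the collected data. This is a sketch only: encoding logical forms as nested tuples and the name `negation_pairs` are assumptions, and the vectors are the illustrative numbers from the slide.

```python
# Logical forms as nested tuples: ("red",), ("not", ("red",)), ("and", a, b).
RECORDS = [
    (("red",), [0.1, -0.3, 0.5, 1.1]),
    (("blu",), [-0.3, 0.2, 0.1, 0.1]),
    (("not", ("red",)), [1.4, -0.3, -0.5, 0.8]),
    (("and", ("grn",), ("sqr",)), [0.2, -0.2, 0.5, -0.1]),
    (("not", ("and", ("grn",), ("sqr",))), [0.3, -1.3, -1.5, 0.1]),
]

def negation_pairs(records):
    """Return (form, vec_f, vec_not_f) for each form whose negation is present."""
    by_form = {form: vec for form, vec in records}
    pairs = []
    for form, vec in records:
        negated = ("not", form)
        if negated in by_form:
            pairs.append((form, vec, by_form[negated]))
    return pairs

pairs = negation_pairs(RECORDS)
```

This matches forms syntactically. λx.blu(x) is left out because no negated counterpart was collected for it.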
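Learning compositional operators
Fit an operator N that takes the vector for f(x) to the vector for ¬f(x), minimizing squared error over the extracted pairs:
$$\hat{N} = \arg\min_{N} \sum_{f} \left\lVert N\, v_{f(x)} - v_{\neg f(x)} \right\rVert^{2}$$
where v_{f(x)} is the vector paired with f(x) (the names N and v are assumed notation).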
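If the operator is assumed to be a linear map (an assumption made here for illustration), the argmin has a standard least-squares solution. The sketch below fits a matrix with NumPy and checks it on synthetic data generated from a known map; `fit_operator` and the data are hypothetical.

```python
import numpy as np

def fit_operator(X, Y):
    """Least-squares fit of a matrix N with N @ x ≈ y for paired rows of X and Y."""
    # lstsq solves min_W ||X @ W - Y||^2, so N is the transpose of W.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W.T

# Synthetic check: generate pairs from a known linear map and recover it.
rng = np.random.default_rng(0)
N_true = rng.normal(size=(4, 4))
X = rng.normal(size=(50, 4))   # rows play the role of vectors for f(x)
Y = X @ N_true.T               # rows play the role of vectors for ¬f(x)

N_hat = fit_operator(X, Y)
recovered = np.allclose(N_hat, N_true)
```

With real message vectors the fit would not be exact, which is why the operator has to be evaluated on held-out pairs.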
Evaluating learned operators
The operator is fit on the extracted pairs, (λx.red(x), λx.¬red(x)) and (λx.grn(x) ∧ sqr(x), λx.¬(grn(x) ∧ sqr(x))), and then applied to the vector for a new form λx.f(x): (0.2 -0.2 0.5 -0.1) is mapped to (-0.2 0.4 -0.3 0.0). What does the resulting vector mean (???)?
Evaluation: negation
[Figure: how often the predicted vector marks the same objects as ¬f(x). Reported value: 97%; comparison value: 50%.]
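A sketch of how such a score could be computed: apply the operator to a held-out vector, decode it, and count the objects on which the result is the complement of the original marks. Everything here is a toy assumption: `decode` is a stand-in linear scorer rather than the trained decoder, the scene and vectors are random, and the target is the complement of the decoder's own output rather than the denotation of ¬f(x).

```python
import numpy as np

def decode(message, scene):
    """Stand-in decoder: mark the objects with a positive score."""
    return scene @ message > 0

def negation_accuracy(N, messages, scene):
    """Fraction of objects on which decode(N @ v) is the complement of decode(v)."""
    hits = total = 0
    for v in messages:
        predicted = decode(N @ v, scene)
        target = ~decode(v, scene)
        hits += int(np.sum(predicted == target))
        total += target.size
    return hits / total

rng = np.random.default_rng(0)
scene = rng.normal(size=(6, 4))       # 6 hypothetical objects, 4 features each
held_out = rng.normal(size=(10, 4))   # 10 hypothetical held-out message vectors

# For this toy decoder a sign flip negates every mark; the identity negates none.
acc_flip = negation_accuracy(-np.eye(4), held_out, scene)
acc_identity = negation_accuracy(np.eye(4), held_out, scene)
```

The score is computed per object, so it measures partial agreement as well as exact matches.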
Visualizing negation
[Figure: examples with rows labeled Input, Predicted and True. Utterances shown: "all the toys that are not red", "all items that are not blue or green", "only the blue and green objects", "every thing that is red".]
Visualizing disjunction
[Figure: examples with rows labeled Input, Predicted and True. Utterances shown: "all of the red objects", "the blue objects", "the blue and red items", "the blue and yellow items", "all yellow or red items", "all the yellow toys".]
Conclusions
• Under the right conditions, RNN representations exhibit interpretable pragmatics and compositional structure.
• Not just communication games: language might be a good general-purpose tool for interpreting deep representations.
Thank you!
http://github.com/jacobandreas/rnn-syn