Sequence-to-sequence Tasks

Task                 Example
Bigram Flipping      {w1, w2, …, w2n-1, w2n} → {w2, w1, …, w2n, w2n-1}
Sequence Copying     {w1, w2, …, wn-1, wn} → {w1, w2, …, wn-1, wn}
Sequence Reversal    {w1, w2, …, wn-1, wn} → {wn, wn-1, …, w2, w1}
English-German MT    This is an example. → Dies ist ein Beispiel.
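To make the task definitions concrete, here is a minimal sketch of how such synthetic examples could be generated; the vocabulary range and the `make_example` helper are illustrative, not from the slides:

```python
import random

VOCAB = list(range(4, 104))  # hypothetical token ids; small ids reserved for specials

def make_example(task: str, n: int = 10):
    """Generate one (source, target) pair for the synthetic tasks above.
    For bigram flipping, n must be even."""
    seq = [random.choice(VOCAB) for _ in range(n)]
    if task == "bigram_flip":
        # swap each adjacent pair: (w1, w2) -> (w2, w1), etc.
        out = [seq[i + 1] if i % 2 == 0 else seq[i - 1] for i in range(n)]
    elif task == "copy":
        out = list(seq)
    elif task == "reverse":
        out = seq[::-1]
    else:
        raise ValueError(f"unknown task: {task}")
    return seq, out
```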
Manipulating Attention

• Let 𝖩 be the set of impermissible tokens, and let m be the corresponding mask (mi = 1 if token i is in 𝖩, and 0 otherwise)
• For any task-specific loss function, a penalty term is added
• The penalty term penalizes the model for allocating attention to impermissible tokens
Manipulating Attention

The penalty enters the loss as

    L̃ = L − λ · log(1 − mᵀα)

where 1 − mᵀα is the total attention mass on all the "allowed" (permissible) tokens, and λ is a penalty coefficient that modulates how strongly attention on impermissible tokens is penalized (sketched below).

• Side note: In parallel work, Wiegreffe and Pinter (2019) propose a different penalty term
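A minimal PyTorch sketch of this penalized loss, assuming `alpha` holds a normalized attention distribution and `mask` flags the impermissible tokens; the function name and tensor shapes are illustrative:

```python
import torch

def penalized_loss(task_loss, alpha, mask, lam=1.0, eps=1e-12):
    """task_loss: scalar task-specific loss.
    alpha: (batch, seq_len) attention distribution (rows sum to 1).
    mask:  (batch, seq_len), 1.0 for impermissible tokens, 0.0 otherwise."""
    # total attention mass on the "allowed" (permissible) tokens
    allowed_mass = 1.0 - (alpha * mask).sum(dim=-1)
    # -lambda * log(allowed mass): the penalty grows as attention
    # shifts toward the impermissible tokens
    penalty = -lam * torch.log(allowed_mass.clamp_min(eps))
    return task_loss + penalty.mean()
```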
Manipulating Attention

• With multiple attention heads, the same penalty is applied to the mean attention over a set of heads: −λ · log(1 − meanₕ mᵀα⁽ʰ⁾) (see the sketch below)
• A drawback: because only the mean is penalized, one of the attention heads can still be assigned a large amount of attention to impermissible tokens
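A sketch of the multi-head variant under the same assumptions; it constrains only the mean across heads, which is exactly why a single head can slip through:

```python
import torch

def multihead_penalty(alpha_heads, mask, lam=1.0, eps=1e-12):
    """alpha_heads: (batch, num_heads, seq_len) attention per head.
    mask: (batch, seq_len), 1.0 for impermissible tokens."""
    # attention mass on impermissible tokens, per head
    bad_mass = (alpha_heads * mask.unsqueeze(1)).sum(dim=-1)  # (batch, num_heads)
    mean_bad = bad_mass.mean(dim=1)                           # mean over the heads
    # only the mean is constrained, so one head can still put
    # most of its attention on impermissible tokens
    return -lam * torch.log((1.0 - mean_bad).clamp_min(eps)).mean()
```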
Outline
1. What is the attention mechanism?
2. Attention as explanations
3. Manipulating attention weights
4. Results and discussion
5. Conclusion
BiLSTM + Attention

[diagram: inputs x1 … xn feed a biLSTM; attention weights α1 … αn pool the hidden states into the prediction y]
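A compact PyTorch sketch of such a model; the hyperparameter values and the single-layer attention scoring are illustrative choices, not necessarily those used in the paper:

```python
import torch
import torch.nn as nn

class BiLSTMAttention(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hidden, 1)  # one attention score per position
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))                            # (batch, seq, 2*hidden)
        alpha = torch.softmax(self.attn(h).squeeze(-1), dim=-1)  # attention weights
        ctx = (alpha.unsqueeze(-1) * h).sum(dim=1)               # weighted sum of states
        return self.out(ctx), alpha  # alpha is returned so the penalty can use it
```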
Embedding + Attention (no recurrent connections)

[diagram: attention weights α1 … αn are applied directly over the embeddings of x1 … xn to produce y]
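The same sketch with the recurrent encoder removed, so attention operates directly on the word embeddings:

```python
import torch
import torch.nn as nn

class EmbeddingAttention(nn.Module):
    """As above, but with no recurrent connections."""
    def __init__(self, vocab_size, emb_dim=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.attn = nn.Linear(emb_dim, 1)
        self.out = nn.Linear(emb_dim, n_classes)

    def forward(self, x):
        h = self.emb(x)                                          # (batch, seq, emb_dim)
        alpha = torch.softmax(self.attn(h).squeeze(-1), dim=-1)  # attention weights
        ctx = (alpha.unsqueeze(-1) * h).sum(dim=1)               # weighted sum of embeddings
        return self.out(ctx), alpha
```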
Transformer-based Model (Devlin et al., 2019)
Restricted BERT

[diagram: Original vs. Restricted BERT, layers L0–L12, over tokens such as "[CLS] Movie Good [SEP]" and "[CLS] Delhi Capital [SEP]". In the restricted model, the tokens are split into permissible and impermissible groups, and attention between the two groups is blocked at every layer]
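One plausible way to implement such a restriction is an additive attention mask that blocks attention between the two groups in every layer; this is a sketch of the idea, not necessarily the paper's exact construction:

```python
import torch

def restricted_attention_mask(impermissible: torch.Tensor) -> torch.Tensor:
    """impermissible: (batch, seq_len) bool, True for impermissible tokens.
    Returns an additive mask (batch, seq_len, seq_len): 0 where attention is
    allowed, -inf between the permissible and impermissible groups."""
    permissible = ~impermissible
    # attention is allowed only within each group
    allowed = (permissible.unsqueeze(2) & permissible.unsqueeze(1)) | \
              (impermissible.unsqueeze(2) & impermissible.unsqueeze(1))
    mask = torch.zeros(allowed.shape)
    mask[~allowed] = float("-inf")
    return mask  # add to the attention logits before the softmax in every layer
```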
Occupation Prediction

Attention type         Accuracy   Attention Mass (on impermissible tokens)
Original               99.7       97.2
Manipulated (λ = 0.1)  97.1       0
Manipulated (λ = 1.0)  97.4       0
Classification Tasks

[results figure]
Alternate Mechanisms: Gender Identification

• At inference time, what if we hard-set the corresponding attention mass to ZERO?

[bar chart comparing accuracies of roughly 50% and 100%]
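A sketch of this inference-time intervention: zero out the attention on impermissible tokens and renormalize the remaining weights so they still sum to one (tensor names are illustrative):

```python
import torch

def zero_impermissible(alpha, mask, eps=1e-12):
    """alpha: (batch, seq_len) attention weights; mask: 1.0 for impermissible.
    Hard-sets the attention mass on impermissible tokens to zero and
    renormalizes the remaining weights."""
    alpha = alpha * (1.0 - mask)
    return alpha / alpha.sum(dim=-1, keepdim=True).clamp_min(eps)
```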
Bigram Flip

Attention type   Accuracy   Attention Mass (on impermissible tokens)
Original         100        94.5
None             96.5       0
Uniform          97.9       5.2
Manipulated      99.9       0.4
Bigram Flip

[attention heatmaps: Original vs. Manipulated, plus a Manipulated model trained with a different seed]
Sequence Copy

Attention type   Accuracy   Attention Mass (on impermissible tokens)
Original         100        98.8
None             84.1       0
Uniform          93.8       5.2
Manipulated      99.9       0.01