Sidebar: Word Embeddings
● Aren’t word embeddings like word2vec and GloVe examples of transfer learning?
● Yes: they provide linguistic representations, learned from raw text, for use in downstream tasks
● No: they were not meant to be used as general-purpose representations
Sidebar: Word Embeddings
● One distinction:
  ● Global representations:
    ● word2vec, GloVe: one vector for each word type (e.g. ‘play’)
  ● Contextual representations (from LMs):
    ● Representation of a word in context, not independently
● Another distinction:
  ● Shallow (global) vs. deep (contextual) pre-training
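To make the contrast concrete, here is a minimal sketch, assuming gensim, torch, and transformers are installed; the checkpoint names ("glove-wiki-gigaword-50", "bert-base-uncased") are illustrative choices, not anything prescribed by the slides. A GloVe lookup returns one fixed vector for the type ‘play’, while BERT returns a different vector for each occurrence in context:

```python
# Sketch: one fixed vector per word type (global) vs. a context-dependent
# vector (contextual). Checkpoint names are illustrative choices.
import gensim.downloader as api
import torch
from transformers import AutoModel, AutoTokenizer

# --- Global embedding: one vector per type, regardless of context ---
glove = api.load("glove-wiki-gigaword-50")
print(glove["play"][:5])  # the same vector in every sentence

# --- Contextual embedding: the vector for "play" depends on the sentence ---
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def play_vector(sentence: str) -> torch.Tensor:
    """Return BERT's top-layer vector for the token 'play' in `sentence`."""
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    idx = inputs["input_ids"][0].tolist().index(tok.convert_tokens_to_ids("play"))
    return hidden[idx]

v1 = play_vector("the children went outside to play")  # verb sense
v2 = play_vector("we saw a play at the theater")       # noun sense
# < 1.0: different contexts yield different vectors for the same type
print(torch.cosine_similarity(v1, v2, dim=0).item())
```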
Global Embeddings: Models
● Mikolov et al. 2013a (the OG word2vec paper)
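As a minimal illustration of the skip-gram model introduced there, here is a training sketch with gensim; the toy corpus and hyperparameters are made up for illustration and are not those of the original paper:

```python
# Minimal skip-gram training sketch with gensim (toy corpus, illustrative
# hyperparameters; not the setup of Mikolov et al. 2013a).
from gensim.models import Word2Vec

corpus = [
    ["the", "kids", "play", "outside"],
    ["we", "watched", "a", "play", "downtown"],
    ["children", "play", "games", "outside"],
]

# sg=1 selects skip-gram (predict context words from the center word);
# sg=0 would select CBOW instead.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1)

# Each word type gets exactly one vector, pooled over all of its contexts:
print(model.wv["play"].shape)  # (50,)
```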
Shallow vs. Deep Pre-training
[Diagram: two pipelines. Shallow: raw tokens → global embedding → model for task. Deep: raw tokens → contextual embedding (pre-trained) → model for task.]
NLP’s “Clever Hans Moment”
[Images: Clever Hans; BERT]
Clever Hans
● Early 1900s: a horse trained by his owner to do:
  ● Addition
  ● Division
  ● Multiplication
  ● Tell time
  ● Read German
  ● …
● Wow! Hans is really smart!
Clever Hans Effect
● Upon closer examination / experimentation…
● Hans’ success rate:
  ● 89% when the questioner knows the answer
  ● 6% when the questioner doesn’t know the answer
● Further experiments: as Hans’ taps got closer to the correct answer, facial tension in the questioner increased
● Hans didn’t solve the task but exploited a spuriously correlated cue
Central question
● Do BERT et al.’s major successes at solving NLP tasks show that we have achieved robust natural language understanding in machines?
● Or: are we seeing a “Clever BERT” phenomenon?
McCoy et al. 2019 (“Right for the Wrong Reasons”: the HANS challenge set for NLI)
Results (performance improves if the model is fine-tuned on this challenge set)
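For concreteness, a sketch of how such a probe can be run: the pairs below are constructed in the style of HANS’s lexical-overlap cases (they are not drawn from the released dataset), and the checkpoint "roberta-large-mnli" is an assumed stand-in for the MNLI-fine-tuned models evaluated in the paper:

```python
# HANS-style probe (after McCoy et al. 2019): high word overlap between
# premise and hypothesis, but the correct label is NOT entailment.
# "roberta-large-mnli" is an assumed stand-in; example pairs are illustrative.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

pairs = [
    ("The doctor paid the actor.", "The actor paid the doctor."),
    ("The judge who the senator admired met the lawyer.",
     "The senator met the lawyer."),
]
for premise, hypothesis in pairs:
    pred = nli({"text": premise, "text_pair": hypothesis})[0]
    # A model leaning on the overlap heuristic tends to answer ENTAILMENT here.
    print(f"{hypothesis!r} -> {pred['label']} ({pred['score']:.2f})")
```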
Recent Analysis Explosion
● E.g. the BlackboxNLP workshop [2018, 2019]
● New “Interpretability and Analysis” track at ACL
Why care?
● Benefits of learning what neural language models understand:
  ● Engineering: can help build better language technologies via improved models, data, training protocols, …
    ● Trust; critical applications
  ● Theoretical: can help us understand biases in different architectures (e.g. LSTMs vs. Transformers) and similarities to human learning biases
  ● Ethical: e.g. do some models reflect problematic social biases more than others?
Stretch Break!
Course Overview / Logistics
Large Scale
● Motivating question: what do neural language models understand about natural language?
  ● Focus on meaning, where much of the literature has focused on syntax
● A research seminar: in groups, you will carry out a novel analysis project
  ● Think of it as a proto-conference-paper, or the seed of a conference paper