Modern NLP for Pre-Modern Practitioners
Joel Grus | #QConAI @joelgrus | 2019
"True self-control is waiting until the movie starts to eat your popcorn."
the movie True until is waiting self-control starts to eat your popcorn. (the same quote, word order scrambled)
Natural Language Understanding is Hard
But We're Getting Better At It*
*as measured by performance on tasks we're getting better at
As Measured by Performance on Tasks We're Getting Better At*
*tasks that would be easy if we were good at natural language understanding, and that we therefore use to measure our progress toward natural language understanding
About Me
Obligatory Plug for AllenNLP
A Handful of Tasks That Would Be Easy if We Were Good at Natural Language Understanding
Parsing
Named-Entity Recognition
Coreference Resolution
Machine Translation
Summarization (slide example summary: "Attend QCon.ai.")
Text Classification
Machine Comprehension
Machine Comprehension?
Textual Entailment
Winograd Schemas
The conference organizer disinvited the speaker because he feared a boring talk. (he = the conference organizer)
The conference organizer disinvited the speaker because he proposed a boring talk. (he = the speaker)
Language Modeling
And many others!
If you were good at natural language understanding, you'd also be pretty good at these tasks
So if computers get good at each of these tasks, then...
(I Am Being Unfair)
Each of these tasks is valuable on its own merits.
Likely they are getting us closer to actual natural language understanding.
Pre-Modern NLP
Lots of Linguistics
Grammars
S -> NP VP
NP -> JJ NN
VP -> VBZ ADJP
ADJP -> JJ
JJ -> "Artificial"
NN -> "intelligence"
VBZ -> "is"
JJ -> "dangerous"
Derivation: S => NP VP => JJ NN VBZ ADJP => JJ NN VBZ JJ => "Artificial intelligence is dangerous"
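The slide's toy grammar runs as-is in a parser. A minimal sketch below uses NLTK; the library choice is mine, not the talk's:

```python
# Parse the slide's example sentence with its context-free grammar, using NLTK.
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> JJ NN
VP -> VBZ ADJP
ADJP -> JJ
JJ -> 'Artificial' | 'dangerous'
NN -> 'intelligence'
VBZ -> 'is'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("Artificial intelligence is dangerous".split()):
    print(tree)  # (S (NP (JJ Artificial) (NN intelligence)) (VP (VBZ is) (ADJP (JJ dangerous))))
```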
Hand-Crafted Features
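For concreteness, a hypothetical sketch of what hand-crafted features look like for a token classifier; the specific features are illustrative, not from the talk:

```python
# Hand-crafted features: a human decides what signals matter and codes them up.
def token_features(token: str) -> dict:
    return {
        "lower": token.lower(),
        "is_capitalized": token[0].isupper(),          # useful for e.g. NER
        "suffix_3": token[-3:],                        # crude morphology signal
        "has_digit": any(c.isdigit() for c in token),
    }

token_features("Dangerous")
```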
Rule-Based Systems
Modern NLP
Theme 1: Neural Nets and Low-Dimensional Representations
Theme 2: Putting Things in Context
Theme 3:
Theme 4:
Theme 5: Transfer Learning
Word Vectors
[Diagram: skip-gram-style training on "Joel is attending an artificial intelligence conference." A one-hot vector for "artificial" (0 ... 0 1 0 ... 0) is mapped to a low-dimensional embedding (.3 .6 .1 .2 2.3), which predicts a distribution over the vocabulary peaked at "intelligence".]
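A minimal training sketch for word vectors of this kind, assuming gensim 4.x (the talk doesn't prescribe a library):

```python
# Train skip-gram word2vec on a (tiny, illustrative) corpus.
from gensim.models import Word2Vec

sentences = [["joel", "is", "attending", "an", "artificial", "intelligence", "conference"]]
model = Word2Vec(sentences, vector_size=5, window=2, min_count=1, sg=1)  # sg=1: skip-gram

vector = model.wv["artificial"]  # a dense, low-dimensional embedding
```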
Using Word Vectors
Task: predict each word's part-of-speech tag (N? V? J?) from its word vector.
Example: "The official department heads all quit." Is "official" an adjective (J) or a noun (N)? Is "heads" a noun (N) or a verb (V)? A single, context-independent vector per word can't tell.
"man bites dog" vs. "dog bites man": the same word vectors, in a different order, mean something different.
Using Context for Sequence Labeling (predict each token's tag, e.g. N or V, from the words around it)
Using Context for Sequence Classification
Recurrent Neural Networks
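A sketch of a recurrent layer reading a sentence one embedding at a time, assuming PyTorch; the dimensions are made up:

```python
# An RNN produces one hidden state per token, each depending on everything before it.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=50, hidden_size=100, batch_first=True)
embeddings = torch.randn(1, 7, 50)       # (batch, sequence length, embedding dim)
outputs, final_hidden = rnn(embeddings)  # outputs: one 100-dim state per token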
LSTMs and GRUs
Bidirectionality
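The same interface covers LSTMs and GRUs, and bidirectionality is a flag; again a PyTorch sketch with made-up sizes:

```python
# An LSTM, made bidirectional so each token sees both left and right context.
import torch.nn as nn

lstm = nn.LSTM(input_size=50, hidden_size=100, batch_first=True, bidirectional=True)
# Each timestep's output concatenates forward and backward states: 200 dims.
# Swap in nn.GRU for a GRU with the same call signature.
```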
Generative Character-Level Modeling
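The generation loop itself is simple; in the sketch below, `char_lm` and `vocab` are hypothetical stand-ins for a trained character-level model and its character inventory:

```python
# Generate text one character at a time by sampling from the model's distribution.
import torch

def generate(char_lm, vocab: list, start: str, length: int) -> str:
    text = start
    for _ in range(length):
        probs = char_lm(text)                     # (len(vocab),) next-char probabilities
        idx = torch.multinomial(probs, 1).item()  # sample rather than always taking argmax
        text += vocab[idx]
    return text
```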
Convolutional Networks
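For text, convolutions slide a fixed-width filter over word embeddings; a PyTorch sketch with made-up sizes:

```python
# A 1-D convolution turns each 3-token window of embeddings into a feature vector.
import torch
import torch.nn as nn

conv = nn.Conv1d(in_channels=50, out_channels=100, kernel_size=3, padding=1)
embeddings = torch.randn(1, 7, 50)            # (batch, seq_len, embedding dim)
features = conv(embeddings.transpose(1, 2))   # Conv1d expects (batch, channels, seq_len)
```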
Sequence-to-Sequence Models
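A sketch of the encoder-decoder idea: compress the source sequence into a state, then decode the target conditioned on it. PyTorch, made-up sizes, no training loop:

```python
# Encoder summarizes the source; decoder generates conditioned on that summary.
import torch
import torch.nn as nn

encoder = nn.GRU(input_size=50, hidden_size=100, batch_first=True)
decoder = nn.GRU(input_size=50, hidden_size=100, batch_first=True)

source = torch.randn(1, 9, 50)              # embedded source sentence
_, state = encoder(source)                  # final state summarizes the source
target_so_far = torch.randn(1, 4, 50)       # embedded target prefix
decoded, _ = decoder(target_so_far, state)  # decode conditioned on the summary
```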
Attention
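A minimal sketch of scaled dot-product attention, one common variant of the general idea:

```python
# Score every position against a query, softmax the scores, and take a
# weighted sum of the values.
import math
import torch

def attention(query, keys, values):
    # query: (d,); keys, values: (seq_len, d)
    scores = keys @ query / math.sqrt(query.shape[-1])  # one score per position
    weights = torch.softmax(scores, dim=-1)             # attention distribution
    return weights @ values                             # weighted sum of values
```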
Large "Unsupervised" Language Models
Contextual Embeddings
Contextual Embeddings (slide example fragment: "The Seahawks ... football today")
word2vec
ELMo
"NLP's ImageNet moment"
Self-Attention
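Self-attention is the same mechanism with queries, keys, and values all drawn from one sequence; the Transformer stacks layers of it. A sketch using PyTorch's built-in multi-head attention:

```python
# Self-attention: the sequence attends to itself, so every token can look at
# every other token in one step.
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
x = torch.randn(1, 7, 64)     # (batch, seq_len, embed_dim)
out, weights = attn(x, x, x)  # query = key = value = the same sequence
```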
RNN vs CNN vs Self-Attention
The Transformer ("Attention Is All You Need")
OpenAI GPT, or Transformer Decoder Language Model
One Model to Rule Them All?
The GLUE Benchmark
BERT
Task 1: Masked Language Modeling
"Joel is giving a [MASK] talk at a [MASK] in San Francisco"
First mask: interesting / exciting / derivative / pedestrian / snooze-worthy / ...
Second mask: conference / meetup / rave / coffeehouse / WeWork / ...
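A sketch of masked-word prediction with a pretrained BERT, assuming the Hugging Face transformers library (my choice; the talk doesn't name a toolkit here):

```python
# Ask BERT to fill in a masked word with its top candidates.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
fill("Joel is giving a [MASK] talk in San Francisco.")  # returns ranked candidates
```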
Task 2: Next Sentence Prediction
"[CLS] Joel is giving a talk. [SEP] The audience is enthralled. [SEP]" -> 99% is_next_sentence, 1% is_not_next_sentence
"[CLS] Joel is giving a talk. [SEP] The audience is falling asleep. [SEP]" -> 1% is_next_sentence, 99% is_not_next_sentence
BERT for downstream tasks
GPT-2
1.5 billion parameters
Is GPT-2, a Pretrained Language Model, Dangerous?
How Can You Use These In Your Work?
Use Pretrained Word Vectors
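One way to do this, assuming gensim's downloader (there are several equivalent routes):

```python
# Load pretrained GloVe vectors and use them directly, no training required.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # 100-dimensional GloVe vectors
vectors.most_similar("conference")
```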
Better Still, Use Pretrained Contextual Embeddings
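A sketch of getting contextual embeddings from pretrained ELMo via AllenNLP (which the talk plugs); the file paths below are placeholders, not real locations:

```python
# ELMo returns a different vector for each token depending on its sentence.
from allennlp.modules.elmo import Elmo, batch_to_ids

elmo = Elmo("elmo_options.json",   # placeholder: path to an ELMo options file
            "elmo_weights.hdf5",   # placeholder: path to pretrained ELMo weights
            num_output_representations=1)

character_ids = batch_to_ids([["Joel", "is", "giving", "a", "talk"]])
embeddings = elmo(character_ids)["elmo_representations"][0]  # one vector per token, in context
```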
Use Pretrained BERT to Build Great Classifiers
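A sketch of pretrained BERT with a classification head, again assuming Hugging Face transformers; the fine-tuning loop is omitted:

```python
# Pretrained BERT body + a small classification head ready for fine-tuning.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("Joel is giving a talk.", return_tensors="pt")
probs = torch.softmax(model(**inputs).logits, dim=-1)
```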
Use the Pretrained Language Model GPT-2 (small) (if you dare)
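A sketch of sampling from GPT-2 small, assuming Hugging Face transformers:

```python
# Generate a continuation from GPT-2; "gpt2" is the small 124M-parameter model.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_ids = tokenizer.encode("Modern NLP is", return_tensors="pt")
output = model.generate(input_ids, max_length=30, do_sample=True)
print(tokenizer.decode(output[0]))
```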
In Conclusion
[Image caption: "I'm fine-tuning a transformer model!"]
● NLP is cool
● Modern NLP is solving really hard problems
● (And is changing really, really quickly)
● Lots of really smart people with lots of data and lots of compute power have trained models that you can just download and use
● So take advantage of their work!
Thanks!
● I'll tweet out the slides: @joelgrus
  ○ read the speaker notes
  ○ they have lots of links
● I sometimes blog: joelgrus.com
● AI2: allenai.org
● AllenNLP: allennlp.org
● GPT-2 Explorer: gpt2.apps.allenai.org
● podcast: adversariallearning.com
Appendix
References
● http://ruder.io/a-review-of-the-recent-history-of-nlp/
● https://ankit-ai.blogspot.com/2019/03/future-of-natural-language-processing.html
● https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html#openai-gpt