What’s so Hard about Natural Language Understanding?
Alan Ritter
Computer Science and Engineering, The Ohio State University

Collaborators: Jiwei Li, Dan Jurafsky (Stanford); Bill Dolan, Michel Galley, Jianfeng Gao (MSR); Colin Cherry (Google); Jeniya Tabassum, Alexander Konovalov, Wei Xu (Ohio State); Brendan O’Connor (UMass)
Q: Why are we so good at Speech and MT (but bad at NLU)?
A: People naturally translate and transcribe.

Q: Where can we find large, end-to-end datasets for NLU?
• Web-scale Conversations?
• Web-scale Structured Data?
Data-Driven Conversation
• Twitter: ~500 million public SMS-style conversations per month
• Goal: Learn conversational agents directly from massive volumes of data.
Noisy Channel Model [Ritter, Cherry, Dolan EMNLP 2011]
Input: Who wants to come over for dinner tomorrow?
Output: Yum ! I want to be there tomorrow !
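A minimal sketch of the noisy-channel idea, borrowed from phrase-based MT: pick the response r maximizing log p(input | r) + log p(r). The scorer callables (`channel_logp`, `lm_logp`) and the toy stubs below are hypothetical stand-ins, not the paper's actual models.

```python
def noisy_channel_response(input_msg, candidates, channel_logp, lm_logp):
    """Noisy-channel selection: argmax_r  log p(input | r) + log p(r).
    channel_logp(input, r) plays the translation-model role;
    lm_logp(r) plays the response language-model role."""
    return max(candidates, key=lambda r: channel_logp(input_msg, r) + lm_logp(r))

# Toy usage with stub scorers (length-match channel, length-penalty LM):
best = noisy_channel_response(
    "Who wants to come over for dinner tomorrow?",
    ["Yum ! I want to be there tomorrow !", "ok"],
    channel_logp=lambda s, r: -abs(len(s.split()) - len(r.split())),
    lm_logp=lambda r: -0.1 * len(r.split()),
)
print(best)  # "Yum ! I want to be there tomorrow !"
```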
Neural Conversation
[Sordoni et al. 2015] [Xu et al. 2016] [Wen et al. 2016] [Li et al. 2016] [Kannan et al. 2016] [Serban et al. 2016]
How old are you?
i 'm 16 .
16 ?   ← Bad Action
i don 't know what you 're talking about
you don 't know what you 're saying
i don 't know what you 're talking about …   ← Outcome: a repetitive loop
Slide Credit: Jiwei Li
Deep Reinforcement Learning [Li, Monroe, Ritter, Galley, Gao, Jurafsky EMNLP 2016]
[Figure: seq2seq encoder-decoder. Encoding: the input "how old are you EOS" is read into a State. Decoding: the model emits an Action, the response "I'm 16 . EOS".]
Learning: Policy Gradient (REINFORCE Algorithm; Williams, 1992)
[Figure: the same encoder-decoder; the decoder policy that maps the state encoding to an action is what we want to learn.]
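To make the policy gradient concrete, here is a minimal sketch of one REINFORCE step, assuming PyTorch and a seq2seq policy that has already sampled a response and kept its per-token log-probabilities; the reward itself (in the paper, heuristic conversation-quality signals) is supplied from outside.

```python
import torch

def reinforce_step(log_probs, reward, baseline, optimizer):
    """One REINFORCE update (Williams, 1992).
    log_probs: per-token log pi(a_t | state, a_<t) for the sampled
    response (a tensor that requires grad); reward, baseline: floats.
    Responses scoring above the baseline become more probable."""
    loss = -(reward - baseline) * log_probs.sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```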
Q: Rewards?
A: Turing Test → Adversarial Learning (Goodfellow et al., 2014)
Adversarial Learning for Neural Dialogue [Li, Monroe, Shi, Jean, Ritter, Jurafsky EMNLP 2016]
(Alternate between training the Generator and the Discriminator)
• Sample a human response from real-world conversations, or generate a response with the Response Generator.
• The Discriminator classifies each response: Real or Fake?
• The Generator is trained with the REINFORCE Algorithm (Williams, 1992), using the Discriminator's judgment as the reward.
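A minimal sketch of one alternating update is below. `generator.sample` and the `discriminator(context, response)` call signature are assumed interfaces, not the paper's exact code: the discriminator is trained to separate human from generated responses, and its "looks human" probability serves as the REINFORCE reward for the generator.

```python
import torch
import torch.nn.functional as F

def adversarial_step(generator, discriminator, g_opt, d_opt,
                     context, human_response):
    # --- Discriminator step: real responses -> 1, generated -> 0 ---
    fake_ids, log_probs = generator.sample(context)  # assumed interface
    d_real = discriminator(context, human_response)
    d_fake = discriminator(context, fake_ids)
    d_loss = (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
              F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # --- Generator step: REINFORCE, with the discriminator's
    # probability of "human" as the reward ---
    reward = discriminator(context, fake_ids).detach()
    g_loss = -(reward * log_probs.sum())
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```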
Adversarial Learning Improves Response Generation (vs. vanilla generation model)

Human Evaluator:
  Adversarial Win: 62%   Adversarial Lose: 18%   Tie: 20%

Machine Evaluator (Adversarial Success: how often can you fool a machine? [Bowman et al. 2016]):
  Adversarial Learning: 8.0%   Standard Seq2Seq model: 4.9%

Slide Credit: Jiwei Li
Q: Why are we so good at Speech and MT (but bad at NLU)?
A: People naturally translate and transcribe.

Q: Where can we find large, end-to-end datasets for NLU?
• Web-scale Conversations? → Generates fluent open-domain replies, but is it really Natural Language Understanding?
• Web-scale Structured Data?
Learning from Distant Supervision [Mintz et al. 2009]
1) Named Entity Recognition. Challenge: highly ambiguous labels [Ritter et al. EMNLP 2011]
2) Relation Extraction. Challenge: missing data [Ritter et al. TACL 2013]
3) Time Normalization. Challenge: diversity in noisy text [Tabassum, Ritter, Xu EMNLP 2016]
4) Event Extraction. Challenge: lack of negative examples [Ritter et al. WWW 2015] [Konovalov et al. WWW 2017]

Objective with label regularization:
$$O(\theta) = \underbrace{\sum_{i=1}^{N} \log p_\theta(y_i \mid x_i)}_{\text{Log likelihood}} \;-\; \underbrace{\lambda_U \, D\big(\tilde{p} \,\|\, \hat{p}^{\,\text{unlabeled}}_{\theta}\big)}_{\text{Label regularization}}$$
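A sketch of that objective as a training loss, under stated assumptions: distantly labeled examples contribute a standard negative log-likelihood, while unlabeled examples contribute a KL term D(p̃ ‖ p̂) pulling the model's average predicted label distribution p̂ toward a prior p̃ (e.g., the expected fraction of positive events). Tensor shapes and the exact form of p̃ are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def label_regularized_loss(logits_labeled, labels, logits_unlabeled,
                           p_tilde, lam_u=1.0):
    """Minimize -O(theta) = NLL + lambda_U * D(p_tilde || p_hat)."""
    # Log-likelihood term over distantly labeled examples.
    nll = F.cross_entropy(logits_labeled, labels, reduction="sum")
    # p_hat: the model's average label distribution on the unlabeled pool.
    p_hat = torch.softmax(logits_unlabeled, dim=-1).mean(dim=0)
    # Label regularization: KL(p_tilde || p_hat).
    kl = torch.sum(p_tilde * (torch.log(p_tilde) - torch.log(p_hat)))
    return nll + lam_u * kl
```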
Time Normalization [Tabassum, Ritter, Xu EMNLP 2016]
Distant Supervision (no human labels or rules!)
State-of-the-art time resolvers { TempEX, HeidelTime, SUTime, UWTime } map time expressions to calendar dates, e.g., 1 Jan 2016.
Distant Supervision Assumption
Mercury Transit: May 9, 2016
[Figure: a timeline spanning 8 May, 9 May, and 10 May; tweets mentioning the Mercury Transit on each of those days are aligned to the known event date.]
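The assumption can be turned directly into training data, as in the hedged sketch below: for a tweet known to discuss the event, any time expression that resolves (relative to the tweet's post date) to the event's known date is taken as a positive example. The token list and resolution rules are illustrative, not the paper's grammar.

```python
from datetime import date, timedelta

# Illustrative resolution rules for a few relative time expressions.
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}

def distant_labels(tweet_tokens, tweet_date, event_date):
    """Label time expressions that resolve to the known event date,
    with no human annotation (the distant supervision assumption)."""
    labels = []
    for tok in tweet_tokens:
        offset = RELATIVE_DAYS.get(tok.lower())
        if offset is not None and tweet_date + timedelta(days=offset) == event_date:
            labels.append((tok, event_date))  # positive training example
    return labels

# A tweet posted on 8 May 2016 mentioning the transit "tomorrow":
print(distant_labels(["Mercury", "transit", "tomorrow", "!"],
                     date(2016, 5, 8), date(2016, 5, 9)))
# -> [('tomorrow', datetime.date(2016, 5, 9))]
```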
Multiple Instance Learning Tagger
[Event Database entry: (Mercury, 5/9/2016)]
Words: w_1 w_2 w_3 … w_n
Word-Level Tags (latent): z_1 z_2 z_3 … z_n, scored by a local classifier exp(θ · f(w_i, z_i))
Sentence-Level Tags: t_1 t_2 t_3 t_4, with domains {1, …, 31} (day of month), {Mon, …, Sun} (day of week), {1, …, 12} (month), and {Past, Present, Future} (tense)
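A small sketch of the multiple-instance idea, under assumptions: each word carries a latent tag z_i scored by the slide's local classifier exp(θ · f(w_i, z_i)), and a sentence-level tag is asserted if at least one word supports it (the MIL "OR" aggregation). The feature function `feats` and the hard-max inference are simplifications of the paper's model.

```python
import math

def local_score(theta, features):
    """Local classifier from the slide: exp(theta . f(w_i, z_i)).
    theta maps feature names to weights; features is f(w_i, z_i)."""
    return math.exp(sum(theta.get(f, 0.0) for f in features))

def mil_tag_sentence(theta, words, tag_set, feats):
    """Assign each word its best latent tag, then aggregate:
    a sentence-level tag holds if some word expresses it."""
    word_tags = [max(tag_set, key=lambda z: local_score(theta, feats(w, z)))
                 for w in words]
    sentence_tags = set(word_tags) - {"O"}  # "O" = no time tag
    return word_tags, sentence_tags
```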