Human-AI Collaboration for Neural Text Generation with Interpretable Neural Networks

Sebastian Gehrmann. Thesis Defense, Oct 18, 2019. Committee Members: Barbara Grosz, Sasha Rush, Stuart Shieber.

This is Jesse, a journalist. Jesse has a ton of work.

Maybe AI can help reduce the workload? Introducing AI-reen, a text-generation model.

Jesse could give some of the workload to AI-reen. Doing so, Jesse would give up her agency over that work.

But AI-reen is biased and makes mistakes! Jesse still needs to provide oversight over its work.

By collaborating with AI-reen, Jesse could gain the benefits of automation without losing her agency.

[Diagram: collaboration loop with the steps Problem, Explain Suggestion, Provide Feedback, Update Suggestion, Accept, Accepted solution]

They want to collaboratively summarize a document. [Figure: Source]

Both have an idea how to summarize it.

If AI-reen were human, it could communicate its reasoning. But its predictions are not interpretable. ???

Even if it could explain its suggestion, it can’t incorporate feedback from Jesse. AI-reen: “I picked this phrase, because…” Jesse: “I don’t like it.”

Interpretability is necessary, but we also need controllability. AI-reen: “I picked this phrase, because…” Jesse: “How about … instead?”

Let’s empower humans to collaborate with AI!
Summarization [EMNLP ’18]
Data2Text [INLG ’18]
Section Title Generation [NAACL ’19]
TL;DR Generation [INLG ’19]
LSTMVis [InfoVis ’17]
Phenotyping Saliency [PloS one, ’17]
Seq2Seq-Vis [VAST ’18]
Model Selection [DeepStruct ’19]
Modeling Capacity [Formal Languages ’19]
Collaborative Semantic Inference [VAST ’19]
Detecting Fake Text with GLTR [ACL Demo ’19]
Automated Mediation [Behavior & Technology ’19]

Outline
1. Background: Sequence Modeling for NLP
2. Incorporating Content Selection into a Summarization Model
3. How to Understand Predictions?
4. Collaborating with the Model to Summarize

Language model: $p(y_{t+1} \mid y_1, \ldots, y_t)$. Each word of “The small dog owns a yellow ball .” ($y_1, \ldots, y_8$) is predicted from the words before it.
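The next-word factorization on this slide can be illustrated with a toy example. The sketch below is not from the talk: the lookup table `NEXT_WORD` and its probabilities are made up purely to show how the per-step terms $p(y_{t+1} \mid y_1, \ldots, y_t)$ combine into the probability of a whole sequence.

```python
import math

# Hypothetical next-word table: context tuple -> {next word: probability}.
# All numbers are invented for illustration only.
NEXT_WORD = {
    (): {"The": 0.6, "A": 0.4},
    ("The",): {"small": 0.5, "large": 0.5},
    ("The", "small"): {"dog": 0.7, "child": 0.3},
}


def sequence_log_prob(words):
    """Sum log p(y_{t+1} | y_1..y_t) over all positions (chain rule)."""
    total = 0.0
    for t, word in enumerate(words):
        context = tuple(words[:t])  # all words generated so far
        total += math.log(NEXT_WORD[context][word])
    return total


log_p = sequence_log_prob(["The", "small", "dog"])
prob = math.exp(log_p)  # product of the three per-step probabilities
```

A neural language model replaces the lookup table with a network that maps the prefix to a distribution over the vocabulary; the summation over steps stays the same.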

[Figure: sequence model over the tokens “The small dog owns a yellow ball .” ($y_1, \ldots, y_8$)] [Elman ’90, Hochreiter & Schmidhuber ’97]

[Figure: next-word distribution $p$ over candidates such as “large”, “small”, “child”, “dog” for the sentence “The small dog owns a yellow ball .” ($y_1, \ldots, y_8$)] [Bengio ’03]

Source ($x_1, \ldots, x_8$): The small dog owns a yellow ball .
Target ($y_1, \ldots, y_8$): Der kleine Hund besitzt einen gelben Ball .
Conditioning on the source turns $p(y_{t+1} \mid y_1, \ldots, y_t)$ into $p(y_{t+1} \mid y_1, \ldots, y_t, x)$. For example, $p(y_3 \mid y_1, y_2, x)$ is p(next word | Der kleine, The small dog...).

Encoder-decoder: the encoder reads the source “The small dog owns a yellow ball .” ($x_1, \ldots, x_8$); the decoder has generated “Der kleine” ($y_1, y_2$) so far. [Bahdanau et al. ’14, Sutskever et al. ’14]

Attention: $p(a_t \mid x, y_{1:t})$ over the source words $x_1, \ldots, x_8$, given the decoder prefix “Der kleine” ($y_1, y_2$). [Bahdanau et al. ’14, Sutskever et al. ’14]

Context: $\sum_{s=1}^{S} a_t^s x_s$, the attention-weighted sum over the source positions $x_1, \ldots, x_8$, given the decoder prefix “Der kleine” ($y_1, y_2$). [Bahdanau et al. ’14, Sutskever et al. ’14]
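The context computation on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the model from the talk: the slides do not say how attention scores are produced, so `scores` is simply assumed as an input, and `encoder_states` stands in for the $x_s$ in the sum.

```python
import numpy as np


def attention_context(scores, encoder_states):
    """Return attention weights a_t and the context sum_s a_t^s * x_s.

    scores:         shape (S,), unnormalized attention scores (assumed given)
    encoder_states: shape (S, d), one vector per source position
    """
    # Softmax over source positions, shifted by the max for numerical stability.
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()
    # Weighted sum of the encoder states.
    context = weights @ encoder_states
    return weights, context


# Three source positions with 2-dimensional states and equal scores.
encoder_states = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
scores = np.array([0.0, 0.0, 0.0])
weights, context = attention_context(scores, encoder_states)
```

With equal scores the weights are uniform, so the context is the plain average of the encoder states.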

Next-word distribution $p$ over candidates such as “das”, “Hund”, “Kind”, “große”, given the source “The small dog owns a yellow ball .” ($x_1, \ldots, x_8$) and the decoder prefix “Der kleine” ($y_1, y_2$). [Bahdanau et al. ’14, Sutskever et al. ’14]

Consider an abstractive summarization problem with input $x_1, \ldots, x_S$ and summary $y_1, \ldots, y_T$. Train a summarizer to maximize $p(y \mid x)$. [Gehrmann, Deng, and Rush, EMNLP ’18]

Copy attention: the attention $p(a_t \mid x, y_{1:t})$ also gives a distribution $p$ over source words that can be copied (e.g. “dog”, “The”, “a”, “ball”). Source: “The small dog owns a yellow ball .” ($x_1, \ldots, x_8$); summary so far: “Dog owns” ($y_1, y_2$). [Vinyals et al. ’15, Filippova et al. ’15, Gu et al. ’16, See et al. ’17]

The copy mechanism uses a binary soft switch $z_t$ that determines whether the model copies or generates.

$$p(y_{t+1} \mid x, y_{1:t}) = p(z_t = 1 \mid x, y_{1:t}) \times p(y_{t+1} \mid z_t = 1, x, y_{1:t}) + p(z_t = 0 \mid x, y_{1:t}) \times p(y_{t+1} \mid z_t = 0, x, y_{1:t})$$

Here $p(z_t = 1 \mid x, y_{1:t}) = \sigma(W h_t + b)$ and $p(y_{t+1} \mid z_t = 1, x, y_{1:t})$ reuses the attention $p(a_t \mid x, y_{1:t})$; $p(z_t = 0 \mid x, y_{1:t}) = 1 - \sigma(W h_t + b)$ and $p(y_{t+1} \mid z_t = 0, x, y_{1:t})$ is the standard model prediction.
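The mixture on this slide can be written out as a short NumPy sketch. It is an illustration under assumptions rather than the implementation from the talk: `W` is taken to be a vector so that $W h_t + b$ is a scalar, and `source_ids` (the vocabulary index of each source word) is a hypothetical helper input used to map attention over source positions onto the vocabulary.

```python
import numpy as np


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def copy_mixture(h_t, W, b, attention, source_ids, p_generate):
    """Mix the copy and generation distributions over the vocabulary.

    attention:  shape (S,), attention over source positions, sums to 1
    source_ids: shape (S,), vocabulary index of each source word (assumed)
    p_generate: shape (V,), standard model prediction, sums to 1
    """
    p_switch = sigmoid(W @ h_t + b)  # p(z_t = 1 | x, y_1:t)
    # Scatter attention mass onto vocabulary entries; repeated source words
    # accumulate their attention weights.
    p_copy = np.zeros_like(p_generate)
    np.add.at(p_copy, np.asarray(source_ids), attention)
    return p_switch * p_copy + (1.0 - p_switch) * p_generate


# Toy setup: zero parameters give a switch probability of exactly 0.5.
h_t = np.zeros(2)
W = np.zeros(2)
b = 0.0
attention = np.array([0.5, 0.5])
source_ids = [0, 2]
p_generate = np.array([0.25, 0.25, 0.25, 0.25])
p_next = copy_mixture(h_t, W, b, attention, source_ids, p_generate)
```

Because both component distributions sum to one and the switch weights sum to one, the mixture is again a valid distribution over the vocabulary.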

Just because a model can copy, should it?

[Diagram: Text, Summarizer, Copy Mechanism]

Abstractive summarizers over-extract. “Angela Merkel and her husband, chemistry professor Joachim Sauer, are spotted on their annual easter trip to the island of ischia, near Naples.”

The model fails at content selection! Consider the content selection as word-level extractive summarization. Let $t_1, \ldots, t_S$ denote a binary indicator whether a source word is used in a summary. Train a model to maximize $p(t \mid x)$.

How to generate supervised data? Source: “The small dog owns a large yellow ball. The big dog from next door chases the ball.” Summary: “Big dog chases small dog’s ball.”
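One simple way to derive the binary indicators $t_1, \ldots, t_S$ from a source and summary pair is to mark every source word that also occurs in the summary. The sketch below uses this plain word-overlap rule as an assumption; the slides do not spell out the alignment procedure, and the actual method may be stricter (for example, matching longer phrases). The tokenization with a separate `'s` token is also an assumption.

```python
def selection_tags(source_tokens, summary_tokens):
    """Tag a source token with 1 if it also appears in the summary, else 0.

    Simplified word-overlap rule (case-insensitive); an assumption made for
    illustration, not necessarily the alignment used in the talk.
    """
    summary_vocab = {tok.lower() for tok in summary_tokens}
    return [1 if tok.lower() in summary_vocab else 0 for tok in source_tokens]


source = ("The small dog owns a large yellow ball . "
          "The big dog from next door chases the ball .").split()
summary = "Big dog chases small dog 's ball .".split()

tags = selection_tags(source, summary)
selected = [tok for tok, tag in zip(source, tags) if tag == 1]
```

The resulting tags can serve as training targets for a word-level tagger such as the content selector.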

Content selector: a model based on ELMo predicts the tags $t$ for the source “The small dog owns a large yellow ball. The big dog from next door chases the ball.”

Control copied content with Bottom-Up Attention by restricting what can be copied to important content. [Diagram: Source → Content Selection → Masked Source → Bottom-Up Attention → Summary]

Let $q_s$ denote the selection probability from the content selector. Let $\epsilon$ denote an importance threshold. Modify the copy-attention such that

$$p(\tilde{a}_t^s \mid x, y_{1:t}) = \begin{cases} p(a_t^s \mid x, y_{1:t}) & q_s > \epsilon \\ 0 & \text{otherwise} \end{cases}$$
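The masking rule above can be sketched directly in NumPy. The thresholding follows the equation on the slide; the optional renormalization step is an assumption added here so that the masked attention still sums to one, and the example values are invented.

```python
import numpy as np


def bottom_up_attention(attention, selection_probs, epsilon, renormalize=True):
    """Zero out copy attention on source words with q_s <= epsilon.

    attention:       shape (S,), p(a_t^s | x, y_1:t)
    selection_probs: shape (S,), q_s from the content selector
    renormalize:     assumption, not shown on the slide; rescales the
                     remaining mass so it sums to 1
    """
    masked = np.where(selection_probs > epsilon, attention, 0.0)
    total = masked.sum()
    if renormalize and total > 0:
        masked = masked / total
    return masked


# Made-up example: only positions 1 and 2 pass the threshold.
attention = np.array([0.1, 0.4, 0.3, 0.2])
q = np.array([0.05, 0.9, 0.8, 0.1])
masked = bottom_up_attention(attention, q, epsilon=0.5)
raw = bottom_up_attention(attention, q, epsilon=0.5, renormalize=False)
```

Raising `epsilon` restricts copying to fewer source words, which is how the content selector controls what the summarizer may copy.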

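Bottom-Up Attention: the masked attention $p(\tilde{a}_t \mid x, y_{1:t})$ gives the distribution $p$ over source words that can be copied (e.g. “dog”, “The”, “a”, “ball”). Source: “The small dog owns a yellow ball .” ($x_1, \ldots, x_8$); summary so far: “Dog owns” ($y_1, y_2$).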

+2 ROUGE: the improvements were consistent across two evaluated datasets.

Without Bottom-Up: “Angela Merkel and her husband, chemistry professor Joachim Sauer, are spotted on their annual easter trip to the island of ischia, near Naples.”
With Bottom-Up: “Angela Merkel and her husband are spotted on their easter trip.”

There is still work to be done…

Summarization models struggle in real-world scenarios! How do we make the generation of a summary collaborative ?

The Users of Interpretability and Collaboration: Architect, Trainer, End User [Strobelt*, Gehrmann, et al., InfoVis ’17]

The Target of Interpretability and Collaboration: Model ($\theta$), Decision ($\hat{y}$) [Gehrmann*, Strobelt*, et al., VAST ’19]

The Coupling of Model and Interface: (a) Passive Observation, (b) Interactive Observation, (c) Interactive Collaboration [Gehrmann*, Strobelt*, et al., VAST ’19]

(a) Passive Observation, (b) Interactive Observation; target: $\theta$ [Wongsuphasawat et al., VAST ’17]

(b) Interactive Observation, (c) Interactive Collaboration; target: $\hat{y}$ [Strobelt*, Gehrmann*, et al., VAST ’18]
