Argument Retrieval in Project Debater Yufang Hou IBM Research Europe, Dublin

IBM Research: History of Grand Challenges
• 2019: First computer to successfully debate champion debaters (Project Debater)
• 2011: First computer to defeat best human Jeopardy! players (Watson)
• 1997: First computer to defeat a world champion in Chess (Deep Blue)

Segments from a Live Debate (San Francisco, Feb 11th 2019)
• Expert human debater: Mr. Harish Natarajan
• Motion: We should subsidize preschool
• Format: Oxford style debating
• Fully automatic debate
• No human intervention
• Selected from test set based on assessment of chances to have a meaningful debate

Project Debater: Media Exposure
• 2.1 Billion social media impressions
• 100 Million people reached
• Millions of video views
• Hundreds of press articles in all leading newspapers

• Full Live Debate, Feb-2019: https://www.youtube.com/watch?v=m3u-1yttrVw&t=2469s
• “The Debater” Documentary: https://www.youtube.com/watch?v=7pHaNMdWGsk&t=1383s

Outline
• System overview
• Argument retrieval in Project Debater
• Some retrospective thoughts

Current Publications Highlight Various Aspects of the System

Publications and Datasets are available at https://www.research.ibm.com/artificial-intelligence/project-debater/research/

Related Work
• Lippi and Torroni, IJCAI, 2015
• Al-Khatib et al, NAACL 2016; Wachsmuth et al, Argument Mining Workshop, 2017, …
• Stab and Gurevych, EMNLP 2014; Stab et al, NAACL 2018, …
• Recent reviews
  - Five years of argument mining: a data-driven analysis, Cabrio and Villata, IJCAI, 2018
  - Argumentation Mining, Stede and Schneider, Synthesis Lectures on HLT, 2018
  - Argument Mining: A Survey, Lawrence and Reed, CL, 2019

Wikipedia Stage
• Context Dependent Claim Detection, Levy et al, COLING 2014.
• Show Me Your Evidence - an Automatic Method for Context Dependent Evidence Detection, Rinott et al, EMNLP 2015.

Wikipedia Stage
• Wikipedia Claim/Evidence Labeled Data – Labeling Process
  - Controversial Topic → Select Wikipedia Articles → Find Claim Candidates per Article → Confirm/Reject Each Claim Candidate → Find Candidate Evidence per Claim → Confirm/Reject Each Candidate Evidence
  ✓ 5 In-house Annotators Per Stage
  ✓ Exhaustive annotation

Wikipedia Stage
• Wikipedia Claim/Evidence Labeled Data - Results
  ✓ 58 Controversial Topics selected from Debatabase
    - E.g., Ban the sale of Violent Video Games for Children
  ✓ 547 relevant Wikipedia articles carefully labeled by in-house team
  ✓ 2.6K Claims & 4.5K Evidence that support/contest the claims
    - Evidence length varies from one sentence to a whole paragraph
    - Three types of Evidence: Study, Expert, and Anecdotal
  ✓ Pre-defined train/dev/test split

Wikipedia Stage
• System Design for Argument Mining
• Example topic: We should subsidize preschool
• Pipeline components (from the figure): Document Level IR, Topic Analysis, Claim Detection, Evidence Detection
• Document Level IR: retrieve documents that directly address the topic and are likely to contain argumentative text segments
• Simple logistic regression model with lots of carefully designed features
• GrASP: Rich Patterns for Argumentation Mining, Shnarch et al., EMNLP 2017
• Static train/dev/test datasets
• Moderate success over a range of test topics
• Only positive instances are annotated
• Limited coverage

VLC (Very Large Corpus) Stage
• Corpus wide argument mining - a working solution, Ein-Dor et al, AAAI 2020.

VLC (Very Large Corpus) Stage: Main Distinction from Prev. Work
• Sentence Level (SL) strategy, vs. Document Level used before
• SCALE
  - ~240 train/dev topics & ~100 test topics
  - ~200,000 sentences carefully annotated for train/dev → Retrospective Labeling Paradigm
  - ~10,000,000,000 Sentences - Reporting results over a massive corpus
• Closer than ever to a working solution
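The Retrospective Labeling Paradigm named above can be illustrated with a minimal sketch. It assumes the paradigm means sending the current model's top-scoring, not-yet-labeled predictions to annotators and adding the results to the training data. The function name, the `score` and `annotate` callables, and `batch_size` are hypothetical; retraining between rounds is left to the caller.

```python
def retrospective_labeling_round(pool, labeled, score, annotate, batch_size):
    """Run one labeling round over the model's top predictions.

    pool       -- iterable of candidate sentences
    labeled    -- dict mapping sentence -> label, updated in place
    score      -- callable returning the current model's score for a sentence
    annotate   -- callable standing in for the human annotation step
    batch_size -- how many top predictions to label in this round
    """
    # Only sentences that have no label yet are eligible.
    unlabeled = [s for s in pool if s not in labeled]
    # Rank by model score and keep the highest-scoring predictions.
    top = sorted(unlabeled, key=score, reverse=True)[:batch_size]
    for sentence in top:
        labeled[sentence] = annotate(sentence)
    return top
```

Calling the function repeatedly, with the model retrained on `labeled` between calls, gives the iterative collection loop.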

VLC (Very Large Corpus) Stage: System Architecture
• Pipeline (from the figure): Massive Corpus (~10B Sentences) → Queries → Retrieved Sentences → Ranking Model → High-precision Evidence Set
• Queries
  - Retrieve 12,000 sentences per evidence type per topic
  - Support flexible patterns to retrieve argumentative sentences: topic terms, evidence connectors, sentiment lexicon, NER
• Ranking Model: BERT
  - Starting with LR from Rinott et al, EMNLP 2015
• Retrospective Labeling Paradigm
  - Controversial Topic → Iteratively Collected Labeled-Data
  - An infrastructure that supports quick dynamic experiments and monitors annotation quality
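The query-then-rank flow in the architecture can be sketched roughly as follows. This is an illustration only: the lexicons are invented placeholders, the NER ingredient is omitted, and `scorer` is a hypothetical stand-in for the ranking model (BERT in the slides), so nothing here reflects the actual Project Debater queries.

```python
import re

# Hypothetical lexicons; the slides name the pattern ingredients, not their contents.
EVIDENCE_CONNECTORS = {"according to", "studies show", "research shows", "found that"}
SENTIMENT_WORDS = {"benefit", "harm", "improve", "risk", "better", "worse"}


def matches_query(sentence, topic_terms):
    """Return True if the sentence has a topic term, a connector and a sentiment word."""
    text = sentence.lower()
    has_topic = any(term.lower() in text for term in topic_terms)
    has_connector = any(conn in text for conn in EVIDENCE_CONNECTORS)
    # Match at a word start so inflected forms such as "improves" also count.
    has_sentiment = any(
        re.search(r"\b" + re.escape(word), text) for word in SENTIMENT_WORDS
    )
    return has_topic and has_connector and has_sentiment


def retrieve(corpus, topic_terms, scorer, limit=12000):
    """Filter sentences with the query, then keep the top `limit` by ranking score."""
    candidates = [s for s in corpus if matches_query(s, topic_terms)]
    return sorted(candidates, key=scorer, reverse=True)[:limit]
```

The cheap pattern filter runs over the whole corpus, so the expensive ranking model only has to score the retrieved subset.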

VLC (Very Large Corpus) Stage: How to Collect Labeled Data?
• Collecting labeled data poses a two-fold challenge
  - Low prior of positive examples
  - Annotation through crowd requires expertise – simple guidelines, careful monitoring…
  - BTW - Kappa of ~0.4 is actually quite good
• Developing corpus-wide argument mining poses another challenge
  - Imagine ~2,000 new predictions every week… → Associated infrastructure is a must
  - Retrospective labeling of top predictions is a natural and effective solution
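To make the kappa figure concrete, here is a minimal sketch of chance-corrected agreement between two annotators. The slide does not say which kappa variant was used; Cohen's two-annotator version is assumed here purely for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)
```

With a low prior of positive examples, two annotators can agree on most items and still get a modest kappa, because much of that agreement is expected by chance.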

Why Is Evidence Detection Hard?
Motion: Blood donation should be mandatory
• CONFIRMED: According to studies, blood donors are 88 percent less likely to suffer a heart attack…
• REJECTED: Statistics … show that students are the main blood donors contributing about 80 percent…

VLC (Very Large Corpus) Stage: Results
• Results by various BERT models over a massive corpus of ~10B sentences
• BA baselines: BlendNet, attention-based bidirectional LSTM model [Shnarch et al. (2018)]
• High precision
• Wide coverage with diverse evidence (highly similar sentences are removed)
[Plot: Macro-Average Precision vs. Number of candidates]
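The plotted metric can be sketched as precision over the top k candidates per topic, averaged with equal weight per topic. The exact evaluation protocol is not given in the slides, so the function names and the dictionary input format below are assumptions.

```python
def precision_at_k(ranked_labels, k):
    """Fraction of positives among the top k candidates of one topic.

    ranked_labels holds 1 (confirmed) or 0 (rejected), best-ranked first.
    """
    top = ranked_labels[:k]
    if not top:
        return 0.0
    return sum(top) / len(top)


def macro_average_precision(per_topic_labels, k):
    """Average precision@k over topics, weighting every topic equally."""
    if not per_topic_labels:
        raise ValueError("need at least one topic")
    scores = [precision_at_k(labels, k) for labels in per_topic_labels.values()]
    return sum(scores) / len(scores)
```

Sweeping k and plotting the result gives a precision curve against the number of candidates.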

Challenges to Consider while developing a Live Debate System
• Data-driven speech writing and delivery
  - Digest massive corpora
  - Write a well-structured speech
  - Deliver with clarity and purpose
• Listening comprehension
  - Identify key claims hidden in long continuous spoken language
  - Compare to personal assistants - simple short commands
• Modeling human dilemmas
  - Modeling the world of human controversy and …
  - Enabling the system to suggest principled arguments
Argument retrieval is the first step to build such a system

The Problem: Many things need to succeed simultaneously and many things can go wrong…

Many things can go wrong… / Examples
• Getting the stance wrong means you support your opponent…
• Drifting from the topic – from Physical Education to Sex Education and back…
• The system is only as good as its corpus → … global warming will lead malaria virus to creep into hilly areas…

Progress over time / Improvement in Precision of Detecting Claims
• Stage 1: LR + Rich features
  - Document level IR
  - Corpus: Wikipedia
  - Exhaustive labelling of positive instances
• Stage 2: Attention-based Bi-LSTM
  - Very large corpus
  - Sentence level IR
  - Retrospective labelling
  - Flexible query with weak supervision
• Stage 3: BERT fine-tuning
  - Sentence level IR
  - Very Large Corpus: 400 million articles (50 times larger than Wikipedia)
  - Retrospective labelling

Beyond Project Debater
• Computational argumentation is emerging as an interesting research area
• “Argument mining” is the new keyword in the list of topics in recent *ACL conferences
• Computational Argumentation: Argument retrieval, Argument Unit Identification, Argument Relation Prediction, Argument(ation) Quality, Argument Generation, …
• Related areas (from the figure)
  - Dialogue System
  - Social NLP: Sentiment, Persuasiveness, Social bias, Framing, Fact verification, …
  - Natural Language Generation
  - Text Summarization
  - Discourse and Pragmatics: Argumentative discourse, Argumentative coherence, …
