

  1. Creating a Gold Benchmark for Open IE
  Gabi Stanovsky and Ido Dagan, Bar-Ilan University

  2. In this talk
  • Problem: No large benchmark for Open IE evaluation!
  • Approach:
    • Identify common extraction principles
    • Extract a large Open IE corpus from QA-SRL
    • Automatic system comparison
  • Contributions:
    • Novel methodology for compiling Open IE test sets
    • New corpus readily available for future evaluations

  3. Problem: Evaluation of Open IE

  4. Open Information Extraction
  • Extracts SVO tuples from text
  • "Barack Obama, the U.S. president, was born in Hawaii" → (Barack Obama, born in, Hawaii)
  • "Obama and Bush were born in America" → (Obama, born in, America), (Bush, born in, America)
  • Useful for populating large databases
  • A scalable, open variant of Information Extraction
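
A minimal Python sketch of how such tuples can be represented (illustrative only, not the output format of any actual system), using the coordination example above:

```python
from collections import namedtuple

# Illustrative representation of an Open IE extraction; real systems use
# their own output formats and often attach confidence scores.
Extraction = namedtuple("Extraction", ["arg1", "relation", "arg2"])

# "Obama and Bush were born in America" yields one tuple per coordinated subject.
extractions = [
    Extraction("Obama", "born in", "America"),
    Extraction("Bush", "born in", "America"),
]

for e in extractions:
    print(f"({e.arg1}; {e.relation}; {e.arg2})")
```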

  5. Open IE: Many parsers developed
  • TextRunner (Banko et al., NAACL 2007)
  • WOE (Wu and Weld, ACL 2010)
  • ReVerb (Fader et al., EMNLP 2011)
  • OLLIE (Mausam et al., EMNLP 2012)
  • KrakeN (Akbik and Löser, ACL 2012)
  • ClausIE (Del Corro and Gemulla, WWW 2013)
  • Stanford Open Information Extraction (Angeli et al., ACL 2015)
  • DEFIE (Delli Bovi et al., TACL 2015)
  • Open-IE 4 (Mausam et al., ongoing work)
  • PropS-DE (Falke et al., EMNLP 2016)
  • NestIE (Bhutani et al., EMNLP 2016)

  6. Problem: Open IE evaluation
  • The Open IE task formulation has lacked formal rigor
  • No common guidelines → no large corpus for evaluation
  • Post-hoc evaluation: annotators judge a small sample of each system's output
    → Precision-oriented metrics
    → Figures are not comparable across systems
    → Experiments are hard to reproduce

  7. Previous evaluations → Hard to draw general conclusions!

  8. Solution: Common Extraction Principles → Large Open IE Benchmark → Automatic Evaluation

  9. Common principles
  1. Open lexicon
  2. Soundness
     "Cruz refused to endorse Trump"
     ReVerb: (Cruz; endorse; Trump)
     OLLIE: (Cruz; refused to endorse; Trump)
  3. Minimal argument span
     "Hillary promised better education, social plans and healthcare coverage"
     ClausIE: (Hillary, promised, better education), (Hillary, promised, better social plans), (Hillary, promised, better healthcare coverage)

  10. Solution: Common Extraction Principles → Large Open IE Benchmark (QA-SRL → Open IE) → Automatic Evaluation

  11. Open IE vs. traditional SRL

                          Open IE   Traditional SRL
      Open lexicon           ✓             ✗
      Soundness              ✓             ✓
      Reduced arguments      ✓             ✗

  12. QA-SRL
  • Recently, He et al. (2015) annotated SRL by asking and answering argument role questions
  • "Obama, the U.S. president, was born in Hawaii"
    • Who was born somewhere? Obama
    • Where was someone born? Hawaii
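
As a rough illustration, a QA-SRL annotation for one predicate can be thought of as a set of question-answer pairs over the sentence; the dictionary below is a hypothetical in-memory form, not the actual file format of He et al. (2015):

```python
# Hypothetical in-memory view of a QA-SRL annotation for one predicate;
# the real corpus format of He et al. (2015) differs.
qa_srl_annotation = {
    "sentence": "Obama, the U.S. president, was born in Hawaii",
    "predicate": "born",
    "qa_pairs": [
        ("Who was born somewhere?", ["Obama", "the U.S. president"]),
        ("Where was someone born?", ["Hawaii"]),
    ],
}
```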

  13. Open IE vs. SRL vs. QA-SRL

                          Open IE   Traditional SRL   QA-SRL
      Open lexicon           ✓             ✗             ✓
      Consistency            ✓             ✓             ✓
      Reduced arguments      ✓             ✗             ✓

  • QA-SRL isn't limited to a lexicon
  • The QA-SRL format solicits reduced arguments (Stanovsky et al., ACL 2016)

  14. Converting QA-SRL to Open IE
  • Intuition: generate all independent extractions
  • Example: "Barack Obama, the newly elected president, flew to Moscow on Tuesday"
  • QA-SRL:
    • Who flew somewhere? Barack Obama / the newly elected president
    • Where did someone fly? to Moscow
    • When did someone fly? on Tuesday
  • → Open IE: (Barack Obama, flew, to Moscow, on Tuesday), (the newly elected president, flew, to Moscow, on Tuesday)
  • → Cartesian product over all answer combinations (see the sketch below)
  • Special cases for nested predicates, modals and auxiliaries
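
A minimal sketch of the Cartesian-product step; the function name and data layout are assumptions for illustration, and the actual conversion additionally handles nested predicates, modals and auxiliaries:

```python
from itertools import product

def qa_srl_to_open_ie(predicate, qa_pairs):
    """Simplified sketch: one extraction per combination of answers,
    taking exactly one answer from each question."""
    answer_sets = [answers for _question, answers in qa_pairs]
    return [(combo[0], predicate, *combo[1:]) for combo in product(*answer_sets)]

qa_pairs = [
    ("Who flew somewhere?", ["Barack Obama", "the newly elected president"]),
    ("Where did someone fly?", ["to Moscow"]),
    ("When did someone fly?", ["on Tuesday"]),
]
for extraction in qa_srl_to_open_ie("flew", qa_pairs):
    print(extraction)
# ('Barack Obama', 'flew', 'to Moscow', 'on Tuesday')
# ('the newly elected president', 'flew', 'to Moscow', 'on Tuesday')
```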

  15. Resulting corpus
  • Validated against an expert annotation of 100 sentences (95% F1)
  • 13 times bigger than the largest previous OIE corpus (ReVerb)

  16. Solution: Common Extraction Principles → Large Open IE Benchmark → Automatic Evaluation

  17. Evaluation
  • We evaluate 6 publicly available systems:
    1. ClausIE
    2. Open-IE 4
    3. OLLIE
    4. PropS IE
    5. ReVerb
    6. Stanford Open IE
  • Soft matching function to accommodate system flavors (see the sketch below)
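
For concreteness, here is an illustrative token-overlap matcher; the benchmark's actual soft matching function may differ, and the threshold value is an assumption. It only sketches the idea of tolerating surface variation across system flavors:

```python
def _tokens(extraction):
    """Lowercased bag of tokens over all tuple slots."""
    return set(" ".join(extraction).lower().split())

def soft_match(predicted, gold, threshold=0.5):
    """Count a predicted tuple as matching a gold tuple if their token
    overlap covers at least `threshold` of each side."""
    pred_toks, gold_toks = _tokens(predicted), _tokens(gold)
    if not pred_toks or not gold_toks:
        return False
    overlap = len(pred_toks & gold_toks)
    return (overlap / len(pred_toks) >= threshold and
            overlap / len(gold_toks) >= threshold)

# e.g. tolerate a slightly different argument phrasing for the same proposition:
print(soft_match(("Cruz", "refused to endorse", "Trump"),
                 ("Cruz", "refused to endorse", "Donald Trump")))  # True
```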

  18. Evaluation (precision/recall figure)
  • Low recall: missed long-range dependencies, pronoun resolution
  • Stanford's performance: it assigns a probability of 1 to most extractions; "duplicates" hurt precision

  19. Caveat
  • OIE parsers were not tuned for our corpus → the evaluation may not reflect their optimal performance
  • More importantly: our corpus can be used for future system development

  20. Conclusion
  • New benchmark published: https://github.com/gabrielStanovsky/oie-benchmark
    • 13 times larger than previous benchmarks
  • First automatic and objective OIE evaluation
  • Novel method for creating OIE test sets for new domains
  Thanks for listening!
