
Machine Reading: From Wikipedia to the Web
Daniel S. Weld
Department of Computer Science & Engineering
University of Washington, Seattle, WA, USA

todo
• More on bootstrapping to the web
• Retrain too brief
• Results for shrinkage independent of retraining

Many Collaborators…
• Raphael Hoffmann, Stefan Schoenmackers, Fei Wu
• And… Eytan Adar, Saleema Amershi, Oren Etzioni, James Fogarty, Xiao Ling, Kayur Patel

Overview
• Extracting Knowledge from the Web
  • Facts
  • Ontology
  • Inference Rules
• Using it for Q/A

UW Intelligence in Wikipedia Project

Key Ideas

Key Idea 1
• Ways WWW → Knowledge
  • Community Content Creation
  • Machine-Learning-Based Information Extraction

Key Idea 1
• Synergy (Positive Feedback)
  • Between ML Extraction & Community Content Creation

Key Idea 2
• Self-Supervised Learning
  • Heuristics for Generating (Noisy) Training Data

Key Idea 3
• Shrinkage (Ontological Smoothing) & Retraining
  • For Improving Extraction in Sparse Domains
  • [Figure: class hierarchy person > performer > actor, comedian]

Key Idea 4
• Approximately Pseudo-Functional (APF) Relations
  • Efficient Inference Using Learned Rules

Motivating Vision: Next-Generation Search
Next-Generation Search = Information Extraction + Ontology + Inference
Scalable Means Self-Supervised

Query: Which German Scientists Taught at US Universities?
Text: … Einstein was a guest lecturer at the Institute for Advanced Study in New Jersey …

• Information Extraction
  • <Einstein, Born-In, Germany>
  • <Einstein, ISA, Physicist>
  • <Einstein, Lectured-At, IAS>
  • <IAS, In, New-Jersey>
  • <New-Jersey, In, United-States>
  • …
• Ontology
  • Physicist(x) ⇒ Scientist(x)
  • …
• Inference
  • Lectured-At(x, y) ∧ University(y) ⇒ Taught-At(x, y)
  • Einstein = Einstein
  • …
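The motivating query can be traced end to end with a small forward-chaining sketch over the triples on the slide. This is only an illustration, not the system described in the talk. Three things are assumptions added here: the fact that IAS is a University (the slide's rule needs it but the slide does not list it), transitivity of In, and reading "German" as Born-In Germany.

```python
# Triples from the slide, plus one assumed fact marked below.
FACTS = {
    ("Einstein", "Born-In", "Germany"),
    ("Einstein", "ISA", "Physicist"),
    ("Einstein", "Lectured-At", "IAS"),
    ("IAS", "In", "New-Jersey"),
    ("New-Jersey", "In", "United-States"),
    ("IAS", "ISA", "University"),  # assumption: not stated on the slide
}

# Ontology from the slide: Physicist(x) => Scientist(x)
SUBCLASS = {"Physicist": "Scientist"}


def forward_chain(facts):
    """Apply the rules repeatedly until no new triple is derived."""
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            # Ontology rule: membership propagates to the superclass.
            if p == "ISA" and o in SUBCLASS:
                new.add((s, "ISA", SUBCLASS[o]))
            # Assumed rule: In is transitive.
            if p == "In":
                new.update((s, "In", o2) for s2, p2, o2 in known
                           if p2 == "In" and s2 == o)
            # Slide rule: Lectured-At(x, y) and University(y) => Taught-At(x, y)
            if p == "Lectured-At" and (o, "ISA", "University") in known:
                new.add((s, "Taught-At", o))
        if new <= known:
            return known
        known |= new


def german_scientists_teaching_in_us(known):
    """Answer the slide's query against the closed set of triples."""
    return sorted(
        s for s, p, o in known
        if p == "Taught-At"
        and (o, "In", "United-States") in known
        and (s, "ISA", "Scientist") in known
        and (s, "Born-In", "Germany") in known
    )


answers = german_scientists_teaching_in_us(forward_chain(FACTS))
print(answers)
```

None of the extracted triples answers the query directly; the answer only appears once the ontology and the inference rule are applied, which is the point of the slide.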

TextRunner: Open Information Extraction
For each sentence
  Apply POS Tagger
  For each pair of noun phrases, NP1, NP2
    If classifier confirms they are "Related?"
      Use CRF to extract relation from intervening text
      Return relation(NP1, NP2)
Train classifier & extractor on Penn Treebank data

Example: "Mark Emmert was born in Fife and graduated from UW in 1975"
  was-born-in(Mark Emmert, Fife)

Why Wikipedia?
• Pros
  • Comprehensive
  • High Quality [Giles Nature 05]
  • Useful Structure
• Cons
  • Natural-Language
  • Missing Data
  • Inconsistent
  • Low Redundancy
[Figure source: Comscore MediaMetrix – August 2007]

Wikipedia Structure
• Unique IDs & Links
• Infoboxes
• Categories & Lists
• First Sentence
• Redirection pages
• Disambiguation pages
• Revision History
• Multilingual
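The TextRunner loop above can be sketched as a small driver that takes its components as parameters. Everything named `toy_*` is a hypothetical stand-in invented for this illustration: the real system uses a POS tagger, a trained classifier and a CRF extractor, none of which are reproduced here. The capitalization-based chunker in particular replaces the POS-tagging step.

```python
from itertools import combinations


def open_extract(sentence, chunk, is_related, label_relation):
    """For each ordered pair of noun phrases, keep the pair if the
    classifier accepts it and label it from the intervening text."""
    tokens = sentence.split()
    phrases = chunk(tokens)  # list of (start, end) token spans, left to right
    triples = []
    for a, b in combinations(phrases, 2):
        if not is_related(tokens, a, b, phrases):
            continue
        between = tokens[a[1]:b[0]]
        triples.append((" ".join(tokens[a[0]:a[1]]),
                        label_relation(between),
                        " ".join(tokens[b[0]:b[1]])))
    return triples


def toy_chunk(tokens):
    """Stand-in chunker: a run of capitalized tokens is one noun phrase."""
    spans, start = [], None
    for i, tok in enumerate(tokens + [""]):
        if tok[:1].isupper():
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    return spans


def toy_is_related(tokens, a, b, phrases):
    """Stand-in classifier: no noun phrase in between, and the
    intervening text starts with a copula."""
    if any(a[1] <= s < b[0] for s, _ in phrases):
        return False
    between = tokens[a[1]:b[0]]
    return bool(between) and between[0].lower() in {"was", "is"}


def toy_label(between):
    """Stand-in for the CRF: join the intervening tokens."""
    return "-".join(t.lower() for t in between)


SENTENCE = "Mark Emmert was born in Fife and graduated from UW in 1975"
triples = open_extract(SENTENCE, toy_chunk, toy_is_related, toy_label)
print(triples)
```

The crude stand-ins recover only the was-born-in tuple from the slide's example and miss the graduation fact, which a trained classifier and extractor would be expected to handle better.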

Traditional, Supervised I.E.
Raw Data → Labeled Training Data → Learning Algorithm → Extractor
• Kirkland-based Microsoft is the largest software company.
• Boeing moved its headquarters to Chicago in 2003.
• Hank Levy was named chair of Computer Science & Engr.
• …
HeadquarterOf(<company>,<city>)

Status Update
Outline
• Motivation
• Extracting Facts from Wikipedia
• Ontology Generation
• Improving Fact Extraction
• Bootstrapping to the Web
• Validating Extractions
• Improving Recall with Inference
• Conclusions
Key Ideas
• Synergy
• Self-Supervised Learning
• Shrinkage & Retraining
• APF Relations

Kylin: Self-Supervised Information Extraction from Wikipedia [Wu & Weld CIKM 2007]
[Figure: Kylin Architecture]

From infoboxes to a training set
Clearfield County was created in 1804 from parts of Huntingdon and Lycoming Counties but was administered as part of Centre County until 1812. Its county seat is Clearfield. 2,972 km² (1,147 mi²) of it is land and 17 km² (7 mi²) of it (0.56%) is water. As of 2005, the population density was 28.2/km².

Preliminary Evaluation
Kylin Performed Well on Popular Classes:
• Precision: mid 70% ~ high 90%
• Recall: low 50% ~ mid 90%
... But Floundered on Sparse Classes (Too Little Training Data)
Is this a Big Problem?

The Precision / Recall Tradeoff
[Figure: overlap of Correct Tuples and Tuples returned by System, with regions tp, fp, fn, tn]
• Precision: proportion of selected items that are correct
  $\mathrm{Precision} = \frac{tp}{tp + fp}$
• Recall: proportion of target items that were selected
  $\mathrm{Recall} = \frac{tp}{tp + fn}$
• Precision-Recall curve: shows tradeoff
[Figure: Precision vs. Recall curve, AuC]
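The precision and recall definitions translate directly into set operations on tuples. A minimal sketch follows; the tuple values are invented toy data, not results from the talk.

```python
def precision_recall(returned, correct):
    """Precision = tp / (tp + fp), Recall = tp / (tp + fn),
    computed over sets of extracted tuples."""
    returned, correct = set(returned), set(correct)
    tp = len(returned & correct)   # returned and correct
    fp = len(returned - correct)   # returned but wrong
    fn = len(correct - returned)   # correct but missed
    precision = tp / (tp + fp) if returned else 0.0
    recall = tp / (tp + fn) if correct else 0.0
    return precision, recall


# Toy data: the system returns 4 tuples, 3 of them among the 5 correct ones.
correct_tuples = {("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)}
system_tuples = {("a", 1), ("b", 2), ("c", 3), ("x", 9)}

p, r = precision_recall(system_tuples, correct_tuples)
print(p, r)
```

Returning fewer, more confident tuples tends to raise precision and lower recall, which is the tradeoff the Precision-Recall curve plots.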

Long Tail: Sparse Classes
Too Little Training Data
• 82% < 100 instances; 40% < 10 instances

Long-Tail 2: Incomplete Articles
• Desired Information Missing from Wikipedia
• 800,000/1,800,000 (44.2%) stub pages [Wikipedia July 2007]
[Figure: article Length vs. ID]

Shrinkage?
[Figure: class hierarchy with instance counts: person (1201) > performer (44) > actor (8738), comedian (106). Attribute names shown: .birth_place, .location, .birthplace, .cityofbirth, .origin]

KOG: Kylin Ontology Generator [Wu & Weld, WWW08]

How Can We Get a Taxonomy for Wikipedia?
• Do We Need to?
• What about Category Tags?
• Conjunctions

Schema Mapping
  Person         Performer
  birth_date     birthdate
  birth_place    location
  name           name
  other_names    othername
  …              …
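One ingredient of schema mapping is plain string similarity between attribute names. The sketch below uses Python's standard `difflib.SequenceMatcher`; the normalization step and the 0.8 threshold are assumptions chosen for this illustration, not values from the talk.

```python
from difflib import SequenceMatcher

PERSON = ["birth_date", "birth_place", "name", "other_names"]
PERFORMER = ["birthdate", "location", "name", "othername"]


def normalize(attr):
    """Lowercase and drop underscores so birth_date matches birthdate."""
    return attr.replace("_", "").lower()


def map_schema(source, target, threshold=0.8):
    """Map each source attribute to its most similar target attribute,
    or to None when nothing reaches the (assumed) threshold."""
    mapping = {}
    for src in source:
        scored = [(SequenceMatcher(None, normalize(src), normalize(t)).ratio(), t)
                  for t in target]
        best_score, best = max(scored)
        mapping[src] = best if best_score >= threshold else None
    return mapping


mapping = map_schema(PERSON, PERFORMER)
print(mapping)
```

String similarity pairs birth_date with birthdate and other_names with othername, but it cannot find the birth_place / location pair from the table, so that mapping has to come from other evidence.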

[Figure: KOG Architecture]

Subsumption Detection
• Binary Classification Problem
• Nine Complex Features, e.g.,
  • String Features
  • IR Measures
  • Mapping to Wordnet
  • Hearst Pattern Matches
  • Class Transitions in Revision History
• Learning Algorithm
  • SVM & MLN Joint Inference
[Figure: Person > Scientist > Physicist, with the label "6/07: Einstein"]

Schema Mapping
KOG: Kylin Ontology Generator [Wu & Weld, WWW08]
  Person         Performer
  birth_date     birthdate
  birth_place    location
  name           name
  other_names    othername
  …              …
• Heuristics
  • Edit History
  • String Similarity
• Experiments
  • Precision: 94%, Recall: 87%
• Future
  • Integrated Joint Inference

Improving Recall on Sparse Classes [Wu et al. KDD-08]
• Shrinkage
  • Extra Training Examples from Related Classes
• How Weight New Examples?
[Figure: person (1201) > performer (44) > actor (8738), comedian (106)]
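Shrinkage borrows training examples from related classes, and the slide leaves the weighting as an open question. The sketch below shows one possible answer, purely as an assumption: weight each borrowed example by `decay ** distance`, where distance is the number of edges between the two classes in the hierarchy. The assignment of attribute names to classes and the example sentences are likewise hypothetical.

```python
# Class hierarchy from the slides: person > performer > actor, comedian
PARENT = {"performer": "person", "actor": "performer", "comedian": "performer"}


def tree_distance(a, b):
    """Number of edges between two classes via their closest common ancestor."""
    def path_to_root(cls):
        path = [cls]
        while cls in PARENT:
            cls = PARENT[cls]
            path.append(cls)
        return path
    pa, pb = path_to_root(a), path_to_root(b)
    common = next(c for c in pa if c in pb)
    return pa.index(common) + pb.index(common)


def shrinkage_training_set(target, examples, canonical, decay=0.5):
    """Collect (sentence, weight) pairs for the target (class, attribute),
    borrowing from every class whose attribute maps to the same canonical
    attribute. The decay weighting is an assumption of this sketch."""
    target_cls, _ = target
    wanted = canonical[target]
    weighted = []
    for (cls, attr), sentences in examples.items():
        if canonical.get((cls, attr)) != wanted:
            continue
        weight = decay ** tree_distance(target_cls, cls)
        weighted.extend((s, weight) for s in sentences)
    return weighted


# Hypothetical schema mapping and placeholder sentences.
canonical = {
    ("person", "birth_place"): "birth_place",
    ("performer", "location"): "birth_place",
    ("actor", "birthplace"): "birth_place",
    ("comedian", "origin"): "birth_place",
    ("person", "name"): "name",
}
examples = {
    ("performer", "location"): ["s1"],
    ("person", "birth_place"): ["s2"],
    ("actor", "birthplace"): ["s3"],
    ("comedian", "origin"): ["s4"],
    ("person", "name"): ["s5"],
}

for_performer = dict(shrinkage_training_set(("performer", "location"), examples, canonical))
for_actor = dict(shrinkage_training_set(("actor", "birthplace"), examples, canonical))
print(for_performer)
print(for_actor)
```

With this scheme a sparse class such as performer (44 instances) gains examples from person and actor, while examples from more distant classes count for less.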

[Figure: Recall after Shrinkage / Retraining…]

Improving Recall on Sparse Classes [Wu et al. KDD-08]
Retraining
• Compare Kylin Extractions with Tuples from TextRunner
  • Additional Positive Examples
  • Eliminate False Negatives
TextRunner [Banko et al. IJCAI-07, ACL-08]
• Relation-Independent Extraction
• Exploits Grammatical Structure
• CRF Extractor with POS Tag Features

Bootstrapping to the Web
• Extractor Quality Irrelevant If no information to extract…
  • 44% of Wikipedia Pages = "stub"
• Instead, … Extract from Broader Web
• Challenges
  • How maintain high precision?
  • Many Web pages noisy, describe multiple objects

Extracting from the Broader Web [Wu et al. KDD-08]
1) Send Query to Google
   Object Name + Attribute Synonym
2) Find Best Region on the Page
   Heuristics > Dependency Parse
3) Apply Extractor
4) Vote if Multiple Extractions
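Steps 1 and 4 of the web-extraction procedure can be sketched without any search or parsing machinery. The query format, the confidence-weighted vote and the candidate values below are all assumptions made for this illustration; the slides say only "Object Name + Attribute Synonym" and "Vote if Multiple Extractions".

```python
from collections import defaultdict


def build_queries(object_name, attribute_synonyms):
    """Step 1: one query string per attribute synonym (format is assumed)."""
    return [f'"{object_name}" {synonym}' for synonym in attribute_synonyms]


def vote(candidates):
    """Step 4: sum extractor confidences per normalized value and return
    the value with the highest total, or None if nothing was extracted."""
    scores = defaultdict(float)
    for value, confidence in candidates:
        scores[value.strip().lower()] += confidence
    return max(scores, key=scores.get) if scores else None


queries = build_queries("Clearfield County", ["county seat", "seat"])

# Hypothetical extractions from three different web pages.
candidates = [("Clearfield", 0.9), ("clearfield ", 0.6), ("Bellefonte", 0.8)]
winner = vote(candidates)
print(queries)
print(winner)
```

Voting across pages is one way to keep precision up when individual web pages are noisy or describe several objects.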

[Figure: Bootstrapping to the Web]

Problem
• Information Extraction is Still Imprecise
  • Do Wikipedians Want 90% Precision?
• How Improve Precision?
  • People!

Contributing as a Non-Primary Task [Hoffmann CHI-09]
• Encourage contributions
• Without annoying or abusing readers

Designed Three Interfaces
• Popup (immediate interruption strategy)
• Highlight (negotiated interruption strategy)
• Icon (negotiated interruption strategy)

[Figure: Popup Interface]

[Figures: Highlight Interface, shown on hover]
[Figures: Icon Interface]
