Lecture 24: Relation Extraction
Kai-Wei Chang
CS @ University of Virginia
kw@kwchang.net
Course webpage: http://kwchang.net/teaching/NLP16
CS6501-NLP
Goal
- Acquire structured knowledge from text
Information extraction
- Entity recognition
  - Identify named entities: people, organizations, locations, times, dates, etc.
  - or genes, proteins, diseases, etc.
- Relation extraction
  - Located in, employed by, married to
Example
Why relation extraction?
- Create structured knowledge bases
- Augment structured knowledge bases
- Support question answering
- The first step for event extraction and storyline extraction
- …
Relation types (closed domain)
- 17 relations from Automated Content Extraction (ACE)
Credit: Dan Jurafsky
Relation types (closed domain)
- UMLS: Unified Medical Language System
- 134 entity types, 54 relations
Relation types (open domain)
- Freebase: thousands of relations, millions of entities
Wikipedia Infobox
|undergrad = 15,669<ref name=facts/>
|postgrad = 6,316<ref name=facts/>
|city = [[Charlottesville, Virginia|Charlottesville]]
|state = [[Virginia]]
|country = U.S.
|campus = [[Charlottesville, Virginia metropolitan area|Small city]]<br />{{convert|1682|acre|km2}}<br />[[World Heritage Site]]
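Infobox markup like this is already close to structured knowledge: each `|field = value` line can be read as an (entity, attribute, value) triple. The following is a minimal sketch of that reading, not a full wikitext parser. The subject name, the function names, and the two cleanup regexes are assumptions made for illustration; the slide does not show the page title, and real infoboxes contain templates and nested markup that this sketch does not handle.

```python
import re

# Assumed subject: the slide shows only the infobox fields, not the page title.
SUBJECT = "University of Virginia"

INFOBOX = """|undergrad = 15,669<ref name=facts/>
|postgrad = 6,316<ref name=facts/>
|city = [[Charlottesville, Virginia|Charlottesville]]
|state = [[Virginia]]
|country = U.S."""


def clean_value(raw):
    """Drop self-closing <ref .../> tags and reduce [[target|label]] links to the label."""
    value = re.sub(r"<ref[^>]*/>", "", raw)
    value = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", value)
    return value.strip()


def infobox_triples(subject, wikitext):
    """Turn each '|field = value' line into a (subject, field, value) triple."""
    triples = []
    for line in wikitext.splitlines():
        match = re.match(r"\|\s*(\w+)\s*=\s*(.*)", line)
        if match:
            triples.append((subject, match.group(1), clean_value(match.group(2))))
    return triples


if __name__ == "__main__":
    for triple in infobox_triples(SUBJECT, INFOBOX):
        print(triple)
```

Triples harvested this way are one possible source of the seed facts that bootstrapping and distant supervision rely on.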
How to build relation extractors (closed domain)
- Hand-written patterns
- Supervised machine learning
  - Take each sentence as input
  - Identify named entities (mentions)
  - Perform multi-class classification
  - + constraints or features to model correlations
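The hand-written pattern approach can be sketched with a few regular expressions. This is an illustrative toy, not a pattern set from the lecture: the three surface patterns, the relation labels, and the crude "capitalized word sequence" stand-in for a named entity are all assumptions. A real system would run a named entity recognizer first and match patterns over its output.

```python
import re

# Assumption: an "entity" is a run of capitalized words (hyphens allowed).
# This is a stand-in for real named entity recognition.
ENTITY = r"[A-Z][\w\-]*(?: [A-Z][\w\-]*)*"

# Hypothetical hand-written patterns, one per relation type.
PATTERNS = [
    ("located_in", re.compile(rf"({ENTITY}) is located in ({ENTITY})")),
    ("employed_by", re.compile(rf"({ENTITY}) works for ({ENTITY})")),
    ("married_to", re.compile(rf"({ENTITY}) is married to ({ENTITY})")),
]


def extract_relations(sentence):
    """Return (arg1, relation, arg2) triples for every pattern match in the sentence."""
    found = []
    for relation, pattern in PATTERNS:
        for match in pattern.finditer(sentence):
            found.append((match.group(1), relation, match.group(2)))
    return found


if __name__ == "__main__":
    print(extract_relations("Charlottesville is located in Virginia."))
    print(extract_relations("Alice Smith works for Acme Corp."))
```

Patterns like these tend to be precise but brittle: a paraphrase such as "Alice Smith, an engineer at Acme Corp" matches none of them, which is one motivation for the learned approaches.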
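For the supervised route, each pair of entity mentions in a sentence becomes one classification example. The sketch below covers only candidate generation and feature extraction, the part that feeds a multi-class classifier. The mention format `(start, end, entity_type)` and the feature set are assumptions chosen for illustration, and mentions are assumed not to overlap.

```python
from itertools import permutations


def candidate_features(tokens, mentions):
    """Build one feature dict per ordered pair of mentions.

    tokens:   list of words in the sentence
    mentions: list of (start, end, entity_type) spans, end exclusive,
              assumed non-overlapping
    """
    examples = []
    for arg1, arg2 in permutations(mentions, 2):
        # Order the two spans by position to find the words between them.
        first, second = sorted((arg1, arg2))
        between = tokens[first[1]:second[0]]
        features = {
            "type_pair": f"{arg1[2]}-{arg2[2]}",
            "arg1_first": arg1[0] < arg2[0],
            "words_between": " ".join(between),
            "distance": len(between),
        }
        examples.append(((arg1, arg2), features))
    return examples


if __name__ == "__main__":
    tokens = ["Alice", "works", "for", "Acme", "in", "Boston"]
    mentions = [(0, 1, "PER"), (3, 4, "ORG"), (5, 6, "LOC")]
    for pair, features in candidate_features(tokens, mentions):
        print(pair, features)
```

Each feature dict would then be vectorized and passed to a classifier whose labels are the relation types plus a "no relation" class.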
How to build relation extractors (open domain)
- Bootstrap learning [Brin 98, …]
  - Use seed instances to extract a set of relational patterns
- Unsupervised learning
  - Cluster sentences based on relational patterns
- Distant supervision
  - Distant supervision for relation extraction without labeled data [Mintz 09+]
- Combine the above approaches
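The labeling step of distant supervision can be sketched as follows: any sentence that mentions both entities of a known knowledge base fact is treated as a (noisy) positive example of that fact's relation. The toy knowledge base, the sentences, and the plain substring matching are assumptions for illustration; a real pipeline would match entity mentions found by a named entity recognizer against a large knowledge base such as Freebase.

```python
# Toy knowledge base: (entity1, entity2) -> relation. Entries are illustrative.
KB = {
    ("University of Virginia", "Charlottesville"): "located_in",
    ("Alice Smith", "Acme Corp"): "employed_by",  # hypothetical fact
}

SENTENCES = [
    "The University of Virginia is in Charlottesville.",
    "Students in Charlottesville often visit the University of Virginia.",
    "Alice Smith joined Acme Corp last year.",
    "Alice Smith lives in Charlottesville.",
]


def distant_labels(kb, sentences):
    """Label a sentence with a KB relation whenever it mentions both entities.

    Substring matching stands in for real entity linking.
    """
    labeled = []
    for sentence in sentences:
        for (entity1, entity2), relation in kb.items():
            if entity1 in sentence and entity2 in sentence:
                labeled.append((sentence, entity1, entity2, relation))
    return labeled


if __name__ == "__main__":
    for example in distant_labels(KB, SENTENCES):
        print(example)
```

The second sentence is labeled `located_in` even though it does not clearly state that relation, which shows the label noise this heuristic introduces. The last sentence gets no label because its entity pair is not in the knowledge base.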
- A follow-up approach: Relation Extraction with Matrix Factorization and Universal Schemas [Riedel 13+]