Knowledge Expansion over Probabilistic Knowledge Bases
Yang Chen, Daisy Zhe Wang
{yang,daisyw}@cise.ufl.edu
Computer and Information Science and Engineering, University of Florida
Data Science Research
SIGMOD, Snowbird, UT, Jun 25, 2014
Outline
1. Introduction
   Knowledge Bases
   Knowledge Expansion
2. The ProbKB System
   Probabilistic Knowledge Bases
   ProbKB Architecture
   Grounding
   Quality Control
3. Conclusion
Knowledge Bases
A knowledge base is a collection of entities, facts, and relationships that conforms to a certain data model.
It allows machines to interpret human information in a principled manner.
But knowledge bases are often incomplete.
Figure: Google Knowledge Graph
Knowledge Base Construction Review
1. Human collaboration: DBpedia, Freebase, Google Knowledge Graph, YAGO.
2. Automatic construction: DeepDive, Knowledge Vault, NELL, OpenIE, Probase, YAGO.
3. Knowledge integration: Knowledge Fusion, Knowledge Vault, PIDGIN.
Inferring Implicit Information
Kale is rich in Calcium ∧ Calcium helps prevent Osteoporosis → Kale helps prevent Osteoporosis.
Inferring Implicit Information
IsHeadquarteredIn(Company, State) :- IsBasedIn(Company, City) ∧ IsLocatedIn(City, State);
Contains(Food, Chemical) :- IsMadeFrom(Food, Ingredient) ∧ Contains(Ingredient, Chemical);
Reduce(Medication, Factor) :- KnownGenericallyAs(Medication, Drug) ∧ Reduce(Drug, Factor);
ReturnTo(Writer, Place) :- BornIn(Writer, City) ∧ CapitalOf(City, Place);
Make(Company1, Device) :- Buy(Company1, Company2) ∧ Make(Company2, Device);
Figure: Sherlock Horn clause learner.
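To make the rule format concrete, here is a minimal Python sketch of applying Horn clauses of the shape head(x, z) :- body1(x, y) ∧ body2(y, z) to a set of facts until nothing new can be derived. The rule names come from the slide; the entity names (Acme, KaleSalad, and so on), the `Fact` and `Rule` types, and the function names are hypothetical, and the sketch ignores the argument types (Company, City, ...) and any probabilities. It is an illustration of the idea, not the ProbKB algorithm.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    relation: str
    subject: str
    obj: str


class Rule(NamedTuple):
    """Encodes head(x, z) :- body1(x, y) AND body2(y, z)."""
    head: str
    body1: str
    body2: str


def apply_rule(rule, facts):
    """Return the facts derivable by one application of `rule` that are not in `facts`."""
    derived = set()
    for f1 in facts:
        if f1.relation != rule.body1:
            continue
        for f2 in facts:
            # Join on the shared variable y: object of body1 = subject of body2.
            if f2.relation == rule.body2 and f2.subject == f1.obj:
                derived.add(Fact(rule.head, f1.subject, f2.obj))
    return derived - facts


def expand(rules, facts):
    """Apply every rule repeatedly until no new fact is derived (a fixed point)."""
    known = set(facts)
    while True:
        new = set()
        for rule in rules:
            new |= apply_rule(rule, known)
        if not new:
            return known
        known |= new


# Two of the rules shown on the slide, with types dropped.
rules = [
    Rule("IsHeadquarteredIn", "IsBasedIn", "IsLocatedIn"),
    Rule("Contains", "IsMadeFrom", "Contains"),
]

# Hypothetical facts, made up for this example.
facts = {
    Fact("IsBasedIn", "Acme", "Gainesville"),
    Fact("IsLocatedIn", "Gainesville", "Florida"),
    Fact("IsMadeFrom", "KaleSalad", "Kale"),
    Fact("IsMadeFrom", "KaleWrap", "KaleSalad"),
    Fact("Contains", "Kale", "Calcium"),
}

expanded = expand(rules, facts)
for fact in sorted(expanded - facts):
    print(fact)
```

Because the Contains rule is recursive, the fact about KaleWrap only becomes derivable after the fact about KaleSalad has been added, which is why the loop runs to a fixed point rather than making a single pass.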
Contributions
Knowledge Expansion Problem: inferring implicit knowledge in KBs.
Efficiency.
  - We use DBMSes to model knowledge bases;
  - We design a SQL-based algorithm to apply inference rules in batches;
  - We use MPP databases to parallelize the inference process.
Quality.
  - We identify major error sources and combine state-of-the-art methods to detect and recover from errors;
  - We use semantic constraints to identify errors and ambiguities;
  - We clean the rule set based on the rules' statistical properties.
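One way to picture "applying inference rules in batches" with SQL is to store the rules themselves as rows, so that a single join applies every rule of the same shape at once instead of issuing one query per rule. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a DBMS. The table names, the schema, the single rule shape head(x, z) :- body1(x, y) ∧ body2(y, z), and the example entities are all assumptions made for illustration; this is not ProbKB's actual schema, and it does not cover probabilities or MPP parallelism.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE facts (rel TEXT, subj TEXT, obj TEXT, UNIQUE (rel, subj, obj));
-- Each row encodes one rule: head(x, z) :- body1(x, y) AND body2(y, z)
CREATE TABLE rules (head TEXT, body1 TEXT, body2 TEXT);
""")

# Rules taken from the slide, with argument types dropped.
conn.executemany("INSERT INTO rules VALUES (?, ?, ?)", [
    ("IsHeadquarteredIn", "IsBasedIn", "IsLocatedIn"),
    ("Contains", "IsMadeFrom", "Contains"),
    ("Make", "Buy", "Make"),
])

# Hypothetical facts, made up for this example.
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("IsBasedIn", "Acme", "Gainesville"),
    ("IsLocatedIn", "Gainesville", "Florida"),
    ("Buy", "Acme", "Widgets Inc"),
    ("Make", "Widgets Inc", "Phone"),
])

# One statement applies all stored rules: the rules table is joined with
# two copies of the facts table, and duplicates are skipped via the UNIQUE constraint.
BATCH_APPLY = """
INSERT OR IGNORE INTO facts (rel, subj, obj)
SELECT r.head, f1.subj, f2.obj
FROM rules AS r
JOIN facts AS f1 ON f1.rel = r.body1
JOIN facts AS f2 ON f2.rel = r.body2 AND f2.subj = f1.obj
"""


def count_facts(connection):
    return connection.execute("SELECT COUNT(*) FROM facts").fetchone()[0]


def expand_in_batches(connection):
    """Run the batch query until the facts table stops growing; return the number of rounds."""
    rounds = 0
    while True:
        before = count_facts(connection)
        connection.execute(BATCH_APPLY)
        rounds += 1
        if count_facts(connection) == before:
            return rounds


rounds = expand_in_batches(conn)
for row in conn.execute("SELECT rel, subj, obj FROM facts ORDER BY rel, subj, obj"):
    print(row)
```

The point of the design is that the number of SQL statements per round depends on the number of rule shapes, not on the number of rules, so adding more rules of the same shape only adds rows to the `rules` table.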