Ontological Pathfinding: Mining First-Order Knowledge from Large Knowledge Bases
Yang Chen, Sean Goldberg, Daisy Zhe Wang, Soumitra Siddharth Johri
{yang,sean,daisyw}@cise.ufl.edu, soumitra.johri@ufl.edu
Computer and Information Science and Engineering, University of Florida
Data Science Research
SIGMOD’16, San Francisco, CA
Jun 29, 2016
Outline
1. Introduction
   Knowledge Bases
2. Ontological Pathfinding
   Partitioning
   Parallel Rule Mining
3. Experiments
   Overall Result
   Partitioning
Knowledge Bases
A knowledge base organizes human information in a structured format.

Predicate     Subject            Object
isLocatedIn   Washington, D.C.   United States
hasCapital    Canada             Ottawa
wasBornIn     Donald Knuth       Milwaukee, Wisconsin
isCitizenOf   Donald Knuth       United States
dealsWith     United States      Canada
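As a minimal sketch, the facts listed above could be held in memory as (predicate, subject, object) triples and grouped by predicate. The names `FACTS` and `index_by_predicate` are hypothetical and chosen only for illustration; the slides do not say how the authors store the knowledge base.

```python
from collections import defaultdict

# Facts from the slide, stored as (predicate, subject, object) triples.
FACTS = [
    ("isLocatedIn", "Washington, D.C.", "United States"),
    ("hasCapital", "Canada", "Ottawa"),
    ("wasBornIn", "Donald Knuth", "Milwaukee, Wisconsin"),
    ("isCitizenOf", "Donald Knuth", "United States"),
    ("dealsWith", "United States", "Canada"),
]


def index_by_predicate(facts):
    """Group (subject, object) pairs under their predicate (hypothetical helper)."""
    index = defaultdict(list)
    for predicate, subject, obj in facts:
        index[predicate].append((subject, obj))
    return index


kb = index_by_predicate(FACTS)
print(kb["hasCapital"])  # the (subject, object) pairs stored for hasCapital
```

Grouping by predicate is one plausible layout because rule bodies refer to facts by predicate name, so each body atom can be looked up directly.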
H(x, y)       b1(x, z)      b2(y, z)
dealsWith     isLocatedIn   isLocatedIn
dealsWith     imports       exports
isCitizenOf   wasBornIn     hasCapital
worksAt       wasBornIn     isLocatedIn
isLocatedIn   hasCapital    isLocatedIn

Figure: Knowledge base examples.
Knowledge Bases
ProbKB
First-Order Knowledge
Kale is rich in Calcium ∧ Calcium helps prevent Osteoporosis → Kale helps prevent Osteoporosis.
- Question answering;
- Data cleaning;
- Incremental knowledge construction.
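The kale rule above can be read as a join: two body facts that share a middle entity produce a new head fact. The sketch below applies one such rule to a toy fact list. The predicate names `isRichIn` and `helpsPrevent` and the function `apply_chain_rule` are assumptions made for this example; it illustrates rule application in general, not the authors' mining algorithm.

```python
def apply_chain_rule(facts, body1, body2, head):
    """Infer head(x, y) from body1(x, z) and body2(z, y) by joining on z."""
    # Index the second body predicate by its subject so the join is a lookup.
    by_subject = {}
    for pred, subj, obj in facts:
        if pred == body2:
            by_subject.setdefault(subj, []).append(obj)

    inferred = set()
    for pred, x, z in facts:
        if pred == body1:
            for y in by_subject.get(z, []):
                inferred.add((head, x, y))
    # Report only facts that are not already in the knowledge base.
    return inferred - set(facts)


facts = [
    ("isRichIn", "Kale", "Calcium"),
    ("helpsPrevent", "Calcium", "Osteoporosis"),
]
new_facts = apply_chain_rule(facts, "isRichIn", "helpsPrevent", "helpsPrevent")
print(new_facts)
```

On a large knowledge base the same join touches every pair of facts that share a middle entity, which is why join size becomes the main cost.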
State-of-the-Art
AMIE
  YAGO2: 834K entities, 1M facts; Runtime: 3.59 minutes.
AMIE+
  YAGO2S: 2.1M entities, 4.5M facts; Runtime: 1 hour.
Sherlock
  TextRunner: 250K facts; Runtime: 50 minutes.
Freebase: 112M entities, 388M facts.
Is it possible to mine first-order rules from Freebase?
Contributions
Goal: Mine first-order knowledge from web-scale knowledge bases.
Result: Design the Ontological Pathfinding algorithm to mine 36,625 inference rules from Freebase (388M facts) in 34 hours; publish the first Freebase rule set.
Contributions:
- Partition the KB into independent subsets to reduce join sizes. (Improves runtime from 2.55 days to 5.06 hours for a single task.)
- Design a parallel rule mining algorithm for each partition. (Achieves a 3-6x speedup.)
- Prune inefficient and erroneous candidate rules. (Makes joins possible.)
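To put the partitioning figure in the same units as the other numbers, the short calculation below converts 2.55 days to hours and derives the implied speedup for that single task. It uses only the two runtimes quoted above; the variable names are illustrative.

```python
HOURS_PER_DAY = 24

before_hours = 2.55 * HOURS_PER_DAY  # runtime without partitioning, in hours
after_hours = 5.06                   # runtime with partitioning, in hours
speedup = before_hours / after_hours

print(f"{before_hours:.1f} h -> {after_hours:.2f} h, about {speedup:.1f}x faster")
# prints "61.2 h -> 5.06 h, about 12.1x faster"
```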