
YAGO - A Core of Semantic Knowledge



  1. YAGO - A Core of Semantic Knowledge (Yet Another Great Ontology)
     Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum (Max-Planck Institute for Computer Science, Saarbrücken, Germany)
     Presented by Tandra Chakraborty, Std# 20546668
     CS 743, Wednesday, October 15, 2014

  2. Overview
     • Motivation for Ontology
     • The YAGO Model
       - Structure
       - Semantics
       - Source
     • Conclusion

  3. Motivation for Ontology
     What else comes to mind?
     • Nobel Prize
     • Germany

  4-7. [Image-only slides; no text content]

  8. Google is searching for web pages, not for knowledge

  9. Solution: An ontology
     An ontology is a formal framework for representing knowledge [Wikipedia].
     [Diagram labels: CLASSES, RELATIONS, INDIVIDUALS; PERSON, OBJECT, SCIENTIST, NOBELPRIZE WINNER, subclassOf, HASWONPRIZE, NOBELPRIZE, 1921]

  10. Applications of Ontology
      • Machine translation
      • Word sense disambiguation
      • Document classification
      • Question answering
      • Entity and fact-oriented Web search
      • Data cleaning
      • Record linkage

  11. Idea behind YAGO
      Manual categorization:
      • WordNet, SUMO, GeneOntology
      • Problem: usually low coverage
      Extraction from the Web:
      • KnowItAll, Espresso, Snowball, LEILA
      • Problem: usually low accuracy (many false positives)

  12. What's new in YAGO?
      YAGO approach:
      • Combine Wikipedia and WordNet
      • Arrange concepts in a taxonomy (WordNet) (=> good coverage)
      • Use the category system of Wikipedia (=> good accuracy)
      • The data model is an extension of RDF (Resource Description Framework)
      • Built-in support for transitive relations (not available in RDF)
      • Simple and decidable

  13. Structure
      • Represents knowledge as
        - Entities
        - Relations
        - Facts
      Example:
        AlbertEinstein HASWONPRIZE NobelPrize
        AlbertEinstein BORNINYEAR 1879

  14. Structure
      • All objects are entities (cities, people, URLs)
      • Numbers, dates, and strings are entities
      • Words are also entities
      • Similar entities are grouped into classes
          AlbertEinstein TYPE physicist
      • Classes are also entities
          physicist SUBCLASSOF scientist
      • Relations are entities as well
          subclassOf TYPE transitiveRelation
      A triple of an entity, a relation, and an entity is called a fact.

  15. Structure
      • Entities are the arguments of a fact
      • Each fact is given a fact identifier
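The entity/relation/fact model from the Structure slides can be sketched in a few lines of Python. This is only an illustrative sketch: the `Fact` and `FactStore` names, the `#n` identifier format, and the witness URL are assumptions made for the example and are not taken from YAGO itself.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Fact:
    """A triple of an entity, a relation, and an entity."""
    arg1: str
    relation: str
    arg2: str


class FactStore:
    """Hypothetical in-memory store that gives every fact an identifier."""

    def __init__(self):
        self._ids = count(1)
        self.facts = {}  # fact identifier -> Fact

    def add(self, arg1, relation, arg2):
        fact_id = f"#{next(self._ids)}"  # assumed identifier format
        self.facts[fact_id] = Fact(arg1, relation, arg2)
        return fact_id


store = FactStore()
f1 = store.add("AlbertEinstein", "HASWONPRIZE", "NobelPrize")
f2 = store.add("AlbertEinstein", "BORNINYEAR", "1879")
f3 = store.add("AlbertEinstein", "TYPE", "physicist")
f4 = store.add("physicist", "SUBCLASSOF", "scientist")

# A fact identifier can itself be the argument of another fact,
# for example to record where the fact was found (hypothetical URL).
f5 = store.add(f1, "FOUNDIN", "http://example.org/einstein-page")
```

Because facts carry identifiers, statements about facts (such as where a fact was found) fit into the same triple shape as ordinary facts.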

  16. Structure
      A structure for a YAGO ontology $y$ is a triple of
      • a set $U$ (the universe)
      • a function $D: I \cup C \cup R \to U$ (the denotation)
      • a function $\varepsilon: D(R) \to U \times U$ (the extension function)

  17. Transitivity
      Axioms:
        (x, is_a, y)
        (y, subclass, z)
        => (x, is_a, z)
      [Diagram: Physicist is a subclass of Scientist; whatever "is a" Physicist therefore also "is a" Scientist]
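The axiom on this slide can be applied mechanically until no new facts appear. The sketch below does that with plain Python sets; the function name `derive_is_a` and the sample facts are assumptions for illustration, and the loop implements only this single rule, not YAGO's full set of axioms.

```python
def derive_is_a(facts):
    """Apply (x, is_a, y), (y, subclass, z) => (x, is_a, z) until a fixpoint."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        is_a = [(x, y) for (x, r, y) in derived if r == "is_a"]
        subclass = [(y, z) for (y, r, z) in derived if r == "subclass"]
        for x, y in is_a:
            for y2, z in subclass:
                if y == y2 and (x, "is_a", z) not in derived:
                    derived.add((x, "is_a", z))
                    changed = True
    return derived


# Sample facts, chosen for illustration only.
facts = {
    ("AlbertEinstein", "is_a", "Physicist"),
    ("Physicist", "subclass", "Scientist"),
    ("Scientist", "subclass", "Person"),
}
closure = derive_is_a(facts)
```

The rule has to be repeated because a derived fact (Einstein is a Scientist) can enable a further derivation (Einstein is a Person).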

  18. Enriching YAGO
      To add a fact (x, r, y) to the existing ontology:
      • Map x and y to existing entities in the YAGO ontology
      • Add them as new entities if they do not exist
      • Map r to a relation in the YAGO ontology
      • If (x, r, y) already exists in the ontology, add a new witness for the fact: (f, FOUNDIN, w)
      • Calculate the confidence of the fact
      • If (x, r, y) does not exist, add the fact with a new fact identifier
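The enrichment steps listed on this slide can be sketched as follows. Several simplifications are assumptions of this sketch: entities are matched by exact string equality, the `Ontology` class and its method names are hypothetical, a witness is also recorded when a fact is first added, and the confidence calculation is left out because the slide does not say how it is computed.

```python
class Ontology:
    """Hypothetical, minimal model of the enrichment procedure."""

    def __init__(self):
        self.entities = set()
        self.relations = {"HASWONPRIZE", "BORNINYEAR", "FOUNDIN"}
        self.facts = {}   # fact identifier -> (x, r, y)
        self.index = {}   # (x, r, y) -> fact identifier
        self._next_id = 1

    def _new_fact(self, x, r, y):
        fact_id = f"#{self._next_id}"  # assumed identifier format
        self._next_id += 1
        self.facts[fact_id] = (x, r, y)
        self.index[(x, r, y)] = fact_id
        return fact_id

    def add(self, x, r, y, witness):
        # r must map to a known relation.
        if r not in self.relations:
            raise ValueError(f"unknown relation: {r}")
        # Existing entities are reused; unknown ones are added.
        self.entities.update((x, y))
        # Reuse the identifier if the fact exists, otherwise create a new one.
        fact_id = self.index.get((x, r, y))
        if fact_id is None:
            fact_id = self._new_fact(x, r, y)
        # Record the witness as the fact (f, FOUNDIN, w).
        if (fact_id, "FOUNDIN", witness) not in self.index:
            self._new_fact(fact_id, "FOUNDIN", witness)
        return fact_id

    def witnesses(self, fact_id):
        return [w for (f, r, w) in self.facts.values()
                if r == "FOUNDIN" and f == fact_id]


onto = Ontology()
f = onto.add("AlbertEinstein", "HASWONPRIZE", "NobelPrize", "page-A")
g = onto.add("AlbertEinstein", "HASWONPRIZE", "NobelPrize", "page-B")
```

Adding the same triple twice yields the same fact identifier with two witnesses, which is the information a confidence estimate could later be based on.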

  19. Knowledge Bases for YAGO
      • WordNet
        - Semantic lexicon for the English language
      • Wikipedia
        - Multilingual, web-based encyclopedia
      • "Categories" are taken from Wikipedia
      • The "class hierarchy" is taken from WordNet
      Example: According to Wikipedia, Zidane is in the super-category "Football in France", but Zidane is a football player, not a football.
      WordNet, in contrast, provides a clean and carefully assembled hierarchy of thousands of concepts.

  20. Knowledge Extraction
      • Albert Einstein is in the category "Naturalized citizens of the United States".
      • Albert Einstein is also in the category "Articles with unsourced statements", in categories that carry relational information (like "1879 births"), and in categories that indicate thematic vicinity ("Physicist").
      How do we identify a conceptual category? (We need it for defining the TYPE relation.)
      • Use a noun group parser that splits a category name into premodifier, head, and postmodifier:
          "Naturalized citizens of the United States" => premodifier "Naturalized", head "citizens", postmodifier "of the United States"
      • Heuristic: if the head is a plural word, the category is conceptual.
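A very rough approximation of the plural-head heuristic is sketched below. It is not the noun group parser used by YAGO: the preposition list, the suffix-based plural test, and all function names are assumptions made for this example, and the suffix test misjudges singular words that end in "s" (such as "Physics").

```python
# Assumed, incomplete list of prepositions that start a postmodifier.
PREPOSITIONS = {"of", "in", "with", "from", "by", "for", "at", "on"}


def split_noun_group(category):
    """Split a category name into (premodifier, head, postmodifier)."""
    words = category.split()
    # The postmodifier is taken to start at the first preposition, if any.
    cut = next(
        (i for i, w in enumerate(words) if w.lower() in PREPOSITIONS),
        len(words),
    )
    pre_and_head = words[:cut]
    head = pre_and_head[-1] if pre_and_head else ""
    premodifier = " ".join(pre_and_head[:-1])
    postmodifier = " ".join(words[cut:])
    return premodifier, head, postmodifier


def looks_plural(word):
    """Crude plural test based only on the word ending."""
    w = word.lower()
    return len(w) > 2 and w.endswith("s") and not w.endswith("ss")


def is_conceptual(category):
    """Heuristic: a category is conceptual if its head is a plural word."""
    _, head, _ = split_noun_group(category)
    return looks_plural(head)


parts = split_noun_group("Naturalized citizens of the United States")
einstein_category = is_conceptual("Naturalized citizens of the United States")
zidane_category = is_conceptual("Football in France")
```

With these simplifications, "Naturalized citizens of the United States" is treated as conceptual (head "citizens"), while "Football in France" is not (head "Football").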

  21. Knowledge Extraction
      • The subclassOf relation is reflected by the thematic categories of Wikipedia
      • Wikipedia alone does not give full accuracy
      • So, for the taxonomy, only the leaf categories are taken from Wikipedia
      • WordNet is used on those leaf categories to establish the hierarchy of classes
      • Each synset (a set of words that share one sense) becomes a class in YAGO
      • The MEANS relation is extracted from the redirect pages of Wikipedia
          "Einstein, Albert" MEANS Albert Einstein

  22. Knowledge Extraction
      Other relations extracted from categories:
      • BornInYear
      • DiedInYear
      • EstablishedIn
      • LocatedIn
      • WrittenInYear
      • PoliticianOf
      • HasWonPrize

  23. Knowledge Extraction
      Meta relations
      • Describes (an individual and the URL of the corresponding Wikipedia page)
      • Witness (the page from which the knowledge was extracted)
      • ExtractedBy (the extraction technique)
      • FoundIn (relation between a fact and a URL)
      • Context (e.g. Albert Einstein, Relativity Theory)

  24. The YAGO ontology: Accuracy

      Relation        Accuracy
      subclass        97.70% +/- 1.59%
      is a            94.54% +/- 2.36%
      familyName      97.81% +/- 1.75%
      givenName       97.62% +/- 2.08%
      establishedIn   90.84% +/- 4.28%
      bornInYear      93.14% +/- 3.71%
      diedInYear      98.70% +/- 1.30%
      locatedIn       98.41% +/- 1.52%
      politicianOf    92.43% +/- 3.93%
      writtenInYear   94.35% +/- 3.33%
      hasWonPrize     98.47% +/- 1.53%

      Ref: http://suchanek.name/work/publications/www2007.ppt

  25. YAGO Storage
      • When the paper was written, YAGO had 1 million entities and 5 million facts
      • In the next iteration, YAGO-2 has 2 million entities and 20 million facts
      • Simple text files are used as the internal format
      • Only unique facts that are not derivable from other facts are stored [canonicalization]
      [Screenshots: a folder of relations; files that list pairs of entities]

  26. Size of YAGO [image-only slide]

  27. The YAGO ontology: Number of Facts
      Ontologies should not be judged purely by the number of facts! This is just an informational overview.
      [Bar chart: number of facts for Yago, KnowItAll, SUMO, WordNet, OpenCyc, and Cyc; value labels 30,000; 60,000; 200,000; 300,000; 2,000,000; 6,000,000]
      Ref: http://suchanek.name/work/publications/www2007.ppt

  28. Compatibility
      • YAGO is available as a simple XML version of the text files
      • YAGO can be loaded into Oracle, Postgres, or MySQL
      • YAGO can be converted to a database table
      • The table has the simple schema FACTS(factId, arg1, relation, arg2, confidence)
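To make the FACTS schema concrete, the sketch below creates such a table and runs a simple lookup. It uses SQLite through Python's standard `sqlite3` module purely for convenience, which is a substitution for the Oracle, Postgres, or MySQL targets named on the slide. The column types, the sample rows, the confidence values, and the helper `objects_of` are assumptions for illustration, not real YAGO data.

```python
import sqlite3

# In-memory database, used here only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE facts (
        factId     TEXT PRIMARY KEY,
        arg1       TEXT NOT NULL,
        relation   TEXT NOT NULL,
        arg2       TEXT NOT NULL,
        confidence REAL
    )
    """
)

# Placeholder rows; identifiers and confidence values are made up.
rows = [
    ("#1", "AlbertEinstein", "HASWONPRIZE", "NobelPrize", 0.98),
    ("#2", "AlbertEinstein", "BORNINYEAR", "1879", 0.93),
    ("#3", "AlbertEinstein", "TYPE", "physicist", 0.95),
]
conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?, ?)", rows)


def objects_of(conn, subject, relation):
    """Return every arg2 for which the fact (subject, relation, arg2) is stored."""
    cur = conn.execute(
        "SELECT arg2 FROM facts WHERE arg1 = ? AND relation = ? ORDER BY factId",
        (subject, relation),
    )
    return [row[0] for row in cur]


birth_years = objects_of(conn, "AlbertEinstein", "BORNINYEAR")
```

Keeping all facts in one table with `arg1`, `relation`, and `arg2` columns means a query over any relation has the same shape, at the cost of storing every argument as text.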
