
Statistical Relational Learning and Knowledge Graph Reasoning



  1. Statistical Relational Learning and Knowledge Graph Reasoning. CSCI 699, Jay Pujara

  2. Reminder: Basic problems • Who are the entities (nodes) in the graph? • What are their attributes and types (labels)? • How are they related (edges)? [Figure: graph of entities E1, E2, E3, each with attributes A1 and A2, connected by relations R1, R2, R3]

  3. Motivating Problem: New Opportunities • Internet: massive source of publicly available information • Extraction: cutting-edge IE methods • Knowledge Graph (KG): structured representation of entities, their labels and the relationships between them

  4. Motivating Problem: Real Challenges • Internet: Noisy! • Extraction: Difficult! • Knowledge Graph: Contains many errors and inconsistencies

  5. Graph Construction Issues. Extracted knowledge is: • ambiguous ◦ Ex: Beetles, beetles, Beatles ◦ Ex: citizenOf, livedIn, bornIn

  6. Graph Construction Issues. Extracted knowledge is: • ambiguous • incomplete ◦ Ex: missing relationships ◦ Ex: missing labels ◦ Ex: missing entities [Figure: graph with edges labeled author, author, coworker]

  7. Graph Construction Issues. Extracted knowledge is: • ambiguous • incomplete • inconsistent ◦ Ex: Cynthia Lennon, Yoko Ono ◦ Ex: exclusive labels (alive, dead) ◦ Ex: domain-range constraints [Figure: graph with two edges labeled spouse]

  8. Graph Construction Issues. Extracted knowledge is: • ambiguous • incomplete • inconsistent

  9. NELL: The Never-Ending Language Learner • Large-scale IE project (Carlson et al., 2010) • Lifelong learning: aims to “read the web” • Ontology of known labels and relations • Knowledge base contains millions of facts

  10. Examples of NELL errors

  11. Entity co-reference errors Kyrgyzstan has many variants: • Kyrgystan • Kyrgistan • Kyrghyzstan • Kyrgzstan • Kyrgyz Republic

  12. Missing and spurious labels Kyrgyzstan is labeled a bird and a country

  13. Missing and spurious relations Kyrgyzstan’s location is ambiguous – Kazakhstan, Russia and US are included in possible locations

  14. Violations of ontological knowledge • Equivalence of co-referent entities (sameAs) ◦ SameEntity(Kyrgyzstan, Kyrgyz Republic) • Mutual exclusion (disjointWith) of labels ◦ MUT(bird, country) • Selectional preferences (domain/range) of relations ◦ RNG(countryLocation, continent) • Enforcing these constraints requires jointly considering multiple extractions
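
The three constraint types on slide 14 can be checked against a set of extractions roughly as follows. This is a minimal sketch, not NELL's or the lecture's implementation: the toy `labels` and `relations` data, the `find_violations` helper, and the choice to treat any label mismatch between co-referent entities as a violation are all assumptions made for illustration.

```python
# Hypothetical extractions, loosely modeled on the Kyrgyzstan example.
labels = {
    "Kyrgyzstan": {"country", "bird"},
    "Kyrgyz Republic": {"country"},
    "Asia": {"continent"},
    "Russia": {"country"},
}
relations = [
    ("countryLocation", "Kyrgyzstan", "Asia"),
    ("countryLocation", "Kyrgyzstan", "Russia"),
]

# Ontological constraints named on the slide.
same_entity = [("Kyrgyzstan", "Kyrgyz Republic")]   # sameAs
mutually_exclusive = [("bird", "country")]          # MUT
range_of = {"countryLocation": "continent"}         # RNG


def find_violations(labels, relations):
    """Return a list of tuples describing each violated constraint."""
    violations = []
    # MUT: an entity may not carry two mutually exclusive labels.
    for entity, entity_labels in sorted(labels.items()):
        for a, b in mutually_exclusive:
            if a in entity_labels and b in entity_labels:
                violations.append(("MUT", entity, a, b))
    # RNG: the object of a relation must carry the relation's range label.
    for rel, _subj, obj in relations:
        required = range_of.get(rel)
        if required is not None and required not in labels.get(obj, set()):
            violations.append(("RNG", rel, obj, required))
    # SameEntity: co-referent entities are assumed to share all labels.
    for e1, e2 in same_entity:
        mismatched = labels.get(e1, set()) ^ labels.get(e2, set())
        for label in sorted(mismatched):
            violations.append(("SAME", e1, e2, label))
    return violations


violations = find_violations(labels, relations)
for v in violations:
    print(v)
```

Each check needs more than one extraction at a time (two labels, a relation plus a label, or two entities), which is the sense in which the constraints must be considered jointly.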

  15. Graph Construction approach • Graph construction cleans and completes extraction graph • Incorporate ontological constraints and relational patterns • Discover statistical relationships within knowledge graph

  16. Graph Construction Probabilistic Models. Topics: Overview, Graphical Models, Random Walk Methods

  17. Graph Construction Probabilistic Models. Topics: Overview, Graphical Models, Random Walk Methods

  18. Voter Party Classification ?

  19. Voter Party Classification. Multiple Sources of Information: Statuses & Tweets

  20. Voter Party Classification. Multiple Sources of Information: Statuses & Tweets, Donations

  21. Voter Party Classification. Multiple Sources of Information: Statuses & Tweets, Donations, Friends & Followers

  22. Voter Party Classification. Multiple Sources of Information: Statuses & Tweets, Donations, Friends & Followers, Family

  23. Voter Party Classification

  24. Voter Party Classification

  25. Voter Party Classification $ CarlyFiorinaforVicePresident.com

  26. Voter Party Classification $ CarlyFiorinaforVicePresident.com

  27. Voter Party Classification. Multiple Sources of Information: Statuses & Tweets, Donations, Friends & Followers, Family

  28. Standard Classification CarlyFiorinaforVicePresident.com Bag-of-words features

  29. Standard Classification CarlyFiorinaforVicePresident.com Bag-of-words features Pr( Y )

  30. Standard Classification CarlyFiorinaforVicePresident.com Bag-of-words features

  31. Voter Party Classification. Multiple Sources of Information: Status Updates, Donations, Friends, Family

  32. Collective Classification Follows

  33. Collective Classification Follows

  34. Collective Classification Follows Pr( Y )

  35. Collective Classification Follows My label is likely to match that of my follower

  36. Collective Classification Follows: Follows(U1, U2) & Votes(U1, P) -> Votes(U2, P)

  37. Collective Classification [Figure: voters connected by follower and spouse links]

  38. Collective Classification [Figure: voters connected by follower and spouse links]

  39. Collective Classification [Figure: voters connected by follower and spouse links] • Spouse(U1, U2) & Votes(U1, P) -> Votes(U2, P) • Follows(U1, U2) & Votes(U1, P) -> Votes(U2, P)

  40. Collective Classification

  41. Collective Classification Pr( Y )

  42. Collective Classification [Figure: voters connected by follower, spouse and follower links]

  43. Collective Classification [Figure: voters connected by follower, spouse and follower links] • 5.0: Spouse(U1, U2) & Votes(U1, P) -> Votes(U2, P) • 2.0: Follows(U1, U2) & Votes(U1, P) -> Votes(U2, P)

  44. Collective Classification [Figure: voters connected by follower, spouse and follower links]

  45. Collective Classification [Figure: larger network of voters, several with unknown party (?), connected by friend, spouse and colleague links]

  46. Collective Classification with PSL

          /* Local rules */
          5.0: Donates(A, P) -> Votes(A, P)
          0.3: Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
          0.3: Mentions(A, "Tax Cuts") -> Votes(A, "Republican")
          /* Relational rules */
          1.0: Votes(A,P) & Spouse(B,A) -> Votes(B,P)
          0.3: Votes(A,P) & Friend(B,A) -> Votes(B,P)
          0.1: Votes(A,P) & Colleague(B,A) -> Votes(B,P)
          /* Range constraint */
          Votes(A, "Republican") + Votes(A, "Democrat") = 1.0 .
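
To make the role of the rule weights on slide 46 concrete, the sketch below scores two grounded relational rules by a weighted sum of soft-logic penalties. It is an illustration under stated assumptions, not the PSL library's API: the function names, the people `alice` and `bob`, and all truth values are hypothetical, and the rule body uses the Lukasiewicz conjunction max(0, A + B - 1), which is the standard soft-logic reading of `&` but is not spelled out on the slide.

```python
def lukasiewicz_and(*values):
    """Soft conjunction: max(0, sum(values) - (n - 1))."""
    return max(0.0, sum(values) - (len(values) - 1))


def implication_penalty(body, head):
    """Distance to satisfaction of body -> head: max(0, body - head)."""
    return max(0.0, body - head)


# Hypothetical soft truth values for two people, alice and bob.
votes_alice_dem = 0.9   # Votes(alice, "Democrat")
votes_bob_dem = 0.4     # Votes(bob, "Democrat")
spouse_bob_alice = 1.0  # Spouse(bob, alice)
friend_bob_alice = 1.0  # Friend(bob, alice)

# (weight, body value, head value) for two grounded relational rules.
ground_rules = [
    (1.0, lukasiewicz_and(votes_alice_dem, spouse_bob_alice), votes_bob_dem),
    (0.3, lukasiewicz_and(votes_alice_dem, friend_bob_alice), votes_bob_dem),
]

# Weighted sum of penalties: higher-weight rules cost more to violate.
total_penalty = sum(w * implication_penalty(b, h) for w, b, h in ground_rules)
print(round(total_penalty, 2))
```

Both rules are violated by the same amount here, so the spouse rule (weight 1.0) contributes more to the total than the friend rule (weight 0.3).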

  47. Beyond Pure Reasoning • Classical AI approach to knowledge: reasoning ◦ Lbl(Socrates, Man) & Sub(Man, Mortal) -> Lbl(Socrates, Mortal)

  48. Beyond Pure Reasoning • Classical AI approach to knowledge: reasoning ◦ Lbl(Socrates, Man) & Sub(Man, Mortal) -> Lbl(Socrates, Mortal) • Reasoning difficult when extracted knowledge has errors

  49. Beyond Pure Reasoning • Classical AI approach to knowledge: reasoning ◦ Lbl(Socrates, Man) & Sub(Man, Mortal) -> Lbl(Socrates, Mortal) • Reasoning difficult when extracted knowledge has errors • Solution: probabilistic models ◦ P(Lbl(Socrates, Mortal) | Lbl(Socrates, Man) = 0.9)

  50. Logic Refresher: Satisfaction

          /* Model Snippet */
          Mentions(A, "Affordable Health") -> Votes(A, "Democrat")

          Affordable Health   Democrat   Logical Satisfaction
          TRUE                TRUE       satisfied
          TRUE                FALSE      violated
          FALSE               TRUE       satisfied
          FALSE               FALSE      satisfied

  51. Logic and Noisy Data

          /* Model Snippet */
          [1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
          [2] Mentions(A, "Tax Cuts") -> !Votes(A, "Democrat")

          Affordable Health   Tax Cuts   Democrat   [1] Logical Satisfaction   [2] Logical Satisfaction
          TRUE                TRUE       TRUE       satisfied                  violated
          TRUE                TRUE       FALSE      violated                   satisfied

  52. Logic and Noisy Data

          /* Model Snippet */
          [1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
          [2] Mentions(A, "Tax Cuts") -> !Votes(A, "Democrat")

          Affordable Health   Tax Cuts   Democrat   [1] Logical Satisfaction   [2] Logical Satisfaction
          TRUE                TRUE       TRUE       satisfied                  violated
          TRUE                TRUE       FALSE      violated                   satisfied

      In logic, much as in politics, it is hard to satisfy everyone
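
The conflict in the table on slides 51 and 52 can be reproduced by enumerating both Boolean values of Votes(A, "Democrat") when both mentions are TRUE. This is a small sketch; the `implies` helper and variable names are chosen here for illustration.

```python
def implies(p, q):
    """Classical implication: P -> Q is false only when P is true and Q is false."""
    return (not p) or q


mentions_health = True  # Mentions(A, "Affordable Health")
mentions_tax = True     # Mentions(A, "Tax Cuts")

rows = []
for votes_democrat in (True, False):
    rule1 = implies(mentions_health, votes_democrat)   # rule [1]
    rule2 = implies(mentions_tax, not votes_democrat)  # rule [2]
    rows.append((votes_democrat, rule1, rule2))
    print(votes_democrat, rule1, rule2)

# No Boolean assignment satisfies both rules at once.
both_satisfiable = any(r1 and r2 for _, r1, r2 in rows)
print(both_satisfiable)
```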

  53. Soft Logic to the Rescue!

          /* Model Snippet */
          [1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
          [2] Mentions(A, "Tax Cuts") -> !Votes(A, "Democrat")

          Affordable Health   Tax Cuts   Democrat   [1] Logical Satisfaction   [2] Logical Satisfaction
          TRUE                TRUE       0.5        !                          !
          TRUE                TRUE       0.5        !                          !

  54. What does 0.5 MEAN?

  55. What does 0.5 mean? Rounding probability: • Flip a coin with bias 0.5 • Heads = TRUE • Tails = FALSE • Rounding this way gives a ¾-approximate solution to the NP-hard weighted MAX SAT problem [Goemans & Williamson, 94]
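
The coin-flip reading of a soft truth value on slide 55 can be sketched as below. This is a minimal illustration, not code from the lecture: `round_soft_values` and the atom names are hypothetical, and the sketch performs only the rounding step; it does not verify the ¾ guarantee.

```python
import random


def round_soft_values(soft_values, rng):
    """Round each soft truth value p to TRUE with probability p."""
    # rng.random() is uniform on [0.0, 1.0), so the comparison is True
    # with probability p (always for p = 1.0, never for p = 0.0).
    return {atom: rng.random() < p for atom, p in soft_values.items()}


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
soft = {"Votes(A, Democrat)": 0.5, "Mentions(A, Tax Cuts)": 1.0}
rounded = round_soft_values(soft, rng)
print(rounded)
```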

  56. What does ! MEAN?

  57. What does ! mean? P -> Q

          /* Soft Logic Penalty */
          if P < Q:
              return 0    /* satisfied */
          else:
              return P-Q

  58. ! : Closed Form P -> Q • max(0, P-Q)
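
The branching penalty on slide 57 and the closed form on slide 58 agree, since $P - Q = 0$ at the boundary $P = Q$. Written as a piecewise definition (the symbol $\ell$ is notation assumed here, not taken from the slides):

```latex
\[
\ell(P \rightarrow Q) \;=\; \max(0,\, P - Q) \;=\;
\begin{cases}
0, & P \le Q \\
P - Q, & P > Q
\end{cases}
\]
```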

  59. ! : Closed Form P -> Q • max(0, P-Q) [Figure: soft loss (axis 0 to 1) plotted against Q from 0 to 1 for P = 1, P = 0.6 and P = 0.2]

  60. What does ! mean?

          /* Model Snippet */
          [1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
          [2] Mentions(A, "Tax Cuts") -> !Votes(A, "Democrat")

          /* Soft Logic Penalty */
          if Mentions(A, "Tax Cuts") < !Votes(A, "Democrat"):
              return 0
          else:
              return Mentions(A, "Tax Cuts") - !Votes(A, "Democrat")

  61. Computing !

          /* Model Snippet */
          [1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
          [2] Mentions(A, "Tax Cuts") -> !Votes(A, "Democrat")

          Affordable Health   Tax Cuts   Democrat   [1] Penalty   [2] Penalty
          1                   1          0.7
          1                   1          0.2

          !Q = 1-Q
          P -> Q = max(0, P-Q)

  62. Computing !

          /* Model Snippet */
          [1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
          [2] Mentions(A, "Tax Cuts") -> !Votes(A, "Democrat")

          Affordable Health   Tax Cuts   Democrat   [1] Penalty   [2] Penalty
          1                   1          0.7        0.3           0.7
          1                   1          0.2        0.8           0.2

          !Q = 1-Q
          P -> Q = max(0, P-Q)
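
The penalties in the table on slide 62 follow directly from the two definitions given there, !Q = 1-Q and P -> Q = max(0, P-Q). A short sketch that recomputes them (the function names are chosen here; rounding to two decimals only hides floating-point noise):

```python
def soft_not(q):
    """Soft negation: !Q = 1 - Q."""
    return 1.0 - q


def soft_implies_penalty(p, q):
    """Penalty for P -> Q: max(0, P - Q)."""
    return max(0.0, p - q)


mentions_health = 1.0  # Mentions(A, "Affordable Health")
mentions_tax = 1.0     # Mentions(A, "Tax Cuts")

table = []
for democrat in (0.7, 0.2):
    penalty1 = soft_implies_penalty(mentions_health, democrat)         # rule [1]
    penalty2 = soft_implies_penalty(mentions_tax, soft_not(democrat))  # rule [2]
    table.append((democrat, round(penalty1, 2), round(penalty2, 2)))

for row in table:
    print(row)
```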
