Low-Pass Semantics

Fernando Pereira, with William Cohen*, Rahul Gupta, Ni Lao*, Slav Petrov, Michael Ringgaard, and Amar Subramanya (* CMU)

An easy query: frequent query, negation class


  4. Joint KB+text inference
     [Figure: a Freebase subgraph (nodes Charlotte Brontë, Patrick Brontë, Jane Eyre, Writer; edge labels Profession, HasFather, Write; one edge marked "?") connected through entity resolution to mentions in a news corpus ("Charlotte", "She", "Jane Eyre"), which are linked by dependency-tree edges (nsubj, dobj, around "wrote" and "was") and by coreference resolution.]
     $\mathrm{score}(s, t) = \sum_{\pi \in B} P(s \to t; \pi)\, \theta_\pi$
     • Path types π: edge label sequences
     • Random walk probabilities $P(s \to t; \pi)$
     • Weights $\theta_\pi$ learned by logistic regression
     Path ranking algorithm (PRA): Lao and Cohen, Machine Learning, 2010
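The scoring rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy graph, the uniform choice among matching neighbours, the helper names `path_probability` and `score`, and the weight values are all assumptions made for the example.

```python
from collections import defaultdict

# Toy graph, invented for illustration: (node, edge_label) -> neighbours.
EDGES = {
    ("Charlotte Brontë", "HasFather"): ["Patrick Brontë"],
    ("Charlotte Brontë", "Write"): ["Jane Eyre"],
    ("Patrick Brontë", "Profession"): ["Writer", "Priest"],
}

# Hypothetical path weights theta_pi; in PRA these would be learned
# by logistic regression rather than set by hand.
WEIGHTS = {
    ("HasFather", "Profession"): 2.0,
    ("Write", "Profession"): 0.5,
}


def path_probability(edges, source, path):
    """Distribution over end nodes of a random walk from `source` that
    follows the edge labels in `path`, picking uniformly among the
    neighbours that match each label. Walks with no matching edge are
    dropped, so the result may sum to less than 1."""
    dist = {source: 1.0}
    for label in path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            neighbours = edges.get((node, label), [])
            for n in neighbours:
                nxt[n] += prob / len(neighbours)
        dist = dict(nxt)
    return dist


def score(edges, source, target, weights):
    """score(s, t) = sum over path types pi of P(s -> t; pi) * theta_pi."""
    return sum(
        theta * path_probability(edges, source, path).get(target, 0.0)
        for path, theta in weights.items()
    )


if __name__ == "__main__":
    print(score(EDGES, "Charlotte Brontë", "Writer", WEIGHTS))
```

In this toy graph only the path type (HasFather, Profession) reaches "Writer"; the second path type contributes nothing because "Jane Eyre" has no Profession edge.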

  13. Case study: extending Freebase
      • Freebase: 21M concepts, 70M edges
      • 60M Web pages mention Freebase concepts relevant to this study
      • Study relations: profession, nationality, parent
      • Simplified entity resolution: most likely concept for named mentions in coref cluster
      • Profession stats:
        • 2M people in Freebase
        • 0.3M have a recorded profession
        • Biased data (0.24M politicians, actors)

  18. Selecting training data
      $s \xrightarrow{\pi} t$, $|\pi| \le 4$
      Positive: r(s, t), downsample for popular s, t
      Negative: sample t′ such that ¬r(s, t′)

      Task          Training Set   Test Set
      Profession    22,829         15,219
      Nationality   14,431         9,620
      Parents       21,232         14,155
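One way to read the positive and negative selection above is sketched below. The slide does not say how popular targets are downsampled or how t′ is drawn, so the per-target cap, the uniform draw, and the names `downsample_positives`, `sample_negatives`, and `candidates` are assumptions for illustration. The candidate lists stand in for the targets reachable from s by a path of length at most 4.

```python
import random
from collections import defaultdict


def downsample_positives(positives, max_per_target, rng):
    """Keep at most `max_per_target` positive pairs (s, t) for each
    target t, so that very popular targets do not dominate.
    The cap is an assumed policy, not taken from the slides."""
    by_target = defaultdict(list)
    for s, t in positives:
        by_target[t].append((s, t))
    kept = []
    for pairs in by_target.values():
        if len(pairs) > max_per_target:
            pairs = rng.sample(pairs, max_per_target)
        kept.extend(pairs)
    return kept


def sample_negatives(pairs, known, candidates, rng):
    """For each source s in `pairs`, draw one candidate t' such that
    (s, t') is not a known fact. `known` should hold all known facts,
    not only the downsampled ones, so no true pair becomes a negative."""
    negatives = []
    for s, _ in pairs:
        pool = [t for t in candidates.get(s, []) if (s, t) not in known]
        if pool:
            negatives.append((s, rng.choice(pool)))
    return negatives
```

Treating every unrecorded pair as false is itself an approximation: a missing fact in the knowledge base may simply be unrecorded.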

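  23. A learned path for profession
      $\langle M, \mathrm{conj}, M^{-1}, \mathrm{Profession} \rangle$
      $\langle M, \mathrm{conj}^{-1}, M^{-1}, \mathrm{Profession} \rangle$
      [Figure: example with nodes Miles Davis, John Coltrane, and Musician; edges labeled M, M⁻¹, and Profession.]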

  26. Relation extraction results

      Known triples (MRR)
      Task          KB      Text    KB+Text   KB+Text[b]
      Profession    0.532   0.516   0.583     0.453
      Nationality   0.734   0.729   0.812     0.693
      Parents       0.329   0.332   0.392     0.319

      $\mathrm{MRR} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{\text{rank of } q\text{'s first correct answer}}$

      Human evaluation
      Task          p@100   p@1k    p@10k
      Profession    0.97    0.92    0.84
      Nationality   0.98    0.97    0.90
      Parents       0.86    0.81    0.79
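The two metrics reported here can be computed as in the following sketch. The function names and input shapes are illustrative assumptions; in particular, a query with no correct answer in its ranked list is counted as contributing 0, which is a common convention but is not stated on the slide.

```python
def mean_reciprocal_rank(ranked_answers, gold):
    """MRR over queries. `ranked_answers` maps each query to its ranked
    list of answers; `gold` maps each query to its set of correct
    answers. A query with no correct answer in its list adds 0."""
    total = 0.0
    for query, answers in ranked_answers.items():
        for rank, answer in enumerate(answers, start=1):
            if answer in gold[query]:
                total += 1.0 / rank
                break
    return total / len(ranked_answers)


def precision_at_k(judgements, k):
    """Fraction of the top-k predictions judged correct. `judgements`
    is a list of booleans ordered by decreasing model score."""
    top = judgements[:k]
    return sum(top) / len(top)
```

For example, two queries whose first correct answers appear at ranks 1 and 2 give an MRR of (1 + 1/2) / 2 = 0.75.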
