Computing with Natural Language

Computing with Natural Language. Percy Liang, Stanford University. ACL Workshop on Semantic Parsing, June 15, 2014. [PaleoDeepDive (Shanan Peters, Chris Ré)] Paleobiology


  1. Training intuition
     Where did Mozart tupress? (answer: Vienna)
       PlaceOfBirth.Mozart ⇒ Salzburg
       PlaceOfDeath.Mozart ⇒ Vienna
       PlaceOfMarriage.Mozart ⇒ Vienna
     Where did William Hogarth tupress? (answer: London)
       PlaceOfBirth.WilliamHogarth ⇒ London
       PlaceOfDeath.WilliamHogarth ⇒ London
       PlaceOfMarriage.WilliamHogarth ⇒ Paddington
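
The intuition on this slide can be sketched in a few lines of Python: for the unknown word "tupress", keep only the candidate predicates whose denotation matches the annotated answer in every training example. The toy knowledge base copies the values from the slide; the helper name `consistent_predicates` and the dictionary layout are assumptions made for illustration, not part of the talk.

```python
# Toy knowledge base copied from the slide: (predicate, entity) -> value.
KB = {
    ("PlaceOfBirth", "Mozart"): "Salzburg",
    ("PlaceOfDeath", "Mozart"): "Vienna",
    ("PlaceOfMarriage", "Mozart"): "Vienna",
    ("PlaceOfBirth", "WilliamHogarth"): "London",
    ("PlaceOfDeath", "WilliamHogarth"): "London",
    ("PlaceOfMarriage", "WilliamHogarth"): "Paddington",
}

# Training examples: (entity mentioned in the question, annotated answer).
EXAMPLES = [("Mozart", "Vienna"), ("WilliamHogarth", "London")]


def consistent_predicates(kb, examples):
    """Return predicates whose denotation equals the answer in every example."""
    predicates = sorted({pred for pred, _ in kb})
    return [
        pred
        for pred in predicates
        if all(kb.get((pred, entity)) == answer for entity, answer in examples)
    ]


# With Mozart alone, two predicates remain; adding Hogarth removes the ambiguity.
print(consistent_predicates(KB, EXAMPLES[:1]))
print(consistent_predicates(KB, EXAMPLES))
```

A single example leaves PlaceOfDeath and PlaceOfMarriage tied; the second example is what separates them.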

  5. Outline
     • Semantic parsing in 5 minutes
     • A closer look at the elements
       – Knowledge base incompleteness
       – Lexical coverage
       – Search over logical forms
       – Learning via bootstrapping
       – Leveraging denotations ("grounding")
       – Datasets
     • Final remarks

  7. Challenge: incomplete knowledge base
     What are the longest hiking trails in Baltimore?
     Data source (hiking trails in Baltimore): Avalon Super Loop, Patapsco Valley State Park, Gunpowder Falls State Park, Union Mills Hike, Greenbury Point, ...

  8. [Figure: fragment of the Freebase graph around BarackObama and MichelleObama. Labels in the figure include Honolulu, Hawaii, Chicago, UnitedStates, Person, Politician, City, USState, PlaceOfBirth, PlacesLived, Location, ContainedBy, Spouse, Marriage, StartDate, Gender, DateOfBirth, Profession, Type.]
     Fewer than 10% of general web questions can be answered via Freebase.

  11. [Pasupat & Liang, 2014] Semantic parsing on the web
      Input:
      • query x: hiking trails near Baltimore
      • web page w
      Output:
      • list of entities y: [Avalon Super Loop, Patapsco Valley State Park, ...]

  15. [Sahuguet and Azavant, 1999; Liu et al., 2000; Crescenzi et al., 2001] Logical forms: XPath expressions
      [Figure: DOM tree of the web page, with nodes html, head, body, h1, table, tr, th, td]
      z = /html[1]/body[1]/table[2]/tr/td[1]

  16. Framework
      Input: query x (hiking trails near Baltimore) and web page w (html, head, body, ...)
      Generation: candidate set Z (|Z| ≈ 8500)
      Model: z = /html[1]/body[1]/table[2]/tr/td[1]
      Execution: y = [Avalon Super Loop, Patapsco Valley State Park, ...]
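
The execution step of this framework (logical form z applied to web page w to obtain the entity list y) can be sketched with Python's standard library. This is a minimal illustration under stated assumptions: the page below is an invented, well-formed toy document, not the page from the talk, and `execute` is a hypothetical helper. `xml.etree.ElementTree` supports only a subset of XPath and rejects absolute paths on an element, so the sketch strips the leading `/html[1]` step and evaluates the rest relative to the root.

```python
import xml.etree.ElementTree as ET

# Invented toy page (well-formed so that ElementTree can parse it).
PAGE = """
<html>
  <head><title>Trails</title></head>
  <body>
    <table><tr><td>Home</td><td>About</td></tr></table>
    <h1>Hiking trails near Baltimore</h1>
    <table>
      <tr><th>Trail</th><th>Notes</th></tr>
      <tr><td>Avalon Super Loop</td><td>loop</td></tr>
      <tr><td>Union Mills Hike</td><td>out and back</td></tr>
    </table>
  </body>
</html>
"""


def execute(z, page):
    """Execute XPath logical form z on page and return the denotation y."""
    prefix = "/html[1]"
    if not z.startswith(prefix):
        raise ValueError("this sketch only handles paths rooted at /html[1]")
    root = ET.fromstring(page.strip())  # root is the <html> element
    relative = "." + z[len(prefix):]    # e.g. ./body[1]/table[2]/tr/td[1]
    return [node.text for node in root.findall(relative)]


print(execute("/html[1]/body[1]/table[2]/tr/td[1]", PAGE))
```

The header row contributes nothing because it contains `th` cells only, so the first-`td` step selects one entity per data row.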

  21. Challenge: lexical coverage
      born ⇒ Type.City, PeopleBornHere, Profession.Lawyer, ... ?

  22. [Fader et al., 2011] Solution: alignment
      Open information extraction on ClueWeb09 (15M triples):
        (Barack Obama, was born in, Honolulu)
        (Albert Einstein, was born in, Ulm)
        (Barack Obama, lived in, Chicago)
        ...
      Freebase (400M triples):
        (BarackObama, PlaceOfBirth, Honolulu)
        (Albert Einstein, PlaceOfBirth, Ulm)
        (BarackObama, PlacesLived.Location, Chicago)
        ...
      [Figure: fragment of the Freebase graph around BarackObama and MichelleObama]

  24. Match text and Freebase predicates
      Text phrases: grew up in, born in, married in, born in
      Freebase predicates: DateOfBirth, PlaceOfBirth, Marriage.StartDate, PlacesLived.Location
      Similar schema matching / alignment ideas [Cai & Yates, 2013; Fader et al., 2013; Yao & van Durme, 2014; etc.]
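
One simple way to match text phrases with knowledge-base predicates is to count how many (subject, object) pairs they share. The sketch below uses the triples shown on the alignment slide; the whitespace-stripping `normalize` step stands in for real entity linking, and both it and the `align` helper are assumptions made for illustration rather than the method of the cited work.

```python
from collections import Counter

# Triples copied from the alignment slide.
TEXT_TRIPLES = [
    ("Barack Obama", "was born in", "Honolulu"),
    ("Albert Einstein", "was born in", "Ulm"),
    ("Barack Obama", "lived in", "Chicago"),
]
KB_TRIPLES = [
    ("BarackObama", "PlaceOfBirth", "Honolulu"),
    ("Albert Einstein", "PlaceOfBirth", "Ulm"),
    ("BarackObama", "PlacesLived.Location", "Chicago"),
]


def normalize(name):
    """Crude stand-in for entity linking: ignore whitespace in names."""
    return name.replace(" ", "")


def align(text_triples, kb_triples):
    """Count shared (subject, object) pairs for each (phrase, predicate)."""
    kb_index = {}
    for subj, pred, obj in kb_triples:
        kb_index.setdefault((normalize(subj), normalize(obj)), set()).add(pred)
    counts = Counter()
    for subj, phrase, obj in text_triples:
        for pred in kb_index.get((normalize(subj), normalize(obj)), ()):
            counts[(phrase, pred)] += 1
    return counts


for (phrase, pred), n in align(TEXT_TRIPLES, KB_TRIPLES).most_common():
    print(f"{phrase!r} ~ {pred}: {n} shared pair(s)")
```

On real data these counts would typically be turned into features (for example an alignment score) rather than used as hard decisions.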

  25. Challenge: variability in language
      What is the currency in the US?
      What money do they use in the states?
      How do you pay in America?
      What's the currency of the US?
      What money is accepted in the United States?
      What money to take to the US?
      ...

  27. [Berant & Liang, 2014] A solution: paraphrasing
      Utterance: How many people live in Seattle?
      Paraphrase: What is the population of Seattle?
      Logical form: PopulationOf(Seattle)
      Denotation: 850,000
      Convert to a text-only problem

  28. Challenge: "sub-lexical compositionality"
      grandmother: λx. Gender.Female ⊓ Parent.Parent.x
      mayor: λx. GovtPositionsHeld.(Title.Mayor ⊓ OfficeOfJurisdiction.x)
      presidents who have served two non-consecutive terms [requires higher-order quantification]
      presidents who were previously vice-presidents [anaphora]
      every other president [weird quantification anaphora]

  31. Many possible derivations!
      Where was Obama born?
      A Really Dumb Grammar:
        (lexicon) Obama ⇒ Unary : BarackObama
        (lexicon) born ⇒ Binary : PlaceOfBirth
        ...
        (join) Unary : u  Binary : b ⇒ Unary : b.u
        (intersect) Unary : u  Unary : v ⇒ Unary : u ⊓ v
      ⇒ set of candidate derivations D(x), including:
        Type.Location ⊓ R[PlaceOfBirth].BarackObama
        Type.Date ⊓ R[Founded].ObamaJapan
      [Figure: two derivation trees over "Where was Obama born?". In the first, the lexicon maps where, Obama, born to Type.Location, BarackObama, R[PlaceOfBirth]; in the second, to Type.Date, ObamaJapan, R[Founded]. Each tree applies join, then intersect.]
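
To see how quickly such a grammar overgenerates, here is a minimal chart-parsing sketch of the lexicon, join, and intersect rules. The two-entry-per-word lexicon is an assumption assembled from the logical forms visible on the slide, words without a lexicon entry are simply skipped, and the function names are hypothetical.

```python
from itertools import product

# Hypothetical lexicon: token -> list of (category, logical form) entries.
LEXICON = {
    "where": [("Unary", "Type.Location"), ("Unary", "Type.Date")],
    "obama": [("Unary", "BarackObama"), ("Unary", "ObamaJapan")],
    "born": [("Binary", "R[PlaceOfBirth]"), ("Binary", "R[Founded]")],
}


def join(binary, unary):
    """(join) Unary : u  Binary : b  =>  Unary : b.u"""
    arg = f"({unary})" if " " in unary else unary
    return f"{binary}.{arg}"


def intersect(left, right):
    """(intersect) Unary : u  Unary : v  =>  Unary : u ⊓ v"""
    return f"{left} ⊓ {right}"


def candidate_derivations(utterance):
    """Return the logical forms of all unary derivations spanning the utterance."""
    tokens = [t for t in utterance.lower().rstrip("?").split() if t in LEXICON]
    n = len(tokens)
    if n == 0:
        return []
    chart = {(i, i + 1): set(LEXICON[tok]) for i, tok in enumerate(tokens)}
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            cell = set()
            for split in range(start + 1, end):
                pairs = product(chart[start, split], chart[split, end])
                for (c1, f1), (c2, f2) in pairs:
                    if c1 == "Unary" and c2 == "Binary":
                        cell.add(("Unary", join(f2, f1)))
                    elif c1 == "Binary" and c2 == "Unary":
                        cell.add(("Unary", join(f1, f2)))
                    elif c1 == c2 == "Unary":
                        cell.add(("Unary", intersect(f1, f2)))
            chart[start, end] = cell
    return sorted(form for cat, form in chart[0, n] if cat == "Unary")


forms = candidate_derivations("Where was Obama born?")
print(len(forms), "candidate logical forms")
for form in forms:
    print(form)
```

Even with three content words and two lexicon entries each, the rules produce 16 full-span candidates, most of them nonsensical; the model has to rank them.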

  32. [Berant et al., 2013] Bridging
      Which college did Obama go to?
      [Figure: Type.University (alignment) and BarackObama (alignment) are connected by Education (bridging)]
      Bridging: use neighboring predicates / type constraints
      Start building from parts with more certainty

  35. [Berant & Liang, 2014] Bridging to nowhere
      Search logical forms based on "prior":
      What countries in the world speak Arabic?
        ArabicAlphabet, ArabicLang
        LangSpoken.ArabicLang, LangFamily.Arabic
        Type.Country ⊓ LangSpoken.ArabicLang, Count(Type.Country ⊓ LangSpoken.ArabicLang)
      Start building from parts with more certainty

  40. Oracle on WebQuestions
      For what fraction of utterances was a candidate logical form correct?
      [Bar chart: percentage (axis 0 to 70) for [Berant et al., 2013] and for Paraphrasing; bar values not recoverable]

  41. Overapproximation via simple grammars
      • Modeling correct derivations requires complex rules
      • Simple rules generate an overapproximation of good derivations
      • Hard grammar rules ⇒ soft/overlapping features

  45. Bootstrapping from easy examples
      [Animation: iterations 1 through 4 over Example 1, Example 2, Example 3, Example 4, Example 5, ...; per-example status not recoverable]

  49. Bootstrapping from easy examples
      On GeoQuery [Liang et al., 2011]:
      [Plot: % of train examples (0 to 100) against iteration (1 to 4); values not recoverable]

  51. x: utterance, d: derivation
      [Figure: derivation tree for "Where was Obama born?" yielding Type.Location ⊓ R[PlaceOfBirth].BarackObama via lexicon, join, and intersect steps]
      Feature vector φ(x, d) ∈ ℝ^f:
        apply join                           1
        apply intersect                      1
        apply lexicon                        3
        skipped VBD-AUX                      1
        skipped NN                           0
        born maps to PlaceOfBirth            1
        born maps to PlacesLived.Location    0
        alignmentScore                       1.52
        denotation-size=1                    1
        ...                                  ...
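
A feature vector like this is typically turned into a ranking over candidate derivations by a log-linear model, p(d | x) ∝ exp(φ(x, d) · θ). The slide does not state the model, so treat that choice as an assumption. In the sketch below the first feature vector is copied from the slide, while the second candidate's features and all weights in `THETA` are invented for illustration.

```python
import math

# Features of the derivation shown on the slide.
PHI_BIRTH = {
    "apply join": 1, "apply intersect": 1, "apply lexicon": 3,
    "skipped VBD-AUX": 1, "skipped NN": 0,
    "born maps to PlaceOfBirth": 1, "born maps to PlacesLived.Location": 0,
    "alignmentScore": 1.52, "denotation-size=1": 1,
}
# Invented features for a competing derivation (assumption, not from the talk).
PHI_FOUNDED = {
    "apply join": 1, "apply intersect": 1, "apply lexicon": 3,
    "skipped VBD-AUX": 1, "alignmentScore": 0.2, "denotation-size=1": 1,
}
# Invented weights; in practice these would be learned.
THETA = {
    "born maps to PlaceOfBirth": 1.0,
    "alignmentScore": 0.5,
    "denotation-size=1": 0.5,
    "skipped NN": -1.0,
}


def score(features, weights):
    """Dot product of a sparse feature vector with a sparse weight vector."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def derivation_distribution(candidates, weights):
    """Softmax over candidate derivations (log-linear model)."""
    scores = {d: score(phi, weights) for d, phi in candidates.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {d: math.exp(s - top) for d, s in scores.items()}
    total = sum(exps.values())
    return {d: e / total for d, e in exps.items()}


candidates = {
    "Type.Location ⊓ R[PlaceOfBirth].BarackObama": PHI_BIRTH,
    "Type.Date ⊓ R[Founded].ObamaJapan": PHI_FOUNDED,
}
for d, p in derivation_distribution(candidates, THETA).items():
    print(f"{p:.3f}  {d}")
```

Features absent from `THETA` get weight zero, which mirrors how sparse indicator features are usually handled.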

  53. Denotation features for entity extraction
      Query: hiking trails near Baltimore
      z = /html[1]/body[1]/table[2]/tr/td[1] ⇒ [Avalon Super Loop, Patapsco Valley State Park, Gunpowder Falls State Park, Rachel Carson Conservation Park, Union Mills Hike, ...]
      z = /html[1]/body[1]/div[2]/a ⇒ [Home, About Baltimore Tour >, Pricing, Contact, Online Support, ...]
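
Denotation features look at what a logical form returns rather than at the logical form itself. The sketch below computes three simple statistics over the two denotations on this slide; the specific features (`size`, `mean_words`, `frac_multiword`) are illustrative assumptions, not the feature set of the cited work, and the navigation list omits the one entry whose boundaries are unclear in the transcript.

```python
# Denotations copied from the slide (truncated where the slide shows "...").
TRAILS = [
    "Avalon Super Loop",
    "Patapsco Valley State Park",
    "Gunpowder Falls State Park",
    "Rachel Carson Conservation Park",
    "Union Mills Hike",
]
NAVIGATION = ["Home", "Pricing", "Contact", "Online Support"]


def denotation_features(denotation):
    """Compute simple, hypothetical features of a list of extracted strings."""
    word_counts = [len(entity.split()) for entity in denotation]
    size = len(denotation)
    return {
        "size": size,
        "mean_words": sum(word_counts) / size if size else 0.0,
        "frac_multiword": sum(c > 1 for c in word_counts) / size if size else 0.0,
    }


print("table column:", denotation_features(TRAILS))
print("nav links:   ", denotation_features(NAVIGATION))
```

The table column yields uniformly multi-word names while the navigation links are mostly single words, which is the kind of signal a model can use to prefer the first XPath.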
