Weakly-Supervised Grammar-Informed Bayesian CCG Parser Learning




  1. Weakly-Supervised Grammar-Informed Bayesian CCG Parser Learning. Dan Garrette (UT-Austin), Chris Dyer (CMU), Jason Baldridge (UT-Austin), Noah A. Smith (CMU)

  2. Motivation: Annotating parse trees by hand is extremely difficult.

  3. Motivation: Can we learn new parsers cheaply? (cheaper = less supervision)

  4. Motivation: When supervision is scarce, we have to be smarter about the data.

  5. Type-Level Supervision

  6. Type-Level Supervision: • Unannotated text • Incomplete tag dictionary: word → {tags}
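As a concrete illustration of the supervision signal above (a minimal sketch; the words and tags are invented, not from the talk), the tag dictionary is just an incomplete map from word types to sets of allowed tags:

```python
# Hypothetical tag dictionary: word type -> set of allowed POS tags.
# Entries are illustrative; real dictionaries are incomplete and noisy.
tag_dict = {
    "the":   {"DT"},
    "dog":   {"NN"},
    "walks": {"VBZ", "NNS"},  # an ambiguous word type
}
# Together with unannotated text, this is the only supervision used;
# word types missing from the dictionary may take any tag.
```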

  7. Type-Level Supervision: Used for part-of-speech tagging for 20+ years [Kupiec, 1992] [Merialdo, 1994]

  8. Type-Level Supervision: Good tagger performance even with low supervision [Ravi & Knight, 2009] [Das & Petrov, 2011] [Garrette & Baldridge, 2013] [Garrette et al., 2013]

  9. Combinatory Categorial Grammar (CCG)

  10. CCG: Every word token is associated with a category; categories combine to form the categories of larger constituents. [Steedman, 2000] [Steedman and Baldridge, 2011]

  11. CCG: "the dog" — the := np/n, dog := n; np/n combines with n to give np.

  12. CCG: "dogs sleep" — dogs := np, sleep := s\np; np combines with s\np to give s.
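A minimal sketch of the two application rules used on these slides, assuming a simple category encoding (atoms as strings, complex categories as (result, slash, argument) triples); this is illustrative, not the authors' implementation:

```python
def combine(left, right):
    """Try CCG function application on two adjacent categories."""
    # Forward application: X/Y  Y  =>  X  (e.g. np/n + n -> np)
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    # Backward application: Y  X\Y  =>  X  (e.g. np + s\np -> s)
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

assert combine(("np", "/", "n"), "n") == "np"   # "the dog"
assert combine("np", ("s", "\\", "np")) == "s"  # "dogs sleep"
```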

  13. Type-Supervised CCG: each word maps to a set of possible categories, e.g. the → {np/n, np, …}, lazy → {n/n, np, …}, dogs → {n, n/n, …}, wander → {s\np, (s\np)/np, n, …}

  14.–23. CCG Parsing (the derivation of "the lazy dogs wander", built step by step across the slides): the := np/n, lazy := n/n, dogs := n, wander := s\np; then n/n n → n ("lazy dogs"); np/n n → np ("the lazy dogs"); np s\np → s (the full sentence).
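The step-by-step derivation above is exactly what a CKY-style chart parser computes bottom-up; here is a small runnable sketch (assumed, not the authors' code) using the same category encoding as before:

```python
from itertools import product

def combine(left, right):
    # Forward application: X/Y  Y  =>  X
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    # Backward application: Y  X\Y  =>  X
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

def parse(words, tag_dict):
    """CKY over category sets: chart[(i, j)] = categories spanning words[i:j]."""
    n = len(words)
    chart = {(i, i + 1): set(tag_dict[w]) for i, w in enumerate(words)}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            chart[(i, j)] = {
                cat
                for k in range(i + 1, j)
                for l, r in product(chart[(i, k)], chart[(k, j)])
                if (cat := combine(l, r)) is not None
            }
    return chart[(0, n)]

tag_dict = {
    "the":    {("np", "/", "n")},
    "lazy":   {("n", "/", "n")},
    "dogs":   {"n"},
    "wander": {("s", "\\", "np")},
}
print(parse("the lazy dogs wander".split(), tag_dict))  # {'s'}
```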

  24. Why CCG? Machine Translation [Weese, Callison-Burch, and Lopez, 2012]; Semantic Parsing [Zettlemoyer and Collins, 2005]

  25. Type-Supervised CCG: Type-supervised learning for CCG is highly ambiguous. Penn Treebank: 48 parts-of-speech tags; CCGBank: 1,300+ categories.

  26. Our Strategy: The grammar formalism itself can be used to guide learning.

  27. Our Strategy: Incorporate universal knowledge about grammar into learning.

  28. Universal Knowledge

  29.–30. Prefer Simpler Categories: two parses of "the lazy dog", both yielding np — one uses the complex categories np/n, (np\(np/n))/n, n (building np\(np/n) along the way); the other uses the simple categories np/n, n/n, n.

  31. Prefer Simpler Categories: buy := (s_b\np)/np appears 342 times in CCGbank, e.g. "Opponents don't buy such arguments." buy := (((s_b\np)/pp)/pp)/np appears once: "Tele-Communications agreed to buy half of Showtime Networks from Viacom for $225 million."
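One way to make "simpler" precise (a sketch under the assumption that complexity is measured by the number of atomic categories; the talk does not spell out the measure) is to count atoms recursively:

```python
def size(cat):
    """Number of atomic categories in a CCG category."""
    if isinstance(cat, tuple):   # complex: (result, slash, argument)
        return size(cat[0]) + size(cat[2])
    return 1                     # atomic: s, np, n, pp, ...

sv = ("s", "\\", "np")                        # s\np
assert size((sv, "/", "np")) == 3             # (s_b\np)/np: 342 CCGbank tokens
assert size((((sv, "/", "pp"), "/", "pp"), "/", "np")) == 5   # the one-off category
```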

  32. Prefer Modifier Categories: (s_b\np)/np is a transitive verb: "(he) hides (the money)"; ((s_b\np)/np)/((s_b\np)/np) is an adverb: "(he) quickly (hides) (the money)".

  33.–34. Weighted Category Grammar (B and C are themselves generated recursively, with B ≠ C in the non-modifier rules):
      a ∈ {s, np, n, …}:  p_atom(a) × p_term
      A → B/B:  (1 − p_term) × p_fwd × p_mod
      A → B/C:  (1 − p_term) × p_fwd × (1 − p_mod)
      A → B\B:  (1 − p_term) × (1 − p_fwd) × p_mod
      A → B\C:  (1 − p_term) × (1 − p_fwd) × (1 − p_mod)
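A sketch of how such weights could be computed for a concrete category. The parameter values here are made up, and for simplicity the sketch recurses into both subcategories and treats the result as an unnormalized score rather than a proper distribution:

```python
P_TERM, P_FWD, P_MOD = 0.7, 0.5, 0.6                 # illustrative values
P_ATOM = {"s": 0.2, "np": 0.3, "n": 0.4, "pp": 0.1}  # illustrative values

def weight(cat):
    if not isinstance(cat, tuple):                   # atomic category a
        return P_TERM * P_ATOM[cat]
    result, slash, arg = cat
    w = 1 - P_TERM                                   # chose a complex category
    w *= P_FWD if slash == "/" else 1 - P_FWD        # slash direction
    w *= P_MOD if result == arg else 1 - P_MOD       # modifier (B/B, B\B) or not
    return w * weight(result) * weight(arg)          # recurse into subcategories

# The simple analysis of "lazy" outscores the complex one by orders of magnitude:
print(weight(("n", "/", "n")))                              # n/n, a modifier
print(weight((("np", "\\", ("np", "/", "n")), "/", "n")))   # (np\(np/n))/n
```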

  35.–36. Prefer Likely Categories: in the derivation of "the lazy dogs wander" (np/n, n/n, n, s\np at the leaves; n, np, s above them), every category in the tree is scored by the weighted category grammar.

  37. Type-Supervised Learning: the inputs are an unlabeled corpus and a tag dictionary (the same setting as POS tagging), plus universal properties of the CCG formalism.

  38. Posterior Inference [Johnson, Griffiths, and Goldwater, 2007]

  39.–45. Posterior Inference: the priors ("simple is good") define a PCFG over categories; for each sentence ("the lazy dogs wander" with its tag-dictionary entries np/n, n/n, n, s\np, …), run the inside algorithm and sample a derivation from the posterior, then repeat.
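A runnable toy sketch of one such inside-and-sample step, following the general recipe of Johnson, Griffiths, and Goldwater (2007) rather than the authors' code; the full algorithm also resamples grammar parameters between sweeps, which is omitted here:

```python
import random

def combine(left, right):
    # Forward application X/Y Y => X; backward application Y X\Y => X.
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

def inside(words, tag_dict, weight):
    """chart[(i, j)][cat] = total weight of derivations of cat over words[i:j]."""
    n = len(words)
    chart = {(i, i + 1): {c: weight(c) for c in tag_dict[w]}
             for i, w in enumerate(words)}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = {}
            for k in range(i + 1, j):
                for lc, lw in chart[(i, k)].items():
                    for rc, rw in chart[(k, j)].items():
                        cat = combine(lc, rc)
                        if cat is not None:
                            cell[cat] = cell.get(cat, 0.0) + lw * rw
            chart[(i, j)] = cell
    return chart

def sample_tree(chart, i, j, cat):
    """Sample a derivation of cat over (i, j), proportional to inside weight."""
    if j == i + 1:
        return cat
    options, weights = [], []
    for k in range(i + 1, j):
        for lc, lw in chart[(i, k)].items():
            for rc, rw in chart[(k, j)].items():
                if combine(lc, rc) == cat:
                    options.append((k, lc, rc))
                    weights.append(lw * rw)
    k, lc, rc = random.choices(options, weights=weights)[0]
    return (cat, sample_tree(chart, i, k, lc), sample_tree(chart, k, j, rc))

def cat_size(c):                  # prior: simpler categories get more weight
    return cat_size(c[0]) + cat_size(c[2]) if isinstance(c, tuple) else 1

tag_dict = {"the": {("np", "/", "n")}, "lazy": {("n", "/", "n"), "np"},
            "dogs": {"n"}, "wander": {("s", "\\", "np")}}
words = "the lazy dogs wander".split()
chart = inside(words, tag_dict, lambda c: 0.5 ** cat_size(c))
print(sample_tree(chart, 0, len(words), "s"))
```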

  46. Results

  47. CCG Parsing Results: [bar chart] parsing accuracy (y-axis 0–75) for English, Chinese, and Italian, comparing a uniform prior ("Uniform") against the grammar-informed prior ("With Prior"); "With Prior" outperforms "Uniform" for all three languages (bar values shown: 60.0, 58.2, 55.7 and 53.4, 42.0, 35.9).

  48. Conclusion: Universal grammatical knowledge lets us make better use of weak supervision.
