

  1. Computational Models of Language Learning. Jelle Zuidema, Institute for Logic, Language and Computation, University of Amsterdam. Guest lecture for the MSc programmes Brain & Cognitive Science, Artificial Intelligence, and Logic (MoL).

  2. Plan for today
  • Introduction: grammars in cognitive science and language technology
  • What kind of grammars do we need? A quick intro to probabilistic grammars
  • How do we learn them? A quick intro to statistical inference
  • Efficiency
  • Accuracy

  3. 2;5 *CHI: seen one those .
  3;0 *CHI: I never seen a watch .
  3;0 *CHI: I never seen a watch .
  3;0 *CHI: I never seen a bandana .
  3;0 *CHI: I never seen a monkey train .
  3;0 *CHI: I never seen a tree dance .
  3;2 *CHI: I never seen a duck like that # riding@o on a pony .
  3;2 *CHI: I never seen (a)bout dat [: that] .
  3;5 *CHI: I never seen this jet .
  3;5 *CHI: I never seen this jet .
  3;5 *CHI: I never seen a Sky_Dart .
  3;5 *CHI: I never seen this before .
  3;8 *CHI: yeah # I seen carpenters too .
  3;8 *CHI: where had you seen carpenters do that ?
  3;8 *CHI: I never seen her .
  3;8 *CHI: I never seen people wear de [: the] fish flies .
  3;8 *CHI: where have you seen a whale ?
  3;8 *CHI: I never seen a bird talk .
  3;11 *CHI: I never seen a kangaroo knit .
  3;11 *CHI: I never seen dat [: that] to play .
  3;11 *CHI: I never seen a dog play a piano # have you ?
  3;11 *CHI: I never seen a rhinoceros eat with a hands .
  4;7 *CHI: I seen one in the store some days .

  4. Grammar in child language. MacWhinney et al. (1983); Sagae et al. (2007); Borensztajn, Zuidema & Bod (CogSci, 2008). (Figure: Adam, 3;11.01.)

  5. Grammar in NLP applications
  • E.g., speech recognition (homophone confusions that grammar can disambiguate):
    – please, right this down
    – write now
    – who's write, and who's wrong
  • E.g., anaphora resolution:
    – Mary didn't know who John was married to. He told her, and it turned out, she already knew her.
  • E.g., machine translation

  6. (Figure: Steedman, 2008, Computational Linguistics.)

  7. (figure)

  8. (figure)

  9. Learning grammars from data
  • Syntactically annotated corpora
    – Penn WSJ Treebank training set: 38k sentences, ~1M words
    – Tuebingen spoken/written English/German treebanks
    – Corpus Gesproken Nederlands
  • Unannotated corpora
    – the web ...
    – Google's n-gram corpora

  10. "Spam" (frequency plot, www.culturomics.org). Penn WSJ: 0 counts.

  11. "Kick the bucket" (frequency plot, www.culturomics.org). Penn WSJ: 0 counts.

  12. "... know but were afraid to ..." (frequency plot, www.culturomics.org). Penn WSJ: 0 counts.

  13. Probabilistic Grammar Paradigm
  • Generative models define the process by which sentences are generated, and assign probabilities to sentences.
  • Statistical inference lets us search through the space of possible generative models.
  • Empirical evaluation against a manually written 'gold standard' allows us to more-or-less objectively compare different models.

  14. A very brief tour of generative models

  15. Sequences: e.g., Hidden Markov Model (figure)
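The slide's HMM diagram does not survive in this transcript. As a stand-in, here is a minimal Python sketch of an HMM used generatively: hidden POS-like states form a Markov chain, and each state emits a word. The states, words, and probabilities are invented for illustration, not taken from the lecture.

```python
import random

# Toy HMM as a generative model (all numbers invented for illustration).
TRANSITIONS = {                       # P(next state | state)
    "START": {"DET": 0.7, "NOUN": 0.3},
    "DET":   {"NOUN": 1.0},
    "NOUN":  {"VERB": 0.6, "STOP": 0.4},
    "VERB":  {"DET": 0.5, "STOP": 0.5},
}
EMISSIONS = {                         # P(word | state)
    "DET":  {"the": 0.6, "a": 0.4},
    "NOUN": {"dog": 0.5, "watch": 0.5},
    "VERB": {"saw": 0.5, "seen": 0.5},
}

def sample(dist):
    """Draw one key from a {key: probability} distribution."""
    r, acc = random.random(), 0.0
    for key, p in dist.items():
        acc += p
        if r < acc:
            return key
    return key                        # guard against float rounding

def generate():
    """Walk the Markov chain over hidden states, emitting one word each."""
    state, words = "START", []
    while True:
        state = sample(TRANSITIONS[state])
        if state == "STOP":
            return words
        words.append(sample(EMISSIONS[state]))

print(" ".join(generate()))           # e.g. "the dog seen a watch"
```

The probability of a sentence under this model is a sum over all hidden state sequences that could have produced it, which is exactly what makes the HMM a probabilistic generative model rather than a mere generator.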

  16. Syntax: e.g., Probabilistic Context-Free Grammars (figure)
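Analogously, a PCFG generates a sentence by probabilistically rewriting nonterminals top-down. A minimal sketch, with an invented toy grammar (not the lecture's):

```python
import random

# Toy PCFG: each nonterminal maps to weighted rewrite rules (probs sum to 1).
RULES = {
    "S":   [(0.9, ["NP", "VP"]), (0.1, ["VP"])],
    "NP":  [(0.6, ["DET", "N"]), (0.4, ["N"])],
    "VP":  [(0.7, ["V", "NP"]), (0.3, ["V"])],
    "DET": [(1.0, ["the"])],
    "N":   [(0.5, ["dog"]), (0.5, ["watch"])],
    "V":   [(1.0, ["seen"])],
}

def expand(symbol):
    """Rewrite a symbol top-down; anything without rules is a terminal."""
    if symbol not in RULES:
        return [symbol]
    r = random.random()
    for p, rhs in RULES[symbol]:      # pick a rule with probability p
        r -= p
        if r < 0:
            break
    return [word for child in rhs for word in expand(child)]

print(" ".join(expand("S")))          # e.g. "the dog seen the watch"
```

The probability the grammar assigns to a derivation is simply the product of the probabilities of the rules it uses.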

  17. (figure)

  18. Semantics: e.g., Discourse Representation Structure. "It is not clear" (figure: DRS with negation, present tense, agent, anaphora resolution)

  19. (figure)

  20. Semantics: e.g., Discourse Representation Structure (Le & Zuidema, 2012, COLING)

  21. A very brief tour of statistical learning

  22-24. Bayes' Rule:

      P(G|D) = P(D|G) · P(G) / P(D)

  where P(G) is the prior over grammars, P(D|G) the likelihood of the data given a grammar, P(G|D) the posterior, and P(D) the probability of the data. (Slides 23-24 build up the formula, labelling the terms and depicting each G as a grammar.)
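To make the rule concrete, here is a tiny worked example in Python: two candidate grammars scored against one observed dataset D. All numbers are invented purely for illustration.

```python
# Two candidate "grammars" scored against one observed dataset D.
prior      = {"G1": 0.7, "G2": 0.3}        # P(G), invented
likelihood = {"G1": 0.001, "G2": 0.008}    # P(D | G), invented

evidence  = sum(prior[g] * likelihood[g] for g in prior)             # P(D)
posterior = {g: prior[g] * likelihood[g] / evidence for g in prior}  # P(G|D)
print(posterior)   # ≈ {'G1': 0.23, 'G2': 0.77}: the data overturn the prior
```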

  25-28. Statistical inference: we seek the posterior P(G|D) over the space of grammars G. By Bayes' rule, P(G|D) = P(D|G) · P(G) / P(D): the generative model supplies P(D|G), and Bayesian inversion turns it into P(G|D). (Figures: the posterior landscape over grammars, and the inversion from generative model to posterior.)

  29-34. Stochastic hillclimbing (animation: a sequence of local steps moving uphill on the posterior landscape P(G|D)).

  35. Local optimum: hillclimbing can end up stuck on a peak of P(G|D) that is not the global maximum, as the sketch below illustrates.
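A generic stochastic hill-climber might look like the following Python sketch. The acceptance rule, the proposal function, and the toy one-dimensional "posterior" in the usage example are all my illustrative assumptions, not the lecture's actual procedure.

```python
import math
import random

def hillclimb(grammar, posterior, propose, steps=1000):
    """Greedy stochastic search on P(G|D): propose a random local change
    and keep it only if the (unnormalised) posterior does not decrease.
    Exactly this greediness is what strands the search on local optima."""
    best = posterior(grammar)
    for _ in range(steps):
        candidate = propose(grammar)          # small random edit to G
        score = posterior(candidate)
        if score >= best:
            grammar, best = candidate, score
    return grammar

# Toy usage: the "grammar" is one number, the "posterior" a two-peaked curve.
# Started at 0.0 with small steps, the climber usually settles on the lower
# peak near -2 instead of the global optimum near 3: a local optimum.
landscape = lambda g: math.exp(-(g - 3) ** 2) + 0.5 * math.exp(-(g + 2) ** 2)
print(hillclimb(0.0, landscape, lambda g: g + random.gauss(0, 0.1)))
```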

  36-38. Statistical inference (recap, animated): the generative model gives P(D|G); Bayesian inversion yields P(G|D).

  39. MAP (maximum a posteriori): keep only the single grammar that maximizes the posterior, argmax_G P(G|D).

  40. Maximum likelihood: choose the grammar that maximizes P(D|G), i.e., drop the prior P(G) from the objective.

  41. Learning a grammar
  • Choose a generative model
    – HMM, PCFG, PTSG, PTAG, …
  • Choose an objective function
    – Maximum likelihood, Bayesian, …
  • Choose an optimization strategy
    – Stochastic hillclimbing
  • Choose a dataset
    – Penn WSJ treebank
  • Find the generative model that maximizes the objective function on the dataset! (A minimal supervised instance of this recipe is sketched below.)
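In the supervised case the recipe has a well-known closed-form solution: the maximum-likelihood PCFG sets each rule probability P(A → rhs) to that rule's relative frequency among A's rewrites in the treebank. A minimal sketch, with a one-tree toy treebank standing in for Penn WSJ:

```python
from collections import Counter, defaultdict

# Trees are nested tuples (label, child, ...); toy data, not real WSJ.
treebank = [
    ("S",
     ("NP", ("N", "I")),
     ("VP", ("V", "seen"), ("NP", ("DET", "a"), ("N", "watch")))),
]

counts = defaultdict(Counter)          # counts[lhs][rhs] = frequency

def count_rules(tree):
    """Record every rule A -> children used in the tree."""
    if isinstance(tree, str):          # a word: no rule to record
        return
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[label][rhs] += 1
    for child in children:
        count_rules(child)

for tree in treebank:
    count_rules(tree)

# Relative-frequency estimation = maximum likelihood for supervised PCFGs.
pcfg = {lhs: {rhs: n / sum(c.values()) for rhs, n in c.items()}
        for lhs, c in counts.items()}
print(pcfg["NP"])                      # {('N',): 0.5, ('DET', 'N'): 0.5}
```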

  42. Does it work in practice? Two major issues for research
  • Efficiency: How can we optimize our objective functions given exponentially many grammars that assign exponentially many analyses to sentences? (One standard tool, dynamic programming, is sketched below.)
  • Accuracy: Which combination of generative models, objective functions and efficiency heuristics actually works best?
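The slides leave the efficiency question open at this point; one standard answer (my addition, not necessarily the lecture's) is dynamic programming. The CKY-style inside algorithm sums P(sentence | G) over the exponentially many parses in O(n³·|G|) time, for a PCFG in Chomsky normal form. A sketch, with an assumed grammar encoding described in the comments:

```python
from collections import defaultdict

# Assumed encoding (mine, not the slides'):
#   binary[(B, C)] = {A: P(A -> B C)},  lexical[word] = {A: P(A -> word)}
def inside(words, binary, lexical):
    """Total probability of the sentence under the PCFG, summed over all
    parses via dynamic programming instead of enumerating them."""
    n = len(words)
    chart = defaultdict(float)         # chart[i, j, A] = P(A =>* words[i:j])
    for i, w in enumerate(words):      # width-1 spans: lexical rules
        for A, p in lexical.get(w, {}).items():
            chart[i, i + 1, A] += p
    for span in range(2, n + 1):       # wider spans: binary rules
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for (B, C), heads in binary.items():
                    inner = chart[i, k, B] * chart[k, j, C]
                    if inner:
                        for A, p in heads.items():
                            chart[i, j, A] += p * inner
    return chart[0, n, "S"]
```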

  43-46. (figure slides)

  47-50. Evaluation (animated build): the sentence "The screen was a sea of red" receives a treebank parse and, via unsupervised induction/parsing, an unsupervised parse; the two parses are then compared.

  51-52. Evaluation
  • Precision: the fraction of constituents in the unsupervised parse that are also in the treebank parse; measures correctness.
  • Recall: the fraction of constituents in the treebank parse that are also in the unsupervised parse; measures completeness.
  • F-score: their harmonic mean, i.e. F = 2·P·R / (P+R).
  • Labels are usually ignored. (A scoring sketch follows below.)
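A minimal unlabeled-bracketing scorer in Python, assuming constituents are represented as (start, end) word spans; the example span sets are invented, not read off the slide:

```python
# Unlabeled bracketing scores; a constituent is a (start, end) word span.
def parseval(pred_spans, gold_spans):
    pred, gold = set(pred_spans), set(gold_spans)
    correct = len(pred & gold)
    precision = correct / len(pred) if pred else 0.0
    recall    = correct / len(gold) if gold else 0.0
    f = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f       # F is the harmonic mean of P and R

# "The screen was a sea of red" (slide 48); both span sets are invented.
gold = [(0, 7), (1, 7), (3, 7), (5, 7)]   # treebank parse
pred = [(0, 7), (1, 7), (3, 5), (5, 7)]   # unsupervised parse
print(parseval(pred, gold))               # (0.75, 0.75, 0.75)
```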
