Taaltheorie en Taalverwerking (Language Theory and Language Processing)

  1. Taaltheorie en Taalverwerking
BSc Artificial Intelligence
Raquel Fernández, Institute for Logic, Language, and Computation
Winter 2012, lecture 3b

  2. Plan for Today
Theoretical session:
• PCFGs
• Exam
Practical session:
• Projects: teams & topics
• Work with tutors on problematic homework.

  3. Ambiguity
Ambiguity is pervasive in natural language. Some NLP tasks may do without disambiguation, but most natural language understanding tasks need to disambiguate to get at the intended interpretation.

  4. Probabilistic Parsing
• The abstract parsers we looked at can represent ambiguities (by returning more than one parse tree) but cannot resolve them.
• Main idea behind probabilistic parsing: compute the probability of each possible tree given a sentence and choose the most probable one.
• At first glance, computing the probability of a parse tree seems difficult: trees are complex structures, and the set of all possible trees generated by a grammar will most likely be infinite.
• Probabilistic CFGs allow us to compute the probability of a parse tree from the probabilities of the grammar rules used to derive it.

  5. Probabilistic CFGs
A PCFG is a CFG where each rule is augmented with a probability:
• Σ: a finite alphabet of terminal symbols
• N: a finite set of non-terminal symbols
• S: a special symbol S ∈ N called the start symbol
• R: a set of rules, each of the form A → β [p], where
  ∗ A is a non-terminal symbol
  ∗ β is any sequence of terminal or non-terminal symbols, including ε
  ∗ p is a number between 0 and 1 expressing the probability that A will be expanded to the sequence β, which we can write as P(A → β)
  ∗ for any non-terminal A, the sum of the probabilities for all rules A → β must be one: ∑β P(A → β) = 1
P(A → β) is a conditional probability P(β | A): the probability of observing a β once we have observed an A.

  6. An Example PCFG
Each probability is constrained to be non-negative, and for any non-terminal A, the probabilities for all rules with that non-terminal on the left-hand side of the rule must sum to 1.
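
To make the definition concrete, here is a minimal Python sketch of one way such a grammar could be represented. The toy grammar below is an assumption for illustration (the example grammar from the slide is not reproduced in this transcript); the check at the end verifies the sum-to-1 constraint.

    # A toy PCFG: each non-terminal maps to a list of (expansion, probability)
    # pairs. Expansions are tuples of symbols; lowercase strings are terminals.
    # The grammar and its probabilities are illustrative assumptions.
    toy_pcfg = {
        "S":  [(("NP", "VP"), 1.0)],
        "NP": [(("DT", "NN"), 0.8), (("NP", "PP"), 0.2)],
        "VP": [(("Vt", "NP"), 0.7), (("VP", "PP"), 0.3)],
        "PP": [(("P", "NP"), 1.0)],
        "DT": [(("the",), 1.0)],
        "NN": [(("man",), 0.5), (("dog",), 0.5)],
        "Vt": [(("saw",), 1.0)],
        "P":  [(("with",), 1.0)],
    }

    def is_proper(pcfg, tol=1e-9):
        # For every non-terminal A: the sum over beta of P(A -> beta) must be 1.
        return all(abs(sum(p for _, p in rules) - 1.0) < tol
                   for rules in pcfg.values())

    assert is_proper(toy_pcfg)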

  7. Probability of a Parse Tree
The probability of a parse tree for a given sentence, P(t, S), is the product of the probabilities of all the grammar rules used in the derivation of the sentence. For the tree

[ S [ NP [ DT the ] [ NN man ]] [ VP [ Vt saw ] [ NP [ DT the ] [ NN dog ]]]]

we get

P(t, S) = p(S → NP VP) × p(NP → DT NN) × p(VP → Vt NP) × p(NP → DT NN) × p(DT → the) × p(NN → man) × p(Vt → saw) × p(DT → the) × p(NN → dog)
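
The product over rules can be computed with a simple recursion over the tree. A minimal sketch, assuming trees encoded as nested tuples and an illustrative rule-probability table (the numbers are assumptions, not estimates from data):

    # Trees as nested tuples: (label, child, child, ...); leaves are strings.
    # rule_prob maps (lhs, expansion) to an assumed probability.
    rule_prob = {
        ("S", ("NP", "VP")): 1.0,
        ("NP", ("DT", "NN")): 0.8,
        ("VP", ("Vt", "NP")): 0.7,
        ("DT", ("the",)): 1.0,
        ("NN", ("man",)): 0.5,
        ("NN", ("dog",)): 0.5,
        ("Vt", ("saw",)): 1.0,
    }

    def tree_prob(tree):
        # P(t) = product of P(A -> beta) over every rule used in the tree.
        if isinstance(tree, str):        # terminal symbol: contributes no rule
            return 1.0
        label, children = tree[0], tree[1:]
        expansion = tuple(c if isinstance(c, str) else c[0] for c in children)
        p = rule_prob[(label, expansion)]
        for child in children:
            p *= tree_prob(child)
        return p

    t = ("S",
         ("NP", ("DT", "the"), ("NN", "man")),
         ("VP", ("Vt", "saw"), ("NP", ("DT", "the"), ("NN", "dog"))))
    print(tree_prob(t))  # 1.0 * 0.8 * 1.0 * 0.5 * 0.7 * 1.0 * 0.8 * 1.0 * 0.5 ≈ 0.112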

  8. Probability of a Parse Tree
What's the probability of this tree?

[ S [ NP [ DT the ] [ NN man ]] [ VP [ Vt saw ] [ NP [ DT the ] [ NN dog ]]]]

And the probabilities of

[ S [ NP the man ] [ VP saw [ NP the dog [ PP with the telescope ]]]]

and

[ S [ NP the man ] [ VP saw [ NP the dog ] [ PP with the telescope ]]] ?

  9. Disambiguation with PCFGs
These probabilities can provide a criterion for disambiguation: they give us a ranking over possible parses for any sentence – we can simply choose the parse tree with the highest probability.
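
In code, disambiguation is then just an argmax over the candidate parses. A minimal sketch, assuming some parser has already produced (tree, probability) pairs (hard-coded placeholders here); for long sentences one would compare log-probabilities to avoid floating-point underflow:

    import math

    # Placeholder candidates; a real parser would return actual trees.
    candidates = [
        ("tree with high PP attachment", 3.2e-8),
        ("tree with low PP attachment", 1.1e-7),
    ]

    best_tree, best_p = max(candidates, key=lambda tp: tp[1])

    # Equivalent comparison in log-space, numerically safer:
    best_tree_log, _ = max(((t, math.log(p)) for t, p in candidates),
                           key=lambda tp: tp[1])
    assert best_tree_log == best_tree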

  10. Learning PCFGs: Treebanks
How do we know the probability of each rule? We can estimate these probabilities from a corpus of parsed sentences – a treebank.
• the best-known treebank is the Penn Treebank, which includes parse trees of sentences from different corpora, and in different languages. http://www.cis.upenn.edu/~treebank/
• a treebank is typically built by automatic parsing plus manual correction of the parse trees.
• standardization of the types of grammar rules allowed, the POS tags, and generally all non-terminal symbols is of course critical.
• the parsed sentences in a treebank implicitly constitute a grammar of the language: we can extract the CFG rules used to derive them.

  11. Treebanks
Treebanks are useful tools to study syntactic phenomena.

(S (NP (NNP John))
   (VP (VBZ loves)
       (NP (NNP Mary)))
   (. .))

( (CODE SpeakerA4 .))
( (S (INTJ Well) ,
     (EDITED (RM [) (NP-SBJ I) , (IP +))
     (NP-SBJ I) (RS ])
     (VP think
         (SBAR 0
               (S (NP-SBJ it)
                  (VP 's
                      (NP-PRD a (ADJP pretty good) idea)))))
     . E_S))

The second bracketed tree is from the Penn Treebank annotation of the Switchboard corpus. For more resources, check out http://en.wikipedia.org/wiki/Treebank
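
Bracketed trees like these can be loaded directly with NLTK, assuming the nltk package is available; Tree.fromstring parses the bracketed notation, and productions() lists the CFG rules used in the tree:

    from nltk import Tree

    # The simple example tree from this slide, in Penn Treebank notation.
    t = Tree.fromstring(
        "(S (NP (NNP John)) (VP (VBZ loves) (NP (NNP Mary))) (. .))")

    t.pretty_print()       # renders the tree as ASCII art
    for prod in t.productions():
        print(prod)        # S -> NP VP ., NP -> NNP, NNP -> 'John', ...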

  12. Learning PCFGs from Treebanks
For each non-terminal A, we want to compute the probability of each rule A → β that expands A:
• we count how often the rule A → β occurs in the treebank, and
• divide that by the total number of rules that expand A (the total number of occurrences of A in the treebank):

P(A → β) = Total(A → β) / Total(A)

For example, if the rule VP → Vt NP is seen 105 times in our corpus, and the non-terminal VP is seen 1000 times, then

P(VP → Vt NP) = 105 / 1000 = 0.105
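
This relative-frequency estimate takes only a few lines once the trees are loaded. A sketch using NLTK, where the two toy trees are assumptions standing in for a real treebank; NLTK also ships this estimation as a built-in, nltk.grammar.induce_pcfg:

    from collections import Counter
    from nltk import Tree

    # Two toy parsed sentences standing in for a real treebank.
    treebank = [
        Tree.fromstring("(S (NP (DT the) (NN man))"
                        " (VP (Vt saw) (NP (DT the) (NN dog))))"),
        Tree.fromstring("(S (NP (DT the) (NN dog))"
                        " (VP (Vt saw) (NP (DT the) (NN man))))"),
    ]

    rules = [p for t in treebank for p in t.productions()]
    rule_counts = Counter(rules)                  # Total(A -> beta)
    lhs_counts = Counter(p.lhs() for p in rules)  # Total(A)

    # P(A -> beta) = Total(A -> beta) / Total(A)
    pcfg = {p: n / lhs_counts[p.lhs()] for p, n in rule_counts.items()}
    for rule, prob in pcfg.items():
        print(rule, round(prob, 3))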
