Taaltheorie en Taalverwerking (Language Theory and Language Processing)

  1. Taaltheorie en Taalverwerking
BSc Artificial Intelligence
Raquel Fernández, Institute for Logic, Language, and Computation
Winter 2012, lecture 3b

  2. Plan for Today
Theoretical session:
• PCFGs
• Exam
Practical session:
• Projects: teams & topics
• Work with tutors on problematic homework.

  3. Ambiguity
Ambiguity is pervasive in natural language. Some NLP tasks may do without disambiguation, but most natural language understanding tasks need to disambiguate to get at the intended interpretation.

  4. Probabilistic Parsing
• The abstract parsers we looked at can represent ambiguities (by returning more than one parse tree) but cannot resolve them.
• Main idea behind probabilistic parsing: compute the probability of each possible tree given a sentence and choose the most probable one.
• At first glance, computing the probability of a parse tree seems difficult: trees are complex structures, and the set of all possible trees generated by a grammar will most likely be infinite.
• Probabilistic CFGs allow us to compute the probability of a parse tree from the probabilities of the grammar rules used to derive it.

  5. Probabilistic CFGs
A PCFG is a CFG where each rule is augmented with a probability:
• Σ: a finite alphabet of terminal symbols
• N: a finite set of non-terminal symbols
• S: a special symbol S ∈ N called the start symbol
• R: a set of rules, each of the form A → β [p], where
  ∗ A is a non-terminal symbol
  ∗ β is any sequence of terminal or non-terminal symbols, including ε
  ∗ p is a number between 0 and 1 expressing the probability that A will be expanded to the sequence β, which we can write as P(A → β)
  ∗ for any non-terminal A, the sum of the probabilities for all rules A → β must be one: ∑β P(A → β) = 1
P(A → β) is a conditional probability P(β | A): the probability of observing a β once we have observed an A.

  6. An Example PCFG
Each probability is constrained to be non-negative, and for any non-terminal A, the probabilities for all rules with that non-terminal on the left-hand side of the rule must sum to 1.
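
To make the definition concrete, here is a minimal Python sketch of one way such a grammar could be represented. The toy grammar below is an assumption for illustration (the example grammar from the slide is not reproduced in this transcript); the check at the end verifies the sum-to-1 constraint.

    # A toy PCFG: each non-terminal maps to a list of (expansion, probability)
    # pairs. Expansions are tuples of symbols; lowercase strings are terminals.
    # The grammar and its probabilities are illustrative assumptions.
    toy_pcfg = {
        "S":  [(("NP", "VP"), 1.0)],
        "NP": [(("DT", "NN"), 0.8), (("NP", "PP"), 0.2)],
        "VP": [(("Vt", "NP"), 0.7), (("VP", "PP"), 0.3)],
        "PP": [(("P", "NP"), 1.0)],
        "DT": [(("the",), 1.0)],
        "NN": [(("man",), 0.5), (("dog",), 0.5)],
        "Vt": [(("saw",), 1.0)],
        "P":  [(("with",), 1.0)],
    }

    def is_proper(pcfg, tol=1e-9):
        # For every non-terminal A: the sum over beta of P(A -> beta) must be 1.
        return all(abs(sum(p for _, p in rules) - 1.0) < tol
                   for rules in pcfg.values())

    assert is_proper(toy_pcfg)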

  7. Probability of a Parse Tree
The probability of a parse tree for a given sentence, P(t, S), is the product of the probabilities of all the grammar rules used in the derivation of the sentence. For the tree

[ S [ NP [ DT the ] [ NN man ]] [ VP [ Vt saw ] [ NP [ DT the ] [ NN dog ]]]]

we get

P(t, S) = p(S → NP VP) × p(NP → DT NN) × p(VP → Vt NP) × p(NP → DT NN) × p(DT → the) × p(NN → man) × p(Vt → saw) × p(DT → the) × p(NN → dog)
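
The product over rules can be computed with a simple recursion over the tree. A minimal sketch, assuming trees encoded as nested tuples and an illustrative rule-probability table (the numbers are assumptions, not estimates from data):

    # Trees as nested tuples: (label, child, child, ...); leaves are strings.
    # rule_prob maps (lhs, expansion) to an assumed probability.
    rule_prob = {
        ("S", ("NP", "VP")): 1.0,
        ("NP", ("DT", "NN")): 0.8,
        ("VP", ("Vt", "NP")): 0.7,
        ("DT", ("the",)): 1.0,
        ("NN", ("man",)): 0.5,
        ("NN", ("dog",)): 0.5,
        ("Vt", ("saw",)): 1.0,
    }

    def tree_prob(tree):
        # P(t) = product of P(A -> beta) over every rule used in the tree.
        if isinstance(tree, str):        # terminal symbol: contributes no rule
            return 1.0
        label, children = tree[0], tree[1:]
        expansion = tuple(c if isinstance(c, str) else c[0] for c in children)
        p = rule_prob[(label, expansion)]
        for child in children:
            p *= tree_prob(child)
        return p

    t = ("S",
         ("NP", ("DT", "the"), ("NN", "man")),
         ("VP", ("Vt", "saw"), ("NP", ("DT", "the"), ("NN", "dog"))))
    print(tree_prob(t))  # 1.0 * 0.8 * 1.0 * 0.5 * 0.7 * 1.0 * 0.8 * 1.0 * 0.5 ≈ 0.112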

  8. Probability of a Parse Tree
What's the probability of this tree?

[ S [ NP [ DT the ] [ NN man ]] [ VP [ Vt saw ] [ NP [ DT the ] [ NN dog ]]]]

And the probabilities of

[ S [ NP the man ] [ VP saw [ NP the dog [ PP with the telescope ]]]]

and

[ S [ NP the man ] [ VP saw [ NP the dog ] [ PP with the telescope ]]] ?

  9. Disambiguation with PCFGs
These probabilities can provide a criterion for disambiguation: they give us a ranking over possible parses for any sentence – we can simply choose the parse tree with the highest probability.
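
In code, disambiguation is then just an argmax over the candidate parses. A minimal sketch, assuming some parser has already produced (tree, probability) pairs (hard-coded placeholders here); for long sentences one would compare log-probabilities to avoid floating-point underflow:

    import math

    # Placeholder candidates; a real parser would return actual trees.
    candidates = [
        ("tree with high PP attachment", 3.2e-8),
        ("tree with low PP attachment", 1.1e-7),
    ]

    best_tree, best_p = max(candidates, key=lambda tp: tp[1])

    # Equivalent comparison in log-space, numerically safer:
    best_tree_log, _ = max(((t, math.log(p)) for t, p in candidates),
                           key=lambda tp: tp[1])
    assert best_tree_log == best_tree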

  10. Learning PCFGs: Treebanks
How do we know the probability of each rule? We can estimate these probabilities from a corpus of parsed sentences – a treebank.
• the best-known treebank is the Penn Treebank, which includes parse trees of sentences from different corpora, and in different languages. http://www.cis.upenn.edu/~treebank/
• a treebank is typically built by automatic parsing plus manual correction of the parse trees.
• standardization of the types of grammar rules allowed, the POS tags, and generally all non-terminal symbols is of course critical.
• the parsed sentences in a treebank implicitly constitute a grammar of the language: we can extract the CFG rules used to derive them.

  11. Treebanks
Treebanks are useful tools to study syntactic phenomena.

(S (NP (NNP John))
   (VP (VBZ loves)
       (NP (NNP Mary)))
   (. .))

( (CODE SpeakerA4 .))
( (S (INTJ Well) ,
     (EDITED (RM [) (NP-SBJ I) , (IP +))
     (NP-SBJ I) (RS ])
     (VP think
         (SBAR 0
               (S (NP-SBJ it)
                  (VP 's
                      (NP-PRD a (ADJP pretty good) idea)))))
     . E_S))

The second bracketed tree is from the Penn Treebank annotation of the Switchboard corpus. For more resources, check out http://en.wikipedia.org/wiki/Treebank
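
Bracketed trees like these can be loaded directly with NLTK, assuming the nltk package is available; Tree.fromstring parses the bracketed notation, and productions() lists the CFG rules used in the tree:

    from nltk import Tree

    # The simple example tree from this slide, in Penn Treebank notation.
    t = Tree.fromstring(
        "(S (NP (NNP John)) (VP (VBZ loves) (NP (NNP Mary))) (. .))")

    t.pretty_print()       # renders the tree as ASCII art
    for prod in t.productions():
        print(prod)        # S -> NP VP ., NP -> NNP, NNP -> 'John', ...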

  12. Learning PCFGs from Treebanks
For each non-terminal A, we want to compute the probability of each rule A → β that expands A:
• we count how often the rule A → β occurs in the treebank, and
• divide that by the total number of rules that expand A (the total number of occurrences of A in the treebank):

P(A → β) = Total(A → β) / Total(A)

For example, if the rule VP → Vt NP is seen 105 times in our corpus, and the non-terminal VP is seen 1000 times, then

P(VP → Vt NP) = 105 / 1000 = 0.105
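
This relative-frequency estimate takes only a few lines once the trees are loaded. A sketch using NLTK, where the two toy trees are assumptions standing in for a real treebank; NLTK also ships this estimation as a built-in, nltk.grammar.induce_pcfg:

    from collections import Counter
    from nltk import Tree

    # Two toy parsed sentences standing in for a real treebank.
    treebank = [
        Tree.fromstring("(S (NP (DT the) (NN man))"
                        " (VP (Vt saw) (NP (DT the) (NN dog))))"),
        Tree.fromstring("(S (NP (DT the) (NN dog))"
                        " (VP (Vt saw) (NP (DT the) (NN man))))"),
    ]

    rules = [p for t in treebank for p in t.productions()]
    rule_counts = Counter(rules)                  # Total(A -> beta)
    lhs_counts = Counter(p.lhs() for p in rules)  # Total(A)

    # P(A -> beta) = Total(A -> beta) / Total(A)
    pcfg = {p: n / lhs_counts[p.lhs()] for p, n in rule_counts.items()}
    for rule, prob in pcfg.items():
        print(rule, round(prob, 3))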
