Parsing Probabilistic Context Free Grammars CMSC 473/673 UMBC November 8th, 2017
Recap from last time…
Constituents Help Form Grammars
  constituent: a span of words that acts (syntactically) as a group; an “X phrase” (e.g., noun phrase)
  Examples:
    Baltimore is a great place to be.
    This house is a great place to be.
    This red house is a great place to be.
    This red house on the hill is a great place to be.
    This red house near the hill is a great place to be.
    This red house atop the hill is a great place to be.
    The hill is a great place to be.
  Grammar: S → NP VP; PP → P NP; NP → Det Noun; AdjP → Adj Noun; NP → Noun; VP → V NP; NP → Det AdjP; Noun → Baltimore; NP → NP PP
Context Free Grammar
  Grammar: S → NP VP; PP → P NP; NP → Det Noun; AdjP → Adj Noun; NP → Noun; VP → V NP; NP → Det AdjP; Noun → Baltimore; NP → NP PP
  Set of rewrite rules, comprised of terminals and non-terminals
  Terminals: the words in the language (the lexicon), e.g., Baltimore
  Non-terminals: symbols that can trigger rewrite rules, e.g., S, NP, Noun
  (Sometimes) Pre-terminals: symbols that can only trigger lexical rewrites, e.g., Noun
Generate from a Context Free Grammar
  Grammar: S → NP VP; PP → P NP; NP → Det Noun; AdjP → Adj Noun; NP → Noun; VP → V NP; NP → Det AdjP; Noun → Baltimore; NP → NP PP; …
  [Figure: derivation tree. S → NP VP; NP → Noun → Baltimore; VP → Verb (is) NP (a great city). Yield: Baltimore is a great city]
Assign Structure (Parse) with a Context Free Grammar
  Grammar: S → NP VP; PP → P NP; NP → Det Noun; AdjP → Adj Noun; NP → Noun; VP → V NP; NP → Det AdjP; Noun → Baltimore; NP → NP PP; …
  Sentence: Baltimore is a great city
  Bracket notation: [S [NP [Noun Baltimore]] [VP [Verb is] [NP a great city]]]
  S-expression: (S (NP (Noun Baltimore)) (VP (V is) (NP a great city)))
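The S-expression form is easy to read back into a tree data structure. The sketch below is one hedged way to do it; the function names `parse_sexpr` and `leaves` and the nested-tuple representation are illustrative assumptions, not something the slides prescribe.

```python
def parse_sexpr(text):
    """Read an S-expression such as "(S (NP ...) (VP ...))" into nested tuples.

    Each subtree becomes (label, child, child, ...); words stay plain strings.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                  # consume "("
            label = tokens[pos]
            pos += 1
            children = []
            while tokens[pos] != ")":
                children.append(read())
            pos += 1                  # consume ")"
            return (label, *children)
        word = tokens[pos]
        pos += 1
        return word

    return read()


def leaves(tree):
    """Return the words of the tree from left to right (its yield)."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in leaves(child)]


tree = parse_sexpr("(S (NP (Noun Baltimore)) (VP (V is) (NP a great city)))")
print(tree[0])                  # root label
print(" ".join(leaves(tree)))   # the sentence the tree covers
```

Reading the leaves left to right recovers the original sentence, which is a quick sanity check that the bracketing was read correctly.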
Parsing as a Core NLP Problem
  [Diagram: sentences 1-4 and a grammar feed a parser; an evaluation scores the parser's output against gold (correct) reference trees. Other labels in the diagram: “Other NLP task (entity coref., MT, Q&A, …)” and “independent operations”]
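Grammars Aren’t Just for Syntax
  [Figure: word-structure tree for “overgeneralization” with node labels N, V, A: overgeneralization (N) = over- + generalization (N); generalization (N) = generalize (V) + -tion; generalize (V) = general (A) + -ize]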
Clearly Show Ambiguity… But Not Necessarily All Ambiguity
  PP Attachment (a common source of errors, even still today)
  Semantic Ambiguities
  Examples:
    I ate the meal with friends
    I ate the meal with gusto
    I ate the meal with a fork
  Issue 1: Which grammar?
  Issue 2: Discourse demands flexibility
  [Figure: tree fragments with nodes S, NP, VP, PP]
How Do We Robustly Handle Ambiguities? Add probabilities (to what?)
Probabilistic Context Free Grammar
  Grammar: S → NP VP; PP → P NP; NP → Det Noun; AdjP → Adj Noun; NP → Noun; VP → V NP; NP → Det AdjP; Noun → Baltimore; NP → NP PP; …
  Set of weighted (probabilistic) rewrite rules, comprised of terminals and non-terminals
  Terminals: the words in the language (the lexicon), e.g., Baltimore
  Non-terminals: symbols that can trigger rewrite rules, e.g., S, NP, Noun
  (Sometimes) Pre-terminals: symbols that can only trigger lexical rewrites, e.g., Noun
  Q: What are the distributions? What must sum to 1?
Probabilistic Context Free Grammar
  Weighted grammar:
    1.0     S → NP VP
    1.0     PP → P NP
    .4      NP → Det Noun
    .34     AdjP → Adj Noun
    .3      NP → Noun
    .26     VP → V NP
    .2      NP → Det AdjP
    .0003   Noun → Baltimore
    .1      NP → NP PP
    …
  Q: What are the distributions? What must sum to 1?
  A: P(X → Y Z | X). For each non-terminal X, the probabilities of all rules that rewrite X sum to 1.
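The normalization condition can be checked mechanically. The sketch below sums the rule weights from this slide per left-hand side; the tuple encoding and the helper name `mass_by_lhs` are assumptions made for illustration. Because the slide's rule list ends in “…”, left-hand sides such as VP and Noun are expected to come out below 1 here.

```python
from collections import defaultdict

# (left-hand side, right-hand side, probability), copied from the slide.
# The slide's "…" means the list is truncated, so some totals are partial.
RULES = [
    ("S", ("NP", "VP"), 1.0),
    ("PP", ("P", "NP"), 1.0),
    ("NP", ("Det", "Noun"), 0.4),
    ("NP", ("Noun",), 0.3),
    ("NP", ("Det", "AdjP"), 0.2),
    ("NP", ("NP", "PP"), 0.1),
    ("AdjP", ("Adj", "Noun"), 0.34),
    ("VP", ("V", "NP"), 0.26),
    ("Noun", ("Baltimore",), 0.0003),
]


def mass_by_lhs(rules):
    """Total probability mass of the listed rules, grouped by left-hand side."""
    totals = defaultdict(float)
    for lhs, _rhs, prob in rules:
        totals[lhs] += prob
    return dict(totals)


totals = mass_by_lhs(RULES)
for lhs, total in sorted(totals.items()):
    complete = abs(total - 1.0) < 1e-9
    note = "sums to 1" if complete else "partial (remaining rules elided on the slide)"
    print(f"{lhs:5s} {total:.4f}  {note}")
```

Of the non-terminals with more than one listed rule, only NP appears in full: .4 + .3 + .2 + .1 = 1.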
Probabilistic Context Free Grammar
  Tree: [S [NP [Noun Baltimore]] [VP [Verb is] [NP a great city]]]
  p(tree) = p(S → NP VP) * p(NP → Noun) * p(Noun → Baltimore) * p(VP → Verb NP) * p(Verb → is) * p([NP a great city])
  i.e., the product of probabilities of individual rules used in the derivation (the last factor stands for the NP subtree over “a great city”)
Log Probabilistic Context Free Grammar
  Tree: [S [NP [Noun Baltimore]] [VP [Verb is] [NP a great city]]]
  lp(tree) = lp(S → NP VP) + lp(NP → Noun) + lp(Noun → Baltimore) + lp(VP → Verb NP) + lp(Verb → is) + lp([NP a great city])
  i.e., the sum of log probabilities of individual rules used in the derivation (the last term stands for the NP subtree over “a great city”)
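A small numeric sketch of the two scoring forms, product of probabilities versus sum of log probabilities. The first four values are taken from the weighted grammar earlier in the deck (treating V and Verb as the same symbol, which is an assumption); the values for Verb → is and for the “a great city” subtree are not given in the slides and are hypothetical placeholders.

```python
import math

# (rule used in the derivation, probability)
DERIVATION = [
    ("S -> NP VP", 1.0),
    ("NP -> Noun", 0.3),
    ("Noun -> Baltimore", 0.0003),
    ("VP -> Verb NP", 0.26),
    ("Verb -> is", 0.1),               # hypothetical value, not from the slides
    ("[NP a great city]", 0.01),       # hypothetical subtree probability
]


def tree_prob(derivation):
    """Product of the probabilities of the rules used in the derivation."""
    p = 1.0
    for _rule, prob in derivation:
        p *= prob
    return p


def tree_logprob(derivation):
    """Sum of the log probabilities of the rules used in the derivation."""
    return sum(math.log(prob) for _rule, prob in derivation)


p = tree_prob(DERIVATION)
lp = tree_logprob(DERIVATION)
print(p, lp)
print(math.isclose(math.exp(lp), p))   # both forms describe the same score
```

Working in log space replaces a product of many small numbers with a sum, which avoids floating-point underflow on long derivations.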
Estimating PCFGs
  Attempt 1:
  • Get access to a treebank (corpus of syntactically annotated sentences), e.g., the English Penn Treebank
  • Count productions
  • Smooth these counts
  • This gets ~75 F1
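The count-and-smooth recipe above can be sketched on a toy scale. Everything concrete here is an assumption for illustration: the two-tree “treebank” is invented, trees are nested tuples, and the smoothing is a simple add-k over the rules observed for each left-hand side, which is only one of many possible smoothing schemes.

```python
from collections import Counter, defaultdict

# Hypothetical toy treebank: each tree is (label, child, ...); leaves are words.
TREEBANK = [
    ("S", ("NP", ("Noun", "Baltimore")),
          ("VP", ("V", "is"), ("NP", ("Noun", "home")))),
    ("S", ("NP", ("Det", "the"), ("Noun", "city")),
          ("VP", ("V", "is"), ("NP", ("Noun", "Baltimore")))),
]


def productions(tree):
    """Yield every (lhs, rhs) production used in a tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from productions(child)


def estimate(treebank, k=0.0):
    """Relative-frequency rule probabilities with optional add-k smoothing."""
    counts = Counter(p for tree in treebank for p in productions(tree))
    rhs_by_lhs = defaultdict(list)
    for lhs, rhs in counts:
        rhs_by_lhs[lhs].append(rhs)
    probs = {}
    for lhs, rhss in rhs_by_lhs.items():
        total = sum(counts[(lhs, rhs)] for rhs in rhss) + k * len(rhss)
        for rhs in rhss:
            probs[(lhs, rhs)] = (counts[(lhs, rhs)] + k) / total
    return probs


probs = estimate(TREEBANK)
smoothed = estimate(TREEBANK, k=1.0)
for (lhs, rhs), prob in sorted(probs.items()):
    print(f"{prob:.2f}  {lhs} -> {' '.join(rhs)}")
```

With `k=0.0` this is the plain maximum-likelihood estimate; a larger `k` moves probability mass from frequent rules toward rare ones. Either way, the probabilities for each left-hand side sum to 1.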
Probabilistic Context Free Grammar (PCFG) Tasks
  • Find the most likely parse (for an observed sequence)
  • Calculate the (log) likelihood of an observed sequence w_1, …, w_N
  • Learn the grammar parameters
Probabilistic Context Free Grammar (PCFG) Tasks
  • Find the most likely (or any) parse (for an observed sequence)
  • Calculate the (log) likelihood of an observed sequence w_1, …, w_N
  • Learn the grammar parameters
Parsing with a CFG
  • Top-down backtracking (brute force)
  • CKY Algorithm: dynamic bottom-up
  • Earley’s Algorithm: dynamic top-down
CKY Precondition
  Grammar must be in Chomsky Normal Form (CNF)
  • X → Y Z (non-terminal → non-terminal non-terminal): binary rules can only involve non-terminals
  • X → a (non-terminal → terminal): unary rules can only involve terminals
  • no ternary (+) rules
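The precondition above can be tested rule by rule. This is a minimal sketch; the function name `is_cnf` and the `(lhs, rhs_tuple)` encoding are illustrative assumptions.

```python
def is_cnf(rules, nonterminals):
    """Return True if every rule is X -> Y Z (non-terminals) or X -> a (terminal)."""
    for lhs, rhs in rules:
        if lhs not in nonterminals:
            return False
        binary_ok = len(rhs) == 2 and all(sym in nonterminals for sym in rhs)
        lexical_ok = len(rhs) == 1 and rhs[0] not in nonterminals
        if not (binary_ok or lexical_ok):
            return False
    return True


NONTERMINALS = {"S", "NP", "VP", "V", "Noun"}

# Binary rules over non-terminals plus lexical rules: satisfies the precondition.
cnf_rules = [
    ("S", ("NP", "VP")),
    ("VP", ("V", "NP")),
    ("NP", ("Baltimore",)),
    ("V", ("is",)),
]

# NP -> Noun is a unary rule over a non-terminal, so this grammar is not in CNF.
unary_rules = [("S", ("NP", "VP")), ("NP", ("Noun",)), ("Noun", ("Baltimore",))]

# A ternary rule also violates the precondition.
ternary_rules = [("S", ("NP", "V", "NP"))]

print(is_cnf(cnf_rules, NONTERMINALS))
print(is_cnf(unary_rules, NONTERMINALS))
print(is_cnf(ternary_rules, NONTERMINALS))
```

Note that the grammar used earlier in the deck contains NP → Noun, so it would need to be converted before CKY can be applied to it.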
“Papa ate the caviar with a spoon”
  Word boundaries: 0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6 spoon 7
  Entire grammar (assume uniform weights; example from Jason Eisner):
    S → NP VP        NP → Papa
    NP → Det N       N → caviar
    NP → NP PP       N → spoon
    VP → V NP        V → spoon
    VP → VP PP       V → ate
    PP → P NP        P → with
                     Det → the
                     Det → a
  Goal: (S, 0, 7)
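To make the goal item (S, 0, 7) concrete, here is a hedged sketch of a textbook CKY recognizer run on this grammar and sentence. It ignores weights and only answers whether S can span positions 0 to 7; the chart layout and names are assumptions, and this is not necessarily the formulation the lecture goes on to present.

```python
from collections import defaultdict

# Binary rules as (lhs, left child, right child); lexical rules as (lhs, word).
BINARY = [
    ("S", "NP", "VP"), ("NP", "Det", "N"), ("NP", "NP", "PP"),
    ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP"),
]
LEXICAL = [
    ("NP", "Papa"), ("N", "caviar"), ("N", "spoon"), ("V", "spoon"),
    ("V", "ate"), ("P", "with"), ("Det", "the"), ("Det", "a"),
]


def cky_recognize(words, binary, lexical):
    """Fill chart[(i, j)] with every non-terminal that can span words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    # Width-1 spans come from the lexical rules.
    for i, word in enumerate(words):
        for lhs, terminal in lexical:
            if terminal == word:
                chart[(i, i + 1)].add(lhs)
    # Wider spans combine two adjacent, already-filled spans (bottom-up).
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for lhs, left, right in binary:
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        chart[(i, j)].add(lhs)
    return chart


words = "Papa ate the caviar with a spoon".split()
chart = cky_recognize(words, BINARY, LEXICAL)
print("S" in chart[(0, len(words))])   # is the goal item (S, 0, 7) derivable?
```

The three nested span loops give the familiar cubic dependence on sentence length, multiplied by the number of binary rules.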
“Papa ate the caviar with a spoon” (same grammar; example from Jason Eisner)
  Check 1: What are the non-terminals? S, NP, VP, PP, N, V, P, Det
  Check 2: What are the terminals? Papa, caviar, spoon, ate, with, the, a
  Check 3: What are the pre-terminals?
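The three checks can also be computed directly from the rule list. The sketch below assumes the definition given earlier in the deck, that a pre-terminal is a symbol that can only trigger lexical rewrites; the rule encoding and variable names are illustrative.

```python
# The entire example grammar as (lhs, rhs) pairs.
RULES = [
    ("S", ("NP", "VP")), ("NP", ("Det", "N")), ("NP", ("NP", "PP")),
    ("VP", ("V", "NP")), ("VP", ("VP", "PP")), ("PP", ("P", "NP")),
    ("NP", ("Papa",)), ("N", ("caviar",)), ("N", ("spoon",)),
    ("V", ("spoon",)), ("V", ("ate",)), ("P", ("with",)),
    ("Det", ("the",)), ("Det", ("a",)),
]

# Check 1: non-terminals are the symbols that appear on a left-hand side.
nonterminals = {lhs for lhs, _rhs in RULES}

# Check 2: terminals are right-hand-side symbols that are never rewritten.
terminals = {sym for _lhs, rhs in RULES for sym in rhs} - nonterminals

# Check 3: pre-terminals are non-terminals whose rules are all lexical.
preterminals = {
    x for x in nonterminals
    if all(len(rhs) == 1 and rhs[0] in terminals
           for lhs, rhs in RULES if lhs == x)
}

print(sorted(nonterminals))
print(sorted(terminals))
print(sorted(preterminals))
```

Under that definition NP is not a pre-terminal even though NP → Papa is a lexical rule, because NP also has the non-lexical rewrites NP → Det N and NP → NP PP.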