Dependency Grammars and Parsing CMSC 473/673 UMBC
Outline Review: PCFGs and CKY Dependency Grammars Dependency Parsing (Shift-reduce) From syntax to semantics
Probabilistic Context Free Grammar
1.0 S → NP VP        1.0 PP → P NP
.4 NP → Det Noun     .34 AdjP → Adj Noun
.3 NP → Noun         .26 VP → V NP
.2 NP → Det AdjP     .0003 Noun → Baltimore
.1 NP → NP PP        …
A set of weighted (probabilistic) rewrite rules, comprised of terminals and non-terminals
Terminals: the words in the language (the lexicon), e.g., Baltimore
Non-terminals: symbols that can trigger rewrite rules, e.g., S, NP, Noun
(Sometimes) Pre-terminals: symbols that can only trigger lexical rewrites, e.g., Noun
Q: What are the distributions? What must sum to 1?
A: P(X → Y Z | X): the probabilities of all rules sharing the same left-hand side X must sum to 1
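As a quick sketch of the constraint above: encoding a few of the slide's rules as a dict (the encoding itself is our assumption, not from the slides), we can verify that the rule probabilities for each left-hand side sum to 1.

```python
from collections import defaultdict

# A subset of the slide's PCFG, as (LHS, RHS) -> probability.
rules = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "Noun")): 0.4,
    ("NP", ("Noun",)): 0.3,
    ("NP", ("Det", "AdjP")): 0.2,
    ("NP", ("NP", "PP")): 0.1,
}

def check_distributions(rules):
    # Sum rule probabilities per LHS; each LHS defines one distribution.
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return {lhs: abs(total - 1.0) < 1e-9 for lhs, total in totals.items()}

print(check_distributions(rules))
```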
Probabilistic Context Free Grammar
For the tree over “Baltimore is a great city”:
p(tree) = p(S → NP VP) * p(NP → Noun) * p(Noun → Baltimore) * p(VP → Verb NP) * p(Verb → is) * p(NP → a great city)
= the product of the probabilities of the individual rules used in the derivation
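A minimal sketch of this product: only S → NP VP, NP → Noun, and Noun → Baltimore carry probabilities from the slides' grammar; the remaining rule probabilities are made up for illustration.

```python
import math

# Hypothetical rule probabilities for the derivation of
# "Baltimore is a great city".  The last three values are invented.
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Noun",)): 0.3,
    ("Noun", ("Baltimore",)): 0.0003,
    ("VP", ("Verb", "NP")): 0.5,       # assumed
    ("Verb", ("is",)): 0.01,           # assumed
    ("NP", ("a", "great", "city")): 0.001,  # assumed, flattened subtree
}

# The tree probability is the product of the rules used in the derivation.
derivation = list(rule_prob)
p_tree = math.prod(rule_prob[r] for r in derivation)
print(p_tree)
```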
0 1 2 3 4 5 6 7
“Papa ate the caviar with a spoon”
Entire grammar (assume uniform weights):
S → NP VP      NP → Papa      N → caviar
NP → Det N     V → ate        N → spoon
NP → NP PP     V → spoon      P → with
VP → V NP      Det → the
VP → VP PP     Det → a
PP → P NP
First: Let’s find all NPs
(NP, 0, 1): Papa
(NP, 2, 4): the caviar
(NP, 5, 7): a spoon
(NP, 2, 7): the caviar with a spoon
Second: Let’s find all VPs
(VP, 1, 7): ate the caviar with a spoon
(VP, 1, 4): ate the caviar
Third: Let’s find all Ss
(S, 0, 7): Papa ate the caviar with a spoon
(S, 0, 4): Papa ate the caviar
(Chart figure: rows index span start, columns span end; e.g., (NP, 0, 1), (VP, 1, 7), and (S, 0, 7) fill the cells on the path from the start to the full-sentence S.)
Example from Jason Eisner
CKY Recognizer
Input: a string of N words; a grammar in CNF
Output: True (with parse) / False
Data structure: N*N table T
Rows indicate span start (0 to N-1); columns indicate span end (1 to N)
T[i][j] lists constituents spanning i → j
For Viterbi in HMMs: build the table left-to-right
For CKY in trees: build 1. smallest-to-largest & 2. left-to-right
CKY Recognizer
T = Cell[N][N+1]
for(j = 1; j ≤ N; ++j) {
    T[j-1][j].add(X for non-terminal X in G if X → word_j)
}
for(width = 2; width ≤ N; ++width) {
    for(start = 0; start ≤ N - width; ++start) {
        end = start + width
        for(mid = start+1; mid < end; ++mid) {
            for(non-terminal Y : T[start][mid]) {
                for(non-terminal Z : T[mid][end]) {
                    T[start][end].add(X for rule X → Y Z : G)
                }
            }
        }
    }
}
(X spans start → end when Y spans start → mid and Z spans mid → end. Note the bound start ≤ N - width, so the full-sentence span is included.)
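The pseudocode above can be sketched in runnable form roughly as follows; the grammar encoding (dicts for lexical and binary rules) is our assumption, not the slides'.

```python
def cky_recognize(words, lexical, binary, start_symbol="S"):
    """CKY recognizer for a grammar in Chomsky Normal Form.

    lexical: dict word -> set of non-terminals X with X -> word
    binary:  dict (Y, Z) -> set of non-terminals X with X -> Y Z
    """
    n = len(words)
    # T[i][j] holds non-terminals spanning words i..j (exclusive end).
    T = [[set() for _ in range(n + 1)] for _ in range(n)]
    for j in range(1, n + 1):                       # width-1 spans
        T[j - 1][j] |= lexical.get(words[j - 1], set())
    for width in range(2, n + 1):                   # smallest-to-largest
        for start in range(0, n - width + 1):       # left-to-right
            end = start + width
            for mid in range(start + 1, end):
                for Y in T[start][mid]:
                    for Z in T[mid][end]:
                        T[start][end] |= binary.get((Y, Z), set())
    return start_symbol in T[0][n]

# The "Papa ate the caviar with a spoon" grammar from earlier:
lexical = {"Papa": {"NP"}, "ate": {"V"}, "the": {"Det"}, "caviar": {"N"},
           "with": {"P"}, "a": {"Det"}, "spoon": {"N", "V"}}
binary = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("NP", "PP"): {"NP"},
          ("V", "NP"): {"VP"}, ("VP", "PP"): {"VP"}, ("P", "NP"): {"PP"}}
print(cky_recognize("Papa ate the caviar with a spoon".split(), lexical, binary))
```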
CKY is Versatile: PCFG Tasks

Task                                                   | PCFG algorithm name        | HMM analog
Find any parse                                         | CKY recognizer             | none
Find the most likely parse (for an observed sequence)  | weighted CKY (Viterbi)     | Viterbi
Calculate the (log) likelihood of an observed sequence w1, …, wN | Inside algorithm | Forward algorithm
Learn the grammar parameters (EM)                      | Inside-outside algorithm   | Forward-backward / Baum-Welch (EM)
Outline Review: PCFGs and CKY Dependency Grammars Dependency Parsing (Shift-reduce) From syntax to semantics
Structure vs. Word Relations Constituency trees/analyses: based on structure Dependency analyses: based on word relations
Remember: (P)CFGs Clearly Show Ambiguity
(Two parse trees for “I ate the meal with friends”: the PP “with friends” attaches either to the VP or to the NP “the meal”.)
CFGs to Dependencies
(Both parse trees for “I ate the meal with friends” again; the phrase structure is converted step by step into head-to-dependent relations between the words.)
CFGs to Labeled Dependencies
(The trees for “I ate the meal with friends”, now with labeled arcs: nsubj(ate, I), dobj(ate, meal), and an nmod arc for “with friends” that attaches to ate or to meal depending on the reading.)
Labeled Dependencies
Word-to-word labeled relations
Example: nsubj(ate, Chris), where “ate” is the governor (head) and “Chris” is the dependent
de Marneffe et al., 2014
http://universaldependencies.org/
(Labeled) Dependency Parse
Directed graphs
Vertices: linguistic blobs in a sentence
Edges: (labeled) arcs
Often directed trees:
1. A single root node with no incoming arcs
2. Each vertex except the root has exactly one incoming arc
3. A unique path from the root node to each vertex
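The three tree conditions can be checked mechanically. In this sketch the encoding is our assumption: vertices are numbered 0..n-1 with 0 as the root, and a parse is a list of (head, dependent) arcs.

```python
def is_dependency_tree(n, arcs, root=0):
    """Check the three directed-tree conditions for a dependency parse."""
    head = {}
    for h, d in arcs:
        # Condition 1 & 2: the root has no incoming arc, and every
        # other vertex has exactly one (no second head allowed).
        if d in head or d == root:
            return False
        head[d] = h
    if set(head) != set(range(n)) - {root}:
        return False
    # Condition 3: following heads upward from any vertex must reach
    # the root without revisiting a vertex (i.e., no cycles).
    for v in list(head):
        seen = set()
        while v != root:
            if v in seen:
                return False
            seen.add(v)
            v = head[v]
    return True
```

For example, arcs [(0, 2), (2, 1), (2, 3)] form a valid tree over four vertices, while [(0, 1), (2, 3), (3, 2)] contain a cycle and do not.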
Projective Dependency Trees
No crossing arcs
✔ Projective   ✖ Not projective
Non-projective parses capture:
• certain long-range dependencies
• free word order
SLP3: Figs 14.2, 14.3
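The "no crossing arcs" condition can be tested pairwise. This rough sketch treats each arc as a span over word positions (an encoding we assume, not one given in the slides): two arcs cross when exactly one endpoint of one lies strictly inside the other.

```python
def is_projective(arcs):
    """arcs: list of (head, dependent) word positions.
    Returns True iff no two arcs cross."""
    spans = [tuple(sorted(a)) for a in arcs]
    for i, (l1, r1) in enumerate(spans):
        for l2, r2 in spans[i + 1:]:
            # Crossing: the spans interleave without nesting.
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
                return False
    return True
```

For instance, arcs over positions [(0, 2), (2, 1), (2, 3)] are projective, while [(1, 3), (2, 4)] cross.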
Are CFGs for Naught? Nope! Simple algorithm from Xia and Palmer (2011) 1. Mark the head child of each node in a phrase structure, using “appropriate” head rules. 2. In the dependency structure, make the head of each non-head child depend on the head of the head-child.
Are CFGs for Naught? Nope!
Simple algorithm from Xia and Palmer (2011)
1. Mark the head child of each node in a phrase structure, using “appropriate” head rules.
2. In the dependency structure, make the head of each non-head child depend on the head of the head-child.
Worked example for “Papa ate the caviar with a spoon”: the NPs get heads caviar and spoon, the PP gets head spoon, both VPs and S get head ate. So the depends on caviar, a and with depend on spoon, and Papa, caviar, and spoon all depend on ate.
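The two steps above can be sketched as follows. The tree encoding and the head rules here are simplified stand-ins for real "appropriate" head rules (e.g., the PP's head child is taken to be the NP, matching the slide's annotation): an internal node is (label, [children]), a leaf is (tag, word), and head_rules lists, per parent label, the child labels to try in order.

```python
# Simplified head rules: parent label -> preferred head-child labels.
head_rules = {"S": ["VP"], "VP": ["V", "VP"], "NP": ["N", "NP"], "PP": ["NP"]}

def to_dependencies(tree, deps=None):
    """Return (lexical head of tree, accumulated (head, dependent) pairs)."""
    if deps is None:
        deps = []
    label, children = tree
    if isinstance(children, str):          # leaf: (tag, word)
        return children, deps
    heads = [to_dependencies(c, deps)[0] for c in children]
    head_idx = 0                           # step 1: mark the head child
    for cand in head_rules[label]:
        matches = [i for i, c in enumerate(children) if c[0] == cand]
        if matches:
            head_idx = matches[0]
            break
    for i, h in enumerate(heads):          # step 2: each non-head child's
        if i != head_idx:                  # head depends on the head child's head
            deps.append((heads[head_idx], h))
    return heads[head_idx], deps

tree = ("S", [("NP", "Papa"),
              ("VP", [("VP", [("V", "ate"),
                              ("NP", [("Det", "the"), ("N", "caviar")])]),
                      ("PP", [("P", "with"),
                              ("NP", [("Det", "a"), ("N", "spoon")])])])])
root, deps = to_dependencies(tree)
print(root, deps)
```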
Dependency Post-Processing (Keep Tree Structure)
Dependency Post-Processing (Get Possible Graph Structure) Amaranthus, collectively known as amaranth, is a cosmopolitan genus of annual or short-lived perennial plants.
Outline Review: PCFGs and CKY Dependency Grammars Dependency Parsing (Shift-reduce) From syntax to semantics
(Some) Dependency Parsing Algorithms Dynamic Programming Eisner Algorithm (Eisner 1996) Transition-based Shift-reduce, arc standard Graph-based Maximum spanning tree
Shift-Reduce Parsing
Recall from CMSC 331
Bottom-up
Tools: input words, some special root symbol ($), and a stack to hold configurations
Shift:
– move tokens onto the stack
Reduce:
– match the top two elements of the stack against the grammar (a rule’s RHS)
– if a match occurred, replace them with the LHS symbol on the stack
Shift-Reduce Dependency Parsing
Tools: input words, some special root symbol ($), and a stack to hold configurations
Shift:
– move tokens onto the stack
Reduce:
– decide if the top two elements of the stack form a valid (good) grammatical dependency
– if there’s a valid relation, keep the head on the stack
Decide how? A search problem!
What is valid? Learn it!
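A minimal arc-standard sketch of the loop above. In a real parser the "decide" step is a learned classifier over the parser state; here a hypothetical oracle set of gold (head, dependent) arcs stands in for it, and a reduction fires only once the dependent has collected all of its own dependents.

```python
def shift_reduce_parse(words, good_arcs):
    """good_arcs: set of (head, dependent) pairs acting as the oracle."""
    def finished(w, arcs):
        # w has already collected all of its gold dependents.
        return all(a in arcs for a in good_arcs if a[0] == w)

    stack, buffer, arcs = ["$"], list(words), []   # $ is the root symbol
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            s2, s1 = stack[-2], stack[-1]
            if (s1, s2) in good_arcs and finished(s2, arcs):
                arcs.append((s1, s2))              # left-arc: s2 depends on s1
                del stack[-2]                      # head s1 stays on the stack
                continue
            if (s2, s1) in good_arcs and finished(s1, arcs):
                arcs.append((s2, s1))              # right-arc: s1 depends on s2
                stack.pop()                        # head s2 stays on the stack
                continue
        if not buffer:
            break                                  # stuck: no valid action left
        stack.append(buffer.pop(0))                # shift
    return arcs

good = {("$", "ate"), ("ate", "Papa"), ("ate", "caviar"), ("caviar", "the")}
print(shift_reduce_parse("Papa ate the caviar".split(), good))
```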