Lecture 15: Dependency Parsing
Kai-Wei Chang
CS @ University of Virginia
kw@kwchang.net
Course webpage: http://kwchang.net/teaching/NLP16
CS6501: NLP
How to represent the structure
Dependency trees
- Dependency grammar describes the structure of a sentence as a graph (tree)
- Nodes represent words
- Edges represent dependencies
- The idea goes back to the 4th century BC in ancient India
Phrase structure (constituent parse) trees
- Can be modeled by context-free grammars
- We will see how constituent parses and dependency parses are related
Context-free grammars
PP → P NP
PP → P DT N
PP → in the garden
Non-terminal: DT, N, P, NP, PP, …
Terminal: the, a, ball, garden
Generate sentences by CFG
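
Generating from a CFG means repeatedly rewriting non-terminals until only terminals remain. The sketch below enumerates every phrase a small grammar can derive. The grammar is a toy assumption built around the PP → P NP rule and the terminals from the previous slide; the NP → DT N rule and the word-to-category assignments are illustrative, not taken from the lecture.

```python
from itertools import product

# Toy grammar (an assumption for illustration): each non-terminal maps to a
# list of right-hand sides; symbols that are not keys are terminals.
GRAMMAR = {
    "PP": [["P", "NP"]],
    "NP": [["DT", "N"]],
    "P": [["in"]],
    "DT": [["the"], ["a"]],
    "N": [["ball"], ["garden"]],
}

def generate(symbol):
    """Return every word sequence derivable from `symbol` (finite grammars only)."""
    if symbol not in GRAMMAR:          # terminal: yields itself
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each right-hand-side symbol, then combine the alternatives.
        expansions = [generate(s) for s in rhs]
        for combo in product(*expansions):
            results.append([word for part in combo for word in part])
    return results

phrases = [" ".join(words) for words in generate("PP")]
print(phrases)
# ['in the ball', 'in the garden', 'in a ball', 'in a garden']
```

Exhaustive enumeration only terminates for grammars without recursion; a recursive grammar would need random sampling or a depth limit instead.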
Parse tree defined by CFG
[Figure: parse tree annotated with the grammar rules applied at each step (Rule 2, Rule 4 & 1, Rule 6)]
Example: noun phrases
Example: verb phrase
Sentences
Constituent Parse
From: Kevin Gimpel
Constituent Parse
Non-terminal:
S → NP VP
NP → DT NN
NP → DT
PP → IN NP
VP → VBD PP
VP → NP VBD NP
VP → NP VB
Terminal: [shown in figure]
Non-terminals in the Penn Treebank
Probabilistic Context-free Grammar
Non-terminal:
1.0  S → NP VP
0.6  NP → DT NN
0.4  NP → NP PP
1.0  PP → IN NP
0.5  VP → VBD PP
0.2  VP → NP VBD NP
0.3  VP → NP VB
Terminal: [shown in figure]
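
Under a PCFG, the probability of a parse tree is the product of the probabilities of the rules used to build it. The sketch below applies that to the non-terminal rules listed on this slide. The slide's lexical (tag → word) probabilities are not recoverable, so the sketch scores a tree only down to its POS tags; the example tree and the tuple encoding are assumptions for illustration.

```python
# Rule probabilities copied from the slide. Lexical (tag -> word) rules are
# not available, so trees are scored down to POS tags only (an assumption).
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "NN")): 0.6,
    ("NP", ("NP", "PP")): 0.4,
    ("PP", ("IN", "NP")): 1.0,
    ("VP", ("VBD", "PP")): 0.5,
    ("VP", ("NP", "VBD", "NP")): 0.2,
    ("VP", ("NP", "VB")): 0.3,
}

def tree_prob(tree):
    """A tree is (label, child, ...); a bare string is a POS-tag leaf."""
    if isinstance(tree, str):
        return 1.0                     # leaves contribute no rule probability here
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]
    for child in children:
        p *= tree_prob(child)          # multiply in the sub-tree probabilities
    return p

# Hypothetical tree: S -> NP VP, NP -> DT NN, VP -> VBD PP, PP -> IN NP, NP -> DT NN
tree = ("S", ("NP", "DT", "NN"),
             ("VP", "VBD", ("PP", "IN", ("NP", "DT", "NN"))))
prob = tree_prob(tree)
print(round(prob, 4))   # 1.0 * 0.6 * 0.5 * 1.0 * 0.6 = 0.18
```

Note that the probabilities of all rules sharing a left-hand side sum to 1 (for example, 0.5 + 0.2 + 0.3 for VP), which is what makes the grammar a proper distribution over expansions.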
Probabilistic Context-free Grammar
- PCFG achieves ~73% on PTB
- State of the art: ~92%
- Lexicalized PCFG (Collins 1997)
How to decide head?
- Usually use deterministic head rules (e.g., Collins head rules)
- Define heads in CFG
  - S → NP VP
  - VP → VBD NP PP
  - NP → DT JJ NN
From Noah Smith
Lexical Head Annotation
Constituent parse → Dependency Parse
Head rules can be used to extract a dependency parse from a constituent (CFG) parse
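
The conversion can be sketched as a bottom-up pass: each constituent picks one child as its head, the head word of that child becomes the head word of the constituent, and every other child's head word becomes a dependent of it. The head table below is a hypothetical simplification (one head label per parent) in the spirit of Collins-style rules, not the lecture's actual table, and the example tree is made up for illustration.

```python
# Hypothetical head table: for each parent label, the label of its head child.
# These choices are illustrative assumptions, not the slide's head rules.
HEAD_CHILD = {"S": "VP", "VP": "VBD", "NP": "NN", "PP": "IN"}

def extract(tree, arcs):
    """Return the head word of `tree`, appending (head, dependent) pairs to `arcs`.

    A tree is (label, child, ...); a leaf is (pos_tag, word).
    """
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return children[0]                       # leaf: the word heads itself
    child_heads = [extract(child, arcs) for child in children]
    head_idx = next(i for i, child in enumerate(children)
                    if child[0] == HEAD_CHILD[label])
    for i, word in enumerate(child_heads):
        if i != head_idx:
            # Head words of non-head children depend on the head child's word.
            arcs.append((child_heads[head_idx], word))
    return child_heads[head_idx]

tree = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("VP", ("VBD", "sat"),
               ("PP", ("IN", "in"),
                      ("NP", ("DT", "the"), ("NN", "garden")))))
arcs = []
root = extract(tree, arcs)
print(root)   # sat
print(arcs)
# [('dog', 'the'), ('garden', 'the'), ('in', 'garden'), ('sat', 'in'), ('sat', 'dog')]
```

Arcs are stored as word pairs to keep the sketch short; a real converter would use token positions so that repeated words such as "the" stay distinct.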
Arrow types show the names of grammatical relations
Dependency parsing
- Can be more flexible (non-projective)
- English is mostly projective
- Some free word order languages (e.g., Czech) are non-projective
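
A dependency tree is projective when no two arcs cross once the words are laid out in sentence order. A minimal check, assuming the tree is given as a list of head indices with ROOT at position 0 (this encoding and both example analyses are assumptions for illustration):

```python
def is_projective(heads):
    """heads[d] is the head index of token d; index 0 is ROOT (heads[0] is None)."""
    # Store each arc as a (left, right) span over word positions.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads) if h is not None]
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:      # one arc starts inside the other and ends outside
                return False
    return True

# "Joe likes Mary": Joe <- likes, ROOT -> likes, likes -> Mary
print(is_projective([None, 2, 0, 2]))                    # True

# "A hearing is scheduled on the issue today", with "on" attached to "hearing"
# and "today" attached to "scheduled": those two arcs cross.
print(is_projective([None, 2, 3, 0, 3, 2, 7, 5, 4]))     # False
```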
How to build a dependency tree?
- There are several approaches
- Graph algorithms
  - Consider all word pairs
  - Create a Maximum Spanning Tree for a sentence
- Transition-based approaches
  - Similar to how we parse a program: Shift-Reduce Parser
- Many other approaches…
Sources of information for DP
- Lexical affinities
  - [ issues → the ]
  - [ issues → I ]
- Distances
  - Words usually depend on nearby words
- Valency of heads
  - # dependents for a head
Graph-Based Approaches [McDonald et al. 2005]
- Consider all word pairs and assign scores
- Score of a tree = sum of the scores of its edges
- Can be solved as an MST problem
  - Chu-Liu-Edmonds
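
To make the edge-factored objective concrete, the sketch below scores every possible head assignment for a three-word sentence and keeps the best valid tree. This is brute-force enumeration, not Chu-Liu-Edmonds: it is exponential in sentence length and only meant to show what the MST algorithm optimizes. The score matrix is made up for illustration.

```python
from itertools import product

def is_tree(heads):
    """heads[d-1] is the head of token d (0 = ROOT); valid if every token reaches ROOT."""
    for d in range(1, len(heads) + 1):
        seen, node = set(), d
        while node != 0:
            if node in seen:           # revisiting a node means there is a cycle
                return False
            seen.add(node)
            node = heads[node - 1]
    return True

def best_tree(score):
    """score[h][d] is the score of arc h -> d, for h in 0..n and d in 1..n."""
    n = len(score) - 1
    best, best_heads = float("-inf"), None
    for heads in product(range(n + 1), repeat=n):   # one head choice per word
        if not is_tree(heads):
            continue
        total = sum(score[h][d] for d, h in enumerate(heads, start=1))
        if total > best:
            best, best_heads = total, heads
    return best_heads, best

# Hypothetical arc scores for ROOT(0) Joe(1) likes(2) Mary(3); column 0 is unused.
score = [
    [0, 1, 9, 1],   # head = ROOT
    [0, 0, 2, 1],   # head = Joe
    [0, 8, 0, 7],   # head = likes
    [0, 1, 3, 0],   # head = Mary
]
print(best_tree(score))   # ((2, 0, 2), 24)
```

The result reads as: Joe's head is likes, likes' head is ROOT, and Mary's head is likes. The sketch does not restrict ROOT to a single child, which some parsers enforce.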
Transition-based parser
- MaltParser (Nivre et al. 2008)
- Similar to a Shift-Reduce Parser
  - But “reduce” actions can create dependencies
- The parser has:
  - A stack 𝜏: starts with a “Root” symbol
  - A buffer 𝛾: starts with the input sentence
  - A set of dependency arcs A: starts off empty
- Use a set of actions to parse sentences
  - Many possible action sets
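Arc-Eager Dependency Parser
- Shift: move the first word of the buffer onto the stack
  stack [ROOT], buffer [Joe likes Mary] → stack [ROOT Joe], buffer [likes Mary]
- Left-Arc: make the top of the stack a dependent of the first word in the buffer, then pop it
  Precondition: x_i ≠ Root and (x_k, x_i) ∉ A (the top of the stack x_i is not Root and has no head yet)
  stack [ROOT Joe], buffer [likes Mary] → stack [ROOT], buffer [likes Mary], new arc likes → Joe
- Right-Arc: make the first word in the buffer a dependent of the top of the stack, then push it
  stack [ROOT likes], buffer [Mary] → stack [ROOT likes Mary], buffer [], new arc likes → Mary
- Reduce: pop the top of the stack
  Precondition: (x_k, x_i) ∈ A (the top of the stack x_i already has a head)
  stack [ROOT likes Mary] → stack [ROOT likes]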
Arc-Eager Dependency Parser
- Start: stack 𝜏 = [ROOT], buffer 𝛾 = the input sentence, arc set A = ∅
- Conduct a sequence of actions
- Terminate with 𝜏, 𝛾 = ∅
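
The four actions can be sketched as a small state machine over a stack, a buffer, and an arc set. This is a minimal sketch, assuming the action sequence is supplied by hand (a real parser such as MaltParser predicts each action with a classifier); the function name, the action strings, and the index-based encoding are all hypothetical.

```python
def parse(words, actions):
    """Apply arc-eager actions; return (arcs, stack, buffer).

    Token 0 is ROOT and words are numbered from 1. Arcs are (head, dependent).
    """
    stack = [0]                                 # starts with ROOT
    buffer = list(range(1, len(words) + 1))     # starts with the input sentence
    arcs = set()                                # starts off empty
    has_head = set()

    for action in actions:
        if action == "shift":
            stack.append(buffer.pop(0))
        elif action == "left-arc":
            top, front = stack[-1], buffer[0]
            # Precondition: top is not ROOT and does not already have a head.
            assert top != 0 and top not in has_head
            arcs.add((front, top))
            has_head.add(top)
            stack.pop()
        elif action == "right-arc":
            top, front = stack[-1], buffer.pop(0)
            arcs.add((top, front))
            has_head.add(front)
            stack.append(front)
        elif action == "reduce":
            # Precondition: the top of the stack already has a head.
            assert stack[-1] in has_head
            stack.pop()
        else:
            raise ValueError(f"unknown action: {action}")
    return arcs, stack, buffer

words = ["Joe", "likes", "Mary"]
arcs, stack, buffer = parse(words, ["shift", "left-arc", "right-arc", "right-arc"])
print(sorted(arcs))   # [(0, 2), (2, 1), (2, 3)]
print(buffer)         # []
```

The arcs read as ROOT → likes, likes → Joe, and likes → Mary. The sketch treats an empty buffer as the end of parsing and leaves any remaining stack items in place.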
It’s your turn
- Happy children like to play with their friend .
- Shift → Left-arc → Shift → Left-arc → Right-arc → Shift → Left-arc → Right-arc → Right-arc → Shift → Left-arc → Right-arc → Reduce ×3 → Right-arc → Reduce ×3
From Chris Manning
Structured Prediction: beyond sequence tagging
Assign values to a set of interdependent output variables

Task                     Input                            Output
Part-of-speech tagging   They operate ships and banks.    Pronoun Verb Noun And Noun
Dependency parsing       They operate ships and banks.    [Dependency tree over: Root They operate ships and banks .]