Semantic Roles, Frames, and Expectations
CMSC 473/673, UMBC
November 27th and 29th, 2017
Course Announcement 1: Assignment 4 due Monday, December 11th (~2 weeks). Any questions?
Course Announcement 2: Final Exam
No mandatory final exam. December 20th, 1pm-3pm: optional second midterm/final, averaged into the first midterm score. No practice questions.
Register by Monday 12/11: https://goo.gl/forms/aXflKkP0BIRxhOS83
Recap from last time…
Probabilistic Context-Free Grammar (PCFG) Tasks
• Find the most likely parse (for an observed sequence)
• Calculate the (log) likelihood of an observed sequence w_1, …, w_N
• Learn the grammar parameters
CKY Algorithms as Semirings

Algorithm    Weights                ⊕     ⊗     ⓪      ①
Recognizer   Boolean (True/False)   or    and   False   True
Viterbi      [0,1]                  max   *     0       1
Inside       [0,1]                  +     *     0       1

Outside? Not really ("Semiring Parsing," Goodman, 1998). But there is a connection between inside-outside and backprop! ("Inside-Outside and Forward-Backward Algorithms Are Just Backprop," Eisner, 2016)
Adapted from Jason Eisner
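The table above can be sketched as one CKY routine parameterized by the semiring (⊕, ⊗, ⓪, ①). This is an illustrative sketch, not the lecture's code: the toy grammar, probabilities, and function names below are assumptions.

```python
from collections import defaultdict

# Each semiring is (plus, times, zero, one), matching the table above.
BOOLEAN = (lambda a, b: a or b, lambda a, b: a and b, False, True)
VITERBI = (max, lambda a, b: a * b, 0.0, 1.0)
INSIDE  = (lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)

def cky(words, lexicon, rules, semiring, start='S'):
    """Generic CKY: the semiring total for `start` spanning the whole sentence."""
    plus, times, zero, one = semiring
    n = len(words)
    chart = defaultdict(lambda: zero)
    for i, w in enumerate(words):                 # terminal rules A -> w
        for A, wt in lexicon.get(w, []):
            chart[i, i + 1, A] = plus(chart[i, i + 1, A], wt)
    for width in range(2, n + 1):                 # binary rules A -> B C
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (A, B, C), wt in rules.items():
                    score = times(wt, times(chart[i, j, B], chart[j, k, C]))
                    chart[i, k, A] = plus(chart[i, k, A], score)
    return chart[0, n, start]

# Toy PCFG with made-up probabilities:
rules = {('S', 'NP', 'VP'): 1.0, ('VP', 'V', 'NP'): 0.8, ('NP', 'D', 'N'): 0.6}
lexicon = {'Papa': [('NP', 0.4)], 'ate': [('V', 1.0)],
           'the': [('D', 1.0)], 'caviar': [('N', 1.0)]}
words = 'Papa ate the caviar'.split()
```

Running the same routine with BOOLEAN weights (True everywhere) gives the recognizer; INSIDE gives the sentence likelihood; VITERBI gives the best-parse probability.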
Expectation Maximization (EM)
0. Assume some value for your parameters p(X → Y Z).
Two-step, iterative algorithm:
1. E-step: count under uncertainty, assuming these parameters ("inside-outside" estimated counts):

E[Y → Z W | x_1 x_2 ⋯ x_n] = (p(Y → Z W) / M(x_1 x_2 ⋯ x_n)) · Σ_{0 ≤ j < l < k ≤ n} β(Y, j, k) α(Z, j, l) α(W, l, k)

E[Y → w | x_1 x_2 ⋯ x_n] = (p(Y → w) / M(x_1 x_2 ⋯ x_n)) · Σ_{0 ≤ j < n : x_j = w} β(Y, j, j+1)

where α is the inside probability, β is the outside probability, and M(x_1 ⋯ x_n) is the total likelihood of the sentence.
2. M-step: maximize the log-likelihood, assuming these uncertain counts.
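Concretely, the binary-rule count in the E-step is a triple loop over span boundaries, given precomputed inside (α) and outside (β) tables and the sentence likelihood M. The dictionary-of-tuples table format below is an illustrative assumption.

```python
def expected_rule_count(rule_prob, alpha, beta, M, Y, Z, W, n):
    """E[Y -> Z W | x_1..x_n]: sum beta(Y,j,k) * alpha(Z,j,l) * alpha(W,l,k)
    over all 0 <= j < l < k <= n, scaled by p(Y -> Z W) / M."""
    total = 0.0
    for j in range(n):
        for l in range(j + 1, n):
            for k in range(l + 1, n + 1):
                total += (beta.get((Y, j, k), 0.0)
                          * alpha.get((Z, j, l), 0.0)
                          * alpha.get((W, l, k), 0.0))
    return rule_prob * total / M
```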
Projective Dependency Trees: no crossing arcs.
✔ Projective  ✖ Not projective
Non-projective parses capture:
• certain long-range dependencies
• free word order
SLP3: Figs 14.2, 14.3
Are CFGs for Naught? Nope! Simple algorithm from Xia and Palmer (2001):
1. Mark the head child of each node in a phrase structure, using "appropriate" head rules.
2. In the dependency structure, make the head of each non-head child depend on the head of the head-child.
(Example tree: "Papa ate the caviar with a spoon")
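Those two steps can be sketched as follows. This is a minimal illustration under assumed types: a toy `Node` class whose `head_child` index plays the role of the head rules (step 1), with step 2 done by recursion.

```python
class Node:
    """A constituency-tree node; head_child marks the head per some head rules."""
    def __init__(self, label, children=None, word=None, head_child=0):
        self.label, self.word = label, word
        self.children = children or []
        self.head_child = head_child  # index of the head child

def lexical_head(node):
    """Follow head children down to the head word of this subtree."""
    if node.word is not None:
        return node.word
    return lexical_head(node.children[node.head_child])

def to_dependencies(node, deps):
    """Step 2: each non-head child's head word depends on the head child's head word."""
    if node.word is not None:
        return
    h = lexical_head(node)
    for i, child in enumerate(node.children):
        if i != node.head_child:
            deps.append((h, lexical_head(child)))  # (head, dependent)
        to_dependencies(child, deps)

# "Papa ate the caviar with a spoon", with heads already marked:
np_caviar = Node('NP', [Node('D', word='the'), Node('N', word='caviar')], head_child=1)
np_spoon = Node('NP', [Node('D', word='a'), Node('N', word='spoon')], head_child=1)
pp = Node('PP', [Node('P', word='with'), np_spoon], head_child=0)
vp_inner = Node('VP', [Node('V', word='ate'), np_caviar], head_child=0)
vp = Node('VP', [vp_inner, pp], head_child=0)
tree = Node('S', [Node('NP', [Node('N', word='Papa')]), vp], head_child=1)

deps = []
to_dependencies(tree, deps)
```

With traditional head rules the preposition heads the PP, so `with` depends on `ate` and `spoon` on `with`; other conventions (e.g. Universal Dependencies) would attach these differently.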
(Some) Dependency Parsing Algorithms Dynamic Programming Eisner Algorithm (Eisner 1996) Transition-based Shift-reduce, arc standard Graph-based Maximum spanning tree
Shift-Reduce Dependency Parsing
Tools: input words, a special root symbol ($), and a stack to hold configurations.
Shift: move tokens onto the stack.
Reduce: if the top two elements of the stack form a valid (good) grammatical dependency, place the head on the stack.
How do we decide? Search problem!
What are the possible actions? What is valid? Learn it!
Arc Standard Parsing

state ← {[root], [words], []}
while state ≠ {[root], [], [(deps)]}:
    t ← ORACLE(state)
    state ← APPLY(t, state)
return state

Action name: meaning (possibility)
• LEFTARC: Assert a head-dependent relation between the word at the top of the stack and the word directly beneath it; remove the lower word from the stack. (Assign the current word as the head of some previously seen word.)
• RIGHTARC: Assert a head-dependent relation between the second word on the stack and the word at the top; remove the word at the top of the stack. (Assign some previously seen word as the head of the current word.)
• SHIFT: Remove the word from the front of the input buffer and push it onto the stack. (Wait on processing the current word; add it for later.)
"Papa ate the caviar" (stack shown top-first)

Deps                                           Stack              Word Buffer           Action
---                                            $                  Papa ate the caviar   SHIFT
---                                            Papa $             ate the caviar        SHIFT
---                                            ate Papa $         the caviar            LEFTARC
ate->Papa                                      ate $              the caviar            SHIFT
ate->Papa                                      the ate $          caviar                SHIFT
ate->Papa                                      caviar the ate $   ---                   LEFTARC
ate->Papa, caviar->the                         caviar ate $       ---                   RIGHTARC
ate->Papa, caviar->the, ate->caviar            ate $              ---                   RIGHTARC
ate->Papa, caviar->the, ate->caviar, $->ate    $                  ---                   ---
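The trace above can be replayed mechanically. A minimal sketch (names are illustrative) that applies a given action sequence and collects the (head, dependent) arcs:

```python
def arc_standard(words, actions):
    """Replay a transition sequence; return the (head, dependent) arcs built."""
    stack, buffer, deps = ['$'], list(words), []
    for act in actions:
        if act == 'SHIFT':
            stack.append(buffer.pop(0))
        elif act == 'LEFTARC':   # top of stack heads the word beneath it
            dependent = stack.pop(-2)
            deps.append((stack[-1], dependent))
        elif act == 'RIGHTARC':  # second word heads the top of the stack
            dependent = stack.pop()
            deps.append((stack[-1], dependent))
    return deps

# The action column from the trace above:
actions = ['SHIFT', 'SHIFT', 'LEFTARC', 'SHIFT', 'SHIFT',
           'LEFTARC', 'RIGHTARC', 'RIGHTARC']
```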
Arc Standard Parsing

state ← {[root], [words], []}
while state ≠ {[root], [], [(deps)]}:
    t ← ORACLE(state)
    state ← APPLY(t, state)
return state

Q: What is the time complexity?  A: Linear.
Q: What's potentially problematic?  A: This is a greedy algorithm.
Learning An Oracle (Predictor)
Training data: dependency treebank
Input: configuration. Output: {LEFTARC, RIGHTARC, SHIFT}
t ← ORACLE(state)
• Choose LEFTARC if it produces a correct head-dependent relation given the reference parse and the current configuration.
• Choose RIGHTARC if it produces a correct head-dependent relation given the reference parse, and all of the dependents of the word at the top of the stack have already been assigned.
• Otherwise, choose SHIFT.
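The three rules above can be sketched directly against a gold set of (head, dependent) arcs. This is an illustrative sketch under assumed data structures (a gold arc set and the set of arcs built so far), not the lecture's code; it assumes the gold parse is a well-formed projective tree.

```python
def oracle(stack, gold, built):
    """Return the training action for the current configuration."""
    if len(stack) >= 2:
        top, below = stack[-1], stack[-2]
        if (top, below) in gold:
            return 'LEFTARC'
        if (below, top) in gold and all(
                (top, d) in built for (h, d) in gold if h == top):
            return 'RIGHTARC'  # only after top has collected all its dependents
    return 'SHIFT'

def parse_with_oracle(words, gold):
    """Drive arc-standard parsing with the rule-based oracle."""
    stack, buffer, built = ['$'], list(words), set()
    while not (stack == ['$'] and not buffer):
        act = oracle(stack, gold, built)
        if act == 'SHIFT':
            stack.append(buffer.pop(0))
        elif act == 'LEFTARC':
            built.add((stack[-1], stack[-2]))
            del stack[-2]
        else:  # RIGHTARC
            built.add((stack[-2], stack[-1]))
            stack.pop()
    return built
```

Running this over a treebank yields (configuration, action) pairs, which are the training examples for the predictor discussed next.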
Training the Predictor
Predict action t given configuration s: t = φ(s)
Extract features of the configuration. Examples: word forms, lemmas, POS, morphological features.
How? Perceptron, Maxent, Support Vector Machines, Multilayer Perceptrons, Neural Networks.
Take CMSC 478 (678) to learn more about these.
Becoming Less Greedy Beam search Breadth-first search strategy (CMSC 471/671) At each stage, keep K options open
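A generic sketch of that idea: instead of committing to the single best action, keep the K highest-scoring partial analyses at each stage. The state/expand interface and the toy scores below are illustrative assumptions.

```python
def beam_search(initial, expand, is_final, K):
    """Keep the K best-scoring states at each stage instead of just one.

    expand(state) yields (step_score, next_state) pairs;
    finished states are carried forward unchanged."""
    beam = [(0.0, initial)]
    while not all(is_final(state) for _, state in beam):
        candidates = []
        for score, state in beam:
            if is_final(state):
                candidates.append((score, state))
            else:
                for step_score, nxt in expand(state):
                    candidates.append((score + step_score, nxt))
        beam = sorted(candidates, key=lambda c: -c[0])[:K]
    return max(beam)  # best (score, state) on the final beam
```

With K = 1 this reduces to the greedy parser; for a transition parser, `expand` would score {LEFTARC, RIGHTARC, SHIFT} with the learned predictor.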
Evaluation
• Exact Match (per-sentence accuracy)
• Unlabeled Attachment Score (UAS)
• Labeled Attachment Score (LS, LAS)
• Recall/Precision/F1 for particular relation types
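The attachment metrics are simple to compute once parses are in a common format. A minimal sketch, assuming each parse maps every dependent word to its (head, label) pair:

```python
def attachment_scores(gold, pred):
    """UAS, LAS, and exact match for one sentence.

    gold/pred: dicts mapping each dependent to (head, label)."""
    n = len(gold)
    uas = sum(pred[d][0] == h for d, (h, _) in gold.items()) / n      # head right
    las = sum(pred[d] == arc for d, arc in gold.items()) / n          # head + label right
    exact = float(gold == pred)                                       # whole tree right
    return uas, las, exact
```

Corpus-level UAS/LAS are typically micro-averaged over all words rather than averaged per sentence.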
From Dependencies to Shallow Semantics
From Syntax to Shallow Semantics: "Open Information Extraction" (Angeli et al., 2015)
A sampling of efforts:
http://corenlp.run/ (constituency & dependency)
https://github.com/hltcoe/predpatt
http://openie.allenai.org/
http://www.cs.rochester.edu/research/knext/browse/ (constituency trees)
http://rtw.ml.cmu.edu/rtw/
Semantic Role Labeling
Who did what to whom, where?
[The police officer]_ARG0 (Agent) [detained]_V (Predicate) [the suspect]_ARG2 (Theme) [at the scene of the crime]_AM-LOC (Location)
Following slides adapted from SLP3
Predicate Alternations XYZ corporation bought the stock. They sold the stock to XYZ corporation. The stock was bought by XYZ corporation. The purchase of the stock by XYZ corporation... The stock purchase by XYZ corporation...
A Shallow Semantic Representation: Semantic Roles
Predicates (bought, sold, purchase) represent a situation (event).
Semantic roles express the abstract role that arguments of a predicate can take in the event.
More specific → more general: buyer → agent → proto-agent
Thematic roles
Sasha broke the window. Pat opened the door.
Subjects of break and open: Breaker and Opener. Specific to each event.
Breaker and Opener have something in common: they are volitional actors, often animate, with direct causal responsibility for their events.
Thematic roles are a way to capture this semantic commonality between Breakers and Eaters: they are both AGENTS.
The BrokenThing and OpenedThing are THEMES: prototypically inanimate objects affected in some way by the action.
Modern formulation from Fillmore (1966, 1968) and Gruber (1965). Fillmore was influenced by Lucien Tesnière's (1959) Éléments de Syntaxe Structurale, the book that introduced dependency grammar.
Typical Thematic Roles
Verb Alternations (Diathesis Alternations) Break: AGENT, INSTRUMENT, or THEME as subject Give: THEME and GOAL in either order