

  1. Exploiting and Expanding Corpus Resources for Frame-Semantic Parsing. Nathan Schneider, CMU (with Chris Dyer & Noah A. Smith). April 26, 2013, IFNW’13

  2. FrameNet + NLP = <3 • We want to develop systems that understand text • Frame semantics and FrameNet offer a linguistically & computationally satisfying theory/representation for semantic relations

  3. Roadmap • A frame-semantic parser • Multiword expressions • Simplifying annotation for syntax + semantics

  4. Frame-semantic parsing (SemEval Task 19) [Baker, Ellsworth, & Erk 2007] • Given a text sentence, analyze its frame semantics. Mark: ‣ words/phrases that are lexical units ‣ the frame evoked by each LU ‣ frame elements (role–argument pairings) • Analysis is in terms of groups of tokens. No assumption that we know the syntax.
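To make the task's output concrete, here is a minimal sketch in Python of the kind of structure a frame-semantic parser must produce. The class and field names are hypothetical (this is not SEMAFOR's actual data model), and the analysis of the example sentence is illustrative, though the frame and role names follow FrameNet's style.

    from dataclasses import dataclass, field

    @dataclass
    class FrameAnnotation:
        """One frame instance evoked by a lexical unit in a sentence."""
        target: tuple                 # token indices of the frame-evoking LU
        frame: str                    # name of the evoked frame
        elements: dict = field(default_factory=dict)  # role name -> token indices

    # "Austria imported 21,000 tons of cheese ."
    # Arguments are groups of tokens; no syntax tree is assumed.
    analysis = [
        FrameAnnotation(target=(1,), frame="Importing",
                        elements={"Importer": (0,), "Goods": (2, 3, 4, 5)}),
        FrameAnnotation(target=(3,), frame="Measure_mass",
                        elements={"Count": (2,), "Stuff": (4, 5)}),
    ]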

  5. SEMAFOR [Das, Schneider, Chen, & Smith 2010] [a sequence of slides stepping through SEMAFOR’s analysis of an example sentence]

  6. SEMAFOR [Das, Schneider, Chen, & Smith 2010] • SEMAFOR consists of a pipeline: preprocessing → target identification → frame identification → argument identification • Preprocessing: syntactic parsing • Heuristics + 2 statistical models • Trained/tuned on English FrameNet’s full-text annotations
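As a reading aid, here is a minimal sketch of that pipeline's control flow. Every stage function is a hypothetical stub standing in for SEMAFOR's heuristics and two statistical models; none of this is SEMAFOR's actual code.

    def preprocess(sentence):
        # Stand-in for real preprocessing (POS tagging + dependency parsing).
        return sentence.split()

    def identify_targets(tokens):
        # Stand-in for the heuristic target-identification stage.
        return [i for i, w in enumerate(tokens) if w.endswith("ed")]

    def identify_frame(tokens, target):
        # Stand-in for statistical model 1: pick a frame for the target.
        return "Some_frame"

    def identify_arguments(tokens, target, frame):
        # Stand-in for statistical model 2: fill the frame's roles.
        return {}

    def parse(sentence):
        tokens = preprocess(sentence)
        analyses = []
        for target in identify_targets(tokens):  # which words/phrases evoke frames?
            frame = identify_frame(tokens, target)
            args = identify_arguments(tokens, target, frame)
            analyses.append((target, frame, args))
        return analyses

    print(parse("Austria imported cheese ."))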

  7. Full-text Annotations https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=fulltextIndex

  8. Full-text annotations [screenshot of an annotated document]

  9. SEMAFOR [Das, Schneider, Chen, & Smith 2010] • SEMAFOR’s models consist of features over observable parts of the sentence (words, lemmas, POS tags, dependency edges & paths) that may be predictive of frame/role labels • Full-text annotations serve as training data for (semi)supervised learning • Extensive body of work on semantic role labeling [starting with Gildea & Jurafsky 2002 for FrameNet; also much work for PropBank]
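To give a flavor of what "features over observable parts of the sentence" means, here is a toy sketch of feature templates for frame identification. The template names and helper are invented for this example; SEMAFOR's actual feature set is far larger and richer (dependency edges and paths, etc.).

    def frame_id_features(tokens, lemmas, pos, t, frame):
        # Conjoin a candidate frame with observable context around target token t.
        # A log-linear model scores each candidate frame by dotting such
        # features with learned weights.
        return {
            f"frame={frame}|lemma={lemmas[t]}": 1.0,
            f"frame={frame}|pos={pos[t]}": 1.0,
            f"frame={frame}|prevword={tokens[t-1] if t > 0 else '<S>'}": 1.0,
        }

    feats = frame_id_features(["Austria", "imported", "cheese"],
                              ["austria", "import", "cheese"],
                              ["NNP", "VBD", "NN"], t=1, frame="Importing")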

  10. SEMAFOR [Das, Schneider, Chen, & Smith 2010; Das et al. 2013, to appear] • State-of-the-art performance on the SemEval’07 evaluation (outperforms the best system from the task, Johansson & Nugues 2007) • On SE07: [F] 74%, [A] 68%, [F → A] 46%; on FN 1.5: [F] 91%, [A] 80%, [F → A] 69% • BUT: this task is really hard. Room for improvement at all stages.
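For orientation, these are F1-style scores over predicted structures (frames, arguments, and full frame-plus-argument analyses). Below is a rough sketch of such a score computed over exact matches; note the official SemEval'07 metric is more forgiving (it awards partial credit), so this is a simplification.

    def exact_match_f1(gold, pred):
        # gold/pred: sets of items, e.g. (target, frame) pairs for frame ID,
        # or (target, frame, role, span) tuples for full analyses.
        tp = len(gold & pred)
        if tp == 0:
            return 0.0
        p, r = tp / len(pred), tp / len(gold)
        return 2 * p * r / (p + r)

    gold = {((1,), "Importing"), ((3,), "Measure_mass")}
    pred = {((1,), "Importing")}
    print(exact_match_f1(gold, pred))  # 2 * 1.0 * 0.5 / 1.5 ≈ 0.67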

  11. SEMAFOR Demo http://demo.ark.cs.cmu.edu/parse

  12. How to improve? • Better modeling with current resources? • Ways to use non-FrameNet resources? • Create new resources? [pictured: Dipanjan Das, Sam Thomson]

  13. Better Modeling? • We already have over a million features. • better use of syntactic parsers (e.g., better argument span heuristics, considering alternative parses, constituent parsers) • recall-oriented learning? [Mohit et al. 2012 for NER] • better search in decoding [Das, Martins, & Smith 2012] • joint frame ID & argument ID?

  14. Use Other Resources? • FN 1.5 has just 3k sentences/20k targets in full-text annotations ⇒ data sparseness • semisupervised learning: reasoning about unseen predicates with distributional similarity [Das & Smith 2011] • NER? supersense tagging? • use PropBank → FrameNet mappings to get more training data?

  15. Roadmap • A frame-semantic parser • Multiword expressions • Simplifying annotation for new resources: syntax + semantics

  16. Multiword Expressions • Christmas Day.n, German measles.n, along with.prep, also_known_as.a, armed forces.n, bear arms.v, beat up.v, double-check.v • Losing_it: lose it.v, go ballistic.v, flip out.v, blow cool.v, freak out.v

  17. Multiword Expressions • 926 unique multiword LUs in the FrameNet lexicon ‣ 545 w/ space, 222 w/ underscore, 177 w/ hyphen ‣ 361 frames have an LU containing a space, underscore, or hyphen • support constructions like ‘take a walk’: only the N should be frame-evoking [Calzolari et al. 2002]
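Counts like these can be reproduced approximately (exact numbers vary by FrameNet release) with NLTK's FrameNet corpus reader; a sketch:

    # Count multiword lexical units in the FrameNet lexicon.
    # Assumes NLTK with its FrameNet data installed,
    # e.g. nltk.download('framenet_v15').
    from nltk.corpus import framenet as fn

    names = {lu.name for lu in fn.lus()}          # LU names like 'lose it.v'
    stems = [n.rsplit(".", 1)[0] for n in names]  # strip the POS suffix
    multiword = [s for s in stems if any(c in s for c in " _-")]
    print(len(multiword), "multiword LU names")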

  18. [Screenshots of SEMAFOR output on sentences containing multiword expressions, with errors marked ✗] ...even though take break.v is listed as an LU! (probably not in the training data)

  19. • There has been a lot of work on specific kinds of MWEs (e.g. noun-noun compounds, phrasal verbs) [Baldwin & Kim 2010] ‣ special datasets, tasks, tools • Can MWE identification be formulated in an open-ended annotate-and-model fashion? (one possible encoding is sketched below) ‣ Linguistic challenge: understanding and guiding annotators’ intuitions
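One common way to make MWE identification open-ended is to cast it as sequence tagging with BIO-style labels. The sketch below assumes contiguous expressions only; the actual annotation scheme must also handle gappy expressions like "beat me up".

    def mwe_bio_tags(tokens, mwe_spans):
        # mwe_spans: (start, end) token-index ranges, end exclusive.
        tags = ["O"] * len(tokens)
        for start, end in mwe_spans:
            tags[start] = "B"
            for i in range(start + 1, end):
                tags[i] = "I"
        return tags

    tokens = ["the", "armed", "forces", "freaked", "out"]
    print(mwe_bio_tags(tokens, [(1, 3), (3, 5)]))
    # ['O', 'B', 'I', 'B', 'I']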

  20. MWE Annotation • We are annotating the 50k-word Reviews portion of the English Web Treebank with multiword units (MWEs + NEs)

  21. MWE Annotation [screenshot of an annotated example]
