Probabilistic Frame-Semantic Parsing Noah A. Smith Dipanjan Das Nathan Schneider Desai Chen School of Computer Science Carnegie Mellon University NAACL-HLT June 4, 2010
In a Nutshell • Most models for semantics are very local (cascades of classifiers) • This work: towards more global modeling for rich semantic processing (feature sharing among all semantic classes) (just two probabilistic models) • Our model outperforms the state of the art • Our framework lends itself to extensions and improvements 2 Das, Schneider, Chen and Smith, NAACL-HLT 2010
Outline • Introduction • Background and Datasets • Models and Results • Conclusion
Overview • Annotate English sentences with semantic representations • Combination of: • semantic frame (word sense) disambiguation • semantic role labeling • Frame and role repository: FrameNet (Fillmore et al., 2003)
Frame Semantics • Theory developed by Fillmore (1982) • a word evokes a frame of semantic knowledge • a frame encodes a gestalt event or scenario • it has conceptual dependents filling roles elaborating the frame instance • Example: "the 1995 book by John Grisham" • "book" evokes the TEXT frame • "1995" fills the Time_of_creation role and "by John Grisham" fills the Author role
FrameNet (Fillmore et al., 2003) • frame: MAKE_NOISE • roles: Sound, Place, Time, Noisy_event, Sound_source • lexical units: cough.v, gobble.v, hiss.v, ring.v, yodel.v, ...
FrameNet • Figure: relationships between frames and between roles (Fillmore et al., 2003) • frames shown: EVENT, TRANSITIVE_ACTION, OBJECTIVE_INFLUENCE, CAUSE_TO_MAKE_NOISE, MAKE_NOISE • relation types shown: Inheritance, Causative_of, Excludes
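The frame / roles / lexical units structure above can be sketched as a small data structure. This is only an illustration, not the FrameNet release format: the `Frame` class and `build_lu_index` helper are hypothetical names, and the roles and lexical units listed for CAUSE_TO_MAKE_NOISE are read off the flattened figure, so treat that assignment as an assumption. The point of the index is that one lexical unit (here ring.v) can be listed under more than one frame, which is why frame identification is a disambiguation problem.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Frame:
    """A frame with its frame-specific roles and the lexical units that evoke it."""
    name: str
    roles: Tuple[str, ...]
    lexical_units: Tuple[str, ...]


MAKE_NOISE = Frame(
    name="MAKE_NOISE",
    roles=("Sound", "Place", "Time", "Noisy_event", "Sound_source"),
    lexical_units=("cough.v", "gobble.v", "hiss.v", "ring.v", "yodel.v"),
)

# Assumption: roles and lexical units as read from the flattened figure.
CAUSE_TO_MAKE_NOISE = Frame(
    name="CAUSE_TO_MAKE_NOISE",
    roles=("Purpose", "Place", "Time", "Agent", "Cause", "Sound_maker"),
    lexical_units=("blare.v", "honk.v", "play.v", "ring.v", "toot.v"),
)


def build_lu_index(frames: List[Frame]) -> Dict[str, List[str]]:
    """Map each lexical unit to the names of all frames that list it."""
    index = defaultdict(list)
    for frame in frames:
        for lu in frame.lexical_units:
            index[lu].append(frame.name)
    return dict(index)


LU_INDEX = build_lu_index([MAKE_NOISE, CAUSE_TO_MAKE_NOISE])
print(LU_INDEX["ring.v"])   # two candidate frames: ambiguous
print(LU_INDEX["cough.v"])  # one candidate frame: unambiguous
```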
FrameNet • Statistics: • 795 semantic frames • 7124 roles • 8379 lexical units (predicates) • 139,000 exemplar sentences containing one frame annotation per sentence
A Frame-Semantic Parse • Marco Polo wrote an account of Asian society during the 13th century. • here, the ambiguous word "account" evokes the TEXT frame • Author and Topic are roles filled by the participants in the event or scenario • role labels are frame-specific
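One way to hold such a parse in memory is as a target span, a frame label, and a mapping from role names to token spans. The sketch below is hypothetical (`FrameAnnotation`, `span_text` and `role_fillers` are illustrative names), and the span boundaries are assumptions: the slide text does not preserve which words each label covers, so "account" as the target, "Marco Polo" as Author and "of Asian society" as Topic are plausible readings rather than the gold annotation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # token offsets, end-exclusive


@dataclass
class FrameAnnotation:
    """One frame instance: the evoking target plus its role fillers."""
    target: Span
    frame: str
    roles: Dict[str, Span] = field(default_factory=dict)


def span_text(tokens: List[str], span: Span) -> str:
    """Return the words covered by a span, joined by spaces."""
    start, end = span
    return " ".join(tokens[start:end])


def role_fillers(tokens: List[str], ann: FrameAnnotation) -> Dict[str, str]:
    """Map each role name to the text of its filler."""
    return {role: span_text(tokens, span) for role, span in ann.roles.items()}


tokens = ("Marco Polo wrote an account of Asian society "
          "during the 13th century .").split()

# Assumed alignment (not recoverable from the slide text itself).
parse = FrameAnnotation(
    target=(4, 5),                             # "account"
    frame="TEXT",
    roles={"Author": (0, 2), "Topic": (5, 8)},
)

print(span_text(tokens, parse.target), "->", parse.frame)
print(role_fillers(tokens, parse))
```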
Why Frame-Semantic Parsing? • Combines lexical and predicate-argument semantics • Exploits meaningful primitives developed by experts • the FrameNet lexicon • Richer representation than PropBank-style SRL • No inconsistent symbolic tags (ARG2-ARG5) (Yi et al., 2007; Matsubayashi et al., 2009) • Patterns generalizing across frames and roles can be learned (Matsubayashi et al., 2009)
Outline • Introduction • Background and Datasets • Models and Results • Conclusion
Early Work • Gildea and Jurafsky (2002) • Much smaller version of FrameNet • exemplar sentences
SemEval 2007 • Baker et al. (2007) organized the SemEval task on frame structure extraction • first set of full text annotations available • released a corpus of ~2000 sentences with full frame-semantic parses • Johansson and Nugues (2007) submitted the best performing system • our baseline for comparison (J&N’07)
SemEval 2007 • SemEval 2007 dataset: • training set: 1941 sentences • test set: 120 sentences • Three domains • American National Corpus (travel) • Nuclear Threat Initiative (bureaucratic) • PropBank (news)
SemEval 2007 • Evaluation is done using the official SemEval script • Measures precision, recall and F1 score for frames and arguments • Features a partial matching criterion for frame identification • assigns score between 0 and 1 to closely related frames in the FrameNet hierarchy
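The effect of partial matching on precision, recall and F1 can be illustrated with a minimal sketch. This is not the official SemEval script: the `RELATED` set, the 0.5 credit value, and the function names are placeholders chosen for illustration, since the slides only say that closely related frames receive a score between 0 and 1.

```python
from typing import Callable, Dict, Tuple

# Hypothetical: pairs of frames treated as "closely related" for this example.
RELATED = {frozenset({"MAKE_NOISE", "CAUSE_TO_MAKE_NOISE"})}


def partial_credit(predicted: str, gold: str) -> float:
    """1.0 for an exact match, a placeholder 0.5 for related frames, else 0.0."""
    if predicted == gold:
        return 1.0
    if frozenset({predicted, gold}) in RELATED:
        return 0.5  # assumed value; the real script derives it from the hierarchy
    return 0.0


def precision_recall_f1(
    predicted: Dict[int, str],
    gold: Dict[int, str],
    credit: Callable[[str, str], float] = partial_credit,
) -> Tuple[float, float, float]:
    """Score frame labels keyed by target position, with fractional matches."""
    matched = sum(credit(predicted[t], gold[t]) for t in predicted if t in gold)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


gold = {3: "MAKE_NOISE", 7: "TEXT"}
predicted = {3: "CAUSE_TO_MAKE_NOISE", 7: "TEXT", 9: "EVENT"}
print(precision_recall_f1(predicted, gold))
```

In this toy case the related-frame prediction earns half credit and the spurious target at position 9 lowers precision only, so precision is 1.5/3 and recall is 1.5/2.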
Outline • Introduction • Background and Datasets • Models and Results • Conclusion
Challenges • Several times more labels than traditional shallow semantic parsing • Annotated data does not have gold syntactic annotation • Very little labeled data • Identifying semantic frames for unknown lexical units • Very sparse features
Desired Structure • Everyone in Dublin seems intent on changing places with everyone else. • LOCATIVE_RELATION: Figure, Ground • LOCALE: Locale • EXCHANGE: Exchanger_1, Themes, Exchanger_2 • PURPOSE: Agent, Goal • APPEARANCE: Phenomenon, Inference
Three Subtasks: • Target identification (rule-based; sentence → predicates) • Identifying frame-evoking predicates (nontrivial!) • Frame identification (probabilistic; predicates → frames) • Labeling each target with a frame type (795 possibilities; ~WSD) • Argument identification (probabilistic; frames → frames and arguments) • Finding each frame's arguments (~SRL; roleset is frame-specific)
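The three subtasks compose into a pipeline in which each stage consumes the previous stage's output. The sketch below shows only that control flow. Every name in it is hypothetical, and the toy stand-ins (a one-word lexicon with made-up probabilities and an argument stage that predicts nothing) are placeholders, not the rule-based or probabilistic models described in the talk.

```python
from typing import Callable, Dict, List, Tuple

Tokens = List[str]
Arguments = Dict[str, Tuple[int, int]]  # role name -> token span


def frame_semantic_parse(
    tokens: Tokens,
    identify_targets: Callable[[Tokens], List[int]],
    identify_frame: Callable[[Tokens, int], str],
    identify_arguments: Callable[[Tokens, int, str], Arguments],
) -> List[Tuple[int, str, Arguments]]:
    """Run the three stages in order for every target in the sentence."""
    results = []
    for target in identify_targets(tokens):                    # sentence -> predicates
        frame = identify_frame(tokens, target)                 # predicate -> frame
        arguments = identify_arguments(tokens, target, frame)  # frame -> arguments
        results.append((target, frame, arguments))
    return results


# Toy stand-ins; the probabilities and OTHER_FRAME are invented for illustration.
TOY_FRAME_PROBS = {"account": {"TEXT": 0.9, "OTHER_FRAME": 0.1}}


def toy_targets(tokens: Tokens) -> List[int]:
    """Rule-like stand-in: a token is a target if the toy lexicon lists it."""
    return [i for i, tok in enumerate(tokens) if tok.lower() in TOY_FRAME_PROBS]


def toy_frame(tokens: Tokens, target: int) -> str:
    """Pick the most probable frame for the target word."""
    probs = TOY_FRAME_PROBS[tokens[target].lower()]
    return max(probs, key=probs.get)


def toy_arguments(tokens: Tokens, target: int, frame: str) -> Arguments:
    """Placeholder that predicts no role fillers."""
    return {}


tokens = "Marco Polo wrote an account of Asian society .".split()
print(frame_semantic_parse(tokens, toy_targets, toy_frame, toy_arguments))
```

Passing the stages as callables keeps the pipeline agnostic about how each subtask is solved, so a rule-based target identifier and probabilistic frame and argument models could be swapped in without changing the driver.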
Outline • Introduction • Background and Datasets • Models and Results • Target Identification • Frame Identification • Argument Identification • Final Results • Conclusion
Target Identification • Everyone in Dublin seems intent on changing places with everyone else.