Double Trouble: The Problem of Construal in Semantic Annotation of Adpositions
Jena Hwang, Na-Rae Han, Vivek Srikumar, Archna Bhatia, Tim O’Gorman, Nathan Schneider
August 4, 2017, *SEM, Vancouver
Most languages have adpositions.
in on at by for to of with from about …
bə- lə- mi- ‘al ‘im …
kā ko ne se mẽ par tak …
(n)eun, i/ga, do, (r)eul …
adposition = preposition | postposition
Feature 85A: Order of Adposition and Noun Phrase (Dryer in WALS, http://wals.info/chapter/85)
We know PPs are challenging for syntactic parsing.
‣ a talk at the workshop on prepositions
But what about the meaning beyond linking governor & modifier?
“I study preposition semantics.”
Adpositions have semantics?! (image: https://michaelspiro.wordpress.com/author/michaelspiro/page/4/)
[Figure] Based on the COCA list of the 5000 most frequent English words
Polysemy
• With great frequency comes great polysemy.
• in
‣ in the box
‣ in the afternoon
‣ in love, in trouble
‣ in fact
‣ …
Cross-linguistically interesting
• Small number of grammatical categories
• Language-specific partitioning of functions
• Translations are many-to-many
Bewildering to learn in an L2
Shared functions
‣ They ran to the roof for a quick escape. [to: DESTINATION; for: PURPOSE]
‣ They made for the roof to escape the cops. [for: DESTINATION; to: PURPOSE]
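The two sentences above can be read as a small many-to-many relation between preposition forms and functions. The following is a minimal Python sketch, assuming the reading in which "to the roof" and "for the roof" express DESTINATION while "for a quick escape" and "to escape" express PURPOSE. The triple layout and the function name are illustrative and not part of the annotation scheme.

```python
from collections import defaultdict

# (sentence id, preposition, function) triples read off the two example sentences.
# Assumed reading: "to the roof" / "for the roof" = DESTINATION,
# "for a quick escape" / "to escape" = PURPOSE.
TOKENS = [
    (1, "to", "DESTINATION"),
    (1, "for", "PURPOSE"),
    (2, "for", "DESTINATION"),
    (2, "to", "PURPOSE"),
]


def functions_by_preposition(tokens):
    """Collect the set of functions that each preposition form expresses."""
    table = defaultdict(set)
    for _sentence_id, prep, function in tokens:
        table[prep].add(function)
    return dict(table)


shared = functions_by_preposition(TOKENS)
for prep in sorted(shared):
    print(prep, sorted(shared[prep]))
```

Under this reading both forms end up with the same pair of functions, which is the sense in which the functions are shared.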
Design Principles
1. Coverage: Wicked polysemy and rare senses make it hard to annotate all tokens in a corpus. Goal: annotate all adposition types and tokens in a corpus.
2. Cross-linguistic adequacy: Adpositions/case markers work differently in different languages. Our semantic functions should be as language-independent as possible.
Senses vs. Supersenses
[Figure: the semantic network for over] 1. Protoscene; 2. A-B-C trajectory cluster (2.A On-the-other-side-of, 2.B Above-and-beyond (excess I), 2.C Completion, 2.D Transfer); 3. Covering; 5. Up cluster (5.A More, 5.A.1 Over-and-above (excess II), 5.B Control, 5.C Preference)
[Figure: pie chart, N = 4073] Spatial 25%, Temporal 13%, Neither 62%
• Senses: fine-grained details; lexeme-specific (extensive linguistic & AI research on space & time)
• Supersenses: cross-lexical classes; coarse; interpretable names like TOPIC
Preposition Supersenses
‣ We met in Paris at a shop on a street by the Seine [LOCATION] at 6:00 in the evening on Saturday [TIME].
Supersense Hierarchy 1.0 [LAW 2015]
75 preposition supersense categories: http://tiny.cc/prepwiki
[Figure: hierarchy diagram; node labels in extraction order] Superset, Co-Agent, Creator, Possessor, StartTime, EndTime, ClockTimeCxn, Agent, Whole, Elements, Instance, DeicticTime, RelativeTime, Function, Species, Causer, Quantity, Reciprocation, Purpose, Age, Time, Frequency, Configuration, Affector, Duration, Explanation, Attribute, Temporal, Participant, Co-Participant, Circumstance, Accompanier, Patient, Experiencer, Stimulus, Place, Comparison/Contrast, Undergoer, Co-Patient, ProfessionalAspect, Value, Scalar/Rank, Theme, Path, Manner, Locus, Activity, Extent, Co-Theme, Topic, ValueComparison, Instrument, Contour, Beneficiary, Location, Source, State, Approximator, Direction, StartState, Means, Via, InitialLocation, Material, Traversed, Goal, Transit, Donor/Speaker, 1DTrajectory, EndState, Destination, 2DArea, 3DMedium, Course, Recipient
English Annotation in STREUSLE [LAW 2016]
• Online reviews corpus previously annotated for multiword expressions and noun & verb supersenses. 55,000 words, including 4,250 prepositions.
• Comprehensive annotation: first dataset with all prepositions (types + tokens) semantically annotated
‣ Sentences not hand-selected
‣ Sentences fully annotated
‣ Preposition types not constrained by a lexicon (labels generalize)
‣ All sentences seen by multiple annotators
Comparing resources [LAW 2016]
Column headers (as extracted): P P ∞ ∞ P* {P1,P2} Ann P1 < P2 X ~ P P P
• TPP: The Preposition Project (Litkowski & Hargraves 2005, SemEval 2007 shared task): ✓ ✓ (✓)
• D+ 7: TPP senses for 7 preposition types in PropBank WSJ data (Dahlmeier et al. 2009): ✓ ✓
• Tratz 34: Annotator-optimized revised TPP senses for 34 SemEval prepositions (Tratz 2011): ✓ ✓ (✓)
• S&R 34: 32 hard clusters of TPP senses for 34 SemEval prepositions (Srikumar & Roth 2013): ✓
• Ours: Preposition supersenses (Schneider et al. LAW 2015, 2016): ✓ ✓ ✓ ✓ ✓
A Vexing Problem
• Drawing clean boundaries between semantic categories is always difficult.
• But we were surprised by the frequency of apparent overlaps between semantic role labels.
• These overlaps proved pervasive in the other languages we looked at.
Destination/Location
• The prepositions to, into, onto, and for explicitly encode DESTINATION.
• DESTINATION masquerading as static LOCATION:
‣ Put the pen in the box. (= into)
‣ He threw his cards on the table. (= onto)
‣ The ball rolled behind the trash can.
• Extremely productive for motion/caused motion!
• We could stipulate one or the other, but annotators would still get confused.
Fictive Motion
• In the other direction, we know that static locative relations can be described using dynamic language (Talmy 1996):
‣ The road runs through the trees.
‣ I heard him from the room next door.
‣ The school is around the corner.
• In assigning a semantic label, is it sufficient to “choose sides” between the static nature of the spatial scene and the dynamic way that relation is portrayed by the preposition?
Stimulus/Topic
• Another conundrum:
‣ I thought about getting my ears pierced: TOPIC (cf. know, talk, read)
‣ I feared getting my ears pierced: STIMULUS (cf. see, hurt)
‣ I was scared about getting my ears pierced: ???
• Again, two labels are competing for semantic territory.
• Should we add more categories with double inheritance? (Problem: Proliferation of categories.)
• Should we just allow annotators to specify multiple labels if they’re unsure? (Problem: Would create inconsistency.)
Construal
• Assumption thus far: preposition token’s semantics = role in a scene
‣ I thought about getting my ears pierced. [scene role: Topic; function: Topic]
• But it’s not always so simple:
‣ I was scared about getting my ears pierced. [scene role: Stimulus; function: Topic]
Construal
• Observation: The preposition can frame or construe the situation in a way that differs from the predicate or scene.
• Solution: Allow tokens to receive two labels from the hierarchy, one for the scene role and one for the preposition’s semantic function, when warranted.
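The two-label solution can be pictured as a small record per adposition token. The following is a minimal Python sketch under stated assumptions: the class name, field names, and the `role->function` string notation are hypothetical and chosen for illustration, not taken from the annotation guidelines or any released tooling. The example values come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdpositionAnnotation:
    """One annotated adposition token (hypothetical record layout)."""
    sentence: str
    adposition: str
    scene_role: str  # role of the adposition's object in the scene
    function: str    # meaning contributed by the adposition itself

    @property
    def is_construal(self) -> bool:
        """True when the adposition frames the scene role as something else."""
        return self.scene_role != self.function

    def label(self) -> str:
        """One label if both coincide, otherwise 'role->function' (ad hoc notation)."""
        if self.is_construal:
            return f"{self.scene_role}->{self.function}"
        return self.scene_role


thought = AdpositionAnnotation(
    "I thought about getting my ears pierced.", "about", "Topic", "Topic")
scared = AdpositionAnnotation(
    "I was scared about getting my ears pierced.", "about", "Stimulus", "Topic")

print(thought.label())  # Topic
print(scared.label())   # Stimulus->Topic
```

Keeping both fields on every token, even when they coincide, means the "when warranted" case needs no separate record type: a construal is just a token whose two labels differ.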
Construal
• In fact, STIMULUS can be construed differently by different prepositions:
‣ I was scared by the bear. [scene role: Stimulus; function: Causer]
‣ I was scared about getting my ears pierced. [scene role: Stimulus; function: Topic]
Experiencer Dative
• Experiencers can be realized as recipients/datives:
‣ The bear felt scary to me. [scene role: Experiencer; function: Recipient]
• In some languages, this is the main way EXPERIENCERs are realized:
‣ koev li ha-roš. [Hebrew]
  hurts to.me the-head
  ‘My head hurts.’
‣ mujh-ko garmii lag rahii hai. [Hindi]
  I-DAT heat feel PROG PRES
  ‘I’m feeling hot.’
Employment
• The PROFESSIONALASPECT label is used for employer–employee and other professional relationships.
• It participates in several different preposition construals:
‣ He works for XYZ Inc. [scene role: ProfAsp; function: Beneficiary]
‣ He works at XYZ Inc. [scene role: ProfAsp; function: Location]
‣ He’s from XYZ Inc. [scene role: ProfAsp; function: Source]
‣ He’s with XYZ Inc. [scene role: ProfAsp; function: Accompanier]
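The employment examples show one scene role paired with several functions, depending on the preposition. The following is a minimal Python sketch that tabulates those pairings. The triple layout and the helper name `construals_of` are hypothetical, and "ProfessionalAspect" spells out the slides' "ProfAsp" abbreviation.

```python
# (adposition, scene role, function) triples taken from the employment examples.
EXAMPLES = [
    ("for", "ProfessionalAspect", "Beneficiary"),   # He works for XYZ Inc.
    ("at", "ProfessionalAspect", "Location"),       # He works at XYZ Inc.
    ("from", "ProfessionalAspect", "Source"),       # He's from XYZ Inc.
    ("with", "ProfessionalAspect", "Accompanier"),  # He's with XYZ Inc.
]


def construals_of(scene_role, examples):
    """Map each adposition to its function for tokens with the given scene role.

    Only true construals are kept, i.e. tokens whose function differs from the role.
    """
    return {
        adposition: function
        for adposition, role, function in examples
        if role == scene_role and function != role
    }


employment = construals_of("ProfessionalAspect", EXAMPLES)
for adposition, function in employment.items():
    print(f"{adposition}: ProfessionalAspect construed as {function}")
```

A table like this makes it easy to see at a glance which functions a given scene role has been observed with, which is the kind of overview the slide gives informally.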
Null Functions?
• Sometimes it’s hard to tell whether the adposition makes any semantic contribution:
‣ I’m angry with my mom. (*mad) [scene role: Stimulus; function: ?]
‣ She’s interested in politics. (*fascinated) [scene role: Topic; function: ?]