
Jena Hwang, Na-Rae Han, Vivek Srikumar, Archna Bhatia, Tim O’Gorman - PowerPoint PPT Presentation

Double Trouble: The Problem of Construal in Semantic Annotation of Adpositions. Jena Hwang, Na-Rae Han, Vivek Srikumar, Archna Bhatia, Tim O’Gorman, Nathan Schneider. August 4, 2017, *SEM, Vancouver.


  1. Double Trouble: The Problem of Construal in Semantic Annotation of Adpositions
     Jena Hwang, Na-Rae Han, Vivek Srikumar, Archna Bhatia, Tim O’Gorman, Nathan Schneider
     August 4, 2017, *SEM, Vancouver

  2. Most languages have adpositions.
     adposition = preposition | postposition
     • English: in, on, at, by, for, to, of, with, from, about …
     • Hebrew: bə-, lə-, mi-, ‘al, ‘im …
     • Hindi: kā, ko, ne, se, mẽ, par, tak …
     • Korean: (n)eun, i/ga, do, (r)eul …

  3. Feature 85A: Order of Adposition and Noun Phrase
     Dryer in WALS, http://wals.info/chapter/85

  4. We know PPs are challenging for syntactic parsing.
     ‣ a talk at the workshop on prepositions
     But what about the meaning beyond linking governor & modifier?

  5. “I study preposition semantics.”

  6. Adpositions have semantics?! https://michaelspiro.wordpress.com/author/michaelspiro/page/4/

  7. based on COCA list of 5000 most frequent English words

  8. Polysemy
     • With great frequency comes great polysemy.
     • in
       ‣ in the box
       ‣ in the afternoon
       ‣ in love, in trouble
       ‣ in fact
       ‣ …

  9. Cross-linguistically interesting
     • Small number of grammatical categories
     • Language-specific partitioning of functions
     • Translations are many-to-many

  10. Bewildering to learn in an L2

  11. Shared functions
      ‣ They ran to the roof for a quick escape. (to: DESTINATION; for: PURPOSE)
      ‣ They made for the roof to escape the cops. (for: DESTINATION; to: PURPOSE)

  12. Design Principles
      1. Coverage: Wicked polysemy and rare senses make it hard to annotate all tokens in a corpus.
      2. Cross-linguistic adequacy: Adpositions/case markers work differently in different languages. Ideally, our semantic functions should be language-independent.

  13. Design Principles
      1. Coverage: Annotate all adposition types and tokens in a corpus.
      2. Cross-linguistic adequacy: Adpositions/case markers work differently in different languages. Ideally, our semantic functions should be language-independent.

  14. Design Principles
      1. Coverage: Annotate all adposition types and tokens in a corpus.
      2. Cross-linguistic adequacy: Our semantic functions should be as language-independent as possible.

  15. Senses vs. Supersenses
      [Figure: the semantic network for a single preposition. Nodes: 1. Protoscene; 2. A-B-C trajectory cluster; 2.A On-the-other-side-of; 2.B Above-and-beyond (excess I); 2.C Completion; 2.D Transfer; 3. Covering; 5. Up cluster; 5.A More; 5.A.1 Over-and-above (excess II); 5.B Control; 5.C Preference.]
      • Senses: fine-grained details; lexeme-specific (extensive linguistic & AI research on space & time)

  16. Senses vs. Supersenses
      [Figure: the same semantic network, plus a pie chart with N = 4073: Spatial 25%, Temporal 13%, Neither 62%.]
      • Senses: fine-grained details; lexeme-specific (extensive linguistic & AI research on space & time)

  17. Senses vs. Supersenses
      [Figure: the same semantic network.]
      • Senses: fine-grained details; lexeme-specific (extensive linguistic & AI research on space & time)
      • Supersenses: cross-lexical classes; coarse; interpretable names like TOPIC

  18. Preposition Supersenses
      ‣ LOCATION: We met in Paris at a shop on a street by the Seine
      ‣ TIME: at 6:00 in the evening on Saturday.

  19. Supersense Hierarchy 1.0 [LAW 2015]
      75 preposition supersense categories, http://tiny.cc/prepwiki
      [Figure: hierarchy diagram. Labels, in extraction order: Superset, Co-Agent, Creator, Possessor, StartTime, EndTime, ClockTimeCxn, Agent, Whole, Elements, Instance, DeicticTime, RelativeTime, Function, Species, Causer, Quantity, Reciprocation, Purpose, Age, Time, Frequency, Configuration, Affector, Duration, Explanation, Attribute, Temporal, Participant, Co-Participant, Circumstance, Accompanier, Patient, Experiencer, Stimulus, Place, Comparison/Contrast, Undergoer, Co-Patient, ProfessionalAspect, Value, Scalar/Rank, Theme, Path, Manner, Locus, Activity, Extent, Co-Theme, Topic, ValueComparison, Instrument, Contour, Beneficiary, Location, Source, State, Approximator, Direction, StartState, Means, Via, InitialLocation, Material, Traversed, Goal, Transit, Donor/Speaker, 1DTrajectory, EndState, Destination, 2DArea, 3DMedium, Course, Recipient.]

  20. English Annotation in STREUSLE [LAW 2016]
      • Online reviews corpus previously annotated for multiword expressions and noun & verb supersenses. 55,000 words, including 4,250 prepositions.
      • Comprehensive annotation: first dataset with all prepositions (types + tokens) semantically annotated
        ‣ Sentences not hand-selected
        ‣ Sentences fully annotated
        ‣ Preposition types not constrained by a lexicon (labels generalize)
        ‣ All sentences seen by multiple annotators

  21. Comparing resources [LAW 2016]
      [Table: comparison of resources. The column headers were icons and are not recoverable; checkmarks are listed in their extracted order.]
      • TPP: The Preposition Project (Litkowski & Hargraves 2005, SemEval 2007 shared task): ✓ ✓ (✓)
      • D+ 7: TPP senses for 7 preposition types in PropBank WSJ data (Dahlmeier et al. 2009): ✓ ✓
      • Tratz 34: Annotator-optimized revised senses for 34 TPP SemEval prepositions (Tratz 2011): ✓ ✓ (✓)
      • S&R 34: 32 hard clusters of TPP senses for 34 SemEval prepositions (Srikumar & Roth 2013): ✓
      • Ours: Preposition supersenses (Schneider et al. LAW 2015, 2016): ✓ ✓ ✓ ✓ ✓

  22. A Vexing Problem
      • Drawing clean boundaries between semantic categories is always difficult.
      • But we were surprised by the frequency of apparent overlaps between semantic role labels.
      • These overlaps proved pervasive in the other languages we looked at.

  23. Destination/Location
      • The prepositions to, into, onto, and for explicitly encode DESTINATION.
      • DESTINATION masquerading as static LOCATION:
        ‣ Put the pen in the box. (= into)
        ‣ He threw his cards on the table. (= onto)
        ‣ The ball rolled behind the trash can.
      • Extremely productive for motion/caused motion!
      • We could stipulate one or the other, but annotators would still get confused.

  24. Fictive Motion
      • In the other direction, we know that static locative relations can be described using dynamic language (Talmy 1996):
        ‣ The road runs through the trees.
        ‣ I heard him from the room next door.
        ‣ The school is around the corner.
      • In assigning a semantic label, is it sufficient to “choose sides” between the static nature of the spatial scene and the dynamic way that relation is portrayed by the preposition?

  25. Stimulus/Topic
      • Another conundrum:
        ‣ I thought about getting my ears pierced: TOPIC (cf. know, talk, read)
        ‣ I feared getting my ears pierced: STIMULUS (cf. see, hurt)
        ‣ I was scared about getting my ears pierced: ???
      • Again, two labels are competing for semantic territory.
      • Should we add more categories with double inheritance? (Problem: Proliferation of categories.)
      • Should we just allow annotators to specify multiple labels if they’re unsure? (Problem: Would create inconsistency.)

  26. Construal
      • Assumption thus far: preposition token’s semantics = role in a scene
        ‣ I thought about getting my ears pierced. (scene: Topic; preposition: Topic)
      • But it’s not always so simple:
        ‣ I was scared about getting my ears pierced. (scene: Stimulus; preposition: Topic)

  27. Construal
      • Observation: The preposition can frame or construe the situation in a way that differs from the predicate or scene.
      • Solution: Allow tokens to receive two labels from the hierarchy, one for the scene role and one for the preposition’s semantic function, when warranted.

  28. Construal
      • In fact, Stimulus can be interpreted differently by different prepositions:
        ‣ I was scared by the bear. (scene role: Stimulus; function: Causer)
        ‣ I was scared about getting my ears pierced. (scene role: Stimulus; function: Topic)

  29. Experiencer Dative
      • Experiencers can be realized as recipients/datives:
        ‣ The bear felt scary to me. (scene role: Experiencer; function: Recipient)
      • In some languages, this is the main way EXPERIENCERs are realized:
        ‣ koev li ha-roš. [Hebrew]
          hurts to.me the-head
          ‘My head hurts.’
        ‣ mujh-ko garmii lag rahii hai. [Hindi]
          I-DAT heat feel PROG PRES
          ‘I’m feeling hot.’

  30. Employment
      • The PROFESSIONALASPECT label is used for employer–employee and other professional relationships.
      • It participates in several different preposition construals:
        ‣ He works for XYZ Inc. (scene role: ProfAsp; function: Beneficiary)
        ‣ He works at XYZ Inc. (scene role: ProfAsp; function: Location)
        ‣ He’s from XYZ Inc. (scene role: ProfAsp; function: Source)
        ‣ He’s with XYZ Inc. (scene role: ProfAsp; function: Accompanier)

  31. Null Functions?
      • Sometimes it’s hard to tell whether the adposition has any semantic contribution:
        ‣ I’m angry with my mom. (scene role: Stimulus; function: ?) (cf. *mad with)
        ‣ She’s interested in politics. (scene role: Topic; function: ?) (cf. *fascinated in)
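
The two-label scheme from the Construal slides (a scene role plus the preposition's function) can be pictured as a small record type. The sketch below is only an illustration, not the authors' annotation format: the class name `AdpositionAnnotation`, its field names, and the helper `functions_by_scene_role` are all hypothetical. The sentences and labels are the ones shown on the slides.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AdpositionAnnotation:
    """One annotated adposition token (hypothetical record layout)."""
    sentence: str
    adposition: str
    scene_role: str  # role in the scene evoked by the predicate
    function: str    # meaning contributed by the adposition itself

    @property
    def is_construal(self) -> bool:
        # The labels diverge when the adposition frames the scene
        # differently from the predicate.
        return self.scene_role != self.function


# Examples and labels taken from the slides.
ANNOTATIONS = [
    AdpositionAnnotation("I thought about getting my ears pierced.", "about", "Topic", "Topic"),
    AdpositionAnnotation("I was scared about getting my ears pierced.", "about", "Stimulus", "Topic"),
    AdpositionAnnotation("I was scared by the bear.", "by", "Stimulus", "Causer"),
    AdpositionAnnotation("The bear felt scary to me.", "to", "Experiencer", "Recipient"),
    AdpositionAnnotation("He works for XYZ Inc.", "for", "ProfessionalAspect", "Beneficiary"),
    AdpositionAnnotation("He works at XYZ Inc.", "at", "ProfessionalAspect", "Location"),
    AdpositionAnnotation("He's from XYZ Inc.", "from", "ProfessionalAspect", "Source"),
    AdpositionAnnotation("He's with XYZ Inc.", "with", "ProfessionalAspect", "Accompanier"),
]


def functions_by_scene_role(annotations):
    """Collect, for each scene role, the sorted set of functions seen with it."""
    grouped = defaultdict(set)
    for ann in annotations:
        grouped[ann.scene_role].add(ann.function)
    return {role: sorted(funcs) for role, funcs in grouped.items()}


if __name__ == "__main__":
    for role, funcs in functions_by_scene_role(ANNOTATIONS).items():
        print(role, "->", ", ".join(funcs))
```

Keeping the two labels as separate fields, rather than minting a combined category per pairing, mirrors the slides' concern about proliferating categories: a new construal is just a new pairing of labels already in the hierarchy.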
