
Corpus Annotation for Surface Construction Labeling
Lori Levin, Language Technologies Institute, Carnegie Mellon University
Collaborators and acknowledgements: Jesse Dunietz, Jaime Carbonell


  1. Outline • What is Surface Construction Labeling (SCL)? • The causality task (Dunietz 2018) • The Because onomasiological annotation scheme • Making the task doable – Constructicon as an annotation tool – Overlapping Relations • Slow research musings about grammaticalization and the contents of the constructicon

  2. Annotation for Surface Construction Labeling • Onomasiological – “How do you express x?” • https://en.wikipedia.org/wiki/Onomasiology • Hasegawa et al. • For any potential annotation unit: – Is meaning X expressed? (semasiological) – Is there a form that expresses X? • If you do this as a corpus study, you end up with a collection of constructions that express X in the corpus.

  3. The Because Corpus (BECAUSE = Bank of Effects and Causes Stated Explicitly)

     Source                                                           Documents  Sentences  Causal
     New York Times, Washington section (Sandhaus, 2014)                     59       1924     717
     Penn TreeBank WSJ                                                       47       1542     534
     2014 NLP Unshared Task in PoliInformatics (Smith et al., 2014)           3        772     324
     Manually Annotated Sub-Corpus (Ide et al., 2010)                        12        629     228
     Total                                                                  121       4790    1803

  4. The Because Annotation Scheme • Bank of Effects and Causes Stated Explicitly (BECAUSE)

  5. Causal language: a clause or phrase in which one event, state, action, or entity is explicitly presented as promoting or hindering another

  6. Annotators can’t do this. Causal language: a clause or phrase in which one event, state, action, or entity is explicitly presented as promoting or hindering another

  7. We annotate three types of causation.
     – CONSEQUENCE: The system failed because of a loose screw.
     – MOTIVATION: Mary left because John was there.
     – PURPOSE: Mary left in order to avoid John.
     For us to succeed, we all have to cooperate: we all have to cooperate (cause) for us to succeed (effect).

  8. Motivation and Purpose
     • Motivation: the reason exists
       – Mary left because John was there: John being there (cause) motivated Mary to leave (effect).
     • Purpose: a desired state
       – Mary left in order to avoid John: (Mary wants to avoid John) causes (Mary leaves); (Mary leaves) may cause/enable (Mary avoids John).
     For us to succeed, we all have to cooperate: we all have to cooperate (cause) for us to succeed (effect).

  9. Causation can be positive or negative.
     – FACILITATE: This has often caused problems elsewhere.
     – INHIBIT: He kept the dog from leaping at her.

  10. Connective: fixed morphological or lexical cue indicating a causal construction John killed the dog because it was threatening his chickens. John prevented the dog from eating his chickens. Ice cream consumption causes drowning. We do not annotate most agentive verbs. We only annotate verbs that express causation as their primary meaning.

  11. Effect: presented as outcome. Cause: presented as producing the effect.
     – John killed the dog because it was threatening his chickens.
     – John prevented the dog from eating his chickens.
     – Ice cream consumption causes drowning.
     – She must have met him before, because she recognized him yesterday.

  12. Means arguments for cases with an agent and an action
     – My dad (CAUSE) caused a commotion (EFFECT) by shattering a glass (MEANS).
     – By altering immune responses, inflammation can trigger depression.
     We have not yet added the AFFECTED frame element (Rehbein and Ruppenhofer 2017).

  13. We exclude language that does not encode pure, explicit causation:
     – Relationships with no lexical trigger: John killed the dog. It was threatening his chickens.
     – Connectives lexicalizing a means or result: John killed the dog.
     – Unspecified causal relationships: The treatment is linked to better outcomes.

  14. Actual corpus examples can get quite complex.
     – "For market discipline to effectively constrain risk, financial institutions must be allowed to fail."
     – "If properly done, a market sensitive regulatory authority not only prevents some of the problems, but is pro-market, because we have investors now who are unwilling to invest even in things they should."
     Average causal sentence length: 30 words

  15. Complex corpus examples

  16. Outline • What is Surface Construction Labeling (SCL)? • The causality task (Dunietz 2018) • The Because onomasiological annotation scheme • Making the task doable – Constructicon as an annotation tool – Overlapping Relations • Slow research musings about grammaticalization and the contents of the constructicon

  17. Decision Tree

  18. Annotators were guided by a "constructicon." Two example entries:
      Entry 1
        Connective pattern:   <cause> prevents <effect> from <effect>
        Annotatable words:    prevent, from
        WordNet verb senses:  prevent.verb.01, prevent.verb.02
        Type:                 Verbal
        Degree:               INHIBIT
        Example:              His actions prevented a disaster.
      Entry 2
        Connective pattern:   <enough cause> for <effect> to <effect>
        Annotatable words:    enough, for, to
        Type:                 Complex
        Degree:               FACILITATE
        Type restrictions:    Not PURPOSE
        Example:              There's enough time for you to find a restroom.
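
To make the entries above concrete, here is a minimal sketch (not the project's actual tooling) of how one constructicon entry might be stored as a data structure; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ConstructiconEntry:
    pattern: str                  # connective pattern with argument slots
    annotatable_words: List[str]  # the words annotators mark as the connective
    wordnet_senses: List[str]     # verb senses the entry applies to, if any
    cxn_type: str                 # e.g. "Verbal", "Complex"
    degree: str                   # "FACILITATE" or "INHIBIT"
    restrictions: Optional[str]   # e.g. "Not PURPOSE"
    example: str

prevent_from = ConstructiconEntry(
    pattern="<cause> prevents <effect> from <effect>",
    annotatable_words=["prevent", "from"],
    wordnet_senses=["prevent.verb.01", "prevent.verb.02"],
    cxn_type="Verbal",
    degree="INHIBIT",
    restrictions=None,
    example="His actions prevented a disaster.",
)
```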

  19. Inter-annotator agreement (260 sentences; 98 causal instances; 82 overlapping relations)

                                       Causal   Overlapping
      Connective spans (F1)             0.77       0.89
      Relation types (κ)                0.70       0.91
      Degrees (κ)                       0.92       n/a
      CAUSE/ARGC spans (%)              0.89       0.96
      CAUSE/ARGC spans (Jaccard)        0.92       0.97
      CAUSE/ARGC heads (%)              0.92       0.96
      EFFECT/ARGE spans (%)             0.86       0.84
      EFFECT/ARGE spans (Jaccard)       0.93       0.92
      EFFECT/ARGE heads (%)             0.95       0.89
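
For readers unfamiliar with two of the measures in the table, here is a minimal sketch of Jaccard span overlap and Cohen's kappa; this is illustrative code with made-up inputs, not the evaluation scripts behind these numbers.

```python
def jaccard(span_a, span_b):
    """Overlap between the token-index sets two annotators marked for one argument."""
    a, b = set(span_a), set(span_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

def cohens_kappa(labels_a, labels_b):
    """Agreement on categorical labels (e.g. relation types), corrected for chance."""
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0

print(round(jaccard(range(3, 9), range(5, 11)), 2))          # 0.5: partial span overlap
print(round(cohens_kappa(["Motivation", "Purpose", "Consequence"],
                         ["Motivation", "Consequence", "Consequence"]), 2))  # 0.5
```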

  20. The Causal Language Constructicon • 290 construction variants • 192 lexically distinct connectives – “prevent” and “prevent from” use the same primary connective word

  21. New examples can be added • Annotators recommend new connectives • Do a quick corpus study to verify that the connective frequently expresses causality (a sketch of this check follows) – In other words, its use for causality seems to be conventionalized
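
A rough sketch of the "quick corpus study" idea, assuming an annotator has judged a small sample of sentences containing a candidate connective; the sample sentences, the candidate "owing to", and the 0.5 threshold are invented for illustration.

```python
def causal_proportion(judged_instances):
    """judged_instances: (sentence, is_causal) pairs judged during the corpus study."""
    causal = sum(1 for _, is_causal in judged_instances if is_causal)
    return causal / len(judged_instances)

sample = [
    ("The flight was delayed owing to fog.", True),
    ("He resigned owing to ill health.", True),
    ("She still has money owing to her from the sale.", False),
]
if causal_proportion(sample) >= 0.5:   # threshold is an arbitrary illustration
    print("'owing to' looks conventionalized enough to add to the constructicon.")
```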

  22. We annotate 7 different types of overlapping relations.
     – TEMPORAL: after; once; during
     – CORRELATION: as; the more…the more…
     – HYPOTHETICAL: if…then…
     – OBLIGATION/PERMISSION: require; permit
     – CREATION/TERMINATION: generate; eliminate
     – EXTREMITY/SUFFICIENCY: so…that…; sufficient…to…
     – CONTEXT: without; when (circumstances where…)

  23. Lingering difficulties with other overlapping relations
     – Origin/destination: toward that goal
     – Topic: fuming over recent media reports
     – Component: as part of the liquidation
     – Evidentiary basis: went to war on bad intelligence
     – Having a role: as an American citizen
     – Placing in a position: puts us at risk

  24. Overlapping Relation Examples • Temporal – Within minutes after the committee released its letter, Torricelli took the senate floor to apologize to the people of New Jersey. • Correlation – Auburn football players are reminded of last year’s losses every time they go into the weight room. • Hypothetical – Previously, he allowed increases in emissions as long as they did not exceed the rate of economic growth.

  25. Overlapping Relation Examples • Obligation/Permission – He will roll back a provision known as new source review that compels utilities to install modern pollution controls whenever they significantly upgrade older plants. • “whenever” is also a connective • Creation/Termination – Many expected synergies of financial service activities gave rise to conflicts and excessive risk taking. • Context – With Hamas controlling Gaza, it was not clear that Mr. Abbas had the power to carry out his decrees.

  26. Annotators applied several tests to determine when an overlapping relation was also causal.
     • Can the reader answer a “why” question?
     • Does the cause precede the effect?
     • Counterfactuality (Grivaz, 2010): would the effect have been just as probable without the cause?
       – Rules out: My bus will leave soon; I just finished my breakfast.
     • Ontological asymmetry (Grivaz, 2010): could the cause and effect be reversed?
       – Rules out: It is a triangle. It has three sides.
     • Can it be rephrased with “because”?

  27. Temporal with and without causal interpretation
     – MOTIVATION + TEMPORAL: After last year’s fiasco [ARGC], everyone is being cautious [ARGE].
     – TEMPORAL: After last year’s fiasco [ARGC], they’ve rebounded this year [ARGE].

  28. Conditional hypotheticals don’t have to be causal, but most are.
     – Non-causal: If he comes, he’ll bring his wife.
     – Causal: If I told you, I’d have to kill you.
     84% carry causal meaning.

  29. Overlapping relations

  30. Causality has seeped into the temporal and hypothetical domains. Of the causal expressions in the corpus:
     – more than 14% are piggybacked on temporal relations
     – about 7% are expressed as hypotheticals

  31. Conditional hypotheticals don’t have to be causal, but most are.
     – Non-causal: If he comes, he’ll bring his wife.
     – Causal: If I told you, I’d have to kill you.
     84% carry causal meaning.

  32. Outline • What is Surface Construction Labeling (SCL)? • The causality task (Dunietz 2018) • The Because onomasiological annotation scheme • Making the task doable – Constructicon as an annotation tool – Overlapping Relations • Slow research musings about grammaticalization and the contents of the constructicon

  33. Slow Research Musings • Usefulness of SCL and onomasiological annotation – In NLP tasks – For linguistic discovery

  34. Slow research musings • Convention vs exploitation – Patrick Hanks, Lexical Analysis: Norms and Exploitations, MIT Press. • With respect to constructions, is conventional the same as grammaticalized? • In 84% of if-then instances in the Because corpus, the sentence seems to have the intent of expressing causality: – Does that mean that if-then is a conventional/grammaticalized way of expressing causality? – If so, does it lead us to extreme conclusions…

  35. Slow research musings • Constructions whose meaning side is a frame: e.g., causality • Constructions whose meaning side is a function: e.g., “to V - base”, “ Det N”

  36. Slow Research Musings • Fusion constructions vs overlapping relations – English dative shift: The child was hungry. So his mother gave him (recipient) a cookie (theme). So his mother gave a cookie (theme) to him (recipient). • Animacy, information status, definiteness, NP weight, change of possession/information/state • Joan Bresnan, Anna Cueni, Tatiana Nikitina, and Harald Baayen. 2007. "Predicting the Dative Alternation." In Cognitive Foundations of Interpretation, ed. by G. Bouma, I. Kraemer, and J. Zwarts. Amsterdam: Royal Netherlands Academy of Science, pp. 69-94. • Frishkoff et al. (2008): Principal components analysis reduced Bresnan’s 14 features to four, roughly corresponding to verb class, NP weight, animacy, and information status.

  37. Slow research musings Fusion Constructions – Chinese ba (aside from issues of the part of speech) • proposition-level completedness, lexical aspect, discourse information status of the direct object, affectedness of the direct object

  38. Slow research musings • What is the meaning side of fusional constructions in a constructicon? – An abstraction from which all the other functions follow? • English Dative Shift: Topic worthiness of the recipient • Ba : affectedness or completedness – Or would fusional constructions be represented as a collection of overlapping relations?

  39. Stop here

  40. Lingering difficulties include other overlapping relations and bidirectional relationships.
     For us to succeed, we all have to cooperate.
     [Diagram: “cooperate” enables “succeed”; “succeed” necessitates “cooperate.”]

  41. • But not every construction evokes a frame. There could be a function instead of a frame, like information structure (Goldberg; Fillmore, Lee-Goldman, and Rhodes, 2012). Chuck Fillmore mentioned the need for interactional frames for pragmatics, and some people have pursued this.

  42. Foundation: Form and Meaning • Constructions (pairs of form and meaning) (Fillmore): – Meaning is not a discrete space • However there are centroids of meaning that tend to get grammaticalized – Meaning is grammaticalized in arbitrary constellations of linguistic units – Meaning can be compositional or conventional (idiomatic) • Languages have comparable categories – Common centroids of form and meaning like “noun” • Languages differ – How they carve up the meaning space when they grammaticalize it – How they spread from the centroids – What constellations of linguistic units are used

  43. Meaning is not a discrete space • http://www.ecenglish.com/learnenglish/lessons/will-would-shall-should • Desire, preference, choice or consent: – Will you please be quiet? – He won’t wash the dishes • Future: – It will be a great party. – I will go to the market. • Capability/capacity: – The ship will take three hundred guests. – This bottle will hold two litres of wine. • Other: – Phone rings: That will be my son

  44. Discrete centroids tend to be grammaticalized, but not in the same way in each language.
     [Semantic map (after Ferdinand De Haan, “On Representing Semantic Maps”): the ovals represent points in a semantic space spanning prohibitive, imperative, counterfactual, hypothetical, past, present, and future; each outline shows the part of that space covered by the irrealis morpheme of one language (Muyuw, Manam, Susurunga, Sinaugoro).]

  45. Related spaces are grammaticalized differently: anaphoricity, specificity, familiarity, genericity.
     – Specific: I’m looking for a student. Her name is Chris.
     – Non-specific: I’m looking for a student. I need one to help me with something.
     – Discourse-old: I met a student. The student was tall.
     – Discourse-predictable: I went to a wedding. The bride and groom looked great.
     – Uniqueness / familiarity in context: Hand me the pen on the desk.
     – Also in this space: genericity, abstract nouns, mass nouns.
     “The” and “a” are not meanings. They are grammaticalizations of parts of this space.

  46. Constructions: pairs of form and meaning (Goldberg) • Morpheme – -ed past time • Word – Student • Phrase (productive and meaning is compositional) – The student (an anchored instance of a student)

  47. Constructions: pairs of form and meaning
     • Phrase (productive but meaning is non-compositional)
       – What a nice shirt!
       – Why not read a book? (suggestion, invitation, deontic)
     • Phrase (not productive, Goldberg)
       – To prison, to bed, to school, to hospital, on holiday
       – *to airport, *to bath
     • Idiom
       – Out in left field

  48. Constructions: pairs of form and meaning • Arbitrary constellation of units of form, possibly discontinuous, possibly non-compositional – What is she doing going to the movies? (Fillmore et al.) (incongruity) – What do you mean you don’t know? (disbelief, incongruity) – It was too big to fail – No intellectual characteristic is too ineffable for assessment.

  49. Task definition: connective discovery + argument identification
     – Connective discovery: find the lexical triggers of causal relations (e.g., “because” in “I worry because I care.”)
     – Argument identification: identify cause and effect spans for each connective (e.g., “I worry” and “I care”)

  50. Goal of Surface Construction Labeling • Improve shallow semantic parsing coverage using richer, more flexible linguistic representations, leading to a unified approach to Frame Semantic Parsing, integrating the FrameNet Lexicon and the FrameNet Constructicon. Goal of the causal labeling project • Create a proof-of-concept for SCL: • Design annotation guidelines & annotate a corpus using these representations. • Build automated machine learning taggers for constructional realizations of semantic relations.

  51. Four Automatic Labelers • Syntax-based – Benchmark – CausewayS • String-based – CausewayL • Neural: DeepCx

  52. Syntax-based connective discovery: each construction is treated as a partially-fixed parse tree fragment.
     [Dependency parse of “I worry because I care.”: worry/VBP has nsubj I/PRP and advcl care/VBP; care/VBP, the “head” of “because I care”, has mark because/IN and nsubj I/PRP.]

  53. Syntax-based connective discovery: each construction is treated as a partially-fixed parse tree fragment.
     [Same dependency parse of “I worry because I care.” as on the previous slide.]

  54. Syntax-based connective discovery: each construction is treated as a partially-fixed parse tree fragment.
     [The extracted fragment: because/IN attached by mark to the head of an advcl.]

  55. Benchmark: syntax-based dependency path memorization heuristic.

      Connective     Parse paths to possible cause/effect heads    Causal / Not causal
      prevent from   nsubj, advcl                                  27 / 4
      prevent from   nsubj, advmod                                  0 / 8
      because of     case, case→nmod                               14 / 1
      …              …                                             …

  56. The syntax-based benchmark system • Benchmark – For each causal tree fragment • Count how many times it is causal in the training data • Count how many times the same fragment is not causal in the training data • If that pattern is causal more often in the training data, then whenever you see it in the test data, label it as causal. • If that pattern is non-causal more often in the training data, then whenever you see it in the test data, don’t label it as causal. – High precision: If the benchmark system thinks it’s causal, it is usually right – Low recall: The benchmark system misses too many instances of causality
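
A minimal sketch of the memorization heuristic described above, assuming each connective pattern is encoded as a hashable tuple of the connective and its parse paths; this is illustrative code, not the released Causeway benchmark.

```python
from collections import Counter, defaultdict

def train_benchmark(training_instances):
    """training_instances: (pattern, is_causal) pairs; a pattern is a hashable
    encoding of the connective plus its parse paths to the argument heads."""
    counts = defaultdict(Counter)
    for pattern, is_causal in training_instances:
        counts[pattern][is_causal] += 1
    return counts

def label_causal(counts, pattern):
    seen = counts.get(pattern)
    if seen is None:
        return False                  # unseen patterns never fire: hurts recall
    return seen[True] > seen[False]   # majority vote: keeps precision high

pattern = ("because of", "case", "case>nmod")   # hypothetical pattern encoding
counts = train_benchmark([(pattern, True)] * 14 + [(pattern, False)])
print(label_causal(counts, pattern))            # True (14 causal vs. 1 non-causal)
```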

  57. Syntax-based connective discovery: TRegex patterns (Levy and Andrew, 2006) are extracted in training and matched at test time.
     Training (“I worry because I care.”) yields a pattern like:
       (/^because_[0-9]+$/ <2 /^IN.*/ <1 mark > (/.*_[0-9]+/ <1 advcl > (/.*_[0-9]+/)))
     Test: the same pattern matches “I worry because I love you.”
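
As an illustration of what such a pattern checks, here is a toy Python stand-in (not TRegex itself) that tests the same configuration, "because/IN attached by mark to the head of an advcl", over a hand-built dependency parse; the token and edge data are made up for the example sentence.

```python
def matches_because_advcl(tokens, edges):
    """tokens: list of (word, POS); edges: list of (head, relation, dependent) index triples."""
    head_of = {dep: (head, rel) for head, rel, dep in edges}
    for i, (word, pos) in enumerate(tokens):
        if word.lower() == "because" and pos.startswith("IN"):
            head, rel = head_of.get(i, (None, None))
            if rel == "mark" and head_of.get(head, (None, None))[1] == "advcl":
                return True
    return False

# "I worry because I love you."  (love is an advcl of worry; because is a mark on love)
tokens = [("I", "PRP"), ("worry", "VBP"), ("because", "IN"),
          ("I", "PRP"), ("love", "VBP"), ("you", "PRP")]
edges = [(1, "nsubj", 0), (1, "advcl", 4), (4, "mark", 2),
         (4, "nsubj", 3), (4, "dobj", 5)]
print(matches_because_advcl(tokens, edges))  # True
```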

  58. The syntax-based CausewayS system • Fancier way of extracting tree fragments from the training data – Dreyfus-Wagner, minimum-weight sub-tree • For each tree fragment extracted from the training data, find all instances of it in the test data. • CausewayS, at this point, has high recall (finds a lot of instances of causality) but low precision (a lot of what it finds isn’t right). • Apply a classifier to increase precision.
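
A toy sketch of the fragment-extraction idea in the simple case where the parse is a plain tree: the minimal subtree connecting the connective and the argument heads is just the union of their paths up to their deepest common ancestor. (CausewayS uses Dreyfus-Wagner over richer structures; the node names here are hypothetical.)

```python
def path_to_root(node, parent):
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def connecting_subtree(nodes, parent):
    """Nodes on the minimal subtree linking `nodes` in a tree given as a child -> parent map."""
    paths = [path_to_root(n, parent) for n in nodes]
    shared = set(paths[0]).intersection(*map(set, paths[1:]))
    lca = next(n for n in paths[0] if n in shared)   # deepest ancestor common to all paths
    keep = set()
    for p in paths:
        for n in p:
            keep.add(n)
            if n == lca:
                break
    return keep

# "I worry because I care.": because and I_2 hang under care, which hangs under worry
parent = {"because": "care", "I_2": "care", "care": "worry", "I_1": "worry"}
print(connecting_subtree(["because", "worry"], parent))  # because, care, worry
```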

  59. Syntax-based argument ID: argument heads are expanded to include most dependents.
     [Dependency parse of “I worry because I care.”: worry/VBP has nsubj I/PRP and advcl care/VBP; care/VBP has mark because/IN and nsubj I/PRP.]

  60. Syntax-based argument ID: argument heads are expanded to include most dependents.
     [Same dependency parse; the argument heads “worry” and “care” are expanded over their dependents.]
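
A toy sketch of that expansion step: starting from an argument head, collect its dependency subtree while stopping at excluded nodes such as the connective or the other argument's clause. The node names follow the "I worry because I care" example and are not the system's actual data structures.

```python
def expand_argument(head, children, exclude):
    """Collect the dependency subtree of `head`, skipping anything in `exclude`."""
    span, stack = set(), [head]
    while stack:
        node = stack.pop()
        if node in exclude:
            continue
        span.add(node)
        stack.extend(children.get(node, []))
    return span

# "I worry because I care.": worry -> {I_1, care}, care -> {because, I_2}
children = {"worry": ["I_1", "care"], "care": ["because", "I_2"]}
# Effect span: subtree of the head "worry", minus the cause clause headed by "care"
print(expand_argument("worry", children, exclude={"care"}))      # worry, I_1
# Cause span: subtree of the head "care", minus the connective "because"
print(expand_argument("care", children, exclude={"because"}))    # care, I_2
```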

  61. CausewayL: constructions are matched by regular expressions over word lemmas.
     Training (“I worry because I care.”) yields a pattern like:
       (^| )(\S+ )+?(because/IN) (\S+ )+?
     Test: the pattern matches “I worry because I love you.”
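
A runnable Python approximation of this kind of lemma-level pattern; the "lemma/POS" token format is assumed for illustration and is not necessarily the system's internal representation.

```python
import re

# Some tokens, then because/IN, then more tokens.
BECAUSE_PATTERN = re.compile(r"(^| )(\S+ )+?(because/IN)( \S+)+")

tagged = "I/PRP worry/VBP because/IN I/PRP love/VBP you/PRP"
match = BECAUSE_PATTERN.search(tagged)
print(bool(match), match.group(3) if match else None)  # True because/IN
```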

  62. CausewayL: Arguments are labeled by a conditional random field.
     Each word is featurized and assigned a label (e.g., EFFECT, EFFECT, …, CAUSE) by a linear-chain CRF:
       $p(\mathbf{z} \mid \mathbf{y}) = \frac{1}{Z(\mathbf{y})} \prod_{u=1}^{U} \exp\!\left( \sum_{l=1}^{L} \theta_l \, f_l(z_u, z_{u-1}, \mathbf{y}_u) \right)$
     Features include information about:
       • the word
       • the connective
       • the relationship between word and connective
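
A small sketch of the CRF tagging setup, using the third-party sklearn-crfsuite library as a stand-in for the authors' implementation; the features loosely follow the bullet list above, and the label set and single training sentence are toy examples.

```python
import sklearn_crfsuite  # third-party: pip install sklearn-crfsuite

def token_features(tokens, i, connective_index):
    return {
        "word": tokens[i].lower(),
        "connective": tokens[connective_index].lower(),
        "is_connective": str(i == connective_index),
        "side_of_connective": "same" if i == connective_index
                              else ("left" if i < connective_index else "right"),
        "distance_to_connective": abs(i - connective_index),
    }

sentence = ["I", "worry", "because", "I", "care", "."]
X = [[token_features(sentence, i, 2) for i in range(len(sentence))]]
y = [["EFFECT", "EFFECT", "NONE", "CAUSE", "CAUSE", "NONE"]]  # toy gold labels

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, y)
print(crf.predict(X)[0])  # one CAUSE/EFFECT/NONE label per token
```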

  63. CausewayL • Recall is high (finds a lot of matches) • Precision is low (a lot of what it finds is wrong) • Use a classifier to raise precision

  64. DeepCx: Neural Net, LSTM
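
The slides do not spell out the DeepCx architecture, so the following is only a generic bidirectional LSTM tagger sketch in PyTorch of the kind the slide alludes to; all sizes, names, and the tag set are illustrative, not the published model.

```python
import torch
import torch.nn as nn

class TinyCausalTagger(nn.Module):
    def __init__(self, vocab_size, embed_dim=50, hidden_dim=64, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, num_tags)  # e.g. O / CONN / CAUSE / EFFECT

    def forward(self, token_ids):                # (batch, seq_len)
        hidden, _ = self.lstm(self.embed(token_ids))
        return self.out(hidden)                  # (batch, seq_len, num_tags)

model = TinyCausalTagger(vocab_size=1000)
dummy = torch.randint(0, 1000, (1, 6))           # stand-in for "I worry because I care ."
print(model(dummy).argmax(dim=-1))               # one predicted tag id per token
```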

  65. Analysis of results
     • DeepCx is better than Causeway in these circumstances:
       – High ambiguity (entropy) of the causal connective: e.g., “to” has an entropy of 1; F1 is 40.6% for DeepCx vs. 25.5% for Causeway.
       – Connectives whose part of speech is adverb: Causeway missed all instances of “why” and “when” (a design problem?).
       – Connectives whose part of speech is noun: fewer instances to train on.
       – More words in the causal connective: but CausewayL does pretty well too, because it matches a pattern over the whole sentence whereas DeepCx proceeds one word at a time; CausewayS misses complex parse paths.
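
A small sketch of the ambiguity measure mentioned above: the entropy of a connective's causal/non-causal label distribution in training data. The counts are invented; the slide's point is that a connective like "to", causal roughly half the time, has entropy near 1.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy (bits) of a connective's causal / non-causal label distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# Roughly 50/50 causal vs. non-causal -> entropy near 1 bit;
# an unambiguous connective would have entropy 0.
print(round(label_entropy(["causal"] * 50 + ["non-causal"] * 50), 2))  # 1.0
```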

  66. Analysis of results • CausewayL and DeepCx do better than CausewayS on most overlapping relations • Amount of training data – DeepCx performs better at all amounts of training data – It appears to be better at generalizing across patterns (vs generalizing within patterns) – The gap between CausewayS and DeepCx remains constant across all amounts of training data – CausewayL increases fastest with more training data • Length of Cause and Effect arguments – DeepCx and CausewayL are better than CausewayS as argument length increases • CausewayL uses CRF to label argument spans • CausewayS uses the dependency trees

  67. Next steps • Reproduce with other meanings that have large constructicons, such as comparatives • Reproduce with multiple languages • Also apply where the constructicon is small and quirky, such as incongruity

  68. Related annotation schemes and labelers for causality

  69. Previous projects have struggled to annotate real-world causality.
     – SemEval 2007 Task 4 (Girju et al., 2007): “A person infected with a <e1>flu</e1> <e2>virus</e2> strain develops antibodies against it.” Cause-Effect(e2, e1) = "true"
     – CaTeRS (Mostafazadeh et al., 2016)
     – Richer Event Descriptions (O’Gorman et al., 2016; Croft et al., 2016): BEFORE-PRECONDITIONS: “We’ve allocated a budget to equip the barrier with electronic detention equipment.”

  70. Existing shallow semantic parsing schemes include some elements of causal language.
     – Penn Discourse Treebank (Prasad et al., 2008)
     – PropBank (Palmer et al., 2005)
     – FrameNet (Fillmore & Baker, 2010; Ruppenhofer et al., 2016): “He made me bow to show his dominance.” annotated with CAUSATION, CAUSER, EFFECT, and PURPOSE roles.

  71. Others have focused specifically on causality.
     – CaTeRS (Mostafazadeh et al., 2016)
     – Richer Event Description (O’Gorman et al., 2016): BEFORE-PRECONDITIONS: “We’ve allocated a budget to equip the barrier with electronic detention equipment.”
     – Causality in TempEval-3 (Mirza et al., 2014): “HP [EVENT] acquired 730,070 common shares as a result of a stock purchase agreement [EVENT].” (TLINK: BEFORE, CAUSE)
     – BioCause (Mihaila et al., 2013)
