
Antecedent and referent types of abstract pronominal anaphora

Centre for Language Technology. Antecedent and referent types of abstract pronominal anaphora. Costanza Navarretta, University of Copenhagen, costanza@hum.ku.dk. Beyond Semantics, Göttingen, February 23-25, 2011.


  1. Centre for Language Technology Antecedent and referent types of abstract pronominal anaphora Costanza Navarretta University of Copenhagen costanza@hum.ku.dk Beyond Semantics, Göttingen February 23-25 2011

  2. Abstract anaphora: third-person singular personal pronouns and demonstrative pronouns that have
     • as antecedents: copula predicates, verbal phrases, clauses/utterances
     • as referents: properties, events, facts, speech acts, situations, propositions

  3. Corpus-based work: the DAD parallel and comparable written and spoken corpora in Danish and Italian (approx. 200,000 words in Danish, 150,000 in Italian), 2007-2010. Related work: Webber 1988, Fraurud 1992, Byron & Allen 1998, Gundel et al. 2003, 2005, Recasens & Martí 2008, Artstein & Poesio 2008, Dipper & Zinsmeister 2008, 2010.

  4. Abstract anaphora in Danish
     • written Danish:
        • det (it/this/that)
        • dette (this)
     • spoken Danish:
        • unstressed det (it)
        • d'et (this/that)
        • d'et h'er (this)
        • d'et d'er (that)
        • dette (this): very seldom

  5. The Danish part of the DAD corpora
     The written corpora:
     • general language texts, legal texts, literary texts: 86,832 words
     The spoken corpora:
     • DanPASS monologues: 23,957 words
     • two-party DanPASS dialogues: 52,145 words
     • multi-party LANCHART+ TV interviews: 26,304 words
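As a quick check on the sizes listed above, the four Danish sub-corpora can be summed. This is only an arithmetic sketch; the dictionary keys and variable names are illustrative and do not come from the DAD documentation.

```python
# Word counts for the Danish part of the DAD corpora, as given on the slide.
danish_dad_words = {
    "written texts": 86_832,
    "DanPASS monologues": 23_957,
    "DanPASS dialogues": 52_145,
    "LANCHART+ TV interviews": 26_304,
}

# Everything except the written texts is spoken data.
spoken_total = sum(
    n for name, n in danish_dad_words.items() if name != "written texts"
)
danish_total = sum(danish_dad_words.values())

print(f"spoken: {spoken_total}, all Danish data: {danish_total}")
```

The total of 189,238 words is consistent with the "approx. 200,000 words" given for Danish in the corpus overview.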

  6. Annotations and evaluation (Navarretta & Olsen, LREC 2008)
     Texts: structural information, PoS and lemma (DDT).
     Spoken data: PoS, (lemma), stress, (prosody, phrases, hesitations), speakers, interaction segments, utterances, timestamps.
     All data:
     • 3sn: pronominal type, pronominal function (9), syntactic function;
     • anaphoric occurrences: antecedents, syntactic type of antecedents, semantic type of referent (Asher 1993), referential links and their type (identity/non-identity), anaphoric distance (in clauses).
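The annotation layers listed above can be pictured as one record per pronoun occurrence. The following is a minimal sketch under that assumption: the class name, field names and example values are hypothetical and are not the actual DAD annotation format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PronounAnnotation:
    """One annotated pronoun occurrence (hypothetical schema, not the DAD format)."""

    # Annotated for every pronoun occurrence.
    pronoun_type: str
    pronominal_function: str
    syntactic_function: str
    # Annotated only for anaphoric occurrences.
    antecedent: Optional[str] = None
    antecedent_syntactic_type: Optional[str] = None
    referent_type: Optional[str] = None        # semantic type of the referent
    link_type: Optional[str] = None            # "identity" or "non-identity"
    distance_in_clauses: Optional[int] = None  # anaphoric distance

    def is_anaphoric(self) -> bool:
        """An occurrence counts as anaphoric here when it has an antecedent."""
        return self.antecedent is not None


# Field values below are illustrative labels, not copied from the corpus annotation.
example = PronounAnnotation(
    pronoun_type="d'et",
    pronominal_function="abstract anaphor",
    syntactic_function="prepositional complement",
    antecedent="hun skulle nok sørge for han kom derover",
)
print(example.is_anaphoric())
```

Keeping the anaphora-specific fields optional mirrors the split on the slide between features annotated for all occurrences and features annotated only for anaphoric ones.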

  7. Example (LANCHART 2008)
     A: hun skulle nok sørge for han kom derover
        "She would certainly make sure that he came over there"
     B: d’et kunne jeg godt være sikker på
        "of that I could be completely sure"

  8. Abstract anaphora in English
     Webber (1988, 1991):
     • Abstract anaphors with the same antecedent can refer to objects of different semantic types; the referred object is created on the spot (ostension).
     • Strong preference for demonstrative pronouns with clausal antecedents: 83.7% of occurrences in a written corpus.
     Similar figures in other data, among others Byron & Allen 1998, Gundel et al. 2003, 2004, Hedberg et al. 2007, Navarretta 2007.

  9. Hegarty 2003, Gundel et al. 2005: Entities introduced into the discourse by clauses are only activated in the cognitive status of the addressee, so they are often not accessible to reference by personal pronouns; entities introduced into the discourse by VPs are "in focus" and can be referred to by personal pronouns (Givenness Hierarchy, Gundel et al. 1993). Referents of anaphors with clausal antecedents are often facts, situations and propositions.

  10. Resolution algorithms: Eckert and Strube 2000, Byron 2002, Strube and Müller 2003, and Müller 2007 use the difference between personal and demonstrative pronouns in abstract reference in their algorithms. This criterion is not valid for Danish.

  11. Abstract reference: Danish, Italian vs. English. Personal pronouns (including zero anaphora in Italian) often have clausal antecedents and thus refer to facts, situations and propositions. (In Italian, abstract pronominal reference is rare.) The use of demonstrative pronouns (both individual and abstract) differs across Danish, Italian and English.

  12. Differences (continued): Although many factors influence reference, including IS, discourse structure, and linguistic and extra-linguistic knowledge, some of the language-specific differences in abstract reference are systematic, and they can and should be explained in terms of the three languages' pronominal systems and syntax (LREC 2010).

  13. Pronominal system: pronouns for inanimate entities
     • English: 1 gender
     • Danish and Italian: 2 inanimate genders
        • Danish: common and neuter; only the latter can be an abstract anaphor
        • Italian: feminine and masculine; only the latter can be an abstract anaphor
     In English it is more necessary to restrict the interpretation, which is done via the distinction between personal and demonstrative pronouns.

  14. Syntax
     • constructions such as clefts and left dislocations are much more frequent in Danish than in English and Italian, so in Danish the clause is often the entity that is "in focus" (Gundel et al. 1993); this partly explains the frequent use of personal pronouns (det and unstressed det) with clausal antecedents;
     • word order is relatively free in Italian, as opposed to Danish and English; the use of abstract substantives in Italian restricts the antecedent search space.

  15. Present work, on Danish data only: is there a relation between the type of pronoun, the clausal type of the antecedent, the type of referent and the anaphoric distance? If so, it could be used to improve anaphora resolution. Importance of anaphoric distance (Ariel 1988, 1994).

  16. Clausal types
     • Simple main clause (utterance)
     • Subordinate clause
     • Complex clause
     • Matrix clause
     • Discourse segment (sequence of simple main clauses and/or complex clauses)

  17. Results in texts: Both det and dette occur with all types of antecedents, but dette seldom has verbal phrase or copula predicate antecedents. The most frequent clausal antecedents are (1) subordinate clauses and (2) simple main clauses. Det and dette occur with all types of referent, but dette seldom refers to a property. Dette refers more often to fact-like objects.

  18. Results in monologues: The most frequently used abstract anaphor is unstressed det. The most frequently occurring antecedents are (1) simple main clauses and (2) complex clauses. Propositions are most frequently referred to by the unstressed pronoun det.

  19. Results in dialogues: The most common abstract anaphor is unstressed det. The most frequent antecedent types are (1) simple main clauses, (2) complex clauses and (3) verbal phrases (map-task). Propositions are most frequently referred to by the unstressed pronoun det.

  20. Distance   Texts   Monologues   Dialogues
      zero       88%     92%          61%
      one        8%      5%           27%
      two        2%      3%           2%
      three      1%      0            0.8%
      four       0.4%    0            0.3%
      …
      eleven     0       0            0.01%
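To make the distance figures above easier to compare, the share of anaphors whose antecedent lies at most one clause away can be computed per data type. This is a sketch over the percentages shown on the slide only; the rows elided by "…" are not filled in, and the variable names are illustrative.

```python
# Percentage of abstract anaphors per anaphoric distance (in clauses),
# copied from the slide for distances zero to four.
distance_pct = {
    "texts":      {0: 88, 1: 8,  2: 2, 3: 1,   4: 0.4},
    "monologues": {0: 92, 1: 5,  2: 3, 3: 0,   4: 0},
    "dialogues":  {0: 61, 1: 27, 2: 2, 3: 0.8, 4: 0.3},
}

# Share of anaphors resolved within distance zero or one.
within_one = {genre: pct[0] + pct[1] for genre, pct in distance_pct.items()}

for genre, share in within_one.items():
    print(f"{genre}: {share}% within one clause")
```

On these figures, texts and monologues have 96% and 97% of their antecedents within one clause, against 88% in dialogues.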

  21. Distance, antecedents, referents. With anaphoric distance > 1:
     • Pronouns: unstressed det in dialogues and det in texts.
     • Syntactic types: subordinate clauses and discourse segments in texts; simple main clauses and discourse segments in dialogues.
     • Referent types: eventualities and fact-like objects.

  22. On-going and future work
     • Machine learning: pronominal type + referent type + distance → antecedent type (only CL or CL types)
     • Find a better way to calculate distance
     • Resolution
     • Relation with non-verbal behaviours
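The machine-learning item above maps pronominal type, referent type and distance to an antecedent type. A minimal sketch of that mapping, assuming a plain frequency-table baseline, is given below. It is not the method used in the DAD work; the function names are hypothetical and the training examples are invented toy data, not corpus results.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count antecedent types per (pronoun type, referent type, distance)."""
    counts = defaultdict(Counter)
    for pronoun, referent, distance, antecedent in examples:
        counts[(pronoun, referent, distance)][antecedent] += 1
    return counts


def predict(counts, pronoun, referent, distance, default=None):
    """Return the most frequent antecedent type seen for this feature triple."""
    seen = counts.get((pronoun, referent, distance))
    if not seen:
        return default
    return seen.most_common(1)[0][0]


# Invented toy examples for illustration only (NOT taken from the corpus).
toy_examples = [
    ("unstressed det", "proposition", 0, "simple main clause"),
    ("unstressed det", "proposition", 0, "simple main clause"),
    ("unstressed det", "proposition", 0, "complex clause"),
    ("dette", "fact", 1, "subordinate clause"),
]

model = train(toy_examples)
print(predict(model, "unstressed det", "proposition", 0))
print(predict(model, "d'et", "event", 2, default="unknown"))
```

A baseline of this kind only memorises seen feature combinations; any real experiment would need the annotated corpus and a learner that generalises to unseen ones.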
