
Tagging modality in Oceanic languages of Melanesia

Annika Tjuka, Lena Weißmann, and Kilu von Prince. The 13th Linguistic Annotation Workshop, August 1st, 2019. Sections: MelaTAMP project, Data, Method, Inter-annotator agreement, Conclusion.


  1. Tagging modality in Oceanic languages of Melanesia. Annika Tjuka, Lena Weißmann, and Kilu von Prince. August 1st, 2019. The 13th Linguistic Annotation Workshop.

  2. The MelaTAMP project

  3. Introduction
     Figure 1: Subject languages of the MelaTAMP project (map labels: Saliba-Logea, Mavea, North Ambrym, Daakaka, Dalkalaen, Daakie, Nafsan).

  4. The Languages
     • Subject languages: Daakaka, Dalkalaen, Daakie, Mavea, Nafsan, Saliba-Logea, and North Ambrym.
     • Speaker populations range from about 30 (Mavea) to around 6000 (Nafsan).
     • So far, our understanding of the Oceanic languages of Melanesia is based mostly on descriptive accounts.

  5. The MelaTAMP Project
     • Comparative research
     • Based on corpus data
     • Texts were recorded during fieldwork sessions with speakers of the respective language.
     • Investigation of modality, aspect, tense, and polarity (TAMP) in Oceanic languages.
     The focus of this talk is on our study on tagging modality in five of the seven subject languages.

  6. Expressing TAMP
     • TAM-related meanings are often expressed obligatorily within the verbal complex, sometimes in more than one place.
     • In contrast, Saliba-Logea only uses optional particles to express TAM-related meanings.

     Table 1: The verbal complex in Mavea (Guérin, 2011).

     | sbj.agr | cond | neg | it / incpt | num | impf | redup- | Verb | tr | obj | adv |
     |---|---|---|---|---|---|---|---|---|---|---|
     | i-, … | mo- | sopo- | m̋e- / pete- | r- / tol- | l(o)- | | | =i | =a / NP | |


  8. Data

  9. Corpora
     • Corpora of the following languages were considered in this study: Daakaka, Dalkalaen, Mavea, Nafsan, and Saliba-Logea.
     • In contrast to previous approaches, we did not identify a specific target set of expressions to label (e.g., modal auxiliaries and adverbs).

  10. Sub-Corpus
     • Prioritization of a comparable sub-corpus (26 texts).
     • Descriptions of wildlife behaviour, and tales and fables about miraculous events, including mysterious figures and animals native to the region.

  11. Overview

     | Language | Total #Texts | Total #Tok. | Tagged #Texts | Tagged #Clauses |
     |---|---|---|---|---|
     | Daakaka | 119 | 68k | 5 | 141 |
     | Dalkalaen | 114 | 34k | 6 | 658 |
     | Mavea | 214 | 45k | 3 | 634 |
     | Nafsan | 110 | 65k | 6 | 363 |
     | Saliba-Logea | 61 | 150k* | 6 | 157 |
     | Total | 618 | 362k | 26 | 1953 |

     Table 2: Corpora included in this study; Tok.: tokens. *Of the 150k tokens in this corpus, about 70k are fully annotated.
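The totals row of Table 2 can be cross-checked with a few lines of Python. This is only a sanity-check sketch: the names `CORPORA`, `REPORTED_TOTAL`, and `column_totals` are illustrative and not part of the project, and token counts are entered in thousands as rounded on the slide.

```python
# Rows of Table 2 as (total texts, total tokens in thousands,
# tagged texts, tagged clauses). Names here are illustrative only.
CORPORA = {
    "Daakaka": (119, 68, 5, 141),
    "Dalkalaen": (114, 34, 6, 658),
    "Mavea": (214, 45, 3, 634),
    "Nafsan": (110, 65, 6, 363),
    "Saliba-Logea": (61, 150, 6, 157),
}

# The "Total" row as reported on the slide.
REPORTED_TOTAL = (618, 362, 26, 1953)


def column_totals(rows):
    """Sum each column across all languages."""
    return tuple(sum(column) for column in zip(*rows.values()))


print(column_totals(CORPORA) == REPORTED_TOTAL)
```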

  12. Method

  13. Previous Approaches to Tagging Modality
     • Differentiation between modal flavours such as deontic and epistemic, and modal forces such as necessity and possibility.
     • These distinctions are notoriously difficult to tag (Rubinstein et al., 2013).

  14. Our Tag Set

     | Category | Name | Tags |
     |---|---|---|
     | Clause type | clause | assertion, question, directive; embedded: proposition, conditional, e.question, temporal, adverbial, attributive |
     | Temporal domain | time | past, future, present |
     | Modal domain | mood | factual, counterfactual, possible |
     | Aspectual domain | event | bounded, ongoing, repeated, stative |
     | Polarity | polarity | positive, negative |

     Table 3: Tag set of the MelaTAMP project, see https://wikis.hu-berlin.de/melatamp/Main_page.
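To make the structure of the tag set concrete, here is a minimal Python sketch that stores the five tag names from Table 3 with their allowed values and checks a clause annotation against them. The names `TAG_SET` and `validate_annotation` are hypothetical, and the rule that every clause carries exactly one value per tag is an assumption made for illustration; the project's actual annotation tooling may differ.

```python
# Tag names and allowed values as listed in Table 3.
# ASSUMPTION: one value per tag per clause (not stated in the slides).
TAG_SET = {
    "clause": {
        "assertion", "question", "directive",
        # embedded clause types
        "proposition", "conditional", "e.question",
        "temporal", "adverbial", "attributive",
    },
    "time": {"past", "future", "present"},
    "mood": {"factual", "counterfactual", "possible"},
    "event": {"bounded", "ongoing", "repeated", "stative"},
    "polarity": {"positive", "negative"},
}


def validate_annotation(annotation):
    """Return a list of problems; an empty list means the annotation is valid."""
    problems = []
    for name, allowed in TAG_SET.items():
        if name not in annotation:
            problems.append(f"missing tag: {name}")
        elif annotation[name] not in allowed:
            problems.append(f"invalid value for {name}: {annotation[name]}")
    for name in annotation:
        if name not in TAG_SET:
            problems.append(f"unknown tag: {name}")
    return problems


# A clause tagged as a positive, stative, possible-future assertion.
annotation = {
    "clause": "assertion",
    "time": "future",
    "mood": "possible",
    "event": "stative",
    "polarity": "positive",
}
print(validate_annotation(annotation))
```

A check of this kind catches labels from other schemes (for example a modal flavour such as `epistemic` entered under `mood`) before agreement is computed.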

  15. Branching-times Framework
     Figure 2: The three domains of the factual (solid line), the counterfactual (dotted lines), and the possible future (dashed lines). Vertically aligned indices are here taken to be simultaneous (von Prince, 2019).

  16. Example: factual

     (1) mwe   liye  bosi          an
         REAL  take  copra.chisel  3SG.POSS
         “He took his copra chisel.” (Daakaka)

     • clause: assertion
     • time: past
     • mood: factual
     • event: bounded
     • polarity: positive

  17. Example: counterfactual

     (2) ru=mroki      [na    ruk=fan    sol  tete  mane   em̃rom   sto].
         3PL.RS=think  COMP   3PL.IR=go  get  some  money  inside  shop
         “they thought [someone had taken money from inside the shop].” (Nafsan: 030.048)

     • clause: proposition
     • time: past
     • mood: counterfactual
     • event: bounded
     • polarity: positive

  18. Example: possible

     (3) ka   na-p     pwer-pwer   yen  or
         MOD  1SG-POT  REDUP-stay  in   bush
         “I will live in the bush.” (Daakaka: 1348)

     • clause: assertion
     • time: future
     • mood: possible
     • event: stative
     • polarity: positive

  19. Results of Inter-Annotator Agreement

  20. Results in each Category
     Figure 3: Percentages of total inter-annotator consistencies (orange) and inconsistencies (grey) in each TAMP category of the tag set.

  21. Inter-Annotator Agreement Score for each Category
     • Polarity: α = 0.91
     • Mood: α = 0.86
     • Clause: α = 0.85
     • Time: α = 0.85
     • Event: α = 0.79
     α is Krippendorff’s alpha coefficient (Krippendorff, 1980).
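For readers unfamiliar with the measure, Krippendorff's alpha is defined as α = 1 − D_o / D_e, the ratio of observed to chance-expected disagreement subtracted from one. The following is a minimal sketch for nominal labels such as the mood tags, computed from a coincidence matrix. The function name and the toy data are hypothetical; the slides do not say which implementation or which settings the authors used, so this is not a reproduction of the reported scores.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of label lists, one inner list per annotated clause,
    holding the labels assigned by the annotators who tagged that clause.
    """
    # Coincidence matrix: each ordered pair of labels within a unit
    # contributes 1 / (m - 1), where m is the number of labels in the unit.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a unit with a single label cannot be paired
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and the overall number of pairable values.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one unit with two or more labels")

    # Both sums are scaled by n, so alpha = 1 - observed / expected.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one label was ever used, so no disagreement is possible
    return 1.0 - observed / expected


# HYPOTHETICAL toy data: two annotators, four clauses, mood labels only.
toy_units = [
    ["factual", "factual"],
    ["factual", "possible"],
    ["possible", "possible"],
    ["possible", "possible"],
]
print(round(krippendorff_alpha_nominal(toy_units), 3))  # 0.533
```

In the toy data the annotators agree on three of four clauses (75% raw agreement), yet alpha is only about 0.53 because it corrects for agreement expected by chance. This is why the alpha scores on this slide are a stricter measure than the raw percentages in Figure 3.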

  22. Results in the Mood Category
     • The high α score in this category indicates that our three-way distinction (factual, counterfactual, possible) seems to be efficient.

  23. Conclusion

  24. Conclusion
     • The overall tag set that we used to annotate the TAM categories exhibits a high percentage of inter-annotator consistency across the different categories.
     • Our modal tag set has proven useful for our purposes.
     • Depending on the languages and the goals of tagging modality, our tag set may be an interesting alternative to other models.
     Thank you!

