Beyond Semantics Wrapup Annual meeting of the DGfS AG 1 Göttingen, 2011
Wrapup
• Bonnie Webber: Patterns of Explicit and Implicit Clausal Connectors: What this might suggest for "beyond semantics"
• Iørn Korzen and Matthias Buch-Kromann: Anaphoric relations in the Copenhagen Dependency Treebanks
• Yannick Versley: Towards finer-grained tagging of discourse connectives
• Ilka Floeck: Suggestions in British and American English: A corpus-linguistic study
• Stefania Degaetano and Elke Teich: The lexico-grammar of stance: an exploratory analysis of scientific texts
• Galina Tremper and Anette Frank: Extending Semantic Relation Classification to Presupposition Relations between Verbs
25.02.2011 1
Wrapup
• Rebecca Passonneau: Making Sense of Word Sense Variation
• Nynke H. van der Vliet, Ildikó Berzlánovich, Gosse Bouma, Markus Egg and Gisela Redeker: Building a Discourse-annotated Dutch Text Corpus
• Matthias Buch-Kromann, Daniel Hardt and Iørn Korzen: Syntax-centered and semantics-centered views of discourse. Can they be reconciled?
• Philippa Cook and Felix Bildhauer: Annotating Information Structure: The Case of "Topic"
• Arndt Riester and Stefan Baumann: Information Structure Annotation and Secondary Accents
• Christian Chiarcos: On the Dimensions of Discourse Salience
Wrapup
• Costanza Navarretta: Antecedent and referent types of abstract pronominal anaphora
• Maria Aloni, Angelika Port, Ana Aguilar Guevara, Radek Simik, Machteld de Vos and Hedde Zeijlstra: Semantics and pragmatics of indefinites: methodology for a synchronic and diachronic corpus study
Agenda
• Annotation
• Tools
• Back to theory
• Discussion
Annotation
• Pragmatic phenomena are usually not explicitly marked in the surface string
  – discourse relations: explicit / implicit connectives
  – relations between verbs: explicit pairs of verbs / implicit presuppositions or implicatures
  – information structural entities: topic, focus etc.
Annotation
• How do we annotate?
  – which tools?
  – which kind of annotators (trained experts, turkers)?
  – which criteria / which kind of tests?
    • how to operationalize the linguistic concepts / which proxies (surface clues) are suitable? / How can we validate the proxies?
  – how to evaluate the manual annotation of pragmatic phenomena / multiple annotators?
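One common way to approach the question of evaluating manual annotation by multiple annotators is a chance-corrected agreement score. The slide does not name a metric, so the choice of Cohen's kappa here is an assumption, and the labels and the function name `cohen_kappa` are hypothetical. A minimal sketch for two annotators:

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items that received identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six sentences (not data from the workshop).
annotator_1 = ["topic", "topic", "focus", "focus", "topic", "focus"]
annotator_2 = ["topic", "focus", "focus", "focus", "topic", "topic"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa only covers two annotators and categorical labels; for more annotators or for span-based markables another coefficient would be needed.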
How to get hold of the implicit information
• Linguistic definitions
• Proxies
  – surface clues (approximations)
• Reference to independent “resources”
  – WordNet, qualia structure, ...
  – pro: objective means and (easily) available
  – con: quality and suitability of the resource
• Linguistic test
  – decision trees
  – elicitation of the native speaker intuition from the annotators
Desiderata in Annotation
• Recipes / canonical methods for annotating the beyond phenomena
  – identification of the ‘markables’ (e.g. topic - focus vs. thetic sentence; minimal unit)
  – annotation of instances vs. types (cf. verb pairs out of context)
  – annotation of instances of a specific type vs. patterns vs. continuous text
• Question of isomorphism
  – independent vs. incremental annotation
• Cross-linguistic application
Multi-layer tools
• Annotation
  – incremental adding of additional layers
    • e.g. information structure / discourse structure to syntax
• Query
  – search on multiple layers
  – search on multiple texts / subcorpora
  – incremental adding of layers
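The idea of adding layers incrementally and searching across them can be sketched with stand-off annotation, where each layer is a separate list of labelled token spans. This is an illustrative assumption, not a tool discussed in the workshop: the names `Span`, `Document`, `add_layer` and `search`, and all labels, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    start: int  # index of the first token
    end: int    # index one past the last token
    label: str


@dataclass
class Document:
    name: str
    tokens: list
    layers: dict = field(default_factory=dict)  # layer name -> list of Span

    def add_layer(self, layer, spans):
        """Add a new annotation layer without touching existing ones."""
        if layer in self.layers:
            raise ValueError(f"layer {layer!r} already exists")
        self.layers[layer] = list(spans)


def search(corpus, conditions):
    """Return (document name, token index) pairs for tokens that are covered
    by a span with the required label on every requested layer."""
    hits = []
    for doc in corpus:
        for i in range(len(doc.tokens)):
            if all(
                any(s.start <= i < s.end and s.label == label
                    for s in doc.layers.get(layer, []))
                for layer, label in conditions.items()
            ):
                hits.append((doc.name, i))
    return hits


# Hypothetical toy document; the labels are illustrative only.
doc = Document("text1", ["Everyone", "watched", "this", "movie"])
doc.add_layer("syntax", [Span(0, 1, "subject"), Span(2, 4, "direct-object")])
# Added later, on top of the existing syntax layer.
doc.add_layer("information-structure", [Span(0, 1, "focus"), Span(2, 4, "topic")])

hits = search([doc], {"syntax": "direct-object", "information-structure": "topic"})
print(hits)
```

Because `search` takes a list of documents, the same call covers several texts or a subcorpus, and a layer that is added later becomes searchable without changes to the earlier layers.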
Multi-layer tools (contd.)
Search of multi-layered annotated data:
• Examples
  – “Give me all unstressed, non-pronominal, coreferent direct objects in the ‘Vorfeld’.”
    “Diesen Film haben ALle gesehen.” (Everyone watched this movie.)
  – “Give me all connectives which are not subordinating and which are followed by a non-deictic subject.”
... so that is an issue...
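The first example query above can be read as a conjunction of conditions, one per annotation layer. The following is a minimal sketch under that reading; the attribute names, the toy annotation values and the helper `query` are assumptions made for illustration, not the query language of any tool mentioned in the workshop.

```python
# Each constituent carries one attribute per annotation layer.
# All values below are hypothetical toy annotations.
constituents = [
    {"text": "Diesen Film", "function": "direct-object", "field": "Vorfeld",
     "stressed": False, "pronominal": False, "coreferent": True},
    {"text": "ALle", "function": "subject", "field": "Mittelfeld",
     "stressed": True, "pronominal": True, "coreferent": False},
]


def query(items, **required):
    """Keep the items whose attributes equal every required value."""
    return [item for item in items
            if all(item.get(key) == value for key, value in required.items())]


# "All unstressed, non-pronominal, coreferent direct objects in the Vorfeld."
matches = query(constituents, function="direct-object", field="Vorfeld",
                stressed=False, pronominal=False, coreferent=True)
texts = [m["text"] for m in matches]
print(texts)
```

The second example query also involves an ordering relation between two elements (a connective followed by a subject), which flat attribute matching of this kind cannot express.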
Back to the Theory
• Disagreement in reliably deliberate annotation
  – problems with the instructions OR
  – problems with the theory
• The challenge of authentic text
• Text as a research object
• Importance of genre etc.
Beyond the Workshop
Which topics were not addressed in the workshop?
• Where semantics goes into pragmatics
• Challenge of web-based data / automatically compiled or automatically annotated data
• Problem of being influenced in annotation by prior assumptions
• Falsifiability of the assumptions