A Unified Text Annotation Workflow for Diverse Goals
Janis Pagel, Nils Reiter, Ina Rösiger, Sarah Schulz

Why do we annotate?
  Empirical validation of theories
    Discovering phenomena not covered by a theory
    Strengthening definitions in a theory
      Often confused categories might be overlapping or at least unclear
    Uncovering implicit assumptions
  Data creation
    Manually annotated data can be analysed
      Which categories are how frequent in what context?
    Automatic tools can be evaluated
      How well do machines do this task?
    Supervised tools can be trained

Janis Pagel, Nils Reiter, Ina Rösiger, Sarah Schulz, Institute for Natural Language Processing (IMS), University of Stuttgart: Unified Annotation Workflow
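
Two of the questions on this slide (which categories are how frequent, and which categories are often confused) can be made concrete with a few counts. The following is a minimal sketch, assuming two annotators have labelled the same text spans; the labels, category names, and function names are hypothetical and not taken from the talk.

```python
from collections import Counter

# Hypothetical labels from two annotators for the same six text spans.
annotator_a = ["Person", "Place", "Person", "Work", "Place", "Person"]
annotator_b = ["Person", "Place", "Work", "Work", "Person", "Person"]


def category_frequencies(labels):
    """Count how often each category was assigned."""
    return Counter(labels)


def confused_pairs(labels_a, labels_b):
    """Count unordered category pairs on which the annotators disagree."""
    pairs = Counter()
    for a, b in zip(labels_a, labels_b):
        if a != b:
            pairs[tuple(sorted((a, b)))] += 1
    return pairs


def observed_agreement(labels_a, labels_b):
    """Fraction of spans that received the same category from both annotators."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


print(category_frequencies(annotator_a))
print(confused_pairs(annotator_a, annotator_b))
print(observed_agreement(annotator_a, annotator_b))
```

Pairs that show up often in `confused_pairs` are candidates for overlapping or unclear category definitions. Observed agreement is only a rough indicator, since it does not correct for chance agreement.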

Analog annotation …
  Ideas attached to spans of text
  Sometimes fuzzy text spans

… and digital annotation
  Explicit assignment of categories to text spans
  Text spans are explicitly bounded (begin, end)
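
A digital annotation in this sense can be represented as a category attached to an explicitly bounded span. The sketch below is one possible representation, assuming character offsets with an inclusive begin and an exclusive end; the class name, field names, example sentence, and categories are illustrative assumptions, not part of the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    begin: int     # offset of the first character of the span (inclusive)
    end: int       # offset just past the last character (exclusive)
    category: str  # explicitly assigned category

    def covered_text(self, text: str) -> str:
        """Return the stretch of text that this annotation covers."""
        return text[self.begin:self.end]


# Hypothetical example text and annotations.
text = "Effi Briest was written by Theodor Fontane."
annotations = [
    Annotation(begin=0, end=11, category="Work"),
    Annotation(begin=27, end=42, category="Person"),
]

for annotation in annotations:
    print(annotation.category, repr(annotation.covered_text(text)))
```

Storing offsets rather than copies of the text keeps the annotation separate from the source (stand-off style), so several annotations can overlap or cover the same span without altering the text itself.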

Circles
  Annotation (Circle)
    Well known in Computational Linguistics
    Automation-centered
    “Enrich with knowledge”
  Pustejovsky and Stubbs (2012)
  Hovy and Lavid (2010)

Circles
  Hermeneutic Circle
    Well known in Humanities
    Interpretation-centered
    “Retrieve knowledge”
  Gius and Jacke (2017)

Workflow
[Figure: workflow diagram with the nodes Theoretical notion, Data, (Proto) annotation, Annotation guidelines, Annotated text/corpus, Analysis, Interpretation, Automation]

Goals of Annotation
  Exploratory
    Mostly note-taking
    Semi-organized
    Humanities centered
    Ideally completely free of presuppositions
  Examples
    Pliny (Bradley, 2008)
    3DH (Kleymann, Meister, and Stange, 2018)
[Figure: workflow diagram repeated from the Workflow slide]