KAF: a generic semantic annotation format

Wauter Bosma & Piek Vossen (VU University Amsterdam)
Aitor Soroa & German Rigau (Basque Country University)
Maurizio Tesconi & Andrea Marchetti (CNR-IIT, Pisa)
Carlo Aliprandi (Synthema, Pisa)
Monica Monachini (CNR-ILC, Pisa)

KYOTO EU-FP7 ICT Program

KYOTO – overview

- A system for defining and sharing meaning in a domain
- Domain wordnet (linked to generic wordnet)
- Ontology (linked to wordnet)
- Fact profiles
- Semantic interoperability
- Knowledge is maintained by end-users
- System can be used for extracting factual data from documents
- Cross-language; cross-culture

KYOTO – some statistics

- March 2008 – March 2011
- 8 countries (The Netherlands, Italy, Germany, Spain, Taiwan, Japan, Czech Republic)
- 12 sites
  - Universities & research institutes: VUA, CNR-ILC, CNR-IIT, BBAW, EHU, AS, NICT, Masaryk
  - Companies: Synthema, Irion
  - User organizations: ECNC, WWF
- 7 languages (English, Italian, Japanese, Dutch, Spanish, Basque, Chinese)

KYOTO – knowledge cycle

[Figure: KYOTO knowledge cycle. Labels in the diagram: Linguistic Processor; Wordnets & Ontology; Multilingual Knowledge Base; Semantic & Syntactic representation; Kyoto Annotation Format; Wikyoto; Wiki Editor; Kybot; Fact Extractor; Tybot; Term Extractor; Term Base; Fact Base.]

Requirements for semantic annotation in KYOTO

- Interoperability across languages and cultures
- Language-neutral annotation
- One format for all languages
- Interoperability across linguistic processors
- Specialized processors for specific tasks
- System should work with new (unknown) languages
- Flexibility and extendibility, as requirements for applications may change over time

The KYOTO way

- KAF: KYOTO/Knowledge Annotation Format
- Annotation consists of layers stacked on top of each other
- Layers are used to generate more sophisticated layers
  - Morpho-syntactic layers: language specific parsing
  - Level-1 semantic layers: named entities, events, etc.
  - Level-2 semantic layers: facts
- Layers refer to items in lower level layers
- KAF is LAF-compliant

[Figure: layer stack, top to bottom: Level-2 semantic layers; Level-1 semantic layers; Morpho-syntactic layers.]

Morpho-syntactic layers

- Text: tokenization, sentences, paragraphs, with reference to the source
- Terms [Text]: words and multi-words, includes parts-of-speech, declension information, etc.
- Dependencies [Terms]: dependency relations between terms
- Chunks [Terms]: constituents & phrases

[Figure: layer stack, top to bottom: Level-2 semantic layers; Level-1 semantic layers; Chunks; Dependencies; Terms; Text.]

Semantic layers

- Level-1 layers for linear annotation: tagging text elements (expressions of time, events, quantities, locations, etc.)
- Level-2 layers for generic annotation: extracted facts (with pointers to evidence in the text), possibly with multiple sources of evidence
- Linear vs. Generic ↔ Information vs. Knowledge

General KAF layout

    <kaf xml:lang="en">
      <kafHeader>...</kafHeader>
      layer 1...
      layer 2...
      ...
      layer N...
    </kaf>
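The layout above can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration of the "one root, one child element per layer" idea: the helper names new_kaf and add_layer are hypothetical, and the layer names are the ones used on the following slides rather than a complete list from the KAF manual.

```python
import xml.etree.ElementTree as ET

# Qualified name of the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def new_kaf(lang):
    """Create an empty KAF document: a <kaf> root holding a <kafHeader>."""
    root = ET.Element("kaf", {XML_LANG: lang})
    ET.SubElement(root, "kafHeader")
    return root


def add_layer(root, name):
    """Append one layer element to the root (hypothetical helper).

    Each layer appears once, so adding it twice is treated as an error.
    """
    if root.find(name) is not None:
        raise ValueError(f"layer {name!r} already present")
    return ET.SubElement(root, name)


kaf = new_kaf("en")
# Each processor would add its own layer on top of the existing ones.
for layer in ("text", "terms", "deps", "chunks"):
    add_layer(kaf, layer)

layer_names = [child.tag for child in kaf]
print(ET.tostring(kaf, encoding="unicode"))
```

Keeping every layer as a separate child of the root is what lets different processors contribute annotations independently.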

Morpho-syntactic annotation: text and terms

    <kaf>
      <text>
        <wf wid="w1" page="1" sent="1" para="1" fileoffset="0,3">two</wf>
        <wf wid="w2" page="1" sent="1" para="1" fileoffset="4,7">per</wf>
        <wf wid="w3" page="1" sent="1" para="1" fileoffset="8,12">cent</wf>
      </text>
      <terms>
        <term tid="t1" type="open" lemma="two" pos="G"> </term>
        <term tid="t2" type="open" lemma="per cent" pos="N"> </term>
      </terms>
    </kaf>
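A minimal sketch of reading these two layers in Python follows. It assumes that fileoffset holds "start,end" character positions with an exclusive end, which matches the three values on the slide but is not stated there. The function names read_tokens and rebuild_text are hypothetical.

```python
import xml.etree.ElementTree as ET

KAF = """<kaf>
  <text>
    <wf wid="w1" page="1" sent="1" para="1" fileoffset="0,3">two</wf>
    <wf wid="w2" page="1" sent="1" para="1" fileoffset="4,7">per</wf>
    <wf wid="w3" page="1" sent="1" para="1" fileoffset="8,12">cent</wf>
  </text>
  <terms>
    <term tid="t1" type="open" lemma="two" pos="G"/>
    <term tid="t2" type="open" lemma="per cent" pos="N"/>
  </terms>
</kaf>"""

root = ET.fromstring(KAF)


def read_tokens(root):
    """Return (wid, form, start, end) for every word form in the text layer."""
    tokens = []
    for wf in root.iterfind("text/wf"):
        # Assumption: fileoffset is "start,end" with an exclusive end offset.
        start, end = (int(part) for part in wf.get("fileoffset").split(","))
        tokens.append((wf.get("wid"), wf.text.strip(), start, end))
    return tokens


def rebuild_text(tokens):
    """Place each word form at its offsets, padding the gaps with spaces."""
    chars = [" "] * max(end for _, _, _, end in tokens)
    for _, form, start, end in tokens:
        chars[start:end] = form
    return "".join(chars)


tokens = read_tokens(root)
lemmas = {t.get("tid"): t.get("lemma") for t in root.iterfind("terms/term")}

print(rebuild_text(tokens))
print(lemmas)
```

The terms layer groups w2 and w3 into the single multi-word lemma "per cent", which is why it has two terms for three word forms.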

Morpho-syntactic annotation: deps and chunks

    <kaf>
      <text>...</text>
      <terms>...</terms>
      <deps>
        <dep from="t1" to="t2" rfunc="mod"/>
      </deps>
      <chunks>
        <chunk cid="c1" head="t2" phrase="NP"> </chunk>
      </chunks>
    </kaf>
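Because deps and chunks only carry term identifiers, a consumer has to look those identifiers up in the terms layer. The sketch below shows one way to do that in Python. The terms are copied from the previous slide, and resolve_deps and resolve_chunk_heads are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

KAF = """<kaf>
  <terms>
    <term tid="t1" type="open" lemma="two" pos="G"/>
    <term tid="t2" type="open" lemma="per cent" pos="N"/>
  </terms>
  <deps>
    <dep from="t1" to="t2" rfunc="mod"/>
  </deps>
  <chunks>
    <chunk cid="c1" head="t2" phrase="NP"/>
  </chunks>
</kaf>"""

root = ET.fromstring(KAF)

# Index the lower layer once so that higher layers can be resolved by id.
lemma_of = {t.get("tid"): t.get("lemma") for t in root.iterfind("terms/term")}


def resolve_deps(root, lemma_of):
    """Turn each <dep> into (source lemma, relation, target lemma).

    A reference to an unknown term id raises KeyError.
    """
    return [
        (lemma_of[dep.get("from")], dep.get("rfunc"), lemma_of[dep.get("to")])
        for dep in root.iterfind("deps/dep")
    ]


def resolve_chunk_heads(root, lemma_of):
    """Map each chunk id to (phrase type, lemma of its head term)."""
    return {
        chunk.get("cid"): (chunk.get("phrase"), lemma_of[chunk.get("head")])
        for chunk in root.iterfind("chunks/chunk")
    }


deps = resolve_deps(root, lemma_of)
heads = resolve_chunk_heads(root, lemma_of)
print(deps)
print(heads)
```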

Linear semantic annotation

    <timexs>
      <timex3 texid="timex1" type="DATE" value="1970">
        <target id="c7"/>
      </timex3>
      <timex3 texid="timex2" type="DATE" value="2003">
        <target id="c9"/>
      </timex3>
      <timex3 texid="timex3" type="DURATION" value="P33Y"
              beginPoint="timex1" endPoint="timex2" temporalFunction="true"/>
    </timexs>
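The DURATION element has no target of its own. Its value follows from the two DATE elements named in beginPoint and endPoint. A minimal Python sketch of that derivation is given below. It assumes both dates are plain four-digit years, as on the slide, and year_duration is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

TIMEXS = """<timexs>
  <timex3 texid="timex1" type="DATE" value="1970"><target id="c7"/></timex3>
  <timex3 texid="timex2" type="DATE" value="2003"><target id="c9"/></timex3>
  <timex3 texid="timex3" type="DURATION" value="P33Y"
          beginPoint="timex1" endPoint="timex2" temporalFunction="true"/>
</timexs>"""

timexs = ET.fromstring(TIMEXS)


def year_duration(timexs, texid):
    """Recompute a DURATION from its begin and end DATE elements.

    Assumption: both dates are bare years, so the result is whole years.
    """
    by_id = {t.get("texid"): t for t in timexs.iterfind("timex3")}
    duration = by_id[texid]
    begin = int(by_id[duration.get("beginPoint")].get("value"))
    end = int(by_id[duration.get("endPoint")].get("value"))
    return f"P{end - begin}Y"


computed = year_duration(timexs, "timex3")
stored = timexs.find("timex3[@texid='timex3']").get("value")
print(computed, stored)
```

Comparing the computed value with the stored one is a cheap consistency check on the annotation.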

Generic annotation

    <entities>
      <ent eid="e1">
        <spans>
          <target doc="134" id="c7"/>
          <target doc="134" id="c34"/>
          <target doc="14" id="c13"/>
        </spans>
      </ent>
      <ent eid="e300">
        <spans>
          <target doc="134" id="c13"/>
          <target doc="4" id="c3"/>
        </spans>
      </ent>
    </entities>
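Unlike the linear layers, an entity here collects evidence from several documents. The following Python sketch groups each entity's targets by their doc attribute. The function name evidence_by_doc is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ENTITIES = """<entities>
  <ent eid="e1">
    <spans>
      <target doc="134" id="c7"/>
      <target doc="134" id="c34"/>
      <target doc="14" id="c13"/>
    </spans>
  </ent>
  <ent eid="e300">
    <spans>
      <target doc="134" id="c13"/>
      <target doc="4" id="c3"/>
    </spans>
  </ent>
</entities>"""

entities = ET.fromstring(ENTITIES)


def evidence_by_doc(entities):
    """Map each entity id to {document id: [chunk ids cited as evidence]}."""
    result = {}
    for ent in entities.iterfind("ent"):
        per_doc = defaultdict(list)
        for target in ent.iterfind("spans/target"):
            per_doc[target.get("doc")].append(target.get("id"))
        result[ent.get("eid")] = dict(per_doc)
    return result


evidence = evidence_by_doc(entities)
print(evidence)
```

Note that the chunk id c13 occurs under two different documents, so an id is only meaningful together with its doc value.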

Generic annotation

    <facts>
      <fact fid="f1">
        <process eid="e1"/>
        <quantity qid="q1"/>
        <timex3 texid="timex3"/>
        <arg tid="c1" role="patient"/>
      </fact>
    </facts>

[Figure: layer stack (facts, entities, semantic roles, dependencies, chunks) built up from the word "migration": word: migration; term: migration; Wordnet synset {eng-30-6766767-v}; Ontology Type = MigrationProcess, with MigratingSpecies, Source, Path and Distance listed beneath it.]
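A fact is essentially a bundle of pointers into lower layers. The Python sketch below lists those pointers for the fact on the slide. It assumes that the identifier of each child is held in one of the attributes eid, qid, texid or tid, which is what the slide shows, and fact_pointers is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

FACTS = """<facts>
  <fact fid="f1">
    <process eid="e1"/>
    <quantity qid="q1"/>
    <timex3 texid="timex3"/>
    <arg tid="c1" role="patient"/>
  </fact>
</facts>"""

# Assumption: these are the attributes that carry a reference to a lower layer.
ID_ATTRIBUTES = ("eid", "qid", "texid", "tid")

facts = ET.fromstring(FACTS)


def fact_pointers(fact):
    """Return (element name, referenced id, role or None) for each child."""
    pointers = []
    for child in fact:
        for name in ID_ATTRIBUTES:
            if child.get(name) is not None:
                pointers.append((child.tag, child.get(name), child.get("role")))
                break
    return pointers


pointers = fact_pointers(facts.find("fact"))
for pointer in pointers:
    print(pointer)
```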

KAF in KYOTO

- Word Sense Disambiguation adds sense annotation to the terms layer of KAF
- Tybots (term yielding robots) use KAF for term extraction
  - Uses the terms layer and the chunks layer
- Kybots (knowledge yielding robots) use KAF for fact extraction
  - Kybot is configured to search for specific facts by defining a kybot profile
- Wikyoto allows domain experts to define kybot profiles and to build a domain wordnet from Tybot terms, linked to a shared ontology
- All of the above are language-neutral

KAF and ISO standards

- KAF is inspired by: SynAF (dependency relations), MAF (morphological annotation), SemAF (time and events), LAF (generic linguistic annotation framework)
- SynAF, MAF and SemAF cannot be stacked
- LAF is a data model rather than a standard
- KAF is an instantiation of LAF with elements from SynAF, MAF and SemAF

Conclusion

- Key features of KAF:
  - Layered annotation; extendible for new applications
  - Distributed processing
  - Language neutral processing
  - Sharing & reusing resources
- KAF in KYOTO:
  - Three types of annotation: morphosyntactic, linear (level-1 semantic) and generic (level-2 semantic)
  - Used for 7 languages in several applications
- KAF manual: www.kyoto-project.eu (under system architecture and demos, data formats)
