
Jun 08, 2023


Morphology in CLARIN-D Daniël de Kok
Introduction
A whirlwind introduction:
● CLARIN-D tools: WebLicht, TüNDRA
● Resources: corpora with morphology
● Mostly oriented towards inflectional morphology
WebLicht
WebLicht is a web application for creating and running NLP pipelines.
Services
● Centers provide RESTful annotation services
  ○ Input: Text Corpus Format (TCF)
  ○ Output: TCF with the added layers
● Centers create metadata for their annotation services and put it in their repository
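The input/output contract above (TCF in, the same TCF with extra layers out) can be sketched in a few lines of Python. This is a simplified illustration, not the WebLicht implementation: the namespace URIs and element names follow TCF 0.4 as best recalled and should be checked against the TCF specification, the MetaData element is omitted, and `add_token_layer` is a hypothetical local stand-in for a remote annotation service.

```python
import xml.etree.ElementTree as ET

# Namespaces as recalled from TCF 0.4 (assumption; verify against the spec).
DSPIN_NS = "http://www.dspin.de/data"
TC_NS = "http://www.dspin.de/data/textcorpus"


def build_tcf_input(text, lang="de"):
    """Build a minimal TCF-like document that contains only a text layer."""
    root = ET.Element(f"{{{DSPIN_NS}}}D-Spin", {"version": "0.4"})
    corpus = ET.SubElement(root, f"{{{TC_NS}}}TextCorpus", {"lang": lang})
    ET.SubElement(corpus, f"{{{TC_NS}}}text").text = text
    return root


def add_token_layer(root):
    """Hypothetical stand-in for a service: keep the input, add a tokens layer."""
    corpus = root.find(f"{{{TC_NS}}}TextCorpus")
    text = corpus.find(f"{{{TC_NS}}}text").text
    tokens = ET.SubElement(corpus, f"{{{TC_NS}}}tokens")
    # Whitespace splitting is only a placeholder for real tokenization.
    for i, word in enumerate(text.split(), start=1):
        ET.SubElement(tokens, f"{{{TC_NS}}}token", {"ID": f"t{i}"}).text = word
    return root


def layer_names(root):
    """List the annotation layers present in the TextCorpus element."""
    corpus = root.find(f"{{{TC_NS}}}TextCorpus")
    return [child.tag.split("}")[1] for child in corpus]
```

Calling `add_token_layer(build_tcf_input("Der Hund bellt"))` leaves the original text layer untouched and appends a tokens layer, which is the property that lets services from different centers be chained into a pipeline.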
WebLicht architecture
Morphology services
● Currently available (morphological tagging):
  ○ German: Stuttgart Morphology (RFTagger), SMOR
  ○ Dutch: Alpino
  ○ English: MorphAdorner
● Adding new services for morphology:
  ○ Since WebLicht is decentralized, any CLARIN center can add morphology services.
  ○ If an interesting tool is missing, let us know!
Stuttgart morphology (German)
● HMM tagger specialized for large, feature-rich tag sets.
● Trained on the Tiger treebank.
● Uses a supplementary lexicon.
● Outputs morphological tags in the TIGER morphology scheme:
  ○ Part-of-speech
  ○ Gender
  ○ Case
  ○ Number
  ○ Degree
  ○ Person
  ○ Tense
  ○ Mood
  ○ Finiteness
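To make the feature list above concrete, here is a small Python sketch that splits a dot-separated morphological tag into named features. The dotted tag shape (for example `N.Nom.Sg.Fem`) and the value names in the table are assumptions chosen for illustration; they are not the official TIGER inventory, and the actual service output should be checked before relying on them. Finiteness is left out because it is typically encoded in the part-of-speech label itself.

```python
# Illustrative value-to-feature table (assumption, not the official tag set).
FEATURE_VALUES = {
    "Gender": {"Masc", "Fem", "Neut"},
    "Case": {"Nom", "Gen", "Dat", "Acc"},
    "Number": {"Sg", "Pl"},
    "Degree": {"Pos", "Comp", "Sup"},
    "Person": {"1", "2", "3"},
    "Tense": {"Pres", "Past"},
    "Mood": {"Ind", "Subj", "Imp"},
}


def parse_tag(tag):
    """Split a dotted tag into a part-of-speech label plus named features."""
    pos, *values = tag.split(".")
    features = {"POS": pos}
    for value in values:
        # Look the value up by content rather than position, because the
        # number and order of fields differs between parts of speech.
        for name, allowed in FEATURE_VALUES.items():
            if value in allowed:
                features[name] = value
                break
    return features
```

Values that are not in the table are silently skipped, so the function degrades gracefully on tags with fields it does not know about.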
Alpino (Dutch)
● Wide-coverage dependency parser for Dutch.
● But also has:
  ○ An extensive lexicon with subcategorization frames.
  ○ A guesser for unknown words.
● The final frames are determined by:
  ○ Filtering by n-best tagging.
  ○ The parse selected by the disambiguation model.
Resources for German
● Semi-automatically annotated
  ○ Tiger treebank
  ○ TüBa-D/Z
● Automatically annotated
  ○ TüBa-D/W
Tiger treebank
● ~50,000 sentences
● Newspaper text (Frankfurter Rundschau)
● Semi-automatically annotated
● Annotations:
  ○ STTS part-of-speech tags
  ○ Lemmas
  ○ Inflectional morphology
  ○ Constituency structure
  ○ Dependency conversion (subset hand-annotated)
TüBa-D/Z
● ~95,500 sentences
● Newspaper text (taz)
● Semi-automatically annotated
● Annotations:
  ○ STTS part-of-speech tags
  ○ Lemmas
  ○ Inflectional morphology
  ○ Constituency structure
  ○ Dependency conversion
  ○ Anaphora and coreference relations
  ○ Subset with GermaNet word senses
  ○ Named entity class
TüBa-D/W
● 36.1 million sentences
● German Wikipedia
● Automatically annotated
● Annotations:
  ○ STTS part-of-speech tags
  ○ Lemmas
  ○ Inflectional morphology
  ○ Dependency structure
● Processed using WebLicht :-)
TüBa-D/W
TüBa-D/W is fully searchable using the TüNDRA treebank viewer.
Links
● WebLicht: https://weblicht.sfs.uni-tuebingen.de/
● TüNDRA: https://weblicht.sfs.uni-tuebingen.de/Tundra/
