Dagstuhl C2NLU: Working Groups Mo/Tu
Hinrich Schütze
January 23, 2017

1 Working Group MORPH: Morphology

This WG is concerned with morphology, one of the core areas of computational linguistics and theoretical linguistics, especially once we’ve overcome English-centric myopia. A lot (everything?) changes when morphology is modeled on the character level, in End2End systems and in the framework of deep learning.

• character-level models for morphological analysis
• character-level models for morphological generation (see the sketch after this list)
• in character-level models: What happens to prefixes, suffixes, stems, roots?
• subword units: morphologically motivated vs non-morphological
  – properties, strengths, weaknesses etc.
• morphological induction / paradigm completion / discovery of morphological rules: supervised, semisupervised, unsupervised
• Do certain types of morphology lend themselves better to character-level models?
• inflectional vs. derivational morphology
• non-concatenative morphologies
• segmentation
• language modeling
  – how to incorporate morphology: input, output, at which level?
• insights into human morphology from analyzing neural models?
• use character-level representations as a research methodology for morphology: e.g., compositional (“dearly”) vs noncompositional (“early”) forms
• efficiency (character-based worse than word-based?)
• inspection, interpretation, analysis, beyond black-box models
• evaluation
• applications
• come up with 1-5 new research directions
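As a concrete starting point for the generation bullet above, here is a minimal character-level encoder-decoder sketch for morphological reinflection (lemma characters plus a tag symbol in, inflected form out), written in PyTorch. The vocabulary, dimensions, the single tag symbol and the toy training pair are illustrative assumptions, not a reference implementation from the seminar.

# Minimal character-level encoder-decoder for morphological generation
# (lemma characters + morphological tag -> inflected form), sketched in PyTorch.
# Vocabulary, dimensions, and the toy training pair are illustrative assumptions.
import torch
import torch.nn as nn

PAD, BOS, EOS = 0, 1, 2
symbols = list("abcdefghijklmnopqrstuvwxyz") + ["<V;PST>"]  # tag treated as one input symbol
itos = ["<pad>", "<bos>", "<eos>"] + symbols
stoi = {s: i for i, s in enumerate(itos)}

def encode(seq):
    return torch.tensor([[stoi[s] for s in seq] + [EOS]])

class CharSeq2Seq(nn.Module):
    def __init__(self, vocab, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim, padding_idx=PAD)
        self.enc = nn.GRU(dim, dim, batch_first=True)
        self.dec = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, src, tgt_in):
        _, h = self.enc(self.emb(src))               # encode lemma + tag characters
        dec_out, _ = self.dec(self.emb(tgt_in), h)   # teacher-forced decoding
        return self.out(dec_out)                     # per-step character logits

# Toy example: "walk" + past-tense tag -> "walked".
src = encode(list("walk") + ["<V;PST>"])
tgt = encode(list("walked"))
tgt_in = torch.cat([torch.tensor([[BOS]]), tgt[:, :-1]], dim=1)

model = CharSeq2Seq(len(itos))
loss = nn.CrossEntropyLoss(ignore_index=PAD)(
    model(src, tgt_in).reshape(-1, len(itos)), tgt.reshape(-1)
)
loss.backward()  # gradients for the single toy pair

Treating the morphological tag as just another input symbol is one common design choice; several of the WG questions above (what happens to prefixes, stems, roots?) amount to asking what such a model learns internally.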

2 Working Group MT: Machine Translation

Machine translation is perhaps the biggest success of deep learning in NLP. This WG will be concerned with research questions and challenges for character-level MT.

• character NMT
• linear NMT
• dealing with OOVs and cross-token dependencies (e.g., hierarchy)
• localization
• beyond LSTM dependencies
• transliteration
• character-level alignment
• multilingual NMT
• multi-task NMT for multiple modalities
• document-level NMT
• What are the units: characters, BPEs, subwords, words, phrases? (see the sketch after this list)
• Are there still units?
• What happens with syntax?
• efficiency (character-based worse than word-based?)
• inspection, interpretation, analysis, beyond black-box models
• evaluation
• applications (e.g., specific (low/high) resource settings / text types / language pairs?)
• come up with 1-5 new research directions
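To make the "characters vs. BPEs vs. subwords" question concrete, here is a toy sketch of byte-pair-encoding merge learning in plain Python. The tiny corpus and the number of merges are illustrative assumptions; real NMT systems learn tens of thousands of merges on the training corpus.

# Toy byte-pair-encoding (BPE) merge learning, to make the "characters vs.
# subword units" question concrete. Corpus and number of merges are
# illustrative assumptions, not a recommendation.
from collections import Counter

def learn_bpe(words, num_merges):
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for syms, freq in vocab.items():
            for a, b in zip(syms, syms[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)   # most frequent adjacent symbol pair
        merges.append(best)
        merged = {}
        for syms, freq in vocab.items():
            out, i = [], 0
            while i < len(syms):
                if i + 1 < len(syms) and (syms[i], syms[i + 1]) == best:
                    out.append(syms[i] + syms[i + 1])  # apply the merge
                    i += 2
                else:
                    out.append(syms[i])
                    i += 1
            merged[tuple(out)] = freq
        vocab = merged
    return merges, vocab

merges, vocab = learn_bpe(["low", "lower", "lowest", "newer", "wider"], 10)
print(merges)  # merge operations in the order they were learned
print(vocab)   # words re-segmented into the learned subword units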

3 Working Group RepLearn: Character-Level Representation Learning

“Unsupervised representation learning techniques capitalize on unlabeled data . . . The goal . . . is to learn a representation that reveals intrinsic low-dimensional structure in data, disentangles underlying factors of variation by incorporating universal AI priors such as smoothness and sparsity, and is useful across multiple tasks and domains.” (Raman Arora)

Embeddings and representation learning in general have been critical to the success of deep learning. Can we learn embeddings / representations without feature engineering (e.g., tokenization) and, if so, how?

• OOV representations
• beyond word embeddings
• RNN/GRU/LSTM-based embeddings
• CNN-based embeddings
• multilingual embeddings, universal embeddings
• noise
• noncanonical language
• characters vs bytes vs radicals vs bits
• learning algorithms, segmentation
• linking it back up to traditional linguistic units (e.g., words)
• how is ambiguity represented?
• numbers, named entities, multiwords and other nontypical units
• form-function regularities: which form regularities (e.g., “add s at the end”) correspond to function regularities?
• cross-token modeling
• char2vec (FastText?) (see the sketch after this list)
• non-morphological character-level productivity
• typoglycemia
• efficiency (character-based worse than word-based?)
• inspection, interpretation, analysis, beyond black-box models
• evaluation
• applications
• come up with 1-5 new research directions
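One concrete reading of the char2vec / FastText bullet: compose a word vector from its character n-gram vectors, which also yields representations for OOV words. The sketch below uses randomly initialized, hashed n-gram vectors purely for illustration; the n-gram range and dimensions are assumptions, and in a trained model these vectors would be learned, e.g. with a skip-gram objective.

# FastText-style composition sketch: a word vector as the sum of its character
# n-gram vectors, which is one way to get representations for OOV words.
# The n-gram range and the random vectors are illustrative assumptions.
import numpy as np

DIM, N_BUCKETS = 50, 2 ** 16
rng = np.random.default_rng(0)
ngram_table = rng.normal(scale=0.1, size=(N_BUCKETS, DIM))  # random here; learned in a real model

def char_ngrams(word, n_min=3, n_max=6):
    w = f"<{word}>"  # boundary markers around the word
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word):
    idx = [hash(g) % N_BUCKETS for g in char_ngrams(word)]  # hashed n-gram lookup
    return ngram_table[idx].sum(axis=0)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Even an unseen word gets a vector, and shared n-grams ("walk") tend to make
# related forms land closer together than unrelated words.
print(cos(word_vector("walked"), word_vector("walking")))
print(cos(word_vector("walked"), word_vector("zebra")))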

4 Working Group End2End: End2End Architectures

This WG will be concerned with the challenges that character-level models pose for machine learning. How can dependencies over long distances be learned? How can such models be made efficient in training and application? In an approach without feature-engineered preprocessing, how can domain knowledge and priors be incorporated into machine learning architectures?

• CNNs vs RNNs: tradeoff speed/accuracy, parallel/sequential
• hierarchical, multi-speed, multi-scale architectures (see the sketch after this list)
  – fixed small depth (2?) vs unbounded hierarchy (paragraph, document, book)
• context: attention, memory, convolution etc.
• which point in input to focus on
• interface between character-level and higher-level (traditional?) processing layers (syntax, semantics)
• multimodal / crossmodal End2End architectures
• End2End learning of long-distance relationships: corresponding phrases in sentence pairs (or document pairs)
• generation of OOVs
• End2End segmentation learning (i.e., learn the right way to segment for an application)
• how to put in domain / linguistic knowledge?
• Bayesian models
• in our big machine learning toolbox T, what are interesting t ∈ T to explore in combinations of the form “character-level + t”
• (add hot deep learning architecture of the day here)
• efficiency (character-based worse than word-based?)
• inspection, interpretation, analysis, beyond black-box models
• evaluation
• applications
• come up with 1-5 new research directions
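The hierarchical / multi-scale bullet can be made concrete with a fixed two-level sketch: a character CNN builds one vector per segment ("word"), and a GRU contextualizes those vectors across the sentence. All sizes, the max-pooling choice and the fixed two-level depth are illustrative assumptions for discussion, not a proposed architecture.

# Sketch of a two-level character architecture in PyTorch: a character CNN
# builds segment ("word") vectors, a higher-level GRU models the segment
# sequence. Sizes and the fixed depth are illustrative assumptions.
import torch
import torch.nn as nn

class CharCNNWordEncoder(nn.Module):
    def __init__(self, n_chars, char_dim=16, word_dim=64, kernel=3):
        super().__init__()
        self.emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.conv = nn.Conv1d(char_dim, word_dim, kernel_size=kernel, padding=1)

    def forward(self, chars):                      # chars: (words, max_chars)
        x = self.emb(chars).transpose(1, 2)        # -> (words, char_dim, max_chars)
        h = torch.relu(self.conv(x))               # convolve over character positions
        return h.max(dim=2).values                 # max-pool to one vector per word

class HierarchicalModel(nn.Module):
    def __init__(self, n_chars, n_labels, word_dim=64):
        super().__init__()
        self.word_enc = CharCNNWordEncoder(n_chars, word_dim=word_dim)
        self.sent_rnn = nn.GRU(word_dim, word_dim, batch_first=True)
        self.clf = nn.Linear(word_dim, n_labels)

    def forward(self, chars):                      # one sentence: (words, max_chars)
        words = self.word_enc(chars).unsqueeze(0)  # -> (1, words, word_dim)
        out, _ = self.sent_rnn(words)              # contextualize across the sentence
        return self.clf(out)                       # e.g. per-word tag logits

# Toy input: 3 "words" of up to 5 characters each, as integer character ids.
chars = torch.randint(1, 30, (3, 5))
logits = HierarchicalModel(n_chars=30, n_labels=5)(chars)
print(logits.shape)  # torch.Size([1, 3, 5])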
