Lexical Normalization for Neural Network Parsing
Rob van der Goot, Gertjan van Noord
University of Groningen
r.van.der.goot@rug.nl
26-01-2018

Last Year (CLIN27)

[Figure: last year's normalization pipeline. The original utterance "kheb da gzien" is tokenized; candidates are generated per token (lookup list, aspell, word2vec); a classifier with N-gram features ranks the candidates, yielding "ik heb dat gezien".]

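A minimal sketch of how such a pipeline could be wired together; `lookup`, `aspell`, `w2v` and `classifier` are hypothetical stand-ins for the lookup list, spell checker, word2vec model and ranking classifier in the diagram, and their methods are assumptions rather than the actual system's API:

```python
def generate_candidates(word, lookup, aspell, w2v):
    """Collect normalization candidates from all three sources."""
    candidates = {word}                                  # keeping the word is always an option
    candidates.update(lookup.get(word, []))              # known replacement pairs
    candidates.update(aspell.suggest(word))              # spelling suggestions
    candidates.update(w for w, _ in w2v.most_similar(word, topn=10))
    return candidates

def normalize(tokens, lookup, aspell, w2v, classifier):
    """Pick the best-scoring candidate for every token."""
    normalized = []
    for i, token in enumerate(tokens):
        cands = generate_candidates(token, lookup, aspell, w2v)
        # the classifier ranks candidates, using e.g. N-gram features
        # of each candidate in its sentence context
        scored = [(classifier.score(tokens, i, cand), cand) for cand in cands]
        normalized.append(max(scored)[1])
    return normalized
```
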
This Year

- Use normalization to adapt neural network dependency parsers
- Evaluate the effect of normalization versus externally trained word embeddings and character-level models
- See if we can exploit top-n candidates
- New treebank to evaluate domain adaptation

New Treebank

Why?
- Manually corrected train data
- Gold normalization available
- Data should be non-canonical
- UD format

New Treebank

- Pre-filtered to contain non-standard words
- Data from Li and Liu (2015): Owoputi and LexNorm
- 600 tweets / 10,000 words
- UD 2.1 format

New Treebank

[Figure: UD dependency tree for the tweet "I feel so bad .. Not so sure about wrk tomarra", with relations root, parataxis, punct, xcomp, advmod, obl, nsubj, case and nmod:tmod.]

New Treebank

Experimental setup:
- Train: English Web Treebank
- Dev: Owoputi
- Test: LexNorm

Neural Network parser

[Figure: the transition-based parser of Kiperwasser and Goldberg (2016). Each word is represented by a BiLSTM vector; the vectors of the configuration's stack items (s_0, s_1, s_2) and buffer items (b_0, b_1, ...) are concatenated and fed to an MLP that outputs (Score_LeftArc, Score_RightArc, Score_Shift). Taken from Kiperwasser and Goldberg (2016).]

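A rough sketch of the scoring step, assuming PyTorch rather than the DyNet toolkit that UUparser builds on; the dimensions, the number of feature positions and all names below are illustrative, not taken from the actual code:

```python
import torch
import torch.nn as nn

class TransitionScorer(nn.Module):
    """Scores (LeftArc, RightArc, Shift) from concatenated BiLSTM vectors."""

    def __init__(self, bilstm_dim=250, hidden_dim=100, n_features=4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_features * bilstm_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 3),  # one score per transition
        )

    def forward(self, feature_vectors):
        # feature_vectors: BiLSTM outputs of a few stack/buffer positions
        return self.mlp(torch.cat(feature_vectors, dim=-1))

# toy usage: score a configuration described by four positions
scorer = TransitionScorer()
scores = scorer([torch.randn(250) for _ in range(4)])
```
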
Neural Network parser

- UUparser (de Lhoneux et al., 2017)
- Performs well
- Relatively easy to adapt
- No POS tags
- Characters + external embeddings

Neural Network parser

[Figure: word representations. For each word, an embedding t_i, a character-level representation c_i and an external embedding e_i are concatenated and fed to a forward and a backward LSTM.]

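A minimal sketch of this word representation, again assuming PyTorch; the dimensions are made up, and using the same word id to index the external embeddings assumes the two vocabularies are aligned:

```python
import torch
import torch.nn as nn

class WordRepresentation(nn.Module):
    """x_i = [t_i; c_i; e_i]: word embedding, character-BiLSTM summary
    and a fixed external embedding, concatenated."""

    def __init__(self, n_words, n_chars, ext_vectors,
                 word_dim=100, char_dim=24, char_hidden=50):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_hidden,
                                 bidirectional=True, batch_first=True)
        self.ext_emb = nn.Embedding.from_pretrained(ext_vectors, freeze=True)

    def forward(self, word_id, char_ids):
        t = self.word_emb(word_id)                          # t_i
        _, (h, _) = self.char_lstm(self.char_emb(char_ids).unsqueeze(0))
        c = torch.cat([h[0, 0], h[1, 0]], dim=-1)           # c_i: fwd+bwd final states
        e = self.ext_emb(word_id)                           # e_i (kept frozen)
        return torch.cat([t, c, e], dim=-1)                 # x_i

# toy usage for one word with three characters
reps = WordRepresentation(1000, 50, ext_vectors=torch.randn(1000, 100))
x = reps(torch.tensor(5), torch.tensor([3, 7, 2]))
```
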
Use Normalization as Pre-processing

[Figure: dependency parse of the raw sentence "new pix comming tomoroe", with relations root, amod, compound, compound.]

Use Normalization as Pre-processing

[Figure: after normalization to "new pix coming tomorrow" the parse becomes root, amod, obj, obl, contrasted with the raw parse of "new pix comming tomoroe".]

Use Normalization as Pre-processing

[Figure, built up over several slides: attachment scores (roughly 48-64) of the base parser and its +char, +ext and +char+ext variants, comparing raw input against automatically normalized input; the final build step adds gold normalization.]

Integrate Normalization

new pix comming tomoroe

Integrate Normalization

Top-5 normalization candidates for "new pix comming tomoroe":

  new              pix               comming             tomoroe
  new      0.9466  pix       0.7944  coming      0.5684  tomorrow    0.5451
  news     0.0315  selfies   0.0882  comming     0.4314  tomoroe     0.3946
  knew     0.0111  pictures  0.0559  combing     0.0002  tomorrow's  0.0191
  now      0.0063  photos    0.0449  comping   < 0.0001  Tagore      0.0174
  newt     0.0045  pic       0.0165  common    < 0.0001  tomorrows   0.0173

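The table can be read as per-token lists of (candidate, probability) pairs; a hypothetical Python rendering of the same data:

```python
# Top-n normalization candidates with probabilities, copied from the
# table above; the data structure itself is just one possible encoding.
candidates = {
    "new":     [("new", 0.9466), ("news", 0.0315), ("knew", 0.0111),
                ("now", 0.0063), ("newt", 0.0045)],
    "pix":     [("pix", 0.7944), ("selfies", 0.0882), ("pictures", 0.0559),
                ("photos", 0.0449), ("pic", 0.0165)],
    "comming": [("coming", 0.5684), ("comming", 0.4314),
                ("combing", 0.0002)],   # candidates below 0.0001 omitted
    "tomoroe": [("tomorrow", 0.5451), ("tomoroe", 0.3946),
                ("tomorrow's", 0.0191), ("Tagore", 0.0174),
                ("tomorrows", 0.0173)],
}
```
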
Integrate Normalization

[Figure: the same word-representation diagram as before, with t_i, c_i and e_i concatenated per word and fed to a BiLSTM; this is where the normalization candidates are integrated.]

Integrate Normalization

\[ \vec{w}_i = \sum_{j=0}^{n} P_{ij} \cdot \vec{n}_{ij} \]

Integrate Normalization

\[ \vec{w}_1 = (\vec{\text{new}} \cdot 0.9466) + (\vec{\text{news}} \cdot 0.0315) + (\vec{\text{knew}} \cdot 0.0111) + (\vec{\text{now}} \cdot 0.0063) + (\vec{\text{newt}} \cdot 0.0045) \]

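A minimal numpy sketch of this weighted sum; `emb` below is a toy embedding table, and renormalizing the probabilities over the top-n candidates is an assumption (the slide does not say whether the weights are rescaled to sum to one):

```python
import numpy as np

def integrated_embedding(cands, emb, dim=100):
    """w_i = sum_j P_ij * n_ij over the top-n candidates."""
    total = sum(p for _, p in cands)                 # renormalize over top-n
    w = np.zeros(dim)
    for word, p in cands:
        w += (p / total) * emb.get(word, np.zeros(dim))  # unknown word -> zero vector
    return w

# toy usage with random vectors for the candidates of "new"
rng = np.random.default_rng(0)
emb = {w: rng.standard_normal(100)
       for w in ["new", "news", "knew", "now", "newt"]}
w_1 = integrated_embedding([("new", 0.9466), ("news", 0.0315),
                            ("knew", 0.0111), ("now", 0.0063),
                            ("newt", 0.0045)], emb)
```

This way an uncertain token contributes a mixture of candidate vectors to the parser instead of forcing a hard normalization decision up front.
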
Integrate Normalization

[Figure: the same attachment-score comparison (roughly 48-64) for the base, +char, +ext and +char+ext models, now with four conditions: raw, direct normalization, integrated normalization, and gold normalization.]

Integrate Normalization

But what about in-domain performance?

Integrate Normalization

[Figure: in-domain scores (roughly 80-90) for the base parser and the +norm model; the two bars are almost indistinguishable at this scale.]

Integrate Normalization

[Figure: the same comparison zoomed in (y-axis offset by 84.2, range 0.00-0.10): base and +norm differ by less than 0.1 in-domain.]

Integrate Normalization

Test data:

  Model                        UAS     LAS
  raw                          70.47   60.16
  normalization (direct)       71.03*  61.83*
  normalization (integrated)   71.15   62.30*
  gold                         71.45   63.16*

Table: * indicates statistical significance compared to the previous entry.

Integrate Normalization

Conclusions:
- Normalization is still helpful on top of character and external embeddings
- Integrating normalization leads to a small but consistent, significant improvement
- Even with gold normalization, performance stays at around 60% LAS
- The new dataset will be made available; it provides a nice benchmark for domain adaptation

Next CLIN

- Effect of different categories of normalization replacements
- Get closer to gold normalization

Bibliography

Miryam de Lhoneux, Yan Shao, Ali Basirat, Eliyahu Kiperwasser, Sara Stymne, Yoav Goldberg, and Joakim Nivre. From raw text to universal dependencies - look, no tags! In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 207-217, Vancouver, Canada, August 2017. Association for Computational Linguistics.

Eliyahu Kiperwasser and Yoav Goldberg. Simple and accurate dependency parsing using bidirectional LSTM feature representations. TACL, 4:313-327, 2016.

Chen Li and Yang Liu. Joint POS tagging and text normalization for informal text. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, pages 1263-1269, 2015.

Integrate Normalization

- Foster: not noisy, constituency
- Denoised Web Treebank: no train
- Tweebank: no train
- Foreebank: not noisy