An Unsupervised Method for Uncovering Morphological Chains
Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola
CSAIL, Massachusetts Institute of Technology
Morphological Chains

Chains model the formation of words:
  paint → painting → paintings

A richer representation than traditional ones:
• Segmentation
• Paradigms
Our Approach

Core idea: an unsupervised discriminative model over pairs of words in the chain (paint → painting).

• Orthographic features — Morfessor (Goldwater and Johnson, 2004; Creutz and Lagus, 2007); Poon et al., 2009; Dreyer and Eisner, 2009; Sirts and Goldwater, 2013
• Semantic features — Schone and Jurafsky, 2000; Baroni et al., 2002
• Handles transformations (plan → planning)
Textual Cues

Orthographic: patterns in the characters forming words.
  paint / paints / painted      pain / pains / pained
  but misleading for: pain / paint, ran / rant

Semantic: meaning embedded as vectors.
  A      B        cos(A, B)
  paint  paints   0.68
  paint  painted  0.60
  pain   pains    0.60
  pain   paint    0.11
  ran    rant     0.09
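A minimal sketch of the semantic cue, assuming a `vectors` dict mapping words to numpy arrays (e.g. word2vec embeddings trained on Wikipedia); `semantic_cue` is an illustrative name, not from the paper:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_cue(word, parent, vectors):
    """High similarity (paint, paints) supports a morphological relation;
    low similarity (ran, rant) argues against one."""
    if word in vectors and parent in vectors:
        return cosine(vectors[word], vectors[parent])
    return 0.0  # no semantic evidence for out-of-vocabulary words
```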
Task Setup

Training: an unannotated word list with frequencies.
  a        395134
  ability  17793
  able     56802
  about    524355

Word vector learning: a large text corpus (Wikipedia).
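A minimal parsing sketch for the training input, assuming the two-column "word count" format shown above (`load_wordlist` is an illustrative name):

```python
def load_wordlist(path):
    """Read the unannotated training list: one '<word> <count>' pair per line."""
    freqs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, count = line.split()
            freqs[word] = int(count)
    return freqs
```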
Multiple chains are possible for a word:
  nation → national → international → internationally
  nation → national → nationally → internationally

Different chains can share word pairs:
  nation → national → international → internationally
  nation → national → nationalize
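To make the chain structure concrete, here is a sketch of how a chain is read off once a model can predict each word's best parent; `best_parent` is a hypothetical stand-in for the model's argmax decision, returning None for a Stop:

```python
def chain(word, best_parent):
    """Follow predicted parents down to a base word, e.g.
    internationally -> international -> national -> nation."""
    links = [word]
    parent = best_parent(word)
    while parent is not None:        # None encodes the Stop decision
        links.append(parent)
        parent = best_parent(parent)
    return list(reversed(links))     # base word first
```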
Independence Assumption

Treat word–parent pairs separately.
  Word (w):      national
  Parent (p):    nation
  Type (t):      Suffix
  Candidate (z): the pair (p, t)
For a word w and candidate z = (p, t):

  P(w, z) ∝ exp(θ · φ(w, z))

Types (t): Prefix, Suffix, Transformation, Stop.
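For prediction, only the conditional over a word's candidate set is needed, and the global normalizer cancels. A sketch, assuming `phi` returns a sparse feature dict and `theta` maps feature names to weights (both illustrative interfaces):

```python
import math

def candidate_probs(word, candidates, phi, theta):
    """P(z | w) proportional to exp(theta . phi(w, z)),
    normalized over this word's candidate set."""
    scores = [math.exp(sum(theta.get(name, 0.0) * value
                           for name, value in phi(word, z).items()))
              for z in candidates]
    total = sum(scores)
    return [s / total for s in scores]
```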
Transformations

• Templates for handling changes in the stem during the addition of affixes.
• Repetition template: PQ → PQQR (one template per character Q in the alphabet).
  Ex. plan → planning (P = pla, Q = n, R = ing)
• One feature template per transformation.
Transformation types

3 different transformations:
• Repetition (plan → planning)
• Deletion (decide → deciding)
• Modification (carry → carried)

Trade-off between types of transformation and computational tractability:
• These three do well for a range of languages and remain computationally tractable: at most O(|Σ|²) templates for alphabet Σ.
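A hedged sketch of how the three transformations can be undone when proposing candidate parents for a suffixed word; the function name, split bounds, and exact candidate set are illustrative, not the paper's code:

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def candidate_parents(word):
    """Propose (parent, type) candidates by undoing each transformation:
    planning -> plan (repetition), deciding -> decide (deletion),
    carried -> carry (modification)."""
    parents = []
    for i in range(2, len(word)):                    # split into stem + suffix
        stem = word[:i]
        parents.append((stem, "suffix"))             # plain split: paint + ing
        if len(stem) > 1 and stem[-1] == stem[-2]:
            parents.append((stem[:-1], "repeat"))    # plann -> plan
        parents.append((stem + "e", "delete"))       # decid -> decide
        for c in ALPHABET:                           # one per letter: the O(|alphabet|) factor
            if c != stem[-1]:
                parents.append((stem[:-1] + c, "modify"))  # carri -> carry
    return parents
```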
Features φ(w, z)

Orthographic:
• Affixes: indicator feature for top affixes
• Affix correlation: pairs of affixes sharing a set of stems, e.g. (inter-, re-), (under-, over-)
• Word frequency of the parent
• Transformation types with character bigrams

Semantic:
• Cosine similarity between the word vectors of word and parent

[Figure: cosine similarities of candidate words with "player"]
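A sketch of the feature map, reusing `semantic_cue` from the earlier sketch; the feature names and the plain-suffix affix extraction are illustrative simplifications (affix-correlation features omitted for brevity):

```python
import math

def phi(word, candidate, top_affixes, freqs, vectors):
    """Sparse feature dict phi(w, z) for a word and candidate (parent, type)."""
    parent, kind = candidate
    feats = {"type=" + kind: 1.0}                  # incl. transformation types
    affix = word[len(parent):]                     # plain suffix split assumed
    if affix in top_affixes:
        feats["affix=" + affix] = 1.0              # indicator for top affixes
    feats["log_parent_freq"] = math.log(freqs.get(parent, 1))
    feats["cos_sim"] = semantic_cue(word, parent, vectors)
    return feats
```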
Learning

• Objective (likelihood of the observed word list):

  ∏_w P(w) = ∏_w Σ_z P(w, z) = ∏_w Σ_z exp(θ · φ(w, z)) / Z,
  where Z = Σ_{w' ∈ Σ*, z'} exp(θ · φ(w', z'))

• Optimize the likelihood with LBFGS-B (with regularization).
• Not tractable as-is: the normalization constant Z requires summing over all possible strings in the alphabet.
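A sketch of how a differentiable objective plugs into LBFGS-B via SciPy; `neg_ll_and_grad` is a hypothetical callable returning (loss, gradient) for the tractable contrastive objective on the next slide, with an L2 penalty standing in for regularization:

```python
import numpy as np
from scipy.optimize import minimize

def fit(neg_ll_and_grad, dim, l2=1.0):
    """Minimize loss(theta) + l2 * ||theta||^2 with LBFGS-B."""
    def objective(theta):
        loss, grad = neg_ll_and_grad(theta)
        return loss + l2 * np.dot(theta, theta), grad + 2.0 * l2 * theta
    result = minimize(objective, np.zeros(dim), jac=True, method="L-BFGS-B")
    return result.x
```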
Contrastive Estimation

• Instead, we use contrastive estimation (Smith and Eisner, 2005).
• For each word, construct a neighborhood of invalid words to take probability mass from.
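One concrete neighborhood choice in the spirit of Smith and Eisner (2005): all single transpositions of adjacent characters, which yield mostly invalid strings near each observed word (a sketch; the paper's exact neighborhood may differ):

```python
def neighborhood(word):
    """Invalid near-miss strings: every single adjacent transposition.
    'paint' -> 'apint', 'piant', 'panit', 'paitn'."""
    neighbors = set()
    for i in range(len(word) - 1):
        swapped = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        if swapped != word:          # skip no-ops from doubled letters
            neighbors.add(swapped)
    return neighbors
```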