Alignment

• In a parallel text (or when we translate), we align words in one language with the words in the other

  1    2      3    4
  das  Haus   ist  klein
  the  house  is   small
  1    2      3    4

• Word positions are numbered 1–4

Philipp Koehn Machine Translation: IBM Model 1 and the EM Algorithm 13 September 2018

Alignment Function

• Formalizing alignment with an alignment function
• Mapping an English target word at position i to a German source word at position j with a function a : i → j
• Example

  a : {1 → 1, 2 → 2, 3 → 3, 4 → 4}

Reordering

• Words may be reordered during translation

  1      2    3    4
  klein  ist  das  Haus
  the    house  is  small
  1      2      3   4

  a : {1 → 3, 2 → 4, 3 → 2, 4 → 1}

One-to-Many Translation

• A source word may translate into multiple target words

  1    2      3    4
  das  Haus   ist  klitzeklein
  the  house  is   very  small
  1    2      3    4     5

  a : {1 → 1, 2 → 2, 3 → 3, 4 → 4, 5 → 4}

Dropping Words

• Words may be dropped when translated (German article das is dropped)

  1    2     3    4
  das  Haus  ist  klein
  house  is  small
  1      2   3

  a : {1 → 2, 2 → 3, 3 → 4}

Inserting Words

• Words may be added during translation
  – The English just does not have an equivalent in German
  – We still need to map it to something: special NULL token

  0     1    2      3    4
  NULL  das  Haus   ist  klein
  the   house  is   just  small
  1     2      3    4     5

  a : {1 → 1, 2 → 2, 3 → 3, 4 → 0, 5 → 4}
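The alignment functions on these slides can be written down directly as mappings from English positions to German positions. The following is a minimal Python sketch, not part of the original slides: `aligned_pairs` is a hypothetical helper name, positions are 1-based as on the slides, and German position 0 is reserved for the NULL token.

```python
NULL = "NULL"

def aligned_pairs(english, german, a):
    """Return (English word, German word) pairs for alignment function a.

    a maps 1-based English positions to 1-based German positions;
    German position 0 stands for the special NULL token.
    """
    source = [NULL] + german  # index 0 holds the NULL token
    return [(english[j - 1], source[a[j]]) for j in sorted(a)]

# "Inserting Words" example: "just" has no German equivalent, so it maps to 0.
german = "das Haus ist klein".split()
english = "the house is just small".split()
a = {1: 1, 2: 2, 3: 3, 4: 0, 5: 4}
pairs = aligned_pairs(english, german, a)
print(pairs)
```

The same helper covers reordering, one-to-many translation, and dropped words, because each English position simply names the German position it was generated from.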

IBM Model 1

• Generative model: break up translation process into smaller steps
  – IBM Model 1 only uses lexical translation
• Translation probability
  – for a foreign sentence $\mathbf{f} = (f_1, \ldots, f_{l_f})$ of length $l_f$
  – to an English sentence $\mathbf{e} = (e_1, \ldots, e_{l_e})$ of length $l_e$
  – with an alignment of each English word $e_j$ to a foreign word $f_i$ according to the alignment function $a : j \to i$

$$p(\mathbf{e}, a \mid \mathbf{f}) = \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)})$$

  – parameter $\epsilon$ is a normalization constant

Example

  das               Haus                 ist               klein
  e      t(e|f)     e          t(e|f)    e       t(e|f)    e       t(e|f)
  the    0.7        house      0.8       is      0.8       small   0.4
  that   0.15       building   0.16      's      0.16      little  0.4
  which  0.075      home       0.02      exists  0.02      short   0.1
  who    0.05       household  0.015     has     0.015     minor   0.06
  this   0.025      shell      0.005     are     0.005     petty   0.04

With $l_f = l_e = 4$, the normalization term $(l_f + 1)^{l_e}$ is $5^4 = 625$:

$$
\begin{aligned}
p(\mathbf{e}, a \mid \mathbf{f}) &= \frac{\epsilon}{5^4} \times t(\text{the} \mid \text{das}) \times t(\text{house} \mid \text{Haus}) \times t(\text{is} \mid \text{ist}) \times t(\text{small} \mid \text{klein}) \\
&= \frac{\epsilon}{5^4} \times 0.7 \times 0.8 \times 0.8 \times 0.4 \\
&\approx 0.00029\,\epsilon
\end{aligned}
$$
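The example can be checked numerically by evaluating the Model 1 formula as defined on the IBM Model 1 slide, with normalization $(l_f + 1)^{l_e}$. This is a small sketch under stated assumptions: `model1_joint` is a hypothetical function name, the table `t` holds only the four entries needed here, and $\epsilon$ is set to 1 so the result is the coefficient of $\epsilon$.

```python
import math

def model1_joint(english, foreign, a, t, epsilon=1.0):
    """p(e, a | f) = epsilon / (l_f + 1)^l_e * prod_j t(e_j | f_a(j)).

    a maps 1-based English positions to 1-based foreign positions (0 = NULL).
    t is a dict keyed by (english_word, foreign_word).
    """
    l_e, l_f = len(english), len(foreign)
    source = ["NULL"] + foreign
    lexical = math.prod(t[(english[j - 1], source[a[j]])] for j in range(1, l_e + 1))
    return epsilon / (l_f + 1) ** l_e * lexical

# Translation probabilities taken from the example table.
t = {
    ("the", "das"): 0.7,
    ("house", "Haus"): 0.8,
    ("is", "ist"): 0.8,
    ("small", "klein"): 0.4,
}
p = model1_joint(
    "the house is small".split(),
    "das Haus ist klein".split(),
    {1: 1, 2: 2, 3: 3, 4: 4},
    t,
)
print(f"{p:.6f}")  # 0.000287, i.e. 0.1792 / 625
```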

EM Algorithm

EM Algorithm

• Incomplete data
  – if we had complete data, we could estimate the model
  – if we had the model, we could fill in the gaps in the data
• Expectation Maximization (EM) in a nutshell
  1. initialize model parameters (e.g. uniform)
  2. assign probabilities to the missing data
  3. estimate model parameters from completed data
  4. iterate steps 2–3 until convergence

EM Algorithm

  ... la maison ... la maison bleu ... la fleur ...
  ... the house ... the blue house ... the flower ...

• Initial step: all alignments equally likely
• Model learns that, e.g., la is often aligned with the

EM Algorithm

  ... la maison ... la maison bleu ... la fleur ...
  ... the house ... the blue house ... the flower ...

• After one iteration
• Alignments, e.g., between la and the are more likely

EM Algorithm

  ... la maison ... la maison bleu ... la fleur ...
  ... the house ... the blue house ... the flower ...

• After another iteration
• It becomes apparent that alignments, e.g., between fleur and flower are more likely (pigeon hole principle)

EM Algorithm

  ... la maison ... la maison bleu ... la fleur ...
  ... the house ... the blue house ... the flower ...

• Convergence
• Inherent hidden structure revealed by EM

EM Algorithm

  ... la maison ... la maison bleu ... la fleur ...
  ... the house ... the blue house ... the flower ...

  p(la|the) = 0.453
  p(le|the) = 0.334
  p(maison|house) = 0.876
  p(bleu|blue) = 0.563
  ...

• Parameter estimation from the aligned corpus

IBM Model 1 and EM

• EM Algorithm consists of two steps
• Expectation-Step: Apply model to the data
  – parts of the model are hidden (here: alignments)
  – using the model, assign probabilities to possible values
• Maximization-Step: Estimate model from data
  – take assigned values as fact
  – collect counts (weighted by probabilities)
  – estimate model from counts
• Iterate these steps until convergence
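The two steps can be sketched in a few lines of Python on the toy French–English corpus used in the EM slides. This is an illustrative sketch under assumptions, not the lecture's reference implementation: `train_model1` is a hypothetical name, the table is stored as `t[(e, f)]` for $t(e \mid f)$, a NULL token is prepended to every foreign sentence, and a fixed number of iterations stands in for a convergence test.

```python
from collections import defaultdict

def train_model1(corpus, iterations=10):
    """Estimate t(e | f) with EM. corpus: list of (foreign_words, english_words)."""
    corpus = [(["NULL"] + f_sent, e_sent) for f_sent, e_sent in corpus]
    e_vocab = {e for _, e_sent in corpus for e in e_sent}
    # Initialization: every English word is equally likely given any foreign word.
    t = defaultdict(lambda: 1.0 / len(e_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (e, f) links
        total = defaultdict(float)  # expected count of links from f
        for f_sent, e_sent in corpus:
            for e in e_sent:
                # E-step: probability that e is aligned to each f in the sentence.
                z = sum(t[(e, f)] for f in f_sent)
                for f in f_sent:
                    c = t[(e, f)] / z
                    count[(e, f)] += c  # counts weighted by probability
                    total[f] += c
        # M-step: re-estimate the model from the collected counts.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return t

corpus = [
    ("la maison".split(), "the house".split()),
    ("la maison bleu".split(), "the blue house".split()),
    ("la fleur".split(), "the flower".split()),
]
t = train_model1(corpus)
for f in ["la", "maison", "bleu", "fleur"]:
    best = max(["the", "house", "blue", "flower"], key=lambda e: t[(e, f)])
    print(f, "->", best)
```

On this corpus the most probable translation of each French word ends up being the intuitive one, even though fleur and bleu each occur only once.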

IBM Model 1 and EM

• We need to be able to compute:
  – Expectation-Step: probability of alignments
  – Maximization-Step: count collection

IBM Model 1 and EM: Expectation Step

• We need to compute $p(a \mid \mathbf{e}, \mathbf{f})$
• Applying the chain rule:

$$p(a \mid \mathbf{e}, \mathbf{f}) = \frac{p(\mathbf{e}, a \mid \mathbf{f})}{p(\mathbf{e} \mid \mathbf{f})}$$

• We already have the formula for $p(\mathbf{e}, a \mid \mathbf{f})$ (definition of Model 1)

IBM Model 1 and EM: Expectation Step

• We need to compute $p(\mathbf{e} \mid \mathbf{f})$

$$
\begin{aligned}
p(\mathbf{e} \mid \mathbf{f}) &= \sum_{a} p(\mathbf{e}, a \mid \mathbf{f}) \\
&= \sum_{a(1)=0}^{l_f} \cdots \sum_{a(l_e)=0}^{l_f} p(\mathbf{e}, a \mid \mathbf{f}) \\
&= \sum_{a(1)=0}^{l_f} \cdots \sum_{a(l_e)=0}^{l_f} \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)})
\end{aligned}
$$

IBM Model 1 and EM: Expectation Step

$$
\begin{aligned}
p(\mathbf{e} \mid \mathbf{f}) &= \sum_{a(1)=0}^{l_f} \cdots \sum_{a(l_e)=0}^{l_f} \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)}) \\
&= \frac{\epsilon}{(l_f + 1)^{l_e}} \sum_{a(1)=0}^{l_f} \cdots \sum_{a(l_e)=0}^{l_f} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)}) \\
&= \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} \sum_{i=0}^{l_f} t(e_j \mid f_i)
\end{aligned}
$$

• Note the trick in the last line
  – removes the need for an exponential number of products
  → this makes IBM Model 1 estimation tractable

The Trick

(case $l_e = l_f = 2$)

$$
\begin{aligned}
\sum_{a(1)=0}^{2} \sum_{a(2)=0}^{2} \frac{\epsilon}{3^2} \prod_{j=1}^{2} t(e_j \mid f_{a(j)})
&= \frac{\epsilon}{3^2} \big[\, t(e_1|f_0)\,t(e_2|f_0) + t(e_1|f_0)\,t(e_2|f_1) + t(e_1|f_0)\,t(e_2|f_2) \\
&\qquad + t(e_1|f_1)\,t(e_2|f_0) + t(e_1|f_1)\,t(e_2|f_1) + t(e_1|f_1)\,t(e_2|f_2) \\
&\qquad + t(e_1|f_2)\,t(e_2|f_0) + t(e_1|f_2)\,t(e_2|f_1) + t(e_1|f_2)\,t(e_2|f_2) \,\big] \\
&= \frac{\epsilon}{3^2} \big[\, t(e_1|f_0)\,\big(t(e_2|f_0) + t(e_2|f_1) + t(e_2|f_2)\big) \\
&\qquad + t(e_1|f_1)\,\big(t(e_2|f_0) + t(e_2|f_1) + t(e_2|f_2)\big) \\
&\qquad + t(e_1|f_2)\,\big(t(e_2|f_0) + t(e_2|f_1) + t(e_2|f_2)\big) \,\big] \\
&= \frac{\epsilon}{3^2} \big(t(e_1|f_0) + t(e_1|f_1) + t(e_1|f_2)\big)\,\big(t(e_2|f_0) + t(e_2|f_1) + t(e_2|f_2)\big)
\end{aligned}
$$
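The identity behind the trick is easy to confirm numerically: summing the product over every alignment gives the same value as multiplying the per-word sums. The sketch below is illustrative only. It assumes arbitrary random values in place of real translation probabilities, stores $t(e_{j+1} \mid f_i)$ as `t[j][i]`, and leaves out the common factor $\epsilon / (l_f + 1)^{l_e}$. The function names are hypothetical.

```python
import itertools
import math
import random

def sum_over_alignments(l_e, l_f, t):
    """Brute force: (l_f + 1)^l_e alignments, one product of l_e terms each."""
    return sum(
        math.prod(t[j][a_j] for j, a_j in enumerate(a))
        for a in itertools.product(range(l_f + 1), repeat=l_e)
    )

def product_of_sums(l_e, l_f, t):
    """Factored form: l_e sums of (l_f + 1) terms each."""
    return math.prod(sum(t[j][i] for i in range(l_f + 1)) for j in range(l_e))

random.seed(0)
l_e, l_f = 2, 2  # the case worked out on the slide
t = [[random.random() for _ in range(l_f + 1)] for _ in range(l_e)]
slow = sum_over_alignments(l_e, l_f, t)
fast = product_of_sums(l_e, l_f, t)
print(math.isclose(slow, fast))  # True
```

The brute-force version enumerates $(l_f + 1)^{l_e}$ alignments, while the factored version needs only $l_e \cdot (l_f + 1)$ table lookups.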

IBM Model 1 and EM: Expectation Step

• Combine what we have:

$$
\begin{aligned}
p(a \mid \mathbf{e}, \mathbf{f}) &= \frac{p(\mathbf{e}, a \mid \mathbf{f})}{p(\mathbf{e} \mid \mathbf{f})} \\
&= \frac{\dfrac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)})}{\dfrac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} \sum_{i=0}^{l_f} t(e_j \mid f_i)} \\
&= \prod_{j=1}^{l_e} \frac{t(e_j \mid f_{a(j)})}{\sum_{i=0}^{l_f} t(e_j \mid f_i)}
\end{aligned}
$$
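Because the posterior factorizes over English words, it can be computed one word at a time. The following sketch assumes a hypothetical function `alignment_posterior` and a small made-up table: the values for t(the|das) and t(house|Haus) come from the earlier example, and the remaining entries are invented for illustration. As a sanity check it confirms that the posteriors of all alignments sum to 1.

```python
import itertools
import math

def alignment_posterior(english, foreign, a, t):
    """p(a | e, f) = prod_j t(e_j | f_a(j)) / sum_i t(e_j | f_i).

    a is a tuple giving, for each English word in order, the foreign
    position it aligns to (0 = NULL, 1..l_f = foreign words).
    """
    source = ["NULL"] + foreign
    p = 1.0
    for e, a_j in zip(english, a):
        z = sum(t[(e, f)] for f in source)  # per-word normalizer
        p *= t[(e, source[a_j])] / z
    return p

# Illustrative table: 0.7 and 0.8 are from the example, the rest are assumed.
t = {
    ("the", "NULL"): 0.1, ("the", "das"): 0.7, ("the", "Haus"): 0.05,
    ("house", "NULL"): 0.05, ("house", "das"): 0.1, ("house", "Haus"): 0.8,
}
english, foreign = ["the", "house"], ["das", "Haus"]
p_mono = alignment_posterior(english, foreign, (1, 2), t)
total = sum(
    alignment_posterior(english, foreign, a, t)
    for a in itertools.product(range(len(foreign) + 1), repeat=len(english))
)
print(round(p_mono, 4), math.isclose(total, 1.0))
```

Note that neither $\epsilon$ nor the $(l_f + 1)^{l_e}$ term appears in the code, since both cancel in the ratio.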
