
Machine Translation: Examples
Statistical NLP, Spring 2011, Lecture 7: Phrase-Based MT
Dan Klein, UC Berkeley



  1. Levels of Transfer

     Word-Level MT: Examples
     - la politique de la haine . (Foreign Original)
       politics of hate . (Reference Translation)
       the policy of the hatred . (IBM4+N-grams+Stack)
     - nous avons signé le protocole . (Foreign Original)
       we did sign the memorandum of agreement . (Reference Translation)
       we have signed the protocol . (IBM4+N-grams+Stack)
     - où était le plan solide ? (Foreign Original)
       but where was the solid plan ? (Reference Translation)
       where was the economic base ? (IBM4+N-grams+Stack)

     MT: Evaluation
     - Human evaluations: subjective measures, fluency/adequacy
     - Automatic measures: n-gram match to references
       - NIST measure: n-gram recall (worked poorly)
       - BLEU: n-gram precision (no one really likes it, but everyone uses it)
     - BLEU:
       - P1 = unigram precision
       - P2, P3, P4 = bi-, tri-, 4-gram precision
       - Weighted geometric mean of P1-P4
       - Brevity penalty (why?)
       - Somewhat hard to game
       (a minimal BLEU sketch follows this page)

     Phrasal / Syntactic MT: Examples
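The BLEU bullets above amount to a short recipe: clipped n-gram precisions P1-P4, a weighted geometric mean, and a brevity penalty so that precision cannot be gamed by outputting very short translations. Here is a minimal single-sentence sketch of that recipe, assuming uniform 1/4 weights and a single reference; the function name and the smoothing constant are illustrative, not from the lecture.

    import math
    from collections import Counter

    def bleu(hypothesis, reference, max_n=4):
        """Sketch: geometric mean of clipped n-gram precisions times a brevity penalty."""
        hyp, ref = hypothesis.split(), reference.split()
        log_precisions = []
        for n in range(1, max_n + 1):
            hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
            ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
            # Clip each hypothesis n-gram count by its count in the reference.
            matches = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
            total = max(sum(hyp_ngrams.values()), 1)
            log_precisions.append(math.log(max(matches, 1e-9) / total))
        # Uniform weights: plain geometric mean of P1..P4.
        log_mean = sum(log_precisions) / max_n
        # Brevity penalty: without it, a one-word "perfect" output would score 1.0.
        bp = 1.0 if len(hyp) > len(ref) else math.exp(1.0 - len(ref) / max(len(hyp), 1))
        return bp * math.exp(log_mean)

    # Toy check against the first example above:
    print(bleu("the policy of the hatred .", "politics of hate ."))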

  2. Automatic Metrics Work (?)

     Corpus-Based MT
     - Modeling correspondences between languages
     - Sentence-aligned parallel corpus:
       Yo lo haré mañana   /  I will do it tomorrow
       Hasta pronto        /  See you soon
       Hasta pronto        /  See you around
     - Machine translation system: a model of translation; given the novel input
       "Yo lo haré pronto", it might propose "I will do it soon",
       "I will do it around", or "See you tomorrow"

     Phrase-Based Systems
     - Pipeline: sentence-aligned corpus -> word alignments -> phrase table
       (translation model)
     - Phrase table entries (a loading sketch for this format follows this page):
       cat ||| chat ||| 0.9
       the cat ||| le chat ||| 0.8
       dog ||| chien ||| 0.8
       house ||| maison ||| 0.6
       my house ||| ma maison ||| 0.9
       language ||| langue ||| 0.9
       …
     - Many slides and examples from Philipp Koehn or John DeNero

     Phrase-Based Decoding
     - 这 7 人 中包括 来自 法国 和 俄罗斯 的 宇航 员 .
     - Decoder design is important: [Koehn et al. 03]

     The Pharaoh "Model"  [Koehn et al., 2003]
     - Segmentation, Translation, Distortion
       (a scoring sketch combining these factors follows this page)
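The "phrase ||| phrase ||| score" lines above are the usual plain-text phrase-table layout: a source phrase, a target phrase, and a score separated by "|||". A hedged loading sketch under that assumption follows; real tables (e.g. Moses-style ones) carry several scores and alignment information per line, and the file name and field order here are assumptions.

    from collections import defaultdict

    def load_phrase_table(path):
        """Sketch: read 'src ||| tgt ||| score' lines into a dict mapping each
        source phrase to its candidate translations, best score first."""
        table = defaultdict(list)
        with open(path, encoding="utf-8") as f:
            for line in f:
                fields = [part.strip() for part in line.split("|||")]
                if len(fields) < 3:
                    continue  # skip blank or malformed lines
                src, tgt, score = fields[0], fields[1], float(fields[2])
                table[src].append((tgt, score))
        for src in table:
            table[src].sort(key=lambda pair: pair[1], reverse=True)
        return dict(table)

    # e.g. load_phrase_table("phrase-table.txt")["the cat"] -> [("le chat", 0.8)]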

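The Pharaoh "model" above factors a translation into a segmentation of the foreign sentence, a translation score for each chosen phrase pair, and a distortion penalty for jumping around in the foreign sentence, combined with an English language model; the next page asks where the phrase weights come from. As a hedged illustration of how those pieces multiply together for one already-chosen derivation (the bigram LM, the distortion base, and all names are assumptions for the sketch, not values from the slides):

    import math

    def score_derivation(steps, tm, lm_bigram, distortion_base=0.9):
        """Sketch of Pharaoh-style scoring for one segmented derivation.
        steps: list of (f_phrase, e_phrase, f_start, f_end) in English order,
        where f_start/f_end delimit the foreign phrase (end exclusive).
        tm[(f_phrase, e_phrase)] is a phrase translation probability and
        lm_bigram[(prev_word, word)] a bigram probability; both are toy lookups."""
        log_score = 0.0
        prev_end = 0          # foreign position just past the previous phrase
        prev_word = "<s>"
        for f_phrase, e_phrase, f_start, f_end in steps:
            log_score += math.log(tm[(f_phrase, e_phrase)])                    # translation
            log_score += abs(f_start - prev_end) * math.log(distortion_base)  # distortion
            for word in e_phrase.split():                                     # language model
                log_score += math.log(lm_bigram.get((prev_word, word), 1e-6))
                prev_word = word
            prev_end = f_end
        log_score += math.log(lm_bigram.get((prev_word, "</s>"), 1e-6))
        return log_score

A real decoder searches over all segmentations and orderings of such steps; the sketch only scores one of them.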
  3. The Pharaoh "Model": Phrase Weights
     - Where do we get these counts?

     Phrase-Based Decoding: Monotonic Word Translation
     - Cost is LM * TM
     - It's an HMM?
       - P(e | e-1, e-2)
       - P(f | e)
     - State includes:
       - Exposed English
       - Position in foreign
     - Example partial hypotheses (scores):
       [… slap to, 6] 0.00000016;  [… a slap, 5] 0.00001;  [… slap by, 6] 0.00000001
     - Dynamic program loop:
         for (fPosition in 1…|f|)
           for (eContext in allEContexts)
             for (eOption in translations[fPosition])
               score = scores[fPosition-1][eContext]
                       * LM(eContext + eOption)
                       * TM(eOption, fWord[fPosition])
               scores[fPosition][eContext[2] + eOption] = max score

     Beam Decoding
     - For real MT models, this kind of dynamic program is a disaster (why?)
     - Standard solution is beam search: for each position, keep track of only
       the best k hypotheses (a runnable sketch follows this page)
         for (fPosition in 1…|f|)
           for (eContext in bestEContexts[fPosition-1])
             for (eOption in translations[fPosition])
               score = scores[fPosition-1][eContext]
                       * LM(eContext + eOption)
                       * TM(eOption, fWord[fPosition])
               bestEContexts[fPosition].maybeAdd(eContext[2] + eOption, score)
     - Still pretty slow… why?
     - Useful trick: cube pruning (Chiang 2005)

     Phrase Translation
     - If monotonic, almost an HMM; technically a semi-HMM
         for (fPosition in 1…|f|)
           for (lastPosition < fPosition)
             for (eContext in eContexts)
               for (eOption in translations[fPosition])
                 … combine hypothesis for (lastPosition ending in eContext) with eOption
     - If distortion… now what?
     - Example from David Chiang
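To make the beam-search pseudocode above concrete, here is a small runnable sketch of monotonic, word-by-word beam decoding. It simplifies the slide's loop: it uses a bigram LM, so the state is just the last English word rather than the two-word context in the pseudocode, and toy dictionary models; every name and number below is illustrative rather than taken from the lecture.

    import math

    def beam_decode(f_words, tm, lm, beam_size=5):
        """Monotonic word-by-word beam decoding sketch.
        tm[f_word] -> list of (e_word, P(f|e)); lm[(prev, cur)] -> bigram prob.
        Each beam entry is (last_e_word, log_score, english_so_far)."""
        beams = [[("<s>", 0.0, [])]]                     # beam before any foreign word
        for i, f_word in enumerate(f_words, start=1):
            candidates = {}
            for prev_e, score, words in beams[i - 1]:
                for e_word, tm_prob in tm.get(f_word, []):
                    new_score = (score
                                 + math.log(lm.get((prev_e, e_word), 1e-6))
                                 + math.log(tm_prob))
                    # Recombination: keep only the best hypothesis per LM state.
                    if e_word not in candidates or new_score > candidates[e_word][1]:
                        candidates[e_word] = (e_word, new_score, words + [e_word])
            # Beam pruning: keep the best k states at this foreign position.
            beams.append(sorted(candidates.values(),
                                key=lambda h: h[1], reverse=True)[:beam_size])
        best = max(beams[-1], key=lambda h: h[1])
        return best[2], best[1]

    # Toy usage (all probabilities invented):
    tm = {"la": [("the", 0.7)], "maison": [("house", 0.6), ("home", 0.3)]}
    lm = {("<s>", "the"): 0.4, ("the", "house"): 0.3, ("the", "home"): 0.2}
    print(beam_decode(["la", "maison"], tm, lm))

The slide's "disaster" point is exactly what the pruning line addresses: without it, the number of states per foreign position grows with the full space of exposed-English contexts.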

  4. Non-Monotonic Phrasal MT

     Pruning: Beams + Forward Costs
     - Problem: easy partial analyses are cheaper
     - Solution 1: use beams per foreign subset
     - Solution 2: estimate forward costs (A*-like)
       (a forward-cost table sketch follows this page)

     The Pharaoh Decoder

     Hypothesis Lattices

     Word Alignment
     - [alignment grid figures]
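The "forward costs (A*-like)" bullet above compensates for the fact that hypotheses which have translated only the easy words look artificially good. One common realization, sketched here under simplifying assumptions (scores are log-probabilities, distortion and cross-span LM context are ignored, and all names are illustrative), precomputes the best achievable score for every foreign span:

    def future_cost_table(n, span_scores):
        """Sketch: optimistic best log-score for translating each foreign span [i, j).
        span_scores[(i, j)] -> list of log-scores for phrase options covering
        f[i:j]; spans with no direct option can still be built from sub-spans."""
        cost = {}
        for length in range(1, n + 1):
            for i in range(0, n - length + 1):
                j = i + length
                best = max(span_scores.get((i, j), [float("-inf")]))  # one phrase for the span
                for k in range(i + 1, j):                             # or split it in two
                    best = max(best, cost[(i, k)] + cost[(k, j)])
                cost[(i, j)] = best
        return cost

    # During decoding, a partial hypothesis is then ranked by its own score plus
    # the sum of cost[(i, j)] over its maximal uncovered spans, so stacks/beams
    # compare "easy-first" and "hard-first" hypotheses on an even footing.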

  5. Unsupervised Word Alignment
     - Input: a bitext, i.e. pairs of translated sentences
       nous acceptons votre opinion .  /  we accept your view .
     - Output: alignments, i.e. pairs of translated words
     - When words have unique sources, can represent as a (forward) alignment
       function a from French to English positions

     1-to-Many Alignments

     Many-to-Many Alignments

     IBM Model 1 (Brown 93)
     - Alignments: a hidden vector called an alignment specifies which English
       source is responsible for each French target word
       (an EM training sketch for Model 1 follows this page)

     IBM Models 1/2
     - Example (English positions 1-9):
       E: Thank you , I shall do so gladly .
       A: 1 3 7 6 8 8 8 8 9
       F: Gracias , lo haré de muy buen grado .
     - Model parameters:
       Emissions:   P(F1 = Gracias | E_A1 = Thank)
       Transitions: P(A2 = 3)
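IBM Model 1 treats the alignment vector as hidden and, with no distortion model at all, reduces to learning word translation probabilities t(f | e) by EM. A minimal training sketch under common simplifications (uniform initialization, an explicit NULL source word, a fixed number of iterations; the toy bitext and names are illustrative, not from the lecture):

    from collections import defaultdict

    def train_model1(bitext, iterations=10):
        """Minimal IBM Model 1 EM sketch.
        bitext: list of (english_words, french_words) sentence pairs.
        Returns t[(f, e)] = P(f | e), with "NULL" as an extra English source."""
        f_vocab = {f for _, fs in bitext for f in fs}
        t = defaultdict(lambda: 1.0 / len(f_vocab))      # uniform initialization
        for _ in range(iterations):
            count = defaultdict(float)                   # expected counts c(f, e)
            total = defaultdict(float)                   # expected counts c(e)
            for e_words, f_words in bitext:
                e_words = ["NULL"] + list(e_words)
                for f in f_words:
                    # E-step: posterior over which English word generated f;
                    # under Model 1 this is proportional to t(f | e).
                    norm = sum(t[(f, e)] for e in e_words)
                    for e in e_words:
                        p = t[(f, e)] / norm
                        count[(f, e)] += p
                        total[e] += p
            # M-step: re-estimate t(f | e) from the expected counts.
            for (f, e), c in count.items():
                t[(f, e)] = c / total[e]
        return t

    # Toy usage:
    bitext = [(["the", "house"], ["la", "maison"]),
              (["the"], ["la"])]
    t = train_model1(bitext)
    print(round(t[("maison", "house")], 3))

Model 2 would add the position (transition) parameters shown in the "Model parameters" bullet; Model 1 leaves the alignment distribution uniform.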
