Algorithms for NLP 11-711, Fall 2019 Lecture 21: Machine Translation

May 31, 2023

Algorithms for NLP 11-711, Fall 2019
Lecture 21: Machine Translation I
Yulia Tsvetkov
Machine Translation: from Dream of the Red Chamber, Cao Xue Qin (1792)
English: leg, foot, paw
French: jambe, pied, patte, étape
Challenges: Ambiguities

EM Algorithm
▪ Parameter estimation from the aligned corpus

IBM Model 1 and EM
EM Algorithm consists of two steps
▪ Expectation-Step: Apply model to the data
  ▪ parts of the model are hidden (here: alignments)
  ▪ using the model, assign probabilities to possible values
▪ Maximization-Step: Estimate model from data
  ▪ take assigned values as fact
  ▪ collect counts (weighted by lexical translation probabilities)
  ▪ estimate model from counts
▪ Iterate these steps until convergence

IBM Model 1 and EM
▪ We need to be able to compute:
  ▪ Expectation-Step: probability of alignments
  ▪ Maximization-Step: count collection
IBM Model 1 and EM t-table
IBM Model 1 and EM t-table Applying the chain rule:
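The formula for this slide did not survive in the transcript. Assuming the usual IBM Model 1 notation (e for the English sentence, f for the foreign sentence, a for the alignment), the chain-rule step is commonly written as follows; treat it as a reconstruction, not the slide's exact wording:

```latex
p(a \mid e, f) = \frac{p(e, a \mid f)}{p(e \mid f)}
```

The numerator is the joint probability of the sentence and one alignment; the denominator sums that quantity over all possible alignments.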
IBM Model 1 and EM: Expectation Step
The Trick
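The derivation for this slide is missing from the transcript. A sketch of what "the trick" usually refers to for IBM Model 1, assuming t(e_j | f_i) is the lexical translation table, index 0 denotes the NULL word, and ϵ is the normalization constant: because alignment decisions are independent, the sum over all alignments factors into a product of per-word sums.

```latex
\begin{aligned}
p(e \mid f) &= \sum_{a} p(e, a \mid f)
  = \sum_{a(1)=0}^{l_f} \cdots \sum_{a(l_e)=0}^{l_f}
    \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)}) \\
&= \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} \sum_{i=0}^{l_f} t(e_j \mid f_i) \\[4pt]
p(a \mid e, f) &= \prod_{j=1}^{l_e} \frac{t(e_j \mid f_{a(j)})}{\sum_{i=0}^{l_f} t(e_j \mid f_i)}
\end{aligned}
```

This replaces a sum over (l_f + 1)^{l_e} alignments with roughly l_e · (l_f + 1) table lookups, which is what makes the E-step tractable.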
IBM Model 1 and EM: Expectation Step t-table E-step
IBM Model 1 and EM: Maximization Step
IBM Model 1 and EM: Maximization Step t-table E-step M-step
IBM Model 1 and EM: Maximization Step t-table E-step M-step Update t-table: p(the|la) = c(the|la) / c(la)
IBM Model 1 and EM: Pseudocode
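The pseudocode for this slide was not preserved. The E-step and M-step described in the preceding slides can be sketched in Python roughly as follows. The function name `train_ibm1`, the `"NULL"` token string, the uniform initialisation, and the fixed iteration count are illustrative assumptions, not details taken from the lecture.

```python
from collections import defaultdict


def train_ibm1(corpus, iterations=10):
    """Estimate t(e|f) with EM. corpus: list of (english_tokens, foreign_tokens)."""
    e_vocab = {e for es, _ in corpus for e in es}
    # Assumption: start from a uniform translation table.
    t = defaultdict(lambda: 1.0 / len(e_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(e|f)
        total = defaultdict(float)  # expected counts c(f)

        for es, fs in corpus:
            fs_null = ["NULL"] + list(fs)  # every sentence gets a NULL source word
            for e in es:
                # E-step: posterior that each f (or NULL) generated e.
                z = sum(t[(e, f)] for f in fs_null)
                for f in fs_null:
                    p = t[(e, f)] / z
                    count[(e, f)] += p
                    total[f] += p

        # M-step: re-estimate t(e|f) = c(e|f) / c(f).
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})

    return t
```

A possible usage, on a hypothetical three-sentence corpus: `train_ibm1([(["the", "house"], ["das", "haus"]), (["the", "book"], ["das", "buch"]), (["a", "book"], ["ein", "buch"])])` returns a dictionary keyed by `(english_word, foreign_word)` pairs. Running a fixed number of iterations is a simplification; a convergence check on the corpus log-likelihood would be a natural replacement.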
Convergence
IBM Model 1
▪ Generative model: break up translation process into smaller steps
▪ Simplest possible lexical translation model
▪ Additional assumptions
  ▪ All alignment decisions are independent
  ▪ The alignment distribution for each a_i is uniform over all source words and NULL
IBM Model 1
▪ Translation probability
  ▪ for a foreign sentence f = (f_1, ..., f_{l_f}) of length l_f
  ▪ to an English sentence e = (e_1, ..., e_{l_e}) of length l_e
  ▪ with an alignment of each English word e_j to a foreign word f_i according to the alignment function a : j → i
  ▪ parameter ϵ is a normalization constant
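The equation itself is missing from the transcript. With the symbols defined on this slide, and writing t(e_j | f_i) for the lexical translation probability (a symbol assumed here), the standard IBM Model 1 form is:

```latex
p(e, a \mid f) = \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)})
```

The factor (l_f + 1)^{l_e} reflects the uniform alignment assumption: each of the l_e English words can align to any of the l_f foreign words or to NULL.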
Example
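The worked example from this slide was not preserved. As a stand-in, here is a small Python sketch that evaluates the Model 1 probability of one sentence pair under a given alignment. The function name `ibm1_joint_prob`, the default `epsilon=1.0`, and the toy t-table values are made up for illustration and are not the lecture's numbers.

```python
def ibm1_joint_prob(e, f, a, t, epsilon=1.0):
    """p(e, a | f) under IBM Model 1.

    e, f: lists of English and foreign tokens.
    a: a[j] is the position in ["NULL"] + f that English word j aligns to (0 = NULL).
    t: dict mapping (english_word, foreign_word) to t(e|f).
    """
    f_null = ["NULL"] + list(f)
    # Uniform alignment term: epsilon / (l_f + 1) ** l_e
    prob = epsilon / (len(f_null) ** len(e))
    # One lexical translation probability per English word.
    for j, e_word in enumerate(e):
        prob *= t[(e_word, f_null[a[j]])]
    return prob


# Hypothetical t-table entries, chosen only to make the arithmetic easy to follow.
toy_t = {("the", "la"): 0.5, ("house", "maison"): 0.8}
p = ibm1_joint_prob(["the", "house"], ["la", "maison"], [1, 2], toy_t)
```

Here `p` equals 0.5 × 0.8 / 3², since l_f + 1 = 3 and l_e = 2.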
Evaluating Alignment Models
▪ How do we measure quality of a word-to-word model?
▪ Method 1: use in an end-to-end translation system
  ▪ Hard to measure translation quality
  ▪ Option: human judges
  ▪ Option: reference translations (NIST, BLEU)
  ▪ Option: combinations (HTER)
  ▪ Actually, no one uses word-to-word models alone as TMs
▪ Method 2: measure quality of the alignments produced
  ▪ Easy to measure
  ▪ Hard to know what the gold alignments should be
  ▪ Often does not correlate well with translation quality (like perplexity in LMs)
Alignment Error Rate
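The slide content for this section was not preserved. A minimal sketch of the commonly used definition, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of predicted links, S the sure gold links, and P the possible gold links. The function name and the representation of a link as an (English position, foreign position) pair are assumptions for illustration.

```python
def alignment_error_rate(predicted, sure, possible):
    """AER for one set of predicted alignment links against gold annotations.

    Each argument is an iterable of (english_index, foreign_index) pairs.
    Sure links are treated as a subset of possible links.
    Raises ZeroDivisionError if both predicted and sure are empty.
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # every sure link also counts as possible
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Hypothetical links: one of two sure links is recovered, one predicted link is wrong.
aer = alignment_error_rate({(0, 0), (1, 2)}, {(0, 0), (1, 1)}, set())
```

Lower is better: a prediction that contains all sure links and only possible links scores 0.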
