Unsupervised Word Translation
Kira Selby, University of Waterloo
Can we train a model to translate a language we know nothing about?
Yes we can!
• Near the end of 2017, FAIR (Facebook AI Research) published a model called MUSE (Multilingual Unsupervised and Supervised Embeddings)
• MUSE can learn to translate between languages without any cross-lingual supervision!
• It achieves state-of-the-art accuracy across many language pairs, even approaching or surpassing supervised models!
Word Embeddings
• Word embeddings map every word in a language to a fixed-size vector
• The goal is to position words so that the geometry of the resulting vector space captures the relationships between them
• Most famous example: Word2Vec (Mikolov et al., 2013)
• King – Man + Woman ≈ Queen (see the sketch below)
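As a quick illustration of that analogy arithmetic, here is a minimal Python sketch. It assumes the gensim library and its downloadable pretrained Google News vectors (neither appears on the slide), and the printed similarity score is approximate:

```python
import gensim.downloader as api

# Pretrained Google News word2vec vectors (~1.6 GB download on first use).
wv = api.load("word2vec-google-news-300")

# The classic analogy: vector("king") - vector("man") + vector("woman") ≈ ?
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# -> [('queen', ~0.71)]
```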
MUSE
• We start with a fixed set of word embeddings in each language, typically learned from a large monolingual corpus
• Given target vectors Y and source vectors X, we want to learn a linear mapping W such that Y ≈ XW
• We want to choose W so that the distribution of mapped source vectors matches the distribution of target vectors (a closed-form version of this fit is sketched below)
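When a set of aligned word pairs is available (in MUSE, a synthetic dictionary induced after adversarial training), the best orthogonal W has a closed form, the orthogonal Procrustes solution, which MUSE uses as a refinement step. Here is a minimal numpy sketch; the function name fit_mapping and the toy data are my own:

```python
import numpy as np

def fit_mapping(X, Y):
    """Orthogonal Procrustes: W = argmin ||XW - Y||_F over orthogonal W.

    X: (n, d) source-language embeddings
    Y: (n, d) target-language embeddings
    Rows are assumed to be aligned translation pairs.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy usage: recover a known random rotation from aligned vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 300))
W_true = np.linalg.qr(rng.normal(size=(300, 300)))[0]  # random orthogonal map
Y = X @ W_true
W = fit_mapping(X, Y)
print(np.allclose(W, W_true))  # True (up to numerical precision)
```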
GANs
• MUSE does this by using a GAN (Generative Adversarial Network)
• We train a discriminator to tell whether a given vector comes from the target language or is a mapped source vector, and a generator (the mapping W) to make mapped source vectors indistinguishable from target ones
• The discriminator and the generator are adversaries: each trains to beat the other (a minimal training loop is sketched below)
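A minimal sketch of this adversarial setup in PyTorch. The discriminator size and SGD learning rate loosely follow the paper's setup but are illustrative here, and the batch-sampling code is assumed to exist elsewhere:

```python
import torch
import torch.nn as nn

d = 300                                   # embedding dimension
W = nn.Linear(d, d, bias=False)           # generator: the mapping x -> xW
disc = nn.Sequential(                     # discriminator: P(vector is target)
    nn.Linear(d, 2048), nn.LeakyReLU(0.2),
    nn.Linear(2048, 1), nn.Sigmoid(),
)
opt_w = torch.optim.SGD(W.parameters(), lr=0.1)
opt_d = torch.optim.SGD(disc.parameters(), lr=0.1)
bce = nn.BCELoss()

def train_step(src, tgt):
    """src, tgt: (batch, d) embeddings sampled from each language."""
    # 1) Discriminator: label target vectors 1, mapped source vectors 0.
    #    .detach() keeps this step from updating the mapping W.
    d_loss = bce(disc(tgt), torch.ones(len(tgt), 1)) + \
             bce(disc(W(src).detach()), torch.zeros(len(src), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) Generator: update W so mapped source vectors look like target ones.
    g_loss = bce(disc(W(src)), torch.ones(len(src), 1))
    opt_w.zero_grad(); g_loss.backward(); opt_w.step()
```

The real MUSE implementation adds details omitted here, such as label smoothing, input dropout, and keeping W close to orthogonal.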
MUSE
• MUSE has been incredibly successful, and set a new standard for unsupervised word translation
• Many papers have been published following up on MUSE's techniques, but open problems remain
• One of the most important is improving performance on highly dissimilar language pairs and low-resource languages
• This area could be an excellent opportunity for a research project