Inferring Translation Candidates for Multilingual Dictionary Generation with Multi-Way Neural Machine Translation

Mihael Arcan, Daniel Torregrosa*, Sina Ahmadi* and John P. McCrae

This publication has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289, co-funded by the European Regional Development Fund, and the European Union's Horizon 2020 research and innovation programme under grant agreement No 731015, ELEXIS - European Lexical Infrastructure.
Outline
• Introduction
• Neural machine translation
• Results
• Dictionary data
• Conclusion
Motivation
• Knowledge bases are useful for many applications, but available in few languages
• The creation and curation of knowledge bases is expensive
• Hence, most languages have few or no knowledge bases
• Can we use machine translation to translate knowledge?
Overview
• Multi-way neural machine translation without the targeted direction
• Continuous training with a small curated dictionary
• Discovery of new bilingual dictionary entries
Targeted languages
Portuguese (PT), Galician (GL), Romanian (RO), Spanish (ES), Italian (IT), Catalan (CA), Basque (EU), French (FR), English (EN), Esperanto (EO)
Neural machine translation
Machine translation before 2014
• Rule-based machine translation
  • Humans write the rules
  • Highly customisable
  • High maintenance cost
• Phrase-based statistical machine translation
  • Learns from a parallel corpus
  • Less control over the translations
Word embeddings
• Fixed-size numerical representation for words
• From the one-hot space (one dimension per different word) to the embedding space
• The embedding vector represents the contexts in which the word appears
(see the sketch below)
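A minimal sketch of the two representations, assuming NumPy; the vocabulary and the embedding size are illustrative only, not taken from the presented system:

    import numpy as np

    vocab = ["low", "lower", "big", "bigger"]
    word_to_id = {w: i for i, w in enumerate(vocab)}

    # One-hot: one dimension per word; the only information is the index.
    one_hot = np.eye(len(vocab))[word_to_id["low"]]      # [1., 0., 0., 0.]

    # Embedding: a trainable dense matrix; each row is a word vector whose
    # values end up reflecting the contexts the word appears in.
    rng = np.random.default_rng(0)
    embedding_matrix = rng.normal(size=(len(vocab), 8))  # |V| x d, here d = 8
    low_vector = embedding_matrix[word_to_id["low"]]     # dense vector, size 8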
Long short-term memory
[Diagram of an LSTM cell: input, memory c_t and output, with input, forget and output gates (σ) controlling the flow through multiplications (×)]
Based on tex.stackexchange.com/questions/332747/how-to-draw-a-diagram-of-long-short-term-memory
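For reference, the standard LSTM gate equations behind this diagram (the textbook formulation, not anything specific to this work):

    \begin{aligned}
      i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
      f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
      o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
      c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(memory)} \\
      h_t &= o_t \odot \tanh(c_t) && \text{(output)}
    \end{aligned}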
Bi-directional LSTM
[Diagram: inputs x_2 … x_5 feed a forward LSTM chain and a backward LSTM chain; the two directions are combined into hidden states h_2 … h_5]
Based on github.com/PetarV-/TikZ
Neural machine translation
[Diagram of the translation network, built up step by step over several slides]
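The deck builds the architecture up visually; as a rough stand-in, here is a minimal generic encoder-decoder sketch, assuming PyTorch. The GRU choice, the sizes and the absence of attention are simplifications for illustration, not the configuration used in this work:

    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, dim=256):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, dim)
            self.tgt_emb = nn.Embedding(tgt_vocab, dim)
            self.encoder = nn.GRU(dim, dim, batch_first=True)
            self.decoder = nn.GRU(dim, dim, batch_first=True)
            self.out = nn.Linear(dim, tgt_vocab)

        def forward(self, src_ids, tgt_ids):
            # Encode the source sentence into a fixed-size state.
            _, state = self.encoder(self.src_emb(src_ids))
            # Decode conditioned on that state (teacher forcing).
            dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
            return self.out(dec_out)  # logits over the target vocabulary

    model = Seq2Seq(src_vocab=8000, tgt_vocab=8000)
    logits = model(torch.randint(0, 8000, (1, 7)),   # toy source ids
                   torch.randint(0, 8000, (1, 6)))   # toy target ids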
Subword units
• The one-hot vocabulary space has to be limited due to performance issues
• This generates many out-of-vocabulary entries
• To minimise this effect, we use subword units instead of words
Byte pair encoding
• BPE is a compression technique
• It starts with all the different characters in the corpus
• The most frequent character combination is selected as a BPE operation
• This is repeated until the desired number of BPE operations is reached
• The final size of the vocabulary is the number of BPE operations plus the size of the alphabet
(a sketch of the algorithm follows below)
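A minimal sketch of the merge-learning loop described above, in the spirit of the reference BPE implementation (Sennrich et al.); the toy corpus matches the example on the next slide, though ties between equally frequent pairs are broken arbitrarily here:

    import re
    from collections import Counter

    def get_pair_counts(vocab):
        # Count adjacent symbol pairs across all words, weighted by frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        return pairs

    def merge_pair(pair, vocab):
        # Rewrite every occurrence of the pair as a single merged symbol.
        pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
        return {pattern.sub("".join(pair), w): f for w, f in vocab.items()}

    # Words start as character sequences; "_" marks the word boundary.
    vocab = {"l o w _": 1, "l o w e r _": 1, "b i g _": 1, "b i g g e r _": 1}
    for _ in range(3):                    # number of BPE operations
        pairs = get_pair_counts(vocab)
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        vocab = merge_pair(best, vocab)
        print(best, vocab)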
Byte pair encoding example
Corpus: low lower big bigger
Characters: l o w _  l o w e r _  b i g _  b i g g e r
After merging l+o: lo w _  lo w e r _  b i g _  b i g g e r
After merging b+i: lo w _  lo w e r _  bi g _  bi g g e r
Byte pair encoding II
Conjugation of the Spanish verb beber ('to drink'):
Present: bebo, bebes, bebe, bebemos, bebéis, beben
Preterite: bebí, bebiste, bebió, bebimos, bebisteis, bebieron
Imperfect: bebía, bebías, bebía, bebíamos, bebíais, bebían
Conditional: bebería, beberías, bebería, beberíamos, beberíais, beberían
Future: beberé, beberás, beberá, beberemos, beberéis, beberán
After BPE, the shared stem is split off in every form: beb o, beb es, beb e, beb emos, … beb ería, … beb eré, …
Multi-way model
• The model receives corpora in several different languages, both as source and as target sentences
• Each input sentence is annotated with its source language and the requested target language (see the sketch below)
• In our case: Spanish-English, French-Romanian and Italian-Portuguese
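A minimal sketch of how such annotation is typically done (a Johnson et al.-style language token prepended to the sentence); the exact tag format used in this work is an assumption here:

    def annotate(sentence, src_lang, tgt_lang):
        # Prepend tokens identifying the source and requested target language.
        return f"<src:{src_lang}> <tgt:{tgt_lang}> {sentence}"

    print(annotate("bebo agua", "es", "en"))
    # <src:es> <tgt:en> bebo agua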
Continuous training
• After training, the network is seldom able to produce text in a requested target language other than the ones it was trained to produce
• For example, if asked to translate Spanish into French, it will generate English
• We therefore continue training with a small corpus of sentences in the targeted direction
Dictionary data
We used three different dictionaries to continue training the system:
• The Spanish-French Apertium dictionary (paper)
• Spanish-French, Spanish-Portuguese and French-Portuguese dictionaries generated from Apertium data (task)
  • following a cycle-based approach
  • following a path-based approach
Part of speech
• The NMT models were trained without part-of-speech (POS) data
• To assign POS, we use monolingual dictionaries automatically extracted from Wiktionary
• If
  • the source word is in the source-language dictionary, and
  • the target word is in the target-language dictionary, and
  • they have one or more POS tags in common,
• we generate one entry per shared POS (see the sketch below)
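A minimal sketch of the POS-assignment rule above; the data structures and names are illustrative, not the actual code:

    def assign_pos(src_word, tgt_word, src_dict, tgt_dict):
        """Return one (src, tgt, pos) entry per POS tag the two words share."""
        shared = src_dict.get(src_word, set()) & tgt_dict.get(tgt_word, set())
        return [(src_word, tgt_word, pos) for pos in shared]

    # Toy monolingual dictionaries (word -> POS tags), as if from Wiktionary.
    src_dict = {"fuente": {"noun"}}
    tgt_dict = {"source": {"noun", "verb"}}
    print(assign_pos("fuente", "source", src_dict, tgt_dict))
    # [('fuente', 'source', 'noun')]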
Results
Evaluation
• We used a dictionary automatically extracted from Wiktionary as the gold standard
• For the systems that provide confidence scores, we calculate precision and recall at every possible threshold (a sketch follows below)
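A minimal sketch of the threshold sweep, assuming each candidate entry carries a confidence score; the names and toy data are illustrative:

    def precision_recall_curve(candidates, gold):
        """candidates: list of (entry, confidence); gold: set of entries."""
        points = []
        for _, threshold in candidates:   # each score is a possible threshold
            kept = [e for e, c in candidates if c >= threshold]
            correct = sum(1 for e in kept if e in gold)
            points.append((threshold,
                           correct / len(kept),    # precision
                           correct / len(gold)))   # recall
        return sorted(points)

    candidates = [(("fuente", "source"), 0.9), (("fuente", "muelle"), 0.4)]
    gold = {("fuente", "source")}
    print(precision_recall_curve(candidates, gold))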
Results (paper)
[Two plots: precision (y-axis, 0-1) against number of correct entries (x-axis, 0-12000) for Spanish → French and French → Spanish; systems: Apertium, NMT+Apertium 1, NMT+Apertium 10]
Dictionary data
Graph-based approaches
Basic idea: retrieve translations based on the graph of languages
Two definitions:
• Language graph: the Apertium dictionary graph
• Translation graph: a graph where vertices represent words and edges represent translations into other languages
Cycle-based approach
[Example graph over EN:antique, ES:antiguo, EU:zahar, FR:antique, EN:ancient and EO:antikva]
Apertium translations (black lines) in English (EN), French (FR), Basque (EU) and Esperanto (EO), and discovered possible translations (gray lines) and synonyms (red lines). (sketch below)
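A minimal sketch of the cycle idea, assuming networkx; the toy graph roughly follows the slide's example, but which edges exist in Apertium, and the exact inference rule of the task system, are assumptions here:

    import networkx as nx

    g = nx.Graph()
    g.add_edges_from([("EN:antique", "ES:antiguo"), ("ES:antiguo", "EU:zahar"),
                      ("EU:zahar", "EN:ancient"), ("EN:ancient", "FR:antique"),
                      ("FR:antique", "EN:antique"), ("EN:ancient", "EO:antikva")])

    # Words on a common cycle of existing translations are likely to share a
    # meaning; every non-adjacent pair on such a cycle becomes a candidate:
    # a translation if the languages differ, a synonym if they match.
    for cycle in nx.cycle_basis(g):
        for i, a in enumerate(cycle):
            for b in cycle[i + 2:]:
                if not g.has_edge(a, b):
                    kind = ("synonym" if a.split(":")[0] == b.split(":")[0]
                            else "translation")
                    print(kind, a, b)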
Path-based approach
Traverse all simple paths using pivot-oriented inference
[Example translation graph across English (en), Basque (eu), Spanish (es), French (fr), Esperanto (eo), Catalan (ca) and Portuguese (pt), containing source, spring, fuente, muelle, primavera, origen, iturri, udaberri, malguki, font, fonte, fonto, origem, brollador, printemps, printempo, primavero]
(Task) Weight translations with respect to frequency and path length (sketch below)
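A minimal sketch of pivot-oriented path inference, assuming networkx; the scoring shown (more paths and shorter paths score higher) is a simplified stand-in for the weighting used in the task system, and the toy graph is illustrative:

    import networkx as nx
    from collections import Counter

    g = nx.Graph()
    g.add_edges_from([("es:fuente", "en:source"), ("en:source", "fr:source"),
                      ("es:fuente", "ca:font"), ("ca:font", "fr:source"),
                      ("es:fuente", "eo:fonto"), ("eo:fonto", "pt:fonte")])

    def candidates(graph, word, tgt_lang, cutoff=4):
        # Score every target-language word by the simple paths reaching it.
        scores = Counter()
        targets = [n for n in graph.nodes if n.startswith(tgt_lang + ":")]
        for t in targets:
            for path in nx.all_simple_paths(graph, word, t, cutoff=cutoff):
                scores[t] += 1.0 / len(path)   # frequent, short paths win
        return scores.most_common()

    print(candidates(g, "es:fuente", "fr"))    # e.g. [('fr:source', 0.67)]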
Results (task, Wiktionary reference)
[Six plots: precision (y-axis, 0-1) against number of correct entries (x-axis, 0-12000) for English → French, French → English, English → Portuguese, Portuguese → English, Portuguese → French and French → Portuguese; systems: Cycle, Path, NMT-Cycle, NMT-Path]
Conclusion
Conclusion
• We used neural machine translation together with
  • existing bilingual knowledge (paper), and
  • discovered bilingual knowledge (task)
  to generate new dictionaries.