Simultaneous Translation: Recent Advances and Remaining Challenges
Liang Huang, Baidu Research (USA) and Oregon State University

Consecutive vs. Simultaneous Interpretation
• consecutive interpretation: multiplicative latency (x2)
• simultaneous interpretation: additive latency (+3 secs)
• simultaneous interpretation is extremely difficult
• only ~3,000 qualified simultaneous interpreters world-wide (AIIC)
• each interpreter can only sustain it for at most 15-20 minutes
• the best interpreters can only cover ~60% of the source material
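The difference between multiplicative and additive latency can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the factor of 2 and the 3-second lag are taken from the slide as rough figures, not measured values.

```python
def consecutive_finish_time(speech_secs: float, factor: float = 2.0) -> float:
    """Elapsed time when the interpreter speaks only after the speaker.

    Assumption: the interpretation takes about as long as the speech,
    so the total is roughly factor * speech length (x2 on the slide).
    """
    return factor * speech_secs


def simultaneous_finish_time(speech_secs: float, lag_secs: float = 3.0) -> float:
    """Elapsed time when the interpreter trails the speaker by a fixed lag.

    Assumption: a constant lag of about 3 seconds, as quoted on the slide.
    """
    return speech_secs + lag_secs


if __name__ == "__main__":
    for minutes in (1, 10, 60):
        secs = minutes * 60
        print(
            f"{minutes:>2} min speech: "
            f"consecutive ~{consecutive_finish_time(secs):.0f}s, "
            f"simultaneous ~{simultaneous_finish_time(secs):.0f}s"
        )
```

The overhead of consecutive interpretation grows with the length of the speech, while the overhead of simultaneous interpretation stays constant.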

Simultaneous Interpreters: Strategies & Limitations
• anticipation, summarization, generalization, etc.
• and they inevitably make (quite a bit of) mistakes
• “human-level” quality: much lower than normal translation
• “human-level” latency: very short, 2~4 secs (actually, higher latency hurts quality)
[Figure: interpreter latency examples from the United Nations Proceedings Speech Corpus (LDC2014S08, Chay et al, 2014)]

Tradeoff between Latency and Quality
[Figure: translation quality (low to high) plotted against latency (low, ~3 seconds, to high, 1 sentence). Labeled points: written translation, consecutive interpretation, full-sentence machine translation, simultaneous interpretation, word-by-word translation, and previous work in simultaneous translation.]
• full-sentence machine translation: seq-to-seq is already very good
• simultaneous translation: one of AI’s holy grails; needs fundamentally new ideas!
• pipeline: source speech stream → streaming speech recognition → source text stream → simultaneous text-to-text translation → target text stream → incremental text-to-speech → target speech stream

Outline
• Background on Simultaneous Interpretation
• Part I: Our Breakthrough in 2018
  • Prefix-to-Prefix Framework, Integrated Anticipation, Controllable Latency
  • New Latency Metric
  • Demos and Examples
• Part II: Towards Flexible (Adaptive) Translation Policies
• Part III: Remaining Challenges

Our Breakthrough in 2018
• Baidu World Conference, Nov. 2017: full-sentence translation (latency: 10+ secs)
• Baidu World Conference, Nov. 2018: low-latency simultaneous translation (latency: ~3 secs), our work
• request (Ken Church): “I really need low-latency simultaneous translation!”
• Haifeng Wang, Zhongjun He, Hao Xiong, Mingbo Ma, Kaibo Liu, Renjie Zheng

Main Challenge: Word Order Difference
• e.g. translate from Subj-Obj-Verb (Japanese, German) to Subj-Verb-Obj (English)
• German is underlyingly SOV, and Chinese is a mix of SVO and SOV
• human simultaneous interpreters routinely “anticipate” (e.g., predicting German verb) (Grissom et al, 2014)
• example: President Bush meets with Russian President Putin in Moscow
• non-anticipative: President Bush ( …… waiting …… ) meets with Russian …
• anticipative: President Bush meets with Russian President Putin in Moscow

Previous Solutions
• industrial systems
  • almost all “real-time” translation systems use full-sentence translation
  • some systems “repeatedly retranslate”, but constantly changing translations is annoying to the users and can’t be used for speech-to-speech translation
• academic papers (just to sample a few)
  • explicit prediction of German verbs (Grissom et al, 2014)
  • reinforcement learning (Gu et al, 2017) to decide READ or WRITE
  • segment-based (Bangalore et al, 2012; Fujita et al, 2013; Oda et al, 2014)
• these efforts (a) use a full-sentence translation model; (b) can’t ensure a given latency

Our Idea: Prefix-to-Prefix, not Seq-to-Seq
• standard seq-to-seq is only suitable for conventional full-sentence MT: $p(y_i \mid x_1 \dots x_n, y_1 \dots y_{i-1})$ (the target waits for the whole source sentence)
• we propose a prefix-to-prefix framework tailored to tasks with simultaneity
• special case: wait-k policy: translation is always k words behind the source sentence: $p(y_i \mid x_1 \dots x_{i+k-1}, y_1 \dots y_{i-1})$ (the target waits k words)
• decoding this way => controllable latency
• training this way => implicit anticipation on the target-side
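The wait-k schedule can be sketched as a simple READ/WRITE loop: before emitting target word i, the decoder makes sure it has read min(i+k-1, n) source words. This is a minimal illustration under stated assumptions, not the authors' implementation: `wait_k_decode`, `predict_next`, and `toy_model` are hypothetical names, and the toy model merely upper-cases source words in order to stand in for a real prefix-to-prefix translation model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

EOS = "</s>"  # hypothetical end-of-sentence marker


def wait_k_decode(
    source: Sequence[str],
    k: int,
    predict_next: Callable[[Sequence[str], Sequence[str]], str],
    max_len: Optional[int] = None,
) -> Tuple[List[str], List[str]]:
    """Run a wait-k schedule over a source sentence.

    predict_next(source_prefix, target_prefix) stands in for a model of
    p(y_i | x_1 .. x_{i+k-1}, y_1 .. y_{i-1}); it returns the next word or EOS.
    Returns the target words and the READ/WRITE action sequence.
    """
    n = len(source)
    limit = max_len if max_len is not None else 2 * n + 10  # safety bound
    target: List[str] = []
    actions: List[str] = []
    read = 0
    while len(target) < limit:
        i = len(target) + 1              # 1-indexed position of next target word
        needed = min(i + k - 1, n)       # source words visible at step i
        while read < needed:
            read += 1
            actions.append("READ")
        word = predict_next(source[:read], target)
        if word == EOS:
            break
        target.append(word)
        actions.append("WRITE")
    return target, actions


def toy_model(source_prefix: Sequence[str], target_prefix: Sequence[str]) -> str:
    """Toy monotone 'translator': upper-cases the next source word."""
    j = len(target_prefix)
    return source_prefix[j].upper() if j < len(source_prefix) else EOS


if __name__ == "__main__":
    words, acts = wait_k_decode("a b c d e".split(), k=2, predict_next=toy_model)
    print(words)
    print(" ".join(acts))
```

With k=2 the loop reads two words, then alternates WRITE and READ until the source is exhausted, and finally writes the remaining target words without further reading. Because the lag is fixed by k, latency is controlled by construction; a real model trained on such prefixes would have to anticipate words it has not yet seen.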
