Simultaneous Speech Translation
Graham Neubig, Nara Institute of Science and Technology (NAIST)
5/18/2015
Joint Work With: Satoshi Nakamura, Tomoki Toda, Sakriani Sakti, Tomoki Fujita, Hiroaki Shimizu, Yusuke Oda, Takashi Mieno
Background
Speech Translation Systems
● Translate speech from source language to target
Pipeline: ASR → こんにちは、駅はどこですか? → MT → Hello, where is the station? → TTS
Problem: Delay
● Wait for the whole utterance to end before translating
Pipeline: ASR → こんにちは、駅はどこですか? → MT → Hello, where is the station? → TTS
Delay: the whole utterance
Solution: Divide into Smaller Chunks
● Choose appropriate timing to start translation
Pipeline: ASR → こんにちは、駅はどこですか? → MT, MT, MT → Hello, the station where is it? → TTS, TTS, TTS
Delay: Reduced
Four Problems
● Segmentation: When do we start translating?
● Prediction: Can we predict things that haven't been said?
● Data: Can we learn something from actual simultaneous interpreters?
● Evaluation: How do we decide which results are better?
1) Sentence Segmentation for Simultaneous Speech Translation
Previous Work: Incremental Dependency Parsing/Manual Rules [Ryu+ 04]
● Utilize knowledge of English/Japanese to derive rules
Example: I went to the park with your brother (dependency labels: subj, prep, prep)
Translate after the first prepositional phrase completes!
Output (two MT calls): あなたの弟と私は公園に行きました
● - Requires a bilingual linguist to design rules
● - Requires an accurate incremental dependency parser
Previous Work: Division on Pauses [Fugen+ 08, Bangalore+ 12]
● Simply divide on short pauses in the utterance
ASR: hello where is the station
● - Cannot capture relationship between languages
● - Result will greatly change with speech speed, disfluencies
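A minimal sketch of pause-based division, assuming the ASR output carries per-word start and end times in seconds. The function name `split_on_pauses`, the `min_pause` threshold, and the timings are hypothetical and chosen only for illustration; they are not taken from the cited systems.

```python
def split_on_pauses(timed_words, min_pause=0.3):
    """Split (word, start, end) triples wherever the silence between
    two consecutive words is at least min_pause seconds."""
    segments = []
    current = []
    prev_end = None
    for word, start, end in timed_words:
        # A long enough gap since the previous word closes the segment.
        if current and start - prev_end >= min_pause:
            segments.append(current)
            current = []
        current.append(word)
        prev_end = end
    if current:
        segments.append(current)
    return segments


# Illustrative timings (assumed): a 0.5 s pause after "hello",
# then gaps of about 0.05 s between the remaining words.
utterance = [
    ("hello", 0.0, 0.4),
    ("where", 0.9, 1.1),
    ("is", 1.15, 1.3),
    ("the", 1.35, 1.45),
    ("station", 1.5, 2.0),
]

print(split_on_pauses(utterance, min_pause=0.3))
print(split_on_pauses(utterance, min_pause=0.04))
```

With the larger threshold only the pause after "hello" produces a boundary; with the smaller one every word becomes its own segment, which illustrates how sensitive this approach is to the speaker's pace.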
Previous Work: Division on Predicted Commas [Sridhar+ 13]
● Guess where commas would appear in the text
Input: hello where is the ...
Classifier after “hello”: comma! → translate
Classifier after “where”: no comma → wait
Classifier after “is”: no comma → wait
● + Simple, and surprisingly effective
● - No parameter to adjust the granularity
● - Can't capture features of the target language
Considering Reordering Probabilities in Sentence Segmentation [Fujita et al., Interspeech 2013]
Phrase Based Machine Translation
● Divide the sentence into small phrases and translate
Input: Today I will give a lecture on machine translation .
Phrases: Today | I will give | a lecture on | machine translation | .
Translated phrases: 今日は、 | を行います | の講義 | 機械翻訳 | 。
Reordered phrases: Today | machine translation | a lecture on | I will give | .
Reordered translation: 今日は、 | 機械翻訳 | の講義 | を行います | 。
Output: 今日は、機械翻訳の講義を行います。
● Score translations with translation model (TM), reordering model (RM), and language model (LM)
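The scoring step can be illustrated with a log-linear combination, which is the usual way phrase-based systems combine model scores. The slide does not give the formula, so treat this as an assumption. The function name `loglinear_score`, the weights, and all probabilities below are made up for illustration.

```python
import math


def loglinear_score(probs, weights):
    """Weighted sum of log-probabilities from each model (TM, RM, LM)."""
    return sum(weights[name] * math.log(p) for name, p in probs.items())


# Hypothetical weights; in practice these would be tuned on held-out data.
weights = {"tm": 1.0, "rm": 1.0, "lm": 1.0}

# Illustrative (invented) model probabilities for two candidate outputs:
# phrases kept in source order vs. the reordered, fluent Japanese sentence.
source_order = {"tm": 0.2, "rm": 0.5, "lm": 0.001}
reordered = {"tm": 0.2, "rm": 0.3, "lm": 0.05}

print(loglinear_score(source_order, weights))
print(loglinear_score(reordered, weights))
```

In this toy setting the reordered candidate pays a small reordering penalty but wins overall because the language model strongly prefers it.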
Translation Model Creation
● Perform automatic alignment of bitext
● From aligned text, extract phrases for translation
Aligned example: ホテルの受付 / the hotel front desk
ホテルの → hotel
ホテルの → the hotel
受付 → front desk
ホテルの受付 → hotel front desk
ホテルの受付 → the hotel front desk
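A minimal sketch of extracting phrase pairs that are consistent with a word alignment, using brute-force enumeration of spans. The tokenization (ホテルの, 受付), the alignment links, and the name `extract_phrases` are assumptions made for illustration; the slide shows only the resulting phrase pairs. A pair is kept when it contains at least one link and no link crosses its boundary.

```python
def extract_phrases(src, tgt, links):
    """Return all (source phrase, target phrase) pairs consistent with links.

    links is a set of (source index, target index) alignment points.
    """
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, len(src)):
            for j1 in range(len(tgt)):
                for j2 in range(j1, len(tgt)):
                    # Links with both ends inside the candidate pair.
                    inside = [(i, j) for i, j in links
                              if i1 <= i <= i2 and j1 <= j <= j2]
                    # Links with exactly one end inside: these break consistency.
                    crossing = [(i, j) for i, j in links
                                if (i1 <= i <= i2) != (j1 <= j <= j2)]
                    if inside and not crossing:
                        pairs.append(("".join(src[i1:i2 + 1]),
                                      " ".join(tgt[j1:j2 + 1])))
    return pairs


src = ["ホテルの", "受付"]
tgt = ["the", "hotel", "front", "desk"]
# Assumed alignment: ホテルの-hotel, 受付-front, 受付-desk; "the" is unaligned.
links = {(0, 1), (1, 2), (1, 3)}

for source_phrase, target_phrase in extract_phrases(src, tgt, links):
    print(source_phrase, "→", target_phrase)
```

Because "the" is unaligned, it may be attached to a neighboring phrase or left out, which is why both "hotel" and "the hotel" appear as translations of ホテルの.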
Lexicalized Reordering Model
● Probabilistically models reorderings for increased accuracy of translation
● Given current phrase and next phrase:
Monotone: 背の高い男 / the tall man
Swap: 太郎を訪問した / visited Taro
Discontinuous Right: 私は太郎を訪問した / I visited Taro
Discontinuous Left: 背の高い男を訪問した / visited the tall man
● “monotone” + “discontinuous right” = “right probability”
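The last bullet can be sketched as a small helper that sums two of the four orientation probabilities. The dictionary layout, the key names, the function name `right_probability`, and the numbers are hypothetical; only the sum itself comes from the slide.

```python
def right_probability(orientation_probs):
    """Right probability = P(monotone) + P(discontinuous right)."""
    return (orientation_probs["monotone"]
            + orientation_probs["discontinuous_right"])


# Illustrative (invented) orientation distribution for one phrase pair.
probs = {
    "monotone": 0.5,
    "swap": 0.125,
    "discontinuous_right": 0.25,
    "discontinuous_left": 0.125,
}

print(right_probability(probs))  # 0.75
```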
Method One: Choosing Translation Timing with Phrases
● Input words one at a time from ASR
● While words exist in phrase table, don't translate yet
Phrase Table:
hello → こんにちは
where is → どこですか
the station → 駅
where → どこ
the → その
Input String: hello where is the station
“hello”: phrase exists → wait
“hello where”: phrase missing → translate “hello”
“where is”: phrase exists → wait
“where is the”: phrase missing → translate “where is”
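The walk-through above can be sketched as a short loop. This is a minimal sketch, assuming that "exists in phrase table" means an exact match against a source phrase and that any words still buffered when the input ends are emitted as the final segment. The function name `segment_by_phrase_table` is hypothetical.

```python
def segment_by_phrase_table(words, phrase_table):
    """Split a word stream into segments that are ready to translate.

    Keep extending the buffer while buffer + new word is a known source
    phrase; otherwise emit the buffer and restart from the new word.
    """
    segments = []
    buffer = []
    for word in words:
        candidate = buffer + [word]
        if not buffer or " ".join(candidate) in phrase_table:
            buffer = candidate                  # phrase exists: wait
        else:
            segments.append(" ".join(buffer))   # phrase missing: translate
            buffer = [word]
    if buffer:
        segments.append(" ".join(buffer))       # end of input: flush the rest
    return segments


# Phrase table from the slide (source phrase → translation).
phrase_table = {
    "hello": "こんにちは",
    "where is": "どこですか",
    "the station": "駅",
    "where": "どこ",
    "the": "その",
}

segments = segment_by_phrase_table("hello where is the station".split(),
                                   phrase_table)
print(segments)  # ['hello', 'where is', 'the station']
print([phrase_table[s] for s in segments])
```

The first two segments match the decisions shown on the slide; the last segment, "the station", follows from the end-of-input assumption stated above.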