Document Context Neural Machine Translation with Memory Networks
Sameen Maruf, Gholamreza Haffari
Faculty of Information Technology, Monash University
July 17, 2017
Overview
1. Introduction
2. Document MT as Structured Prediction
3. Document NMT with MemNets
4. Experiments and Analysis
5. Conclusion
6. References
1. Introduction
Why document-level machine translation?
- Most MT models translate sentences independently.
- Discourse phenomena are ignored, e.g. pronominal anaphora and lexical consistency, which may involve long-range dependencies.
- Statistical MT approaches to document-level MT have not yielded significant empirical improvements [Hardmeier and Federico, 2010, Gong et al., 2011, Garcia et al., 2014].
- Previous context-aware NMT models use only local context and report deteriorated performance when using target-side context [Jean et al., 2017, Wang et al., 2017, Bawden et al., 2018].
- We incorporate global source and target document contexts.
2. Document MT as Structured Prediction
Two types of factors: $f_\theta(y_t; x_t, x_{-t})$ and $g_\theta(y_t; y_{-t})$, where $x_{-t}$ and $y_{-t}$ denote all source and target sentences of the document other than the $t$-th.
Training objective:
Maximise $P(y_1, \dots, y_{|d|} \mid x_1, \dots, x_{|d|})$
$\Longrightarrow$ Maximise the pseudo-likelihood
$$\arg\max_{\theta} \prod_{t=1}^{|d|} P_\theta(y_t \mid x_t, y_{-t}, x_{-t}) \qquad (1)$$
where $f_\theta$ and $g_\theta$ are subsumed in $P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$.
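As a minimal sketch of objective (1): in log space, the pseudo-likelihood of one document is the sum, over its sentences, of the log conditional probability of each translation given its own source sentence and the rest of the document. The callable `cond_log_prob` below is a hypothetical stand-in for $\log P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$; it is not the authors' model, and the function names are chosen for illustration only.

```python
from typing import Callable, Sequence

# Hypothetical signature (an assumption, not from the slides):
# (t, sources, targets) -> log P(y_t | x_t, y_-t, x_-t)
CondLogProb = Callable[[int, Sequence[str], Sequence[str]], float]


def log_pseudo_likelihood(
    sources: Sequence[str],
    targets: Sequence[str],
    cond_log_prob: CondLogProb,
) -> float:
    """Sum of per-sentence conditional log-probabilities for one document."""
    if len(sources) != len(targets):
        raise ValueError("a document needs one target sentence per source sentence")
    # The product over t in objective (1) becomes a sum of logs.
    return sum(cond_log_prob(t, sources, targets) for t in range(len(sources)))
```

Training would then adjust the parameters inside `cond_log_prob` to increase this quantity; the scorer receives the whole document so that it can condition on both $x_{-t}$ and $y_{-t}$.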
Challenge: At test time, the target document is not given.
Coordinate Ascent (i.e., Iterative Decoding)
Iterative Decoding (figure)
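The coordinate-ascent idea can be sketched as follows: because the target document is unknown at test time, start from some initial translations, then repeatedly re-decode each sentence while holding the other translations fixed. This is a sketch under stated assumptions, not the authors' decoder: `init_decode` (here imagined as a sentence-level translator), `context_decode`, the stopping rule, and the `max_passes` budget are all hypothetical.

```python
from typing import Callable, List, Sequence

# Hypothetical callables (assumptions for illustration):
InitDecode = Callable[[str], str]  # x_t -> initial y_t, no document context
ContextDecode = Callable[[int, Sequence[str], Sequence[str]], str]
# (t, sources, current targets) -> new y_t given x_t, x_-t and the current y_-t


def iterative_decode(
    sources: Sequence[str],
    init_decode: InitDecode,
    context_decode: ContextDecode,
    max_passes: int = 3,
) -> List[str]:
    """Coordinate ascent over the sentences of one document."""
    # No target context exists yet, so translate each sentence on its own.
    targets = [init_decode(x) for x in sources]
    for _ in range(max_passes):
        changed = False
        for t in range(len(sources)):
            # Update one coordinate: re-decode sentence t, others held fixed.
            new_y = context_decode(t, sources, targets)
            if new_y != targets[t]:
                targets[t] = new_y
                changed = True
        if not changed:
            break  # a full pass changed nothing, so further passes cannot either
    return targets
```

Stopping when a full pass leaves every translation unchanged assumes that `context_decode` is deterministic.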
3. Document NMT with MemNets
Document NMT with MemNets (figure) $\Longrightarrow P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$
Memory-to-Context:
$$s_{t,j} = \mathrm{GRU}(s_{t,j-1}, E_T[y_{t,j-1}], c_{t,j}, c^{\mathrm{src}}_t, c^{\mathrm{trg}}_t)$$
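A minimal sketch of one Memory-to-Context decoder step, in pure Python. The slide only lists the arguments of the GRU, so the way they are combined here is an assumption: the previous target embedding, the attention context, and the two document contexts are concatenated into a single input vector for a standard GRU cell. Biases are omitted, weights are random, and all names (`GRUCell`, `memory_to_context_step`) are hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Standard GRU update without biases (randomly initialised toy weights)."""

    def __init__(self, input_size: int, hidden_size: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # One input matrix W and one recurrent matrix U per gate: z, r, h.
        self.W = {g: random_matrix(hidden_size, input_size, rng) for g in "zrh"}
        self.U = {g: random_matrix(hidden_size, hidden_size, rng) for g in "zrh"}

    def step(self, h_prev: Vector, x: Vector) -> Vector:
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h_prev))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h_prev))]
        gated = [ri * hi for ri, hi in zip(r, h_prev)]
        h_new = [math.tanh(a + b) for a, b in
                 zip(matvec(self.W["h"], x), matvec(self.U["h"], gated))]
        # Interpolate between the old state and the candidate state.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_new)]


def memory_to_context_step(cell: GRUCell, s_prev: Vector, y_prev_emb: Vector,
                           c_attn: Vector, c_src: Vector, c_trg: Vector) -> Vector:
    """s_{t,j} from s_{t,j-1}, E_T[y_{t,j-1}], c_{t,j}, c^src_t and c^trg_t."""
    x = y_prev_emb + c_attn + c_src + c_trg  # list concatenation (assumption)
    return cell.step(s_prev, x)
```

Note that `c_src` and `c_trg` carry only the sentence index $t$, so in this sketch the same two vectors are reused at every decoding step $j$ of a sentence.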
Memory-to-Output:
$$y_{t,j} \sim \mathrm{softmax}(W_y \cdot r_{t,j} + W_{ym} \cdot c^{\mathrm{src}}_t + W_{yt} \cdot c^{\mathrm{trg}}_t + b_y)$$
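The Memory-to-Output formula can be sketched directly in pure Python: the two document contexts each contribute an extra linear term to the output logits before the softmax. The function name `memory_to_output`, the list-based matrices, and the tiny dimensions in the usage note are assumptions for illustration; how $r_{t,j}$, $c^{\mathrm{src}}_t$, and $c^{\mathrm{trg}}_t$ are computed is outside this sketch.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(logits: Vector) -> Vector:
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def memory_to_output(W_y: Matrix, r: Vector, W_ym: Matrix, c_src: Vector,
                     W_yt: Matrix, c_trg: Vector, b_y: Vector) -> Vector:
    """Distribution over the target vocabulary for y_{t,j}."""
    logits = [a + b + c + d for a, b, c, d in zip(
        matvec(W_y, r),       # usual decoder term
        matvec(W_ym, c_src),  # source document context
        matvec(W_yt, c_trg),  # target document context
        b_y,
    )]
    return softmax(logits)
```

For example, with a two-word vocabulary, zero decoder and source terms, `W_yt = [[2.0], [0.0]]`, and `c_trg = [1.0]`, the target context alone shifts probability mass toward the first word.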