Parser Self-Training for Syntax-Based Machine Translation
Makoto Morishita, Koichi Akabe, Yuto Hatakoshi, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura
Augmented Human Communication Laboratory, Nara Institute of Science and Technology
IWSLT 2015, 2015/12/03
Background
Phrase-Based Machine Translation [Koehn et al., 2003]
(Figure: "John hit a ball" translated to 「ジョンは ボールを 打った」 via a translation model and a reordering model)
๏ Translate and reorder by phrases.
- Easy to learn the translation model.
- Low translation accuracy on language pairs with different word order.
Makoto Morishita, AHC Lab, NAIST
Tree-to-String Machine Translation [Liu et al., 2006]
(Figure: English parse tree of "John hit a ball" with a translation rule mapping it to 「x0:NP0 は x1:NP1 を 打った」)
๏ Use the source language parse tree in translation.
- High translation accuracy on language pairs with different word order.
- Translation accuracy is greatly affected by parser accuracy.
Forest-to-String Machine Translation [Mi et al., 2008]
(Figure: source language parse forest → Forest-to-String decoder → target language sentence)
๏ Use the source language parse forest in translation.
- The decoder can choose, from the parse tree candidates, the parse tree with a high translation probability [Zhang et al., 2012].
Parser Self-Training [McClosky et al., 2006]
(Figure: input sentence → parser → parse tree, which is fed back as parser training data)
๏ Use the parser output as training data.
๏ Improves parser accuracy.
- The parser is adapted to the target domain.
Targeted Self-Training for Preordering [Katz-Brown et al., 2011]
(Figure: input sentence → parser → candidate parse trees; trees that score highly against correct preordering data are used as training data)
๏ Selecting the parse trees makes self-training more effective (Targeted Self-Training).
- Use only highly scored parse trees.
- However, this method requires hand-aligned data.
- Hand-aligned data is costly to create.
Proposed Method
Proposed Method
(Figure: input sentence → parser → parse forest → Forest-to-String decoder → translated sentence and the parse tree used in translation; highly scored parse trees, evaluated with MT automatic evaluation metrics, are used as parser training data)
๏ Targeted Self-Training using MT automatic evaluation metrics.
- Low-cost and accurate evaluation.
Selection Methods
๏ Parse tree selection
- Select one high-scored parse tree from the several candidates for a single sentence.
๏ Sentence selection
- Select the sentences to use from the entire corpus.
Parse Tree Selection
๏ Parser 1-best
- Use the parser's 1-best tree.
- Traditional self-training [McClosky et al., 2006].
๏ Decoder 1-best
- Use the parse tree used in translation.
๏ Evaluation 1-best
- Among the translation candidates, use the parse tree used in the highest-scored translation.
Decoder 1-best
(Figure: input sentence → parser → parse forest → Forest-to-String decoder → translated sentence and the parse tree used in translation)
๏ Use the parse tree used in translation.
Evaluation 1-best
(Figure: input sentence → parser → parse forest → Forest-to-String decoder → translation candidates and their parse trees → automatic evaluation → highest-scored translation and its parse tree)
๏ Among the translation candidates, use the parse tree used in the highest-scored translation.
- This highest-scored translation is called the oracle translation.
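The evaluation 1-best selection above can be sketched as follows. This is only an illustrative sketch: the unigram-precision scorer is a toy stand-in for a real sentence-level metric such as BLEU+1, and the candidate/parse-tree pairs are hypothetical.

```python
# Sketch of "Evaluation 1-best" parse tree selection: among the decoder's
# translation candidates, keep the parse tree whose translation scores
# highest against the reference (the "oracle translation").

def unigram_precision(hypothesis, reference):
    # Toy stand-in for a real sentence-level metric such as BLEU+1.
    hyp, ref = hypothesis.split(), set(reference.split())
    return sum(w in ref for w in hyp) / max(len(hyp), 1)

def evaluation_1best(candidates, reference, score=unigram_precision):
    """candidates: list of (translation, parse_tree) pairs from the decoder.
    Returns the (translation, parse_tree) pair of the oracle translation."""
    return max(candidates, key=lambda pair: score(pair[0], reference))
```

For example, given the candidates `[("the man hit ball", t1), ("john hit a ball", t2)]` and the reference "john hit a ball", the second pair wins, and its parse tree t2 would be added to the parser's training data.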
Sentence Selection
๏ Random
- Select sentences randomly from the corpus.
- Traditional self-training.
๏ Threshold of the evaluation score
- Use sentences that score over the threshold.
๏ Gain of the evaluation score
- Use sentences that have a large gain in score between the decoder 1-best and the oracle translation.
Threshold of the Evaluation Score
(Figure: if the oracle translation's score ≥ threshold, use the translation and its parse tree; otherwise, do not use them)
๏ Use sentences that score over the threshold.
Gain of the Evaluation Score
(Figure: if the gain between the decoder 1-best translation and the oracle translation is large, use the oracle translation and its parse tree; if the gain is small, do not use them)
๏ Use sentences that have a large gain in score between the decoder 1-best and the oracle translation.
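The two score-based sentence selection rules can be sketched as below. The dictionary fields and cutoff values are illustrative assumptions, not the exact implementation; each corpus entry is assumed to carry the oracle score and the decoder 1-best score computed with an automatic metric such as BLEU+1.

```python
# Sketch of the two score-based sentence selection rules.

def select_by_threshold(corpus, threshold):
    # Keep sentences whose oracle translation scores at or above the threshold.
    return [s for s in corpus if s["oracle"] >= threshold]

def select_by_gain(corpus, min_gain):
    # Keep sentences where the oracle translation beats the decoder 1-best
    # by a large margin, i.e. where a better parse clearly helps translation.
    return [s for s in corpus if s["oracle"] - s["decoder_1best"] >= min_gain]
```

Both rules aim at the same goal: discard sentences whose parse trees are unlikely to be useful as parser training data, either because even the oracle translation is poor (threshold) or because the parse tree makes no difference to translation quality (gain).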
Experiments
Experimental Setup (for Self-Training)
- Parser: Egret (existing model trained on the Japanese Dependency Corpus, 7k sentences)
- Decoder: Forest-to-String (Travatar)
- Evaluation metric: BLEU+1
- Self-training data: parallel corpus (ASPEC, 2.0M sentence pairs)
Experimental Setup (for Evaluation)
- Decoder training: parallel corpus (ASPEC, 2.0M sentence pairs)
- Decoder dev/test: parallel corpus (ASPEC, dev: 2k, test: 2k)
- Compare the Forest-to-String decoder (Travatar) using the existing parser (Egret) against the self-trained parser.
- In these experiments, we focused on Japanese-English and Japanese-Chinese translation.
Experiment Results (Japanese-English Translation)

Tree Selection   Sentence Selection   Sentences (k)   BLEU    RIBES
Baseline         -                    -               23.83   72.27
Parser 1-best    Random               96              23.66   71.77
Decoder 1-best   Random               97              23.81   72.04
Oracle           Random               97              23.93   72.09
Oracle Translation Score Distribution
(Figure: histogram of sentences by oracle translation BLEU+1 score, 0.0 to 0.9; y-axis up to 25k sentences)
• The corpus contains a lot of noisy sentences.
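BLEU+1, the sentence-level metric behind these scores, smooths the n-gram precisions so that a single sentence with no 4-gram match does not collapse to zero. A minimal sketch of one common add-one formulation follows; implementations differ in exactly which counts are smoothed, so treat this as illustrative rather than the exact scorer used in the experiments.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    # Multiset of n-grams occurring in the token sequence.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu_plus_1(hypothesis, reference, max_n=4):
    """Sentence-level BLEU with add-one smoothed n-gram precisions
    and a brevity penalty (one common BLEU+1 formulation)."""
    hyp, ref = hypothesis.split(), reference.split()
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = ngram_counts(hyp, n)
        ref_ngrams = ngram_counts(ref, n)
        matched = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(len(hyp) - n + 1, 0)
        # Add-one smoothing keeps the log defined when nothing matches.
        log_precision_sum += math.log((matched + 1) / (total + 1)) / max_n
    brevity_penalty = min(1.0, math.exp(1 - len(ref) / max(len(hyp), 1)))
    return brevity_penalty * math.exp(log_precision_sum)
```

An identical hypothesis and reference score 1.0, while partial overlaps score strictly between 0 and 1, which is what makes the score usable for ranking oracle translations sentence by sentence.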
Experiment Results (Japanese-English Translation)

Tree Selection   Sentence Selection   Sentences (k)   BLEU      RIBES
Baseline         -                    -               23.83     72.27
Parser 1-best    Random               96              23.66     71.77
Decoder 1-best   Random               97              23.81     72.04
Oracle           Random               97              23.93     72.09
Oracle           BLEU+1 Threshold     120             24.26**   72.38
Oracle           BLEU+1 Gain          100             24.22*    72.32

* : p < 0.05, ** : p < 0.01

๏ By self-training, the accuracy significantly improved.
Manual Evaluation (score range: 1 to 5)

Tree Selection   Sentence Selection   Score   Significant vs. Baseline   Significant vs. Parser 1-best
Baseline         -                    2.38    -                          -
Parser 1-best    Random               2.42    No                         -
Oracle           BLEU+1 Threshold     2.50    Yes (99% level)            Yes (90% level)

๏ We could verify that our method is effective.
Example of an Improvement

Source:       C投与群ではRの活動を240分にわたって明らかに増強した
Reference:    in the C-administered group, thermal reaction clearly increased the activity of R for 240 minutes.
Baseline:     for 240 minutes clearly enhanced the activity of C administration group R.
Self-Trained: for 240 minutes clearly enhanced the activity of R in the C-administration group.
Before Self-Training
(Figure: parse tree of the source phrase 「C 投与 群 で は R の 活動 を」, glossed "C administered group in-TOP R of activity-OBJ", produced before self-training)
After Self-Training
(Figure: parse tree of the same phrase produced after self-training, with a corrected phrase structure)
Experiment Results (Japanese-Chinese Translation)

Tree Selection   Sentence Selection   Sentences (k)   BLEU    RIBES
Baseline         -                    -               29.60   81.32
Parser 1-best    Random               129             29.75   81.55
Decoder 1-best   Random               130             29.76   81.53
Oracle           Random               130             29.89   81.59
Oracle           BLEU+1 Threshold     82              29.86   81.66
Oracle           BLEU+1 Gain          100             29.85   81.60
Oracle (ja-en)   BLEU+1 Threshold     120             29.87   81.58

* : p < 0.05, ** : p < 0.01

๏ By self-training, the accuracy significantly improved.
๏ Using the ja-en self-trained model also improved the accuracy.
Parser Accuracy