Parser Self-Training for Syntax-Based Machine Translation
Makoto Morishita, Koichi Akabe, Yuto Hatakoshi, Graham Neubig, Koichiro Yoshino, Satoshi Nakamura
Augmented Human Communication Laboratory, Nara Institute of Science and Technology
IWSLT 2015, 2015/12/03
Background
Phrase-Based Machine Translation [Koehn et al., 2003]
(Figure: "John hit a ball" translated to 「ジョンは ボールを 打った」 via a translation model and a reordering model)
๏ Translate and reorder by phrases.
- Easy to learn the translation model.
- Low translation accuracy on language pairs with different word order.
Makoto Morishita, AHC Lab, NAIST
Tree-to-String Machine Translation [Liu et al., 2006]
(Figure: English parse tree of "John hit a ball" with a translation rule mapping it to 「x0:NP0 は x1:NP1 を 打った」)
๏ Use the source language parse tree in translation.
- High translation accuracy on language pairs with different word order.
- Translation accuracy is greatly affected by parser accuracy.
Forest-to-String Machine Translation [Mi et al., 2008]
(Figure: source language parse forest → Forest-to-String decoder → target language sentence)
๏ Use the source language parse forest in translation.
- The decoder can choose, from the parse tree candidates, the parse tree with a high translation probability [Zhang et al., 2012].
Parser Self-Training [McClosky et al., 2006]
(Figure: input sentence → parser → parse tree, which is fed back as parser training data)
๏ Use the parser output as training data.
๏ Improves parser accuracy.
- The parser is adapted to the target domain.
Targeted Self-Training for Preordering [Katz-Brown et al., 2011]
(Figure: input sentence → parser → candidate parse trees; trees that score highly against correct preordering data are used as training data)
๏ Selecting the parse trees makes self-training more effective (Targeted Self-Training).
- Use only highly scored parse trees.
- However, this method requires hand-aligned data.
- Hand-aligned data is costly to create.
Proposed Method
Proposed Method
(Figure: input sentence → parser → parse forest → Forest-to-String decoder → translated sentence and the parse tree used in translation; highly scored parse trees, evaluated with MT automatic evaluation metrics, are used as parser training data)
๏ Targeted Self-Training using MT automatic evaluation metrics.
- Low-cost and accurate evaluation.
Selection Methods
๏ Parse tree selection
- Select one high-scored parse tree from the several candidates for a single sentence.
๏ Sentence selection
- Select the sentences to use from the entire corpus.
Parse Tree Selection
๏ Parser 1-best
- Use the parser's 1-best tree.
- Traditional self-training [McClosky et al., 2006].
๏ Decoder 1-best
- Use the parse tree used in translation.
๏ Evaluation 1-best
- Among the translation candidates, use the parse tree used in the highest-scored translation.
Decoder 1-best
(Figure: input sentence → parser → parse forest → Forest-to-String decoder → translated sentence and the parse tree used in translation)
๏ Use the parse tree used in translation.
Evaluation 1-best
(Figure: input sentence → parser → parse forest → Forest-to-String decoder → translation candidates and their parse trees → automatic evaluation → highest-scored translation and its parse tree)
๏ Among the translation candidates, use the parse tree used in the highest-scored translation.
- This highest-scored translation is called the oracle translation.
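The evaluation 1-best selection above can be sketched as follows. This is only an illustrative sketch: the unigram-precision scorer is a toy stand-in for a real sentence-level metric such as BLEU+1, and the candidate/parse-tree pairs are hypothetical.

```python
# Sketch of "Evaluation 1-best" parse tree selection: among the decoder's
# translation candidates, keep the parse tree whose translation scores
# highest against the reference (the "oracle translation").

def unigram_precision(hypothesis, reference):
    # Toy stand-in for a real sentence-level metric such as BLEU+1.
    hyp, ref = hypothesis.split(), set(reference.split())
    return sum(w in ref for w in hyp) / max(len(hyp), 1)

def evaluation_1best(candidates, reference, score=unigram_precision):
    """candidates: list of (translation, parse_tree) pairs from the decoder.
    Returns the (translation, parse_tree) pair of the oracle translation."""
    return max(candidates, key=lambda pair: score(pair[0], reference))
```

For example, given the candidates `[("the man hit ball", t1), ("john hit a ball", t2)]` and the reference "john hit a ball", the second pair wins, and its parse tree t2 would be added to the parser's training data.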
Sentence Selection
๏ Random
- Select sentences randomly from the corpus.
- Traditional self-training.
๏ Threshold of the evaluation score
- Use sentences that score over the threshold.
๏ Gain of the evaluation score
- Use sentences that have a large gain in score between the decoder 1-best and the oracle translation.
Threshold of the Evaluation Score
(Figure: if the oracle translation's score ≥ threshold, use the translation and its parse tree; otherwise, do not use them)
๏ Use sentences that score over the threshold.
Gain of the Evaluation Score
(Figure: if the gain between the decoder 1-best translation and the oracle translation is large, use the oracle translation and its parse tree; if the gain is small, do not use them)
๏ Use sentences that have a large gain in score between the decoder 1-best and the oracle translation.
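The two score-based sentence selection rules can be sketched as below. The dictionary fields and cutoff values are illustrative assumptions, not the exact implementation; each corpus entry is assumed to carry the oracle score and the decoder 1-best score computed with an automatic metric such as BLEU+1.

```python
# Sketch of the two score-based sentence selection rules.

def select_by_threshold(corpus, threshold):
    # Keep sentences whose oracle translation scores at or above the threshold.
    return [s for s in corpus if s["oracle"] >= threshold]

def select_by_gain(corpus, min_gain):
    # Keep sentences where the oracle translation beats the decoder 1-best
    # by a large margin, i.e. where a better parse clearly helps translation.
    return [s for s in corpus if s["oracle"] - s["decoder_1best"] >= min_gain]
```

Both rules aim at the same goal: discard sentences whose parse trees are unlikely to be useful as parser training data, either because even the oracle translation is poor (threshold) or because the parse tree makes no difference to translation quality (gain).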
Experiments
Experimental Setup (for Self-Training)
- Parser: Egret (existing model trained on the Japanese Dependency Corpus, 7k sentences)
- Decoder: Forest-to-String (Travatar)
- Evaluation metric: BLEU+1
- Self-training data: parallel corpus (ASPEC, 2.0M sentence pairs)
Experimental Setup (for Evaluation)
- Decoder training: parallel corpus (ASPEC, 2.0M sentence pairs)
- Decoder dev/test: parallel corpus (ASPEC, dev: 2k, test: 2k)
- Compare the Forest-to-String decoder (Travatar) using the existing parser (Egret) against the self-trained parser.
- In these experiments, we focused on Japanese-English and Japanese-Chinese translation.
Experiment Results (Japanese-English Translation)

Tree Selection   Sentence Selection   Sentences (k)   BLEU    RIBES
Baseline         -                    -               23.83   72.27
Parser 1-best    Random               96              23.66   71.77
Decoder 1-best   Random               97              23.81   72.04
Oracle           Random               97              23.93   72.09
Oracle Translation Score Distribution
(Figure: histogram of sentences by oracle translation BLEU+1 score, 0.0 to 0.9; y-axis up to 25k sentences)
• The corpus contains a lot of noisy sentences.
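BLEU+1, the sentence-level metric behind these scores, smooths the n-gram precisions so that a single sentence with no 4-gram match does not collapse to zero. A minimal sketch of one common add-one formulation follows; implementations differ in exactly which counts are smoothed, so treat this as illustrative rather than the exact scorer used in the experiments.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    # Multiset of n-grams occurring in the token sequence.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu_plus_1(hypothesis, reference, max_n=4):
    """Sentence-level BLEU with add-one smoothed n-gram precisions
    and a brevity penalty (one common BLEU+1 formulation)."""
    hyp, ref = hypothesis.split(), reference.split()
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = ngram_counts(hyp, n)
        ref_ngrams = ngram_counts(ref, n)
        matched = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(len(hyp) - n + 1, 0)
        # Add-one smoothing keeps the log defined when nothing matches.
        log_precision_sum += math.log((matched + 1) / (total + 1)) / max_n
    brevity_penalty = min(1.0, math.exp(1 - len(ref) / max(len(hyp), 1)))
    return brevity_penalty * math.exp(log_precision_sum)
```

An identical hypothesis and reference score 1.0, while partial overlaps score strictly between 0 and 1, which is what makes the score usable for ranking oracle translations sentence by sentence.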
Experiment Results (Japanese-English Translation)

Tree Selection   Sentence Selection   Sentences (k)   BLEU      RIBES
Baseline         -                    -               23.83     72.27
Parser 1-best    Random               96              23.66     71.77
Decoder 1-best   Random               97              23.81     72.04
Oracle           Random               97              23.93     72.09
Oracle           BLEU+1 Threshold     120             24.26**   72.38
Oracle           BLEU+1 Gain          100             24.22*    72.32

* : p < 0.05, ** : p < 0.01

๏ By self-training, the accuracy significantly improved.
Manual Evaluation (score range: 1 to 5)

Tree Selection   Sentence Selection   Score   Significant vs. Baseline   Significant vs. Parser 1-best
Baseline         -                    2.38    -                          -
Parser 1-best    Random               2.42    No                         -
Oracle           BLEU+1 Threshold     2.50    Yes (99% level)            Yes (90% level)

๏ We could verify that our method is effective.
Example of an Improvement

Source:       C投与群ではRの活動を240分にわたって明らかに増強した
Reference:    in the C-administered group, thermal reaction clearly increased the activity of R for 240 minutes.
Baseline:     for 240 minutes clearly enhanced the activity of C administration group R.
Self-Trained: for 240 minutes clearly enhanced the activity of R in the C-administration group.
Before Self-Training
(Figure: parse tree of the source phrase 「C 投与 群 で は R の 活動 を」, glossed "C administered group in-TOP R of activity-OBJ", produced before self-training)
After Self-Training
(Figure: parse tree of the same phrase produced after self-training, with a corrected phrase structure)
Experiment Results (Japanese-Chinese Translation)

Tree Selection   Sentence Selection   Sentences (k)   BLEU    RIBES
Baseline         -                    -               29.60   81.32
Parser 1-best    Random               129             29.75   81.55
Decoder 1-best   Random               130             29.76   81.53
Oracle           Random               130             29.89   81.59
Oracle           BLEU+1 Threshold     82              29.86   81.66
Oracle           BLEU+1 Gain          100             29.85   81.60
Oracle (ja-en)   BLEU+1 Threshold     120             29.87   81.58

* : p < 0.05, ** : p < 0.01

๏ By self-training, the accuracy significantly improved.
๏ Using the ja-en self-trained model also improved the accuracy.
Parser Accuracy