
A Pattern-Based Machine Translation System: Yakushite Net MT Engine



  1. A Pattern-Based Machine Translation System: Yakushite Net MT Engine
     Miki Sasaki and Toshiki Murata
     Oki Electric Industry Co., Ltd.
     2-5-7 Hommachi, Chuo-ku, Osaka 541-0053, JAPAN
     {sasaki234, murata656}@oki.com

  2. Machine Translation by OKI inc.
     - Rule-based MT -> Pattern-based MT
       - Rule-based MT (PENSEE, 1980s to 1990s)
       - Pattern-based MT (implemented in Java, 1997 onward)
       - Collaborative translation environment (Yakushite Net, 2001 onward)
     - Pattern-based MT method
       - All the knowledge needed for translation is treated as translation patterns
       - Grammars and word dictionaries can be registered in our system in the same way, because both are treated as translation patterns

  3. Yakushite Net
     - Pattern-based MT
     - Collaborative translation environment
       - Users collaborate to improve the translation accuracy
     - To improve the translation accuracy:
       - Our system has various communities
       - Each community has a dictionary
       - Users register dictionary data in the dictionaries of relevant communities

  4. Structure of Communities
     [Figure: tree structure of communities, each with its own dictionary. Labels in reading order, top to bottom: General (root dictionary); science, technology, computer, hobby, ...; hardware, software; electronics, programming; perl, java.]

  5. Structure of Communities
     [Figure: same community tree as slide 4.]

  6. Structure of Communities
     [Figure: same community tree as slide 4.]
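The community tree can be pictured as a chain of dictionaries running from a specific community up to the root. The following Python sketch is illustrative only. The `Community` class, the parent links (General > computer > software > programming > java), the lookup policy (the most specific community wins), and every dictionary entry are assumptions made for the example; the slides only state that communities form a tree and that each community has a dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Community:
    """One community and its dictionary (hypothetical structure)."""
    name: str
    parent: Optional["Community"] = None
    dictionary: Dict[str, str] = field(default_factory=dict)

    def chain(self) -> List["Community"]:
        """Return this community followed by its ancestors up to the root."""
        nodes, node = [], self
        while node is not None:
            nodes.append(node)
            node = node.parent
        return nodes

    def lookup(self, term: str) -> Optional[str]:
        """Assumed policy: the most specific community that has the term wins."""
        for community in self.chain():
            if term in community.dictionary:
                return community.dictionary[term]
        return None


# Invented entries, for illustration only.
general = Community("General", dictionary={"犬": "dog", "テーブル": "table"})
computer = Community("computer", parent=general)
software = Community("software", parent=computer,
                     dictionary={"テーブル": "table (database)"})
programming = Community("programming", parent=software)
java = Community("java", parent=programming, dictionary={"メソッド": "method"})

print([c.name for c in java.chain()])
print(java.lookup("メソッド"), "|", java.lookup("テーブル"), "|", java.lookup("犬"))
```

Under this assumed policy, a term registered in a specialized community overrides the same term in a more general one, while terms missing locally fall back toward the root dictionary.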

  7. Technologies in Yakushite Net
     - Automatic dictionary acquisition
     - Determination of dictionaries, texts and communities
     - Multilingual processing

  8. Architecture of Our System
     The sentence is parsed using the translation patterns in the dictionaries.
     [Figure: source sentences -> translation engine -> translated sentences.
      Translation engine: morphological analyzer, parser/generator, post generator, morphological synthesizer.
      Dictionaries: user dictionary, system dictionary, general dictionary, failure recovery dictionary.]

  9. Translation Patterns
     - Rules of a context-free grammar (CFG) are paired
       - A CFG is a formal grammar in which every production rule has the form "V -> w"
     - Examples of CFG rules
         Japanese: S -> SIntr
         English:  S -> SIntr ?
     - Example of a translation pattern
         [ja:S [1:SIntr:*] ] [en:S [1:SIntr:*] ?:pos=punc];
     - The mandatory numerical index relates elements of the source pattern to elements of the target pattern
     - Source-language patterns are used for analysis (in Japanese-English translation, "ja" is the source language and "en" is the target language)
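To make the pairing concrete, here is a small Python sketch that reads a pattern written in the bracket notation above and checks that the numerical indices on the source and target sides correspond. It is a reading aid, not the engine's parser: the function names are hypothetical, features such as `pos=punc` are discarded, and only the notation visible on the slides is handled.

```python
import re
from typing import List, Tuple


def split_sides(pattern: str) -> List[str]:
    """Split '[ja:...] [en:...];' into its top-level bracket groups."""
    sides, depth, start = [], 0, 0
    for i, ch in enumerate(pattern):
        if ch == "[":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                sides.append(pattern[start:i].strip())
    return sides


def parse_side(side: str) -> Tuple[str, str, list]:
    """Return (language, head category, elements) for one side."""
    head, _, body = side.partition(" ")
    lang, category = head.split(":")[:2]
    elements = []
    for m in re.finditer(r"\[(\d+):([^\]:\s]+)[^\]]*\]|(\S+)", body):
        if m.group(1):  # indexed non-terminal such as [1:SIntr:*]
            elements.append(("nonterminal", int(m.group(1)), m.group(2)))
        else:           # literal word such as ?:pos=punc
            word = m.group(3).split(":")[0]
            if word:    # skip detached feature strings like ':pos=ds'
                elements.append(("terminal", word))
    return lang, category, elements


def parse_pattern(pattern: str) -> list:
    return [parse_side(side) for side in split_sides(pattern)]


def indices(side: tuple) -> List[int]:
    return sorted(e[1] for e in side[2] if e[0] == "nonterminal")


source, target = parse_pattern("[ja:S [1:SIntr:*] ] [en:S [1:SIntr:*] ?:pos=punc];")
print(source)
print(target)
print(indices(source) == indices(target))
```

The source side yields the rule S -> SIntr and the target side S -> SIntr ?, with index 1 tying the two SIntr elements together.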

  10-19. Parsing and generating method
     [Figure sequence over ten slides: the source and target parse trees, each rooted at S. Slide 12 shows VP nodes above the words 行く and か.]

  20. Parsing and generating method
      - Word sequences are reduced to the root of a parse tree ("S") by applying patterns
      - When the word sequence reaches "S", the source parse tree is complete
      - Each node is converted using the corresponding target-language pattern
      - The target parse tree is generated immediately after the source parse tree is completed
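The two phases described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the engine's algorithm: the reduction is a naive greedy loop without backtracking, all features (`pos`, `personNum`, and so on) are dropped, and the `Pattern` and `Node` classes are hypothetical. The seven patterns are the ones shown on slides 29 to 31 with their features removed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pattern:
    src_head: str
    src_rhs: tuple  # items: a word (str) or an (index, category) pair
    tgt_rhs: tuple  # items: a word (str) or the index of a source child


@dataclass
class Node:
    category: str
    pattern: Pattern
    children: dict  # index -> Node


PATTERNS = [
    Pattern("VP", ("行く",), ("go",)),
    Pattern("VP", ((1, "VP"), "か"), (1,)),
    Pattern("NP", ("彼",), ("he",)),
    Pattern("NPJoshi", ((2, "NP"), "は"), (2,)),
    Pattern("FsIntr", ("どこに",), ("where",)),
    Pattern("SIntr", ((2, "NPJoshi"), (1, "FsIntr"), (3, "VP")), (1, "do", 2, 3)),
    Pattern("S", ((1, "SIntr"),), (1, "?")),
]


def match(pattern: Pattern, seq: list, start: int) -> Optional[Node]:
    """Try to apply the source side of `pattern` at position `start`."""
    end = start + len(pattern.src_rhs)
    if end > len(seq):
        return None
    children = {}
    for item, symbol in zip(pattern.src_rhs, seq[start:end]):
        if isinstance(item, str):
            if symbol != item:
                return None
        else:
            index, category = item
            if not (isinstance(symbol, Node) and symbol.category == category):
                return None
            children[index] = symbol
    return Node(pattern.src_head, pattern, children)


def parse(words: list) -> Optional[Node]:
    """Greedily reduce the word sequence; succeed only if it ends as one S."""
    seq = list(words)
    reduced = True
    while reduced:
        reduced = False
        for pattern in PATTERNS:
            for start in range(len(seq)):
                node = match(pattern, seq, start)
                if node is not None:
                    seq[start:start + len(pattern.src_rhs)] = [node]
                    reduced = True
                    break
            if reduced:
                break
    if len(seq) == 1 and isinstance(seq[0], Node) and seq[0].category == "S":
        return seq[0]
    return None


def generate(node: Node) -> list:
    """Convert each node with the target side of the pattern that built it."""
    words = []
    for item in node.pattern.tgt_rhs:
        if isinstance(item, int):
            words.extend(generate(node.children[item]))
        else:
            words.append(item)
    return words


tree = parse(["彼", "は", "どこに", "行く", "か"])
print(generate(tree))  # ['where', 'do', 'he', 'go', '?']
```

Because features are dropped in this sketch, "do" stays uninflected. In the slides the target pattern carries `personNum=3sg` on "do", which is presumably what lets a later stage produce "does".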

  21. Priority Control of Translation
      - A parse tree
        - is prioritized by a combination of criteria (e.g., the number of selected patterns)
      - A translation pattern
        - is prioritized with a priority control mark
      - The failure recovery dictionary
        - becomes active only when the normal parsing process has failed
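One way to picture these three mechanisms together is the Python sketch below. It rests on assumptions the slides do not confirm: that parse trees using more priority-marked patterns are preferred, that fewer patterns is better among otherwise equal trees, and that the failure recovery dictionary is consulted through a separate parser. The `Candidate` class, both stub parsers, and the pattern counts are invented for illustration; only the two English outputs come from slide 27.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    translation: str
    pattern_count: int    # patterns used to build the parse tree
    marked_patterns: int  # patterns in the tree that carry a priority mark


Parser = Callable[[str], List[Candidate]]


def best_candidate(candidates: List[Candidate]) -> Optional[Candidate]:
    """Assumed ranking: more marked patterns first, then fewer patterns."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: (-c.marked_patterns, c.pattern_count))


def translate(sentence: str, normal: Parser, recovery: Parser) -> Optional[str]:
    """Use the recovery parser only when normal parsing yields nothing."""
    best = best_candidate(normal(sentence))
    if best is None:
        best = best_candidate(recovery(sentence))
    return best.translation if best is not None else None


def normal_parser(sentence: str) -> List[Candidate]:
    # Stub: counts are invented; translations are the outputs on slide 27.
    if sentence == "警察 を 呼ん で 。":
        return [Candidate("It calls police and.", 6, 0),
                Candidate("Call police.", 5, 1)]
    return []


def recovery_parser(sentence: str) -> List[Candidate]:
    # Stub standing in for parsing with the failure recovery dictionary.
    return [Candidate("(fallback translation)", 1, 0)]


print(translate("警察 を 呼ん で 。", normal_parser, recovery_parser))
print(translate("未知 の 文", normal_parser, recovery_parser))
```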

  22. The Results for IWSLT2005
      - Description of the planned training methods
      - Results
        - Performance for training data
        - Result for test data
      - Examples of registered translation patterns and translation results

  23. Description of the Planned Training Methods
      - The existing patterns did not cover many of the expressions seen in BTEC
      - We manually made highly generalized translation patterns:
        1. We manually extracted frequently used expressions from the IWSLT05 training corpus
        2. We turned those expressions into patterns and gave them appropriate translations
        3. We corrected the existing patterns
        4. We registered the new patterns in our system

  24. Performance for Training Data (IWSLT04 Test Set)
      (1) Before registering the new patterns
      (2) After registering the new patterns
      (3) After extracting the parallel texts with one Japanese sentence from the IWSLT05 training corpus and the IWSLT04 test corpus, and registering them

            BLEU     NIST      WER      PER
      (1)   0.1918   6.2283    0.6470   0.5640
      (2)   0.2179   6.7882    0.5989   0.5183
      (3)   0.7616   12.5216   0.2216   0.1894

  25. Result for Test Data (IWSLT05 Test Set)
      (1) Before registering the new patterns
      (2) After registering the new patterns
      (3) After extracting the parallel texts with one Japanese sentence from the IWSLT05 training corpus and the IWSLT04 test corpus, and registering them

            BLEU     NIST     WER      PER
      (1)   0.1918   6.3279   0.6749   0.5624
      (2)   0.2222   6.8913   0.6314   0.5258
      (3)   0.2639   7.3585   0.6066   0.5065
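For readers unfamiliar with the two error-rate columns, the Python sketch below computes sentence-level WER (word error rate) and PER (position-independent error rate) using common textbook formulations. The slides do not define the metrics, so treat these definitions as assumptions: the official IWSLT scoring is corpus-level and may differ in tokenization, casing, and its handling of multiple references. The example pair is the first sentence on slide 26, lowercased and tokenized by hand.

```python
from collections import Counter
from typing import List


def edit_distance(hyp: List[str], ref: List[str]) -> int:
    """Word-level Levenshtein distance (substitution, insertion, deletion)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        cur = [i]
        for j, r in enumerate(ref, start=1):
            cur.append(min(prev[j] + 1,              # delete h
                           cur[j - 1] + 1,           # insert r
                           prev[j - 1] + (h != r)))  # match or substitute
        prev = cur
    return prev[-1]


def wer(hyp: List[str], ref: List[str]) -> float:
    """Edit distance divided by the reference length."""
    return edit_distance(hyp, ref) / len(ref)


def per(hyp: List[str], ref: List[str]) -> float:
    """Like WER, but word order is ignored (bag-of-words matching)."""
    matches = sum((Counter(hyp) & Counter(ref)).values())
    surplus = max(0, len(hyp) - len(ref))
    return 1 - (matches - surplus) / len(ref)


hyp = "you see a ball well and .".split()
ref = "watch your ball carefully .".split()
print(wer(hyp, ref), per(hyp, ref))  # 1.0 1.0
```

Lower is better for both measures, whereas higher is better for BLEU and NIST, which is why the rows move in opposite directions across the columns.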

  26. Examples of Registered Translation Pattern and Translation Results (1/2)
      IWSLT05_JE_training:
        Japanese: ボール (booru) を (wo) よく (yoku) 見 (mi) て (te) 。
        Translation result (1): You see a ball well and.
        English: Watch your ball carefully.

        Japanese: つかまえ (tsukamae) て (te) 。
        Translation result (1): It catches it and.
        English: Catch him.
      Extracted expression: the -te form of a verb (the conjugated form that leads declinable words) followed by the particle "te (て)" or "de (で)" makes an imperative.

  27. Examples of Registered Translation Pattern and Translation Results (2/2)
      Registered translation pattern:
        ![ja:SImp [1:VP:*:inf=ry:pos=ds] て:pos=sj] [en:SImp [1:VP:*:conjug=bare] ];
      IWSLT05_JE_TESTSET:
        Japanese: 警察 (keisatsu) を (wo) 呼ん (yon) で (de) 。
        Translation result (1): It calls police and.
        Translation result (3): Call police.

        Japanese: 芝生 (shibahu) に (ni) 入ら (haira) ない (nai) で (de) 。
        Translation result (1): It does not enter a lawn and.
        Translation result (3): Do not enter the lawn.

  28. Conclusion
      - We presented our pattern-based MT method
        - It enables easier registration of phrasal expressions and grammatical knowledge
      - We described how we dealt with the task
        - We dealt with the task mainly manually
      - Future study
        - Adoption of an automatic dictionary acquisition technology

  29. Example of Translation (1/3)
      Japanese: 「彼はどこに行くか」
      English: "Where does he go?"
      VP -> 行く:
        [ja:VP 行く:*:pos=ds] [en:VP go:*:pos=v];
      VP -> VP か:
        [ja:VP:jSentenceType=interrogative [1:VP:*] か:pos=ej] [en:VP [1:VP:*]];

  30. Example of Translation (2/3)
      Japanese: 「彼はどこに行くか」
      English: "Where does he go?"
      NP -> 彼:
        [ja:NP:personNum=3sg 彼:*:pos=ms] [en:NP he:*:pos=prn];
      NPJoshi -> NP は:
        [ja:NPJoshi:case=subj [2:NP:*] は:pos=fj] [en:NP [2:NP:*:case=subj] ];
      FsIntr -> どこに:
        [ja:FsIntr どこに:*:pos=fs] [en:AdvIntr where:*:pos=adv];

  31. Example of Translation (3/3)
      Japanese: 「彼はどこに行くか」
      English: "Where does he go?"
      SIntr -> NPJoshi FsIntr VP:
        [ja:SIntr [2:NPJoshi:case=subj:personNum=3sg] [1:FsIntr] [3:VP:*:jSentenceType=interrogative] ] [en:SIntr [1:AdvIntr] do:pos=v:personNum=3sg [2:NP] [3:VP:*] ];
      S -> SIntr:
        [ja:S [1:SIntr:*] ] [en:S [1:SIntr:*] ?:pos=punc];
      [Figure: target tree S -> SIntr "?", SIntr -> AdvIntr "do" NP VP, with leaves "where" "do" "he" "go".]
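The reordering performed by the SIntr pattern, and the role of the `personNum=3sg` feature on "do", can be traced with a short Python sketch. The names `TARGET_SINTR`, `INFLECT`, and `realize` are hypothetical, and the one-entry inflection table merely stands in for whatever morphological synthesis the engine performs; the slides do not describe that step.

```python
from typing import Dict

# Target side of the SIntr pattern from slide 31: indices refer to the
# translated source children, and the tuple is the literal word with features.
TARGET_SINTR = [1, ("do", {"personNum": "3sg"}), 2, 3]

# Stand-in for morphological synthesis (assumption, one entry only).
INFLECT = {("do", "3sg"): "does"}


def realize(target: list, children: Dict[int, str]) -> str:
    """Place each translated child at its index and inflect literal words."""
    words = []
    for item in target:
        if isinstance(item, int):
            words.append(children[item])
        else:
            lemma, features = item
            words.append(INFLECT.get((lemma, features.get("personNum")), lemma))
    return " ".join(words)


# Source order is NPJoshi(2) FsIntr(1) VP(3); translations from slides 29 and 30.
children = {2: "he", 1: "where", 3: "go"}
sentence = realize(TARGET_SINTR, children) + "?"
sentence = sentence[0].upper() + sentence[1:]
print(sentence)  # Where does he go?
```

The indices are what let the Japanese order (subject, interrogative, verb) come out in the English order (interrogative, auxiliary, subject, verb).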
