
Cooperative unsupervised training of the part-of-speech taggers in a bidirectional machine translation system



  1. Cooperative unsupervised training of the part-of-speech taggers in a bidirectional machine translation system ∗
     Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, Mikel L. Forcada
     Departament de Llenguatges i Sistemes Informàtics, Universitat d'Alacant, E-03071 Alacant, Spain
     {fsanchez,japerez,mlf}@dlsi.ua.es
     ∗ Funded by the Spanish Government through grants TIC2003-08681-C02-01 and BES-2004-4711

  2. Contents
     • Introduction
     • Part-of-speech ambiguities in machine translation
     • Part-of-speech tagging with HMM
     • Target-language based training of HMM-based taggers
     • Cooperative learning of HMM
     • Experiments
     • Discussion
     • Future work
     TMI, Baltimore 4–6 October, 2004

  3. Introduction
     Part-of-speech (PoS) tagging: determining the lexical category, or PoS, of each word that appears in a text.
     Ambiguous word: a word with more than one possible lexical category (PoS).
       Word    Lemma   PoS
       book    book    noun
               book    verb
     Ambiguities are usually resolved by looking at the context.

  4. PoS ambiguities in machine translation (I)
     Indirect MT system: the source language (SL) text is analysed and transformed into an abstract intermediate representation, transformations are applied and, finally, the target language (TL) text is generated.
     [Diagram] SL text → Analysis → (SLAR) → Transformation → (TLAR) → Generation → TL text
     • The analysis module usually includes a PoS tagger

  5. PoS ambiguities in machine translation (II)
     Mistranslation due to wrong PoS tagging
     • The translation differs from one PoS to another:
       Spanish   PoS           Translation into Catalan
       para      preposition   per a (for/to)
       para      verb          para (stop)

  6. PoS ambiguities in machine translation (II)
     Mistranslation due to wrong PoS tagging
     • The translation differs from one PoS to another:
       Spanish   PoS           Translation into Catalan
       para      preposition   per a (for/to)
       para      verb          para (stop)
     • Some transformation is applied (or not) for some PoS:
       Spanish      PoS            Translation into Catalan
       las calles   la (article)   els carrers (the streets)     ← gender agreement rule applied
       las calles   la (pronoun)   * les carrers (them streets)

  7. PoS tagging with HMM (I)
     Use of a hidden Markov model (HMM):
     • A reduced tag set is adopted (grouping the finer tags delivered by the morphological analyser)
     • Each HMM state corresponds to a different PoS tag
     • Each input word is replaced by its corresponding ambiguity class
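
The last bullet can be illustrated with a minimal Python sketch. The lexicon below is hypothetical and only covers the Spanish example used later in the talk; the function names are illustrative and do not come from the authors' system.

```python
# Hypothetical lexicon: each word maps to the set of reduced PoS tags it can take.
# The tags follow the "y la para si" example that appears later in the slides.
LEXICON = {
    "y": {"CNJ"},
    "la": {"ART", "PRN"},
    "para": {"PR", "VB"},
    "si": {"CNJ"},
}


def ambiguity_class(word, lexicon=LEXICON):
    """Return the word's ambiguity class as a hashable, order-independent set of tags."""
    return frozenset(lexicon[word.lower()])


def to_ambiguity_classes(words, lexicon=LEXICON):
    """Replace every word by its ambiguity class (the symbol the HMM observes)."""
    return [ambiguity_class(w, lexicon) for w in words]


classes = to_ambiguity_classes("y la para si".split())
```

Words that share the same set of possible tags, such as "y" and "si" here, map to the same observable symbol, which keeps the emission table much smaller than one indexed by word forms.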

  8. PoS tagging with HMM (II)
     Estimating proper HMM parameters
     Training:
     • supervised
     • unsupervised

  9. PoS tagging with HMM (II)
     Estimating proper HMM parameters
     Training:
     • supervised ← tagged corpus
     • unsupervised ← untagged corpus

  10. PoS tagging with HMM (II)
      Estimating proper HMM parameters
      Training:
      • supervised ← tagged corpus
      • unsupervised ← untagged corpus
        – Baum-Welch
        – New idea: use of TL information

  11. Target-language based training of HMM-based taggers (I)
      • Transition probabilities
        $$a_{\gamma_i \gamma_j} = \frac{\tilde{n}(\gamma_i \gamma_j)}{\sum_{\gamma_k \in \Gamma} \tilde{n}(\gamma_i \gamma_k)}$$
      • Emission probabilities
        $$b_{\gamma_i \sigma} = \frac{\tilde{n}(\sigma, \gamma_i)}{\sum_{\sigma' : \gamma_i \in \sigma'} \tilde{n}(\sigma', \gamma_i)}$$
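
Both formulas are plain normalisations of counts. The following Python sketch shows the arithmetic under stated assumptions: the counts are invented for illustration, the tag names NOUN and ADJ are hypothetical, and the emission dictionary is assumed to hold entries only for pairs in which the tag belongs to the ambiguity class, so summing over all stored classes matches the restricted sum in the denominator.

```python
from collections import defaultdict


def estimate_parameters(transition_counts, emission_counts):
    """Normalise counts into HMM transition (a) and emission (b) probabilities.

    transition_counts: {(tag_i, tag_j): count}
    emission_counts:   {(ambiguity_class, tag_i): count}, with tag_i in the class
    """
    # Denominator of a: total count of transitions leaving each tag.
    out_totals = defaultdict(float)
    for (tag_i, _tag_j), n in transition_counts.items():
        out_totals[tag_i] += n

    # Denominator of b: total count of classes emitted from each tag.
    emit_totals = defaultdict(float)
    for (_sigma, tag_i), n in emission_counts.items():
        emit_totals[tag_i] += n

    a = {(i, j): n / out_totals[i]
         for (i, j), n in transition_counts.items() if out_totals[i] > 0}
    b = {(i, sigma): n / emit_totals[i]
         for (sigma, i), n in emission_counts.items() if emit_totals[i] > 0}
    return a, b


# Invented counts, for illustration only.
transition_counts = {("ART", "NOUN"): 3.0, ("ART", "ADJ"): 1.0, ("NOUN", "VB"): 2.0}
emission_counts = {
    (frozenset({"ART", "PRN"}), "ART"): 3.0,
    (frozenset({"ART"}), "ART"): 1.0,
}
a, b = estimate_parameters(transition_counts, emission_counts)
```

The counts are stored as floats because the tildes in the formulas suggest estimated rather than directly observed counts, so fractional values should be expected.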

  12. Target-language based training of HMM-based taggers (II)
      [Diagram] SL text → Segmentation → segments s_1, s_2, ..., s_n

  13. Target-language based training of HMM-based taggers (II)
      [Diagram] SL text → Segmentation → segments s_1, s_2, ..., s_n
      [Diagram] segment s_i → disambiguations: paths g_1, g_2, ..., g_m

  14. Target-language based training of HMM-based taggers (II)
      [Diagram] SL text → Segmentation → segments s_1, s_2, ..., s_n
      [Diagram] segment s_i → disambiguations: paths g_1, g_2, ..., g_m → MT → translations: τ(g_1, s), τ(g_2, s), ..., τ(g_m, s)

  15. Target-language based training of HMM-based taggers (II)
      [Diagram] SL text → Segmentation → segments s_1, s_2, ..., s_n
      [Diagram] segment s_i → disambiguations: paths g_1, g_2, ..., g_m → MT → translations: τ(g_1, s), τ(g_2, s), ..., τ(g_m, s) → TL model → likelihoods: p_TL(τ(g_1, s)), p_TL(τ(g_2, s)), ..., p_TL(τ(g_m, s))

  16. Target-language based training of HMM-based taggers (II)
      [Diagram] SL text → Segmentation → segments s_1, s_2, ..., s_n
      [Diagram] segment s_i → disambiguations: paths g_1, g_2, ..., g_m → MT → translations: τ(g_1, s), τ(g_2, s), ..., τ(g_m, s) → TL model → likelihoods: p_TL(τ(g_1, s)), p_TL(τ(g_2, s)), ..., p_TL(τ(g_m, s)) → path probabilities: p(g_1 | s), p(g_2 | s), ..., p(g_m | s)

  17. Target-language based training of HMM-based taggers (III)
      s ≡ y la para si
      Ambiguity classes: y {CNJ}, la {ART, PRN}, para {VB, PR}, si {CNJ}
        Path   Tags              Translation τ(g_i, s)                        p(g_i | s)
        g_1    CNJ ART PR CNJ    i (and) la (the) per a (for/to) si (if)      0.0001
        g_2    CNJ ART VB CNJ    i (and) la (the) para (stop) si (if)         0.4999
        g_3    CNJ PRN PR CNJ    i (and) la (it/her) per a (for/to) si (if)   0.0001
        g_4    CNJ PRN VB CNJ    i (and) la (it/her) para (stop) si (if)      0.4999

  18. Target-language based training of HMM-based taggers (III)
      s ≡ y la para si
      Ambiguity classes: y {CNJ}, la {ART, PRN}, para {VB, PR}, si {CNJ}
        Path   Tags              Translation τ(g_i, s)                        p(g_i | s)
        g_1    CNJ ART PR CNJ    i (and) la (the) per a (for/to) si (if)      0.0001
        g_2    CNJ ART VB CNJ    i (and) la (the) para (stop) si (if)         0.4999
        g_3    CNJ PRN PR CNJ    i (and) la (it/her) per a (for/to) si (if)   0.0001
        g_4    CNJ PRN VB CNJ    i (and) la (it/her) para (stop) si (if)      0.4999
      Free ride: a word that is translated the same way independently of the tag selected
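
The example above can be reproduced with a short Python sketch that enumerates every disambiguation path of the segment and groups the paths by the translation they produce. `TOY_MT` is a hypothetical word-for-word dictionary standing in for the real MT system, and all function names are illustrative.

```python
from collections import defaultdict
from itertools import product

# The segment "y la para si" with the possible tags of each word, as on the slide.
SEGMENT = [
    ("y", ["CNJ"]),
    ("la", ["ART", "PRN"]),
    ("para", ["PR", "VB"]),
    ("si", ["CNJ"]),
]

# Hypothetical Spanish-to-Catalan dictionary keyed by (word, tag).
TOY_MT = {
    ("y", "CNJ"): "i",
    ("la", "ART"): "la",
    ("la", "PRN"): "la",  # same output for both tags: a free ride
    ("para", "PR"): "per a",
    ("para", "VB"): "para",
    ("si", "CNJ"): "si",
}


def paths(segment):
    """All disambiguation paths: one tag chosen for each word of the segment."""
    return [tuple(p) for p in product(*(tags for _word, tags in segment))]


def translate(segment, path):
    """Translate the segment word by word according to the tags in the path."""
    return " ".join(TOY_MT[(word, tag)] for (word, _tags), tag in zip(segment, path))


def group_by_translation(segment):
    """Map each distinct translation to the list of paths that produce it."""
    groups = defaultdict(list)
    for g in paths(segment):
        groups[translate(segment, g)].append(g)
    return dict(groups)


groups = group_by_translation(SEGMENT)
```

With this toy dictionary the four paths collapse into two distinct translations, because "la" is rendered as "la" whether it is tagged as article or pronoun. The target language alone therefore cannot tell g_1 from g_3, or g_2 from g_4.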

  19. Target-language based training of HMM-based taggers (IV)
      $$p(g_i \mid s) \propto p(g_i \mid \tau(g_i, s)) \, p_{TL}(\tau(g_i, s))$$
      • $p(g_i \mid s)$: probability that path $g_i$ is the correct disambiguation of segment $s$
      • $p_{TL}(\tau(g_i, s))$: likelihood of the translation into TL of segment $s$ according to the disambiguation given by path $g_i$
        – Language model based on trigrams of words
        – Hidden Markov model
        – ...
      • $p(g_i \mid \tau(g_i, s))$: contribution of the disambiguation path $g_i$ to the translation given by $\tau(g_i, s)$
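
A minimal Python sketch of this proportionality follows, under two assumptions that the slide does not state: the contribution term is taken to be uniform over the paths that yield the same translation, and the TL likelihoods are hypothetical values chosen so that the result matches the numbers in the "y la para si" example.

```python
def path_probabilities(groups, tl_likelihood):
    """Compute p(g|s) from p(g|tau) * p_TL(tau), normalised over all paths.

    groups:        {translation: [paths producing that translation]}
    tl_likelihood: {translation: p_TL(translation)}
    Assumption: p(g|tau) = 1 / (number of paths producing tau).
    """
    scores = {}
    for translation, gs in groups.items():
        for g in gs:
            scores[g] = tl_likelihood[translation] / len(gs)
    total = sum(scores.values())
    return {g: score / total for g, score in scores.items()}


# Paths of "y la para si" grouped by their Catalan translation.
groups = {
    "i la per a si": [("CNJ", "ART", "PR", "CNJ"), ("CNJ", "PRN", "PR", "CNJ")],
    "i la para si": [("CNJ", "ART", "VB", "CNJ"), ("CNJ", "PRN", "VB", "CNJ")],
}

# Hypothetical TL likelihoods, picked to reproduce the figures on the slide.
tl_likelihood = {"i la per a si": 0.0002, "i la para si": 0.9998}

probs = path_probabilities(groups, tl_likelihood)
```

Under these assumptions each of the two paths sharing a translation receives half of that translation's normalised likelihood, which gives roughly 0.0001 for g_1 and g_3 and 0.4999 for g_2 and g_4.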

  20. Cooperative learning of HMM (I)
      • Use of the previous idea ...
