Cooperative unsupervised training of the part-of-speech taggers in a bidirectional machine translation system ∗
Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, Mikel L. Forcada
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
{fsanchez,japerez,mlf}@dlsi.ua.es
∗ Funded by the Spanish Government through grants TIC2003-08681-C02-01 and BES-2004-4711
Contents
• Introduction
• Part-of-speech ambiguities in machine translation
• Part-of-speech tagging with HMM
• Target-language based training of HMM-based taggers
• Cooperative learning of HMM
• Experiments
• Discussion
• Future work
TMI, Baltimore 4–6 October, 2004
Introduction
Part-of-speech (PoS) tagging: determining the lexical category (PoS) of each word that appears in a text.
Ambiguous word: a word with more than one possible lexical category (PoS).
Example: the word "book"
Lemma   PoS
book    noun
book    verb
Ambiguities are usually resolved by looking at the context.
PoS ambiguities in machine translation (I)
Indirect MT system: the source language (SL) text is analysed and transformed into an abstract intermediate representation, transformations are applied and, finally, the target language (TL) text is generated.
SL text → Analysis → SLAR → Transformation → TLAR → Generation → TL text
• The analysis module usually includes a PoS tagger
PoS ambiguities in machine translation (II)
Mistranslation due to wrong PoS tagging
• The translation differs from one PoS to another:
Spanish   PoS           Translation into Catalan
para      preposition   per a (for/to)
para      verb          para (stop)
• Some transformation is applied (or not) for some PoS:
Spanish      PoS            Translation into Catalan
las calles   la (article)   els carrers (the streets), gender agreement rule applied
las calles   la (pronoun)   * les carrers (them streets)
PoS tagging with HMM (I)
Use of a hidden Markov model (HMM):
• A reduced tag set is adopted (grouping the finer tags delivered by the morphological analyser)
• Each HMM state corresponds to a different PoS tag
• Each input word is replaced by its corresponding ambiguity class
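The last bullet can be illustrated with a minimal sketch. It assumes a hypothetical toy lexicon (`LEXICON`) whose words and tags are borrowed from the "y la para si" example used later in these slides; the function names are illustrative and not part of the authors' system.

```python
# Hypothetical toy lexicon: word -> set of PoS tags from the reduced tag set.
LEXICON = {
    "y": {"CNJ"},
    "la": {"ART", "PRN"},
    "para": {"PR", "VB"},
    "si": {"CNJ"},
}


def ambiguity_class(word):
    """Return the ambiguity class of a word as a frozenset of possible tags."""
    return frozenset(LEXICON[word.lower()])


def to_observations(words):
    """Replace each input word by its ambiguity class (the HMM observation)."""
    return [ambiguity_class(word) for word in words]


OBSERVATIONS = to_observations("y la para si".split())
```

Words that share the same set of possible tags (here "y" and "si") map to the same observation, which keeps the observation alphabet much smaller than the vocabulary.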
PoS tagging with HMM (II)
Estimating proper HMM parameters
Training:
• supervised: uses a tagged corpus
• unsupervised: uses an untagged corpus
  – Baum-Welch
  – New idea: use of TL information
Target-language based training of HMM-based taggers (I)
• Transition probabilities
\[ \tilde{a}_{\gamma_i \gamma_j} = \frac{\tilde{n}(\gamma_i \gamma_j)}{\sum_{\gamma_k \in \Gamma} \tilde{n}(\gamma_i \gamma_k)} \]
• Emission probabilities
\[ \tilde{b}_{\gamma_i \sigma} = \frac{\tilde{n}(\sigma, \gamma_i)}{\sum_{\sigma' : \gamma_i \in \sigma'} \tilde{n}(\sigma', \gamma_i)} \]
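The two ratios above amount to normalising estimated counts per tag. The following is a minimal sketch under stated assumptions: the count dictionaries, their key layout, and the numeric values are hypothetical toy data, and the slide does not say how the counts themselves are accumulated.

```python
from collections import defaultdict


def normalize_counts(transition_counts, emission_counts):
    """Turn estimated counts into HMM parameters.

    transition_counts: {(tag_i, tag_j): estimated count of tag_i followed by tag_j}
    emission_counts:   {(sigma, tag_i): estimated count of ambiguity class sigma
                        being emitted from tag_i}, with sigma a frozenset of tags
    """
    # Denominator of the transition formula: sum over all successor tags.
    transition_totals = defaultdict(float)
    for (tag_i, _tag_j), count in transition_counts.items():
        transition_totals[tag_i] += count

    # Denominator of the emission formula: sum over classes that contain tag_i.
    emission_totals = defaultdict(float)
    for (sigma, tag_i), count in emission_counts.items():
        if tag_i in sigma:
            emission_totals[tag_i] += count

    a = {
        (tag_i, tag_j): count / transition_totals[tag_i]
        for (tag_i, tag_j), count in transition_counts.items()
        if transition_totals[tag_i] > 0
    }
    b = {
        (tag_i, sigma): count / emission_totals[tag_i]
        for (sigma, tag_i), count in emission_counts.items()
        if tag_i in sigma and emission_totals[tag_i] > 0
    }
    return a, b


# Hypothetical fractional counts, for illustration only.
ART_PRN = frozenset({"ART", "PRN"})
ART_ONLY = frozenset({"ART"})
TRANSITION_COUNTS = {
    ("CNJ", "ART"): 0.5, ("CNJ", "PRN"): 0.5,
    ("ART", "VB"): 0.75, ("ART", "PR"): 0.25,
}
EMISSION_COUNTS = {
    (ART_PRN, "ART"): 0.5, (ART_ONLY, "ART"): 1.5, (ART_PRN, "PRN"): 0.5,
}
A, B = normalize_counts(TRANSITION_COUNTS, EMISSION_COUNTS)
```

Because the counts are estimates rather than observed frequencies, they may be fractional; the normalisation is the same either way.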
Target-language based training of HMM-based taggers (II)
SL text → Segmentation → segments s_1, s_2, ..., s_n
For each segment s_i:
• disambiguations: paths g_1, g_2, ..., g_m
• translations (MT): τ(g_1, s), τ(g_2, s), ..., τ(g_m, s)
• likelihoods (TL model): p_TL(τ(g_1, s)), p_TL(τ(g_2, s)), ..., p_TL(τ(g_m, s))
• path probabilities: p(g_1 | s), p(g_2 | s), ..., p(g_m | s)
Target-language based training of HMM-based taggers (III)
s ≡ y la para si
Ambiguity classes: y {CNJ}, la {ART, PRN}, para {VB, PR}, si {CNJ}
Path                     Translation                                                p(g_i | s)
g_1 ≡ CNJ ART PR CNJ     τ(g_1, s) ≡ i (and) la (the) per a (for/to) si (if)        0.0001
g_2 ≡ CNJ ART VB CNJ     τ(g_2, s) ≡ i (and) la (the) para (stop) si (if)           0.4999
g_3 ≡ CNJ PRN PR CNJ     τ(g_3, s) ≡ i (and) la (it/her) per a (for/to) si (if)     0.0001
g_4 ≡ CNJ PRN VB CNJ     τ(g_4, s) ≡ i (and) la (it/her) para (stop) si (if)        0.4999
Free ride: a word that is translated the same way regardless of the tag selected
Target-language based training of HMM-based taggers (IV)
\[ p(g_i \mid s) \propto p(g_i \mid \tau(g_i, s)) \, p_{TL}(\tau(g_i, s)) \]
• p(g_i | s): probability that path g_i is the correct disambiguation of segment s
• p_TL(τ(g_i, s)): likelihood of the TL translation of segment s under the disambiguation given by path g_i
  – Language model based on trigrams of words
  – Hidden Markov model
  – ...
• p(g_i | τ(g_i, s)): contribution of the disambiguation path g_i to the translation given by τ(g_i, s)
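The proportionality above can be sketched on the "y la para si" example. Everything named here is a hypothetical stand-in: `TOY_DICTIONARY` replaces the MT engine, `TOY_TL_SCORES` replaces the TL model (its values were chosen only so that the result matches the probabilities shown on the example slide), and the contribution p(g_i | τ(g_i, s)) is assumed to be split evenly among paths that yield the same translation, which the slides do not state explicitly.

```python
from collections import Counter
from itertools import product

SEGMENT = ["y", "la", "para", "si"]
AMBIGUITY_CLASSES = [["CNJ"], ["ART", "PRN"], ["PR", "VB"], ["CNJ"]]

# Hypothetical stand-in for the MT engine: (SL word, tag) -> Catalan output.
TOY_DICTIONARY = {
    ("y", "CNJ"): "i",
    ("la", "ART"): "la",
    ("la", "PRN"): "la",
    ("para", "PR"): "per a",
    ("para", "VB"): "para",
    ("si", "CNJ"): "si",
}

# Hypothetical stand-in for the TL model scores p_TL.
TOY_TL_SCORES = {"i la per a si": 0.0002, "i la para si": 0.9998}


def translate(path):
    """Translate SEGMENT according to one disambiguation path (tuple of tags)."""
    return " ".join(TOY_DICTIONARY[(w, t)] for w, t in zip(SEGMENT, path))


def path_probabilities(ambiguity_classes, translate_fn, tl_likelihood):
    """Return {path: p(path | segment)} over all disambiguation paths."""
    paths = list(product(*ambiguity_classes))
    translations = [translate_fn(g) for g in paths]
    # Assumption: paths with identical translations share the score equally.
    same_translation = Counter(translations)
    scores = [tl_likelihood(t) / same_translation[t] for t in translations]
    total = sum(scores)
    return {g: score / total for g, score in zip(paths, scores)}


PROBS = path_probabilities(AMBIGUITY_CLASSES, translate, TOY_TL_SCORES.get)
```

In this toy setting "la" is a free ride: both of its tags give the same Catalan word, so the TL model cannot prefer one over the other and the ART and PRN paths end up with equal probability.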
Cooperative learning of HMM (I)
• Use of the previous idea ...