The Power of Tree Series Transducers
Andreas Maletti
Technische Universität Dresden, Fakultät Informatik
June 15, 2006
Research funded by the German Research Foundation (DFG GK 334)
Andreas Maletti (TU Dresden) The Power of Tree Series Transducers June 15, 2006 1 / 26
Outline
  1 Motivation
  2 Definition of Tree Series Transducers
  3 Results
Machine Translation
Overview
  - Subfield of computational linguistics
  - Automatic translation should assist the (human) translator
  - Offers several (likely) alternatives
History
  - 1954: High prospects and expectations after the Georgetown Experiment
  - 1966: "Perfect translation" failed (ALPAC report)
  - 1993: Statistical machine translation system [Brown et al 93]
Machine Translation
Problem
Translate text of language X into grammatical text of language Y.
  1 Preserve meaning
  2 Preserve connotation
  3 Preserve style
Relaxed Problem
Transform text of language X into text of language Y such that
  1 the result is grammatical
  2 an expert for X and Y can discern the original sentence
Tree-based Model [Yamada, Knight 01]
[Figure: the parse tree of "He adores listening to music" is transformed step by step (child nodes reordered, Japanese function words inserted, leaves translated) into a tree that yields "kare ha ongaku wo kiku no ga daisuki desu"]
3 phases: (i) Reorder, (ii) Insert, (iii) Translate
Implementation of Phase (i): Reorder
Implementation by top-down tree series transducer
[Figure: derivation step on the parse tree of "He adores listening to music"; state vb rewrites the root VB with weight 0.723 and reorders the subtrees PRP, VB1, VB2 into PRP, VB2, VB1]
Rule applied: $vb(VB(x_1, x_2, x_3)) \xrightarrow{0.723} VB(prp(x_1), vb2(x_3), vb1(x_2))$
Implementation of Phase (i): Reorder
[Figure: derivation step; state prp rewrites the subtree PRP(He) with weight 1 and leaves it unchanged]
Rule applied: $prp(PRP(x_1)) \xrightarrow{1} PRP(x_1)$
Implementation of Phase (i): Reorder
[Figure: derivation step; state vb2 rewrites the subtree VB2 with weight 0.749 and reorders its subtrees VB, TO into TO, VB]
Rule applied: $vb2(VB2(x_1, x_2)) \xrightarrow{0.749} VB2(to(x_2), vb(x_1))$
Implementation of Phase (i): Reorder
[Figure: remaining derivation steps (⇒*); the nodes of the resulting tree carry the weights VB (0.723), PRP (1), VB2 (0.749), VB1 (1), TO (0.893), VB (1), NN (1), TO (1)]
Implementation of Phase (i): Reorder
[Figure: complete derivation (⇒*) from the input parse tree of "He adores listening to music" to the reordered output tree with node weights VB (0.723), PRP (1), VB2 (0.749), VB1 (1), TO (0.893), VB (1), NN (1), TO (1)]
The above reordering has probability: $0.723 \cdot 0.749 \cdot 0.893 = 0.484$
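The probability above is the product of the weights of all rules used in the derivation; rules of weight 1 do not change it. A minimal Python sketch of that arithmetic follows. The list name `rule_weights` is hypothetical, and the weights are the eight node weights shown on the slides.

```python
import math

# Weights of the rules applied in the derivation on the slides:
# VB (0.723), PRP (1), VB2 (0.749), VB1 (1), TO (0.893), VB (1), NN (1), TO (1).
rule_weights = [0.723, 1.0, 0.749, 1.0, 0.893, 1.0, 1.0, 1.0]

# The weight of the whole derivation is the product of the rule weights.
reorder_probability = math.prod(rule_weights)
print(round(reorder_probability, 3))
```

Rounded to three digits this reproduces the value 0.484 stated on the slide.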
Implementation Details
Rules
  Original       Reordered      Probability
  PRP VB1 VB2    PRP VB1 VB2    0.074
                 PRP VB2 VB1    0.723
                 VB1 PRP VB2    0.061
                 VB1 VB2 PRP    0.037
                 VB2 PRP VB1    0.083
                 VB2 VB1 PRP    0.021
  VB TO          VB TO          0.251
                 TO VB          0.749
  TO NN          TO NN          0.107
                 NN TO          0.893
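The rule table above can be read as a lookup from the original child sequence to a distribution over reorderings. The following Python sketch encodes exactly the table entries; the names `REORDER`, `reorder_probability` and `most_likely` are hypothetical and not part of any existing tool.

```python
# Reordering table from the slide:
# original child sequence -> {reordered child sequence: probability}.
REORDER = {
    ("PRP", "VB1", "VB2"): {
        ("PRP", "VB1", "VB2"): 0.074,
        ("PRP", "VB2", "VB1"): 0.723,
        ("VB1", "PRP", "VB2"): 0.061,
        ("VB1", "VB2", "PRP"): 0.037,
        ("VB2", "PRP", "VB1"): 0.083,
        ("VB2", "VB1", "PRP"): 0.021,
    },
    ("VB", "TO"): {("VB", "TO"): 0.251, ("TO", "VB"): 0.749},
    ("TO", "NN"): {("TO", "NN"): 0.107, ("NN", "TO"): 0.893},
}


def reorder_probability(original, reordered):
    """Probability of one reordering; 0.0 if the table has no such entry."""
    return REORDER.get(tuple(original), {}).get(tuple(reordered), 0.0)


def most_likely(original):
    """Most probable reordering of a listed child sequence, with its probability."""
    options = REORDER[tuple(original)]
    return max(options.items(), key=lambda item: item[1])


print(most_likely(("PRP", "VB1", "VB2")))
```

Each row group sums to 1 up to the rounding of the published values (the first group sums to 0.999).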
Tiburon [May, Knight 06]
Overview
  - Implements top-down weighted tree automata (WTA) and top-down tree series transducers (TST) over the probability semiring
  - Operations on WTA: intersection, weighted determinization, pruning
  - Operations on TST: application, composition, training
Applications
  - Used to implement the Yamada-Knight model (the custom implementation took > 1 year, the implementation in Tiburon 2 days)
  - Used to implement Japanese transliteration [Knight, Graehl 98]
Tree Series Transducers
Overview
[Figure: hierarchy of devices with the tree series transducer at the top; below it the weighted transducer, the tree transducer and the weighted tree automaton; below these the generalized sequential machine, the weighted automaton and the tree automaton; the string automaton at the bottom]
History
  - Introduced in [Kuich 99]
  - Extended to full generality in [Engelfriet, Fülöp, Vogler 02]
Semiring
Definition
$(A, +, \cdot, 0, 1)$ is a semiring, if
  - $(A, +, 0)$ is a commutative monoid
  - $(A, \cdot, 1)$ is a monoid
  - $\cdot$ distributes (on both sides) over $+$
  - $0$ is absorbing for $\cdot$ (that is, $a \cdot 0 = 0 = 0 \cdot a$)
Example
  - Natural numbers $(\mathbb{N}, +, \cdot, 0, 1)$
  - Probabilities $([0, 1], \max, \cdot, 0, 1)$
  - Subsets $(\mathcal{P}(A), \cup, \cap, \emptyset, A)$
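The three example semirings can be written down directly as data. The sketch below is only an illustration: the class `Semiring`, the helper `subsets` and the function `spot_check` are hypothetical names, and `spot_check` tests distributivity and the absorbing zero on a few sample elements only, which is not a proof of the axioms.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]
    zero: Any
    one: Any


# The example semirings from the slide.
naturals = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)
probabilities = Semiring(max, lambda a, b: a * b, 0.0, 1.0)


def subsets(universe: frozenset) -> Semiring:
    """Semiring of subsets of `universe` with union and intersection."""
    return Semiring(lambda a, b: a | b, lambda a, b: a & b, frozenset(), universe)


def spot_check(s: Semiring, samples: Iterable[Any]) -> bool:
    """Check absorbing zero and two-sided distributivity on the given samples."""
    samples = list(samples)
    for a in samples:
        if s.times(a, s.zero) != s.zero or s.times(s.zero, a) != s.zero:
            return False
        for b in samples:
            for c in samples:
                if s.times(a, s.plus(b, c)) != s.plus(s.times(a, b), s.times(a, c)):
                    return False
                if s.times(s.plus(a, b), c) != s.plus(s.times(a, c), s.times(b, c)):
                    return False
    return True


print(spot_check(naturals, [0, 1, 2, 5]))
print(spot_check(probabilities, [0.0, 0.25, 0.5, 1.0]))
print(spot_check(subsets(frozenset({1, 2})), [frozenset(), frozenset({1}), frozenset({1, 2})]))
```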
Top-down Tree Series Transducer [Engelfriet et al 02]
Definition
A polynomial top-down tree series transducer is a tuple $(Q, \Sigma, \Delta, \mathcal{A}, I, R)$ where
  - $Q$ is a finite set of states
  - $\Sigma$ and $\Delta$ are the input and output ranked alphabets
  - $\mathcal{A} = (A, +, \cdot, 0, 1)$ is a semiring
  - $I \subseteq Q$ is the set of initial states
  - $R$ is a finite set of rules of the form $q(\sigma(x_1, \ldots, x_k)) \xrightarrow{a} t$ where $t \in T_\Delta(Q(X_k))$
Properties of TST
Definition
A top-down TST $(Q, \Sigma, \Delta, \mathcal{A}, I, R)$ is
  - deterministic, if there is at most one rule with a given left-hand side and at most one initial state
  - linear, if (for every rule) every variable appears at most once in the right-hand side
  - nondeleting, if (for every rule) every variable that occurs in the left-hand side also occurs in the right-hand side
Note
Bottom-up TSTs process the input tree from the leaves toward the root.
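The linear and nondeleting conditions are purely syntactic and can be checked rule by rule. The Python sketch below assumes a hypothetical encoding that is not taken from the slides: a right-hand side is a nested tuple `(label, child, ...)`, a state call is a tuple such as `("vb2", "x3")`, and variables are plain strings.

```python
from collections import Counter


def variables_in(tree, variables):
    """List the occurrences of `variables` in a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree] if tree in variables else []
    # tree[0] is the node label (output symbol or state); the rest are children.
    return [v for child in tree[1:] for v in variables_in(child, variables)]


def is_linear(lhs_vars, rhs):
    """Every left-hand-side variable occurs at most once in the right-hand side."""
    counts = Counter(variables_in(rhs, set(lhs_vars)))
    return all(n <= 1 for n in counts.values())


def is_nondeleting(lhs_vars, rhs):
    """Every left-hand-side variable occurs at least once in the right-hand side."""
    return set(lhs_vars) <= set(variables_in(rhs, set(lhs_vars)))


# The reordering rule vb(VB(x1, x2, x3)) -> VB(prp(x1), vb2(x3), vb1(x2)).
reorder_rhs = ("VB", ("prp", "x1"), ("vb2", "x3"), ("vb1", "x2"))
print(is_linear(["x1", "x2", "x3"], reorder_rhs), is_nondeleting(["x1", "x2", "x3"], reorder_rhs))

# A made-up rule that copies x1 and deletes x2.
copy_rhs = ("D", ("q", "x1"), ("q", "x1"))
print(is_linear(["x1", "x2"], copy_rhs), is_nondeleting(["x1", "x2"], copy_rhs))
```

A transducer has the property if every one of its rules passes the corresponding check.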
Classes of Transformations
Definition
  Denotation                                  Class of transformations computed by    Substitution
  $x\text{-TOP}_\varepsilon(\mathcal{A})$     top-down TST with properties x          ε-substitution
  $x\text{-TOP}_o(\mathcal{A})$               top-down TST with properties x          o-substitution
  $x\text{-BOT}_\varepsilon(\mathcal{A})$     bottom-up TST with properties x         ε-substitution
  $x\text{-BOT}_o(\mathcal{A})$               bottom-up TST with properties x         o-substitution
In the diagrams, $x\text{-TOP}_\omega(\mathcal{A})$ is abbreviated to $x\top_\omega$ and $x\text{-BOT}_\omega(\mathcal{A})$ to $x\bot_\omega$.
Hasse Diagram for Deterministic TST
Probability Semiring and Semiring of Natural Numbers
[Figure: Hasse diagram of the classes of transformations computed by deterministic top-down ($\top$) and bottom-up ($\bot$) TST with the properties d, n, l, t, h under ε- and o-substitution; the diagram layout is not recoverable from the extracted text]
Hasse Diagram for Deterministic TST
Semiring of Subsets
[Figure: Hasse diagram of the classes of transformations computed by deterministic top-down ($\top$) and bottom-up ($\bot$) TST with the properties d, n, l, t, h under ε- and o-substitution; the diagram layout is not recoverable from the extracted text]
Composition of Transformations
Definition
Let $\varphi \colon T_\Sigma \times T_\Delta \to A$ and $\psi \colon T_\Delta \times T_\Gamma \to A$.
The composition of $\varphi$ and $\psi$ is
$(\varphi ; \psi) \colon T_\Sigma \times T_\Gamma \to A, \quad (t, v) \mapsto \sum_{u \in T_\Delta} \varphi(t, u) \cdot \psi(u, v)$
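For transformations with finite support, the sum in the definition can be computed directly. The Python sketch below makes that assumption explicitly: each transformation is a dictionary from pairs of trees (here just strings) to nonzero weights, and the semiring is passed as its `plus`, `times` and `zero`. The function name `compose` and the sample data are hypothetical.

```python
def compose(phi, psi, plus, times, zero):
    """Compute (phi ; psi)(t, v) as the semiring sum over u of phi(t, u) * psi(u, v).

    Assumes phi and psi have finite support, given as {(input, output): weight}.
    """
    result = {}
    for (t, u), a in phi.items():
        for (u2, v), b in psi.items():
            if u == u2:  # only matching intermediate trees contribute
                result[(t, v)] = plus(result.get((t, v), zero), times(a, b))
    return result


phi = {("t", "u1"): 2, ("t", "u2"): 3}
psi = {("u1", "v"): 5, ("u2", "v"): 7}

# Natural numbers (N, +, *, 0, 1): 2*5 + 3*7 = 31.
print(compose(phi, psi, lambda a, b: a + b, lambda a, b: a * b, 0))

# Same shape over ([0, 1], max, *, 0, 1): max(0.5*0.5, 0.25*1.0) = 0.25.
phi_p = {("t", "u1"): 0.5, ("t", "u2"): 0.25}
psi_p = {("u1", "v"): 0.5, ("u2", "v"): 1.0}
print(compose(phi_p, psi_p, max, lambda a, b: a * b, 0.0))
```

Pairs that are absent from the result dictionary have weight 0, the neutral element of the semiring addition.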
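Composition Results
Theorem (see [Kuich 99] and [Engelfriet et al 02])
Let $\mathcal{A}$ be a commutative semiring. Then
  - $\text{nlp-BOT}(\mathcal{A}) \,;\, \text{p-BOT}(\mathcal{A}) = \text{p-BOT}(\mathcal{A})$
  - $\text{p-BOT}(\mathcal{A}) \,;\, \text{bdth-BOT}(\mathcal{A}) = \text{p-BOT}(\mathcal{A})$
Theorem
Let $\mathcal{A}$ be a commutative semiring. Then
  - $\text{lp-BOT}(\mathcal{A}) \,;\, \text{p-BOT}(\mathcal{A}) = \text{p-BOT}(\mathcal{A})$
  - $\text{p-BOT}(\mathcal{A}) \,;\, \text{bd-BOT}(\mathcal{A}) = \text{p-BOT}(\mathcal{A})$
  - $\text{bdt-TOP}(\mathcal{A}) \,;\, \text{lp-TOP}(\mathcal{A}) \subseteq \text{p-TOP}(\mathcal{A})$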