Compositions of Bottom-Up Tree Series Transformations (Andreas Maletti)

  1. Compositions of Bottom-Up Tree Series Transformations
     Andreas Maletti (a), Technische Universität Dresden, Fakultät Informatik, D-01062 Dresden, Germany, maletti@tcs.inf.tu-dresden.de
     May 17, 2005
     Outline: 1. Motivation 2. Semirings, Tree Series, and Tree Series Substitution 3. Bottom-Up Tree Series Transducers 4. Composition Results
     (a) Financially supported by the German Research Foundation (DFG, GK 334)

  2. Motivation: Babel Fish Translation (German to English)
     German: Herzlich willkommen meine sehr geehrten Damen und Herren. Ich möchte mich vorab bei den Organisatoren für die vortrefflich geleistete Arbeit bedanken.
     English: Cordially welcomely my very much honoured ladies and gentlemen. I would like to thank you first the supervisors for the splendid carried out work.
     German: Herzlich willkommen meine sehr geehrten Damen und Herren. Ich möchte mich vorab bei den Organisatoren, die diese Veranstaltung erst ermöglicht haben, für die vortrefflich geleistete Arbeit bedanken.
     English: Cordially welcomely my very much honoured ladies and gentlemen. I would like me first the supervisors, who made this meeting possible only, for whom splendid carried out work thank you.

  3. Motivation
     [Figure: three levels of translation. Conversation level (humans; context, history): "?" to "?". Sentence level (grammar): SENT tree with NP, VP, ADJ, NP nodes to SENT tree with the same node labels (marked ⋆). Word level (Babel Fish): the string "Herzlich" to the string "Cordially".]

  4. Motivation
     • Automatic translation is widely used (even Microsoft uses it to translate English documentation into German)
     • Dictionaries are very powerful word-to-word translators; they leave few words untranslated
     • Outcome is nevertheless usually unhappy and ungrammatical
     • Post-processing is necessary
     Major problem: Ambiguity of natural language
     Common approach:
     • "Soft output" (results equipped with a probability)
     • A human chooses the correct translation among the more likely ones

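  5. Motivation
     [Figure: the same three levels with weighted alternatives. Conversation level (humans; context, history): "?" to "?". Sentence level (grammar): SENT trees with NP, VP, ADJ, NP nodes carry weights 0.2 and 0.4 (marked ⋆); SENT trees with NP, VP, ADV, VP nodes carry weights 0.8 and 0.6. Word level (Babel Fish): "Herzlich ..." to "Cordially ..." with weight 0.9 and to "Heartily ..." with weight 0.1.]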

  6. Motivation
     Tree series transducers are a straightforward generalization of
     (i) tree transducers, which are applied in
     • syntax-directed semantics,
     • functional programming, and
     • XML querying,
     (ii) weighted automata, which are applied in
     • (tree) pattern matching,
     • image compression and speech-to-text processing.

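  7. Generalization Hierarchy
     [Figure: hierarchy of devices, from most general (top) to most specific (bottom).]
     • tree series transducer: $\tau \colon T_\Sigma \to A\langle\langle T_\Delta \rangle\rangle$
     • weighted transducer: $\tau \colon \Sigma^* \to A\langle\langle \Delta^* \rangle\rangle$
     • tree transducer: $\tau \colon T_\Sigma \to \mathbb{B}\langle\langle T_\Delta \rangle\rangle$
     • weighted tree automaton: $L \in A\langle\langle T_\Sigma \rangle\rangle$
     • generalized sequential machine: $\tau \colon \Sigma^* \to \mathbb{B}\langle\langle \Delta^* \rangle\rangle$
     • weighted automaton: $L \in A\langle\langle \Sigma^* \rangle\rangle$
     • tree automaton: $L \in \mathbb{B}\langle\langle T_\Sigma \rangle\rangle$
     • string automaton: $L \in \mathbb{B}\langle\langle \Sigma^* \rangle\rangle$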

  8. Trees
     $\Sigma$ ranked alphabet, $\Sigma_k \subseteq \Sigma$ symbols of rank $k$, $X = \{ x_i \mid i \in \mathbb{N}_+ \}$
     • $T_\Sigma(X)$ set of $\Sigma$-trees indexed by $X$,
     • $T_\Sigma = T_\Sigma(\emptyset)$,
     • $t \in T_\Sigma(X)$ is linear (resp., nondeleting) in $Y \subseteq X$, if every $y \in Y$ occurs at most (resp., at least) once in $t$,
     • $t[t_1, \ldots, t_k]$ denotes the tree substitution of $t_i$ for $x_i$ in $t$
     Example: $\Sigma = \{ \sigma^{(2)}, \gamma^{(1)}, \alpha^{(0)}, \beta^{(0)} \}$ and $Y = \{ x_1, x_2 \}$
     $\gamma(\sigma(x_1, x_2))[\sigma(\alpha, \beta), \gamma(x_1)] = \gamma(\sigma(\sigma(\alpha, \beta), \gamma(x_1)))$
     [Figure: the trees of this example drawn as diagrams.]
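
The tree substitution and the linear/nondeleting conditions above can be sketched in Python. This is only an illustration under assumed conventions that are not from the slides: trees are nested tuples `(symbol, child, ...)`, variables are the strings `"x1"`, `"x2"`, ..., and the function names are hypothetical.

```python
# Assumed encoding (not from the slides): a tree is a tuple (symbol, child, ...),
# a variable x_i is the string "x<i>".

def substitute(t, args):
    """Return t[t_1, ..., t_k]: replace every variable x_i in t by args[i - 1]."""
    if isinstance(t, str):                 # t is a variable x_i
        return args[int(t[1:]) - 1]
    symbol, *children = t
    return (symbol, *(substitute(c, args) for c in children))

def occurrences(t, var):
    """Count how often the variable var occurs in t."""
    if isinstance(t, str):
        return 1 if t == var else 0
    return sum(occurrences(c, var) for c in t[1:])

def is_linear(t, Y):
    """Every y in Y occurs at most once in t."""
    return all(occurrences(t, y) <= 1 for y in Y)

def is_nondeleting(t, Y):
    """Every y in Y occurs at least once in t."""
    return all(occurrences(t, y) >= 1 for y in Y)

alpha, beta = ("alpha",), ("beta",)
t = ("gamma", ("sigma", "x1", "x2"))
result = substitute(t, [("sigma", alpha, beta), ("gamma", "x1")])
print(result)
```

The substitution is simultaneous: a variable inside a substituted tree (here the `x1` in `gamma(x1)`) is not replaced again.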

  9. Semirings
     A semiring is an algebraic structure $\mathcal{A} = (A, \oplus, \odot)$ such that
     • $(A, \oplus)$ is a commutative monoid with neutral element $0$,
     • $(A, \odot)$ is a monoid with neutral element $1$,
     • $0$ is absorbing with respect to $\odot$, and
     • $\odot$ distributes over $\oplus$ (from left and right).
     Examples:
     • semiring of non-negative integers $\mathbb{N}_\infty = (\mathbb{N} \cup \{\infty\}, +, \cdot)$
     • Boolean semiring $\mathbb{B} = (\{0, 1\}, \vee, \wedge)$
     • tropical semiring $\mathbb{T} = (\mathbb{N} \cup \{\infty\}, \min, +)$
     • any ring, field, etc.
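
As a rough illustration of the semiring axioms, the following Python sketch encodes the three example semirings and checks the axioms on a few sample elements. The `Semiring` class, the helper `check_axioms`, and the use of `math.inf` for $\infty$ are assumptions made for this sketch; a check on samples is of course not a proof.

```python
import math
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    add: Callable[[Any, Any], Any]   # the operation "oplus"
    mul: Callable[[Any, Any], Any]   # the operation "odot"
    zero: Any                        # neutral for add, absorbing for mul
    one: Any                         # neutral for mul

def nat_mul(a, b):
    # 0 must be absorbing, but 0 * math.inf is nan in floating point,
    # so the zero case is handled explicitly.
    return 0 if a == 0 or b == 0 else a * b

NAT_INF = Semiring(lambda a, b: a + b, nat_mul, 0, 1)
BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
TROPICAL = Semiring(min, lambda a, b: a + b, math.inf, 0)

def check_axioms(s, samples):
    """Check the semiring axioms on all triples of sample elements."""
    for a, b, c in product(samples, repeat=3):
        ok = (
            s.add(a, b) == s.add(b, a)
            and s.add(s.add(a, b), c) == s.add(a, s.add(b, c))
            and s.mul(s.mul(a, b), c) == s.mul(a, s.mul(b, c))
            and s.add(a, s.zero) == a
            and s.mul(a, s.one) == a and s.mul(s.one, a) == a
            and s.mul(a, s.zero) == s.zero and s.mul(s.zero, a) == s.zero
            and s.mul(a, s.add(b, c)) == s.add(s.mul(a, b), s.mul(a, c))
            and s.mul(s.add(a, b), c) == s.add(s.mul(a, c), s.mul(b, c))
        )
        if not ok:
            return False
    return True

print(check_axioms(NAT_INF, [0, 1, 2, 3, math.inf]))
print(check_axioms(BOOLEAN, [False, True]))
print(check_axioms(TROPICAL, [0, 1, 2, 5, math.inf]))
```

Note that in the tropical semiring the neutral element of $\oplus = \min$ is $\infty$ and the neutral element of $\odot = +$ is $0$.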

  10. Properties of Semirings
     We say that $\mathcal{A}$ is
     • commutative, if $\odot$ is commutative,
     • idempotent, if $a \oplus a = a$ for every $a \in A$,
     • complete, if there is an operation $\sum_I \colon A^I \to A$ such that
       1. $\sum_{i \in \{m, n\}} a_i = a_m \oplus a_n$,
       2. $\sum_{i \in I} a_i = \sum_{j \in J} \bigl( \sum_{i \in I_j} a_i \bigr)$, if $I = \bigcup_{j \in J} I_j$ is a (generalized) partition of $I$, and
       3. $\bigl( \sum_{i \in I} a_i \bigr) \odot \bigl( \sum_{j \in J} b_j \bigr) = \sum_{i \in I, j \in J} (a_i \odot b_j)$.

     Semiring               Commutative   Idempotent   Complete
     $\mathbb{N}_\infty$    YES           no           YES
     $\mathbb{B}$           YES           YES          YES
     $\mathbb{T}$           YES           YES          YES

  11. Tree Series
     $\mathcal{A} = (A, \oplus, \odot)$ semiring, $\Sigma$ ranked alphabet
     Mappings $\varphi \colon T_\Sigma(X) \to A$ are also called tree series
     • the set of all tree series is $A\langle\langle T_\Sigma(X) \rangle\rangle$,
     • the coefficient of $t \in T_\Sigma(X)$ in $\varphi$, i.e., $\varphi(t)$, is denoted by $(\varphi, t)$,
     • the sum is defined pointwise: $(\varphi_1 \oplus \varphi_2, t) = (\varphi_1, t) \oplus (\varphi_2, t)$,
     • the support of $\varphi$ is $\mathrm{supp}(\varphi) = \{ t \in T_\Sigma(X) \mid (\varphi, t) \neq 0 \}$,
     • $\varphi$ is linear (resp., nondeleting) in $Y \subseteq X$, if $\mathrm{supp}(\varphi)$ is a set of trees that are linear (resp., nondeleting) in $Y$,
     • the series $\varphi$ with $\mathrm{supp}(\varphi) = \emptyset$ is denoted by $\widetilde{0}$.
     Example: $\varphi = 1\,\alpha + 1\,\beta + 3\,\sigma(\alpha, \alpha) + \ldots + 3\,\sigma(\beta, \beta) + 5\,\sigma(\alpha, \sigma(\alpha, \alpha)) + \ldots$
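
A small Python sketch of the pointwise sum and the support follows. It assumes, purely for illustration, that a tree series with finite support is stored as a dictionary from trees (nested tuples) to coefficients, with missing trees having coefficient zero; the names `series_sum` and `supp` and the sample series are hypothetical.

```python
# Assumed encoding (not from the slides): a tree is a tuple (symbol, child, ...),
# and a tree series with finite support is a dict {tree: coefficient}.

def series_sum(phi1, phi2, add=lambda a, b: a + b, zero=0):
    """Pointwise sum: (phi1 + phi2, t) = (phi1, t) + (phi2, t)."""
    trees = set(phi1) | set(phi2)
    return {t: add(phi1.get(t, zero), phi2.get(t, zero)) for t in trees}

def supp(phi, zero=0):
    """Support: all trees whose coefficient differs from zero."""
    return {t for t, coeff in phi.items() if coeff != zero}

alpha, beta = ("alpha",), ("beta",)
phi1 = {alpha: 1, ("sigma", alpha, alpha): 3}
phi2 = {alpha: 2, beta: 0}

total = series_sum(phi1, phi2)
print(total[alpha])   # coefficient of alpha in the sum
print(supp(total))    # beta has coefficient 0 and is not in the support
```

The defaults use addition over the non-negative integers; another semiring can be plugged in through the `add` and `zero` parameters.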

  12. Tree Series Substitution
     $\mathcal{A} = (A, \oplus, \odot)$ complete semiring, $\varphi, \psi_1, \ldots, \psi_k \in A\langle\langle T_\Sigma(X) \rangle\rangle$
     Pure substitution of $(\psi_1, \ldots, \psi_k)$ into $\varphi$:
     $\varphi \leftarrow (\psi_1, \ldots, \psi_k) = \sum_{t \in \mathrm{supp}(\varphi),\ (\forall i \in [k])\colon t_i \in \mathrm{supp}(\psi_i)} \bigl( (\varphi, t) \odot (\psi_1, t_1) \odot \cdots \odot (\psi_k, t_k) \bigr)\, t[t_1, \ldots, t_k]$
     Example: $5\,\sigma(x_1, x_1) \leftarrow (2\,\alpha \oplus 3\,\beta) = 10\,\sigma(\alpha, \alpha) \oplus 15\,\sigma(\beta, \beta)$
     [Figure: the same example with the trees drawn as diagrams.]
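
The pure substitution can be sketched in Python for series with finite support, which reproduces the example on this slide over the non-negative integers. The encoding (trees as nested tuples, variables as strings `"x1"`, ..., series as dictionaries) and the names `substitute` and `pure_substitution` are assumptions of this sketch; it also assumes that the trees in $\varphi$ use only the variables $x_1, \ldots, x_k$.

```python
from itertools import product

def substitute(t, args):
    """Tree substitution t[t_1, ..., t_k] on nested tuples; "x<i>" is x_i."""
    if isinstance(t, str):
        return args[int(t[1:]) - 1]
    symbol, *children = t
    return (symbol, *(substitute(c, args) for c in children))

def pure_substitution(phi, psis, add, mul, zero):
    """Pure substitution of (psi_1, ..., psi_k) into phi for finite-support series."""
    # Restrict every psi_i to its support, as (tree, coefficient) pairs.
    supports = [[(t, c) for t, c in psi.items() if c != zero] for psi in psis]
    result = {}
    for t, coeff in phi.items():
        if coeff == zero:
            continue
        # One choice picks a tree t_i from supp(psi_i) for every i.
        for choice in product(*supports):
            weight = coeff
            for _, c in choice:
                weight = mul(weight, c)      # each (psi_i, t_i) is used exactly once
            out = substitute(t, [ti for ti, _ in choice])
            result[out] = add(result.get(out, zero), weight)
    return result

alpha, beta = ("alpha",), ("beta",)
phi = {("sigma", "x1", "x1"): 5}
psi = {alpha: 2, beta: 3}

result = pure_substitution(phi, [psi],
                           add=lambda a, b: a + b,
                           mul=lambda a, b: a * b,
                           zero=0)
print(result)
```

The coefficient of $\psi_1$ enters the product once, even though $x_1$ occurs twice in $\sigma(x_1, x_1)$; this is why the result has coefficients 10 and 15.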

  13. Tree Series Transducers
     Definition: A (bottom-up) tree series transducer (tst) is a system $M = (Q, \Sigma, \Delta, \mathcal{A}, F, \mu)$ where
     • $Q$ is a non-empty set of states,
     • $\Sigma$ and $\Delta$ are input and output ranked alphabets,
     • $\mathcal{A} = (A, \oplus, \odot)$ is a complete semiring,
     • $F \in A\langle\langle T_\Delta(X_1) \rangle\rangle^{Q}$ is a vector of linear and nondeleting tree series, also called final output,
     • tree representation $\mu = (\mu_k)_{k \in \mathbb{N}}$ with $\mu_k \colon \Sigma_k \to A\langle\langle T_\Delta(X_k) \rangle\rangle^{Q \times Q^k}$.
     If $Q$ is finite and $\mu_k(\sigma)_{q, \vec{q}}$ is polynomial, then $M$ is called finite.

  14. Semantics of Tree Series Transducers
     A mapping $r \colon \mathrm{pos}(t) \to Q$ is a run of $M$ on the input tree $t \in T_\Sigma$; $\mathrm{Run}(t)$ is the set of all runs on $t$.
     Evaluation mapping: $\mathrm{eval}_r \colon \mathrm{pos}(t) \to A\langle\langle T_\Delta \rangle\rangle$ defined for every $k \in \mathbb{N}$, $\mathrm{lab}_t(p) \in \Sigma_k$ by
     $\mathrm{eval}_r(p) = \mu_k(\mathrm{lab}_t(p))_{r(p),\, r(p \cdot 1) \ldots r(p \cdot k)} \leftarrow \bigl( \mathrm{eval}_r(p \cdot 1), \ldots, \mathrm{eval}_r(p \cdot k) \bigr)$
     The tree series transformation induced by $M$ is $\|M\| \colon A\langle\langle T_\Sigma \rangle\rangle \to A\langle\langle T_\Delta \rangle\rangle$ defined by
     $\|M\|(\varphi) = \sum_{t \in T_\Sigma} \Bigl( \sum_{r \in \mathrm{Run}(t)} \mathrm{eval}_r(\varepsilon) \Bigr)$

  15. Semantics: Example
     $M = (Q, \Sigma, \Delta, \mathbb{N}_\infty, F, \mu)$ with
     • $Q = \{ \bot, \star \}$,
     • $\Sigma = \{ \sigma^{(2)}, \alpha^{(0)} \}$ and $\Delta = \{ \gamma^{(1)}, \alpha^{(0)} \}$,
     • $F_\bot = \widetilde{0}$ and $F_\star = 1\,x_1$,
     • and tree representation
       $\mu_0(\alpha)_{\bot} = 1\,\alpha$
       $\mu_0(\alpha)_{\star} = 1\,\alpha$
       $\mu_2(\sigma)_{\star, \star\bot} = 1\,x_1$
       $\mu_2(\sigma)_{\star, \bot\star} = 1\,x_2$
       $\mu_2(\sigma)_{\bot, \bot\bot} = 1\,\alpha$

  16. Semantics: Example (cont.)
     [Figure: an input tree $t$ over $\sigma$ and $\alpha$, next to a run $r$ on $t$ that labels each position of $t$ with $\bot$ or $\star$.]
     $\|M\|(1\,t) = 2\,\gamma(\alpha) \oplus 4\,\gamma^3(\alpha)$

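  17. Extension
     $(Q, \Sigma, \Delta, \mathcal{A}, F, \mu)$ tree series transducer, $\vec{q} \in Q^k$, $q \in Q$, $\varphi \in A\langle\langle T_\Sigma(X_k) \rangle\rangle$
     Definition: We define $h_\mu^{\vec{q}} \colon T_\Sigma(X_k) \to A\langle\langle T_\Delta(X_k) \rangle\rangle^{Q}$ by
     $h_\mu^{\vec{q}}(x_i)_q = 1\,x_i$ if $q = q_i$, and $h_\mu^{\vec{q}}(x_i)_q = \widetilde{0}$ otherwise,
     $h_\mu^{\vec{q}}(\sigma(t_1, \ldots, t_k))_q = \sum_{p_1, \ldots, p_k \in Q} \mu_k(\sigma)_{q,\, p_1 \ldots p_k} \leftarrow \bigl( h_\mu^{\vec{q}}(t_1)_{p_1}, \ldots, h_\mu^{\vec{q}}(t_k)_{p_k} \bigr)$
     We define $h_\mu^{\vec{q}} \colon A\langle\langle T_\Sigma(X_k) \rangle\rangle \to A\langle\langle T_\Delta(X_k) \rangle\rangle^{Q}$ by
     $h_\mu^{\vec{q}}(\varphi)_q = \sum_{t \in T_\Sigma(X_k)} (\varphi, t) \odot h_\mu^{\vec{q}}(t)_q$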

  18. Composition Construction
     $M_1 = (Q_1, \Sigma, \Delta, \mathcal{A}, F_1, \mu_1)$ and $M_2 = (Q_2, \Delta, \Gamma, \mathcal{A}, F_2, \mu_2)$ tree series transducers
     Definition: The product of $M_1$ and $M_2$, denoted by $M_1 \cdot M_2$, is the tree series transducer $M = (Q_1 \times Q_2, \Sigma, \Gamma, \mathcal{A}, F, \mu)$ with
     • $F_{pq} = \sum_{i \in Q_2} (F_2)_i \leftarrow \bigl( h_{\mu_2}^{q}((F_1)_p)_i \bigr)$
     • $\mu_k(\sigma)_{pq,\, (p_1 q_1, \ldots, p_k q_k)} = h_{\mu_2}^{q_1 \ldots q_k}\bigl( (\mu_1)_k(\sigma)_{p,\, p_1 \ldots p_k} \bigr)_q$.
