Learning Tree to Word Transducers
LATA 2014
Aurélien Lemay (INRIA Lille)
Joint work with: Grégoire Laurence, Joachim Niehren, Slawek Staworko, Marc Tommasi
March 11, 2014

Learning Tree Transductions
Transforming structured data
Example: an XSLT transformation from XML to XHTML
Many applications, many formalisms... all requiring some expertise
One solution: infer the transformation

Learning Subsequential Transducers [OncinaGarciaVidal93]
Subsequential transducers are learnable from examples with polynomial time and data (Gold model [Gold78])
Two main ideas:
Onward normal form [Choffrut79]: produce the output as soon as possible
[Figure: two subsequential transducers for τ(a^{2n}) = b^n, each with states 0 and 1 and transitions labelled a/ε and a/b, in opposite order]
State merging algorithm: OSTIA [OncinaGarciaVidal93]

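A minimal sketch of the onward idea, assuming the two machines in the figure are the two-state transducers below. The direction of each labelled edge is an assumption, and the names `LATE`, `ONWARD`, and `run` are illustrative only. Both compute τ(a^{2n}) = b^n, but the onward one has already emitted its b after an odd number of a's.

```python
# Each transducer maps (state, input symbol) -> (next state, output word).
# State 0 is both initial and final, so only even-length inputs are accepted.
LATE = {(0, "a"): (1, ""), (1, "a"): (0, "b")}    # emits b on every second a
ONWARD = {(0, "a"): (1, "b"), (1, "a"): (0, "")}  # emits b on every first a


def run(transducer, word, final_states=(0,)):
    """Return (output, accepted) after reading the whole word from state 0."""
    state, output = 0, ""
    for symbol in word:
        state, piece = transducer[(state, symbol)]
        output += piece
    return output, state in final_states


# Both machines agree on every word of the domain a^{2n}.
for n in range(4):
    word = "a" * (2 * n)
    assert run(LATE, word) == run(ONWARD, word) == ("b" * n, True)

# On the odd-length prefix "aaa" the onward machine is one b ahead.
late_prefix = run(LATE, "aaa")[0]
onward_prefix = run(ONWARD, "aaa")[0]
```

Comparing `late_prefix` with `onward_prefix` shows why a normal form is needed before merging states: the two machines are equivalent on complete inputs yet differ in when they produce output.
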
Extensions of OSTIA - two learnable classes
Rational Functions [BoiretLemayNiehren12]
◮ Represented by subsequential transducers with deterministic look-ahead
◮ Normal form (inspired by bimachines [ReutenauerSchutzenberger92])
◮ Learning algorithm ≃ learn the look-ahead, then apply OSTIA
Top-Down Tree-to-Tree Transducers [LemayManethNiehren11]
◮ Earliest normal form [EngelfrietManethSeidl09]: earliest production (produce as 'up' as possible)
◮ Myhill-Nerode kind of theorem in [LemayManethNiehren11]
◮ Learning based on a state merging algorithm

Toward learning MSO tree transformations?
MSO tree transformations [Courcelle92]: an interesting target for learning tree transformations!
The big picture:
MSO tree transformations
≃ Macro Tree Transducers with regular look-ahead (MTT^R) [EngelfrietManeth03]
≃ Top-Down + Concatenation + Look-ahead
Top-Down Tree Transducers: learnable
Look-ahead: learnable
◮ not extended to trees yet
Concatenation in the output: ?

Outline
1 Tree to Word Transducers
2 Normal Form
3 A Myhill-Nerode Theorem
4 Learning Algorithm

Tree to Word Transducers - An example
XML-like serialization
Axiom: q(x0)
q(f(x1, x2)) → <f> · q(x1) · q(x2) · </f>
q(g(x1, x2)) → <g> · q(x1) · q(x2) · </g>
q(a) → <a/>
q(b) → <b/>
Input tree: f(g(a, b), b)
Step by step:
q(f(g(a, b), b))
→ <f> · q(g(a, b)) · q(b) · </f>
→ <f> · q(g(a, b)) · <b/> · </f>
→ <f> · <g> · q(a) · q(b) · </g> · <b/> · </f>
→ <f> <g> <a/> <b/> </g> <b/> </f>

Tree to Word Transducers - Presentation
Axiom: u0 · q(x0) · u1
Rules: q(f(x1, x2)) → u0 · q1(x1) · u1 · q2(x2) · u2
Three restrictions:
Deterministic
Linear (no copy)
Ordered (no swap)
These define deterministic Sequential Tree to Word transducers (STW)

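The definition above can be sketched as a small evaluator. This is an illustrative encoding, not the authors' implementation: the names `Tree`, `xml_rules`, `apply_state`, and `transduce` are assumptions, and the rule table rebuilds the XML-like serialization example. Keying rules by (state, symbol) gives determinism; each right-hand side calls every child once, in order, which matches the linear and ordered restrictions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tree:
    symbol: str
    children: tuple = ()


def xml_rules(symbols_with_arity):
    """Build the serialization rules: one state q, one rule per symbol.

    A right-hand side is a list mixing output words (str) and
    calls (state, child index), read from left to right.
    """
    rules = {}
    for symbol, arity in symbols_with_arity.items():
        if arity == 0:
            rules[("q", symbol)] = [f"<{symbol}/>"]
        else:
            calls = [("q", i) for i in range(arity)]
            rules[("q", symbol)] = [f"<{symbol}>", *calls, f"</{symbol}>"]
    return rules


def apply_state(rules, state, tree):
    """Apply the unique rule for (state, root symbol) and recurse on children."""
    out = []
    for item in rules[(state, tree.symbol)]:
        if isinstance(item, str):
            out.append(item)
        else:
            next_state, index = item
            out.append(apply_state(rules, next_state, tree.children[index]))
    return "".join(out)


def transduce(axiom, rules, tree):
    """Axiom is (u0, state, u1): output u0, run the state, output u1."""
    prefix, state, suffix = axiom
    return prefix + apply_state(rules, state, tree) + suffix


RULES = xml_rules({"f": 2, "g": 2, "a": 0, "b": 0})
EXAMPLE = Tree("f", (Tree("g", (Tree("a"), Tree("b"))), Tree("b")))
SERIALIZED = transduce(("", "q", ""), RULES, EXAMPLE)
```

Here `SERIALIZED` is the word obtained for the input tree f(g(a, b), b) of the serialization example.
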
Normal Form
Earliest STW: produce as soon as possible
Example transformation: count the number of symbols.
τ_count(f(a, f(a, b))) = #####
An STW for τ_count:
Axiom: q
q(f(x1, x2)) → # · q(x1) · q(x2)
q(a) → #
q(b) → #
Not earliest! At least one '#' could be output from the beginning.

Normal Form
Another STW for τ_count:
Axiom: # · q
q(f(x1, x2)) → # · q(x1) · # · q(x2)
q(a) → ε
q(b) → ε
Earliest (Rule 1): produce as 'up' as possible

Normal Form
We want a unique normal form. Do we want
q(f(x1, x2)) → # · q(x1) · # · q(x2)
or
q(f(x1, x2)) → ## · q(x1) · q(x2)
(or another choice?)
Earliest (Rule 2): produce as 'left' as possible

Normal Form
Earliest STW (eSTW): produce as 'up' and as 'left' as possible
Theorem [LaurenceLemayNiehrenStaworkoTommasi11]
For any STW, there exists a unique equivalent minimal eSTW
◮ Possibly of exponential size
The minimal eSTW of τ_count:
Axiom: # · q
q(f(x1, x2)) → ## · q(x1) · q(x2)
q(a) → ε
q(b) → ε

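As a sanity check, the three STWs given for τ_count (the non-earliest one, the one that is only 'up', and the minimal eSTW) can be run side by side. The encoding below is a simplifying assumption that works only because each of these machines has a single state and treats both leaves alike; the names `NAIVE`, `UP`, `CANONICAL`, `run_state`, `transduce`, and `size` are illustrative.

```python
# Trees are nested tuples: a leaf is "a" or "b", an inner node is ("f", left, right).
# A transducer is (axiom prefix, rule for f as (u0, u1, u2), output for leaves).
NAIVE = ("", ("#", "", ""), "#")       # q(f(x1,x2)) -> # q(x1) q(x2), leaves -> #
UP = ("#", ("#", "#", ""), "")         # q(f(x1,x2)) -> # q(x1) # q(x2), leaves -> ε
CANONICAL = ("#", ("##", "", ""), "")  # q(f(x1,x2)) -> ## q(x1) q(x2), leaves -> ε


def run_state(f_rule, leaf_out, tree):
    """Output of the single state q on a tree."""
    if isinstance(tree, str):
        return leaf_out
    u0, u1, u2 = f_rule
    _, left, right = tree
    return (u0 + run_state(f_rule, leaf_out, left)
            + u1 + run_state(f_rule, leaf_out, right) + u2)


def transduce(stw, tree):
    prefix, f_rule, leaf_out = stw
    return prefix + run_state(f_rule, leaf_out, tree)


def size(tree):
    """Number of symbols in the tree."""
    return 1 if isinstance(tree, str) else 1 + size(tree[1]) + size(tree[2])


SAMPLES = ["a", ("f", "a", "b"), ("f", "a", ("f", "a", "b")), ("f", ("f", "a", "b"), "a")]
for t in SAMPLES:
    outputs = {transduce(stw, t) for stw in (NAIVE, UP, CANONICAL)}
    # All three machines define the same transformation: one '#' per symbol.
    assert outputs == {"#" * size(t)}
```

The three machines agree on every sample, so they differ only in where the output is placed, which is exactly what the two earliest rules fix.
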
A Myhill-Nerode Theorem for STW
A constructive algorithm for can(τ) (the minimal eSTW for τ):
◮ builds a transformation τ_p for each input path p
◮ p ≃ p′ iff τ_p = τ_p′
Myhill-Nerode Theorem for STW:
τ is represented by an STW
⇔ ≃ is of finite index
⇔ can(τ) is the minimal eSTW of τ

Building the Axiom
Axiom: lcp(range(τ)) · q_ε · lcs′(range(τ))
lcp: longest common prefix
lcs′: longest common suffix (minus what is in lcp)
For τ_count:
lcp(range(τ_count)) = #
lcs′(range(τ_count)) = ε
Axiom: # · q_ε

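A minimal sketch of the axiom computation on a finite sample of the range. Using a finite sample in place of range(τ) is an assumption made for illustration, and the helper names `lcp` and `lcs_prime` are not from the talk. `os.path.commonprefix` compares strings character by character, so it serves as a plain longest-common-prefix function here.

```python
import os


def lcp(words):
    """Longest common prefix of a non-empty collection of words."""
    return os.path.commonprefix(list(words))


def lcs_prime(words):
    """Longest common suffix of what remains once lcp has been stripped."""
    prefix = lcp(words)
    rests = [w[len(prefix):] for w in words]
    reversed_suffix = os.path.commonprefix([r[::-1] for r in rests])
    return reversed_suffix[::-1]


# Finite sample of range(τ_count): a binary tree has an odd number of symbols.
sample_range = ["#" * n for n in (1, 3, 5, 7)]
axiom = (lcp(sample_range), "q_ε", lcs_prime(sample_range))
```

Stripping the prefix before taking the suffix is what keeps the single '#' of the smallest output from being counted on both sides of the axiom.
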
Building τ_ε
Axiom: # · q_ε
We define τ_ε: for any t, τ_ε(t) = #^{-1} · τ_count(t)
Defining τ_ε:
a → ε
b → ε
f(a, a) → #^2
...
f(f(a, b), a) → #^4
...
τ_ε(t) = #^{|t|−1}

Building Rules for Leaf Symbols
Rules from state q_ε
For leaf symbols:
τ_ε(a) = ε
τ_ε(b) = ε
Rules:
q_ε(a) → ε
q_ε(b) → ε

Building Other Rules (1)
Build the rule q_ε(f(x1, x2)) → u0 · q_(f,1)(x1) · u1 · q_(f,2)(x2) · u2
First, u0 = lcp({τ_ε(f(?, ?))})
Compute u0 from τ_ε(f(?, ?)):
f(a, a) → #^2
...
f(f(a, b), a) → #^4
...
u0 = #^2
q_ε(f(x1, x2)) → #^2 · q_(f,1)(x1) · u1 · q_(f,2)(x2) · u2

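The computation of u0 can be sketched by enumerating small f-rooted trees and taking the longest common prefix of their residual outputs. Bounding the enumeration by height is an assumption made to keep the sample finite, and the helper names `trees`, `tau_count`, and `tau_eps` are illustrative.

```python
import os
from itertools import product


def trees(depth):
    """All trees over {f/2, a/0, b/0} of height at most `depth`, as nested tuples."""
    if depth == 0:
        return ["a", "b"]
    smaller = trees(depth - 1)
    return ["a", "b"] + [("f", left, right) for left, right in product(smaller, repeat=2)]


def tau_count(tree):
    """One '#' per symbol of the tree."""
    if isinstance(tree, str):
        return "#"
    return "#" + tau_count(tree[1]) + tau_count(tree[2])


def tau_eps(tree):
    """Residual after the axiom prefix '#': strip one leading '#'."""
    out = tau_count(tree)
    assert out.startswith("#")
    return out[1:]


# Outputs of τ_ε on all sampled trees of the form f(?, ?).
f_rooted = [t for t in trees(2) if not isinstance(t, str)]
u0 = os.path.commonprefix([tau_eps(t) for t in f_rooted])
```

The smallest f-rooted tree, f(a, a), already bounds the common prefix, so a shallow sample is enough to find u0 for τ_count.
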