Kernel on Automata: Cousins of String Kernels and Dynamic Systems Kernels?


  1. Kernel on Automata: Cousins of String Kernels and Dynamic Systems Kernels? S.V.N. “Vishy” Vishwanathan (vishy@csa.iisc.ernet.in), Indian Institute of Science, Bangalore, India. Joint work with Alex Smola. S.V.N. “Vishy” Vishwanathan: Kernel on Automata, Page 1

  2. Overview: Introduction and Motivation; Definition of Automata; Kernels on Automata; Kernels defined by Automata; Applications of Automata Kernels.

  3. Introduction. Automata are powerful abstractions: HMMs, dynamical systems, graphs, etc. can be viewed as special cases. Sometimes even the input data can be modeled as Automata. Often we want to define kernels by using Automata. We may also want to compare two Automata by defining kernels on them. Our Automata kernels are also related to diffusion kernels on graphs, rational kernels on transducers, and kernels on dynamical systems.

  4. Notation. Characters make up the alphabet set $\Sigma$. A sequence of characters is a string. A string is accepted by an Automaton if there is a sequence of state transitions that leads from the initial state to a final state. The set of all strings accepted by an Automaton defines its language (denoted by $L$). The languages accepted by various families of Automata are well studied. Computers can be modeled as Turing machines, which are a kind of Automaton!

  5. Basic Idea. Given two strings, if the state transitions they induce are similar, then the two strings are similar. If a set of strings results in similar state transitions in two different Automata, then the Automata themselves are similar. Using these two ideas we can talk of kernels defined by Automata and kernels on Automata. This is a very generic framework and does not impose any restrictions on how you define similarity. This means, for example, that time-warped kernels can also be considered for defining similarity.

  6. Finite State Automata (FSA). Mathematical models that describe regular languages. An FSA is denoted by a 5-tuple $(Q, \Sigma, \delta, q_0, F)$, where $Q$ is the finite set of states, $q_0 \in Q$ is the initial state, $F \subseteq Q$ is the set of final states, and $\delta$ is a transition function mapping $Q \times \Sigma \to Q$. In the case of a non-deterministic FSA (NFA), $\delta$ is a transition function $Q \times \Sigma \to 2^Q$. In the case of a weighted FSA we also have weights associated with the transitions.
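
The 5-tuple above can be sketched directly as a small data structure. This is only an illustration, not code from the talk: the class name `DFA`, its field names, and the example automaton `ends_in_a` (strings over {a, b} that end in "a") are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DFA:
    states: frozenset    # Q
    alphabet: frozenset  # Sigma
    delta: dict          # maps (state, symbol) -> state
    start: str           # q_0
    finals: frozenset    # F

    def run(self, string):
        """Return the list of visited states, or None if a transition is missing."""
        path = [self.start]
        for symbol in string:
            key = (path[-1], symbol)
            if key not in self.delta:
                return None
            path.append(self.delta[key])
        return path

    def accepts(self, string):
        """A string is accepted if its state path ends in a final state."""
        path = self.run(string)
        return path is not None and path[-1] in self.finals

# Hypothetical example: accept strings over {a, b} that end in "a".
ends_in_a = DFA(
    states=frozenset({"S", "F"}),
    alphabet=frozenset({"a", "b"}),
    delta={("S", "a"): "F", ("S", "b"): "S", ("F", "a"): "F", ("F", "b"): "S"},
    start="S",
    finals=frozenset({"F"}),
)
```

The state path returned by `run` is the kind of object the later slides compare when they speak of the state transitions induced by a string.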

  7. Finite State Automata contd. Any language accepted by an NFA can also be accepted by a (deterministic) FSA. Addition of ε transitions does not add to the expressive power of either FSA or NFA. Addition of weighted transitions does not add to the expressive power of the NFA. [Figure: state diagram of an example automaton with states S and F; the surviving edge labels are b, a, a, and 1.]
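
The first claim on this slide rests on the standard subset construction: a deterministic machine can track the set of states the NFA could currently be in. A minimal sketch of that idea follows, building the subsets on the fly rather than in advance. The function name `nfa_accepts` and the example NFA (strings over {a, b} whose second-to-last symbol is "a") are hypothetical, and ε transitions are left out for brevity.

```python
def nfa_accepts(delta, start, finals, string):
    """Simulate an NFA by tracking the set of reachable states.

    delta maps (state, symbol) -> set of successor states. Each frozenset
    of NFA states plays the role of one state of the equivalent DFA.
    """
    current = frozenset({start})
    for symbol in string:
        current = frozenset(
            nxt for q in current for nxt in delta.get((q, symbol), ())
        )
    return bool(current & finals)

# Hypothetical NFA: second-to-last symbol must be "a".
delta = {
    ("0", "a"): {"0", "1"},  # non-deterministic guess: this "a" is second to last
    ("0", "b"): {"0"},
    ("1", "a"): {"2"},
    ("1", "b"): {"2"},
}
finals = {"2"}
```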

  8. Kernel Definition. Every $x \in L$ induces a set of state transitions (denoted by $q(x)$) of the form $q_0 Q^k f$. $s \sqsubseteq q(x)$ denotes that $s$ occurs as a sub-sequence of some element of $q(x)$. The generic kernel is defined as
     $$k(x, x') = \sum_{\mathbf{x} \sqsubseteq q(x),\ \mathbf{x}' \sqsubseteq q(x')} \kappa(\mathbf{x}, \mathbf{x}'),$$
     where the sum runs over state sub-sequences $\mathbf{x}$ and $\mathbf{x}'$, and $\kappa(\cdot, \cdot)$ is a kernel function that depends on the application domain. Sometimes a normalizing term is also added. Note the correspondence with the R-Convolution kernels of Haussler.
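
The double sum in the generic kernel can be written out as a brute-force sketch. Two assumptions are made here that the slide does not state: each string induces a single state path (the deterministic case), and "sub-sequence" is read as a contiguous run of states. The function names are hypothetical.

```python
def contiguous_subsequences(path):
    """Yield every non-empty contiguous sub-sequence of a state path as a tuple."""
    n = len(path)
    for i in range(n):
        for j in range(i + 1, n + 1):
            yield tuple(path[i:j])

def automaton_kernel(path_x, path_y, kappa):
    """Sum kappa over all pairs of sub-sequences of the two state paths."""
    return sum(
        kappa(s, t)
        for s in contiguous_subsequences(path_x)
        for t in contiguous_subsequences(path_y)
    )

def exact_match(s, t):
    """One possible base kernel: 1 if the two sub-sequences are identical."""
    return 1.0 if s == t else 0.0
```

With `exact_match` as the base kernel, the result is the number of matching pairs of sub-sequences. The enumeration is quadratic in each path length, so this is meant for reading, not for long sequences.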

  9. Special Cases. Bag of States: counts common states,
     $$\kappa(\mathbf{x}, \mathbf{x}') = \begin{cases} w_{\mathbf{x}}\, \delta_{\mathbf{x}, \mathbf{x}'} & \text{if } \mathbf{x} \in Q \\ 0 & \text{otherwise.} \end{cases}$$
     Bag of State Sub-Sequences: includes context,
     $$\kappa(\mathbf{x}, \mathbf{x}') = w_{\mathbf{x}}\, \delta_{\mathbf{x}, \mathbf{x}'}.$$
     Weights can also be assigned based on the location of the match. Time-warped sequence kernels may also be used, but you have to pay the computational cost. Gap penalties, decay factors, and other fancy ideas can also be used.
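
Both special cases reduce to counting matches, so they can be sketched with counters instead of an explicit double sum. The assumptions here are not from the slide: state paths are plain Python lists, sub-sequences are contiguous and capped at `max_len`, and the weight of a sub-sequence is `decay ** len(s)`. All names are hypothetical.

```python
from collections import Counter

def bag_of_states(path_x, path_y, weight=lambda q: 1.0):
    """Sum of w_q * (occurrences of q in x) * (occurrences of q in y)."""
    cx, cy = Counter(path_x), Counter(path_y)
    return sum(weight(q) * cx[q] * cy[q] for q in cx.keys() & cy.keys())

def bag_of_state_subsequences(path_x, path_y, max_len, decay=1.0):
    """Same idea over contiguous runs of up to max_len states, which adds context."""
    def runs(path):
        return Counter(
            tuple(path[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(path) - n + 1)
        )
    cx, cy = runs(path_x), runs(path_y)
    return sum(decay ** len(s) * cx[s] * cy[s] for s in cx.keys() & cy.keys())
```

For the paths `["S", "F", "S"]` and `["S", "F"]`, the bag-of-states value is 3 (two matches on S, one on F), and allowing runs of length 2 adds one more match for the shared run S, F.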

  10. Context Free Grammar. A Context Free Grammar (CFG) is denoted by $G = (V, T, P, S)$, where $V$ is a finite set of variables, $T$ is a finite set of terminals, $S$ is a special variable called the start symbol, and $P$ is a finite set of productions of the form $A \to \alpha$, where $A$ is a variable and $\alpha$ is a string of symbols from $(V \cup T)^*$.

  11. Context Free Grammar contd. A string $x$ is said to belong to the language if productions in the CFG can derive the string starting from $S$. A parse tree of $x$ is the tree representation of the productions that derive $x$. In the case of an unambiguous CFG, each string $x$ in the language corresponds to a unique parse tree. A Push-down Automaton is an abstraction which can accept the language of an unambiguous CFG.
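
As a concrete instance of a string, its derivation, and its unique parse tree, here is a sketch for one small unambiguous grammar chosen for illustration (it does not appear in the talk): balanced parentheses, with productions S → ( S ) S and S → ε. The recursion stack of the parser stands in for the stack of a push-down automaton. Trees are nested tuples of the form `(label, child, ...)`; the function names are hypothetical.

```python
def parse_s(text, pos=0):
    """Parse S -> '(' S ')' S | epsilon from pos; return (tree, next position)."""
    if pos < len(text) and text[pos] == "(":
        inner, pos = parse_s(text, pos + 1)
        if pos >= len(text) or text[pos] != ")":
            raise ValueError(f"expected ')' at position {pos}")
        rest, pos = parse_s(text, pos + 1)
        return ("S", ("(",), inner, (")",), rest), pos
    return ("S",), pos  # the epsilon production

def parse(text):
    """Return the parse tree of text, or raise ValueError if it is not in the language."""
    tree, pos = parse_s(text)
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return tree
```

Because the grammar is unambiguous, every accepted string yields exactly one tree, which is what makes the tree a well-defined object to compare.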

  12. Kernel using a CFG. Given two strings $x$ and $x'$ in the language, generate their parse trees, then compute the kernel using the parse trees. This is not as simple-minded as it looks: structured languages like XML or HTML are parsed by a Push-down Automaton to produce a DOM, so our idea can also be used to compute kernels between, say, two web pages.
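
The slide does not say which kernel is computed on the parse trees. One simple choice, assumed here purely for illustration, is a bag-of-productions kernel: count how often each (parent label, child labels) pair occurs in each tree and take the dot product of the counts. Trees are nested tuples `(label, child, ...)`, and the two DOM-like example trees are hypothetical.

```python
from collections import Counter

def productions(tree):
    """Count (parent label, tuple of child labels) pairs in a nested-tuple tree."""
    label, children = tree[0], tree[1:]
    counts = Counter({(label, tuple(child[0] for child in children)): 1})
    for child in children:
        counts += productions(child)
    return counts

def tree_kernel(tree_x, tree_y):
    """Dot product of the production counts of the two trees."""
    cx, cy = productions(tree_x), productions(tree_y)
    return sum(cx[p] * cy[p] for p in cx.keys() & cy.keys())

# Hypothetical DOM-like trees for two small pages.
page_a = ("html", ("body", ("p",), ("p",)))
page_b = ("html", ("body", ("p",), ("ul", ("li",))))
```

Here the two pages share the production html → body once and the leaf production for p (twice in `page_a`, once in `page_b`), so `tree_kernel(page_a, page_b)` is 3. Richer tree kernels that match larger subtrees fit the same framework.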

  13. Other Cases. The syntax of a programming language is typically defined by a CFG, which is accepted by some Push-down Automaton. This means we can now compute kernels between, say, two C programs! If we ignore the actual names of the variables, code duplication, plagiarism, etc. can be detected! This also has applications in efficient compression of structured text.
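
To make the idea of ignoring variable names concrete, here is a rough sketch that swaps in Python for C so that the standard-library `ast` module can do the parsing. It compares programs by a normalized bag of syntax-tree node types, which is a much cruder feature map than a real plagiarism detector would use; the function names and the three sample programs are hypothetical.

```python
import ast
import math
from collections import Counter

def node_type_counts(source):
    """Count syntax-tree node types; identifier names are never inspected."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))

def program_kernel(source_x, source_y):
    """Dot product of the node-type counts of two programs."""
    cx, cy = node_type_counts(source_x), node_type_counts(source_y)
    return sum(cx[t] * cy[t] for t in cx.keys() & cy.keys())

def normalized_program_kernel(source_x, source_y):
    """Kernel divided by a normalizing term so that k(x, x) = 1."""
    k_xy = program_kernel(source_x, source_y)
    k_xx = program_kernel(source_x, source_x)
    k_yy = program_kernel(source_y, source_y)
    return k_xy / math.sqrt(k_xx * k_yy)

original = "def area(w, h):\n    return w * h\n"
renamed = "def f(a, b):\n    return a * b\n"
different = "for i in range(3):\n    print(i)\n"
```

The renamed copy has the same node-type counts as the original, so its normalized similarity to the original is 1, while the unrelated loop scores strictly lower.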

  14. Kernels on Dynamical Systems. We consider very simple linear systems described by $x_A(t) := A(t)\, x$ for $A \in \mathcal{A}$. For simplicity, in this talk, we assume that $\mathcal{A}$ consists of only a single transformation. We also assume that noise is absent (for details on how to use noisy models talk to me or Alex!). We define the kernel for this simple case as
      $$k(x, \tilde{x}) := \mathbf{E}_A\left[ k\big((x, A), (\tilde{x}, A)\big) \right].$$

  15. Kernels on Dynamical Systems contd. Cranking a few equations yields a kernel of the form
      $$\sum_{t=0}^{\infty} e^{-\lambda t} \left\langle A^t x_0, \tilde{A}^t \tilde{x}_0 \right\rangle = \operatorname{tr}\left( \tilde{x}_0 x_0^\top M \right).$$
      Here $M$ satisfies
      $$e^{-\lambda} A^\top M \tilde{A} + \mathbf{1} = M,$$
      where $\mathbf{1}$ denotes the identity matrix. Such equations are called Sylvester equations and can be solved in $O(n^3)$ time by using widely available packages. The challenge lies in finding efficient special cases which can be solved cheaply.
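
The link between the infinite sum and the matrix equation can be checked numerically. Writing $M = \sum_t e^{-\lambda t} (A^\top)^t \tilde{A}^t$ gives $x_0^\top M \tilde{x}_0$ for the sum and $M = \mathbf{1} + e^{-\lambda} A^\top M \tilde{A}$ for the recursion. The sketch below solves for $M$ by plain fixed-point iteration, which is a stand-in for the $O(n^3)$ solvers mentioned on the slide, not one of them, and it assumes the iteration converges (roughly, that $e^{-\lambda}$ times the product of the spectral radii of $A$ and $\tilde{A}$ is below 1). All function names and example matrices are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def solve_m(a, a_tilde, lam, iterations=200):
    """Iterate M <- 1 + exp(-lam) * A^T M A~, starting from the identity."""
    n = len(a)
    scale = math.exp(-lam)
    a_t = [list(row) for row in zip(*a)]  # transpose of A
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        prod = matmul(matmul(a_t, m), a_tilde)
        m = [[(1.0 if i == j else 0.0) + scale * prod[i][j]
              for j in range(n)] for i in range(n)]
    return m

def kernel_via_m(x0, x0_tilde, m):
    """x0^T M x0~, which equals tr(x0~ x0^T M)."""
    n = len(x0)
    return sum(x0[i] * m[i][j] * x0_tilde[j] for i in range(n) for j in range(n))

def kernel_via_sum(a, a_tilde, x0, x0_tilde, lam, steps=200):
    """Truncated sum over t of exp(-lam * t) * <A^t x0, A~^t x0~>."""
    def apply(mat, v):
        return [sum(row[k] * v[k] for k in range(len(v))) for row in mat]
    total, x, y = 0.0, list(x0), list(x0_tilde)
    for t in range(steps):
        total += math.exp(-lam * t) * sum(p * q for p, q in zip(x, y))
        x, y = apply(a, x), apply(a_tilde, y)
    return total
```

For small, contracting example matrices the two routes agree to numerical precision, which is a quick sanity check on the signs and transposes in the equation for $M$.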

  16. Summary. Automata are important abstractions. It is important to define similarities using Automata. They are closely related to Dynamical systems kernels.
