Graphical Models over Multiple Strings
Markus Dreyer and Jason Eisner
Center for Language and Speech Processing (CLSP)
Center of Excellence in Language and Speech Processing (COE)
Computer Science Department (CS)
Johns Hopkins University (JHU)
EMNLP 2009
Motivation
• Simple variables, single prediction: in: text, out: topic ID (function)
• Simple variables, joint prediction: in: text, out: tag sequence (CRF, ...)
• Complex variables, single prediction: in: word, out: transliteration, ... (FST, ...)
• Complex variables, joint prediction (out: Y1, Y2, Y3): this talk goes here!
Motivation. Example tasks: Morphology
[Figure: word forms, some of them unknown ("?"), connected by arrows labeled "predict" and "reinforce"]
Motivation. Example tasks: Transliteration
[Figure: four string variables linked by "predict" arrows: English orthography (ice cream), English phonology (ay s k r iy m), Japanese phonology (ay s u k u l iy m u), Japanese orthography. The two pronunciations are hidden.]
• Knight & Graehl 1997
• Add arbitrary piecewise factors!
Motivation. Example tasks
• Further examples:
  • Cognate modeling
  • Multiple-string alignment
  • System combination
Overview
• Motivation
• Model
• Inference & Approximations
• Experiments
• Conclusions
Model. Getting started: 2 strings
• Suppose we have a probability distribution over two string variables $S_1$ and $S_2$.
• Construct a weighted finite-state transducer $F$ that can assign a score to any values $s_1, s_2$ of the strings.
• $\Pr(s_1, s_2) = \frac{1}{Z} F(s_1, s_2)$
• Dreyer, Smith & Eisner, 2008
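
The slide leaves the normalizer $Z$ implicit. Under the usual reading of such a formula (an assumption here, not something the slide states), $Z$ sums the transducer score over all pairs of strings, and the definition only makes sense when that sum is finite:

```latex
\[
  Z \;=\; \sum_{s_1'} \sum_{s_2'} F(s_1', s_2'),
  \qquad
  \Pr(s_1, s_2) \;=\; \frac{F(s_1, s_2)}{Z}.
\]
```
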
Model. 2 strings: An example
• $S_1$ = brechen, $S_2$ = bracht
• One alignment of the two strings, with its score:
  F( b r e c h e n ε
     b r a c h ε ε t ) = 13.26
• Transducer F computes the score by looking at all alignments.
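
To make "looking at all alignments" concrete, here is a minimal sketch that sums the weight of every monotone alignment of two strings with dynamic programming. It assumes a memoryless, one-state edit model; the function name `sum_over_alignments` and the toy weights are hypothetical, and the transducers in the talk are richer than this.

```python
def sum_over_alignments(s1, s2, sub_weight, ins_weight, del_weight):
    """Total weight of all alignments of s1 with s2.

    An alignment is a sequence of substitutions (a:b), deletions (a:eps)
    and insertions (eps:b); its weight is the product of its edit weights.
    """
    rows, cols = len(s1) + 1, len(s2) + 1
    # total[i][j] = summed weight of all alignments of s1[:i] with s2[:j]
    total = [[0.0] * cols for _ in range(rows)]
    total[0][0] = 1.0  # the empty alignment
    for i in range(rows):
        for j in range(cols):
            if i > 0 and j > 0:
                total[i][j] += total[i - 1][j - 1] * sub_weight(s1[i - 1], s2[j - 1])
            if i > 0:
                total[i][j] += total[i - 1][j] * del_weight(s1[i - 1])
            if j > 0:
                total[i][j] += total[i][j - 1] * ins_weight(s2[j - 1])
    return total[-1][-1]


# Hypothetical weights, chosen only for illustration.
def toy_sub(a, b):
    return 3.0 if a == b else 0.5


def toy_gap(c):
    return 0.2


score = sum_over_alignments("brechen", "bracht", toy_sub, toy_gap, toy_gap)
print(score)
```

With all weights set to 1 the function simply counts alignments, which is a quick sanity check on the recurrence.
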
Model. Factor graph examples
• Two strings: $\Pr(s_1, s_2) = \frac{1}{Z} F_1(s_1, s_2)$
• Three strings: $\Pr(s_1, s_2, s_3) = \frac{1}{Z} F_1(s_1, s_2) \, F_2(s_1, s_3)$
• Four strings, one factor per pair: $\Pr(s_1, s_2, s_3, s_4) = \frac{1}{Z} F_1(s_1, s_2) \, F_2(s_1, s_3) \, F_3(s_1, s_4) \, F_4(s_2, s_3) \, F_5(s_3, s_4) \, F_6(s_2, s_4)$
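
The product-of-factors form can be illustrated by brute force when every variable ranges over a short candidate list. This is only a sketch under that assumption: in the talk the variables range over all strings and the factors are transducers, whereas here `toy_score` (a shared-prefix score) and the candidate lists are made up for illustration.

```python
from itertools import product


def shared_prefix_length(a, b):
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def toy_score(a, b):
    """Hypothetical pairwise factor: rewards a long shared prefix."""
    return 2.0 ** shared_prefix_length(a, b)


def joint_distribution(candidates, factors):
    """Normalized product of pairwise factors over finite candidate lists.

    candidates: dict mapping variable name -> list of candidate strings
    factors: list of (var_a, var_b, score_fn) triples
    Returns a dict mapping value tuples (in sorted variable order) to probabilities.
    """
    variables = sorted(candidates)
    unnormalized = {}
    for values in product(*(candidates[v] for v in variables)):
        assignment = dict(zip(variables, values))
        weight = 1.0
        for var_a, var_b, score in factors:
            weight *= score(assignment[var_a], assignment[var_b])
        unnormalized[values] = weight
    z = sum(unnormalized.values())  # the normalizer Z
    return {values: w / z for values, w in unnormalized.items()}


# Four variables and six pairwise factors, as in the last factor graph.
candidates = {
    "S1": ["brechen"],
    "S2": ["bracht", "brecht"],
    "S3": ["brachen", "brechen"],
    "S4": ["brach", "brech"],
}
pairs = [("S1", "S2"), ("S1", "S3"), ("S1", "S4"),
         ("S2", "S3"), ("S3", "S4"), ("S2", "S4")]
factors = [(a, b, toy_score) for a, b in pairs]
distribution = joint_distribution(candidates, factors)
print(max(distribution, key=distribution.get))
```

Enumeration costs time exponential in the number of variables, which is one reason an approximate inference method becomes attractive.
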
Model. Summary
• Our model is formally an undirected graphical model in which the variables are string-valued and the factors (potential functions) are finite-state transducers.
Model. Less formal description
To model multiple strings and their various interactions, we
• build many finite-state transducers, like the ones we presented last year,
• have each of them look at a different string pair,
• plug them together into a big network,
• and coordinate them to predict all strings jointly.
Model. Comparison with k-tape FSM
• Model k strings with a k-tape finite-state machine?
• More than $26^k$ arcs, intractable! (Multiple-sequence alignment)
  b r e ε c h e n ε
  b r ε a c h ε ε t
  b r ε a c h e n ε
  b r ε a c h ε ε ε
• Factored model is more powerful:
  • ☺ Encode swaps and other useful models
  • ☹ Encode undecidable models
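
A quick calculation shows how fast the arc count grows. The sketch assumes that each arc of a k-tape machine is labeled with one symbol or ε per tape, excluding the all-ε label, over a 26-letter alphabet; both function names are hypothetical, and the counts are per state.

```python
from math import comb


def ktape_arc_labels(alphabet_size, k):
    """Distinct arc labels of a k-tape machine: a symbol or epsilon on each tape,
    excluding the label that is epsilon on every tape."""
    return (alphabet_size + 1) ** k - 1


def pairwise_arc_labels(alphabet_size, k):
    """Distinct arc labels summed over one 2-tape transducer per pair of strings."""
    return comb(k, 2) * ktape_arc_labels(alphabet_size, 2)


for k in (2, 3, 4, 5):
    print(k, ktape_arc_labels(26, k), pairwise_arc_labels(26, k))
```

Under these assumptions the k-tape count grows exponentially in k, while the pairwise total grows only quadratically.
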
Overview • Motivation • Model • Inference & Approximations • Experiments • Conclusions
Inference. Overview
[Figure: factor graph over S1, S2, S3, S4 with factors F1 to F6]
• We run Belief Propagation (BP).
• BP is a message-passing algorithm, a generalization of forward-backward.
• BP computes marginals.
• In our version of BP, all messages and beliefs are finite-state machines, which is novel.
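
The message-passing idea can be sketched on a toy version of the problem. The code below is a generic sum-product BP for pairwise factors, and it swaps the finite-state messages of the talk for plain dictionaries over short candidate lists; that substitution, the function `belief_propagation`, and the factor `toy_score` are all assumptions made for illustration. It also assumes strictly positive scores and at most one factor per variable pair.

```python
def shared_prefix_length(a, b):
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def toy_score(a, b):
    """Hypothetical pairwise factor: rewards a long shared prefix."""
    return 2.0 ** shared_prefix_length(a, b)


def belief_propagation(candidates, factors, iterations=10):
    """Sum-product BP with finite domains and pairwise factors.

    candidates: dict mapping variable -> list of candidate strings
    factors: dict mapping (var_a, var_b) -> score_fn(value_a, value_b)
    Returns a dict mapping variable -> {value: approximate marginal}.
    """
    # Each factor yields two directed edges, one message per direction.
    edges = {}
    for (a, b), score in factors.items():
        edges[(a, b)] = score
        edges[(b, a)] = lambda x, y, score=score: score(y, x)
    # messages[(src, dst)][value of dst], initialized to uniform
    messages = {(a, b): {x: 1.0 for x in candidates[b]} for (a, b) in edges}

    def incoming_product(var, value, exclude=None):
        """Product of all messages into var at value, skipping the one from exclude."""
        result = 1.0
        for (src, dst), msg in messages.items():
            if dst == var and src != exclude:
                result *= msg[value]
        return result

    for _ in range(iterations):
        new_messages = {}
        for (a, b), score in edges.items():
            msg = {
                x_b: sum(score(x_a, x_b) * incoming_product(a, x_a, exclude=b)
                         for x_a in candidates[a])
                for x_b in candidates[b]
            }
            total = sum(msg.values())
            new_messages[(a, b)] = {x: w / total for x, w in msg.items()}
        messages = new_messages  # synchronous update

    beliefs = {}
    for var, values in candidates.items():
        raw = {x: incoming_product(var, x) for x in values}
        total = sum(raw.values())
        beliefs[var] = {x: w / total for x, w in raw.items()}
    return beliefs


# Loopy example: four variables, one factor per pair.
candidates = {
    "S1": ["brechen"],
    "S2": ["bracht", "brecht"],
    "S3": ["brachen", "brechen"],
    "S4": ["brach", "brech"],
}
pairs = [("S1", "S2"), ("S1", "S3"), ("S1", "S4"),
         ("S2", "S3"), ("S3", "S4"), ("S2", "S4")]
beliefs = belief_propagation(candidates, {p: toy_score for p in pairs})
print(beliefs["S2"])
```

On a tree-shaped graph the beliefs are exact marginals; on a graph with cycles, like the six-factor example, they are approximations.
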
Inference. Multiple strings
[Figure: S1 = brechen is observed; factors F1, F2, F3 connect S1 to S2, S3, S4, and factors F4, F5 connect the predicted strings to each other]
• Each factor predicts a whole distribution over a string, for example:
  0.20 bracht
  0.13 brechtet
  0.08 brachtet
  ...
• A second predicted distribution:
  0.27 brachen
  0.07 brechten
  ...
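
When a variable receives distributions from several factors, standard BP forms its belief as the normalized pointwise product of the incoming messages. The sketch below does this for messages stored as dictionaries, which is an assumption made here for illustration; with finite-state messages the analogous step would presumably be a weighted intersection. The first message reuses the three values listed on the slide, whose list is truncated, and the second message is invented.

```python
def combine_messages(*messages):
    """Normalized pointwise product of messages (dicts mapping string -> weight).

    Only strings that appear in every message keep nonzero weight.
    """
    support = set(messages[0])
    for message in messages[1:]:
        support &= set(message)
    raw = {s: 1.0 for s in support}
    for message in messages:
        for s in support:
            raw[s] *= message[s]
    total = sum(raw.values())
    return {s: w / total for s, w in raw.items()}


# First message: the three entries visible on the slide.
from_f1 = {"bracht": 0.20, "brechtet": 0.13, "brachtet": 0.08}
# Second message: hypothetical values, not taken from the talk.
from_other_factor = {"bracht": 0.30, "brechtet": 0.05, "brachtet": 0.10}

belief = combine_messages(from_f1, from_other_factor)
print(sorted(belief.items(), key=lambda item: -item[1]))
```
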