
Graphical Models Kalman Filter DBN ML 701 Undirected Models - PowerPoint PPT Presentation

Slides for ML 701 by Anna Goldenberg. Outline: Dynamic Models, Gaussian Linear Models, Kalman Filter, DBN, Undirected Models, Unification, Summary.


  1. Graphical Models: Kalman Filter, DBN, Undirected Models (ML 701, Anna Goldenberg)

     Outline
     - Dynamic Models
     - Gaussian Linear Models
     - Kalman Filter
     - DBN
     - Undirected Models
     - Unification
     - Summary

     HMMs
     An HMM, in short:
     - is a Bayes Net
     - satisfies the Markov property (independence of states given the present)
     - has discrete states (time steps are discrete)

     [Figure: chain of hidden states q_0, q_1, ..., q_t, ..., q_T, each emitting an observation O_0, O_1, ..., O_t, ..., O_T]

     $$P(Q, O) = p(q_0) \prod_{t=1}^{T-1} p(q_{t+1} \mid q_t) \prod_{t=1}^{T} p(O_t \mid q_t)$$

     What about continuous HMMs?
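The HMM factorization can be made concrete with a small sketch. The two-state weather model and all of its probabilities are hypothetical, and the code assumes the products run over every transition and every emission, including those at time 0.

```python
import math

# Hypothetical two-state HMM; every number here is made up for illustration.
initial = {"rainy": 0.6, "sunny": 0.4}            # p(q_0)
transition = {                                    # p(q_{t+1} | q_t)
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}
emission = {                                      # p(O_t | q_t)
    "rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}


def log_joint(states, observations):
    """Return log P(Q, O) for one hidden path and one observation sequence."""
    if not states or len(states) != len(observations):
        raise ValueError("need equally long, non-empty sequences")
    logp = math.log(initial[states[0]])
    # One factor per transition q_t -> q_{t+1}.
    for prev, curr in zip(states, states[1:]):
        logp += math.log(transition[prev][curr])
    # One factor per emission q_t -> O_t.
    for q, o in zip(states, observations):
        logp += math.log(emission[q][o])
    return logp


states = ["rainy", "rainy", "sunny"]
observations = ["clean", "shop", "walk"]
joint = math.exp(log_joint(states, observations))
print(f"P(Q, O) = {joint:.5f}")
```

Working in log space avoids underflow on long sequences; the joint is exponentiated here only for display.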

  2. Example of use
     SLAM: Simultaneous Localization and Mapping
     http://www.stanford.edu/~paskin/slam/
     Drawback: belief state and time grow quadratically in the number of landmarks.

     What about continuous HMMs?
     Gaussian Linear State Space models!

     State Space Models
     [Figure: chain of hidden states q_0, q_1, ..., q_t, ..., q_T, each emitting an observation O_0, O_1, ..., O_t, ..., O_T]
     - q_t is a real-valued K-dimensional hidden state variable
     - O_t is a D-dimensional real-valued observation vector

     $$P(Q, O) = p(q_0) \prod_{t=1}^{T-1} p(q_{t+1} \mid q_t) \prod_{t=1}^{T} p(O_t \mid q_t)$$

  3. State Space Models
     $$q_t = f(q_{t-1}) + w_t$$
     $$O_t = g(q_t) + v_t$$
     - f determines the mean of q_t given the mean of q_{t-1}
     - w_t is a zero-mean random noise vector
     - similarly, g determines the mean of O_t given q_t, and v_t is a zero-mean random noise vector

     Gaussian Linear State Space Models
     [Figure: the state chain with each transition labeled A and each emission labeled B]
     - O_t and q_t are Gaussian
     - f and g are linear and time-invariant

     $$q_t = A q_{t-1} + w_t, \quad w_t \sim N(0, R)$$
     $$O_t = B q_t + v_t, \quad v_t \sim N(0, S)$$
     $$q_0 \sim N(0, \Sigma_0)$$
     - A: transition matrix
     - B: observation matrix
     Correction: R and S were previously reversed.

     Inference: Kalman Filter (1960)
     - forward step (filtering): $p(q_t \mid O_0, \ldots, O_t)$
     - backward step (smoothing): $p(q_t \mid O_t, O_{t+1}, \ldots, O_T)$
     [Figure: nodes q_{t-1} and q_t with observations O_{t-1} and O_t]

     Kalman Filter: time update
     $$P(q_{t-1} \mid O_0, \ldots, O_{t-1}) \to P(q_t \mid O_0, \ldots, O_{t-1})$$
     $$E(q_t \mid t-1) = A \, E(q_{t-1} \mid t-1)$$
     $$V(q_t \mid t-1) = A \, V(q_{t-1} \mid t-1) \, A^T + R$$

     Kalman Filter: measurement update
     $$P(q_t \mid O_0, \ldots, O_{t-1}) \to P(q_t \mid O_0, \ldots, O_t)$$
     1. Form the joint $P(q_t, O_t \mid O_0, \ldots, O_{t-1})$, a Gaussian with mean and covariance
     $$\begin{pmatrix} E(q_t \mid t-1) \\ B \, E(q_t \mid t-1) \end{pmatrix}, \qquad \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} = \begin{pmatrix} V(q_t \mid t-1) & V(q_t \mid t-1) B^T \\ B \, V(q_t \mid t-1) & B \, V(q_t \mid t-1) B^T + S \end{pmatrix}$$
     2. Condition on O_t, using $E(O_t \mid t-1) = B \, E(q_t \mid t-1)$:
     $$E(q_t \mid t) = E(q_t \mid t-1) + \Sigma_{12} \Sigma_{22}^{-1} \left( O_t - E(O_t \mid t-1) \right)$$
     $$V(q_t \mid t) = V(q_t \mid t-1) - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$$
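The time and measurement updates can be sketched for the one-dimensional case, where A, B, R and S are scalars and matrix inverses become divisions. This is a minimal illustration and not the slides' own implementation: it assumes the observation noise variance S (from v_t ~ N(0, S)) is the term added in Σ22, that O_t is predicted as B·E(q_t | t-1), and that the first observation is paired with the prior on q_0. The function names and the sample numbers are hypothetical.

```python
def time_update(mean, var, A, R):
    """Predict step: p(q_{t-1} | O_0..O_{t-1}) -> p(q_t | O_0..O_{t-1})."""
    return A * mean, A * var * A + R


def measurement_update(mean_pred, var_pred, obs, B, S):
    """Correct step: p(q_t | O_0..O_{t-1}) -> p(q_t | O_0..O_t)."""
    sigma12 = var_pred * B           # cov(q_t, O_t)
    sigma21 = B * var_pred           # cov(O_t, q_t), equal to sigma12 for scalars
    sigma22 = B * var_pred * B + S   # var(O_t), assuming observation noise S
    innovation = obs - B * mean_pred  # O_t minus its predicted mean
    mean = mean_pred + sigma12 / sigma22 * innovation
    var = var_pred - sigma12 / sigma22 * sigma21
    return mean, var


def kalman_filter(observations, A, B, R, S, mean0=0.0, var0=1.0):
    """Run the forward (filtering) pass and return (mean, variance) per step."""
    mean, var = mean0, var0
    estimates = []
    for t, obs in enumerate(observations):
        if t > 0:  # q_0 uses the prior directly; later steps predict first
            mean, var = time_update(mean, var, A, R)
        mean, var = measurement_update(mean, var, obs, B, S)
        estimates.append((mean, var))
    return estimates


# Hypothetical noisy readings of a roughly constant quantity.
estimates = kalman_filter([1.1, 0.9, 1.2, 1.0], A=1.0, B=1.0, R=0.01, S=0.5)
for t, (mean, var) in enumerate(estimates):
    print(f"t={t}: mean={mean:.3f}, var={var:.3f}")
```

In the K-dimensional case the same steps apply with matrix products, transposes, and a linear solve in place of the division by `sigma22`.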

  4. Kalman Filter Usage
     - Tracking motion
     - Missiles
     - Hand motion
     - Lip motion from videos
     - Signal Processing
     - Navigation
     - Economics (for prediction)

     Example of use
     Reported by Welch and Bishop, SIGGRAPH 2001

     Dynamic Bayes Nets
     - So far: a single chain of hidden states q_0, q_1, ..., q_T with observations O_0, O_1, ..., O_T
     - But are there more appealing models?
     [Figure (Koller and Friedman): a network unrolled over time slices 0, 1, 2, with variables Weather, Velocity, Location, Failure and Obs in each slice]

     Dynamic Bayes Nets
     - It's just a Bayes Net!
     - Approach to the dynamics:
       1. Start with some prior for the initial state
       2. Predict the next state using only the observations up to the previous time step
       3. Incorporate the new observation and re-estimate the current state
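The three-step approach to the dynamics can be sketched as a discrete predict/update loop over a single hidden variable. The variable (a two-state "Failure" node), the transition table, and the alarm likelihoods below are all hypothetical; a real DBN would factor the state across several variables per time slice.

```python
def predict(belief, transition):
    """Step 2: push the current belief through the transition model."""
    return {
        nxt: sum(belief[cur] * transition[cur][nxt] for cur in belief)
        for nxt in belief
    }


def update(predicted, likelihood):
    """Step 3: weight by the observation likelihood and renormalize."""
    unnormalized = {s: predicted[s] * likelihood[s] for s in predicted}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Step 1: a prior for the initial state (hypothetical numbers).
belief = {"ok": 0.95, "failed": 0.05}

# p(state_{t+1} | state_t): once failed, the system stays failed.
transition = {
    "ok": {"ok": 0.9, "failed": 0.1},
    "failed": {"ok": 0.0, "failed": 1.0},
}

# p(observation | state) for the two possible observations.
likelihoods = {
    "alarm": {"ok": 0.2, "failed": 0.9},
    "quiet": {"ok": 0.8, "failed": 0.1},
}

for observation in ["alarm", "alarm", "quiet"]:
    predicted = predict(belief, transition)
    belief = update(predicted, likelihoods[observation])
    print(observation, {s: round(p, 3) for s, p in belief.items()})
```

The Kalman filter follows the same predict/update pattern, with Gaussian means and covariances in place of the discrete belief table.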

  5. Dynamic Bayes Nets
     [Figure: a network unrolled over time slices 0, 1, 2, with variables Weather, Velocity, Location, Failure and Obs in each slice]
     - It's just a Bayes Net!
     - Approach to the dynamics:
       1. Start with some prior for the initial state
       2. Predict the next state using only the observations up to the previous time step
       3. Incorporate the new observation and re-estimate the current state
     - Most importantly:
       - Use the structure of the Bayes Net.
       - Use the independencies!

     Other graphical models
     But first... any questions so far?

     Are all GM directed?
     There are Undirected Graphical Models!

     Undirected models
     [Figure: undirected graph on nodes A, B, C, D, E]
     $$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$
     - $\psi(X_C)$: non-negative potential function
     - What are C?

  6. Cliques
     [Figure: undirected graph on nodes A, B, C, D, E]
     $$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$
     - $\psi(X_C)$: non-negative potential function
     - A clique C is a subset $C \subseteq V$ such that $(i, j) \in E$ for all distinct $i, j \in C$
     - C is maximal if it is not contained in any other clique

     Which of these hold?
     i) B: a clique?
     ii) BC: a maximal clique?
     iii) ABCD: a clique?
     iv) ABC: a maximal clique?
     v) BCDE: a clique?

     Decomposition
     Note to resolve the confusion: the most common machine learning notation is the decomposition over maximal cliques.
     $$p(A, B, C, D, E) = \frac{1}{Z} \, \psi(A, B, C) \, \psi(B, D) \, \psi(C, E) \, \psi(D, E)$$

     Independence
     Rule: V_1 is independent of V_2 given cutset S.
     S is called the Markov Blanket (MB), e.g. MB(B) = {A, C, D}, i.e. the set of neighbors.
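The clique definitions can be checked mechanically. The sketch below assumes the example graph has the edges A-B, A-C, B-C, B-D, C-E and D-E, which is inferred from the four factors of the maximal-clique decomposition and from MB(B) = {A, C, D}; the figure itself is not available, and the function names are hypothetical.

```python
from itertools import combinations

# Edge set inferred from the decomposition over maximal cliques (an assumption).
EDGES = {
    frozenset(edge)
    for edge in [("A", "B"), ("A", "C"), ("B", "C"),
                 ("B", "D"), ("C", "E"), ("D", "E")]
}
NODES = set().union(*EDGES)


def is_clique(nodes):
    """True if every pair of distinct nodes in the set is joined by an edge."""
    return all(frozenset(pair) in EDGES for pair in combinations(set(nodes), 2))


def is_maximal_clique(nodes):
    """True if the set is a clique and no single node can be added to it."""
    nodes = set(nodes)
    if not is_clique(nodes):
        return False
    return not any(is_clique(nodes | {extra}) for extra in NODES - nodes)


def markov_blanket(node):
    """In an undirected model the Markov blanket is the set of neighbors."""
    return {other for edge in EDGES if node in edge for other in edge if other != node}


for candidate in ["B", "BC", "ABCD", "ABC", "BCDE"]:
    print(candidate, "clique:", is_clique(candidate),
          "maximal:", is_maximal_clique(candidate))
print("MB(B) =", sorted(markov_blanket("B")))
```

Checking single-node extensions is enough for maximality, because any larger clique containing the set would also contain one of those extensions.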

  7. Are undirected models useful?
     - Yes!
     - Used a lot in Physics (Ising model, Boltzmann machine)
     - In vision (every pixel is a node)
     - Bioinformatics
     - Why not more popular? The ZZZZZZ! It's the partition function.
     $$p(X) = \frac{1}{Z} \prod_{C} \psi(X_C)$$

     What's Z and ways to fight it
     $$Z = \sum_{x} \prod_{C} \psi(x_C)$$
     Approximations:
     - Sampling (MCMC sampling is common)
     - Pseudo-Likelihood
     - Mean-field approximation

     Chain Graphs
     - Generalization of MRFs and Bayes Nets
     - Structured as blocks
     - Undirected edges within a block
     - Directed edges between blocks
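To see why Z is the bottleneck, the sketch below computes it by brute force for a five-variable binary model. The clique list, the agreement-rewarding potential, and the binary domain are assumptions chosen for illustration. The sum has 2^5 terms here and doubles with every added variable, which is what motivates the approximations listed above.

```python
from itertools import product

VARIABLES = ["A", "B", "C", "D", "E"]
# Hypothetical clique structure for a small undirected model.
CLIQUES = [("A", "B", "C"), ("B", "D"), ("C", "E"), ("D", "E")]


def potential(values):
    """Hypothetical non-negative potential: favor cliques whose members agree."""
    return 2.0 if len(set(values)) == 1 else 1.0


def unnormalized(assignment):
    """Product of clique potentials for one full assignment."""
    score = 1.0
    for clique in CLIQUES:
        score *= potential(tuple(assignment[v] for v in clique))
    return score


def partition_function():
    """Sum the unnormalized score over every joint assignment (exponential cost)."""
    return sum(
        unnormalized(dict(zip(VARIABLES, values)))
        for values in product([0, 1], repeat=len(VARIABLES))
    )


Z = partition_function()


def probability(assignment):
    """Normalized probability p(X) = (1 / Z) * product of potentials."""
    return unnormalized(assignment) / Z


all_zero = dict.fromkeys(VARIABLES, 0)
print("Z =", Z)
print("p(all zeros) =", probability(all_zero))
```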

  8. Chain Graphs
     - Generalization of MRFs and Bayes Nets
     - Structured as blocks
     - Undirected edges within a block
     - Directed edges between blocks
     - Quite intractable
     - Not very popular
     - Used in BioMedical Engineering (text)
     [Figure: a chain graph, with undirected edges inside blocks and directed edges between blocks]

     Graphical Models
     [Figure: small example graphs over nodes A, B, C, D, each labeled "Directed?" or "Undirected?"]

  9. Summary
     - Graphical Models is a huge, evolving field
     - There are many other variations that haven't been discussed
     - Used extensively in a variety of domains
     - Tractability issues
     - More work to be done!

     Questions?
