

CSE 473: Artificial Intelligence
Hidden Markov Models

Steve Tanimoto --- University of Washington
[Most slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

Hidden Markov Models
- Markov chains are not so useful for most agents: an agent needs observations to update its beliefs.
- Hidden Markov models (HMMs):
  - Underlying Markov chain over states X.
  - You observe outputs (effects) at each time step.
- As a Bayes net (or, more generally, a graphical model):
    X_1 -> X_2 -> X_3 -> X_4 -> X_5, with an observed emission X_t -> E_t at each time step.
  [Figure caption: "Don't complain; the weather could be worse."]

Example: Weather HMM
- Hidden chain Rain_{t-1} -> Rain_t -> Rain_{t+1}, with Umbrella_t observed from Rain_t.
- An HMM is defined by:
  - Initial distribution: P(X_1)
  - Transitions: P(X_t | X_{t-1})
  - Emissions: P(E_t | X_t)
- Transition model P(R_{t+1} | R_t):
    R_t   R_{t+1}   P(R_{t+1} | R_t)
    +r    +r        0.7
    +r    -r        0.3
    -r    +r        0.3
    -r    -r        0.7
- Emission model P(U_t | R_t):
    R_t   U_t   P(U_t | R_t)
    +r    +u    0.9
    +r    -u    0.1
    -r    +u    0.2
    -r    -u    0.8
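For concreteness, these tables can be written down directly in code. A minimal sketch (the dictionary names initial, transition, and emission are my own choices; the uniform initial distribution is taken from the Rain_0 belief used later in the deck):

```python
# Weather HMM from the slides: states '+r'/'-r' (rain / no rain),
# observations '+u'/'-u' (umbrella / no umbrella).

initial = {'+r': 0.5, '-r': 0.5}            # P(R_1), assumed uniform

# transition[prev][next] = P(R_{t+1} = next | R_t = prev)
transition = {
    '+r': {'+r': 0.7, '-r': 0.3},
    '-r': {'+r': 0.3, '-r': 0.7},
}

# emission[state][obs] = P(U_t = obs | R_t = state)
emission = {
    '+r': {'+u': 0.9, '-u': 0.1},
    '-r': {'+u': 0.2, '-u': 0.8},
}
```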

Example: Ghostbusters HMM
- P(X_1) = uniform: 1/9 in each of the nine grid cells.
- P(X' | X): ghosts usually move clockwise, but sometimes move in a random direction or stay put. For example, P(X' | X = <1,2>):
    1/6   1/6   1/2
    0     1/6   0
    0     0     0
- P(E | X): same sensor model as before; red means close, green means far away. For a square at distance 3:
    P(red | 3) = 0.05,   P(orange | 3) = 0.15,   P(yellow | 3) = 0.5,   P(green | 3) = 0.3
  (and the model must be specified for the other distances as well).

Joint Distribution of an HMM
- Joint distribution over three time steps:
    P(X_1, E_1, X_2, E_2, X_3, E_3) = P(X_1) P(E_1 | X_1) P(X_2 | X_1) P(E_2 | X_2) P(X_3 | X_2) P(E_3 | X_3)
- More generally:
    P(X_1, E_1, ..., X_T, E_T) = P(X_1) P(E_1 | X_1) ∏_{t=2..T} P(X_t | X_{t-1}) P(E_t | X_t)
- Questions to be resolved:
  - Does this indeed define a joint distribution?
  - Can every joint distribution be factored this way, or are we making some assumptions about the joint distribution by using this factorization?

Chain Rule and HMMs
- From the chain rule, every joint distribution over X_1, E_1, ..., X_T, E_T can be written as:
    P(X_1, E_1, ..., X_T, E_T) = P(X_1) P(E_1 | X_1) ∏_{t=2..T} P(X_t | X_{1:t-1}, E_{1:t-1}) P(E_t | X_{1:t}, E_{1:t-1})
- Assuming, for all t, that:
  - the state is independent of all past states and all past evidence given the previous state, i.e. P(X_t | X_{1:t-1}, E_{1:t-1}) = P(X_t | X_{t-1}), and
  - the evidence is independent of all past states and all past evidence given the current state, i.e. P(E_t | X_{1:t}, E_{1:t-1}) = P(E_t | X_t),
  gives us the expression posited on the previous slide.

Conditional Independence
- So HMMs have two important independence properties:
  - Markov hidden process: the future depends on the past via the present.
  - The current observation is independent of everything else given the current state.
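This factorization is easy to check numerically. A short sketch that scores a full state/evidence sequence; it assumes the initial, transition, and emission dictionaries from the block above, and joint_probability is a name of my own choosing:

```python
def joint_probability(states, observations, initial, transition, emission):
    """P(x_1, e_1, ..., x_T, e_T) under the HMM factorization:
    P(x_1) P(e_1 | x_1) * prod_t P(x_t | x_{t-1}) P(e_t | x_t)."""
    p = initial[states[0]] * emission[states[0]][observations[0]]
    for prev, cur, obs in zip(states, states[1:], observations[1:]):
        p *= transition[prev][cur] * emission[cur][obs]
    return p

# P(+r, +u, +r, +u) = 0.5 * 0.9 * 0.7 * 0.9 = 0.2835
print(joint_probability(['+r', '+r'], ['+u', '+u'],
                        initial, transition, emission))
```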

Conditional Independence (continued)
- Quiz: does this mean that the evidence variables are guaranteed to be independent?
  [No, they are correlated by the hidden state(s).]

HMM Computations
- Given:
  - the parameters, and
  - evidence E_{1:n} = e_{1:n},
  inference problems include:
  - Filtering: find P(X_t | e_{1:t}) for all t.
  - Smoothing: find P(X_t | e_{1:n}) for all t.
  - Most probable explanation: find x*_{1:n} = argmax_{x_{1:n}} P(x_{1:n} | e_{1:n}).

Filtering / Monitoring
- Filtering, or monitoring, is the task of tracking the distribution B_t(X) = P(X_t | e_1, ..., e_t) (the belief state) over time.
- We start with B_1(X) in an initial setting, usually uniform.
- As time passes, or as we get observations, we update B(X); a brute-force view of this query appears in the sketch below, and the efficient forward algorithm comes later in the deck.
- The Kalman filter was invented in the 1960s and first implemented as a method of trajectory estimation for the Apollo program. (The Kalman filter is a type of HMM with continuous-valued states.)

Real HMM Examples
- Speech recognition HMMs: observations are acoustic signals (continuous valued); states are specific positions in specific words (so, tens of thousands of states).
- Machine translation HMMs: observations are words (tens of thousands); states are translation options.
- Robot tracking: observations are range readings (continuous); states are positions on a map (continuous).
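To see exactly what the filtering query asks for, here is a deliberately brute-force sketch (my own illustration, not an algorithm from the slides): enumerate every state sequence, score each with the hypothetical joint_probability helper above, and marginalize onto the final state. It is exponential in t, which is why the forward algorithm matters:

```python
from itertools import product

def filter_naive(observations, initial, transition, emission):
    """P(X_t | e_{1:t}) for t = len(observations), by enumeration."""
    states = list(initial)
    belief = {x: 0.0 for x in states}
    for seq in product(states, repeat=len(observations)):
        belief[seq[-1]] += joint_probability(list(seq), observations,
                                             initial, transition, emission)
    z = sum(belief.values())                # normalize P(x_t, e_{1:t})
    return {x: p / z for x, p in belief.items()}

print(filter_naive(['+u', '+u'], initial, transition, emission))
# ~= {'+r': 0.883, '-r': 0.117}, matching B(+r) = 0.883 on the weather slide
```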

Example: Robot Localization (example from Michael Pfeiffer)
- Sensor model: the robot can read in which directions there is a wall, with never more than 1 mistaken direction per reading.
- Motion model: the robot may fail to execute an action with small probability.
- t=0: [figure: belief over maze positions, greyscale legend from Prob 0 to 1]
- t=1: [figure] Lighter grey: it was possible to get this reading, but it is less likely because it would have required 1 mistake.
- t=2: [figure]
- t=3: [figure]
- t=4: [figure]
- t=5: [figure]

Inference: Base Cases
- Two base cases: incorporating a single observation, and passing a single time step with no evidence.
- Observation (X_1 with evidence E_1 = e_1):
    P(x_1 | e_1) = P(e_1 | x_1) P(x_1) / P(e_1), i.e. proportional to P(e_1 | x_1) P(x_1)
- Passage of time (X_1 -> X_2, no evidence):
    P(x_2) = Σ_{x_1} P(x_2 | x_1) P(x_1)
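Plugging the weather HMM's numbers into the two base cases (my own worked example; the 0.818 agrees with the weather-HMM slide that follows):

    P(+r | +u) = (0.9)(0.5) / [(0.9)(0.5) + (0.2)(0.5)] = 0.45 / 0.55 ≈ 0.818
    P(X_2 = +r) = (0.7)(0.5) + (0.3)(0.5) = 0.5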

Passage of Time
- Assume we have a current belief P(X | evidence to date): B(X_t) = P(X_t | e_{1:t}).
- Then, after one time step passes:
    P(X_{t+1} | e_{1:t}) = Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})
- Or compactly:
    B'(X_{t+1}) = Σ_{x_t} P(X_{t+1} | x_t) B(x_t)
- Basic idea: beliefs get "pushed" through the transitions.
- With the "B" notation, we have to be careful about which time step t the belief is about, and which evidence it includes.

Observation
- Assume we have a current belief P(X | previous evidence): B'(X_{t+1}) = P(X_{t+1} | e_{1:t}).
- Then, after evidence comes in:
    P(X_{t+1} | e_{1:t+1}) ∝ P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})
- Or compactly:
    B(X_{t+1}) ∝ P(e_{t+1} | X_{t+1}) B'(X_{t+1})
- Basic idea: beliefs are "reweighted" by the likelihood of the evidence.
- Unlike the passage of time, here we have to renormalize.

Example: Passage of Time
- As time passes, uncertainty "accumulates". (Transition model: ghosts usually go clockwise.) [figures: beliefs at T=1, T=2, T=5]

Example: Observation
- As we get observations, beliefs get reweighted and uncertainty "decreases". [figures: beliefs before and after an observation]

Example: Weather HMM
- Rain_0: B(+r) = 0.5, B(-r) = 0.5.
- Passage of time: B'(+r) = 0.5, B'(-r) = 0.5; observe Umbrella_1 = +u: B(+r) = 0.818, B(-r) = 0.182.
- Passage of time: B'(+r) = 0.627, B'(-r) = 0.373; observe Umbrella_2 = +u: B(+r) = 0.883, B(-r) = 0.117.
- (Transition and emission tables as on the earlier weather-HMM slide.)
[Video: demo of the passage of time under the transition model]
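The two updates are a few lines each in code. A sketch with my own function names (predict, update), reusing the dictionaries assumed earlier; on the weather HMM it reproduces the numbers above (0.5 -> 0.818 -> 0.627 -> 0.883):

```python
def predict(belief, transition):
    """Passage of time: B'(X_{t+1}) = sum_x P(X_{t+1} | x) B(x)."""
    return {nxt: sum(transition[x][nxt] * belief[x] for x in belief)
            for nxt in belief}

def update(belief_prime, e, emission):
    """Observation: B(X) proportional to P(e | X) B'(X), renormalized."""
    unnorm = {x: emission[x][e] * p for x, p in belief_prime.items()}
    z = sum(unnorm.values())
    return {x: p / z for x, p in unnorm.items()}

b = {'+r': 0.5, '-r': 0.5}             # Rain_0
for e in ['+u', '+u']:
    b_prime = predict(b, transition)   # B'(+r): 0.5, then 0.627
    b = update(b_prime, e, emission)   # B(+r): 0.818, then 0.883
print(b)                               # ~= {'+r': 0.883, '-r': 0.117}
```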

The Forward Algorithm
- We are given evidence at each time step and want to know B_t(X) = P(X_t | e_{1:t}).
- We can derive the following update, which folds the time and observation steps into one:
    P(x_t, e_{1:t}) = P(e_t | x_t) Σ_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1}, e_{1:t-1})
- We can normalize as we go if we want to have P(x_t | e_{1:t}) at each time step, or just once at the end.
[Video: demo of Pacman – Sonar (with beliefs)]

Online Belief Updates
- Every time step, we start with the current P(X | evidence).
- We update for time:
    P(x_t | e_{1:t-1}) = Σ_{x_{t-1}} P(x_{t-1} | e_{1:t-1}) P(x_t | x_{t-1})
- We update for evidence:
    P(x_t | e_{1:t}) ∝ P(e_t | x_t) P(x_t | e_{1:t-1})
- The forward algorithm does both at once (and doesn't normalize).
- Potential issue: space is O(|X|) and time is O(|X|^2) per time step.

HMM Computations (Reminder)
- Given the parameters and evidence E_{1:n} = e_{1:n}, inference problems include:
  - Filtering: find P(X_t | e_{1:t}) for all t.
  - Smoothing: find P(X_t | e_{1:n}) for all t.
  - Most probable explanation: find x*_{1:n} = argmax_{x_{1:n}} P(x_{1:n} | e_{1:n}).

Pacman – Sonar (P4)
[Demo: Pacman – Sonar – No Beliefs (L14D1)]

Smoothing
- Smoothing is the process of using all of the evidence to obtain better individual estimates for a hidden state (or for all hidden states).
- Idea: run the FORWARD algorithm up until t, and a similar BACKWARD algorithm from the final time step n down to t+1.
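As code, the forward pass is the predict/update loop without per-step normalization. A sketch (forward is my own name for it; same assumed dictionaries), keeping the unnormalized f_t(x) = P(x, e_{1:t}) and normalizing once at the end:

```python
def forward(observations, initial, transition, emission):
    """Unnormalized forward pass: f_t(x) = P(x, e_{1:t})."""
    f = {x: initial[x] * emission[x][observations[0]] for x in initial}
    for e in observations[1:]:
        f = {x: emission[x][e] * sum(transition[xp][x] * f[xp] for xp in f)
             for x in f}
    return f

f = forward(['+u', '+u'], initial, transition, emission)
z = sum(f.values())                        # z = P(e_{1:t})
print({x: p / z for x, p in f.items()})    # ~= {'+r': 0.883, '-r': 0.117}
```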

Most Likely Explanation
- HMMs are defined by:
  - States X
  - Observations E
  - Initial distribution: P(X_1)
  - Transitions: P(X_t | X_{t-1})
  - Emissions: P(E_t | X_t)
- New query: the most likely explanation, x*_{1:n} = argmax_{x_{1:n}} P(x_{1:n} | e_{1:n}).
- New method: the Viterbi algorithm.

State Trellis
- A state trellis is a graph of states and transitions over time. [figure: sun/rain states repeated across four time steps]
- Each arc represents some transition x_{t-1} -> x_t.
- Each arc has weight P(x_t | x_{t-1}) P(e_t | x_t).
- Each path is a sequence of states.
- The product of the weights on a path is that sequence's probability along with the evidence.
- The forward algorithm computes sums over paths; Viterbi computes the best paths.

Forward / Viterbi Algorithms
- Forward algorithm (sum), for filtering:
    f_t(x_t) = P(e_t | x_t) Σ_{x_{t-1}} P(x_t | x_{t-1}) f_{t-1}(x_{t-1})
- Viterbi algorithm (max), for the most probable explanation (sequence):
    m_t(x_t) = P(e_t | x_t) max_{x_{t-1}} P(x_t | x_{t-1}) m_{t-1}(x_{t-1})
- The Viterbi algorithm is very similar to the filtering (FORWARD) algorithm: essentially, replace "sum" with "max" and keep back pointers.
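Following that recipe, a Viterbi sketch (viterbi is my own name; same assumed dictionaries): the forward recursion with the sum replaced by a max, plus back pointers to recover the argmax sequence:

```python
def viterbi(observations, initial, transition, emission):
    """Most likely state sequence argmax_{x_{1:n}} P(x_{1:n} | e_{1:n})."""
    m = {x: initial[x] * emission[x][observations[0]] for x in initial}
    pointers = []                                  # back pointers per step
    for e in observations[1:]:
        m_new, back = {}, {}
        for x in m:
            best = max(m, key=lambda xp: transition[xp][x] * m[xp])
            back[x] = best                         # best predecessor of x
            m_new[x] = emission[x][e] * transition[best][x] * m[best]
        m = m_new
        pointers.append(back)
    path = [max(m, key=m.get)]                     # best final state
    for back in reversed(pointers):                # walk pointers backwards
        path.append(back[path[-1]])
    return list(reversed(path))

print(viterbi(['+u', '+u', '-u'], initial, transition, emission))
# -> ['+r', '+r', '-r'] for umbrella, umbrella, no umbrella
```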
