Markov Decision Processes
Philipp Koehn, 7 April 2020


  1. Markov Decision Processes Philipp Koehn 7 April 2020 Philipp Koehn Artificial Intelligence: Markov Decision Processes 7 April 2020

  2. Outline
  ● Hidden Markov models
  ● Inference: filtering, smoothing, best sequence
  ● Dynamic Bayesian networks
  ● Speech recognition

  3. Time and Uncertainty
  ● The world changes; we need to track and predict it
  ● Diabetes management vs. vehicle diagnosis
  ● Basic idea: sequence of state and evidence variables
  ● X_t = set of unobservable state variables at time t
    e.g., BloodSugar_t, StomachContents_t, etc.
  ● E_t = set of observable evidence variables at time t
    e.g., MeasuredBloodSugar_t, PulseRate_t, FoodEaten_t
  ● This assumes discrete time; step size depends on problem
  ● Notation: X_{a:b} = X_a, X_{a+1}, ..., X_{b-1}, X_b

  4. Markov Processes (Markov Chains)
  ● Construct a Bayes net from these variables: parents?
  ● Markov assumption: X_t depends on a bounded subset of X_{0:t-1}
  ● First-order Markov process: P(X_t | X_{0:t-1}) = P(X_t | X_{t-1})
    Second-order Markov process: P(X_t | X_{0:t-1}) = P(X_t | X_{t-2}, X_{t-1})
  ● Sensor Markov assumption: P(E_t | X_{0:t}, E_{0:t-1}) = P(E_t | X_t)
  ● Stationary process: transition model P(X_t | X_{t-1}) and sensor model P(E_t | X_t) fixed for all t
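A stationary first-order model is fully specified by one transition table and one sensor table that are reused at every time step. A minimal sketch, using the textbook rain/umbrella example as an assumed illustration (its numbers match those on the Hidden Markov Models slide later in this deck):

```python
# Stationary first-order Markov model: the same transition and sensor
# tables apply at every time step t (assumed rain/umbrella example).
P_trans = {True: {True: 0.7, False: 0.3},   # P(Rain_t | Rain_{t-1})
           False: {True: 0.3, False: 0.7}}
P_sensor = {True: {True: 0.9, False: 0.1},  # P(Umbrella_t | Rain_t)
            False: {True: 0.2, False: 0.8}}

# Sanity check: every conditional distribution sums to one.
for table in (P_trans, P_sensor):
    for dist in table.values():
        assert abs(sum(dist.values()) - 1.0) < 1e-9
```

Because the tables do not depend on t, any inference algorithm only needs these two dictionaries, however long the sequence is.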

  5. Example
  ● First-order Markov assumption not exactly true in real world!
  ● Possible fixes:
    1. Increase order of Markov process
    2. Augment state, e.g., add Temp_t, Pressure_t

  6. inference

  7. Inference Tasks
  ● Filtering: P(X_t | e_{1:t})
    belief state, the input to the decision process of a rational agent
  ● Smoothing: P(X_k | e_{1:t}) for 0 ≤ k < t
    better estimate of past states, essential for learning
  ● Most likely explanation: arg max_{x_{1:t}} P(x_{1:t} | e_{1:t})
    speech recognition, decoding with a noisy channel

  8. Filtering
  ● Aim: devise a recursive state estimation algorithm
    P(X_{t+1} | e_{1:t+1}) = P(X_{t+1} | e_{1:t}, e_{t+1})
      = α P(e_{t+1} | X_{t+1}, e_{1:t}) P(X_{t+1} | e_{1:t})                       (Bayes' rule)
      = α P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})                               (sensor Markov assumption)
      = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t, e_{1:t}) P(x_t | e_{1:t})  (multiplying out)
      = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})           (first-order Markov model)
  ● Summary:
    P(X_{t+1} | e_{1:t+1}) = α P(e_{t+1} | X_{t+1}) Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})
                                 (emission)                (transition)   (recursive call)
  ● f_{1:t+1} = FORWARD(f_{1:t}, e_{t+1}) where f_{1:t} = P(X_t | e_{1:t})
    Time and space constant (independent of t)
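One FORWARD step can be sketched directly from the recursion: predict with the transition model, correct with the emission probability, then normalize (the α). A minimal sketch, assuming the textbook rain/umbrella numbers; the function and variable names are illustrative:

```python
def forward(f, e, P_trans, P_sensor):
    """One step of f_{1:t+1} = α P(e|X) Σ_{x_t} P(X|x_t) f_{1:t}(x_t)."""
    states = list(f)
    # transition: sum out the previous state x_t
    pred = {s1: sum(P_trans[s0][s1] * f[s0] for s0 in states) for s1 in states}
    # emission: weight each state by P(e_{t+1} | X_{t+1}), then normalize (α)
    unnorm = {s: P_sensor[s][e] * pred[s] for s in states}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# Rain/umbrella example: uniform prior, umbrella observed on day 1
P_trans = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}
P_sensor = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}
f1 = forward({True: 0.5, False: 0.5}, True, P_trans, P_sensor)
# f1[True] = P(Rain_1 | u_1) ≈ 0.818
```

Each update touches only the previous message f and one observation, which is why time and space stay constant in t.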

  9. Filtering Example
  (figure: two filtering updates, alternating transition and emission steps)

  10. Smoothing
  ● If full sequence is known: what is the state probability P(X_k | e_{1:t}), including future evidence?
  ● Smoothing: sum over all paths

  11. Smoothing
  ● Divide evidence e_{1:t} into e_{1:k}, e_{k+1:t}:
    P(X_k | e_{1:t}) = P(X_k | e_{1:k}, e_{k+1:t})
      = α P(X_k | e_{1:k}) P(e_{k+1:t} | X_k, e_{1:k})
      = α P(X_k | e_{1:k}) P(e_{k+1:t} | X_k)
      = α f_{1:k} b_{k+1:t}
  ● Backward message b_{k+1:t} computed by a backwards recursion:
    P(e_{k+1:t} | X_k) = Σ_{x_{k+1}} P(e_{k+1:t} | X_k, x_{k+1}) P(x_{k+1} | X_k)
      = Σ_{x_{k+1}} P(e_{k+1:t} | x_{k+1}) P(x_{k+1} | X_k)
      = Σ_{x_{k+1}} P(e_{k+1} | x_{k+1}) P(e_{k+2:t} | x_{k+1}) P(x_{k+1} | X_k)
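The backward recursion and the pointwise combination α f_{1:k} b_{k+1:t} can be sketched as follows, again assuming the rain/umbrella numbers; the filtered value 9/11 ≈ 0.818 is the forward-pass result after one umbrella observation, and the names are illustrative:

```python
def backward(b, e, P_trans, P_sensor):
    """One step of P(e_{k+1:t}|X_k) = Σ P(e_{k+1}|x_{k+1}) b(x_{k+1}) P(x_{k+1}|X_k)."""
    states = list(b)
    return {s0: sum(P_sensor[s1][e] * b[s1] * P_trans[s0][s1] for s1 in states)
            for s0 in states}

def smooth(f, b):
    """P(X_k | e_{1:t}) = α f_{1:k} b_{k+1:t} (pointwise product, renormalized)."""
    unnorm = {s: f[s] * b[s] for s in f}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

P_trans = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}
P_sensor = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}
f1 = {True: 9 / 11, False: 2 / 11}  # filtered P(Rain_1 | u_1) from the forward pass
b = backward({True: 1.0, False: 1.0}, True, P_trans, P_sensor)  # umbrella on day 2
smoothed = smooth(f1, b)
# smoothed[True] = P(Rain_1 | u_1, u_2) ≈ 0.883
```

The backward message starts from a vector of ones (an empty future-evidence set) and, like the forward pass, needs only constant work per step.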

  12. Smoothing Example
  Forward–backward algorithm: cache forward messages along the way
  Time linear in t (polytree inference), space O(t |f|)

  13. Most Likely Explanation
  ● Most likely sequence ≠ sequence of most likely states
  ● Most likely path to each x_{t+1} = most likely path to some x_t plus one more step:
    max_{x_1...x_t} P(x_1, ..., x_t, X_{t+1} | e_{1:t+1})
      = P(e_{t+1} | X_{t+1}) max_{x_t} ( P(X_{t+1} | x_t) max_{x_1...x_{t-1}} P(x_1, ..., x_{t-1}, x_t | e_{1:t}) )
  ● Identical to filtering, except f_{1:t} replaced by
    m_{1:t} = max_{x_1...x_{t-1}} P(x_1, ..., x_{t-1}, X_t | e_{1:t})
    i.e., m_{1:t}(i) gives the probability of the most likely path to state i
  ● Update has sum replaced by max, giving the Viterbi algorithm:
    m_{1:t+1} = P(e_{t+1} | X_{t+1}) max_{x_t} ( P(X_{t+1} | x_t) m_{1:t} )
    Also requires back-pointers for a backward pass to retrieve the best sequence:
    b_{X_{t+1}, t+1} = argmax_{x_t} ( P(X_{t+1} | x_t) m_{1:t} )
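A runnable sketch of the Viterbi update with back-pointers, again assuming the rain/umbrella model; on the five-day observation sequence below it recovers the most likely rain sequence. Function and variable names are illustrative:

```python
def viterbi(prior, evidence, P_trans, P_sensor):
    """Most likely state sequence: filtering with the sum replaced by max,
    plus back-pointers to recover the argmax path."""
    states = list(prior)
    m = {s: prior[s] * P_sensor[s][evidence[0]] for s in states}
    pointers = []
    for e in evidence[1:]:
        new_m, ptr = {}, {}
        for s1 in states:
            # max over x_t of P(X_{t+1}=s1 | x_t) m_{1:t}(x_t)
            best = max(states, key=lambda s0: P_trans[s0][s1] * m[s0])
            ptr[s1] = best
            new_m[s1] = P_sensor[s1][e] * P_trans[best][s1] * m[best]
        m = new_m
        pointers.append(ptr)
    # backward pass: follow back-pointers from the best final state
    seq = [max(states, key=lambda s: m[s])]
    for ptr in reversed(pointers):
        seq.append(ptr[seq[-1]])
    return list(reversed(seq))

P_trans = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}
P_sensor = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}
path = viterbi({True: 0.5, False: 0.5},
               [True, True, False, True, True],  # umbrella observations
               P_trans, P_sensor)
# path == [True, True, False, True, True]
```

Note the max-product messages shrink geometrically; a production implementation would work in log space to avoid underflow on long sequences.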

  14. Viterbi Example

  15. Hidden Markov Models
  ● X_t is a single, discrete variable (usually E_t is too)
    Domain of X_t is {1, ..., S}
  ● Transition matrix T_ij = P(X_t = j | X_{t-1} = i), e.g.,
    T = ( 0.7  0.3
          0.3  0.7 )
  ● Sensor matrix O_t for each time step, diagonal elements P(e_t | X_t = i)
    e.g., with U_1 = true, O_1 = diag(0.9, 0.2); with U_1 = false, diag(0.1, 0.8)
  ● Forward and backward messages as column vectors:
    f_{1:t+1} = α O_{t+1} T⊺ f_{1:t}
    b_{k+1:t} = T O_{k+1} b_{k+2:t}
  ● Forward–backward algorithm needs time O(S²t) and space O(St)
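The matrix form of the forward update can be sketched without any library, using the T and O₁ values from this slide as plain nested lists; `forward_step` is an illustrative name:

```python
# Matrix form of the forward update: f_{1:t+1} = α O_{t+1} T⊺ f_{1:t}
T = [[0.7, 0.3],     # T[i][j] = P(X_t = j | X_{t-1} = i)
     [0.3, 0.7]]
O_true = [0.9, 0.2]  # diagonal of O_t when the umbrella is observed

def forward_step(f, O_diag, T):
    n = len(f)
    # T⊺ f : predicted state distribution
    pred = [sum(T[i][j] * f[i] for i in range(n)) for j in range(n)]
    # diagonal O scales by emission probabilities; α renormalizes
    unnorm = [O_diag[j] * pred[j] for j in range(n)]
    z = sum(unnorm)
    return [p / z for p in unnorm]

f1 = forward_step([0.5, 0.5], O_true, T)  # ≈ [0.818, 0.182]
```

Each step is one S×S matrix–vector product plus a diagonal scaling, which is where the O(S²t) total time comes from.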

  16. dynamic bayesian networks

  17. Dynamic Bayesian Networks
  ● X_t, E_t contain arbitrarily many variables in a sequentialized Bayes net

  18. DBNs vs. HMMs
  ● Every HMM is a single-variable DBN; every discrete DBN is an HMM
  ● Sparse dependencies ⇒ exponentially fewer parameters
    e.g., 20 state variables, three parents each:
    DBN has 20 × 2³ = 160 parameters, HMM has 2²⁰ × 2²⁰ ≈ 10¹²
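The parameter counts on this slide follow from simple counting: each boolean variable with three boolean parents needs one number per parent configuration, while the flat HMM needs a full transition matrix over the 2²⁰ joint states. A quick check:

```python
# Parameter comparison for 20 boolean state variables, three parents each
n_vars, n_parents = 20, 3
dbn_params = n_vars * 2 ** n_parents    # one CPT entry per parent configuration
hmm_params = 2 ** n_vars * 2 ** n_vars  # full joint-state transition matrix
# dbn_params == 160; hmm_params == 2**40 ≈ 1.1e12
```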

  19. speech recognition

  20. Speech as Probabilistic Inference
  "It's not easy to wreck a nice beach"
  ● Speech signals are noisy, variable, ambiguous
  ● What is the most likely word sequence, given the speech signal?
    I.e., choose Words to maximize P(Words | signal)
  ● Use Bayes' rule: P(Words | signal) = α P(signal | Words) P(Words)
    i.e., decomposes into acoustic model + language model
  ● Words are the hidden state sequence, signal is the observation sequence

  21. Phones
  ● All human speech is composed from 40–50 phones, determined by the configuration of articulators (lips, teeth, tongue, vocal cords, air flow)
  ● Form an intermediate level of hidden states between words and signal
    ⇒ acoustic model = pronunciation model + phone model
  ● ARPAbet designed for American English:
    [iy]  beat      [b]   bet       [p]   pet
    [ih]  bit       [ch]  Chet      [r]   rat
    [ey]  bet       [d]   debt      [s]   set
    [ao]  bought    [hh]  hat       [th]  thick
    [ow]  boat      [hv]  high      [dh]  that
    [er]  Bert      [l]   let       [w]   wet
    [ix]  roses     [ng]  sing      [en]  button
    ⋮
    e.g., "ceiling" is [s iy l ih ng] / [s iy l ix ng] / [s iy l en]

  22. Speech Sounds
  ● Raw signal is the microphone displacement as a function of time; processed into overlapping 30 ms frames, each described by features
  ● Frame features are typically formants, i.e., peaks in the power spectrum

  23. Speech Spectrogram
