Advanced Hidden Markov Models: The Baum-Welch Algorithm


Biostatistics 615/815, Lecture 23: Advanced Hidden Markov Models and The Baum-Welch Algorithm. Hyun Min Kang. April 12th, 2011.


  1. Biostatistics 615/815 Lecture 23: Advanced Hidden Markov Models and The Baum-Welch Algorithm
  Hyun Min Kang, April 12th, 2011
  Outline: Introduction, Baum-Welch, Implementation, Uniform HMM, CTMP, Summary

  2. Announcements
  • Homework: the final homework is announced. Implement two among the E-M algorithm, Simulated Annealing, and the Gibbs Sampler.
  • 815 Project: presentation on Tuesday, April 19th; final report due Friday, April 29th.
  • Final Exam: Thursday, April 21st, 10:30AM-12:30PM.

  3. Key components in 815 Presentations
  • Duration: 15 minutes
  • Describe / illustrate what the problem is
  • Key idea
  • Results
  • Challenges and lessons from implementations
  • Comparisons with other alternatives (if possible)

  4. Recap: The Gibbs Sampler Algorithm
  1. Consider a particular choice of parameter values λ^(t).
  2. Define the next set of parameter values by:
     • selecting a component to update, say i;
     • sampling a value for λ_i^(t+1) from p(λ_i | x, λ_1, ⋯, λ_{i−1}, λ_{i+1}, ⋯, λ_k).
  3. Increment t and repeat the previous steps.

  5. Recap: Gibbs Sampling for a Gaussian Mixture
  • Observed data: x = (x_1, ⋯, x_n)
  • Parameters: z = (z_1, ⋯, z_n), where z_i ∈ {1, ⋯, k}
  • Sample each z_i conditioned on all the other z.
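To make the update concrete, here is a minimal NumPy sketch of one Gibbs sweep for a univariate Gaussian mixture. It is not from the lecture: it assumes a known common variance and a simple count-proportional prior over components, both purely illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_step(x, z, k, sigma=1.0):
    """One Gibbs sweep: resample each z_i given all other assignments.

    Component means are recomputed from the other points' current
    assignments, so each z_i is drawn from p(z_i | x, z_{-i}).
    """
    n = len(x)
    for i in range(n):
        logp = np.empty(k)
        for c in range(k):
            others = (z == c)
            others[i] = False            # exclude point i itself
            n_c = others.sum()
            mu_c = x[others].mean() if n_c > 0 else 0.0
            # illustrative prior ∝ (n_c + 1): favors populated components
            logp[c] = np.log(n_c + 1.0) - 0.5 * ((x[i] - mu_c) / sigma) ** 2
        p = np.exp(logp - logp.max())
        z[i] = rng.choice(k, p=p / p.sum())
    return z

# toy run: two well-separated clusters
x = np.concatenate([rng.normal(-3, 1, 50), rng.normal(3, 1, 50)])
z = rng.integers(0, 2, size=len(x))
for _ in range(20):
    z = gibbs_step(x, z, k=2)
print("cluster sizes:", np.bincount(z, minlength=2))
```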

  6. Recap: Simulated Annealing and the Gibbs Sampler
  Both methods are Markov Chains:
  • The distribution of λ^(t) only depends on λ^(t−1).
  • The update rule defines the transition probabilities between two states, requiring aperiodicity and irreducibility.
  Both methods are Metropolis-Hastings algorithms:
  • Acceptance of a proposed update is probabilistically determined by the relative probabilities between the original and proposed states.

  7. Today: Advanced HMM
  • The Baum-Welch algorithm: an E-M algorithm for HMM parameter estimation
  • The three main HMM algorithms:
     • The forward-backward algorithm
     • The Viterbi algorithm
     • The Baum-Welch algorithm
  • Expedited inference with uniform HMM
  • Continuous-time Markov Process

  8. Revisiting the Hidden Markov Model
  [Figure: HMM as a graphical model. Hidden states q_1, ⋯, q_T form a Markov chain with initial distribution π and transition probabilities a_ij; each state q_t emits the observed datum o_t with emission probability b_{q_t}(o_t).]

  9. Statistical analysis with HMM
  HMM for a deterministic problem:
  • Given parameters λ = {π, A, B} and data o = (o_1, ⋯, o_T)
  • Forward-backward algorithm: compute Pr(q_t | o, λ)
  • Viterbi algorithm: compute arg max_q Pr(q | o, λ)
  HMM for a stochastic process / algorithm:
  • Generate random samples of o given λ
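As an illustration of the Viterbi step above, here is a minimal log-space sketch. It is not from the lecture, and it uses the row convention A[i, j] = Pr(q_{t+1} = j | q_t = i), which is the transpose of the slide's a_ij.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden path arg max_q Pr(q | o, lambda), in log space.

    pi: (n,) initial probabilities; A[i, j] = Pr(q_{t+1}=j | q_t=i);
    B[i, k] = Pr(o_t=k | q_t=i); obs: sequence of observed symbol indices.
    """
    n, T = len(pi), len(obs)
    logA, logB = np.log(A), np.log(B)
    delta = np.log(pi) + logB[:, obs[0]]   # best log-prob ending in each state
    psi = np.zeros((T, n), dtype=int)      # back-pointers
    for t in range(1, T):
        scores = delta[:, None] + logA     # scores[i, j]: come from i, land in j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]           # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```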

  10. Deterministic Inference using HMM
  • If we know the exact set of parameters, the inference is deterministic.
  • No stochastic process is involved in the inference procedure.
  • Inference is deterministic, just as the estimation of a sample mean is deterministic given data.
  • The computational complexity of the inference procedure is exponential using naive algorithms.
  • Using dynamic programming, the complexity can be reduced to O(n²T).
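The forward recursion shows where the O(n²T) bound comes from: each of the T steps is one n × n matrix-vector product. A minimal sketch (same row convention as above; unscaled for brevity, so long sequences would need rescaling or log-space arithmetic):

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward DP: alpha[t, i] = Pr(o_1..o_t, q_t = i | lambda).

    Each of the T steps costs O(n^2) via the matrix-vector product,
    so the total cost is O(n^2 T) instead of exponential enumeration.
    """
    T, n = len(obs), len(pi)
    alpha = np.zeros((T, n))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha  # Pr(o | lambda) = alpha[-1].sum()
```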

  11. Using Stochastic Process for HMM Inference
  Using a random process for the inference:
  • Randomly sampling o from Pr(o | λ).
  • Estimating arg max_λ Pr(o | λ):
     • No deterministic algorithm is available.
     • Simplex, the E-M algorithm, or Simulated Annealing can be applied.
  • Estimating the distribution Pr(λ | o):
     • Gibbs Sampling

  12. Recap: The E-M Algorithm
  Expectation step (E-step):
  • Given the current estimates of the parameters θ^(t), calculate the conditional distribution of the latent variable z.
  • Then the expected log-likelihood of the data given the conditional distribution of z can be obtained:
     Q(θ | θ^(t)) = E_{z | x, θ^(t)} [ log p(x, z | θ) ]
  Maximization step (M-step):
  • Find the parameter that maximizes the expected log-likelihood:
     θ^(t+1) = arg max_θ Q(θ | θ^(t))

  13. Assumptions
  Baum-Welch estimates arg max_λ Pr(o | λ) under these assumptions:
  • The transition matrix is identical across time points:
     a_ij = Pr(q_{t+1} = i | q_t = j) = Pr(q_t = i | q_{t−1} = j)
  • The emission matrix is identical across time points:
     b_i(j) = Pr(o_t = j | q_t = i) = Pr(o_{t−1} = j | q_{t−1} = i)
  • This is NOT the only possible assumption:
     • For example, a_ij can be parameterized as a function of t.
     • Multiple sets of o independently drawn from the same distribution can be provided.
     • Other assumptions will result in different formulations of the E-M algorithm.

  14. E-step of the Baum-Welch Algorithm
  1. Run the forward-backward algorithm given λ^(τ):
     α_t(i) = Pr(o_1, ⋯, o_t, q_t = i | λ^(τ))
     β_t(i) = Pr(o_{t+1}, ⋯, o_T | q_t = i, λ^(τ))
     γ_t(i) = Pr(q_t = i | o, λ^(τ)) = α_t(i) β_t(i) / Σ_k α_t(k) β_t(k)
  2. Compute ξ_t(i, j) using α_t(i) and β_t(i):
     ξ_t(i, j) = Pr(q_t = i, q_{t+1} = j | o, λ^(τ))
               = α_t(i) a_ji b_j(o_{t+1}) β_{t+1}(j) / Pr(o | λ^(τ))
               = α_t(i) a_ji b_j(o_{t+1}) β_{t+1}(j) / Σ_{(k,l)} α_t(k) a_lk b_l(o_{t+1}) β_{t+1}(l)
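As a concrete reference, here is a minimal NumPy sketch of this E-step. It is not from the lecture: it keeps the row convention A[i, j] = Pr(q_{t+1} = j | q_t = i) (the transpose of the slide's a_ij, so xi[t][i, j] matches the slide's ξ_t(i, j)), and omits the rescaling a real implementation needs for long sequences.

```python
import numpy as np

def e_step(pi, A, B, obs):
    """Compute gamma[t, i] and xi[t, i, j] via forward-backward.

    A[i, j] = Pr(q_{t+1}=j | q_t=i); B[i, k] = Pr(o_t=k | q_t=i).
    Unscaled for clarity; long sequences need rescaling or logs.
    """
    T, n = len(obs), len(pi)
    alpha = np.zeros((T, n))
    beta = np.zeros((T, n))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    like = alpha[-1].sum()                  # Pr(o | lambda)
    gamma = alpha * beta / like             # gamma[t, i] = Pr(q_t=i | o, lambda)
    xi = np.zeros((T - 1, n, n))            # xi[t, i, j] = Pr(q_t=i, q_{t+1}=j | o)
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
        xi[t] /= like
    return gamma, xi, like
```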

  15. M-step of the Baum-Welch Algorithm
  Let λ^(τ+1) = (π^(τ+1), A^(τ+1), B^(τ+1)):
     π^(τ+1)(i) = (1/T) Σ_{t=1}^{T} Pr(q_t = i | o, λ^(τ)) = (1/T) Σ_{t=1}^{T} γ_t(i)
     a_ij^(τ+1) = Σ_{t=1}^{T−1} Pr(q_t = j, q_{t+1} = i | o, λ^(τ)) / Σ_{t=1}^{T−1} Pr(q_t = j | o, λ^(τ)) = Σ_{t=1}^{T−1} ξ_t(j, i) / Σ_{t=1}^{T−1} γ_t(j)
     b_i(k)^(τ+1) = Σ_{t=1}^{T} Pr(q_t = i, o_t = k | o, λ^(τ)) / Σ_{t=1}^{T} Pr(q_t = i | o, λ^(τ)) = Σ_{t=1}^{T} γ_t(i) I(o_t = k) / Σ_{t=1}^{T} γ_t(i)
  • A detailed derivation can be found in Welch, "Hidden Markov Models and The Baum Welch Algorithm", IEEE Information Theory Society Newsletter, Dec 2003.
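A matching M-step sketch, consuming the e_step output above. Note the slide's π update averages γ_t(i) over all time points (some textbook treatments instead use γ_1(i)); the sketch follows the slide, again with A in the row convention.

```python
import numpy as np

def m_step(gamma, xi, obs, n_symbols):
    """Re-estimate (pi, A, B) from the E-step posteriors.

    gamma: (T, n); xi: (T-1, n, n) with xi[t, i, j] = Pr(q_t=i, q_{t+1}=j | o).
    """
    T, n = gamma.shape
    pi_new = gamma.sum(axis=0) / T                       # averaged over t, per slide
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros((n, n_symbols))
    for k in range(n_symbols):
        mask = (np.asarray(obs) == k)                    # I(o_t = k)
        B_new[:, k] = gamma[mask].sum(axis=0)
    B_new /= gamma.sum(axis=0)[:, None]
    return pi_new, A_new, B_new
```

Alternating e_step and m_step until Pr(o | λ^(τ)) stops increasing gives the full Baum-Welch iteration; as an E-M algorithm, each iteration is guaranteed not to decrease the likelihood.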
