Automatic Speech Recognition (CS753), Lecture 5: Hidden Markov Models (Part I)


  1. Automatic Speech Recognition (CS753), Lecture 5: Hidden Markov Models (Part I). Instructor: Preethi Jyothi


  2. OpenFst Cheat Sheet

  3. Quick Intro to OpenFst (www.openfst.org). The label "0" is reserved for epsilon.
   Text FST (A.txt), one arc per line giving source state, destination state, input label, output label; a line with a single state marks it as final:
     0 1 an a
     1 2 <eps> n
     0 2 a a
     2
   Input alphabet (in.txt), mapping each input symbol to an integer id:
     <eps> 0
     an 1
     a 2
   Output alphabet (out.txt):
     <eps> 0
     a 1
     n 2

  4. Quick Intro to OpenFst (www.openfst.org). Weighted version of the same FST: each arc line carries an additional weight, and the final state carries a final weight (here state 2 with final weight 0.1):
     0 1 an a 0.5
     1 2 <eps> n 1.0
     0 2 a a 0.5
     2 0.1

  5. Compiling & Printing FSTs. The text FSTs need to be "compiled" into binary objects before further use with OpenFst utilities. Command used to compile:
     • fstcompile --isymbols=in.txt --osymbols=out.txt A.txt A.fst
   Get back the text FST by running the print command on the binary file:
     • fstprint --isymbols=in.txt --osymbols=out.txt A.fst A.txt

  6. Drawing FSTs. Small FSTs can be visualized easily using the draw tool:
     fstdraw --isymbols=in.txt --osymbols=out.txt A.fst | dot -Tpdf > A.pdf
   (The resulting figure shows the three-state FST from above: 0 → 1 labeled an:a, 1 → 2 labeled <eps>:n, 0 → 2 labeled a:a.)

  7. Fairly large FST!

  8. Hidden Markov Models (HMMs). Following slides contain figures/material from "Hidden Markov Models", Chapter 9, "Speech and Language Processing", D. Jurafsky and J. H. Martin, 2016. (https://web.stanford.edu/~jurafsky/slp3/9.pdf)

  9. Markov Chains
   Q = q_1 q_2 ... q_N : a set of N states
   A = a_01 a_02 ... a_n1 ... a_nn : a transition probability matrix A, each a_ij representing the probability of moving from state i to state j, s.t. Σ_{j=1}^{n} a_ij = 1 for all i
   π = π_1, π_2, ..., π_N : an initial probability distribution over states. π_i is the probability that the Markov chain will start in state i. Some states j may have π_j = 0, meaning that they cannot be initial states. Also, Σ_{i=1}^{n} π_i = 1
   q_0, q_F : a special start state and end (final) state that are not associated with observations
   QA = {q_x, q_y, ...} : a set QA ⊂ Q of legal accepting states
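   As a quick illustration of these components, here is a minimal Python sketch that samples a state sequence from a Markov chain given π and A; the state names and probability values in the example call are made up for illustration and are not part of the lecture:

    import random

    def sample_markov_chain(pi, A, length):
        # pi[s] is the probability of starting in state s; A[s][t] is the
        # probability of moving from state s to state t. Each row of A,
        # and pi itself, must sum to 1.
        def draw(dist):
            r, total = random.random(), 0.0
            for state, p in dist.items():
                total += p
                if r < total:
                    return state
            return state  # guard against floating-point round-off

        seq = [draw(pi)]
        for _ in range(length - 1):
            seq.append(draw(A[seq[-1]]))
        return seq

    # Hypothetical two-state chain, just to show the calling convention.
    print(sample_markov_chain({"Rain": 0.5, "Sun": 0.5},
                              {"Rain": {"Rain": 0.7, "Sun": 0.3},
                               "Sun":  {"Rain": 0.2, "Sun": 0.8}},
                              5))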

  10. Hidden Markov Model
   Q = q_1 q_2 ... q_N : a set of N states
   A = a_11 a_12 ... a_n1 ... a_nn : a transition probability matrix A, each a_ij representing the probability of moving from state i to state j, s.t. Σ_{j=1}^{n} a_ij = 1 for all i
   O = o_1 o_2 ... o_T : a sequence of T observations, each one drawn from a vocabulary V = v_1, v_2, ..., v_V
   B = b_i(o_t) : a sequence of observation likelihoods, also called emission probabilities, each expressing the probability of an observation o_t being generated from a state i
   q_0, q_F : a special start state and end (final) state that are not associated with observations, together with transition probabilities a_01 a_02 ... a_0n out of the start state and a_1F a_2F ... a_nF into the end state

  11. HMM Assumptions
   (Figure: two-state weather HMM with non-emitting start (state 0) and end (state 3) states. Transitions: P(HOT|start) = .8, P(COLD|start) = .2, P(HOT|HOT) = .6, P(COLD|HOT) = .3, P(end|HOT) = .1, P(HOT|COLD) = .4, P(COLD|COLD) = .5, P(end|COLD) = .1. Emission probabilities: P(1|HOT) = .2, P(2|HOT) = .4, P(3|HOT) = .4; P(1|COLD) = .5, P(2|COLD) = .4, P(3|COLD) = .1.)
   Markov Assumption: P(q_i | q_1 ... q_{i-1}) = P(q_i | q_{i-1})
   Output Independence: P(o_i | q_1 ... q_i, ..., q_T, o_1, ..., o_i, ..., o_T) = P(o_i | q_i)
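   To make the later algorithm slides concrete, the weather HMM in this figure can be written down as plain Python dictionaries; a minimal sketch (the variable names states, A and B are mine, the probabilities are taken from the figure):

    # The two-state HOT/COLD weather HMM from the figure, as plain dictionaries.
    states = ["HOT", "COLD"]

    # Transition probabilities a_ij, including the non-emitting start/end states.
    A = {
        "start": {"HOT": 0.8, "COLD": 0.2},
        "HOT":   {"HOT": 0.6, "COLD": 0.3, "end": 0.1},
        "COLD":  {"HOT": 0.4, "COLD": 0.5, "end": 0.1},
    }

    # Emission probabilities b_i(o): probability of emitting observation o (1, 2 or 3) in state i.
    B = {
        "HOT":  {1: 0.2, 2: 0.4, 3: 0.4},
        "COLD": {1: 0.5, 2: 0.4, 3: 0.1},
    }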

  12. Three problems for HMMs
   Problem 1 (Likelihood): Given an HMM λ = (A, B) and an observation sequence O, determine the likelihood P(O | λ).
   Problem 2 (Decoding): Given an observation sequence O and an HMM λ = (A, B), discover the best hidden state sequence Q.
   Problem 3 (Learning): Given an observation sequence O and the set of states in the HMM, learn the HMM parameters A and B.
   Computing Likelihood: Given an HMM λ = (A, B) and an observation sequence O, determine the likelihood P(O | λ).

  13. Forward Trellis
   α_t(j) = P(o_1, o_2 ... o_t, q_t = j | λ)
   α_t(j) = Σ_{i=1}^{N} α_{t-1}(i) a_ij b_j(o_t)
   (Figure: forward trellis for the weather HMM on the observation sequence 3 1 3, with state 1 = COLD and state 2 = HOT. α_1(1) = P(C|start)·P(3|C) = .2 × .1 = .02, α_1(2) = P(H|start)·P(3|H) = .8 × .4 = .32; α_2(1) = .32 × .15 + .02 × .25 = .053, α_2(2) = .32 × .12 + .02 × .08 = .040, where e.g. .12 = P(H|H)·P(1|H) = .6 × .2.)

  14. Forward Algorithm
   1. Initialization: α_1(j) = a_0j b_j(o_1),  1 ≤ j ≤ N
   2. Recursion (since states 0 and F are non-emitting): α_t(j) = Σ_{i=1}^{N} α_{t-1}(i) a_ij b_j(o_t);  1 ≤ j ≤ N, 1 < t ≤ T
   3. Termination: P(O | λ) = α_T(q_F) = Σ_{i=1}^{N} α_T(i) a_iF
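   A direct transcription of these three steps into Python, reusing the states, A and B dictionaries sketched after the HMM-assumptions slide (a sketch, not the lecture's code):

    def forward(observations, states, A, B):
        # 1. Initialization: alpha_1(j) = a_0j * b_j(o_1)
        alpha = [{j: A["start"][j] * B[j][observations[0]] for j in states}]

        # 2. Recursion: alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(o_t)
        for o in observations[1:]:
            prev = alpha[-1]
            alpha.append({j: sum(prev[i] * A[i][j] for i in states) * B[j][o]
                          for j in states})

        # 3. Termination: P(O | lambda) = sum_i alpha_T(i) * a_iF
        return sum(alpha[-1][i] * A[i]["end"] for i in states)

    # Likelihood of the observation sequence 3 1 3 under the weather HMM;
    # the intermediate alpha values match the trellis slide (.32/.02, then .040/.053).
    print(forward([3, 1, 3], states, A, B))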

  15. Visualizing the forward recursion
   α_t(j) = Σ_i α_{t-1}(i) a_ij b_j(o_t)
   (Figure: a column of states q_1 ... q_N at time t-1, each carrying its forward probability α_{t-1}(i), feeding state q_j at time t through the transition probabilities a_1j ... a_Nj, multiplied by the emission probability b_j(o_t).)

  16. Three problems for HMMs
   Problem 1 (Likelihood): Given an HMM λ = (A, B) and an observation sequence O, determine the likelihood P(O | λ).
   Problem 2 (Decoding): Given an observation sequence O and an HMM λ = (A, B), discover the best hidden state sequence Q.
   Problem 3 (Learning): Given an observation sequence O and the set of states in the HMM, learn the HMM parameters A and B.
   Decoding: Given as input an HMM λ = (A, B) and a sequence of observations O = o_1, o_2, ..., o_T, find the most probable sequence of states Q = q_1 q_2 q_3 ... q_T.

  17. Viterbi Trellis
   v_t(j) = max_{q_0, q_1, ..., q_{t-1}} P(q_0, q_1 ... q_{t-1}, o_1, o_2 ... o_t, q_t = j | λ)
   v_t(j) = max_{i=1}^{N} v_{t-1}(i) a_ij b_j(o_t)
   (Figure: Viterbi trellis for the weather HMM on the observation sequence 3 1 3, with state 1 = COLD and state 2 = HOT. v_1(1) = .02, v_1(2) = .32; v_2(1) = max(.32 × .15, .02 × .25) = .048, v_2(2) = max(.32 × .12, .02 × .08) = .038.)

  18. Viterbi recursion
   1. Initialization: v_1(j) = a_0j b_j(o_1),  bt_1(j) = 0,  1 ≤ j ≤ N
   2. Recursion (recall that states 0 and q_F are non-emitting):
      v_t(j) = max_{i=1}^{N} v_{t-1}(i) a_ij b_j(o_t);  1 ≤ j ≤ N, 1 < t ≤ T
      bt_t(j) = argmax_{i=1}^{N} v_{t-1}(i) a_ij b_j(o_t);  1 ≤ j ≤ N, 1 < t ≤ T
   3. Termination:
      The best score: P* = v_T(q_F) = max_{i=1}^{N} v_T(i) · a_iF
      The start of backtrace: q_T* = bt_T(q_F) = argmax_{i=1}^{N} v_T(i) · a_iF
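   The same recursion with max and argmax in place of the sum, plus the backtrace; again a Python sketch over the states, A and B dictionaries defined earlier, not the lecture's own code:

    def viterbi(observations, states, A, B):
        # 1. Initialization: v_1(j) = a_0j * b_j(o_1), backpointer to the start state.
        v = [{j: A["start"][j] * B[j][observations[0]] for j in states}]
        backptr = [{j: "start" for j in states}]

        # 2. Recursion: v_t(j) = max_i v_{t-1}(i) * a_ij * b_j(o_t),
        #    remembering which predecessor i achieved the max.
        for o in observations[1:]:
            prev, v_t, bt_t = v[-1], {}, {}
            for j in states:
                best_i = max(states, key=lambda i: prev[i] * A[i][j])
                bt_t[j] = best_i
                v_t[j] = prev[best_i] * A[best_i][j] * B[j][o]
            v.append(v_t)
            backptr.append(bt_t)

        # 3. Termination: pick the best final state, then follow the backpointers.
        best_last = max(states, key=lambda i: v[-1][i] * A[i]["end"])
        best_prob = v[-1][best_last] * A[best_last]["end"]
        path = [best_last]
        for bt in reversed(backptr[1:]):
            path.append(bt[path[-1]])
        path.reverse()
        return best_prob, path

    # Most probable weather sequence for the observations 3 1 3.
    print(viterbi([3, 1, 3], states, A, B))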

  19. Viterbi backtrace
   (Figure: the Viterbi trellis from the previous slide (observations 3 1 3), with the backpointers followed from the best final state back through the trellis to recover the most probable state sequence.)
