
Statistical Pattern Recognition - Lecture 10 (N. Morgan / B. Gold)



  1. LECTURE ON STATISTICAL PATTERN RECOGNITION, EE 225D. University of California, Berkeley, College of Engineering, Department of Electrical Engineering and Computer Sciences. Professors: N. Morgan / B. Gold. Spring 1999. Lecture 10.

  2. Last Time: the discriminant $\log p(x \mid \omega_i) + \log p(\omega_i)$.

  3. (Figure slide; no text extracted.)

  4. Discrete Density Estimation. Build a table of counts $n_{ij}$, with rows indexed by class $\omega_0, \omega_1, \ldots, \omega_M$ and columns by cluster $j = 0, 1, \ldots, K-1$. Then
$$p(\omega_i) = \frac{\sum_j n_{ij}}{\sum_{i,j} n_{ij}} = \frac{\text{row total}}{\text{total}}$$
$$p(\omega_i \mid x) \approx p(\omega_i \mid y_j) = \frac{p(\omega_i, y_j)}{p(y_j)} = \frac{n_{ij}}{\sum_i n_{ij}}$$
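A minimal sketch of these estimates in Python (the toy counts and variable names are ours, not the lecture's):

```python
import numpy as np

# Toy co-occurrence table: n[i, j] = how often class omega_i occurred
# with cluster (codebook entry) y_j. Values are illustrative only.
n = np.array([[12., 3., 0.],
              [ 1., 8., 4.],
              [ 0., 2., 9.]])

# p(omega_i) = row total / grand total
p_class = n.sum(axis=1) / n.sum()

# p(omega_i | y_j) = n_ij / column total, used as p(omega_i | x)
# for any x that quantizes to cluster y_j.
p_class_given_cluster = n / n.sum(axis=0, keepdims=True)

print(p_class)                      # class priors
print(p_class_given_cluster[:, 0])  # class posteriors for cluster y_0
```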

  5. K-means Clustering (see the sketch below):
     1. Choose N centers.
     2. Assign points to the nearest center.
     3. Recompute the centers.
     4. Assess (e.g., the total distortion; repeat steps 2-4 as needed).
     5. Write out the "codebook".
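A sketch of these steps in plain NumPy; the function name, random initialization, and fixed iteration count are our assumptions, not the lecture's:

```python
import numpy as np

def kmeans(x, n_centers, n_iters=20, seed=0):
    """Plain K-means over the five slide steps. x: (n_points, dim)."""
    rng = np.random.default_rng(seed)
    # 1. Choose initial centers (here: random data points).
    centers = x[rng.choice(len(x), n_centers, replace=False)].astype(float)
    for _ in range(n_iters):
        # 2. Assign each point to its nearest center.
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Recompute each center as the mean of its assigned points.
        for k in range(n_centers):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean(axis=0)
    # 4. Assess: total distortion under the final assignment.
    dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    distortion = dists[np.arange(len(x)), labels].sum()
    # 5. The final centers form the "codebook".
    return centers, labels, distortion
```

A fixed iteration count stands in for a real convergence test; in practice steps 2-4 repeat until the distortion stops decreasing.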

  6. Example: speech frame classification.
     • Take a 256-point DFT, keep 128 spectral values, and take the log power.
     • Use K-means to find 64 centers and build a table.
     • Assign each spectrum to a codebook entry, count co-occurrences with phoneme labels, and convert the counts to probabilities.
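A sketch of that pipeline under our own helper names (log_power_spectrum, quantize, and cooccurrence_counts are hypothetical; the codebook would come from K-means as on the previous slide):

```python
import numpy as np

def log_power_spectrum(frame):
    """256-pt DFT of one speech frame; keep 128 values; take log power."""
    spec = np.fft.rfft(frame, n=256)[:128]
    return np.log(np.abs(spec) ** 2 + 1e-10)  # small floor avoids log(0)

def quantize(feature, codebook):
    """Assign a spectrum to the nearest of the (e.g. 64) codebook entries."""
    return np.linalg.norm(codebook - feature, axis=1).argmin()

def cooccurrence_counts(frames, phoneme_labels, codebook, n_phonemes):
    """Count codebook-entry / phoneme-label co-occurrences; normalizing
    the columns then gives p(phoneme | codebook entry) as on slide 4."""
    counts = np.zeros((n_phonemes, len(codebook)))
    for frame, lab in zip(frames, phoneme_labels):
        counts[lab, quantize(log_power_spectrum(frame), codebook)] += 1
    return counts
```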

  7. Estimators requiring iterative training:
     • Gaussian mixtures
     • Neural networks

  8. Gaussian Mixtures:
$$p(x \mid \omega_k) = \sum_{j=1}^{M} p(j \mid \omega_k)\, p(x \mid \omega_k, j) = \sum_{j=1}^{M} c_j\, p(x \mid \omega_k, j)$$
where $c_j = p(j \mid \omega_k)$ is the probability that $x$ originated from distribution $j$.
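A short sketch of evaluating such a mixture for the one-dimensional case (the lecture's components may well be multivariate; the function names are ours):

```python
import numpy as np

def gaussian(x, mu, sigma):
    """1-D Gaussian density."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def mixture_density(x, c, mu, sigma):
    """p(x | omega_k) = sum_j c_j p(x | omega_k, j), with the c_j summing to 1."""
    return sum(cj * gaussian(x, mj, sj) for cj, mj, sj in zip(c, mu, sigma))

# Example: a two-component mixture evaluated at x = 0.5.
print(mixture_density(0.5, c=[0.3, 0.7], mu=[0.0, 1.0], sigma=[1.0, 0.5]))
```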

  9. Expectation Maximization (also sometimes called Estimate-and-Maximize)
     • Potentially quite general
     • Used when the parameters cannot be determined analytically
     • E step: conditional expectation of the unknown variables given what is known
     • M step: choose parameters to maximize that expectation

  10. Parameterize the densities: the posterior $p(\omega_i \mid x)$ and the likelihood $p(x \mid \omega_i)$ become $p(\omega_i \mid x, \theta)$ and $p(x \mid \omega_i, \theta)$ for class $\omega_i$. Maximum likelihood (ML) then selects
$$\hat{\theta} = \arg\max_{\theta}\; p(x \mid \theta)$$

  11. Let $k$ be the hidden variables, $x$ the observed data, $\theta$ the parameters, and $\theta^{old}$ the old parameters. Then
$$E\{\log p(k, x \mid \theta)\} = \sum_k p(k \mid x, \theta^{old}) \log p(k, x \mid \theta) = \sum_k p(k \mid x, \theta^{old}) \log \left[ p(k \mid x, \theta)\, p(x \mid \theta) \right]$$

  12. Define
$$Q(\theta, \theta^{old}) = \sum_k p(k \mid x, \theta^{old}) \log p(k \mid x, \theta) + \underbrace{\sum_k p(k \mid x, \theta^{old})}_{\text{sums to } 1}\, \log \underbrace{p(x \mid \theta)}_{\text{indep. of } k}$$
$$Q(\theta^{old}, \theta^{old}) = \sum_k p(k \mid x, \theta^{old}) \log p(k \mid x, \theta^{old}) + \log p(x \mid \theta^{old})$$
$$Q(\theta, \theta^{old}) - Q(\theta^{old}, \theta^{old}) = \log p(x \mid \theta) - \log p(x \mid \theta^{old}) - \sum_k p(k \mid x, \theta^{old}) \log \frac{p(k \mid x, \theta^{old})}{p(k \mid x, \theta)}$$
Since the last sum is a KL divergence and hence nonnegative, any $\theta$ that increases $Q$ also increases the likelihood $\log p(x \mid \theta)$.

  13. Gaussian Mixture:
$$p(x \mid \theta) = \sum_{k=1}^{K} p(x, k \mid \theta) = \sum_{k=1}^{K} p(k \mid \theta)\, p(x \mid k, \theta)$$
Log joint density: $\log p(x, k \mid \theta) = \log [\, p(k \mid \theta)\, p(x \mid k, \theta)\, ]$, so
$$Q = \sum_{k=1}^{K} \sum_{n=1}^{N} p(k \mid x_n, \theta^{old}) \log [\, p(k \mid \theta)\, p(x_n \mid k, \theta)\, ] = \sum_{k=1}^{K} \sum_{n=1}^{N} p(k \mid x_n, \theta^{old}) \log p(k \mid \theta) + \sum_{k=1}^{K} \sum_{n=1}^{N} p(k \mid x_n, \theta^{old}) \log p(x_n \mid k, \theta)$$

  14. Let
$$p(x_n \mid k) = \frac{1}{\sqrt{2\pi \sigma_k^2}} \exp\left( -\frac{(x_n - \mu_k)^2}{2\sigma_k^2} \right)$$
Then
$$Q = \sum_{k=1}^{K} \sum_{n=1}^{N} p(k \mid x_n, \theta^{old}) \log p(k \mid \theta) + \sum_{k=1}^{K} \sum_{n=1}^{N} p(k \mid x_n, \theta^{old}) \left[ -\log \sigma_k - \frac{(x_n - \mu_k)^2}{2\sigma_k^2} \right] + C$$

  15. Setting $\partial Q / \partial \mu_j = 0$:
$$\sum_{n=1}^{N} p(j \mid x_n, \theta^{old}) \left( \frac{x_n}{\sigma_j^2} - \frac{\mu_j}{\sigma_j^2} \right) = 0 \;\Rightarrow\; \sum_{n=1}^{N} p(j \mid x_n, \theta^{old})\, x_n = \sum_{n=1}^{N} p(j \mid x_n, \theta^{old})\, \mu_j$$
$$\Rightarrow\; \mu_j = \frac{\sum_{n=1}^{N} p(j \mid x_n, \theta^{old})\, x_n}{\sum_{n=1}^{N} p(j \mid x_n, \theta^{old})}$$
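The resulting update is compact in code: the sketch below performs one EM mean update for a 1-D Gaussian mixture, computing the responsibilities $p(j \mid x_n, \theta^{old})$ in the E step and applying the formula above in the M step (variable names are ours):

```python
import numpy as np

def update_means(x, c, mu, sigma):
    """One EM mean update. x: (N,) data; c, mu, sigma: (K,) old params."""
    # E step: responsibilities p(j | x_n, theta_old), shape (K, N).
    lik = np.exp(-(x[None, :] - mu[:, None]) ** 2 / (2 * sigma[:, None] ** 2)) \
          / (np.sqrt(2 * np.pi) * sigma[:, None])
    resp = c[:, None] * lik
    resp /= resp.sum(axis=0, keepdims=True)
    # M step (means only): mu_j = sum_n resp_jn * x_n / sum_n resp_jn.
    return (resp * x[None, :]).sum(axis=1) / resp.sum(axis=1)
```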

  16. EM Summary (see the sketch below)
     • Choose a parametric form
     • Choose initial values
     • Compute posterior estimates for the hidden variables
     • Choose parameters to maximize the expectation of the joint density (observed, hidden)
     • Assess goodness of fit: if good enough, stop; if not, iterate
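Putting the summary together, a sketch of the full loop for a 1-D Gaussian mixture; the initialization choices and fixed iteration count are our assumptions:

```python
import numpy as np

def em_gmm(x, n_components, n_iters=50, seed=0):
    """EM for a 1-D Gaussian mixture, following the slide-16 recipe."""
    rng = np.random.default_rng(seed)
    # Choose parametric form and initial values.
    c = np.full(n_components, 1.0 / n_components)
    mu = rng.choice(x, n_components, replace=False).astype(float)
    sigma = np.full(n_components, x.std())
    for _ in range(n_iters):
        # E step: posterior estimates for the hidden component labels.
        lik = np.exp(-(x[None, :] - mu[:, None]) ** 2
                     / (2 * sigma[:, None] ** 2)) \
              / (np.sqrt(2 * np.pi) * sigma[:, None])
        resp = c[:, None] * lik
        resp /= resp.sum(axis=0, keepdims=True)
        nk = resp.sum(axis=1)
        # M step: parameters maximizing the expected log joint density.
        c = nk / len(x)
        mu = (resp * x[None, :]).sum(axis=1) / nk
        sigma = np.sqrt((resp * (x[None, :] - mu[:, None]) ** 2).sum(axis=1) / nk)
        # Assess goodness of fit: one could monitor the data
        # log-likelihood here and stop once it plateaus.
    return c, mu, sigma
```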
