Machine Learning for Signal Processing: Linear Gaussian Models


  1. Machine Learning for Signal Processing: Linear Gaussian Models. Class 21, 12 Nov 2013. Instructor: Bhiksha Raj (11755/18797)

  2. Administrivia • HW3 is up • Projects – please send us an update

  3. Recap: MAP Estimators • MAP (Maximum A Posteriori): find a "best guess" for y (statistically), given known x: $\hat{y} = \arg\max_y P(y \mid x)$

  4. Recap: MAP estimation • x and y are jointly Gaussian; stack them as $z = \begin{bmatrix} x \\ y \end{bmatrix}$:
     $$\mu_z = E[z] = \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}, \qquad C_{zz} = \mathrm{Var}(z) = \begin{bmatrix} C_{xx} & C_{xy} \\ C_{yx} & C_{yy} \end{bmatrix}, \qquad C_{xy} = E[(x - \mu_x)(y - \mu_y)^T]$$
     $$P(z) = N(\mu_z, C_{zz}) = \frac{1}{\sqrt{(2\pi)^d |C_{zz}|}} \exp\!\left(-0.5\,(z - \mu_z)^T C_{zz}^{-1} (z - \mu_z)\right)$$
     • z is Gaussian
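As a concrete numerical sketch of this setup (all parameter values below are made up for illustration and not from the lecture), the joint mean and covariance can be assembled from their blocks in NumPy:

```python
import numpy as np

# Illustrative block parameters: x is 2-D, y is 1-D (made-up values)
mu_x = np.array([0.0, 1.0])
mu_y = np.array([2.0])
C_xx = np.array([[2.0, 0.5],
                 [0.5, 1.0]])
C_xy = np.array([[0.8],
                 [0.3]])
C_yy = np.array([[1.5]])

# Stack the blocks into the joint mean and covariance of z = [x; y]
mu_z = np.concatenate([mu_x, mu_y])
C_zz = np.block([[C_xx,   C_xy],
                 [C_xy.T, C_yy]])

# Draw joint samples; each row is one realization of z
rng = np.random.default_rng(0)
z = rng.multivariate_normal(mu_z, C_zz, size=1000)
```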

  5. MAP estimation: Gaussian PDF [Figure: joint Gaussian density over X and Y]

  6. MAP estimation: The Gaussian at a particular value of X [Figure: slice of the joint density at X = x0]

  7. Conditional Probability of y|x
     $$P(y \mid x) = N\!\left(\mu_y + C_{yx} C_{xx}^{-1}(x - \mu_x),\; C_{yy} - C_{yx} C_{xx}^{-1} C_{xy}\right)$$
     $$E[y \mid x] = \mu_{y|x} = \mu_y + C_{yx} C_{xx}^{-1}(x - \mu_x), \qquad \mathrm{Var}(y \mid x) = C_{yy} - C_{yx} C_{xx}^{-1} C_{xy}$$
     • The conditional probability of y given x is also Gaussian – The slice in the figure is Gaussian • The mean of this Gaussian is a function of x • The variance of y reduces if x is known – Uncertainty is reduced
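A minimal NumPy sketch of these two formulas (the function name is mine, and `np.linalg.solve` replaces the explicit inverse for numerical stability):

```python
import numpy as np

def gaussian_conditional(x, mu_x, mu_y, C_xx, C_xy, C_yy):
    """Mean and covariance of P(y | x) for jointly Gaussian (x, y).

    E[y|x]   = mu_y + C_yx C_xx^{-1} (x - mu_x)
    Var(y|x) = C_yy - C_yx C_xx^{-1} C_xy
    """
    C_yx = C_xy.T
    mean = mu_y + C_yx @ np.linalg.solve(C_xx, x - mu_x)
    cov = C_yy - C_yx @ np.linalg.solve(C_xx, C_xy)
    return mean, cov
```

The returned mean moves with the observed x, while the covariance does not depend on x at all, which is the "uncertainty is reduced" bullet in quantitative form.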

  8. MAP estimation: The Gaussian at a particular value of X [Figure: the slice at X = x0, with its most likely value marked]

  9. MAP Estimation of a Gaussian RV: $\hat{y} = \arg\max_y P(y \mid x) = E[y \mid x]$

  10. It's also a minimum-mean-squared-error estimate • Minimize the error:
     $$Err = E[\|y - \hat{y}\|^2 \mid x] = E[(y - \hat{y})^T (y - \hat{y}) \mid x]$$
     $$Err = E[y^T y - \hat{y}^T y - y^T \hat{y} + \hat{y}^T \hat{y} \mid x] = E[y^T y \mid x] + \hat{y}^T \hat{y} - 2\,\hat{y}^T E[y \mid x]$$
     • Differentiating and equating to 0:
     $$d\,Err = 2\,\hat{y}^T d\hat{y} - 2\,E[y \mid x]^T d\hat{y} = 0 \quad\Rightarrow\quad \hat{y} = E[y \mid x]$$
     The MMSE estimate is the mean of the distribution.
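A quick Monte Carlo sanity check of this result (a made-up 1-D example, not from the slides): the conditional mean achieves a lower average squared error than a shifted guess.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy model: y = x + 0.5 * noise, so E[y|x] = x
x = rng.normal(size=100_000)
y = x + 0.5 * rng.normal(size=100_000)

mse_mmse = np.mean((y - x) ** 2)             # estimator E[y|x] = x
mse_shifted = np.mean((y - (x + 0.2)) ** 2)  # any other estimator
print(mse_mmse, mse_shifted)                 # ~0.25 vs ~0.29
```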

  11. For the Gaussian: MAP = MMSE • The most likely value is also the mean value – This would be true of any symmetric distribution

  12. MMSE estimates for mixture distributions • Let $P(y \mid x) = \sum_k P(k \mid x)\, P(y \mid k, x)$ be a mixture density • The MMSE estimate of y is given by
     $$E[y \mid x] = \int y \sum_k P(k \mid x)\, P(y \mid k, x)\, dy = \sum_k P(k \mid x) \int y\, P(y \mid k, x)\, dy = \sum_k P(k \mid x)\, E[y \mid k, x]$$
     – Just a weighted combination of the MMSE estimates from the component distributions

  13. MMSE estimates from a Gaussian mixture • Let $P(x, y)$ be a Gaussian mixture:
     $$P(x, y) = P(z) = \sum_k P(k)\, N(z;\, \mu_k, \Theta_k), \qquad z = \begin{bmatrix} x \\ y \end{bmatrix}$$
     • $P(y \mid x)$ is also a Gaussian mixture:
     $$P(y \mid x) = \frac{P(x, y)}{P(x)} = \sum_k \frac{P(k, x, y)}{P(x)} = \sum_k \frac{P(x)\, P(k \mid x)\, P(y \mid x, k)}{P(x)} = \sum_k P(k \mid x)\, P(y \mid x, k)$$

  14. MMSE estimates from a Gaussian mixture • $P(y \mid x) = \sum_k P(k \mid x)\, P(y \mid x, k)$ is a Gaussian mixture, where
     $$P(y, x, k) = N\!\left( \begin{bmatrix} \mu_{k,x} \\ \mu_{k,y} \end{bmatrix},\; \begin{bmatrix} C_{k,xx} & C_{k,xy} \\ C_{k,yx} & C_{k,yy} \end{bmatrix} \right)$$
     $$P(y \mid x, k) = N\!\left(\mu_{k,y} + C_{k,yx} C_{k,xx}^{-1}(x - \mu_{k,x}),\; C_{k,yy} - C_{k,yx} C_{k,xx}^{-1} C_{k,xy}\right)$$
     $$P(y \mid x) = \sum_k P(k \mid x)\, N\!\left(\mu_{k,y} + C_{k,yx} C_{k,xx}^{-1}(x - \mu_{k,x}),\; C_{k,yy} - C_{k,yx} C_{k,xx}^{-1} C_{k,xy}\right)$$

  15. MMSE estimates from a Gaussian mixture • $P(y \mid x) = \sum_k P(k \mid x)\, P(y \mid x, k)$ is a mixture Gaussian density • $E[y \mid x]$ is therefore also a mixture:
     $$E[y \mid x] = \sum_k P(k \mid x)\, E[y \mid k, x] = \sum_k P(k \mid x)\left(\mu_{k,y} + C_{k,yx} C_{k,xx}^{-1}(x - \mu_{k,x})\right)$$

  16. MMSE estimates from a Gaussian mixture
     $$E[y \mid x] = \sum_k P(k \mid x)\left(\mu_{k,y} + C_{k,yx} C_{k,xx}^{-1}(x - \mu_{k,x})\right)$$
     • A weighted combination of the MMSE estimates obtained from the individual Gaussians! • The weight $P(k \mid x)$ is easily computed too:
     $$P(k \mid x) = \frac{P(k, x)}{P(x)} = \frac{P(k)\, N(x;\, \mu_{k,x}, C_{k,xx})}{\sum_{k'} P(k')\, N(x;\, \mu_{k',x}, C_{k',xx})}$$
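Pulling slides 13–16 together, here is a sketch of GMM-based MMSE estimation in NumPy/SciPy; the component parameters are assumed given (e.g., from an EM fit on joint vectors), and the function name is illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_mmse(x, pi, mu_x, mu_y, C_xx, C_xy):
    """MMSE estimate E[y|x] under a joint Gaussian mixture.

    pi:   (K,)        component priors P(k)
    mu_x: (K, dx)     per-component means of the x block
    mu_y: (K, dy)     per-component means of the y block
    C_xx: (K, dx, dx) and C_xy: (K, dx, dy) covariance blocks
    """
    K = len(pi)
    # Posterior weights P(k|x), proportional to P(k) N(x; mu_{k,x}, C_{k,xx})
    lik = np.array([pi[k] * multivariate_normal.pdf(x, mu_x[k], C_xx[k])
                    for k in range(K)])
    post = lik / lik.sum()
    # Weighted sum of per-component conditional means
    est = np.zeros(mu_y.shape[1])
    for k in range(K):
        cond = mu_y[k] + C_xy[k].T @ np.linalg.solve(C_xx[k], x - mu_x[k])
        est += post[k] * cond
    return est
```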

  17. MMSE estimates from a Gaussian mixture • A mixture of estimates from individual Gaussians [Figure]

  18. Voice Morphing • Align training recordings from both speakers – Cepstral vector sequence • Learn a GMM on the joint vectors • Given speech from one speaker, find the MMSE estimate of the other • Synthesize speech from the estimated cepstra (a rough sketch of the core step follows)
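The lecture gives no code for this pipeline; below is a rough sketch of its GMM-regression core under stated assumptions: synthetic arrays stand in for time-aligned cepstral sequences, alignment and waveform synthesis are omitted, and the `gmm_mmse` helper from the earlier sketch is reused.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-ins for aligned cepstra: X (source speaker), Y (target speaker),
# each of shape (T, d). Real data would come from aligned recordings.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 13))
Y = 0.8 * X + 0.2 * rng.normal(size=(2000, 13))

# Learn a GMM on the stacked joint vectors z_t = [x_t; y_t]
Z = np.hstack([X, Y])
gmm = GaussianMixture(n_components=8, covariance_type="full").fit(Z)

# Split each component's mean and covariance into (x, y) blocks and
# convert one source frame with the GMM MMSE formula
d = X.shape[1]
mu_x, mu_y = gmm.means_[:, :d], gmm.means_[:, d:]
C_xx = gmm.covariances_[:, :d, :d]
C_xy = gmm.covariances_[:, :d, d:]
y_hat = gmm_mmse(X[0], gmm.weights_, mu_x, mu_y, C_xx, C_xy)
```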

  19. MMSE with GMM: Voice Transformation • Festvox GMM transformation suite (Toda) [Audio demos: conversions among the speakers awb, bdl, jmk, slt]

  20. MAP / ML / MMSE • General statistical estimators • All used to predict a variable, based on other parameters related to it • Most common assumption: data are Gaussian; all RVs are Gaussian – Other probability densities may also be used • For Gaussians the relationships are linear, as we saw

  21. Gaussians and more Gaussians • Linear Gaussian Models • But first, a recap

  22. A Brief Recap: $D \approx BC$ • Principal component analysis: find the K bases that best explain the given data • Find B and C such that the difference between D and BC is minimized – While constraining the columns of B to be orthonormal
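A minimal sketch of this decomposition via the SVD (my formulation; the slide only states the objective). Centering is included here for consistency with the PCA discussion that follows:

```python
import numpy as np

def pca_decompose(D, K):
    """Factor D (dims x samples) as B @ C, with orthonormal columns in B."""
    D_centered = D - D.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(D_centered, full_matrices=False)
    B = U[:, :K]            # K orthonormal bases (principal directions)
    C = B.T @ D_centered    # weights of each sample on those bases
    return B, C

# B @ C is the best rank-K approximation of the centered data in the
# squared-error sense (Eckart-Young theorem).
```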

  23. Remember Eigenfaces • Approximate every face f as $f = w_{f,1} V_1 + w_{f,2} V_2 + w_{f,3} V_3 + \dots + w_{f,k} V_k$ • Estimate V to minimize the squared error • The error is unexplained by $V_1 \dots V_k$ • The error is orthogonal to the eigenfaces

  24. Karhunen-Loève vs. PCA • Eigenvectors of the correlation matrix: – Principal directions of the tightest ellipse centered on the origin – Directions that retain maximum energy

  25. Karhunen-Loève vs. PCA • Eigenvectors of the correlation matrix: – Principal directions of the tightest ellipse centered on the origin – Directions that retain maximum energy • Eigenvectors of the covariance matrix: – Principal directions of the tightest ellipse centered on the data – Directions that retain maximum variance


  28. Karhunen-Loève vs. PCA • If the data are naturally centered at the origin, KLT == PCA • The following slides refer to PCA! – Assume data centered at the origin for simplicity • Not essential, as we will see
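A small numerical illustration of the distinction (made-up 2-D data with a nonzero mean): the leading eigenvectors of the correlation (second-moment) matrix and of the covariance matrix disagree until the data are centered.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 500)) + np.array([[5.0], [0.0]])  # offset from origin

corr = (X @ X.T) / X.shape[1]            # correlation matrix (KLT)
Xc = X - X.mean(axis=1, keepdims=True)
cov = (Xc @ Xc.T) / X.shape[1]           # covariance matrix (PCA)

# The KLT direction is pulled toward the mean offset; the PCA direction
# reflects only the spread around the mean.
print(np.linalg.eigh(corr)[1][:, -1])    # leading eigenvector, KLT
print(np.linalg.eigh(cov)[1][:, -1])     # leading eigenvector, PCA
```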

  29. Remember Eigenfaces • Approximate every face f as $f = w_{f,1} V_1 + w_{f,2} V_2 + w_{f,3} V_3 + \dots + w_{f,k} V_k$ • Estimate V to minimize the squared error • The error is unexplained by $V_1 \dots V_k$ • The error is orthogonal to the eigenfaces

  30. Eigen Representation [Figure: $f = w_{11} V_1 + e_1$, illustration assuming 3-D space] • K-dimensional representation – Error is orthogonal to the representation – Weight and error are specific to the data instance

  31. Representation [Figure: $f = w_{12} V_1 + e_2$; the error is at 90° to the eigenface] • K-dimensional representation – Error is orthogonal to the representation – Weight and error are specific to the data instance

  32. Representation [Figure: all data with the same representation $w V_1$ lie on a plane orthogonal to $V_1$] • K-dimensional representation – Error is orthogonal to the representation
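The orthogonality claim in these slides is easy to verify numerically; here is a short sketch with a single made-up unit basis vector in 3-D:

```python
import numpy as np

V1 = np.array([1.0, 2.0, 2.0]) / 3.0   # unit-norm eigen direction (made up)
f = np.array([2.0, -1.0, 0.5])         # one data instance

w = V1 @ f            # weight: projection of f onto V1
e = f - w * V1        # error: the part of f the basis cannot explain

print(np.dot(e, V1))  # ~0: the error is orthogonal to the representation
```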
