

  1. Probability and Statistics for Computer Science
     Principal Component Analysis --- Exploring the data in fewer dimensions
     Credit: Wikipedia
     Hongye Liu, Teaching Assistant Prof, CS361, UIUC, 10.27.2020

  2. Last time
     - Review of Bayesian inference
     - Visualizing high dimensional data & summarizing data
     - The covariance matrix

  3. Objectives: Principal Component Analysis
     Two applications:
     - Dimension reduction
     - Compression / reconstruction
     See the important directions in the data!

  4. Examples: Immune Cell Data
     - There are 38816 white blood immune cells from a mouse sample.
     - Each immune cell has 40+ features/components.
     - Four features are chosen as a subset for the illustration (d = 4).
     - There are at least 3 cell types involved: T cells, B cells, and natural killer cells.

  5. Scatter matrix of Immune Cells
     - There are 38816 white blood immune cells from a mouse sample.
     - Each immune cell has 40+ features/components.
     - Four features are used for the illustration.
     - There are at least 3 cell types involved.
     Legend: dark red: T cells; brown: B cells; blue: NK cells; cyan: other small populations.

  6. PCA of Immune Cell Data
     > res1
     $values
     [1] 4.7642829 2.1486896 1.3730662 0.4968255
     $vectors
                [,1]        [,2]       [,3]       [,4]
     [1,]  0.2476698  0.00801294 -0.6822740  0.6878210
     [2,]  0.3389872 -0.72010997 -0.3691532 -0.4798492
     [3,] -0.8298232  0.01550840 -0.5156117 -0.2128324
     [4,]  0.3676152  0.69364033 -0.3638306 -0.5013477
     (Annotated on the slide: the T-cell, NK-cell, and B-cell clusters lie along the eigenvector directions.)
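     This is the output format of R's eigen(); a minimal sketch of the call that
     produces it, with a stand-in matrix since the lecture's cell data is not
     included here:

        # Stand-in data: the slide's real input is a 38816-cell x 4-feature matrix
        X <- matrix(rnorm(38816 * 4), ncol = 4)   # hypothetical placeholder
        res1 <- eigen(cov(X))                     # eigendecomposition of the covariance
        res1$values    # eigenvalues, sorted largest first
        res1$vectors   # columns are unit-length eigenvectors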

  7. Properties of Covariance matrix
     - The covariance matrix Covmat({x}) is symmetric: cov({x}; j, k) = cov({x}; k, j).
     - And it's positive semi-definite, that is, all λ_i ≥ 0.
     - The covariance matrix is diagonalizable as Covmat({x}) = U Λ U^T.
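     Both properties are easy to confirm numerically; a quick sketch with simulated
     data (X here is an assumption, not the lecture's dataset):

        set.seed(0)
        X <- matrix(rnorm(100 * 7), ncol = 7)   # 100 simulated points, d = 7
        C <- cov(X)
        isSymmetric(C)     # TRUE: cov({x}; j, k) = cov({x}; k, j)
        eigen(C)$values    # all >= 0, so C is positive semi-definite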

  8. Properties of Covariance matrix
     - If we define X_c as the mean-centered matrix for dataset {x}, then
       Covmat({x}) = X_c X_c^T / N.
     - The covariance matrix is a d × d matrix (d = 7 in the illustration).
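     A sketch of the X_c X_c^T / N construction, assuming the raw data arrives as an
     N × d matrix (all names illustrative):

        set.seed(1)
        X  <- matrix(rnorm(50 * 7), ncol = 7)   # N = 50 points in d = 7 dimensions
        Xc <- t(X) - rowMeans(t(X))             # mean-centered, d x N as on the slide
        C  <- Xc %*% t(Xc) / ncol(Xc)           # divide by N: d x d covariance matrix
        dim(C)                                  # 7 7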

  9. What is the correlation between the 2 components for the data m?
     Covmat(m) = [ 20 25 ]
                 [ 25 40 ]
     Corr(feat 1, feat 2) = 25 / (√20 · √40) ≈ 0.88
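     Base R's cov2cor() performs exactly this normalization; checking the arithmetic:

        C <- matrix(c(20, 25, 25, 40), nrow = 2)
        cov2cor(C)                  # off-diagonals: 25 / (sqrt(20) * sqrt(40))
        25 / (sqrt(20) * sqrt(40))  # 0.8839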

  10. Example: covariance matrix of a data set
      (I) Mean centering: A_c = A − mean(A). For the 5 points
          A = [  5  4  3  2  1 ]  ⇒  A_c = [  2  1  0 −1 −2 ]
              [ −1  1  0  1 −1 ]           [ −1  1  0  1 −1 ]
      (II) Inner product of each pair of rows: [1,1] = 10, [2,2] = 4, [1,2] = 0.
      (III) Divide the matrix by N, the number of data points (N = 5):
          Covmat({x}) = (1/N) A_c A_c^T = [ 2  0   ]
                                          [ 0  0.8 ]
          Cov(1, 2) = 0, so Corr(1, 2) = 0.
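      The same arithmetic in a few lines of R, dividing by N as the slide does
      (not N − 1):

        A  <- rbind(c(5, 4, 3, 2, 1),     # first coordinate of the 5 points
                    c(-1, 1, 0, 1, -1))   # second coordinate
        Ac <- A - rowMeans(A)             # mean centering: subtract (3, 0)
        Ac %*% t(Ac) / ncol(A)            # [[2, 0], [0, 0.8]]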

  11. What do the data look like when Covmat({x}) is diagonal?
      With A_c = [  2  1  0 −1 −2 ]  and  Covmat({x}) = (1/N) A_c A_c^T = [ 2  0   ]
                 [ −1  1  0  1 −1 ]                                       [ 0  0.8 ]
      the components X(1) and X(2) are uncorrelated, and the direction of maximum
      variance lies along the X(1) axis.

  12. Diagonalization
      Scratch work: for a symmetric matrix A with eigenvectors u_i and eigenvalues λ_i
      (A u_i = λ_i u_i), stacking the normalized eigenvectors as the columns of U gives
      A = U Λ U^T.

  13. Diagonalization of a symmetric matrix
      - If A is an n × n symmetric square matrix, the eigenvalues are real.
      - If the eigenvalues are also distinct, their eigenvectors are orthogonal.
      - We can then scale the eigenvectors to unit length, and place them into an
        orthogonal matrix U = [u_1 u_2 ... u_n].
      - We can write the diagonal matrix Λ = U^T A U such that the diagonal entries
        of Λ are λ_1, λ_2, ..., λ_n in that order.

  14. Diagonalization example
      For A = [ 5 3 ], solve det(A − λI) = 0: (5 − λ)² − 9 = 0, so λ_1 = 8, λ_2 = 2.
              [ 3 5 ]
      Eigenvectors? For λ_2 = 2: (A − 2I) v_2 = 0 gives v_2 ∝ (1, −1), normalized
      u_2 = (1/√2)(1, −1).

  15. Diagonalization example
      For λ_1 = 8: (A − 8I) v_1 = 0 gives v_1 ∝ (1, 1), normalized u_1 = (1/√2)(1, 1).
      With U = [u_1 u_2] = (1/√2) [ 1  1 ]  we get  Λ = U^T A U = [ 8 0 ]
                                  [ 1 −1 ]                        [ 0 2 ]
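      eigen() reproduces this worked example (it may flip an eigenvector's sign,
      which is equally valid):

        A <- matrix(c(5, 3, 3, 5), nrow = 2)
        e <- eigen(A)
        e$values                     # 8 2
        U <- e$vectors               # ~ (1,1)/sqrt(2) and (1,-1)/sqrt(2), up to sign
        round(t(U) %*% A %*% U, 10)  # Lambda = diag(8, 2)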

  16. Rotation Matrix
      Definition: R is a rotation matrix if R^T R = R R^T = I.
      If U is formed by normalized, mutually orthogonal eigenvectors (an orthonormal
      matrix), we can prove U^T U = U U^T = I, so U and U^T act as rotations.

  17. Checking orthonormality
      With u_1 = (1/√2)(1, 1) and u_2 = (1/√2)(1, −1) from the example:
      u_2^T u_1 = 0 (the eigenvectors are orthogonal) and ‖u_1‖ = ‖u_2‖ = 1.

  18. " ZD A T wi :3 - it :c - . T - O 3.1 ut-f.sn " : d UTC Ux ) = ¥ " x - . x u " ✓ = u ⇒ UT . U = I

  19. Q. Is this true?
      Transforming a matrix with an orthonormal matrix U^T only rotates the data.
      A. Yes
      B. No

  20. Dimension reduction from 2D to 1D Credit: Prof. Forsyth

  21. Step 1: subtract the mean Credit: Prof. Forsyth

  22. Step 2: Rotate to diagonalize the covariance Credit: Prof. Forsyth

  23. Step 3: Drop component(s) Credit: Prof. Forsyth

  24. Principal Components
      - The columns of U are the normalized eigenvectors of Covmat({x}) and are
        called the principal components of the data {x}.

  25. Principal components analysis
      - We reduce the dimensionality of dataset {x}, represented by the d × n matrix D,
        from d to s (s < d), via the three steps sketched below.
      - Step 1. Define the d × n matrix m such that m = D − mean(D).
      - Step 2. Define the d × n matrix r such that r_i = U^T m_i, where U satisfies
        Λ = U^T Covmat({x}) U, the diagonalization of Covmat({x}) with the eigenvalues
        sorted in decreasing order, and U is the matrix of orthonormal eigenvectors.
      - Step 3. Define the d × n matrix p such that p is r with the last d − s
        components of r made zero.
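      A minimal R sketch of the three steps, assuming the data is a d × n matrix with
      one item per column; pca_reduce is an illustrative name, not a library function:

        pca_reduce <- function(D, s) {
          m <- D - rowMeans(D)               # Step 1: subtract the mean
          e <- eigen(m %*% t(m) / ncol(D))   # eigenvalues arrive in decreasing order
          r <- t(e$vectors) %*% m            # Step 2: rotate with U^T
          p <- r
          p[seq(s + 1, nrow(p)), ] <- 0      # Step 3: zero the last d - s components
          list(U = e$vectors, r = r, p = p)
        }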

  26. What happened to the mean?
      - Step 1. mean(m) = mean(D − mean(D)) = 0
      - Step 2. mean(r) = U^T mean(m) = U^T 0 = 0
      - Step 3. mean(p_i) = mean(r_i) = 0 while i ∈ 1:s, and mean(p_i) = 0 while i ∈ s+1:d

  27. What happened to the covariances?
      - Step 1. Covmat(m) = Covmat(D) = Covmat({x})
      - Step 2. Covmat(r) = U^T Covmat(m) U = Λ
      - Step 3. Covmat(p) is Λ with the last/smallest d − s diagonal terms turned to 0.
      (Both facts are checked numerically below.)
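      A quick numerical check, reusing the pca_reduce sketch from slide 25 on
      simulated data:

        set.seed(2)
        D   <- matrix(rnorm(3 * 200), nrow = 3)   # d = 3, n = 200, simulated
        out <- pca_reduce(D, s = 2)
        rowMeans(out$r)                           # ~ 0: the mean stays at zero
        round(out$r %*% t(out$r) / 200, 3)        # ~ diagonal Lambda
        round(out$p %*% t(out$p) / 200, 3)        # same, with the last diagonal term 0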

  28. Sample covariance matrix
      - In many statistical programs, the sample covariance matrix is defined to be
        Covmat(m) = m_c m_c^T / (N − 1)
      - Similar to what happens to the unbiased standard deviation.
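      R's cov() is one such program; a quick check that it divides by N − 1
      (simulated data):

        set.seed(3)
        D <- matrix(rnorm(2 * 10), nrow = 2)               # d = 2, N = 10
        m <- D - rowMeans(D)
        all.equal(cov(t(m)), m %*% t(m) / (ncol(m) - 1))   # TRUE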

  29. PCA an example
      - Step 1.
        D = [ 3 −4 7  1 −4 −3 ]  ⇒  mean(D) = [ 0 ]
            [ 7 −6 8 −1 −1 −7 ]               [ 0 ]
        m = D − mean(D) = [ 3 −4 7  1 −4 −3 ]
                          [ 7 −6 8 −1 −1 −7 ]
      - Step 2.
      - Step 3.

  30. PCA an example
      - Step 1. (as above)
      - Step 2.
        Covmat(m) = [ 20 25 ]  ⇒  λ_1 ≈ 57; λ_2 ≈ 3
                    [ 25 40 ]
        U^T = [  0.5606288 0.8280672 ]  ⇒  U = [ 0.5606288 −0.8280672 ]
              [ −0.8280672 0.5606288 ]         [ 0.8280672  0.5606288 ]
      - Step 3.

  31. PCA an example
      - Steps 1 and 2. (as above)
        ⇒ r = U^T m = [ 7.478 −7.211 10.549 −0.267 −3.071 −7.478 ]
                      [ 1.440 −0.052 −1.311 −1.389  2.752 −1.440 ]
      - Step 3.

  32. PCA an example
      - Steps 1 and 2. (as above) The rows of r are the new coordinates.
      - Step 3.
        ⇒ p = [ 7.478 −7.211 10.549 −0.267 −3.071 −7.478 ]  ← coordinates along PC1
              [ 0      0      0      0      0      0     ]
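      The whole worked example can be replayed in a few lines; eigen() may return a
      principal component with the opposite sign, which flips the sign of the
      matching row of r but changes nothing essential:

        D <- rbind(c(3, -4, 7, 1, -4, -3),
                   c(7, -6, 8, -1, -1, -7))
        m <- D - rowMeans(D)              # Step 1: the mean is already (0, 0)
        C <- m %*% t(m) / (ncol(m) - 1)   # [[20, 25], [25, 40]]
        e <- eigen(C)                     # values: 56.93 and 3.07
        r <- t(e$vectors) %*% m           # Step 2: the new coordinates
        p <- r; p[2, ] <- 0               # Step 3: keep only PC1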

  33. What is this matrix for the previous example?
      U^T Covmat(m) U = ?
      (It is the diagonal matrix Λ ≈ [ 57 0 ; 0 3 ].)

  34. The Mean square error of the projection
      - The mean square error is the sum of the smallest d − s eigenvalues in Λ:
        (1/(N−1)) Σ_i ‖r_i − p_i‖² = (1/(N−1)) Σ_i Σ_{j=s+1}^{d} (r_i^(j))²
                                   = Σ_{j=s+1}^{d} λ_j
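      Continuing the snippet after slide 32, the cost of dropping PC2 is exactly the
      dropped eigenvalue:

        sum((r - p)^2) / (ncol(m) - 1)   # ~ 3.07
        e$values[2]                      # lambda_2: the same number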
