svd and pca

SVD and PCA, Derek Onken and Li Xiong

SVD and PCA, by Derek Onken and Li Xiong. Feature extraction: create new features (attributes) by combining/mapping existing ones. Common methods: Principal Component Analysis, Singular Value Decomposition, and other compression methods.


  1. SVD and PCA Derek Onken and Li Xiong

  2. Feature Extraction
     • Create new features (attributes) by combining/mapping existing ones
     • Common methods:
        • Principal Component Analysis
        • Singular Value Decomposition
     • Other compression methods (time-frequency analysis):
        • Fourier transform (e.g. time series)
        • Discrete Wavelet Transform (e.g. 2D images)
     January 29, 2018

  3. Principal Component Analysis (PCA)
     • Principal component analysis: find the dimensions that capture the most variance
     • A linear mapping of the data to a new coordinate system such that the greatest variance lies on the first coordinate (the first principal component), the second greatest variance on the second coordinate, and so on.
     • Steps:
        • Normalize the input data so that each attribute falls within the same range
        • Compute k orthonormal (unit) vectors, i.e., principal components; each input data vector is a linear combination of the k principal component vectors
        • Sort the principal components in order of decreasing "significance" (variance)
        • Eliminate weak components, i.e., those with low variance
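The steps on this slide can be sketched in NumPy. This is a minimal illustration, not the presenters' code: the function name `pca`, the z-score normalization, and the synthetic data are all assumptions made here, and the principal components are obtained from the eigenvectors of the covariance matrix, as slide 4 describes.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the k strongest principal components."""
    # Step 1: normalize each attribute (zero mean, unit variance is one common choice).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Step 2: covariance matrix of the normalized attributes (columns are attributes).
    C = np.cov(Z, rowvar=False)
    # Step 3: eigh returns orthonormal eigenvectors of the symmetric matrix C,
    # with eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    # Step 4: sort by decreasing variance and drop the weak components.
    order = np.argsort(eigvals)[::-1][:k]
    components = eigvecs[:, order]
    return Z @ components, components, eigvals[order]

# Hypothetical data: the second attribute is almost a multiple of the first.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + 0.1 * rng.normal(size=200), rng.normal(size=200)])

scores, components, variances = pca(X, k=2)
```

Here `scores` holds the data in the new coordinate system, `components` holds the unit vectors as columns, and `variances` holds the variance captured along each kept component.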

  4. Dimensionality Reduction: PCA
     • Mathematically:
        • Compute the covariance matrix
        • Find the eigenvectors of the covariance matrix that correspond to large eigenvalues
     [Figure: data plotted on X and Y axes with a direction vector v]
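Written out, the two steps look as follows. The notation is chosen here for illustration and does not come from the slides: $x_1, \dots, x_n$ are the data vectors in $d$ dimensions, $\bar{x}$ is their mean, $C$ is the covariance matrix, and $w_j$, $\lambda_j$ are its eigenvectors and eigenvalues.

```latex
C = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{T},
\qquad
C\, w_j = \lambda_j\, w_j,
\qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0
```

The variance of the data projected onto $w_j$ equals $\lambda_j$, which is why the eigenvectors with large eigenvalues are the ones to keep.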

  5. PCA: Illustrative Example

  6. PCA: Illustrative Example

  7. PCA: Illustrative Example

  8. PCA: Illustrative Example

  9. PCA: Illustrative Example

  10. Eigen Decomposition
     How the eigenvalues and eigenvectors create a matrix decomposition, $A = Q \Lambda Q^{-1}$:
     • $Q$ is the matrix whose columns are the eigenvectors
     • $\Lambda$ is the diagonal matrix containing all the eigenvalues
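A small numerical check of the eigen decomposition, assuming a symmetric 2×2 matrix picked here purely for illustration. For a symmetric matrix the eigenvectors can be chosen orthonormal, so the inverse of Q is simply its transpose.

```python
import numpy as np

# Hypothetical symmetric matrix chosen for the example.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is meant for symmetric matrices: eigenvalues come back in ascending
# order and the eigenvectors are the orthonormal columns of Q.
eigvals, Q = np.linalg.eigh(A)
Lam = np.diag(eigvals)

# Rebuild A from its eigenvectors and eigenvalues (Q^{-1} = Q^T here).
A_rebuilt = Q @ Lam @ Q.T
```

For a non-symmetric square matrix, `np.linalg.eig` and `np.linalg.inv(Q)` would take the place of `eigh` and `Q.T`.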

  11. Singular Value Decomposition (SVD)

  12. Similarity of Eigen and SVD
     Eigen decomposition:
     • Columns of $Q$ are eigenvectors
     • $\Lambda$ contains the eigenvalues $\lambda_i$
     • $A$ must be square, and we defined $A$ as $A = M^T M$.
     SVD (of $M$):
     • Columns of $U$ are left-singular vectors
     • Columns of $V$ are right-singular vectors
     • $\Sigma$ contains the ordered singular values $\sigma_i$
     Relationship:
     • The $v_i$ are eigenvectors of $M^T M$.
     • The $u_i$ are eigenvectors of $M M^T$.
     • The eigenvalues are the squares of the singular values ($\lambda_i = \sigma_i^2$).
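The relationship can be checked numerically. The sketch below assumes a small 3×2 matrix M invented for the example; eigenvectors and singular vectors are only determined up to sign, so the comparison ignores signs.

```python
import numpy as np

# Hypothetical matrix for the check.
M = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Eigen decomposition of the square, symmetric matrix A = M^T M.
eigvals, Q = np.linalg.eigh(M.T @ M)
eigvals_desc = eigvals[::-1]   # reorder to match the descending singular values
Q_desc = Q[:, ::-1]

# Eigenvalues are the squares of the singular values.
same_values = np.allclose(eigvals_desc, s ** 2)

# Eigenvectors of M^T M match the right-singular vectors up to sign.
same_vectors = np.allclose(np.abs(Q_desc.T @ Vt.T), np.eye(2))

# Left-singular vectors are eigenvectors of M M^T with the same eigenvalues.
left_ok = np.allclose(M @ M.T @ U, U * s ** 2)
```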

  13. AN APPLICATION EXAMPLE ...

  14. From: Dimensionality Reduction: SVD & CUR. CS246: Mining Massive Datasets, Jure Leskovec, Stanford University, http://cs246.stanford.edu

  15. SVD - Properties
     It is always possible to decompose a real matrix $A$ into $A = U \Sigma V^T$, where
     • $U$, $\Sigma$, $V$: unique (strictly, $\Sigma$ is unique, while the columns of $U$ and $V$ are determined only up to sign, or up to a rotation among vectors that share a repeated singular value)
     • $U$, $V$: column orthonormal
        • $U^T U = I$; $V^T V = I$ ($I$: identity matrix)
        • (Columns are orthogonal unit vectors)
     • $\Sigma$: diagonal
        • Entries (singular values) are non-negative and sorted in decreasing order ($\sigma_1 \ge \sigma_2 \ge \dots \ge 0$)
     Nice proof of uniqueness: http://www.mpi-inf.mpg.de/~bast/ir-seminar-ws04/lecture2.pdf
     Jure Leskovec, Stanford CS246: Mining Massive Datasets, 1/29/2018
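These properties are easy to confirm with `numpy.linalg.svd`. The sketch assumes a random 6×4 matrix as a stand-in for any real matrix A; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))   # arbitrary real matrix for the check

# full_matrices=False gives the compact form: U is 6x4, s has 4 entries, Vt is 4x4.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

# U and V are column orthonormal.
cols_orthonormal = np.allclose(U.T @ U, np.eye(4)) and np.allclose(V.T @ V, np.eye(4))

# Singular values are non-negative and sorted in decreasing order.
sorted_nonneg = bool(np.all(s[:-1] >= s[1:]) and np.all(s >= 0))

# The three factors multiply back to A.
reconstructs = np.allclose(U @ np.diag(s) @ Vt, A)
```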

  16. SVD – Example: Users-to-Movies
     Consider a matrix. What does SVD do? $A = U \Sigma V^T$, where $A$ has $m$ rows (users) and $n$ columns (movies):

                   Matrix  Alien  Serenity  Casablanca  Amelie
        SciFi         1      1       1          0         0
        SciFi         3      3       3          0         0
        SciFi         4      4       4          0         0
        SciFi         5      5       5          0         0
        Romance       0      2       0          4         4
        Romance       0      0       0          5         5
        Romance       0      1       0          2         2

     The factors connect users and movies through "concepts", AKA latent dimensions, AKA latent factors.
     (Column order follows the query vectors on slides 28 and 29.)

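  17. SVD – Example: Users-to-Movies
     $A = U \Sigma V^T$ - example: Users to Movies (columns: Matrix, Alien, Serenity, Casablanca, Amelie; rows 1-4: SciFi users, rows 5-7: Romance users)

     $$
     \begin{bmatrix}
     1 & 1 & 1 & 0 & 0 \\
     3 & 3 & 3 & 0 & 0 \\
     4 & 4 & 4 & 0 & 0 \\
     5 & 5 & 5 & 0 & 0 \\
     0 & 2 & 0 & 4 & 4 \\
     0 & 0 & 0 & 5 & 5 \\
     0 & 1 & 0 & 2 & 2
     \end{bmatrix}
     =
     \begin{bmatrix}
     0.13 & 0.02 & -0.01 \\
     0.41 & 0.07 & -0.03 \\
     0.55 & 0.09 & -0.04 \\
     0.68 & 0.11 & -0.05 \\
     0.15 & -0.59 & 0.65 \\
     0.07 & -0.73 & -0.67 \\
     0.07 & -0.29 & 0.32
     \end{bmatrix}
     \times
     \begin{bmatrix}
     12.4 & 0 & 0 \\
     0 & 9.5 & 0 \\
     0 & 0 & 1.3
     \end{bmatrix}
     \times
     \begin{bmatrix}
     0.56 & 0.59 & 0.56 & 0.09 & 0.09 \\
     0.12 & -0.02 & 0.12 & -0.69 & -0.69 \\
     0.40 & -0.80 & 0.40 & 0.09 & 0.09
     \end{bmatrix}
     $$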
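The decomposition on this slide can be reproduced with NumPy. The sketch assumes the column order Matrix, Alien, Serenity, Casablanca, Amelie (inferred from the query vectors on slides 28 and 29). The signs of the singular vectors returned by the library may differ from those on the slide, since they are only determined up to sign, and the slide's numbers are rounded.

```python
import numpy as np

movies = ["Matrix", "Alien", "Serenity", "Casablanca", "Amelie"]  # assumed column order
A = np.array([
    [1, 1, 1, 0, 0],   # SciFi users
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],   # Romance users
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the concepts with a non-zero singular value (the numerical rank of A).
r = int(np.sum(s > 1e-10))
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]

A_rebuilt = U @ np.diag(s) @ Vt
```

After truncation, `s` holds three singular values close to the 12.4, 9.5 and 1.3 shown on the slide, and `A_rebuilt` matches `A`.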

  18. SVD – Example: Users-to-Movies
     $A = U \Sigma V^T$ - example: Users to Movies (same $A$, $U$, $\Sigma$ and $V^T$ as on slide 17)
     • First column of $U$: SciFi-concept
     • Second column of $U$: Romance-concept

  19. SVD – Example: Users-to-Movies
     $A = U \Sigma V^T$ - example (same matrices as on slide 17): $U$ is the "user-to-concept" factor matrix
     • First column of $U$: SciFi-concept
     • Second column of $U$: Romance-concept

  20. SVD – Example: Users-to-Movies
     $A = U \Sigma V^T$ - example (same matrices as on slide 17): the first diagonal entry of $\Sigma$, 12.4, is the "strength" of the SciFi-concept

  21. SVD – Example: Users-to-Movies
     $A = U \Sigma V^T$ - example (same matrices as on slide 17): $V$ is the "movie-to-concept" factor matrix
     • First row of $V^T$ (0.56, 0.59, 0.56, 0.09, 0.09): SciFi-concept

  22. SVD - Interpretation #1
     'movies', 'users' and 'concepts':
     • $U$: user-to-concept matrix
     • $V$: movie-to-concept matrix
     • $\Sigma$: its diagonal elements give the 'strength' of each concept

  23. SVD – Best Low Rank Approx.
     • Fact: SVD gives the 'best' axis to project on:
        • 'best' = minimizing the sum of reconstruction errors
          $\|A - B\|_F = \sqrt{\sum_{ij} (A_{ij} - B_{ij})^2}$
     • $A = U \Sigma V^T$
     • $B$ is the best approximation of $A$: $B = U \Sigma V^T$ with the smallest singular values in $\Sigma$ set to zero
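A short sketch of the low-rank approximation, assuming the users-to-movies matrix from slide 17 and a target rank of k = 2 (both choices are made here for illustration). It also compares against a different rank-2 choice to show that dropping the smallest singular value gives the smaller Frobenius error.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 2, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 1, 0, 2, 2],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Best rank-k approximation: keep the k largest singular values, zero out the rest.
k = 2
B = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
err = np.linalg.norm(A - B, "fro")

# The error equals the root of the sum of the squared dropped singular values.
expected_err = np.sqrt(np.sum(s[k:] ** 2))

# For comparison: a rank-2 matrix that keeps concepts 1 and 3 instead of 1 and 2.
keep = [0, 2]
B_alt = U[:, keep] @ np.diag(s[keep]) @ Vt[keep, :]
err_alt = np.linalg.norm(A - B_alt, "fro")
```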

  24. Example of SVD

  25. Case study: How to query?
     • Q: Find users that like 'Matrix'
     • A: Map query into a 'concept space' – how?
     (Shown alongside the users-to-movies decomposition $A = U \Sigma V^T$ from slide 17.)

  26. Case study: How to query?
     • Q: Find users that like 'Matrix'
     • A: Map query into a 'concept space' – how?
     $q = [5\ 0\ 0\ 0\ 0]$ (columns: Matrix, Alien, Serenity, Casablanca, Amelie)
     Project into concept space: inner product with each 'concept' vector $v_i$
     [Figure: $q$ drawn on Matrix and Alien axes together with the concept vectors $v_1$ and $v_2$]

  27. Case study: How to query?
     • Q: Find users that like 'Matrix'
     • A: Map query into a 'concept space' – how?
     $q = [5\ 0\ 0\ 0\ 0]$ (columns: Matrix, Alien, Serenity, Casablanca, Amelie)
     Project into concept space: inner product with each 'concept' vector $v_i$
     [Figure: $q$ drawn on Matrix and Alien axes together with $v_1$ and $v_2$; the projection $q \cdot v_1$ is marked along $v_1$]

  28. Case study: How to query?
     Compactly, we have: $q_{concept} = q V$
     E.g. (columns of $q$: Matrix, Alien, Serenity, Casablanca, Amelie; first column of $V$: SciFi-concept):

     $$
     \begin{bmatrix} 5 & 0 & 0 & 0 & 0 \end{bmatrix}
     \times
     \begin{bmatrix}
     0.56 & 0.12 \\
     0.59 & -0.02 \\
     0.56 & 0.12 \\
     0.09 & -0.69 \\
     0.09 & -0.69
     \end{bmatrix}
     =
     \begin{bmatrix} 2.8 & 0.6 \end{bmatrix}
     $$

     The 5×2 matrix holds the movie-to-concept factors ($V$).
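The projection on this slide is a single matrix product. The sketch below copies the rounded movie-to-concept factors from the slide and assumes the column order Matrix, Alien, Serenity, Casablanca, Amelie.

```python
import numpy as np

# Movie-to-concept factors V from the slide (rows: Matrix, Alien, Serenity,
# Casablanca, Amelie; columns: the first two concepts).
V = np.array([
    [0.56,  0.12],
    [0.59, -0.02],
    [0.56,  0.12],
    [0.09, -0.69],
    [0.09, -0.69],
])

# Query user q rated only 'Matrix', with a 5.
q = np.array([5.0, 0.0, 0.0, 0.0, 0.0])

# Inner product of q with each concept vector.
q_concept = q @ V   # approximately [2.8, 0.6]
```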

  29. Case study: How to query?
     • How would the user d that rated ('Alien', 'Serenity') be handled? $d_{concept} = d V$
     E.g. (columns of $d$: Matrix, Alien, Serenity, Casablanca, Amelie; first column of $V$: SciFi-concept):

     $$
     \begin{bmatrix} 0 & 4 & 5 & 0 & 0 \end{bmatrix}
     \times
     \begin{bmatrix}
     0.56 & 0.12 \\
     0.59 & -0.02 \\
     0.56 & 0.12 \\
     0.09 & -0.69 \\
     0.09 & -0.69
     \end{bmatrix}
     =
     \begin{bmatrix} 5.2 & 0.4 \end{bmatrix}
     $$

     The 5×2 matrix holds the movie-to-concept factors ($V$).

  30. Case study: How to query?
     • Observation: User d that rated ('Alien', 'Serenity') will be similar to user q that rated ('Matrix'), although d and q have zero ratings in common!
     $d = [0\ 4\ 5\ 0\ 0] \rightarrow d_{concept} = [5.2\ 0.4]$
     $q = [5\ 0\ 0\ 0\ 0] \rightarrow q_{concept} = [2.8\ 0.6]$
     (columns: Matrix, Alien, Serenity, Casablanca, Amelie; the first concept coordinate is the SciFi-concept)
     Zero ratings in common, yet similarity > 0 in concept space.
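The observation can be made concrete with cosine similarity, which is one reasonable similarity measure assumed here (the slide does not name one). The factors are the rounded values from slide 28, so the second coordinate of d comes out near 0.5 rather than the 0.4 shown on the slide; the conclusion is unaffected.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rounded movie-to-concept factors V (rows: Matrix, Alien, Serenity, Casablanca, Amelie).
V = np.array([
    [0.56,  0.12],
    [0.59, -0.02],
    [0.56,  0.12],
    [0.09, -0.69],
    [0.09, -0.69],
])

q = np.array([5.0, 0.0, 0.0, 0.0, 0.0])   # rated 'Matrix'
d = np.array([0.0, 4.0, 5.0, 0.0, 0.0])   # rated 'Alien' and 'Serenity'

# In the original movie space the users share no rated movie.
sim_raw = cosine(q, d)

# In concept space both users load heavily on the SciFi-concept.
sim_concept = cosine(q @ V, d @ V)
```

`sim_raw` is zero because the two rating vectors have no overlapping non-zero entries, while `sim_concept` is close to 1.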
