
Principal Component Analysis: Proseminar Data Mining (Tobias Holl)

Principal Component Analysis. Proseminar Data Mining. Tobias Holl, Technische Universität München, 2017-06-09.


  1. Principal Component Analysis. Proseminar Data Mining. Tobias Holl, Technische Universität München, 2017-06-09. Outline: Introduction, Theory, Applications.

  2. The Problem

  3. The Problem: Data. Lots of data.

  4. An Example: Reference Energy Disaggregation Data Set [1]
     ▶ Power usage over >100 days for >200 devices
     ▶ Measured every 2 s
     ▶ Over 500 GB of compressed data

  5. An Example: Iris Data Set [2]
     ▶ 150 flowers of 3 different species
     ▶ Petal and sepal widths and lengths

  6. An Example. [Figure: pairwise plots of sepal length, sepal width, petal length, and petal width]

  7. An Example. [Figure: plots of petal length and petal width]

  8. An Example. [Figure: petal width plotted against petal length]

  9. An Example: Clear correlation. [Figure: petal width plotted against petal length]

  10. An Example: Unnecessary redundancy. [Figure: petal width plotted against petal length]

  11. Data Matrices. Rows are measurements 1 to m, columns are variables 1 to n:
      $$X = \begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{pmatrix} \in \mathbb{R}^{m \times n}$$

  12. Data Matrices. Rows are measurements 1 to m, columns are variables 1 to n:
      $$X = \begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{pmatrix} \in \mathbb{R}^{m \times n}$$
      Assume that X is centered around 0.
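Centering here is taken to mean that every column (variable) of X has mean zero, which is what the covariance formula on the following slides relies on. A minimal sketch, using only the Python standard library; the helper name `center_columns` and the data are hypothetical and not from the slides:

```python
def center_columns(X):
    """Return a copy of X (a list of rows) with each column's mean subtracted."""
    m = len(X)      # number of measurements (rows)
    n = len(X[0])   # number of variables (columns)
    means = [sum(row[j] for row in X) / m for j in range(n)]
    return [[row[j] - means[j] for j in range(n)] for row in X]


# Hypothetical data: 4 measurements of 2 variables.
X_raw = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [6.0, 8.0]]
X = center_columns(X_raw)

# After centering, every column sums to zero (up to rounding).
column_sums = [sum(row[j] for row in X) for j in range(len(X[0]))]
```

The original data can be recovered by adding the column means back, so centering loses no information.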

  13. Some Statistics
      $$\operatorname{cov}(x_a, x_b) = \frac{1}{m-1} x_a^T x_b$$

  14. Some Statistics
      $$\operatorname{cov}(x_a, x_b) = \frac{1}{m-1} x_a^T x_b$$
      $\operatorname{cov}(x_a, x_b)$ is the covariance of $x_a$ and $x_b$.

  15. Some Statistics
      $$\operatorname{cov}(x_a, x_b) = \frac{1}{m-1} x_a^T x_b$$
      $\operatorname{cov}(x_a, x_b)$ is the covariance of $x_a$ and $x_b$. Covariance describes the strength of the correlation.

  16. Some Statistics
      $$\operatorname{cov}(X) = \begin{pmatrix} \operatorname{cov}(x_1, x_1) & \cdots & \operatorname{cov}(x_1, x_n) \\ \vdots & \ddots & \vdots \\ \operatorname{cov}(x_1, x_n) & \cdots & \operatorname{cov}(x_n, x_n) \end{pmatrix}$$

  17. Some Statistics
      $$\operatorname{cov}(X) = \begin{pmatrix} \operatorname{cov}(x_1, x_1) & \cdots & \operatorname{cov}(x_1, x_n) \\ \vdots & \ddots & \vdots \\ \operatorname{cov}(x_1, x_n) & \cdots & \operatorname{cov}(x_n, x_n) \end{pmatrix}$$
      $$\operatorname{cov}(v, v) = \frac{1}{m-1} v^T v = \operatorname{var}(v)$$

  18. Some Statistics
      $$\operatorname{cov}(X) = \begin{pmatrix} \operatorname{var}(x_1) & \cdots & \operatorname{cov}(x_1, x_n) \\ \vdots & \ddots & \vdots \\ \operatorname{cov}(x_1, x_n) & \cdots & \operatorname{var}(x_n) \end{pmatrix}$$

  19. Some Statistics
      $$\operatorname{cov}(X) = \begin{pmatrix} \operatorname{var}(x_1) & \cdots & \operatorname{cov}(x_1, x_n) \\ \vdots & \ddots & \vdots \\ \operatorname{cov}(x_1, x_n) & \cdots & \operatorname{var}(x_n) \end{pmatrix} = \frac{1}{m-1} X^T X$$
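The matrix form of the covariance can be checked numerically. The following is a small sketch in plain Python, assuming X is already centered; the helper names (`transpose`, `matmul`, `covariance_matrix`) and the data are illustrative, not part of the slides:

```python
def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def covariance_matrix(X):
    """Compute X^T X / (m - 1) for a centered data matrix X with m rows."""
    m = len(X)
    XtX = matmul(transpose(X), X)
    return [[value / (m - 1) for value in row] for row in XtX]


# Hypothetical centered data: 4 measurements of 2 variables (columns sum to 0).
X = [[-2.0, -3.0], [-1.0, -1.0], [0.0, 1.0], [3.0, 3.0]]
C = covariance_matrix(X)

# C[a][b] holds cov(x_a, x_b); the diagonal holds the variances,
# and the matrix is symmetric because cov(x_a, x_b) = cov(x_b, x_a).
```

The diagonal entries agree with the sample variance of each column, which is the statement cov(v, v) = var(v) from the slides.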

  20. What We Really Want: Eliminate unnecessary redundancies

  21. What We Really Want: Transform X into Y so that
      $$\operatorname{cov}(y_a, y_b) = 0 \quad \forall\, a \neq b$$

  22. What We Really Want: Transform X linearly into Y = XP so that
      $$\operatorname{cov}(y_a, y_b) = 0 \quad \forall\, a \neq b$$

  23. What We Really Want: Transform X linearly into Y = XP so that
      $$\operatorname{cov}(Y) = \begin{pmatrix} \operatorname{var}(y_1) & & 0 \\ & \ddots & \\ 0 & & \operatorname{var}(y_n) \end{pmatrix}$$
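One step the slides leave implicit is how cov(Y) depends on P. A short derivation, assuming X is centered (so that Y = XP is centered as well) and using the matrix form of the covariance from the "Some Statistics" slides:

```latex
\begin{align*}
\operatorname{cov}(Y) &= \frac{1}{m-1} Y^T Y \\
                      &= \frac{1}{m-1} (XP)^T (XP) \\
                      &= P^T \left( \frac{1}{m-1} X^T X \right) P \\
                      &= P^T \operatorname{cov}(X) \, P
\end{align*}
```

The goal therefore reduces to finding a matrix P for which $P^T \operatorname{cov}(X) \, P$ is diagonal.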

  24. Diagonalizing Matrices
      Theorem: Every symmetric real matrix A has an eigenvalue decomposition $A = V D V^T$, where D is a diagonal matrix composed of the eigenvalues of A, and V is orthogonal. The columns of V are the eigenvectors corresponding to the matching diagonal entries of D.
      $D = V^T A V$ follows directly, since $V^T V = I$.

  25. Diagonalizing Matrices
      Theorem: Every symmetric real matrix A has an eigenvalue decomposition $A = V D V^T$, where D is a diagonal matrix composed of the eigenvalues of A, and V is orthogonal. The columns of V are the eigenvectors corresponding to the matching diagonal entries of D.
      $D = V^T A V$ follows directly, since $V^T V = I$.
      Applied to the symmetric matrix $A = \operatorname{cov}(X)$ with $P = V$:
      $$\operatorname{cov}(Y) = V^T \operatorname{cov}(X) \, V = D$$
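For two variables the eigenvalue decomposition has a closed form, which makes it possible to check the last identity without a linear-algebra library. The sketch below is one way to do this, assuming a symmetric 2×2 covariance matrix and hypothetical centered data; the function names are illustrative, and the rotation-angle formula is a substitute for the general-purpose eigensolver (for example `numpy.linalg.eigh`) that would normally be used on real data:

```python
import math


def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def covariance_matrix(X):
    """Compute X^T X / (m - 1) for a centered data matrix X with m rows."""
    m = len(X)
    return [[v / (m - 1) for v in row] for row in matmul(transpose(X), X)]


def eigen_2x2_symmetric(A):
    """Decompose a symmetric 2x2 matrix as A = V D V^T.

    Returns (eigenvalues, V); the columns of V are unit eigenvectors.
    V is a rotation by the angle that zeroes the off-diagonal entry.
    """
    a, b, c = A[0][0], A[0][1], A[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    V = [[cs, -sn], [sn, cs]]
    lam1 = a * cs * cs + 2.0 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2.0 * b * cs * sn + c * cs * cs
    return [lam1, lam2], V


# Hypothetical centered data: 4 measurements of 2 correlated variables.
X = [[-2.0, -3.0], [-1.0, -1.0], [0.0, 1.0], [3.0, 3.0]]

cov_X = covariance_matrix(X)
eigenvalues, V = eigen_2x2_symmetric(cov_X)

Y = matmul(X, V)              # transformed data, Y = XP with P = V
cov_Y = covariance_matrix(Y)  # diagonal up to rounding: the variances of y_1, y_2
```

With this choice of angle the first eigenvalue is the larger one, so the first column of Y carries most of the variance, while the off-diagonal entries of `cov_Y` vanish up to floating-point rounding.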
