The Singular Value Decomposition
COMPSCI 527 — Computer Vision
Outline
1 Math Corners and the SVD: Motivation
2 Orthogonal Matrices
3 Orthogonal Projection
4 The Singular Value Decomposition
5 Principal Component Analysis
Math Corners and the SVD: Motivation
• A few math installments to get ready for later technical topics are sprinkled throughout the course
• The Singular Value Decomposition (SVD) gives the most complete geometric picture of a linear mapping
• SVD yields orthonormal vector bases for the null space, the row space, the range, and the left null space of a matrix
• SVD leads to the pseudo-inverse, a way to give a linear system a unique and stable approximate solution
• SVD leads to principal component analysis, a technique to reduce the dimensionality of a set of vector data while retaining as much information as possible
• Dimensionality reduction improves the ability of machine learning methods to generalize
Why Orthonormal Bases are Useful
• $n$-dimensional linear space $S \subseteq \mathbb{R}^m$ (so $n \leq m$)
• $p = [p_1, \ldots, p_m]^T \in S$ (coordinates in the standard basis)
• $v_1, \ldots, v_n$: an orthonormal basis for $S$: $v_i^T v_j = \delta_{ij}$ (Kronecker delta)
• Then there exists $q = [q_1, \ldots, q_n]^T$ such that $p = q_1 v_1 + \ldots + q_n v_n$
• Matrix form: $p = Vq$ where $V = [v_1, \ldots, v_n] \in \mathbb{R}^{m \times n}$
• $v_i^T p = v_i^T (q_1 v_1 + \ldots + q_n v_n) = q_i$ by orthonormality
The Left Inverse of an Orthogonal Matrix
• $q_i = v_i^T p$ (finding the coefficients $q_i$ is easy!)
• Rewrite $v_i^T v_j = \delta_{ij}$ in matrix form: with $V = [v_1, \ldots, v_n] \in \mathbb{R}^{m \times n}$, $V^T V = I \in \mathbb{R}^{n \times n}$
• That is, $LV = I$ with $L = V^T$: the transpose of $V$ is a left inverse of $V$
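As a quick numerical check (a minimal sketch, not from the slides; the orthonormal basis is obtained from the QR factorization of a random matrix, which is one convenient way to get one):

```python
import numpy as np

# Build an orthonormal basis V for a random n-dimensional subspace of R^m.
m, n = 5, 3
rng = np.random.default_rng(0)
V, _ = np.linalg.qr(rng.standard_normal((m, n)))  # V is m x n with orthonormal columns

# V^T V = I (left inverse), even though V V^T != I when n < m.
assert np.allclose(V.T @ V, np.eye(n))

# A point p in the subspace spanned by V, with known coefficients q.
q = np.array([2.0, -1.0, 0.5])
p = V @ q

# Recovering the coefficients is just one matrix-vector product: q = V^T p.
assert np.allclose(V.T @ p, q)
```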
Left and Right Inverse of an Orthogonal Matrix
• $LV = I$ with $L = V^T$
• Can we have $R$ such that $VR = I$? That would be the right inverse
• For $m > n$ there is none: $VR$ has rank at most $n < m$, so it cannot equal the $m \times m$ identity
• What if $m = n$? Then $V$ is square, $V^T V = I$ forces $VV^T = I$, and $V^T$ is both the left and the right inverse: $V^{-1} = V^T$
Orthogonal Transformations Preserve Norm ($m \geq n$)
• $y = Vx : \mathbb{R}^n \rightarrow \mathbb{R}^m$
• $\|y\|^2 = y^T y = x^T V^T V x = x^T x = \|x\|^2$
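A one-line numerical check of this identity (a sketch; the orthonormal $V$ again comes from QR, as in the previous snippet):

```python
import numpy as np

m, n = 5, 3
rng = np.random.default_rng(1)
V, _ = np.linalg.qr(rng.standard_normal((m, n)))  # orthonormal columns
x = rng.standard_normal(n)

# Mapping by V preserves the Euclidean norm: ||Vx|| = ||x||.
assert np.isclose(np.linalg.norm(V @ x), np.linalg.norm(x))
```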
Projection Onto a Subspace ($n \leq m$)
• Projection of $b \in \mathbb{R}^m$ onto subspace $S \subseteq \mathbb{R}^m$ is the point $p \in S$ closest to $b$
• Let $V \in \mathbb{R}^{m \times n}$ be an orthonormal basis for $S$ (that is, $V$ is an orthogonal matrix)
• $b - p \perp v_i$ for $i = 1, \ldots, n$, that is, $V^T(b - p) = 0$
• Projection of $b \in \mathbb{R}^m$ onto $S$ is $p = VV^T b$ (optional proofs in an Appendix)
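A small numpy sketch of the projection formula (the specific subspace, a random plane in $\mathbb{R}^3$ obtained via QR, is my choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
V, _ = np.linalg.qr(rng.standard_normal((3, 2)))  # orthonormal basis of a plane S in R^3
b = rng.standard_normal(3)

p = V @ (V.T @ b)  # orthogonal projection of b onto S: p = V V^T b

# The residual b - p is perpendicular to every basis vector of S.
assert np.allclose(V.T @ (b - p), 0)

# p is closer to b than other points of S (spot-check against random points).
for _ in range(100):
    s = V @ rng.standard_normal(2)  # a random point of S
    assert np.linalg.norm(b - p) <= np.linalg.norm(b - s) + 1e-12
```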
Linear Mappings
$b = Ax : \mathbb{R}^n \rightarrow \mathbb{R}^m$. Example ($m = n = 3$):
$$A = \frac{1}{\sqrt{2}} \begin{bmatrix} \sqrt{3} & \sqrt{3} & 0 \\ -3 & 3 & 0 \\ 1 & 1 & 0 \end{bmatrix}$$
[Figure: $\mathrm{range}(A) \leftrightarrow \mathrm{rowspace}(A)$]
The Singular Value Decomposition: Geometry
$$b = Ax \quad \text{where} \quad A = \frac{1}{\sqrt{2}} \begin{bmatrix} \sqrt{3} & \sqrt{3} & 0 \\ -3 & 3 & 0 \\ 1 & 1 & 0 \end{bmatrix}$$
[Figure: geometry of the mapping $b = Ax$]
The Singular Value Decomposition: Algebra
• $A v_1 = \sigma_1 u_1$, $A v_2 = \sigma_2 u_2$, with $\sigma_1 \geq \sigma_2 > \sigma_3 = 0$
• $u_i^T u_j = \delta_{ij}$: $u_1^T u_1 = u_2^T u_2 = u_3^T u_3 = 1$ and $u_1^T u_2 = u_1^T u_3 = u_2^T u_3 = 0$
• $v_i^T v_j = \delta_{ij}$: $v_1^T v_1 = v_2^T v_2 = v_3^T v_3 = 1$ and $v_1^T v_2 = v_1^T v_3 = v_2^T v_3 = 0$
The Singular Value Decomposition: General
For any real $m \times n$ matrix $A$ there exist orthogonal matrices
$$U = [u_1 \cdots u_m] \in \mathbb{R}^{m \times m}, \qquad V = [v_1 \cdots v_n] \in \mathbb{R}^{n \times n}$$
such that
$$U^T A V = \Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_p) \in \mathbb{R}^{m \times n}$$
where $p = \min(m, n)$ and $\sigma_1 \geq \ldots \geq \sigma_p \geq 0$. Equivalently, $A = U \Sigma V^T$.
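A sketch of the decomposition in numpy, applied to the example matrix above (assuming my transcription of that matrix is right; note that np.linalg.svd returns $V^T$, not $V$):

```python
import numpy as np

A = np.array([[np.sqrt(3), np.sqrt(3), 0],
              [-3,         3,          0],
              [1,          1,          0]]) / np.sqrt(2)

U, s, Vt = np.linalg.svd(A)  # s holds sigma_1 >= ... >= sigma_p >= 0

print(np.round(s, 6))  # [3. 2. 0.] if the matrix is transcribed as above

# Reconstruct A = U Sigma V^T.
Sigma = np.zeros_like(A)
np.fill_diagonal(Sigma, s)
assert np.allclose(U @ Sigma @ Vt, A)

# U and V are orthogonal: U^T U = I and V^T V = I.
assert np.allclose(U.T @ U, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(3))
```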
Rank and the Four Subspaces
$$A = U \Sigma V^T = [u_1, \ldots, u_r, u_{r+1}, \ldots, u_m]
\begin{bmatrix}
\sigma_1 & & & \\
& \ddots & & 0 \\
& & \sigma_r & \\
& 0 & & 0
\end{bmatrix}
\begin{bmatrix} v_1^T \\ \vdots \\ v_r^T \\ v_{r+1}^T \\ \vdots \\ v_n^T \end{bmatrix}
\qquad \text{[drawn for } m > n \text{]}$$
• $r$ = number of nonzero singular values = $\mathrm{rank}(A)$
• $u_1, \ldots, u_r$ span the range of $A$; $u_{r+1}, \ldots, u_m$ span the left null space; $v_1, \ldots, v_r$ span the row space; $v_{r+1}, \ldots, v_n$ span the null space
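A sketch of how those four bases can be read off a numerical SVD (the tolerance-based rank test is my addition, since computed singular values are rarely exactly zero):

```python
import numpy as np

A = np.array([[np.sqrt(3), np.sqrt(3), 0],
              [-3,         3,          0],
              [1,          1,          0]]) / np.sqrt(2)
m, n = A.shape

U, s, Vt = np.linalg.svd(A)
tol = max(m, n) * np.finfo(A.dtype).eps * s[0]
r = int(np.sum(s > tol))  # numerical rank

range_A     = U[:, :r]     # orthonormal basis for the range (column space)
left_null_A = U[:, r:]     # orthonormal basis for the left null space
row_space_A = Vt[:r, :].T  # orthonormal basis for the row space
null_A      = Vt[r:, :].T  # orthonormal basis for the null space

# Sanity checks: A kills the null space, and A^T kills the left null space.
assert np.allclose(A @ null_A, 0)
assert np.allclose(A.T @ left_null_A, 0)
```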
Principal Component Analysis
• We used the SVD to view a matrix as a map
• We can also view a matrix as a data set
• $A$ is an $m \times n$ matrix with $n$ data points in $\mathbb{R}^m$
Principal Component Analysis
• Let $k \leq m$ be some "smaller dimensionality"
• How to approximate the $m$-dimensional data in $A$ with points in $k$ dimensions? (Data compression, dimensionality reduction)
• The columns in $A$ are a cloud of points around the mean $\mu(A) = \frac{1}{n} A \mathbf{1}_n$
• Center the matrix: $A_c = A - \mu(A) \mathbf{1}_n^T$
Principal Component Analysis
• How to approximate the $m$-dimensional centered cloud $A_c$ in $k \ll m$ dimensions?
$$A_c = U \Sigma V^T = [u_1, \ldots, u_k, u_{k+1}, \ldots, u_m]
\begin{bmatrix} \sigma_1 v_1^T \\ \vdots \\ \sigma_k v_k^T \\ \sigma_{k+1} v_{k+1}^T \\ \vdots \\ \sigma_{\min(m,n)} v_n^T \end{bmatrix}$$
• $A_c \approx U_k \Sigma_k V_k^T = [u_1, \ldots, u_k] \begin{bmatrix} \sigma_1 v_1^T \\ \vdots \\ \sigma_k v_k^T \end{bmatrix}$
Principal Component Analysis
• $A_c \approx U_k \Sigma_k V_k^T = [u_1, \ldots, u_k] \begin{bmatrix} \sigma_1 v_1^T \\ \vdots \\ \sigma_k v_k^T \end{bmatrix}$
• $A_c \approx U_k B$ with $B = [b_1, \ldots, b_n]$
• $B = U_k^T A_c$
• $B = U_k^T [A - \mu(A) \mathbf{1}_n^T]$ (PCA)
• $B$ is $k \times n$ and captures most of the variance in $A$
• See notes for a statistical interpretation (optional)
• Reconstruct approximate original data: $A = A_c + \mu(A) \mathbf{1}_n^T$
• $A \approx U_k B + \mu(A) \mathbf{1}_n^T$
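A minimal numpy sketch of these steps (the synthetic data matrix, built to be nearly $k$-dimensional around its mean, is my choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k = 10, 200, 3
# Synthetic data: a rank-k cloud plus a constant offset per coordinate.
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n)) + rng.standard_normal((m, 1))

mu = A.mean(axis=1, keepdims=True)  # mu(A) = (1/n) A 1_n, as an m x 1 column
A_c = A - mu                        # centered data: A_c = A - mu(A) 1_n^T

U, s, Vt = np.linalg.svd(A_c, full_matrices=False)
U_k = U[:, :k]                      # first k left singular vectors

B = U_k.T @ A_c                     # k x n matrix of PCA coefficients
A_hat = U_k @ B + mu                # reconstruction in the original space

# Near zero here, because the centered cloud is (numerically) rank k.
print("relative reconstruction error:", np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```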
PCA Example
• Image compression: each column of the image is viewed as a data point
[Figures: the $m \times n = 685 \times 1024$ original image; its singular values on a logarithmic scale; reconstructions with $k = 10$ and $k = 50$ dimensions]
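A sketch of how such an experiment could be run (the file name image.png is hypothetical, and imageio plus a grayscale image are my assumptions, not part of the slides):

```python
import numpy as np
import imageio.v3 as iio  # assumption: imageio is installed

img = iio.imread("image.png").astype(float)  # hypothetical grayscale image, shape (m, n)
m, n = img.shape

mu = img.mean(axis=1, keepdims=True)
A_c = img - mu
U, s, Vt = np.linalg.svd(A_c, full_matrices=False)

for k in (10, 50):  # the two compression levels shown on the slide
    U_k = U[:, :k]
    B = U_k.T @ A_c                 # k numbers per column instead of m
    approx = U_k @ B + mu
    err = np.linalg.norm(img - approx) / np.linalg.norm(img)
    print(f"k = {k}: relative error {err:.4f}")
```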
Encoding/Decoding New Points
• Given PCA parameters $\mu(A)$, $U_k$ for an $m \times n$ matrix $A$, the compressed points are $B = U_k^T [A - \mu(A) \mathbf{1}_n^T]$ (a $k \times n$ matrix with $k \ll m$)
• The original points can be approximately reconstructed as $A \approx U_k B + \mu(A) \mathbf{1}_n^T$
• Given a new point $a \in \mathbb{R}^m$, it can be encoded as $b = U_k^T [a - \mu(A)]$ (without incorporating $a$ into the PCA) ($b$ is a short, $k$-dimensional vector)
• The original $a$ can be approximately reconstructed as $a \approx U_k b + \mu(A)$
• The parameters $\mu(A)$, $U_k$ are a code, used for encoding and approximate decoding (compression/decompression)
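Both maps are one-liners; here is a sketch (the function names encode and decode are mine, not the slides'):

```python
import numpy as np

def encode(a, mu, U_k):
    """Compress a point a in R^m to k coefficients, given the PCA code (mu, U_k)."""
    return U_k.T @ (a - mu)

def decode(b, mu, U_k):
    """Approximately reconstruct the original point from its k coefficients."""
    return U_k @ b + mu

# Fit the code (mu, U_k) on a training matrix A.
rng = np.random.default_rng(4)
A = rng.standard_normal((10, 100))
mu = A.mean(axis=1)
U_k = np.linalg.svd(A - mu[:, None], full_matrices=False)[0][:, :3]

a = rng.standard_normal(10)   # a new point, not a column of A
b = encode(a, mu, U_k)        # k = 3 numbers instead of m = 10
a_hat = decode(b, mu, U_k)    # approximate reconstruction of a
```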
PCA is Not the Final Answer
[Figure: a data set with two classes of points, marked + and −]