[11] The Singular Value Decomposition

  1. [11] The Singular Value Decomposition

  2. The Singular Value Decomposition

     [Figure: Gene Golub’s license plate, photographed by Professor P. M. Kroonenberg of Leiden University.]

  3. Frobenius norm for matrices

     We have defined a norm for vectors over $\mathbb{R}$:

     $$\|[x_1, x_2, \dots, x_n]\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$

     Now we define a norm for matrices: interpret the matrix as a vector.

     $$\|A\|_F = \sqrt{\text{sum of squares of elements of } A}$$

     This is called the Frobenius norm of a matrix. The squared norm is just the sum of squares of the elements.

     Example:

     $$\left\| \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \right\|_F^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2$$

     Can group in terms of rows ...

     $$\left\| \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \right\|_F^2 = (1^2 + 2^2 + 3^2) + (4^2 + 5^2 + 6^2) = \|[1, 2, 3]\|^2 + \|[4, 5, 6]\|^2$$

     ... or columns

     $$\left\| \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \right\|_F^2 = (1^2 + 4^2) + (2^2 + 5^2) + (3^2 + 6^2) = \|[1, 4]\|^2 + \|[2, 5]\|^2 + \|[3, 6]\|^2$$

  4. Frobenius norm for matrices

     Proposition: The squared Frobenius norm of a matrix is the sum of the squared norms of its rows ...

     $$\left\| \begin{bmatrix} a_1 \\ \vdots \\ a_m \end{bmatrix} \right\|_F^2 = \|a_1\|^2 + \cdots + \|a_m\|^2$$

  5. Frobenius norm for matrices

     Proposition: The squared Frobenius norm of a matrix is the sum of the squared norms of its rows ... or of its columns.

     $$\left\| \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} \right\|_F^2 = \|v_1\|^2 + \cdots + \|v_n\|^2$$
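
The row and column groupings in the proposition can be checked numerically. The following is a minimal sketch using NumPy on the 2 × 3 example matrix from the slides; the variable names are illustrative only.

```python
import numpy as np

# The example matrix from the slides.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Squared Frobenius norm: sum of squares of all elements.
total = float(np.sum(A ** 2))

# Same quantity grouped by rows, then by columns (rows of A.T).
by_rows = float(sum(np.dot(row, row) for row in A))
by_cols = float(sum(np.dot(col, col) for col in A.T))

# NumPy's built-in Frobenius norm, squared, for comparison.
frob_sq = float(np.linalg.norm(A, "fro") ** 2)

print(total, by_rows, by_cols, frob_sq)
```

All four values should agree up to floating-point rounding, since each is a different grouping of the same six squared entries.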

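  6. Low-rank matrices

     Saving space and saving time.

     A rank-one matrix is a column vector times a row vector:

     $$\begin{bmatrix} \\ u \\ \\ \end{bmatrix} \begin{bmatrix} & v^T & \end{bmatrix}$$

     Multiplying it by a vector $w$ can be regrouped so that the full matrix is never formed:

     $$\left( \begin{bmatrix} \\ u \\ \\ \end{bmatrix} \begin{bmatrix} & v^T & \end{bmatrix} \right) \begin{bmatrix} \\ w \\ \\ \end{bmatrix} = \begin{bmatrix} \\ u \\ \\ \end{bmatrix} \left( \begin{bmatrix} & v^T & \end{bmatrix} \begin{bmatrix} \\ w \\ \\ \end{bmatrix} \right)$$

     A rank-two matrix:

     $$\begin{bmatrix} \\ u_1 & u_2 \\ \\ \end{bmatrix} \begin{bmatrix} & v_1^T & \\ & v_2^T & \end{bmatrix}$$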
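
To make the space and time savings concrete, here is a small sketch assuming NumPy and randomly generated vectors (the sizes `m` and `n` and all variable names are illustrative, not from the slides). It compares multiplying by the dense rank-one matrix against the regrouped product.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 4
u = rng.standard_normal(m)
v = rng.standard_normal(n)
w = rng.standard_normal(n)

# Dense representation: the m x n outer product u v^T (m*n numbers).
dense = np.outer(u, v)
slow = dense @ w                 # roughly m*n multiplications

# Factored representation: keep u and v only (m+n numbers).
fast = u * np.dot(v, w)          # roughly m+n multiplications

numbers_dense = m * n
numbers_factored = m + n

# The same regrouping works for rank two: (U Vt) w == U (Vt w).
U = rng.standard_normal((m, 2))
Vt = rng.standard_normal((2, n))
rank2_slow = (U @ Vt) @ w
rank2_fast = U @ (Vt @ w)

print(np.allclose(slow, fast), np.allclose(rank2_slow, rank2_fast))
print(numbers_dense, numbers_factored)
```

The saving is negligible at this size; it becomes significant when `m` and `n` are large and the rank is small.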

  7. Silly compression

     Represent a grayscale $m \times n$ image by an $m \times n$ matrix $A$. (Requires $mn$ numbers to represent.)

     Find a low-rank matrix $\tilde{A}$ that is as close as possible to $A$. (For rank $r$, requires only $r(m + n)$ numbers to represent.)

     [Figure: original image, 625 × 1024, so 625 × 1024 = 640,000 numbers.]

  8. Silly compression

     [Figure: rank-50 approximation of the 625 × 1024 image, so 50 × (625 + 1024) = 82,450 numbers, about 82k.]
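
The storage counts above, and one way to produce a low-rank approximation, can be sketched as follows. This is an illustration under assumptions: the slides so far do not say how to find $\tilde{A}$, so the sketch swaps in a truncated SVD from `numpy.linalg.svd`, and it uses a small random matrix as a stand-in for an image. The helper names are hypothetical.

```python
import numpy as np

def storage_full(m, n):
    """Numbers needed to store an m x n matrix entry by entry."""
    return m * n

def storage_low_rank(m, n, r):
    """Numbers needed to store a rank-r matrix as r pairs (u_i, v_i)."""
    return r * (m + n)

def rank_r_approximation(A, r):
    """Truncated SVD of A (an assumed technique, not given on the slides)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first r columns of U by the first r singular values.
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

print(storage_full(625, 1024))          # 640000
print(storage_low_rank(625, 1024, 50))  # 82450

# Synthetic stand-in for a grayscale image (hypothetical data).
rng = np.random.default_rng(0)
image = rng.random((40, 60))
approx = rank_r_approximation(image, 5)
rank = int(np.linalg.matrix_rank(approx))
error = float(np.linalg.norm(image - approx, "fro"))
print(rank, error)
```

Random data compresses poorly, so the error here is large; real images tend to have rapidly decaying structure that a low rank captures much better.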

  9. The trolley-line-location problem

     Given the locations of $m$ houses $a_1, \dots, a_m$, we must choose where to run a trolley line.

     The trolley line must go through downtown (the origin) and must be a straight line.

     The goal is to locate the trolley line so that it is as close as possible to the $m$ houses.

     Specify the line by a unit-norm vector $v$: the line is $\operatorname{Span}\{v\}$.

     In measuring the objective, how do we combine the individual objectives? As in least squares, we minimize the 2-norm of the vector $[d_1, \dots, d_m]$ of distances. This is equivalent to minimizing the square of the 2-norm of this vector, i.e. $d_1^2 + \cdots + d_m^2$.

     [Figure: houses $a_1$, $a_2$, $a_3$, $a_4$ in the plane.]

  10. The trolley-line-location problem

      [Figure: a candidate trolley line $\operatorname{Span}\{v\}$, with the vector $v$ marked.]

  11. The trolley-line-location problem

      [Figure: the distance from the trolley line to each of the houses $a_1$, $a_2$, $a_3$, $a_4$.]
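
The objective $d_1^2 + \cdots + d_m^2$ can be evaluated directly for any candidate direction. Below is a minimal sketch assuming NumPy, with the houses stored as the rows of a matrix; the function name `sum_squared_distances` and the two sample houses are illustrative choices, not part of the slides' statement of the problem.

```python
import numpy as np

def sum_squared_distances(A, v):
    """Sum of squared distances from the rows of A to the line Span{v}."""
    A = np.asarray(A, dtype=float)
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)          # make v unit-norm
    along = np.outer(A @ v, v)         # projection of each row along v
    perp = A - along                   # projection orthogonal to v
    return float(np.sum(perp ** 2))

# Two hypothetical houses.
houses = np.array([[1.0, 4.0],
                   [5.0, 2.0]])

# Line along the x-axis: distances are the y-coordinates, 4 and 2.
print(sum_squared_distances(houses, [1.0, 0.0]))  # 20.0
# Line along the y-axis: distances are the x-coordinates, 1 and 5.
print(sum_squared_distances(houses, [0.0, 1.0]))  # 26.0
```

Of these two candidates the x-axis is the better trolley line, but neither is claimed to be the best one.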

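  12. Solution to the trolley-line-location problem

      For each vector $a_i$, write $a_i = a_i^{\parallel v} + a_i^{\perp v}$, where $a_i^{\parallel v}$ is the projection of $a_i$ along $v$ and $a_i^{\perp v}$ is the projection orthogonal to $v$:

      $$\begin{aligned} a_1^{\perp v} &= a_1 - a_1^{\parallel v} \\ &\;\;\vdots \\ a_m^{\perp v} &= a_m - a_m^{\parallel v} \end{aligned}$$

      By the Pythagorean Theorem,

      $$\begin{aligned} \|a_1^{\perp v}\|^2 &= \|a_1\|^2 - \|a_1^{\parallel v}\|^2 \\ &\;\;\vdots \\ \|a_m^{\perp v}\|^2 &= \|a_m\|^2 - \|a_m^{\parallel v}\|^2 \end{aligned}$$

      Since the distance from $a_i$ to $\operatorname{Span}\{v\}$ is $\|a_i^{\perp v}\|$, we have

      $$\begin{aligned} (\text{dist from } a_1 \text{ to } \operatorname{Span}\{v\})^2 &= \|a_1\|^2 - \|a_1^{\parallel v}\|^2 \\ &\;\;\vdots \\ (\text{dist from } a_m \text{ to } \operatorname{Span}\{v\})^2 &= \|a_m\|^2 - \|a_m^{\parallel v}\|^2 \end{aligned}$$

      using $a_i^{\parallel v} = \langle a_i, v \rangle\, v$ and hence $\|a_i^{\parallel v}\|^2 = \langle a_i, v \rangle^2 \|v\|^2$.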

  13. Solution to the trolley-line-location problem

      By the Pythagorean Theorem, and since the distance from $a_i$ to $\operatorname{Span}\{v\}$ is $\|a_i^{\perp v}\|$, we have

      $$\begin{aligned} (\text{dist from } a_1 \text{ to } \operatorname{Span}\{v\})^2 &= \|a_1\|^2 - \|a_1^{\parallel v}\|^2 \\ &\;\;\vdots \\ (\text{dist from } a_m \text{ to } \operatorname{Span}\{v\})^2 &= \|a_m\|^2 - \|a_m^{\parallel v}\|^2 \end{aligned}$$

      Summing these equations,

      $$\begin{aligned} \sum_i (\text{dist from } a_i \text{ to } \operatorname{Span}\{v\})^2 &= \left( \|a_1\|^2 + \cdots + \|a_m\|^2 \right) - \left( \|a_1^{\parallel v}\|^2 + \cdots + \|a_m^{\parallel v}\|^2 \right) \\ &= \|A\|_F^2 - \left( \langle a_1, v \rangle^2 + \cdots + \langle a_m, v \rangle^2 \right) \end{aligned}$$

      using $a_i^{\parallel v} = \langle a_i, v \rangle\, v$ and hence $\|a_i^{\parallel v}\|^2 = \langle a_i, v \rangle^2 \|v\|^2 = \langle a_i, v \rangle^2$, since $v$ has unit norm.
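
The identity derived on this slide can be sanity-checked numerically. This is a small sketch assuming NumPy and randomly generated data (the matrix `A`, the unit vector `v`, and the seed are arbitrary illustrative choices): it computes the left side from the orthogonal projections and the right side from the dot products.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))      # rows play the role of a_1, ..., a_m
v = rng.standard_normal(3)
v /= np.linalg.norm(v)               # the derivation needs ||v|| = 1

# Dot products <a_i, v>, one per row.
dots = np.array([np.dot(a, v) for a in A])

# Left side: sum of squared norms of the orthogonal projections.
perp = A - np.outer(dots, v)
lhs = float(np.sum(perp ** 2))

# Right side: squared Frobenius norm minus the sum of squared dot products.
rhs = float(np.linalg.norm(A, "fro") ** 2 - np.sum(dots ** 2))

print(lhs, rhs)
```

The two printed values should agree up to rounding; they would differ if `v` were not normalized.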

  14. Solution to the trolley-line-location problem, continued

      $$\sum_i (\text{dist from } a_i \text{ to } \operatorname{Span}\{v\})^2 = \|A\|_F^2 - \left( \langle a_1, v \rangle^2 + \cdots + \langle a_m, v \rangle^2 \right) \tag{1}$$

      Next, we show that $\langle a_1, v \rangle^2 + \cdots + \langle a_m, v \rangle^2$ can be replaced by $\|Av\|^2$. By our dot-product interpretation of matrix-vector multiplication,

      $$\begin{bmatrix} a_1 \\ \vdots \\ a_m \end{bmatrix} v = \begin{bmatrix} \langle a_1, v \rangle \\ \vdots \\ \langle a_m, v \rangle \end{bmatrix}$$

      so

      $$\|Av\|^2 = \langle a_1, v \rangle^2 + \langle a_2, v \rangle^2 + \cdots + \langle a_m, v \rangle^2$$

      Substituting into Equation 1, we obtain

      $$\sum_i (\text{distance from } a_i \text{ to } \operatorname{Span}\{v\})^2 = \|A\|_F^2 - \|Av\|^2$$

      Therefore the best vector $v$ is a unit vector that maximizes $\|Av\|^2$ (equivalently, maximizes $\|Av\|$).

  15. Solution to the trolley-line-location problem, continued

      $$\sum_i (\text{distance from } a_i \text{ to } \operatorname{Span}\{v\})^2 = \|A\|_F^2 - \|Av\|^2$$

      Therefore the best vector $v$ is a unit vector that maximizes $\|Av\|^2$ (equivalently, maximizes $\|Av\|$).

          def trolley_line_location(A):
              v1 = arg max { ||Av|| : ||v|| = 1 }
              σ1 = ||A v1||
              return v1

      So far, this is a solution only in principle, since we have not specified how to actually compute $v_1$.

      Definition: We refer to $\sigma_1$ as the first singular value of $A$, and we refer to $v_1$ as the first right singular vector.
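
The slide leaves the computation of $v_1$ open. As one possible way to make the procedure runnable, the sketch below delegates to `numpy.linalg.svd`, which returns singular values in descending order together with the right singular vectors as rows of `Vt`. This is a swapped-in library routine, not a method given in the slides, and unlike the pseudocode it returns $\sigma_1$ along with $v_1$.

```python
import numpy as np

def trolley_line_location(A):
    """Return (v1, sigma1): a unit vector maximizing ||A v|| and that maximum.

    Assumption: relies on NumPy's SVD rather than an algorithm from the slides.
    """
    A = np.asarray(A, dtype=float)
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    v1 = Vt[0]        # first right singular vector (unit norm)
    sigma1 = s[0]     # first (largest) singular value
    return v1, float(sigma1)

# Simple check: for diag(3, 1) the best direction is the x-axis.
v1, sigma1 = trolley_line_location([[3.0, 0.0],
                                    [0.0, 1.0]])
print(v1, sigma1)
print(np.linalg.norm(np.array([[3.0, 0.0], [0.0, 1.0]]) @ v1))
```

The sign of `v1` is arbitrary: `v1` and `-v1` span the same line and give the same value of $\|Av\|$.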

  16. Trolley-line-location problem, example

      Example: Let $A = \begin{bmatrix} 1 & 4 \\ 5 & 2 \end{bmatrix}$, so $a_1 = [1, 4]$ and $a_2 = [5, 2]$. In this case, a unit vector maximizing $\|Av\|$ is $v_1 \approx \begin{bmatrix} .78 \\ .63 \end{bmatrix}$. We use $\sigma_1$ to denote $\|Av_1\|$, which is about 6.1.

      [Figure: plot of $a_1 = [1, 4]$, $a_2 = [5, 2]$, and $v_1 = [.777, .629]$.]
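
Because this example is two-dimensional, the maximization can be approximated by brute force: sweep unit vectors around the half circle and keep the one with the largest $\|Av\|$. The sketch below assumes NumPy; the grid size is an arbitrary choice, and the approach is an illustration rather than a practical algorithm.

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [5.0, 2.0]])

# Unit vectors at angles in [0, pi]; v and -v span the same line,
# so half the circle covers every candidate line.
angles = np.linspace(0.0, np.pi, 100001)
candidates = np.column_stack((np.cos(angles), np.sin(angles)))

# Row i of (candidates @ A.T) is A times the i-th candidate vector.
lengths = np.linalg.norm(candidates @ A.T, axis=1)

best = int(np.argmax(lengths))
v1 = candidates[best]
sigma1 = float(lengths[best])

print(np.round(v1, 3), round(sigma1, 2))
```

The result should agree with the values quoted on the slide to about two decimal places.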
