Guaranteed Rank Minimization via Singular Value Projections


1. Guaranteed Rank Minimization via Singular Value Projections
Inderjit S. Dhillon, University of Texas at Austin
Workshop on Algorithms for Massive Data Processing, IIT Kanpur, Dec 18, 2009
Joint work with Raghu Meka and Prateek Jain

2. Overview
- Affine Constrained Rank Minimization Problem (ARMP)
- Singular Value Projection algorithm (SVP)
- Analysis
- Matrix Completion Results
- Conclusions

3. Rank Minimization Problem (RMP)
$$(\text{RMP}):\ \min_X \ \operatorname{rank}(X) \quad \text{s.t.} \quad X \in \mathcal{C},$$
where $\mathcal{C}$ is a convex set, e.g., a polyhedral set.
Applications:
- Machine Learning
- Computer Vision
- Control Theory

4. Affine Constrained Rank Minimization Problem (ARMP)
$$(\text{ARMP}):\ \min_X \ \operatorname{rank}(X) \quad \text{s.t.} \quad \mathcal{A}(X) = b,$$
where $X \in \mathbb{R}^{m \times n}$, $\mathcal{A}: \mathbb{R}^{m \times n} \to \mathbb{R}^d$, $b \in \mathbb{R}^d$, and $d \ll mn$.
Applications:
- Matrix completion: Netflix Challenge
- Linear time-invariant systems
- Embedding using missing Euclidean distances
NP-hard even to approximate within a log factor (Meka et al. '08).

5. An Example: Minimum Rank Matrix Completion
Netflix Challenge: given a few user-movie ratings, the goal is to complete the ratings matrix. A small number of latent factors is equivalent to low rank.
Special case of ARMP (a sketch of the sampling operator follows this slide):
$$(\text{MCP}):\ \min_X \ \operatorname{rank}(X) \quad \text{s.t.} \quad \operatorname{tr}(X e_j e_i^T) = b_{ij}, \ \forall (i,j) \in \Omega.$$
Typically, the number of samples is very small: Netflix has roughly 1% of the entries.
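To make the MCP instance concrete, here is a minimal numpy sketch of the sampling operator $\mathcal{A}$ and its adjoint for matrix completion, assuming $\Omega$ is given as parallel index arrays. The function names (`sample_op`, `sample_op_adjoint`) are illustrative, not from the slides.

```python
import numpy as np

def sample_op(X, rows, cols):
    # A(X): read out the observed entries X[i, j] for (i, j) in Omega.
    return X[rows, cols]

def sample_op_adjoint(v, rows, cols, shape):
    # A^T(v): scatter the d observed values back into an m x n matrix,
    # leaving unobserved entries at zero.
    M = np.zeros(shape)
    M[rows, cols] = v
    return M
```

Note that $\operatorname{tr}(X e_j e_i^T) = e_i^T X e_j = X_{ij}$, so each affine constraint simply pins one observed entry.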

6. Existing Work
- Various heuristics, such as alternating minimization and log-det relaxation; typically no theoretical guarantees.
- Recent work: theoretical guarantees via generalizations of compressed sensing.

7. ARMP: Generalization of Compressed Sensing (CS)
$$(\text{CS}):\ \min_x \ \|x\|_0 \quad \text{s.t.} \quad Ax = b,$$
where $x \in \mathbb{R}^n$, $A: \mathbb{R}^n \to \mathbb{R}^d$, $b \in \mathbb{R}^d$, and $d \ll n$ (typically, $d = s \log n$).
CS is a specific instance of ARMP with $X = \operatorname{Diag}(x)$.

Table: CS vs. ARMP

  Technique          | CS              | ARMP
  -------------------+-----------------+------------------
  Convex relaxation  | ℓ1 (Lasso)      | Trace norm (SVT)
  Greedy approach    | MP, OMP, CoSaMP | ADMiRA
  Hard thresholding  | IHT, GraDeS     | SVP, IHT

8. Restricted Isometry Property (RIP)
Most CS methods assume RIP:
$$(1 - \delta_s)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_s)\|x\|_2^2, \quad \forall x \ \text{s.t.} \ \|x\|_0 \le s.$$
Generalization to matrices:
$$(1 - \delta_k)\|X\|_F^2 \le \|\mathcal{A}(X)\|_2^2 \le (1 + \delta_k)\|X\|_F^2, \quad \forall X \ \text{s.t.} \ \operatorname{rank}(X) \le k.$$
Families satisfying RIP (see the sketch below):
- $\mathcal{A}(X) = A\,\operatorname{vec}(X)$ with $A_{ij} \sim N(0, 1/d)$
- $A_{ij} = +1/\sqrt{d}$ with probability $1/2$, $A_{ij} = -1/\sqrt{d}$ with probability $1/2$
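As a quick empirical illustration (not part of the slides), the following sketch draws a Gaussian measurement operator and checks that $\|\mathcal{A}(X)\|_2^2 / \|X\|_F^2$ stays close to 1 on random rank-$k$ matrices; the sizes are arbitrary toy choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, d = 30, 40, 2, 600

# Gaussian measurement operator: A(X) = A @ vec(X), with A_ij ~ N(0, 1/d).
A = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, m * n))

for _ in range(5):
    # Random rank-k matrix X = U V^T.
    X = rng.normal(size=(m, k)) @ rng.normal(size=(k, n))
    ratio = np.linalg.norm(A @ X.ravel())**2 / np.linalg.norm(X, 'fro')**2
    print(f"||A(X)||^2 / ||X||_F^2 = {ratio:.3f}")  # should be close to 1
```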

9. Singular Value Projection (SVP)
Relax ARMP to minimizing the residual over the (non-convex) rank constraint:
$$(\text{RARMP}):\ \min_X \ \psi(X) = \tfrac{1}{2}\|\mathcal{A}(X) - b\|_2^2 \quad \text{s.t.} \quad X \in \mathcal{C}(k) = \{X : \operatorname{rank}(X) \le k\}.$$
- Adapt classical projected gradient descent.
- Efficient projection onto the non-convex rank constraint (a sketch follows below).
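The rank-$k$ projection is just a truncated SVD. A minimal numpy sketch (the function name `P_k` mirrors the slides' notation but the implementation is illustrative):

```python
import numpy as np

def P_k(X, k):
    # Best rank-k approximation of X (Eckart-Young):
    # keep only the top k singular values and vectors.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```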

10. Singular Value Projection (SVP)
Algorithm 1: SVP
  Initialize $X_0 = 0$, $t = 0$; set step size $\eta_t$.
  repeat
    $X_{t+1} = P_k\big(X_t - \eta_t\, \mathcal{A}^T(\mathcal{A}(X_t) - b)\big)$, where $\mathcal{A}^T(\mathcal{A}(X_t) - b) = \nabla\psi(X_t)$
    $t = t + 1$
  until convergence
- $P_k(X) = U_k \Sigma_k V_k^T$: the top $k$ singular vectors and values, i.e., the best rank-$k$ approximation.
- $X_t$ is low rank and can be stored using $(m+n)k$ values.
A full numpy sketch of the loop follows below.
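Putting the pieces together, a minimal sketch of the SVP loop for the matrix-completion operator, reusing `sample_op`, `sample_op_adjoint`, and `P_k` from the sketches above. All names and defaults are illustrative, not from the slides.

```python
import numpy as np

def svp(b, rows, cols, shape, k, eta, iters=500, tol=1e-10):
    # Projected gradient descent on psi(X) = 0.5 * ||A(X) - b||_2^2,
    # projecting onto {X : rank(X) <= k} after every gradient step.
    X = np.zeros(shape)
    for _ in range(iters):
        residual = sample_op(X, rows, cols) - b                    # A(X) - b
        if np.linalg.norm(residual) ** 2 <= tol:
            break
        grad = sample_op_adjoint(residual, rows, cols, shape)      # grad psi(X) = A^T(A(X) - b)
        X = P_k(X - eta * grad, k)                                 # gradient step, then SVD projection
    return X
```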

11. SVP: Main Result
Theorem (exact case). Assume isometry constant $\delta_{2k} < 1/3$ and $b = \mathcal{A}(X^*)$. With step size $\eta_t = 1/(1 + \delta_{2k})$, SVP outputs a matrix $X$ of rank $k$ such that
$$\|\mathcal{A}(X) - b\|_2^2 \le \epsilon,$$
with the maximum number of iterations bounded by
$$C \log\!\left(\frac{\|b\|_2^2}{\epsilon}\right).$$
- Geometric convergence.
- For $\delta_{2k} = 1/5$, $\eta_t = 5/6$, the number of iterations is $\log(\|b\|_2^2/\epsilon)$.
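A small end-to-end check of the exact case, reusing the sketches above. The sizes, sampling density, and iteration budget are arbitrary toy choices; the step size $5/6$ matches the slide, though for subsampled operators convergence speed in practice depends on the sampling density.

```python
rng = np.random.default_rng(1)
m, n, k = 50, 60, 2
X_star = rng.normal(size=(m, k)) @ rng.normal(size=(k, n))  # planted rank-k matrix

# Observe 40% of the entries uniformly at random.
mask = rng.random((m, n)) < 0.4
rows, cols = np.nonzero(mask)
b = X_star[rows, cols]                                      # exact case: b = A(X*)

X_hat = svp(b, rows, cols, (m, n), k, eta=5/6)
rel_err = np.linalg.norm(X_hat - X_star, 'fro') / np.linalg.norm(X_star, 'fro')
print(rel_err)  # should be small if completion succeeds
```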

12. SVP: Guarantees (Noisy Case)
Theorem (noisy case). Assume isometry constant $\delta_{2k} \le 1/3$ and $b = \mathcal{A}(X^*) + e$, where $e$ is an error vector. With step size $\eta_t = 1/(1 + \delta_{2k})$, SVP outputs $X$ of rank $k$ such that
$$\|\mathcal{A}(X) - b\|_2^2 \le (C^2 + \epsilon)\,\|e\|_2^2, \quad \epsilon \ge 0,$$
with the number of iterations bounded by
$$D \log\!\left(\frac{\|b\|_2^2}{(C^2 + \epsilon)\,\|e\|_2^2}\right).$$
- Geometric convergence to a $C$-approximate solution.
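Continuing the toy run above (this snippet reuses `b`, `rows`, `cols`, and `rng` from the previous block), adding measurement noise makes the residual plateau at roughly the noise level instead of decaying to zero, which is the $C$-approximation guarantee in spirit. Again an illustrative setup, not from the slides.

```python
e = 0.01 * rng.normal(size=b.shape)                   # measurement noise
X_noisy = svp(b + e, rows, cols, (m, n), k, eta=5/6)
res = np.linalg.norm(sample_op(X_noisy, rows, cols) - (b + e))
print(res**2, np.linalg.norm(e)**2)                   # residual^2 vs ||e||^2: same order
```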

13-17. SVP: Proof
Simple analysis: apply RIP twice and the Eckart-Young theorem once.
$\psi(X) = \tfrac{1}{2}\|\mathcal{A}(X) - b\|_2^2$ is a quadratic function, so, noting that $\operatorname{rank}(X_{t+1} - X_t) \le 2k$:
$$\psi(X_{t+1}) - \psi(X_t) = \langle \nabla\psi(X_t),\, X_{t+1} - X_t \rangle + \tfrac{1}{2}\|\mathcal{A}(X_{t+1} - X_t)\|_2^2$$
$$\le \langle \nabla\psi(X_t),\, X_{t+1} - X_t \rangle + \tfrac{1}{2}(1 + \delta_{2k})\|X_{t+1} - X_t\|_F^2 \quad \text{(using RIP)}$$
$$= \tfrac{1}{2}(1 + \delta_{2k})\,\|X_{t+1} - Y_{t+1}\|_F^2 - \tfrac{1}{2(1 + \delta_{2k})}\,\|\mathcal{A}^T(\mathcal{A}(X_t) - b)\|_F^2,$$
where $Y_{t+1} = X_t - \tfrac{1}{1 + \delta_{2k}}\nabla\psi(X_t)$ and $X_{t+1} = P_k(Y_{t+1})$. Then
$$\le \tfrac{1}{2}(1 + \delta_{2k})\,\|X^* - Y_{t+1}\|_F^2 - \tfrac{1}{2(1 + \delta_{2k})}\,\|\mathcal{A}^T(\mathcal{A}(X_t) - b)\|_F^2,$$
by the Eckart-Young theorem: $X_{t+1}$ is the best rank-$k$ approximation of $Y_{t+1}$, and $\operatorname{rank}(X^*) \le k$.
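The equality step above is a completion of squares; spelling it out (not on the slides, but implied by the definitions of $Y_{t+1}$ and $\eta_t$):

```latex
% With \eta = 1/(1+\delta_{2k}) and Y_{t+1} = X_t - \eta\,\nabla\psi(X_t),
% for any matrix X:
\langle \nabla\psi(X_t),\, X - X_t \rangle
  + \tfrac{1+\delta_{2k}}{2}\,\|X - X_t\|_F^2
= \tfrac{1+\delta_{2k}}{2}\,\|X - Y_{t+1}\|_F^2
  - \tfrac{1}{2(1+\delta_{2k})}\,\|\nabla\psi(X_t)\|_F^2 .
% Setting X = X_{t+1} gives the equality; since X_{t+1} = P_k(Y_{t+1})
% minimizes \|X - Y_{t+1}\|_F over rank-k matrices (Eckart-Young) and
% rank(X^*) <= k, replacing X_{t+1} by X^* can only increase the bound.
```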

18-19. SVP: Proof (continued)
Applying the same completion of squares with $X^*$ in place of $X_{t+1}$:
$$\psi(X_{t+1}) - \psi(X_t) \le \tfrac{1}{2}(1 + \delta_{2k})\,\|X^* - Y_{t+1}\|_F^2 - \tfrac{1}{2(1 + \delta_{2k})}\,\|\mathcal{A}^T(\mathcal{A}(X_t) - b)\|_F^2 \quad \text{(Eckart-Young)}$$
$$= \langle \nabla\psi(X_t),\, X^* - X_t \rangle + \tfrac{1}{2}(1 + \delta_{2k})\,\|X^* - X_t\|_F^2$$
$$\le \langle \nabla\psi(X_t),\, X^* - X_t \rangle + \tfrac{1}{2}\,\frac{1 + \delta_{2k}}{1 - \delta_{2k}}\,\|\mathcal{A}(X^* - X_t)\|_2^2 \quad \text{(using RIP, since } \operatorname{rank}(X^* - X_t) \le 2k\text{)}.$$
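The transcript ends here; for completeness, a sketch of how the bound closes in the exact case, using only the slides' definitions:

```latex
% Exact case: b = A(X^*), so A(X^* - X_t) = b - A(X_t), hence
% \|A(X^* - X_t)\|_2^2 = 2\psi(X_t), and
% <\nabla\psi(X_t), X^* - X_t> = <A(X_t) - b,\, A(X^* - X_t)> = -2\psi(X_t).
% Substituting into the last display:
\psi(X_{t+1}) \le \psi(X_t)\left(1 - 2 + \frac{1+\delta_{2k}}{1-\delta_{2k}}\right)
             = \frac{2\delta_{2k}}{1-\delta_{2k}}\,\psi(X_t),
% and 2\delta_{2k}/(1-\delta_{2k}) < 1 whenever \delta_{2k} < 1/3,
% which is the geometric convergence claimed on slide 11.
```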
