

  1. Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence
     Lijun Ding, joint work with Yingjie Fei, Qiantong Xu, and Chengrun Yang
     June 15, 2020
     Lijun Ding (Cornell University) · SpecFW · June 15, 2020 · 1 / 17

  2. Overview
     1. Introduction: problem setup; past algorithms
     2. SpecFW and strict complementarity: Spectral Frank-Wolfe (SpecFW); strict complementarity
     3. Numerics: experimental setup; numerical results

  3. Convex smooth minimization over a spectrahedron
     Main optimization problem:
         minimize_{X ∈ S^n}   f(X) := g(AX) + tr(CX)
         subject to           tr(X) = 1,  X ∈ S^n_+          (M)
     where S^n ⊂ R^{n×n} denotes the symmetric matrices.
     - function g strongly convex and smooth
     - linear map A and matrix C ∈ S^n
     - trace tr(·), sum of the diagonal entries
     - positive semidefinite matrices S^n_+, i.e., symmetric matrices with non-negative eigenvalues
     - unique optimal solution X⋆
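As a concrete instance of problem (M), the following sketch takes g(z) = ||z − b||², one admissible strongly convex and smooth choice; the map A, vector b, and matrix C are random placeholders, not data from the slides.

```python
import numpy as np

# A minimal sketch of problem (M) with g(z) = ||z - b||^2, an assumed
# (but admissible) strongly convex, smooth loss. A, b, C are hypothetical.
rng = np.random.default_rng(0)
n, m = 6, 4

def sym(M):
    return (M + M.T) / 2  # symmetrize: S^n is the set of symmetric matrices

A_mats = [sym(rng.standard_normal((n, n))) for _ in range(m)]  # defines the map A
b = rng.standard_normal(m)
C = sym(rng.standard_normal((n, n)))

def A_op(X):
    # linear map A : S^n -> R^m, with (A X)_i = tr(A_i X)
    return np.array([np.trace(Ai @ X) for Ai in A_mats])

def f(X):
    # objective of (M): f(X) = g(A X) + tr(C X)
    return np.sum((A_op(X) - b) ** 2) + np.trace(C @ X)

X0 = np.eye(n) / n  # a feasible point: X0 is PSD and tr(X0) = 1
```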

  4. Applications
         minimize_{X ∈ S^n}   f(X) := g(AX) + tr(CX)
         subject to           tr(X) = 1,  X ∈ S^n_+          (M)
     - matrix sensing [RFP10]
     - matrix completion [CR09, JS10]
     - phase retrieval [CESV15, YUTC17]
     - one-bit matrix completion [DPVDBW14]
     - blind deconvolution [ARR13]
     Expect rank r⋆ = rank(X⋆) ≪ n!

  5. Projected Gradient (PG)
         minimize_X  f(X)   subject to   tr(X) = 1,  X ∈ S^n_+          (M)
     with feasible set SP^n := {X ∈ S^n_+ : tr(X) = 1} (the spectrahedron).
     - orthogonal projection: P_{SP^n}(X) = argmin_{V ∈ SP^n} ||X − V||_F
     - PG: choose X_0 ∈ SP^n and η > 0, iterate
           X_{t+1} = P_{SP^n}(X_t − η ∇f(X_t)).          (PG)
     - iteration complexity O(1/ε); accelerated PG: O(1/√ε)
     - Bottleneck: O(n^3) per iteration due to the FULL EVD in P_{SP^n}!
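The projection P_{SP^n} reduces, after a full eigendecomposition (the O(n^3) bottleneck), to projecting the eigenvalue vector onto the probability simplex. A sketch, using the standard sort-based simplex projection:

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto {x : sum(x) = 1, x >= 0},
    # via the standard sort-and-threshold algorithm.
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def project_spectrahedron(X):
    # P_{SP^n}: full EVD of (the symmetrized) X, then project the
    # eigenvalues onto the simplex and reassemble.
    w, Q = np.linalg.eigh((X + X.T) / 2)
    return Q @ np.diag(project_simplex(w)) @ Q.T
```

Projecting a point already in SP^n leaves it unchanged, which is a quick sanity check on the implementation.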

  6. Projection free method: Frank-Wolfe (FW)
         minimize_X  f(X)   subject to   tr(X) = 1,  X ∈ S^n_+          (M)
     FW: choose X_0 ∈ SP^n, iterate
     - (LOO) Linear Optimization Oracle: V_t = argmin_{V ∈ SP^n} tr(V ∇f(X_t)).
     - (LS) Line Search: X_{t+1} solves min_{X = η X_t + (1−η) V_t, η ∈ [0,1]} f(X).
     Low per-iteration complexity: the LOO only needs to compute one eigenvector of ∇f(X_t)!
     Bottleneck: slow convergence, O(1/ε) iteration complexity in both theory and practice!
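One FW iteration can be sketched as follows. Over the spectrahedron the LOO has a closed-form solution V_t = v v^T, with v an eigenvector of ∇f(X_t) for its smallest eigenvalue; the grid-based line search below is a simplification of the exact line search (LS) on the slide.

```python
import numpy as np

def fw_step(X, grad_f, f):
    # One Frank-Wolfe step on SP^n (a sketch). The LOO solution is
    # V = v v^T for v the eigenvector of grad f(X) with smallest
    # eigenvalue. A full EVD is used here for simplicity; in practice
    # a Lanczos-type solver computes just this one eigenvector.
    G = grad_f(X)
    w, Q = np.linalg.eigh((G + G.T) / 2)
    v = Q[:, [0]]          # eigenvector for the smallest eigenvalue
    V = v @ v.T
    # Line search over X = eta*X + (1-eta)*V, approximated by a grid
    # (an assumption of this sketch; the slides use exact line search).
    etas = np.linspace(0.0, 1.0, 101)
    vals = [f(e * X + (1.0 - e) * V) for e in etas]
    e = etas[int(np.argmin(vals))]
    return e * X + (1.0 - e) * V
```

Note that the iterate stays in SP^n automatically, since it is a convex combination of two points of SP^n; no projection is ever needed.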

  7. FW variants
     Many variants:
     - Randomized regularized FW [Gar16]
     - In-face direction FW [FGM17]
     - BlockFW [AZHHL17]
     - FW with r⋆ = rank(X⋆) = 1 [Gar19]
     Shortcoming: each either lacks linear convergence, is sensitive to the input rank estimate, or requires r⋆ = 1.

  8. Outline
     1. Introduction: problem setup; past algorithms
     2. SpecFW and strict complementarity: Spectral Frank-Wolfe (SpecFW); strict complementarity
     3. Numerics: experimental setup; numerical results

  9. Spectral Frank-Wolfe (SpecFW)
     Spectral Frank-Wolfe: choose X_0 ∈ SP^n and a rank estimate k > 0, iterate
     - (kLOO): compute the bottom k eigenvectors V = [v_1, . . . , v_k] ∈ R^{n×k} of ∇f(X_t).
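The kLOO step generalizes FW's single-eigenvector oracle to the bottom k eigenvectors. A sketch (using a full EVD for clarity; in practice only the k extreme eigenvectors are needed, e.g. via a Lanczos method):

```python
import numpy as np

def k_loo(G, k):
    # kLOO of SpecFW (a sketch): the bottom-k eigenvectors
    # V = [v_1, ..., v_k] in R^{n x k} of the gradient G = grad f(X_t).
    # np.linalg.eigh returns eigenvalues in ascending order, so the
    # first k columns of Q correspond to the k smallest eigenvalues.
    w, Q = np.linalg.eigh((G + G.T) / 2)
    return Q[:, :k]
```

With k = 1 this recovers the plain FW oracle; choosing k near the expected rank r⋆ ≪ n keeps the per-iteration cost low.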
