

  1. Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence
     Lijun Ding, joint work with Yingjie Fei, Qiantong Xu, and Chengrun Yang
     June 15, 2020
     Lijun Ding (Cornell University) · SpecFW · June 15, 2020 · 1 / 17

  2. Overview
     1. Introduction: problem setup; past algorithms
     2. SpecFW and strict complementarity: Spectral Frank-Wolfe (SpecFW); strict complementarity
     3. Numerics: experimental setup; numerical results

  3. Convex smooth minimization over a spectrahedron
     Main optimization problem:
         minimize_{X ∈ S^n}   f(X) := g(AX) + tr(CX)
         subject to           tr(X) = 1,  X ∈ S^n_+          (M)
     where S^n ⊂ R^{n×n} denotes the symmetric matrices.
     - function g strongly convex and smooth
     - linear map A and matrix C ∈ S^n
     - trace tr(·), sum of the diagonal entries
     - positive semidefinite matrices S^n_+, i.e., symmetric matrices with non-negative eigenvalues
     - unique optimal solution X⋆
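As a concrete instance of problem (M), the following sketch takes g(z) = ||z − b||², one admissible strongly convex and smooth choice; the map A, vector b, and matrix C are random placeholders, not data from the slides.

```python
import numpy as np

# A minimal sketch of problem (M) with g(z) = ||z - b||^2, an assumed
# (but admissible) strongly convex, smooth loss. A, b, C are hypothetical.
rng = np.random.default_rng(0)
n, m = 6, 4

def sym(M):
    return (M + M.T) / 2  # symmetrize: S^n is the set of symmetric matrices

A_mats = [sym(rng.standard_normal((n, n))) for _ in range(m)]  # defines the map A
b = rng.standard_normal(m)
C = sym(rng.standard_normal((n, n)))

def A_op(X):
    # linear map A : S^n -> R^m, with (A X)_i = tr(A_i X)
    return np.array([np.trace(Ai @ X) for Ai in A_mats])

def f(X):
    # objective of (M): f(X) = g(A X) + tr(C X)
    return np.sum((A_op(X) - b) ** 2) + np.trace(C @ X)

X0 = np.eye(n) / n  # a feasible point: X0 is PSD and tr(X0) = 1
```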

  4. Applications
         minimize_{X ∈ S^n}   f(X) := g(AX) + tr(CX)
         subject to           tr(X) = 1,  X ∈ S^n_+          (M)
     - matrix sensing [RFP10]
     - matrix completion [CR09, JS10]
     - phase retrieval [CESV15, YUTC17]
     - one-bit matrix completion [DPVDBW14]
     - blind deconvolution [ARR13]
     Expect rank r⋆ = rank(X⋆) ≪ n!

  5. Projected Gradient (PG)
         minimize_X  f(X)   subject to   tr(X) = 1,  X ∈ S^n_+          (M)
     with feasible set SP^n := {X ∈ S^n_+ : tr(X) = 1} (the spectrahedron).
     - orthogonal projection: P_{SP^n}(X) = argmin_{V ∈ SP^n} ||X − V||_F
     - PG: choose X_0 ∈ SP^n and η > 0, iterate
           X_{t+1} = P_{SP^n}(X_t − η ∇f(X_t)).          (PG)
     - iteration complexity O(1/ε); accelerated PG: O(1/√ε)
     - Bottleneck: O(n^3) per iteration due to the FULL EVD in P_{SP^n}!
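The projection P_{SP^n} reduces, after a full eigendecomposition (the O(n^3) bottleneck), to projecting the eigenvalue vector onto the probability simplex. A sketch, using the standard sort-based simplex projection:

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto {x : sum(x) = 1, x >= 0},
    # via the standard sort-and-threshold algorithm.
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def project_spectrahedron(X):
    # P_{SP^n}: full EVD of (the symmetrized) X, then project the
    # eigenvalues onto the simplex and reassemble.
    w, Q = np.linalg.eigh((X + X.T) / 2)
    return Q @ np.diag(project_simplex(w)) @ Q.T
```

Projecting a point already in SP^n leaves it unchanged, which is a quick sanity check on the implementation.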

  6. Projection free method: Frank-Wolfe (FW)
         minimize_X  f(X)   subject to   tr(X) = 1,  X ∈ S^n_+          (M)
     FW: choose X_0 ∈ SP^n, iterate
     - (LOO) Linear Optimization Oracle: V_t = argmin_{V ∈ SP^n} tr(V ∇f(X_t)).
     - (LS) Line Search: X_{t+1} solves min_{X = η X_t + (1−η) V_t, η ∈ [0,1]} f(X).
     Low per-iteration complexity: the LOO only needs to compute one eigenvector of ∇f(X_t)!
     Bottleneck: slow convergence, O(1/ε) iteration complexity in both theory and practice!
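One FW iteration can be sketched as follows. Over the spectrahedron the LOO has a closed-form solution V_t = v v^T, with v an eigenvector of ∇f(X_t) for its smallest eigenvalue; the grid-based line search below is a simplification of the exact line search (LS) on the slide.

```python
import numpy as np

def fw_step(X, grad_f, f):
    # One Frank-Wolfe step on SP^n (a sketch). The LOO solution is
    # V = v v^T for v the eigenvector of grad f(X) with smallest
    # eigenvalue. A full EVD is used here for simplicity; in practice
    # a Lanczos-type solver computes just this one eigenvector.
    G = grad_f(X)
    w, Q = np.linalg.eigh((G + G.T) / 2)
    v = Q[:, [0]]          # eigenvector for the smallest eigenvalue
    V = v @ v.T
    # Line search over X = eta*X + (1-eta)*V, approximated by a grid
    # (an assumption of this sketch; the slides use exact line search).
    etas = np.linspace(0.0, 1.0, 101)
    vals = [f(e * X + (1.0 - e) * V) for e in etas]
    e = etas[int(np.argmin(vals))]
    return e * X + (1.0 - e) * V
```

Note that the iterate stays in SP^n automatically, since it is a convex combination of two points of SP^n; no projection is ever needed.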

  7. FW variants
     Many variants:
     - Randomized regularized FW [Gar16]
     - In-face direction FW [FGM17]
     - BlockFW [AZHHL17]
     - FW with r⋆ = rank(X⋆) = 1 [Gar19]
     Shortcoming: each either lacks linear convergence, is sensitive to the input rank estimate, or requires r⋆ = 1.

  8. Outline
     1. Introduction: problem setup; past algorithms
     2. SpecFW and strict complementarity: Spectral Frank-Wolfe (SpecFW); strict complementarity
     3. Numerics: experimental setup; numerical results

  9. Spectral Frank-Wolfe (SpecFW)
     Spectral Frank-Wolfe: choose X_0 ∈ SP^n and a rank estimate k > 0, iterate
     - (kLOO): compute the bottom k eigenvectors V = [v_1, . . . , v_k] ∈ R^{n×k} of ∇f(X_t).
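The kLOO step generalizes FW's single-eigenvector oracle to the bottom k eigenvectors. A sketch (using a full EVD for clarity; in practice only the k extreme eigenvectors are needed, e.g. via a Lanczos method):

```python
import numpy as np

def k_loo(G, k):
    # kLOO of SpecFW (a sketch): the bottom-k eigenvectors
    # V = [v_1, ..., v_k] in R^{n x k} of the gradient G = grad f(X_t).
    # np.linalg.eigh returns eigenvalues in ascending order, so the
    # first k columns of Q correspond to the k smallest eigenvalues.
    w, Q = np.linalg.eigh((G + G.T) / 2)
    return Q[:, :k]
```

With k = 1 this recovers the plain FW oracle; choosing k near the expected rank r⋆ ≪ n keeps the per-iteration cost low.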
