Limited Memory Kelley's Method Converges for Composite Convex and Submodular Objectives

Madeleine Udell
Operations Research and Information Engineering, Cornell University
Joint work with Song Zhou (Cornell) and Swati Gupta (Georgia Tech)

NeurIPS, December 2018
Problem to solve

minimize g(x) + f(x)

◮ g : R^n → R strongly convex
◮ f : R^n → R the Lovász extension of a submodular function F
  ◮ piecewise linear
  ◮ convex envelope of F
  ◮ generically, exponentially many linear pieces

L-KM solves composite convex + submodular problems, whose natural size is exponential, with linear memory.
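To make the setup concrete, here is a tiny toy instance used in the sketches accompanying later slides. The choices g(x) = ½‖x − z‖² and F(A) = √|A| are illustrative assumptions, not from the talk; any concave function of cardinality is submodular, and this g is 1-strongly convex.

```python
import numpy as np

# Hypothetical toy instance (illustration only, not from the talk).
n = 5
rng = np.random.default_rng(0)
z = rng.standard_normal(n)

def g(x):
    # g(x) = 1/2 ||x - z||^2 is 1-strongly convex (and smooth).
    return 0.5 * np.sum((x - z) ** 2)

def F(A):
    # Concave function of cardinality => submodular; normalized, F(empty) = 0.
    return float(np.sqrt(len(A)))
```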
Submodular optimization background

◮ Ground set V = {1, …, n}.
◮ F : 2^V → R is submodular if for all A, B ⊆ V,
  F(A) + F(B) ≥ F(A ∪ B) + F(A ∩ B)
◮ the base polytope of F is
  B(F) = { w ∈ R^n : w(V) = F(V), w(A) ≤ F(A) ∀ A ⊆ V }
◮ the Lovász extension of F is the homogeneous piecewise-linear convex function
  f(x) = max_{w ∈ B(F)} w^⊤ x
◮ linear optimization over B(F) is easy
  ⇒ evaluating f(x) and ∂f(x) is easy
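Linear optimization over B(F) is Edmonds' greedy algorithm: sort the coordinates of x in decreasing order and take marginal gains of F along that order. A minimal sketch (the name greedy_vertex and the frozenset encoding of sets are illustrative choices, not from the talk):

```python
import numpy as np

def greedy_vertex(F, x):
    """Edmonds' greedy algorithm: returns w = argmax_{w in B(F)} w^T x.

    Then f(x) = w @ x is the Lovasz extension value and w is a
    subgradient of f at x. Assumes F is normalized: F(frozenset()) = 0.
    """
    order = np.argsort(-x)       # coordinates of x in decreasing order
    w = np.zeros(len(x))
    S, prev = [], 0.0            # running prefix set and F(prefix)
    for i in order:
        S.append(i)
        cur = F(frozenset(S))
        w[i] = cur - prev        # marginal gain of adding i
        prev = cur
    return w

# Usage on the toy instance:
x = np.array([0.3, -1.0, 2.0, 0.0, 0.7])
w = greedy_vertex(F, x)
f_x = w @ x                      # value of the Lovasz extension at x
```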
Original Simplicial Method (OSM) [Bach 2013]

Intuition:
◮ approximate f with a piecewise-linear function whose values and (sub)gradients match f at all previous iterates
◮ minimize the approximation to determine the next iterate

Advantages: finite convergence [Bach 2013]

Drawbacks:
◮ Memory: |V^(i)| = i grows with the iteration counter i
◮ Computation: subproblem size grows with memory
◮ Convergence rate: no known rate of convergence [Bach 2013]
Limited Memory Kelley's Method (L-KM)

Algorithm 1 L-KM (to minimize g(x) + f(x))
initialize V ≠ ∅, affinely independent.
repeat
  1. define f̂(x) = max_{w ∈ V} w^⊤ x
  2. solve subproblem x̂ ← argmin_x g(x) + f̂(x)
  3. compute v ∈ ∂f(x̂): v = argmax_{w ∈ B(F)} x̂^⊤ w
  4. V ← { w ∈ V : w^⊤ x̂ = f̂(x̂) } ∪ {v}

Unlike OSM, L-KM drops subgradients w ∈ V that are not tight at the current iterate.
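A minimal sketch of the L-KM loop on the toy instance above, reusing greedy_vertex from the previous sketch. The use of cvxpy for the piecewise-linear subproblem is a modeling assumption (the talk does not prescribe a solver). Skipping the pruning in step 4 (keeping all of V) would recover OSM.

```python
import cvxpy as cp
import numpy as np

def l_km(F, z, n, iters=50, tol=1e-8):
    # Initialize V with one vertex of B(F) (trivially affinely independent).
    V = [greedy_vertex(F, np.zeros(n))]
    for _ in range(iters):
        # Steps 1-2: f_hat(x) = max_{w in V} w^T x; minimize g(x) + f_hat(x).
        W = np.array(V)
        x = cp.Variable(n)
        cp.Problem(cp.Minimize(0.5 * cp.sum_squares(x - z)
                               + cp.max(W @ x))).solve()
        x_hat = x.value
        # Step 3: subgradient of f at x_hat via the greedy linear oracle.
        v = greedy_vertex(F, x_hat)
        f_hat_val = np.max(W @ x_hat)
        if v @ x_hat <= f_hat_val + tol:
            return x_hat          # f(x_hat) = f_hat(x_hat): x_hat is optimal
        # Step 4: keep only cutting planes tight at x_hat, then add v.
        V = [w for w in V if w @ x_hat >= f_hat_val - tol] + [v]
    return x_hat
```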
L-KM: example

[Figure: successive iterations of L-KM on a one-dimensional example. Each panel shows the piecewise-linear model g + f̂^(i), its minimizer x^(i), and the corresponding lower bound z^(i−1) on the optimal value.]
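The lower bounds z^(i) in the example come from the cutting-plane sandwich: each model underestimates f, so the model's minimum value underestimates the optimal value. In symbols (standard Kelley-type reasoning, not spelled out on the slide):

```latex
\hat f^{(i)}(x) = \max_{w \in \mathcal V^{(i)}} w^\top x \;\le\; f(x)
\;\;\Longrightarrow\;\;
z^{(i)} := \min_x \big( g(x) + \hat f^{(i)}(x) \big)
\;\le\; \min_x \big( g(x) + f(x) \big)
\;\le\; g(\hat x^{(i)}) + f(\hat x^{(i)}).
```

The gap between the two computable bounds certifies progress, and L-KM stops when it reaches zero.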
Properties of L-KM

◮ Limited memory: in L-KM, for all i ≥ 0, the vectors in V^(i) are affinely independent. Moreover, |V^(i)| ≤ n + 1.
◮ Finite convergence: when g is strongly convex, L-KM converges finitely.
◮ Linear convergence: when g is smooth and strongly convex, the duality gap of L-KM (and OSM) converges linearly to 0.
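The algorithm on the next slide works on the dual problem. Writing f as a support function of B(F) and exchanging min and max (valid since B(F) is compact and the saddle function is convex-concave) gives, with g*(y) = sup_x (y^⊤ x − g(x)) the convex conjugate of g, the derivation below; it is standard but not shown on the slide:

```latex
\min_x\, g(x) + f(x)
= \min_x \max_{w \in B(F)} \big( g(x) + w^\top x \big)
= \max_{w \in B(F)} \min_x \big( g(x) + w^\top x \big)
= \max_{w \in B(F)} -g^*(-w).
```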
Limited-Memory Fully Corrective Frank-Wolfe (L-FCFW)

Algorithm 2 L-FCFW (to maximize −g*(−y) over y ∈ B(F))
initialize V ≠ ∅, affinely independent.
repeat
  1. solve subproblem ŷ ← argmax −g*(−y) subject to y ∈ conv(V),
     with convex decomposition ŷ = Σ_{w ∈ V} λ_w w, λ_w ≥ 0, Σ_{w ∈ V} λ_w = 1
  2. compute gradient x̂ = ∇g*(−ŷ)
  3. solve linear optimization v = argmax_{w ∈ B(F)} x̂^⊤ w
  4. V ← { w ∈ V : λ_w > 0 } ∪ {v}
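A minimal sketch of L-FCFW on the same toy instance (names and the cvxpy solver choice are illustrative assumptions, as before). For g(x) = ½‖x − z‖², the dual objective is −g*(−y) = z^⊤ y − ½‖y‖² up to a constant, so the subproblem over conv(V) is a projection of z onto conv(V), and x̂ = ∇g*(−ŷ) = z − ŷ recovers the primal iterate:

```python
import cvxpy as cp
import numpy as np

def l_fcfw(F, z, n, iters=50, tol=1e-8):
    V = [greedy_vertex(F, np.zeros(n))]
    for _ in range(iters):
        # Step 1: maximize -g*(-y) over conv(V). For g(x) = 1/2||x - z||^2
        # this is equivalent to projecting z onto conv(V).
        W = np.array(V)
        lam = cp.Variable(len(V), nonneg=True)
        y = W.T @ lam
        cp.Problem(cp.Minimize(0.5 * cp.sum_squares(y - z)),
                   [cp.sum(lam) == 1]).solve()
        y_hat, lam_val = W.T @ lam.value, lam.value
        # Step 2: x_hat = grad g*(-y_hat) = z - y_hat (the primal iterate).
        x_hat = z - y_hat
        # Step 3: linear optimization over B(F) via the greedy oracle.
        v = greedy_vertex(F, x_hat)
        # Frank-Wolfe duality gap <x_hat, v - y_hat> certifies optimality.
        if x_hat @ (v - y_hat) <= tol:
            return y_hat, x_hat
        # Step 4: keep only active vertices (lambda_w > 0), then add v.
        V = [w for w, l in zip(V, lam_val) if l > tol] + [v]
    return y_hat, x_hat
```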
Fully corrective Frank-Wolfe

[Figure: FCFW on the dual problem over B(F), drawn as a polytope with vertices v_1, …, v_5. Successive iterates w^(0), w^(1), w^(2) are each optimal over the convex hull of their active vertices and move along the gradient directions −∇g(w^(i)) toward the optimum.]
Properties of L-FCFW

◮ Limited memory: by Carathéodory's theorem, we can choose ≤ n + 1 active vertices to represent the current iterate.
◮ Linear convergence [Lacoste-Julien and Jaggi, 2015]: when g is smooth and strongly convex, the duality gap of L-FCFW converges linearly to 0.
◮ Duality: two algorithms are dual if their iterates solve dual subproblems. If g is smooth and strongly convex, and
  ◮ B^(i) = { w ∈ V^(i−1) : λ_w > 0 }, then L-FCFW is dual to L-KM.
  ◮ B^(i) = V^(i−1), then L-FCFW is dual to OSM.
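A quick check of the duality on the toy instance (illustrative; assumes the sketches above and relies on the strongly convex g having a unique minimizer, so both methods converge to the same primal point):

```python
x_km = l_km(F, z, n)
y_fw, x_fw = l_fcfw(F, z, n)
assert np.allclose(x_km, x_fw, atol=1e-4)  # same primal solution at convergence
```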
Summary

L-KM solves composite convex + submodular problems, whose natural size is exponential, with linear memory.

◮ S. Zhou, S. Gupta, and M. Udell. Limited Memory Kelley's Method Converges for Composite Convex and Submodular Objectives. NeurIPS 2018.
◮ Poster: 5–7pm, Room 210, #16