Sparse Convex Optimization Methods for Machine Learning
PhD Defense Talk, 2011/10/04
Martin Jaggi
Examiner: Emo Welzl
Co-Examiners: Bernd Gärtner, Elad Hazan, Joachim Giesen, Joachim Buhmann
Convex Optimization

The problem:  min_{x ∈ D} f(x),  over a convex domain D ⊂ R^n
The Linearized Problem

  min_{y ∈ D}  f(x) + ⟨y − x, d_x⟩

Algorithm 1: Greedy on a Compact Convex Set
  Pick an arbitrary starting point x^(0) ∈ D
  for k = 0 ... ∞ do
    Let d_x ∈ ∂f(x^(k)) be a subgradient to f at x^(k)
    Compute s := approx. arg min_{y ∈ D} ⟨y, d_x⟩
    Let α := 2/(k+2)
    Update x^(k+1) := x^(k) + α (s − x^(k))
  end for

Theorem: The algorithm obtains ε accuracy after O(1/ε) steps.
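A minimal runnable sketch of Algorithm 1, assuming access to a gradient oracle `grad` and a linear minimization oracle `lmo` over D (both hypothetical helper names supplied by the caller); it uses the step size α = 2/(k+2) stated above.

```python
import numpy as np

def greedy_on_convex_set(grad, lmo, x0, num_steps=100):
    """Greedy (Frank-Wolfe-style) minimization of a convex f over a compact convex set D.

    grad(x) -- returns a (sub)gradient d_x of f at x
    lmo(d)  -- returns s = (approx.) arg min_{y in D} <y, d>
    x0      -- arbitrary starting point in D
    """
    x = np.asarray(x0, dtype=float)
    for k in range(num_steps):
        d_x = grad(x)                  # subgradient at the current iterate
        s = lmo(d_x)                   # solve the linearized problem on D
        alpha = 2.0 / (k + 2)          # step size from the slide
        x = x + alpha * (s - x)        # convex combination keeps x inside D
    return x
```

The same loop covers all the domains on the following slides; only the oracle `lmo` changes.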
The Linearized Problem:  min_{y ∈ D}  f(x) + ⟨y − x, d_x⟩

                              Our Method                                  Gradient Descent
Cost per step                 approx. solve the linearized problem on D   projection back to D
Convergence                   1/k                                         1/k
Sparse / low-rank solutions   ✓ (depending on the domain)                 ✗
History & Related Work

                        Domain                                    Known Stepsize   Approx. Subproblem   Primal-Dual Guarantee
Frank & Wolfe 1956      linear inequality constraints             ✗                ✗                    ✗
Dunn 1978, 1980         general bounded convex domain             ✓                ✗                    ✗
Zhang 2003              convex hulls                              ✓                ✗                    ✗
Clarkson 2008, 2010     unit simplex                              ✓                ✓                    ✗
Hazan 2008              semidefinite matrices of bounded trace    ✓                ✓                    ✓
J. PhD Thesis           general bounded convex domain             ✓                ✓                    ✓
Sparse Approximation

  min_{x ∈ Δ_n} f(x),   D := conv({e_i | i ∈ [n]})  (the unit simplex)

  for k = 0 ... ∞ do
    Let d_x ∈ ∂f(x^(k)) be a subgradient to f at x^(k)
    Compute i := arg min_i (d_x)_i
    Let α := 2/(k+2)
    Update x^(k+1) := x^(k) + α (e_i − x^(k))
  end for

Corollary: The algorithm gives an ε-approximate solution of sparsity O(1/ε).  [Clarkson SODA '08]
Lower bound: Ω(1/ε)  — "coresets": sparsity as a function of the approximation quality.
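A sketch of the simplex specialization, reusing the `greedy_on_convex_set` sketch from above: the linearized problem is solved by the vertex e_i with the smallest gradient entry, so each step adds at most one new nonzero coordinate. The least-squares objective is purely an illustrative assumption.

```python
import numpy as np

def simplex_lmo(d):
    # Best vertex of the unit simplex: e_i with i = arg min_i (d_x)_i
    s = np.zeros_like(d)
    s[np.argmin(d)] = 1.0
    return s

# Illustrative problem (an assumption): min_{x in simplex} ||Ax - b||^2
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 50)), rng.standard_normal(20)
grad = lambda x: 2 * A.T @ (A @ x - b)

x0 = np.zeros(50); x0[0] = 1.0     # start at a vertex, so sparsity grows by at most 1 per step
x = greedy_on_convex_set(grad, simplex_lmo, x0, num_steps=30)
print("nonzeros after 30 steps:", np.count_nonzero(x))   # at most 31
```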
Applications
• Smallest enclosing ball
• Linear classifiers (such as Support Vector Machines, ℓ2-loss):  min_{x ∈ Δ_n}  x^T (K + t·1) x
• Model Predictive Control
• Mean-variance portfolio optimization:  min_{x ∈ Δ_n}  x^T C x − t · b^T x
Sparse Approximation

  min_{‖x‖₁ ≤ 1} f(x),   D := conv({±e_i | i ∈ [n]})  (the ℓ1-ball)

  for k = 0 ... ∞ do
    Let d_x ∈ ∂f(x^(k)) be a subgradient to f at x^(k)
    Compute i := arg max_i |(d_x)_i|, and let s := e_i · sign((−d_x)_i)
    Let α := 2/(k+2)
    Update x^(k+1) := x^(k) + α (s − x^(k))
  end for

Corollary: The algorithm gives an ε-approximate solution of sparsity O(1/ε).
Lower bound: Ω(1/ε)  — "coresets": sparsity as a function of the approximation quality.
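For the ℓ1-ball only the oracle changes. A sketch matching the selection rule above (the sign flip comes from s = e_i · sign((−d_x)_i)):

```python
import numpy as np

def l1_ball_lmo(d, radius=1.0):
    # Best vertex of the l1-ball of the given radius: +/- e_i for the
    # largest-magnitude gradient entry, with the sign that decreases f.
    i = np.argmax(np.abs(d))
    s = np.zeros_like(d)
    s[i] = -radius * np.sign(d[i])     # equals radius * sign((-d)_i)
    return s
```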
Applications
• ℓ1-regularized regression:  min_{‖x‖₁ ≤ t}  ‖Ax − b‖₂²   (sparse recovery)
Low Rank Approximation

  min_{X ∈ D} f(X),   D := conv({vv^T | v ∈ R^n, ‖v‖₂ = 1}) = {X ∈ Sym_{n×n} | X ⪰ 0, Tr(X) = 1}  (the spectahedron)

  for k = 0 ... ∞ do
    Let D_X ∈ ∂f(X^(k)) be a subgradient to f at X^(k)
    Let α := 2/(k+2)
    Compute v := v^(k) = ApproxEV(D_X, α C_f)
    Update X^(k+1) := X^(k) + α (vv^T − X^(k))
  end for

Corollary: The algorithm gives an ε-approximate solution of rank O(1/ε).  [Hazan LATIN '08]
Lower bound: Ω(1/ε)
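A sketch of the spectahedron oracle: the linearized problem over {X ⪰ 0, Tr(X) = 1} is minimized by the rank-one matrix vv^T, where v is an eigenvector for the smallest eigenvalue of the gradient matrix D_X. The slide's ApproxEV only needs this eigenvector approximately (e.g. a few Lanczos or power iterations); an exact dense eigensolver is used here purely for clarity.

```python
import numpy as np

def spectahedron_lmo(D_X):
    # min_{X in spectahedron} <X, D_X> is attained at vv^T, where v is an
    # eigenvector for the smallest eigenvalue of (the symmetric part of) D_X.
    S = (D_X + D_X.T) / 2              # symmetrize for numerical safety
    w, V = np.linalg.eigh(S)           # eigenvalues in ascending order
    v = V[:, 0]
    return np.outer(v, v)              # rank-one update vv^T
```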
Applications
• Trace norm regularized problems:  min_{‖X‖_* ≤ t}  f(X)   (low-rank matrix recovery)
• Max norm regularized problems
Matrix Factorizations for Recommender Systems

The Netflix challenge: 17k movies, 500k customers, 100M observed entries (≈ 1% of the matrix).
Approximate the ratings matrix by a rank-k factorization Y ≈ U V^T, with factors u^(1), ..., u^(k) and v^(1), ..., v^(k).

  min_{X ⪰ 0} f(X)   s.t. Tr(X) = t

is equivalent to

  min_{U,V}  Σ_{(i,j) ∈ Ω} (Y_ij − (UV^T)_ij)²   s.t. ‖U‖²_Fro + ‖V‖²_Fro = t

where X := [ UU^T  UV^T ; VU^T  VV^T ].

[J, Sulovský ICML 2010]
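A small numerical check of the reformulation above (dimensions are arbitrary assumptions): stacking Z = [U; V] gives X = ZZ^T, whose off-diagonal block is UV^T and whose trace is ‖U‖²_Fro + ‖V‖²_Fro.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.standard_normal((5, 2))        # 5 "customers", rank 2
V = rng.standard_normal((7, 2))        # 7 "movies",    rank 2
Z = np.vstack([U, V])
X = Z @ Z.T                            # X = [[UU^T, UV^T], [VU^T, VV^T]], X is PSD

assert np.allclose(X[:5, 5:], U @ V.T)                          # off-diagonal block is UV^T
assert np.isclose(np.trace(X), (U**2).sum() + (V**2).sum())     # Tr(X) = ||U||_Fro^2 + ||V||_Fro^2
```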
A Simple Alternative Optimization Duality

The problem:  min_{x ∈ D} f(x)

The dual:  ω(x) := min_{y ∈ D}  f(x) + ⟨y − x, d_x⟩

The duality gap:  g(x) := f(x) − ω(x)

Weak duality:  ω(x) ≤ f(x*) ≤ f(x′)   for all x, x′ ∈ D
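The gap is cheap to evaluate because the oracle output s already minimizes the linearization. A sketch, reusing the `lmo` convention from the earlier code:

```python
import numpy as np

def duality_gap(x, d_x, lmo):
    # g(x) = f(x) - omega(x) = max_{y in D} <x - y, d_x> = <x - s, d_x>,
    # which by weak duality upper-bounds f(x) - f(x*): a certificate for stopping.
    s = lmo(d_x)
    return float(np.vdot(x - s, d_x))
```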
Pathwise Optimization

The parameterized problem:  min_{x ∈ D} f_t(x)

(Plot: f_{t′}(x), ω_{t′}(x), and f_t(x*_t) along the parameter t)

"Better than necessary":        g_t(x) ≤ ε/2
"Continuity in the parameter":  the difference g_{t′}(x) − g_t(x) ≤ ε/2 whenever |t′ − t| ≤ ε · P_f^{-1}
"Still good enough":            g_{t′}(x) ≤ ε

Theorem: There are O(1/ε) many intervals of piecewise constant ε-approximate solutions.

[Giesen, J, Laue ESA 2010]
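A sketch of the resulting path-following loop under the stated assumptions, not the thesis' exact algorithm: `solve_to_gap(t, eps/2)` is a hypothetical solver returning an x with g_t(x) ≤ ε/2 (e.g. the greedy algorithm with the duality gap as stopping criterion), and `gap_at(x, t)` evaluates g_t(x).

```python
def approximate_path(solve_to_gap, gap_at, ts, eps):
    """Maintain an eps-approximate solution along a grid of parameter values ts."""
    path = []
    x = None
    for t in ts:
        if x is None or gap_at(x, t) > eps:    # current x no longer "good enough" at t
            x = solve_to_gap(t, eps / 2)       # re-solve "better than necessary"
        path.append((t, x))                    # x stays valid over a whole t-interval
    return path
```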
Applications
• Smallest enclosing ball of moving points
• SVMs, MKL (with 2 base kernels):  min_{x ∈ Δ_n}  x^T (K + t·1) x
• Model Predictive Control
• Robust PCA
• Mean-variance portfolio optimization:  min_{x ∈ Δ_n}  x^T C x − t · b^T x
• Recommender systems

(Plot: test accuracy along the regularization path t, ionosphere and breast-cancer datasets)
Thanks

Co-authors: Bernd Gärtner, Joachim Giesen, Soeren Laue, Marek Sulovský
3D visualization: Robert Carnecky