Projective Splitting Methods for Decomposing Convex Optimization Problems

Jonathan Eckstein, Rutgers University, New Jersey, USA

Various portions of this talk describe joint work with
  Patrick Combettes — NC State University, USA
  Patrick Johnstone — Rutgers University, USA
  Benar F. Svaiter — IMPA, Brazil
Also:
  Jean-Paul Watson — Sandia National Labs, USA
  David L. Woodruff — UC Davis, USA

Funded in part by NSF grants CCF-1115638, CCF-1617617, and AFOSR grant FA9550-15-1-0251

May 2019
Introductory Remarks
• I did some of the earlier work on an optimization algorithm called the ADMM (the Alternating Direction Method of Multipliers)
  o But not the earliest work
• I know that the ADMM has been used in image processing because about 15 years ago I started being asked to referee a deluge of papers with this picture:
• Today I want to talk about an algorithm that uses similar building blocks to the ADMM but is much more flexible
More General Problem Setting
The algorithms in this talk can work for monotone inclusion problems of the form
    $0 \in \sum_{i=1}^{n} G_i^* T_i(G_i x)$
where
• $\mathcal{H}_0, \mathcal{H}_1, \ldots, \mathcal{H}_n$ are real Hilbert spaces
• $T_i : \mathcal{H}_i \rightrightarrows \mathcal{H}_i$, $i = 1, \ldots, n$, are (generally set-valued) maximal monotone operators
• $G_i : \mathcal{H}_0 \to \mathcal{H}_i$, $i = 1, \ldots, n$, are bounded linear maps
However, for this talk we will restrict ourselves to...
A General Convex Optimization Problem
    $\min_{x \in \mathbb{R}^p} \Big\{ \sum_{i=1}^{n} f_i(G_i x) \Big\}$
• For $i = 1, \ldots, n$, $f_i : \mathbb{R}^{m_i} \to \mathbb{R} \cup \{+\infty\}$ is closed proper convex
• For $i = 1, \ldots, n$, $G_i$ is an $m_i \times p$ real matrix
• Assume you have a class of such problems that is not suitable for standard LP/NLP solvers because either
  o The problems are very large, or
  o They are only fairly large, but also dense
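As a concrete illustration (an added example, not one from the slides), a lasso-type regularized least-squares problem already fits this template; here $A$, $b$, and $\lambda > 0$ are placeholder data for the illustration:

```latex
% Hypothetical instance: lasso written in the form  min_x sum_i f_i(G_i x)  with n = 2
\min_{x \in \mathbb{R}^p} \; \tfrac{1}{2}\|Ax - b\|^2 + \lambda \|x\|_1
\qquad\text{with}\qquad
f_1 = \tfrac{1}{2}\|\cdot{} - b\|^2,\;\; G_1 = A, \qquad
f_2 = \lambda\|\cdot\|_1,\;\; G_2 = \mathrm{Id}_p .
```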
Subgradient Maps of Convex Functions, Monotonicity
The subgradient map $\partial f$ of a convex function $f : \mathbb{R}^p \to \mathbb{R} \cup \{+\infty\}$ is given by
    $\partial f(x) = \{ y \in \mathbb{R}^p \mid f(x') \geq f(x) + \langle y, x' - x \rangle \;\; \forall x' \}$.
This has the property that
    $y \in \partial f(x),\; y' \in \partial f(x') \;\Rightarrow\; \langle x - x', y - y' \rangle \geq 0$
Proof:
    $f(x') - f(x) \geq \langle y, x' - x \rangle$
    $f(x) - f(x') \geq \langle y', x - x' \rangle$
    $0 \geq \langle y' - y, x - x' \rangle$
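A quick numeric sanity check of this monotonicity property (an added sketch, not the authors' code), using $f(x) = \|x\|_1$, whose subgradient at any point with no zero entries is the componentwise sign:

```python
import numpy as np

# Monotonicity check for f(x) = ||x||_1: at points with no zero entries,
# the (unique) subgradient is sign(x), and <x - x', y - y'> >= 0 must hold.
rng = np.random.default_rng(0)
for _ in range(1000):
    x, xp = rng.standard_normal(5), rng.standard_normal(5)
    y, yp = np.sign(x), np.sign(xp)        # y in ∂f(x), y' in ∂f(x')
    assert np.dot(x - xp, y - yp) >= 0.0   # the monotonicity inequality
```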
Normal Cone Maps
The indicator function of a nonempty closed convex set $C$ is
    $\delta_C(x) = \begin{cases} 0, & x \in C \\ +\infty, & x \notin C \end{cases}$
Its subgradient map is the normal cone map $N_C$ of $C$:
    $\partial \delta_C(x) = N_C(x) = \begin{cases} \{ y \mid \langle y, x' - x \rangle \leq 0 \;\; \forall x' \in C \}, & x \in C \\ \emptyset, & x \notin C \end{cases}$
[Figure: points $x, x' \in C$ with normals $y \in N_C(x)$, $y' \in N_C(x')$; the inequalities $\langle y, x' - x \rangle \leq 0$ and $\langle y', x - x' \rangle \leq 0$ combine to give $\langle y' - y, x - x' \rangle \leq 0$.]
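For a concrete case (an added example, not from the slides), take $C$ to be the nonnegative orthant; every normal-cone element then satisfies the defining inequality against any point of $C$:

```python
import numpy as np

# C = nonnegative orthant in R^3.  For x in C, N_C(x) consists of the vectors
# y <= 0 with y_i = 0 wherever x_i > 0; each such y obeys <y, x' - x> <= 0 for all x' in C.
rng = np.random.default_rng(1)
x = np.array([2.0, 0.0, 1.5])             # a point of C
y = np.array([0.0, -3.0, 0.0])            # an element of N_C(x)
for _ in range(1000):
    xp = np.abs(rng.standard_normal(3))   # an arbitrary point x' of C
    assert np.dot(y, xp - x) <= 0.0
```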
A Subgradient Chain Rule
• Suppose $f : \mathbb{R}^p \to \mathbb{R} \cup \{+\infty\}$ is closed proper convex
• Suppose $G$ is a $p \times m$ real matrix
Then for any $x$,
    $\partial (f \circ G)(x) \supseteq G^{\mathsf{T}} \partial f(Gx) = \{ G^{\mathsf{T}} y \mid y \in \partial f(Gx) \}$
and "usually"
    $\partial (f \circ G)(x) = G^{\mathsf{T}} \partial f(Gx)$
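For reference (added context, not stated on the slide): one standard sufficient condition for the "usually" is that the range of $G$ meets the relative interior of $\operatorname{dom} f$; see, e.g., Rockafellar's Convex Analysis:

```latex
% A standard constraint qualification under which the chain rule holds with equality
\operatorname{ran}(G) \cap \operatorname{ri}(\operatorname{dom} f) \neq \emptyset
\;\;\Longrightarrow\;\;
\partial (f \circ G)(x) = G^{\mathsf{T}}\, \partial f(Gx) \quad \text{for all } x .
```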
An Optimality Condition
Let's go back to
    $\min_{x} \Big\{ \sum_{i=1}^{n} f_i(G_i x) \Big\}$
Suppose we have $z \in \mathbb{R}^p$, $w_1 \in \mathbb{R}^{m_1}, \ldots, w_n \in \mathbb{R}^{m_n}$ such that
    $w_i \in \partial f_i(G_i z) \qquad i = 1, \ldots, n$
    $\sum_{i=1}^{n} G_i^{\mathsf{T}} w_i = 0$
The chain rule then implies that $0 \in \partial \big( \sum_{i=1}^{n} f_i(G_i \,\cdot\,) \big)(z)$, so...
    $z$ is a solution to our problem
• This is always a sufficient optimality condition
• It's "usually" necessary as well
• The $w_i$ are the Lagrange multipliers / dual variables
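A tiny numeric check of this condition (an added example with made-up data, not the authors' code): take $n = 2$, $G_1 = G_2 = \mathrm{Id}$, and smooth quadratics, so the subgradients are just gradients:

```python
import numpy as np

# minimize f1(x) + f2(x) with f1(x) = 0.5*||x - a||^2 and f2(x) = 0.5*||x - b||^2.
# The minimizer is z = (a + b)/2, and w_i = grad f_i(z) satisfies both conditions.
a = np.array([1.0, -2.0, 3.0])
b = np.array([5.0,  0.0, 1.0])
z = 0.5 * (a + b)
w1 = z - a                       # w1 in ∂f1(G1 z)  (a gradient, since f1 is smooth)
w2 = z - b                       # w2 in ∂f2(G2 z)
print(w1 + w2)                   # G1^T w1 + G2^T w2 = 0  ->  [0. 0. 0.]
```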
The Primal-Dual Solution Set (Kuhn-Tucker Set)
    $\mathcal{S} = \big\{ (z, w_1, \ldots, w_n) \;\big|\; (\forall\, i = 1, \ldots, n)\; w_i \in \partial f_i(G_i z),\; \sum_{i=1}^{n} G_i^{\mathsf{T}} w_i = 0 \big\}$
Or, if we assume that $p = m_n$, $G_n = \mathrm{Id}_{m_n}$,
    $\mathcal{S} = \big\{ (z, w_1, \ldots, w_{n-1}) \;\big|\; (\forall\, i = 1, \ldots, n-1)\; w_i \in \partial f_i(G_i z),\; -\sum_{i=1}^{n-1} G_i^{\mathsf{T}} w_i \in \partial f_n(z) \big\}$
• This is the set of points satisfying the optimality conditions
• Standing assumption: $\mathcal{S}$ is nonempty
• Essentially in E & Svaiter 2009: $\mathcal{S}$ is a closed convex set
• In the $p = m_n$, $G_n = \mathrm{Id}$ case, streamline notation: for $w = (w_1, \ldots, w_{n-1}) \in \mathbb{R}^{m_1} \times \cdots \times \mathbb{R}^{m_{n-1}}$, let $w_n \triangleq -\sum_{i=1}^{n-1} G_i^{\mathsf{T}} w_i$
Valid Inequalities for $\mathcal{S}$
• Take some $x_i, y_i \in \mathbb{R}^{m_i}$ such that $y_i \in \partial f_i(x_i)$ for $i = 1, \ldots, n$
• If $(z, w) \in \mathcal{S}$, then $w_i \in \partial f_i(G_i z)$ for $i = 1, \ldots, n$
• So, by monotonicity, $\langle x_i - G_i z, y_i - w_i \rangle \geq 0$ for $i = 1, \ldots, n$
• Negate and add up:
    $\varphi(z, w) = \sum_{i=1}^{n} \langle G_i z - x_i, y_i - w_i \rangle \leq 0 \qquad \forall\, (z, w) \in \mathcal{S}$
[Figure: the hyperplane $H = \{ p \mid \varphi(p) = 0 \}$ bounding the half-space $\{ p \mid \varphi(p) \leq 0 \}$ that contains $\mathcal{S}$.]
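A minimal sketch of evaluating this separator (an added illustration, not the authors' code); the inputs $x_i, y_i$ are assumed to satisfy $y_i \in \partial f_i(x_i)$:

```python
import numpy as np

def phi(z, w_list, x_list, y_list, G_list):
    """phi(z, w) = sum_i < G_i z - x_i , y_i - w_i >."""
    return sum(np.dot(G @ z - x, y - w)
               for G, x, y, w in zip(G_list, x_list, y_list, w_list))
```

By the derivation above, $\varphi(z, w) \leq 0$ for every $(z, w) \in \mathcal{S}$, so a current point with $\varphi > 0$ is strictly separated from $\mathcal{S}$ by the hyperplane $\{\varphi = 0\}$.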
Confirming that $\varphi$ is Affine
The quadratic terms in $\varphi(z, w)$ take the form
    $-\sum_{i=1}^{n} \langle G_i z, w_i \rangle = -\sum_{i=1}^{n} \langle z, G_i^{\mathsf{T}} w_i \rangle = -\Big\langle z, \sum_{i=1}^{n} G_i^{\mathsf{T}} w_i \Big\rangle = -\langle z, 0 \rangle = 0$
• Also true in the $p = m_n$, $G_n = \mathrm{Id}$ case where we drop the $n$-th index
  o Slightly different proof, same basic idea
Generic Projection Method for a Closed Convex Set $\mathcal{S}$ in a Hilbert Space $\mathcal{H}$
Apply the following general template:
• Given $p^k \in \mathcal{H}$, choose some affine function $\varphi_k$ with $\varphi_k(p) \leq 0 \;\; \forall p \in \mathcal{S}$
• Project $p^k$ onto $H_k = \{ p \mid \varphi_k(p) = 0 \}$, possibly with an overrelaxation factor $\lambda_k \in [\epsilon, 2 - \epsilon]$, giving $p^{k+1}$, and repeat...
[Figure: $p^k$ with $\varphi_k(p^k) > 0$ is projected onto the hyperplane $H_k = \{ p \mid \varphi_k(p) = 0 \}$ ($\varphi_k$ is affine), giving $p^{k+1}$ on the side of the half-space $\{ p \mid \varphi_k(p) \leq 0 \}$ containing $\mathcal{S}$.]
In our case: $\mathcal{H} = \mathbb{R}^p \times \mathbb{R}^{m_1} \times \cdots \times \mathbb{R}^{m_n}$, and we find $\varphi_k$ by picking some $x_i^k, y_i^k$ with $y_i^k \in \partial f_i(x_i^k)$, $i = 1, \ldots, n$, and using the construction above
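A minimal sketch of one relaxed projection step of this template (an added illustration, not the authors' code), writing the affine function as $\varphi_k(p) = \langle g, p \rangle + \beta$:

```python
import numpy as np

def projection_step(p, g, beta, relaxation=1.0):
    """One (over)relaxed projection of p toward the half-space {q : <g, q> + beta <= 0}.

    The exact projection of p onto the hyperplane {phi = 0} is p - (phi(p)/||g||^2) * g;
    the relaxation factor (in (0, 2), playing the role of lambda_k) scales that step.
    """
    viol = np.dot(g, p) + beta      # phi_k(p^k)
    if viol <= 0.0:                 # already in the half-space containing S: no move
        return p
    return p - relaxation * (viol / np.dot(g, g)) * g
```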
General Properties of Projection Algorithms
Proposition. In such algorithms, assuming that $\mathcal{S} \neq \emptyset$,
• $\| p^k - p^* \|$ is nonincreasing for all $p^* \in \mathcal{S}$
• $\{ p^k \}$ is bounded
• $\| p^{k+1} - p^k \| \to 0$
• If $\{ \nabla \varphi_k \}$ is bounded, then $\limsup_{k \to \infty} \varphi_k(p^k) \leq 0$
• If all limit points of $\{ p^k \}$ are in $\mathcal{S}$, then $\{ p^k \}$ converges to a point in $\mathcal{S}$
The first three properties hold no matter how badly we choose $\varphi_k$
The idea is to pick $\varphi_k$ so that the stipulations of the last two properties hold – then we have a convergent algorithm
If we pick $\varphi_k$ badly, we may "stall"
Selecting the Right $\varphi_k$
• Selecting $\varphi_k$ involves picking some $x_i^k, y_i^k$ with $y_i^k \in \partial f_i(x_i^k)$, $i = 1, \ldots, n$
• It turns out there are many ways to pick $x_i^k, y_i^k$ so that the last two properties of the proposition are satisfied
• One fundamental thing we would like is
    $\varphi_k(z^k, w^k) = \sum_{i=1}^{n} \langle G_i z^k - x_i^k, y_i^k - w_i^k \rangle \geq 0$
  with strict inequality if $(z^k, w^k) \notin \mathcal{S}$
• The oldest suggestion is "prox" (E & Svaiter 2008 & 2009)
The Prox Operation
• Suppose we have a convex function $f : \mathbb{R}^p \to \mathbb{R} \cup \{+\infty\}$
• Take any vector $r \in \mathbb{R}^p$ and scalar $c > 0$ and solve
    $x = \operatorname*{argmin}_{x' \in \mathbb{R}^p} \Big\{ f(x') + \frac{1}{2c} \| x' - r \|^2 \Big\}$
• The optimality condition for this minimization is
    $0 \in \partial f(x) + \frac{1}{c}(x - r)$
• So we have $y = \frac{1}{c}(r - x) \in \partial f(x)$
• And $x + cy = x + c \cdot \frac{1}{c}(r - x) = r$
• So, we just found $x, y \in \mathbb{R}^p$ such that $y \in \partial f(x)$ and $x + cy = r$
• Call this $\operatorname{Prox}_{cf}(r)$
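As a concrete example (added, not from the slides): for $f = \|\cdot\|_1$ the prox is componentwise soft-thresholding, and the pair $(x, y)$ it produces does satisfy $y \in \partial f(x)$ and $x + cy = r$:

```python
import numpy as np

def prox_l1(r, c):
    """Prox_{c f}(r) for f(x) = ||x||_1: soft-threshold each component by c."""
    return np.sign(r) * np.maximum(np.abs(r) - c, 0.0)

r = np.array([3.0, -0.2, 0.7])
c = 0.5
x = prox_l1(r, c)
y = (r - x) / c                            # y in ∂||.||_1(x)
print(np.allclose(x + c * y, r))           # True: x + c*y recovers r
print(np.all(np.abs(y) <= 1.0 + 1e-12))    # True: subgradients of ||.||_1 lie in [-1, 1]
```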