Stochastic constrained optimization in Hilbert spaces with applications
Georg Ch. Pflug / C. Geiersbach
March 27, 2019
Iteration methods
Problems:
◮ Finding roots of equations: given f(·), find a root x* such that f(x*) = 0.
◮ Finding optima of functions: given f(·), find a candidate for an optimum, i.e. x* such that ∇f(x*) = 0.
Newton (1669). Iterative solution method for the equation f(x) = x^3 − 2x − 5 = 0.
Raphson (1690). General version:
  roots:  x_{n+1} = x_n − f(x_n)/f'(x_n)        optima:  x_{n+1} = x_n − [∇²f(x_n)]^{−1} ∇f(x_n)
v. Mises, Pollaczek-Geiringer (1929). Fixed stepsize t:
  roots:  x_{n+1} = x_n − t·f(x_n)              optima:  x_{n+1} = x_n − t·∇f(x_n)
  converges if t ≤ [sup_x f'(x)]^{−1}           converges if t < 1/λ_max, with λ_max the maximal eigenvalue of ∇²f(x)
Decreasing stepsize:
  roots:  x_{n+1} = x_n − t_n f(x_n)            optima:  x_{n+1} = x_n − t_n ∇f(x_n)
  with t_n ≥ 0, t_n → 0, Σ_n t_n = ∞.
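A minimal numerical sketch of the root-finding column above, using Newton's example f(x) = x³ − 2x − 5; the starting point, the fixed step t, the decreasing-step constant 0.1 and the iteration counts are illustrative assumptions.

```python
# Sketch of the three root-finding schemes for f(x) = x^3 - 2x - 5.
# The starting point, step sizes and iteration counts are illustrative choices.

def f(x):
    return x**3 - 2.0 * x - 5.0

def fprime(x):
    return 3.0 * x**2 - 2.0

x = 2.0
for _ in range(6):                  # Newton-Raphson: x_{n+1} = x_n - f(x_n)/f'(x_n)
    x = x - f(x) / fprime(x)
print("Newton-Raphson:", x)         # approx. 2.09455

x, t = 2.0, 0.05
for _ in range(500):                # fixed step: x_{n+1} = x_n - t*f(x_n)
    x = x - t * f(x)
print("fixed step:", x)

x = 2.0
for n in range(1, 5001):            # decreasing steps t_n -> 0 with sum t_n = infinity
    x = x - (0.1 / n) * f(x)
print("decreasing steps:", x)       # converges slowly to the same root
```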
Rosen (1960). Gradient projection for optimization under linear equality constraints:
  min { f(x) : Ax = b },    x_{n+1} = x_n − t_n (I − A^T (A A^T)^{−1} A) ∇f(x_n)
Goldstein (1964). Gradient projection for optimization under convex constraints:
  min { f(x) : x ∈ C (convex) },    x_{n+1} = π_C(x_n − t_n ∇f(x_n)),
where π_C is the convex projection.
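A small finite-dimensional sketch of both projection ideas on a toy quadratic f(x) = ½‖x − c‖²; the matrix A, the vector b, the box bounds and the step size are made-up illustration data.

```python
import numpy as np

# Toy sketch of Rosen's and Goldstein's gradient projection for f(x) = 0.5*||x - c||^2.
# A, b, the box bounds and the step size are made-up illustration data.
c = np.array([3.0, -1.0, 2.0])

def grad_f(x):
    return x - c

# Rosen (1960): project the gradient onto the null space of A, staying on Ax = b.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
P = np.eye(3) - A.T @ np.linalg.solve(A @ A.T, A)   # I - A^T (A A^T)^{-1} A
x = np.linalg.lstsq(A, b, rcond=None)[0]            # a feasible starting point
for _ in range(200):
    x = x - 0.1 * P @ grad_f(x)
print("Rosen iterate:", x, " Ax =", A @ x)

# Goldstein (1964): project the full step onto a convex set C, here the box [0, 1]^3.
pi_C = lambda z: np.clip(z, 0.0, 1.0)               # convex projection onto the box
x = np.zeros(3)
for _ in range(200):
    x = pi_C(x - 0.1 * grad_f(x))
print("Goldstein iterate:", x)
```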
Stochastic iterations
f resp. ∇f is only observable together with noise, i.e. f_ω(·) resp. ∇f_ω(·):
  E[f_ω(x)] = f(x) + bias,    resp.    E[∇f_ω(x)] = ∇f(x) + bias.
Robbins-Monro (1951):                                X_{n+1} = X_n − t_n f_{ω_n}(X_n)
Ermoliev (1967-1976), stochastic (quasi-)gradients:  X_{n+1} = X_n − t_n ∇f_{ω_n}(X_n)
Gupal (1974), Kushner (1974), stochastic (quasi-)gradient projection:
  X_{n+1} = π_C(X_n − t_n ∇f_{ω_n}(X_n))
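A short Robbins-Monro sketch; the target equation f(x) = x − 1 = 0 and the unbiased Gaussian observation noise are assumptions made only for illustration.

```python
import numpy as np

# Robbins-Monro sketch: find the root of f(x) = x - 1 when only noisy values
# f_w(x) = f(x) + noise are observable (unbiased Gaussian noise, assumed here).
rng = np.random.default_rng(1)

X = 5.0
for n in range(1, 10001):
    t_n = 1.0 / n                        # t_n >= 0, t_n -> 0, sum_n t_n = infinity
    noisy_f = (X - 1.0) + rng.normal(scale=0.5)
    X = X - t_n * noisy_f
print("Robbins-Monro estimate of the root:", X)   # close to 1
```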
The projected stochastic quasigradient method
While more sophisticated methods (such as Armijo line search, level-set methods, mirror descent methods, or operator splitting) have been developed and become popular for deterministic optimization, the good old gradient search is still nearly the only method for stochastic optimization.
Stochastic optimization is applied in two different cases:
(1) for problems of huge dimension, where subproblems of smaller dimension are generated by random selection;
(2) for intrinsically stochastic problems, where external risk factors have to be considered.
Problems of type (1) include, e.g., digital image classification and restoration, speech recognition, deep machine learning using neural networks, and deterministic shape optimization.
In this talk, we discuss a problem of type (2): shape optimization in an intrinsically random environment.
Projected Stochastic Gradient (PSG) Algorithm in Hilbert spaces
Let H be a Hilbert space with inner product ⟨·,·⟩ and norm ‖·‖, and let the projection onto C be denoted by π_C : H → C.
Problem: min_{u ∈ C} { j(u) = E[J_ω(u)] }.
The PSG algorithm:
◮ Initialization: u_0 ∈ H
◮ For n = 0, 1, ...: generate an independent ω_n, choose t_n > 0, and set
  u_{n+1} := π_C(u_n − t_n g_n(ω_n))
  with stochastic gradient g_n.
Possible choices for the stochastic gradient:
◮ Single realization: g_n = ∇J_{ω_n}(u_n)
◮ Batch method: g_n = (1/m_n) Σ_{i=1}^{m_n} ∇J_{ω_n,i}(u_n)
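A finite-dimensional sketch of the PSG loop, with H = R³, C a box and the toy objective J_ω(u) = ½‖u − ω‖²; the distribution of ω, the step-size rule and all constants are illustrative assumptions, not the setting of the talk.

```python
import numpy as np

# PSG sketch on H = R^3 with C = [-1, 1]^3 and J_w(u) = 0.5*||u - w||^2, w ~ N(mean, I).
# Everything here is an illustrative assumption, not the PDE problem treated later.
rng = np.random.default_rng(2)
mean = np.array([0.5, 2.0, -1.0])
pi_C = lambda u: np.clip(u, -1.0, 1.0)          # projection onto C

def grad_J(u, w):                               # gradient of J_w at u
    return u - w

def psg(batch_size, n_iter=2000):
    u = np.zeros(3)                             # u_0
    for n in range(n_iter):
        t_n = 1.0 / (n + 10)                    # Robbins-Monro step sizes
        ws = rng.normal(loc=mean, size=(batch_size, 3))
        g_n = np.mean([grad_J(u, w) for w in ws], axis=0)   # single sample or batch
        u = pi_C(u - t_n * g_n)
    return u

print("single realization:", psg(batch_size=1))     # both approach clip(mean, -1, 1)
print("batch m_n = 20:    ", psg(batch_size=20))
```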
Illustration
[Figure] Left: projection to the tangent space. Right: projection to the constraint set.
[Figure] Left: line search? Right: a stationary point.
Assumptions for Convergence
1. ∅ ≠ C ⊂ H is closed and convex.
2. J_ω is convex and continuously Fréchet differentiable for a.e. ω ∈ Ω on a neighborhood of C ⊂ H.
3. j is bounded below by j̄ ∈ R and finitely valued over C.
4. Robbins-Monro step sizes: t_n ≥ 0, Σ_{n=0}^∞ t_n = ∞, Σ_{n=0}^∞ t_n² < ∞.
5. ∇J_{ω_n}(u_n) = ∇j(u_n) + w_{n+1} + r_{n+1} with an increasing filtration {F_n}, where
   (i) w_{n+1} and r_{n+1} are F_{n+1}-measurable;
   (ii) E[w_{n+1} | F_n] = 0;
   (iii) Σ_{n=0}^∞ t_n ess sup ‖r_{n+1}‖ < ∞;
   (iv) ∃ M_1, M_2: E[‖∇J_{ω_n}(u_n)‖² | F_n] ≤ M_1 + M_2 ‖u_n‖².
Convergence Results
Theorem (Geiersbach and Pflug: weak convergence in probability for a general convex objective).
Under Assumptions 1-5, the PSG algorithm with S := {w ∈ C : j(w) = j(ũ)}, where ũ is a minimizer of j, satisfies:
1. {‖u_n − ũ‖} converges a.s. for all ũ ∈ S,
2. {j(u_n)} converges a.s. and lim_{n→∞} j(u_n) = j(ũ),
3. {u_n} converges weakly a.s. and the weak limit lies in S.
This is stronger than "any weak cluster point of (u_n) lies in S"!
Corollary (a.s. strong convergence for a strongly convex objective).
Given Assumptions 1-5, assume in addition that j is strongly convex. Then {u_n} converges strongly a.s. to the unique optimum ū.
Efficiency Estimates in the Strongly Convex Case
If j is strongly convex with growth µ and t_n = θ/(n + ν) with θ > 1/(2µ) and ν ≥ K_1, then there are computable constants K_1, K_2 such that the expected error in the control at step n is
  E[‖u_n − ū‖] ≤ √(K_2/(n + ν))
and the expected error in the objective at step n is
  E[j(u_n) − j(ū)] ≤ L K_2 / (2(n + ν)),
where L is the Lipschitz constant for j. This generalizes a result by Nemirovski et al. (2009).
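A one-dimensional toy check of the √(K_2/(n+ν)) decay; the objective j(u) = ½µ(u − 1)², the noise level, θ, ν and the replication count are all assumed values, and the constants K_1, K_2 are not computed here.

```python
import numpy as np

# Toy check of the O(1/sqrt(n + nu)) error decay for a strongly convex objective
# j(u) = 0.5*mu*(u - 1)^2 with noisy gradients; mu, theta, nu and the noise level
# are assumed values, and the constants K_1, K_2 are not computed here.
rng = np.random.default_rng(3)
mu, theta, nu = 2.0, 0.5, 10.0                  # theta = 0.5 > 1/(2*mu) = 0.25
pi_C = lambda u: np.clip(u, -5.0, 5.0)

def run(n_iter):
    u = 5.0
    for n in range(n_iter):
        t_n = theta / (n + nu)                  # the step-size policy of the slide
        g = mu * (u - 1.0) + rng.normal()       # noisy gradient
        u = pi_C(u - t_n * g)
    return abs(u - 1.0)

for N in (100, 1000, 10000):
    errs = [run(N) for _ in range(200)]
    # if the bound is sharp, this product stays roughly constant (~ sqrt(K_2))
    print(N, np.mean(errs) * np.sqrt(N + nu))
```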
Efficiency Estimates in the General Convex Case I
Polyak and Juditsky (1992), Ruppert (1992): convergence improvement by taking larger stepsizes and averaging. Define
  γ_k := t_k / (Σ_{ℓ=1}^N t_ℓ)    and    ũ_1^N := Σ_{k=1}^N γ_k u_k.
Let D_S be a bound such that sup_{u ∈ S} ‖u_0 − u‖ ≤ D_S. We can show that
  E[j(ũ_1^N) − j(ū)] ≤ (D_S² + R Σ_{k=1}^N t_k²) / (2 Σ_{k=1}^N t_k)
with a computable constant R. With the constant stepsize policy t_n = D_S R^{−1/2} N^{−1/2} for a fixed number of iterations n = 1, ..., N, we get the efficiency estimate
  E[j(ũ_1^N) − j(ū)] ≤ D_S √R / √N.
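A sketch of the weighted averaging with the constant step-size policy on a one-dimensional toy problem; D_S, R, the noise level and the problem itself are assumed for illustration.

```python
import numpy as np

# Averaged PSG with the constant step t_n = D_S/(sqrt(R)*sqrt(N)) on a toy problem
# j(u) = 0.5*(u - 1)^2; D_S, R, N and the noise level are illustrative assumptions.
rng = np.random.default_rng(4)
pi_C = lambda u: np.clip(u, -2.0, 2.0)
u_star = 1.0                                    # known optimum of the toy problem
D_S, R, N = 3.0, 1.0, 10000

t = D_S / (np.sqrt(R) * np.sqrt(N))             # constant step for the fixed horizon N
u = -2.0                                        # u_0, chosen so that ||u_0 - u*|| <= D_S
iterates, steps = [], []
for k in range(1, N + 1):
    g = (u - u_star) + rng.normal()             # noisy gradient of j
    u = pi_C(u - t * g)
    iterates.append(u)
    steps.append(t)

gamma = np.array(steps) / np.sum(steps)         # weights gamma_k = t_k / sum_l t_l
u_avg = float(np.dot(gamma, iterates))          # averaged iterate u~_1^N
print("last iterate error:    ", abs(iterates[-1] - u_star))
print("averaged iterate error:", abs(u_avg - u_star))   # typically much smaller
```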
Efficiency Estimates in the General Convex Case II
With the choice of a variable stepsize t_n = θ D_S / √(nR) we can show that
  E[j(ũ_1^n) − j(ū)] = O(log n / √n).
And if one starts averaging only after N_1 steps, with N_1 = [rn], one also gets
  E[j(ũ_{N_1}^n) − j(ū)] = O(1/√n).
These bounds are extensions of Nemirovski et al. (2009).
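The same toy sketch with the decreasing steps t_n = θ D_S/√(nR) and averaging started only after N_1 = ⌈rn⌉ steps; θ and r are assumed values.

```python
import numpy as np

# Tail averaging: same toy problem, now with t_n = theta*D_S/sqrt(n*R) and averaging
# started only after N_1 = ceil(r*n) steps; theta and r are assumed values.
rng = np.random.default_rng(5)
pi_C = lambda u: np.clip(u, -2.0, 2.0)
u_star, D_S, R = 1.0, 3.0, 1.0
theta, r, n_total = 0.2, 0.5, 10000
N_1 = int(np.ceil(r * n_total))

u, tail_iterates, tail_steps = -2.0, [], []
for n in range(1, n_total + 1):
    t_n = theta * D_S / np.sqrt(n * R)
    g = (u - u_star) + rng.normal()
    u = pi_C(u - t_n * g)
    if n >= N_1:                                # average only the tail of the run
        tail_iterates.append(u)
        tail_steps.append(t_n)

w = np.array(tail_steps) / np.sum(tail_steps)
u_tail_avg = float(np.dot(w, tail_iterates))
print("tail-averaged error:", abs(u_tail_avg - u_star))
```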
A PDE-constrained problem: Optimal Control of a Stationary Heat Source
  min_{u ∈ C} E[J_ω(u)] = E[ (1/2)‖y − y_D‖²_{L²(D)} + (λ/2)‖u‖²_{L²(D)} ]
  s.t.  −∇·(a(x,ω)∇y(x,ω)) = u(x),   (x,ω) ∈ D × Ω,
        y(x,ω) = 0,                   (x,ω) ∈ ∂D × Ω,
  C = { u ∈ L²(D) : u_a(x) ≤ u(x) ≤ u_b(x) a.e. x ∈ D }.
◮ Random (positive) conductivity a(x,ω) ∈ (a_min, a_max).
◮ Random temperature y = y(x,ω) controlled by a deterministic source density u = u(x).
◮ Deterministic target distribution y_D = y_D(x) ∈ L²(D).
The Problem Satisfies the Convergence Assumptions
◮ ∅ ≠ C ⊂ H is closed and convex.
◮ J_ω is convex and continuously Fréchet differentiable for a.e. ω ∈ Ω on a neighborhood of C ⊂ H.
◮ j is bounded below by j̄ ∈ R and finite-valued over C.
◮ Robbins-Monro step sizes: t_n ≥ 0, Σ_{n=0}^∞ t_n = ∞, Σ_{n=0}^∞ t_n² < ∞.
◮ For a fixed realization ω, there exists a unique solution y(·,ω) ∈ H_0^1(D) of the PDE constraint, with ‖y(·,ω)‖_{L²(D)} ≤ C_1 ‖u‖_{L²(D)}.
◮ ∇J_ω(u) = λu − p(·,ω), where p(·,ω) solves the adjoint PDE
    ∫_D a(x,ω) ∇v·∇p dx = ∫_D (y_D − y(·,ω)) v dx    for all v ∈ H_0^1(D),
  with the bound ‖p(·,ω)‖_{L²(D)} ≤ C_2 ‖y_D − y(·,ω)‖_{L²(D)}.
◮ ‖∇J_ω(u)‖_{L²(D)} ≤ λ‖u‖_{L²(D)} + C_2(‖y_D‖_{L²(D)} + C_1‖u‖_{L²(D)}).
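A strongly simplified numerical sketch of the PSG step for this problem: a one-dimensional finite-difference discretization of the state and adjoint equations, a conductivity that is random but constant in x, and the gradient ∇J_ω(u) = λu − p(·,ω) from the slide above; the mesh, the random field, the target y_D and all constants are illustrative assumptions, not the discretization used by the authors.

```python
import numpy as np

# 1D sketch of PSG for the heat-source control problem:
#   state   -(a(w) y')' = u on (0,1), y(0) = y(1) = 0,
#   adjoint -(a(w) p')' = y_D - y,
#   grad J_w(u) = lambda*u - p.
# Mesh, random conductivity (constant in x, log-uniform in w), target y_D and
# all constants are illustrative assumptions, not the authors' discretization.
rng = np.random.default_rng(6)
m = 200                                        # interior grid points
h = 1.0 / (m + 1)
x = np.linspace(h, 1.0 - h, m)
lam = 1e-3
u_a, u_b = 0.0, 5.0                            # box constraints on the control
y_D = np.sin(np.pi * x)                        # target temperature (assumed)
a_min, a_max = 0.5, 2.0

# Dirichlet Laplacian by finite differences; with a conductivity constant in x,
# the stiffness matrix of a realization is simply a * L.
L = (np.diag(2.0 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
     - np.diag(np.ones(m - 1), -1)) / h**2

def solve_pde(a, rhs):                         # solves -(a v')' = rhs, v = 0 on the boundary
    return np.linalg.solve(a * L, rhs)

def stochastic_gradient(u):
    a = np.exp(rng.uniform(np.log(a_min), np.log(a_max)))   # random conductivity
    y = solve_pde(a, u)                        # state equation
    p = solve_pde(a, y_D - y)                  # adjoint equation
    return lam * u - p                         # grad J_w(u)

pi_C = lambda u: np.clip(u, u_a, u_b)          # projection onto the box constraints

u = np.zeros(m)
for n in range(1000):
    t_n = 100.0 / (n + 100)                    # Robbins-Monro steps (assumed constants)
    u = pi_C(u - t_n * stochastic_gradient(u))

print("max of the computed control:", u.max())
print("grid points with active upper bound:", int(np.sum(u >= u_b - 1e-8)))
```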