Derivative-Free Robust Optimization by Outer Approximations

Stefan Wild
Mathematics and Computer Science Division, Argonne National Laboratory

Joint work with Goldfarb academic grandson Matt Menickelly (Argonne) + Sven Leyffer, Todd Munson, Charlie Vanaret (Argonne)

January 11, 2018
Outline

⋄ Nonlinear robust optimization: $\min_{x \in \mathbb{R}^n} \max_{u \in \mathcal{U}} f(x, u)$
⋄ E. Polak's method of inexact outer approximation
⋄ ∇f-free outer approximation
⋄ Early numerical experience

Images: [DebRoy, Zhang, Turner, Babu; ScrMat, 2017]
Nonlinear Robust Optimization

Guard against worst-case uncertainty in the problem data:

$\min_{x \in \mathbb{R}^n} \left\{ f(x) : c(x, u) \le 0 \ \ \forall u \in \mathcal{U} \right\}$

where
f — certain objective
c : R^n × R^m → R^p — uncertain constraints
u — uncertain variables/data
U ⊂ R^m — uncertainty set (compact, convex)

Well studied for linear (convex/concave) f, c [Ben-Tal, El Ghaoui, Nemirovski; 2009], [Bertsimas, Brown, Caramanis; SIRev 2011], ...

Special cases (see the sketch below):
Minimax: $\min_{x \in \mathbb{R}^n} \max_{u \in \mathcal{U}} f(x, u)$
Implementation errors: $\min_{x \in \mathbb{R}^n} \max_{u \in \mathcal{U}} f(x + u)$
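To ground the two special cases, here is a minimal brute-force sketch (not from the slides; the functions f, g and the sampling of U are illustrative placeholders): each robustified objective is approximated by taking the max over a finite sample of U.

```python
import numpy as np

# Worst-case ("robustified") objectives for the two special cases, by brute
# force over a finite sample of U = [-1,1]^2 (all names here are illustrative).
rng = np.random.default_rng(0)
U_sample = rng.uniform(-1, 1, size=(500, 2))

def psi_minimax(f, x):
    """Minimax case: Psi_U(x) = max_{u in U} f(x, u)."""
    return max(f(x, u) for u in U_sample)

def psi_impl_error(g, x):
    """Implementation-error case: max_{u in U} g(x + u)."""
    return max(g(x + u) for u in U_sample)

# Example with toy functions:
f = lambda x, u: u @ x - u @ u     # the uncertain constraint from a later slide
g = lambda x: np.sum(x ** 2)       # a certain objective, perturbed by u
x = np.array([0.5, -0.5])
print(psi_minimax(f, x), psi_impl_error(g, x))
```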
Another Case: Goldfarb Robust Optimization

Robust convex quadratically constrained programs:

$\min_{x \in \mathbb{R}^n} \left\{ c^\top x : \tfrac{1}{2} x^\top Q x + x^\top g + \gamma \le 0 \ \ \forall (Q, g, \gamma) \in \mathcal{U} \right\}$ (RCQP)

⋄ [Ben-Tal, Nemirovski; MathOR, 1997]: conditions on $\mathcal{U}_i$ under which (RCQP) becomes an SDP
⋄ [Goldfarb, Iyengar; MathProg, 2003]: conditions on $\mathcal{U}_i$ under which (RCQP) becomes an SOCP
  – Discrete/polytopic uncertainty sets:
    $\mathcal{U} = \left\{ (Q, g, \gamma) : (Q, g, \gamma) = \sum_{i=1}^p \lambda_i (Q_i, g_i, \gamma_i), \ \lambda \in \mathbb{R}^p_+, \ \lambda^\top e = 1, \ Q_i \succeq 0 \ \forall i \right\}$
  – Affine uncertainty sets:
    $Q = Q_0 + \sum_{i=1}^p \lambda_i Q_i, \ \|\lambda\| \le 1, \ Q_i \succeq 0 \ \forall i; \quad (g, \gamma) = (g_0, \gamma_0) + \sum_{i=1}^p v_i (g_i, \gamma_i), \ \|v\| \le 1$
  – Factorized uncertainty sets, ..., confidence regions around maximum-likelihood estimates
⋄ See also robust portfolio selection problems [Goldfarb, Iyengar; MOR, 2003]
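For the discrete/polytopic case, the robustified constraint is easy to evaluate: the left-hand side is linear in (Q, g, γ), so its maximum over the simplex of weights λ is attained at one of the p vertices, i.e., at one of the scenarios. A minimal sketch with hypothetical scenario data:

```python
import numpy as np

def worst_case_violation(x, scenarios):
    """Worst-case quadratic constraint value over a discrete/polytopic
    uncertainty set. Since (1/2) x^T Q x + g^T x + gamma is linear in
    (Q, g, gamma), the max over the simplex is attained at a vertex,
    i.e., at one of the given scenarios (Q_i, g_i, gamma_i)."""
    vals = [0.5 * x @ Q @ x + g @ x + gamma for (Q, g, gamma) in scenarios]
    return max(vals)

# Hypothetical data: two PSD scenarios in R^2
scenarios = [
    (np.eye(2), np.array([1.0, 0.0]), -1.0),
    (2 * np.eye(2), np.array([0.0, 1.0]), -2.0),
]
x = np.array([0.3, -0.2])
print(worst_case_violation(x, scenarios))  # x is robust-feasible iff <= 0
```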
Example of Robustness "Helping"

$\min_{x \in \mathbb{R}^2} \left\{ x_1 + x_2 : u_1 x_1 + u_2 x_2 - u_1^2 - u_2^2 \le 0 \ \ \forall u \in \mathcal{U} = [-1, 1]^2 \right\}$

[Figure: two panels over x ∈ [−2, 2]², showing level lines x1 + x2 = k and the feasible regions of the nominal problem (for a fixed scenario û) and of the robust problem, whose solution x* lies on the line x1 + x2 = −√2.]
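A brute-force feasibility check for this example (a sketch, not from the slides; the grid resolution and the points x, û below are illustrative):

```python
import numpy as np

# Uncertain constraint g(x, u) = u1*x1 + u2*x2 - u1^2 - u2^2 over U = [-1,1]^2.
# A point x is robust-feasible when max_{u in U} g(x, u) <= 0; the nominal
# problem only requires g(x, u_hat) <= 0 for a single fixed scenario u_hat.
def g(x, u):
    return u[0] * x[0] + u[1] * x[1] - u[0] ** 2 - u[1] ** 2

U_grid = [np.array([u1, u2]) for u1 in np.linspace(-1, 1, 101)
                             for u2 in np.linspace(-1, 1, 101)]

def worst_case(x):
    return max(g(x, u) for u in U_grid)

x = np.array([-0.5, -0.5])      # a candidate point (illustrative)
u_hat = np.array([0.5, 0.5])    # a nominal scenario (illustrative)
print("nominal constraint:   ", g(x, u_hat))
print("worst-case constraint:", worst_case(x))
```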
Notation and Assumptions

Implicitly robustified form:

$\min_{x \in \mathbb{R}^n} \max_{u \in \mathcal{U}} f(x, u) =: \min_{x \in \mathbb{R}^n} \Psi_{\mathcal{U}}(x)$ (MM)

where, for any subset $\hat{\mathcal{U}} \subseteq \mathcal{U}$, we use the relaxation

$\Psi_{\hat{\mathcal{U}}}(x) := \max_{u \in \hat{\mathcal{U}}} f(x, u) \le \Psi_{\mathcal{U}}(x)$

Sometimes we drop the subscript and write $\Psi := \Psi_{\mathcal{U}}$.

Assume the following about (MM):
a. Local Lipschitz continuity of f and ∇_x f everywhere: f(·,·) and, for any u ∈ U, the partial gradient ∇_x f(·, u) are Lipschitz continuous over any bounded subset of R^n × R^m and of R^n, respectively
b. Compactness of U
c. A solution of (MM) exists
→ no convexity of f or U assumed
An Optimality Measure

Employ a second-order convex approximation of f(·, u) at x:

$\Theta(x) := \min_{h \in \mathbb{R}^n} \max_{u \in \mathcal{U}} \left\{ f(x, u) + \langle \nabla_x f(x, u), h \rangle + \tfrac{1}{2} \|h\|^2 \right\} - \Psi(x)$

Properties of Θ: for all x ∈ R^n,
1. Θ(x) ≤ 0
2. Θ(x) is continuous
3. 0 ∈ ∂Ψ(x) if and only if Θ(x) = 0
4. $\Theta(x) = -\min_{\xi_0, \xi} \left\{ \xi_0 + \tfrac{1}{2} \|\xi\|^2 : \begin{pmatrix} \xi_0 \\ \xi \end{pmatrix} \in \operatorname{co} \left\{ \begin{pmatrix} \Psi_{\mathcal{U}}(x) - f(x, u) \\ \nabla_x f(x, u) \end{pmatrix} : u \in \mathcal{U} \right\} \right\}$

For any relaxation $\hat{\mathcal{U}} \subseteq \mathcal{U}$, we will use

$\Theta_{\hat{\mathcal{U}}}(x) := -\min_{\xi_0, \xi} \left\{ \xi_0 + \tfrac{1}{2} \|\xi\|^2 : \begin{pmatrix} \xi_0 \\ \xi \end{pmatrix} \in \operatorname{co} \left\{ \begin{pmatrix} \Psi_{\hat{\mathcal{U}}}(x) - f(x, u) \\ \nabla_x f(x, u) \end{pmatrix} : u \in \hat{\mathcal{U}} \right\} \right\} \le \Theta(x) = \Theta_{\mathcal{U}}(x)$
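For a finite $\hat{\mathcal{U}} = \{u_1, \ldots, u_q\}$, property 4 reduces $\Theta_{\hat{\mathcal{U}}}(x)$ to a small QP over simplex weights. A minimal sketch, assuming the f values and gradients at x are available (the helper name theta_hat is illustrative):

```python
import numpy as np
from scipy.optimize import minimize

def theta_hat(psi_hat, f_vals, grads):
    """Optimality measure Theta_Uhat(x) for a finite set {u_1, ..., u_q}.

    psi_hat : max_j f(x, u_j)
    f_vals  : array (q,) of values f(x, u_j)
    grads   : array (q, n) of gradients nabla_x f(x, u_j)

    Minimizes xi0 + 0.5*||xi||^2 over the convex hull of the generators
    (psi_hat - f(x, u_j), nabla_x f(x, u_j)) -- a QP in the weights lambda.
    """
    q = len(f_vals)
    a = psi_hat - np.asarray(f_vals)        # xi0 components of the generators

    def obj(lam):
        xi0 = lam @ a
        xi = lam @ grads
        return xi0 + 0.5 * xi @ xi

    cons = ({"type": "eq", "fun": lambda lam: lam.sum() - 1.0},)
    res = minimize(obj, np.full(q, 1.0 / q), bounds=[(0, 1)] * q,
                   constraints=cons, method="SLSQP")
    return -res.fun                          # Theta_Uhat(x) = -min(...) <= 0

# Hypothetical data for q = 3 scenarios in R^2:
f_vals = np.array([1.0, 0.7, 0.9])
grads = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
print(theta_hat(f_vals.max(), f_vals, grads))
```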
Inexact Method of Outer Approximation

Cutting-plane method from [Polak; Optimization, 1997]. Uses approximate solutions of the alternating block subproblems

$\min_{x \in \mathbb{R}^n} \Psi_{\hat{\mathcal{U}}}(x)$ and $\max_{u \in \mathcal{U}} f(\hat{x}, u)$

IOA Alg: Given data $\{(\epsilon_k, \Omega_k)\}_{k=0}^{\infty}$
Initialize $x_0 \in \mathbb{R}^n$, $u_1 \in \operatorname{argmax}_{u \in \Omega_0} f(x_0, u)$, $\mathcal{U}_0 \leftarrow \{u_1\}$
Loop over k:
1. Compute any $x_{k+1}$ such that $\Theta_{\mathcal{U}_k}(x_{k+1}) \ge -\epsilon_k$
2. Compute any $u' \in \operatorname{argmax}_{u \in \Omega_k} f(x_{k+1}, u)$ exactly
3. Augment $\mathcal{U}_{k+1} \leftarrow \mathcal{U}_k \cup \{u'\}$

Assumes:
a. $\Omega_k \subseteq \mathcal{U}$ and $\epsilon_k \in [0, 1]$ with $\lim_{k \to \infty} \epsilon_k = 0$
b. $\Omega_k$ grows dense in $\mathcal{U}$
c. $\min_{x \in \mathbb{R}^n} \max_{u \in \Omega_k} f(x, u)$ has a solution for all k

(A minimal code sketch of this loop follows.)
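A minimal sketch of the IOA loop, under simplifying assumptions not in the slides: each Omega[k] is a finite list of scenarios, and step 1 is approximated by a generic derivative-free solve of the finite max function rather than by certifying $\Theta_{\mathcal{U}_k}(x_{k+1}) \ge -\epsilon_k$, so this illustrates the structure, not the certified version Polak analyzes.

```python
import numpy as np
from scipy.optimize import minimize

def ioa(f, Omega, x0, eps):
    """Sketch of the inexact outer approximation loop.

    f     : callable f(x, u)
    Omega : list of finite scenario lists Omega[0], Omega[1], ...
    eps   : list of inner-solve tolerances eps[k], same length as Omega
    """
    x = np.asarray(x0, dtype=float)
    Uk = [max(Omega[0], key=lambda u: f(x, u))]          # U_0 = {u_1}
    for k in range(len(Omega)):
        psi = lambda z: max(f(z, u) for u in Uk)         # Psi_{U_k}(z)
        # Step 1 (inexact): x_{k+1} approximately minimizes Psi_{U_k}
        x = minimize(psi, x, method="Nelder-Mead",
                     options={"fatol": eps[k]}).x
        # Step 2 (exact over the finite grid): worst scenario at x_{k+1}
        u_new = max(Omega[k], key=lambda u: f(x, u))
        # Step 3: augment the outer approximation
        Uk.append(u_new)
    return x, Uk
```

Here Omega would be a sequence of finite grids growing dense in U (e.g., progressively refined samples of [−1, 1]²); the theorem's convergence guarantee does not carry over to this simplified inner solve.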
Result

Theorem [Polak]: Under the assumptions on f and the IOA Alg, for any accumulation point $x^*$ of $\{x_k\}_{k=1}^{\infty}$, $\Theta(x^*) = 0$. Thus $0 \in \partial \Psi(x^*)$.

Basic idea: as IOA progresses,
1. the finite max functions $\Psi_{\Omega_k}(x) = \max_{u \in \Omega_k} f(x, u)$ become arbitrarily good approximations of Ψ(x)
2. the optimality measures $\Theta_{\Omega_k}(x)$ become arbitrarily good approximations of the optimality measure Θ(x)
When the Derivatives Start Hiding: Simulation-Based Optimization

$\min_{x \in \mathbb{R}^n} \left\{ h(x; S(x)) : c_I[x, S(x)] \le 0, \ c_E[x, S(x)] = 0 \right\}$

⋄ S : R^n → C^p simulation output, often "noisy" (even when deterministic)
⋄ Derivatives ∇_x S often unavailable or prohibitively expensive to obtain/approximate directly
⋄ S can contribute to the objective and/or constraints
⋄ A single evaluation of S can take seconds/minutes/hours/... ⇒ evaluation is a bottleneck for optimization
⋄ This talk: $h(x; S(x)) = \max_{u \in \mathcal{U}} f(x, u)$

Functions of complex (numerical) simulations arise everywhere.
Derivative-Free Inexact Outer Approximation

Main task: compute a sufficiently accurate approximation of

$\Theta_{\Omega_k}(x_k) = -\min_{\xi_0, \xi} \left\{ \xi_0 + \tfrac{1}{2} \|\xi\|^2 : \begin{pmatrix} \xi_0 \\ \xi \end{pmatrix} \in \operatorname{co} \left\{ \begin{pmatrix} \Psi_{\Omega_k}(x_k) - f(x_k, u) \\ \nabla_x f(x_k, u) \end{pmatrix} : u \in \Omega_k \right\} \right\}$

for which certifying $\Theta_{\Omega_k}(x_k) \ge -\epsilon_k$ must be attainable when
⋄ ∇f values are unavailable
⋄ f(x, u) evaluations are expensive

Approach:
Phase 1: inner iterations to obtain $x_{k+1}$, an approximate minimizer of $\min_x \Psi_{\mathcal{U}_k}(x)$ → manifold sampling, trust-region approach
Phase 2: solve $\operatorname{argmax}_{u \in \Omega_k} f(x_{k+1}, u)$
Model-Based Approximation for the Inner Solve of $\min_x \Psi_{\mathcal{U}_k}(x)$

Associate with each $u_j \in \mathcal{U}_k$ a model about the primal iterate $y_t$ (where $y_t \to_t x_{k+1}$):

Fully Linear Models: $m_j^t$ is a fully linear model of $f(\cdot, u_j)$ on $B(y_t, \Delta)$ if there exist constants $\kappa_{j,ef}$ and $\kappa_{j,eg}$, independent of $y_t$ and $\Delta$, with

$|f(y_t + s, u_j) - m_j^t(y_t + s)| \le \kappa_{j,ef} \Delta^2 \quad \forall s \in B(0, \Delta)$
$\|\nabla_x f(y_t + s, u_j) - \nabla m_j^t(y_t + s)\| \le \kappa_{j,eg} \Delta \quad \forall s \in B(0, \Delta)$

[Conn, Scheinberg, Vicente; SIAM, 2009]
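As one concrete (hypothetical) construction: a linear model interpolating f(·, u_j) at y_t and at the n coordinate perturbations y_t + Δe_i is fully linear in the above sense when ∇_x f(·, u_j) is Lipschitz, with constants depending on the sample geometry [Conn, Scheinberg, Vicente; 2009]. A sketch:

```python
import numpy as np

def linear_interp_model(f_u, y, delta):
    """Sketch: build a linear model m(z) = c + g^T (z - y) of f(., u_j) by
    interpolating f_u at y and at y + delta * e_i (coordinate directions).
    For a well-poised sample set like this, the model satisfies the
    kappa_ef * delta^2 (value) and kappa_eg * delta (gradient) error bounds
    on B(y, delta) when the gradient of f_u is Lipschitz."""
    n = len(y)
    c = f_u(y)
    # forward differences along the coordinate axes give the model gradient
    g = np.array([(f_u(y + delta * e) - c) / delta for e in np.eye(n)])
    return lambda z: c + g @ (np.asarray(z) - y)

# Usage with a hypothetical f(., u_j):
f_u = lambda x: np.sin(x[0]) + x[1] ** 2
m = linear_interp_model(f_u, np.array([0.0, 1.0]), 0.1)
print(m([0.05, 1.05]), f_u(np.array([0.05, 1.05])))
```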