Parallel Scenario Decomposition of Risk Averse 0-1 Stochastic Programs
Shabbir Ahmed, ISyE, Georgia Tech
Joint work with Yan Deng and Siqian Shen (IOE, U of Michigan)
2016 ICSP
Outline
◮ Risk-Averse Stochastic 0-1 Program
    ◮ Dual representation of coherent risk measure
    ◮ Dual decomposition
    ◮ Distributionally robust counterpart
◮ Parallelization of Decomposition Method
    ◮ Motivation
    ◮ Parallel Schemes
Risk Averse 0-1 Program
$$\min_{x} \ \rho(f(x,\xi)) \quad \text{s.t.} \quad x \in X \subseteq \{0,1\}^d$$
◮ $\xi$: a random vector with finite support $\{\xi^1,\dots,\xi^K\}$ and probabilities $p_1,\dots,p_K$, where
$$p \in A = \Big\{ (p_1,\dots,p_K) : \sum_{k=1}^K p_k = 1,\ p_k \ge 0,\ \forall k = 1,\dots,K \Big\}$$
◮ $f(x,\xi)$: cost function, e.g., $f(x,\xi) = c^\top x + \min_{y} \{ \theta(y) : y \in Y(x,\xi) \}$
◮ $\rho(\cdot)$: coherent risk measure.
Coherent Risk Measure
◮ Positive homogeneity: $\rho(0) = 0$, and $\rho(\epsilon w) = \epsilon \rho(w)$ for any $\epsilon > 0$
◮ Sub-additivity: $\rho(w_1 + w_2) \le \rho(w_1) + \rho(w_2)$
◮ Monotonicity: $\rho(w_1) \ge \rho(w_2)$ if $w_1 \ge w_2$ in all scenarios
◮ Translation invariance: $\rho(w + C) = \rho(w) + C$ for any constant $C$.
◮ Artzner et al. (1999), Shapiro and Ahmed (2004), Shapiro (2013): for some uncertainty set $\mathcal{Q}(p) \subseteq A$,
$$\rho(f(x,\xi)) = \max_{q \in \mathcal{Q}(p)} \Big\{ E_q[f(x,\xi)] = \sum_{k=1}^K q_k f(x,\xi^k) \Big\}.$$
See, e.g., $\mathrm{CVaR}_{1-\epsilon}(f(x,\xi))$. [Figure: illustration of CVaR; labels: $\epsilon$, max, VaR, CVaR]
For CVaR, the dual representation reads
$$\mathrm{CVaR}_{1-\epsilon}(f(x,\xi)) = \max \Big\{ \sum_{k=1}^K q_k f(x,\xi^k) : \sum_{k=1}^K q_k = 1,\ 0 \le q_k \le p_k/\epsilon,\ \forall k = 1,\dots,K \Big\}$$
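For a finite scenario set, the maximization over $q$ in the CVaR representation is a small linear program that can be solved greedily: place as much weight as the cap $p_k/\epsilon$ allows on the largest costs first. The following is a minimal Python sketch of that idea, assuming $0 < \epsilon \le 1$; the function name `cvar_dual` and the sample data are illustrative and not taken from the talk.

```python
def cvar_dual(costs, probs, eps):
    """Value of max{sum_k q_k * costs[k]} over
    {q : sum_k q_k = 1, 0 <= q_k <= probs[k] / eps}.
    Assumes 0 < eps <= 1 so that the caps can absorb total mass 1."""
    # Visit scenarios from the largest cost to the smallest.
    order = sorted(range(len(costs)), key=lambda k: costs[k], reverse=True)
    remaining = 1.0  # probability mass still to be assigned
    value = 0.0
    for k in order:
        q_k = min(probs[k] / eps, remaining)  # respect the cap p_k / eps
        value += q_k * costs[k]
        remaining -= q_k
        if remaining <= 0.0:
            break
    return value


# Hypothetical data: four equally likely scenario costs.
costs = [10, 20, 30, 40]
probs = [0.25, 0.25, 0.25, 0.25]
print(cvar_dual(costs, probs, eps=0.5))
```

With `eps=1` the caps reduce to $q_k \le p_k$, so the only feasible $q$ is $p$ and the value is the plain expectation; smaller `eps` shifts weight toward the worst scenarios.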
◮ Minimax reformulation:
$$\min_{x \in X} \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K q_k f(x,\xi^k) \Big\}$$
    ◮ Collado et al. (2012): risk averse multistage stochastic linear program
    ◮ Ahmed (2013): 0-1 stochastic program
    ◮ Ahmed et al. (2015): 0-1 chance constrained program
Dual Decomposition
$$\min_{x \in X} \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K q_k f(x,\xi^k) \Big\}$$
◮ Clone $x$ for each scenario: $x^1,\dots,x^K$.
◮ Force $x^1 = \dots = x^K$ by the non-anticipativity constraint
$$\sum_{k=1}^K \alpha_k x^k = x^1 \qquad \text{(NAC)}$$
where $\alpha_1,\dots,\alpha_K$ are positive constants that sum to 1. Because every $x^k$ is binary and every $\alpha_k$ is positive, the convex combination can equal $x^1$ only if all copies agree.
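With the cloned variables, the problem becomes
$$\min_{x^1,\dots,x^K \in X} \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K q_k f(x^k,\xi^k) \Big\} \quad \text{s.t.} \quad \sum_{k=1}^K \alpha_k x^k = x^1 \qquad \text{(NAC)}$$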
◮ Relax (NAC) and penalize its violation with a multiplier $\lambda \in \mathbb{R}^d$. The relaxed problem is
$$\min_{x^1,\dots,x^K \in X} \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K q_k f(x^k,\xi^k) + \lambda^\top \Big( \sum_{k=1}^K \alpha_k x^k - x^1 \Big) \Big\}$$
$$= \min_{x^1,\dots,x^K \in X} \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K \big[ (\alpha_k - \delta_k) \lambda^\top x^k + q_k f(x^k,\xi^k) \big] \Big\},$$
where $\delta_1 = 1$ and $\delta_k = 0$ for $k = 2,\dots,K$.
◮ Exchanging the order of min and max can only decrease the value, and the inner minimization then separates by scenario:
$$\min_{x^1,\dots,x^K \in X} \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K \big[ (\alpha_k - \delta_k) \lambda^\top x^k + q_k f(x^k,\xi^k) \big] \Big\}$$
$$\ge \max_{q \in \mathcal{Q}(p)} \min_{x^1,\dots,x^K \in X} \Big\{ \sum_{k=1}^K \big[ (\alpha_k - \delta_k) \lambda^\top x^k + q_k f(x^k,\xi^k) \big] \Big\}$$
$$= \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K \min_{x^k \in X} \big[ (\alpha_k - \delta_k) \lambda^\top x^k + q_k f(x^k,\xi^k) \big] \Big\} =: g(\lambda)$$
◮ Hence $g(\lambda)$ is a lower bound (LB) on the optimal value for every $\lambda \in \mathbb{R}^d$.
LB Computation
$$g(\lambda) = \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K \min_{x^k \in X} \big[ (\alpha_k - \delta_k) \lambda^\top x^k + q_k f(x^k,\xi^k) \big] \Big\}$$
◮ Approach 1: LB $\leftarrow g(0)$.
$$g(0) = \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K q_k \min_{x \in X} f(x,\xi^k) \Big\}$$
    for $k = 1,\dots,K$ do
        $\beta_k \leftarrow \min \{ f(x,\xi^k) : x \in X \}$
    end for
    $\ell \leftarrow \max \big\{ \sum_{k=1}^K \beta_k q_k : q \in \mathcal{Q}(p) \big\}$
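Approach 1 can be tried out on a toy instance by brute force. The sketch below assumes the CVaR uncertainty set $\{q : \sum_k q_k = 1,\ 0 \le q_k \le p_k/\epsilon\}$ for $\mathcal{Q}(p)$ and enumerates a tiny $X$ explicitly; `cvar_dual`, `approach1_lower_bound`, `toy_cost`, and all data are hypothetical names and values chosen for illustration only.

```python
from itertools import product


def cvar_dual(costs, probs, eps):
    """max{sum_k q_k*costs[k] : sum_k q_k = 1, 0 <= q_k <= probs[k]/eps},
    solved greedily (assumes 0 < eps <= 1)."""
    order = sorted(range(len(costs)), key=lambda k: costs[k], reverse=True)
    remaining, value = 1.0, 0.0
    for k in order:
        q_k = min(probs[k] / eps, remaining)
        value += q_k * costs[k]
        remaining -= q_k
        if remaining <= 0.0:
            break
    return value


def approach1_lower_bound(f, X, scenarios, probs, eps):
    """Compute beta_k = min_x f(x, xi^k) per scenario, then
    l = max{sum_k beta_k q_k : q in Q(p)} (here: the CVaR set)."""
    betas = [min(f(x, xi) for x in X) for xi in scenarios]
    return cvar_dual(betas, probs, eps), betas


def toy_cost(x, xi):
    """Hypothetical cost: unit cost per selected item plus a penalty of 3
    for every unit of scenario demand xi_j that x_j does not cover."""
    return sum(x) + 3 * sum(max(d - xj, 0) for xj, d in zip(x, xi))


X = list(product((0, 1), repeat=2))      # all binary vectors of length 2
scenarios = [(1, 0), (0, 1), (1, 1)]
probs = [0.4, 0.4, 0.2]
lb, betas = approach1_lower_bound(toy_cost, X, scenarios, probs, eps=0.5)
print(betas, lb)
```

Each scenario subproblem is solved independently, which is the property a parallel scheme would exploit; only the final maximization over $q$ couples the scenarios.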
◮ Approach 2: LB $\leftarrow \max_\lambda g(\lambda)$.
$$\text{MP:} \quad \max_{q \in \mathcal{Q}(p),\, \lambda,\, \phi} \Big\{ \phi : \phi \le \sum_{k=1}^K \min_{x \in X} \big[ (\alpha_k - \delta_k) \lambda^\top x + q_k f(x,\xi^k) \big] \Big\}$$
    repeat
        $(\hat\phi, \hat\lambda, \hat q) \leftarrow$ MP
        for $k = 1,\dots,K$ do
            $\beta_k \leftarrow \min \{ (\alpha_k - \delta_k) \hat\lambda^\top x + \hat q_k f(x,\xi^k) : x \in X \}$, with minimizer $\hat x^k$
        end for
        add the cut $\phi \le \sum_{k=1}^K \big[ (\alpha_k - \delta_k) \lambda^\top \hat x^k + q_k f(\hat x^k,\xi^k) \big]$ to MP
    until $\hat\phi \le \sum_{k=1}^K \beta_k$
Convergence is slow: stop after some iterations and return the best $\sum_{k=1}^K \beta_k$ found.
◮ Approaches 1 to 3 all start from
$$\min_{x^1,\dots,x^K \in X} \max_{q \in \mathcal{Q}(p)} \Big\{ \sum_{k=1}^K q_k f(x^k,\xi^k) \Big\}$$
and differ in how non-anticipativity is imposed and dualized.
◮ Approach 1 & 2: a single constraint with multiplier $\lambda \in \mathbb{R}^d$,
$$\sum_{k=1}^K \alpha_k x^k = x^1 \qquad (\text{multiplier } \lambda \in \mathbb{R}^d)$$
◮ Approach 3: one constraint per scenario, with multiplier $q_i \lambda^i$,
$$\sum_{k=1}^K \alpha_k x^k = x^i \qquad (\text{multiplier } q_i \lambda^i,\ \lambda^i \in \mathbb{R}^d), \quad \forall i = 1,\dots,K$$
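◮ For Approach 3, the resulting bound is
$$g(\lambda) = \max_{q \in \mathcal{Q}(p)} \min_{x^1,\dots,x^K \in X} \Big\{ \sum_{k=1}^K q_k \big[ f(x^k,\xi^k) - (\lambda^k)^\top x^k \big] + \Big( \sum_{k=1}^K q_k \lambda^k \Big)^\top \Big( \sum_{k=1}^K \alpha_k x^k \Big) \Big\}$$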
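The two terms in this expression come from regrouping the Lagrangian of the per-scenario non-anticipativity constraints $\sum_k \alpha_k x^k = x^i$ with multipliers $q_i \lambda^i$. A short worked step, written under that sign convention (which is inferred from the formula itself):

```latex
\begin{aligned}
&\sum_{k=1}^K q_k f(x^k,\xi^k)
  + \sum_{i=1}^K q_i (\lambda^i)^\top \Big( \sum_{k=1}^K \alpha_k x^k - x^i \Big) \\
&\quad= \sum_{k=1}^K q_k f(x^k,\xi^k)
  - \sum_{i=1}^K q_i (\lambda^i)^\top x^i
  + \Big( \sum_{i=1}^K q_i \lambda^i \Big)^\top \Big( \sum_{k=1}^K \alpha_k x^k \Big) \\
&\quad= \sum_{k=1}^K q_k \big[ f(x^k,\xi^k) - (\lambda^k)^\top x^k \big]
  + \Big( \sum_{k=1}^K q_k \lambda^k \Big)^\top \Big( \sum_{k=1}^K \alpha_k x^k \Big)
\end{aligned}
```

The last term is the only one that couples the scenario copies; it vanishes whenever $\sum_k q_k \lambda^k = 0$.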
◮ Restricting $q$ to $\mathcal{Q}(p) \cap \mathcal{Q}(\lambda)$ removes the coupling term, and the bound separates by scenario:
$$g(\lambda) = \max_{q \in \mathcal{Q}(p) \cap \mathcal{Q}(\lambda)} \Big\{ \sum_{k=1}^K q_k \min_{x \in X} \big[ f(x,\xi^k) - (\lambda^k)^\top x \big] \Big\}, \quad \text{where } \mathcal{Q}(\lambda) = \Big\{ q : \sum_{k=1}^K q_k \lambda^k = 0 \Big\}$$
◮ Approach 3: LB $\leftarrow \max_\lambda g(\lambda)$.
    initialize $\lambda^1,\dots,\lambda^K$
    repeat
        for $k = 1,\dots,K$ do
            $\beta_k \leftarrow \min \{ f(x,\xi^k) - (\lambda^k)^\top x : x \in X \}$
        end for
        $\ell \leftarrow \max \big\{ \sum_{k=1}^K \beta_k q_k : q \in \mathcal{Q}(p) \cap \mathcal{Q}(\lambda) \big\}$
        update $\lambda^1,\dots,\lambda^K$
    until $\ell$ converges
Convergence is slow: stop after some iterations and return the best $\ell$ found.
Serial Algorithm
◮ LB: subproblem of scenario $k$
    Approach 1: $\min_{x \in X} \{ f(x,\xi^k) \}$
    Approach 2: $\min_{x \in X} \{ (\alpha_k - \delta_k) \lambda^\top x + q_k f(x,\xi^k) \}$
    Approach 3: $\min_{x \in X} \{ f(x,\xi^k) - (\lambda^k)^\top x \}$
◮ UB: evaluate subproblem solutions.
◮ Algorithm overview:
    initialize LB $\ell$ and UB $u$
    repeat
        compute $\ell$ and collect subproblem solutions in $S$, by Approach 1/2/3
        for $\hat x \in S$ do
            $u \leftarrow \min \{ u, \rho(f(\hat x,\xi)) \}$
        end for
        $X \leftarrow X \setminus S$
    until $u - \ell \le \epsilon$
◮ No-good cut to exclude an evaluated $\hat x$:
$$\sum_{j : \hat x_j = 1} (1 - x_j) + \sum_{j : \hat x_j = 0} x_j \ge 1.$$
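The loop above can be sketched on a toy instance where $X$ is small enough to enumerate. This is only an illustration under several assumptions: the lower bound uses Approach 1, the risk measure is CVaR evaluated through its dual representation, the no-good cuts are applied by filtering an explicit list rather than by adding constraints to a solver model, and the loop also stops once $X$ is empty. All names and data (`cvar_dual`, `no_good_lhs`, `serial_decomposition`, `toy_cost`) are hypothetical.

```python
from itertools import product


def cvar_dual(costs, probs, eps):
    """max{sum_k q_k*costs[k] : sum_k q_k = 1, 0 <= q_k <= probs[k]/eps},
    solved greedily (assumes 0 < eps <= 1)."""
    order = sorted(range(len(costs)), key=lambda k: costs[k], reverse=True)
    remaining, value = 1.0, 0.0
    for k in order:
        q_k = min(probs[k] / eps, remaining)
        value += q_k * costs[k]
        remaining -= q_k
        if remaining <= 0.0:
            break
    return value


def no_good_lhs(x, x_hat):
    """Left-hand side of the no-good cut built from x_hat, evaluated at x.
    It is 0 exactly when x == x_hat, so requiring >= 1 excludes x_hat."""
    return sum((1 - xj) if hj == 1 else xj for xj, hj in zip(x, x_hat))


def serial_decomposition(f, X, scenarios, probs, eps, tol=1e-9):
    X = list(X)
    upper, best = float("inf"), None
    while X:  # stop as well when every solution has been evaluated
        # LB step (Approach 1): one independent subproblem per scenario.
        minimizers = [min(X, key=lambda x: f(x, xi)) for xi in scenarios]
        betas = [f(x, xi) for x, xi in zip(minimizers, scenarios)]
        lower = cvar_dual(betas, probs, eps)
        # UB step: evaluate the risk of each distinct subproblem solution.
        S = list(dict.fromkeys(minimizers))
        for x_hat in S:
            value = cvar_dual([f(x_hat, xi) for xi in scenarios], probs, eps)
            if value < upper:
                upper, best = value, x_hat
        # X <- X \ S, expressed through the no-good cuts.
        X = [x for x in X if all(no_good_lhs(x, h) >= 1 for h in S)]
        if upper - lower <= tol:
            break
    return best, upper


def toy_cost(x, xi):
    """Hypothetical cost: unit cost per selected item plus a penalty of 3
    for every unit of scenario demand xi_j that x_j does not cover."""
    return sum(x) + 3 * sum(max(d - xj, 0) for xj, d in zip(x, xi))


X = list(product((0, 1), repeat=2))
scenarios = [(1, 0), (0, 1), (1, 1)]
probs = [0.4, 0.4, 0.2]
best, upper = serial_decomposition(toy_cost, X, scenarios, probs, eps=0.5)
print(best, upper)
```

Because evaluated solutions are removed from $X$, the lower bound over the remaining set can only increase from one pass to the next, which is what eventually closes the gap.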
Distributionally Robust Risk-Averse 0-1 Program
◮ Known probability distribution $p$:
$$\min_{x \in X} \rho(f(x,\xi)) = \min_{x \in X} \max_{q \in \mathcal{Q}_\rho(p)} E_q[f(x,\xi)]$$
◮ If $p$ is not known exactly, but an uncertainty set $U$ is given:
$$\min_{x \in X} \max_{p \in U} \rho(f(x,\xi))$$
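To make the inner maximization over $p$ concrete, the sketch below evaluates the worst-case CVaR of a fixed cost vector when $U$ is a finite list of candidate distributions. The finite $U$ is purely an assumption made for illustration (the slides do not specify the form of $U$), and `cvar_dual`, `worst_case_cvar`, and the data are hypothetical.

```python
def cvar_dual(costs, probs, eps):
    """max{sum_k q_k*costs[k] : sum_k q_k = 1, 0 <= q_k <= probs[k]/eps},
    solved greedily (assumes 0 < eps <= 1)."""
    order = sorted(range(len(costs)), key=lambda k: costs[k], reverse=True)
    remaining, value = 1.0, 0.0
    for k in order:
        q_k = min(probs[k] / eps, remaining)
        value += q_k * costs[k]
        remaining -= q_k
        if remaining <= 0.0:
            break
    return value


def worst_case_cvar(costs, candidate_probs, eps):
    """max over p in a finite candidate set U of CVaR under p (assumption:
    U is given as an explicit list of probability vectors)."""
    return max(cvar_dual(costs, p, eps) for p in candidate_probs)


# Hypothetical scenario costs f(x, xi^k) for one fixed x.
costs = [10, 20, 30, 40]
U = [
    [0.25, 0.25, 0.25, 0.25],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
]
print(worst_case_cvar(costs, U, eps=0.5))
```

The distribution that puts the most mass on the expensive scenarios determines the worst case; the outer minimization over $x$ would wrap this evaluation.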