Extending SDDP-style Algorithms for Multistage Stochastic Programming
Dave Morton, Industrial Engineering & Management Sciences, Northwestern University
Joint work with Oscar Dowson, Daniel Duque, and Bernardo Pagnoncelli
Collaborators
Hydroelectric Power Itaipu (14 GW)
Yuba, Bear and South Feather Hydrological Basin
SDDP: Stochastic Dual Dynamic Programming
SLP-T

$$z^* = \min_{x_1 \ge 0} \; c_1 x_1 + \mathbb{E}_{\xi_2 \mid \xi_1}\!\left[ V_2(x_1, \xi_2) \right] \quad \text{s.t. } A_1 x_1 = B_1 x_0 + b_1,$$

where, for $t = 2, \dots, T$,

$$V_t(x_{t-1}, \xi_t) = \min_{x_t \ge 0} \; c_t x_t + \mathbb{E}_{\xi_{t+1} \mid \xi_1, \dots, \xi_t}\!\left[ V_{t+1}(x_t, \xi_{t+1}) \right] \quad \text{s.t. } A_t x_t = B_t x_{t-1} + b_t,$$

and where $V_{T+1} \equiv 0$. Each $V_t(\cdot, \xi_t)$ is piecewise linear and convex.
SLP-T Assumptions for SDDP
• Relatively complete recourse, finite optimal solution
• ξ_t = (A_t, B_t, b_t, c_t) is inter-stage independent
• Or, (A_t, B_t, c_t) is inter-stage independent and b_t satisfies, e.g.,
  – b_t = Ψ(b_{t-1}) + ε_t with ε_t inter-stage independent; or,
  – b_t = Ψ(b_{t-1}) · ε_t with ε_t inter-stage independent
• Sample space: Ω_t = Σ_2 × Σ_3 × ··· × Σ_t with |Σ_t| modest
• T may be large
What Does “Solution” Mean?
A solution is a policy.
SDDP
[Figure: (a) Forward Pass; (b) Backward Pass.]
SDDP Master Programs

$$\begin{aligned} \min_{x_t,\, \theta_t} \;\; & c_t x_t + \theta_t \\ \text{s.t. } & A_t x_t = B_t x_{t-1} + b_t \\ & -G_t^k x_t + \theta_t \ge g_t^k, \quad k = 1, 2, \dots, K \\ & x_t \ge 0 \end{aligned}$$
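For concreteness, here is a minimal sketch of solving one such stage ("master") LP with SciPy, assuming the stage data A_t, B_t, b_t, c_t and the accumulated cuts (G_k, g_k) are given as NumPy arrays. The function name, argument layout, and the θ lower bound are illustrative assumptions, not part of the talk or of any particular SDDP implementation.

```python
import numpy as np
from scipy.optimize import linprog

def solve_stage(A_t, B_t, b_t, c_t, x_prev, cuts, theta_lb=-1e6):
    """Solve  min c_t'x + theta  s.t.  A_t x = B_t x_prev + b_t,
    -G_k x + theta >= g_k for each cut (G_k, g_k), and x >= 0."""
    n = len(c_t)
    obj = np.append(c_t, 1.0)                                 # decision vector (x_t, theta)
    A_eq = np.hstack([A_t, np.zeros((A_t.shape[0], 1))])      # theta absent from flow constraints
    b_eq = B_t @ x_prev + b_t
    if cuts:                                                  # -G_k x + theta >= g_k  <=>  G_k x - theta <= -g_k
        A_ub = np.array([np.append(G_k, -1.0) for G_k, g_k in cuts])
        b_ub = np.array([-g_k for G_k, g_k in cuts])
    else:
        A_ub, b_ub = None, None
    bounds = [(0, None)] * n + [(theta_lb, None)]             # x_t >= 0, theta bounded below
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    # res.eqlin.marginals holds duals of the equality rows (HiGHS methods);
    # the backward pass uses these to build a new cut at x_prev.
    return res.x[:n], res.x[n], res.eqlin.marginals
```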
Partially Observable Multistage Stochastic Programming
Or, an alternative to DRO when you don't really know the distribution
An apology: not talking about Wasserstein-based DRO for SLP-T via an SDDP algorithm (with Daniel Duque)
Policy Graphs (Dowson)
A policy graph for SLP-3 with inter-stage independence: a chain of nodes 1 → 2 → 3.
It unfolds to a scenario tree: 1 → {2H, 2L} → {3HH, 3HL, 3LH, 3LL}.
Policy Graphs
[Figures: a Markov-switching model; random transitions.]
Inventory Example

[Policy graph: from the root R, transition with probability 1/2 to D_A or to D_B; each D_i leads to H_i with probability 1, and each H_i returns to D_i with probability ρ.]

Demand model A: P(ω = 1) = 0.2, P(ω = 2) = 0.8
Demand model B: P(ω = 1) = 0.8, P(ω = 2) = 0.2

$$D_i:\quad D_i(x) = \min_{u,\, x' \ge 0} \; u + \mathbb{E}_\omega\!\left[ H_i(x', \omega) \right] \quad \text{s.t. } x' = x + u$$

$$H_i:\quad H_i(x, \omega) = \min_{u,\, x' \ge 0} \; 2u + x' + \rho D_i(x') \quad \text{s.t. } x' = x + u - \omega$$
Policy Graphs
Each node i: incoming state x, noise ω ∈ Ω_i, control u = π_i(x, ω), outgoing state x′ = T_i(x, u, ω), one-step cost C_i(x, u, ω).
A policy graph:
• G = (R, N, E, Φ)
• ω_j ∈ Ω_j: node-wise independent noise
• feasible controls: u ∈ U_i(x, ω)
• transition function: x′ = T_i(x, u, ω)
• one-step cost function: C_i(x, u, ω)
Policy Graphs

$$\min_\pi \; \mathbb{E}_{i \in R^+;\, \omega \in \Omega_i}\!\left[ V_i(x_R, \omega) \right] \tag{1}$$

where

$$\begin{aligned} V_i(x, \omega) = \min_{\bar{x},\, u,\, x'} \;\; & C_i(\bar{x}, u, \omega) + \mathbb{E}_{j \in i^+;\, \varphi \in \Omega_j}\!\left[ V_j(x', \varphi) \right] \\ \text{s.t. } & \bar{x} = x \\ & u \in U_i(\bar{x}, \omega) \\ & x' = T_i(\bar{x}, u, \omega) \end{aligned} \tag{2}$$

Goal: Find π_i(x, ω) that solves (1) for each i ∈ N, x, and ω.
(A1) N is finite
(A2) Ω_i is finite and ω_i is node-wise independent for all i ∈ N
(A3) Excluding the cost-to-go term, subproblem (2) is an LP
(A4) Subproblem (2) has a finite optimal solution
(A5) A leaf node is hit with probability 1 (or graph G is acyclic)
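Once a policy π is in hand, the objective in (1) can be estimated by Monte Carlo simulation of the graph. Below is a hedged sketch of one forward pass on an acyclic policy graph; the node attributes (children, child_probs, noises, noise_probs, cost, transition) and the policy lookup are hypothetical names for illustration, not the SDDP.jl interface.

```python
import random

def forward_pass(graph, policy, x_root):
    """Simulate one path: sample a child, sample its noise, apply the policy,
    accumulate cost, and move to the outgoing state x' = T_i(x, u, omega)."""
    x, node, total_cost = x_root, graph.root, 0.0
    while node.children:                               # stop at a leaf (acyclic case, assumption (A5))
        node = random.choices(node.children, weights=node.child_probs)[0]
        omega = random.choices(node.noises, weights=node.noise_probs)[0]
        u = policy[node.name](x, omega)                # u = pi_i(x, omega)
        total_cost += node.cost(x, u, omega)           # C_i(x, u, omega)
        x = node.transition(x, u, omega)               # x' = T_i(x, u, omega)
    return total_cost
```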
Policy Graphs with Partial Observability
Extend the policy graph to G = (R, N, E, Φ, A), where A partitions N:

$$A \cap A' = \emptyset \ \text{for}\ A \ne A', \qquad \bigcup_{A \in \mathcal{A}} A = \mathcal{N}$$

We know the current ambiguity set A, but not which node.
Full observability: A = {{i} : i ∈ N}, i.e., every ambiguity set has |A| = 1.
But we could have |A| = 2, where we know the stage but not the node.
Updates to the Belief State
For the inventory policy graph above: A = {A_1, A_2}, with A_1 = {D_A, D_B} and A_2 = {H_A, H_B}.

By Bayes' rule,

$$P\{\text{Node} = k \mid \omega, A\} = \frac{\mathbb{1}_{k \in A} \cdot P\{\omega \mid \text{Node} = k\}\, P\{\text{Node} = k\}}{P\{\omega\}}$$

which gives the belief update

$$b_k \leftarrow \frac{\left[\mathbb{1}_{k \in A} \cdot P(\omega \in \Omega_k)\right] \sum_{i \in \mathcal{N}} b_i \phi_{ik}}{\sum_{j \in A} \left( \sum_{i \in \mathcal{N}} b_i \phi_{ij} \right) P(\omega \in \Omega_j)},
\qquad \text{or compactly} \qquad
b \leftarrow B(b, \omega) = \frac{D_\omega A \Phi^\top b}{\sum_{j \in A} \left( \sum_{i \in \mathcal{N}} b_i \phi_{ij} \right) P(\omega \in \Omega_j)}$$
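The update can be coded directly. Below is a minimal sketch, assuming Phi holds the transition probabilities φ_ij, A is the index set of the current ambiguity set, and lik[k] = P(ω ∈ Ω_k) is the likelihood of the observed noise under node k; all names are illustrative.

```python
import numpy as np

def belief_update(b, Phi, A, lik):
    """Bayesian update b <- B(b, omega), restricted to the ambiguity set A."""
    prior = b @ Phi               # prior over the next node: sum_i b_i * phi_ik
    post = np.zeros_like(prior)
    post[A] = prior[A] * lik[A]   # zero outside A, weight by the likelihood of omega
    return post / post.sum()      # normalize so the posterior belief sums to one
```

In the inventory example, A would index {D_A, D_B} before demand is observed and {H_A, H_B} after.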
Policy Graphs with Partial Observability
Each node i: incoming state and belief (x, b), noise ω ∈ Ω_i, belief update b ← B(b, ω), control u = π_i(x, ω, b), outgoing state x′ = T_i(x, u, ω) with belief b, one-step cost C_i(x, u, ω).
• All nodes in an ambiguity set have the same C_i, T_i, and U_i
• Children i^+, transition probabilities φ_ij, and even Ω_i may differ
Policy Graphs with Partial Observability

$$\min_\pi \; \mathbb{E}_{i \in R^+;\, \omega \in \Omega_i}\!\left[ V_i\big(x_R, B_i(b_R, \omega), \omega\big) \right] \tag{3}$$

where

$$\begin{aligned} V_i(x, b, \omega) = \min_{\bar{x},\, u,\, x'} \;\; & C_i(\bar{x}, u, \omega) + V(x', b) \\ \text{s.t. } & \bar{x} = x \\ & u \in U_i(\bar{x}, \omega) \\ & x' = T_i(\bar{x}, u, \omega) \end{aligned}$$

and where

$$V(x', b) = \sum_{j \in \mathcal{N}} \sum_{k \in \mathcal{N}} b_j \phi_{jk} \sum_{\varphi \in \Omega_k} P(\varphi \in \Omega_k) \cdot V_k\big(x', B_k(b, \varphi), \varphi\big)$$

Goal: Find π_A(x, b, ω) that solves (3) for each A ∈ A, x, b, and ω.
Saddle Property of the Cost-to-go Function

$$\begin{aligned} V_i(x, b, \omega) = \min_{\bar{x},\, u,\, x'} \;\; & C_i(\bar{x}, u, \omega) + V(x', b) \\ \text{s.t. } & \bar{x} = x \\ & u \in U_i(\bar{x}, \omega) \\ & x' = T_i(\bar{x}, u, \omega) \end{aligned}$$

where

$$V(x', b) = \sum_{j \in \mathcal{N}} \sum_{k \in \mathcal{N}} b_j \phi_{jk} \sum_{\varphi \in \Omega_k} P(\varphi \in \Omega_k) \cdot V_k\big(x', B_k(b, \varphi), \varphi\big)$$

Assume (A1)–(A5) with G acyclic.
Lemma 1. Fix i, b, ω. Then V_i(x, b, ω) is piecewise linear and convex in x.
Lemma 2. Fix x′. Then V(x′, b) is piecewise linear and concave in b.
Theorem 1. V(x′, b) is a piecewise linear saddle function: convex in x′ for fixed b and concave in b for fixed x′.
Linear Interpolation: Towards an SDDP Algorithm
[Figure: piecewise linear interpolation of V(b) over belief points b̄^1 = 0, b̄^2, b̄^3, b̄^4, b̄^5 = 1.]

$$\begin{aligned} V(b) = \max_{\gamma \ge 0} \;\; & \sum_{k=1}^K \gamma_k V(\bar{b}^k) \\ \text{s.t. } & \sum_{k=1}^K \gamma_k = 1 \\ & \sum_{k=1}^K \gamma_k \bar{b}^k = b \end{aligned}$$
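This interpolation is itself a small LP. Below is a minimal sketch of evaluating it, assuming b_pts stacks the belief points b̄^k as rows and v_pts holds the sampled values V(b̄^k); the names and the use of SciPy are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def interpolated_value(b, b_pts, v_pts):
    """max_{gamma >= 0} sum_k gamma_k V(b_k)  s.t.  sum_k gamma_k b_k = b,  sum_k gamma_k = 1."""
    K = len(v_pts)
    A_eq = np.vstack([b_pts.T, np.ones((1, K))])   # rows: sum_k gamma_k b_k = b  and  sum_k gamma_k = 1
    b_eq = np.append(np.atleast_1d(b), 1.0)
    res = linprog(-np.asarray(v_pts), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * K, method="highs")
    return -res.fun                                # flip sign: linprog minimizes
```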
Saddle Function with Interpolated Cuts
[Figure: the saddle function V(x′, b) plotted over x′ and b.]
Computing Cuts for What?

$$\begin{aligned} V_i(x, b, \omega) = \min_{\bar{x},\, u,\, x'} \;\; & C_i(\bar{x}, u, \omega) + V_A(x', b) \\ \text{s.t. } & \bar{x} = x \\ & u \in U_i(\bar{x}, \omega) \\ & x' = T_i(\bar{x}, u, \omega) \end{aligned}$$

where

$$V_A(x', b) = \sum_{j \in A} \sum_{k \in j^+} b_j \phi_{jk} \sum_{\varphi \in \Omega_k} P(\varphi \in \Omega_k) \cdot V_k\big(x', B_k(b, \varphi), \varphi\big)$$
SDDP Master Program

$$\begin{aligned} V_i^K(x, b, \omega) = \min_{\bar{x},\, u,\, x',\, \theta}\, \max_{\gamma \ge 0} \;\; & C_i(\bar{x}, u, \omega) + \sum_{k=1}^K \gamma_k \theta_k \\ \text{s.t. } & \bar{x} = x && [\lambda] \\ & u \in U_i(\bar{x}, \omega) \\ & x' = T_i(\bar{x}, u, \omega) \\ & \sum_{k=1}^K \gamma_k \bar{b}^k = b && [\mu] \\ & \sum_{k=1}^K \gamma_k = 1 && [\nu] \\ & \theta_k \ge G^k x' + g^k, \quad k = 1, \dots, K \end{aligned}$$
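Taking the LP dual of the inner maximization over γ, with μ and ν the multipliers on the two equality constraints, collapses the min–max into the single-level LP on the next slide. A sketch of that step:

```latex
% LP dual of the inner maximization over gamma; substituting
% theta_k >= G^k x' + g^k then yields the constraints
% mu' b^k + nu >= G^k x' + g^k of the single-level master program.
\max_{\gamma \ge 0} \left\{ \sum_{k=1}^{K} \gamma_k \theta_k \;:\;
    \sum_{k=1}^{K} \gamma_k \bar b^{\,k} = b \;[\mu], \;\;
    \sum_{k=1}^{K} \gamma_k = 1 \;[\nu] \right\}
\;=\;
\min_{\mu,\nu} \left\{ \mu^\top b + \nu \;:\;
    \mu^\top \bar b^{\,k} + \nu \ge \theta_k, \;\; k = 1,\dots,K \right\}
```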
SDDP Master Program

$$\begin{aligned} V_i^K(x, b, \omega) = \min_{\bar{x},\, u,\, x',\, \mu,\, \nu} \;\; & C_i(\bar{x}, u, \omega) + \mu^\top b + \nu \\ \text{s.t. } & \bar{x} = x && [\lambda] \\ & u \in U_i(\bar{x}, \omega) \\ & x' = T_i(\bar{x}, u, \omega) \\ & \mu^\top \bar{b}^k + \nu \ge G^k x' + g^k, \quad k = 1, \dots, K \end{aligned}$$

Theorem 2. Assume (A1)–(A5) with G acyclic. Let the sample paths of the “obvious” SDDP algorithm be generated independently at each iteration. Then the algorithm converges to an optimal policy almost surely in a finite number of iterations.
Inventory Example (recap)
Inventory Example: Train Four Policies
1. Fully observable: distribution known upon departing R
2. Partially observable: ambiguity partition {D_A, D_B}, {H_A, H_B}
3. Risk-neutral average demand: demand equally likely to be 1 or 2
4. DRO average demand: modified χ² method with radius 0.25
Inventory Example: Train Four Policies
• 2,000 out-of-sample costs over 50 periods; quartiles shown; ρ = 0.9
[Figure: box plots of discounted and undiscounted simulated cost ($) for the fully observable, partially observable, risk-neutral average-demand, and DRO average-demand policies.]
Inventory Example
One sample path of the partially observable policy.
[Figure: (a) belief in model A, (b) first-stage buy (units), and (c) inventory (units), each over periods 1–12.]
Concluding Thoughts
• Partially observable multistage stochastic programs
  – Saddle-cut SDDP algorithm
  – SDDP.jl (Dowson and Kapelevich)
• Related saddle-function work in stochastic programming
  – Baucke et al. (2018): risk measures
  – Downward et al. (2018): stage-wise dependent objective coefficients
• Closely related ideas are well known in POMDPs
  – Contextual, multi-model, concurrent MDPs
  – We allow continuous state and action spaces via convexity
• Countably infinite LPs for the cyclic case
• We did not handle decision-dependent learning
  – b ← B(b, ω) versus b ← B(b, ω, u)
Concluding Thoughts
Paper: http://www.optimization-online.org/DB_HTML/2019/03/7141.html