Time consistency and optimal stopping of risk averse multistage stochastic programs

A. Shapiro
School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA

Joint work with A. Pichler and R. P. Liu

Mathematical Optimization of Systems Impacted by Rare, High-Impact Random Events
ICERM, Brown University, June 2019
Let $(\Omega,\mathcal F,P)$ be a probability space and $\mathfrak F$ be a filtration $\mathcal F_0\subset\cdots\subset\mathcal F_T$ (a sequence of sigma fields) with $\mathcal F_0=\{\emptyset,\Omega\}$ and $\mathcal F_T=\mathcal F$. A stopping time is a random variable $\tau:\Omega\to\{0,\dots,T\}$ such that $\{\omega\in\Omega:\tau(\omega)=t\}\in\mathcal F_t$ for $t=0,\dots,T$.

For a random process $Z_0,\dots,Z_T$, adapted to the filtration $\mathfrak F$, the optimal stopping time problem can be written as
$$\max_{\tau\in\mathfrak T} E[Z_\tau],$$
where $\mathfrak T$ is the set of stopping times.

It is tempting to write the distributionally robust/risk averse counterpart as
$$\max_{\tau\in\mathfrak T}\,\inf_{Q\in\mathfrak M} E_Q[Z_\tau],$$
where $\mathfrak M$ is a family of probability measures on $(\Omega,\mathcal F)$.
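The expectation operator has the following property:
$$E_Q[\,\cdot\,] = E_{Q|\mathcal F_0}\Bigl[E_{Q|\mathcal F_1}\bigl[\cdots E_{Q|\mathcal F_{T-1}}[\,\cdot\,]\bigr]\Bigr], \tag{1}$$
where $E_{Q|\mathcal F_t}$ denotes the conditional expectation. Note that $E_{Q|\mathcal F_0}=E_Q$ since $\mathcal F_0=\{\emptyset,\Omega\}$. Then
$$\inf_{Q\in\mathfrak M} E_Q[\,\cdot\,] \;\ge\; \inf_{Q\in\mathfrak M} E_{Q|\mathcal F_0}\Bigl[\inf_{Q\in\mathfrak M} E_{Q|\mathcal F_1}\bigl[\cdots \inf_{Q\in\mathfrak M} E_{Q|\mathcal F_{T-1}}[\,\cdot\,]\bigr]\Bigr]. \tag{2}$$
There is a technical difficulty here, since it is not clear what is meant by the minimum (inf) of the conditional expectations $E_{Q|\mathcal F_t}$.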
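The gap in inequality (2) can be seen on a small finite example, where the infimum of conditional expectations is simply a pointwise minimum. The sketch below is illustrative only: the two-step binary tree, the family of two i.i.d. measures indexed by an up-move probability `p`, and the values of `Z` are all assumptions made for this example, not part of the talk.

```python
from fractions import Fraction as F
from itertools import product

# Paths of a two-step binary tree (1 = up move, 0 = down move).
OMEGA = list(product((0, 1), repeat=2))

# Hypothetical family M: each p defines a measure with i.i.d. up moves.
UP_PROBS = (F(2, 5), F(3, 5))

# Illustrative terminal random variable Z.
Z = {(1, 1): F(0), (1, 0): F(4), (0, 1): F(4), (0, 0): F(0)}

def prob(p, path):
    """Probability of a path under the i.i.d. measure with up probability p."""
    out = F(1)
    for step in path:
        out *= p if step else 1 - p
    return out

# Left-hand side of (2): a single measure is used over the whole horizon.
lhs = min(sum(prob(p, w) * Z[w] for w in OMEGA) for p in UP_PROBS)

def stage_inf(up_value, down_value):
    """Infimum over the family of the one-step conditional expectation."""
    return min(p * up_value + (1 - p) * down_value for p in UP_PROBS)

# Right-hand side of (2): the infimum is taken separately at every node.
after_up = stage_inf(Z[(1, 1)], Z[(1, 0)])
after_down = stage_inf(Z[(0, 1)], Z[(0, 0)])
rhs = stage_inf(after_up, after_down)

print(lhs, rhs)  # lhs is strictly larger than rhs in this example
```

In this example the nested form may pick a different measure at each node, which is why it can fall strictly below the left-hand side.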
Let $\mathcal Z := L_p(\Omega,\mathcal F,P)$ and suppose that $\mathfrak M$ is a set of probability measures absolutely continuous with respect to the reference probability measure $P$ and such that the densities $dQ/dP$, $Q\in\mathfrak M$, form a bounded convex weakly$^*$ closed set $\mathfrak A\subset\mathcal Z^*$ in the dual space $\mathcal Z^*=L_q(\Omega,\mathcal F,P)$.

Consider the functional $\varrho:\mathcal Z\to\mathbb R$ defined as
$$\varrho(Z) := \sup_{Q\in\mathfrak M} E_Q[Z] = \sup_{\zeta\in\mathfrak A}\int_\Omega \zeta(\omega)Z(\omega)\,dP(\omega).$$
Its concave counterpart is $\nu(Z) = -\varrho(-Z)$, that is,
$$\nu(Z) = \inf_{Q\in\mathfrak M} E_Q[Z].$$
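On a finite sample space the two functionals reduce to a maximum and a minimum of finitely many weighted sums. The following minimal sketch assumes a three-point sample space and a hypothetical finite family `MEASURES`; a finite family is not convex, but the supremum over it equals the supremum over its convex hull, so it stands in for the set $\mathfrak A$ here.

```python
from fractions import Fraction as F

# Hypothetical family of probability measures on three outcomes
# (each row sums to 1); chosen only for illustration.
MEASURES = [
    (F(1, 3), F(1, 3), F(1, 3)),  # reference measure P
    (F(1, 2), F(1, 4), F(1, 4)),
    (F(1, 4), F(1, 4), F(1, 2)),
]

def expectation(q, z):
    """E_Q[Z] on a finite sample space."""
    return sum(qi * zi for qi, zi in zip(q, z))

def rho(z, measures=MEASURES):
    """Convex functional: largest expectation over the family."""
    return max(expectation(q, z) for q in measures)

def nu(z, measures=MEASURES):
    """Concave counterpart: smallest expectation over the family."""
    return min(expectation(q, z) for q in measures)

Z = (F(3), F(0), F(-1))
neg_Z = tuple(-z for z in Z)

print(rho(Z), nu(Z), -rho(neg_Z))  # nu(Z) coincides with -rho(-Z)
```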
The functional $\varrho:\mathcal Z\to\mathbb R$ has the following properties for $Z,Z'\in\mathcal Z$:
(i) $\varrho(Z+Z')\le\varrho(Z)+\varrho(Z')$, subadditivity,
(ii) if $Z\preceq Z'$, then $\varrho(Z)\le\varrho(Z')$, monotonicity,
(iii) $\varrho(\lambda Z)=\lambda\varrho(Z)$, $\lambda\ge 0$, positive homogeneity,
(iv) $\varrho(Z+a)=\varrho(Z)+a$, $a\in\mathbb R$, translation equivariance.

Its concave counterpart $\nu(Z)=-\varrho(-Z)$ inherits properties (ii)-(iv) and is superadditive. The functional $\varrho$ is convex, and $\nu$ is concave.

A functional $\varrho:\mathcal Z\to\mathbb R$ is said to be (convex) coherent if it satisfies (i)-(iv) (Artzner et al (1999)). By duality, a (convex) coherent $\varrho$ can be represented in the form
$$\varrho(Z)=\sup_{\zeta\in\mathfrak A}\int_\Omega \zeta(\omega)Z(\omega)\,dP(\omega)$$
for some set of densities $\mathfrak A\subset\mathcal Z^*$.
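Properties (i)-(iv) can be checked numerically for a functional of the sup-of-expectations form. The sketch below does this by exhaustive comparison on a small grid of vectors; the three-point sample space, the family `MEASURES`, and the grid values are assumptions made for the example, and a finite check of this kind illustrates the properties rather than proving them.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical finite family of probability measures on three outcomes.
MEASURES = [
    (F(1, 3), F(1, 3), F(1, 3)),
    (F(1, 2), F(1, 4), F(1, 4)),
    (F(1, 4), F(1, 4), F(1, 2)),
]

def rho(z):
    """Sup of expectations over the family (exact rational arithmetic)."""
    return max(sum(q * x for q, x in zip(m, z)) for m in MEASURES)

def add(z, w):
    return tuple(a + b for a, b in zip(z, w))

def scale(lam, z):
    return tuple(lam * a for a in z)

def leq(z, w):
    """Pointwise order Z <= Z'."""
    return all(a <= b for a, b in zip(z, w))

# Illustrative test vectors: all combinations of three values.
samples = [tuple(F(v) for v in vals) for vals in product((-2, 0, 3), repeat=3)]

subadditive = all(rho(add(z, w)) <= rho(z) + rho(w)
                  for z in samples for w in samples)
monotone = all(rho(z) <= rho(w)
               for z in samples for w in samples if leq(z, w))
homogeneous = all(rho(scale(lam, z)) == lam * rho(z)
                  for lam in (F(0), F(1, 2), F(5)) for z in samples)
translation = all(rho(add(z, (a, a, a))) == rho(z) + a
                  for a in (F(-1), F(7, 3)) for z in samples)

print(subadditive, monotone, homogeneous, translation)
```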
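Conditional analogues (assuming that $\mathfrak A$, and hence $\varrho$ and $\nu$, are law invariant):
$$\varrho_{|\mathcal F_t}(Z) := \operatorname*{ess\,sup}_{Q\in\mathfrak M} E_{Q|\mathcal F_t}[Z],\qquad \nu_{|\mathcal F_t}(Z) := \operatorname*{ess\,inf}_{Q\in\mathfrak M} E_{Q|\mathcal F_t}[Z].$$
Note that $\varrho_{|\mathcal F_t}$ and $\nu_{|\mathcal F_t}$ can be viewed as mappings from $\mathcal Z_T = L_p(\Omega,\mathcal F_T,P)$ to $\mathcal Z_t = L_p(\Omega,\mathcal F_t,P)$, and the inequality (2) can be written as
$$\nu(\,\cdot\,) \;\ge\; \nu_{|\mathcal F_0}\Bigl(\nu_{|\mathcal F_1}\bigl(\cdots \nu_{|\mathcal F_{T-1}}(\,\cdot\,)\bigr)\Bigr).$$
Similarly,
$$\varrho(\,\cdot\,) \;\le\; \varrho_{|\mathcal F_0}\Bigl(\varrho_{|\mathcal F_1}\bigl(\cdots \varrho_{|\mathcal F_{T-1}}(\,\cdot\,)\bigr)\Bigr).$$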
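Note that for $\tau\in\mathfrak T$, $\Omega$ is the union of the disjoint sets
$$\Omega^\tau_t := \{\omega:\tau(\omega)=t\},\quad t=0,\dots,T,$$
and hence $\mathbf 1_\Omega=\sum_{t=0}^T \mathbf 1_{\{\tau=t\}}$. Moreover, $\mathbf 1_{\{\tau=t\}}Z_\tau=\mathbf 1_{\{\tau=t\}}Z_t$, and thus for $Z_t\in\mathcal Z_t$ it follows that
$$Z_\tau=\sum_{t=0}^T \mathbf 1_{\{\tau=t\}}Z_\tau=\sum_{t=0}^T \mathbf 1_{\{\tau=t\}}Z_t,$$
and hence (since $\mathbf 1_{\{\tau=t\}}Z_t$ is $\mathcal F_t$-measurable)
$$E(Z_\tau)=E\Bigl[\sum_{t=0}^T \mathbf 1_{\{\tau=t\}}Z_t\Bigr]
=\mathbf 1_{\{\tau=0\}}Z_0+E_{|\mathcal F_0}\Bigl(\mathbf 1_{\{\tau=1\}}Z_1+\cdots+E_{|\mathcal F_{T-1}}\bigl(\mathbf 1_{\{\tau=T\}}Z_T\bigr)\Bigr).$$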
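The nested decomposition of $E(Z_\tau)$ can be checked directly on a small tree. The sketch below assumes a two-step binary tree with a uniform reference measure, an illustrative adapted process `Z`, and a hypothetical stopping time `tau`; it evaluates $E(Z_\tau)$ once as a plain sum over paths and once through the nested conditional expectations, working from the innermost term outward.

```python
from fractions import Fraction as F
from itertools import product

T = 2
OMEGA = list(product((0, 1), repeat=T))  # paths of a two-step binary tree
P = {w: F(1, 4) for w in OMEGA}          # hypothetical uniform measure

def Z(t, w):
    """Adapted process: Z_t depends only on the first t coordinates of w."""
    return F(10 + sum(3 if step else -2 for step in w[:t]))

def tau(w):
    """Stopping time: stop at t=1 after an up move, otherwise at t=2."""
    return 1 if w[0] == 1 else 2

def cond_exp(X, t):
    """E[X | F_t], with F_t generated by the first t coordinates."""
    out = {}
    for w in OMEGA:
        atom = [v for v in OMEGA if v[:t] == w[:t]]
        mass = sum(P[v] for v in atom)
        out[w] = sum(P[v] * X[v] for v in atom) / mass
    return out

# Direct evaluation of E[Z_tau].
direct = sum(P[w] * Z(tau(w), w) for w in OMEGA)

# Nested evaluation: add the term 1{tau=t} Z_t, then condition on F_{t-1}.
inner = {w: F(0) for w in OMEGA}
for t in range(T, 0, -1):
    term = {w: (Z(t, w) if tau(w) == t else F(0)) + inner[w] for w in OMEGA}
    inner = cond_exp(term, t - 1)

# F_0 is trivial, so the outermost value is the same on every path.
w0 = OMEGA[0]
nested = (Z(0, w0) if tau(w0) == 0 else F(0)) + inner[w0]

print(direct, nested)
```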
Definition 1. Let $\varrho_{t|\mathcal F_t}:\mathcal Z_{t+1}\to\mathcal Z_t$, $t=0,\dots,T-1$, be monotone, translation equivariant mappings and consider the corresponding mappings $\rho_{s,t}:\mathcal Z_t\to\mathcal Z_s$ represented in the nested form
$$\rho_{s,t}(\,\cdot\,) := \varrho_{s|\mathcal F_s}\Bigl(\varrho_{s+1|\mathcal F_{s+1}}\bigl(\cdots \varrho_{t-1|\mathcal F_{t-1}}(\,\cdot\,)\bigr)\Bigr),\quad 0\le s<t\le T.$$
The stopping risk measure is
$$\rho_{0,T}(Z_\tau)=\mathbf 1_{\{\tau=0\}}Z_0+\varrho_{0|\mathcal F_0}\Bigl(\mathbf 1_{\{\tau=1\}}Z_1+\cdots+\varrho_{T-1|\mathcal F_{T-1}}\bigl(\mathbf 1_{\{\tau=T\}}Z_T\bigr)\Bigr),$$
and its concave counterpart is
$$\nu_{0,T}(Z_\tau)=\mathbf 1_{\{\tau=0\}}Z_0+\nu_{0|\mathcal F_0}\Bigl(\mathbf 1_{\{\tau=1\}}Z_1+\cdots+\nu_{T-1|\mathcal F_{T-1}}\bigl(\mathbf 1_{\{\tau=T\}}Z_T\bigr)\Bigr).$$
Distributionally robust/risk averse optimal stopping:
$$\max_{\tau\in\mathfrak T}\nu_{0,T}(Z_\tau) \tag{3}$$
or
$$\max_{\tau\in\mathfrak T}\rho_{0,T}(Z_\tau).$$
If $\varrho_{t|\mathcal F_t}$ are convex coherent, then the composite functional $\rho_{0,T}$ (functional $\nu_{0,T}$) is convex (concave) coherent, and hence
$$\nu_{0,T}(Z)=\inf_{\zeta\in\widehat{\mathfrak A}}\int_\Omega \zeta(\omega)Z(\omega)\,dP(\omega)$$
for some set of densities $\widehat{\mathfrak A}\subset\mathcal Z^*$. Thus for the corresponding set of probability measures $\widehat{\mathfrak M}=\{Q: dQ/dP\in\widehat{\mathfrak A}\}$, problem (3) can be written as
$$\max_{\tau\in\mathfrak T}\,\inf_{Q\in\widehat{\mathfrak M}} E_Q[Z_\tau].$$
Dynamic programming equations.

Definition 2 (Snell envelope). Let $Z_t\in\mathcal Z_t$, $t=0,\dots,T$, be a stochastic process. The Snell envelope (associated with the functional $\rho_{0,T}$) is the stochastic process
$$\mathcal E_T := Z_T,\qquad \mathcal E_t := Z_t\vee\varrho_{t|\mathcal F_t}(\mathcal E_{t+1}),\quad t=0,\dots,T-1,$$
defined by backward recursion. The Snell envelope for $\nu_{0,T}$ is defined similarly.
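The backward recursion of Definition 2 is straightforward to compute on a recombining binomial tree. The sketch below is a minimal illustration under stated assumptions: the horizon `T`, the ambiguity set `UP_PROBS` of up-move probabilities, and the put-style reward in `payoff` are all hypothetical, and the one-step mapping is taken to be the largest (for $\varrho$) or smallest (for $\nu$) one-step expectation over `UP_PROBS`.

```python
from fractions import Fraction as F

T = 3
UP_PROBS = (F(2, 5), F(1, 2), F(3, 5))  # hypothetical ambiguity set

def payoff(t, k):
    """Illustrative reward Z_t at the node with k up moves out of t steps."""
    level = 10 + 3 * k - 2 * (t - k)
    return F(max(11 - level, 0))

def one_step(up_value, down_value, sense=max):
    """One-step conditional mapping: max (rho) or min (nu) over UP_PROBS."""
    return sense(p * up_value + (1 - p) * down_value for p in UP_PROBS)

def snell_envelope(sense=max):
    """Backward recursion E_T = Z_T, E_t = max(Z_t, mapping(E_{t+1}))."""
    env = {(T, k): payoff(T, k) for k in range(T + 1)}
    for t in range(T - 1, -1, -1):
        for k in range(t + 1):
            cont = one_step(env[(t + 1, k + 1)], env[(t + 1, k)], sense)
            env[(t, k)] = max(payoff(t, k), cont)
    return env

env_rho = snell_envelope(max)  # envelope associated with rho_{0,T}
env_nu = snell_envelope(min)   # envelope associated with nu_{0,T}
print(env_rho[(0, 0)], env_nu[(0, 0)])
```

Passing `sense=min` switches the same recursion to the concave counterpart, so both envelopes share one code path.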
For $m=0,\dots,T$, consider $\mathfrak T_m := \{\tau\in\mathfrak T:\tau\ge m\}$, the optimization problem
$$\max_{\tau\in\mathfrak T_m}\rho_{0,T}(Z_\tau), \tag{4}$$
and
$$\tau^*_m(\omega) := \min\{t:\mathcal E_t(\omega)=Z_t(\omega),\ m\le t\le T\},\quad \omega\in\Omega.$$
Denote by $v_m$ the optimal value of the problem (4). Note the recursive property $\rho_{0,T}(Z_\tau)=\rho_{0,m}(\rho_{m,T}(Z_\tau))$, $m=1,\dots,T$.

The following assumption was used by several authors; some refer to it as the local property:
$$\varrho_{t|\mathcal F_t}(\mathbf 1_A\cdot Z)=\mathbf 1_A\cdot\varrho_{t|\mathcal F_t}(Z)\quad\text{for all } A\in\mathcal F_t,\ t=0,\dots,T-1.$$
For coherent law invariant mappings $\varrho_{t|\mathcal F_t}$ it always holds.
Recall $\mathfrak T_m := \{\tau\in\mathfrak T:\tau\ge m\}$, $\tau^*_m(\omega) := \min\{t:\mathcal E_t(\omega)=Z_t(\omega),\ m\le t\le T\}$ and the respective problem (4), $\max_{\tau\in\mathfrak T_m}\rho_{0,T}(Z_\tau)$.

Theorem 1. Let $\varrho_{t|\mathcal F_t}:\mathcal Z_{t+1}\to\mathcal Z_t$, $t=0,\dots,T-1$, be (convex or concave) monotone translation equivariant mappings possessing the local property, and let $\rho_{s,t}$, $0\le s<t\le T$, be the corresponding nested mappings. Then for $Z_t\in\mathcal Z_t$, $t=0,\dots,T$, the following holds:
(i) for $m=0,\dots,T$,
$$\mathcal E_m\succeq\rho_{m,T}(Z_\tau)\ \ \forall\,\tau\in\mathfrak T_m,\qquad \mathcal E_m=\rho_{m,T}(Z_{\tau^*_m}),$$
(ii) the stopping time $\tau^*_m$ is optimal for the problem (4),
(iii) if $\hat\tau_m$ is an optimal stopping time for the problem (4), then $\hat\tau_m\succeq\tau^*_m$,
(iv) $v_m=\rho_{0,m}(\mathcal E_m)$, $m=1,\dots,T$, and $v_0=\mathcal E_0$.
We have that $\mathcal E_t\succeq Z_t$, $t=0,\dots,T$, and
$$\tau^*_0(\omega)=\min\{t: Z_t(\omega)\ge\mathcal E_t(\omega),\ t=0,\dots,T\}$$
is an optimal solution of the optimal stopping problem, with $\mathcal E_0$ the corresponding optimal value. That is, going forward, the optimal stopping time $\tau^*_0$ stops at the first time $Z_t=\mathcal E_t$.

As in the risk neutral case, the time consistency (Bellman's principle) is ensured here by the decomposable structure of the considered nested risk measure. That is, if it was not optimal to stop within the time set $\{0,\dots,m-1\}$, then starting the observation at time $t=m$ and being based on the information $\mathcal F_m$ (i.e., conditional on $\mathcal F_m$), the same stopping rule is still optimal for the problem.
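The optimality of the first-hitting rule can be checked by brute force on a small tree. The sketch below assumes a three-step binary tree (nodes are path prefixes), a hypothetical ambiguity set `UP_PROBS`, and an illustrative reward `payoff`; the one-step mapping is the largest expectation over `UP_PROBS`. Every stopping time is encoded as a set of inner nodes at which to stop, and the nested value is computed by a recursion that relies on the local property.

```python
from fractions import Fraction as F
from itertools import product

T = 3
UP_PROBS = (F(2, 5), F(1, 2), F(3, 5))  # hypothetical ambiguity set

def payoff(node):
    """Illustrative reward Z_t at a node (a tuple of t up/down moves)."""
    level = 10 + sum(3 if step else -2 for step in node)
    return F(max(11 - level, 0))

def one_step(up_value, down_value):
    """One-step mapping: largest expectation over UP_PROBS."""
    return max(p * up_value + (1 - p) * down_value for p in UP_PROBS)

def envelope(node=()):
    """Snell envelope E_t at a node, by backward recursion."""
    if len(node) == T:
        return payoff(node)
    cont = one_step(envelope(node + (1,)), envelope(node + (0,)))
    return max(payoff(node), cont)

def first_hit(path):
    """tau*_0 along a full path: the first t with Z_t = E_t."""
    for t in range(T + 1):
        if payoff(path[:t]) == envelope(path[:t]):
            return t

def value(stop, node=()):
    """Nested value rho_{0,T}(Z_tau) for the rule 'stop at nodes in stop'."""
    if len(node) == T or node in stop:
        return payoff(node)
    return one_step(value(stop, node + (1,)), value(stop, node + (0,)))

# Enumerate every stop/continue rule on the 7 inner nodes.
inner_nodes = [n for t in range(T) for n in product((0, 1), repeat=t)]
best = max(
    value({n for n, s in zip(inner_nodes, flags) if s})
    for flags in product((False, True), repeat=len(inner_nodes))
)

# The rule induced by tau*_0: stop wherever the reward meets the envelope.
tau_star_nodes = {n for n in inner_nodes if payoff(n) == envelope(n)}
print(best, envelope(), value(tau_star_nodes))
```

In this example the three printed values agree, which is what the theorem predicts for the optimal value, the envelope at time 0, and the value of the first-hitting rule.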
For a convex law invariant risk functional $\varrho:\mathcal Z\to\mathbb R$ it holds that $E[\,\cdot\,]\le\varrho(\,\cdot\,)$. In that case the distributionally robust formulation will stop later than the corresponding risk neutral formulation. For the respective concave risk functional $\nu$, it will stop earlier.

It is also possible to combine this with policy optimization. That is, to consider problems
$$\operatorname*{min/max}_{\pi\in\Pi}\ \operatorname*{min/max}_{\tau\in\mathfrak T}\ \varrho_{0,T}\bigl(f_\tau(x_\tau(\cdot),\cdot)\bigr),$$
where $\Pi$ is the set of feasible policies $\pi=\{x_0,x_1(\cdot),\dots,x_T(\cdot)\}$ such that $f_t(x_t(\cdot),\cdot)\in\mathcal Z_t$, with $f_0:\mathbb R^{n_0}\to\mathbb R$, $f_t:\mathbb R^{n_t}\times\Omega\to\mathbb R$, and feasibility constraints defined by $X_0\subset\mathbb R^{n_0}$ and multifunctions $X_t:\mathbb R^{n_{t-1}}\times\Omega\rightrightarrows\mathbb R^{n_t}$, $t=1,\dots,T$. It is assumed that $f_t(x_t,\cdot)$ and $X_t(x_{t-1},\cdot)$ are $\mathcal F_t$-measurable.

Some of these formulations preserve convexity of $f_t(\cdot,\omega)$, and some do not.
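The comparison of stopping times at the start of this slide can be illustrated on a small tree. The sketch below assumes a three-step binary tree, an illustrative reward `payoff`, and a hypothetical ambiguity set `UP_PROBS` that contains the reference up probability 1/2, so that the one-step mappings satisfy $\nu\le E\le\varrho$. It computes the first-hitting stopping time for each of the three mappings and checks the ordering path by path (ties are resolved by stopping, so "earlier" and "later" are meant in the weak sense).

```python
from fractions import Fraction as F
from itertools import product

T = 3
UP_PROBS = (F(2, 5), F(1, 2), F(3, 5))  # contains the reference value 1/2

def payoff(node):
    """Illustrative reward Z_t at a node (a tuple of t up/down moves)."""
    level = 10 + sum(3 if step else -2 for step in node)
    return F(max(11 - level, 0))

# One-step conditional mappings: concave, risk neutral, convex.
MAPPINGS = {
    "nu": lambda up, down: min(p * up + (1 - p) * down for p in UP_PROBS),
    "E": lambda up, down: F(1, 2) * up + F(1, 2) * down,
    "rho": lambda up, down: max(p * up + (1 - p) * down for p in UP_PROBS),
}

def envelope(node, step):
    """Snell envelope at a node for the given one-step mapping."""
    if len(node) == T:
        return payoff(node)
    cont = step(envelope(node + (1,), step), envelope(node + (0,), step))
    return max(payoff(node), cont)

def stopping_time(path, step):
    """First t along the path at which the reward meets the envelope."""
    for t in range(T + 1):
        if payoff(path[:t]) == envelope(path[:t], step):
            return t

PATHS = list(product((0, 1), repeat=T))
times = {name: {w: stopping_time(w, step) for w in PATHS}
         for name, step in MAPPINGS.items()}

ordered = all(times["nu"][w] <= times["E"][w] <= times["rho"][w] for w in PATHS)
print(ordered, times["nu"][(0, 0, 0)], times["rho"][(0, 0, 0)])
```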
Interchangeability principle for a functional $\varrho:\mathcal Z\to\mathbb R$, $\mathcal Z=L_p(\Omega,\mathcal F,P)$.

Consider a function $\psi:\mathbb R^n\times\Omega\to\mathbb R\cup\{+\infty\}$. Let
$$\Psi(\omega) := \inf_{y\in\mathbb R^n}\psi(y,\omega)\quad\text{and}\quad \mathfrak Y := \{\eta:\Omega\to\mathbb R^n \mid \psi_\eta(\cdot)\in\mathcal Z\},$$
where $\psi_\eta(\cdot) := \psi(\eta(\cdot),\cdot)$.

Suppose that: the function $\psi(y,\omega)$ is random lower semicontinuous (i.e., its epigraphical mapping is closed valued and measurable), $\Psi\in\mathcal Z$, and the functional $\varrho:\mathcal Z\to\mathbb R$ is monotone.

It is said that $\varrho$ is strictly monotone if $Z\preceq Z'$ and $Z\ne Z'$ implies that $\varrho(Z)<\varrho(Z')$.