Dynamic Stochastic Optimization
Bill (Lin-Liang) Wu
University of Toronto
linliang.wu@mail.utoronto.ca
CUMC Conference 2014, July 3, 2014
Overview

Theory
1. Introduction
   - Definitions
   - Methodology

Applications
2. Production-consumption model
   - Portfolio allocation

More Theory!
3. Dynamic programming principle
   - Verification theorem
   - Viscosity solutions
Introduction

- Study of optimization problems subject to randomness
- Deterministic vs. non-deterministic optimization
  - Deterministic: no randomness in how future states develop; the system always produces the same output from a given initial condition
  - Non-deterministic: dynamical systems subject to random perturbations (e.g., the subjectivity of people)
- Goal: find an optimal control that optimizes some performance criterion
Definitions and Theorems

Definition (Probability space). A triple (Ω, F, P), where Ω is the sample space, F is a σ-field on Ω, and P is a probability measure.

Definition (Measurable). A function f : Ω → [−∞, ∞] is measurable if {f ∈ B} ∈ F for all B ∈ B(R), where B(R) = ∩{ G : G is a σ-field on R containing every open interval I = (a, b) ⊂ R } is the Borel σ-field.

Definition (Stochastic process). A sequence of random variables, i.e., F-measurable functions X(n) : Ω → R for n ≥ 0, denoted X = (X(n))_{n ≥ 0}.
Definitions and Theorems

Definition (Filtration). A sequence of σ-fields F_n such that F_n ⊂ F and F_n ⊂ F_{n+1}. A process X is adapted if each X(n) is F_n-measurable.

Definition (Brownian motion). A mapping W : [0, ∞) × Ω → R on a probability space (Ω, F, P), measurable with respect to the product σ-field B([0, ∞)) × F = σ{ B × A : B ∈ B([0, ∞)), A ∈ F }, such that
1. W(0) = 0 a.s. (P);
2. for 0 ≤ s < t < ∞, W(t) − W(s) has a normal distribution with mean zero and standard deviation √(t − s);
3. for all m and all 0 = t_0 ≤ t_1 ≤ ⋯ ≤ t_m, the increments W(t_{n+1}) − W(t_n), n = 0, 1, …, m − 1, are independent.
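Illustration (not in the original slides): a Brownian path can be sampled directly from the definition by summing independent Gaussian increments; the horizon and step count below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)

def brownian_path(T, n_steps, rng):
    """Sample W on a grid of [0, T] from independent N(0, dt) increments (properties 2 and 3)."""
    dt = T / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    W = np.concatenate(([0.0], np.cumsum(increments)))   # W(0) = 0 (property 1)
    return np.linspace(0.0, T, n_steps + 1), W

t, W = brownian_path(T=1.0, n_steps=1000, rng=rng)
print(W[-1])   # one sample of W(1), distributed N(0, 1)
```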
Definitions and Theorems

Definition (Stochastic differential equation (SDE)).
dX(t) = a(t, X(t)) dt + b(t, X(t)) dW(t),
where a(t, x), b(t, x) : R² → R, with integral form
X(t) = X(0) + ∫_0^t a(s, X(s)) ds + ∫_0^t b(s, X(s)) dW(s),
where the first integral is a Riemann (or Lebesgue) integral and the second is an Itô integral.

Theorem (Itô formula). If F : R → R is of class C², then
F(W(t)) − F(0) = ∫_0^t F′(W(s)) dW(s) + (1/2) ∫_0^t F″(W(s)) ds.
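Illustration (not in the original slides): a minimal Python sketch of how an SDE of this form can be simulated with the Euler-Maruyama scheme; the drift, diffusion, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def euler_maruyama(a, b, x0, T, n_steps, rng):
    """Approximate dX = a(t, X) dt + b(t, X) dW on [0, T] with the Euler-Maruyama scheme."""
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))                      # Brownian increment over one step
        x[i + 1] = x[i] + a(t[i], x[i]) * dt + b(t[i], x[i]) * dW
    return t, x

# Illustrative example: geometric Brownian motion dX = 0.05 X dt + 0.2 X dW.
t, x = euler_maruyama(lambda t, x: 0.05 * x, lambda t, x: 0.2 * x,
                      x0=1.0, T=1.0, n_steps=1000, rng=rng)
print(x[-1])
```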
Methodology: Structure of a Stochastic Optimization Problem

There are four components in a stochastic optimization problem.

State of the system: Start with a dynamical system which evolves over time in an uncertain environment on the probability space (Ω, F, P). The state of the system at time t is denoted X_t(ω) for a world scenario ω ∈ Ω. The evolution of the system is characterized by the mapping t ↦ X_t(ω) through a stochastic differential equation, in particular geometric Brownian motion.

Control: The evolution t ↦ X_t of the system is influenced by a control α whose value α_t is chosen at each time t; controls satisfying the problem's constraints are called admissible. We denote by A the set of admissible controls.
Methodology: Structure of a Stochastic Optimization Problem

Performance criterion: The goal is to optimize, over all admissible controls, the functional
J(X, α) = E[ ∫_0^∞ e^{-θt} f(X_t, α_t) dt ],
where f is the reward function and θ > 0 is the discount factor. The value function is defined as V = sup_α J(X, α).

The main goal in a stochastic optimization problem is to find an optimal control policy α* = {α*(t) : t ≥ 0} that attains the value function, i.e., achieves the supremum.
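Illustration (not in the original slides): for a fixed admissible feedback control, J can be estimated by Monte Carlo by truncating the infinite horizon at a large T, simulating the controlled state with an Euler scheme, and averaging the discounted rewards over many paths. The dynamics, reward, policy, and truncation below are illustrative assumptions, not the talk's model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative controlled dynamics dX = mu(X, a) dt + sigma dW with feedback policy a = policy(x),
# reward f(x, a), and discount theta.
theta, sigma, dt, T, n_paths = 0.5, 0.3, 0.01, 20.0, 2000
policy = lambda x: -0.5 * x                  # a simple stabilizing feedback rule (assumed)
mu = lambda x, a: a
f = lambda x, a: -(x**2 + a**2)

n_steps = int(T / dt)
times = dt * np.arange(n_steps)
x = np.full(n_paths, 1.0)                    # all paths start at X_0 = 1
J = np.zeros(n_paths)
for i in range(n_steps):
    a = policy(x)
    J += np.exp(-theta * times[i]) * f(x, a) * dt        # accumulate discounted reward
    x += mu(x, a) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)

print(J.mean(), J.std() / np.sqrt(n_paths))  # estimate of J and its standard error
```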
Methodology: Structure of a Stochastic Optimization Problem

Hamilton-Jacobi-Bellman equation: Given a state system X = {X(t) : t ≥ 0} characterized by
dX(t) = µ(X(t), α(t)) dt + σ(X(t), α(t)) dB(t)
for a Brownian motion B = {B(t) : t ≥ 0}, the HJB equation is
sup_α { f(x, α) + µ(x, α) V_x(x) + (σ²(x, α) / 2) V_xx(x) − θ V(x) } = 0
for the value function V(x).

The HJB equation is hard to solve analytically, so it is usually approximated numerically, e.g., by finite-difference methods (finding suitable methods is an active area of research in numerical analysis and PDEs).
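Illustration (not in the original slides): a minimal sketch of one common way to approximate such an HJB equation numerically, namely value iteration on a grid via a semi-Lagrangian / Markov-chain approximation of the controlled diffusion. The toy problem, grid, and tolerances are all illustrative assumptions, not the method or model from the talk.

```python
import numpy as np

# Toy problem: dX = alpha dt + sigma dW, running reward f(x, alpha) = -(x^2 + alpha^2), discount theta.
theta, sigma, dt = 0.5, 0.3, 0.01
xs = np.linspace(-2.0, 2.0, 201)        # state grid
alphas = np.linspace(-2.0, 2.0, 41)     # discretized control set

def reward(x, a):
    return -(x**2 + a**2)

V = np.zeros_like(xs)
for _ in range(10000):
    Q = np.empty((len(xs), len(alphas)))
    for j, a in enumerate(alphas):
        # approximate the diffusion over one step by an up/down move of size sigma*sqrt(dt)
        x_up = np.clip(xs + a * dt + sigma * np.sqrt(dt), xs[0], xs[-1])
        x_dn = np.clip(xs + a * dt - sigma * np.sqrt(dt), xs[0], xs[-1])
        cont = 0.5 * (np.interp(x_up, xs, V) + np.interp(x_dn, xs, V))
        Q[:, j] = reward(xs, a) * dt + np.exp(-theta * dt) * cont   # discounted Bellman update
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-6:
        break
    V = V_new

policy = alphas[Q.argmax(axis=1)]       # approximately optimal feedback control on the grid
print(V[100], policy[100])              # value and control at x = 0 (grid midpoint)
```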
Application 1: Production-consumption model

Model for a production unit: The capital value K_t evolves according to the investment rate I_t in capital and the price S_t per unit of capital:
dK_t = K_t (dS_t / S_t) + I_t dt
The debt L_t of the production unit evolves in terms of the interest rate r, the consumption rate C_t, and the productivity rate P_t of capital:
dL_t = r L_t dt − (K_t / S_t) dP_t + (I_t + C_t) dt
Choose a dynamic model for (S_t, P_t):
dS_t = µ S_t dt + σ_1 S_t dW^1_t
dP_t = b dt + σ_2 dW^2_t
where (W^1, W^2) is a two-dimensional Brownian motion on a filtered probability space (Ω, F, 𝔽 = (F_t)_t, P), and µ, b, σ_1, σ_2 are constants with σ_1, σ_2 > 0.
Application 1: Production-consumption model

Dynamics: The net value of the production unit is X_t = K_t − L_t. Impose the constraints K_t ≥ 0, C_t ≥ 0, X_t ≥ 0 for t ≥ 0.
Control variables: denote k_t = K_t / X_t and c_t = C_t / X_t for investment and consumption.
Dynamics of the controlled system:
dX_t = X_t [ k_t (µ − r + b / S_t) + (r − c_t) ] dt + k_t X_t σ_1 dW^1_t + k_t (X_t / S_t) σ_2 dW^2_t
dS_t = µ S_t dt + σ_1 S_t dW^1_t
Given a discount factor β > 0 and a utility function U, the objective is to determine the optimal investment and consumption for the production unit:
sup_{(k, c)} E[ ∫_0^∞ e^{-βt} U(c_t X_t) dt ]
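Illustration (not in the original slides): a sketch simulating the controlled dynamics (X, S) under a constant policy (k, c) with an Euler scheme; all parameter values are illustrative assumptions, since the talk leaves µ, b, σ_1, σ_2 and r unspecified.

```python
import numpy as np

rng = np.random.default_rng(7)

# Illustrative constants for the production-consumption dynamics.
mu, b, sigma1, sigma2, r = 0.08, 0.5, 0.2, 0.1, 0.03
k, c = 0.6, 0.04                      # constant investment and consumption fractions (assumed)
dt, n_steps = 1 / 252, 2520           # ten "years" of daily steps
X, S = 1.0, 10.0                      # initial net value and capital price

for _ in range(n_steps):
    dW1, dW2 = rng.normal(0.0, np.sqrt(dt), size=2)   # independent Brownian increments
    dX = X * (k * (mu - r + b / S) + (r - c)) * dt + k * X * sigma1 * dW1 + k * (X / S) * sigma2 * dW2
    dS = mu * S * dt + sigma1 * S * dW1
    X, S = X + dX, S + dS

print(X, S)   # terminal net value and capital price along this sample path
```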
Application 2: Portfolio allocation

Model: A financial market consisting of a riskless asset with strictly positive price process S^0, representing the savings account, and n risky assets with price process S, representing stocks.
We invest in this market; at time t, with α_t denoting the holdings in the risky assets, the number of shares held in the savings account is (X_t − α_t · S_t) / S^0_t.
SDE: The self-financed wealth process evolves according to
dX_t = ((X_t − α_t · S_t) / S^0_t) dS^0_t + α_t · dS_t
Dynamics: The control is the process α, and the portfolio allocation problem is to choose the best investment in these assets. Denoting by U the utility function, we seek sup_α E[ U(X_T) ] over an investment horizon T.
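Illustration (not in the original slides): a Monte Carlo sketch that evaluates E[U(X_T)] for one fixed strategy. It assumes a Black-Scholes market and a constant-proportion portfolio, neither of which is specified in the slides.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed market: dS0 = r S0 dt, dS = mu S dt + sigma S dW.
# Assumed strategy: keep a constant fraction pi of wealth in the stock.
r, mu, sigma, T, dt, n_paths = 0.02, 0.07, 0.2, 1.0, 1 / 252, 5000
pi = 0.5
U = np.log                                  # an example utility function

n_steps = int(T / dt)
X = np.full(n_paths, 1.0)                   # initial wealth X_0 = 1
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X *= 1.0 + (r + pi * (mu - r)) * dt + pi * sigma * dW   # self-financed wealth update
print(np.mean(U(X)))                        # Monte Carlo estimate of E[U(X_T)] for this strategy
```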
Dynamic programming principle

Goal: The primary goal of a stochastic control problem is to find an optimal control.
Definition: Dynamic programming is an optimization technique.
Comparison with divide and conquer: Both techniques split their input into parts, find subsolutions to the parts, and then synthesize larger solutions from the smaller ones.
- Divide and conquer: splits its input at prespecified, deterministic points (e.g., always in the middle).
- Dynamic programming: splits its input at every possible split point rather than at pre-specified points; after trying all split points, it determines which split point is optimal (see the small example below).
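Illustration (not in the original slides): the "try all split points" idea in a small Python example, rod cutting, where the optimal revenue for length n is found by trying every possible first cut and reusing memoized subsolutions. The price table is illustrative.

```python
from functools import lru_cache

# prices[i] is the price of a rod piece of length i + 1 (illustrative data).
prices = [1, 5, 8, 9, 10, 17, 17, 20]

@lru_cache(maxsize=None)
def best_revenue(n):
    """Best revenue for a rod of length n: try every possible first cut, then recurse."""
    if n == 0:
        return 0
    return max(prices[i - 1] + best_revenue(n - i) for i in range(1, n + 1))

print(best_revenue(8))  # 22, achieved by cutting into lengths 2 and 6
```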
Dynamic programming principle

Idea:
Question: How can we use the dynamic programming principle to compute an optimal control?
Answer: Partition the time interval into smaller chunks, optimize over each chunk individually, and let the partition size go to zero.
Answer: The calculation of the optimal control then becomes a pointwise minimization.
HJB equation: Letting t → 0 is how we obtain the HJB equation: assume that V is smooth enough and apply Itô's formula between 0 and t (a derivation sketch is given after the next slide).
Dynamic programming principle

Model: Consider the infinite-horizon discounted cost functional
J(x, α) = E[ ∫_0^∞ e^{-θs} f(X_s, α(X_s)) ds ]
for a discount factor θ. The value function of this stochastic control problem is V(x) = inf_α J(x, α), where the infimum is taken over all control functions.
Then for an optimal control function α*, we have, for x ∈ R and t ∈ (0, ∞),
V(x) = J(x, α*) = E[ ∫_0^t e^{-θs} f(X*_s, α*(X*_s)) ds ] + E[ ∫_t^∞ e^{-θs} f(X*_s, α*(X*_s)) ds ]
⟹ V(x) = inf_α E[ ∫_0^t e^{-θs} f(X^α_s, α(X^α_s)) ds + e^{-θt} V(X^α_t) ]
(Dynamic programming principle)
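Illustration (not in the original slides): a sketch of how the HJB equation follows from the dynamic programming principle, in the inf (cost-minimization) form of this slide, assuming V is C² and the usual integrability conditions hold.

```latex
% Apply Ito's formula to e^{-\theta t} V(X_t^\alpha) for an arbitrary admissible control \alpha:
e^{-\theta t} V(X_t^\alpha) = V(x)
  + \int_0^t e^{-\theta s}\Big[-\theta V(X_s^\alpha) + \mu(X_s^\alpha,\alpha_s)\,V_x(X_s^\alpha)
      + \tfrac{1}{2}\sigma^2(X_s^\alpha,\alpha_s)\,V_{xx}(X_s^\alpha)\Big]\, ds
  + \int_0^t e^{-\theta s}\,\sigma(X_s^\alpha,\alpha_s)\,V_x(X_s^\alpha)\, dW_s .
% Take expectations (the stochastic integral is a mean-zero martingale under integrability
% conditions), substitute into the dynamic programming principle, divide by t, and let t -> 0:
\inf_{\alpha}\Big\{ f(x,\alpha) - \theta V(x) + \mu(x,\alpha)\,V_x(x)
  + \tfrac{1}{2}\sigma^2(x,\alpha)\,V_{xx}(x) \Big\} = 0 .
```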