Time Inconsistent Optimal Control and Mean Variance Optimization

Tomas Björk, Stockholm School of Economics
Agatha Murgoci, Copenhagen Business School
Xunyu Zhou, Oxford University

Conference in honour of Walter Schachermayer, Wien 2010

– Typeset by FoilTeX – 1
Contents

• Recap of DynP.
• Problem formulation.
• Discrete time.
• Continuous time.
• Example: Dynamic mean-variance optimization.
Standard problem

We are standing at time t = 0 in state X_0 = x_0.

\[ \max_u \; E\left[ \int_0^T h(s, X_s, u_s)\, ds + F(X_T) \right] \]
\[ dX_t = \mu(t, X_t, u_t)\, dt + \sigma(t, X_t, u_t)\, dW_t \]

For simplicity we assume that

• X is scalar.
• The adapted control u_t is scalar with no restrictions.

We denote this problem by P. We restrict ourselves to feedback controls of the form u_t = u(t, X_t).
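The controlled SDE above can be simulated with an Euler–Maruyama scheme. A minimal sketch, where the linear coefficients, the parameter values r, alpha, sig, and the constant feedback control are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def simulate(mu, sigma, u, x0, T=1.0, n_steps=200, n_paths=10_000, seed=0):
    """Euler-Maruyama simulation of dX = mu(t,X,u)dt + sigma(t,X,u)dW
    under a feedback control u(t, x)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))
    for k in range(n_steps):
        t = k * dt
        a = u(t, x)                                  # feedback control u(t, X_t)
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)   # Brownian increments
        x = x + mu(t, x, a) * dt + sigma(t, x, a) * dw
    return x

# illustrative linear wealth dynamics (an assumption, not from the slides)
r, alpha, sig = 0.02, 0.06, 0.2
mu = lambda t, x, a: r * x + (alpha - r) * a
sigma = lambda t, x, a: sig * a
xT = simulate(mu, sigma, u=lambda t, x: np.ones_like(x), x0=1.0)
print(xT.mean())
```

With the constant control u ≡ 1, E[X_T] solves m' = rm + (α − r), so the sample mean should be close to e^{rT} + (α − r)(e^{rT} − 1)/r ≈ 1.0606 up to Monte Carlo error.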
Dynamic Programming

We embed the problem P in a family of problems P_{tx}:

\[ \mathcal{P}_{tx}: \quad \max_u \; E_{t,x}\left[ \int_t^T h(s, X_s, u_s)\, ds + F(X_T) \right] \]
\[ dX_s = \mu(s, X_s, u_s)\, ds + \sigma(s, X_s, u_s)\, dW_s, \qquad X_t = x \]

The original problem corresponds to P_{0, x_0}.
Bellman

We now have the Bellman optimality principle, which says that the family {P_{t,x}; t ≥ 0, x ∈ R} is time consistent. More precisely: if û is optimal on the time interval [t, T], then it is also optimal on the sub-interval [s, T] for every s with t ≤ s ≤ T.

We can easily derive the Hamilton-Jacobi-Bellman equation:

\[ \text{HJB:} \quad V_t(t,x) + \sup_u \left\{ h(t,x,u) + \mu(t,x,u)\, V_x(t,x) + \tfrac{1}{2}\sigma^2(t,x,u)\, V_{xx}(t,x) \right\} = 0, \]
\[ V(T,x) = F(x). \]
Three Disturbing Examples

Hyperbolic discounting (Ekeland-Lazrak-Pirvu):
\[ \max_u \; E_{t,x}\left[ \int_t^T \varphi(s-t)\, h(c_s)\, ds + \varphi(T-t)\, F(X_T) \right] \]

Mean variance utility (Basak-Chabakauri):
\[ \max_u \; E_{t,x}[X_T] - \frac{\gamma}{2}\, \mathrm{Var}_{t,x}(X_T) \]

Endogenous habit formation:
\[ \max_u \; E_{t,x}\left[ \ln\!\left( \frac{X_T}{x - \beta} \right) \right] \]
\[ dX_t = [r X_t + (\alpha - r) u_t]\, dt + \sigma u_t\, dW_t \]
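The mean-variance case fails because the variance operator lacks the tower property that underlies Bellman's principle: in general Var_t(X_T) ≠ E_t[Var_s(X_T)] for t < s. A two-period coin-flip sketch (a hypothetical toy chain, not from the slides) makes this concrete:

```python
import itertools
from statistics import mean, pvariance

# Two-period ±1 random walk (toy chain): X_2 = e1 + e2, e_i = ±1 with prob 1/2.
outcomes = [e1 + e2 for e1, e2 in itertools.product([1, -1], repeat=2)]
var0 = pvariance(outcomes)      # Var_0(X_2) over the four equally likely paths

# Var_1(X_2) is computed on each time-1 branch (e1 known), then averaged over e1
e_var1 = mean(pvariance([e1 + e2 for e2 in (1, -1)]) for e1 in (1, -1))
print(var0, e_var1)             # 2.0 vs 1.0: Var_0(X_2) != E_0[Var_1(X_2)]
```

By the law of total variance, Var_0(X_2) = E_0[Var_1(X_2)] + Var_0(E_1[X_2]) = 1 + 1 = 2; the variance-of-the-conditional-mean term is exactly what the backward recursion of DynP cannot see.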
Moral

• These types of problems are not time consistent.
• We cannot use DynP.
• In fact, in these cases it is unclear what we mean by "optimality".

Possible ways out:

• Easy way: Dismiss the problem as being silly.
• Pre-commitment: Solve (somehow) the problem P_{0,x_0} and ignore the fact that later on, your "optimal" control will no longer be viewed as optimal.
• Game theory: Take the time inconsistency seriously. View the problem as a game and look for a Nash equilibrium point.

We use the game theoretic approach.
Our Basic Problem

\[ \max_u \; E_{t,x}\left[ F(x, X_T) \right] + G\left( x, E_{t,x}[X_T] \right) \]
\[ dX_s = \mu(X_s, u_s)\, ds + \sigma(X_s, u_s)\, dW_s, \qquad X_t = x \]

This can be extended considerably. For simplicity we will consider the easier problem

\[ \max_u \; E_{t,x}\left[ F(X_T) \right] + G\left( E_{t,x}[X_T] \right) \]
The Game Theoretic Approach

• This is a bit delicate to formalize in continuous time.
• Thus we turn to discrete time, and then go to the limit.
Discrete Time

Given: a controlled Markov process {X_n : n = 0, 1, ..., T}. At any time n we can change the transition probabilities for X_n → X_{n+1} by choosing a control value u ∈ R.

Players:

• For each point in time n there is a player, "player No n" or "P_n".
• P_n chooses the feedback control law u_n(X_n).
• A sequence of control laws u_0, ..., u_{T−1} is denoted by u.
• Given a sequence u of control laws, the value function for P_n is defined by
\[ J_n(x, \mathbf{u}) = E_{n,x}\left[ F(X_T^{\mathbf{u}}) \right] + G\left( E_{n,x}[X_T^{\mathbf{u}}] \right) \]
Subgame Perfect Nash Equilibrium

The value function for P_n was defined by
\[ J_n(x, \mathbf{u}) = E_{n,x}\left[ F(X_T^{\mathbf{u}}) \right] + G\left( E_{n,x}[X_T^{\mathbf{u}}] \right) \]
We see that J_n(x, u) depends on (n, x) and u_n, u_{n+1}, ..., u_{T−1}.

Definition:

• The control law û is an equilibrium strategy if the following holds for each fixed n:
  – Assume that P_k uses û_k(·) for k = n+1, ..., T−1.
  – Then it is optimal for player No n to use û_n(·).
• The equilibrium value function is defined by
\[ V_n(x) = J_n(x, \hat{\mathbf{u}}) \]
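The definition is constructive: the equilibrium strategy can be computed by backward induction, player T−1 first. A sketch for a toy additive model — the dynamics X_{n+1} = X_n + u_n(m + ε_n) with ε_n = ±1, and all parameter values, are assumptions for illustration only. With F(x) = x − (γ/2)x² and G(y) = (γ/2)y², player n's objective is J_n = E_n[X_T] − (γ/2)Var_n(X_T):

```python
# Toy additive model (an assumption for illustration, not from the slides):
# X_{n+1} = X_n + u_n * (m + eps_n), eps_n = ±1 with prob 1/2.
# With F(x) = x - (gamma/2) x^2 and G(y) = (gamma/2) y^2 the objective of
# player n is J_n = E_n[X_T] - (gamma/2) Var_n(X_T).
m, gamma, N = 0.5, 2.0, 4
u_grid = [k / 100 for k in range(201)]       # candidate control values in [0, 2]

u_hat = [0.0] * N
for n in reversed(range(N)):                 # backward induction: player N-1 first
    # increments are state-independent, so mean and variance of X_T - X_n
    # decompose across periods, given the future equilibrium controls
    fut_mean = sum(u_hat[k] * m for k in range(n + 1, N))
    fut_var = sum(u_hat[k] ** 2 for k in range(n + 1, N))
    u_hat[n] = max(u_grid, key=lambda u: u * m + fut_mean
                                         - gamma / 2 * (u ** 2 + fut_var))
print(u_hat)  # every player chooses u = m / gamma = 0.25
```

Because the increments are independent of the state, each player's problem separates and û_n ≡ m/γ for all n; for state-dependent dynamics the same loop would run over a grid of x-values, storing û_n(x) and f_n(x) as functions.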
The Infinitesimal Operator

Let {f_n(·)}_{n=0}^{T} be a sequence of real valued functions.

Def: For a fixed control value u ∈ R, the infinitesimal operator A^u is defined by
\[ (\mathbf{A}^u f)_n(x) = E\left[ f_{n+1}(X_{n+1}) - f_n(x) \mid X_n = x,\ u_n = u \right] \]

Def: For a fixed control law u, the operator A^{\mathbf{u}} is defined by
\[ (\mathbf{A}^{\mathbf{u}} f)_n(x) = E\left[ f_{n+1}(X_{n+1}) - f_n(x) \mid X_n = x,\ u_n = \mathbf{u}_n(x) \right] \]
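The discrete operator is just a one-step conditional expectation, so it can be estimated directly from simulated transitions. A sketch, where the transition kernel and all numerical values are illustrative assumptions:

```python
import numpy as np

def A_u(f_next, f_now, step, x, u, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the discrete generator
    (A^u f)_n(x) = E[ f_{n+1}(X_{n+1}) - f_n(x) | X_n = x, u_n = u ]."""
    rng = np.random.default_rng(seed)
    x_next = step(x, u, rng.normal(size=n_samples))  # one transition of the chain
    return f_next(x_next).mean() - f_now(x)

# assumed toy transition: X_{n+1} = x + u + eps, eps ~ N(0, 1)
step = lambda x, u, eps: x + u + eps
# for f_{n+1}(y) = y and f_n(x) = x the generator is just the drift u
est = A_u(lambda y: y, lambda x: x, step, x=1.0, u=0.3)
print(est)
```

Here the estimate should be close to u = 0.3 up to Monte Carlo error, since E[X_{n+1} − x | X_n = x] = u for this kernel.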
Important Idea

It turns out that a fundamental role is played by the function sequence f_n defined by
\[ f_n(x) = E_{n,x}\left[ X_T^{\hat{\mathbf{u}}} \right] \]
where û is the equilibrium strategy. The process f_n(X_n) is of course a martingale under the equilibrium control û, so we have
\[ \mathbf{A}^{\hat{\mathbf{u}}} f_n(x) = 0, \qquad f_T(x) = x. \]
Extending HJB

Proposition: The equilibrium value function V_n(x) and the function f_n(x) satisfy the system
\[ \sup_u \left\{ (\mathbf{A}^u V)_n(x) - \left(\mathbf{A}^u (G \circ f)\right)_n(x) + (\mathbf{H}^u f)_n(x) \right\} = 0, \qquad V_T(x) = F(x) + G(x), \]
\[ \mathbf{A}^{\hat{u}} f_n(x) = 0, \qquad f_T(x) = x, \]
where
\[ (\mathbf{H}^u f)_n(x) = G\left( E_{n,x}\left[ f_{n+1}(X_{n+1}^u) \right] \right) - G\left( f_n(x) \right), \qquad f_n(x) = E_{n,x}\left[ X_T^{\hat{u}} \right]. \]
Continuous Time

The discrete time results extend immediately to continuous time.

• Now X is a controlled continuous time Markov process with controlled infinitesimal generator
\[ \mathbf{A}^u g(t, x) = \lim_{h \to 0} \frac{1}{h} \left\{ E_{t,x}\left[ g(t+h, X_{t+h}^u) \right] - g(t, x) \right\} \]
• The extended HJB is now an equation with time step [t, t+h].
• Divide the discrete time HJB equations by h and let h → 0.
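The limit can be checked numerically: for a diffusion and a smooth, time-independent g, the difference quotient above approaches μg_x + ½σ²g_xx by Itô's formula. A sketch with a toy diffusion whose coefficients are illustrative assumptions:

```python
import numpy as np

def generator(g, mu, sig, x, h=0.01, n=1_000_000, seed=0):
    """Finite-h Monte Carlo estimate of the generator:
    A g(x) ~ (E[g(X_{t+h})] - g(x)) / h  for a time-independent g
    and a one-step Euler transition of dX = mu dt + sig dW."""
    rng = np.random.default_rng(seed)
    x_h = x + mu * h + sig * np.sqrt(h) * rng.normal(size=n)
    return (g(x_h).mean() - g(x)) / h

# toy check with g(x) = x^2:  A g = 2*mu*x + sig^2 by Ito's formula
est = generator(lambda y: y ** 2, mu=0.1, sig=0.3, x=1.0)
print(est)  # close to 2*0.1*1 + 0.3**2 = 0.29
```

The estimate carries an O(h) discretization bias plus Monte Carlo noise, both small for these parameter values.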
Extended HJB, Continuous Time

Conjecture: The equilibrium value function satisfies the system
\[ \sup_u \left\{ \mathbf{A}^u V(t,x) - \mathbf{A}^u (G \circ f)(t,x) + G'\!\left(f(t,x)\right) \cdot \mathbf{A}^u f(t,x) \right\} = 0, \qquad V(T,x) = F(x) + G(x), \]
\[ \mathbf{A}^{\hat{u}} f(t,x) = 0, \qquad f(T,x) = x. \]

Note the fixed point character of the extended HJB.
General Problem

\[ \max_u \; E_{t,x}\left[ \int_t^T C(t, x, X_s, u_s)\, ds + F(t, x, X_T) \right] + G\left( t, x, E_{t,x}[X_T] \right) \]
The general case

\[ \sup_{u \in \mathcal{U}} \Big\{ \mathbf{A}^u V(t,x) + C(x,x,u) - \int_t^T (\mathbf{A}^u c^s)_t(x,x)\, ds + \int_t^T (\mathbf{A}^u c^{s,x})_t(x)\, ds \]
\[ \qquad\qquad - \mathbf{A}^u f(t,x,x) + \mathbf{A}^u f^x(t,x) - \mathbf{A}^u (G \diamond g)(t,x) + \mathbf{H}^u g(t,x) \Big\} = 0, \]
\[ \mathbf{A}^{\hat{u}} f^y(t,x) = 0, \qquad \mathbf{A}^{\hat{u}} g(t,x) = 0, \qquad (\mathbf{A}^{\hat{u}} c^{s,y})_t(x) = 0, \quad 0 \le t \le s, \]
\[ V(T,x) = F(x,x) + G(x,x), \qquad c_s^{s,y}(x) = C(x, y, \hat{u}_s(x)), \]
\[ f(T,x,y) = F(y,x), \qquad g(T,x) = x. \]
Optimal for what?

• In continuous time, it is not immediately clear how to define an equilibrium strategy.
• We follow Ekeland et al. and define the equilibrium using spike variations.
HJB as a Necessary Condition

Conjecture: Assume that there exists an equilibrium control û, and define V and f as above. Then V and f satisfy the extended HJB system.

Note: It is probably very hard to prove this, due to technical problems. We do, however, have a converse result.
Verification Theorem

Theorem: Assume that V, f and û satisfy the extended HJB system. Then V is the equilibrium value function and û is the equilibrium control.

Proof: Not very hard, but a bit harder than for standard DynP.
A Useful Lemma

Consider a functional
\[ J(t,x,u) = E_{t,x}\left[ F(x, X_T^u) \right] + G\left( x, E_{t,x}[X_T^u] \right), \]
and denote the equilibrium control and value function by û and V respectively. Let ϕ(x) be a given deterministic real valued function and consider the functional
\[ J^{\varphi}(t,x,u) = \varphi(x) \left\{ E_{t,x}\left[ F(x, X_T^u) \right] + G\left( x, E_{t,x}[X_T^u] \right) \right\}. \]
Denoting the corresponding equilibrium control and value function by û^ϕ and V^ϕ respectively, we have
\[ \hat{u}^{\varphi}(t,x) = \hat{u}(t,x), \qquad V^{\varphi}(t,x) = \varphi(x)\, V(t,x). \]
Practical handling of the theory

• Make a parameterized Ansatz for V.
• Make a parameterized Ansatz for f.
• Plug everything into the extended HJB system and hope to obtain a system of ODEs for the parameters in the Ansatz.
• Alternatively, compute Lie symmetry groups.
Basak's Example (in a simple version)

\[ dS_t = \alpha S_t\, dt + \sigma S_t\, dW_t, \qquad dB_t = r B_t\, dt \]

X_t = portfolio value process, u_t = amount of money invested in the risky asset.

Problem:
\[ \max_u \; E_{t,x}[X_T] - \frac{\gamma}{2}\, \mathrm{Var}_{t,x}(X_T) \]
\[ dX_t = [r X_t + (\alpha - r) u_t]\, dt + \sigma u_t\, dW_t \]

This corresponds to our standard problem with
\[ F(x) = x - \frac{\gamma}{2} x^2, \qquad G(x) = \frac{\gamma}{2} x^2. \]
Extended HJB

\[ V_t + \sup_u \left\{ \left[ r x + (\alpha - r) u \right] V_x + \frac{1}{2} \sigma^2 u^2 V_{xx} - \frac{\gamma}{2} \sigma^2 u^2 f_x^2 \right\} = 0, \qquad V(T,x) = x, \]
\[ \mathbf{A}^{\hat{u}} f = 0, \qquad f(T,x) = x. \]

Ansatz:
\[ V(t,x) = g(t)\, x + h(t), \qquad f(t,x) = A(t)\, x + B(t). \]
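Plugging the Ansatz in gives the ODEs A' + rA = 0 and g' + rg = 0 with terminal value 1, so A(t) = g(t) = e^{r(T−t)}, and the first-order condition in u yields the state-independent equilibrium control û(t) = (α − r)/(γσ²) e^{−r(T−t)}, with f(t,x) = e^{r(T−t)} x + (α − r)²(T − t)/(γσ²). A Monte Carlo sketch checking the fixed-point condition f(0, x₀) = E[X_T] under û (all parameter values are illustrative assumptions):

```python
import numpy as np

# Closed-form equilibrium control for the mean-variance example:
# u_hat(t) = (alpha - r) / (gamma * sigma^2) * exp(-r (T - t))
r, alpha, sig, gamma, T, x0 = 0.02, 0.06, 0.2, 1.0, 1.0, 1.0
u_hat = lambda t: (alpha - r) / (gamma * sig ** 2) * np.exp(-r * (T - t))

# Euler scheme for dX = [rX + (alpha - r) u_hat(t)] dt + sig u_hat(t) dW
rng = np.random.default_rng(0)
n_steps, n_paths = 200, 100_000
dt = T / n_steps
x = np.full(n_paths, x0)
for k in range(n_steps):
    u = u_hat(k * dt)
    x += (r * x + (alpha - r) * u) * dt + sig * u * np.sqrt(dt) * rng.normal(size=n_paths)

# f(0, x0) from the Ansatz: A(0) x0 + B(0)
f0_theory = np.exp(r * T) * x0 + (alpha - r) ** 2 * T / (gamma * sig ** 2)
print(x.mean(), f0_theory)   # should agree up to Monte Carlo error
```

Note that û does not depend on x, so the agent invests a deterministic amount of money at each date; this is the well-known feature of the equilibrium solution in this example.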