deterministic mean field games

DETERMINISTIC MEAN FIELD GAMES Italo Capuzzo Dolcetta Sapienza - PowerPoint PPT Presentation



  1. DETERMINISTIC MEAN FIELD GAMES. Italo Capuzzo Dolcetta, Sapienza Università di Roma and GNAMPA - Istituto di Alta Matematica

  2. A classical optimization problem

     Given a time interval $[0,T]$, consider the classical Mayer type problem

     $$\inf \int_t^T \Big[ \tfrac12 |\dot X_s|^2 + L(X_s) \Big]\, ds + G(X_T) \tag{1}$$

     where $X := X^{t,x}$ is any curve in the Sobolev space $W^{1,2}([t,T];\mathbb{R}^d)$ such that $X_t = x \in \mathbb{R}^d$, for $t \in [0,T]$. It is well known that if $L : \mathbb{R}^d \to \mathbb{R}$ and $G : \mathbb{R}^d \to \mathbb{R}$ are continuous and bounded, then the value function of problem (1) above, i.e.

     $$u(t,x) = \inf \Big\{ \int_t^T \Big[ \tfrac12 |\dot X_s|^2 + L(X_s) \Big]\, ds + G(X_T) \;;\; X \in W^{1,2}([t,T];\mathbb{R}^d),\ X_t = x \Big\},$$

     is the unique bounded continuous viscosity solution of

  3. the backward Cauchy problem

     $$\begin{cases} -\partial_t u(t,x) + \tfrac12 |\nabla_x u(t,x)|^2 = L(x) & \text{in } (0,T)\times\mathbb{R}^d, \\ u(T,x) = G(x) & \text{in } \mathbb{R}^d \end{cases} \tag{2}$$

     of Hamilton-Jacobi (HJ) type. The proof that $u$ solves (2) in the viscosity sense is a simple consequence of the following identity, the Dynamic Programming Principle:

     $$u(t,x) = \inf \Big\{ \int_t^s \Big[ \tfrac12 |\dot X_\tau|^2 + L(X_\tau) \Big]\, d\tau + u(s, X^{t,x}(s)) \;;\; X \in W^{1,2}([t,T];\mathbb{R}^d),\ X_t = x \Big\},$$

     valid for any given $(t,x) \in (0,T)\times\mathbb{R}^d$ and any $s \in [t,T]$. Uniqueness of the solution is a nontrivial, fundamental result in viscosity solutions theory (Lions 1982).
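
A small numerical sanity check of (2) may help here. It is only a sketch under assumptions that are not in the slides: dimension one, $L \equiv 0$ (so that optimal curves are straight lines and the value function reduces to the Hopf-Lax formula $u(t,x) = \min_y \{ G(y) + |x-y|^2 / (2(T-t)) \}$), and the terminal cost $G(y) = y^2/2$, which is unbounded and therefore outside the boundedness hypothesis, but gives the closed form $u(t,x) = x^2 / (2(1+T-t))$. All function names and grid parameters are illustrative.

```python
def G(y):
    """Assumed terminal cost (illustrative choice, not from the slides)."""
    return 0.5 * y * y

def hopf_lax(t, x, T=1.0, y_min=-5.0, y_max=5.0, n=20001):
    """Brute-force min over endpoints y of G(y) + |x - y|^2 / (2 (T - t))."""
    tau = T - t
    if tau <= 0.0:
        return G(x)
    step = (y_max - y_min) / (n - 1)
    best = float("inf")
    for i in range(n):
        y = y_min + i * step
        best = min(best, G(y) + (x - y) ** 2 / (2.0 * tau))
    return best

def u_exact(t, x, T=1.0):
    """Closed-form value function for L = 0 and G(y) = y^2 / 2."""
    return x * x / (2.0 * (1.0 + T - t))

def hj_residual(t, x, T=1.0, h=1e-4):
    """Central-difference residual of -u_t + 0.5 * u_x^2 - L, with L = 0."""
    u_t = (u_exact(t + h, x, T) - u_exact(t - h, x, T)) / (2.0 * h)
    u_x = (u_exact(t, x + h, T) - u_exact(t, x - h, T)) / (2.0 * h)
    return -u_t + 0.5 * u_x ** 2

gap = abs(hopf_lax(0.3, 1.2) - u_exact(0.3, 1.2))
residual = abs(hj_residual(0.3, 1.2))
```

Both `gap` and `residual` should be close to zero: the first compares the minimization over curves with the closed form, the second checks that the closed form satisfies the HJ equation in (2).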

  4. As for optimal curves, it is easy to check that $X^{t,x}$ is optimal for the initial setting $(t,x)$ if and only if

     $$u(t,x) = u(s, X^{t,x}(s)) + \int_t^s \Big[ \tfrac12 |\dot X^{t,x}(\tau)|^2 + L(X^{t,x}(\tau)) \Big]\, d\tau \quad \text{for all } s \in [t,T].$$

     Moreover, if $u$ is smooth enough, the velocity field of the optimal paths is, up to sign, the spatial gradient of the solution of the HJ equation. More precisely,

  5. A Verification Lemma

     Lemma. Let $X^*$ be such that

     $$\dot X^*(s) = -\nabla_x u(s, X^*(s)) \ \text{ for } s \in [t,T], \qquad X^*(t) = x.$$

     Then,

     $$\int_t^T \Big[ \tfrac12 |\dot X^*(s)|^2 + L(X^*(s)) \Big]\, ds + G(X^*(T)) = \inf \Big\{ \int_t^T \Big[ \tfrac12 |\dot X_s|^2 + L(X_s) \Big]\, ds + G(X_T) \Big\}.$$

  6. The verification result above requires $u$ to be $C^1$ with respect to $x$. This turns out to be true in the present model problem under $C^2$ smoothness assumptions on $L$, $G$. The proof of $C^1$ regularity of $u$ is in 3 steps:

     step 1: $u$ is globally Lipschitz w.r.t. $(t,x)$

     step 2: $u$ is semiconcave w.r.t. $x$, i.e. $x \mapsto u(t,x) - \tfrac12 C_t |x|^2$ is concave for some positive constant $C_t$

     step 3: the upper semidifferential

     $$D^+_x u(t,x) = \Big\{ p \in \mathbb{R}^d : \limsup_{y \to x} \frac{u(t,y) - u(t,x) - p\cdot(y-x)}{|y-x|} \le 0 \Big\}$$

     is a singleton at each $(t,x)$

     An alternative way to obtain optimal feedbacks for general control problems when no smoothness is available is via semi-discretization (comments on this issue later on).
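
To make step 3 concrete, here is a rough one-dimensional sketch that tests membership in the upper semidifferential by sampling the difference quotient at a few small radii. It is a numerical heuristic, not a proof, and the two test functions are assumptions chosen for illustration: $-|x|$ is semiconcave with a kink at $0$, where $D^+$ is the whole interval $[-1,1]$, while $-x^2$ is $C^1$, so $D^+$ is the singleton $\{-2x\}$.

```python
def in_upper_semidifferential(u, x, p, radii=(1e-3, 1e-4, 1e-5, 1e-6), tol=1e-2):
    """Heuristic test of p in D^+ u(x) in dimension one.

    Approximates the limsup of (u(y) - u(x) - p (y - x)) / |y - x| as y -> x
    by the largest quotient over y = x +/- r for a few small radii r, and
    accepts p when that value does not exceed the tolerance tol.
    """
    worst = max(
        (u(x + sign * r) - u(x) - p * sign * r) / r
        for r in radii
        for sign in (-1.0, 1.0)
    )
    return worst <= tol

def kink(x):
    return -abs(x)   # semiconcave, not differentiable at 0

def smooth(x):
    return -x * x    # C^1, derivative -2 x

kink_has_half = in_upper_semidifferential(kink, 0.0, 0.5)        # inside [-1, 1]
kink_has_big = in_upper_semidifferential(kink, 0.0, 1.5)         # outside [-1, 1]
smooth_has_grad = in_upper_semidifferential(smooth, 1.0, -2.0)   # the gradient
smooth_has_other = in_upper_semidifferential(smooth, 1.0, -1.5)  # not the gradient
```

At the kink several slopes pass the test, so no single feedback $-\nabla_x u$ is defined there; for the smooth function only the gradient passes, which is the situation the verification lemma needs.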

  7. Proof of Verification Lemma: for a generic curve $X$ with $X_t = x$,

     $$\begin{aligned}
     u(T, X_T) &= u(t, X_t) + \int_t^T \Big[ \partial_s u(s, X_s) + \dot X_s \cdot \nabla_x u(s, X_s) \Big]\, ds \\
     &= u(t, X_t) + \int_t^T \Big[ \tfrac12 |\nabla_x u(s, X_s)|^2 + \dot X_s \cdot \nabla_x u(s, X_s) - L(X_s) \Big]\, ds && \text{[by HJ]} \\
     &\ge u(t, X_t) + \int_t^T \Big[ -\tfrac12 |\dot X_s|^2 - L(X_s) \Big]\, ds && \text{[by convexity of } p \mapsto \tfrac12 |p|^2 \text{]}
     \end{aligned}$$

  8. Since $u(T, X_T) = G(X_T)$ and $u(t, X_t) = u(t,x)$, the above yields

     $$G(X_T) + \int_t^T \Big[ \tfrac12 |\dot X_s|^2 + L(X_s) \Big]\, ds \ge u(t,x).$$

     The same computation with the generic curve $X$ replaced by $X^*$ given by

     $$\dot X^*(s) = -\nabla_x u(s, X^*(s)) \ \text{ for } s \in [t,T], \qquad X^*(t) = x$$

     gives equality in the last step, so that

     $$u(t,x) = \inf \Big\{ \int_t^T \Big[ \tfrac12 |\dot X_s|^2 + L(X_s) \Big]\, ds + G(X_T) \Big\}.$$
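
The verification argument can be tried numerically on a toy case. The sketch below assumes, purely for illustration, dimension one, $L \equiv 0$ and $G(x) = x^2/2$, for which $u(t,x) = x^2/(2(1+T-t))$ solves the HJ problem (2); none of these choices come from the slides. It integrates $\dot X = -\nabla_x u(s, X)$ with explicit Euler steps, accumulates the cost, and compares it with $u(t,x)$ and with the cost of two other admissible curves.

```python
def grad_u(s, x, T=1.0):
    """Spatial gradient of the assumed solution u(s,x) = x^2 / (2 (1 + T - s))."""
    return x / (1.0 + T - s)

def cost_along(velocity, t, x, T=1.0, n=1000):
    """Explicit Euler for dX/ds = velocity(s, X), X(t) = x.

    Accumulates the integral of 0.5 * |dX/ds|^2 (the running cost with L = 0)
    and adds the assumed terminal cost G(X_T) = X_T^2 / 2.
    """
    h = (T - t) / n
    s, X, running = t, x, 0.0
    for _ in range(n):
        v = velocity(s, X)
        running += 0.5 * v * v * h
        X += v * h
        s += h
    return running + 0.5 * X * X

t, x, T = 0.0, 1.0, 1.0
u_tx = x * x / (2.0 * (1.0 + T - t))                                # value function at (t, x)
optimal = cost_along(lambda s, X: -grad_u(s, X, T), t, x, T)        # gradient feedback
resting = cost_along(lambda s, X: 0.0, t, x, T)                     # never move
rushing = cost_along(lambda s, X: -x / (T - t), t, x, T)            # reach 0 at time T
```

The feedback curve should reproduce `u_tx` up to rounding, while the two competitors pay strictly more, in line with the inequality and the equality case of the proof.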

  9. A deterministic mean field game problem

     An interesting new class of optimal control problems has recently become an object of interest after the 2006/07 papers by Lasry and Lions (see also P.-L. Lions, Cours au Collège de France, www.college-de-france.fr, for more recent developments). Related ideas have been developed independently in the engineering literature, and at about the same time, by Huang, Caines and Malhamé.

     Assume that the running cost $L(X_s)$ depends also on an exogenous variable $m(s, X_s)$ modeling the density of population of the other agents at state $X_s$ at time $s$.

  10. The new cost criterion is then

      $$\inf \int_t^T \Big[ \tfrac12 |\dot X_s|^2 + L(X_s, m(s, X_s)) \Big]\, ds + G(X_T, m(T, X_T)) \tag{3}$$

      Here, $m$ is a non-negative function valued in $[0,1]$ such that $\int_{\mathbb{R}^d} m(s,x)\, dx = 1$ for all $s$. The time evolution of $m$ starting from an initial configuration $m(0,x)$ is governed by the continuity equation

      $$\partial_t m(t,x) - \operatorname{div}\big( m(t,x)\, \nabla_x u(t,x) \big) = 0 \quad \text{in } (0,T)\times\mathbb{R}^d.$$

      Note that in the cost criterion the evolution of the measure $m$ enters as a parameter. The value function of the agent is then given by

      $$u(t,x) = \inf \Big\{ \int_t^T \Big[ \tfrac12 |\dot X_s|^2 + L(X_s, m(s, X_s)) \Big]\, ds + G(X_T, m(T, X_T)) \Big\} \tag{4}$$

  11. The agent's optimal control is, at least heuristically, given in feedback form by $\alpha^*(t,x) = -\nabla_x u(t,x)$. Now, if all agents argue in this way, their distribution will move with a velocity given by the drift term $-\nabla_x u(t,x)$. This leads eventually to the continuity equation.
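
As a hedged illustration of how a density is carried by the drift $-\nabla_x u$, the sketch below solves the one-dimensional continuity equation $\partial_t m + \partial_x (m v) = 0$ with $v = -\partial_x u$ using a conservative upwind scheme. The value function $u(t,x) = x^2/(2(1+T-t))$, the initial bump, the grid and the time step are all assumptions made for the example; in particular $u$ is fixed in advance rather than coupled to $m$.

```python
import math

def velocity(t, x, T=1.0):
    """Feedback drift -du/dx for the assumed u(t,x) = x^2 / (2 (1 + T - t))."""
    return -x / (1.0 + T - t)

def transport(m, xs, dx, T=1.0, dt=0.005):
    """Conservative upwind scheme for dm/dt + d(m v)/dx = 0.

    The flux through the two ends of the grid is kept at zero, so the
    discrete total mass sum(m) * dx is preserved up to rounding.
    """
    n = len(m)
    t = 0.0
    for _ in range(int(round(T / dt))):
        flux = [0.0] * (n + 1)  # flux[i] lives on the left face of cell i
        for i in range(1, n):
            v = velocity(t, 0.5 * (xs[i - 1] + xs[i]), T)
            flux[i] = v * (m[i - 1] if v > 0 else m[i])  # value from the upwind cell
        m = [m[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n)]
        t += dt
    return m

def mass(m, dx):
    return sum(m) * dx

def second_moment(m, xs, dx):
    return sum(x * x * v for x, v in zip(xs, m)) * dx

n = 200
dx = 6.0 / n
xs = [-3.0 + (i + 0.5) * dx for i in range(n)]          # cell centers on [-3, 3]
bump = [math.exp(-((x - 1.0) ** 2) / 0.5) for x in xs]  # bump centered at x = 1
m0 = [b / (sum(bump) * dx) for b in bump]               # normalized to unit mass
mT = transport(m0, xs, dx)
```

With these choices the drift points toward the origin, so the mass stays equal to one while the second moment of `mT` is smaller than that of `m0`. The time step satisfies the usual CFL restriction $|v|\,dt/dx \le 1$ on this grid, which keeps the scheme non-negative.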

  12. We are therefore led to consider the following system of nonlinear evolution PDEs for the unknown functions $u = u(t,x)$, $m = m(t,x)$:

      $$-\frac{\partial u}{\partial t} + \tfrac12 |\nabla u|^2 = L(x, m) \quad \text{in } (0,T)\times\mathbb{R}^d \tag{5}$$

      $$\frac{\partial m}{\partial t} - \operatorname{div}(m \nabla u) = 0 \quad \text{in } (0,T)\times\mathbb{R}^d \tag{6}$$

      with the initial and terminal conditions

      $$m(0,x) = m_0(x), \qquad u(T,x) = G(x, m(T,x)) \quad \text{in } \mathbb{R}^d \tag{7}$$

  13. Three crucial structural features: (i) the first equation is backward, the second one forward in time; (ii) the operator in the continuity equation is the adjoint of the linearization at $u$ of the HJ operator in the first equation; (iii) the nonlinearity in the HJ equation is convex with respect to $|\nabla u|$.

  14. The planning problem

      An interesting variant of the MFG system was proposed by Lions for modeling the presence of a regulator prescribing a target density to be reached at the final time:

      $$-\frac{\partial u}{\partial t} + \tfrac12 |\nabla u|^2 = L(x, m) \quad \text{in } (0,T)\times\mathbb{R}^d$$

      $$\frac{\partial m}{\partial t} - \operatorname{div}(m \nabla u) = 0 \quad \text{in } (0,T)\times\mathbb{R}^d$$

      with the initial and terminal conditions

      $$m(0,x) = m_0(x) \ge 0, \qquad m(T,x) = m_T(x) \quad \text{in } \mathbb{R}^d.$$

      No side conditions on $u$. For $L \equiv 0$, the above is the equivalent formulation of the Monge-Kantorovich optimal mass transport problem considered by Benamou-Brenier (2000), see also Achdou-Camilli-CD, SIAM J. Control Optim. (2011).
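
For the case $L \equiv 0$, the link with optimal transport can be illustrated in a discrete one-dimensional setting. The sketch below is an assumption-laden toy, not the method of the cited papers: both densities are replaced by clouds of equally weighted points, and it uses the standard fact that in one dimension the quadratic-cost optimal plan matches the points in sorted order. Moving each point along a straight line at constant speed then gives a kinetic energy $\int_0^T \!\int \tfrac12 |v|^2\, dm\, dt$ equal to $W_2^2 / (2T)$. All names are illustrative.

```python
def w2_squared(xs, ys):
    """Squared Wasserstein-2 distance between two equal-size point clouds
    on the line, obtained by pairing the sorted samples."""
    if len(xs) != len(ys):
        raise ValueError("both clouds must contain the same number of points")
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def interpolate(xs, ys, s, T=1.0):
    """Positions at time s in [0, T] when each sorted start point travels
    to its sorted target along a straight line at constant speed."""
    return [a + (s / T) * (b - a) for a, b in zip(sorted(xs), sorted(ys))]

def kinetic_energy(xs, ys, T=1.0):
    """Time integral over [0, T] of the average of 0.5 * |velocity|^2.
    Velocities are constant in time, so the integral is T times the average."""
    speeds_sq = [((b - a) / T) ** 2 for a, b in zip(sorted(xs), sorted(ys))]
    return T * 0.5 * sum(speeds_sq) / len(speeds_sq)

start = [0.0, 1.0, 2.0]   # samples standing in for m_0
target = [3.5, 2.5, 1.5]  # samples standing in for m_T
T = 2.0
distance_sq = w2_squared(start, target)
energy = kinetic_energy(start, target, T)
midway = interpolate(start, target, T / 2, T)
```

Here `energy` coincides with `distance_sq / (2 * T)`, and `interpolate` returns the initial cloud at `s = 0` and the target cloud at `s = T`, which mirrors the two density constraints of the planning problem.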

  15. Stochastic mean field game models

      Consider the following system (MFG) of evolution PDEs:

      $$-\frac{\partial u}{\partial t} - \nu \Delta u + \tfrac12 |\nabla u|^2 = L(x, m) \quad \text{in } (0,T)\times\mathbb{R}^d \tag{8}$$

      $$\frac{\partial m}{\partial t} - \nu \Delta m - \operatorname{div}(m \nabla u) = 0 \quad \text{in } (0,T)\times\mathbb{R}^d \tag{9}$$

      with the initial and terminal conditions

      $$m(0,x) = m_0(x), \qquad u(T,x) = G(x, m(T,x)) \quad \text{in } \mathbb{R}^d \tag{10}$$

      Here $\nu$ is a positive number. The first equation is a backward HJB equation, the second one a forward FP equation.

  16. The heuristic interpretation of this system is as follows. Fix a solution of (MFG): the classical dynamic programming approach to optimal control suggests that the solution $u$ of (HJB) is the value function of an agent controlling the stochastic differential equation

      $$dX_t = \alpha_t\, dt + \sqrt{2\nu}\, dB_t, \qquad X_0 = x,$$

      where $B_t$ is a standard Brownian motion, i.e.

      $$X_t = x + \int_0^t \alpha_s\, ds + \sqrt{2\nu}\, B_t.$$

      The agent aims at minimizing the integral cost

      $$J(x, \alpha) := \mathbb{E}_x \Big[ \int_0^T \Big( \tfrac12 |\alpha_s|^2 + L(X_s, m(s)) \Big)\, ds + G(X_T, m(T)) \Big],$$

      considering the density $m(s)$ of "the other agents" as given.

  17. Formal dynamic programming arguments indicate that the candidate optimal control for the agent should be constructed through the feedback strategy $\alpha^*(t,x) := -\nabla u(t,x)$, where $u$ is the unique solution of HJB for fixed $m$. Indeed, we have the simple verification result:

      Lemma. Let $X^*_t$ be the solution of

      $$dX_t = \alpha^*(t, X_t)\, dt + \sqrt{2\nu}\, dB_t, \qquad X_0 = x,$$

      and set $\alpha^*_t := \alpha^*(t, X^*_t)$. Then,

      $$\inf_\alpha J(x, \alpha) = J(x, \alpha^*_t) = \int_{\mathbb{R}^d} u(0, X_0)\, dm_0(x).$$

      Therefore, the optimal control problem is "completely" solved by solving the backward HJB equation, determining $\nabla u(t,x)$ for all $t$ and the initial value $u(0,x)$.

  18. Proof: Take $\nu = 1$ for simplicity and let $\alpha_t$ be any admissible control. Then,

      $$\begin{aligned}
      \mathbb{E}_x \big[ G(X_T, m(T)) \big] &= \mathbb{E}_x \big[ u(T, X_T) \big] \\
      &= \mathbb{E}_x \Big[ u(0, X_0) + \int_0^T \Big( \frac{\partial u}{\partial t}(s, X_s) + \alpha_s \cdot \nabla u(s, X_s) + \Delta u(s, X_s) \Big)\, ds \Big] && \text{[by Ito's formula]} \\
      &= \mathbb{E}_x \Big[ u(0, X_0) + \int_0^T \Big( \tfrac12 |\nabla u(s, X_s)|^2 + \alpha_s \cdot \nabla u(s, X_s) - L(X_s, m(s)) \Big)\, ds \Big] && \text{[by HJB]} \\
      &\ge \mathbb{E}_x \Big[ u(0, X_0) + \int_0^T \Big( -\tfrac12 |\alpha_s|^2 - L(X_s, m(s)) \Big)\, ds \Big] && \text{[by convexity]}
      \end{aligned}$$

  19. Hence, by the very definition of $J$,

      $$\mathbb{E}_x \big[ u(0, x) \big] \le J(x, \alpha)$$

      for any admissible control $\alpha$. The same computation with $\alpha_s$ replaced by $\alpha^*_s$ gives an equality in the last step, proving that

      $$\inf_\alpha J(x, \alpha) = J(x, \alpha^*).$$
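
The stochastic verification result can be probed with a small Monte Carlo experiment. This is a sketch under assumptions that are not in the slides: dimension one, $L \equiv 0$, a terminal cost $G(x) = x^2/2$ that does not depend on $m$, and an Euler-Maruyama discretization. For this linear-quadratic case the HJB equation has the explicit solution $u(t,x) = x^2 / (2(1+T-t)) + \nu \ln(1+T-t)$, so the feedback is $\alpha^*(t,x) = -x/(1+T-t)$, and the uncontrolled cost is $x^2/2 + \nu T$. Sample sizes, seed and names are illustrative.

```python
import math
import random

def simulate_cost(feedback, x0, nu=0.5, T=1.0, n_steps=50, n_paths=4000, seed=0):
    """Monte Carlo estimate of J(x0, alpha) for alpha_t = feedback(t, X_t).

    Euler-Maruyama for dX = alpha dt + sqrt(2 nu) dB, with running cost
    0.5 * |alpha|^2 (L = 0) and assumed terminal cost G(x) = x^2 / 2.
    """
    rng = random.Random(seed)
    h = T / n_steps
    sigma = math.sqrt(2.0 * nu * h)  # standard deviation of one noise increment
    total = 0.0
    for _ in range(n_paths):
        X, cost = x0, 0.0
        for k in range(n_steps):
            a = feedback(k * h, X)
            cost += 0.5 * a * a * h
            X += a * h + sigma * rng.gauss(0.0, 1.0)
        total += cost + 0.5 * X * X
    return total / n_paths

def u0(x, nu=0.5, T=1.0):
    """Explicit u(0, x) for the assumed linear-quadratic example."""
    return x * x / (2.0 * (1.0 + T)) + nu * math.log(1.0 + T)

T, nu, x0 = 1.0, 0.5, 1.0
J_star = simulate_cost(lambda t, X: -X / (1.0 + T - t), x0, nu, T)  # gradient feedback
J_zero = simulate_cost(lambda t, X: 0.0, x0, nu, T)                 # no control at all
```

Up to sampling and discretization error, `J_star` should be close to `u0(x0)` and clearly below `J_zero`, which is what the inequality $u(0,x) \le J(x,\alpha)$ with equality at $\alpha^*$ predicts.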
