Numerical strategies for efficient control of large-scale particle systems


  1. Numerical strategies for efficient control of large-scale particle systems
     Michael Herty, IGPM, RWTH Aachen University.
     Joint work with G. Albi, L. Pareschi, C. Ringhofer, S. Steffensen, M. Zanella.
     CIRM, Crowds: models and control, 2019.

  2. Control of interacting particle systems
     $dx_i(t) = \Big( \sum_{j} P(x_i, x_j)\,(x_j - x_i) + u_i \Big)\, dt + \sigma\, dW_i, \qquad i = 1, \dots, N$
     ◮ Coupled system of ODEs / SDEs
     ◮ $x_i$, $u_i$ are the state and the control of the $i$-th particle
     ◮ $P$ is the interaction kernel, e.g. $P(x, y) = \chi(\|x - y\| \leq \Delta)$
     ◮ $P = \mathrm{const}$ is used as an opinion formation model: the bounded confidence model with $u_i \equiv 0$ (Hegselmann/Krause)
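A minimal sketch (not from the talk; the $1/N$ normalization of the interaction sum, the parameter values, and $\sigma = 0$ are my assumptions) of the bounded confidence dynamics above, integrated with an Euler-Maruyama step:

```python
import numpy as np

rng = np.random.default_rng(0)
N, Delta, sigma, dt, steps = 100, 0.25, 0.0, 0.05, 1000
x = rng.uniform(-1.0, 1.0, size=N)                    # initial opinions, uncontrolled case u_i = 0

for _ in range(steps):
    diff = x[None, :] - x[:, None]                    # diff[i, j] = x_j - x_i
    P = (np.abs(diff) <= Delta).astype(float)         # bounded-confidence kernel chi(|x_i - x_j| <= Delta)
    drift = (P * diff).mean(axis=1)                   # (1/N) sum_j P(x_i, x_j) (x_j - x_i)
    x = x + dt * drift + sigma * np.sqrt(dt) * rng.standard_normal(N)

print("opinions at the final time (rounded):", np.unique(np.round(x, 2)))
```

For $\Delta$ smaller than the initial spread, the opinions typically freeze in several clusters rather than reaching consensus.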

  3. Realistic example: financial market model by Levy-Levy-Solomon
     ◮ Particles (investors) $i$ hold two portfolios: $x_i$ in stocks and $y_i$ in bonds. $S$ is the stock price, $D > 0$ the dividend, and $r > 0$ the interest rate.
     $\frac{d}{dt} x_i = \frac{S' + D}{S}\, x_i + u_i, \qquad \frac{d}{dt} y_i = r\, y_i - u_i, \qquad S = \frac{1}{N} \sum_j x_j$
     ◮ Particles have the choice $u_i$ of how much to invest in stocks
     ◮ Objective: maximize the total profit

  4. Problem setup: optimal control
     Consider $i = 1, \dots, N$ particles with additive control and quadratic regularization:
     $u = \operatorname{argmin}_{\tilde u \in \mathbb{R}} \int_0^T \frac{\nu}{2}\,\tilde u(s)^2 + h(X(s))\, ds, \qquad \frac{d}{dt} x_i = f(x_i, X_{-i}) + u$
     ◮ Notation: $X = (x_i)_{i=1}^N$ and $X_{-i} = (x_j)_{j=1,\, j \neq i}^N$
     ◮ Simplifying setup: a single control $u = u(t)$ for all particles; similar results are possible for $u = u_i$
     ◮ Interest in open- and closed-loop control, but no game-theoretic setting (see other talks at this conference)
     ◮ Interest in the mean-field limit $N \to \infty$
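To make the setup concrete, the following sketch (my own illustration; the constant-kernel opinion dynamics as $f$, the quadratic $h$, and all parameter values are assumptions) discretizes the single scalar control $u(t)$ as piecewise constant in time and minimizes the resulting cost with a generic optimizer:

```python
import numpy as np
from scipy.optimize import minimize

N, T, steps, nu, P = 20, 2.0, 40, 1.0, 1.0
dt = T / steps
rng = np.random.default_rng(1)
x0 = rng.uniform(-1.0, 1.0, size=N)

def f(x):
    # opinion dynamics with constant kernel P: f(x_i, X_{-i}) = (1/N) sum_j P (x_j - x_i)
    return P * (x.mean() - x)

def h(x):
    # quadratic cost h(X) = (1/(2N)) sum_i x_i^2
    return 0.5 * np.mean(x ** 2)

def cost(u):
    # explicit Euler discretization of the dynamics and left Riemann sum for the cost
    x, J = x0.copy(), 0.0
    for k in range(steps):
        J += dt * (0.5 * nu * u[k] ** 2 + h(x))
        x = x + dt * (f(x) + u[k])
    return J

res = minimize(cost, np.zeros(steps), method="BFGS")   # one control value per time step
print("approximate optimal cost:", res.fun)
```

This brute-force approach quickly becomes expensive as the horizon and the particle number grow, which motivates the structure-exploiting strategies on the following slides.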

  5. Results in the linear-quadratic case
     Opinion formation model with constant $P$ and quadratic cost,
     $f(x_i, X_{-i}) = \frac{1}{N} \sum_{j=1}^N P\,(x_j - x_i), \qquad h(X) = \frac{1}{2N} \sum_{i=1}^N x_i^2,$
     which is a linear-quadratic problem:
     $u = \operatorname{argmin}_{\tilde u \in \mathbb{R}} \int_0^T \frac{\nu}{2}\,\tilde u(s)^2 + X(s)^T M X(s)\, ds, \qquad \frac{d}{dt} X(t) = A X(t) + B u(t)$
     The solution is given by a Riccati equation with $K(t) \in \mathbb{R}^{N \times N}$:
     $-\frac{d}{dt} K = K A + A^T K - \frac{1}{\nu} K B B^T K + M, \quad K(T) = 0, \qquad B u(t) = -\frac{1}{\nu} B B^T K(t) X(t).$
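A numerical sketch (my own; parameter values are assumptions, and $B$ is taken as the all-ones vector so that a single scalar control acts on every particle) of the linear-quadratic case: the matrix Riccati equation is integrated backward in time with explicit Euler and the resulting feedback $u(t) = -\frac{1}{\nu} B^T K(t) X(t)$ is applied to the forward dynamics:

```python
import numpy as np

N, T, steps, nu, P = 10, 5.0, 500, 1.0, 1.0
dt = T / steps

A = (P / N) * np.ones((N, N)) - P * np.eye(N)   # f(x_i, X_{-i}) = (1/N) sum_j P (x_j - x_i)
B = np.ones((N, 1))                             # one scalar control acting on every particle
M = np.eye(N) / N                               # running cost X^T M X with M_{ij} = delta_{ij} / N

# backward sweep for  -dK/dt = K A + A^T K - (1/nu) K B B^T K + M,  K(T) = 0
Ks = [np.zeros((N, N))]
for _ in range(steps):
    Kc = Ks[-1]
    dK = Kc @ A + A.T @ Kc - (1.0 / nu) * Kc @ B @ B.T @ Kc + M
    Ks.append(Kc + dt * dK)                     # explicit Euler, marching from t = T down to t = 0
Ks = Ks[::-1]                                   # Ks[k] approximates K(k * dt)

# forward sweep with the Riccati feedback u(t) = -(1/nu) B^T K(t) X(t)
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(N, 1))
for k in range(steps):
    u = -(1.0 / nu) * (B.T @ Ks[k] @ X)         # 1x1 array holding the scalar control
    X = X + dt * (A @ X + B @ u)

print("mean state at time T:", float(X.mean()))
```

For large $N$, storing and evolving the full $N \times N$ matrix $K(t)$ is the expensive part; the structural Lemma on the next slide reduces it to a scalar equation.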

  6. Structure of $K$ and the mean-field limit
     The matrices have a symmetric structure: $A_{i,i} = a_0$, $A_{i,j} = a_d$, $B_{i,j} = 1$, $M_{i,j} = \frac{1}{N}\,\delta_{i,j}$.
     The structure extends to the Riccati equation:
     $-\frac{d}{dt} K = K A + A^T K - \frac{1}{\nu} K B B^T K + M, \quad K(T) = 0, \qquad \frac{d}{dt} X(t) = A X(t) - \frac{1}{\nu} B B^T K(t) X(t)$
     Lemma. The solution $K(t) \in \mathbb{R}^{N \times N}$ of the matrix Riccati equation fulfills $\big(B B^T K(t)\big)_{i,j} = \frac{1}{N} K(t)$ for a scalar $K(t) \in \mathbb{R}$ with
     $-\frac{d}{dt} K(t) = 1 - \frac{1}{\nu} K(t)^2, \quad K(T) = 0.$
     The corresponding mean-field equation for the probability measure $\mu(t) \in \mathcal{P}(\mathbb{R})$ reads
     $\partial_t \mu(t,x) + \partial_x \Big( \Big[ \int \big( P\,(y - x) - \tfrac{1}{\nu} K(t)\, y \big)\, \mu(t,y)\, dy \Big] \mu(t,x) \Big) = 0$
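A short consistency check (my own) of the scalar Riccati equation in the Lemma, assuming the quadratic form $-\dot K = 1 - K^2/\nu$ reconstructed above, whose closed-form solution is $K(t) = \sqrt{\nu}\,\tanh\big((T - t)/\sqrt{\nu}\big)$:

```python
import numpy as np

nu, T, steps = 0.5, 4.0, 4000
dt = T / steps

K = 0.0                                  # terminal condition K(T) = 0
for _ in range(steps):                   # march backward from t = T to t = 0
    K += dt * (1.0 - K ** 2 / nu)

exact = np.sqrt(nu) * np.tanh(T / np.sqrt(nu))
print(f"K(0): numerical {K:.6f} vs closed form {exact:.6f}")
```

The saturation of $K$ at $\sqrt{\nu}$ yields a feedback gain of order $1/\sqrt{\nu}$, consistent with the decay rate of the mean reported on the next slide.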

  7. Long-term behavior of solutions
     $0 = \partial_t \mu(t,x) + \partial_x \Big( \Big[ \int \big( P\,(y - x) - \tfrac{1}{\nu} K(t)\, y \big)\, \mu(t,y)\, dy \Big] \mu(t,x) \Big), \qquad -\frac{d}{dt} K(t) = 1 - \frac{1}{\nu} K(t)^2, \quad K(T) = 0$
     ◮ The moments $m(t) = \int x\, \mu(t,x)\, dx$ and $E(t) = \int x^2\, \mu(t,x)\, dx$ have the following asymptotic behavior:
       $m(t) \to 0$ at rate $1/\sqrt{\nu}$, $\quad E(t) \to 0$ at rate $2P$
     ◮ The results generalize to problems with a fixed desired state $x_d$

  8. Nonlinear case: model-predictive control
     $\frac{d}{ds} x_i = f(x_i, X_{-i}) + u, \qquad u = \operatorname{argmin}_{\tilde u} \int_0^T \frac{\nu}{2}\,\tilde u^2 + h(X)\, ds$

  9. Evolution of the state under the control action for the time control horizon N

  10. Model-predictive control on a single time horizon ($N = 2$)
     $\frac{d}{ds} x_i = f(x_i, X_{-i}) + u, \qquad u = \operatorname{argmin}_{\tilde u} \int_0^T \frac{\nu}{2}\,\tilde u^2 + h(X)\, ds$
     ◮ Piecewise constant control $u$ on the time interval $(t, t + \Delta t)$
     ◮ Discretized dynamics $x_i(t + \Delta t) = x_i(t) + \Delta t \big( f(x_i, X_{-i}) + u_i \big)$ with
       $u = \operatorname{argmin}_{\tilde u} \Delta t \big( \tfrac{\nu}{2}\,\tilde u^2 + h(X(t + \Delta t)) \big)$
     ◮ The optimization problem is solved explicitly:
       $u = -\frac{\Delta t}{\nu}\, \partial_{x_i} h\big(X(t) + O(\Delta t)\big) \approx -\frac{\Delta t}{\nu}\, \partial_{x_i} h(X(t)) + O(\Delta t^2)$
     ◮ Scaling $\nu$ with $\Delta t$ yields the closed-loop system
       $\frac{d}{ds} x_i = f(x_i, X_{-i}) - \frac{1}{\nu}\, \partial_{x_i} h(X)$
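A minimal sketch (assumed parameters; I take the per-particle cost $h = \frac{1}{2}(x_i - x_d)^2$ with a desired state $x_d$, in the spirit of the later Sznajd example) of the single-horizon MPC step with the $\nu \sim \Delta t$ scaling applied:

```python
import numpy as np

N, P, nu_bar, dt, steps, xd = 50, 1.0, 0.5, 0.01, 1000, 0.25
nu = dt * nu_bar                          # penalization scaled with the horizon Delta t
rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, size=N)

for _ in range(steps):
    drift = P * (x.mean() - x)            # f(x_i, X_{-i}) with constant kernel P
    u = -(dt / nu) * (x - xd)             # explicit MPC control -(dt/nu) d_{x_i} h = -(1/nu_bar)(x_i - xd)
    x = x + dt * (drift + u)

print("mean opinion:", x.mean(), " desired state xd:", xd)
```

No optimization problem is solved online; the single-horizon control is available in closed form, which is what makes this variant attractive for large particle numbers.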

  11. Mean-field limit of the closed-loop system
     $\frac{d}{ds} x_i = f(x_i, X_{-i}) - \frac{1}{\nu}\, \partial_{x_i} h(X), \qquad 0 = \partial_t \mu + \partial_x \Big( \big( f(x, \mu) - \tfrac{1}{\nu}\, \partial_x h(\mu) \big) \mu \Big)$
     [Figure: comparison of the uncontrolled and the controlled model with $h(X) = \frac{1}{2N} \sum_i x_i^2$]

  12. Efficient computation of controlled particle systems
     $\frac{d}{ds} x_i = f(x_i, X_{-i}) + u, \qquad u = \operatorname{argmin}_{\tilde u} \int_0^T \frac{\nu}{2}\,\tilde u^2 + h(X)\, ds$
     ◮ The MPC approach at time $t$ with horizon $\Delta t$ yields $u = -\frac{1}{\nu}\, \partial_{x_i} h(X)$
     ◮ Binary discretized interaction model with $f$ for $N = 2$:
       $x_i^{n+1} = x_i^n + \Delta t\, f_{\mathrm{bin}}(x_i^n, x_j^n) - \frac{\Delta t}{\nu}\, \partial_{x_i} h_{\mathrm{bin}}(x_i^n, x_j^n)$
       $x_j^{n+1} = x_j^n + \Delta t\, f_{\mathrm{bin}}(x_j^n, x_i^n) - \frac{\Delta t}{\nu}\, \partial_{x_j} h_{\mathrm{bin}}(x_i^n, x_j^n)$
     ◮ The binary interaction model has the same mean-field limit
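A sketch (my own illustration; the constant kernel, the symmetric binary cost $h_{\mathrm{bin}}(x_i, x_j) = \frac{1}{2}(x_i - w_d)^2 + \frac{1}{2}(x_j - w_d)^2$, and all parameters are assumptions) of a binary-interaction Monte Carlo step: particles are paired at random and each pair is updated with the controlled binary dynamics, at $O(N)$ cost per step instead of the $O(N^2)$ cost of the full interaction sum:

```python
import numpy as np

N, P, nu, dt, steps, wd = 1000, 1.0, 0.5, 0.05, 400, 0.25
rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, size=N)

def f_bin(xi, xj):
    # binary interaction with constant kernel P
    return P * (xj - xi)

for _ in range(steps):
    perm = rng.permutation(N)
    i, j = perm[: N // 2], perm[N // 2:]                     # random disjoint pairs (i_k, j_k)
    xi, xj = x[i].copy(), x[j].copy()
    x[i] = xi + dt * f_bin(xi, xj) - (dt / nu) * (xi - wd)   # d h_bin / d x_i = x_i - wd
    x[j] = xj + dt * f_bin(xj, xi) - (dt / nu) * (xj - wd)   # d h_bin / d x_j = x_j - wd

print("mean opinion:", x.mean(), " desired state wd:", wd)
```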

  13. Sznajd's model: $P = (1 - x_i^2)$ and $h(X) = \frac{1}{2}(x_i - w_d)^2$
     [Figure: Solution profiles over $w$ at time $T = 1$ (first row) and $T = 2$ (second row) for the uncontrolled, mildly controlled and strongly controlled case (legend: exact, $k = \infty$, $k = 1$, $k = 0.5$). On the left the desired state is set to $w_d = 0$; on the right $w_d = 0.5$ for the strongly controlled case and $w_d = -0.25$ for the mildly controlled case.]

  14. Model-predictive control vs. Riccati control results
     Figure: Evolution of the mean $\int x\, f(t,x)\, dx$ in the Riccati control case (left) and the MPC case (right). Plots are in log-scale and for different penalizations $\nu$ of the control. The left plot scales down to $10^{-8}$, the right to $10^{-0.55}$.

  15. Performance result for MPC measured by the value function
     $V^*(\tau, Y) = \min_u \int_\tau^T h(X) + \frac{\nu}{2}\, u^2\, ds, \qquad x_i'(t) = f(x_i(t), X_{-i}(t)) + u$
     ◮ $V^*$ is the value function for the optimal control $u$ and initial data $X(\tau) = Y$
     ◮ MPC-controlled dynamics $(x_i^{\mathrm{MPC}})'(t) = f(X^{\mathrm{MPC}}(t)) + u^{\mathrm{MPC}}$ with the corresponding value function
       $V^{\mathrm{MPC}}(\tau, y) = \int_\tau^T h(X^{\mathrm{MPC}}) + \frac{\nu}{2}\, (u^{\mathrm{MPC}})^2\, ds$
     ◮ Grüne [2009]: there exists $0 < \alpha < 1$ such that $V^{\mathrm{MPC}}(\tau, y) \leq \frac{1}{\alpha}\, V^*(\tau, y)$
     ◮ $\alpha$ depends in particular on the MPC horizon $M$ and on growth conditions.
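A numerical illustration (my own experiment, with assumed parameters) of this estimate for the linear-quadratic opinion model: the cost accumulated along the Riccati-controlled trajectory serves as a proxy for $V^*$, the cost along a trajectory driven by an instantaneous MPC-type feedback $-\frac{1}{\nu} B^T \nabla h(X)$ (after the $\nu \sim \Delta t$ scaling) stands in for $V^{\mathrm{MPC}}$, and their ratio gives a rough numerical stand-in for $1/\alpha$:

```python
import numpy as np

N, T, steps, nu, P = 10, 5.0, 1000, 1.0, 1.0
dt = T / steps
A = (P / N) * np.ones((N, N)) - P * np.eye(N)
B = np.ones((N, 1))
M = np.eye(N) / N

# backward Riccati sweep (explicit Euler), K(T) = 0, as in the LQ sketch above
Ks = [np.zeros((N, N))]
for _ in range(steps):
    Kc = Ks[-1]
    Ks.append(Kc + dt * (Kc @ A + A.T @ Kc - (1 / nu) * Kc @ B @ B.T @ Kc + M))
Ks = Ks[::-1]

rng = np.random.default_rng(5)
X0 = rng.uniform(-1.0, 1.0, size=(N, 1))

def closed_loop_cost(feedback):
    """Accumulate int X^T M X + (nu/2) u^2 ds along the trajectory driven by 'feedback'."""
    X, cost = X0.copy(), 0.0
    for k in range(steps):
        u = feedback(k, X)
        cost += dt * ((X.T @ M @ X).item() + 0.5 * nu * u ** 2)
        X = X + dt * (A @ X + B * u)
    return cost

V_ric = closed_loop_cost(lambda k, X: (-(1 / nu) * B.T @ Ks[k] @ X).item())   # Riccati feedback, proxy for V*
V_mpc = closed_loop_cost(lambda k, X: (-(2 / nu) * B.T @ M @ X).item())       # instantaneous feedback -(1/nu) B^T grad h(X)
print(f"V_Riccati ~ {V_ric:.4f}   V_MPC ~ {V_mpc:.4f}   ratio (proxy for 1/alpha): {V_mpc / V_ric:.3f}")
```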

  16. Performance result independent of the number of particles
     ◮ The result extends to the mean-field limit under the same assumptions as in finite dimensions
     ◮ The growth conditions are fulfilled, for example, for the opinion model
     ◮ $\alpha$ is independent of the number of agents, so the quality of the estimate $V^{\mathrm{MPC}}(\tau, y) \leq \frac{1}{\alpha}\, V^*(\tau, y)$ does not deteriorate as $N$ grows

  17. Computation of mean-field optimality conditions
     MPC with a horizon larger than one requires solving
     $\frac{d}{ds} x_i = f(x_i, X_{-i}) + u, \qquad u = \operatorname{argmin}_{\tilde u} \int_t^{t + M \Delta t} \frac{\nu}{2}\,\tilde u^2 + h(x_i, X_{-i})\, ds$
     or, on the mean-field level,
     $u = \operatorname{argmin}_{\tilde u} \int_t^{t + M \Delta t} \int h([\mu])\, \mu\, dx + \frac{\nu}{2}\,\tilde u^2\, dt, \qquad 0 = \partial_t \mu + \partial_x \big( ( f(x, [\mu]) + u )\, \mu \big)$
     ◮ Derivation of consistent optimality systems on the particle and on the mean-field level
     ◮ Leading to suitable numerical discretizations of both systems
     Particle system → Pontryagin's maximum principle → mean-field → decomposition with conditional probabilities → numerical scheme

  18. Particle system → Pontryagin's maximum principle
     Link the optimality systems for the $N$-particle system with pairwise interactions (see E. Caines), with $x_i \in \mathbb{R}^K$, $u \in \mathbb{R}^M$:
     $\frac{d}{dt} x_i = \frac{1}{N} \sum_j p(x_i, x_j, u), \qquad u = \operatorname{argmin}_{\tilde u} \int_0^T \frac{1}{N} \sum_j \phi(x_j, \tilde u)\, dt$
     Pontryagin's maximum principle with adjoint variables $\nu_i \in \mathbb{R}^K$ and zero terminal conditions gives
     $-\frac{d}{dt} \nu_i + \frac{1}{N} \sum_j \big( \partial_1 p(x_i, x_j, u)^T \nu_i + \partial_2 p(x_j, x_i, u)^T \nu_j \big) + \nabla_x \phi(x_i, u) = 0,$
     $\frac{1}{N^2} \sum_{i,j} \nu_i^T\, \partial_u p(x_i, x_j, u) + \frac{1}{N} \sum_i \nabla_u \phi(x_i, u) = 0.$
     Under suitable assumptions on $\nabla^2_{uu} \phi$ and $\nabla^2_{uu} p$ the control can be expressed explicitly in terms of $(x, \nu)$.
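A minimal sketch (my own construction, not the discretization from the talk) of a forward-backward sweep for the Pontryagin system in a simple instance: the single-control opinion model with constant kernel $P$ and cost $\frac{\nu}{2} u^2 + \frac{1}{2N} \sum_i x_i^2$, for which the adjoint equation is $-\dot\lambda_i = P(\bar\lambda - \lambda_i) + x_i/N$ and the stationarity condition reads $\nu\, u(t) + \sum_i \lambda_i(t) = 0$. The relaxation factor and all parameters are assumptions:

```python
import numpy as np

N, P, nu, T, steps = 20, 1.0, 1.0, 2.0, 200
dt = T / steps
rng = np.random.default_rng(6)
x0 = rng.uniform(-1.0, 1.0, size=N)
u = np.zeros(steps)                                # piecewise-constant control guess

for sweep in range(100):
    # forward sweep:  x_i' = (1/N) sum_j P (x_j - x_i) + u   (explicit Euler)
    X = np.empty((steps + 1, N))
    X[0] = x0
    for k in range(steps):
        X[k + 1] = X[k] + dt * (P * (X[k].mean() - X[k]) + u[k])
    # backward sweep for the adjoint (lam plays the role of nu_i on the slide):
    # -lam_i' = P (mean(lam) - lam_i) + x_i / N,  lam(T) = 0
    lam = np.zeros((steps + 1, N))
    for k in range(steps, 0, -1):
        lam[k - 1] = lam[k] + dt * (P * (lam[k].mean() - lam[k]) + X[k] / N)
    # stationarity  nu * u(t) + sum_i lam_i(t) = 0, imposed with under-relaxation
    u = 0.5 * u + 0.5 * (-lam[:steps].sum(axis=1) / nu)

print("control at t = 0:", u[0], "  mean state at T:", X[-1].mean())
```

In the mean-field optimality system on the next slide, the empirical sums are replaced by integrals against the kinetic density $g$.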

  19. Pontryagin's maximum principle → mean-field
     Under an i.i.d. assumption, the BBGKY hierarchy yields the mean-field limit of the PMP system for $g = g(t, x, z)$:
     $\partial_t g(t,x,z) + \operatorname{div}_x \Big( g(t,x,z) \int g(t,y,w)\, p(x,y,u)\, dy\, dw \Big)$
     $\quad - \operatorname{div}_z \Big( g(t,x,z) \int g(t,y,w) \big( \partial_1 p(x,y,u)^T z + \partial_2 p(y,x,u)^T w \big)\, dy\, dw \Big) - \operatorname{div}_z \big( g(t,x,z)\, \nabla_x \phi(x,u) \big) = 0$
     ◮ The kinetic density $g$ depends on $z$, the variable corresponding to the Lagrange multiplier
     ◮ The goal is to derive the equations of the optimality system for the mean-field control problem
     ◮ The multiplier depends on the state, hence decompose $g(t,x,z) := \mu(t,x)\, \mu_c(t,x,z)$
