Numerical strategies for efficient control of large-scale particle systems

Michael Herty, IGPM, RWTH Aachen University

Joint work with G. Albi, L. Pareschi, C. Ringhofer, S. Steffensen, M. Zanella

CIRM, Crowds: models and control, 2019
Control of interacting particle systems

\[
dx_i(t) = \frac{1}{N}\sum_{j} P(x_i, x_j)\,(x_j - x_i)\, dt + u_i\, dt + \sigma\, dW_i,
\qquad i = 1, \dots, N
\]

◮ Coupled system of ODEs / SDEs
◮ x_i, u_i are the state/control of the i-th particle
◮ P is the interaction kernel, e.g. P(x, y) = χ(‖x − y‖ ≤ Δ)
◮ P = const is used as an opinion formation model

Bounded confidence model, u_i ≡ 0, Hegselmann/Krause
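For illustration, a minimal simulation sketch of the uncontrolled bounded confidence dynamics (u_i ≡ 0); the parameter values and function names below are assumptions for illustration only.

```python
import numpy as np

def bounded_confidence(N=100, T=5.0, dt=0.01, Delta=0.3, sigma=0.0, seed=0):
    """Euler-Maruyama sketch of the uncontrolled bounded confidence model:
    dx_i = 1/N sum_j P(x_i, x_j)(x_j - x_i) dt + sigma dW_i, with the
    kernel P(x, y) = chi(|x - y| <= Delta).  All values are illustrative."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=N)             # initial opinions in [-1, 1]
    for _ in range(int(T / dt)):
        diff = x[None, :] - x[:, None]             # diff[i, j] = x_j - x_i
        P = (np.abs(diff) <= Delta).astype(float)  # bounded confidence kernel
        drift = (P * diff).sum(axis=1) / N
        x = x + dt * drift + sigma * np.sqrt(dt) * rng.standard_normal(N)
    return x

# local opinion clusters emerge for small Delta, global consensus for large Delta
print(np.sort(bounded_confidence())[::10])
```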
Realistic example: Financial market model by Levy-Levy-Solomon

◮ Particles (investors) i have two portfolios (x_i stocks / y_i bonds). S is the stock price, D > 0 the dividend, and r > 0 the interest rate.
\[
\frac{d}{dt} x_i = \frac{S' + D}{S}\, x_i + u_i,
\qquad
\frac{d}{dt} y_i = r\, y_i - u_i,
\qquad
S = \frac{1}{N} \sum_j x_j .
\]
◮ Particles have a choice u_i to invest in stocks
◮ Objective: Maximize total profit
Problem setup: Optimal Control

Consider i = 1, \dots, N particles with additive control and quadratic regularization
\[
u = \operatorname*{argmin}_{\tilde u \in \mathbb{R}} \int_0^T \frac{\nu}{2}\, \tilde u(s)^2 + h(X(s))\, ds,
\qquad
\frac{d}{dt} x_i = f(x_i, X_{-i}) + u .
\]

◮ Notation: X = (x_i)_{i=1}^N and X_{-i} = (x_j)_{j=1, j \neq i}^N
◮ Simplifying setup: single control u = u(t) for all particles, but similar results are possible for u = u_i
◮ Interest in open/closed loop control, but no game theoretic setting (see other talks in this conference)
◮ Interest in the mean-field limit N → ∞
Results in the Linear-Quadratic Case

Opinion formation model with constant P and quadratic cost
\[
f(x_i, X_{-i}) = \frac{1}{N} \sum_{j=1}^{N} P\,(x_j - x_i),
\qquad
h(X) = \frac{1}{2N} \sum_{i=1}^{N} x_i^2,
\]
and it is a linear quadratic problem
\[
u = \operatorname*{argmin}_{\tilde u \in \mathbb{R}} \int_0^T \frac{\nu}{2}\,\tilde u(s)^2 + X(s)^T M X(s)\, ds,
\qquad
\frac{d}{dt} X(t) = A X(t) + B u(t).
\]
Solution given by the Riccati equation with K(t) \in \mathbb{R}^{N \times N}
\[
-\frac{d}{dt} K = K A + A^T K - \frac{1}{\nu} K B B^T K^T + M, \quad K(T) = 0,
\qquad
B u(t) = -\frac{1}{\nu} B B^T K(t) X(t) .
\]
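A minimal numerical sketch of this Riccati control: a backward Euler sweep for the matrix Riccati equation followed by a forward simulation of the closed loop. The matrices follow the opinion model above; all parameter values are illustrative assumptions.

```python
import numpy as np

# Riccati-based control of the LQ opinion model (sketch, assumed parameters).
N, P, nu, T, dt = 50, 1.0, 0.1, 5.0, 1e-3
steps = int(T / dt)

A = (P / N) * np.ones((N, N)) - P * np.eye(N)   # (A X)_i = 1/N sum_j P (x_j - x_i)
B = np.ones((N, 1))                             # single additive control
M = np.eye(N) / N                               # quadratic running cost matrix

# backward Euler for  -dK/dt = K A + A^T K - (1/nu) K B B^T K^T + M,  K(T) = 0
Ks = [np.zeros((N, N))]
for _ in range(steps):
    K = Ks[-1]
    rhs = K @ A + A.T @ K - (K @ B @ B.T @ K.T) / nu + M
    Ks.append(K + dt * rhs)                     # stepping backward in time
Ks = Ks[::-1]                                   # Ks[n] approximates K(n * dt)

# forward closed loop  dX/dt = A X - (1/nu) B B^T K(t) X
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(N, 1))
for n in range(steps):
    u = -(B.T @ Ks[n] @ X) / nu                 # scalar feedback control u(t)
    X = X + dt * (A @ X + B * u)

print("mean state at time T:", float(X.mean()))  # driven towards consensus at 0
```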
Structure of K and mean-field limit

Matrices have symmetric structure
\[
A_{i,i} = a_0, \quad A_{i,j} = a_d, \quad B_{i,j} = 1, \quad M_{i,j} = \frac{1}{N}\,\delta_{i,j} .
\]
Structure extends to the Riccati equation
\[
-\frac{d}{dt} K = K A + A^T K - \frac{1}{\nu} K B B^T K^T + M, \quad K(T) = 0,
\qquad
\frac{d}{dt} X(t) = A X(t) - \frac{1}{\nu} B B^T K(t) X(t) .
\]

Lemma. The solution K(t) \in \mathbb{R}^{N \times N} of the matrix Riccati equation fulfills, for a scalar function K(t) \in \mathbb{R},
\[
\big( B B^T K(t) \big)_{i,j} = \frac{1}{N} K(t),
\qquad
-\frac{d}{dt} K(t) = 1 - \frac{1}{\nu} K(t)^2, \quad K(T) = 0 .
\]

Corresponding mean-field equation for the probability measure \mu(t) \in \mathcal{P}(\mathbb{R})
\[
\partial_t \mu(t,x) + \partial_x \left[ \int \Big( P(y - x) - \frac{1}{\nu} K(t)\, y \Big)\, \mu(t,y)\, dy \; \mu(t,x) \right] = 0 .
\]
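A sketch of a numerical realization of this controlled mean-field equation: the scalar Riccati gain is integrated backward in time and the nonlocal transport equation is discretized by a first-order upwind scheme. The grid, time step, parameters, and initial density are illustrative assumptions.

```python
import numpy as np

# Upwind finite-volume sketch for
#   d_t mu + d_x [ ( int ( P(y - x) - K(t)/nu * y ) mu(t, y) dy ) mu(t, x) ] = 0,
# with the scalar gain K from  -K' = 1 - K^2/nu,  K(T) = 0.
P, nu, T = 1.0, 0.1, 3.0
L, nx = 2.0, 400                               # domain [-L, L] with nx cells
dx = 2 * L / nx
x = -L + dx * (np.arange(nx) + 0.5)            # cell centers
dt = 0.1 * dx                                  # CFL-type time step
steps = int(T / dt)

K = np.zeros(steps + 1)                        # K[n] approximates K(n * dt)
for n in range(steps, 0, -1):                  # integrate backward from K(T) = 0
    K[n - 1] = K[n] + dt * (1.0 - K[n] ** 2 / nu)

mu = np.exp(-8.0 * (x - 0.5) ** 2)             # initial density centered at 0.5
mu /= mu.sum() * dx

for n in range(steps):
    m = (x * mu).sum() * dx                    # first moment of mu
    V = P * (m - x) - (K[n] / nu) * m          # nonlocal velocity (constant P)
    Vf = 0.5 * (V[:-1] + V[1:])                # velocity at cell interfaces
    flux = np.where(Vf > 0, Vf * mu[:-1], Vf * mu[1:])   # upwind numerical flux
    flux = np.concatenate(([0.0], flux, [0.0]))          # zero-flux boundaries
    mu = mu - dt / dx * (flux[1:] - flux[:-1])

print("mean of mu at time T:", (x * mu).sum() * dx)      # driven towards 0
```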
Long-term behavior of solutions

\[
0 = \partial_t \mu(t,x) + \partial_x \left[ \int \Big( P(y - x) - \frac{1}{\nu} K(t)\, y \Big)\, \mu(t,y)\, dy \; \mu(t,x) \right],
\qquad
-\frac{d}{dt} K(t) = 1 - \frac{1}{\nu} K(t)^2, \quad K(T) = 0 .
\]

◮ The moments m(t) = \int x\, \mu(t,x)\, dx and E(t) = \int x^2\, \mu(t,x)\, dx have the following asymptotic behavior:
\[
m(t) \to 0 \ \text{at rate } 1/\sqrt{\nu},
\qquad
E(t) \to 0 \ \text{at rate } 2P .
\]
◮ Results generalize to problems with a fixed desired state x_d
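For completeness, a short worked check (a sketch, using the scalar Riccati equation in the reconstructed form above) showing where the 1/\sqrt{\nu} rate comes from:
\[
  K(t) = \sqrt{\nu}\,\tanh\!\Big(\frac{T-t}{\sqrt{\nu}}\Big),
  \qquad
  -\frac{d}{dt}K(t)
  = \operatorname{sech}^2\!\Big(\frac{T-t}{\sqrt{\nu}}\Big)
  = 1 - \tanh^2\!\Big(\frac{T-t}{\sqrt{\nu}}\Big)
  = 1 - \frac{1}{\nu} K(t)^2 ,
\]
so that K(T) = 0 is satisfied and K(t) \approx \sqrt{\nu} when T - t is large. Since constant P conserves the mean, the controlled mean-field equation gives m'(t) = -(K(t)/\nu)\, m(t), i.e. decay of m(t) at rate K(t)/\nu \approx 1/\sqrt{\nu}.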
Nonlinear Case: Model-Predictive Control

\[
\frac{d}{ds} x_i = f(x_i, X_{-i}) + u,
\qquad
u = \operatorname*{argmin}_{\tilde u} \int_0^T \frac{\nu}{2}\,\tilde u^2 + h(X)\, ds .
\]
Evolution of State under Control Action for Time Control Horizon N
Model-Predictive Control on Single Time Horizon N = 2

\[
\frac{d}{ds} x_i = f(x_i, X_{-i}) + u,
\qquad
u = \operatorname*{argmin}_{\tilde u} \int_0^T \frac{\nu}{2}\,\tilde u^2 + h(X)\, ds .
\]

◮ Piecewise constant control u on the time interval (t, t + Δt)
◮ Discretized dynamics
\[
x_i(t + \Delta t) = x_i(t) + \Delta t \big( f(x_i, X_{-i}) + u_i \big),
\qquad
u = \operatorname*{argmin}_{\tilde u} \Delta t \Big( \frac{\nu}{2}\,\tilde u^2 + h\big(X(t + \Delta t)\big) \Big)
\]
◮ The optimization problem is solved explicitly:
\[
u = -\frac{\Delta t}{\nu}\, \partial_{x_i} h\big( X(t) + O(\Delta t) \big)
  \approx -\frac{\Delta t}{\nu}\, \partial_{x_i} h\big( X(t) \big) + O(\Delta t^2)
\]
◮ Scaling ν → Δt ν yields the closed loop system
\[
\frac{d}{ds} x_i = f(x_i, X_{-i}) - \frac{1}{\nu}\, \partial_{x_i} h(X)
\]
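A sketch of the resulting closed loop for the opinion model with h(X) = \frac{1}{2N}\sum_i x_i^2; parameter values are illustrative assumptions.

```python
import numpy as np

# Single-horizon ("instantaneous") MPC closed loop after rescaling nu -> dt * nu,
# applied to the opinion model with h(X) = 1/(2N) sum_i x_i^2 (sketch).
N, P, nu, T, dt = 50, 1.0, 0.01, 5.0, 0.01
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=N)

for _ in range(int(T / dt)):
    f = P * (x.mean() - x)        # f(x_i, X_-i) = 1/N sum_j P (x_j - x_i)
    u = -(x / N) / nu             # explicit feedback u_i = -(1/nu) d_{x_i} h(X)
    x = x + dt * (f + u)          # explicit Euler step of the closed loop

print("mean opinion at time T:", x.mean())   # driven towards 0 by the control
```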
Mean-field limit of the closed loop system

\[
\frac{d}{ds} x_i = f(x_i, X_{-i}) - \frac{1}{\nu}\, \partial_{x_i} h(X),
\qquad
0 = \partial_t \mu + \partial_x \Big[ \Big( f(x, \mu) - \frac{1}{\nu}\, \partial_x h(\mu) \Big) \mu \Big] .
\]

Comparison of the uncontrolled and controlled model for h(X) = \frac{1}{2N} \sum_i x_i^2.
Efficient computation of controlled particle systems

\[
\frac{d}{ds} x_i = f(x_i, X_{-i}) + u,
\qquad
u = \operatorname*{argmin}_{\tilde u} \int_0^T \frac{\nu}{2}\,\tilde u^2 + h(X)\, ds .
\]

◮ The MPC approach at time t with horizon Δt yields u = -\frac{1}{\nu}\, \partial_{x_i} h(X)
◮ Binary discretized interaction model with f for N = 2:
\[
x_i^{n+1} = x_i^n + \Delta t\, f_{\mathrm{bin}}(x_i^n, x_j^n) - \frac{\Delta t}{\nu}\, \partial_{x_i} h_{\mathrm{bin}}(x_i^n, x_j^n),
\qquad
x_j^{n+1} = x_j^n + \Delta t\, f_{\mathrm{bin}}(x_j^n, x_i^n) - \frac{\Delta t}{\nu}\, \partial_{x_j} h_{\mathrm{bin}}(x_i^n, x_j^n) .
\]
◮ The binary interaction model has the same mean-field limit
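As a sketch of how the binary model is used algorithmically, a Nanbu-like Monte Carlo step: particles are paired at random and updated with the two-particle rule above. The concrete choices of f_bin and h_bin (here the N = 2 versions of the opinion model with constant P and quadratic cost) and all parameter values are illustrative assumptions.

```python
import numpy as np

def f_bin(xi, xj, P=1.0):
    # N = 2 interaction of the opinion model: 1/2 * P * (xj - xi)
    return 0.5 * P * (xj - xi)

def grad_h_bin(xi, xj):
    # d/dx_i of the two-particle cost h_bin(x_i, x_j) = 1/4 * (x_i^2 + x_j^2)
    return 0.5 * xi

def binary_mc_step(x, dt, nu, rng):
    """One Monte Carlo step: random pairing, then the binary update."""
    idx = rng.permutation(x.size)
    i, j = idx[: x.size // 2], idx[x.size // 2:]
    xi, xj = x[i].copy(), x[j].copy()
    x[i] = xi + dt * f_bin(xi, xj) - (dt / nu) * grad_h_bin(xi, xj)
    x[j] = xj + dt * f_bin(xj, xi) - (dt / nu) * grad_h_bin(xj, xi)
    return x

rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, size=10_000)
for _ in range(500):
    x = binary_mc_step(x, dt=0.01, nu=0.1, rng=rng)

print("sample mean:", x.mean(), " sample variance:", x.var())
```

The cost of one step scales linearly in the number of particles, which is the point of the binary reformulation compared to the O(N^2) evaluation of the full interaction sum.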
Sznajd's model: P = (1 - x_i^2) and h(X) = \frac{1}{2}(x_i - w_d)^2

Figure: Solution profiles (over the opinion variable w; legend: exact, k = ∞, k = 1, k = 0.5) at time T = 1, first row, and T = 2, second row, for the uncontrolled, mildly controlled and strongly controlled case. On the left the desired state is set to w_d = 0; on the right w_d = 0.5 for the strongly controlled case and w_d = -0.25 for the mildly controlled case.
Model predictive control vs Riccati control results

Figure: Evolution of the mean \int x f(t,x)\, dx in the Riccati control case (left) and the MPC case (right). Plots are in log-scale and for different penalizations of the control ν. The left plot scales to 10^{-8}, the right to 10^{-0.55}.
Performance result for MPC measured by value function

\[
V^*(\tau, Y) = \min_u \int_\tau^T h(X) + \frac{\nu}{2} u^2\, ds,
\qquad
x_i'(t) = f(x_i(t), X_{-i}(t)) + u .
\]

◮ V^* is the value function for the optimal control u and initial data X(\tau) = Y
◮ MPC controlled dynamics (x_i^{MPC})'(t) = f(X^{MPC}(t)) + u^{MPC} and corresponding value function
\[
V^{MPC}(\tau, Y) = \int_\tau^T h(X^{MPC}) + \frac{\nu}{2} (u^{MPC})^2\, ds
\]
◮ Grüne [2009]: There exists 0 < α < 1 such that
\[
V^{MPC}(\tau, Y) \leq \frac{1}{\alpha}\, V^*(\tau, Y)
\]
◮ α depends in particular on the MPC horizon M and on growth conditions.
Performance result independent of the number of particles

◮ The result extends to the mean-field case under the same assumptions as in finite dimensions
◮ The growth conditions are fulfilled, for example, for the opinion model
◮ α is independent of the number of agents

Quality of the estimate V^{MPC}(\tau, Y) \leq \frac{1}{\alpha} V^*(\tau, Y).
Computation of mean-field optimality conditions

MPC with a horizon larger than one requires solving
\[
\frac{d}{ds} x_i = f(x_i, X_{-i}) + u,
\qquad
u = \operatorname*{argmin}_{\tilde u} \int_t^{t + M \Delta t} \frac{\nu}{2}\,\tilde u^2 + h(x_i, X_{-i})\, ds,
\]
or on the mean-field level
\[
u = \operatorname*{argmin}_{\tilde u} \int_t^{t + M \Delta t} \int h([\mu])\, \mu\, dx + \frac{\nu}{2}\,\tilde u^2\, dt,
\qquad
0 = \partial_t \mu + \partial_x \big( ( f(x, [\mu]) + u )\, \mu \big) .
\]

◮ Derivation of consistent optimality systems on the particle and the mean-field level
◮ Leading to suitable numerical discretizations of both systems

Particle system → Pontryagin's maximum principle → Mean-field → Decomposition with conditional probabilities → Numerical scheme
Particle system → Pontryagin's maximum principle

Link optimality systems for the N-particle system with pairwise interactions (see E. Caines), with x_i ∈ R^K, u ∈ R^M:
\[
\frac{d}{dt} x_i = \frac{1}{N} \sum_j p(x_i, x_j, u),
\qquad
u = \operatorname*{argmin}_{\tilde u} \int_0^T \frac{1}{N} \sum_j \phi(x_j, \tilde u)\, dt .
\]
Pontryagin's maximum principle and adjoint variables ν_i ∈ R^K with zero terminal conditions:
\[
-\frac{d}{dt} \nu_i + \frac{1}{N} \sum_j \Big( \partial_1 p(x_i, x_j, u)^T \nu_i + \partial_2 p(x_j, x_i, u)^T \nu_j \Big) + \nabla_x \phi(x_i, u) = 0,
\]
\[
\frac{1}{N^2} \sum_{i,j} \nu_i^T\, \partial_u p(x_i, x_j, u) + \frac{1}{N} \sum_i \nabla_u \phi(x_i, u) = 0 .
\]
Under suitable assumptions on \nabla^2_{uu} \phi and \nabla^2_{uu} p the control can be expressed explicitly in terms of (x, ν).
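As an illustration of how such an optimality system can be solved numerically, a minimal forward-backward (gradient) sweep, specialized to the assumed choices p(x, y, u) = P (y - x) + u and φ(x, u) = x²/2 + (ν/2) u². The multipliers below follow the standard discrete adjoint convention, whose normalization may differ from the one on the slide; all names and parameter values are illustrative.

```python
import numpy as np

# Forward-backward (gradient) sweep for the N-particle optimal control problem
# with p(x, y, u) = P*(y - x) + u and phi(x, u) = x^2/2 + nu/2 * u^2 (sketch).
N, P, nu, T, dt = 50, 1.0, 1.0, 2.0, 0.01
steps = int(T / dt)
rng = np.random.default_rng(3)
x0 = rng.uniform(0.0, 1.0, size=N)
u = np.zeros(steps)                              # piecewise constant control

for sweep in range(200):
    # forward pass:  x_i' = 1/N sum_j p(x_i, x_j, u) = P*(mean(x) - x_i) + u
    X = np.empty((steps + 1, N))
    X[0] = x0
    for n in range(steps):
        X[n + 1] = X[n] + dt * (P * (X[n].mean() - X[n]) + u[n])
    # backward pass: discrete adjoints lam[n] = dJ/dX[n], zero at the final time
    lam = np.zeros(N)
    grad = np.empty(steps)
    for n in reversed(range(steps)):
        grad[n] = nu * u[n] + lam.sum()          # dJ/du[n] (up to the factor dt)
        lam = lam + dt * (X[n] / N + P * (lam.mean() - lam))
    u -= 0.3 * grad                              # plain gradient step on u

print("mean state at time T:", X[-1].mean())     # steered towards 0 by the control
```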
Pontryagin's maximum principle → Mean-field

Under an IID assumption we obtain via the BBGKY hierarchy the mean-field limit of the PMP system for g = g(t, x, z):
\[
\partial_t g(x,z,t)
+ \operatorname{div}_x \Big( g(x,z,t) \int\!\!\int g(y,w,t)\, p(x,y,u)\, dy\, dw \Big)
- \operatorname{div}_z \Big( g(x,z,t) \int\!\!\int g(y,w,t)\, \partial_1 p(x,y,u)^T w\, dy\, dw
+ g(x,z,t) \int\!\!\int g(y,w,t)\, \partial_2 p(y,x,u)^T w\, dy\, dw \Big)
- \operatorname{div}_z \big( g(x,z,t)\, \nabla_x \phi(x,u) \big) = 0 .
\]

◮ The kinetic density g depends on z, corresponding to the Lagrange multiplier variable
◮ Goal is to derive equations for the optimality system of the mean-field control problem
◮ The multiplier depends on the state, hence decompose g(x,z,t) := \mu(t,x)\, \mu_c(z,x,t)