A Mean Field Games Approach to Consensus Problems
Mojtaba Nourian, McGill University, Montréal, Canada
Information and Control in Networks, LCCC, Lund University, 23 October 2012
Joint work with Professors Peter Caines, Roland Malhamé and Minyi Huang
Outline
• Background: Mean Field Game (MFG) Theory
• Standard Consensus Algorithms (SCAs)
• MFG Consensus Formulation and Solution – Homogeneous Case
• MFG Consensus Formulation and Solution – Heterogeneous Case
Background – Mean Field Game (MFG) Theory
The Modeling Setup of Mean Field Game Theory (Huang, Caines, Malhamé (’03,’06,’07), Lasry-Lions (’06,’07)):
• For a class of dynamic games with a large number of minor agents
• Each minor agent interacts with the average or so-called mass effect of other agents via couplings in their individual cost functions and individual dynamics
• A minor agent is an agent which, asymptotically as the population size goes to infinity, has a negligible influence on the overall system while the overall population’s effect on it is significant
Background – Mean Field Game (MFG) Theory
Key Idea of Mean Field Game (MFG) Theory (HCM (’03,’06,’07)):
• Establish the existence of an equilibrium relationship between the individual strategies and the mass effect in the infinite population limit
• Such that the individual strategy of each agent is a best response to the mass effect, and the set of strategies collectively replicates that mass effect
• Apply the resulting infinite population strategies to a finite population system and obtain a suitable approximate equilibrium
Background – MFG-LQG Problem Formulation
Basic Linear-Quadratic-Gaussian (LQG) Dynamic Game Problem
Individual Agent’s Dynamics:
$$dz_i(t) = \big(a_i z_i(t) + b\,u_i(t)\big)\,dt + \sigma_i\,dw_i(t), \quad 1 \le i \le N$$
$N$: population size, $z_i$: state of agent $i$, $u_i$: control input, $w_i$: disturbance
Individual Agent’s Cost Function:
$$J_i(u_i, \nu) \triangleq E \int_0^\infty e^{-\rho t}\Big[\big(z_i(t) - \nu(t)\big)^2 + r\,u_i^2(t)\Big]\,dt$$
$\rho > 0$: discount factor, $r > 0$: control penalty, and
$$\nu(\cdot) \triangleq \gamma\Big(\frac{1}{N}\sum_{k=1}^{N} z_k(\cdot) + \eta\Big)$$
Main feature: Agents are coupled via their costs
Stochastic tracked process $\nu$:
(i) depends on other agents’ control laws
(ii) not feasible for $z_i$ to track all $z_k$ trajectories for large $N$
Background – Preliminary LQG Tracking Problem
Preliminary LQG Tracking Problem For One Agent Only: $x^*(\cdot)$ known and deterministic
$$dz_i(t) = \big(a_i z_i(t) + b\,u_i(t)\big)\,dt + \sigma_i\,dw_i(t)$$
$$J_i(u_i, x^*) = E \int_0^\infty e^{-\rho t}\Big[\big(z_i(t) - x^*(t)\big)^2 + r\,u_i^2(t)\Big]\,dt$$
Computation of the Optimal Tracking Control:
$$u_i(\cdot) = -\frac{b}{r}\big(\Pi_i z_i(\cdot) + s_i(\cdot)\big)$$
Riccati Equation:
$$\rho\,\Pi_i = 2 a_i \Pi_i - \frac{b^2}{r}\Pi_i^2 + 1, \quad \Pi_i > 0$$
Mass Offset Control:
$$-\frac{ds_i}{dt} = -\rho\,s_i + a_i s_i - \frac{b^2}{r}\Pi_i s_i - x^*$$
Boundedness condition on $x^*(\cdot)$ implies existence of a unique solution $s_i$
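The Riccati equation on this slide is a scalar quadratic in $\Pi_i$, so its positive root can be written in closed form. The following is a minimal sketch, not part of the original talk: the function names `riccati_root` and `constant_offset` are hypothetical, and the second function assumes the special case of a constant tracked signal $x^*$, for which the bounded solution of the offset equation is the stationary one.

```python
import math

def riccati_root(a, b, r, rho):
    """Positive root of rho*Pi = 2*a*Pi - (b**2/r)*Pi**2 + 1.

    Rearranged as (b^2/r)*Pi^2 + (rho - 2a)*Pi - 1 = 0 and solved with
    the quadratic formula, keeping the '+' branch so that Pi > 0.
    """
    k = b * b / r
    c = rho - 2.0 * a
    return (-c + math.sqrt(c * c + 4.0 * k)) / (2.0 * k)

def constant_offset(a, b, r, rho, x_star):
    """Stationary offset s for a constant x* (an assumption of this sketch).

    Setting ds/dt = 0 in -ds/dt = -rho*s + a*s - (b^2/r)*Pi*s - x*
    gives (rho - a + (b^2/r)*Pi) * s = -x*.
    """
    Pi = riccati_root(a, b, r, rho)
    return -x_star / (rho - a + (b * b / r) * Pi)

if __name__ == "__main__":
    Pi = riccati_root(a=0.5, b=1.0, r=2.0, rho=0.1)
    s = constant_offset(a=0.5, b=1.0, r=2.0, rho=0.1, x_star=1.0)
    print("Pi =", Pi, "s =", s)
```

The resulting feedback is then $u_i = -\frac{b}{r}(\Pi_i z_i + s_i)$, with the gain depending only on the agent's own parameters and the offset depending only on the tracked signal.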
Background – The Fundamental MFG-LQG System
Continuum of Systems under Optimal LQG Tracking Control: $a \in A$; common $b$ for simplicity
Tracking mass equation:
$$-\frac{ds_a}{dt} = -\rho\,s_a + a\,s_a - \frac{b^2}{r}\Pi_a s_a - x^*$$
The mean state equation:
$$\frac{d\bar z_a}{dt} = \Big(a - \frac{b^2}{r}\Pi_a\Big)\bar z_a - \frac{b^2}{r}s_a$$
The mean field function:
$$\bar z(t) = \int_A \bar z_a(t)\,dF(a)$$
The mass function:
$$x^*(t) = \gamma\big(\bar z(t) + \eta\big), \quad t \ge 0$$
Riccati Equation:
$$\rho\,\Pi_a = 2a\,\Pi_a - \frac{b^2}{r}\Pi_a^2 + 1, \quad \Pi_a > 0$$
$F(\cdot)$: the limit empirical distribution of $\{a_i : i \ge 1\} \subset A$
Individual control action $u_a = -\frac{b}{r}(\Pi_a z_a + s_a)$ is optimal w.r.t. the tracked $x^*$
Does there exist a solution $(\bar z_a, s_a, x^*;\ a \in A)$? Yes: Fixed Point Theorem
Background – Properties of MFG-LQG Solution
Theorem (HCM’03,’07)
Subject to technical conditions, the MFG system has a unique solution for which the resulting set of MFG controls
$$U^N_{mf} = \Big\{ u_i^0 = -\frac{b}{r}\big(\Pi_i z_i + s_i\big);\ 1 \le i \le N \Big\}, \quad 1 \le N < \infty$$
yields an $\epsilon$-Nash equilibrium for all $\epsilon$, i.e. $\forall \epsilon > 0\ \exists N(\epsilon)$ s.t. $\forall N \ge N(\epsilon)$
$$J_i(u_i^0, u_{-i}^0) - \epsilon \le \inf_{u_i} J_i(u_i, u_{-i}^0) \le J_i(u_i^0, u_{-i}^0)$$
where $u_i$ is adapted to the set of full information admissible controls.
[Figure: saddle surface over axes x and y. Agent x is a minimizer; agent y is a maximizer.]
Background – Properties of MFG-LQG Solution
Counterintuitive Nature of MFG controls:
• Intrinsically decentralized
• Agent’s feedback = feedback of agent’s local stochastic state + feedback of deterministic precomputable mass (No communication among agents!)
Applying MFG Controls to the Finite Population System:
• An $\epsilon$-Nash equilibrium (with respect to all possible controls within the full information pattern) exists between the individuals of a large-$N$ population system, with $\epsilon \to 0$ as $N$ goes to infinity
Background – Standard Consensus Algorithms
Definition
A consensus process is a process for achieving an agreement among the members of a group of agents on some common state property such as velocity or information.
Standard Consensus Algorithms (SCAs): A network of $N$ agents with dynamics
$$dz_i(t) = u_i(t)\,dt, \quad t \ge 0, \quad 1 \le i \le N,$$
where an agreement is achieved via local communications with their neighbours based on the network topology $G = (V, E)$ ($V$: the set of vertices, $E \subset V \times V$: an ordered set of edges)
Background – Standard Consensus Algorithms
Time-Invariant SCAs:
$$dz_i(t) = \sum_{j \in N_i} a_{ij}\big(z_j(t) - z_i(t)\big)\,dt, \quad t \ge 0, \quad 1 \le i \le N$$
where $N_i = \{ j \in V : (i, j) \in E \}$.
Definition
Consensus is said to be achieved asymptotically for a group of $N$ agents if $\lim_{t \to \infty} |z_i(t) - z_j(t)| = 0$ for any $i$ and $j$, $1 \le i \ne j \le N$.
Theorem (see e.g. Ren et al. ’05)
If the undirected graph $G$ is connected (i.e., there is a path between every pair of nodes), then
• the system achieves consensus asymptotically as time goes to infinity
• the consensus value is the average of the initial states, $\frac{1}{N}\sum_{j=1}^{N} z_j(0)$.
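The theorem above can be illustrated numerically. The following is a minimal sketch, assuming unit weights $a_{ij} = 1$ on an undirected path graph with four nodes and a forward-Euler discretization; the function name `simulate_consensus`, the step size, and the graph are illustrative choices, not taken from the talk.

```python
def simulate_consensus(z0, edges, dt=0.01, steps=5000):
    """Forward-Euler integration of dz_i/dt = sum_{j in N_i} (z_j - z_i).

    `edges` lists undirected edges (i, j); every weight a_ij is 1,
    which is an assumption of this sketch.
    """
    n = len(z0)
    neighbours = [[] for _ in range(n)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    z = list(z0)
    for _ in range(steps):
        # Each agent moves toward its neighbours' states.
        dz = [sum(z[j] - z[i] for j in neighbours[i]) for i in range(n)]
        z = [z[i] + dt * dz[i] for i in range(n)]
    return z

if __name__ == "__main__":
    z0 = [1.0, 4.0, -2.0, 5.0]        # initial states, average 2.0
    path = [(0, 1), (1, 2), (2, 3)]   # connected undirected path graph
    print(simulate_consensus(z0, path))
```

Because the weights are symmetric, the sum of the states is preserved at every step, which is why the common limit is the initial average. Removing an edge so that the graph becomes disconnected makes each component converge to its own average instead.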
Why MFG Consensus Formulation?
• The connectivity of the network structure needed for the SCAs (even for the less demanding “frequently connected” hypotheses) may not hold
• Communication is costly and may be distorted
• SCAs are fragile in the presence of noise in the agents’ dynamics
• MFG approach with no communication but prior statistical information
• In this approach we seek to synthesize the collective behaviour of the group from fundamental principles
MFG Consensus Formulation – Homogeneous Case
Dynamics:
$$dz_i(t) = u_i(t)\,dt + \sigma\,dw_i(t), \quad t \ge 0, \quad 1 \le i \le N$$
Cost Functions:
$$J_i^N(u_i, u_{-i}) := E \int_0^\infty e^{-\rho t}\Big[\Big(z_i(t) - \frac{1}{N-1}\sum_{j=1,\,j \ne i}^{N} z_j(t)\Big)^2 + r\,u_i^2(t)\Big]\,dt$$
$N$: population size, $z_i$: state of agent $i$, $u_i$: control input, $w_i$: disturbance (standard Wiener process), $\rho > 0$: discount factor, $r > 0$: control penalty
• Each agent in the group seeks a strategy to be as close as possible to the average of the population
• Let $F(\cdot)$ be the limit empirical distribution of $\{z_i(0) : i \ge 1\} \subset C$
MFG Consensus Solution – Homogeneous Case
Mean Field Game System of the Consensus Formulation:
• Computation of Best Response Control for a Generic Agent with Initial $\alpha \in C$ and Mass Trajectory $\phi(\cdot)$:
Best Response Control:
$$u_\alpha^o(t) = -\frac{1}{r}\big(p\,z_\alpha(t) + s(t)\big)$$
Riccati Equation:
$$p^2 + r\rho\,p - r = 0 \;\Rightarrow\; p = \frac{-r\rho + \sqrt{(r\rho)^2 + 4r}}{2}$$
Tracking equation:
$$\frac{ds(t)}{dt} = \Big(\rho + \frac{p}{r}\Big)s(t) + \phi(t)$$
• Mass behavior equation in the consensus formulation under $u_\alpha^o(\cdot)$:
The generic agent process:
$$dz_\alpha(t) = -\frac{1}{r}\big(p\,z_\alpha(t) + s(t)\big)\,dt + \sigma\,dw_i(t)$$
The mean state equation:
$$\frac{d\bar z_\alpha(t)}{dt} = -\frac{1}{r}\big(p\,\bar z_\alpha(t) + s(t)\big), \quad \bar z_\alpha(0) = \alpha$$
The mass function:
$$\phi(t) = \int_C \bar z_\alpha(t)\,dF(\alpha), \quad t \ge 0$$
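One way to build intuition for this system is to test a candidate solution numerically. The sketch below assumes a constant mass trajectory equal to the initial mean and the constant offset $s = -p\,\phi(0)$, replaces the distribution $F$ by a finite sample of initial states, and integrates the mean state equation with forward Euler. The names `riccati_p` and `check_constant_solution`, the sample, and the parameter values are all hypothetical.

```python
import math

def riccati_p(r, rho):
    """Positive root of p^2 + r*rho*p - r = 0."""
    return (-r * rho + math.sqrt((r * rho) ** 2 + 4.0 * r)) / 2.0

def check_constant_solution(alphas, r=1.0, rho=0.5, T=10.0, dt=0.001):
    """Integrate d(zbar)/dt = -(p*zbar + s)/r for each initial state alpha.

    Assumptions of this sketch: F is the empirical distribution of
    `alphas`, the mass is held at phi0 = mean(alphas), and s = -p*phi0.
    """
    p = riccati_p(r, rho)
    phi0 = sum(alphas) / len(alphas)
    s = -p * phi0
    zbar = list(alphas)
    for _ in range(int(round(T / dt))):
        zbar = [z + dt * (-(p * z + s) / r) for z in zbar]
    # Residual of the tracking equation at the constant candidate (ds/dt = 0).
    residual = (rho + p / r) * s + phi0
    return p, phi0, s, zbar, residual

if __name__ == "__main__":
    p, phi0, s, zbar, residual = check_constant_solution([-3.0, 0.0, 1.0, 6.0])
    print("p =", p, "phi0 =", phi0, "mean(zbar) =", sum(zbar) / len(zbar))
    print("tracking residual =", residual)
```

The residual vanishes because the Riccati equation gives $p(\rho + p/r) = 1$, and the average of the mean states stays at $\phi(0)$ because the drift terms cancel when summed over the sample. Under these assumptions the constant pair is therefore self-consistent.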
MFG Consensus Solution – Homogeneous Case
Theorem (NCMH’10)
The unique solution of the MFG system: $(s(t), \phi(t)) = (-p\,\phi(0), \phi(0))$, $t \ge 0$.
Applying the MFG control $u_i^o(t) = -\frac{p}{r}\big(z_i(t) - \phi(0)\big)$ yields:
$$z_i^o(t) = \phi(0) + e^{-\frac{p}{r}t}\big(z_i(0) - \phi(0)\big) + \sigma \int_0^t e^{-\frac{p}{r}(t-\tau)}\,dw_i(\tau), \quad t \ge 0.$$
Definition
Mean-consensus is said to be achieved asymptotically for a group of $N$ agents if $\lim_{t \to \infty} |\bar z_i(t) - \bar z_j(t)| = 0$ for any $i$ and $j$, $1 \le i \ne j \le N$.
Theorem (NCMH’10)
(i) A mean-consensus is reached asymptotically as time goes to infinity, with individual asymptotic variance $\frac{\sigma^2 r}{2p}$.
(ii) The set of MFG control strategies $\{u_i^o : 1 \le i \le N\}$ generates an $\epsilon_N$-Nash equilibrium such that $\lim_{N \to \infty} \epsilon_N = 0$.
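The closed-loop behaviour under the MFG control can be sketched with an Euler–Maruyama simulation. This is an illustrative sketch, not the simulation used in the talk: the function name `simulate_mfg_consensus`, the uniform initial distribution, the parameter values, and the use of the sample mean of the initial states as a stand-in for the initial mass are all assumptions.

```python
import math
import random

def simulate_mfg_consensus(n_agents=500, r=1.0, rho=0.5, sigma=1.0,
                           T=20.0, dt=0.02, seed=0):
    """Euler-Maruyama for dz_i = -(p/r)*(z_i - phi0)*dt + sigma*dw_i.

    Each agent uses only its own state and the precomputed constant phi0;
    no agent reads another agent's state during the simulation.
    """
    rng = random.Random(seed)
    p = (-r * rho + math.sqrt((r * rho) ** 2 + 4.0 * r)) / 2.0
    z = [rng.uniform(-10.0, 10.0) for _ in range(n_agents)]
    phi0 = sum(z) / n_agents  # assumption: sample mean stands in for the initial mass
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(round(T / dt))):
        z = [zi - (p / r) * (zi - phi0) * dt
             + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
             for zi in z]
    # Spread of the final states around phi0, to compare with sigma^2*r/(2p).
    var = sum((zi - phi0) ** 2 for zi in z) / n_agents
    return p, phi0, z, var

if __name__ == "__main__":
    p, phi0, z, var = simulate_mfg_consensus()
    print("empirical variance:", var)
    print("predicted sigma^2 * r / (2p):", 1.0 * 1.0 / (2.0 * p))
```

With these settings the initial spread has decayed almost completely by the final time, so the remaining dispersion comes from the noise alone and should be close to the asymptotic variance stated in part (i) of the theorem, up to sampling and discretization error.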
MFG Consensus Solution – Homogeneous Case
Simulation Result (500 agents)
[Figure: (A) Trajectories of agents’ states, (B) Histogram of the system at time t = 20]