
A Mean Field Games Formulation of Network Based Auction Dynamics

A Mean Field Games Formulation of Network Based Auction Dynamics. Peter E. Caines, McGill University. Information and Control in Networks, Lund, October 2012. Joint work with Peng Jia. Co-Authors: Minyi Huang, Roland Malhamé, Peng Jia.


  1. A Mean Field Games Formulation of Network Based Auction Dynamics. Peter E. Caines, McGill University. Information and Control in Networks, Lund, October 2012. Joint work with Peng Jia.

  2. Co-Authors: Minyi Huang, Roland Malhamé, Peng Jia

  3. Collaborators & Students: Arman Kizilkale, Arthur Lazarte, Zhongjing Ma, Mojtaba Nourian

  4. Basic Ideas of Mean Field Games

  5. Part 1 – CDMA Power Control: Base Station & Individual Agents

  6. Part 1 – CDMA Power Control
     Lognormal channel attenuation for the $i$th channel:
     $$dx_i = -a (x_i + b)\,dt + \sigma\,dw_i, \quad 1 \le i \le N$$
     Transmitted power = channel attenuation × power = $e^{x_i(t)} p_i(t)$ (Charalambous, Menemenlis; 1999)
     Signal to interference ratio (Agent $i$) at the base station:
     $$\mathrm{SIR}_i = \frac{e^{x_i} p_i}{(\beta/N) \sum_{j \ne i}^{N} e^{x_j} p_j + \eta}$$
     How to optimize all the individual SIRs? It is self-defeating for everyone to increase their power.
     Humans display the “Cocktail Party Effect”: tuning hearing to the frequency of a friend’s voice (E. Colin Cherry).
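The self-defeating nature of a collective power increase can be checked numerically. The sketch below is only an illustration of the SIR formula on this slide; the function name `sir` and the sample values of `beta` and `eta` are assumptions made for the example, not taken from the talk.

```python
import math

def sir(x, p, beta, eta):
    """Return SIR_i for every agent i (illustrative helper, not from the talk).

    x[i] is the log-attenuation state and p[i] the power of agent i, so
    exp(x[i]) * p[i] is the term appearing in the numerator for agent i.
    """
    n = len(x)
    received = [math.exp(xi) * pi for xi, pi in zip(x, p)]
    total = sum(received)
    # Interference seen by agent i: (beta / N) * sum over j != i, plus noise eta.
    return [r / ((beta / n) * (total - r) + eta) for r in received]

# Two identical agents: doubling every power does not double anyone's SIR.
base = sir([0.0, 0.0], [1.0, 1.0], beta=1.0, eta=0.5)
doubled = sir([0.0, 0.0], [2.0, 2.0], beta=1.0, eta=0.5)
```

With these assumed numbers each agent's SIR rises from 1 to 4/3 when both double their power, well short of the factor of two either would gain by acting alone.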

  7. Part 1 – CDMA Power Control
     Can maximize $\sum_{i=1}^{N} \mathrm{SIR}_i$ with centralized control (HCM, 2004).
     Since centralized control is not feasible for complex systems, how can such systems be optimized using decentralized control?
     Idea: Use large population properties of the system together with basic notions of game theory.
     Massive game theoretic control systems: large ensembles of partially regulated competing agents.
     Fundamental issue: the relation between the actions of each individual agent and the resulting mass behavior.

  8. Part 2 – Basic LQG Game Problem
     Individual Agent’s Dynamics (scalar case only, for simplicity of notation):
     $$dx_i = (a_i x_i + b u_i)\,dt + \sigma_i\,dw_i, \quad 1 \le i \le N$$
     $x_i$: state of the $i$th agent
     $u_i$: control
     $w_i$: disturbance (standard Wiener process)
     $N$: population size

  9. Part 2 – Basic LQG Game Problem
     Individual Agent’s Cost:
     $$J_i(u_i, \nu) \triangleq E \int_0^\infty e^{-\rho t} \left[ (x_i - \nu)^2 + r u_i^2 \right] dt$$
     Basic case: $\nu \triangleq \gamma \left( \frac{1}{N} \sum_{k \ne i}^{N} x_k + \eta \right)$
     Main features:
     Agents are coupled via their costs.
     Tracked process $\nu$: (i) stochastic; (ii) depends on other agents’ control laws; (iii) not feasible for $x_i$ to track all $x_k$ trajectories for large $N$.

  10. Part 2 – Large Population Models with Game Theory Features
      Economic models: Cournot-Nash equilibria (Lambson)
      Advertising competition: game models (Erickson)
      Wireless network resource allocation: (Alpcan et al., Altman, HCM)
      Admission control in communication networks: (Ma, MC)
      Public health: voluntary vaccination games (Bauch & Earn)
      Biology: stochastic PDE swarming models (Bertozzi et al.)
      Sociology: urban economics (Brock and Durlauf et al.)
      Renewable energy: charging control of PEVs (Ma et al.)

  11. Part 2 – Preliminary Optimal LQG Tracking
      LQG Tracking: Take $x^*$ (bounded continuous) for the scalar model:
      $$dx_i = a_i x_i\,dt + b u_i\,dt + \sigma_i\,dw_i$$
      $$J_i(u_i, x^*) = E \int_0^\infty e^{-\rho t} \left[ (x_i - x^*)^2 + r u_i^2 \right] dt$$
      Riccati Equation: $\rho \Pi_i = 2 a_i \Pi_i - \frac{b^2}{r} \Pi_i^2 + 1$, $\Pi_i > 0$
      Set $\beta_1 = -a_i + \frac{b^2}{r} \Pi_i$, $\beta_2 = -a_i + \frac{b^2}{r} \Pi_i + \rho$, and assume $\beta_1 > 0$.
      Mass Offset Control: $\rho s_i = \frac{ds_i}{dt} + a_i s_i - \frac{b^2}{r} \Pi_i s_i - x^*$
      Optimal Tracking Control: $u_i = -\frac{b}{r} (\Pi_i x_i + s_i)$
      The boundedness condition on $x^*$ implies the existence of a unique solution $s_i$.
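The scalar Riccati equation on this slide is a quadratic in $\Pi_i$, so its positive root has a closed form. The sketch below works this out; the function names `riccati_gain` and `betas` and the sample parameter values are assumptions chosen for illustration only.

```python
import math

def riccati_gain(a, b, r, rho):
    """Positive root Pi of rho*Pi = 2*a*Pi - (b**2/r)*Pi**2 + 1."""
    k = b * b / r
    c = rho - 2.0 * a  # rearranged form: k*Pi**2 + c*Pi - 1 = 0
    # The product of the two roots is -1/k < 0, so exactly one root is positive.
    return (-c + math.sqrt(c * c + 4.0 * k)) / (2.0 * k)

def betas(a, b, r, rho):
    """Return (beta_1, beta_2) as defined on the slide."""
    beta1 = -a + (b * b / r) * riccati_gain(a, b, r, rho)
    return beta1, beta1 + rho

# Sample (assumed) parameters: a = 1, b = 1, r = 1, rho = 0.5.
gain = riccati_gain(1.0, 1.0, 1.0, 0.5)
beta1, beta2 = betas(1.0, 1.0, 1.0, 0.5)
```

For these sample values the gain is 2, giving $\beta_1 = 1$ and $\beta_2 = 1.5$, so the slide's assumption $\beta_1 > 0$ holds. It does not hold for every parameter choice, which is why the slide states it as an assumption.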

  12. Part 2 – Key Intuition
      When the tracked signal is replaced by the deterministic mean state of the mass of agents:
      Agent’s feedback = feedback of agent’s local stochastic state + feedback of deterministic mass offset
      Think Globally, Act Locally (Geddes, Alinsky, Rudie-Wonham)

  13. Part 2 – LQG-NCE Equation Scheme
      The Fundamental NCE Equation System
      Continuum of Systems: $a \in \mathcal{A}$; common $b$ for simplicity
      $$\rho s_a = \frac{ds_a}{dt} + a s_a - \frac{b^2}{r} \Pi_a s_a - x^*$$
      $$\frac{d\bar{x}_a}{dt} = \left( a - \frac{b^2}{r} \Pi_a \right) \bar{x}_a - \frac{b^2}{r} s_a$$
      $$\bar{x}(t) = \int_{\mathcal{A}} \bar{x}_a(t)\,dF(a)$$
      $$x^*(t) = \gamma (\bar{x}(t) + \eta), \quad t \ge 0$$
      Riccati Equation: $\rho \Pi_a = 2 a \Pi_a - \frac{b^2}{r} \Pi_a^2 + 1$, $\Pi_a > 0$
      Individual control action $u_a = -\frac{b}{r} (\Pi_a x_a + s_a)$ is optimal w.r.t. tracked $x^*$.
      Does there exist a solution $(\bar{x}_a, s_a, x^*; a \in \mathcal{A})$? Yes: Fixed Point Theorem.
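The fixed-point structure of the NCE system can be illustrated in a heavily simplified setting. The sketch below assumes a population with a single parameter value $a$ (so $F$ is a point mass) and looks only for a time-invariant solution, in which case the offset and mean-state equations reduce to $s = -x^*/\beta_2$ and $\bar{x} = -(b^2/r)\,s/\beta_1$. These restrictions, the function name `steady_state_nce`, and the sample parameters are assumptions made for the example; the talk treats the general time-varying continuum case.

```python
import math

def steady_state_nce(a, b, r, rho, gamma, eta, tol=1e-12, max_iter=10_000):
    """Iterate x* -> gamma * (x_bar(x*) + eta) until it stops changing.

    Assumes one agent type and a constant tracked signal x*.
    Returns (x_star, x_bar, s).
    """
    k = b * b / r
    c = rho - 2.0 * a
    pi = (-c + math.sqrt(c * c + 4.0 * k)) / (2.0 * k)  # positive Riccati root
    beta1 = -a + k * pi
    beta2 = beta1 + rho
    x_star = 0.0
    for _ in range(max_iter):
        s = -x_star / beta2      # steady state of the mass offset equation
        x_bar = -k * s / beta1   # steady state of the mean state equation
        new_x_star = gamma * (x_bar + eta)
        if abs(new_x_star - x_star) < tol:
            return new_x_star, x_bar, s
        x_star = new_x_star
    raise RuntimeError("fixed-point iteration did not converge")

# Sample (assumed) parameters for which the iteration is a contraction.
x_star, x_bar, s = steady_state_nce(1.0, 1.0, 1.0, 0.5, gamma=0.6, eta=0.25)
```

With these sample values the map has gain $\gamma (b^2/r) / (\beta_1 \beta_2) = 0.4 < 1$, so the iteration contracts to $x^* = 0.25$ and $\bar{x} = 1/6$, matching the closed form $\bar{x} = (b^2/r)\gamma\eta / (\beta_1 \beta_2 - (b^2/r)\gamma)$ obtained by solving the three steady-state relations directly.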

  14. Part 2 – NCE Feedback Control
      Proposed MF Solution to the Large Population LQG Game Problem
      The Finite System of $N$ Agents with Dynamics:
      $$dx_i = a_i x_i\,dt + b u_i\,dt + \sigma_i\,dw_i, \quad 1 \le i \le N, \; t \ge 0$$
      Let $u_{-i} \triangleq (u_1, \cdots, u_{i-1}, u_{i+1}, \cdots, u_N)$; then the individual cost is
      $$J_i(u_i, u_{-i}) \triangleq E \int_0^\infty e^{-\rho t} \left\{ \left[ x_i - \gamma \left( \frac{1}{N} \sum_{k \ne i}^{N} x_k + \eta \right) \right]^2 + r u_i^2 \right\} dt$$
      Algorithm: For the $i$th agent with parameter $(a_i, b)$ compute:
      • $x^*$ using the NCE Equation System
      • $\rho \Pi_i = 2 a_i \Pi_i - \frac{b^2}{r} \Pi_i^2 + 1$, $\quad \rho s_i = \frac{ds_i}{dt} + a_i s_i - \frac{b^2}{r} \Pi_i s_i - x^*$, $\quad u_i = -\frac{b}{r} (\Pi_i x_i + s_i)$

  15. Part 2 – Saddle Point Nash Equilibrium
      [Figure: saddle surface plotted over the (x, y) plane. Agent y is a maximizer; Agent x is a minimizer.]

  16. Part 2 – Nash Equilibrium
      The Information Pattern:
      $$\mathcal{F}_N \triangleq \sigma(x_j(\tau); \tau \le t, 1 \le j \le N), \qquad \mathcal{F}_i \triangleq \sigma(x_i(\tau); \tau \le t)$$
      $\mathcal{F}_N$ adapted control: $\mathcal{U}$
      $\mathcal{F}_i$ adapted control: $\mathcal{U}_{loc,i}$
      The Equilibria: The set of controls $\mathcal{U}^0 = \{u_i^0; u_i^0 \text{ adapted to } \mathcal{U}_{loc,i}, 1 \le i \le N\}$ generates a Nash Equilibrium w.r.t. the costs $\{J_i; 1 \le i \le N\}$ if, for each $i$,
      $$J_i(u_i^0, u_{-i}^0) = \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^0)$$

  17. Part 2 – ε-Nash Equilibrium
      ε-Nash Equilibria: Given $\varepsilon > 0$, the set of controls $\mathcal{U}^0 = \{u_i^0; 1 \le i \le N\}$ generates an $\varepsilon$-Nash Equilibrium w.r.t. the costs $\{J_i; 1 \le i \le N\}$ if, for each $i$,
      $$J_i(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^0) \le J_i(u_i^0, u_{-i}^0)$$

  18. Part 2 – NCE Control: First Main Result
      Theorem 1 (MH, PEC, RPM, 2003): Subject to technical conditions, the NCE Equations have a unique solution for which the NCE Control Algorithm generates a set of controls $\mathcal{U}_{nce}^N = \{u_i^0; 1 \le i \le N\}$, $1 \le N < \infty$, where
      $$u_i^0 = -\frac{b}{r} (\Pi_i x_i + s_i),$$
      which are such that:
      (i) All agent systems $S(A_i)$, $1 \le i \le N$, are second order stable.
      (ii) $\{\mathcal{U}_{nce}^N; 1 \le N < \infty\}$ yields an $\varepsilon$-Nash equilibrium for all $\varepsilon$, i.e. $\forall \varepsilon > 0$ $\exists N(\varepsilon)$ s.t. $\forall N \ge N(\varepsilon)$
      $$J_i(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^0) \le J_i(u_i^0, u_{-i}^0),$$
      where $u_i \in \mathcal{U}$ is adapted to $\mathcal{F}_N$.

  19. Network Based Auctions and Applications of MFG

  20. Part 3 – Network Based Auction: Overview
      Game theoretic methods for market pricing and resource allocation on distributed networks
      Two-level network structure
      Lower level: quantized progressive second price auctions with fixed local quantities
      Higher level: cooperative consensus allocation of local quantities
      Convergence and efficiency analysis of network based auctions
      Applications of Mean Field Games to auctions and networks

  21. Part 3 – ISO / RTO

  22. Part 3 – Hydro-Québec
      60 hydroelectric generating stations
      36,971 MW installed capacity
      175 TWh storage capacity
      579 dams, 97 control structures
      www.hydroforthefuture.com

  23. Part 3 – Worldwide Examples of Extreme Price Volatility
      Illinois [1], East US [2], Ontario [1], The Netherlands [1], New Zealand [3], West Texas [4]
      [1] Cho & Meyn, 2010
      [2] http://www.ferc.gov
      [3] http://www.treasury.govt.nz
      [4] Giberson, 2008

  24. Part 3 – Quantized PSP Auctions (Jia & Caines 2011)
      A non-cooperative game; $N$ buyer agents bid for a divisible resource $C$.
      Given a finite price set $B_p^0$, each buyer agent $BA_i$ makes a quantized bid: $s_i = (p_i, q_i) = (\text{price}, \text{quantity})$, $p_i \in B_p^0$.
      A bid profile is $s = (s_1, \cdots, s_N)$.
      $\theta_i : \mathbb{R}_+ \to \mathbb{R}_+$ is the valuation function, and $\theta_i'$ is the (decreasing) demand function.
      A market price function (MPF) for $BA_i$ is
      $$P_i(z, s_{-i}) = \inf \left\{ y \ge 0 : C - \sum_{p_k > y,\, k \ne i} q_k \ge z \right\}.$$
      Objective: Design a market mechanism (i.e., an assignment of allocations) and find a bidding rule for each agent which individually maximizes its utility function, leads to a Nash equilibrium, and is socially efficient (i.e., maximizes the sum of individual utilities).
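For a finite bid profile the market price function can be evaluated exactly, because the quantity left over after all bids priced above $y$ is a nondecreasing step function of $y$ that only changes at bid prices. The infimum is therefore either 0 or one of the opposing prices. The sketch below uses that observation; the function name `market_price`, the representation of $s_{-i}$ as a list of `(price, quantity)` pairs, and the sample bids are assumptions made for illustration.

```python
import math

def market_price(z, others, capacity):
    """P_i(z, s_{-i}) = inf{ y >= 0 : capacity - sum_{p_k > y} q_k >= z }.

    `others` holds the opposing bids s_{-i} as (price, quantity) pairs,
    with nonnegative prices. Returns math.inf when z exceeds the capacity,
    since the set of admissible y is then empty.
    """
    candidates = sorted({0.0} | {p for p, _ in others})
    for y in candidates:
        remaining = capacity - sum(q for p, q in others if p > y)
        if remaining >= z:
            return y
    return math.inf

# Sample (assumed) opposing bids for a resource of size C = 10.
others = [(3.0, 4.0), (5.0, 4.0)]
prices = [market_price(z, others, 10.0) for z in (2.0, 3.0, 7.0, 11.0)]
```

With these sample bids, requesting 2 units costs nothing at the margin, 3 units requires outbidding the price-3 bidder, 7 units requires outbidding the price-5 bidder, and 11 units is infeasible.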

  25. Part 3 – PSP Mechanism (celebrated VCG mechanism)
      The PSP allocation rule and cost function are defined as:
      $$a_i(s) = a_i((p_i, q_i), s_{-i}) = \min \left\{ q_i, \; \frac{q_i}{\sum_{k : p_k = p_i} q_k} \, Q_i(p_i, s_{-i}) \right\}$$
      (reasonable: MPF constrained allocation)
      $$c_i(s) = \sum_{j \ne i} p_j \left[ a_j((0, 0), s_{-i}) - a_j(s_i, s_{-i}) \right]$$
      (reasonable: corresponding to opportunity costs)
      where $Q_i(y, s_{-i})$ is the available quantity at price $y$ given $s_{-i}$.
      Then $BA_i$’s utility function is $u_i(s) = \theta_i(a_i(s)) - c_i(s)$.
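A small numerical sketch can make the allocation and cost rules concrete. The slide does not spell out $Q_i$, so the code below assumes $Q_i(y, s_{-i}) = [C - \sum_{k \ne i,\, p_k > y} q_k]^+$, i.e. whatever capacity remains after strictly higher-priced opposing bids. That definition, the function names, and the sample bids and valuation are assumptions for illustration, not the authors' implementation.

```python
def available(y, others, capacity):
    """Assumed Q_i(y, s_{-i}): capacity left after strictly higher-priced bids."""
    return max(capacity - sum(q for p, q in others if p > y), 0.0)

def allocations(bids, capacity):
    """Return a_i(s) for every bidder; `bids` is a list of (price, quantity)."""
    result = []
    for i, (p_i, q_i) in enumerate(bids):
        others = bids[:i] + bids[i + 1:]
        tied = sum(q for p, q in bids if p == p_i)  # includes q_i itself
        share = q_i / tied if tied > 0 else 0.0
        result.append(min(q_i, share * available(p_i, others, capacity)))
    return result

def cost(i, bids, capacity):
    """c_i(s): what the others lose, valued at their own bid prices."""
    without_i = list(bids)
    without_i[i] = (0.0, 0.0)  # bidder i replaced by the null bid (0, 0)
    a_without = allocations(without_i, capacity)
    a_with = allocations(bids, capacity)
    return sum(bids[j][0] * (a_without[j] - a_with[j])
               for j in range(len(bids)) if j != i)

def utility(i, bids, capacity, theta_i):
    """u_i(s) = theta_i(a_i(s)) - c_i(s)."""
    return theta_i(allocations(bids, capacity)[i]) - cost(i, bids, capacity)

# Sample (assumed) data: three bidders competing for C = 10 units.
bids = [(5.0, 6.0), (3.0, 6.0), (1.0, 6.0)]
alloc = allocations(bids, 10.0)
costs = [cost(i, bids, 10.0) for i in range(3)]
u0 = utility(0, bids, 10.0, lambda q: 10.0 * q - 0.5 * q * q)
```

In this example the highest bidder receives its full 6 units and pays 10, which is exactly the value (at their own bid prices) of the 2 + 4 units the other two bidders would have received had it stayed out. The lowest bidder receives nothing and pays nothing.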

  26. Part 3 – Best Reply
      Given $s_{-i}$ and elastic $\theta_i'$, utility maximization implies the best (bid) reply:
      $$v_i = \left[ \sup \left\{ q \ge 0 : \theta_i'(q) > P_i(q, s_{-i}) \right\} \right]^+, \qquad w_i = \theta_i'(v_i) \in \mathbb{R}_+$$

  27. Part 3 – Quantized Strategies
      A generic buyer, e.g., Agent 2:
      Applies the same utility function and allocation rule as PSP.
      Makes the quantized price and quantity bid
      $$p_i^k \in B_p^0, \qquad q_i^k = (\theta_i')^{-1}(p_i^k), \qquad 1 \le i \le N, \; k \ge 0,$$
      where there is no bid fee.
      Bids are made synchronously.
