Network Games: On the Extension of Nash's Theory to Networks
Lacra Pavel*
Systems Control Group, University of Toronto
* Based on joint work with Dian Gadjov, Peng Yi, and Farzad Salehisadaghiani
NecSys 2018, Groningen, 27-28 August 2018

Motivation
- Social networks
- Wireless ad-hoc networks

What is a Game?
Setup:
- A number of players/agents (2 or more), or decision-makers.
- Each player has a finite number of choices or a continuum of choices (actions/decisions).
- At the end of the game there is some payoff to be gained or cost to be paid by each player.
In standard optimization: one decision-maker who aims to optimize some objective function (goal).
In a game: multiple decision-makers (self-interested players/agents), each with its own individual goal (cost/payoff).

What is a Game?
Setup: a set of $N$ players/agents, $V = \{1, \dots, N\}$. Each player $i \in V$:
- has an action $x_i$ selected from its action set $\Omega_i$ (finite or continuous);
- has an individual payoff (utility) $U_i$ or cost function $J_i$;
- takes an action $x_i$ to maximize its own payoff $U_i$, which is equivalent to minimizing its own cost (loss) function $J_i$.
A player's success in making decisions depends on the others' decisions: its cost is $J_i(x)$, where $x = (x_1, \dots, x_N)$ is the action profile of all players. This is also written $J_i(x_i, x_{-i})$, where $x = (x_i, x_{-i})$ and $x_{-i}$ denotes the actions of all players except player $i$.
Denote such a game by $G(V, \Omega_i, J_i)$.
Game theory = the mathematical framework that studies the strategic interaction among multiple decision-makers.

Solution Concepts
A player is rational if it makes choices that optimize its expected utility (minimize its expected cost).
A strategy can be regarded as a rule for choosing an action, e.g.:
- Security strategy: minimize your own maximum (worst-case) expected cost.
- Minimax solution: the outcome when each player uses a security strategy.
Prisoners' Dilemma Game example:
- Action choices: "Confess" (defect) or "Not Confess".
- Cost matrix (years in prison), with rows for player 1 and columns for player 2, both ordered "Confess", "Not Confess":
$$M = \begin{pmatrix} (5, 5) & (0, 15) \\ (15, 0) & (1, 1) \end{pmatrix}$$
- "Confess" is the security strategy for either player.
- The minimax solution is ("Confess", "Confess") ⇒ (5, 5) years in prison.

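The security-strategy computation for the Prisoners' Dilemma can be checked with a short sketch. It assumes pure strategies only and the row/column convention that entry [a1][a2] holds (cost to player 1, cost to player 2); the names `COSTS` and `security_strategy` are illustrative, not from the slides.

```python
# Cost matrix from the slide: COSTS[a1][a2] = (cost to player 1, cost to player 2).
# Assumed indexing: 0 = "Confess", 1 = "Not Confess".
ACTIONS = ["Confess", "Not Confess"]
COSTS = [
    [(5, 5), (0, 15)],
    [(15, 0), (1, 1)],
]

def security_strategy(costs, player):
    """Return (action index, worst-case cost) of the pure security strategy
    for `player` (0 = row player, 1 = column player)."""
    n_rows, n_cols = len(costs), len(costs[0])
    if player == 0:
        # Worst case over the opponent's columns for each own row.
        worst = [max(costs[a][b][0] for b in range(n_cols)) for a in range(n_rows)]
    else:
        # Worst case over the opponent's rows for each own column.
        worst = [max(costs[b][a][1] for b in range(n_rows)) for a in range(n_cols)]
    best = min(range(len(worst)), key=lambda a: worst[a])
    return best, worst[best]

s1 = security_strategy(COSTS, 0)
s2 = security_strategy(COSTS, 1)
minimax_outcome = COSTS[s1[0]][s2[0]]
print(ACTIONS[s1[0]], ACTIONS[s2[0]], minimax_outcome)
```

Under these assumptions both players pick "Confess", each guaranteeing a cost of at most 5, which matches the (5, 5) minimax outcome stated on the slide.
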
Solution Concepts
Best-response (BR) strategy: play the action that gives you the lowest cost given your opponents' actions/strategies.
Given the opponents' play $x_{-i}$, a BR strategy for player $i$ is $x_i^*$ such that
$$J_i(x_i^*, x_{-i}) \le J_i(x_i, x_{-i}), \quad \forall x_i \in \Omega_i$$
(Nash) equilibrium solution $x^*$: when each player uses a best-response (BR) strategy,
$$J_i(x_i^*, x_{-i}^*) \le J_i(x_i, x_{-i}^*), \quad \forall x_i \in \Omega_i, \ \forall i \in V$$
denoted as $x^* = (x_i^*, x_{-i}^*)$ (action $N$-tuple or profile).
- An equilibrium solution is at the intersection of all BR strategies.
- Key: no player has an incentive to unilaterally change its action => no regret.
- We might expect a set of rational agents to choose an equilibrium solution.

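For a finite two-player game, the equilibrium inequality can be tested directly by enumerating action profiles and checking that no unilateral deviation lowers a player's cost. The sketch below does this for the Prisoners' Dilemma cost matrix used earlier in the talk; the function name and the indexing (0 = "Confess", 1 = "Not Confess") are assumptions made for illustration.

```python
from itertools import product

def pure_nash_equilibria(costs):
    """Enumerate pure-strategy Nash equilibria of a two-player cost game,
    where costs[a1][a2] = (J_1, J_2)."""
    n1, n2 = len(costs), len(costs[0])
    equilibria = []
    for a1, a2 in product(range(n1), range(n2)):
        j1, j2 = costs[a1][a2]
        # Player 1 cannot lower its cost by changing its row, holding a2 fixed.
        p1_ok = all(j1 <= costs[b][a2][0] for b in range(n1))
        # Player 2 cannot lower its cost by changing its column, holding a1 fixed.
        p2_ok = all(j2 <= costs[a1][b][1] for b in range(n2))
        if p1_ok and p2_ok:
            equilibria.append((a1, a2))
    return equilibria

PRISONERS = [[(5, 5), (0, 15)], [(15, 0), (1, 1)]]
print(pure_nash_equilibria(PRISONERS))  # -> [(0, 0)]
```

The only profile that survives both checks is (0, 0), i.e. ("Confess", "Confess"). Brute-force enumeration like this only scales to small finite games; it is meant to make the definition concrete, not to serve as a solver.
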
Solution Concepts
Prisoners' Dilemma Game: ("Confess", "Confess") is in fact the unique equilibrium ⇒ cost (5, 5) years in prison.
$$M = \begin{pmatrix} (5, 5) & (0, 15) \\ (15, 0) & (1, 1) \end{pmatrix}$$
Matching Pennies Game: does NOT have an equilibrium solution in pure strategies!
$$M = \begin{pmatrix} (-1, 1) & (1, -1) \\ (1, -1) & (-1, 1) \end{pmatrix}$$
Randomize choices: if each player chooses Head (H) with 50% probability and Tail (T) with 50% probability, the expected payoff pair is (0, 0) and neither player has regret ⇒ this is the equilibrium solution in randomized (mixed) strategies.
- von Neumann (1928): Every finite 2-player zero-sum game has an equilibrium in mixed strategies.
- J. Nash (1949): Every $N$-player game $G(V, \Omega_i, J_i)$ with finite action sets has an equilibrium in mixed strategies (called a Nash equilibrium (NE)).
- Debreu, Glicksberg, Fan (1952): Every $N$-player infinite game $G(V, \Omega_i, J_i)$ with non-empty, compact, convex $\Omega_i$, and $J_i$ continuous in $x = (x_i, x_{-i})$ and convex in $x_i$, has a pure Nash equilibrium (NE).

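The Matching Pennies claims can be verified numerically. The sketch below reads the matrix entries as payoffs $(U_1, U_2)$ with index 0 = Head and 1 = Tail, which is an assumption about the slide's convention; the helper names are illustrative. Because expected payoff is linear in a player's own mixing probability, checking the two pure deviations is enough to test for equilibrium.

```python
from fractions import Fraction

# Matching Pennies entries, read here as payoffs (U_1, U_2); 0 = Head, 1 = Tail.
PAYOFFS = [
    [(-1, 1), (1, -1)],
    [(1, -1), (-1, 1)],
]

def expected_payoffs(p, q):
    """Expected (U_1, U_2) when player 1 plays Head w.p. p and player 2 w.p. q."""
    probs1 = (p, 1 - p)
    probs2 = (q, 1 - q)
    u1 = sum(probs1[a] * probs2[b] * PAYOFFS[a][b][0] for a in range(2) for b in range(2))
    u2 = sum(probs1[a] * probs2[b] * PAYOFFS[a][b][1] for a in range(2) for b in range(2))
    return u1, u2

def is_equilibrium(p, q):
    """True if neither player can raise its expected payoff by deviating
    unilaterally to a pure action (Head or Tail)."""
    u1, u2 = expected_payoffs(p, q)
    best1 = max(expected_payoffs(Fraction(d), q)[0] for d in (0, 1))
    best2 = max(expected_payoffs(p, Fraction(d))[1] for d in (0, 1))
    return best1 <= u1 and best2 <= u2

half = Fraction(1, 2)
mixed_value = expected_payoffs(half, half)
pure_checks = [is_equilibrium(Fraction(a), Fraction(b)) for a in (0, 1) for b in (0, 1)]
print(mixed_value, is_equilibrium(half, half), pure_checks)
```

With these conventions the 50/50 profile yields expected payoffs (0, 0) and passes the deviation check, while all four pure profiles fail it.
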
Game Theory: from Classical Setting to Learning
Classical setting:
- Originated and used in economics and social sciences.
- Relies on equilibrium analysis based on NE or its refinements.
- Offers the traditional explanation for when and why a NE arises: in one-shot games (players interact for only a single round), from analysis and introspection by sophisticated players, when the game and the rationality of the players are all common knowledge (complete information).
Learning game theory:
- An alternative justification, more relevant for engineering or social systems.
- In repeated games (players interact with each other for multiple rounds of the same game), a NE arises as the limiting point of repeated play in which (less than fully) rational players update their behaviour.
- The information available to the players is critical.

Learning Game Theory
Myopic learning: simple rule-of-thumb rules, no forecasting. Examples:
- Best-response (BR) play.
- Fictitious play (play optimally/BR against the empirical distribution of the opponents' past plays).
- Gradient play ("better-response" play).
- Reinforcement learning (payoff-based play).
Evolutionary games: select strategies according to performance against the aggregate and random mutations.

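As one concrete instance of the myopic rules listed above, gradient play can be sketched in a few lines: in every round each player takes a small gradient step on its own cost and projects back onto its action set. The two-player quadratic game, the action interval [0, 1], the step size, and the function name are all hypothetical choices for illustration, not taken from the talk.

```python
def gradient_play(grads, x0, step=0.1, iters=200, lo=0.0, hi=1.0):
    """Projected gradient play: all players simultaneously step along the
    negative gradient of their own cost, then clip to the interval [lo, hi]."""
    x = list(x0)
    for _ in range(iters):
        # Gradients are evaluated at the current profile before anyone moves.
        g = [grad(x) for grad in grads]
        x = [min(hi, max(lo, xi - step * gi)) for xi, gi in zip(x, g)]
    return x

# Hypothetical two-player quadratic game (an assumption, not from the slides):
#   J_1(x) = x_1^2 + 0.5*x_1*x_2 - x_1,   J_2(x) = x_2^2 + 0.5*x_1*x_2 - x_2
grads = [
    lambda x: 2 * x[0] + 0.5 * x[1] - 1,  # dJ_1/dx_1
    lambda x: 2 * x[1] + 0.5 * x[0] - 1,  # dJ_2/dx_2
]
x_final = gradient_play(grads, x0=[1.0, 0.0])
print(x_final)
```

Setting both partial gradients to zero gives the Nash equilibrium $x_1 = x_2 = 0.4$ for this game, and the iterates approach that point. Convergence here relies on the example being strongly monotone with a small enough step; gradient play is not guaranteed to converge in arbitrary games.
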
Network Games
Multi-agent (MA) systems or networks:
- Complex problems
- Multiple entities with individual goals and capacity to act
- Limited distributed resources
- Central controller issues
Applications: transportation, communication, economics, social networks, ...
Major challenge: efficient coordination between agents (centralized?)

Game Theory in Network Games
- Congestion control games in networks (internet, transportation): [Altman, Basar & Srikant'02, Alpcan & Basar'05, Yin, Shanbhag & Mehta'11].
- Power allocation in communication networks (wireless, optical): [Alpcan & Basar'03, Pan & Pavel'09, Menache & Ozdaglar'11].
- Multi-agent (mobile sensors/robot) network formation: [Cortes, Martinez et al.'02'13, Arslan, Marden & Shamma'07].
- Resource allocation game in cloud computation: [Ardagna et al.'13].
- Demand response management in smart grids: [Zhu & Frazzoli'16, van der Schaar et al.'16, Ye & Hu'17].
Players/agents:
- are network nodes, routers, channels, robots or network users;
- share network resources (bandwidth, power, capacity).
Approaches: infinite (continuous-kernel) games, finite action games, evolutionary games.

Network Games
Centralized coordination is often impractical in networks, or consumes excessive bandwidth and energy => want distributed learning algorithms/dynamics.
Challenges:
- complexity of agents' networked interaction
- local/partial information
- delayed/asynchronous communication
- curse of dimensionality
- non-stationary environment (multiple agents learning at the same time)
Differences between settings:
- MA consensus/agreement: dynamic but decoupled agents, with individual properties for each agent (e.g. incremental passivity, convexity); only consensus/agreement-type optimality (translates into a quadratic objective).
- Distributed optimization: optimality in terms of an aggregate objective; each agent optimizes a part of it (cooperates) and has complete authority over the full argument (decoupled problems).
- Game setting: each agent optimizes its own objective (does not cooperate) and has authority only over its own actions, but its objective depends on the others' actions (coupled problems and incomplete authority over the full argument/profile).

Network Games: Setting
[Figure: interference graph $G_I$ and communication graph $G_C$, each drawn over nodes 1, 2, 3, ..., N]
- Each player/agent $i$ has an individual cost $J_i$ or payoff $U_i$ (its goal).
- $J_i$ may be affected by the decisions of any subset of players => strategic interaction modelled by the interference graph $G_I$ (conflict graph, in graphical games).
- Information is obtained over a network => communication graph $G_C$.
- $G_I$ and $G_C$ can be identical or not, complete or sparsely connected, directed or undirected, static or time-varying.

Network Games: Problem
[Figure: interference graph $G_I$ and communication graph $G_C$, each drawn over nodes 1, 2, 3, ..., N]
Design agent learning rules (algorithms/dynamics) that achieve a (G)NE collective configuration, while:
- considering agents' networked interaction and communication (relying on local information),
- minimizing superfluous communication or processing overhead, and
- being provably convergent in a large class of games.
Setting: non-cooperative in the way actions are taken (each agent minimizes its own individual cost), but collaborative in information exchange (agents share some information with neighbours to compensate for the lack of global information on others).
Tools: convex optimization, graph theory, monotone operator theory, multi-agent dynamical systems & control.

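To make the "non-cooperative actions, collaborative information" setting concrete, here is a generic sketch of gradient play augmented with a consensus step on estimates. It is not the algorithm of any of the works cited in this talk: each agent keeps its own estimate of the full action profile, exchanges estimates only with its neighbours in the communication graph, and takes a gradient step on its own action using its own estimate. The quadratic game, the two-node graph, the step size, the gain, and all names are hypothetical.

```python
def consensus_gradient_play(grads, neighbors, n, step=0.1, gain=1.0, iters=2000):
    """est[i][k] is agent i's estimate of player k's action; est[i][i] is the
    action agent i actually plays. Each round, agent i
      (a) moves its estimate toward its neighbours' estimates (consensus term),
      (b) takes a gradient step on its own action, evaluated at its own estimate."""
    est = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        new = []
        for i in range(n):
            own_grad = grads[i](est[i])
            row = []
            for k in range(n):
                disagreement = sum(est[i][k] - est[j][k] for j in neighbors[i])
                grad_term = own_grad if k == i else 0.0
                row.append(est[i][k] - step * (gain * disagreement + grad_term))
            new.append(row)
        est = new  # synchronous update of all agents
    return est

# Hypothetical two-player quadratic game (an assumption, not from the slides):
#   J_1(x) = x_1^2 + 0.5*x_1*x_2 - x_1,   J_2(x) = x_2^2 + 0.5*x_1*x_2 - x_2
grads = [
    lambda x: 2 * x[0] + 0.5 * x[1] - 1,  # dJ_1/dx_1
    lambda x: 2 * x[1] + 0.5 * x[0] - 1,  # dJ_2/dx_2
]
neighbors = {0: [1], 1: [0]}  # communication graph: a single undirected edge
estimates = consensus_gradient_play(grads, neighbors, n=2)
actions = [estimates[i][i] for i in range(2)]
print(actions)
```

In this small example the estimates reach agreement and the played actions approach the Nash equilibrium (0.4, 0.4) of the assumed game. For larger or sparser graphs, the gain and step size generally need to be tuned to the game and to the graph connectivity, and convergence is not guaranteed by this sketch.
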
Network Games: Some Results
[Figure: interference graph $G_I$ and communication graph $G_C$, each drawn over nodes 1, 2, 3, ..., N]
- NEP in general or in aggregative games with central node authority: $G_I$ complete, $G_C$ star [Gilpin'07, Facchinei & Pang'07, Parise et al.'15].
- NEP in aggregative games or 2-network zero-sum games, center-free: $G_I$ complete, any $G_C \subset G_I$ [Koshal, Nedic & Shanbhag'12], [Maojiao'17], [Gharesifard & Cortes'13].
- NEP in general monotone games: $G_I$ partial, $G_C = G_I$ [Zhu & Frazzoli'12], [Li et al.'13].
- NEP in general monotone games: $G_I$ complete or partial, any $G_C \subset G_I$ [Salehi & Pavel'16,'18, Gadjov & Pavel'18, Yi & Pavel'18].