
Agent-Based Systems: Lecture 8 – Multiagent Interactions



  1. Agent-Based Systems
     Michael Rovatsos (mrovatso@inf.ed.ac.uk)
     Lecture 8 – Multiagent Interactions

     Where are we?
     Last time . . .
     • Coordination: managing interactions effectively
     • Different methods for coordination:
       • Partial global planning: achieving a global view through information exchange
       • Joint intentions: extending the BDI paradigm to include joint intentions, collective commitments and conventions
       • Mutual modelling: taking the role of the other to predict their actions
       • Norms and social laws: coordination through offline/emergent constraints on agent behaviour
       • Multiagent planning and synchronisation, plan merging
     Today . . .
     • Multiagent Interactions

     Multiagent interactions
     • We have looked at agent communication, but not described how it is used in actual agent interactions
     • In itself, communication does not have much effect on the agents
     • Now, we are going to look at interactions in which agents affect each other through their actions
     • Assume agents to have “spheres of influence” that they control in the environment
     • Also, we assume that the welfare (goal achievement, utility) of each agent at least partially depends on the actions of others
     • This part of the lecture will deal with what agents should do in the presence of other agents (which also do stuff)

     Preferences and utilities
     • We first need an abstract model of interactions
     • Assume $O = \{o_1, \ldots, o_n\}$ a set of possible outcomes (e.g. possible “runs” of the system until final states are reached)
     • A preference ordering $\succ_i \subseteq O \times O$ for agent $i$ is a total, antisymmetric, transitive relation on $O$, i.e.
       • $o \succ_i o' \Rightarrow o' \not\succ_i o$
       • $o \succ_i o' \wedge o' \succ_i o'' \Rightarrow o \succ_i o''$
       • $\forall o, o' \in O$ either $o \succ_i o'$ or $o' \succ_i o$
     • Such an ordering can be used to express strict preferences of an agent over $O$ (write $\succeq_i$ if also reflexive, i.e. $o \succeq_i o$)
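
The three conditions on a preference ordering can be checked mechanically for a small outcome set. The following is a minimal sketch, not part of the lecture: the function name `is_strict_preference`, the outcome labels and both example relations are made up for illustration, and totality is assumed to be required only for distinct outcomes (a strict relation cannot relate an outcome to itself).

```python
from itertools import permutations


def is_strict_preference(outcomes, prefers):
    """Check the three slide conditions on a strict preference relation.

    `prefers` is a set of pairs (o, o2) meaning "o is strictly preferred to o2".
    Assumption: totality is only demanded for pairs of distinct outcomes.
    """
    # o > o'  implies  not (o' > o)
    antisymmetric = all((b, a) not in prefers for (a, b) in prefers)
    # o > o' and o' > o''  implies  o > o''
    transitive = all(
        (a, d) in prefers
        for (a, b) in prefers
        for (c, d) in prefers
        if b == c
    )
    # every two distinct outcomes are comparable
    total = all(
        (a, b) in prefers or (b, a) in prefers
        for a, b in permutations(outcomes, 2)
    )
    return antisymmetric and transitive and total


outcomes = ["o1", "o2", "o3"]
ranking = {("o1", "o2"), ("o2", "o3"), ("o1", "o3")}  # o1 > o2 > o3
cyclic = {("o1", "o2"), ("o2", "o3"), ("o3", "o1")}   # a preference cycle

print(is_strict_preference(outcomes, ranking))  # True
print(is_strict_preference(outcomes, cyclic))   # False (not transitive)
```

The cyclic relation passes the antisymmetry and totality checks and fails only on transitivity, which is why all three conditions are needed to obtain a ranking.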

  2. Preferences and utilities
     • Preferences are often expressed through a utility function $u_i : O \to \mathbb{R}$:
       $u_i(o) > u_i(o') \Leftrightarrow o \succ_i o'$, $u_i(o) \geq u_i(o') \Leftrightarrow o \succeq_i o'$
     • Utilities make representing preferences easier because the ordering follows naturally if we use real numbers
     • Often, people falsely associate utility directly with money!
     • Intuitively, the utility of money depends on how much money one already has
     • Therefore, utility does not increase proportionally with monetary wealth
     • The utility of money: [figure: utility of money]
     • Empirical evidence suggests that the utility of money is often very close to a logarithmic function for humans
     • This shows that the utility function depends on the agent’s risk aversion attitude (value of additional utility depending on current “wealth”)

     Multiagent encounters
     • Applying the above to a multiagent setting, we need to consider several agents’ actions and the outcomes they lead to
     • For now, restrict ourselves to two players and identical sets of actions
     • Abstract architecture: state transformer function becomes
       $\tau : Ac \times Ac \to O$
       where $Ac$ are the actions of each of the two agents
     • Outcome depends on other’s actions!
     • For pairs $(a_1, a_2), (a'_1, a'_2) \in Ac \times Ac$ we can write
       $(a_1, a_2) \succeq_i (a'_1, a'_2)$ iff $\tau(a_1, a_2) \succeq_i \tau(a'_1, a'_2)$
       (similarly for $\succ_i$ and utilities $u_{1/2}(\tau(a_1, a_2))$)
     • We consider agents to be rational if they prefer actions that lead to preferred outcomes

     Example: The Prisoner’s Dilemma
     • Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
       • if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years;
       • if both confess, then each will be jailed for two years.
       Both prisoners know that if neither confesses, then they will each be jailed for one year.
     • Payoff matrix for this game (rows: player 1, columns: player 2; C = cooperate, i.e. do not confess, D = defect, i.e. confess; entries are the utilities of player 1 and player 2, not years in jail):

                  2: C      2: D
         1: C    (3,3)     (0,5)
         1: D    (5,0)     (1,1)
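
As a rough illustration of the two ideas in this part, the sketch below first models the utility of money as the natural logarithm of wealth (an assumption suggested by the slide, with made-up wealth figures) and then encodes the Prisoner’s Dilemma payoff matrix so that preferences over joint actions follow from utilities. The names `log_utility`, `marginal_gain`, `PAYOFF`, `utility` and `weakly_prefers` are hypothetical helpers, not part of the lecture.

```python
import math


def log_utility(wealth):
    """Assumed utility of money: natural log of current wealth (illustrative)."""
    return math.log(wealth)


def marginal_gain(wealth, extra):
    """Additional utility from receiving `extra` on top of `wealth`."""
    return log_utility(wealth + extra) - log_utility(wealth)


# Payoff matrix from the slide: (action of 1, action of 2) -> (u_1, u_2).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def utility(agent, a1, a2):
    """u_agent(tau(a1, a2)) for agent 1 or agent 2."""
    return PAYOFF[(a1, a2)][agent - 1]


def weakly_prefers(agent, joint, other):
    """True iff `agent` weakly prefers joint action `joint` to `other`."""
    return utility(agent, *joint) >= utility(agent, *other)


# The same 100 units matter far more to a poorer agent.
print(marginal_gain(1_000, 100) > marginal_gain(100_000, 100))  # True

# The outcome depends on the other agent's action: agent 1 prefers (D, C)
# to (C, C), while agent 2 prefers the reverse.
print(weakly_prefers(1, ("D", "C"), ("C", "C")))  # True
print(weakly_prefers(2, ("D", "C"), ("C", "C")))  # False
```

Storing one utility pair per joint action folds the state transformer and the two utility functions into a single table, which is enough for two players with identical action sets.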

  3. Game theory
     • Mathematical study of interaction problems of this sort
     • Basic model: agents perform simultaneous actions (potentially over several stages), the actual outcome depends on the combination of actions chosen by all agents
     • Normal-form games: final result reached in single step (in contrast to extensive-form games)
     • Agents $\{1, \ldots, n\}$, $S_i$ = set of (pure) strategies for agent $i$, $S = \times_{i=1}^{n} S_i$ space of joint strategies
     • Utility functions $u_i : S \to \mathbb{R}$ map joint strategies to utilities
     • A probability distribution $\sigma_i : S_i \to [0, 1]$ is called a mixed strategy of agent $i$ (can be extended to joint strategies)
     • Game theory is concerned with the study of this kind of games (in particular developing solution concepts for games)

     Dominance and Best Response Strategies
     • Two simple and very common criteria for rational decision making in games
     • Strategy $s \in S_i$ is said to dominate $s' \in S_i$ iff
       $\forall s_{-i} \in S_{-i} \quad u_i(s, s_{-i}) \geq u_i(s', s_{-i})$
       ($s_{-i} = (s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$, the same abbreviation is used for $S$)
     • Dominated strategies can be safely deleted from the set of strategies, a rational agent will never play them
     • Some games are solvable in dominant strategy equilibrium, i.e. all agents have a single (pure/mixed) strategy that dominates all other strategies
     • Strategy $s \in S_i$ is a best response to strategies $s_{-i} \in S_{-i}$ iff
       $\forall s' \in S_i, s' \neq s \quad u_i(s, s_{-i}) \geq u_i(s', s_{-i})$
     • Weaker notion, only considers optimal reaction to a specific behaviour of other agents
     • Unlike dominant strategies, best-response strategies (trivially) always exist
     • Strict versions of the above relations require that “$>$” holds for at least one $s'$
     • Replace $s_i / s_{-i}$ above by $\sigma_i / \sigma_{-i}$ and you can extend the definitions for dominant/best-response strategies to mixed strategies

     Nash Equilibrium
     • Nash (1951) defined the most famous equilibrium concept for normal-form games
     • A joint strategy $s \in S$ is said to be in (pure-strategy) Nash equilibrium (NE), iff
       $\forall i \in \{1, \ldots, n\} \; \forall s'_i \in S_i \quad u_i(s_i, s_{-i}) \geq u_i(s'_i, s_{-i})$
     • Intuitively, this means that no agent has an incentive to deviate from this strategy combination
     • Very appealing notion, because it can be shown that a (mixed-strategy) NE always exists
     • But also some problems:
       • Not always unique, how to agree on one of them?
       • Proof of existence does not provide method to actually find it
       • Many games do not have pure-strategy NE
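
For small normal-form games, the dominance and pure-strategy Nash equilibrium definitions above can be tested by brute-force enumeration. This is only a sketch under stated assumptions: `dominates` and `pure_nash_equilibria` are hypothetical helper names, agents are indexed from 0 rather than 1, payoffs are stored in a dictionary keyed by joint strategy, and Matching Pennies is a standard textbook game that does not appear in the slides.

```python
from itertools import product


def dominates(i, s, s_alt, strategies, payoff):
    """True iff strategy s of agent i is at least as good as s_alt
    against every combination of the other agents' strategies."""
    others = [strategies[j] for j in range(len(strategies)) if j != i]
    for rest in product(*others):
        joint_s = rest[:i] + (s,) + rest[i:]
        joint_alt = rest[:i] + (s_alt,) + rest[i:]
        if payoff[joint_s][i] < payoff[joint_alt][i]:
            return False
    return True


def pure_nash_equilibria(strategies, payoff):
    """Enumerate joint strategies from which no single agent gains by deviating."""
    equilibria = []
    for joint in product(*strategies):
        stable = True
        for i, own in enumerate(strategies):
            for alt in own:
                deviation = joint[:i] + (alt,) + joint[i + 1:]
                if payoff[deviation][i] > payoff[joint][i]:
                    stable = False
        if stable:
            equilibria.append(joint)
    return equilibria


# Prisoner's Dilemma payoffs as used in this lecture.
PD = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
      ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
PD_STRATEGIES = [["C", "D"], ["C", "D"]]

# Matching Pennies (not in the slides): a game without a pure-strategy NE.
MP = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
      ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
MP_STRATEGIES = [["H", "T"], ["H", "T"]]

print(dominates(0, "D", "C", PD_STRATEGIES, PD))  # True
print(pure_nash_equilibria(PD_STRATEGIES, PD))    # [('D', 'D')]
print(pure_nash_equilibria(MP_STRATEGIES, MP))    # []
```

Enumeration visits every joint strategy, so it is only practical for tiny games, and it says nothing about mixed-strategy equilibria.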

  4. Example
     The Prisoner’s Dilemma: Nash equilibrium is not Pareto efficient (or: no one will dare to cooperate although mutual cooperation is preferred over mutual defection)

                  2: C      2: D
         1: C    (3,3)     (0,5)
         1: D    (5,0)     (1,1)

     General conditions on utilities: $DC \succ CC \succ DD \succ CD$ (from first player’s point of view) and
     $u(CC) > \frac{u(DC) + u(CD)}{2}$

     Example
     The Coordination Game: No temptation to defect, but two equilibria (hard to know which one will be chosen by other party)

                  2: A       2: B
         1: A    (1,1)     (-1,-1)
         1: B   (-1,-1)     (1,1)

     The evolution of cooperation?
     • In zero-sum/constant-sum games one agent loses what the other wins (e.g. chess): no potential for cooperation
     • Typical non-zero sum game: there is a potential for cooperation but how should it emerge among self-interested agents?
     • This situation occurs in many real life cases:
       • Nuclear arms race
       • Tragedy of the commons
       • “Free rider” problems
     • Axelrod’s tournament (1984): a very interesting study of such interaction situations
     • Iterated Prisoner’s Dilemma was played among many different strategies (how to play against different opponents?)
     • In single-shot PD, defection is the rational solution
     • In (infinitely) iterated case, cooperation is the rational choice in the PD
     • But not if game has a fixed, known length (“backward induction” problem)
     • TIT FOR TAT strategy performed best against a variety of strategies (this does not mean it is the best strategy, though!)
     • Axelrod’s conclusions from this:
       • don’t be envious, don’t be the first to defect, reciprocate defection and cooperation (don’t hold grudges), don’t be too clever
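
The iterated Prisoner’s Dilemma can be simulated in a few lines. The sketch below is not a reconstruction of Axelrod’s tournament: the ten-round match length, the helper name `play` and the choice of opponents are assumptions made for illustration, and only the payoff matrix and the TIT FOR TAT rule (cooperate first, then copy the opponent’s previous move) come from the material above.

```python
# Payoff matrix from the slides: (action of 1, action of 2) -> (u_1, u_2).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(own_history, opponent_history):
    """Cooperate in the first round, then repeat the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(own_history, opponent_history):
    """Defect unconditionally."""
    return "D"


def play(strategy1, strategy2, rounds):
    """Play an iterated game and return the accumulated (score1, score2)."""
    history1, history2 = [], []
    score1 = score2 = 0
    for _ in range(rounds):
        a1 = strategy1(history1, history2)
        a2 = strategy2(history2, history1)
        u1, u2 = PAYOFF[(a1, a2)]
        score1 += u1
        score2 += u2
        history1.append(a1)
        history2.append(a2)
    return score1, score2


# Check of the general condition u(CC) > (u(DC) + u(CD)) / 2 for player 1.
print(PAYOFF[("C", "C")][0] > (PAYOFF[("D", "C")][0] + PAYOFF[("C", "D")][0]) / 2)  # True

print(play(tit_for_tat, tit_for_tat, 10))      # (30, 30)
print(play(tit_for_tat, always_defect, 10))    # (9, 14)
print(play(always_defect, always_defect, 10))  # (10, 10)
```

In this toy setting TIT FOR TAT loses its head-to-head match against unconditional defection, yet two TIT FOR TAT players together earn far more than two defectors, which is consistent with the slide’s remark that doing best across a field of opponents does not make a strategy the best one in every pairing.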
