Multiagent Problem Formulation
José M. Vidal
Department of Computer Science and Engineering, University of South Carolina
January 5, 2010

Abstract: We cover the most popular formal models for representing agents and multiagent problems.
Introduction

Why Study Multiagent Systems?

Multiagent systems are everywhere!
- Internet: peer-to-peer programs (BitTorrent), web applications (REST), social networks, the routers themselves.
- Economics: just-in-time manufacturing and procurement, sourcing, ad auctions.
- Political Science and Sociology: negotiations among self-interested parties.
- Nanofabrication and MEMS: sensor networks.
- Biology: social insects, ontogeny, neurology.
Science: how stuff works.
Engineering: how to build stuff.
Multiagent Systems: we want to build systems of, mostly, artificial agents. To do this we need to understand the science and the math.
Fundamentals of Multiagent Systems
Theory: game theory, economics, sociology, biology, AI, multiagent algorithms.
Practice: NetLogo.
History
1970s: AI Boom
1980s: AI Bust, Blackboard Systems, DAI
1990s: The Web, Multiagent Systems
2000s: Ad auctions, Algorithmic Game Theory, Social Networks, REST
2010s: ?
Grading
Problem sets. Tests.
Utility

Our Model: The Utility Function

Each agent i has a utility function mapping states to the reals:

u_i : S → ℝ
Utility Requirements
- Reflexive: u_i(s) ≥ u_i(s).
- Transitive: if u_i(a) ≥ u_i(b) and u_i(b) ≥ u_i(c), then u_i(a) ≥ u_i(c).
- Comparable: for all a, b, either u_i(a) ≥ u_i(b) or u_i(b) ≥ u_i(a).
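The three requirements come for free once utility is real-valued, since ≥ on the reals is reflexive, transitive, and total. A minimal sketch, using a made-up utility table, checks the induced preference relation:

```python
# Sketch: a real-valued utility function induces a preference relation
# that is reflexive, transitive, and comparable. The states and utility
# values below are hypothetical, purely for illustration.
from itertools import product

u = {"a": 3.0, "b": 1.5, "c": 1.5}       # hypothetical utilities u_i
prefers = lambda x, y: u[x] >= u[y]      # induced preference relation

S = list(u)
assert all(prefers(s, s) for s in S)                                   # reflexive
assert all(prefers(x, y) or prefers(y, x) for x, y in product(S, S))   # comparable
assert all(not (prefers(x, y) and prefers(y, z)) or prefers(x, z)
           for x, y, z in product(S, S, S))                            # transitive
print("all three properties hold")
```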
Utility is Not Money

Which one do you prefer:
1. a 50/50 chance at winning $10, or
2. $5 for sure?
Now which one do you prefer:
1. a 50/50 chance at winning $1,000,000, or
2. $500,000 for sure?
Expected Utility

E[u_i, s, a] = Σ_{s′ ∈ S} T(s, a, s′) u_i(s′)

where T(s, a, s′) is the probability of reaching s′ from s by taking action a.
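The sum above is easy to compute directly. A minimal sketch, assuming a hypothetical encoding where T is a dict mapping (s, a) to a list of (s′, probability) pairs:

```python
# Sketch: expected utility of taking action a in state s.
# T maps (s, a) to (s_next, prob) pairs -- an assumed encoding of T(s, a, s').

def expected_utility(T, u, s, a):
    """E[u, s, a] = sum over s' of T(s, a, s') * u(s')."""
    return sum(prob * u[s_next] for s_next, prob in T[(s, a)])

# Toy numbers: from s1, action a1 reaches s2 with prob .8 and stays with prob .2.
T = {("s1", "a1"): [("s2", 0.8), ("s1", 0.2)]}
u = {"s1": 0.0, "s2": 1.0}
print(expected_utility(T, u, "s1", "a1"))  # 0.8
```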
Maximum Expected Utility

π_i(s) = argmax_{a ∈ A} E[u_i, s, a]
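The MEU policy just takes the argmax over actions. A sketch, again assuming T is encoded as a dict from (s, a) to (s′, probability) pairs:

```python
# Sketch of the maximum-expected-utility policy: pick the action whose
# expected utility in the current state is highest.

def meu_policy(T, u, s, actions):
    """pi(s) = argmax over a of sum_{s'} T(s, a, s') * u(s')."""
    def eu(a):
        return sum(p * u[s2] for s2, p in T[(s, a)])
    return max(actions, key=eu)

# Toy numbers: a1 probably reaches the good state s2; a2 stays put.
T = {
    ("s1", "a1"): [("s2", 0.8), ("s1", 0.2)],
    ("s1", "a2"): [("s1", 1.0)],
}
u = {"s1": 0.0, "s2": 1.0}
print(meu_policy(T, u, "s1", ["a1", "a2"]))  # a1
```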
Value of Information

The value of information that tells the agent it is not in s but is in t instead:

E[u_i, t, π_i(t)] − E[u_i, t, π_i(s)]
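In words: the gain from acting optimally for the true state t rather than for the believed state s. A sketch under the same assumed dict encoding of T:

```python
# Sketch: value of learning you are in state t rather than s.

def eu(T, u, s, a):
    # expected utility of action a in state s
    return sum(p * u[s2] for s2, p in T[(s, a)])

def policy(T, u, s, actions):
    # MEU policy: best action for state s
    return max(actions, key=lambda a: eu(T, u, s, a))

def value_of_information(T, u, s, t, actions):
    """E[u, t, pi(t)] - E[u, t, pi(s)]: gain from knowing the true state."""
    return (eu(T, u, t, policy(T, u, t, actions))
            - eu(T, u, t, policy(T, u, s, actions)))

# Toy numbers: in s the best action is a; in t it is b.
T = {("s", "a"): [("g", 1.0)], ("s", "b"): [("h", 1.0)],
     ("t", "a"): [("h", 1.0)], ("t", "b"): [("g", 1.0)]}
u = {"g": 1.0, "h": 0.0}
print(value_of_information(T, u, "s", "t", ["a", "b"]))  # 1.0
```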
Markov Decision Processes

The Model

Markovian Assumption

Andrey Markov (1856–1922).
Markov Decision Process

[Figure: an example MDP with four states s1–s4 and actions a1–a4, edges labeled with transition probabilities (e.g., a2: .8, a1: .2); state s3 carries reward 1, the other states reward 0.]
What To Do?

The agent receives a reward on arriving at each state and must take an action each time. Should it take a high reward now, or move toward states with higher reward?
Discount Future Rewards

Let γ be a discount factor; then the reward starting at s_0 is

γ^0 r(s_0) + γ^1 r(s_1) + γ^2 r(s_2) + ···
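A minimal sketch of that sum: each reward k steps in the future is weighted by γ^k, so later rewards count for less.

```python
# Sketch: discounted return of a reward sequence r(s0), r(s1), r(s2), ...

def discounted_return(rewards, gamma):
    """gamma^0 r(s0) + gamma^1 r(s1) + gamma^2 r(s2) + ..."""
    return sum(gamma**k * r for k, r in enumerate(rewards))

# With gamma = .5, a reward of 1 two steps away is worth only .25 now.
print(discounted_return([0, 0, 1, 1], 0.5))  # 0.375
```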
Define Utility

u(s) = r(s) + γ max_a Σ_{s′} T(s, a, s′) u(s′)

Then it is easy to calculate the optimal policy:

π*(s) = argmax_a Σ_{s′} T(s, a, s′) u(s′)

But how do we calculate u(s)?
The Solution

Bellman Update

Richard Bellman (1920–1984). Inventor of dynamic programming.
value-iteration(T, r, γ, ε)
  u′(s) ← 0 for all s ∈ S
  repeat
    u ← u′
    δ ← 0
    for s ∈ S do
      u′(s) ← r(s) + γ max_a Σ_{s′} T(s, a, s′) u(s′)
      if |u′(s) − u(s)| > δ then δ ← |u′(s) − u(s)|
  until δ < ε(1 − γ)/γ
  return u
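The pseudocode translates almost line for line into a runnable sketch. The encoding below (T as a dict from (s, a) to (s′, probability) pairs, r as a dict of state rewards) is an assumption for illustration, not from the slides:

```python
# Sketch of value iteration following the pseudocode above.

def value_iteration(T, r, gamma, eps, states, actions):
    u_new = {s: 0.0 for s in states}          # u' starts at 0 everywhere
    while True:
        u = dict(u_new)                        # u <- u'
        delta = 0.0
        for s in states:
            # Bellman update: r(s) + gamma * max_a sum_{s'} T(s,a,s') u(s')
            u_new[s] = r[s] + gamma * max(
                sum(p * u[s2] for s2, p in T[(s, a)]) for a in actions)
            delta = max(delta, abs(u_new[s] - u[s]))
        if delta < eps * (1 - gamma) / gamma:  # convergence test
            return u

# Toy two-state chain: s2 pays reward 1 and loops; s1 can only move to s2.
states = ["s1", "s2"]
actions = ["go"]
T = {("s1", "go"): [("s2", 1.0)], ("s2", "go"): [("s2", 1.0)]}
r = {"s1": 0.0, "s2": 1.0}
u = value_iteration(T, r, 0.5, 1e-6, states, actions)
print(u)  # u(s2) -> 2, u(s1) -> 1 (since u(s2) = 1 + .5 u(s2))
```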
Value Iteration Example

[Figure, repeated over several slides: value iteration on the example MDP with γ = .5, starting from u = 0 everywhere. The reward of 1 at s3 propagates backwards through the state graph: after the first update u(s3) = 1; next u(s2) = .5(.8)1 = .4 and u(s4) = .5(.9)1 = .45; then u(s2) = .5(.88) = .44, u(s3) = 1 + .5(.45) = 1.225, u(s1) = .5(.8).45 = .18, u(s4) = .5(.945) = .4725; the values converge to approximately u(s1) = .234, u(s2) = .57, u(s3) = 1.2, u(s4) = .57.]
Extensions

Multiagent MDPs

Instead of individual actions, use a vector of actions: T(s, a, s′) becomes T(s, a⃗, s′), where a⃗ = (a_1, …, a_n) is the joint action, and r(s) becomes r_i(s), one reward function per agent.
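A minimal sketch of the change of model, assuming joint actions are encoded as tuples of per-agent actions (the states, action names, and numbers below are made up for illustration):

```python
# Sketch: in a multiagent MDP, transitions are keyed by the joint action
# and each agent i has its own reward function r_i.

T = {
    # T(s, (a_1, a_2), s'): the joint action of two agents drives the transition
    ("s1", ("left", "push")): [("s2", 0.9), ("s1", 0.1)],
}
r = {
    # r_i(s): the same outcome can reward agents differently
    0: {"s1": 0.0, "s2": 1.0},
    1: {"s1": 0.0, "s2": -1.0},
}

joint = ("left", "push")
expected = {i: sum(p * r[i][s2] for s2, p in T[("s1", joint)]) for i in r}
print(expected)  # {0: 0.9, 1: -0.9} -- agents value the same joint action oppositely
```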