
How Can An Agent Learn To Negotiate?

Dajun Zeng, Katia Sycara
The Robotics Institute, Carnegie Mellon University
Pittsburgh, PA 15213
zeng+@cs.cmu.edu, katia@cs.cmu.edu
http://www.cs.cmu.edu/~softagents/

Katia Sycara, ATAL-96

Talk Overview
• Motivations & Research Objective
• Desiderata of a Computational Model of Negotiation
• Modeling Negotiation as a Sequential Decision Making Process
• Bazaar: Sequential Decision Making + Bayesian Updating
• Supply Contracting Domain
• Learning in a Simple Buyer-Supplier Scenario
• Computational Issues
• Conclusions

Motivations for Learning in Negotiation
• Automated negotiation that tolerates incomplete information and adapts to external changes is important in domains such as supply contracting and electronic commerce
• Much DAI and game-theoretic work provides pre-computed solutions to specific problems

Research Objective
Build autonomous agents that improve their negotiation competence by learning from their interactions with other agents

Game Theoretic Modeling of Negotiation
• Advantages:
  – Mathematical soundness and elegance
  – Thorough analysis of strategic interactions
  – Explicit criteria

• Many restrictive assumptions:
  – The number of players and their identity are fixed and known to everyone
  – All the players are assumed to be fully rational
  – Each player's set of alternatives is fixed and known
  – Each player's risk-taking attitude and expected-utility calculations are also fixed and known
• Game-theoretic models are fundamentally static
• Not historically concerned with computational issues

Desiderata of a Computational Model of Negotiation
• Support a concise yet effective way to represent negotiation context
• Be prescriptive in nature
• Be computationally efficient, sometimes at the cost of compromising the rigor of the model and the optimality of solutions
• Model the dynamics of negotiation and learn through interactions

Characteristics of Sequential Decision Making
• A sequence of decision making points (different stages) which are dependent on each other
• The decision maker has a chance to update his/her knowledge after implementing the decision made at a certain stage and receiving feedback

Modeling Negotiation as an SDM Process
• Most negotiation tasks involve multiple rounds of exchanging proposals and counter-proposals
• Negotiating agents receive feedback, in the form of replies, after they offer a proposal or a counter-proposal
• A sequential decision making framework supports an open world approach
• Learning can take place naturally in a sequential decision making framework
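The propose / feedback / update cycle described on this slide can be sketched as a small loop. This is only an illustrative sketch, not the Bazaar implementation: the function `negotiate`, its callable parameters, the `"Reject"` and `"Timeout"` labels, and the toy buyer and seller rules are all assumptions introduced here.

```python
from typing import Callable, List, Tuple


def negotiate(
    propose: Callable[[float], float],
    reply: Callable[[float], str],
    update: Callable[[float, float, str], float],
    estimate: float,
    max_rounds: int = 10,
) -> Tuple[str, List[float]]:
    """Run a propose / feedback / update loop for a single agent."""
    offers: List[float] = []
    for _ in range(max_rounds):
        offer = propose(estimate)      # decision made at this stage
        offers.append(offer)
        feedback = reply(offer)        # the other party's reply
        if feedback in ("Accept", "Quit"):
            return feedback, offers    # a terminal reply ends the process
        # Learning step: revise the estimate before the next stage.
        estimate = update(estimate, offer, feedback)
    return "Timeout", offers           # hypothetical label for "no reply in time"


# Toy stand-ins (all hypothetical): the buyer offers its current estimate,
# the seller accepts anything at or above 100, and each rejection raises
# the buyer's estimate by 5.
outcome, offers = negotiate(
    propose=lambda est: est,
    reply=lambda offer: "Accept" if offer >= 100 else "Reject",
    update=lambda est, offer, fb: est + 5,
    estimate=90.0,
)
print(outcome, offers)
```

Passing the three steps in as callables keeps the loop independent of any particular learning rule, so a Bayesian update could be substituted for the fixed increment used in the toy example.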

Limitations of SDM
• Strategic interactions only partially modeled
• Fuzzy evaluation criteria

Bazaar: Sequential Decision Making with Rational Learning

In Bazaar, a negotiation process is modeled by a 10-tuple \(\langle N, M, \Delta, A, H, Q, \Omega, P, C, E \rangle\), where:

A-1  A set \(N\) (the set of players)
A-2  A set \(M\) (the set of issues)
A-3  A set of vectors \(\Delta \equiv \{(D_j)_{j \in M}\}\)
     A set \(A\) composed of all the possible actions that can be taken by every member of the players set: \(A \equiv \Delta \cup \{\text{Accept}, \text{Quit}\}\)

A-4  For each player \(i \in N\), a set of possible agreements \(A_i\); for each \(i \in N\), \(A_i \subseteq A\)
A-5  A set \(H\) of sequences (finite or infinite) that satisfies the following properties:
     • The elements of each sequence are defined over \(A\)
     • The empty sequence is a member of \(H\)
     • If \((a^k)_{k=1,\ldots,K} \in H\) and \(L < K\), then \((a^k)_{k=1,\ldots,L} \in H\)
     • If \((a^k)_{k=1,\ldots,K} \in H\) and \(a^K \in \{\text{Accept}, \text{Quit}\}\), then \(a^k \notin \{\text{Accept}, \text{Quit}\}\) when \(k = 1, \ldots, K-1\)

A-6  A function \(Q\) that associates each nonterminal history (\(h \in H \setminus Z\)) to a member of \(N\)
A-7  A set \(\Omega\) of relevant information entities:
     • The parameters of the environment
     • Beliefs about other players:
       (a) Beliefs about the factual aspects of other agents
       (b) Beliefs about the decision making process of other agents
       (c) Beliefs about some meta-level issues such as the overall negotiation style of other players

A-8  For each nonterminal history \(h\) and each player \(i \in N\), a subjective probability distribution \(P_{h,i}\) defined over \(\Omega\)
A-9  For each player \(i \in N\), each nonterminal history in \(H \setminus Z\), and each action \(a_i \in A_i\), there is an implementation cost \(C_{i,h,a}\)
A-10 For each player \(i \in N\), a preference relation \(\succeq_i\) on \(Z\) and \(P_{h,i}\) for each \(h \in Z\). \(\succeq_i\) in turn results in an evaluation function \(E_i(Z, P_{Z,i})\)

• Solution concept: adaptive feedback control from dynamic programming
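As a rough illustration of the history set and the player function in the definition above, the sketch below checks the rule that Accept and Quit may only end a sequence, and assigns turns by simple alternation. The representation of offers as tuples, the helper names, and the alternating rule for Q are assumptions made here; the slides do not specify how Q orders the players.

```python
from typing import List, Sequence, Tuple, Union

Offer = Tuple[float, ...]            # hypothetical: one value per issue in M
Action = Union[Offer, str]           # an offer, or "Accept" / "Quit"
TERMINAL_ACTIONS = {"Accept", "Quit"}


def is_terminal_action(action: Action) -> bool:
    """True for the two actions that end a negotiation."""
    return isinstance(action, str) and action in TERMINAL_ACTIONS


def is_valid_history(history: Sequence[Action]) -> bool:
    """Accept/Quit may appear only as the last action of a sequence."""
    return all(not is_terminal_action(a) for a in history[:-1])


def is_terminal(history: Sequence[Action]) -> bool:
    """A history is treated as terminal once it ends with Accept or Quit."""
    return len(history) > 0 and is_terminal_action(history[-1])


def next_player(history: Sequence[Action], players: List[str]) -> str:
    """Assumed form of Q: players simply take turns."""
    if is_terminal(history):
        raise ValueError("Q is defined only for nonterminal histories")
    return players[len(history) % len(players)]


players = ["buyer", "supplier"]
history: List[Action] = [(100.0,), (117.0,), "Accept"]
print(is_valid_history(history), is_terminal(history))
print(next_player(history[:1], players))
```

Because the validity check only inspects actions before the last one, every prefix of a valid history is itself valid, which matches the prefix-closure property listed under A-5.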

Domain: Supply Contracting
• Supply contracting is an emerging area in Operations Management
  – Motivation: manufacturing companies need to ensure a smooth and inexpensive supply of the raw materials and components needed to produce and assemble the final product

• Supply contracting is an ideal evaluation domain for Bazaar since:
  – Significant in its own right
  – Quantitatively oriented
  – Some strategic parts of supply contracting have been ignored in analytic modeling and in fact are being ignored in practice
  – Opportunity for learning: uncertainties involved in various stages of supply contracting, e.g., uncertainty in demand and supply

Learning in a Simple Buyer-Supplier Scenario
• Assumptions:
  – The relevant information set \(\Omega\) has only one item: belief about the supplier's reservation price \(RP_{supplier}\) (from the buyer's perspective)
  – The buyer's partial belief about \(RP_{supplier}\) is represented by two hypotheses:
      \(H_1\) = "\(RP_{supplier}\) = $100.00"
      \(H_2\) = "\(RP_{supplier}\) = $130.00"
  – A priori knowledge: \(P(H_1) = 0.5\), \(P(H_2) = 0.5\)

  – Domain knowledge: "Usually in our business people will offer a price which is above their reservation price by 17%", part of which is encoded as:
      \(P(e_1 \mid H_1) = 0.95\)
      \(P(e_1 \mid H_2) = 0.75\)
    where \(e_1\) denotes the event that the supplier asks $117.00 for the goods under negotiation
  – The buyer adopts a simple negotiation strategy: "Propose a price which is 10% below the estimated \(RP_{supplier}\)"

• Suppose that the supplier offers $117.00
• Given this signal and the domain knowledge, the buyer can calculate the posterior estimation of \(RP_{supplier}\) as follows:

\[
P(H_1 \mid e_1) = \frac{P(H_1)\,P(e_1 \mid H_1)}{P(H_1)\,P(e_1 \mid H_1) + P(H_2)\,P(e_1 \mid H_2)} = 55.9\%
\]

\[
P(H_2 \mid e_1) = \frac{P(H_2)\,P(e_1 \mid H_2)}{P(H_1)\,P(e_1 \mid H_1) + P(H_2)\,P(e_1 \mid H_2)} = 44.1\%
\]

• Prior to receiving the supplier's offer ($117.00), the buyer would propose $115.00 (the mean of the \(RP_{supplier}\) subjective distribution)
• After receiving the offer from the supplier and updating his belief about \(RP_{supplier}\), the buyer will propose $113.23 instead
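The update in this scenario can be reproduced with a few lines of Python. This is a minimal sketch using only the numbers given on the slides; the function names `posterior` and `expected_rp` and the dictionary layout are illustrative choices, not part of Bazaar.

```python
from typing import Dict


def posterior(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule over a finite set of hypotheses."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())          # P(e1), the normalizing constant
    return {h: p / evidence for h, p in joint.items()}


def expected_rp(belief: Dict[str, float], rp: Dict[str, float]) -> float:
    """Mean of the buyer's subjective distribution over the reservation price."""
    return sum(belief[h] * rp[h] for h in belief)


# Values taken from the slides.
rp = {"H1": 100.0, "H2": 130.0}             # reservation price under each hypothesis
prior = {"H1": 0.5, "H2": 0.5}              # a priori knowledge
likelihood_e1 = {"H1": 0.95, "H2": 0.75}    # P(e1 | H), e1 = "supplier asks $117.00"

belief = posterior(prior, likelihood_e1)
before = expected_rp(prior, rp)
after = expected_rp(belief, rp)

print(f"P(H1 | e1) = {belief['H1']:.1%}")   # 55.9%
print(f"P(H2 | e1) = {belief['H2']:.1%}")   # 44.1%
print(f"estimate before the offer: {before:.2f}")   # 115.00
print(f"estimate after the offer:  {after:.2f}")
```

Computed without intermediate rounding, the updated estimate is about $113.24. The $113.23 on the slide follows from using the rounded posteriors 55.9% and 44.1%.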

Initial Theoretical Results
A player who uses the Bayesian mechanism to update his beliefs about the unknown parameters of the game and the other players' strategies in a subjectively rational fashion performs at least as well as he would without Bayesian learning

Computational Issues
• Efficiency
  – Bayesian network
• Convergence, ...
  – Experimental study of solution quality, time to reach an agreement, etc.

Conclusions
• Bazaar lies "in between" game-theoretic models and single-agent decision making models
• Bazaar aims at modeling multi-issue negotiation processes
• Bazaar supports an open world model
• Bazaar addresses multi-agent learning by utilizing the iterative nature of sequential decision making and the explicit representation of beliefs about other agents
