
  1. Learning the Valuations of a k-demand Agent. Hanrui Zhang, Vincent Conitzer (Duke University)

  2. this talk:
     • optimal (up to lower order terms) algorithm for actively learning the valuations of a k-demand agent
     • algorithm with polynomial time & sample complexity for passively learning the valuations of a k-demand agent

  3. k-demand agents and demand sets
     • k-demand agent: demands a set of items of size ≤ k maximizing her utility, i.e., total value minus total price
     • demand set: the set of items the agent demands

  4. Unit-demand agents
     value:       $10   $12   $8
     price:       $6    $5    $5
     surplus:     $4    $7    $3
     agent buys:   ✘     ✔    ✘   (the single item with the highest surplus)

  5–7. k-demand agents and demand sets
     value:     $5   $6   $4   $3
     price:     $4   $3   $2   $2
     surplus:   $1   $3   $2   $1
     the agent is 2-demand — they want no more than 2 items
     agent buys:  ✘   ✔   ✔   ✘   (these two items form the demand set)

  8. Demand queries
     demand query: given a vector of prices, returns a demand set (which may not be unique)
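To make the demand-query interface concrete, here is a minimal Python sketch (illustrative only, not code from the talk) of one way such an oracle could answer: pick the at most k items with the largest positive surplus. The function name and tie-breaking rule are my own choices.

```python
def demand_set(values, prices, k):
    """One possible demand set of a k-demand agent: up to k items chosen to
    maximize total value minus total price.  Items with non-positive surplus
    are never worth taking; ties are why the demand set may not be unique."""
    surplus = sorted(((v - p, i) for i, (v, p) in enumerate(zip(values, prices))),
                     reverse=True)
    return {i for s, i in surplus[:k] if s > 0}


# the 2-demand example from the earlier slides (items are 0-indexed here):
print(demand_set([5, 6, 4, 3], [4, 3, 2, 2], k=2))   # {1, 2}, i.e. items 2 and 3
```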

  9–11. Demand queries: examples (2-demand agent)
     value:   v_1 = $5   v_2 = $6     v_3 = $4     v_4 = $3
     price:   p_1 = $4   p_2 = $2     p_3 = $2     p_4 = $2      agent buys:  ✘ ✔ ✔ ✘
     price:   p_1 = $2   p_2 = $5     p_3 = $3     p_4 = $1.5    agent buys:  ✔ ✘ ✘ ✔
     price:   p_1 = $7   p_2 = $3.5   p_3 = $5.5   p_4 = $4      agent buys:  ✘ ✔ ✘ ✘

  12. Actively learning the valuations
      • suppose there are n items, and the value v_i of each item is an integer between 1 and W
      • how many demand queries suffice to learn the full valuations (i.e., (v_i)_i) of a k-demand agent?
      • spoiler: the optimal number of queries is (n log W) / (k log(n / k)) + n / k ± o(…)

  13. Sketch of lower bound
      (n log W) / (k log(n / k)) + n / k ± o(…)
      numerator n log W: the amount of information encoded in (v_i)_i
      denominator k log(n / k): the maximum amount of information per query

  14. Sketch of lower bound
      the n / k term is necessary in the following case:
      • exactly one item is special, which has value 0
      • all other items have value 1
      • the special item is chosen uniformly at random

  15. Sketch of upper bound
      • warmup: n = k = 1
      • need to learn: a single number v_1 in {1, 2, …, W}
      • query: given p, returns whether p < v_1
      • optimal solution: binary search — log W queries
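As an illustration of this warmup (my own sketch, not code from the talk), a standard binary search recovers v_1 with about log2(W) queries, assuming an oracle buys(p) that answers the single-item demand query, i.e. whether p < v_1:

```python
def learn_single_value(buys, W):
    """Recover an integer v_1 in {1, ..., W} from ~log2(W) demand queries.
    `buys(p)` returns True iff the agent buys the item at price p, i.e. p < v_1."""
    lo, hi = 1, W                 # invariant: lo <= v_1 <= hi
    while lo < hi:
        mid = (lo + hi) // 2
        if buys(mid):             # mid < v_1
            lo = mid + 1
        else:                     # v_1 <= mid
            hi = mid
    return lo


# usage: the oracle hides v_1 = 37 with W = 100; 7 queries suffice
print(learn_single_value(lambda p: p < 37, 100))      # 37
```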

  16. Sketch of upper bound
      • slight generalization: n = k (≥ 1)
      • need to learn: a vector (v_i)_i of integers in {1, 2, …, W}
      • query: given (p_i)_i, returns, for each item i, whether p_i < v_i
      • optimal solution: simultaneous binary search — log W queries

  17. Sketch of upper bound
      • general case: n ≥ k ≥ 1
      • straightforward solution: (1) divide items into groups of size k, and (2) perform simultaneous binary search for each group sequentially
      • (n / k) log W queries
      • LB is (n log W) / (k log(n / k)) — can we do better?
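The straightforward solution can be written against a demand-query oracle such as the one sketched after slide 8. The following sketch is again illustrative; the half-integer prices and the prohibitive price W + 1 are my own devices to avoid ties. It runs one simultaneous binary search per group of k items, so about (n / k) log W queries in total:

```python
def learn_values_grouped(demand_query, n, k, W):
    """Learn integer values v_i in {1, ..., W} with about (n/k) * log2(W)
    demand queries: process items in groups of size k, and inside each group
    run a simultaneous binary search.  Items outside the current group (and
    items already pinned down) get a prohibitive price, so the query answer,
    restricted to the group, is exactly {i : p_i < v_i}."""
    HIGH = W + 1                              # never worth buying at this price
    lo, hi = [1] * n, [W] * n                 # current bounds: lo_i <= v_i <= hi_i
    for start in range(0, n, k):
        group = range(start, min(start + k, n))
        while any(lo[i] < hi[i] for i in group):
            prices, mids = [HIGH] * n, {}
            for i in group:
                if lo[i] < hi[i]:
                    mids[i] = (lo[i] + hi[i]) // 2
                    prices[i] = mids[i] + 0.5 # half-integer price: no ties
            bought = demand_query(prices)     # one demand query
            for i, mid in mids.items():
                if i in bought:               # p_i < v_i  =>  v_i >= mid + 1
                    lo[i] = mid + 1
                else:                         # p_i > v_i  =>  v_i <= mid
                    hi[i] = mid
    return lo


# usage, with the demand_set sketch from above playing the role of the agent:
true_values = [5, 6, 4, 3]
print(learn_values_grouped(lambda p: demand_set(true_values, p, 2), n=4, k=2, W=10))
# [5, 6, 4, 3]
```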

  18. Sketch of upper bound
      idea: biased binary search
      • learn v_1 using log W queries, then use item 1 as a reference
      • in each query, post p_1 = v_1 - 0.5, so item 1 is marginally attractive
      • for all other items, post biased (rather than middle-of-possible-range) prices

  19. Sketch of upper bound
      [figure: possible value ranges for items 1–4 on a 0–100 axis, with v_1 already pinned down for item 1; n = 4, k = 1]

  20. Sketch of upper bound
      [figure: the same axis, now with posted prices p_2, p_3, p_4 near the top of each item's possible range, and p_1 = v_1 - 0.5]
      prices biased toward the higher end of the possible ranges

  21. Sketch of upper bound
      • in each query, post p_1 = v_1 - 0.5, so item 1 is marginally attractive
      • for all other items, post biased (rather than middle-of-possible-range) prices
      • if item 1 is in the demand set: many items are overpriced; shrink their possible ranges by a little
      • if item 1 is not in the demand set: a few items are underpriced; shrink their possible ranges by a lot
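To make the two cases concrete, here is a sketch (my reconstruction of the bookkeeping, not code from the paper) of how one query's answer could tighten the per-item ranges. It assumes integer values, a reference item ref whose value is already known and which is priced at v_ref - 0.5, and prices for the other items chosen so that no item's surplus can tie with 0 or with the reference's surplus of 0.5 (e.g., prices of the form m + 0.25 for integer m). Items in the demand set other than the reference must then be underpriced, and whenever the reference itself is bought, every unbought item must be overpriced:

```python
import math

def update_ranges(lo, hi, prices, bought, ref):
    """One round of range bookkeeping for the biased binary search (sketch).
    lo[i] <= v_i <= hi[i] are the current integer bounds; `prices` are posted
    so that no surplus ties with 0 or with the reference's surplus of 0.5;
    `bought` is the demand set returned by the query."""
    for i in bought:
        if i != ref:                          # bought => positive surplus => underpriced
            lo[i] = max(lo[i], math.ceil(prices[i]))
    if ref in bought:                         # reference still attractive =>
        for i in range(len(lo)):              # every unbought item is overpriced
            if i not in bought:
                hi[i] = min(hi[i], math.floor(prices[i]))
    # if ref is not bought, the unbought items reveal nothing new this round
```

Because the posted prices sit near the top of each item's range, the "overpriced" update (item 1 bought) removes only a thin slice, while the "underpriced" update (item 1 not bought) removes most of the range; that asymmetry is exactly what the following slides exploit.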

  22–24. Sketch of upper bound
      [figure, repeated over three animation steps: if item 1 is in the demand set, many items are overpriced; shrink their possible ranges by a little]

  25–27. Sketch of upper bound
      [figure, repeated over three animation steps: if item 1 is not in the demand set, a few items are underpriced; shrink their possible ranges by a lot]

  28. Sketch of upper bound
      • if item 1 is in the demand set: many items are overpriced; shrink their possible ranges by a little
      • if item 1 is not in the demand set: a few items are underpriced; shrink their possible ranges by a lot
      • adjust the bias to equalize the information gain in the two cases
      • larger information gain (~ k log(n / k)) in both cases!
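A rough back-of-the-envelope version of the bias calculation (my reconstruction of the slide's argument; the bias parameter δ below is notation I am introducing, not the paper's): suppose each non-reference item's price is placed so that the "overpriced" outcome keeps a (1 - δ) fraction of its current range and the "underpriced" outcome keeps a δ fraction. Measuring information as the log of the total shrink factor,

```latex
\begin{align*}
  \text{item 1 in the demand set:}\quad
    & \ge n-k \text{ items are overpriced}
      \;\Rightarrow\; \text{gain} \approx (n-k)\log\tfrac{1}{1-\delta} \approx n\,\delta, \\
  \text{item 1 not in the demand set:}\quad
    & \ge k \text{ items are underpriced}
      \;\Rightarrow\; \text{gain} \approx k\log\tfrac{1}{\delta}.
\end{align*}
```

Equalizing the two, n δ ≈ k log(1/δ), suggests δ ≈ (k / n) log(n / k), so each query yields roughly k log(n / k) bits in either case, matching the denominator of the lower bound.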

  29. • so far: a tight upper bound & lower bound for active learning
      • next: a (very brief) discussion of a computation- and sample-efficient algorithm for passive learning

  30. Passively learning valuations
      • prices are drawn from a fixed distribution
      • true valuations v: a vector of real numbers
      • the algorithm observes m i.i.d. sample price vectors p_j, together with the demand set S_j under p_j
      • given {(S_j, p_j)}, the algorithm outputs a hypothesis vector h which recovers v in a PAC sense: the algorithm succeeds with probability 1 - ε, in which case, with probability 1 - ζ over a fresh price vector p, the demand set under (v, p) equals the demand set under (h, p)

  31. Passively learning valuations
      • idea: empirical risk minimization
      • tool: multiclass ERM principle & Natarajan dimension
      • treat the problem as multiclass classification, with one label per possible demand set (< n^k labels)
      • the hypothesis class has Natarajan dimension n
      • sample complexity is poly(n, k, log(1 / ε), 1 / ζ)
      • solving the ERM = finding a feasible solution to an LP
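To illustrate the last bullet, here is a sketch (my reconstruction, not the paper's code) of the feasibility program: find a hypothesis h under which every observed demand set S_j is utility-maximizing at its prices p_j. Strict inequalities are relaxed to weak ones, values are assumed non-negative, and scipy's linprog is used purely as a feasibility solver (zero objective):

```python
import numpy as np
from scipy.optimize import linprog

def erm_feasible_values(samples, n, k):
    """Find a value vector h consistent with observed (prices, demand set)
    pairs of a k-demand agent, via a feasibility LP.  For each sample (p, S):
      * h_i - p_i >= h_j - p_j  for every i in S and j not in S,
      * h_i - p_i >= 0          for every i in S,
      * h_j - p_j <= 0          for every j not in S, whenever |S| < k."""
    A_ub, b_ub = [], []
    def add_leq(coeffs, bound):               # constraint: coeffs . h <= bound
        A_ub.append(coeffs)
        b_ub.append(bound)

    for prices, S in samples:
        others = [j for j in range(n) if j not in S]
        for i in S:
            row = np.zeros(n)
            row[i] = -1.0
            add_leq(row, -prices[i])          # h_i >= p_i
            for j in others:
                row = np.zeros(n)
                row[i], row[j] = -1.0, 1.0
                add_leq(row, prices[j] - prices[i])   # h_j - p_j <= h_i - p_i
        if len(S) < k:
            for j in others:
                row = np.zeros(n)
                row[j] = 1.0
                add_leq(row, prices[j])       # h_j <= p_j
    res = linprog(c=np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * n, method="highs")
    return res.x if res.success else None
```

Any feasible h reproduces every observed demand set (up to the tie cases relaxed above); the multiclass ERM and Natarajan-dimension machinery is what turns this consistency on the samples into the PAC guarantee on fresh price vectors.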

  32. Future directions
      • more general valuations, e.g., matroid-demand
      • tighter sample complexity bounds for passive learning

  33. Thanks for your attention! Questions?

  34. Related research
      • in economic theory: learning utility functions from revealed preferences (Samuelson, 1938; Afriat, 1967; Beigman & Vohra, 2006; …)
      • in CS: preference elicitation (Blum et al., 2004; Lahaie & Parkes, 2004; Sandholm & Boutilier, 2006; …)
