Optimal Non-parametric Learning in Repeated Contextual Auctions with Strategic Buyer
Alexey Drutsa
Setup
Repeated Contextual Posted-Price Auctions
› Different goods (e.g., ad spaces), described by d-dimensional feature vectors (contexts) from [0,1]^d, are repeatedly offered for sale by a seller to a single buyer over T rounds (one good per round).
› The buyer holds a private fixed valuation function v: [0,1]^d → [0,1], used to calculate his valuation v(x) for a good with context x ∈ [0,1]^d; v is unknown to the seller.
At each round t = 1, …, T:
› a feature vector x_t of the current good is observed by the seller and the buyer;
› a price p_t is offered by the seller;
› an allocation decision a_t ∈ {0,1} is made by the buyer: a_t = 0 when the buyer rejects, and a_t = 1 when the buyer accepts.
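A minimal Python sketch of this protocol may help fix notation; all names here (run_auction, seller_price, buyer_decision) are illustrative placeholders, not from the talk, and the two parties' strategies are left as unspecified callables.

```python
# A minimal sketch of the interaction protocol, assuming illustrative names.

def run_auction(T, contexts, seller_price, buyer_decision):
    """Play T rounds and return the public history of (x_t, p_t, a_t)."""
    history = []
    for t in range(T):
        x_t = contexts[t]                        # context of the current good
        p_t = seller_price(history, x_t, T)      # seller posts a price
        a_t = buyer_decision(history, x_t, p_t)  # buyer accepts (1) or rejects (0)
        history.append((x_t, p_t, a_t))
    return history
```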
Seller's pricing algorithm and buyer strategy: the seller applies a pricing algorithm A that sets prices {p_t}_{t=1}^T in response to buyer decisions a = {a_t}_{t=1}^T and observed contexts x = {x_t}_{t=1}^T. The price p_t can depend only on:
› the past decisions {a_s}_{s=1}^{t−1};
› the feature vectors {x_s}_{s=1}^{t};
› the horizon T.
Strategic buyer
› The seller announces her pricing algorithm A in advance.
› The buyer has some distribution (beliefs) D about future contexts.
› In each round t, given the history of previous rounds, he chooses his decision a_t so that it maximizes his expected future γ-discounted surplus:
E_{x_s ∼ D}[ Σ_{s=t}^{T} γ^{s−1} a_s (v(x_s) − p_s) ],  γ ∈ (0,1].
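As a worked illustration, the sketch below evaluates this objective for a fixed context sequence (the expectation over D is dropped purely for illustration; discounted_surplus is my name, not from the talk).

```python
# Sketch: the buyer's objective for a *fixed* context sequence.

def discounted_surplus(v, decisions, contexts, prices, gamma, start):
    """Sum over s = start..T of gamma^(s-1) * a_s * (v(x_s) - p_s), rounds 1-indexed."""
    return sum(gamma ** (s - 1) * a * (v(x) - p)
               for s, (a, x, p) in enumerate(zip(decisions, contexts, prices), start=1)
               if s >= start)
```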
The game's workflow and knowledge structure
[Diagram: a timeline from "before game starts" through rounds t = 1, 2, 3, … Private knowledge: the buyer holds v (drawn by Nature) and his beliefs D. Public knowledge: the seller's algorithm A, announced before the game starts, and, in each round t, the context x_t, the price p_t, and the decision a_t.]
Seller's goal
The seller's strategic regret:
SReg(T, A, v, γ, x_{1:T}, D) := Σ_{t=1}^{T} (v(x_t) − a_t p_t).
We will learn the function v in a non-parametric way. For this, we assume that it is Lipschitz (a standard requirement for non-parametric learning):
Lip_L([0,1]^d) := { f: [0,1]^d → [0,1] | ∀ x, y ∈ [0,1]^d: |f(x) − f(y)| ≤ L‖x − y‖_∞ }.
The seller seeks a no-regret pricing for the worst-case valuation function:
sup_{v ∈ Lip_L([0,1]^d), x_{1:T}, D} SReg(T, A, v, γ, x_{1:T}, D) = o(T).
Optimality: the lowest possible upper bound on the regret of the form O(f(T)).
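A short sketch of this regret as code (the benchmark sells each good exactly at its valuation); strategic_regret is my name, and the example valuation is just one arbitrary member of the Lipschitz class.

```python
# Sketch: strategic regret of one play-through, per the definition above.

def strategic_regret(v, contexts, prices, decisions):
    """Benchmark revenue sum_t v(x_t) minus realized revenue sum_t a_t * p_t."""
    return sum(v(x) - a * p for x, p, a in zip(contexts, prices, decisions))

v = lambda x: 0.5 * max(x)  # a 0.5-Lipschitz valuation on [0,1]^d (l_inf norm)
```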
Background & Research question
Background
[Kleinberg et al., FOCS'2003] Non-contextual setup (d = 0). Horizon-dependent optimal algorithm against a myopic buyer (γ = 0) with truthful regret Θ(log log T).
[Amin et al., NIPS'2013] Non-contextual setup (d = 0). The strategic setting is introduced. There is no no-regret pricing for the non-discounted case γ = 1.
[Drutsa, WWW'2017] Non-contextual setup (d = 0). Horizon-independent optimal algorithm against a strategic buyer with regret Θ(log log T) for γ < 1.
[Mao et al., NIPS'2018] Our non-parametric contextual setup (d > 0). Horizon-dependent optimal algorithm against a myopic buyer (γ = 0) with truthful regret Θ(T^{d/(d+1)}).
Research question
The key approaches of the non-contextual optimal algorithms ([pre]PRRFES) cannot be directly applied to the contextual algorithm of [Mao et al., NIPS'2018].
In order to search for the valuation of the strategic buyer without context:
› penalization rounds are used;
› we do not propose prices below the ones that were earlier accepted.
In the approach of [Mao et al., NIPS'2018]:
› standard penalization does not help;
› proposed prices can be below the ones that were earlier accepted by the buyer.
In this study, I overcome these issues and propose an optimal non-parametric algorithm for the contextual setting with a strategic buyer.
Novel optimal algorithm
Penalized Exploiting Lipschitz Search (PELS)
PELS has three parameters:
› the price offset g ∈ [1, +∞);
› the degree of penalization r ∈ ℕ;
› the exploitation rate R: ℤ≥0 → ℤ≥0.
This algorithm keeps track of a partition P of the feature domain [0,1]^d, initialized to ⌈(4g+6)L⌉^d cubes (boxes) with side length ε = 1/⌈(4g+6)L⌉:
P = { I_1 × I_2 × ⋯ × I_d | I_1, I_2, …, I_d ∈ {[0, ε), [ε, 2ε), …, [1−ε, 1]} }.
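As a minimal sketch, this state could be kept as follows (Box and make_partition are my names; the talk specifies only the quantities, not an implementation).

```python
import math
from dataclasses import dataclass
from itertools import product

@dataclass
class Box:
    lower: tuple    # lower corner of the cube in [0,1]^d
    side: float     # side length; the box's l_inf-diameter equals `side`
    u: float = 0.0  # lower bound u_B on the valuation inside the box
    w: float = 1.0  # upper bound w_B
    depth: int = 0  # depth n_B (number of bisections applied so far)

def make_partition(d, L, g):
    """Split [0,1]^d into ceil((4g + 6) * L)^d cubes of side 1/ceil((4g + 6) * L)."""
    k = math.ceil((4 * g + 6) * L)  # cubes per axis
    eps = 1.0 / k
    return [Box(lower=tuple(i * eps for i in idx), side=eps)
            for idx in product(range(k), repeat=d)]
```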
Penalized Exploiting Lipschitz Search (PELS)
For each box B ∈ P, PELS also keeps track of:
› the lower bound u_B ∈ [0,1],
› the upper bound w_B ∈ [0,1],
› the depth n_B ∈ ℤ≥0.
They are initialized as follows: u_B = 0, w_B = 1, and n_B = 0 for all B ∈ P.
The workflow of the algorithm is organized independently in each box B ∈ P:
› the algorithm receives a good with a feature vector x_t ∈ [0,1]^d;
› it finds the box B ∈ P in the current partition P such that x_t ∈ B.
Then the proposed price p_t is determined only from the current state associated with the box B, while the buyer decision a_t is used only to update the state associated with this box B.
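Continuing the sketch above, routing a context to its box could look like this; a real implementation would index the boxes, and the linear scan here just keeps the sketch short.

```python
# Continuing the Box sketch: route a context to the box that contains it.

def find_box(partition, x):
    """Return the box of the current partition containing x in [0,1]^d."""
    x = tuple(min(xi, 1.0 - 1e-12) for xi in x)  # fold the top edge inward
    for box in partition:
        if all(lo <= xi < lo + box.side for lo, xi in zip(box.lower, x)):
            return box
    raise ValueError("context outside [0,1]^d")
```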
Penalized Exploiting Lipschitz Search (PELS)
In each box B ∈ P, the algorithm iteratively offers the exploration price
u_B + g·L·diam(B).
If this price is accepted by the buyer:
› the lower bound u_B is increased by L·diam(B).
If this price is rejected:
› the upper bound w_B is decreased by w_B − u_B − 2(g+1)·L·diam(B);
› the price 1 is offered as a penalization price for the r next rounds in this box B (if one of them is accepted, we continue offering 1 for all the remaining rounds).
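Continuing the sketch, the per-box exploration step could read as below. The increment L·diam(B) and the shrinkage constant 2(g+1) follow my reading of the slides, whose symbols are partly garbled in this transcript.

```python
# Continuing the Box sketch: the exploration update in one box.

def exploration_price(box, g, L):
    return box.u + g * L * box.side  # u_B + g * L * diam(B)

def on_accept(box, L):
    box.u += L * box.side            # raise the lower bound u_B

def on_reject(box, g, L):
    # shrink the upper bound to u_B + 2(g + 1) * L * diam(B); the seller then
    # offers the penalization price 1 for the next r rounds in this box (and
    # forever in this box, if any penalization price is accepted)
    box.w = box.u + 2 * (g + 1) * L * box.side
```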
Penalized Exploiting Lipschitz Search (PELS) β If, after an acceptance of an exploration price or after penalization rounds we have π₯ β β π£ β < (2π + 3)πdiam(π) , β then PELS: βΊ offers the exploitation price π£ β for π(π β ) next rounds in this box π (buyer decisions made at them do not affect further pricing); βΊ bisects each side of the box π to obtain 2 % boxes π β β π 9 , β¦ , π J g with β Ε½ -diameter equal to diam(π)/2 ; βΊ refines the partition π β replacing the box π by the new boxes π β . These new boxes π β βΊ inherit the state of the bounds π£ β and π₯ β from the current state of π , βΊ while their depth π β’ = π β + 1 βπ β π β .
PELS is optimal
Theorem 1. Let L ≥ 1 and γ_0 ∈ [0,1). Then, for the pricing algorithm PELS A with:
› the number of penalization rounds r ≥ log_{γ_0}((1 − γ_0)/2),
› the exploitation rate R(n) = 2^n, n ∈ ℤ≥0,
› the price offset g ≥ 2/(1 − γ_0),
and for any valuation function v ∈ Lip_L([0,1]^d), feature vectors x_{1:T}, discount γ ≤ γ_0, and distribution D, the strategic regret is upper bounded:
SReg(T, A, v, γ, x_{1:T}, D) ≤ C·(K_0^{1/(d+1)} T^{d/(d+1)} + K_0) = Θ(T^{d/(d+1)}),
where C := 2^d L(2g + 3 + L^{−1}) + 1 and K_0 := ⌈(4g+6)L⌉^d.
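A small helper (pels_parameters is my own, hypothetical name) that picks parameter values meeting the theorem's conditions for a given discount cap 0 < γ_0 < 1:

```python
import math

# Sketch: parameter choices satisfying Theorem 1, assuming 0 < gamma0 < 1.

def pels_parameters(d, L, gamma0):
    g = 2.0 / (1.0 - gamma0)                                # price offset
    r = math.ceil(math.log((1.0 - gamma0) / 2.0, gamma0))   # penalization rounds
    R = lambda n: 2 ** n                                    # exploitation rate
    K0 = math.ceil((4 * g + 6) * L) ** d                    # initial number of boxes
    return g, r, R, K0

# e.g. d = 2, L = 1, gamma0 = 1/2 gives g = 4, r = 2, K0 = 22**2 = 484
g, r, R, K0 = pels_parameters(d=2, L=1.0, gamma0=0.5)
```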
PELS: main properties and extensions
› PELS can be applied against a myopic buyer (γ = 0), the setup of [Mao et al., NIPS'2018].
› PELS is horizon-independent (in contrast to [Mao et al., NIPS'2018]).
What if the loss is symmetric?
› We can generalize the algorithm to classical online learning losses.
› For instance, we want to optimize regret of the form Σ_{t=1}^{T} |v(x_t) − p_t|, while still interacting with the strategic buyer.
› A slight modification of PELS has regret O(T^{d/(d+1)}), which is tight for d > 1.
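For reference, a one-line sketch of this symmetric objective (the function name is mine):

```python
# Sketch: the symmetric, online-learning-style loss mentioned above.
def symmetric_loss(v, contexts, prices):
    return sum(abs(v(x) - p) for x, p in zip(contexts, prices))
```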
Thank you!
Alexey Drutsa, Yandex
adrutsa@yandex.ru