Reserve Pricing in Repeated Second-Price Auctions with Strategic Bidders
Alexey Drutsa
Setup
Second-Price (SP) Auction with Reserve Prices
• Setting
  ► A good (e.g., an ad space) is offered for sale by a seller to $M$ buyers
  ► Each buyer $m$ holds a private valuation $v^m \in [0,1]$ for this good ($v^m$ is unknown to the seller)
• Actions
  ► The seller selects a reserve price $p^m$ for each buyer $m$
  ► Each buyer $m$ submits a bid $b^m$
• Allocation and payments
  ► Determine the actual buyer-participants: $\mathbb{M} = \{m \mid b^m \ge p^m\}$
  ► The good is received by the buyer $\bar m = \arg\max_{m \in \mathbb{M}} b^m$ (the one with the highest bid)
  ► This buyer pays $\bar p^{\bar m} = \max\{p^{\bar m}, \max_{m \in \mathbb{M} \setminus \{\bar m\}} b^m\}$
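The allocation and payment rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name `run_sp_auction`, the 0-based buyer indices, and the tie-breaking rule (lowest index wins) are assumptions made here.

```python
from typing import Optional, Sequence, Tuple


def run_sp_auction(
    bids: Sequence[float], reserves: Sequence[float]
) -> Tuple[Optional[int], float]:
    """One second-price auction with personal reserve prices.

    Returns (winner_index, payment). The winner is None and the payment
    is 0.0 when no bid reaches its own reserve price.
    """
    if len(bids) != len(reserves):
        raise ValueError("bids and reserves must have the same length")
    # Actual participants: buyers whose bid is at least their own reserve.
    participants = [m for m, (b, p) in enumerate(zip(bids, reserves)) if b >= p]
    if not participants:
        return None, 0.0
    # Winner: highest bid among participants. Ties go to the lowest index,
    # which is an assumption made here for determinism.
    winner = max(participants, key=lambda m: bids[m])
    rival_bids = [bids[m] for m in participants if m != winner]
    # Payment: the larger of the winner's reserve and the best rival bid.
    payment = max([reserves[winner]] + rival_bids)
    return winner, payment
```

For example, with bids `[0.7, 0.5, 0.2]` and reserves `[0.3, 0.4, 0.6]`, the third buyer does not participate, the first buyer wins, and the payment is the second-highest participating bid.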
Repeated Second-Price Auctions with Reserve
Identical goods (e.g., ad spaces) are repeatedly offered for sale
  ► by a seller (e.g., an RTB platform) to $M$ buyers (e.g., advertisers)
  ► over $T$ rounds (one good per round).
Each buyer $m$
  ► holds a private fixed valuation $v^m \in [0,1]$ for each of those goods;
  ► $v^m$ is unknown to the seller.
At each round $t = 1, \dots, T$, the seller conducts an SP auction with reserves:
  ► the seller selects a reserve price $p_t^m$ for each buyer $m$
  ► and each buyer $m$ submits a bid $b_t^m$.
Seller's pricing algorithm
  ► The seller applies a pricing algorithm $\mathcal{A}$ that sets reserve prices $\{p_t^m\}_{t=1,m=1}^{T,M}$ in response to the bids $\mathbf{b} = \{b_t^m\}_{t=1,m=1}^{T,M}$ of buyers $m = 1, \dots, M$
  ► A price $p_t^m$ can depend only on past bids $\{b_s^k\}_{s=1,k=1}^{t-1,M}$ and the horizon $T$.
Strategic buyers
• The seller announces her pricing algorithm $\mathcal{A}$ in advance
• In each round $t$, each buyer $m$
  ► observes the history of previous rounds (as available to this buyer) and
  ► chooses his bid $b_t^m$ so that it maximizes his future $\gamma_m$-discounted surplus:
$$\mathrm{Sur}_t(\mathcal{A}, v^m, \gamma_m, \{b_s^m\}) := \mathbb{E}\Bigl[\sum_{s=t}^{T} \gamma_m^{s-1}\, \mathbb{I}_{\{m = \bar m_s\}}\,(v^m - \bar p_s^m)\Bigr], \quad \gamma_m \in (0,1],$$
where $\mathbb{I}_{\{m = \bar m_s\}}$ is the indicator of the event that buyer $m$ is the winner in round $s$, and $\bar p_s^m$ is the payment of buyer $m$ in this case.
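As a rough illustration of the discounted surplus on this slide, the sketch below evaluates it for one realized sequence of outcomes, so the expectation is dropped. The function name `discounted_surplus` and its arguments are assumptions introduced here, not notation from the talk.

```python
from typing import Sequence


def discounted_surplus(
    valuation: float,
    gamma: float,
    wins: Sequence[bool],
    payments: Sequence[float],
    start_round: int = 1,
) -> float:
    """Realized discounted surplus of one buyer from `start_round` on.

    Rounds are 1-based: wins[s - 1] tells whether the buyer won round s,
    and payments[s - 1] is what he paid in that round (ignored if he lost).
    """
    total = 0.0
    for s in range(start_round, len(wins) + 1):
        if wins[s - 1]:
            # The surplus of round s is discounted by gamma ** (s - 1).
            total += gamma ** (s - 1) * (valuation - payments[s - 1])
    return total
```

With `gamma = 1.0` the function reduces to the undiscounted sum of per-round surpluses.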
Seller's goal
The seller's strategic regret:
$$\mathrm{SReg}(T, \mathcal{A}, \{v^m\}_m, \{\gamma_m\}_m) := \sum_{t=1}^{T}\Bigl(\max_m v^m - \mathbb{I}_{\{\mathbb{M}_t \neq \emptyset\}}\,\bar p_t^{\bar m_t}\Bigr)$$
She seeks a no-regret pricing algorithm for worst-case valuations:
$$\sup_{v^1, \dots, v^M \in [0,1]} \mathrm{SReg}(T, \mathcal{A}, \{v^m\}_m, \{\gamma_m\}_m) = o(T)$$
Optimality: the lowest possible upper bound for the regret of the form $O(f(T))$.
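The regret definition above can be computed directly from a log of per-round payments. The sketch below is illustrative only: the name `strategic_regret` and the convention that `None` marks a round with no participants are assumptions made here.

```python
from typing import Optional, Sequence


def strategic_regret(
    valuations: Sequence[float], payments: Sequence[Optional[float]]
) -> float:
    """Sum over rounds of (maximal valuation - payment received).

    payments[t] is the seller's revenue in round t, or None when no buyer
    bid at least his reserve price (the seller then receives nothing).
    """
    best = max(valuations)
    return sum(best - (p if p is not None else 0.0) for p in payments)
```

A round without a sale costs the seller the full maximal valuation, which is why rejected prices are expensive for a learning algorithm.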
Background, Research question & Main contribution
Background: 1-buyer case (posted-price auctions)
With one buyer ($M = 1$), an SP auction reduces to a posted-price auction:
  ► the buyer either accepts or rejects the currently offered price $p_t^1$
  ► the seller either gets a payment equal to $p_t^1$ or nothing
[Kleinberg et al., FOCS'2003] Optimal algorithm against a myopic buyer with truthful regret $\Theta(\log\log T)$.
[Amin et al., NIPS'2013] The strategic setting is introduced. There is no no-regret pricing for the non-discount case $\gamma = 1$.
[Drutsa, WWW'2017] Optimal algorithm against a strategic buyer with regret $\Theta(\log\log T)$ for $\gamma < 1$.
Research question
The known optimal algorithms (PRRFES & prePRRFES) from posted-price auctions cannot be directly applied to set reserve prices in second-price auctions:
  ► buyers in SP auctions have incomplete information due to the presence of rivals
  ► the proofs of optimality of [pre]PRRFES strongly rely on complete information
Hence, in this study, I seek an optimal algorithm for the multi-buyer setup.
Main contribution
• A novel algorithm for our strategic buyers with a regret upper bound of $\Theta(\log\log T)$ for $\gamma < 1$
• A novel transformation that maps any pricing algorithm designed for posted-price auctions to a multi-buyer setup
Main ideas
Two learning processes
$$\mathrm{SReg}(T, \mathcal{A}, \{v^m\}_m, \{\gamma_m\}_m) := \sum_{t=1}^{T}\Bigl(\max_m v^m - \mathbb{I}_{\{\mathbb{M}_t \neq \emptyset\}}\,\bar p_t^{\bar m_t}\Bigr)$$
Learning process #1: find the buyers' valuations
Learning process #2: find which buyer has the maximal valuation
Learning proc.#1: an idea to localize a valuation
PRRFES is an optimal learner of a valuation in posted-price auctions. However, its core localization technique relies on the following:
• The buyer completely knows the outcomes of the current and all future rounds given his bids (due to the absence of rivals)
Can we use PRRFES in the second-price scenario, where each buyer does not perfectly know the outcomes of rounds?
Barrage pricing
  ► Reserve prices are personal (individual) in our setup
  ► Thus, we are able to "eliminate" particular buyers from particular rounds
  ► Namely, a buyer $m$ will not bid above $1/(1 - \gamma_m)$
  ► We call this price the "barrage" price and denote it by $p^{bar}$
• Let us "eliminate" all buyers except some buyer $m$ in a round $t$. Then buyer $m$ will have complete information about the outcome of this round.
Learning proc.#2: an idea to find max valuation
The search algorithm works by maintaining, for each buyer $m$, a feasible interval $[u^m, w^m]$ that
  ► is aimed at localizing the valuation $v^m$, i.e., $v^m \in [u^m, w^m]$
  ► shrinks as $t \to \infty$
[Figure: the feasible intervals $[u^1, w^1]$, $[u^2, w^2]$, $[u^3, w^3]$ around the valuations $v^1$, $v^2$, $v^3$ on the axis from 0 to 1, shown at rounds $t_1$, $t_2$, $t_3$.]
• If, in a round $t$, it turns out that $w^m < u^n$ for some buyers $m$ and $n$,
• then buyer $m$ has a non-maximal valuation, which should not be searched anymore
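The elimination test on this slide compares interval endpoints across buyers. A minimal sketch follows, assuming each buyer's feasible interval is stored as a `(lower, upper)` pair with `lower <= upper`; the function name `dominated_buyers` and the dictionary layout are hypothetical.

```python
from typing import Dict, Set, Tuple


def dominated_buyers(intervals: Dict[int, Tuple[float, float]]) -> Set[int]:
    """Buyers whose upper endpoint lies strictly below some rival's lower endpoint.

    Because lower <= upper for every buyer, comparing against the largest
    lower endpoint over all buyers is equivalent to comparing against rivals.
    """
    best_lower = max(lower for lower, _ in intervals.values())
    return {m for m, (_, upper) in intervals.items() if upper < best_lower}
```

Buyers returned by this check cannot hold the maximal valuation as long as every interval really contains its buyer's valuation.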
Dividing algorithms
Key instrument that implements the ideas: the transformation div
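Transformation div: cyclic elimination
Let $\mathcal{A}$ be an algorithm designed for repeated posted-price auctions.
• Its transformation $\mathrm{div}(\mathcal{A})$ is an algorithm for repeated SP auctions, as follows.
Reserve prices for rounds $t = 1, \dots, 8$ (only one non-barrage price in a round; "bar" denotes the barrage price):
  Buyer 1 (prices set by Algorithm $\mathcal{A}$): $p_1^1$, bar, bar, $p_4^1$, bar, bar, $p_7^1$, bar, ...
  Buyer 2 (prices set by Algorithm $\mathcal{A}$): bar, $p_2^2$, bar, bar, $p_5^2$, bar, bar, $p_8^2$, ...
  Buyer 3 (prices set by Algorithm $\mathcal{A}$): bar, bar, $p_3^3$, bar, bar, $p_6^3$, bar, bar, ...
Periods: $i = 1$ covers rounds 1 to 3, $i = 2$ covers rounds 4 to 6, $i = 3$ starts at round 7.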
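The cyclic pattern on this slide (one non-barrage reserve per round, rotating over the buyers) can be sketched as follows. The function name `div_reserves`, the 0-based buyer positions, and the idea that the inner posted-price algorithm has already produced `active_price` are assumptions of this sketch; it does not handle the stopping rule.

```python
from typing import List


def div_reserves(
    num_buyers: int,
    round_index: int,
    active_price: float,
    barrage_price: float,
) -> List[float]:
    """Reserve prices for one round (1-based) under cyclic elimination.

    Exactly one buyer receives `active_price`; every other buyer receives
    the barrage price and is thereby eliminated from the round.
    """
    # 0-based position of the single non-eliminated buyer in this round.
    active = (round_index - 1) % num_buyers
    return [
        active_price if m == active else barrage_price
        for m in range(num_buyers)
    ]
```

With three buyers, rounds 1, 4, 7 serve the first buyer, rounds 2, 5, 8 the second, and rounds 3, 6 the third, matching the table on the slide.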
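Transformation div: stopping rule
We stop considering a buyer $m$ in periods once $w^m < u^n$ for some buyer $n$, where $[u^m, w^m]$ is the feasible interval of buyer $m$.
• The number of periods with buyer $m$ is referred to as its subhorizon, $I^m$.
Example: we stopped learning $v^1$, with $I^1 = i$, once $w^1 < u^2$.
Reserve prices for rounds $t = s, \dots, s+7$ ("bar" denotes the barrage price):
  Buyer 1 (prices set by Algorithm $\mathcal{A}$): $p_s^1$, bar, bar, bar, bar, bar, bar, bar, ...
  Buyer 2 (prices set by Algorithm $\mathcal{A}$): bar, $p_{s+1}^2$, bar, $p_{s+3}^2$, bar, $p_{s+5}^2$, bar, $p_{s+7}^2$, ...
  Buyer 3 (prices set by Algorithm $\mathcal{A}$): bar, bar, $p_{s+2}^3$, bar, $p_{s+4}^3$, bar, $p_{s+6}^3$, bar, ...
Periods: $i$ covers rounds $s$ to $s+2$, $i+1$ covers rounds $s+3$ and $s+4$, $i+2$ covers rounds $s+5$ and $s+6$, $i+3$ starts at round $s+7$.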
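To see how the stopping rule shortens later periods, the sketch below builds the round-by-round schedule of non-eliminated buyers when the subhorizons are given up front. In the actual procedure the subhorizons are determined online from the feasible intervals; supplying them as an input, and the name `cyclic_schedule`, are simplifying assumptions made here.

```python
from typing import Dict, List, Optional


def cyclic_schedule(
    subhorizons: Dict[int, Optional[int]], num_rounds: int
) -> List[int]:
    """For each round, the buyer who receives the non-barrage reserve.

    subhorizons[m] is the number of periods buyer m takes part in;
    None means the buyer is never stopped within the simulated rounds.
    """
    schedule: List[int] = []
    period = 1
    while len(schedule) < num_rounds:
        # Buyers still considered in this period, in cyclic (sorted) order.
        active = [
            m
            for m in sorted(subhorizons)
            if subhorizons[m] is None or period <= subhorizons[m]
        ]
        if not active:
            break
        schedule.extend(active)
        period += 1
    return schedule[:num_rounds]
```

For three buyers where buyer 1 is stopped after one period, the schedule starts with one full period and then alternates between buyers 2 and 3, as in the slide's example.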
Transformation div: regret decomposition
Lemma 1. For the described transformation, the strategic regret has the decomposition:
$$\mathrm{SReg}(T, \mathrm{div}(\mathcal{A}), \{v^m\}_m, \{\gamma_m\}_m) = \sum_m \mathrm{Reg}^m(T, \mathcal{A}, v^m, \gamma_m) + \sum_m I^m \bigl(\max_n v^n - v^m\bigr)$$
• Individual regrets (first sum): measure how the algorithm $\mathcal{A}$ learns the valuation of each buyer
• Deviation regret (second sum): measures how fast we stop learning non-maximal valuations
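The decomposition in Lemma 1 can be checked numerically on a toy trace. In the sketch below, each round is recorded as the buyer who was not eliminated plus the seller's revenue in that round; this record format, the function name `regret_decomposition`, and the numbers in the usage note are illustrative assumptions, not data from the talk.

```python
from typing import Dict, List, Tuple


def regret_decomposition(
    valuations: Dict[int, float], rounds: List[Tuple[int, float]]
) -> Tuple[float, float, float]:
    """Return (total regret, sum of individual regrets, deviation regret).

    Each entry of `rounds` is (non-eliminated buyer, seller revenue).
    """
    best = max(valuations.values())
    total = sum(best - revenue for _, revenue in rounds)
    individual = {m: 0.0 for m in valuations}
    subhorizon = {m: 0 for m in valuations}
    for m, revenue in rounds:
        # Individual regret is measured against the buyer's own valuation.
        individual[m] += valuations[m] - revenue
        # Count the rounds (one per period) in which buyer m is served.
        subhorizon[m] += 1
    deviation = sum(subhorizon[m] * (best - valuations[m]) for m in valuations)
    return total, sum(individual.values()), deviation
```

For valuations `{1: 0.4, 2: 0.9}` and rounds `[(1, 0.3), (2, 0.5), (2, 0.0), (2, 0.8)]`, the total regret equals the individual part plus the deviation part, up to floating-point error.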
Key challenge against strategic buyer
A strategic buyer may lie and mislead algorithms, so a good algorithm must extract correct information about a buyer's valuation from his actions (bids).
• The dividing structure in a round allows us to construct a tool to locate valuations:
• it is enough to create a complete-information situation in a round
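Upper bound on valuation of strategic buyer
Let buyer $m$ be the non-"eliminated" one in a round $t$.
• If the buyer accepts (bids above) the current reserve price $p_t^m$:
$$\mathrm{Surplus}_t = \mathbb{E}\Bigl[\gamma_m^{t-1}\,\mathbb{I}_{\{m = \bar m_t\}}(v^m - \bar p_t^m)\Bigr] + \mathbb{E}\Bigl[\sum_{s=t+1}^{T} \gamma_m^{s-1}\,\mathbb{I}_{\{m = \bar m_s\}}(v^m - \bar p_s^m)\Bigr] \ge \gamma_m^{t-1}(v^m - p_t^m),$$
since the first term equals $\gamma_m^{t-1}(v^m - p_t^m)$ and the second term is $\ge 0$.
• If the buyer rejects (bids below) the current reserve price $p_t^m$:
$$\mathrm{Surplus}_t = \mathbb{E}\Bigl[\sum_{s=t+r}^{T} \gamma_m^{s-1}\,\mathbb{I}_{\{m = \bar m_s\}}(v^m - \bar p_s^m)\Bigr] \le \frac{\gamma_m^{t+r-1}}{1 - \gamma_m}\,(v^m - \text{[lowest\_price]}),$$
where, after a rejection, the buyer can earn surplus only from round $t + r$ on.
If we observe that a buyer rejects a non-"barrage" reserve price, then:
$$v^m - p_t^m < \frac{\gamma_m^{r}}{1 - \gamma_m - \gamma_m^{r}}\,(p_t^m - \text{[lowest\_price]})$$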
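The step from the two surplus bounds to the final inequality can be spelled out as follows. This is a sketch under stated assumptions, not the author's proof: the buyer index is dropped, $p_{\mathrm{low}}$ stands for the slide's [lowest_price], $r$ is the delay (in rounds) before a rejecting buyer can earn surplus again, and $1 - \gamma - \gamma^{r} > 0$ is assumed so that the last division keeps the direction of the inequality. A buyer rejects only if rejecting is at least as good as accepting, which gives the first line.

```latex
\begin{align*}
\gamma^{t-1}\,(v - p_t)
  &\le \frac{\gamma^{t+r-1}}{1-\gamma}\,(v - p_{\mathrm{low}}) \\
(1-\gamma)\,(v - p_t)
  &\le \gamma^{r}\,\bigl((v - p_t) + (p_t - p_{\mathrm{low}})\bigr) \\
(1-\gamma-\gamma^{r})\,(v - p_t)
  &\le \gamma^{r}\,(p_t - p_{\mathrm{low}}) \\
v - p_t
  &\le \frac{\gamma^{r}}{1-\gamma-\gamma^{r}}\,(p_t - p_{\mathrm{low}})
\end{align*}
```

The inequality becomes strict, as on the slide, when the buyer rejects only if rejecting is strictly better than accepting.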
Optimal algorithm
Pricing algorithm divPRRFES
Apply the transformation div to the PRRFES algorithm.
divPRRFES: individual and deviation regrets
• Individual regrets
Our tool to locate valuations provides the upper bound (as in the 1-buyer case):
$$\mathrm{Reg}^m(T, \mathcal{A}, v^m, \gamma_m) = O(\log_2 \log_2 T) \quad \forall m$$
• Deviation regrets
  ► For each buyer $m$ with a non-maximal valuation (i.e., $v^m < \max_n v^n$)
  ► we can upper bound its subhorizon $I^m$:
$$I^m \le \frac{C}{\max_n v^n - v^m}$$