MergeDTS for Large Scale Condorcet Dueling Bandits
Chang Li, Ilya Markov, Maarten de Rijke and Masrour Zoghi
What are dueling bandits?
• The K-armed dueling bandits problem (Yue et al., COLT 2009): K arms (aka actions).
• Each time step:
➡ the algorithm chooses two arms, l and r (for “left” and “right”);
➡ a duel happens between l and r, with one of them returned as the winner.
• Goal: converge to the optimal play for both l and r.
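To make the interaction protocol concrete, here is a minimal simulation sketch. The 3-arm preference matrix P and the duel helper are illustrative assumptions of ours, not part of the slides; arm 0 plays the role of the best arm.

```python
import numpy as np

# Hypothetical 3-arm preference matrix: P[i, j] = Pr(arm i beats arm j).
# Arm 0 beats both other arms on average (0-indexed here; the slides use 1-indexing).
P = np.array([
    [0.5, 0.6, 0.7],
    [0.4, 0.5, 0.6],
    [0.3, 0.4, 0.5],
])

rng = np.random.default_rng(42)

def duel(l, r):
    """Simulate one duel between arms l and r; return the winning arm."""
    return l if rng.random() < P[l, r] else r
```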
What is the optimal play?
• Notation: $P := [P_{ij}]$ is the preference matrix, with $P_{ij} = \Pr(\text{arm } i \text{ beats arm } j)$.
• Assumption: there exists one arm that on average beats all the other arms, called the Condorcet winner: $P_{1j} > 0.5$ for all $j \neq 1$.
• Regret: the loss incurred by comparing arms other than the Condorcet winner: $r_t = 0.5\,(P_{1l} - 0.5) + 0.5\,(P_{1r} - 0.5)$.
• Optimal play: play only the Condorcet winner, i.e. choose the Condorcet winner as both l and r.
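As a concrete check of the regret formula, here is a small sketch; the function name and the assumption that the Condorcet winner sits at index 0 (the slides' arm 1, 1-indexed) are ours.

```python
def step_regret(P, l, r, cw=0):
    """Per-step regret of dueling arms l and r, where cw is the index of the
    (assumed known) Condorcet winner: 0.5*(P[cw,l]-0.5) + 0.5*(P[cw,r]-0.5)."""
    return 0.5 * (P[cw][l] - 0.5) + 0.5 * (P[cw][r] - 0.5)

# With the P above: dueling arms 1 and 2 costs 0.5*0.1 + 0.5*0.2 = 0.15,
# while playing the Condorcet winner against itself costs 0.0.
```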
Related work
• DTS (Wu et al., NIPS 2016), etc.: limited to small-scale setups, i.e. K is small.
• Self-Sparring (Sui et al., UAI 2017), etc.: designed under strict assumptions, e.g. no cyclic preference relationships.
• MergeRUCB (Zoghi et al., WSDM 2015): designed for large-scale dueling bandits, yet with high cumulative regret.
Merge Double Thompson Sampling
• Randomly partition the arms into small groups.
• Each time step:
1. Sample a tournament inside a small group;
2. Choose the winner and loser of the tournament as l and r, respectively;
3. Compare l and r online, and update the statistics;
4. Eliminate an arm if it is dominated by any other arm with high confidence;
5. If half of the arms have been eliminated, re-partition the remaining arms.
• Stop when only one arm is left (a minimal sketch of this loop follows below).
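The sketch below is a simplified, self-contained rendering of the five steps above, not the published MergeDTS: it assumes Beta posteriors per arm pair, a Copeland-style sampled tournament inside each group, and a Hoeffding-style elimination radius with exploration parameter alpha (matching the α = 0.8 used in the experiments). All names (merge_dts_sketch, partition, group_size) are ours.

```python
import numpy as np

def merge_dts_sketch(P, group_size=4, horizon=100_000, alpha=0.8, seed=0):
    """Simplified sketch of the MergeDTS loop: Thompson-sampled tournaments
    inside small groups, confidence-based elimination, and re-partitioning
    once half of the arms have been eliminated."""
    rng = np.random.default_rng(seed)
    K = P.shape[0]
    wins = np.zeros((K, K))                       # wins[i, j]: times i beat j

    def partition(arms):
        order = list(rng.permutation(arms))
        return [order[i:i + group_size] for i in range(0, len(order), group_size)]

    active = list(range(K))
    groups = partition(active)
    size_at_partition = len(active)

    for t in range(1, horizon + 1):
        if len(active) == 1:
            break                                 # only the champion remains
        g = groups[t % len(groups)]
        if len(g) < 2:                            # degenerate group: merge
            groups = partition(active)
            continue
        # 1. Sample a tournament: draw each pairwise win probability from its
        #    Beta posterior and score arms by their sampled wins in the group.
        theta = {(i, j): rng.beta(wins[i, j] + 1, wins[j, i] + 1)
                 for i in g for j in g if i != j}
        ranked = sorted(g, key=lambda i: sum(theta[i, j] > 0.5
                                             for j in g if j != i), reverse=True)
        l, r = ranked[0], ranked[-1]              # 2. tournament winner vs. loser
        # 3. Compare l and r online and update the statistics.
        winner, loser = (l, r) if rng.random() < P[l, r] else (r, l)
        wins[winner, loser] += 1
        # 4. Eliminate any arm beaten by a group member with high confidence
        #    (a stand-in Hoeffding-style radius; not the paper's exact bound).
        for i in list(g):
            for j in g:
                n = wins[i, j] + wins[j, i]
                if i == j or n == 0:
                    continue
                if wins[j, i] / n - np.sqrt(alpha * np.log(t + 1) / n) > 0.5:
                    g.remove(i)
                    active.remove(i)
                    break
        # 5. Re-partition once half of the arms have been eliminated.
        if len(active) <= size_at_partition // 2:
            groups = partition(active)
            size_at_partition = len(active)
    return active
```

Run, for example, as `merge_dts_sketch(P)` with the 3-arm P defined earlier; it returns the surviving arm(s), which should be `[0]` with high probability.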
Experiment: online ranker evaluation
[Figure: cumulative regret (up to 25,000) vs. iteration (10^4 to 10^8, log scale) on the MSLR-Navigational dataset, comparing MergeRUCB (α = 0.8), DTS (α = 0.8), Self-Sparring, and MergeDTS (α = 0.8).]