1/7 Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs
Han Shao∗, Xiaotian Yu∗, Irwin King and Michael R. Lyu
Department of Computer Science and Engineering, The Chinese University of Hong Kong
NeurIPS, Dec. 2018
2/7 Linear Stochastic Bandits (LSB)
[Figure: exploration vs. exploitation trade-off, previous setting vs. learning setting, with arms x_{1,t}, x_{4,t} ∈ R^d]
▶ 1. Given a set of arms represented by D ⊆ R^d
▶ 2. At time t, select an arm x_t ∈ D, and observe y_t(x_t) = ⟨x_t, θ∗⟩ + η_t
▶ 3. The goal is to maximize ∑_{t=1}^T E[y_t(x_t)]
▶ 4. η_t follows a sub-Gaussian distribution (E[η_t²] < ∞)
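The interaction protocol in steps 1–3 can be sketched as a simple loop. Everything numeric here (dimension, arm count, horizon, the uniform arm-selection policy) is an illustrative placeholder, not one of the paper's algorithms:

```python
import numpy as np

# Sketch of the LSB protocol from the slide; d, K, T and the uniform
# policy are illustrative placeholders, not the paper's method.
rng = np.random.default_rng(0)
d, K, T = 5, 10, 1000

theta_star = rng.normal(size=d)        # unknown parameter theta*
D = rng.normal(size=(K, d))            # arm set D, here K points in R^d

payoff = 0.0
for t in range(T):
    x_t = D[rng.integers(K)]           # placeholder policy: uniform over arms
    eta_t = rng.normal()               # sub-Gaussian noise (classical setting)
    y_t = x_t @ theta_star + eta_t     # observed payoff y_t(x_t)
    payoff += y_t

oracle = T * float(np.max(D @ theta_star))   # best fixed arm in hindsight
print(payoff, oracle)
```

A bandit algorithm replaces the uniform policy with one that trades off exploring arms against exploiting the current estimate of θ∗.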
3/7 What Is A Heavy-Tailed Distribution?
[Figure: Gaussian density vs. empirical NASDAQ returns]
▶ High-probability extreme returns in financial markets
▶ Many other real cases
Practical scenarios:
1. Delays in communication networks (Liebeherr et al., 2012)
2. Analysis of biological data (Burnecki et al., 2015)
3. ...
4/7 LSB with Heavy-Tailed Payoffs
Problem definition:
▶ Multi-armed bandits (MAB) with heavy-tailed payoffs (Bubeck et al., 2013):
  E[|η_t|^{1+ϵ}] < +∞, where ϵ ∈ (0, 1]   (1)
▶ Our setting: LSB with η_t satisfying Eq. (1)
▶ Weaker assumption than sub-Gaussian
▶ Medina and Yang (2016) studied LSB with heavy-tailed payoffs

              sub-Gaussian   heavy-tailed (ϵ = 1)
  MAB         O(T^{1/2})     Õ(T^{1/2}) by Bubeck et al. (2013)
  LSB         O(T^{1/2})     Õ(T^{3/4}) by Medina and Yang (2016)

▶ Can we achieve Õ(T^{1/2})?
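A hedged illustration of assumption (1): noise drawn from a Student-t distribution with 1.5 degrees of freedom (a hypothetical choice, not from the slides) has finite moments of every order below 1.5, so it satisfies E[|η|^{1+ϵ}] < ∞ for ϵ = 0.4, while its variance is infinite. Such noise is heavy-tailed and not sub-Gaussian:

```python
import numpy as np

# Student-t with df = 1.5: moments of order < 1.5 exist, so for eps = 0.4
# the (1+eps)-th absolute moment is finite, but E[eta^2] diverges.
# The df and eps values are illustrative, not from the paper.
rng = np.random.default_rng(1)
eps = 0.4
eta = rng.standard_t(df=1.5, size=200_000)

m_low = np.mean(np.abs(eta) ** (1 + eps))  # stabilises as the sample grows
m_two = np.mean(eta ** 2)                  # keeps growing with the sample
print(m_low, m_two)
```

Re-running with larger samples shows `m_low` settling near a constant while `m_two` drifts upward, which is exactly the regime Eq. (1) allows and the sub-Gaussian assumption rules out.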
5/7 Algorithm: Median of means under OFU (MENU)
[Figure: framework comparison with MoM by Medina and Yang (2016)]
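MENU builds on the median-of-means estimator: split the samples into groups, average each group, and take the median of the group averages, which is robust to heavy-tailed outliers. A minimal scalar sketch (the sample size and group count are illustrative; the paper applies the idea to regression estimates inside OFU confidence sets):

```python
import numpy as np

# Median of means for a scalar mean: split n samples into k groups,
# average each group, return the median of the k group averages.
def median_of_means(samples, k):
    groups = np.array_split(samples, k)
    return float(np.median([g.mean() for g in groups]))

# Heavy-tailed samples (Student-t, df = 1.5) shifted to true mean 10.
rng = np.random.default_rng(2)
x = rng.standard_t(df=1.5, size=9_000) + 10.0

print(median_of_means(x, k=9), x.mean())
```

A few extreme draws can drag the plain sample mean far from 10, but they corrupt at most a few of the 9 group averages, and the median ignores those.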
6/7 Regret Bounds
▶ Upper bounds:
  algorithm                     regret
  MoM (Medina and Yang, 2016)   Õ(T^{(2+ϵ)/(2(1+ϵ))})
  MENU (ours)                   Õ(T^{1/(1+ϵ)})
  CRT (Medina and Yang, 2016)   Õ(T^{(1+2ϵ)/(1+3ϵ)})
  TOFU (ours)                   Õ(T^{1/(1+ϵ)})
▶ Lower bound: Ω(T^{1/(1+ϵ)})
▶ When ϵ = 1, our algorithms achieve Õ(T^{1/2})
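A quick numeric check of the T-exponents at ϵ = 1 (finite variance), assuming the exponents (2+ϵ)/(2(1+ϵ)) for MoM, (1+2ϵ)/(1+3ϵ) for CRT, and 1/(1+ϵ) for MENU and TOFU; at ϵ = 1 these reduce to the T^{3/4} and T^{1/2} rates quoted earlier in the deck:

```python
# Exponent of T in each regret bound, as a function of eps in (0, 1].
def exponents(eps):
    return {
        "MoM":  (2 + eps) / (2 * (1 + eps)),
        "CRT":  (1 + 2 * eps) / (1 + 3 * eps),
        "MENU": 1 / (1 + eps),
        "TOFU": 1 / (1 + eps),
    }

print(exponents(1.0))  # {'MoM': 0.75, 'CRT': 0.75, 'MENU': 0.5, 'TOFU': 0.5}
```

At ϵ = 1 the new upper bounds meet the Ω(T^{1/(1+ϵ)}) lower bound's exponent of 1/2, which is why the algorithms are "almost optimal".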
7/7 See You at the Poster Session
Time: Dec. 5th, 10:45 AM – 12:45 PM
Location: Room 210 & 230 AB, #158