Correlated bandits or: How to minimize mean-squared error online

Aug 27, 2023

V. Praneeth Boda (1) and Prashanth L. A. (2)
(1) LinkedIn Corp. (2) Indian Institute of Technology Madras
A portion of this work was done while the authors were at University of Maryland, College Park.

Centrality among Bandits
Aim: Find the arm with the highest information about the other arms
▶ Placement of sensors used for measuring temperature in a region.
▶ Best set of towers which approximate the whole network.

Minimum Mean Squared Error Estimation
▶ Jointly Gaussian arms $X_{\mathcal{M}} = (X_1, \ldots, X_K)$, with zero mean and covariance matrix $\Sigma \triangleq E[X_{\mathcal{M}}^T X_{\mathcal{M}}]$.
MMSE:
$$E_i \triangleq \min_g E\left[(X_{\mathcal{M}} - g(X_i))^T (X_{\mathcal{M}} - g(X_i))\right] = \sum_{j=1}^{K} E\left[(X_j - E[X_j \mid X_i])^2\right] = \sum_{j \neq i} \sigma_j^2 (1 - \rho_{ij}^2)$$
The optimal $g$:
$$g^*(X_i) = E[X_{\mathcal{M}} \mid X_i] = \left[E[X_1 \mid X_i] \; \ldots \; E[X_K \mid X_i]\right]^T, \quad \text{with } E[X_j \mid X_i] = \frac{E[X_j X_i]}{E[X_i^2]} X_i = \rho_{ij} \frac{\sigma_j}{\sigma_i} X_i.$$
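
As a rough illustration of the MMSE formula $E_i = \sum_{j \neq i} \sigma_j^2 (1 - \rho_{ij}^2)$, the sketch below evaluates it for every arm of a known covariance matrix and picks the arm with the smallest value. The function names and the 3-arm covariance matrix are hypothetical and not taken from the slides.

```python
from typing import List, Sequence


def mse_values(cov: Sequence[Sequence[float]]) -> List[float]:
    """E_i = sum over j != i of sigma_j^2 * (1 - rho_ij^2), for every arm i."""
    k = len(cov)
    values = []
    for i in range(k):
        total = 0.0
        for j in range(k):
            if j == i:
                continue
            # Squared correlation between arms i and j.
            rho_sq = cov[i][j] ** 2 / (cov[i][i] * cov[j][j])
            total += cov[j][j] * (1.0 - rho_sq)
        values.append(total)
    return values


def best_arm(cov: Sequence[Sequence[float]]) -> int:
    """Index of the arm with the smallest MMSE value."""
    values = mse_values(cov)
    return min(range(len(values)), key=values.__getitem__)


# Hypothetical covariance matrix (unit variances), chosen for illustration only.
cov = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.1],
    [0.6, 0.1, 1.0],
]
print(mse_values(cov))  # approximately [1.0, 1.35, 1.63]
print(best_arm(cov))    # 0
```

Arm 0 is the most correlated with the other two arms, so observing it leaves the least residual error.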

Correlated Bandits
Input: set of arm-pairs $S \triangleq \{(i, j) \mid i, j = 1, \ldots, K,\ i < j\}$, number of rounds $n$
For $t = 1, 2, \ldots, n$ do
    Select a pair $(i_t, j_t) \in S$ (necessary for estimating correlation structure)
    Observe a sample from the bivariate distribution corresponding to the arms $i_t, j_t$
endfor
Output an arm $\hat{A}_n$ based on sample-based MSE-value estimates so that $P(\hat{A}_n \neq i^*)$ is minimized. Here $i^* = \arg\min_{i \in \mathcal{M}} E_i$.

MSE Estimation and Concentration
Based on samples of the Gaussian arms:
▶ Sample variance $\hat{\sigma}_j^2$
▶ Sample correlation $\hat{\rho}_{ij}$
▶ MSE of arm $i$: $\hat{E}_i \triangleq \sum_{j \neq i} \hat{\sigma}_j^2 (1 - \hat{\rho}_{ij}^2)$.
MSE Concentration: Assume $\sigma_i^2 \le 1$, $i = 1, \ldots, K$. Then, for any $i = 1, \ldots, K$, and for any $\epsilon \in [0, 2K]$, we have
$$P\left(\left|\hat{E}_i - E_i\right| > \epsilon\right) \le 14K \exp\left(-\frac{n l^2 \epsilon^2}{cK}\right),$$
where $c$ is a universal constant, and $0 < l = \min_i \sigma_i^2$.
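
The plug-in estimate $\hat{E}_i = \sum_{j \neq i} \hat{\sigma}_j^2 (1 - \hat{\rho}_{ij}^2)$ can be sketched as follows. This is a minimal sketch under stated assumptions: the slides do not show the exact estimators, so the code assumes zero-mean estimators without centering and takes both $\hat{\sigma}_j^2$ and $\hat{\rho}_{ij}$ from the samples of pair $(i, j)$. The names `pair_estimates` and `mse_estimate` and the toy data are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Pair = Tuple[int, int]
Samples = List[Tuple[float, float]]


def pair_estimates(samples: Samples) -> Tuple[float, float, float]:
    """Return (variance of first arm, variance of second arm, correlation).

    Assumes zero-mean arms, so no centering is applied.
    """
    n = len(samples)
    var_a = sum(a * a for a, _ in samples) / n
    var_b = sum(b * b for _, b in samples) / n
    cov_ab = sum(a * b for a, b in samples) / n
    return var_a, var_b, cov_ab / math.sqrt(var_a * var_b)


def mse_estimate(i: int, k: int, data: Dict[Pair, Samples]) -> float:
    """Plug-in estimate of E_i; data maps (lo, hi) with lo < hi to paired samples."""
    total = 0.0
    for j in range(k):
        if j == i:
            continue
        var_lo, var_hi, rho = pair_estimates(data[(min(i, j), max(i, j))])
        var_j = var_hi if j > i else var_lo  # variance of the other arm j
        total += var_j * (1.0 - rho * rho)
    return total


# Toy data: arms 0 and 1 perfectly correlated, arms 0 and 2 uncorrelated.
data = {
    (0, 1): [(1.0, 1.0), (-1.0, -1.0)],
    (0, 2): [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)],
    (1, 2): [(2.0, 1.0), (-2.0, -1.0)],
}
print([mse_estimate(i, 3, data) for i in range(3)])  # [1.0, 0.0, 1.0]
```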

SR algorithm: Illustration of arm-pair elimination
Maintain active arms and arm-pairs
[Figure: the ten arm-pairs for K = 5 arms: (1,2) (1,3) (1,4) (1,5) (2,3) (2,4) (2,5) (3,4) (3,5) (4,5). Panels: "Active arm-pairs after arms 4, 5 are eliminated" and "Active arm-pairs after arms 3, 4, 5 are eliminated".]
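
The bookkeeping of active arm-pairs can be sketched in a few lines, assuming that an arm-pair stays active only while both of its arms are active. The helper name `active_pairs` is hypothetical.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def active_pairs(num_arms: int, eliminated: Iterable[int]) -> List[Tuple[int, int]]:
    """Arm-pairs (i, j), i < j, whose two arms are both still active (1-based arms)."""
    gone = set(eliminated)
    active_arms = [a for a in range(1, num_arms + 1) if a not in gone]
    return list(combinations(active_arms, 2))


print(len(active_pairs(5, [])))     # 10
print(active_pairs(5, [4, 5]))      # [(1, 2), (1, 3), (2, 3)]
print(active_pairs(5, [3, 4, 5]))   # [(1, 2)]
```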

Successive Rejects: An algorithm to find the best arm (arm with lowest MSE)
Initialization: $A_1$ = all arm pairs, $B_1 = \{1, \ldots, K\}$,
$$n_k = \left\lceil \frac{n - \binom{K}{2}}{C(K)\,(K + 1 - k)} \right\rceil, \quad C(K) \approx K \log K.$$
Phase 1: Pull each pair in $A_1$, $n_1$ times; Set $B_{k+1} = B_k \setminus \{\text{eliminated arm}\}$
Phase 2: Play each arm pair in $A_2$, $n_2 - n_1$ times; Eliminate ...
...
Phase $K-1$: Play the remaining arm pairs $n_{K-1} - n_{K-2}$ times
▶ One arm pair played $n_1$ times, ..., another two played $n_2$ times
▶ $k$ arms played $n_{k+1}$ times
▶ $\sum_{k=1}^{K-1} (k - 1)\, n_k + (K - 1)\, n_{K-1} < n$
▶ $n_k$ increases with $k$
▶ Adaptive exploration: better than uniform (= play each arm-pair $n / \binom{K}{2}$ times)
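
A phase-based elimination loop of this kind might look like the sketch below. It is not the authors' implementation. Assumptions: the cumulative phase lengths $n_1 \le \ldots \le n_{K-1}$ are passed in directly rather than computed; each phase ends by dropping the active arm with the largest estimated MSE (because the best arm minimizes $E_i$); estimates use zero-mean plug-in statistics; and `make_gaussian_sampler` is a hypothetical simulated environment.

```python
import math
import random
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Pair = Tuple[int, int]
Sampler = Callable[[int, int], Tuple[float, float]]


def make_gaussian_sampler(cov: Sequence[Sequence[float]], seed: int = 0) -> Sampler:
    """Hypothetical environment: zero-mean bivariate Gaussian draws for a pair."""
    rng = random.Random(seed)

    def sample_pair(i: int, j: int) -> Tuple[float, float]:
        sd_i, sd_j = math.sqrt(cov[i][i]), math.sqrt(cov[j][j])
        rho = cov[i][j] / (sd_i * sd_j)
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        return sd_i * z1, sd_j * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)

    return sample_pair


def successive_rejects(k: int, phase_lengths: Sequence[int], sample_pair: Sampler) -> int:
    """Return the arm that survives k - 1 elimination phases (0-based arms)."""
    if len(phase_lengths) != k - 1:
        raise ValueError("need exactly k - 1 cumulative phase lengths")
    active = list(range(k))
    # Per pair (i < j): [count, sum x_i^2, sum x_j^2, sum x_i * x_j].
    sums: Dict[Pair, List[float]] = {
        p: [0.0, 0.0, 0.0, 0.0] for p in combinations(range(k), 2)
    }

    def estimate(i: int) -> float:
        # Plug-in estimate: sum over j != i of var_j * (1 - rho_ij^2).
        total = 0.0
        for j in range(k):
            if j == i:
                continue
            cnt, s_lo, s_hi, s_x = sums[(min(i, j), max(i, j))]
            var_j = (s_hi if j > i else s_lo) / cnt
            total += var_j * (1.0 - s_x * s_x / (s_lo * s_hi))
        return total

    played = 0
    for n_k in phase_lengths:
        for pair in combinations(active, 2):
            acc = sums[pair]
            for _ in range(n_k - played):  # top up each active pair to n_k plays
                x, y = sample_pair(*pair)
                acc[0] += 1
                acc[1] += x * x
                acc[2] += y * y
                acc[3] += x * y
        played = n_k
        active.remove(max(active, key=estimate))  # drop the worst active arm
    return active[0]


# Hypothetical 3-arm instance; arm 0 has the smallest true MSE.
cov = [[1.0, 0.8, 0.6], [0.8, 1.0, 0.1], [0.6, 0.1, 1.0]]
winner = successive_rejects(3, [2000, 4000], make_gaussian_sampler(cov, seed=1))
print(winner)
```

Pairs that contain an eliminated arm are no longer sampled, but their earlier statistics are still used when estimating the MSE of the surviving arms.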

Thanks. Questions?
