Stable-Predictive Optimistic Counterfactual Regret Minimization

Gabriele Farina¹, Christian Kroer², Noam Brown¹, Tuomas Sandholm¹,³
¹ Computer Science Department, Carnegie Mellon University
² IEOR Department, Columbia University
³ Strategic Machine, Inc.; Strategy Robot, Inc.; Optimized Markets, Inc.
Recent Interest in Extensive-Form Games (EFGs)
• EFGs are games played on a game tree
  – Can capture both sequential and simultaneous moves
  – Can capture private information
• Application: recent breakthroughs show that it is possible to compute approximate Nash equilibria in large poker games:
  – Heads-Up Limit Texas Hold’Em [Bowling, Burch, Johanson, and Tammelin, Science 2015]
  – Heads-Up No-Limit Texas Hold’Em
    • The game has 10^161 decision points (before abstraction)!
    • Superhuman play was finally achieved (after 20 years of effort) [Brown and Sandholm, Science 2017]
Counterfactual Regret Minimization (CFR)
• Defines a class of regret minimizers
• Specifically designed for EFGs: regret is minimized locally at each decision point in the game (see the regret-matching sketch below)
  – By taking into account the combinatorial structure of the game tree, it enables game-specific techniques, such as pruning subtrees and warm-starting different parts of the tree separately
• Convergence rate Θ(T^{-1/2}), where T is the number of iterations
• Practical state of the art for approximating Nash equilibrium in EFGs for 10+ years (when used in conjunction with alternation and other techniques)
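As a concrete illustration of local regret minimization, here is a minimal sketch (in Python with NumPy; an illustration following standard CFR conventions, not code from the paper) of regret matching, the local regret minimizer classically used at each decision point: it accumulates per-action counterfactual regret and plays each action in proportion to its positive cumulative regret.

    import numpy as np

    def regret_matching_strategy(cumulative_regret):
        """Next local strategy from cumulative regrets (regret matching)."""
        positive = np.maximum(cumulative_regret, 0.0)
        total = positive.sum()
        if total > 0.0:
            return positive / total
        # No positive regret yet: fall back to the uniform strategy.
        n = len(cumulative_regret)
        return np.full(n, 1.0 / n)

    def accumulate_regret(cumulative_regret, strategy, counterfactual_values):
        """Add this iteration's instantaneous regret for each action."""
        expected_value = strategy @ counterfactual_values
        return cumulative_regret + (counterfactual_values - expected_value)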
Optimistic (aka Predictive) Regret Minimization
• Recent development in online learning
• Idea: inform the regret-minimizing device with a prediction of the next loss
  – Accurate prediction ⟹ small regret
  – Several optimistic/predictive regret minimizers are known in the literature, notably Optimistic Follow-the-Regularized-Leader (OFTRL); see the sketch after this slide
  – Enables a convergence rate of Θ(T^{-1}) to Nash equilibrium in matrix games
• Natural idea: can we combine CFR’s idea of local regret minimization with the improved convergence rate of predictive regret minimization?
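As one concrete instantiation (a sketch only: the entropic regularizer and the step size eta are assumptions of this example, not prescribed by the slides), an OFTRL step over the probability simplex plays the regularized best response to the cumulative loss plus the predicted next loss; with the entropic regularizer this has the closed softmax form below.

    import numpy as np

    def oftrl_step(cumulative_loss, predicted_next_loss, eta):
        """One OFTRL decision on the probability simplex.

        Plays the argmin over the simplex of
            <x, L_t + m_{t+1}> + (1/eta) * sum_i x_i log x_i,
        where L_t is the cumulative observed loss and m_{t+1} is the
        prediction of the next loss. With this (entropic) regularizer
        the minimizer is the softmax computed below.
        """
        z = -eta * (cumulative_loss + predicted_next_loss)
        z = z - z.max()          # shift for numerical stability
        w = np.exp(z)
        return w / w.sum()

A standard prediction is the last observed loss, m_{t+1} = l_t; when both players of a matrix game use such a predictive minimizer, the predictions become accurate as play stabilizes, which is what enables the Θ(T^{-1}) rate mentioned above.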
Our Contributions
• We present the first CFR variant that breaks the Θ(T^{-1/2}) convergence rate to Nash equilibrium, where T is the number of iterations. Our algorithm converges to a Nash equilibrium at the improved rate O(T^{-3/4})
• Our algorithm is based on the notion of “stable-predictive” regret minimizers, a particular type of predictive regret minimizer that we introduce
• Our algorithm operates locally at each decision point. We show how the local regret minimizers should be set up differently at different parts of the game tree (see the sketch below)
  – Main idea: the stability parameter of the different regret minimizers drops exponentially fast with the depth of the decision point
  – Any stable-predictive regret minimizer (such as OFTRL) can be used, as long as it respects the requirements on the stability parameter

Poster: Pacific Ballroom #152, 06:30 - 09:00 pm
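To make the main idea concrete, the sketch below (hypothetical constants and names, for illustration only; the paper derives the exact schedule from the game's structure) assigns one local step size per decision point that decays exponentially with depth. For OFTRL-style minimizers a smaller step size means the strategy moves less between iterations, i.e., a smaller stability parameter.

    def local_step_sizes(decision_point_depths, base_eta=0.1, decay=0.5):
        """Per-decision-point step sizes decaying exponentially with depth.

        decision_point_depths maps a decision-point id to its depth in
        the player's decision tree. base_eta and decay are illustrative
        constants only, not the values derived in the paper.
        """
        return {j: base_eta * decay ** depth
                for j, depth in decision_point_depths.items()}

    # Example: the root minimizer moves the fastest; deeper decision
    # points get exponentially more stable local minimizers.
    # local_step_sizes({"root": 0, "j1": 1, "j2": 2})
    # -> {"root": 0.1, "j1": 0.05, "j2": 0.025}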