

  1. 2-7 Triple Draw Poker: With Learning
     Nikolai Yakovenko, 2/18/15
     EE6894 Deep Learning Class

  2. Overview
     • Problem: learn a strategy for playing 2-7 triple draw poker
     • Data: play against an existing C program that plays pretty well, and really fast
     • Why: poker is played by lots of people, and for high stakes
     • Why learning: heuristic-based algorithms only play so well, and they don't adjust to small changes in the game rules, which happen a lot
     • Speculation: reinforcement learning (Q-learning) to learn a policy for optimizing a reward function, with a neural net layer between the raw inputs and the Q-learning algorithm

  3. 2-7 Triple Draw
     [Card images: the best hand (7-5-4-3-2, not all one suit) and a typical hand.]
     • Draw cards three times, to make 5 different low cards (see the comparison sketch below).
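To make the objective concrete, here is a minimal Python sketch of comparing two 2-7 lowball hands: the lowest hand wins, aces are always high, and straights and flushes count against you. The function name and the coarse category penalty are my own; a real evaluator orders all nine hand categories and breaks pair ties exactly.

```python
def deuce_to_seven_key(hand):
    """Sort key for a 5-card hand in 2-7 lowball: lower key = better hand.
    hand: list of (rank, suit) tuples, rank 0 = deuce ... 12 = ace (always high).
    Coarse sketch: pairs and straights/flushes are penalized as whole groups."""
    values = sorted((r for r, _ in hand), reverse=True)
    flush = len({s for _, s in hand}) == 1
    straight = len(set(values)) == 5 and values[0] - values[4] == 4
    paired = len(set(values)) < 5
    penalty = 2 if (straight or flush) else (1 if paired else 0)
    return (penalty, values)

best    = [(5, 0), (3, 1), (2, 2), (1, 3), (0, 0)]   # 7-5-4-3-2, four suits
typical = [(7, 0), (6, 1), (4, 2), (2, 3), (0, 0)]   # 9-8-6-4-2
assert deuce_to_seven_key(best) < deuce_to_seven_key(typical)   # best hand wins
```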

  4. Sample Hand
     [Illustration: a sample hand played out over the draws.]
     • The opponent does the same, and the final hands are compared.

  5. Why Poker?
     • There are 10+ games played with the same mechanics but different rules:
       – Winner for high hand
       – Winner for low hand
       – Winner for badugi
       – Split-pot games (½ low hand, ½ badugi)
     • There are also variations in betting, number of players at the table, etc.
     • Could we re-use the original problem setup, but learn a totally different strategy for each variant?

  6. Poker Data
     • The algorithm can play against itself.
     • Also, I have a C program that plays triple draw (approximated by the sketch below):
       – Brute-force tree search, with optimization
       – Optimizes for the average value of the final hand
       – All final 5-card hands are scored on a 0-1000 heuristic scale
     • In the real world, sites like PokerStars have billions of hands of real play, for the most popular variants.
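The author's C program is not included here; the following Python sketch approximates the same idea with Monte-Carlo sampling instead of exact tree search. Both heuristic_score and best_discard are hypothetical stand-ins with arbitrary constants, reusing the (rank, suit) card representation from the earlier sketch.

```python
import itertools
import random

DECK = [(r, s) for r in range(13) for s in range(4)]   # rank 0 = deuce ... 12 = ace

def heuristic_score(hand):
    """Hypothetical stand-in for the C program's 0-1000 hand score
    (higher = better low hand); the real scorer is exact, this one is not."""
    values = sorted(r for r, _ in hand)
    paired = len(set(values)) < 5
    flush = len({s for _, s in hand}) == 1
    straight = not paired and values[4] - values[0] == 4
    score = 1000 - 15 * sum(values)                    # lower cards score higher
    return score - (400 if paired else 0) - (500 if straight or flush else 0)

def best_discard(hand, deck, samples=200, rng=None):
    """Monte-Carlo stand-in for the brute-force search over discards:
    try every subset of cards to keep, estimate the average final score."""
    rng = rng or random.Random(0)
    best_keep, best_ev = tuple(hand), float(heuristic_score(hand))  # stand pat
    for k in range(5):                                 # keep k cards, draw 5 - k
        for keep in itertools.combinations(hand, k):
            total = 0.0
            for _ in range(samples):
                total += heuristic_score(list(keep) + rng.sample(deck, 5 - k))
            if total / samples > best_ev:
                best_keep, best_ev = keep, total / samples
    return best_keep, best_ev

hand = [(12, 0), (11, 1), (2, 2), (1, 3), (0, 0)]      # A-K-4-3-2
keep, ev = best_discard(hand, [c for c in DECK if c not in hand])
```

For this hand the search should prefer keeping just the 4-3-2 and drawing two; the real program replaces the sampling loop with exact enumeration over the remaining deck.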

  7. Keep the Game Really Simple
     • Triple draw, no betting:
       – Reward is ±1 for winning or losing the hand
     • Triple draw, automatic bet per round (or the player can fold):
       – Reward is winning the chips if we have the best hand, or if the opponent folds
     • Can even start with a single draw (sketched below).
     • The important thing is a setup for learning game strategy directly from game results.
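A minimal environment for the simplest variant, single draw with no betting, might look like the sketch below. The class and method names are my own; it reuses deuce_to_seven_key from the slide-3 sketch, and the opponent simply stands pat.

```python
import random

class SingleDrawEnv:
    """One draw, no betting: act once (choose discards), then showdown."""

    def reset(self, rng=None):
        rng = rng or random.Random()
        deck = [(r, s) for r in range(13) for s in range(4)]
        rng.shuffle(deck)
        self.hand, self.opp, self.deck = deck[:5], deck[5:10], deck[10:]
        return tuple(self.hand)                        # observation: our cards

    def step(self, discard_idxs):
        """discard_idxs: indices (0-4) of the cards we throw away."""
        keep = [c for i, c in enumerate(self.hand) if i not in discard_idxs]
        self.hand = keep + self.deck[:5 - len(keep)]
        win = deuce_to_seven_key(self.hand) < deuce_to_seven_key(self.opp)
        return tuple(self.hand), (1 if win else -1), True   # done after one draw
```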

  8. Relevant Research
     • For poker:
       – PokerSnowie: a neural net aimed at game-theory-optimal No Limit Hold'em
       – "Heads-up Limit Hold'em Poker is Solved": a recent academic result (although possibly not accurate)
       – You can play a neural-net limit Texas Hold'em machine for real $ in Las Vegas
       – No deep learning; the focus is on GTO for Hold'em
     • Other games:
       – Backgammon: neural nets dominant since the 1990s
       – Go: a recent big breakthrough using CNNs trained on games of the best human players
       – Atari: the famous DeepMind paper
       – Flappy Bird: a great example of Q-learning, for a problem with a simpler game state

  9. Speculation on Deep Learning
     • Reinforcement learning (Q-learning, for example) to learn a strategy that optimizes rewards (update rule sketched below)
     • This requires representing the game states s and s′ with full information about cards and actions
     • The DeepMind paper shows how to turn the raw state into a useful representation of s through neural net layers
     • It also shows how to deal with noisy & delayed rewards
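For reference, here is the one-step Q-learning update the slide refers to, as a tabular Python sketch. The DQN in the DeepMind paper replaces the table with a neural net and takes a gradient step on the squared error against the same target; the function names here are my own.

```python
from collections import defaultdict

Q = defaultdict(float)              # Q[(s, a)] -> estimated value, default 0.0

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95, done=False):
    """One-step Q-learning:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_b Q(s', b) - Q(s, a))"""
    target = r if done else r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
```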

  10. Reinforcement Learning: Flappy

  11. But Flappy State Space Is Simpler
      • Distance from the pipe
      • Dead or alive
      • Actions: tap or no tap (a tiny tabular agent is sketched below)
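Pairing a state this small with the q_update sketch from slide 9, a complete tabular agent is only a few lines. The discretization bins and action names are my own guesses; the linked write-up chooses its own bin sizes.

```python
import random

def flappy_state(dx, dy, alive):
    """Discretize the observation: horizontal and vertical distance to the
    next pipe gap, plus dead-or-alive (10 px bins are an assumption)."""
    return (dx // 10, dy // 10, alive)

def epsilon_greedy(Q, s, actions=("tap", "no_tap"), eps=0.1):
    """Pick the greedy action under Q, exploring with probability eps."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])
```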

  12. Conclusion
      • Can we simplify the poker game, but still keep it the same game?
      • And can we learn a strategy for drawing cards, optimizing the hand, using a neural net layer that feeds into reinforcement learning?
      • If so, the steps to real poker at a world-class level, for 100 different game variants, are straightforward.

  13. Thank you!
      • Who is interested?
      • Flappy Bird result: http://sarvagyavaish.github.io/FlappyBirdRL/
      • DeepMind Atari paper: http://arxiv.org/pdf/1312.5602v1.pdf
      • 2-7 Triple Draw sample hand: http://www.clubpoker.net/2-7-triple-draw/p-263
