
Computer Poker Research at The University of Alberta Richard Gibson



  1. Computer Poker Research at The University of Alberta Richard Gibson Computing Science Honours Seminar February 25, 2013

  2. Games have been used to showcase advances in artificial intelligence...

  3. Checkers Source: spectrum.ieee.org

  4. Chess VS Source: Wikipedia Source: robertamsterdam.com

  5. Goal : Build a computer poker program capable of defeating the world's best human players!

  6. Overview ● Texas Hold'em – Why is poker research interesting? – Computer Poker Research Group ● Creating Polaris, a poker-playing program – Nash equilibrium – Abstraction ● Polaris in Action – Annual Computer Poker Competition (Programs vs. Programs) – Man vs. Machine Competitions ● Future Directions

  8. Texas Hold'em Poker Source: ebaumsworld.com Source: Wikipedia Dealer

  9. Texas Hold'em Poker Source: ebaumsworld.com Raise! Dealer

  10. Texas Hold'em Poker Source: ebaumsworld.com Call. Dealer

  11. Texas Hold'em Poker Source: ebaumsworld.com Flop Pot Dealer

  12. Texas Hold'em Poker Source: ebaumsworld.com Check. Dealer

  13. Texas Hold'em Poker Source: ebaumsworld.com Check. Dealer

  14. Texas Hold'em Poker Source: ebaumsworld.com Turn Dealer

  15. Texas Hold'em Poker Source: ebaumsworld.com Bet! Dealer

  16. Texas Hold'em Poker Source: ebaumsworld.com Call. Dealer

  17. Texas Hold'em Poker Source: ebaumsworld.com River Dealer

  18. Texas Hold'em Poker Source: ebaumsworld.com Check. Dealer

  19. Texas Hold'em Poker Source: ebaumsworld.com Bet! Dealer

  20. Texas Hold'em Poker Source: ebaumsworld.com Raise! Dealer

  21. Texas Hold'em Poker Source: ebaumsworld.com Call. Dealer

  22. Texas Hold'em Poker Source: ebaumsworld.com Dealer

  23. Texas Hold'em Poker Winner! Loser. Source: ebaumsworld.com Dealer

  24. Why is Poker Interesting? ● Poker is challenging, thought-provoking, and most importantly, fun! ● ... but is that enough? Source: maps.google.com

  25. Why is Poker Interesting? ● Card deals introduce elements of chance. Flop? Flop? . . . . . . Flop?

  26. Why is Poker Interesting? ● Degree of winnings can vary. Pot 1 Pot 2 Pot 3

  27. Why is Poker Interesting? ● Imperfect information! ? ? Source: Wikipedia

  28. Why is Poker Interesting? ● Poker decisions are analogous to real-life decisions. Example: Driving a car. Source: clker.com

  29. Why is Poker Interesting? ● Poker decisions are analogous to real-life decisions. Example: Online Advertisement Auctions. Source: blog.revizzit.com

  30. Why is Poker Interesting? ● Poker decisions are analogous to real-life decisions. Example: Sequential Auctions. Source: wikipedia.com

  31. Why is Poker Interesting? ● Poker decisions are analogous to real-life decisions. Example: “Adaptive Treatment Strategies” – For instance: Insulin for diabetes patients ? [Chen and Bowling, NIPS 2012] Source: clker.com

  32. Computer Poker Research Group (CPRG)

  33. Computer Poker Research Group (CPRG) ● Some of our old programs include: – Loki (1997) and Poki (1999): Limit Texas Hold'em – PsOpti / Sparbot (2002) and Vexbot (2003): Heads-up (2-player) Limit Texas Hold'em

  34. Computer Poker Research Group (CPRG) ● Our current programs: – Polaris (vs. Humans) – Hyperborean (vs. Programs) ● Games we play: – Heads-up Limit Texas Hold'em – Heads-up No-limit Texas Hold'em – Three-player Limit Texas Hold'em

  36. Overview ● Texas Hold'em – Why is poker research interesting? – Computer Poker Research Group ● Creating Polaris, a poker-playing program – Nash equilibrium – Abstraction ● Polaris in Action – Annual Computer Poker Competition (Programs vs. Programs) – Man vs. Machine Competitions ● Future Directions

  37. Creating Polaris ● Model Texas Hold'em as an extensive-form game [Figure: fragment of the game tree; edges labelled f, c, r, and k, leaf payoffs such as -1 and +2]

  38. Creating Polaris Extensive-Form Game Strategy Profile

  39. Creating Polaris ● A strategy profile provides probabilities for each action [Figure: the same game tree with a probability on each edge, e.g. 0 / 0.2 / 0.8 on the f / c / r edges]
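
One way to picture a strategy profile in code is a table from decision points to action probabilities. The sketch below is only an illustration: the information-set names are hypothetical, and the probabilities reuse the 0 / 0.2 / 0.8 and 0 / 0.4 / 0.6 triples visible on the slide.

```python
import random

# Hypothetical information-set names (not from the talk). The probabilities
# for "f", "c", "r" reuse triples that appear on the slide.
strategy_profile = {
    "preflop:first_to_act": {"f": 0.0, "c": 0.2, "r": 0.8},
    "preflop:facing_raise": {"f": 0.0, "c": 0.4, "r": 0.6},
}

def is_valid(profile, tol=1e-9):
    """Every information set must hold a probability distribution."""
    return all(
        all(p >= 0.0 for p in probs.values())
        and abs(sum(probs.values()) - 1.0) <= tol
        for probs in profile.values()
    )

def sample_action(profile, info_set, rng):
    """Draw one action according to the probabilities at info_set."""
    actions = list(profile[info_set])
    weights = [profile[info_set][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

rng = random.Random(0)
draws = [sample_action(strategy_profile, "preflop:first_to_act", rng)
         for _ in range(1000)]
```

Playing a hand then amounts to looking up the current information set and sampling from its row; an action with probability 0 is never drawn.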

  40. Creating Polaris ● What type of strategy profile do we want? – Nash equilibrium ● Example: Rock-Paper-Scissors Source: clker.com

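  41. Creating Polaris [Figure: Rock-Paper-Scissors game tree. Player 1 picks r, p, or s, then player 2 picks r, p, or s. Payoffs to player 1: after r: 0, -1, +1; after p: +1, 0, -1; after s: -1, +1, 0]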

  42. Creating Polaris ● A Nash equilibrium strategy profile for Rock-Paper-Scissors. – “No one can change their strategy and do better.” [Figure: Rock-Paper-Scissors game tree with probability 1/3 on every action; payoffs 0 -1 +1, +1 0 -1, -1 +1 0]

  43. Creating Polaris ● A Nash equilibrium is a defensive strategy: – “I can't lose no matter what my opponent does.” [Figure: the Rock-Paper-Scissors tree; player 1 plays 1/3, 1/3, 1/3, player 2's probabilities are unknown (?)]
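
The "can't lose" claim holds in expectation, and for Rock-Paper-Scissors it can be checked directly. The sketch below uses the payoff table from the slides; the function names are made up for this illustration. It computes the most an opponent can win, on average, against a fixed strategy.

```python
# Payoff to player 1; rows are player 1's action, columns are player 2's
# action, both in the order rock, paper, scissors.
PAYOFF = [
    [0, -1, +1],
    [+1, 0, -1],
    [-1, +1, 0],
]

def expected_payoff(p1, p2):
    """Player 1's expected payoff when both players use mixed strategies."""
    return sum(p1[i] * p2[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))

def best_response_value(p1):
    """The most player 2 can win on average against p1 (player 2 gets -PAYOFF)."""
    return max(-sum(p1[i] * PAYOFF[i][j] for i in range(3))
               for j in range(3))

uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

loss_uniform = best_response_value(uniform)    # nothing to exploit
loss_rock = best_response_value(always_rock)   # paper beats it every time
```

Against the uniform strategy no opponent strategy earns more than 0 on average, whereas a predictable strategy such as always playing rock gives up a full unit per game.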

  44. Creating Polaris ● But wait, you said we want to win as much as possible! Pot 1 Pot 2 Pot 3

  45. Creating Polaris ● But wait, you said we want to win as much as possible! ● Requires opponent modelling. ● Some progress made: – [Bard and Bowling, AAAI 2007] – [Johanson, Zinkevich, and Bowling, NIPS 2007] – [Johanson and Bowling, AISTATS 2009] but still lots of work to be done!

  46. Creating Polaris Extensive-Form Game Nash Equilibrium Strategy Profile

  47. Creating Polaris ● Use minimax (alpha-beta) search to compute Nash? Source: clker.com

  48. Creating Polaris ● Use minimax (alpha-beta) search? [Figure: the extensive-form game tree from slide 37] Source: clker.com

  49. Creating Polaris ● Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007]. [Diagram: loop of Deal Cards → “Play” Poker → Update Strategy Profile]

  50. Creating Polaris ● Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007]. [Diagram: loop of Deal Cards → “Play” Poker → Update Strategy Profile] ● Repeat billions of times; in the limit, the strategy profile approaches a Nash equilibrium.
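
The play-then-update loop can be sketched on Rock-Paper-Scissors. This is not the CPRG implementation: it is plain regret matching in self-play, which is the update CFR applies at each decision point, and Rock-Paper-Scissors has only one decision point per player, so no cards are dealt and no counterfactual weighting is needed. All names and the starting regrets are choices made for this example.

```python
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # payoff to player 0

def regret_matching(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)

def action_values(player, opponent_strategy):
    """Expected payoff of each pure action against the opponent's mix."""
    if player == 0:
        return [sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(3))
                for a in range(3)]
    return [sum(-PAYOFF[b][a] * opponent_strategy[b] for b in range(3))
            for a in range(3)]

def train(iterations):
    # Asymmetric starting regrets (an arbitrary choice) so that play does
    # not begin at the equilibrium.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for player in (0, 1):
            values = action_values(player, strategies[1 - player])
            expected = sum(s * v for s, v in zip(strategies[player], values))
            for a in range(3):
                # Regret: how much better action a would have done.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    # The average strategy, not the final one, is what converges.
    return [[s / iterations for s in sums] for sums in strategy_sums]

average = train(100_000)
```

The per-iteration strategies keep cycling, but the averages for both players drift toward 1/3 on each action, matching the equilibrium from the earlier slides.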

  51. Creating Polaris ● “Huge” problem (no pun intended): [Diagram: Extensive-Form Game (10^18) → Strategy Profile (5 million GB)]

  53. Creating Polaris Extensive-Form Game ? Nash Equilibrium Strategy Profile

  54. Creating Polaris [Diagram: Extensive-Form Game → Abstract Game]

  55. Creating Polaris ● Merge card deals into buckets. [Diagram: Extensive-Form Game → Abstract Game]

  57. Creating Polaris ● Old technique: Percentile Hand Strength – Rank hands from best to worst. [Figure: hands ordered from Best to Worst]

  58. Creating Polaris ● Old technique: Percentile Hand Strength – Rank hands from best to worst. – For 10 buckets, put top 10% into bucket 1, next 10% into bucket 2, etc. [Figure: hands ordered from Best to Worst, split into Bucket 1 through Bucket 10]
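
The two steps above can be sketched in a few lines. The function name and the hand-strength numbers are placeholders for this example; the talk does not say how hand strength itself is computed.

```python
def percentile_buckets(hand_strengths, num_buckets=10):
    """Map each hand to a bucket 1..num_buckets; bucket 1 holds the strongest."""
    # Step 1: rank hands from best to worst.
    ranked = sorted(hand_strengths, key=hand_strengths.get, reverse=True)
    n = len(ranked)
    # Step 2: cut the ranking into equal-sized slices.
    return {hand: (rank * num_buckets) // n + 1
            for rank, hand in enumerate(ranked)}

# Placeholder data: 20 hypothetical hands with strengths 0.00, 0.05, ..., 0.95.
strengths = {f"hand{i}": i / 20 for i in range(20)}
buckets = percentile_buckets(strengths)
```

With 20 hands and 10 buckets, each bucket receives two hands: the two strongest land in bucket 1 and the two weakest in bucket 10.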

  59. Creating Polaris ● New technique: Hand Strength Distribution Clustering

  60. Creating Polaris ● New technique: Hand Strength Distribution Clustering – Old bucketing technique

  61. Creating Polaris ● New technique: Hand Strength Distribution Clustering – New bucketing technique
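
The slides do not spell out the clustering procedure, so the following is only a guess at its flavour: each hand is described by a histogram of how strong it may turn out to be, and hands with similar histograms are grouped by a k-means-style loop. The earth mover's distance, the five-bin histograms, and the hand names are all assumptions made for this sketch.

```python
def emd(hist_a, hist_b):
    """Earth mover's distance between two 1-D histograms of equal total mass."""
    distance, carried = 0.0, 0.0
    for a, b in zip(hist_a, hist_b):
        carried += a - b           # mass that must move on to the next bin
        distance += abs(carried)
    return distance

def mean_histogram(histograms):
    return [sum(column) / len(histograms) for column in zip(*histograms)]

def cluster(hands, centers, rounds=10):
    """Assign each hand to its nearest center, then re-average the centers."""
    assignment = {}
    for _ in range(rounds):
        assignment = {
            name: min(range(len(centers)), key=lambda k: emd(hist, centers[k]))
            for name, hist in hands.items()
        }
        for k in range(len(centers)):
            members = [hands[n] for n, c in assignment.items() if c == k]
            if members:
                centers[k] = mean_histogram(members)
    return assignment

# Hypothetical histograms over five strength bins (weak to strong). All four
# hands have the same average strength, but two are steady "made" hands and
# two are all-or-nothing "draws".
hands = {
    "made_a": [0.0, 0.0, 1.0, 0.0, 0.0],
    "made_b": [0.0, 0.1, 0.8, 0.1, 0.0],
    "draw_a": [0.5, 0.0, 0.0, 0.0, 0.5],
    "draw_b": [0.45, 0.05, 0.0, 0.05, 0.45],
}
initial_centers = [list(hands["made_a"]), list(hands["draw_a"])]
assignment = cluster(hands, initial_centers)
```

A ranking by average strength alone could not tell these four hands apart, while clustering on the whole distribution separates the made hands from the draws.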

  62. Creating Polaris [Diagram: Extensive-Form Game (10^18) → Abstract Game (10^9 to 10^12)]

  63. Creating Polaris [Diagram: Extensive-Form Game (10^18) → Abstract Game (10^9 to 10^12) → CFR (Deal Buckets → “Play” “Poker” → Update Strategy Profile, billions of times) → Abstract Game Equilibrium Strategy]

  64. Creating Polaris [Diagram: Extensive-Form Game (10^18) → Abstract Game (10^9 to 10^12) → Abstract Game Equilibrium Strategy, used as an Approximate Full Game Equilibrium Strategy (<100 GB)]

  65. Creating Polaris ● How are these numbers still manageable? – We use Compute Canada's largest supercomputers. – Parallel implementations of abstraction, CFR. Source: rqchp.ca

  66. Creating Polaris ● So how close to equilibrium are we? [Chart annotations: Old abstraction; CFR; New abstraction; Supercomputers; Fancy new CFR variant]

  67. Overview ● Texas Hold'em – Why is poker research interesting? – Computer Poker Research Group ● Creating Polaris, a poker-playing program – Nash equilibrium – Abstraction ● Polaris in Action – Annual Computer Poker Competition (Programs vs. Programs) – Man vs. Machine Competitions ● Future Directions

  68. Polaris / Hyperborean in Action ● Annual Computer Poker Competition – Programs vs. Programs.
