
Richard Gibson SIAT Faculty Search Presentation February 28, 2013



  1. Recent Advances in Computer Poker and Future Research for Artificial Intelligence in Video Games Richard Gibson SIAT Faculty Search Presentation February 28, 2013

  2. One Slide Summary ● 2009 – 2013: Computer Poker Research

  3. One Slide Summary ● 2009 – 2013: Computer Poker Research ● Future: AI in Video Games Image source: co-optimus.com Image source: arcadelearningenvironment.org

  4. Outline of Presentation ● Computer Poker Primer – Motivation – Background ● New Contributions to Computer Poker – Research + Hyperborean3p ● Future Research – AI in Video Games – StarCraft AI, ALE, automated content generation ● Teaching Interests – Game design, AI in video games

  5. Outline of Presentation ● Computer Poker Primer – Motivation – Background ● New Contributions to Computer Poker – Research + Hyperborean3p ● Future Research – AI in Video Games – StarCraft AI, ALE, automated content generation ● Teaching Interests – Game design, AI in video games

  6. Why Poker Research? ● Classic games, such as chess and checkers, are: – Deterministic – Binary outcomes (+ draw) – Perfect Information Image source: spectrum.ieee.org Image sources: Wikipedia

  7. Why Poker Research? ● However, poker is a game with: – Stochastic elements [Figure: many possible flops. Image sources: Wikipedia]

  8. Why Poker Research? ● However, poker is a game with: – Stochastic elements – Varying outcomes [Figure: Pot 1, Pot 2, Pot 3. Image source: ebaumsworld.com]

  9. Why Poker Research? ● However, poker is a game with: – Stochastic elements – Varying outcomes – Imperfect information ? ?

  10. Why Poker Research? ● Poker research is applicable in other areas: – Airport security [Pita et al., AI Magazine 2009] – Adaptive treatment strategies [Chen and Bowling, NIPS 2012] – Sequential auctions [?]

  11. Outline of Presentation ● Computer Poker Primer – Motivation – Background ● New Contributions to Computer Poker – Research + Hyperborean3p ● Future Research – AI in Video Games – StarCraft AI, ALE, automated content generation ● Teaching Interests – Game design, AI in video games

  12. Poker Research Background ● Model poker as an extensive-form game. [Figure: game tree. The root node c deals QJ or QK with probability 0.5 each; players 1 and 2 then act in turn with actions c, b, and f; terminal payoffs range from -2 to +2.]

  13. Poker Research Background ● Information sets: sets of states a player cannot distinguish between. [Figure: the same game tree, with root node c dealing QJ or QK with probability 0.5 each, decision nodes for players 1 and 2, and terminal payoffs from -2 to +2.]

  14. Poker Research Background ● Example: Kuhn Poker

  15. Poker Research Background ● Example: Kuhn Poker

  16. Poker Research Background ● Example: Kuhn Poker ?

  17. Poker Research Background ● Example: Kuhn Poker Fold? Bet! ? Call?

  18. Poker Research Background ● Example: Kuhn Poker Call. ?

  19. Poker Research Background ● Example: Kuhn Poker

  20. Poker Research Background ● Example: Kuhn Poker -2 +2 Lose. Win!

  21. Poker Research Background ● Example: Kuhn Poker [Figure: game tree. The root node c deals QJ or QK with probability 0.5 each; players 1 and 2 then act in turn with actions c, b, and f; terminal payoffs range from -2 to +2.]

  22. Poker Research Background [Diagram: Extensive-Form Game, Strategy Profile]

  23. Poker Research Background ● A strategy profile maps each information set to a probability distribution over actions. [Figure: the game tree with every action labelled by its probability, for example c 0.6 and b 0.4 at player 1's first decision.]
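The mapping described on this slide can be sketched as a plain dictionary. This is a minimal illustration, not the presenter's code: the information-set labels (`"Q:"`, `"J:b"`) and the helper names are hypothetical, and only the 0.6/0.4 and 0.7/0.3 probabilities are borrowed from the slide's figure.

```python
import random

# Hypothetical labels of the form "<private card>:<actions so far>".
# Each information set maps to a probability distribution over actions.
strategy_profile = {
    "Q:": {"c": 0.6, "b": 0.4},
    "J:b": {"f": 0.7, "c": 0.3},
}


def is_valid_profile(profile, tol=1e-9):
    """Check that every distribution is non-negative and sums to 1."""
    for dist in profile.values():
        if any(p < 0 for p in dist.values()):
            return False
        if abs(sum(dist.values()) - 1.0) > tol:
            return False
    return True


def sample_action(profile, info_set, rng):
    """Draw one action at an information set according to the profile."""
    actions = list(profile[info_set])
    weights = [profile[info_set][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

A player following this profile would call `sample_action(strategy_profile, "Q:", random.Random())` each time the information set `"Q:"` is reached.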

  24. Poker Research Background ● What type of strategy profile do we want? – Nash equilibrium ● Example: Rock-Paper-Scissors

  25. Poker Research Background [Figure: Rock-Paper-Scissors game tree. Player 1 chooses r, p, or s; player 2 then chooses r, p, or s. Payoffs to player 1: after r: 0, -1, +1; after p: +1, 0, -1; after s: -1, +1, 0.]

  26. Poker Research Background ● A Nash equilibrium strategy profile for Rock-Paper-Scissors. – “No one can change their strategy and do better.” [Figure: the Rock-Paper-Scissors tree with both players choosing each of r, p, and s with probability 1/3.]

  27. Poker Research Background ● A Nash equilibrium in a 2-player game is a defensive strategy: – “I can't lose no matter what my opponent does.” [Figure: the Rock-Paper-Scissors tree with player 1 choosing each of r, p, and s with probability 1/3 and player 2's probabilities shown as “?”.]
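The two quoted claims can be checked numerically for Rock-Paper-Scissors. The sketch below is an illustration under assumptions, not part of the talk: the function names are hypothetical, and the payoff matrix is read off the slide's tree (payoffs to player 1, with player 2 receiving the negative). It tests whether either player can gain by switching to any pure action, which is enough to detect a profitable deviation.

```python
from fractions import Fraction

# Payoffs to player 1. Rows: player 1 plays r, p, s. Columns: player 2 plays r, p, s.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def expected_payoff(p1, p2):
    """Expected payoff to player 1 when both players use mixed strategies."""
    return sum(p1[i] * p2[j] * PAYOFF[i][j] for i in range(3) for j in range(3))


def best_response_value_p1(p2):
    """Best payoff player 1 can get with a pure action against p2."""
    return max(sum(p2[j] * PAYOFF[i][j] for j in range(3)) for i in range(3))


def best_response_value_p2(p1):
    """Best payoff player 2 can get with a pure action against p1."""
    return max(-sum(p1[i] * PAYOFF[i][j] for i in range(3)) for j in range(3))


def is_nash(p1, p2):
    """True if neither player can do better by deviating."""
    value = expected_payoff(p1, p2)
    return best_response_value_p1(p2) <= value and best_response_value_p2(p1) <= -value


uniform = [Fraction(1, 3)] * 3
always_rock = [Fraction(1), Fraction(0), Fraction(0)]
```

Here `is_nash(uniform, uniform)` holds, while `is_nash(always_rock, uniform)` does not, because player 2 can switch to paper and win every round.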

  28. Poker Research Background [Diagram: Extensive-Form Game → ? → Nash Equilibrium Strategy Profile]

  29. Poker Research Background ● Use minimax (alpha-beta) search to compute Nash? Source: clker.com

  30. Poker Research Background ● Use minimax (alpha-beta) search to compute Nash? [Figure: the game tree with root node c dealing QJ or QK with probability 0.5 each and every action of players 1 and 2 labelled by its probability.]

  31. Poker Research Background ● Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007]. [Diagram: Strategy Profile 1, Deal Cards, “Play” Poker]

  32. Poker Research Background ● Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007]. [Diagram: Strategy Profile 1, Deal Cards, “Play” Poker; Strategy Profile 2, Deal Cards, “Play” Poker; ...; Strategy Profile T, Deal Cards, “Play” Poker, with T in the billions]

  33. Poker Research Background ● Instead, use Counterfactual Regret Minimization (CFR) [Zinkevich et al., NIPS 2007]. Average Strategy Profile = (Strategy 1 + Strategy 2 + ... + Strategy T) / T → Nash Equilibrium Strategy Profile as T → ∞
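The averaging idea on this slide can be illustrated with regret matching, the per-decision update rule that CFR applies at each information set. The sketch below is not CFR itself and not the presenter's code: it runs regret matching in self-play on one-shot Rock-Paper-Scissors, uses exact expected payoffs instead of dealing cards, and starts from arbitrary non-zero regrets (an assumption made only so the players do not begin at the equilibrium). All function names are hypothetical.

```python
# Payoffs to player 1. Rows: player 1 plays r, p, s. Columns: player 2 plays r, p, s.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / 3] * 3  # no positive regret: fall back to uniform


def average_strategies(iterations):
    """Self-play for `iterations` rounds; return both players' average strategies."""
    # Arbitrary starting regrets (assumption): player 1 leans rock, player 2 leans paper.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        s1 = regret_matching(regrets[0])
        s2 = regret_matching(regrets[1])
        # Expected value of each pure action against the opponent's current strategy.
        u1 = [sum(s2[j] * PAYOFF[i][j] for j in range(3)) for i in range(3)]
        u2 = [-sum(s1[i] * PAYOFF[i][j] for i in range(3)) for j in range(3)]
        ev1 = sum(s1[a] * u1[a] for a in range(3))
        ev2 = sum(s2[a] * u2[a] for a in range(3))
        for a in range(3):
            regrets[0][a] += u1[a] - ev1  # how much better action a would have done
            regrets[1][a] += u2[a] - ev2
            strategy_sums[0][a] += s1[a]
            strategy_sums[1][a] += s2[a]
    return [[total / iterations for total in sums] for sums in strategy_sums]
```

The individual strategies Strategy 1, ..., Strategy T keep cycling, but the averages returned by `average_strategies` move toward probability 1/3 on each action as the iteration count grows, which is the Rock-Paper-Scissors equilibrium shown earlier.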

  34. Poker Research Background [Diagram: Extensive-Form Game → CFR → Nash Equilibrium Strategy Profile]

  35. Poker Research Background ● Huge problem (no pun intended): [Diagram: Texas Hold'em (> 10^14) → CFR → Nash Equilibrium Strategy Profile (> 5 million GB)]

  36. Poker Research Background [Diagram: Extensive-Form Game → ? → Nash Equilibrium Strategy Profile]

  37. Poker Research Background [Diagram: Extensive-Form Game, Abstract Game]

  38. Poker Research Background ● Merge card deals into buckets. [Diagram: Extensive-Form Game, Abstract Game]

  39. Poker Research Background ● Merge card deals into buckets. [Diagram: Extensive-Form Game, Abstract Game]
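One simple way to picture bucketing is to give every deal a score and group deals with similar scores. The sketch below is an assumption-laden illustration, not the abstraction used in the talk: the scoring by a single strength number in [0, 1], the equal-width buckets, the function names, and the toy strength values are all made up for the example.

```python
def bucket_of(strength, num_buckets):
    """Map a strength score in [0, 1] to one of `num_buckets` equal-width buckets."""
    return min(int(strength * num_buckets), num_buckets - 1)


def build_abstraction(deal_strengths, num_buckets):
    """Group deals by bucket; the abstract game treats deals in one bucket as identical."""
    buckets = {}
    for deal, strength in deal_strengths.items():
        buckets.setdefault(bucket_of(strength, num_buckets), []).append(deal)
    return buckets


# Made-up strength scores for a handful of starting hands (illustrative only).
toy_strengths = {"AA": 0.9, "KK": 0.8, "JTs": 0.5, "72o": 0.2, "32o": 0.1}
abstraction = build_abstraction(toy_strengths, 3)
```

With three buckets, the five toy deals collapse into three abstract "cards", so a strategy only needs one entry per bucket instead of one per deal.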

  40. Poker Research Background [Diagram: Abstract Game (≈ 10^9), Extensive-Form Game (> 10^14)]

  41. Poker Research Background [Diagram: Abstract Game (≈ 10^9), Extensive-Form Game (> 10^14). CFR on the Abstract Game: Strategy Profile, Deal Buckets, “Play” “Poker”, billions of times, giving an Abstract Game Equilibrium Strategy]

  42. Poker Research Background [Diagram: Abstract Game (≈ 10^9), Extensive-Form Game (> 10^14). Abstract Game Equilibrium Strategy, Approximate Full Game Equilibrium Strategy, ≈ 100 GB]

  43. Outline of Presentation ● Computer Poker Primer – Motivation – Background ● New Contributions to Computer Poker – Research + Hyperborean3p ● Future Research – AI in Video Games – StarCraft, ALE, automated content generation ● Teaching Interests – Game design, AI in video games

  44. Contribution 1: Domination [Figure: a player 2 decision between f (+1) and c (+2)]

  45. Domination [Diagram: 3-or-more Player Abstract Game → CFR → ? (Not equilibrium)]

  46. Domination ● Annual Computer Poker Competition, 3-Player Limit Texas Hold'em, 2009. Total Bankroll (mbb/g) by agent: Hyperborean3p 319 ± 2; dpp 171 ± 2; akuma 151 ± 2; CMURingLimit -37 ± 2; dcu3pl -63 ± 2; Bluechip -548 ± 2. [Diagram: 3-or-more Player Abstract Game → CFR → ? (Not equilibrium)]

  47. Domination [Figure: game tree. The root node c deals QJ or QK with probability 0.5 each; players 1 and 2 then act in turn with actions c, b, and f; terminal payoffs range from -2 to +2.]

  48. Domination [Figure: the same game tree.]

  49. Domination ● Dominated Strategies [Figure: the same game tree with the dominated strategies marked.]

  50. Domination [Figure: the game tree with the dominated actions removed.]

  51. Domination ● Iteratively Dominated Strategy [Figure: the reduced game tree with an iteratively dominated strategy marked.]
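The distinction between dominated and iteratively dominated strategies can be sketched on a small matrix game. This is a simplified illustration, not the talk's method: it works on a 2-player normal-form game instead of the extensive-form tree on the slides, it only considers strict domination by pure strategies, and the payoff matrices and function names are invented for the example.

```python
def strictly_dominated(payoff, own, other, candidate):
    """True if another surviving strategy beats `candidate` against every surviving opponent strategy."""
    return any(
        all(payoff(alt, opp) > payoff(candidate, opp) for opp in other)
        for alt in own
        if alt != candidate
    )


def iterated_elimination(a, b):
    """Repeatedly delete strictly dominated rows (player 1) and columns (player 2).

    a[i][j] is player 1's payoff and b[i][j] is player 2's payoff
    when player 1 plays row i and player 2 plays column j.
    """
    rows = list(range(len(a)))
    cols = list(range(len(a[0])))
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            if strictly_dominated(lambda i, j: a[i][j], rows, cols, r):
                rows.remove(r)
                changed = True
        for c in list(cols):
            if strictly_dominated(lambda j, i: b[i][j], cols, rows, c):
                cols.remove(c)
                changed = True
    return rows, cols


# Invented example: column 1 is dominated outright; row 0 is dominated only
# once column 1 is gone, so row 0 is iteratively dominated.
A = [[1, 3], [2, 0]]
B = [[2, 1], [3, 0]]
surviving_rows, surviving_cols = iterated_elimination(A, B)
```

In this example row 0 looks reasonable at first because it earns 3 against column 1, but player 2 never has a reason to play column 1, so row 0 is removed in the second pass.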

  52. Domination [Diagram: 3-or-more Player Abstract Game → CFR → Average Strategy Profile; as T → ∞: No Iteratively Dominated Strategies (New!)] [G., submitted to EC 2013]

  53. Domination [Diagram, left: 3-or-more Player Abstract Game → CFR → Average Strategy Profile; as T → ∞: No Iteratively Dominated Strategies (New!). Right: 3-or-more Player Abstract Game → CFR → “Current” Strategy Profile; at finite T: No Iteratively Dominated Strategies (New!)] [G., submitted to EC 2013]

  54. Domination 3-Player Limit Texas Hold'em - 2012 New! [G., submitted to EC 2013]

  55. Contribution 2: Strategy Stitching

  56. Strategy Stitching ● 2-player Limit Texas Hold'em: ≈ 10^14, ≈ 59,000,000 “Turn” Deals; Abstract Game: ≈ 10^9, 540,000 “Turn” Buckets

  57. Strategy Stitching ● 2-player Limit Texas Hold'em: ≈ 10^14, ≈ 59,000,000 “Turn” Deals; Abstract Game: ≈ 10^9, 540,000 “Turn” Buckets ● 3-player Limit Texas Hold'em: ≈ 10^17, ≈ 59,000,000 “Turn” Deals; Abstract Game: ≈ 10^9, 540 “Turn” Buckets
