 
Survey of Artificial Intelligence for Card Games and Its Application to the Swiss Game Jass
J. Niklaus* (1), M. Alberti* (1), V. Pondenkandath (1), R. Ingold (1), M. Liwicki (1, 2)
* Equal contribution
(1) DIVA Group, University of Fribourg, Switzerland
(2) EISLAB Machine Learning, Luleå University of Technology, Sweden

Perfect Information Games
- Great success of artificial intelligence in games over the last decades
- AlphaGo (Go), AlphaZero (Chess), Maluuba (Pac-Man), …

Why Care About Hidden Information Games?
- Hidden information makes games hard
- They are very close to real-world applications: energy optimization in datacenters, surgical operations, business, physics, …

AI in Hidden Information Games
- Several milestones have been reached recently
- OpenAI Five (Dota 2), Libratus (Poker), AlphaStar (StarCraft II), …

The Next Challenge: Jass
- Traditional Swiss trick-taking card game
- Schieber variant with 4 players
- Hidden information
- Sequential
- Non-cooperative
- Finite
- Constant-sum

Coordination Game Within Jass
- Activity within a team → coordination game
- Players can convey meaning by playing cards according to a common protocol
- Agreements like a “discarding policy”
- Example: Top-Down. Player 3 plays a low Diamond card to signal a strong Diamond suit (probably the Ace or at least the King)
[Figure: Players 1, 2, and 3 at the table]

Jass vs. Hanabi
Jass:
- Competitive and cooperative
- Cooperation is key to success at a high level
- Multiplayer game
- Hidden information
Hanabi:
- “New Frontier of AI Research” (DeepMind)
- Purely cooperative
- Multiplayer game
- Hidden information
→ Suitable testbed

Our Contribution
- Survey of existing methods for card games: rule-based, evolutionary, RL, MCTS
- Starting point for research in hidden information or card games
- Discussion of the methods on the use case Jass

Rule-Based Systems
- Leverage human knowledge
- Simple
- Used as baselines
- Hanabi is solved only by rule-based (RB) systems so far
- Can be seen as man-made decision trees

Evolutionary Algorithms
- Inspired by evolutionary theory: survival of the fittest
- Example 1: the population consists of Blackjack strategies; the fitness function is the money left after playing the game for N iterations
- Example 2: the population consists of Hearthstone decks; the fitness function is the score after playing versus established human-designed decks

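The Blackjack example can be sketched in a few lines. This is a minimal illustration under loudly simplified assumptions that are not from the slides: a strategy is a single "hit below this total" threshold, cards come from an infinite deck, the ace always counts 11, every bet is one unit, and the names `play_hand`, `fitness`, and `evolve` are hypothetical.

```python
import random

# Simplified card values (assumption): ace always counts 11, infinite deck.
CARD_VALUES = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11]


def draw(rng):
    return rng.choice(CARD_VALUES)


def play_hand(threshold, rng):
    """Play one simplified round; return +1 (win), 0 (push) or -1 (loss)."""
    player = draw(rng) + draw(rng)
    while player < threshold:  # the "strategy": hit while below the threshold
        player += draw(rng)
    if player > 21:
        return -1
    dealer = draw(rng) + draw(rng)
    while dealer < 17:  # dealer hits until reaching at least 17
        dealer += draw(rng)
    if dealer > 21 or player > dealer:
        return 1
    return 0 if player == dealer else -1


def fitness(threshold, rounds, rng):
    """Money won (or lost) after `rounds` unit bets."""
    return sum(play_hand(threshold, rng) for _ in range(rounds))


def evolve(generations=15, pop_size=12, rounds=300, seed=0):
    rng = random.Random(seed)
    population = [rng.randint(4, 21) for _ in range(pop_size)]
    for _ in range(generations):
        # Survival of the fittest: keep the better half of the population.
        ranked = sorted(population, key=lambda t: fitness(t, rounds, rng), reverse=True)
        survivors = ranked[: pop_size // 2]
        # Offspring are mutated copies of the survivors (threshold shifted by at most 1).
        children = [min(21, max(4, t + rng.choice([-1, 0, 1]))) for t in survivors]
        population = survivors + children
    # Re-evaluate with more rounds to pick the final strategy.
    return max(population, key=lambda t: fitness(t, 2000, rng))


if __name__ == "__main__":
    print("Evolved stand threshold:", evolve())
```

Because the fitness is a noisy estimate, different seeds may yield slightly different thresholds; more rounds per evaluation reduce that noise at higher cost.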
Reinforcement Learning in a Nutshell

Counterfactual Regret Minimization (CFR)

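CFR builds on regret matching: each action is played in proportion to its accumulated positive regret, and the average strategy over many self-play iterations approaches a Nash equilibrium in two-player zero-sum games. The sketch below is not full CFR and is not taken from the slides; it shows only regret matching on rock-paper-scissors, chosen as an assumption because it is small. The deliberately biased starting regrets and all function names are hypothetical.

```python
# Payoff for the player choosing the row action against the column action.
# Order: rock, paper, scissors. The game is symmetric, so the same table
# serves both players.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Regret matching: play actions in proportion to positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def action_values(opponent_strategy):
    """Expected payoff of each action against a mixed opponent strategy."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(3))
        for a in range(3)
    ]


def self_play(iterations=20_000):
    # Biased starting regrets (assumption): player 0 starts with pure rock,
    # player 1 with pure paper, so both begin far from equilibrium.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in range(2):
            values = action_values(strategies[1 - player])
            expected = sum(strategies[player][a] * values[a] for a in range(3))
            for a in range(3):
                # Regret: how much better action a would have been.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    # The average strategy, not the current one, is what converges.
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(self_play()):
        print(player, [round(p, 3) for p in avg])
```

In rock-paper-scissors the equilibrium is to play each action with probability 1/3, so the printed averages should lie close to that value. CFR applies the same update at every information set of a sequential game, weighted by counterfactual reach probabilities.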
Reinforcement Learning Methods
- Temporal Difference Learning
- Various Policy Gradient
- Counterfactual Regret Minimization (CFR): CFR+, Deep CFR, Discounted CFR
- Neural Fictitious Self-Play
- First Order Methods

Monte Carlo Simulation in a Nutshell
- The problem is analytically very hard or impossible to solve
- Stochastic solution: a large number of random experiments
- Example: approximate π as 4 × the probability that a random point in a square lies within the inscribed circle

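The π example can be written out directly. This is a minimal sketch; the function name `estimate_pi`, the sample count, and the seed are illustrative choices, not part of the slides.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi from the fraction of random points that fall in the unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        # Random point in the square [-1, 1] x [-1, 1].
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # inside the inscribed circle of radius 1
            inside += 1
    # Area ratio circle/square is pi/4, so pi is about 4 * P(point in circle).
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

The error shrinks roughly with the square root of the number of samples, which is why Monte Carlo methods trade precision for a large number of cheap random experiments.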
Monte Carlo Tree Search (MCTS)

Variations of Monte Carlo Methods
- Monte Carlo Simulation
- Flat Monte Carlo
- Monte Carlo Tree Search
- Upper Confidence Bound for Trees
- Determinization
- Information Set Monte Carlo Tree Search
- Monte Carlo Sampling for Regret Minimization

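Upper Confidence Bound for Trees is commonly described as MCTS whose selection step uses the UCB1 rule: pick the child that maximizes its average reward plus an exploration bonus that shrinks as the child is visited more often. The sketch below shows only that selection step, under the assumption of the standard UCB1 formula with exploration constant √2; the names `ucb1` and `select_child` and the tuple layout are hypothetical.

```python
import math


def ucb1(total_reward, visits, parent_visits, exploration=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    average = total_reward / visits
    bonus = exploration * math.sqrt(math.log(parent_visits) / visits)
    return average + bonus


def select_child(children, parent_visits):
    """Return the index of the child with the highest UCB1 score.

    `children` is a list of (total_reward, visits) tuples (assumed layout).
    """
    return max(
        range(len(children)),
        key=lambda i: ucb1(children[i][0], children[i][1], parent_visits),
    )


if __name__ == "__main__":
    stats = [(9.0, 10), (1.0, 10), (0.0, 0)]
    print(select_child(stats, parent_visits=20))
```

A larger exploration constant makes the search try rarely visited moves more often; a smaller one concentrates simulations on moves that already look good.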
Application to the Use Case of Jass
- MCTS and CFR are the methods most successfully applied to card games
- CFR: only used in Poker so far; approaches a Nash equilibrium (NE); bad at exploiting opponents
- MCTS: applied to a plethora of complex card games; no guarantees of approaching an NE; good at finding a good solution fast
- The choice depends on the goal:
  - AI that never loses, costs don’t matter → CFR
  - AI that performs well and can exploit opponents → MCTS

Preliminary Results
- Jass server: http://jass.joeli.to
- Play against the MCTS bot!
- Suggestion engine based on the MCTS bot
- Beats the average human player (par-human)
- Introduction video: bit.ly/jass_intro
- Experiment: bit.ly/jass_form

Conclusion
- Rule-based systems are being replaced by the most popular methods, MCTS and CFR
- MCTS finds good strategies fast but is exploitable by very good opponents
- CFR is not exploitable but cannot exploit others
- Jass is a hard game and hence a suitable testbed for new technologies and methods in the field
- We propose using MCTS, and our preliminary results suggest it is a good idea
- Our bot performs at par-human level