CSC 380 Final Presentation Connect 4 David Alligood, Scott Swiger, Jo Van Voorhis
Intro Connect 4 is a zero-sum game: one player's gain is exactly the other player's loss, so the two always sum to 0, which gives the “zero-sum” game its name. Either one player wins and the other loses, or the game ends in a tie in which neither player gains anything; there is no “mutual” win or loss. Connect 4 is also a game of perfect information, which means that no matter whose turn it is, that player can see the whole board and is aware of all previous moves. A strong solution to a game means knowing the outcome of the game, under perfect play, from any given board position.
Informal We are observing and analyzing the way different algorithms try to get 4 pieces in a row on a Connect 4 board. How does each one decide where to play its piece? Which one is the most accurate? Which one makes the fastest decision? These are the factors we look at to determine which algorithm is the best at playing Connect 4.
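Formal A board, s, is represented by a two-dimensional array with 6 rows and 7 columns, where x = row and y = column of s and each occupied cell (x, y) holds X or O. The board is won when four occupied cells in a line hold the same piece, that is, when one of the following is true: ● Horizontal: (x, y) = (x, y+1) = (x, y+2) = (x, y+3) ● Vertical: (x, y) = (x+1, y) = (x+2, y) = (x+3, y) ● Diagonal: (x, y) = (x+1, y+1) = (x+2, y+2) = (x+3, y+3) ● Opposite diagonal: (x, y) = (x+1, y-1) = (x+2, y-2) = (x+3, y-3)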
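The win condition can be sketched in Python as follows. This is a minimal illustration, not the presenters' code: the names `has_won`, `DIRECTIONS`, `ROWS`, and `COLS` are hypothetical, the board is assumed to be a list of 6 rows with 7 cells each, and both diagonal directions are checked.

```python
ROWS, COLS = 6, 7

# (row step, column step): horizontal, vertical, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]

def has_won(board, piece):
    """Return True if `piece` fills four consecutive cells in any direction."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in DIRECTIONS:
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                # The bounds test runs first, so board[rr][cc] is only
                # evaluated for cells that are actually on the board.
                if all(0 <= rr < ROWS and 0 <= cc < COLS
                       and board[rr][cc] == piece
                       for rr, cc in cells):
                    return True
    return False
```

Scanning every cell in every direction is simple rather than fast; it is enough to show how the conditions translate into code.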
MinMax ● Player 1: Maximize score ● Player 2: Minimize score ● Explores each possible move within the “game tree” ● Recursive ● O(b^d), where b = # of possible moves per turn and d = max # of turns until end of game (depth)
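The recursion can be illustrated on a toy game tree. This is a sketch under stated assumptions, not the presenters' implementation: a node is assumed to be either a number (the score of a finished game) or a list of child nodes, and the function name `minimax` is chosen here for illustration.

```python
def minimax(node, maximizing):
    """Return the minimax value of `node`.

    A node is either a number (terminal score) or a list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    # The players alternate, so each level flips between max and min.
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Three moves for Player 1, each answered by two replies from Player 2.
tree = [[3, 5], [2, 9], [0, 7]]
best = minimax(tree, True)   # Player 2 would reply with 3, 2, 0; Player 1 picks 3
```

Every leaf is visited exactly once, which is where the O(b^d) cost comes from.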
MinMax MinMax problem: For the computer to play Connect 4 perfectly with MinMax, the entire game tree must be explored. Connect 4 has 4,531,985,219,092 board states (over 4 trillion), so iterating through such a vast tree would take an impractical amount of time.
MinMax Heuristic ● Allow MinMax to score game states with other methods ● Give scores for every 4 in-a-row, 3 in-a-row and 2 in-a-row: ((comp_4s * 10000) + (comp_3s * 100) + (comp_2s * 10)) - ((user_4s * 10000) + (user_3s * 100) + (user_2s * 10)) ● Example board on the slide: Board Score: 90 ● This allows game states to be scored without having to reach terminal states ● Loss of perfection, but playable
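The scoring formula can be written as a small Python function. The function name `board_score` is hypothetical, and the counts in the example are made up to show one combination that yields 90; they are not taken from the board pictured on the slide.

```python
def board_score(comp_4s, comp_3s, comp_2s, user_4s, user_3s, user_2s):
    """Weighted count of the computer's runs minus the user's runs."""
    comp = comp_4s * 10000 + comp_3s * 100 + comp_2s * 10
    user = user_4s * 10000 + user_3s * 100 + user_2s * 10
    return comp - user

# Hypothetical counts: the computer has one 3-in-a-row and one 2-in-a-row,
# the user has two 2-in-a-rows: (100 + 10) - 20 = 90.
example = board_score(0, 1, 1, 0, 0, 2)
```

Because a single 4-in-a-row is worth 10000, a winning position outweighs any realistic number of shorter runs.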
Alpha - Beta The Alpha Beta algorithm is a tree-traversing search algorithm whose primary goal is to decrease the number of nodes visited by the MinMax algorithm, making it an improvement over MinMax. In the best case A-B visits roughly the square root of the number of nodes MinMax would, so it can search about twice as deep in the same time. It does this by “pruning” branches that cannot change the final decision: each node's value is compared against two bounds, alpha and beta, which start at each player's worst possible score.
Alpha - Beta ● α = value of the best choice found so far for MAX (highest value) ● β = value of the best choice found so far for MIN (lowest value) ● Each node keeps track of its [α,β] values ● The search tree can be pruned at a MAX node as soon as we know that the score of the position is at least beta ● Searches to a given depth more efficiently than MiniMax
Alpha Beta Pruning ● Best Case: O(√(b^d)) = O(b^(d/2)) ● Worst Case: O(b^d) ● An improvement over MiniMax in terms of efficiency, as MiniMax's running time of O(b^d) is AlphaBeta's worst case ● Branches of the search can be eliminated ○ “Pruning” can be accomplished as soon as the score of the position is at least beta ● Initially α = -∞ and β = +∞, meaning that each player starts with their worst possible state/score ● A score is only of interest while it is greater than alpha and less than beta; once α ≥ β the remaining branches are pruned
Alpha Beta Illustration
Alpha Beta pseudocode ● We created a solver that follows the Alpha Beta Pruning mechanism and applies it to the Connect Four game states
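The slide's pseudocode is not reproduced here, so the following is a generic Python sketch of alpha-beta pruning rather than the presenters' solver. It assumes a toy game tree in which a node is either a number (terminal score) or a list of children; the `visited` list is a hypothetical addition used only to show which leaves are actually examined.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node`, skipping branches that cannot matter."""
    if visited is None:
        visited = []
    if isinstance(node, (int, float)):
        visited.append(node)          # record every leaf that is evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:         # MIN already has a better option elsewhere
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:             # MAX already has a better option elsewhere
            break
    return value

tree = [[3, 5], [2, 9], [0, 7]]
seen = []
result = alphabeta(tree, True, visited=seen)
```

On this tree the result is the same as plain minimax (3), but only four of the six leaves are evaluated: once the first branch guarantees MAX a score of 3, the replies 2 and 0 are enough to discard the second and third branches.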
Alpha Beta Implementation ● The benefit of implementing Alpha Beta is that branches of the search tree can be eliminated, so fewer than the full b^d nodes are visited, where d is the depth of game states/player moves (plies) and b is the branching factor ● Due to this, a deeper search can be achieved in the same amount of time that a respective MinMax run would take ● Alpha Beta, just like MiniMax, can be improved upon heuristically, for example by searching first the parts of the tree that would cause early alpha-beta cutoffs ● This heuristic implementation allows for greater improvement without sacrificing accuracy
Our Algorithm Every way four consecutive, like-colored pieces could be positioned on a Connect-4 board can be visualized. Our algorithm recognizes these visualizations through the use of its numeric database.
Board positions:
0 1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31 32 33 34
35 36 37 38 39 40 41
Horizontal: 35 36 37 38 | 36 37 38 39 | 37 38 39 40 | 38 39 40 41 | 28 29 30 31 | 29 30 31 32 | 30 31 32 33 | 31 32 33 34 | 21 22 23 24 | 22 23 24 25 | 23 24 25 26 | 24 25 26 27 | 14 15 16 17 | 15 16 17 18 | 16 17 18 19 | 17 18 19 20 | 10 11 12 13 | 9 10 11 12 | 8 9 10 11 | 7 8 9 10 | 0 1 2 3 | 1 2 3 4 | 2 3 4 5 | 3 4 5 6
Vertical: 35 28 21 14 | 28 21 14 7 | 21 14 7 0 | 36 29 22 15 | 29 22 15 8 | 22 15 8 1 | 37 30 23 16 | 30 23 16 9 | 23 16 9 2 | 38 31 24 17 | 31 24 17 10 | 24 17 10 3 | 39 32 25 18 | 32 25 18 11 | 25 18 11 4 | 40 33 26 19 | 33 26 19 12 | 26 19 12 5 | 41 34 27 20 | 34 27 20 13 | 27 20 13 6
Diagonal: 35 29 23 17 | 28 22 16 10 | 21 15 9 3 | 36 30 24 18 | 29 23 17 11 | 22 16 10 4 | 37 31 25 19 | 30 24 18 12 | 23 17 11 5 | 38 32 26 20 | 31 25 19 13 | 24 18 12 6 | 41 33 25 17 | 34 26 18 10 | 27 19 11 3 | 40 32 24 16 | 33 25 17 9 | 26 18 10 2 | 39 31 23 15 | 32 24 16 8 | 25 17 9 1 | 38 30 22 14 | 31 23 15 7 | 24 16 8 0
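The lists on this slide can also be generated instead of stored. The sketch below is an illustration, not the presenters' data file: it assumes the cells are numbered 0 to 41 row by row with 7 cells per row, as on the slide, and the name `winning_lines` is hypothetical. Each generated list is in ascending order, whereas the slide writes some of them in descending order.

```python
ROWS, COLS = 6, 7

def winning_lines():
    """Return every group of four cell numbers that forms a line on the board."""
    lines = []
    for r in range(ROWS):
        for c in range(COLS):
            # horizontal, vertical, and the two diagonal directions
            for dr, dc in [(0, 1), (1, 0), (1, 1), (1, -1)]:
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if 0 <= end_r < ROWS and 0 <= end_c < COLS:
                    lines.append([(r + i * dr) * COLS + (c + i * dc)
                                  for i in range(4)])
    return lines

lines = winning_lines()
total = len(lines)   # 24 horizontal + 21 vertical + 24 diagonal
```

The total of 69 matches the number of lists the presentation reports for its data file.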
Game Play Our goal was to produce an algorithm that would move quickly, anticipate the player’s next move, find its best move, and overall, do its best to win. ● If it is the beginning of a game, make the initial move in the center column. ● If there is a move which will result in a win, take it. ● Else if there is a move which will prevent the opponent from winning, take it. ● Else if there is a move which will set the opponent up for a win, avoid it. ● Else, randomly make a move in a column that is not full. These checks are how we get our Big O.
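The decision rules on this slide could be sketched in Python roughly as below. This is an interpretation, not the presenters' code: the board is assumed to be a list of 42 cells numbered row by row with 35 to 41 as the bottom row, "beginning of a game" is taken to mean the computer has no piece on the board yet, and all names (`choose_move`, `drop`, `wins`, `player_can_win`) are hypothetical.

```python
import random

ROWS, COLS = 6, 7
EMPTY, COMPUTER, PLAYER = ".", "C", "P"

# All groups of four cells in a line (horizontal, vertical, both diagonals).
LINES = [[(r + i * dr) * COLS + (c + i * dc) for i in range(4)]
         for r in range(ROWS) for c in range(COLS)
         for dr, dc in [(0, 1), (1, 0), (1, 1), (1, -1)]
         if 0 <= r + 3 * dr < ROWS and 0 <= c + 3 * dc < COLS]

def drop(board, col, piece):
    """Return a copy of `board` with `piece` dropped in `col`, or None if full."""
    for r in range(ROWS - 1, -1, -1):          # start from the bottom row
        if board[r * COLS + col] == EMPTY:
            new = board[:]
            new[r * COLS + col] = piece
            return new
    return None

def wins(board, piece):
    return any(all(board[i] == piece for i in line) for line in LINES)

def player_can_win(board):
    """True if the player has a single move that wins on `board`."""
    for col in range(COLS):
        after = drop(board, col, PLAYER)
        if after is not None and wins(after, PLAYER):
            return True
    return False

def choose_move(board, rng=random):
    # A column is playable while its top cell is empty (board must not be full).
    legal = [c for c in range(COLS) if board[c] == EMPTY]
    if COMPUTER not in board:                   # first move: center column
        return COLS // 2
    for c in legal:                             # take a winning move
        if wins(drop(board, c, COMPUTER), COMPUTER):
            return c
    for c in legal:                             # block the opponent's win
        if wins(drop(board, c, PLAYER), PLAYER):
            return c
    # Avoid moves that hand the opponent a winning reply, if possible.
    safe = [c for c in legal if not player_can_win(drop(board, c, COMPUTER))]
    return rng.choice(safe or legal)            # otherwise move at random
```

Passing `rng` makes the random fallback reproducible in experiments, for example with `random.Random(0)`.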
The Collected Data: 30 Games
Run Time is Highly Influenced by Player’s Approach
Getting the Big O ● Scanning the A lists: A x C ● Scanning the B lists: B x C ● Total: A x C + B x C = C (A + B)
To Sum Up Our Algorithm The Big O, which is Big O(C(A+B)): ● A = the number of lists in winningBoardStates (lists are eliminated as they contain the player’s piece) ● B = the number of lists in originalBoardStates (lists are eliminated as they contain the computer’s piece) ● Both originate from the data file containing the 69 lists of horizontal, vertical, and diagonal positions. ● C = the number of variables in the lists of A and B (which is 4, since we want 4 in a row) ● The run time is highly influenced by the player, as they may choose to play in such a way that few lists can be eliminated from A, keeping the run times higher. ● Worst case: 4(full winningBoardStates + full originalBoardStates) = 4(69 + 69) = 4(138) = 552 ● Best case: 4(one winningBoardStates + one originalBoardStates) = 4(2) = 8
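The effect of eliminating lists on the C(A+B) count can be sketched as follows. This is an illustration under assumptions: the function names `prune` and `check_cost` are hypothetical, and the three sample lists are a small subset of the 69 lists, chosen only to keep the example short.

```python
def prune(lists, occupied):
    """Keep only the lists that contain none of the cells in `occupied`."""
    return [line for line in lists if not occupied.intersection(line)]

def check_cost(a, b, c=4):
    """Cell comparisons for one scan of both collections: C * (A + B)."""
    return c * (a + b)

sample = [[35, 36, 37, 38], [36, 37, 38, 39], [14, 21, 28, 35]]

# If the player takes cell 35, two of the three sample lists can no longer
# be completed by the computer and drop out of the A collection.
remaining = prune(sample, {35})

worst = check_cost(69, 69)   # no list eliminated from either collection
best = check_cost(1, 1)      # one list left in each collection
```

A player who spreads pieces so that few lists are eliminated keeps A close to 69, which is why the measured run times depend on the player's approach.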
Conclusions ● AlphaBeta vs MinMax runtimes ● Early game vs late game ● Heuristic impact on effectiveness ● Preferred algorithm? AlphaBeta seems to be the most efficient.
Future Work ● Look into heuristics that are playable, yet perfect / more accurate. ● Our Algorithm: Add a method to see if there are two in a row horizontally in columns 1-5, to avoid the scenario pictured on the slide. ● See how different depths in the game tree affect gameplay.
Questions 1. Why is Connect 4 considered a zero-sum game? 2. What is the worst case complexity for AlphaBeta but best case for MinMax? 3. What is pruning? 4. Why is a heuristic needed in order for MinMax to be playable for Connect 4? 5. What is the downside of a heuristic approach in the case of both MinMax and A-B?
Answers 1. Either one player wins everything or both players win nothing. 2. O(b^d), where b = number of possible moves and d = depth. 3. Eliminating branches from a tree that do not need to be searched. 4. Because it would take far too long to iterate through the entire game tree. 5. There is a loss of perfection, but the game gains playability.