  1. For Thursday • No reading • Homework: – Chapter 3, exercise 23 – Do this twice, once with A* search (as specified) and once with greedy best-first search

  2. Program 1 • Any questions?

  3. Discussion Assignment

  4. Admissible Heuristic • An admissible heuristic is one that never overestimates the cost to reach the goal. • It is always less than or equal to the actual cost. • If we have such a heuristic, we can prove that best-first search using f(n) is both complete and optimal; this is A* search.

  5. Focus on Total Path Cost • Uniform cost search uses g(n), the path cost so far • Greedy search uses h(n), the estimated path cost to the goal • What we’d like to use instead is f(n) = g(n) + h(n) to estimate the total path cost
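
To make f(n) = g(n) + h(n) concrete, here is a minimal A* sketch over an explicit graph; the graph, the heuristic table, and the names (a_star, neighbors, h_est) are illustrative assumptions, not from the slides.

    import heapq

    def a_star(start, goal, neighbors, h):
        # neighbors(n) yields (successor, step_cost) pairs;
        # h(n) is an admissible estimate of the remaining cost to the goal.
        frontier = [(h(start), 0, start, [start])]        # entries are (f, g, state, path)
        best_g = {start: 0}
        while frontier:
            f, g, state, path = heapq.heappop(frontier)   # pop the node with smallest f = g + h
            if state == goal:
                return path, g
            for succ, cost in neighbors(state):
                g2 = g + cost
                if g2 < best_g.get(succ, float("inf")):   # found a cheaper path to succ
                    best_g[succ] = g2
                    heapq.heappush(frontier, (g2 + h(succ), g2, succ, path + [succ]))
        return None, float("inf")

    # Toy graph and heuristic table (made up for illustration).
    graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
             "C": [("D", 3)], "D": []}
    h_est = {"A": 4, "B": 3, "C": 2, "D": 0}
    print(a_star("A", "D", lambda n: graph[n], lambda n: h_est[n]))   # (['A', 'B', 'C', 'D'], 5)

With h(n) = 0 everywhere this reduces to uniform cost search; ordering by h(n) alone and ignoring g(n) gives greedy best-first search.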

  6. Heuristics Don’t Solve It All • NP-complete problems still have worst-case exponential time complexity • A good heuristic function can: – Find a solution for an average problem efficiently – Find a reasonably good (but not optimal) solution efficiently

  7. 8-Puzzle Heuristic Functions • Number of tiles out of place • Manhattan Distance • Which is better? • Effective branching factor
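
A quick sketch of the two heuristics for a 3x3 board stored as a flat tuple with 0 for the blank; the representation and function names are my own, not from the slides.

    def misplaced_tiles(state, goal):
        # Count tiles (not the blank) that are not on their goal square.
        return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

    def manhattan_distance(state, goal):
        # Sum of horizontal + vertical distances of each tile from its goal square.
        goal_pos = {tile: divmod(i, 3) for i, tile in enumerate(goal)}
        total = 0
        for i, tile in enumerate(state):
            if tile == 0:
                continue
            r, c = divmod(i, 3)
            gr, gc = goal_pos[tile]
            total += abs(r - gr) + abs(c - gc)
        return total

    goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
    state = (1, 2, 3, 4, 0, 6, 7, 5, 8)
    print(misplaced_tiles(state, goal), manhattan_distance(state, goal))   # 2 2

Manhattan distance is never smaller than the misplaced-tile count while remaining admissible, which is why it tends to give the better (smaller) effective branching factor.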

  8. Inventing Heuristics • Relax the problem • Cost of solving a subproblem • Learn weights for features of the problem

  9. Local Search • Works from the “current state” • No focus on path • Also useful for optimization problems

  10. Local Search • Advantages? • Disadvantages?

  11. Hill-Climbing • Also called gradient descent • Greedy local search • Move from current state to a state with a better overall value • Issues: – Local maxima – Ridges – Plateaux
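
A minimal steepest-ascent hill-climbing sketch; successors and value are problem-specific placeholders, and the toy one-dimensional objective at the end is made up for illustration.

    def hill_climb(state, successors, value):
        # Steepest ascent: move to the best neighbor until none improves on
        # the current state (a local maximum, ridge, or plateau).
        while True:
            neighbors = successors(state)
            if not neighbors:
                return state
            best = max(neighbors, key=value)
            if value(best) <= value(state):
                return state
            state = best

    # Toy objective: maximize -(x - 3)^2 by stepping left or right.
    value = lambda x: -(x - 3) ** 2
    succ  = lambda x: [x - 1, x + 1]
    print(hill_climb(10, succ, value))   # 3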

  12. Variations on Hill Climbing • Stochastic hill climbing • First-choice hill climbing • Random-restart hill climbing
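
Random-restart hill climbing can be sketched as a thin wrapper, reusing hill_climb and the toy succ/value objective from the sketch above; the number of restarts and the random-state generator are arbitrary choices.

    import random

    def random_restart_hill_climb(random_state, successors, value, restarts=25):
        # Run hill climbing from several random starts and keep the best result.
        best = None
        for _ in range(restarts):
            result = hill_climb(random_state(), successors, value)
            if best is None or value(result) > value(best):
                best = result
        return best

    print(random_restart_hill_climb(lambda: random.randint(-100, 100), succ, value))   # 3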

  13. Evaluation of Hill Climbing

  14. Simulated Annealing • Similar to hill climbing, but-- – We select a random successor – If that successor improves things, we take it – If not, we may take it, based on a probability – Probability gradually goes down
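
A minimal simulated-annealing sketch under those rules: a random successor is always accepted if it improves the value, and otherwise accepted with probability exp(delta / T), where T follows a geometric cooling schedule; all parameter values here are assumptions.

    import math, random

    def simulated_annealing(state, successors, value, t0=10.0, cooling=0.95, t_min=1e-3):
        # Pick a random successor; always accept improvements, accept worsening
        # moves with probability exp(delta / T); lower the temperature T gradually.
        t = t0
        while t > t_min:
            nxt = random.choice(successors(state))
            delta = value(nxt) - value(state)
            if delta > 0 or random.random() < math.exp(delta / t):
                state = nxt
            t *= cooling
        return state

    # Toy usage: same 1-D objective as the hill-climbing sketch.
    value = lambda x: -(x - 3) ** 2
    succ  = lambda x: [x - 1, x + 1]
    print(simulated_annealing(50, succ, value))   # typically ends at or near 3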

  15. Local Beam Search • Variant of hill-climbing where multiple states and successors are maintained
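
A sketch of local beam search with beam width k; unlike k independent hill climbs, the k survivors are chosen from the pooled successors of all current states. The names and the toy objective are placeholders.

    def local_beam_search(starts, successors, value, k=4, iters=100):
        # Keep the k best states; each round, pool the successors of every state
        # in the beam and keep the k best of (beam + pool).
        beam = sorted(starts, key=value, reverse=True)[:k]
        for _ in range(iters):
            pool = {s for state in beam for s in successors(state)}
            candidates = sorted(pool | set(beam), key=value, reverse=True)[:k]
            if candidates == beam:
                break
            beam = candidates
        return beam[0]

    # Toy usage on the same 1-D objective.
    value = lambda x: -(x - 3) ** 2
    succ  = lambda x: [x - 1, x + 1]
    print(local_beam_search([40, -17, 88, 5], succ, value, k=4))   # 3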

  16. Genetic Algorithms • Have a population of k states (or individuals) • Have a fitness function that evaluates the states • Create new individuals by randomly selecting pairs and mating them using a randomly selected crossover point. • More fit individuals are selected with higher probability. • Apply random mutation. • Keep the top k individuals for the next generation.
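
A small genetic-algorithm sketch on bit strings, with fitness-proportionate selection, single-point crossover, per-bit mutation, and survival of the top k; the fitness function (count of 1 bits), rates, and sizes are all illustrative assumptions.

    import random

    def genetic_algorithm(fitness, length=20, k=30, generations=100, mutation_rate=0.02):
        # Population of k bit strings; fitness-proportionate parent selection,
        # single-point crossover, per-bit mutation, keep the k fittest individuals.
        pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(k)]
        for _ in range(generations):
            children = []
            for _ in range(k):
                mom, dad = random.choices(pop, weights=[fitness(p) + 1 for p in pop], k=2)
                point = random.randrange(1, length)               # crossover point
                child = mom[:point] + dad[point:]
                child = [b ^ 1 if random.random() < mutation_rate else b for b in child]
                children.append(child)
            pop = sorted(pop + children, key=fitness, reverse=True)[:k]
        return max(pop, key=fitness)

    best = genetic_algorithm(sum)   # illustrative fitness: number of 1 bits
    print(sum(best), best)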

  17. Game Playing in AI • Long history • Games are well-defined problems usually considered to require intelligence to play well • Introduces uncertainty (can’t know opponent’s moves in advance)

  18. Games and Search • Search spaces can be very large: • Chess – Branching factor: 35 – Depth: 50 moves per player – Search tree: 35^100 nodes (~10^40 legal positions) • Humans don’t seem to do much explicit search • Good test domain for search methods and pruning methods

  19. Game Playing Problem • Instance of the general search problem • States where the game has ended are terminal states • A utility function (or payoff function) determines the value of the terminal states • In two-player games, MAX tries to maximize the payoff and MIN tries to minimize the payoff • In the search tree, the first layer is a move by MAX, the next a move by MIN, etc. • Each layer is called a ply

  20. Minimax Algorithm • Method for determining the optimal move • Generate the entire search tree • Compute the utility of each node moving upward in the tree as follows: – At each MAX node, pick the move with maximum utility – At each MIN node, pick the move with minimum utility (assume opponent plays optimally) – At the root, the optimal move is determined

  21. Recursive Minimax Algorithm

      function Minimax-Decision(game) returns an operator
          for each op in Operators[game] do
              Value[op] <- Minimax-Value(Apply(op, game), game)
          end
          return the op with the highest Value[op]

      function Minimax-Value(state, game) returns a utility value
          if Terminal-Test[game](state) then return Utility[game](state)
          else if MAX is to move in state then
              return the highest Minimax-Value of Successors(state)
          else
              return the lowest Minimax-Value of Successors(state)
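
The same algorithm in runnable Python over an explicit game tree; the tree dictionary, utility table, and is_max flag are stand-ins for Successors, Terminal-Test, and Utility, chosen for illustration.

    def minimax_value(state, tree, utility, is_max):
        # Leaves return their utility; MAX nodes take the max of their
        # children's values, MIN nodes the min.
        children = tree.get(state, [])
        if not children:                                   # terminal state
            return utility[state]
        values = [minimax_value(c, tree, utility, not is_max) for c in children]
        return max(values) if is_max else min(values)

    def minimax_decision(state, tree, utility):
        # MAX to move: pick the successor with the highest minimax value.
        return max(tree[state], key=lambda c: minimax_value(c, tree, utility, False))

    # Tiny two-ply game: MAX moves from "root", then MIN picks a leaf.
    tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
    utility = {"a1": 3, "a2": 12, "b1": 2, "b2": 8}
    print(minimax_decision("root", tree, utility))   # "a": MIN then yields 3 rather than 2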

  22. Making Imperfect Decisions • Generating the complete game tree is intractable for most games • Alternative: – Cut off search – Apply some heuristic evaluation function to determine the quality of the nodes at the cutoff

  23. Evaluation Functions • Evaluation function needs to – Agree with the utility function on terminal states – Be quick to evaluate – Accurately reflect chances of winning • Example: material value of chess pieces • Evaluation functions are usually weighted linear functions
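
A sketch of such a weighted linear function using a single feature family, material balance, with the conventional piece values as weights; the piece counts in the example position are hypothetical.

    # Conventional chess material values; a weighted linear evaluation has the
    # form Eval(s) = w1*f1(s) + w2*f2(s) + ... over features of the position.
    PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

    def material_eval(my_counts, their_counts, weights=PIECE_VALUES):
        # Material balance from the point of view of the side to move.
        return sum(w * (my_counts.get(p, 0) - their_counts.get(p, 0))
                   for p, w in weights.items())

    # Hypothetical position: up a knight, down a pawn.
    print(material_eval({"pawn": 6, "knight": 2, "rook": 2, "queen": 1},
                        {"pawn": 7, "knight": 1, "rook": 2, "queen": 1}))   # 2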

  24. Cutting Off Search • Search to uniform depth • Use iterative deepening to search as deep as time allows (anytime algorithm) • Issues – quiescence needed – horizon problem

  25. Alpha-Beta Pruning • Concept: Avoid looking at subtrees that won’t affect the outcome • Once a subtree is known to be worse than the current best option, don’t consider it further

  26. General Principle • If a node has value n, but the player considering moving to that node has a better choice either at the node’s parent or at some higher node in the tree, that node will never be chosen. • Keep track of MAX’s best choice (α) and MIN’s best choice (β) and prune any subtree as soon as it is known to be worse than the current α or β value

  27. function Max-Value(state, game, α, β) returns the minimax value of state
          if Cutoff-Test(state) then return Eval(state)
          for each s in Successors(state) do
              α <- Max(α, Min-Value(s, game, α, β))
              if α >= β then return β
          end
          return α

      function Min-Value(state, game, α, β) returns the minimax value of state
          if Cutoff-Test(state) then return Eval(state)
          for each s in Successors(state) do
              β <- Min(β, Max-Value(s, game, α, β))
              if β <= α then return α
          end
          return β
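
The same pruning in runnable Python over the small game tree used in the minimax sketch, with a depth cutoff and an Eval-style lookup standing in for Cutoff-Test and Eval; the tree and names are illustrative, not from the slides.

    def max_value(state, tree, evalf, alpha, beta, depth):
        children = tree.get(state, [])
        if depth == 0 or not children:                     # cutoff test
            return evalf(state)
        for s in children:
            alpha = max(alpha, min_value(s, tree, evalf, alpha, beta, depth - 1))
            if alpha >= beta:
                return beta                                # prune: MIN will avoid this branch
        return alpha

    def min_value(state, tree, evalf, alpha, beta, depth):
        children = tree.get(state, [])
        if depth == 0 or not children:                     # cutoff test
            return evalf(state)
        for s in children:
            beta = min(beta, max_value(s, tree, evalf, alpha, beta, depth - 1))
            if beta <= alpha:
                return alpha                               # prune: MAX will avoid this branch
        return beta

    # Same tiny game tree as the minimax sketch; Eval just looks up leaf values.
    tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
    leaves = {"a1": 3, "a2": 12, "b1": 2, "b2": 8}
    print(max_value("root", tree, leaves.get, float("-inf"), float("inf"), 2))   # 3; b2 is never examined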

  28. Effectiveness • Depends on the order in which siblings are considered • Optimal ordering would reduce the nodes considered from O(b^d) to O(b^(d/2)), but that requires perfect knowledge • Simple ordering heuristics can help quite a bit
