Local Search for a Globally Optimal Solution
Russell and Norvig, Chapter 4
Limitations of hill climbing
- Can we find a globally optimal solution?
- Hill climbing cannot guarantee one. A good alternative is an approach that finds the global optimum "with high probability".
- This requires a more thorough exploration of the state space.
Limitations of hill climbing
- An algorithm that never makes downhill moves will get stuck in local maxima.
Simulated Annealing
- Physical systems are good at finding minimum-energy configurations: a physical system cooled slowly toward absolute zero will settle into its minimum-energy configuration.
- This process is called annealing.
- Simulated annealing mimics it with a probabilistic process.
Simulated Annealing
- Simulated annealing [Kirkpatrick, Gelatt, Vecchi, 1983].
  - T large ⇒ the probability of accepting an uphill move is large.
  - T small ⇒ uphill moves are almost never accepted.
  - Idea: turn a knob to control T.
  - Cooling schedule: T = T(i) at iteration i.
- Physical analogy.
  - When we take a molten solid and freeze it very abruptly, we do not expect to get a perfect crystal.
  - Annealing: cool the material gradually from a high temperature, allowing it to reach equilibrium at a succession of intermediate lower temperatures.
  - The algorithm will find an optimal solution with high probability if it uses a sufficiently slow cooling schedule.
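To make the role of T concrete, here is a minimal sketch of the acceptance probability for an uphill (cost-increasing) move, using the exp(-ΔE/T) rule that the simulated annealing pseudocode in these slides relies on. The geometric schedule and its parameters (t0, alpha) are assumptions chosen for illustration; the slides do not prescribe a particular cooling schedule.

```python
import math


def acceptance_probability(delta_e: float, temperature: float) -> float:
    """Probability of accepting a move that raises the cost by delta_e."""
    if delta_e <= 0:
        return 1.0  # moves that do not raise the cost are always accepted
    return math.exp(-delta_e / temperature)


def geometric_schedule(t0: float, alpha: float, i: int) -> float:
    """Hypothetical cooling schedule T(i) = t0 * alpha**i, with 0 < alpha < 1."""
    return t0 * alpha ** i


# Same uphill move (cost +1) at decreasing temperatures.
for i in (0, 50, 100):
    T = geometric_schedule(10.0, 0.95, i)
    print(i, T, acceptance_probability(1.0, T))
```

At high T the uphill move is accepted almost always; as T shrinks the same move is accepted with vanishing probability.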
Example Image from http://en.wikipedia.org/wiki/Simulated_annealing
Simulated annealing for TSP
Simulated Annealing

function SIMULATED-ANNEALING(problem, schedule) returns a solution state
  inputs: problem, a problem
          schedule, a mapping from time to temperature

  current ← MAKE-NODE(INITIAL-STATE[problem])
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ∆E ← current.VALUE - next.VALUE
    if ∆E > 0 then current ← next
    else current ← next with probability e^(∆E/T)

Here VALUE is a cost to be minimized, so ∆E > 0 means that next is better than current. The temperature controls the probability of accepting non-improving steps.
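The pseudocode can be sketched in Python roughly as follows. This is an illustrative version, not a reference implementation: the toy problem (minimizing (x - 3)^2 over the integers -10..10), the helper names `value`, `random_successor`, and `linear_schedule`, and the schedule parameters are all assumptions introduced here.

```python
import math
import random


def simulated_annealing(initial, value, random_successor, schedule, rng=random):
    """Minimize value(state), following the pseudocode with dE = current - next."""
    current = initial
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:  # the pseudocode tests T = 0; <= 0 also guards against overshoot
            return current
        nxt = random_successor(current, rng)
        delta_e = value(current) - value(nxt)
        # Always accept improvements; accept other moves with probability e^(dE/T).
        if delta_e > 0 or rng.random() < math.exp(delta_e / T):
            current = nxt
        t += 1


# Hypothetical toy problem: minimize (x - 3)^2 over the integers -10..10.
def value(x):
    return (x - 3) ** 2


def random_successor(x, rng):
    """Step one unit left or right, clamped to the allowed range."""
    return min(10, max(-10, x + rng.choice((-1, 1))))


def linear_schedule(t, t_max=5000, t0=10.0):
    """Assumed schedule: falls linearly from t0 and reaches 0 at t = t_max."""
    return t0 * max(0.0, 1.0 - t / t_max)


result = simulated_annealing(-10, value, random_successor, linear_schedule)
print(result, value(result))
```

Because the search is randomized, the returned state varies between runs; with a slow enough schedule it is usually at or near x = 3.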
Properties of Simulated Annealing
- As the number of moves at a given temperature goes to infinity, the probability of a state becomes proportional to exp(-E/T) (the Boltzmann distribution).
- If the temperature is lowered slowly enough, the global optimum will be found with high probability. There is a lot of research into what makes a good cooling schedule.
- Widely used in a variety of applications (VLSI layout, airline scheduling, etc.).
Beam Search
- Variant of hill climbing:
  - Initially: k random states.
  - Next: determine all successors of the k states.
  - If any successor is optimal → done.
  - Else select the k best successors and repeat.
- Major difference from random-restart search:
  - Information is shared among the k search threads.
- Can suffer from lack of diversity (somewhat addressed by stochastic beam search, which chooses k successors at random with probability increasing in their value).
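The four steps above can be sketched as follows. This is a minimal illustration under assumptions: the function signature, the iteration cap `max_iters`, and the toy problem (integers with successors x - 1 and x + 1, maximizing -(x - 3)^2) are all made up for the example.

```python
import heapq
import random


def local_beam_search(k, random_state, successors, value, is_goal,
                      max_iters=1000, rng=random):
    """Keep the k best successors of the current k states until a goal appears."""
    states = [random_state(rng) for _ in range(k)]
    for _ in range(max_iters):
        # Pool the successors of all k states: this is where information is shared.
        candidates = [s for state in states for s in successors(state)]
        for s in candidates:
            if is_goal(s):
                return s
        if not candidates:
            break
        states = heapq.nlargest(k, candidates, key=value)
    return max(states, key=value)


# Hypothetical toy problem: reach x = 3 by maximizing -(x - 3)^2.
found = local_beam_search(
    k=3,
    random_state=lambda rng: rng.randint(-20, 20),
    successors=lambda x: [x - 1, x + 1],
    value=lambda x: -(x - 3) ** 2,
    is_goal=lambda x: x == 3,
)
print(found)
```

In this toy landscape the k best successors quickly collapse onto the same few states, which is the lack of diversity the slide warns about.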
Genetic algorithms
- Keep a population of solutions that undergo recombination and mutation.
Genetic algorithms
The state is the genetic material (the chromosome) that makes up an individual.

(a) Initial population   (b) Fitness function   (c) Selection   (d) Crossover   (e) Mutation
24748552                 24   31%               32752411        32748552        32748152
32752411                 23   29%               24748552        24752411        24752411
24415124                 20   26%               32752411        32752124        32252124
32543213                 11   14%               24415124        24415411        24415417

Note: crossover should happen only with a certain probability: if an individual is already highly fit, it is not a good idea to change it too much.
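The percentages in the example above match fitness-proportional selection: each individual's fitness divided by the population total (24 + 23 + 20 + 11 = 78). The sketch below reproduces them and draws parents accordingly; the function names are assumptions for illustration, and `random.choices` is used as one convenient way to sample with weights.

```python
import random

population = ["24748552", "32752411", "24415124", "32543213"]
fitness = {"24748552": 24, "32752411": 23, "24415124": 20, "32543213": 11}


def selection_probabilities(population, fitness):
    """Each individual's share of the total fitness."""
    total = sum(fitness[ind] for ind in population)
    return [fitness[ind] / total for ind in population]


def select_parents(population, fitness, n, rng=random):
    """Draw n parents with probability proportional to fitness (with replacement)."""
    weights = [fitness[ind] for ind in population]
    return rng.choices(population, weights=weights, k=n)


percentages = [round(100 * p) for p in selection_probabilities(population, fitness)]
print(percentages)
print(select_parents(population, fitness, 4))
```

Sampling is with replacement, so a fit individual can be selected more than once and a weak one not at all.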
Crossover (also known as recombination)
- One-point crossover
- Two-point crossover
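A minimal sketch of both operators on string chromosomes follows. The function names and the convention that cut points are slice indices are assumptions made for this example.

```python
def one_point_crossover(a, b, point):
    """Swap the tails of a and b after a single cut point."""
    return a[:point] + b[point:], b[:point] + a[point:]


def two_point_crossover(a, b, lo, hi):
    """Swap the middle segments of a and b between two cut points."""
    return a[:lo] + b[lo:hi] + a[hi:], b[:lo] + a[lo:hi] + b[hi:]


# Chromosomes taken from the genetic-algorithm example in these slides.
print(one_point_crossover("32752411", "24748552", 3))
print(two_point_crossover("AAAAAAAA", "BBBBBBBB", 2, 5))
```

The first call reproduces the first pair of children in the example population (32748552 and 24752411).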
Genetic algorithms

function GENETIC_ALGORITHM(population, FITNESS-FN) returns an individual
  inputs: population, a set of individuals
          FITNESS-FN, a function quantifying the quality of an individual

  repeat
    new_population ← empty set
    for i = 1 to SIZE(population) do
      x ← RANDOM_SELECTION(population, FITNESS-FN)
      y ← RANDOM_SELECTION(population, FITNESS-FN)
      child ← REPRODUCE(x, y)
      MUTATE(child)
      add child to new_population
    population ← new_population
  until some individual is fit enough or enough time has elapsed
  return the best individual
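The loop can be sketched in Python as below. Everything specific here is an assumption for illustration: the toy problem (evolving a 12-bit tuple toward all ones), the `fit_enough` and `max_generations` stopping parameters, the mutation rate, and the +1 in the fitness function, which is only there so that every selection weight stays positive.

```python
import random


def genetic_algorithm(population, fitness_fn, reproduce, mutate, fit_enough,
                      max_generations=200, rng=random):
    """Evolve the population until an individual is fit enough or time runs out."""
    for _ in range(max_generations):
        best = max(population, key=fitness_fn)
        if fit_enough(best):
            return best
        weights = [fitness_fn(ind) for ind in population]  # must be positive
        new_population = []
        for _ in range(len(population)):
            # RANDOM_SELECTION: fitness-proportional choice of two parents.
            x, y = rng.choices(population, weights=weights, k=2)
            new_population.append(mutate(reproduce(x, y, rng), rng))
        population = new_population
    return max(population, key=fitness_fn)


# Hypothetical toy problem: evolve a 12-bit tuple toward all ones.
N_BITS = 12


def fitness(bits):
    return 1 + sum(bits)  # the +1 keeps every selection weight positive


def reproduce(x, y, rng):
    cut = rng.randrange(1, len(x))  # one-point crossover
    return x[:cut] + y[cut:]


def mutate(bits, rng, rate=0.05):
    """Flip each bit independently with a small probability."""
    return tuple(b ^ 1 if rng.random() < rate else b for b in bits)


rng = random.Random(0)
start = [tuple(rng.randint(0, 1) for _ in range(N_BITS)) for _ in range(20)]
best = genetic_algorithm(start, fitness, reproduce, mutate,
                         fit_enough=lambda ind: sum(ind) == N_BITS, rng=rng)
print(best, fitness(best))
```

Unlike the pseudocode, this sketch checks the stopping condition at the top of each generation, which only changes when the test happens, not the overall scheme.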
Solving TSP
- Represent a tour as a permutation (i_1, …, i_n) of {1, 2, …, n}.
- Fitness of a solution: the negative of the cost of the tour.
- Initialization: a random set of permutations.
- Need to define crossover and mutation operations.
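A small sketch of this representation follows. It assumes a symmetric distance matrix `dist`, a closed tour that returns to its starting city, and cities numbered from 0 rather than 1 for convenient indexing; the function names are illustrative.

```python
import random


def tour_cost(tour, dist):
    """Total length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def fitness(tour, dist):
    """Fitness is the negative of the tour cost, so shorter tours are fitter."""
    return -tour_cost(tour, dist)


def random_population(n_cities, size, rng=random):
    """A population of random permutations of the cities 0..n_cities-1."""
    return [rng.sample(range(n_cities), n_cities) for _ in range(size)]


# Hypothetical instance: four cities on the corners of a unit square,
# with distances measured along the sides.
dist = [[0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0]]
print(fitness([0, 1, 2, 3], dist), fitness([0, 2, 1, 3], dist))
print(random_population(4, 3))
```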
Crossover
- Order crossover: choose a subsequence of a tour from one parent and preserve the relative order of the cities from the other parent.

Example:
  p1 = (1 2 3 | 5 4 6 7 | 8 9)
  p2 = (4 5 2 | 1 8 7 6 | 9 3)
  c1 = (x x x | 5 4 6 7 | x x)
  c2 = (x x x | 1 8 7 6 | x x)

The tour in p2, starting from its second cut point, is 9 → 3 → 4 → 5 → 2 → 1 → 8 → 7 → 6.
Remove the cities already in c1, obtaining the partial tour 9 → 3 → 2 → 1 → 8.
Insert this partial tour after the second cut point of c1, wrapping around to the front, resulting in
  c1 = (2 1 8 | 5 4 6 7 | 9 3).
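The procedure in the example can be sketched as below. The function name and the use of slice indices for the two cut points are assumptions; the logic follows the three steps described in the example.

```python
def order_crossover(p1, p2, cut1, cut2):
    """Keep p1[cut1:cut2] in place and fill the rest in p2's relative order."""
    n = len(p1)
    segment = p1[cut1:cut2]
    kept = set(segment)
    # Read p2 starting at its second cut point, wrapping around,
    # and drop the cities already copied from p1.
    rest = [p2[(cut2 + i) % n] for i in range(n)]
    rest = [city for city in rest if city not in kept]
    child = [None] * n
    child[cut1:cut2] = segment
    # Place the remaining cities after the second cut point, wrapping around.
    for offset, city in enumerate(rest):
        child[(cut2 + offset) % n] = city
    return child


p1 = [1, 2, 3, 5, 4, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
c1 = order_crossover(p1, p2, 3, 7)
print(c1)
```

Calling the function with the parents swapped gives the second child from the same cut points.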
Crossover (2)
Partially Mapped (PMX) crossover:
- Choose a subsequence of a tour from one parent and preserve the order and position of as many cities as possible from the other parent.
PMX crossover
  p1 = (1 2 3 | 4 5 6 7 | 8 9)
  p2 = (4 5 2 | 1 8 7 6 | 9 3)

Swap the segments between the cut points:
  c1 = (x x x | 1 8 7 6 | x x)
  c2 = (x x x | 4 5 6 7 | x x)

The swap defines a mapping: 1 ↔ 4, 8 ↔ 5, 7 ↔ 6, 6 ↔ 7.

The easy ones (cities copied from the original parent without conflict):
  c1 = (x 2 3 | 1 8 7 6 | x 9)
  c2 = (x x 2 | 4 5 6 7 | 9 3)

For the rest, use the mapping:
  c1 = (4 2 3 | 1 8 7 6 | 5 9)
  c2 = (1 8 2 | 4 5 6 7 | 9 3).
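One way to code the PMX steps is sketched below. The function name, the argument order (the child keeps positions from its first argument and takes the segment from its second), and the slice-index cut points are assumptions for this example.

```python
def pmx(p1, p2, cut1, cut2):
    """Child takes p2's segment between the cuts and p1's cities elsewhere."""
    n = len(p1)
    child = [None] * n
    child[cut1:cut2] = p2[cut1:cut2]
    # Map each city in the copied segment to the city it displaced in p1.
    mapping = {p2[i]: p1[i] for i in range(cut1, cut2)}
    for i in list(range(cut1)) + list(range(cut2, n)):
        city = p1[i]
        # A city already present in the segment is replaced via the mapping,
        # repeatedly if necessary, until no conflict remains.
        while city in mapping:
            city = mapping[city]
        child[i] = city
    return child


p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
c1 = pmx(p1, p2, 3, 7)
c2 = pmx(p2, p1, 3, 7)
print(c1)
print(c2)
```

Cities outside the segment that cause no conflict keep both their order and their position, which is the property PMX is designed to preserve.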
Mutation
Can use a 2-opt operation: select two points along the permutation, cut it at these points, and re-insert the reversed string.
Example: (1 2 | 3 4 5 6 | 7 8 9) → (1 2 | 6 5 4 3 | 7 8 9)
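A minimal sketch of this mutation, assuming tours are Python lists and cut points are slice indices; when no cut points are given, two distinct ones are drawn at random.

```python
import random


def two_opt_mutation(tour, i=None, j=None, rng=random):
    """Reverse the segment tour[i:j]; pick random cut points if none are given."""
    if i is None or j is None:
        i, j = sorted(rng.sample(range(len(tour) + 1), 2))
    return tour[:i] + tour[i:j][::-1] + tour[j:]


print(two_opt_mutation([1, 2, 3, 4, 5, 6, 7, 8, 9], 2, 6))
print(two_opt_mutation([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

The result is always a valid permutation, so no repair step is needed after mutation.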
The knapsack problem
Local search: summary
- Why I like local search algorithms:
  - Easy to implement
  - Widely applicable
  - Provide good results
Online search
- So far we have assumed deterministic actions and fully known environments.
  - This permits off-line search.
- Consider a new problem:
  - A robot is placed in the middle of a maze.
  - The task is to find the exit.
  - Actions are deterministic but the environment is unknown.
Difference from offline agents: an online agent can only expand the node it is physically in. What search strategies are applicable?
Online agents
- Because an online agent can only expand the node it physically occupies, it needs to work locally: online DFS, IDS.
- This is possible only when actions are reversible.
- Heuristic search: LRTA*
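The idea of working locally with reversible actions can be illustrated with a small depth-first exploration of a maze. This is a simplified sketch, not the textbook's online DFS agent: the grid maze, the `open_neighbors` percept function, and the fixed order in which moves are tried are all assumptions. The agent only ever queries the cell it currently occupies, and it backtracks by physically reversing its last move.

```python
def online_dfs(start, is_exit, open_neighbors):
    """Return every cell the agent physically visits, in order, or None."""
    walked = [start]   # full physical walk, including backtracking steps
    route = [start]    # current route from the start to the agent's position
    visited = {start}
    while route:
        here = route[-1]
        if is_exit(here):
            return walked
        # The agent can only expand the cell it is physically in.
        unexplored = [c for c in open_neighbors(here) if c not in visited]
        if unexplored:
            nxt = unexplored[0]
            visited.add(nxt)
            route.append(nxt)
            walked.append(nxt)
        else:
            route.pop()  # dead end: reverse the last move
            if route:
                walked.append(route[-1])
    return None


# Hypothetical maze: S = start, E = exit, # = wall, . = free cell.
MAZE = ["S..",
        "#.#",
        "#.E"]


def open_neighbors(cell):
    r, c = cell
    for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):  # right, down, left, up
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] != "#":
            yield (nr, nc)


walk = online_dfs((0, 0), lambda cell: MAZE[cell[0]][cell[1]] == "E", open_neighbors)
print(walk)
```

In this maze the agent first walks into the dead end at the top right, steps back, and then finds the exit; the backtracking step is only possible because moves are reversible.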