Stochastic Simulation: Simulated annealing

Bo Friis Nielsen
Institute of Mathematical Modelling
Technical University of Denmark
2800 Kgs. Lyngby, Denmark
Email: bfni@dtu.dk

A general optimisation problem


  5. Optimisation problem - probability distribution
     We introduce a probability distribution over $S$:
     $$P_T(x) = \frac{e^{-f(x)/T}}{\sum_{y \in S} e^{-f(y)/T}} = \frac{e^{-f(x)/T}}{|M|\, e^{-f^\star/T} + \sum_{y \in S \setminus M} e^{-f(y)/T}} = \frac{e^{(f^\star - f(x))/T}}{|M| + \sum_{y \in S \setminus M} e^{(f^\star - f(y))/T}}$$
     (reading $f^\star$ as the minimum value of $f$ over $S$ and $M$ as the set of states attaining it)
     • We have a probability function consisting of an “easy” to calculate expression multiplied by a difficult to calculate constant
     • For fixed $T$ we can sample; states $x$ with low “energy” (low values of $f(x)$) will be more frequent/likely
     • As $T \to 0$ the distribution will degenerate to states with minimum energy
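The behaviour of $P_T$ as $T \to 0$ can be checked numerically on a toy example. The sketch below is only an illustration: the five objective values are made up, and the function name `boltzmann` is hypothetical. It uses the last form of the formula, with weights $e^{(f^\star - f(x))/T}$, which also avoids numerical underflow at small $T$.

```python
import math

def boltzmann(f_values, T):
    """Return P_T(x) for every state, given the objective values f(x)."""
    f_star = min(f_values)
    # Weights e^{(f* - f(x))/T}: equal to 1 on minimisers, below 1 elsewhere.
    weights = [math.exp((f_star - f) / T) for f in f_values]
    total = sum(weights)  # |M| plus the sum over non-minimal states
    return [w / total for w in weights]

# Made-up objective values on a 5-state space; states 1 and 3 are both minimal.
f_values = [3.0, 1.0, 2.0, 1.0, 4.0]

for T in (10.0, 1.0, 0.1, 0.01):
    probs = boltzmann(f_values, T)
    print(f"T = {T:5.2f}:", [round(p, 4) for p in probs])
```

As the temperature drops, the printed probabilities concentrate on the two minimising states, each approaching $1/|M| = 1/2$.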

  17. Simulated annealing
      • Stochastic algorithm for optimisation
      • Large scale (typically discrete) problems
      • Attempts to find the global optimum in the presence of multiple local optima: $\min_x f(x)$
      • One among many stochastic optimisation methods; a metaheuristic
      • Simulated annealing was one of the first, inspired by Metropolis-Hastings (Kirkpatrick et al., Science, 1983)
      • Alternatives: stochastic gradient and several others
      DTU 02443 – lecture 9

  28. Physical inspiration (with apologies)
      Steel and other materials can exist in several crystalline structures. One, the ground state, has the lowest energy. The material may be “caught” in other states which are only locally stable. This is likely to happen when welding, machining, etc. By heating the material and slowly cooling it, we ensure that the material ends in the ground state. This process is called annealing.

  41. P.d.f. of the state at fixed temperature
      Use $X \in S$ to denote the state of the system (e.g., positions of atoms). Let $U(x)$ denote the energy of state $x \in S$. According to statistical physics, if the temperature is $T$, the p.d.f. of $X$ is the canonical distribution
      $$f(x, T) = c_T \cdot \exp\left(\frac{-U(x)}{T}\right)$$
      So states with low $U$ are more probable, in particular at low $T$. Note that the normalization constant $c_T$ is unknown; it can be found by integration, but our algorithms will not require it.
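To make the role of $c_T$ concrete, the following sketch approximates it by numerical integration on the interval $[0, 1]$. The potential `U` is a hypothetical two-well function chosen for illustration, not the one used in the lecture, and the helper names are assumptions. The final lines also show that a ratio of densities does not depend on $c_T$, which is why the algorithms can do without it.

```python
import math

def U(x):
    # Hypothetical potential on [0, 1] with wells at x = 0.25 and x = 0.75.
    return math.cos(4 * math.pi * x) * (0.5 + 0.5 * x)

def normalising_constant(T, n=2000):
    """Approximate c_T = 1 / integral_0^1 exp(-U(x)/T) dx with the midpoint rule."""
    h = 1.0 / n
    integral = sum(math.exp(-U((i + 0.5) * h) / T) for i in range(n)) * h
    return 1.0 / integral

def pdf(x, T, c_T):
    # Canonical density f(x, T) = c_T * exp(-U(x) / T).
    return c_T * math.exp(-U(x) / T)

for T in (0.2, 1.0, 5.0):
    c_T = normalising_constant(T)
    print(f"T = {T}: c_T = {c_T:.4f}, f(0.25, T) = {pdf(0.25, T, c_T):.4f}, "
          f"f(0.75, T) = {pdf(0.75, T, c_T):.4f}")

# The ratio of two density values equals exp(-(U(y) - U(x)) / T); c_T cancels.
T = 0.2
c_T = normalising_constant(T)
ratio = pdf(0.75, T, c_T) / pdf(0.25, T, c_T)
print("ratio:", ratio, "closed form:", math.exp(-(U(0.75) - U(0.25)) / T))
```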

  42. Example energy potential
      [Figure: potential U(x), ranging from −1.0 to 1.0, against state x from 0.0 to 1.0.]

  43. Corresponding p.d.f., for T = 0.2, 1, 5
      [Figure: p.d.f. (axis from 0 to 10) against state x from 0.0 to 1.0.]

  57. An algorithm for Simulated Annealing
      Let the temperature be a decreasing function of time, or of the iteration number $k$ (written $i$ in the indices below). At each time step, update the state according to the random walk Metropolis-Hastings algorithm for MCMC, where the target p.d.f. is $f(x, T_i)$. I.e., permute the state $X_i$ randomly to generate a candidate $Y_i$. If the candidate has lower energy than the old state, accept. Otherwise, accept only with probability
      $$\exp\left(-\frac{U(Y_i) - U(X_i)}{T_i}\right)$$
      for a symmetric proposal distribution (to keep the probabilistic interpretation).
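The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's implementation: the potential `U`, the cooling schedule $T_i = T_0/\sqrt{1+i}$, the uniform random-walk proposal of half-width `step`, and the state space $[0, 1]$ are all chosen here for the example.

```python
import math
import random

def U(x):
    # Hypothetical potential on [0, 1] with wells at x = 0.25 and x = 0.75.
    return math.cos(4 * math.pi * x) * (0.5 + 0.5 * x)

def temperature(i, T0=1.0):
    # Assumed cooling schedule; any decreasing function of i fits the description.
    return T0 / math.sqrt(1 + i)

def accept_probability(u_old, u_new, T):
    # Metropolis rule for a symmetric proposal: a downhill move is always
    # accepted, an uphill move with probability exp(-(U(Y) - U(X)) / T).
    if u_new <= u_old:
        return 1.0
    return math.exp(-(u_new - u_old) / T)

def simulated_annealing(x0, n_steps=10_000, step=0.1, seed=0):
    rng = random.Random(seed)
    x = x0
    path = [x]
    for i in range(n_steps):
        T = temperature(i)
        y = x + rng.uniform(-step, step)  # symmetric random-walk candidate
        # Candidates outside the state space [0, 1] are rejected outright.
        if 0.0 <= y <= 1.0 and rng.random() < accept_probability(U(x), U(y), T):
            x = y
        path.append(x)
    return path

path = simulated_annealing(x0=0.1)
print("final state:", path[-1], "energy:", U(path[-1]))
```

Early on, the high temperature lets the chain cross the barrier between the wells; as $T_i$ decreases, uphill moves become rare and the state settles into one of the wells. Which well it ends in depends on the schedule, the step size, and the random seed.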

  58. [Figure: state x (vertical axis, 0 to 1) shown against U(x) and against iteration index (0 to 10000).]

  63. Different issues
      • Try different schemes for lowering the temperature
      • Alternative initial solutions
      • Different candidate generation algorithms
      • Refine with local search
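As an illustration of the first point, here are three cooling schemes that could be compared. The slides do not name specific schedules; the three below (geometric, inverse square root, logarithmic) and their parameters are common choices picked for this sketch, with hypothetical function names.

```python
import math

def geometric(k, T0=1.0, alpha=0.999):
    # T_k = T0 * alpha^k, with 0 < alpha < 1 (assumed parameter).
    return T0 * alpha ** k

def inverse_sqrt(k, T0=1.0):
    # T_k = T0 / sqrt(1 + k).
    return T0 / math.sqrt(1 + k)

def logarithmic(k, T0=1.0):
    # T_k = T0 / log(e + k), a very slow cooling; the offset e gives T_0 = T0.
    return T0 / math.log(math.e + k)

for k in (0, 10, 100, 1000, 10_000):
    print(f"k = {k:6d}: geometric = {geometric(k):.5f}, "
          f"inverse sqrt = {inverse_sqrt(k):.5f}, logarithmic = {logarithmic(k):.5f}")
```

Each function maps the iteration number to a temperature, so any of them can be passed to an annealing loop in place of a fixed schedule.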

  65. Travelling salesman problem (TSP)
      A basic problem in combinatorial optimisation
