
Help me find it! Tim Blackwell Goldsmiths November 2010 Outline - PDF document



  1. Help me find it! Tim Blackwell, Goldsmiths, November 2010. Outline: 1. PSO from above 2. Focus, spread and stability 3. Bare Bones 4. Study of collapse in BB

  2. PSO from above. I've lost it. It should be here, or at least somewhere close to here. Can you help me? Could your friends help me as well? How do we share information, and what do we do with it? My current position, $x_i$. My best, $p_i$; my helpers' bests, $p_j$; informer neighbourhood $N_i$.

  3. PSO as second order stochastic difference equation
     for each particle $i = 1 \ldots N$
       for each dimension $d = 1 \ldots D$
         $x_{t+1,id} = -a_t x_{t,id} - b_t x_{t-1,id} + c_t(N_i)$
       end for
       $\vec{p}_{t+1} = \mathrm{BEST}(\vec{x}_{t+1}, \vec{p}_t)$
     end for
     Underlying assumption: BEST has some structure (nearer is better).
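The loop on this slide can be sketched in Python as follows. This is a minimal illustration, not the author's code: the objective `sphere`, the fully connected neighbourhood, and the noise-free `toy_coefficients` (which simply moves each coordinate halfway towards the best informer) are all assumptions made for the example.

```python
import random

def sphere(x):
    """Hypothetical objective: smaller is better, optimum at the origin."""
    return sum(v * v for v in x)

def toy_coefficients(g_d):
    """Toy, noise-free choice (assumed): a = -0.5, b = 0, c = 0.5 * g_d."""
    return -0.5, 0.0, 0.5 * g_d

def run_swarm(f=sphere, n_particles=5, dim=2, steps=30, seed=0):
    rng = random.Random(seed)
    x_prev = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
              for _ in range(n_particles)]
    x_curr = [row[:] for row in x_prev]   # x_{t-1} = x_t at the start
    best = [row[:] for row in x_curr]     # personal bests p_i
    initial_best = min(f(p) for p in best)
    for _ in range(steps):
        for i in range(n_particles):
            g = min(best, key=f)          # best informer; N_i is the whole swarm here
            x_next = []
            for d in range(dim):
                a, b, c = toy_coefficients(g[d])
                # second order difference equation from the slide
                x_next.append(-a * x_curr[i][d] - b * x_prev[i][d] + c)
            x_prev[i], x_curr[i] = x_curr[i], x_next
            if f(x_next) < f(best[i]):    # BEST: keep the better of x_{t+1} and p_t
                best[i] = x_next[:]
    return initial_best, min(f(p) for p in best)
```

Because BEST only ever replaces a personal best with a better position, the second returned value can never exceed the first.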

  4. Examples: Clerc-Kennedy. The Clerc-Kennedy formulation has become the de facto standard PSO:
     $a_t = -(1 + w) + \frac{1}{2}(\Phi_1 + \Phi_2)$
     $b_t = w$
     $c_t(p) = \frac{1}{2}(\Phi_1 p_1 + \Phi_2 p_2)$
     where $\Phi \sim U[0, \phi]$, $p_1 = p_i$, and $p_2$ is the best informer in $N_i$ (the same $\Phi_k$ appear in $a$ and $c$). It is written more conventionally, with velocity $v_t = x_t - x_{t-1}$, as
     $x_{t+1} = x_t + w v_t + \frac{\Phi_1}{2}(p_1 - x_t) + \frac{\Phi_2}{2}(p_2 - x_t)$.
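A quick numerical check that the $(a, b, c)$ form and the velocity form describe the same update, taking $v_t = x_t - x_{t-1}$. The function names are hypothetical, and the values $w = 0.7298$ and $\phi = 2.99$ in the usage note are illustrative choices, not taken from the slides.

```python
import random

def ck_coefficients(w, phi, p1, p2, rng):
    """Draw one set of Clerc-Kennedy coefficients for a single coordinate."""
    phi1 = rng.uniform(0.0, phi)
    phi2 = rng.uniform(0.0, phi)
    a = -(1.0 + w) + 0.5 * (phi1 + phi2)
    b = w
    c = 0.5 * (phi1 * p1 + phi2 * p2)
    return a, b, c, phi1, phi2

def step_difference_form(x_t, x_prev, a, b, c):
    """x_{t+1} = -a x_t - b x_{t-1} + c."""
    return -a * x_t - b * x_prev + c

def step_velocity_form(x_t, x_prev, w, phi1, phi2, p1, p2):
    """x_{t+1} = x_t + w v_t + (Phi_1/2)(p1 - x_t) + (Phi_2/2)(p2 - x_t)."""
    v_t = x_t - x_prev
    return x_t + w * v_t + 0.5 * phi1 * (p1 - x_t) + 0.5 * phi2 * (p2 - x_t)
```

Drawing the coefficients with `ck_coefficients(0.7298, 2.99, p1, p2, rng)` and feeding the same `phi1`, `phi2` to both step functions should give results that agree to floating-point rounding.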

  5. Examples: Discrete Recombinant. Peña's Discrete Recombinant PSO has an update rule:
     $a_t = -(1 + w) + \frac{1}{K} \sum_{k=1}^{K} \phi_k$
     $b_t = w$
     $c_t(p) = \frac{1}{K} \sum_{k=1}^{K} \phi_k \hat{P}_k(p)$
     where $\phi_k$ are real constants and $\hat{P}_k$ is a selection operator over $K$ informers $p$.

  6. Examples: Discrete Recombinant Model 3. Various other recombinant PSOs were studied by Bratton and Blackwell, including a reduced version known as Model 3,
     $a = -1 + \phi$
     $b = 0$
     $c = \phi\, U\{p_1, p_2\}$
     which is a first order SDE (i.e. a particle update without velocity). Denoting the $d$'th component of the recombinant informer as $r$ ($= p_{1d}$ or $p_{2d}$), Model 3 is simply written as
     $x_{t+1} = x_t + \phi (r - x_t)$.

  7. Examples: Bare Bones. Bare Bones PSO, originally formulated by Kennedy:
     $a_t = 0$
     $b_t = 0$
     $c_t(p) = N(\mu(p), \sigma^2(p))$
     where $N$ is the Normal distribution. In Kennedy's formulation, $N$ has mean $\mu = \frac{p_1 + p_2}{2}$ and variance $\sigma^2 = (p_1 - p_2)^2$.

  8. Focus
     $\langle x_{t+1} \rangle + \langle a_t \rangle \langle x_t \rangle + \langle b_t \rangle \langle x_{t-1} \rangle = \langle c_t \rangle$
     where the random variables $a, b, c$ are independent of $x$. The order-1 stability condition is found by solving the homogeneous equation for $x_t = \lambda^t$, $|\lambda| < 1$. The conditions for real and complex roots within the unit circle are
     $-1 - \langle b \rangle < \langle a \rangle < 1 + \langle b \rangle$ (real roots, $a^2 > 4b$)
     and
     $\frac{\langle a \rangle^2}{4} < \langle b \rangle < 1$ (complex roots, $a^2 < 4b$)
     with fixed point
     $\langle x \rangle = \frac{\langle c \rangle}{1 + \langle a \rangle + \langle b \rangle}$.
     $\langle x \rangle$ is the mean position generated by iterating the SDE; it is a focus of the search at fixed $p$.
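The order-1 condition and the fixed point can be checked numerically by solving the characteristic equation $\lambda^2 + \langle a \rangle \lambda + \langle b \rangle = 0$ directly. The sketch below uses hypothetical function names; the Clerc-Kennedy means in the usage note assume $\langle \Phi \rangle = \phi/2$ with illustrative values $w = 0.7298$, $\phi = 2.99$.

```python
import cmath

def order1_stable(mean_a, mean_b):
    """True when both roots of lambda^2 + <a> lambda + <b> = 0 lie inside the unit circle."""
    disc = cmath.sqrt(mean_a * mean_a - 4.0 * mean_b)
    roots = ((-mean_a + disc) / 2.0, (-mean_a - disc) / 2.0)
    return all(abs(r) < 1.0 for r in roots)

def focus(mean_a, mean_b, mean_c):
    """Fixed point <x> = <c> / (1 + <a> + <b>)."""
    return mean_c / (1.0 + mean_a + mean_b)

def ck_means(w, phi, p1, p2):
    """Means of a, b, c for Clerc-Kennedy, assuming <Phi_k> = phi / 2."""
    mean_a = -(1.0 + w) + phi / 2.0
    mean_b = w
    mean_c = (phi / 4.0) * (p1 + p2)
    return mean_a, mean_b, mean_c
```

For example, `focus(*ck_means(0.7298, 2.99, 0.0, 2.0))` lands on the midpoint of the two informers, and `order1_stable` applied to the first two means reports a stable focus.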

  9. Focus - examples. CK, DR:
     $\langle x \rangle = \frac{1}{K} \sum \langle P \rangle = \langle \bar{P} \rangle$,
     demonstrating that the search (at stagnation) focuses around the centroid of the neighbouring attractors $\bar{P}$ (CK) and around the expectation value of the centroid (DR). BB:
     $\langle x \rangle = \langle N \rangle = \mu$.

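  10. Spread. The variance in $x$ is obtained from $\langle \delta x^2 \rangle = \langle (x - \langle x \rangle)^2 \rangle$:
      $\langle \delta x^2 \rangle = \frac{\langle d^2 \rangle}{1 - \langle a^2 \rangle - \langle b^2 \rangle + \frac{2 \langle ab \rangle \langle a \rangle}{1 + \langle b \rangle}}$.
      This equation gives the standard deviation of the general PSO, when order-2 stable, in terms of averages over the random variables $a$, $b$ and $c$.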

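  11. Spread - examples. CK, DR:
      $\langle \delta x^2 \rangle = \gamma \langle \delta \hat{P}^2 \rangle$
      where
      $\gamma = \frac{\langle \Phi_j^2 \rangle}{C}$, $C = 2(1 - w) \langle \Phi \rangle - \langle \Phi^2 \rangle + \frac{2w}{1 + w} \langle \Phi \rangle^2$
      and $\delta \hat{P}_j = \hat{P}_j - \langle x \rangle$. $\langle \delta \hat{P}^2 \rangle$ is a measure of the spread of the informer group. BB:
      $\langle \delta x^2 \rangle = \sigma^2$.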

  12. Spread - examples - summary. The standard deviations for CK, DR and BB follow a common form,
      $\sqrt{\langle \delta x^2 \rangle} = \alpha |p_1 - p_2|$
      with
      $\alpha_{CK} = 1.042$
      $\alpha_{DR} = 0.612$
      $\alpha_{BB-Kennedy} = 1.0$
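The DR figure can be reproduced for the first order Model 3 update $x_{t+1} = x_t + \phi(r - x_t)$ with fixed informers. The closed form below is worked out here rather than quoted from the slides: for a first order recursion with $r$ chosen uniformly from $\{p_1, p_2\}$, the stationary variance is $\phi^2 \mathrm{Var}(r) / (1 - (1 - \phi)^2)$ and $\mathrm{Var}(r) = (p_1 - p_2)^2 / 4$. The value $\phi = 1.2$ is an assumption, chosen because it gives roughly 0.612; the slides do not state which $\phi$ was used.

```python
import math
import random

def model3_alpha(phi):
    """Stationary std of x in units of |p1 - p2| (derived here, see lead-in)."""
    return math.sqrt(phi / (4.0 * (2.0 - phi)))

def simulate_model3(phi, p1, p2, steps, seed=0, burn_in=1000):
    """Iterate x <- x + phi (r - x) with r drawn uniformly from {p1, p2}."""
    rng = random.Random(seed)
    x = p1
    samples = []
    for t in range(steps):
        r = p1 if rng.random() < 0.5 else p2
        x = x + phi * (r - x)
        if t >= burn_in:          # discard the initial transient
            samples.append(x)
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, math.sqrt(var)
```

Calling `simulate_model3(1.2, 0.0, 1.0, 60000)` should return a mean near the midpoint 0.5 and a standard deviation close to `model3_alpha(1.2)`.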

  13. Stability
      Order 1: $1 + \langle a \rangle + \langle b \rangle \neq 0$
      Order 2: $1 - \langle a^2 \rangle - \langle b^2 \rangle + \frac{2 \langle ab \rangle \langle a \rangle}{1 + \langle b \rangle} > 0$
      CK: $2K(1 - w^2) - \frac{7}{6}\phi + \frac{5}{6} w \phi \geq 0$
      DR (Model 3): $0 < \phi < 2$.
      BB: Since $a = b = 0$, stability is immediately satisfied.
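The order-2 expression can be evaluated directly from the moments of $a$ and $b$ and compared with the closed-form CK inequality. The sketch assumes $K = 2$ informers with independent $\Phi_1, \Phi_2 \sim U[0, \phi]$; under that assumption a short calculation (done here, not taken from the slides) shows the CK expression equals the order-2 margin multiplied by the positive factor $4(1 + w)/\phi$, so the two always share a sign. Function names are hypothetical.

```python
def order2_margin(mean_a, mean_a2, mean_b, mean_b2, mean_ab):
    """1 - <a^2> - <b^2> + 2 <ab> <a> / (1 + <b>); positive when order-2 stable."""
    return 1.0 - mean_a2 - mean_b2 + 2.0 * mean_ab * mean_a / (1.0 + mean_b)

def ck_moments(w, phi):
    """Moments of a and b for CK with K = 2 and Phi_k ~ U[0, phi] (assumed)."""
    mean_s = phi / 2.0            # mean of (Phi_1 + Phi_2) / 2
    var_s = phi * phi / 24.0      # two uniforms, each Var = phi^2 / 12, scaled by 1/4
    mean_a = -(1.0 + w) + mean_s
    mean_a2 = var_s + mean_a ** 2
    # b = w is constant, so <b^2> = w^2 and <ab> = w <a>
    return mean_a, mean_a2, w, w * w, w * mean_a

def ck_condition(w, phi, K=2):
    """Left-hand side of the CK inequality on the slide."""
    return 2.0 * K * (1.0 - w * w) - 7.0 / 6.0 * phi + 5.0 / 6.0 * w * phi

def model3_stable(phi):
    """Model 3 condition from the slide: 0 < phi < 2."""
    return 0.0 < phi < 2.0
```

With the illustrative values $w = 0.7298$, $\phi = 2.99$, both `ck_condition` and `order2_margin(*ck_moments(w, phi))` come out positive.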

  14. General Bare Bones. Bare Bones is the simplest PSO in the sense that $a = b = 0$. It is not a difference equation at all; unsuccessful trials $x$ are ignored. Kennedy: $\mu = \frac{p_1 + p_2}{2}$, $\sigma = |p_1 - p_2|$. Hidden parameter $\alpha$: $\sigma = \alpha |p_1 - p_2|$. In general, mean and informer separation can be chosen from the neighbourhood informers:
      $x = \mu(p) + \alpha\, \delta(p)\, N(0, 1)$.
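A minimal sketch of one Bare Bones trial per coordinate, using Kennedy's choice of mean and separation with the hidden parameter $\alpha$ exposed. The function names and the acceptance helper are hypothetical; `f` stands for whatever objective is being minimised.

```python
import random

def bare_bones_sample(p1, p2, alpha=1.0, rng=random):
    """Sample x_d = mu_d + alpha * |p1_d - p2_d| * N(0, 1) for each dimension d."""
    x = []
    for u, v in zip(p1, p2):
        mu = 0.5 * (u + v)
        sigma = alpha * abs(u - v)
        x.append(mu + sigma * rng.gauss(0.0, 1.0))
    return x

def bare_bones_update(p_i, p_informer, f, alpha=1.0, rng=random):
    """Keep the trial only if it improves on the personal best; otherwise ignore it."""
    trial = bare_bones_sample(p_i, p_informer, alpha, rng)
    return trial if f(trial) < f(p_i) else p_i
```

When the two informers coincide the separation is zero and the sample is exactly the shared position, which is the degenerate situation that the collapse analysis is concerned with.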

  15. A problem - collapse. First (DR) and second order (CK) PSOs have stability conditions that help us choose parameters $\phi$ and $w$. The Bare Bones swarm cannot become unstable, but it may collapse.

  16. Collapse, which is undesirable, is to be contrasted with convergence. In arbitrary precision arithmetic, convergence means that the swarm best informer, $p_g$, approaches, but does not reach, a limit point $x^*$. Suppose the swarm is stable and the best informer $g$ is approaching $x^*$. The dimensionless variable
      $\bar{\sigma} = \frac{\sigma}{|g - x^*|}$
      measures the standard deviation of the sampling distribution in units of the separation from the optimum.

  17. There are two scenarios.
      (1) $\bar{\sigma} \to 0$, with $\sigma \to 0$ faster than $g \to x^*$: the swarm collapses, and progress towards $x^*$ slows until the swarm stagnates at a finite distance from $x^*$.
      (2) $\bar{\sigma} \to \mathrm{const}$: the swarm converges on $x^*$. This is the most desirable scenario; without the constraints of numerical precision, $g$ will become as close to $x^*$ as we care to specify.
      A consideration of collapse must, unlike the stability analysis mentioned above, consider informer movement.

  18. Analysis of simple model with informer movement. [Figure: two possible configurations of the informers $g$ and $p$ relative to the optimum $O$.] Two possible configurations for a Bare Bones particle interacting with an effective particle. The effective particle represents the effects that $N - 1$ particles have on the single particle. The informers are placed at $g$ and $p$; either $p$ or $g$ can be regarded as the effective informer. The optimum is at $O$, and $g$, which is closer to $O$, is the better informer.

  19. $\langle g \rangle = g + \int_{-g}^{g} (x - g)\, \rho_{g,\sigma^2}\, dx = g - \frac{\sigma}{\sqrt{2\pi}} \left( 1 - e^{-2g^2/\sigma^2} \right)$.
      [Figure: $\langle g \rangle$ plotted against $\sigma$ from 0 to 10.] Expected value of $g$ after a single update from $g = 1$, plotted as a function of standard deviation $\sigma = \alpha |\delta|$. The minimum of $\langle g \rangle$ is 0.64 at $\sigma = 1.26$.
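The quoted minimum can be checked by evaluating the closed form for $\langle g \rangle$ on a grid of $\sigma$ values. This is a small sketch with hypothetical function names; the grid range and step are arbitrary choices.

```python
import math

def expected_g(g, sigma):
    """<g> = g - sigma / sqrt(2 pi) * (1 - exp(-2 g^2 / sigma^2))."""
    return g - sigma / math.sqrt(2.0 * math.pi) * (1.0 - math.exp(-2.0 * g * g / sigma ** 2))

def minimise_over_sigma(g=1.0, lo=0.01, hi=10.0, n=10000):
    """Plain grid search for the sigma that minimises <g>."""
    best_sigma, best_value = lo, expected_g(g, lo)
    for k in range(1, n + 1):
        sigma = lo + (hi - lo) * k / n
        value = expected_g(g, sigma)
        if value < best_value:
            best_sigma, best_value = sigma, value
    return best_sigma, best_value
```

`minimise_over_sigma()` should return a $\sigma$ close to 1.26 and a value of $\langle g \rangle$ close to 0.64, in line with the caption.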

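  20. $\langle \delta \rangle = \delta + \left( \int_{-|p|}^{-g} + \int_{g}^{|p|} \right) (x - p)\, \rho_{g,\sigma^2}\, dx - \int_{-g}^{g} (x - g)\, \rho_{g,\sigma^2}\, dx = \delta A + \sigma B$
      where
      $A = 1 - \int_{a}^{b} \rho_{0,1}\, dx - \int_{0}^{c} \rho_{0,1}\, dx$
      $B = \frac{1}{\sqrt{2\pi}} \left( e^{-\frac{1}{2} a^2} - e^{-\frac{1}{2} c^2} \right) + \frac{2}{\sqrt{2\pi}} \left( 1 - e^{-\frac{1}{2} b^2} \right)$
      and
      $a = \frac{-|p| - g}{\sigma}$, $b = \frac{-2g}{\sigma}$, $c = \frac{|p| - g}{\sigma}$.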

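  21. Rescaling:
      $\langle g \rangle_R = \frac{\langle g \rangle}{\langle g \rangle} = 1$
      $\langle \delta \rangle_R = \frac{\langle \delta \rangle}{\langle g \rangle}$.
      The rescaled system can be viewed as a dynamical system. Since $g = 1$, there is a single state $\delta_R \equiv \langle \delta \rangle_R(t)$ with dynamics
      $\delta_R(t + 1) = \frac{\langle \delta_R(t) \rangle}{\langle g \rangle} \equiv F(\delta_R(t))$.
      Self-consistent condition (fixed points of $F$): $\langle \delta \rangle_R = \delta$.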

  22. [Figure: $\langle \delta \rangle_R$ plotted against $\delta$, with curves labelled 0.1, 0.4, 0.65 and 1.] Expected value of $\delta$ after rescaling. The straight line is drawn at $\langle \delta \rangle_R = \delta$.
      $\alpha \geq 0.65$: There are two attractors, $\delta^*_+ > 0$ and $\delta^*_- < -2$, and a repeller at 0. States close to 0 are driven further away; the system resists collapse.
      $\alpha < 0.65$: There is an attractor at 0. States close to 0 are driven towards 0 and the system collapses.
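The threshold can be explored numerically by evaluating the rescaled map $F(\delta) = \langle \delta \rangle / \langle g \rangle$ close to the origin: a ratio $F(\delta)/\delta$ below 1 means states near 0 shrink (attractor), above 1 means they grow (repeller). The sketch below is built on several assumptions that the slides do not spell out: $g = 1$, $\delta = p - g > 0$, $\sigma = \alpha|\delta|$, and the expressions for $\langle g \rangle$, $A$ and $B$ as read from the preceding slides. All function names are hypothetical.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def norm_cdf(u):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def gauss_exp(u):
    """exp(-u^2 / 2); underflows quietly to 0.0 for large |u|."""
    return math.exp(-0.5 * u * u)

def rescaled_map(delta, alpha, g=1.0):
    """F(delta) = <delta> / <g>, assuming delta = p - g > 0."""
    p = g + delta
    sigma = alpha * abs(delta)
    a = (-abs(p) - g) / sigma
    b = -2.0 * g / sigma
    c = (abs(p) - g) / sigma
    A = 1.0 - (norm_cdf(b) - norm_cdf(a)) - (norm_cdf(c) - norm_cdf(0.0))
    B = ((gauss_exp(a) - gauss_exp(c)) + 2.0 * (1.0 - gauss_exp(b))) / SQRT_2PI
    mean_delta = delta * A + sigma * B
    mean_g = g - sigma / SQRT_2PI * (1.0 - math.exp(-2.0 * g * g / sigma ** 2))
    return mean_delta / mean_g

def slope_near_zero(alpha, delta=1e-3):
    """F(delta) / delta for a small positive delta."""
    return rescaled_map(delta, alpha) / delta

def critical_alpha(lo=0.5, hi=0.8, iterations=50):
    """Bisection for the alpha at which the slope near zero crosses 1."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if slope_near_zero(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under these assumptions `slope_near_zero(0.6)` falls below 1, `slope_near_zero(0.7)` exceeds 1, and `critical_alpha()` lands close to the 0.65 quoted on the slide.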

  23. Empirical test of $\alpha = 0.65$. [Figure: two panels of $\log_{10}(f(g))$ against evaluation count (0 to 60000); the first panel has curves labelled 1.0, 0.9, 0.8 and 0.7, the second has curves labelled 0.65, 0.6 and 0.5.]

  24. Conclusions - BB. In tests over the Yao et al. and CEC2005 benchmarks, global and local focus BB at $\alpha = 0.65$ performs as well as PSO-CK and DR-Model 3 at their standard parameter settings. All PSOs use information sharing to guide exploration. The focus and spread are determined by the dynamics, i.e. by the 2nd order SDE. How important are the dynamics? Second order SDEs with multiplicative stochasticity have bursts, but first and zero order SDEs do not (Blackwell and Bratton). Bursts enable exploration of the whole search space at any stage. A simple jump mechanism improves BB performance in some cases. The shape of the distribution itself may have a small effect, but since the distribution scales with the swarm, it does not allow distant exploration.
