COSC 3P71: Particle Swarm Optimization (PSO)
Brock University
Swarm Intelligence

Recall Swarm Intelligence in general, and Ant Colony Optimization in particular. What do we remember?
- Biologically inspired (yawn)
- etc.
Particle Swarm Optimization (PSO)

- Inspired by flocking birds and schooling fish
- Develops multiple solutions in parallel
- Provides elements of both independent exploration and social cooperation/collaboration
Particle Swarm Optimization

So... ?

More practically, it's an optimization algorithm that seeks to find a vector of floating-point values that best solves some N-dimensional target function.
The Swarm

- The swarm consists of multiple particles flying independently through the search space
- Each particle acts as a separate, very simple agent
- We'll better define a 'particle' in just a bit
Particles...

- find new solutions that are 'similar' to their current solutions at the moment
- usually have a tendency to stay in motion
- search for solutions with awareness of past personal success
- search for solutions with consideration of progress made by the collective
- etc.
So... what IS a particle?

Each has a:
- position: a single solution; analogous to a chromosome
- velocity: tendency to change position (and thus solution) at each step/increment
- neighbourhood: the particles with which a particle collaborates
Searching

Updating the position (and thus the solution) is trivial:

\vec{x}\,' = \vec{x} + \vec{v}

- The intelligence normally comes from the velocity update rule
- Change the manner by which the velocity is updated, and you change the means of training the position (candidate solution)!
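As a minimal sketch (assuming positions and velocities are kept as NumPy arrays; the names here are illustrative, not prescribed by the slides), the position update really is one line:

```python
import numpy as np

# Minimal sketch of the position update x' = x + v, assuming positions and
# velocities are NumPy arrays of shape (n_particles, n_dims).
def update_positions(positions: np.ndarray, velocities: np.ndarray) -> np.ndarray:
    return positions + velocities
```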
Initialization

- All positions and velocities are randomized (per dimension)
- Positions may be chosen anywhere within the expected bounds of the search space
  - We'll talk a bit more about these bounds later
- Choosing velocity bounds is trickier, and mostly isn't standardized
  - I'm partial to half the bounds of the search space (or less)
  - Obviously the magnitude of the velocity bounds is strictly positive, but the selectable initial velocity components should be permitted to be negative
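A hedged initialization sketch, assuming per-dimension bounds `lo` and `hi` and following the half-width velocity suggestion above; all function and parameter names are illustrative:

```python
import numpy as np

def init_swarm(n_particles: int, lo: np.ndarray, hi: np.ndarray, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    d = lo.shape[0]
    # Positions: uniform anywhere within the search-space bounds.
    positions = rng.uniform(lo, hi, size=(n_particles, d))
    # Velocities: magnitude bounded by half the search width; components
    # are permitted to be negative, as the slide notes.
    v_max = 0.5 * (hi - lo)
    velocities = rng.uniform(-v_max, v_max, size=(n_particles, d))
    return positions, velocities
```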
Conventional Velocity Update Rule

\vec{v}\,' = \omega\vec{v} + c_1 r_1 (\vec{x}_b - \vec{x}) + c_2 r_2 (\vec{x}_{gb} - \vec{x}) \; [+\, c_3\vec{r}\,]

- The + c_3\vec{r} term is optional
- The r values are generated at least per particle, per iteration
  - Optionally, they may be per dimension, per particle, per iteration
  - i.e. a vector \vec{r}
  - Values typically range from 0 to 1
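A hedged sketch of the rule above for a single particle, assuming NumPy vectors and one random scalar per particle per iteration (the optional c_3\vec{r} term is omitted, and the parameter defaults are merely illustrative):

```python
import numpy as np

def update_velocity(v, x, x_best, x_gbest, omega=0.7, c1=2.0, c2=2.0, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    r1, r2 = rng.random(), rng.random()
    inertia   = omega * v                  # tendency to stay in motion
    cognitive = c1 * r1 * (x_best - x)     # pull toward personal best
    social    = c2 * r2 * (x_gbest - x)    # pull toward neighbourhood/global best
    return inertia + cognitive + social
```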
Explanation of Terms

- \omega\vec{v}: Inertia (or momentum)
- c_1 (\vec{x}_b - \vec{x}): Cognitive component: tendency to drift back towards 'past personal glory'
- c_2 (\vec{x}_{gb} - \vec{x}): Social component: tendency to drift towards the global or neighbourhood best thus far
- c_3\vec{r}: Explorative/random component, to discourage stagnation
Selection of Parameters

Generally decent initial guesses:
- \omega: less than 1 (obviously)
- c_1 and c_2: varies, but 2 and 2 isn't unheard of

Of course, all are normally determined empirically, and are normally static, but can be dynamic, or may be trained by another algorithm.
Neighbourhoods

Remember that the social component dictates the likelihood that a particle will rely on the work of its peers: those other particles within its neighbourhood.

- The simplest neighbourhood is the entire swarm
  - That might inhibit exploration, and cause premature convergence
- Alternatively, you can choose a neighbourhood size and a mechanism for choosing neighbours (see the sketch below)
  - The initial question is whether you should choose the neighbourhood once, at the beginning of the algorithm, or whether you should decide each iteration
    - If decided per iteration, it will typically be based on Euclidean proximity
    - It's up to you whether swarm associations are symmetric or not, e.g. whether p_A \in s_B \implies p_B \in s_A
    - When you choose your neighbourhoods probably won't really matter, because being assigned to the same neighbourhood a priori will tend to also encourage proximity
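If neighbourhoods are re-decided each iteration by Euclidean proximity, a sketch might look like the following; k-nearest selection is one plausible mechanism among several, and note that it is naturally asymmetric, matching the p_A \in s_B point above:

```python
import numpy as np

def euclidean_neighbourhoods(positions: np.ndarray, k: int) -> np.ndarray:
    # Pairwise distances between all particles, shape (n, n).
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)            # exclude the particle itself
    # Indices of each particle's k nearest neighbours, shape (n, k).
    return np.argsort(dists, axis=1)[:, :k]
```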
Bounds and Restrictions

- Solutions often have upper/lower bounds
- When this is true, it must be enforced on the positions
- Particles can simply be unable to exceed the bounds, may wrap around to the other side, or may bounce off
- Note that, when such restrictions don't exist, there may be some risk of the particles just flying away, never to be seen again
- As mentioned earlier, velocities should also be "clamped" to some maximum magnitude
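A minimal sketch of the "clamp" strategy for both positions and velocities; wrapping and bouncing are small variations on the same idea (the names here are illustrative):

```python
import numpy as np

def enforce_bounds(positions, velocities, lo, hi, v_max):
    positions  = np.clip(positions, lo, hi)        # particles stop at the walls
    velocities = np.clip(velocities, -v_max, v_max)
    return positions, velocities

# A bounce variant would additionally flip the velocity sign wherever the
# position was clipped, e.g. velocities[out_of_bounds] *= -1 (illustrative).
```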
Limitations of Canonical PSO

- First and foremost, PSO is not suitable for problems with "holes" in the search space
  - e.g. x between 0 and 1, or between 2 and 3, but not between 1 and 2
    - Of course, for a problem like this, one could simply adapt the transcription
    - e.g. the aforementioned range could be mapped to [0..2]: continuous within the particle's space, but discontinuous when evaluated for fitness
- Not suitable for problems with highly constrained choices
  - e.g. what's legal in dimension 1 depends on what was chosen in dimension 0
  - This can be particularly problematic when trying to adapt to a combinatorial problem
- Not really appropriate when adjacent values in the search space aren't "similar" in the solution space
Limitations of Canonical PSO

One final thought...

For problems/functions that can scale into larger versions, dimensionality might eventually become a limiting factor.
- Remember that our fitness function will normally just give us a single value
- Declaring one particle's 300-dimensional position better than another particle's 300-dimensional position might not ascribe much significance to each individual dimension
Benefits

- PSOs are easy to code and (outside of possibly the fitness function) very fast to execute
- The lack of a need for differentiability
  - Unlike, for example, gradient descent!
- PSOs are easy to code and fast to execute
- There are relatively few parameters to choose
  - Consequently, there may be less bias from the user/experimenter
- PSOs are easy to code and fast to execute
Applications

- The most common application is function optimization
  - Minimization/maximization is an example
- It can also include things like training weights for ANNs
  - As such, it can be a viable alternative to BackProp
Applications

Hmmm...
- Could we apply this to a combinatorial problem without changing the continuous nature of the PSO itself?
- Could we apply it to something like TSP?
- Could we apply it to a highly-constrained problem, like, say, two-connected networks with bounded rings?
Variations

- Modifications to inertia (see the sketch below)
  - The easiest is to start momentum high, and then gradually reduce it over time
    - Compare to simulated annealing
    - It might help to avoid local minima, and then allow for refinement
  - Personally, I'm partial to oscillation
- Binary or combinatorial PSO
  - If velocity is the propensity to change position, then one could relax the definitions to include combinatorial versions
  - I even saw one paper that used special operators like crossover and mutation...
  - ...wait a minute...

Anyhoo, in general, outside of screwy representations and such, much as the 'intelligence' comes from the velocity rule, that's also where it's easiest to introduce novel variations.
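As one concrete sketch of the "start high, reduce over time" inertia variation, a linear decay schedule might look like this; the 0.9 and 0.4 endpoints are common illustrative values, not prescribed by these slides:

```python
def linear_inertia(iteration: int, max_iterations: int,
                   w_start: float = 0.9, w_end: float = 0.4) -> float:
    # Decays omega linearly from w_start to w_end over the run, loosely
    # analogous to a cooling schedule in simulated annealing.
    frac = iteration / max_iterations
    return w_start + (w_end - w_start) * frac
```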
Example Minimization

Let's take a look at an example! Suppose we want to minimize the function:

f(x, y) = \cos(6.28 \times (3x + 2y)) + \cos(6.28 \times (2x + 3y)) - 2 \times \sin(6.28 \times (x + y))

within the range [-0.4 .. 0.4].

Just... because of reasons. Okay?
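Below is a hedged, self-contained sketch of a full PSO run on this function, assuming a global-best topology and illustrative parameters (omega = 0.7, c1 = c2 = 2.0, 30 particles, 200 iterations); none of these settings are prescribed by the slides:

```python
import numpy as np

def f(p):
    # The example function, vectorized over an array of (x, y) positions.
    x, y = p[..., 0], p[..., 1]
    return (np.cos(6.28 * (3*x + 2*y))
            + np.cos(6.28 * (2*x + 3*y))
            - 2 * np.sin(6.28 * (x + y)))

rng = np.random.default_rng(0)
n, d, iters = 30, 2, 200
lo, hi = -0.4, 0.4
v_max = 0.5 * (hi - lo)                    # half the search width, per earlier slide

pos = rng.uniform(lo, hi, (n, d))
vel = rng.uniform(-v_max, v_max, (n, d))
pbest = pos.copy()                         # each particle's best position so far
pbest_f = f(pbest)
gbest = pbest[np.argmin(pbest_f)]          # global best position so far

for _ in range(iters):
    r1 = rng.random((n, 1))                # per-particle random scalars
    r2 = rng.random((n, 1))
    vel = 0.7*vel + 2.0*r1*(pbest - pos) + 2.0*r2*(gbest - pos)
    vel = np.clip(vel, -v_max, v_max)      # clamp velocity magnitude
    pos = np.clip(pos + vel, lo, hi)       # enforce the search-space bounds
    fit = f(pos)
    improved = fit < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], fit[improved]
    gbest = pbest[np.argmin(pbest_f)]

print("best position:", gbest, "best value:", pbest_f.min())
```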
Example: Additional Samples

You can find some additional interesting test functions here:
http://www.sfu.ca/~ssurjano/optimization.html

Particularly neat ones are:
- Ackley Function
- Schaffer Function
- Eggholder Function
- Cross-in-Tray Function
- Langermann Function
- Rastrigin Function

Also check out the Hartmann functions listed. They go up to 6D! That's 4 more Ds!
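For instance, the Rastrigin function from that list has the standard d-dimensional form f(x) = 10d + \sum_i (x_i^2 - 10\cos(2\pi x_i)), usually evaluated on [-5.12, 5.12]^d with its global minimum at the origin; a sketch:

```python
import numpy as np

def rastrigin(x: np.ndarray) -> np.ndarray:
    # Accepts a single point of shape (d,) or a batch of shape (n, d).
    d = x.shape[-1]
    return 10*d + np.sum(x**2 - 10*np.cos(2*np.pi*x), axis=-1)
```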
Questions? Comments? Catchy tunes?