
Facetwise Modeling of Genetic Algorithms



  1. Facetwise Modeling of Genetic Algorithms Dirk Thierens Utrecht University The Netherlands 1/ ?? Dirk Thierens (Utrecht University) GA Modeling 1 / 47

  2. Run Time Complexity
     In a typical application, the total run time of a genetic algorithm is determined by the number of fitness function evaluations. The run time of the selection algorithm and the variation operators can be ignored.
     The number of fitness function evaluations equals the number of generations times the population size:
     $\#\text{FitnessFctEvals} = \#\text{Generations} \times \text{PopulationSize}$

  3. Convergence speed
     The rate at which a population converges is determined by the selection pressure:
     ◮ high selection pressure: fast convergence
     ◮ low selection pressure: slow convergence
     The size of the population determines the quality of the solution found:
     ◮ large population size: more reliable convergence
     ◮ small population size: less reliable convergence
     Hence there is a trade-off between selection pressure and population size.

  4. Key questions
     1. How long does a GA, with a certain selection pressure, run before it converges?
     2. What is the minimal population size that ensures reliable convergence?
     → The answers are problem dependent, but:
     we can build analytical models for simple problems,
     use these as an approximation for some real, complex problems,
     and so gain insight into, and guidance for, designing performant GAs.

  5. Models
     1. First, we will build analytical models for the convergence behavior, assuming large enough populations.
     2. Second, we will build analytical models for the minimal required population size.
     3. Third, we will test the models on a real, complex problem (map labeling).

  6. Selection Intensity
     To quantify the speed of convergence we need a quantitative measure of selection pressure.
     The selection differential $S(t)$ is the difference between the mean fitness $\bar{f}(t^s)$ of the parents selected at generation $t$ and the population mean fitness $\bar{f}(t)$ at generation $t$.
     The selection intensity $I(t)$ is the scaled selection differential, obtained by dividing by the standard deviation $\sigma(t)$ of the fitness values. $I(t)$ is dimensionless because the standard deviation has the same units as the selection differential:
     $I(t) = \frac{S(t)}{\sigma(t)} = \frac{\bar{f}(t^s) - \bar{f}(t)}{\sigma(t)}$
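
The two definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the function name, the toy fitness values, and the choice of the population (not sample) standard deviation are assumptions, not part of the slides.

```python
import statistics

def selection_intensity(population_fitness, parent_fitness):
    """Return (S, I) for one generation.

    S is the selection differential: mean fitness of the selected parents
    minus mean fitness of the whole population.
    I is S divided by the population standard deviation of the fitness.
    """
    mean_population = statistics.fmean(population_fitness)
    mean_parents = statistics.fmean(parent_fitness)
    sigma = statistics.pstdev(population_fitness)  # assumption: population std
    differential = mean_parents - mean_population
    return differential, differential / sigma

# Hypothetical toy data: the parents are the top half of the population.
population = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, standard deviation 2
parents = [5, 5, 7, 9]                  # mean 6.5
S, I = selection_intensity(population, parents)
print(S, I)
```

Because `I` is a ratio of two quantities in fitness units, rescaling all fitness values by a constant factor leaves it unchanged.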

  7. Counting Ones fitness function
     Counting Ones, the 'fruit fly' of GA theory:
     $CO(X) = \sum_{i=1}^{\ell} x_i, \quad x_i \in \{0, 1\}$
     Probability of having a 1 at a certain locus: $p(t)$
     Fitness is binomially distributed
     Mean fitness at generation $t$: $\bar{f}(t) = \ell \, p(t)$
     Variance at generation $t$: $\sigma^2(t) = \ell \, p(t)(1 - p(t))$
     Recombination does not change the population mean fitness
     ⇒ simple, yet accurate convergence models
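
As a quick sanity check of the binomial mean and variance, one can sample a random population and compare. This is a sketch under the assumption that every bit is set independently with probability p; the helper names, the seed, and the sizes are illustrative only.

```python
import random
import statistics

def counting_ones(bits):
    """Counting Ones fitness: the number of 1 bits in the string."""
    return sum(bits)

def random_population(size, length, p, rng):
    """Strings whose bits are independently 1 with probability p (assumption)."""
    return [[1 if rng.random() < p else 0 for _ in range(length)]
            for _ in range(size)]

rng = random.Random(0)          # fixed seed so the run is repeatable
length, p = 100, 0.5
fitness = [counting_ones(x) for x in random_population(5000, length, p, rng)]

empirical_mean = statistics.fmean(fitness)
empirical_var = statistics.pvariance(fitness)
model_mean = length * p                 # l * p
model_var = length * p * (1 - p)        # l * p * (1 - p)
print(empirical_mean, model_mean)
print(empirical_var, model_var)
```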

  8. Proportionate selection
     Probability of selecting individual $i$ (fitness $f_i$, proportion $P_i(t)$):
     $P_i(t^s) = P_i(t) \frac{f_i}{\bar{f}(t)}$
     Selection differential $S(t)$:
     $\bar{f}(t^s) - \bar{f}(t) = \sum_{i=1}^{N} P_i(t^s) f_i - \bar{f}(t)$
     $= \sum_{i=1}^{N} \frac{P_i(t) f_i^2}{\bar{f}(t)} - \bar{f}(t)$
     $= \frac{1}{\bar{f}(t)} \left( \overline{f^2}(t) - (\bar{f}(t))^2 \right)$
     $= \frac{\sigma^2(t)}{\bar{f}(t)}$
     Selection intensity: $I(t) = \frac{\sigma(t)}{\bar{f}(t)}$
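
The identity $S(t) = \sigma^2(t) / \bar{f}(t)$ can be checked numerically. The sketch below assumes every individual has proportion $P_i(t) = 1/N$ and uses made-up fitness values; the function name is hypothetical.

```python
import statistics

def proportionate_selection_differential(fitness):
    """Expected selection differential under proportionate selection.

    Individual i is selected with probability f_i / sum(f), so the expected
    parent fitness is sum(f_i^2) / sum(f_i).
    """
    mean_population = statistics.fmean(fitness)
    expected_parent_mean = sum(f * f for f in fitness) / sum(fitness)
    return expected_parent_mean - mean_population

fitness = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical values: mean 5, variance 4
S = proportionate_selection_differential(fitness)
variance_over_mean = statistics.pvariance(fitness) / statistics.fmean(fitness)
print(S, variance_over_mean)            # both sides of the identity
```

The selection intensity then follows by dividing by the standard deviation, which gives $\sigma(t) / \bar{f}(t)$ as stated on the slide.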

  9. Proportionate Selection: Counting Ones
     Mean fitness increase: $\bar{f}(t+1) - \bar{f}(t) = \frac{\sigma^2(t)}{\bar{f}(t)}$
     Proportion of optimal alleles $p(t)$: substituting $\bar{f}(t) = \ell \, p(t)$ and $\sigma^2(t) = \ell \, p(t)(1 - p(t))$ gives
     $p(t+1) - p(t) = \frac{1}{\ell}(1 - p(t))$, so $\frac{dp(t)}{dt} \approx \frac{1}{\ell}(1 - p(t))$
     Convergence model ($p(0) = 0.5$): $p(t) = 1 - 0.5 \, e^{-t/\ell}$
     Convergence speed: taking $p(t_{conv}) = 1 - 1/(2\ell)$ gives $t_{conv} = \ell \ln(\ell)$
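
A minimal sketch comparing the difference equation with its continuous approximation, and evaluating the convergence time. The function names and the string length of 100 are illustrative assumptions.

```python
import math

def p_model(t, length):
    """Closed-form convergence model with p(0) = 0.5."""
    return 1.0 - 0.5 * math.exp(-t / length)

def p_iterated(generations, length, p0=0.5):
    """Iterate p(t+1) = p(t) + (1 - p(t)) / length generation by generation."""
    p = p0
    for _ in range(generations):
        p += (1.0 - p) / length
    return p

def t_conv(length):
    """Generations until p reaches 1 - 1/(2 * length) in the closed-form model."""
    return length * math.log(length)

length = 100
t_c = t_conv(length)
print(t_c)                                            # convergence time in generations
print(p_model(t_c, length))                           # equals 1 - 1/(2 * length)
print(p_iterated(100, length), p_model(100, length))  # discrete vs continuous
```

The convergence time grows as $\ell \ln \ell$, which is slow compared with the $\sqrt{\ell}$ scaling of the constant-intensity selection schemes.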

  10. [Figure: proportion $p(t)$ (0.5 to 1) versus generations (0 to 500); curves: SUS, model]

  11. Truncation Selection
      Truncating a normal distribution at the top $\tau$ % gives a fitness increase proportional to the standard deviation:
      $\bar{f}(t^s) - \bar{f}(t) = c(\tau) \, \sigma(t)$
      Selection intensity: $I(\tau) = c(\tau)$
      For truncation selection the selection intensity $I$ depends only on $\tau$ and is therefore constant over the generations:
      τ    1 %    10 %   20 %   40 %   50 %   80 %
      I    2.66   1.76   1.2    0.97   0.8    0.34
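
The slide does not give a formula for $c(\tau)$. A sketch under the assumption of normally distributed fitness uses the standard result for the mean of a truncated standard normal, $I(\tau) = \varphi(z) / \tau$ with $z = \Phi^{-1}(1 - \tau)$; the function name is hypothetical.

```python
from statistics import NormalDist

def truncation_intensity(tau):
    """Selection intensity when the top fraction tau (0 < tau < 1) is kept.

    Assumption: fitness is normally distributed, so the intensity is the mean
    of a standard normal truncated below at z = inv_cdf(1 - tau).
    """
    standard_normal = NormalDist()
    z = standard_normal.inv_cdf(1.0 - tau)
    return standard_normal.pdf(z) / tau

for tau in (0.01, 0.10, 0.20, 0.40, 0.50, 0.80):
    print(f"tau = {tau:4.2f}  I = {truncation_intensity(tau):.2f}")
```

Under this assumption the computed values agree closely with the table for most entries; for $\tau$ = 20 % the formula gives about 1.40, somewhat above the 1.2 listed.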

  12. Truncation Selection
      Mean fitness increase: $\bar{f}(t+1) - \bar{f}(t) = I \, \sigma(t)$
      Proportion of optimal alleles $p(t)$: $p(t+1) - p(t) = \frac{I}{\sqrt{\ell}} \sqrt{p(t)(1 - p(t))}$, so $\frac{dp(t)}{dt} \approx \frac{I}{\sqrt{\ell}} \sqrt{p(t)(1 - p(t))}$
      Convergence model ($p(0) = 0.5$): $p(t) = 0.5 \left( 1 + \sin\left( \frac{I}{\sqrt{\ell}} t \right) \right)$
      Convergence speed ($p(t_{conv}) = 1$): $t_{conv} = \frac{\pi}{2} \frac{\sqrt{\ell}}{I}$
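
The closed-form model and the convergence time are easy to evaluate. In the sketch below the function names, the string length of 100, and the clamping of the sine argument at $\pi/2$ (to keep $p = 1$ after convergence) are assumptions added for illustration.

```python
import math

def p_model(t, length, intensity):
    """Convergence model p(t) = 0.5 * (1 + sin(I * t / sqrt(l))), p(0) = 0.5.

    The argument is clamped at pi/2 (an added assumption) so that the
    proportion stays at 1 once the population has converged.
    """
    theta = min(intensity * t / math.sqrt(length), math.pi / 2)
    return 0.5 * (1.0 + math.sin(theta))

def t_conv(length, intensity):
    """Generations until p(t) = 1 for a constant selection intensity."""
    return math.pi * math.sqrt(length) / (2.0 * intensity)

length = 100
for intensity in (0.8, 1.76):           # tabulated values for tau = 50 % and 10 %
    print(intensity, t_conv(length, intensity))
```

Because the intensity is constant, the convergence time scales with $\sqrt{\ell}$ and is inversely proportional to $I$.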

  13. [Figure: proportion $p(t)$ (0.5 to 1) versus generations (0 to 40); curves: trunc + recomb, trunc + 2.recomb, model]

  14. Tournament Selection
      For tournament size $s$, the selection intensity $I$ equals the expected value of the best-ranked individual in a sample of $s$ individuals drawn from the standard normal distribution. It can be computed using order statistics: $I = u_{s:s}$
      s              2                        3      4      5      6      7
      I = u_{s:s}    $1/\sqrt{\pi}$ = 0.56    0.85   1.03   1.16   1.27   1.35
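
The expected maximum of $s$ standard normal samples can be computed numerically from the density of the largest order statistic, $s \, \Phi(x)^{s-1} \varphi(x)$. The sketch below uses a simple midpoint rule; the function name, integration bounds, and step count are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def tournament_intensity(s, lo=-8.0, hi=8.0, steps=4000):
    """Expected value of the largest of s standard normal samples.

    Integrates x * s * cdf(x)**(s - 1) * pdf(x) with the midpoint rule.
    """
    standard_normal = NormalDist()
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        total += x * s * standard_normal.cdf(x) ** (s - 1) * standard_normal.pdf(x)
    return total * width

exact_s2 = 1.0 / math.sqrt(math.pi)     # closed form for s = 2
intensities = {s: round(tournament_intensity(s), 2) for s in range(2, 8)}
print(exact_s2, intensities)
```

Rounded to two decimals, the computed values should reproduce the row of the table above.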

  15. Tournament Selection
      Same model as for truncation selection, for instance for tournament size $s = 2$:
      Mean fitness increase: $\bar{f}(t+1) - \bar{f}(t) = I \, \sigma(t) = \frac{1}{\sqrt{\pi}} \sigma(t)$
      Convergence model ($p(0) = 0.5$): $p(t) = 0.5 \left( 1 + \sin\left( \frac{t}{\sqrt{\pi \ell}} \right) \right)$
      Convergence speed ($p(t_{conv}) = 1$): $t_{conv} = \frac{\pi}{2} \sqrt{\pi \ell}$

  16. [Figure: proportion $p(t)$ (0.5 to 1) versus generations (0 to 40); curves: tour + recomb, tour + 2.recomb, model]

  17. Population sizing
      The correct size of the population is important:
      ◮ too small: premature convergence to sub-optimal solutions
      ◮ too large: computationally inefficient
      We focus on the Counting-Ones problem, but the model can be extended to (slightly) more complex functions.
      Key question: how does the optimal population size scale with the complexity of the problem, i.e. the length of the string?

  18. Selection Error
      Tournament selection:
      $s_1$: 1100011100, fitness = 5
      $s_2$: 0100111101, fitness = 6
      ⇒ string $s_2$ is selected.
      Competition at the schema level (order-1 schemata suffice since we focus on Counting-Ones):
      ◮ partition f∗∗∗∗∗∗∗∗∗: schema 0∗∗∗∗∗∗∗∗∗ wins over schema 1∗∗∗∗∗∗∗∗∗ ⇒ selection decision error.
      ◮ partitions ∗∗∗∗f∗∗∗∗∗ and ∗∗∗∗∗∗∗∗∗f: schema ∗∗∗∗1∗∗∗∗∗ wins over schema ∗∗∗∗0∗∗∗∗∗, and schema ∗∗∗∗∗∗∗∗∗1 wins over schema ∗∗∗∗∗∗∗∗∗0 ⇒ correct selection decisions.
      ◮ other partitions: nothing changes.
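
The schema-level bookkeeping for this tournament can be reproduced with a short sketch. The function name and the tie-breaking rule (the second string wins a tie) are assumptions; loci are numbered from 1.

```python
def schema_decisions(winner, loser):
    """Classify each locus of a Counting-Ones tournament.

    A locus is an error if the winner carries 0 where the loser carries 1,
    correct if the winner carries 1 where the loser carries 0, and neutral
    (not reported) if both strings agree.
    """
    errors, correct = [], []
    for locus, (w, l) in enumerate(zip(winner, loser), start=1):
        if w == l:
            continue
        (correct if w == "1" else errors).append(locus)
    return errors, correct

s1 = "1100011100"   # fitness 5
s2 = "0100111101"   # fitness 6
# Assumption: on equal fitness the second string is taken as the winner.
winner, loser = (s1, s2) if s1.count("1") > s2.count("1") else (s2, s1)
errors, correct = schema_decisions(winner, loser)
print(errors, correct)   # loci with a decision error, loci decided correctly
```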

  19. Selection Error
      What is the probability of making a selection error?
      How many selection errors can we afford to make before the optimal bit value at a certain position is completely lost from the population (= premature convergence)?
      Population sizing is basically a statistical decision-making problem.

  20. Probability selection decision error
      The schema fitnesses $f(H_1)$, with $H_1$: ∗∗∗1∗∗∗∗∗∗∗∗, and $f(H_2)$, with $H_2$: ∗∗∗0∗∗∗∗∗∗∗∗, are binomially distributed → approximate with a normal distribution $N(\mu, \sigma^2)$:
      $\mu_{H_1} = 1 + (\ell - 1) p, \quad \sigma^2_{H_1} = (\ell - 1) p (1 - p)$
      $\mu_{H_2} = (\ell - 1) p, \quad \sigma^2_{H_2} = (\ell - 1) p (1 - p)$
      ($p$ = probability of having bit value 1 at any position).
      The fitness difference between the best schema and the worst schema, $f(H_1) - f(H_2)$, is also normally distributed:
      $\mu_{H_1 - H_2} = 1, \quad \sigma^2_{H_1 - H_2} = 2 (\ell - 1) p (1 - p)$

  21. Probability selection decision error
      The probability of a selection error equals the probability that the best schema is sampled by a string with lower fitness than the sample of the worst schema, which equals the probability that the fitness difference of the strings is negative:
      $P[\text{SelErr}] = P(F_{H_1 - H_2} < 0) = \Phi\left( \frac{-1}{\sqrt{2 (\ell - 1) p (1 - p)}} \right)$
      $\Phi(x)$: cumulative distribution function of the standard normal distribution, with $P(X < b) = \Phi\left( \frac{b - \mu}{\sigma} \right)$
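
The error probability is straightforward to evaluate for given string length and allele proportion. In this sketch the function name and the handling of the degenerate case $p = 0$ or $p = 1$ (zero variance, so no error is possible) are assumptions.

```python
import math
from statistics import NormalDist

def selection_error_probability(length, p):
    """P[SelErr] = Phi(-1 / sqrt(2 * (l - 1) * p * (1 - p))).

    The fitness difference of the two schemata has mean 1 and variance
    2 * (l - 1) * p * (1 - p) under the normal approximation.
    """
    variance = 2.0 * (length - 1) * p * (1.0 - p)
    if variance == 0.0:
        return 0.0   # assumption: the better schema always wins without noise
    return NormalDist().cdf(-1.0 / math.sqrt(variance))

for length in (50, 100, 200, 400):
    print(length, round(selection_error_probability(length, 0.5), 3))
```

Longer strings add more noise from the other loci, so the error probability moves closer to 0.5; as $p$ approaches 1 the noise vanishes and the error probability drops to 0.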

  22. [Figure: probability of selection error (0 to 0.5) versus proportion $p$ of bit values 1 (0.5 to 1); curves: l=400, l=200, l=100, l=50]
