

  1. Differential Evolution Entirely Parallel Method
K.N. Kozlov (1,*), A.M. Samsonov (2) and M.G. Samsonova (1)
(1) St.Petersburg Polytechnic University, St.Petersburg 195251, Russia
(2) A.F. Ioffe Physico-Technical Institute, St.Petersburg 194021, Russia
(*) email: kozlov_kn@spbstu.ru

  2. Introduction
Motivation: The design of efficient algorithms and systems for solving the inverse problem of mathematical modeling remains a challenge, and for applications in biology and medicine no universal solution exists. The main difficulties are:
■ processing of large data volumes,
■ heterogeneity of biomedical data,
■ determination of experimentally immeasurable quantities and parameters,
■ design of new substances,
■ multiple fitness criteria,
■ high computational complexity of biomedical applications,
■ interpretation of modeling results.
Aims: An efficient optimization method should be not only fast and scalable, but also reliable and robust:
1. Applicability to different types of models and formal description of a priori knowledge,
2. Flexibility: a wide range of strategies to overcome local extrema,
3. Usability: independence from the language of the model description.

  3. Formal Problem Statement
Initial value problem: Let the vector v(t, q) = (v_0(t, q), ..., v_{K-1}(t, q))^T describe the state of the system at time t; the dimension of v equals the number of state variables K. Let q = (q_0, ..., q_{I-1})^T be the vector of parameters with dimension I. A system of ordinary differential equations of the first order with respect to the independent variable t, together with initial conditions, describes the dynamics of the system:

    ∂v/∂t = f(v, q);   v(0, q) = v_0.   (1)

Problems of mathematical physics described by differential equations of higher order can be rewritten in the normal form (1) (Lurie, 1993). The model parameters are estimated by adaptation, i.e. by fitting the model output to the experimental data.
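For illustration, the forward problem (1) can be integrated numerically for a candidate parameter vector q, e.g. with scipy.integrate.solve_ivp. A minimal sketch, assuming a hypothetical two-variable right-hand side f (the names simulate and t_obs are illustrative, not part of the method):

    import numpy as np
    from scipy.integrate import solve_ivp

    def f(t, v, q):
        # Hypothetical right-hand side of (1): two state variables, two parameters.
        return [q[0] * v[1], -q[1] * v[0]]

    def simulate(q, v0, t_obs):
        # Integrate dv/dt = f(v, q) with v(0, q) = v0 and return v(t_i, q)
        # at the observation times t_i.
        sol = solve_ivp(f, (t_obs[0], t_obs[-1]), v0, t_eval=t_obs, args=(q,))
        return sol.y.T  # shape (J, K): one state vector per observation time

    v = simulate(q=[1.0, 0.5], v0=[1.0, 0.0], t_obs=np.linspace(0.0, 10.0, 21))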

  4. Objective Function
Adaptation is performed by minimization of the objective function, e.g. the sum of squared differences between the data and the model output,

    F(q) = Σ_{i=1..J} (v(t_i, q) − y(t_i))^T (v(t_i, q) − y(t_i)),

where the J independent experimental observations are denoted y(t_i) = (y_0(t_i), ..., y_{K-1}(t_i))^T, i = 1, ..., J.
Multi-objective case: the weighted global criterion method (weighted sum method). Let R_i(q) represent the optimization criteria and S_j(q) = 0 the parameter constraints:

    F(q) = Σ_i α_i R_i(q) + Σ_j β_j S_j(q) → min,   α_i > 0,   β_j > 0.
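In Python (numpy) the weighted sum reads as below; the least-squares criterion serves as one of the R_i, while the weights alpha, beta and the simulate callback are assumptions to be supplied per application:

    import numpy as np

    def least_squares(q, simulate, t_obs, y_obs):
        # One R_i(q): sum of squared differences between model output and data.
        v = simulate(q, t_obs)              # model output, shape (J, K)
        return np.sum((v - y_obs) ** 2)

    def weighted_sum(q, criteria, constraints, alpha, beta):
        # F(q) = sum_i alpha_i R_i(q) + sum_j beta_j S_j(q); all weights > 0.
        return (sum(a * R(q) for a, R in zip(alpha, criteria))
                + sum(b * S(q) for b, S in zip(beta, constraints)))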

  5. Box Constraints
Constraints in the form of inequalities can be imposed on a subset of the parameters:

    q_i^low ≤ q_i ≤ q_i^up,   i ∈ I_l.   (2)

Let us introduce new parameters u_i:

    q_i = α_i + β_i tanh(η u_i),   (3)

where the scaling coefficient η is chosen experimentally, and

    α_i = (q_i^up + q_i^low) / 2;   β_i = (q_i^up − q_i^low) / 2.

Consequently, the parameters q_i, i ∈ I_l in (1) are substituted with their transformations, and the adaptation procedure is applied to determine the unconstrained parameters u_i.
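A sketch of the substitution (3) and its inverse (the inverse is needed to encode a feasible initial guess; eta is assumed to be supplied by the user):

    import numpy as np

    def q_from_u(u, q_low, q_up, eta):
        # Forward map (3): any real u yields q_low <= q <= q_up.
        alpha = (q_up + q_low) / 2.0
        beta = (q_up - q_low) / 2.0
        return alpha + beta * np.tanh(eta * u)

    def u_from_q(q, q_low, q_up, eta):
        # Inverse map: encode a point satisfying (2) into the
        # unconstrained space searched by the optimizer.
        alpha = (q_up + q_low) / 2.0
        beta = (q_up - q_low) / 2.0
        return np.arctanh((q - alpha) / beta) / eta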

  6. Overview
The constrained optimization problem can be transformed into an unconstrained one using penalty functions.
Local search:
■ Gradient descent,
■ Conjugate gradient,
■ Nelder-Mead method.
Global methods:
■ Simulated Annealing (Metropolis method),
■ Genetic algorithms, evolutionary algorithms,
■ Imitation methods: Particle Swarm Optimization, Ant Colonies.
For a broader overview see (Ashyraliyev et al., 2009; Mendes and Kell, 1998; Moles et al., 2003).

  7. Existing Solutions
Methods for the solution of the inverse problem of mathematical modeling are being actively developed.
Available tools and approaches:
■ Integrated systems: Matlab, Mathematica, Maple, etc.,
■ Octave, R, COPASI, KNIME, etc.,
■ Programming «from scratch»,
■ Adaptation «by hand».
Disadvantages:
■ Poor interoperability, orientation towards simulations,
■ Limitation to certain types of models,
■ Steep learning curve,
■ Complexity of formalizing the a priori knowledge.

  8. Differential Evolution
A stochastic iterative optimization technique suggested by Storn and Price (Storn and Price, 1995). The algorithm operates on a set of NP parameter vectors q_i, i = 1, ..., NP, from the current generation; the size NP of the set is fixed.
Construction of trial vectors:

    v = q_r1 + S (q_r2 − q_r3),

where S is the scaling constant and r1, r2, r3 are distinct random indices.
[Figure: geometric interpretation in the 2D case — the trial vector v is built from three population members q_r1, q_r2, q_r3 scattered around the minimum in the (q_1, q_2) plane.]
«Trigonometric mutation rule» (Fan and Lampinen, 2003):

    v = (q_r1 + q_r2 + q_r3)/3 + (φ_2 − φ_1)(q_r1 − q_r2) + (φ_3 − φ_2)(q_r2 − q_r3) + (φ_1 − φ_3)(q_r3 − q_r1),

where φ_i = |F(q_ri)| / φ*, i = 1, 2, 3, and φ* = |F(q_r1)| + |F(q_r2)| + |F(q_r3)|.
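Both mutation rules in Python; population is assumed to be an NP × I numpy array and F_vals a numpy array of the corresponding objective values. A sketch under those assumptions, not the authors' implementation:

    import numpy as np

    rng = np.random.default_rng()

    def de_mutation(population, S):
        # Classic rule: v = q_r1 + S * (q_r2 - q_r3), distinct random indices.
        r1, r2, r3 = rng.choice(len(population), size=3, replace=False)
        return population[r1] + S * (population[r2] - population[r3])

    def trigonometric_mutation(population, F_vals):
        # Trigonometric mutation rule (Fan and Lampinen, 2003).
        r = rng.choice(len(population), size=3, replace=False)
        phi_star = np.sum(np.abs(F_vals[r]))
        phi = np.abs(F_vals[r]) / phi_star   # phi_i = |F(q_ri)| / phi*
        q = population[r]
        v = (q[0] + q[1] + q[2]) / 3.0
        v += (phi[1] - phi[0]) * (q[0] - q[1])
        v += (phi[2] - phi[1]) * (q[1] - q[2])
        v += (phi[0] - phi[2]) * (q[2] - q[0])
        return v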

  9. Crossover
The offspring w of the parent z and the trial vector v is defined as follows:

    w_j = v_j,   j = ⟨n⟩_I, ⟨n + 1⟩_I, ..., ⟨n + L − 1⟩_I,
    w_j = z_j,   j < ⟨n⟩_I or j > ⟨n + L − 1⟩_I,

where n is a randomly chosen index, ⟨n⟩_I is the remainder of division of n by I, and L is determined by Pr(L = a) = p^a, where p is the crossover probability.
[Figure: components q_0, ..., q_8 of v, w and z; with n = 2 and L = 3 the offspring w takes components 2, 3 and 4 from the trial vector v and the remaining ones from the parent z.]
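A sketch of this exponential-style crossover; the run length L is generated by repeated coin flips with probability p, so long runs have geometrically decreasing probability, in the spirit of Pr(L = a) = p^a above:

    import numpy as np

    rng = np.random.default_rng()

    def exponential_crossover(z, v, p):
        # z: parent, v: trial vector, p: crossover probability.
        # Copy a run of L consecutive components (modulo I) from v into
        # a copy of z, starting at a random index n.
        I = len(z)
        w = z.copy()
        n = rng.integers(I)          # random start index
        L = 0
        while True:
            w[(n + L) % I] = v[(n + L) % I]
            L += 1
            if L >= I or rng.random() >= p:
                break
        return w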

  10. Differential Evolution Entirely Parallel Method
Being an evolutionary algorithm, DE can be effectively parallelized:
■ each member of the population is pushed to an asynchronous queue,
■ it is evaluated individually as soon as a thread in the pool becomes available,
■ threads are synchronized at each generation.
The age of an individual is the number of iterations the individual has survived without changes. The several oldest members are substituted by the same number of individuals having the best values of the objective function.
The parameters of the algorithm:
- Ψ is the number of seeding individuals (i.e. individuals that substitute the old ones),
- Θ is the number of iterations between substitutions, called the seeding interval.
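The per-generation evaluation pattern can be sketched with a standard thread pool; here concurrent.futures stands in for the asynchronous queue described above (a sketch, not the actual DEEP code):

    from concurrent.futures import ThreadPoolExecutor

    def evaluate_generation(population, objective, n_threads=8):
        # Each individual is submitted to the pool and evaluated as soon
        # as a thread becomes available; returning from this call
        # synchronizes all threads before the next generation starts.
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            return list(pool.map(objective, population))

For CPU-bound pure-Python objectives, ProcessPoolExecutor is the drop-in alternative to sidestep the interpreter lock.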

  11. Selection of Offspring
New selection rule for several objective functions:
1. The offspring replaces its parent if the value of the main objective function for the offspring's set of parameters is less than that for the parental one.
2. The offspring replaces its parent if the value of some additional objective function is better and a randomly selected value is less than the predefined control parameter for this function.

    proc select(individual) {
        if (F < the value of the parent) then
            Accept offspring
        else
            for all additional functions P do
                if (P < the value of the parent) then
                    Generate the random number U
                    if (U < control parameter for this function) then
                        Accept offspring
                    end if
                end if
            end for
        end if
    }
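The same rule in Python; F_child and F_parent are the main objective values, and extras holds one (P_child, P_parent, control) triple per additional function (all names illustrative):

    import random

    def accept_offspring(F_child, F_parent, extras):
        # extras: list of (P_child, P_parent, control) triples, one per
        # additional objective function.
        if F_child < F_parent:
            return True                  # rule 1: main objective improved
        for P_child, P_parent, control in extras:
            if P_child < P_parent and random.random() < control:
                return True              # rule 2: stochastic acceptance
        return False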

  12. Population Diversity
The original algorithm is highly dependent on its internal parameters, see, e.g., (Gaemperle R., 2002). The variance of each parameter over the population is

    var_j = (1/NP) Σ_{i=0..NP−1} ( q_{i,j} − (1/NP) Σ_{k=0..NP−1} q_{k,j} )²,   (4)

where j = 0, ..., I − 1.
An efficient adaptive scheme for the selection of the internal parameters S and p, based on preserving the population diversity, was proposed in (Zaharie, 2002).
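In numpy, equation (4) is the column-wise (biased, 1/NP-normalized) variance of the NP × I population array:

    import numpy as np

    def population_variance(population):
        # var_j of equation (4): variance of each parameter over the
        # population; np.var uses the 1/NP normalization by default.
        return np.var(population, axis=0)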

  13. Adaptive Scheme
Then

    S_j = sqrt( (NP (c_j − 1) + p_j (2 − p_j)) / (2 NP p_j) )   if NP (c_j − 1) + p_j (2 − p_j) ≥ 0,
    S_j = S_inf   otherwise,   (5)

and

    p_j = sqrt( (NP S_j² − 1)² − NP (1 − c_j) ) − (NP S_j² − 1)   if c_j ≥ 1,
    p_j = p_inf   otherwise.   (6)

The new control parameter γ sets the target diversity ratio:

    c_j^new = γ · var_j / var_j^new.   (7)
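A direct transcription of (5) and (6) as reconstructed above; the fallback values S_inf and p_inf are assumed to be user-chosen constants:

    import numpy as np

    def adapt_S(c, p, NP, S_inf):
        # Equation (5): scaling constant that keeps the diversity ratio c.
        d = NP * (c - 1.0) + p * (2.0 - p)
        return np.sqrt(d / (2.0 * NP * p)) if d >= 0.0 else S_inf

    def adapt_p(c, S, NP, p_inf):
        # Equation (6): crossover probability for a given S and target c.
        if c < 1.0:
            return p_inf
        a = NP * S * S - 1.0
        return np.sqrt(a * a + NP * (c - 1.0)) - a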

  14. Stopping Criterion
[Figure: the objective function falls from F_max towards F_opt over N iterations; over the last M steps it changes by less than ρ.]
Calculations are stopped when the objective function F decreases by less than a predefined value ρ during M steps.
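A sketch of the test: track the best objective values of the last M generations and stop once the decrease over that window drops below ρ:

    from collections import deque

    def make_stopping_test(M, rho):
        history = deque(maxlen=M)    # best F of the last M generations
        def should_stop(F_best):
            # Stop once F has decreased by less than rho over M steps.
            history.append(F_best)
            return len(history) == M and history[0] - history[-1] < rho
        return should_stop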

  15. The Integer-Valued Parameters
■ The values are sorted in ascending order.
■ The index of the parameter in the floating-point array becomes the value of the parameter in the integer array.
For example, let

    a[0] = 0.3; a[1] = 0.5; a[2] = 0.1; a[3] = 0.8;

After sorting in ascending order the array is transformed to:

    b[0] = a[2] = 0.1; b[1] = a[0] = 0.3; b[2] = a[1] = 0.5; b[3] = a[3] = 0.8.

The indices of the sorted array define the current set of the integer-valued parameters:

    i[0] = 2; i[1] = 0; i[2] = 1; i[3] = 3;
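This decoding is exactly numpy's argsort; a sketch reproducing the example above:

    import numpy as np

    a = np.array([0.3, 0.5, 0.1, 0.8])
    i = np.argsort(a)      # indices that sort a ascending: [2, 0, 1, 3]
    b = a[i]               # sorted values: [0.1, 0.3, 0.5, 0.8]
    # i is the current set of integer-valued parameters (a permutation),
    # evolved as floating-point values and decoded via argsort.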

  16. Approximation of the Optimum
[Figure: plots of the three criteria in the close vicinity of the optimal values of the four parameters, x = 5.0, y = 3.0, z = 1.12 and q = 0.43; each panel shows Criteria 1–3 as error vs. parameter value.]
F_1 is the sum of squared differences between the model output and the data; F_2 and F_3 penalize the deviation of the model from the desired behavior.
Although the criteria do not attain their minima at one and the same point, the algorithm nevertheless produces a reliable approximation of the optimum.
