Comparison of Ordinal and Metric Gaussian Process Regression as Surrogate Models for CMA Evolution Strategy

Zbyněk Pitra 1,2,3, Lukáš Bajer 1,4, Jakub Repický 1,4, Martin Holeňa 1

1 Institute of Computer Science, Czech Academy of Sciences
2 Faculty of Nuclear Sciences and Physical Engineering
3 National Institute of Mental Health
4 Faculty of Mathematics and Physics, Charles University
Prague, Czech Republic

GECCO 2017

Z Pitra, L Bajer, J Repický, M Holeňa: Comparison Ordinal vs. Metric GP for CMA-ES

Contents

1 DTS-CMA-ES
2 Surrogate models
    Metric Gaussian Processes
    Ordinal Gaussian Processes
3 Experimental results

DTS-CMA-ES

Initialize: standard CMA-ES initialization with population doubled
while not terminate
  1 CMA-ES sampling of population $\mathbf{x}_i \sim N(\mathbf{m}, \sigma^2 \mathbf{C})$, for $i = 1, \dots, \lambda$
  2 train the first model $f_{M1}$ on the so-far original-evaluated points
  3 get the mean $\hat{\mu}_i$ and variance $\hat{s}_i^2$ of all $\mathbf{x}_i$ with the model $f_{M1}$
  4 select the most promising $\lceil \alpha\lambda \rceil$ points according to the model $f_{M1}$
  5 evaluate the chosen points with the original fitness $f$
  6 re-train the second model $f_{M2}$ with these new points
  7 predict the fitness for the non-original-evaluated points with $f_{M2}$
  8 CMA-ES update of $\mathbf{m}$, $\sigma$, $\mathbf{C}$

[Figure, built up step by step: (1) CMA-ES sampling from $N(m_1, \sigma_1)$; (2) 1st model training; (3) distribution prediction according to the 1st model; (4) criterion ranking (1st, 2nd, 3rd) according to the 1st model; (5) fitness evaluation of a few chosen points; (6) 2nd model training; (7) 2nd model mean-prediction for the rest of the population; (8) CMA-ES update of $m$, $\sigma$, $C$, giving $m_2$, $\sigma_2$.]
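Steps 1 to 7 of one generation can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the zero-mean GP with a fixed squared-exponential kernel, the use of the lowest predicted mean as the selection criterion in step 4, and all function names (`kernel`, `gp_fit`, `gp_predict`, `dts_generation`) are assumptions made for the example. The CMA-ES update of step 8 is left to an existing CMA-ES implementation.

```python
import math
import numpy as np

def kernel(A, B, ell=1.0):
    """Squared-exponential covariance between rows of A and B (assumed kernel)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(d2, 0.0) / ell**2)

def gp_fit(X, y, jitter=1e-8):
    """Zero-mean GP with fixed hyperparameters (assumption: no likelihood fitting)."""
    L = np.linalg.cholesky(kernel(X, X) + jitter * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # C_N^{-1} y_N
    return X, L, alpha

def gp_predict(model, X_new):
    """Predictive mean and variance at each row of X_new."""
    X, L, alpha = model
    K = kernel(X, X_new)                  # shape (N, M)
    mean = K.T @ alpha
    V = np.linalg.solve(L, K)
    var = 1.0 - np.sum(V**2, axis=0)      # k(x, x) = 1 for this kernel
    return mean, np.maximum(var, 0.0)

def dts_generation(f, m, sigma, C, arch_X, arch_y, lam, alpha, rng):
    # 1: CMA-ES sampling of the population x_i ~ N(m, sigma^2 C)
    X = rng.multivariate_normal(m, sigma**2 * C, size=lam)
    # 2: train the first model on the so-far original-evaluated points
    model1 = gp_fit(arch_X, arch_y)
    # 3: predicted mean and variance of all x_i
    mean1, var1 = gp_predict(model1, X)
    # 4: select ceil(alpha * lam) points; lowest predicted mean is a
    #    placeholder criterion (assumption, minimization)
    chosen = np.argsort(mean1)[:math.ceil(alpha * lam)]
    # 5: evaluate the chosen points with the original fitness
    y = np.empty(lam)
    y[chosen] = [f(x) for x in X[chosen]]
    # 6: re-train the second model including the new points
    arch_X = np.vstack([arch_X, X[chosen]])
    arch_y = np.concatenate([arch_y, y[chosen]])
    model2 = gp_fit(arch_X, arch_y)
    # 7: predict the fitness of the remaining points with the second model
    rest = np.setdiff1d(np.arange(lam), chosen)
    if rest.size:
        y[rest], _ = gp_predict(model2, X[rest])
    # 8: X and y would now be passed to the CMA-ES update of m, sigma, C
    return X, y, chosen, arch_X, arch_y
```

The returned `y` mixes original fitness values (at `chosen`) with model predictions (elsewhere), which is what the CMA-ES update in step 8 would rank.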

Gaussian Process

A GP is a stochastic process in which any finite collection of random variables has a joint Gaussian distribution:

$f_{GP}(\mathbf{x}) \sim \mathcal{GP}(\mu(\mathbf{x}), k(\mathbf{x}_1, \mathbf{x}_2))$

It is defined by the mean function $\mu(\mathbf{x})$ (usually constant), the covariance function $k(\mathbf{x}_1, \mathbf{x}_2)$, and their (hyper)parameters.

A GP can express the uncertainty of the prediction at a new point $\mathbf{x}$: it gives a probability distribution of the output value.

Gaussian Process

Given a set of $N$ training points $X_N = (\mathbf{x}_1 \dots \mathbf{x}_N)$, $\mathbf{x}_i \in \mathbb{R}^d$, and the corresponding measured values $\mathbf{y}_N = (y_1, \dots, y_N)^\top$ of a function $f$ being approximated,

$y_i = f(\mathbf{x}_i), \quad i = 1, \dots, N$

the GP considers the vector of these function values as a sample from an $N$-variate Gaussian distribution:

$\mathbf{y}_N \sim N(\mathbf{0}, \mathbf{C}_N)$
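As an illustration of the statement above, the following sketch builds a covariance matrix $\mathbf{C}_N$ for a few points and draws one vector $\mathbf{y}_N$ from the corresponding zero-mean Gaussian. The squared-exponential kernel, its hyperparameter values, and the small jitter added for numerical stability are assumptions for the example; the slides do not fix a particular covariance function here.

```python
import numpy as np

def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between rows of A and B (assumed kernel)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

rng = np.random.default_rng(0)
N, d = 5, 2
X_N = rng.uniform(-2.0, 2.0, size=(N, d))    # training inputs x_1, ..., x_N
C_N = sq_exp_kernel(X_N, X_N)                # C_N[i, j] = k(x_i, x_j)

# Draw y_N ~ N(0, C_N) via a Cholesky factor; the jitter is a numerical safeguard.
L = np.linalg.cholesky(C_N + 1e-8 * np.eye(N))
y_sample = L @ rng.standard_normal(N)
```

Points that lie close together in `X_N` get a covariance near `signal_var`, so their sampled values tend to be similar.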

Gaussian Process prediction

When considering a new point $(\mathbf{x}^*, y^*)$, the probability density of its $f$-value is a 1D Gaussian

$p(y^* \mid X_N, \mathbf{x}^*, \mathbf{y}_N) \sim N(\hat{\mu}_{N+1}, \hat{s}^2_{N+1})$

with the mean and variance given by

$\hat{\mu}_{N+1} = \mathbf{k}^\top \mathbf{C}_N^{-1} \mathbf{y}_N$
$\hat{s}^2_{N+1} = \kappa - \mathbf{k}^\top \mathbf{C}_N^{-1} \mathbf{k}$

where
  $\mathbf{C}_N$ is the GP covariance matrix: the matrix of the covariance function's values $k(\mathbf{x}_i, \mathbf{x}_j)$ for each pair $\mathbf{x}_i, \mathbf{x}_j$
  $\mathbf{k}$ is the vector of the covariance function's values $k(\mathbf{x}^*, \mathbf{x}_i)$ between the new point $\mathbf{x}^*$ and $\mathbf{x}_i \in X_N$
  $\kappa$ is the variance of the new point itself, $k(\mathbf{x}^*, \mathbf{x}^*)$
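The two prediction formulas translate almost directly into code. The sketch below is one possible way to evaluate them, assuming a squared-exponential covariance function with fixed hyperparameters and a small jitter on the diagonal of $\mathbf{C}_N$; the names `sq_exp_kernel` and `gp_predict` are illustrative only. A Cholesky factorization is used instead of an explicit inverse.

```python
import numpy as np

def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between rows of A and B (assumed kernel)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def gp_predict(X_train, y_train, x_star, jitter=1e-10):
    """Predictive mean and variance of a zero-mean GP at a single point x_star."""
    N = len(X_train)
    C_N = sq_exp_kernel(X_train, X_train) + jitter * np.eye(N)   # covariance matrix
    k = sq_exp_kernel(X_train, x_star[None, :])[:, 0]            # k(x*, x_i)
    kappa = sq_exp_kernel(x_star[None, :], x_star[None, :])[0, 0]  # k(x*, x*)

    L = np.linalg.cholesky(C_N)                                  # C_N = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))    # C_N^{-1} y_N
    v = np.linalg.solve(L, k)                                    # L^{-1} k

    mean = k @ alpha            # k^T C_N^{-1} y_N
    var = kappa - v @ v         # kappa - k^T C_N^{-1} k
    return mean, var
```

At a training point the predicted variance is close to zero, while far from all training points `k` vanishes, so the mean falls back to the prior mean of zero and the variance approaches `kappa`.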
