

  1. Mining Markov Network Surrogates for Value-Added Optimisation
     Alexander Brownlee
     www.cs.stir.ac.uk/~sbr
     sbr@cs.stir.ac.uk

  2. Outline
     • Value-added optimisation
     • Markov network fitness model
     • Mining the model
     • Examples with benchmarks
     • Case study: cellular windows
     • Discussion / conclusions

  3. Value-added Optimisation
     • A philosophy whereby we provide more than simply optimal solutions
     • Information gained during optimisation can highlight sensitivities and linkage
     • This can be useful to the decision maker:
       – Confidence in the optimality of results
       – Aids decision making
       – Insights into the problem
     • Helps solve similar problems
     • Highlights problems / misconceptions in the problem definition

  4. Value-added Optimisation
     • This information can come from
       – the trajectory followed by the algorithm
       – models built during the run
     • If we are constructing a model as part of the optimisation process, anything we can learn from it comes "for free"
     • Some examples from MBEAs / EDAs:
       – M. Hauschild, M. Pelikan, K. Sastry, and C. Lima. Analyzing probabilistic models in hierarchical BOA. IEEE TEC 13(6):1199-1217, December 2009
       – R. Santana, C. Bielza, J. A. Lozano, and P. Larrañaga. Mining probabilistic models learned by EDAs in the optimization of multi-objective problems. In Proc. GECCO 2009, pp. 445-452

  5. Markov network fitness model (MFM)
     • Suited to bit-string encoded problems
     • Originally developed as part of the DEUM EDA
       – A probabilistic model of fitness, directly sampled to generate solutions, replacing the crossover and mutation operators
     • A Markov network is an undirected probabilistic graphical model
       – The energy U(x) of a solution x equates to a sum of clique potentials, which in turn equates to a mass distribution over fitness
       – Energy has a negative-log relationship to probability, so minimising U maximises f
     • The MFM can be used as a surrogate

  6. FM with Markov Networks
     • Two aspects to building a Markov network over x0, x1, x2, x3:
       – Structure (the graph over the variables shown on the slide)
       – Parameters (α)
     • The model can be represented by:

       $$-\ln(f(x)) = \alpha_0 x_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_{01} x_0 x_1 + \alpha_{02} x_0 x_2 + \alpha_{03} x_0 x_3 + \alpha_{13} x_1 x_3 + \alpha_{23} x_2 x_3 + \alpha_{013} x_0 x_1 x_3 + \alpha_{023} x_0 x_2 x_3 + c$$

     • Compute the parameters using a sample of the population
     • Variables take values -1 and +1 instead of 0 and 1
     • The terms in the MFM correspond to Walsh functions (so it can represent any bit-string encoded problem)

  7. Building a Model
     • Using the structure over x0, x1, x2, x3 from the previous slide, calculate the Markov network parameters using SVD: each sampled solution yields one linear equation in the alphas (a minimal sketch follows below)

       1011, f=1: $\alpha_0 - \alpha_1 + \alpha_2 + \alpha_3 - \alpha_{01} + \alpha_{02} + \alpha_{03} - \alpha_{13} + \alpha_{23} - \alpha_{013} + \alpha_{023} + c = -\ln(1)$
       1111, f=4: $\alpha_0 + \alpha_1 + \alpha_2 + \alpha_3 + \alpha_{01} + \alpha_{02} + \alpha_{03} + \alpha_{13} + \alpha_{23} + \alpha_{013} + \alpha_{023} + c = -\ln(4)$
       1001, f=1: $\alpha_0 - \alpha_1 - \alpha_2 + \alpha_3 - \alpha_{01} - \alpha_{02} + \alpha_{03} - \alpha_{13} - \alpha_{23} - \alpha_{013} - \alpha_{023} + c = -\ln(1)$
       1000, f=3: $\alpha_0 - \alpha_1 - \alpha_2 - \alpha_3 - \alpha_{01} - \alpha_{02} - \alpha_{03} + \alpha_{13} + \alpha_{23} + \alpha_{013} + \alpha_{023} + c = -\ln(3)$
       0011, f=2: $-\alpha_0 - \alpha_1 + \alpha_2 + \alpha_3 + \alpha_{01} - \alpha_{02} - \alpha_{03} - \alpha_{13} + \alpha_{23} + \alpha_{013} - \alpha_{023} + c = -\ln(2)$

     • Solving gives: α0 = -0.38, α1 = 0.16, α2 = 0.02, α3 = -0.34, α01 = -0.07, α02 = 0.25, α03 = -0.11, α13 = -0.11, α23 = -0.25, α013 = -0.34, α023 = -0.02, c = -0.61
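A minimal sketch of this parameter-estimation step, assuming NumPy; the clique list, helper names and data layout are illustrative rather than taken from DEUM itself. Each sampled solution contributes one linear equation in the alphas, and the (here underdetermined) system is solved by least squares, which NumPy computes via SVD:

```python
import numpy as np

# Cliques of the example 4-bit model from the slide:
# univariate, bivariate and trivariate terms.
CLIQUES = [(0,), (1,), (2,), (3,),
           (0, 1), (0, 2), (0, 3), (1, 3), (2, 3),
           (0, 1, 3), (0, 2, 3)]

def design_row(bits):
    """One row of the linear system: the product of -1/+1 spins in each
    clique, plus a final 1 for the constant term c."""
    s = [2 * b - 1 for b in bits]  # map 0/1 bits to -1/+1 spins
    return [np.prod([s[i] for i in c]) for c in CLIQUES] + [1.0]

# The five sampled solutions from the slide and their fitness values.
samples = [([1, 0, 1, 1], 1), ([1, 1, 1, 1], 4), ([1, 0, 0, 1], 1),
           ([1, 0, 0, 0], 3), ([0, 0, 1, 1], 2)]

A = np.array([design_row(bits) for bits, _ in samples])
b = np.array([-np.log(f) for _, f in samples])  # U(x) = -ln(f(x))

# SVD-based least squares; with fewer equations than unknowns this
# returns the minimum-norm solution.
alphas, *_ = np.linalg.lstsq(A, b, rcond=None)
for clique, a in zip(CLIQUES, alphas):
    print("alpha_" + "".join(map(str, clique)), "=", round(float(a), 2))
print("c =", round(float(alphas[-1]), 2))
```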

  8. MFM Predicts Fitness
     • Example: for individual x = 1011
     • Substitute the variable values into the energy function and solve:

       $$U(x) = \alpha_0 - \alpha_1 + \alpha_2 + \alpha_3 - \alpha_{01} + \alpha_{02} + \alpha_{03} - \alpha_{13} + \alpha_{23} - \alpha_{013} + \alpha_{023} + c$$
       $$f(x) = e^{-U(x)}$$

     • This can then be used to predict fitness as a surrogate (see the sketch below)
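A companion sketch of the prediction step, reusing the hypothetical CLIQUES list and the fitted alphas from the previous snippet:

```python
import numpy as np

def predict_fitness(bits, alphas, cliques):
    """Surrogate prediction: evaluate the energy U(x), return e^(-U)."""
    s = [2 * b - 1 for b in bits]          # 0/1 bits -> -1/+1 spins
    U = alphas[-1]                         # constant term c
    for clique, a in zip(cliques, alphas):
        U += a * np.prod([s[i] for i in clique])
    return np.exp(-U)

# e.g. predict_fitness([1, 0, 1, 1], alphas, CLIQUES)
```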

  9. MFM as a surrogate
     • Can either:
       – completely replace the fitness function (the GA essentially samples the MFM)
       – take a mixed approach, where the MFM is retrained occasionally and used to filter candidate solutions (sketched below)
     • e.g. speeding up benchmark fitness functions:
       – A. Brownlee, O. Regnier-Coudert, J. McCall, and S. Massie. Using a Markov network as a surrogate fitness function in a genetic algorithm. In Proc. IEEE CEC 2010, pp. 4525-4532
     • e.g. speeding up feature selection:
       – A. Brownlee, O. Regnier-Coudert, J. McCall, S. Massie, and S. Stulajter. An application of a GA with Markov network surrogate to feature selection. International Journal of Systems Science, 44(11):2039-2056, 2013
     • Now we consider how the model might be mined
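The mixed approach could be sketched roughly as below; this illustrates the filtering idea rather than the exact scheme from the cited papers, and the keep fraction is a hypothetical parameter:

```python
def filter_offspring(candidates, surrogate, true_fitness, keep=0.25):
    """Rank GA offspring by the cheap surrogate and spend expensive true
    fitness evaluations only on the most promising fraction."""
    ranked = sorted(candidates, key=surrogate, reverse=True)
    top = ranked[:max(1, int(keep * len(ranked)))]
    return [(cand, true_fitness(cand)) for cand in top]
```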

  10. Mining the model (1)

      $$\ln(f(x)) = -U(x)/T$$

      • As we minimise energy, we maximise fitness. So, to minimise energy, consider a univariate term $\alpha_i x_i$:
        – If the value taken by x_i is 1 (+1) in high-fitness solutions, then α_i will be negative
        – If the value taken by x_i is 0 (-1) in high-fitness solutions, then α_i will be positive
        – If no particular value is taken by x_i in optimal solutions, then α_i will be near zero

  11. Mining the model (2)

      $$\ln(f(x)) = -U(x)/T$$

      • As we minimise energy, we maximise fitness. So, to minimise energy, consider a bivariate term $\alpha_{ij} x_i x_j$:
        – If the values taken by x_i and x_j are equal (product +1) in the optimal solutions, then α_ij will be negative
        – If the values taken by x_i and x_j are opposite (product -1) in the optimal solutions, then α_ij will be positive
      • Higher-order interactions follow this pattern (a mining sketch follows below)
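These sign rules suggest a simple mining routine; a sketch under the same assumptions as the earlier snippets, with a hypothetical threshold standing in for "near zero":

```python
def mine_model(alphas, cliques, threshold=0.05):
    """Read sensitivities and linkage off the signs of the coefficients."""
    for clique, a in zip(cliques, alphas):
        if abs(a) < threshold:
            continue  # near zero: no strong preference for this term
        if len(clique) == 1:
            preferred = 1 if a < 0 else 0
            print(f"x{clique[0]} tends to be {preferred} in high-fitness solutions")
        elif len(clique) == 2:
            relation = "equal" if a < 0 else "opposite"
            print(f"x{clique[0]} and x{clique[1]} tend to take {relation} values")
```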

  12. Examples with Benchmarks
      • A few well-known benchmarks to get the idea
      • In these experiments, the MFM completely replaces the fitness function
      • Solutions are generated at random and used to train the model parameters

  13. Onemax
      • Fitness is the number of x_i set to 1 (reproduced in the sketch below)
      • [Plot: univariate alpha coefficients for a 100-bit Onemax; all coefficient values are negative (roughly -0.001 to -0.01), consistent with every bit preferring the value 1]
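The Onemax result is easy to reproduce qualitatively; a sketch assuming NumPy, fitting a univariate-only model to random samples (the problem and sample sizes are arbitrary choices, not taken from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, pop = 100, 2000

X = rng.integers(0, 2, size=(pop, n))   # random bit strings
f = X.sum(axis=1)                       # Onemax: count of ones

S = 2 * X - 1                           # -1/+1 spins
A = np.hstack([S, np.ones((pop, 1))])   # univariate terms plus constant
b = -np.log(np.maximum(f, 1))           # U(x) = -ln(f); guard f = 0

alphas, *_ = np.linalg.lstsq(A, b, rcond=None)
print((alphas[:-1] < 0).all())          # univariate alphas all negative
```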

  14. Checkerboard 2D
      • Form an s × s grid of the x_i: fitness is the count of neighbouring x_i taking opposite values (a sketch of this fitness function follows below)
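A sketch of this fitness function, assuming a 4-connected neighbourhood on the grid (an assumption consistent with the grid figure two slides on):

```python
def checkerboard_fitness(bits, s):
    """Count pairs of 4-connected neighbours on an s x s grid whose
    cells take opposite values."""
    grid = [bits[r * s:(r + 1) * s] for r in range(s)]
    count = 0
    for r in range(s):
        for c in range(s):
            if c + 1 < s and grid[r][c] != grid[r][c + 1]:
                count += 1  # horizontal neighbour differs
            if r + 1 < s and grid[r][c] != grid[r + 1][c]:
                count += 1  # vertical neighbour differs
    return count
```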

  15. Checkerboard 2D
      • [Plots: univariate alpha coefficients are scattered near zero (roughly ±0.001), while bivariate alpha coefficients are all positive (up to about 0.05), reflecting that neighbouring bits prefer opposite values]

  16. Checkerboard 2D
      • [Figure: the 5 × 5 grid of variables x1-x25, with numbered bivariate interactions recovered by the model linking neighbouring cells]

  17. Real-World Example: Cellular Windows
      • Optimise glazing for an atrium in a building
      • Switch glazing on or off in 120 cells: a 120-bit encoding
      • Minimise energy use and construction cost
        – Energy for lighting, heating and cooling
        – Costly to compute, motivating the use of a surrogate

  18. Optimisation run
      • The optimisation run used NSGA-II to find an approximation to the Pareto-optimal set of solutions

  19. Optimisation run
      • The trade-off front and the specific designs along it are already helpful for a decision maker
      • But:
        – The lowest-cost solution is missing due to randomness
        – The window shapes are slightly odd
      • What might be the impact of aesthetic changes to these solutions?

  20. Adding value
      • An earlier paper tried two approaches
      • Frequency with which cells are glazed across the approximated Pareto-optimal sets (sketched below):
        + shows glazing cells common to all optima
        + cheap to compute
        - unclear how cells affect the objectives separately
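The frequency analysis amounts to a per-cell mean over the non-dominated designs; a one-line sketch assuming NumPy and a list of 120-bit designs (function and argument names are illustrative):

```python
import numpy as np

def glazing_frequency(pareto_designs):
    """Fraction of approximated Pareto-optimal designs in which each cell
    is glazed; values near 1.0 mark cells common to all optima."""
    return np.asarray(pareto_designs).mean(axis=0)
```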

  21. Adding value
      • Local sensitivity: the Hamming-1 neighbourhood of the approximated Pareto-optimal solutions (sketched below):
        + shows possible local improvements
        + shows impact on the objectives separately
        - needs further fitness evaluations
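A sketch of the Hamming-1 sensitivity analysis; here evaluate is an assumed callback returning a tuple of objective values (energy use, cost), and each call is one of the further expensive fitness evaluations the slide notes as a drawback:

```python
def hamming1_sensitivity(design, evaluate):
    """Flip each bit of a design in turn and record the change in each
    objective relative to the unmodified design."""
    base = evaluate(design)
    deltas = []
    for i in range(len(design)):
        neighbour = list(design)
        neighbour[i] = 1 - neighbour[i]   # toggle cell i's glazing
        obj = evaluate(neighbour)
        deltas.append(tuple(o - b for o, b in zip(obj, base)))
    return deltas  # deltas[i]: objective changes when cell i is toggled
```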
