
BOAT: Building Auto-Tuners with Structured Bayesian Optimization (presentation slides)



  1. BOAT: Building Auto-Tuners with Structured Bayesian Optimization. Valentin Dalibard, Michael Schaarschmidt, Eiko Yoneki. Presented by Jesse Mu.

  2. Parameters in large-scale systems range from coarse to fine: number of cluster nodes, ML hyperparameters, compiler flags. How do we optimize parameters θ? Minimize some cost function f(θ), where cost is runtime, memory, I/O, etc.
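To make the setup concrete, here is a toy sketch of a configuration θ and a cost function f(θ) measured as runtime; the workload and parameter names are invented for illustration.

```python
import time

def workload(buffer_size: int) -> None:
    # Stand-in for a real system run; buffer_size plays the role of θ.
    data = list(range(100_000))
    chunks = [data[i:i + buffer_size] for i in range(0, len(data), buffer_size)]
    _ = [sum(chunk) for chunk in chunks]

def f(theta: int) -> float:
    # f(θ): the cost of configuration θ, here wall-clock runtime in seconds.
    start = time.perf_counter()
    workload(buffer_size=theta)
    return time.perf_counter() - start

print(f(16), f(1024))   # two evaluations of the cost function
```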

  3. Auto-tuning (optimization):
     ● Grid search: θ ∈ [1, 2, 3, …]
     ● Evolutionary approaches
     ● Hill-climbing
     ● Bayesian optimization (e.g. SPEARMINT)

  4. Auto-tuning (optimization) in distributed systems:
     ● Grid search, evolutionary approaches, and hill-climbing require 1000s of evaluations of the cost function!
     ● Bayesian optimization (e.g. SPEARMINT) fails in high dimensions! (a minimal generic loop is sketched below)
     ● Structured Bayesian optimization (this work: BespOke Auto-Tuners, i.e. BOAT)
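For contrast with the methods listed above, here is a minimal sketch of a generic Bayesian optimization loop (an illustration, not SPEARMINT's or BOAT's implementation): a Gaussian process surrogate is refit after each measurement, and a lower-confidence-bound rule picks the next configuration, so far fewer cost evaluations are needed than grid or evolutionary search.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def cost(theta: float) -> float:
    # Stand-in for a real measurement (runtime, memory, I/O, ...).
    return (theta - 3.7) ** 2 + np.random.normal(scale=0.1)

X, y = [[0.0], [10.0]], [cost(0.0), cost(10.0)]          # two seed evaluations
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(20):
    gp.fit(np.array(X), np.array(y))                      # refit the surrogate
    cand = np.random.uniform(0.0, 10.0, size=(1000, 1))   # candidate configs
    mu, sigma = gp.predict(cand, return_std=True)
    lcb = mu - 2.0 * sigma                                # optimistic cost bound
    theta = float(cand[np.argmin(lcb), 0])                # most promising config
    X.append([theta])
    y.append(cost(theta))

print("best config:", X[int(np.argmin(y))], "cost:", min(y))
```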

  5. Gaussian Processes [figure panels: data, prior, posterior]. From Carl Rasmussen’s 4F13 lectures: http://mlg.eng.cam.ac.uk/teaching/4f13/1718/gp%20and%20data.pdf
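A small sketch of what the data/prior/posterior panels show, using a plain-NumPy Gaussian process with a squared-exponential kernel (the kernel and data here are assumptions, not the lecture's):

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # Squared-exponential kernel k(x, x') = exp(-(x - x')^2 / (2 ell^2)).
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

xs = np.linspace(0.0, 5.0, 100)

# Prior: functions drawn from GP(0, k) before seeing any data.
prior_cov = rbf(xs, xs) + 1e-8 * np.eye(len(xs))          # jitter for stability
prior_samples = np.random.multivariate_normal(np.zeros(len(xs)), prior_cov, size=3)

# Posterior: condition the prior on observed data (X, y).
X = np.array([1.0, 2.5, 4.0])
y = np.array([0.5, -0.2, 0.8])
K = rbf(X, X) + 1e-6 * np.eye(len(X))                     # noisy observations
Ks = rbf(xs, X)
post_mean = Ks @ np.linalg.solve(K, y)
post_cov = rbf(xs, xs) - Ks @ np.linalg.solve(K, Ks.T)
```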

  6. The acquisition function: e.g. expected increase over the maximum performance so far (balancing exploration vs. exploitation).
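One standard instance of such an acquisition function is expected improvement; the sketch below computes it from a GP posterior (this particular formula is a common choice, not necessarily the talk's exact one):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    # mu, sigma: GP posterior mean/std at candidate configs.
    # f_best: best (maximum) performance observed so far.
    # EI(x) = E[max(f(x) - f_best, 0)]: large when the mean beats f_best
    # (exploitation) or when uncertainty is high (exploration).
    sigma = np.maximum(sigma, 1e-12)       # guard against zero variance
    z = (mu - f_best) / sigma
    return (mu - f_best) * norm.cdf(z) + sigma * norm.pdf(z)
```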

  7. Bayesian Optimization [diagram: a Gaussian process serves as the surrogate model].

  8. Structured Bayesian Optimization (SBO) [diagram: the Gaussian process is replaced by a developer-specified, semi-parametric model of performance, built from observed performance + arbitrary runtime characteristics].
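One way to read "semi-parametric" in code (my illustration, not BOAT's API): a developer-specified parametric trend plus a GP fit to its residuals, so the generic component absorbs whatever the structure gets wrong.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def parametric_trend(thetas, w):
    # Developer-specified structure, e.g. cost ~ w0 + w1 / theta.
    return w[0] + w[1] / thetas

class SemiParametricModel:
    def __init__(self, w):
        self.w = w                                    # parametric component
        self.gp = GaussianProcessRegressor(normalize_y=True)

    def fit(self, thetas, costs):
        resid = costs - parametric_trend(thetas, self.w)
        self.gp.fit(thetas.reshape(-1, 1), resid)     # GP absorbs model error

    def predict(self, thetas):
        return parametric_trend(thetas, self.w) + self.gp.predict(thetas.reshape(-1, 1))

model = SemiParametricModel(w=[1.0, 9.0])
model.fit(np.array([1.0, 2.0, 4.0, 8.0]), np.array([10.2, 5.4, 3.1, 2.0]))
print(model.predict(np.array([3.0])))
```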

  9. Probabilistic models for SBO [figure comparing model classes]: purely parametric models are too restrictive, fully generic models (plain GPs) are too generic, semi-parametric models are just right.

  10. Semi-parametric models in SBO:
      ● Specify the parametric component only (the GP comes for free)
      ● e.g. predict GC rate from the JVM eden size, with prior: malloc rate ~ Uniform(0, 5000) (a sketch of this model follows below)
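A hypothetical sketch of the eden-size example from the list above. The functional form gc_rate ≈ malloc_rate / eden_size and all units are assumptions; only the Uniform(0, 5000) prior comes from the slide.

```python
import numpy as np

def log_prior(malloc_rate):
    # From the slide: malloc rate ~ Uniform(0, 5000).
    return 0.0 if 0.0 < malloc_rate < 5000.0 else -np.inf

def log_likelihood(malloc_rate, eden_sizes, observed_gc_rates, noise=0.5):
    # GCs fire when eden fills, so gc_rate ≈ malloc_rate / eden_size
    # (assumed functional form; Gaussian observation noise also assumed).
    predicted = malloc_rate / eden_sizes
    return -0.5 * np.sum(((observed_gc_rates - predicted) / noise) ** 2)

def log_posterior(malloc_rate, eden_sizes, observed_gc_rates):
    return log_prior(malloc_rate) + log_likelihood(malloc_rate, eden_sizes, observed_gc_rates)

# With runtime measurements of GC rate at a few eden sizes, the posterior
# over malloc_rate pins down the model after only a handful of observations.
eden = np.array([256.0, 512.0, 1024.0])
gc = np.array([4.1, 1.9, 1.0])
rates = np.linspace(1.0, 4999.0, 5000)
best = rates[np.argmax([log_posterior(r, eden, gc) for r in rates])]
print("MAP malloc rate ≈", best)
```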

  11. Composing semi-parametric models: models are wired into a dataflow DAG, and inference exploits the conditional independence between models (a sketch follows below).
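A toy sketch of the dataflow-DAG idea (the structure is invented for illustration): one model predicts a runtime characteristic, a second consumes it to predict the objective, and each model depends only on its parents, which is the conditional independence inference exploits.

```python
class GCModel:
    # Predicts a runtime characteristic (GC invocations per second).
    def predict(self, eden_size: float, malloc_rate: float) -> float:
        return malloc_rate / eden_size

class LatencyModel:
    # Consumes the GC model's output to predict the final objective.
    def predict(self, gc_rate: float, gc_pause_ms: float) -> float:
        return gc_rate * gc_pause_ms        # added latency per second, in ms

# Dataflow DAG: (eden_size, malloc_rate) -> gc_rate -> latency.
# Each model depends only on its parents, so the two models can be
# fit and updated independently.
gc_rate = GCModel().predict(eden_size=512.0, malloc_rate=1024.0)
latency = LatencyModel().predict(gc_rate=gc_rate, gc_pause_ms=12.0)
print(f"GC rate: {gc_rate:.2f}/s, added latency: {latency:.1f} ms per second")
```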

  12. SBO: Summary. The developer provides:
      1. Configuration space (i.e. possible params) [standard]
      2. Objective function + runtime measurements [standard]
      3. Semi-parametric model of the system [new]
      Key: try the generic system first, before optimizing with structure (a hypothetical interface sketch follows below).
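A hypothetical interface for the three ingredients above; this is not BOAT's real API, and all names are invented. The proposal step falls back to random search here; a structured model would replace it with an acquisition function over the model's posterior.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

@dataclass
class Tuner:
    config_space: Dict[str, Tuple[float, float]]     # 1. possible params (standard)
    objective: Callable[[Dict[str, float]], float]   # 2. measured cost (standard)
    model: Optional[object] = None                   # 3. semi-parametric model (new)

    def propose(self) -> Dict[str, float]:
        # Fallback: random proposal. With a structured model, this is where an
        # acquisition function over the model's posterior would go instead.
        return {k: random.uniform(lo, hi) for k, (lo, hi) in self.config_space.items()}

    def run(self, iterations: int = 50) -> Dict[str, float]:
        best_cfg, best_cost = None, float("inf")
        for _ in range(iterations):
            cfg = self.propose()
            c = self.objective(cfg)                  # one cost measurement
            if c < best_cost:
                best_cfg, best_cost = cfg, c
        return best_cfg

tuner = Tuner(
    config_space={"eden_size_mb": (64.0, 4096.0)},
    objective=lambda cfg: 1000.0 / cfg["eden_size_mb"],   # toy stand-in cost
)
print(tuner.run())
```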

  13. Evaluation: Cassandra GC. The best parameters found outperform the Cassandra defaults by 63%; existing systems converge, but take 6x longer.

  14. Evaluation: neural net SGD. Load balancing and worker allocation over 10 machines = 30 params. Default configuration: 9.82s; OpenTuner: 8.71s; BOAT: 4.31s (≈2.3x faster than the default). Existing systems don’t converge!

  15. Review: overall, a good, unsurprising contribution.
      ● Theory
        ○ Unsurprising that expert-developed models optimize better! Tradeoff: developer hours vs. machine hours.
        ○ The Cassandra GC system converges in 2 iterations, so the model is near-perfect. What happens when the parametric model is wrong?
          ■ More details are needed on the tradeoff between the parametric model and a generic GP.
          ■ Compare OpenTuner, which builds an ensemble of multiple search techniques.
      ● Implementation
        ○ Cross-validation?
        ○ Key for system adoption: make the interface as high-level as possible.
      ● Evaluation
        ○ What happens when # params >> 30?
        ○ “DAGModels help debugging”... how?
