To tune or not to tune
Thomas Pasquier
tfjmp@cs.ubc.ca https://tfjmp.org
The team
- Ayat Fekry, PhD student
- Lucian Carata, Senior Research Associate
- Andrew Rice, Professor
- Andy Hopper, Professor
About me
- Assistant Professor
Big Data Analytics”, ICDCS 2019
Configuration Tuner for Big Data Analytics”, arxiv 2020
Optimal Configurations for Data Analytics”, KDD 2020
Big Data Analytics with Similarity-aware Multitask Bayesian Optimization”, BigData 2020
https://github.com/ayat-khairy/tuneful-data
How can we help?
One model does not fit all
Amazon/Google provide configurations for Spark clusters (from experiment: 25% to 63% slower than
Significant parameters analysis
Can we identify a better configuration while doing useful work?
https://github.com/ayat-khairy/tuneful-code
split
(i.e. “fix” them)
Feature Elimination* as ground truth
* Isabelle Guyon, Jason Weston, Stephen Barnhill, and Vladimir Vapnik. Gene selection for cancer classification using support vector machines. Machine learning. 2002.
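As a rough illustration of how recursive feature elimination (the technique introduced in the Guyon et al. paper cited above) can rank configuration parameters, here is a minimal sketch. It is not the procedure used in the talk: it swaps in a plain linear model fitted by gradient descent instead of an SVM, and the data is synthetic, with five hypothetical parameters of which only two affect the simulated execution time.

```python
import random

def fit_linear(X, y, steps=200, lr=0.5):
    """Least-squares weights via plain gradient descent (no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        residuals = [sum(wj * xj for wj, xj in zip(w, row)) - yi
                     for row, yi in zip(X, y)]
        grad = [2.0 / n * sum(r * row[j] for r, row in zip(residuals, X))
                for j in range(d)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def rfe(X, y, keep):
    """Repeatedly refit and drop the feature with the smallest |weight|."""
    active = list(range(len(X[0])))  # indices of features still in play
    while len(active) > keep:
        X_active = [[row[j] for j in active] for row in X]
        w = fit_linear(X_active, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active

# Synthetic data (assumption): only parameters 0 and 2 influence the target.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(200)]
y = [5.0 * row[0] + 3.0 * row[2] + rng.gauss(0, 0.1) for row in X]

selected = rfe(X, y, keep=2)
print(selected)
```

The surviving indices can then serve as a reference set against which a cheaper significance analysis is compared.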
budget
* Guangdeng Liao et al. Gunther: Search-based auto-tuning of MapReduce.
+ Jason Ansel et al. OpenTuner: An extensible framework for program autotuning.
10% of optimum
samples but how fast they execute
GP converge towards the optimum and therefore reduce cost
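The point that search cost depends on how fast the samples execute, and not only on how many there are, can be made concrete with a small sketch. The function name, the 10% tolerance default, and all execution times below are hypothetical values chosen for illustration; they are not measurements from the talk.

```python
def search_cost(sample_times, optimum, tolerance=0.10):
    """Return (samples_used, total_time) spent until one sample runs within
    `tolerance` of `optimum`; (None, total_time) if no sample gets there."""
    threshold = optimum * (1.0 + tolerance)
    total = 0.0
    for i, t in enumerate(sample_times, start=1):
        total += t  # every sampled configuration is a real execution we pay for
        if t <= threshold:
            return i, total
    return None, total

# Hypothetical execution times (seconds) of the configurations two tuners try.
tuner_a = [300.0, 280.0, 150.0, 108.0]         # fewer samples, slow ones
tuner_b = [120.0, 118.0, 115.0, 112.0, 109.0]  # more samples, fast ones

print(search_cost(tuner_a, optimum=100.0))
print(search_cost(tuner_b, optimum=100.0))
```

In this made-up case the second tuner needs one more sample but spends less total time, which is the distinction the slide draws.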
and see if we save money
won’t ;)
converge to some local minima eventually
cost at the start of the GP, then stabilise to close to
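One way to check whether tuning saves money is to compare cumulative cost against always running an untuned configuration. The sketch below assumes per-run costs that start high while the model explores and then stabilise; the function name and every number are hypothetical.

```python
from itertools import accumulate

def break_even_run(tuned_costs, default_cost):
    """Return the 1-based run at which cumulative cost with tuning first drops
    to or below the cumulative cost of the default configuration, else None."""
    for n, spent in enumerate(accumulate(tuned_costs), start=1):
        if spent <= n * default_cost:
            return n
    return None

# Hypothetical per-run costs: expensive exploration first, then a stable level.
tuned = [14.0, 13.0, 11.0, 8.0, 7.0, 7.0, 7.0, 7.0]
default = 10.0  # assumed cost of every run under the untuned configuration

print(break_even_run(tuned, default))
```

With these assumed numbers the tuning overhead is recovered by the sixth run; a workload that runs fewer times than that would not benefit.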
cluster
Maybe we should?
via lower dimension projection
similar execution parameters
Process to optimize config.
configuration much faster
before
Budget:
We need to find a configuration within 10% of the optimum.
Shorter sample execution time
Simtune generally does much better!
Simtune performs better
Able to leverage information from more workloads
Configurations” EuroSys 2020
… hiring students for fall 2021 at UBC looking for collaboration!