Moritz Meister moritz@logicalclocks.com @morimeister
Maggy - Open-Source Asynchronous Distributed Hyperparameter Optimization Based on Apache Spark
FOSDEM’20 02.02.2020
The Bitter Lesson (of AI)*
“The two (general purpose) methods that seem to scale ... are search and learning.”*
“Methods that scale with computation are the future of AI.”**
— Rich Sutton (Father of Reinforcement Learning)
* http://www.incompleteideas.net/IncIdeas/BitterLesson.html
** https://www.youtube.com/watch?v=EeMCEQa85tw
Reducing generalization error: better regularization methods, better model design, hyperparameter optimization, larger training datasets, better optimization algorithms. All of these increasingly rely on distribution.
Diagram: the outer loop (search) and the inner loop (learning). A search method proposes trials (hyperparameters, architecture, etc.); workers 1..N train on the training data, exchanging gradients ∆1..∆N at synchronization points, and report a metric back to the search method. (Image: http://tiny.cc/51yjdz)
The ML pipeline: Data Pipelines (Ingest & Prep) → Feature Store → Machine Learning Experiments (Explore/Design Model, Hyperparameter Optimization, Ablation Studies, Data Parallel Training) → Model Serving
The distribution-oblivious training function (pseudo-code) forms the inner loop:
1. Set hyperparameters
2. Train model
3. Evaluate performance
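The inner loop above can be sketched as a plain Python function. Everything here (the toy "model" and the analytic loss) is illustrative only, not Maggy's API; the point is that the function only sets hyperparameters, trains, and returns a metric, knowing nothing about where or how trials are scheduled:

```python
import random

def train_fn(lr, num_layers):
    """Distribution-oblivious training function: set hyperparameters,
    train, evaluate, return a metric. No knowledge of scheduling."""
    random.seed(42)  # deterministic toy "training" for illustration
    # Stand-in for: build a model with `num_layers`, train with `lr`,
    # then evaluate on a validation set.
    loss = (lr - 0.01) ** 2 + 0.001 * num_layers + random.random() * 1e-6
    return loss  # the metric the outer-loop search method optimizes

metric = train_fn(lr=0.01, num_layers=3)
```

Because the function is oblivious to its execution context, the same code can run locally for debugging or be shipped to workers by a framework.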
The outer loop: meta-level learning over a search space, treating the training function as a black box that yields a metric.
A global controller drives the outer loop: it maintains a queue of trials and dispatches them to parallel workers, each running the black-box training function and returning a metric. Given a search space or an ablation study, this raises several questions: Which algorithm to use for search? How to monitor progress? How to handle fault tolerance? How to aggregate results?
This should be managed with platform support!
Maggy is a flexible framework for asynchronous parallel execution of trials for ML experiments on Hopsworks. It supports ASHA, random search, grid search, LOCO ablation, Bayesian optimization, and more to come.
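One way to picture the global controller with a trial queue and parallel workers is the stdlib-only sketch below. This is not Maggy's implementation; it shows the shape of random search driven through a queue, with the search space, the toy training function, and the worker count all assumed for illustration:

```python
import queue
import random
import threading

def make_trial():
    # Random search: sample hyperparameters from the search space
    return {"lr": random.uniform(1e-4, 1e-1), "dropout": random.uniform(0.0, 0.5)}

def train(hparams):
    # Stand-in for a real training run; returns a metric to minimize
    return (hparams["lr"] - 0.01) ** 2 + hparams["dropout"] * 0.1

def worker(trials, results):
    while True:
        hparams = trials.get()
        if hparams is None:          # poison pill: no more trials
            break
        results.append((train(hparams), hparams))

trials, results = queue.Queue(), []
threads = [threading.Thread(target=worker, args=(trials, results)) for _ in range(4)]
for t in threads:
    t.start()
for _ in range(20):                  # controller enqueues 20 random trials
    trials.put(make_trial())
for _ in threads:                    # one poison pill per worker
    trials.put(None)
for t in threads:
    t.join()

best_metric, best_hparams = min(results, key=lambda r: r[0])
```

Swapping `make_trial` for a smarter proposal function (Bayesian optimization, ASHA promotions) changes the search method without touching the workers.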
Diagram: Spark's bulk-synchronous execution. The driver launches stages of tasks (Task 1..N per stage) reading from HDFS, with a barrier between stages; per-task metrics are only collected at each barrier. With early stopping, this model wastes compute: trials that could be stopped early must still run to the barrier, and fast tasks sit idle waiting for stragglers.
Two remedies: early stopping and multi-fidelity methods (e.g. ASHA).
Animation: https://blog.ml.cmu.edu/2018/12/12/massively-parallel-hyperparameter-optimization/
Liam Li et al. “Massively Parallel Hyperparameter Tuning”. In: CoRR abs/1810.05934 (2018).
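The multi-fidelity idea behind ASHA can be illustrated with plain (synchronous) successive halving: evaluate many configurations on a small budget, keep the best fraction, and grow the budget. This toy sketch is not the asynchronous algorithm from the Li et al. paper; the noisy `evaluate` function and all constants are assumptions for illustration:

```python
import random

def evaluate(config, budget):
    # Toy stand-in: more budget -> lower-noise estimate of the true loss
    true_loss = (config["lr"] - 0.01) ** 2
    return true_loss + random.gauss(0, 1.0 / budget)

def successive_halving(n=27, eta=3, min_budget=1):
    random.seed(0)  # deterministic for illustration
    configs = [{"lr": random.uniform(1e-4, 1e-1)} for _ in range(n)]
    budget = min_budget
    while len(configs) > 1:
        # Evaluate all survivors at the current budget, keep the top 1/eta
        scored = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = scored[: max(1, len(configs) // eta)]
        budget *= eta  # promoted configs get eta times more budget
    return configs[0]

best = successive_halving()
```

ASHA's contribution is making the promotion decisions asynchronous, so workers never wait at a rung boundary for stragglers.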
Replacing the Maggy optimizer with an ablator: leave-one-component-out (LOCO) ablation over features from the Feature Store (e.g. Titanic features such as PClass, name, and sex versus survival).
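LOCO ablation can be sketched as a loop that retrains with each feature removed and measures the metric drop. The feature names echo the Titanic example above, but the scoring function and its made-up "importances" are purely illustrative stand-ins for real training:

```python
def train_and_score(features):
    # Toy stand-in for training a model on the given feature subset;
    # the weights are made-up importances, for illustration only.
    importance = {"pclass": 0.2, "name": 0.0, "sex": 0.5, "age": 0.1}
    return sum(importance[f] for f in features)

def loco_ablation(features):
    """Leave-One-Component-Out: retrain without each feature in turn;
    the drop in the metric is that feature's contribution."""
    base = train_and_score(features)
    return {f: base - train_and_score([g for g in features if g != f])
            for f in features}

drops = loco_ablation(["pclass", "name", "sex", "age"])
```

The same loop works over model components (layers, modules) instead of features, which is how ablation studies reuse the trial-execution machinery built for hyperparameter search.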
How can we fit this into the bulk synchronous execution model of Spark? Mismatch: Spark Tasks and Stages vs. Trials
Databricks’ approach: Project Hydrogen (barrier execution mode) & SparkTrials in Hyperopt
Diagram: Maggy keeps long-running tasks inside a single barrier stage and adds communication with the driver: workers send metrics to the driver, and the driver sends back new trials or early-stop signals.
Hyperopt, in contrast, runs one job per trial, requiring many threads on the driver.
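Reduced to stdlib threads, the pattern of long-running tasks that heartbeat metrics and check for early-stop signals looks roughly like this. The RPC layer is replaced by a shared queue and events, purely for illustration; the step counts and loss schedule are assumptions:

```python
import queue
import threading

def worker_task(trial_id, metrics_q, stop_flags):
    """Long-running task: trains in steps, heartbeats the metric to the
    driver, and checks whether the driver asked it to stop early."""
    loss = 10.0 * (trial_id + 1)
    for step in range(100):
        loss *= 0.9                              # one 'training step'
        metrics_q.put((trial_id, step, loss))    # heartbeat to driver
        if stop_flags[trial_id].is_set():        # driver says: stop early
            return
    return

metrics_q = queue.Queue()
stop_flags = {0: threading.Event(), 1: threading.Event()}
workers = [threading.Thread(target=worker_task, args=(i, metrics_q, stop_flags))
           for i in stop_flags]

# Driver side: pretend the controller already decided trial 1 is hopeless,
# so its stop flag is set and the task exits after its first heartbeat.
stop_flags[1].set()
for w in workers:
    w.start()
for w in workers:
    w.join()
```

The key difference from the bulk-synchronous model is that the stopped task frees its slot immediately instead of running to the next barrier.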
Chart: validation performance of ASHA, random search with early stopping (RS-ES), and random search without early stopping (RS-NS) on a hyperparameter optimization task.
Maggy executes trials asynchronously while maintaining transparency and reproducibility of experiments.
Thanks to the entire Logical Clocks Team ☺ Contributions from colleagues: Robin Andersson @robzor92 Sina Sheikholeslami @cutlash Kim Hammar @KimHammar1 Alex Ormenisan @alex_ormenisan
https://github.com/logicalclocks/maggy https://maggy.readthedocs.io/en/latest/
https://github.com/logicalclocks/hopsworks https://www.logicalclocks.com/whitepapers/hopsworks
https://www.logicalclocks.com/feature-store/
@hopsworks