
ASK: Adaptive Sampling Kit



  1. ASK: Adaptive Sampling Kit
     P. de Oliveira Castro, E. Petit, JC. Beyler, W. Jalby
     Université de Versailles St-Quentin-en-Yvelines – Exascale Computing Research
     2012/08/29
     P. Oliveira et al (UVSQ/ECR) ASK 2012/08/29 1 / 17

  2. Outline
     1 Building Empirical Performance Models
     2 Adaptive Sampling Kit
     3 Hierarchical Variance Sampling
     4 Evaluation

  3. Motivation: Building Performance Models
     Building performance models is important to:
     ◮ Understand performance bottlenecks
     ◮ Optimize applications
     ◮ Find the best architecture for a given application (co-design)

  4. Motivation: Building Performance Models
     How to model performance?
     ◮ Using simulators or analytical models
       ⋆ Architectures are complex and many factors interact (memory hierarchy, amount of parallelism, mapping, access patterns)
       ⋆ Often models are too complex or costly
     ◮ Black-box approach:
       ⋆ Measure performance for different hardware or software configurations (the design space)
       ⋆ Build an empirical model

  5. Design Space example: Jacobi Stencil code
     T, number of OpenMP threads, between 1 and 32
     N and M between 64 and 2048
     X, Y ∈ {1, 2, 4, 8, 16}
     Design space size around 31 × 10^8
     What is the performance for any combination of factors?
     [Figure: stencil diagram with dimensions labelled N, M, X and Y]
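The design space size quoted on this slide can be checked by counting configurations. The sketch below assumes that N and M take every integer value in their range, which the slide does not state explicitly; under that assumption the count lands close to the quoted 31 × 10^8.

```python
# Count the configurations of the Jacobi stencil design space from the slide.
threads = range(1, 33)           # T: 1..32 OpenMP threads
matrix_dims = range(64, 2049)    # N and M: 64..2048 (assumed: every integer)
block_sizes = [1, 2, 4, 8, 16]   # X and Y

# T, N, M, X and Y vary independently, so the sizes multiply.
size = len(threads) * len(matrix_dims) ** 2 * len(block_sizes) ** 2
print(size)                         # 3152180000
print(f"{size / 1e8:.1f} x 10^8")   # 31.5 x 10^8
```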

  6. Building empirical models
     Exhaustively measuring large design spaces is prohibitive.
     Build an accurate performance model with as few samples as possible
     Sampling method to select which points to measure
     ◮ Samples must be chosen with care or the model will be biased.
     Regression model to estimate the missing samples
     ◮ Linear, polynomial, SVM, Gaussian Process, Regression Trees, etc.
     No one-size-fits-all strategy:
     ◮ Depending on the design space response, some models and sampling methods will work better than others
     ◮ Important to try different strategies
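The two ingredients named on this slide, a sampling method and a regression model, can be illustrated with a minimal sketch. It is not ASK code: `measure` is a hypothetical stand-in for a costly measurement, the sampling method is plain uniform random sampling, and the model is a hand-written least-squares line.

```python
import random

def measure(x):
    """Hypothetical stand-in for a costly performance measurement."""
    return 3.0 * x + 2.0

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

rng = random.Random(0)
design_space = list(range(1000))          # toy one-factor design space
sampled = rng.sample(design_space, 20)    # sampling method: uniform random
a, b = fit_line(sampled, [measure(x) for x in sampled])

def predict(x):
    """Regression model: estimate the response at an unmeasured point."""
    return a * x + b
```

Only 20 of the 1000 points are measured; `predict` estimates the rest. A linear model is adequate here only because the toy response is itself linear, which is the slide's point about matching the model to the response.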

  7. Contributions
     The contributions of this work are:
     ◮ ASK open-source toolkit to build empirical models
       ⋆ Easy to try different sampling strategies
     ◮ A novel sampling strategy HVS
       ⋆ Evaluated on different performance characterization problems

  8. ASK: Adaptive Sampling Kit
     Adaptive Sampling Kit (ASK) is a toolkit for building empirical models
     Modular architecture for conducting experiments:
     ◮ Easy to combine different sampling strategies and models
     ◮ Gathers state-of-the-art sampling methods
     ◮ Provides visualization modules to supervise the sampling
     ◮ Provides control modules to stop the sampling when it is accurate enough
     [Diagram: 1. Bootstrap, 2. Source, 3. Model, 4. Sampler, 5. Control (decides when to stop sampling), Reporter (reports progress and predictive error)]
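The module names in the diagram suggest an iterative loop. The sketch below shows one way such a loop could be wired together; every function name, signature and the toy response are assumptions made for illustration, not ASK's actual interface.

```python
import random

rng = random.Random(1)

def bootstrap(n):            # 1. Bootstrap: initial points to measure
    return [rng.random() for _ in range(n)]

def source(points):          # 2. Source: measure the response at each point
    return [(x, x * x) for x in points]   # x*x is a toy response

def build_model(data):       # 3. Model: here, a nearest-neighbour lookup
    def predict(x):
        return min(data, key=lambda d: abs(d[0] - x))[1]
    return predict

def sampler(model, n):       # 4. Sampler: pick the next points (random here)
    return [rng.random() for _ in range(n)]

def control(data, budget):   # 5. Control: stop once the budget is spent
    return len(data) >= budget

data = source(bootstrap(10))
model = build_model(data)
while not control(data, budget=50):
    data += source(sampler(model, 10))
    model = build_model(data)
    print(f"reporter: {len(data)} samples")   # Reporter: progress only
```

An adaptive sampler would use `model` to decide where to sample next; this placeholder ignores it and samples at random.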

  9. Sampling methods included in ASK
     Sampling methods fall in two main categories
     Static methods: Space Filling Designs
     ◮ Select a set of samples covering the design space
     ◮ All points are measured in a single batch
       ⋆ Latin Hypercube
       ⋆ Maximin Latin Hypercube
       ⋆ Low discrepancy
       ⋆ Random
     Adaptive methods:
     ◮ Sampling iteratively adapts to the design space complexity
       ⋆ AMART [Li09]: a Query-By-Committee method
       ⋆ TGP + ALC [Gramacy09]: an Error-reduction method
       ⋆ HVS: a novel Error-reduction method that takes into account bias
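As an illustration of a static space-filling design, here is a minimal Latin hypercube sampler using only the standard library. It is a generic textbook construction on the unit cube, not the implementation shipped with ASK, and the function name is chosen for this example.

```python
import random

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims such that, in every
    dimension, each of the n_samples equal-width strata holds one point."""
    columns = []
    for _ in range(n_dims):
        # One point inside each stratum, then shuffle the stratum order
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

points = latin_hypercube(8, 2, random.Random(42))
for p in points:
    print(p)
```

All points are generated in one batch before any measurement, which is what distinguishes static designs from the adaptive methods listed on the slide.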

  10. Hierarchical Variance Sampling (1/2)
      Divide the space in regions using Regression Trees
      Compute the variance in each region
      Sample new points proportionally to: Variance upper bound × size of the region
      [Figure: response f(x) against factor x, showing samples and last iteration samples, with σ_ub and s annotated]
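The allocation rule on this slide can be sketched in one dimension. This is a simplified illustration, not HVS itself: it replaces the regression-tree partition with fixed equal-width regions and uses the plain sample variance instead of a variance upper bound. The function name and toy data are assumptions.

```python
import random
import statistics

def allocate_samples(samples, n_regions, n_new):
    """Split n_new samples across equal-width regions of [0, 1),
    proportionally to (sample variance of the response) x (region width)."""
    width = 1.0 / n_regions
    regions = [[] for _ in range(n_regions)]
    for x, y in samples:
        regions[min(int(x / width), n_regions - 1)].append(y)
    # Regions with fewer than two samples get zero weight in this sketch
    weights = [statistics.variance(ys) * width if len(ys) > 1 else 0.0
               for ys in regions]
    total = sum(weights)
    if total == 0:
        return [n_new // n_regions] * n_regions
    return [round(n_new * w / total) for w in weights]

rng = random.Random(0)
# Toy response: flat on [0, 0.5), noisy on [0.5, 1)
samples = [(x, 0.0 if x < 0.5 else rng.gauss(0, 1))
           for x in (i / 40 for i in range(40))]
counts = allocate_samples(samples, n_regions=2, n_new=20)
print(counts)   # [0, 20]
```

The flat half of the space receives no new samples and the noisy half receives all of them. Because of rounding, the counts need not sum exactly to `n_new` when more regions have non-zero weight.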

  11. Hierarchical Variance Sampling (2/2)

  16. ASK: Stencil code evaluation
      Despite using only 1500 points, HVS+GBM captures the performance features of the application.
      (25600 samples used as original response test set)
      32 cores Xeon X7550 2.00GHz

  17. ASK: Evaluating estimation error
      [Plot: RMSE against number of samples (200 to 1400) for the strategies AMART, HVS, HVSrelative, Latin and Random]
      Figure: Stencils, Root Mean Square Error for different ASK sampling strategies
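For reference, the root mean square error used as the metric in this figure is the square root of the mean squared difference between predicted and observed responses. A minimal sketch, with made-up numbers rather than data from the evaluation:

```python
import math

def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative values only: one prediction out of three is off by 2
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```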

  18. Using the Model for prediction
      [Plot: cycles per element against T (threads); series: true response, HVSrelative model, ideal linear scaling]
      Figure: Scalability of the 8x8 stencil on a 1000x1000 matrix
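The "ideal linear scaling" reference curve can be read as the single-thread cost divided by the thread count. The sketch below assumes that interpretation, which the slide does not spell out, and the cycle counts are hypothetical values, not taken from the plot.

```python
def ideal_cycles_per_element(single_thread_cycles, threads):
    """Cycles per element under ideal linear scaling: cost divides by T."""
    return single_thread_cycles / threads

def parallel_efficiency(single_thread_cycles, measured_cycles, threads):
    """Fraction of the ideal speedup actually achieved (1.0 = ideal)."""
    ideal = ideal_cycles_per_element(single_thread_cycles, threads)
    return ideal / measured_cycles

# Hypothetical numbers: 96 cycles/element at T=1, 8 cycles/element at T=16
print(ideal_cycles_per_element(96.0, 16))   # 6.0
print(parallel_efficiency(96.0, 8.0, 16))   # 0.75
```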

  19. Importance of selecting a good model
      Influence of alignment stream benchmark
      ◮ Three store streams hitting memory
      ◮ Memory offsets: S(k), S(V1 + k), S(V2 + k)
      ◮ 4K aliasing
      ◮ non aligned access overhead
      [Figure panels: SVM model, GBM model]

  20. Alternatives to ASK
      SUrrogate MOdeling Lab (SUMO) [Gorissen2010]
      ◮ Mature toolbox
      ◮ Includes many models and sampling methods
      ◮ Automatic tuning of model parameters
      ◮ Supports modeling of multiple responses
      ◮ ASK specifically targets performance characterization
        ⋆ AMART [Li09] and HVS methods have been evaluated on performance problems
      ◮ Only supports real-valued inputs
      ◮ Depends on Matlab and is not open-source (but freely available for academic use)
      Caret R package [Kuhn2012]
      ◮ Includes many models
      ◮ Automatic tuning of model parameters
      ◮ Does not handle sampling

  21. How to get ASK?
      ASK is open-source and available at
      ◮ http://code.google.com/p/adaptive-sampling-kit/
      The experimental data used in the paper is available at
      ◮ http://code.google.com/p/adaptive-sampling-kit/wiki/ExperimentalData
