
Modeling and Experimental Analysis of Virtualized Storage

Modeling and Experimental Analysis of Virtualized Storage Performance using IBM System z as Example. Diploma Thesis Presentation, October 12, 2012. Dominik Bruhn. Reviewers: Prof. Dr. Ralf H. Reussner, Prof. Dr. Walter F. Tichy


  1. Modeling and Experimental Analysis of Virtualized Storage Performance using IBM System z as Example
     Diploma Thesis Presentation, October 12, 2012
     Dominik Bruhn
     Reviewers: Prof. Dr. Ralf H. Reussner, Prof. Dr. Walter F. Tichy
     Advisors: Qais Noorshams, Dr. Samuel Kounev
     CHAIR FOR SOFTWARE DESIGN AND QUALITY, www.kit.edu
     KIT – University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association

  2. Motivation Introduction Foundations Methodology Results Related Work Conclusion October 12, 2012 2/25 Dominik Bruhn – Modeling and Experimental Analysis of Virtualized Storage Performance

  5. Motivation
     [Figure: applications (App A, App A') deployed on several systems, with question marks]

  6. Problem & Idea & Benefit & Action
     Problem: Complex systems with many layers; difficulty in obtaining good performance prediction models
     Idea: Derivation of storage performance models from systematic measurements using regression techniques
     Benefit: Possibility to predict the performance; easier decisions on configurations and systems
     Action: Creation and evaluation of performance models; evaluation of techniques and optimization possibilities

  7. Contribution
     Creation and evaluation of regression models for storage performance prediction
     Evaluation, analysis and comparison of regression techniques valid for storage performance prediction
     Repeatable process validated in a representative real-world environment

  8. System Under Study
     IBM System z: LPAR1 and LPAR2, each running an application on z/Linux; PR/SM (Hypervisor); Processors, Memory
     Connection: Switched Fibre Channel
     IBM DS8700: Storage Controller with Volatile Cache and Non-Volatile Cache; Fibre Channel; Hard disks (RAID)
     Storage-Performance-Influencing Factors (derived from Noorshams et al. (2012)):
       Workload: Requests (Size, Mix, Pattern), Locality
       System: Operating System (File System, I/O Scheduler), Hardware

  9. Modeling
     Regression Models: training data (independent variables and a dependent variable) feeds a regression model in two steps: 1. Training, 2. Prediction
     [Figure: Black Box Model and Introspection, each shown with a regression model; labels A, B, C, D, E]

  10. Regression Techniques
      Linear Regression Models: example fit $y = -1.884 + 1.293\,x_1$. Parameters: None
      MARS: example fit $y = 1.014501 + 1.72866\,h(x_1 - 3.25)$. Parameters: nk, threshold
      [Figures: example fits, y over x1]
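
The two example equations on this slide can be evaluated directly. The sketch below assumes that `h` is the usual MARS hinge function, h(z) = max(0, z), which the slide does not spell out; the function names are illustrative only.

```python
def hinge(z):
    """Hinge function assumed for h: zero below the knot, linear above it."""
    return max(0.0, z)


def linear_model(x1):
    """Example linear regression fit from the slide."""
    return -1.884 + 1.293 * x1


def mars_model(x1):
    """Example MARS fit from the slide, with a single knot at x1 = 3.25."""
    return 1.014501 + 1.72866 * hinge(x1 - 3.25)


if __name__ == "__main__":
    for x1 in (2.0, 3.25, 6.0):
        print(x1, linear_model(x1), mars_model(x1))
```

Under this assumption the MARS example is constant below the knot at 3.25 and rises linearly above it, whereas the linear model has the same slope everywhere.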

  11. Regression Techniques
      Regression Trees (CART): example tree with splits x1 < 4.5 and x1 < 6.75. Parameters: minsplit, cp
      M5: example tree with split x1 ≤ 3.5 and linear leaf models LM0 and LM1. Parameter: nsplits
      [Figures: example fits, y over x1, with tree diagrams; values shown: (Intercept) 1, -3.34, 1.17; x1 1.53, 5.35, 8.93]

  12. Cross-Validation
      The samples are randomized and split into k folds. In each of the k trainings (first, second, ..., k-th), a different fold serves as test data and the remaining folds serve as training data.
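
The fold construction on this slide can be sketched in a few lines. This is a minimal illustration using only the standard library, not the tooling used in the thesis; the function name and the `seed` parameter are assumptions made for the example.

```python
import random


def k_fold_splits(samples, k, seed=0):
    """Yield (training, test) pairs for k-fold cross-validation.

    The samples are shuffled once, dealt into k folds, and each fold is
    used exactly once as test data while the others form the training data.
    """
    shuffled = list(samples)
    if k < 2 or k > len(shuffled):
        raise ValueError("k must be between 2 and the number of samples")
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield training, test


if __name__ == "__main__":
    for training, test in k_fold_splits(range(10), k=5):
        print("train:", sorted(training), "test:", sorted(test))
```

Every sample appears in exactly one test fold, so each measurement is used for validation once and for training k - 1 times.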

  13. Experimental Setup
      Workload Benchmark - FFSB: existing benchmark; at application layer
      System Setup: Virtual Machines: z/Linux, virtualized by PR/SM in an LPAR; DS8700 System Storage with 50 GB volatile and 2 GB non-volatile cache
      System Parameters:
        File system: ext4
        I/O scheduler: CFQ, NOOP
      Workload Parameters:
        Threads: 100
        File set size: 1 GB, 25 GB, 50 GB, 75 GB, 100 GB
        Request size: 4 KB, 8 KB, 12 KB, 16 KB, 20 KB, 24 KB, 28 KB, 32 KB
        Access pattern: random, sequential
        Read percentage: 0%, 25%, 30%, 50%, 70%, 75%, 100%
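
To get a feeling for the size of this parameter space, the values listed on the slide can be combined into a grid. The sketch below assumes a full-factorial design (every combination measured), which the slide does not state; the variable and key names are illustrative.

```python
from itertools import product

# Parameter values as listed on the slide (threads and file system are fixed).
io_schedulers = ["CFQ", "NOOP"]
file_set_sizes_gb = [1, 25, 50, 75, 100]
request_sizes_kb = [4, 8, 12, 16, 20, 24, 28, 32]
access_patterns = ["random", "sequential"]
read_percentages = [0, 25, 30, 50, 70, 75, 100]

# Assumption: every combination is one benchmark configuration.
configurations = [
    {
        "io_scheduler": scheduler,
        "file_set_size_gb": file_set,
        "request_size_kb": request,
        "access_pattern": pattern,
        "read_percentage": read_pct,
    }
    for scheduler, file_set, request, pattern, read_pct in product(
        io_schedulers,
        file_set_sizes_gb,
        request_sizes_kb,
        access_patterns,
        read_percentages,
    )
]

if __name__ == "__main__":
    print(len(configurations))  # 2 * 5 * 8 * 2 * 7 = 1120
```

If only a subset of the combinations was actually measured, the list would simply be filtered before running the benchmark.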

  14. Approach
      Goal/Question/Metric (GQM)
      Measurements Analysis
        Setup Analysis: Stability of the Results
        Parameter Analysis: Parameter Influence, Virtualization Influence
      Performance Modeling
        Model Analysis: Interpolation (Synthetic, Random), Extrapolation (Synthetic, Random), Reduced Training Sets, Nominal Split
        Technique Analysis: Quality Comparison, Parameter Tuning, Tradeoff Analysis

  15. Measurement Analysis - Results

  16. Performance Modeling - Results

  17. Interpolation Using Random Samples
      Question: What interpolation abilities do the regression models show when tested using newly collected samples?
      Method: Creation of six regression models:
        Linear regression model (lm)
        Linear regression model including interaction terms (lm 5param inter)
        CART tree (cart)
        MARS model without interactions (mars)
        MARS model including all interaction terms (mars multi)
        M5 model (m5)
      Training using all measurements
      Validation using newly collected random samples
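
The difference between a plain linear model and one "including interaction terms" lies in the predictors it is given. The sketch below shows one possible expansion, assuming pairwise products only; the slide does not say which interaction order was used, and the function and feature names are hypothetical.

```python
from itertools import combinations


def with_interactions(features):
    """Return the features plus one product term per pair of features.

    `features` maps a predictor name to its numeric value. Pairwise
    interactions are an assumption made for this sketch.
    """
    expanded = dict(features)
    for (name_a, value_a), (name_b, value_b) in combinations(features.items(), 2):
        expanded[f"{name_a}:{name_b}"] = value_a * value_b
    return expanded


if __name__ == "__main__":
    # Hypothetical sample with three numeric predictors.
    sample = {"request_size_kb": 8, "read_percentage": 50, "file_set_size_gb": 25}
    for name, value in with_interactions(sample).items():
        print(name, value)
```

With five predictors this yields 5 main effects plus 10 pairwise products, so a linear model fitted on the expanded set can capture effects such as the request size mattering more for one access mix than for another.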

  18. Interpolation Using Random Samples
      Relative error (%) per model:
        Model            read     write
        lm               79.34    62.28
        lm 5param inter  13.87    10.01
        cart             35.28    33.97
        mars             79.49    64.65
        mars multi       28.52    16.64
        m5                9.27    10.39
      Models without interactions (lm, mars) do not perform well.
      With an error of ∼10%, M5 works well.
      Linear regression with interactions works surprisingly well.
      CART and MARS (with interactions) rank in the middle.
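
The slide reports a relative error in percent without giving its formula. A common definition, assumed here, is the mean of |predicted - measured| / |measured| over all validation samples; the function name and the sample values are made up for illustration.

```python
def mean_relative_error(predicted, measured):
    """Mean relative error in percent (assumed definition).

    Each measured value must be non-zero, since it is used as the divisor.
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(p - m) / abs(m) for p, m in zip(predicted, measured)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical predictions and measurements, not data from the thesis.
    print(mean_relative_error([110.0, 90.0, 100.0], [100.0, 100.0, 100.0]))
```

Because the error is normalized by the measured value, configurations with low and high absolute performance contribute equally to the result.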

  19. Extrapolation Using Random Samples
      Question: What extrapolation abilities do the regression models show when tested using newly collected data?
      Relative error (%) per model:
        Model            read     write
        lm               97.27    84.33
        lm 5param inter  20.16    20.32
        cart             61.29    82.13
        mars             95.92    79.71
        mars multi       41.01    26.47
        m5               12.58    15.46
      Again, the models without interactions do not work well.
      CART models cannot be used for extrapolation.
      M5 still performs well with an error of ∼14%.
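
One common explanation for the weak extrapolation of CART, not stated on the slide itself, is that a regression tree predicts a constant per leaf, so any input beyond the training range receives the value of the outermost leaf. The toy sketch below uses made-up data and a fixed split threshold (a real CART implementation searches for its splits) and compares a one-split tree with a least-squares line.

```python
def fit_stump(xs, ys, threshold):
    """One-split regression tree: predicts the mean of y on each side."""
    left = [y for x, y in zip(xs, ys) if x < threshold]
    right = [y for x, y in zip(xs, ys) if x >= threshold]
    left_mean = sum(left) / len(left)
    right_mean = sum(right) / len(right)
    return lambda x: left_mean if x < threshold else right_mean


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


# Toy training data on x in 1..8 with a linear trend (not thesis data).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.0 * x for x in xs]
tree = fit_stump(xs, ys, threshold=4.5)
line = fit_line(xs, ys)

if __name__ == "__main__":
    for x in (8, 16):
        print(x, tree(x), line(x))
```

Outside the training range the tree keeps returning the mean of its last leaf, while the line continues the trend; models with linear terms in the leaves, such as M5, sit between these two behaviors.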
