Towards Understanding the Workload of a IaaS Cloud



  1. Towards Understanding the Workload of a IaaS Cloud
     Loïc Perennou
     Outscale, ISEP
     loic.perennou@outscale.com
     September 13, 2018

  2. Outline
     Introduction
     Data Collection
     Comparison of Outscale’s and Azure’s Workloads
     Relationship Between Tags and CPU Utilization

  4. Outscale
     ◮ Founded in 2010, acquired by Dassault Systèmes in 2017.
     ◮ Provides virtualized hardware such as VMs, and services to manage them.
     ◮ Develops its own orchestrator, TINA OS, compatible with Amazon EC2.

  5. Motivations
     ◮ We need to make resource allocation fit utilization.
     ◮ Utilization is unknown when a VM starts, but could be predicted by ML.
     ◮ Data must be available to propose and test models.

  6. Related Cloud Workload Traces

     Organization    Google     Eucalyptus Sys.  Bitbrains  Azure
     Year            2011       2014             2015       2017
     # jobs/VMs      0.7M jobs  9,173 VMs        1,750 VMs  2M VMs
     Resource usage  no         yes              yes        no
     Starts/stops    yes        yes              yes        no
     Reference       [1, 2]     [5]              [3]        [4]

     ◮ Problem: We are not sure if Outscale’s workload is similar to Azure’s.

  8. Overview
     [Figure: data collection architecture. Labels: user, TINA API, orchestrator, infrastructure, server, VM1, VM2, operating system, hardware, probe, counters, syslog, logs, database.]
     2 data sources:
     ◮ Logs of user actions from TINA OS
     ◮ Measurements of hardware utilization of Virtual Machines

  9. Descriptive Statistics
     ◮ 4 months
     ◮ 700 000 VMs in total
     ◮ 10 000 VMs running simultaneously

  11. Distribution of Resources Requested by VMs
      [Figure: % of VMs per size class for OSC client, OSC internal, OSC all and Microsoft. (a) cores requested: 1, 2, 4, 8, 16, 20, 24, 36. (b) RAM requested: [0;2[, [2;4[, [4;8[, [8;16[, [16;32[, [32;inf[.]
      ◮ Internal accounts at Outscale launch small VMs (tests).
      ◮ Clients create bigger VMs than at Microsoft.

  12. Distribution of Runtime
      [Figure: CDF of runtime, P{runtime < x} = y, in % against runtime in minutes (log scale, 10^0 to 10^5), for outscale client and microsoft.]
      ◮ The runtime of 65% of VMs is < 1 h.
      ◮ Client VMs run slightly longer than at Microsoft.
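
The CDF on this slide can be computed directly from a list of VM runtimes. The sketch below is only an illustration: the function name and the sample runtimes are hypothetical and do not come from the Outscale or Azure traces.

```python
from bisect import bisect_left

def empirical_cdf(runtimes_min, x):
    """Return P{runtime < x}: the fraction of runtimes strictly below x."""
    ordered = sorted(runtimes_min)
    return bisect_left(ordered, x) / len(ordered)

# Hypothetical runtimes in minutes (illustrative only, not trace data).
runtimes = [2, 5, 12, 30, 45, 59, 90, 240, 1440, 10080]
share_under_1h = empirical_cdf(runtimes, 60)
print(f"{share_under_1h:.0%} of VMs ran for less than 1 h")
```

Evaluating the function on a log-spaced grid of x values would give a curve comparable to the one in the figure.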

  13. VM Start Rate
      [Figure: number of VMs started (smoothed) against hour of week (0 to 168), for outscale_client and microsoft.]
      ◮ 2 peaks/day at Outscale, 1 at Microsoft.
      ◮ Less activity at Outscale on weekends.
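
A start-rate profile of this kind can be built by bucketing VM start timestamps into the 168 hours of the week. The sketch below assumes that hour 0 is Monday 00:00 and uses made-up timestamps; the smoothing and scaling applied in the figure are not specified on the slide and are left out.

```python
from collections import Counter
from datetime import datetime

def hour_of_week(ts):
    """Map a datetime to 0..167, with 0 = Monday 00:00-00:59 (assumed convention)."""
    return ts.weekday() * 24 + ts.hour

def starts_per_hour_of_week(start_times):
    """Count VM starts in each of the 168 hours of the week."""
    counts = Counter(hour_of_week(ts) for ts in start_times)
    return [counts.get(h, 0) for h in range(168)]

# Hypothetical start timestamps (illustrative only).
starts = [
    datetime(2018, 9, 10, 9, 15),   # Monday 09:15
    datetime(2018, 9, 10, 9, 40),   # Monday 09:40
    datetime(2018, 9, 14, 17, 5),   # Friday 17:05
    datetime(2018, 9, 16, 3, 0),    # Sunday 03:00
]
profile = starts_per_hour_of_week(starts)
```

With several weeks of data, the counts for the same hour of week would be averaged before plotting.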

  14. Relationship Between Start Time and Runtime
      ◮ Daily creation of VMs from Monday to Friday.
      ◮ VMs created on Friday run during the whole weekend.

  15. Conclusion on Workload Comparison
      ◮ Bigger requests, longer runtimes at Outscale.
      ◮ Relatively more activity during the week, less on weekends.
      ◮ Activity patterns exist, at least for some users.

  17. Definition of Tags
      Freely-typed string that describes a VM.
      ◮ Example (ideal): “Release 2.4 of Kafka used in production”.
      ◮ Example (real): “EV6MTNDBLU FUn3xlIATTiOAoDJYIeYGA MT Database2 0 420403n2q”.

  18. Methodology
      ◮ Group VMs according to their tags (clustering).
      ◮ Visualize the CPU utilization of VMs within each cluster.

  19. Convert Text Tags to Vectors for Clustering
      [Figure: Dictionary Vectorization]
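
The slide only names the technique, so the following is one plausible reading of dictionary vectorization, not the authors' implementation: each tag is split into tokens, every distinct token gets a column, and a tag becomes a vector of token counts. The tokenization rule, the function names and the sample tags are assumptions.

```python
import re

def tokenize(tag):
    """Lowercase a tag and split it on non-alphanumeric characters (assumed rule)."""
    return [t for t in re.split(r"[^a-z0-9]+", tag.lower()) if t]

def build_dictionary(tags):
    """Assign a column index to every distinct token, in sorted order."""
    vocab = sorted({tok for tag in tags for tok in tokenize(tag)})
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(tag, dictionary):
    """Count how often each dictionary token occurs in the tag."""
    vec = [0] * len(dictionary)
    for tok in tokenize(tag):
        if tok in dictionary:
            vec[dictionary[tok]] += 1
    return vec

# Hypothetical tags (illustrative only).
tags = ["kafka production", "kafka test", "database production"]
dictionary = build_dictionary(tags)
vectors = [vectorize(t, dictionary) for t in tags]
```

Tags that share tokens end up with vectors that are close to each other, which is what the clustering step relies on.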

  20. Hierarchical Clustering
      ◮ At the beginning, there is 1 group per vector.
      ◮ The two closest groups are merged (based on the distance between their elements).
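
The two steps above can be sketched as a small agglomerative loop. The slide does not say which distance or linkage is used, nor when merging stops; Euclidean distance, single linkage and a target number of groups (`n_groups`) are assumptions made for this illustration.

```python
from math import dist

def single_linkage(a, b):
    """Distance between two groups: that of their closest pair of elements."""
    return min(dist(p, q) for p in a for q in b)

def agglomerate(vectors, n_groups):
    """Merge the two closest groups repeatedly until n_groups remain."""
    groups = [[v] for v in vectors]          # 1 group per vector
    while len(groups) > n_groups:
        i, j = min(
            ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))),
            key=lambda ij: single_linkage(groups[ij[0]], groups[ij[1]]),
        )
        groups[i] = groups[i] + groups[j]    # merge the closest pair
        del groups[j]
    return groups

# Hypothetical 2-D vectors (illustrative only).
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters = agglomerate(points, 3)
```

This naive version recomputes all pairwise group distances at every merge, which is fine for a sketch but too slow for hundreds of thousands of VMs.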

  21. Visualization of the CPU utilization of tag groups
      [Figure: group A]
      Low utilization for every VM.

  22. Visualization of the CPU utilization of tag groups
      [Figure: group B]
      Tags alone fail to explain the variance.
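
One simple way to put numbers on the contrast between groups A and B is to compare the spread of CPU utilization inside each tag group. The sketch below uses invented utilization values and a hypothetical `summarize` helper; it is not the visualization used in the slides.

```python
from statistics import fmean, pstdev

def summarize(cpu_by_group):
    """Return {group: (mean, population standard deviation)} of CPU utilization."""
    return {name: (fmean(v), pstdev(v)) for name, v in cpu_by_group.items()}

# Hypothetical mean CPU utilization (%) per VM (illustrative only).
cpu_by_group = {
    "group A": [2.0, 3.0, 4.0, 3.0],
    "group B": [5.0, 50.0, 95.0, 20.0],
}
summary = summarize(cpu_by_group)
for name, (mean, spread) in summary.items():
    print(f"{name}: mean={mean:.1f}%  stdev={spread:.1f}%")
```

A small spread, as in the invented group A, means the tag group alone is a good predictor; a large spread, as in group B, means other features are needed.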

  23. Conclusion
      ◮ Resource allocation of VMs needs to be based on predicted utilization.
      ◮ Predictive models need data to be trained and tested.
      ◮ Outscale’s data differs from Azure’s, which justifies looking for our own models.
      ◮ Text information (tags) could provide interesting features (ongoing work).

  24. References I
      [1] J. Wilkes, “More Google cluster data.” Google research blog, Nov. 2011. Posted at http://googleresearch.blogspot.com/2011/11/more-google-cluster-data.html.
      [2] C. Reiss, A. Tumanov, G. R. Ganger, R. H. Katz, and M. A. Kozuch, “Heterogeneity and dynamicity of clouds at scale: Google trace analysis,” in Proceedings of the Third ACM Symposium on Cloud Computing, SoCC ’12, (New York, NY, USA), pp. 7:1–7:13, ACM, 2012.
      [3] S. Shen, V. v. Beek, and A. Iosup, “Statistical characterization of business-critical workloads hosted in cloud datacenters,” in 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 465–474, May 2015.

  25. References II
      [4] E. Cortez, A. Bonde, A. Muzio, M. Russinovich, M. Fontoura, and R. Bianchini, “Resource central: Understanding and predicting workloads for improved resource management in large cloud platforms,” in Proceedings of the 26th Symposium on Operating Systems Principles, SOSP ’17, (New York, NY, USA), pp. 153–167, ACM, 2017.
      [5] R. Wolski and J. Brevik, “Using parametric models to represent private cloud workloads,” IEEE Transactions on Services Computing, vol. 7, pp. 714–725, Oct 2014.
