Optimizing Client-side Resource Utilization in Public Clouds - PowerPoint PPT Presentation


  1. Optimizing Client-side Resource Utilization in Public Clouds Swapnil Haria, Mihir Patil, Haseeb Tariq, Anup Rathi

  2. Outline • Motivation • Solution • Implementation • Evaluation • Conclusion

  6. Cloud Services (Not a distraction anymore [1]) • 30% of total cloud revenue • Annual revenues crossed $5 billion • Other players: [logos not preserved in transcript] • [1] Jeff Bezos' Risky Bet, November 2006, http://www.bloomberg.com/bw/stories/2006-11-12/jeff-bezos-risky-bet

  9. Popularity • Zero up-front capital expenses • On-demand hardware availability • Flexible pricing options • "Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use." (EC2: Elastic Compute Cloud)

  11. Limitations • Resources are allocated in fixed-size chunks (EC2 instances), ranging from 1 core / 1 GB RAM to 36 cores / 244 GB RAM • Users must accurately predict application requirements: an undersized VM degrades performance, an oversized VM adds cost • Multiple applications, multiple VMs, no peace

  12. Challenges • Application requirements vary widely • Black Friday for e-commerce websites (source: http://www.xad.com/media-mentions/mobile-activity-on-xmas-eve-24-pct-higher-than-black-friday/)

  13. Challenges • Application requirements vary widely • Black Friday for e-commerce websites • Evenings and late nights for Netflix (source: http://www.techspot.com/news/46048-netflix-represents-327-of-north-americas-peak-web-traffic.html)

  14. Challenges • Application requirements vary widely • Black Friday for e-commerce websites • Evenings and late nights for Netflix • Slashdot effect! (source: CMUSphinx Project)

  15. Challenges • Humans are bad (terrible, even) at estimating workload requirements [2] • Study of developers at Twitter submitting jobs to a datacenter: 70% overestimated by 10x, 20% underestimated by 5x • [2] Christina Delimitrou and Christos Kozyrakis. Quasar: Resource-Efficient and QoS-Aware Cluster Management. ASPLOS 2014.

  16. Outline • Motivation • Solutions • Implementation • Evaluation • Conclusion

  17. Resource as a Service [3] • Fine-grained cloud reservations • CPU (cycles), memory (pages), I/O (bandwidth), time (seconds) • Where does it stop? • Reduces wasted costs, but difficult to reason about • Hardware feasibility issues for service providers • [3] Orna Agmon Ben-Yehuda et al. The rise of RaaS: the resource-as-a-service cloud. Commun. ACM, 2014.

  18. Proposal

  19. Tell me more! • Application Mobility • Real-time Management

  20. Application Mobility • On-demand application migration across machines • Conventional issues: application state stored in the kernel (file descriptors, sockets); residual dependencies left on the source machine; execution continuity • We need: process isolation (even from the kernel); minimal state in the kernel

  21. Now where did I see that before? Image Source - Wikipedia

  22. Where do I find one of these? • Old idea, but making a comeback in cloud OSes • Drawbridge from Microsoft Research • MirageOS from University of Cambridge • Both (claim to) support application migration!

  24. Real-time Management • Monitor application requirements in real time (relatively easy): working set sizes, idle cycles • Use application migration to organize processes on VMs (complex): varying configurations and prices of VMs; identifying processes to migrate; downtime and budgets!

  25. Policies: Steps • Determine migration events • Identify process(es) for migration • Choose target from existing VMs, if possible • Figure out instance types for creating new VMs

  26. Policies • Metrics (in order of priority): maximize VM utilization; satisfy performance guarantees; minimize costs • User-defined parameters: upper limit on cost; max downtime per process
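A minimal sketch of how the two user-defined parameters might gate a migration decision. The names `PolicyLimits` and `migration_allowed` are hypothetical, and translating "max downtime per process" into a migration count through a fixed per-migration downtime (30 seconds here) is an assumption made for illustration, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass
class PolicyLimits:
    """Hypothetical container for the user-defined parameters."""
    max_cost_per_day: float                # dollars
    max_downtime_per_process: float        # seconds per day
    downtime_per_migration: float = 30.0   # seconds; assumed constant

    def max_migrations_per_process(self) -> int:
        # Each migration is assumed to cost one fixed slice of downtime.
        return int(self.max_downtime_per_process // self.downtime_per_migration)


def migration_allowed(limits: PolicyLimits, spent_today: float,
                      extra_cost: float, migrations_so_far: int) -> bool:
    """Return True only if both the cost cap and the downtime cap still hold."""
    within_budget = spent_today + extra_cost <= limits.max_cost_per_day
    within_downtime = migrations_so_far < limits.max_migrations_per_process()
    return within_budget and within_downtime


limits = PolicyLimits(max_cost_per_day=5.0, max_downtime_per_process=120.0)
print(limits.max_migrations_per_process())
print(migration_allowed(limits, spent_today=4.0, extra_cost=0.5, migrations_so_far=3))
```

Checking both caps before every migration keeps the policy from trading one constraint for the other.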

  27. Policies • Single application per VM: easy to reason about; use a naive best-fit model to find target VMs • Multiple applications per VM: highly complex optimization problem (NP-hard); use heuristics! Use best fit and explore nearby options to find target VMs
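The target-selection steps can be sketched as follows: try a best fit among existing VMs first, and fall back to the cheapest instance type that fits. This is an illustrative sketch, not the authors' simulator (which the slides say was written in Java). The `VM` and `InstanceType` classes, the normalized-slack scoring, and the example sizes and prices are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VM:
    name: str
    cpu: float              # total cores
    mem_gb: float           # total memory in GB
    used_cpu: float = 0.0
    used_mem_gb: float = 0.0


@dataclass
class InstanceType:
    name: str
    cpu: float
    mem_gb: float
    price_per_hour: float   # dollars; hypothetical values below


def best_fit(vms: List[VM], need_cpu: float, need_mem_gb: float) -> Optional[VM]:
    """Pick the existing VM that fits with the least normalized leftover capacity."""
    best, best_slack = None, None
    for vm in vms:
        free_cpu = vm.cpu - vm.used_cpu
        free_mem = vm.mem_gb - vm.used_mem_gb
        if free_cpu < need_cpu or free_mem < need_mem_gb:
            continue  # the process does not fit on this VM
        slack = (free_cpu - need_cpu) / vm.cpu + (free_mem - need_mem_gb) / vm.mem_gb
        if best is None or slack < best_slack:
            best, best_slack = vm, slack
    return best


def cheapest_fitting_type(types: List[InstanceType], need_cpu: float,
                          need_mem_gb: float) -> Optional[InstanceType]:
    """Fallback when no existing VM fits: cheapest instance type that is large enough."""
    fitting = [t for t in types if t.cpu >= need_cpu and t.mem_gb >= need_mem_gb]
    return min(fitting, key=lambda t: t.price_per_hour, default=None)


vms = [VM("a", cpu=4, mem_gb=16, used_cpu=1, used_mem_gb=4), VM("b", cpu=2, mem_gb=8)]
types = [InstanceType("small", 1, 1, 0.01), InstanceType("large", 4, 16, 0.20)]
target = best_fit(vms, need_cpu=2, need_mem_gb=6)
print(target.name if target else cheapest_fitting_type(types, 2, 6))
```

For the multiple-applications case, the "explore nearby options" heuristic could rank the few lowest-slack candidates rather than taking only the minimum; that extension is left out here.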

  28. Software Architecture

  32. Outline • Motivation • Solutions • Implementation • Evaluation • Conclusion

  33. Proof of Concept Model • Linux Containers (LXC): emulate isolated processes on Drawbridge/MirageOS • Checkpoint/Restore in Userspace (CRIU): checkpoint containers on VM A; migrate files to VM B; restore on VM B
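The checkpoint, migrate, restore sequence could be driven by commands along these lines. This is a sketch under assumptions: it uses `lxc-checkpoint` (LXC's front end to CRIU) with `rsync` and `ssh` for the file transfer, which the slides do not name, and it assumes the container's configuration and root filesystem already exist on the target VM. The function only builds the command lists; the container name, directory, and host are placeholders.

```python
from typing import List


def migration_commands(container: str, checkpoint_dir: str,
                       target_host: str) -> List[List[str]]:
    """Build the three commands for a checkpoint/transfer/restore migration."""
    # 1. Checkpoint on VM A: dump state to checkpoint_dir and stop the container (-s).
    checkpoint = ["lxc-checkpoint", "-n", container, "-D", checkpoint_dir, "-s"]
    # 2. Copy the checkpoint images to VM B (transfer tool is an assumption).
    transfer = ["rsync", "-a", checkpoint_dir + "/", f"{target_host}:{checkpoint_dir}/"]
    # 3. Restore on VM B from the copied images (-r).
    restore = ["ssh", target_host,
               "lxc-checkpoint", "-r", "-n", container, "-D", checkpoint_dir]
    return [checkpoint, transfer, restore]


for cmd in migration_commands("spec-bench", "/tmp/ckpt", "vm-b"):
    print(" ".join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` on VM A would execute the steps in order; the container is down from the checkpoint until the restore completes, which is where the migration downtime comes from.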

  34. Simulator • Rapidly validate migration policies • Evaluate the influence of policy parameters on results • Written in about 2000 lines of Java code

  35. Outline • Motivation • Solutions • Implementation • Evaluation • Conclusion

  36. Experimental Setup • Proof-of-concept model (WIP): live-migrating SPEC benchmarks running in LXC; observed downtime of 30 seconds (depending on process size) • Migration policy simulations: used our own random workload generator; 2 workloads of each type (static, high variability, low variability)

  37. Capping Costs [Figure: two charts, Overcommitment and Number of Migrations, each against max spending limit per day (dollars; 3, 4, 4.5, 5), for single-app and multiple-app policies]

  38. Constraining Downtime [Figure: two charts, Total Cost and Overcommitment, each against max migrations per process per day (2, 3, 4, 5), for single-app and multiple-app policies]

  39. Suppressing Spikes [Figure: two charts, Overcommitment and Number of Migrations, each against median window size (1, 4, 8), for single-app and multiple-app policies]
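The "median window size" parameter suggests that the policy smooths each process's measured demand with a median over recent samples, so a short spike does not trigger a migration. The sketch below shows one way to do that. The trailing-window definition and the sample values are assumptions; the slides do not describe the exact filter.

```python
from statistics import median
from typing import List


def median_smooth(samples: List[float], window: int) -> List[float]:
    """Replace each sample with the median of the last `window` samples (trailing window)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)   # shorter window at the start of the trace
        smoothed.append(median(samples[start:i + 1]))
    return smoothed


cpu_percent = [10, 10, 95, 10, 10]       # one-sample spike (made-up data)
print(median_smooth(cpu_percent, 1))     # window of 1 leaves the spike in place
print(median_smooth(cpu_percent, 4))     # window of 4 removes it
```

A larger window suppresses more spikes but also reacts more slowly to a sustained change in demand.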

  40. Show me the money • Baseline: same workloads as the simulation; picked the available VMs that best fit the workloads; no migrations; cost for 3 days: $45.36 • Our solution: none of the migration policies requires more than $15 for 3 days; 66% of the cost saved!
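The savings figure follows from the two 3-day costs quoted on the slide. A quick check, with `savings_fraction` as a hypothetical helper name and $15 taken as the upper bound on the migration policies' cost:

```python
def savings_fraction(baseline_cost: float, new_cost: float) -> float:
    """Fraction of the baseline cost that is saved."""
    return 1 - new_cost / baseline_cost


# $45.36 baseline vs. at most $15 with migration, both over 3 days.
print(f"{savings_fraction(45.36, 15.00):.1%}")   # prints 66.9%
```

Because $15 is an upper bound across the policies, 66% is the minimum saving on these workloads.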

  41. Conclusions • Streamlining cloud operations becomes more important with increasing scale • Current IaaS reservation models are insufficient • Better support is needed from cloud providers (e.g., Amazon EC2 Container Service) • Migration policies have to optimize in a multi-dimensional space • Simple ones offer savings too!

  42. Questions?

  43. BACKUPS

  44. Single application per VM

  45. Effect of cost per day [Figure: two charts, Migrations and Cost, and Overcommitment, each against max amount allowed per day (dollars; 3, 4, 4.5, 5)]

  46. Migrations cap [Figure: two charts, Overcommitment, and Migrations and Cost, each against max number of migrations per process per day (2, 3, 4)]

  47. Median window variations [Figure: two charts, Migrations and Cost, and Overcommitment, each against median window size (1, 4, 8)]

  48. Multiple applications per VM

  49. Effect of cost per day [Figure: two charts, Migrations and Cost, and Overcommitment, each against max amount allowed per day (dollars; 3, 4, 4.5, 5)]

  50. Migrations cap [Figure: two charts, Migrations and Cost, and Overcommitment, each against max number of migrations per process per day (3, 4, 5)]

  51. Median window variations [Figure: two charts, Migrations and Cost, and Overcommitment, each against median window size (1, 4, 8)]
