
Auto-scaling deadline-constrained workloads in containers in the cloud



  1. Auto-scaling deadline-constrained workloads in containers in the cloud • Jay DesLauriers, Research Associate, University of Westminster

  2. Project COLA • Horizon 2020 • 33 months • Completion September 2019 • 14 Partners in 6 Countries • 10 SME/Public Sector • 4 HE/Research Institutions • This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 731574 • June 5th 2019 • www.project-cola.eu

  3. Head in the clouds • On-Premise: capital expense, high upfront cost, high maintenance cost • Off-Premise: pay-as-you-go, no upfront cost, no maintenance cost

  4. A match made in ... • Containers: operating-system virtualisation and application packaging for reusable, portable software

  5. The Problem • Some requirements: • Dynamic supply (auto-scaling) • Vendor-free • Modular • Flexible • Secure • [Diagram labels: Application 1, Application 2, ... Application N; Service 1 to Service 5; resource requirements (baseline resource consumption, variable resource consumption); dynamic demand; manually adjusted supply, to be replaced by automatically adjusted supply; cloud services]

  6. Finding a solution...

  7. The Solution: MiCADO • TOSCA Application Description Template (ADT): describes application, infrastructure, scaling policies, security policies • MiCADO Master Node: Submitter (translates ADT), Policy Keeper (enforces scaling), Optimiser (ML-based optimisation), Occopus (orchestrates VMs), Kubernetes (orchestrates containers), Prometheus (monitors VMs & containers) • MiCADO Worker Node: cAdvisor and Node Exporter (export VM/container metrics), Docker (container runtime)

  8. Scaling Use-Case No.1 • Resource-intensive services • Typically CPU/memory-bound apps/services • Containers & underlying VMs scale to meet demand

  9. Scaling Use-Case No.2 • Multi-job experiments • Typically batch/parameter sweep jobs • Containers/VMs scale to complete jobs by deadline • Where do we put the jobs? • How do we execute them (in containers!)? • [Diagram labels: ADT (MiCADO infrastructure and scaling rules); Submitter; Policy Keeper (scaling logic); container and cloud orchestrator; scale up/down; cqueue workers; JQueuer Agent; jobs; MiCADO Master; MiCADO Worker; "<insert queue here>"]

  10. JQueuer • Asynchronous Distributed Task Queue • Master Component: runs externally; queue & monitoring • Agent Component: runs on worker VMs; fetches & executes jobs • [Diagram labels: end user; experiment .json; JQueuer Master; ADT (MiCADO infrastructure and scaling rules); Submitter; Policy Keeper (scaling logic); container and cloud orchestrator; scale up/down; cqueue workers; JQueuer Agent; jobs; MiCADO Master; MiCADO Worker]
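
The agent behaviour on this slide (fetch a job, execute it, repeat until the queue is empty) can be sketched as follows. This is a minimal illustration only: it uses Python's in-process `queue.Queue` and threads as a stand-in for a distributed queue, and the names `run_agent` and `run_experiment` are hypothetical, not part of JQueuer.

```python
import queue
import threading


def run_agent(jobs, results, lock, name):
    """Hypothetical agent loop: pull jobs until the shared queue is empty."""
    while True:
        try:
            job = jobs.get_nowait()  # fetch the next job, if any
        except queue.Empty:
            return  # nothing left to do, the agent stops
        outcome = job()  # execute the job (here: a plain callable)
        with lock:
            results.append((name, outcome))
        jobs.task_done()


def run_experiment(job_fns, n_agents):
    """Put all jobs on one queue and let n_agents pull from it concurrently."""
    jobs = queue.Queue()
    for fn in job_fns:
        jobs.put(fn)
    results = []
    lock = threading.Lock()
    agents = [
        threading.Thread(target=run_agent, args=(jobs, results, lock, f"agent-{i}"))
        for i in range(n_agents)
    ]
    for agent in agents:
        agent.start()
    for agent in agents:
        agent.join()
    return results


if __name__ == "__main__":
    # Ten toy jobs shared by three agents; which agent runs which job may vary.
    done = run_experiment([lambda i=i: i * i for i in range(10)], n_agents=3)
    print(sorted(outcome for _, outcome in done))
```

Because agents pull work rather than receiving a fixed share, a fast agent simply takes more jobs, which is the property the later slides rely on.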

  11. JQueuer Metrics • Metrics exported to MiCADO for scaling: • Queue length • Jobs completed • Jobs failed • Jobs running • Jobs remaining • Time elapsed • Average job length • Time to deadline
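
As a rough illustration of how three of these metrics (jobs remaining, average job length, time to deadline) could drive a scaling decision, the sketch below estimates a worker count. It is an assumption made for illustration, not the Policy Keeper's actual rule: the function name `workers_needed` is hypothetical, and it simplifies by treating the remaining work as perfectly divisible across workers.

```python
import math


def workers_needed(jobs_remaining, avg_job_seconds, seconds_to_deadline):
    """Estimate how many workers are needed to finish before the deadline.

    Simplifying assumption: each worker runs one job at a time and the
    remaining work can be spread evenly across workers.
    """
    if jobs_remaining <= 0:
        return 0
    if seconds_to_deadline <= 0:
        raise ValueError("deadline has already passed")
    total_work = jobs_remaining * avg_job_seconds  # worker-seconds still required
    return math.ceil(total_work / seconds_to_deadline)


if __name__ == "__main__":
    # Illustrative numbers only: 200 jobs of about 60 s each, one hour left.
    print(workers_needed(200, 60, 3600))
```

Re-evaluating such an estimate as the metrics change would let the worker count shrink when jobs finish faster than expected and grow when they run long.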

  12. The experiment • Determining the impact of changes in behavior on the spread of a disease across a population

  13. Experiment design • Agent-based simulation (Repast Simphony) • Three agents: Infected, Susceptible, Recovered • Simulate movement & interaction of agents in an environment

  14. Manual job allocation (baseline) • 200 jobs • Equal distribution: 40 jobs per VM • 1 hour to complete all jobs • Repast 1 to Repast 5 running on VM 1 to VM 5

  15. Automatic job allocation (MiCADO) • experiment.json (200 jobs, 1-hour deadline) goes to the JQueuer Manager • MiCADO Master • MiCADO Workers 1, 2, ... n, each running Repast and a JQueuer Agent

  16. Results • Dynamic allocation of variable-length jobs results in a better use of cloud resources • [Chart: manually allocated vs allocated by MiCADO] • Manual allocation (baseline): 5 VMs • Dynamic allocation (MiCADO): 3.86 VMs
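
Why variable-length jobs favour dynamic allocation can be seen in a small sketch that compares the finishing time of a fixed equal split with that of a pull-based queue. The job durations are invented for illustration and are not data from the experiment; `static_makespan` and `dynamic_makespan` are hypothetical helper names.

```python
import heapq


def static_makespan(durations, n_workers):
    """Equal split: each worker gets a fixed, contiguous share of the jobs."""
    chunk = -(-len(durations) // n_workers)  # ceiling division
    loads = [sum(durations[i * chunk:(i + 1) * chunk]) for i in range(n_workers)]
    return max(loads)


def dynamic_makespan(durations, n_workers):
    """Pull-based: whichever worker becomes free first takes the next job."""
    free_at = [0.0] * n_workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + duration)
    return max(free_at)


if __name__ == "__main__":
    # Invented durations: four long jobs followed by four short ones.
    jobs = [10, 10, 10, 10, 1, 1, 1, 1]
    print(static_makespan(jobs, 2), dynamic_makespan(jobs, 2))
```

With the equal split, one worker is stuck with all the long jobs while the other sits idle; with the pull-based queue both stay busy, so the same deadline can be met with less provisioned capacity.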

  17. Thanks! • github.com/micado-scale/ansible-micado • project-cola.eu/ • T. Kiss, J. DesLauriers, G. Gesmier et al., A cloud-agnostic queuing system to support the implementation of deadline-based application execution policies, Future Generation Computer Systems (2019), https://doi.org/10.1016/j.future.2019.05.062 • Project Director: Dr. Tamas Kiss, University of Westminster, UK
