Cloud Native Cost Optimization Adrian Cockcroft @adrianco - PowerPoint PPT Presentation

State of the Art in Cloud Native Microservice Architectures
• AWS Re:Invent: Asgard to Zuul https://www.youtube.com/watch?v=p7ysHhs5hl0
• Resiliency at Massive Scale https://www.youtube.com/watch?v=ZfYJHtVL1_w
• Microservice Architecture https://www.youtube.com/watch?v=CriDUYtfrjs
• http://www.infoq.com/presentations/scale-gilt
• http://www.slideshare.net/mcculloughsean/itier-breaking-up-the-monolith-philly-ete
• http://www.infoq.com/presentations/Twitter-Timeline-Scalability
• http://www.infoq.com/presentations/twitter-soa
• http://www.infoq.com/presentations/Zipkin
• https://speakerdeck.com/mattheath/scaling-micro-services-in-go-highload-plus-plus-2014

Trust with Verification
● Edda - the “black box flight recorder” for configuration state
● Chaos Monkey - enforcing stateless business logic
● Chaos Gorilla - enforcing zone isolation/replication
● Chaos Kong - enforcing region isolation/replication
● Security Monkey - watching for insecure configuration settings
● See over 40 NetflixOSS projects at netflix.github.com
● Get “Technical Indigestion” trying to keep up with techblog.netflix.com
@adrianco

Autoscaled Ephemeral Instances at Netflix
Largest services use autoscaled red/black code pushes
Average lifetime of an instance is 36 hours
[Chart labels: Push, Autoscale Up, Autoscale Down]

Netflix Automatic Code Deployment Canary Bad Signature Implemented by Simon Tuffs

@adrianco Happy Canary Signature

Speeding Up The Platform
• Datacenter Snowflakes: deploy in months, live for years
• Virtualized and Cloud: deploy in minutes, live for weeks
• Docker Containers: deploy in seconds, live for minutes/hours
• AWS Lambda: deploy in milliseconds, live for seconds
Speed enables and encourages new microservice architectures
@adrianco

With AWS Lambda, compute resources are charged by the 100ms, not the hour
• First 1,000,000 node.js executions/month are free
• First 400,000 GB-seconds of RAM-CPU are free
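The billing rules on this slide can be sketched as a small calculation. This is a rough illustration only: the 100ms increment and the two free-tier allowances come from the slide, while the function names, the assumption that every invocation runs for the same duration, and the per-unit prices (passed in as arguments, not stated on the slide) are hypothetical.

```python
import math

# Figures taken from the slide.
BILLING_INCREMENT_MS = 100
FREE_REQUESTS_PER_MONTH = 1_000_000
FREE_GB_SECONDS_PER_MONTH = 400_000


def billable_usage(invocations, duration_ms, memory_gb):
    """Return (billable_requests, billable_gb_seconds) for one month.

    Assumes every invocation runs for the same duration_ms (a simplification).
    """
    # Each invocation is rounded up to the next 100 ms increment.
    billed_ms = math.ceil(duration_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS
    gb_seconds = invocations * billed_ms * memory_gb / 1000
    billable_requests = max(0, invocations - FREE_REQUESTS_PER_MONTH)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS_PER_MONTH)
    return billable_requests, billable_gb_seconds


def monthly_cost(invocations, duration_ms, memory_gb,
                 price_per_request, price_per_gb_second):
    """Cost after the free tier; both prices are caller-supplied assumptions."""
    requests, gb_seconds = billable_usage(invocations, duration_ms, memory_gb)
    return requests * price_per_request + gb_seconds * price_per_gb_second
```

Under these assumptions, 3,000,000 invocations of 120ms at 0.5 GB are billed as 200ms each, which stays inside the GB-second allowance and leaves only 2,000,000 requests billable.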

Monitoring Requirements
• Metric resolution: microseconds
• Metric update rate: 1 second
• Metric to display latency: less than human attention span (<10s)

Low Latency SaaS Based Monitors @adrianco www.vividcortex.com and www.boundary.com

Adrian’s Tinkering Projects
Model and visualize microservices
Simulate interesting architectures
• See github.com/adrianco/spigo - Simulate Protocol Interactions in Go
• See github.com/adrianco/d3grow - Dynamic visualization

Cost Optimization

Capacity Optimization for a Single System Bottleneck
[Capability plot showing Lower Spec Limit and Upper Spec Limit (USL)]
• When demand probability is below USL by 3.0 sigma, scale down resource to save money
• When demand probability exceeds USL by 4.0 sigma, scale up resource to maintain low latency
Documentation on Capability Plots
To get accurate high dynamic range histograms see http://hdrhistogram.org/
Slideshare: 2003 Presentation on Capacity Planning Methods
See US Patent: 7467291

But interesting systems don’t have a single bottleneck nowadays…

What about cloud costs? @adrianco

Cloud Native Cost Optimization
• Optimize for speed first
• Turn it off!
• Capacity on demand
• Consolidate and Reserve
• Plan for price cuts
• FOSS tooling
@adrianco

The Capacity Planning Problem @adrianco

Best Case Waste
Cloud capacity used is maybe half average DC capacity
@adrianco

Failure to Launch
Mad scramble to add more DC capacity during launch phase outages
[Chart phases: Pre-Launch, Build-out, Testing, Launch, Growth, Growth]
@adrianco

Over the Top Losses
Capacity wasted on failed launch magnifies the losses
[Chart phases: Pre-Launch, Build-out, Testing, Launch, Growth, Growth]
@adrianco

Turning off Capacity
• Off-peak production
• Test environments
• Dev out of hours
• Dormant Data Science
@adrianco

Containerize Test Environments
• Snapshot or freeze
• Fast restart needed
• Persistent storage
• 40 of 168 hrs/wk
• Bin-packed containers
• shippable.com saved 70%
@adrianco
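The "40 of 168 hrs/wk" figure implies an upper bound on what turning an environment off out of hours can save. A minimal sketch of that arithmetic follows; the function name is hypothetical, and it assumes cost scales linearly with running hours, ignoring persistent storage and restart overhead.

```python
HOURS_PER_WEEK = 168  # 7 days x 24 hours


def off_hours_saving(hours_needed_per_week):
    """Fraction of weekly running cost avoided by stopping the environment
    whenever it is not needed (assumes cost is proportional to hours run)."""
    if not 0 <= hours_needed_per_week <= HOURS_PER_WEEK:
        raise ValueError("hours_needed_per_week must be between 0 and 168")
    return 1 - hours_needed_per_week / HOURS_PER_WEEK


# A test environment used 40 hours a week.
print(f"{off_hours_saving(40):.0%}")
```

This bound is separate from the 70% saving the slide reports for shippable.com, which presumably reflects real-world overheads as well.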

Seasonal Savings
50% Savings
[Chart: Web Servers by Week, weeks 1 to 49]
@adrianco

Autoscale the Costs Away @adrianco

Daily Duty Cycle
Reactive Autoscaling Predictive Autoscaling saves around 70% saves around 50%
See Scryer on Netflix Tech Blog
@adrianco
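To see why a daily duty cycle leaves room for savings, compare a fleet sized for the daily peak with one that follows demand hour by hour. The sketch below is an idealized model under stated assumptions: the demand curve and per-instance capacity are invented for illustration, scaling is instantaneous with no headroom, and it does not distinguish reactive from predictive autoscaling or reproduce the percentages on the slide.

```python
import math


def fixed_instance_hours(hourly_demand, capacity_per_instance):
    """Instance-hours when the fleet is sized for the peak all day."""
    peak_instances = math.ceil(max(hourly_demand) / capacity_per_instance)
    return peak_instances * len(hourly_demand)


def autoscaled_instance_hours(hourly_demand, capacity_per_instance, min_instances=1):
    """Instance-hours when the fleet tracks demand each hour (idealized)."""
    return sum(
        max(min_instances, math.ceil(demand / capacity_per_instance))
        for demand in hourly_demand
    )


def autoscaling_saving(hourly_demand, capacity_per_instance, min_instances=1):
    """Fraction of instance-hours saved relative to peak provisioning."""
    fixed = fixed_instance_hours(hourly_demand, capacity_per_instance)
    scaled = autoscaled_instance_hours(hourly_demand, capacity_per_instance, min_instances)
    return 1 - scaled / fixed


# Hypothetical day: 8 quiet hours, 8 moderate hours, 8 peak hours (requests/sec).
DEMAND = [100] * 8 + [400] * 8 + [1000] * 8
print(autoscaling_saving(DEMAND, capacity_per_instance=100))  # 0.5
```

A real autoscaler would sit somewhere below this ideal because of scaling lag and safety headroom.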

Underutilized and Unused @adrianco

Clean Up the Crud @adrianco

Total Cost of Oranges
How much does datacenter automation software and support cost per instance?
@adrianco

When Do You Pay?
• Datacenter Up Front Costs: Lease Building, Install AC etc, Rack & Stack, Private Cloud SW
• Run My Stuff
• Bill
[Timeline labels: Ages Ago, Now, Next Month]
@adrianco

Cost Model Comparisons
AWS has most complex model
• Both highest and lowest cost options!
CPU/Memory Ratios Vary
• Can’t get same config everywhere
Features Vary
• Local SSD included on some vendors, not others
• Network and storage charges also vary

Digital Ocean Flat Pricing
• Hourly Price ($0.06/hr): No Upfront, $0.060/hr, $1555/36mo
• Monthly Price ($40/mo): No Upfront, $0.056/hr, $1440/36mo, Savings 7%
Prices on Dec 7th, for 2 Core, 4G RAM, SSD, purely to show typical savings
@adrianco

Google Sustained Usage
• Full Price Without Sustained Usage: No Upfront, $0.063/hr, $1633/36mo
• Typical Sustained Usage Each Month: No Upfront, $0.049/hr, $1270/36mo, Savings 22%
• Full Sustained Usage Each Month: No Upfront, $0.045/hr, $1166/36mo, Savings 29%
Prices on Dec 7th, for n1.standard-1 (1 vCPU, 3.75G RAM, no disk) purely to show typical savings
@adrianco

AWS Reservations
• On Demand: No Upfront, $0.070/hr, $1840/36mo
• No Upfront, 1 year: No Upfront, $0.050/hr, $1314/36mo, Savings 29%
• Partial Upfront, 3 year: $337 Upfront, $0.0278/hr, $731/36mo, Savings 60%
• All Upfront, 3 year: $687 Upfront, $0.00/hr, $687/36mo, Savings 63%
Prices on Dec 7th, for m3.medium (1 vCPU, 3.75G RAM, SSD) purely to show typical savings
@adrianco
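The savings percentages on the three pricing slides can be checked from the 36-month totals alone. The sketch below is one way to do that; the totals are copied from the slides, while the dictionary layout and function names are illustrative assumptions. The first entry for each vendor is treated as the baseline.

```python
# 36-month totals in dollars, as quoted on the pricing slides.
TOTALS_36_MONTHS = {
    "Digital Ocean": {"Hourly Price": 1555, "Monthly Price": 1440},
    "Google": {
        "Full Price Without Sustained Usage": 1633,
        "Typical Sustained Usage Each Month": 1270,
        "Full Sustained Usage Each Month": 1166,
    },
    "AWS": {
        "On Demand": 1840,
        "No Upfront 1 year": 1314,
        "Partial Upfront 3 year": 731,
        "All Upfront 3 year": 687,
    },
}


def savings_percent(baseline, discounted):
    """Saving relative to the baseline, rounded to a whole percent."""
    return round(100 * (1 - discounted / baseline))


def savings_by_option(options):
    """Map each non-baseline option to its saving against the first option."""
    names = list(options)
    baseline = options[names[0]]
    return {name: savings_percent(baseline, options[name]) for name in names[1:]}


for vendor, options in TOTALS_36_MONTHS.items():
    print(vendor, savings_by_option(options))
```

The results agree with the figures on the slides: 7% for Digital Ocean, 22% and 29% for Google, and 29%, 60% and 63% for AWS.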

Blended Benefits
• On Demand
• Partial Upfront
• All Upfront
@adrianco

Consolidated Reservations
• Burst capacity guarantee
• Higher availability with lower cost
• Other accounts soak up any extra
• Monthly billing roll-up
• Capitalize upfront charges!
• But: Fixed location and instance type
@adrianco

Use EC2 Spot Instances
Cloud native dynamic autoscaled spot instances
• Real world total savings up to 50%
@adrianco

Right Sizing Instances Fit the instance size to the workload @adrianco
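Fitting the instance size to the workload can be sketched as picking the cheapest instance type that still meets the workload's CPU and memory needs. Everything in this example is hypothetical: the catalog entries, their prices, and the function name are invented for illustration and do not describe any vendor's actual offerings.

```python
from typing import NamedTuple, Optional, Sequence


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gb: float
    hourly_price: float


# Hypothetical catalog, not real vendor pricing.
CATALOG = [
    InstanceType("small", 1, 3.75, 0.07),
    InstanceType("medium", 2, 7.5, 0.14),
    InstanceType("large", 4, 15.0, 0.28),
]


def right_size(catalog: Sequence[InstanceType],
               vcpus_needed: int,
               memory_gb_needed: float) -> Optional[InstanceType]:
    """Return the cheapest type that satisfies both requirements, or None."""
    candidates = [
        t for t in catalog
        if t.vcpus >= vcpus_needed and t.memory_gb >= memory_gb_needed
    ]
    return min(candidates, key=lambda t: t.hourly_price, default=None)


# A workload that needs 2 vCPUs and 4 GB fits "medium" rather than "large".
print(right_size(CATALOG, vcpus_needed=2, memory_gb_needed=4))
```

In practice the requirements would come from measured utilization rather than guesses, which is where the savings from right sizing arise.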

Six Ways to Cut Costs
Credit to Jinesh Varia of AWS for this summary
@adrianco

Compounded Savings @adrianco

Lift and Shift Compounding
Traditional application using AWS heavy use reservations
[Bar chart, cost axis 0 to 100, by step: Base Price, Rightsized, Seasonal, Daily Scaling, Reserved, Tech Refresh, Price Cuts. Bar values in extraction order: 100, 70, 70, 70, 30, 30, 25]
Base price is for capacity bought up-front
@adrianco

Conservative Compounding
Cloud native application, partially optimized, light use reservations
[Bar chart, cost axis 0 to 100, by step: Base Price, Rightsized, Seasonal, Daily Scaling, Reserved, Tech Refresh, Price Cuts. Bar values in extraction order: 100, 70, 50, 35, 25, 20, 15]
@adrianco
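The compounding effect can be worked through numerically. The sketch below assumes the bar values on the Conservative Compounding chart read, in order from Base Price to Price Cuts, 100, 70, 50, 35, 25, 20, 15; that mapping is inferred from the extracted slide text rather than stated explicitly, and the function names are illustrative.

```python
STEPS = ["Base Price", "Rightsized", "Seasonal", "Daily Scaling",
         "Reserved", "Tech Refresh", "Price Cuts"]

# Assumed reading of the Conservative Compounding chart (cost index per step).
CONSERVATIVE = [100, 70, 50, 35, 25, 20, 15]


def step_reductions(levels):
    """Percentage reduction contributed by each step relative to the previous one."""
    return [round(100 * (1 - after / before), 1)
            for before, after in zip(levels, levels[1:])]


def overall_saving(levels):
    """Total percentage saving from the first level to the last."""
    return round(100 * (1 - levels[-1] / levels[0]), 1)


for name, reduction in zip(STEPS[1:], step_reductions(CONSERVATIVE)):
    print(f"{name}: {reduction}% off the previous level")
print(f"Overall: {overall_saving(CONSERVATIVE)}%")
```

Each step removes only 20% to 30% of the remaining cost, yet because the reductions multiply, the overall saving under this reading is 85%.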


Cloud Native Cost Optimization Adrian Cockcroft @adrianco Technology Fellow - Battery Ventures ICPE - Austin, February 2015 Why Does Performance Matter? @adrianco Latency Efficiency @adrianco Users: Response Latency Developers: Release
