
Bolt: I Know What You Did Last Summer… In the Cloud
Christina Delimitrou (1) and Christos Kozyrakis (2)
(1) Cornell University, (2) Stanford University
ASPLOS, April 12th, 2017

Executive Summary
- Problem: cloud resource sharing hides security vulnerabilities
  - Interference from co-scheduled apps → leaks app characteristics
  - Enables severe performance attacks
- Bolt: adversarial runtime in public clouds
  - Transparent app detection (5-10 sec)
  - Leverages practical machine learning techniques
  - DoS → 140x increase in latency
  - User study: 88% correctly identified applications
- Resource partitioning is helpful but insufficient

Motivation
[Figure, built up over several slides: App1 and App2; containers, memory capacity, storage capacity/bw, network bw, LL cache, power]
- Not all isolation techniques available
- Not all used/configured correctly
- Not all scale well
- Mem bw/core resources not isolated

Bolt
- Key idea: leverage lack of isolation in public clouds to infer application characteristics
  - Programming framework, algorithm, load characteristics
- Exploit: enable practical, effective, and hard-to-detect performance attacks
  - DoS, RFA (resource-freeing attack), VM pinpointing
  - Use app characteristics (sensitive resource) against it
  - Avoid CPU saturation → hard to detect

Threat Model
[Figure: cloud provider, adversary, victim]
- Impartial, neutral cloud provider
- Active adversary but no control over VM placement

Bolt
[Figure: adversary and victim, with numbered steps]
1. Contention injection
2. Interference impact measurement
3. App inference
4. Custom contention kernel
5. Performance attack

1. Contention Measurement
- Set of contentious kernels (iBench)
  - Compute
  - L1/L2/L3
  - Memory bw
  - Storage bw
  - Network bw
  - (Memory/Storage capacity)
- Sample 2-3 kernels, run in adversarial VM
- Measure impact on performance of kernels vs. isolation
[Figure: (1) contention injection from adversary to victim, (2) interference impact measurement]
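The measurement step above can be sketched as follows. This is a minimal illustration, not Bolt's implementation: the function names, the 0-100 pressure scale, and the timings are all assumptions made for the example. It compares a kernel's runtime when co-located against its runtime in isolation and records a score only for the sampled resources, which is why the resulting profile is sparse.

```python
def interference_pressure(isolated_time, colocated_time):
    """Map a kernel's slowdown to a pressure score in [0, 100] (illustrative scale)."""
    if isolated_time <= 0:
        raise ValueError("isolated_time must be positive")
    slowdown = colocated_time / isolated_time
    # No slowdown maps to 0; a slowdown of 2x or more saturates at 100.
    return max(0.0, min(100.0, (slowdown - 1.0) * 100.0))


def sparse_profile(measurements):
    """Build resource -> pressure, only for the kernels that were sampled."""
    return {
        resource: interference_pressure(iso, colo)
        for resource, (iso, colo) in measurements.items()
    }


# Hypothetical timings in seconds: (isolated, co-located) for two sampled kernels.
measurements = {"l3_cache": (1.0, 1.6), "memory_bw": (2.0, 2.2)}
profile = sparse_profile(measurements)
print(profile)
```

Resources that were not sampled are simply absent from the profile; filling them in is the job of the inference step.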

2. Practical App Inference
- Infer resource pressure in non-profiled resources
  - Sparse → dense information
  - SGD (collaborative filtering)
- Classify unknown victim based on previously seen applications
  - Label & determine resource sensitivity
  - Content-based recommendation
- Hybrid recommender
[Figure: (3) practical app inference, adversary and victim]

Big Data to the Rescue
Infer pressure in non-profiled resources
1. Reconstruct sparse information
- Stochastic Gradient Descent (SGD), O(mpk)
[Figure: contention injection with uBench produces a sparse interference profile per app; SVD+SGD in Bolt reconstructs the dense profile]

Sparse input (rows: apps, columns: resources r_1 … r_N):
  r_1    r_2    r_3    …    r_N
  a_11   0      0      …    a_1N
  0      a_22   0      …    0
  …      …      …      …    …
  a_M1   0      a_M3   …    0

Dense output:
  r_1    r_2    r_3    …    r_N
  a_11   a_12   a_13   …    a_1N
  a_21   a_22   a_23   …    a_2N
  …      …      …      …    …
  a_M1   a_M2   a_M3   …    a_MN
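A minimal sketch of the sparse-to-dense reconstruction, assuming a plain low-rank factorization trained with SGD on the observed entries. The slide names SVD+SGD; this sketch swaps the SVD step for a random initialization to stay self-contained, and the function names, hyperparameters, and sample data are all hypothetical.

```python
import random


def observed_rmse(rows, dense):
    """Root-mean-square error over the observed (profiled) entries only."""
    errs = [(v - dense[i][j]) ** 2 for i, row in enumerate(rows) for j, v in row.items()]
    return (sum(errs) / len(errs)) ** 0.5


def sgd_complete(rows, n_cols, k=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Approximate the full apps x resources matrix as P * Q^T.

    rows: one dict per app, mapping column index -> observed pressure.
    Missing keys are the non-profiled resources to be inferred.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in rows]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_cols)]
    observed = [(i, j, v) for i, row in enumerate(rows) for j, v in row.items()]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j, v in observed:
            err = v - sum(p * q for p, q in zip(P[i], Q[j]))
            for f in range(k):
                p, q = P[i][f], Q[j][f]
                # Gradient step on squared error with L2 regularization.
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return [
        [sum(p * q for p, q in zip(P[i], Q[j])) for j in range(n_cols)]
        for i in range(len(rows))
    ]


# Hypothetical sparse profiles: 4 apps x 4 resources, pressure as a fraction.
rows = [
    {0: 0.9, 1: 0.8, 3: 0.1},
    {0: 0.8, 2: 0.2, 3: 0.2},
    {1: 0.1, 2: 0.9, 3: 0.8},
    {0: 0.2, 1: 0.2, 2: 0.8},
]
dense = sgd_complete(rows, n_cols=4)
print(round(observed_rmse(rows, dense), 3))
```

Every cell of `dense` is filled, including the ones that were never measured; how trustworthy those inferred cells are depends on how much structure the known applications share.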

Big Data to the Rescue
Classify and label victims
2. Weighted Pearson Correlation Coefficients
- Output: distribution of similarity scores to app classes
[Figure: dense app profiles (a_11 … a_MN over resources r_1 … r_N) → Pearson correlation coefficients in Bolt → app label & characteristics]
Example output:
  Hadoop SVM: 65%
  Spark ALS: 21%
  memcached: 11%
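The classification step can be illustrated with a weighted Pearson correlation between the victim's profile and each known application's profile. This is a sketch under stated assumptions: the slides do not say how the weights are chosen or how scores become a distribution, so the uniform weights, the clipping of negative correlations to zero, and the normalization below are choices made for this example, and the profiles are hypothetical.

```python
import math


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation coefficient of x and y with weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / total
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / total
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / total
    if vx == 0 or vy == 0:
        return 0.0  # A constant profile carries no correlation signal.
    return cov / math.sqrt(vx * vy)


def similarity_distribution(victim, known, weights):
    """Turn per-class correlations into a distribution (assumed normalization)."""
    scores = {
        label: max(0.0, weighted_pearson(victim, prof, weights))
        for label, prof in known.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {label: 0.0 for label in scores}
    return {label: s / total for label, s in scores.items()}


# Hypothetical dense profiles over four resources.
known = {
    "hadoop": [0.9, 0.8, 0.2, 0.1],
    "memcached": [0.1, 0.2, 0.9, 0.8],
}
victim = [0.8, 0.9, 0.1, 0.2]
dist = similarity_distribution(victim, known, weights=[1, 1, 1, 1])
print(max(dist, key=dist.get))
```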

Inference Accuracy
- 40-machine cluster (420 cores)
- Training apps: 120 jobs (analytics, databases, webservers, in-memory caching, scientific, js) → high coverage of resource space
- Testing apps: 108 latency-critical webapps, analytics
- No overlap in algorithms/datasets between training and testing sets

Application class                            Detection accuracy (%)
In-memory caching (memcached)                80
Persistent databases (Cassandra, MongoDB)    89
Hadoop jobs                                  92
Spark jobs                                   86
Webservers                                   91
Aggregate                                    89

3. Practical Performance Attacks
1. Determine the resource bottleneck of the victim
2. Create custom contentious kernel that targets critical resource(s)
3. Inject kernel in Bolt
- Several performance attacks (DoS, RFAs, VM pinpointing)
- Target specific, critical resource → low CPU pressure
[Figure: (4) custom kernel injection from adversary to victim]
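Step 1 above, determining the victim's bottleneck, amounts to ranking the inferred profile by pressure. The sketch below shows only that selection step; the function name and the sample profile are hypothetical, and it does not implement a contention kernel.

```python
def pick_targets(profile, top_n=1):
    """Return the top_n resources with the highest inferred pressure."""
    return sorted(profile, key=profile.get, reverse=True)[:top_n]


# Hypothetical dense profile: resource -> inferred pressure (0-100).
profile = {"l3_cache": 80, "memory_bw": 35, "network_bw": 10}
print(pick_targets(profile))
print(pick_targets(profile, top_n=2))
```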

3. Practical DoS Attacks
- Launched against same 108 applications as before
- On average 2.2x higher execution time and up to 9.8x
- For interactive services, on average 42x increase in tail latency and up to 140x
- Bolt does not saturate CPU
- Naïve attacker gets migrated

Demo

User Study
- 20 independent users from Stanford and Cornell
- Cluster: 200 EC2 servers, c3.8xlarge (32 vCPUs, 60 GB memory)
- Rules:
  - 4 vCPUs per machine for Bolt
  - All users have equal priority
  - Users use thread pinning
  - Users can select specific instances
- Training set: 120 apps incl. analytics, webapps, scientific, etc.

Accuracy of App Labeling
[Figure: labeling accuracy across 53 app classes (analytics, webapps, FS/OS, HLS/sim, other…)]

Accuracy of App Characterization
[Figure: characterization accuracy]
Performance attack results in the paper

The Value of Isolation
[Figure annotations: 45%, 14%]
- Need more scalable, fine-grain, and complete isolation techniques

Conclusions
- Bolt: highlight the security vulnerabilities from lack of isolation
  - Fast detection using online data mining techniques
  - Practical, hard-to-detect performance attacks
  - Current isolation helpful but insufficient
- In the paper:
  - Sensitivity to Bolt parameters
  - Sensitivity to applications and platform parameters
  - User study details
  - More performance attacks (resource freeing, VM pinpointing)

Questions?

Evolving Applications
- Cloud applications change behavior
- Users use the same cloud resources for several apps over time
- Bolt periodically wakes up, checks if app profile has changed; if so, reprofile & reclassify
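The periodic check described above could look like the following. This is a minimal sketch: the slides do not specify how a profile change is detected, so the per-resource threshold test, the 0.2 default, and the function name are assumptions made for illustration.

```python
def profile_changed(old, new, threshold=0.2):
    """True if any resource present in both profiles moved by more than threshold."""
    shared = old.keys() & new.keys()
    return any(abs(new[r] - old[r]) > threshold for r in shared)


# Hypothetical profiles: resource -> pressure as a fraction.
previous = {"l3_cache": 0.8, "memory_bw": 0.3}
steady = {"l3_cache": 0.75, "memory_bw": 0.35}
shifted = {"l3_cache": 0.2, "memory_bw": 0.9}

print(profile_changed(previous, steady))   # small drift, keep current label
print(profile_changed(previous, shifted))  # large shift, reprofile and reclassify
```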

Inference Within a Framework
- Within a framework, dataset and choice of algorithm affect resource requirements
- Bolt matches a new unknown application to apps in a framework by distinguishing their resource needs
