Taming the Cloud Object Storage
Ali Anwar★, Yue Cheng★, Aayush Gupta†, Ali R. Butt★
★Virginia Tech, †IBM Research – Almaden
Cloud object stores enable cost-efficient data storage
Cloud object stores support various workloads
[Figure: example applications backed by object storage: website hosting, online gaming, online video sharing, enterprise backup]
One size does not fit all: replace the monolithic object store with specialized, fine-grained object stores, each launched on its own sub-cluster
Reason 1: Classification of workloads
Applications have different service-level requirements, e.g., average latency per request, queries per second (QPS), and data transfer throughput (MB/s)
Small objects (~1–100 KB)
• Website hosting: Get 90%, Put 5%, Delete 5%
• Online gaming: Get 5%, Put 90%, Delete 5%
Large objects (~1–100 MB)
• Online video sharing: Get 90%, Put 5%, Delete 5%
• Enterprise backup: Get 5%, Put 90%, Delete 5%
Reason 2: Heterogeneous resources
• Datacenters hosting object stores are becoming increasingly heterogeneous
• Hardware-to-workload mismatch
• Meeting SLA requirements is challenging
Outline: Introduction, Motivation, Contribution, Design, Evaluation
Background: Swift object store
[Figure: a Swift deployment with a proxy tier in front of storage server nodes]
Swift: Proxy and Storage servers
[Figure: request flow from clients through the proxy server to the storage server nodes]
Swift: Ring architecture
[Figure: the proxy uses the ring to map each object onto the storage server nodes]
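For intuition, here is a minimal sketch of the consistent-hashing idea behind the Swift ring: an object's account/container/object path is hashed, the top bits of the hash select a partition, and the partition maps to a storage device. This is a simplification for illustration, not Swift's actual ring-builder code; the partition power and device list below are made up.

```python
import hashlib
import struct

PART_POWER = 8                       # 2**8 = 256 partitions (illustrative value)
DEVICES = ["storage-node-1", "storage-node-2", "storage-node-3"]  # hypothetical nodes

# Hypothetical partition-to-device table: here we simply stripe partitions across
# devices; real Swift builds this table to balance load and to place replicas in
# distinct failure domains.
PART_TO_DEVICE = {p: DEVICES[p % len(DEVICES)] for p in range(2 ** PART_POWER)}

def ring_lookup(account: str, container: str, obj: str) -> str:
    """Map an object path to a storage device via hash -> partition -> device."""
    path = f"/{account}/{container}/{obj}".encode()
    digest = hashlib.md5(path).digest()
    # Take the top PART_POWER bits of the hash as the partition number.
    part = struct.unpack(">I", digest[:4])[0] >> (32 - PART_POWER)
    return PART_TO_DEVICE[part]

print(ring_lookup("AUTH_demo", "photos", "cat.jpg"))
```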
Benchmark used: COSBench
• COSBench is an Intel-developed benchmark for measuring Cloud Object Storage service performance
• Targets S3, OpenStack Swift, and similar object stores; not for file systems or block devices
• Used to compare different hardware/software stacks, identify bottlenecks, and guide optimizations
Workloads used

Workload   Object size distribution   Workload characteristics     Application scenario
A          1–128 KB                   Get 90%, Put 5%, Delete 5%   Web hosting
B          1–128 KB                   Get 5%, Put 90%, Delete 5%   Online game hosting
C          1–128 MB                   Get 90%, Put 5%, Delete 5%   Online video sharing
D          1–128 MB                   Get 5%, Put 90%, Delete 5%   Enterprise backup
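As a concrete illustration, the sketch below encodes the four workloads from the table above as plain Python data and issues a matching Get/Put/Delete mix against a Swift endpoint with python-swiftclient. It is a hand-rolled stand-in for what COSBench automates, not part of MOS; the endpoint, credentials, container name, and object population are placeholders.

```python
import random
import swiftclient  # pip install python-swiftclient

# Workload definitions from the table above: object size range (KB) and op mix.
WORKLOADS = {
    "A": {"size_kb": (1, 128),       "mix": {"get": 0.90, "put": 0.05, "delete": 0.05}},
    "B": {"size_kb": (1, 128),       "mix": {"get": 0.05, "put": 0.90, "delete": 0.05}},
    "C": {"size_kb": (1024, 131072), "mix": {"get": 0.90, "put": 0.05, "delete": 0.05}},  # 1-128 MB
    "D": {"size_kb": (1024, 131072), "mix": {"get": 0.05, "put": 0.90, "delete": 0.05}},
}

def run(workload: str, conn: swiftclient.Connection, container: str, ops: int = 100) -> None:
    """Issue a Get/Put/Delete mix drawn from the given workload definition."""
    spec = WORKLOADS[workload]
    names = [f"obj-{i}" for i in range(32)]           # small, fixed object population
    for _ in range(ops):
        op = random.choices(list(spec["mix"]), weights=list(spec["mix"].values()))[0]
        name = random.choice(names)
        if op == "put":
            size = random.randint(*spec["size_kb"]) * 1024
            conn.put_object(container, name, contents=b"x" * size)
        elif op == "get":
            try:
                conn.get_object(container, name)
            except swiftclient.ClientException:
                pass                                   # object may not exist yet
        else:
            try:
                conn.delete_object(container, name)
            except swiftclient.ClientException:
                pass

# Placeholder endpoint and credentials (replace with a real Swift deployment).
conn = swiftclient.Connection(authurl="http://proxy:8080/auth/v1.0",
                              user="test:tester", key="testing")
conn.put_container("bench")
run("A", conn, "bench")
```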
Experimental setup for the motivational study
[Figure: COSBench client nodes (8 cores) drive proxy servers (32 cores), which connect to storage servers (32 cores, 3 SATA SSDs per node); the testbed includes both 1 Gbps and 10 Gbps network links]
Configuration 1 – Default monolithic
[Figure: a single monolithic store; COSBench load is spread round-robin across all proxy servers, which share all storage nodes and both the 1 Gbps and 10 Gbps links]
Configuration 2 – Favors small objects
[Figure: proxy, network, and storage resources are partitioned between a small-object store and a large-object store, with the split favoring small objects]
Configuration 3 – Favors large objects
[Figure: the same partitioning into small-object and large-object stores, with the split favoring large objects]
Performance under a multi-tenant environment – Workloads A & B
[Figure: throughput (QPS) across Workloads A–D for Configurations 1, 2, and 3; small-object and large-object panels]
Performance under a multi-tenant environment – Workloads A & B
[Figure: throughput (MB/s) across Workloads A–D for Configurations 1, 2, and 3; small-object and large-object panels]
Performance under a multi-tenant environment – latency
[Figure: average latency (sec) across Workloads A–D for Configurations 1, 2, and 3; small-object and large-object panels]
Key insights
• Cloud object store workloads can be classified by the size of the objects they access
• When multiple tenants run workloads with drastically different behaviors, they compete with each other for object store resources
Outline: Introduction, Motivation, Contribution, Design, Evaluation
Contributions
• Perform a performance and resource-efficiency analysis of the major hardware and software configuration opportunities
• Design MOS, Micro Object Storage, which (1) dynamically provisions fine-grained microstores and (2) exposes the microstore interfaces to tenants
• Evaluate MOS to showcase its advantages
Outline: Introduction, Motivation, Contribution, Design, Evaluation
Design criteria for MOS
We studied the effect of three knobs on the performance of a typical object store to derive design rules of thumb (the corresponding tunables are sketched below):
• Proxy server settings
• Storage server settings
• Hardware changes
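To make the knobs concrete, here is a minimal sketch of per-microstore tunables corresponding to the three knobs; the field names are illustrative, not MOS's actual configuration schema.

```python
# Illustrative per-microstore tunables corresponding to the three knobs above.
MICROSTORE_CONFIG = {
    "proxy": {"workers": 16},           # proxy server settings
    "storage": {"object_workers": 8},   # storage server settings
    "hardware": {                       # hardware choices
        "nic_gbps": 10,                 # 1 or 10 Gbps NICs
        "disk": "ssd",                  # "hdd" or "ssd"
    },
}
```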
Effect of proxy server settings
[Figure: small-object throughput (10³ QPS) and per-node CPU utilization vs. number of proxy workers, with throughput leveling off as CPU utilization reaches 100%; large-object throughput (GB/s) vs. number of proxy workers, leveling off at the 10 Gbps NIC bandwidth limit]
Effect of storage server settings
[Figure: throughput vs. number of object storage workers, for small objects (10³ QPS) and large objects (GB/s)]
Effect of hardware settings
[Figure: throughput with HDD vs. SSD storage and 1 Gbps vs. 10 Gbps NICs, for small objects (10³ QPS) and large objects (GB/s)]
Rules of thumb
• CPU on the proxy is the first-priority resource for small-object-intensive workloads
• Network bandwidth is more important than CPU on the proxy for large-object-intensive workloads
• Sizing rules (see the sketch below): proxyCores = storageNodes × coresPerStorageNode and BW_proxies = storageNodes × BW_storageNode
• A faster network cannot effectively improve QPS for small-object-intensive workloads, so pair a weaker network (1 Gbps NICs) with good storage devices (SSDs)
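A minimal sketch of the two sizing rules above, applied to a microstore's storage sub-cluster; the function and type names are illustrative, not MOS code. Summing over nodes simply generalizes the homogeneous-node formulas to a heterogeneous set of storage nodes.

```python
from dataclasses import dataclass

@dataclass
class StorageNode:
    cores: int          # CPU cores on the storage node
    nic_gbps: float     # NIC bandwidth of the storage node

def proxy_cores_needed(storage_nodes: list[StorageNode]) -> int:
    """Small-object rule: proxyCores = storageNodes * coresPerStorageNode."""
    return sum(node.cores for node in storage_nodes)

def proxy_bandwidth_needed(storage_nodes: list[StorageNode]) -> float:
    """Large-object rule: BW_proxies = storageNodes * BW_storageNode (in Gbps)."""
    return sum(node.nic_gbps for node in storage_nodes)

# Example: three 32-core storage nodes, each with a 10 Gbps NIC.
nodes = [StorageNode(cores=32, nic_gbps=10.0)] * 3
print(proxy_cores_needed(nodes))       # 96 proxy cores for a small-object microstore
print(proxy_bandwidth_needed(nodes))   # 30 Gbps of aggregate proxy bandwidth for large objects
```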
MOS design
[Figure: MOS architecture. Each microstore (1…N) has its own load balancer/load redirector, workload monitor, proxy servers, and object storage servers. A resource manager provisions microstores from a shared free resource pool.]
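The sketch below illustrates the role of the load redirector in this architecture: requests are routed to the tenant's microstore and spread across that microstore's proxies. It is an illustrative simplification under assumed names, not MOS's implementation.

```python
import itertools

class Microstore:
    """A fine-grained store: a set of proxies plus storage nodes for one tenant."""
    def __init__(self, name, proxies, storage_nodes):
        self.name = name
        self.proxies = proxies
        self.storage_nodes = storage_nodes
        self._rr = itertools.cycle(proxies)   # round-robin over this microstore's proxies

    def pick_proxy(self):
        return next(self._rr)

class LoadRedirector:
    """Maps each tenant to its microstore and forwards requests to one of its proxies."""
    def __init__(self):
        self.tenant_to_store = {}

    def register(self, tenant, microstore):
        self.tenant_to_store[tenant] = microstore

    def route(self, tenant, request):
        store = self.tenant_to_store[tenant]
        return f"{request} -> {store.name} via {store.pick_proxy()}"

# Hypothetical deployment: one small-object and one large-object microstore.
redirector = LoadRedirector()
redirector.register("web-tenant", Microstore("small-objects", ["proxy-1", "proxy-2"], ["sn-1"]))
redirector.register("backup-tenant", Microstore("large-objects", ["proxy-3"], ["sn-2", "sn-3"]))
print(redirector.route("web-tenant", "GET /photos/cat.jpg"))
```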
Resource provisioning algorithm
• Initially, the algorithm conservatively allocates the same amount of resources to each microstore, then uses a greedy approach for subsequent allocation
• It keeps track of the set of free resources, including their hardware configuration, the load currently served, and resource utilization such as CPU and network bandwidth utilization
• It periodically collects monitoring data from each microstore and, based on it, aggressively increases or linearly decreases each microstore's resources (a sketch follows this list)
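A minimal sketch of the provisioning loop described above, under the assumption that "aggressive increase" means multiplicative growth from the free pool and "linear decrease" means releasing a fixed step per interval; the thresholds and step sizes are made-up parameters, not values from the paper.

```python
def provision_step(allocations, utilization, free_pool,
                   high=0.8, low=0.3, grow=2.0, shrink_step=1):
    """One periodic pass: grow overloaded microstores aggressively from the free
    pool, shrink underloaded ones linearly back into it.

    allocations: {microstore: allocated resource units}
    utilization: {microstore: observed utilization in [0, 1]}
    free_pool:   number of unallocated resource units
    """
    for store, alloc in allocations.items():
        util = utilization[store]
        if util > high and free_pool > 0:
            # Aggressive (multiplicative) increase, capped by what is free.
            want = int(alloc * grow) - alloc
            grant = min(want, free_pool)
            allocations[store] += grant
            free_pool -= grant
        elif util < low and alloc > shrink_step:
            # Linear decrease: release a fixed step back to the free pool.
            allocations[store] -= shrink_step
            free_pool += shrink_step
    return allocations, free_pool

# Example: two microstores start with equal, conservative allocations.
allocs = {"small-objects": 4, "large-objects": 4}
utils = {"small-objects": 0.95, "large-objects": 0.10}
print(provision_step(allocs, utils, free_pool=8))
# -> small-objects grows to 8 units, large-objects shrinks by one unit
```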
Outline: Introduction, Motivation, Contribution, Design, Evaluation
Preliminary evaluation via simulation – experimental setup
• Compute nodes: 3 × 32-core, 4 × 16-core, 31 × 8-core, and 12 × 4-core machines
• Network: 18 × 10 Gbps and 32 × 1 Gbps NICs
• Storage: 70% HDD, 30% SSD (restated as data below)
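For reference, the simulated resource pool can be written down directly from the numbers above; this is just the setup restated as data, with illustrative field names.

```python
# Simulated cluster from the setup above.
SIMULATED_CLUSTER = {
    "compute_nodes": {32: 3, 16: 4, 8: 31, 4: 12},   # cores -> machine count
    "nics":          {10: 18, 1: 32},                # Gbps  -> NIC count
    "storage_mix":   {"hdd": 0.70, "ssd": 0.30},     # fraction of devices
}

total_cores = sum(cores * count for cores, count in SIMULATED_CLUSTER["compute_nodes"].items())
print(total_cores)   # 3*32 + 4*16 + 31*8 + 12*4 = 456 cores in total
```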
Aggregated throughput
[Figure: aggregate throughput over a 20-minute run for Default, MOS static, and MOS dynamic; small objects in 10³ QPS, large objects in GB/s]
Timeline under dynamically changing workloads
[Figure: throughput over time (10³ QPS and GB/s) as the workload mix shifts across Workloads A–D in four stages]
Resource utilization timeline
[Figure: CPU and network utilization over time for the microstores serving Workloads A, B, C, and D]