Optimizing Hash-based Distributed Storage Using Client Choices
Peilun Li and Wei Xu
Institute for Interdisciplinary Information Sciences, Tsinghua University
Data Placement Design #1
• Centralized management: GFS, HDFS, …
[Diagram: the client asks the name server to resolve data name → server name and server name → server IP, then reads or writes the data on the chosen data server directly.]
Data Placement Design #2
• Hash-based distributed management: Ceph, Dynamo, FDS, …
[Diagram: the client computes data name → server name with a hash function; a monitor server only maintains the server name → server IP map; the client then accesses the data server directly.]
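A minimal sketch of the hash-based mapping just described, assuming a toy modular hash and invented names (real systems such as Ceph use CRUSH rather than this):

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative only: the client hashes the data name to pick a server and
// only needs the monitor's small server-name -> IP map.  Real systems use
// more elaborate placement functions than this modular hash.
struct Cluster {
    std::vector<std::string> server_names;              // e.g. {"osd.0", "osd.1", ...}
    std::unordered_map<std::string, std::string> ips;   // server name -> IP, from the monitor

    const std::string& place(const std::string& data_name) const {
        std::size_t idx = std::hash<std::string>{}(data_name) % server_names.size();
        return server_names[idx];
    }

    const std::string& resolve(const std::string& server_name) const {
        return ips.at(server_name);
    }
};
```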
Pros and Cons of Different Designs
• Centralized management
  • Pros: global performance optimization.
  • Cons: the centralized name server can become a bottleneck.
• Hash-based distributed management
  • Pros: avoids the centralized-server bottleneck.
  • Cons: fixed placement makes it hard to do optimization; some optimizations are vulnerable to changes of the lower-level storage architecture.
Motivation
• We want to use server information to improve system performance in hash-based distributed management.
  • Static information: network structure, failure domain, …
  • Dynamic information: latency, memory utilization, …
• We want a flexible system so that new optimizations for specific applications can be added easily.
  • We do not want to redesign the whole placement algorithm or hash function.
Solution: Multiple Hash Functions
[Diagram: the client hashes the data name with hash functions 1–3, producing candidate servers 1–3; a policy picks one of them (here server 2) to hold the data; the monitor still only maps server name → server IP.]
Solution: Multiple Hash Functions
• We can use multiple hash functions to provide multiple choices, and choose the best one with a fixed policy.
  • Different servers provide different performance.
• A performance requirement, or even a specific application, can have its own optimization policy.
• Easy to implement as an independent module.
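A sketch of how multiple hash functions can be derived by salting a single hash with the choice index, giving k candidate servers for one data name (the salting scheme and names are assumptions for illustration, not the paper's exact construction):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hash function i is hash(data_name + "#" + i); each one maps the data name
// to a candidate server, and duplicates are skipped so the k choices land on
// k different servers whenever possible.
std::vector<std::size_t> candidate_servers(const std::string& data_name,
                                           std::size_t num_servers,
                                           std::size_t k) {
    std::vector<std::size_t> candidates;
    // Bound the number of attempts so the loop terminates even if hashes collide.
    for (std::size_t i = 0; candidates.size() < k && i < 4 * k; ++i) {
        std::size_t s =
            std::hash<std::string>{}(data_name + "#" + std::to_string(i)) % num_servers;
        if (std::find(candidates.begin(), candidates.end(), s) == candidates.end())
            candidates.push_back(s);
    }
    return candidates;
}
```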
How Does Write Work Now?
[Diagram: using the multi-hash candidates (servers 1–3), the client sends a Write-Query to each; the servers reply "no data" plus their performance metrics; the client picks one (here server 2) with the policy, caches the choice, and writes the data there.]
How Does Read Work Now?
[Diagram: the client sends a Read-Query to each multi-hash candidate (servers 1–3); the server that has the data (here server 2) answers; the client caches the choice and reads the data from it.]
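A sketch of the first-access logic shown in the two protocol slides above (message and field names are invented): every candidate is probed once; a candidate that already holds the object wins a read, while a fresh write is decided by the policy.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical reply to a Read-Query / Write-Query probe: whether the server
// holds the object, plus piggybacked performance metrics for the policy.
struct ProbeReply {
    std::size_t server;
    bool        has_data;
    double      free_space_gb;
    double      free_memory_gb;
};

// A policy ranks the probe replies and returns the chosen server.
using Policy = std::function<std::size_t(const std::vector<ProbeReply>&)>;

// First access to an object: if some candidate already has the data, it is
// the only valid choice (read path); otherwise the policy decides where the
// new data should go (write path).
std::size_t first_access(const std::vector<ProbeReply>& replies, const Policy& policy) {
    for (const auto& r : replies)
        if (r.has_data) return r.server;
    return policy(replies);
}
```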
Simple Server
• Gather server performance metrics.
  • CPU/memory/disk utilization, average read/write latency, unflushed journal size, …
• Answer client probing.
  • Check whether the requested data exists on this server.
  • Piggyback server metrics with the probing results.
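A rough sketch of what the small server module does, with invented names and placeholder metric collection (the actual patch on Ceph will look different): answer a probe with the existence check and piggyback the local metrics in the same reply.

```cpp
#include <string>
#include <unordered_set>

// Metrics the server reports with every probe reply (field names illustrative).
struct ServerMetrics {
    double cpu_util;                 // 0.0 - 1.0
    double mem_util;                 // 0.0 - 1.0
    double disk_util;                // 0.0 - 1.0
    double avg_read_latency_ms;
    double avg_write_latency_ms;
    double unflushed_journal_mb;
};

struct ProbeResult {
    bool          has_data;   // does this server hold the requested object?
    ServerMetrics metrics;    // piggybacked, so no extra round trip is needed
};

class ProbeService {
public:
    // Answer a client probe for one object name.
    ProbeResult handle_probe(const std::string& object_name) const {
        return ProbeResult{local_objects_.count(object_name) > 0, collect_metrics()};
    }

private:
    // In the real module these values come from the OS and the storage daemon;
    // here they are placeholders.
    ServerMetrics collect_metrics() const { return ServerMetrics{}; }

    std::unordered_set<std::string> local_objects_;   // objects stored locally
};
```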
Clever Client
• Provide multiple choices.
• Probe the server choices before the first access.
• Make a choice if new data needs to be written.
• Cache the choice after the first access.
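A sketch of the client-side choice cache, assuming a probe RPC and a policy function are supplied from outside (all names are hypothetical): the probe and the decision happen only on the first access, and every later access reuses the cached choice.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Minimal probe reply for this sketch (see the server-side sketch above).
struct ProbeReply { std::size_t server; bool has_data; double free_memory_gb; };

class ChoiceCache {
public:
    using ProbeFn  = std::function<std::vector<ProbeReply>(const std::string&)>;
    using ChooseFn = std::function<std::size_t(const std::vector<ProbeReply>&)>;

    ChoiceCache(ProbeFn probe, ChooseFn choose)
        : probe_(std::move(probe)), choose_(std::move(choose)) {}

    // Return the server for this object, probing only on the first access.
    std::size_t server_for(const std::string& object_name) {
        auto it = cache_.find(object_name);
        if (it != cache_.end()) return it->second;          // cached: no probe round trip

        std::size_t chosen = choose_(probe_(object_name));  // probe all candidates once
        cache_[object_name] = chosen;                       // remember for later accesses
        return chosen;
    }

private:
    ProbeFn  probe_;
    ChooseFn choose_;
    std::unordered_map<std::string, std::size_t> cache_;
};
```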
Making the Best Choice
• A policy takes server information as input and outputs the best choice.
• Example policies (evaluated in the following slides): space, local, memory, cpu, latency, journal.
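A sketch of how such policies can be written as interchangeable functions over the piggybacked metrics (the Candidate fields and policy signatures are assumptions, not the system's exact interface): each policy simply ranks the candidates by one metric.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Per-candidate information available to a policy (illustrative fields).
struct Candidate {
    std::size_t server;
    double      free_space_gb;
    double      free_memory_gb;
    int         network_distance;   // e.g. 0 = same host, 1 = same rack, 2 = cross rack
};

// A policy maps the candidate list to the chosen server.
using Policy = std::function<std::size_t(const std::vector<Candidate>&)>;

// "space": the server with the most free disk space.
std::size_t space_policy(const std::vector<Candidate>& cs) {
    return std::max_element(cs.begin(), cs.end(),
        [](const Candidate& a, const Candidate& b) {
            return a.free_space_gb < b.free_space_gb;
        })->server;
}

// "memory": the server with the most free memory (largest file system cache).
std::size_t memory_policy(const std::vector<Candidate>& cs) {
    return std::max_element(cs.begin(), cs.end(),
        [](const Candidate& a, const Candidate& b) {
            return a.free_memory_gb < b.free_memory_gb;
        })->server;
}

// "local": the closest server in the network topology.
std::size_t local_policy(const std::vector<Candidate>& cs) {
    return std::min_element(cs.begin(), cs.end(),
        [](const Candidate& a, const Candidate& b) {
            return a.network_distance < b.network_distance;
        })->server;
}
```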
Implementation
• We implement it on top of Ceph.
  • About 140 lines of C++ code for the server module.
  • Easy to implement on other systems.
• Only the block device interface is supported for now.
  • It ensures that only one client is accessing the block device data.
Evaluation Setup
• Testbed cluster:
  • 3 machines.
  • 15 × 4 TB hard drives.
  • 2 × 12-core 2.1 GHz Xeon CPUs.
  • 128 GB memory.
  • 10 Gb NIC.
  • Workloads are generated with the librbd engine of FIO; 8 images are read/written concurrently with a 4 MB block size on the same machine.
• Production cluster:
  • 44 machines.
  • 4 × 4 TB hard drives and a 256 GB SSD.
  • 2 × 10 Gb NICs.
  • Workloads are generated with the webserver module of FileBench.
• The number of choices is fixed to 2.
Policy space Saves Disk Space
• space chooses the server with the most free space to store data.
• A hash-based storage system is full as soon as any single disk is full.
[Chart: disk capacity utilization when the system becomes full — 73% for the baseline vs. 96% with space.]
Policy local Reduces Network Bottleneck
• local chooses the closest server to store data.
• Can save cross-rack network bandwidth.
[Charts: throughput (MB/s) over time for baseline vs. local on the testbed and on the production cluster; on the production cluster local reaches 12947.2 MB/s vs. 7963.1 MB/s for the baseline.]
Policy memory Improves Read Throughput
• memory chooses the server with the most free memory.
  • Servers coexist with other running programs.
  • More free memory ⇒ larger file system buffer cache ⇒ better read performance.
[Chart: throughput (MB/s) over time for baseline vs. memory.]
Inefficient Policies
• Policies cpu, latency, and journal do not work well.
[Charts: throughput (MB/s) over time for baseline vs. cpu, baseline vs. latency, and baseline vs. journal; each policy performs about the same as the baseline.]
Why Are They Inefficient?
• The Ceph server is not CPU intensive under this hardware configuration.
• Queue-based transient metrics, e.g. unflushed journal size, change too fast, so we cannot obtain a consistent measurement.
• However, applying ineffective policies still gives performance similar to the baseline!
Summary of Different Policies
• General improvement:

  Policy   Performance Change         Improvement
  local    1545 MB/s → 1900 MB/s      23.0%
  memory   778 MB/s → 1403 MB/s       80.3%
  space    73% → 96%                  31.5%
  cpu      1545 MB/s → 1513 MB/s      -1.9%
  latency  402 MB/s → 396 MB/s        -1.5%
  journal  402 MB/s → 396 MB/s        -1.5%
Probing Overhead
• The most significant overhead is server probing.
[Charts: latency CDFs (percentile vs. latency in ms) for 4 MB sequential writes and 4 KB random writes, comparing 2 choices against no probing.]
Discussion about Probing Overhead
• Probing adds 2.7 ms of average latency because of the extra round trip.
• Latency increases by 2.7% for large sequential writes and 6.9% for small random writes.
• Probing is only done on a client's first access.
  • The overhead is amortized over all subsequent accesses of an object.
Future Work
• Develop more advanced choice policies based on multiple metrics.
• Provide an application-level API, so the application itself can make the choices.
• Explore different ways to collaboratively cache the choice information, in order to reduce the number of probes.
Conclusion
• Hash-based design in distributed systems can be flexible as well.
• Best-effort statistical optimization can be both simple and efficient.
• Without significant queueing effects, the power of two choices may not work well in a real system.
Thank You
We are hiring: faculty members and postdocs in any CS field.
Contact: weixu@tsinghua.edu.cn