Data Storage Revolution Relational Databases Object Storage - PowerPoint PPT Presentation

Data Storage Revolution • Relational Databases • Object Storage (put/get) – Dynamo – PNUTS – CouchDB – MemcacheDB – Cassandra • Speed • Scalability • Availability • Throughput • No Complexity

Eventual Consistency [Figure: a write request and read requests arrive at different replicas (Replica A, Replica B) coordinated by a manager]

Eventual Consistency • Writes ordered after commit • Reads can be out-of-order or stale • Easy to scale, high throughput • Difficult application programming model

Traditional Solution to Consistency • Two-Phase Commit: 1. Prepare 2. Vote: Yes 3. Commit 4. Ack [Figure: a write request reaches the manager, which runs the four steps with every replica]
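The four steps on the slide can be sketched in a few lines. This is a minimal, single-process illustration and not code from the talk: the `Replica` class, its `healthy` flag, and `two_phase_write` are hypothetical names, and real deployments also need timeouts, logging, and recovery.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica used only for this sketch."""
    name: str
    healthy: bool = True
    staged: dict = field(default_factory=dict)  # prepared, not yet visible
    store: dict = field(default_factory=dict)   # committed values

    def prepare(self, key, value):
        # Steps 1-2: stage the write and vote yes only if able to commit.
        if not self.healthy:
            return False
        self.staged[key] = value
        return True

    def commit(self, key):
        # Steps 3-4: make the staged value visible, then acknowledge.
        self.store[key] = self.staged.pop(key)
        return True

    def abort(self, key):
        # Discard any staged value for this key.
        self.staged.pop(key, None)


def two_phase_write(replicas, key, value):
    """Manager side: the write commits on every replica or on none."""
    votes = [r.prepare(key, value) for r in replicas]
    if not all(votes):
        for r in replicas:
            r.abort(key)
        return False
    acks = [r.commit(key) for r in replicas]
    return all(acks)
```

Every write waits for two round trips to all replicas, which is one way to see why the next slide calls this approach expensive.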

Strong Consistency • Reads and Writes strictly ordered • Easy programming • Expensive implementation • Doesn’t scale well

Our Goal • Easy programming • Easy to scale, high throughput

Chain Replication (van Renesse & Schneider, OSDI 2004) [Figure: replicas arranged in a chain from HEAD to TAIL under a manager; write requests W1, W2 enter at the head and read requests R1, R2, R3 go to the tail]

Chain Replication • Strong consistency • Simple replication • Increases write throughput • Low read throughput • Can we increase throughput? • Insight: – Most applications are read-heavy (100:1)
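A minimal sketch of basic chain replication, assuming an in-memory chain with no failures. The `Node` and `Chain` classes are hypothetical names for illustration. Writes enter at the head and travel node by node to the tail, and all reads go to the tail, which is why read throughput stays at that of a single node.

```python
class Node:
    """One replica in the chain (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.store = {}
        self.next = None  # successor in the chain; None at the tail


class Chain:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        # Link each node to its successor.
        for a, b in zip(self.nodes, self.nodes[1:]):
            a.next = b
        self.head, self.tail = self.nodes[0], self.nodes[-1]
        self.tail_reads = 0  # counts how many reads the tail served

    def write(self, key, value):
        # The write is applied at the head, then forwarded down the chain.
        node = self.head
        while node is not None:
            node.store[key] = value
            node = node.next
        return "ack"  # returned once the tail has applied the write

    def read(self, key):
        # Only the tail answers reads, so every read sees a committed value.
        self.tail_reads += 1
        return self.tail.store.get(key)
```

For example, `Chain(["head", "r1", "tail"])` followed by `write("k", "v1")` and `read("k")` returns `"v1"`, with the tail serving the read regardless of chain length.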

CRAQ • Two states per object – clean and dirty [Figure: read requests arrive at every node in the chain (HEAD, three replicas, TAIL); each node holds version V1]

CRAQ • Two states per object – clean and dirty • If latest version is clean, return value • If dirty, contact tail for latest version number [Figure: a write request carries V2 down the chain from HEAD to TAIL while read requests arrive; nodes that have seen the write hold both V1 and V2]
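The read rule on this slide can be sketched as follows. This is an illustrative, single-process model under stated assumptions: `CraqNode`, `craq_read`, and the `version_queries` counter are hypothetical names, version numbers are plain integers, and the acknowledgment that marks a version clean is modeled by calling `mark_clean` directly.

```python
class CraqNode:
    """Hypothetical replica that may hold several versions of an object."""

    def __init__(self, name):
        self.name = name
        self.versions = {}        # key -> {version number: value}
        self.committed = {}       # key -> newest version known to be clean
        self.version_queries = 0  # version lookups answered by this node

    def apply_write(self, key, version, value):
        # A write passing down the chain adds a new, still dirty, version.
        self.versions.setdefault(key, {})[version] = value

    def mark_clean(self, key, version):
        # The acknowledgment arrived: record it and drop older versions.
        self.committed[key] = version
        held = self.versions[key]
        self.versions[key] = {v: val for v, val in held.items() if v >= version}

    def latest_committed(self, key):
        # Called on the tail by other nodes that hold a dirty object.
        self.version_queries += 1
        return self.committed.get(key)


def craq_read(node, tail, key):
    held = node.versions.get(key)
    if not held:
        return None
    newest = max(held)
    if node.committed.get(key) == newest:
        return held[newest]  # clean: answer locally
    # Dirty: ask the tail which version is committed, then answer locally.
    version = tail.latest_committed(key)
    return held.get(version)
```

Only the small version-number query reaches the tail; the object itself is still served by the node that received the read.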

Multicast Optimizations • Each chain forms group • Tail multicasts ACKs [Figure: the tail's acknowledgment for V2 reaches HEAD and all replicas at once; nodes holding V1, V2 keep only V2]

Multicast Optimizations • Each chain forms group • Tail multicasts ACKs • Head multicasts write data [Figure: a write request for V3 arrives at HEAD, which multicasts the data; nodes hold V2, V3 until the write commits]

CRAQ Benefits • From Chain Replication – Strong consistency – Simple replication – Increases write throughput • Additional Contributions – Read throughput scales: Chain Replication with Apportioned Queries – Supports Eventual Consistency

High Diversity • Many data storage systems assume locality – Well connected, low latency • Real large applications are geo-replicated – To provide low latency – Fault tolerance (source: Data Center Knowledge)

Multi-Datacenter CRAQ [Figure: one chain spanning three datacenters, with the HEAD in DC1, replicas in DC1, DC2, and DC3, and the TAIL in DC3]

Multi-Datacenter CRAQ [Figure: the same chain across DC1, DC2, and DC3, now with clients attached to replicas]

Chain Configuration • Motivation: 1. Popular vs. scarce objects 2. Subset relevance 3. Datacenter diversity 4. Write locality • Solution: 1. Specify chain size 2. List datacenters – dc1, dc2, … dcN 3. Separate sizes – dc1, chain_size1, … 4. Specify master
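One way to picture the four options is as a small configuration record. The sketch below is hypothetical and is not the format CRAQ actually uses: `ChainConfig`, `uniform`, and `per_datacenter` are invented names, and it assumes that a single chain size applies to every listed datacenter when no separate sizes are given.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ChainConfig:
    """Hypothetical description of where one chain places its replicas."""
    sizes: Dict[str, int]         # datacenter name -> replicas in that datacenter
    master: Optional[str] = None  # datacenter that should hold the head, if any

    def total_replicas(self) -> int:
        return sum(self.sizes.values())


def uniform(datacenters: List[str], chain_size: int,
            master: Optional[str] = None) -> ChainConfig:
    # Options 1 and 2: one chain size, applied to each listed datacenter
    # (an assumption made for this sketch).
    return ChainConfig({dc: chain_size for dc in datacenters}, master)


def per_datacenter(pairs: List[Tuple[str, int]],
                   master: Optional[str] = None) -> ChainConfig:
    # Option 3: dc1, chain_size1, ... gives each datacenter its own size.
    return ChainConfig(dict(pairs), master)
```

A popular object could then use `uniform(["dc1", "dc2", "dc3"], 2)`, while an object relevant mostly to one region could use `per_datacenter([("dc1", 3), ("dc2", 1)], master="dc1")` so that writes start where the writers are.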

Master Datacenter [Figure: chains spanning DC1, DC2, and DC3 with heads in different datacenters; a writer in DC1 sends to the chain whose HEAD is in DC1]

Implementation • Approximately 3,000 lines of C++ • Uses Tame extensions to SFS asynchronous I/O and RPC libraries • Network operations use Sun RPC interfaces • Uses Yahoo’s ZooKeeper for coordination

Coordination Using ZooKeeper • Stores chain metadata • Monitors/notifies about node membership [Figure: datacenters DC1, DC2, and DC3 each run several CRAQ nodes alongside a ZooKeeper node]

Evaluation • Does CRAQ scale vs. CR? • How does write rate impact performance? • Can CRAQ recover from failures? • How does WAN affect CRAQ? • Tests use Emulab network emulation testbed

Read Throughput as Writes Increase [Plot: reads/s (0 to 15000) against writes/s (0 to 100) for CRAQ-7, CRAQ-3, and CR-3, with levels marked 1x, 3x, and 7x]

Failure Recovery (Read Throughput) [Plot: reads/s (0 to 60000) against time (0 to 50 s) for chains of length 3, 5, and 7]

Failure Recovery (Latency) [Plots: read latency (ms) and write latency (ms), each against time (0 to 20 s)]

Geo-replicated Read Latency [Plot: mean latency (0 to 80 ms) against writes/s (0 to 20) for CR and CRAQ]

If Single Object Put/Get Is Insufficient • Test-and-Set, Append, Increment – Trivial to implement – Head alone can evaluate • Multiple object transaction in same chain – Can still be performed easily – Head alone can evaluate • Multiple chains – An agreement protocol (2PC) can be used – Only heads of chains need to participate – Degrades performance, though (use carefully!)
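The claim that the head alone can evaluate single-object operations can be illustrated with a short sketch. It assumes the head sees every write first and therefore always knows the newest version of each object. The `Head` class, its method names, and the `forwarded` list (a stand-in for sending the write down the chain) are hypothetical.

```python
class Head:
    """Hypothetical chain head that orders all writes for its objects."""

    def __init__(self):
        self.latest = {}     # key -> (version, value) of the newest write
        self.forwarded = []  # writes handed to the rest of the chain

    def put(self, key, value):
        version = self.latest.get(key, (0, None))[0] + 1
        self.latest[key] = (version, value)
        self.forwarded.append((key, version, value))
        return version

    def test_and_set(self, key, expected_version, value):
        # Write only if nobody has written since the caller's last read.
        current = self.latest.get(key, (0, None))[0]
        if current != expected_version:
            return None  # rejected; nothing is sent down the chain
        return self.put(key, value)

    def increment(self, key, delta=1):
        # Read-modify-write decided entirely at the head.
        current = self.latest.get(key, (0, 0))[1]
        return self.put(key, current + delta)
```

A rejected `test_and_set` never leaves the head, so the rest of the chain only ever sees writes that were accepted.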

Summary • CRAQ Contributions? – Challenges the trade-off between consistency and throughput • Provides strong consistency • Throughput scales linearly for read-mostly workloads • Supports wide-area deployments of chains • Provides atomic operations and transactions Thank You. Questions?
