A Cloud-native Architecture for Replicated Data Services Hemant - PowerPoint PPT Presentation

A Cloud-native Architecture for Replicated Data Services
Hemant Saxena, Jeffery Pound
University of Waterloo, SAP Labs Waterloo

Outline
● Problem overview
● Solution overview
  ○ Kafka
  ○ Cassandra
● Evaluation

Problem overview
➢ Cloud has become the de facto standard for deploying applications
➢ However, applications designed for on-premise infrastructure find it challenging to leverage Cloud storage efficiently, because:
  ○ Data replication on-premise provides fault-tolerance (FT) and high availability (HA)
  ○ Whereas Cloud storage already uses replication to provide FT and HA
  ○ Making the application's replication redundant, resulting in additional storage cost

Typical replicated application on-premise
[Figure: a client, the replicated application, and its replica-set]

Typical replicated application on Cloud
- Application-level replication (replica-set)
- Storage-level replication
- Resulting in redundant replicas
- Introducing additional storage cost
[Figure: a client, the replicated application with its replica-set, and the storage service]

Problem overview
We ask the following research question...
How can we easily allow applications designed for on-premise infrastructure to efficiently leverage Cloud storage?

Naïve solution
- Have one replica (i.e. no application-level replication)
- Solves the problem of redundant replication
- But it is prone to node failure, hence not highly available.
[Figure: replica-set]

Contributions of this work
➢ We show how a well-known main-delta architecture can be used to leverage cloud storage efficiently
  ○ i.e. ensure no redundant replication
  ○ while maintaining the fault-tolerance and availability guarantees of the applications
➢ We show that incorporating main-delta architecture in existing on-premise applications is easy
  ○ by controlling how buffers are managed and flushed to storage
  ○ and it is compatible with the whole spectrum of replication strategies

Quick recap of main-delta architecture
➢ Originally designed for efficiently handling mixed read/update workloads
➢ Two parts
  ○ Static, read-only, read-optimized main
  ○ Small, write-optimized delta
  ○ Deltas are merged with the main at regular intervals
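To make the recap concrete, here is a minimal sketch of a main-delta key-value store. It is an illustration only, not the authors' implementation: the class name `MainDeltaStore` and the size-based `merge_threshold` trigger are assumptions (the slide says merges happen at regular intervals; a size trigger is used here to keep the example deterministic).

```python
class MainDeltaStore:
    """Toy key-value store split into a read-only main and a small delta."""

    def __init__(self, merge_threshold=4):
        self.main = {}    # static, read-optimized part (replaced only on merge)
        self.delta = {}   # small, write-optimized part
        self.merge_threshold = merge_threshold  # hypothetical size trigger

    def put(self, key, value):
        # All writes go to the delta; the main is never updated in place.
        self.delta[key] = value
        if len(self.delta) >= self.merge_threshold:
            self.merge()

    def get(self, key, default=None):
        # The delta holds the newest values, so it is consulted first.
        if key in self.delta:
            return self.delta[key]
        return self.main.get(key, default)

    def merge(self):
        # Build a new main from the old main plus the delta, then reset the delta.
        self.main = {**self.main, **self.delta}
        self.delta = {}
```

Reads see a consistent view because the delta shadows the main until the next merge.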

Solution overview
- Replicated local deltas, maintained by application
- But single shared main on Cloud storage (which is fault-tolerant)
[Figure: replica-set diagram with elements labelled M]

Solution overview
- Replicated local deltas, maintained by application
- But single shared main on Cloud storage (which is fault-tolerant)
How to merge the deltas?
[Figure: replica-set diagram with elements labelled M]

Merging Deltas to Main
➢ Details are in how the delta is merged to the main such that
  ○ No data is lost from any deltas
  ○ And applications have the same guarantees as on-premise deployment
➢ Delta-merge strategy depends on the replication strategy
  ○ Single primary node means single delta to merge
  ○ Multiple primary nodes means multiple deltas to merge

Classification of replication strategies
▪ Write to any, read from any (e.g. quorum)
▪ Write to primary, read from primary
▪ Write to primary, read from any
[Figure: one panel per strategy, each showing a request-handler and a replica-set]
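The link between replication strategy and merge work can be sketched as a small lookup. The enum and function names below are hypothetical; the rule itself (a single primary means a single delta to merge, multiple writers mean multiple deltas) is taken from the slides.

```python
from enum import Enum


class Strategy(Enum):
    WRITE_ANY_READ_ANY = "write to any, read from any"
    WRITE_PRIMARY_READ_PRIMARY = "write to primary, read from primary"
    WRITE_PRIMARY_READ_ANY = "write to primary, read from any"


def deltas_to_merge(strategy, replica_count):
    """Upper bound on the number of distinct deltas that must reach the main."""
    if strategy is Strategy.WRITE_ANY_READ_ANY:
        # Any replica can accept writes, so every replica may hold unique updates.
        return replica_count
    # All writes pass through the primary, so its delta alone is sufficient.
    return 1
```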

Case-study 1: Delta merge for single primary
● Idea: In-memory buffers as deltas, on-disk data as main.
● Only the primary will merge its delta to main. Other replicas will discard their deltas when they are full.
● In case of primary node failure, the new primary node takes the responsibility of merging deltas.
[Figure: replica-set diagram with elements labelled M]
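The single-primary rule can be sketched as a toy simulation. Everything here is an assumption made for illustration: the `Replica` and `ReplicaSet` classes, the record-count `capacity`, and fully synchronous replication, which keeps all deltas identical so a follower can safely discard a full delta. It does not reproduce Kafka's actual replication protocol.

```python
class Replica:
    def __init__(self, name, capacity=3):
        self.name = name
        self.delta = []           # in-memory buffer acting as the delta
        self.capacity = capacity  # hypothetical buffer size, in records

    def is_full(self):
        return len(self.delta) >= self.capacity


class ReplicaSet:
    def __init__(self, names, capacity=3):
        self.replicas = [Replica(n, capacity) for n in names]
        self.primary = self.replicas[0]
        self.main = []  # stands in for the single shared main on cloud storage

    def append(self, record):
        # Assumption: the write reaches every live replica's delta synchronously.
        for r in self.replicas:
            r.delta.append(record)
        for r in self.replicas:
            if r.is_full():
                if r is self.primary:
                    self.main.extend(r.delta)  # only the primary merges
                r.delta.clear()                # followers simply discard

    def fail_primary(self):
        # The next replica takes over and becomes responsible for merging.
        self.replicas.remove(self.primary)
        self.primary = self.replicas[0]
```

Because the new primary still holds the unmerged records in its own delta, a failover before a merge loses nothing in this model.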

Case-study 2: Delta merge for quorum system
● The memtable and sstables can be easily leveraged as delta and main.
● Deciding which node merges the delta is tricky:
  ○ Each node can have a different set of updates
[Figure: replica-set diagram with elements labelled M]

Case-study 2: Delta merge for quorum system
● Nodes flush their deltas to cloud storage
● Background compaction job combines the deltas and merges them into the main
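A background compaction step of this kind can be sketched as follows. This is a sketch under stated assumptions, not the prototype's code: each flushed delta is modelled as a dict mapping a key to a `(timestamp, value)` pair, conflicts are resolved by keeping the highest timestamp (last write wins), and the function name `compact` is hypothetical. Ties between equal timestamps with different values are not handled.

```python
def compact(main, flushed_deltas):
    """Combine flushed deltas and merge them into a new main.

    Every table maps key -> (timestamp, value); the highest timestamp wins.
    The input main is left untouched and a new dict is returned.
    """
    merged = dict(main)
    for delta in flushed_deltas:
        for key, (ts, value) in delta.items():
            # Keep the entry only if it is newer than what is already known.
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return merged
```

With distinct timestamps per key, the result does not depend on the order in which the nodes' deltas are processed, which is what lets a single job merge deltas from nodes that saw different sets of updates.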

Evaluation
● Want to show that our cloud-native design can save storage cost while keeping performance the same
● Tested performance of our prototype on Kafka and Cassandra
  ○ Used real Cloud infrastructure - Amazon Web Services (AWS)
  ○ Tested different storage types - EBS and EFS

Evaluation
● Implementations:
  ○ md-kafka: main-delta architecture based Kafka implementation
  ○ kafka: vanilla Kafka
● 3x storage cost savings
  ○ Replication factor 3x
  ○ Savings by design
● Similar write throughput for block-based storage (EBS)
● Almost 2x throughput improvement for EFS storage, due to batching

Evaluation
● Implementations:
  ○ md-cassandra-efs: main-delta based Cassandra using EFS storage
  ○ cassandra-ebs: vanilla Cassandra using EBS
  ○ cassandra-efs: vanilla Cassandra using EFS
● Close to 2.8x storage cost saving
  ○ With replication factor of 3x
● Almost similar throughput for all 3 types of workloads

Conclusion
➢ Existing on-premise applications (with replication), when deployed on cloud, end up with redundant replication
➢ We proposed a main-delta based cloud-native architecture to solve this problem
  ○ Allowing for storage cost savings up to a factor of k (the application's replication factor)
➢ We show our approach is general enough to work with the complete spectrum of replication strategies
  ○ Simplest strategy: single primary (Kafka case study)
  ○ Complex strategy: quorum-based systems (Cassandra case study)
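The "up to a factor of k" bound can be illustrated with simple arithmetic. The cost model below is an assumption, not taken from the evaluation: the baseline stores k full copies of the data, while the main-delta design stores one shared main plus up to k deltas that occupy storage until they are merged. The function name and the sizes in the usage note are hypothetical.

```python
def storage_saving_factor(k, main_size, delta_size=0.0):
    """Ratio of baseline storage to main-delta storage under the assumed model.

    k          -- application-level replication factor
    main_size  -- size of the data set (one copy), in any unit
    delta_size -- storage occupied by each unmerged delta, in the same unit
    """
    baseline = k * main_size                 # k full application-level copies
    main_delta = main_size + k * delta_size  # one shared main plus k deltas
    return baseline / main_delta
```

With k = 3 and deltas that occupy no storage, the factor is exactly 3; any storage held by unmerged deltas pulls it below k, which is why k is an upper bound.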

Thank you!
Contact for any follow-up questions: Hemant Saxena
email: hemant.saxena@uwaterloo.ca
