Redefining Data Locality for Cross-Data Center Storage


  1. Redefining Data Locality for Cross-Data Center Storage. Kwangsung Oh, Ajaykrishna Raghavan, Abhishek Chandra, and Jon Weissman. Department of Computer Science and Engineering, University of Minnesota Twin Cities.

  2. Background: Private Cloud

  3. Background: Computation, Storage, Network

  4. Background [Figure: App/Server with ElastiCache, EBS, and S3]

  5. Data replication is unavoidable

  6. Questions • Where to store data? • Which datacenter: the local, a nearby, or a remote DC? • Which storage tier: a faster or a slower tier? • When and where to replicate or move data? • Which data? • There is no single answer: the answer changes with user requirements such as QoS (performance), consistency, expected workload, and cost (see the sketch below).
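The trade-off behind "no single answer" can be made concrete with a toy placement policy. The sketch below is purely illustrative and not from the talk: the Requirements fields, the thresholds, and the DC/tier names are all hypothetical assumptions. It only shows how QoS, consistency, workload, and cost jointly drive the choice of datacenter and storage tier.

```python
# Illustrative sketch only: a toy placement policy. The fields,
# thresholds, and DC/tier names are hypothetical, not from the talk.
from dataclasses import dataclass

@dataclass
class Requirements:
    max_latency_ms: float    # QoS target for reads
    strong_consistency: bool
    reads_per_hour: int      # expected workload
    budget_per_gb: float     # cost ceiling (USD/GB/month)

def place(req: Requirements) -> tuple[str, str]:
    """Return a (datacenter, storage tier) choice for one object."""
    # Tight latency targets force a memory tier; a nearby DC's memory
    # can still meet them (the key observation of this talk).
    if req.max_latency_ms < 10:
        dc = "local" if req.strong_consistency else "nearby"
        return dc, "memory"
    # Hot data earns a faster tier; cold data goes to cheap storage.
    if req.reads_per_hour > 1000 and req.budget_per_gb > 5:
        return "nearby", "memory"
    return "local", "disk" if req.reads_per_hour > 10 else "archival"

print(place(Requirements(5, False, 50, 10.0)))   # ('nearby', 'memory')
print(place(Requirements(200, False, 1, 1.0)))   # ('local', 'archival')
```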

  7. Disk-locality in datacenter computing considered irrelevant

  8. Motivation (map from http://www.datacentermap.com)

  9. Key observations • Multiple DCs are in the same region and close to each other. • By using a nearby DC, data locality can be extended. • Data can be stored in a non-local DC's storage with little or no data-locality concern. [Figure: App/Server with Memcache and Disk; Azure Storage, ElastiCache, EBS, S3]

  10. DC locations example

  11. Latency and bandwidth between DCs

Latency (ms) between DCs (row and column give the two DCs' providers within each region):

              US West        US East        Europe West           Asia Southeast
              AWS    Azure   AWS    Azure   AWS     Azure  GC     AWS    Azure
  AWS         -      3.84    -      1.97    -       17.58  16.33  -      1.84
  Azure       3.62   -       1.99   -       18.67   -      16.02  1.98   -
  GC          -      -       -      -       16.35   16.12  -      -      -

Bandwidth (MB/s) between DCs:

              US West        US East        Europe West           Asia Southeast
              AWS    Azure   AWS    Azure   AWS     Azure  GC     AWS    Azure
  AWS         -      48.75   -      48.13   -       48.38  48.63  -      48.88
  Azure       21.62  -       23.63  -       45.25   -      53.5   24.38  -
  GC          -      -       -      -       32.38   40.25  -      -      -
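The talk does not say how these numbers were collected; below is a minimal sketch of how similar measurements could be gathered, assuming latency is approximated by TCP connect time and bandwidth by timing a single HTTP download. The endpoints in the comments are hypothetical placeholders.

```python
# A minimal measurement sketch; the methodology (TCP connect time for
# latency, one timed download for bandwidth) is an assumption, not the
# talk's actual tooling.
import socket
import time
import urllib.request

def connect_latency_ms(host: str, port: int = 443, samples: int = 5) -> float:
    """Median TCP connect time to a DC endpoint, in milliseconds."""
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            pass
        times.append((time.perf_counter() - start) * 1000)
    return sorted(times)[len(times) // 2]

def download_bandwidth_mbs(url: str) -> float:
    """Achieved throughput (MB/s) fetching one object; the timing includes
    connection setup, so use an object large enough to amortize it."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=60) as resp:
        nbytes = len(resp.read())
    return nbytes / (1024 * 1024) / (time.perf_counter() - start)

# Hypothetical endpoints, for illustration only:
# print(connect_latency_ms("s3.us-east-1.amazonaws.com"))
# print(download_bandwidth_mbs("https://example.com/100MB.bin"))
```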

  12. Data Retrieval Time (100KB)

  13. Data Retrieval Time (100KB) in US East

  14. Disk performance of AWS and Azure

  15. Various Data Sizes

  16. Summary of experiments • Accessing data in memory in a nearby DC is faster than accessing a local, slower storage tier. • Accessing data from disk (archival) storage in a nearby DC can be as fast as accessing disk (archival) storage in the local DC. • These trends hold for data sizes up to 1MB (and can extend further), which encompasses many common Internet applications. A back-of-the-envelope model follows below.
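A rough latency model helps explain these trends. The sketch below combines the measured inter-DC numbers from slide 11 (US West, AWS/Azure pair) with assumed, not measured, local disk seek time and disk bandwidth to estimate retrieval time for small objects.

```python
# Rough model using the slide 11 table's US West AWS<->Azure numbers.
# Disk and memory access times are assumed values, not measurements.
RTT_MS = 3.84            # inter-DC latency, from the slide 11 table
BW_MBS = 48.75           # inter-DC bandwidth, from the slide 11 table
DISK_SEEK_MS = 10.0      # assumed: local disk positioning time
DISK_BW_MBS = 100.0      # assumed: local disk sequential throughput
MEM_ACCESS_MS = 0.1      # assumed: in-memory lookup in the nearby DC

def nearby_memory_ms(size_kb: float) -> float:
    """Fetch from a nearby DC's memory tier: RTT + transfer + lookup."""
    return RTT_MS + size_kb / 1024 / BW_MBS * 1000 + MEM_ACCESS_MS

def local_disk_ms(size_kb: float) -> float:
    """Fetch from the local DC's disk: seek + sequential transfer."""
    return DISK_SEEK_MS + size_kb / 1024 / DISK_BW_MBS * 1000

for size_kb in (100, 1024):
    print(f"{size_kb:>5} KB: nearby memory ~{nearby_memory_ms(size_kb):5.1f} ms,"
          f" local disk ~{local_disk_ms(size_kb):5.1f} ms")
# 100 KB: ~5.9 ms from nearby memory vs ~11.0 ms from local disk;
# ~1 MB: ~24.5 ms vs ~20.2 ms, roughly the crossover the slide notes.
```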

  17. Use cases • Simpler Consistency Policy • Using a faster (memory) tier can reduce the number of replicas. • Lowering the number of replicas reduces the network traffic needed for consistency. • Hot and Cold Data • Data can be placed in memory in DC A and on disk or archival storage in DC B based on access patterns (see the sketch below). • Higher Availability • If DC A fails, the application can minimize the performance penalty by using DC B's faster storage.
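For the hot/cold use case, a minimal tiering sketch follows. The threshold, tier names, and bookkeeping are hypothetical; it only illustrates demoting cold objects from DC A's memory to DC B's disk based on a per-window access count, with the actual data migration elided.

```python
# Illustrative hot/cold tiering sketch; all names and the threshold are
# hypothetical, and the cross-DC data copy itself is elided.
from collections import defaultdict

HOT_THRESHOLD = 100                      # accesses per window to stay "hot"
access_counts: defaultdict[str, int] = defaultdict(int)
placement: dict[str, str] = {}           # object key -> current location

def record_access(key: str) -> None:
    access_counts[key] += 1

def rebalance() -> None:
    """Run once per time window: hot objects stay in DC A's memory tier,
    cold ones are demoted to DC B's disk/archival tier."""
    for key, count in access_counts.items():
        placement[key] = "dcA:memory" if count >= HOT_THRESHOLD else "dcB:disk"
    access_counts.clear()

# Example: one hot key, one cold key.
for _ in range(150):
    record_access("hot-item")
record_access("cold-item")
rebalance()
print(placement)  # {'hot-item': 'dcA:memory', 'cold-item': 'dcB:disk'}
```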

  18. Use cases • Expanding the Memory Tier • Spawning a new VM instance to add memory can be expensive. • A spawn request can also be rejected by the provider's policy even when there is no outage; a nearby DC's memory tier offers an alternative.

  19. Use cases • Competitive Pricing • Each cloud provider has a different pricing policy for its services (a cost check follows below).

The cheapest VM instance for 3.5GB of memory from each cloud provider in US East:
  AWS t2.medium (4GB, 2 cores): $0.052/hour = $37.44/month ($9.36/GB)
  Azure A2 Basic tier (3.5GB, 2 cores): $0.088/hour = $63.36/month ($18.10/GB)
  Google Cloud n1-standard-1 (3.5GB, 1 core): $0.049/hour = $35.29/month ($10.08/GB)

The cheapest VM instance for 25GB of memory from each cloud provider in US East:
  AWS r3.xlarge (30.5GB, 4 cores): $0.350/hour = $252.00/month ($8.26/GB)
  Azure D12 (28GB, 4 cores): $0.476/hour = $342.72/month ($12.24/GB)
  Google Cloud n1-highmem-8 (26GB, 4 cores): $0.226/hour = $162.72/month ($6.25/GB)
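The $/GB figures can be reproduced from the hourly prices: the monthly cost is the hourly price times 720 hours, and $/GB divides that by the instance's memory. The sketch below rechecks the table's arithmetic; prices are the talk's (from its time of publication), not current.

```python
# Reproducing the $/GB arithmetic from the table above. Instance specs
# and prices are copied from the slide, not current list prices.
HOURS_PER_MONTH = 720

instances = [  # (name, memory_gb, usd_per_hour)
    ("AWS t2.medium",     4.0,  0.052),
    ("Azure A2 Basic",    3.5,  0.088),
    ("GC n1-standard-1",  3.5,  0.049),
    ("AWS r3.xlarge",    30.5,  0.350),
    ("Azure D12",        28.0,  0.476),
    ("GC n1-highmem-8",  26.0,  0.226),
]
for name, mem_gb, hourly in instances:
    monthly = hourly * HOURS_PER_MONTH
    print(f"{name:<18} ${monthly:7.2f}/month  ${monthly / mem_gb:5.2f}/GB")
```

The printed values match the table to within a cent of rounding (e.g., $35.28/month for n1-standard-1 versus the slide's $35.29).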

  20. Web application case study (RUBiS) • An eBay-like web application (Apache + MySQL). • 1,000,000 users and 1,000,000 items (about 2GB of data). • Emulates 300 users (view, sell, bid, buy, comment, ...). • Varies where the MySQL storage is placed: • the local node's disk with the system buffer cache; • the local disk with a limited-size buffer cache; • a nearby DC node's disk; • a ramdisk (with the system buffer cache).

  21. Benchmark (RUBiS)

  22. Challenges • Infrastructure Dynamics • Cloud services do not provide consistent performance over time. • Performance throttling based on the VM instance size.

  23. Challenges • Application Dynamics • Data sizes and access patterns keep changing. • Simple Storage Abstraction • Added complexity from different storage interfaces and varied pricing policies (see the sketch below). • Discovering Nearby DCs • Network performance between DCs is not determined by physical distance. • Cloud Providers' Implementations and Policies • The same storage tier performs differently across providers. • New VM instance types and new pricing policies keep appearing. • Network cost must be considered for cost optimization.
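The "Simple Storage Abstraction" challenge is essentially an adapter problem: one get/put interface hiding provider-specific APIs. Below is a minimal sketch under that reading; the class names are hypothetical, and the in-memory tier merely stands in for a real backend such as ElastiCache or S3, whose client libraries (and pricing) would replace it.

```python
# Hypothetical sketch of a unified storage interface over multiple
# (DC, tier) backends; real provider clients would replace InMemoryTier.
from abc import ABC, abstractmethod

class StorageTier(ABC):
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryTier(StorageTier):
    """Stands in for a memory tier (e.g., ElastiCache) in this sketch."""
    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}
    def put(self, key: str, data: bytes) -> None:
        self._store[key] = data
    def get(self, key: str) -> bytes:
        return self._store[key]

class CrossDCStore:
    """Routes each key to a (DC, tier) chosen by a placement policy."""
    def __init__(self, tiers: dict[str, StorageTier], policy) -> None:
        self.tiers = tiers      # e.g. {"dcA:memory": ..., "dcB:disk": ...}
        self.policy = policy    # function: key -> tier name
    def put(self, key: str, data: bytes) -> None:
        self.tiers[self.policy(key)].put(key, data)
    def get(self, key: str) -> bytes:
        return self.tiers[self.policy(key)].get(key)

store = CrossDCStore({"dcA:memory": InMemoryTier()}, lambda key: "dcA:memory")
store.put("item1", b"hello")
print(store.get("item1"))  # b'hello'
```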

  24. Conclusion • Data locality can be extended as data centers grow denser. • Accessing data in a nearby DC can be faster than using local storage tiers. • Small data can be stored in a nearby DC with little or no locality concern. • Benefits of using multiple data centers: better performance, reduced cost, better availability, and durability. • Many challenges remain to be overcome to realize these benefits.

  25. Thank you! • Questions?
