The Performance Analysis of Cache Architecture based on Alluxio over Virtualized Infrastructure
Xu Chang, Li Zha
Contents
• Background
• Related Works
• Motivation
• Experiments
• Results
• Conclusion
• Future Work
Background
• Cloud Computing
  – Computing as a service
  – Resources are allocated on demand and paid for on demand
• Virtualization
  – Integrates and encapsulates resources
  – Provides resources in fine-grained pieces
  – Transparent to users
Background
• Traditional architecture: compute nodes and data nodes are co-located in the same cluster.
• Decoupled architecture: a compute cluster (compute nodes only) accesses a separate data center of object storage (data nodes).
• Decoupled vs. traditional
  – Advantages: more flexible; overall cost is reduced
  – Shortcoming: performance declines
Related Works
Making up for the loss of performance:
• Traditional optimization methods
  – Speed up the shuffle phase of jobs with SSDs
  – [kambatla2014truth] [ruan2017improving]
• Reduce the frequency of accessing the object storage
  – Construct a cache layer between applications and the object storage
  – [shankar2017performance] [qureshi2014cache]
Related Works
Alluxio (formerly Tachyon)
• The world's first memory-speed virtual distributed storage system
• Resides between computation frameworks and storage systems
Source: https://www.alluxio.org/
Motivation
• Prior work is concerned only with performance and does not consider cost
• Cost reduction is critical
• Question: how should the caching architecture be designed to maximize cost performance?
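The slides do not define cost performance explicitly. One plausible definition, consistent with the paired throughput and cost-performance charts later in the deck, is throughput per unit cache cost; this is an assumption, not a formula taken from the source.

```latex
\[
\text{cost performance} \;=\;
\frac{\text{throughput (MB/s)}}{C_{\text{mem}}\,S_{\text{mem}} + C_{\text{ssd}}\,S_{\text{ssd}}}
\]
```

Here \(C_{\text{mem}}\) and \(C_{\text{ssd}}\) are the per-GB prices of memory and SSD, and \(S_{\text{mem}}\), \(S_{\text{ssd}}\) are the capacities of the two cache tiers.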
Experiments
System architecture: on each node, MapReduce runs on top of an Alluxio worker; all Alluxio workers use a shared cloud (object) storage as the under store.
Source: https://www.alluxio.org/
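Not from the slides: a toy sketch of the read path such a cache layer provides (memory tier, then SSD tier, then the object store). It is purely illustrative and is not Alluxio's actual implementation; the class and method names are hypothetical, and `object_store` is assumed to expose a `read(key)` method.

```python
class TieredCache:
    """Toy read-through cache: memory tier -> SSD tier -> object store.

    Illustrative only; real Alluxio manages tiers, eviction and data
    locality far more carefully.
    """

    def __init__(self, object_store, mem_capacity, ssd_capacity):
        self.object_store = object_store   # slow, remote backend (assumed API)
        self.mem = {}                      # fast tier (RAM)
        self.ssd = {}                      # larger, cheaper tier (SSD)
        self.mem_capacity = mem_capacity
        self.ssd_capacity = ssd_capacity

    def get(self, key):
        # 1. Hot data is served from memory.
        if key in self.mem:
            return self.mem[key]
        # 2. Warm data is served from SSD and promoted to memory.
        if key in self.ssd:
            value = self.ssd.pop(key)
            self._put_mem(key, value)
            return value
        # 3. Cold data is fetched from the object store, then cached.
        value = self.object_store.read(key)
        self._put_mem(key, value)
        return value

    def _put_mem(self, key, value):
        # When the memory tier is full, demote the oldest entry to SSD
        # (FIFO here only to keep the sketch short; Alluxio uses
        # configurable eviction policies).
        if len(self.mem) >= self.mem_capacity:
            old_key, old_value = next(iter(self.mem.items()))
            del self.mem[old_key]
            if len(self.ssd) < self.ssd_capacity:
                self.ssd[old_key] = old_value
        self.mem[key] = value
```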
Experiments
Experimental environment
• Experiment 1 — Platform: AWS; Servers: 4 × m3.2xlarge; Object storage: S3
• Experiment 2 — Platform: G-Cloud; Servers: 4 × (8 cores, 30 GB memory); Object storage: Ceph
Experiments
Experimental scheme
• Experiment 1 — Workload: TeraSort × 6
• Experiment 2 — Workload: Hive join × 3
• Data size: 120 GB
• Cost ratio of memory to SSD in the cache: 8:0, 7:1, 5:3, 3:5, 1:7, 0:8
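Not stated on the slides: a small sketch of the arithmetic behind "cost ratio of memory to SSD", turning a ratio and a fixed cache budget into per-tier capacities. The per-GB prices and the budget below are made-up placeholders, not the values used in the AWS or G-Cloud experiments.

```python
# Convert a memory:SSD *cost* ratio into per-tier capacities (GB).
# Prices and budget are hypothetical placeholders.
MEM_PRICE_PER_GB = 4.0    # $/GB, hypothetical
SSD_PRICE_PER_GB = 0.5    # $/GB, hypothetical
CACHE_BUDGET     = 256.0  # total $ spent on the cache layer, hypothetical

def tier_sizes(mem_share, ssd_share):
    """Split the cache budget by cost ratio and convert each share to GB."""
    total = mem_share + ssd_share
    mem_dollars = CACHE_BUDGET * mem_share / total
    ssd_dollars = CACHE_BUDGET * ssd_share / total
    return mem_dollars / MEM_PRICE_PER_GB, ssd_dollars / SSD_PRICE_PER_GB

# The six configurations evaluated on the slides.
for ratio in [(8, 0), (7, 1), (5, 3), (3, 5), (1, 7), (0, 8)]:
    mem_gb, ssd_gb = tier_sizes(*ratio)
    print(f"mem:ssd cost ratio {ratio[0]}:{ratio[1]} -> "
          f"{mem_gb:.0f} GB memory, {ssd_gb:.0f} GB SSD")
```

Because SSD is far cheaper per GB, an equal-cost split buys much more SSD capacity than memory, which is what makes the hybrid configurations interesting for cost performance.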
Results
Experiment 1
[Charts: throughput (MB/s) and cost performance for cache mixes ranging from 100% memory to 100% SSD, in steps of 87.5%, 62.5%, 37.5% and 12.5% memory]
Results
Experiment 2
[Charts: throughput (MB/s) and cost performance for the same cache mixes, from 100% memory to 100% SSD]
Conclusion
• A hybrid (memory + SSD) cache architecture is recommended.
• For workloads with a large output and a small hot-data set, the memory-to-SSD cost ratio in the cache should be around 1:7.
• For workloads with a small output and a large hot-data set, the memory-to-SSD cost ratio in the cache should be around 5:3.
Future Work
• Study further factors that affect cost performance, and try to derive a configuration scheme with the best cost performance.
• Add more workload types and application scenarios, so that the conclusions come closer to real deployments and generalize better.
Q & A
Thanks!