
Semantics of Caching with SPOCA: A Stateless, Proportional, Optimally-Consistent Addressing Algorithm
Ashish Chawla, Benjamin Reed, Karl Juhnke, Ghousuddin Syed
Yahoo! Inc.

Video Platform 2 6/21/11


Simple Content Serving Architecture

Outline § Introduction § Problem Definition § SPOCA and Requirements § Evaluations § Conclusion 5 6/21/11

The Problem
• The front-end server disks are a secondary bottleneck.
• Eliminating redundant caching of content also reduces the load on the storage farm.
• An intelligent request-routing policy can yield far more caching efficiency than even a perfect cache promotion policy that must labor under random request routing.
• The cache promotion algorithm alone is not enough.

Problems from Geographic Distribution

Requirements § Merge different delivery pools and manage the diverse requirements in an adaptive way. § Minimize caching disruptions when front-end server leaves or enters the pool - re-address as few files as possible to different servers. § Proportional distribution of files among servers does not necessarily result in a proportional distribution of requests (Power Law) 11 6/21/11

SPOCA and Zebra
• Used in production in a global scenario for web-scale load.
• Show real-world improvements over the simple off-the-shelf solution.
• Implement load balancing, fault tolerance, popular content handling, and efficient cache utilization with a single simple mechanism.

Traditional Approach

Complete Picture

Complete Picture – Inside Data Center

Zebra Algorithm
• Handles the geographic component of request routing and content caching.
• Based on content popularity, Zebra decides when requests should be routed to the content's home locale and when the content should be cached in the nearest locale.
• Bloom filters are used to determine popularity.

Tracking Popularity: add(vid1) on the Bloom filter

Checking Popularity: contains(vid1) on the Bloom filter

What's the problem here?
• With a single Bloom filter, everything eventually becomes popular.
• There is no way to expire content from a Bloom filter.
• Instead, a sequence of Bloom filters is used to track popularity.

Bloom Filter Representation (figure sequence showing three filters numbered 0, 1, 2)
• First slide: the filters hold vid1, vid8, vid2, vid5, vid526, vid752.
• Next slide: the filters hold vid1, vid8, vid5, vid526 (vid2 and vid752 no longer appear).
• add(vid8): vid8 is added again, so it now appears twice across the filters.
• contains(vid3): the lookup is made against the unified filter, which holds vid1, vid5, vid8, vid526.
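The slides above can be read as a rotating sequence of Bloom filters: new requests are added to the newest filter, the oldest filter is eventually dropped, and lookups consult the union of all filters. The following Python sketch illustrates that reading. It is not the production Zebra code. The class and function names, the filter sizes, the explicit `rotate()` call, and the rule "seen at least once in the tracked window means popular" are all assumptions made for illustration.

```python
import hashlib
from collections import deque


class BloomFilter:
    """Tiny Bloom filter that uses a Python int as its bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def contains(self, key):
        # May return a false positive, never a false negative.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class RotatingPopularityTracker:
    """Sequence of Bloom filters; index 0 is the newest (hypothetical design)."""

    def __init__(self, num_filters=3, num_bits=1024, num_hashes=3):
        self._num_bits = num_bits
        self._num_hashes = num_hashes
        self.filters = deque(
            BloomFilter(num_bits, num_hashes) for _ in range(num_filters)
        )

    def add(self, key):
        # New sightings always go into the newest filter.
        self.filters[0].add(key)

    def contains(self, key):
        # Equivalent to querying the "unified filter" (union of all filters).
        return any(f.contains(key) for f in self.filters)

    def rotate(self):
        # Expire the oldest filter and start a fresh one.
        self.filters.pop()
        self.filters.appendleft(BloomFilter(self._num_bits, self._num_hashes))


def choose_locale(tracker, video_id, home_locale, nearest_locale):
    """Assumed policy: already-seen content is served from the nearest locale."""
    if tracker.contains(video_id):
        return nearest_locale
    tracker.add(video_id)
    return home_locale
```

In a deployment, `rotate()` would presumably be driven by a timer so that each filter covers one time interval; the slides do not give the interval length or the number of filters.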

Key Points
• Zebra determines which serving cluster will handle a given request, based on geolocality and popularity.
• SPOCA determines which front-end server within that cluster will cache and serve the request.

SPOCA Algorithm
• Goal: maximize cache utilization at the front-end servers.
• A simple content-to-server assignment function based on a sparse hash space.
• Each front-end server is assigned a portion of the hash space in proportion to its capacity.
• The SPOCA routing function uses a hash function to map names to a point in the hash space.
  › Input: the name of the requested content.
  › Output: the server that will handle the request.
• If the point falls in an unassigned region, it is re-hashed until the result lands in a region assigned to a server.
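The routing function described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the hash-space size, the use of SHA-256, the capacity units, and the names `SpocaRouter`, `add_server`, `remove_server`, and `route` are all assumptions.

```python
import hashlib

HASH_SPACE = 2**20  # hypothetical size; only part of it is assigned (sparse)


def _hash(data: bytes) -> int:
    """Map bytes to a point in the hash space."""
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big") % HASH_SPACE


class SpocaRouter:
    def __init__(self):
        self.segments = []   # (start, end, server), end exclusive
        self._next_free = 0  # next unassigned offset in the hash space

    def add_server(self, name, capacity):
        # A server's segment length is proportional to its capacity.
        start, end = self._next_free, self._next_free + capacity
        if end > HASH_SPACE:
            raise ValueError("hash space exhausted")
        self.segments.append((start, end, name))
        self._next_free = end

    def remove_server(self, name):
        # The departed server's segment simply becomes unassigned space.
        self.segments = [s for s in self.segments if s[2] != name]

    def _lookup(self, point):
        for start, end, server in self.segments:
            if start <= point < end:
                return server
        return None

    def route(self, content_name, max_rehashes=10_000):
        point = _hash(content_name.encode())
        for _ in range(max_rehashes):
            server = self._lookup(point)
            if server is not None:
                return server
            # Unassigned region: re-hash the point itself and try again.
            point = _hash(point.to_bytes(4, "big"))
        raise RuntimeError("no assigned region reached")
```

A short usage example under the same assumptions:

```python
router = SpocaRouter()
router.add_server("fe1", 100_000)
router.add_server("fe2", 200_000)  # twice the capacity, twice the hash space
print(router.route("video-42"))
```

One consequence of this design is that removing a server only re-addresses the content that was mapped to it. Content routed to another server never touched the removed segment on its way there, so its hash chain is unchanged.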

SPOCA hash map example

Failure Handling

Elasticity

Popular Content
• SPOCA minimizes the number of servers that cache a given object in order to maximize the aggregate number of cached objects.
• For popular content, requests need to be routed to multiple front-end servers.
• The hashed address of any requested content is stored for a brief popularity window (150 seconds in our case).
• When the popularity window expires, the stored hash for each object is discarded.
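The slide does not spell out how the stored hash is used. One plausible reading, sketched below in Python, is that a repeat request inside the popularity window continues hashing from the stored address instead of from the content name, so successive requests can land on additional servers; once the window expires, routing starts again from the name. Everything other than the 150-second window is an assumption: the segment layout, the names, and the choice to start the window at the first request.

```python
import hashlib
import time

HASH_SPACE = 2**20               # hypothetical hash-space size
POPULARITY_WINDOW_SECONDS = 150  # value given on the slide

# Hypothetical assignment of hash-space segments: (start, end, server).
SEGMENTS = [(0, 100_000, "fe1"), (100_000, 300_000, "fe2"),
            (300_000, 400_000, "fe3")]


def _hash(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big") % HASH_SPACE


def _first_assigned(point, max_rehashes=10_000):
    """Re-hash until the point lands in an assigned segment."""
    for _ in range(max_rehashes):
        for start, end, server in SEGMENTS:
            if start <= point < end:
                return point, server
        point = _hash(point.to_bytes(4, "big"))
    raise RuntimeError("no assigned region reached")


class PopularityAwareRouter:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._recent = {}  # content name -> (last hashed address, window start)

    def route(self, name):
        now = self._clock()
        entry = self._recent.get(name)
        if entry is not None and now - entry[1] >= POPULARITY_WINDOW_SECONDS:
            # Window expired: discard the stored hash.
            del self._recent[name]
            entry = None
        if entry is None:
            # First request in a window: start from the hash of the name.
            start, window_start = _hash(name.encode()), now
        else:
            # Repeat request: continue hashing from the stored address.
            start, window_start = _hash(entry[0].to_bytes(4, "big")), entry[1]
        point, server = _first_assigned(start)
        self._recent[name] = (point, window_start)
        return server
```

Under this reading, unpopular content stays on a single server (one request per window always starts from the name), while content requested repeatedly within the window spreads over more servers as its request rate grows.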


Scaling 5x without software improvements

Scaling 5x with software improvements

Memory cache hits

Cache Hits and Misses*

                      2/26    3/1     3/5     3/7     3/10    3/14
Download cache miss   9.7%    7.2%    4.3%    3.7%    1.8%    0.4%
Download cache hit    90.3%   92.8%   95.7%   96.3%   98.2%   99.6%
Flash cache miss      21.8%   13.5%   22.0%   14.8%   2.5%    0.7%
Flash RAM hit         57.2%   81.4%   66.1%   71.5%   90.0%   90.1%

* Download and Flash pools in the S1S data center

Conclusion § Zebra and SPOCA do not have any hard state to maintain or per object meta-data § Eliminates any per object storage overhead or management, simplifying operations. § Consolidate content serving into a single pool of servers that can handle files from a variety of different workloads. § Decouple serving and caching layers. § Cost savings and end user satisfaction are key success metrics. 39 6/21/11
