The Power of Prediction: Cloud Bandwidth and Cost Reduction. Eyal Zohar, Israel Cidon, Osnat (Ossi) Mokryn. Technion, Tel-Aviv College.
Traffic Redundancy Elimination (TRE). Traffic redundancy stems from downloading the same or similar information items. We found around 70% redundancy in end-client traffic when compared with past traffic and local files. SIGCOMM 2011
TRE Importance. Moving to the cloud means higher end-to-end traffic. Cloud users pay for the traffic they actually use, which gives them an incentive to use TRE. [Diagram labels: cloud user application, pay for use, cloud provider, cloud traffic, TRE, end-user.]
How TRE Works. The server parses the outgoing stream into content-based chunks and signs each chunk with SHA-1. [Figure: a rolling hash over the byte stream finds anchors 1-4, which delimit chunks 1-3; each chunk gets a SHA-1 signature (Sign. 1-3). Insertion example: new bytes turn chunk 2 into chunk 2', while chunks 1 and 3 are unchanged.]
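The chunk-and-sign step can be illustrated with a minimal Python sketch. The slide names only "rolling hash", "anchor" and "SHA-1"; the polynomial hash, `WINDOW`, `ANCHOR_MASK`, `BASE`, `MOD` and the function names below are assumptions chosen for illustration, not the parameters of any particular TRE system.

```python
import hashlib

WINDOW = 16            # assumed rolling-hash window, in bytes
ANCHOR_MASK = 0x3F     # assumed: anchor when the low 6 hash bits are zero
BASE = 257             # assumed polynomial base
MOD = 1_000_000_007    # assumed prime modulus


def split_chunks(data: bytes) -> list[bytes]:
    """Cut data after every position where the rolling hash hits an anchor."""
    chunks = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # add the newest byte
        if i + 1 >= WINDOW and (h & ANCHOR_MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last anchor
    return chunks


def sign_chunks(data: bytes) -> list[str]:
    """Return the SHA-1 signature (hex) of each content-based chunk."""
    return [hashlib.sha1(chunk).hexdigest() for chunk in split_chunks(data)]
```

Because anchors depend only on the bytes inside the window, inserting bytes in the middle of a stream changes the signatures of the chunks near the insertion only; comparing `sign_chunks(original)` with `sign_chunks(modified)` shows the remaining signatures unchanged, as in the insertion example.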
Problems in Existing Solutions. In the cloud environment: 1. High processing costs in the cloud. 2. Scalability: the server must remember each client. 3. Elasticity: the server is unaware of data received from other sources. 4. Long-term repeats (days/weeks) are not handled. [Diagram labels: Server 1, Server 2, Receiver.]
Our Solution: PACK (Predictive ACK). Redundancy is detected by the client. Repeats appear in chains, so the client tries to match incoming chunks with a previously received chain or a local file, and sends the server predictions of the future data.
PACK: The Client Prediction. Each prediction contains: 1. TCP seq. (no server parsing). 2. Hint (spares unnecessary SHA-1), here a last-byte hint. 3. SHA-1 signature. [Figure: stream chunks 1-3, their SHA-1 signatures, and the stored chain of chunks (Sign. 1-3), marked as received or predicted.]
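The three prediction fields can be sketched from the client's side as follows. This is a minimal illustration under stated assumptions: `Prediction`, `ChunkStore`, `add_chain`, `predict` and `max_predictions` are hypothetical names, chunks are assumed non-empty, and the in-memory dictionaries stand in for whatever chunk store the real client uses.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    tcp_seq: int      # where the predicted chunk starts in the TCP stream
    length: int       # predicted chunk length in bytes
    hint: int         # last byte of the predicted chunk (cheap pre-check)
    signature: bytes  # SHA-1 of the predicted chunk


class ChunkStore:
    """Hypothetical client-side store of previously received chunk chains."""

    def __init__(self) -> None:
        self.chunks: dict[bytes, bytes] = {}    # signature -> chunk data
        self.next_sig: dict[bytes, bytes] = {}  # signature -> next signature

    def add_chain(self, chunks: list[bytes]) -> None:
        """Remember chunks in the order they were received."""
        prev = None
        for chunk in chunks:
            sig = hashlib.sha1(chunk).digest()
            self.chunks[sig] = chunk
            if prev is not None:
                self.next_sig[prev] = sig
            prev = sig

    def predict(self, received: bytes, next_seq: int,
                max_predictions: int = 2) -> list[Prediction]:
        """If the received chunk is known, predict the chunks that follow it."""
        predictions = []
        seq = next_seq
        sig = self.next_sig.get(hashlib.sha1(received).digest())
        while sig is not None and len(predictions) < max_predictions:
            data = self.chunks[sig]
            predictions.append(Prediction(seq, len(data), data[-1], sig))
            seq += len(data)
            sig = self.next_sig.get(sig)
        return predictions
```

For example, after `store.add_chain([b"alpha", b"bravo", b"charlie"])`, calling `store.predict(b"alpha", next_seq=1000)` yields predictions for the two chunks that followed `b"alpha"` last time, while an unknown chunk yields an empty list.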
PACK: Server Operation. The server compares the hint with the last byte of the data to sign. Only upon a hint match does it perform the expensive SHA-1, so PACK saves the cloud's computational effort in the absence of redundancy. First receiver-based TRE: the server does not parse the stream. It signs with >99% confidence. [Diagram labels: server, client, local storage, chain of chunks 1-3, prediction "2,3?" and confirmation "2,3 V".]
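The hint-then-sign order on the server can be sketched as a single function. The names `check_prediction`, `send_buffer` and `buffer_seq` are hypothetical, and the sketch ignores TCP sequence-number wraparound; it only illustrates that the one-byte comparison runs first and SHA-1 is computed solely on a hint match.

```python
import hashlib


def check_prediction(send_buffer: bytes, buffer_seq: int, tcp_seq: int,
                     length: int, hint: int, signature: bytes) -> bool:
    """Return True if the outgoing bytes match the client's prediction.

    send_buffer holds outgoing data whose first byte has sequence number
    buffer_seq (an assumption of this sketch; wraparound is ignored).
    """
    start = tcp_seq - buffer_seq
    end = start + length
    if start < 0 or length <= 0 or end > len(send_buffer):
        return False                      # prediction is outside the buffer
    if send_buffer[end - 1] != hint:
        return False                      # cheap reject: no SHA-1 computed
    # Hint matched: only now pay for the expensive SHA-1.
    return hashlib.sha1(send_buffer[start:end]).digest() == signature
```

When the function returns True, the server can acknowledge the prediction instead of sending the chunk; when it returns False, the data is sent as usual.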
PACK Benefits. Minimizes processing costs induced by TRE: signs with SHA-1 only in the presence of redundancy. Receiver-based end-to-end TRE, hence suitable for cloud server elasticity and client mobility: does not require the server to continuously maintain clients' status.
Server Effort Experiment. Several data-sets in 3 modes: baseline no-TRE, PACK, and a sender-based TRE. [Plot: single-server cloud operational cost (100% = without a TRE system) versus redundancy elimination ratio (0%-50%), for PACK and an EndRE-like sender-based TRE. Annotation: 25%-30% redundancy is common to many data-sets.]
YouTube Redundancy. Traces of 40k clients, captured at an ISP. Found 30% end-to-end (personal) redundancy. [Plot: all YouTube traffic (Gbps) and PACK TRE removed redundancy (%) over time (24 hours).]
Long-Term TRE. Social network: eliminated 30% with a one-hour cache and 75% with a long-term cache. [Plot: average redundancy of daily traffic versus days since start, for 1-hour, 24-hour, and unlimited caches.]
Cloud Email Redundancy. Gmail account with 1,000 Inbox messages. Found 32% static redundancy (higher when messages are read multiple times). [Plot: redundant and non-redundant traffic volume per month (MB), Jan to Dec.]
Implementation. Linux with Netfilter Queue; 25k lines of C and Java, available for download. The receiver-sender protocol is embedded in the TCP Options field. Use is transparent at both sides.
Processing Effort in the Client. Laptop experiment: PACK-related CPU consumption is ~4% when playing HD video (9 Mbps with 30% redundancy). Smartphone experiment: PACK consumes ~3% of the battery power when processing 1 GB of video (avg. monthly data plan). Virtual traffic saves the client the need to chunk or sign.
New Chunking Algorithm. Most existing solutions use Rabin fingerprints.
New Chunking Algorithm. 64-bit mask = 00 00 8A 31 10 58 30 80. [Figure: stream bytes n, n-1, ..., n-8 and n-40, ..., n-47 laid out against the 64-bit mask.]
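The slide gives only the 64-bit mask and the byte positions n to n-47. The sketch below assumes a shift-and-XOR rolling value (shift left by one bit, XOR in the next byte) and declares an anchor when all mask bits are set; that update rule and the names `find_anchors`, `WINDOW` and `U64` are assumptions of this illustration. Under that rule, byte n-k lands on bits k to k+7, so the mask, whose set bits lie between bit 7 and bit 47, sees exactly the last 48 bytes. The mask has 13 set bits, so on random data an anchor is expected roughly once every 2^13 = 8192 bytes.

```python
MASK = 0x00008A3110583080   # 64-bit mask from the slide (13 bits set)
WINDOW = 48                 # bytes n .. n-47 influence the masked bits
U64 = (1 << 64) - 1         # keep the rolling value within 64 bits


def find_anchors(data: bytes) -> list[int]:
    """Return offsets (end positions) at which an anchor is detected.

    Assumed update rule: shift the 64-bit value left by one bit, then XOR
    in the next byte. An anchor is declared when every mask bit is set.
    """
    anchors = []
    longval = 0
    for i, byte in enumerate(data):
        longval = ((longval << 1) & U64) ^ byte
        if i + 1 >= WINDOW and (longval & MASK) == MASK:
            anchors.append(i + 1)
    return anchors
```

Each byte costs one shift, one XOR and one masked comparison, and the result at any position depends only on the preceding 48 bytes, so identical content produces identical anchors wherever it appears in a stream.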
Summary. Current TRE solutions may not reduce cloud cost. PACK is the first receiver-based TRE and leverages the power of prediction. It minimizes processing costs induced by TRE and is suitable for cloud server migration and client mobility. The implementation is available for download.