Asymmetric Caching: Improved Network Deduplication for Mobile Devices
Shruti Sanadhya (1), Raghupathy Sivakumar (1), Kyu-Han Kim (2), Paul Congdon (2), Sriram Lakshmanan (1), Jatinder P Singh (3)
(1) Georgia Institute of Technology, Atlanta, GA, USA
(2) HP Labs, Palo Alto, CA, USA
(3) Xerox PARC, Palo Alto, CA, USA
Introduction
• Network traffic has a lot of redundancy
  – 20% of HTTP content accessed on smartphones is redundant [1]
• Network deduplication (dedup) leverages this redundancy to conserve network bandwidth
[Figure: baseline dedup between a dedup source (sender side, SGSN) and a dedup destination (mobile receiver). The source splits a packet into chunks C1 C2 C3 by Rabin fingerprinting, hashes them to H1 H2 H3, and compresses the packet to C1 H2 C3 because (H2, C2) is in both the regular cache and the mobile cache. The destination inflates the packet back to C1 C2 C3.]
[1] Qian et al., "Web Caching on Smartphones: Ideal vs. Reality", MobiSys 2012
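The chunk, hash, compress, and inflate steps in the figure can be sketched as below. This is a minimal illustration, not the authors' implementation: the rolling byte sum is a simplified stand-in for Rabin fingerprinting, and `WINDOW`, `MASK`, the SHA-1 choice, and the `("H", ...)` / `("C", ...)` token format are all assumptions made for the example.

```python
import hashlib

WINDOW = 16   # rolling-window size in bytes (hypothetical value)
MASK = 0x3F   # boundary when the low 6 bits of the rolling sum are zero

def chunk(data):
    """Content-defined chunking with a rolling byte sum
    (a simplified stand-in for Rabin fingerprinting)."""
    chunks, start, rolling = [], 0, 0
    for i, b in enumerate(data):
        rolling += b
        if i - start >= WINDOW:
            rolling -= data[i - WINDOW]      # slide the window forward
        if i - start + 1 >= WINDOW and (rolling & MASK) == 0:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])          # trailing partial chunk
    return chunks

def digest(c):
    return hashlib.sha1(c).digest()

def encode(data, cache):
    """Dedup source: send a hash for cached chunks, the chunk otherwise."""
    tokens = []
    for c in chunk(data):
        key = digest(c)
        if key in cache:
            tokens.append(("H", key))
        else:
            cache[key] = c
            tokens.append(("C", c))
    return tokens

def decode(tokens, cache):
    """Dedup destination: inflate hashes back into chunks."""
    parts = []
    for kind, value in tokens:
        if kind == "H":
            parts.append(cache[value])
        else:
            cache[digest(value)] = value
            parts.append(value)
    return b"".join(parts)
```

Sending the same payload twice through `encode` with the same source cache turns every chunk of the second transmission into a hash token, which is the bandwidth saving that baseline dedup relies on.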
The Asymmetry Problem
• What happens when the mobile cache is more populated than the cache at dedup source?
[Figure: the mobile cache at the dedup destination (receiver) holds entries H2/C2 through H8/C8, while the regular cache at the dedup source (sender) holds only H2/C2.]
• How can all the past cached information at the mobile be successfully leveraged for dedup by any given dedup source?
Motivational Scenarios
• Multi-homed devices
  [Figure: a mobile device attached to both a WiFi access point and a 3G base station (BS), with caches along each path.]
• Resource pooling
  [Figure: a mobile device served through multiple BS, RNC and SGSN elements, with caches in the network.]
• Memory scalability
  [Figure: caches at the BS, RNC and SGSN.]
– BS: Base Station
– RNC: Radio Network Controller
– SGSN: Serving GPRS Support Node
Scope and Goals
• Scope
  – Laptops/smartphones using 3G/WiFi
  – Conserving cellular bandwidth
  – Downstream and unencrypted traffic
• Goals
  – Overall efficiency: Using downstream and upstream more efficiently
  – Application agnostic: Applicable to any application
  – Limited overheads: Deployable computational and memory complexities
Asymmetric Caching - Overview
[Figure: the mobile cache at the dedup destination (receiver) holds H2/C2 through H5/C5. The dedup source (sender) holds H2/C2 in its regular cache and H4 in its feedback cache. Feedback flows from the dedup destination to the dedup source.]
• Mobile cache is more populated than the cache at the dedup source
• On receiving downstream traffic, the mobile selectively advertises portions of its cache to the dedup source
• Dedup source also maintains a feedback cache
• Both the regular and feedback caches are used for dedup
When is feedback sent?
• Feedback is sent reactively
• Feedback is sent only when there is downstream traffic
• Feedback sent is specific to the ongoing traffic
[Figure: downstream traffic flows from dedup source to dedup destination; feedback flows back from dedup destination to dedup source.]
Where is feedback selected from?
• Hashes at dedup destination can be organized as per:
  – Order of arrival: H1 H2 H3 H4 H5 H6 H7 H8 H9 H10
  – Same flow (Src IP, Dest IP, Src Port, Dest Port): [H1 H2 H6 H7 H8] [H3 H4 H5 H9 H10]
  – Same object (HTML, JPEG or CSS): [H1 H2] [H6 H7 H8] [H3 H4 H5] [H9 H10]
• Objects help in effectively matching new and old content
• Flowlets are an application-agnostic estimate of objects
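The first two organizations can be sketched as follows, using the H1 to H10 example from the slide. The flow 4-tuples (`FLOW_A`, `FLOW_B`) and the function names are made-up placeholders for illustration; grouping by object is left out because it needs application knowledge, which is what flowlets are meant to approximate.

```python
from collections import defaultdict

# Hypothetical flow keys: (src_ip, dst_ip, src_port, dst_port).
FLOW_A = ("10.0.0.1", "10.0.0.9", 80, 5001)
FLOW_B = ("10.0.0.2", "10.0.0.9", 80, 5002)

# Arrival log of (flow, hash), interleaved as in the slide's example.
ARRIVALS = [
    (FLOW_A, "H1"), (FLOW_A, "H2"),
    (FLOW_B, "H3"), (FLOW_B, "H4"), (FLOW_B, "H5"),
    (FLOW_A, "H6"), (FLOW_A, "H7"), (FLOW_A, "H8"),
    (FLOW_B, "H9"), (FLOW_B, "H10"),
]

def by_arrival(arrivals):
    """Hashes in the order they were received."""
    return [hsh for _, hsh in arrivals]

def by_flow(arrivals):
    """Hashes grouped by flow 4-tuple, preserving arrival order per flow."""
    flows = defaultdict(list)
    for flow, hsh in arrivals:
        flows[flow].append(hsh)
    return dict(flows)
```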
How are flowlets extracted?
• Sequence of bytes in a flow is a time-series
• Flowlets are piecewise stationary segments of a flow
• Check for flowlet boundary at start of each packet
• Consider byte series $B_{[0:m]}$ (1st packet), $B_{[m+1:n]}$ (2nd packet) and $B_{[0:n]}$ as autoregressive processes of order $p$ over the bytes $B_0, B_1, \ldots, B_m, B_{m+1}, \ldots, B_n$:

  $$B_i = \sum_{j=1}^{p} a_j B_{i-j} + \sigma\varepsilon, \quad \varepsilon \text{ is white noise}$$

  $$d_{[0:m:n]} = \mathrm{gain}(B_{[0:n]}) - \mathrm{gain}(B_{[0:m]}) - \mathrm{gain}(B_{[m+1:n]})$$

• $d_{[0:m:n]}$ is the gain in the noise power when $B_{[0:n]}$ is in one flowlet instead of different flowlets $B_{[0:m]}$ and $B_{[m+1:n]}$
• If $d_{[0:m:n]} > d_{thresh}$, then a flowlet boundary exists at $m$
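A rough sketch of the boundary test follows. The slides do not define gain(), so this example assumes gain(B) = |B| · log(residual noise power of a fitted AR model), which is one common form for this kind of change detection. It also fixes the order at p = 1 for brevity and uses a hypothetical threshold `d_thresh=10.0`; none of these choices come from the source.

```python
import math

def ar1_noise_power(series):
    """Least-squares AR(1) fit; returns the mean squared residual.
    Needs at least two samples."""
    pairs = list(zip(series, series[1:]))            # (previous, current)
    denom = sum(prev * prev for prev, _ in pairs)
    a = sum(prev * cur for prev, cur in pairs) / denom if denom else 0.0
    residuals = [cur - a * prev for prev, cur in pairs]
    return sum(r * r for r in residuals) / len(residuals)

def gain(series, eps=1e-9):
    """Assumed definition: length times log noise power (eps avoids log(0))."""
    return len(series) * math.log(ar1_noise_power(series) + eps)

def boundary_distance(first, second):
    """d = gain(joint) - gain(first) - gain(second)."""
    return gain(first + second) - gain(first) - gain(second)

def has_flowlet_boundary(first, second, d_thresh=10.0):
    """True when modelling both packets as one flowlet costs too much."""
    return boundary_distance(first, second) > d_thresh
```

With two packets drawn from the same regime, the joint model fits about as well as the separate ones and d stays near zero. With packets of very different statistics, the joint noise power rises and d exceeds the threshold.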
How is feedback selected?
• Find best matching past flowlet
  – Incoming hashes: H1, H2
  – Hash-to-flowlet lookup: H1 → F1, F3; H2 → F1, F2, F3
  – F1: H1, H2, H4, H5, H6, H7, H8, H9, H10, H11, H12, …
  – F2: H2, H5, H10, …
  – F3: H5, H8, H11, H12, …
  – Flowlet 1 (F1) is best matched
• Find start of next feedback in the best matching flowlet
  – Best matching past flowlet: H1, H2, H4, H5, H6, H7, H8, H9, H10, H11, H12, H13, …
  – Pointers into the flowlet: last hash matched, last hash advertised, start of next feedback
  – δ: temporal offset
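The two steps can be sketched as below. Several details are assumptions, since the slides do not spell them out: the best match is picked by simple vote counting over recently received hashes, δ is treated as a number of hashes, and the next feedback starts at the later of "last matched + δ" and "just past the last advertised hash". The toy flowlets use only the visible prefixes of F1 to F3, so the index built here covers those prefixes only. `batch` is a hypothetical feedback size.

```python
from collections import Counter, defaultdict

# Toy flowlets modelled on the visible prefixes of F1..F3 (elided hashes omitted).
FLOWLETS = {
    "F1": ["H1", "H2", "H4", "H5", "H6", "H7", "H8", "H9", "H10", "H11", "H12"],
    "F2": ["H2", "H5", "H10"],
    "F3": ["H5", "H8", "H11", "H12"],
}

def build_index(flowlets):
    """Inverted index: hash -> set of flowlet ids containing it."""
    index = defaultdict(set)
    for fid, hashes in flowlets.items():
        for hsh in hashes:
            index[hsh].add(fid)
    return index

def best_matching_flowlet(recent_hashes, index):
    """Assumed rule: the flowlet containing the most recent hashes wins."""
    votes = Counter()
    for hsh in recent_hashes:
        votes.update(index.get(hsh, ()))
    return votes.most_common(1)[0][0] if votes else None

def next_feedback(flowlet, last_matched, last_advertised, delta, batch):
    """Assumed rule: skip delta hashes past the last match, and never
    re-advertise anything at or before the last advertised hash."""
    start = flowlet.index(last_matched) + delta
    if last_advertised is not None:
        start = max(start, flowlet.index(last_advertised) + 1)
    return flowlet[start:start + batch]
```

For example, with incoming hashes H1 and H2, this picks F1. If H6 was the last hash advertised and δ is 2, the next batch of three hashes is H7, H8, H9.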
How is the feedback used?
[Figure: the dedup source holds a regular cache (H1, H2) and a feedback cache (H3, H4).]
• Dedup source maintains a feedback cache along with the regular cache of baseline dedup
• Regular cache is populated by downstream data
• Feedback hashes are inserted in the feedback cache
• Every downstream packet is deduped using both the regular and feedback caches
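A minimal sketch of a dedup source with both caches is shown below. The class and method names (`DedupSource`, `on_feedback`, `encode`), the SHA-1 digest, and the token format are hypothetical. The example also assumes that the feedback cache stores hashes only, and that chunks seen downstream are always added to the regular cache.

```python
import hashlib

def digest(chunk):
    return hashlib.sha1(chunk).hexdigest()

class DedupSource:
    def __init__(self):
        self.regular = {}      # hash -> chunk, populated by downstream data
        self.feedback = set()  # hashes advertised by the mobile

    def on_feedback(self, hashes):
        """Insert hashes advertised by the dedup destination."""
        self.feedback.update(hashes)

    def encode(self, chunks):
        """Replace a chunk with its hash if either cache knows it."""
        tokens = []
        for c in chunks:
            key = digest(c)
            if key in self.regular or key in self.feedback:
                tokens.append(("H", key))
            else:
                tokens.append(("C", c))
            self.regular[key] = c  # assumption: source remembers all downstream chunks
        return tokens
```

A chunk the source has never forwarded can still be sent as a hash once the mobile has advertised it, which is how the more populated mobile cache gets used.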
Design Summary
• When is the feedback sent?
  – Reactively
• Where is the feedback chosen from?
  – Flowlets at dedup destination
• How are flowlets extracted?
  – Stationarity properties
• How is the feedback selected?
  – Best matching flowlet and pointers in past flowlet
• How is the feedback used?
  – Stored in the feedback cache for dedup