
On Delay-Storage Trade-offs in Content Download from Coded Distributed Storage Systems

On Delay-Storage Trade-offs in Content Download from Coded Distributed Storage Systems. Gauri Joshi (MIT), joint work with Yanpei Liu (UW-Madison) and Emina Soljanin (Bell Labs). DIMACS Workshop on Algorithms for Green Data Storage.


  1. On Delay-Storage Trade-offs in Content Download from Coded Distributed Storage Systems. Gauri Joshi (MIT), joint work with Yanpei Liu (UW-Madison) and Emina Soljanin (Bell Labs). DIMACS Workshop on Algorithms for Green Data Storage. Gauri Joshi (MIT) Delay-Storage Trade-offs 1 / 24

  2. Why Use Coding in Distributed Storage. Data centers: server clusters that store and process all the data in the Internet. There are more than 500000 data centers worldwide. They consume vast amounts of energy, more than 2% of US electricity: power to run and repair servers, and for cooling systems.

  3. Trade-offs in Coding for Distributed Storage. Reliability vs. Storage: replication is the most commonly used form of redundancy; with $(n, k)$ MDS codes, any $k$ out of $n$ blocks are sufficient for data recovery. Repair Bandwidth vs. Storage: Locally Repairable Codes [Dimakis, IT-Tran ’10]; Regenerative codes for storage [Rashmi, IT-Tran ’12].

  4. Trade-offs in Coding for Distributed Storage. Accessibility vs. Storage: lower blocking probability than replication for the same storage (energy cost) [Ferner, Allerton ’12]. Delay vs. Storage: our work, $k$ out of $n$ fork-join queues; Packet Routing Diversity [Maxemchuk, 1991], [Kabatiansky, 2005], which do not consider queueing; redundant requests, MDS queue [Shah, Lee, 2013].

  5. How Coding Reduces Download Time. Single M/M/1 Queue: requests arrive at rate $\lambda$ and are served at rate $\mu$. Mean response time $T_{1,1} = \frac{1}{\mu - \lambda}$ for Poisson arrivals and departures. [Figure: a single queue with arrival rate $\lambda$ and service rate $\mu$.]

  6. How Coding Reduces Download Time. Multiple Copies Give Diversity, but with More Storage: each request is sent to $n$ disks storing copies of the content, and it needs to wait for the download of only one of the $n$ copies. Mean response time $T_{n,1} = \frac{1}{n\mu - \lambda}$, but storage increases $n$-fold. [Figure: a request stream of rate $\lambda$ sent to $n$ queues, each with service rate $\mu$.]
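The two closed forms on the M/M/1 and replication slides can be compared numerically. The following is a minimal sketch; the function name `replicated_mean_response_time` and the sample values of $\lambda$ and $\mu$ are illustrative assumptions, not part of the talk.

```python
def replicated_mean_response_time(lam: float, mu: float, n: int = 1) -> float:
    """Mean response time T_{n,1} = 1 / (n*mu - lam).

    n = 1 reduces to the single M/M/1 queue, T_{1,1} = 1 / (mu - lam).
    The function name is hypothetical; only the formula comes from the slides.
    """
    if n * mu <= lam:
        raise ValueError("unstable: need n * mu > lam")
    return 1.0 / (n * mu - lam)


if __name__ == "__main__":
    lam, mu = 1.0, 3.0  # assumed example values
    for n in (1, 2, 4):
        t = replicated_mean_response_time(lam, mu, n)
        print(f"n={n}: storage x{n}, mean response time {t:.4f}")
```

The loop makes the cost visible: each extra copy lowers the mean response time but adds one full unit of storage.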

  7. How Coding Reduces Download Time. Coding Gives Diversity with Lower Storage: content is divided into $k$ blocks and encoded into $n$ blocks. Each disk stores $1/k$ units, so the service rate becomes $\mu' = k\mu$. Downloading any $k$ blocks is sufficient to decode the file. [Figure: a request stream of rate $\lambda$ forked to $n$ queues, each with service rate $k\mu$.]

  8. Definition: $(n, k)$ Fork-Join System. Request arrivals are Poisson with rate $\lambda$. A request is forked into $n$ tasks, which enter FCFS queues at the $n$ disks. The time to download one block of content is $\sim \exp(\mu')$, where $\mu' = k\mu$. The load factor is $\rho = \lambda / \mu'$ for each queue. [Figure: a request stream of rate $\lambda$ forked to $n$ queues, each with service rate $k\mu$.]
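A small event-driven simulation can make the definition concrete. The sketch below assumes that once $k$ tasks of a request finish, its remaining tasks are removed from all queues (as the "Abandon" label in the talk's figures suggests), and it relies on exponential service being memoryless. The function name, defaults, and seed are illustrative assumptions.

```python
import random
from collections import deque


def simulate_fork_join(n, k, lam, mu, num_jobs=50_000, seed=0):
    """Estimate the mean response time of an (n, k) fork-join system.

    Assumptions (not from the slides): leftover tasks are abandoned as soon
    as k tasks of a job finish; service times are exp(k * mu) at every disk.
    """
    rng = random.Random(seed)
    mu_prime = k * mu                      # per-block service rate
    queues = [deque() for _ in range(n)]   # FCFS queue of job ids per disk
    arrival_time, tasks_done = {}, {}
    response_times = []
    t, next_job = 0.0, 0

    while len(response_times) < num_jobs:
        busy = [q for q in range(n) if queues[q]]
        total_rate = lam + len(busy) * mu_prime
        t += rng.expovariate(total_rate)   # time to the next event
        if rng.random() < lam / total_rate:
            # Arrival: fork the job into one task per disk.
            arrival_time[next_job] = t
            tasks_done[next_job] = 0
            for q in queues:
                q.append(next_job)
            next_job += 1
        else:
            # Service completion at a uniformly chosen busy disk.
            job = queues[rng.choice(busy)].popleft()
            tasks_done[job] += 1
            if tasks_done[job] == k:
                response_times.append(t - arrival_time.pop(job))
                del tasks_done[job]
                for q in queues:           # abandon the remaining tasks
                    if job in q:
                        q.remove(job)
    return sum(response_times) / len(response_times)


if __name__ == "__main__":
    print(simulate_fork_join(n=3, k=2, lam=1.0, mu=1.0))
```

For $k = 1$ all queues stay synchronized, so the estimate should be close to $1/(n\mu - \lambda)$, which gives a quick sanity check on the simulator.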

  9. Fork-Join Queues: Example. A content file of unit size is divided into $k = 2$ blocks, $a$ and $b$, and encoded into 3 blocks: $a$, $b$, and $a + b$. Downloading any 2 blocks is sufficient to decode the entire file. Storage is 50% higher, but response time is reduced. [Figure: disks storing blocks $a$, $b$, and $a + b$.]
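The $(3, 2)$ example can be sketched in a few lines. The code below assumes that "$a + b$" denotes addition over GF(2), i.e. bytewise XOR, and that the file has even length; the function names and the dictionary keys are illustrative choices, not from the talk.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks (addition over GF(2))."""
    return bytes(p ^ q for p, q in zip(x, y))


def encode(data: bytes) -> dict:
    """Split data into halves a, b and add the parity block a+b."""
    if len(data) % 2:
        raise ValueError("sketch assumes an even-length file")
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return {"a": a, "b": b, "a+b": xor_bytes(a, b)}


def decode(blocks: dict) -> bytes:
    """Recover the file from any two of the three blocks."""
    if "a" in blocks and "b" in blocks:
        a, b = blocks["a"], blocks["b"]
    elif "a" in blocks:
        a = blocks["a"]
        b = xor_bytes(a, blocks["a+b"])   # b = a + (a + b)
    else:
        b = blocks["b"]
        a = xor_bytes(b, blocks["a+b"])   # a = b + (a + b)
    return a + b


if __name__ == "__main__":
    coded = encode(b"forkjoin")
    for missing in coded:
        available = {name: blk for name, blk in coded.items() if name != missing}
        print(missing, "missing ->", decode(available))
```

Each stored block is half the file, so three blocks use 1.5 units of storage, matching the 50% overhead stated on the slide.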

  10. Fork-Join Queues: Example (same example as the previous slide, with the queueing diagram). [Figure: a fork node F, three queues each with arrival rate $\lambda$ and service rate $k\mu$ holding numbered jobs, a join node J, and an "Abandon" label.]

  11. Mean Response Time Challenges. Arrivals to the $n$ queues are perfectly synchronized. Hence the response time is not simply the $k$-th order statistic of exponential random variables. Previous work has attempted to find $T_{n,n}$, but only bounds are known. [Figure: a fork node F, three queues each with arrival rate $\lambda$ and service rate $k\mu$ holding numbered jobs, a join node J, and an "Abandon" label.]

  12. Our Contributions. Bounds on the mean response time of the $(n, k)$ fork-join system. Delay-Storage Trade-offs: for a fixed storage expansion $k/n$, what is the best $n$? For a fixed number of disks $n$, what is the best $k$? Extensions to correlated service times, the $(m, n, k)$ fork-join system, etc. [1] G. Joshi, Y. Liu, E. Soljanin, "Coding for Fast Content Download", Allerton Conference 2012. [2] G. Joshi, Y. Liu, E. Soljanin, "On Delay-Storage Trade-offs in Content Download from Coded Distributed Storage Systems", to appear in JSAC 2014.

  13. Upper Bound on Response Time. Comparison with a split-merge system: in the split-merge system, all $n$ queues are blocked until $k$ tasks finish. The response time of split-merge is always greater than that of fork-join. [Figure: a fork node F, three queues each with arrival rate $\lambda$ and service rate $k\mu$ holding numbered jobs, a join node J, and an "Abandon" label.]

  14. Upper Bound on Response Time. The split-merge system is equivalent to an M/G/1 queue: arrivals are Poisson with rate $\lambda$, and departures are according to $S$, the $k$-th order statistic of $\exp(\mu')$, with $E[S] = \frac{H_n - H_{n-k}}{\mu'}$ and $V[S] = \frac{H_{n^2} - H_{(n-k)^2}}{\mu'^2}$. The mean response time is given by the Pollaczek-Khinchin formula, $T_{n,k} \le E[S] + \frac{\lambda \left( V[S] + E[S]^2 \right)}{2 \left( 1 - \lambda E[S] \right)}$.
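The upper bound is straightforward to evaluate numerically. This sketch assumes that $H_n = \sum_{j=1}^{n} 1/j$ and $H_{n^2} = \sum_{j=1}^{n} 1/j^2$ (the generalized harmonic number), which is what the variance of an exponential order statistic requires; the function names are illustrative.

```python
import math


def harmonic(n: int, power: int = 1) -> float:
    """Generalized harmonic number: sum_{j=1}^{n} 1 / j**power."""
    return sum(1.0 / j**power for j in range(1, n + 1))


def upper_bound(n: int, k: int, lam: float, mu: float) -> float:
    """Split-merge (M/G/1) upper bound on T_{n,k} via Pollaczek-Khinchin."""
    mu_prime = k * mu
    # Mean and variance of the k-th order statistic of n iid exp(mu') times.
    e_s = (harmonic(n) - harmonic(n - k)) / mu_prime
    v_s = (harmonic(n, 2) - harmonic(n - k, 2)) / mu_prime**2
    if lam * e_s >= 1.0:
        return math.inf  # the split-merge queue is unstable; bound is vacuous
    return e_s + lam * (v_s + e_s**2) / (2.0 * (1.0 - lam * e_s))


if __name__ == "__main__":
    # Assumed example values; n = k = 1 reduces to M/M/1, 1 / (mu - lam).
    print(upper_bound(1, 1, lam=1.0, mu=3.0))
    print(upper_bound(3, 2, lam=1.0, mu=1.0))
```

For $n = k = 1$ the service time is a single exponential, so the formula collapses to the M/M/1 mean response time, which is a convenient check.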

  15. Lower Bound on Response Time. Stages of processing of a job: a job goes through $k$ stages of processing, stage $j$ for $0 \le j \le k - 1$. At stage $j$, the job has completed $j$ tasks and is waiting for the remaining $k - j$. The service rate of a job in stage $j$ is at most $(n - j)\mu'$ [Varki]. Hence $T_{n,k} \ge$ sum of the response times of the $k$ stages $= \sum_{j=0}^{k-1} \frac{1}{(n-j)\mu' - \lambda} = \frac{1}{\mu'} \sum_{j=0}^{k-1} \left[ \frac{1}{n-j} + \frac{\rho}{(n-j)(n-j-\rho)} \right] = \frac{1}{\mu'} \left[ H_n - H_{n-k} + \rho \cdot \left( H_{n(n-\rho)} - H_{(n-k)(n-k-\rho)} \right) \right]$, where $H_{n(n-\rho)} = \sum_{j=1}^{n} \frac{1}{j(j-\rho)}$.
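The stage-by-stage sum and its closed form can be checked against each other numerically. The sketch below assumes $H_{n(n-\rho)} = \sum_{j=1}^{n} \frac{1}{j(j-\rho)}$, which is the reading that makes the closed form agree with the sum; the function names are illustrative.

```python
def lower_bound_sum(n: int, k: int, lam: float, mu: float) -> float:
    """Lower bound on T_{n,k} as a sum over the k stages of a job."""
    mu_prime = k * mu
    if (n - k + 1) * mu_prime <= lam:
        raise ValueError("need (n - k + 1) * k * mu > lam")
    return sum(1.0 / ((n - j) * mu_prime - lam) for j in range(k))


def lower_bound_closed_form(n: int, k: int, lam: float, mu: float) -> float:
    """Same bound written with (generalized) harmonic numbers."""
    mu_prime = k * mu
    rho = lam / mu_prime
    if n - k + 1 <= rho:
        raise ValueError("need (n - k + 1) * k * mu > lam")

    def h(m: int) -> float:          # H_m
        return sum(1.0 / j for j in range(1, m + 1))

    def h_rho(m: int) -> float:      # H_{m(m - rho)}, assumed definition
        return sum(1.0 / (j * (j - rho)) for j in range(1, m + 1))

    # Only j = n-k+1..n matter: compute the differences over that range so a
    # term with j <= rho (which cancels analytically) is never evaluated.
    d_h = sum(1.0 / j for j in range(n - k + 1, n + 1))
    d_h_rho = sum(1.0 / (j * (j - rho)) for j in range(n - k + 1, n + 1))
    return (d_h + rho * d_h_rho) / mu_prime


if __name__ == "__main__":
    # Assumed example values.
    print(lower_bound_sum(3, 2, lam=1.0, mu=1.0))
    print(lower_bound_closed_form(3, 2, lam=1.0, mu=1.0))
```

Computing the harmonic differences over $j = n-k+1, \dots, n$ only is a design choice: it avoids dividing by $j - \rho$ for small $j$, where the individual terms can blow up even though they cancel in the difference.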

  16. Flexible Disks, Fixed Storage Expansion. Parameters: expansion $k/n = 1/2$, $\lambda = 1$. More diversity → lower response time. [Figure: upper and lower bounds on mean response time (log scale) versus $n$, with $n = 2k$, for $\mu = 5$, $\mu = 1$, and $\mu = 0.51$.]

  17. How Much Can Double Storage Improve Completion Time? [Figure: fraction of completed downloads versus response time, for a single disk and for $k = 1$, $k = 2$, $k = 5$.]

  18. Comparison to Power-of-$d$. For the same storage, fork-join gives a much faster response. [Figure: mean response time (log scale) versus average time to download one unit of content ($1/\mu$), for the (10, 1) fork-join system, the (20, 2) fork-join system, Power-of-2, and Power-of-10 (LWL job assignment).]

  19. Flexible Storage Expansion, Fixed Disks. Parameters: $n = 10$, $\lambda = 1$, $\mu = 1$. More redundancy → lower response time. [Figure: mean response time $T_{(10,k)}$ from simulation with its upper and lower bounds, and required storage, versus $k$.]

  20. Flexible Storage Expansion, Fixed Disks. [Figure: two panels showing fraction of completed downloads versus response time; M/M/1 with $\lambda = 1$, $\mu = 3$; curves for a single disk, $k = 1$, $k = 5$, and $k = 10$. Annotations: single disk baseline (unit storage), the same total storage, double total storage, 10× increase in storage.]

  21. Correlated Service Times. Service time $X_i = \delta X_d + (1 - \delta) X_{r,i}$, for $i = 1, 2, \dots, n$. More correlation → lose the diversity advantage. [Figure: mean response time versus $k$ for $\delta = 1$, $\delta = 0.5$, and $\delta = 0$; $\lambda = 1$, $\mu = 3$.]
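The loss of diversity under correlation can be illustrated on service times alone, without queueing. The sketch below assumes that the common component $X_d$ and the per-disk components $X_{r,i}$ are independent $\exp(\mu')$ variables, which the slide does not state, and it estimates only the mean time until $k$ of $n$ tasks finish, not the full response time. The function name and parameters are illustrative.

```python
import random


def mean_kth_service_time(n, k, delta, mu_prime, trials=20_000, seed=0):
    """Monte Carlo mean of the k-th smallest of X_i = d*X_d + (1-d)*X_{r,i}.

    Assumption (not from the slides): X_d and all X_{r,i} are iid exp(mu').
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x_d = rng.expovariate(mu_prime)            # component shared by all disks
        times = sorted(
            delta * x_d + (1.0 - delta) * rng.expovariate(mu_prime)
            for _ in range(n)
        )
        total += times[k - 1]                      # time when k tasks are done
    return total / trials


if __name__ == "__main__":
    for delta in (0.0, 0.5, 1.0):
        print(delta, mean_kth_service_time(n=10, k=5, delta=delta, mu_prime=1.0))
```

With $\delta = 1$ every disk takes the same time, so waiting for any $k$ of $n$ is no faster than waiting for one; with $\delta = 0$ the disks are independent and the order-statistic gain is fully available.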

  22. $(m, n, k)$ fork-join system. Large number of disks $m \ge n$. The disks can be divided into $m/n = g$ fork-join systems. [Figure: arrivals at rate $\lambda$ split into streams $\lambda_1, \lambda_2, \dots, \lambda_g$; stream $\lambda_1$ feeds disks $1, \dots, n$, stream $\lambda_2$ feeds disks $n+1, \dots, 2n$, and stream $\lambda_g$ feeds disks $(g-1)n+1, \dots, gn = m$.]

  23. $(m, n, k)$ fork-join system. [Figure: mean response time versus $k$ for the $(12, 6, k)$ and $(12, 12, k)$ systems, each with exponential and Pareto ($\alpha = 1.8$) service times; $\lambda = 1$, $\mu = 3$.]

  24. Concluding Remarks. Major Implications: investigated the delay-storage trade-off in distributed storage; showed that the diversity of more disks helps, for the same storage space used; generalized $(n, n)$ fork-join systems to the $(n, k)$ fork-join system. Future Perspectives: percentile analysis from the CDF of response time; extension to parallel computing instead of storage.
