
Hierarchical Codes: How to Make Erasure Codes Attractive for Peer-to-Peer Storage Systems

Hierarchical Codes: How to Make Erasure Codes Attractive for Peer-to-Peer Storage Systems Alessandro Duminuco and Ernst Biersack EURECOM Sophia Antipolis, France (Best paper award in P2P'08) Presented by: Amir H. Payberah amir@sics.se


  1. Hierarchical Codes: How to Make Erasure Codes Attractive for Peer-to-Peer Storage Systems. Alessandro Duminuco and Ernst Biersack, EURECOM, Sophia Antipolis, France (Best paper award at P2P'08). Presented by: Amir H. Payberah (amir@sics.se). Hierarchical Code, 11 Nov. 2008

  2. What's The Problem?

  3. What Is The Problem? Does file backup fit the P2P model?

  4. Churn and Redundancy
     • The challenge in the P2P model is to provide storage reliability under churn.
     • The key solution is to add redundancy to the data.

  5. The Basic Solution: Replication
     • With 4 replicas, even if 3 peers are offline we still have the file.
     • Every file consumes 4 times its size in storage.

  6. A Better Solution: Coding
     • Any k fragments are sufficient to reconstruct the file:
       - We can sustain any h losses.
     • Every file consumes (k+h)/k times its size in storage:
       - If k=6 and h=3, (k+h)/k = 1.5, instead of 4.
     • [Figure: Coding and Dissemination; the file is split into k fragments and coded into k + h fragments.]

  7. Repair Communication Cost
     • Replication
     • Coding: a repair means combining k fragments into a new one.
     • To create a single fragment we must transfer k fragments, i.e. the size-equivalent of the whole file.

  8. Storage vs Repair Cost
     • If we want to sustain 10 losses, the repair cost makes coding unattractive.

  9. Motivation: Can we mitigate the repair cost of coding while retaining storage efficiency?

  10. Efficiency Metrics
      • Redundancy factor
        - β = |S| / |O|
        - |S|: size of the stored data.
        - |O|: size of the original data.
      • Repair degree
        - The amount of data read relative to the amount of new redundant data created.
        - Denoted as d.

  11. Efficiency Analysis
      • Replication: β = R, d = 1
      • Block replication: β = R, d = 1
      • Erasure codes: β = (k + h) / k, d = k
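The two metrics above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the numbers (4 replicas, k=6, h=3) are the ones used on the earlier slides.

```python
from fractions import Fraction

def replication_metrics(replicas):
    """Return (redundancy factor beta, repair degree d) for R full replicas."""
    return Fraction(replicas), 1

def erasure_code_metrics(k, h):
    """Return (beta, d) for a (k, h) erasure code."""
    return Fraction(k + h, k), k

# Both configurations tolerate 3 lost peers.
rep_beta, rep_d = replication_metrics(4)
ec_beta, ec_d = erasure_code_metrics(6, 3)

print(f"replication:  beta={float(rep_beta):.2f}, d={rep_d}")
print(f"erasure code: beta={float(ec_beta):.2f}, d={ec_d}")
```

The erasure code stores far less data, but each repair reads k times the amount of data it creates, which is the trade-off the rest of the talk addresses.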

  12. Linear Codes
      • A specific implementation of erasure codes.
      • $f_i$: the $i$-th original fragment.
      • $b_i$: the $i$-th stored fragment.
      • $c_{i,j}$: coefficients.
      • A $(k,h)$-code is defined by
        $$ b_i = \begin{cases} f_i & i \le k \\ \sum_{j=1}^{k} c_{i,j} f_j & k < i \le k + h \end{cases} $$
      • [Figure: code graph with original fragments f1-f4 (k) and stored fragments b1-b7 (k + h).]
      • Any 4 of these 7 fragments can reconstruct the original file if the coefficients are linearly independent (more on this later).
      • Repair degree d = k.

  13. Hierarchical Code
      • Additional fragments can be linear combinations of a subset of the original ones.
      • [Figure: code graph with original fragments f1-f4 (k) and stored fragments b1-b7 (k + h).]
      • Not all subsets of 4 fragments are sufficient to reconstruct the file.
      • The repair cost varies according to the particular fragments that are available (we can have d < k).

  14. Comparison
      • [Figure: two code graphs, each with original fragments f1-f4 (k) and stored fragments b1-b7 (k + h).]

  15. Generalizing The Concept
      • If we take a 64+64 traditional linear code and we apply the same idea hierarchically...
      • If we set the hierarchy differently, we obtain a different trade-off.

  16. Experiments

  17. Synthetic Data
      • An event-driven simulator.
      • They compared a 64+64 Reed-Solomon code (linear code) with one instance of a 64+64 Hierarchical code.
      • They generated synthetic peer behavior with exponentially distributed uptimes, downtimes and lifetimes.
      • As a general rule, the smaller the up-ratio, the higher the number of repairs.
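A rough sketch of how such synthetic peer behavior could be generated, not the authors' simulator. The function name, the mean values, the choice to start every peer online, and the reading of "up-ratio" as mean uptime / (mean uptime + mean downtime) are all assumptions made for illustration.

```python
import random

def simulate_peer(mean_up, mean_down, mean_life, rng):
    """Alternate exponential up/down sessions until the peer's lifetime ends.

    Returns (time spent online, total lifetime).
    """
    lifetime = rng.expovariate(1.0 / mean_life)
    clock, online_time, online = 0.0, 0.0, True  # assumption: peers start online
    while clock < lifetime:
        mean = mean_up if online else mean_down
        # Truncate the last session at the end of the lifetime.
        session = min(rng.expovariate(1.0 / mean), lifetime - clock)
        if online:
            online_time += session
        clock += session
        online = not online
    return online_time, lifetime

rng = random.Random(1)  # fixed seed so the run is repeatable
results = [simulate_peer(2.0, 6.0, 1000.0, rng) for _ in range(2000)]
measured_up_ratio = sum(up for up, _ in results) / sum(life for _, life in results)
expected_up_ratio = 2.0 / (2.0 + 6.0)
print(measured_up_ratio, expected_up_ratio)
```

Lowering `mean_up` relative to `mean_down` lowers the up-ratio, which is the regime in which the slide reports more repairs.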

  18. Synthetic Data Results

  19. Real Data
      • PlanetLab traces consist of 669 nodes monitored for 500 days.
      • KAD traces record the availability of about 6500 peers in the KAD network over about 5 months.

  20. Real Data Results
      • PlanetLab
      • KAD

  21. Conclusion

  22. Conclusion
      • They proposed a new class of erasure codes called Hierarchical Codes.
      • They aim at coupling the communication efficiency of replication with the storage efficiency of coding.
      • Experiments showed that Hierarchical Codes require more repairs, but those repairs are so cheap that the resulting communication cost is smaller.

  23. More Detail About Coding

  24. Linear Codes
      • $f_i$: the $i$-th original fragment.
      • $b_i$: the $i$-th stored fragment.
      • $c_{i,j}$: coefficients.
        $$ b_i = \begin{cases} f_i & i \le k \\ \sum_{j=1}^{k} c_{i,j} f_j & k < i \le k + h \end{cases} $$
      • Any 4 of these 7 fragments can reconstruct the original file if the coefficients are linearly independent.
      • [Figure: code graph with original fragments f1-f4 (k) and stored fragments b1-b7 (k + h).]

  25. Linear Codes
      • $B = C' F$ and $F = S^{-1} B_S$.
      • If every sub-matrix $S$ built using $k$ rows from $C'$ is invertible, then the original fragments can always be reconstructed by $F = S^{-1} B_S$.
      • $B_S$: the $k$-long subvector of $B$ corresponding to the rows of coefficients chosen in $S$.
      • If this property is satisfied, the code obtained is a $(k,h)$-code.

  26. Coefficient Matrix
      • Reed-Solomon Codes
      • Random Linear Codes

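  27. Reed-Solomon
      • $I_{k,k}$: identity matrix.
      • $C_{h,k}$: coefficient matrix.
        $$ B = \begin{bmatrix} I \\ C \end{bmatrix} F = C' F $$
      • If $k = 2$ and $h = 3$:
        $$ B = \begin{bmatrix} I \\ C \end{bmatrix} F = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ c_{1,1} & c_{1,2} \\ c_{2,1} & c_{2,2} \\ c_{3,1} & c_{3,2} \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \\ c_{1,1} f_1 + c_{1,2} f_2 \\ c_{2,1} f_1 + c_{2,2} f_2 \\ c_{3,1} f_1 + c_{3,2} f_2 \end{bmatrix} $$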

  28. Reed-Solomon Codes
      • Define the matrix $C$ as an $h \times k$ Vandermonde matrix.
      • $c_{i,j} = a_i^{\,j-1}$

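  29. Reed-Solomon Codes
      • $k = 2$, $h = 3$
      • $c_{i,j} = i^{\,j-1}$ (that is, $a_i = i$)
        $$ B = \begin{bmatrix} I \\ C \end{bmatrix} F = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \\ f_1 + f_2 \\ f_1 + 2 f_2 \\ f_1 + 3 f_2 \end{bmatrix} $$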

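  30. Reed-Solomon Codes
        $$ B = \begin{bmatrix} I \\ C \end{bmatrix} F = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \\ f_1 + f_2 \\ f_1 + 2 f_2 \\ f_1 + 3 f_2 \end{bmatrix} $$
      • Choosing the first and last rows of $C'$:
        $$ S = \begin{bmatrix} 1 & 0 \\ 1 & 3 \end{bmatrix}, \qquad S^{-1} B_S = \begin{bmatrix} 1 & 0 \\ -1/3 & 1/3 \end{bmatrix} \begin{bmatrix} f_1 \\ f_1 + 3 f_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = F $$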
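The worked example can be checked with a short script. This is a sketch under stated assumptions: it uses exact rational arithmetic as on the slides (practical Reed-Solomon implementations work in a finite field instead), the helper names are hypothetical, and the fragment values 7 and 5 are arbitrary.

```python
from fractions import Fraction
from itertools import combinations

def vandermonde_rows(h, k):
    """h x k matrix with entry (i, j) = i ** (j - 1) in 1-based indexing."""
    return [[Fraction(i + 1) ** j for j in range(k)] for i in range(h)]

def encode(c_prime, fragments):
    """B = C' F: every stored block is a linear combination of the fragments."""
    return [sum(c * f for c, f in zip(row, fragments)) for row in c_prime]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination; None if singular."""
    n = len(matrix)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

k, h = 2, 3
identity = [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]
c_prime = identity + vandermonde_rows(h, k)  # rows [1,0],[0,1],[1,1],[1,2],[1,3]
F = [Fraction(7), Fraction(5)]               # arbitrary example fragments
B = encode(c_prime, F)

# Rebuild F from every possible pair of surviving blocks (F = S^-1 B_S).
recovered = {
    rows: solve([c_prime[r] for r in rows], [B[r] for r in rows])
    for rows in combinations(range(k + h), k)
}
print(recovered[(0, 4)])  # the pair used on the slide: first and last rows
```

Every one of the ten possible pairs of blocks yields the original fragments, which is the (k,h)-code property for this small instance.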

  31. Random Linear Code
      • It has been shown that a random $k \times k$ matrix $S$ over $GF(2^q)$ is invertible with a probability that depends only on the field size and increases with it.
        - $GF(2^q)$: Galois field whose elements can be expressed as $q$-bit words.
      • If $q \ge 16$, the probability can be considered practically 1.
      • This means that, in practice, any $k \times k$ sub-matrix of $C'$ is invertible and the property of a $(k,h)$-code is satisfied.
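To put numbers on this claim, the sketch below evaluates the standard counting formula for the probability that a uniformly random k × k matrix over a field with Q elements is invertible, namely the product of (1 − Q^(−i)) for i = 1..k. The function name is hypothetical, and k = 64 is simply the code size used in the experiments.

```python
def invertible_probability(k, q):
    """Probability that a uniformly random k x k matrix over GF(2^q) is invertible."""
    probability = 1.0
    for i in range(1, k + 1):
        # The i-th factor excludes rows falling in the span of earlier rows.
        probability *= 1.0 - 2.0 ** (-q * i)
    return probability

for q in (1, 8, 16):
    print(q, invertible_probability(64, q))
```

The product is dominated by its first factor, so the result is close to 1 − 2^(−q) and barely changes with k, which matches the slide's remark that the probability depends essentially on the field size.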

  32. Information Flow Graph (Code Graph)
      • Represents the evolution of the stored data through time.
      • [Figure: source nodes F (f1, f2) at time 0 and storage nodes B_1, ..., B_{t-1}, B_t (each with b1, b2, b3) at times 1, ..., t-1, t.]

  33. Information Flow Graph (Code Graph)
      • [Figure: source nodes F (f1, f2) at time 0 and storage nodes B_1, ..., B_{t-1}, B_t (each with b1, b2, b3) at times 1, ..., t-1, t.]
      • Proposition 1: At any time $t$, any of the possible selections of $k$ nodes $B_t^k$ is sufficient to reconstruct the original fragments only if the disjoint paths condition holds at time step $t = 1$ and the repair degree $d \ge k$.
      • A random linear code satisfies this condition:
        - By design, any node in $B_1$ is connected to all the source nodes in $F$.

  34. Block Replication vs Linear Codes
      • k = 8, h = 16 and R = 3
      • Block replication: d = 1
      • Linear codes: d = k

  35. Question? Is there a design space between these two limits that can be explored to find a better trade-off between storage efficiency and repair degree?

  36. Hierarchical Codes

  37. Hierarchical Code Graph – Step 1
      • Choose $k_0$ and $h_0$ and build a $(k_0, h_0)$-code:
        $$ b_i = \begin{cases} f_i & i \le k_0 \\ \sum_{j=1}^{k_0} c_{i,j} f_j & k_0 < i \le k_0 + h_0 \end{cases} $$
      • Example: $k_0 = 2$, $h_0 = 1$ gives a hierarchical (2, 1)-code.
      • [Figure: group $G_{2,1}$ with original fragments f1, f2 and stored fragments b1, b2, b3.]
      • The generated group is denoted $G_{d_0,1}$, where $d_0 = k_0$.

  38. Hierarchical Code Graph – Step 2
      • Choose $g_1$ and $h_1$.
      • Replicate $G_{d_0,1}$ $g_1$ times.
        - The $g_1$ groups are denoted $G_{d_0,1}, \dots, G_{d_0,g_1}$.
      • Then add $h_1$ further redundant blocks.
        - Each combines all the existing $g_1 k_0$ original fragments $F$.
      • The new group is denoted $G_{d_1,1}$:
        - a hierarchical $(d_1, H_1)$-code,
        - $H_1 = g_1 h_0 + h_1$,
        - $d_1 = g_1 k_0 = g_1 d_0$.
      • Example: $g_1 = 2$, $h_1 = 1$ gives a hierarchical (4, 3)-code.
      • [Figure: group $G_{4,1}$ containing groups $G_{2,1}$ and $G_{2,2}$, with original fragments f1-f4 and stored fragments b1-b7.]
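The hierarchical (4, 3)-code from Step 2 is small enough to enumerate. The sketch below is illustrative only: the coefficient values, the assignment of b5 and b6 to the two inner groups, and the use of rational rather than Galois-field arithmetic are all assumptions. It counts which 4-block subsets can rebuild the file and checks that a local block can be repaired inside its own group.

```python
from fractions import Fraction
from itertools import combinations

# Coefficient rows over (f1, f2, f3, f4); the numeric values are assumptions.
ROWS = {
    "b1": [1, 0, 0, 0], "b2": [0, 1, 0, 0],  # originals of one inner group
    "b3": [0, 0, 1, 0], "b4": [0, 0, 0, 1],  # originals of the other inner group
    "b5": [1, 1, 0, 0],                      # local redundancy over f1, f2
    "b6": [0, 0, 1, 1],                      # local redundancy over f3, f4
    "b7": [1, 2, 3, 4],                      # top-level redundancy over f1..f4
}

def rank(rows):
    """Rank of a list of coefficient rows, by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

k = 4
subsets = list(combinations(ROWS, k))
decodable = [s for s in subsets if rank([ROWS[b] for b in s]) == k]
failing = [s for s in subsets if s not in decodable]
print(len(decodable), "of", len(subsets), "subsets can rebuild the file")

# b5 lies in the span of b1 and b2, so it can be rebuilt from 2 blocks (d = 2 < k).
local_repair_ok = rank([ROWS["b1"], ROWS["b2"], ROWS["b5"]]) == 2
print("b5 repairable from b1 and b2:", local_repair_ok)
```

With these coefficients, 27 of the 35 subsets decode. The 8 that fail are exactly those containing all three blocks of one inner group, which is the price paid for the cheaper local repairs.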

  39. Hierarchical Code Graph – Step 3
      • Repeat Step 2 several times.
      • $H_s = g_s H_{s-1} + h_s$
      • $d_s = g_s d_{s-1}$
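The recurrences of Step 3 can be unrolled with a small helper. The function name and the second, deeper hierarchy are hypothetical; only the first call reproduces the example from the slides.

```python
def hierarchical_parameters(k0, h0, levels):
    """Apply d_s = g_s * d_{s-1} and H_s = g_s * H_{s-1} + h_s per level.

    `levels` is a list of (g_s, h_s) pairs; returns [(d_0, H_0), (d_1, H_1), ...].
    """
    d, H = k0, h0
    history = [(d, H)]
    for g, h in levels:
        d, H = g * d, g * H + h
        history.append((d, H))
    return history

# Slides' example: k0 = 2, h0 = 1, then g1 = 2, h1 = 1.
slide_example = hierarchical_parameters(2, 1, [(2, 1)])
print(slide_example)

# Hypothetical deeper hierarchy: double the groups and add one block per level.
deeper = hierarchical_parameters(2, 1, [(2, 1)] * 5)
print(deeper)
```

Changing the `(g_s, h_s)` pairs changes how much redundancy sits at each level, which corresponds to the different trade-offs mentioned earlier in the talk.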
