Windows Azure Storage Coding Solution
[Figure: Microsoft Azure code with data symbols X1-X7 and Y1-Y7, local parities PX and PY (the X-code and the Y-code), and global parities P1 and P2]
[Figure: RS code with data symbols X1-X6 and parities P1-P3]
Comparison: In terms of reliability and number of helper nodes contacted for node repair, the two codes are comparable. The overheads, however, are quite different: 1.29 for the Azure code versus 1.5 for the RS code. This difference has reportedly saved Microsoft millions of dollars.
Huang, Simitci, Xu, Ogus, Calder, Gopalan, Li, Yekhanin, “Erasure Coding in Windows Azure Storage,” USENIX, Boston, MA, 2012.
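The two overhead figures follow from counting symbols. A minimal sketch, assuming the symbol counts read off the figure labels (14 data symbols plus 2 local and 2 global parities for the Azure code, 6 data symbols plus 3 parities for the RS code); the function name is illustrative only:

```python
from fractions import Fraction

def storage_overhead(data_symbols: int, parity_symbols: int) -> Fraction:
    """Stored symbols divided by data symbols."""
    return Fraction(data_symbols + parity_symbols, data_symbols)

# Symbol counts are assumptions taken from the figure labels.
azure = storage_overhead(14, 4)   # X1..X7, Y1..Y7 + PX, PY, P1, P2
rs = storage_overhead(6, 3)       # X1..X6 + P1..P3

print(f"Azure: {float(azure):.2f}  RS: {float(rs):.2f}")
```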
Codes with Hierarchical Locality
[4, 3, 2] code ⇒ (3,1) code
[12, 8, 3] code ⇒ (8,4) code
[24, 14, 6] code ⇒ (14,10) code
Codes with hierarchical locality do exactly that by calling for help from an intermediate layer of codes when the local code fails. These codes may be regarded as the “middle codes”.
B. Sasidharan, G. K. Agarwal, PVK, “Codes With Hierarchical Locality,” arXiv:1501.06683 [cs.IT].
Codes with Local Regeneration
Codes with Locality: Minimize repair degree
Regenerating Codes: Minimize repair BW
Codes with Local Regeneration: Small repair BW and small repair degree
A single code that has both locality and regeneration properties and inherent double replication of data
G. M. Kamath, N. Prakash, V. Lalitha, PVK, “Codes With Local Regeneration and Erasure Correction,” T-IT, Aug. 2014.
An Example Code with Local Regeneration
The construction can make use of an all-symbol local scalar code and is also optimal:
[Figure: three local codes (Local Code 1, Local Code 2, Local Code 3) built from symbols 1-9 and parities P1, P2, P3, shown together with the underlying scalar all-symbol locality code, whose three local codes are labeled 1, 2, ..., 9 with parity P1, P2 or P3]
Codes with Availability (Recovery from Simultaneous Multiple Erasures)
Recovery in Parallel
c11 c12 c13 c14 c15
c21 c22 c23 c24 c25
c31 c32 c33 c34 c35
c41 c42 c43 c44 c45
c51 c52 c53 c54 c55
[Figure: two of the symbols are marked X as erased]
Last column is a parity check on entries to the left in the same row
Last row is a parity check on entries above in the same column
Can recover locally from 2 erasures in parallel
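Parallel recovery can be sketched in a few lines. This is a minimal illustration, assuming the parity checks are bytewise XOR and using 0-indexed positions; the data values and the erasure positions are made up for the demo. Each erased symbol is rebuilt from a row or column that contains no other erasure, so neither repair waits on the other.

```python
from functools import reduce
from operator import xor

def xor_all(values):
    return reduce(xor, values, 0)

def encode(data):
    """Extend a 4x4 data array to 5x5 with a row-parity column and a column-parity row."""
    rows = [list(r) + [xor_all(r)] for r in data]
    rows.append([xor_all(col) for col in zip(*rows)])
    return rows

def recover_parallel(code, erased):
    """Repair every erased symbol using only symbols that were never erased."""
    erased = set(erased)
    # Mask erased positions so they cannot be used by accident.
    seen = [[None if (i, j) in erased else v for j, v in enumerate(row)]
            for i, row in enumerate(code)]
    repaired = {}
    for (i, j) in erased:
        row = [(i, c) for c in range(5) if c != j]
        col = [(r, j) for r in range(5) if r != i]
        for helpers in (row, col):
            if erased.isdisjoint(helpers):
                repaired[(i, j)] = xor_all(seen[r][c] for r, c in helpers)
                break
        else:
            raise ValueError(f"symbol {(i, j)} has no erasure-free row or column")
    return repaired

data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
code = encode(data)
fixed = recover_parallel(code, [(1, 1), (1, 2)])  # two erasures in the same row
print(fixed)
```

Two erasures that share a row cannot share a column, so one of the two parity checks is always usable for each of them.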
Codes with Sequential Recovery (Recovery from Simultaneous Multiple Erasures)
Sequential Recovery
[Figure: the same 5 × 5 array c11 ... c55, with three of the symbols marked X as erased]
Same code as before
Can recover locally from 3 erasures in a sequential manner
Sequential recovery enables codes with larger storage efficiency
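Sequential recovery differs in that a repaired symbol may serve as a helper for a later repair. A minimal sketch under the same assumptions as the array on the slide (XOR parities, 0-indexed positions, made-up data and erasure pattern):

```python
from functools import reduce
from operator import xor

def xor_all(values):
    return reduce(xor, values, 0)

def encode(data):
    """Extend a 4x4 data array to 5x5 with a row-parity column and a column-parity row."""
    rows = [list(r) + [xor_all(r)] for r in data]
    rows.append([xor_all(col) for col in zip(*rows)])
    return rows

def recover_sequential(code, erased):
    """Repair erasures one at a time; already-repaired symbols can help later repairs."""
    work = [row[:] for row in code]
    pending = set(erased)
    for (i, j) in pending:
        work[i][j] = None
    order = []
    while pending:
        for (i, j) in sorted(pending):
            row = [(i, c) for c in range(5) if c != j]
            col = [(r, j) for r in range(5) if r != i]
            helpers = next((h for h in (row, col) if pending.isdisjoint(h)), None)
            if helpers is not None:
                work[i][j] = xor_all(work[r][c] for r, c in helpers)
                pending.remove((i, j))
                order.append((i, j))
                break
        else:
            raise ValueError("no remaining erasure has an erasure-free row or column")
    return work, order

data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
code = encode(data)
# (1, 1) shares its row with (1, 2) and its column with (0, 1),
# so it cannot be repaired until one of them has been restored.
restored, order = recover_sequential(code, [(0, 1), (1, 1), (1, 2)])
print(order, restored == code)
```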
References - Codes for Multiple Erasures
1. A. Wang and Z. Zhang, “Repair locality with multiple erasure tolerance,” IEEE Trans. Inf. Theory, Nov. 2014.
2. N. Prakash, V. Lalitha, and P. V. Kumar, “Codes with locality for two erasures,” in Proc. IEEE Int. Symp. Inform. Theory (ISIT), 2014.
3. W. Song and C. Yuen, “Binary locally repairable codes - sequential repair for multiple erasures,” in Proc. IEEE GLOBECOM, 2016.
Functioning of an Example Coupled-Layer MSR Code
Goal: To show that a larger sub-packetization level is not necessarily a problem for implementation
Example Coupled-Layer MSR Code
Our coupled-layer perspective on the Ye-Barg construction (2)
a (4, 2) MSR code
6 nodes, sub-packetization level is ℓ = 8
6 × 8 = 48 points
in the example to follow, each point stores 2MB
[Figure: data cube with axes x, y and z; planes run from z = (0,0,0) to z = (1,1,1); each point is 2MB]
1. M. Ye and A. Barg, “Explicit constructions of optimal-access MDS codes with nearly optimal sub-packetization,” May 2016.
2. B. Sasidharan, M. Vajha, and PVK, “An Explicit, Coupled-Layer Construction of a High-Rate MSR Code with Low Sub-Packetization Level, Small Field Size and d < (n − 1),” to be presented at ISIT 2017.
Consider a file of size 64MB
• Will encode via a [k=4, m=2] MSR Code
• Called the Coupled-Layer MSR Code
Step 1: Break file into k = 4 data chunks, each of 16MB.
Data cube representation of CL-MSR Code
The cube has:
● 6 columns, each associated to a distinct node
● 8 horizontal planes, from z = (0,0,0) to z = (1,1,1)
● A column has 8 points
● Each point corresponds to 2MB of storage
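The counts on this slide can be checked with a little arithmetic. A minimal sketch, assuming planes are indexed by binary triples z = (z0, z1, z2), as the figure labels z = (0,0,0) and z = (1,1,1) suggest; the variable names are illustrative only:

```python
from itertools import product

NODES = 6      # 4 systematic + 2 parity columns
POINT_MB = 2   # storage per point

planes = list(product((0, 1), repeat=3))          # z = (z0, z1, z2)
points = [(node, z) for node in range(NODES) for z in planes]

per_node_mb = len(planes) * POINT_MB              # one column of the cube
total_mb = len(points) * POINT_MB                 # the whole cube
print(len(planes), len(points), per_node_mb, total_mb)  # 8 48 16 96
```

The 96MB total for a 64MB file is the same 1.5 storage overhead as any code with 4 data nodes and 2 parity nodes.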
Place four 16MB chunks in four systematic nodes
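As a scaled-down illustration of this placement, the sketch below treats one byte as standing in for 1MB. The slides do not say how a chunk is mapped onto the 8 points of its column, so the contiguous slicing used here is an assumption, as are the function and variable names.

```python
def layout_systematic(file_bytes: bytes, k: int = 4, planes: int = 8) -> dict:
    """Map a file onto k systematic columns, each holding `planes` points.

    Assumes len(file_bytes) is divisible by k * planes.
    """
    chunk_len = len(file_bytes) // k
    point_len = chunk_len // planes
    cube = {}
    for node in range(k):
        chunk = file_bytes[node * chunk_len:(node + 1) * chunk_len]
        for z in range(planes):
            # Assumed mapping: plane z holds the z-th contiguous slice of the chunk.
            cube[(node, z)] = chunk[z * point_len:(z + 1) * point_len]
    return cube

toy_file = bytes(range(64))            # 64 bytes standing in for 64MB
cube = layout_systematic(toy_file)
print(len(cube), len(cube[(0, 0)]))    # 32 points of 2 bytes each
```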
We now have the systematic nodes
We will now compute the parity nodes (actual data cube A)
Will get there through an intermediate “Virtual data cube”
[Figure: actual data cube A alongside virtual data cube B]
Start filling the virtual data cube on the right as follows
Certain pairs of points in the cube are “coupled”
[Figure: a coupled pair of points A1, A2 in the actual data cube]
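The slides do not spell out which points are paired. The sketch below assumes the pairing rule of the published coupled-layer construction: the 6 nodes sit on a 2 × 3 grid (x, y), planes are indexed by z = (z0, z1, z2), a point (x, y, z) is unpaired when z_y = x, and otherwise its partner is the point at node (z_y, y) in the plane obtained from z by replacing z_y with x. Treat the rule, the grid shape and the names as assumptions rather than a statement of what the slides show.

```python
from itertools import product

Q, T = 2, 3   # assumed node grid: x in {0, 1}, y in {0, 1, 2}

def companion(x, y, z):
    """Return the point coupled with (x, y, z), or None if the point is unpaired."""
    if z[y] == x:
        return None
    z_swapped = z[:y] + (x,) + z[y + 1:]
    return (z[y], y, z_swapped)

points = [(x, y, z)
          for x in range(Q) for y in range(T)
          for z in product(range(Q), repeat=T)]
unpaired = [p for p in points if companion(*p) is None]
pairs = {frozenset((p, companion(*p))) for p in points if companion(*p) is not None}

print(len(points), len(unpaired), len(pairs))
```

Under this rule both members of a pair share the same y, so with the systematic nodes taken as y = 0, 1 and the parity nodes as y = 2, no pair mixes a systematic node with a parity node.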
The Coupling Transform is a 2x2 matrix transform
[Figure: the coupling transform takes a coupled pair (A1, A2) to the pair (B1, B2)]
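To make the 2x2 transform concrete, here is a minimal sketch over GF(2^8). The slides only say that the transform is a 2x2 matrix; the specific matrix [[1, γ], [γ, 1]] with γ = 2, the field and its polynomial 0x11D, and all function names are assumptions chosen for illustration.

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two bytes in GF(2^8), reducing modulo `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def gf_inv(a: int) -> int:
    """Multiplicative inverse in GF(2^8), computed as a**254."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

GAMMA = 2                                    # assumed coefficient, gamma^2 != 1
DET_INV = gf_inv(1 ^ gf_mul(GAMMA, GAMMA))   # 1 / (1 + gamma^2)

def couple(a1: bytes, a2: bytes):
    """(A1, A2) -> (B1, B2) = (A1 + g*A2, g*A1 + A2), applied byte by byte."""
    b1 = bytes(x ^ gf_mul(GAMMA, y) for x, y in zip(a1, a2))
    b2 = bytes(gf_mul(GAMMA, x) ^ y for x, y in zip(a1, a2))
    return b1, b2

def decouple(b1: bytes, b2: bytes):
    """Inverse transform: (B1, B2) -> (A1, A2)."""
    a1 = bytes(gf_mul(DET_INV, x ^ gf_mul(GAMMA, y)) for x, y in zip(b1, b2))
    a2 = bytes(gf_mul(DET_INV, gf_mul(GAMMA, x) ^ y) for x, y in zip(b1, b2))
    return a1, a2

a1, a2 = bytes(range(256)), bytes(reversed(range(256)))
b1, b2 = couple(a1, a2)
print(decouple(b1, b2) == (a1, a2))
```

Because the arithmetic is bytewise, the same functions apply unchanged to 2MB points.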
Place the points obtained in the Virtual data cube
[Figure sequence: each coupled pair (A1, A2) is passed through the coupling transform and the resulting (B1, B2) are placed in the virtual data cube]
Red dotted points are not paired; they are simply carried over (copied)
We now have the data part of the Virtual data cube
Each plane is Reed-Solomon coded to obtain parity points
RS Encode is run on each of the 8 planes in turn: z = (0,0,0), (1,0,0), (0,1,0), (1,1,0), (0,0,1), (1,0,1), (0,1,1), (1,1,1)
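The per-plane step can be sketched with a small systematic Reed-Solomon code: the 4 data symbols of a plane are read as values of a polynomial of degree below 4, and the 2 parity symbols are two further evaluations. The field GF(2^8), the polynomial 0x11D, the evaluation points 0..5 and the function names are assumptions; the slides do not fix these details. One byte per point is shown, and a 2MB point would be processed byte by byte.

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two bytes in GF(2^8), reducing modulo `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def gf_inv(a: int) -> int:
    """Multiplicative inverse in GF(2^8), computed as a**254."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def interpolate_at(known: dict, x: int) -> int:
    """Lagrange-evaluate at x the polynomial through the points {xi: yi}."""
    total = 0
    for xi, yi in known.items():
        term = yi
        for xj in known:
            if xj != xi:
                # Subtraction is XOR in characteristic 2.
                term = gf_mul(term, gf_mul(x ^ xj, gf_inv(xi ^ xj)))
        total ^= term
    return total

def rs_encode_plane(data):
    """Return the 2 parity symbols for the 4 data symbols of one plane."""
    known = dict(enumerate(data))        # data symbol i sits at evaluation point i
    return [interpolate_at(known, x) for x in (4, 5)]

def rs_recover(codeword, erased):
    """Rebuild the erased positions (at most 2) from the surviving symbols."""
    known = {i: v for i, v in enumerate(codeword) if i not in erased}
    return {e: interpolate_at(known, e) for e in erased}

plane = [17, 34, 51, 68]
codeword = plane + rs_encode_plane(plane)
print(rs_recover(codeword, {1, 4}) == {1: codeword[1], 4: codeword[4]})
```

The same interpolation routine serves for encoding and for erasure decoding, since any 4 of the 6 symbols of a plane determine the polynomial.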
Now we have the complete Virtual data cube B
Parity points of the Actual data cube can now be computed from the Virtual data cube B
Perform decoupling
[Figure sequence: each pair of parity points (B1, B2) in the virtual data cube B is passed through the inverse coupling transform to obtain (A1, A2)]
Red dotted points are simply carried over (copied) from the Virtual data cube B
Actual and Virtual data cubes
[Figure: coupling takes the actual data cube A to the virtual data cube B; decoupling takes B back to A]
The encoding is now completed!
Problem of Node Repair: One node fails
For this example, only half of the planes participate in repair
● Total Helper Data = 2MB × 4 × 5 = 40MB
● Opposed to RS code = 16MB × 4 = 64MB
● Much larger savings seen for m > 2
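The two totals can be reproduced with a short calculation. The sketch assumes the standard MSR repair setting in which all n − 1 surviving nodes help and each sends a 1/m fraction of what it stores; the slide gives only the m = 2 numbers, so the general formula, the function names and the (k = 10, m = 4) example are additions for illustration.

```python
def msr_repair_mb(chunk_mb: float, k: int, m: int) -> float:
    """Helper data for one node repair with an MSR code using n - 1 helpers."""
    helpers = k + m - 1
    return helpers * chunk_mb / m     # each helper sends 1/m of its contents

def rs_repair_mb(chunk_mb: float, k: int) -> float:
    """Helper data for one node repair with an RS code: k full chunks."""
    return k * chunk_mb

# Slide example: k = 4, m = 2, 16MB per node.
print(msr_repair_mb(16, 4, 2), rs_repair_mb(16, 4))    # 40.0 64
# Hypothetical larger code to illustrate the m > 2 remark.
print(msr_repair_mb(16, 10, 4), rs_repair_mb(16, 10))  # 52.0 160
```

With m = 2 each of the 5 helpers sends 4 of its 8 points, which is the "half of the planes" on the slide.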
Couple points
Run RS decoding on each of the selected planes