coding for loss tolerant systems

Coding for loss tolerant systems, Workshop APRETAF, 22 January 2009



  1. Coding for loss tolerant systems
     Workshop APRETAF, 22 January 2009
     Mathieu Cunche, Vincent Roca
     INRIA, Planète team, INRIA Rhône-Alpes

  2. Outline
     - The erasure channel
     - Erasure codes
     - Reed-Solomon codes
     - LDPC codes
     - Application to distributed storage

  3. The erasure channel
     - Erasure channel
       o definition: a symbol either arrives at the destination without any error, or is erased and never received
       o (figure: binary erasure channel; an input 0 or 1 comes out unchanged or as "Erased")
     - ≠ BSC (binary symmetric) and AWGN channels
       o the integrity assumption is a strong hypothesis
       o a received symbol is 100% guaranteed error free
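The channel model on this slide can be simulated in a few lines. This is only an illustrative sketch: the function name `erasure_channel`, the `ERASED` marker, the seed and the 30% loss probability are assumptions, not values from the talk.

```python
import random

ERASED = None  # marker for a symbol that never arrived (hypothetical convention)

def erasure_channel(symbols, loss_probability, rng):
    """Deliver each symbol untouched, or replace it with ERASED."""
    return [ERASED if rng.random() < loss_probability else s for s in symbols]

rng = random.Random(2009)          # fixed seed so the run is repeatable
sent = list(range(10))
received = erasure_channel(sent, loss_probability=0.3, rng=rng)

# Integrity assumption: whatever arrives is identical to what was sent.
intact = all(r is ERASED or r == s for s, r in zip(sent, received))
print(received, intact)
```

Unlike a BSC or AWGN model, the receiver here always knows which positions are missing, which is what makes erasure decoding simpler than error correction.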

  4. The erasure channel
     - Where do we find erasure channels?
       o on the Internet
         * because of routing errors or congestion
         * because of a bad CRC/checksum
       o on wireless and satellite networks
         * intermittent connection due to obstacles
       o distributed storage
         * disk failure in RAID systems
         * node failure in a data center
       o distributed computation
         * fail-stop

  5. Outline: The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  6. Erasure codes
     o k source symbols are encoded into n encoding symbols
     o code rate = $k/n$ (k symbols before encoding, n symbols after encoding)
       * close to 1 => little redundancy
       * close to 0 => high amount of redundancy
     o (figure: source object -> encoding -> k source symbols + (n-k) repair symbols -> transmission with symbol erasures -> decoding -> decoded object)

  7. Erasure codes
     - Often used as AL-FEC codes
       o "Application Level - Forward Error Correction" codes
     - AL-FEC codes differ from physical-layer FEC codes
       o PHY codes:
         * correct bit errors and, when that is not possible, detect the errors
         * symbol = bit
       o AL-FEC codes:
         * recover from symbol erasures
         * symbol = byte, IP datagram, file chunk

  8. Erasure codes
     - How can we define good erasure codes?
     - Performance metrics for erasure codes
       o erasure recovery capabilities
         * main metric, measured as the overhead ratio: $\text{decoding\_overhead} = \frac{\text{number of symbols required for decoding}}{k} - 1$
         * decoding needs (1 + overhead) × k symbols to succeed, whereas ideal (MDS) codes need only k symbols
       o encoding and decoding speed
         * to assess the complexity
       o required memory during encoding and decoding
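As a small worked example of the two ratios defined so far (code rate and decoding overhead), here is a sketch using exact fractions. The function names are hypothetical, and the figure of 1010 symbols needed by a non-MDS decoder is an invented illustration, not a measurement from the talk.

```python
from fractions import Fraction

def code_rate(k, n):
    """Code rate = k / n (source symbols over encoding symbols)."""
    return Fraction(k, n)

def decoding_overhead(symbols_needed, k):
    """Overhead = (symbols required for decoding) / k - 1."""
    return Fraction(symbols_needed, k) - 1

k, n = 1000, 1500
rate = code_rate(k, n)                    # moderate redundancy
mds_overhead = decoding_overhead(k, k)    # an MDS code needs exactly k symbols
other_overhead = decoding_overhead(1010, k)  # hypothetical non-MDS decoder

print(rate, mds_overhead, float(other_overhead))
```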

  9. Outline: The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  10. Reed-Solomon codes
      - In short
        o discovered by Reed & Solomon in 1959
        o linear codes over GF(2^n)
          * sum: simple binary XOR
          * multiplication and division: use a logarithmic table
        o based on polynomial interpolation
        o practical implementation with a Vandermonde matrix
          * any k×k submatrix of a Vandermonde matrix is invertible
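The "logarithmic table" arithmetic can be sketched as follows for GF(2^8). The slides do not say which field polynomial is used, so the primitive polynomial `0x11D` (x^8 + x^4 + x^3 + x^2 + 1) and all function names below are assumptions chosen for illustration.

```python
PRIMITIVE_POLY = 0x11D  # assumed field polynomial; the talk does not name one

EXP = [0] * 510  # EXP[i] = alpha**i, stored twice so index sums need no modulo
LOG = [0] * 256  # LOG[alpha**i] = i (LOG[0] is never used)

value = 1
for i in range(255):
    EXP[i] = value
    LOG[value] = i
    value <<= 1                    # multiply by alpha = 2
    if value & 0x100:              # reduce modulo the field polynomial
        value ^= PRIMITIVE_POLY
for i in range(255, 510):
    EXP[i] = EXP[i - 255]

def gf_add(a, b):
    """Addition (and subtraction) in GF(2^8) is a plain XOR."""
    return a ^ b

def gf_mul(a, b):
    """Multiply by adding logarithms."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def gf_div(a, b):
    """Divide by subtracting logarithms."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    if a == 0:
        return 0
    return EXP[LOG[a] - LOG[b] + 255]
```

The two tables together take well under a kilobyte here, which is the property the later summary slide relies on when it contrasts GF(2^8) with GF(2^16).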

  11. Reed-Solomon codes
      - Encoding
        o matrix-vector multiplication: X × G = Y
          * X: source vector (k source symbols)
          * G: generator matrix (k × n Vandermonde)
          * Y: encoded vector (n encoded symbols)
        o complexity: O(k^2) operations
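A minimal sketch of the X × G = Y product over GF(2^8), with one byte per symbol. The field polynomial `0x11D`, the evaluation points x_j = j + 1 and the function names are assumptions; note that this plain Vandermonde form is not systematic, so the source symbols do not appear verbatim in the output.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply in GF(2^8) by shift-and-XOR (table-free, assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def vandermonde(k, n):
    """k x n matrix with G[i][j] = x_j ** i, using assumed points x_j = j + 1."""
    if n > 255:
        raise ValueError("GF(2^8) offers at most 255 non-zero points")
    matrix = []
    row = [1] * n                       # x_j ** 0
    for _ in range(k):
        matrix.append(row)
        row = [gf_mul(v, j + 1) for j, v in enumerate(row)]
    return matrix

def rs_encode(source, n):
    """Return Y = X x G, i.e. Y[j] = sum over i of X[i] * G[i][j]."""
    G = vandermonde(len(source), n)
    encoded = []
    for j in range(n):
        acc = 0
        for i, s in enumerate(source):
            acc ^= gf_mul(s, G[i][j])   # addition in GF(2^8) is XOR
        encoded.append(acc)
    return encoded

encoded = rs_encode([10, 200, 33, 7], n=8)   # k = 4, code rate 1/2
```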

  12. Reed-Solomon codes
      - Decoding
        o solve a linear system: X × G' = Y'
          * X: source vector (k source symbols)
          * G': k × k submatrix of G (invertible)
          * Y': received vector (k received symbols)
        o good Vandermonde (VDM) property: any k × k submatrix (k columns of G) is invertible
        o k encoding symbols are enough to decode
        o decoding overhead = 0; in other words, RS codes are MDS
        o complexity: O(k^3)
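The decoding step can be sketched as Gauss-Jordan elimination over GF(2^8) on the k columns of G that match the received symbols. As before, the polynomial `0x11D`, the points x_j = j + 1 and every function name are assumptions made for this example, which includes a compact encoder so that it runs on its own.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply in GF(2^8) by shift-and-XOR (assumed field polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def gf_inv(a):
    """a**254 is the inverse of a, since a**255 = 1 for any non-zero a."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def powers(x, k):
    """[1, x, x**2, ..., x**(k-1)]: one column of the Vandermonde matrix."""
    out, p = [], 1
    for _ in range(k):
        out.append(p)
        p = gf_mul(p, x)
    return out

def rs_encode(source, n):
    """Y[j] = sum over i of X[i] * x_j**i with assumed points x_j = j + 1."""
    encoded = []
    for j in range(n):
        acc = 0
        for s, p in zip(source, powers(j + 1, len(source))):
            acc ^= gf_mul(s, p)
        encoded.append(acc)
    return encoded

def rs_decode(received, k):
    """received maps symbol index j to its value; any k entries suffice."""
    if len(received) < k:
        raise ValueError("need at least k received symbols")
    indices = sorted(received)[:k]
    # One augmented row per received symbol: [column j of G | Y'_j]
    rows = [powers(j + 1, k) + [received[j]] for j in indices]
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, v) for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v ^ gf_mul(factor, p)
                           for v, p in zip(rows[r], rows[col])]
    return [rows[i][k] for i in range(k)]

source = [10, 200, 33, 7]
encoded = rs_encode(source, n=8)
survivors = {j: encoded[j] for j in (1, 4, 6, 7)}   # four symbols were erased
decoded = rs_decode(survivors, k=4)
```

The cubic cost of the elimination loop is where the O(k^3) figure on the slide comes from.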

  13. Reed-Solomon codes: summary
      - Perfect codes
        o decoding overhead = 0
        o decoding possible as soon as k symbols are received
      - ... but limited scalability
        o n ≤ 255: GF(2^8) is sufficient
          * fast operations over GF(2^8) (small logarithmic table)
          * decoding speed = a few tens of Mbps
        o n > 255: use GF(2^16) or more
          * log table too large, cannot fit in cache
          * decoding speed falls to a few Mbps

  14. Outline: The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  15. LDPC codes
      - In short
        o "Low Density Parity Check" (LDPC)
        o linear block codes
        o sparse parity-check matrix
        o discovered by Gallager in the 60s, rediscovered in the mid-90s
        o in general, encoding requires solving a linear system: O(k^3)
          * but high-performance, lightweight variants exist
        o in the remainder we focus on a binary LDPC code
          * based on XOR operations

  16. LDPC codes
      - LDPC-staircase codes (RFC 5170)
        o a simple (trivial) parity-check matrix structure

            Source symbols    Parity symbols
            S1 S2 S3 S4 S5    P1 P2 P3 P4 P5
             0  0  1  1  0     1  0  0  0  0
             1  0  0  1  1     1  1  0  0  0
             1  1  1  0  0     0  1  1  0  0
             0  1  0  1  1     0  0  1  1  0
             1  1  1  0  1     0  0  0  1  1

          Each row is one constraint, e.g. row 2: S1 ⊕ S4 ⊕ S5 ⊕ P1 ⊕ P2 = 0

        o a.k.a. double diagonal or Repeat Accumulate codes
        o high encoding speed (encoding is trivial)
        o recovery capabilities can be made close to those of ideal codes

  17. LDPC codes
      - Encoding

            S1 S2 S3 S4 S5    P1 P2 P3 P4 P5
             0  0  1  1  0     1  0  0  0  0    S3 ⊕ S4 ⊕ P1 = 0
             1  0  0  1  1     1  1  0  0  0    S1 ⊕ S4 ⊕ S5 ⊕ P1 ⊕ P2 = 0
             1  1  1  0  0     0  1  1  0  0    S1 ⊕ S2 ⊕ S3 ⊕ P2 ⊕ P3 = 0
             0  1  0  1  1     0  0  1  1  0    S2 ⊕ S4 ⊕ S5 ⊕ P3 ⊕ P4 = 0
             1  1  1  0  1     0  0  0  1  1    S1 ⊕ S2 ⊕ S3 ⊕ S5 ⊕ P4 ⊕ P5 = 0

        o linear complexity: O(k)
      - Decoding
        o solve a system of linear equations
        o several techniques are feasible...
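Because of the staircase on the parity side, each constraint introduces exactly one new parity symbol, so encoding is a single pass of XORs. The sketch below uses the 5 × 5 source part of the matrix from this slide; the function name and the sample symbol values (small integers standing in for packets) are assumptions.

```python
# Source part of the parity-check matrix shown on the slide:
# H1[i][j] = 1 when source symbol S(j+1) takes part in constraint i+1.
H1 = [
    [0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 1, 1, 0, 1],
]

def staircase_encode(source, h1):
    """Compute P1..Pm in order: P_i = P_(i-1) XOR (source symbols of row i)."""
    parity = []
    previous = 0                      # there is no P0, so start from zero
    for row in h1:
        p = previous
        for bit, symbol in zip(row, source):
            if bit:
                p ^= symbol
        parity.append(p)
        previous = p
    return parity

source = [1, 2, 4, 8, 16]             # sample symbols (hypothetical values)
parity = staircase_encode(source, H1)
codeword = source + parity
```

Each source symbol is touched once per non-zero entry in its column, which is why the cost grows linearly with k for a sparse matrix.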

  18. LDPC codes
      - Sol. 1: Iterative Decoding (ID)
        o if an equation has only one unknown variable, that variable is equal to the sum of the others; reiterate...
        o efficient thanks to the sparseness of the parity-check matrix
        o pros: low complexity (linear, O(k))
          * low CPU load and high sustainable bandwidth
        o cons: suboptimal in terms of correction capabilities
          * some full-rank systems cannot be solved

        Decoding overhead (k=1000, N1=3):

            code rate       average overhead    overhead for a failure probability ≤ 10^-4
            2/3 (≈ 0.67)    9.99 %              13.93 %
            2/5 (= 0.4)     17.13 %             22.91 %
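The iterative rule can be sketched directly on the small staircase matrix used earlier in the talk. Function and variable names are assumptions, and the symbol values are an arbitrary valid codeword for that matrix; `None` marks an erased symbol.

```python
# Full parity-check matrix of the example: columns S1..S5 then P1..P5.
H = [
    [0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0, 1, 1],
]

def iterative_decode(received, h):
    """Repeatedly solve any constraint that has exactly one erased symbol."""
    symbols = list(received)
    progress = True
    while progress:
        progress = False
        for row in h:
            unknown = [j for j, bit in enumerate(row)
                       if bit and symbols[j] is None]
            if len(unknown) == 1:
                value = 0
                for j, bit in enumerate(row):
                    if bit and symbols[j] is not None:
                        value ^= symbols[j]   # XOR of the known symbols
                symbols[unknown[0]] = value
                progress = True
    return symbols                     # may still contain None if ID stalls

codeword = [1, 2, 4, 8, 16, 12, 21, 18, 8, 31]

# Case 1: S1 and P3 erased, ID recovers both.
easy = [None if j in (0, 7) else s for j, s in enumerate(codeword)]
recovered = iterative_decode(easy, H)

# Case 2: S4, P1, P2, P3 erased, every affected constraint has at least
# two unknowns, so ID makes no progress on this pattern.
hard = [None if j in (3, 5, 6, 7) else s for j, s in enumerate(codeword)]
stalled = iterative_decode(hard, H)
```

The second pattern is an instance of the "cons" bullet: the four erased columns are linearly independent, so the system is solvable, but not by this rule alone.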

  19. LDPC codes
      - Sol. 2: Maximum Likelihood (ML) decoding
        o solve a linear system (Gaussian elimination, LU decomposition...): xA = b
          * x: missing symbols
          * A: submatrix of the generator matrix
          * b: information of the received symbols
        o excellent erasure correction capabilities

        Decoding overhead (k=1000, N1=5):

            code rate       average overhead    overhead for a failure probability ≤ 10^-4
            2/3 (≈ 0.67)    0.63 %              2.21 %
            2/5 (= 0.4)     2.04 %              4.41 %

        o high complexity: O(k^3)
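A minimal sketch of ML decoding by Gauss-Jordan elimination over GF(2). Note one swapped-in choice: the slide writes the system in terms of the generator matrix, whereas this example builds the equivalent system from the parity-check constraints of the small staircase matrix, which is an assumption made to keep the example short. All names are hypothetical.

```python
H = [
    [0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0, 1, 1],
]

def ml_decode(received, h):
    """Solve for the erased symbols; return None if the system lacks rank."""
    symbols = list(received)
    erased = [j for j, s in enumerate(symbols) if s is None]
    # One equation per constraint: 0/1 coefficients of the erased symbols,
    # and a right-hand side equal to the XOR of the received symbols.
    equations = []
    for row in h:
        rhs = 0
        for j, bit in enumerate(row):
            if bit and symbols[j] is not None:
                rhs ^= symbols[j]
        equations.append(([row[j] for j in erased], rhs))
    rank = 0
    for col in range(len(erased)):
        found = next((r for r in range(rank, len(equations))
                      if equations[r][0][col]), None)
        if found is None:
            return None                       # rank deficient: cannot decode
        equations[rank], equations[found] = equations[found], equations[rank]
        pivot_coeffs, pivot_rhs = equations[rank]
        for r in range(len(equations)):
            if r != rank and equations[r][0][col]:
                coeffs, rhs = equations[r]
                equations[r] = ([a ^ b for a, b in zip(coeffs, pivot_coeffs)],
                                rhs ^ pivot_rhs)
        rank += 1
    for i, j in enumerate(erased):
        symbols[j] = equations[i][1]          # row i now isolates unknown i
    return symbols

codeword = [1, 2, 4, 8, 16, 12, 21, 18, 8, 31]
# S4, P1, P2 and P3 erased: no constraint has a single unknown,
# yet the four columns are independent, so elimination succeeds.
received = [None if j in (3, 5, 6, 7) else s for j, s in enumerate(codeword)]
decoded = ml_decode(received, H)
```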

  20. Some more details on LDPC codes considered
      - Sol. 3: Hybrid ID/ML scheme
        o hybrid decoder
          * start decoding with ID (fast)
          * finish with ML if necessary (optimal)
        o excellent erasure correction capabilities...
        o ... while remaining very fast
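The hybrid scheme can be sketched as two passes over the same constraint set: a cheap iterative pass first, then Gaussian elimination over GF(2) only on whatever is still erased. This is an illustrative sketch on the small staircase matrix from the earlier slides, not the authors' implementation; all names are assumptions.

```python
H = [
    [0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0, 1, 1],
]

def xor_known(row, symbols):
    """XOR of the symbols of one constraint that are currently known."""
    value = 0
    for j, bit in enumerate(row):
        if bit and symbols[j] is not None:
            value ^= symbols[j]
    return value

def iterative_pass(symbols, h):
    """ID: fill in any constraint with exactly one erased symbol (in place)."""
    progress = True
    while progress:
        progress = False
        for row in h:
            unknown = [j for j, bit in enumerate(row)
                       if bit and symbols[j] is None]
            if len(unknown) == 1:
                symbols[unknown[0]] = xor_known(row, symbols)
                progress = True

def gaussian_pass(symbols, h):
    """ML: Gauss-Jordan over GF(2) on the remaining erasures (in place)."""
    erased = [j for j, s in enumerate(symbols) if s is None]
    eqs = [([row[j] for j in erased], xor_known(row, symbols)) for row in h]
    rank = 0
    for col in range(len(erased)):
        found = next((r for r in range(rank, len(eqs)) if eqs[r][0][col]), None)
        if found is None:
            return False                      # not enough independent equations
        eqs[rank], eqs[found] = eqs[found], eqs[rank]
        for r in range(len(eqs)):
            if r != rank and eqs[r][0][col]:
                eqs[r] = ([a ^ b for a, b in zip(eqs[r][0], eqs[rank][0])],
                          eqs[r][1] ^ eqs[rank][1])
        rank += 1
    for i, j in enumerate(erased):
        symbols[j] = eqs[i][1]
    return True

def hybrid_decode(received, h):
    """Return (symbols, stage), where stage names the pass that finished."""
    symbols = list(received)
    iterative_pass(symbols, h)
    if None not in symbols:
        return symbols, "ID"
    if gaussian_pass(symbols, h):
        return symbols, "ID+ML"
    return None, "failed"

codeword = [1, 2, 4, 8, 16, 12, 21, 18, 8, 31]

def erase(positions):
    return [None if j in positions else s for j, s in enumerate(codeword)]

light = hybrid_decode(erase({0, 7}), H)        # ID alone is enough
heavy = hybrid_decode(erase({3, 5, 6, 7}), H)  # ID stalls, ML finishes
```

Running ID first means the expensive elimination only ever sees the erasures that ID could not resolve, which is the design point the slide makes.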

  21. LDPC codes
      - Decoding speed of the hybrid decoder
        o LDPC-staircase (N1=5), code rate 2/3, k=1,000
        o Reed-Solomon over GF(2^8)
        o (figure: sustainable decoding speed (Mbps) versus loss probability (%))
          * ID sufficient: 1.7 Gbps, 32.4 times faster than RS
          * ML needed more and more often: 500 Mbps, still 10.2 times faster than RS
          * with RS: 54 Mbps

  22. Outline: The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  23. Application to distributed storage
      Using replication:
      • a file is partitioned into 8 blocks
      • each block is replicated 4 times
      (figure: the 32 block replicas spread over the storage nodes, accessed by Client_1 and Client_2)
      Can tolerate up to 3 failures

  24. Application to distributed storage
      Using erasure codes:
      • a file is encoded into 32 blocks: 8 source blocks (1 to 8) and 24 repair blocks (A to X)
      (figure: the 32 blocks spread over the storage nodes, accessed by Client_1 and Client_2)
      Can tolerate up to 6 failures, since 8 blocks are enough to decode
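The two tolerance figures (3 failures with replication, 6 with erasure coding) can be reproduced with a short calculation. The layout is an assumption inferred from the numbers on the slides: 8 storage nodes holding 4 blocks each, replicas of a block placed on distinct nodes, and an MDS code so that any 8 distinct blocks decode. Function names are hypothetical.

```python
import math

NODES = 8             # assumed number of storage nodes
BLOCKS_PER_NODE = 4   # assumed, so that 8 x 4 = 32 stored blocks in both cases

def replication_tolerance(replicas):
    """Worst case: every failed node held a copy of the same block."""
    return replicas - 1

def erasure_code_tolerance(nodes, blocks_per_node, k):
    """Any k distinct blocks decode, so ceil(k / blocks_per_node) nodes suffice."""
    return nodes - math.ceil(k / blocks_per_node)

replication = replication_tolerance(replicas=4)
erasure = erasure_code_tolerance(NODES, BLOCKS_PER_NODE, k=8)
print(replication, erasure)
```

Both schemes store 32 blocks, so the gain in tolerated failures comes at the same storage cost under these assumptions.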

  25. Conclusion
      - Erasure codes
        o add redundancy to combat symbol erasures
      - Reed-Solomon
        o perfect codes (MDS), but inefficient for large objects
      - LDPC codes
        o can encode large objects
        o correction capabilities close to MDS
        o high encoding and decoding speed

  26. Questions?
