Error-correcting codes and Cryptography Henk van Tilborg Code-based Cryptography Workshop Eindhoven, May 11-12, 2011 1/45
CONTENTS I Error-correcting codes; the basics II Quasi-cyclic codes; codes generated by circulants III Cyclic codes IV The McEliece cryptosystem V Burst-correcting array codes 2/45
I Error-correcting codes; the basics Noise � m c r m � � Encode Decode Sender Receiver � � � � Channel Error-correcting codes are (mostly) used to correct independent, random errors that occur during transmission of data or during storage of data. i j (0 , . . . . . . , 0 , 1 , 0 , . . . . . . . . . , 0 , 1 , 0 , . . . , 0) We shall also briefly discuss codes that correct bursts (clusters) of errors, i.e. error patterns of the form: i i + b − 1 (0 , . . . . . . . . . , 0 , 1 , ∗ , . . ., ∗ , 1 , 0 , . . . . . . . . . , 0) 3/45
0 0 0 0 0 0 0 m 0 c 0 0 0 0 1 1 1 1 m 1 c 1 0 0 1 0 0 1 1 m 2 c 2 0 0 1 1 1 0 0 m 3 c 3 0 1 0 0 1 0 1 m 4 c 4 m 5 0 1 0 1 0 1 0 c 5 m 6 0 1 1 0 1 1 0 c 6 0 1 1 1 0 0 1 m 7 c 7 16 codewords of length 7 1 0 0 0 1 1 0 m 8 c 8 1 0 0 1 0 0 1 m 9 c 9 1 0 1 0 1 0 1 m 10 c 10 1 0 1 1 0 1 0 m 11 c 11 1 1 0 0 0 1 1 m 12 c 12 m 13 1 1 0 1 1 0 0 c 13 1 1 1 0 0 0 0 m 14 c 14 1 1 1 1 1 1 1 m 15 c 15 4/45
A code C is such a (well-chosen) subset of { 0 , 1 } n . So codes here will be binary codes. The generalization to other field sizes is easy. The weight of a word is the number of non-zero coordinates. Example : a code C of length 5 with the following four codewords: c 0 = 0 0 0 0 0 c 1 = 0 0 1 1 1 c 2 = 1 1 0 0 1 c 3 = 1 1 1 1 0 5/45
Suppose that each two codewords differ in at least d coordinates (have dis- tance at least d ) and put t = ⌊ d − 1 2 ⌋ . c 0 = 0 0 0 0 0 c 1 = 0 0 1 1 1 d = 3 , t = 1 c 2 = 1 1 0 0 1 c 3 = 1 1 1 1 0 Then the code C is said to be t -error-correcting, because if you transmit (or store) a codeword and not more than t errors have occurred upon reception (or read out) due of noise or damage, then the received word will still be closer to the original codeword than to any other. For instance, if you receive r = 0 1 0 0 1 you know that c 2 is the most likely transmitted codeword. 6/45
From now on codes will be linear, meaning that C is a linear subspace of { 0 , 1 } n . We use the notation [ n, k, d ] codes, where k denotes the dimension of the code C and d the so-called minimum distance of C : the minimum of all distances between codewords. The quantity r = n − k is called the redundancy of the code. This is the number of additional coordinates (apart from the actual information being transmitted) that make error-correction possible. It follows from the linear structure of C that an appropriate choice of k codewords forms a basis for the code. A basis of the code C = { 00000 , 00111 , 11001 , 11110 } is given by the rows of � � 0 0 1 1 1 . 1 1 0 0 1 7/45
0 0 0 0 0 0 0 c 0 0 0 0 1 1 1 1 c 1 0 0 1 0 0 1 1 c 2 0 0 1 1 1 0 0 c 3 0 1 0 0 1 0 1 c 4 0 1 0 1 0 1 0 c 5 A basis of the linear (!) 0 1 1 0 1 1 0 c 6 0 1 1 1 0 0 1 c 7 [7 , 4 , 3] code introduced before 1 0 0 0 1 1 0 c 8 1 0 0 1 0 0 1 c 9 is given by c 1 , c 2 , c 4 , c 8 1 0 1 0 1 0 1 c 10 1 0 1 1 0 1 0 c 11 1 1 0 0 0 1 1 c 12 1 1 0 1 1 0 0 c 13 1 1 1 0 0 0 0 c 14 1 1 1 1 1 1 1 c 15 8/45
A matrix G whose rows form a basis of an [ n, k, d ] code C, is called a gene- rator matrix G of C. Its size is k × n. The basis c 1 , c 2 , c 4 , c 8 of the code on the previous page results in the gene- rator matrix: 0 0 0 1 1 1 1 0 0 1 0 0 1 1 G = . 0 1 0 0 1 0 1 1 0 0 0 1 1 0 So, in general, a linear code C with k × n generator matrix G consists of all linear combinations of the rows of G . C = { mG | m ∈ { 0 , 1 } k } 9/45
If k is large compared to n, it is often advantageous to describe C as the null-space of a ( n − k ) × n matrix H called a parity check matrix: C = { x ∈ { 0 , 1 } n | Hx T = 0 T } . Typically, you transmit a codeword c and you receive r which can be written as r = c ⊕ e, where e is called the error vector and is caused by the noise. The decoder can not do better than look for the closest codeword to r, i.e. look for e of lowest weight such that r − e ∈ C. Note that s T := Hr T = Hc T ⊕ He T = He T . This value is called the syndrome of the received word. It only depends on the error-vector. 10/45
Example: The matrix 0 0 0 1 1 1 1 H = 0 1 1 0 0 1 1 1 0 1 0 1 0 1 is the parity check matrix of a linear code C = { x ∈ { 0 , 1 } n | Hx T = 0 T } of length 7 and dimension 4. Moreover, this code can correct a single error ( d = 3 , t = 1 ). We give a decoding algorithm. Let r be a received word. Compute its syndrome s, i.e. compute s T = Hr T . 0 If s T = then r ∈ C, so (most likely) no error occurred. 0 0 11/45
Example continued: Suppose you receive � 1 0 0 0 1 1 1 � r = Its syndrome with 0 0 0 1 1 1 1 0 1 1 0 0 1 1 H = 1 0 1 0 1 0 1 1 0 , which is the 5 -th column. Note that is 1 � 0 0 0 0 1 0 0 � e = gives the same syndrome, so H ( r T − e T ) = 0 T . So, the most likely transmitted codeword is r − e, i.e. � 1 0 0 0 0 1 1 � c = 12/45
II Quasi-cyclic codes; Codes generated by circulants 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 Consider the U = 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 15 × 15 circulant 1 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 1 0 0 0 0 0 0 1 0 0 0 1 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 1 13/45
Note that in 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 u 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 = U u 4 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 u 6 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 u 7 . . . . . . rows u 0 , u 4 , and u 6 add up (modulo 2) to row u 7 . So, row u 7 is a linear combination of the preceding rows. But then, because of the cyclic structure, also row u 8 is a linear combination of the top 7 rows, etc.. We conclude that the rows of U generate a [15 , 7] code. 14/45
1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 u ( x ) 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 xu ( x ) . . . . . . . . . = U x 6 u ( x ) 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 x 7 u ( x ) 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 . . . . . . . . . Each row in U is a cyclic shift of the previous row. i =0 U 0 ,i x i = 1 + x 4 + x 6 + x 7 + x 8 . Define u ( x ) by the top row u 0 : u ( x ) = � 14 Then xu ( x ) corresponds to row u 1 , x 2 u ( x ) corresponds to row u 2 , etc., whe- re these polynomials have to be taken modulo x 15 − 1 . For example, u 6 corresponds to x 6 u ( x ) = x 6 + x 10 + x 12 + x 13 + x 14 u 7 corresponds to x 7 u ( x ) = x 7 + x 11 + x 13 + x 14 + 1 15/45
The reason that U generates a [15 , 7] code ( 2 nd proof) is that: 1. u ( x ) has degree 8, so the first 7 rows of U are clearly linearly indepen- dent. 2. u ( x ) divides x 15 − 1 . Indeed x 15 − 1 = u ( x )(1 + x 4 + x 6 + x 7 ) , as one can easily check. So, x 7 u ( x ) ≡ x 7 u ( x ) + ( x 15 − 1) ≡ ( x 7 + (1 + x 4 + x 6 + x 7 )) u ( x ) ≡ ≡ (1 + x 4 + x 6 ) u ( x ) ≡ u ( x ) + x 4 u ( x ) + x 6 u ( x ) (mod x 15 − 1) . This shows why rows u 0 , u 4 , u 6 add up (modulo 2) to row u 7 . This argument holds in general when u ( x ) divides x n − 1 . 16/45
How about the rank of a code generated by a circulant U with top row u 0 , corresponding to a polynomial u ( x ) that does not divide x n − 1 ? 1 0 0 1 1 1 0 u 0 0 1 0 0 1 1 1 1 0 1 0 0 1 1 U = 1 1 0 1 0 0 1 1 1 1 0 1 0 0 0 1 1 1 0 1 0 0 0 1 1 1 0 1 u ( x ) = 1 + x 3 + x 4 + x 5 does not divide x 7 − 1 . 17/45
� u ( x ) � with u ( x ) does not divide of x n − 1 . U = � Define g ( x ) = gcd( u ( x ) , x n − 1) and use the extended version of Euclid’s Algorithm to write: g ( x ) = a ( x ) u ( x ) + b ( x )( x n − 1) . Then � n − 1 � n − 1 (mod x n − 1) . � � a i x i � x i u ( x ) � g ( x ) ≡ u ( x ) ≡ a i i =0 i =0 n − 1 � g = a i u i . So, i =0 So, g is a linear combination of the rows of U. 18/45
The vector g (and each of its cyclic shifts) is a linear combination of the rows of U. Since g ( x ) = gcd( u ( x ) , x n − 1) divides u ( x ) , we also know that u 0 (and each of its shifts) is a linear combination of cyclic shifts of g. We conclude that G, the circulant with g as top row, generates the same code as U does: � u ( x ) � g ( x ) � � U = G = and � � generate the same code. But now g ( x ) divides x n − 1 , so the code generated by U has dimension n − degree ( g ( x )) . 19/45
Recommend
More recommend