Code-Based Cryptography
Tung Chou, with some slides by Tanja Lange and Christiane Peters
Academia Sinica
PQCRYPTO Mini-School 2020
20 July, 2020

Basics of coding theory
Error correction
• Goal: protect against errors in a noisy channel.
[Figure: Sender → noisy channel (errors) → Receiver. Without coding, the message $m$ arrives as $m + e$. With coding, $m$ is encoded as $c$, arrives as $c + e$, and is decoded back to $c$.]
• The sender transforms a length-$k$ message $m$ into a length-$n$ codeword $c$ ($n > k$) by adding redundancy (encoding).
• The channel introduces errors (bit flips), which can be viewed as adding an error vector $e$ to the data.
• The receiver uses a decoding algorithm to correct the errors. This works as long as there are not many errors.

Linear codes
A linear code $C$ of length $n$ and dimension $k$ is a $k$-dimensional subspace of $\mathbb{F}_q^n$.
$C$ is usually specified as
• the row space of a generating matrix $G \in \mathbb{F}_q^{k \times n}$:
$$C = \{\, mG \mid m \in \mathbb{F}_q^k \,\}$$
Encoding means computing $c = mG$.
• the kernel space of a parity-check matrix $H \in \mathbb{F}_q^{(n-k) \times n}$:
$$C = \{\, c \in \mathbb{F}_q^n \mid Hc^\top = 0 \,\}$$
(leaving out the $^\top$ and assuming $q = 2$ from now on)
• In general, generating and parity-check matrices are not unique!

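Example
• $C$ with code length $n = 7$ and code dimension $k = 4$.
$$G = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{pmatrix}$$
• $c = (1001)G = (1001001)$ is a codeword.
$$H = \begin{pmatrix} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \end{pmatrix}$$
• $Hc = 0$.
• Note that $G = (I \mid Q)$ and $H = (Q^\top \mid I)$.
• Linear codes are linear:
$$\alpha_1 c_1 + \alpha_2 c_2 = \alpha_1 m_1 G + \alpha_2 m_2 G = (\alpha_1 m_1 + \alpha_2 m_2) G,$$
$$H(\alpha_1 c_1 + \alpha_2 c_2) = \alpha_1 H c_1 + \alpha_2 H c_2 = 0 + 0 = 0.$$
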
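The example can be checked mechanically. The following is a minimal sketch in plain Python, assuming bits are stored as lists of 0/1 integers; the helper names `encode` and `syndrome` are illustrative and do not come from the slides.

```python
# Generating and parity-check matrices of the [7,4] example code over GF(2).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def encode(m, G):
    """Return c = m G over GF(2), where m is a list of k bits."""
    n = len(G[0])
    return [sum(m[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n)]

def syndrome(H, x):
    """Return H x over GF(2), treating x as a column vector."""
    return [sum(h * xi for h, xi in zip(row, x)) % 2 for row in H]

c = encode([1, 0, 0, 1], G)
print(c)               # [1, 0, 0, 1, 0, 0, 1]
print(syndrome(H, c))  # [0, 0, 0]
```

Since every row of $G$ also has zero syndrome, linearity implies that every codeword $mG$ does.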
Hamming weight and distance
• The Hamming weight of a word is the number of nonzero coordinates.
$$\mathrm{wt}(1, 0, 0, 1, 1) = 3$$
• The Hamming distance between two words in $\mathbb{F}_2^n$ is the number of coordinates in which they differ.
$$d((1, 1, 0, 1, 1), (1, 0, 0, 1, 1)) = 1$$
• The minimum distance of a linear code $C$ is
  • the smallest Hamming distance between any two distinct codewords, or equivalently
  • the smallest Hamming weight of a nonzero codeword in $C$.
$$d = \min_{0 \ne c \in C} \mathrm{wt}(c) = \min_{b \ne c \in C} d(b, c)$$

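For a code as small as the $[7,4]$ example, both characterizations of the minimum distance can be compared by brute force. This is only a sketch: the function names `wt`, `dist` and `encode` are illustrative, and enumerating all $2^k$ codewords is feasible only for toy parameters.

```python
from itertools import product

# Generating matrix of the [7,4] example code over GF(2).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def wt(x):
    """Hamming weight: number of nonzero coordinates."""
    return sum(1 for xi in x if xi != 0)

def dist(x, y):
    """Hamming distance: number of coordinates in which x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def encode(m, G):
    """Return c = m G over GF(2)."""
    n = len(G[0])
    return [sum(m[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n)]

# Enumerate all 2^k codewords.
codewords = [encode(list(m), G) for m in product([0, 1], repeat=len(G))]

d_weight = min(wt(c) for c in codewords if any(c))
d_pairs = min(dist(b, c) for b in codewords for c in codewords if b != c)

print(wt([1, 0, 0, 1, 1]))                      # 3
print(dist([1, 1, 0, 1, 1], [1, 0, 0, 1, 1]))   # 1
print(d_weight, d_pairs)                        # 3 3
```

The two minima agree because $d(b, c) = \mathrm{wt}(b + c)$ and $b + c$ is again a codeword.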
Minimum distance
• Minimum distance indicates how many errors can be corrected: any vector $x = c + e$ with $\mathrm{wt}(e) = t < d/2$ is uniquely decodable to $c$.
[Figure: codeword $c$, received word $x$, and radius $t$.]
• Equivalently, the code can correct $t$ errors if $d \ge 2t + 1$.
• Equivalently, the code can correct $\lfloor (d - 1)/2 \rfloor$ errors.
• Having $t < d/2$ does not mean that there is an efficient algorithm for decoding $t$ errors!

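The uniqueness claim follows from the triangle inequality for the Hamming distance. The short derivation below fills in that step; it is a standard argument and is not taken from the slides.

```latex
% Let x = c + e with wt(e) = t < d/2, and let c' \ne c be any other codeword.
\begin{align*}
d(x, c') &\ge d(c, c') - d(c, x) && \text{(triangle inequality)} \\
         &\ge d - t              && \text{(minimum distance, } d(c, x) = t\text{)} \\
         &>   t = d(x, c)        && \text{(since } 2t < d\text{)}
\end{align*}
% Hence c is strictly closer to x than every other codeword.
```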
Hamming code
Parity-check matrix ($n = 7$, $k = 4$):
$$H = \begin{pmatrix} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \end{pmatrix}$$
• Note that the columns are the binary expansions of $\{1, 2, \ldots, 7\}$ in some order, i.e., all nonzero vectors of length 3.
A codeword $c = (c_0, c_1, c_2, c_3, c_4, c_5, c_6)$ satisfies these three equations:
$$\begin{aligned} c_0 + c_1 + c_3 + c_4 &= 0 \\ c_0 + c_2 + c_3 + c_5 &= 0 \\ c_1 + c_2 + c_3 + c_6 &= 0 \end{aligned}$$
• Minimum distance? $d = 3$.
• The code can correct $\lfloor (d - 1)/2 \rfloor = 1$ error.
• If there is an error in any $c_i$, the error position can be identified by $s = Hc$: the syndrome equals the column of $H$ at the error position. E.g., $s = (1, 0, 1)^\top$ means that $c_?$ is flipped.

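Single-error correction for this Hamming code amounts to looking up the syndrome among the columns of $H$. A minimal sketch, with the illustrative helper `correct_single_error` (not a name from the slides) and assuming at most one bit was flipped:

```python
# Parity-check matrix of the [7,4] Hamming code from the slide.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def syndrome(H, x):
    """Return H x over GF(2), treating x as a column vector."""
    return [sum(h * xi for h, xi in zip(row, x)) % 2 for row in H]

def correct_single_error(H, x):
    """Correct at most one flipped bit in x using the syndrome s = H x."""
    s = syndrome(H, x)
    if not any(s):
        return list(x)                      # zero syndrome: x is a codeword
    columns = [list(col) for col in zip(*H)]
    i = columns.index(s)                    # s equals the column at the error position
    y = list(x)
    y[i] ^= 1                               # flip the erroneous bit back
    return y

c = [1, 0, 0, 1, 0, 0, 1]   # codeword from the earlier example
x = list(c)
x[5] ^= 1                   # introduce one error at position 5

print(syndrome(H, x))                   # [0, 1, 0], which is column 5 of H
print(correct_single_error(H, x) == c)  # True

# Position whose column matches the syndrome (1, 0, 1) from the slide:
print([list(col) for col in zip(*H)].index([1, 0, 1]))  # 1
```

The lookup works because all seven columns of $H$ are distinct and nonzero, so each single-bit error produces its own syndrome.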
Decoding problem
Decoding problem: find the closest codeword $c \in C$ to a given $x \in \mathbb{F}_2^n$, assuming that there is a unique closest codeword.
Let $x = c + e$. Note that finding $e$ is an equivalent problem.
• If $c$ is $t$ errors away from $x$, i.e., the Hamming weight of $e$ is $t$, this is called a $t$-error correcting problem.
• There are lots of code families with fast decoding algorithms, e.g., Reed–Solomon codes, Goppa codes/alternant codes, etc.
• However, the general decoding problem, i.e., the decoding problem for random linear codes, is hard.
  • Theoretically, the problem is NP-complete.
  • Practically, information-set decoding (see later) takes exponential time.

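To make the cost of generic decoding concrete, here is a naive nearest-codeword search that tries all $2^k$ messages. It is only an illustration on the $[7,4]$ example code, with the illustrative name `decode_brute_force`; it is not information-set decoding and is hopeless for cryptographic parameters.

```python
from itertools import product

# Generating matrix of the [7,4] example code over GF(2).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(m, G):
    """Return c = m G over GF(2)."""
    n = len(G[0])
    return [sum(m[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n)]

def dist(x, y):
    """Hamming distance between two words of equal length."""
    return sum(1 for a, b in zip(x, y) if a != b)

def decode_brute_force(x, G):
    """Return a codeword closest to x by enumerating all 2^k messages."""
    k = len(G)
    candidates = (encode(list(m), G) for m in product([0, 1], repeat=k))
    return min(candidates, key=lambda c: dist(c, x))

x = [1, 0, 0, 1, 0, 1, 1]          # codeword (1001001) with bit 5 flipped
print(decode_brute_force(x, G))    # [1, 0, 0, 1, 0, 0, 1]
```

The running time grows as $2^k$, which is why structured code families with fast decoders matter.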
Different view: syndrome decoding
• The syndrome of $x \in \mathbb{F}_2^n$ is $s = Hx$. Note that $Hx = H(c + e) = Hc + He = He$ depends only on $e$.
• The syndrome decoding problem is to compute $e \in \mathbb{F}_2^n$ given $s \in \mathbb{F}_2^{n-k}$ so that $He = s$ and $e$ has minimal weight.
• Syndrome decoding and (regular) decoding are equivalent.
• To decode $x = c + e$ with a syndrome decoder, compute $e$ from $Hx = He$, then $c = x + e$.
• Given a syndrome $s = He$, assume $H = (Q^\top \mid I_{n-k})$. Then $x = (00 \ldots 0) \| s$ satisfies $s = Hx$, and hence $x = c + e$ for some codeword $c$.
• Note that this $x$ is not a solution to the syndrome decoding problem, unless it has very low weight.

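The padding trick in the last two bullets can be tried on the $[7,4]$ example, whose parity-check matrix already has the form $(Q^\top \mid I_3)$. This is a sketch with illustrative variable names; the error vector `e` is chosen arbitrarily for the demonstration.

```python
# Parity-check matrix H = (Q^T | I_3) of the [7,4] example code.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def syndrome(H, x):
    """Return H x over GF(2), treating x as a column vector."""
    return [sum(h * xi for h, xi in zip(row, x)) % 2 for row in H]

n = len(H[0])
k = n - len(H)

e = [0, 1, 0, 0, 0, 0, 0]    # unknown low-weight error (weight 1)
s = syndrome(H, e)           # the syndrome the decoder is given
x = [0] * k + s              # x = (0...0) || s

print(s)                        # [1, 0, 1]
print(syndrome(H, x) == s)      # True: x has the right syndrome
c = [a ^ b for a, b in zip(x, e)]
print(syndrome(H, c))           # [0, 0, 0]: x + e is a codeword
print(sum(x), sum(e))           # 2 1: x is heavier than e
```

Here $x$ has the correct syndrome but weight 2, so it is not the minimal-weight solution $e$.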
Binary Goppa Codes
Binary Goppa code
Let $q = 2^m$. A binary Goppa code is often defined by
• a list $L = (a_1, \ldots, a_n)$ of $n$ distinct elements in $\mathbb{F}_q$, called the support.
• a degree-$t$ polynomial $g(x) \in \mathbb{F}_q[x]$ such that $g(a) \ne 0$ for all $a \in L$. $g(x)$ is called the Goppa polynomial.
The corresponding binary Goppa code $\Gamma(L, g)$ is
$$\Gamma(L, g) = \left\{ c \in \mathbb{F}_2^n \;\middle|\; S(c) = \frac{c_1}{x - a_1} + \frac{c_2}{x - a_2} + \cdots + \frac{c_n}{x - a_n} \equiv 0 \bmod g(x) \right\}$$
• This code is linear, since $S(b + c) = S(b) + S(c)$, and has length $n$.

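The definition divides by $x - a_i$ modulo $g(x)$, which is possible because $g(a_i) \ne 0$. The derivation below spells out the inverse explicitly; it is a standard identity added here for clarity, not a formula from the slides.

```latex
% For a in L, the polynomial g(x) - g(a) has a as a root,
% so (g(x) - g(a)) / (x - a) is a polynomial in F_q[x].
\[
  \frac{1}{x - a} \;\equiv\; -\,g(a)^{-1}\,\frac{g(x) - g(a)}{x - a} \pmod{g(x)}
\]
% Check: multiplying the right-hand side by (x - a) gives
\[
  -\,g(a)^{-1}\bigl(g(x) - g(a)\bigr) \;\equiv\; g(a)^{-1}\,g(a) \;=\; 1 \pmod{g(x)}.
\]
% In characteristic 2 the minus signs can be dropped.
```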