Administrivia

• Webpage: http://theory.lcs.mit.edu/~madhu/FT04.
• Send email to madhu@mit.edu to be added to the course mailing list. Critical!
• Sign up for scribing.
• Pset 1 out today. First part due in a week, second in two weeks.
• Madhu's office hours for now: next Tuesday 2:30pm-4pm.
• Course under perpetual development! Limited staffing. Patience and constructive criticism appreciated.

© Madhu Sudan, Fall 2004: Essential Coding Theory: MIT 6.895

Hamming's Problem (1940s)

• Magnetic storage devices are prone to making errors.
• How to store information (32-bit words) so that any 1 bit flip (in any word) can be corrected?
• Simple solution:
  − Repeat every bit three times.
  − Works. To correct a 1-bit-flip error, take the majority vote for each bit.
  − Can store 10 "real" bits per word this way. Efficiency of storage ≈ 1/3. Can we do better?

Hamming's Solution - 1

• Break the (32-bit) word into four blocks of size 7 each (discard the four remaining bits).
• In each block, apply a transform that maps 4 "real" bits into a 7-bit string, so that any 1 bit flip in a block can be corrected.
• How? Will show next.
• Result: Can now store 16 "real" bits per word this way. Efficiency already up to 1/2.

[7, 4, 3]-Hamming code

• Will explain notation later.
• Let

        1 0 0 0 0 1 1
  G  =  0 1 0 0 1 0 1
        0 0 1 0 1 1 0
        0 0 0 1 1 1 1

• Encode b = ⟨b_0 b_1 b_2 b_3⟩ as b · G.
• Claim: If a ≠ b, then a · G and b · G differ in at least 3 coordinates.
• Will defer proof of claim.
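The claim on this slide can be checked by brute force over all 16 messages. A minimal Python sketch (the matrix G is the one on the slide; the helper names are ours):

```python
import itertools

# Generator matrix of the [7,4,3] Hamming code, as on the slide.
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(b):
    """Encode a 4-bit message b as b . G over GF(2)."""
    return tuple(sum(b[i] * G[i][j] for i in range(4)) % 2 for j in range(7))

# Brute-force check of the claim: distinct messages yield codewords
# differing in at least 3 coordinates.
codewords = [encode(b) for b in itertools.product([0, 1], repeat=4)]
min_dist = min(
    sum(x != y for x, y in zip(c1, c2))
    for c1, c2 in itertools.combinations(codewords, 2)
)
print(min_dist)  # 3
```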
Hamming's Notions

• Since codewords (i.e., b · G) differ in at least 3 coordinates, can correct one error.
• Motivates Hamming distance, Hamming weight, error-correcting codes, etc.
• Alphabet Σ of size q. Ambient space Σ^n: includes codewords and their corruptions.
• Hamming distance between strings x, y ∈ Σ^n, denoted Δ(x, y), is the # of coordinates i s.t. x_i ≠ y_i. (Converts the ambient space into a metric space.)
• Hamming weight of z, denoted wt(z), is the # of coordinates where z is non-zero.

Hamming notions (contd.)

• Code: Subset C ⊆ Σ^n.
• Min. distance: Denoted Δ(C), is min_{x ≠ y ∈ C} {Δ(x, y)}.
• e-error-detecting code: If up to e errors happen, then the codeword does not mutate into any other codeword.
• t-error-correcting code: If up to t errors happen, then the codeword is uniquely determined (as the unique codeword within distance t from the received word).
• Proposition: C has min. dist. 2t + 1 ⇔ it is 2t-error-detecting ⇔ it is t-error-correcting.

Back to Hamming code

• So we have a [7, 4, 3] code (modulo proof of claim).
• Can correct 1 bit error.
• Storage efficiency (rate) approaches 4/7 (as the word size approaches ∞).
• Will do better, by looking at the proof of the claim.

Standard notation/terminology

• q: Alphabet size.
• n: Block length.
• k: Message length, where |C| = q^k.
• d: Min. distance of the code.
• A code with the above parameters is an (n, k, d)_q code; an [n, k, d]_q code if linear. Omit q if q = 2.
• k/n: Rate.
• d/n: Relative distance.
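The distance and weight definitions above translate directly into code; a short sketch (function names are ours):

```python
def hamming_distance(x, y):
    """Delta(x, y): number of coordinates where x and y differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

def hamming_weight(z):
    """wt(z): number of non-zero coordinates of z."""
    return sum(c != 0 for c in z)

# Example in the ambient space {0,1}^7, using two rows of G:
print(hamming_distance((1, 0, 0, 0, 0, 1, 1), (0, 1, 0, 0, 1, 0, 1)))  # 4
print(hamming_weight((0, 0, 0, 1, 1, 1, 1)))                            # 4
```

Note that over {0,1}^n the two notions coincide as Δ(x, y) = wt(x + y), with addition mod 2.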
Proof of Claim

• Let

        0 0 1
        0 1 0
        0 1 1
  H  =  1 0 0
        1 0 1
        1 1 0
        1 1 1

• Sub-claim 1: {x · G | x} = {y | y · H = 0}. Simple linear algebra (mod 2). You'll prove this as part of Pset 1.
• Sub-claim 2: There exist codewords z_1 ≠ z_2 s.t. Δ(z_1, z_2) ≤ 2 iff there exists y of weight at most 2 s.t. y · H = 0.
• Let h_i be the i-th row of H. Then y · H = Σ_{i : y_i = 1} h_i.
• If y has weight 1, say y_i = 1, then y · H = h_i, which is non-zero since no row of H is zero.
• Let y have weight 2, say y_i = y_j = 1. Then y · H = h_i + h_j. But this is non-zero since h_i ≠ h_j. QED.

Generalizing Hamming codes

• Important feature: The parity check matrix should have no zero row and no two identical rows. But then we can do this for every ℓ:

         0 ··· 0 0 1
         0 ··· 0 1 0
  H_ℓ =  0 ··· 0 1 1
           ⋮
         1 ··· 1 1 1

• H_ℓ has ℓ columns and 2^ℓ − 1 rows.
• H_ℓ: Parity check matrix of the ℓ-th Hamming code.
• Message length of the code = exercise. Implies rate → 1.

Summary of Hamming's paper (1950)

• Defined the Hamming metric and Hamming codes.
• Gave codes with d = 1, 2, 3, 4!
• d = 2: Parity check code.
• d = 3: We've seen.
• d = 4?
• Gave a tightness result: His codes have the maximum number of codewords. "Lower bound".
• Gave a decoding "procedure".
Volume Bound

• Proves Hamming codes are optimal, when they exist.
• Hamming ball: B(x, r) = {w ∈ {0, 1}^n | Δ(w, x) ≤ r}.
• Volume: Vol(r, n) = |B(x, r)|. (Notice the volume is independent of x and of Σ, given |Σ| = q.)
• Hamming (/Volume/Packing) Bound:
  − Basic idea: Balls of radius t around the codewords of a t-error-correcting code don't intersect.
  − Quantitatively: 2^k · Vol(t, n) ≤ 2^n.
  − For t = 1, get 2^k · (n + 1) ≤ 2^n, i.e., k ≤ n − log_2(n + 1).

Decoding the Hamming code

• Can we recognize codewords? Yes - multiply by H_ℓ and check whether the result is 0.
• What happens if we send a codeword c and the i-th bit gets flipped?
• Received vector r = c + e_i.
• r · H = c · H + e_i · H = 0 + h_i = binary representation of i.
• r · H gives the binary representation of the error coordinate!

Rest of the course

• More history!
• More codes (larger d).
• More lower bounds (will see other methods).
• More algorithms - decoding less simple codes.
• More applications: Modern connections to theoretical CS.
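The decoding slide above can be turned into a short sketch: flip one bit of a codeword, compute the syndrome r · H, and read off the flipped coordinate (conventions as above: the i-th row of H, 1-indexed, is the binary representation of i; helper names are ours):

```python
# H: i-th row (1-indexed) is the binary representation of i, as above.
H = [[(i >> 2) & 1, (i >> 1) & 1, i & 1] for i in range(1, 8)]

def syndrome(r):
    """r . H over GF(2)."""
    s = [0, 0, 0]
    for ri, row in zip(r, H):
        if ri:
            s = [(a + b) % 2 for a, b in zip(s, row)]
    return s

def correct_one_error(r):
    """A non-zero syndrome is the binary representation of the flipped
    coordinate (1-indexed); flip it back."""
    s = syndrome(r)
    i = (s[0] << 2) | (s[1] << 1) | s[2]
    r = list(r)
    if i:
        r[i - 1] ^= 1
    return r

c = [1, 0, 0, 0, 0, 1, 1]          # a codeword (first row of G above)
r = list(c)
r[4] ^= 1                           # flip bit i = 5 (1-indexed)
print(syndrome(r))                  # [1, 0, 1] = binary representation of 5
print(correct_one_error(r) == c)    # True

# The volume bound with t = 1 holds with equality here: 2^4 * (7+1) = 2^7,
# i.e., the [7,4,3] Hamming code is "perfect".
print(2 ** 4 * (7 + 1) == 2 ** 7)   # True
```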
Applications of error-correcting codes

• Obvious: Communication/Storage.
• Algorithms: Useful data structures.
• Complexity: Pseudorandomness (ε-biased spaces, t-wise independent spaces), hardness amplification, PCPs.
• Cryptography: Secret sharing, crypto-schemes.
• Central object in extremal combinatorics: relates to extractors, expanders, etc.
• Recreational math.