The RC6 Block Cipher: A simple fast secure AES proposal

Ronald L. Rivest (MIT)
Matt Robshaw (RSA Labs)
Ray Sidney (RSA Labs)
Yiqun Lisa Yin (RSA Labs)
(August 21, 1998)

Outline
• Design Philosophy
• Description of RC6
• Implementation Results
• Security
• Conclusion

Design Philosophy
• Leverage our experience with RC5: use data-dependent rotations to achieve a high level of security.
• Adapt RC5 to meet AES requirements.
• Take advantage of a new primitive for increased security and efficiency: 32x32 multiplication, which executes quickly on modern processors, to compute rotation amounts.

Description of RC6
• RC6-w/r/b parameters:
  – Word size in bits: w (32) (lg(w) = 5)
  – Number of rounds: r (20)
  – Number of key bytes: b (16, 24, or 32)
• Key Expansion:
  – Produces array S[0 … 2r+3] of w-bit round keys.
• Encryption and Decryption:
  – Input/Output in 32-bit registers A, B, C, D

RC6 Primitive Operations
  A + B                      Addition modulo 2^w
  A - B                      Subtraction modulo 2^w
  A ⊕ B                      Exclusive-Or
  A <<< B                    Rotate A left by amount in low-order lg(w) bits of B
  A >>> B                    Rotate A right, similarly
  (A,B,C,D) = (B,C,D,A)      Parallel assignment
  (the operations above are those of RC5)
  A x B                      Multiplication modulo 2^w
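
The primitive operations listed above can be sketched in Python for w = 32. This is a minimal illustration, not a reference implementation; the names MASK, add, sub, mul, rotl and rotr are chosen here and do not come from the slides.

```python
W = 32                 # word size in bits (the AES choice, w = 32)
MASK = (1 << W) - 1    # ANDing with MASK reduces a value modulo 2^w


def add(a, b):
    """A + B: addition modulo 2^w."""
    return (a + b) & MASK


def sub(a, b):
    """A - B: subtraction modulo 2^w."""
    return (a - b) & MASK


def mul(a, b):
    """A x B: multiplication modulo 2^w."""
    return (a * b) & MASK


def rotl(a, b):
    """A <<< B: rotate left by the low-order lg(w) bits of B."""
    n = b & (W - 1)
    return ((a << n) | (a >> (W - n))) & MASK


def rotr(a, b):
    """A >>> B: rotate right by the low-order lg(w) bits of B."""
    n = b & (W - 1)
    return ((a >> n) | (a << (W - n))) & MASK
```

Only the low-order 5 bits of the rotation operand matter, so rotl(x, 33) behaves like rotl(x, 1).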

RC6 Encryption (Generic)
  B = B + S[0]
  D = D + S[1]
  for i = 1 to r do {
      t = (B x (2B + 1)) <<< lg(w)
      u = (D x (2D + 1)) <<< lg(w)
      A = ((A ⊕ t) <<< u) + S[2i]
      C = ((C ⊕ u) <<< t) + S[2i + 1]
      (A, B, C, D) = (B, C, D, A)
  }
  A = A + S[2r + 2]
  C = C + S[2r + 3]

RC6 Encryption (for AES)
  B = B + S[0]
  D = D + S[1]
  for i = 1 to 20 do {
      t = (B x (2B + 1)) <<< 5
      u = (D x (2D + 1)) <<< 5
      A = ((A ⊕ t) <<< u) + S[2i]
      C = ((C ⊕ u) <<< t) + S[2i + 1]
      (A, B, C, D) = (B, C, D, A)
  }
  A = A + S[42]
  C = C + S[43]
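
The encryption pseudocode above translates almost line for line into Python. The sketch below assumes w = 32 and takes the round-key array S as given; the function name rc6_encrypt_words and the tuple-of-words interface are illustrative choices, not part of the slides.

```python
MASK = 0xFFFFFFFF  # arithmetic is modulo 2^32


def rotl(x, n):
    """Rotate the 32-bit word x left by the low-order 5 bits of n."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


def rc6_encrypt_words(block, S, r=20):
    """Encrypt one block given as four 32-bit words (A, B, C, D).

    S must hold the 2r + 4 round-key words S[0 .. 2r+3].
    """
    A, B, C, D = block
    # Pre-whitening
    B = (B + S[0]) & MASK
    D = (D + S[1]) & MASK
    for i in range(1, r + 1):
        t = rotl((B * (2 * B + 1)) & MASK, 5)
        u = rotl((D * (2 * D + 1)) & MASK, 5)
        A = (rotl(A ^ t, u) + S[2 * i]) & MASK
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & MASK
        A, B, C, D = B, C, D, A          # parallel assignment
    # Post-whitening
    A = (A + S[2 * r + 2]) & MASK
    C = (C + S[2 * r + 3]) & MASK
    return A, B, C, D
```

With r = 1 and an all-zero S, the block (1, 1, 2, 0) maps to (1, 2, 0, 97): t = (1 x 3) <<< 5 = 96, u = 0, and both data-dependent rotation amounts are zero in their low 5 bits.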

RC6 Decryption (for AES)
  C = C - S[43]
  A = A - S[42]
  for i = 20 downto 1 do {
      (A, B, C, D) = (D, A, B, C)
      u = (D x (2D + 1)) <<< 5
      t = (B x (2B + 1)) <<< 5
      C = ((C - S[2i + 1]) >>> t) ⊕ u
      A = ((A - S[2i]) >>> u) ⊕ t
  }
  D = D - S[1]
  B = B - S[0]

Key Expansion (Same as RC5's)
• Input: array L[0 … c-1] of input key words
• Output: array S[0 … 43] of round key words
• Procedure:
  S[0] = 0xB7E15163
  for i = 1 to 43 do S[i] = S[i-1] + 0x9E3779B9
  A = B = i = j = 0
  for s = 1 to 132 do {
      A = S[i] = (S[i] + A + B) <<< 3
      B = L[j] = (L[j] + A + B) <<< (A + B)
      i = (i + 1) mod 44
      j = (j + 1) mod c
  }
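
As a sanity check on the decryption and key-expansion procedures above, the following sketch implements both for the AES parameters (w = 32, r = 20) and confirms that decryption inverts encryption. Loading key bytes into the words L[0 … c-1] in little-endian order is an assumption made here, since the slide starts from key words; the function names are illustrative.

```python
import struct

MASK = 0xFFFFFFFF
P32, Q32 = 0xB7E15163, 0x9E3779B9   # constants from the slide


def rotl(x, n):
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


def rotr(x, n):
    n &= 31
    return ((x >> n) | (x << (32 - n))) & MASK


def f(x):
    """The quadratic function x * (2x + 1) mod 2^32, rotated left by 5."""
    return rotl((x * (2 * x + 1)) & MASK, 5)


def expand_key(key):
    """Return S[0..43] for a 16-, 24- or 32-byte key."""
    c = len(key) // 4
    L = list(struct.unpack("<%dI" % c, key))  # assumption: little-endian
    S = [P32]
    for i in range(1, 44):
        S.append((S[i - 1] + Q32) & MASK)
    A = B = i = j = 0
    for _ in range(132):
        A = S[i] = rotl((S[i] + A + B) & MASK, 3)
        B = L[j] = rotl((L[j] + A + B) & MASK, A + B)
        i = (i + 1) % 44
        j = (j + 1) % c
    return S


def encrypt(block, S):
    A, B, C, D = block
    B = (B + S[0]) & MASK
    D = (D + S[1]) & MASK
    for i in range(1, 21):
        t, u = f(B), f(D)
        A = (rotl(A ^ t, u) + S[2 * i]) & MASK
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & MASK
        A, B, C, D = B, C, D, A
    return (A + S[42]) & MASK, B, (C + S[43]) & MASK, D


def decrypt(block, S):
    A, B, C, D = block
    C = (C - S[43]) & MASK
    A = (A - S[42]) & MASK
    for i in range(20, 0, -1):
        A, B, C, D = D, A, B, C          # undo the register permutation
        u, t = f(D), f(B)
        C = rotr((C - S[2 * i + 1]) & MASK, t) ^ u
        A = rotr((A - S[2 * i]) & MASK, u) ^ t
    return A, (B - S[0]) & MASK, C, (D - S[1]) & MASK
```

A typical use is S = expand_key(bytes(16)) followed by decrypt(encrypt(block, S), S), which should return the original block for any four-word input.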

From RC5 to RC6 in seven easy steps

(1) Start with RC5
RC5 encryption inner loop:
  for i = 1 to r do {
      A = ((A ⊕ B) <<< B) + S[i]
      (A, B) = (B, A)
  }
Can RC5 be strengthened by having rotation amounts depend on all the bits of B?
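
The starting point can be written out as follows. This sketch follows the simplified inner loop shown on the slide (it is not a complete RC5 implementation), and the function name rc5_inner_loop is illustrative. It also makes the motivating weakness visible: the rotation amount depends only on the low-order 5 bits of B.

```python
MASK = 0xFFFFFFFF


def rotl(x, n):
    """Rotate left; only the low-order 5 bits of n are used."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


def rc5_inner_loop(A, B, S, r):
    """The RC5 inner loop as written on the slide, with S indexed 1..r."""
    for i in range(1, r + 1):
        A = (rotl(A ^ B, B) + S[i]) & MASK
        A, B = B, A
    return A, B
```

For example, rotl(x, 0xFFFFFF03) and rotl(x, 3) give the same result, because the upper 27 bits of the rotation operand are ignored.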

Better rotation amounts?
• Modulo function? Use low-order bits of (B mod d). Too slow!
• Linear function? Use high-order bits of (c x B). Hard to pick c well!
• Quadratic function? Use high-order bits of (B x (2B+1)). Just right!

B x (2B+1) is one-to-one mod 2^w
Proof: By contradiction. If B ≠ C but B x (2B+1) = C x (2C+1) (mod 2^w), then (B - C) x (2B+2C+1) = 0 (mod 2^w). But (B - C) is nonzero mod 2^w and (2B+2C+1) is odd, hence invertible mod 2^w; their product can't be zero! ∎
Corollary: B uniform ⇒ B x (2B+1) uniform (and high-order bits are uniform too!)
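
The one-to-one claim and its corollary can be checked by brute force for small word sizes. The sketch below is only an empirical check for small w, not a substitute for the proof; the function names are illustrative.

```python
from collections import Counter


def f(B, w):
    """B x (2B + 1) modulo 2^w."""
    return (B * (2 * B + 1)) % (1 << w)


def is_one_to_one(w):
    """True if f maps the 2^w inputs onto 2^w distinct outputs."""
    return len({f(B, w) for B in range(1 << w)}) == (1 << w)


def high_bits_histogram(w, k=5):
    """Count how often each value of the top k bits of f(B) occurs."""
    return Counter(f(B, w) >> (w - k) for B in range(1 << w))
```

Because f is a permutation, each of the 2^k possible values of the top k bits occurs exactly 2^(w-k) times when B ranges over all inputs.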

High-order bits of B x (2B+1)
• The high-order bits of f(B) = B x (2B+1) = 2B^2 + B depend on all the bits of B.
• Let B = B_31 B_30 B_29 … B_1 B_0 in binary.
• Flipping bit i of input B
  – leaves bits 0 … i-1 of f(B) unchanged,
  – flips bit i of f(B) with probability one,
  – flips bit j of f(B), for j > i, with probability approximately 1/2 (ranging from 1/4 to 1),
  – is likely to change some high-order bit.

(2) Quadratic Rotation Amounts
  for i = 1 to r do {
      t = (B x (2B + 1)) <<< 5
      A = ((A ⊕ B) <<< t) + S[i]
      (A, B) = (B, A)
  }
But now much of the output of this nice multiplication is being wasted...
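
The bit-flipping properties of f(B) = B x (2B+1) stated earlier on this slide can be verified exhaustively for a small word size. The sketch below uses w = 8 by default so that every input can be enumerated; the helper names are illustrative, and flip_probability only measures an empirical frequency.

```python
def f(B, w=8):
    """B x (2B + 1) modulo 2^w."""
    return (B * (2 * B + 1)) & ((1 << w) - 1)


def check_flip_properties(w=8):
    """Check, for every B and every bit i, that flipping input bit i
    leaves output bits 0..i-1 unchanged and always flips output bit i."""
    for B in range(1 << w):
        for i in range(w):
            diff = f(B, w) ^ f(B ^ (1 << i), w)
            low_unchanged = (diff & ((1 << i) - 1)) == 0
            bit_i_flipped = ((diff >> i) & 1) == 1
            if not (low_unchanged and bit_i_flipped):
                return False
    return True


def flip_probability(i, j, w=8):
    """Fraction of inputs B for which flipping input bit i flips output bit j."""
    flips = sum(((f(B, w) ^ f(B ^ (1 << i), w)) >> j) & 1
                for B in range(1 << w))
    return flips / (1 << w)
```

The deterministic part follows from f(B') - f(B) = (B' - B) x (2B' + 2B + 1): when B' and B differ only in bit i, the difference is 2^i times an odd number.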

(3) Use t, not B, as xor input
  for i = 1 to r do {
      t = (B x (2B + 1)) <<< 5
      A = ((A ⊕ t) <<< t) + S[i]
      (A, B) = (B, A)
  }
Now AES requires 128-bit blocks. We could use two 64-bit registers, but 64-bit operations are poorly supported with typical C compilers...

(4) Do two RC5's in parallel
Use four 32-bit regs (A,B,C,D), and do RC5 on (C,D) in parallel with RC5 on (A,B):
  for i = 1 to r do {
      t = (B x (2B + 1)) <<< 5
      A = ((A ⊕ t) <<< t) + S[2i]
      (A, B) = (B, A)
      u = (D x (2D + 1)) <<< 5
      C = ((C ⊕ u) <<< u) + S[2i + 1]
      (C, D) = (D, C)
  }

(5) Mix up data between copies
Switch rotation amounts between copies, and cyclically permute registers instead of swapping:
  for i = 1 to r do {
      t = (B x (2B + 1)) <<< 5
      u = (D x (2D + 1)) <<< 5
      A = ((A ⊕ t) <<< u) + S[2i]
      C = ((C ⊕ u) <<< t) + S[2i + 1]
      (A, B, C, D) = (B, C, D, A)
  }

One Round of RC6
[Figure: data flow of one round, from input registers A, B, C, D through f, the fixed rotations by 5 (giving t and u), the data-dependent rotations, and the additions of S[2i] and S[2i+1], to output registers A, B, C, D.]

(6) Add Pre- and Post-Whitening
  B = B + S[0]
  D = D + S[1]
  for i = 1 to r do {
      t = (B x (2B + 1)) <<< 5
      u = (D x (2D + 1)) <<< 5
      A = ((A ⊕ t) <<< u) + S[2i]
      C = ((C ⊕ u) <<< t) + S[2i + 1]
      (A, B, C, D) = (B, C, D, A)
  }
  A = A + S[2r + 2]
  C = C + S[2r + 3]

(7) Set r = 20 for high security (based on analysis)
Final RC6:
  B = B + S[0]
  D = D + S[1]
  for i = 1 to 20 do {
      t = (B x (2B + 1)) <<< 5
      u = (D x (2D + 1)) <<< 5
      A = ((A ⊕ t) <<< u) + S[2i]
      C = ((C ⊕ u) <<< t) + S[2i + 1]
      (A, B, C, D) = (B, C, D, A)
  }
  A = A + S[42]
  C = C + S[43]

RC6 Implementation Results

CPU Cycles / Operation
              Java   Borland C   Assembly
  Setup     110000        2300       1108
  Encrypt    16200         616        254
  Decrypt    16500         566        254

Less than two clocks per bit of plaintext!

Operations / Second (200 MHz)
              Java   Borland C   Assembly
  Setup       1820       86956     180500
  Encrypt    12300      325000     787000
  Decrypt    12100      353000     788000

Encryption Rate (200 MHz)
             MegaBytes / second               MegaBits / second
              Java   Borland C   Assembly      Java   Borland C   Assembly
  Encrypt    0.197        5.19       12.6      1.57        41.5      100.8
  Decrypt    0.194        5.65       12.6      1.55        45.2      100.8

Over 100 Megabits / second!
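
The operations-per-second and encryption-rate figures follow from the cycle counts by simple arithmetic at a 200 MHz clock with 128-bit (16-byte) blocks. The sketch below reproduces that conversion, taking a megabyte as 10^6 bytes, which is an assumption that matches the tabulated values to within rounding.

```python
CLOCK_HZ = 200_000_000   # 200 MHz clock
BLOCK_BYTES = 16         # one 128-bit block per encrypt/decrypt operation


def ops_per_second(cycles_per_op):
    return CLOCK_HZ / cycles_per_op


def megabytes_per_second(cycles_per_block):
    # Assumption: 1 megabyte = 10^6 bytes
    return ops_per_second(cycles_per_block) * BLOCK_BYTES / 1e6


def megabits_per_second(cycles_per_block):
    return megabytes_per_second(cycles_per_block) * 8


def clocks_per_bit(cycles_per_block):
    return cycles_per_block / (BLOCK_BYTES * 8)
```

For the assembly encryption figure of 254 cycles per block, this gives about 1.98 clocks per bit and roughly 100.8 megabits per second.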

On an 8-bit processor
• On an Intel MCS51 (1 MHz clock)
• Encrypt/decrypt at 9.2 Kbits/second (13535 cycles/block; from actual implementation)
• Key setup in 27 milliseconds
• Only 176 bytes needed for table of round keys.
• Fits on smart card (< 256 bytes RAM).

Custom RC6 IC
• 0.25 micron CMOS process
• One round/clock at 200 MHz
• Conventional multiplier designs
• 0.05 mm^2 of silicon
• 21 milliwatts of power
• Encrypt/decrypt at 1.3 Gbits/second
• With pipelining, can go faster, at cost of more area and power
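
Several of the figures above can be reproduced with back-of-the-envelope arithmetic. The sketch below rests on assumptions not stated on the slides: that the 13535 cycles are counted at one million per second, that a Kbit is 1024 bits, and that the custom IC spends exactly 20 clocks per 128-bit block (whitening ignored).

```python
BLOCK_BITS = 128


def mcs51_kbits_per_second(cycles_per_block=13535, cycles_per_second=1_000_000):
    """Throughput on the 8-bit processor.

    Assumptions: 10^6 cycles per second, and 1 Kbit = 1024 bits.
    """
    bits_per_second = BLOCK_BITS * cycles_per_second / cycles_per_block
    return bits_per_second / 1024


def round_key_table_bytes(r=20, w=32):
    """Size of S[0 .. 2r+3], stored as w-bit words."""
    return (2 * r + 4) * (w // 8)


def ic_gbits_per_second(clock_hz=200_000_000, rounds=20):
    """Throughput at one round per clock, assuming 20 clocks per block."""
    return clock_hz / rounds * BLOCK_BITS / 1e9
```

Under these assumptions the results are about 9.2 Kbits/second, 176 bytes, and 1.28 Gbits/second, in line with the quoted figures.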

RC6 Security Analysis

Analysis procedures
• Intensive analysis, based on most effective known attacks (e.g. linear and differential cryptanalysis)
• Analyze not only RC6, but also several "simplified" forms (e.g. with no quadratic function, no fixed rotation by 5 bits, etc.)

Linear analysis
• Find approximations for r-2 rounds.
• Two ways to approximate A = B <<< C
  – with one bit each of A, B, C (type I)
  – with one bit each of A, B only (type II)
  – each has bias 1/64; type I more useful
• Non-zero bias across f(B) only when input bit = output bit. (Best for lsb.)
• Also include effects of multiple linear approximations and linear hulls.

Security against linear attacks
Estimate of number of plaintext/ciphertext pairs required to mount a linear attack. (Only 2^128 such pairs are available.)

  Rounds   Pairs
     8     2^47
    12     2^83
    16     2^119
    20     2^155   (RC6)   Infeasible
    24     2^191           Infeasible
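
The feasibility argument is a comparison of exponents: an attack needing more than 2^128 plaintext/ciphertext pairs cannot be mounted, because a 128-bit block cipher has only 2^128 distinct plaintexts. The sketch below encodes the table's estimates and flags the round counts that exceed that limit; the variable and function names are illustrative.

```python
# log2 of the estimated number of pairs needed, per number of rounds (from the table)
PAIRS_LOG2 = {8: 47, 12: 83, 16: 119, 20: 155, 24: 191}

# log2 of the number of plaintext/ciphertext pairs available for a 128-bit block
AVAILABLE_LOG2 = 128


def infeasible_rounds():
    """Round counts whose estimated data requirement exceeds 2^128 pairs."""
    return sorted(r for r, e in PAIRS_LOG2.items() if e > AVAILABLE_LOG2)


def margin_log2(rounds):
    """log2 of (pairs required / pairs available); positive means infeasible."""
    return PAIRS_LOG2[rounds] - AVAILABLE_LOG2
```

For the 20-round cipher, the estimate exceeds the available data by a factor of 2^27.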