Amir Ali Kouzeh Geran and Arash Reyhani-Masoleh
Presented by: Arash Reyhani-Masoleh
Department of Electrical and Computer Engineering, Western University, London, Ontario, Canada
23rd IEEE Symposium on Computer Arithmetic (ARITH 23), June 11, 2016
Outline
• Motivation
• Preliminaries
• Single-bit Fault Detection Scheme
• CRC-based Fault Detection Scheme
• Fault Simulation Results
• FPGA Implementations and Overheads
• Conclusion
Motivations: GCM
• Galois/Counter Mode (GCM) is a recently adopted mode of operation for symmetric-key block ciphers (such as AES).
• It was proposed by McGrew and Viega in 2005 and specified by NIST (SP 800-38D) in 2007.
• AES-GCM is included in "NSA Suite B Cryptography".
• It is used in a number of protocols and standards: IEEE 802.1AE, IEEE 802.11ad, ANSI (INCITS) Fibre Channel Security Protocols (FC-SP), IEEE P1619.1 tape storage, IETF IPsec standards, SSH, and TLS 1.2.
• It provides authentication assurance for additional data that is not encrypted.
• It protects confidentiality and detects both accidental modifications and unauthorized alterations of data.
Motivations: Reliable GCM
• Sources of faults in cryptographic systems:
  • Natural faults.
  • Fault attacks: an attacker injects faults and looks for leakage of information.
• The need for a fault detection method:
  • Protect the integrity and authenticity of data.
  • Prevent the attack sequence in the case of a fault attack.
• In this paper, we propose a reliable GCM scheme that detects both permanent and transient faults, with:
  • Low overhead in terms of area and delay.
  • Acceptable fault coverage.
Preliminaries
• The GCM has two operations: authenticated encryption and authenticated decryption.
• There are four inputs for authenticated encryption:
1. A secret key ($K$) whose length depends on the block cipher.
2. An initialization vector ($IV$) with a length between 1 and $2^{64}$ bits.
3. A plaintext ($P$) with any number of bits between 0 and $2^{39} - 256$.
4. Additional authenticated data ($A$), which is authenticated but not encrypted, with any number of bits between 0 and $2^{64}$.
• There are two outputs for authenticated encryption:
1. A ciphertext ($C$) whose length is exactly that of the plaintext.
2. An authentication tag ($T$), whose length can be any value between 0 and 128 bits.
AES-GCM Block Diagram
• The "Hash Key" $H$ is generated by encrypting 128 zero bits with the symmetric key ($K$): $H = E(K, 0^{128}) = E_K(0)$.
• The Additional Authenticated Data $A$ is represented as $m$ blocks of 128 bits: $A_1, A_2, \ldots, A_m$.
• The Plaintext $P$ is divided into $n$ blocks of 128 bits: $P_1, P_2, \ldots, P_n$.
• An up-counter with the output $U_i$ is used to generate the blocks of ciphertext: $C_i = P_i \oplus E_K(U_i)$ for $i = 1, 2, \ldots, n$.
AES-GCM Block Diagram (cont.)
• Using the inputs $H$, $A$, and $C$, the output of the GCM is defined by $X_{m+n+1} = \mathrm{GHASH}(H, A, C)$.
• The 128-bit register $Y$:
  • Is cleared initially.
  • After the $(m+n+1)$th clock cycle, it contains $X_{m+n+1} = \mathrm{GHASH}(H, A, C)$.
• In this paper, we consider the GCM loop.
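The GCM loop can be sketched in software as a multiply-accumulate over $GF(2^{128})$. The following is a minimal sketch, not the authors' hardware design: it assumes that bit $j$ of a Python integer holds the coefficient of $\alpha^j$ (a plain polynomial basis, which differs from the reflected bit order used in NIST SP 800-38D), and the names `mul_alpha`, `gf_mul`, and `gcm_loop` are hypothetical.

```python
# Sketch of the GCM loop: Y <- H * (Y + block) in GF(2^128), one block per cycle.
# Assumption: bit j of an int is the coefficient of alpha^j (plain polynomial
# basis, NOT the reflected bit order of NIST SP 800-38D).

F_LOW = 0x87                 # x^7 + x^2 + x + 1, the low part of F(x)
MASK = (1 << 128) - 1        # keeps values to 128 bits


def mul_alpha(z):
    """Multiply z by alpha and reduce modulo F(alpha)."""
    carry = z >> 127
    z = (z << 1) & MASK
    return z ^ F_LOW if carry else z


def gf_mul(h, d):
    """Return h * d mod F(alpha) by shift-and-add over the bits of d."""
    x, z = 0, h
    for j in range(128):
        if (d >> j) & 1:
            x ^= z           # add Z^(j) = h * alpha^j when d_j = 1
        z = mul_alpha(z)
    return x


def gcm_loop(h, blocks):
    """Run the loop over all 128-bit input blocks; register Y starts cleared."""
    y = 0
    for block in blocks:
        y = gf_mul(h, y ^ block)
    return y
```

Here `blocks` stands for the sequence of $m + n + 1$ 128-bit inputs fed to the loop; after the last one, `y` holds the value the slides call $X_{m+n+1}$.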
Single-bit Fault Detection Scheme
• The parity of the multiplier output ($X_i$) is computed using two different functions:
1. The actual parity ($p_{X_i}$) is obtained by XORing the coordinates of $X_i$.
2. The predicted parity $\hat{p}_{X_i} = f(H, C_i, Y)$ is a complex function of $H$, $C_i$, and $Y$.
• Then, they are compared to find an error: if $p_{X_i} \neq \hat{p}_{X_i} \Rightarrow e_{out} = 1$.
Single-bit Parity Prediction Formulations
• We write the multiplier output as $X_i = H \times D_i \bmod F(\alpha)$, where $\alpha$ is the root of the irreducible polynomial $F(x) = x^{128} + x^7 + x^2 + x + 1$ and $0 \le i \le m + n + 1$.
• The hash key $H \in GF(2^{128})$ is fixed in each iteration $i$.
• The field element $D_i = \sum_{j=0}^{127} d_j \alpha^j$ (drop $i$ for simplicity).
• $X_i = \sum_{j=0}^{127} d_j Z^{(j)}$, where $Z^{(j)} = (H \alpha^j) \bmod F(\alpha)$ and $Z^{(0)} = H$.
• Then, the parity prediction of the multiplier output is $\hat{p}_{X_i} = \sum_{j=0}^{127} d_j \, \hat{p}_{Z^{(j)}}$.
Single-bit Parity Prediction Formulations (cont.)
• $\hat{p}_{X_i} = \sum_{j=0}^{127} d_j \, \hat{p}_{Z^{(j)}}$
• Since $D = Y + C \Rightarrow d_j = y_j + c_j \Rightarrow \hat{p}_{X_i} = \sum_{j=0}^{127} y_j \, \hat{p}_{Z^{(j)}} + \sum_{j=0}^{127} c_j \, \hat{p}_{Z^{(j)}}$
• $\hat{p}_{Z^{(j)}}$, $0 \le j \le 127$, is a binary function and depends on the coordinates of $H \in GF(2^{128})$:
  • $Z^{(0)} = H \Rightarrow \hat{p}_{Z^{(0)}} = p_H$.
  • $Z^{(1)} = Z^{(0)} \alpha \bmod F(\alpha) \Rightarrow \hat{p}_{Z^{(1)}} = \hat{p}_{Z^{(0)}} + h_{127}$.
  • In general: $\hat{p}_{Z^{(j)}} = \hat{p}_{Z^{(j-1)}} + z^{(j-1)}_{127}$ for $1 \le j \le 127$.
• These values are stored in a register (PH) at the initialization phase.
• They remain constant for the entire $m + n + 1$ cycles of the GCM computation.
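The parity recurrence above can be checked in software. The following is a minimal sketch under stated assumptions: bit $j$ of a Python integer holds the coefficient of $\alpha^j$, the PH register is modelled as a 128-bit integer, and the function names are hypothetical. The recurrence works because multiplying by $\alpha$ drops the top coordinate and, when that coordinate is 1, adds $x^7 + x^2 + x + 1$, which has an even number of terms and so leaves the parity unchanged.

```python
# Sketch of single-bit parity prediction for X = H * (Y + C) in GF(2^128).
# Assumption: bit j of an int is the coefficient of alpha^j.

F_LOW = 0x87                 # x^7 + x^2 + x + 1
MASK = (1 << 128) - 1


def parity(v):
    """XOR of all bits of v."""
    return bin(v).count("1") & 1


def mul_alpha(z):
    """Multiply z by alpha and reduce modulo F(alpha)."""
    carry = z >> 127
    z = (z << 1) & MASK
    return z ^ F_LOW if carry else z


def gf_mul(h, d):
    """Reference multiplier: h * d mod F(alpha)."""
    x, z = 0, h
    for j in range(128):
        if (d >> j) & 1:
            x ^= z
        z = mul_alpha(z)
    return x


def parity_table(h):
    """Model of the PH register: bit j holds the parity of Z^(j) = H * alpha^j."""
    ph, z, p = 0, h, parity(h)
    for j in range(128):
        ph |= p << j
        p ^= z >> 127        # p_Z(j+1) = p_Z(j) + z_127 of Z^(j)
        z = mul_alpha(z)
    return ph


def predicted_parity(ph, y, c):
    """Sum over j of y_j * p_Z(j), plus sum over j of c_j * p_Z(j)."""
    return parity(ph & y) ^ parity(ph & c)
```

In a fault-free computation, `predicted_parity(parity_table(h), y, c)` equals `parity(gf_mul(h, y ^ c))`; a mismatch corresponds to $e_{out} = 1$.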
Single Parity Fault Detection Architecture
• The actual and predicted parities are computed and compared in each clock cycle to generate the output error signal.
• $\hat{p}_{X_i} = \sum_{j=0}^{127} y_j \, \hat{p}_{Z^{(j)}} + \sum_{j=0}^{127} c_j \, \hat{p}_{Z^{(j)}}$
CRC-Based Fault Detection Scheme
• We extend the idea from a single bit to multiple bits.
• The Cyclic Redundancy Check (CRC) code has been adopted to detect errors in the GCM loop.
• For $k$ parity bits, the CRC generator polynomial must be of degree $k$: $g_k(x) = x^k + \ldots + g_1 x + 1$.
• Let us denote the output of the multiplier in the GCM loop as the message: $m(x) = X_i(x)$.
1. Compute the actual $k$-bit parity: $p(x) = m(x) \bmod g_k(x)$.
2. Compute the $k$-bit predicted parity: $\hat{p}(x) = f(C, H, Y)$.
3. Compare them to detect an error: if $p(x) \neq \hat{p}(x) \Rightarrow e_{out} = 1$.
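Step 1, the actual parity $p(x) = m(x) \bmod g_k(x)$, is ordinary polynomial long division over $GF(2)$. The following is a minimal sketch; the function name `poly_mod` is hypothetical, polynomials are stored as integers with bit $j$ as the coefficient of $x^j$, and any generator passed in is an illustrative choice rather than one taken from the slides.

```python
def poly_mod(m, g):
    """Remainder of m(x) divided by g(x) over GF(2).

    Both polynomials are ints: bit j is the coefficient of x^j.
    """
    k = g.bit_length() - 1               # degree of g(x)
    while m.bit_length() > k:            # while deg m(x) >= k
        shift = m.bit_length() - 1 - k   # align the leading terms
        m ^= g << shift                  # cancel the leading term of m(x)
    return m
```

With $g_1(x) = x + 1$ (the integer `0b11`), the remainder is the XOR of all coordinates of the message, which is the single-bit parity case.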
Matrix-Based CRC Formulations
1. The $k$ parity bits of the multiplier output are computed as $\mathbf{p}_{\text{CRC-}k} = [p_0 \; p_1 \; \ldots \; p_{k-1}] = [m_0 \; m_1 \; \ldots \; m_{127}] \, G_{\text{CRC-}k}$.
• $m_j \in \{0, 1\}$ is the $j$-th coordinate of the multiplier output $X_i$.
• $G_{\text{CRC-}k}$ is the $128 \times k$ CRC generator matrix.
• The $j$-th row, $0 \le j \le 127$, of $G_{\text{CRC-}k}$ contains the coefficients of $x^j \bmod g_k(x)$.
• For $k = 1$ (single-bit parity), $g_1(x) = x + 1$ and then $G_{\text{CRC-}1} = [1 \; 1 \; \ldots \; 1]^T \Rightarrow p = m_0 + m_1 + \ldots + m_{127}$.
• For $2 \le k \le 4 \Rightarrow$
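The rows of the generator matrix can be produced iteratively, since $x^{j+1} \bmod g_k(x)$ follows from $x^j \bmod g_k(x)$ by one shift and one conditional XOR. The following is a minimal sketch with hypothetical function names; each row is stored as a $k$-bit integer, and the generator polynomials used with it are illustrative choices, not the ones from the slides.

```python
def crc_rows(g, n=128):
    """Rows of the n x k generator matrix: row j is x^j mod g(x) as a k-bit int.

    Assumes deg g(x) = k >= 1; bit t of a row is the coefficient of x^t.
    """
    k = g.bit_length() - 1
    rows, r = [], 1                      # x^0 mod g(x) = 1
    for _ in range(n):
        rows.append(r)
        r <<= 1                          # multiply by x
        if (r >> k) & 1:                 # degree reached k: reduce by g(x)
            r ^= g
    return rows


def crc_parity(m, rows):
    """Vector-matrix product [m_0 ... m_127] * G over GF(2)."""
    p = 0
    for j, row in enumerate(rows):
        if (m >> j) & 1:
            p ^= row
    return p
```

For `g = 0b11` every row is 1, so `crc_parity` reduces to the XOR of all message coordinates, matching the $k = 1$ case above.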
Matrix-Based CRC Formulations (cont.)
2. To calculate the $k$ predicted parity bits, we use the Mastrovito formulation for the multiplier output: $\mathbf{m} = [m_0 \; m_1 \; \ldots \; m_{127}]^T = E\,\mathbf{d}$.
• The entries of $E$ contain coordinates of $H$ only.
• $\mathbf{d} = \mathbf{y} + \mathbf{c}$ is a vector with the coordinates of $D_i = Y_i + C_i$.
• Substituting $\mathbf{m}^T = \mathbf{d}^T E^T$ into $\mathbf{p}_{\text{CRC-}k} = \mathbf{m}^T G_{\text{CRC-}k}$, we obtain $\hat{\mathbf{p}}_{\text{CRC-}k} = [\hat{p}_0 \; \hat{p}_1 \; \ldots \; \hat{p}_{k-1}] = \mathbf{d}^T E^T G_{\text{CRC-}k} = \mathbf{y}^T O_{\text{CRC-}k} + \mathbf{c}^T O_{\text{CRC-}k}$.
• The entries of $O_{\text{CRC-}k} = E^T G_{\text{CRC-}k}$ are functions of $H$ only.
• They are stored in $k$ 128-bit registers at the initialization phase.
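The precomputation of the $k$ registers and the per-cycle prediction can be sketched as follows. This is an illustration under assumptions, not the authors' architecture: bit $j$ of an integer is the coefficient of $\alpha^j$, column $j$ of $E$ is taken to be $H\alpha^j \bmod F(\alpha)$ (which follows from $\mathbf{m} = E\,\mathbf{d}$ with $X = \sum_j d_j H \alpha^j$), all function names are hypothetical, and the generator polynomial is an arbitrary example.

```python
F_LOW = 0x87                 # x^7 + x^2 + x + 1
MASK = (1 << 128) - 1


def parity(v):
    return bin(v).count("1") & 1


def mul_alpha(z):
    """Multiply z by alpha and reduce modulo F(alpha)."""
    carry = z >> 127
    z = (z << 1) & MASK
    return z ^ F_LOW if carry else z


def gf_mul(h, d):
    """Reference multiplier, used only to check the prediction."""
    x, z = 0, h
    for j in range(128):
        if (d >> j) & 1:
            x ^= z
        z = mul_alpha(z)
    return x


def poly_mod(m, g):
    """m(x) mod g(x) over GF(2)."""
    k = g.bit_length() - 1
    while m.bit_length() > k:
        m ^= g << (m.bit_length() - 1 - k)
    return m


def prediction_registers(h, g):
    """k 128-bit registers: bit j of register t is parity bit t of CRC(H * alpha^j)."""
    k = g.bit_length() - 1
    regs, z = [0] * k, h
    for j in range(128):
        r = poly_mod(z, g)               # CRC of column j of E
        for t in range(k):
            regs[t] |= ((r >> t) & 1) << j
        z = mul_alpha(z)
    return regs


def predicted_crc(regs, y, c):
    """Predicted parities from y and c, without using the multiplier output."""
    p_hat = 0
    for t, reg in enumerate(regs):
        p_hat |= (parity(reg & y) ^ parity(reg & c)) << t
    return p_hat
```

In a fault-free cycle, `predicted_crc(prediction_registers(h, g), y, c)` matches `poly_mod(gf_mul(h, y ^ c), g)`, because both the field multiplication and the CRC are linear over $GF(2)$.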
Matrix-Based CRC Formulations (cont.)
3. After calculating $[p_0 \; p_1 \; \ldots \; p_{k-1}]$ and $[\hat{p}_0 \; \hat{p}_1 \; \ldots \; \hat{p}_{k-1}]$, we compare all $k$ actual parities with the corresponding predicted parities to generate the output error signal $e_{out} = (p_0 + \hat{p}_0) \lor (p_1 + \hat{p}_1) \lor \ldots \lor (p_{k-1} + \hat{p}_{k-1})$.
• It requires $k$ 2-input XOR gates and a $k$-input OR gate.
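The comparator is small enough to model directly. In this sketch the function name `error_flag` is hypothetical, and the two parity vectors are packed into integers with bit $t$ holding $p_t$ or $\hat{p}_t$.

```python
def error_flag(p, p_hat, k):
    """e_out: OR of the k bitwise XORs between actual and predicted parities."""
    xors = [((p >> t) & 1) ^ ((p_hat >> t) & 1) for t in range(k)]  # k XOR gates
    return int(any(xors))                                           # k-input OR
```

For example, `error_flag(0b101, 0b100, 3)` raises the flag because the two vectors differ in bit 0, while identical vectors leave it at 0.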
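CRC-Based Fault Detection Architecture
• The actual and predicted parities are computed and compared in each clock cycle to generate the output error signal.
• $\mathbf{p}_{\text{CRC-}k} = [p_0 \; p_1 \; \ldots \; p_{k-1}] = [m_0 \; m_1 \; \ldots \; m_{127}] \, G_{\text{CRC-}k}$
• $\hat{\mathbf{p}}_{\text{CRC-}k} = [\hat{p}_0 \; \hat{p}_1 \; \ldots \; \hat{p}_{k-1}] = \mathbf{y}^T O_{\text{CRC-}k} + \mathbf{c}^T O_{\text{CRC-}k}$
• $e_{out} = (p_0 + \hat{p}_0) \lor (p_1 + \hat{p}_1) \lor \ldots \lor (p_{k-1} + \hat{p}_{k-1})$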
Fault Simulation Results
• We wrote VHDL code to simulate the entire fault detection scheme for the GCM using ModelSim.
• We considered CRC generator polynomials up to degree six.
• Different cases of single- and multiple-bit faults (300,000 in total) are injected into different modules of the proposed fault detection architecture.
• As the number of parity bits increases, the fault coverage increases and can reach 100% with an acceptable false-alarm rate.
FPGA Implementations and Overheads
• We have implemented the original GCM and six fault detection architectures on Altera's 28 nm FPGA.
• Their areas, in terms of the number of ALMs (Adaptive Logic Modules), and their longest delays are recorded.
• The area and time overheads of the fault detection schemes are presented relative to the original design.
• For a fault coverage of 98% (k=6), the area overhead is 10.9% and the delay overhead is 23%.
Conclusion
• We proposed a reliable GCM scheme capable of detecting permanent and transient faults.
• The proposed fault detection scheme checks the validity of the GCM computation in every clock cycle.
• The number of parity bits (and hence the CRC generator polynomial) can be selected based on the available overheads and/or the required fault coverage.
• We performed fault simulations and FPGA implementations.
• We considered single and multiple faults in all locations of the GCM, parity generation, and parity prediction modules.
• The proposed fault detection scheme has high fault coverage with low overheads and a negligible false-alarm rate.
Thank You & Questions?