Introduction to Side-Channel Analysis


  1. Introduction to Side-Channel Analysis
  François-Xavier Standaert, UCL Crypto Group, Belgium
  Summer school on real-world crypto, 2016

  2. Outline
  • Link with linear cryptanalysis
  • Standard Differential Power Analysis
  • Noise-based security (is not enough)
  • CPA vs Gaussian templates
  • Post-processing the traces
  • Noise amplification (aka masking)
  • Conclusions & advanced topics

  4. Linear cryptanalysis (I) [figure]

  7. Linear cryptanalysis (II)
  • Main characteristics
    • Divide-and-conquer attack
    • Data complexity ∝ 1/ε², with ε = 2^(n−1) · ε_1 ⋯ ε_n (n S-boxes in the approximation, each with bias ε_i, by the piling-up lemma)
    • Time complexity: exponential in the # of active S-boxes in R1 (key guess)
  • Countermeasures
    • Data: good (non-linear) S-boxes
    • Data & time: many active S-boxes
    • Data: larger number of rounds
  ⇒ AES: ε < 2^(−64) after a few rounds
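As a numeric illustration of the piling-up lemma on this slide, the sketch below combines per-S-box biases into the overall bias ε and the resulting data complexity 1/ε²; the bias values are made-up examples, not taken from a real cipher.

```python
from math import prod

def combined_bias(biases):
    # Piling-up lemma: eps = 2^(n-1) * eps_1 * ... * eps_n
    n = len(biases)
    return 2 ** (n - 1) * prod(biases)

eps = combined_bias([2**-3] * 4)  # 4 active S-boxes, bias 2^-3 each
print(eps, round(1 / eps**2))     # eps = 2^-9, so ~2^18 known plaintexts
```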

  10. Side-channel cryptanalysis [figure]

  11. Differential Side-Channel Analysis
  • Main characteristics
    • Divide-and-conquer attack
    • Data complexity ∝ 1/MI(K; L, X)
    • Time complexity ∝ # of S-boxes predicted
  • Linear cryptanalysis countermeasures
    • Good (non-linear) S-boxes
    • Many active S-boxes ?
    • Larger number of rounds
  ⇒ Unprotected implem.: MI(K; L, X) > 0.01

  18. Standard DPA [figure]

  23. Measurement & pre-processing
  • Noise reduction via good setups (!)
    • Filtering, averaging (FFT, SSA, …)
  • Detection of Points-Of-Interest (POI)
  • Dimensionality reduction (PCA, LDA, …)
  • …
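As a toy illustration of two of these pre-processing steps, the sketch below averages repeated measurements (reducing noise variance) and ranks time samples by between-class variance to find POIs; function names and parameters are illustrative, not from the slides.

```python
import numpy as np

def average_traces(traces, n_repeats):
    """Average groups of n_repeats consecutive measurements of the same
    input, dividing the Gaussian noise variance by n_repeats."""
    q, t = traces.shape
    return traces.reshape(q // n_repeats, n_repeats, t).mean(axis=1)

def select_poi(traces, labels, n_poi=5):
    """Rank time samples by the variance of the per-class mean traces
    (a simple difference-of-means-style POI detector)."""
    classes = np.unique(labels)
    class_means = np.array([traces[labels == c].mean(axis=0) for c in classes])
    scores = class_means.var(axis=0)          # informative samples vary across classes
    return np.argsort(scores)[::-1][:n_poi]   # indices of the top-scoring samples
```

On simulated traces where only one time sample depends on the processed data, `select_poi` returns that sample first.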

  24. Prediction and modeling
  • General case: profiled DPA
    • Build “templates”, i.e. estimate f̂(l_j | k, x_j)
      • e.g. Gaussian, regression-based
    • Which directly leads to Pr[k | l_j, x_j]
  • “Simplified” case: non-profiled DPA
    • Just assumes some a-priori model, e.g. l̂_j = HW(v_j) for an intermediate value v_j
  • Separation: only profiled DPA is guaranteed to succeed against any leaking device (!)
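A minimal sketch of the profiled case on simulated Hamming-weight leakage: a profiling phase estimates the mean leakage per intermediate value, and the attack phase applies maximum likelihood over key candidates. All parameters (4-bit values, noise level, trace counts) are illustrative assumptions, not the slides' setup.

```python
import numpy as np

rng = np.random.default_rng(0)
hw = lambda v: bin(int(v)).count("1")   # Hamming weight leakage model
sigma = 0.5                             # assumed Gaussian noise level

# Profiling phase: estimate the mean leakage of each 4-bit value
# (templates with a shared noise sigma).
prof_vals = rng.integers(0, 16, 5000)
prof_leak = np.array([hw(v) for v in prof_vals]) + rng.normal(0, sigma, 5000)
means = np.array([prof_leak[prof_vals == v].mean() for v in range(16)])

# Attack phase: for each key candidate k, score the likelihood of the
# observed leakages under the template of v_j = x_j XOR k.
key = 0x6
xs = rng.integers(0, 16, 200)
leaks = np.array([hw(x ^ key) for x in xs]) + rng.normal(0, sigma, 200)
loglik = [-np.sum((leaks - means[xs ^ k]) ** 2) for k in range(16)]
print(hex(int(np.argmax(loglik))))      # maximum likelihood key guess
```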

  27. Exploitation
  • Profiled case: maximum likelihood
  • Unprofiled case:
    • Difference-of-Means
    • Correlation (CPA)
    • “On-the-fly” regression
    • Mutual Information Analysis (MIA)
    • […]

  29. Illustration
  • Gaussian templates (maximum likelihood):
      k̂ = argmax_{k*} ∏_{j=1..q} 1/(√(2π) · σ) · exp(−(l_j − m̂_{k*,j})² / (2σ²))   (σ: noise std)
    • More efficient (why?)
    • Outputs probabilities
  • CPA (Pearson correlation):
      k̂ = argmax_{k*} [E(L · M_{k*}) − E(L) · E(M_{k*})] / (σ(L) · σ(M_{k*}))
    • Less efficient (why?)
    • Outputs scores
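The CPA side of this slide can be sketched end to end on simulated traces. To keep the example small it targets a 4-bit S-box (PRESENT's, standing in for the AES S-box) under Hamming-weight leakage; key, trace count, and noise level are illustrative assumptions.

```python
import numpy as np

# Toy 4-bit S-box (PRESENT's), standing in for the AES S-box.
SBOX = np.array([0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                 0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2])

def hw(values):
    # Hamming weight of each value (the CPA leakage model)
    return np.array([bin(int(v)).count("1") for v in values])

def cpa(plaintexts, traces):
    """Correlation of the HW model with the traces for every key
    candidate; argmax over the scores is the CPA key guess."""
    scores = np.zeros(16)
    for k in range(16):
        model = hw(SBOX[plaintexts ^ k]).astype(float)
        scores[k] = np.corrcoef(model, traces)[0, 1]
    return scores

# Simulate noisy Hamming-weight leakage under a secret key.
rng = np.random.default_rng(1)
key = 0xA
pts = rng.integers(0, 16, 1000)
leak = hw(SBOX[pts ^ key]) + rng.normal(0, 1.0, 1000)
scores = cpa(pts, leak)
print(hex(int(np.argmax(scores))))
```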

  31. First-order CPA (I)
  • Lemma 1. The mutual information between two jointly normally distributed random variables X, Y with means μ_X, μ_Y and variances σ_X², σ_Y² equals:
      MI(X; Y) = −(1/2) · log₂(1 − ρ(X, Y)²)
  • Lemma 2. In a CPA, the number of samples required to distinguish the correct key (with model M_k) from the other key candidates (with models M_{k*}) is ∝ c / ρ(M_k, L)² (with c a small constant depending on the SR & # of key candidates)

  33. First-order CPA (II)
  • Lemma 3. Let X, Y and Z be three random variables s.t. Y = X + N₁ and Z = Y + N₂, with N₁ and N₂ two independent additive noise variables. Then:
      ρ(X, Z) = ρ(X, Y) · ρ(Y, Z)
  • Lemma 4. The correlation coefficient between the sum of n independent and identically distributed random variables and the sum of the first m < n of these equals √(m/n)
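Lemma 4 is easy to check empirically; the Monte-Carlo sketch below uses arbitrary illustrative choices of n and m and compares the sample correlation to √(m/n).

```python
import numpy as np

# Empirical check of Lemma 4: corr(sum of n i.i.d. vars, sum of first m) = sqrt(m/n).
rng = np.random.default_rng(0)
n, m, trials = 16, 4, 200_000
x = rng.normal(size=(trials, n))
full = x.sum(axis=1)            # sum of all n variables
partial = x[:, :m].sum(axis=1)  # sum of the first m variables
rho = np.corrcoef(full, partial)[0, 1]
print(rho, np.sqrt(m / n))      # the two values nearly coincide
```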

  35. Paper & pencil estimations (I)
  • FPGA implementation of the AES
  • Adversary targeting the 1st key byte
  • Hamming weight leakage function/model
  • 8-bit loop architecture broken in 10 traces
  • How does the attack data complexity scale
    • For a 32-bit architecture?
      • i.e. with 24 bits of “algorithmic noise”
    • For a 128-bit architecture?
      • i.e. with 120 bits of “algorithmic noise”

  37. Paper & pencil estimations (II)
  • Hint: L = M + N = M_t + M_o + N
    (M_t: leakage of the target byte, M_o: “algorithmic noise” of the other bytes, N: measurement noise)
  • Lemma 3: ρ(M_t, L) = ρ(M_t, M) · ρ(M, L)
  • Lemma 4: ρ(M_t, M) = ?
    • For the 8-bit architecture: √(8/8)
    • For the 32-bit architecture: √(8/32)
    • For the 128-bit architecture: √(8/128)
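Putting the hints together gives a worked answer, sketched below: by Lemma 2 the data complexity scales as 1/ρ(M_t, L)², and by Lemmas 3-4 widening the datapath from 8 to w bits scales ρ(M_t, M) by √(8/w), so the trace count grows roughly by w/8 (neglecting the smaller change in ρ(M, L)).

```python
# Worked answer to the paper & pencil exercise above, combining Lemmas 2-4:
# data complexity ∝ 1/rho^2 and rho(M_t, M)^2 = 8/width, so the trace
# count scales linearly in the architecture width.
base_traces = 10  # the 8-bit loop architecture is broken in 10 traces
estimates = {w: base_traces * w // 8 for w in (8, 32, 128)}
print(estimates)  # {8: 10, 32: 40, 128: 160}
```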
