Systems Security: Side-channel attacks Stjepan Picek s.picek@tudelft.nl Delft University of Technology, The Netherlands May 6, 2018

Outline
1. Side-channels
2. Implementation Attacks
3. Side-channel Attacks
4. Fault Injection

Side-channels
A side-channel is something that lets you learn about something without observing it directly.
Figure: https://www.strava.com/heatmap#3.10/-108.57419/44.95226/hot/all

Implementation Attacks
“Researchers have extracted information from nothing more than the reflection of a computer monitor off an eyeball or the sounds emanating from a printer.” (Scientific American, May 2009)

Cryptographic Theory vs Physical Reality
• Cryptographic algorithms are (supposed to be) theoretically secure.
• Implementations leak in the physical world.

Implementation Attack Categories
• Side-channel attacks.
• Faults.
• Microprobing.

Taxonomy of Implementation Attacks
• Active vs passive.
• Active:
  1. The key is recovered by exploiting some abnormal behavior.
  2. Insertion of signals.
• Passive:
  1. The device operates within its specifications.
  2. Reading hidden signals.

Implementation Attacks
Implementation attacks do not target weaknesses of the algorithm, but weaknesses of its implementation.
• Side-channel attacks (SCAs) are passive, non-invasive attacks.
• SCAs represent one of the most powerful categories of attacks on crypto devices.

Examples of Implementation Attacks
• KeeLoq: eavesdropping from up to 100 m.
• PS3 hack due to a flawed ECDSA implementation.
• Attacks on Mifare Classic, Atmel CryptoMemory.
• Spectre and Meltdown.

The Goals of Attackers
• Secret data.
• Location.
• Reverse engineering.
• Theoretical cryptanalysis.
• ...

Physical Security in the Beginning
• Tempest: it was already known in the 1960s that computers generate EM radiation that leaks information about the processed data.
• 1965: MI5 used a microphone positioned near the rotor machine used by the Egyptian embassy to deduce the positions of the rotors.
• 1996: first academic publication on SCA (timing).
• 1997: Bellcore attack.
• 1999: first publication on SCA (power).

Power Analysis
• Direct attacks:
  1. Simple Power Analysis (SPA).
  2. Differential Power Analysis (DPA).
  3. Correlation Power Analysis (CPA).
  4. ...
• Two-stage attacks:
  1. Template attack (TA).
  2. Stochastic models.
  3. Machine learning-based attacks.
  4. ...

Simple Power Analysis
• Based on one or a few measurements.
• Visual inspection of measurements.
• Discovery of data-independent but instruction-dependent properties.
• In symmetric crypto:
  1. Number of rounds.
  2. Memory access.
• In asymmetric crypto:
  1. Key length.
  2. Implementation details.
  3. Key.

Assignment 1
• Learn about (or review) DES, AES, and RSA.

Differential Power Analysis
• Statistical analysis of measurements.

Assignment 2
• Implement DPA.

Correlation Power Analysis
• Write a leakage model for the power consumption.
• Obtain measurements of the power consumption while the device runs encryption over different plaintexts.
• Attack subparts of the key (divide-and-conquer approach):
  1. Consider all options for the subkey. For each guess and trace, use the plaintext and the guessed subkey to calculate the power consumption according to the model.
  2. Use the Pearson correlation to compare the modeled and the actual power consumption.
  3. Decide which subkey guess correlates best with the measured traces.
• Combine the best subkey guesses to obtain the secret key.
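
The CPA steps can be sketched on simulated data. This is a minimal sketch under stated assumptions, not a reference implementation: the target is a single AES key byte, each "trace" is one simulated sample equal to the Hamming weight of the first-round S-box output plus Gaussian noise, and the names `simulate_traces`, `cpa_attack`, and `SECRET_KEY_BYTE` as well as the noise level and trace count are hypothetical choices made for illustration.

```python
import random

def gf_mul(a, b):
    """Multiply two bytes in the AES field GF(2^8) (modulus 0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result

def rotl8(x, n):
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def make_aes_sbox():
    """Build the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for value in range(256):
        inv = 0 if value == 0 else next(
            c for c in range(1, 256) if gf_mul(value, c) == 1)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox

SBOX = make_aes_sbox()

def hamming_weight(x):
    return bin(x).count("1")

def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def simulate_traces(secret, n_traces, noise_sigma, rng):
    """Assumed device: leaks HW(sbox[p XOR key]) plus Gaussian noise."""
    plaintexts = [rng.randrange(256) for _ in range(n_traces)]
    leaks = [hamming_weight(SBOX[p ^ secret]) + rng.gauss(0, noise_sigma)
             for p in plaintexts]
    return plaintexts, leaks

def cpa_attack(plaintexts, leaks):
    """Try all 256 subkey guesses and keep the best-correlating one."""
    scores = []
    for guess in range(256):
        model = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        scores.append(abs(pearson(model, leaks)))
    return max(range(256), key=scores.__getitem__)

rng = random.Random(0)          # fixed seed so the run is repeatable
SECRET_KEY_BYTE = 0x2B          # hypothetical secret, unknown to cpa_attack
plaintexts, leaks = simulate_traces(SECRET_KEY_BYTE, 500, 1.0, rng)
recovered = cpa_attack(plaintexts, leaks)
print(hex(recovered))
```

Real traces contain many samples per encryption, so in practice the correlation is computed per sample point and the highest peak over all points is used; the single-sample trace here only keeps the sketch short.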

Pearson’s Correlation
$$\rho_{X,Y} = \frac{\mathrm{cov}(X,Y)}{\sigma_x \sigma_y} = \frac{E[(X-\mu_x)(Y-\mu_y)]}{\sqrt{E[(X-\mu_x)^2]\,E[(Y-\mu_y)^2]}} \tag{1}$$

Leakage Models
• Recall, power has two components: static and dynamic.
• Static power is required to keep the device running and it depends on the number of transistors inside the device.
• Dynamic power depends on data processing.

Leakage Models
• Transition = the Hamming distance model.
• Counts the number of 0 → 1 and 1 → 0 transitions.
• Typical model for ASIC.
• Requires knowledge of a previous (or succeeding) value.
• The Hamming weight model is typical on a precharged data bus in a microcontroller.
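
The two leakage models reduce to a few lines of code. The sketch below is illustrative only; the function names and the example register values are assumptions, not part of the lecture.

```python
def hamming_weight(value: int) -> int:
    """Number of bits set to 1 in the value."""
    return bin(value).count("1")

def hamming_distance(previous: int, current: int) -> int:
    """Number of bits that flip (0 -> 1 or 1 -> 0) between two values."""
    return hamming_weight(previous ^ current)

previous, current = 0b1111_0000, 0b1111_0001   # hypothetical register contents
print(hamming_weight(current))                 # 5
print(hamming_distance(previous, current))     # 1
```

The example shows why the Hamming distance model needs the previous value: the new value has five set bits, but only one bit actually toggles.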

The Distinguishers
• Difference of Means.
• T-test.
• Variance test.
• Pearson correlation.
• Spearman’s rank correlation.
• Mutual Information Analysis (MIA).
• ...

Example
• Let us consider AES-128 with the Hamming weight model.
• After the first S-box operation, state = sbox[input XOR key].
• Our modeled power consumption for one plaintext byte p is then h_p = HW(sbox[input_p XOR key]), where HW is the Hamming weight.
• How many key guesses do we need for each subkey?
• How many in total?
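
The two questions can be worked out with a short calculation, assuming the attack targets the 16 key bytes of AES-128 one byte at a time. The constant names are illustrative.

```python
SUBKEY_BITS = 8      # one AES key byte per S-box
NUM_SUBKEYS = 16     # AES-128 has 16 key bytes

guesses_per_subkey = 2 ** SUBKEY_BITS                   # 256
total_guesses = NUM_SUBKEYS * guesses_per_subkey        # 4096
brute_force_guesses = 2 ** (SUBKEY_BITS * NUM_SUBKEYS)  # 2**128

print(guesses_per_subkey, total_guesses)
print(brute_force_guesses)
```

Divide and conquer turns an infeasible search over 2^128 keys into 16 independent searches over 256 candidates each.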

Profiled Attacks
• Profiled attacks have a prominent place as the most powerful among side-channel attacks.
• In the profiling phase, the adversary estimates leakage models for the targeted intermediate computations, which are then exploited to extract secret information in the actual attack phase.
• The Template Attack (TA) is the most powerful attack from the information-theoretic point of view.
• Some machine learning (ML) techniques also belong to the profiled attacks.

Profiled Attacks
• Two-stage (profiled) attacks are more complicated than the direct attacks.
• The attacker must have access to a copy of the device to be attacked.

Template Attack
• Using the copy of the device, record a large number of measurements using different plaintexts and keys. We require information about every possible subkey value.
• Create a template of the device’s operation. A template is a set of probability distributions that describe what the power traces look like for many different keys.
• On the device that is to be attacked, record a (small) number of measurements (called attack traces) using different plaintexts.
• Apply the template to the attack traces. For each subkey, record what value is the most likely to be the correct subkey.

Template Attack
• When using high-quality templates made from many traces, it is possible to attack a system with a single trace.
• A template attack can become unstable if there are more points of interest than measurements per value.
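
The profiling and attack phases can be sketched on a toy target. Everything below is an assumption made to keep the example small: a 4-bit subkey, the 4-bit PRESENT S-box in place of the AES S-box, a single point of interest per trace, univariate Gaussian templates (one per S-box output value), and the hypothetical names `leak`, `build_templates`, and `template_attack`.

```python
import math
import random

# 4-bit PRESENT S-box, used here only as a small stand-in target.
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
NOISE_SIGMA = 0.5   # assumed measurement noise

def leak(plaintext, key, rng):
    """Simulated one-sample trace: HW of the S-box output plus noise."""
    return bin(SBOX4[plaintext ^ key]).count("1") + rng.gauss(0, NOISE_SIGMA)

def build_templates(repeats, rng):
    """Profiling on the copy: (mean, variance) per S-box output value."""
    samples = {value: [] for value in range(16)}
    for _ in range(repeats):
        for plaintext in range(16):
            for key in range(16):
                value = SBOX4[plaintext ^ key]
                samples[value].append(leak(plaintext, key, rng))
    templates = {}
    for value, xs in samples.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
        templates[value] = (mean, var)
    return templates

def log_gauss(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def template_attack(templates, plaintexts, traces):
    """Return the subkey guess with the highest total log-likelihood."""
    scores = []
    for guess in range(16):
        total = 0.0
        for plaintext, x in zip(plaintexts, traces):
            mean, var = templates[SBOX4[plaintext ^ guess]]
            total += log_gauss(x, mean, var)
        scores.append(total)
    return max(range(16), key=scores.__getitem__)

rng = random.Random(1)                     # fixed seed for repeatability
templates = build_templates(repeats=10, rng=rng)

SECRET = 0xA                               # hypothetical key on the attacked device
attack_plaintexts = [i % 16 for i in range(48)]
attack_traces = [leak(p, SECRET, rng) for p in attack_plaintexts]
recovered = template_attack(templates, attack_plaintexts, attack_traces)
print(hex(recovered))
```

A realistic template uses several points of interest and a multivariate Gaussian with a covariance matrix per class, which is where the instability mentioned above comes from when there are too few measurements per value.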

Assignment 3
• Implement TA.

Machine Learning-based Attacks
• In symmetric crypto, machine learning-based attacks are mostly supervised learning approaches.
• Up to now, various techniques have been used with great success: SVM, Random Forest, Multilayer Perceptron, CNNs.
• The attack proceeds in two phases:
  1. Train a model from the training set (measurements with labels).
  2. Apply the model to the testing set (measurements without labels).
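
The two phases can be illustrated with a very small supervised classifier. Note the substitution: instead of the SVM, Random Forest, MLP, or CNN named above, this sketch uses a nearest-centroid classifier so that it stays dependency-free. The simulated traces (three samples, only the middle one leaking the Hamming weight label), the noise level, and all function names are assumptions for illustration.

```python
import random

NOISE_SIGMA = 0.3   # assumed measurement noise

def hamming_weight(value):
    return bin(value).count("1")

def simulate(n_traces, rng):
    """Return (traces, labels); only sample index 1 depends on the label."""
    traces, labels = [], []
    for _ in range(n_traces):
        label = hamming_weight(rng.randrange(16))
        traces.append([
            rng.gauss(0, NOISE_SIGMA),
            label + rng.gauss(0, NOISE_SIGMA),
            rng.gauss(0, NOISE_SIGMA),
        ])
        labels.append(label)
    return traces, labels

def train(traces, labels):
    """Phase 1: compute one centroid (mean trace) per label."""
    grouped = {}
    for trace, label in zip(traces, labels):
        grouped.setdefault(label, []).append(trace)
    return {
        label: [sum(column) / len(group) for column in zip(*group)]
        for label, group in grouped.items()
    }

def predict(model, trace):
    """Phase 2: return the label whose centroid is closest to the trace."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(trace, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

rng = random.Random(2)                          # fixed seed for repeatability
train_traces, train_labels = simulate(1000, rng)
test_traces, test_labels = simulate(500, rng)

model = train(train_traces, train_labels)
predictions = [predict(model, trace) for trace in test_traces]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"accuracy: {accuracy:.2f}")
```

In an actual attack the test labels are unknown; the predicted labels (or class probabilities) for the attack traces are combined over many traces to rank key candidates.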

Reality Is More Complicated
• Pre-processing.
• Feature engineering.
• Model selection.
• Hyperparameter optimization.
• Fighting with countermeasures.
• ...

Reality Is More Complicated
• Constraints for implementing countermeasures (software and hardware).
• Optimization can make SCA easier.
• Trade-off between practical and academic attacks.

Fault Injection
• Alter the correct functioning of a system.
• Often called perturbation attacks.
• Fault injection is very hard (accuracy, reproducibility).
• The equipment is expensive.

Methods
• Variations in supply voltage.
• Variation in external clock.
• Change in temperature.
• White light.
• X-rays and ion beams.

Goals
• Insert computational fault (null key, wrong crypto result).
• Change software decision (force approval of wrong PIN, enforce access rights).
• ...

Figure: Force Approval of Wrong PIN

Types of Fault Injection
• Non-invasive: glitching (clock, power supply).
• Semi-invasive: UV light, laser, optical fault injection.
• Invasive: microprobing, FIB probing.

Differential Fault Analysis (DFA)
• The attacker obtains a pair of ciphertexts derived by encrypting the same plaintext.
• One is the correct value and one is faulty.
• The two encryptions are identical up to the point where the fault occurred.
• The two ciphertexts can be regarded as outputs of round-reduced ciphers where the inputs are unknown but show a small differential.
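
The idea can be sketched on a toy last round. The assumptions are deliberately strong and are not taken from the lecture: a 4-bit state, the PRESENT S-box followed by a key XOR as the only remaining round, and a fault model in which exactly one bit of the state flips just before the S-box. The names `encrypt_last_round` and `key_candidates` are hypothetical.

```python
# 4-bit PRESENT S-box, used here only as a small stand-in target.
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
INV_SBOX4 = [SBOX4.index(value) for value in range(16)]

def encrypt_last_round(state, key, fault_mask=0):
    """Toy last round: optional bit-flip fault, then S-box, then key XOR."""
    return SBOX4[state ^ fault_mask] ^ key

def key_candidates(correct_ct, faulty_ct):
    """Guesses under which the two decrypted states differ in exactly one bit."""
    return {
        guess for guess in range(16)
        if bin(INV_SBOX4[correct_ct ^ guess]
               ^ INV_SBOX4[faulty_ct ^ guess]).count("1") == 1
    }

SECRET = 0x7                        # hypothetical last-round key
candidates = set(range(16))
for state in range(16):             # states are unknown to the attacker
    for bit in range(4):            # one single-bit fault per faulty run
        correct = encrypt_last_round(state, SECRET)
        faulty = encrypt_last_round(state, SECRET, fault_mask=1 << bit)
        candidates &= key_candidates(correct, faulty)

print(sorted(candidates))
```

The attacker only uses the ciphertext pairs and the assumed fault model: each pair rules out the key guesses that would imply an impossible input differential, and intersecting the surviving sets over several pairs narrows down the key.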