Why Your Encrypted Database Is Not Secure
Paul Grubbs, Tom Ristenpart, Vitaly Shmatikov
Outsourced Applications Today
[Diagram: server holding application data]
Encrypt the data!
Encrypt the Data
App functionality no longer works :(
[Diagram: server holding encrypted data]
Encrypt the Data: use property-revealing encryption (PRE)
• Searchable encryption
• Deterministic encryption
• Order-revealing encryption
[Diagram: server holding encrypted data]
Building “Secure” Systems
[Diagram: servers holding encrypted data, “computing on encrypted data”]
• CryptDB (SOSP 2011)
• Mylar (NSDI 2014)
• Seabed (OSDI 2016)
• Arx
• Many others
• Lots of industry and government interest!!
What They Claim
“Magically Flexible Cryptography”
Claims
• emulates fully homomorphic encryption
• provable confidentiality
• semantic security
• the database does not leak the values of sensitive fields, even if the attacker has side information
Fallacy #1
“Encryption scheme is secure” does not mean “the system is secure”
What This Talk Is About
How to take a plausible encryption scheme … and build a completely insecure system from it
[Diagram: server holding encrypted data]
Unsafe at Any Speed
If you look at an actual commodity DBMS …
• CryptDB (SOSP 2011)
• Mylar (NSDI 2014)
• Seabed (OSDI 2016)
• Arx
• Many others
• Lots of industry and government interest!!
… insecure under ANY real-world attack
Threat Models
• “Snapshot”
• Persistent passive
• Active
[Diagram: server holding encrypted data]
Claims Meet Reality
• Secure against active attacks: false
  – Grubbs et al. “Breaking web applications built on top of encrypted data” (CCS 2016)
• Secure against “snapshot” attacks: false
  – Grubbs et al. “Why your encrypted database is not secure” (HotOS 2017)
• Sensitivity analysis helps: false
  – Bindschaedler et al. “The tao of inference in privacy-protected databases” (forthcoming)
Security Against Active Attacks
Mylar
• Insecure proxy re-encryption scheme [see Van Rompay et al. 2017]
[Diagram labels: “Add orange user”; documents “Hiring plan for 2017” and “My secret diary”; “I trust blue user”; “Server, you can convert all my searches to blue key. Here’s a token to do it.”]
Mylar Under Active Attack
[Diagram labels: Hiring plan for 2017; Search(w); My secret diary]
some user machines … collude with the server … because the adversary broke into a user’s machine
Mylar Under Active Attack
• Unkeyed “hash” of keyword.
• Perform dictionary attack.
[Diagram: documents “Hiring plan for 2017” and “My secret diary”; an equation combining Search(w) with other (icon) terms to yield H(w)]
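A minimal sketch of why an unkeyed keyword hash invites a dictionary attack. SHA-256 is only a stand-in for the slide's H(w); Mylar's actual token construction is not reproduced here, and the keyword list and function names are hypothetical.

```python
import hashlib
from typing import List, Optional


def unkeyed_hash(word: str) -> str:
    """Stand-in for H(w): depends only on the keyword, never on a secret key."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


def dictionary_attack(observed: str, dictionary: List[str]) -> Optional[str]:
    """Hash every candidate keyword and compare it with the observed value."""
    for candidate in dictionary:
        if unkeyed_hash(candidate) == observed:
            return candidate
    return None


# Hypothetical keyword list the attacker is willing to try.
candidate_keywords = ["budget", "diary", "hiring", "salary"]

# The value the attacker has obtained for some unknown searched keyword.
observed = unkeyed_hash("hiring")

recovered = dictionary_attack(observed, candidate_keywords)
print(recovered)
```

Because no key is involved, anyone who obtains H(w) can test guesses offline; the only cost is the size of the dictionary.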
… as long as none of the users with access to that data item use a compromised machine
Mylar Under Active Attack
[Diagram labels: Hiring plan for 2017; Search(w); My secret diary]
None of the users with access to this data item use a compromised machine
Mylar in a Hospital
One nurse loses their laptop, server can compromise every doctor’s private files
“Snapshot” Threat Model
Existing systems explicitly claim security … assuming there are no queries in the snapshot
False in any realistic snapshot attack on a commodity DBMS
A Simple System Abstraction
[Diagram: OS; DB; volatile memory; persistent storage]
Actual Attacks
• Full-system compromise
• VM snapshot leak
• SQL injection
• Disk theft
[Diagram: OS; DB; volatile memory; persistent storage]
Case Study: MySQL (similar issues in any other commodity DBMS)
Attack | What MySQL leaks | Failed encrypted database
Disk theft | MVCC data structures | Arx’s range query index
SQL injection | Past query statistics | Seabed’s SPLASHE scheme
Full system compromise or VM snapshot leak | Text of past queries | CryptDB, Lewi/Wu, etc.
Disk Theft If this is your threat model, just use full-disk encryption
Logs on Disk
• General query log (not widely used)
• Binary log records modifications, used for replication and recovery
• Multi-version concurrency control using log data structures. In all modern SQL databases!
• Data modification queries can be reconstructed from these logs [FHMW ’10, FKSHW ’12]
[Diagram: Insert, Select, and Update queries; Insert and Update leave entries in the MVCC log]
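To illustrate how modification queries can be reconstructed from multi-version log structures, here is a toy model. It is not MySQL's on-disk format: `ToyMVCCTable`, its `undo_log`, and `reconstruct_writes` are hypothetical names, and the only assumption is that each write saves the previous version of the row.

```python
from typing import Dict, List, Optional, Tuple


class ToyMVCCTable:
    """Toy table that keeps the previous row version for every write."""

    def __init__(self) -> None:
        self.rows: Dict[int, str] = {}
        # Each entry is (key, previous value); None means the row did not exist.
        self.undo_log: List[Tuple[int, Optional[str]]] = []

    def insert(self, key: int, value: str) -> None:
        self.undo_log.append((key, None))
        self.rows[key] = value

    def update(self, key: int, value: str) -> None:
        self.undo_log.append((key, self.rows[key]))
        self.rows[key] = value


def reconstruct_writes(rows, undo_log):
    """Walk the log backwards from the current rows to recover every write."""
    latest = dict(rows)
    writes = []
    for key, old in reversed(undo_log):
        kind = "INSERT" if old is None else "UPDATE"
        writes.append((kind, key, old, latest[key]))
        latest[key] = old  # roll this row back by one version
    writes.reverse()
    return writes


table = ToyMVCCTable()
table.insert(1, "ct_a")
table.insert(2, "ct_b")
table.update(1, "ct_c")

# A disk thief sees only the current rows and the log, yet recovers the history.
recovered = reconstruct_writes(table.rows, table.undo_log)
for write in recovered:
    print(write)
```

The point of the sketch is that the stored values can all be ciphertexts and the write history is still recoverable.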
Arx (Poddar et al.)
• Range queries via chained garbled circuits
• Tree nodes become consumed, need replacing
[Diagram: index tree of ciphertexts E_k(5), E_k(3), E_k(7), E_k(1), E_k(2); a “>=2” query traverses the tree; server: “I used up these nodes”; client: “Here, refresh nodes with these ciphertexts” E_k(3), E_k(2), E_k(5)]
Security Claim for Arx (Poddar et al.)
“Arx protects the database with the same level of security as regular AES-based encryption”
Arx Under Snapshot Attack
• Range queries via chained garbled circuits
• Tree nodes become consumed, need replacing
• Consumed nodes immediately replaced, stored in MVCC log
• Query access pattern recorded on disk
• Snapshot attacker can recover queries and plaintexts using variants of attacks from [GSBNR - S&P ’17]
[Diagram: index tree of ciphertexts E_k(5), E_k(3), E_k(7), E_k(1), E_k(2); “Here, refresh nodes with these ciphertexts”; MVCC log holding update (“Up”) records for the refreshed nodes E_k(3), E_k(2), E_k(5)]
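A toy sketch of how refreshing consumed nodes can record a query's access pattern. This is not Arx's garbled-circuit index: comparisons are done on plaintext values for simplicity, the tree shape and node ids are made up, and the assumption that logged writes can be grouped per query (the `txn` field) is hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    def __init__(self, node_id: int, value: int,
                 left: Optional["Node"] = None,
                 right: Optional["Node"] = None) -> None:
        self.node_id = node_id
        self.value = value
        self.left = left
        self.right = right


def range_lookup(root: Node, bound: int, txn: int,
                 write_log: List[Tuple[int, int]]) -> None:
    """Descend toward `bound`; every visited node is consumed and rewritten,
    and each rewrite leaves a (txn, node_id) record in the log."""
    node = root
    while node is not None:
        write_log.append((txn, node.node_id))
        if bound == node.value:
            break
        node = node.left if bound < node.value else node.right


def paths_from_log(write_log: List[Tuple[int, int]]) -> Dict[int, List[int]]:
    """What a snapshot attacker does: group logged rewrites by query."""
    paths: Dict[int, List[int]] = {}
    for txn, node_id in write_log:
        paths.setdefault(txn, []).append(node_id)
    return paths


# Hypothetical five-node index; node ids are arbitrary labels.
root = Node(0, 5, Node(1, 3, Node(3, 1), Node(4, 4)), Node(2, 7))

write_log: List[Tuple[int, int]] = []
range_lookup(root, 4, txn=1, write_log=write_log)
range_lookup(root, 7, txn=2, write_log=write_log)

leaked_paths = paths_from_log(write_log)
print(leaked_paths)
```

Each recovered root-to-node path tells the attacker where the query bound falls relative to the stored values, which is the kind of leakage the cited attacks build on.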
SQL Injection
Attack | What MySQL leaks | Failed encrypted database
SQL injection | Past query statistics | Seabed’s SPLASHE scheme
SQL Injection
SQL injection accounted for 51% of all Web application attacks in 2016 (source: Akamai)
[Diagram label: malicious code runs here]
Diagnostic Tables
• information_schema stores current query for all users, contents of buffer cache
• performance_schema stores current query for all threads, statistics for past queries
• Separate counts for queries which involve different columns
[Diagram: performance_schema counters for Inserts and Selects incrementing as Insert and Select queries run]
Problem: Frequency Analysis
Name | Has given this talk before
Paul Grubbs | 1
Thomas Ristenpart | 0
Vitaly Shmatikov | 0
• Order-preserving encryption reveals histogram of plaintext values
• This is how Naveed et al. used frequency analysis to break CryptDB: match histogram to auxiliary model of data distribution
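The histogram-matching idea can be sketched as follows. This is a simplified rank-matching attack, not the exact procedure of Naveed et al.; HMAC-SHA256 is only a stand-in for a deterministic scheme (equal plaintexts give equal outputs), and the key and auxiliary counts are made-up illustration values.

```python
import hashlib
import hmac
from collections import Counter
from typing import Dict, List


def det_encrypt(key: bytes, value: str) -> str:
    """Stand-in for deterministic encryption: same plaintext, same output."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def frequency_attack(ciphertexts: List[str],
                     auxiliary_counts: Dict[str, int]) -> Dict[str, str]:
    """Match the i-th most frequent ciphertext to the i-th most frequent
    plaintext in the attacker's auxiliary model."""
    ct_ranked = [ct for ct, _ in Counter(ciphertexts).most_common()]
    pt_ranked = [pt for pt, _ in Counter(auxiliary_counts).most_common()]
    return dict(zip(ct_ranked, pt_ranked))


key = b"demo-key"  # unknown to the attacker; used only to build the example

# The "Has given this talk before" column from the slide: 1, 0, 0.
column = ["1", "0", "0"]
encrypted_column = [det_encrypt(key, v) for v in column]

# Hypothetical auxiliary model: most people have not given the talk.
auxiliary_counts = {"0": 70, "1": 30}

guess = frequency_attack(encrypted_column, auxiliary_counts)
print(guess[encrypted_column[0]])
```

The attacker never touches the key; the ciphertext histogram plus a rough model of the data distribution is enough.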
Seabed (Papadimitriou et al., OSDI 2016)
Plaintext table:
Name | Has given this talk before
Paul Grubbs | 1
Thomas Ristenpart | 0
Vitaly Shmatikov | 0
Encrypted table:
Name | C2 (“Has …”=1) | C3 (“Has …”=0)
aspoiwnpoinio | E_k(1) | E_k(0)
petryoiueytiew | E_k(0) | E_k(1)
Xncmxncmbcn | E_k(0) | E_k(1)
• Each possible plaintext gets its own column
• WHERE clause transformed to correct column: SELECT Count(“Has …”) WHERE “Has …”=1 becomes SELECT Count(C2)
• Separate counts for queries which involve different columns
Example
SQLi Extracts Diagnostic Tables
• Use frequency analysis to recover plaintexts (see paper for details)
• Separate counts for queries which involve different columns
Queries run: SELECT Count(C3); SELECT Count(C2); SELECT Count(C3)
performance_schema:
Selects for C2: 1
Selects for C3: 2
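A rough sketch of the column splitting and per-column query counts described on the Seabed slides. The mapping of values to C2 and C3 follows the slide; everything else is a simplification: indicators are left unencrypted, `select_counts` is a hypothetical stand-in for the statistics a DBMS keeps, and the function names are made up.

```python
from collections import Counter
from typing import Dict

# From the slide: value 1 lives in column C2, value 0 in column C3.
COLUMN_FOR_VALUE: Dict[int, str] = {1: "C2", 0: "C3"}


def split_value(value: int) -> Dict[str, int]:
    """One indicator per possible plaintext; a real system would encrypt each."""
    return {col: int(value == v) for v, col in COLUMN_FOR_VALUE.items()}


def rewrite_count_query(predicate_value: int) -> str:
    """Turn `... WHERE "Has ..." = v` into a count over the column for v."""
    return f"SELECT Count({COLUMN_FOR_VALUE[predicate_value]})"


# Hypothetical stand-in for per-column statistics kept by the DBMS.
select_counts: Counter = Counter()


def run_query(predicate_value: int) -> str:
    query = rewrite_count_query(predicate_value)
    select_counts[COLUMN_FOR_VALUE[predicate_value]] += 1
    return query


# Same sequence as on the slide: Count(C3), Count(C2), Count(C3).
issued = [run_query(0), run_query(1), run_query(0)]
print(issued)
print(dict(select_counts))
```

Since each column corresponds to one plaintext value, the per-column counts are a histogram of the values users ask about, which is what the frequency analysis then works from.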
Full-System Snapshot
Attack | What MySQL leaks | Failed encrypted database
Full system compromise or VM snapshot leak | Text of past queries | CryptDB, Lewi/Wu, etc.
Full-System Compromise Leakage of sensitive data at OS level is well-studied [CPGR, DLJKSXSW] We focus on DBMS address space, things inaccessible to users
Data Structures and Caches
• Adaptive hash index tracks page accesses, indexes automatically
• MySQL query cache stores select queries and results
• Other query caches (memcached) essential for performance!
• MySQL manages internal heaps, does not zero freed memory!
[Diagram: Insert and Select queries]
Token-Based Systems
CryptDB, Mylar, Lewi-Wu, other searchable encryption schemes cannot be semantically secure if attacker sees a single search token
[Diagram: a Select carrying a search token; the search token remains in the DBMS]
• 1,000 random selects… Still there.
• Waited a while… Still there.
• 100,000 more random selects… Still there.
Let Me Make Myself Perfectly Clear These encrypted databases CANNOT be semantically secure under ANY real-world attack