SAC Summer School 2019 Encrypted Search: Intro & Basics Seny Kamara
14,717,618,286*
4%
* since 2013
Why so Few?
• Incompetence?
• Laziness?
• Cost?
“…because it would have hurt Yahoo’s ability to index and search message data…” (J. Bonforte, in the NY Times)
Q: Can we search on encrypted data?
[Figure: map of the encrypted search literature, starting from "Can we?" [SWP00]. Topics and citations recovered from the figure:]
• sec. defs [Goh03,CM05]
• adaptive sec. defs [CGKO06]
• O(#docs) [Goh03,CM05]
• OPT time [CGKO06]
• dynamic in OPT time [KPR12,NPG14,CJJJKRS14]
• parallel [KPR13]
• I/O efficient [CJJJKRS14,CT14,…]
• Boolean in sub-linear [CJJJ+13,PKVK+14,KM17]
• ranges [PBP16,…]
• attacks [IKK12,CGPR15,ZKP16,BKM19]
• range attacks [NKW15,KKNO17,LMP18,…]
• leakage suppression [KMO18,KM19]
• distributed storage [AK19]
• forward private [SPS14,B16,…]
• dual secure [AKM19]
• snapshot secure [AKM19]
• multi-user [CGKO06,JJKRS13,PPY18,…]
• ESPADA, BlindSeer [CJJKRS13,PKVK+14]
• Pixek [ZKM18]
• relational DBs [HILI02,KC05,PRZB11,KM19]
• graphs [CK10,MKNK15]
• beyond search [CK10]
• DEX [KMZZ19]
Interdisciplinary
• Databases
• Machine Learning
• Data Structures
• Graph Algorithms
• Cryptography
• Information Retrieval
• Distributed Systems
• Statistics
• Optimization
Real-World Problem
• Startups
  • Ciphercloud
  • Skyhigh Networks
  • Bitglass
  • Baffle
  • Cossack Labs
  • Strong Salt, Overnest
  • many many more
• Major companies
  • Microsoft, SAP
  • MongoDB, Cisco
  • Google Research
  • Hitachi, Fujitsu
  • more…
• Funding agencies
  • NSF
  • IARPA
  • DARPA
Encrypted Search (Building Blocks)
• Property-Preserving Encryption (PPE)
• Functional Encryption
• Structured Encryption (STE)
• Oblivious RAM (ORAM)
• Fully-Homomorphic Encryption (FHE)
Efficiency, Functionality, Leakage
What is Search?
• Complexity regimes
  • linear search: O(n)
  • sub-linear search: o(n)
• Algorithmic paradigms
  • with pre-processing
  • without pre-processing
• For medium to large data
  • sub-linear search is a requirement, not an option

             Without Pre-Processing             With Pre-Processing
Linear       sequential scan                    not interesting
Sub-Linear   read sub-set of input (errors)     data structures
Background: Data Structures
• Abstract data types
  • capture functionality
  • ex: dictionary
• Data structures
  • instantiate ADTs
  • ex: hash table, binary search tree
• As common in CS
  • we sometimes blur the distinction
• Arrays store values
  • A = (v_1, v_2, v_3, v_4, v_5, v_6)
  • Write: A[i] := v_i
  • Read: A[i] returns v_i
Background: Data Structures
• Dictionaries map labels to values
  • DX: ℓ_1 ↦ v_1, ℓ_2 ↦ v_2, ℓ_3 ↦ v_3
  • Put: DX[ℓ_2] := v_2
  • Get: DX[ℓ_2] returns v_2
• Multi-Maps map labels to tuples
  • MM: ℓ_1 ↦ (v_1, v_3, v_4), ℓ_2 ↦ (v_3), ℓ_3 ↦ (v_2, v_4)
  • Put: MM[ℓ_3] := (v_2, v_4)
  • Get: MM[ℓ_3] returns (v_2, v_4)
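The Put/Get interfaces above can be sketched in a few lines of Python. This is only an illustration: the class name `MultiMap` and the string labels are hypothetical, and a built-in `dict` stands in for the dictionary DX.

```python
# Dictionary (DX): each label maps to exactly one value.
dx = {}
dx["l2"] = "v2"            # Put: DX[l2] := v2
dx_value = dx["l2"]        # Get: DX[l2] returns v2


class MultiMap:
    """Multi-map (MM): each label maps to a tuple of values."""

    def __init__(self):
        self._table = {}

    def put(self, label, values):
        # Put: MM[label] := (v_1, ..., v_k)
        self._table[label] = tuple(values)

    def get(self, label):
        # Get: MM[label] returns the stored tuple, or () if the label is absent.
        return self._table.get(label, ())


mm = MultiMap()
mm.put("l3", ["v2", "v4"])  # Put: MM[l3] := (v2, v4)
mm_value = mm.get("l3")     # Get: MM[l3] returns (v2, v4)
```

Backing both structures with a hash table is one way to blur the ADT/data-structure distinction mentioned earlier: the ADT is the Put/Get interface, the hash table is one instantiation of it.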
Keyword Search in Sub-Linear Time
• Setup time: build DS in O(n)
• Query time: DS, q ⟶ ans = (ptr_1, …, ptr_n)
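As a rough plaintext illustration of this setup/query split, the sketch below builds an inverted index (a multi-map from keywords to document identifiers) in one pass over the input and then answers a keyword query with a single lookup. The function names `build_index` and `search`, the whitespace tokenization, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(docs):
    """Setup: one pass over the input, linear in its total size."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Use a set so each document is listed once per keyword.
        for word in set(text.lower().split()):
            index[word].append(doc_id)
    return dict(index)


def search(index, keyword):
    """Query: one hash lookup, plus time proportional to the answer size."""
    return index.get(keyword.lower(), [])


docs = {
    1: "encrypted search basics",
    2: "search engines",
    3: "graph algorithms",
}
index = build_index(docs)
hits = search(index, "search")
```

Query time here does not depend on the number of documents that do not match, which is the sense in which the search is sub-linear.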
Database Queries in Sub-Linear Time
• Setup time: build DS in O(n)
• Query time: DS, q ⟶ ans = (ptr_1, …, ptr_n)
Q: How do we do sub-linear search on encrypted data?
Encrypted Keyword Search in Sub-Linear Time
• Setup time: build DS in O(n), then build EDS from DS in O(n)
• Query time: EDS, q ⟶ ans = (ptr_1, …, ptr_n)
Encrypted Database Queries in Sub-Linear Time
• Setup time: build DS in O(n), then build EDS from DS in O(n)
• Query time: EDS, q ⟶ ans = (ptr_1, …, ptr_n)
Q: How do we formalize encrypted data structures?
Structured Encryption [Chase-K.10]
• Setup(1^k, DS) ⟶ (K, EDS)
• Token(K, q) ⟶ tk
• Query(EDS, tk) ⟶ ans
Desiderata
• Setup leakage
• Size of EDS
• Size of state
• Size of token
• Query time
• Query leakage
Structured Encryption [Chase-K.10]
• Many variants of STE
  • response-revealing
    • EDS query reveals answer in plaintext
  • response-hiding
    • EDS query reveals encrypted answer
  • non-interactive queries
    • client sends a single message called a token
  • interactive queries
    • client and server execute a multi-round protocol
Evolution of Structured Encryption
• Expressiveness
  • ‘00 Single-keyword SSE [SWP00,Goh03,CGKO06,CJJJKRS14]
  • ‘06 Multi-user SSE [CGKO06,JJKRS13,PPY16,HSWW18]
  • ‘13 Boolean SSE [CJJKRS13,PKVK+14,KM17]
  • ‘14 Range SSE [PKVK+14,FJKNRS15]
  • ‘18 STE-based SQL [KM18]
• Security
  • ‘06 Leakage-parametrized security definitions [CGKO06]
  • ‘12 Attacks [IKK12,CGPR15,ZKP16,KMNO16,LMP18,GLMP18]
  • ‘14 Forward/Backward Security [SPS14,Bost16,LC17,BMO17,AKM18]
  • ‘18 Leakage Suppression [KMO18,KM19]
  • ‘19 Snapshot [AKM18]
• Efficiency
  • ‘00 Linear in file length [SWP00]
  • ‘03 Linear in #docs [Goh03]
  • ‘06 Optimal [CGKO06,CK10]
  • ‘12 Optimal Dynamic [KPR12,CJJJKRS14]
  • ‘14 I/O efficient [CT14,CJJJKRS14,ANSS16,DPP18,ASS18]
Adversarial Models
[Figure: the views of a snapshot adversary and a persistent adversary as the structure evolves from EDS_0 to EDS_1 to EDS_2 under queries q (with answers ans) and updates u.]
Persistent (Adaptive) Security [Curtmola-Garay-K.-Ostrovsky06,Chase-K.10]
• An STE scheme is (ℒ_S, ℒ_Q)-secure vs. a persistent adv. if
  • it reveals no information about the structure beyond ℒ_S
  • it reveals no information about the structure and query beyond ℒ_Q
Persistent (Adaptive) Security [Curtmola-Garay-K.-Ostrovsky06,Chase-K.10]
[Figure: Real vs. Ideal experiments. In Real, the view is generated from DS, the queries q, and the updates u. In Ideal, it is generated from ℒ_S(DS), ℒ_Q(DS, q), and ℒ_U(DS, u).]
Forward Privacy [Stefanov-Papamanthou-Shi14, Bost16]
• Informally [SPS14]
  • “Updates not correlated to previous queries”
• Formally [Bost16]
  • ℒ_U(MM, (ℓ, v)) = #v
Snapshot (Adaptive) Security [Amjad-K.-Moataz19]
• We say that an STE scheme is ℒ_Snp-secure vs. a snapshot adv. if
  • it reveals no information about the structure beyond ℒ_Snp
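Snapshot (Adaptive) Security [Amjad-K.-Moataz19]
[Figure: Real vs. Ideal experiments. In Real, DS_0 is encrypted as EDS_0, which becomes EDS_1 after a query q and EDS_2 after an update u. In Ideal, the corresponding snapshots are labeled ℒ_S(DS_0), ℒ_S(DS_1, q), and ℒ_S(DS_2, q).]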
Snapshot (Adaptive) Security [Amjad-K.-Moataz19]
• Static structures: ℒ_Snp = ℒ_S
• Dynamic structures: forward privacy, snapshot security, insertion independence, write-only obliviousness (variant of history independence)
Q: Why do we parameterize definitions with leakage?
Leakage-Parameterized Definitions [Curtmola-Garay-K.-Ostrovsky06, Chase-K.10]
• This area is about tradeoffs
  • but traditional cryptographic definitions don’t capture tradeoffs
  • in the ’00s, different approaches were proposed to capture leakage
    • #1: limit adversary’s power in the proof
    • #2: make assumptions on data (e.g., high entropy)
• Original motivations for leakage-parameterized definitions
  • Approaches #1 & #2 are misleading (sweep leakage under the rug)
  • Leakage should be made explicit, not left implicit
    • gives a clear target for cryptanalysis
    • makes it (somewhat) easier to compare schemes
Q: How do we model leakage?
Modeling Leakage
• Each scheme has a leakage profile: 𝚳 = (ℒ_S, ℒ_Q, ℒ_U)
  • where ℒ_S = (patt_1, …, patt_n) is the Setup leakage
  • ℒ_Q = (patt_1, …, patt_n) is the Query leakage
  • ℒ_U = (patt_1, …, patt_n) is the Update leakage
• Each “operational” leakage is composed of leakage patterns
  • (patt_1, …, patt_n)
Common Leakage Patterns [K.-Moataz-Ohrimenko18]
• qeq: query equality
  • a.k.a. search pattern
• rid: response identity
  • a.k.a. access pattern
• qlen: query length
• trlen: total resp. length
• rlen/vol: response length
  • a.k.a. volume pattern
• req: response equality
• mqlen: max query length
• mrlen: max resp. length
• srlen: sequence resp. length
• dsize: data size
• usize: update size
• did: data identity
Example Leakage Profiles
• The “Baseline” leakage profile for response-revealing EMMs
  • 𝚳 = (ℒ_S, ℒ_Q, ℒ_U) = (dsize, (qeq, rid), usize)
• The “Baseline” leakage profile for response-hiding EMMs
  • 𝚳 = (ℒ_S, ℒ_Q, ℒ_U) = (dsize, qeq, usize)
• Several new constructions have better leakage profiles
  • AZL and FZL [K.-Moataz-Ohrimenko18]
  • VLH and AVLH [K.-Moataz19]
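Leakage patterns are functions of the plaintext structure and the query sequence, so they can be written down directly as code. The sketch below computes a few of them for a plaintext multi-map, under assumed conventions that the slides do not fix: `qeq` is returned as a binary equality matrix, `rid` as the tuple of values matching each query, and `dsize` as the total number of stored values. The function names and sample data are hypothetical.

```python
def dsize(mm):
    """Data size: here, the total number of values stored in the multi-map."""
    return sum(len(values) for values in mm.values())


def qeq(queries):
    """Query equality (search pattern): M[i][j] = 1 iff queries i and j are equal."""
    return [[int(a == b) for b in queries] for a in queries]


def rid(mm, queries):
    """Response identity (access pattern): the response to each query."""
    return [tuple(mm.get(q, ())) for q in queries]


def rlen(mm, queries):
    """Response length (volume pattern): the size of each response."""
    return [len(mm.get(q, ())) for q in queries]


def baseline_response_revealing(mm, queries):
    """Setup and query parts of the baseline profile (dsize, (qeq, rid), usize)."""
    return {"setup": dsize(mm), "query": (qeq(queries), rid(mm, queries))}


mm = {"a": ("d1", "d2"), "b": ("d2",)}
queries = ["a", "b", "a"]
profile = baseline_response_revealing(mm, queries)
```

With this sample, the first and third queries are flagged as equal by `qeq`, and `rid` shows that document `d2` is returned for both labels. Leakage attacks try to exploit exactly this kind of correlation.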
Structured Encryption vs. Other Primitives
• Encrypted structures appear implicitly throughout crypto
• Oblivious RAM can be viewed as a
  • response-hiding encrypted array
  • with leakage profile 𝚳_ORAM = (ℒ_S, ℒ_Q, ℒ_U) = (dsize, ⟘)
• PIR can be viewed as a
  • response-hiding encrypted array
  • with leakage profile 𝚳_PIR = (ℒ_S, ℒ_Q, ℒ_U) = (did, ⟘)
• Garbled gates can be viewed as
  • response-revealing 2x2 arrays
  • 𝚳_GG = (ℒ_S, ℒ_Q, ℒ_U) = (dsize, qeq)
Encrypted Multi-Maps
Encrypted Multi-Maps: The Heart of Sub-Linear Encrypted Search
• EMMs are used as a building block for sub-linear
  • Single keyword search [Curtmola-Garay-K.-Ostrovsky06,…]
  • Conjunctive keyword search [Cash et al.13,…]
  • Boolean keyword search [Cash et al.13, K.-Moataz17,…]
  • Range queries [Faber et al.14, Demertzis et al. 16,…]
  • Substring and wildcard queries [Faber et al.14,…]
  • SQL databases [K.-Moataz18,…]
  • Graph databases [Chase-K.10,…]