Improved reconstruction attacks using range query leakage
Marie-Sarah Lacharité, Brice Minaud, Kenny Paterson
Information Security Group
Application Setting
Storing Records in the Cloud
• R records, each with a unique record identifier and a value (N possible values).
Application Scenario
• Client to server: "give me all records with values in the range [1975, 1979]".
Access Pattern Leakage
• Client to server: "give me all records with values in the range [1975, 1979]".
• Leaked: the identifiers of the matching records.
• Listed on the slide: OPE, ORE schemes, POPE, [HK16], Blind seer, [Lu12], [FJKNRS15],…
Access Pattern Leakage and Rank Leakage
• Client to server: "give me all records with values in the range [1975, 1979]".
• Leaked: the identifiers of the matching records and the rank range [a+1, b] of the query.
• Listed on the slide: FH-OPE, Lewi-Wu, Arx, Cipherbase, EncKV,…
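The two leakage models can be illustrated with a small sketch. The toy database, the function name `leak`, and the reading of rank(v) as "number of records with value at most v" are assumptions made for illustration; the last one is consistent with the a = rank(x-1), b = rank(y) notation used later in the deck, but the slides do not spell it out.

```python
from bisect import bisect_right

def leak(records, x, y):
    """Return what a server might observe for the range query [x, y].

    records: dict mapping record ID -> value (a hypothetical toy database).
    Returns (matching_ids, a, b), where a = rank(x-1), b = rank(y), and
    rank(v) is assumed to be the number of records with value <= v.
    """
    values = sorted(records.values())
    a = bisect_right(values, x - 1)   # records with value <= x-1
    b = bisect_right(values, y)       # records with value <= y
    matching = {rid for rid, v in records.items() if x <= v <= y}
    return matching, a, b

# Hypothetical data, not taken from the slides.
db = {"r1": 1975, "r2": 1977, "r3": 1980, "r4": 1969, "r5": 1979}
ids, a, b = leak(db, 1975, 1979)
```

With access pattern leakage alone the server sees only `ids`; with rank leakage it also sees `a` and `b`, and the matching records occupy ranks a+1 to b.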
Assumptions
1. Data is dense: all values appear in at least one record.
2. Queries are uniformly distributed.
Target: full reconstruction, i.e. find the value associated with each record.
Best previous result (Kellaris et al., CCS 2016): full reconstruction by analysing access pattern leakage from O(N² log N) queries.
Our Main Results (eprint 2017/701)
• Full reconstruction with O(N log N) queries
  – in fact, expected N · (3 + log N).
• Approximate reconstruction with relative accuracy ε from O(N · log(1/ε)) queries
  – in fact, expected 5/4 · N · log(1/ε) + O(N).
• Approximate reconstruction using an auxiliary distribution and rank leakage
  – more efficient in practice, evaluation via simulation.
  – applies in the non-dense case too, giving a new attack on OPE/ORE schemes.
Uniform Queries: Uniform Endpoints vs. Uniform Ranges (N = 10)
[Figure: the 55 ranges (x, y) with 1 ≤ x ≤ y ≤ 10, from (1, 1) to (10, 10), shown under the two query distributions, uniform ranges and uniform endpoints.]
Distribution of Left Endpoints: Uniform Endpoints vs. Uniform Ranges (N = 10)
[Figure: distribution of the left endpoint (values 1 to 10) under uniform endpoints and under uniform ranges.]
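One way to see why the left endpoint is not uniform under uniform ranges: assuming "uniform ranges" means that each of the N(N+1)/2 ranges [x, y] is equally likely (an interpretation, not stated explicitly on the slide), left endpoint x is shared by N − x + 1 ranges. The function name below is hypothetical.

```python
from fractions import Fraction

def left_endpoint_distribution(n):
    """Probability of each left endpoint when all ranges [x, y] with
    1 <= x <= y <= n are equally likely (assumed meaning of 'uniform ranges')."""
    total = n * (n + 1) // 2                     # number of distinct ranges
    # Left endpoint x appears in the ranges [x, x], [x, x+1], ..., [x, n].
    return {x: Fraction(n - x + 1, total) for x in range(1, n + 1)}

dist = left_endpoint_distribution(10)
```

For N = 10 this gives probability 10/55 for left endpoint 1 and 1/55 for left endpoint 10, so large left endpoints are the rare "coupons".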
Coupon Collector’s Problem
[Plot: expected number of draws against number of coupons (N), for N up to 125, with curves N · (1 + log N) and N · H(N).]
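The two curves in this plot can be reproduced with a few lines. This is a minimal sketch that assumes log is the natural logarithm and H(N) is the N-th harmonic number; the function names are illustrative.

```python
import math

def harmonic(n):
    """N-th harmonic number H(N) = 1 + 1/2 + ... + 1/N."""
    return sum(1.0 / k for k in range(1, n + 1))

def expected_draws(n):
    """Expected draws to collect all n equally likely coupons: N * H(N)."""
    return n * harmonic(n)

def upper_bound(n):
    """The bound N * (1 + log N), with log assumed to be natural."""
    return n * (1 + math.log(n))

# Compare the exact expectation with the bound over the plotted range.
gaps = [upper_bound(n) - expected_draws(n) for n in range(1, 126)]
```

Since H(N) ≤ 1 + ln N for every N ≥ 1, every entry of `gaps` is non-negative up to floating-point error.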
Attack 1: Full Reconstruction
Motivating Example (with Rank Leakage)
• Suppose left endpoints of query intervals are chosen uniformly at random.
• We wish to observe at least 1 query with each of the N possible left endpoints.
• Expected number of queries needed is at most N · (1 + log N).

hidden      leaked
[x,y]       a = rank(x-1)   b = rank(y)   matching IDs
[20,25]     1300            1500          M_20
[1,18]      0               1200          M_1
[55,125]    3100            4400          M_55
[2,10]      500             800           M_2
[7,98]      700             4200          M_7

The sets of matching IDs are relabelled M_x, with x the left endpoint, for convenience.
Motivating Example (with Rank Leakage)
[Figure: records laid out by rank from 1 to 4400, with the sets M_1, M_2, M_3, … marked.]
Value classes: M_1 − ∪_{i>1} M_i, M_2 − ∪_{i>2} M_i, …, M_{N−1} − M_N, M_N.
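The set expressions on this slide can be read as a small algorithm: once one query M_x has been observed for every left endpoint x, the records with value x are those in M_x that appear in no M_i with i > x. The sketch below assumes dense data and uses a hypothetical function name and toy data; it is an illustration of that reading, not the authors' code.

```python
def reconstruct_by_left_endpoint(matches, n):
    """matches: dict x -> set of record IDs returned by some query [x, y].

    Assumes one observed query per left endpoint x in 1..n and dense data.
    Returns a dict mapping record ID -> recovered value.
    """
    seen = set()        # union of M_i for all i > x processed so far
    value_of = {}
    for x in range(n, 0, -1):
        for rid in matches[x] - seen:   # M_x minus the union of later M_i
            value_of[rid] = x
        seen |= matches[x]
    return value_of

# Toy database (hypothetical): a has value 1, b and c value 2, d value 3, e value 4.
observed = {
    1: {"a", "b", "c", "d"},   # query [1, 3]
    2: {"b", "c"},             # query [2, 2]
    3: {"d", "e"},             # query [3, 4]
    4: {"e"},                  # query [4, 4]
}
recovered = reconstruct_by_left_endpoint(observed, 4)
```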
Full Reconstruction (with Rank Leakage)
• Now suppose queries have ranges chosen uniformly at random.
• We present a data-optimal algorithm (fails ⇒ full reconstruction is impossible).
• Expected number of sufficient queries is at most N · (2 + log N) for N ≥ 27.
• Main idea: partition, then sort (easy with rank leakage, harder without).
• Expected number of necessary queries is at least 1/2 · N · log N − O(N) for any algorithm.
Full Reconstruction (with Rank Leakage)
[Plot: expected number of queries (0 to 80000) against N up to 125 (x-axis labelled "Number of coupons (N)"), comparing the O(N² log N) bound of KKNO16 with this work.]
Full Reconstruction (with Rank Leakage): Partitioning Step

record ID   matched query?
            1  2  3  4  5  6  7
20          ✓  ✓  ✗  ✗  ✓  ✗  ✗
23          ✓  ✗  ✗  ✓  ✓  ✓  ✓
29          ✗  ✓  ✓  ✗  ✗  ✓  ✗
89          ✗  ✓  ✓  ✗  ✓  ✓  ✗
193         ✓  ✓  ✗  ✗  ✓  ✓  ✓
…

• Equality of matching defines a partition of records.
• Records in the same class of the partition cannot be distinguished.
• For complete reconstruction, we need N classes: one class per value.
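The partition described above groups records by which queries they matched. A minimal sketch, assuming each observed query is available as a set of record IDs (the function name and toy data are hypothetical):

```python
from collections import defaultdict

def partition_by_signature(query_results, record_ids):
    """Group records whose match/no-match pattern over all queries is identical.

    query_results: list of sets of record IDs, one set per observed query.
    Returns a list of classes (sets of record IDs).
    """
    classes = defaultdict(set)
    for rid in record_ids:
        # One boolean per query: did this record match it?
        signature = tuple(rid in result for result in query_results)
        classes[signature].add(rid)
    return list(classes.values())

# Toy example: records 2 and 3 match exactly the same queries.
results = [{1, 2, 3}, {2, 3}, {4}]
classes = partition_by_signature(results, [1, 2, 3, 4])
```

Here records 2 and 3 end up in the same class, so these three queries alone cannot tell them apart.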
Full Reconstruction (with Rank Leakage): Partitioning Step

record ID   matched query?
            1  2  3  4  5  6  7
20          ✓  ✓  ✗  ✗  ✓  ✗  ✗
23          ✓  ✓  ✗  ✗  ✓  ✓  ✓
29          ✗  ✓  ✓  ✗  ✗  ✓  ✗
89          ✗  ✓  ✓  ✗  ✓  ✓  ✗
193         ✓  ✓  ✗  ✗  ✓  ✓  ✓
…

Rank intervals shown on the slide: [1,100], [18,82], [16,96], [16,30], [21,61].
We can also deduce from rank leakage that, e.g., records 23 and 193 have ranks in [21,30], by intersecting rank intervals.
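The intersection of rank intervals mentioned on this slide is a one-line computation: take the largest left end and the smallest right end. The helper name is hypothetical; the five intervals are the ones shown on the slide.

```python
def intersect_intervals(intervals):
    """Intersect closed integer intervals given as (lo, hi) pairs.

    Returns the common interval, or None if the intersection is empty.
    """
    lo = max(a for a, _ in intervals)
    hi = min(b for _, b in intervals)
    return (lo, hi) if lo <= hi else None

# Rank intervals from the slide.
common = intersect_intervals([(1, 100), (18, 82), (16, 96), (16, 30), (21, 61)])
```

The result is (21, 30), matching the rank range stated for records 23 and 193.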
Full Reconstruction (with Rank Leakage): Partitioning Step
• Order the partition into N classes by rank.
[Figure: numbered classes ordered by rank; the class with ranks [21,30] contains records 23 and 193 (and more).]
Full Reconstruction (with Rank Leakage): Proof Intuition
• The hard part is to show that O(N log N) queries suffice with a small constant.
• The proof consists of showing that if certain favourable queries are made, then partitioning succeeds in constructing N classes.
• Roughly speaking, for our proof we hope for queries on ranges:
  1. [x,*] for all 1 ≤ x ≤ N/2 (left coupons)
  2. [*,y] for all N/2+1 ≤ y ≤ N (right coupons)
  3. [N/2+1,y] and [x,N] for some y ≥ x.
• Assuming these all arise, a combinatorial argument establishes the success of the partitioning step.
• The first two cases are essentially a pair of coupon collector problems: success with high probability with O(N log N) queries.
• The third case is a high-probability event: 1 − e^(−Q/(2N+2)) for Q queries.
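To get a feel for the third case, the expression 1 − e^(−Q/(2N+2)) can be evaluated directly. The sketch below takes that expression at face value as the probability of the event after Q queries; the function names and the example values of N and the failure target are assumptions for illustration.

```python
import math

def third_case_probability(q, n):
    """Evaluate 1 - exp(-Q / (2N + 2)), the expression quoted for the third case."""
    return 1 - math.exp(-q / (2 * n + 2))

def queries_for_failure(n, delta):
    """Smallest Q for which exp(-Q / (2N + 2)) <= delta."""
    return math.ceil((2 * n + 2) * math.log(1 / delta))

# Hypothetical example: N = 100 values, target failure probability 1%.
q = queries_for_failure(100, 0.01)
p = third_case_probability(q, 100)
```

The required Q grows linearly in N for a fixed failure target, so this case does not dominate the O(N log N) cost of the two coupon collector cases.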
Full Reconstruction (without Rank Leakage)
• Can only recover values up to reflection.
• Data-optimal algorithm (fails ⇒ full reconstruction is impossible).
• Expected number of sufficient queries is at most N · (3 + log N) for N ≥ 26.
• Partition (as before), then sort*.
• Expected number of necessary queries is at least 1/2 · N · log N − O(N) for any algorithm.
*Not quite.
Full Reconstruction (without Rank Leakage): Sorting Step
[Figure: all records, with query result sets M_7, M_39, M_72, M_36, M_93, M_58, M_28, M_9, M_40, M_18; labels "Interval of size N − 1" and "1 or N".]
Full Reconstruction (without Rank Leakage): Sorting Step – Extending
[Figure: all records, with query result sets M_25, M_36, M_22, M_17, M_62, M_81, … and a set labelled T.]
Full Reconstruction (without Rank Leakage): Sorting Step
[Figure: all records, with query result sets M_3, M_39, M_27, M_13, M_52, M_99 and a set labelled T.]
Full Reconstruction (without Rank Leakage): Proof Intuition
• The hard part is again to show that O(N log N) queries suffice, with a small constant.
• The proof again consists of showing that if certain favourable queries are made, then partitioning succeeds in constructing N classes.
• Coupon collecting bounds then establish that O(N log N) queries are enough.
Attack 2: Approximate Reconstruction
Approximate Reconstruction Attack (without Rank Leakage)
• Recover values up to reflection and with relative error ε.
• Expected number of sufficient queries is 5/4 · N · log(1/ε) + O(N).
• Expected number of necessary queries is at least 1/2 · N · log(1/ε) − O(N) for any algorithm.
• Not data-optimal without rank leakage (but is with it).
Coupon Collection (N = 125)
[Plot: collecting n of 125 coupons; expected number of draws against coupon number (n).]
Coupon Collection (N = 125)
[Plot: collecting a fraction (1 − ε) of 125 coupons; expected number of draws against coupon number (n), with ε = 0.04, 0.08, 0.12, 0.16 and 0.2 marked.]
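The shape of these curves follows from the standard coupon collector calculation: when k distinct coupons are already held, the next new one takes N/(N − k) draws in expectation. The sketch below uses that standard fact, with a hypothetical function name, to compare the cost of collecting a fraction 1 − ε of the coupons with N · log(1/ε), assuming natural logarithms.

```python
import math

def expected_draws_partial(total, wanted):
    """Expected draws to hold `wanted` distinct coupons out of `total`
    equally likely ones: sum over k < wanted of total / (total - k)."""
    return sum(total / (total - k) for k in range(wanted))

n = 125
eps = 0.2
partial = expected_draws_partial(n, round((1 - eps) * n))   # 100 of 125 coupons
full = expected_draws_partial(n, n)                         # all 125 coupons
approx = n * math.log(1 / eps)                              # N * log(1/eps)
```

Collecting the last few coupons accounts for most of the cost, which is why tolerating a relative error ε replaces the log N factor with log(1/ε).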
Approximate Reconstruction: Old Partitioning Method Doesn't Work
[Figure: all records, with query result sets M_7, M_39, M_72, M_36, M_93, M_58, M_28, M_9, M_40, M_18.]
Approximate Reconstruction: Partitioning Step
1. Pick any record r.
2. Intersect all queries matching r to get M.
3. Find q_L and q_R such that q_L ∩ q_R = M and |q_L ∪ q_R| is maximised.
4. Find q'_L such that q_L ∩ q'_L ≠ ∅, q'_L ∩ q_R ⊆ M, and |q_L ∪ q'_L| is maximised.
5. Find q'_R such that q_R ∩ q'_R ≠ ∅, q'_R ∩ q_L ⊆ M, and |q_R ∪ q'_R| is maximised.
6. Start over if not every record is in q_L ∪ q'_L ∪ q_R ∪ q'_R.
[Figure: the sets q'_L, q_L, M, q_R, q'_R laid out from left to right.]
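Steps 2 to 6 can be followed literally on sets of record IDs. This is only a sketch of the stated conditions: the function name, the brute-force search over pairs, the tie-breaking, and the toy data are all assumptions, and what is done with the resulting sets afterwards is not described on these slides.

```python
from itertools import combinations

def partition_step(queries, all_records, r):
    """Apply steps 2-6 for a chosen record r.

    queries: iterable of sets of record IDs (observed query results).
    Returns (M, qL, qL2, qR, qR2), or None when the attempt should start over.
    Which side is called 'left' is arbitrary (values are known up to reflection).
    """
    queries = [frozenset(q) for q in queries]
    matching = [q for q in queries if r in q]
    if not matching:
        return None
    M = frozenset.intersection(*matching)              # step 2
    # Step 3: pair of matching queries with intersection M and largest union.
    pairs = [(a, b) for a, b in combinations(matching, 2) if a & b == M]
    if not pairs:
        return None
    qL, qR = max(pairs, key=lambda p: len(p[0] | p[1]))

    def extend(near, far):
        # Steps 4 and 5: overlap `near`, stay inside M on the `far` side,
        # and make the union with `near` as large as possible.
        candidates = [q for q in queries if q & near and (q & far) <= M]
        return max(candidates, key=lambda q: len(near | q), default=None)

    qL2, qR2 = extend(qL, qR), extend(qR, qL)
    if qL2 is None or qR2 is None:
        return None
    if qL | qL2 | qR | qR2 != frozenset(all_records):  # step 6
        return None
    return M, qL, qL2, qR, qR2

# Toy data (hypothetical): one record per value 1..6, record ID = value.
qs = [{1, 2, 3}, {3, 4, 5}, {1, 2}, {5, 6}, {2, 3}]
result = partition_step(qs, range(1, 7), r=3)
```

In this toy run M is {3}, and dropping the query {5, 6} makes the coverage check in step 6 fail, so the function returns None.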