Efficient Private Matching and Set Intersection




  1. Efficient Private Matching and Set Intersection
     Mike Freedman, NYU; Kobbi Nissim, MSR; Benny Pinkas, HP Labs
     (To appear in EUROCRYPT 2004)

     A Story…
     - "We think patients are misusing prescriptions to obtain drugs…"
     - "Here too.."
     - "We could share our lists of patients?"
     - "But, what about HIPAA?"
     - "And we’re competitors!"

     A Story…
     - "We could share our lists of patients?"
     - "Have you heard of “secure function evaluation”?"
     - "This is all “theory”. It can’t be efficient."
     1. Improvements to generic primitives (SFE, OT)
     2. Improvements in specific protocol examples

     The Scenario
                Client              Server
     Input:     X = x_1 … x_k       Y = y_1 … y_k
     Output:    X ∩ Y only          nothing
     - Enterprises and government holding sensitive databases
     - Peer-to-Peer networks
     - Mobile wireless crowds (PDAs, cell phones)
     Credit rating, CAPS II, shared interests (research, music), genetic compatibility, etc.

  2. Related work
     - Use a circuit for SFE [Yao, GMW, BGW]
     - Use k^2 private equality tests
       - Single inputs x, y; return 1 iff x = y, 0 otherwise
       - (O(k) computation [NP])
     - Diffie-Hellman based solutions [FHH99, EGS03]
       - Insecure against malicious adversaries
       - Depend on a “random oracle” assumption
     - Our work: O(k ln ln k) overhead
       - “Semi-honest” adversaries: no RO assumption
       - “Malicious” adversaries: with RO assumption

     Crypto vs. randomization methods
     [Figure; slide text not recoverable]

     This talk…
     - Overview
     - Basic protocol in semi-honest model
     - Efficient Improvements
     - A little on…
       - Extending protocol to malicious model
       - Approximation bounds
       - Multi-party security
       - Fuzzy matching

     Basic tool: Homomorphic Encryption
     - Semantically-secure public-key encryption
     - Given Enc(M1), Enc(M2), can compute, without knowing the decryption key:
       - Enc(M1 + M2) = Enc(M1) · Enc(M2)
       - Enc(c · M1) = [Enc(M1)]^c, for any constant c
     - Examples: El Gamal, Paillier, DJ
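The two homomorphic identities on the "Basic tool" slide can be checked with a toy Paillier implementation. This is a minimal sketch, not the authors' code: the function names, the choice g = n + 1, and the small Mersenne primes are assumptions made for illustration, and the parameters are far too small for real use.

```python
import math
import secrets

# Toy primes (assumption for illustration only; real keys use much larger primes).
P, Q = 2**31 - 1, 2**61 - 1

def keygen():
    """Return (public modulus n, private key (phi, mu)) for Paillier with g = n + 1."""
    n = P * Q
    phi = (P - 1) * (Q - 1)
    mu = pow(phi, -1, n)            # modular inverse of phi mod n
    return n, (phi, mu)

def encrypt(n, m):
    """Enc(m) = (1 + n)^m * r^n mod n^2, with fresh randomness r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    phi, mu = priv
    u = pow(c, phi, n * n)          # equals 1 + m*phi*n mod n^2
    return (u - 1) // n * mu % n

def add(n, c1, c2):
    """Enc(M1 + M2) = Enc(M1) * Enc(M2)."""
    return c1 * c2 % (n * n)

def scale(n, c, k):
    """Enc(k * M1) = Enc(M1)^k."""
    return pow(c, k, n * n)
```

With `n, priv = keygen()`, decrypting `add(n, encrypt(n, 20), encrypt(n, 22))` gives the sum of the plaintexts, and decrypting `scale(n, encrypt(n, 20), 3)` gives the plaintext multiplied by 3, in both cases without the party doing the arithmetic ever holding `priv`.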

  3. The Protocol
     - Client (C) defines a polynomial of degree k whose roots are her inputs x_1, …, x_k:
       P(y) = (x_1 - y)(x_2 - y)…(x_k - y) = a_0 + a_1·y + … + a_k·y^k
     - C sends to server (S) homomorphic encryptions of the polynomial’s coefficients:
       Enc(a_0), …, Enc(a_k)
     - Enc(P(y)) = Enc(a_0 + a_1·y^1 + … + a_k·y^k)
                 = Enc(a_0) · Enc(a_1)^(y^1) · … · Enc(a_k)^(y^k)

     …The Protocol
     - S uses homomorphic properties to compute, ∀ y, with r random:
       Enc(r·P(y) + y)
       - Enc(y) if y ∈ X ∩ Y
       - Enc(random) otherwise
     - S sends (permuted) results back to C

     Variant protocols…cardinality
     - Server returns Enc(r·P(y) + 1)
       - Enc(1) if y ∈ X ∩ Y
       - Enc(random) otherwise
     - Computes size of intersection: # of Enc(1)

     Variant protocols…others
     - Server returns Enc(r·P(y) + s)
     - [Figure: decrypted values r_1, s_2, s_3, r_4, r_5 compared (=?) with s_1, …, s_5 inside a circuit]
     - ∀ y, compute r·P(y) + s, for s random
     - Perform Yao circuit on decrypted values
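The algebra behind the basic protocol can be sketched in the clear. The code below deliberately omits encryption: in the protocol the server would perform these steps on ciphertexts using the homomorphic properties, whereas here everything is plain arithmetic modulo a prime. The modulus `PRIME` and all function names are assumptions for illustration.

```python
import secrets

PRIME = 2**61 - 1  # assumed plaintext modulus, much larger than any input value

def poly_from_roots(xs, p=PRIME):
    """Client: coefficients a_0..a_k of P(y) = (x_1 - y)...(x_k - y) mod p."""
    coeffs = [1]
    for x in xs:
        nxt = [0] * (len(coeffs) + 1)
        for i, a in enumerate(coeffs):
            nxt[i] = (nxt[i] + a * x) % p       # a * x contributes to y^i
            nxt[i + 1] = (nxt[i + 1] - a) % p   # a * (-y) contributes to y^(i+1)
        coeffs = nxt
    return coeffs

def evaluate(coeffs, y, p=PRIME):
    return sum(a * pow(y, i, p) for i, a in enumerate(coeffs)) % p

def server_response(coeffs, ys, p=PRIME):
    """Server: for each y pick a fresh nonzero r and return r*P(y) + y, permuted."""
    out = []
    for y in ys:
        r = secrets.randbelow(p - 1) + 1
        out.append((r * evaluate(coeffs, y, p) + y) % p)
    secrets.SystemRandom().shuffle(out)
    return out

def client_output(xs, responses):
    """Client: values that are her own inputs are in the intersection."""
    return set(xs) & set(responses)
```

For example, with client inputs `[3, 17, 42, 99]` and server inputs `[5, 17, 99, 1000]`, the responses for 17 and 99 come back unchanged because P vanishes there, while the other two come back as values that look random, so the client recovers `{17, 99}` except with negligible probability.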

  4. Variant protocols…others
     - Server returns Enc(r·P(y) + s)
     - [Figure: decrypted values r_1, s_2, s_3, r_4, r_5 compared (=) with s_1, …, s_5 inside a circuit]
     - ∀ y, compute r·P(y) + s, for s random
     - Perform Yao circuit on decrypted values
     - e.g., |intersection| > threshold

     Security (semi-honest case)
     - Client’s privacy
       - S only sees semantically-secure enc’s
       - Learning about C’s input = breaking enc’s
     - Server’s privacy (proof via simulation)
       - Client can simulate her view in the protocol, given the output X ∩ Y alone:
         she can compute the enc’s of items in X ∩ Y and of random items.

     Efficiency
     - Communication is O(k)
       - C sends k coefficients
       - S sends k evaluations on polynomial
     - Computation
       - Client encrypts and decrypts k values
       - Server: ∀ y ∈ Y, computes Enc(r·P(y) + y), using k exponentiations
       - Total O(k^2) exponentiations

     Improving Efficiency (1)
     - Inputs typically from a “small” domain of D values. Represented by log D bits (…20)
     - Use Horner’s rule:
       P(y) = a_0 + y(a_1 + … y(a_{n-1} + y·a_n) …)
     - That is, exponents are only log D bits
     - Overhead of exponentiation is linear in |exponent|
     - Improve by factor of |modulus| / log D, e.g., 1024 / 20 ≈ 50
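Horner's rule and the speedup estimate from "Improving Efficiency (1)" can be illustrated on plaintext values. This is a sketch under assumptions: the function names are invented here, and the evaluation is done in the clear. In the homomorphic setting each `acc * y` step would correspond to raising a ciphertext to the power y, so every exponent has only log D bits instead of the full size of y^i.

```python
def horner(coeffs, y, p):
    """Evaluate a_0 + y(a_1 + ... y(a_{n-1} + y*a_n)...) mod p."""
    acc = 0
    for a in reversed(coeffs):      # start from a_n and work down to a_0
        acc = (acc * y + a) % p     # one multiplication by y per coefficient
    return acc

def naive(coeffs, y, p):
    """Evaluate the same polynomial term by term, using powers y^i."""
    return sum(a * pow(y, i, p) for i, a in enumerate(coeffs)) % p

def speedup_estimate(modulus_bits, domain_bits):
    """Exponentiation cost is linear in exponent length, so the gain is their ratio."""
    return modulus_bits / domain_bits
```

Both evaluators return the same value for any input; `speedup_estimate(1024, 20)` reproduces the slide's figure of roughly 50.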

  5. Improving Efficiency (2): Hashing
     - [Figure: inputs x_1, x_2, …, x_k hashed by H(·) into B bins P_1, P_2, P_3, …, P_B, each holding at most M items]
     - C uses PRF H(·) to hash inputs to B bins
     - Let M bound max # of items in a bin
     - Client defines B polynomials of deg M. Each poly encodes x’s mapped to its bin

     Improving Efficiency (2): Hashing
     - Client sends B polynomials and H to server.
     - For every y, S computes H(y) and evaluates the single corresponding poly of degree M:
       ∀ y, i ← H(y), r ← rand, Enc(r · P_i(y) + y)

     Overhead with Hashing
     - Communication: B · M
     - Server: k · M short exp’s (for P_i(y)), k full exp’s (for r·P_i(y) + y)
     - How to make M as small as possible?
     - Balanced allocations [ABKU]:
       - H: Choose two bins, map to the emptier bin
       - B = k / ln ln k → M = O(ln ln k) (M ≤ 5 in practice [BM])
       - Communication: O(k)
       - Server: k ln ln k short exp, k full exp

     This talk…
     - Overview
     - Basic protocol in semi-honest model
     - Efficient Improvements
     - A little on…
       - Extending protocol to malicious model
       - Approximation bounds
       - Multi-party security
       - Fuzzy matching
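The balanced-allocation idea from "Overhead with Hashing" (choose two bins, place the item in the emptier one) can be sketched as follows. SHA-256 with a seed prefix stands in for the PRF H, which is an assumption; the function names are likewise hypothetical. The sketch only builds the bins and reports the maximum load M; it does not build the per-bin polynomials.

```python
import hashlib
import math

def bin_index(item, seed, num_bins):
    """Stand-in for the PRF H: hash (seed, item) to a bin number."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_bins

def num_bins_for(k):
    """B = k / ln ln k, as on the slide (assumes k is at least 16)."""
    return max(1, round(k / math.log(math.log(k))))

def candidate_bins(item, num_bins):
    """The two bins an item may occupy."""
    return {bin_index(item, 0, num_bins), bin_index(item, 1, num_bins)}

def allocate(items, num_bins):
    """Place each item in the emptier of its two candidate bins."""
    bins = [[] for _ in range(num_bins)]
    for x in items:
        i, j = bin_index(x, 0, num_bins), bin_index(x, 1, num_bins)
        target = i if len(bins[i]) <= len(bins[j]) else j
        bins[target].append(x)
    return bins

def max_load(bins):
    """M: the largest number of items in any bin."""
    return max(len(b) for b in bins)
```

Running `allocate(range(2000), num_bins_for(2000))` and then `max_load` gives the degree bound M that each per-bin polynomial would need. One design consequence worth noting: since the server cannot tell which of the two candidate bins the client chose for a given value, a server following this scheme would presumably evaluate the polynomials of both bins returned by `candidate_bins`.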

  6. Malicious Adversaries
     - Malicious clients
       - Without hashing: trivial. Parties use known a_0
       - With hashing
         - Verify that total # of roots (in all B poly’s) is k
         - Solution using cut-and-choose
         - Exponentially small error probability
         - Still standard model
     - Malicious servers
       - Privacy…easy: S receives semantically-secure encryptions

     Security against Malicious Server
     - Correctness: Ensure that there is an input of k items corresponding to S’s actions
     - Problem: Server computes r·P(y) + y’
     - Solution: Server uses RO to commit to seed, then uses resulting randomness to “prove” correctness of encryption

     Is Approximation easier?
     - Represent input sets as k-bit vectors
       [Figure: two example bit vectors]
     - Approximate size of intersection (scalar product) with sublinear overhead? And securely?
     - Lower bound:
       - Approximating |X ∩ Y| within a 1 ± ε factor requires Ω(k) communication
       - True even for randomized algorithms
       - Proof: Reduction from Razborov’s lower bound for Disjointness
     - We provide secure approximation protocol

     Multi-party intersection
     - N parties: (N-1) clients, 1 leader
     - ∀ y, leader prepares (N-1) shares that XOR to y
     - Each client performs intersection protocol with leader, learns random share of y
     - Clients XOR (N-1) decrypted values
       - Recovers y iff y ∈ X_1 ∩ X_2 ∩ X_3 ∩ … ∩ X_N
     - Nice communication flow
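The XOR-sharing step of the multi-party intersection slide can be sketched as below. The two-party protocol run between the leader and each client is replaced by a plain stand-in (`client_learns`), which is an assumption made to keep the example short; all names and the 64-bit share width are hypothetical.

```python
import secrets
from functools import reduce
from operator import xor

def xor_shares(y, count, bits=64):
    """Leader: split y into `count` shares whose XOR is y."""
    shares = [secrets.randbits(bits) for _ in range(count - 1)]
    last = reduce(xor, shares, y)   # chosen so that all shares XOR back to y
    return shares + [last]

def client_learns(y, client_set, share, bits=64):
    """Stand-in for one two-party run: the client obtains its share of y
    if y is in its own set, and an unrelated random value otherwise."""
    return share if y in client_set else secrets.randbits(bits)

def combine(values):
    """Clients: XOR the (N-1) decrypted values together."""
    return reduce(xor, values, 0)
```

If y is in every client's set, `combine` returns y; if any single client lacks y, that client contributes a random value and the XOR is a random-looking result, so y is recovered only for elements in the full intersection (up to a negligible chance of collision).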

  7. Fuzzy matching
     - Databases are not always accurate or full
       - Errors, omissions, inconsistent spellings, etc.
     - How to report a match iff entries similar?
       - Match in t out of T “attributes”
     - Adaptation of earlier protocol, but requires (T choose t) overhead

     Open problems
     - More computationally-efficient protocol?
     - Malicious parties
       - Protocol secure in standard model?
       - Secure, efficient set cardinality protocol?
     - Fuzzy matching
       - Efficient protocol needed?
       - Security in malicious model?
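One way to see where a (T choose t) overhead could come from is to reduce "match in t out of T attributes" to exact matching on every t-subset of attribute positions. The slides do not spell out the construction, so the sketch below is an assumption rather than the authors' method, and it works on plaintext records only.

```python
from itertools import combinations
from math import comb

def subset_keys(record, t):
    """All keys (positions, values) over t-subsets of the record's T attributes.
    Each key could serve as one input to an exact private matching protocol."""
    positions = range(len(record))
    return {(idx, tuple(record[i] for i in idx)) for idx in combinations(positions, t)}

def fuzzy_match(a, b, t):
    """Two records match iff they agree on at least t attribute positions."""
    return bool(subset_keys(a, t) & subset_keys(b, t))

def overhead(T, t):
    """Number of exact-match inputs generated per record."""
    return comb(T, t)
```

For two four-attribute records that differ only in one spelling, `fuzzy_match(a, b, 3)` is true while `fuzzy_match(a, b, 4)` is false, and each record expands to `overhead(4, 3)` keys, which grows quickly as T increases.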
