PIR-PSI: Scaling Private Contact Discovery
PETS 2018
Peter Rindal, Daniel Demmler, Mike Rosulek, Ni Trieu

Motivation – Application: Contact Discovery
• Contact discovery tells social network users which of their contacts are in the social network.
• An insecure naïve hashing-based protocol is used in practice.
• It is vulnerable to:
  • Brute-force attacks (for small input domains, e.g. phone numbers)
  • Comparison with hashes from later sessions
[Diagram labels: User, WhatsApp, Hashes of Contacts, Matching Contacts]

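The brute-force weakness of naive hashing can be illustrated with a minimal sketch. It assumes unsalted SHA-256 and a hypothetical toy domain of 4-digit numbers (real phone numbers are longer but still enumerable); the function names are illustrative and not taken from the talk.

```python
import hashlib

def h(number: str) -> str:
    """Unsalted hash, standing in for the naive protocol's hash (assumption)."""
    return hashlib.sha256(number.encode()).hexdigest()

def brute_force(uploaded_hashes, domain):
    """Recover preimages by hashing every value of the small input domain."""
    lookup = {h(x): x for x in domain}
    return [lookup[d] for d in uploaded_hashes if d in lookup]

# Hypothetical toy domain: all 4-digit "phone numbers".
domain = [f"{i:04d}" for i in range(10_000)]
contacts = ["0042", "1337", "9999"]

# The party receiving the hashes can invert all of them.
recovered = brute_force([h(c) for c in contacts], domain)
```

Because the whole domain can be enumerated, hashing alone hides nothing from the party that receives the hashes.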
Motivation – Application: Private Contact Discovery
• Contact discovery should be efficient and scalable, and protect the privacy of user inputs.
• It runs once when a user initially joins a social network ...
• ... and periodically to find contacts that join the social network later on.
[Diagram: the User (Contacts) and WhatsApp (Customers) run PIR-PSI; the output is the user's WhatsApp contacts.]

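Private Set Intersection (PSI)
[Diagram: two sets X and Y and their intersection X ∩ Y.]
[Diagram, ideal world: the "Receiver" inputs X and the "Sender" inputs Y to PSI; the output is X ∩ Y.]

PSI for Contact Discovery
|X| = n ≪ |Y| = N
[Diagram: sets X and Y with intersection X ∩ Y.]
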
Status-Quo vs. PIR-PSI
• Communication linear in both sets: O(N + n)
• What about N ≫ n?
• Insecure solution
  • Send small set to other party
  • Communication = O(min(N, n))
• PIR-PSI
  • Communication = O(n log(N log n / n))
  • Client computation = O(n log(N log n / n)) AES operations
  • Server computation = O(N log n) AES operations
[Diagram, "Private Contact Discovery": |X| = n contacts and |Y| = N customers enter PIR-PSI, which outputs X ∩ Y.]

Plaintext Database Query
[Diagram: the client sends the index i over TLS to a server holding DB = (y_1, y_2, …, y_N) and receives y_i.]

Private Information Retrieval (PIR)
[Diagram, ideal world: the client inputs i, the server inputs DB = (y_1, …, y_N), and PIR returns DB[i].]

2-Server PIR [CGKS95]
[Diagram: servers #1 and #2 each hold a copy of DB = (y_1, …, y_N). The client, holding i, sends query q_1 to #1 and q_2 to #2 and receives the responses r_1 and r_2, where r_1 ⊕ r_2 = DB[i]. No collusion between the servers!]

Example: 2-Server Linear Summation PIR [CGKS95]
• i = 2 ⇒ q = 001000
• q_1 chosen at random
• q_2 = q ⊕ q_1
• Each server #j answers with r_j = q_j · DB
• r_1 ⊕ r_2 = DB[2]

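The linear summation example can be sketched in a few lines. This is a minimal illustration, assuming database entries are integers, indices are zero-based, and the product q · DB is the XOR of all entries selected by the bits of q; the function names `query` and `answer` are hypothetical.

```python
import secrets

def query(i, n):
    """Client: split the unit vector q (1 only at position i) into XOR shares."""
    q = [1 if j == i else 0 for j in range(n)]
    q1 = [secrets.randbelow(2) for _ in range(n)]   # q_1 chosen at random
    q2 = [a ^ b for a, b in zip(q, q1)]             # q_2 = q XOR q_1
    return q1, q2

def answer(q, db):
    """Server: r = q . DB, i.e. the XOR of all entries whose query bit is 1."""
    r = 0
    for bit, item in zip(q, db):
        if bit:
            r ^= item
    return r

db = [11, 22, 33, 44, 55, 66]
i = 2
q1, q2 = query(i, len(db))
r1 = answer(q1, db)      # computed by server #1
r2 = answer(q2, db)      # computed by server #2
result = r1 ^ r2         # client reconstructs DB[i]
```

Each share on its own is a uniformly random bit vector, so a single non-colluding server learns nothing about i. The query is as long as the database, which is the cost that DPF keys remove.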
PIR from Distributed Point Functions (DPFs)
• Point functions: PF = {f_{i,v} : f_{i,v}(i) = v, f_{i,v}(x) = 0 ∀ x ≠ i}.
• Distributed PFs allow 2 parties to evaluate the PF in secret-shared form, without revealing i, v.
• DPFs are described by short keys k_1, k_2 of length O(log N), where N is the size of the domain of i.
• By using v = 1, i.e., a DPF returning 1 only at index i, we can express the plaintext query q and thus build 2-server PIR with O(log N) communication complexity.
• Instantiated efficiently with AES.
[Diagram, "Intuition: DPF Key Expansion": the short keys k_1 and k_2 expand to the long keys K_1 and K_2.]

Designated-Output PIR
[Diagram: the client holds i and a mask m. It sends (q_1, m) to server #1 and q_2 to server #2. Server #1 sends r_1 ⊕ m to server #2, which computes r_2 ⊕ r_1 ⊕ m = DB[i] ⊕ m.]

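PIR Private Equality Test
[Diagram: the client holds x, i and a mask m. It sends (q_1, m) to server #1 and q_2 to server #2. Server #1 sends r_1 ⊕ m to server #2, which computes r_2 ⊕ r_1 ⊕ m = DB[i] ⊕ m. The client (input x ⊕ m) and server #2 (input DB[i] ⊕ m) run a private equality test (PEQ) that outputs x == DB[i].]
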
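The data flow of designated-output PIR followed by an equality test can be sketched as follows. This is an illustration under stated assumptions: entries are 32-bit integers, explicit bit-vector queries stand in for DPF keys, and a plain `==` stands in for the PEQ protocol (a real PEQ compares the two masked values without revealing them). All names are hypothetical.

```python
import secrets

def xor_select(q, db):
    """XOR of all database entries whose query bit is set (q . DB)."""
    r = 0
    for bit, item in zip(q, db):
        if bit:
            r ^= item
    return r

def pir_peq(x, i, db, bits=32):
    n = len(db)
    # Client: share the unit vector for index i and sample a random mask m.
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = [b ^ (1 if j == i else 0) for j, b in enumerate(q1)]
    m = secrets.randbits(bits)
    # Server #1 receives (q1, m) and forwards r1 XOR m to server #2.
    masked_r1 = xor_select(q1, db) ^ m
    # Server #2 receives q2 and ends up with DB[i] XOR m.
    server_value = xor_select(q2, db) ^ masked_r1
    # Client's PEQ input is x XOR m.
    client_value = x ^ m
    # Stand-in for PEQ (assumption): a direct comparison of the masked values.
    return client_value == server_value

db = [5, 9, 12, 7]
```

Server #2 only ever sees DB[i] masked with m, so it does not learn which entry was retrieved; the comparison result equals x == DB[i] because the mask cancels.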
Cuckoo Hashing
• Server performs Cuckoo hashing.
[Diagram: the server's elements y_1, y_2, …, y_N are inserted one by one into the table bins h(y_1), h(y_2), …, h(y_N). Collision: h(y_1) = h(y_N). With a second hash function, y_1 and y_2 are placed in bins h′(y_1) and h′(y_2).]
• To avoid collisions: use multiple hash functions, in this example h and h′.
• In our implementation we used 3 hash functions and a cuckoo expansion factor of e ≈ 1.4 for a cuckoo failure probability of 2^−20 during one-time initialization.

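A cuckoo table with the stated parameters (3 hash functions, expansion factor about 1.4) can be sketched as below. This is a minimal illustration, not the authors' implementation: the SHA-256-based hash functions, the random-walk eviction strategy, and the names `bin_index`, `cuckoo_build` and `lookup` are all assumptions.

```python
import hashlib
import random

NUM_HASHES = 3     # number of hash functions, as stated on the slide
EXPANSION = 1.4    # table size is about 1.4 times the number of elements

def bin_index(item, k, num_bins):
    """k-th hash function; SHA-256 is an illustrative choice (assumption)."""
    digest = hashlib.sha256(f"{k}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_bins

def cuckoo_build(items, max_evictions=500, seed=0):
    rng = random.Random(seed)
    num_bins = int(EXPANSION * len(items)) + 1
    table = [None] * num_bins
    for item in items:
        current = item
        for _ in range(max_evictions):
            candidates = [bin_index(current, k, num_bins) for k in range(NUM_HASHES)]
            free = [b for b in candidates if table[b] is None]
            if free:
                table[free[0]] = current
                break
            # All candidate bins are occupied: evict one occupant and re-insert it.
            victim = rng.choice(candidates)
            table[victim], current = current, table[victim]
        else:
            raise RuntimeError("cuckoo insertion failed; pick new hash functions")
    return table

def lookup(table, item):
    """An item can only sit in one of its NUM_HASHES candidate bins."""
    return any(table[bin_index(item, k, len(table))] == item
               for k in range(NUM_HASHES))

items = list(range(500))
table = cuckoo_build(items)
```

Because every element lives in one of a few candidate bins, a membership check needs only that many bin lookups, which is what makes per-element PIR queries feasible.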
Cuckoo Hashing
• Every element can be located in two possible bins.
• The client computes all hash functions for every element.
• To check if the server holds x_1, the client runs a PIR-PEQ with the 2nd and 4th bins.
• In the full protocol: instead of single PIR-PEQs, we run all of them together in a PSI protocol.
[Diagram: the client's elements x_1, …, x_n are mapped to their candidate bins in the server's cuckoo table; the two candidate bins under h and h′ are highlighted.]

PIR-PSI Overview
1. Cuckoo Hashing
   • Both servers compute the same cuckoo hash table for their N elements.
2. DPF-PIR Query
   • The client delegates the extraction of n elements from the cuckoo table.
3. Oblivious Shuffle
   • One server receives the other server's masked output and obliviously shuffles the PIR results to hide which cuckoo hash function was used.
4. Small PSI
   • A standard PSI protocol is used to determine the intersection.

Optimizations
• Binning
  • Instead of running full-domain DPFs, we partition the server's cuckoo table into bins and run a smaller DPF per bin.
  • Parallelization!
• Batching
  • Instead of running DPF queries separately, run all queries in each bin in parallel.
  • Only a single pass over the cuckoo table for multiple queries.
• Larger PIR Blocks
  • PIR queries can return multiple cuckoo table entries.
  • Less communication, more computation in PSI.

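PIR-PSI with 3 PIR Servers
• DB = DB_2 ⊕ DB_3
• Server #1 holds DB, receives (k_1, m) and computes (K_1 · DB) ⊕ m.
• Server #2 holds DB_2, receives k_2 and computes K_2 · DB_2.
• Server #3 holds DB_3, receives k_2 and computes K_2 · DB_3.
• Combining the three outputs:
  (K_2 · DB_2) ⊕ (K_2 · DB_3) ⊕ (K_1 · DB) ⊕ m
  = K_2 · (DB_2 ⊕ DB_3) ⊕ (K_1 · DB) ⊕ m
  = (K_2 · DB) ⊕ (K_1 · DB) ⊕ m
  = DB[i] ⊕ m
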
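The three-server identity can be checked with a short sketch. It assumes that DB_2 and DB_3 are random XOR shares of DB, that entries are 32-bit integers, and that explicit bit vectors K_1, K_2 stand in for expanded DPF keys; these choices and all names are illustrative assumptions rather than details from the talk.

```python
import secrets

def xor_select(K, db):
    """K . DB: XOR of all entries whose key bit is set."""
    r = 0
    for bit, item in zip(K, db):
        if bit:
            r ^= item
    return r

def three_server_query(i, db, bits=32):
    n = len(db)
    # Assumed setup: DB is split into XOR shares, DB = DB2 XOR DB3.
    db2 = [secrets.randbits(bits) for _ in range(n)]
    db3 = [a ^ b for a, b in zip(db, db2)]
    # Client: K1 XOR K2 is the unit vector for index i; m is a random mask.
    K1 = [secrets.randbelow(2) for _ in range(n)]
    K2 = [b ^ (1 if j == i else 0) for j, b in enumerate(K1)]
    m = secrets.randbits(bits)
    out1 = xor_select(K1, db) ^ m    # server #1: (K1 . DB) XOR m
    out2 = xor_select(K2, db2)       # server #2: K2 . DB2
    out3 = xor_select(K2, db3)       # server #3: K2 . DB3
    return out1 ^ out2 ^ out3, m     # combined value is DB[i] XOR m

db = [101, 202, 303, 404, 505]
```

The combination works because K · DB is linear over XOR, so K_2 · DB_2 ⊕ K_2 · DB_3 equals K_2 · DB.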
PIR-PSI Performance
• Communication and running time for n = 1024 client elements and server set sizes N ∈ {2^20, 2^24, 2^26, 2^28}.
• Benchmarked in Gigabit LAN, on 1 machine with 36 × 2.3 GHz. Implementation set to use 4 threads.
• Client computation is ≈ 10% of total.
• Parameters for communication / computation trade-off.
[Plot: communication in MiB (y-axis, 0 to 30) versus running time in seconds (x-axis, 0 to 14), one series per N ∈ {2^20, 2^24, 2^26, 2^28}. Data labels (time; communication): 0.1; 2.1 · 0.36; 8.61 · 0.72; 12.7 · 1.6; 28.3 · 0.94; 3.85 · 3.65; 4.28 · 13.22; 4.93]

Conclusion
• Combination of DPF-based PIR with state-of-the-art PSI to achieve scalable contact discovery.
• Efficient open-source C++ implementation on GitHub: github.com/osu-crypto/libPSI
• Many more details in the paper!
  • Security analysis
  • Cuckoo hashing parameters
  • Detailed performance analysis and comparison with related work
  • Extensions

Thank you! Peter Rindal Daniel Demmler Mike Rosulek Ni Trieu
• Extra / Backup slides coming up next…

A Sampling of PSI Over the Decades
[Timeline: 1985 to 2020]
• Diffie-Hellman: x^(αβ) = y^(βα) ⇒ x = y
  • [Meadows86]: private equality test
  • [HubermanFranklinHogg99]: private equality test to PSI
  • [DeCristofaroKimTsudik10]: malicious secure
• Oblivious Polynomial Evaluation: Q(x) := (x − y), Q(x) = 0 ⇒ x = y
  • [NaorPinkas99]
  • [FreedmanNissimPinkas04]: semi-honest PSI, hash-table-based PSI
  • [DachmanMalkinRaykovaYung09]: malicious secure
  • [GhoshJasper17]: malicious secure
• Generic MPC
  • [HuangEvansKatz12]: garbled-circuit-based PSI