Secure Intersection with MapReduce R. Ciucanu 1, M. Giraud 2, P. Lafourcade 2, L. Ye 3 1 LIFO, INSA Centre Val de Loire, Université d’Orléans 2 3 School of Computer Science and Technology, Harbin Institute of Technology, China 26 July 2019 @ SECRYPT, Prague 1/29
Big Data Cloud Service Provider (CSP) 2/29
Model 1: Mutual Private Set Intersection (PSI)
Application: avoid double submissions in conferences
Participants' lists: A and B
Result: both participants receive A ∩ B 3/29
Model 2: One-way PSI
Application: the FBI wants to detect suspicious passengers of an airline company
Passengers lists: A and B
Result: A ∩ B for one party, ∅ for the other 4/29
Model 3: Our PSI Model
Application: Interpol wants the most dangerous persons from the FBI and MI6
Suspects lists: A and B
Result: ∅ for each list holder, A ∩ B for the third party 5/29
Example
Suspects lists: two lists over the names Alice, Oscar, Cesar, Bob and Mallory, with Mallory on both lists
Intersection list: Mallory 6/29
Outline Motivations MapReduce Intersection with MapReduce Security Model and Cryptographic Tools Secure Intersection with MapReduce Performance Evaluation Conclusion 7/29
MapReduce 1
The MapReduce environment takes care of:
◮ Partitioning input data
◮ Scheduling program execution on a set of machines
◮ Handling machine failures
The programmer specifies:
◮ Map and Reduce functions
◮ Input files
1 J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of OSDI 2004. 9/29
MapReduce Example Input 1 Input 2 Input 3 Map 1 Map 2 Map 3 Shuffle Reduce 1 Reduce 2 Output 1 Output 2 10/29
MapReduce in 3 Steps
1. Map tasks
Input: ID of chunk
Output: key-value pairs
2. Master Controller
◮ Key-value pairs aggregated and sorted by key
◮ Pairs with the same key sent to the same Reduce task
3. Reduce tasks
Input: one key
Output: combination of the values associated with the key 11/29
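The three steps above can be simulated on a single machine. The following is a minimal sketch, not the actual framework: the function names (`map_reduce`, `word_count_map`, `word_count_reduce`) and the word-count task are illustrative assumptions that do not appear in the slides.

```python
from collections import defaultdict

def map_reduce(chunks, map_fn, reduce_fn):
    """Single-process simulation of the three MapReduce steps."""
    # Step 1: each Map task turns one chunk into key-value pairs.
    pairs = []
    for chunk in chunks:
        pairs.extend(map_fn(chunk))

    # Step 2: the master controller groups the pairs by key, so that
    # all values sharing a key reach the same Reduce task.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    # Step 3: each Reduce task combines the values of one key.
    # Keys are visited in sorted order, as on the slide.
    output = []
    for key in sorted(groups):
        output.extend(reduce_fn(key, groups[key]))
    return output

# Hypothetical example task: counting words.
def word_count_map(chunk):
    return [(word, 1) for word in chunk.split()]

def word_count_reduce(key, values):
    return [(key, sum(values))]

counts = map_reduce(["a b a", "b c"], word_count_map, word_count_reduce)
```

In a real deployment the environment distributes the Map and Reduce tasks over several machines; here they run sequentially only to show the data flow.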
Intersection with MapReduce 2
3 participants
NSA: F654, U840, X098
GCHQ: F654, M349, P027
Mossad: F654, M349, U840
2 J. Leskovec, A. Rajaraman and J. D. Ullman. Mining of Massive Datasets. Cambridge University Press. 13/29
Intersection with MapReduce
Data owners (NSA, GCHQ, Mossad) → public cloud (Map, Master Controller, Reduce) → user (Interpol)
Key F654: values F654, F654, F654 → Reduce outputs F654 to Interpol
Key M349: values M349, M349
Key P027: value P027
Key U840: values U840, U840
Key X098: value X098
Reduce function: it returns the value only if #values = #participants 14/29
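The Reduce rule above (return a value only if #values = #participants) can be sketched as follows. This is an illustration under two assumptions: each relation contains no duplicate tuples, and the Map function emits each tuple as both key and value. The function names are hypothetical.

```python
from collections import defaultdict

# The three relations from the example slide.
relations = {
    "NSA": ["F654", "U840", "X098"],
    "GCHQ": ["F654", "M349", "P027"],
    "Mossad": ["F654", "M349", "U840"],
}

def map_tuple(t):
    # Map: emit the tuple as both key and value.
    return (t, t)

def reduce_key(key, values, n_participants):
    # Reduce: keep the key only if every participant contributed it.
    # Assumes no relation contains the same tuple twice.
    return [key] if len(values) == n_participants else []

def intersect(relations):
    groups = defaultdict(list)
    for tuples in relations.values():
        for t in tuples:
            key, value = map_tuple(t)
            groups[key].append(value)
    n = len(relations)
    result = []
    for key in sorted(groups):
        result.extend(reduce_key(key, groups[key], n))
    return result

result = intersect(relations)
```

Here the cloud sees every tuple in the clear, which is exactly what the security model in the next part aims to prevent.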
Security Model
Cloud is honest-but-curious
Data owner → (relations) → Cloud → (intersection) → User
Without security, the Cloud learns:
◮ Content of relations
◮ Intersection result 16/29
Cryptographic Tools
Pseudorandom function f : K × D → R
◮ Deterministic
◮ Indistinguishable from a random function
Notation: [m]_k = f(k, m) 17/29
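As an illustration of the notation [m]_k, the sketch below instantiates f with HMAC-SHA256 from the Python standard library. This choice is an assumption: the slides do not say which pseudorandom function the authors use, and the key values are hypothetical.

```python
import hashlib
import hmac

def prf(key: bytes, message: bytes) -> bytes:
    """[m]_k = f(k, m), instantiated here with HMAC-SHA256 (an assumption)."""
    return hmac.new(key, message, hashlib.sha256).digest()

# Hypothetical keys, for illustration only.
k1 = b"hypothetical-key-1"
k2 = b"hypothetical-key-2"

tag_a = prf(k1, b"F654")
tag_b = prf(k1, b"F654")  # same key, same input: same output (deterministic)
tag_c = prf(k2, b"F654")  # a different key gives an unrelated-looking output
```

Determinism is what later lets two data owners who share a key derive the same MapReduce key for the same element without revealing the element itself.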
Cryptographic Tools
Asymmetric encryption scheme
◮ (pk, sk) ← G(λ)
◮ c ← E(pk, m)
◮ m ← D(sk, c)
Correctness: D(sk, E(pk, m)) = m
Notation: {m} = E(pk, m) 18/29
Secure Intersection with MapReduce
Setting
◮ n relations: R_1, ..., R_n
◮ R_1 has: the PRF secret keys k_1, ..., k_n, and pk
◮ R_i (for 2 ≤ i ≤ n) has: k_1 and k_i
Preprocessing
◮ One main relation (R_1) uses the public key of the final user. For each element x, it computes the key-value pair:
([x]_{k_1}, {x} ⊕ (⊕_{i=2}^{n} [x]_{k_i}))
◮ Each other relation R_i computes the key-value pair:
([x]_{k_1}, [x]_{k_i}) 20/29
Secure Intersection with MapReduce
Processed relations
NSA∗
([F654]_{k_1}, {F654} ⊕ [F654]_{k_2} ⊕ [F654]_{k_3})
([U840]_{k_1}, {U840} ⊕ [U840]_{k_2} ⊕ [U840]_{k_3})
([X098]_{k_1}, {X098} ⊕ [X098]_{k_2} ⊕ [X098]_{k_3})
GCHQ∗
([F654]_{k_1}, [F654]_{k_2})
([M349]_{k_1}, [M349]_{k_2})
([P027]_{k_1}, [P027]_{k_2})
Mossad∗
([F654]_{k_1}, [F654]_{k_3})
([M349]_{k_1}, [M349]_{k_3})
([U840]_{k_1}, [U840]_{k_3}) 21/29
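The preprocessing that produces the processed relations can be sketched as below. Several parts are assumptions rather than the authors' implementation: the PRF is instantiated with HMAC-SHA256, `stand_in_encrypt` is a placeholder that only returns 32 deterministic bytes and is not an encryption scheme, and ciphertexts are assumed to have the same length as PRF outputs so that they can be XORed. All function names and key values are hypothetical.

```python
import hashlib
import hmac

def prf(key: bytes, message: bytes) -> bytes:
    # [m]_k, instantiated with HMAC-SHA256 (an assumption).
    return hmac.new(key, message, hashlib.sha256).digest()

def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

def stand_in_encrypt(message: bytes) -> bytes:
    # PLACEHOLDER for {m} = E(pk, m): NOT encryption, cannot be decrypted.
    # It only produces 32 bytes so the XOR structure can be shown.
    return hashlib.sha256(b"stand-in:" + message).digest()

def preprocess_main(relation, keys, encrypt):
    """Main relation R_1: pair ([x]_{k_1}, {x} XOR [x]_{k_2} XOR ... XOR [x]_{k_n})."""
    pairs = []
    for x in relation:
        m = x.encode()
        value = encrypt(m)
        for k_i in keys[1:]:
            value = xor_bytes(value, prf(k_i, m))
        pairs.append((prf(keys[0], m), value))
    return pairs

def preprocess_other(relation, k_1, k_i):
    """Relation R_i (i >= 2): pair ([x]_{k_1}, [x]_{k_i})."""
    return [(prf(k_1, x.encode()), prf(k_i, x.encode())) for x in relation]

keys = [b"demo-k1", b"demo-k2", b"demo-k3"]  # hypothetical k_1, k_2, k_3
nsa = preprocess_main(["F654", "U840", "X098"], keys, stand_in_encrypt)
gchq = preprocess_other(["F654", "M349", "P027"], keys[0], keys[1])
mossad = preprocess_other(["F654", "M349", "U840"], keys[0], keys[2])
```

Because every owner derives the MapReduce key with the shared k_1, equal elements meet at the same Reduce task, while the cloud only sees PRF outputs and masked ciphertexts.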
Secure Intersection with MapReduce
Data owners (NSA∗, GCHQ∗, Mossad∗) → public cloud (Map, Master Controller, Reduce) → user (Interpol, holding (sk, pk))
Key [F654]_{k_1}: values {F654} ⊕ [F654]_{k_2} ⊕ [F654]_{k_3}, [F654]_{k_2}, [F654]_{k_3} → Reduce outputs {F654} to Interpol
Key [M349]_{k_1}: values [M349]_{k_2}, [M349]_{k_3}
Key [P027]_{k_1}: value [P027]_{k_2}
Key [U840]_{k_1}: values {U840} ⊕ [U840]_{k_2} ⊕ [U840]_{k_3}, [U840]_{k_3}
Key [X098]_{k_1}: value {X098} ⊕ [X098]_{k_2} ⊕ [X098]_{k_3} 22/29
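The diagram suggests that the Reduce task recovers {F654} by XORing the values of a key once all participants have contributed. The sketch below illustrates that cancellation under this reading. The byte strings are arbitrary stand-ins for a ciphertext and two PRF outputs, and the function names are hypothetical.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

def secure_reduce(values, n_participants):
    """Return the XOR of all values if every participant contributed one,
    otherwise None (the element is not in the intersection)."""
    if len(values) != n_participants:
        return None
    return reduce(xor_bytes, values)

# Arbitrary stand-ins, for illustration only.
ciphertext = bytes.fromhex("aa" * 16)  # plays the role of {F654}
pad2 = bytes.fromhex("11" * 16)        # plays the role of [F654]_{k_2}
pad3 = bytes.fromhex("5c" * 16)        # plays the role of [F654]_{k_3}
masked = xor_bytes(xor_bytes(ciphertext, pad2), pad3)

in_all = secure_reduce([masked, pad2, pad3], 3)  # element in all 3 relations
in_two = secure_reduce([masked, pad3], 3)        # element in only 2 relations
```

The PRF masks cancel only when all of them are present, so the cloud outputs a ciphertext solely for elements of the intersection, and only the user holding sk can decrypt it.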
Experimental Results
Settings
◮ 3.2.0 / Standalone mode / Streaming
◮ 16.04 LTS
◮ Map and Reduce functions in
Hardware
◮ 4 CPU @ 2.4 GHz
◮ 80 GB of disk
◮ 8 GB of RAM
Experiments
1. Varying the number of tuples
2. Varying the number of intersected relations 24/29
Results: Varying the Number of Tuples
[Plot: CPU time (s), axis from 0 to 2,000, against the number of tuples (in millions), axis from 0.5 to 3; curves: Standard protocol 3 and Secure Intersection]
3 J. Leskovec, A. Rajaraman and J. D. Ullman. Mining of Massive Datasets. Cambridge University Press. 25/29
Results: Varying the Number of Intersected Relations
[Plot: CPU time (s), axis from 0 to 500, against the number of intersected relations, axis from 2 to 10 (500,000 tuples / relation); curves: Standard protocol 4 and Secure Intersection]
4 J. Leskovec, A. Rajaraman and J. D. Ullman. Mining of Massive Datasets. Cambridge University Press. 26/29
Conclusion and Future Works Conclusion ◮ Design of secure intersection with MapReduce ◮ Collision resistance ◮ Practical scalability Future Works ◮ Apache Spark environment ◮ Malicious model 28/29
Thank you for your attention. Any questions? pascal.lafourcade@uca.fr 29/29