Learning to Reconstruct: Statistical Learning Theory and Encrypted Database Attacks
Paul Grubbs, Marie-Sarah Lacharité, Brice Minaud, Kenny Paterson
CASYS team seminar, Grenoble, December 2018
Outsourcing Data with Search Capabilities
[Figure: the client uploads data to the server, sends search queries, and receives the matching records]
For an encrypted database management system:
• Data = collection of records, e.g. health records.
• Basic query examples:
  - find records with a given value, e.g. patients aged 57;
  - find records within a given range, e.g. patients aged 55-65.
Searchable Encryption
[Figure: the client sends search queries to an adversarial server and receives the matching records]
Adversary: honest-but-curious host server.
Security goal: privacy of data and queries.
Very active topic in research and industry: [AKSX04], [BCLO09], [PKV+14], [BLR+15], [NKW15], [KKNO16], [LW16], [FVY+17], [SDY+17], [DP17], [HLK18], [PVC18], [MPC+18]…
Security Model of Searchable Encryption
[Figure: for each search query, the adversarial server learns F(query, DB)]
Generic solutions (FHE) are infeasible at scale → for efficiency reasons, some leakage is allowed.
Security model: parametrized by a leakage function F.
The server learns nothing except the output of the leakage function.
Implications of the Leakage Function
In practice: nearly all practical schemes leak at least the set of records matching each query = access pattern leakage.
Examples: OPE and ORE schemes, POPE, [HK16], BlindSeer, [Lu12], [FJ+15], FH-OPE, Lewi-Wu, Arx, Cipherbase, EncKV, …
What are the security implications of this leakage?
In this talk: focus on range queries.
‣ Fundamental for any encrypted DB system.
‣ Many constructions out there.
‣ Simplest type of query that can't "just" be handled by an index (cf. Symmetric Searchable Encryption).
Range Queries
[Figure: the client queries the range [40,100] on a database with records 1→45, 2→6, 3→83, 4→28; the server returns the matching records 1 and 3]
What can the server learn from the above leakage?
Let's specify the problem: say records take N possible values, and queries are uniformly distributed.
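To make the leakage model concrete, here is a minimal Python sketch (our illustration, not from the talk; the value of N and the toy database are assumptions): the server's entire view is the set of matching record identifiers for each uniformly drawn range.

```python
# Sketch of the leakage model: records take values in {1, ..., N}, queries
# are uniform ranges, and the server observes only the set of matching
# record identifiers per query (access pattern leakage).
import random

N = 100                            # number of possible values (assumption)
db = {1: 45, 2: 6, 3: 83, 4: 28}   # record id -> secret value (toy data)

def uniform_range(n):
    """Sample a range [a, b] uniformly among the n*(n+1)/2 possible ranges
    (rejection sampling keeps the distribution exactly uniform)."""
    while True:
        a, b = random.randint(1, n), random.randint(1, n)
        if a <= b:
            return (a, b)

def access_pattern(db, a, b):
    """What the server sees: ids of records whose value lies in [a, b]."""
    return frozenset(rid for rid, v in db.items() if a <= v <= b)

# The adversary's entire view is a stream of access patterns:
views = [access_pattern(db, *uniform_range(N)) for _ in range(1000)]
```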
Database Reconstruction
Strongest goal: full database reconstruction = recovering the exact value of every record.
More general: approximate database reconstruction = recovering all values within εN.
ε = 1/N is full recovery. ε = 0.05 is recovery within 5%.
("Sacrificial" recovery: values very close to 1 and N are excluded.)
[KKNO16]: full reconstruction in O(N⁴ log N) queries, assuming i.i.d. uniform queries!
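As a simplified illustration of why uniform queries leak so much (our sketch of the underlying principle, not the [KKNO16] algorithm itself): under uniform range queries on {1, …, N}, a record with value v matches a v(N−v+1)/(N(N+1)/2) fraction of queries, so its empirical match frequency determines v up to reflection about (N+1)/2.

```python
# Estimate a record's value from how often it appears in query responses.
# Inverts p = v*(n - v + 1) / (n*(n+1)/2); illustration only, not [KKNO16].
import math

def candidates_from_frequency(p_hat, n):
    """Return the two candidate values consistent with match frequency p_hat.

    They are symmetric about (n+1)/2: access pattern leakage alone cannot
    break this reflection symmetry.
    """
    total_ranges = n * (n + 1) / 2
    # v*(n+1) - v^2 = p_hat * total  =>  v^2 - (n+1)*v + p_hat*total = 0
    c = p_hat * total_ranges
    disc = (n + 1) ** 2 - 4 * c
    root = math.sqrt(max(disc, 0.0))
    return round(((n + 1) - root) / 2), round(((n + 1) + root) / 2)

# e.g. with N = 100, a record matching ~28.4% of uniform range queries:
print(candidates_from_frequency(0.284, 100))  # (17, 84): v and its mirror
```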
Database Reconstruction
[KKNO16]: full reconstruction in O(N⁴ log N) queries!
This talk ([GLMP19], subsuming [LMP18]), with the full-reconstruction cost (the bound at ε = 1/N) and the lower bound for each:
‣ O(ε⁻⁴ log ε⁻¹) for approx. reconstruction: full rec. O(N⁴ log N), lower bound Ω(ε⁻⁴).
‣ O(ε⁻² log ε⁻¹) with a very mild hypothesis: full rec. O(N² log N), lower bound Ω(ε⁻²).
‣ O(ε⁻¹ log ε⁻¹) for approx. order reconstruction: full rec. O(N log N), lower bound Ω(ε⁻¹ log ε⁻¹).
The last bound implies full reconstruction in O(N log N) queries for dense DBs.
Scale-free: does not depend on the size of the DB or the number of possible values.
→ Recovering all values in the DB within 5% costs O(1) queries!
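For a rough feel of the scale-free claim, a back-of-the-envelope computation (constants inside the O(·) are suppressed, so only the orders of magnitude are meaningful):

```python
# Order-of-magnitude query counts for the scale-free bounds at eps = 0.05,
# i.e. recovering every value within 5%, for any DB size and any N.
import math

eps = 0.05
print(eps**-4 * math.log(1 / eps))  # ~4.8e5: approx. reconstruction
print(eps**-2 * math.log(1 / eps))  # ~1.2e3: with the mild hypothesis
print(eps**-1 * math.log(1 / eps))  # ~60:    approx. order reconstruction
```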
Database Reconstruction
Main tool of this talk:
- connection with statistical learning theory;
- especially, VC theory.
VC Theory
VC Theory
Foundational paper: Vapnik and Chervonenkis, 1971. A uniform convergence result.
Now a foundation of learning theory, especially PAC (probably approximately correct) learning.
Wide applicability. Fairly easy to state and use. (You don't have to read the original article in Russian.)
Warm-up
Set X with probability distribution D. Let C ⊆ X. Call it a concept.
Pr(C) ≈ (#points in C) / (#points total)
Sample complexity: to measure Pr(C) within ε, you need O(1/ε²) samples.
[Figure: a concept C drawn as a region inside the set X, with sampled points]
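A minimal Monte Carlo sketch of the warm-up statement; X, D, and C below are arbitrary illustrative choices, and the constant in O(1/ε²) is taken to be 1:

```python
# Estimate Pr(C) from samples. With m = O(1/eps^2) samples, the empirical
# frequency is within eps of Pr(C) with high probability (Hoeffding bound).
import random

def pr_estimate(concept, sample, m):
    """Empirical probability of `concept` over m draws of `sample()`."""
    return sum(concept(sample()) for _ in range(m)) / m

sample = random.random              # D = uniform on X = [0, 1)
concept = lambda x: 0.2 <= x < 0.5  # C = [0.2, 0.5), so Pr(C) = 0.3

eps = 0.05
m = int(1 / eps**2)                 # O(1/eps^2) samples, constant = 1
print(pr_estimate(concept, sample, m))  # typically within 0.3 ± 0.05
```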