  1. Learning to Reconstruct: Statistical Learning Theory and Encrypted Database Attacks Paul Grubbs, Marie-Sarah Lacharité, Brice Minaud, Kenny Paterson. eprint 2019/011 and IEEE S&P 2019. C2 seminar, Rennes, 2019

  2. Outsourcing Data Data upload Data access Client Server Sensitive data → encryption needed. An encrypted database is of little use if it cannot be searched. → Searchable Encryption. Examples: private message server; company/hospital outsourcing client/patient info.

  3. Searchable Encryption Data upload Data access Client Adversarial Server Adversary: honest-but-curious host server. Security goal: confidentiality of data and queries. Very active topic in research and industry. [AKSX04], [BCLO09], [PKV+14], [BLR+15], [NKW15], [KKNO16], [LW16], [FVY+17], [SDY+17], [DP17], [HLK18], [PVC18], [MPC+18]…

  4. Security Model Data upload Data access Client Adversarial Server The server learns L(query, DB). Generic solutions (FHE) are infeasible at scale → for efficiency reasons, some leakage is allowed. Security model: parametrized by a leakage function L. The server learns nothing except the output of the leakage function.

  5. Security Model [Diagram: in the real world, the adversary observes the client's queries q to the server; in the ideal world, a simulator produces the adversary's view from the leakage L(q) alone.]

  6. Keyword Search Symmetric Searchable Encryption (SSE) = keyword search: • Data = collection of documents, e.g. messages. • Search query = find documents containing given keyword(s). Efficient solutions exist for leakage = search pattern + access pattern. Some active topics: - Forward and backward privacy [B16][BMO17][CPPJ18][SYL+18]… - Locality [CT14][ANSS16][DPP18]…
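The search-pattern and access-pattern leakage mentioned on this slide can be mocked up in a few lines. This is my own toy illustration, not code from the talk: deterministic search tokens reveal when a query repeats (search pattern), and the set of matching document identifiers is revealed on every query (access pattern); the keywords themselves stay hidden.

```python
# Toy SSE leakage simulation (illustrative only; document contents and
# keyword names are invented for the example).
docs = {0: {"flu", "cough"}, 1: {"flu"}, 2: {"fracture"}}

seen_tokens = {}  # keyword -> deterministic token; repeats leak the search pattern

def search(keyword):
    token = seen_tokens.setdefault(keyword, len(seen_tokens))
    matching_ids = sorted(i for i, kws in docs.items() if keyword in kws)
    # The server observes (token, matching_ids), not the keyword itself.
    return token, matching_ids

print(search("flu"))       # (0, [0, 1])
print(search("fracture"))  # (1, [2])
print(search("flu"))       # (0, [0, 1])  same token again: search pattern leaks
```

Repeating a query yields the same token and the same id set, which is exactly the information an honest-but-curious server accumulates over time.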

  7. Beyond Keyword Search Data upload Search query Matching records Server Client For an encrypted database management system: • Data = collection of records, e.g. health records. • Basic query examples: - find records with a given value, e.g. patients aged 57; - find records within a given range, e.g. patients aged 55-65.

  8. Range Queries In this talk: range queries. ‣ Fundamental for any encrypted DB system. ‣ Many constructions out there. ‣ Simplest type of query that can't “just” be handled by an index. Initial solutions: Order-Preserving and Order-Revealing Encryption. Leakage-abuse attacks: order information can be used to infer (approximate) values. Leaking order is too revealing. → “Second-generation” schemes enable range queries without relying on OPE/ORE. They still leak the access pattern.

  9. Range Queries [Diagram: the client queries the range [40,100]; among records with values 3, 45, 83, 28, the server returns the matching records (values 45 and 83).] What can the server learn from the above leakage?
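The leakage in this picture can be sketched concretely. The snippet below is my own model, reusing the slide's example values: the server never sees the plaintext values or the range endpoints, only which record identifiers match each range query.

```python
# Toy access-pattern leakage for range queries (my sketch, not the paper's code).
# Record id -> secret value; attribute values range over 1..N with N = 100.
values = {1: 3, 2: 45, 3: 83, 4: 28}

def range_query(lo, hi):
    # The server observes only this set of matching record ids.
    return sorted(rid for rid, v in values.items() if lo <= v <= hi)

print(range_query(40, 100))  # [2, 3]
```

Across many queries, the pattern of which ids co-occur in responses is exactly the signal the reconstruction attacks on the following slides exploit.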

  10. Database Reconstruction Let N = number of possible values for the target attribute. Strongest goal: full database reconstruction = recovering the exact value of every record. More general: approximate database reconstruction = recovering all values within εN. ε = 0.05 is recovery within 5%; ε = 1/N is full recovery. (“Sacrificial” recovery: values very close to 1 and N are excluded.) [KKNO16]: full reconstruction in O(N⁴ log N) queries, assuming i.i.d. uniform queries!
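The ε-approximate reconstruction goal defined on this slide has a simple operational reading, which the sketch below makes explicit (the record values are invented for illustration): an attack succeeds at level ε if every guessed value is within εN of the truth.

```python
# Success predicate for approximate database reconstruction (my formulation
# of the slide's definition; example data is hypothetical).
def is_eps_approximate(true_vals, guesses, N, eps):
    # Every record's guess must lie within eps * N of its true value.
    return all(abs(true_vals[r] - guesses[r]) <= eps * N for r in true_vals)

true_vals = {1: 57, 2: 60, 3: 62}   # e.g. patient ages, N = 100 possible values
guesses   = {1: 55, 2: 63, 3: 60}   # attacker's reconstruction

print(is_eps_approximate(true_vals, guesses, N=100, eps=0.05))  # True
print(is_eps_approximate(true_vals, guesses, N=100, eps=0.01))  # False
```

Setting eps = 1/N recovers the full-reconstruction goal, since the tolerance then shrinks to a single value.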

  11. Database Reconstruction [KKNO16]: full reconstruction in O(N⁴ log N) queries! This talk ([GLMP19], [LMP18]): ‣ Approx. reconstruction: O(ε⁻⁴ log ε⁻¹) queries (full rec.: O(N⁴ log N); lower bound: Ω(ε⁻⁴)). ‣ With a very mild hypothesis: O(ε⁻² log ε⁻¹) (full rec.: O(N² log N); lower bound: Ω(ε⁻²)). ‣ Approx. order reconstruction: O(ε⁻¹ log ε⁻¹) (full rec.: O(N log N); lower bound: Ω(ε⁻¹ log ε⁻¹)) — implies full reconstruction in O(N log N) queries for dense DBs. Scale-free: does not depend on the size of the DB or the number of possible values. → Recovering all values in the DB within 5% costs O(1) queries!

  12. Database Reconstruction [KKNO16]: full reconstruction in O(N⁴ log N) queries! This talk ([GLMP19], subsuming [LMP18]): the bounds from the previous slide. Main tool: - a connection with statistical learning theory; - especially, VC theory.

  13. VC Theory

  14. VC Theory Foundational paper: Vapnik and Chervonenkis, 1971. A uniform convergence result. Now a foundation of learning theory, especially PAC (probably approximately correct) learning. Wide applicability. Fairly easy to state and use. (You don't have to read the original article in Russian.)

  15. Warm-up Set X with probability distribution D. Let C ⊆ X; call it a concept. Pr(C) ≈ (#points in C) / (#points total). Sample complexity: to estimate Pr(C) within ε, you need O(1/ε²) samples.
