

  1. Security and Data Privacy Instructor: Pratiksha Thaker cs245.stanford.edu

  2. Outline Security requirements Key concepts and tools Differential privacy Other security tools CS 245 2

  3. Outline Security requirements Key concepts and tools Differential privacy Other security tools CS 245 3

  4. Why Security & Privacy? CS 245 4

  5. Why Security & Privacy? Data is valuable & can cause harm if released » Example: medical records, purchase history, internal company documents, etc Data releases can’t usually be “undone” Security policies can be complex » Each user can only see data from their friends » Analyst can only query aggregate data » Users can ask to delete their derived data CS 245 5

  6. Why Security & Privacy? It’s the law! New regulations about user data: US HIPAA: Health Insurance Portability & Accountability Act (1996) » Mandatory encryption, access control, training EU GDPR, CA CCPA: (2018) » Users can ask to see & delete their data PCI DSS: Payment Card Industry standard (2004) » Required in contracts with MasterCard, etc CS 245 6

  7. Consequence Security and privacy must be baked into the design of data-intensive systems » Often a key differentiator for products! CS 245 7

  8. The Good News Declarative interface to many data-intensive systems can enable powerful security features » One of the “big ideas” in our class! Example: System R’s access control on views: users issue SQL queries against a view, and the view mediates which reads and writes reach the underlying tables CS 245 8
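
Not from the slides: a minimal sketch of the view idea using Python's built-in sqlite3 module, with a hypothetical grades table and course_summary view. In a System R-style DBMS an analyst would be granted SELECT on the view only (e.g. GRANT SELECT ON course_summary TO analyst), never on the base table; SQLite has no GRANT, so the sketch only shows how the view restricts what a query can see.

    import sqlite3

    # Hypothetical schema: individual grades that only the instructor should see.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE grades (student TEXT, course TEXT, grade TEXT)")
    conn.executemany(
        "INSERT INTO grades VALUES (?, ?, ?)",
        [("Alice", "CS 245", "A"), ("Bob", "CS 245", "B+"), ("Carol", "CS 245", "A-")],
    )

    # The view exposes only per-course counts, never individual rows.
    conn.execute("""
        CREATE VIEW course_summary AS
        SELECT course, grade, COUNT(*) AS num_students
        FROM grades
        GROUP BY course, grade
    """)

    # An analyst restricted to the view sees aggregates only.
    for row in conn.execute("SELECT * FROM course_summary ORDER BY grade"):
        print(row)  # ('CS 245', 'A', 1), ('CS 245', 'A-', 1), ('CS 245', 'B+', 1)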

  9. Outline Security requirements Key concepts and tools Differential privacy Other security tools CS 245 9

  10. Some Security Goals Access Control: only the “right” users can perform various operations; typically relies on: » Authentication: a way to verify user identity (e.g. a password) » Authorization: a way to specify which actions each user may take (e.g. file permissions) Auditing: the system records an incorruptible audit trail of who performed each action CS 245 10
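
Not on the slide, but to make the authentication piece concrete: a minimal Python sketch of salted password hashing with the standard library (the function names are illustrative). The server stores only the salt and digest, never the password itself.

    import hashlib
    import secrets

    def hash_password(password):
        # Store a random salt plus a slow, salted hash -- never the password itself.
        salt = secrets.token_bytes(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
        return salt, digest

    def verify_password(password, salt, digest):
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
        # Constant-time comparison avoids leaking information via timing.
        return secrets.compare_digest(candidate, digest)

    salt, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, digest))  # True
    print(verify_password("bobby-t-guess", salt, digest))                 # False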

  11. Some Security Goals Confidentiality: data is inaccessible to external parties (often via cryptography) Integrity: data can’t be modified by external parties Privacy: only a limited amount of information about “individual” users can be learned CS 245 11

  12. Clarifying These Goals Say our goal was access control: only Matei can set CS 245 student grades on Axess. What scenarios should Axess protect against? 12

  13. Clarifying These Goals Say our goal was access control: only Matei can set CS 245 student grades on Axess. What scenarios should Axess protect against? 1. Bobby T. (an evil student) logging into Axess as himself and being able to change grades 13

  14. Clarifying These Goals Say our goal was access control: only Matei can set CS 245 student grades on Axess. What scenarios should Axess protect against? 1. Bobby T. (an evil student) logging into Axess as himself and being able to change grades 2. Bobby sending hand-crafted network packets to Axess to change his grades 14

  15. Clarifying These Goals Say our goal was access control: only Matei can set CS 245 student grades on Axess. What scenarios should Axess protect against? 1. Bobby T. (an evil student) logging into Axess as himself and being able to change grades 2. Bobby sending hand-crafted network packets to Axess to change his grades 3. Bobby getting a job as a DB admin at Axess 15

  16. Clarifying These Goals Say our goal was access control: only Matei can set CS 245 student grades on Axess. What scenarios should Axess protect against? 1. Bobby T. (an evil student) logging into Axess as himself and being able to change grades 2. Bobby sending hand-crafted network packets to Axess to change his grades 3. Bobby getting a job as a DB admin at Axess 4. Bobby guessing Matei’s password 16

  17. Clarifying These Goals Say our goal was access control: only Matei can set CS 245 student grades on Axess. What scenarios should Axess protect against? 1. Bobby T. (an evil student) logging into Axess as himself and being able to change grades 2. Bobby sending hand-crafted network packets to Axess to change his grades 3. Bobby getting a job as a DB admin at Axess 4. Bobby guessing Matei’s password 5. Bobby blackmailing Matei to change his grade 17

  18. Clarifying These Goals Say our goal was access control: only Matei can set CS 245 student grades on Axess. What scenarios should Axess protect against? 1. Bobby T. (an evil student) logging into Axess as himself and being able to change grades 2. Bobby sending hand-crafted network packets to Axess to change his grades 3. Bobby getting a job as a DB admin at Axess 4. Bobby guessing Matei’s password 5. Bobby blackmailing Matei to change his grade 6. Bobby discovering a flaw in AES to do #2 18

  19. Threat Models To meaningfully reason about security, need a threat model: what adversaries may do » Same idea as failure models! For example, in our Axess scenario, assume: » Adversaries only interact with Axess through its public API » No crypto algorithm or software bugs » No password theft Implementing complex security policies can be hard even with these assumptions! CS 245 19

  20. Threat Models No useful threat model can cover everything » Goal is to cover the most feasible scenarios for adversaries to increase the cost of attacks Threat models also let us divide security tasks across different components » E.g. auth system handles passwords, 2FA CS 245 20

  21. Threat Models CS 245 Source: XKCD.com 21

  22. Useful Building Blocks Encryption: encode data so that only parties with a key can efficiently decrypt Cryptographic hash functions: hard to find items with a given hash (or collisions) Secure channels (e.g. TLS): confidential, authenticated communication for 2 parties CS 245 22
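
As a small illustration (not from the slides), here is how the hash-function building block looks with Python's standard library; a keyed HMAC is included as the simplest integrity check. The record contents are made up, and real confidentiality and secure channels would come from a vetted encryption library and TLS.

    import hashlib
    import hmac
    import secrets

    record = b"student=Bobby T.;course=CS 245;grade=B-"

    # Cryptographic hash: finding another input with this digest is infeasible,
    # so the digest acts as a tamper-evident fingerprint of the record.
    fingerprint = hashlib.sha256(record).hexdigest()

    # Keyed MAC: only holders of the key can produce a valid tag, so an
    # external party cannot modify the record and forge a matching tag.
    key = secrets.token_bytes(32)
    tag = hmac.new(key, record, hashlib.sha256).digest()
    assert hmac.compare_digest(tag, hmac.new(key, record, hashlib.sha256).digest())

    print(fingerprint)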

  23. Security Tools from DBMSs First-class concept of users + access control » Views as in System R, tables, etc Secure channels for network communication Audit logs for analysis Encrypt data on-disk (perhaps at OS level) CS 245 23
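
One common way to make an audit trail tamper-evident (not claimed to be what any particular DBMS does) is to hash-chain the entries, so later modifications are detectable. A minimal sketch with hypothetical log entries:

    import hashlib
    import json

    def append_entry(log, user, action):
        # Each entry commits to the previous entry's hash, so editing or deleting
        # an earlier entry invalidates every later hash in the chain.
        prev_hash = log[-1]["hash"] if log else "0" * 64
        entry = {"user": user, "action": action, "prev": prev_hash}
        entry["hash"] = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
        log.append(entry)

    def verify(log):
        prev_hash = "0" * 64
        for entry in log:
            body = {k: v for k, v in entry.items() if k != "hash"}
            expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if body["prev"] != prev_hash or expected != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True

    log = []
    append_entry(log, "matei", "UPDATE grades SET grade = 'A' WHERE student = 'Alice'")
    append_entry(log, "dbadmin", "GRANT SELECT ON course_summary TO analyst")
    print(verify(log))  # True; tampering with either entry makes this False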

  24. Modern Tools for Security Privacy metrics and enforcement thereof (e.g. differential privacy) Computing on encrypted data (e.g. CryptDB) Hardware-assisted security (e.g. enclaves) Multi-party computation (e.g. secret sharing) CS 245 24

  25. Outline Security requirements Key concepts and tools Differential privacy Other security tools CS 245 25

  26. Threat Model Data analysts send queries to a database server holding a table with private data • Database software is working correctly • Adversaries only access it through the public API • Adversaries have a limited # of user accounts CS 245 26

  27. Private statistics SELECT AVG(income) FROM professors WHERE state = 'California' 27

  28. Private statistics Are aggregate statistics more private than individual data? SELECT AVG(income) FROM professors WHERE state = 'California' SELECT AVG(income) FROM professors WHERE name = 'Matei Zaharia' 28

  29. Private statistics Are aggregate statistics more private than individual data? No! SELECT AVG(income) FROM professors WHERE state = 'California' SELECT AVG(income) FROM professors WHERE name = 'Matei Zaharia' 29

  30. Private statistics 30

  31. Private statistics 31
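
To make the point concrete, here is a hypothetical differencing-attack sketch in Python (the names and incomes are made up): two purely aggregate queries whose difference reveals one person's exact value.

    # Hypothetical table: the attacker knows who is in it, but not the target's income.
    incomes = {"Prof. A": 200_000, "Prof. B": 230_000, "Target": 320_000}

    def avg_income(names):
        # Stand-in for: SELECT AVG(income) FROM professors WHERE name IN (...)
        return sum(incomes[n] for n in names) / len(names)

    everyone = list(incomes)
    everyone_but_target = [n for n in everyone if n != "Target"]

    q1 = avg_income(everyone)             # aggregate over all rows
    q2 = avg_income(everyone_but_target)  # aggregate over all rows but one

    # The difference of the two aggregates recovers the target's exact income.
    print(q1 * len(everyone) - q2 * len(everyone_but_target))  # 320000.0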

  32. Idea: differential privacy A contract for algorithms that output statistics 32

  33. Idea: differential privacy A contract for algorithms that output statistics Intuition: the function is differentially private if removing or changing a data point does not change the output "too much" 33

  34. Idea: differential privacy A contract for algorithms that output statistics Intuition: the function is differentially private if removing or changing a data point does not change the output "too much" Intuition: plausible deniability 34

  35. Idea: differential privacy A contract for algorithms that output statistics For A and B that differ in one element, Pr[M(A) ∈ S] ≤ e^ε · Pr[M(B) ∈ S] 35

  36. Idea: differential privacy A contract for algorithms that output statistics For A and B that differ in one element, Pr[M(A) ∈ S] ≤ e^ε · Pr[M(B) ∈ S] M: randomized algorithm that computes the statistic 36

  37. Idea: differential privacy A contract for algorithms that output statistics For A and B that differ in one element, Pr[M(A) ∈ S] ≤ e^ε · Pr[M(B) ∈ S] S: any subset of possible outcomes 37

  38. Idea: differential privacy A contract for algorithms that output statistics For A and B that differ in one element, Pr[M(A) ∈ S] ≤ e^ε · Pr[M(B) ∈ S] ε: privacy parameter Smaller ε ≈ more privacy, less accuracy 38

  39. What Does It Mean? Say an adversary runs some query and observes a result Adversary had some set of results, S, that lets them infer something about Matei if the observed result falls in S Then: Pr[result ∈ S | Matei in DB] / Pr[result ∈ S | Matei not in DB] ≤ e^ε ≈ 1 + ε for small ε Similar outcomes whether or not Matei is in the DB CS 245 39

  40. What Does It Mean? Private information is noisy. Can we determine anything useful? CS 245 40

  41. What Does It Mean? Private information is noisy. Can we determine anything useful? Query: SELECT COUNT(*) FROM patients WHERE causeOfDeath = ... Assume ε = 0.1 CS 245 41

  42. What Does It Mean? CS 245 42

  43. What Does It Mean? CS 245 43

  44. What Does It Mean? With enough base signal, DP can still give useful information! CS 245 44
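
Not from the slides: a minimal Python sketch of the standard Laplace mechanism for a counting query, one common way to meet the ε-differential-privacy guarantee. It also shows why a large base count survives the noise while a tiny one does not.

    import random

    def dp_count(true_count, epsilon):
        # COUNT(*) has sensitivity 1 (adding or removing one row changes it by at
        # most 1), so adding Laplace noise with scale 1/epsilon satisfies
        # epsilon-differential privacy.
        # The difference of two Exponential(epsilon) draws is Laplace(0, 1/epsilon).
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        return true_count + noise

    # With epsilon = 0.1 the noise scale is 10 (std. dev. about 14): a count in
    # the thousands barely moves, while a count of 2 or 3 is swamped by the noise.
    for true_count in (3, 5000):
        print(true_count, round(dp_count(true_count, epsilon=0.1), 1))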

  45. Side Information Consider the following query: SELECT AVG(income) FROM professors WHERE state = 'CA' 45
