Privacy through Accountability: A Computer Science Perspective


  1. Privacy through Accountability: A Computer Science Perspective. Anupam Datta, Associate Professor, Computer Science, ECE, CyLab, Carnegie Mellon University. February 2014.

  2. Personal Information is Everywhere

  3. Research Challenge: Programs and People. Ensure organizations respect privacy expectations in the collection, use, and disclosure of personal information.

  4. Web Privacy. Example privacy policies:
     - Do not use detailed location (full IP address) for advertising
     - Do not use race for advertising

  5. Healthcare Privacy. [Diagram: patients send patient information to a hospital, where it flows to a physician, a nurse, a drug company, and an auditor.] Example privacy policies:
     - Use patient health info only for treatment and payment
     - Share patient health info with police if a crime is suspected

  6. A Research Area
     - Formalize privacy policies: precise semantics of privacy concepts (restrictions on personal information flow)
     - Enforce privacy policies via audit and accountability: detect violations, blame assignment, adaptive audit resource allocation
     Related ideas: Barth et al., Oakland 2006; May et al., CSFW 2006; Weitzner et al., CACM 2008; Lampson 2004

  7. Today: Focus on Detection
     - Healthcare privacy: a play in two acts
     - Web privacy: a play in two (brief) acts

  8. Example from the HIPAA Privacy Rule: A covered entity may disclose an individual's protected health information (phi) to law-enforcement officials for the purpose of identifying an individual if the individual made a statement admitting participation in a violent crime that the covered entity believes may have caused serious physical harm to the victim.
     Concepts in privacy policies:
     - Black-and-white concepts: actions, e.g., send(p1, p2, m); roles, e.g., inrole(p2, law-enforcement); data attributes, e.g., attr_in(prescription, phi); temporal constraints, e.g., in-the-past(state(q, m))
     - Grey concepts: purposes, e.g., purp_in(u, id-criminal); beliefs, e.g., believes-crime-caused-serious-harm(p, q, m)
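
  To make the distinction concrete, here is a minimal Python sketch (names and representations are my own, not the talk's code) of how an audit engine might treat the two kinds of predicates: black-and-white predicates are decided directly from the audit log, while grey predicates return "unknown" and are deferred to a human oracle.

    # Sketch (hypothetical names): black-and-white vs. grey predicates.
    # Black-and-white predicates are decided mechanically from the log;
    # grey predicates cannot be, so the auditor defers them to an oracle.

    TRUE, FALSE, UNKNOWN = "true", "false", "unknown"

    class AuditLog:
        def __init__(self, events):
            self.events = set(events)  # e.g., ("send", "UPMC", "police", "M2")

        def holds(self, fact):
            return TRUE if fact in self.events else FALSE

    def send(log, p1, p2, m):          # black-and-white: check the log
        return log.holds(("send", p1, p2, m))

    def inrole(log, p, role):          # black-and-white: check the log
        return log.holds(("inrole", p, role))

    def purp_in(u, purpose):           # grey: needs a human oracle
        return UNKNOWN

    def believes_crime_caused_serious_harm(p, q, m):  # grey: needs an oracle
        return UNKNOWN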

  9. Detecting Privacy Violations
     - Automated audit for black-and-white policy concepts: complete formalization of the HIPAA Privacy Rule and GLBA
     - Oracles to audit for grey policy concepts
     - Inputs: organizational audit log and a computer-readable privacy policy; output: detected policy violations
     [Illustration: "The Oracle," a Matrix character; species: computer program; title: a program designed to investigate the human psyche.]

  10. Policy Auditing over Incomplete Logs. With D. Garg (CMU → MPI-SWS) and L. Jia (CMU). 2011 ACM Conference on Computer and Communications Security.

  11. Key Challenge for Auditing: Audit Logs Are Incomplete
     - Future: logs store only past and current events. Example: timely data breach notification refers to a future event
     - Subjective: no "grey" information. Example: logs may not record evidence for purposes and beliefs
     - Spatial: remote logs may be inaccessible. Example: logs distributed across different departments of a hospital

  12. Abstract Model of Incomplete Logs
     - Model all incomplete logs uniformly as 3-valued structures
     - Define semantics (meanings of formulas) over 3-valued structures
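
  A minimal illustration of the idea (my own sketch, not the paper's formalism): in a 3-valued structure each atomic fact is true, false, or unknown, and the connectives follow Kleene's strong 3-valued logic, so a formula evaluates to unknown exactly when the log lacks the information to decide it.

    # Kleene's strong 3-valued logic: one natural semantics over
    # incomplete logs, where "unknown" marks facts the log cannot decide.

    T, F, U = "true", "false", "unknown"

    def k_not(a):
        return {T: F, F: T, U: U}[a]

    def k_and(a, b):
        if a == F or b == F:      # one false conjunct decides the result
            return F
        if a == T and b == T:
            return T
        return U                  # otherwise the log is too incomplete

    def k_or(a, b):
        return k_not(k_and(k_not(a), k_not(b)))

    # Example: one conjunct is known true, the other is not recorded.
    print(k_and(T, U))  # "unknown" -> auditing must revisit this later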

  13. reduce: The Iterative Algorithm. reduce(L, φ) = φ'. [Diagram: as logs extend over time, the policy is iteratively rewritten, φ0 --reduce--> φ1 --reduce--> φ2, each pass discharging the parts the current log can decide.]
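
  The shape of the loop, as a Python sketch under assumed interfaces (the toy formula representation, a plain set of atomic conjuncts, is mine, not the paper's): each pass drops the conjuncts the current log snapshot satisfies and carries the residue forward to the next, richer log.

    # Sketch of the iterative audit loop around reduce (toy interfaces).
    # A "formula" here is a set of atomic facts to be conjoined; each pass
    # keeps only the conjuncts the current log cannot yet discharge.

    def reduce_once(log, conjuncts):
        """Return the residual conjuncts the log cannot yet discharge."""
        return {c for c in conjuncts if c not in log}

    def audit(log_snapshots, policy_conjuncts):
        residual = set(policy_conjuncts)
        for log in log_snapshots:          # logs extend over time
            residual = reduce_once(log, residual)
            if not residual:
                return "compliant"
        return "inconclusive; residual obligations: %r" % sorted(residual)

    # Usage: two snapshots of a growing log, three policy obligations.
    snapshots = [{"state(Bob,M1)"},
                 {"state(Bob,M1)", "send(UPMC,police,M2)"}]
    print(audit(snapshots, {"state(Bob,M1)", "send(UPMC,police,M2)",
                            "purp_in(id-bank-robber,id-criminal)"}))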

  14. Syntax of Policy Logic
     - First-order logic with restricted quantification over infinite domains (the challenge for reduce)
     - Can express timed temporal properties and "grey" predicates

  15. Example from the HIPAA Privacy Rule: A covered entity may disclose an individual's protected health information (phi) to law-enforcement officials for the purpose of identifying an individual if the individual made a statement admitting participation in a violent crime that the covered entity believes may have caused serious physical harm to the victim.
     ∀ p1, p2, m, u, q, t.
       (send(p1, p2, m) ∧ inrole(p2, law-enforcement) ∧ tagged(m, q, t, u) ∧ attr_in(t, phi))
       ⊃ (purp_in(u, id-criminal)
          ∧ ∃ m'. (state(q, m') ∧ is-admission-of-crime(m')
                   ∧ believes-crime-caused-serious-harm(p1, q, m')))

  16. reduce: Formal Definition
     General theorem: if the initial policy passes a syntactic mode check, then finite satisfying substitutions can be computed. (The mode check identifies the formulas c for which finite satisfying substitutions of x can be computed.)
     Applications: the entire HIPAA and GLBA Privacy Rules pass this check.
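
  The point of the mode check, in code terms (a toy sketch; the representation and the substitutions helper are mine): every quantified variable must appear in a log-backed atom, so candidate substitutions can be enumerated from the finite log rather than from the infinite domain of messages.

    # Toy sketch: mode-checked variables bind against log entries, so the
    # set of candidate substitutions is finite even if the domain is not.

    log = [("send", "UPMC", "allegeny-police", "M2")]

    def substitutions(pred, vars_, log):
        """Enumerate bindings of vars_ from log entries matching pred."""
        for entry in log:
            if entry[0] == pred:
                yield dict(zip(vars_, entry[1:]))

    for sub in substitutions("send", ("p1", "p2", "m"), log):
        print(sub)  # {'p1': 'UPMC', 'p2': 'allegeny-police', 'm': 'M2'}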

  17. Example
     Policy: ∀ p1, p2, m, u, q, t.
       (send(p1, p2, m) ∧ tagged(m, q, t, u) ∧ attr_in(t, phi))
       ⊃ (inrole(p2, law-enforcement)
          ∧ purp_in(u, id-criminal)
          ∧ ∃ m'. (state(q, m') ∧ is-admission-of-crime(m')
                   ∧ believes-crime-caused-serious-harm(p1, m')))
     Log:
       Jan 1, 2011: state(Bob, M1)
       Jan 5, 2011: send(UPMC, allegeny-police, M2); tagged(M2, Bob, date-of-treatment, id-bank-robber)
     Substitutions computed by reduce: { p1 → UPMC, p2 → allegeny-police, m → M2, q → Bob, u → id-bank-robber, t → date-of-treatment } and { m' → M1 }
     Residual formula (the black-and-white conjuncts reduce to true):
       φ' = purp_in(id-bank-robber, id-criminal) ∧ is-admission-of-crime(M1) ∧ believes-crime-caused-serious-harm(UPMC, M1)

  18. Implementation and Case Study
     - Implementation and evaluation over simulated audit logs for compliance with all 84 disclosure-related clauses of the HIPAA Privacy Rule
     - Performance: average time to check compliance of each disclosure of protected health information is 0.12 s on a 15 MB log
     - Mechanical enforcement: reduce can automatically check 80% of all the atomic predicates

  19. Ongoing Transition Efforts
     - Integration of the reduce algorithm into the Illinois Health Information Exchange prototype: joint work with UIUC and Illinois HLN; auditing logs for policy compliance
     - Ongoing conversations with Symantec Research

  20. Related Work
     Distinguishing characteristics:
     1. General treatment of incompleteness in audit logs
     2. Quantification over infinite domains (e.g., messages)
     3. First complete formalization of the HIPAA Privacy Rule and GLBA
     Nearest neighbors:
     - Basin et al. 2010 (missing 1, weaker 2, cannot handle 3)
     - Lam et al. 2010 (missing 1, weaker 2, cannot handle all of 3)
     - Weitzner et al. (missing 1, cannot handle 3)
     - Barth et al. 2006 (missing 1, weaker 2, did not do 3)

  21. Formalizing and Enforcing Purpose Restrictions. With M. C. Tschantz (CMU → Berkeley) and J. M. Wing (CMU → MSR). 2012 IEEE Symposium on Security & Privacy.

  22. Goal
     - Give a semantics to "not for" and "only for" purpose restrictions that is parametric in the purpose
     - Provide an audit algorithm for detecting violations under that semantics

  23. Policy: medical records used only for diagnosis. [State diagram: X-ray taken → add x-ray → X-ray added; from there, either send the record to a specialist (leading to diagnosis) or send it to a drug company (no diagnosis).]

  24. [Same diagram, annotated: sending the record to the specialist achieves the purpose (diagnosis); sending it to the drug company does not achieve the purpose (no diagnosis).]

  25. [Same diagram with probabilities: at the choice point after the x-ray is added, the record may be sent to the drug company or the specialist; even the best choice (specialist) fails to yield a diagnosis with probability 1/4 and succeeds with probability 3/4.]

  26. Planning Thesis: an action is for a purpose iff that action is part of a plan for furthering the purpose, i.e., it always makes the best choice for furthering the purpose.

  27. Auditing Purpose. [Diagram: the auditee's observed behavior is compared against a decision-making model; the verdict is one of Obeyed restriction, Violated, or Inconclusive.]

  28. [Pipeline: policy "record used only for treatment" plus a decision-making model yield an MDP; solve the MDP for the optimal actions at each state; ask whether the logged actions (e.g., [..., send record]) are optimal; if not, a violation is implied.]
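
  A compact sketch of that audit idea under stated assumptions (the tiny hand-coded MDP, state names, and rewards are all hypothetical): solve the MDP for the optimal action at each state, then flag any logged action that is not optimal, since a planner genuinely pursuing the purpose would not have chosen it.

    # Sketch: audit purpose restrictions by solving a small MDP
    # (hand-coded, hypothetical) and flagging logged actions that are
    # not optimal for the purpose.

    # state -> {action: [(prob, next_state, reward)]}; terminals omitted
    MDP = {
        "record_ready": {
            "send_to_specialist": [(0.75, "diagnosed", 1.0),
                                   (0.25, "not_diagnosed", 0.0)],
            "send_to_drug_co":    [(1.0,  "not_diagnosed", 0.0)],
        },
    }

    def value_iteration(mdp, gamma=1.0, iters=50):
        V = {s: 0.0 for s in mdp}
        for _ in range(iters):
            for s, actions in mdp.items():
                V[s] = max(sum(p * (r + gamma * V.get(s2, 0.0))
                               for p, s2, r in outs)
                           for outs in actions.values())
        return V

    def optimal_actions(mdp, V, gamma=1.0):
        best = {}
        for s, actions in mdp.items():
            q = {a: sum(p * (r + gamma * V.get(s2, 0.0))
                        for p, s2, r in outs)
                 for a, outs in actions.items()}
            top = max(q.values())
            best[s] = {a for a, v in q.items() if v >= top - 1e-9}
        return best

    best = optimal_actions(MDP, value_iteration(MDP))
    logged = [("record_ready", "send_to_drug_co")]   # observed behavior
    for state, action in logged:
        if action not in best[state]:
            print("potential violation: %s at %s" % (action, state))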

  29. Summary: A Sense of Purpose
     - Thesis: an action is for a purpose iff that action is part of a plan for furthering the purpose, i.e., it always makes the best choice for furthering the purpose
     - The audit algorithm detects policy violations by checking whether the observed behavior could have been produced by an optimal plan

  30. Today: Focus on Detection (revisited)
     - Healthcare privacy: a play in two acts
     - Web privacy: a play in two (brief) acts

  31. Bootstrapping Privacy Compliance in a Big Data System. With S. Sen (CMU) and S. Guha, S. Rajamani, J. Tsai, J. M. Wing (MSR). 2014 IEEE Symposium on Security & Privacy.

  32. Privacy Compliance for Bing. Setting: the auditor has access to the source code.

  33. Two Central Challenges
     1. Ambiguous privacy policy: meaning unclear
     2. Huge undocumented codebases and datasets: connection to policy unclear
     [Workflow: legal team crafts policy → (meetings) → privacy champion interprets policy → (meetings) → developer writes code → (meetings) → audit team verifies compliance.]

  34. 1. Legalease
     - Clean syntax: layered allow-deny information flow rules with exceptions
     - Precise semantics: no ambiguity
     - Focus on usability: a user study of Legalease with Microsoft privacy champions was promising
     Example:
       DENY Datatype IPAddress
       USE FOR PURPOSE Advertising
       EXCEPT
         ALLOW Datatype IPAddress:Truncated
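
  A rough Python sketch of the layered allow-deny idea (my own simplification, not Legalease's actual semantics; modeling IPAddress:Truncated as a separate "granularity" attribute is also an assumption): exceptions nest under a parent rule and override it for the cases they match, so the most specific matching layer decides.

    # Rough sketch of layered allow/deny with exceptions (simplified):
    # the deepest matching layer decides the verdict.

    class Rule:
        def __init__(self, decision, attrs, exceptions=()):
            self.decision = decision      # "ALLOW" or "DENY"
            self.attrs = attrs            # attributes the rule constrains
            self.exceptions = exceptions  # child rules overriding this one

        def matches(self, use):
            return all(use.get(k) == v for k, v in self.attrs.items())

        def evaluate(self, use):
            if not self.matches(use):
                return None
            for exc in self.exceptions:   # exceptions override the parent
                verdict = exc.evaluate(use)
                if verdict is not None:
                    return verdict
            return self.decision

    # DENY Datatype IPAddress USE FOR PURPOSE Advertising
    #   EXCEPT ALLOW Datatype IPAddress:Truncated
    policy = Rule("DENY",
                  {"datatype": "IPAddress", "purpose": "Advertising"},
                  exceptions=[Rule("ALLOW",
                                   {"datatype": "IPAddress",
                                    "granularity": "Truncated"})])

    print(policy.evaluate({"datatype": "IPAddress",
                           "purpose": "Advertising"}))           # DENY
    print(policy.evaluate({"datatype": "IPAddress",
                           "purpose": "Advertising",
                           "granularity": "Truncated"}))         # ALLOW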

  35. 2. Grok: Data Inventory
     - Annotate code and data with policy datatypes
     - Source labels propagated via the data flow graph
     - Different noisy sources: variable name analysis, developer annotations
     [Data flow graph: datasets carrying labels such as Name, Age, IPAddress, IDX, Hash, Country, and Timestamp flow through processes such as GeoIP, NewAcct, CheckFraud, Login, CheckHijack, and Reporting.]
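
  A small Python sketch of label propagation over a data flow graph (illustrative only; the dataset names are placeholders and Grok's actual inference over noisy sources is far richer): datasets start with labels from the noisy sources, and labels flow forward along edges until a fixed point is reached.

    # Sketch of forward label propagation over a data flow graph.
    # Labels attached to source datasets flow along edges to a fixed point.

    # edges: producer dataset -> consumer dataset (via some process)
    edges = [("DatasetA", "DatasetB"), ("DatasetB", "DatasetD"),
             ("DatasetA", "DatasetC")]

    # initial labels from noisy sources (variable names, annotations)
    labels = {"DatasetA": {"IPAddress", "Name"}, "DatasetB": set(),
              "DatasetC": set(), "DatasetD": set()}

    changed = True
    while changed:                     # iterate until no label moves
        changed = False
        for src, dst in edges:
            missing = labels[src] - labels[dst]
            if missing:
                labels[dst] |= missing
                changed = True

    print(labels["DatasetD"])  # {'IPAddress', 'Name'} reached the sink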
