Introduction to Cybersecurity: Database Privacy
CISPA Center for IT Security, Privacy and Accountability


1. Review: Anonymity vs. Privacy

   Privacy
   - Privacy is the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others.

   Anonymity
   - The state of not being identifiable within a set of subjects/individuals.
   - It is a property exclusively of individuals.

   Privacy != Anonymity
   - Anonymity is one way to maintain privacy, and sometimes it is not necessary.

   Review: Anonymous Communication (AC) Protocols
   - Various AC protocols with different goals: low latency overhead, low communication overhead, high traffic-analysis resistance.
   - Typically categorized by latency overhead:
     - low-latency AC protocols, e.g. Tor, DC-nets, Crowds
     - high-latency AC protocols, e.g. mix networks

2. A Glimpse on Research: Privacy Assessment with MATor

   Scenario: Tor randomly chooses an entry, a middle, and a related exit node; the adversary corrupts some of the nodes.
   Goal: derive worst-case quantitative anonymity guarantees.

   Budget adversary B_g with cost function g: N -> R and budget C, where N is the set of nodes.

   Impact of single node corruption (anonymity degeneration, for encryption as terms):
     ε_entry(i)  = Σ_{(m,x) ∈ N²} Pr[(i, m, x) ← Tor]
     ε_middle(i) = Σ_{(e,x) ∈ N²} Δ_st · Pr[(e, i, x) ← Tor]
     ε_exit(i)   = Δ_st · Σ_{(e,m) ∈ N²} Pr[(e, m, i) ← Tor]
   These per-position impacts yield the per-node advantage adv_node(i).

   Overall guarantee (integer maximization problem; a toy sketch of this selection problem is given after this page):
     maximize  Σ_{a ∈ A} adv_node(a)   subject to  A ⊆ N,  g(A) ≤ C

   Computational soundness (guarantee with encryption as symbolic terms vs. cryptographic guarantee):
     g_algebraic(n) − g_crypto(n) ≤ 1/poly(n)

   [Figures: live anonymity monitor over time (2012-2014); anonymity of alternative path-selection algorithms (Tor, LASTor, uniform path selection, US-Exit) as a function of bandwidth in MB/s.]

   Challenges: comprehensive network-layer attackers, extension beyond structural corruption, content-sensitive assessment.
   Potential killer arguments: attackers overly powerful, hence too pessimistic guarantees; assessment only for Tor, not a tailored attack.

   Lecture Summary – Part I
   Basic Database Privacy
   • Motivation
   • Data Sanitization
   • k-anonymity and l-diversity
   Principal Approaches to Data Protection
   • Sanitization before Publication
   • Protection after Publication
   • Publication without Control
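The budget adversary's node selection above is stated as an integer maximization problem; assuming integer node costs, it is a 0/1-knapsack problem. The following is a toy sketch only (not MATor's implementation; all names and numbers are made up) that solves it by dynamic programming:

```python
# Toy version of the budget adversary's selection problem (illustrative only):
# pick a set A of nodes maximizing the summed per-node advantage adv_node(n),
# subject to a total corruption cost g(A) <= C. With integer costs this is a
# 0/1 knapsack, solved here by dynamic programming.

def best_corruption_set(adv_node, cost, budget):
    """adv_node, cost: dicts node -> value (hypothetical inputs); budget: int."""
    # best[c] = (max total advantage achievable with cost <= c, chosen nodes)
    best = [(0.0, [])] * (budget + 1)
    for n, a_n in adv_node.items():
        c_n = cost[n]
        if c_n > budget:
            continue
        for c in range(budget, c_n - 1, -1):
            candidate = best[c - c_n][0] + a_n
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - c_n][1] + [n])
    return best[budget]

# Made-up example: three nodes, budget C = 5.
adv = {"node1": 0.5, "node2": 0.375, "node3": 0.25}   # per-node advantages
g   = {"node1": 4,   "node2": 3,     "node3": 2}      # cost function g
print(best_corruption_set(adv, g, 5))  # -> (0.625, ['node2', 'node3'])
```

Corrupting node2 and node3 together beats corrupting the single most valuable node within the same budget, which is exactly the kind of worst case the optimization is meant to find.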

3. Data Privacy: Attribute Disclosure

   Published (sanitized) records:
     Gender   Age     Region     Condition
     female   25-30   Saarland   Addison disorder
     female   25-30   Saarland   Addison disorder
     male     30-35   Saarland   healthy
   Linking these records with public social-network information (Alice: female, 29 years old, Saarbrücken) reveals: Alice suffers from the Addison disorder! Both records matching her quasi-identifiers carry the same sensitive value, so her condition is disclosed even though her exact record is never singled out.

   Cryptographic Solutions?
   - Why not just delete the data? In contrast to cryptography, privacy often requires a certain utility; deleting the data destroys that utility.
   - Why can't we simply encrypt? Storing or transmitting data in encrypted form is a good idea, but someone has (and needs to have) the key.

   Sanitization
   - Legally, data has to be "sanitized": identifying information is removed.
   - Unsanitized data: Name, Gender, Age, Address, Phone Number, Field of studies, Grades.
   - Sanitized data: identifying fields such as Name, Address, and Phone Number are removed; Gender, Age, Field of studies, and Grades remain.
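As a toy illustration of this column-level sanitization (a minimal sketch; the field names and the identifier set are assumptions, not from the lecture):

```python
# Naive column-level sanitization sketch: drop direct identifiers before publication.
IDENTIFIERS = {"name", "address", "phone_number"}

def sanitize(records):
    """Return copies of the records with direct-identifier fields removed."""
    return [{k: v for k, v in rec.items() if k not in IDENTIFIERS} for rec in records]

students = [
    {"name": "Alice", "gender": "female", "age": 19, "field": "CS", "grade": 1.3},
    {"name": "Bob",   "gender": "male",   "age": 18, "field": "CS", "grade": 2.0},
]
print(sanitize(students))
# The names are gone, but the remaining quasi-identifiers (gender, age, field of
# studies) may still single a person out, as the following slides show.
```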

4. Benefits of Sanitization

   Sanitized data can (still) be used for:
   - Research, statistics, science
   - Healthcare
   - Governmental statistics
   - Improving business models

   Does Sanitization Suffice? (Sanitization = Privacy?)
   - The published data contains no identity and no identifying information ("quasi-identifiers") such as address or phone number.
   - But: if only one female student of this age attends a course, her sanitized record still identifies her, a privacy breach (see the sketch after this page).

   Attacks on Databases
   Early defense mechanism: query sanitization.
   Rule: queries must not depend on identifiers.

   Student database:
     Name      Age   Gender   Semester   Grade
     Alice     19    Female   1          1.3
     Bob       18    Male     1          2.0
     Charlie   18    Male     1          1.7
     Dave      18    Male     1          3.7
     Eve       17    Female   1          1.0
     Fritz     19    Male     3          1.3
     Gerd      21    Male     3          2.3
     Hans      23    Male     3          3.0
     Isa       20    Female   3          3.7
     John      20    Male     3          1.7
     Kale      21    Male     5          1.7
     Leonard   23    Male     5          failed
     Martin    20    Male     5          2.7
     Nils      22    Male     5          3.0
     Otto      20    Male     5          1.0

   The query
     SELECT SUM(Grade) WHERE Name = 'Isa'
   returns 3.7, i.e. Isa's grade directly, which is why identifier-dependent queries must be rejected.
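A minimal sketch of the singling-out step behind the privacy breach above: the attacker matches publicly known quasi-identifier values against the sanitized records, and a match group of size one re-identifies the victim (all field names and values here are illustrative):

```python
# Singling-out sketch: match known quasi-identifier values against sanitized records;
# if exactly one record matches, it must belong to the victim.

def matching_records(table, **known):
    return [rec for rec in table if all(rec.get(k) == v for k, v in known.items())]

sanitized = [                       # names already removed
    {"gender": "Female", "semester": 3, "grade": 3.7},
    {"gender": "Male",   "semester": 3, "grade": 1.7},
    {"gender": "Male",   "semester": 3, "grade": 2.3},
]
hits = matching_records(sanitized, gender="Female", semester=3)
if len(hits) == 1:
    print("re-identified, sensitive value:", hits[0]["grade"])  # -> 3.7
```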

5. Attacks on Databases (continued)

   Additional query-sanitization rule: queries must not be answered if the answer is below a threshold. This rule is needed because a query such as
     SELECT SUM(Grade) WHERE Semester = 3 AND Gender = Female
   returns 3.7, Isa's grade, even though it never mentions her name.

   The threshold rule can still be circumvented by differencing two "large" queries:
     SELECT SUM(Grade)                                                ->  30.1
     SELECT SUM(Grade) WHERE NOT (Semester = 3 AND Gender = Female)   ->  26.4
   Local computation: Isa's grade = 30.1 - 26.4 = 3.7.
   (A runnable sketch of this differencing attack follows after this page.)

   K-Anonymity (Intuitive Idea)
   K-anonymity: privacy means that one can hide within a set of (at least) K other people with the same quasi-identifiers.
   Quasi-identifiers: attributes that could identify a person (name, age, etc.).
   [Illustration: an individual hiding in a group of K = 6 people.]
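The differencing attack can be reproduced end to end with an in-memory SQLite database; the table and column names are chosen for illustration, and Leonard's non-numeric "failed" grade is stored as NULL (which SUM ignores) so that the sums match the slide's 30.1 and 26.4:

```python
import sqlite3

# Differencing-attack sketch against a toy in-memory database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE students (name TEXT, age INT, gender TEXT, semester INT, grade REAL)")
con.executemany("INSERT INTO students VALUES (?, ?, ?, ?, ?)", [
    ("Alice", 19, "Female", 1, 1.3), ("Bob", 18, "Male", 1, 2.0),
    ("Charlie", 18, "Male", 1, 1.7), ("Dave", 18, "Male", 1, 3.7),
    ("Eve", 17, "Female", 1, 1.0),   ("Fritz", 19, "Male", 3, 1.3),
    ("Gerd", 21, "Male", 3, 2.3),    ("Hans", 23, "Male", 3, 3.0),
    ("Isa", 20, "Female", 3, 3.7),   ("John", 20, "Male", 3, 1.7),
    ("Kale", 21, "Male", 5, 1.7),    ("Leonard", 23, "Male", 5, None),
    ("Martin", 20, "Male", 5, 2.7),  ("Nils", 22, "Male", 5, 3.0),
    ("Otto", 20, "Male", 5, 1.0),
])

# Both queries aggregate over many rows, so a naive threshold check lets them pass.
total = con.execute("SELECT SUM(grade) FROM students").fetchone()[0]
rest  = con.execute("SELECT SUM(grade) FROM students "
                    "WHERE NOT (semester = 3 AND gender = 'Female')").fetchone()[0]

print(round(total, 1), round(rest, 1))   # -> 30.1 26.4
print(round(total - rest, 1))            # -> 3.7, Isa's grade, recovered without naming her
```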

6. K-Anonymity (Definition)

   Definition: Data satisfies K-anonymity if each person contained in the data cannot be distinguished from at least K-1 other individuals also contained in the data.

   Achieving K-Anonymity
   Reduce the information until records collapse into indistinguishable groups:
   - Suppression: replace identifying values with '*', e.g. for the semester-1 rows:
       Name   Age   Gender   Semester   Grade
       *      19    *        1          1.3
       *      18    *        1          2.0
       *      18    *        1          1.7
       *      18    *        1          3.7
       *      17    *        1          1.0
   - Generalization: replace values with broader categories or ranges, e.g. for the semester-5 rows:
       Name   Age     Semester   Grade
       *      21-25   5          1.7
       *      21-25   5          failed
       *      18-20   5          2.7
       *      21-25   5          3.0
       *      20      5          1.0

   K-Anonymity (3)
   Example: K-anonymity for a list of students with K = 5.
     Name   Semester   Grade
     *      1          1.3
     *      1          2.0
     *      1          1.7
     *      1          3.7
     *      1          1.0
     *      3          1.3
     *      3          2.3
     *      3          3.0
     *      3          failed
     *      3          1.7
     *      5          1.7
     *      5          failed
     *      5          2.7
     *      5          3.0
     *      5          1.0
   For each semester, there are at least 5 individuals present that cannot be distinguished from one another.
   Idea/Goal: consequently, one cannot be identified, but hides in a group of K = 5 people.
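A small sketch of how the K-anonymity condition can be checked mechanically, by grouping records on their quasi-identifier values; the field names, the chosen quasi-identifiers, and the helper name are illustrative, not from the lecture:

```python
from collections import Counter

# K-anonymity check sketch: every combination of quasi-identifier values must
# occur in at least k records, otherwise some record is too unique.

def is_k_anonymous(records, quasi_identifiers, k):
    groups = Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)
    return all(count >= k for count in groups.values())

published = [
    {"semester": 1, "grade": 1.3}, {"semester": 1, "grade": 2.0},
    {"semester": 1, "grade": 1.7}, {"semester": 1, "grade": 3.7},
    {"semester": 1, "grade": 1.0},
    {"semester": 3, "grade": 1.3}, {"semester": 3, "grade": 2.3},
    {"semester": 3, "grade": 3.0}, {"semester": 3, "grade": "failed"},
    {"semester": 3, "grade": 1.7},
]
print(is_k_anonymous(published, ["semester"], 5))           # True: each semester has >= 5 rows
print(is_k_anonymous(published, ["semester", "grade"], 2))  # False: grades single people out
```

Treating the grade as a quasi-identifier immediately breaks the guarantee, which is why the choice of quasi-identifiers matters as much as the value of K.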
