
Unlinking Private Data, Alex Vaynberg - PowerPoint PPT Presentation

Unlinking Private Data. Alex Vaynberg, 04/11/2006, Yale University. Sensitive Information in the Wired World.


  1. Unlinking Private Data
     Alex Vaynberg
     04/11/2006
     Yale University
     Sensitive Information in the Wired World

  2. Privacy and Privacy Loss
     • Ability to give information to certain individuals, while retaining the ability to keep that information secret from others
     • Privacy loss occurs when information becomes known to those from whom it is kept secret

  3. Aggregation = Privacy Loss
     • Possible Cause:
       – One bit of data is considered private, but is public without being directly connected to an individual
       – One bit of data is not private, but gives out more information about an individual
         • allows connection with other records
       – Put together:
         • private information is known about a person

  4. Story Time
     A priest is asked whether people tell interesting stories during confessions.
     He says that his first confessor actually confessed to a murder.
     Later a new person comes in and greets the priest. People ask him how he knows the priest.
     He answers, “I was his first confessor”.

  5. The Point
     Seemingly nameless private data can be combined with non-anonymous data, resulting in a privacy loss.
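The priest story can be read as a join between two data sources. The sketch below is only an illustration: the records, the field names, and the `link` helper are all invented here and are not part of the original slides. It shows how a nameless private record becomes attached to a name once both sources share one field.

```python
# Hypothetical records, invented for illustration only.
# The "anonymous" source: what the priest said, with no names attached.
anonymous_confessions = [
    {"order": 1, "confession": "murder"},
    {"order": 2, "confession": "petty theft"},
]

# The non-anonymous source: what the visitor said about himself.
public_statements = [
    {"name": "Mr. X", "order": 1},
]


def link(private_rows, public_rows, key):
    """Join two sources on a shared key, attaching names to private data."""
    by_key = {row[key]: row for row in private_rows}
    linked = []
    for pub in public_rows:
        match = by_key.get(pub[key])
        if match is not None:
            # Merge the named record with the formerly nameless one.
            linked.append({**pub, **match})
    return linked


dossier = link(anonymous_confessions, public_statements, "order")
```

Neither source is harmful alone; the privacy loss appears only in `dossier`, where the name and the confession sit in the same record.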

  6. Linking Data to People
     • Types of identification
       – Permanent
         • Uniquely identifies individual, follows him wherever
         • Examples: SSN, Passport ID, Name*
       – Semi-permanent
         • May id a real person, changing may involve a cost
         • Name*, address, telephone, credit card #
       – Transient
         • Almost no cost to change
         • Pseudonym, user id, e-mail
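One way to make this taxonomy concrete is a small lookup table. This is a rough sketch under assumptions: the `IdKind` enum, the field names, and the `linkable_fields` helper are hypothetical. "Name" is left out because the slide lists it under both permanent and semi-permanent.

```python
from enum import Enum


class IdKind(Enum):
    PERMANENT = "permanent"            # follows the individual everywhere
    SEMI_PERMANENT = "semi-permanent"  # changing it may involve a cost
    TRANSIENT = "transient"            # almost no cost to change


# Field names are assumptions; the classification follows the slide.
ID_KINDS = {
    "ssn": IdKind.PERMANENT,
    "passport_id": IdKind.PERMANENT,
    "address": IdKind.SEMI_PERMANENT,
    "telephone": IdKind.SEMI_PERMANENT,
    "credit_card_number": IdKind.SEMI_PERMANENT,
    "pseudonym": IdKind.TRANSIENT,
    "user_id": IdKind.TRANSIENT,
    "email": IdKind.TRANSIENT,
}


def linkable_fields(record):
    """Return the fields of a record that are permanent or semi-permanent ids."""
    long_lived = (IdKind.PERMANENT, IdKind.SEMI_PERMANENT)
    return sorted(f for f in record if ID_KINDS.get(f) in long_lived)
```

A record whose `linkable_fields` result is empty would count as unlinked private data in the terms used on the next slide.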

  7. Linking Data to People
     • Types of Data:
       – Public Data
         • Driver's license, property records
         • Kept open by government
         • Managed by applicable laws (HIPAA, ???)
       – Linked Private
         • Almost all business transactions
         • Data collected when dealing with business
         • Connected to person via (semi)permanent id
       – Unlinked Private
         • Website ids
         • No (semi)permanent id was recorded

  8. Databases and Aggregation
     • Semipermanent and permanent ids permit aggregation of data from private and public sources
     • Results in digital dossiers, which many consider to be a privacy concern
     • Worse, these dossiers are scattered, unreliable, and frequently inaccessible to the person whom they describe

  9. Fixing The Problem
     • Reduce public data to minimum
       – specifically remove associations between permanent and semipermanent Ids
     • Force private data to be unlinked by creating a reliable system of certified pseudonyms
     • Allow for undeletable, but commentable reports (with low privacy value) on pseudonyms that follow a real identity from pseudonym to pseudonym.

  10. Certified Pseudonyms
      • A UID, but can be created at any time
      • Comes attached with information that a person has authorized for a pseudonym
      • Issued by a licensed pseudonym issuer
      • No (semi)permanent Ids
        – Not linkable, except by issuer
      • Issuers operate under strict legal guidelines
      • Connection may be restored by courts upon necessity (lawsuit, etc.)

  11. Ensuring Privacy
      • A person creates as many identities as he wishes, selecting information that can be revealed by each one
      • One of these identities will be used when dealing with another entity
      • The other entity will be able to get authorized info from issuer
      • Business dealing can proceed if enough information is attached to that identity
      • Identity itself is completely throw-away
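The flow described on the last two slides can be sketched as a small in-memory model. This is a minimal illustration, not the protocol the slides propose: the `PseudonymIssuer` class, its method names, and the boolean `has_court_order` flag are all assumptions. A real issuer would need authentication, persistent storage, and a proper legal process.

```python
import secrets


class PseudonymIssuer:
    """Toy model of a licensed pseudonym issuer (hypothetical API)."""

    def __init__(self):
        self._profiles = {}    # real identity -> full attribute dict
        self._owners = {}      # pseudonym -> real identity (issuer-only)
        self._authorized = {}  # pseudonym -> attribute names released by owner

    def register(self, real_id, attributes):
        self._profiles[real_id] = dict(attributes)

    def create_pseudonym(self, real_id, authorized_fields):
        """Create a fresh, random identity that exposes only chosen fields."""
        if real_id not in self._profiles:
            raise KeyError(f"unknown identity: {real_id}")
        pseudonym = secrets.token_hex(16)  # carries no (semi)permanent id
        self._owners[pseudonym] = real_id
        self._authorized[pseudonym] = set(authorized_fields)
        return pseudonym

    def authorized_info(self, pseudonym):
        """What another entity may learn about the holder of a pseudonym."""
        profile = self._profiles[self._owners[pseudonym]]
        allowed = self._authorized[pseudonym]
        return {k: profile[k] for k in allowed if k in profile}

    def reveal_under_court_order(self, pseudonym, has_court_order):
        """Restore the link to the real identity, only when a court requires it."""
        if not has_court_order:
            raise PermissionError("link may only be restored by a court")
        return self._owners[pseudonym]


issuer = PseudonymIssuer()
issuer.register("alice", {"credit_ok": True, "age_over_21": True,
                          "address": "12 Example St"})
shop_id = issuer.create_pseudonym("alice", ["credit_ok"])
bar_id = issuer.create_pseudonym("alice", ["age_over_21"])
```

Here the shop and the bar each see a different random identifier and only the field released to them, so comparing their databases yields no common key. Only the issuer holds the mapping back to "alice".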

  12. Pseudonym Issuers
      • Private organizations
        – government will not get credit history without warrant, etc.
      • Regulated by laws
        – minimum requirements / privacy guarantees
      • Compete on ease of use, features, etc
        – Compare to credit card issuers
      • Unify data from many pseudonyms
        – many ids, one credit history, no SSN involved
      • One place to keep track / contest data
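The "many ids, one credit history" idea, together with the undeletable but commentable reports from the "Fixing The Problem" slide, might look roughly like the sketch below. The `CreditLedger` class and its methods are hypothetical names chosen for this illustration. The sketch assumes the issuer already knows which pseudonyms belong to which owner.

```python
from collections import defaultdict


class CreditLedger:
    """Toy append-only history shared by all pseudonyms of one owner."""

    def __init__(self):
        self._owner_of = {}                # pseudonym -> owner (issuer-only)
        self._history = defaultdict(list)  # owner -> list of reports

    def add_pseudonym(self, owner, pseudonym):
        self._owner_of[pseudonym] = owner

    def report(self, pseudonym, text):
        """A business files a report; there is deliberately no delete method."""
        owner = self._owner_of[pseudonym]
        # The pseudonym is not stored, so reports do not link identities.
        self._history[owner].append({"report": text, "comments": []})

    def comment(self, pseudonym, index, text):
        """The owner contests or annotates a report instead of deleting it."""
        owner = self._owner_of[pseudonym]
        self._history[owner][index]["comments"].append(text)

    def history_for(self, pseudonym):
        """Return a copy of the history that follows the owner to any pseudonym."""
        owner = self._owner_of[pseudonym]
        return [{"report": e["report"], "comments": list(e["comments"])}
                for e in self._history[owner]]


ledger = CreditLedger()
ledger.add_pseudonym("owner-1", "p-a")
ledger.add_pseudonym("owner-1", "p-b")
ledger.report("p-a", "paid on time")
ledger.comment("p-b", 0, "confirmed by owner")
```

A report filed against `p-a` is visible when a later business checks `p-b`, yet the two pseudonyms never appear in the returned history, and no SSN is involved.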

  13. Advantages
      • Businesses cannot aggregate data
        – no (semi)permanent Ids
      • Accountability preserved
      • Free market / legal protections
      • Anonymous guaranteed payment
        – similar to credit cards
      • Ability to keep track of all personal data
      • Can coexist with current system
      • Allows for statistics for marketing use

  14. Disadvantages
      • Central point of failure
        – identity theft can be disastrous
      • Complex management interface
      • Standard protocol required for use
        – similar to credit cards
      • Who will be charged, and how much?
      • Inability for direct customer communication
      • Semipermanent Id required for deals
        – house painting requires an address

  15. Dealing with Difficulties
      • Communication
        – Direct Communication requires semipermanent information about a person
        – Indirection needed; easy with e-mail, harder with phone and address
      • Deals where semipermanent Id is required
        – Example: shipping, house painting, cable TV
        – Bad: can be aggregated with public data
        – Good: cannot be aggregated with private data
        – Similar to current method: trust
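For the e-mail case, the indirection could be as simple as a relay that maps throw-away aliases to a real address. The sketch below is an assumption-laden illustration: the `MailRelay` class, the `relay.example` domain, and the return convention of `forward` are invented here, and no real mail is sent.

```python
import secrets


class MailRelay:
    """Toy forwarding table: businesses see only the alias (hypothetical API)."""

    def __init__(self, domain="relay.example"):
        self._domain = domain
        self._targets = {}  # alias -> real address, known only to the relay

    def new_alias(self, real_address):
        """Hand out a random alias that costs almost nothing to replace."""
        alias = f"{secrets.token_hex(8)}@{self._domain}"
        self._targets[alias] = real_address
        return alias

    def revoke(self, alias):
        """Throw the alias away; later mail to it is dropped."""
        self._targets.pop(alias, None)

    def forward(self, alias, message):
        """Return (real_address, message) for delivery, or None if revoked."""
        real = self._targets.get(alias)
        if real is None:
            return None
        return (real, message)


relay = MailRelay()
alias = relay.new_alias("alice@home.example")
delivered = relay.forward(alias, "Your order has shipped")
relay.revoke(alias)
dropped = relay.forward(alias, "Special offer")
```

The alias behaves like a transient id, which is why e-mail is the easy case. A phone number or street address has no comparably cheap forwarding layer.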
