
Location Privacy. Where do we stand and where are we going? - PowerPoint PPT Presentation



  1. Location Privacy. Where do we stand and where are we going? Fernando Pérez-González Signal Theory and Communications Department Universidad de Vigo - SPAIN

  2. Why do we like location based apps? 2

  3. Google maps 3

  4. Foursquare 4

  5. Facebook place tips 5

  6. Waze 6

  7. And, of course … 7

  8. How can you be geolocated? (without you fully knowing) 8

  9. IP-based Geolocation Source: GeoIPTool 9

  10. Meta-data based Geolocation 10

  11. Landmark recognition Geolocation 11

  12. Biometric geolocation 12

  13. Credit card usage Geolocation 14

  14. Triangulation and other geolocation techniques 15

  15. Signal strength-based triangulation Source: The Wrongful Convictions Blog 16

  16. Signal strength-based triangulation Source: The Wrongful Convictions Blog 17

  17. Multilateration: Time Difference of Arrival (TDOA) Source: [Fujii et al. 2015] 18

  18. Wardriving geolocation (Wigle) Source: Wigle.net 19

  19. Electrical Network Frequency Geolocation 20

  20. 21

  21. Why is it dangerous? 22

  22. 23

  23. Buster busted! 24

  24. 25

  25. 26

  26. 6 months in the life of Malte Spitz (2009-2010) Source: http://www.zeit.de/datenschutz/malte-spitz-data-retention 29

  27. Are we concerned about it? 31

  28. Are people really concerned about location privacy? • Survey by Skyhook Wireless (July 2015) of 1,000 smartphone app users. • 40% hesitate to share or don’t share location with apps. • 20% turned off location for all their apps. • Why don’t people share location? • 50% privacy concerns. • 23% don’t see value in location data. • 19% say it drains their battery. • Why do people turn off location? • 63% battery drain. • 45% privacy. • 20% to avoid advertising. 32

  29. How much is geolocation data worth? 33

  30. 34

  31. How much value do we give to location data? [Staiano et al. 2014] Many participants opted out of revealing geolocation information. Average daily value of location info: €3. Strong correlation between the amount traveled and the value given to location data. 35

  32. Earn money as you share data • GeoTask • £1 PayPal cash voucher per 100 days of location data sharing (£0.01/day) Financial Times in 2013: advertisers are willing to pay a mere $0.0005 per person for general information such as their age, gender and location, or $0.50 per 1,000 people. 36

  33. Pay as you drive • The formula can be a function of the number of miles driven, the type of driving, the age of the driver, the type of roads used… • Up to 40% reduction in the cost of insurance. 38

  34. That’s $90 per person per year! BIA/Kelsey projects U.S. location-targeted mobile ad spending to grow from $9.8 billion in 2015 to $29.5 billion in 2020. 39

  35. SAP, Germany, estimates wireless carrier revenue from selling mobile-user behavior data at $5.5 billion in 2015 and predicts $9.6 billion for 2016. 40

  36. How about anonymization/pseudonymization? 47

  37. Anonymity. The location passes through a (local or central) anonymity provider before reaching the service provider. Problems: • Difficult authentication and personalization. • Operating system or apps may access location before anonymization. 48

  38. Pseudonymity. The location is sent to the service provider under a pseudonym. Problems: • Operating system or apps may access location data before pseudonymization. • Deanonymization. 49

  39. Deanonymization based on home location [Hoh, Gruteser 2006] • Data from GPS traces of the greater Detroit area (1 min resolution). • No data when the vehicle is parked. • K-means algorithm for clustering locations + 2 heuristics: • Eliminate centroids that don’t have evening visits. • Eliminate centroids outside residential areas (manually). Source: [Hoh, Gruteser 2006] 50

  40. Deanonymization based on home location [Krumm 2007] • 2-week GPS data from 172 subjects (avg. 6 sec resolution). • Use a heuristic to single out trips by car. • Then use several heuristics: destination closest to 3 a.m. is home; place where the individual spends most time is home; center of the cluster with most points is home. • Use reverse geocoding and white pages to deanonymize. Success measured by finding out the individual’s name. • Positive identification rates around 5%. • Even noise addition with std = 500 m gives around 5% success when measured by finding out the correct address. 51
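The home-finding heuristics above are easy to sketch in code. The snippet below implements just one of them — the GPS fix recorded closest to 3 a.m. is taken as home — on a hypothetical `(timestamp, lat, lon)` trace format; it is an illustrative sketch, not the paper's implementation, and the coordinates are made up.

```python
from datetime import datetime

# One of the heuristics: the fix whose time-of-day is closest to 3 a.m.
# is assumed to be the subject's home location.
def guess_home(trace):
    def distance_to_3am(ts):
        # Seconds between the fix's time-of-day and 03:00, wrapping at midnight.
        secs = ts.hour * 3600 + ts.minute * 60 + ts.second
        target = 3 * 3600
        diff = abs(secs - target)
        return min(diff, 86400 - diff)
    ts, lat, lon = min(trace, key=lambda rec: distance_to_3am(rec[0]))
    return (lat, lon)

# Hypothetical trace: (timestamp, latitude, longitude).
trace = [
    (datetime(2007, 5, 1, 14, 30), 47.62, -122.35),  # afternoon, downtown
    (datetime(2007, 5, 2, 2, 55), 47.68, -122.29),   # near 3 a.m. -> "home"
    (datetime(2007, 5, 2, 9, 10), 47.64, -122.13),   # morning, office
]
print(guess_home(trace))  # (47.68, -122.29)
```

A real attack would then feed this point to reverse geocoding and a white-pages lookup, as the slide describes.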

  41. Mobile trace uniqueness [de Montjoye et al. 2013] • Study on 15 months of mobility data; 0.5M individuals. • In a dataset with hourly updates and spatial resolution given by the carrier’s antennas, only 4 points suffice to identify 95% of individuals. • Uniqueness of mobility traces decays as the 1/10th power of their resolution. Source: [de Montjoye et al. 2013] 52

  42. Location privacy protection mechanisms 53

  43. Location white lies Source: Caro Spark (CC BY-NC-ND) 54

  44. Location-based privacy mechanisms: the input location X is mapped to an output pseudolocation Z. Source: Motherboards.org 55

  45. Location privacy protection mechanisms (LPPMs) • An LPPM maps the true location X to a pseudolocation Z. • The mechanism may be deterministic (e.g., quantization) or stochastic (e.g., noise addition). • The mapping may depend on other contextual (e.g., time) or user-tunable (e.g., privacy level) parameters. • When the mechanism is stochastic, there is an underlying probability density function f(z|x). 56
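As a concrete (hypothetical) instance of a stochastic LPPM, the sketch below adds independent Gaussian noise to each coordinate, so f(z|x) is a Gaussian density centred at the true location; the noise scale `sigma` plays the role of the user-tunable privacy parameter. The example coordinates are illustrative.

```python
import random

# Stochastic LPPM sketch: report Z = X + independent Gaussian noise.
# Larger sigma means more privacy and less utility.
def lppm(x, sigma):
    lat, lon = x
    return (lat + random.gauss(0.0, sigma), lon + random.gauss(0.0, sigma))

random.seed(0)
x = (42.2406, -8.7207)   # roughly Vigo, for illustration
z = lppm(x, sigma=0.01)  # about 1 km of blur at this latitude
print(z)
```

Sampling from f(z|x) like this is all a user-centric LPPM needs to do before handing the pseudolocation to the app.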

  46. Hiding 57

  47. Perturbation: (independent) noise addition 58

  48. Perturbation: quantization 59
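A minimal sketch of quantization as a deterministic LPPM: every location inside a fixed grid cell is snapped to the cell centre, so nearby points become indistinguishable. The cell width in degrees is an illustrative parameter, not a value from the slides.

```python
# Deterministic LPPM sketch: snap coordinates to the centre of a grid cell.
# `cell` is the cell width in degrees (an assumed, tunable value).
def quantize(x, cell=0.01):
    lat, lon = x
    snap = lambda v: (int(v // cell) + 0.5) * cell
    return (round(snap(lat), 6), round(snap(lon), 6))

print(quantize((42.2406, -8.7207)))  # both points fall in the same cell...
print(quantize((42.2409, -8.7203)))  # ...so they map to the same output
```

Unlike noise addition, the output is repeatable, which helps utility but also helps an adversary invert the mechanism cell by cell.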

  49. Obfuscation 60

  50. Spatial Cloaking 61
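Spatial cloaking can be sketched as reporting a region rather than a point. The toy example below builds the bounding box of the k nearest users, a k-anonymity-style cloak in which the querier is indistinguishable from k−1 others; the coordinates and the nearest-neighbour selection rule are illustrative assumptions, not a specific published scheme.

```python
# Spatial cloaking sketch: report the smallest axis-aligned box around the
# querying user that contains at least k users (including the querier).
def cloak(user, others, k):
    # Sort the querier plus the other users by squared distance to the
    # querier, then bound the k nearest with a box.
    pts = sorted([user] + others,
                 key=lambda p: (p[0] - user[0]) ** 2 + (p[1] - user[1]) ** 2)[:k]
    lats = [p[0] for p in pts]
    lons = [p[1] for p in pts]
    return (min(lats), min(lons)), (max(lats), max(lons))

others = [(42.24, -8.72), (42.25, -8.73), (42.26, -8.70), (42.30, -8.60)]
print(cloak((42.241, -8.721), others, k=3))
```

In deployed systems this computation is typically done by a trusted anonymizer that knows all users' positions, which is exactly the centralized-LPPM setting discussed later.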

  51. How to commit the perfect murder 62

  52. Space-time Cloaking Time 63

  53. Dummies 64

  54. User-centric vs. Centralized LPPM User-centric 65

  55. User-centric vs. Centralized LPPM Centralized 66

  56. 67

  57. Utility vs. Privacy • In broad terms, utility and privacy pull in opposite directions: gaining privacy costs utility. 68

  58. Very nice, but … • There are two main problems: How do we measure utility? How do we measure privacy? 69

  59. How to measure utility? 70

  60. How to measure utility? 71

  61. How to measure utility? E.g., by the distance between the real position and the reported pseudolocation. 72

  62. A note about distances [Figure: two distances, d1 and d2.] 76

  63. Adversarial definition of privacy [Shokri et al. 2011–] • Assume a stochastic mechanism f(z|x) for the user. • The adversary constructs a (possibly stochastic) estimation ‘remapping’ r(x̂|z). • The prior π(x) is assumed available to the adversary. • d_p(x̂, x): distance between the estimate x̂ and the true location x. • d_q(x, z): distance between x and the pseudolocation z. [Diagram: x → LPPM → z → Adversary → x̂.] 77

  64. Adversarial definition of privacy [Shokri et al. 2011–] • Establish a cap on the average utility loss: E{d_q(X, Z)} ≤ QL. • This is a Stackelberg game in which the user chooses first and the adversary plays second. • Find the optimal adversarial ‘remapping’: r*(x̂|z) = arg min_r E{d_p(X̂, X) | Z}, where E{d_p(X̂, X) | Z} = Σ_{x̂,x} r(x̂|Z) f(x|Z) d_p(x̂, x). • The optimal remapping depends on the LPPM f(z|x) and the prior π(x) through the posterior f(x|z) = f(z|x) π(x) / f(z). 78
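On a discrete set of candidate locations, the adversary's optimal remapping can be sketched directly from Bayes' rule: compute the posterior f(x|z) ∝ f(z|x)·π(x) and pick the guess minimising the expected posterior distance. The 1-D locations, prior and mechanism below are toy values chosen for illustration, not data from the paper.

```python
# Sketch of the Bayes-optimal adversarial remapping on discrete locations.
# `prior[i]` is pi(locations[i]); `f_z_given_x(z, x)` is the LPPM density;
# `d_p` is the adversary's distance function.
def optimal_remap(z, locations, prior, f_z_given_x, d_p):
    # Posterior f(x|z) is proportional to f(z|x) * pi(x) (Bayes' rule).
    post = [f_z_given_x(z, x) * p for x, p in zip(locations, prior)]
    total = sum(post)
    post = [p / total for p in post]
    # Guess the candidate x_hat with minimum expected posterior distance.
    return min(locations,
               key=lambda xh: sum(p * d_p(xh, x)
                                  for p, x in zip(post, locations)))

# Toy 1-D example: prior peaked at 0, uniform +/-1 noise mechanism,
# squared-error distance for the adversary.
locations = [0.0, 1.0, 2.0]
prior = [0.8, 0.1, 0.1]
noise = lambda z, x: 1.0 if abs(z - x) <= 1.0 else 0.0
d_sq = lambda a, b: (a - b) ** 2
print(optimal_remap(1.0, locations, prior, noise, d_sq))
```

Note how the prior matters: even after observing z = 1, the peaked prior pulls the optimal guess back to 0, which is why the slides stress that the remapping depends on both f(z|x) and π(x).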

  65. Example: uniform noise addition. [Figure: the prior π(x), the uniform mechanism f(z|x), and the adversary’s remapped estimate x̂.] 79

  66. Adversarial definition of privacy [Shokri et al. 2011–] • When for a given Z there are several minimizers, the function r*(x̂|z) becomes stochastic. • The user now must maximize privacy: max_f E{d_p(X̂, X)} = max_f Σ_{z,x̂,x} r*(x̂|z) f(z|x) π(x) d_p(x̂, x), which is achieved by some mechanism f*(z|x). • Privacy is defined as E{d_p(X̂, X)} after solving this maxmin problem. 80

  67. An interesting result • When d_p = d_q: the optimal mechanism reports z* = arg min_z E{d_p(z, X)}, and the optimal remapping is r*(x̂|z) = δ(x̂ − z), i.e., do nothing! • When d_p = d_q = d² (squared Euclidean distance), the following identity must hold: z = E{X | Z = z}. • When both user and adversary play optimally: Privacy = Utility Loss. 81

  68. The Utility Loss–Privacy plane. [Figure: the achievable region is bounded by the line P = UL; each adversary strategy (1–4) defines a playing line; the optimal mechanism against the optimal adversary operates on P = UL.] 85

  69. What’s wrong with priors? • Is it realistic to assume that the adversary knows the prior? • The adversary no longer plays optimally with the ‘wrong’ prior. • Shokri’s privacy definition is prior-dependent. • The definition of differential privacy is prior-independent: log(Pr{A(D1) ∈ S}) − log(Pr{A(D2) ∈ S}) ≤ ε, where: D1, D2 are two databases differing in a single element; A is a randomized algorithm; S is any subset of im(A). 86
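For a numeric query, the canonical way to satisfy this prior-independent definition is the Laplace mechanism (an addition here, not shown in the slides): add zero-mean Laplace noise with scale sensitivity/ε to the true answer. The sketch below samples the noise as the difference of two exponential variates, which is Laplace-distributed.

```python
import random

# Laplace mechanism sketch: releases a query answer with
# epsilon-differential privacy.
def laplace_mechanism(true_answer, sensitivity, epsilon):
    # Scale b = sensitivity / epsilon yields epsilon-DP for this query.
    b = sensitivity / epsilon
    # The difference of two independent Exp(1/b) variates is Laplace(0, b).
    noise = random.expovariate(1.0 / b) - random.expovariate(1.0 / b)
    return true_answer + noise

random.seed(1)
# Counting query: adding or removing one person changes the count by at
# most 1, so the sensitivity is 1.
print(laplace_mechanism(true_answer=42, sensitivity=1.0, epsilon=0.5))
```

Because the guarantee holds for every pair of neighboring databases and every output set S, no assumption about the adversary's prior is needed, which is exactly the contrast the slide draws with the Shokri-style definition.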
