walk2friends: Inferring Social Links from Mobility Profiles


  1. walk2friends: Inferring Social Links from Mobility Profiles Yang Zhang joint work with Michael Backes, Mathias Humbert, and Jun Pang

  2. Location Privacy • 4 spatial-temporal points can identify 95% of the individuals • Mobility traces can be effectively de-anonymized • You are where you go • Demographics • Social relations

  3. Social Relation Privacy • Social relations can be sensitive, e.g., office romance • 17.2% -> 56.2% (Facebook users in New York) • NSA’s co-traveler program

  4. Predict whether two users are friends based on the locations they have visited

  5. Solution 1: common locations two users have visited • Almost all data mining approaches take this route • Location entropy • Cannot work when two users share no common locations
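The common-location idea on this slide can be sketched in a few lines. This is a minimal illustration, not the baselines evaluated in the talk: the `checkins` data and both function names are hypothetical, and location entropy is taken here to mean the Shannon entropy of how a location's visits are spread across users.

```python
import math
from collections import Counter

# Hypothetical check-in data: user -> list of visited location IDs
# (a repeated entry means a repeated visit).
checkins = {
    "alice": ["cafe", "gym", "cafe", "park"],
    "bob": ["cafe", "park", "mall"],
    "carol": ["mall", "mall", "library"],
}


def common_locations(a, b):
    """Return the set of locations that both users have visited."""
    return set(checkins[a]) & set(checkins[b])


def location_entropy(location):
    """Shannon entropy (in nats) of the visit distribution over users at one location."""
    visits = Counter(u for u, locs in checkins.items() for l in locs if l == location)
    total = sum(visits.values())
    return -sum((c / total) * math.log(c / total) for c in visits.values())
```

In this toy data `alice` and `carol` share no location, so any score built on `common_locations` has nothing to work with for that pair, which is the limitation the slide points out.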

  6. Solution 2: mobility profiles/features • Summarize each user’s mobility profile • Friends share more similar mobility profiles than strangers • Feature engineering • Tedious effort and domain expert knowledge, every single time! • Time consuming

  7. Representation Learning • Learning features (representation/deep learning) • Follow a general objective (unsupervised) • Graph representation learning (graph embedding) • Preserve each user’s neighbors in a social network • Mobility feature learning

  8. Assumption: A user’s mobility neighbors can reflect his mobility profile/features • Define each user’s mobility neighbors • Learn mobility features/profiles • Infer two users’ social relation

  9. Mobility Neighbors • A user’s mobility neighbors include • Locations a user has visited • Others who have visited similar locations and their locations • Breadth first search • Not considering the visiting frequencies • Random walk sampling
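Random walk sampling over a user-location graph might look like the sketch below. It assumes a bipartite graph whose edge weights are visit counts, so that frequently visited locations are sampled more often (the property breadth-first search lacks). The `visits` data, the node tagging, and `random_walk` are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

# Hypothetical check-in counts: (user, location) -> number of visits.
visits = {
    ("u1", "cafe"): 3, ("u1", "gym"): 1,
    ("u2", "cafe"): 2, ("u2", "park"): 2,
    ("u3", "park"): 1,
}

# Build a weighted bipartite graph. Nodes are tagged tuples so that user
# and location names cannot collide.
graph = defaultdict(dict)
for (user, loc), count in visits.items():
    graph[("user", user)][("loc", loc)] = count
    graph[("loc", loc)][("user", user)] = count


def random_walk(start, length, rng):
    """Sample a walk of `length` nodes, picking each step in proportion to visit counts."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


rng = random.Random(0)  # fixed seed so the sketch is reproducible
walk = random_walk(("user", "u1"), 7, rng)
```

The nodes that appear near a user inside such walks would then serve as that user's mobility neighbors.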

  10. Mobility Neighbors

  11. Feature Learning • Learn a function $\theta : U \to \mathbb{R}^d$ • Each node predicts its neighbors • Softmax • Objective: $\arg\max_\theta \prod_{v} \prod_{n \in N(v)} p(n \mid v; \theta)$, where $N(v)$ is the set of mobility neighbors of node $v$
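A softmax over dot products, in the style of skip-gram models, is one common way to define the probability that a node predicts a neighbor. The sketch below assumes that form. The `theta` vectors are hypothetical, and no training step is shown, only how the likelihood being maximized would be evaluated.

```python
import math

# Hypothetical 2-dimensional feature vectors (theta) for four nodes.
theta = {
    "u1": [0.9, 0.1],
    "u2": [0.8, 0.2],
    "u3": [-0.5, 0.7],
    "cafe": [1.0, 0.0],
}


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def softmax_prob(neighbor, node):
    """p(neighbor | node; theta) as a softmax over dot products with all nodes."""
    scores = {m: math.exp(dot(theta[m], theta[node])) for m in theta}
    return scores[neighbor] / sum(scores.values())


def log_likelihood(pairs):
    """Sum of log p(neighbor | node; theta) over (node, neighbor) pairs."""
    return sum(math.log(softmax_prob(n, v)) for v, n in pairs)
```

Training would adjust `theta` to raise `log_likelihood` over the sampled (node, neighbor) pairs. The full softmax sums over every node, so practical skip-gram implementations usually approximate it, for example with negative sampling.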

  12. Social Relation Inference • Cosine similarity • Unsupervised • Predict any social relation • [Figure: user pairs ranked by similarity score: s = 0.9, 0.8, 0.6, 0.4, 0.3, 0.2]
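Scoring user pairs by the cosine similarity of their learned features can be sketched as follows. The `features` vectors and `rank_pairs` are hypothetical; only the similarity measure itself comes from the slide.

```python
import math
from itertools import combinations

# Hypothetical learned feature vectors for three users.
features = {
    "u1": [1.0, 0.0],
    "u2": [0.8, 0.6],
    "u3": [0.0, 1.0],
}


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def rank_pairs(vectors):
    """Return all user pairs sorted by similarity, most similar first."""
    scored = [
        (cosine_similarity(vectors[a], vectors[b]), a, b)
        for a, b in combinations(sorted(vectors), 2)
    ]
    return sorted(scored, reverse=True)
```

No labels are needed to produce the ranking, which is why the approach is unsupervised and can score any pair of users.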

  13. Evaluation: dataset • Instagram users’ check-ins • New York, Los Angeles and London • Foursquare (location semantics) • Social relations (two users follow each other)

  14. Evaluation: ROC curve
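The evaluation slides report AUC, the area under the ROC curve. One way to compute it directly is as the fraction of (friend pair, stranger pair) combinations in which the friend pair gets the higher score, with ties counted as half. The sketch below does that. The scores reuse the example values from the inference slide, while the labels are made up for illustration.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Similarity scores for six user pairs, with hypothetical ground truth
# (1 = friends, 0 = strangers).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
result = auc(scores, labels)  # 8 of the 9 friend/stranger comparisons are won
```

The pairwise loop is quadratic, which is fine for a sketch. A sort-based computation would be preferable on real data.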

  15. Evaluation: distance metric • [Figure: AUC (y-axis, 0.00 to 0.80) in New York, Los Angeles, and London for each distance metric: Cosine, Euclidean, Correlation, Chebyshev, Bray-Curtis, Canberra, Manhattan]

  16. Evaluation: baseline models • [Figure: AUC (y-axis, 0.00 to 0.80) in New York, Los Angeles, and London for: Our approach, aa_ent, w_geodist, common_p, min_ent, pp, overlap_p, aa_p, diversity, w_common_p, w_frequency, min_p, personal, w_overlap_p, geodist]

  17. Evaluation: baseline models • [Figure: same axes and legend as slide 16]

  18. Evaluation: hyperparameters • [Figure: three panels of AUC (y-axis, 0.70 to 0.82) for New York, Los Angeles, and London, plotted against $l_w$ (10 to 100), $t_w$ (2 to 20), and $\log_2(d)$ (4 to 8)]

  19. Evaluation: check-in numbers • [Figure: AUC (y-axis, 0.71 to 0.83) against number of check-ins (x-axis, 5 to 30) for New York, Los Angeles, and London]

  20. Evaluation: common locations • [Figure: AUC (y-axis, 0.66 to 0.82) against number of common locations (x-axis, 0 to 4) for New York, Los Angeles, and London]

  21. Evaluation: geo-coordinates • [Figure: AUC (y-axis, 0.55 to 0.83) against grid granularity in degrees (x-axis, $10^{-3}$ to $10^{-1}$) for New York, Los Angeles, and London]

  22. Defense Mechanisms • Hiding • Delete a certain proportion of check-ins • Replacement • Use a random walk to replace locations
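The two defenses on this slide could be prototyped as below. Hiding follows the slide directly. For replacement, the slide only says a random walk is used, so the walk here (location to a random visitor, then to a random location of that visitor, repeated `steps` times) is an assumption, as are the `checkins` data and both function names.

```python
import random

# Hypothetical check-ins: list of (user, location) records.
checkins = [
    ("u1", "cafe"), ("u1", "gym"), ("u1", "park"), ("u1", "cafe"),
    ("u2", "cafe"), ("u2", "park"), ("u3", "park"), ("u3", "mall"),
]


def hide(records, proportion, rng):
    """Hiding: drop a random `proportion` of the check-in records."""
    keep = len(records) - round(len(records) * proportion)
    kept_idx = sorted(rng.sample(range(len(records)), keep))
    return [records[i] for i in kept_idx]


def replace(records, proportion, steps, rng):
    """Replacement: swap a random `proportion` of locations via a random walk."""
    visitors, visited = {}, {}
    for user, loc in records:
        visitors.setdefault(loc, []).append(user)
        visited.setdefault(user, []).append(loc)
    chosen = set(rng.sample(range(len(records)), round(len(records) * proportion)))
    out = []
    for i, (user, loc) in enumerate(records):
        if i in chosen:
            for _ in range(steps):
                # Assumed step: location -> random visitor -> one of that visitor's locations.
                loc = rng.choice(visited[rng.choice(visitors[loc])])
        out.append((user, loc))
    return out
```

Hiding shrinks the dataset, while replacement keeps the number of check-ins per user unchanged and only moves them. A walk may land back on the original location, so the effective replacement rate can be lower than `proportion`.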

  23. Defense Mechanisms • Generalization • Geo-coordinate and location semantics • MoMA -> art (40.76N, -73.97W) • Recover location first • art (40.76N, -73.97W) -> MoMA or Tom Otterness Frog?
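Generalization can be illustrated by replacing a venue with its semantic category plus the grid cell containing its coordinates. In this sketch the coordinates are illustrative values chosen to fall in the same 0.01-degree cell, and `venues` and `generalize` are hypothetical names.

```python
import math

# Hypothetical venue table: name -> (category, latitude, longitude).
# Coordinates are illustrative, not surveyed positions.
venues = {
    "MoMA": ("art", 40.7614, -73.9776),
    "Tom Otterness Frog": ("art", 40.7619, -73.9771),
}


def generalize(name, granularity):
    """Return (category, lat cell index, lon cell index) for a grid of `granularity` degrees."""
    category, lat, lon = venues[name]
    return (
        category,
        math.floor(lat / granularity),
        math.floor(lon / granularity),
    )


coarse_a = generalize("MoMA", 0.01)
coarse_b = generalize("Tom Otterness Frog", 0.01)
```

At 0.01-degree granularity both venues collapse to the same generalized record, so an attacker who first tries to recover the exact location faces the ambiguity the slide describes. A finer grid separates them again.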

  24. Utility Metric • Each user’s check-in distribution • Both original and obfuscated • Jensen-Shannon divergence • Average over all users
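The utility metric compares each user's check-in distribution before and after obfuscation. The sketch below computes the Jensen-Shannon divergence with base-2 logarithms, which bounds it between 0 and 1, and averages over users. Reporting utility as one minus that average is an assumption made here so that higher means better, and the function names are hypothetical.

```python
import math
from collections import Counter


def distribution(locations):
    """Normalize a list of visited locations into a probability distribution."""
    counts = Counter(locations)
    total = sum(counts.values())
    return {loc: c / total for loc, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence from a to the mixture m.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def utility(original, obfuscated):
    """Assumed utility: 1 minus the mean per-user divergence."""
    users = list(original)
    total = sum(
        js_divergence(distribution(original[u]), distribution(obfuscated[u]))
        for u in users
    )
    return 1.0 - total / len(users)
```

Identical distributions give a divergence of 0 and fully disjoint ones give 1, so under this convention an untouched dataset has utility 1.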

  25. Defense Evaluation • [Figure: two panels, AUC (0.52 to 0.80) and Utility (0.00 to 1.00), each plotted against the proportion of obfuscation (10% to 90%) for Hiding and for Replacement with step 5, 15, 25, and 35]

  26. Defense Evaluation

  27. Defense Evaluation • [Figure: Utility (y-axis, 0.00 to 1.00) against AUC (x-axis, 0.50 to 0.80) for Hiding, Replacement, and Generalization]

  28. Conclusion • A new social relation inference attack with mobility profiles • Learning user profiles • Unsupervised; can predict any social relation • Three general defense mechanisms • Replacement and hiding outperform generalization • Contact: yang.zhang@cispa.saarland, @yangzhangalmo
