Tagvisor: A Privacy Advisor for Sharing Hashtags

Yang Zhang. Joint work with Mathias Humbert, Tahleen Rahman, Cheng-Te Li, Jun Pang and Michael Backes.


  1. Tagvisor: A Privacy Advisor for Sharing Hashtags. Yang Zhang. Joint work with Mathias Humbert, Tahleen Rahman, Cheng-Te Li, Jun Pang and Michael Backes.

  2. #hashtag

  3. #hashtag

  4. #hashtag

  5. #hashtag

  6. #hashtag #like4like #foodporn #tbt

  7. #hashtag #privacy #locationprivacy

  8. #contributions
     • Attack: location inference with hashtags
     • Defense: Tagvisor, a privacy advisor to mitigate the privacy threat by hashtags

  9. #dataset
     • Collected through Instagram’s APIs
     • New York, Los Angeles, and London
     • Hashtags + locations (check-ins)

  10. #attack
      • Bag-of-words for feature representation, e.g., over the hashtags (#a, #b, #c, #d):
        #a#b#c -> [1, 1, 1, 0]
        #b#c -> [0, 1, 1, 0]
        #a#d -> [1, 0, 0, 1]
      • Random forest classifier
      • Multi-class classification, e.g., 498 classes (locations) in New York
      • A single classifier is trained on all posts together
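
The bag-of-words encoding on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `build_vocabulary` and `bag_of_words` are hypothetical, and the alphabetical vocabulary order is an assumption chosen so that the output matches the slide's vectors. The resulting vectors would then be passed to a multi-class classifier such as a random forest, which is not shown here.

```python
def build_vocabulary(posts):
    """Return the sorted list of distinct hashtags across all posts."""
    return sorted({tag for post in posts for tag in post})


def bag_of_words(post, vocabulary):
    """Encode one post as a binary vector: 1 if the hashtag occurs, else 0."""
    present = set(post)
    return [1 if tag in present else 0 for tag in vocabulary]


# The three example posts from the slide.
posts = [["#a", "#b", "#c"], ["#b", "#c"], ["#a", "#d"]]
vocabulary = build_vocabulary(posts)
vectors = [bag_of_words(post, vocabulary) for post in posts]
```

Each vector has one position per hashtag in the vocabulary, so its length grows with the number of distinct hashtags in the dataset.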

  11. #attack

  12. #attack

  13. #tagvisor
      • A privacy advisor for sharing hashtags
      • Fool the attacker’s location inferencer (ML classifier)
      • Three defense mechanisms
        • Hiding
        • Replacement
        • Generalization (location category)
      • Utility: preserving the semantic meaning of hashtags

  14. #hiding
      • Starting point: a successful attack on the post #a#b#c
      • Delete one hashtag (can be more):
        hide #a -> #b#c
        hide #b -> #a#c
        hide #c -> #a#b
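
Enumerating the hiding candidates from this slide might look like the sketch below. The function name `hiding_candidates` and the `max_hidden` parameter are hypothetical; the slide only states that one hashtag (or more) is deleted.

```python
from itertools import combinations


def hiding_candidates(hashtags, max_hidden=1):
    """Yield every hashtag list obtained by deleting 1..max_hidden hashtags."""
    n = len(hashtags)
    for hidden in range(1, min(max_hidden, n) + 1):
        # Choose which positions to keep; the original order is preserved.
        for keep in combinations(range(n), n - hidden):
            yield [hashtags[i] for i in keep]


candidates = list(hiding_candidates(["#a", "#b", "#c"]))
```

With `max_hidden=1` this produces the three two-hashtag posts shown on the slide. Larger values of `max_hidden` grow the candidate set combinatorially.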

  15. #utility
      • Semantic meaning
      • Skip-gram, aka word2vec
      • Skip-gram over all posts’ hashtags
      • Hashtag vectors (d1, d2): #a: [3.1, 1.3], #b: [2.5, 1.9], #c: [4.0, 5.1]
      [Figure: hashtags #a, #b, #c and hashtag sets #a#b#c, #a#c, #a#b plotted on axes d1 and d2]
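
One way to turn the hashtag vectors on this slide into a utility measure is sketched below. The slide does not say how a hashtag set is embedded or compared, so two assumptions are made here: a set is represented by the mean of its hashtag vectors, and the semantic distance is the Euclidean distance between those means (a smaller distance meaning higher utility). The names `set_vector` and `semantic_distance` are hypothetical; the vectors are the ones given on the slide.

```python
import math

# Two-dimensional hashtag vectors as given on the slide.
HASHTAG_VECTORS = {"#a": [3.1, 1.3], "#b": [2.5, 1.9], "#c": [4.0, 5.1]}


def set_vector(hashtags, vectors):
    """Mean of the hashtag vectors (assumed set representation; needs a non-empty list)."""
    dims = len(vectors[hashtags[0]])
    return [sum(vectors[tag][d] for tag in hashtags) / len(hashtags) for d in range(dims)]


def semantic_distance(original, modified, vectors):
    """Euclidean distance between the mean vectors of two hashtag lists."""
    return math.dist(set_vector(original, vectors), set_vector(modified, vectors))


original = ["#a", "#b", "#c"]
candidates = [["#b", "#c"], ["#a", "#c"], ["#a", "#b"]]
closest = min(candidates, key=lambda c: semantic_distance(original, c, HASHTAG_VECTORS))
```

Under these assumptions, hiding #b (leaving #a#c) stays closest to the original post, because #b lies between #a and #c in neither dimension strongly enough to shift the mean much.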

  16. #replacement
      • Starting point: a successful attack on the post #a#b#c
      • Replace each hashtag with every possible hashtag
        • Search space is too big
      • Restrict the candidates to the closest hashtags (with word2vec)
        • Reduces the search space
        • Semantic meaning can be preserved
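
The bounded replacement search can be sketched as below. Everything beyond the idea of "closest hashtags" is an assumption: the function names, the parameter `k`, the use of Euclidean distance (cosine similarity is also common with word2vec), and the vectors for #d and #e, which are made up for illustration. The sketch replaces only one hashtag at a time.

```python
import math


def nearest_hashtags(tag, vectors, k):
    """Return the k hashtags whose vectors are closest to the vector of `tag`."""
    others = [t for t in vectors if t != tag]
    return sorted(others, key=lambda t: math.dist(vectors[tag], vectors[t]))[:k]


def replacement_candidates(hashtags, vectors, k=2):
    """Yield posts in which one hashtag is swapped for one of its k nearest hashtags."""
    for i, tag in enumerate(hashtags):
        for substitute in nearest_hashtags(tag, vectors, k):
            if substitute not in hashtags:  # avoid duplicating a hashtag in the post
                yield hashtags[:i] + [substitute] + hashtags[i + 1:]


# #a, #b, #c use the vectors from the utility slide; #d and #e are hypothetical.
vectors = {
    "#a": [3.1, 1.3], "#b": [2.5, 1.9], "#c": [4.0, 5.1],
    "#d": [3.0, 1.5], "#e": [4.2, 5.0],
}
candidates = list(replacement_candidates(["#a", "#b", "#c"], vectors, k=1))
```

For a post with n hashtags this yields at most n × k candidates, instead of n times the full vocabulary size.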

  17. #generalization
      • Location category from Foursquare
      • #centralpark -> #park
      • Does not apply to all hashtags, e.g., #tbt, #love
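
Generalization amounts to a lookup from location-specific hashtags to their category, as in the sketch below. Only the #centralpark -> #park pair comes from the slide; the function name `generalize` and the dictionary-based lookup are assumptions, and a real mapping would have to be built from Foursquare's location categories.

```python
# Mapping from a location-specific hashtag to its location category.
# Only this one pair appears on the slide.
CATEGORY_OF = {"#centralpark": "#park"}


def generalize(hashtags, category_of):
    """Replace each hashtag that has a known category; leave the others unchanged."""
    result = []
    for tag in hashtags:
        general = category_of.get(tag, tag)
        if general not in result:  # do not emit the same hashtag twice
            result.append(general)
    return result


generalized = generalize(["#centralpark", "#tbt", "#love"], CATEGORY_OF)
```

Hashtags such as #tbt and #love have no location category, so they pass through untouched.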

  18. #tagvisor
      • Check whether the post’s location is inferred correctly
      • If not, publish the post as is
      • Otherwise, consider the three defense mechanisms
      • Pick the hashtag set with the highest utility
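
The decision procedure on this slide can be sketched as follows. The function `advise` and its parameters are hypothetical. The sketch assumes that only candidates that defeat the attacker are eligible and that the highest utility means the smallest distance to the original post. The toy `predict` and `changed_hashtags` functions are stand-ins for a trained classifier and a semantic distance.

```python
def advise(hashtags, true_location, predict, candidates, distance):
    """Return the hashtag list to publish, or None if no candidate fools the attacker."""
    if predict(hashtags) != true_location:
        return hashtags  # location is not inferred correctly: publish as is
    # Keep only the candidates for which the attacker's prediction is wrong.
    safe = [c for c in candidates if predict(c) != true_location]
    if not safe:
        return None
    # Highest utility is taken to mean smallest distance to the original post.
    return min(safe, key=lambda c: distance(hashtags, c))


def predict(hashtags):
    """Toy attacker: infers the location only from an obvious hashtag."""
    return "central_park" if "#centralpark" in hashtags else "unknown"


def changed_hashtags(original, modified):
    """Toy distance: number of hashtags that differ between the two posts."""
    return len(set(original) ^ set(modified))


post = ["#centralpark", "#tbt"]
candidates = [["#tbt"], ["#park", "#tbt"], ["#centralpark"]]
suggestion = advise(post, "central_park", predict, candidates, changed_hashtags)
```

In practice `candidates` would be the union of the outputs of the hiding, replacement, and generalization mechanisms.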

  19. #tagvisor
      • Obfuscating a bounded number of hashtags
      • Obfuscating 2 hashtags is enough!

  20. #conclusion
      • First location inference attack with hashtags
      • Sharing hashtags is not safe!!!
      • A privacy advisor to mitigate this risk
      • Minimal risk and maximal utility
      • Fit for the real-world setting
      #thankyou
      https://yangzhangalmo.github.io/
      @yangzhangalmo
