
R-SUSCEPTIBILITY: An IR-Centric Approach to Assessing Privacy Risks

R-SUSCEPTIBILITY: An IR-Centric Approach to Assessing Privacy Risks for Users in Online Communities. Joanna Asia Biega, Krishna P. Gummadi, Ida Mele, Dragan Milchevski, Christos Tryfonopoulos, Gerhard Weikum. SIGIR 2016.


  1. R-SUSCEPTIBILITY: An IR-Centric Approach to Assessing Privacy Risks for Users in Online Communities
     Joanna Asia Biega, Krishna P. Gummadi, Ida Mele, Dragan Milchevski, Christos Tryfonopoulos, Gerhard Weikum
     SIGIR 2016


  3. APPROACHING PRIVACY
     Data publishing:
              gender  age  disease
       user1  male    37   cancer
       user2  male    37   heart disease
       user3  female  42   cancer
       Prevent deanonymization, prevent attribute disclosure
     Online communities (female, 20-30):
       Account linking, attribute inference (not in this work)

  4. PRIVACY IN ONLINE COMMUNITIES WITH TEXTUAL DATA
     Build reputation: 13.07.2011, user1: Studies show alarming depression rates among teenagers.
     Get information: 17.05.2011, user2: Should I inform my potential employer during an interview that I am 3 months pregnant?
     Share information: 13.07.2011, user3: On a cocktail of antidepressants and getting crazy hallucinations :o
     Not obvious how to apply noise
     Quantify, inform, and guide

  5. IN THE (IR) WILD
     Search:

  6. IN THE (IR) WILD
     Search:
     Great student party #sigir2016. Shouldn’t have drunk that much #wine. #drunk ;)

  7. IN THE (IR) WILD
     Search: drunk wine party
     HR

  8. IN THE (IR) WILD
     Search: drunk wine party
     HR
     user_1, user_2, user_3, user_4
     Great student party #sigir2016. Shouldn’t have drunk that much #wine. #drunk ;)

  9. IN THE (IR) WILD - MORE EXAMPLES
     Search:
     - drunk wine wasted party
     - bungee jump adrenaline
     - depressed anxiety antidepressant
     HR: remote search, local crawl
     U_1, U_2, …, U_k

  10. IN THE WILD

  11. PRIVACY RISKS VIA EXPOSURE IN A COMMUNITY
      Criterion: <topic>
      U_1, U_2, …, U_k

  12. R-SUSCEPTIBILITY
      Criterion: <topic>
      Rank-Susceptibility:
      1. U_1
      2. U_2
      …
      k. U_k
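
A minimal sketch of the ranking idea on this slide, assuming each user already has a numeric risk score for the chosen topic. The function name `r_susceptibility_ranking` and the scores are hypothetical, made up purely for illustration; they are not taken from the talk.

```python
def r_susceptibility_ranking(risk_scores):
    """Order users from most to least exposed for one topic.

    risk_scores: dict mapping a user id to a risk score (higher = more exposed).
    Ties keep the insertion order of the dict because sorted() is stable.
    """
    return sorted(risk_scores, key=lambda user: risk_scores[user], reverse=True)


# Hypothetical scores for a single topic (illustrative values only).
scores = {"U_1": 0.82, "U_2": 0.35, "U_3": 0.61}
ranking = r_susceptibility_ranking(scores)

# Rank position of each user, starting at 1.
positions = {user: rank for rank, user in enumerate(ranking, start=1)}
```

A user's R-Susceptibility for the topic is then read off as the position in `ranking` rather than the raw score.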

  13. R-SUSCEPTIBILITY: FRAMEWORK FOR TEXTUAL DATA
      (1) Topics:
          drug addiction: drug, addiction, addict, cocaine, …
          financial debts: debt, loan, pay, student, …
          depression: depression, suicide, depressed, suffer, …
      (2) Sensitivity:
          drug addiction: high
          financial debts: high
          depression: medium
      (3) Risk scores (R-Susceptibility):
          drug addiction: U_1, U_2, …, U_k
          financial debts: U_7, U_13, …, U_14
          depression: U_78, U_1, …, U_k

  14. OVERVIEW
      ➤ R-Susceptibility framework
      ➤ Topics
      ➤ Topic sensitivity
      ➤ Risk measures
      ➤ Baselines
      ➤ Topic-model-based
      ➤ Experiments
      ➤ Summary

  16. R-SUSCEPTIBILITY: TOPICS
      Topics:
          drug addiction: drug, addiction, addict, cocaine, …
          financial debts: debt, loan, pay, student, …
          depression: depression, suicide, depressed, suffer, …
      LDA:
          Quora: 500 topics, 600k posts
          NYT: 500 topics, 700k articles

  18. R-SUSCEPTIBILITY: TOPIC SENSITIVITY
      Topics:
          drug addiction: drug, addiction, addict, cocaine, …
          financial debts: debt, loan, pay, student, …
          depression: depression, suicide, depressed, suffer, …
      Sensitivity: ?
          drug addiction: high
          financial debts: high
          depression: medium

  19. IDENTIFYING SENSITIVE TOPICS
      Topics:
          drug addiction: drug, addiction, addict, cocaine, …
          financial debts: debt, loan, pay, student, …
          depression: depression, suicide, depressed, suffer, …
      Question: If a user’s post in an online community contained these words, would you consider it privacy sensitive?
      Sensitivity:

  20. IDENTIFYING SENSITIVE TOPICS
      Topics:
          drug addiction: drug, addiction, addict, cocaine, …
          financial debts: debt, loan, pay, student, …
          depression: depression, suicide, depressed, suffer, …
      Judgements (2 topic models * 500 topics * 7 judgements per topic):
          yes yes no
          yes no no
          yes yes no
      Sensitivity: # yes / #
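
A minimal sketch of how a per-topic sensitivity score could be computed from the crowd judgements. The slide's formula is cut off after "# yes / #"; the sketch assumes the denominator is the total number of judgements for the topic. The function name `topic_sensitivity` and the sample judgements are hypothetical.

```python
def topic_sensitivity(judgements):
    """Fraction of 'yes' answers among the judgements collected for one topic.

    Assumption: the denominator is the total number of judgements,
    which the slide's truncated formula does not state explicitly.
    """
    if not judgements:
        raise ValueError("at least one judgement is required")
    yes_count = sum(1 for answer in judgements if answer == "yes")
    return yes_count / len(judgements)


# Seven made-up judgements for one topic, matching the slide's
# "7 judgements per topic" setup (values are illustrative only).
sample = ["yes", "yes", "no", "yes", "no", "yes", "yes"]
sensitivity = topic_sensitivity(sample)
```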

  22. ENTROPY BASELINE
      Salient attributes for topic X: X1 = depression, X2 = anxiety, X3 = psychiatrist, X4 = paxil
      P(0), P(1) per attribute, for U (community without user U') and U* (community with user U')
      Average KL-divergence of salient word distributions:
      $\mathrm{risk}(U', X) = \frac{1}{j} \sum_{i} \sum_{v \in \{0, 1\}} P_{U^*}[x_i = v] \, \log \frac{P_U[x_i = v]}{P_{U^*}[x_i = v]}$
      (sum over i: over attributes; sum over v: over values)
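
A minimal sketch of the entropy baseline, under several assumptions that are not in the slides: each user is represented as a set of words, P[x_i = 1] is estimated as the smoothed share of users whose posts contain the salient word, and the ratio inside the logarithm is oriented so that the result is the non-negative KL divergence the slide names. All function names and the toy community are hypothetical.

```python
import math


def word_probability(community, word, alpha=1.0):
    """Smoothed estimate of P[x_i = 1]: share of users whose posts contain `word`.

    Add-alpha smoothing is an assumption of this sketch; it avoids log(0).
    """
    count = sum(1 for words in community if word in words)
    return (count + alpha) / (len(community) + 2 * alpha)


def entropy_risk(others, user_words, salient_words):
    """Average KL divergence between the community with and without the user."""
    with_user = others + [user_words]
    total = 0.0
    for word in salient_words:
        p1 = word_probability(others, word)     # P_U[x_i = 1], user excluded
        q1 = word_probability(with_user, word)  # P_U*[x_i = 1], user included
        for p, q in ((p1, q1), (1.0 - p1, 1.0 - q1)):  # v = 1, then v = 0
            total += q * math.log(q / p)
    return total / len(salient_words)


# Toy community: ten users, one of whom mentions "depression".
others = [set() for _ in range(9)] + [{"depression"}]
salient = ["depression", "anxiety"]
heavy = entropy_risk(others, {"depression", "anxiety"}, salient)
light = entropy_risk(others, {"cooking"}, salient)
```

In this toy setting the user who writes about both salient words shifts the community distributions more than the user who writes about neither, so `heavy` exceeds `light`.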

  23. DIFF-PRIV BASELINE
      Salient attributes for topic X: X1 = depression, X2 = anxiety, X3 = psychiatrist, X4 = paxil
      P(0), P(1) per attribute, for U (community without user U') and U* (community with user U')
      Inspired by the differential privacy principle:
      $P_U[x_i] \le 2^{\epsilon} P_{U^*}[x_i]$ and $P_{U^*}[x_i] \le 2^{\epsilon} P_U[x_i]$
      $\mathrm{risk}(U', X) = \max_{x_i} \max\left( \log \frac{P_U[x_i]}{P_{U^*}[x_i]}, \; \log \frac{P_{U^*}[x_i]}{P_U[x_i]} \right)$
      (outer max: over attributes; inner max: probability increases or decreases)
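
A minimal sketch of the diff-priv baseline, assuming the per-attribute probabilities for the community without and with the user have already been estimated and are strictly positive. The base-2 logarithm is an assumption chosen to match the factor-of-2 bounds on the slide; the function name and the probability values are hypothetical.

```python
import math


def diff_priv_risk(p_without, p_with):
    """Largest log-ratio change, in either direction, over all attributes.

    p_without: dict attribute -> P_U[x_i]   (community without the user)
    p_with:    dict attribute -> P_U*[x_i]  (community with the user)
    Taking the inner max of log(a/b) and log(b/a) equals |log(a/b)|.
    """
    return max(
        abs(math.log2(p_without[attr] / p_with[attr]))
        for attr in p_without
    )


# Illustrative probabilities only: adding the user doubles the share
# of "depression" mentions and leaves "anxiety" unchanged.
p_without = {"depression": 0.10, "anxiety": 0.05}
p_with = {"depression": 0.20, "anxiety": 0.05}
risk = diff_priv_risk(p_without, p_with)
```

With base-2 logarithms the score can be read as the smallest exponent for which both bounds on the slide hold for every attribute.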

  24. OVERVIEW
      ➤ R-Susceptibility framework
      ➤ Topics
      ➤ Topic sensitivity
      ➤ Risk measures
      ➤ Baselines
      ➤ Topic-model-based
        ➤ Strength of interest
        ➤ Breadth of interest
        ➤ Temporal variation of interest
        Which aspects matter when it comes to human risk perception?
      ➤ Experiments
      ➤ Summary

  25. TOPIC-MODEL RISK SCORE: BUILDING BLOCKS
      Words: antidepressant, depression, psychiatrist, oscar, celebrity
      R-Susceptibility topic model

  26. TOPIC-MODEL RISK SCORE: BUILDING BLOCKS
      Quantifying user interest in a topic
      Words: antidepressant, depression, psychiatrist, oscar, celebrity
      Vectors: X = Depression, U = User
      R-Susceptibility topic model
      Details in the paper

  27. TOPIC-MODEL RISK SCORE: STRENGTH OF INTEREST
      24.10.2012 misbehaving dog           24.10.2012 anxiety
      24.10.2012 dog trainers              24.10.2012 feeling lonely
      27.10.2012 dentists LA               27.10.2012 psychiatrist nyc
      29.10.2012 knitting tutorial         29.10.2012 central park events
      03.12.2012 christmas tree shop LA    03.12.2012 antidepressants
      10.12.2012 christmas recipes         10.12.2012 xanax side effects

  29. TOPIC-MODEL RISK SCORE: STRENGTH OF INTEREST
      Figure: Three dimensions of user interest (axis: Strength of interest; label: Ranking position; words: depression, xanax, psychiatrist)
      $\mathrm{risk}(U, X) = \cos(\vec{U}, \vec{X})$
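
A minimal sketch of the strength-of-interest score as a cosine similarity between a user vector and a topic vector. The slides defer the construction of these vectors to the paper, so the sparse word-weight dictionaries used here are an assumption, and all names and weights are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts (key -> weight)."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention of this sketch for empty vectors
    return dot / (norm_u * norm_v)


def strength_risk(user_vec, topic_vec):
    """risk(U, X) = cos(U, X)."""
    return cosine(user_vec, topic_vec)


# Illustrative vectors only.
topic_x = {"depression": 1.0, "xanax": 1.0, "psychiatrist": 1.0}
user_a = {"depression": 2.0, "xanax": 1.0, "psychiatrist": 1.0}
user_b = {"oscar": 1.0, "celebrity": 3.0}
risk_a = strength_risk(user_a, topic_x)
risk_b = strength_risk(user_b, topic_x)
```

Here `user_a`, whose content overlaps the topic, scores close to 1, while `user_b`, who shares no dimension with the topic, scores 0.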

  30. TOPIC-MODEL RISK SCORE: BREADTH OF INTEREST
      24.10.2012 anxiety               24.10.2012 anxiety
      24.10.2012 feeling lonely        24.10.2012 clinical depression
      03.11.2012 psychiatrist nyc      03.11.2012 anatomy course book
      07.11.2012 central park events   07.11.2012 central park events
      03.12.2012 antidepressants       03.12.2012 liver cancer stats
      10.12.2012 xanax side effects    10.12.2012 anorexia nervosa

  32. TOPIC-MODEL RISK SCORE: BUILDING BLOCKS REVISITED
      Quantifying user interest in a topic
      Words: antidepressant, depression, psychiatrist, oscar, celebrity
      Vectors: X = Depression, U = User, D = Psychiatry
      R-Susceptibility topic model
      Details in the paper

  33. TOPIC-MODEL RISK SCORE: BREADTH OF INTEREST
      Figure: Three dimensions of user interest (axes: Strength of interest, Breadth of interest; label: Ranking position; words: depression, research career, psychiatry, depression, xanax, psychiatrist)
      $\mathrm{risk}(U, X, D) = \cos(\vec{U}, \vec{X}) - \cos(\vec{U}, \vec{D} - \vec{X})$
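
A minimal sketch of the breadth-aware score, which penalises similarity to the part of the broader domain D that lies outside the topic X. As before, the sparse dictionary vectors are an assumption (the slides defer the vector construction to the paper), and all names and weights are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts (key -> weight)."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention of this sketch for empty vectors
    return dot / (norm_u * norm_v)


def subtract(d, x):
    """Component-wise difference D - X over the union of keys."""
    return {key: d.get(key, 0.0) - x.get(key, 0.0) for key in set(d) | set(x)}


def breadth_risk(user_vec, topic_vec, domain_vec):
    """risk(U, X, D) = cos(U, X) - cos(U, D - X)."""
    return cosine(user_vec, topic_vec) - cosine(user_vec, subtract(domain_vec, topic_vec))


# Illustrative vectors only: the domain contains the topic plus two extra dimensions.
topic_x = {"depression": 1.0, "xanax": 1.0, "psychiatrist": 1.0}
domain_d = {"depression": 1.0, "xanax": 1.0, "psychiatrist": 1.0,
            "research": 1.0, "career": 1.0}
focused = {"depression": 2.0, "xanax": 1.0, "psychiatrist": 1.0}
broad = {"depression": 1.0, "research": 2.0, "career": 2.0}
risk_focused = breadth_risk(focused, topic_x, domain_d)
risk_broad = breadth_risk(broad, topic_x, domain_d)
```

The user whose content is concentrated on the topic keeps a high score, while the user whose interest spreads across the rest of the domain is pushed down, which is the intuition the slide's research-career example points at.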

  34. TOPIC-MODEL RISK SCORE: TEMPORAL VARIATION OF INTEREST
      24.10.2012 anxiety               24.10.2012 anxiety
      24.10.2012 feeling lonely        24.10.2012 feeling lonely
      03.11.2012 psychiatrist nyc      24.10.2012 psychiatrist nyc
      07.11.2012 central park events   24.10.2012 central park events
      03.12.2012 antidepressants       24.10.2012 antidepressants
      10.12.2012 xanax side effects    24.10.2012 xanax side effects

  36. TOPIC-MODEL RISK SCORE: TEMPORAL VARIATION OF INTEREST
      Posts along a time axis:
      $U = \{ (v_1, t_1), \ldots, (v_k, t_k) \}$
