
Towards Plausible Graph Anonymization
Yang Zhang, Mathias Humbert, Bartlomiej Surma, Praveen Manoharan, Jilles Vreeken, Michael Backes



  1. Towards Plausible Graph Anonymization. Yang Zhang, Mathias Humbert, Bartlomiej Surma, Praveen Manoharan, Jilles Vreeken, Michael Backes

  2. Graph sharing

  3. Graph anonymization

  4-7. Graph anonymization [Figure, built up over four slides: example graph with eight nodes, id 1 to id 8]

  8-11. Our work
      ▪ Find a fundamental flaw in graph anonymization designs
      ▪ Exploit it to recover the original graph
      ▪ Use our findings to enhance anonymization designs
      ▪ Evaluate privacy and usability of the enhanced techniques on three real-life datasets: Enron, NO, SNAP

  12. Graph anonymization methods
      ▪ ’08 Liu et al. - k-anonymity (k-DA)
      ▪ ’08 Zhou et al. - k-anonymity (k-NA)
      ▪ ’10 Cheng et al. - k-anonymity (k-iso)
      ▪ ’11 Sala et al. - differential privacy
      ▪ ’12 Mittal et al. - random walk privacy
      ▪ ’14 Xiao et al. - differential privacy

  13-16. k-DA algorithm [Figure, built up over four slides: example graph with nodes id 1 to id 8 and its degree histogram (x-axis: node degree, y-axis: # nodes), followed by the graph and its degree histogram after 2-DA]
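
The degree histograms on the k-DA slides can be read as a check of k-degree anonymity: every degree value that occurs in the graph should be shared by at least k nodes. The following is a minimal sketch of that check only, not of the k-DA algorithm itself. The function names and the toy graph are hypothetical and are not taken from the slides.

```python
from collections import Counter

def degree_sequence(edges):
    """Return {node: degree} for an undirected edge list (isolated nodes are not listed)."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return dict(degrees)

def is_k_degree_anonymous(edges, k):
    """True if every degree value that occurs is shared by at least k nodes."""
    nodes_per_degree = Counter(degree_sequence(edges).values())
    return all(count >= k for count in nodes_per_degree.values())

# Toy graph (hypothetical): a path 1-2-3 has degrees 1, 2, 1,
# so node 2 is the only node of degree 2.
path = [(1, 2), (2, 3)]
# Adding one fake edge closes the triangle; all three nodes now have degree 2.
triangle = path + [(1, 3)]

print(is_k_degree_anonymous(path, 2))      # False
print(is_k_degree_anonymous(triangle, 2))  # True
```

In the toy example a single added edge is enough to make the graph 2-degree anonymous, which mirrors the "before" and "after 2-DA" histograms.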

  17. SalaDP algorithm [Figure: example graph (nodes id 1 to id 8) → dK-2 series → ɛ-DP → perturbed dK-2 series → anonymized graph]
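
The dK-2 series named on the SalaDP slide counts, for each pair of degrees, how many edges connect a node of one degree to a node of the other. The sketch below computes that series and adds Laplace noise to each count. It is an illustration under assumptions: the `sensitivity` parameter is hypothetical, the noise calibration used by Sala et al. is not reproduced, only degree pairs that actually occur are perturbed, and the step that rebuilds a graph from the perturbed series is left out.

```python
import random
from collections import Counter

def dk2_series(edges):
    """Return {(d1, d2): edge count} with d1 <= d2, for an undirected edge list."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    series = Counter()
    for u, v in edges:
        pair = tuple(sorted((degrees[u], degrees[v])))
        series[pair] += 1
    return dict(series)

def perturb(series, epsilon, sensitivity, rng):
    """Add Laplace(0, sensitivity / epsilon) noise to every count (naive sketch)."""
    scale = sensitivity / epsilon
    noisy = {}
    for pair, count in series.items():
        # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        noisy[pair] = count + noise
    return noisy

# Toy graph (hypothetical): a path 1-2-3-4 with degrees 1, 2, 2, 1.
toy_edges = [(1, 2), (2, 3), (3, 4)]
series = dk2_series(toy_edges)  # {(1, 2): 2, (2, 2): 1}
noisy_series = perturb(series, epsilon=10.0, sensitivity=1.0, rng=random.Random(0))
```

A smaller ɛ gives a larger noise scale, so the perturbed series drifts further from the original counts.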

  18-21. Social network graph properties [Figure, built up over four slides: example graph with eight nodes, id 1 to id 8]

  22. Graph recovery attack - overview

  23. Graph recovery attack - graph embedding
      ▪ Node embeddings with node2vec (’16 Grover and Leskovec)
      ▪ Mapping users into continuous vector space
      ▪ User’s vector reflects structural properties

  24-25. Graph recovery attack - graph embedding
      ▪ Plausibility is cosine similarity between embeddings
      [Figure: histogram of edge plausibility (x-axis, -0.2 to 1.0) against number of edges (y-axis, ×10^4) for original edges and fake edges]
      [Figure: AUC on Enron, NO and SNAP for Cosine, Euclidean, Bray-Curtis, Embeddedness, Jaccard and Adamic-Adar]
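
The plausibility measure on these slides can be sketched as follows: given one embedding vector per user, the plausibility of an edge is the cosine similarity of its two endpoint vectors. The embeddings below are toy values chosen by hand, not node2vec output, and the function names are hypothetical. The sketch assumes no embedding is the zero vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def edge_plausibility(edges, embeddings):
    """Return {edge: cosine similarity of the embeddings of its endpoints}."""
    return {
        (u, v): cosine_similarity(embeddings[u], embeddings[v])
        for u, v in edges
    }

# Toy embeddings (hypothetical): "a" and "b" point the same way, "c" is orthogonal.
embeddings = {"a": [1.0, 0.0], "b": [2.0, 0.0], "c": [0.0, 1.0]}
scores = edge_plausibility([("a", "b"), ("a", "c")], embeddings)
# The edge (a, b) scores 1.0 and the edge (a, c) scores 0.0.
```

Cosine similarity ranges from -1 to 1, which is consistent with the histogram axis starting below zero.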

  26. Graph recovery attack - graph embedding
      ▪ Find a cutoff point and remove non-plausible edges
      [Figure: the edge plausibility histogram for original edges and fake edges, annotated with F1 score]
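
The cutoff step can be sketched as a simple filter followed by an F1 computation against the known original edges. How the cutoff is chosen is not stated on the slide, so it is treated here as a plain parameter. The plausibility values, the function names and the choice to compute F1 over edge sets are assumptions made for illustration.

```python
def recover_graph(plausibility, cutoff):
    """Keep only the edges whose plausibility is at least the cutoff."""
    return {edge for edge, score in plausibility.items() if score >= cutoff}

def f1_score(recovered, original):
    """F1 of the recovered edge set measured against the original edge set."""
    true_positives = len(recovered & original)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(recovered)
    recall = true_positives / len(original)
    return 2 * precision * recall / (precision + recall)

# Toy anonymized graph (hypothetical): three original edges and two fake ones.
plausibility = {(1, 2): 0.9, (2, 3): 0.7, (3, 4): 0.6, (1, 3): 0.1, (1, 4): 0.2}
original_edges = {(1, 2), (2, 3), (3, 4)}

good_cut = recover_graph(plausibility, cutoff=0.5)    # removes exactly the fake edges
strict_cut = recover_graph(plausibility, cutoff=0.8)  # also removes two original edges
print(f1_score(good_cut, original_edges))    # 1.0
print(f1_score(strict_cut, original_edges))  # 0.5
```

A cutoff that is too strict keeps precision high but loses recall, which is why the cutoff position matters for the attack.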

  27-29. Enhancing anonymization
      ▪ Get fake edges with highest plausibility?
      ▪ The distribution will look unnatural
      ▪ Draw fake edges from same plausibility distribution?
      [Figure: k-DA (k=100) compared with enhanced k-DA (k=100)]
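
One way to read "draw fake edges from same plausibility distribution" is histogram matching: bin the plausibilities of the original edges, then pick candidate fake edges so that their bins are drawn in proportion to that histogram. The sketch below follows that reading. The binning scheme, the parameter names and the toy scores are assumptions, and the exact procedure used by the authors is not given on the slides.

```python
import random
from collections import Counter, defaultdict

def bin_index(score, num_bins):
    """Map a plausibility in [-1, 1] to one of num_bins equal-width bins."""
    position = (score + 1.0) / 2.0
    return min(int(position * num_bins), num_bins - 1)

def sample_fake_edges(original_scores, candidate_scores, num_fake, num_bins, rng):
    """Pick up to num_fake candidates, drawing bins in proportion to the original histogram."""
    histogram = Counter(bin_index(s, num_bins) for s in original_scores)
    pools = defaultdict(list)
    for edge, score in candidate_scores.items():
        pools[bin_index(score, num_bins)].append(edge)
    chosen = []
    while len(chosen) < num_fake:
        usable = [b for b in histogram if pools[b]]
        if not usable:
            break  # no candidate left in any bin occupied by original edges
        weights = [histogram[b] for b in usable]
        picked_bin = rng.choices(usable, weights=weights)[0]
        pool = pools[picked_bin]
        chosen.append(pool.pop(rng.randrange(len(pool))))
    return chosen

# Toy data (hypothetical): most original edges are highly plausible.
original_scores = [0.9, 0.8, 0.85, 0.1]
candidate_scores = {(1, 2): 0.95, (1, 3): 0.82, (2, 4): -0.9, (3, 4): 0.05}
fake_edges = sample_fake_edges(original_scores, candidate_scores,
                               num_fake=3, num_bins=4, rng=random.Random(0))
# The candidate (2, 4) is never chosen: no original edge falls into its bin.
```

Because candidates are drawn without replacement from finite pools, the match to the original histogram is only approximate when a bin runs out of candidates.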

  30. Resilience to graph recovery attack
      ▪ F1 score for original anonymizations
      k-DA drops by: 26~51%
      SalaDP drops by: 37~48%
      ▪ F1 score for enhanced anonymizations

  31. Utility of enhanced anonymization [Figure: utility of G_F (y-axis, 0.6 to 1.0) against utility of G_A (x-axis, 0.6 to 1.0) for eigencentrality, degree distribution and triangle count on Enron, NO and SNAP]

  32. Resilience to deanonymization attack [Figure: anonymity gain (%, 0 to 30) on Enron, NO and SNAP for k-DA (k = 50, 75, 100) and SalaDP (ɛ = 100, 50, 10)]

  33-36. Conclusion
      ▪ We find flaws in current graph anonymizations
      ▪ We recover the original, pre-anonymized graph
      ▪ We enhance the anonymization techniques
      ▪ We evaluate privacy and utility of enhanced anonymization
