an efficient reconciliation algorithm for social networks

  1. An Efficient Reconciliation Algorithm for Social Networks. Silvio Lattanzi (Google Research NY). Joint work with: Nitish Korula (Google Research NY). ICERM, Stochastic Graph Models

  2. Outline: Graph reconciliation. Model and theoretical results. Experimental results. From theory to practice. Open problems and future directions. Stochastic Graph Models, ICERM

  3. Graph reconciliation

  4. Real world motivations

  5. Real world motivations: Intra-language network

  6. Real world motivations: Intra-language network, inter-language network

  7. Real world motivations: Can we use intra-language information to improve the inter-language graph?

  14. Graph reconciliation problem: Given two networks, identify as many users as possible across them. Applications: social networks, ontology reconciliation.

  15. Previous work: The problem of reconciliation was introduced by Novak et al.

  16. Previous work: The problem of reconciliation was introduced by Novak et al. Two main approaches: - ML on user profile features (name, location, image)

  17. Previous work: The problem of reconciliation was introduced by Novak et al. Two main approaches: - ML on user profile features (name, location, image) - ML on neighborhood topology

  18. Previous work: The problem of reconciliation was introduced by Novak et al. Two main approaches: - ML on user profile features (name, location, image) - ML on neighborhood topology. Limitations:

  19. Previous work: Very rich literature on de-anonymization. Two relevant works: - Backstrom et al. propose an active and a passive attack

  25. Previous work: Very rich literature on de-anonymization. Two relevant works: - Backstrom et al. propose an active and a passive attack - Narayanan and Shmatikov: a successful de-anonymization attack

  26. Narayanan and Shmatikov experiment. Ground truth: 24,000 matchings across the two social networks

  27. Narayanan and Shmatikov experiment. Ground truth: 24,000 matchings across the two social networks. 80 me-links

  28. Narayanan and Shmatikov experiment. Ground truth: 24,000 matchings across the two social networks. 80 me-links. They could re-identify 30.8% of the mappings.

  29. Narayanan and Shmatikov experiment. Algorithm: [figure sequence, slides 29-34; node labels shown during the build: ?, then 2, then 0 1 0 2]

  35. Narayanan and Shmatikov experiment. Algorithm: Why? Is it necessary to have high-degree me-links?

  36. Abstraction. Input: two graphs and a set of trusted matchings. We want to maximize the number of final matches.

  37. Is the problem tractable? The problem is similar to graph isomorphism.

  38. Is the problem tractable? The problem is similar to graph isomorphism. It seems even harder because we want to detect similar structure.

  40. Abstraction. Formalization of the problem: Underlying social network

  41. Abstraction. Formalization of the problem: Underlying social network. Delete the edges independently (figure labels: $p_1$, $p_2$)

  42. Abstraction. Formalization of the problem: Underlying social network. Delete the edges independently (figure labels: $p_1$, $p_2$). Initial matchings
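
The generative model of slides 40-42 can be sketched in Python as follows. This is only an illustration under stated assumptions: $p_1$ and $p_2$ are read as the probabilities that an edge of the underlying network survives in each copy (which agrees with the expectations given for G(n,p) on slide 48), each node becomes an initial matching independently with probability $l$, and the names `gnp_edges` and `sample_copies` are hypothetical.

```python
import random
from itertools import combinations

def gnp_edges(n, p, seed=0):
    """Edge set of a G(n,p) sample, used here as a stand-in underlying network."""
    rng = random.Random(seed)
    return {(u, v) for u, v in combinations(range(n), 2) if rng.random() < p}

def sample_copies(nodes, edges, p1, p2, l, seed=0):
    """Return (edges1, edges2, seeds) derived from one underlying edge set.

    Assumption: p1 and p2 are per-edge survival probabilities and
    l is the per-node probability of being an initial matching.
    """
    rng = random.Random(seed)
    edges1 = {e for e in edges if rng.random() < p1}  # first observed network
    edges2 = {e for e in edges if rng.random() < p2}  # second observed network
    seeds = {u for u in nodes if rng.random() < l}    # trusted initial matchings
    return edges1, edges2, seeds
```

Both copies keep the node identifiers of the underlying network here, so the ground truth is the identity map; a reconciliation experiment would relabel one copy before running a matching algorithm on it.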

  43. Questions: Given a constant fraction of me-links, can we reconcile the entire network? If we have k me-links, what fraction of the network can we reconcile?

  44. Underlying social network: Without additional assumptions on the underlying network, the problem still seems very hard.

  45. Underlying social network: Without additional assumptions on the underlying network, the problem still seems very hard. We study two different models for social networks: - G(n,p) - Preferential attachment

  46. Our algorithm: Narayanan-Shmatikov + degree bucketing + acceptance threshold
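
The slide names three ingredients without spelling out how they combine. The Python sketch below is one plausible reading, not the authors' implementation: a candidate pair is scored by the number of already matched neighbor pairs it shares (the propagation idea), candidates are restricted to nodes whose degree is at least the current power-of-two bucket, processed from high to low, and a pair is accepted only if its score reaches `threshold` and each node is the other's best candidate. The function names, the default threshold, the number of rounds, and the tie-breaking by first-seen order are all assumptions.

```python
from collections import defaultdict

def degree_buckets(adj1, adj2):
    """Powers of two from the largest one not exceeding the max degree down to 1."""
    max_deg = max([len(nb) for nb in adj1.values()] +
                  [len(nb) for nb in adj2.values()] + [1])
    bucket = 1
    while bucket * 2 <= max_deg:
        bucket *= 2
    buckets = []
    while bucket >= 1:
        buckets.append(bucket)
        bucket //= 2
    return buckets

def reconcile(adj1, adj2, seeds, threshold=2, rounds=2):
    """adj1, adj2: dict node -> set of neighbors; seeds: dict node1 -> node2."""
    match12 = dict(seeds)
    match21 = {v: u for u, v in match12.items()}
    for _ in range(rounds):
        for min_deg in degree_buckets(adj1, adj2):
            # Score = number of matched pairs (u, v) adjacent to both a and b.
            scores = defaultdict(int)
            for u, v in match12.items():
                for a in adj1.get(u, ()):
                    if a in match12 or len(adj1.get(a, ())) < min_deg:
                        continue
                    for b in adj2.get(v, ()):
                        if b in match21 or len(adj2.get(b, ())) < min_deg:
                            continue
                        scores[(a, b)] += 1
            # Keep the best candidate above the acceptance threshold on each side.
            best1, best2 = {}, {}
            for (a, b), s in scores.items():
                if s < threshold:
                    continue
                if s > best1.get(a, (0, None))[0]:
                    best1[a] = (s, b)
                if s > best2.get(b, (0, None))[0]:
                    best2[b] = (s, a)
            # Accept only mutual best candidates.
            for a, (s, b) in best1.items():
                if best2[b][1] == a:
                    match12[a] = b
                    match21[b] = a
    return match12
```

Processing high-degree buckets first means that the nodes with the most evidence are matched early, and those matches then serve as witnesses for lower-degree nodes in later buckets.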

  47. G(n,p): Does the technique work if the underlying graph is random? (figure labels: $p_1$, $p$, $p_2$)

  48. G(n,p): Does the technique work if the underlying graph is random? (figure labels: $p_1$, $p$, $p_2$) For the two copies of the same node $u$: $E[|N_{G_1}(u) \cap N_{G_2}(u)|] = (n-1)\, p\, p_1 p_2$. For two distinct nodes $u \ne v$: $E[|N_{G_1}(u) \cap N_{G_2}(v)|] = (n-2)\, p^2 p_1 p_2$.
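
The two expectations on slide 48 can be checked numerically. The sketch below assumes that $p_1$ and $p_2$ are per-edge survival probabilities in the two copies and that the first formula refers to the two copies of one node while the second refers to two different nodes; the function names are hypothetical. The simulation only samples the edges incident to nodes 0 and 1, since no other edges affect the counts.

```python
import random

def expected_common_same(n, p, p1, p2):
    """Expected common neighbors of the two copies of one node."""
    return (n - 1) * p * p1 * p2

def expected_common_diff(n, p, p1, p2):
    """Expected common neighbors of copies of two different nodes."""
    return (n - 2) * p ** 2 * p1 * p2

def simulate(n, p, p1, p2, trials=2000, seed=0):
    """Monte Carlo estimate of both quantities for nodes 0 and 1."""
    rng = random.Random(seed)
    same = diff = 0
    for _ in range(trials):
        for w in range(2, n):
            e0 = rng.random() < p                 # edge (0, w) in the underlying graph
            e1 = rng.random() < p                 # edge (1, w) in the underlying graph
            in_g1_0 = e0 and rng.random() < p1    # (0, w) survives in G1
            in_g2_0 = e0 and rng.random() < p2    # (0, w) survives in G2
            in_g2_1 = e1 and rng.random() < p2    # (1, w) survives in G2
            same += in_g1_0 and in_g2_0
            diff += in_g1_0 and in_g2_1
        # Node 1 itself can be a common neighbor of the two copies of node 0.
        e01 = rng.random() < p
        same += e01 and rng.random() < p1 and rng.random() < p2
    return same / trials, diff / trials
```

The ratio of the two expectations is roughly $1/p$, which is the gap that a score based on common neighbors relies on when $p$ is small.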

  49. Concentration. We assume $\frac{c \log n}{n} \le p \le \frac{1}{6}$ and $l, p_1, p_2 \in O(1)$. Two cases: - $n p p_1 p_2 l \ge 24 \log n$: a Chernoff bound is enough - $n p p_1 p_2 l \le 24 \log n$: we never make an error. With $x = (n-2)\, p^2 p_1 p_2$: $P\left[\sum_{i=1}^{n} B_i \le 2\right] = (1-x)^n + n x (1-x)^{n-1} + \binom{n}{2} x^2 (1-x)^{n-2} = 1 - n^3 x^3 - o(n^3 x^3)$
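
The three-term sum on slide 49 is the probability that a sum of $n$ independent Bernoulli($x$) variables is at most 2, and it can be evaluated exactly. The short sketch below does that, treating the $B_i$ as i.i.d. Bernoulli($x$), which is the reading the three binomial terms suggest; the function name is hypothetical. By a union bound the complement, the probability of three or more successes, is at most $\binom{n}{3} x^3 \le n^3 x^3$.

```python
from math import comb

def prob_at_most_two(n, x):
    """P[Binomial(n, x) <= 2], summed term by term."""
    return sum(comb(n, k) * x ** k * (1 - x) ** (n - k) for k in range(3))

# Example: the chance of three or more successes is tiny when n * x is small.
n, x = 1000, 1e-4
tail = 1 - prob_at_most_two(n, x)
print(tail, comb(n, 3) * x ** 3)
```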

  50. More realistic model. Preferential attachment: - $G_m^1$ is a single node with $m$ self-loops - $G_m^n$ is obtained from $G_m^{n-1}$ by adding a node and $m$ edges, with endpoints chosen with probability proportional to the current degrees
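
A minimal generator for this process might look like the Python sketch below. It makes one simplifying assumption that the slide does not settle: all $m$ endpoints of a new node are drawn from the degree distribution as it stood before the node arrived, so the new node never links to itself. Other standard definitions update the degrees after every single edge. The function name is hypothetical.

```python
import random

def preferential_attachment(n, m, seed=0):
    """Edge list of a preferential attachment multigraph on nodes 0..n-1."""
    rng = random.Random(seed)
    edges = [(0, 0)] * m        # G_m^1: a single node with m self-loops
    endpoints = [0] * (2 * m)   # each node appears once per unit of degree
    for new in range(1, n):
        # Uniform sampling from the endpoint list is sampling proportional to degree.
        targets = [rng.choice(endpoints) for _ in range(m)]
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges
```

Keeping a flat list of edge endpoints is a common trick: a node of degree d occupies d slots, so one uniform draw implements the degree-proportional choice without recomputing weights.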

  51. Preferential attachment: A bit harder. - Several nodes have constant degree, so we need a cascade - The objective is to reconcile a constant fraction of the network

  52. Sketch of the proof: For high-degree nodes we can use concentration results.

  53. Sketch of the proof: For high-degree nodes we can use concentration results. Different nodes of intermediate degree do not share many neighbors.

  54. Sketch of the proof: For high-degree nodes we can use concentration results. Different nodes of intermediate degree do not share many neighbors. High-degree nodes help to detect intermediate-degree nodes, which in turn help to detect small-degree nodes.

  55. PA structural lemmas. High-degree nodes are early birds: nodes inserted after time $\phi n$, for constant $\phi$, have degree in $o(\log^2 n)$.

  56. PA structural lemmas. High-degree nodes are early birds: nodes inserted after time $\phi n$, for constant $\phi$, have degree in $o(\log^2 n)$. The rich get richer: for nodes of degree greater than $\log^2 n$, a constant fraction of their neighbors has been inserted after time $\epsilon n$, for constant $\epsilon$.

  57. PA structural lemmas. High-degree nodes are early birds: nodes inserted after time $\phi n$, for constant $\phi$, have degree in $o(\log^2 n)$. The rich get richer: for nodes of degree greater than $\log^2 n$, a constant fraction of their neighbors has been inserted after time $\epsilon n$, for constant $\epsilon$. First-mover advantage: all nodes inserted before time $n^{0.3}$ have degree at least $\log^3 n$.

  58. High-degree nodes are early birds [figure; labels: $G_m^1$, $G_m^n$]
