
Maximizing the Spread of Influence through a Social Network



  1. Maximizing the Spread of Influence through a Social Network. Han Wang, Department of Computer Science, ETH Zürich

  2. Problem Example 1: Spread of a Rumor. Rumor: "2012 = end!" [Figure: network with nodes A to F]

  3. Problem Example 2: Viral Marketing. Message: "ezPad 1 beats iPad 3" [Figure: network with nodes A to F]

  4. Problem Definition. G: a social network (n nodes). Model: the spread process. S: the initially active subset (k seeds). $\sigma(S)$: the number of finally active nodes (the achievement). Task: choose $S^*$. Goal: $\sigma(S^*) = \max_{|S| = k} \sigma(S)$, which is NP-hard. Realistic goal: approximate the maximum with a guarantee, i.e. choose S such that $\sigma(S) \ge r \cdot \sigma(S^*)$.

  5. Contents in This Talk. The problem definition again, annotated with the plan of the talk. Model (spread process): two models. Goal $\sigma(S^*) = \max_{|S| = k} \sigma(S)$: prove that it is NP-hard. Realistic goal, choose S such that $\sigma(S) \ge r \cdot \sigma(S^*)$: prove the approximation guarantee.

  6. Model 1: Independent Cascade Model

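  7. Model 1: Cascade Model. Each active node u tries to activate each of its neighbors v. The attempt succeeds with probability $p_{u,v}$ and fails with probability $1 - p_{u,v}$. There is only a single chance per neighbor. Example: $p_{C,D} = 0.2$, $p_{C,E} = 0.8$, $p_{C,F} = 0.6$. [Figure: node C with neighbors D, E, F]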

  8. Model 1: Cascade Model. [Figure: example network with nodes A to F and edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]

  9. Model 1: Cascade Model. Example run: $S = \{A, C\}$, $\sigma(S) = 5$. [Figure: example network with nodes A to F and edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]
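
The cascade process on slides 7 to 9 can be sketched as a short simulation. This is a minimal illustration, not the speaker's code: the function name `simulate_cascade`, the dictionary layout, and the toy graph are assumptions, because the edge list of the slide's A to F network cannot be recovered from the transcript.

```python
import random

def simulate_cascade(graph, seeds, rng):
    """Run one Independent Cascade trial and return the finally active nodes.

    graph maps each node u to a dict {v: p_uv} of activation probabilities.
    """
    active = set(seeds)
    frontier = list(seeds)  # nodes that became active in the previous step
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # u gets a single chance to activate each inactive neighbor v
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

# Hypothetical toy graph (NOT the network drawn on the slides).
# Probabilities 1.0 and 0.0 make this particular run deterministic.
toy_graph = {
    "A": {"B": 1.0},
    "B": {"C": 0.0},
    "C": {"D": 1.0},
}
result = simulate_cascade(toy_graph, {"A"}, random.Random(0))
print(sorted(result))
```

Because every newly active node enters the frontier exactly once, each edge's coin is flipped at most once, which matches the "only a single chance" rule.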

  10. Model 2: Linear Threshold Model

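  11. Model 2: Threshold Model. Each inactive node v picks a random threshold $\theta_v \in [0, 1]$. Activation condition: $\sum_{u:\ \text{active neighbor of } v} b_{u,v} \ge \theta_v$. Example with $\theta_D = 0.3$, $b_{C,D} = 0.2$, $b_{E,D} = 0.7$. Iteration 2: 0.2 < 0.3, so D stays inactive. Iteration 4: E becomes active. Iteration 5: 0.2 + 0.7 > 0.3, so D becomes active. [Figure: nodes C, D, E]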

  12. Model 2: Threshold Model. Iterations 1 and 2. [Figure: example network with nodes A to F, edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6, and node thresholds $\theta$ = 0.3, 0.6, 0.5, 0.9]

  13. Model 2: Threshold Model. Example run: $S = \{A, C\}$, $\sigma(S) = 4$. [Figure: example network with nodes A to F and edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]
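
The activation rule on slide 11 can be sketched as follows. The weights into D and the threshold of D are the ones shown on that slide; the incoming weight and threshold of E, the function name, and the data layout are assumptions made only for illustration. Thresholds are passed in explicitly so the run is reproducible; in the model itself each one is drawn uniformly from [0, 1].

```python
def simulate_threshold(in_weights, seeds, thresholds):
    """Run the Linear Threshold process until no further node activates.

    in_weights maps each node v to a dict {u: b_uv} of incoming weights.
    Nodes are updated one at a time; since activation is monotone, the
    final active set is the same as with synchronous iterations.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, weights in in_weights.items():
            if v in active:
                continue
            total = sum(b for u, b in weights.items() if u in active)
            if total >= thresholds[v]:
                active.add(v)
                changed = True
    return active

# b_{C,D} = 0.2, b_{E,D} = 0.7 and theta_D = 0.3 come from slide 11.
# The weight b_{C,E} = 0.8 and theta_E = 0.5 are hypothetical.
in_weights = {"D": {"C": 0.2, "E": 0.7}, "E": {"C": 0.8}}
thresholds = {"D": 0.3, "E": 0.5}
result = simulate_threshold(in_weights, {"C"}, thresholds)
print(sorted(result))
```

With these numbers D cannot activate from C alone (0.2 < 0.3), E activates first, and D follows once the weight from E is added.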

  14. How to Prove the Guarantee? Given a spread model, find S such that $\sigma(S) \ge r \cdot \sigma(S^*)$: how (???). Known result (Nemhauser): for a non-negative, monotone, submodular $f(S)$, find S such that $f(S) \ge (1 - 1/e) \cdot f(S^*)$.

  15. Submodularity. $U$: a finite ground set. $P(U)$: the power set of $U$. $f(\cdot): P(U) \to \mathbb{R}^*$. Submodularity: $\forall$ node $v$, $\forall S \subseteq T$: $f(S \cup \{v\}) - f(S) \ge f(T \cup \{v\}) - f(T)$.

  16. Example: Submodularity. $f(S)$: the number of vertices reachable from the vertices in S. [Figure: two copies of a graph with nodes A, B, C, D and an added node v]
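
The diminishing-returns inequality can be checked by brute force on a small instance of the reachability function from slide 16. The graph below is hypothetical (the slide's figure is not recoverable), and `is_submodular` is an illustrative helper that enumerates every pair S ⊆ T, so it only suits tiny ground sets.

```python
from itertools import combinations

def reachable(graph, sources):
    """Return all vertices reachable from the given sources (sources included)."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def f(graph, S):
    """f(S): number of vertices reachable from the vertices in S."""
    return len(reachable(graph, S))

def is_submodular(graph, nodes):
    """Check f(S + v) - f(S) >= f(T + v) - f(T) for all S subset of T and all v."""
    subsets = [set(c) for r in range(len(nodes) + 1)
               for c in combinations(nodes, r)]
    for T in subsets:
        for S in subsets:
            if not S <= T:
                continue
            for v in nodes:
                gain_S = f(graph, S | {v}) - f(graph, S)
                gain_T = f(graph, T | {v}) - f(graph, T)
                if gain_S < gain_T:
                    return False
    return True

# Hypothetical directed graph, not the one on the slide.
toy_graph = {"A": ["B"], "v": ["B", "C"], "C": ["D"]}
nodes = ["A", "B", "C", "D", "v"]
gain_small = f(toy_graph, {"v"}) - f(toy_graph, set())            # S = {}
gain_large = f(toy_graph, {"A", "C", "v"}) - f(toy_graph, {"A", "C"})  # T = {A, C}
print(gain_small, gain_large, is_submodular(toy_graph, nodes))
```

Adding v to the empty set gains more than adding v to {A, C}, because most of what v reaches is already covered by the larger set.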

  17. How to Prove the Guarantee? Given a spread model, find S such that $\sigma(S) \ge r \cdot \sigma(S^*)$. Prove: $\sigma(S)$ is submodular. Then apply the known result (Nemhauser): for a non-negative, monotone, submodular $f(S)$, find S such that $f(S) \ge (1 - 1/e) \cdot f(S^*)$.

  18. We Want to Prove… For each model (Independent Cascade, Linear Threshold): the problem is NP-hard, and $\sigma(S)$ is submodular.

  19. Prove: Submodularity Cascade Model

  20. Submodularity (Cascade Model). Recall: each activation attempt is a coin flip. [Figure: example network with nodes A to F and edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]

  21. Submodularity (Cascade Model). Why not flip all the coins in the beginning? [Figure: example network with nodes A to F and edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]

  22. Submodularity (Cascade Model). Live edges, live paths, blocked edges. [Figure: example network with nodes A to F and edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]

  23. Simplified Cascade Model. Node v ends up active if and only if there is a live path from some seed to v.

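  24. Achievement (Simplified Model). X: a coin-flipping outcome, e.g. X1, X2. $R_X(v)$: the set of nodes reachable from v under X. Examples: $R_{X_1}(A) = \{A, B\}$, $R_{X_1}(C) = \{C, D, E\}$. $\sigma_X(S) = \left| \bigcup_{v \in S} R_X(v) \right|$, e.g. $\sigma_{X_1}(\{A, C\}) = |\{A, B, C, D, E\}| = 5$. [Figure: the network A to F under two coin-flipping outcomes]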

  25. Submodularity (Cascade Model). Fix X: $\sigma_X(S)$ is submodular. A non-negative linear combination of submodular functions is still submodular: $\sigma(S) = \sum_X \mathrm{Prob}[X] \cdot \sigma_X(S)$.

  26. Summary of the proof. Active = has a live path, so $\sigma_X(S)$ is submodular, so $\sigma(S)$ is submodular.
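
The "flip all the coins in the beginning" view also gives a simple way to estimate the achievement: sample outcomes X, count the nodes that have a live path from a seed, and average. The sketch below assumes a hypothetical toy graph and hypothetical function names; it estimates the expectation by sampling rather than summing over every outcome.

```python
import random

def sample_live_edges(graph, rng):
    """Flip every edge's coin once: keep edge (u, v) with probability p_uv."""
    return {
        u: [v for v, p in nbrs.items() if rng.random() < p]
        for u, nbrs in graph.items()
    }

def reachable(live, seeds):
    """Nodes connected to some seed by a live path (seeds included)."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        u = stack.pop()
        for v in live.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def estimate_sigma(graph, seeds, trials, rng):
    """Average sigma_X(S) over sampled outcomes X to approximate sigma(S)."""
    total = 0
    for _ in range(trials):
        live = sample_live_edges(graph, rng)
        total += len(reachable(live, seeds))
    return total / trials

# Hypothetical graph: from seeds {A, C}, B is reached with probability 0.5,
# D always, E never, so the exact expectation is 2 + 0.5 + 1 + 0 = 3.5.
toy_graph = {"A": {"B": 0.5}, "C": {"D": 1.0, "E": 0.0}}
estimate = estimate_sigma(toy_graph, {"A", "C"}, 2000, random.Random(1))
print(estimate)
```

For a fixed sample the count is a plain reachability function, which is the submodular $\sigma_X(S)$ from slide 25.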

  27. Prove: NP-hard Simplified Cascade Model

  28. NP-Hard (Cascade Model). Set Cover Problem: do k subsets cover all elements? k = 1: No. k = 2: No. k = 3: Yes. k = 4: …

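  29. NP-Hard (Cascade Model). Solving influence maximization solves Set Cover. Q (influence maximization): $|S| = 2$, $\sigma(S) \ge 2 + 5$? Q (Set Cover): do 2 subsets cover all elements? [Figure: subsets S1, S2, S3 over elements A to E, and the corresponding network]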

  30. NP-Hard (Cascade Model) Influence Maximization Problem is at least as difficult as Set Cover Problem
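
The reduction on slides 28 to 30 can be sketched under one stated assumption: each subset becomes a node with probability-1 edges to its elements, so the cascade is deterministic and k subsets cover all n elements exactly when some k seeds reach k + n nodes. The instance below is hypothetical (the slide's subset memberships are not recoverable), and the brute-force search only illustrates the equivalence.

```python
from itertools import combinations

def build_influence_graph(subsets):
    """One node per subset, with a probability-1 edge to each of its elements."""
    return {name: set(elems) for name, elems in subsets.items()}

def sigma(graph, seeds):
    """Deterministic spread: the seeds plus every element they point to."""
    active = set(seeds)
    for s in seeds:
        active |= graph.get(s, set())
    return len(active)

def has_cover_via_influence(subsets, k):
    """Answer Set Cover by asking: is there S with |S| = k and sigma(S) >= k + n?"""
    universe = set().union(*subsets.values())
    graph = build_influence_graph(subsets)
    target = k + len(universe)
    # Only subset nodes are tried as seeds: an element node has no outgoing
    # edges, so seeding it can never help reach k + n active nodes.
    return any(sigma(graph, seeds) >= target
               for seeds in combinations(graph, k))

# Hypothetical instance with 3 subsets over 5 elements.
subsets = {"S1": {"A", "B", "C"}, "S2": {"C", "D"}, "S3": {"D", "E"}}
print(has_cover_via_influence(subsets, 1), has_cover_via_influence(subsets, 2))
```

In this instance S1 and S3 together cover all five elements, so the question "|S| = 2, σ(S) ≥ 2 + 5?" is answered yes.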

  31. Prove: Submodularity Linear Threshold Model

  32. Recall: Threshold Model. [Figure: example network with nodes A to F, edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6, and node thresholds $\theta$ = 0.3, 0.6, 0.5, 0.9]

  33. Gamble: Roulette

  34. Gamble: Roulette. [Figure: node v with in-neighbors N1 to N6 and edge weights 0.2, 0.14, 0.15, 0.1, 0.07, 0.23; a roulette wheel with one slot per in-neighbor plus a "None" slot; $\theta = 0.4$]
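
The roulette on slides 33 and 34 can be read as follows: node v selects at most one incoming edge, picking the edge from u with probability $b_{u,v}$ and no edge with the remaining probability. A minimal sketch of one spin, assuming the weights into v sum to at most 1; the function name and the example weights are hypothetical, not the assignment shown in the figure.

```python
import random

def pick_live_in_edge(in_weights, rng):
    """Spin the roulette once for a node.

    in_weights maps each in-neighbor u to its weight b_uv (sum <= 1).
    Returns the chosen in-neighbor, or None when the "None" slot comes up.
    """
    r = rng.random()
    cumulative = 0.0
    for u, b in in_weights.items():
        cumulative += b
        if r < cumulative:
            return u
    return None

# Hypothetical weights: N1 with 0.5, N2 with 0.25, "None" with the rest.
rng = random.Random(0)
spins = [pick_live_in_edge({"N1": 0.5, "N2": 0.25}, rng) for _ in range(10000)]
share_n1 = spins.count("N1") / len(spins)
share_none = spins.count(None) / len(spins)
print(share_n1, share_none)
```

Spinning once per node yields the live edges of the simplified threshold model; a node is then active when a live path leads to it from a seed.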

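  35. Submodularity (Threshold Model). [Figure: example network with nodes A to F, edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6, node thresholds $\theta$ = 0.3, 0.6, 0.5, 0.9, and a roulette at each node whose slots are labelled with in-neighbors or "None"]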

  36. Submodularity (Threshold Model). Live edges, live paths. [Figure: example network with nodes A to F, edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6, and node thresholds $\theta$ = 0.3, 0.6, 0.5, 0.9]

  37. Correctness of Simplification. For node v: $P(\text{active in iteration } t+1 \mid \text{inactive in iterations} \le t) = \dfrac{P(\text{active in iteration } t+1)}{P(\text{inactive in iterations} \le t)}$

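  38. Simplified Model. [Figure: roulette for node v with in-neighbors N1 to N6 and a "None" slot, edge weights 0.2, 0.14, 0.15, 0.1, 0.07, 0.23; the legend distinguishes in-neighbors active before iteration 5 from those that become active in iteration 5]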

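  39. Simplified Model. $A_t$: the nodes becoming active in iteration t. Probability that v becomes active next: $\dfrac{\sum_{u \in A_t} b_{u,v}}{1 - \sum_{u \in A_1 \cup A_2 \cup \dots \cup A_{t-1}} b_{u,v}}$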

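  40. Original Model. [Figure: node v with in-neighbors N1 to N6 and edge weights 0.2, 0.14, 0.15, 0.1, 0.07, 0.23; the in-neighbors are listed in the order N2, N6, N4, N3, N1, N5, followed by "None"]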

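  41. Original Model. $A_t$: the nodes becoming active in iteration t. Probability that v becomes active next: $\dfrac{\sum_{u \in A_t} b_{u,v}}{1 - \sum_{u \in A_1 \cup A_2 \cup \dots \cup A_{t-1}} b_{u,v}}$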
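
A short derivation of why slides 39 and 41 arrive at the same ratio. The shorthand $W_t$ is introduced here only for this sketch and is not on the slides; it assumes $W_{t-1} < 1$ and that $\theta_v$ is uniform on $[0, 1]$.

```latex
% W_t: total weight into v from all nodes active after iteration t (assumed shorthand)
W_t = \sum_{u \in A_1 \cup \dots \cup A_t} b_{u,v}

% Original model: v is still inactive exactly when \theta_v > W_{t-1},
% and \theta_v is then uniform on (W_{t-1}, 1]
\Pr[\theta_v \le W_t \mid \theta_v > W_{t-1}]
  = \frac{W_t - W_{t-1}}{1 - W_{t-1}}
  = \frac{\sum_{u \in A_t} b_{u,v}}
         {1 - \sum_{u \in A_1 \cup \dots \cup A_{t-1}} b_{u,v}}

% Simplified model: v's single live edge does not come from A_1, ..., A_{t-1}
% (probability 1 - W_{t-1}) and does come from A_t (probability W_t - W_{t-1})
\Pr[\text{live edge from } A_t \mid \text{no live edge from } A_1 \cup \dots \cup A_{t-1}]
  = \frac{\sum_{u \in A_t} b_{u,v}}
         {1 - \sum_{u \in A_1 \cup \dots \cup A_{t-1}} b_{u,v}}
```

Both conditional probabilities agree at every iteration, which is the sense in which the roulette version is a correct simplification.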

  42. Simplified Threshold Model. Node v ends up active if and only if there is a live path from some seed to v.

  43. Similarly, we have… Active = has a live path, so $\sigma_X(S)$ is submodular, so $\sigma(S)$ is submodular.

  44. Prove: NP-hard Linear Threshold Model

  45. NP-Hard (Threshold Model). Vertex Cover Problem: find k vertices (a set S) such that each edge is incident to at least one vertex in S.

  46. NP-Hard (Threshold Model). Solving influence maximization solves Vertex Cover. Q (influence maximization): $|S| = 3$, $\sigma(S) = 6$? Q (Vertex Cover): do 3 vertices cover all edges? [Figure: two copies of the network with nodes A to F]

  47. Influence Maximization. Q: $|S| = 2$, $\sigma(S) = 6$? Q: $|S| = 3$, $\sigma(S) = 6$? [Figure: two copies of the network with nodes A to F]

  48. NP-Hard (Threshold Model) Influence Maximization Problem is at least as difficult as Vertex Cover Problem
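
One way to see the link between Vertex Cover and the threshold model is the sketch below. It rests on assumptions that are not stated on the slides: every node weights its neighbors equally with $b_{u,v} = 1/\deg(v)$, all thresholds are set to the worst case $\theta_v = 1$, and the graph has no isolated nodes. Under those assumptions the seeds activate all n nodes exactly when they form a vertex cover. The graph is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def build_adjacency(edges):
    """Undirected graph as {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def worst_case_spread(adj, seeds):
    """Threshold-model spread with b_uv = 1/deg(v) and every threshold equal to 1."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, nbrs in adj.items():
            if v in active:
                continue
            # Exact arithmetic avoids rounding errors in sums of 1/deg(v).
            weight = Fraction(len(nbrs & active), len(nbrs))
            if weight >= 1:
                active.add(v)
                changed = True
    return len(active)

def is_vertex_cover(adj, seeds):
    """True if every edge has at least one endpoint among the seeds."""
    return all(u in seeds or v in seeds for u in adj for v in adj[u])

# Hypothetical 6-node graph (NOT the network drawn on the slides).
adj = build_adjacency([("A", "B"), ("A", "C"), ("C", "D"), ("C", "E"), ("E", "F")])
pairs_reaching_all = [s for s in combinations(sorted(adj), 2)
                      if worst_case_spread(adj, set(s)) == len(adj)]
print(pairs_reaching_all, worst_case_spread(adj, {"A", "C", "E"}))
```

In this graph no pair of seeds reaches all six nodes, while the vertex cover {A, C, E} does, mirroring the two questions on slide 47.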

  49. End of Proofs. Influence Maximization Problem: for both models (Independent Cascade, Linear Threshold), the problem is NP-hard and $\sigma(S)$ is submodular.

  50. Initial Problem. Given a spread model, find S such that $\sigma(S) \ge (1 - 1/e - \varepsilon) \cdot \sigma(S^*)$. Prove: $\sigma(S)$ is submodular. For a non-negative, monotone, submodular $f(S)$, find S such that $f(S) \ge (1 - 1/e) \cdot f(S^*)$ by greedy hill climbing (maximize the marginal gain): $\max_v \, f(S \cup \{v\}) - f(S)$.
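
Greedy hill climbing as named on slide 50 can be sketched in a few lines. This is an illustration under assumptions: `f` is any set function supplied by the caller, k is at most the number of nodes, and the example uses a deterministic reachability count on a hypothetical graph in place of $\sigma(S)$, which in practice would have to be estimated by simulation.

```python
def greedy_seeds(nodes, k, f):
    """Pick k seeds, each time adding the node with the largest marginal gain."""
    chosen = set()
    for _ in range(k):
        candidates = [v for v in sorted(nodes) if v not in chosen]
        # max() keeps the first candidate among ties, so ties break alphabetically
        best = max(candidates, key=lambda v: f(chosen | {v}) - f(chosen))
        chosen.add(best)
    return chosen

# Hypothetical graph and objective: number of nodes reachable in one hop or less.
toy_graph = {"A": {"B", "C"}, "D": {"C", "E"}}
all_nodes = {"A", "B", "C", "D", "E", "F"}

def coverage(S):
    """Seeds plus their direct out-neighbors (a simple submodular objective)."""
    covered = set(S)
    for s in S:
        covered |= toy_graph.get(s, set())
    return len(covered)

seeds = greedy_seeds(all_nodes, 2, coverage)
print(sorted(seeds), coverage(seeds))
```

The first pick is A (gain 3, tied with D and chosen alphabetically); the second is D, whose gain has dropped to 2 because C is already covered.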

  51. Summary. Problem Description. Two Models: Independent Cascade Model, Linear Threshold Model. Submodular Functions. Proof of Approximation Guarantee. Proof of NP-Hardness.

  52. Q&A
