Epidemic Model based on Random Trees (a variant of branching processes)


  1. CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu

  2. Epidemic Model based on Random Trees (a variant of branching processes)
     - Root node, “patient 0”: start of the epidemic
     - A patient meets $d$ other people
     - With probability $q > 0$ the patient infects each of them
     - [Figure: tree rooted at patient 0, with $d$ subtrees]
     - Q: For which values of $d$ and $q$ does the epidemic run forever?
     - Run forever: $\lim_{n \to \infty} P[\text{infected node at depth } n] > 0$
     - Die out: $\lim_{n \to \infty} P[\text{infected node at depth } n] = 0$
     10/23/2011, Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

  3. $p_n$ = probability that there is an infected node at depth $n$
     - We need: $\lim_{n \to \infty} p_n = ?$ (based on $q$ and $d$)
     - Need a recurrence for $p_n$: $p_n = 1 - (1 - q\,p_{n-1})^d$, where $(1 - q\,p_{n-1})^d$ is the probability of no infected node at depth $n$
     - $\lim_{n \to \infty} p_n$ is the result of iterating $f(x) = 1 - (1 - qx)^d$
     - Starting at $x = 1$ (since $p_1 = 1$)

  4. What do we know about $f(x)$?
     - [Figure: plot of $y = f(x)$ against $y = x$ on $[0, 1]$; the iteration goes to the first fixed point. When is this going to 0?]
     - $f(0) = 0$
     - $f(1) = 1 - (1 - q)^d < 1$
     - $f'(x) = qd\,(1 - qx)^{d-1}$
     - $f'(0) = qd$, and $f'(x)$ is monotone decreasing on $[0, 1]$

  5. We need $f(x)$ to be below $y = x$!
     - [Figure: plot of $y = f(x)$ lying below $y = x$ on $[0, 1]$]
     - This requires $f'(0) < 1$
     - $\lim_{n \to \infty} p_n = 0$ when $qd < 1$
     - $qd$ = expected number of people that we infect
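The threshold at $qd = 1$ can be checked numerically by iterating the recurrence $p_n = 1 - (1 - q\,p_{n-1})^d$ from $p = 1$. The following is a minimal Python sketch; the function name `extinction_limit`, the number of steps, and the sample values of `q` and `d` are illustrative assumptions, not part of the slides.

```python
def extinction_limit(q, d, steps=10_000):
    """Iterate p_n = 1 - (1 - q * p_{n-1})**d, starting from p = 1.

    Returns an approximation of lim p_n. The step count is an
    arbitrary choice for this sketch.
    """
    p = 1.0
    for _ in range(steps):
        p = 1.0 - (1.0 - q * p) ** d
    return p


# Illustrative parameter choices (assumptions, not from the slides):
below = extinction_limit(q=0.2, d=3)  # q * d = 0.6 < 1
above = extinction_limit(q=0.5, d=3)  # q * d = 1.5 > 1
print("qd < 1:", below)
print("qd > 1:", above)
```

With $qd < 1$ the iterate should shrink toward 0 (the epidemic dies out), while with $qd > 1$ it should settle at a positive fixed point of $f$.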

  6. In this model nodes only go from inactive → active
     - Can generalize to allow nodes to alternate between active and inactive state by:

  7. Generalizing the model to virus propagation
     - 2 parameters:
       - (Virus) birth rate β: probability that an infected neighbor attacks
       - (Virus) death rate δ: probability that an infected node heals
     - [Figure: node N with neighbors N1, N2, N3; an infected neighbor attacks with prob. β, and an infected node becomes healthy with prob. δ]

  8. General scheme for epidemic models
     - Each node can go through phases: S (susceptible), E (exposed), I (infected), R (recovered), Z (immune)
     - Transition probabilities are governed by model parameters

  9. Node goes through phases: Susceptible → Infected → Recovered
     - Models chickenpox or plague: once you heal, you can never get infected again
     - Assuming perfect mixing (the network is a complete graph), the model dynamics is: [Figure: number of nodes over time]

  10. Susceptible-Infective-Susceptible (SIS) model
      - Cured nodes immediately become susceptible
      - Virus “strength”: $s = \beta / \delta$
      - Node state transition diagram: Susceptible → Infective (infected by a neighbor with prob. β); Infective → Susceptible (cured internally with prob. δ)

  11. Models flu:
      - Susceptible node becomes infected
      - The node then heals and becomes susceptible again
      - Assuming perfect mixing (complete graph):
        $\frac{dS}{dt} = -\beta S I + \delta I$
        $\frac{dI}{dt} = \beta S I - \delta I$
      - [Figure: $S(t)$ and $I(t)$, number of nodes over time; state diagram Susceptible ⇄ Infected]
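The perfect-mixing SIS equations $dS/dt = -\beta S I + \delta I$ and $dI/dt = \beta S I - \delta I$ can be explored with a simple forward-Euler integration. This is a rough sketch only: the function name `simulate_sis`, the step size, and the parameter values are assumptions chosen for illustration, and $S$ and $I$ are treated here as fractions of the population rather than node counts.

```python
def simulate_sis(beta, delta, s0, i0, dt=0.01, steps=10_000):
    """Forward-Euler integration of the perfect-mixing SIS model.

    dS/dt = -beta * S * I + delta * I
    dI/dt =  beta * S * I - delta * I
    Returns the list of (S, I) pairs, one per time step.
    """
    s, i = s0, i0
    history = [(s, i)]
    for _ in range(steps):
        new_infections = beta * s * i   # flow from S to I
        recoveries = delta * i          # flow from I back to S
        s += dt * (recoveries - new_infections)
        i += dt * (new_infections - recoveries)
        history.append((s, i))
    return history


# Illustrative values (assumptions): start with 1% of nodes infected.
s_end, i_end = simulate_sis(beta=0.5, delta=0.2, s0=0.99, i0=0.01)[-1]
print("final susceptible fraction:", s_end)
print("final infected fraction:", i_end)
```

Because every node that leaves one state enters the other, $S + I$ stays constant throughout the run, which is a quick sanity check on any implementation.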

  12. SIS Model
      - Epidemic threshold of a graph $G$ is a value $\tau$ such that: if virus strength $s = \beta / \delta < \tau$, the epidemic cannot happen (it eventually dies out)
      - Given a graph, what is its epidemic threshold?

  13. [Wang et al. 2003]
      - We have no epidemic if: $\beta / \delta < \tau = 1 / \lambda_{1,A}$
      - β: (virus) birth rate; δ: (virus) death rate; τ: epidemic threshold; $\lambda_{1,A}$: largest eigenvalue of the adjacency matrix $A$
      - ► $\lambda_{1,A}$ alone captures the property of the graph!
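To make the condition $\beta / \delta < 1 / \lambda_{1,A}$ concrete, the sketch below estimates the largest adjacency eigenvalue with power iteration in plain Python and then checks the condition. The function names, the shift by the identity matrix (used so the iteration also converges on bipartite graphs), and the star-graph example are assumptions made for this illustration; a numerical library routine such as `numpy.linalg.eigvalsh` would normally be preferred for real graphs.

```python
def largest_eigenvalue(adj, iterations=1000):
    """Estimate the largest eigenvalue of a symmetric 0/1 adjacency matrix.

    Runs power iteration on (A + I) and subtracts 1 at the end.
    Assumes an undirected, connected graph.
    """
    n = len(adj)
    x = [1.0] * n
    estimate = 1.0
    for _ in range(iterations):
        # y = (A + I) x
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        estimate = max(abs(v) for v in y)
        x = [v / estimate for v in y]
    return estimate - 1.0


def epidemic_threshold(adj):
    """tau = 1 / lambda_{1,A}."""
    return 1.0 / largest_eigenvalue(adj)


def is_below_threshold(beta, delta, adj):
    """True when beta / delta < tau, i.e. the no-epidemic condition holds."""
    return beta / delta < epidemic_threshold(adj)


# Hypothetical example: a star graph with one hub (node 0) and four leaves.
star = [[0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]
print("tau:", epidemic_threshold(star))
print("no epidemic for beta=0.1, delta=0.5:", is_below_threshold(0.1, 0.5, star))
```

For a star with four leaves the largest eigenvalue is $\sqrt{4} = 2$, so the threshold is $1/2$.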

  14. [Wang et al. 2003]
      - Oregon: 10,900 nodes and 31,180 edges; β = 0.001
      - [Figure: number of infected nodes (0 to 500) over time (0 to 1000) for δ = 0.05, 0.06, 0.07, with curves labeled β/δ > τ (above threshold), β/δ = τ (at the threshold), and β/δ < τ (below threshold)]

  15. Does it matter how many people are initially infected?

  17. Blogs: information epidemics
      - Which are the influential/infectious blogs?
      - Which blogs create big cascades?
      Viral marketing
      - Who are the influencers?
      - Where should I advertise?
      Disease spreading vs.
      - Where to place monitoring stations to detect epidemics?

  18. Independent Cascade Model
      - Directed finite graph $G = (V, E)$
      - Set $S$ starts out with new behavior
      - Say nodes with this behavior are “active”
      - Each edge $(v, w)$ has a probability $p_{vw}$
      - If node $v$ is active, it gets one chance to make $w$ active, with probability $p_{vw}$
      - Each edge fires at most once
      - Does scheduling matter? No
        - E.g., if $u$ and $v$ are both active, it doesn’t matter which fires first
        - But the time moves in discrete steps

  19. Initially some nodes $S$ are active
      - Each edge $(v, w)$ has probability (weight) $p_{vw}$
      - [Figure: example graph on nodes a to i with edge probabilities between 0.2 and 0.4]
      - When node $v$ becomes active:
        - It activates each out-neighbor $w$ with prob. $p_{vw}$
      - Activations spread through the network

  20. $S$: the initial active set
      - $f(S)$: the expected size of the final active set
      - [Figure: graph $G$ with nodes a, b, c, d and the influence set of each node]
      - Set $S$ is more influential if $f(S)$ is larger: $f(\{a,b\}) < f(\{a,c\}) < f(\{a,d\})$
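The cascade process and the quantity $f(S)$ can be sketched in Python as a single randomized cascade plus a Monte Carlo average over many cascades. The function names, the edge-list representation, the number of runs, and the toy graph are all assumptions made for this example; the slides do not prescribe a particular way of estimating $f(S)$.

```python
import random
from collections import deque


def independent_cascade(edges, seeds, rng):
    """Run one Independent Cascade from the seed set.

    edges maps each node v to a list of (w, p_vw) pairs.
    Each newly active node gets one chance per out-edge, so every
    edge fires at most once.
    """
    active = set(seeds)
    frontier = deque(active)
    while frontier:
        v = frontier.popleft()
        for w, p_vw in edges.get(v, []):
            if w not in active and rng.random() < p_vw:
                active.add(w)
                frontier.append(w)
    return active


def expected_spread(edges, seeds, runs=10_000, seed=0):
    """Monte Carlo estimate of f(S): average final active-set size."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(edges, seeds, rng)) for _ in range(runs))
    return total / runs


# Hypothetical toy graph: a -> b with prob. 0.5, b -> c with prob. 1.
toy_edges = {"a": [("b", 0.5)], "b": [("c", 1.0)]}
print("estimated f({a}):", expected_spread(toy_edges, {"a"}))
```

In the toy graph the exact value is $f(\{a\}) = 1 + 0.5 \cdot 2 = 2$, so the estimate should land close to 2.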

  21. Problem:
      - Most influential set of size $k$: set $S$ of $k$ nodes producing largest expected cascade size $f(S)$ if activated [Domingos-Richardson ‘01]
      - Optimization problem: $\max_{S \text{ of size } k} f(S)$
      - [Figure: example graph on nodes a to i with edge probabilities between 0.2 and 0.4, highlighting the influence set of b]

  22. Most influential set of $k$ nodes: set $S$ of $k$ nodes producing largest expected cascade size $f(S)$ if activated
      - The optimization problem: $\max_{S \text{ of size } k} f(S)$
      - How hard is this problem?
        - NP-HARD!
        - Show that finding the most influential set is at least as hard as set cover

  23. Set cover problem:
      - Given a universe of elements $U = \{u_1, \dots, u_n\}$ and sets $S_1, \dots, S_m \subseteq U$
      - Are there $k$ sets among $S_1, \dots, S_m$ such that their union is $U$?
      - [Figure: universe $U$ covered by overlapping sets $S_1$, $S_2$, $S_3$, $S_4$]
      - Goal: encode set cover as an instance of $\max_{S \text{ of size } k} f(S)$

  24. Given a set cover instance with sets $S_1, \dots, S_m$
      - Build a bipartite “S-to-U” graph
      - Construction:
        - Create a directed edge $(S_i, u)$ for all $S_i$ and all $u \in S_i$, from sets to their elements
        - Put weight 1 on each edge
        - E.g., $S_1 = \{u_1, u_2, u_3\}$ gives edges from $S_1$ to $u_1$, $u_2$, $u_3$
      - There exists a set $S$ of size $k$ with $f(S) = k + n$ iff there exists a size $k$ set cover
      - Note: the optimal solution is always a set of $S_i$
      - This is hard in general; there could be special cases that are easier
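The reduction can be illustrated on a tiny instance. Because every edge in the bipartite graph has weight 1, activating $k$ set nodes deterministically activates every element they contain, so $f(S) = k + |\text{elements reached}|$. The sketch below brute-forces all size-$k$ choices of set nodes, which is exponential and only meant to show the equivalence; the function names and the sample instance are assumptions.

```python
from itertools import combinations


def spread(chosen, sets):
    """f(S) when S is a collection of set nodes in the S-to-U graph.

    All edges have weight 1, so every element of a chosen set is
    activated with certainty.
    """
    reached = set().union(*(sets[name] for name in chosen))
    return len(chosen) + len(reached)


def has_set_cover(universe, sets, k):
    """True iff some k set nodes achieve f(S) = k + n.

    Assumes every set is a subset of the universe.
    """
    n = len(universe)
    return any(spread(chosen, sets) == k + n for chosen in combinations(sets, k))


# Hypothetical instance: U = {1, 2, 3, 4} with three sets.
universe = {1, 2, 3, 4}
sets = {"S1": {1, 2}, "S2": {2, 3}, "S3": {3, 4}}
print("cover of size 1 exists:", has_set_cover(universe, sets, 1))
print("cover of size 2 exists:", has_set_cover(universe, sets, 2))
```

Searching only over set nodes relies on the slide's note that an optimal solution is always a set of $S_i$.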

  25. Bad news:
      - Influence maximization is NP-hard
      Next, good news:
      - There exists an approximation algorithm!
      - Consider the Hill Climbing algorithm to find $S$:
        - Input: influence set of each node $u$ = $\{v_1, v_2, \dots\}$
          - If we activate $u$, nodes $\{v_1, v_2, \dots\}$ will eventually get active
        - Algorithm: at each step take the node $u$ that gives the best marginal gain: $\max_u f(S_{i-1} \cup \{u\})$
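The greedy step can be sketched in Python for the deterministic setting described on the slide, where each node comes with a fixed influence set. In this sketch $f(S)$ is taken to be the size of the union of the chosen nodes' influence sets, and each influence set is assumed to include the node itself; both choices, along with the function name and the sample data, are assumptions for illustration.

```python
def hill_climbing(influence_sets, k):
    """Greedily pick up to k nodes by best marginal gain.

    influence_sets maps each node u to the set of nodes that become
    active if u is activated (assumed to include u itself).
    Returns (chosen nodes in pick order, set of all activated nodes).
    """
    chosen = []
    covered = set()
    for _ in range(k):
        best_node, best_gain = None, 0
        for u, reach in influence_sets.items():
            if u in chosen:
                continue
            gain = len(reach - covered)  # marginal gain of adding u
            if gain > best_gain:
                best_node, best_gain = u, gain
        if best_node is None:  # no remaining node adds anything new
            break
        chosen.append(best_node)
        covered |= influence_sets[best_node]
    return chosen, covered


# Hypothetical influence sets for a five-node example.
example = {
    "a": {"a", "b", "c"},
    "b": {"b", "c"},
    "d": {"d", "e"},
    "e": {"e"},
}
picked, activated = hill_climbing(example, k=2)
print("picked:", picked)
print("f(S):", len(activated))
```

Ties are broken by dictionary order here; any tie-breaking rule fits the description on the slide.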
