Searching and Sampling Take a Walk Through a Network! Antonio Carzaniga Faculty of Informatics Università della Svizzera italiana March 4, 2020
Outline Applications The network as a linear transformation Other applications of linear algebra
[Figure: a node v has only a very limited local view of the network.]
Networks
◮ peer-to-peer
◮ . . .
Services
◮ address-based
◮ content-based
◮ multicast
◮ search
◮ sampling
◮ . . .
Algorithms
◮ random walks
◮ . . .
[Figure: a random walk over a network whose nodes are labeled JU BS BL AG SH TG SO ZH AR ZG SG NE LU SZ AL NW GL OW VD FR BE UR GR GE VS TI. The walk starts with 9 remaining hops and the counter decreases by one at every hop (9, 8, 7, . . . , 1). When the remaining hops reach 0, the walk stops and node SG is selected.]
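The hop-counter process in the figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `random_walk_sample`, the four-node `toy_graph`, and the uniform choice among neighbors are hypothetical and are not taken from the slides (in particular, `toy_graph` is not the network in the figure).

```python
import random

def random_walk_sample(graph, start, hops, rng=random):
    """Forward a walk for `hops` steps and return the node where it stops.

    Each node only needs its own neighbor list (its local view).
    """
    node = start
    while hops > 0:
        # Assumption: the next hop is chosen uniformly among the neighbors.
        node = rng.choice(graph[node])
        hops -= 1
    # Remaining hops: 0, so the current node is the selected sample.
    return node

# Hypothetical toy topology, given as an adjacency list.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}

rng = random.Random(2020)  # seeded only to make the run repeatable
sample = random_walk_sample(toy_graph, "a", 9, rng)
print("node", sample, "selected")
```

The only state carried by the walk is the current node and the remaining hop count, which is why the process can run with purely local information.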
Other Applications
Relevance score for hyper-linked documents (PageRank)
◮ Input: a large collection of linked documents such as Web pages
◮ Output: a ranking of the pages by reputation
◮ a page that is linked by reputable pages acquires more reputation
◮ equivalent to a random walk over the Web
Problem: given a directed graph $G = (V, A)$, compute the probability $p_u$ that a sufficiently long random walk would end at node $u$, for all nodes $u \in V$.
Approaches:
1. Simulation
2. Math! (linear algebra)
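As a rough illustration of the first approach (simulation), the sketch below runs many independent walks and counts where they end. It uses the three-node example graph that appears later in these slides. The names `transitions`, `walk`, and `estimate_end_probabilities`, as well as the number of steps and walks, are assumptions chosen for this example.

```python
import random
from collections import Counter

# Outgoing transitions of the three-node example: node -> [(next node, probability)].
transitions = {
    "v1": [("v2", 1.0)],
    "v2": [("v3", 1.0)],
    "v3": [("v1", 0.5), ("v2", 0.5)],
}

def walk(start, steps, rng):
    """Run one random walk of `steps` hops and return the node where it ends."""
    node = start
    for _ in range(steps):
        targets, weights = zip(*transitions[node])
        node = rng.choices(targets, weights=weights)[0]
    return node

def estimate_end_probabilities(start, steps, walks, seed=0):
    """Estimate p_u as the fraction of simulated walks that end at node u."""
    rng = random.Random(seed)  # seeded only to make the run repeatable
    counts = Counter(walk(start, steps, rng) for _ in range(walks))
    return {u: counts[u] / walks for u in transitions}

# Assumed parameters: 5000 walks of 30 hops each, all starting at v2.
estimate = estimate_end_probabilities("v2", steps=30, walks=5000)
print(estimate)
```

The estimate is noisy, and its accuracy improves only slowly with the number of walks, which is one motivation for the second approach.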
Random Walks
Execution
◮ trivial, local process
Configuration and bias
◮ how do we choose transition probabilities?
◮ is the sample biased? How?
◮ how long do we walk?
[Figure: node v and its local view; the links out of v are labeled with transition probabilities 0.2, 0.2, 0.5, and 0.1.]
[Figure: stationary distribution of the walk over the same network (hops → ∞).]
[Figure: a directed graph on nodes $v_1$, $v_2$, $v_3$ with edges $v_1 \to v_2$ (probability 1), $v_2 \to v_3$ (probability 1), $v_3 \to v_1$ (probability 0.5), and $v_3 \to v_2$ (probability 0.5).]

Let $p_i(t) = \Pr[\text{walk is at node } v_i \text{ at time } t]$.
$p(0) = [0 \; 1 \; 0]^T$ means the walk starts at $v_2$.

$$p_1(t+1) = 0.5 \cdot p_3(t)$$
$$p_2(t+1) = p_1(t) + 0.5 \cdot p_3(t)$$
$$p_3(t+1) = p_2(t)$$

$$p(t+1) = A\,p(t), \qquad A = \begin{bmatrix} 0 & 0 & 0.5 \\ 1 & 0 & 0.5 \\ 0 & 1 & 0 \end{bmatrix}, \qquad p(t) = A^t p(0)$$

With $x_1, \dots, x_n$ the eigenvectors of $A$ and $\lambda_1, \dots, \lambda_n$ the corresponding eigenvalues:

$$p(0) = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$$
$$p(t) = A^t p(0) = \lambda_1^t c_1 x_1 + \lambda_2^t c_2 x_2 + \cdots + \lambda_n^t c_n x_n$$

$A$ is stochastic: $1 = |\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \dots$
In this example, $\lambda_1 = 1$ and $\lambda_{2,3} = -0.5 \pm 0.5i$.

◮ Stationary distribution: the first term, $\lambda_1^t c_1 x_1 = c_1 x_1 = \pi$
◮ The remaining terms are an error $\epsilon_t \approx |\lambda_2|^t \to 0$
◮ Mixing time: $\tau \approx \log_{|\lambda_2|} \epsilon$ such that $\epsilon_t < \epsilon$ for $t > \tau$
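The iteration $p(t+1) = A\,p(t)$ and the mixing-time estimate can be checked numerically on the three-node example. The following is a minimal sketch in plain Python. The helper names `step` and `distribution_at`, the target error `epsilon = 1e-6`, and the choice of time steps to print are assumptions made for this illustration; the eigenvalue $-0.5 + 0.5i$ is taken from the slide rather than computed.

```python
import math

# Transition matrix of the three-node example: column j holds the
# probabilities of moving out of node v_(j+1).
A = [
    [0.0, 0.0, 0.5],
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]

def step(p):
    """One step of the walk: p(t+1) = A p(t)."""
    return [sum(A[i][j] * p[j] for j in range(len(p))) for i in range(len(A))]

def distribution_at(t, p0):
    """Compute p(t) = A^t p(0) by applying `step` t times."""
    p = list(p0)
    for _ in range(t):
        p = step(p)
    return p

p0 = [0.0, 1.0, 0.0]  # the walk starts at v2

# Modulus of the second eigenvalue, lambda_2 = -0.5 + 0.5i (from the slide).
lambda2 = abs(complex(-0.5, 0.5))

# Assumed target error; tau is the base-|lambda_2| logarithm of epsilon.
epsilon = 1e-6
tau = math.log(epsilon) / math.log(lambda2)

for t in (0, 1, 2, 10, 40):
    print(t, distribution_at(t, p0))
print("|lambda_2| =", lambda2, "tau =", tau)
```

Because $|\lambda_2| < 1$, the printed distributions settle quickly, and after roughly $\tau$ steps they no longer depend noticeably on the starting node.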
Notice that, if the network is ergodic, then there is a stationary distribution $\pi$ that satisfies the equation $\pi = A\pi$. So, we can compute the stationary distribution directly by solving this system of equations:

$$A\pi = \pi, \qquad \sum_{i=1}^{n} \pi_i = 1$$
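For the three-node example, this system is small enough to solve exactly. The sketch below is one possible way to do it, assuming the system has a unique solution (as in the ergodic case): it rewrites $A\pi = \pi$ as $(A - I)\pi = 0$, replaces one of these redundant equations with the normalization $\sum_i \pi_i = 1$, and applies Gauss-Jordan elimination with exact fractions. The function names `solve_linear_system` and `stationary_distribution` are hypothetical.

```python
from fractions import Fraction

def solve_linear_system(M, b):
    """Solve M x = b by Gauss-Jordan elimination with exact fractions.

    Assumes M is square and non-singular.
    """
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def stationary_distribution(A):
    """Solve A pi = pi together with sum(pi) = 1."""
    n = len(A)
    # Rows of (A - I) pi = 0.
    M = [[Fraction(A[i][j]) - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    b = [0] * n
    # The columns of A sum to 1, so the rows of A - I are linearly dependent:
    # replace the last one with the normalization constraint.
    M[-1] = [1] * n
    b[-1] = 1
    return solve_linear_system(M, b)

half = Fraction(1, 2)
A = [
    [0, 0, half],
    [1, 0, half],
    [0, 1, 0],
]
pi = stationary_distribution(A)
print(pi)
```

Exact fractions are used here only to keep the small example free of rounding noise; for large networks a numerical linear-algebra library would be the more practical choice.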