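Authority and Co-cite, Hub and Co-reference

Given the adjacency matrix $A$ (with entries 0 or 1), the authority vector $a$ and the hub vector $h$ are iterated as

\[ a_i = A^T A \, a_{i-1}, \qquad a_i = (A^T A)^i a_0 \]
\[ h_i = A A^T h_{i-1}, \qquad h_i = (A A^T)^i h_0 \]

Co-citation: the number of pages that co-cite $P_i$ and $P_j$.
Co-reference: the number of pages co-referenced by $P_i$ and $P_j$.

$A^T A = D + C$, where $C$ is the co-citation matrix and $D = \mathrm{diag}(d_1, d_2, \cdots, d_n)$ holds the indegrees:

\[ C_{ij} = \sum_k A_{ki} A_{kj} = (A^T A)_{ij} \quad (i \neq j), \qquad C_{ii} = 0 \]
\[ d_i = \sum_k A_{ki} = \sum_k A_{ki} A_{ki} = (A^T A)_{ii} \]

The second equality for $d_i$ holds because the entries of $A$ are 0 or 1. In the same way, with $O$ the diagonal matrix of the outdegrees $o_i$ and $R$ the co-reference matrix,

\[ A^T A = D + C, \qquad A A^T = O + R \]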
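The decomposition $A^T A = D + C$ can be checked numerically on a small graph. The sketch below assumes the convention $A_{ki} = 1$ when page $k$ links to page $i$; the 4-page graph and the helper name `authority_scores` are illustrative and do not come from the slides. The power iteration renormalizes at each step, which the recurrence on the slide leaves implicit.

```python
import numpy as np

# Hypothetical 4-page graph: A[k, i] = 1 means page k links to page i.
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
])

AtA = A.T @ A
indegree = A.sum(axis=0)      # d_i = sum_k A_ki
D = np.diag(indegree)
C = AtA - D                   # co-citation counts, zero on the diagonal


def authority_scores(A, iterations=50):
    """Power iteration a_i = A^T A a_{i-1}, renormalized at each step."""
    M = A.T @ A
    a = np.ones(A.shape[1])
    for _ in range(iterations):
        a = M @ a
        a /= np.linalg.norm(a)
    return a


a = authority_scores(A)
print("indegrees:", indegree)
print("co-citation matrix:\n", C)
print("authority scores:", np.round(a, 3))
```

The hub side works the same way with `A @ A.T` and `A.sum(axis=1)` in place of `A.T @ A` and `A.sum(axis=0)`.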
Probabilistic analysis

Expected value of co-citation/co-reference for random graphs with a fixed degree sequence:

\[ E(C_{ik}) = \frac{d_i d_k}{n-1}, \qquad E(R_{ik}) = \frac{o_i o_k}{n-1} \]

A node with large indegree $d_i$ tends to have large co-citation with other nodes.

\[ E(A^T A) = E(D) + E(C) = \mathrm{diag}(h_1, h_2, \cdots, h_n) + \frac{d d^T}{n-1} \]

where $h_i \equiv d_i - d_i^2/(n-1)$ and $d = (d_1, d_2, \cdots, d_n)^T$.
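The step from $E(D) + E(C)$ to $\mathrm{diag}(h_i) + d d^T/(n-1)$ is just a matter of moving the diagonal of $d d^T/(n-1)$ into the diagonal term. A minimal sketch of that identity follows; the indegree sequence and the function name `expected_cocitation` are assumptions made for illustration.

```python
import numpy as np


def expected_cocitation(degrees):
    """E(C_ik) = d_i d_k / (n - 1) off the diagonal, and C_ii = 0."""
    d = np.asarray(degrees, dtype=float)
    n = d.size
    C = np.outer(d, d) / (n - 1)
    np.fill_diagonal(C, 0.0)
    return C


d = np.array([4, 3, 2, 1, 1, 1], dtype=float)   # hypothetical indegrees
n = d.size

E_C = expected_cocitation(d)
lhs = np.diag(d) + E_C                          # E(D) + E(C)

h = d - d**2 / (n - 1)                          # h_i = d_i - d_i^2 / (n - 1)
rhs = np.diag(h) + np.outer(d, d) / (n - 1)     # diag(h) + d d^T / (n - 1)

print(np.allclose(lhs, rhs))
```

Passing the outdegrees $o_i$ to the same function gives $E(R)$.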
Spectral Decomposition of Diagonal Plus Rank-1 Matrices

Let $M = D + c c^T$, where $D$ is a diagonal $n \times n$ matrix of the block form

\[ D = \mathrm{diag}(\tau_1 I_1, \tau_2 I_2, \cdots, \tau_l I_l), \]

$I_k$ is the identity matrix of size $n_k$, and $\tau_1 > \tau_2 > \cdots > \tau_l$. Partition $c = (c_1^T, c_2^T, \cdots, c_l^T)^T$ conformally with $D$.

Then the eigenvalues of $M$ are given by

\[ \hat\tau_1 > \underbrace{\tau_1 = \cdots = \tau_1}_{n_1 - 1} > \hat\tau_2 > \underbrace{\tau_2 = \cdots = \tau_2}_{n_2 - 1} > \cdots > \hat\tau_l > \underbrace{\tau_l = \cdots = \tau_l}_{n_l - 1} \]

and the eigenvector of $M$ corresponding to the eigenvalue $\hat\tau_k$ is

\[ \left( \frac{c_1^T}{\hat\tau_k - \tau_1}, \frac{c_2^T}{\hat\tau_k - \tau_2}, \cdots, \frac{c_l^T}{\hat\tau_k - \tau_l} \right)^T. \]

The eigenvectors corresponding to $\tau_k$ are of the form $(0 \cdots 0, u_k^T, 0 \cdots 0)^T$, where $u_k$ is a vector of length $n_k$ satisfying $c_k^T u_k = 0$.
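The interlacing pattern and the eigenvector formula can be checked on a small example. The sketch below uses arbitrary illustrative values for the $\tau_k$, the block sizes $n_k$ and the vector $c$ (none of them from the slides) and assumes every block $c_k$ is nonzero.

```python
import numpy as np

taus = np.array([5.0, 3.0, 1.0])        # tau_1 > tau_2 > tau_3
sizes = np.array([2, 1, 3])             # block sizes n_k
diag_entries = np.repeat(taus, sizes)   # diagonal of D
c = np.array([1.0, 2.0, 1.0, 1.0, 2.0, 3.0])

M = np.diag(diag_entries) + np.outer(c, c)

# Eigenvalues in descending order.
eigvals = np.linalg.eigvalsh(M)[::-1]

# The "new" eigenvalues tau_hat_k are those that are not one of the tau_k.
tau_hat = np.array([lam for lam in eigvals if not np.isclose(lam, taus).any()])

# Check the closed-form eigenvector c_j / (tau_hat_k - tau_j) for each tau_hat_k.
residuals = []
for lam in tau_hat:
    v = c / (lam - diag_entries)
    residuals.append(np.linalg.norm(M @ v - lam * v) / np.linalg.norm(v))

print("eigenvalues:", np.round(eigvals, 4))
print("tau_hat:", np.round(tau_hat, 4))
print("largest residual:", max(residuals))
```

Each $\tau_k$ should appear $n_k - 1$ times in `eigvals`, with one `tau_hat` value strictly above each $\tau_k$.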
Average Analysis of HITS

\[ E(A^T A) = E(D) + E(C) = \mathrm{diag}(h_1, h_2, \cdots, h_n) + \frac{d d^T}{n-1} \]

where $h_i \equiv d_i - d_i^2/(n-1)$ and $d = (d_1, d_2, \cdots, d_n)^T$.

If $h_1 > h_2 > \cdots > h_m \ge h_{m+1} \ge \cdots \ge h_n$, then the $m$ largest eigenvalues $\lambda_i$ satisfy

\[ \lambda_1 > h_1 > \lambda_2 > h_2 > \cdots > \lambda_m > h_m \]

and the corresponding eigenvectors are

\[ u_k = \left( \frac{d_1}{\lambda_k - h_1}, \frac{d_2}{\lambda_k - h_2}, \cdots, \frac{d_n}{\lambda_k - h_n} \right)^T. \]

Prerequisite:

\[ h_i - h_j = (d_i - d_j)\left[ 1 - \frac{d_i + d_j}{n-1} \right] > 0 \]

as long as $d_1 > \cdots > d_m > d_{m+1} \ge d_{m+2} \ge \cdots \ge d_n$ and $d_i + d_j < n - 1$ for all $i, j$.
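As a sanity check, the interlacing $\lambda_1 > h_1 > \lambda_2 > \cdots$ and the closed-form eigenvectors $u_k$ can be compared against a numerical eigendecomposition of $E(A^T A)$. The indegree sequence below is a made-up example chosen so that $d_i + d_j < n - 1$ holds for every pair and the four largest indegrees are distinct ($m = 4$).

```python
import numpy as np

# Hypothetical indegrees, sorted in decreasing order; n = 12, so d_i + d_j <= 9 < n - 1.
d = np.array([5, 4, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1], dtype=float)
n = d.size
m = 4                                        # number of strictly ordered leading h_i

h = d - d**2 / (n - 1)
M = np.diag(h) + np.outer(d, d) / (n - 1)    # E(A^T A)

lam = np.linalg.eigvalsh(M)[::-1]            # eigenvalues, descending

# lambda_k > h_k > lambda_{k+1} for the m leading eigenvalues.
interlaced = all(lam[k] > h[k] > lam[k + 1] for k in range(m))

# Residual of M u_k = lambda_k u_k for u_k = d / (lambda_k - h).
residuals = []
for k in range(m):
    u_k = d / (lam[k] - h)
    residuals.append(np.linalg.norm(M @ u_k - lam[k] * u_k) / np.linalg.norm(u_k))

print("h:     ", np.round(h[: m + 1], 4))
print("lambda:", np.round(lam[: m + 1], 4))
print("interlaced:", interlaced, " max residual:", max(residuals))
```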
Eigenvectors
HITS = ranking according to indegrees??

For any $i < j$,

\[ u_1(i) - u_1(j) = \frac{d_i}{\lambda_1 - h_i} - \frac{d_j}{\lambda_1 - h_j} = \frac{(d_i - d_j)\left[ \lambda_1 - d_i d_j/(n-1) \right]}{(\lambda_1 - h_i)(\lambda_1 - h_j)} \ge 0 \]

as

\[ \lambda_1 - \frac{d_i d_j}{n-1} > h_i - \frac{d_i d_j}{n-1} = d_i \left( 1 - \frac{d_i + d_j}{n-1} \right) > 0. \]

What's the nature of AVERAGE? The authority ranking is, ON AVERAGE, identical to the ranking according to web page indegrees.
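The claim can be illustrated by ranking pages with the principal eigenvector of $E(A^T A)$ and comparing against the indegree ranking. This is a sketch on the expected matrix only, not on a sampled graph; the indegree sequence is hypothetical (unsorted on purpose, with $d_i + d_j < n - 1$ for every pair), and `expected_authority` is a name introduced here for illustration.

```python
import numpy as np


def expected_authority(indegrees):
    """Principal eigenvector of E(A^T A) = diag(h) + d d^T / (n - 1)."""
    d = np.asarray(indegrees, dtype=float)
    n = d.size
    h = d - d**2 / (n - 1)
    M = np.diag(h) + np.outer(d, d) / (n - 1)
    _, vecs = np.linalg.eigh(M)      # eigenvalues ascending, so last column is principal
    u1 = vecs[:, -1]
    return u1 if u1.sum() >= 0 else -u1   # fix the arbitrary sign


# Hypothetical indegrees for n = 16 pages; the largest pair sums to 12 < n - 1.
d = np.array([2, 7, 1, 4, 3, 1, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1])

u1 = expected_authority(d)
top_by_authority = np.argsort(-u1, kind="stable")[:5]
top_by_indegree = np.argsort(-d, kind="stable")[:5]

print("top 5 by authority:", top_by_authority)
print("top 5 by indegree: ", top_by_indegree)
```

Only the five pages with distinct indegrees are compared, since pages with equal indegree receive equal expected authority and their relative order is arbitrary.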