Effective Latent Space Graph-based Re-ranking Model with Global Consistency
Hongbo Deng, Michael R. Lyu and Irwin King
Department of Computer Science and Engineering
The Chinese University of Hong Kong
WSDM 2009, Feb. 12, 2009
Outline
• Introduction
• Related work
• Methodology
  • Graph-based re-ranking model
  • Learning a latent space graph
  • A case study and the overall algorithm
• Experiments
• Conclusions and Future Work
Introduction
• Problem definition
  • Given a set of documents D
  • A term vector d_i = x_i
  • Relevance scores using VSM (vector space model) or LM (language model)
  • A connected graph
    • Explicit link (e.g., hyperlinks)
    • Implicit link (e.g., inferred from the content information)
  • Many other features
• How to leverage the interconnection between documents/entities to improve the ranking of retrieved results with respect to the query?
[Figure: documents d_1 to d_5 connected in a graph, ranked with respect to a query q]
Introduction
• Initial ranking scores: relevance
• Graph structure: centrality (importance, authority)
• Simple method: combine those two parts linearly
• Limitations of the simple method:
  • Does not make full use of the information
  • Treats each of them individually
• What have we done?
  • Propose a joint regularization framework
  • Combine the content with link information in a latent space graph
Related work
• Structural re-ranking model: using some variations of PageRank and HITS
  • Centrality within graphs (Kurland and Lee, SIGIR’05 & SIGIR’06)
  • Improve Web search results using affinity graph (Zhang et al., SIGIR’05)
  • Improve an initial ranking by random walk in entity-relation networks (Minkov et al., SIGIR’06)
• Linear combination; the content and link are treated individually.
Related work
• Regularization framework
  • Graph Laplacians for label propagation (two classes) (Zhu et al., ICML’03; Zhou et al., NIPS’03)
  • Extend the graph harmonic function to multiple classes (Mei et al., WWW’08)
  • Score regularization to adjust ad-hoc retrieval scores (Diaz, CIKM’05)
  • Enhance learning to rank with parameterized regularization models (Qin et al., WWW’08)
• Query-independent settings; do not consider multiple relationships between objects.
Related work
• Learning a latent space
  • Latent Semantic Analysis (LSA) (Deerwester et al., JASIS’90)
  • Probabilistic LSI (pLSI) (Hofmann, SIGIR’99)
  • pLSI + PHITS (Cohn and Hofmann, NIPS’00)
  • Combine content and link for classification using matrix factorization (Zhu et al., SIGIR’07)
• Use the joint factorization to learn the latent feature.
• Difference: leverage the latent feature for building a latent space graph.
Methodology
[Diagram: Graph-based re-ranking model + Learning a latent space graph; case study: Expert finding]
III. Methodology
Graph-based re-ranking model
• Intuition:
  • Global consistency: similar documents are most likely to have similar ranking scores with respect to a query.
  • The initial ranking scores provide invaluable information.
• Regularization framework: a parameter balances a term that fits the initial scores against a global consistency term.
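The equation on this slide is not preserved in the text. As a sketch only, one objective with exactly these two terms is the form used by Zhou et al. (NIPS’03), which the deck cites. The symbols below are assumptions rather than the authors' notation: $f_i$ is the refined score of document $d_i$, $f_i^0$ its initial score, $w_{ij}$ the edge weight, $D_{ii} = \sum_j w_{ij}$ the degree, and $\mu > 0$ the trade-off parameter.

```latex
Q(f) \;=\; \frac{1}{2}\left[
  \underbrace{\sum_{i,j} w_{ij}\left(\frac{f_i}{\sqrt{D_{ii}}} - \frac{f_j}{\sqrt{D_{jj}}}\right)^{2}}_{\text{global consistency}}
  \;+\; \mu \underbrace{\sum_{i} \left(f_i - f_i^{0}\right)^{2}}_{\text{fit initial scores}}
\right]
```

Under this assumed form, a larger $\mu$ keeps the result closer to the initial ranking, while a smaller $\mu$ lets the graph structure dominate.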
III. Methodology
Graph-based re-ranking model
• Optimization problem
• A closed-form solution
• Connection with other methods
  • µ_α → 0: return the initial scores
  • µ_α → 1: a variation of PageRank-based model
  • µ_α ∈ (0, 1): combine both kinds of information simultaneously
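The closed-form solution itself is not preserved in the text. The sketch below assumes the familiar form f* = (1 - µ_α)(I - µ_α S)^(-1) f0 with S = D^(-1/2) W D^(-1/2), as in Zhou et al. (NIPS’03); this form, the function name `rerank`, and the toy graph are illustrative assumptions. It does reproduce the first limiting case listed above: µ_α = 0 returns the initial scores.

```python
import numpy as np

def rerank(W, f0, mu_alpha):
    """Graph-regularized scores under the assumed closed form (see lead-in)."""
    W = np.asarray(W, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    deg = W.sum(axis=1)                       # node degrees D_ii
    inv_sqrt = 1.0 / np.sqrt(deg)             # assumes every node has an edge
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]   # S = D^{-1/2} W D^{-1/2}
    n = len(f0)
    # Solve (I - mu_alpha * S) f = (1 - mu_alpha) * f0 rather than inverting.
    # Requires 0 <= mu_alpha < 1 so that the system is non-singular.
    return np.linalg.solve(np.eye(n) - mu_alpha * S, (1.0 - mu_alpha) * f0)

# Toy graph: documents 0, 1, 2 are mutually linked; document 3 hangs off 2.
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
f0 = np.array([0.9, 0.1, 0.8, 0.2])           # hypothetical initial scores

same = rerank(W, f0, 0.0)                     # mu_alpha = 0: initial scores
smoothed = rerank(W, f0, 0.5)                 # scores pulled toward neighbors
```

Document 1 starts with a low score but is linked to two high-scoring documents, so its score rises once µ_α > 0.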
III. Methodology
Learning a latent space graph
• Objective: incorporate the content with link information (or relational data) simultaneously
• Latent Semantic Analysis
• Joint factorization
  • Combine the content with relational data
• Build latent space graph
  • Calculate the weight matrix W
III. Methodology - Learning a latent space graph
Latent Semantic Analysis
• Map documents to a vector space of reduced dimensionality
• SVD is performed on the matrix
• The largest k singular values are kept
• Reformulated as an optimization problem
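The steps above can be sketched with NumPy. By the Eckart-Young theorem, truncating the SVD to the k largest singular values gives the rank-k matrix closest to the original in Frobenius norm, which is the optimization view of LSA. The matrix `C` and the helper `lsa` are illustrative assumptions.

```python
import numpy as np

def lsa(C, k):
    """Rank-k LSA: keep the k largest singular values of C."""
    U, s, Vt = np.linalg.svd(C, full_matrices=False)   # s is sorted descending
    X = U[:, :k] * s[:k]          # documents in the k-dimensional latent space
    C_k = X @ Vt[:k, :]           # best rank-k approximation of C
    return X, C_k, s

# Hypothetical 4-document x 5-term count matrix.
C = np.array([[2, 1, 0, 0, 0],
              [1, 2, 0, 0, 1],
              [0, 0, 3, 1, 0],
              [0, 0, 1, 2, 0]], dtype=float)

X, C_k, s = lsa(C, k=2)
# Squared reconstruction error equals the sum of the discarded s_i^2.
err = np.linalg.norm(C - C_k, "fro") ** 2
```

Each row of `X` is the reduced-dimensional representation of one document.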
III. Methodology - Learning a latent space graph
Embedding multiple relational data
• Taking the papers as an example
  • Paper-term matrix C
  • Paper-author matrix A
• A unified optimization problem
[Figure: C (N×M) and A (N×L) are factorized jointly through a shared latent matrix X (N×K) with factors V_C and V_A, optimized with conjugate gradient]
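The unified objective is not preserved in the text. The sketch below assumes a regularized squared-error form, ||C - X V_c^T||² + lam ||A - X V_a^T||² + gamma (||X||² + ||V_c||² + ||V_a||²), where `lam` and `gamma` are hypothetical weights. It also swaps the conjugate gradient solver named on the slide for alternating least squares, purely to keep the example short and dependency-free.

```python
import numpy as np

def joint_factorize(C, A, k, lam=1.0, gamma=0.1, iters=30, seed=0):
    """Fit C ~ X V_c^T and A ~ X V_a^T with a shared latent matrix X (N x k)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((C.shape[0], k))
    V_c = rng.standard_normal((C.shape[1], k))
    V_a = rng.standard_normal((A.shape[1], k))
    ridge = gamma * np.eye(k)

    def loss(X, V_c, V_a):
        fit_c = np.sum((C - X @ V_c.T) ** 2)
        fit_a = np.sum((A - X @ V_a.T) ** 2)
        reg = np.sum(X ** 2) + np.sum(V_c ** 2) + np.sum(V_a ** 2)
        return fit_c + lam * fit_a + gamma * reg

    history = [loss(X, V_c, V_a)]
    for _ in range(iters):
        # Each update is the exact minimizer over one block, others held fixed.
        G = V_c.T @ V_c + lam * (V_a.T @ V_a) + ridge
        X = np.linalg.solve(G, (C @ V_c + lam * (A @ V_a)).T).T
        H = X.T @ X
        V_c = np.linalg.solve(H + ridge, (C.T @ X).T).T
        V_a = np.linalg.solve(lam * H + ridge, (lam * (A.T @ X)).T).T
        history.append(loss(X, V_c, V_a))
    return X, V_c, V_a, history

# Hypothetical data: 4 papers, 5 terms, 3 authors.
C = np.array([[2, 1, 0, 0, 0],
              [1, 2, 0, 0, 1],
              [0, 0, 3, 1, 0],
              [0, 0, 1, 2, 0]], dtype=float)
A = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 1, 1]], dtype=float)

X, V_c, V_a, history = joint_factorize(C, A, k=2)
```

Because `X` must explain both matrices at once, papers that share terms or authors end up close together in the latent space.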
III. Methodology - Learning a latent space graph
Build latent space graph
• The edge weight w_ij is defined between documents i and j in the latent space; the weights form the matrix W
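The definition of w_ij is not preserved in the text. As an illustration only, the sketch below assumes cosine similarity between latent vectors, with negative values clipped to zero and no self-loops; the authors' actual weighting may differ, and `latent_graph` is a hypothetical helper.

```python
import numpy as np

def latent_graph(X):
    """Weight matrix W from latent vectors (cosine similarity is an assumption)."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.maximum(norms, 1e-12)   # guard against all-zero rows
    W = unit @ unit.T                     # pairwise cosine similarity
    W = np.clip(W, 0.0, None)             # keep only non-negative weights
    np.fill_diagonal(W, 0.0)              # no self-loops
    return W

# Hypothetical latent vectors for three documents.
X = np.array([[1.0, 0.0],
              [0.9, 0.1],
              [0.0, 1.0]])
W = latent_graph(X)
```

Documents 0 and 1 point in nearly the same latent direction and get a weight close to 1, while documents 0 and 2 are orthogonal and get no edge.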
III. Methodology
Case study: Application to expert finding
• Utilize a statistical language model to calculate the initial ranking scores
  • The probability of a query given a document
  • Infer a document model θ_d for each document
  • The probability of the query generated by the document model θ_d
  • The product of terms generated by the document model (assumption: terms are independent)
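Under the independence assumption, the query likelihood is p(q | θ_d) = ∏ p(t | θ_d) over the query terms t. The sketch below computes it in log space. The smoothing method is not given in the text, so Jelinek-Mercer smoothing with weight `lam` is an assumption, and the documents are made up.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection, lam=0.5):
    """log p(q | theta_d) for a unigram model with Jelinek-Mercer smoothing."""
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    score = 0.0
    for t in query:
        p_doc = doc_tf[t] / len(doc)              # maximum-likelihood estimate
        p_coll = coll_tf[t] / len(collection)     # background collection model
        # Assumes every query term occurs somewhere in the collection.
        score += math.log((1 - lam) * p_doc + lam * p_coll)
    return score

# Hypothetical two-document collection.
docs = {
    "d1": "graph based ranking model for expert finding".split(),
    "d2": "latent semantic analysis of term document matrix".split(),
}
collection = [t for d in docs.values() for t in d]
query = "expert ranking".split()

scores = {name: query_log_likelihood(query, d, collection)
          for name, d in docs.items()}
```

Here `d1` contains both query terms and scores higher than `d2`, which relies on the collection model alone.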