QUINT: On Query-Specific Optimal Networks
Presenter: Liangyue Li
Joint work with Jie Tang (Tsinghua), Hanghang Tong (ASU), Yuan Yao (NJU), Wei Fan (Baidu)
Arizona State University

Node Proximity: What?
§ Node proximity: the closeness (a.k.a. relevance or similarity) between two nodes
[Figure: example graph with nodes 1-12, annotated with numeric scores]
What is the closest node to 4?

Node Proximity: Why?
§ Biology [Ni+]
§ Social Network [Lerman+]
§ E-commerce [Chen+]
§ Disaster Management [Zheng+]

Node Proximity: How?
§ Random Walk with Restart (RWR)
– Idea: summarize multiple weighted relationships between nodes
– Variants:
• Electric networks: SAEC [Faloutsos+]
• Katz [Katz], [Huang+]
• Matrix-Forest-based Algorithm [Chebotarev+]
[Figure: example graph with unit edge weights and colored paths from A to B]
Prox(A, B) = Score(Red Path) + Score(Green Path) + Score(Blue Path) + Score(Purple Path) + …

Node Proximity: RWR
[Figure: example graph with nodes 1-12]

Node Proximity -- RWR
§ Detail: a random walker starts from $s$
– (a) transmit to one neighbor with probability $p \sim c A_{ij}$
– (b) go back to $s$ with probability $(1 - c)$
§ Formulation
$r_s = c A r_s + (1 - c) e_s$
– $r_s$: ranking vector; $A$: adjacency matrix; $(1 - c)$: restart probability; $e_s$: starting vector
§ Assumption
– How to best leverage the fixed input graph $A$
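A minimal sketch of the RWR formulation above, in Python with NumPy. It iterates $r_s = c A r_s + (1 - c) e_s$ to a fixed point and compares the result with the closed form $(1 - c)(I - cA)^{-1} e_s$. The function name `rwr`, the toy 4-node graph, the column normalization of $A$, and $c = 0.9$ are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def rwr(A, s, c=0.9, tol=1e-10, max_iter=1000):
    """Iterate r <- c*A*r + (1-c)*e_s until the L1 change drops below tol."""
    n = A.shape[0]
    e = np.zeros(n)
    e[s] = 1.0                      # starting vector e_s
    r = e.copy()
    for _ in range(max_iter):
        r_new = c * (A @ r) + (1 - c) * e
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
    return r

# Hypothetical toy graph (symmetric, unweighted), column-normalized (assumption).
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A = W / W.sum(axis=0, keepdims=True)

r = rwr(A, s=0)

# Closed form of the same fixed point: r_s = (1 - c) (I - cA)^{-1} e_s.
closed = (1 - 0.9) * np.linalg.solve(np.eye(4) - 0.9 * A, np.eye(4)[:, 0])
```

With a column-stochastic $A$ and $c < 1$ the iteration is a contraction, so the iterative and closed-form rankings should agree up to the tolerance.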

Node Proximity: Learning RWR
$Q = (I - cA)^{-1}$
§ Goal
– Use side information to learn a better graph
– Side info: user feedback, node attributes
§ Key Idea: Infer optimal edge weights $w$
$\min_{w} \|w\|^2 + \lambda \sum_{x \in \mathcal{P},\, y \in \mathcal{N}} h(Q(y,s) - Q(x,s))$
– $w$: maps edge attributes to weights
– Sum over $(x, y)$: matches user preferences
§ Limitation: Fixed topology
References:
J. Tang, T. Lou and J. Kleinberg. Transfer Link Prediction across Heterogeneous Social Networks. TOIS, 2015.
L. Backstrom and J. Leskovec. Supervised random walks: predicting and recommending links in social networks. WSDM, 2011.
A. Agarwal, S. Chakrabarti, and S. Aggarwal. Learning to rank networked entities. KDD, 2006.

Algorithmic Questions
§ Q1: optimal weights or optimal topology?
§ Q2: one-fits-all or one-fits-one?
§ Q3: offline learning or online learning?

Q1: Optimal Weights or Topology?
§ Observation: real networks are noisy and incomplete
§ Challenge: learn optimal weights and topology
[Figure: example graph with nodes 1-12 and numeric scores, with one missing edge and one noisy edge marked]

Q2: One-fits-all, or one-fits-one?
§ Observation: the optimal network for different queries might be different
[Figure: two copies of the example graph, each with a different query node and its own positive nodes $\mathcal{P}$ and negative nodes $\mathcal{N}$]
§ Challenge:
– How to tailor learning for each query

Q3: Offline or Online Learning
§ Observation:
– Learning RWR: costly iterative sub-routine to compute a single gradient vector
– Learning topology: parameter space expands to $O(n^2)$
– One-fits-one: one optimal network for each query
§ Challenge:
– How to perform query-specific online learning?

Query-specific Optimal Network Learning
[Figure: example graph with the query node $s$, positive nodes $\mathcal{P}$, and negative nodes $\mathcal{N}$ marked]
Given: An input network $A$, a query node $s$, positive nodes $\mathcal{P}$ and negative nodes $\mathcal{N}$
Learn: An optimal network $A_s$ specific to the query

Roadmap
§ Motivations
§ Proposed Solutions: QUINT
§ Empirical Evaluations
§ Conclusions

QUINT - Formulations
$Q = (I - cA)^{-1}$
§ Optimization Formulation (hard version)
$\arg\min_{A_s} \|A_s - A\|_F^2 \quad \text{s.t.} \quad Q(x,s) > Q(y,s),\ \forall x \in \mathcal{P},\ \forall y \in \mathcal{N}$
– Objective: matching the input network
– Constraint: matching preference (hard), with $x$ ranging over positive nodes and $y$ over negative nodes
§ Remarks
– Larger parameter space: $O(n^2)$
– Query-specific optimal network
– No exception is allowed in the constraint

QUINT - Formulations
$Q = (I - cA)^{-1}$
§ Optimization Formulation (soft version)
$\arg\min_{A_s} L(A_s) = \lambda \|A_s - A\|_F^2 + \sum_{x \in \mathcal{P},\, y \in \mathcal{N}} g(Q(y,s) - Q(x,s))$
– $L(A_s)$: loss function
– Sum over $(x, y)$: penalty for violating the preferences
§ Remarks
– Characteristic: $Q(y,s) < Q(x,s) \Rightarrow g(\cdot) = 0$; $Q(y,s) > Q(x,s) \Rightarrow g(\cdot) > 0$
– Wilcoxon-Mann-Whitney (WMW) loss
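The soft objective can be sketched in a few lines of NumPy. The slides name the WMW loss but do not give $g$ explicitly, so this sketch uses a squared hinge as a stand-in, chosen only because it is differentiable and satisfies the two stated properties ($g = 0$ when the preference holds, $g > 0$ when it is violated). It also assumes $Q$ is computed from the learned network $A_s$. The names `squared_hinge` and `quint_soft_loss`, the toy path graph, and the values of $c$ and $\lambda$ are hypothetical.

```python
import numpy as np

def squared_hinge(d):
    """Stand-in for g (assumption): 0 for d <= 0, positive and smooth for d > 0."""
    return np.maximum(d, 0.0) ** 2

def quint_soft_loss(A_s, A, s, P, N, c=0.5, lam=1.0):
    """lam * ||A_s - A||_F^2 + sum over x in P, y in N of g(Q(y,s) - Q(x,s))."""
    n = A.shape[0]
    Q = np.linalg.inv(np.eye(n) - c * A_s)     # Q from the learned network (assumption)
    fit = lam * np.linalg.norm(A_s - A, "fro") ** 2
    d = Q[N, s][:, None] - Q[P, s][None, :]    # d[y, x] = Q(y,s) - Q(x,s)
    return fit + squared_hinge(d).sum()

# Hypothetical toy path graph 0-1-2-3, column-normalized.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A = W / W.sum(axis=0, keepdims=True)

# Preference already satisfied: node 1 is closer to the query 0 than node 3.
loss_ok = quint_soft_loss(A, A, s=0, P=[1], N=[3])
# Preference violated: asking for node 3 to outrank node 1.
loss_bad = quint_soft_loss(A, A, s=0, P=[3], N=[1])
```

When the input network already ranks every positive node above every negative node, both terms vanish at $A_s = A$; a violated preference makes the penalty term positive and gives gradient descent something to correct.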

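QUINT -- Optimization
$Q = (I - cA)^{-1}$
§ Gradient Descent Based Solution
– Gradient ($g$ is differentiable; $d_{yx} = Q(y,s) - Q(x,s)$)
$\frac{\partial L(A_s)}{\partial A_s} = 2\lambda (A_s - A) + \sum_{x \in \mathcal{P},\, y \in \mathcal{N}} \frac{\partial g(Q(y,s) - Q(x,s))}{\partial A_s}$
$= 2\lambda (A_s - A) + \sum_{x,y} \frac{\partial g(d_{yx})}{\partial d_{yx}} \left( \frac{\partial Q(y,s)}{\partial A_s} - \frac{\partial Q(x,s)}{\partial A_s} \right)$
– Derivative of an Inverse ($J^{ij}$ is the single-entry matrix with a 1 at position $(i, j)$)
$\frac{\partial Q}{\partial A_s(i,j)} = -Q \, \frac{\partial (I - cA_s)}{\partial A_s(i,j)} \, Q = c \, Q J^{ij} Q$
$\frac{\partial Q(x,s)}{\partial A_s(i,j)} = c \, Q(x,i) \, Q(j,s)$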
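The derivative-of-an-inverse result can be sanity-checked numerically. The sketch below, in NumPy, compares the closed form $c\,Q(x,i)\,Q(j,s)$ with a finite-difference estimate obtained by perturbing a single entry of the matrix. The random test matrix, the indices, the step size `eps`, and the helper name `grad_Q_entry` are all illustrative assumptions.

```python
import numpy as np

def grad_Q_entry(Q, x, s, c):
    """Matrix G with G[i, j] = c * Q[x, i] * Q[j, s], i.e. dQ(x,s)/dA(i,j)."""
    return c * np.outer(Q[x, :], Q[:, s])

rng = np.random.default_rng(0)
n, c = 6, 0.5
A = rng.random((n, n))
A /= A.sum(axis=0, keepdims=True)          # column-normalize so I - cA is invertible
Q = np.linalg.inv(np.eye(n) - c * A)

x, s, i, j, eps = 2, 0, 3, 4, 1e-6

# Finite difference: bump A(i, j) by eps and recompute Q(x, s).
A_pert = A.copy()
A_pert[i, j] += eps
Q_pert = np.linalg.inv(np.eye(n) - c * A_pert)
finite_diff = (Q_pert[x, s] - Q[x, s]) / eps

analytic = grad_Q_entry(Q, x, s, c)[i, j]
```

The outer product gives the derivative of one entry $Q(x,s)$ with respect to every entry of the matrix at once, which is why a single gradient evaluation touches $O(n^2)$ parameters.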

QUINT -- Optimization
$Q = (I - cA)^{-1}$
§ Intuition
$\frac{\partial Q(x,s)}{\partial A_s(i,j)} = c \, Q(x,i) \, Q(j,s) \propto Q(j,s) \times Q(x,i)$
[Figure: query node $s$, positive node $x$, $j$ a neighbor of $s$, $i$ a neighbor of $x$]
§ Complexity
$O(T_1 |\mathcal{P}| \cdot |\mathcal{N}| (T_2 m + n^2))$
§ Observation
– Usually $T_1, T_2, |\mathcal{P}|, |\mathcal{N}| \ll m, n$
– Complexity: quadratic
Q: how to scale up?

QUINT – Scale-up
$Q = (I - cA)^{-1}$
§ Key idea: the optimal network is a rank-one perturbation of the original network
§ Details:
$\arg\min_{\mathbf{f}, \mathbf{g}} L(\mathbf{f}, \mathbf{g}) = \lambda \|\mathbf{f}\mathbf{g}'\|_F^2 + \beta (\|\mathbf{f}\|^2 + \|\mathbf{g}\|^2) + \sum_{x \in \mathcal{P},\, y \in \mathcal{N}} g(Q(y,s) - Q(x,s))$
§ Optimization: alternating gradient descent
§ Complexity: $O(T_1 |\mathcal{P}| \cdot |\mathcal{N}| (T_2 m + n))$
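One way to see why the rank-one form avoids the $n^2$ term: reading the perturbation as $A_s = A + \mathbf{f}\mathbf{g}'$ (an assumption, consistent with the first term of the objective), the identity $\|\mathbf{f}\mathbf{g}'\|_F^2 = \|\mathbf{f}\|^2 \|\mathbf{g}\|^2$ lets the non-preference part of the loss be evaluated from the two length-$n$ vectors alone. The sketch below checks this in NumPy; the function name `rank_one_regularizer` and the values of $\lambda$ and $\beta$ are hypothetical.

```python
import numpy as np

def rank_one_regularizer(f, g, lam=1.0, beta=0.1):
    """lam * ||f g'||_F^2 + beta * (||f||^2 + ||g||^2) without forming the n x n matrix."""
    ff, gg = f @ f, g @ g
    return lam * ff * gg + beta * (ff + gg)

rng = np.random.default_rng(0)
n = 50
f = rng.standard_normal(n)
g = rng.standard_normal(n)

# Reference value computed the expensive way, via the explicit outer product.
explicit = 1.0 * np.linalg.norm(np.outer(f, g), "fro") ** 2 + 0.1 * (f @ f + g @ g)
cheap = rank_one_regularizer(f, g)
```

The parameters to learn drop from $n^2$ matrix entries to $2n$ vector entries, which matches the change from $n^2$ to $n$ in the stated complexity.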

QUINT – Variant #1
§ Key idea: apply a Taylor approximation for $Q$
§ Details: $Q = (I - cA)^{-1} \approx I + \sum_{i=1}^{k} c^i A^i$
§ Complexity: using 1st-order Taylor, $O(T_1 |\mathcal{P}| \cdot |\mathcal{N}| \, n)$
§ Benefit: faster access to $Q(i,j)$
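A short NumPy sketch of the truncated series, to show how the approximation error shrinks with the order $k$. The function name `neumann_approx`, the random column-normalized test matrix, and $c = 0.5$ are illustrative assumptions; the slides only use the 1st-order case.

```python
import numpy as np

def neumann_approx(A, c, k):
    """Return I + sum_{i=1..k} c^i A^i, accumulating one power at a time."""
    n = A.shape[0]
    Q_approx = np.eye(n)
    term = np.eye(n)
    for _ in range(k):
        term = c * (term @ A)       # term becomes c^i A^i
        Q_approx += term
    return Q_approx

rng = np.random.default_rng(0)
n, c = 8, 0.5
A = rng.random((n, n))
A /= A.sum(axis=0, keepdims=True)   # column-stochastic, so the series converges for c < 1

Q_exact = np.linalg.inv(np.eye(n) - c * A)
err_first_order = np.abs(neumann_approx(A, c, 1) - Q_exact).max()
err_order_20 = np.abs(neumann_approx(A, c, 20) - Q_exact).max()
```

With $k = 1$ the approximation is just $I + cA$, so any entry $Q(i,j)$ can be read off from the input matrix without a matrix inverse, at the cost of a coarser estimate.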

QUINT – Variant #2
§ Key idea: only update the neighborhood of the query node and the pos/neg nodes (Localized Rank-One Perturbation)
§ Complexity: $O(T_1 |\mathcal{P}| \cdot |\mathcal{N}| \max(|N(s)|, |N(\mathcal{P}, \mathcal{N})|))$
– $N(s)$: neighbors of $s$
– $N(\mathcal{P}, \mathcal{N})$: neighbors of pos/neg nodes
– $\max(|N(s)|, |N(\mathcal{P}, \mathcal{N})|) \ll n$
§ Benefit: usually sub-linear in $n$

Datasets
10+ diverse networks

Effectiveness: MAP (Higher is better)
MAP: Mean Average Precision
[Figure: bar chart of MAP (0 to 0.9) per dataset. Methods: Adamic/Adar, Common Nbr, SRW, RWR, wiZAN_Dual, ProSIN, QUINT-Basic, QUINT-Basic1st, QUINT-rankOne. Datasets: Astro-Ph, GR-QC, Hep-TH, Hep-PH, Protein, Airport, Oregon, NBA, Email, Gene, Last.fm]
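For reference, a sketch of how Mean Average Precision is commonly computed from ranked lists. The slide only expands the acronym, so the exact evaluation protocol used in the experiments is not known; the function names and the toy ranking below are hypothetical.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k over the positions k where a relevant item appears."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for k, node in enumerate(ranked, start=1):
        if node in relevant:
            hits += 1
            total += hits / k       # precision at this cut-off
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevants):
    """Average the per-query AP values over all queries."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps)

# Toy query: relevant nodes "a" and "c" appear at ranks 1 and 3.
ap = average_precision(["a", "b", "c", "d"], {"a", "c"})   # (1/1 + 2/3) / 2
```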

Effectiveness: HLU (Higher is better)
HLU: Half-life Utility
[Figure: bar chart of HLU (0 to 90) per dataset: Astro-Ph, GR-QC, Hep-TH, Hep-PH, Protein, Airport, Oregon, NBA, Email, Gene, Last.fm]

Effectiveness: AUC (Higher is better)
[Figure: bar chart of AUC (0 to 1) per dataset: Astro-Ph, GR-QC, Hep-TH, Hep-PH, Protein, Airport, Oregon, NBA, Email, Gene, Last.fm]
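AUC can be read as the fraction of (positive, negative) pairs that are ranked in the right order, which is the Wilcoxon-Mann-Whitney statistic that the soft formulation's loss is named after. A minimal pairwise sketch, with a hypothetical function name and toy scores (ties counted as half a win, a common convention):

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5         # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))

# Toy proximity scores: three of the four pairs are ordered correctly.
auc_value = pairwise_auc([0.9, 0.4], [0.5, 0.1])
```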

Effectiveness: Precision@20 (Higher is better)
[Figure: bar chart of Precision@20 (0 to 1) per dataset: Astro-Ph, GR-QC, Hep-TH, Hep-PH, Protein, Airport, Oregon, NBA, Email, Gene, Last.fm]

Effectiveness: Recall@5 (Higher is better)
[Figure: bar chart of Recall@5 (0 to 0.2) per dataset: Astro-Ph, GR-QC, Hep-TH, Hep-PH, Protein, Airport, Oregon, NBA, Email, Gene, Last.fm]

Effectiveness: MPR (Lower is better)
MPR: Mean Percentile Ranking
[Figure: bar chart of MPR (0 to 0.5) per dataset: Astro-Ph, GR-QC, Hep-TH, Hep-PH, Protein, Airport, Oregon, NBA, Email, Gene, Last.fm]

Efficiency -- Twitter
[Figure: running time (seconds, log scale) of QUINT-Basic1st and QUINT-rankOne vs. # nodes (×10^7) and vs. # edges (×10^8); insets zoom in on the QUINT-rankOne running time]
QUINT-rankOne scales sub-linearly
