  1. Similarity Ranking in Large-Scale Bipartite Graphs. Alessandro Epasto, Brown University, 20th March 2014.

  2. Joint work with J. Feldman, S. Lattanzi, S. Leonardi, V. Mirrokni [WWW, 2014].

  3. AdWords [Figure: ads shown alongside search results.]

  4. Our Goal
  ● Mine AdWords data to automatically identify, for each advertiser, its main competitors, and to suggest relevant queries to each advertiser.
  ● Goals: useful business information, improved advertising, and more relevant performance benchmarks.

  5. The Data
  Query                  | Information
  Nike store New York    | Market Segment: Retailer; Geo: NY (USA); Stats: 10 clicks
  Soccer shoes           | Market Segment: Apparel; Geo: London (UK); Stats: 4 clicks
  Soccer ball            | Market Segment: Equipment; Geo: San Francisco (USA); Stats: 5 clicks
  ... millions of other queries ...
  Large advertisers (e.g., Amazon, Ask.com) compete in several market segments with very different advertisers.

  6. Modeling the Data as a Bipartite Graph
  [Figure: bipartite graph with millions of advertisers on one side, billions of queries on the other, and hundreds of labels on the queries.]

  7. Other Applications
  ● The general approach applies in several contexts:
  ● Users, Movies, Categories: find similar users and suggest movies.
  ● Authors, Papers, Conferences: find related authors and suggest papers to read.
  ● Generally, these bipartite graphs are lopsided: we want algorithms whose complexity depends on the smaller side.

  8.–11. Semi-Formal Problem Definition (progressive build)
  A bipartite graph with Advertisers on one side and Queries on the other; each query carries labels. Goal: find the nodes most “similar” to a given advertiser A.

  12. How to Define Similarity?
  ● We address the computation of several node similarity measures:
  ● Neighborhood-based: common neighbors, Jaccard coefficient, Adamic-Adar.
  ● Path-based: Katz.
  ● Random-walk-based: Personalized PageRank.
  ● Key questions: What is the accuracy? Can it scale to huge graphs? Can it be computed in real time?
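For concreteness, a minimal sketch of the neighborhood-based measures named above (these are the standard definitions; the dictionary-of-sets representation is an illustrative choice, not the paper's code):

```python
# Sketch of the neighborhood-based measures; G maps a node to its
# set of neighbors. Illustrative only.
import math

def common_neighbors(G, u, v):
    return len(G[u] & G[v])

def jaccard(G, u, v):
    union = G[u] | G[v]
    return len(G[u] & G[v]) / len(union) if union else 0.0

def adamic_adar(G, u, v):
    # Common neighbors count more when they have low degree.
    return sum(1.0 / math.log(len(G[z]))
               for z in G[u] & G[v] if len(G[z]) > 1)
```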

  13. Our Contribution
  ● Reduce and Aggregate: a general approach to induce real-time similarity rankings in multi-categorical bipartite graphs, which we apply to several similarity measures.
  ● Theoretical guarantees on the precision of the algorithms.
  ● Experimental evaluation on real-world data.

  14. Personalized PageRank
  For a seed node v and a jump probability α: at each step the walk restarts at v with probability α and otherwise moves to a random neighbor. The stationary distribution assigns a similarity score to each node in the graph w.r.t. the seed v.
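A minimal power-iteration sketch of this process (dense numpy; an illustration, not the paper's large-scale implementation):

```python
import numpy as np

def personalized_pagerank(P, seed, alpha, iters=200):
    """PPR by power iteration. P is the row-stochastic transition
    matrix; at every step the walk jumps back to `seed` with
    probability alpha and otherwise follows P."""
    e = np.zeros(P.shape[0])
    e[seed] = 1.0
    pi = e.copy()
    for _ in range(iters):
        pi = alpha * e + (1 - alpha) * pi @ P
    return pi  # similarity scores of all nodes w.r.t. the seed
```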

  15. Personalized PageRank
  ● Extensive algorithmic literature.
  ● Very good accuracy in our experimental evaluation compared to other similarities (Jaccard, intersection, etc.).
  ● Efficient MapReduce algorithms scale to large graphs (hundreds of millions of nodes). However...

  16. Personalized PageRank
  ● Our graphs are too big (billions of nodes) even for large-scale systems.
  ● MapReduce is not real-time.
  ● We cannot pre-compute the rankings for each subset of labels.

  17. Reduce and Aggregate
  Reduce: Given the bipartite graph and a category, construct a graph with only A-side nodes that preserves the ranking on the entire graph.
  Aggregate: Given a node v in A and the reduced graphs of the categories of interest, determine the ranking for v.

  18. In Practice
  First stage: a large-scale (but feasible) MapReduce pre-computation of the reduced graph of each individual category.
  Second stage: a fast real-time aggregation algorithm.

  19. Reduce for Personalized PageRank
  ● Based on Markov chain state aggregation theory (Simon and Ando, '61; Meyer, '89; etc.).
  ● 750x reduction in the number of nodes while exactly preserving the PPR distribution on the entire graph.

  20.–24. Stochastic Complementation
  Partition the transition matrix P into blocks according to the subsets of states C_1, ..., C_k:

  P = \begin{pmatrix} P_{11} & \cdots & P_{1i} & \cdots & P_{1k} \\ \vdots & & \vdots & & \vdots \\ P_{i1} & \cdots & P_{ii} & \cdots & P_{ik} \\ \vdots & & \vdots & & \vdots \\ P_{k1} & \cdots & P_{ki} & \cdots & P_{kk} \end{pmatrix}

  The stochastic complement of C_i is the |C_i| × |C_i| matrix

  S_i = P_{ii} + P_{i*} (I - P_{**})^{-1} P_{*i}

  where P_{i*} collects the transitions from C_i to all other states, P_{**} the transitions among the other states, and P_{*i} the transitions from the other states back into C_i.
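The formula translates directly into a few lines of numpy; a sketch for dense matrices (the point of the paper is precisely to avoid this inversion at scale):

```python
import numpy as np

def stochastic_complement(P, in_block):
    """S_i = P_ii + P_i* (I - P_**)^{-1} P_*i for the states where
    the boolean mask `in_block` is True; P must be row-stochastic."""
    P_ii = P[np.ix_(in_block, in_block)]
    P_io = P[np.ix_(in_block, ~in_block)]   # C_i -> other states
    P_oo = P[np.ix_(~in_block, ~in_block)]  # other -> other
    P_oi = P[np.ix_(~in_block, in_block)]   # other -> C_i
    eye = np.eye(P_oo.shape[0])
    # solve() applies (I - P_**)^{-1} without forming the inverse.
    return P_ii + P_io @ np.linalg.solve(eye - P_oo, P_oi)
```

Each row of S_i sums to 1, so the complement is itself a Markov chain on the states of C_i.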

  25. Stochastic Complementation
  Theorem [Meyer '89]: For every irreducible aperiodic Markov chain, \pi_i = t_i \, s_i, where \pi_i is the stationary distribution restricted to the nodes of C_i, s_i is the stationary distribution of the stochastic complement S_i, and t_i is the total stationary probability of C_i.
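The theorem is easy to check numerically on a small chain; a self-contained sketch (the random chain and the block choice are arbitrary):

```python
import numpy as np

def stationary(P, iters=5000):
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        pi = pi @ P
    return pi

def stochastic_complement(P, in_block):
    P_ii = P[np.ix_(in_block, in_block)]; P_io = P[np.ix_(in_block, ~in_block)]
    P_oo = P[np.ix_(~in_block, ~in_block)]; P_oi = P[np.ix_(~in_block, in_block)]
    return P_ii + P_io @ np.linalg.solve(np.eye(P_oo.shape[0]) - P_oo, P_oi)

rng = np.random.default_rng(0)
P = rng.random((6, 6))
P /= P.sum(axis=1, keepdims=True)       # random irreducible aperiodic chain
pi = stationary(P)

C_i = np.array([True, True, True, False, False, False])
s_i = stationary(stochastic_complement(P, C_i))
t_i = pi[C_i].sum()                     # stationary mass of C_i
assert np.allclose(pi[C_i], t_i * s_i)  # Meyer: pi_i = t_i * s_i
```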

  26. Stochastic Complementation
  ● Computing the stochastic complements is infeasible in general for large matrices (it requires a matrix inversion).
  ● In our case we can exploit the properties of random walks on bipartite graphs to invert the matrix analytically.

  27. Reduce for PPR
  [Figure: bipartite graph with A-side nodes x, y and a B-side node z; edge weights w(x, z) and w(y, z).]

  28. Reduce for PPR
  The reduced graph connects the A-side nodes x and y with weight

  w(x, y) = \sum_{z \in N(x) \cap N(y)} \frac{w(x, z)\, w(y, z)}{\sum_{h \in N(z)} w(z, h)}

  29. Reduce for PPR
  [Figure: the reduced edge w(x, y) replaces the two-step paths through common neighbors z.] One step in the reduced graph is equivalent to two steps in the bipartite graph.
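A sketch of this reduction over weighted adjacency lists (dictionary-based; the representation and names are illustrative):

```python
from collections import defaultdict

def reduce_to_a_side(adj):
    """Collapse a weighted bipartite graph onto its A side.
    `adj` maps each A-node x to a dict {z: w(x, z)} of B-neighbors.
    Implements w_hat(x, y) = sum_z w(x, z) * w(y, z) / w(z), with
    w(z) = sum_h w(z, h): one reduced step = two bipartite steps."""
    w_z = defaultdict(float)    # total weight at each B-node
    by_z = defaultdict(dict)    # B-node -> {A-node: weight}
    for x, nbrs in adj.items():
        for z, w in nbrs.items():
            w_z[z] += w
            by_z[z][x] = w
    reduced = defaultdict(lambda: defaultdict(float))
    for z, a_nbrs in by_z.items():
        for x, wxz in a_nbrs.items():
            for y, wyz in a_nbrs.items():
                reduced[x][y] += wxz * wyz / w_z[z]  # self-loops kept
    return reduced
```

Self-loops (x = y) are kept on purpose: a two-step walk can return to its starting node, and dropping them would change the stationary distribution.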

  30. Properties of the Reduced Graph
  Lemma 1: PPR(G, \alpha, a)[A] = \frac{1}{2 - \alpha} \, PPR(\hat{G}, 2\alpha - \alpha^2, a)
  Proof sketch:
  ● Every path between nodes in A has even length.
  ● The probability of not jumping for two consecutive steps is (1 - \alpha)^2, which gives the jump probability 2\alpha - \alpha^2 in the reduced graph.
  ● The probability of being on the A side at stationarity is 1/(2 - \alpha) and does not depend on the graph.

  31. Properties of the Reduced Graph
  Similarly, we can reduce the process to a graph with B-side nodes only.
  Lemma 2: PPR(G, \alpha, a)[B] = \frac{1 - \alpha}{2 - \alpha} \sum_{b \in N(a)} w(a, b) \, PPR(\hat{G}_B, 2\alpha - \alpha^2, b)
  Finally, the stationary distribution of either side uniquely determines that of the other side.
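Lemma 1 can be verified end-to-end on a toy graph; a self-contained sketch (the graph, weights, and α are arbitrary illustrative choices):

```python
import numpy as np

def ppr(P, seed, a, iters=500):
    e = np.zeros(P.shape[0]); e[seed] = 1.0
    pi = e.copy()
    for _ in range(iters):
        pi = a * e + (1 - a) * pi @ P
    return pi

alpha = 0.15
W = np.array([[1.0, 1.0, 0.0, 0.0],   # W[x, z] = w(x, z):
              [0.0, 1.0, 1.0, 0.0],   # 3 A-nodes x 4 B-nodes
              [0.0, 0.0, 1.0, 1.0]])
nA, nB = W.shape

# Random walk on the full bipartite graph (A states first, then B).
P = np.zeros((nA + nB, nA + nB))
P[:nA, nA:] = W / W.sum(axis=1, keepdims=True)
P[nA:, :nA] = (W / W.sum(axis=0, keepdims=True)).T
pi_full = ppr(P, seed=0, a=alpha)

# Reduced A-side walk: W_hat = W D_B^{-1} W^T, jump prob 2a - a^2.
W_hat = W @ np.diag(1.0 / W.sum(axis=0)) @ W.T
P_hat = W_hat / W_hat.sum(axis=1, keepdims=True)
pi_hat = ppr(P_hat, seed=0, a=2 * alpha - alpha ** 2)

# Lemma 1: A-side of the full PPR = reduced PPR scaled by 1/(2 - alpha).
assert np.allclose(pi_full[:nA], pi_hat / (2 - alpha))
```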

  32. Koury et al. Aggregation-Disaggregation Algorithm
  Step 1: Partition the Markov chain into disjoint subsets (here, A and B).

  33. Koury et al. Aggregation-Disaggregation Algorithm
  Step 2: Approximate the stationary distribution on each subset independently (\pi_A, \pi_B).

  34. Koury et al. Aggregation-Disaggregation Algorithm
  Step 3: Compute the k × k approximated transition matrix T between the subsets (from the blocks P_{AA}, P_{AB}, P_{BA}, P_{BB}).

  35. Koury et al. Aggregation-Disaggregation Algorithm
  Step 4: Compute the stationary distribution (t_A, t_B) of T.

  36. Koury et al. Aggregation-Disaggregation Algorithm
  Step 5: Based on the stationary distribution of T, improve the estimates of \pi_A and \pi_B. Repeat until convergence.
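A compact sketch of the five steps for a chain split into disjoint blocks (dense numpy; an illustration of the classical scheme, not the paper's variant):

```python
import numpy as np

def aggregation_disaggregation(P, blocks, outer_iters=50):
    """Steps 1-5 above. `blocks` is a disjoint partition of the
    states of the row-stochastic chain P, e.g. [A_idx, B_idx]."""
    n, k = P.shape[0], len(blocks)
    # Step 2: initial per-block estimates (uniform inside each block).
    phi = [np.full(len(b), 1.0 / len(b)) for b in blocks]
    for _ in range(outer_iters):
        # Step 3: k x k coupling matrix T between the blocks.
        T = np.array([[phi[i] @ P[np.ix_(blocks[i], blocks[j])].sum(axis=1)
                       for j in range(k)] for i in range(k)])
        # Step 4: stationary distribution t of T by power iteration.
        t = np.full(k, 1.0 / k)
        for _ in range(200):
            t = t @ T
        # Step 5: disaggregate, refine with one power step, re-split.
        pi = np.zeros(n)
        for i in range(k):
            pi[blocks[i]] = t[i] * phi[i]
        pi = pi @ P
        pi /= pi.sum()
        phi = [pi[b] / pi[b].sum() for b in blocks]
    return pi
```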

  37. Aggregation in PPR
  [Figure: advertiser side A with categories X and Y; distribution \pi_A.] Precompute the stationary distributions individually.

  38. Aggregation in PPR
  [Figure: query side B with categories X and Y; distribution \pi_B.] Precompute the stationary distributions individually.

  39. Aggregation in PPR
  The two subsets are not disjoint!

  40. Reduction to the Query Side
  [Figure: categories X and Y reduced on the query side; distributions \pi_A, \pi_B.]

  41. Reduction to the Query Side
  This is the larger side of the graph.

  42. Our Approach
  ● We exploit the bijective relationship between the stationary distributions of the two sides.
  ● The algorithm is based only on the reduced graphs with advertiser-side nodes.
  ● The aggregation algorithm is scalable and converges to the correct distribution.
