Finding Dense Subgraphs via Low-Rank Bilinear Optimization

Ioannis Mitliagkas, Dimitris Papailiopoulos
with: Alex Dimakis, Constantine Caramanis
UT Austin

Densest k-Subgraph (DkS)
Given a graph and a parameter k, find the k vertices containing the most edges.
Applications:
• Community mining: communities = large dense components
• Link spam detection: dense parts of the web = spam
• Computational biology: complex patterns in gene annotation graphs

Densest k-Subgraph (DkS)
There is a 5-subgraph with 10 edges.
Q: Can you find it?
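To make the problem statement concrete, here is a minimal brute-force sketch in Python. The toy graph (a 5-clique with a short path attached) and the function names are illustrative assumptions, not taken from the talk. It checks every k-subset, which is only feasible for tiny graphs.

```python
from itertools import combinations


def edges_inside(edges, subset):
    """Count the edges with both endpoints in `subset`."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s)


def densest_k_subgraph_bruteforce(n, edges, k):
    """Try all C(n, k) vertex subsets and keep the one with the most edges."""
    best = max(combinations(range(n), k), key=lambda S: edges_inside(edges, S))
    return set(best), edges_inside(edges, best)


# Hypothetical toy graph: a 5-clique on vertices 0..4 plus the path 4-5-6-7.
clique_edges = list(combinations(range(5), 2))  # 10 edges
path_edges = [(4, 5), (5, 6), (6, 7)]
toy_edges = clique_edges + path_edges

best_set, best_edge_count = densest_k_subgraph_bruteforce(8, toy_edges, 5)
print(best_set, best_edge_count)
```

The number of subsets grows as C(n, k), so this approach stops being practical almost immediately; it is useful only as a reference for checking faster heuristics on small inputs.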

Densest k-Subgraph (DkS)
• NP-hard
• Hard to approximate [Khot, 2004]*
*Except in specific cases: [Arora et al. 95] give a (1+ε)-approximation for linear subgraphs of dense graphs.

Worst-Case Analysis
After long effort [Feige, 2001], [Bhaskara et al., STOC '10], the best known approximation ratio is $O(n^{1/4+\varepsilon})$:
• 10-factor approx. for graphs with 10K nodes
• 100-factor approx. for graphs with 100 million nodes

Known DkS guarantees are not useful in practice… under worst-case analysis
Q1: Provable, graph-dependent bounds?
Q2: DkS on billion-scale graphs?

Beyond the Worst Case
New DkS algorithm:
• Graph-dependent bounds
In practice:
• Scalable: nearly-linear time for many real-world graphs
• Parallelizable: implementation in MapReduce+Python, up to billion-edge graphs on 800 cores on Amazon EC2

Our Low-Rank Framework
[Figure: a 0/1 adjacency matrix and its low-rank approximation with real-valued entries]
• DkS on a graph: hard to solve, hard to approximate
• DkS on a constant-rank graph (after low-rank approximation): nearly-linear time solvable (!)
• Low-rank DkS is related to the original DkS
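The low-rank approximation step can be sketched with NumPy's symmetric eigendecomposition. This is a minimal illustration under stated assumptions: eigenvalues are ranked by absolute value, the function name `rank_d_approximation` is hypothetical, and the 5-vertex graph is a toy example. Dense `eigh` is for exposition only and would not scale to the graph sizes discussed in the talk.

```python
import numpy as np


def rank_d_approximation(A, d):
    """Return the rank-d approximation of symmetric A and the (d+1)-th eigenvalue.

    Eigenpairs are ranked by |eigenvalue| (an assumption of this sketch).
    """
    eigvals, eigvecs = np.linalg.eigh(A)
    order = np.argsort(-np.abs(eigvals))
    top = order[:d]
    # Sum of lambda_i * v_i v_i^T over the d retained eigenpairs.
    A_d = (eigvecs[:, top] * eigvals[top]) @ eigvecs[:, top].T
    next_eig = eigvals[order[d]] if d < len(eigvals) else 0.0
    return A_d, next_eig


# Hypothetical toy graph: triangle 0-1-2 with the path 2-3-4 attached.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
A = np.zeros((5, 5))
for u, v in toy_edges:
    A[u, v] = A[v, u] = 1.0

A_2, lam_3 = rank_d_approximation(A, 2)
residual = np.linalg.norm(A - A_2, 2)  # spectral norm of what was discarded
print(residual, abs(lam_3))
```

The spectral norm of the discarded part equals the magnitude of the first dropped eigenvalue, which is the quantity that appears as the additive error term in the guarantees.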

Results: Theory

Graph-dependent Guarantees
Theorems:
• Our algorithm computes in time $O(n^{d+2}/\delta)$ a k-subgraph with density
  $\mathrm{OPT}_d \ge \mathrm{OPT} \cdot 0.5 \cdot (1-\delta) - 2\,|\lambda_{d+1}|$
• If the largest d eigenvalues of the adjacency matrix are positive, our algorithm computes in time $O(|E| \cdot \log n + n/\epsilon^d)$ a k-subgraph with density
  $\mathrm{OPT}_d \ge \mathrm{OPT} \cdot (1-\epsilon) - 2\,|\lambda_{d+1}|$
Larger d => better approximation, slower computation
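Rearranging the first guarantee gives an upper bound on OPT that can be computed from the algorithm's output and the spectrum. The derivation below is plain algebra on the stated inequality, assuming δ < 1; it is not necessarily the exact bound plotted in the experiments.

```latex
\begin{align*}
\mathrm{OPT}_d &\ge \mathrm{OPT} \cdot 0.5 \cdot (1-\delta) - 2\,|\lambda_{d+1}| \\
\Longrightarrow\quad
\mathrm{OPT} &\le \frac{\mathrm{OPT}_d + 2\,|\lambda_{d+1}|}{0.5\,(1-\delta)}
\end{align*}
```

The bound tightens when $|\lambda_{d+1}|$ is small relative to $\mathrm{OPT}_d$, which is why a fast-decaying spectrum makes the guarantee meaningful.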

Performance in Practice

com-LiveJournal graph: 4M nodes, 35M edges
[Plot: density vs. subgraph size k]
• Trivial upper bound on density: k-1
• Blue: TPower (JMLR '13); Green: GreedyFeige (Algorithmica '01); Yellow: GreedyRavi (OR '94)
• Baselines only: big gap to the trivial upper bound
• With the d=1 spannogram: smaller gap
• Graph-dependent bound $\mathrm{OPT}_d + \lambda_{d+1}$: 80% OPT

How we do it

DkS via Quadratic Optimization
[Figure: adjacency matrix A with rows and columns indexed by vertex; the highlighted block holds the edges in the subgraph]
DkS: $\mathrm{OPT} = \frac{1}{k} \max_{|S|=k} \mathbf{1}_S^\top A\, \mathbf{1}_S$

DkS via Bilinear Optimization
DkS: $\frac{1}{k} \max_{|S|=k} \mathbf{1}_S^\top A\, \mathbf{1}_S$
DBkS: $\frac{1}{k} \max_{|X|=|Y|=k} \mathbf{1}_X^\top A\, \mathbf{1}_Y$
Lemma: a ρ-approximation for DBkS = a ½ρ-approximation for DkS
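The quadratic and bilinear objectives can be evaluated with indicator vectors, as in the sketch below. It assumes density is normalized as x^T A x / k, which matches the trivial upper bound of k-1 quoted for the experiments; the helper names and the toy graph are illustrative assumptions.

```python
import numpy as np
from itertools import combinations


def indicator(n, subset):
    """0/1 vector with ones on the vertices in `subset`."""
    x = np.zeros(n)
    x[list(subset)] = 1.0
    return x


def density(A, S):
    """Quadratic objective: x^T A x / k, i.e. the average degree inside S."""
    x = indicator(A.shape[0], S)
    return float(x @ A @ x) / len(S)


def bilinear_density(A, X, Y):
    """Bilinear objective: x^T A y / k for two k-sets X and Y."""
    x = indicator(A.shape[0], X)
    y = indicator(A.shape[0], Y)
    return float(x @ A @ y) / len(X)


# Hypothetical toy graph: a 5-clique on vertices 0..4 plus the path 4-5-6-7.
toy_edges = list(combinations(range(5), 2)) + [(4, 5), (5, 6), (6, 7)]
A = np.zeros((8, 8))
for u, v in toy_edges:
    A[u, v] = A[v, u] = 1.0

clique = [0, 1, 2, 3, 4]
print(density(A, clique))                   # a k-clique attains k - 1
print(bilinear_density(A, clique, clique))  # X = Y recovers the quadratic objective
```

Because X = Y is always a feasible choice, the bilinear optimum is at least the quadratic one; the relaxation decouples the two sides, which is what makes the low-rank case tractable.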

Low-Rank Approximation DBkS :

Low-Rank Approximation DBkS : 0.9 1.1 1.2 0.1 1.3 0.6 1.4 0.7 -0.2 -0.3

Low-Rank Approximation DBkS : 0.9 1.1 1.2 0.1 1.3 0.6 1.4 0.7 -0.2 -0.3 Efficiently solvable

How the Low-Rank Solver Works
Naïvely: check all $\binom{n}{k}$ subgraphs.
Rank-1 case:
Q: Maximize the product of two numbers
A: Maximize each number individually

How the Rank-1 Solver Works
[Figure: x and y each select entries of the vector v = (1, 2, 3, 4)]
top-k set: the k largest coordinates of a vector, e.g., if k = 2, then top-2 set = {3, 4}
Intuition: x, y pick the top-k set of v.
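A minimal sketch of the rank-1 case follows, assuming the rank-1 matrix is λ v v^T with λ > 0, so the bilinear objective factors as λ (v^T x)(v^T y). Checking the top-k set of -v as well is an addition of this sketch to handle vectors with large negative entries. Indices are 0-based here, so the slide's top-2 set {3, 4} appears as {2, 3}.

```python
import numpy as np


def top_k_set(v, k):
    """Indices (0-based) of the k largest coordinates of v."""
    return set(np.argsort(-v, kind="stable")[:k].tolist())


def rank1_bilinear_solver(v, k):
    """Maximize (v^T x)(v^T y) over k-sparse 0/1 vectors x and y.

    Both factors are maximized by the same support: either the top-k set of v
    or the top-k set of -v, whichever gives the larger squared sum.
    """
    candidates = [top_k_set(v, k), top_k_set(-v, k)]
    best = max(candidates, key=lambda S: sum(v[i] for i in S) ** 2)
    return best, best


v = np.array([1.0, 2.0, 3.0, 4.0])
x_support, y_support = rank1_bilinear_solver(v, 2)
print(x_support, y_support)
```

The cost is dominated by one partial sort of v, which is what makes the rank-1 problem easy compared with checking all subsets.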

How the Rank-2 Solver Works
[Figure: x and y each select entries from the two vectors (1, 2, 3, 4) and (5, 2, 7, 0)]
Intuition: x, y pick the top-k set of a vector from a 2-dimensional span.
Q: How many top-k sets are there in a 2-dimensional span?
Based on Spannogram [Asteris, Papail., Karystinos, ISIT 2011]
Theorem: bound on the # of top-k sets in a d-dimensional span
Spannogram: traverses all of them efficiently
Randomized algorithm:
Take random points: $s_1, \ldots, s_{1/\epsilon^d} \in \mathrm{span}(v_1, \ldots, v_d)$
Practically linear time
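To get a feel for how few top-k sets a 2-dimensional span contains, the sketch below sweeps the direction cos(φ) v1 + sin(φ) v2 over a grid of angles and collects the distinct top-k sets it sees. This is a sampling approximation written for illustration, not the exact Spannogram traversal; the two vectors are read off the slide's figure and the function name is hypothetical.

```python
import numpy as np


def top_k_sets_in_2d_span(v1, v2, k, num_angles=3600):
    """Collect distinct top-k sets of cos(phi)*v1 + sin(phi)*v2 over an angle grid."""
    found = set()
    for phi in np.linspace(0.0, 2.0 * np.pi, num_angles, endpoint=False):
        s = np.cos(phi) * v1 + np.sin(phi) * v2
        top = frozenset(np.argsort(-s, kind="stable")[:k].tolist())
        found.add(top)
    return found


v1 = np.array([1.0, 2.0, 3.0, 4.0])
v2 = np.array([5.0, 2.0, 7.0, 0.0])
sets_found = top_k_sets_in_2d_span(v1, v2, k=2)
print(len(sets_found), sorted(sorted(S) for S in sets_found))
```

A top-k set only changes at angles where two coordinates of the combined vector cross, so the number of distinct sets is governed by the number of such crossings rather than by the number of all k-subsets.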

Implementation

MapReduce Implementation
git.io/spannogram

Billion-scale Graphs
[Plot: subgraph density (0 to 1000) vs. |E| (log scale, up to 10^10) on G(n, 1/2) graphs with k = 3√n; curves: G-Feige, G-Ravi, TPower, Spannogram]

Conclusions
• New combinatorial approx. algorithm for DkS.
• Graph-dependent spectral bounds: OPT within 70% in most experiments.
• Bound could be trivial in the worst case.
• Empirically outperforms previous state of the art.
• Highly scalable implementation.

Thank you

Backup slides

Other experiments

Randomized Algorithm
Step 1: Take random points $s_1, \ldots, s_{1/\epsilon^d} \in \mathrm{span}(v_1, \ldots, v_d)$
Step 2: Find the largest k entries of each point
Step 3: Compute the density of the corresponding subgraph
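The three steps can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' MapReduce implementation. It assumes v_1, ..., v_d are the eigenvectors with the largest |eigenvalue|, samples Gaussian coefficients to get random points in their span, normalizes density as the average degree inside the subgraph, and also evaluates -s for each sample, which is a convenience added here. The function name and toy graph are hypothetical.

```python
import numpy as np
from itertools import combinations


def randomized_low_rank_dks(A, k, d, num_samples, seed=0):
    """Sample points in the span of d leading eigenvectors and keep the densest top-k set."""
    rng = np.random.default_rng(seed)
    eigvals, eigvecs = np.linalg.eigh(A)
    V = eigvecs[:, np.argsort(-np.abs(eigvals))[:d]]  # n x d basis of the span
    best_set, best_density = None, -1.0
    for _ in range(num_samples):
        s = V @ rng.standard_normal(d)                # Step 1: random point in the span
        for candidate in (s, -s):                     # both orientations (sketch-only addition)
            S = np.argsort(-candidate, kind="stable")[:k]  # Step 2: largest k entries
            dens = A[np.ix_(S, S)].sum() / k          # Step 3: density in the original graph
            if dens > best_density:
                best_set, best_density = set(S.tolist()), float(dens)
    return best_set, best_density


# Hypothetical toy graph: a 5-clique on vertices 0..4 plus the path 4-5-6-7.
toy_edges = list(combinations(range(5), 2)) + [(4, 5), (5, 6), (6, 7)]
A = np.zeros((8, 8))
for u, v in toy_edges:
    A[u, v] = A[v, u] = 1.0

found_set, found_density = randomized_low_rank_dks(A, k=5, d=1, num_samples=20)
print(found_set, found_density)
```

Each sample costs one pass over the n coordinates plus a density evaluation, so the samples are independent and can be spread across workers, which fits the MapReduce setting described in the talk.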
