Sampling Vertices Uniformly from a Graph

Flavio Chierichetti (Sapienza University). With subsets of: Anirban Dasgupta (IIT Gandhinagar), Shahrzad Haddadan (Sapienza University), Silvio Lattanzi (Google Zurich), Ravi Kumar (Google MTV), Tamás Sarlós (Google MTV)

Social Networks • Social Networks are “large” • We would like to study their properties • We need to be able to sample from them

Learning Average Opinions [Figure: a network whose users hold the opinions 2, 0, 0, 3, 0, 1, 4, 4, 5, 1, 2, 2, 1, 2] Asking all the users is too costly!

Learning Average Opinions Select some people uniformly at random and ask them their opinion. [Figure: the sampled users report the opinions 0, 1, 1, 2] The empirical average will be close to the real average.
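The sampling idea on this slide can be sketched in a few lines of Python. This is only an illustration: the opinion values are the ones shown on the example network, while the function name `estimate_average`, the sample size, and the seed are assumptions made for the sketch.

```python
import random

def estimate_average(opinions, sample_size, rng):
    """Estimate the mean opinion from a uniform random sample (with replacement)."""
    sample = [rng.choice(opinions) for _ in range(sample_size)]
    return sum(sample) / sample_size

# Opinion values shown on the example network in the slides.
opinions = [2, 0, 0, 3, 0, 1, 4, 4, 5, 1, 2, 2, 1, 2]
true_average = sum(opinions) / len(opinions)

rng = random.Random(0)  # arbitrary fixed seed, for reproducibility only
estimate = estimate_average(opinions, 2000, rng)
print(f"true average = {true_average:.3f}, estimate = {estimate:.3f}")
```

The catch, which the rest of the talk addresses, is that `rng.choice(opinions)` assumes direct uniform access to the users, which a crawler does not have.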

Learning Average Opinions What is the fraction of [icon]? Select some people uniformly at random and ask them their opinion. The empirical fraction of [icon] will be close to the real fraction.

How do we select uniform-at-random profiles in a Social Network? • We can access the SN through a crawling process. • But we cannot crawl the whole network. Then, what can we do? http://s-n.com/001.html

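Random Walks [Figure: at each step the walk moves to a uniformly random neighbor of the current node: with probability 1/4 each from a node with four neighbors, with probability 1/3 each from a node with three neighbors]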

Random Walks Mixing Time T(G) If the process goes on for sufficiently many steps, the node it ends up on will be “random”, chosen with probability proportional to its degree. The Mixing Times of many “Social Networks” are small [Leskovec et al, ’08]. [Figure: in the example, a node of degree 1 is reached with probability 1/18 and a node of degree 4 with probability 4/18]
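The degree-proportional limit can be checked numerically. The sketch below evolves the exact distribution of a simple random walk on a small hypothetical graph (not the one in the slides; its degrees sum to 14) and compares it with deg(v)/14. The graph, the function names, and the step count are assumptions made for illustration. The graph contains a triangle, so the walk is not periodic and does converge.

```python
def step(graph, dist):
    """Apply one random-walk step to a probability distribution over nodes."""
    new = {v: 0.0 for v in graph}
    for v, p in dist.items():
        for u in graph[v]:
            new[u] += p / len(graph[v])  # move to each neighbor with prob. 1/deg(v)
    return new

def walk_distribution(graph, start, steps):
    """Exact distribution of the walk's position after `steps` steps from `start`."""
    dist = {v: 0.0 for v in graph}
    dist[start] = 1.0
    for _ in range(steps):
        dist = step(graph, dist)
    return dist

# Hypothetical undirected graph as adjacency lists; degrees are 4, 2, 3, 3, 2.
graph = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2, 4], 4: [0, 3]}
total_degree = sum(len(nbrs) for nbrs in graph.values())

dist = walk_distribution(graph, start=0, steps=200)
for v in graph:
    print(v, round(dist[v], 6), round(len(graph[v]) / total_degree, 6))
```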

A Folklore Algorithm • While True: • run the random walk for T(G) steps; • suppose it ends on the node v; • return v with probability 1/deg(v). In the running example, the degree-4 node is returned with probability ~ 4/18 · 1/4 = ~ 1/18 in each round, and the degree-1 node with probability ~ 1/18 · 1/1 = ~ 1/18. This algorithm returns a node chosen (arbitrarily close to) uniformly at random. One can easily show that this algorithm downloads, with high probability, at most O(T(G) · AvgDeg(G)) nodes from the network.
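A minimal Python sketch of the folklore algorithm follows, assuming the graph is available as adjacency lists and that a suitable value for T(G) is already known. The example graph, the value `mixing_time=30`, and the choice to restart every round from the same start node are assumptions of this sketch, not details given in the slides.

```python
import random

def random_walk(graph, start, steps, rng):
    """Simple random walk: move to a uniformly random neighbor at each step."""
    v = start
    for _ in range(steps):
        v = rng.choice(graph[v])
    return v

def folklore_sample(graph, start, mixing_time, rng):
    """Walk for `mixing_time` steps, then accept the endpoint v with prob. 1/deg(v)."""
    while True:
        v = random_walk(graph, start, mixing_time, rng)
        if rng.random() < 1.0 / len(graph[v]):
            return v  # accepted: close to uniform over the nodes
        # rejected: run another round

# Hypothetical undirected graph as adjacency lists; degrees are 4, 2, 3, 3, 2.
graph = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2, 4], 4: [0, 3]}

rng = random.Random(1)  # arbitrary fixed seed
samples = [folklore_sample(graph, 0, 30, rng) for _ in range(1000)]
print({v: samples.count(v) / len(samples) for v in graph})
```

The rejection step cancels the degree bias: a round ends on v with probability about deg(v) / Σ deg and accepts it with probability 1/deg(v), so every node is returned with the same per-round probability, about 1 / Σ deg.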

The Max-Degree Algorithm • Let D be the max-degree of G. • Add self-loops to G in order to make it D-regular. • Run the random walk for D · T(G) steps. • Return the node on which it ends. Running Time: D · T(G). # of Downloaded Vertices ≤ AvgDeg(G) · T(G)
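The max-degree walk can be sketched in Python without materializing the self-loops: at node v the walk picks one of D slots, and the D − deg(v) slots that do not correspond to a neighbor leave it in place. The example graph and `mixing_time=30` are assumptions of this sketch. It also computes D from the full adjacency structure, whereas a crawler would need D (or an upper bound on it) to be given.

```python
import random

def max_degree_sample(graph, start, mixing_time, rng):
    """Random walk on G made D-regular with self-loops, run for D * mixing_time steps."""
    max_deg = max(len(nbrs) for nbrs in graph.values())  # D, assumed known here
    v = start
    for _ in range(max_deg * mixing_time):
        slot = rng.randrange(max_deg)
        if slot < len(graph[v]):
            v = graph[v][slot]  # a real edge: move to that neighbor
        # otherwise the slot is one of the D - deg(v) self-loops: stay at v
    return v

# Hypothetical undirected graph as adjacency lists; the max-degree is D = 4.
graph = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2, 4], 4: [0, 3]}

rng = random.Random(2)  # arbitrary fixed seed
samples = [max_degree_sample(graph, 0, 30, rng) for _ in range(1000)]
print({v: samples.count(v) / len(samples) for v in graph})
```

Because the padded graph is regular, the walk's limiting distribution is uniform and no rejection step is needed. Self-loop steps revisit the current node, which is why the number of downloaded vertices can be much smaller than the number of steps.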

Can one do better? • In [C., Dasgupta, Kumar, Lattanzi, Sarlós, ’16] we analyzed various algorithms for selecting a UAR (uniform-at-random) node. • Some of them were on par with the Folklore Algorithm, some of them were worse. • In [C., Haddadan, ’18], we show that if an algorithm downloads o(T(G) · AvgDeg(G)) nodes from the network, then it cannot return anything close to a uniform-at-random node. • That is, the Folklore Algorithm is optimal.

Two Main Ingredients [Figure: two graphs, G and H] A distribution over graphs G

Decoration Construction [C., Haddadan, ’18] • Let G = (V, E) be a graph, with mixing time T. • The (random) decoration of G is a super-graph H of G constructed as follows: • for each v in V, flip an iid coin: with probability 1/T, • mark node v; • create a new node v’, and cT new nodes v’_i; • add an edge from v to v’, and an edge from v’ to each v’_i.
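The construction above can be sketched in Python as follows. The adjacency-list representation, the tuple labels `("hub", v)` for v’ and `("leaf", v, i)` for v’_i, and the requirement that `c` and `mixing_time` are positive integers are all assumptions made for this illustration.

```python
import random

def decorate(graph, mixing_time, c, rng):
    """Return (H, marked): a random decoration H of `graph` and the marked nodes."""
    h = {v: list(nbrs) for v, nbrs in graph.items()}  # H starts as a copy of G
    marked = []
    for v in graph:
        # Independent coin per node: mark v with probability 1/T.
        if rng.random() < 1.0 / mixing_time:
            marked.append(v)
            hub = ("hub", v)  # the new node v'
            h[v].append(hub)
            h[hub] = [v]
            for i in range(c * mixing_time):
                leaf = ("leaf", v, i)  # the new node v'_i
                h[hub].append(leaf)
                h[leaf] = [hub]
    return h, marked

# Hypothetical input: a cycle on 50 nodes, with T = 5 and c = 2 chosen arbitrarily.
cycle = {i: [(i - 1) % 50, (i + 1) % 50] for i in range(50)}
h, marked = decorate(cycle, mixing_time=5, c=2, rng=random.Random(3))
print(len(marked), "marked nodes;", len(h), "nodes in H")
```

Each marked node adds 1 + cT new nodes to H, so in expectation H has |V| · (1 + (1 + cT)/T) nodes.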
