Distribution and Dependence of Extremes in Network Sampling Processes

Jithin K. Sreedharan*, with Konstantine Avrachenkov* and Natalia M. Markovich†
* INRIA Sophia Antipolis, France
† Institute of Control Sciences, Russian Academy of Sciences, Moscow
March 30, 2015

Random Sampling
● All we have: the samples $X_1, X_2, \ldots, X_n$
● No complete picture a priori!
● Samples: any stationary (most likely dependent) sequence, e.g. node IDs, degrees, number of followers, or income of the nodes in an online social network (OSN)

Correlations in Graphs and Sampling
● Correlations in graph properties exist in real networks, e.g. correlation in a co-authorship network
● Usually neglected in the analysis of sampling algorithms
Effect of neglecting correlations:
● Assuming i.i.d. degrees, largest degree ≈ $K N^{1/\gamma}$, where $N$ is the number of nodes and $\gamma$ is the tail index of the Pareto distribution (N. Litvak et al., LNCS '12)
● Twitter graph (2012): $N = 537$M, $\gamma = 1.124$ for out-degree
● Largest out-degree predicted is 59M. Actual largest out-degree is 22M!
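The i.i.d. prediction for the Twitter graph can be checked with a line of arithmetic. This is a minimal sketch that assumes the constant K equals 1 (the slide does not state its value); the function name is illustrative only.

```python
def iid_largest_degree(num_nodes: float, tail_index: float, scale: float = 1.0) -> float:
    """Largest degree predicted by K * N**(1/gamma) for i.i.d. Pareto-tailed degrees.

    `scale` plays the role of K; K = 1 is an assumption made for this sketch.
    """
    return scale * num_nodes ** (1.0 / tail_index)


# Twitter out-degree figures quoted on the slide: N = 537M, gamma = 1.124.
predicted = iid_largest_degree(537e6, 1.124)
print(f"predicted largest out-degree: {predicted / 1e6:.1f}M")
```

With K = 1 the result lands within about a million of the 59M quoted on the slide, which suggests the constant is close to 1 in this example.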

Questions We Address Here
● Statistical properties of clusters
● First passage time
● Kth largest value of samples
● ... and many more extremal properties
Is there a simple way to get information about many extremal properties?
Answer: the Extremal Index

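Relation to Extreme Value Theory
Extremal Index ($\theta$): point process view
● Point process of exceedances → compound Poisson process (rate $\theta\tau$, where $\tau$ is the limiting expected number of exceedances of the threshold $u_n$, i.e. $n(1 - F(u_n)) \to \tau$)
● $\theta$ measures the tendency of exceedances to form clusters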

Extremal Index: Applications
Gives the maximum of the degree sequence with a certain probability
Pareto case revisited:
● i.i.d. degrees: largest degree ≈ $K N^{1/\gamma}$, where $N$ is the number of nodes and $\gamma$ is the tail index of the Pareto distribution (N. Litvak, LNCS '12)
● Stationary degree samples with EI $\theta$: largest degree ≈ $K (N\theta)^{1/\gamma}$
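A sketch of where the factor $\theta$ comes from. It assumes a pure Pareto tail $1 - F(x) = C x^{-\gamma}$ and uses the standard extreme-value approximation for the maximum $M_N = \max(X_1, \ldots, X_N)$ of a stationary sequence with extremal index $\theta$. The constant $C$ and the probability level $e^{-\tau}$ are notation introduced here, not taken from the slide.

```latex
\begin{align*}
P(M_N \le u) &\approx F(u)^{N\theta}
  \approx \exp\{-N\theta\,(1 - F(u))\}
  = \exp\{-N\theta\, C\, u^{-\gamma}\}, \\
P(M_N \le u) = e^{-\tau}
  &\iff u = \left(\frac{C}{\tau}\right)^{1/\gamma} (N\theta)^{1/\gamma}.
\end{align*}
```

Under these assumptions the constant $K$ corresponds to $(C/\tau)^{1/\gamma}$, and dependence shrinks the effective sample size from $N$ to $N\theta$.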

Extremal Index: Applications
First passage time: the lower the value of the EI, the more time it takes to hit extreme levels, e.g. in the Pareto case
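The statement can be made concrete with a heuristic calculation, offered as a sketch rather than the slide's own derivation. Let $T_u = \min\{k \ge 1 : X_k > u\}$ be the first passage time over a high level $u$, and assume the approximation $P(M_k \le u) \approx F(u)^{k\theta}$ for a stationary sequence with extremal index $\theta$. The Pareto tail $1 - F(u) = C u^{-\gamma}$ in the last step is likewise an assumption.

```latex
\begin{align*}
P(T_u > k) &= P(M_k \le u) \approx F(u)^{k\theta}, \\
E[T_u] &= \sum_{k \ge 0} P(T_u > k)
  \approx \frac{1}{1 - F(u)^{\theta}}
  \approx \frac{1}{\theta\,(1 - F(u))}
  = \frac{u^{\gamma}}{\theta\, C}.
\end{align*}
```

Halving $\theta$ therefore roughly doubles the expected time to reach the level $u$.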

Extremal Index: Applications
Relation to mean cluster size:
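The standard result from extreme value theory, which is presumably the relation meant here, is that under suitable mixing conditions the reciprocal of the extremal index equals the limiting mean size of a cluster of exceedances. The cluster size distribution $\pi(j)$ is notation introduced for this sketch.

```latex
\[
\theta^{-1} = \sum_{j \ge 1} j\,\pi(j),
\qquad
\pi(j) = \lim_{n \to \infty} P(\text{a cluster of exceedances of } u_n \text{ has size } j).
\]
```

For example, $\theta = 0.5$ corresponds to exceedances arriving in clusters of mean size 2, while $\theta = 1$ corresponds to isolated exceedances, as for i.i.d. samples.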

Calculation of Extremal Index
Two mixing conditions on the samples:
● Cond-1: limits long-range dependence. Stationary Markov samples, or measurable functions of them, satisfy this condition.
● Cond-2:

Proposition
If the sampled sequence is stationary and satisfies the two mixing conditions, then the extremal index satisfies $0 \le \theta \le 1$ and
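One standard way to express the extremal index through the bivariate copula $C$ of two consecutive samples is sketched below. It assumes the characterization $\theta = \lim_{n} P(X_2 \le u_n \mid X_1 > u_n)$, which holds under a local mixing condition of Leadbetter-Nandagopalan type, and it may differ in detail from the statement on the slide. Here $F$ is the marginal distribution function and $x = F(u)$.

```latex
\begin{align*}
\theta &= \lim_{u \to \infty} P(X_2 \le u \mid X_1 > u)
  = \lim_{u \to \infty} \frac{P(X_2 \le u) - P(X_1 \le u,\, X_2 \le u)}{P(X_1 > u)} \\
 &= \lim_{x \uparrow 1} \frac{x - C(x, x)}{1 - x}
  = \left.\frac{d}{dx}\, C(x, x)\right|_{x = 1} - 1
  \qquad \text{(L'H\^opital's rule)}.
\end{align*}
```

Two sanity checks: the independence copula $C(x, x) = x^2$ gives $\theta = 2 - 1 = 1$, and the comonotone copula $C(x, x) = x$ gives $\theta = 1 - 1 = 0$.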

Degree Correlations
● Undirected and correlated
● is enough to construct graph
● Crawling via random walks on vertices
● Degree sequence is a hidden Markov chain
● What is the joint stationary distribution on the degree state space?
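The crawling step can be sketched as follows: a simple random walk moves to a uniformly chosen neighbour at each step and records the degree of every visited vertex. This is an illustrative sketch only; the adjacency-list representation, the function name, and the toy graph are assumptions, and the graph is assumed undirected and connected with no isolated vertices.

```python
import random


def random_walk_degrees(adjacency, start, num_steps, seed=0):
    """Return the degrees of the vertices visited by a simple random walk.

    `adjacency` maps each vertex to a list of its neighbours.
    """
    rng = random.Random(seed)
    node = start
    degrees = []
    for _ in range(num_steps):
        degrees.append(len(adjacency[node]))  # degree of the current vertex
        node = rng.choice(adjacency[node])    # move to a uniformly chosen neighbour
    return degrees


# Hypothetical toy graph: vertex 0 is a hub connected to 1, 2 and 3.
toy_graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
sample = random_walk_degrees(toy_graph, start=0, num_steps=10)
print(sample)
```

For the recorded sequence to be stationary, the start vertex should be drawn from the walk's stationary distribution, which for a simple random walk on an undirected graph is proportional to vertex degree; in practice an initial burn-in portion of the walk is often discarded instead.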

Meanfield Models
● Standard Random Walk
● PageRank
● Random Walk with Jumps (RWJ)

Check of Meanfield Model in Random Walks

Extremal Index for Bivariate Pareto Model

Estimation of Extremal Index
● Empirical copula based estimator: the EI is obtained from the slope at (1, 1), using linear least-squares fitting and numerical differentiation
● Intervals estimator: Based on
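As an illustration of the second approach, the sketch below implements the intervals estimator in the form given by Ferro and Segers (2003), which works with the times between successive exceedances of a threshold. Treat it as an assumption that this is the exact variant used in the talk; the function name and the toy sequences are made up for the example.

```python
def intervals_estimator(samples, threshold):
    """Intervals estimator of the extremal index (Ferro-Segers form).

    Uses the gaps T_i between successive exceedances of `threshold`.
    """
    times = [i for i, x in enumerate(samples) if x > threshold]
    if len(times) < 2:
        raise ValueError("need at least two exceedances of the threshold")
    gaps = [b - a for a, b in zip(times, times[1:])]
    m = len(gaps)  # number of inter-exceedance times, N - 1
    if max(gaps) <= 2:
        numerator = 2.0 * sum(gaps) ** 2
        denominator = m * sum(g * g for g in gaps)
    else:
        numerator = 2.0 * sum(g - 1 for g in gaps) ** 2
        denominator = m * sum((g - 1) * (g - 2) for g in gaps)
    return min(1.0, numerator / denominator)


# Toy sequences: isolated exceedances versus two clusters of three.
isolated = [1.0 if i % 10 == 0 else 0.0 for i in range(31)]
clustered = [1.0 if i in {0, 1, 2, 50, 51, 52} else 0.0 for i in range(60)]
print(intervals_estimator(isolated, 0.5), intervals_estimator(clustered, 0.5))
```

The isolated sequence yields an estimate of 1, while the clustered one yields a value well below 1, matching the interpretation of the EI as a measure of clustering. On real samples the result depends on the threshold, so it is usually examined over a range of high thresholds.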

Numerical Results: Synthetic Graphs
Graph                        EI (analysis)   EI (copula based estimator)   EI (intervals estimator)
Synthetic graph (5K nodes)   0.56            0.53                          0.58
[Figure panels: copula based estimator; intervals estimator]

Numerical Results: Real Graphs
Graph                                    EI (copula based estimator)   EI (intervals estimator)
DBLP (32K nodes, 1.1M edges)             0.29                          0.25
Enron Email (37K nodes, 368K edges)      0.61                          0.62

Conclusions
● Associated the extreme value theory of stationary sequences with the sampling of large graphs
● For any general stationary samples meeting two mixing conditions, knowledge of the bivariate distribution or bivariate copula is sufficient to derive many extremal properties
● The Extremal Index (EI) encapsulates this relation
● Applications of the EI to many relevant extremes:
  ● First hitting time
  ● Order statistics
  ● Mean cluster size
● Modeled correlation in the degrees of adjacent nodes and the random walk in degree state space
● Estimated the EI for a synthetic graph with degree correlations and found a good match with theory
● Estimated the EI for two real-world networks

Thank You!
