Graphon Estimation: Minimax Rates and Posterior Contraction
Chao Gao, Yale University
Leiden, March 2015
Stochastic Block Model
$z : \{1, 2, \dots, n\} \to \{1, 2, \dots, k\}$
$A_{ij} \sim \mathrm{Bernoulli}(\theta_{ij})$, where $\theta_{ij} = Q_{z(i)z(j)}$.
Goal: recover $\theta_{ij}$.
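As a quick illustration, here is a minimal sketch of sampling from this model; the specific values of $n$, $k$, and $Q$ are arbitrary choices, not from the talk.

```python
import numpy as np

# Minimal SBM sampler: latent labels z, connectivity matrix Q,
# theta_ij = Q_{z(i) z(j)}, A_ij ~ Bernoulli(theta_ij).
rng = np.random.default_rng(0)
n, k = 100, 3
Q = rng.uniform(0, 1, size=(k, k))
Q = (Q + Q.T) / 2                      # symmetric connectivity matrix
z = rng.integers(0, k, size=n)         # latent label z(i) for each node
theta = Q[z[:, None], z[None, :]]      # theta_ij = Q_{z(i) z(j)}
A = rng.binomial(1, theta)             # A_ij ~ Bernoulli(theta_ij)
# For an undirected graph one would additionally symmetrize A;
# the slide's model specifies theta_ij for all (i, j), so we keep it simple.
```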
Biclustering (Hartigan, 1972)
$z_1 : \{1, 2, \dots, n\} \to \{1, 2, \dots, k\}$, $z_2 : \{1, 2, \dots, m\} \to \{1, 2, \dots, l\}$
$E(A_{ij}) = \theta_{ij} = Q_{z_1(i) z_2(j)}$.
Goal: recover $\theta_{ij}$.
Nonparametric Regression
$y_i = f(x_i) + \epsilon_i$, with $x_i \in D$ and $\epsilon_i \sim N(0, 1)$.
Common assumption: $f$ is smooth on $D$.
Goal: recover $f$ from both $x$ and $y$.
A More Challenging Problem
$y_i = f(x_i) + \epsilon_i$, with $x_i \in D$ and $\epsilon_i \sim N(0, 1)$.
Common assumption: $f$ is smooth on $D$.
Goal: recover $f$ from only $y$.
• 1D Problem
• 2D Problem
• Minimax Rate for Stochastic Block Model
• Minimax Rate for Graphon Estimation
• Adaptive Bayes Estimation
1D Problem
$x_i = i/n$, $y_i = f(x_i) + \epsilon_i$, $i = 1, 2, \dots, n$.
$\mathcal{F} = \big\{ f : f(x) = q_1 \text{ for } x \in (0, 1/2],\ f(x) = q_2 \text{ for } x \in (1/2, 1] \big\}$
$\inf_{\hat f} \sup_{f \in \mathcal{F}} E\Big[\frac{1}{n}\sum_{i=1}^{n} (\hat f(x_i) - f(x_i))^2\Big] \asymp \frac{1}{n}.$
1D Problem
Without observing $x$, the problem is equivalent to $y_i = \theta_i + \epsilon_i$ with
$\Theta = \{\theta : \text{half of the } \theta_i \text{ equal } q_1, \text{ the other half equal } q_2\}$.
$\inf_{\hat\theta} \sup_{\theta \in \Theta} E\Big[\frac{1}{n}\sum_{i=1}^{n} (\hat\theta_i - \theta_i)^2\Big] \asymp \frac{1}{n}.$
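A hedged sketch of this no-design problem: since half of the $\theta_i$ take each value, one can sort $y$ and average each half. This simple plug-in is illustrative only; the minimax rate $1/n$ is attained by exact least squares over $\Theta$, and with noise variance $1$ the sorting step misclassifies points, so the empirical risk of this sketch can be larger. The values of $n$, $q_1$, $q_2$ below are arbitrary.

```python
import numpy as np

# Simulate y_i = theta_i + eps_i where half of the theta_i equal q1
# and half equal q2 (q1, q2, n are illustrative choices).
rng = np.random.default_rng(1)
n, q1, q2 = 1000, 0.2, 0.8
theta = np.repeat([q1, q2], n // 2)
y = theta + rng.normal(size=n)

# Plug-in estimator: sort y, average the lower and upper halves.
# A heuristic stand-in for least squares over Theta, not rate-optimal
# when the gap |q2 - q1| is small relative to the noise.
order = np.argsort(y)
theta_hat = np.empty(n)
theta_hat[order[: n // 2]] = y[order[: n // 2]].mean()
theta_hat[order[n // 2 :]] = y[order[n // 2 :]].mean()
print(np.mean((theta_hat - theta) ** 2))
```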
2D Problem
$\xi_i = i/n$, $y_{ij} = f(\xi_i, \xi_j) + \epsilon_{ij}$, $i, j = 1, 2, \dots, n$.
$\mathcal{F}$ collects $f$ of the form
$f(x, y) = \begin{cases} q_1 & (x, y) \in [0, 1/2) \times [0, 1/2), \\ q_2 & (x, y) \in [0, 1/2) \times [1/2, 1], \\ q_3 & (x, y) \in [1/2, 1] \times [0, 1/2), \\ q_4 & (x, y) \in [1/2, 1] \times [1/2, 1]. \end{cases}$
2D Problem
With the design known:
$\inf_{\hat f} \sup_{f \in \mathcal{F}} E\Big[\frac{1}{n^2}\sum_{1 \le i,j \le n} (\hat f(\xi_i, \xi_j) - f(\xi_i, \xi_j))^2\Big] \asymp \frac{1}{n^2}.$
How about without knowing the design?
$\inf_{\hat f} \sup_{f \in \mathcal{F}} E\Big[\frac{1}{n^2}\sum_{1 \le i,j \le n} (\hat f(\xi_i, \xi_j) - f(\xi_i, \xi_j))^2\Big] \asymp \frac{1}{n}.$
2D Problem
Let $\theta_{ij} = f(\xi_i, \xi_j)$. Does $\theta_{ij}$ have any structure?
For each $i$, the entries $\{\theta_{i1}, \theta_{i2}, \dots, \theta_{in}\}$ come from the same row: they all involve the same $\xi_i$.
For each $j$, the entries $\{\theta_{1j}, \theta_{2j}, \dots, \theta_{nj}\}$ come from the same column: they all involve the same $\xi_j$.
This row/column structure survives even when the design is unknown.
2D Problem
Now give each entry its own design point: $\xi_{ij} \in [0, 1]^2$, $y_{ij} = f(\xi_{ij}) + \epsilon_{ij}$, $i, j = 1, 2, \dots, n$.
Without knowing the design:
$\inf_{\hat f} \sup_{f \in \mathcal{F}} E\Big[\frac{1}{n^2}\sum_{1 \le i,j \le n} (\hat f(\xi_{ij}) - f(\xi_{ij}))^2\Big] \asymp 1.$
Stochastic Block Model
$A_{ij} \sim \mathrm{Bernoulli}(\theta_{ij})$
$\Theta_2 = \big\{\theta : \theta_{ij} = Q_{z(i)z(j)}, \text{ with } z : [n] \to [2]\big\}$
$\inf_{\hat\theta} \sup_{\theta \in \Theta_2} E\Big[\frac{1}{n^2}\sum_{1 \le i,j \le n} (\hat\theta_{ij} - \theta_{ij})^2\Big] \asymp \frac{1}{n}.$
Stochastic Block Model
$A_{ij} \sim \mathrm{Bernoulli}(\theta_{ij})$
$\Theta_k = \big\{\theta : \theta_{ij} = Q_{z(i)z(j)}, \text{ with } z : [n] \to [k]\big\}$
Theorem 1.1. Under the stochastic block model, for any $1 \le k \le n$,
$\inf_{\hat\theta} \sup_{\theta \in \Theta_k} E\Big[\frac{1}{n^2}\sum_{i,j \in [n]} (\hat\theta_{ij} - \theta_{ij})^2\Big] \asymp \frac{k^2}{n^2} + \frac{\log k}{n}.$
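A parameter-counting heuristic, an informal reading rather than the proof, explains the two terms:
\[
\frac{k^2}{n^2} \;\leftrightarrow\; \text{the } k^2 \text{ free entries of } Q \text{ estimated from } n^2 \text{ observations}, \qquad
\frac{\log k}{n} \;\leftrightarrow\; \text{the label map } z, \text{ about } n \log k \text{ bits, again over } n^2 \text{ observations}.
\]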
Stochastic Block Model
Let $k \asymp n^{\delta}$ for $\delta \in [0, 1]$. Then
$\frac{k^2}{n^2} + \frac{\log k}{n} \asymp \begin{cases} n^{-2} & \delta = 0,\ k = 1, \\ n^{-1} & \delta = 0,\ k > 1, \\ n^{-1}\log n & \delta \in (0, 1/2], \\ n^{-2(1-\delta)} & \delta \in (1/2, 1]. \end{cases}$
Graphon Estimation
Theorem (Aldous–Hoover). A random array $\{A_{ij}\}$ is jointly exchangeable, in the sense that $\{A_{ij}\} \stackrel{d}{=} \{A_{\sigma(i)\sigma(j)}\}$ for every permutation $\sigma$, if and only if it can be represented as follows: there is a random function $F : [0,1]^3 \to \mathbb{R}$ such that
$A_{ij} \stackrel{d}{=} F(\xi_i, \xi_j, \xi_{ij}),$
where $\{\xi_i\}$ and $\{\xi_{ij}\}$ are i.i.d. $\mathrm{Unif}[0,1]$.
Graphon Estimation
When the graph is undirected and has no self-loops,
$A_{ij} \mid \xi_i, \xi_j \sim \mathrm{Bernoulli}(\theta_{ij}), \quad \theta_{ij} = f(\xi_i, \xi_j), \quad \xi_i \sim \mathrm{Unif}(0,1) \text{ i.i.d.}$
Goal: recover $f$.
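A minimal sketch of sampling from this model, with an arbitrary smooth graphon $f(x, y) = xy$ standing in for the unknown truth:

```python
import numpy as np

# Sample an undirected graph with no self-loops from a graphon f.
rng = np.random.default_rng(2)
n = 200
f = lambda x, y: x * y                         # illustrative graphon
xi = rng.uniform(0, 1, size=n)                 # latent positions xi_i
theta = f(xi[:, None], xi[None, :])            # theta_ij = f(xi_i, xi_j)
upper = np.triu(rng.binomial(1, theta), k=1)   # independent upper-triangle edges
A = upper + upper.T                            # symmetric, zero diagonal
```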
Graphon Estimation
$A_{ij} \mid \xi_i, \xi_j \sim \mathrm{Bernoulli}(\theta_{ij})$, $\theta_{ij} = f(\xi_i, \xi_j)$, $(\xi_1, \dots, \xi_n) \sim P_\xi$.
Assumption: $f \in \mathcal{F}_\alpha(M)$.
Theorem 1.2. Consider the Hölder class $\mathcal{F}_\alpha(M)$, defined in Section 2.3. We have
$\inf_{\hat\theta} \sup_{\xi \sim P_\xi} \sup_{f \in \mathcal{F}_\alpha(M)} E\Big[\frac{1}{n^2}\sum_{i,j \in [n]} (\hat\theta_{ij} - \theta_{ij})^2\Big] \asymp \begin{cases} n^{-\frac{2\alpha}{\alpha+1}} & 0 < \alpha < 1, \\ \frac{\log n}{n} & \alpha \ge 1. \end{cases}$
The expectation is jointly over $\{A_{ij}\}$ and $\{\xi_i\}$.
Graphon Estimation
Proof:
$\min_{1 \le k \le n}\Big\{\frac{1}{k^{2\alpha}} + \frac{k^2}{n^2} + \frac{\log k}{n}\Big\}$
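A back-of-the-envelope minimization, a sketch of the tradeoff rather than the proof, recovers the rates in Theorem 1.2:
\[
\min_{1 \le k \le n}\Big\{\frac{1}{k^{2\alpha}} + \frac{k^2}{n^2} + \frac{\log k}{n}\Big\}
\asymp
\begin{cases}
n^{-\frac{2\alpha}{\alpha+1}}, & 0 < \alpha < 1, \text{ balancing } k^{-2\alpha} \asymp k^2/n^2 \text{ at } k \asymp n^{\frac{1}{\alpha+1}}, \\
\frac{\log n}{n}, & \alpha \ge 1, \text{ taking } k \asymp \sqrt{n} \text{ so that the first two terms are } \le n^{-1}.
\end{cases}
\]
For $0 < \alpha < 1$ the term $\log k / n$ is of lower order, since $2\alpha/(\alpha+1) < 1$.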
Lower Bound Proof
When $1 < k \le O(1)$, the minimax rate is $\frac{1}{n}$. It suffices to prove this for $k = 2$.
Lower Bound Proof
Proposition (Fano). Let $(\Theta, \rho)$ be a metric space and $\{P_\theta : \theta \in \Theta\}$ a collection of probability measures. For any $T \subset \Theta$, denote by $M(\epsilon, T, \rho)$ the $\epsilon$-packing number of $T$ w.r.t. $\rho$. Define the KL diameter of $T$ by
$d_{KL}(T) = \sup_{\theta, \theta' \in T} D(P_\theta \| P_{\theta'}).$
Then
$\inf_{\hat\theta} \sup_{\theta \in \Theta} E_\theta\, \rho^2\big(\hat\theta(X), \theta\big) \ge \sup_{\epsilon > 0} \frac{\epsilon^2}{4}\Big(1 - \frac{d_{KL}(T) + \log 2}{\log M(\epsilon, T, \rho)}\Big).$
Lower Bound Proof
• Construct a subset
• Upper bound the KL diameter
• Lower bound the packing number
Lower Bound Proof
$T = \Big\{ \{\theta_{ij}\} \in [0,1]^{n \times n} : \theta_{ij} = \tfrac{1}{2} \text{ for } (i,j) \in (S \times S) \cup (S^c \times S^c),\ \theta_{ij} = \tfrac{1}{2} + \tfrac{c}{\sqrt{n}} \text{ for } (i,j) \in (S \times S^c) \cup (S^c \times S), \text{ with some } S \in \mathcal{S} \Big\}.$
[Figure: $\theta$ in $2 \times 2$ block form, with entries $\tfrac{1}{2}$ on the diagonal blocks $S \times S$ and $S^c \times S^c$, and $\tfrac{1}{2} + \tfrac{c}{\sqrt{n}}$ on the off-diagonal blocks.]
Lower Bound Proof
An entry differs between $\theta$ and $\theta'$ exactly when one of $i, j$ lies in the symmetric difference of $S$ and $S'$, each such entry contributing $(c/\sqrt{n})^2$. Writing $|I_S - I_{S'}|$ for the Hamming distance between the indicator vectors,
$\rho^2(\theta, \theta') = \frac{1}{n^2}\sum_{1 \le i,j \le n} (\theta_{ij} - \theta'_{ij})^2 = \frac{2c^2}{n} \cdot \frac{|I_S - I_{S'}|}{n} \cdot \frac{n - |I_S - I_{S'}|}{n}.$
Lower Bound Proof
Construct a subset: $T \subset \Theta_k$.
Upper bound the KL diameter:
$\sup_{\theta, \theta' \in T} D(P_\theta \| P_{\theta'}) \le 8 \sup_{\theta, \theta' \in T} \|\theta - \theta'\|^2 \le 8c^2 n.$
Lower bound the packing number.
Lower Bound Proof
Lower bound the packing number: pick $S_1, \dots, S_N$ such that $\tfrac{1}{4}n \le |I_{S_a} - I_{S_b}| \le \tfrac{3}{4}n$ for all $a \ne b$, with $|I_S - I_{S'}|$ the Hamming distance; this is possible with $N \ge \exp(c_1 n)$. Then
$\rho^2(\theta, \theta') = \frac{2c^2\,|I_S - I_{S'}|\,(n - |I_S - I_{S'}|)}{n^3} \ge \frac{c^2}{8n} =: \epsilon^2,$
so $M(\epsilon, T, \rho) \ge N \ge \exp(c_1 n).$
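A numerical sanity check of this step (a sketch; $n$ and the number of random subsets are illustrative). Random subsets of $[n]$ are pairwise well separated in Hamming distance with high probability, which is the probabilistic-method idea behind $N \ge \exp(c_1 n)$:

```python
import numpy as np

# Pairwise Hamming distances between random indicator vectors I_S
# concentrate near n/2, hence inside [n/4, 3n/4] with high probability.
rng = np.random.default_rng(3)
n, N = 200, 300
S = rng.integers(0, 2, size=(N, n))             # indicator vectors I_S
ham = (S[:, None, :] != S[None, :, :]).sum(-1)  # pairwise Hamming distances
off = ham[np.triu_indices(N, k=1)]
print(off.min(), off.max())  # typically well inside [n/4, 3n/4] = [50, 150]
```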
Lower Bound Proof
Combining the three steps with Fano's inequality,
$\inf_{\hat\theta} \sup_{\theta \in \Theta_2} E\Big[\frac{1}{n^2}\sum_{1 \le i,j \le n} (\hat\theta_{ij} - \theta_{ij})^2\Big] \ge \frac{c^2}{32n}\Big(1 - \frac{8c^2 n + \log 2}{c_1 n}\Big) \gtrsim \frac{1}{n}.$
Upper Bound
Oracle solution: when the clustering $z$ is known, an obvious estimator is the block average
$\hat\theta_{ij} = \frac{1}{|z^{-1}(a)||z^{-1}(b)|} \sum_{(i',j') \in z^{-1}(a) \times z^{-1}(b)} A_{i'j'} \quad \text{for } (i,j) \in z^{-1}(a) \times z^{-1}(b),$
which achieves the rate
$\|\hat\theta - \theta\|_F^2 \le O_P(k^2).$
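A self-contained sketch of the oracle estimator; the setup values ($n$, $k$, $Q$) are arbitrary choices:

```python
import numpy as np

# Oracle: with the true labels z known, estimate each block of Q by the
# corresponding block average of A, then read off theta_hat.
rng = np.random.default_rng(0)
n, k = 120, 3
Q = rng.uniform(0, 1, size=(k, k))
Q = (Q + Q.T) / 2
z = rng.integers(0, k, size=n)
theta = Q[z[:, None], z[None, :]]
A = rng.binomial(1, theta)

Q_hat = np.zeros((k, k))
for a in range(k):
    for b in range(k):
        # assumes every label appears at least once
        Q_hat[a, b] = A[np.ix_(z == a, z == b)].mean()
theta_hat = Q_hat[z[:, None], z[None, :]]
print(np.sum((theta_hat - theta) ** 2), "vs k^2 =", k * k)
```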
Upper Bound
An equivalent form (least squares): fixing the known $z$, solve
$\min_\theta \|A - \theta\|_F^2 \quad \text{s.t. } \theta_{ij} = Q_{z(i)z(j)} \text{ for some } Q = Q^T \in [0,1]^{k \times k}.$
A natural estimator: solve
$\min_\theta \|A - \theta\|_F^2 \quad \text{s.t. } \theta_{ij} = Q_{z(i)z(j)} \text{ for some } Q = Q^T \in [0,1]^{k \times k} \text{ and some } z : \{1, 2, \dots, n\} \to \{1, 2, \dots, k\}.$
This achieves
$\|\hat\theta - \theta\|_F^2 \le O_P(k^2 + n \log k).$
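The joint least squares over $(Q, z)$ is combinatorial. A common heuristic, not the talk's proposal, is Lloyd-style alternating minimization: given $z$, the optimal $Q$ is the block average; given $Q$, reassign each node to its best-fitting label. A sketch, assuming every label stays non-empty:

```python
import numpy as np

def alternating_ls(A, k, n_iter=20, seed=0):
    """Heuristic alternating least squares for the SBM fit (a sketch)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    z = rng.integers(0, k, size=n)
    for _ in range(n_iter):
        # Q-step: given z, the optimal Q is the block average
        # (assumes every label is non-empty).
        Q = np.array([[A[np.ix_(z == a, z == b)].mean() for b in range(k)]
                      for a in range(k)])
        # z-step: given Q, reassign each node to the label whose row
        # profile Q[a, z] best fits that node's row of A.
        for i in range(n):
            costs = [np.sum((A[i] - Q[a, z]) ** 2) for a in range(k)]
            z[i] = int(np.argmin(costs))
    return Q[z[:, None], z[None, :]], z
```

This converges only to a local optimum, so the rate guarantee above applies to the exact minimizer, not to this heuristic.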
Bayes Estimation
Prior construction:
1. Sample $k \sim \pi$ with $\pi(k) \propto \exp\big(-D(k^2 + n\log k)\big)$.
2. Sample $z$ uniformly from $\{z : [n] \to [k]\}$.
3. Sample $Q \sim f$ with density $f(Q) = \frac{\Gamma(k^2/2)}{2\,\Gamma(k^2)}\Big(\frac{\lambda_k}{\sqrt{\pi}}\Big)^{k^2} e^{-\lambda_k \|Q\|}$.
4. Let $\theta_{ij} = Q_{z(i)z(j)}$.
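A sketch of drawing one $\theta$ from this prior. The radius/direction decomposition is exact for the spherically symmetric density $\propto e^{-\lambda\|Q\|}$ on $\mathbb{R}^{k^2}$; symmetrizing and clipping $Q$ to $[0,1]$ at the end is a simplification of mine, not the talk's construction, and $\lambda_k = \beta n / k$ uses the (reconstructed) choice from Theorem 1.3 below. All numeric values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, D, beta = 100, 1.0, 1.0

# 1. k ~ pi(k) proportional to exp(-D (k^2 + n log k));
#    this prior puts most of its mass on small k.
ks = np.arange(1, n + 1)
logw = -D * (ks**2 + n * np.log(ks))
w = np.exp(logw - logw.max())
w /= w.sum()
k = int(rng.choice(ks, p=w))

# 2. z uniform over maps [n] -> [k]
z = rng.integers(0, k, size=n)

# 3. Q from the e^{-lambda ||Q||} shape: uniform direction on the sphere
#    in k^2 dimensions times a Gamma(k^2, 1/lambda) radius.
lam = beta * n / k                       # reconstructed lambda_k (assumption)
g = rng.normal(size=(k, k))
g /= np.linalg.norm(g)
r = rng.gamma(shape=k * k, scale=1 / lam)
Q = np.clip((g + g.T) / 2 * r, 0, 1)     # simplification: symmetrize, clip

# 4. theta_ij = Q_{z(i) z(j)}
theta = Q[z[:, None], z[None, :]]
```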
Bayes Estimation
Theorem 1.3. Consider $\lambda_k = \beta n / k$ for some constant $\beta > 0$. Then
$E_{\theta^*}\, \Pi\Big( \frac{1}{n^2}\sum_{i,j} (\theta_{ij} - \theta^*_{ij})^2 > M\Big(\frac{k^2}{n^2} + \frac{\log k}{n}\Big) \,\Big|\, A \Big) \le \exp\big(-C'(k^2 + n\log k)\big),$
for some constants $M, C' > 0$.
Reference Gao, Chao, Yu Lu, and Harrison H. Zhou. "Rate-optimal Graphon Estimation." arXiv preprint arXiv:1410.5837 (2014).
Thank you