

  1. On Local Distributed Sampling and Counting. Yitong Yin, Nanjing University. Joint work with Weiming Feng (Nanjing University).

  2. Counting and Sampling. [Jerrum-Valiant-Vazirani ’86]: for self-reducible problems, approx. counting is tractable ⟺ (approx., exact) sampling is tractable.

  3. Computational Phase Transition. Sampling an almost-uniform independent set in graphs with maximum degree ∆:
  • [Weitz, STOC’06]: if ∆ ≤ 5, poly-time.
  • [Sly, best paper in FOCS’10]: if ∆ ≥ 6, no poly-time algorithm unless NP = RP.
  A phase transition occurs as ∆ goes from 5 to 6. Local computation?

  4. Local Computation. “What can be computed locally?” [Naor, Stockmeyer ’93]. The LOCAL model [Linial ’87]:
  • Communications are synchronized.
  • In each round, each node can: exchange unbounded messages with all neighbors; perform unbounded local computation; read/write unbounded local memory.
  • In t rounds, each node can collect information up to distance t.

  5. Example: Sampling an Independent Set. µ: the uniform distribution over independent sets in the network G(V, E); Y ∈ {0,1}^V indicates an independent set.
  • Each v ∈ V returns a Y_v ∈ {0,1} such that Y = (Y_v)_{v∈V} ∼ µ,
  • or: d_TV(Y, µ) < 1/poly(n).

  6. Inference (Local Counting). µ: the uniform distribution over independent sets in the network G(V, E). µ_v^σ: the marginal distribution at v conditioned on σ ∈ {0,1}^S, i.e. ∀y ∈ {0,1}: µ_v^σ(y) = Pr_{Y∼µ}[Y_v = y | Y_S = σ].
  • Each v ∈ S receives σ_v as input.
  • Each v ∈ V returns a marginal distribution µ̂_v^σ such that d_TV(µ̂_v^σ, µ_v^σ) ≤ 1/poly(n).
  Counting reduces to inference: with Z the number of independent sets,
  1/Z = µ(∅) = ∏_{i=1}^n Pr_{Y∼µ}[Y_{v_i} = 0 | ∀j < i: Y_{v_j} = 0].
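To see the reduction in action, here is a minimal sketch in which brute-force conditional marginals stand in for the distributed inference oracle; the 4-cycle instance and all function names are illustrative, not from the talk:

```python
from itertools import product

def is_ind_set(adj, sigma):
    # sigma is a 0/1 assignment; check that no edge has both endpoints set to 1
    return not any(sigma[u] and sigma[w] for u in adj for w in adj[u] if u < w)

def pr_next_is_zero(adj, i):
    # Pr[Y_{v_i} = 0 | Y_{v_1} = ... = Y_{v_{i-1}} = 0] under the uniform
    # distribution on independent sets, by brute force; this stands in for
    # the local inference oracle evaluated within an O(log n)-ball
    n = len(adj)
    total = zero = 0
    for tail in product([0, 1], repeat=n - i):
        sigma = [0] * i + list(tail)
        if is_ind_set(adj, sigma):
            total += 1
            zero += (sigma[i] == 0)
    return zero / total

# 4-cycle: its 7 independent sets give mu(empty set) = 1/7
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
mu_empty = 1.0
for i in range(len(adj)):
    mu_empty *= pr_next_is_zero(adj, i)   # telescoping product of the slide
print(round(1 / mu_empty))               # Z = 7
```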

  7. Gibbs Distribution (with pairwise interactions). On a network G(V, E):
  • Each vertex corresponds to a variable with finite domain [q].
  • Each edge e = (u, v) ∈ E has a matrix (binary constraint) A_e: [q] × [q] → [0,1].
  • Each vertex v ∈ V has a vector (unary constraint) b_v: [q] → [0,1].
  • Gibbs distribution µ: ∀σ ∈ [q]^V, µ(σ) ∝ ∏_{e=(u,v)∈E} A_e(σ_u, σ_v) · ∏_{v∈V} b_v(σ_v).

  8. Gibbs Distribution (with pairwise interactions). Gibbs distribution µ: ∀σ ∈ [q]^V, µ(σ) ∝ ∏_{e=(u,v)∈E} A_e(σ_u, σ_v) · ∏_{v∈V} b_v(σ_v).
  • Independent set: A_e = [[1, 1], [1, 0]] and b_v = (1, 1).
  • Local conflict colorings [Fraigniaud, Heinrich, Kosowski, FOCS’16]: A_e: [q] × [q] → {0,1}, b_v: [q] → {0,1}; the general case allows A_e: [q] × [q] → [0,1], b_v: [q] → [0,1].
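As a sanity check on the pairwise factorization, the following sketch evaluates the unnormalized Gibbs weight and sums it over all configurations; the triangle instance and function names are illustrative:

```python
from itertools import product

def gibbs_weight(edges, A, b, sigma):
    # unnormalized weight: product over edges of A_e(sigma_u, sigma_v)
    # times product over vertices of b_v(sigma_v)
    w = 1.0
    for (u, v) in edges:
        w *= A[(u, v)][sigma[u]][sigma[v]]
    for v, bv in enumerate(b):
        w *= bv[sigma[v]]
    return w

# independent-set constraints from the slide: A_e = [[1,1],[1,0]], b_v = (1,1)
edges = [(0, 1), (1, 2), (0, 2)]              # a triangle
A = {e: [[1, 1], [1, 0]] for e in edges}
b = [[1, 1] for _ in range(3)]
Z = sum(gibbs_weight(edges, A, b, s) for s in product([0, 1], repeat=3))
print(Z)  # 4: the triangle's independent sets are {}, {0}, {1}, {2}
```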

  9. Gibbs Distribution. On a network G(V, E), Gibbs distribution µ: ∀σ ∈ [q]^V, µ(σ) ∝ ∏_{(f,S)∈F} f(σ_S), where each (f, S) ∈ F is a local constraint (factor): f: [q]^S → R_{≥0} with S ⊆ V and diam_G(S) = O(1).

  10. A Motivation: Distributed Machine Learning.
  • Data are stored in a distributed system.
  • We want distributed algorithms for: sampling from a joint distribution (specified by a probabilistic graphical model); inference according to a probabilistic graphical model.

  11. Computational Phase Transition. Sampling an almost-uniform independent set in graphs with maximum degree ∆:
  • [Weitz, STOC’06]: if ∆ ≤ 5, poly-time.
  • [Sly, FOCS’10]: if ∆ ≥ 6, no poly-time algorithm unless NP = RP.
  A phase transition occurs as ∆ goes from 5 to 6.

  12. Decay of Correlation. µ_v^σ: the marginal distribution at v conditioned on σ ∈ {0,1}^S. Strong spatial mixing (SSM): ∀ boundary condition B ∈ {0,1}^{r-sphere(v)}:
  d_TV(µ_v^σ, µ_v^{σ,B}) ≤ poly(n) · exp(−Ω(r)).
  (For µ the uniform distribution over independent sets, SSM holds iff ∆ ≤ 5.) SSM ⟹ approx. inference is solvable in O(log n) rounds in the LOCAL model.

  13. Locality of Counting & Sampling. For Gibbs distributions (defined by local factors):
  • Correlation decay (SSM) ⟹ local approx. inference with additive error.
  • Local approx. inference with additive error ⟹ local approx. sampling (the easy direction, at an O(log² n) factor).
  • Local approx. inference with multiplicative error ⟹ local exact sampling.

  14. Locality of Sampling. Local approx. inference: each v can compute µ̂_v^σ within an O(log n)-ball s.t. d_TV(µ̂_v^σ, µ_v^σ) ≤ 1/poly(n). Local approx. sampling: return a random Y = (Y_v)_{v∈V} whose distribution µ̂ ≈ µ, s.t. d_TV(µ̂, µ) ≤ 1/poly(n). Sequential O(log n)-local procedure:
  • scan vertices of V in an arbitrary order v_1, v_2, …, v_n;
  • for i = 1, 2, …, n: sample Y_{v_i} according to µ̂_{v_i}^{Y_{v_1},…,Y_{v_{i−1}}}.
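A minimal sketch of this sequential procedure, again with a brute-force conditional marginal standing in for the O(log n)-ball inference oracle (the graph and names are illustrative):

```python
import random
from itertools import product

def is_ind_set(adj, sigma):
    return not any(sigma[u] and sigma[w] for u in adj for w in adj[u] if u < w)

def marginal_one(adj, fixed, v):
    # Pr[Y_v = 1 | Y agrees with `fixed`] under the uniform distribution on
    # independent sets, by brute force -- a stand-in for mu-hat_v evaluated
    # within an O(log n)-ball
    free = [u for u in adj if u not in fixed and u != v]
    total = ones = 0
    for yv in (0, 1):
        for bits in product([0, 1], repeat=len(free)):
            sigma = {**fixed, v: yv, **dict(zip(free, bits))}
            if is_ind_set(adj, sigma):
                total += 1
                ones += yv
    return ones / total

def sequential_sampler(adj, order):
    # scan vertices in order; sample each Y_{v_i} from its conditional
    # marginal given the already-sampled prefix
    Y = {}
    for v in order:
        Y[v] = 1 if random.random() < marginal_one(adj, Y, v) else 0
    return Y

adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}  # a 4-cycle
print(sequential_sampler(adj, [0, 1, 2, 3]))  # exact here: marginals are exact
```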

  15. Network Decomposition. A (C, D)-network-decomposition of G:
  • classifies vertices into clusters;
  • assigns each cluster a color in [C];
  • each cluster has diameter ≤ D;
  • clusters are properly colored.
  A (C, D)^r-ND is a (C, D)-ND of the power graph G^r. Given a (C, D)^r-ND, the sequential r-local procedure with r = O(log n) (scan vertices of V in an arbitrary order v_1, v_2, …, v_n; for i = 1, 2, …, n sample Y_{v_i} according to µ̂_{v_i}^{Y_{v_1},…,Y_{v_{i−1}}}) can be simulated in O(CDr) rounds in the LOCAL model, as sketched below.
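The simulation claim can be phrased as a scheduling rule. The following sketch only captures the order of execution and the round accounting; the decomposition and the per-vertex sampling step (`sample_vertex`) are assumed given:

```python
def simulate_by_colors(clusters, C, D, r, sample_vertex):
    # clusters: list of (color, vertex_list) pairs from a (C, D)^r-ND.
    # Same-color clusters are non-adjacent in G^r, i.e. more than r apart
    # in G, so their r-local sampling steps cannot depend on each other
    # and may run concurrently.
    rounds = 0
    for color in range(C):
        for (c, vs) in clusters:
            if c != color:
                continue
            # conceptually concurrent across clusters of this color: a
            # cluster leader collects the r-neighborhood of its cluster
            # (diameter <= D in G^r, hence <= Dr in G), samples its
            # vertices sequentially, and writes the results back
            for v in vs:
                sample_vertex(v)  # draws Y_v from mu-hat given sampled nbrs
        rounds += D * r           # O(Dr) LOCAL rounds per color class
    return rounds                 # O(C * D * r) rounds total
```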

  16. Network Decomposition. A (C, D)-network-decomposition of G: classifies vertices into clusters; assigns each cluster a color in [C]; each cluster has diameter ≤ D; clusters are properly colored. A (C, D)^r-ND is a (C, D)-ND of G^r. An (O(log n), O(log n))^r-ND can be constructed in O(r log² n) rounds w.h.p. [Linial, Saks ’93]. [Ghaffari, Kuhn, Maus ’17]: any r-local SLOCAL algorithm that, for each ordering π = (v_1, v_2, …, v_n), returns a random vector Y(π) can be simulated by an O(r log² n)-round LOCAL algorithm that w.h.p. returns Y(π) for some ordering π.

  17. Locality of Sampling. For Gibbs distributions:
  • SSM ⟹ O(log n)-round local approx. inference with additive error ⟹ O(log³ n)-round local approx. sampling.
  • Local approx. inference with multiplicative error ⟹ local exact sampling.

  18. Local Exact Sampler. In the LOCAL model:
  • Each v ∈ V returns, within a fixed t(n) rounds: a local output Y_v ∈ {0,1} and a local failure flag F_v ∈ {0,1}.
  • Succeeds w.h.p.: ∑_{v∈V} E[F_v] = O(1/n).
  • Correctness: conditioned on success, Y ∼ µ.

  19. Jerrum-Valiant-Vazirani Sampler. [Jerrum-Valiant-Vazirani ’86]: ∃ an efficient algorithm that samples from µ̂ and evaluates µ̂(σ) given any σ ∈ {0,1}^V, with multiplicative error: ∀σ ∈ {0,1}^V, e^{−1/n²} ≤ µ̂(σ)/µ(σ) ≤ e^{1/n²}. Self-reduction:
  µ(σ) = ∏_{i=1}^n µ_{v_i}^{σ_1,…,σ_{i−1}}(σ_i) = ∏_{i=1}^n Z(σ_1,…,σ_i)/Z(σ_1,…,σ_{i−1}).
  Let µ̂_{v_i}^{σ_1,…,σ_{i−1}}(σ_i) = Ẑ(σ_1,…,σ_i)/Ẑ(σ_1,…,σ_{i−1}) ≈ e^{±1/n³} · µ_{v_i}^{σ_1,…,σ_{i−1}}(σ_i), where e^{−1/(2n³)} ≤ Ẑ(⋯)/Z(⋯) ≤ e^{1/(2n³)} by approx. counting.
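A minimal sketch of the self-reduction sampler, in which exact brute-force counts stand in for the approximate counter Ẑ (so the e^{±1/(2n³)} error factors vanish); the 4-cycle instance and names are illustrative:

```python
import random
from itertools import product

def count_extensions(adj, prefix):
    # Z(sigma_1, ..., sigma_i): the number of independent sets consistent
    # with the assigned prefix; exact brute force stands in for the
    # approximate counter Z-hat of the slide
    n, i = len(adj), len(prefix)
    def ok(s):
        return not any(s[u] and s[w] for u in adj for w in adj[u] if u < w)
    return sum(ok(list(prefix) + list(t)) for t in product([0, 1], repeat=n - i))

def jvv_sample(adj):
    # self-reduction: extend the prefix one vertex at a time, drawing
    # sigma_i with probability Z(prefix + sigma_i) / Z(prefix)
    sigma = []
    for _ in range(len(adj)):
        z0 = count_extensions(adj, sigma + [0])
        z1 = count_extensions(adj, sigma + [1])
        sigma.append(1 if random.random() < z1 / (z0 + z1) else 0)
    return sigma

adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}  # 4-cycle, Z = 7
print(jvv_sample(adj))  # a uniformly random independent set
```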

  20. Jerrum-Valiant-Vazirani Sampler. Given the sampler/evaluator for µ̂ with multiplicative error ∀σ ∈ {0,1}^V: e^{−1/n²} ≤ µ̂(σ)/µ(σ) ≤ e^{1/n²}, sample a random Y ∼ µ̂, pick Y_0 = ∅, and accept Y with probability
  q = (µ(Y) · µ̂(Y_0)) / (µ̂(Y) · µ(Y_0)) · e^{−3/n²} ∈ [e^{−5/n²}, 1],
  failing otherwise. Here q is computable because ∀σ ∈ {0,1}^V: µ(σ)/µ(∅) = 1 if σ is an independent set and 0 otherwise. Then
  Pr[Y = σ ∧ accept] = µ̂(σ) · (µ(σ) · µ̂(∅)) / (µ̂(σ) · µ(∅)) · e^{−3/n²} ∝ µ(σ).
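This accept/reject filter can be checked on a toy instance. In the sketch below, µ̂ is an explicitly tabulated distribution within the e^{±1/n²} bound; the two-vertex path and all names are illustrative:

```python
import math
import random

def jvv_filter(mu_hat, is_ind, n):
    # draw Y ~ mu-hat, then accept with probability
    #   1{Y is an ind. set} * (mu_hat(empty) / mu_hat(Y)) * e^{-3/n^2},
    # using mu(Y)/mu(empty) = 1{Y is an ind. set}; under the multiplicative
    # error bound this probability lies in [e^{-5/n^2}, 1], and conditioned
    # on acceptance Y is distributed exactly as mu
    configs, probs = zip(*mu_hat.items())
    empty = (0,) * n
    while True:  # retry on failure; the LOCAL version reports flags instead
        Y = random.choices(configs, weights=probs)[0]
        q = mu_hat[empty] / mu_hat[Y] * math.exp(-3 / n**2) if is_ind(Y) else 0.0
        if random.random() < q:
            return Y

# path on 2 vertices: independent sets {}, {0}, {1}, so mu is 1/3 on each
n = 2
slack = math.exp(1 / (2 * n**2))  # stays within the allowed e^{1/n^2} error
raw = {(0, 0): slack / 3, (0, 1): 1 / (3 * slack), (1, 0): 1 / 3, (1, 1): 0.0}
mu_hat = {k: v / sum(raw.values()) for k, v in raw.items()}
print(jvv_filter(mu_hat, lambda Y: not (Y[0] and Y[1]), n))
```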

  21. Boosting Local Inference. SSM gives local approx. inference with additive error: d_TV(µ̂_v^σ, µ_v^σ) ≤ 1/poly(n). Via local self-reduction, SSM also gives multiplicative error: each v computes µ̂_v^σ(0), µ̂_v^σ(1) within an r-ball such that µ̂_v^σ(y)/µ_v^σ(y) ∈ [e^{−1/poly(n)}, e^{1/poly(n)}]. Both are achievable with r = O(log n). Boosted sequential r-local sampler (r = O(log n)):
  • scan vertices of V in an arbitrary order v_1, v_2, …, v_n;
  • for i = 1, 2, …, n: sample Y_{v_i} according to µ̂_{v_i}^{Y_{v_1},…,Y_{v_{i−1}}}.
  Its output distribution µ̂ satisfies the multiplicative error bound: ∀σ ∈ {0,1}^V, e^{−1/n²} ≤ µ̂(σ)/µ(σ) ≤ e^{1/n²}.

  22. SLOCAL JVV. Scan vertices of V in an arbitrary order v_1, v_2, …, v_n.
  Pass 1: sample Y ∈ {0,1}^V by the boosted sequential r-local sampler with r = O(log n), whose output distribution µ̂ satisfies ∀σ ∈ [q]^V: e^{−1/n²} ≤ µ̂(σ)/µ(σ) ≤ e^{1/n²}.
  Pass 1′: construct a sequence of independent sets ∅ = Y_0, Y_1, …, Y_n = Y s.t. ∀ 0 ≤ i ≤ n: Y_i agrees with Y over v_1, …, v_i, and Y_i and Y_{i−1} differ only at v_i. Each v_i independently samples F_{v_i} ∈ {0,1} with Pr[F_{v_i} = 0] = q_{v_i}, where q_{v_i} = (µ̂(Y_{i−1})/µ̂(Y_i)) · e^{−3/n²} ∈ [e^{−5/n²}, 1].
  Each v ∈ V returns (both O(log n)-local to compute): Y_v ∈ {0,1} to indicate the independent set, and F_v ∈ {0,1} to indicate failure at v. A sketch of this pass follows below.
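A minimal sketch of pass 1′, assuming an evaluator `mu_hat` for the boosted sampler's distribution is given (on the slide, each factor is O(log n)-local to compute); names are illustrative:

```python
import math
import random

def slocal_jvv_flags(Y, order, mu_hat, n):
    # Interpolate from the empty set to Y one vertex at a time and give
    # each v_i an independent failure flag with
    #   Pr[F_{v_i} = 0] = (mu_hat(Y_{i-1}) / mu_hat(Y_i)) * e^{-3/n^2},
    # which lies in [e^{-5/n^2}, 1] under the multiplicative error bound.
    # The q's telescope to (mu_hat(empty)/mu_hat(Y)) * e^{-3/n}, so
    # conditioned on all flags being 0 the output Y is distributed as mu.
    prev = {v: 0 for v in order}        # Y_0 = the empty independent set
    F = {}
    for v in order:
        cur = dict(prev)
        cur[v] = Y[v]                   # Y_i agrees with Y on v_1, ..., v_i
        q = mu_hat(prev) / mu_hat(cur) * math.exp(-3 / n**2)
        F[v] = 0 if random.random() < q else 1
        prev = cur
    return F                            # success iff every flag is 0
```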
