What can be sampled locally?
Yitong Yin (Nanjing University)
Joint work with: Weiming Feng, Yuxin Sun
Local Computation
“What can be computed locally?” [Naor, Stockmeyer, STOC’93, SICOMP’95]
the LOCAL model [Linial ’87]:
• Communications are synchronized.
• In each round: each node can send messages of unbounded size to all its neighbors.
• Local computations are free.
• Complexity: # of rounds to terminate in the worst case.
• In t rounds: each node can collect information up to distance t.
Local Computation
the LOCAL model [Linial ’87]:
• In t rounds: each node can collect information up to distance t.
Locally Checkable Labeling (LCL) problems [Naor, Stockmeyer ’93]: CSPs with local constraints on a network G(V, E).
• Construct a feasible solution: vertex/edge coloring, Lovász local lemma.
• Find a local optimum: MIS, MM.
• Approximate a global optimum: maximum matching, minimum vertex cover, minimum dominating set.
Q: “What locally definable problems (defined by local constraints) are locally computable (in O(1) or a small number of rounds)?”
“What can be sampled locally?”
• CSP with local constraints on the network G(V, E): e.g. proper q-coloring, independent set.
• Sample a uniform random feasible solution by distributed algorithms (in the LOCAL model).
Q: “What locally definable joint distributions are locally sample-able?”
Markov Random Fields (MRF)
On a network G(V, E):
• Each vertex v ∈ V corresponds to a variable X_v with finite domain [q].
• Each edge e = (u, v) ∈ E imposes a weighted binary constraint A_e : [q]² → ℝ_{≥0}.
• Each vertex v ∈ V imposes a weighted unary constraint b_v : [q] → ℝ_{≥0}.
• Gibbs distribution µ: the random X ∈ [q]^V follows µ, where ∀σ ∈ [q]^V:
    µ(σ) ∝ ∏_{e=(u,v)∈E} A_e(σ_u, σ_v) · ∏_{v∈V} b_v(σ_v)
Markov Random Fields (MRF)
• Gibbs distribution µ on G(V, E): ∀σ ∈ [q]^V,
    µ(σ) ∝ ∏_{e=(u,v)∈E} A_e(σ_u, σ_v) · ∏_{v∈V} b_v(σ_v)
• proper q-coloring: A_e is the q×q matrix with 0 on the diagonal and 1 elsewhere; b_v is the all-ones vector.
• independent set: A_e = [[1,1],[1,0]], b_v = [1,1] (domain {0,1}).
• local conflict colorings [Fraigniaud, Heinrich, Kosowski FOCS’16]: A_e ∈ {0,1}^{q×q}, b_v ∈ {0,1}^q.
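To make the definition concrete, here is a small brute-force check (our own sketch, not from the talk) that the proper-q-coloring MRF above induces the uniform distribution over proper colorings; the graph, q, and function names are illustrative choices.

```python
# Brute-force Gibbs weights for the proper-q-coloring MRF on a triangle
# (illustrative sketch; the graph and names are our own choices).
from itertools import product

def gibbs_weight(sigma, edges, A, b):
    """Unnormalized weight: prod_e A_e(sigma_u, sigma_v) * prod_v b_v(sigma_v)."""
    w = 1.0
    for (u, v) in edges:
        w *= A[sigma[u]][sigma[v]]
    for x in sigma:
        w *= b[x]
    return w

q = 3
edges = [(0, 1), (1, 2), (0, 2)]                                # a triangle
A = [[0 if i == j else 1 for j in range(q)] for i in range(q)]  # 0 on the diagonal
b = [1] * q                                                     # all-ones unary weights

weights = {s: gibbs_weight(s, edges, A, b) for s in product(range(q), repeat=3)}
proper = [s for s, w in weights.items() if w > 0]
# every proper coloring has the same weight, so the Gibbs distribution is uniform
assert all(weights[s] == 1.0 for s in proper)
print(len(proper))  # the proper 3-colorings of a triangle
```

Exactly the configurations with positive weight are the proper colorings, and they all get equal weight, which is the sense in which the MRF "is" uniform sampling of feasible solutions.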
A Motivation: Distributed Machine Learning
• Data are stored in a distributed system.
• Sampling from a probabilistic graphical model (e.g. a Markov random field) by distributed algorithms.
Glauber Dynamics
On G(V, E): starting from an arbitrary X_0 ∈ [q]^V, transition X_t → X_{t+1}:
• pick a uniform random vertex v;
• resample X(v) according to the marginal distribution induced by µ at vertex v, conditioning on X_t(N(v)).
Marginal distribution (for the MRF µ(σ) ∝ ∏_{e=(u,v)∈E} A_e(σ_u, σ_v) ∏_{v∈V} b_v(σ_v)):
    Pr[X_v = x | X_{N(v)}] = b_v(x) ∏_{u∈N(v)} A_{(u,v)}(X_u, x) / ∑_{y∈[q]} b_v(y) ∏_{u∈N(v)} A_{(u,v)}(X_u, y)
Stationary distribution: µ.
Mixing time: τ_mix = max_{X_0} min{ t : d_TV(X_t, µ) ≤ 1/(2e) }
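A minimal sequential sketch of one Glauber update, assuming a single shared edge matrix A and per-vertex weights b (our own encoding; all names are hypothetical):

```python
import random

def glauber_step(X, neighbors, A, b, q):
    """Pick a uniform random vertex v and resample X[v] from the
    conditional marginal Pr[X_v = x | X_{N(v)}] described on the slide."""
    v = random.randrange(len(X))
    # unnormalized marginal: b_v(x) * prod_{u in N(v)} A(X_u, x)
    weights = [b[v][x] for x in range(q)]
    for x in range(q):
        for u in neighbors[v]:
            weights[x] *= A[X[u]][x]
    X[v] = random.choices(range(q), weights=weights)[0]
    return X

# sanity check: for proper 3-coloring of a triangle, the marginal puts zero
# weight on the neighbors' colors, so the update never breaks properness
random.seed(1)
q = 3
neighbors = [[1, 2], [0, 2], [0, 1]]
A = [[0 if i == j else 1 for j in range(q)] for i in range(q)]
b = [[1] * q for _ in range(3)]
X = [0, 1, 2]
for _ in range(100):
    X = glauber_step(X, neighbors, A, b, q)
assert all(X[v] != X[u] for v in range(3) for u in neighbors[v])
```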
Mixing of Glauber Dynamics
Influence matrix {ρ_{v,u}}_{v,u∈V}: ρ_{v,u} is the max discrepancy (in total variation distance) of the marginal distributions at v caused by any pair σ, τ of boundary conditions that differ only at u.
Dobrushin’s condition (contraction of the one-step optimal coupling in the worst case w.r.t. Hamming distance):
    ‖ρ‖_∞ = max_{v∈V} ∑_{u∈V} ρ_{v,u} ≤ 1 − ε
Theorem (Dobrushin ’70; Salas, Sokal ’97): Dobrushin’s condition implies τ_mix = O(n log n) for Glauber dynamics.
For q-coloring: q ≥ (2+ε)Δ implies Dobrushin’s condition, where Δ = max-degree.
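As a sanity check (our own, not from the talk), the influence matrix for colorings can be computed by brute force on a tiny graph. For a triangle (Δ = 2) with q = 7 > 2Δ, the conditional marginal at a vertex is uniform over the colors unused by its neighbors, and the worst-case row sum comes out below 1:

```python
# Brute-force Dobrushin influence for proper q-coloring on a triangle
# (our own illustrative check; variable names are hypothetical).
from itertools import product

def coloring_marginal(q, neighbor_colors):
    """Conditional marginal at a vertex: uniform over colors unused by its neighbors."""
    allowed = [x for x in range(q) if x not in neighbor_colors]
    return [1.0 / len(allowed) if x in allowed else 0.0 for x in range(q)]

q = 7
# triangle: vertex v has neighbors u and w; a boundary condition is (color of u, color of w)
rho_vu = 0.0  # influence of u on v: max TV distance over boundaries differing only at u
for w_col in range(q):
    for c1, c2 in product(range(q), repeat=2):
        m1 = coloring_marginal(q, {c1, w_col})
        m2 = coloring_marginal(q, {c2, w_col})
        tv = sum(abs(a - b) for a, b in zip(m1, m2)) / 2
        rho_vu = max(rho_vu, tv)
# by symmetry rho_{v,w} = rho_{v,u}, and non-neighbors have zero influence,
# so the row sum is 2 * rho_vu; Dobrushin's condition asks for < 1
assert 2 * rho_vu < 1
print(rho_vu)
```

The worst case is two boundaries forbidding different color pairs, giving TV distance 1/(q − Δ) = 1/5 here, consistent with the q > 2Δ threshold.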
Parallelization
Glauber dynamics on G(V, E): starting from an arbitrary X_0 ∈ [q]^V, transition X_t → X_{t+1}:
• pick a uniform random vertex v;
• resample X(v) according to the marginal distribution induced by µ at vertex v, conditioning on X_t(N(v)).
Parallelization:
• Chromatic scheduler [folklore] [Gonzalez et al., AISTATS’11]: vertices in the same color class are updated in parallel.
• “Hogwild!” [Niu, Recht, Ré, Wright, NIPS’11] [De Sa, Olukotun, Ré, ICML’16]: all vertices are updated in parallel, ignoring concurrency issues.
Warm-up: When Luby meets Glauber
On G(V, E): starting from an arbitrary X_0 ∈ [q]^V, at each step, for each vertex v ∈ V:
• (Luby step) independently sample a random number β_v ∈ [0,1];
• (Glauber step) if β_v is locally maximum among its neighborhood N(v): resample X(v) according to the marginal distribution induced by µ at vertex v, conditioning on X_t(N(v)).
• Luby step: independently sample a random independent set.
• Glauber step: for the independent-set vertices, update correctly according to the current marginal distributions.
• Stationary distribution: the Gibbs distribution µ.
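The round above can be sketched under the same illustrative MRF encoding as a plain Glauber update (one shared edge matrix A, per-vertex weights b; names are our own):

```python
import random
from math import prod

def luby_glauber_round(X, neighbors, A, b, q):
    """Luby step: each vertex draws beta_v; local maxima form an independent set.
    Glauber step: those vertices resample from their conditional marginals in parallel."""
    n = len(X)
    beta = [random.random() for _ in range(n)]
    winners = [v for v in range(n) if all(beta[v] > beta[u] for u in neighbors[v])]
    newX = X[:]  # every winner reads the *old* configuration X
    for v in winners:
        weights = [b[v][x] * prod(A[X[u]][x] for u in neighbors[v]) for x in range(q)]
        newX[v] = random.choices(range(q), weights=weights)[0]
    return newX

# winners are pairwise non-adjacent, so a proper coloring stays proper
random.seed(2)
q = 4
neighbors = [[1], [0, 2], [1, 3], [2]]   # a path on 4 vertices
A = [[0 if i == j else 1 for j in range(q)] for i in range(q)]
b = [[1] * q for _ in range(4)]
X = [0, 1, 0, 1]
for _ in range(50):
    X = luby_glauber_round(X, neighbors, A, b, q)
    assert all(X[v] != X[u] for v in range(4) for u in neighbors[v])
```

Note the parallel semantics: all winners condition on the old X, which is safe precisely because the winners form an independent set.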
Mixing of LubyGlauber
Influence matrix {ρ_{v,u}}_{v,u∈V}; Dobrushin’s condition:
    ‖ρ‖_∞ = max_{v∈V} ∑_{u∈V} ρ_{v,u} ≤ 1 − ε
Theorem (Dobrushin ’70; Salas, Sokal ’97): Dobrushin’s condition implies τ_mix = O(n log n) for Glauber dynamics.
Theorem: Dobrushin’s condition implies τ_mix = O(Δ log n) for the LubyGlauber chain.
Influence matrix {ρ_{v,u}}_{v,u∈V}; Dobrushin’s condition: ‖ρ‖_∞ = max_{v∈V} ∑_{u∈V} ρ_{v,u} ≤ 1 − ε.
Claim: Dobrushin’s condition implies τ_mix = O(Δ log n) for the LubyGlauber chain.
Proof (similar to [Hayes ’04] [Dyer, Goldberg, Jerrum ’06]): in the one-step optimal coupling (X_t, Y_t), let p_v^{(t)} = Pr[X_t(v) ≠ Y_t(v)]. Then
    p^{(t+1)} ≤ M p^{(t)},  where M = (I − D) + Dρ,
with D diagonal and D_{v,v} = Pr[v is picked in the Luby step] ≥ 1/(deg(v)+1). Hence
    Pr[X_t ≠ Y_t] ≤ ‖p^{(t)}‖_1 ≤ n‖p^{(t)}‖_∞ ≤ n‖M‖_∞^t ‖p^{(0)}‖_∞ ≤ n(1 − ε/(Δ+1))^t.
Crossing the Chromatic # Barrier
Glauber: τ_mix = O(n log n); LubyGlauber: τ_mix = O(Δ log n); parallel speedup = Θ(n/Δ). (Δ = max-degree, χ = chromatic number.)
LubyGlauber does not update adjacent vertices simultaneously, so it takes ≥ χ steps to update all vertices at least once.
Q: “How to update all variables simultaneously and still converge to the correct distribution?”
The LocalMetropolis Chain
Starting from an arbitrary X ∈ [q]^V, at each step:
• each vertex v ∈ V independently proposes a random σ_v ∈ [q] with probability b_v(σ_v) / ∑_{i∈[q]} b_v(i);
• each edge e = (u, v) passes its check independently (a collective coin flip made between u and v) with probability
    A_e(X_u, σ_v) · A_e(σ_u, X_v) · A_e(σ_u, σ_v) / max_{i,j∈[q]} (A_e(i,j))³;
• each vertex v ∈ V accepts its proposal and updates X_v to σ_v if all its incident edges pass their checks.
• [Feng, Sun, Y. ’17]: the LocalMetropolis chain is time-reversible w.r.t. the MRF Gibbs distribution µ.
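A sequential sketch of one LocalMetropolis step following the three phases above, assuming one shared edge matrix A for all edges (an illustrative simplification; names are our own):

```python
import random

def local_metropolis_step(X, edges, neighbors, A, b, q):
    n = len(X)
    # 1. every vertex independently proposes sigma_v with prob. proportional to b_v
    sigma = [random.choices(range(q), weights=b[v])[0] for v in range(n)]
    # 2. every edge (u, v) flips one collective coin: it "passes" with prob.
    #    A(X_u, sigma_v) * A(sigma_u, X_v) * A(sigma_u, sigma_v) / (max A)^3
    Amax = max(max(row) for row in A)
    passed = set()
    for (u, v) in edges:
        p = A[X[u]][sigma[v]] * A[sigma[u]][X[v]] * A[sigma[u]][sigma[v]] / Amax ** 3
        if random.random() < p:
            passed.add((u, v))
    # 3. a vertex accepts its proposal iff all incident edges passed their checks
    newX = X[:]
    for v in range(n):
        if all((u, v) in passed or (v, u) in passed for u in neighbors[v]):
            newX[v] = sigma[v]
    return newX

# for proper colorings (A zero on the diagonal), a check can only pass when the
# relevant pairs are non-monochromatic, so the chain never leaves proper colorings
random.seed(3)
q = 5
edges = [(0, 1), (1, 2), (0, 2)]
neighbors = [[1, 2], [0, 2], [0, 1]]
A = [[0 if i == j else 1 for j in range(q)] for i in range(q)]
b = [[1] * q for _ in range(3)]
X = [0, 1, 2]
for _ in range(100):
    X = local_metropolis_step(X, edges, neighbors, A, b, q)
    assert all(X[v] != X[u] for v in range(3) for u in neighbors[v])
```

In the LOCAL model the edge coin is flipped jointly by the two endpoints (shared randomness over one edge), so the whole step is one communication round; the sketch above just simulates it sequentially.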
Detailed Balance Equation: ∀X, Y ∈ [q]^V, µ(X) P(X,Y) = µ(Y) P(Y,X).
σ ∈ [q]^V: the proposals of all vertices; C ∈ {0,1}^E: indicates whether each edge e ∈ E passes its check.
Ω_{X→Y} ≜ {(σ, C) | X → Y when the random choice is (σ, C)}. It suffices to show
    P(X,Y) / P(Y,X) = ∑_{(σ,C)∈Ω_{X→Y}} Pr(σ) Pr(C | σ, X) / ∑_{(σ,C)∈Ω_{Y→X}} Pr(σ) Pr(C | σ, Y) = µ(Y) / µ(X).
A bijection φ_{X,Y} : Ω_{X→Y} → Ω_{Y→X}, (σ, C) ↦ (σ′, C′), is constructed as: C′ = C, and
    σ′_v = X_v if C_e = 1 for all e incident with v; σ′_v = σ_v otherwise.
It satisfies
    Pr(σ) Pr(C | σ, X) / Pr(σ′) Pr(C′ | σ′, Y) = ∏_{v∈V} b_v(Y_v)/b_v(X_v) · ∏_{e=uv∈E} A_e(Y_u,Y_v)/A_e(X_u,X_v) = µ(Y) / µ(X).
LocalMetropolis for the Hardcore Model
The hardcore model on G(V, E) with fugacity λ: for every independent set I in G,
    µ(I) = λ^{|I|} / ∑_{I′: IS in G} λ^{|I′|}
Starting from an arbitrary X ∈ {0,1}^V (with 1 indicating occupied), at each step, each vertex v ∈ V:
• proposes a random σ_v ∈ {0,1} independently, with σ_v = 1 with probability λ/(1+λ) and σ_v = 0 with probability 1/(1+λ);
• accepts the proposal and updates X_v to σ_v unless for some neighbor u of v: X_u = σ_v = 1, or σ_u = X_v = 1, or σ_u = σ_v = 1.
• λ < 1/Δ: τ_mix = O(log n), even for unbounded Δ.
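The specialized chain above can be sketched directly (our own code; X[v] = 1 means v is occupied):

```python
import random

def hardcore_step(X, neighbors, lam):
    """One LocalMetropolis step for the hardcore model with fugacity lam."""
    n = len(X)
    # every vertex proposes occupied (1) with probability lam / (1 + lam)
    sigma = [1 if random.random() < lam / (1 + lam) else 0 for _ in range(n)]
    newX = X[:]
    for v in range(n):
        # reject if some neighbor u has X_u = sigma_v = 1, sigma_u = X_v = 1,
        # or sigma_u = sigma_v = 1; otherwise accept the proposal
        if any(X[u] == 1 == sigma[v] or sigma[u] == 1 == X[v] or sigma[u] == 1 == sigma[v]
               for u in neighbors[v]):
            continue
        newX[v] = sigma[v]
    return newX

# starting from an independent set, the configuration stays an independent set
random.seed(4)
neighbors = [[1], [0, 2], [1, 3], [2]]   # a path on 4 vertices, Delta = 2
X = [0, 0, 0, 0]
for _ in range(300):
    X = hardcore_step(X, neighbors, 0.4)  # lam = 0.4 < 1/Delta
    assert all(not (X[v] and X[u]) for v in range(4) for u in neighbors[v])
```

The three rejection cases rule out every way two adjacent vertices could end up simultaneously occupied, which is why the independent-set invariant is preserved.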