Gossip algorithms for solving Laplacian systems
Anastasios Zouzias (University of Toronto)
Joint work with Nikolaos Freris (EPFL)

Based on:
1. Fast Distributed Smoothing for Clock Synchronization (CDC '12)
2. Randomized Extended Kaczmarz for Solving Least-Squares (arXiv, May '12)
Outline
I. Problems: Solving Laplacian & edge-vertex systems
II. Motivation: Clock Synchronization over WSNs
III. Randomized Gossip Model & Averaging Problem
IV. Gossip Solvers via Randomized (Extended) Kaczmarz
Distributed solver: Laplacian system

[Figure: example 13-node graph; node i holds b_i and must compute coordinate x_i of the Laplacian system L x = b.]

G = (V, E): n nodes, m edges

Problem I
  Input: each node i gets b_i
  Goal: each node i computes the i-th coordinate of x_LS := L^† b

Model of computation
  • Each node is aware of its neighbors; exchanges packets with them only
  • Static network; no communication errors; ignore numerical issues
  • Synchronous, asynchronous & gossip
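As a centralized reference point (a minimal Python/NumPy sketch, not the slides' gossip protocol), the snippet below builds the Laplacian from an edge list and computes x_LS = L^† b directly; the example graph and right-hand side are hypothetical.

import numpy as np

def graph_laplacian(n, edges):
    """Build the n x n Laplacian L of an undirected graph given as an edge list."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return L

# Hypothetical 4-node cycle and right-hand side held by the nodes.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
b = np.array([1.0, -1.0, 2.0, -2.0])
L = graph_laplacian(4, edges)
x_ls = np.linalg.pinv(L) @ b   # x_LS = L^† b; node i should end up knowing x_ls[i]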
Distributed solver: edge-vertex system

[Figure: the same graph; each edge (i, j) holds a measurement y_(i,j), giving the m x n edge-vertex system B x = y.]

G = (V, E): n nodes, m edges

For an edge e = (i, j), row B^(e) of the m x n incidence matrix B is
  B^(e)_k = -1 if k = i;  1 if k = j;  0 otherwise.

Problem II
  Input: each edge (i, j) gets y_(i,j)
  Goal: each node i computes the i-th coordinate of x_LS = B^† y

The normal equation of B x = y is a Laplacian system (L = B^⊤ B).
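To make the row definition concrete, here is a sketch (hypothetical oriented 3-cycle) of the incidence matrix B, a check that B^⊤ B reproduces the Laplacian, and the least-squares solution x_LS = B^† y.

import numpy as np

def incidence_matrix(n, edges):
    """Row e of B has -1 at the tail i and +1 at the head j of edge e = (i, j)."""
    B = np.zeros((len(edges), n))
    for e, (i, j) in enumerate(edges):
        B[e, i] = -1.0
        B[e, j] = 1.0
    return B

edges = [(0, 1), (1, 2), (2, 0)]          # hypothetical oriented 3-cycle
y = np.array([0.5, -1.0, 0.5])            # measurements held by the edges
B = incidence_matrix(3, edges)
assert np.allclose(B.T @ B, [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]])  # L = B^T B
x_ls = np.linalg.pinv(B) @ y              # x_LS = B^† y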
Outline
I. Problems: Solving Laplacian & edge-vertex systems
II. Motivation: Clock Synchronization over WSNs
III. Randomized Gossip Model & Averaging Problem
IV. Gossip Solvers via Randomized (Extended) Kaczmarz
Case Study: Clock Synchronization over WSNs

G = (V, E): n nodes, m edges

Assumptions
  • Each node has a clock running at the same speed: o_v(t) = t + o_v, with o_v ∈ R
  • Node v does not know its own offset o_v
  • Nodes can approximate relative offsets o_uv = o_v - o_u for every u ∈ Neigh(v)

Clock Synchronization Problem
  Input: noisy estimates ô_uv = o_uv + N(0, 1) for all (u, v) ∈ E
  Goal: compute offsets (õ_u)_{u ∈ V} that minimize max over all pairs of nodes of E|õ_uv - o_uv|^2
Tree-based Approach

Idea: Build a spanning tree.

Estimate the offset between u and v by summing edge estimates along the tree path P from u to v:
  õ_uv = Σ_{(u',v') ∈ P} ô_{u'v'}

Every edge carries an independent normal error and the path can be as long as diam(G),
so the sync error between u and v grows like ≈ O(√diam(G)).

In general, there is no hope for better accuracy
...but wireless networks are "well-connected".
Modeling Wireless Networks

...as Random Geometric Graphs (n nodes uniform over a square of unit area)
  • Connectivity [GK00]: r = O(√(log n / n))
  • Diameter: O(√(n / log n))
  • Tree-based approach: error ≈ O(n^{1/4})

Q: Can we do better on Random Geometric Graphs?
Yes! Spatial Smoothing [KEES03, GK06]
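A toy construction of a random geometric graph with the connectivity radius quoted above (a sketch; the size n and seed are illustrative):

import numpy as np

rng = np.random.default_rng(0)
n = 200
points = rng.random((n, 2))               # n nodes uniform over the unit square
r = np.sqrt(np.log(n) / n)                # connectivity radius r ~ sqrt(log n / n)
edges = [(i, j) for i in range(n) for j in range(i + 1, n)
         if np.linalg.norm(points[i] - points[j]) <= r]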
Spatial Smoothing

Observation: Along every loop in G, the sum of relative offsets equals zero.

Idea: Incorporate the loop constraints.

How? Encode the constraints in an m x n linear system B x = ô,
where the row of edge (i, j) equates the relative offset of (i, j) with the measurement ô_(i,j).
Properties of Least-Squares

Gaussian error: compute the LS solution of B x = ô.

Thm [KEES03]: Replace each edge by a unit resistor. Then the error variance between any pair of nodes u and v is
  E|õ_uv - o_uv|^2 ∼ R_eff(u, v)

Effective resistances of RGGs are bounded by O(1) [GK06].

Tree-based vs Smoothing: O(n^{1/4}) vs O(1)

Q: How to compute the LS solution?
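For intuition, the effective resistance in the theorem can be read off the Laplacian pseudoinverse via the standard identity R_eff(u, v) = (e_u - e_v)^⊤ L^† (e_u - e_v) (this identity is not stated on the slide; the sketch below assumes it):

import numpy as np

def effective_resistance(L, u, v):
    """R_eff(u, v) for a graph with unit-resistor edges and Laplacian L."""
    Lp = np.linalg.pinv(L)
    return Lp[u, u] + Lp[v, v] - 2.0 * Lp[u, v]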
The Model Matters...

Use coordinate descent: set ∂/∂x_u ||B x - ô||^2 = 0 for each node u.

Synchronous Jacobi
  Initialize õ_v = 0 for all v ∈ V
  For k = 1, 2, ..., every v ∈ V:
    õ_v^(k+1) ← (1/d_v) Σ_{u ∈ Neigh(v)} (õ_u^(k) + ô_uv)

Asynchronous Jacobi
  Each node v regularly:
    • estimates relative offsets with its neighbors
    • broadcasts its current offset
    • updates its estimate: õ_v ← (1/d_v) Σ_{u ∈ Neigh(v)} (õ_u + ô_uv)
  It converges [BT89].

Thm [GK06]: After k ≥ (4 m^2 / β^2) ln(||x*|| / ε) rounds, it holds that ||x^(k) - x*|| ≤ ε,
where β is the min-cut value.
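A minimal sketch of the synchronous Jacobi update above; the neighbor lists and the dictionary of measured relative offsets ô_uv are illustrative assumptions, and the small noiseless triangle is a hypothetical example.

def jacobi_round(o_tilde, neighbors, o_hat):
    """One synchronous round: o_tilde_v <- (1/d_v) * sum_{u in N(v)} (o_tilde_u + o_hat[(u, v)])."""
    return {v: sum(o_tilde[u] + o_hat[(u, v)] for u in nbrs) / len(nbrs)
            for v, nbrs in neighbors.items()}

# Hypothetical usage: triangle with true offsets (0, 1, 3) and noiseless measurements o_v - o_u.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
o_hat = {(0, 1): 1.0, (1, 0): -1.0, (1, 2): 2.0, (2, 1): -2.0, (0, 2): 3.0, (2, 0): -3.0}
o_tilde = {v: 0.0 for v in neighbors}
for _ in range(30):
    o_tilde = jacobi_round(o_tilde, neighbors, o_hat)
# On this example the pairwise differences o_tilde[v] - o_tilde[u] approach the true
# relative offsets (up to a common additive shift of all nodes).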
The Model Matters...

Randomized Gossip Model (a.k.a. asynchronous time model) [BTA86, BGPS06]
  Each node u (randomly) activates itself w.p. p_u & performs a local computation.

Contrast with the Jacobi variants of the previous slide: the synchronous model requires all
nodes to update in every round, whereas under the gossip model nothing needs to be coordinated globally.
Outline
I. Problems: Solving Laplacian & edge-vertex systems
II. Motivation: Clock Synchronization over WSNs
III. Randomized Gossip Model & Averaging Problem
IV. Gossip Solvers via Randomized (Extended) Kaczmarz
Distributed Averaging

Distributed Averaging Problem
  Input: every node u gets a value w_u
  Goal: every node wants access to the global average

How many rounds are required to approximate the average within ε?

Gossip averaging algorithm
  1. Every node u activates uniformly at random
  2. Picks a random neighbor v and averages their current values w_u, w_v

Averaging is a basic primitive for other functions, and it can be used to solve Problems I and II
[BDFSV10, XBL05, XBL06].

[BGPS06] proved that O((n / λ_2(G)) log(n/ε)) rounds are sufficient whp.
Special cases, complete graph: [KSSV00, KDG03, KDN+06].
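A sketch of the gossip averaging primitive described above; the toy graph, node values, and seed are hypothetical.

import random

def gossip_average(w, neighbors, rounds, rng=None):
    """Repeatedly let a uniformly random node and a random neighbor keep their pairwise average."""
    rng = rng or random.Random(0)
    nodes = list(w)
    for _ in range(rounds):
        u = rng.choice(nodes)              # node activates uniformly at random
        v = rng.choice(neighbors[u])       # picks a random neighbor
        w[u] = w[v] = (w[u] + w[v]) / 2.0  # both keep the average; the global sum is preserved
    return w

# Hypothetical usage: all values drift toward the global average.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
w = {0: 4.0, 1: 1.0, 2: 7.0, 3: 2.0}
gossip_average(w, neighbors, rounds=200)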
Gossip Model

Assumptions
  • Each node u has an independent Poisson time process with rate γ_u
  • Each node activates when its arrival occurs
  • Equivalently*: a single global Poisson process with rate Σ_{u ∈ V} γ_u
  • Arrivals correspond to rounds

Claim: Non-uniform sampling of nodes is feasible with zero communication under the gossip model (given the γ_u's).

Goal: Design and analyze gossip algorithms for Problems I and II.

* the minimum of independent Poisson arrival times is equivalent to a single Poisson process whose rate is the sum of their rates
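A sketch of the equivalence marked with * above: independent Poisson clocks of rates γ_u superpose into one global Poisson clock of rate Σ γ_u, and each global arrival belongs to node u with probability γ_u / Σ γ_u, so non-uniform node sampling needs no communication. The rates below are hypothetical.

import numpy as np

def next_activation(gamma, rng):
    """Return (time until the next round, index of the node whose clock fired)."""
    rates = np.asarray(gamma, dtype=float)
    total = rates.sum()
    dt = rng.exponential(1.0 / total)             # superposed clock has rate sum(gamma)
    u = rng.choice(len(rates), p=rates / total)   # the arrival belongs to u w.p. gamma_u / sum
    return dt, u

rng = np.random.default_rng(0)
dt, u = next_activation([1.0, 2.0, 0.5], rng)     # hypothetical rates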
Outline
I. Problems: Solving Laplacian & edge-vertex systems
II. Motivation: Clock Synchronization over WSNs
III. Randomized Gossip Model & Averaging Problem
IV. Gossip Solvers via Randomized (Extended) Kaczmarz
Kaczmarz Method

[Figure: linear system A x = y; each iterate x^(0), x^(1), x^(2), ..., x^(m) is the orthogonal projection of the previous one onto a row hyperplane H_i = { x | <A^(i), x> = y_i }.]

Kaczmarz Method (K)
  Initialize: x^(0) = 0
  Repeat:
    Set i_k = k mod m + 1
    x^(k+1) = x^(k) + ((y_{i_k} - <A^(i_k), x^(k)>) / ||A^(i_k)||^2) A^(i_k)
    k = k + 1

It converges [K37].
Huge literature; many extensions; rediscovered many times.
(Assumption: A x = y has a solution)
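A minimal sketch of the cyclic Kaczmarz iteration above, assuming (as on the slide) that A x = y is consistent; the test system is hypothetical.

import numpy as np

def kaczmarz(A, y, iters):
    """Cyclically project the iterate onto the hyperplane {x : <A^(i), x> = y_i} of each row."""
    m, n = A.shape
    x = np.zeros(n)
    for k in range(iters):
        i = k % m                                 # i_k = k mod m (+1 in 1-based notation)
        a = A[i]
        x = x + (y[i] - a @ x) / (a @ a) * a      # orthogonal projection step
    return x

# Hypothetical usage on a consistent system.
A = np.array([[2.0, 1.0], [1.0, 3.0], [1.0, 1.0]])
x_true = np.array([1.0, -1.0])
x_hat = kaczmarz(A, A @ x_true, iters=200)        # converges to x_true [K37]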
Randomized Kaczmarz Method

Pick rows randomly.

Randomized Kaczmarz (RK) [SV06]
  Initialize: x^(0) = 0
  Repeat:
    Pick i_k ∈ [m] w.p. p_i ∝ ||A^(i)||^2
    x^(k+1) = x^(k) + ((y_{i_k} - <A^(i_k), x^(k)>) / ||A^(i_k)||^2) A^(i_k)
    k = k + 1

Exponential convergence:
  E ||x^(k) - x*||^2 ≤ (1 - 1/κ_F^2(A))^k ||x*||^2,   where κ_F^2(A) := ||A||_F^2 / σ_min^2(A)

(Assumption: A x = y has a solution)
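The same sketch with the row-sampling rule of [SV06]: rows are drawn with probability proportional to their squared norms (the test system and seed are again hypothetical).

import numpy as np

def randomized_kaczmarz(A, y, iters, rng=None):
    """RK: sample row i w.p. ||A^(i)||^2 / ||A||_F^2, then do a Kaczmarz projection."""
    rng = rng or np.random.default_rng(0)
    x = np.zeros(A.shape[1])
    row_norms = np.einsum('ij,ij->i', A, A)       # squared row norms
    p = row_norms / row_norms.sum()
    for _ in range(iters):
        i = rng.choice(A.shape[0], p=p)
        x = x + (y[i] - A[i] @ x) / row_norms[i] * A[i]
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0], [1.0, 1.0]])
x_hat = randomized_kaczmarz(A, A @ np.array([1.0, -1.0]), iters=300)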
Let's apply RK to Problems I and II
RK Laplacian Solver

[Figure: Laplacian system L x = b on the example graph.]

Randomized Kaczmarz (RK) [SV06] applied to L x = b
  Initialize: x^(0) = 0
  Repeat:
    Pick i_k ∈ [n] w.p. p_i ∝ ||L^(i)||^2   (exploits the sparsity of L)
    x^(k+1) = x^(k) + ((b_{i_k} - <L^(i_k), x^(k)>) / ||L^(i_k)||^2) L^(i_k)
    k = k + 1

Gossip Laplacian Solver
  Each node u: x_u = 0
  Repeat:
    Node u activates w.p. ∝ d_u^2 + d_u
    • broadcasts θ = x_u - (1/d_u) Σ_{ℓ ∈ N_u} x_ℓ
    • sets x_u ← x_u + (b_u - d_u θ) / (1 + d_u)
    Every v ∈ Neigh(u): x_v ← x_v + (θ - b_u/d_u) / (1 + d_u)

RK analysis & diagonal preconditioning: O(n / λ_2^2(G)) rounds whp.
(Assumption: L x = b has a solution)
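A sketch of one activation of the gossip Laplacian solver above, written with only quantities local to the active node u and its neighbors; it is exactly an RK projection onto row u of L (the data structures are illustrative assumptions).

def laplacian_gossip_step(x, u, neighbors, b):
    """One activation of node u: broadcast theta, then u and each neighbor apply a local correction."""
    d = len(neighbors[u])
    theta = x[u] - sum(x[v] for v in neighbors[u]) / d   # value broadcast by node u
    x[u] += (b[u] - d * theta) / (1 + d)                 # update at u
    for v in neighbors[u]:
        x[v] += (theta - b[u] / d) / (1 + d)             # one-number update at each neighbor
    return x
# In the full solver, node u activates with probability proportional to d_u^2 + d_u,
# which reproduces RK's row sampling p_i ∝ ||L^(i)||^2.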
RK Edge-vertex Solver

[Figure: edge-vertex system B x = y (m x n) on the example graph.]

Randomized Kaczmarz (RK) [SV06] applied to B x = y
  Initialize: x^(0) = 0
  Repeat:
    Pick e = (i, j) ∈ E uniformly   (exploits the sparsity of B)
    x^(k+1) = x^(k) + ((y_e - <B^(e), x^(k)>) / 2) B^(e)
    k = k + 1

Gossip Edge-Vertex Solver
  Every node u: x_u = 0
  Repeat:
    Node u activates w.p. ∝ d_u & selects a random neighbor v
    • sends x_u & receives x_v
    • performs: x_u ← (x_u + x_v + y_(u,v)) / 2
    Similarly for node v

RK analysis & diagonal preconditioning: O(n / λ_2(G)) rounds whp.

Consistency assumption (limitation of RK): B x = y must have a solution.
How to handle the general case?
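A sketch of one activation of the gossip edge-vertex solver above; the sign convention (y_(u,v) estimates x_u - x_v) follows the update written on the slide, and the step is an RK projection onto the row of edge (u, v), whose squared norm is 2.

def edge_vertex_gossip_step(x, u, v, y_uv):
    """One activation: u and v exchange values and both move to the measurement-corrected average."""
    x_u, x_v = x[u], x[v]
    x[u] = (x_u + x_v + y_uv) / 2.0   # as on the slide, with y_uv ~ x_u - x_v
    x[v] = (x_u + x_v - y_uv) / 2.0   # symmetric update at the neighbor
    return x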