ALMOST SURE CONVERGENCE OF RANDOM GOSSIP ALGORITHMS
Giorgio Picci, with T. Taylor (ASU, Tempe, AZ). Wolfgang Runggaldier's Birthday, Brixen, July 2007
CONSENSUS FOR RANDOM GOSSIP ALGORITHMS
Consider a finite set of nodes representing, say, wireless sensors or distributed computing units. Can they achieve a common goal by exchanging information only locally? Here the goal is to form a common estimate of some physical variable $x$.
Each node $k$ forms its own estimate $x_k(t)$, $t \in \mathbb{Z}_+$, and updates it by exchanging information with a neighbor. Neighboring pairs are chosen randomly.
Q: will all local estimates $\{x_k(t),\ k = 1, \dots, n\}$ converge to the same value as $t \to \infty$?
DYNAMICS OF RANDOM GOSSIP ALGORITHMS
While two nodes $v_i$ and $v_j$ are in communication, each refines its own estimate using the neighbor's estimate. Model this adjustment in discrete time by a simple symmetric linear relation
\[ x_i(t+1) = x_i(t) + p\,(x_j(t) - x_i(t)), \qquad x_j(t+1) = x_j(t) + p\,(x_i(t) - x_j(t)), \]
where $p$ is a positive gain parameter modeling the speed of adjustment. The difference $x_i - x_j$ is multiplied by $1 - 2p$ at each exchange, so for stability we need $|1 - 2p| \le 1$ and hence $0 \le p \le 1$. For $p = \tfrac{1}{2}$ each node takes the average of the two estimates, so that $x_i(t+1) = x_j(t+1)$.
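The pairwise update can be tried out directly. The following is a minimal sketch in plain Python; the function name `gossip_step` and the sample values are illustrative assumptions, not part of the talk.

```python
def gossip_step(x, i, j, p):
    """Return a copy of x in which nodes i and j have moved toward each other
    with gain p (hypothetical helper; indices are 0-based)."""
    y = list(x)
    y[i] = x[i] + p * (x[j] - x[i])
    y[j] = x[j] + p * (x[i] - x[j])
    return y


x = [1.0, 4.0, 9.0]

# p = 1/2: both nodes end up at the pairwise average.
y = gossip_step(x, 0, 2, 0.5)

# p = 1/4: the difference x_i - x_j shrinks by the factor 1 - 2p = 1/2.
z = gossip_step(x, 0, 2, 0.25)
```

In both cases the sum of the entries is unchanged, which is why the only possible consensus value is the initial average.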
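DYNAMICS OF RANDOM GOSSIP ALGORITHMS
The whole coordinate vector $x(t) \in \mathbb{R}^n$ evolves according to $x(t+1) = A(e)\,x(t)$, where the matrix $A(e) \in \mathbb{R}^{n \times n}$ depends on the edge $e = v_i v_j$ selected at that particular time instant:
\[ A(e) = \begin{bmatrix} 1 & & & & \\ & 1-p & \cdots & p & \\ & \vdots & \ddots & \vdots & \\ & p & \cdots & 1-p & \\ & & & & 1 \end{bmatrix} = I_n - p\,(1_{v_i} - 1_{v_j})(1_{v_i} - 1_{v_j})^{\top}. \]
The entries $1-p$ sit in positions $(i,i)$ and $(j,j)$, the entries $p$ in positions $(i,j)$ and $(j,i)$; the rest of the matrix coincides with the identity.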
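As a sanity check on the formula $A(e) = I_n - p\,(1_{v_i} - 1_{v_j})(1_{v_i} - 1_{v_j})^{\top}$, here is a small sketch that builds the matrix as a list of lists. The names `gossip_matrix` and `matvec` are hypothetical helpers, and nodes are indexed from 0 for convenience.

```python
def gossip_matrix(n, i, j, p):
    """Build A(e) = I_n - p (1_i - 1_j)(1_i - 1_j)^T for the edge e = (i, j)."""
    A = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    A[i][i] -= p   # diagonal entries become 1 - p
    A[j][j] -= p
    A[i][j] += p   # off-diagonal entries become p
    A[j][i] += p
    return A


def matvec(A, x):
    """Plain matrix-vector product."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


n = 4
A = gossip_matrix(n, 1, 3, 0.3)
row_sums = [sum(row) for row in A]
col_sums = [sum(A[r][c] for r in range(n)) for c in range(n)]
is_symmetric = all(A[r][c] == A[c][r] for r in range(n) for c in range(n))
```

Row and column sums equal to one, together with symmetry, are exactly the properties used on the following slides.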
EIGENSPACES
The vector $1_{v_i}$ has the $i$th entry equal to 1 and zeros elsewhere. $A(e)$ is a symmetric doubly stochastic matrix.
The value $1 - 2p$ is a simple eigenvalue associated with the eigenvector $(1_{v_i} - 1_{v_j})$. Indeed, since $(1_{v_i} - 1_{v_j})^{\top}(1_{v_i} - 1_{v_j}) = 2$,
\[ A(e)(1_{v_i} - 1_{v_j}) = (1_{v_i} - 1_{v_j}) - p\,(1_{v_i} - 1_{v_j}) \cdot 2 = (1 - 2p)(1_{v_i} - 1_{v_j}). \]
The orthogonal (codimension one) subspace $(1_{v_i} - 1_{v_j})^{\perp}$ is the eigenspace of the eigenvalue 1.
Let $1 := [1, \dots, 1]^{\top}$. We want $x(t)$ to converge to the subspace $\{1\} := \{\alpha 1 ;\ \alpha \in \mathbb{R}\}$. This would be automatically true for a fixed irreducible (and aperiodic) doubly stochastic matrix.
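The eigenstructure can be verified numerically without forming the matrix, since $A(e)x = x - p\,(x_i - x_j)(1_{v_i} - 1_{v_j})$. The sketch below uses a hypothetical helper `apply_gossip` and 0-based node indices.

```python
def apply_gossip(x, i, j, p):
    """Compute A(e) x for e = (i, j) as x - p (x_i - x_j)(1_i - 1_j)."""
    y = list(x)
    d = x[i] - x[j]
    y[i] -= p * d
    y[j] += p * d
    return y


p = 0.25
u = [1.0, 0.0, -1.0, 0.0]      # 1_{v_0} - 1_{v_2}
w = [1.0, 2.0, 1.0, -3.0]      # orthogonal to u because w_0 = w_2
ones = [1.0, 1.0, 1.0, 1.0]

Au = apply_gossip(u, 0, 2, p)        # expected: (1 - 2p) u
Aw = apply_gossip(w, 0, 2, p)        # expected: w itself
A_ones = apply_gossip(ones, 0, 2, p) # expected: the all-ones vector
```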
A CONTROLLABILITY LEMMA
Lemma 1 Let $G = (V, E)$ be a graph. Then $\bigcap_{(v_i v_j) \in E} (1_{v_i} - 1_{v_j})^{\perp} = \operatorname{span}\{1\}$, i.e. $\operatorname{span}\{1_{v_i} - 1_{v_j} : (v_i v_j) \in E\} = 1^{\perp}$, iff $G$ is connected.
Corollary 1 Let $G' = (V, E')$ with $E' \subseteq E$ be a subgraph of $G$. Let $\{e_i : 1 \le i \le m'\}$ be an ordering of $E'$, and let $\pi$ denote a permutation of $\{1, 2, \dots, m'\}$. Let $B(E', \pi) = \prod_{i=1}^{m'} A(e_{\pi_i})$, where the product is ordered from right to left. Then $\left\| B(E', \pi)|_{1^{\perp}} \right\| < 1$ if and only if $G'$ is connected.
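Lemma 1 can be illustrated on small graphs by comparing the rank of the edge vectors $1_{v_i} - 1_{v_j}$ with $n - 1 = \dim 1^{\perp}$. This is only a sketch: the helper names and the two example graphs are assumptions chosen for illustration, and exact rational arithmetic is used to avoid rounding issues in the rank computation.

```python
from fractions import Fraction


def edge_vector(n, i, j):
    """The vector 1_i - 1_j in R^n (0-based indices)."""
    u = [Fraction(0)] * n
    u[i], u[j] = Fraction(1), Fraction(-1)
    return u


def rank(rows):
    """Rank by Gauss-Jordan elimination over exact rationals."""
    rows = [list(r) for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                f = rows[r][c] / rows[rk][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def is_connected(n, edges):
    """Depth-first search from node 0 on the undirected graph."""
    adj = {k: set() for k in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        k = stack.pop()
        for m in adj[k] - seen:
            seen.add(m)
            stack.append(m)
    return len(seen) == n


n = 4
path = [(0, 1), (1, 2), (2, 3)]   # connected
split = [(0, 1), (2, 3)]          # two components
rank_path = rank([edge_vector(n, i, j) for i, j in path])
rank_split = rank([edge_vector(n, i, j) for i, j in split])
```

For the connected path the edge vectors span all of $1^{\perp}$ (rank $n - 1$); for the disconnected graph the rank falls short.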
THE EDGE PROCESS
Let $\Omega = E^{\mathbb{N}}$ be the space of all semi-infinite sequences taking values in $E$, and let $\sigma : \Omega \to \Omega$ denote the shift map:
\[ \sigma(e_0, e_1, e_2, \dots, e_n, \dots) = (e_1, e_2, \dots, e_n, \dots). \]
Let $\mathrm{ev}_k : \Omega \to E$ denote the evaluation on the $k$th term. Let $\mu$ denote an ergodic shift invariant probability measure on $\Omega$, so that the edge process $e(k) : \omega \mapsto \mathrm{ev}_k(\omega)$ is ergodic.
Special cases: $e(k)$ is iid, or an ergodic Markov chain. However, what we shall do works for general ergodic processes.
Consider the function $C : \Omega \times \mathbb{Z}_+ \to \mathbb{R}^{n \times n}$,
\[ C(\omega, t) := \prod_{i=0}^{t-1} A(\mathrm{ev}_i(\omega)) = \prod_{i=0}^{t-1} A(\mathrm{ev}_0(\sigma^i \omega)), \]
with the product ordered from right to left, which by stationarity of $e$ obeys the composition rule
\[ C(\omega, t + s) = C(\sigma^t \omega, s)\, C(\omega, t), \qquad C(\omega, 0) = I. \]
Such a function is called a matrix cocycle.
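The composition rule can be checked on a finite prefix of an edge sequence. In this sketch a Python list stands in for the first terms of $\omega$, slicing plays the role of the shift $\sigma^t$, and all function names are hypothetical.

```python
def gossip_matrix(n, i, j, p):
    """A(e) = I_n - p (1_i - 1_j)(1_i - 1_j)^T for e = (i, j), 0-based."""
    A = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    A[i][i] -= p
    A[j][j] -= p
    A[i][j] += p
    A[j][i] += p
    return A


def matmul(A, B):
    """Plain matrix product A B."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


def cocycle(omega, t, n, p):
    """C(omega, t) = A(e_{t-1}) ... A(e_1) A(e_0), with C(omega, 0) = I."""
    C = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for i, j in omega[:t]:
        C = matmul(gossip_matrix(n, i, j, p), C)  # newest factor on the left
    return C


def shift(omega, t):
    """sigma^t applied to (a finite prefix of) omega."""
    return omega[t:]


n, p = 4, 0.3
omega = [(0, 1), (2, 3), (1, 2), (0, 3), (1, 2), (0, 1)]
t, s = 2, 3

lhs = cocycle(omega, t + s, n, p)
rhs = matmul(cocycle(shift(omega, t), s, n, p), cocycle(omega, t, n, p))
max_diff = max(abs(lhs[r][c] - rhs[r][c]) for r in range(n) for c in range(n))
```

The two sides agree up to floating-point rounding, since they multiply the same factors in the same order with a different grouping.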
MULTIPLICATIVE ERGODIC THEOREM
Theorem 1 [Oseledets' Multiplicative Ergodic Theorem] Let $\mu$ be a shift invariant probability measure on $\Omega$ and suppose that the shift map $\sigma : \Omega \to \Omega$ is ergodic and that $\log^+ \|C(\omega, t)\|$ is in $L^1$. Then the limit
\[ \Lambda = \lim_{t \to \infty} \left( C(\omega, t)^{\top} C(\omega, t) \right)^{\frac{1}{2t}} \qquad (1) \]
exists with probability one, is symmetric and nonnegative definite, and is $\mu$-a.s. independent of $\omega$. Let $\lambda_1 < \lambda_2 < \cdots < \lambda_k$, for $k \le n$, be the distinct eigenvalues of $\Lambda$, let $U_i$ denote the eigenspace of $\lambda_i$, and let $V_i = \bigoplus_{j=1}^{i} U_j$ (with $V_0 = \{0\}$). Then for $u \in V_i \setminus V_{i-1}$,
\[ \lim_{t \to \infty} \frac{1}{t} \log \|C(\omega, t)\,u\| = \log(\lambda_i). \qquad (2) \]
The numbers $\lambda_i$ are called the Lyapunov exponents of $C$.
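Formula (2) suggests a simple numerical experiment: follow one vector $u \in 1^{\perp}$ along a sampled edge sequence and average the logarithmic growth of its norm. The sketch below assumes iid edges drawn uniformly from a path graph (one of the special cases mentioned earlier); the function names, the graph, the gain, and the horizon are illustrative assumptions, and a finite-time average only approximates the limit.

```python
import math
import random


def apply_gossip(x, i, j, p):
    """Compute A(e) x for e = (i, j) as x - p (x_i - x_j)(1_i - 1_j)."""
    y = list(x)
    d = x[i] - x[j]
    y[i] -= p * d
    y[j] += p * d
    return y


def lyapunov_estimate(n, edges, p, t, seed=0):
    """Finite-time estimate of (1/t) log ||C(omega, t) u|| for a random unit
    vector u in the mean-zero subspace, with iid uniform edge selection."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(u) / n
    u = [v - mean for v in u]                    # project onto 1-perp
    norm = math.sqrt(sum(v * v for v in u))
    u = [v / norm for v in u]
    log_growth = 0.0
    for _ in range(t):
        i, j = rng.choice(edges)
        u = apply_gossip(u, i, j, p)
        norm = math.sqrt(sum(v * v for v in u))
        log_growth += math.log(norm)             # accumulate log of the norm ratio
        u = [v / norm for v in u]                # renormalize to avoid underflow
    return log_growth / t


path_edges = [(0, 1), (1, 2), (2, 3)]
rate = lyapunov_estimate(4, path_edges, 0.3, 2000)   # estimate of log(lambda)
lam_hat = math.exp(rate)                              # estimate of lambda
```

Renormalizing at every step keeps the vector at unit length, so only the accumulated logarithms carry the decay; this is a standard device for such estimates rather than anything specific to the talk.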
MULTIPLICATIVE ERGODIC THEOREM
The Lyapunov exponents control the exponential rate of convergence (or non-convergence) to consensus.
The matrices $A(e)$ are doubly stochastic, as are all products of them, in particular $C(\omega, t)$. It follows that the constant functions on $V$, $\{1\}$, as well as the mean zero functions in $\{1\}^{\perp}$, are invariant under the action of this cocycle and of its transpose. Thus these subspaces are also invariant under the limiting matrix $\Lambda$ of Oseledets' theorem.
There is a Lyapunov exponent associated with the subspace $\{1\}$ which, it is not difficult to see, is one.
There are $n - 1$ Lyapunov exponents associated with the subspace $1^{\perp}$, so the key point is to characterize them.
CONVERGENCE TO CONSENSUS
For $x \in \mathbb{R}^n$ use the symbol $\bar{x} := \frac{1}{n} \sum_{i=1}^{n} x_i$. The main convergence result follows.
Theorem 2 Let $G = (V, E)$ be a connected graph and let $e(t)$ be an ergodic stochastic process taking values in $E$. Suppose that the support of the probability distribution induced by $e(t)$ is all of $E$. Let the gossip algorithm be initialized at $x(0) = x_0$. Then there is a (deterministic) constant $\lambda$ with $|\lambda| < 1$ and a (random) constant $K_\lambda$ such that
\[ \| x(t) - \bar{x}_0 1 \| < K_\lambda\, \lambda^{t}\, \| x_0 - \bar{x}_0 1 \| \]
$\mu$-almost surely.
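A short simulation illustrates the statement of Theorem 2 in the iid special case. This is a sketch under stated assumptions: edges are drawn uniformly at random from a 4-node ring, the gain is $p = 1/2$, and the name `run_gossip`, the initial vector, and the number of steps are chosen only for illustration.

```python
import random


def run_gossip(x0, edges, p, steps, seed=0):
    """Run the gossip iteration x(t+1) = A(e(t)) x(t) with iid uniform edges."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        i, j = rng.choice(edges)
        d = x[i] - x[j]
        x[i] -= p * d
        x[j] += p * d
    return x


ring = [(0, 1), (1, 2), (2, 3), (3, 0)]      # connected graph on 4 nodes
x0 = [3.0, -1.0, 4.0, 10.0]
x0_bar = sum(x0) / len(x0)                   # initial average, here 4.0

x_final = run_gossip(x0, ring, 0.5, 500)
deviation = max(abs(v - x0_bar) for v in x_final)
final_bar = sum(x_final) / len(x_final)
```

After a few hundred exchanges all entries sit at the initial average up to rounding, and the average itself is preserved along the way because every $A(e)$ is doubly stochastic.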
OPEN QUESTIONS
• Rate of convergence (for $L^2$ ...).
• Multiple gossiping: more than one communicating pair (edge) per time slot.
• Convergence is merely associated with the time $T$ it takes the algorithm to visit a spanning tree with positive probability. Indeed, the actual rate of convergence of the algorithm is just determined by $T$.
• Much remains to be done!
REFERENCES
W. Runggaldier (circa 1970): STILLE WASSER GRUNDEN TIEF, unpublished (although well known among specialists).