  1. Message Passing in the Presence of Erasures Nicholas Ruozzi

  2. Motivation • Real networks are dynamic and constrained • Messages are lost • Nodes join and leave • Nodes may be power constrained • Empirical studies suggest that belief propagation and its relatives continue to perform well over real networks • [Anker, Dolev, and Hod, 2008] • [Anker, Bickson, Dolev, and Hod, 2008] • Few theoretical guarantees

  3. Convergent Message Passing • New classes of reweighted message passing algorithms guarantee convergence and a notion of correctness • e.g., MPLP, tree-reweighted max-product, norm-product, etc. • Need special updating schedules or central control • No guarantees if messages are lost or updated in the wrong order

  4. Factorizations • A function f factorizes with respect to a graph G = (V, E) if $f(x_1, \dots, x_n) = \prod_{i \in V} \phi_i(x_i) \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j)$ • The goal is to maximize the function f • Max-product attempts to solve this problem by passing messages over the graph G
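To make the factorization concrete, here is a minimal Python sketch that evaluates such an f by brute force on a hypothetical three-node binary chain; the graph, the potential tables, and all numbers are illustrative, not from the talk.

```python
import itertools

# Hypothetical pairwise MRF on the chain 0 - 1 - 2 with binary variables.
V = [0, 1, 2]
E = [(0, 1), (1, 2)]
phi = {0: [1.0, 2.0], 1: [1.5, 0.5], 2: [0.7, 1.3]}      # phi_i(x_i)
psi = {(0, 1): [[2.0, 0.5], [0.5, 2.0]],                 # psi_ij(x_i, x_j)
       (1, 2): [[1.0, 0.2], [0.2, 1.0]]}

def f(x):
    """f(x) = prod_i phi_i(x_i) * prod_ij psi_ij(x_i, x_j)."""
    val = 1.0
    for i in V:
        val *= phi[i][x[i]]
    for (i, j) in E:
        val *= psi[(i, j)][x[i]][x[j]]
    return val

# Brute-force maximization -- the problem max-product tries to solve by
# message passing instead of enumeration.
best = max(itertools.product([0, 1], repeat=len(V)), key=f)
print(best, f(best))
```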

  5. Reweighted Message Passing • $m_{ij}^t(x_j) := \max_{x_i} \left[ \psi_{ij}(x_i, x_j)^{1/c_{ij}} \, \phi_i(x_i) \, \frac{\prod_{k \in N(i)} m_{ki}^{t-1}(x_i)^{c_{ki}}}{m_{ji}^{t-1}(x_i)} \right]$ • Messages passed from a node only depend on the messages received by that node at the previous time step • Generalization of max-product
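A minimal sketch of one synchronous round of this update, in the same toy-model style as above; the names (`one_round`, `nbrs`) and the per-message normalization are my additions for numerical stability, not part of the slides.

```python
import numpy as np

def one_round(V, E, phi, psi, c, msgs):
    """One synchronous round of the reweighted max-product update
    m_ij^t(x_j) = max_{x_i} psi_ij(x_i,x_j)^{1/c_ij} * phi_i(x_i)
                  * prod_{k in N(i)} m_ki^{t-1}(x_i)^{c_ki} / m_ji^{t-1}(x_i)."""
    def nbrs(i):
        return [b if a == i else a for (a, b) in E if i in (a, b)]

    new = {}
    for (i, j) in msgs:
        # psi is stored once per undirected edge as psi_ij(x_i, x_j)
        pot = psi[(i, j)] if (i, j) in psi else psi[(j, i)].T
        inc = phi[i].copy()                        # accumulates over x_i
        for k in nbrs(i):
            inc = inc * msgs[(k, i)] ** c[frozenset((k, i))]
        inc = inc / msgs[(j, i)]                   # remove the reverse message
        m = (pot ** (1.0 / c[frozenset((i, j))]) * inc[:, None]).max(axis=0)
        new[(i, j)] = m / m.sum()                  # normalize for stability
    return new

# Toy usage: binary chain 0 - 1 - 2 with made-up potentials.
V, E = [0, 1, 2], [(0, 1), (1, 2)]
phi = {0: np.array([1.0, 2.0]), 1: np.array([1.5, 0.5]), 2: np.array([0.7, 1.3])}
psi = {(0, 1): np.array([[2.0, 0.5], [0.5, 2.0]]),
       (1, 2): np.array([[1.0, 0.2], [0.2, 1.0]])}
c = {frozenset(e): 0.45 for e in E}   # c < 1/(max degree) = 1/2, cf. slide 8
msgs = {d: np.ones(2) for e in E for d in (e, e[::-1])}
for _ in range(50):
    msgs = one_round(V, E, phi, psi, c, msgs)
```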

  6. Reweighted Message Passing • $b_i^t(x_i) = \phi_i(x_i) \prod_{k \in N(i)} m_{ki}^t(x_i)^{c_{ki}}$ and $b_{ij}^t(x_i, x_j) = \psi_{ij}(x_i, x_j)^{1/c_{ij}} \, \frac{b_i^t(x_i)}{m_{ji}^t(x_i)} \, \frac{b_j^t(x_j)}{m_{ij}^t(x_j)}$ • These “beliefs” provide an alternative factorization of the objective function: $f(x) = \prod_{i \in V} b_i(x_i)^{(1 - \sum_{k \in \partial i} c_{ik})} \prod_{(i,j) \in E} b_{ij}(x_i, x_j)^{c_{ij}}$

  7. Reweighted Message Passing • $b_i^t(x_i) = \phi_i(x_i) \prod_{k \in N(i)} m_{ki}^t(x_i)^{c_{ki}}$ and $b_{ij}^t(x_i, x_j) = \psi_{ij}(x_i, x_j)^{1/c_{ij}} \, \frac{b_i^t(x_i)}{m_{ji}^t(x_i)} \, \frac{b_j^t(x_j)}{m_{ij}^t(x_j)}$ • Certain choices of the reweighting parameters produce natural convex upper bounds on the objective function: $\max_x f(x) \le \prod_{i \in V} \max_{x_i} b_i(x_i)^{(1 - \sum_{k \in \partial i} c_{ik})} \cdot \prod_{(i,j) \in E} \max_{x_i, x_j} b_{ij}(x_i, x_j)^{c_{ij}}$
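Continuing the sketch above, the beliefs and the bound can be evaluated from the current messages. Because the sketch normalizes messages each round, constants are dropped here, so this illustrates the structure of the bound rather than reproducing exact values.

```python
import numpy as np

def beliefs_and_bound(V, E, phi, psi, c, msgs):
    """Compute the beliefs b_i, b_ij and evaluate
    prod_i max b_i^{1 - sum_k c_ik} * prod_ij max b_ij^{c_ij}.
    (Use with phi, psi, c, msgs from the previous sketch.)"""
    def nbrs(i):
        return [b if a == i else a for (a, b) in E if i in (a, b)]

    b = {}
    for i in V:
        bi = phi[i].copy()
        for k in nbrs(i):
            bi = bi * msgs[(k, i)] ** c[frozenset((k, i))]
        b[i] = bi                                  # b_i(x_i)
    bound = 1.0
    for i in V:
        bound *= b[i].max() ** (1.0 - sum(c[frozenset((k, i))] for k in nbrs(i)))
    for (i, j) in E:
        bij = (psi[(i, j)] ** (1.0 / c[frozenset((i, j))])
               * (b[i] / msgs[(j, i)])[:, None]
               * (b[j] / msgs[(i, j)])[None, :])   # b_ij(x_i, x_j)
        bound *= bij.max() ** c[frozenset((i, j))]
    return b, bound
```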

  8. Reweighted Message Passing • $m_{ij}^t(x_j) := \max_{x_i} \left[ \psi_{ij}(x_i, x_j)^{1/c_{ij}} \, \phi_i(x_i) \, \frac{\prod_{k \in N(i)} m_{ki}^{t-1}(x_i)^{c_{ki}}}{m_{ji}^{t-1}(x_i)} \right]$ • If each $c_{ij} < 1/(\text{max degree})$, then there is a simple, “asynchronous” coordinate descent scheme
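A sketch of what such an asynchronous schedule might look like: one directed message is updated at a time, always reading the most recent values. Here `update_one` is a stand-in for the single-message version of the update on slide 5, and the random order is just one valid choice; the talk does not prescribe a specific order.

```python
import random

def asynchronous_schedule(msgs, update_one):
    """Coordinate-descent style schedule over the directed messages.
    `update_one(i, j, msgs)` returns the new m_ij given the current
    (already partially updated) message dictionary."""
    for (i, j) in random.sample(list(msgs), len(msgs)):
        msgs[(i, j)] = update_one(i, j, msgs)   # in-place, latest values
    return msgs
```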

  9. Reweighted Message Passing • Convergence is guaranteed by performing coordinate descent on a convex upper bound • Can we extend our convergence guarantees to networks in which messages can be lost? • Delivered too slowly • Adversarially lost • Intentionally not sent • Lost independently with some fixed probability

  10. Results • For pairwise MRFs: • Can modify the graph locally in order to guarantee convergence when there are message erasures • Yields a completely local message passing algorithm as a side effect • If no messages are lost, the convergence of the asynchronous algorithm implies convergence of the synchronous one

  11. Extending Convergence • With a linear amount of additional state at each node of the network we can, again, guarantee convergence with erasures • Construct a new graphical model such that message passing on the new model can be simulated over the network • Update messages “internal” to each node in such a way as to guarantee convergence

  12. Extending Convergence • Construct a new graphical model from the network: • Create a copy of node i for each one of i’s neighbors • Attach each copy to exactly one copy of each neighbor • Enforce equality among the copies of each node with equality constraints • Messages can only be lost between copies of different nodes (all other messages are internal to a node of the network)
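A sketch of this construction in plain Python; the helper names are mine, and the copies of a node are tied in a chain here, although slide 16 notes that a complete graph or a single cycle over the copies also works.

```python
def split_model(V, E):
    """Split each node into one copy per neighbor, following slide 12:
    copy (i, j) of node i is attached to exactly one copy (j, i) of each
    neighbor j, and the copies of a node are tied together by equality
    edges.  Returns the copies, the 'network' edges (which can suffer
    erasures), and the 'internal' equality edges (local to one node)."""
    copies = [(i, j) for (a, b) in E for (i, j) in ((a, b), (b, a))]
    network_edges = [((i, j), (j, i)) for (i, j) in E]
    internal_edges = []
    for v in V:
        mine = [cp for cp in copies if cp[0] == v]
        internal_edges += list(zip(mine, mine[1:]))
    return copies, network_edges, internal_edges

def psi_eq(x, y):
    """Equality potential enforcing agreement among the copies of a node."""
    return 1.0 if x == y else 0.0

# Example on a hypothetical 4-node network (the exact edge set in the
# slide's figure is not recoverable from the transcript).
V, E = [1, 2, 3, 4], [(1, 2), (1, 3), (2, 4), (3, 4)]
copies, net, internal = split_model(V, E)
```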

  13. Extending Convergence • [Figure: the original 4-node network next to the new graphical model, in which each node is split into copies joined by equality edges; dashed circles are the nodes of the network]

  14. Extending Convergence • [Figure: the node potentials are divided evenly among the copies, e.g. each of the three copies of node 1 receives $\phi_1(x_{1,k})^{1/3}$, so their product recovers $\phi_1(x_1)$ when the copies agree; dashed circles are the nodes of the network]

  15. Extending Convergence • Convergence on the new network follows from the convergence of the asynchronous message passing algorithm • Works even in the presence of erasures • Requires no global knowledge of the network • Can convert any network into an equivalent 3-regular network

  16. Other Extensions • Many different updating strategies can be used to guarantee convergence: • Solve the “internal” problem exactly • Complete graph versus single cycle • Don’t divide the potentials evenly • Other graph modifications?

  17. Performance • The additional overhead may result in slower rates of convergence • In practice, there exist sequences of erasures for which either algorithm outperforms the other • However, the reweighted max-product algorithm always seems to converge in practice for appropriate choices of the parameters

  18. Networks Without Erasures • The synchronous algorithm is an asynchronous algorithm on the bipartite 2-cover of the network • [Figure: the original 4-node network and its bipartite 2-cover with node copies 1, …, 4 and 1′, …, 4′]
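The bipartite 2-cover is straightforward to construct: make two copies of each node and lift every edge across the two sides. A minimal sketch (the example edge set is assumed, as in the figure placeholder above):

```python
def bipartite_2cover(V, E):
    """Bipartite double cover: two copies (v, 0) = v and (v, 1) = v' of
    each node; every original edge {i, j} lifts to {i, j'} and {i', j}.
    The resulting graph is always bipartite."""
    V2 = [(v, s) for v in V for s in (0, 1)]
    E2 = ([((i, 0), (j, 1)) for (i, j) in E] +
          [((i, 1), (j, 0)) for (i, j) in E])
    return V2, E2

# The 4-node example from the slide (edges assumed).
V, E = [1, 2, 3, 4], [(1, 2), (1, 3), (2, 4), (3, 4)]
print(bipartite_2cover(V, E))
```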

  22. Conclusions • Understanding the convergence behavior of BP-like algorithms on a network with errors is a challenging problem • Can engineer around the problem to achieve a purely local algorithm • May incur a performance penalty • What is the exact relationship between these algorithms? • Empirically, the reweighted algorithm on the original network appears to always converge • Prove it?
