
Optimization Nicholas Ruozzi Advisor: Sekhar Tatikonda Yale - PowerPoint PPT Presentation



  1. Message Passing Algorithms for Optimization
     - Nicholas Ruozzi
     - Advisor: Sekhar Tatikonda
     - Yale University

  2. The Problem
     - Minimize a real-valued objective function that factorizes as a sum of potentials
     - The potentials are indexed by a multiset whose elements are subsets of the indices 1,…,n
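
The formula on this slide did not survive extraction. A form consistent with the slide text is sketched below; the symbols \phi_i, \psi_\alpha, and \mathcal{A} are assumed names, not taken from the slide.

```latex
% Assumed notation: \phi_i are single-variable potentials, \psi_\alpha are
% potentials over the variable subset \alpha, and \mathcal{A} is a multiset
% of subsets of \{1, \dots, n\}.
f(x_1, \dots, x_n) \;=\; \sum_{i=1}^{n} \phi_i(x_i) \;+\; \sum_{\alpha \in \mathcal{A}} \psi_\alpha(x_\alpha)
```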

  3. Corresponding Graph
     - [Figure: factor graph on variables 1, 2, 3]

  4. Local Message Passing Algorithms
     - [Figure: factor graph on variables 1, 2, 3]
     - Pass messages on this graph to minimize f
     - Distributed message passing algorithm
     - Ideal for large scientific problems, sensor networks, etc.

  5. The Min-Sum Algorithm
     - Messages at time t:
     - [Figure: graph on nodes 1, 2, 3, 4]
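
The message equations are missing from the extracted slide. The textbook min-sum updates on a factor graph are sketched below as a stand-in; the notation (\partial i for the factors containing variable i, \phi_i and \psi_\alpha for the potentials) is assumed, and the slide may have used a normalized or otherwise different form.

```latex
% Standard min-sum updates (assumed form, without normalization constants).
% Variable-to-factor message:
m^{t}_{i \to \alpha}(x_i) \;=\; \phi_i(x_i) + \sum_{\beta \in \partial i \setminus \alpha} m^{t-1}_{\beta \to i}(x_i)
% Factor-to-variable message:
m^{t}_{\alpha \to i}(x_i) \;=\; \min_{x_{\alpha \setminus i}} \Big[ \psi_\alpha(x_\alpha) + \sum_{k \in \alpha \setminus i} m^{t-1}_{k \to \alpha}(x_k) \Big]
```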

  6. Computing Beliefs
     - The min-marginal corresponding to the i-th variable is given by
     - Beliefs approximate the min-marginals:
     - Estimate the optimal assignment as
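
The three formulas this slide refers to were lost in extraction. The standard definitions are sketched below under assumed notation (f_i for the min-marginal, b_i^t for the belief at time t, m_{\alpha \to i}^t for factor-to-variable messages); the slide's exact form may differ.

```latex
% Min-marginal of variable i: the best objective value with x_i held fixed.
f_i(x_i) \;=\; \min_{x' \,:\, x'_i = x_i} f(x')
% Belief at time t (assumed standard form):
b^{t}_i(x_i) \;=\; \phi_i(x_i) + \sum_{\alpha \in \partial i} m^{t}_{\alpha \to i}(x_i)
% Estimate of the optimal assignment:
x^{t}_i \;\in\; \arg\min_{x_i} b^{t}_i(x_i)
```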

  7. Min-Sum: Convergence Properties
     - Iterations do not necessarily converge
     - Always converges when the factor graph is a tree
     - Converged estimates need not correspond to the optimal solution
     - Performs well empirically
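
The tree case can be checked numerically. The sketch below is an illustration, not the author's code: it runs synchronous min-sum on a three-variable chain with random potentials (a hypothetical toy instance) and compares the resulting beliefs with brute-force min-marginals. It uses the pairwise form of the updates, where variables message each other directly.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
k = 3  # each variable takes values 0..k-1
n = 3
edges = [(0, 1), (1, 2)]  # chain 0 - 1 - 2, which is a tree
phi = rng.normal(size=(n, k))                      # phi[i, x_i]
psi = {e: rng.normal(size=(k, k)) for e in edges}  # psi[(i, j)][x_i, x_j]
neighbors = {0: [1], 1: [0, 2], 2: [1]}

def f(x):
    """Objective: sum of unary and pairwise potentials."""
    unary = sum(phi[i, x[i]] for i in range(n))
    return unary + sum(psi[i, j][x[i], x[j]] for i, j in edges)

def table(i, j):
    """Pairwise potential indexed [x_i, x_j], whichever way the edge is stored."""
    return psi[i, j] if (i, j) in psi else psi[j, i].T

# msgs[i, j][x_j] is the message from variable i to its neighbor j.
msgs = {(i, j): np.zeros(k) for i in neighbors for j in neighbors[i]}
for _ in range(n):  # synchronous rounds; n rounds exceed the chain's diameter
    new = {}
    for i, j in msgs:
        incoming = phi[i] + sum(msgs[h, i] for h in neighbors[i] if h != j)
        new[i, j] = (table(i, j) + incoming[:, None]).min(axis=0)
    msgs = new

beliefs = np.array([phi[i] + sum(msgs[h, i] for h in neighbors[i]) for i in range(n)])
estimate = tuple(int(v) for v in beliefs.argmin(axis=1))

# Brute-force min-marginals and minimizer for comparison.
assignments = list(itertools.product(range(k), repeat=n))
min_marginals = np.array([[min(f(x) for x in assignments if x[i] == v)
                           for v in range(k)] for i in range(n)])
best = min(assignments, key=f)
print(estimate, best)
```

On this tree the beliefs coincide with the min-marginals, so the per-variable argmins recover the minimizer. On graphs with cycles neither property is guaranteed.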

  8. Previous Work
     - Prior work focused on two aspects of message passing algorithms
     - Convergence
       - Coordinate ascent schemes
       - Not necessarily local message passing algorithms
     - Correctness
       - No combinatorial characterization of failure modes
       - Concerned only with global optimality

  9. Contributions
     - A new local message passing algorithm
     - Parameterized family of message passing algorithms
     - Conditions under which the estimate produced by the splitting algorithm is guaranteed to be a global optimum
     - Conditions under which the estimate produced by the splitting algorithm is guaranteed to be a local optimum

  10. Contributions
      - What makes a graphical model “good”?
      - Combinatorial understanding of the failure modes of the splitting algorithm via graph covers
      - Can be extended to other iterative algorithms
      - Techniques for handling objective functions for which the known convergent algorithms fail
      - Reparameterization-centric approach

  11. Publications
      - Convergent and correct message passing schemes for optimization problems over graphical models. Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), July 2010
      - Fixing Max-Product: A Unified Look at Message Passing Algorithms (invited talk). Proceedings of the Forty-Eighth Annual Allerton Conference on Communication, Control, and Computing, September 2010
      - Unconstrained minimization of quadratic functions via min-sum. Proceedings of the Conference on Information Sciences and Systems (CISS), Princeton, NJ/USA, March 2010
      - Graph covers and quadratic minimization. Proceedings of the Forty-Seventh Annual Allerton Conference on Communication, Control, and Computing, September 2009
      - s-t paths using the min-sum algorithm. Proceedings of the Forty-Sixth Annual Allerton Conference on Communication, Control, and Computing, September 2008

  12. Outline
      - Reparameterizations
      - Lower Bounds
      - Convergent Message Passing
      - Finding a Minimizing Assignment
      - Graph covers
      - Quadratic Minimization

  13. The Problem
      - Minimize a real-valued objective function that factorizes as a sum of potentials
      - The potentials are indexed by a multiset whose elements are subsets of the indices 1,…,n

  14. Factorizations
      - Some factorizations are better than others
      - If x_i takes one of k values, this requires at most 2k^2 + k operations
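
The factorization shown on this slide was lost, but the 2k^2 + k count matches a three-variable chain with two pairwise potentials, where the minimization can be pushed inside the sum. The sketch below assumes that chain (the names `psi12` and `psi23` are hypothetical) and checks the factored minimum against brute force over all k^3 assignments.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
k = 4
psi12 = rng.normal(size=(k, k))  # psi12[x1, x2]
psi23 = rng.normal(size=(k, k))  # psi23[x2, x3]

# Brute force: examine all k**3 joint assignments.
brute = min(psi12[a, b] + psi23[b, c]
            for a, b, c in itertools.product(range(k), repeat=3))

# Factored: min over x1 for each x2 (k*k entries), min over x3 for each x2
# (k*k entries), then a final min over the k values of x2.
factored = (psi12.min(axis=0) + psi23.min(axis=1)).min()

print(brute, factored)
```

The two values agree because x1 and x3 interact only through x2, so each inner minimization can be carried out separately for every value of x2.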

  15. Factorizations
      - Some factorizations are better than others
      - Suppose
      - Only need k operations to compute the minimum value!

  16. Reparameterizations
      - We can rewrite the objective function as
      - This does not change the objective function as long as the messages are real-valued at each x
      - The objective function is reparameterized in terms of the messages
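
The rewritten objective did not survive extraction. One standard way to reparameterize a factorized objective with messages is sketched below; the notation (\phi_i, \psi_\alpha, \mathcal{A}, and m_{\alpha \to i} for the message from factor \alpha to variable i) is assumed, and the slide's exact expression may differ.

```latex
% Each message is added to a variable term and subtracted from the matching
% factor term, so the message contributions cancel and f is unchanged
% whenever every message value is finite.
f(x) \;=\; \sum_{i} \Big[ \phi_i(x_i) + \sum_{\alpha \in \partial i} m_{\alpha \to i}(x_i) \Big]
      \;+\; \sum_{\alpha \in \mathcal{A}} \Big[ \psi_\alpha(x_\alpha) - \sum_{k \in \alpha} m_{\alpha \to k}(x_k) \Big]
```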

  17. Reparameterizations
      - We can rewrite the objective function as
      - The reparameterization has the same factor graph as the original factorization
      - Many message passing algorithms produce a reparameterization upon convergence

  18. The Splitting Reparameterization
      - Let c be a vector of non-zero reals
      - If c is a vector of positive integers, then we could view this as a factorization in two ways:
        - Over the same factor graph as the original potentials
        - Over a factor graph where each potential has been “split” into several pieces

  19. The Splitting Reparameterization
      - [Figure: factor graph on variables 1, 2, 3]
      - [Figure: factor graph resulting from “splitting” each of the pairwise potentials 3 times]

  20. The Splitting Reparameterization
      - Beliefs:
      - Reparameterization:
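
Both formulas are missing from the extracted slide. The sketch below uses one form that appears consistent with the authors' published work on splitting, but treat it as an assumption: variable beliefs b_i = phi_i / c_i + sum over factors a containing i of c_a * m_{a->i}, factor beliefs b_a = psi_a / c_a + sum over k in a of c_k * (b_k - m_{a->k}), and f = sum_i c_i * b_i + sum_a c_a * (b_a - sum over k in a of c_k * b_k). The code checks numerically, on a hypothetical triangle model with arbitrary messages, that this expression reproduces f.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
k, n = 3, 3
factors = [(0, 1), (1, 2), (0, 2)]  # pairwise factors on a triangle
phi = rng.normal(size=(n, k))
psi = {a: rng.normal(size=(k, k)) for a in factors}
c_var = rng.uniform(0.5, 2.0, size=n)                  # c_i (assumed positive here)
c_fac = {a: rng.uniform(0.5, 2.0) for a in factors}    # c_a
msg = {(a, i): rng.normal(size=k) for a in factors for i in a}  # m_{a->i}(x_i)

def f(x):
    """Original objective."""
    return (sum(phi[i, x[i]] for i in range(n))
            + sum(psi[a][x[a[0]], x[a[1]]] for a in factors))

def b_var(i, xi):
    """Assumed variable belief b_i(x_i)."""
    return phi[i, xi] / c_var[i] + sum(
        c_fac[a] * msg[a, i][xi] for a in factors if i in a)

def b_fac(a, x):
    """Assumed factor belief b_a(x_a)."""
    return psi[a][x[a[0]], x[a[1]]] / c_fac[a] + sum(
        c_var[j] * (b_var(j, x[j]) - msg[a, j][x[j]]) for j in a)

def reparam(x):
    """Objective rebuilt from the beliefs alone."""
    total = sum(c_var[i] * b_var(i, x[i]) for i in range(n))
    total += sum(
        c_fac[a] * (b_fac(a, x) - sum(c_var[j] * b_var(j, x[j]) for j in a))
        for a in factors)
    return total

gap = max(abs(f(x) - reparam(x)) for x in itertools.product(range(k), repeat=n))
print(gap)
```

The message terms cancel for any finite messages and any non-zero c, so the identity does not depend on the messages having converged.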

  21. Outline
      - Reparameterizations
      - Lower Bounds
      - Convergent Message Passing
      - Finding a Minimizing Assignment
      - Graph covers
      - Quadratic Minimization

  22. Lower Bounds
      - Can lower bound the objective function with these reparameterizations:
      - Find the collection of messages that maximize this lower bound
      - Lower bound is a concave function of the messages
      - Use coordinate ascent or subgradient methods
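
The bound itself is missing from the extracted slide. The generic argument is sketched below with assumed names: g_i and g_\alpha stand for the message-dependent terms of any reparameterization of f, and \mathcal{A} for the collection of factors.

```latex
% If a reparameterization writes f as a sum of local terms,
f(x) \;=\; \sum_{i} g_i(x_i) + \sum_{\alpha \in \mathcal{A}} g_\alpha(x_\alpha),
% then minimizing each term separately can only decrease the total:
\min_{x} f(x) \;\ge\; \sum_{i} \min_{x_i} g_i(x_i) + \sum_{\alpha \in \mathcal{A}} \min_{x_\alpha} g_\alpha(x_\alpha).
% When each g depends affinely on the messages, every term on the right is a
% pointwise minimum of affine functions, so the bound is concave in the messages.
```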

  23. Lower Bounds and the MAP LP
      - Equivalent to minimizing f
      - Dual provides a lower bound on f
      - Messages are a side-effect of certain dual formulations

  24. Outline
      - Reparameterizations
      - Lower Bounds
      - Convergent Message Passing
      - Finding a Minimizing Assignment
      - Graph covers
      - Quadratic Minimization

  25. The Splitting Algorithm
      - A local message passing algorithm for the splitting reparameterization
      - Contains the min-sum algorithm as a special case
      - For the integer case, can be derived from the min-sum update equations

  26. The Splitting Algorithm
      - For certain choices of c, an asynchronous version of the splitting algorithm can be shown to be a block coordinate ascent scheme for the lower bound:
      - For example:

  27. Asynchronous Splitting Algorithm
      - [Figure: graph on nodes 1, 2, 3]

  28. Asynchronous Splitting Algorithm
      - [Figure: graph on nodes 1, 2, 3]

  29. Asynchronous Splitting Algorithm
      - [Figure: graph on nodes 1, 2, 3]

  30. Coordinate Ascent
      - Guaranteed to converge
      - Does not necessarily maximize the lower bound
        - Can get stuck in a suboptimal configuration
      - Can be shown to converge to the maximum in restricted cases
        - Pairwise-binary objective functions

  31. Other Ascent Schemes
      - Many other ascent algorithms are possible over different lower bounds:
        - TRW-S [Kolmogorov 2007]
        - MPLP [Globerson and Jaakkola 2007]
        - Max-Sum Diffusion [Werner 2007]
        - Norm-product [Hazan 2010]
      - Not all coordinate ascent schemes are local

  32. Outline
      - Reparameterizations
      - Lower Bounds
      - Convergent Message Passing
      - Finding a Minimizing Assignment
      - Graph covers
      - Quadratic Minimization

  33. Constructing the Solution
      - Construct an estimate, x*, of the optimal assignment from the beliefs by choosing
      - For certain choices of the vector c, if each argmin is unique, then x* minimizes f
      - A simple choice of c guarantees both convergence and correctness (if the argmins are unique)

  34. Correctness
      - If the argmins are not unique, then we may not be able to construct a solution
      - When does the algorithm converge to the correct minimizing assignment?

  35. Outline
      - Reparameterizations
      - Lower Bounds
      - Convergent Message Passing
      - Finding a Minimizing Assignment
      - Graph covers
      - Quadratic Minimization

  36. Graph Covers
      - A graph H covers a graph G if there is a homomorphism from H to G that is a bijection on neighborhoods
      - [Figure: graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]
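
The covering condition can be checked mechanically. The sketch below assumes G is a triangle and that its 2-cover is the 6-cycle 1, 2, 3, 1', 2', 3' (the figure did not survive, so this particular cover is an assumption). The helper names `adjacency`, `is_cover`, and `proj` are hypothetical.

```python
def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_cover(adj_h, adj_g, proj):
    """True if proj maps each neighborhood of H bijectively onto the
    neighborhood of the projected node in G."""
    for v, nbrs in adj_h.items():
        image = [proj(u) for u in nbrs]
        if len(set(image)) != len(image) or set(image) != adj_g[proj(v)]:
            return False
    return True

def proj(name):
    """Covering map: drop the prime, so both 1 and 1' map to node 1."""
    return name.rstrip("'")

triangle = adjacency([("1", "2"), ("2", "3"), ("1", "3")])
six_cycle = adjacency([("1", "2"), ("2", "3"), ("3", "1'"),
                       ("1'", "2'"), ("2'", "3'"), ("3'", "1")])
# Dropping one edge of the 6-cycle leaves a path, which is not a cover.
path = adjacency([("1", "2"), ("2", "3"), ("3", "1'"),
                  ("1'", "2'"), ("2'", "3'")])

print(is_cover(six_cycle, triangle, proj), is_cover(path, triangle, proj))
```

Locally the 6-cycle looks exactly like the triangle: every node has one neighbor of each other label. That local sameness is what a local message passing scheme cannot see past.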

  37. Graph Covers
      - Potential functions are “lifts” of the nodes they cover
      - [Figure: graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]

  38. Graph Covers
      - The lifted potentials define a new objective function
      - Objective function:
      - 2-cover objective function:

  39. Graph Covers
      - Indistinguishability: for any cover and any choice of initial messages on the original graph, there exists a choice of initial messages on the cover such that the messages passed by the splitting algorithm are identical on both graphs
      - For choices of c that guarantee correctness, any assignment that uniquely minimizes each belief must also minimize the objective function corresponding to any finite cover

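  40. Maximum Weight Independent Set
      - [Figure: graph G on nodes 1, 2, 3 and a 2-cover of G on nodes 1, 2, 3, 1’, 2’, 3’]

  41. Maximum Weight Independent Set
      - [Figure: graph G with node weights 2, 2, 5 and a 2-cover of G with the lifted weights]

  42. Maximum Weight Independent Set
      - [Figure: graph G with node weights 2, 2, 5 and a 2-cover of G with the lifted weights]

  43. Maximum Weight Independent Set
      - [Figure: graph G with node weights 2, 2, 3 and a 2-cover of G with the lifted weights]

  44. Maximum Weight Independent Set
      - [Figure: graph G with node weights 2, 2, 3 and a 2-cover of G with the lifted weights]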
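
The two weightings can be compared by brute force. The sketch below assumes G is a triangle whose third node carries the weight 5 or 3, and that the 2-cover is the 6-cycle 1, 2, 3, 1', 2', 3'; the figures were lost, so both choices are assumptions, and `mwis` and `compare` are hypothetical helper names.

```python
import itertools

def mwis(nodes, edges, weight):
    """Brute-force maximum weight independent set: returns the best total weight."""
    best = 0
    for r in range(len(nodes) + 1):
        for subset in itertools.combinations(nodes, r):
            chosen = set(subset)
            if any(u in chosen and v in chosen for u, v in edges):
                continue  # two adjacent nodes were picked, so not independent
            best = max(best, sum(weight[v] for v in subset))
    return best

g_nodes = ["1", "2", "3"]
g_edges = [("1", "2"), ("2", "3"), ("1", "3")]
h_nodes = ["1", "2", "3", "1'", "2'", "3'"]
h_edges = [("1", "2"), ("2", "3"), ("3", "1'"),
           ("1'", "2'"), ("2'", "3'"), ("3'", "1")]

def compare(w3):
    """Optimal values on G and on its 2-cover when node 3 has weight w3."""
    base = {"1": 2, "2": 2, "3": w3}
    lifted = {v: base[v.rstrip("'")] for v in h_nodes}  # copies inherit weights
    return mwis(g_nodes, g_edges, base), mwis(h_nodes, h_edges, lifted)

print(compare(5), compare(3))
```

With weight 5 the cover's optimum is twice the optimum on G, so it is a lift of the solution on G. With weight 3 the cover admits an independent set of three nodes that beats the lift, so the cover and G disagree about the solution.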

  45. More Graph Covers
      - If covers of the factor graph have different solutions
        - The splitting algorithm cannot converge to the correct answer for choices of c that guarantee correctness
        - The min-sum algorithm may converge to an assignment that is optimal on a cover
      - There are applications for which the splitting algorithm always works
        - Minimum cuts, shortest paths, and more…

  46. Graph Covers
      - Suppose f factorizes over a set with corresponding factor graph G and the choice of c guarantees correctness
      - Theorem: the splitting algorithm can only converge to beliefs that have unique argmins if
        - f is uniquely minimized at the assignment x*
        - The objective function corresponding to every finite cover H of G has a unique minimum that is a lift of x*
