Sum-Product Message Passing



  1. Sum-Product: Message Passing (Belief Propagation), Probabilistic Graphical Models, Sharif University of Technology, Spring 2018, Soleymani

  2. All single-node marginals  If we need the full set of marginals, repeating the elimination algorithm for each individual variable is wasteful: it does not share intermediate terms.  Message-passing algorithms on graphs share these intermediate terms (the messages are the shared intermediate terms): sum-product and the junction tree algorithm.  Upon convergence of these algorithms, we obtain marginal probabilities for all cliques of the original graph.

  3. Tree  Sum-product works only on trees (and, as we will see, it also works on tree-like graphs).  Directed tree: every node has one parent, except the root.  Undirected tree: there is a unique path between any pair of nodes.

  4. Parameterization  Consider a tree $\mathcal{U}(\mathcal{W}, \mathcal{E})$  Potential functions: $\varrho(y_j)$, $\varrho(y_j, y_k)$
$$Q(\mathbf{y}) = \frac{1}{a} \prod_{j \in \mathcal{W}} \varrho(y_j) \prod_{(j,k) \in \mathcal{E}} \varrho(y_j, y_k)$$
 In directed graphs: $Q(\mathbf{y}) = Q(y_s) \prod_{(j,k) \in \mathcal{E}} Q(y_k \mid y_j)$, so we can take:  $\varrho(y_s) = Q(y_s)$ and $\varrho(y_j) = 1$ for all $j \neq s$  $\varrho(y_j, y_k) = Q(y_k \mid y_j)$ (where $y_j$ is the parent of $y_k$)  $a = 1$  When we have evidence on variable $y_j$, say $y_j = \bar{y}_j$, we replace $y_j$ by $\bar{y}_j$ in all factors in which it appears
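As a concrete illustration of this parameterization, here is a minimal Python sketch (not from the slides): the tree structure (edges 1-2, 2-3, 2-4), the binary variables, and the potential tables are all made-up toy values chosen for illustration.

```python
import itertools
import numpy as np

# Hypothetical toy tree 1-2, 2-3, 2-4 with binary variables; the potential
# values below are assumptions for illustration, not taken from the slides.
nodes = [1, 2, 3, 4]
phi = {j: np.array([1.0, 2.0]) for j in nodes}          # node potentials phi(y_j)
psi = {e: np.array([[2.0, 1.0], [1.0, 2.0]])            # edge potentials phi(y_j, y_k)
       for e in [(1, 2), (2, 3), (2, 4)]}

def unnormalized(y):
    """prod_j phi(y_j) * prod_{(j,k) in E} psi(y_j, y_k) for assignment y (dict node -> value)."""
    p = 1.0
    for j in nodes:
        p *= phi[j][y[j]]
    for (j, k), table in psi.items():
        p *= table[y[j], y[k]]
    return p

# Normalization constant a: sum of the unnormalized product over all assignments.
a = sum(unnormalized(dict(zip(nodes, ys)))
        for ys in itertools.product([0, 1], repeat=len(nodes)))

def Q(y):
    """Q(y) = (1/a) * prod of node potentials * prod of edge potentials."""
    return unnormalized(y) / a
```

Enumerating all $2^4$ assignments like this is exactly the brute force that message passing avoids; it is shown here only to make the factorized form concrete.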

  5. Sum-product: elimination view  Query node $s$  Elimination order: inverse of the topological order  Starts from the leaves and generates elimination cliques of size at most two  Elimination of each node can be considered as message passing (or belief propagation):  Elimination on trees is equivalent to message passing along tree branches  Instead of eliminating a node, we preserve the node and compute a message from it to its parent  This message is equivalent to the factor resulting from the elimination of that node and all of the nodes in its subtree

  6. Messages  A node can send a message to its neighbors when (and only when) it has received messages from all of its other neighbors. [Figure: the message $n_{kj}$ that node $k$ sends to node $j$, directed toward the root]

  7. Messages and marginal distribution  The message that node $k$ sends to node $j$ (a function of $y_j$ only, where $\mathcal{O}(k)$ denotes the neighbors of $k$):
$$n_{kj}(y_j) = \sum_{y_k} \varrho(y_k)\, \varrho(y_j, y_k) \prod_{l \in \mathcal{O}(k) \setminus j} n_{lk}(y_k)$$
 The marginal at the query node $s$:
$$q(y_s) \propto \varrho(y_s) \prod_{l \in \mathcal{O}(s)} n_{ls}(y_s)$$

  8. Messages and marginal: Example  Compute $q(y_1)$:
$$q(y_1) \propto \varrho(y_1)\, n_{21}(y_1)$$
$$n_{21}(y_1) = \sum_{y_2} \varrho(y_2)\, \varrho(y_1, y_2)\, n_{32}(y_2)\, n_{42}(y_2)$$
(the product of the factors remaining after eliminating all variables except $y_1$)
$$n_{32}(y_2) = \sum_{y_3} \varrho(y_3)\, \varrho(y_2, y_3) \qquad n_{42}(y_2) = \sum_{y_4} \varrho(y_4)\, \varrho(y_2, y_4)$$

  9. Messages and marginal: Example  Compute $q(y_2)$:
$$q(y_2) \propto \varrho(y_2)\, n_{12}(y_2)\, n_{32}(y_2)\, n_{42}(y_2)$$
$$n_{12}(y_2) = \sum_{y_1} \varrho(y_1)\, \varrho(y_1, y_2)$$
$$n_{32}(y_2) = \sum_{y_3} \varrho(y_3)\, \varrho(y_2, y_3) \qquad n_{42}(y_2) = \sum_{y_4} \varrho(y_4)\, \varrho(y_2, y_4)$$

  10. Messages on a tree  Messages can be reused to find probabilities of different query variables.  Messages on the tree provide a data structure for caching computations. [Figure: tree on $Y_1, \dots, Y_5$; we need $n_{32}(y_2)$ to find both $Q(Y_1)$ and $Q(Y_2)$]

  11. From elimination to message passing  Recall the ELIMINATION algorithm:  Choose an ordering $Z$ in which the query node $f$ is the final node  Place all potentials on an active list  Eliminate node $i$ by removing all potentials containing $i$ and summing over $x_i$  Place the resulting factor back on the list  For a TREE graph:  Choose query node $f$ as the root of the tree  View the tree as a directed tree with edges pointing from $f$ towards the leaves  Elimination ordering based on reverse topological order  Elimination of each node can be considered as message passing directly along tree branches  Thus, we can use the tree itself as a data structure to do general inference! (This slide has been adapted from Eric Xing, PGM 10-708, CMU.)

  12. Computing all node marginals  We can cover all possible elimination orderings (generating only elimination cliques of size 2) by computing all possible messages ($2|\mathcal{E}|$ of them)  To allow any node to serve as the root, we just need to compute $2|\mathcal{E}|$ messages  Messages can be reused, instead of running the elimination algorithm once per query node  Dynamic programming approach: a 2-pass algorithm that saves and reuses messages  A pair of messages (one for each direction) is computed for each edge

  13. Messages required to compute all node marginals

  14. Computing node marginals  Naïve approach:  Complexity: $N \times C$, where $N$ is the number of nodes and $C$ is the complexity of a complete message passing  Alternative dynamic programming approach:  2-pass algorithm  Complexity: $2C$!

  15. A two-pass message-passing schedule  Arbitrarily pick a node as the root  First pass: starts at the leaves and proceeds inward  each node passes a message to its parent  continues until the root has obtained messages from all of its adjoining nodes  Second pass: starts at the root and passes the messages back out  messages are passed in the reverse direction  continues until all leaves have received their messages
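The two-pass schedule above can be sketched as follows, again on an assumed toy model (tree 1-2, 2-3, 2-4, binary variables, made-up potentials):

```python
import numpy as np
from collections import deque

# Assumed toy tree and potentials, as elsewhere in these notes.
neighbors = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}
phi = {j: np.array([1.0, 2.0]) for j in neighbors}
psi = {frozenset(e): np.array([[2.0, 1.0], [1.0, 2.0]])
       for e in [(1, 2), (2, 3), (2, 4)]}

def two_pass(root=1):
    """2-pass sum-product: returns the exact marginal of every node."""
    # Order nodes root-to-leaves by BFS, recording each node's parent.
    order, parent = [], {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in neighbors[u]:
            if v != parent[u]:
                parent[v] = u
                queue.append(v)

    msgs = {}
    def send(k, j):
        # n_kj(y_j) = sum_{y_k} phi(y_k) psi(y_j, y_k) prod_{l != j} n_lk(y_k)
        prod = phi[k].copy()
        for l in neighbors[k]:
            if l != j:
                prod *= msgs[(l, k)]
        msgs[(k, j)] = psi[frozenset((j, k))] @ prod

    for u in reversed(order):           # first pass: leaves inward to the root
        if parent[u] is not None:
            send(u, parent[u])
    for u in order:                     # second pass: root back out to the leaves
        for v in neighbors[u]:
            if v != parent[u]:
                send(u, v)

    # Node marginals: local potential times all incoming messages, normalized.
    marg = {}
    for s in neighbors:
        b = phi[s].copy()
        for l in neighbors[s]:
            b *= msgs[(l, s)]
        marg[s] = b / b.sum()
    return marg
```

Because the model is a tree, the resulting marginals are exact and do not depend on which node was picked as the root.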

  16. Asynchronous two-pass message passing [Figure: first pass, upward; second pass, downward]

  17. Sum-product algorithm: example [Figure: computing the message $n_{21}(y_1)$]

  18. Sum-product algorithm: example [Figure: the message $n_{21}(y_1)$]

  19. Parallel (synchronous) message-passing  For a node of degree $d$, whenever messages have arrived on any $d-1$ of its edges, compute the message for the remaining edge and send it!  A pair of messages is computed for each edge, one for each direction  All incoming messages are eventually computed for each node

  20. Parallel message-passing  Message-passing protocol: a node can send a message to a neighboring node when and only when it has received messages from all of its other neighbors  Correctness of parallel message passing on trees:  The synchronous implementation is "non-blocking"  Theorem: this message-passing protocol is guaranteed to obtain all marginals in the tree
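One way to sketch the "send when $d-1$ messages have arrived" protocol is to sweep repeatedly over all directed edges; the sweep loop below stands in for truly parallel sends, and the tree and potentials are the same assumed toy model:

```python
import numpy as np

# Assumed toy tree (edges 1-2, 2-3, 2-4) with made-up binary potentials.
neighbors = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}
phi = {j: np.array([1.0, 2.0]) for j in neighbors}
psi = {frozenset(e): np.array([[2.0, 1.0], [1.0, 2.0]])
       for e in [(1, 2), (2, 3), (2, 4)]}

def run_protocol():
    """Repeatedly sweep the directed edges; send n_kj once node k has heard
    from all of its neighbors other than j. On a tree every message is
    eventually sent, so the loop terminates with 2|E| messages."""
    msgs = {}
    pending = [(k, j) for k in neighbors for j in neighbors[k]]
    while pending:
        still_waiting = []
        for (k, j) in pending:
            others = [l for l in neighbors[k] if l != j]
            if all((l, k) in msgs for l in others):      # non-blocking send rule
                prod = phi[k].copy()
                for l in others:
                    prod *= msgs[(l, k)]
                msgs[(k, j)] = psi[frozenset((j, k))] @ prod
            else:
                still_waiting.append((k, j))
        if len(still_waiting) == len(pending):           # no progress: not a tree
            raise RuntimeError("message passing blocked; graph is not a tree")
        pending = still_waiting
    return msgs
```

In the first sweep only the leaves can send; each later sweep unlocks the nodes whose subtrees have finished, mirroring the upward-then-downward order of the two-pass schedule.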

  21. Parallel message passing: Example

  22. Tree-like graphs  The sum-product message-passing idea can also be extended to work on tree-like graphs (e.g., polytrees).  Although the moralized undirected graph resulting from a polytree is not a tree, the corresponding factor graph is a tree. [Figure: a polytree (nodes can have multiple parents), its moralized graph, and its factor graph]

  23. References  D. Koller and N. Friedman, "Probabilistic Graphical Models: Principles and Techniques", MIT Press, 2009, Chapter 10.  M. I. Jordan, "An Introduction to Probabilistic Graphical Models", Chapter 4.
