Junction-tree algorithm

  1. Junction-tree algorithm
     Probabilistic Graphical Models, Sharif University of Technology, Spring 2018, Soleymani

  2. Inference on general GM
      Now, what if the GM is not a tree-like graph?
      Can we still directly run the message-passing protocol along its edges?
      For non-trees, we have no guarantee that message passing will be consistent!
      Idea: construct a graph data structure from P that has a tree structure, and run message passing on it: the junction tree algorithm.

  3. Junction-tree algorithm: a general approach
      Unlike sum-product on trees, the junction tree algorithm can be applied to general graphs.
      Unlike the elimination algorithm, it is not "query-oriented": it records and reuses the intermediate factors to answer multiple queries simultaneously.
      Upon convergence of the algorithm, we obtain marginal probabilities for all cliques of the original graph.

  4. Example: variable elimination and cluster tree
     [Figure: the example graph over Y₁, …, Y₆ and its moralized graph]
      Elimination order: Y₆, Y₅, Y₄, Y₃, Y₂
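The elimination step on this example can be sketched in a few lines of Python. The edge list below is an assumption chosen to match the running example over Y₁, …, Y₆ (it reproduces the elimination cliques shown on the later slides), not a transcription of the figure:

```python
from itertools import combinations

def elimination_cliques(edges, order):
    """Graph-theoretic variable elimination: for each eliminated node,
    record its elimination clique (the node plus its current neighbours),
    then connect those neighbours with fill-in edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    cliques = []
    for x in order:
        nbrs = adj.pop(x, set())
        cliques.append(frozenset(nbrs | {x}))
        for u, v in combinations(nbrs, 2):   # fill-in edges
            adj[u].add(v)
            adj[v].add(u)
        for u in nbrs:
            adj[u].discard(x)
    return cliques

# Assumed moralized graph over Y1..Y6 and the slides' elimination order
edges = [(1, 2), (1, 3), (2, 4), (3, 5), (2, 6), (5, 6), (2, 5)]
cliques = elimination_cliques(edges, [6, 5, 4, 3, 2, 1])
print(cliques)  # {2,5,6}, {2,3,5}, {2,4}, {1,2,3}, {1,2}, {1}
```

The cliques come out in elimination order; the maximal ones among them are exactly the cliques that appear in the clique tree on the following slides.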

  5. Example: elimination cliques
      Elimination order: Y₆, Y₅, Y₄, Y₃, Y₂
     [Figure: the graph after each elimination step, with the elimination clique highlighted at each step]

  6. Example: clique tree obtained by VE
      The clique tree contains the cliques (fully connected subsets) generated as elimination executes.
      The cluster graph induced by an execution of VE is necessarily a tree:
      indeed, after an elimination, the corresponding elimination clique cannot reappear.
      Elimination order: Y₆, Y₅, Y₄, Y₃, Y₂
     [Figure: the clique tree over Y₁, …, Y₆]

  7. Example: clique tree obtained by VE
      Same example, keeping only the maximal cliques:
      the graph induced by an execution of VE is necessarily a tree, since after an elimination the corresponding elimination clique cannot reappear.
      Elimination order: Y₆, Y₅, Y₄, Y₃, Y₂
     [Figure: the clique tree reduced to maximal cliques]

  8. Example: elimination ≡ message passing on a clique tree
     This slide has been adapted from Eric Xing, PGM 10-708, CMU.

  9. Computation reuse
      Another query ...
      Messages n_g and n_h are reused; the others must be recomputed.
     This slide has been adapted from Eric Xing, PGM 10-708, CMU.

  10. Cluster tree
      A cluster tree is a singly connected graph (i.e., exactly one path between each pair of nodes) in which the nodes are clusters of variables of an underlying graph.
      A separator set is defined for each linked pair of clusters and contains the variables in the intersection of the two clusters.
     [Figure: clusters {Y_B, Y_C} and {Y_C, Y_D} linked through the separator set {Y_C}]
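The separator sets are just pairwise intersections, which makes them trivial to compute once the cluster tree's edges are known. A minimal sketch, using the slide's {Y_B, Y_C} / {Y_C, Y_D} example:

```python
def separators(cluster_edges):
    """Separator set of each linked pair of clusters = their intersection."""
    return {(a, b): a & b for a, b in cluster_edges}

B, C, D = "B", "C", "D"
tree_edges = [(frozenset({B, C}), frozenset({C, D}))]
print(separators(tree_edges))  # the single separator is {C}
```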

  11. Junction tree property
      Junction tree property: if a variable appears in two cliques of the tree, it must appear in all cliques on the path connecting them.
      For every pair of cliques D_j and D_k, all cliques on the path between D_j and D_k contain T_jk = D_j ∩ D_k.
      This is also called the running intersection property.
      A cluster tree that satisfies the running intersection property is called a junction tree.
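The running intersection property can be checked mechanically. The sketch below tests every pair of cliques against the cliques on the path between them; the clique list is assumed to be the running example's maximal cliques:

```python
from itertools import combinations

def has_running_intersection(cliques, tree_edges):
    """Check the running intersection property on a cluster tree:
    for every pair of cliques, each clique on the path between them
    must contain their intersection.  `tree_edges` is assumed to form
    a tree over indices into `cliques`."""
    adj = {i: [] for i in range(len(cliques))}
    for i, j in tree_edges:
        adj[i].append(j)
        adj[j].append(i)

    def path(i, j, seen=None):
        if i == j:
            return [i]
        seen = seen or {i}
        for k in adj[i]:
            if k not in seen:
                p = path(k, j, seen | {k})
                if p:
                    return [i] + p
        return None

    for i, j in combinations(range(len(cliques)), 2):
        common = cliques[i] & cliques[j]
        if any(not common <= cliques[k] for k in path(i, j)):
            return False
    return True

# Maximal cliques of the running example (assumed from the slides)
cliques = [frozenset({1, 2, 3}), frozenset({2, 3, 5}),
           frozenset({2, 5, 6}), frozenset({2, 4})]
print(has_running_intersection(cliques, [(0, 1), (1, 2), (1, 3)]))  # True
print(has_running_intersection(cliques, [(0, 2), (2, 1), (1, 3)]))  # False
```

The second call fails because the path from {Y₁,Y₂,Y₃} to {Y₂,Y₃,Y₅} passes through {Y₂,Y₅,Y₆}, which does not contain Y₃.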

  12. Clique tree usefulness
      A clique tree provides a structure for caching computations:
      multiple queries can be answered much more efficiently than by running VE for each one separately.
      It dictates a partial order over the operations performed on factors, yielding better computational complexity.

  13. Theorem
      The tree induced by a variable elimination run satisfies the running intersection property.
      Proof sketch:
      Let D and D′ be two clusters that contain Y, and let D_Y be the cluster where Y is eliminated. We show that Y must be present in every clique on the path between D and D_Y (and similarly on the path between D_Y and D′).
      Idea: the computation at D_Y must happen later than the computations at D and D′, so Y stays in every intermediate factor along the way until it is eliminated at D_Y.

  14. Clique trees from variable elimination
      Each clique in the clique tree induced by VE is also a clique in the induced graph, and vice versa.
      However, for inference we can reduce the clique tree to contain only the maximal cliques of the induced graph.
     [Figure: the induced graph and the clique tree before and after keeping only the maximal cliques]

  15. Triangulated graphs
      Which class of graphs has a junction tree?
      A triangulated (or chordal) graph contains no chordless cycle of four or more nodes.
      Triangulation is the necessary and sufficient condition for a graph to have a junction tree:
      only triangulated graphs admit a cluster tree that is a junction tree.
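Chordality can be tested in practice with maximum cardinality search: MCS yields a candidate perfect elimination ordering, and the graph is chordal iff that ordering really is one (the Tarjan-Yannakakis test). A sketch, not mentioned on the slides but standard:

```python
def is_chordal(adj):
    """Chordality test via maximum cardinality search (MCS): the
    reverse MCS order is a perfect elimination ordering iff the graph
    is chordal, so we build it and then verify it."""
    weight = {v: 0 for v in adj}
    mcs, unnumbered = [], set(adj)
    while unnumbered:
        v = max(unnumbered, key=lambda u: weight[u])
        mcs.append(v)
        unnumbered.discard(v)
        for u in adj[v]:
            if u in unnumbered:
                weight[u] += 1
    order = mcs[::-1]                       # candidate elimination order
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = {u for u in adj[v] if pos[u] > pos[v]}
        if later:
            u = min(later, key=pos.__getitem__)
            if not (later - {u}) <= (adj[u] | {u}):
                return False                # missing chord found
    return True

# A 4-cycle without a chord is not chordal; adding the chord fixes it.
c4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(is_chordal(c4))                       # False
c4[1].add(3); c4[3].add(1)
print(is_chordal(c4))                       # True
```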

  16. Elimination algorithm: triangulation
      Every graph induced by variable elimination is chordal.
      Conversely, for any chordal graph there is an elimination ordering that adds no fill-in edges.
     [Figure: the moralized graph and the chordal induced graph for the example elimination order]
      In general, finding the best triangulation is NP-hard, but good heuristics exist.
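Triangulation by elimination just means keeping the fill-in edges. A sketch, reusing the assumed edge list of the running example (eliminating Y₅ connects its remaining neighbours Y₂ and Y₃, adding the fill-in edge 2-3):

```python
from itertools import combinations

def triangulate(edges, order):
    """Build the induced graph of an elimination run: every fill-in
    edge added while eliminating is kept, so the result is chordal."""
    adj = {v: set() for pair in edges for v in pair}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    work = {v: set(nb) for v, nb in adj.items()}   # scratch copy
    for x in order:
        nbrs = work.pop(x)
        for u, v in combinations(nbrs, 2):
            if v not in work[u]:                   # fill-in edge
                adj[u].add(v); adj[v].add(u)
                work[u].add(v); work[v].add(u)
        for u in nbrs:
            work[u].discard(x)
    return adj

edges = [(1, 2), (1, 3), (2, 4), (3, 5), (2, 6), (5, 6), (2, 5)]
tri = triangulate(edges, [6, 5, 4, 3, 2, 1])
print(sorted(tri[2]))  # fill-in edge 2-3 was added: [1, 3, 4, 5, 6]
```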

  17. Building a junction tree
      Different junction trees are obtained from different triangulations,
      i.e., from different elimination orders (and different maximum spanning trees).
      Complexity of junction tree algorithms:
      time and space complexity is dominated by the size of the largest clique in the junction tree (exponential in that size).
      Finding the junction tree with the smallest cliques is an NP-hard problem
      (equivalently, finding the optimal ordering for the elimination algorithm is NP-hard),
      but for many graphs an optimal or near-optimal ordering can be found heuristically.

  18. Junction-tree construction
      Construct the undirected (moralized) graph.
      Triangulate the graph,
      e.g., take the induced graph resulting from VE with a specified elimination order of the nodes.
      Find the set of maximal elimination cliques of the triangulated graph.
      Build a weighted, complete graph H over these maximal cliques,
      weighting each edge between cliques D_j and D_k by |D_j ∩ D_k|.
      Find a maximum spanning tree of H: it is a junction tree.
      A cluster tree is a junction tree iff it is a maximum spanning tree of H.
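The steps above can be sketched end to end. The code below implements the maximum-spanning-tree step with Kruskal's algorithm over separator sizes (a union-find tracks connected components); the clique list is assumed to be the running example's maximal cliques:

```python
from itertools import combinations

def junction_tree(cliques):
    """Build a junction tree as a maximum spanning tree of the complete
    graph over cliques, weighted by separator size |D_j ∩ D_k|."""
    parent = list(range(len(cliques)))

    def find(i):                       # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    candidates = sorted(combinations(range(len(cliques)), 2),
                        key=lambda e: -len(cliques[e[0]] & cliques[e[1]]))
    tree = []
    for i, j in candidates:            # Kruskal: heaviest edges first
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

cliques = [frozenset({1, 2, 3}), frozenset({2, 3, 5}),
           frozenset({2, 5, 6}), frozenset({2, 4})]
tree = junction_tree(cliques)
for i, j in tree:
    print(sorted(cliques[i]), "-", sorted(cliques[i] & cliques[j]), "-", sorted(cliques[j]))
```

On this input the separators of the resulting tree are {Y₂,Y₃}, {Y₂,Y₅}, and {Y₂}, matching the example on the next slide.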

  19. Junction tree construction: example
     [Figure: the weighted complete graph over the maximal cliques {Y₁,Y₂,Y₃}, {Y₂,Y₃,Y₅}, {Y₂,Y₅,Y₆}, {Y₂,Y₄}, with edges labeled by the separators {Y₂,Y₃}, {Y₂,Y₅}, {Y₂}, and the resulting maximum spanning tree]

  20. Junction tree algorithm
      Given a factorized probability distribution Q with Markov network I, build a junction tree U based on I.
      For each clique, find the marginal probability over the variables in that clique.
      Two message-passing variants:
       Sum-product message passing (Shafer-Shenoy algorithm): run a message-passing algorithm on the junction tree constructed according to the distribution.
       Belief update, i.e., local consistency preservation (Hugin algorithm): rescaling (update) equations.

  21. Junction tree algorithm: inference
      Junction tree inference is message passing on a junction tree structure.
      Each clique starts with a set of initial factors.
      Each clique sends one message to each neighbor, following a schedule.
      Finally, for each clique, the marginal over its variables is computed.

  22. Junction tree algorithm: inference
      Junction tree inference is message passing on a junction tree structure.
      Each clique starts with a set of initial factors:
      we assign each factor of the distribution Q to one and only one clique of U whose variable set covers the scope of the factor.
      ω_j = ∏_{ϕ ∈ G_j} ϕ
      where G_j denotes the set of factors assigned to clique D_j.
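The assignment step can be sketched as follows; "first covering clique wins" is one possible policy (the slide only requires that each factor go to exactly one covering clique), and the factor scopes below are hypothetical:

```python
def assign_factors(cliques, factor_scopes):
    """Assign each factor to exactly one clique whose variable set
    covers the factor's scope (first-match policy)."""
    assignment = {i: [] for i in range(len(cliques))}
    for f, scope in enumerate(factor_scopes):
        for i, clique in enumerate(cliques):
            if scope <= clique:
                assignment[i].append(f)
                break
        else:
            raise ValueError(f"no clique covers factor {f}")
    return assignment

cliques = [frozenset({1, 2, 3}), frozenset({2, 3, 5}),
           frozenset({2, 5, 6}), frozenset({2, 4})]
scopes = [frozenset({1}), frozenset({1, 2}), frozenset({1, 3}),
          frozenset({2, 4}), frozenset({3, 5}), frozenset({2, 5, 6})]
print(assign_factors(cliques, scopes))
```

The product of the factors assigned to clique D_j is its initial potential ω_j.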

  23. Junction tree algorithm: inference
      Junction tree inference is message passing on a junction tree structure.
      Each clique starts with a set of initial factors.
      Each clique sends one message to each neighbor, following a schedule:
      it multiplies the incoming messages with its own potential, sums out the variables not in the separator, and sends the result as its outgoing message.
      n_jk(T_jk) = Σ_{D_j − T_jk} ω_j ∏_{l ∈ N(j) − {k}} n_lj(T_lj)
      where T_jk = D_j ∩ D_k is the separator between cliques D_j and D_k, and N(j) is the set of neighbors of D_j.
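The message equation above can be run on a toy example. The sketch below (binary variables, dict-based factor tables, a two-clique tree over hypothetical potentials ψ_A(x, y) and ψ_B(y, z)) is a minimal Shafer-Shenoy pass, assuming each clique potential's scope equals its clique:

```python
from itertools import product

# A factor is (vars, table): table maps assignments of vars (tuples
# over {0,1}) to nonnegative reals.

def multiply(f, g):
    fv, ft = f
    gv, gt = g
    vs = tuple(dict.fromkeys(fv + gv))          # union, order-preserving
    table = {}
    for a in product((0, 1), repeat=len(vs)):
        idx = dict(zip(vs, a))
        table[a] = ft[tuple(idx[v] for v in fv)] * gt[tuple(idx[v] for v in gv)]
    return (vs, table)

def marginalize(f, keep):
    fv, ft = f
    vs = tuple(v for v in fv if v in keep)
    table = {}
    for a, val in ft.items():
        idx = dict(zip(fv, a))
        key = tuple(idx[v] for v in vs)
        table[key] = table.get(key, 0.0) + val
    return (vs, table)

def shafer_shenoy(potentials, tree):
    """Shafer-Shenoy on a junction tree: n_jk = sum over D_j - T_jk of
    w_j times the messages from all neighbours of j except k; the
    belief of clique j is w_j times all its incoming messages."""
    nbrs = {j: set() for j in potentials}
    for j, k in tree:
        nbrs[j].add(k); nbrs[k].add(j)
    msgs = {}
    pending = [(j, k) for j in potentials for k in nbrs[j]]
    while pending:
        for j, k in pending:                    # send when inputs ready
            if all((l, j) in msgs for l in nbrs[j] - {k}):
                f = potentials[j]
                for l in nbrs[j] - {k}:
                    f = multiply(f, msgs[(l, j)])
                sep = set(potentials[j][0]) & set(potentials[k][0])
                msgs[(j, k)] = marginalize(f, sep)
                pending.remove((j, k))
                break
        else:
            break
    beliefs = {}
    for j in potentials:
        b = potentials[j]
        for l in nbrs[j]:
            b = multiply(b, msgs[(l, j)])
        beliefs[j] = b
    return beliefs

pots = {"A": (("x", "y"), {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}),
        "B": (("y", "z"), {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.0})}
beliefs = shafer_shenoy(pots, [("A", "B")])
za = sum(beliefs["A"][1].values())
print({a: v / za for a, v in beliefs["A"][1].items()})  # P(x, y)
```

After the pass, both clique beliefs are unnormalized marginals with the same normalization constant, and they agree on the shared variable y, which is exactly the calibration property the junction tree guarantees.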
