Junction-tree algorithm Probabilistic Graphical Models Sharif University of Technology Spring 2018 Soleymani
Inference on general GMs
What if the graphical model (GM) is not a tree? Can we still run the message-passing protocol directly along its edges? For non-trees, there is no guarantee that message passing will be consistent. The remedy is to construct from P a graph data structure that does have a tree structure, and to run message passing on that structure. This is the junction tree algorithm.
Junction-tree algorithm: a general approach
Unlike sum-product on trees, the junction tree algorithm can be applied to general graphs. Unlike the elimination algorithm, it is not "query-oriented": it records the intermediate factors and reuses them to answer multiple queries simultaneously. Once all messages have been passed, we obtain marginal probabilities for every clique of the junction tree, and hence for every clique of the original graph.
Example: variable elimination and cluster tree
[Figure: the example graph over X_1, ..., X_6 and its moralized graph]
Elimination order: X_6, X_5, X_4, X_3, X_2
Example: elimination cliques
Elimination order: X_6, X_5, X_4, X_3, X_2
[Figure: the graph over X_1, ..., X_6 at successive elimination steps]
Example: clique tree obtained by VE
The clique tree contains the cliques (fully connected subsets) generated as elimination executes. The cluster graph induced by an execution of VE is necessarily a tree: each elimination clique is generated once and does not reappear, so each intermediate factor is passed on to exactly one later step.
Elimination order: X_6, X_5, X_4, X_3, X_2
[Figure: the graph over X_1, ..., X_6 with its elimination cliques; the maximal cliques are marked]
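The elimination cliques for this example can be reproduced with a few lines of Python. This is a minimal sketch, not the lecture's own code: the edge list of the moralized graph is an assumption (the figure is not available in the text), chosen to be consistent with the cliques listed on the construction-example slide, and X_1 is appended to the elimination order so that every node is removed. The function name `elimination_cliques` is hypothetical.

```python
from itertools import combinations

def elimination_cliques(edges, order):
    """Eliminate nodes in `order`; return (elimination cliques, fill edges)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    cliques, fill = [], []
    for node in order:
        nbrs = adj[node]
        # The eliminated node together with its current neighbours.
        cliques.append(frozenset(nbrs | {node}))
        # Connect the neighbours pairwise; new edges are fill edges.
        for u, v in combinations(sorted(nbrs), 2):
            if v not in adj[u]:
                adj[u].add(v)
                adj[v].add(u)
                fill.append((u, v))
        for u in nbrs:
            adj[u].discard(node)
        del adj[node]
    return cliques, fill

# ASSUMED moralized graph over X_1..X_6 (node i stands for X_i).
edges = [(1, 2), (1, 3), (2, 4), (3, 5), (2, 5), (2, 6), (5, 6)]
order = [6, 5, 4, 3, 2, 1]  # order from the slides, with X_1 appended last

cliques, fill = elimination_cliques(edges, order)
# Keep only cliques that are not strictly contained in another clique.
maximal = [c for c in cliques if not any(c < d for d in cliques)]
print([sorted(c) for c in cliques])
print(fill)
print([sorted(c) for c in maximal])
```

Under the assumed edges, the only fill edge is between X_2 and X_3, and the maximal cliques are the four that appear in the construction example.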
Example: Elimination ≡ message passing on a clique tree
This slide has been adapted from Eric Xing, PGM 10-708, CMU.
Computation reuse
Another query ... Messages m_f and m_g are reused; the others need to be recomputed.
This slide has been adapted from Eric Xing, PGM 10-708, CMU.
Cluster tree
A cluster tree is a singly connected graph (i.e., there is exactly one path between each pair of nodes) whose nodes are clusters of variables of an underlying graph. A separator set is defined for each linked pair of clusters and contains the variables in the intersection of the two clusters.
Example: for the chain X_A - X_B - X_C, the clusters are {X_A, X_B} and {X_B, X_C}, and the separator set between them is {X_B}.
Junction tree property
Junction tree property: if a variable appears in two cliques of the tree, it must appear in all cliques on the path connecting them. Equivalently, for every pair of cliques C_i and C_j, all cliques on the path between C_i and C_j contain S_ij = C_i ∩ C_j. This is also called the running intersection property. A cluster tree that satisfies the running intersection property is called a junction tree.
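The property can be checked mechanically. The sketch below uses an equivalent formulation: on a tree, the running intersection property holds exactly when, for every variable, the cliques containing it form a connected subtree. The function name and both example trees are illustrative assumptions; the cliques are the four maximal cliques of the construction example in this deck, with node i standing for X_i.

```python
def has_running_intersection(cliques, tree_edges):
    """True if, for every variable, the cliques containing it are connected in the tree."""
    nbrs = {i: set() for i in range(len(cliques))}
    for i, j in tree_edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    for var in set().union(*cliques):
        holders = {i for i, c in enumerate(cliques) if var in c}
        # Depth-first search restricted to cliques that contain `var`.
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in nbrs[i]:
                if j in holders and j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != holders:
            return False
    return True

cliques = [{1, 2, 3}, {2, 3, 5}, {2, 5, 6}, {2, 4}]
good_tree = [(0, 1), (1, 2), (0, 3)]  # assumed tree; separators {2,3}, {2,5}, {2}
bad_tree = [(0, 2), (2, 1), (1, 3)]   # X_3 is missing on the path between cliques 0 and 1

print(has_running_intersection(cliques, good_tree))
print(has_running_intersection(cliques, bad_tree))
```

The second tree fails because X_3 appears in {X_1, X_2, X_3} and {X_2, X_3, X_5} but not in {X_2, X_5, X_6}, which lies on the path between them.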
Clique tree usefulness
A clique tree provides a structure for caching computations: multiple queries can be answered much more efficiently than by running VE separately for each one. It also dictates a partial order over the operations performed on factors, which leads to better computational complexity.
Theorem
The tree induced by a variable elimination algorithm satisfies the running intersection property.
Proof: Let C and C' be two clusters that contain X, and let C_X be the cluster in which X is eliminated. We prove that X must be present in every clique on the path between C and C_X (and similarly on the path between C_X and C').
Idea: the computation at C_X must happen later than the computation at C or C'. Every intermediate factor whose scope contains X keeps X in its scope until X is summed out at C_X, so every clique on the path from C towards C_X still contains X.
Clique trees from variable elimination
Each clique in the clique tree induced by VE is a clique in the induced graph, and every maximal clique of the induced graph appears as an elimination clique. For inference, however, we can reduce the clique tree so that it contains only the maximal cliques of the induced graph.
[Figure: the graph over X_1, ..., X_6 at successive elimination steps]
Triangulated graphs
Which class of graphs has a junction tree? A triangulated (or chordal) graph is a graph in which every cycle of four or more nodes has a chord. Triangulation is a necessary and sufficient condition for a graph to have a junction tree: only triangulated graphs admit a clique tree that is a junction tree.
Elimination algorithm: triangulation
Every induced graph (produced by variable elimination) is chordal. Conversely, for any chordal graph there is an elimination ordering that does not add any fill edges.
[Figure: moralized graph and induced graph over X_1, ..., X_6]
In general, finding the best triangulation is NP-hard, but some good heuristics exist.
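One widely used heuristic is greedy min-fill: at each step, eliminate the node whose elimination adds the fewest fill edges. The slides do not name a specific heuristic, so this choice is an assumption made for illustration, as are the function name `min_fill_order` and the edge list (the same assumed moralized graph over X_1, ..., X_6, with node i standing for X_i).

```python
from itertools import combinations

def min_fill_order(edges):
    """Greedy min-fill: return (elimination order, fill edges added)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def missing(n):
        # Pairs of neighbours of n that are not yet connected.
        return [(u, v) for u, v in combinations(sorted(adj[n]), 2) if v not in adj[u]]

    order, fill = [], []
    while adj:
        # Fewest fill edges first; ties are broken by the smallest label.
        node = min(sorted(adj), key=lambda n: len(missing(n)))
        for u, v in missing(node):
            adj[u].add(v)
            adj[v].add(u)
            fill.append((u, v))
        for u in adj[node]:
            adj[u].discard(node)
        del adj[node]
        order.append(node)
    return order, fill

# ASSUMED moralized graph; the original figure is not available in the text.
edges = [(1, 2), (1, 3), (2, 4), (3, 5), (2, 5), (2, 6), (5, 6)]
order, fill = min_fill_order(edges)
print(order, fill)
```

On this graph the heuristic adds a single fill edge, between X_2 and X_3. Adding the fill edges to the original graph yields a chordal graph; greedy heuristics of this kind carry no optimality guarantee in general.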
Building a junction tree
Different junction trees are obtained for different triangulations, i.e., from different elimination orders (and different maximum spanning trees).
Complexity of junction tree algorithms: the time and space complexity is dominated by the largest clique in the junction tree, and is exponential in the number of variables in that clique. Finding the junction tree with the smallest cliques is an NP-hard problem, just as finding the optimal ordering for the elimination algorithm is NP-hard. For many graphs, however, an optimal or near-optimal ordering can often be found heuristically.
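A short calculation makes the exponential dependence concrete. The sketch below counts the entries of each clique potential as the product of the cardinalities of its variables, using the four maximal cliques of the construction example in this deck (node i stands for X_i). The cardinalities 2 and 10 and the function name are assumptions chosen for illustration.

```python
from math import prod

def clique_table_sizes(cliques, card):
    """Entries per clique potential: the product of its variables' cardinalities."""
    return [prod(card[v] for v in c) for c in cliques]

cliques = [{1, 2, 3}, {2, 3, 5}, {2, 5, 6}, {2, 4}]
for k in (2, 10):
    card = {v: k for v in range(1, 7)}  # every variable takes k values
    sizes = clique_table_sizes(cliques, card)
    # Compare total clique storage with the size of the full joint table.
    print(k, sizes, sum(sizes), k ** 6)
```

With ten values per variable the clique tables hold 3100 entries in total, against one million for the full joint table; a single clique of six variables would cost as much as the joint itself.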
Junction-tree construction
Construct the undirected graph G (moralize the graph if the model is directed).
Triangulate the graph, e.g., find the induced graph that results from VE with a specified elimination order of the nodes.
Find the set of maximal elimination cliques of the triangulated graph.
Build a weighted, complete graph over these maximal cliques, weighting the edge between cliques C_i and C_j by |C_i ∩ C_j|.
Find a maximum-weight spanning tree of this clique graph; it is a junction tree for G.
A cluster tree over the maximal cliques is a junction tree iff it is a maximum-weight spanning tree of the clique graph.
Junction tree construction: Example
[Figure: four stages of the example graph over X_1, ..., X_6, followed by the weighted clique graph and the resulting junction tree]
Maximal cliques: {X_1, X_2, X_3}, {X_2, X_3, X_5}, {X_2, X_5, X_6}, {X_2, X_4}
Pairwise intersections: {X_1, X_2, X_3} ∩ {X_2, X_3, X_5} = {X_2, X_3}; {X_2, X_3, X_5} ∩ {X_2, X_5, X_6} = {X_2, X_5}; every other pair of cliques intersects in {X_2}.
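The spanning-tree step for this example can be sketched with Kruskal's algorithm run on edges sorted by decreasing separator size. The cliques are the ones listed above (node i stands for X_i); the function name `junction_tree_edges` and the tie-breaking among equal-weight edges are illustrative assumptions, so the clique {X_2, X_4} may attach to a different neighbor than in the original figure.

```python
from itertools import combinations

def junction_tree_edges(cliques):
    """Maximum-weight spanning tree of the clique graph (Kruskal, heaviest edges first)."""
    candidates = sorted(
        ((len(cliques[i] & cliques[j]), i, j)
         for i, j in combinations(range(len(cliques)), 2)),
        key=lambda t: -t[0],
    )
    parent = list(range(len(cliques)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for weight, i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj and weight > 0:
            parent[ri] = rj
            tree.append((i, j, cliques[i] & cliques[j]))
    return tree

cliques = [{1, 2, 3}, {2, 3, 5}, {2, 5, 6}, {2, 4}]
tree = junction_tree_edges(cliques)
for i, j, sep in tree:
    print(sorted(cliques[i]), sorted(sep), sorted(cliques[j]))
```

The two edges of weight 2 (separators {X_2, X_3} and {X_2, X_5}) are always selected; the clique {X_2, X_4} can be attached to any other clique through the separator {X_2} without changing the total weight of 5, which is why several junction trees exist for the same triangulation.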
Junction tree algorithm
Given a factorized probability distribution P with Markov network H, the algorithm builds a junction tree T based on H. For each clique, it finds the marginal probability over the variables in that clique. There are two variants.
Message-passing sum-product (Shafer-Shenoy algorithm): run a message-passing algorithm on the junction tree constructed according to the distribution.
Belief update (Hugin algorithm): preserve local consistency by means of rescaling (update) equations.
Junction tree algorithm: inference
The junction tree inference algorithm is message passing on a junction tree structure. Each clique starts with a set of initial factors. Each clique sends one message to each neighbor according to a schedule. Finally, for each clique, the marginal over its variables is computed.
Junction tree algorithm: inference
Each clique starts with a set of initial factors. Each factor of the distribution P is assigned to one and only one clique of T, chosen among the cliques whose variables contain the scope of the factor. The initial potential of clique C_i is

$$\psi_i = \prod_{\phi \in F_i} \phi$$

where F_i denotes the set of factors assigned to clique C_i.
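The assignment rule can be sketched as follows. The factor list is an assumption: it corresponds to one directed model that is consistent with the cliques of the construction example in this deck, and is not given in the text. The function name `assign_factors` is hypothetical, and choosing the first covering clique is just one valid way to satisfy the "one and only one" requirement.

```python
def assign_factors(factor_scopes, cliques):
    """Assign each factor to exactly one clique whose variables cover its scope."""
    assignment = {i: [] for i in range(len(cliques))}
    for name, scope in factor_scopes.items():
        # Pick the first covering clique; any covering clique would be valid.
        home = next(i for i, c in enumerate(cliques) if set(scope) <= c)
        assignment[home].append(name)
    return assignment

cliques = [{1, 2, 3}, {2, 3, 5}, {2, 5, 6}, {2, 4}]
# ASSUMED factors (node i stands for X_i); scopes are listed as tuples.
factor_scopes = {
    "p(x1)": (1,),
    "p(x2|x1)": (1, 2),
    "p(x3|x1)": (1, 3),
    "p(x4|x2)": (2, 4),
    "p(x5|x3)": (3, 5),
    "p(x6|x2,x5)": (2, 5, 6),
}
assignment = assign_factors(factor_scopes, cliques)
print(assignment)
```

The potential of each clique is then the product of the factors listed for it; a clique with no assigned factor starts with the constant potential 1.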
Junction tree algorithm: inference
Each clique sends one message to each neighbor according to a schedule. To send a message to neighbor C_j, clique C_i multiplies its potential by the incoming messages from all its other neighbors, sums out the variables that are not in the separator S_ij, and sends the result as an outgoing message:

$$m_{ij}(S_{ij}) = \sum_{C_i \setminus S_{ij}} \psi_i \prod_{k \in \mathcal{N}_i \setminus \{j\}} m_{ki}(S_{ki})$$

where N_i is the set of neighbors of clique C_i.
[Figure: cliques C_i and C_j joined by the separator S_ij, with the message m_ij(S_ij) passing from C_i to C_j]
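The message equation can be traced on a small example. The sketch below is a plain-Python illustration under several assumptions: all variables are binary, a factor is a pair (variable names, table), the junction tree is the hypothetical chain {A, B} - {B, C} - {C, D}, and the potential values are made up. Messages are recomputed recursively for brevity, whereas an actual Shafer-Shenoy implementation would cache each message and reuse it.

```python
from itertools import product

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    (fv, ft), (gv, gt) = f, g
    vars_ = tuple(dict.fromkeys(fv + gv))  # union, order preserved
    table = {}
    for vals in product((0, 1), repeat=len(vars_)):
        a = dict(zip(vars_, vals))
        table[vals] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return vars_, table

def sum_out(f, keep):
    """Sum out every variable of f that is not in `keep`."""
    fv, ft = f
    kept = tuple(v for v in fv if v in keep)
    table = {}
    for vals, p in ft.items():
        key = tuple(x for v, x in zip(fv, vals) if v in keep)
        table[key] = table.get(key, 0.0) + p
    return kept, table

def message(i, j, psi, cliques, nbrs):
    """m_ij(S_ij): multiply psi_i by messages from N_i minus {j}, then sum out C_i minus S_ij."""
    f = psi[i]
    for k in nbrs[i]:
        if k != j:
            f = multiply(f, message(k, i, psi, cliques, nbrs))
    return sum_out(f, cliques[i] & cliques[j])

def belief(i, psi, cliques, nbrs):
    """Unnormalized clique marginal: psi_i times all incoming messages."""
    f = psi[i]
    for k in nbrs[i]:
        f = multiply(f, message(k, i, psi, cliques, nbrs))
    return f

# Hypothetical chain junction tree with made-up potentials.
cliques = [{"A", "B"}, {"B", "C"}, {"C", "D"}]
nbrs = {0: [1], 1: [0, 2], 2: [1]}
psi = {
    0: (("A", "B"), {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}),
    1: (("B", "C"), {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 2.0, (1, 1): 1.0}),
    2: (("C", "D"), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}),
}
b1 = belief(1, psi, cliques, nbrs)
z = sum(b1[1].values())  # normalization constant
print(b1, z)
```

Dividing each entry of a clique belief by the normalization constant gives the marginal over that clique; every clique yields the same constant, which is a quick consistency check.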