Junction-tree algorithm



  1. Junction-tree algorithm, Probabilistic Graphical Models, Sharif University of Technology, Spring 2016, Soleymani

  2. Junction-tree algorithm: a general approach
   Junction trees, as opposed to sum-product on trees, can be applied to general graphs.
   The junction tree, as opposed to the elimination algorithm, is not “query-oriented”:
   it enables us to record and reuse the intermediate factors to answer multiple queries simultaneously.
   Upon convergence of the algorithm, we obtain marginal probabilities for all cliques of the original graph.

  3. Cluster tree
   A cluster tree is a singly connected graph (i.e., with exactly one path between each pair of nodes) whose nodes are the cliques of an underlying graph.
   A separator set is defined for each linked pair of cliques; it contains the variables in the intersection of the two cliques.
  [Figure: cliques {𝑌_𝐵, 𝑌_𝐶} and {𝑌_𝐶, 𝑌_𝐷} connected through the separator set {𝑌_𝐶}.]

  4. Example: variable elimination and cluster tree
  [Figure: the original graph over 𝑌_1, …, 𝑌_6 and its moralized graph.]
   Elimination order: 𝑌_6, 𝑌_5, 𝑌_4, 𝑌_3, 𝑌_2

  5. Example: elimination cliques
   Elimination order: 𝑌_6, 𝑌_5, 𝑌_4, 𝑌_3, 𝑌_2
  [Figure: the sequence of elimination cliques formed over 𝑌_1, …, 𝑌_6 as each variable of the moralized graph is eliminated.]
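The elimination cliques above can be generated mechanically from the moralized graph: eliminating a variable creates a clique of that variable plus its current neighbors, then connects those neighbors (fill-in edges). A minimal sketch; the adjacency structure below is a hypothetical stand-in, since the slides' example graph is not recoverable from the figure:

```python
def elimination_cliques(adj, order):
    """Return the elimination cliques produced by eliminating variables
    in the given order from an undirected (moralized) graph.
    adj: dict mapping node -> set of neighbors."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    cliques = []
    for v in order:
        clique = frozenset({v} | adj[v])
        cliques.append(clique)
        # Fill-in: connect all remaining neighbors of v pairwise.
        for a in adj[v]:
            for b in adj[v]:
                if a != b:
                    adj[a].add(b)
        # Remove v from the graph.
        for a in adj[v]:
            adj[a].discard(v)
        del adj[v]
    return cliques

# Hypothetical moralized graph over nodes 1..6 (not the slides' graph).
adj = {1: {2, 3}, 2: {1, 4, 6}, 3: {1, 5}, 4: {2}, 5: {3, 6}, 6: {2, 5}}
order = [6, 5, 4, 3, 2]
cliques = elimination_cliques(adj, order)
# cliques: {2,5,6}, {2,3,5}, {2,4}, {1,2,3}, {1,2}
```

Each clique contains the eliminated variable together with its neighbors at elimination time, which is exactly the node set of one cluster in the induced cluster tree.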

  6. Example: cluster tree obtained by VE
   The cluster tree contains the cliques (fully connected subsets) generated as elimination executes.
   The cluster graph induced by an execution of VE is necessarily a tree:
   indeed, after an elimination, the corresponding elimination clique does not reappear.
  [Figure: the cluster tree over 𝑌_1, …, 𝑌_6 for elimination order 𝑌_6, 𝑌_5, 𝑌_4, 𝑌_3, 𝑌_2.]

  7. Example: cluster tree obtained by VE
   The cluster tree contains the cliques (fully connected subsets) generated as elimination executes.
   The cluster graph induced by an execution of VE is necessarily a tree:
   indeed, after an elimination, the corresponding elimination clique does not reappear.
  [Figure: the same cluster tree with the maximal cliques highlighted, for elimination order 𝑌_6, 𝑌_5, 𝑌_4, 𝑌_3, 𝑌_2.]

  8. Cluster tree usefulness
   The cluster tree provides a structure for caching computations:
   multiple queries can be performed much more efficiently than by running VE for each one separately.
   The cluster tree dictates a partial order over the operations performed on factors, yielding better computational complexity.

  9. Junction tree property
   Junction tree property: if a variable appears in two cliques of the clique tree, it must appear in all cliques on the path connecting them.
   For every pair of cliques 𝐷_𝑗 and 𝐷_𝑘, all cliques on the path between 𝐷_𝑗 and 𝐷_𝑘 contain 𝑇_𝑗𝑘 = 𝐷_𝑗 ∩ 𝐷_𝑘.
   This is also called the running intersection property.
   A cluster tree that satisfies the running intersection property is called a clique tree or junction tree.
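The running intersection property can be checked mechanically: for every variable, the cliques containing it must form a connected subtree. A minimal sketch; the clique sets and tree edges in the usage are hypothetical, not the slides' example:

```python
from collections import defaultdict, deque

def satisfies_rip(cliques, edges):
    """Check the running intersection property of a cluster tree.
    cliques: dict clique_id -> set of variables; edges: list of (i, j) pairs."""
    adj = defaultdict(list)
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    variables = set().union(*cliques.values())
    for v in variables:
        holders = {c for c, scope in cliques.items() if v in scope}
        # BFS restricted to cliques that contain v: they must all be reachable.
        start = next(iter(holders))
        seen, queue = {start}, deque([start])
        while queue:
            c = queue.popleft()
            for nb in adj[c]:
                if nb in holders and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        if seen != holders:
            return False
    return True

# Hypothetical chain of cliques: {A,B} - {B,C} - {C,D} satisfies the property.
ok = satisfies_rip({0: {"A", "B"}, 1: {"B", "C"}, 2: {"C", "D"}}, [(0, 1), (1, 2)])
# Putting "B" in two non-adjacent cliques with no "B" in between violates it.
bad = satisfies_rip({0: {"A", "B"}, 1: {"C"}, 2: {"C", "B"}}, [(0, 1), (1, 2)])
```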

  10. Theorem
   Theorem: the tree induced by the variable elimination algorithm satisfies the running intersection property.
   Proof sketch:
   Let 𝐷 and 𝐷′ be two clusters that contain 𝑌, and let 𝐷_𝑌 be the cluster in which 𝑌 is eliminated. We prove that 𝑌 must be present in every clique on the path between 𝐷 and 𝐷_𝑌 (and similarly on the path between 𝐷_𝑌 and 𝐷′).
   Idea: the computation at 𝐷_𝑌 must happen later than the computation at 𝐷 or 𝐷′.

  11. Separation set
   Theorem 1: in a clique tree induced by the variable elimination algorithm, let 𝑛_𝑗𝑘 be the message that 𝐷_𝑗 sends to the neighboring cluster 𝐷_𝑘; then the scope of this message is 𝑇_𝑗𝑘 = 𝐷_𝑗 ∩ 𝐷_𝑘.
   Theorem 2: a cluster tree satisfies the running intersection property if and only if, for every separation set 𝑇_𝑗𝑘, 𝑊_≺(𝑗,𝑘) and 𝑊_≺(𝑘,𝑗) are separated in 𝐼 given 𝑇_𝑗𝑘.
   𝑊_≺(𝑗,𝑘): set of all variables in the scope of all cliques on the 𝐷_𝑗 side of the edge (𝑗, 𝑘).

  12. Junction tree algorithm
   Given a factorized probability distribution 𝑄 with Markov network 𝐼, the algorithm builds a junction tree 𝑈 based on 𝐼.
   For each clique, it finds the marginal probability over the variables in that clique. Two variants:
   Message-passing sum-product (Shafer-Shenoy algorithm): run a message-passing algorithm on the junction tree constructed according to the distribution.
   Belief update (Hugin algorithm): preserve local consistency via rescaling (update) equations.

  13. Junction tree algorithm: inference
   The junction tree inference algorithm is message passing on a junction tree structure.
   Each clique starts with a set of initial factors: we assign each factor of the distribution 𝑄 to one and only one clique in 𝑈 whose variables contain the scope of the factor.
   Each clique sends one message to each neighbor, following a schedule.
   To send a message, a clique multiplies the incoming messages with its own potential, sums out one or more variables, and sends the result as an outgoing message.
   After message passing, each clique can compute the marginal over its variables by combining its potential with the messages received from its neighbors.

  14. Junction-tree message passing: Shafer-Shenoy algorithm
   Initial clique potential: 𝜔_𝑗 = ∏_{𝜚∈𝐺_𝑗} 𝜚, where 𝐺_𝑗 denotes the set of factors assigned to clique 𝐷_𝑗.
   Message from 𝐷_𝑗 to 𝐷_𝑘: 𝑛_𝑗𝑘(𝑇_𝑗𝑘) = ∑_{𝐷_𝑗∖𝑇_𝑗𝑘} 𝜔_𝑗 ∏_{𝑙∈𝒩(𝑗)∖{𝑘}} 𝑛_𝑙𝑗(𝑇_𝑙𝑗), where 𝒩(𝑗) denotes the neighbors of 𝐷_𝑗 in the junction tree.
   Marginal on a clique, as the product of its initial potential and the messages from its neighbors: 𝑄(𝐷_𝑠) ∝ 𝜔_𝑠 ∏_{𝑙∈𝒩(𝑠)} 𝑛_𝑙𝑠(𝑇_𝑙𝑠)
   Marginal on a separator: 𝑄(𝑇_𝑗𝑘) ∝ 𝑛_𝑗𝑘(𝑇_𝑗𝑘) 𝑛_𝑘𝑗(𝑇_𝑗𝑘)
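These updates can be sketched concretely on the smallest interesting junction tree: two cliques joined by one separator. The binary potentials below are made-up numbers, not from the slides; the sketch checks the clique marginal against the brute-force joint:

```python
from itertools import product

# Junction tree with cliques D1 = {A, B}, D2 = {B, C}; separator T12 = {B}.
# Hypothetical initial clique potentials over binary variables.
omega1 = {(a, b): [[2.0, 1.0], [0.5, 3.0]][a][b] for a, b in product(range(2), repeat=2)}
omega2 = {(b, c): [[1.0, 4.0], [2.0, 0.5]][b][c] for b, c in product(range(2), repeat=2)}

# Messages: n_12(B) = sum_A omega1(A, B); n_21(B) = sum_C omega2(B, C).
# (Each clique has a single neighbor, so no incoming messages are multiplied in.)
n12 = {b: sum(omega1[a, b] for a in range(2)) for b in range(2)}
n21 = {b: sum(omega2[b, c] for c in range(2)) for b in range(2)}

# Clique marginal Q(D2) proportional to omega2 * n12, normalized.
q2 = {(b, c): omega2[b, c] * n12[b] for b, c in product(range(2), repeat=2)}
z = sum(q2.values())
q2 = {k: v / z for k, v in q2.items()}

# Sanity check against the brute-force joint p(A,B,C) proportional to omega1 * omega2.
joint = {(a, b, c): omega1[a, b] * omega2[b, c]
         for a, b, c in product(range(2), repeat=3)}
zj = sum(joint.values())
brute = {(b, c): sum(joint[a, b, c] for a in range(2)) / zj
         for b, c in product(range(2), repeat=2)}
assert all(abs(q2[k] - brute[k]) < 1e-12 for k in q2)
```

The separator marginal follows the same pattern: Q(B) is proportional to n12(B) * n21(B).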

  15. Junction-tree algorithm: correctness
   If 𝑌 is eliminated when a message is sent from 𝐷_𝑗 to a neighboring 𝐷_𝑘 such that 𝑌 ∈ 𝐷_𝑗 and 𝑌 ∉ 𝐷_𝑘, then 𝑌 does not appear in the tree on the 𝐷_𝑘 side of the edge (𝑗, 𝑘) after elimination.
  Notation:
   𝑊_≺(𝑗,𝑘): set of all variables in the scope of all cliques on the 𝐷_𝑗 side of the edge (𝑗, 𝑘)
   𝐺_≺(𝑗,𝑘): set of factors in the cliques on the 𝐷_𝑗 side of the edge (𝑗, 𝑘)
   𝐺_𝑗: set of factors in the clique 𝐷_𝑗

  16. Junction-tree algorithm: correctness
   Induction on the length of the path from the leaves.
   Base step: leaves.
   Inductive step: let 𝐷_𝑗1, …, 𝐷_𝑗𝑛 be the neighbors of 𝐷_𝑗 other than 𝐷_𝑘. 𝑊_≺(𝑗,𝑘) is the union of 𝐷_𝑗 and the disjoint sets 𝑊_≺(𝑗𝑙,𝑗) for 𝑙 = 1, …, 𝑛, so the sum factorizes over the subtrees:
  𝑛_𝑗→𝑘(𝑇_𝑗𝑘) = ∑_{𝑊_≺(𝑗,𝑘)∖𝑇_𝑗𝑘} ∏_{𝜚∈𝐺_≺(𝑗,𝑘)} 𝜚
  = ∑_{𝐷_𝑗∖𝑇_𝑗𝑘} ∏_{𝜚∈𝐺_𝑗} 𝜚 × ∏_{𝑙=1}^{𝑛} ( ∑_{𝑊_≺(𝑗𝑙,𝑗)∖𝑇_𝑗𝑙𝑗} ∏_{𝜚∈𝐺_≺(𝑗𝑙,𝑗)} 𝜚 )
  = ∑_{𝐷_𝑗∖𝑇_𝑗𝑘} 𝜔_𝑗 × 𝑛_𝑗1→𝑗 × ⋯ × 𝑛_𝑗𝑛→𝑗

  17. Message passing schedule
   A two-pass message-passing schedule: arbitrarily pick a node as the root.
   First pass: starts at the leaves and proceeds inward.
   Each node passes a message to its parent.
   This continues until the root has obtained messages from all of its adjoining nodes.
   Second pass: starts at the root and passes the messages back out.
   Messages are passed in the reverse direction.
   This continues until all leaves have received their messages.
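The two passes can be generated from any rooted tree: a post-order traversal gives the inward (collect) messages and a pre-order traversal gives the outward (distribute) ones. A minimal sketch; the four-clique tree in the usage is hypothetical:

```python
def two_pass_schedule(tree, root):
    """Return the list of directed messages (sender, receiver) for the
    collect (leaves -> root) and distribute (root -> leaves) passes.
    tree: dict node -> list of neighbors."""
    parent = {root: None}
    order = []

    def collect(u):  # post-order: children report before u sends to its parent
        for v in tree[u]:
            if v != parent[u]:
                parent[v] = u
                collect(v)
                order.append((v, u))  # child sends inward to its parent

    def distribute(u):  # pre-order: u sends outward before its children do
        for v in tree[u]:
            if v != parent[u]:
                order.append((u, v))  # parent sends back out to child
                distribute(v)

    collect(root)
    distribute(root)
    return order

# Hypothetical junction tree: 1 - 0 - 2 - 3, rooted at clique 0.
tree = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
schedule = two_pass_schedule(tree, 0)
# schedule: [(1, 0), (3, 2), (2, 0), (0, 1), (0, 2), (2, 3)]
```

The schedule is valid in the sum-product sense: each message (j, k) is sent only after j has received messages from all neighbors other than k, and every edge carries exactly one message in each direction.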

  18. Junction tree algorithm, belief update perspective: Hugin algorithm
   We define two sets of potential functions:
   Clique potentials: on each clique 𝒀_𝐷, we define a potential function 𝜔_𝐷(𝒀_𝐷) that is proportional to the marginal probability on that clique: 𝜔_𝐷(𝒀_𝐷) ∝ 𝑄(𝒀_𝐷)
   Separator potentials: on each separator set 𝒀_𝑇, we define a potential function 𝜚_𝑇(𝒀_𝑇) that is proportional to the marginal probability on 𝒀_𝑇: 𝜚_𝑇(𝒀_𝑇) ∝ 𝑄(𝒀_𝑇)
  [Figure: two cliques 𝑊 and 𝑋 with potentials 𝜔_𝑊 and 𝜔_𝑋, joined by a separator 𝑇 with potential 𝜚_𝑇.]
   This enables us to obtain a local representation of marginal probabilities in cliques.

  19. Extended representation of joint probability
   We intend to find an extended representation:
  𝑄(𝒀) ∝ ∏_𝐷 𝜔_𝐷(𝒀_𝐷) / ∏_𝑇 𝜚_𝑇(𝒀_𝑇)
   where the global representation ∏_𝐷 𝜔_𝐷(𝒀_𝐷) / ∏_𝑇 𝜚_𝑇(𝒀_𝑇) corresponds to the joint probability,
   and the local representations 𝜔_𝐷(𝒀_𝐷) and 𝜚_𝑇(𝒀_𝑇) correspond to marginal probabilities.

  20. Consistency
   Consistency: since the potentials are required to represent marginal probabilities, they must give the same marginals for the variables that they have in common.
   Consistency is a necessary and sufficient condition for the inference algorithm to find potentials that are marginals.
   We first introduce local consistency:
  𝜚(𝑇_𝑗𝑘) = ∑_{𝐷_𝑗∖𝑇_𝑗𝑘} 𝜔_𝑗 = ∑_{𝐷_𝑘∖𝑇_𝑗𝑘} 𝜔_𝑘
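A minimal sketch of Hugin-style belief updates on a two-clique tree, showing that the local-consistency condition above holds after one forward and one backward update (the potentials are made-up numbers, not from the slides):

```python
from itertools import product

# Cliques D1 = {A, B}, D2 = {B, C}; separator T = {B}; binary variables.
# Hypothetical clique potentials; the separator potential starts at 1.
psi1 = {(a, b): [[2.0, 1.0], [0.5, 3.0]][a][b] for a, b in product(range(2), repeat=2)}
psi2 = {(b, c): [[1.0, 4.0], [2.0, 0.5]][b][c] for b, c in product(range(2), repeat=2)}
phi = {b: 1.0 for b in range(2)}

# Forward update D1 -> D2: phi*(b) = sum_A psi1; rescale psi2 by phi*/phi.
phi_new = {b: sum(psi1[a, b] for a in range(2)) for b in range(2)}
for (b, c) in psi2:
    psi2[b, c] *= phi_new[b] / phi[b]
phi = phi_new

# Backward update D2 -> D1: phi*(b) = sum_C psi2; rescale psi1 by phi*/phi.
phi_new = {b: sum(psi2[b, c] for c in range(2)) for b in range(2)}
for (a, b) in psi1:
    psi1[a, b] *= phi_new[b] / phi[b]
phi = phi_new

# Local consistency: summing out D1\T and D2\T both give the separator potential.
for b in range(2):
    assert abs(sum(psi1[a, b] for a in range(2)) - phi[b]) < 1e-9
    assert abs(sum(psi2[b, c] for c in range(2)) - phi[b]) < 1e-9
```

Each rescaling multiplies one clique potential and divides the separator potential by the same factor, so the extended representation of the joint distribution is left invariant while the potentials converge toward (unnormalized) marginals.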
