Graphical Models: Clique Trees & Belief Propagation (PowerPoint presentation)

1. Graphical Models: Clique Trees & Belief Propagation. Siamak Ravanbakhsh, Winter 2018

2. Learning objectives: message passing on clique trees; its relation to variable elimination; two different forms of belief propagation.

3. Recap: variable elimination (VE). Marginalize over a subset, e.g. P(J) = ∑_{C,D,I,G,S,L,H} P(C,D,I,G,S,L,J,H); the naive sum is expensive to calculate (why?). Use the factorized form of P: P(J) = ∑_{C,D,I,G,S,L,H} P(D|C) P(G|D,I) P(S|I) P(L|G) P(J|L,S) P(H|G,J).

4. Recap: variable elimination (VE). Marginalize over a subset, e.g. P(H,J) = ∑_{C,D,I,G,S,L} P(C,D,I,G,S,L,H,J); expensive to calculate directly (why?). Use the factorized form of P: ∑_{C,D,I,G,S,L} P(D|C) P(G|D,I) P(S|I) P(L|G) P(J|L,S) P(H|G,J). Think of each conditional, e.g. P(H|G,J), as a factor/potential ϕ(H,G,J): inference gives the same treatment to Bayes-nets and Markov nets, even though they do not encode the same conditional independencies (CIs).

5. Recap: variable elimination (VE). Marginalize, e.g. P(H,J) = ∑_{C,D,I,G,S,L} ϕ₁(D,C) ϕ₂(G,D,I) ϕ₃(S,I) ϕ₄(L,G) ϕ₅(J,L,S) ϕ₆(H,G,J). Push the sums inward: … ∑_I ϕ₃(S,I) ∑_D ϕ₂(G,D,I) ∑_C ϕ₁(D,C). Eliminating C gives ψ₁(D,C) = ϕ₁(D,C) and ψ′₁(D) = ∑_C ψ₁(D,C); repeat this: … ∑_I ϕ₃(S,I) ∑_D ϕ₂(G,D,I) ψ′₁(D), where ψ₂(G,I,D) = ϕ₂(G,D,I) ψ′₁(D) and ψ′₂(G,I) = ∑_D ψ₂(G,I,D), and so on.
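The elimination steps above can be traced in code. Below is a minimal Python sketch under stated assumptions: all variables are binary, a factor is a (vars, table) pair of plain tuples and dicts, and the function names (`multiply`, `sum_out`, `eliminate`) are illustrative, not from the slides.

```python
from itertools import product

# A factor is (vars, table): vars is a tuple of variable names and table maps
# each full assignment (a tuple of 0/1 values, one per var) to a nonnegative
# real. All variables are binary in this sketch.

def multiply(f, g):
    """Factor product: the new scope is the union of the two scopes."""
    fv, ft = f
    gv, gt = g
    vs = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for asg in product([0, 1], repeat=len(vs)):
        env = dict(zip(vs, asg))
        table[asg] = ft[tuple(env[v] for v in fv)] * gt[tuple(env[v] for v in gv)]
    return vs, table

def sum_out(f, var):
    """Marginalize one variable out of a factor (psi -> psi')."""
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for asg, val in ft.items():
        key = tuple(a for v, a in zip(fv, asg) if v != var)
        table[key] = table.get(key, 0.0) + val
    return keep, table

def eliminate(factors, order):
    """Variable elimination: for each variable in `order`, multiply the
    factors that mention it (psi), sum it out (psi'), and finally return
    the product of everything that remains."""
    factors = list(factors)
    for var in order:
        touched = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        psi = touched[0]
        for g in touched[1:]:
            psi = multiply(psi, g)
        factors = rest + [sum_out(psi, var)]
    result = factors[0]
    for g in factors[1:]:
        result = multiply(result, g)
    return result

# Tiny example: P(B) from P(A) and P(B|A) by eliminating A.
phi_a = (('A',), {(0,): 0.4, (1,): 0.6})
phi_ab = (('A', 'B'), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
p_b = eliminate([phi_a, phi_ab], ['A'])
```

The intermediate factor produced while eliminating a variable plays the role of ψ, and its marginalized result the role of ψ′ in the derivation above.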

6. Recap: variable elimination (VE). Marginalize over a subset, e.g. P(J) = ∑_{C,D,I,G,S,L,H} P(C,D,I,G,S,L,J,H); expensive to calculate directly (why?). VE eliminates variables in some order, e.g. C, D, I, …

7. Recap: variable elimination (VE). Eliminating variables in some order creates a chordal graph; the maximal cliques of this chordal graph are the factors ψ_t created during VE. Order: C, D, I, H, G, S, L.

8. Clique-tree. Summarize the VE computation using a clique-tree (order: C, D, I, H, G, S, L). Clusters are the maximal cliques, i.e. the factors that are marginalized: C_i = Scope[ψ_i]. Example: P(J) = … ∑_I P(S|I) ∑_D P(G|D,I) ∑_C P(D|C), with ψ₁(D,C) = P(D|C) and ψ′₁(D) = ∑_C ψ₁(D,C).

9. Clique-tree. Summarize the VE computation using a clique-tree (order: C, D, I, H, G, S, L). Clusters are the maximal cliques (factors that are marginalized): C_i = Scope[ψ_i]. Sepsets are the result of marginalization over cliques: S_{i,j} = Scope[ψ′_i] = C_i ∩ C_j.

10. Clique-tree: properties. A tree T of clusters C_i and sepsets S_{i,j} = C_i ∩ C_j. Family-preserving property: each factor ϕ is associated with a cluster α(ϕ) = j such that Scope[ϕ] ⊆ C_j.

11. Clique-tree: properties. A tree T of clusters C_i and sepsets S_{i,j} = C_i ∩ C_j. Family-preserving property: each factor ϕ is associated with a cluster α(ϕ) = j such that Scope[ϕ] ⊆ C_j. Running intersection property: if X ∈ C_i and X ∈ C_j, then X ∈ C_k for every C_k on the path C_i → … → C_j.
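The running intersection property can be checked mechanically. A short sketch, assuming clusters are given as sets keyed by node id and tree edges as pairs (the helper name is made up for illustration):

```python
def has_running_intersection(clusters, edges):
    """clusters: {node: set of variables}; edges: tree edges as (i, j) pairs.
    For every variable X, the clusters containing X must form a connected
    subtree. Since the whole graph is a tree (no cycles), the induced
    subgraph on those nodes is connected iff it has |nodes| - 1 edges."""
    for x in set().union(*clusters.values()):
        nodes = {i for i, c in clusters.items() if x in c}
        induced = [(i, j) for (i, j) in edges if i in nodes and j in nodes]
        if len(induced) != len(nodes) - 1:
            return False
    return True

# A chain C1={A,B} - C2={B,C} - C3={C,D} satisfies the property;
# putting B only in the two end clusters violates it.
good = has_running_intersection(
    {1: {'A', 'B'}, 2: {'B', 'C'}, 3: {'C', 'D'}}, [(1, 2), (2, 3)])
bad = has_running_intersection(
    {1: {'A', 'B'}, 2: {'C'}, 3: {'B', 'D'}}, [(1, 2), (2, 3)])
```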

12. VE as message passing. Think of VE as sending messages.

13. VE as message passing. Think of VE as sending messages. Calculate the product of the factors assigned to each clique: ψ_i(C_i) ≜ ∏_{ϕ: α(ϕ)=i} ϕ. Send messages from the leaves towards a root: δ_{i→j}(S_{i,j}) = ∑_{C_i − S_{i,j}} ψ_i(C_i) ∏_{k∈Nb_i−{j}} δ_{k→i}(S_{i,k}).

14. Message passing. Think of VE as sending messages, now with a different root. Send messages from the leaves towards the root: δ_{i→j}(S_{i,j}) = ∑_{C_i − S_{i,j}} ψ_i(C_i) ∏_{k∈Nb_i−{j}} δ_{k→i}(S_{i,k}) = ∑_{V≺(i→j)} ∏_{ϕ∈F≺(i→j)} ϕ: the message is the marginal from one side of the tree.

15. Message passing. Messages from the leaves towards the root: δ_{i→j}(S_{i,j}) ≜ ∑_{C_i − S_{i,j}} ψ_i(C_i) ∏_{k∈Nb_i−{j}} δ_{k→i}(S_{i,k}) = ∑_{V≺(i→j)} ∏_{ϕ∈F≺(i→j)} ϕ. The belief at the root clique is β_r(C_r) ≜ ψ_r(C_r) ∏_{k∈Nb_r} δ_{k→r}(S_{r,k}), which is proportional to the marginal: β_r(C_r) ∝ ∑_{X − C_r} P(X).

16. Message passing: downward pass. What if we continue sending messages, from the root to the leaves? Clique i sends a message to clique j once it has received messages from all of its other neighbors k.

17. Message passing: downward pass. Continuing the messages from the root gives sum-product belief propagation (BP). Asynchronous message update: δ_{i→j}(S_{i,j}) = ∑_{C_i − S_{i,j}} ψ_i(C_i) ∏_{k∈Nb_i−{j}} δ_{k→i}(S_{i,k}). Sepset belief: μ_{i,j}(S_{i,j}) ≜ δ_{i→j}(S_{i,j}) δ_{j→i}(S_{i,j}). Marginals: β_i(C_i) ≜ ψ_i(C_i) ∏_{k∈Nb_i} δ_{k→i}(S_{i,k}), for any clique (not only the root).
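The two passes and the resulting beliefs can be traced on the smallest possible tree. A numeric sketch with two binary cliques C₁ = {A,B} and C₂ = {B,C} and made-up potentials (the variable names mirror δ, β, μ from the slides):

```python
# Two cliques C1 = {A, B} and C2 = {B, C} with sepset S = {B}; all variables
# binary. Potentials as nested dicts psi1[a][b], psi2[b][c] (made-up numbers).
psi1 = {0: {0: 0.5, 1: 1.0}, 1: {0: 2.0, 1: 0.5}}
psi2 = {0: {0: 1.0, 1: 3.0}, 1: {0: 2.0, 1: 1.0}}

# Upward pass: delta_{1->2}(B) = sum_A psi1(A, B)
d12 = {b: sum(psi1[a][b] for a in (0, 1)) for b in (0, 1)}
# Downward pass: delta_{2->1}(B) = sum_C psi2(B, C)
d21 = {b: sum(psi2[b][c] for c in (0, 1)) for b in (0, 1)}

# Beliefs: clique potential times all incoming messages.
beta1 = {(a, b): psi1[a][b] * d21[b] for a in (0, 1) for b in (0, 1)}
beta2 = {(b, c): psi2[b][c] * d12[b] for b in (0, 1) for c in (0, 1)}

# Sepset belief mu_{1,2}(B) = delta_{1->2}(B) * delta_{2->1}(B).
mu = {b: d12[b] * d21[b] for b in (0, 1)}

# Calibration: both clique beliefs marginalize to the same sepset function.
for b in (0, 1):
    m1 = sum(beta1[(a, b)] for a in (0, 1))
    m2 = sum(beta2[(b, c)] for c in (0, 1))
    assert abs(m1 - mu[b]) < 1e-9 and abs(m2 - mu[b]) < 1e-9
```

After one message in each direction over the single edge, both beliefs are (unnormalized) marginals of the joint ψ₁·ψ₂.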

18. Clique-tree & queries. What type of queries can we answer? Marginals over a subset of a clique: P(A), A ⊆ C_i. Updating the beliefs after new evidence: P(A | E^(t) = e^(t)), A ⊆ C_i, E ⊆ C_j: multiply the (previously calibrated) beliefs by the indicator, β_j(C_j) I(E^(t) = e^(t)), then propagate to recalibrate.

19. Clique-tree & queries. What type of queries can we answer? Marginals over a subset of a clique: P(A), A ⊆ C_i. Updating the beliefs after new evidence: P(A | E^(t) = e^(t)), A ⊆ C_i, E ⊆ C_j: multiply the (previously calibrated) beliefs by β_j(C_j) I(E^(t) = e^(t)) and propagate to recalibrate. Marginals outside cliques: P(A,B), A ⊆ C_i, B ⊆ C_j: define a super-clique that contains both A and B (is there a more efficient alternative?). The partition function Z.
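One concrete query worth spelling out: with unnormalized potentials, the partition function Z can be read off any single calibrated clique by summing its belief. A sketch on a two-clique tree with made-up numbers:

```python
# Two binary cliques C1 = {A, B}, C2 = {B, C}; potentials as flat dicts.
psi1 = {(0, 0): 0.5, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.5}
psi2 = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 1.0}

# One upward and one downward message over the sepset {B}.
d12 = {b: psi1[(0, b)] + psi1[(1, b)] for b in (0, 1)}
d21 = {b: psi2[(b, 0)] + psi2[(b, 1)] for b in (0, 1)}
beta1 = {(a, b): psi1[(a, b)] * d21[b] for a in (0, 1) for b in (0, 1)}
beta2 = {(b, c): psi2[(b, c)] * d12[b] for b in (0, 1) for c in (0, 1)}

# Z from either calibrated clique equals the brute-force sum over all
# assignments of the product of potentials.
z1 = sum(beta1.values())
z2 = sum(beta2.values())
z_direct = sum(psi1[(a, b)] * psi2[(b, c)]
               for a in (0, 1) for b in (0, 1) for c in (0, 1))
assert abs(z1 - z_direct) < 1e-9 and abs(z2 - z_direct) < 1e-9
```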

20. Chordal graph and clique-tree. Any chordal graph gives a clique-tree. How to get a chordal graph? Triangulation: use the chordal graph from VE with a heuristic elimination order (min-neighbor, min-fill, …), or find the optimal chordal graph, the one with the smallest tree-width (equivalently, the smallest max-clique), which is NP-hard. (image: wikipedia)
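The min-fill heuristic mentioned above fits in a few lines. A greedy (not optimal) sketch, assuming the graph is an adjacency dict of sets; the function name is illustrative:

```python
def min_fill_order(graph):
    """Greedy min-fill elimination order. graph: {node: set of neighbors}.
    Repeatedly eliminate the node whose elimination adds the fewest fill-in
    edges; adding those edges as we eliminate makes the graph chordal."""
    g = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    order = []
    while g:
        def fill_count(v):
            nb = list(g[v])
            return sum(1 for i in range(len(nb)) for j in range(i + 1, len(nb))
                       if nb[j] not in g[nb[i]])
        v = min(g, key=fill_count)
        order.append(v)
        for a in g[v]:            # connect v's neighbors pairwise (fill-in)
            for b in g[v]:
                if a != b:
                    g[a].add(b)
        for a in g[v]:            # remove v from the graph
            g[a].discard(v)
        del g[v]
    return order

# Example: a 4-cycle A-B-C-D needs one chord to become chordal.
cycle = {'A': {'B', 'D'}, 'B': {'A', 'C'}, 'C': {'B', 'D'}, 'D': {'A', 'C'}}
order = min_fill_order(cycle)
```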

21. Chordal graph and clique-tree. Chordal graphs = Markov networks ∩ Bayesian networks. From an MRF to a Bayes-net: triangulate and build a clique-tree; within cliques, make the nodes fully connected; between cliques, direct the edges from a root towards the leaves. (image: wikipedia)

22. Building a clique-tree: example. Input graph → triangulated graph → clique-tree. (from: Wainwright & Jordan)

23. Clique-tree quiz. What clique-tree to use here? What are the sepsets? What is the cost of exact inference?

24. Summary. VE as message passing on a clique-tree; clique-tree properties: running intersection & family preservation; belief propagation updates: message update and belief update; types of queries; how to build a clique-tree for exact inference.

25. Bonus slides

26. Clique-tree: calibration. Represent P using marginals: P̃ = ∏_i β_i / ∏_{(i,j)∈E} μ_{i,j}, where β_i = ψ_i ∏_k δ_{k→i} and μ_{i,j} = δ_{i→j} δ_{j→i}. How about arbitrary assignments β_i, μ_{i,j} for all (i,j) ∈ E: can they represent P as above? An assignment is calibrated iff μ_{i,j}(S_{i,j}) = ∑_{C_i − S_{i,j}} β_i(C_i) = ∑_{C_j − S_{i,j}} β_j(C_j); BP produces calibrated beliefs. For calibrated beliefs, these "arbitrary assignments" have to be marginals: P̃(X) ∝ ∏_i β_i(C_i) / ∏_{(i,j)∈E} μ_{i,j}(S_{i,j}) ⇔ β_i(C_i) ∝ P(C_i) and μ_{i,j}(S_{i,j}) ∝ P(S_{i,j}).
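The claim that calibrated beliefs reparameterize the joint can be verified numerically. A sketch on a two-clique tree with binary variables and made-up potentials, with δ and μ computed as in sum-product BP:

```python
# Calibrated beliefs on C1 = {A, B}, C2 = {B, C} via one sum-product pass.
psi1 = {(0, 0): 0.5, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.5}
psi2 = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 1.0}
d12 = {b: psi1[(0, b)] + psi1[(1, b)] for b in (0, 1)}   # delta_{1->2}(B)
d21 = {b: psi2[(b, 0)] + psi2[(b, 1)] for b in (0, 1)}   # delta_{2->1}(B)
beta1 = {(a, b): psi1[(a, b)] * d21[b] for a in (0, 1) for b in (0, 1)}
beta2 = {(b, c): psi2[(b, c)] * d12[b] for b in (0, 1) for c in (0, 1)}
mu = {b: d12[b] * d21[b] for b in (0, 1)}                # sepset belief

# Reparameterization: beta1 * beta2 / mu recovers the unnormalized joint
# psi1 * psi2 at every assignment.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            lhs = beta1[(a, b)] * beta2[(b, c)] / mu[b]
            rhs = psi1[(a, b)] * psi2[(b, c)]
            assert abs(lhs - rhs) < 1e-9
```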

27. BP: an alternative update. Message update: δ_{i→j}(S_{i,j}) = ∑_{C_i − S_{i,j}} ψ_i(C_i) ∏_{k∈Nb_i−{j}} δ_{k→i}(S_{i,k}); calculate the beliefs at the end: β_i(C_i) = ψ_i(C_i) ∏_{k∈Nb_i} δ_{k→i}(S_{i,k}). Belief update: since ∑_{C_i − S_{i,j}} β_i(C_i) = δ_{i→j}(S_{i,j}) δ_{j→i}(S_{i,j}), we can update the beliefs instead of the messages.

28. BP: an alternative update. Belief update: initialize β_i ← ψ_i = ∏_{ϕ: α(ϕ)=i} ϕ and μ_{i,j} ← 1. Until convergence, pick some (i,j) ∈ E: μ̂_{i,j} ← ∑_{C_i − S_{i,j}} β_i (this equals δ^new_{i→j} δ_{j→i}); β_j ← β_j μ̂_{i,j} / μ_{i,j} (this multiplies β_j by δ^new_{i→j} / δ^old_{i→j}); μ_{i,j} ← μ̂_{i,j}. At convergence, the beliefs are calibrated, ∑_{C_i − S_{i,j}} β_i(C_i) = ∑_{C_j − S_{i,j}} β_j(C_j), and so they are proportional to marginals.
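The belief-update loop can be traced on a two-clique example. A sketch with made-up potentials; on a tree with a single edge, one update in each direction calibrates it:

```python
# Beliefs start at the clique potentials; the sepset starts at 1.
beta1 = {(0, 0): 0.5, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 0.5}   # over (A, B)
beta2 = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 1.0}   # over (B, C)
mu = {0: 1.0, 1: 1.0}                                          # over sepset {B}

def update_1_to_2():
    """mu_hat <- sum_A beta1; beta2 <- beta2 * mu_hat / mu; mu <- mu_hat."""
    global mu
    mu_hat = {b: beta1[(0, b)] + beta1[(1, b)] for b in (0, 1)}
    for b in (0, 1):
        for c in (0, 1):
            beta2[(b, c)] *= mu_hat[b] / mu[b]
    mu = mu_hat

def update_2_to_1():
    """mu_hat <- sum_C beta2; beta1 <- beta1 * mu_hat / mu; mu <- mu_hat."""
    global mu
    mu_hat = {b: beta2[(b, 0)] + beta2[(b, 1)] for b in (0, 1)}
    for a in (0, 1):
        for b in (0, 1):
            beta1[(a, b)] *= mu_hat[b] / mu[b]
    mu = mu_hat

update_1_to_2()
update_2_to_1()

# At convergence the two beliefs agree on the sepset (calibration).
for b in (0, 1):
    assert abs((beta1[(0, b)] + beta1[(1, b)])
               - (beta2[(b, 0)] + beta2[(b, 1)])) < 1e-9
```

Dividing by the old sepset μ before multiplying in the new one is exactly the δ^new/δ^old correction in the update rule above.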
