
Probabilistic Graphical Models (CMSC 691, UMBC)



  1. Probabilistic Graphical Models CMSC 691 UMBC

  5. Two Problems for Graphical Models

     $$p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_{C} \psi_C(x_C)$$

     Finding the normalizer:

     $$Z = \sum_{x} \prod_{C} \psi_C(x_C)$$

     Computing the marginals: sum over all variable combinations, with the $x_n$ coordinate fixed:

     $$Z_n(v) = \sum_{x : x_n = v} \prod_{C} \psi_C(x_C)$$

     Example: 3 variables, fix the 2nd dimension:

     $$Z_2(v) = \sum_{x_1} \sum_{x_3} \prod_{C} \psi_C\big(x = (x_1, v, x_3)\big)$$

     Q: Why are these difficult?
     A: Many different combinations.
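The two sums above can be sketched by brute force on a toy model. This is a minimal illustration, not the course's code: the three binary variables, the two cliques, and the potential values are all made-up assumptions. It enumerates every assignment, which is exactly the "many different combinations" cost the slide points at.

```python
from itertools import product

# Hypothetical toy model: 3 binary variables (0-indexed), cliques {x1, x2} and {x2, x3}.
# The potential values below are invented for illustration only.
potentials = {
    (0, 1): lambda a, b: 2.0 if a == b else 1.0,
    (1, 2): lambda a, b: 3.0 if a == b else 1.0,
}
DOMAIN = (0, 1)
NUM_VARS = 3


def unnormalized(x):
    """Product of all clique potentials for one full assignment x."""
    score = 1.0
    for clique, psi in potentials.items():
        score *= psi(*(x[i] for i in clique))
    return score


def normalizer():
    """Z: sum the product of potentials over every assignment."""
    return sum(unnormalized(x) for x in product(DOMAIN, repeat=NUM_VARS))


def unnormalized_marginal(n, v):
    """Z_n(v): same sum, restricted to assignments with x[n] == v."""
    return sum(
        unnormalized(x)
        for x in product(DOMAIN, repeat=NUM_VARS)
        if x[n] == v
    )


Z = normalizer()
# Fix the 2nd dimension (index 1), as in the slide's example.
Z_2 = {v: unnormalized_marginal(1, v) for v in DOMAIN}
print(Z, Z_2)
```

Dividing `Z_2[v]` by `Z` gives the marginal probability that the second variable equals `v`. The loop visits `len(DOMAIN) ** NUM_VARS` assignments, so the cost grows exponentially in the number of variables.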

  9. Probabilistic Graphical Models

     A graph G that represents a probability distribution over random variables $X_1, \ldots, X_N$.

     Graph G = (vertices V, edges E); distribution $p(X_1, \ldots, X_N)$.

     Vertices ↔ random variables. Edges show dependencies among random variables.

     Two main flavors: directed graphical models and undirected graphical models.

  10. Outline: Directed Graphical Models; Undirected Graphical Models; Factor Graphs

  13. Directed Graphical Models

      A directed (acyclic) graph G = (V, E) that represents a probability distribution over random variables $X_1, \ldots, X_N$.

      The joint probability factorizes into factors of $X_i$ conditioned on the parents of $X_i$.

      Benefit: the independence properties are transparent.

      A graph/joint distribution that follows this is a Bayesian network.

  17. Bayesian Networks: Directed Acyclic Graphs

      [Figure: directed acyclic graph over nodes $x_1, \ldots, x_5$]

      $$p(x_1, x_2, x_3, \ldots, x_N) = \prod_{i} p(x_i \mid \pi(x_i))$$

      Here $\pi(x_i)$ means "parents of" $x_i$, and the nodes are indexed by a topological sort.

      Q: What is $p(x_1, x_2, x_3, x_4, x_5)$ for the graph shown?

      $$p(x_1, x_2, x_3, x_4, x_5) = p(x_1)\, p(x_3)\, p(x_2 \mid x_1, x_3)\, p(x_4 \mid x_2, x_3)\, p(x_5 \mid x_2, x_4)$$

      Exact inference in general DAGs is NP-hard. Inference in trees can be exact.
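As a rough sketch of the factorization above, the joint can be evaluated by multiplying one conditional per node. The parent sets are read off the five-variable factorization on the slide; the conditional probability values in `p_one` are hypothetical, chosen only so that each is a valid probability.

```python
from itertools import product

# Parent sets read off p(x1) p(x3) p(x2|x1,x3) p(x4|x2,x3) p(x5|x2,x4).
PARENTS = {1: (), 3: (), 2: (1, 3), 4: (2, 3), 5: (2, 4)}


def p_one(node, parent_values):
    """Hypothetical CPT: P(x_node = 1 | parents). Values are illustrative only."""
    return (1 + node + 2 * sum(parent_values)) / 12.0


def joint(x):
    """x maps node -> 0/1. Multiply one conditional factor per node."""
    prob = 1.0
    for node, parents in PARENTS.items():
        p1 = p_one(node, [x[p] for p in parents])
        prob *= p1 if x[node] == 1 else 1.0 - p1
    return prob


# Because every factor is a normalized conditional, the joint sums to 1
# without any global normalizer.
total = sum(
    joint(dict(zip(range(1, 6), values)))
    for values in product((0, 1), repeat=5)
)
print(total)
```

Unlike the undirected case, no separate normalizer is needed: the product of locally normalized conditionals is already a distribution.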

  18. Directed Graphical Model Notation

      [Figure: directed acyclic graph over nodes $x_1, \ldots, x_5$]

      Unshaded nodes are unobserved (latent) R.V.s. Shaded nodes are observed R.V.s.

  21. D-Separation: Testing for Conditional Independence

      Variables X & Y are conditionally independent given Z if all (undirected) paths from (any variable in) X to (any variable in) Y are d-separated by Z.

      X & Y are d-separated if for all paths P, one of the following is true:

      - P has a chain with an observed middle node (X → Z → Y): observing Z blocks the path from X to Y.
      - P has a fork with an observed parent node (X ← Z → Y): observing Z blocks the path from X to Y.
      - P includes a "v-structure" or "collider" with all unobserved descendants (X → Z ← Y): not observing Z blocks the path from X to Y.

      For the collider:

      $$p(x, y, z) = p(x)\, p(y)\, p(z \mid x, y)$$

      $$p(x, y) = \sum_{z} p(x)\, p(y)\, p(z \mid x, y) = p(x)\, p(y)$$
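The collider case can be checked numerically. The sketch below assumes a hypothetical three-node network X → Z ← Y with binary variables and made-up probabilities; it confirms that X and Y are independent when Z is summed out, and become dependent once Z is observed.

```python
from itertools import product

# Hypothetical collider X -> Z <- Y; all numbers are illustrative assumptions.
p_x = {0: 0.7, 1: 0.3}
p_y = {0: 0.4, 1: 0.6}


def p_z_given(z, x, y):
    """P(z | x, y): Z is likely 1 when X and Y differ (a noisy XOR)."""
    p1 = 0.9 if x != y else 0.1
    return p1 if z == 1 else 1.0 - p1


def joint(x, y, z):
    return p_x[x] * p_y[y] * p_z_given(z, x, y)


def p_xy(x, y):
    """Marginalize out the unobserved collider Z."""
    return sum(joint(x, y, z) for z in (0, 1))


pairs = list(product((0, 1), repeat=2))

# Unobserved Z: the path is blocked, so p(x, y) = p(x) p(y).
marginally_independent = all(
    abs(p_xy(x, y) - p_x[x] * p_y[y]) < 1e-12 for x, y in pairs
)

# Observed Z = 1: the path opens, so p(x, y | z) != p(x | z) p(y | z).
p_z1 = sum(joint(x, y, 1) for x, y in pairs)
p_xy_z1 = {(x, y): joint(x, y, 1) / p_z1 for x, y in pairs}
p_x_z1 = {x: p_xy_z1[(x, 0)] + p_xy_z1[(x, 1)] for x in (0, 1)}
p_y_z1 = {y: p_xy_z1[(0, y)] + p_xy_z1[(1, y)] for y in (0, 1)}
dependent_given_z = any(
    abs(p_xy_z1[(x, y)] - p_x_z1[x] * p_y_z1[y]) > 1e-6 for x, y in pairs
)

print(marginally_independent, dependent_given_z)
```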

  22. Markov Blanket

      The set of nodes needed to form the complete conditional for a variable $x_i$.

      $$p(x_i \mid x_{j \neq i}) = \frac{p(x_1, \ldots, x_N)}{\int p(x_1, \ldots, x_N)\, dx_i}$$

      $$= \frac{\prod_{k} p(x_k \mid \pi(x_k))}{\int \prod_{k} p(x_k \mid \pi(x_k))\, dx_i} \quad \text{(factorization of graph)}$$

      $$= \frac{\prod_{k : k = i \text{ or } i \in \pi(x_k)} p(x_k \mid \pi(x_k))}{\int \prod_{k : k = i \text{ or } i \in \pi(x_k)} p(x_k \mid \pi(x_k))\, dx_i} \quad \text{(factor out terms not dependent on } x_i\text{)}$$

      The Markov blanket of a node x is its parents, children, and children's parents.

      (In this example, shading does not show observed/latent.)
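The definition of the Markov blanket can be sketched as a small set computation. As an assumption for illustration, the example reuses the five-node parent structure from the earlier Bayesian network factorization, not the graph pictured on this slide; `markov_blanket` is a hypothetical helper name.

```python
# Parent sets for the five-node example network (an assumed reuse of the
# earlier factorization p(x1) p(x3) p(x2|x1,x3) p(x4|x2,x3) p(x5|x2,x4)).
PARENTS = {1: set(), 3: set(), 2: {1, 3}, 4: {2, 3}, 5: {2, 4}}


def markov_blanket(node, parents):
    """Parents, children, and children's other parents of `node`."""
    # Children: every k whose factor p(x_k | pi(x_k)) mentions `node`.
    children = {k for k, ps in parents.items() if node in ps}
    # Co-parents: all parents of those children (node itself removed below).
    co_parents = set().union(*(parents[c] for c in children))
    return (parents[node] | children | co_parents) - {node}


for n in sorted(PARENTS):
    print(n, sorted(markov_blanket(n, PARENTS)))
```

These are exactly the nodes appearing in the factors that survive in the last line of the derivation: the factor for $x_i$ itself brings in its parents, and each child's factor brings in the child and its other parents.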

  23. Outline: Directed Graphical Models; Undirected Graphical Models; Factor Graphs

  26. Undirected Graphical Models

      An undirected graph G = (V, E) that represents a probability distribution over random variables $X_1, \ldots, X_N$.

      The joint probability factorizes based on cliques in the graph.

      Common name: Markov Random Fields.

      Undirected graphs can have an alternative formulation as Factor Graphs.

  31. Markov Random Fields: Undirected Graphs

      clique: a subset of nodes, where nodes are pairwise connected

      maximal clique: a clique that cannot add a node and remain a clique

      $$p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_{C} \psi_C(x_C)$$

      Here $Z$ is the global normalization, the product runs over the maximal cliques $C$, $x_C$ are the variables that are part of the clique $C$, and $\psi_C$ is a potential function (not necessarily a probability!).

      Q: What restrictions should we place on the potentials $\psi_C$?
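The clique and maximal-clique definitions can be illustrated with a brute-force enumeration. This is only a sketch on a hypothetical four-node graph; it checks every subset of nodes, which is fine for a toy example but not how one would handle large graphs.

```python
from itertools import combinations

# Hypothetical undirected graph, chosen only for illustration.
EDGES = {(1, 2), (1, 3), (2, 3), (3, 4)}
NODES = sorted({n for edge in EDGES for n in edge})


def connected(a, b):
    """Undirected adjacency test."""
    return (a, b) in EDGES or (b, a) in EDGES


def is_clique(nodes):
    """A clique: every pair of nodes in the subset is connected."""
    return all(connected(a, b) for a, b in combinations(nodes, 2))


def maximal_cliques():
    """Cliques that are not a strict subset of any other clique."""
    cliques = [
        set(c)
        for size in range(1, len(NODES) + 1)
        for c in combinations(NODES, size)
        if is_clique(c)
    ]
    return [c for c in cliques if not any(c < other for other in cliques)]


print(sorted(sorted(c) for c in maximal_cliques()))
```

For this graph the factorization would use one potential per maximal clique, for example one over the triangle and one over the remaining edge.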
