Probabilistic Graphical Models CMSC 691 UMBC
Two Problems for Graphical Models
$$p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_C)$$
Problem 1: Finding the normalizer
$$Z = \sum_{x} \prod_C \psi_C(x_C)$$
Problem 2: Computing the marginals. Sum over all variable combinations, with the $x_n$ coordinate fixed:
$$Z_n(v) = \sum_{x : x_n = v} \prod_C \psi_C(x_C)$$
Example: 3 variables, fix the 2nd dimension
$$Z_2(v) = \sum_{x_1} \sum_{x_3} \prod_C \psi_C(x = (x_1, v, x_3))$$
Q: Why are these difficult?
A: Many different combinations. With $N$ variables that each take $K$ values, the sums range over on the order of $K^N$ joint assignments.
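To make the two sums concrete, here is a minimal brute-force sketch. The three-variable chain, the potentials `psi_12` and `psi_23`, and their values are hypothetical choices for illustration only; they do not come from the slides. The code enumerates every joint assignment, which is exactly why the approach stops scaling as the number of variables grows.

```python
import itertools

# Hypothetical toy model: three binary variables in a chain x1 - x2 - x3,
# with one potential per edge. The potential values are made up.
def psi_12(x1, x2):
    return 2.0 if x1 == x2 else 1.0

def psi_23(x2, x3):
    return 3.0 if x2 == x3 else 1.0

def unnormalized(x):
    """Product of all potentials for one joint assignment x = (x1, x2, x3)."""
    x1, x2, x3 = x
    return psi_12(x1, x2) * psi_23(x2, x3)

def all_assignments(num_vars=3, num_values=2):
    """Every joint assignment: num_values ** num_vars of them."""
    return itertools.product(range(num_values), repeat=num_vars)

def normalizer():
    """Z: sum of the potential product over all assignments."""
    return sum(unnormalized(x) for x in all_assignments())

def unnormalized_marginal(n, v):
    """Z_n(v): same sum, restricted to assignments whose n-th coordinate is v."""
    return sum(unnormalized(x) for x in all_assignments() if x[n] == v)

Z = normalizer()
# Index 1 is the 2nd variable, matching the "fix the 2nd dimension" example.
Z_2 = [unnormalized_marginal(1, v) for v in (0, 1)]
marginal_2 = [z / Z for z in Z_2]
```

Dividing `Z_2` by `Z` gives the normalized marginal of the second variable. Both loops visit all 2^3 assignments here; with N variables of K values each, the same loops would visit K^N.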
Probabilistic Graphical Models
A graph G that represents a probability distribution over random variables $X_1, \ldots, X_N$
Graph G = (vertices V, edges E); distribution $p(X_1, \ldots, X_N)$
Vertices correspond to random variables
Edges show dependencies among random variables
Two main flavors: directed graphical models and undirected graphical models
Outline Directed Graphical Models Undirected Graphical Models Factor Graphs
Directed Graphical Models
A directed (acyclic) graph G = (V, E) that represents a probability distribution over random variables $X_1, \ldots, X_N$
Joint probability factorizes into factors of $X_i$ conditioned on the parents of $X_i$
Benefit: the independence properties are transparent and can be read off the graph
A graph/joint distribution that follows this is a Bayesian network
Bayesian Networks: Directed Acyclic Graphs
[Figure: a directed acyclic graph over the nodes $x_1, \ldots, x_5$]
$$p(x_1, x_2, x_3, \ldots, x_N) = \prod_i p(x_i \mid \pi(x_i))$$
where $\pi(x_i)$ denotes the "parents of" $x_i$, with the nodes taken in topological sort order.
For the example graph:
$$p(x_1, x_2, x_3, x_4, x_5) = p(x_1)\, p(x_3)\, p(x_2 \mid x_1, x_3)\, p(x_4 \mid x_2, x_3)\, p(x_5 \mid x_2, x_4)$$
Exact inference in general DAGs is NP-hard
Inference in trees can be exact
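The factorization for the five-node example can be sketched in code. This is an illustration under assumptions: the variables are taken to be binary, and the conditional probability tables are filled with random numbers, since the slides give no values. Only the parent structure comes from the factorization above.

```python
import itertools
import random

# Parent sets read off the factorization
# p(x1) p(x3) p(x2|x1,x3) p(x4|x2,x3) p(x5|x2,x4).
PARENTS = {1: (), 3: (), 2: (1, 3), 4: (2, 3), 5: (2, 4)}

def random_cpt(num_parents, rng):
    """Hypothetical CPT: maps each parent assignment to P(child = 1 | parents)."""
    return {pa: rng.random() for pa in itertools.product((0, 1), repeat=num_parents)}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
CPTS = {i: random_cpt(len(pa), rng) for i, pa in PARENTS.items()}

def joint(x):
    """Joint probability of x (dict: node index -> 0/1) as a product of local factors."""
    prob = 1.0
    for i, pa in PARENTS.items():
        p_one = CPTS[i][tuple(x[j] for j in pa)]
        prob *= p_one if x[i] == 1 else 1.0 - p_one
    return prob

# Each factor is locally normalized, so the joint sums to 1 with no global Z.
total = sum(
    joint(dict(zip(range(1, 6), values)))
    for values in itertools.product((0, 1), repeat=5)
)
```

Because every factor is a proper conditional distribution, `total` comes out as 1 up to floating-point error, which is one practical difference from the globally normalized form with an explicit normalizer.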
Directed Graphical Model Notation
[Figure: the example graph over the nodes $x_1, \ldots, x_5$]
Shaded nodes are observed R.V.s
Unshaded nodes are unobserved (latent) R.V.s
D-Separation: Testing for Conditional Independence
Variables X & Y are conditionally independent given Z if all (undirected) paths from (any variable in) X to (any variable in) Y are d-separated by Z.
X & Y are d-separated if for all paths P, one of the following is true:
- P has a chain with an observed middle node (X → Z → Y): observing Z blocks the path from X to Y
- P has a fork with an observed parent node (X ← Z → Y): observing Z blocks the path from X to Y
- P includes a "v-structure" or "collider" with all unobserved descendants (X → Z ← Y): not observing Z blocks the path from X to Y
For the collider:
$$p(x, y, z) = p(x)\, p(y)\, p(z \mid x, y)$$
$$p(x, y) = \sum_z p(x)\, p(y)\, p(z \mid x, y) = p(x)\, p(y)$$
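The collider case can be checked numerically. The sketch below assumes binary variables and uses made-up probability tables (`p_x`, `p_y`, `p_z1` are hypothetical, not from the slides). It verifies that X and Y are independent when Z is summed out, and that they become dependent once Z is observed.

```python
import itertools

# Hypothetical tables for the collider X -> Z <- Y with binary variables.
p_x = {0: 0.7, 1: 0.3}
p_y = {0: 0.4, 1: 0.6}
p_z1 = {(0, 0): 0.1, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.95}  # P(z = 1 | x, y)

def joint(x, y, z):
    """p(x, y, z) = p(x) p(y) p(z | x, y)."""
    pz = p_z1[(x, y)] if z == 1 else 1.0 - p_z1[(x, y)]
    return p_x[x] * p_y[y] * pz

def marginal_xy(x, y):
    """p(x, y), obtained by summing z out (Z unobserved)."""
    return sum(joint(x, y, z) for z in (0, 1))

def conditional_xy_given_z(x, y, z):
    """p(x, y | z), obtained by renormalizing the joint at a fixed z (Z observed)."""
    norm = sum(joint(a, b, z) for a, b in itertools.product((0, 1), repeat=2))
    return joint(x, y, z) / norm

# Z unobserved: p(x, y) equals p(x) p(y) for every assignment.
marginally_independent = all(
    abs(marginal_xy(x, y) - p_x[x] * p_y[y]) < 1e-12
    for x, y in itertools.product((0, 1), repeat=2)
)

# Z observed (z = 1): compare p(x=1, y=1 | z) with p(x=1 | z) p(y=1 | z).
p_x1_given_z = sum(conditional_xy_given_z(1, y, 1) for y in (0, 1))
p_y1_given_z = sum(conditional_xy_given_z(x, 1, 1) for x in (0, 1))
dependence_gap = abs(conditional_xy_given_z(1, 1, 1) - p_x1_given_z * p_y1_given_z)
```

A nonzero `dependence_gap` shows that conditioning on the collider opens the path between X and Y, the reverse of the chain and fork cases.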
Markov Blanket
The set of nodes needed to form the complete conditional for a variable $x_i$
$$p(x_i \mid x_{j \neq i}) = \frac{p(x_1, \ldots, x_N)}{\int p(x_1, \ldots, x_N)\, dx_i}$$
Factorization of graph:
$$= \frac{\prod_k p(x_k \mid \pi(x_k))}{\int \prod_k p(x_k \mid \pi(x_k))\, dx_i}$$
Factor out terms not dependent on $x_i$:
$$= \frac{\prod_{k : k = i \text{ or } i \in \pi(x_k)} p(x_k \mid \pi(x_k))}{\int \prod_{k : k = i \text{ or } i \in \pi(x_k)} p(x_k \mid \pi(x_k))\, dx_i}$$
The Markov blanket of a node x is its parents, children, and children's parents.
(In this example, shading does not show observed/latent.)
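The definition (parents, children, and children's parents) translates directly into a few set operations. The sketch below reuses the parent structure of the five-node Bayesian network example; the function name `markov_blanket` and the string node labels are illustrative choices, not part of the slides.

```python
# Parent sets of the five-node example: x2 has parents x1 and x3, and so on.
PARENTS = {
    "x1": set(),
    "x3": set(),
    "x2": {"x1", "x3"},
    "x4": {"x2", "x3"},
    "x5": {"x2", "x4"},
}

def markov_blanket(node, parents):
    """Parents, children, and children's other parents of `node`."""
    children = {k for k, pa in parents.items() if node in pa}
    co_parents = set()
    for child in children:
        co_parents |= parents[child]
    # The node itself shows up among its children's parents, so remove it.
    return (parents[node] | children | co_parents) - {node}

blanket_x1 = markov_blanket("x1", PARENTS)
blanket_x2 = markov_blanket("x2", PARENTS)
blanket_x5 = markov_blanket("x5", PARENTS)
```

These are the same factors that survive in the last line of the derivation: the factor for the node itself (which brings in its parents) and the factors of each child (which bring in the child and its other parents).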
Undirected Graphical Models
An undirected graph G = (V, E) that represents a probability distribution over random variables $X_1, \ldots, X_N$
Joint probability factorizes based on cliques in the graph
Common name: Markov Random Fields
Undirected graphs can have an alternative formulation as Factor Graphs
Markov Random Fields: Undirected Graphs
clique: subset of nodes, where nodes are pairwise connected
maximal clique: a clique that cannot add a node and remain a clique
$$p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_C)$$
- $Z$: global normalization
- $C$: ranges over the maximal cliques
- $\psi_C$: potential function (not necessarily a probability!)
- $x_C$: variables part of the clique C
Q: What restrictions should we place on the potentials $\psi_C$?
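The two clique definitions can be illustrated with a brute-force sketch. The four-node graph below is hypothetical, and enumerating all subsets is only workable for tiny graphs; it is meant to mirror the definitions, not to be an efficient clique-finding method.

```python
import itertools

# Hypothetical undirected graph: a triangle a-b-c plus a pendant edge c-d.
EDGES = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")}
NODES = sorted({node for edge in EDGES for node in edge})

def connected(u, v):
    """Edges are undirected, so check both orientations."""
    return (u, v) in EDGES or (v, u) in EDGES

def is_clique(nodes):
    """A clique: every pair of nodes in the subset is connected."""
    return all(connected(u, v) for u, v in itertools.combinations(nodes, 2))

def maximal_cliques():
    """Cliques that are not a strict subset of any other clique."""
    cliques = [
        set(subset)
        for size in range(1, len(NODES) + 1)
        for subset in itertools.combinations(NODES, size)
        if is_clique(subset)
    ]
    return [c for c in cliques if not any(c < other for other in cliques)]

result = sorted(sorted(c) for c in maximal_cliques())
```

For this graph the maximal cliques are the triangle and the pendant edge, so a factorization over maximal cliques would use one potential on (a, b, c) and one on (c, d).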