1. Graphical models from an algebraic perspective
Elina Robeva, MIT
ICERM Nonlinear Algebra Bootcamp, September 11, 2018

2. Overview
• Undirected graphical models
  • Definition and parametric description
  • Markov properties and implicit description
  • Discrete and Gaussian
• Directed graphical models
  • Definition and parametric description
  • Markov properties, d-separation, and implicit description
  • Discrete and Gaussian
  • Model equivalence
• Mixed graphical models

3. Undirected graphical models
Let G = (V, E) be an undirected graph and C(G) the set of maximal cliques of G. Let (X_v : v ∈ V) ∈ 𝒳 := ∏_{v ∈ V} 𝒳_v be a random vector. Notation: 𝒳_A = ∏_{v ∈ A} 𝒳_v, X_A = (X_v : v ∈ A), x_A = (x_v : v ∈ A).
For each C ∈ C(G), let φ_C : 𝒳_C → ℝ_{≥0} be a continuous function called a clique potential. The undirected graphical model (or Markov random field) corresponding to G and 𝒳 is the set of all probability density functions on 𝒳 of the form

  p(x) = (1/Z) ∏_{C ∈ C(G)} φ_C(x_C),

where

  Z = ∫_𝒳 ∏_{C ∈ C(G)} φ_C(x_C) dμ(x)

is the normalizing constant.

4. Undirected graphical models
Example (the star graph with maximal cliques {1,2}, {1,3}, {1,4}):

  p(x_1, x_2, x_3, x_4) = (1/Z) φ_12(x_1, x_2) φ_13(x_1, x_3) φ_14(x_1, x_4).

Example (the graph with maximal cliques {1,2,3}, {2,5}, {3,4}, {4,5}):

  p(x_1, x_2, x_3, x_4, x_5) = (1/Z) φ_123(x_1, x_2, x_3) φ_25(x_2, x_5) φ_34(x_3, x_4) φ_45(x_4, x_5).

5. Discrete undirected graphical models
Suppose that 𝒳_v = [r_v], r_v ∈ ℕ. Then X ∈ 𝒳 = ∏_{v ∈ V} [r_v]. We use parameters

  θ^C_{x_C} := φ_C(x_C),   C ∈ C(G), x_C ∈ 𝒳_C.

Then we get the rational parametrization

  p_x = (1/Z(θ)) ∏_{C ∈ C(G)} θ^C_{x_C}.

The graphical model corresponding to G consists of all discrete distributions p = (p_x : x ∈ 𝒳) that factor in this way.
Example (the star graph with center 1). Let r_1 = r_2 = r_3 = r_4 = 2. The parametrization has the form

  p_{x_1 x_2 x_3 x_4} = (1/Z(θ)) θ^{(12)}_{x_1 x_2} θ^{(13)}_{x_1 x_3} θ^{(14)}_{x_1 x_4}.

The ideal I_G is the ideal of the image of this parametrization.

6. Discrete undirected graphical models
Computing I_G in Macaulay2 for the example above (a = θ^(12), b = θ^(13), c = θ^(14)):

  S = QQ[a_(1,1)..a_(2,2), b_(1,1)..b_(2,2), c_(1,1)..c_(2,2)]
  R = QQ[p_(1,1,1,1)..p_(2,2,2,2)]
  L = {}
  for i from 0 to 15 do (
      s = last baseName (vars R)_(0,i);  -- the index (x_1,x_2,x_3,x_4) of the i-th variable
      L = append(L, a_(s_0,s_1) * b_(s_0,s_2) * c_(s_0,s_3))
  )
  phi = map(S, R, L)
  I = ker phi

Output:

  I_G = ⟨2-minors of M_1⟩ + ⟨2-minors of M_2⟩ + ⟨2-minors of M_3⟩ + ⟨2-minors of M_4⟩,

where

  M_1 = ( p_0000  p_0001  p_0010  p_0011 ),   M_2 = ( p_1000  p_1001  p_1010  p_1011 ),
        ( p_0100  p_0101  p_0110  p_0111 )          ( p_1100  p_1101  p_1110  p_1111 )

  M_3 = ( p_0000  p_0001  p_0100  p_0101 ),   M_4 = ( p_1000  p_1001  p_1100  p_1101 ).
        ( p_0010  p_0011  p_0110  p_0111 )          ( p_1010  p_1011  p_1110  p_1111 )
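
To see that these generators really lie in the kernel, one can check containment of one family of minors directly (a quick sketch in the same Macaulay2 session; the name M1 and the choice of flattening are ours):

  -- flattening that fixes x_1 at its first value: rows indexed by x_2,
  -- columns by (x_3, x_4); its 2-minors should lie in I
  M1 = matrix{{p_(1,1,1,1), p_(1,1,1,2), p_(1,1,2,1), p_(1,1,2,2)},
              {p_(1,2,1,1), p_(1,2,1,2), p_(1,2,2,1), p_(1,2,2,2)}}
  isSubset(minors(2, M1), I)  -- expect: true

The flattenings corresponding to M_2, M_3, M_4 can be checked the same way.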

7. Gaussian undirected graphical models
Let X = (X_v : v ∈ V) ∼ N(μ, Σ) be a Gaussian random vector and set K = Σ^{-1}. The density of X is

  p(x) = (1/Z) exp( −(1/2) (x − μ)^T K (x − μ) ).

When does it factorize according to G = (V, E), i.e. p(x) = (1/Z) ∏_{C ∈ C(G)} φ_C(x_C)? Expanding the quadratic form,

  p(x) = (1/Z) ∏_{v ∈ V} exp( −(1/2) (x_v − μ_v)^2 K_vv ) ∏_{u ≠ v} exp( −(1/2) (x_v − μ_v)(x_u − μ_u) K_vu ).

Each pairwise factor depends only on (x_u, x_v), so this is a factorization over the cliques of G precisely when the factors for non-edges are trivial. Hence the density factorizes according to G = (V, E) if and only if K_uv = 0 for all (u, v) ∉ E.
The parametric description of the Gaussian graphical model with respect to G = (V, E) is

  M_G = { Σ = K^{-1} : K ≻ 0 and K_uv = 0 for all (u, v) ∉ E }.

The ideal of the model I_G is the ideal of the image of this parametrization.
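
For small graphs, one standard way to compute I_G in Macaulay2 is to eliminate the entries of K from the equations KΣ = Id. A minimal sketch, assuming the path graph 1-2-3 (so the only non-edge is (1,3) and K_13 = 0; all variable names here are ours):

  -- concentration matrix with the pattern of zeros imposed by G
  R = QQ[k11,k12,k22,k23,k33, s11,s12,s13,s22,s23,s33]
  K   = matrix{{k11,k12,0},{k12,k22,k23},{0,k23,k33}}
  Sig = matrix{{s11,s12,s13},{s12,s22,s23},{s13,s23,s33}}
  -- K*Sig = Id cuts out the graph of the map K |-> Sigma = K^(-1)
  J = ideal(K*Sig - id_(R^3))
  -- eliminating the k's should leave the vanishing ideal I_G in the sigma's
  I = eliminate({k11,k12,k22,k23,k33}, J)

This should recover the single generator s12*s23 − s13*s22 (up to sign), i.e. the conditional independence statement X_1 ⊥⊥ X_3 | X_2.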

8. Markov properties and conditional independence for undirected graphical models
A different way to define undirected graphical models is via conditional independence statements. Let G = (V, E). For A, B, C ⊆ V, say that A and B are separated by C if every path between a ∈ A and b ∈ B goes through a vertex in C. The global Markov property associated to G consists of all conditional independence statements

  X_A ⊥⊥ X_B | X_C

for all disjoint sets A, B, C such that C separates A and B.
Example (the star graph with center 1). Global Markov property:

  X_2 ⊥⊥ X_3 | X_1,   X_2 ⊥⊥ X_4 | X_1,   X_3 ⊥⊥ X_4 | X_1.
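
The GraphicalModels package automates this translation from separation statements to polynomial constraints. A sketch for the star-graph example, assuming the package's globalMarkov and conditionalIndependenceIdeal interface:

  needsPackage "GraphicalModels"
  -- the star graph with center 1 from the example
  G = graph{{1,2},{1,3},{1,4}}
  -- global Markov statements, e.g. {{2},{3},{1}} encodes X_2 _||_ X_3 | X_1
  stmts = globalMarkov G
  -- the conditional independence ideal for binary variables,
  -- in the ring QQ[p_(1,1,1,1)..p_(2,2,2,2)]
  R = markovRing(2,2,2,2)
  I = conditionalIndependenceIdeal(R, stmts)

For this (decomposable) graph, the resulting ideal should agree with the vanishing ideal I_G computed earlier.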

9. Conditional independence for discrete distributions
For discrete random variables, conditional independence yields polynomial equations in (p_x : x ∈ 𝒳). How?
Example. If V = {1, 2} and 𝒳 = [m_1] × [m_2], then X_1 ⊥⊥ X_2 is the same as

  p_{ij} = p_{i+} p_{+j}   for all i ∈ [m_1], j ∈ [m_2],

where p_{i+} = Σ_j p_{ij} and p_{+j} = Σ_i p_{ij} are the marginals. Equivalently, the matrix

  P = (p_{ij}) = (p_{1+}, …, p_{m_1 +})^T (p_{+1}, …, p_{+m_2})

has rank 1. So, equivalently, its 2 × 2 minors vanish, i.e.

  p_{ij} p_{kℓ} − p_{iℓ} p_{kj} = 0   for all i, k ∈ [m_1], j, ℓ ∈ [m_2].
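
In Macaulay2 the rank-1 description makes the independence ideal immediate (a minimal sketch; the table size m_1 = 3, m_2 = 4 is our choice):

  -- the ring of a 3 x 4 probability table
  R = QQ[p_(1,1)..p_(3,4)]
  -- arrange the 12 variables into the matrix P = (p_ij)
  P = transpose genericMatrix(R, 4, 3)
  -- X_1 _||_ X_2  <=>  rank P <= 1  <=>  all 2 x 2 minors vanish
  I = minors(2, P)

Here I is generated by the 18 binomials p_{ij} p_{kℓ} − p_{iℓ} p_{kj}.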

10. Conditional independence for discrete distributions
Proposition. Let X be a discrete random vector with sample space 𝒳 = ∏_{i=1}^{n} [m_i]. Then for disjoint sets A, B, C ⊂ [n], we have X_A ⊥⊥ X_B | X_C if and only if

  p_{i_A i_B i_C +} p_{j_A j_B i_C +} − p_{i_A j_B i_C +} p_{j_A i_B i_C +} = 0

for all i_A ≠ j_A ∈ 𝒳_A, i_B ≠ j_B ∈ 𝒳_B, i_C ∈ 𝒳_C, where the subscript + indicates summing over the coordinates outside A ∪ B ∪ C.
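
The same package can produce these binomials from a single statement. A sketch (assuming the statement format {A, B, C} accepted by conditionalIndependenceIdeal) for X_1 ⊥⊥ X_2 | X_3 with three binary variables:

  needsPackage "GraphicalModels"
  R = markovRing(2,2,2)
  -- the single statement X_1 _||_ X_2 | X_3
  I = conditionalIndependenceIdeal(R, {{{1},{2},{3}}})
  -- expect generators p_(1,1,k)*p_(2,2,k) - p_(1,2,k)*p_(2,1,k) for k = 1,2

Here A ∪ B ∪ C = [n], so no + marginalization is needed.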
