
Identifiability of Gaussian DAG models with one latent source


  1. Identifiability of Gaussian DAG models with one latent source
     Hisayuki Hara, Niigata University
     http://www.econ.niigata-u.ac.jp/~hara/
     hara@econ.niigata-u.ac.jp
     Joint work with Dennis Leung and Mathias Drton
     H. Hara (Niigata U.) Identifiability of factor analysis models Oct 3, 2015 1 / 26

  2. Contents
     1. Model setup and identifiability
     2. Prior work
     3. Graphical criteria based on the Jacobian of $\phi_G$
     4. Another criterion

  4. DAG model with one latent variable
     $X = \Lambda^T X + \delta L + \epsilon$
     $X = (X_1, \dots, X_m)^T$: observable variables
     $L$: a latent variable, $L \sim N(0, 1)$
     $\epsilon = (\epsilon_1, \dots, \epsilon_m)^T$, $\epsilon \sim N(0, \Omega)$, $\Omega = \mathrm{diag}(\omega_1, \dots, \omega_m)$
     $\Lambda = \{\lambda_{vw}\}$: strictly upper triangular
     $\delta = (\delta_1, \delta_2, \dots, \delta_m)^T$: factor loadings
     $X \sim N(0, \Sigma)$, $\Sigma = (I_m - \Lambda^T)^{-1} (\Omega + \delta\delta^T) (I_m - \Lambda)^{-1}$
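
The covariance formula on this slide follows from the structural equation in a few steps. The derivation below is a sketch that assumes $L$ and $\epsilon$ are independent, which is standard for this model but is not stated explicitly on the slide.

```latex
\begin{align*}
(I_m - \Lambda^T) X &= \delta L + \epsilon
  && \text{(move $\Lambda^T X$ to the left-hand side)} \\
X &= (I_m - \Lambda^T)^{-1} (\delta L + \epsilon)
  && \text{($I_m - \Lambda^T$ is unit triangular, hence invertible)} \\
\operatorname{Var}(\delta L + \epsilon) &= \delta\delta^T + \Omega
  && \text{(assumes $L$ and $\epsilon$ independent, $\operatorname{Var}(L) = 1$)} \\
\Sigma = \operatorname{Var}(X)
  &= (I_m - \Lambda^T)^{-1} (\Omega + \delta\delta^T) (I_m - \Lambda)^{-1}
  && \text{(since $((I_m - \Lambda^T)^{-1})^T = (I_m - \Lambda)^{-1}$)}
\end{align*}
```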

  5. The DAG model with a latent variable
     $X = \Lambda^T X + \delta L + \epsilon$
     A factor analysis model with one latent variable and a DAG structure among the observable variables.
     [Figure: the graphs $G^*$ (with latent node $L$) and $G$, each on observed nodes 1 to 6.]
     $G = (V, E)$: DAG for the observable variables
     $G^*$: DAG for the model with the latent variable

  6. Parametrization map
     $\theta := (\Lambda, \Omega, \delta) \in \Theta := \mathbb{R}^{|E|} \times \mathbb{R}^m_{>0} \times \mathbb{R}^m$, so $\dim \Theta = |E| + 2m$.
     Parametrization map: $\phi_G : \theta \mapsto (I_m - \Lambda^T)^{-1} (\Omega + \delta\delta^T) (I_m - \Lambda)^{-1}$.
     Since $\Lambda$ is strictly upper triangular, $(I_m - \Lambda)^{-1} = I_m + \Lambda + \Lambda^2 + \dots + \Lambda^{m-1}$.
     Hence $\phi_G$ is a polynomial map in $\theta$.
     The model is called globally identifiable when $\phi_G$ is one-to-one.

  7. Identifiability of models
     $\phi_G(\Lambda, \Omega, \delta) = \phi_G(\Lambda, \Omega, -\delta)$, so the model is not globally identifiable.
     When $\Lambda = 0$, $\phi_G$ is 2-to-1 (Anderson and Rubin, 1956).
     When $\Lambda \neq 0$, $\phi_G$ is not necessarily 2-to-1: it can be $\infty$-to-1, or generically $k$-to-1 with $2 < k < \infty$.
     Generically finite identifiability: when $\phi_G$ is generically finite-to-one, the model is called generically finite identifiable (GFI).
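
The sign symmetry $\phi_G(\Lambda, \Omega, \delta) = \phi_G(\Lambda, \Omega, -\delta)$ can be checked in exact arithmetic. The sketch below evaluates $\phi_G$ with the finite series for $(I_m - \Lambda)^{-1}$; the 3-node DAG and all parameter values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(m):
    return [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]

def inv_unit_upper(Lam):
    """(I - Lam)^{-1} = I + Lam + ... + Lam^{m-1} for strictly upper triangular Lam."""
    m = len(Lam)
    total, power = identity(m), identity(m)
    for _ in range(m - 1):
        power = matmul(power, Lam)
        total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, power)]
    return total

def phi(Lam, omega, delta):
    """Sigma = (I - Lam^T)^{-1} (Omega + delta delta^T) (I - Lam)^{-1}."""
    m = len(omega)
    B = inv_unit_upper(Lam)  # (I - Lam)^{-1}; its transpose is (I - Lam^T)^{-1}
    core = [[(omega[i] if i == j else 0) + delta[i] * delta[j] for j in range(m)]
            for i in range(m)]
    return matmul(matmul(transpose(B), core), B)

# Hypothetical DAG 1 -> 2 -> 3 with illustrative parameter values.
F = Fraction
Lam = [[F(0), F(1, 2), F(0)],
       [F(0), F(0), F(-2, 3)],
       [F(0), F(0), F(0)]]
omega = [F(1), F(2), F(3)]
delta = [F(1, 2), F(-1), F(3, 4)]

sigma_plus = phi(Lam, omega, delta)
sigma_minus = phi(Lam, omega, [-d for d in delta])
print(sigma_plus == sigma_minus)  # True: delta enters only through delta delta^T
```

Exact rationals are used so that the equality test is not affected by rounding.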

  8. Computational algebraic considerations
     $F(\theta, \Sigma) := (I_m - \Lambda^T)^{-1} (\delta\delta^T + \Omega) (I_m - \Lambda)^{-1} - \Sigma$
     $f_{ij}(\theta)$: the $(i, j)$ element of $F(\theta, \Sigma)$
     $I_G$: the ideal generated by $\{ f_{ij}(\theta) : i \le j \}$, that is, $I_G = \langle f_{11}, f_{12}, \dots, f_{mm} \rangle$
     Proposition (e.g. Cox et al.): When $I_G$ is zero-dimensional, $F(\theta, \Sigma) = 0$ has at most finitely many solutions.
     Question: Under what conditions on $G$ (or $G^*$) is $\phi_G$ finite-to-one?

  10. $G_{\mathrm{con}}$ and $G_{|L,\mathrm{cov}}$
     $G_{\mathrm{con}} = (V, E_{\mathrm{con}})$: the conditional independence graph of $G$
     $G_{|L,\mathrm{cov}} = (V, E_{|L,\mathrm{cov}})$: represents the marginal dependency of variable pairs after conditioning on $L$
     $G_{\mathrm{con}}$ and $G_{|L,\mathrm{cov}}$ are easily obtained from $G$.
     $G^c_{\mathrm{con}} = (V, E^c_{\mathrm{con}})$: the complementary graph of $G_{\mathrm{con}} = (V, E_{\mathrm{con}})$
     $G^c_{|L,\mathrm{cov}} = (V, E^c_{|L,\mathrm{cov}})$: the complementary graph of $G_{|L,\mathrm{cov}}$

  11. SW condition
     Theorem (Stanghellini and Wermuth): Suppose that $G$ satisfies either of the following conditions:
       1. every connected component of $G^c_{\mathrm{con}}$ has an odd cycle;
       2. every connected component of $G^c_{|L,\mathrm{cov}}$ has an odd cycle.
     Then the model defined by $G^*$ is generically finite identifiable.
     The SW condition is applicable to any DAG model with one latent variable.

  12. SW condition
     m               4    5     6
     GFI models      5   95  3344
     SW condition    5   49   985
     The number of GFI models for $m = 4, 5, 6$ was computed with Singular.
     The SW condition misses many GFI models: it certifies only 49 of 95 for $m = 5$ and 985 of 3344 for $m = 6$.
     Here we provide better conditions.

  14. Complementary graph of a DAG $G$
     $G$: DAG
     $\bar{G}$: the complementary graph of the undirected graph obtained by replacing all directed edges of $G$ with undirected edges

  15. Theorem
     Theorem 1: If every connected component of $\bar{G}$ has an odd cycle, the model defined by $G^*$ is generically finite identifiable.
     Example: $\bar{G}$ has one connected component, and $2 - 4 - 5 - 2$ and $3 - 4 - 5 - 3$ are odd cycles, so the model associated with $G$ is GFI.
     [Figure: the graphs $G^*$, $G$, and $\bar{G}$.]
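
The condition of Theorem 1 is purely graphical, so it can be checked mechanically. The following is a minimal sketch, not the authors' code: it builds $\bar{G}$ as the complement of the skeleton of $G$ and uses the standard fact that a connected graph has an odd cycle exactly when it is not bipartite. The function names and the example DAGs are hypothetical.

```python
from collections import deque
from itertools import combinations

def complement_of_skeleton(nodes, directed_edges):
    """Adjacency sets of G-bar: pairs of nodes NOT joined by an edge of G."""
    skeleton = {frozenset(e) for e in directed_edges}
    adj = {v: set() for v in nodes}
    for u, v in combinations(nodes, 2):
        if frozenset((u, v)) not in skeleton:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def every_component_has_odd_cycle(adj):
    """2-colour each component by BFS; a colour clash means an odd cycle exists."""
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        clash = False
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in colour:
                    colour[w] = 1 - colour[u]
                    queue.append(w)
                elif colour[w] == colour[u]:
                    clash = True
        if not clash:  # this component is bipartite, so it has no odd cycle
            return False
    return True

def satisfies_theorem_1(nodes, directed_edges):
    adj = complement_of_skeleton(nodes, directed_edges)
    return every_component_has_odd_cycle(adj)

# Hypothetical DAG on 5 nodes: G-bar is connected and contains the triangle 2-4-5.
print(satisfies_theorem_1([1, 2, 3, 4, 5], [(1, 2), (1, 3), (2, 3)]))  # True
# Node 1 is adjacent to every other node, so it is isolated in G-bar.
print(satisfies_theorem_1([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)]))     # False
```

A `False` result only means that Theorem 1 does not apply; the model may still be GFI.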

  16. Parametrization map for $\Sigma^{-1}$
     $\Sigma = (I_m - \Lambda^T)^{-1} (\Omega + \delta\delta^T) (I_m - \Lambda)^{-1}$
     $\Sigma^{-1} = (I_m - \Lambda) \Omega^{-1} (I_m - \Lambda^T) - \gamma\gamma^T$
     $\tilde{\theta} := (\Lambda, \Omega, \gamma)$
     $\tilde{\phi}_G : \tilde{\theta} \mapsto (I_m - \Lambda) \Omega^{-1} (I_m - \Lambda^T) - \gamma\gamma^T$
     $\phi_G$ is finite-to-one if and only if $\tilde{\phi}_G$ is finite-to-one.
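
The slide does not say how $\gamma$ relates to $\delta$. One way to recover the inverse-covariance form is the Sherman-Morrison formula; the explicit expression for $\gamma$ below is derived here under that approach and does not appear in the original slides.

```latex
\begin{align*}
\Sigma^{-1}
  &= (I_m - \Lambda)\,(\Omega + \delta\delta^T)^{-1}\,(I_m - \Lambda^T), \\
(\Omega + \delta\delta^T)^{-1}
  &= \Omega^{-1}
     - \frac{\Omega^{-1}\delta\,\delta^T\Omega^{-1}}{1 + \delta^T\Omega^{-1}\delta}
  && \text{(Sherman--Morrison)}, \\
\Sigma^{-1}
  &= (I_m - \Lambda)\,\Omega^{-1}\,(I_m - \Lambda^T) - \gamma\gamma^T,
  \qquad
  \gamma = \frac{(I_m - \Lambda)\,\Omega^{-1}\delta}{\sqrt{1 + \delta^T\Omega^{-1}\delta}} .
\end{align*}
```

For fixed $(\Lambda, \Omega)$ the map from $\delta$ to $\gamma$ is one-to-one, which is consistent with the equivalence between finite-to-one $\phi_G$ and finite-to-one $\tilde{\phi}_G$ stated on the slide.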

  17. Jacobian of $\tilde{\phi}_G$ and GFI
     Proposition: $\tilde{\phi}_G$ is generically finite-to-one if and only if its Jacobian matrix $J(\tilde{\phi}_G) = \partial \tilde{\phi}_G / \partial \tilde{\theta}$ generically has full column rank.
     The condition of Theorem 1 is a sufficient condition for $J(\tilde{\phi}_G)$ to have full column rank.
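
The rank criterion can be probed numerically without a computer algebra system. The sketch below is an illustration under stated assumptions, not the computation used in the talk: it works with $\kappa_i = 1/\omega_i$ in place of $\omega_i$ (a smooth reparametrization, so the rank is unchanged), evaluates the Jacobian of $(\Lambda, \kappa, \gamma) \mapsto (I_m - \Lambda)\,\mathrm{diag}(\kappa)\,(I_m - \Lambda^T) - \gamma\gamma^T$ at one random rational point, and computes its rank exactly. The rank at a single point is only a lower bound for the generic rank, although the two agree outside a measure-zero set. Nodes are 0-based and every edge `(v, w)` must have `v < w`.

```python
from fractions import Fraction
import random

def exact_rank(rows):
    """Rank by Gaussian elimination over the rationals (no rounding error)."""
    rows = [r[:] for r in rows]
    rank = 0
    for c in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            if rows[r][c] != 0:
                f = rows[r][c] / rows[rank][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def jacobian_rank(m, edges, seed=0):
    """Rank of the Jacobian at a random rational point.

    K_ij = sum_k A_ik * kappa_k * A_jk - gamma_i * gamma_j, with A = I - Lambda.
    Rows are the entries K_ij with i <= j; columns are lambda, kappa, gamma.
    """
    rng = random.Random(seed)
    rnd = lambda: Fraction(rng.randint(1, 50), rng.randint(1, 50))
    lam = {e: rnd() for e in edges}
    kappa = [rnd() for _ in range(m)]
    gamma = [rnd() for _ in range(m)]
    A = [[Fraction(int(i == j)) - lam.get((i, j), 0) for j in range(m)]
         for i in range(m)]
    jac = []
    for i in range(m):
        for j in range(i, m):
            row = []
            for (v, w) in edges:  # dK_ij / d lambda_vw, using dA_vw / d lambda_vw = -1
                row.append(-(i == v) * kappa[w] * A[j][w] - (j == v) * kappa[w] * A[i][w])
            row += [A[i][k] * A[j][k] for k in range(m)]  # dK_ij / d kappa_k
            row += [-((i == k) * gamma[j] + (j == k) * gamma[i])
                    for k in range(m)]                    # dK_ij / d gamma_k
            jac.append(row)
    return exact_rank(jac)

# Empty DAG on 3 nodes: |E| + 2m = 6 parameters, and the rank is 6.
print(jacobian_rank(3, []))
# Empty DAG on 2 nodes: 4 parameters but only 3 distinct entries, so the rank is 3.
print(jacobian_rank(2, []))
```

Full column rank means the returned value equals `len(edges) + 2 * m`, the dimension of the parameter space.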

  18. LDH condition 1
     m                  4    5     6
     GFI models         5   95  3344
     SW condition       5   49   985
     LDH condition 1    5   88  2957
     SW and LDH         5   88  2957
     Our condition certifies more GFI models than the SW condition: 88 versus 49 for $m = 5$, and 2957 versus 985 for $m = 6$.
     There still exist some GFI models that do not satisfy our condition (7 of 95 for $m = 5$, 387 of 3344 for $m = 6$).
     As $m$ grows, the proportion of GFI models that do not satisfy our condition increases.

  20. Theorems
     Theorem 2: Suppose that $G$ has a sink node $v$ satisfying $\mathrm{pa}(v) \neq V \setminus \{v\}$, and that the model defined by $G^*(V \setminus \{v\})$ is GFI. Then the model defined by $G^*$ is GFI.
     Theorem 3: Suppose that $G$ has a source node $v$ satisfying $\mathrm{ch}(v) \neq V \setminus \{v\}$, and that the model defined by $G^*(V \setminus \{v\})$ is GFI. Then the model defined by $G^*$ is GFI.
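
Theorems 2 and 3 suggest a recursive check: peel off an admissible sink or source and test the smaller graph. The sketch below is one possible reading, not the authors' procedure. It takes a caller-supplied predicate `base_is_gfi` (a hypothetical name) that decides the base cases, for example from a symbolic computation. The edge list `g1_edges` is a made-up placeholder, and treating that graph as GFI is an assumption of the example, not a result.

```python
def parents(v, edges):
    return {a for a, b in edges if b == v}

def children(v, edges):
    return {b for a, b in edges if a == v}

def certified_gfi(nodes, edges, base_is_gfi):
    """True if GFI follows from base_is_gfi combined with Theorems 2 and 3.

    This is a sufficient check only: False means "not certified", not "not GFI".
    """
    nodes, edges = frozenset(nodes), frozenset(edges)
    if base_is_gfi(nodes, edges):
        return True
    for v in nodes:
        rest = nodes - {v}
        pa, ch = parents(v, edges), children(v, edges)
        sink_ok = not ch and pa != rest    # Theorem 2: sink with pa(v) != V \ {v}
        source_ok = not pa and ch != rest  # Theorem 3: source with ch(v) != V \ {v}
        if sink_ok or source_ok:
            sub_edges = frozenset(e for e in edges if v not in e)
            if certified_gfi(rest, sub_edges, base_is_gfi):
                return True
    return False

# Placeholder base graph on nodes 1..5, ASSUMED to have been certified elsewhere.
g1_edges = [(1, 2), (1, 3), (2, 4), (3, 5), (4, 5)]
known = {(frozenset(range(1, 6)), frozenset(g1_edges))}
base = lambda n, e: (n, e) in known

# Add a sink node 6 with pa(6) = {2, 3, 4, 5}, which differs from {1, ..., 5}.
g2_edges = g1_edges + [(p, 6) for p in (2, 3, 4, 5)]
print(certified_gfi(range(1, 7), g2_edges, base))   # True, via Theorem 2

# If every other node is a parent of 6, Theorem 2 no longer applies to node 6.
g3_edges = g1_edges + [(p, 6) for p in (1, 2, 3, 4, 5)]
print(certified_gfi(range(1, 7), g3_edges, base))   # False
```

The search tries every admissible node and is exponential in the worst case, which is acceptable for the small graphs considered here.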

  21. Sufficient condition 2
     $\bar{G}_1$ does not have an odd cycle.
     Symbolic computation shows that the model associated with $G^*_1$ is GFI.
     [Figure: the graphs $G^*_1$ (with latent node $L$), $G_1$, and $\bar{G}_1$ on nodes 1 to 5.]

  22. LDH condition 2
     Add a sink node 6 to $G_1$, with $\mathrm{pa}(6) = \{2, 3, 4, 5\}$.
     $\bar{G}_2$ also has no odd cycle.
     But the model defined by $G^*_2$ is also GFI (by Theorem 2, since $\mathrm{pa}(6) \neq \{1, \dots, 5\}$ and the model defined by $G^*_1$ is GFI).
     [Figure: the graphs $G^*_2$ (with latent node $L$), $G_2$, and $\bar{G}_2$ on nodes 1 to 6.]

  23. Tetrad
     Lemma: Let $s_{ij}$ be the $(i, j)$ element of $(I_m - \Lambda^T) \Sigma (I_m - \Lambda)$. Then
     $(I_m - \Lambda^T) \Sigma (I_m - \Lambda) = \Omega + \delta\delta^T$ (diagonal plus rank one)
     if and only if
     $\tau_{(ik),(jl)}(\Lambda) = s_{ij} s_{kl} - s_{il} s_{jk} = 0$ for $i < j < k < l$ or $i < k < j < l$.
     The $2 \times 2$ off-diagonal minors are called tetrads; the condition says that all tetrads are zero.
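
The easy direction of the lemma can be seen on a small example: for $S = \Omega + \delta\delta^T$ every off-diagonal entry is $\delta_i \delta_j$, so each tetrad is $\delta_i\delta_j\delta_k\delta_l - \delta_i\delta_l\delta_j\delta_k = 0$. The sketch below enumerates the two index patterns named on the slide; the function name and the numeric values are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def tetrads(S):
    """Tetrads s_ij * s_kl - s_il * s_jk for i<j<k<l and i<k<j<l (0-based indices)."""
    out = []
    for a, b, c, d in combinations(range(len(S)), 4):
        for i, j, k, l in ((a, b, c, d), (a, c, b, d)):
            out.append(S[i][j] * S[k][l] - S[i][l] * S[j][k])
    return out

# Illustrative values: S = Omega + delta delta^T with m = 4.
delta = [Fraction(1), Fraction(2), Fraction(3), Fraction(4)]
omega = [Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(2)]
m = len(delta)
S = [[(omega[i] if i == j else 0) + delta[i] * delta[j] for j in range(m)]
     for i in range(m)]

print(all(t == 0 for t in tetrads(S)))  # True: diagonal plus rank one
```

Only distinct indices appear in each product, so the diagonal entries (and hence $\Omega$) never enter a tetrad.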

  24. Tetrad
     Proposition: $\tau_{(ik),(jl)}(\Lambda) = 0$ is a quartic equation in $\Lambda$, and the model is GFI if and only if the system
     $\tau_{(ik),(jl)}(\Lambda) = 0$, $i < j < k < l$ or $i < k < j < l$,
     has finitely many solutions.
     For a given $\Lambda$, $(I_m - \Lambda^T) \Sigma (I_m - \Lambda) = \Omega + \delta\delta^T$ is 2-to-1 in $(\Omega, \delta)$.
     Using this fact, we can obtain the conditions.

  25. LDH condition 2
     m                  4    5     6
     GFI models         5   95  3344
     SW condition       5   49   985
     LDH condition 1    5   88  2957
     SW and LDH         5   88  2957
     For $m = 6$, $387 = 3344 - 2957$ models are GFI but do not satisfy Theorem 1.
     It turns out that 194 of them are shown to be GFI in this way.

  26. References
     D. Leung, M. Drton, and H. Hara (2015). Identifiability of directed Gaussian graphical models with one latent source. arXiv:1505.01583, submitted.
