High-Dimensional Covariance Decomposition into Sparse Markov and Independence Domains




  1. High-Dimensional Covariance Decomposition into Sparse Markov and Independence Domains. Majid Janzamin and Anima Anandkumar, U.C. Irvine.

  2. High-Dimensional Covariance Estimation
     n i.i.d. samples, p variables, X := [X_1, ..., X_p]^T.
     High-dimensional regime: both n, p → ∞ with n ≪ p.
     Covariance estimation: Σ* := E[XX^T].
     Challenge: the empirical (sample) covariance is ill-posed when n ≪ p (see the sketch below):
     $\widehat{\Sigma}^n := \frac{1}{n} \sum_{k=1}^{n} x^{(k)} {x^{(k)}}^T$.
     Solution: impose sparsity for tractable high-dimensional estimation.
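A minimal numpy sketch (my illustration, not from the slides) of why the sample covariance is ill-posed here: with n < p, $\widehat{\Sigma}^n$ has rank at most n, so it is singular and cannot be inverted by downstream estimators.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 100                      # n << p: high-dimensional regime
X = rng.standard_normal((n, p))     # n i.i.d. samples of a p-variate Gaussian

# Empirical covariance: (1/n) * sum_k x^(k) x^(k)^T
Sigma_hat = (X.T @ X) / n

# Rank is at most n, so Sigma_hat is singular whenever n < p:
print(np.linalg.matrix_rank(Sigma_hat))     # 20, far below p = 100
print(np.linalg.eigvalsh(Sigma_hat).min())  # ~0: not invertible
```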


  3. Incorporating Sparsity in High Dimensions
     Two options: sparse covariance Σ* = Σ*_R, or sparse inverse covariance J*_M = Σ*⁻¹.
     [Figure: sparsity patterns of the covariance Σ* and of the inverse covariance J = Σ*⁻¹.]
     Relationship with statistical properties (Gaussian):
     Sparse covariance = independence model: marginal independence.
     Sparse inverse covariance = Markov model: conditional independence (see the sketch below).
     Guarantees under sparsity constraints in high dimensions:
     Consistent estimation when n = Ω(log p), so n ≪ p suffices.
     Going beyond sparsity in high dimensions?
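To make the two statistical readings concrete, here is a small numpy illustration (mine, with assumed toy parameters): a Gaussian Markov chain has a sparse, tridiagonal inverse covariance J, yet its covariance J⁻¹ is fully dense, so sparsity in one domain generally does not carry over to the other.

```python
import numpy as np

p = 6
# Markov chain: tridiagonal inverse covariance, i.e. conditional
# independence of non-neighbors (a sparse Markov model).
J = np.eye(p) * 2.0
idx = np.arange(p - 1)
J[idx, idx + 1] = J[idx + 1, idx] = -0.9   # J is positive definite

Sigma = np.linalg.inv(J)
print((np.abs(J) > 1e-8).sum())       # 16 nonzeros: sparse
print((np.abs(Sigma) > 1e-8).sum())   # 36 nonzeros: fully dense
```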


  4. Going Beyond Sparse Models
     Motivation: sparsity constraints are too restrictive for a faithful representation; data need not be sparse in a single domain.
     Solution: sparsity in multiple domains.
     One possibility: sparse Markov + sparse independence models. Sparsity in multiple domains captures multiple statistical relationships (a toy construction follows this slide):
     $\Sigma^* = {J^*_M}^{-1} + \Sigma^*_R$.
     Efficient decomposition and estimation in high dimensions? Unique decomposition? Good sample requirements?
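The following numpy sketch (an illustration with assumed toy parameters, not the authors' construction) builds a covariance of the form Σ* = J_M⁻¹ + Σ_R: each piece is sparse in its own domain, while Σ* itself is sparse in neither the covariance nor the inverse covariance domain alone.

```python
import numpy as np

p = 8

# Sparse Markov part: tridiagonal precision matrix J_M.
J_M = np.eye(p) * 2.0
i = np.arange(p - 1)
J_M[i, i + 1] = J_M[i + 1, i] = -0.8

# Sparse residual part: a few off-diagonal marginal correlations,
# small enough that Sigma stays positive definite.
Sigma_R = np.zeros((p, p))
Sigma_R[0, 5] = Sigma_R[5, 0] = 0.2
Sigma_R[2, 7] = Sigma_R[7, 2] = 0.2

Sigma = np.linalg.inv(J_M) + Sigma_R   # Sigma* = J_M^{-1} + Sigma_R

def density(M, tol=1e-6):
    """Fraction of entries with magnitude above tol."""
    return (np.abs(M) > tol).mean()

print(density(J_M), density(Sigma_R))                  # both pieces sparse
print(density(Sigma), density(np.linalg.inv(Sigma)))   # both domains dense
```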


  5. Summary of Results
     Model: $\Sigma^* = {J^*_M}^{-1} + \Sigma^*_R$.
     Contribution 1: novel method for decomposition.
     Decomposition into Markov and residual domains.
     Unification of sparse covariance and inverse covariance estimation.
     Contribution 2: guarantees for estimation.
     Conditions for unique decomposition (exact statistics).
     Sparsistency and norm guarantees in both the Markov and independence domains (sample analysis).
     Sample requirement: n = Ω(log p) samples for p variables.
     Efficient method for covariance decomposition and estimation.


  6. Related Works
     Sparse covariance/inverse covariance estimation:
     Sparse covariance estimation: covariance thresholding.
     ◮ (Bickel & Levina), (Wagaman & Levina), (Cai et al.)
     Sparse inverse covariance estimation:
     ◮ ℓ1 penalization (Meinshausen & Bühlmann), (Ravikumar et al.)
     ◮ non-convex methods (Anandkumar et al.), (Zhang)
     Beyond sparse models, decomposition issues:
     Sparse + low rank (Chandrasekaran et al.), (Candès et al.)
     Decomposable regularizers (Negahban et al.)
     Multi-resolution Markov + independence models (Choi et al.): decomposition in the inverse covariance domain; lacks theoretical guarantees.
     Our contribution: guaranteed decomposition and estimation.

  7. Outline: 1. Introduction, 2. Algorithm, 3. Guarantees, 4. Experiments, 5. Conclusion.


  8. Some Intuitions and Ideas
     Model: $\Sigma^* = {J^*_M}^{-1} + \Sigma^*_R$; $\widehat{\Sigma}^n$: sample covariance from n i.i.d. samples.
     Review ideas for the special cases: sparse covariance / sparse inverse covariance.
     Sparse covariance estimation (independence model): Σ* = Σ*_R.
     p variables, p ≫ n.
     Thresholding estimator for the off-diagonal entries (Bickel & Levina): threshold chosen as $\sqrt{\log p / n}$ (see the sketch below).
     Sparsistency (support recovery) and norm guarantees when n = Ω(log p), so n ≪ p suffices.
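A short numpy sketch of a Bickel & Levina style thresholding estimator; the toy data and the constant in front of the $\sqrt{\log p / n}$ level are my assumptions, not the authors' code.

```python
import numpy as np

def threshold_covariance(X, c=2.0):
    """Hard-threshold off-diagonal sample covariance entries at c*sqrt(log p / n)."""
    n, p = X.shape
    S = (X.T @ X) / n                      # sample covariance (data assumed mean-zero)
    t = c * np.sqrt(np.log(p) / n)         # Bickel & Levina threshold level
    T = np.where(np.abs(S) >= t, S, 0.0)   # zero out small entries...
    np.fill_diagonal(T, np.diag(S))        # ...but keep the diagonal intact
    return T

# Toy use: independence model with a single off-diagonal correlation.
rng = np.random.default_rng(0)
p, n = 50, 200
Sigma_true = np.eye(p)
Sigma_true[0, 1] = Sigma_true[1, 0] = 0.6
X = rng.multivariate_normal(np.zeros(p), Sigma_true, size=n)
print((threshold_covariance(X) != 0).sum())  # close to p + 2: diagonal plus the true pair
```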


  9. Recap of Inverse Covariance (Markov) Estimation
     Model: $\Sigma^* = {J^*_M}^{-1} + \Sigma^*_R$; $\widehat{\Sigma}^n$: sample covariance from n i.i.d. samples.
     ℓ1-MLE for sparse inverse covariance (Ravikumar et al. '08):
     $\widehat{J}_M := \operatorname*{argmin}_{J_M \succ 0} \; \langle \widehat{\Sigma}^n, J_M \rangle - \log\det J_M + \gamma \|J_M\|_{1,\mathrm{off}}$
     Max-entropy formulation (Lagrangian dual):
     $\widehat{\Sigma}_M := \operatorname*{argmax}_{\Sigma_M \succ 0,\; \Sigma_R} \; \log\det \Sigma_M - \lambda \|\Sigma_R\|_{1,\mathrm{off}}$
     $\text{s.t.} \quad \|\widehat{\Sigma}^n - \Sigma_M\|_{\infty,\mathrm{off}} \le \gamma, \quad [\Sigma_M]_d = [\widehat{\Sigma}^n]_d, \quad [\Sigma_R]_d = 0$
     Consistent estimation under certain conditions when n = Ω(log p). A sketch of the ℓ1-MLE follows.
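A brief sketch of the ℓ1-MLE above using scikit-learn's graphical_lasso solver; the simulated data, the tridiagonal ground truth, and the choice γ ∝ $\sqrt{\log p / n}$ are my assumptions for illustration. Note that scikit-learn's alpha penalizes only off-diagonal entries, matching the $\|\cdot\|_{1,\mathrm{off}}$ penalty.

```python
import numpy as np
from sklearn.covariance import graphical_lasso

rng = np.random.default_rng(0)

# Ground truth: a sparse Markov model (tridiagonal precision J*_M).
p, n = 20, 500
J_true = np.eye(p) * 2.0
i = np.arange(p - 1)
J_true[i, i + 1] = J_true[i + 1, i] = -0.8
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(J_true), size=n)

S = (X.T @ X) / n                     # sample covariance Sigma_hat^n
gamma = 2 * np.sqrt(np.log(p) / n)    # regularization ~ sqrt(log p / n)

# l1-MLE: argmin_{J > 0}  <S, J> - log det J + gamma * ||J||_{1,off}
Sigma_hat, J_hat = graphical_lasso(S, alpha=gamma)

support = np.abs(J_hat) > 1e-4
print(support.sum(), (np.abs(J_true) > 0).sum())  # recovered vs true edge counts
```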
