2010/7/8 UAI 2010 Tutorial, Catalina Island

Non-Gaussian Methods for Learning Linear Structural Equation Models (Part II)
Shohei Shimizu and Yoshinobu Kawahara
Osaka University

Outline of Part II
Some recent advances in LiNGAM analysis:
1. LiNGAM combined with time-series models
– AR-LiNGAM (Hyvarinen et al., 2010)
– ARMA-LiNGAM (Kawahara et al., 2010)
2. LiNGAM with latent confounders
– lvLiNGAM (Hoyer et al., 2006)
– GroupLiNGAM (Kawahara et al., 2010)

LiNGAM combined with time-series models

Time-series analysis with LiNGAM
How useful is the non-Gaussianity of data for analyzing time series?
Instantaneous effects can be taken into account explicitly by combining LiNGAM analysis with classical time-series models:
– AR-LiNGAM (Hyvarinen et al., 2010)
– ARMA-LiNGAM (Kawahara et al., 2010)

Instantaneous and lagged effects
[Figure: variables at successive time points, with arrows labeled as lagged effects and instantaneous effects]
If the time resolution of the measurements is sufficiently high,
=> these effects can be captured by estimating classical time-series models, such as AR and ARMA models.
Otherwise, how should instantaneous effects be handled?

Autoregressive models
Represent the current state with the past states:
– First order: $x(t) = M_1 x(t-1) + n(t)$
– Second order: $x(t) = M_1 x(t-1) + M_2 x(t-2) + n(t)$
– p-th order: $x(t) = \sum_{i=1}^{p} M_i x(t-i) + n(t)$
The disturbance $n(t)$ is usually assumed to be white noise.
An AR model is one of the standard tools for analyzing time-series data and has been successfully applied in a variety of fields, such as economics (Mills, 1990; Percival & Walden, 1993).

Incorporating instantaneous effects
Introduce the instantaneous term into AR models (AR-LiNGAM) (Hyvarinen et al., 2010):
$x(t) = \sum_{i=0}^{p} B_i x(t-i) + e(t)$
i = 1,…, p (AR model) => i = 0, 1,…, p (AR-LiNGAM)
How to estimate the model including instantaneous effects?
=> 1. Assume that $e(t)$ is non-Gaussian.
   2. Apply LiNGAM analysis.

Estimation (1/2)
Relation between the two models:
AR model: $x(t) = \sum_{i=1}^{p} M_i x(t-i) + n(t)$
AR-LiNGAM: $x(t) = \sum_{i=0}^{p} B_i x(t-i) + e(t)$
Regression coefficients: $M_i = (I - B_0)^{-1} B_i$ (i = 1,…, p)
Disturbance: $n(t) = (I - B_0)^{-1} e(t)$ => $n(t) = B_0 n(t) + e(t)$
This is a SEM with non-Gaussian external influences.

Estimation (2/2)
1. Estimate a multivariate AR model (i.e., $\hat{M}_i$) and then calculate the residuals $\hat{n}(t) = x(t) - \sum_{i=1}^{p} \hat{M}_i x(t-i)$.
2. Apply LiNGAM analysis to the estimated residuals, $\hat{n}(t) = B_0 \hat{n}(t) + e(t)$, and calculate the matrix $\hat{B}_0$.
3. Using the estimated $\hat{B}_0$, calculate the parameters of AR-LiNGAM through $\hat{B}_i = (I - \hat{B}_0) \hat{M}_i$ (i = 1,…, p).

Extension to ARMA model (1/2)
AR model
– can estimate apparent effects or the power spectrum,
– but cannot, in principle, express direct relationships between variables.
ARMA (autoregressive moving-average) model:
– More general representation for time-series data (exact representation of linear differential equations in the discrete time domain).
(An AR model is an asymptotic expansion of an ARMA model.)

Extension to ARMA model (2/2)
The analogous relationships between ARMA models and ARMA-LiNGAM models still hold (Kawahara et al., 2010):
ARMA-LiNGAM: $x(t) = \sum_{i=0}^{p} B_i x(t-i) + e(t) + \sum_{j=1}^{q} C_j e(t-j)$
ARMA model: $x(t) = \sum_{i=1}^{p} M_i x(t-i) + n(t) + \sum_{j=1}^{q} N_j n(t-j)$
Regression coefficients: $M_i = (I - B_0)^{-1} B_i$, $N_j = (I - B_0)^{-1} C_j (I - B_0)$
Disturbance: $n(t) = (I - B_0)^{-1} e(t)$
Again, this is a SEM with non-Gaussian external influences.

Connection to Granger causality (1/2)
Suppose a multivariate process $x(t)$ is partitioned into $x_1(t)$ and $x_2(t)$.
Granger causality* (Granger, 1976; Boudjellaba, 1992):
The processes $x_2$ do not cause the process $x_1$ if and only if
$\sigma^2(x_1(t) \mid X(t-1)) = \sigma^2(x_1(t) \mid X_1(t-1))$, for all t.
$X(t)$: past sequence up to time t; $\sigma^2(\cdot \mid \cdot)$: variance of the prediction error.
*) Granger causality is not necessarily a natural extension of causality for i.i.d. data, which is usually defined based on the counterfactual model.
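
The three-step AR-LiNGAM estimation procedure (Estimation (2/2)) can be sketched for order p = 1 as follows. This is only an illustration under stated assumptions: it uses NumPy, the symbols M1 (AR coefficient), B0 (instantaneous effects) and B1 (lagged effects), the coefficient values, and the Laplace disturbances are all chosen here and are not taken from the tutorial. Step 2, the LiNGAM analysis of the residuals, is not implemented; the true B0 stands in for its output.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20000

# Assumed ground truth (illustrative values only).
B0 = np.array([[0.0, 0.0],
               [0.5, 0.0]])   # instantaneous effect x1(t) -> x2(t)
B1 = np.array([[0.4, 0.0],
               [0.2, 0.3]])   # lagged effects from time t-1
I = np.eye(2)

# Simulate x(t) = B0 x(t) + B1 x(t-1) + e(t) with non-Gaussian (Laplace) e(t).
e = rng.laplace(size=(T, 2))
x = np.zeros((T, 2))
mix = np.linalg.inv(I - B0)
for t in range(1, T):
    x[t] = mix @ (B1 @ x[t - 1] + e[t])

# Step 1: least-squares fit of the classical AR(1) model x(t) = M1 x(t-1) + n(t).
past, now = x[:-1], x[1:]
M1_hat = np.linalg.lstsq(past, now, rcond=None)[0].T
n_hat = now - past @ M1_hat.T   # residuals, the input to the LiNGAM step

# Step 2 (not implemented): LiNGAM analysis of n_hat would estimate B0.
B0_hat = B0   # placeholder: the true B0 replaces a LiNGAM estimate

# Step 3: recover the lagged effects of AR-LiNGAM from the AR coefficients.
B1_hat = (I - B0_hat) @ M1_hat

print(np.round(M1_hat, 2))   # estimate of (I - B0)^{-1} B1
print(np.round(B1_hat, 2))   # estimate of B1
```

The AR fit alone returns M1, which mixes instantaneous and lagged effects; B1 becomes available only after B0 has been identified from the non-Gaussian residuals.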

Connection to Granger causality (2/2)
ARMA model: Granger causality (G. C.) corresponds to a condition on the ARMA coefficients (Boudjellaba, 1992).
ARMA-LiNGAM: the corresponding condition involves both the instantaneous and the lagged effects (Hyvarinen et al., 2010; Kawahara et al., 2010).
=> If the order in the sense of Granger causality completely agrees with the instantaneous effects, then the order is preserved even if the instantaneous effects are neglected.

Application to real data
Duplex-pendulum system:
[Figure: measured angle (rad) versus time (s), showing a chaotic pattern]
[Equation: analytic model for the duplex-pendulum system]

Application to physical system (cont.)
Estimated lagged effects by AR- and ARMA-LiNGAM:
[Figure: estimated lagged effects at lags t-1 and t-2; panels: AR-LiNGAM, ARMA-LiNGAM]
Although dominant patterns are captured by both models, the chaotic effect is captured only by ARMA-LiNGAM.

Summary (LiNGAM combined with time-series models)
Non-Gaussianity could be useful for analyzing time-series data (AR-LiNGAM and ARMA-LiNGAM).
– Instantaneous effects can be taken into account by using non-Gaussianity of disturbances.
AR-LiNGAM (or ARMA-LiNGAM) is identified by first estimating a classical AR model (or ARMA model) and then applying LiNGAM analysis to the disturbance sequences.
The order in the sense of Granger causality is satisfied if and only if both the instantaneous and the lagged effects in AR-LiNGAM (or ARMA-LiNGAM) give the same order.

LiNGAM with Latent Confounders

Latent confounder
Independent external influences (the assumption in LiNGAM)
=> No latent confounder (Spirtes et al., 2000)
Latent confounder: a latent variable which is a parent of two or more observed variables.
A latent confounder induces dependency among external influences.
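
A small simulation can illustrate how a latent confounder induces dependency among external influences. This is a sketch assuming NumPy; the variables f, e1, e2 and the uniform distributions are illustrative choices, not taken from the tutorial. When f is unobserved and there is no direct effect between x1 and x2, a model without f must treat f + e1 and f + e2 as the external influences, and these are correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50000

f = rng.uniform(-1, 1, n)    # latent confounder (unobserved)
e1 = rng.uniform(-1, 1, n)   # independent external influences
e2 = rng.uniform(-1, 1, n)

# Observed variables with a common latent parent and no direct effect.
x1 = f + e1
x2 = f + e2

# The true external influences are (nearly) uncorrelated in the sample ...
corr_true_e = np.corrcoef(e1, e2)[0, 1]
# ... but the apparent ones, f + e1 and f + e2, share f: the population
# correlation is var(f) / (var(f) + var(e)) = 0.5 for equal variances.
corr_apparent = np.corrcoef(x1, x2)[0, 1]

print(round(corr_true_e, 3), round(corr_apparent, 3))
```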

Motivation of this topic
Actual data might include latent confounders.
=> In that case, the LiNGAM assumption that there are no latent confounders is violated.
How to overcome this?
– lvLiNGAM (Hoyer et al., 2006)
  • Overcomplete ICA.
– GroupLiNGAM (Kawahara et al., 2010)
  • Extension of the principle of DirectLiNGAM to 'sets'.

Latent variable LiNGAM
Introduce latent confounders f into the LiNGAM model:
$x = Bx + \Lambda f + e$
<=> $x = A s$, with $A = (I - B)^{-1} [\Lambda, I]$ and $s = [f; e]$ non-Gaussian and independent
=> Overcomplete ICA (Lewicki & Sejnowski, 2000; Eriksson & Koivunen, 2004)
How to classify the elements of s into f and e? And how to assign f_i?

Basic idea of lvLiNGAM
1. Remove external influences.
2. Find a pair of observed variables that has no observed parents.
– Mark their common parent as a latent confounder.
– The existence of such a pair is guaranteed by the assumption that "no latent confounder has total effects on some observed variable and its descendants only."
3. Repeat 1-2.

Find an external influence
The i-th row vector of the mixing matrix A has a non-zero entry at the j-th column and zeros everywhere else.
=> The j-th element of s is an external influence.

Find a latent confounder
1. If the j-th row vector 'covers' the i-th row vector (it is non-zero wherever the i-th row is non-zero):
=> $x_i$ is a parent of $x_j$ ($x_i \to x_j$).
2. If the i-th and j-th row vectors do not cover each other:
=> $x_i$ and $x_j$ have no order.
If the i-th row vector 'covers' no other rows, $x_i$ has no observed parents.

Empirical example
Two different networks, which have the same ordering of variables, were estimated in this case.
[Figure: original network and estimated network]
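
The row-pattern checks used by lvLiNGAM (a row with a single non-zero entry, and one row 'covering' another) can be sketched as below. This is a minimal illustration assuming NumPy and an exactly known mixing matrix; the function names `covers` and `single_source_rows` and both example matrices are hypothetical. In practice A is estimated by overcomplete ICA and its small entries would have to be pruned before such checks.

```python
import numpy as np

def covers(A, j, i, tol=1e-8):
    """True if row j of A is non-zero wherever row i is non-zero."""
    support_i = np.abs(A[i]) > tol
    support_j = np.abs(A[j]) > tol
    return bool(np.all(support_j[support_i]))

def single_source_rows(A, tol=1e-8):
    """(row, column) pairs for rows with exactly one non-zero entry."""
    support = np.abs(A) > tol
    return [(int(i), int(np.flatnonzero(row)[0]))
            for i, row in enumerate(support) if row.sum() == 1]

# Hypothetical model with a latent confounder; columns are s = (f, e1, e2, e3):
#   x1 = f + e1,  x2 = f + e2,  x3 = 0.5 * x2 + e3
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5, 1.0]])

# Hypothetical model without confounders; columns are s = (e1, e2):
#   x1 = e1,  x2 = 0.5 * x1 + e2
A2 = np.array([[1.0, 0.0],
               [0.5, 1.0]])

print(covers(A, 2, 1))                   # row of x3 covers row of x2
print(covers(A, 0, 1), covers(A, 1, 0))  # x1 and x2 do not cover each other
print(single_source_rows(A2))            # x1 depends on one source only
```

Indices in the code are zero-based, so row 0 corresponds to x1.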

Motivation of this topic
Actual data might include latent confounders.
=> In that case, the LiNGAM assumption that there are no latent confounders is violated.
How to overcome this?
– lvLiNGAM (Hoyer et al., 2006)
  • Overcomplete ICA.
– GroupLiNGAM (Kawahara et al., 2010)
  • Extension of the principle of DirectLiNGAM to 'sets'.

Basic idea of GroupLiNGAM
DirectLiNGAM
=> Variable ordering is estimated by iteratively finding an exogenous variable.
GroupLiNGAM
=> Group ordering (i.e., ordering of sets of variables) is estimated by recursively finding an exogenous set (defined later).
Applicable to data with latent confounders.

Exogenous set
Let the partition of variables be $x = [x_S; x_{\bar{S}}]$, where $\bar{S} = U \setminus S$ and U is the set of variables.
The subset of variables $x_S$ is said to be exogenous against $x_{\bar{S}}$ if the corresponding partition of the matrix B has the following form:
$B = \begin{pmatrix} B_{SS} & 0 \\ B_{\bar{S}S} & B_{\bar{S}\bar{S}} \end{pmatrix}$
ex.) => Group ordering: {1,2} < {3}

Exogenous set (cont.)
Lemma: a set of variables $x_S$ is exogenous if and only if $x_S$ is independent of its residual $r^{(S)}$.
$r^{(S)}$: residual when $x_{\bar{S}}$ is regressed on $x_S$.
=> This lemma extends DirectLiNGAM to the 'set' case.
ex.) For the group ordering {1,2} < {3}: $x_{\{1,2\}}$ is independent of its residual, whereas $x_{\{3\}}$ is not independent of its residual.

Identification of an exogenous set
The zero-one structure of the matrix B still holds in the SEMs of the exogenous set and of the residuals, respectively.
[Figure: block structure of B for the group ordering {1,2} < {3} < {4,5}, split into the exogenous set and the residuals (S = {1,2,3})]

Estimation (1/2)
Find an exogenous set = find a subset of variables that is independent of the residuals.
Find a subset $S \subset U$ (U is the set of variables) s.t. $I(x_S, r^{(S)}) < \delta$, where $\delta$ is a small real number.
Some independence measure I, such as mutual information (Kraskov et al., 2004) and HSIC (Gretton et al., 2005), is used.
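
The exogenous-set test from the lemma (a set is exogenous if and only if it is independent of the residual obtained by regressing the remaining variables on it) can be sketched as follows. This is an illustration under stated assumptions, not the GroupLiNGAM implementation: it uses NumPy, the simulated model and the helper names `residual` and `dependence` are hypothetical, and the independence measure I is replaced by a crude stand-in (the largest absolute correlation between squared variables) rather than mutual information or HSIC. The threshold 0.05 plays the role of the small real number in the subset search and is likewise an assumption.

```python
import numpy as np

def residual(target, regressors):
    """Least-squares residual of target (n, k) regressed on regressors (n, m)."""
    t = target - target.mean(axis=0)
    r = regressors - regressors.mean(axis=0)
    coef = np.linalg.lstsq(r, t, rcond=None)[0]
    return t - r @ coef

def dependence(u, v):
    """Crude stand-in for I: max |corr(a^2, b^2)| over column pairs of u and v."""
    u = u - u.mean(axis=0)
    v = v - v.mean(axis=0)
    return max(abs(np.corrcoef(a**2, b**2)[0, 1]) for a in u.T for b in v.T)

rng = np.random.default_rng(0)
n = 20000
f, e1, e2, e3 = rng.uniform(-1, 1, size=(4, n))   # non-Gaussian sources

# Hypothetical model with group ordering {1,2} < {3}:
# f is a latent confounder of x1 and x2, and both affect x3.
x1 = f + e1
x2 = f + e2
x3 = x1 + x2 + e3

S = np.column_stack([x1, x2])   # candidate set {1,2}
R = x3[:, None]                 # candidate set {3}

score_S = dependence(S, residual(R, S))   # {1,2} against its residual
score_R = dependence(R, residual(S, R))   # {3} against its residual

delta = 0.05   # assumed threshold
print(score_S < delta, score_R < delta)
```

Only {1,2} passes the threshold here, in line with the group ordering {1,2} < {3}, even though x1 and x2 share a latent confounder.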