Test for Covariances Max Turgeon STAT 7200–Multivariate Statistics
Objectives • Review general theory of likelihood ratio tests • Tests for structured covariance matrices • Test for equality of multiple covariance matrices 2
Likelihood ratio tests i • We will build our tests for covariances using likelihood ratios. • Therefore, we quickly review the asymptotic theory for regular models. • We are interested in the following hypotheses: 3 • Let Y 1 , . . . , Y n be a random sample from a density p θ with parameter θ ∈ R d . H 0 : θ ∈ Θ 0 , H 1 : θ ∈ Θ 1 , where Θ i ⊆ R d .
Likelihood ratio tests ii likelihood ratio 4 • Let L ( θ ) = � n i =1 p θ ( Y i ) be the likelihood, and define the max θ ∈ Θ 0 L ( θ ) Λ = max θ ∈ Θ 0 ∪ Θ 1 L ( θ ) . • Recall : we reject the null hypothesis H 0 for small values of Λ .
Likelihood ratio tests iii Theorem (Van der Wandt, Chapter 16) we have • Therefore, in practice, we need to count the number of free enough. 5 Assume Θ 0 , Θ 1 are locally linear . Under regularity conditions on p θ , − 2 log Λ → χ 2 ( k ) , where k is the difference in the number of free parameters between the null model Θ 0 and the unrestricted model Θ 0 ∪ Θ 1 . parameters in each model and hope the sample size n is large
Tests for structured covariance matrices i • We are going to look at several tests for structured covariance matrix. positive definite. • Like other exponential families, the multivariate normal distribution satisfies the regularity conditions of the theorem above. • Being positive definite implies that the unrestricted parameter space is locally linear , i.e. we are staying away from the 6 • Throughout, we assume Y 1 , . . . , Y n ∼ N p ( µ, Σ) with Σ boundary where Σ is singular.
Tests for structured covariance matrices ii • A few important observations about the unrestricted model: • The number of free parameters is equal to the number of • The maximised likelihood for the unrestricted model is given by 7 entries on and above the diagonal of Σ , which is p ( p + 1) / 2 . • The sample mean ¯ Y maximises the likelihood independently of the structure of Σ . exp( − np/ 2) L ( ˆ Y , ˆ Σ) = Σ | n/ 2 . (2 π ) np/ 2 | ˆ
Specified covariance structure i • We will start with the simplest hypothesis test: • Note that there is no free parameter in the null model. 8 H 0 : Σ = Σ 0 . • Write V = n ˆ Σ . Recall that we have � � − 1 Y , Σ) = (2 π ) − np/ 2 | Σ | − n/ 2 exp L ( ˆ 2tr(Σ − 1 V ) .
Specified covariance structure ii • Therefore, the likelihood ratio is given by 9 � � (2 π ) − np/ 2 | Σ 0 | − n/ 2 exp − 1 2 tr(Σ − 1 0 V ) Λ = exp( − np/ 2)(2 π ) − np/ 2 | ˆ Σ | − n/ 2 � � | Σ 0 | − n/ 2 exp − 1 2 tr(Σ − 1 0 V ) = exp( − np/ 2) | n − 1 V | − n/ 2 � e � np/ 2 � � − 1 0 V | n/ 2 exp | Σ − 1 2tr(Σ − 1 = 0 V ) . n • In particular, if Σ 0 = I p , we get � e � np/ 2 � � − 1 | V | n/ 2 exp Λ = 2tr( V ) . n
Example i library (tidyverse) # Winnipeg avg temperature url <- paste0 (”https://maxturgeon.ca/w20-stat7200/”, ”winnipeg_temp.csv”) dataset[1 : 3,1 : 3] ## temp_2010 temp_2011 temp_2012 ## 1 -25.57500 -16.25417 -6.379167 ## 2 -26.06250 -18.39583 -12.925000 ## 3 -20.56667 -19.45833 -5.791667 10 dataset <- read.csv (url)
Example ii n <- nrow (dataset) # Diag = 14^2 # Corr = 0.8 Sigma0 <- diag (0.8, nrow = p) diag (Sigma0) <- 1 Sigma0 <- 14 ^ 2 * Sigma0 Sigma0_invXV <- solve (Sigma0, V) 11 p <- ncol (dataset) V <- (n - 1) *cov (dataset)
Example iii lrt <- 0.5 * n * p * (1 - log (n)) df <- choose (p + 1, 2) c (lrt, qchisq (0.95, df)) ## [1] 5631.63409 73.31149 12 lrt <- lrt + 0.5 * n *log ( det (Sigma0_invXV)) lrt <- lrt - 0.5 *sum ( diag (Sigma0_invXV)) lrt <- -2 * lrt
Test for sphericity i • Note that there is one free parameter. • We have 13 • In other words, we are looking at the following null hypothesis: uncorrelated and have the same variance . • Sphericity means the different components of Y are σ 2 > 0 . H 0 : Σ = σ 2 I p , � � − 1 Y , σ 2 I p ) = (2 π ) − np/ 2 | σ 2 I p | − n/ 2 exp L ( ˆ 2tr(( σ 2 I p ) − 1 V ) � � − 1 = (2 πσ 2 ) − np/ 2 exp 2 σ 2 tr( V ) .
Test for sphericity ii • We then get • Taking the derivative of the logarithm and setting it equal to 14 zero, we find that L ( ˆ Y , σ 2 I p ) is maximised when σ 2 = tr V � np . � � − 1 σ 2 ) − np/ 2 exp L ( ˆ Y , � σ 2 I p ) = (2 π � σ 2 tr( V ) 2 � � tr V � − np/ 2 � � − np = (2 π ) − np/ 2 exp . np 2
Test for sphericity iii • Therefore, we have 15 (2 π ) − np/ 2 � � − np/ 2 exp � � − np tr V np 2 Λ = exp( − np/ 2)(2 π ) − np/ 2 | ˆ Σ | − n/ 2 � � − np/ 2 tr V np = | n − 1 V | − n/ 2 � � n/ 2 | V | = . (tr V/p ) p
Example (cont’d) i lrt <- -2 * 0.5 * n * ( log ( det (V)) - p *log ( mean ( diag (V)))) c (lrt, qchisq (0.95, df)) ## [1] 5630.79458 72.15322 16 df <- choose (p + 1, 2) - 1
Test for sphericity (cont’d) i • Recall that we have 17 � � n/ 2 | V | Λ = . (tr V/p ) p • We can rewrite this as follows: let l 1 ≥ · · · ≥ l p be the eigenvalues of V . We have | V | Λ 2 /n = (tr V/p ) p � p j =1 l j = � p ( 1 j =1 l j ) p p � p p j =1 l 1 /p j = . � p 1 j =1 l j p
Test for sphericity (cont’d) ii • A result of Srivastava and Khatri gives the exact distribution of 18 • In other words, the modified LRT ˜ Λ = Λ 2 /n is the ratio of the geometric to the arithmetic mean of the eigenvalues of V (all raised to the power p ). ˜ Λ : � 1 � 1 �� p − 1 � 2 + 1 ˜ Λ = B 2( n − j − 1) , j . p j =1
Example (cont’d) i B <- 1000 df1 <- 0.5 * (n - seq_len (p-1) - 1) # Critical values dist <- replicate (B, { prod ( rbeta (p-1, df1, df2)) }) 19 df2 <- seq_len (p-1) * (0.5 + 1 / p)
Example (cont’d) ii # Test statistic decomp <- eigen (V, symmetric = TRUE, only.values = TRUE) lrt_mod <- (geo_mean / ar_mean) ^ p c (lrt_mod, quantile (dist, 0.95)) ## 95% ## 1.181561e-07 8.967361e-01 20 ar_mean <- mean (decomp $ values) geo_mean <- exp ( mean ( log (decomp $ values)))
Test for independence i . . . . ... . . . . . . ... . . 21 • Decompose Y i into k blocks: Y i = ( Y 1 i , . . . , Y ki ) , � k where Y ji ∼ N p j ( µ j , Σ jj ) and j =1 p j = p . • This induces a decomposition on Σ and V : Σ 11 · · · Σ 1 k V 11 · · · V 1 k Σ = , V = . Σ k 1 · · · Σ kk V k 1 · · · V kk
Test for independence ii . • Under the null hypothesis, the likelihood can be decomposed . . • We are interested in testing for independence between the ... . . . 22 different blocks Y 1 i , . . . , Y ki . This equivalent to Σ 11 · · · 0 H 0 : Σ = . 0 · · · Σ kk • Note that there are � k j =1 p j ( p j + 1) / 2 free parameters. into k likelihoods that can be maximised independently.
Test for independence iii • This gives us • Putting this together, we conclude that 23 k � exp( − np j / 2) max L ( ˆ Y , Σ) = (2 π ) np j / 2 | � Σ jj | n/ 2 j =1 exp( − np/ 2) = Σ jj | n/ 2 . (2 π ) np/ 2 � k j =1 | � � � n/ 2 | V | Λ = . � k j =1 | V jj |
Example i url <- paste0 (”https://maxturgeon.ca/w20-stat7200/”, ”blue_data.csv”) names (blue_data) ## [1] ”NumSold” ”Price” ”AdvCost” ”SalesAssist” dim (blue_data) ## [1] 10 4 24 blue_data <- read.csv (url)
Example ii # Let's test for independence between # all four variables n <- nrow (blue_data) V <- (n-1) *cov (blue_data) df <- choose (p + 1, 2) - p c (lrt, qchisq (0.95, df)) 25 p <- ncol (blue_data) lrt <- -2 * ( log ( det (V)) - sum ( log ( diag (V))))
Example iii ## [1] 5.635124 12.591587 lrt > qchisq (0.95, df) ## [1] FALSE 26
Test for equality of covariances i independent random samples from (potentially) different 27 • We now look at a different setting: assume that we collected K p -dimensional multivariate normal distributions: Y 1 k , . . . , Y n k k ∼ N p ( µ k , Σ k ) , k = 1 , . . . , K. • We are interested in the null hypothesis that all Σ k are equal to some unknown Σ : H 0 : Σ k = Σ , for all k = 1 , . . . , K.
Test for equality of covariances ii • First, note that since the samples are independent, the full likelihood is the product of the likelihoods for each sample: • Therefore, over the unrestricted model, the maximum likelihood estimators are • Note that the number of free parameters over the unrestricted 28 K � L ( µ 1 , . . . , µ K , Σ 1 , . . . , Σ K ) = L ( µ k , Σ k ) . k =1 Y k , ˆ ( ¯ Σ k ) . model is kp ( p + 1) / 2 .
Recommend
More recommend