

  1. Multivariate Normal Distribution
     Max Turgeon
     STAT 4690–Applied Multivariate Analysis

  2. Building the multivariate density i
     • Let Z ∼ N(0, 1) be a standard (univariate) normal random variable. Recall that its density is given by
       \[ \phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} z^2 \right). \]
     • Now if we take Z_1, ..., Z_p ∼ N(0, 1) independently distributed, their joint density is

  3. Building the multivariate density ii
     \[ \phi(z_1, \ldots, z_p) = \prod_{i=1}^p \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} z_i^2 \right)
        = \frac{1}{\left(\sqrt{2\pi}\right)^p} \exp\left( -\frac{1}{2} \sum_{i=1}^p z_i^2 \right)
        = \frac{1}{\left(\sqrt{2\pi}\right)^p} \exp\left( -\frac{1}{2} z^T z \right), \]
     where z = (z_1, ..., z_p).
     • More generally, let µ ∈ R^p and let Σ be a p × p positive definite matrix.

  4. Building the multivariate density iii
     • Let Σ = LL^T be the Cholesky decomposition of Σ.
     • Let Z = (Z_1, ..., Z_p) be a standard (multivariate) normal random vector, and define Y = LZ + µ. We know from last lecture that
       • E(Y) = L E(Z) + µ = µ;
       • Cov(Y) = L Cov(Z) L^T = Σ.
     • To get the density, we need to compute the inverse transformation:
       \[ Z = L^{-1}(Y - \mu). \]

  5. Building the multivariate density iv
     • The Jacobian matrix J of this transformation is simply L^{-1}, and therefore
       \[ \lvert \det(J) \rvert = \lvert \det(L^{-1}) \rvert = \det(L)^{-1} = \left( \sqrt{\det(\Sigma)} \right)^{-1} = \det(\Sigma)^{-1/2}, \]
       using the fact that L is positive definite and det(Σ) = det(L)^2.
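     A quick numerical check of this determinant identity (a sketch, not from the original slides; base R only):

       # Check |det(L^{-1})| = det(Sigma)^{-1/2} for a small covariance matrix
       Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
       L <- t(chol(Sigma))   # lower-triangular Cholesky factor, Sigma = L L^T
       abs(det(solve(L)))    # |det(L^{-1})|
       det(Sigma)^(-1/2)     # same value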

  6. Building the multivariate density v
     • Plugging this into the formula for the density of a transformation, we get
       \[ f(y_1, \ldots, y_p) = \frac{1}{\det(\Sigma)^{1/2}} \phi\left( L^{-1}(y - \mu) \right) \]
       \[ = \frac{1}{\left(\sqrt{2\pi}\right)^p \det(\Sigma)^{1/2}} \exp\left( -\frac{1}{2} \left( L^{-1}(y - \mu) \right)^T L^{-1}(y - \mu) \right) \]
       \[ = \frac{1}{\left(\sqrt{2\pi}\right)^p \det(\Sigma)^{1/2}} \exp\left( -\frac{1}{2} (y - \mu)^T (LL^T)^{-1} (y - \mu) \right) \]
       \[ = \frac{1}{\sqrt{(2\pi)^p \lvert \Sigma \rvert}} \exp\left( -\frac{1}{2} (y - \mu)^T \Sigma^{-1} (y - \mu) \right). \]
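     To make the formula concrete, here is a minimal sketch (not from the original slides; it assumes the mvtnorm package, which also appears later in this deck) evaluating the density by hand and comparing with dmvnorm:

       # Evaluate the multivariate normal density directly at a point y
       library(mvtnorm)
       mu <- c(1, 2)
       Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
       y <- c(0, 0)
       p <- length(mu)
       quad <- t(y - mu) %*% solve(Sigma) %*% (y - mu)   # (y - mu)^T Sigma^{-1} (y - mu)
       f_manual <- exp(-0.5 * quad) / sqrt((2 * pi)^p * det(Sigma))
       c(f_manual, dmvnorm(y, mean = mu, sigma = Sigma)) # the two values should agree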

  7. Example i
     set.seed(123)
     n <- 1000; p <- 2
     mu <- c(1, 2)
     Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
     Z <- matrix(rnorm(n * p), ncol = p)
     L <- t(chol(Sigma))

  8. Example ii
     Y <- L %*% t(Z) + mu
     Y <- t(Y)
     colMeans(Y)
     ## [1] 1.016128 2.044840
     cov(Y)
     ##           [,1]      [,2]
     ## [1,] 0.9834589 0.5667194
     ## [2,] 0.5667194 1.0854361

  9. Example iii
     library(tidyverse)
     Y %>%
       data.frame() %>%
       ggplot(aes(X1, X2)) +
       geom_density_2d()

  10. Example iv
      [Figure: 2D density contours of the simulated sample, X2 against X1.]

  11. Example v
      library(mvtnorm)
      Y <- rmvnorm(n, mean = mu, sigma = Sigma)
      colMeans(Y)
      ## [1] 0.9812102 1.9829380
      cov(Y)

  12. Example vi
      ##           [,1]      [,2]
      ## [1,] 0.9982835 0.4906990
      ## [2,] 0.4906990 0.9489171
      Y %>%
        data.frame() %>%
        ggplot(aes(X1, X2)) +
        geom_density_2d()

  13. Example vii
      [Figure: 2D density contours of the rmvnorm sample, X2 against X1.]

  14. Other characterizations
      There are at least two other ways to define the multivariate normal distribution:
      1. A p-dimensional random vector Y is said to have a multivariate normal distribution if and only if every linear combination of Y has a univariate normal distribution (see the sketch after this slide).
      2. A p-dimensional random vector Y is said to have a multivariate normal distribution if and only if its distribution maximises entropy over the class of random vectors with fixed mean µ, fixed covariance matrix Σ, and support over R^p.
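      A quick empirical illustration of characterization 1 (a sketch, not from the original slides; the choice of a is arbitrary):

        # A fixed linear combination a^T Y of a multivariate normal vector
        # should look univariate normal, e.g. in a normal Q-Q plot
        library(mvtnorm)
        set.seed(123)
        Y <- rmvnorm(1000, mean = c(1, 2),
                     sigma = matrix(c(1, 0.5, 0.5, 1), ncol = 2))
        a <- c(2, -1)
        comb <- Y %*% a              # a^T y for each sampled row y
        qqnorm(comb); qqline(comb)   # points should fall close to the line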

  15. Useful properties i
      • If Y ∼ N_p(µ, Σ), A is a q × p matrix, and b ∈ R^q, then AY + b ∼ N_q(Aµ + b, AΣA^T) (see the sketch after this slide).
      • If Y ∼ N_p(µ, Σ), then all subsets of Y are normally distributed; that is, write
        \[ Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}, \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}. \]
      • Then Y_1 ∼ N_r(µ_1, Σ_{11}) and Y_2 ∼ N_{p−r}(µ_2, Σ_{22}).
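      An empirical check of the affine transformation property (a sketch, not from the original slides; A and b are arbitrary choices):

        # Compare empirical and theoretical mean/covariance of W = AY + b
        library(mvtnorm)
        set.seed(123)
        mu <- c(1, 2)
        Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
        Y <- rmvnorm(1e4, mean = mu, sigma = Sigma)
        A <- matrix(c(1, 0, 1, 1), nrow = 2)
        b <- c(0, 1)
        W <- t(A %*% t(Y) + b)        # rows of W are A y + b
        colMeans(W); A %*% mu + b     # empirical vs. theoretical mean
        cov(W); A %*% Sigma %*% t(A)  # empirical vs. theoretical covariance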

  16. Useful properties ii
      • Assume the same partition as above. Then the following are equivalent:
        • Y_1 and Y_2 are independent;
        • Σ_{12} = 0;
        • Cov(Y_1, Y_2) = 0.

  17. Exercise (J&W 4.3)
      Let (Y_1, Y_2, Y_3) ∼ N_3(µ, Σ) with µ = (3, 1, 4) and
      \[ \Sigma = \begin{pmatrix} 1 & -2 & 0 \\ -2 & 5 & 0 \\ 0 & 0 & 2 \end{pmatrix}. \]
      Which of the following random variables are independent? Explain.
      1. Y_1 and Y_2.
      2. Y_2 and Y_3.
      3. (Y_1, Y_2) and Y_3.
      4. 0.5(Y_1 + Y_2) and Y_3.
      5. Y_2 and Y_2 − (5/2)Y_1 − Y_3.
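      For items 4 and 5, recall that Cov(a^T Y, b^T Y) = a^T Σ b. A small helper for checking your answers numerically (not from the original slides; cov_comb is a hypothetical name):

        # Covariance between two linear combinations a^T Y and b^T Y
        cov_comb <- function(a, b, Sigma) as.numeric(t(a) %*% Sigma %*% b)
        Sigma3 <- matrix(c( 1, -2, 0,
                           -2,  5, 0,
                            0,  0, 2), ncol = 3, byrow = TRUE)
        cov_comb(c(0.5, 0.5, 0), c(0, 0, 1), Sigma3)    # item 4
        cov_comb(c(0, 1, 0), c(-5/2, 1, -1), Sigma3)    # item 5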

  18. Conditional Normal Distributions i
      • Theorem: Let Y ∼ N_p(µ, Σ), where
        \[ Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}, \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}. \]
      • Then the conditional distribution of Y_1 given Y_2 = y_2 is multivariate normal N_r(µ_{1|2}, Σ_{1|2}), where
        • µ_{1|2} = µ_1 + Σ_{12} Σ_{22}^{-1} (y_2 − µ_2);
        • Σ_{1|2} = Σ_{11} − Σ_{12} Σ_{22}^{-1} Σ_{21}.
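      A minimal sketch of these formulas in R (not from the original slides), conditioning on the second coordinate of the bivariate example used earlier (so r = 1):

        # Conditional distribution of Y1 given Y2 = y2 in the bivariate case
        mu <- c(1, 2)
        Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
        y2 <- 3
        mu_cond  <- mu[1] + Sigma[1, 2] / Sigma[2, 2] * (y2 - mu[2])
        var_cond <- Sigma[1, 1] - Sigma[1, 2]^2 / Sigma[2, 2]
        c(mu_cond, var_cond)   # mean and variance of Y1 | Y2 = 3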

  19. Conditional Normal Distributions ii
      • Corollary: Let Y_2 ∼ N_{p−r}(µ_2, Σ_{22}) and assume that Y_1 given Y_2 = y_2 is multivariate normal N_r(Ay_2 + b, Ω), where Ω does not depend on y_2. Then
        \[ Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \sim N_p(\mu, \Sigma), \]
        where
        \[ \mu = \begin{pmatrix} A\mu_2 + b \\ \mu_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Omega + A\Sigma_{22}A^T & A\Sigma_{22} \\ \Sigma_{22}A^T & \Sigma_{22} \end{pmatrix}. \]
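      A simulation sketch of the corollary (not from the original slides; the scalar values of A, b, Ω, µ_2, and Σ_22 are arbitrary):

        # Simulate Y2, then Y1 | Y2, and compare with the corollary's formulas
        set.seed(123)
        n <- 1e5
        mu2 <- 2; s22 <- 1.5              # Y2 ~ N(mu2, s22)
        A <- 0.8; b <- -1; Omega <- 0.5   # Y1 | Y2 = y2 ~ N(A*y2 + b, Omega)
        Y2 <- rnorm(n, mean = mu2, sd = sqrt(s22))
        Y1 <- rnorm(n, mean = A * Y2 + b, sd = sqrt(Omega))
        colMeans(cbind(Y1, Y2))   # approx. (A*mu2 + b, mu2) = (0.6, 2)
        cov(cbind(Y1, Y2))        # approx. [Omega + A^2*s22, A*s22; A*s22, s22]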

  20. Exercise
      • Let Y_2 ∼ N_1(0, 1) and assume
        \[ Y_1 \mid Y_2 = y_2 \sim N_2\left( \begin{pmatrix} y_2 + 1 \\ 2y_2 \end{pmatrix}, I_2 \right). \]
        Find the joint distribution of (Y_1, Y_2).

  21. Another important result i
      • Let Y ∼ N_p(µ, Σ), and let Σ = LL^T be the Cholesky decomposition of Σ.
      • We know that Z = L^{-1}(Y − µ) is normally distributed with mean 0 and covariance matrix
        Cov(Z) = L^{-1} Σ (L^{-1})^T = I_p.
      • Therefore (Y − µ)^T Σ^{-1} (Y − µ) = Z^T Z is the sum of squared standard normal random variables.
      • In other words, (Y − µ)^T Σ^{-1} (Y − µ) ∼ χ^2(p).
      • This can be seen as a generalization of the univariate result
        \[ \left( \frac{X - \mu}{\sigma} \right)^2 \sim \chi^2(1). \]
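      A simulation sketch of this result (not from the original slides; mahalanobis is base R, rmvnorm is from mvtnorm):

        # Squared Mahalanobis distances should follow a chi-square(p) law
        library(mvtnorm)
        set.seed(123)
        mu <- c(1, 2)
        Sigma <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
        Y <- rmvnorm(1e4, mean = mu, sigma = Sigma)
        d2 <- mahalanobis(Y, center = mu, cov = Sigma)  # (y-mu)^T Sigma^{-1} (y-mu)
        qqplot(qchisq(ppoints(1e4), df = 2), d2,
               xlab = "chi-square(2) quantiles", ylab = "observed distances")
        abline(0, 1)   # points should fall close to this line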

  22. Another important result ii
      • From this, we get a result about the probability that a multivariate normal falls within an ellipse:
        \[ P\left( (Y - \mu)^T \Sigma^{-1} (Y - \mu) \leq \chi^2(\alpha; p) \right) = 1 - \alpha. \]
      • We can use this to construct a confidence region around the sample mean.
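      A quick coverage check (not from the original slides; continues the sketch above, reading χ^2(α; p) as the upper-α quantile qchisq(1 - alpha, df = p)):

        # Empirical coverage of the 95% ellipse (alpha = 0.05)
        cutoff <- qchisq(0.95, df = 2)
        mean(d2 <= cutoff)   # should be close to 0.95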
