Finite mixture models
- Dr. Jarad Niemi
STAT 615 - Iowa State University
November 28, 2017
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 1 / 27
Finite mixture models Dr. Jarad Niemi STAT 615 - Iowa State - - PowerPoint PPT Presentation
Finite mixture models Dr. Jarad Niemi STAT 615 - Iowa State University November 28, 2017 Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 1 / 27 Multinomial distribution Categorical distribution Let Z Cat ( H, p )
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 1 / 27
Multinomial distribution
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 2 / 27
Multinomial distribution
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 3 / 27
Dirichlet distribution
0(a0+1) ,
0(a0+1), and
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 4 / 27
Dirichlet distribution Conjugate prior for multinomial distribution
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 5 / 27
Finite mixtures
1000 2000 3000 4 8
φ count
−2.5 0.0 2.5 5.0 7.5 10.0 −2 −1 1 2 3
α φ
−2.5 0.0 2.5 5.0 7.5 10.0 −1 1 2 3
δ φ
−2.5 0.0 2.5 5.0 7.5 10.0 −1 1
γ φ
−2.5 0.0 2.5 5.0 7.5 10.0 −3 −2 −1 1
ψ φ
5000 10000 15000 20000 −2 2
α count
−2 −1 1 2 3 −1 1 2 3
δ α
−2 −1 1 2 3 −1 1
γ α
−2 −1 1 2 3 −3 −2 −1 1
ψ α
5000 10000 15000 20000 −1 1 2 3
δ count
−1 1 2 3 −1 1
γ δ
−1 1 2 3 −3 −2 −1 1
ψ δ
5000 10000 15000 20000 −1 1
γ count
−1 1 −3 −2 −1 1
ψ γ
2000 4000 6000 −3 −2 −1 1
ψ count
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 6 / 27
Finite mixtures
0.00 0.05 0.10 0.15 0.20 4 8
phi density Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 7 / 27
Finite mixtures Finite mixture
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 8 / 27
Finite mixtures Finite mixture
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 9 / 27
Finite mixtures Finite mixture
The steps of a Gibbs sampler with stationary distribution p(π, µ, σ2, ζ|y) ∝ p(y|ζ, µ, σ2)p(ζ|π)p(µ|σ2)p(σ2)p(π) has steps 1. For i = 1, . . . , n, sample ζi from its full conditional (as the ζ are conditionally independent across i): P (ζi = h| . . .) ∝ πhN(yi; µh, σ2
h)
2. Jointly sample π and µ, σ2 as they are conditionally independent. a. Sample π ∼ Dir(a + Z) where n = (Z1, . . . , ZH) and Zh = n
i=1 I(ζi = h).
b. For h = 1, . . . , H, sample µh, σ2
h from their full conditional (as these are conditionally independent across
h): σ2
h ind
∼ IG(c′
h, d′ h)
µh|σ2
h ind
∼ N(m′
h, v′2 h σ2 h)
where v′2
h
= (1/v2
h + Zh)−1
m′
h
= v′2
h (mh/v2 h + Zhyh)
c′
h
= cd + Zh/2 d′
h
= dh + 1
2
1+Zh/v2 h
=
1 Zh
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 10 / 27
Finite mixtures JAGS library("rjags") jags_model = " model { for (i in 1:n) { y[i] ~ dnorm(mu[zeta[i]], tau[zeta[i]]) zeta[i] ~ dcat(pi[]) } for (i in 1:H) { mu[i] ~ dnorm(0,1e-5) tau[i] ~ dgamma(1,1) sigma[i] <- 1/sqrt(tau[i]) } pi ~ ddirich(a) }" tmp = hat[sample(nrow(hat), 1000),] dat = list(n=nrow(tmp), H=3, y=tmp$phi, a=rep(1,3)) jm = jags.model(textConnection(jags_model), data = dat, n.chains = 3) r = coda.samples(jm, c('mu','sigma','pi'), 1e3) Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 11 / 27
Finite mixtures JAGS
gelman.diag(r) ## Error in chol.default(W): the leading minor of order 6 is not positive definite gelman.diag(r, multivariate=FALSE) ## Potential scale reduction factors: ## ## Point est. Upper C.I. ## mu[1] 21.76 55.54 ## mu[2] 41.55 90.41 ## mu[3] 91.54 222.09 ## pi[1] 11.79 29.52 ## pi[2] 3.46 6.36 ## pi[3] 18.63 40.16 ## sigma[1] 13.70 35.60 ## sigma[2] 4.26 8.87 ## sigma[3] 13.93 37.09 Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 12 / 27
Finite mixtures JAGS
plot(r, density=FALSE)
200 400 600 800 −2 1 Iterations
Trace of mu[1]
200 400 600 800 3 6 Iterations
Trace of mu[2]
200 400 600 800 −2 2 6 Iterations
Trace of mu[3]
200 400 600 800 0.1 0.4 Iterations
Trace of pi[1]
200 400 600 800 0.1 0.4 Iterations
Trace of pi[2]
200 400 600 800 0.1 0.4 Iterations
Trace of pi[3]
200 400 600 800 0.0 2.0 Iterations
Trace of sigma[1]
200 400 600 800 1.0 2.5 Iterations
Trace of sigma[2]
200 400 600 800 0.0 2.0 Iterations
Trace of sigma[3]
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 13 / 27
Finite mixtures JAGS
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 14 / 27
Finite mixtures JAGS
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 15 / 27
Finite mixtures JAGS library("rjags") jags_model = " model { for (i in 1:n) { y[i] ~ dnorm(mu[zeta[i]], tau[zeta[i]]) zeta[i] ~ dcat(pi[]) } for (i in 1:H) { mu0[i] ~ dnorm(0,1e-5) tau[i] ~ dgamma(1,1) sigma[i] <- 1/sqrt(tau[i]) } mu[1:H] <- sort(mu0) pi ~ ddirich(a) }" jm = jags.model(textConnection(jags_model), data = dat, n.chains = 3) r = coda.samples(jm, c('mu','sigma','pi'), 1e3) Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 16 / 27
Finite mixtures JAGS
gelman.diag(r) ## Error in chol.default(W): the leading minor of order 6 is not positive definite gelman.diag(r, multivariate=FALSE) ## Potential scale reduction factors: ## ## Point est. Upper C.I. ## mu[1] 1.00 1.00 ## mu[2] 1.01 1.04 ## mu[3] 1.00 1.01 ## pi[1] 1.00 1.00 ## pi[2] 1.01 1.03 ## pi[3] 1.01 1.02 ## sigma[1] 1.00 1.00 ## sigma[2] 1.01 1.03 ## sigma[3] 1.01 1.02 Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 17 / 27
Finite mixtures JAGS
plot(r, density=FALSE)
1000 1400 1800 −2.12 Iterations
Trace of mu[1]
1000 1400 1800 0.4 1.2 Iterations
Trace of mu[2]
1000 1400 1800 5.3 5.7 Iterations
Trace of mu[3]
1000 1400 1800 0.06 0.12 Iterations
Trace of pi[1]
1000 1400 1800 0.35 0.50 Iterations
Trace of pi[2]
1000 1400 1800 0.40 0.55 Iterations
Trace of pi[3]
1000 1400 1800 0.12 0.20 Iterations
Trace of sigma[1]
1000 1400 1800 1.4 2.0 Iterations
Trace of sigma[2]
1000 1400 1800 1.0 1.3 Iterations
Trace of sigma[3]
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 18 / 27
Finite mixtures JAGS
0.0 0.1 0.2 0.3 0.4 0.5 4 8
x density Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 19 / 27
Finite mixtures Clustering
## parameter p1 p2 p3 ## 1 zeta[1] 0.000 0.002666667 0.99733333 ## 2 zeta[10] 0.000 0.090333333 0.90966667 ## 3 zeta[100] 0.000 0.206000000 0.79400000 ## 4 zeta[1000] 0.000 0.959666667 0.04033333 ## 5 zeta[101] 0.000 1.000000000 0.00000000 ## 6 zeta[102] 0.929 0.071000000 0.00000000 Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 20 / 27
Finite mixtures Clustering
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 21 / 27
Finite mixtures Choosing H
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 22 / 27
Finite mixtures Multivariate density estimation
1000 2000 3000 4 8
φ count
−2.5 0.0 2.5 5.0 7.5 10.0 −3 −2 −1 1
ψ φ
2000 4000 6000 −3 −2 −1 1
ψ count Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 23 / 27
Finite mixtures Finite mixture
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 24 / 27
Finite mixtures Finite mixture
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 25 / 27
Finite mixtures Finite mixture
The steps of a Gibbs sampler with stationary distribution p(π, µ, Σ, ζ|y) ∝ p(y|ζ, µ, Σ)p(ζ|π)p(µ|Σ)p(Σ)p(π) has steps 1. For i = 1, . . . , n, sample ζi from its full conditional P (ζi = h| . . .) ∝ πhN(yi; µh, Σh) 2. Jointly sample π and µ, σ2 as they are conditionally independent. a. Sample π ∼ Dir(a + Z) where Z = (Z1, . . . , ZH) and Zh = n
i=1 I(ζi = h).
b. For h = 1, . . . , H, sample µh, Σh from their full conditional Σh
ind
∼ IW (D′
h, c′ h)
µh|Σh
ind
∼ N(m′
h, v′2 h Σh)
where v′2
h
= (1/v2
h + Zh)−1
m′
h
= v′2
h (mh/v2 h + Zhyh)
c′
h
= cd + Zh D′
h
= Dh +
i:ζi=h(yi − yh)(yi − yh)⊤ +
1+Zh/v2 h
yh =
1 Zh
Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 26 / 27
Finite mixtures Finite mixture library("rjags") joint_mixture_model = " model { for (i in 1:n) { y[i,1:p] ~ dmnorm(mu[,zeta[i]], Tau[,,zeta[i]]) zeta[i] ~ dcat(pi[]) } for (h in 1:H) { mu[1:p,h] ~ dmnorm(mu0,Tau[,,h]) Tau[1:p,1:p,h] ~ dwish(D[,],c) Sigma[1:p,1:p,h] <- inverse(Tau[,,h]) } pi ~ ddirich(a) }" tmp = hat[sample(nrow(hat), 1000),] dat = list(n=nrow(tmp), y = tmp[,c('phi','psi')], p=2, H=10) dat$a = rep(1/dat$H, dat$H) dat$D = diag(1, dat$p) dat$c = dat$p+1 dat$mu0 = c(3,0) jm = jags.model(textConnection(joint_mixture_model), data = dat, n.chains = 3) r = coda.samples(jm, c('pi','mu','Sigma'), 1e3) Jarad Niemi (STAT615@ISU) Finite mixture models November 28, 2017 27 / 27