Nonparametric Deconvolution Models
Allison J.B. Chaney, Princeton University
In collaboration with Barbara Engelhardt, Archit Verma, and Young-Suk Lee
Objective: model collections of convolved data points.
Examples of convolved data across domains:

General     | Voting              | Bulk RNA-seq          | Images
observation | district vote tally | sample                | image
feature     | issue or candidate  | gene expression level | pixel
particle    | individual voter    | one cell              | light particle
factor      | voting cohort       | cell type             | visual pattern
A note on terminology: here "convolution" means individual particles whose signal is observed together, not the convolution of convolutional neural nets, where a signal progresses through filters.
Related Models
Mixture models assign each observation to one of K clusters, or factors.
Related Models
Admixture models represent groups of observations, each with its own mixture of K shared factors.
Related Models
Decomposition models decompose observations into constituent parts by representing observations as a product between group representations and factor features.
Our Model
Deconvolution models (this work) similarly decompose, or deconvolve, observations into constituent parts, but also capture group-specific (or local) fluctuations in factor features. A toy contrast with decomposition is sketched below.
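To make the distinction concrete, here is a minimal illustrative sketch (not the paper's model): both approaches reconstruct an observation from factor proportions, but deconvolution lets each observation perturb the shared factors.

```python
# Toy contrast between decomposition and deconvolution.
import numpy as np

rng = np.random.default_rng(0)
K, M = 3, 8                        # factors, features
mu = rng.normal(size=(K, M))       # global factor features
pi = rng.dirichlet(np.ones(K))     # one observation's factor proportions

# Decomposition: observation ~ proportions times shared global factors.
y_decomp = pi @ mu

# Deconvolution: each observation gets its own local copy of the factors,
# capturing group-specific fluctuations around the global features.
x_local = mu + 0.1 * rng.normal(size=(K, M))
y_deconv = pi @ x_local
```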
Global question: how do voters usually vote?
Local question: how do voters in district A vote? Each district (A, B, C, ...) has its own local version of the shared voting cohorts.
Our Model
Global factors: shared across all observations.
Local factors: observation-specific variants of the global factors.
The number of factors is unbounded; factor proportions are built on a hierarchical Dirichlet process (HDP).
Our Model: generative process

Global factor proportions via the HDP stick-breaking construction (Paisley, 2012):
\beta'_k \mid \alpha_0 \sim \mathrm{Beta}(1, \alpha_0), \qquad \beta_k = \beta'_k \prod_{\ell=1}^{k-1} (1 - \beta'_\ell)

Observation-specific factor proportions (normalized gammas):
\pi'_{n,k} \mid \alpha, \beta_k \sim \mathrm{Gamma}(\alpha \beta_k, 1), \qquad \pi_{n,k} = \frac{\pi'_{n,k}}{\sum_{\ell=1}^{\infty} \pi'_{n,\ell}}

Global factor features and covariances:
\mu_{k,m} \mid \mu_0, \sigma_0 \sim \mathcal{N}(\mu_0, \sigma_0), \qquad \Sigma_k \mid \nu, \Psi \sim \mathcal{W}^{-1}(\Psi, \nu)

Particle counts and local factors:
P_n \mid \rho \sim \mathrm{Poisson}(\rho), \qquad \bar{x}_{n,k} \mid \pi_{n,k}, \mu_k, \Sigma_k \sim \mathcal{N}_M\!\left( \mu_k, \frac{\Sigma_k}{P_n \pi_{n,k}} \right)

Observed data as a convolution of local factors:
y_{n,m} \mid \bar{x}_n, \pi_n \sim f\!\left( \sum_{k=1}^{\infty} \pi_{n,k} \, g(\bar{x}_{n,k,m}) \right)
Inference
Variational Inference
The exact posterior p is intractable, so we approximate it with an easy-to-compute distribution q.
We use black box variational inference (Ranganath, 2014), with a split-merge procedure (Bryant, 2012) to learn K.
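For reference, this is the standard variational objective (not specific to this model): q is fit by maximizing the evidence lower bound (ELBO), which is equivalent to minimizing the KL divergence from q to the posterior:

\mathcal{L}(\lambda) = \mathbb{E}_{q(z \mid \lambda)}\!\left[\log p(y, z) - \log q(z \mid \lambda)\right] = \log p(y) - \mathrm{KL}\!\left(q(z \mid \lambda) \,\|\, p(z \mid y)\right)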
BBVI overview (Ranganath et al., 2014)
We want to estimate a latent variable z, which has corresponding variational parameter \lambda[z]; \lambda is the set of all variational parameters. The gradient of the ELBO is

\nabla_{\lambda[z]} \mathcal{L} = \mathbb{E}_q\!\left[ \nabla_{\lambda[z]} \log q(z \mid \lambda[z]) \left( \log p_z(y, z, \ldots) - \log q(z \mid \lambda[z]) \right) \right]

If we can approximate this gradient \tilde{\nabla}_{\lambda[z]} \mathcal{L} \approx \nabla_{\lambda[z]} \mathcal{L}, we can use standard stochastic gradient ascent to update \lambda[z]. Averaging over S samples z[s] \sim q(z \mid \lambda[z]) from the variational distribution:

\tilde{\nabla}_{\lambda[z]} \mathcal{L} = \frac{1}{S} \sum_{s=1}^{S} \nabla_{\lambda[z]} \log q(z[s] \mid \lambda[z]) \left( \log p_z(y, z[s], \ldots) - \log q(z[s] \mid \lambda[z]) \right)
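As a concrete illustration of this score-function estimator, here is a minimal sketch on a toy conjugate model (not the deconvolution model itself); the model, step size, and sample count are illustrative assumptions:

```python
# BBVI on a toy model: p(y, z) = N(z; 0, 1) * N(y; z, 1),
# with variational family q(z) = N(m, s^2).
import numpy as np

rng = np.random.default_rng(0)
y = 1.5                      # a single observed data point (illustrative)
m, log_s = 0.0, 0.0          # variational parameters lambda[z]
lr, S = 0.05, 64             # step size and number of samples

def log_norm(x, mean, sd):
    return -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((x - mean) / sd) ** 2

for t in range(500):
    s = np.exp(log_s)
    z = rng.normal(m, s, size=S)                  # z[s] ~ q(z | lambda[z])
    log_p = log_norm(z, 0.0, 1.0) + log_norm(y, z, 1.0)
    log_q = log_norm(z, m, s)
    # Score functions: gradients of log q with respect to m and log_s
    score_m = (z - m) / s**2
    score_log_s = ((z - m) ** 2) / s**2 - 1.0
    weight = log_p - log_q
    # Monte Carlo gradient of the ELBO, averaged over S samples
    m += lr * np.mean(score_m * weight)
    log_s += lr * np.mean(score_log_s * weight)

# The exact posterior here is N(y/2, 1/2), so m should approach 0.75.
print(m, np.exp(log_s))
```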
split/merge overview (Bryant and Sudderth, 2012)
Initialize with fixed K, then loop until full convergence:
- consider splitting each factor
- iterate until batch convergence
- consider merging some factors

To consider a split: initialize the candidate factors' variational parameters, update the variational parameters (one iteration), and accept or reject the split based on the ELBO.

Split initialization for factor k into candidates k' and k'':
\lambda^S[\beta_{k'}] = \rho_t \, \lambda[\beta_k], \qquad \lambda^S[\beta_{k''}] = (1 - \rho_t) \, \lambda[\beta_k]
\lambda^S[\pi_{n,k'}] = \rho_t \, \lambda[\pi_{n,k}], \qquad \lambda^S[\pi_{n,k''}] = (1 - \rho_t) \, \lambda[\pi_{n,k}]
\lambda^S[\mu_{k'}] = \lambda[\mu_k], \qquad \lambda^S[\mu_{k''}] = \lambda[\mu_k] + \epsilon
\lambda^S[\Sigma_{k'}] = \lambda[\Sigma_k], \qquad \lambda^S[\Sigma_{k''}] = \lambda[\Sigma_k]
\lambda^S[x_{n,k'}] = \lambda[x_{n,k}], \qquad \lambda^S[x_{n,k''}] = \lambda[x_{n,k}]
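A minimal sketch of this split initialization; the dictionary layout and function name are assumptions for illustration, not the paper's code:

```python
# Split factor k's variational parameters between candidates k' and k''.
import numpy as np

def split_factor(lam, k, rho_t=0.5, eps_scale=1e-2, rng=None):
    """Return variational parameters for candidate factors k' and k''."""
    rng = rng or np.random.default_rng()
    eps = eps_scale * rng.standard_normal(lam["mu"][k].shape)
    k1 = {  # factor k': keeps a rho_t share of the mass
        "beta": rho_t * lam["beta"][k],
        "pi": rho_t * lam["pi"][:, k],        # one entry per observation n
        "mu": lam["mu"][k].copy(),
        "Sigma": lam["Sigma"][k].copy(),
        "x_bar": lam["x_bar"][:, k].copy(),
    }
    k2 = {  # factor k'': identical except mu is perturbed by epsilon
        "beta": (1 - rho_t) * lam["beta"][k],
        "pi": (1 - rho_t) * lam["pi"][:, k],
        "mu": lam["mu"][k] + eps,
        "Sigma": lam["Sigma"][k].copy(),
        "x_bar": lam["x_bar"][:, k].copy(),
    }
    return k1, k2

# A split is kept only if one variational update improves the ELBO.
```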