Nonparametric Deconvolution Models

  1. Nonparametric Deconvolution Models. Allison J.B. Chaney, Princeton University. In collaboration with Barbara Engelhardt, Archit Verma, and Young-Suk Lee.

  2.–4. Objective: Model collections of convolved data points.

  5. Objective: Model collections of convolved data points. [Figure: observations drawn as collections of particles, each particle labeled with its factor (1, 2, or 3).]

  6. Objective: Model collections of convolved data points. The same structure appears across applications:

      General      | Voting               | Bulk RNA-seq           | Images
      observation  | district vote tally  | sample                 | image
      feature      | issue or candidate   | gene expression level  | pixel
      particle     | individual voter     | one cell               | light particle
      factor       | voting cohort        | cell type              | visual pattern

  7.–8. “Convolution” here means individual particles (the signal) observed together; this is distinct from the progressing-signal convolutions of convolutional neural nets. [Figure: particles labeled 1–3 merging into a single observation.]

  9. Related Models. Mixture models assign each observation to one of K clusters, or factors. [Figure: scatterplot of observations, each point assigned to a single cluster.]

  10. Related Models. Admixture models represent groups of observations, each group with its own mixture of K shared factors. [Figure: scatterplots of grouped observations sharing the same K factors across groups.]

  11. Related Models. Decomposition models decompose observations into constituent parts by representing observations as a product of group representations and factor features.

  12. Our Model. Deconvolution models (this work) similarly decompose, or deconvolve, observations into constituent parts, but additionally capture group-specific (or local) fluctuations in factor features.
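  To make the contrast concrete, here is a minimal numpy sketch, my own illustration rather than the paper's parameterization; the sizes, noise scale, and Dirichlet weights are placeholder assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M = 10, 3, 5  # observations, factors, features (placeholder sizes)

# Decomposition: observations are a product of weights and *shared* factors.
pi = rng.dirichlet(np.ones(K), size=N)  # per-observation factor weights
mu = rng.normal(0.0, 1.0, size=(K, M))  # global factor features
y_decomposition = pi @ mu               # shape (N, M)

# Deconvolution: each observation also gets its own *local* copy of every
# factor, fluctuating around the global features.
x_local = mu[None, :, :] + rng.normal(0.0, 0.1, size=(N, K, M))
y_deconvolution = np.einsum('nk,nkm->nm', pi, x_local)  # shape (N, M)
```

  The only difference between the two outputs is that the deconvolution line mixes observation-specific factor copies x_local rather than the shared mu.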

  13. How do voters usually vote? [Figure: voters labeled by cohort (1, 2, or 3), pooled across all districts.]

  14. How do voters vote in district A? [Figure: districts A–E, each containing its own mix of voters from cohorts 1–3.]

  15. Our Model

  16. Our Model: global factors. [Figure: scatterplot with the shared (global) factor positions marked.]

  17. Our Model: local factors (observation-specific). [Figure: scatterplot with each observation's local copies of the factors marked.]

  18. Our Model

  19.–20. Our Model: a hierarchical Dirichlet process (HDP) prior ties the factor weights together across observations.

  21. Our Model: stick-breaking construction of the global weights (Paisley, 2012), the top level of the HDP:

      β′_k | α₀ ∼ Beta(1, α₀)
      β_k = β′_k ∏_{ℓ=1}^{k−1} (1 − β′_ℓ)

  22. Our Model: observation-level weights, a normalized-Gamma construction on top of the global weights:

      π′_{n,k} | α, β_k ∼ Gamma(α β_k, 1)
      π_{n,k} = π′_{n,k} / ∑_{ℓ=1}^{∞} π′_{n,ℓ}
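  A minimal numpy sketch of this two-level construction, assuming a fixed truncation level K_trunc in place of the infinite sum; the truncation level and concentration values are placeholders:

```python
import numpy as np

rng = np.random.default_rng(1)
K_trunc, alpha0, alpha = 50, 1.0, 1.0  # truncation and concentrations (placeholders)

# Global weights via stick-breaking: beta'_k ~ Beta(1, alpha0),
# beta_k = beta'_k * prod_{l<k} (1 - beta'_l)
beta_prime = rng.beta(1.0, alpha0, size=K_trunc)
sticks = np.cumprod(np.concatenate(([1.0], 1.0 - beta_prime[:-1])))
beta = beta_prime * sticks

# Observation-level weights: pi'_{n,k} ~ Gamma(alpha * beta_k, 1),
# normalized across k (the infinite sum is truncated at K_trunc)
pi_prime = rng.gamma(alpha * beta, 1.0)
pi = pi_prime / pi_prime.sum()
```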

  23. Our Model: global factor feature means:

      μ_{k,m} | μ₀, σ₀ ∼ 𝒩(μ₀, σ₀)

  24. Our Model: global factor covariances, with an inverse-Wishart prior:

      Σ_k | ν, Ψ ∼ 𝒲⁻¹(Ψ, ν)

  25. Our Model: local (observation-specific) factors, whose variance shrinks as a factor accounts for more particles:

      x̄_{n,k} | π_{n,k}, μ_k, Σ_k ∼ 𝒩_M( μ_k, Σ_k / (P_n π_{n,k}) )

  26. Our Model: particle counts:

      P_n | ρ ∼ Poisson(ρ)

  27. Our Model: observed features combine the local factors through a link function g and a likelihood f:

      y_{n,m} | x̄_n, π_n ∼ f( ∑_{k=1}^{∞} π_{n,k} g(x̄_{n,k,m}) )
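  Putting slides 21–27 together, here is a hedged numpy/scipy sketch of the generative process for one observation. The identity link g, Gaussian likelihood f, and all hyperparameter values are stand-in assumptions, and the weights π are drawn from a Dirichlet rather than the HDP above, just to keep the sketch self-contained:

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(2)
M, K_trunc = 4, 10                      # feature and factor counts (placeholders)
mu0, sigma0 = 0.0, 1.0                  # prior on factor means
nu, Psi = M + 2, np.eye(M)              # inverse-Wishart hyperparameters
rho = 100.0                             # expected particles per observation

# Global factors: feature means mu_k and covariances Sigma_k
mu = rng.normal(mu0, sigma0, size=(K_trunc, M))
Sigma = np.stack([invwishart.rvs(df=nu, scale=Psi) for _ in range(K_trunc)])

# Weights pi_n come from the HDP construction above; a Dirichlet draw
# stands in here for brevity.
pi = rng.dirichlet(np.ones(K_trunc))
P_n = max(int(rng.poisson(rho)), 1)     # particle count, P_n ~ Poisson(rho)

# Local factors: observation-specific copies of the global factors
x_bar = np.stack([
    rng.multivariate_normal(mu[k], Sigma[k] / (P_n * pi[k]))
    for k in range(K_trunc)
])

# Observation: y_{n,m} ~ f(sum_k pi_{n,k} g(x_bar_{n,k,m})); identity link g
# and Gaussian f are stand-ins for the application-specific choices.
g = lambda x: x
y_n = rng.normal((pi[:, None] * g(x_bar)).sum(axis=0), 0.1)
```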

  28. Inference

  29. Inference. [Figure: the graphical model with its latent variables marked as unknowns.]

  30.–32. Variational Inference: approximate the intractable posterior p with an easy-to-compute approximation q. We use black box variational inference (Ranganath, 2014) together with a split-merge procedure (Bryant, 2012) to learn the number of factors K.

  33.–38. BBVI overview (Ranganath et al., 2014). We want to estimate a latent variable z, which has corresponding variational parameter λ[z]; λ is the set of all variational parameters. The gradient of the ELBO is

      ∇_{λ[z]} ℒ = 𝔼_q[ ∇_{λ[z]} log q(z | λ[z]) ( log p_z(y, z, …) − log q(z | λ[z]) ) ].

  If we can approximate this gradient with some ∇̃_{λ[z]} ℒ, we can use standard stochastic gradient ascent to update λ[z]. Averaging over S samples z[s] ∼ q(z | λ[z]) from the variational distribution gives the estimator

      ∇̃_{λ[z]} ℒ = (1/S) ∑_{s=1}^{S} ∇_{λ[z]} log q(z[s] | λ[z]) ( log p_z(y, z[s], …) − log q(z[s] | λ[z]) ).
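  A minimal sketch of this estimator for a single one-dimensional Gaussian variational factor; the target log p, step size, and sample count S are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

def bbvi_gradient(lam, log_p, S=64):
    """Score-function Monte Carlo estimate of the ELBO gradient for a
    one-dimensional Gaussian q(z | lam) = N(lam['m'], exp(lam['log_s'])^2)."""
    m, s = lam['m'], np.exp(lam['log_s'])
    z = rng.normal(m, s, size=S)                      # z[s] ~ q(z | lam)
    log_q = -0.5 * ((z - m) / s) ** 2 - np.log(s) - 0.5 * np.log(2 * np.pi)
    w = log_p(z) - log_q                              # log p - log q per sample
    score_m = (z - m) / s**2                          # d/dm log q
    score_log_s = ((z - m) / s) ** 2 - 1.0            # d/dlog_s log q
    return {'m': np.mean(score_m * w), 'log_s': np.mean(score_log_s * w)}

# Toy usage: fit q by gradient ascent toward an unnormalized target N(2, 0.5^2)
log_p = lambda z: -0.5 * ((z - 2.0) / 0.5) ** 2
lam = {'m': 0.0, 'log_s': 0.0}
for _ in range(2000):
    grad = bbvi_gradient(lam, log_p)
    lam = {key: lam[key] + 0.01 * grad[key] for key in lam}
```

  Because the estimator only needs evaluations of log p and samples from q, it applies to models like this one, whose ELBO has no closed form.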

  39.–41. Split/merge overview (Bryant and Sudderth, 2012): initialize with a fixed K; consider splitting each factor; iterate until batch convergence; consider merging some factors; repeat until full convergence.

  42. Each candidate split: initialize the variational parameters for the two child factors, update the variational parameters (one iteration), then accept or reject the split based on the ELBO.

  43. Splitting the global weights: λ_S[β_{k′}] = ρ_t λ[β_k],  λ_S[β_{k″}] = (1 − ρ_t) λ[β_k].

  44. Splitting the observation-level weights: λ_S[π_{n,k′}] = ρ_t λ[π_{n,k}],  λ_S[π_{n,k″}] = (1 − ρ_t) λ[π_{n,k}].

  45. Splitting the factor means: λ_S[μ_{k′}] = λ[μ_k],  λ_S[μ_{k″}] = λ[μ_k] + ε.

  46. Splitting the factor covariances: λ_S[Σ_{k′}] = λ[Σ_k],  λ_S[Σ_{k″}] = λ[Σ_k].

  47. Splitting the local factors: λ_S[x_{n,k′}] = λ[x_{n,k}],  λ_S[x_{n,k″}] = λ[x_{n,k}].
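  A hedged sketch of the split proposal described on slides 42–47, assuming the variational parameters live in a dict of numpy arrays keyed 'beta', 'pi', 'mu', 'Sigma', and 'x'; the storage layout and the helper name split_factor are my assumptions, not the authors' code:

```python
import numpy as np

def split_factor(lam, k, rho_t, eps_scale=0.01, rng=None):
    """Propose splitting factor k into k' and k'': weight-type parameters
    are divided in proportion rho_t, the new mean is nudged by eps, and
    covariance and local-factor parameters are copied verbatim."""
    if rng is None:
        rng = np.random.default_rng()
    prop = {key: np.array(val, copy=True) for key, val in lam.items()}
    # lambda_S[beta_k'] = rho_t * lambda[beta_k];
    # lambda_S[beta_k''] = (1 - rho_t) * lambda[beta_k]; likewise for pi_{n,k}
    for key in ('beta', 'pi'):
        old = lam[key][..., k:k+1]              # keep the factor axis
        prop[key][..., k:k+1] = rho_t * old     # slot k becomes k'
        prop[key] = np.concatenate([prop[key], (1.0 - rho_t) * old], axis=-1)
    # lambda_S[mu_k'] = lambda[mu_k];  lambda_S[mu_k''] = lambda[mu_k] + eps
    eps = eps_scale * rng.normal(size=lam['mu'][k].shape)
    prop['mu'] = np.concatenate([prop['mu'], (lam['mu'][k] + eps)[None, :]], axis=0)
    # Covariances and local factors are copied for both children
    prop['Sigma'] = np.concatenate([prop['Sigma'], lam['Sigma'][k][None]], axis=0)
    prop['x'] = np.concatenate([prop['x'], lam['x'][:, k][:, None]], axis=1)
    return prop  # caller runs one VI update, then accepts/rejects on the ELBO
```

  Perturbing only the new mean (slide 45) while copying the covariances and local factors lets one variational update pull the two children apart before the ELBO-based accept/reject decision.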
