  1. Approximate Inference: Sampling Methods CMSC 678 UMBC

  2. Outline Recap Monte Carlo methods Sampling Techniques Uniform sampling Importance Sampling Rejection Sampling Metropolis-Hastings Gibbs sampling Example: Collapsed Gibbs Sampler for Topic Models

  3. Recap from last time…

  4. Exponential Family Forms: Capture Common Distributions. Discrete (finite distributions); Dirichlet (distributions over (finite) distributions); Gaussian; Gamma, Exponential, Poisson, Negative-Binomial, Laplace, log-Normal, …

  5. Exponential Family Forms: "Easy" Posterior Inference. The posterior has the same form as the prior: the prior is conjugate to the likelihood.

     Posterior          | Likelihood            | Prior
     Dirichlet (Beta)   | Discrete (Bernoulli)  | Dirichlet (Beta)
     Normal             | Normal (fixed var.)   | Normal
     Gamma              | Exponential           | Gamma
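The first row of the table can be checked concretely. A minimal sketch of the Beta-Bernoulli conjugate update; the prior pseudo-counts and coin-flip data below are assumed for illustration:

```python
import numpy as np

# With a Beta(a, b) prior on a Bernoulli parameter and k heads in n flips,
# the posterior is Beta(a + k, b + n - k): the same family as the prior,
# which is what makes the prior "conjugate" to the likelihood.
a, b = 2.0, 2.0                          # prior pseudo-counts (assumed values)
data = np.array([1, 1, 0, 1, 0, 1, 1])   # toy Bernoulli observations
k, n = data.sum(), len(data)
post_a, post_b = a + k, b + n - k
post_mean = post_a / (post_a + post_b)
print(post_a, post_b, post_mean)         # Beta(7.0, 4.0), mean ~0.636
```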

  6. Variational Inference: A Gradient-Based Optimization Technique. The posterior p(θ|x) is difficult to compute; the variational distribution q_λ(θ) is easy(ier) to compute. Minimize the "difference" between them by changing λ:
     Set t = 0 and pick a starting value λ_t. Until converged:
     1. Get value y_t = F(q(•; λ_t))
     2. Get gradient g_t = F′(q(•; λ_t))
     3. Get scaling factor ρ_t
     4. Set λ_{t+1} = λ_t + ρ_t · g_t
     5. Set t += 1
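The update loop on slide 6 is plain gradient ascent on F. A minimal sketch, where F below is a simple stand-in objective (an assumption for illustration, not the actual variational objective):

```python
# Stand-in objective F(lam) = -(lam - 2)^2 with gradient F'(lam) = -2(lam - 2);
# its maximizer is lam = 2.
def F(lam):
    return -(lam - 2.0) ** 2

def F_grad(lam):
    return -2.0 * (lam - 2.0)

def gradient_ascent(lam, rho=0.1, tol=1e-8, max_iters=10_000):
    for t in range(max_iters):
        g = F_grad(lam)                   # step 2: gradient at current lambda
        new_lam = lam + rho * g           # step 4: lambda_{t+1} = lambda_t + rho * g_t
        if abs(new_lam - lam) < tol:      # converged
            return new_lam
        lam = new_lam                     # step 5: advance t
    return lam

print(gradient_ascent(0.0))  # approaches the maximizer lam = 2
```

Here the scaling factor ρ_t (step 3) is held fixed; in practice it would follow a step-size schedule.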

  7. Variational Inference: The Function to Optimize. Find the best distribution by minimizing a KL-divergence (an expectation) between q(θ), with variational parameters λ for θ, and the desired model's posterior p(θ|x):
     D_KL( q(θ) || p(θ|x) ) = 𝔼_{q(θ)}[ log ( q(θ) / p(θ|x) ) ]
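The KL objective can be checked numerically on a toy problem; the discrete distributions q and p below are assumed for illustration. Since the KL-divergence is an expectation under q, it can be computed exactly or estimated by sampling from q:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete distributions (assumed for illustration).
q = np.array([0.5, 0.3, 0.2])   # variational distribution q(theta)
p = np.array([0.4, 0.4, 0.2])   # target posterior p(theta | x)

# Exact KL: sum_theta q(theta) * log(q(theta) / p(theta))
kl_exact = np.sum(q * np.log(q / p))

# Monte Carlo estimate: E_{theta ~ q}[ log q(theta) / p(theta) ]
samples = rng.choice(len(q), size=100_000, p=q)
kl_mc = np.mean(np.log(q[samples] / p[samples]))

print(kl_exact, kl_mc)  # the two agree to a few decimal places
```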

  8. Goal: Posterior Inference. Hyperparameters α; unknown parameters Θ; observed data. Posterior: p_α(Θ | data). Likelihood model: p(data | Θ).

  9. (Some) Learning Techniques. MAP/MLE: point estimation, basic EM. Variational inference: functional optimization. Sampling/Monte Carlo: today.

  10. Outline Recap Monte Carlo methods Sampling Techniques Uniform sampling Importance Sampling Rejection Sampling Metropolis-Hastings Gibbs sampling Example: Collapsed Gibbs Sampler for Topic Models

  11–16. Two Problems for Sampling Methods to Solve
     Problem 1: generate samples from p, where p(x) = p̃(x)/Z, x ∈ ℝ^N, giving samples x^(1), x^(2), …, x^(R).
     Problem 2: estimate the expectation of a function φ under p:
        Φ = ⟨φ(x)⟩_p = 𝔼_{x∼p}[φ(x)] = ∫ p(x) φ(x) dx,
     using the estimator Φ̂ = (1/R) Σ_r φ(x^(r)).
     Q: Why is sampling from p(x) hard? A1: Can we evaluate Z? A2: Can we sample without enumerating? (Correct samples should be where p is big.)
     Running example (ITILA, Fig. 29.1): p̃(x) = exp(0.4(x − 0.4)² − 0.08x⁴).
     If we could sample from p, then 𝔼[Φ̂] = Φ: Φ̂ is a consistent estimator.
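When sampling from p is easy, the simple Monte Carlo estimator above works directly. A minimal sketch, taking p to be a standard normal and φ(x) = x² (both assumptions for illustration), so the true value is Φ = 𝔼[x²] = 1:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simple Monte Carlo: Phi_hat = (1/R) sum_r phi(x^(r)) with x^(r) ~ p.
phi = lambda x: x ** 2          # phi(x) = x^2, so Phi = E[x^2] = 1 under N(0, 1)
R = 200_000
x = rng.standard_normal(R)      # x^(1), ..., x^(R) drawn from p
phi_hat = phi(x).mean()
print(phi_hat)                  # close to the true value 1.0
```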

  17. Outline Recap Monte Carlo methods Sampling Techniques Uniform sampling Importance Sampling Rejection Sampling Metropolis-Hastings Gibbs sampling Example: Collapsed Gibbs Sampler for Topic Models

  18–22. Goal: Uniform Sampling
     Target: Φ = ⟨φ(x)⟩_p = 𝔼_{x∼p}[φ(x)]. Sample uniformly: x^(1), x^(2), …, x^(R), then estimate
        Φ̂ = Σ_r φ(x^(r)) p*(x^(r)), where p*(x) = p̃(x)/Z* and Z* = Σ_r p̃(x^(r)).
     This might work if R (the number of samples) sufficiently hits high-probability regions.
     Ising model example: 2^H states of high probability out of 2^N states total, so the chance of a sample landing in the high-probability region is 2^H / 2^N, and the minimum number of samples needed is ∼ 2^(N−H).
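A sketch of the uniform-sampling estimator on the running one-dimensional density; the integration box [-4, 4] and the test function φ(x) = x are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Unnormalized target density (ITILA Fig. 29.1).
p_tilde = lambda x: np.exp(0.4 * (x - 0.4) ** 2 - 0.08 * x ** 4)
phi = lambda x: x               # estimate the mean of x under p

R = 100_000
x = rng.uniform(-4.0, 4.0, R)   # sample uniformly over a box containing the mass
w = p_tilde(x)
p_star = w / w.sum()            # p*(x^(r)) = p~(x^(r)) / Z*, Z* = sum_r p~(x^(r))
phi_hat = np.sum(phi(x) * p_star)
print(phi_hat)                  # negative: the left mode of p~ carries more mass
```

In one dimension with a snug box this works well; the Ising-model count on the slide shows why the same scheme fails in high dimensions.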

  23. Outline Recap Monte Carlo methods Sampling Techniques Uniform sampling Importance Sampling Rejection Sampling Metropolis-Hastings Gibbs sampling Example: Collapsed Gibbs Sampler for Topic Models

  24–30. Goal: Importance Sampling
     Target: Φ = ⟨φ(x)⟩_p = 𝔼_{x∼p}[φ(x)]. Approximating distribution: Q(x) ∝ q̃(x). Sample from Q: x^(1), x^(2), …, x^(R), then estimate
        Φ̂ = Σ_r φ(x^(r)) w(x^(r)) / Σ_r w(x^(r)), with importance weights w(x^(r)) = p̃(x^(r)) / q̃(x^(r)).
     Where Q(x) > p(x), x is over-represented; where Q(x) < p(x), x is under-represented; the weights w correct for this. (ITILA, Fig. 29.5)
     Q: How reliable will this estimator be? A: In practice, difficult to say: the weights w(x^(r)) may not be a good indicator.
     Q: How do you choose a good approximating distribution? A: It is task/domain specific.
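A sketch of self-normalized importance sampling on the same one-dimensional density; the Gaussian proposal Q and the test function φ(x) = x are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Unnormalized target, and a broad Gaussian proposal Q (assumed for
# illustration; any Q covering p's support works in principle).
p_tilde = lambda x: np.exp(0.4 * (x - 0.4) ** 2 - 0.08 * x ** 4)
q_tilde = lambda x: np.exp(-x ** 2 / (2 * 2.0 ** 2))   # N(0, 2^2), unnormalized
phi = lambda x: x

R = 100_000
x = 2.0 * rng.standard_normal(R)        # x^(r) ~ Q
w = p_tilde(x) / q_tilde(x)             # w(x^(r)) = p~(x^(r)) / q~(x^(r))
phi_hat = np.sum(phi(x) * w) / w.sum()  # self-normalized IS estimate
ess = w.sum() ** 2 / np.sum(w ** 2)     # effective sample size: one rough
print(phi_hat, ess)                     # diagnostic for weight degeneracy
```

Because the estimator is self-normalized, neither p̃ nor q̃ needs to be normalized; the effective sample size illustrates the slide's caveat that the weights alone can be a poor reliability indicator.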
