
Sampling Methods. Oliver Schulte, CMPT 419/726. Bishop PRML Ch. 11.



  1. Sampling Methods. Oliver Schulte, CMPT 419/726. Bishop PRML Ch. 11.

  2. Recall: Inference for General Graphs
   • The junction tree algorithm is an exact inference method for arbitrary graphs.
   • It defines a particular tree structure over cliques of variables.
   • Inference ends up being exponential in the maximum clique size, and is therefore slow in many cases.
   • Sampling methods represent the desired distribution with a set of samples; as more samples are used, we obtain a more accurate representation.

  3. Outline: Sampling, Rejection Sampling, Importance Sampling, Markov Chain Monte Carlo

  5. Sampling
   • The fundamental problem we address in this lecture is how to obtain samples from a probability distribution p(z).
   • This could be a conditional distribution p(z | e).
   • We often wish to evaluate expectations such as
     $E[f] = \int f(z)\, p(z)\, dz$
     e.g. the mean, when f(z) = z.
   • For complicated p(z) this is difficult to do exactly, so we approximate it as
     $\hat{f} = \frac{1}{L} \sum_{l=1}^{L} f(z^{(l)})$
     where $\{ z^{(l)} \mid l = 1, \ldots, L \}$ are independent samples from p(z).
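A minimal Python sketch of this Monte Carlo estimator (the choice of p(z) as a standard normal and f(z) = z² is my own toy example, not from the slides):

```python
import random

# Monte Carlo estimate: f_hat = (1/L) * sum_l f(z^(l)), with z^(l) drawn i.i.d. from p(z).
# Toy setup: p(z) is a standard normal and f(z) = z^2, so the true E[f] is 1.

def f(z):
    return z * z

L = 100_000
samples = [random.gauss(0.0, 1.0) for _ in range(L)]   # independent samples z^(l) ~ p(z)
f_hat = sum(f(z) for z in samples) / L
print("Monte Carlo estimate of E[z^2]:", f_hat)         # should be close to 1.0
```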

  6. Sampling
   [Figure: a distribution p(z) and a function f(z) plotted over z.]
   • Approximate
     $\hat{f} = \frac{1}{L} \sum_{l=1}^{L} f(z^{(l)})$
     where $\{ z^{(l)} \mid l = 1, \ldots, L \}$ are independent samples from p(z).
   • Demo on Excel sheet.

  7. Bayesian Networks - Generating Fair Samples
   [Figure: the sprinkler network. Cloudy is a parent of Sprinkler and Rain; Sprinkler and Rain are parents of Wet Grass.]
     P(C) = .50
     P(S | C):  C = T: .10;  C = F: .50
     P(R | C):  C = T: .80;  C = F: .20
     P(W | S, R):  T, T: .99;  T, F: .90;  F, T: .90;  F, F: .01
   • How can we generate a fair set of samples from this BN?
   (from Russell and Norvig, AIMA)

  8. Sampling from Bayesian Networks
   • Sampling from discrete Bayesian networks with no observations is straightforward, using ancestral sampling.
   • The Bayesian network specifies a factorization of the joint distribution:
     $P(z_1, \ldots, z_n) = \prod_{i=1}^{n} P(z_i \mid \mathrm{pa}(z_i))$
   • Sample in order: sample parents before children. This is possible because the graph is a DAG.
   • Choose a value for z_i from p(z_i | pa(z_i)), with the parents fixed to their already-sampled values. A code sketch follows below.
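A minimal Python sketch of ancestral sampling on the sprinkler network above (the helper names are my own; the CPT values are taken from the slide):

```python
import random

# CPTs for the sprinkler network (values from the slide).
P_C = 0.50
P_S_GIVEN_C = {True: 0.10, False: 0.50}
P_R_GIVEN_C = {True: 0.80, False: 0.20}
P_W_GIVEN_SR = {(True, True): 0.99, (True, False): 0.90,
                (False, True): 0.90, (False, False): 0.01}

def bernoulli(p):
    """Return True with probability p."""
    return random.random() < p

def ancestral_sample():
    """Sample each variable given its already-sampled parents, in topological order."""
    c = bernoulli(P_C)
    s = bernoulli(P_S_GIVEN_C[c])
    r = bernoulli(P_R_GIVEN_C[c])
    w = bernoulli(P_W_GIVEN_SR[(s, r)])
    return {"Cloudy": c, "Sprinkler": s, "Rain": r, "WetGrass": w}

# Estimate a marginal by discarding the unneeded variables from each joint sample.
L = 100_000
rain_count = sum(ancestral_sample()["Rain"] for _ in range(L))
print("estimated p(rain) =", rain_count / L)   # should approach .50*.80 + .50*.20 = .50
```

The last two lines also illustrate the "sampling marginals" idea from a later slide: each joint sample is reduced to just the variable of interest.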

  9. Sampling From Empty Network - Example
   [Figure: the same sprinkler network and CPTs as on slide 7.]
   (Slides 10-15 repeat this figure, stepping through the example by sampling Cloudy, then Sprinkler, Rain, and Wet Grass in turn.)
   (from Russell and Norvig, AIMA)

  16. Ancestral Sampling
   • This sampling procedure is fair: the fraction of samples with a particular value tends towards the joint probability of that value.

  17. Sampling Marginals
   [Figure: the same sprinkler network and CPTs as before.]
   • Note that this procedure can be applied to generate samples for marginals as well.
   • Simply discard the portions of each sample which are not needed.
   • e.g. For the marginal p(rain), the sample (cloudy = t, sprinkler = f, rain = t, wg = t) just becomes (rain = t).
   • This is still a fair sampling procedure.

  18. Other Problems
   • Continuous variables?
     • Gaussians are okay: Box-Muller and other methods exist (a sketch follows below).
     • More complex distributions?
   • Undirected graphs (MRFs)?
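A minimal sketch of the Box-Muller transform mentioned above, which turns pairs of uniform draws into independent standard normal draws (the variable names are my own):

```python
import math
import random

def box_muller():
    """Map two Uniform(0,1) draws to two independent N(0,1) draws."""
    u1 = 1.0 - random.random()          # in (0, 1]; avoids log(0)
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

samples = [box_muller()[0] for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(f"mean = {mean:.3f}, variance = {var:.3f}")   # should be close to 0 and 1
```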

  19. Outline: Sampling, Rejection Sampling, Importance Sampling, Markov Chain Monte Carlo

  20. Rejection Sampling
   [Figure: a distribution p(z) and a function f(z) plotted over z.]
   • Consider the case of an arbitrary, continuous p(z).
   • How can we draw samples from it?
   • Assume we can evaluate p(z) up to some normalization constant:
     $p(z) = \frac{1}{Z_p} \tilde{p}(z)$
     where $\tilde{p}(z)$ can be efficiently evaluated (e.g. an MRF).

  21. Proposal Distribution
   [Figure: the scaled proposal $k q(z)$ lies above $\tilde{p}(z)$; a candidate $z_0$ and a uniform draw $u_0$ under $k q(z_0)$ are shown.]
   • Let's also assume that we have some simpler distribution q(z), called a proposal distribution, from which we can easily draw samples.
   • e.g. q(z) is a Gaussian.
   • We can then draw samples from q(z) and use these.
   • But these wouldn't be fair samples from p(z)?!

  22. Comparison Function and Rejection
   [Figure: same as the previous slide.]
   • Introduce a constant k such that $k q(z) \ge \tilde{p}(z)$ for all z.
   • Rejection sampling procedure (a code sketch follows below):
     • Generate $z_0$ from q(z).
     • Generate $u_0$ uniformly from $[0, k q(z_0)]$.
     • If $u_0 > \tilde{p}(z_0)$, reject the sample $z_0$; otherwise keep it.
   • The original samples are uniform in the grey region; the kept samples are uniform in the white region under $\tilde{p}(z)$, hence they are samples from p(z).
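A minimal Python sketch of this procedure (the unnormalized target, the uniform proposal, and the value of k are my own toy choices, not from the slides):

```python
import math
import random

def p_tilde(z):
    """Unnormalized target: two Gaussian-shaped bumps; its maximum is below 1.5."""
    return math.exp(-0.5 * (z - 1.5) ** 2) + 0.5 * math.exp(-2.0 * (z + 1.0) ** 2)

# Proposal q(z): Uniform(-6, 6), which covers essentially all of the target's mass.
Q_LOW, Q_HIGH = -6.0, 6.0
q_density = 1.0 / (Q_HIGH - Q_LOW)
k = 1.5 / q_density            # chosen so that k * q(z) >= p_tilde(z) for all z

def rejection_sample():
    """Draw one sample from p(z) by accept/reject against the envelope k * q(z)."""
    while True:
        z0 = random.uniform(Q_LOW, Q_HIGH)           # z0 ~ q(z)
        u0 = random.uniform(0.0, k * q_density)      # u0 ~ Uniform[0, k q(z0)]
        if u0 <= p_tilde(z0):                        # keep z0 iff it falls under p_tilde
            return z0

samples = [rejection_sample() for _ in range(10_000)]
print("sample mean:", sum(samples) / len(samples))
```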

  23. Rejection Sampling Analysis
   • How likely are we to keep samples?
   • The probability that a sample is accepted is:
     $p(\text{accept}) = \int \frac{\tilde{p}(z)}{k q(z)}\, q(z)\, dz = \frac{1}{k} \int \tilde{p}(z)\, dz$
   • A smaller k is better, subject to $k q(z) \ge \tilde{p}(z)$ for all z.
   • If q(z) is similar to $\tilde{p}(z)$, this is easier.
   • In high-dimensional spaces, the acceptance ratio falls off exponentially.
   • Finding a suitable k is challenging.

  24. Outline: Sampling, Rejection Sampling, Importance Sampling, Markov Chain Monte Carlo

  25. Discretization
   • Importance sampling is a sampling technique for computing expectations:
     $E[f] = \int f(z)\, p(z)\, dz$
   • We could approximate this using a discretization over a uniform grid:
     $E[f] \approx \sum_{l=1}^{L} f(z^{(l)})\, p(z^{(l)})$
   • c.f. a Riemann sum.
   • This involves much wasted computation and exponential scaling in dimension.
   • Instead, we again use a proposal distribution in place of a uniform grid. (A sketch of the grid sum follows below.)
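A minimal sketch of the grid approximation (my own toy example: p(z) is a standard normal and f(z) = z², so E[f] = 1; the grid spacing dz is included explicitly so the sum is a proper Riemann approximation):

```python
import math

def p(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def f(z):
    return z * z

L = 1000
z_lo, z_hi = -6.0, 6.0
dz = (z_hi - z_lo) / L
grid = [z_lo + (l + 0.5) * dz for l in range(L)]     # midpoints of a uniform grid

# E[f] ~ sum_l f(z_l) p(z_l) dz, where dz plays the role of the grid-cell volume.
estimate = sum(f(z) * p(z) for z in grid) * dz
print("grid estimate of E[z^2]:", estimate)          # close to 1.0
```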

  26. Importance Sampling
   [Figure: the target p(z), the proposal q(z), and a function f(z) plotted over z.]
   • Approximate the expectation by drawing points from q(z):
     $E[f] = \int f(z)\, p(z)\, dz = \int f(z)\, \frac{p(z)}{q(z)}\, q(z)\, dz \approx \frac{1}{L} \sum_{l=1}^{L} \frac{p(z^{(l)})}{q(z^{(l)})}\, f(z^{(l)})$
   • The quantities $p(z^{(l)}) / q(z^{(l)})$ are known as importance weights.
   • They correct for the use of the wrong distribution q(z) in sampling.
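A minimal importance-sampling sketch (my own toy example, matching the grid example above: p(z) is a standard normal, f(z) = z², and the proposal q(z) is a wider normal N(0, 2²)):

```python
import math
import random

def p(z):
    """Target: standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def q(z, sigma=2.0):
    """Proposal: N(0, sigma^2), easy to sample from."""
    return math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def f(z):
    return z * z

L = 100_000
samples = [random.gauss(0.0, 2.0) for _ in range(L)]          # z^(l) ~ q(z)
weights = [p(z) / q(z) for z in samples]                      # importance weights
estimate = sum(w * f(z) for w, z in zip(weights, samples)) / L
print("importance-sampling estimate of E[z^2]:", estimate)    # close to 1.0
```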

  27. Sampling-Importance-Resampling
   • Note that importance sampling (e.g. likelihood-weighted sampling) gives an approximation to an expectation, not samples.
   • But samples can be obtained using these ideas.
   • Sampling-importance-resampling uses a proposal distribution q(z) to generate samples.
   • Unlike rejection sampling, no parameter k is needed.
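A minimal sampling-importance-resampling sketch (my own toy example: the target is a standard normal, the proposal is Uniform(-6, 6)): draw a pool from the proposal, weight each point by p/q, then resample in proportion to the weights; no envelope constant k is required.

```python
import math
import random

def p(z):
    """Target: standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

Q_LOW, Q_HIGH = -6.0, 6.0

def q(z):
    """Proposal: Uniform(-6, 6) density."""
    return 1.0 / (Q_HIGH - Q_LOW)

L = 50_000
pool = [random.uniform(Q_LOW, Q_HIGH) for _ in range(L)]   # draw from the proposal
weights = [p(z) / q(z) for z in pool]                      # importance weights

# Resample with replacement, proportional to the weights, to get approximate samples from p(z).
resampled = random.choices(pool, weights=weights, k=L)
mean = sum(resampled) / L
var = sum((z - mean) ** 2 for z in resampled) / L
print(f"resampled mean = {mean:.3f}, variance = {var:.3f}")   # close to 0 and 1
```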
