  1. Metropolis Sampling Matt Pharr cs348b May 20, 2003

  2. Introduction
  • Unbiased MC method for sampling from a function's distribution
  • Robustness in the face of difficult problems
  • Applicable to a wide variety of problems
  • Flexibility in choosing how to sample
  • Introduced to CG by Veach and Guibas

  3. (a) Bidirectional path tracing with 40 samples per pixel. Image credit: Eric Veach

  4. (b) Metropolis light transport with an average of 250 mutations per pixel [the same computation time as (a)]. Image credit: Eric Veach

  5. (a) Path tracing with 210 samples per pixel. Image credit: Eric Veach

  6. (b) Metropolis light transport with an average of 100 mutations per pixel [the same computation time as (a)]. Image credit: Eric Veach

  7. Overview
  • For an arbitrary function f : Ω → ℝ, x ∈ Ω

  8. Overview
  • Define $I(f) = \int_\Omega f(x)\,d\Omega$, so that $f_{\mathrm{pdf}} = f / I(f)$

  9. Overview
  • Metropolis generates samples X = {x_i}, with x_i ∼ f_pdf
  • Without ever needing to compute f_pdf or I(f)

  10. Overview
  • Introduction to Metropolis sampling
  • Examples with 1D problems
  • Extension to 3D, motion blur
  • Overview of Metropolis Light Transport

  11. Basic Algorithm
  • Function f(x) over state space Ω, f : Ω → ℝ
  • Markov chain: generate a new sample x_i using x_{i−1}

  12. Basic Algorithm
  • New samples come from mutations: x_{i−1} → x′
  • Each mutation is accepted or rejected such that x_i ∼ f_pdf
  • If rejected, x_i = x_{i−1}
  • The acceptance rule guarantees that f_pdf is the stationary distribution of the chain

  13. Pseudo-code

      x = x0
      for i = 1 to n
          x' = mutate(x)
          a = accept(x, x')
          if (random() < a)
              x = x'
          record(x)
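
A minimal runnable version of this loop in Python (a sketch; mutate and accept are assumed to be supplied as callables, with accept defined on slide 18):

      import random

      def metropolis(mutate, accept, x0, n):
          # Run the Metropolis sampling loop from the pseudo-code above.
          x = x0
          samples = []
          for _ in range(n):
              x_prime = mutate(x)        # propose a mutation x -> x'
              a = accept(x, x_prime)     # acceptance probability a(x -> x')
              if random.random() < a:
                  x = x_prime
              samples.append(x)          # record(x) every iteration
          return samples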

  14. Expected Values
  • Metropolis avoids parts of Ω where f(x) is small
  • But some of those regions still need samples (e.g. dim parts of an image)
  • Solution: record samples at both x and x′
  • Samples are weighted based on a(x → x′)
  • Same result in the limit

  15. Expected Values – Pseudo-code

      x = x0
      for i = 1 to n
          x' = mutate(x)
          a = accept(x, x')
          record(x, (1-a) * weight)
          record(x', a * weight)
          if (random() < a)
              x = x'
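
The expected-values variant as a Python sketch (record is an assumed callback that accumulates weighted samples; the weight factor from the slide is taken to be 1 here):

      import random

      def metropolis_ev(mutate, accept, record, x0, n):
          # Record both the current state and the proposal, weighted by a.
          x = x0
          for _ in range(n):
              x_prime = mutate(x)
              a = accept(x, x_prime)
              record(x, 1.0 - a)         # record(x, (1-a) * weight)
              record(x_prime, a)         # record(x', a * weight)
              if random.random() < a:
                  x = x_prime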

  16. Mutations, Transitions, Acceptance
  • Mutations propose a candidate x′ given the current state x_i
  • T(x → x′) is the probability density of proposing x′ from x
  • a(x → x′) is the probability of accepting the transition

  17. Detailed Balance – The Key
  • By defining a(x → x′) carefully, we can ensure x_i ∼ f_pdf via the detailed balance condition:
    $$f(x)\,T(x \to x')\,a(x \to x') = f(x')\,T(x' \to x)\,a(x' \to x)$$
  • Since f and T are given, this constrains the acceptance probability
  • (Derivation not shown here)

  18. Acceptance Probability
  • Efficient choice:
    $$a(x \to x') = \min\left(1, \frac{f(x')\,T(x' \to x)}{f(x)\,T(x \to x')}\right)$$
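
In code, with T(a, b) giving the density of proposing b from a (a sketch; the zero-denominator guard is my addition):

      def acceptance(f, T, x, x_prime):
          # a(x -> x') = min(1, f(x') T(x' -> x) / (f(x) T(x -> x')))
          denom = f(x) * T(x, x_prime)
          if denom == 0.0:
              return 1.0                 # always move away from zero-density states
          return min(1.0, f(x_prime) * T(x_prime, x) / denom)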

  19. Acceptance Probability – Goals
  • The choice doesn't affect unbiasedness, just variance
  • Maximizing the acceptance probability helps:
    – Explore the state space better
    – Reduce correlation (image artifacts...)
  • Want transitions that are likely to be accepted
    – i.e. transitions that head where f(x) is large

  20. Mutations: Metropolis
  • Symmetric transitions: T(a → b) = T(b → a) for all a, b
  • The acceptance probability then simplifies to
    $$a(x \to x') = \min\left(1, \frac{f(x')}{f(x)}\right)$$
  • Random walk Metropolis: T(x → x′) = T(|x − x′|)

  21. Mutations: Independence Sampler
  • If we have some pdf p, we can sample x′ ∼ p
  • Straightforward transition function: T(x → x′) = p(x′)
  • If p(x) = f_pdf, we wouldn't need Metropolis at all
  • But we can use pdfs that approximate parts of f...
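
As a sketch, an independence-sampler mutation built from a pdf p(x) with a sampling routine sample_p (both names are hypothetical):

      def make_independence_mutation(sample_p, p):
          # Build (mutate, T) for an independence sampler over pdf p.
          def mutate(x):
              return sample_p()          # propose x' ~ p, ignoring the current x
          def T(x, x_prime):
              return p(x_prime)          # transition density depends only on x'
          return mutate, T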

  22. Mutation Strategies: General
  • Adaptive methods: vary the transition based on experience
  • Flexibility: mutations can depend on the value of f(x), etc., pretty freely
  • Remember: we just need to be able to compute the transition densities for each mutation
  • The more mutations, the merrier
  • Their relative frequencies are not so important

  23. 1D Example
  • Consider the function
    $$f_1(x) = \begin{cases} (x - 0.5)^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
  • Want to generate samples from f_1(x)

  24. 1D Mutation #1
  • mutate_1(x) → ξ, with T_1(x → x′) = 1
  • Simplest mutation possible
  • Random walk Metropolis

  25. 1D Mutation #2
  • mutate_2(x) → x + 0.1 (ξ − 0.5)
    $$T_2(x \to x') = \begin{cases} 1/0.1 & |x - x'| \le 0.05 \\ 0 & \text{otherwise} \end{cases}$$
  • Also random walk Metropolis

  26. 1D Mutation #2
  • mutate_2 increases the acceptance probability
    $$a(x \to x') = \min\left(1, \frac{f(x')\,T(x' \to x)}{f(x)\,T(x \to x')}\right)$$
  • When f(x) is large, proposals x′ with f(x′) < f(x) tend to be rejected
  • Should try to avoid proposing mutations to such x′ in the first place
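
A sketch of the two mutations for f_1, plus the 10%/90% mix used on the next slide (both kernels are symmetric, so the acceptance probability reduces to min(1, f_1(x′)/f_1(x))):

      import random

      def f1(x):
          return (x - 0.5) ** 2 if 0.0 <= x <= 1.0 else 0.0

      def mutate1(x):
          return random.random()                    # mutate_1: uniform over [0, 1]

      def mutate2(x):
          return x + 0.1 * (random.random() - 0.5)  # mutate_2: step of at most +-0.05

      def mutate_mixed(x):
          # 10% global restarts (for ergodicity), 90% local exploration
          return mutate1(x) if random.random() < 0.1 else mutate2(x)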

  27. 1D Results – pdf graphs
  [two plots of the sampled pdf over x ∈ [0, 1]]
  • Left: mutate_1 only
  • Right: a mix of the two (10% mutate_1, 90% mutate_2)
  • 10,000 mutations total

  28. Why bother with mutate_1, then?
  • If we just use the second mutation (±0.05)...
  [plot of the resulting pdf estimate over x ∈ [0, 1]]

  29. Ergodicity
  • Need a finite probability of sampling any x where f(x) > 0
  • This is true with mutate_2, but it is inefficient:
  [plot of the resulting pdf estimate over x ∈ [0, 1]]
  • Still unbiased in the limit...

  30. Ergodicity – Easy Solution
  • Periodically pick an entirely new x
  • e.g. sample uniformly over Ω...

  31. Application to Integration
  • Given an integral $\int_\Omega f(x)\,g(x)\,d\Omega$
  • Standard Monte Carlo estimator:
    $$\int_\Omega f(x)\,g(x)\,d\Omega \approx \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)\,g(x_i)}{p(x_i)}$$
  • where x_i ∼ p(x), an arbitrary pdf

  32. Application to Integration
    $$\int_\Omega f(x)\,g(x)\,d\Omega \approx \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)\,g(x_i)}{p(x_i)}$$

  33. Application to Integration
  • Metropolis gives x_1, ..., x_N with x_i ∼ f_pdf(x), so
    $$\int_\Omega f(x)\,g(x)\,d\Omega \approx \left[\frac{1}{N} \sum_{i=1}^{N} g(x_i)\right] \cdot I(f)$$
  • (Recall $I(f) = \int_\Omega f(x)\,d\Omega$)
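
The Metropolis estimator as a Python sketch (I(f) has to be supplied; in practice it is estimated separately with standard Monte Carlo):

      def integral_fg(g, samples, I_f):
          # Estimate the integral of f*g given samples x_i ~ f_pdf and I(f).
          return I_f * sum(g(x) for x in samples) / len(samples)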

  34. Start-Up Bias
  • The distribution of x_i converges to f_pdf, but never exactly reaches it
  • Especially in early iterations, the distribution of x_i can be quite different from f_pdf
  • This problem is called start-up bias

  35. Eliminating Start-Up Bias
  • Start from any x_0 and run a couple of "dummy" iterations without recording the results
    – How many "dummy" iterations?
    – x_i is only distributed proportionally to f_pdf for i → ∞ anyway
    – Not a good solution

  36. Eliminating Start-Up Bias
  • Generate x_0 according to any suitable pdf p(x)
  • Weight all contributions by w = f(x_0) / p(x_0)
    – Unbiased on average (over many runs)
    – But each individual run may be completely "off" (even a black image, if f(x_0) was zero)
    – Not a good solution

  37. Eliminating Start-Up Bias
  • Generate n initial samples x_{0,1}, ..., x_{0,n} according to any suitable pdf p(x)
  • Compute the weights $w_i = f(x_{0,i}) / p(x_{0,i})$
  • Pick the initial state with probability proportional to w_i
  • Weight all contributions by
    $$\bar{w} = \frac{1}{n} \sum_{i=1}^{n} w_i$$
  • Note that
    $$E[\bar{w}] = E\left[\frac{1}{n} \sum_{i=1}^{n} \frac{f(x_{0,i})}{p(x_{0,i})}\right] = \int_\Omega f(x)\,dx$$
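
A sketch of this scheme (sample_p and p are an assumed pdf with p(x) > 0 wherever f(x) > 0, and at least one candidate must have nonzero weight):

      import random

      def weighted_start(f, sample_p, p, n):
          # Pick x_0 from n candidates; w_bar weights all later contributions.
          candidates = [sample_p() for _ in range(n)]
          weights = [f(x) / p(x) for x in candidates]
          w_bar = sum(weights) / n                   # E[w_bar] = integral of f
          x0 = random.choices(candidates, weights=weights)[0]
          return x0, w_bar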

  38. Motion Blur
  • Onward to a 3D problem
  • Scene radiance function L(u, v, t) (e.g. evaluated with ray tracing)
  • L = 0 outside the image boundary
  • Ω is (u, v, t) ∈ [0, u_max] × [0, v_max] × [0, 1]

  39. Image Contribution Function
  • The key to applying Metropolis to image synthesis:
    $$I_j = \int_\Omega h_j(u, v)\,L(u, v, t)\,du\,dv\,dt$$
  • I_j is the value of the j-th pixel
  • h_j is the pixel reconstruction filter

  40. Image Contribution Function
  • So if we sample x_i ∼ L_pdf:
    $$I_j \approx \frac{1}{N} \left[\sum_{i=1}^{N} h_j(x_i)\right] \cdot \int_\Omega L(x)\,d\Omega$$
  • The distribution of the x_i on the image plane forms the image
  • Estimate $\int_\Omega L(x)\,d\Omega$ with standard MC

  41. Two Basic Mutations
  • Pick a completely new (u, v, t) value
  • Perturb u and v by up to ±8 pixels, and t by up to ±0.01
  • Both are symmetric: random-walk Metropolis
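
A sketch of the two mutations on the (u, v, t) state (uniform perturbations are my assumption; the slide only gives the ranges). Proposals that leave Ω are handled automatically: L = 0 there, so the acceptance probability is zero.

      import random

      def mutate_restart(u_max, v_max):
          # Mutation 1: pick a completely new (u, v, t)
          return (random.uniform(0.0, u_max),
                  random.uniform(0.0, v_max),
                  random.uniform(0.0, 1.0))

      def mutate_perturb(state):
          # Mutation 2: move u, v by up to +-8 pixels and t by up to +-0.01
          u, v, t = state
          return (u + random.uniform(-8.0, 8.0),
                  v + random.uniform(-8.0, 8.0),
                  t + random.uniform(-0.01, 0.01))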

  42. Motion Blur – Result
  • Left: distribution ray tracing, stratified sampling
  • Right: Metropolis sampling
  • Same total number of samples

  43. Motion Blur – Parameter Studies
  • Left: ±80 pixels, ±0.5 time. Many rejections.
  • Right: ±0.5 pixels, ±0.001 time. Didn't explore Ω well.

  44. Exponential Distribution
  • Vary the scale of proposed mutations:
    $$r = r_{\max}\,e^{-\log(r_{\max}/r_{\min})\,\xi}, \qquad \theta = 2\pi\xi$$
    $$(du, dv) = (r \sin\theta,\ r \cos\theta), \qquad dt = t_{\max}\,e^{-\log(t_{\max}/t_{\min})\,\xi}$$
  • Large proposals will often be rejected, but we still try a wide variety of scales
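
A sketch of these offsets in Python (the sign flip on dt is my addition, so the time perturbation can move in either direction):

      import math
      import random

      def exp_offsets(r_min, r_max, t_min, t_max):
          # Offset scales span [r_min, r_max] and [t_min, t_max] exponentially.
          r = r_max * math.exp(-math.log(r_max / r_min) * random.random())
          theta = 2.0 * math.pi * random.random()
          du, dv = r * math.sin(theta), r * math.cos(theta)
          dt = t_max * math.exp(-math.log(t_max / t_min) * random.random())
          if random.random() < 0.5:                  # assumed: perturb time both ways
              dt = -dt
          return du, dv, dt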

  45. Exponential distribution results

  46. Light Transport
  • The image contribution function was the key
  • f(x) is defined over an infinite space of paths
  • The state space is the set of light-carrying paths through the scene, from light source to sensor
  • The robustness is particularly nice: difficult transport problems are solved efficiently
  • Few specialized parameters to set

  47. Light Transport – Setting
  • Samples x from Ω are sequences v_0 v_1 ... v_k, k ≥ 1, of vertices on scene surfaces
  [figure: an example path x_0 x_1 x_2 x_3 from light source to sensor]
  • f(x) is the product of emitted light, BRDF values, cosines, etc.
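
A heavily simplified sketch of evaluating f(x) for a path (the factorization follows Veach's measurement contribution function; Le, fs, G, and We are assumed scene callbacks for emission, BRDF, geometry/cosine terms, and sensor response):

      def f_path(v, Le, fs, G, We):
          # Unnormalized contribution of a path v[0]..v[k] from light to sensor
          val = Le(v[0], v[1]) * G(v[0], v[1])       # emission and first geometry term
          for i in range(1, len(v) - 1):
              val *= fs(v[i - 1], v[i], v[i + 1])    # BRDF at interior vertex v[i]
              val *= G(v[i], v[i + 1])               # geometry term (includes cosines)
          return val * We(v[-2], v[-1])              # sensor response at v[k]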
