Biostatistics 615/815 Lecture 16: Importance sampling, Single dimensional optimization
Hyun Min Kang
November 1st, 2012
Sections: Recap, Importance sampling, Rare Event, Integration, Root Finding, Minimization, Summary
The crude Monte-Carlo Methods
An example problem
Calculating $\theta = \int_0^1 f(x)\,dx$, where $f(x)$ is a complex function with $0 \le f(x) \le 1$.
The problem is equivalent to computing $E[f(u)]$ where $u \sim U(0,1)$.
Algorithm
• Generate $u_1, u_2, \cdots, u_B$ uniformly from $U(0,1)$.
• Take their average to estimate $\theta$: $\hat{\theta} = \frac{1}{B}\sum_{i=1}^{B} f(u_i)$
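The two steps of the crude method can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the integrand `example_f` (here x squared, whose integral over [0, 1] is 1/3), the seed, and the sample size are assumptions chosen only so the result is easy to check.

```python
import random


def crude_monte_carlo(f, B, rng):
    """Estimate the integral of f over [0, 1] by averaging f at B uniform draws."""
    total = 0.0
    for _ in range(B):
        total += f(rng.random())  # u_i ~ U(0, 1)
    return total / B


def example_f(x):
    # Hypothetical integrand with 0 <= f(x) <= 1 on [0, 1]; its integral is 1/3.
    return x * x


rng = random.Random(615)  # fixed seed (an assumption) for reproducibility
estimate = crude_monte_carlo(example_f, 100_000, rng)
print(estimate)
```

With B = 100,000 draws the standard error for this particular integrand is about 0.001, so the printed value should sit close to 1/3.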
Accept-reject (or hit-and-miss) Monte Carlo method
Algorithm
1. Define a rectangle $R$ between $(0,0)$ and $(1,1)$.
   • Or more generally, between $(x_m, x_M)$ and $(y_m, y_M)$.
2. Set $h = 0$ (hit), $m = 0$ (miss).
3. Sample a random point $(x, y) \in R$.
4. If $y < f(x)$, then increase $h$. Otherwise, increase $m$.
5. Repeat steps 3 and 4 $B$ times.
6. $\hat{\theta} = \frac{h}{h+m}$.
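The hit-and-miss steps translate almost line for line into code. The sketch below assumes the unit square as the rectangle R and reuses a hypothetical integrand x squared (true integral 1/3); the function names and seed are illustrative, not from the slides.

```python
import random


def hit_and_miss(f, B, rng):
    """Estimate the integral of f over [0, 1], assuming 0 <= f(x) <= 1."""
    h = 0  # hits: points falling under the curve
    m = 0  # misses: points falling on or above the curve
    for _ in range(B):
        x = rng.random()  # random point (x, y) in the unit square
        y = rng.random()
        if y < f(x):
            h += 1
        else:
            m += 1
    return h / (h + m)


def example_f(x):
    # Hypothetical integrand; its integral over [0, 1] is 1/3.
    return x * x


rng = random.Random(815)  # fixed seed (an assumption) for reproducibility
estimate = hit_and_miss(example_f, 100_000, rng)
print(estimate)
```

Each trial is a Bernoulli draw with success probability equal to the integral, which is why the estimator's variance has the binomial form θ(1 − θ)/B.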
Which method is better?
$$\sigma^2_{AR} - \sigma^2_{crude} = \frac{\theta(1-\theta)}{B} - \frac{1}{B}E[f(u)^2] + \frac{\theta^2}{B} = \frac{\theta - E[f(u)^2]}{B} = \frac{1}{B}\int_0^1 f(u)\,(1 - f(u))\,du \ge 0$$
The crude Monte-Carlo method has less variance than the accept-reject method.
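The variance comparison can also be checked empirically by repeating both estimators many times and comparing their sample variances. This is a rough sketch under assumed settings: the integrand x squared, B = 200 draws per estimate, and 1000 replicates are all illustrative choices.

```python
import random
import statistics


def f(x):
    # Hypothetical integrand with 0 <= f(x) <= 1 on [0, 1].
    return x * x


def crude(B, rng):
    """Crude Monte Carlo: average f at B uniform draws."""
    return sum(f(rng.random()) for _ in range(B)) / B


def accept_reject(B, rng):
    """Hit-and-miss: fraction of uniform points (x, y) with y < f(x)."""
    hits = sum(1 for _ in range(B) if rng.random() < f(rng.random()))
    return hits / B


rng = random.Random(2012)  # fixed seed (an assumption) for reproducibility
B, replicates = 200, 1000
var_crude = statistics.variance([crude(B, rng) for _ in range(replicates)])
var_ar = statistics.variance([accept_reject(B, rng) for _ in range(replicates)])
print(var_crude, var_ar)
```

For this integrand the theoretical values are (4/45)/B for the crude method and (2/9)/B for accept-reject, so the second printed variance should be the larger one.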
Revisiting The Crude Monte Carlo
$$\theta = E[f(u)] = \int_0^1 f(u)\,du$$
$$\hat{\theta} = \frac{1}{B}\sum_{i=1}^{B} f(u_i)$$
More generally, when $x$ has pdf $p(x)$, if $x_i$ is a random variable following $p(x)$,
$$\theta_p = E_p[f(x)] = \int f(x)\,p(x)\,dx$$
$$\hat{\theta}_p = \frac{1}{B}\sum_{i=1}^{B} f(x_i)$$
Importance sampling
Let $x_i$ be a random variable, and let $p(x)$ be an arbitrary probability density function.
$$\theta = E_u[f(x)] = \int f(x)\,dx = \int \frac{f(x)}{p(x)}\,p(x)\,dx = E_p\left[\frac{f(x)}{p(x)}\right]$$
$$\hat{\theta} = \frac{1}{B}\sum_{i=1}^{B} \frac{f(x_i)}{p(x_i)}$$
where $x_i$ is sampled from the distribution represented by pdf $p(x)$.
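A small example may make the estimator concrete. The sketch below assumes a hypothetical integrand f(x) = x squared on [0, 1] and a proposal density p(x) = 2x, sampled by inverting its CDF (x = sqrt(u)); neither choice comes from the lecture.

```python
import math
import random


def f(x):
    # Hypothetical integrand; its integral over [0, 1] is 1/3.
    return x * x


def p(x):
    # Assumed proposal density on (0, 1]: p(x) = 2x, which integrates to 1.
    return 2.0 * x


def sample_p(rng):
    # Inverse-CDF sampling: the CDF of p is x^2, so x = sqrt(u).
    # 1 - rng.random() lies in (0, 1], which avoids x = 0 and a zero density.
    return math.sqrt(1.0 - rng.random())


def importance_sampling(B, rng):
    """Average the weights f(x_i) / p(x_i) over B draws x_i ~ p."""
    total = 0.0
    for _ in range(B):
        x = sample_p(rng)
        total += f(x) / p(x)
    return total / B


rng = random.Random(16)  # fixed seed (an assumption) for reproducibility
theta_hat = importance_sampling(50_000, rng)
print(theta_hat)
```

Because p(x) = 2x follows the shape of f more closely than the uniform density does, the weights f(x)/p(x) = x/2 vary less than f(u) itself, and the estimate settles near 1/3 with fewer draws.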
Key Idea
• When $f(x)$ is not uniform, the variance of $\hat{\theta}$ may be large.
• The idea is to pretend to sample from an (almost) uniform distribution.
Analysis of Importance Sampling
Bias
$$E[\hat{\theta}] = \frac{1}{B}\sum_{i=1}^{B} E_p\left[\frac{f(x_i)}{p(x_i)}\right] = \frac{1}{B}\sum_{i=1}^{B}\theta = \theta$$
Variance
$$\mathrm{Var}[\hat{\theta}] = \frac{1}{B}\int\left(\frac{f(x)}{p(x)} - \theta\right)^2 p(x)\,dx = \frac{1}{B}E_p\left[\left(\frac{f(x)}{p(x)}\right)^2\right] - \frac{\theta^2}{B}$$
The variance may or may not increase. Roughly speaking, if $p(x)$ is similar to $f(x)$, $f(x)/p(x)$ becomes flattened and will have smaller variance.
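A worked instance of the variance formula may help. The integrand $f(x) = x^2$ on $[0,1]$ and the three proposal densities below are illustrative assumptions, not examples from the lecture; in each case $\theta = 1/3$.

```latex
% Uniform proposal p(x) = 1 (the crude method):
B \,\mathrm{Var}[\hat{\theta}] = \int_0^1 x^4 \,dx - \theta^2
  = \frac{1}{5} - \frac{1}{9} = \frac{4}{45} \approx 0.0889

% Proposal p(x) = 2x, so f(x)/p(x) = x/2:
B \,\mathrm{Var}[\hat{\theta}] = \int_0^1 \left(\frac{x}{2}\right)^2 2x \,dx - \theta^2
  = \frac{1}{8} - \frac{1}{9} = \frac{1}{72} \approx 0.0139

% Proposal p(x) = 3x^2, proportional to f, so f(x)/p(x) = 1/3 is constant:
B \,\mathrm{Var}[\hat{\theta}] = \int_0^1 \left(\frac{1}{3}\right)^2 3x^2 \,dx - \theta^2
  = \frac{1}{9} - \frac{1}{9} = 0
```

The closer the proposal tracks the shape of $f$, the flatter the ratio $f/p$ and the smaller the variance, down to zero when $p$ is exactly proportional to $f$.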
Simulation of rare events
Problem
• Consider a random variable $X \sim N(0,1)$.
• What is $\Pr[X \ge 10]$?
Possible Solutions
• Let $f(x)$ and $F(x)$ be the pdf and CDF of the standard normal distribution.
• Then $\Pr[X \ge 10] = 1 - F(10) = 7.62 \times 10^{-24}$, and we're all set.
• But what if we don't have $F(x)$ but only $f(x)$?
• In many cases, the CDF is not easy to obtain compared to the pdf or random draws.
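When the CDF is available, the tail probability can be evaluated directly. The sketch below uses the complementary error function from Python's standard library; the helper names are hypothetical. It also illustrates a practical caveat: forming 1 − F(10) literally in double precision loses the answer to rounding, so the upper tail is computed directly instead.

```python
import math


def normal_upper_tail(x):
    """Pr[X >= x] for X ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def normal_cdf(x):
    """F(x) for X ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


tail = normal_upper_tail(10.0)   # on the order of 7.62e-24
naive = 1.0 - normal_cdf(10.0)   # F(10) rounds to 1.0, so this collapses to 0.0
print(tail, naive)
```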
If we don't have CDF: ways to calculate $\Pr[X \ge 10]$
Accept-reject sampling
Sample random variables from $N(0,1)$, and count how many of them are greater than 10.
• How many random variables should be sampled to observe at least one $X \ge 10$?
• $1/\Pr[X \ge 10] = 1.3 \times 10^{23}$
Monte-Carlo Integration
• If we have the pdf $f(x)$, $\Pr[X \ge 10] = \int_{10}^{\infty} f(x)\,dx$.
• Use Monte-Carlo integration to compute this quantity.
1. Sample $B$ values $u_1, \cdots, u_B$ uniformly from $[10, 10+W]$ for a large value of $W$ (e.g. 50).
2. Estimate $\hat{\theta} = \frac{W}{B}\sum_{i=1}^{B} f(u_i)$.
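The Monte-Carlo integration route can be sketched as follows. Two details are assumptions of this sketch rather than statements recovered from the slide: the sampling interval is taken to be [10, 10 + W], and the average of the pdf values is scaled by the interval width W so that it estimates the integral. The function names, seed, and B are illustrative.

```python
import math
import random


def normal_pdf(x):
    """Standard normal density f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def tail_by_uniform_mc(a, W, B, rng):
    """Estimate the integral of f over [a, a + W] from B uniform draws."""
    total = 0.0
    for _ in range(B):
        u = a + W * rng.random()  # u_i ~ U(a, a + W)
        total += normal_pdf(u)
    return W * total / B  # interval width times the average density


rng = random.Random(10)  # fixed seed (an assumption) for reproducibility
estimate = tail_by_uniform_mc(10.0, 50.0, 200_000, rng)
print(estimate)
```

Almost all of the mass of the integrand sits within a few tenths of x = 10, so most uniform draws over a width-50 interval contribute essentially nothing. The estimate therefore carries a relative error of a few percent even with 200,000 draws, which is the inefficiency importance sampling is meant to address.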