gibbs sampling biostatistics 615 815 lecture 21
play

Gibbs Sampling Biostatistics 615/815 Lecture 21: . . 1 / 29 . - PowerPoint PPT Presentation

. Metropolis-Hastings November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang November 29th, 2012 Hyun Min Kang Gibbs Sampling Biostatistics 615/815 Lecture 21: . . 1 / 29 . Inference Implementation Gibbs Sampler . . . .


  1. . Metropolis-Hastings November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang November 29th, 2012 Hyun Min Kang Gibbs Sampling Biostatistics 615/815 Lecture 21: . . 1 / 29 . Inference Implementation Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

  2. . Implementation November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang Gibbs Sampler Metropolis-Hastings . Inference Gibbs Sampler . . . . . . . . . . . 2 / 29 . . . . . . . . . . . . . . . . . . . . . • Another MCMC Method • Update a single parameter at a time • Sample from conditional distribution when other parameters are fixed

  3. . . November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang 3 Increment t and repeat the previous steps. . . i 2 Define the next set of parameter values by . . . . Gibbs Sampler Algorithm Metropolis-Hastings . . . . . . . . . . . Gibbs Sampler 3 / 29 Implementation Inference . . . . . . . . . . . . . . . . . . . . . 1 Consider a particular choice of parameter values λ ( t ) . • Selecting a component to update, say i . • Sample value for λ ( t +1) , from p ( λ i | x , λ 1 , · · · , λ i − 1 , λ i +1 , · · · , λ k ) .

  4. . . November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang 3 Increment t and repeat the previous steps. . . k 2 Define the next set of parameter values by . . . . An alternative Gibbs Sampler Algorithm Metropolis-Hastings . . . . . . . . . . . Gibbs Sampler 4 / 29 Implementation Inference . . . . . . . . . . . . . . . . . . . . . 1 Consider a particular choice of parameter values λ ( t ) . • Sample value for λ ( t +1) , from p ( λ 1 | x , λ 2 , λ 3 , · · · , λ k ) . 1 • Sample value for λ ( t +1) , from p ( λ 1 | x , λ 1 , λ 3 , · · · , λ k ) . 2 • · · · • Sample value for λ ( t +1) , from p ( λ k | x , λ 1 , λ 2 , · · · , λ k − 1 ) .

  5. . . November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang . . Using source of each observations . straightforward . . Using conditional distributions . Gibbs Sampling for Gaussian Mixture Implementation . . . . . . . . . . . Gibbs Sampler Metropolis-Hastings 5 / 29 Inference . . . . . . . . . . . . . . . . . . . . . • Observed data : x = ( x 1 , · · · , x n ) • Parameters : λ = ( π 1 , · · · , π k , µ 1 , · · · , µ k , σ 2 1 , · · · , σ 2 k ) . • Sample each λ i from conditional distribution - not very • Observed data : x = ( x 1 , · · · , x n ) • Parameters : z = ( z 1 , · · · , z n ) where z i ∈ { 1 , · · · , k } . • Sample each z i conditioned by all the other z .

  6. . Inference November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang estimates of mixture parameters. specific component . Metropolis-Hastings Update procedure in Gibbs sampler 6 / 29 Implementation . Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . π i N ( x j | µ i , σ 2 i ) Pr ( z j = i | x j , λ ) = l π l N ( x j | µ l , σ 2 ∑ l ) • Calculate the probability that the observation is originated from a • For a random j ∈ { 1 , · · · , n } , sample z j based on the current

  7. . . November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang probabilities observed data point Initialization Metropolis-Hastings Inference Implementation Gibbs Sampler . . . . . . . . . . . 7 / 29 . . . . . . . . . . . . . . . . . . . . . • Must start with an initial assignment of component labels for each • A simple choice is to start with random assignment with equal

  8. . Implementation November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang The Gibbs Sampler Metropolis-Hastings . Inference Gibbs Sampler . . . . . . . . . . . 8 / 29 . . . . . . . . . . . . . . . . . . . . . • Select initial parameters • Repeat a large number of times • Select an element • Update conditional on other elements

  9. . Inference November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang . Implementing Gaussian Mixture Gibbs Sampler Metropolis-Hastings 9 / 29 Implementation . . . . . . . . . . . Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . class normMixGibbs { public: int k; // # of components int n; // # of data std::vector<double> data; // size n : observed data std::vector<double> labels; // size n : label assignment for each observations std::vector<double> pis; // size k : pis std::vector<double> means; // size k : means std::vector<double> sigmas; // size k : sds std::vector<int> counts; // size k : # of elements in each labels std::vector<double> sums; // size k : sum across each label std::vector<double> sumsqs; // size k : squared sum across each label normMixGibbs(std::vector<double>& _data, int _k); // constructor void initParams(); // initialize parameters void updateParams(int numObs); // update parameters void remove(int i); // remove elements void add(int i, int label); // add an element with new label int sampleLabel(double x); // sample the label of an element void runGibbs(); // run Gibbs sampler };

  10. . Inference November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang estimates of mixture parameters. specific component . Metropolis-Hastings Update procedure in Gibbs sampler 10 / 29 Implementation . Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . π i N ( x j | µ i , σ 2 i ) Pr ( z j = i | x j , π, λ ) = l π l N ( x j | µ l , σ 2 ∑ l ) • Calculate the probability that the observation is originated from a • For a random j ∈ { 1 , · · · , n } , sample z j based on the current

  11. . Inference November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang . Sampling label of selected value Metropolis-Hastings 11 / 29 Implementation Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . // x is the data needs a new label assignment int normMixGibbs::sampleLabel(double x) { double p = randu(0,1); // generate a random probability // evaluate the likelihood of the observations given parameters double lk = NormMix615::dmix(x, pis, means, sigmas); // use randomized values to randomly assign labels // based on the likelihood contributed by each component, for(int i=0; i < k-1; ++i) { double pl = pis[i] * NormMix615::dnorm(x,means[i],sigmas[i])/lk; if ( p < pl ) return i; p -= pl; } // evaluate only k-1 components and assign to the last one if not found return k-1; }

  12. . Inference November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang large number of iterations. . Calculating the parameters at each update Metropolis-Hastings 12 / 29 Implementation Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . • For each i ∈ { 1 , · · · , k } • n i = ∑ n j =1 I ( z j = i ) • π i = n i / n • µ i = ∑ n j =1 x j I ( z j = i )/ n i • σ 2 j =1 I ( z j = i )( x j − µ i ) 2 / n i i = ∑ n • This procedure takes O ( n ) , which is quite expensive to run over a • Is it possible to reduce the time complexity to constant ( O ( k ) )?

  13. . Implementation November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang . Metropolis-Hastings Inference Constant time update of parameters 13 / 29 . Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . // update parameters for each components // having counts, sums, sumsqs allows to reduce time complexity void normMixGibbs::updateParams(int numObs) { for(int i=0; i < k; ++i) { // This takes O(k) complexity pis[i] = (double)counts[i]/(double)(numObs); means[i] = sums[i]/counts[i]; sigmas[i] = sqrt((sumsqs[i]/counts[i] - means[i]*means[i])+1e-7); } }

  14. . Inference November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang . Removing and adding elements Metropolis-Hastings 14 / 29 Implementation Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . // We want to remove object to calculate the conditional distribution // excluding one component's label information void normMixGibbs::remove(int i) { int l = labels[i]; --counts[l]; sums[l] -= data[i]; sumsqs[l] -= (data[i]*data[i]); } // Adding the removed object with newer label assignment void normMixGibbs::add(int i, int label) { labels[i] = label; ++counts[label]; sums[label] += data[i]; sumsqs[label] += (data[i]*data[i]); }

  15. . Inference November 29th, 2012 Biostatistics 615/815 - Lecture 21 Hyun Min Kang . The Gibbs Sampler Metropolis-Hastings 15 / 29 Implementation Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . void normMixGibbs::runGibbs() { initParams(); // initialze label assignments for(int i=0; i < 10000000; ++i) { // repeat a large number of times int id = randn(0,n); // select a label to update if ( counts[labels[id]] < MIN_COUNTS ) continue; // avoid boundary conditions remove(id); // remove the elements updateParams(n-1); // update pis, means, sigmas int label = sampleLabel(data[id]); // sample new label conditionally add(id, label); // add the element back with new label if ( ( i > BURN_IN ) && ( i % THIN_INTERVAL == 0 ) ) { // report intermediate results if needed // calculate summary statistics of the parameters to estimate } } }

Recommend


More recommend