Nested sampling with demons
Michael Habeck
Max Planck Institute for Biophysical Chemistry and Institute for Mathematical Stochastics, Göttingen, Germany
Amboise, September 23, 2014
Bayesian inference

• Probability rules: posterior × evidence = likelihood × prior

  Pr(θ | D, M) × Pr(D | M) = Pr(D | θ, M) × Pr(θ | M)
  p(θ) × Z = L(θ) × π(θ)

• Inference
  – Evidence: Z = ∫ L(θ) π(θ) dθ
  – Posterior: p(θ) = L(θ) π(θ) / Z
Nested sampling

• The evidence reduces to a one-dimensional integral:

  Z = ∫ L(θ) π(θ) dθ = ∫₀¹ L(X) dX

  summing over prior mass

  X(λ) = ∫_{L(θ) ≥ λ} π(θ) dθ,   X(0) = 1,   X(∞) = 0

• Prior masses can be ordered: X(λ) < X(λ′) if λ > λ′

• Idea: we can evaluate L exactly and estimate X
Estimation of prior masses

• Nested sequence of truncated priors:

  p(θ | λ) = Θ[L(θ) − λ] π(θ) / X(λ),   where Θ(x) = { 0, x < 0;  1, x ≥ 0 }

• Distribution of prior masses at likelihood contour λ:

  X ∼ Uniform(0, X(λ))

• Order statistics:

  X_max ∼ N X^{N−1} / X(λ)^N

  where X_max is the maximum of N uniformly distributed X_n ∼ Uniform(0, X(λ))
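The order-statistics result can be checked numerically; a small sketch, assuming NumPy (all names illustrative). The pdf N x^{N−1}/X(λ)^N implies E[X_max] = N/(N+1) · X(λ):

```python
import numpy as np

rng = np.random.default_rng(1)
N, X_lambda = 5, 0.8

# many replicates of N prior masses, each uniform on (0, X(lambda))
X = rng.uniform(0.0, X_lambda, size=(100_000, N))
X_max = X.max(axis=1)

# pdf N x^{N-1} / X(lambda)^N  =>  E[X_max] = N/(N+1) * X(lambda)
print(X_max.mean(), N / (N + 1) * X_lambda)
```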
Nested sampling algorithm

Require: N (ensemble size), m (number of iterations)
  λ₀ = 0, X₀ = 1, S = {θ₁, …, θ_N} (states) where θ_n ∼ π(θ)      ▷ Initialize
  for k = 1, …, m do
      θ_k = argmin{L(θ_n) | n = 1, …, N}       ▷ Store state with smallest likelihood
      λ_k = L(θ_k)                             ▷ Define new likelihood contour
      X_k = t X_{k−1} where t ∼ N t^{N−1}      ▷ Estimate prior mass
      θ ∼ p(θ | λ_k)                           ▷ Draw new state from truncated prior
      S ← S \ {θ_k} ∪ {θ}                      ▷ Replace current with new state
  end for
  Z = Σ_{k=1}^m λ_k (X_{k−1} − X_k)            ▷ Estimate evidence
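As a concrete illustration, here is a minimal Python sketch of the algorithm for a toy problem: a 2-d Gaussian likelihood with a uniform prior on [−5, 5]². The truncated prior is sampled by naive rejection from π (adequate only for toys), and the leftover mass of the live points is added at the end. All names and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 100, 600                      # ensemble size, number of iterations

def log_L(theta):                    # toy likelihood: standard Gaussian (unnormalized)
    return -0.5 * float(theta @ theta)

def sample_prior():                  # uniform prior on [-5, 5]^2
    return rng.uniform(-5.0, 5.0, size=2)

S = [sample_prior() for _ in range(N)]
logLs = np.array([log_L(t) for t in S])

Z, X_prev = 0.0, 1.0
for k in range(m):
    i = int(np.argmin(logLs))        # worst state defines the new contour lambda_k
    lam = logLs[i]
    t = rng.random() ** (1.0 / N)    # shrinkage factor t ~ Beta(N, 1)
    X_k = t * X_prev
    Z += np.exp(lam) * (X_prev - X_k)
    X_prev = X_k
    while True:                      # rejection sampling from the truncated prior
        cand = sample_prior()
        if log_L(cand) > lam:
            S[i], logLs[i] = cand, log_L(cand)
            break

Z += X_prev * np.mean(np.exp(logLs))  # leftover mass of the live points
print(Z)                              # analytic value: 2*pi/100 ~ 0.0628
```

The rejection step is the expensive part; its acceptance rate shrinks like X_k, which is exactly the "Cons" point raised later in the talk.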
Compression

• Compression achieved by a single nested sampling iteration:

  H(λ → λ′) = ∫ p(θ | λ′) ln[p(θ | λ′) / p(θ | λ)] dθ = ln[X(λ) / X(λ′)]

• Average compression is constant:

  ⟨H(λ → λ′)⟩ = −⟨ln t⟩ = 1/N,   t ∼ Beta(N, 1)
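The identity ⟨−ln t⟩ = 1/N for t ∼ Beta(N, 1) is easy to verify by simulation; a quick sketch (NumPy, names illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 10
t = rng.beta(N, 1, size=200_000)     # shrinkage factors t ~ Beta(N, 1)
mean_H = float((-np.log(t)).mean())  # average compression per iteration
print(mean_H)                        # close to 1/N = 0.1
```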
Physical analogy

  Bayesian inference                    Statistical physics
  model parameters θ                    configuration/microstate θ
  negative log-likelihood −ln L(θ)      potential energy E(θ)
  log prior mass ln X(λ)                volume entropy S(ϵ) = ln X(ϵ)
  likelihood contour L(θ) > λ           energy bound E(θ) < ϵ
  prior mass X(λ)                       cumulative DOS X(ϵ) = ∫_{−∞}^{ϵ} g(E) dE
  evidence Z = ∫ L(X) dX                partition function Z(β) = ∫ e^{−βE} g(E) dE
  truncated prior                       microcanonical ensemble
Microcanonical ensemble

• Density of states (DOS):

  g(E) = ∫ δ[E − E(θ)] π(θ) dθ = ∂_E X(E)

• Microcanonical entropy and temperature:

  S(E) = ln X(E),   T(E) = 1 / ∂_E S(E)

• Compression:

  H(ϵ′ → ϵ) = S(ϵ′) − S(ϵ) = ∫_ϵ^{ϵ′} β(E) dE

  where the inverse temperature β = 1/T measures the entropy production
Enter the demon

• Implement the truncated prior as a microcanonical ensemble with an additional demon absorbing energy D:

  p(θ, D | ϵ) = δ[ϵ − D − E(θ)] Θ(D) π(θ) / X(ϵ)

  and explore constant-energy shells

• Creutz algorithm:

Require: ϵ (upper bound on total energy)
  θ ∼ π(θ) with energy E = E(θ) ≤ ϵ,   D = ϵ − E     ▷ Initialize
  while not converged do
      θ′ ∼ π(θ) with energy E′ = E(θ′)               ▷ Generate a candidate
      D′ = D − ∆E where ∆E = E′ − E                  ▷ Update demon's state
      if D′ ≥ 0 then
          (θ, D) ← (θ′, D′)                          ▷ Accept
      end if
  end while
Sampling the Ising model with a single demon

[Figure A: Gibbs entropy S_G(E) and estimated ln X_k versus energy E]

• Nearest-neighbor interaction on a 64 × 64 lattice: E(θ) = −Σ_{⟨i,j⟩} θ_i θ_j where θ_i = ±1

• Nested sampling provides a very accurate estimate of the volume entropy S = ln X
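The demon sampler can be made concrete with the standard single-spin-flip variant of Creutz's algorithm (the pseudocode above redraws θ′ from the full prior; local spin flips are the usual practical choice). A NumPy sketch on a small lattice with the ferromagnetic convention E = −Σ θ_i θ_j; the lattice size, energy budget, and sweep count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
L = 16
spins = rng.choice([-1, 1], size=(L, L))

def energy(s):
    # nearest-neighbour coupling with periodic boundaries, E = -sum_<ij> s_i s_j
    return -float(np.sum(s * np.roll(s, 1, axis=0)) + np.sum(s * np.roll(s, 1, axis=1)))

E = energy(spins)
eps = E + 200.0          # bound on the total energy; demon starts with D = eps - E
D = eps - E

demon_samples = []
for sweep in range(200):
    for _ in range(L * L):
        i, j = int(rng.integers(L)), int(rng.integers(L))
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * spins[i, j] * nb          # energy change of flipping spin (i, j)
        if D - dE >= 0:                      # demon absorbs / supplies the difference
            spins[i, j] *= -1
            E += dE
            D -= dE
    demon_samples.append(D)

# the demon's energy is approximately exponential with mean ~ T(eps)
print(np.mean(demon_samples))
```

Note that E + D = ϵ is conserved exactly, so the pair (θ, D) explores a constant-energy shell as required.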
Sampling the Ising model with a single demon

[Figure B: heat capacity and inverse temperature β_G(E) versus energy E; critical value ln(1 + √2)/2 indicated]

• H(ϵ′ → ϵ) = S(ϵ′) − S(ϵ) = ∫_ϵ^{ϵ′} β(E) dE

• ⟨H(ϵ′ → ϵ)⟩ = 1/N, therefore β(ϵ_k)(ϵ_k − ϵ_{k+1}) ≈ 1/N

• The histogram of energy bounds ϵ_k matches the inverse temperature / entropy production β(E)
Sampling the Ising model with a single demon

[Figure C: estimated β compared with the inverse temperature β_G(E)]

• The demon's energy distribution is

  p(D | ϵ) = ∫ p(θ, D | ϵ) dθ = g(ϵ − D) / X(ϵ) ≈ (1/T(ϵ)) exp{−D / T(ϵ)}

• The demon may serve as a thermometer: ⟨D⟩ ≈ T
Properties of nested sampling

Pros:
1. Nested sampling is a microcanonical approach: energy E is the control parameter rather than the temperature used in thermal approaches
2. … constructs an adaptive "cooling" protocol {ϵ_k}
3. … progresses at constant thermodynamic speed: ∆S ≈ 1/N
4. … provides an estimate of the entropy S

Cons:
1. Nested sampling requires efficient sampling from

  p(θ | ϵ) = Θ[ϵ − E(θ)] π(θ) / X(ϵ) = (1/X(ϵ)) ∫ δ[ϵ − D − E(θ)] Θ(D) π(θ) dD
Releasing more demons

• We would like to preserve nested sampling's adaptive behavior but be more flexible in terms of the ensemble

• Idea: introduce more demons in order to smooth the ensemble:

  p(θ, D, K | ϵ) = δ[ϵ − D − K − E(θ)] Θ(D) f(K) π(θ) / Y(ϵ)

  where the prior mass of the compound system is

  Y(ϵ) = ∫ Θ(ϵ − H) (f ⋆ g)(H) dH

  involving the convolution (f ⋆ g)(H)

• Nested sampling tracks Y(ϵ), where ϵ is an upper bound on the total energy H = K + E
Releasing more demons

• Nested sampling estimates the evidence of the extended system

  Z_H = ∫ e^{−H} (f ⋆ g)(H) dH = Z_K Z_E

  from which we can obtain the evidence of the original system Z_E

• Marginal distribution of configurations:

  p(θ | ϵ) = ∫ p(θ, D, K | ϵ) dD dK = π(θ) F[ϵ − E(θ)] / Y(ϵ)

  where F(K) = ∫_{−∞}^K f(t) dt is the cdf of the demon's energy distribution

• Gibbs sampling of (θ, K):

  θ ∼ p(θ | ϵ) ∝ π(θ) F[ϵ − E(θ)]
  K ∼ p(K | θ, ϵ) ∝ f(K) Θ[ϵ − E(θ) − K]
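The two conditional steps can be sketched with a concrete demon. Below, an exponential demon f(K) = e^{−K} (so F(K) = 1 − e^{−K}) is paired with a standard-normal prior and a quadratic toy energy; the θ-step uses random-walk Metropolis targeting π(θ)F[ϵ − E(θ)] and the K-step is an exact inverse-cdf draw from f truncated to [0, ϵ − E(θ)]. The choice of f, the step size, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

def E(theta):            # toy potential energy
    return 0.5 * theta**2

def F(K):                # cdf of the exponential demon f(K) = exp(-K), K >= 0
    return np.where(K > 0, 1.0 - np.exp(-np.minimum(K, 700.0)), 0.0)

def sample(eps, n_steps=20_000):
    theta = 0.0
    thetas = []
    for _ in range(n_steps):
        # theta-step: random-walk Metropolis targeting pi(theta) F[eps - E(theta)]
        prop = theta + 0.5 * rng.standard_normal()
        log_ratio = (-0.5 * prop**2 + 0.5 * theta**2
                     + np.log(F(eps - E(prop)) + 1e-300)
                     - np.log(F(eps - E(theta)) + 1e-300))
        if np.log(rng.random()) < log_ratio:
            theta = prop
        # K-step: inverse-cdf draw from f truncated to [0, eps - E(theta)]
        K_max = eps - E(theta)
        K = -np.log1p(-rng.random() * (1.0 - np.exp(-K_max)))
        thetas.append(theta)
    return np.array(thetas)

samples = sample(eps=2.0)
print(samples.mean(), samples.std())
```

Because F grows smoothly from 0, configurations just above the hard bound E(θ) = ϵ are suppressed gradually rather than cut off, which is the smoothing effect the extra demon is meant to provide.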
Demonic nested sampling of the ten-state Potts model

Demon: f(K) ∝ Θ(K_max − K) K^{d/2−1} (d-dimensional harmonic oscillator, where d = dimension of configuration space)

  p(θ | ϵ) ∝ Θ[ϵ − E(θ)] π(θ) × { [ϵ − E(θ)]^{d/2},  ϵ − E(θ) ≤ K_max;   K_max^{d/2},  ϵ − E(θ) > K_max }

[Figure: A — energy bounds ϵ_k versus iteration k for standard and demonic NS; B — energy histograms; C — relative accuracy of log Z [%] as a function of the demon capacity K_max]
Nested sampling in phase space

• In continuous configuration spaces, it is convenient to unfold the demon and introduce momenta:

  f(K) = ∫ δ[K − K(ξ)] dξ   where   K(ξ) = ½ Σ_{i=1}^d ξ_i²   (kinetic energy)

• The marginal distribution in configuration space is

  p(θ | ϵ) ∝ Θ[ϵ − E(θ)] π(θ) × { [ϵ − E(θ)]^{d/2},  ϵ − E(θ) ≤ K_max;   K_max^{d/2},  ϵ − E(θ) > K_max }

• Hamiltonian dynamics for exploration: (θ, ξ) →_L (θ′, ξ′) where L is an integrator (e.g. the leapfrog)
Microcanonical Hamiltonian Monte Carlo

• 2(d+1)-dimensional phase space: implement the demon D as a harmonic oscillator with energy D = (ξ²_{d+1} + θ²_{d+1})/2

Require: ϵ (total energy), configuration θ with E(θ) < ϵ
  θ_{d+1} = 0                             ▷ Initialize demon D
  while not converged do
      ξ ∼ N(0, 1)                         ▷ Draw momenta from (d+1)-dim Gaussian
      ξ ← ξ × √(ϵ − E − D) / ∥ξ∥          ▷ Scale momenta so as to match excess energy
      (θ, ξ) →_L (θ′, ξ′)                 ▷ Run the leapfrog algorithm
      H′ = E(θ′) + K(ξ′)                  ▷ Compute total energy of candidate
      if H′ < ϵ then
          θ ← θ′, E ← E(θ)                ▷ Accept
      end if
  end while
Application to GS peptide

• A: Native structure of the GS peptide
• B: Evolution of the energy E(θ_k) (goodness-of-fit) during nested sampling
• C: Structure's accuracy measured by the root-mean-square deviation (RMSD [Å]) to the crystal structure
Other demons

• Distribution of the system's energy:

  p(E | ϵ) = Θ(ϵ − E) g(E) F(ϵ − E) / Y(ϵ)

  demon                      pdf f(K)                            cdf F(K)
  Gauss                      (β/√(2π)) e^{−β²K²/2}               ½[1 + erf(βK/√2)]
  Logarithmic oscillator     √(8πβ) e^{βK}                       √(8π/β) e^{βK}
    (K(ξ₁, ξ₂) = ½ξ₁² + β^{−1} ln|ξ₂|)
  Fermi                      β e^{−βK} / (1 + e^{−βK})²          1 / (1 + e^{−βK})
Application to SH3 domain

• Structure determination from sparse distance data measured by NMR spectroscopy

• The structure ensemble is as accurate and precise as with parallel tempering