Introduction to stochastic dynamical modelling Gavin J Gibson Maxwell Institute for Mathematical Sciences Heriot-Watt University
Outline • Motivation for stochastic modelling • Structure of stochastic models • Simulation of stochastic models • Fitting stochastic models to data with Bayesian methods • Examples from epidemiology and ecology
Stochastic – what and why? • Models explicitly represent the occurrence of random events • Repeated running of models allows a distribution of outcomes to be obtained • Probabilities of extreme events (e.g. extinction of a population) can be estimated • Very valuable in risk assessment • Important when observations display random characteristics
Individual based stochastic compartment models Example: Birth-death model • Population at time t has N ( t ) individuals. • Probability of any given individual giving birth in ( t , t + ∆ ) is approximately α∆ (∆ small). • Probability of any given individual dying in interval ( t , t + ∆ ) is β∆ . Net rate (per unit time) of births and deaths: α N ( t ) and β N ( t ).
4 realisations with α = 1.1, β = 1 and N (0) = 10 70 100 60 80 50 40 60 n n 30 40 20 20 10 0 0 5 10 15 20 0 5 10 15 t t 70 12 60 10 50 8 40 n n 6 30 4 20 EXTINCTION! 2 10 0 0 2 4 6 8 10 0 5 10 15 20 t t
Stochastic v deterministic • For the immigration death model, if α ≤ β the population will die out with certainty • If β > α then it may die out but there is some probability it will take off • The probability of ‘taking off’ increases with the population size • If it takes off it will exhibit exponential growth in the long term (like the deterministic version) • Can show E( N ( t )) = N (0)exp(( α - β ) t ) (exponential growth)
Stochastic compartment models e.g. SIR epidemic model • Consider fixed size population whose members can be in compartments Susceptible → Infected → Removed Instantaneous rates at time t at which a given individual transitions: S → I β I ( t ) I → R µ , Giving net rates of: β S ( t ) I ( t ); µ I ( t ) for transitions from S → I and I → R respectively.
SIR (continued) • Rates are non-linear functions of the numbers in each compartment • Simple solution for evolution of mean numbers in each compartment doesn’t exist • Similarly important to understand the probability of dichotomous behaviour (large or small outbreak) • R 0 = β N / µ , threshold for large v. small epidemics • Vaccination programmes aim to get R 0 below 1
Theoretical justification • If N is large and I (0) is small then rate of new infections is approximately β NI ( t ) • Rate of removals is µ I ( t ) • I ( t ) behaves like population governed by birth death model with birth rate β N and death rate µ β N / µ > 1 possible outbreak β N / µ < 1 extinction assured.
Simulating stochastic compartment models Gillespie algorithm: Current state ( S ( t ), I ( t )): [ S + I + R = N] Total instantaneous rate: R = β S( t )I( t ) + µ I( t ) Simulate ∆ ~Exp(R), t → t+ ∆ 1. Choose event type to be infection with probability β S( t )/R (otherwise 2. removal). 3. Update state vector at new time depending on the event type chosen. Repeat the process to generate a stochastic realisation of the process. Process generalises to compartment models of arbitrary complexity.
n_events = 1000 #number of times n = numeric(n_events) #vector for population sizes t = numeric(n_events) #vector of event times t[1] = 0 #initial time n[1] = 10 #initial population size beta = 1.1 #birth rate mu = 1 #death rate for( i in 1:(n_events-1) ) { #If extinct then nothing can happen! if(n[i] == 0) { n[i+1] = 0 t[i+1] = t[i] } #Otherwise simulate next event else{ del = rexp(1, n[i]*(beta + mu)) #time till event t[i+1] = t[i]+del #next event time #now select and implement event if( runif(1, 0, 1) < beta/(beta + mu) ) n[i+1] = n[i] + 1 else n[i+1] = n[i] - 1 } } plot(t, n, type = "l")
Applications of SIR Model • Help understand how effective vaccination can be • Suppose we can immunise a proportion p of the population • Initial number of susceptibles is N (1- p ), and effective R 0 decreases by this factor • How large must p be so that (1- p ) β N / µ < 1?
Adding realism to SIR • SIR model considered thus far makes certain unrealistic assumptions • Time spent by individual in class I has Exponential distribution (memoryless) • More realistic to assume distribution with some clear mode (such as Gamma distribution) • Not so straightforward to simulate – needs each individual’s history to be represented.
‘Method of stages’ • A way of removing Exponential assumption while keeping Markovian property by having 2 (or more stages in the I compartment) β 2 µ 2 µ S → I 1 → I 2 → R Combined time in I 1 and I 2 follows Gamma(2, 2 µ ).
Further extensions • SEIR models – E class corresponds to exposed but not yet infectious • Structured populations with different contact rates between groups (AIDS, influenza, measles) • Spatial interactions (see later)
Parameter estimation for epidemic models (SIR) • Parameter vector θ = ( β , µ ) – we want to estimate this, e.g. to estimate R 0 . • If complete knowledge, z , of all events (both infections and removals is available) then likelihood L( θ ) = Pr( z | θ ) can be calculated and maximised. • Problem is we often only see subset of the information – e.g. removal times
Partially observed process I → R S → I i 1 i 2 0 T time Challenge: How do we estimate the parameters if we only see the removal times?
Snapshot data: Rhizochtonia in radish (a) 35 5 reps without trichoderma 30 25 diseased plants 20 15 10 5 0 0 2 4 6 8 10 12 14 16 time (days) (b) 35 30 5 reps with trichoderma 25 diseased plants 20 15 10 5 0 0 2 4 6 8 10 12 14 16 18 time (days)
Bayesian inference for epidemics Experiment : yields partial data y . Stochastic model : with parameter θ specifying π ( y | θ ). Aim: Express belief re θ as a probability density π ( θ | y ). Bayesian solution: Assign prior distribution π ( θ ), to yield posterior π ( θ | y ) ∝ π ( θ ) π ( y | θ ). Likelihood Prior Problem: π ( y | θ ) is often intractable integral. Data augmentation: Consider complete data z for which π ( z | θ ) is tractable. Consider π ( θ , z | y ) ∝ π ( θ ) π ( z | θ ) π ( y | z ). Can simulate from π ( θ , z | y ) e.g. using MCMC.
Data-augmented Markov chain Monte Carlo • Set up Markov chain with state ( θ , z ) whose stationary distribution is π ( θ , z | y ) • In practice involves proposing and accepting and rejecting changes to θ and z repeatedly and storing the states ( θ , z ) visited. • Posterior distributions of individual parameters can be estimated from histograms
Benefits of data-augmented MCMC • Makes otherwise difficult statistical problems tractable • Can be used to make inference on unobserved events (e.g. time and location of first infection – index case) Disadvantages? • Difficult to implement – usually need to develop code from scratch • Long computation times – even for quite small problems
Example application: R solani in radish Model: SEI with ‘quenching’, No trichoderma primary and secondary infection constant latent period. S → E rate: R ( t ) = e -at ( r p S ( t ) + r s S ( t ) I ( t )) Stay in E for fixed period κ Trichoderma 3 ‘submodels’: 1. Latent period = 0 (SI) 2. SEI with observations recording I 3. SEI with observations recording E+I
3. SEI model, E+I observed Although quantitative estimates Secondary Primary of parameters changes with model the qualitative conclusion seems robust. There is consistent evidence that T viride appears to affect the primary infection parameter. Models are useful ‘lenses’ even if they cannot be used as ‘crystal Latent Quenching balls’!
Extension to spatio-temporal SI models Susceptible j acquires infection at rate: j R j = f(t; γ ) ( ε + β Σ i K ( d ij , α )) Can be fitted using standard Bayesian/data Spatial kernels (examples): augmentation/MCMC 1. K ( d , α ) = exp ( - α d ) approach 2. K ( d , α ) = exp (- α d 2 ) 3. K ( d , α ) = ( d +1) - α See GJG (1997), Jamieson (2004), Cook et al (2008) , Chis-Ster & Ferguson (2009) If models are used to design control strategy – e.g. spatial eradication programmes – then model choice can be crucial.
Example: Miami Citrus Canker epidemic (Gottwald et al. , Phytopathology , 2002; Jamieson, PhD Thesis, U. Cambridge, 2004, Cook et al. 2008, Neri et al 2014) Data: Dade county, Miami 6056 susceptibles, 1124 infections after 12 30-day periods Optimal strategy for eradication sensitive to model choice
Control strategies can be controversial
Experiment (Hughes et al. , J. Gen. Virol. 2002): 32 sheep allocated to 4 groups G 1 , .., G 4 . Animals in G 1 inoculated with FMD virus (4 at t =0, 4 at t =1 day). Thereafter animals mix according to the following scheme: G1 G2 G3 G4 Day 1 2 3 4 . Idea is to ‘force’ higher groups further down the chain of infection. Data: Daily tests on each animal, summarised by: y = (time of 1 st +ve test, last +ve test, peak viraemic level)
Recommend
More recommend