Applied Bayesian Inference - PM exercise
You can download the slides for the PM lecture here
Lotka-Volterra
In this practical, we are going to perform inference for the Lotka-Volterra model of population dynamics. The model describes the dynamics of population sizes of prey () and predators (), where, here, denotes a given time in years. These equations are given by:
where at time , and . Here, yields the prey population growth rate in the absence of predators; is the rate of decline in the prey as a result of predation; denotes the rate of decline of predators in absence of prey; is the rate of predator population growth due to predation.
Preliminary question
Have you got PINTS installed on your machine? If not, clone the PINTS repo on: https://github.com/pints-team/pints and install it by executing:
pip install -e .[dev,docs]
on the terminal in the PINTS main directory. (If you want to avoid clashes with any local installed packages, you may want to use a virtual environment.)
Lotka-Volterra
Write a function which solves these two equations for any choice of . Plot the solutions for .
Code up the RHS of equations and use numerical solver.
Different parameters
Run the model with . How do the dynamics compare with previous?
Observations
Suppose that the observations for each population are governed by the following:
where and . Show that these are equivalent to:
Generate observations
Using these observation models, generate annual data for 25 years assuming and using the parameter sets from parts 1 and 2. Graph these observations.
Note, this may be a helpful resource: https://ben18785.shinyapps.io/distribution-zoo/
Likelihood
The likelihood for the prey compartment is given by:
with a similar expression for the predator compartment. Here is the solution of the Lotka-Volterra eqns. Write a function which calculates the log-likelihood of a set of observations with given values of .
Intro to PINTS
PINTS is a Python package designed in the Department of Computer Science that provides access to all sorts of inference routines, which is especially good for ODEs. Following the approach given here: https://github.com/pints-team/pints/blob/master/examples/stats/custom-model.ipynb wrap a pints.LogPDF class around the log-likelihood function we just created to allow us to access these methods. We're going to hold a number of parameters constant to make inference a little more manageable for this practical, so set up your method with .
Run MCMC
We now are going to use an inbuilt MCMC method in PINTS called HaarioBardenetACMC
to generate posterior samples with 2000 iterations per chain and 200 initial phase iterations across 4 chains. To initialise each of the chains assume:
which can be done using:
nchains = 4 xs = [[0.6, 0.02, 1.0, 0.03]] * nchains
Note, initialising chains all at a single point is not good practive but allows us to get up and running faster.
To run the MCMC, we first instantiate the pints.LogPDF object you created in the previous question assuming observational data generated in question 4. Then we instantiate an MCMC Controller object using:
model = LotkaVolterraLogPDF(df) mcmc = pints.MCMCController(model, nchains, xs, method=pints.HaarioBardenetACMC)
where model
is an instantiated version of your model. Then we set the total number of iterations per chain and the initial phase iterations using:
mcmc.set_max_iterations(2000) mcmc.set_initial_phase_iterations(200)
Finally, we execute:
chains = mcmc.run()
to run the MCMC.
Note that, at the moment, since we have not specified priors, PINTS implicitly assumes that the priors are uniform (and, in this case, improper).
MCMC summary
PINTS has an in-built MCMC summary object that can be called and printed using:
results = pints.MCMCSummary(chains=chains, time=mcmc.time()) print(results)
Have your MCMC chains yet converged? If Rhat > 1.1 (probably 1.01 for publications), for any of your parameters, in practice you would rerun for more iterations. Since we only have a little time today, we won't rerun things.
Plot trace
Using PINTS' plotting tools (see: https://github.com/pints-team/pints/blob/master/examples/sampling/adaptive-covariance-haario-bardenet.ipynb) plot pairwise samples from the parameters. Before we do that, let's discard the first half of the chains as these are likely before convergence occurred.
Parameter Correlations
Why is there positive correlation between the estimates of and ?
Posterior predictive
Using a randomly drawn subset of 100 posterior draws, generate posterior predictive draws of the mean predator and prey numbers of time. Then plot these, overlaying the data. Why do the posterior predictive simulations not encompass all the variation seen in the data?
Random initialisation
We are now going to initialise the chains randomly, which is better practice since it is less likely the chains will be biased to a particular area of parameter space. To do this, we are going to create a UniformLogPrior
object in PINTS, then use the sample()
method to generate independent draws for each parameter across each chain. To do this, we are going to use the following code:
log_prior = pints.UniformLogPrior([0, 0, 0, 0], [2, 0.1, 2.0, 0.2])
Using this log_prior
object, sample initial positions for the chains and re-run the MCMC sampling as above. Afterwards, plot the path of the chains over time and compute convergence measures. Do the chains look as close to convergence as before?