Skip to main content

Scientific Computing

Bayesian Inference

Applied Bayes... []

This material has been adapted from material by Ben Lambert from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training.

This material has been adapted from material by Ben Lambert from the "Scientific Computing" module of the SABS R³ Center for Doctoral Training.

Creative Commons License
This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1

This course material was developed as part of UNIVERSE-HPC, which is funded through the SPF ExCALIBUR programme under grant number EP/W035731/1

Creative Commons License

Applied Bayesian Inference - PM exercise

You can download the slides for the PM lecture here

Lotka-Volterra

In this practical, we are going to perform inference for the Lotka-Volterra model of population dynamics. The model describes the dynamics of population sizes of prey (u(t)u(t)) and predators (v(t)v(t)), where, here, tt denotes a given time in years. These equations are given by:

dudt=αuβuvdvdt=γv+δuv\begin{aligned} \frac{d u}{dt} &= \alpha u - \beta u v\\ \frac{d v}{dt} &= -\gamma v + \delta u v \end{aligned}

where at time t=0t=0, u(0)=u0u(0) = u_0 and v(0)=v0v(0)=v_0. Here, α0\alpha\geq 0 yields the prey population growth rate in the absence of predators; β0\beta \geq 0 is the rate of decline in the prey as a result of predation; γ0\gamma \geq 0 denotes the rate of decline of predators in absence of prey; δ>0\delta > 0 is the rate of predator population growth due to predation.

Preliminary question

Have you got PINTS installed on your machine? If not, clone the PINTS repo on: https://github.com/pints-team/pints and install it by executing:

pip install -e .[dev,docs]

on the terminal in the PINTS main directory. (If you want to avoid clashes with any local installed packages, you may want to use a virtual environment.)

Lotka-Volterra

Write a function which solves these two equations for any choice of (α,β,γ,δ,u0,v0)(\alpha, \beta, \gamma, \delta, u_0, v_0). Plot the solutions for α=0.55,β=0.028,γ=0.84,δ=0.026,u0=33,v0=6\alpha=0.55, \beta=0.028, \gamma=0.84, \delta=0.026, u_0=33, v_0=6.

Code up the RHS of equations and use numerical solver.

Different parameters

Run the model with α=0.79,β=0.04,γ=1.3,δ=0.04,u0=33,v0=6\alpha=0.79, \beta=0.04, \gamma=1.3, \delta=0.04, u_0=33, v_0=6. How do the dynamics compare with previous?

Observations

Suppose that the observations for each population are governed by the following:

u~=uexp(ϵu)v~=vexp(ϵv)\begin{aligned} \tilde u &= u \exp(\epsilon_u)\\ \tilde v &= v \exp(\epsilon_v) \end{aligned}

where ϵuN(0,σu)\epsilon_u\sim\mathcal{N}(0, \sigma_u) and ϵvN(0,σv)\epsilon_v\sim\mathcal{N}(0, \sigma_v). Show that these are equivalent to:

u~log-normal(logu,σu)v~log-normal(logv,σv)\begin{aligned} \tilde u &\sim \text{log-normal}(\log u, \sigma_u)\\ \tilde v &\sim \text{log-normal}(\log v, \sigma_v)\\ \end{aligned}

Generate observations

Using these observation models, generate annual data for 25 years assuming σu=σv=0.25\sigma_u=\sigma_v=0.25 and using the parameter sets from parts 1 and 2. Graph these observations.

Note, this may be a helpful resource: https://ben18785.shinyapps.io/distribution-zoo/

Likelihood

The likelihood for the prey compartment is given by:

Lu=t=1Tlog-normal(u~(t)logu(tα,β,γ,δ,u0,v0),σu)L_u = \prod_{t=1}^{T} \text{log-normal}(\tilde u(t)| \log u(t|\alpha, \beta, \gamma, \delta, u_0, v_0), \sigma_u)

with a similar expression for the predator compartment. Here u(tα,β,γ,δ,u0,v0)u(t|\alpha, \beta, \gamma, \delta, u_0, v_0) is the solution of the Lotka-Volterra eqns. Write a function which calculates the log-likelihood of a set of observations with given values of (α,β,γ,δ,u0,v0,σu,σv)(\alpha, \beta, \gamma, \delta, u_0, v_0, \sigma_u, \sigma_v).

Intro to PINTS

PINTS is a Python package designed in the Department of Computer Science that provides access to all sorts of inference routines, which is especially good for ODEs. Following the approach given here: https://github.com/pints-team/pints/blob/master/examples/stats/custom-model.ipynb wrap a pints.LogPDF class around the log-likelihood function we just created to allow us to access these methods. We're going to hold a number of parameters constant to make inference a little more manageable for this practical, so set up your method with (u0=33,v0=6,σu=0.25,σv=0.25)(u_0=33, v_0=6, \sigma_u=0.25, \sigma_v=0.25).

Run MCMC

We now are going to use an inbuilt MCMC method in PINTS called HaarioBardenetACMC to generate posterior samples with 2000 iterations per chain and 200 initial phase iterations across 4 chains. To initialise each of the chains assume:

(α=0.6,β=0.02,γ=1.0,δ=0.03)(\alpha=0.6, \beta=0.02, \gamma=1.0, \delta=0.03)

which can be done using:

nchains = 4 xs = [[0.6, 0.02, 1.0, 0.03]] * nchains

Note, initialising chains all at a single point is not good practive but allows us to get up and running faster.

To run the MCMC, we first instantiate the pints.LogPDF object you created in the previous question assuming observational data generated in question 4. Then we instantiate an MCMC Controller object using:

model = LotkaVolterraLogPDF(df) mcmc = pints.MCMCController(model, nchains, xs, method=pints.HaarioBardenetACMC)

where model is an instantiated version of your model. Then we set the total number of iterations per chain and the initial phase iterations using:

mcmc.set_max_iterations(2000) mcmc.set_initial_phase_iterations(200)

Finally, we execute:

chains = mcmc.run()

to run the MCMC.

Note that, at the moment, since we have not specified priors, PINTS implicitly assumes that the priors are uniform (and, in this case, improper).

MCMC summary

PINTS has an in-built MCMC summary object that can be called and printed using:

results = pints.MCMCSummary(chains=chains, time=mcmc.time()) print(results)

Have your MCMC chains yet converged? If Rhat > 1.1 (probably 1.01 for publications), for any of your parameters, in practice you would rerun for more iterations. Since we only have a little time today, we won't rerun things.

Plot trace

Using PINTS' plotting tools (see: https://github.com/pints-team/pints/blob/master/examples/sampling/adaptive-covariance-haario-bardenet.ipynb) plot pairwise samples from the parameters. Before we do that, let's discard the first half of the chains as these are likely before convergence occurred.

Parameter Correlations

Why is there positive correlation between the estimates of α\alpha and β\beta?

Posterior predictive

Using a randomly drawn subset of 100 posterior draws, generate posterior predictive draws of the mean predator and prey numbers of time. Then plot these, overlaying the data. Why do the posterior predictive simulations not encompass all the variation seen in the data?

Random initialisation

We are now going to initialise the chains randomly, which is better practice since it is less likely the chains will be biased to a particular area of parameter space. To do this, we are going to create a UniformLogPrior object in PINTS, then use the sample() method to generate independent draws for each parameter across each chain. To do this, we are going to use the following code:

log_prior = pints.UniformLogPrior([0, 0, 0, 0], [2, 0.1, 2.0, 0.2])

Using this log_prior object, sample initial positions for the chains and re-run the MCMC sampling as above. Afterwards, plot the path of the chains over time and compute convergence measures. Do the chains look as close to convergence as before?