2. Getting Started
pyLFI is a Python toolbox for Bayesian parameter estimation in models with
intractable likelihood functions. By using Likelihood-Free Inference (LFI)
schemes, in particular Approximate Bayesian Computation (ABC), pyLFI estimates
the posterior distributions over model parameters. LFI is also known under the
moniker Simulation-Based Inference (SBI).
Introduction
Mechanistic models aim to explain phenomena in terms of causal mechanisms, and candidate models are validated by investigating whether the proposed mechanisms can explain the experimental data. Mechanistic models are generally formulated as differential equations, and they often contain parameters that cannot be measured directly. A central challenge in building a mechanistic model is to identify the parametrization of the system that brings the model into agreement with experimental data.
Many mechanistic models are defined through simulators which describe how the process generates data. However, simulators are poorly suited for inference and lead to challenging inverse problems. Standard Bayesian inference is performed within the context of a statistical model from which the likelihood can be derived. Likelihoods are generally intractable or computationally infeasible for simulator models, which makes the typical approach to inference inaccessible.
LFI, or SBI, refers to a suite of algorithms that avoid explicit likelihood evaluations by instead using model simulations.
The ABC of Approximate Bayesian Computation
Approximate Bayesian Computation (ABC) constitutes a class of computational sampling algorithms rooted in Bayesian statistics that bypass evaluation of the likelihood function. Given observed data \(y_\mathrm{obs}\), a simulator model \(\mathrm{M}(\theta)\) with parameters \(\theta\) having prior distributions \(\pi (\theta)\), ABC algorithms can be used to estimate the posterior distributions \(\pi (\theta \mid y_\mathrm{obs})\) over model parameters.
At its heart, the ABC approach is quite simple: evaluation of the likelihood is replaced by a comparison of simulated data (generated by the simulator model) with observed data, in order to assess how likely it is that the model could have produced the observed data.
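The comparison described above can be illustrated with rejection ABC, the simplest member of this class of algorithms. The sketch below uses only the Python standard library and does not use pyLFI's API. The Gaussian toy simulator, the sample-mean summary statistic, the absolute-difference distance, the uniform prior, and the tolerance `epsilon` are all assumptions chosen for illustration.

```python
import random
import statistics


def simulator(theta, n=50, rng=random):
    """Toy simulator: n draws from a Gaussian with mean theta and unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(y):
    """Low-dimensional summary statistic: the sample mean."""
    return statistics.fmean(y)


def rejection_abc(y_obs, prior_sampler, n_samples, epsilon, rng=random):
    """Keep prior draws whose simulated summary lies within epsilon of the observed one."""
    s_obs = summary(y_obs)
    accepted = []
    while len(accepted) < n_samples:
        theta = prior_sampler(rng)  # propose a parameter from the prior
        s_sim = summary(simulator(theta, n=len(y_obs), rng=rng))
        if abs(s_sim - s_obs) <= epsilon:  # compare simulated and observed data
            accepted.append(theta)
    return accepted


rng = random.Random(42)
y_obs = simulator(2.0, n=50, rng=rng)  # stand-in for experimental data
posterior = rejection_abc(
    y_obs,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),  # assumed uniform prior on theta
    n_samples=200,
    epsilon=0.2,
    rng=rng,
)
posterior_mean = statistics.fmean(posterior)
```

The accepted values in `posterior` are samples from an approximation to \(\pi (\theta \mid y_\mathrm{obs})\). A smaller `epsilon` tightens the approximation at the cost of more rejected simulations.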
Parameter Identification with pyLFI
pyLFI is designed to be general and flexible so that it can accommodate other
LFI algorithms as well. The price to pay for this generality and flexibility is
that the simulation of data and the calculation of summary statistics from the
data are left entirely to the user.
To perform parameter identification with pyLFI, there are generally four
inputs that need to be specified:
- A simulator model. The mechanistic model needs to be specified through a simulator model that can generate simulated data \(y_\mathrm{sim}\) for any parameters \(\theta\). The simulator must be a Python callable.
- A summary statistics calculator. The ABC algorithms require the use of low-dimensional summary statistics \(s = S(y)\) calculated from the raw data \(y\). The summary statistics calculator must be a Python callable.
- Observed data \(y_\mathrm{obs}\). This must be in the same form as \(y_\mathrm{sim}\).
- A prior \(\pi (\theta)\) for each unknown parameter that describes the range of possible parameter values. Priors must be pylfi.Prior objects.
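As a rough illustration of the first three inputs, the sketch below defines a simulator and a summary statistics calculator as plain Python callables and produces observed data in the same form as the simulated data. The exponential-decay model, the function names, and all default values are hypothetical. The prior is left out on purpose, because the arguments accepted by pylfi.Prior are not described in this section.

```python
import math
import random
import statistics


def simulator(theta, n_points=100, dt=0.1, noise_sd=0.05, seed=None):
    """Hypothetical simulator: noisy exponential decay y(t) = exp(-theta * t)."""
    rng = random.Random(seed)
    return [
        math.exp(-theta * i * dt) + rng.gauss(0.0, noise_sd)
        for i in range(n_points)
    ]


def summary_statistics(y):
    """Hypothetical summary calculator: reduce the raw trace to (mean, std)."""
    return (statistics.fmean(y), statistics.stdev(y))


# Observed data must have the same form as the simulator output.
# Here it is faked by running the simulator at a "true" parameter value.
y_obs = simulator(0.5, seed=1)
s_obs = summary_statistics(y_obs)

# A prior for theta would be supplied as a pylfi.Prior object. Its constructor
# is not described in this section, so it is deliberately not shown here.
```

Because both functions are ordinary callables, they can be tested and profiled on their own before being used for inference.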
Simulators or summary statistics calculators not written in Python can be
used as long as they can be wrapped in a Python function or in the
__call__ method of a class.
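One possible way to do such wrapping is to launch the external program with the standard library's subprocess module from inside a class's __call__ method. In the sketch below, the class name `ExternalSimulator`, the command-line interface (parameter passed as an argument, whitespace-separated numbers printed to standard output), and the stand-in script are all assumptions. A real simulator would define its own interface.

```python
import subprocess
import sys

# Stand-in for a simulator written in another language. A small Python script is
# used only so that the sketch runs anywhere; in practice the command would point
# to, for example, a compiled binary.
EXTERNAL_SCRIPT = (
    "import sys; k = float(sys.argv[1]); "
    "print(' '.join(str(k * t) for t in range(5)))"
)


class ExternalSimulator:
    """Wrap a command-line simulator so that it can be called like a function."""

    def __init__(self, command):
        self.command = list(command)

    def __call__(self, theta):
        # Pass the parameter as a command-line argument and capture stdout.
        result = subprocess.run(
            self.command + [str(theta)],
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError if the program fails
        )
        # Parse whitespace-separated numbers into the simulated data y_sim.
        return [float(value) for value in result.stdout.split()]


simulator = ExternalSimulator([sys.executable, "-c", EXTERNAL_SCRIPT])
y_sim = simulator(2.0)
```

An instance of the class is itself a Python callable, so it can be used wherever a simulator function is expected.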