System analysis: Code generation for ODE-based models

Author

David Kneis (firstname.lastname @ tu-dresden.de)

Published

December 23, 2023

\(\leftarrow\) BACK TO TEACHING SECTION OF MY HOMEPAGE

1 Motivation

In previous lessons, we learnt about simultaneous ODE and how they can be used to model the behavior of dynamic systems. We also learnt that simultaneous ODE can be written as a vector-matrix multiplication (Equation 1).

\[ \begin{align} \begin{bmatrix} \tfrac{d}{dt}y_1 \\ \tfrac{d}{dt}y_2 \\ \ldots \\ \tfrac{d}{dt}y_n \end{bmatrix} ^T & = & \begin{bmatrix} a \\ b \\ \ldots \\ z \end{bmatrix} ^T & \cdot & \begin{bmatrix} s_{a,1}, & s_{a,2}, & \ldots, & s_{a,n} \\ s_{b,1}, & s_{b,2}, & \ldots, & s_{b,n} \\ \ldots, & \ldots, & \ldots, & \ldots \\ s_{z,1}, & s_{z,2}, & \ldots, & s_{z,n} \\ \end{bmatrix} \\ \text{Derivatives} & & \text{Process rates} & & \text{Stoichiometry matrix} \end{align} \tag{1}\]

Finally, we learnt how the equations can be integrated numerically using the deSolve package. Now that we know these low-level basics, we can start looking at more complex models of physical, chemical, or biological systems.

But couldn’t we get rid of all the programming overhead?

If we were less distracted by the technical details of R code, we could spent more time on formulating the equations and understanding the system’s behavior.

One way to reduce programming overhead is to go for graphical modeling tools. However, we will rather take another approach known as code generation. That is, we outsource the writing of all the abstract, technical parts of R code to a computer program. Actually, we let the R code be written by R itself. Sounds strange? But it’s possible.

2 Tools for less programming

2.1 The rodeo package

This package allows all ‘ingredients’ of an ODE model to be imported from plain tables (e.g. text files or spreadsheets). Among these ‘ingredients’ are the process rates and the stoichiometry matrix (see Equation 1). All imported information is checked for integrity and a deSolve-compliant source code is subsequently generated.

The rodeo package is available in the CRAN repository and can be installed instantly:

install.packages("rodeo")
library("rodeo")

If you want to get a better understanding of how the rodeo package can facilitate the implementation of simultaneous ODE, you can look at this article or study the package vignette which contains more examples (PDF file available here). The package has previously been applied, for example, to simulate bacterial plasmid transfer or predator-prey interactions.

2.2 The rodeoEasy package

The rodeo package was primarily made as a power-tool for research. For teaching, I built another package on top: rodeoEasy. This package is not available on CRAN but can be installed and loaded like so:

install.packages("remotes")  ### ONLY IF NOT YET INSTALLED
library("remotes")
remotes::install_github("dkneis/rodeoEasy")
library("rodeoEasy")

Currently, the rodeoEasy package provides the following set of functions:

  • build: Imports the complete description of an ODE model from a workbook in ‘.xlsx’ or ‘.ods’ format.

  • run.scenarios: Runs a dynamic simulation for a model previously built with build. It supports the comparison of multiple scenarios. Each scenario can have its own set of initial values and/or parameter values.

All functions are fully documented and demo examples are provided with the package (see the ‘examples’ sections in the documentation).

3 Demonstration of rodeoEasy

3.1 Oxygen sag model reloaded

In a previous lesson, we considered the aerobic decay of a substance with re-aeration from the atmosphere in a very (over)simplistic fashion. The system with the two state variables \(OM\) (organic matter) and \(DO\) (dissolved oxygen) could be described by the following ODE (Equation 2)

\[ \begin{aligned} \tfrac{d}{dt}OM &= -k_d \cdot OM \\ \tfrac{d}{dt}DO &= -k_d \cdot OM \cdot s + k_a \cdot \left( DO_{sat}(T) - DO \right) \end{aligned} \tag{2}\]

with the rate constant of degradation (\(k_d\)), the rate constant of re-aeration (\(k_a\)), the oxygen saturation level \(DO_{sat}\) being a function of temperature \(T\), and the stoichiometric coefficient \(s\).

After adoption of the recommended notation based on vector-matrix multiplication, the equations took the form of Equation 3 (note the T operators for vector transposition):

\[ \begin{bmatrix} \tfrac{d}{dt}OM \\ \tfrac{d}{dt}DO \end{bmatrix}^T = \begin{bmatrix} {\color{blue}k_d} \cdot OM \\ {\color{blue}k_a} \cdot \left( {\color{red}DO_{sat}}({\color{blue}T}) - DO \right) \end{bmatrix}^T \cdot \begin{bmatrix} {\color{gray}-1}, & {\color{blue}-s} \\ {\color{gray}0}, & {\color{gray}1} \end{bmatrix} \tag{3}\]

To highlight the different categories of items in Equation 3, I highlighted them by color (state variables in black, parameters in blue, functions in red, plain numbers in gray).

To implement the ODE from Equation 3 with the help of the rodeoEasy package, we start with putting the vector of process rates and the stoichiometry matrix in a sheet of a spreadsheet workbook names eqns. The minimum required contents of that sheet is shown in Table 1.

Table 1: Contents of sheet ‘eqns’ of the oxygen model.
description name unit rate OM DO
degradation deg g OM / h kd * OM -1 -s
aeration aer g DO / h ka * (DO_sat(T) - DO) 0 1

Obviously, the process rate expressions end up in column ‘rate’ and minimal documentation has to be provided in the ‘description’ and ‘unit’ columns. The ‘name’ column is needed to access the value of the process rates in simulation output. Further, there must be one column for every state variable which holds the corresponding information of the stoichiometry matrix.

Although Table 1 fully implements the ODE right hand sides, some additional tables are needed by rodeoEasy. These are used to declare the different categories of items appearing the math expressions (Table 2, Table 3, Table 4). Essentially, one such table must be provided for every color used in Equation 3 (except gray). The declaration is primarily needed to validate the math expressions but it also enforces a minimum documentation of the model. A model without documentation is essentially useless!

Table 2: Contents of sheet ‘vars’ of the oxygen model.
name unit description default
OM mg/l organic matter 1
DO mg/l dissolved oxygen 10
Table 3: Contents of sheet ‘pars’ of the oxygen model.
name unit description default
kd 1/h degradation rate constant 1.00
ka 1/h aeration rate constant 0.50
s g DO / g OM stoichiometry 2.67
T °C temperature 20.00
Table 4: Contents of sheet ‘funs’ of the oxygen model. Here, we declare and define the function computing the oxygen saturation level. The function will be loaded automatically into R’s global environment.
name code
DO_sat # O2 saturation in water (mg/L) as a function of temperature (°C)
DO_sat DO_sat <- function(T) {
DO_sat exp(7.712 - 1.314 * log(T + 45.93))
DO_sat }

Once these four sheets (‘eqns’, ‘vars’, ‘pars’, ‘funs’) have been filled, we are ready to go!

The code section below imports a workbook containing the four sheets (this is a demo workbook shipped with the rodeoEasy package). After import, a dynamic simulation is performed with parameters and initial values taken from the ‘default’ columns of the sheets ‘pars’ and ‘vars’, respectively.

# set the workbook containing the model declaration
workbook <- system.file("models/oxygen.xlsx", package="rodeoEasy")

# run the code generator to receive a model object
model <- rodeoEasy::build(workbook)

# run a dynamic simulation, including basic plotting
x <- rodeoEasy::run.scenarios(model, times=seq(0, 12, 0.1),
  scenarios=NULL
)

Figure 1: Dynamics predicted by the oxygen sag model using default values for parameters and the initial state. Those defaults are taken from the ‘default’ columns of the sheets ‘vars’ and ‘pars’.

Very typically, we want to compare the behavior of a model for different initial values or parameters. The run.scenarios function of rodeoEasy supports this in a simple way (see R code below) and it also provides immediate visualization (Figure 2). The values of any parameters or variables not being explicitly set for a scenario remain at their respective defaults.

myScenarios <- list(
  clean= c(OM=1),  # 1st scenario
  dirty= c(OM=10)  # 2nd scenario
)
x <- rodeoEasy::run.scenarios(model, times=seq(0, 12, 0.1),
  scenarios=myScenarios, leg="bottomright"
)

Figure 2: Dynamics predicted by the oxygen sag model for alternative scenarios.

If you look at Figure 2, you’ll notice that something weird happened. Obviously, the oxygen concentration turned negative in response to a high initial concentration of degradable matter. How could this happen?

If you look at the equations again (Table 1), you’ll notice that the process of degradation solely depend on the presence of organic matter but not at all on the availability of oxygen. Thus, in the model, degradation of organic matter can continue even after oxygen was fully depleted. In reality, this would only be possible if (1) bacteria switch to an anaerobic metabolism (if they can) or (2) the aerobic bacterial community experiences a shift towards anaerobic species (by selection). In any case, no more oxygen would be consumed once it was depleted to zero.

The solution: One would need to add a term to the model that accounts for oxygen limitation. We’ll learn about common terms to represent limitation in another lesson.

3.2 Epidemic model of the SEIR family

3.2.1 Background

The term SEIR model denotes a particular family of epidemiological models. They describe the dynamics of an infectious disease within a population consisting of four sub-groups (Table 5, Figure 3).

Figure 3: Sub-populations distinguished by the SEIR model. Red dots indicate the pathogen.
Table 5: The four state variables of SEIR epidemiological models.
Group Variable symbol Description
Susceptible group S Individuals who did not acquire the disease yet and are thus considered susceptible
Exposed group E Individuals who acquired the infection but who are not yet infectious. In those people, the pathogen undergoes incubation.
Infected group I Individuals in an advance state of infection who can transmit the pathogen to others.
Recovered group R Recovered individuals. They are assumed to be (and remain) immune against recurrent infection.

The basic assumptions of behind SEIR models are described, e.g., in this article but also in many educational texts. Note that there are many variants of epidemiological models in addition to the simple SEIR. They either neglect aspects (e.g. the \(E\) group) or emphasize certain additional features like a gradual loss of immunity in the \(R\) group.

3.2.2 Implementation

A SEIR model is part of suite of demo models distributed with the rodeoEasy package. To find the respective workbook on your local computer, type the R command

print(system.file("models/SEIR.xlsx", package="rodeoEasy"))

and open that file with the spreadsheet software of your choice. For convenience, the contents of the workbook is also displayed below (Table 6 to Table 9).

Table 6: Contents of sheet ‘eqns’ of the SEIR model.
name unit description rate S E I R
transm ind./day transmission t * I/(S+E+I+R) * S -1 1 0 0
infect ind./day infection i * E 0 -1 1 0
recover ind./day recovery r * I 0 0 -1 1
Table 7: Contents of sheet ‘vars’ of the SEIR model.
name unit description default
S individuals susceptible 8e+07
E individuals carrier, not yet infectious 8e+05
I individuals infectious 0e+00
R individuals recovered, immune 0e+00
Table 8: Contents of sheet ‘pars’ of the SEIR model.
name unit description default
t 1/time transmission rate constant 1.0
i 1/time incubation rate constant 0.2
r 1/time recovery rate constant 0.4
Table 9: Contents of sheet ‘funs’ of the SEIR model. Since we did not reference any functions in the ‘eqns’ sheet, we declare an unused dummy only.
name code
dummy dummy <- function() { NULL }

The R code below triggers a simulation of the SEIR model using the default parameters and initial values. Note that visualization (Figure 4) is based on a few lines of custom code to arrange all variables in a single plot.

workbook <- system.file("models/SEIR.xlsx", package="rodeoEasy")
model <- rodeoEasy::build(workbook)
x <- rodeoEasy::run.scenarios(model, times=0:90, scenarios=NULL,
  plot.vars=FALSE)
varnames <- c("S","E","I","R")    # could also use: model$namesVars()
plot(range(x[,"time"]), range(x[,varnames]), type="n",
  xlab="Days", ylab="Individuals")
for (vn in varnames) {
  lines(x[,"time"], x[,vn], col=match(vn, varnames)) 
}
legend("right", bty="n", lty=1, col=1:length(varnames),
  legend=varnames)

Figure 4: Dynamics of an epidemic as predicted by the SEIR model.

As illustrated by Figure 4, the SEIR model as implemented above predicts an end of the epidemic once most of the individuals had been infected and went through recovery. However, we know by experience that immunity acquired in response to an infection does not necessarily last forever.

3.2.3 Excercise

  1. Modify the above system of ODE to account for a gradual loss of immunity in the group of recovered individuals.

  2. What are possible steady states of the original SEIR model and your extended version?

  3. Play with the model parameters within reasonable bounds. Does the model run straight into steady state under all circumstances?

3.2.4 A possible solution

The extended model which accounts for the possibility of recurrent infections is known as a ‘SEIRS’ model. It is included as a demo examples in the rodeoEasy package (see R code below) and a model output is shown in Figure 5. Note that I adjusted the values of some of the rate constants to make the epidemic a bit more dynamic and trigger damped oscillations.

workbook <- system.file("models/SEIRS.xlsx", package="rodeoEasy")
model <- rodeoEasy::build(workbook)
x <- rodeoEasy::run.scenarios(model, times=0:365,
  scenarios=list(SEIR=c(s=0), SEIRS=c(s=0.01)),
  plot.vars=TRUE, leg="bottomright")

Figure 5: Dynamics of an epidemic predicted by the extended SEIR model allowing for recurrent infections in response to a gradual loss of immunity (SEIRS). The occurrence of notable oscillations depends on chosen parameter values.