install.packages("rodeo")
library("rodeo")
System analysis: Code generation for ODE-based models
\(\leftarrow\) BACK TO TEACHING SECTION OF MY HOMEPAGE
1 Motivation
In previous lessons, we learnt about simultaneous ODE and how they can be used to model the behavior of dynamic systems. We also learnt that simultaneous ODE can be written as a vector-matrix multiplication (Equation 1).
\[ \begin{align} \begin{bmatrix} \tfrac{d}{dt}y_1 \\ \tfrac{d}{dt}y_2 \\ \ldots \\ \tfrac{d}{dt}y_n \end{bmatrix} ^T & = & \begin{bmatrix} a \\ b \\ \ldots \\ z \end{bmatrix} ^T & \cdot & \begin{bmatrix} s_{a,1}, & s_{a,2}, & \ldots, & s_{a,n} \\ s_{b,1}, & s_{b,2}, & \ldots, & s_{b,n} \\ \ldots, & \ldots, & \ldots, & \ldots \\ s_{z,1}, & s_{z,2}, & \ldots, & s_{z,n} \\ \end{bmatrix} \\ \text{Derivatives} & & \text{Process rates} & & \text{Stoichiometry matrix} \end{align} \tag{1}\]
Finally, we learnt how the equations can be integrated numerically using the deSolve
package. Now that we know these low-level basics, we can start looking at more complex models of physical, chemical, or biological systems.
But couldn’t we get rid of all the programming overhead?
If we were less distracted by the technical details of R code, we could spent more time on formulating the equations and understanding the system’s behavior.
One way to reduce programming overhead is to go for graphical modeling tools. However, we will rather take another approach known as code generation. That is, we outsource the writing of all the abstract, technical parts of R code to a computer program. Actually, we let the R code be written by R itself. Sounds strange? But it’s possible.
2 Tools for less programming
2.1 The rodeo
package
This package allows all ‘ingredients’ of an ODE model to be imported from plain tables (e.g. text files or spreadsheets). Among these ‘ingredients’ are the process rates and the stoichiometry matrix (see Equation 1). All imported information is checked for integrity and a deSolve
-compliant source code is subsequently generated.
The rodeo
package is available in the CRAN repository and can be installed instantly:
If you want to get a better understanding of how the rodeo
package can facilitate the implementation of simultaneous ODE, you can look at this article or study the package vignette which contains more examples (PDF file available here). The package has previously been applied, for example, to simulate bacterial plasmid transfer or predator-prey interactions.
2.2 The rodeoEasy
package
The rodeo
package was primarily made as a power-tool for research. For teaching, I built another package on top: rodeoEasy
. This package is not available on CRAN but can be installed and loaded like so:
install.packages("remotes") ### ONLY IF NOT YET INSTALLED
library("remotes")
::install_github("dkneis/rodeoEasy")
remoteslibrary("rodeoEasy")
Currently, the rodeoEasy
package provides the following set of functions:
build
: Imports the complete description of an ODE model from a workbook in ‘.xlsx’ or ‘.ods’ format.run.scenarios
: Runs a dynamic simulation for a model previously built withbuild
. It supports the comparison of multiple scenarios. Each scenario can have its own set of initial values and/or parameter values.
All functions are fully documented and demo examples are provided with the package (see the ‘examples’ sections in the documentation).
3 Demonstration of rodeoEasy
3.1 Oxygen sag model reloaded
In a previous lesson, we considered the aerobic decay of a substance with re-aeration from the atmosphere in a very (over)simplistic fashion. The system with the two state variables \(OM\) (organic matter) and \(DO\) (dissolved oxygen) could be described by the following ODE (Equation 2)
\[ \begin{aligned} \tfrac{d}{dt}OM &= -k_d \cdot OM \\ \tfrac{d}{dt}DO &= -k_d \cdot OM \cdot s + k_a \cdot \left( DO_{sat}(T) - DO \right) \end{aligned} \tag{2}\]
with the rate constant of degradation (\(k_d\)), the rate constant of re-aeration (\(k_a\)), the oxygen saturation level \(DO_{sat}\) being a function of temperature \(T\), and the stoichiometric coefficient \(s\).
After adoption of the recommended notation based on vector-matrix multiplication, the equations took the form of Equation 3 (note the T operators for vector transposition):
\[ \begin{bmatrix} \tfrac{d}{dt}OM \\ \tfrac{d}{dt}DO \end{bmatrix}^T = \begin{bmatrix} {\color{blue}k_d} \cdot OM \\ {\color{blue}k_a} \cdot \left( {\color{red}DO_{sat}}({\color{blue}T}) - DO \right) \end{bmatrix}^T \cdot \begin{bmatrix} {\color{gray}-1}, & {\color{blue}-s} \\ {\color{gray}0}, & {\color{gray}1} \end{bmatrix} \tag{3}\]
To highlight the different categories of items in Equation 3, I highlighted them by color (state variables in black, parameters in blue, functions in red, plain numbers in gray).
To implement the ODE from Equation 3 with the help of the rodeoEasy
package, we start with putting the vector of process rates and the stoichiometry matrix in a sheet of a spreadsheet workbook names eqns. The minimum required contents of that sheet is shown in Table 1.
description | name | unit | rate | OM | DO |
---|---|---|---|---|---|
degradation | deg | g OM / h | kd * OM | -1 | -s |
aeration | aer | g DO / h | ka * (DO_sat(T) - DO) | 0 | 1 |
Obviously, the process rate expressions end up in column ‘rate’ and minimal documentation has to be provided in the ‘description’ and ‘unit’ columns. The ‘name’ column is needed to access the value of the process rates in simulation output. Further, there must be one column for every state variable which holds the corresponding information of the stoichiometry matrix.
Although Table 1 fully implements the ODE right hand sides, some additional tables are needed by rodeoEasy
. These are used to declare the different categories of items appearing the math expressions (Table 2, Table 3, Table 4). Essentially, one such table must be provided for every color used in Equation 3 (except gray). The declaration is primarily needed to validate the math expressions but it also enforces a minimum documentation of the model. A model without documentation is essentially useless!
name | unit | description | default |
---|---|---|---|
OM | mg/l | organic matter | 1 |
DO | mg/l | dissolved oxygen | 10 |
name | unit | description | default |
---|---|---|---|
kd | 1/h | degradation rate constant | 1.00 |
ka | 1/h | aeration rate constant | 0.50 |
s | g DO / g OM | stoichiometry | 2.67 |
T | °C | temperature | 20.00 |
name | code |
---|---|
DO_sat | # O2 saturation in water (mg/L) as a function of temperature (°C) |
DO_sat | DO_sat <- function(T) { |
DO_sat | exp(7.712 - 1.314 * log(T + 45.93)) |
DO_sat | } |
Once these four sheets (‘eqns’, ‘vars’, ‘pars’, ‘funs’) have been filled, we are ready to go!
The code section below imports a workbook containing the four sheets (this is a demo workbook shipped with the rodeoEasy
package). After import, a dynamic simulation is performed with parameters and initial values taken from the ‘default’ columns of the sheets ‘pars’ and ‘vars’, respectively.
# set the workbook containing the model declaration
<- system.file("models/oxygen.xlsx", package="rodeoEasy")
workbook
# run the code generator to receive a model object
<- rodeoEasy::build(workbook)
model
# run a dynamic simulation, including basic plotting
<- rodeoEasy::run.scenarios(model, times=seq(0, 12, 0.1),
x scenarios=NULL
)
Very typically, we want to compare the behavior of a model for different initial values or parameters. The run.scenarios
function of rodeoEasy
supports this in a simple way (see R code below) and it also provides immediate visualization (Figure 2). The values of any parameters or variables not being explicitly set for a scenario remain at their respective defaults.
<- list(
myScenarios clean= c(OM=1), # 1st scenario
dirty= c(OM=10) # 2nd scenario
)<- rodeoEasy::run.scenarios(model, times=seq(0, 12, 0.1),
x scenarios=myScenarios, leg="bottomright"
)
If you look at Figure 2, you’ll notice that something weird happened. Obviously, the oxygen concentration turned negative in response to a high initial concentration of degradable matter. How could this happen?
If you look at the equations again (Table 1), you’ll notice that the process of degradation solely depend on the presence of organic matter but not at all on the availability of oxygen. Thus, in the model, degradation of organic matter can continue even after oxygen was fully depleted. In reality, this would only be possible if (1) bacteria switch to an anaerobic metabolism (if they can) or (2) the aerobic bacterial community experiences a shift towards anaerobic species (by selection). In any case, no more oxygen would be consumed once it was depleted to zero.
The solution: One would need to add a term to the model that accounts for oxygen limitation. We’ll learn about common terms to represent limitation in another lesson.
3.2 Epidemic model of the SEIR family
3.2.1 Background
The term SEIR model denotes a particular family of epidemiological models. They describe the dynamics of an infectious disease within a population consisting of four sub-groups (Table 5, Figure 3).
Group | Variable symbol | Description |
---|---|---|
Susceptible group | S | Individuals who did not acquire the disease yet and are thus considered susceptible |
Exposed group | E | Individuals who acquired the infection but who are not yet infectious. In those people, the pathogen undergoes incubation. |
Infected group | I | Individuals in an advance state of infection who can transmit the pathogen to others. |
Recovered group | R | Recovered individuals. They are assumed to be (and remain) immune against recurrent infection. |
The basic assumptions of behind SEIR models are described, e.g., in this article but also in many educational texts. Note that there are many variants of epidemiological models in addition to the simple SEIR. They either neglect aspects (e.g. the \(E\) group) or emphasize certain additional features like a gradual loss of immunity in the \(R\) group.
3.2.2 Implementation
A SEIR model is part of suite of demo models distributed with the rodeoEasy
package. To find the respective workbook on your local computer, type the R command
print(system.file("models/SEIR.xlsx", package="rodeoEasy"))
and open that file with the spreadsheet software of your choice. For convenience, the contents of the workbook is also displayed below (Table 6 to Table 9).
name | unit | description | rate | S | E | I | R |
---|---|---|---|---|---|---|---|
transm | ind./day | transmission | t * I/(S+E+I+R) * S | -1 | 1 | 0 | 0 |
infect | ind./day | infection | i * E | 0 | -1 | 1 | 0 |
recover | ind./day | recovery | r * I | 0 | 0 | -1 | 1 |
name | unit | description | default |
---|---|---|---|
S | individuals | susceptible | 8e+07 |
E | individuals | carrier, not yet infectious | 8e+05 |
I | individuals | infectious | 0e+00 |
R | individuals | recovered, immune | 0e+00 |
name | unit | description | default |
---|---|---|---|
t | 1/time | transmission rate constant | 1.0 |
i | 1/time | incubation rate constant | 0.2 |
r | 1/time | recovery rate constant | 0.4 |
name | code |
---|---|
dummy | dummy <- function() { NULL } |
The R code below triggers a simulation of the SEIR model using the default parameters and initial values. Note that visualization (Figure 4) is based on a few lines of custom code to arrange all variables in a single plot.
<- system.file("models/SEIR.xlsx", package="rodeoEasy")
workbook <- rodeoEasy::build(workbook)
model <- rodeoEasy::run.scenarios(model, times=0:90, scenarios=NULL,
x plot.vars=FALSE)
<- c("S","E","I","R") # could also use: model$namesVars()
varnames plot(range(x[,"time"]), range(x[,varnames]), type="n",
xlab="Days", ylab="Individuals")
for (vn in varnames) {
lines(x[,"time"], x[,vn], col=match(vn, varnames))
}legend("right", bty="n", lty=1, col=1:length(varnames),
legend=varnames)
As illustrated by Figure 4, the SEIR model as implemented above predicts an end of the epidemic once most of the individuals had been infected and went through recovery. However, we know by experience that immunity acquired in response to an infection does not necessarily last forever.
3.2.3 Excercise
Modify the above system of ODE to account for a gradual loss of immunity in the group of recovered individuals.
What are possible steady states of the original SEIR model and your extended version?
Play with the model parameters within reasonable bounds. Does the model run straight into steady state under all circumstances?
3.2.4 A possible solution
The extended model which accounts for the possibility of recurrent infections is known as a ‘SEIRS’ model. It is included as a demo examples in the rodeoEasy
package (see R code below) and a model output is shown in Figure 5. Note that I adjusted the values of some of the rate constants to make the epidemic a bit more dynamic and trigger damped oscillations.
<- system.file("models/SEIRS.xlsx", package="rodeoEasy")
workbook <- rodeoEasy::build(workbook)
model <- rodeoEasy::run.scenarios(model, times=0:365,
x scenarios=list(SEIR=c(s=0), SEIRS=c(s=0.01)),
plot.vars=TRUE, leg="bottomright")