| Response | Endpoint |
|---|---|
| Growth | Max. growth rate |
| Metabolic activity | mg O2 day-1 |
| Mobility | Migration distance |
| Survival | % dead after 24 h |
Living entities (cells, organisms) show physiological responses to chemicals
The level of the response is dependent on concentration
In ecotoxicology, we are concerned about harmful effects of toxins that reduce fitness or kill
In antimicrobial drug research, harmful effects are actually desired
Variable indicating fitness
Typical choices: growth, metabolic activity, mutation frequencies, …
Choice depends on organism, toxin, research goal
Response: Variable indicating fitness
Endpoint: A precisely defined, measurable outcome of an exposure test
| Response | Endpoint |
|---|---|
| Growth | Max. growth rate |
| Metabolic activity | mg O2 day-1 |
| Mobility | Migration distance |
| Survival | % dead after 24 h |
The response generally depends on (1) concentration and (2) exposure time
The time dimension can be eliminated by the choice of endpoints (example: % survived after 24 h)
Acute toxicity: Response to short term exposure, often at high dose
Chronic toxicity: Response to repeated or permanent exposure, often at lower dose
Concentration-response curves: Organisms are exposed to ambient concentrations
Dose-response curve: Refers to internal exposure (consumption of defined doses of toxic drugs)
Terms are not always used consistently
Environmental regulation: What is the maximum permissible concentrations of toxic chemicals?
Other regulation context: Food, Materials, …
Development of drugs or disinfectants: What level of exposure is needed?
Case A is encountered when working with clonal populations of (micro)organisms or cell lines
We expect a deterministic C-R relation
The corresponding discipline is known as pharmacodynamics
Case B may be encountered in field studies where individuals are drawn from sub-populations
Example: E. coli isolated from international patients
We need to use probabilistic approaches to deal with variability (genetic, epigenetic)
Case C may be encountered in environmental risk assessment
Tested organisms may include, for example, algae, zooplankton, and fish
See species sensitivity distributions (SSD) for details
the focus is on cases A and B, not C
we try to learn from examples
we process data ourselves and interpret outcomes
You need a working instance of the R software but no specific add-on packages
To run the examples …
place the mouse pointer over the code
click the copy symbol in the top-right corner
paste the code into a local script file
process that file (e.g., using “source”)
I studied the growth of ordinary baking yeast in plain sugar medium over about 2 days
NaCl concentration varied from 4 to 128 g/L
Two replicates per treatment
Yeast from the package likely forms a clonal population
Baking yeast, sugar, salt not shown
Each flask contains 20 mL salt solution and 15 mL yeast-sugar mix (35 mL total)
Yeast-sugar mix was prepared from 1000 mL desalinated water, 20 g cane sugar, 0.2 g dry yeast
Salt solution was prepared from desalinated water and NaCl to reach 4, 8, 16, 32, 64, 128 g/L in the final volume of 35 mL (highest stock had 224 mg/L)
Culture flasks and poor man’s photometer
Wave length of 605 nm (peak), path length is 45 mm
Resolution of about 1000 digits between full light and complete darkness
values <- c(
160,158,158,160,160,159,160,162,163,160,162,163,159,158,161,161,164,164,
167,166,165,167,169,169,160,159,162,166,169,167,171,169,171,168,171,172,
159,159,164,165,169,170,173,172,176,174,177,179,162,161,168,169,175,173,
179,178,179,181,185,186,162,161,170,170,174,175,180,179,184,183,190,191,
162,162,171,172,178,176,184,182,188,186,195,195,159,158,168,170,177,177,
184,183,188,187,195,197,159,159,170,170,180,179,186,187,193,191,200,198
)
salt <- 2^(7:2)
hours <- c(0, 3.5, 5.33, 13.5, 18, 22.33, 27.33, 39.66, 47.5)
x <- data.frame(
salt=rep(rep(salt, each=2), length(values) / 2 / length(salt)),
repl=rep(c("a","b"), length(values) / 2),
hour=rep(hours, each=2 * length(salt)),
value=values
)colorBySalt <- function(x) {
cl <- colorRampPalette(c("steelblue","khaki","tomato"))(6)
cl[log2(x)-1]
}
plot(x$hour, x$value, type="n", bty="n", log="y", xlab="Hours",
ylab="Optical density (605 nm)")
fn <- function(x) {
points(x[,"hour"], x[,"value"], pch=20, col=colorBySalt(x[,"salt"]))
lines(x[,"hour"], x[,"value"], col=colorBySalt(x[,"salt"]))
}
ignore <- by(x, x[,c("salt","repl")], fn)
legend("topleft", bty="n", ncol=2, pch=20, lty=1,
col=colorBySalt(unique(x[,"salt"])),
legend=unique(x[,"salt"]), title="NaCl (g/l)")Let’s use the maximum increase in OD as the response. We need a value for each treatment and replicate.
salt repl deltaOD
1 4 a 38
2 8 a 30
3 16 a 26
4 32 a 20
5 64 a 13
6 128 a 3
Model allows for inter- and extrapolation (e.g., we can find the salt concentration where \(\Delta\)OD = 0)
Model parameters can be compared across experiments, even if tested concentrations differ
Formulate a mathematical model: \(response = f(conc, parms)\)
Guess values for all unknown parameters (\(parms\))
Design an objective function to quantify the mismatch between observed and modeled response \(SSR = \sum_{i=1}^n (obs_i - mod_i)^2\)
Let an optimization algorithm find the values of \(parms\) which minimize \(SSR\)
nlsfit <- nls(deltaOD ~ a + b * salt^p, data=z,
start=c(a=0, b=1, p=1))
salt.show <- 0:150
pr <- data.frame(
salt=salt.show,
deltaOD.modeled=predict(fit,
newdata=data.frame(salt=salt.show))
)
plot(deltaOD.modeled ~ salt, data=pr,
type="l", log="x", bty="L",
xlab="NaCl (mg/L, log)",
ylab=expression(paste(Delta," OD")))
points(deltaOD ~ salt, data=z)SSR is implicitly used as the objective function
# intersection with x axis
saltAtRespZero <- with(as.list(coef(fit)), (-a/b)^(1/p))
# intersection with y axis
respAtSaltZero <- with(as.list(coef(fit)), a+b*0^p)
legend("topright", bty="n", title="Predicted extremes",
legend=c(
paste("Salt (response = 0): ", signif(saltAtRespZero, 3)),
paste("Response (salt = 0): ", signif(respAtSaltZero, 3))
))
Does the model fit the observations well?
Reasonable behavior at high concentrations?
Intersect with lower axis or convergence to zero?
Reasonable behavior at low concentrations?
Do we expect a non-zero intercept?
What if the medium contains no ions at all?
Let’s redefine the response
0 codes for “no inhibition”
1 codes for “fully inhibited”
A wrapper function to fit & display models
fit <- function(formula, start, data) {
x <- nls(formula, data=data, start=start)
salt.show <- seq(0.1*min(data[,"salt"]), 2.5*max(data[,"salt"]), by=1)
pr <- data.frame(salt=salt.show, inhib.mod=
pmax(0, pmin(1, predict(x, newdata=data.frame(salt=salt.show)))))
omar <- par("mar")
par(mar=c(4.2,4.2,1,1))
plot(inhib.mod ~ salt, data=pr, type="l", log="x", ylim=c(0,1),
xlab="NaCl (mg/L, log)", ylab="Inhibition")
points(inhib ~ salt, data=data, pch=20)
abline(h=c(0,1), lty=4:3, col=c("blue","red"))
legend(x=20, y=0.3, bty="n", pch=c(20,NA), lty=c(NA,1),
legend=c("Observed","Model"))
legend(x=.5, y=0.8, bty="n", title=as.character(as.expression(formula)),
legend=paste(names(coef(x)), signif(coef(x), 3), sep=" = "))
par(mar=omar)
}Fitting and visual comparison of 4 models
par(mfrow=c(2,2))
# power model
fit(inhib ~ a * salt^p + b, start=c(a=1, b=0, p=1), data=z)
# log-linear
fit(inhib ~ a * log(salt) + b, start=c(a=1, b=0), data=z)
# exponential
fit(inhib ~ 1 - exp(p*salt), start=c(p=1e-6), data=z)
# variant of the 'Hill model'
fit(inhib ~ salt^n / (salt^n + ec50^n), start=c(ec50=60, n=1), data=z)
par(mfrow=c(1,1))You want to fit empirical data by a model
You need to select the endpoint and a model equation
Proper choice requires knowledge about the mechanisms (chemistry, physiology, …)
Exposure time has been neglected so far
x <- data.frame(
drug = c(rep("Mecillinam",12),rep("Tetracycline",17),
rep("Trimethoprim",14)),
mg_per_L = c(5e-05,0.00011,0.00021,0.00043,0.00089,0.00186,
0.00406,0.00945,0.0162,0.0242,0.0336,0.0443,5e-04,0.0011,
0.0022,0.0044,0.0089,0.0186,0.0406,0.0945,0.162,0.242,
0.336,0.443,0.564,0.697,0.844,1,1.37,0.000319,0.00064,
0.00129,0.00261,0.00534,0.0112,0.0244,0.0567,0.097,0.145,
0.202,0.266,0.338,0.418),
mu_per_hour = c(1.24,1.21,1.21,1.18,1.22,1.2,1.23,1.24,1.22,
1.22,0.965,0.233,1.22,1.22,1.23,1.24,1.2,1.23,1.21,1.21,
1.06,0.934,0.812,0.64,0.547,0.438,0.305,0.291,0.182,1.22,
1.22,1.21,1.23,1.14,1.07,1.09,0.973,0.843,0.688,0.526,0.506,
0.374,0.292)
)Subset of data from Fig. 1 of Angermayr et al. (2022), DOI:10.15252/msb.202110490, CC BY 4.0 license.
# fits a model using 'nls' and adds the prediction to a plot
fit_single <- function(formula, data, start, col, label) {
f <- nls(formula=formula, data=data, start=start)
concs <- 10^(-6:6)
concs <- sort(rep(concs, each=100) * rep((1:100)/100, length(concs)))
pr <- data.frame(C=concs, R=predict(f, newdata=data.frame(C=concs)))
lines(R ~ C, data=pr, col=col)
data.frame(label=label, col=col, npar=length(start),
aic=signif(AIC(f),3), formula=as.character(as.expression(formula)))
}# fits multiple models to data of a single drug
fit_multiple <- function(x) {
plot(R ~ C, data=x, bty="L", log="x", pch=20, col="grey",
xlim=range(x[,"C"])*c(0.75, 2), ylim=c(0,1),
xlab=substitute(paste(a," (mg/L)"), list(a=unique(x[,"drug"]))),
ylab=expression(paste("R = ",mu / mu[0]," (-)")))
out <- rbind(
fit_single(R ~ 1 - (C / mic), data=x,
start=c(mic=max(x[,"C"])), col="steelblue4", label="Linear"),
fit_single(R ~ 1 - (C / mic)^b, data=x,
start=c(mic=max(x[,"C"]), b=1), col="darkorange", label="Power"),
fit_single(R ~ ec50^n / (ec50^n + C^n), data=x,
start=c(ec50=0.5*max(x[,"C"]), n=1), col="red", label="Hill"))
out <- out[order(out$aic),]
legend("left", bty="n", legend=paste0(out$label,": ",
"AIC = ",out$aic," (",out$npar," par.)"), text.col=out$col)
cbind(drug=unique(x[,"drug"]), out)
}# set up plot layout
n <- length(unique(x[,"drug"]))
nc <- 2
nr <- ceiling(n/nc)
omar <- par("mar")
par(mfrow=c(nr, nc), mar=c(4.5, 4.7, 1.5, 1))
# hard work happens here
stats <- do.call(rbind, by(x, x[,"drug"], fit_multiple))
# add legend
plot(0:1, 0:1, type="n", axes=F, ann=F)
stats <- stats[stats[,"drug"] == "Tetracycline",]
legend(x=0, y=1, bty="n",
legend=paste(stats[,"label"], stats[,"formula"], sep=": "))
par(mfrow=c(1,1), mar=omar)Different model structures may work
Interpretation of parameters differs
Trade-off between goodness of fit and model complexity
AIC helps to compare models
AIC cannot substitute physiological knowledge
What is a realistic behavior close to MIC ?
Can we trust data close to extremes ?
Empirical models are just mathematical structures which reproduce the shape of the C-R curve.
Mechanistic models are rooted in physiology.
The Hill equation has a mechanistic basis:
Toxin binds to a receptor
Response is proportional to occupancy of receptors
See this paper together with these notes for details
Hormesis is the positive response of an organisms to low doses of a toxic substance (stimulation)
Hormesis is not homeopathy
The phenomenon has been heavily debated
Shoot weight of lettuce grown with isobutyl alcohol
Data from Ewijk & Hoekstra, 1993
… because chemicals may be essential and toxic at the same time
Example:
Let plants grow without a nitrogen source
Moderate addition of NH4+ promotes growth
However, too much NH4+ is detrimental (see, e.g., Britto & Kronzucker, 2002)
… because chemicals trigger various responses (a) in a test organism and (b) its ecosystems
Example:
A root microbiome of symbionts and parasites
Let the plant be treated with antibiotic
If parasites are particularly sensitive, the plant profits from low-dose treatment
At higher doses, even symbionts get killed …
Bacterial infections are treated with antibiotics
MIC: Minimum inhibitory concentration
MICs are recorded for many drug-bug combinations
Over 106 values in MIC repository of EUCAST
# Subset of EUCAST MIC data (snapshot from 2024),
# Concentrations in mg/L, coded as log2 (2^-9 ... 2^9)
x <- rbind(
data.frame(
case = "S. enterica on Cefotaxime", # S. = Salmonella
conc = -9:9,
count = c(0,0,0,0,4,655,1520,183,20,5,0,0,1,1,3,0,0,0,0)
),
data.frame(
case = "K. pneumoniae on Meropenem", # K. = Klebsiella
conc = -9:9,
count = c(0,0,211,948,2896,11318,520,220,135,90,56,41,31,32,45,5,4,1,0)
)
)plot_hist <- function(x) {
x <- x[order(x[,"conc"]),]
z <- barplot(x[,"count"], xlab="log2 MIC (mg/L)", ylab="Count")
axis(1, at=z, labels=x[,"conc"], lwd=0, las=2)
legend("right", bty="n", legend=
paste(gsub(unique(x[,"case"]), pattern=" on ",
replacement="\non "), sep="\n"))
}
par(mfcol=c(1,2))
ignore <- by(x, x[,"case"], plot_hist)
par(mfcol=c(1,1))Reasons for variation
Genomic variation
Plasmids
Gene expression
Uncertain measurements
Probability is defined as \(P(X <= x)\)
plot_emp <- function(x) {
plot(prob ~ conc, data=x, pch=20, col="tomato", cex=1.5,
bty="L", las=2, xaxt="n",
xlab=" MIC (mg/L)", ylab="P(X <= MIC)")
axis(1, at=-9:9, labels=signif(2^(-9:9),3), las=2)
abline(h=c(0,1), lty=3, col="grey")
text(x[,"conc"], x[,"prob"], x[,"count"],
adj=c(1.75, 0.5), xpd=T, srt=-45, cex=0.8, col="grey50")
legend("right", bty="n", legend=
paste(gsub(unique(x[,"case"]), pattern=" on ",
replacement="\non "), sep="\n"))
}
par(mfcol=c(1,2))
ignore <- by(x, x[,"case"], plot_emp)
par(mfcol=c(1,1))# Fit a logistic regression model with 'glm'
fit_logi <- function(x) {
fit <- glm(prob ~ conc, data=x,
family=quasibinomial(link="logit"))
c_pred <- seq(-9, 9, .1)
pr <- data.frame(conc=c_pred,
prob=predict(fit, newdata=data.frame(conc=c_pred),
type="response"))
lines(prob ~ c_pred, data=pr)
}
par(mfcol=c(1,2))
fn <- function(x) { plot_emp(x); fit_logi(x) }
ignore <- by(x, x[,"case"], fn)
par(mfcol=c(1,1))Logistic regression appears to perform well but it may not be the method of first choice
Let’s fit a probability distribution model instead
Distribution fitting uses maximum likelihood estimation (MLE) instead of least-squares
fit_ML <- function(x) {
objfunc <- function(p, x) {
obs <- rep(x[,"conc"], times=x[,"count"])
-1*sum(log(dnorm(obs, mean=p["m"], sd=p["s"])))
}
fit <- optim(par=c(m=.1, s=1), fn=objfunc, x=x)
stopifnot(fit$convergence==0)
c_pred <- seq(-9, 9, .1)
lines(c_pred, pnorm(q=c_pred, mean=fit$par["m"],
sd=fit$par["s"]))
}
par(mfcol=c(1,2))
fn <- function(x) { plot_emp(x); fit_ML(x) }
ignore <- by(x, x[,"case"], fn)
par(mfcol=c(1,1))Substantial intra-species variation of MICs
Variation is partly explained by biology
Another (unknown) part of the variation is due to measurement uncertainty
MICs of susceptible strains often follow a log-normal distribution
Different cases require different concepts
Do not blindly follow what AI proposes
Only use concepts you understand
R-package “drc”: High-level methods for concentration-response curve fitting
Ritz C. (2010): Toward a unified approach to dose–response modeling in ecotoxicology. Environ. Toxicology & Chemistry, 29 (1), 220–229
Gabrielsson J. & Weiner D.: Pharmacokinetic and Pharmacodynamic Data Analysis (Textbook in various editions)