Concentration-Response Curves

David Kneis

Institute of Hydrobiology, TU Dresden, DE

Introduction

Concentration-response curves

  • Living entities (cells, organisms) show physiological responses to chemicals

  • The level of the response is dependent on concentration

  • In ecotoxicology, we are concerned about harmful effects of toxins that reduce fitness or kill

  • In antimicrobial drug research, harmful effects are actually desired

Responses

  • Variable indicating fitness

  • Typical choices: growth, metabolic activity, mutation frequencies, …

  • Choice depends on organism, toxin, research goal

Endpoint

  • Response: Variable indicating fitness

  • Endpoint: A precisely defined, measurable outcome of an exposure test

Response Endpoint
Growth Max. growth rate
Metabolic activity mg O2 day-1
Mobility Migration distance
Survival % dead after 24 h

Scaling of responses (I)

Scaling of responses (II)

Temporal dimension

  • The response generally depends on (1) concentration and (2) exposure time

  • The time dimension can be eliminated by the choice of endpoints (example: % survived after 24 h)

  • Acute toxicity: Response to short term exposure, often at high dose

  • Chronic toxicity: Response to repeated or permanent exposure, often at lower dose

Language: C-R vs. D-R curves

  • Concentration-response curves: Organisms are exposed to ambient concentrations

  • Dose-response curve: Refers to internal exposure (consumption of defined doses of toxic drugs)

  • Terms are not always used consistently

Relevance of C-R curves

  • Environmental regulation: What is the maximum permissible concentrations of toxic chemicals?

  • Other regulation context: Food, Materials, …

  • Development of drugs or disinfectants: What level of exposure is needed?

Classification by tested organism

Classification by tested organism

  • Case A is encountered when working with clonal populations of (micro)organisms or cell lines

  • We expect a deterministic C-R relation

  • The corresponding discipline is known as pharmacodynamics

Classification by tested organism

  • Case B may be encountered in field studies where individuals are drawn from sub-populations

  • Example: E. coli isolated from international patients

  • We need to use probabilistic approaches to deal with variability (genetic, epigenetic)

Classification by tested organism

In this course …

  • the focus is on cases A and B, not C

  • we try to learn from examples

  • we process data ourselves and interpret outcomes

How to run example codes

  • You need a working instance of the R software but no specific add-on packages

  • To run the examples …

    • place the mouse pointer over the code

    • click the copy symbol in the top-right corner

    • paste the code into a local script file

    • process that file (e.g., using “source”)

PART A: Deterministic C-R curves

Example A.1: Yeast and salt

Study system

  • I studied the growth of ordinary baking yeast in plain sugar medium over about 2 days

  • NaCl concentration varied from 4 to 128 g/L

  • Two replicates per treatment

  • Yeast from the package likely forms a clonal population

Study system

Baking yeast, sugar, salt not shown

Recipe

  • Each flask contains 20 mL salt solution and 15 mL yeast-sugar mix (35 mL total)

  • Yeast-sugar mix was prepared from 1000 mL desalinated water, 20 g cane sugar, 0.2 g dry yeast

  • Salt solution was prepared from desalinated water and NaCl to reach 4, 8, 16, 32, 64, 128 g/L in the final volume of 35 mL (highest stock had 224 mg/L)

Monitoring of optical density

Culture flasks and poor man’s photometer

Monitoring of optical density

Wave length of 605 nm (peak), path length is 45 mm

Resolution of about 1000 digits between full light and complete darkness

Data set in compressed format

values <- c(
  160,158,158,160,160,159,160,162,163,160,162,163,159,158,161,161,164,164,
  167,166,165,167,169,169,160,159,162,166,169,167,171,169,171,168,171,172,
  159,159,164,165,169,170,173,172,176,174,177,179,162,161,168,169,175,173,
  179,178,179,181,185,186,162,161,170,170,174,175,180,179,184,183,190,191,
  162,162,171,172,178,176,184,182,188,186,195,195,159,158,168,170,177,177,
  184,183,188,187,195,197,159,159,170,170,180,179,186,187,193,191,200,198
)
salt <- 2^(7:2)
hours <- c(0, 3.5, 5.33, 13.5, 18, 22.33, 27.33, 39.66, 47.5)

x <- data.frame(
  salt=rep(rep(salt, each=2), length(values) / 2 / length(salt)),
  repl=rep(c("a","b"), length(values) / 2),
  hour=rep(hours, each=2 * length(salt)),
  value=values
)

First and last records

print(head(x))
  salt repl hour value
1  128    a    0   160
2  128    b    0   158
3   64    a    0   158
4   64    b    0   160
5   32    a    0   160
6   32    b    0   159
print(tail(x))
    salt repl hour value
103   16    a 47.5   186
104   16    b 47.5   187
105    8    a 47.5   193
106    8    b 47.5   191
107    4    a 47.5   200
108    4    b 47.5   198

Graphical summary

colorBySalt <- function(x) {
  cl <- colorRampPalette(c("steelblue","khaki","tomato"))(6)
  cl[log2(x)-1]
}

plot(x$hour, x$value, type="n", bty="n", log="y", xlab="Hours",
  ylab="Optical density (605 nm)")
fn <- function(x) {
  points(x[,"hour"], x[,"value"], pch=20, col=colorBySalt(x[,"salt"]))
  lines(x[,"hour"], x[,"value"], col=colorBySalt(x[,"salt"]))
}
ignore <- by(x, x[,c("salt","repl")], fn)
legend("topleft", bty="n", ncol=2, pch=20, lty=1,
  col=colorBySalt(unique(x[,"salt"])),
  legend=unique(x[,"salt"]), title="NaCl (g/l)")

Graphical summary

Computing end points

Let’s use the maximum increase in OD as the response. We need a value for each treatment and replicate.

z <- aggregate(
  x = list(deltaOD=x[,"value"]),
  by = x[,c("salt","repl")],
  FUN = function(u) {
    diff(range(u)) }
)

# pipe-based alternative code
# library("dplyr")
# z <- x %>%
#   group_by(salt, repl) %>%
#   summarise(deltaOD =
#     diff(range(value)))   
# z <- as.data.frame(z)

print(head(z))
  salt repl deltaOD
1    4    a      38
2    8    a      30
3   16    a      26
4   32    a      20
5   64    a      13
6  128    a       3

Empirical concentration-response curve

plot(deltaOD ~ salt, data=z,
  log="x",
  xlab="NaCl (mg/L, log)",
  ylab=expression(paste(
    Delta," OD")),
  bty="L"
)

Fitting a model: Why ?

  • Model allows for inter- and extrapolation (e.g., we can find the salt concentration where \(\Delta\)OD = 0)

  • Model parameters can be compared across experiments, even if tested concentrations differ

Fitting a model: How ?

  • Formulate a mathematical model: \(response = f(conc, parms)\)

  • Guess values for all unknown parameters (\(parms\))

  • Design an objective function to quantify the mismatch between observed and modeled response \(SSR = \sum_{i=1}^n (obs_i - mod_i)^2\)

  • Let an optimization algorithm find the values of \(parms\) which minimize \(SSR\)

Fit a power model with nls

fit <- nls(deltaOD ~ a + b * salt^p, data=z,
  start=c(a=0, b=1, p=1))

salt.show <- 0:150
pr <- data.frame(
  salt=salt.show,
  deltaOD.modeled=predict(fit,
    newdata=data.frame(salt=salt.show))
)

plot(deltaOD.modeled ~ salt, data=pr,
  type="l", log="x", bty="L",
  xlab="NaCl (mg/L, log)",
  ylab=expression(paste(Delta," OD")))
points(deltaOD ~ salt, data=z)

SSR is implicitly used as the objective function

Fitted power model

Using the model for extrapolation

# intersection with x axis
saltAtRespZero <- with(as.list(coef(fit)), (-a/b)^(1/p))

# intersection with y axis
respAtSaltZero <- with(as.list(coef(fit)), a+b*0^p)

legend("topright", bty="n", title="Predicted extremes",
  legend=c(
  paste("Salt (response = 0): ", signif(saltAtRespZero, 3)),
  paste("Response (salt = 0): ", signif(respAtSaltZero, 3))
))

Using the model for extrapolation

Checking plausibility (1 / 3)


Does the model fit the observations well?

Checking plausibility (2 / 3)


  • Reasonable behavior at high concentrations?

  • Intersect with lower axis or convergence to zero?

Checking plausibility (3 / 3)


  • Reasonable behavior at low concentrations?

  • Do we expect a non-zero intercept?

  • What if the medium contains no ions at all?

Extrapolation on the left?

More about extrapolation

Let’s redefine the response

z <- cbind(z, inhib = 1 - z[,"deltaOD"] / max(z[,"deltaOD"]))
  • 0 codes for “no inhibition”

  • 1 codes for “fully inhibited”

More about extrapolation

A wrapper function to fit & display models

fit <- function(formula, start, data) {
  x <- nls(formula, data=data, start=start)
  salt.show <- seq(0.1*min(data[,"salt"]), 2.5*max(data[,"salt"]), by=1)
  pr <- data.frame(salt=salt.show, inhib.mod=
      pmax(0, pmin(1, predict(x, newdata=data.frame(salt=salt.show)))))
  omar <- par("mar")
  par(mar=c(4.2,4.2,1,1))
  plot(inhib.mod ~ salt, data=pr, type="l", log="x", ylim=c(0,1),
    xlab="NaCl (mg/L, log)", ylab="Inhibition")
  points(inhib ~ salt, data=data, pch=20)
  abline(h=c(0,1), lty=4:3, col=c("blue","red"))
  legend(x=20, y=0.3, bty="n", pch=c(20,NA), lty=c(NA,1),
    legend=c("Observed","Model"))
  legend(x=.5, y=0.8, bty="n", title=as.character(as.expression(formula)),
    legend=paste(names(coef(x)), signif(coef(x), 3), sep=" = "))
  par(mar=omar)
}

More about extrapolation

Fitting and visual comparison of 4 models

par(mfrow=c(2,2))

# power model
fit(inhib ~ a * salt^p + b, start=c(a=1, b=0, p=1), data=z)

# log-linear
fit(inhib ~ a * log(salt) + b, start=c(a=1, b=0), data=z)

# exponential
fit(inhib ~ 1 - exp(p*salt), start=c(p=1e-6), data=z)

# variant of the 'Hill model'
fit(inhib ~ salt^n / (salt^n + ec50^n), start=c(ec50=60, n=1), data=z)

par(mfrow=c(1,1))

Discrepancies w.r.t. extrapolation

Summary

  • You want to fit empirical data by a model

  • You need to select the endpoint and a model equation

  • Proper choice requires knowledge about the mechanisms (chemistry, physiology, …)

  • Exposure time has been neglected so far

Example A.2: Bacterial growth with antibiotics

Data set

x <- data.frame(
  drug = c(rep("Mecillinam",12),rep("Tetracycline",17),
    rep("Trimethoprim",14)),
  mg_per_L = c(5e-05,0.00011,0.00021,0.00043,0.00089,0.00186,
    0.00406,0.00945,0.0162,0.0242,0.0336,0.0443,5e-04,0.0011,
    0.0022,0.0044,0.0089,0.0186,0.0406,0.0945,0.162,0.242,
    0.336,0.443,0.564,0.697,0.844,1,1.37,0.000319,0.00064,
    0.00129,0.00261,0.00534,0.0112,0.0244,0.0567,0.097,0.145,
    0.202,0.266,0.338,0.418),
  mu_per_hour = c(1.24,1.21,1.21,1.18,1.22,1.2,1.23,1.24,1.22,
    1.22,0.965,0.233,1.22,1.22,1.23,1.24,1.2,1.23,1.21,1.21,
    1.06,0.934,0.812,0.64,0.547,0.438,0.305,0.291,0.182,1.22,
    1.22,1.21,1.23,1.14,1.07,1.09,0.973,0.843,0.688,0.526,0.506,
    0.374,0.292)
)

Subset of data from Fig. 1 of Angermayr et al. (2022), DOI:10.15252/msb.202110490, CC BY 4.0 license.

Normalizing the response

# Scales the growth rates to range 0 ... 1

normalize <- function(x) {
  data.frame(
    drug= unique(x[,"drug"]),
    C= x[,"mg_per_L"],
    R= x[,"mu_per_hour"] / x[which.min(x[,"mg_per_L"]), "mu_per_hour"]
  )
}
x <- do.call(rbind, by(x, x[,"drug"], normalize))

Helper function 1

# fits a model using 'nls' and adds the prediction to a plot

fit_single <- function(formula, data, start, col, label) {
  f <- nls(formula=formula, data=data, start=start)
  concs <- 10^(-6:6)
  concs <- sort(rep(concs, each=100) * rep((1:100)/100, length(concs)))
  pr <- data.frame(C=concs, R=predict(f, newdata=data.frame(C=concs)))
  lines(R ~ C, data=pr, col=col)
  data.frame(label=label, col=col, npar=length(start),
    aic=signif(AIC(f),3), formula=as.character(as.expression(formula)))
}

Helper function 2

# fits multiple models to data of a single drug
fit_multiple <- function(x) {
  plot(R ~ C, data=x, bty="L", log="x", pch=20, col="grey",
    xlim=range(x[,"C"])*c(0.75, 2), ylim=c(0,1),
    xlab=substitute(paste(a," (mg/L)"), list(a=unique(x[,"drug"]))),
    ylab=expression(paste("R = ",mu / mu[0]," (-)")))
  out <- rbind(
    fit_single(R ~ 1 - (C / mic), data=x,
      start=c(mic=max(x[,"C"])), col="steelblue4", label="Linear"),
    fit_single(R ~ 1 - (C / mic)^b, data=x,
      start=c(mic=max(x[,"C"]), b=1), col="darkorange", label="Power"),
    fit_single(R ~ ec50^n / (ec50^n + C^n), data=x,
      start=c(ec50=0.5*max(x[,"C"]), n=1), col="red", label="Hill"))
  out <- out[order(out$aic),]
  legend("left", bty="n", legend=paste0(out$label,": ",
      "AIC = ",out$aic," (",out$npar," par.)"), text.col=out$col)
  cbind(drug=unique(x[,"drug"]), out)
}

Fit all cases

# set up plot layout
n <- length(unique(x[,"drug"]))
nc <- 2
nr <- ceiling(n/nc)
omar <- par("mar")
par(mfrow=c(nr, nc), mar=c(4.5, 4.7, 1.5, 1))

# hard work happens here
stats <- do.call(rbind, by(x, x[,"drug"], fit_multiple))

# add legend
plot(0:1, 0:1, type="n", axes=F, ann=F)
stats <- stats[stats[,"drug"] == "Tetracycline",]
legend(x=0, y=1, bty="n", 
  legend=paste(stats[,"label"], stats[,"formula"], sep=": "))
par(mfrow=c(1,1), mar=omar)

Result

Discussion

  • Different model structures may work

  • Interpretation of parameters differs

  • Trade-off between goodness of fit and model complexity

  • AIC helps to compare models

Discussion

  • AIC cannot substitute physiological knowledge

  • What is a realistic behavior close to MIC ?

  • Can we trust data close to extremes ?

Empirical vs. mechanistic models

Empirical vs. mechanistic

  • Empirical models are just mathematical structures which reproduce the shape of the C-R curve.

  • Mechanistic models are rooted in physiology.

  • The Hill equation has a mechanistic basis:

    1. Toxin binds to a receptor

    2. Response is proportional to occupancy of receptors

  • See this paper together with these notes for details

Concentration-response curves with peaks

Hormesis

  • Hormesis is the positive response of an organisms to low doses of a toxic substance (stimulation)

  • Hormesis is not homeopathy

  • The phenomenon has been heavily debated

Hormesis

Shoot weight of lettuce grown with isobutyl alcohol

x <- read.table(header=T, sep="", text="
conc repl1  repl2
0    1.126  0.833
0.32 1.096  1.106
1    1.163  1.336
3.2  0.985  0.754
10   0.710  0.683
32   0.560  0.488
100  0.375  0.344
")

Data from Ewijk & Hoekstra, 1993

Hormesis

# turn into long table
x <- with(x, data.frame(
  conc=rep(conc, 2),
  repl=rep(1:2, each=nrow(x)),
  value=c(repl1, repl2)
))

plot(value ~ conc,
  data=x,
  log="x",
  xlab="Concentration (mg/L)",
  ylab="Shoot weight",
  bty="L"
)

Peaks in C-R curves aren’t surprising

… because chemicals may be essential and toxic at the same time


Example:

  • Let plants grow without a nitrogen source

  • Moderate addition of NH4+ promotes growth

  • However, too much NH4+ is detrimental (see, e.g., Britto & Kronzucker, 2002)

Peaks in C-R curves aren’t surprising

… because chemicals trigger various responses (a) in a test organism and (b) its ecosystems


Example:

  • A root microbiome of symbionts and parasites

  • Let the plant be treated with antibiotic

  • If parasites are particularly sensitive, the plant profits from low-dose treatment

  • At higher doses, even symbionts get killed …

PART B: Probabilistic C-R curves

MIC data of bacterial pathogens

  • Bacterial infections are treated with antibiotics

  • MIC: Minimum inhibitory concentration

  • MICs are recorded for many drug-bug combinations

  • Over 106 values in MIC repository of EUCAST

Subset of MIC repository

# Subset of EUCAST MIC data (snapshot from 2024),
# Concentrations in mg/L, coded as log2 (2^-9 ... 2^9)

x <- rbind(
  data.frame(
   case = "S. enterica on Cefotaxime",   # S. = Salmonella
   conc = -9:9,
   count = c(0,0,0,0,4,655,1520,183,20,5,0,0,1,1,3,0,0,0,0)
  ),
  data.frame(
   case = "K. pneumoniae on Meropenem",  # K. = Klebsiella
   conc = -9:9,
   count = c(0,0,211,948,2896,11318,520,220,135,90,56,41,31,32,45,5,4,1,0)
  )
)

MIC distributions as histograms

plot_hist <- function(x) {
  x <- x[order(x[,"conc"]),]
  z <- barplot(x[,"count"], xlab="log2 MIC (mg/L)", ylab="Count")
  axis(1, at=z, labels=x[,"conc"], lwd=0, las=2)
  legend("right", bty="n", legend=
    paste(gsub(unique(x[,"case"]), pattern=" on ",
      replacement="\non "), sep="\n"))
}

par(mfcol=c(1,2))
ignore <- by(x, x[,"case"], plot_hist)
par(mfcol=c(1,1))

MIC distributions as histograms

MIC distributions as histograms

Reasons for variation

  • Genomic variation

  • Plasmids

  • Gene expression

  • Uncertain measurements

Empirical probabilities

Probability is defined as \(P(X <= x)\)

# Add empirical probabilities; assuming that 50% of the counts
# in each bin occur below/above the nominal concentration

add_probs <- function(x) {
  x <- x[order(x[,"conc"]),]
  cbind(x,
    prob= with(x, (cumsum(count) - count/2) / sum(count))
  )
}

x <- do.call(rbind, by(x, x[,"case"], add_probs))

Empirical probabilities

plot_emp <- function(x) {
  plot(prob ~ conc, data=x, pch=20, col="tomato", cex=1.5,
    bty="L", las=2, xaxt="n",
    xlab="              MIC (mg/L)", ylab="P(X <= MIC)")
  axis(1, at=-9:9, labels=signif(2^(-9:9),3), las=2)
  abline(h=c(0,1), lty=3, col="grey")
  text(x[,"conc"], x[,"prob"], x[,"count"],
    adj=c(1.75, 0.5), xpd=T, srt=-45, cex=0.8, col="grey50")
  legend("right", bty="n", legend=
    paste(gsub(unique(x[,"case"]), pattern=" on ",
      replacement="\non "), sep="\n"))
}

par(mfcol=c(1,2))
ignore <- by(x, x[,"case"], plot_emp)
par(mfcol=c(1,1))

Empirical probabilities

1st attempt: Logistic regression

# Fit a logistic regression model with 'glm'

fit_logi <- function(x) {
  fit <- glm(prob ~ conc, data=x,
    family=quasibinomial(link="logit"))
  c_pred <- seq(-9, 9, .1)
  pr <- data.frame(conc=c_pred,
    prob=predict(fit, newdata=data.frame(conc=c_pred),
    type="response"))
  lines(prob ~ c_pred, data=pr)
}

par(mfcol=c(1,2))
fn <- function(x) { plot_emp(x); fit_logi(x) }
ignore <- by(x, x[,"case"], fn)
par(mfcol=c(1,1))

1st attempt: Logistic regression

Other options

  • Logistic regression appears to perform well but it may not be the method of first choice

  • Let’s fit a probability distribution model instead

  • Distribution fitting uses maximum likelihood estimation (MLE) instead of least-squares

2nd attempt: Distribution fitting

fit_ML <- function(x) {
  objfunc <- function(p, x) {
    obs <- rep(x[,"conc"], times=x[,"count"])
    -1*sum(log(dnorm(obs, mean=p["m"], sd=p["s"])))
  }
  fit <- optim(par=c(m=.1, s=1), fn=objfunc, x=x)
  stopifnot(fit$convergence==0)
  c_pred <- seq(-9, 9, .1)
  lines(c_pred, pnorm(q=c_pred, mean=fit$par["m"],
    sd=fit$par["s"]))
}

par(mfcol=c(1,2))
fn <- function(x) { plot_emp(x); fit_ML(x) }
ignore <- by(x, x[,"case"], fn)
par(mfcol=c(1,1))

2nd attempt: Distribution fitting

Summary

  • Substantial intra-species variation of MICs

  • Variation is partly explained by biology

  • Another (unknown) part of the variation is due to measurement uncertainty

  • MICs of susceptible strains often follow a log-normal distribution

Final notes

C-R analysis may be tricky

  • Different cases require different concepts

  • Do not blindly follow what AI proposes

  • Only use concepts you understand

Selected sources

  • R-package “drc”: High-level methods for concentration-response curve fitting

  • Ritz C. (2010): Toward a unified approach to dose–response modeling in ecotoxicology. Environ. Toxicology & Chemistry, 29 (1), 220–229

  • Gabrielsson J. & Weiner D.: Pharmacokinetic and Pharmacodynamic Data Analysis (Textbook in various editions)