Concentration-Response Curves

David Kneis

firstname.lastname @ tu-dresden.de

Institute of Hydrobiology, TU Dresden, DE

Introduction

Concentration-response curves

Living entities (cells, organisms) show physiological responses to chemicals
The level of the response is dependent on concentration
In ecotoxicology, we are concerned about harmful effects of toxins that reduce fitness or kill
In antimicrobial drug research, harmful effects are actually desired

Responses

Variable indicating fitness
Typical choices: growth, metabolic activity, mutation frequencies, …
Choice depends on organism, toxin, research goal

Endpoint

Response: Variable indicating fitness
Endpoint: A precisely defined, measurable outcome of an exposure test

Response	Endpoint
Growth	Max. growth rate
Metabolic activity	mg O₂ day^-1
Mobility	Migration distance
Survival	% dead after 24 h

Scaling of responses (I)

Scaling of responses (II)

Temporal dimension

The response generally depends on (1) concentration and (2) exposure time
The time dimension can be eliminated by the choice of endpoints (example: % survived after 24 h)
Acute toxicity: Response to short term exposure, often at high dose
Chronic toxicity: Response to repeated or permanent exposure, often at lower dose

Language: C-R vs. D-R curves

Concentration-response curves: Organisms are exposed to ambient concentrations
Dose-response curve: Refers to internal exposure (consumption of defined doses of toxic drugs)
Terms are not always used consistently

Relevance of C-R curves

Environmental regulation: What is the maximum permissible concentrations of toxic chemicals?
Other regulation context: Food, Materials, …
Development of drugs or disinfectants: What level of exposure is needed?

Classification by tested organism

Case A is encountered when working with clonal populations of (micro)organisms or cell lines
We expect a deterministic C-R relation
The corresponding discipline is known as pharmacodynamics

Classification by tested organism

Case B may be encountered in field studies where individuals are drawn from sub-populations
Example: E. coli isolated from international patients
We need to use probabilistic approaches to deal with variability (genetic, epigenetic)

Classification by tested organism

Case C may be encountered in environmental risk assessment
Tested organisms may include, for example, algae, zooplankton, and fish
See species sensitivity distributions (SSD) for details

In this course …

the focus is on cases A and B, not C
we try to learn from examples
we process data ourselves and interpret outcomes

How to run example codes

You need a working instance of the R software but no specific add-on packages
To run the examples …
- place the mouse pointer over the code
- click the copy symbol in the top-right corner
- paste the code into a local script file
- process that file (e.g., using “source”)

PART A: Deterministic C-R curves

Example A.1: Yeast and salt

Study system

I studied the growth of ordinary baking yeast in plain sugar medium over about 2 days
NaCl concentration varied from 4 to 128 g/L
Two replicates per treatment
Yeast from the package likely forms a clonal population

Study system

Baking yeast, sugar, salt not shown

Recipe

Each flask contains 20 mL salt solution and 15 mL yeast-sugar mix (35 mL total)
Yeast-sugar mix was prepared from 1000 mL desalinated water, 20 g cane sugar, 0.2 g dry yeast
Salt solution was prepared from desalinated water and NaCl to reach 4, 8, 16, 32, 64, 128 g/L in the final volume of 35 mL (highest stock had 224 mg/L)

Monitoring of optical density

Culture flasks and poor man’s photometer

Monitoring of optical density

Wave length of 605 nm (peak), path length is 45 mm

Resolution of about 1000 digits between full light and complete darkness

Data set in compressed format

values <- c(
  160,158,158,160,160,159,160,162,163,160,162,163,159,158,161,161,164,164,
  167,166,165,167,169,169,160,159,162,166,169,167,171,169,171,168,171,172,
  159,159,164,165,169,170,173,172,176,174,177,179,162,161,168,169,175,173,
  179,178,179,181,185,186,162,161,170,170,174,175,180,179,184,183,190,191,
  162,162,171,172,178,176,184,182,188,186,195,195,159,158,168,170,177,177,
  184,183,188,187,195,197,159,159,170,170,180,179,186,187,193,191,200,198
)
salt <- 2^(7:2)
hours <- c(0, 3.5, 5.33, 13.5, 18, 22.33, 27.33, 39.66, 47.5)

x <- data.frame(
  salt=rep(rep(salt, each=2), length(values) / 2 / length(salt)),
  repl=rep(c("a","b"), length(values) / 2),
  hour=rep(hours, each=2 * length(salt)),
  value=values
)

First and last records

print(head(x))

  salt repl hour value
1  128    a    0   160
2  128    b    0   158
3   64    a    0   158
4   64    b    0   160
5   32    a    0   160
6   32    b    0   159

print(tail(x))

    salt repl hour value
103   16    a 47.5   186
104   16    b 47.5   187
105    8    a 47.5   193
106    8    b 47.5   191
107    4    a 47.5   200
108    4    b 47.5   198

Graphical summary

colorBySalt <- function(x) {
  cl <- colorRampPalette(c("steelblue","khaki","tomato"))(6)
  cl[log2(x)-1]
}

plot(x$hour, x$value, type="n", bty="n", log="y", xlab="Hours",
  ylab="Optical density (605 nm)")
fn <- function(x) {
  points(x[,"hour"], x[,"value"], pch=20, col=colorBySalt(x[,"salt"]))
  lines(x[,"hour"], x[,"value"], col=colorBySalt(x[,"salt"]))
}
ignore <- by(x, x[,c("salt","repl")], fn)
legend("topleft", bty="n", ncol=2, pch=20, lty=1,
  col=colorBySalt(unique(x[,"salt"])),
  legend=unique(x[,"salt"]), title="NaCl (g/l)")

Graphical summary

Computing end points

Let’s use the maximum increase in OD as the response. We need a value for each treatment and replicate.

z <- aggregate(
  x = list(deltaOD=x[,"value"]),
  by = x[,c("salt","repl")],
  FUN = function(u) {
    diff(range(u)) }
)

# pipe-based alternative code
# library("dplyr")
# z <- x %>%
#   group_by(salt, repl) %>%
#   summarise(deltaOD =
#     diff(range(value)))   
# z <- as.data.frame(z)

print(head(z))

  salt repl deltaOD
1    4    a      38
2    8    a      30
3   16    a      26
4   32    a      20
5   64    a      13
6  128    a       3

Empirical concentration-response curve

plot(deltaOD ~ salt, data=z,
  log="x",
  xlab="NaCl (mg/L, log)",
  ylab=expression(paste(
    Delta," OD")),
  bty="L"
)

Fitting a model: Why ?

Model allows for inter- and extrapolation (e.g., we can find the salt concentration where \(\Delta\)OD = 0)
Model parameters can be compared across experiments, even if tested concentrations differ

Fitting a model: How ?

Formulate a mathematical model: \(response = f(conc, parms)\)
Guess values for all unknown parameters (\(parms\))
Design an objective function to quantify the mismatch between observed and modeled response \(SSR = \sum_{i=1}^n (obs_i - mod_i)^2\)
Let an optimization algorithm find the values of \(parms\) which minimize \(SSR\)

Fit a power model with `nls`

fit <- nls(deltaOD ~ a + b * salt^p, data=z,
  start=c(a=0, b=1, p=1))

salt.show <- 0:150
pr <- data.frame(
  salt=salt.show,
  deltaOD.modeled=predict(fit,
    newdata=data.frame(salt=salt.show))
)

plot(deltaOD.modeled ~ salt, data=pr,
  type="l", log="x", bty="L",
  xlab="NaCl (mg/L, log)",
  ylab=expression(paste(Delta," OD")))
points(deltaOD ~ salt, data=z)

SSR is implicitly used as the objective function

Fitted power model

Using the model for extrapolation

# intersection with x axis
saltAtRespZero <- with(as.list(coef(fit)), (-a/b)^(1/p))

# intersection with y axis
respAtSaltZero <- with(as.list(coef(fit)), a+b*0^p)

legend("topright", bty="n", title="Predicted extremes",
  legend=c(
  paste("Salt (response = 0): ", signif(saltAtRespZero, 3)),
  paste("Response (salt = 0): ", signif(respAtSaltZero, 3))
))

Using the model for extrapolation

Checking plausibility (1 / 3)

Does the model fit the observations well?

Checking plausibility (2 / 3)

Reasonable behavior at high concentrations?
Intersect with lower axis or convergence to zero?

Checking plausibility (3 / 3)

Reasonable behavior at low concentrations?
Do we expect a non-zero intercept?
What if the medium contains no ions at all?

Extrapolation on the left?

More about extrapolation

Let’s redefine the response

z <- cbind(z, inhib = 1 - z[,"deltaOD"] / max(z[,"deltaOD"]))

0 codes for “no inhibition”
1 codes for “fully inhibited”

More about extrapolation

A wrapper function to fit & display models

fit <- function(formula, start, data) {
  x <- nls(formula, data=data, start=start)
  salt.show <- seq(0.1*min(data[,"salt"]), 2.5*max(data[,"salt"]), by=1)
  pr <- data.frame(salt=salt.show, inhib.mod=
      pmax(0, pmin(1, predict(x, newdata=data.frame(salt=salt.show)))))
  omar <- par("mar")
  par(mar=c(4.2,4.2,1,1))
  plot(inhib.mod ~ salt, data=pr, type="l", log="x", ylim=c(0,1),
    xlab="NaCl (mg/L, log)", ylab="Inhibition")
  points(inhib ~ salt, data=data, pch=20)
  abline(h=c(0,1), lty=4:3, col=c("blue","red"))
  legend(x=20, y=0.3, bty="n", pch=c(20,NA), lty=c(NA,1),
    legend=c("Observed","Model"))
  legend(x=.5, y=0.8, bty="n", title=as.character(as.expression(formula)),
    legend=paste(names(coef(x)), signif(coef(x), 3), sep=" = "))
  par(mar=omar)
}

More about extrapolation

Fitting and visual comparison of 4 models

par(mfrow=c(2,2))

# power model
fit(inhib ~ a * salt^p + b, start=c(a=1, b=0, p=1), data=z)

# log-linear
fit(inhib ~ a * log(salt) + b, start=c(a=1, b=0), data=z)

# exponential
fit(inhib ~ 1 - exp(p*salt), start=c(p=1e-6), data=z)

# variant of the 'Hill model'
fit(inhib ~ salt^n / (salt^n + ec50^n), start=c(ec50=60, n=1), data=z)

par(mfrow=c(1,1))

Discrepancies w.r.t. extrapolation

Summary

You want to fit empirical data by a model
You need to select the endpoint and a model equation
Proper choice requires knowledge about the mechanisms (chemistry, physiology, …)
Exposure time has been neglected so far

Example A.2: Bacterial growth with antibiotics

Data set

x <- data.frame(
  drug = c(rep("Mecillinam",12),rep("Tetracycline",17),
    rep("Trimethoprim",14)),
  mg_per_L = c(5e-05,0.00011,0.00021,0.00043,0.00089,0.00186,
    0.00406,0.00945,0.0162,0.0242,0.0336,0.0443,5e-04,0.0011,
    0.0022,0.0044,0.0089,0.0186,0.0406,0.0945,0.162,0.242,
    0.336,0.443,0.564,0.697,0.844,1,1.37,0.000319,0.00064,
    0.00129,0.00261,0.00534,0.0112,0.0244,0.0567,0.097,0.145,
    0.202,0.266,0.338,0.418),
  mu_per_hour = c(1.24,1.21,1.21,1.18,1.22,1.2,1.23,1.24,1.22,
    1.22,0.965,0.233,1.22,1.22,1.23,1.24,1.2,1.23,1.21,1.21,
    1.06,0.934,0.812,0.64,0.547,0.438,0.305,0.291,0.182,1.22,
    1.22,1.21,1.23,1.14,1.07,1.09,0.973,0.843,0.688,0.526,0.506,
    0.374,0.292)
)

Subset of data from Fig. 1 of Angermayr et al. (2022), DOI:10.15252/msb.202110490, CC BY 4.0 license.

Normalizing the response

# Scales the growth rates to range 0 ... 1

normalize <- function(x) {
  data.frame(
    drug= unique(x[,"drug"]),
    C= x[,"mg_per_L"],
    R= x[,"mu_per_hour"] / x[which.min(x[,"mg_per_L"]), "mu_per_hour"]
  )
}
x <- do.call(rbind, by(x, x[,"drug"], normalize))

Helper function 1

# fits a model using 'nls' and adds the prediction to a plot

fit_single <- function(formula, data, start, col, label) {
  f <- nls(formula=formula, data=data, start=start)
  concs <- 10^(-6:6)
  concs <- sort(rep(concs, each=100) * rep((1:100)/100, length(concs)))
  pr <- data.frame(C=concs, R=predict(f, newdata=data.frame(C=concs)))
  lines(R ~ C, data=pr, col=col)
  data.frame(label=label, col=col, npar=length(start),
    aic=signif(AIC(f),3), formula=as.character(as.expression(formula)))
}

Helper function 2

# fits multiple models to data of a single drug
fit_multiple <- function(x) {
  plot(R ~ C, data=x, bty="L", log="x", pch=20, col="grey",
    xlim=range(x[,"C"])*c(0.75, 2), ylim=c(0,1),
    xlab=substitute(paste(a," (mg/L)"), list(a=unique(x[,"drug"]))),
    ylab=expression(paste("R = ",mu / mu[0]," (-)")))
  out <- rbind(
    fit_single(R ~ 1 - (C / mic), data=x,
      start=c(mic=max(x[,"C"])), col="steelblue4", label="Linear"),
    fit_single(R ~ 1 - (C / mic)^b, data=x,
      start=c(mic=max(x[,"C"]), b=1), col="darkorange", label="Power"),
    fit_single(R ~ ec50^n / (ec50^n + C^n), data=x,
      start=c(ec50=0.5*max(x[,"C"]), n=1), col="red", label="Hill"))
  out <- out[order(out$aic),]
  legend("left", bty="n", legend=paste0(out$label,": ",
      "AIC = ",out$aic," (",out$npar," par.)"), text.col=out$col)
  cbind(drug=unique(x[,"drug"]), out)
}

Fit all cases

# set up plot layout
n <- length(unique(x[,"drug"]))
nc <- 2
nr <- ceiling(n/nc)
omar <- par("mar")
par(mfrow=c(nr, nc), mar=c(4.5, 4.7, 1.5, 1))

# hard work happens here
stats <- do.call(rbind, by(x, x[,"drug"], fit_multiple))

# add legend
plot(0:1, 0:1, type="n", axes=F, ann=F)
stats <- stats[stats[,"drug"] == "Tetracycline",]
legend(x=0, y=1, bty="n", 
  legend=paste(stats[,"label"], stats[,"formula"], sep=": "))
par(mfrow=c(1,1), mar=omar)

Result

Discussion

Different model structures may work
Interpretation of parameters differs
Trade-off between goodness of fit and model complexity
AIC helps to compare models

Discussion

AIC cannot substitute physiological knowledge
What is a realistic behavior close to MIC ?
Can we trust data close to extremes ?

Empirical vs. mechanistic models

Empirical vs. mechanistic

Empirical models are just mathematical structures which reproduce the shape of the C-R curve.
Mechanistic models are rooted in physiology.
The Hill equation has a mechanistic basis:
1. Toxin binds to a receptor
2. Response is proportional to occupancy of receptors
See this paper together with these notes for details

Concentration-response curves with peaks

Hormesis

Hormesis is the positive response of an organisms to low doses of a toxic substance (stimulation)
Hormesis is not homeopathy
The phenomenon has been heavily debated

Hormesis

Shoot weight of lettuce grown with isobutyl alcohol

x <- read.table(header=T, sep="", text="
conc repl1  repl2
0    1.126  0.833
0.32 1.096  1.106
1    1.163  1.336
3.2  0.985  0.754
10   0.710  0.683
32   0.560  0.488
100  0.375  0.344
")

Data from Ewijk & Hoekstra, 1993

Hormesis

# turn into long table
x <- with(x, data.frame(
  conc=rep(conc, 2),
  repl=rep(1:2, each=nrow(x)),
  value=c(repl1, repl2)
))

plot(value ~ conc,
  data=x,
  log="x",
  xlab="Concentration (mg/L)",
  ylab="Shoot weight",
  bty="L"
)

Peaks in C-R curves aren’t surprising

… because chemicals may be essential and toxic at the same time

Example:

Let plants grow without a nitrogen source
Moderate addition of NH₄⁺ promotes growth
However, too much NH₄⁺ is detrimental (see, e.g., Britto & Kronzucker, 2002)

Peaks in C-R curves aren’t surprising

… because chemicals trigger various responses (a) in a test organism and (b) its ecosystems

Example:

A root microbiome of symbionts and parasites
Let the plant be treated with antibiotic
If parasites are particularly sensitive, the plant profits from low-dose treatment
At higher doses, even symbionts get killed …

PART B: Probabilistic C-R curves

MIC data of bacterial pathogens

Bacterial infections are treated with antibiotics
MIC: Minimum inhibitory concentration
MICs are recorded for many drug-bug combinations
Over 10⁶ values in MIC repository of EUCAST

Subset of MIC repository

# Subset of EUCAST MIC data (snapshot from 2024),
# Concentrations in mg/L, coded as log2 (2^-9 ... 2^9)

x <- rbind(
  data.frame(
   case = "S. enterica on Cefotaxime",   # S. = Salmonella
   conc = -9:9,
   count = c(0,0,0,0,4,655,1520,183,20,5,0,0,1,1,3,0,0,0,0)
  ),
  data.frame(
   case = "K. pneumoniae on Meropenem",  # K. = Klebsiella
   conc = -9:9,
   count = c(0,0,211,948,2896,11318,520,220,135,90,56,41,31,32,45,5,4,1,0)
  )
)

MIC distributions as histograms

plot_hist <- function(x) {
  x <- x[order(x[,"conc"]),]
  z <- barplot(x[,"count"], xlab="log2 MIC (mg/L)", ylab="Count")
  axis(1, at=z, labels=x[,"conc"], lwd=0, las=2)
  legend("right", bty="n", legend=
    paste(gsub(unique(x[,"case"]), pattern=" on ",
      replacement="\non "), sep="\n"))
}

par(mfcol=c(1,2))
ignore <- by(x, x[,"case"], plot_hist)
par(mfcol=c(1,1))

MIC distributions as histograms

Reasons for variation

Genomic variation
Plasmids
Gene expression
Uncertain measurements

Empirical probabilities

Probability is defined as \(P(X <= x)\)

# Add empirical probabilities; assuming that 50% of the counts
# in each bin occur below/above the nominal concentration

add_probs <- function(x) {
  x <- x[order(x[,"conc"]),]
  cbind(x,
    prob= with(x, (cumsum(count) - count/2) / sum(count))
  )
}

x <- do.call(rbind, by(x, x[,"case"], add_probs))

Empirical probabilities

plot_emp <- function(x) {
  plot(prob ~ conc, data=x, pch=20, col="tomato", cex=1.5,
    bty="L", las=2, xaxt="n",
    xlab="              MIC (mg/L)", ylab="P(X <= MIC)")
  axis(1, at=-9:9, labels=signif(2^(-9:9),3), las=2)
  abline(h=c(0,1), lty=3, col="grey")
  text(x[,"conc"], x[,"prob"], x[,"count"],
    adj=c(1.75, 0.5), xpd=T, srt=-45, cex=0.8, col="grey50")
  legend("right", bty="n", legend=
    paste(gsub(unique(x[,"case"]), pattern=" on ",
      replacement="\non "), sep="\n"))
}

par(mfcol=c(1,2))
ignore <- by(x, x[,"case"], plot_emp)
par(mfcol=c(1,1))

Empirical probabilities

1^st attempt: Logistic regression

# Fit a logistic regression model with 'glm'

fit_logi <- function(x) {
  fit <- glm(prob ~ conc, data=x,
    family=quasibinomial(link="logit"))
  c_pred <- seq(-9, 9, .1)
  pr <- data.frame(conc=c_pred,
    prob=predict(fit, newdata=data.frame(conc=c_pred),
    type="response"))
  lines(prob ~ c_pred, data=pr)
}

par(mfcol=c(1,2))
fn <- function(x) { plot_emp(x); fit_logi(x) }
ignore <- by(x, x[,"case"], fn)
par(mfcol=c(1,1))

1^st attempt: Logistic regression

Other options

Logistic regression appears to perform well but it may not be the method of first choice
Let’s fit a probability distribution model instead
Distribution fitting uses maximum likelihood estimation (MLE) instead of least-squares

2^nd attempt: Distribution fitting

fit_ML <- function(x) {
  objfunc <- function(p, x) {
    obs <- rep(x[,"conc"], times=x[,"count"])
    -1*sum(log(dnorm(obs, mean=p["m"], sd=p["s"])))
  }
  fit <- optim(par=c(m=.1, s=1), fn=objfunc, x=x)
  stopifnot(fit$convergence==0)
  c_pred <- seq(-9, 9, .1)
  lines(c_pred, pnorm(q=c_pred, mean=fit$par["m"],
    sd=fit$par["s"]))
}

par(mfcol=c(1,2))
fn <- function(x) { plot_emp(x); fit_ML(x) }
ignore <- by(x, x[,"case"], fn)
par(mfcol=c(1,1))

2^nd attempt: Distribution fitting

Summary

Substantial intra-species variation of MICs
Variation is partly explained by biology
Another (unknown) part of the variation is due to measurement uncertainty
MICs of susceptible strains often follow a log-normal distribution

Final notes

C-R analysis may be tricky

Different cases require different concepts
Do not blindly follow what AI proposes
Only use concepts you understand

Selected sources

R-package “drc”: High-level methods for concentration-response curve fitting
Ritz C. (2010): Toward a unified approach to dose–response modeling in ecotoxicology. Environ. Toxicology & Chemistry, 29 (1), 220–229
Gabrielsson J. & Weiner D.: Pharmacokinetic and Pharmacodynamic Data Analysis (Textbook in various editions)

Concentration-Response Curves

Introduction

Concentration-response curves

Responses

Endpoint

Scaling of responses (I)

Scaling of responses (II)

Temporal dimension

Language: C-R vs. D-R curves

Relevance of C-R curves

Classification by tested organism

Classification by tested organism

Classification by tested organism

Classification by tested organism

In this course …

How to run example codes

PART A: Deterministic C-R curves

Example A.1: Yeast and salt

Study system

Study system

Recipe

Monitoring of optical density

Monitoring of optical density

Data set in compressed format

First and last records

Graphical summary

Graphical summary

Computing end points

Empirical concentration-response curve

Fitting a model: Why ?

Fitting a model: How ?

Fit a power model with nls

Fitted power model

Using the model for extrapolation

Using the model for extrapolation

Checking plausibility (1 / 3)

Checking plausibility (2 / 3)

Checking plausibility (3 / 3)

Extrapolation on the left?

More about extrapolation

More about extrapolation

More about extrapolation

Discrepancies w.r.t. extrapolation

Summary

Example A.2: Bacterial growth with antibiotics

Data set

Normalizing the response

Helper function 1

Helper function 2

Fit all cases

Result

Discussion

Discussion

Empirical vs. mechanistic models

Empirical vs. mechanistic

Concentration-response curves with peaks

Hormesis

Hormesis

Hormesis

Peaks in C-R curves aren’t surprising

Peaks in C-R curves aren’t surprising

PART B: Probabilistic C-R curves

MIC data of bacterial pathogens

Subset of MIC repository

MIC distributions as histograms

MIC distributions as histograms

MIC distributions as histograms

Empirical probabilities

Empirical probabilities

Empirical probabilities

1st attempt: Logistic regression

1st attempt: Logistic regression

Other options

2nd attempt: Distribution fitting

2nd attempt: Distribution fitting

Summary

Final notes

C-R analysis may be tricky

Selected sources

Fit a power model with `nls`

1^st attempt: Logistic regression

1^st attempt: Logistic regression

2^nd attempt: Distribution fitting

2^nd attempt: Distribution fitting