Statistics review 13: Receiver operating characteristic ======================================================= R code accompanying `paper `__ Key learning points ------------------- - Methods for assessing the performance of a diagnostic test - Sensitivity, specificity and likelihood ratio of a test - Receiver operating characteristic curve and the area under the curve .. code:: r suppressPackageStartupMessages(library(tidyverse)) .. code:: r options(repr.plot.width=4, repr.plot.height=3) Data ---- .. code:: r died <- c(81, 45) survived <- c(591, 674) df1 <- data.frame(died=died, survived=survived, row.names=c("Lactate >1.5 mmol/l", "Lactate ≤1.5 mmol/l")) df1 .. raw:: html
diedsurvived
Lactate >1.5 mmol/l81 591
Lactate ≤1.5 mmol/l45 674
Data set with different prevalence ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code:: r died <- c(386, 214) survived <- c(370, 421) df2 <- data.frame(died=died, survived=survived, row.names=c("Lactate >1.5 mmol/l", "Lactate ≤1.5 mmol/l")) df2 .. raw:: html
diedsurvived
Lactate >1.5 mmol/l386370
Lactate ≤1.5 mmol/l214421
Sensitivity and specificity --------------------------- .. code:: r prop.ci <- function(p, n, alpha=0.05) { se <- sqrt(p*(1-p)/n) k <- qnorm(1 - alpha/2) round(c(p - k*se, p + k*se), 2) } .. code:: r sens <- function(tbl) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] n <- a + c p <- a/n round(p, 2) } .. code:: r sens(df1) .. raw:: html 0.64 .. code:: r p <- sens(df1) n <- sum(df1) prop.ci(p, n) .. raw:: html
  1. 0.61
  2. 0.67
.. code:: r spec <- function(tbl) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] n <- b + d p <- d/n round(p, 2) } .. code:: r spec(df1) .. raw:: html 0.53 Positive and negative predictive values --------------------------------------- .. code:: r ppv <- function(tbl) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] n <- a + b p <- a/n round(p, 2) } .. code:: r ppv(df1) .. raw:: html 0.12 .. code:: r npv <- function(tbl, alpha=0.05) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] n <- c + d p <- d/n round(p, 2) } .. code:: r npv(df1) .. raw:: html 0.94 Effect of prevalence -------------------- .. code:: r c(sens(df1)[1], sens(df2)[1]) .. raw:: html
  1. 0.64
  2. 0.64
.. code:: r c(spec(df1)[1], spec(df2)[1]) .. raw:: html
  1. 0.53
  2. 0.53
.. code:: r c(ppv(df1)[1], ppv(df2)[1]) .. raw:: html
  1. 0.12
  2. 0.51
.. code:: r c(npv(df1)[1], npv(df2)[1]) .. raw:: html
  1. 0.94
  2. 0.66
Likelihood ratios ----------------- .. code:: r lr <- function(tbl) { round(sens(tbl)[1]/(1 - spec(tbl)[1]), 2) } .. code:: r lr(df1) .. raw:: html 1.36 Receiver operating characteristic curve (ROC) --------------------------------------------- .. code:: r threshold <- c(0,1,1.5,2,3,5,25) died <- c(126,114,81,58,37,19,0) survived <- c(1265,996,591,329,131,27,0) df3 <- data.frame(threshold=threshold, died=died, survived=survived) df3 .. raw:: html
thresholddiedsurvived
0.0126 1265
1.0114 996
1.5 81 591
2.0 58 329
3.0 37 131
5.0 19 27
25.0 0 0
.. code:: r n.died <- 126 n.survived <- 1265 .. code:: r sens.1 <-function(died, ndied) { a <- died c <- ndied-died a/(a + c) } .. code:: r spec.1 <-function(survived, nsurvived) { b <- survived d <- nsurvived - survived b/(b+d) } .. code:: r df4 <- df3 %>% mutate(sens=sens.1(died, n.died), one.minus.spec=spec.1(survived, n.survived)) df4 .. raw:: html
thresholddiedsurvivedsensone.minus.spec
0.0 126 1265 1.0000000 1.00000000
1.0 114 996 0.9047619 0.78735178
1.5 81 591 0.6428571 0.46719368
2.0 58 329 0.4603175 0.26007905
3.0 37 131 0.2936508 0.10355731
5.0 19 27 0.1507937 0.02134387
25.0 0 0 0.0000000 0.00000000
.. code:: r ggplot(df4, aes(x=one.minus.spec, y=sens)) + geom_point() + geom_line() + geom_abline(intercept = 0, slope = 1, color='grey', linetype='dashed') + coord_fixed() .. image:: SR13_Receiver_operating_characteristic_files/SR13_Receiver_operating_characteristic_33_1.png Area under the curve (AUCROC) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code:: r suppressPackageStartupMessages(require(pracma)) .. code:: r round(-1 * trapz(x=df4$one.minus.spec, y=df4$sens), 2) .. raw:: html 0.64