Statistics review 11: Assessing risk ==================================== R code accompanying `paper `__ Key learning points ------------------- - Relative Risk - Odds Ratio - Measuring the impact of exposure to a risk factor - Measures of the success of a treatment .. code:: r suppressPackageStartupMessages(library(tidyverse)) .. code:: r options(repr.plot.width=4, repr.plot.height=3) Data ---- That study investigated the association between surfactant protein B and acute respiratory distress syndrome (ARDS). Patients were classified according to their thymine/cytosine (C/T) gene coding, and patients with the C allele present (genotype CC or CT) were compared with those with genotype TT .. code:: r ARDS <- c(11,1) NoARDS <- c(208, 182) df <- data.frame(row.names=c("CC/CT", "TT"), ARDS=ARDS, NoARDS=NoARDS) df .. raw:: html
ARDSNoARDS
CC/CT11 208
TT 1 182
Relative risk ------------- .. code:: r df1 <- df %>% mutate(Total = ARDS+NoARDS) %>% mutate(risk = ARDS/Total) df1 .. raw:: html
ARDSNoARDSTotalrisk
11 208 219 0.050228311
1 182 183 0.005464481
.. code:: r rr <- df1$risk[1]/df1$risk[2] round(rr, 2) .. raw:: html 9.19 .. code:: r rr <- function(tbl) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] round((a/(a+b))/(c/(c+d)), 2) } .. code:: r rr_ci <- function(tbl, alpha=0.05) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] log.se <- sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)) log.rr <- log((a/(a+b))/(c/(c+d))) k <- qnorm(1-alpha/2) log.ci <- c(log.rr - k*log.se, log.rr + k*log.se) round(exp(log.ci), 2) } .. code:: r rr(df) .. raw:: html 9.19 .. code:: r rr_ci(df) .. raw:: html
  1. 1.2
  2. 70.53
Odds ratio ---------- .. code:: r df2 <- df %>% mutate(odds = ARDS/NoARDS) df2 .. raw:: html
ARDSNoARDSodds
11 208 0.052884615
1 182 0.005494505
.. code:: r or <- df2$odds[1]/df2$odds[2] or .. raw:: html 9.625 .. code:: r or <- function(tbl) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] round((a/b)/(c/d), 2) } .. code:: r or_ci <- function(tbl, alpha=0.05) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] log.se <- sqrt(1/a + 1/b + 1/c - 1/d) log.rr <- log((a/b)/(c/d)) k <- qnorm(1-alpha/2) log.ci <- c(log.rr - k*log.se, log.rr + k*log.se) round(exp(log.ci), 2) } .. code:: r or(df) .. raw:: html 9.62 .. code:: r or_ci(df) .. raw:: html
  1. 1.24
  2. 74.5
Advantages of odds ratio ~~~~~~~~~~~~~~~~~~~~~~~~ - Can be estimated in case-control study - OR is a symmetric ratio in that the OR for the disease given the risk factor is the same as the OR for the risk factor given the disease. - form part of the output when carrying out logistic regression Attributable risk ----------------- The proportion of cases in a population that could be prevented if the risk factor were to be eliminated. The AR is the difference between the actual number of cases in a sample and the number of cases that would be expected if exposure to the risk factor were eliminated, expressed as a proportion of the former. .. code:: r ar <- function(tbl) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] n <- a + b + c + d ar <- ((a+c) - (n*c)/(c+d))/(a+c) round(100*ar, 2) } .. code:: r ar_ci <- function(tbl, alpha=0.05) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] n <- a + b + c + d k <- qnorm(1-alpha/2) u <- (k*(a+c)*(c+d))/(a*d-b*c) * sqrt(((a*d*(n-c) + c^2*b)/(n*c*(a+c)*(c+d)))) hi <- ((a*d - b*c)*exp(u))/(n*c + (a*d-b*c)*exp(u)) lo <- ((a*d - b*c)*exp(-u))/(n*c + (a*d-b*c)*exp(-u)) round(100*c(lo, hi), 2) } .. code:: r ar(df) .. raw:: html 81.69 .. code:: r ar_ci(df) .. raw:: html
  1. 31.16
  2. 97.78
Risk measurements in clinical trials ------------------------------------ .. code:: r df3 <- data.frame(survived=c(79, 60), died=c(38, 59), row.names=c("early", "standard")) df3 .. raw:: html
surviveddied
early7938
standard6059
.. code:: r rr(df3) .. raw:: html 1.34 .. code:: r or(df3) .. raw:: html 2.04 .. code:: r ar(df3) .. raw:: html 14.39 Risk difference --------------- .. code:: r arr <- function(tbl) { a <- tbl[1,1] b <- tbl[1,2] c <- tbl[2,1] d <- tbl[2,2] r <- (d/(c+d) - b/(a+b)) round(100*r, 2) } .. code:: r arr(df3) .. raw:: html 17.1 Number needed to treat ---------------------- .. code:: r nnt <- function(df) { round(100/arr(df), 0) } .. code:: r nnt(df3) .. raw:: html 6 Exercise -------- **1**. Write a function to calculate the confidence intervals of the absolute risk reduction (ARR) given a :math:`2 \times 2` table of outcomes. What is the 90% CI for the data given in ``df3``? **2** Write a function to calculate the confidence intervals for the number needed to treat (NNT) 2×2 table of outcomes. What is the 90% CI for the data given in df3?