Statistics review 12: Survival analysis

R code accompanying paper

Key learning points

  • Analysis of time-to-event data
  • Kaplan–Meier methods
  • log rank test
  • Cox’s proportional hazards
suppressPackageStartupMessages(library(tidyverse))
options(repr.plot.width=4, repr.plot.height=3)

Survival Data

pt <- 1:15
days <- c(1,1,4,5,6,8,9,9,12,15,22,25,37,55,72)
outcome <- c('d', 'd', 'd', 'd', NA, 'd', 's', 'd', 'd',
            NA, 'd', 's', 'd', 'd', 's')
treatment <- c(2,2,2,2,2,1,2,2,1,1,2,1,1,1,1)
age <- c(75,79,85,76,66,75,72,70,71,73,66,73,68,59,61)

df <- data.frame(pt=pt, days=days, outcome=outcome, treatment=treatment, age=age)
df
ptdaysoutcometreatmentage
1 1d 2 75
2 1d 2 79
3 4d 2 85
4 5d 2 76
5 6NA2 66
6 8d 1 75
7 9s 2 72
8 9d 2 70
912d 1 71
1015NA1 73
1122d 2 66
1225s 1 73
1337d 1 68
1455d 1 59
1572s 1 61

Estimating the survival curve using the Kaplan–Meier method

library(survival)
library(ggfortify)
df$surv <- with(df, Surv(days, outcome=='d'))
head(df)
ptdaysoutcometreatmentagesurv
1 1 d 2 751
2 1 d 2 791
3 4 d 2 854
4 5 d 2 765
5 6 NA2 666?
6 8 d 1 758

Overall

km1 <- survfit(surv ~ 1, data=df)
autoplot(km1)
_images/SR12_Survival_analysis_11_1.png

By treatment

km2 <- survfit(surv ~ treatment, data=df)
autoplot(km2)
_images/SR12_Survival_analysis_14_1.png

Plotting cumulative hazard

autoplot(km2, fun='event')
_images/SR12_Survival_analysis_16_1.png

Comparing survival curves of two groups using the log rank test

survdiff(surv ~ treatment, data=df)
Call:
survdiff(formula = surv ~ treatment, data = df)

n=13, 2 observations deleted due to missingness.

            N Observed Expected (O-E)^2/E (O-E)^2/V
treatment=1 6        4     6.99      1.28      5.27
treatment=2 7        6     3.01      2.98      5.27

 Chisq= 5.3  on 1 degrees of freedom, p= 0.0216

Cox’s proportional hazards model

Treatment only

m1 <- coxph(surv ~ treatment, data=df)
summary(m1)
Call:
coxph(formula = surv ~ treatment, data = df)

  n= 13, number of events= 10
   (2 observations deleted due to missingness)

            coef exp(coef) se(coef)    z Pr(>|z|)
treatment 1.7603    5.8144   0.8465 2.08   0.0376 *
---
Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

          exp(coef) exp(-coef) lower .95 upper .95
treatment     5.814      0.172     1.107     30.55

Concordance= 0.713  (se = 0.099 )
Rsquare= 0.327   (max possible= 0.948 )
Likelihood ratio test= 5.14  on 1 df,   p=0.02333
Wald test            = 4.32  on 1 df,   p=0.03756
Score (logrank) test = 5.28  on 1 df,   p=0.0216

Treatment and age

m2 <- coxph(surv ~ treatment + age, data=df)
summary(m2)
Call:
coxph(formula = surv ~ treatment + age, data = df)

  n= 13, number of events= 10
   (2 observations deleted due to missingness)

             coef exp(coef) se(coef)     z Pr(>|z|)
treatment 1.65039   5.20901  0.96397 1.712   0.0869 .
age       0.21523   1.24015  0.08562 2.514   0.0119 *
---
Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

          exp(coef) exp(-coef) lower .95 upper .95
treatment     5.209     0.1920    0.7875    34.458
age           1.240     0.8064    1.0486     1.467

Concordance= 0.868  (se = 0.118 )
Rsquare= 0.646   (max possible= 0.948 )
Likelihood ratio test= 13.5  on 2 df,   p=0.001173
Wald test            = 8.23  on 2 df,   p=0.01632
Score (logrank) test = 11.43  on 2 df,   p=0.003298

Check for violation of proportional hazard (constant HR over time)

res <- cox.zph(m2)
res
            rho  chisq     p
treatment 0.110 0.0994 0.753
age       0.274 0.4489 0.503
GLOBAL       NA 0.5000 0.779
plot(res)
_images/SR12_Survival_analysis_28_0.png _images/SR12_Survival_analysis_28_1.png

Exercise

library(KMsurv)
data(tongue)
help(tongue)
head(tongue)
typetimedelta
1 11
1 31
1 31
1 41
1 101
1 131

1. Create a KM plot for the tongue data set.

2. Evaluate if the type of tumor affects survival using a log-rank test

3. Evaluate if the type of tumor affects survival using Cox PH test.

4. Check if the Cox PH assumptions are met.