Estimation and Hypothesis Testing

In [1]:
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 2.2.1     ✔ purrr   0.2.5
✔ tibble  1.4.2     ✔ dplyr   0.7.5
✔ tidyr   0.8.1     ✔ stringr 1.3.1
✔ readr   1.1.1     ✔ forcats 0.3.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
In [2]:
options(repr.plot.width=4, repr.plot.height=3)

Set random number seed for reproucibility

In [3]:
set.seed(42)

Functions around probability distributions

Radnom numbers

In [4]:
rnorm(5)
  1. 1.37095844714667
  2. -0.564698171396089
  3. 0.363128411337339
  4. 0.63286260496104
  5. 0.404268323140999

PDF

In [5]:
x <- seq(-3, 3, length.out = 100)
plot(x, dnorm(x), type="l", ylab="PDF")
../_images/cliburn_Statistical_Inference_02_Solutions_9_0.png

CDF

In [6]:
x <- seq(-3, 3, length.out = 100)
plot(x, pnorm(x), type="l", ylab="CDF")
../_images/cliburn_Statistical_Inference_02_Solutions_11_0.png

Quantiles (inverse CDF)

In [7]:
p = seq(0, 1, length.out = 101)
plot(p, qnorm(p), type="l", ylab="Quantile")
../_images/cliburn_Statistical_Inference_02_Solutions_13_0.png

Point estimates

In [8]:
x <- rnorm(10)
In [9]:
x
  1. -0.106124516091484
  2. 1.51152199743894
  3. -0.0946590384130976
  4. 2.01842371387704
  5. -0.062714099052421
  6. 1.30486965422349
  7. 2.28664539270111
  8. -1.38886070111234
  9. -0.278788766817371
  10. -0.133321336393658

Mean

Manual calculation

In [10]:
sum(x)/length(x)
0.50569923003602

Using built-in function

In [11]:
mean(x)
0.50569923003602

Median

Manual calculation

In [12]:
x_sorted <- sort(x)
In [13]:
length(x)
10

Since there are an even number of observations, we need the average of the middle two data poitns

In [14]:
sum(x_sorted[5:6])/2
-0.0786865687327593

Using built-in function

In [15]:
median(x)
-0.0786865687327593

Quantiles

The mean is just the 50 percentile. We can use R to get any percentile we like.

In [16]:
quantile(x, 0.5)
50%: -0.0786865687327593
In [17]:
quantile(x, seq(0,1,length.out = 5))
0%
-1.38886070111234
25%
-0.126522131318115
50%
-0.0786865687327593
75%
1.45985891163508
100%
2.28664539270111

Exercise

Gene X is known to have a normal distribution with a mean of 100 units and a standard deviation of 15 units in the US population. With respect to this population,

    1. What is the medan value for gene X?
In [18]:
100
100
    1. What is the probability of finding a value of more than 130 for gene X if you pick a person at random?
In [19]:
m <- 100
s <- 15
In [20]:
1 - pnorm(130, m, s)
0.0227501319481792
    1. If you measrue gene X and find that it is in the 95th percentile for this population, what is the measured value?
In [21]:
qnorm(0.95, m, s)
124.672804404272
    1. Find answers to questions (1), (2) and (3) by simulating 1 million people sampled from the US population. Do they agree with the theoretical calculated values?
In [22]:
n <- 1e6
x <- rnorm(n, m, s)
In [23]:
median(x)
100.020793858963
In [24]:
sum(x > 130)/n
0.022838
In [25]:
quantile(x, 0.95)
95%: 124.695020504446
    1. Plot the PDF of gene X using ggplot2 for values between 50 and 150. Give it a title of PDF of N(100, 15), a subtile of I made this!, and label the x-axis as Gene X and y-axis as PDF. Make the PDF blue, and fill the region under the curve blue with a transparency of 50%.
In [26]:
x <- 50:150
df <- data.frame(x=x, y=dnorm(x, m, s))
ggplot(df, aes(x=x, y=y)) +
geom_line(color='blue') +
geom_polygon(fill='blue', alpha=0.5) +
labs(title='PDF of N(100, 15)',
    subtitle='I made this!',
    x='Gene X',
    y='PDF')
Data type cannot be displayed:
../_images/cliburn_Statistical_Inference_02_Solutions_47_1.png

Interval estimates

Confidence intervals

In [27]:
ci = 0.95
In [28]:
alpha = (1-ci)
n <- length(x)
m <- mean(x)
s <- sd(x)
se <- s/sqrt(n)
me <- qt(1-alpha/2, df=n-1) * se
c(m - me, m + me)
  1. 94.215778757373
  2. 105.784221242627

Note that confidence intervals get larger as the confidence required increases.

In [29]:
ci = 0.99
In [30]:
alpha = (1-ci)
n <- length(x)
m <- mean(x)
s <- sd(x)
se <- s/sqrt(n)
me <- qt(1-alpha/2, df=n-1) * se
c(m - me, m + me)
  1. 92.3442793441823
  2. 107.655720655818

Making a function

Review of R custom functions

In [31]:
f <- function(a, b=1) {
    a + b
}
In [32]:
f(2)
3
In [33]:
f(2,3)
5
In [34]:
f(b=4, a=1)
5

Exercise

Make a function called conf for calculating confidence intervals for the sample mean that takes two arguments

  • x is the vector of sample values
  • ci is the confidence interval with a default of 0.95

The funciton should return a vector of two numbers indicating the lwoer and upper limeit of the confidence interval

Check that it gives the same answer as the example above.

In [35]:
conf <- function(x, ci=0.95) {
   alpha = (1-ci)
    n <- length(x)
    m <- mean(x)
    s <- sd(x)
    se <- s/sqrt(n)
    me <- qt(1-alpha/2, df=n-1) * se
    c(m - me, m + me)
}
In [36]:
conf(x)
  1. 94.215778757373
  2. 105.784221242627

Coverage

In 1,000 experiments, we expect the true mean (0) to lie within the estimated 95% CIs 950 times.

In [37]:
n_expt <- 1000
n <- 10
cls <- t(replicate(n_expt, conf(rnorm(n))))
In [38]:
sum(cls[,1] < 0 & 0 < cls[,2])
960

Hypothesis testing

Binomial test

In [39]:
set.seed(123)

n = 50
tosses = sample(c('H', 'T'), n, replace=TRUE, prob=c(0.55, 0.45))
t = table(tosses)
t
tosses
 H  T
27 23
In [40]:
binom.test(table(tosses))

Exact binomial test

data:  table(tosses)
number of successes = 27, number of trials = 50, p-value = 0.6718
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.3932420 0.6818508
sample estimates:
probability of success
                  0.54

In [41]:
set.seed(123)

n = 250
tosses = sample(c('H', 'T'), n, replace=TRUE, prob=c(0.55, 0.45))
t = table(tosses)
t
tosses
  H   T
139 111
In [42]:
binom.test(t)

Exact binomial test

data:  t
number of successes = 139, number of trials = 250, p-value = 0.0875
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.4920569 0.6185995
sample estimates:
probability of success
                 0.556

What happens if we choose a one-sided test?

In [43]:
binom.test(t, alternative = "greater")

Exact binomial test

data:  t
number of successes = 139, number of trials = 250, p-value = 0.04375
alternative hypothesis: true probability of success is greater than 0.5
95 percent confidence interval:
 0.5020197 1.0000000
sample estimates:
probability of success
                 0.556

In [44]:
binom.test(t, alternative = "less")

Exact binomial test

data:  t
number of successes = 139, number of trials = 250, p-value = 0.9668
alternative hypothesis: true probability of success is less than 0.5
95 percent confidence interval:
 0.0000000 0.6089811
sample estimates:
probability of success
                 0.556

What happens if we change our null hypothesis?

In [45]:
binom.test(t, p = 0.55)

Exact binomial test

data:  t
number of successes = 139, number of trials = 250, p-value = 0.8989
alternative hypothesis: true probability of success is not equal to 0.55
95 percent confidence interval:
 0.4920569 0.6185995
sample estimates:
probability of success
                 0.556

Two-sample model

Welch t-test

In [46]:
set.seed(123)

n <- 10
x1 <- rnorm(n, 0, 1)
x2 <- rnorm(n, 1, 1)
In [47]:
t.test(x1, x2)

Welch Two Sample t-test

data:  x1 and x2
t = -2.5438, df = 17.872, p-value = 0.02044
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.0710488 -0.1969438
sample estimates:
 mean of x  mean of y
0.07462564 1.20862196

Standard t-test

In [48]:
t.test(x1, x2, var.equal = TRUE)

Two Sample t-test

data:  x1 and x2
t = -2.5438, df = 18, p-value = 0.02036
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.0705694 -0.1974232
sample estimates:
 mean of x  mean of y
0.07462564 1.20862196

Power of t-test

In [49]:
d <- 1 # Effect size is ratio of difference in means to standard deviation
library(pwr)
pwr.t.test(d = d, sig.level = 0.05, power = 0.9)

gives output

     Two-sample t test power calculation

              n = 22.02109
              d = 1
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

Interpretation of the power calculaiton

If we did many experiments with n=23 per group where the effect size is as specified and the test assumptions are valid, we expect that at least 90% of them will have a p-value less than the nominal significance level (0.05). If we used n=22 we would expect that just under 90% of the experiments will have a p-value less than the nomial significance level (0.05).

In particular notet that about 1-power of the experiments will fail to show a statistically significant p value even if the assumptionss are met (false negative).

In [50]:
n_expts <- 10000
n <- 22
alpha = 0.05
sum(replicate(n_expts, t.test(rnorm(n, 0, 1), rnorm(n, 1, 1))$p.value) < alpha)/n_expts
0.901
In [51]:
n_expts <- 10000
n <- 23
alpha = 0.05
sum(replicate(n_expts, t.test(rnorm(n, 0, 1), rnorm(n, 1, 1))$p.value) < alpha)/n_expts
0.9125

Distribution of p-values under the null is uniform

That meas that you expect \(\alpha\) of the experiments to be false positives.

In [52]:
n_expt <- 10000
n <- 50
ps <- replicate(n_expts, t.test(rnorm(n), rnorm(n))$p.value)
In [53]:
sum(ps < alpha)/n_expts
0.0478
In [54]:
hist(ps)
../_images/cliburn_Statistical_Inference_02_Solutions_93_0.png

Paired and one-sample t-tests

A paired t-test is commonly used to evaluate if paired measuremetns (e..g. weight before and after a diet for the same person) has changed. The paired t-test is equivalent to a one-sample t-test that compares the difference in measurements for the paired values with a fixed number (usually 0).

In [55]:
x1 <- rnorm(10, 100, 15)
x2 <- rnorm(10, 100, 15)
delta <- x1 - x2
In [56]:
t.test(x1, x2, paired=TRUE)

Paired t-test

data:  x1 and x2
t = -2.4618, df = 9, p-value = 0.03605
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.3446069  -0.5215923
sample estimates:
mean of the differences
                -6.4331

In [57]:
t.test(delta, mu=0)

One Sample t-test

data:  delta
t = -2.4618, df = 9, p-value = 0.03605
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -12.3446069  -0.5215923
sample estimates:
mean of x
  -6.4331

Exercise

Suppose that the null hypotehsis is that there is no difference in the paired measurements and the standard deviation of the differene is 2.

  • Run a simulation for 100,000 experiments with n=25 per experiment to show the distributon of the p values using a paired or one -sample t-test under the null.
  • If the significance level is 0.05, how many false positive results were observed?
In [58]:
n_expt <- 100000
n <- 25
s <- 2
ps <- replicate(n_expts, t.test(rnorm(10, 0, s), mu=0)$p.value)
In [59]:
hist(ps)
../_images/cliburn_Statistical_Inference_02_Solutions_100_0.png
In [60]:
n <- 100000
s <- sqrt(2)
sd(rnorm(n, 0, s) - rnorm(n, 0, s))
2.00293039283011