Statistical Estimation

In [15]:
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(pwr))
In [14]:
options(repr.plot.width=4, repr.plot.height=3)

Working with probability distributions

6 samples from the standard normal distribution

In [10]:
n <- 6
(x <- rnorm(n))
  1. 0.834184312954893
  2. -1.42097831766073
  3. 0.613256845929185
  4. 1.41229807396209
  5. -1.21942815702242
  6. 0.323522322297916

6 samples from a normal distribution with mean 10 and standard deviation 5

In [11]:
n <- 6
mu <- 10
sigma <- 5
(x <- rnorm(n, mu, sigma))
  1. 6.10092036878306
  2. 10.4553327181587
  3. 7.60688063076039
  4. 12.7554584484242
  5. 4.49824865096167
  6. 14.7050261258722
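
The three-argument call above is a shifted and scaled standard normal draw. The following is a minimal sketch of that relationship. The seed value and the names `x_direct`, `z` and `x_scaled` are assumptions added for illustration; they are not part of the original notebook.

```r
# Assumption: the seed value 1 is arbitrary, chosen only so both draws
# start from the same random number stream.
mu <- 10
sigma <- 5
n <- 6

set.seed(1)
x_direct <- rnorm(n, mu, sigma)  # draw directly from N(mu, sigma^2)

set.seed(1)
z <- rnorm(n)                    # draw from the standard normal
x_scaled <- mu + sigma * z       # shift by mu and scale by sigma by hand

# Compare the two vectors up to floating point tolerance
all.equal(x_direct, x_scaled)
```

Resetting the seed before each draw is what makes the two vectors comparable; without it the second call would continue the random stream and produce different values.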

PDF

In [57]:
x <- seq(-3,3,length.out = 100)
plot(x, dnorm(x), main="Standard normal", type='l')
[Figure: PDF of the standard normal distribution, titled "Standard normal"]

CDF

In [58]:
x <- seq(-3,3,length.out = 100)
plot(x, pnorm(x), type="l", ylab='CDF')
[Figure: CDF of the standard normal distribution]
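
The CDF is the area under the PDF up to a given point. As a rough check, that area can be computed numerically with `integrate` and compared with `pnorm`. The cut-off `q = 1.5` and the name `area` are arbitrary choices made for this sketch.

```r
# Assumption: q = 1.5 is an arbitrary point chosen for illustration.
q <- 1.5

# Numerically integrate the standard normal PDF from -Inf up to q
area <- integrate(dnorm, lower = -Inf, upper = q)

area$value   # area under the PDF to the left of q
pnorm(q)     # CDF evaluated at q; should agree up to numerical error
```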

Percentiles

In [28]:
round(qnorm(c(0.5, 0.65, 0.95, 0.99)), 3)
  1. 0
  2. 0.385
  3. 1.645
  4. 2.326

Relationship between percentiles and CDF

In [27]:
pnorm(c(0, 0.385320466407568, 1.64485362695147, 2.32634787404084))
  1. 0.5
  2. 0.65
  3. 0.95
  4. 0.99
In [52]:
x <- seq(-3,3,length.out = 100)
plot(x, pnorm(x), type="l", ylab='CDF')
abline(h=0.65, col='red', lty=2)
abline(v=0.385, col='blue', lty=2)
[Figure: CDF of the standard normal distribution with a dashed red horizontal line at 0.65 and a dashed blue vertical line at 0.385]
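
The two cells above suggest that `qnorm` and `pnorm` undo each other. A small sketch of the round trip, reusing the same probabilities (the names `p`, `q` and `p_back` are introduced here for illustration):

```r
p <- c(0.5, 0.65, 0.95, 0.99)

q <- qnorm(p)        # probabilities -> quantiles (percentiles)
p_back <- pnorm(q)   # quantiles -> probabilities

# The round trip should recover the original probabilities
all.equal(p, p_back)
```

Passing the unrounded quantiles matters here: feeding the rounded value 0.385 to `pnorm` would return a probability close to, but not exactly, 0.65.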

Exercises

Assume that IQ has a normal distribution with mean = 100 and standard deviation = 15.

Exercise 1

If your IQ is 154, what percentile are you in?

In [ ]:

Exercise 2

What percentage of the population has IQ between 70 and 120?

In [ ]:

Exercise 3

What IQ do you need to be in the top 10 percent (at or above the 90th percentile)?

In [ ]:

One sample model

In [1]:
mu <- 0
sigma <- 1
n <- 6
In [2]:
x <- rnorm(n, mu, sigma)
In [3]:
x
  1. 0.964404467414068
  2. 0.233532508165446
  3. -1.41560321218151
  4. -1.16570049438986
  5. -1.03987265775232
  6. -2.37490060325525

Point estimates

In [4]:
mean(x)
-0.799689998666571
In [5]:
median(x)
-1.10278657607109
In [6]:
sd(x)
1.20265264827186
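
Note that `sd` divides by n - 1 rather than n. A minimal sketch that reproduces it by hand, using the six values printed above (the name `sd_manual` is introduced here for illustration):

```r
# The six values printed by the notebook above
x <- c(0.964404467414068, 0.233532508165446, -1.41560321218151,
       -1.16570049438986, -1.03987265775232, -2.37490060325525)
n <- length(x)

# Sample standard deviation: squared deviations from the mean,
# summed and divided by n - 1, then square-rooted
sd_manual <- sqrt(sum((x - mean(x))^2) / (n - 1))

all.equal(sd_manual, sd(x))
```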

Interval estimates

In [8]:
# Margin of error for a two-sided 95% confidence interval of the mean
me <- qt(0.975, df=n-1) * sd(x)/sqrt(n)
In [9]:
xbar <- mean(x)
c(xbar - me, xbar + me)
  1. -2.06179655017835
  2. 0.46241655284521
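
As a cross-check, the interval computed by hand should match the one reported by `t.test`. This sketch reuses the six values printed earlier; the names `ci_manual` and `ci_ttest` are introduced here for illustration.

```r
# The six values printed by the notebook above
x <- c(0.964404467414068, 0.233532508165446, -1.41560321218151,
       -1.16570049438986, -1.03987265775232, -2.37490060325525)
n <- length(x)

# Manual 95% interval: mean +/- t quantile * standard error
me <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)
ci_manual <- c(mean(x) - me, mean(x) + me)

# Same interval from the one-sample t-test; as.numeric drops the
# conf.level attribute so the two vectors can be compared directly
ci_ttest <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

all.equal(ci_manual, ci_ttest)
```

For a different confidence level, both the quantile passed to `qt` and the `conf.level` argument need to change together.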

Exercise 4

Simulate a sample of 10 observations from a normal distribution with mean = 100 and standard deviation = 15.

  • Find the sample mean, median and standard deviation, then the margin of error and the 90% confidence interval for the mean
In [ ]:

Exercise 5

Simulate a sample of 200 observations from a normal distribution with mean = 100 and standard deviation = 15, and treat the simulated values as the population.

  • What percentile is a person with IQ = 154 in this population?
  • What percentage of this population has IQ between 70 and 120?
  • What IQ do you need to be in the top 10 percent of this population?
In [ ]:

Exercise 6

Compare the t and normal distributions. Overlay the PDFs of the t-distribution with 1, 5 and 30 degrees of freedom on the PDF of the standard normal distribution in a single plot. Use different colors and/or line styles.

In [ ]: