Exercises: Session 4

In [13]:
suppressPackageStartupMessages(library(tidyverse))
In [44]:
suppressPackageStartupMessages(library(stringr))

1. Warm up with the iris data frame.

  • Show the first 3 rows
  • Show the last 3 rows
  • Show 3 random rows without repetition
In [ ]:
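Hint: a minimal sketch of the three row-selection idioms, run on a made-up data frame rather than on iris. The name `toy` and its columns are hypothetical and used only for illustration.

```r
suppressPackageStartupMessages(library(dplyr))

# Hypothetical toy data frame (not part of the exercise data)
toy <- data.frame(id = 1:10, score = seq(0.5, 5, by = 0.5))

first3 <- head(toy, 3)       # first 3 rows
last3  <- tail(toy, 3)       # last 3 rows

set.seed(1)                  # make the random draw reproducible
random3 <- sample_n(toy, 3)  # 3 random rows; replace = FALSE is the default
```

Because `sample_n` samples without replacement by default, no row can appear twice in `random3`.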

2. Using the iris data set,

  • Find the mean value of all 4 measurements
  • Find the mean value of all 4 measurements for each Species
In [ ]:
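Hint: one possible pattern for column means, overall and per group, sketched on a made-up data frame. The names `toy`, `group`, `x` and `y` are assumptions for illustration only.

```r
suppressPackageStartupMessages(library(dplyr))

# Hypothetical toy data: one grouping column and two numeric columns
toy <- data.frame(group = c("a", "a", "b", "b"),
                  x = c(1, 3, 5, 7),
                  y = c(2, 4, 6, 8))

# Mean of every numeric column, ignoring the grouping column
overall <- toy %>% summarise_if(is.numeric, mean)

# Mean of every non-grouping column within each group
by_group <- toy %>% group_by(group) %>% summarise_all(mean)
```

`summarise_all` leaves the grouping column alone, so it only needs to be excluded explicitly in the ungrouped case.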

3. Using the iris data set,

  • Sort the observations by Sepal.Width in decreasing order.
In [ ]:

4. Using the iris data set,

  • Count the number of flowers of each Species
In [ ]:
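Hint: counting rows per category can be sketched with `count` from dplyr. The data frame `toy` and its `kind` column are hypothetical.

```r
suppressPackageStartupMessages(library(dplyr))

# Hypothetical toy data with a single categorical column
toy <- data.frame(kind = c("a", "b", "a", "c", "a"))

# One row per distinct value of kind; the tally is stored in column n
counts <- count(toy, kind)
```

The same result can be obtained with `group_by(kind) %>% summarise(n = n())`, which is what `count` does internally.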

5. Using the iris data set,

  • Count the number of observations where Petal.Length is longer than Sepal.Width
In [ ]:

6. Using the iris data set,

  • Find the Species with the largest number of observations where Sepal.Length is less than the mean Sepal.Length of all observations
In [ ]:

7. Using the iris data set,

  • Convert the data frame from the current wide format to a tall format, with just 3 columns: Species, Measurement, Value.
In [ ]:
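Hint: a small sketch of a wide-to-tall conversion with `gather` from tidyr, shown on made-up data. The names `toy`, `id`, `height` and `weight` are assumptions and not part of the exercise.

```r
suppressPackageStartupMessages(library(tidyr))

# Hypothetical wide data: one row per id, one column per measurement
toy <- data.frame(id = c("p", "q"),
                  height = c(1.5, 1.8),
                  weight = c(60, 75))

# Stack the measurement columns into key/value pairs; id is kept as is
tall <- gather(toy, key = "Measurement", value = "Value", height, weight)
```

Each original row contributes one row per gathered column, so the 2 x 3 wide table becomes a 4 x 3 tall one.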

8. Using the mtcars data set,

  • Find the mean weight of all cars with mpg > 20 and cyl = 4.
In [ ]:

9. Using the mtcars data set,

  • Add a new column named bmi that is equal to (hp*mpg/wt)
In [ ]:

10. Using the mtcars data set,

  • Find all rows whose car names have numbers in them.
In [ ]:
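Hint: in mtcars the car names are stored as row names, not as a column. A possible approach, sketched on a made-up data frame, is to move the row names into a column and filter with a regular expression. The names `toy`, `price` and `name` are hypothetical.

```r
suppressPackageStartupMessages(library(dplyr))
library(stringr)
library(tibble)

# Hypothetical toy data whose labels live in the row names
toy <- data.frame(price = c(10, 20, 30),
                  row.names = c("Model A", "Model 3", "X5 Turbo"))

# Turn row names into a regular column, then keep names containing a digit
with_digits <- toy %>%
  rownames_to_column("name") %>%
  filter(str_detect(name, "[0-9]"))
```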

11. Using the iris data set, together with gather and any other operation necessary

  • Create a new data frame df that has only 3 columns (Species, Measure, Value) where Measure takes on the values Sepal.Length, Sepal.Width, Petal.Length or Petal.Width. Show the first 5 rows.
  • Show the mean value and counts for each Species and Measure of df
In [ ]:

12. Using the df data set, apply spread to

  • give each different treatment its own column.
In [ ]:
df <- data.frame(subject=rep(1:4,3),
                 treatment = rep(c("A", "B", "C"), each=4),
                 value = rnorm(12))
In [ ]:
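Hint: `spread` takes a key column, whose values become new column names, and a value column, whose values fill those columns. A minimal sketch on made-up data follows; `toy`, `condition`, `ctrl` and `drug` are assumptions for illustration.

```r
suppressPackageStartupMessages(library(tidyr))

# Hypothetical tall data: two subjects measured under two conditions
toy <- data.frame(subject = rep(1:2, 2),
                  condition = rep(c("ctrl", "drug"), each = 2),
                  value = c(1, 2, 3, 4))

# One row per subject, one column per condition
wide <- spread(toy, key = condition, value = value)
```

Columns that are neither key nor value (here `subject`) identify the rows of the resulting wide table.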

13. Using the expt data set, together with separate and any other operation necessary

  • Find the average blood pressure for each treatment group (A, B or C).

Note: Assume you do not have access to the pid and treat values separately.

In [ ]:
pid <- rep(1:4, 3)
treat <- rep(c('A','B','C'), each = 4)
bp <- rnorm(12, 120, 25)
expt <- data.frame(name=paste(pid, treat, sep='-'), bp=bp)
rm(pid)
rm(treat)
expt
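Hint: `separate` splits one character column into several at a delimiter, after which the usual grouped summary applies. The sketch below uses made-up data; `toy`, `label`, `id`, `grp` and `score` are hypothetical names.

```r
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(tidyr))

# Hypothetical data where an id and a group are fused into one label
toy <- data.frame(label = c("1-x", "2-x", "1-y", "2-y"),
                  score = c(10, 20, 30, 50))

# Split label at the dash, then average score within each group
means <- toy %>%
  separate(label, into = c("id", "grp"), sep = "-") %>%
  group_by(grp) %>%
  summarise(mean_score = mean(score))
```

The original `label` column is dropped by default; passing `remove = FALSE` to `separate` would keep it.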