Exercises: Session 3

In [1]:
suppressPackageStartupMessages(library(tidyverse))
Warning message:
“Installed Rcpp (0.12.12) different from Rcpp used to build dplyr (0.12.11).
Please reinstall dplyr to avoid random crashes or undefined behavior.”
Warning message:
“package ‘dplyr’ was built under R version 3.4.1”

1. We will work with the Puromycin data set in this exercise.

  1. Use help to find out more about the Puromycin data set
  2. Use class to find out the class of the data set
  3. How many rows and columns are there?
  4. What is the type of each column?
  5. Show all unique values for the state column
  6. Show the first 5 rows
  7. Show the last 5 rows
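
As a possible starting point, the base R inspection functions below cover most of these tasks. This is only a sketch with row counts chosen for illustration, not the model answer.

```r
# help("Puromycin") or ?Puromycin opens the documentation page
# (best run in an interactive session, so it is commented out here).

str(Puromycin)            # compact overview: class, size and column types
class(Puromycin)          # class of the whole object
dim(Puromycin)            # number of rows, then number of columns
sapply(Puromycin, class)  # class of each column
unique(Puromycin$state)   # distinct values in one column
head(Puromycin, 2)        # first 2 rows (the exercise asks for 5)
tail(Puromycin, 2)        # last 2 rows (the exercise asks for 5)
```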
In [2]:
#1

In [3]:
#2

In [4]:
#3

In [5]:
#4

In [6]:
#4

In [7]:
#5

In [8]:
#6

In [9]:
#7

2. Using the Puromycin data set,

  1. Show the first 20 rows using piping
  2. Show the last 10 rows using piping
  3. Show rows 11 to 20 using piping
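
The piping pattern can be sketched as follows, using different row counts from the ones asked for. The variable names are illustrative, and `slice()` is one way among several to pick rows by position.

```r
library(dplyr)  # also attached by library(tidyverse); provides %>% and slice()

first3 <- Puromycin %>% head(3)     # first 3 rows
last2  <- Puromycin %>% tail(2)     # last 2 rows
middle <- Puromycin %>% slice(4:6)  # rows 4 to 6, selected by position

middle
```

The pipe passes the left-hand side as the first argument of the function on the right, so `Puromycin %>% head(3)` is equivalent to `head(Puromycin, 3)`.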
In [10]:
#1

In [11]:
#2

In [12]:
#3

3. Using the Puromycin data set, together with piping and filter

  1. Show only rows where the state is untreated
  2. Show only rows where the conc is 0.11
  3. Show only rows where the conc is less than 0.1
  4. Show only rows where the state is treated and the rate is more than 100
  5. Show only rows where the conc is less than 0.1 or the rate is more than 200
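
A minimal sketch of the `filter` patterns involved, using conditions that differ from the ones in the tasks. The variable names are illustrative only.

```r
library(dplyr)

# A single condition on a numeric column
high_conc <- Puromycin %>% filter(conc > 0.5)

# Conditions separated by a comma are combined with AND
slow_untreated <- Puromycin %>% filter(state == "untreated", rate < 100)

# The | operator combines conditions with OR
extremes <- Puromycin %>% filter(conc > 1 | rate < 60)

extremes
```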
In [13]:
#1

In [14]:
#2

In [15]:
#3

In [16]:
#4

In [17]:
#5

4. Using the Puromycin data set, together with piping, head, select, select_if and select_all

  1. Show only the conc and rate columns
  2. Show only the columns whose type is numeric
  3. Show only the columns whose names end with the letter e
  4. Convert all column names to UPPERCASE
  5. Rearrange the columns in the order state, conc, rate
  6. Drop the state column

Limit to only the first 3 rows in each case.
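
The column-selection helpers can be sketched like this, on variations of the tasks rather than the tasks themselves. The `x_` prefix and the variable names are made up for illustration. In dplyr 1.0 and later, `select_if` and `select_all` are superseded but still available.

```r
library(dplyr)

by_name    <- Puromycin %>% select(rate, state) %>% head(2)        # columns by name
by_type    <- Puromycin %>% select_if(is.factor) %>% head(2)       # columns by type
by_pattern <- Puromycin %>% select(starts_with("c")) %>% head(2)   # columns by name pattern

# select_all applies a function to every column name
prefixed <- Puromycin %>%
  select_all(function(nm) paste0("x_", nm)) %>%
  head(2)

reordered <- Puromycin %>% select(rate, everything()) %>% head(2)  # move rate to the front
dropped   <- Puromycin %>% select(-conc) %>% head(2)               # drop one column

prefixed
```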

In [18]:
#1

In [19]:
#2

In [20]:
#3

In [21]:
#4

In [22]:
#5

In [23]:
#6

5. Using the Puromycin data set, together with mutate or transmute and any other operation necessary

  1. Create a new column rate2 that is the square of rate
  2. Create a new data frame that only has the 3 columns with conc, conc^2 and conc^3 values. Name them conc, conc2 and conc3
  3. Replace each value of all numeric columns with the square root of the value

Show only the first 5 rows in each case
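
The difference between `mutate` and `transmute` can be sketched as follows. The new column names and the transformations are illustrative and differ from the ones asked for.

```r
library(dplyr)

# mutate keeps the existing columns and adds the new one
with_half <- Puromycin %>% mutate(half_rate = rate / 2) %>% head(2)

# transmute keeps only the columns named in the call
logged <- Puromycin %>% transmute(rate, log_rate = log(rate)) %>% head(2)

# mutate_if transforms every column that satisfies a predicate
scaled <- Puromycin %>% mutate_if(is.numeric, function(x) x * 10) %>% head(2)

scaled
```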

In [24]:
#1

In [25]:
#2

In [26]:
#3

6. Using the Puromycin data set, together with arrange and any other operation necessary

  1. Sort in ascending rate order
  2. Sort in descending rate order
  3. Sort first on conc in ascending order, then rate in ascending order
  4. Sort in ascending order of the number of characters in the state column

In each case show only the first 5 rows.
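
A sketch of the `arrange` patterns, sorting on different columns from the ones in the tasks. Note that `state` is a factor, and `nchar()` needs a character vector, so a conversion such as `as.character()` is likely to be needed for the last task.

```r
library(dplyr)

ascending  <- Puromycin %>% arrange(conc) %>% head(3)              # ascending is the default
descending <- Puromycin %>% arrange(desc(conc)) %>% head(3)        # desc() reverses the order
two_keys   <- Puromycin %>% arrange(state, desc(rate)) %>% head(3) # ties in state broken by rate

# arrange also accepts a computed expression, here the distance of rate from 100
by_distance <- Puromycin %>% arrange(abs(rate - 100)) %>% head(3)

by_distance
```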

In [27]:
#1

In [28]:
#2

In [29]:
#2

In [30]:
#3

In [31]:
#4

7. Using the Puromycin data set, together with summarize and any other operation necessary

  1. Find the mean value of each numeric column
  2. Find the mean number of characters of the values in the state column
  3. Find the min, median and max of the rate column
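
The `summarize` patterns can be sketched as follows, with different summary statistics from the ones requested. The output column names are illustrative.

```r
library(dplyr)

# summarize_if applies one function to every column that satisfies a predicate
maxima <- Puromycin %>% summarize_if(is.numeric, max)

# state is a factor, so it is converted to character before counting characters
longest <- Puromycin %>% summarize(longest = max(nchar(as.character(state))))

# Several summaries can be computed in a single call
conc_range <- Puromycin %>% summarize(lo = min(conc), hi = max(conc))

conc_range
```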
In [32]:
#1

In [33]:
#2

In [34]:
#3

8. Using the Puromycin data set, together with group_by and any other operation necessary

  1. Find the average rate for each state
  2. Count the number of treated and untreated rows, storing the result in a new column count
  3. Find the number of rows with the same conc and state in a new column count and only show rows where the count is an even number.
  4. Find the mean and standard deviation of rate for each state and conc. Remove any rows with an NA value for the rate standard deviation.

Hint: group_by is often combined with summarize, and n() returns the count.
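
The hint can be sketched on variations of the tasks as follows. The variable names are illustrative. The last line shows why a standard deviation can be NA: a group with a single observation has no defined sample standard deviation.

```r
library(dplyr)

# One summary row per group
best <- Puromycin %>%
  group_by(state) %>%
  summarize(max_rate = max(rate))

# n() counts the rows in each group; filter then keeps only some groups
rare <- Puromycin %>%
  group_by(conc) %>%
  summarize(count = n()) %>%
  filter(count < 4)

single_sd <- sd(160)  # NA, since only one value is supplied

best
```

Rows with a missing summary value can then be dropped with a condition built on `is.na()`.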

In [35]:
#1

In [36]:
#2

In [37]:
#3

In [38]:
#4