Exercises: Session 3
In [1]:
suppressPackageStartupMessages(library(tidyverse))
Warning message:
“Installed Rcpp (0.12.12) different from Rcpp used to build dplyr (0.12.11).
Please reinstall dplyr to avoid random crashes or undefined behavior.”
Warning message:
“package ‘dplyr’ was built under R version 3.4.1”
1. We will work with the `Puromycin` data set in this exercise.
- Use `help` to find out more about the `Puromycin` data set
- Use `class` to find out the class of the data set
- How many rows and columns are there?
- What is the type of each column?
- Show all unique values for the `state` column
- Show the first 5 rows
- Show the last 5 rows
In [2]:
#1
In [3]:
# 2
In [4]:
#3
In [5]:
#4
In [6]:
#4
In [7]:
#5
In [8]:
#6
In [9]:
#7
2. Using the `Puromycin` data set,
- Show the first 20 rows using piping
- Show the last 10 rows using piping
- Show rows 11 to 20 using piping
In [10]:
#1
In [11]:
#2
In [12]:
#3
3. Using the `Puromycin` data set, together with piping and `filter`
- Show only rows where the `state` is `untreated`
- Show only rows where the `conc` is 0.11
- Show only rows where the `conc` is less than 0.1
- Show only rows where the `state` is `treated` and the rate is more than 100
- Show only rows where the `conc` is less than 0.1 or the rate is more than 200
In [13]:
#1
In [14]:
#2
In [15]:
#3
In [16]:
#4
In [17]:
#5
4. Using the `Puromycin` data set, together with piping, head and select, select_if and select_all
- Show only the `conc` and `rate` columns
- Show only the columns whose type is numeric
- Show only the columns whose names end with the letter `e`
- Convert all column names to UPPERCASE
- Rearrange the columns in the order `state`, `conc`, `rate`
- Drop the `state` column
Limit to only the first 3 rows in each case.
In [18]:
#1
In [19]:
#2
In [20]:
#3
In [21]:
#4
In [22]:
#5
In [23]:
#6
5. Using the `Puromycin` data set, together with mutate or transmute and any other operation necessary
- Create a new column `rate2` that is the square of rate
- Create a new data frame that only has the 3 columns with `conc`, `conc^2` and `conc^3` values. Name them `conc`, `conc2` and `conc3`
- Replace each value of all numeric columns with the square root of the value
Show only the first 5 rows in each case.
In [24]:
#1
In [25]:
#2
In [26]:
#3
6. Using the `Puromycin` data set, together with arrange and any other operation necessary
- Sort in ascending `rate` order
- Sort in descending `rate` order
- Sort first on `conc` in ascending order, then `rate` in ascending order
- Sort in ascending order of the number of characters in the `state` column
In each case show only the first 5 rows.
In [27]:
#1
In [28]:
#2
In [29]:
#2
In [30]:
#3
In [31]:
#4
7. Using the `Puromycin` data set, together with summarize and any other operation necessary
- Find the mean value of numeric columns
- Find the mean length of the `state` column
- Find the min, median and max of the `rate` column
In [32]:
#1
In [33]:
#2
In [34]:
#3
8. Using the `Puromycin` data set, together with group_by and any other operation necessary
- Find the average rate for each `state`
- Find the number of treated and untreated states in a new column `count`
- Find the number of rows with the same `conc` and `state` in a new column `count` and only show rows where the count is an even number.
- Find the mean and standard deviation of rate for each `state` and `conc`. Remove any rows with an NA value for the rate standard deviation.
Hint: `group_by` is often combined with `summarize`, and `n()` returns the count.
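To illustrate the hint without giving away the answers, here is a minimal sketch of the group-then-summarize pattern on a different built-in data set. The choice of ToothGrowth and the names counts and mean_len are assumptions made for this illustration only; they are not part of the exercise.

```r
library(dplyr)

# ToothGrowth ships with R: len (numeric), supp (factor), dose (numeric).
# It is used here only so that the Puromycin exercises stay unsolved.
counts <- ToothGrowth %>%
  group_by(supp) %>%                   # one group per supplement type
  summarize(count = n(),               # n() counts the rows in each group
            mean_len = mean(len))      # per-group mean of len

print(counts)
```

The same shape applies to the tasks above: choose the grouping columns in group_by, then compute one value per group inside summarize.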
In [35]:
#1
In [36]:
#2
In [37]:
#3
In [38]:
#4