Solutions to Monday Morning Exercises

These are the exercises we will work through in the optional Monday morning class, for those who want more practice with data manipulation in the tidyverse. If you can do these coding challenges with little difficulty, there is no need to attend the Monday class. Note: We will work with two data sets, iris and mtcars. While these have nothing to do with RNA-Seq, the skills you develop will translate directly to count or expression data from RNA-Seq experiments.

In [13]:
suppressPackageStartupMessages(library(tidyverse))
In [44]:
suppressPackageStartupMessages(library(stringr))

1. Warm up with the iris data frame.

  • Show the first 3 rows
  • Show the last 3 rows
  • Show 3 random rows without repetition
In [1]:
head(iris, n=3)
Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Species
5.1           3.5          1.4           0.2          setosa
4.9           3.0          1.4           0.2          setosa
4.7           3.2          1.3           0.2          setosa
In [2]:
tail(iris, n=3)
     Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Species
148  6.5           3.0          5.2           2.0          virginica
149  6.2           3.4          5.4           2.3          virginica
150  5.9           3.0          5.1           1.8          virginica
In [9]:
idx <- sample(nrow(iris), 3)
iris[idx,]
     Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Species
36   5.0           3.2          1.2           0.2          setosa
66   6.7           3.1          4.4           1.4          versicolor
132  7.9           3.8          6.4           2.0          virginica
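
The same three steps can also be written with the dplyr `slice_*` helpers. This is only a sketch of an alternative, assuming dplyr 1.0.0 or later is installed; the seed value is arbitrary and is set only so the random draw can be repeated.

```r
suppressPackageStartupMessages(library(dplyr))

first3 <- iris %>% slice_head(n = 3)    # same rows as head(iris, n = 3)
last3  <- iris %>% slice_tail(n = 3)    # same rows as tail(iris, n = 3)

set.seed(1)                             # arbitrary seed, for repeatability only
random3 <- iris %>% slice_sample(n = 3) # samples without replacement by default
```

`slice_sample()` draws without replacement unless `replace = TRUE` is passed, so it matches the "without repetition" requirement in the same way `sample(nrow(iris), 3)` does.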

2. Using the iris data set,

  • Find the mean value of all 4 measurements
  • Find the mean value of all 4 measurements for each Species
In [29]:
iris %>% summarize_if(is.numeric, mean)
Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
5.843333      3.057333     3.758         1.199333
In [30]:
iris %>% group_by(Species) %>% summarize_if(is.numeric, mean)
Species     Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
setosa      5.006         3.428        1.462         0.246
versicolor  5.936         2.770        4.260         1.326
virginica   6.588         2.974        5.552         2.026
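
In recent dplyr releases `summarize_if()` is marked as superseded in favour of `across()`. The sketch below shows what the two summaries might look like in that style, assuming dplyr 1.0.0 or later; the variable names `overall` and `by_species` are illustrative only.

```r
suppressPackageStartupMessages(library(dplyr))

# Mean of every numeric column over the whole data frame
overall <- iris %>%
  summarize(across(where(is.numeric), mean))

# The same means, computed separately for each Species
by_species <- iris %>%
  group_by(Species) %>%
  summarize(across(where(is.numeric), mean))
```

Both forms should give the same numbers as the `summarize_if()` calls above; `across()` simply keeps the column selection inside `summarize()`.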

3. Using the iris data set,

  • Sort the observations by Sepal.Width in decreasing order.
In [32]:
df <- iris %>% arrange(-Sepal.Width)
head(df)
Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Species
5.7           4.4          1.5           0.4          setosa
5.5           4.2          1.4           0.2          setosa
5.2           4.1          1.5           0.1          setosa
5.8           4.0          1.2           0.2          setosa
5.4           3.9          1.7           0.4          setosa
5.4           3.9          1.3           0.4          setosa

4. Using the iris data set,

  • Count the number of flowers of each Species
In [14]:
iris %>% group_by(Species) %>% summarize(count=n())
Species     count
setosa      50
versicolor  50
virginica   50
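
Counting rows per group is common enough that dplyr offers a shortcut, `count()`. A minimal sketch of the same tally, assuming dplyr is installed (the `name` argument sets the output column name and needs dplyr 0.8.1 or later):

```r
suppressPackageStartupMessages(library(dplyr))

# Equivalent to group_by(Species) %>% summarize(count = n())
counts <- iris %>% count(Species, name = "count")
```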

5. Using the iris data set,

  • Count the number of observations where Petal.Length is longer than Sepal.Width
In [19]:
iris %>% filter(Petal.Length > Sepal.Width) %>% nrow
100

6. Using the iris data set,

  • Find the Species with the most observations where Sepal.Length is less than the mean Sepal.Length of all observations
In [33]:
mu <- mean(iris$Sepal.Length)
iris %>%
  # short is TRUE when the sepal is shorter than the overall mean
  transmute(Species, short = Sepal.Length < mu) %>%
  group_by(Species) %>%
  summarize(sum(short)) %>%
  top_n(n=1)
Selecting by sum(short)
Species  sum(short)
setosa   50

7. Using the iris data set,

  • Convert the data frame from the current wide format to a tall format, with just 3 columns: Species, Measurement, Value.
In [36]:
iris %>% gather(key='Measurement', value='Value', -Species) %>% head
Species  Measurement   Value
setosa   Sepal.Length  5.1
setosa   Sepal.Length  4.9
setosa   Sepal.Length  4.7
setosa   Sepal.Length  4.6
setosa   Sepal.Length  5.0
setosa   Sepal.Length  5.4
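
The tidyr documentation now describes `gather()` as superseded and recommends `pivot_longer()` for new code. The following is a sketch of the same reshaping in that style, assuming tidyr 1.0.0 or later; `tall` is an illustrative variable name.

```r
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(tidyr))

tall <- iris %>%
  pivot_longer(cols = -Species,           # reshape every column except Species
               names_to = "Measurement",  # old column names go here
               values_to = "Value")       # old cell values go here
```

The result has the same three columns and the same 600 rows, but in a different order: `gather()` stacks one measurement column after another, while `pivot_longer()` keeps the four measurements of each flower together.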

8. Using the mtcars data set,

  • Find the mean weight of all cars with mpg > 20 and cyl = 4.
In [39]:
mtcars %>% filter((mpg >20) & (cyl == 4)) %>% summarize(mean(wt))
mean(wt)
2.285727

9. Using the mtcars data set,

  • Add a new column named bmi that is equal to (hp*mpg/wt)
In [41]:
mtcars %>% mutate(bmi = hp*mpg/wt) %>% head
mpg   cyl  disp  hp   drat  wt     qsec   vs  am  gear  carb  bmi
21.0  6    160   110  3.90  2.620  16.46  0   1   4     4     881.6794
21.0  6    160   110  3.90  2.875  17.02  0   1   4     4     803.4783
22.8  4    108   93   3.85  2.320  18.61  1   1   4     1     913.9655
21.4  6    258   110  3.08  3.215  19.44  1   0   3     1     732.1928
18.7  8    360   175  3.15  3.440  17.02  0   0   3     2     951.3081
18.1  6    225   105  2.76  3.460  20.22  1   0   3     1     549.2775

10. Using the mtcars data set,

  • Find all rows whose car names have numbers in them.
In [48]:
mtcars %>% rownames_to_column(var = 'name') %>% filter(str_detect(name, '[0-9]'))
name            mpg   cyl  disp   hp   drat  wt     qsec   vs  am  gear  carb
Mazda RX4       21.0  6    160.0  110  3.90  2.620  16.46  0   1   4     4
Mazda RX4 Wag   21.0  6    160.0  110  3.90  2.875  17.02  0   1   4     4
Datsun 710      22.8  4    108.0  93   3.85  2.320  18.61  1   1   4     1
Hornet 4 Drive  21.4  6    258.0  110  3.08  3.215  19.44  1   0   3     1
Duster 360      14.3  8    360.0  245  3.21  3.570  15.84  0   0   3     4
Merc 240D       24.4  4    146.7  62   3.69  3.190  20.00  1   0   4     2
Merc 230        22.8  4    140.8  95   3.92  3.150  22.90  1   0   4     2
Merc 280        19.2  6    167.6  123  3.92  3.440  18.30  1   0   4     4
Merc 280C       17.8  6    167.6  123  3.92  3.440  18.90  1   0   4     4
Merc 450SE      16.4  8    275.8  180  3.07  4.070  17.40  0   0   3     3
Merc 450SL      17.3  8    275.8  180  3.07  3.730  17.60  0   0   3     3
Merc 450SLC     15.2  8    275.8  180  3.07  3.780  18.00  0   0   3     3
Fiat 128        32.4  4    78.7   66   4.08  2.200  19.47  1   1   4     1
Camaro Z28      13.3  8    350.0  245  3.73  3.840  15.41  0   0   3     4
Fiat X1-9       27.3  4    79.0   66   4.08  1.935  18.90  1   1   4     1
Porsche 914-2   26.0  4    120.3  91   4.43  2.140  16.70  0   1   5     2
Volvo 142E      21.4  4    121.0  109  4.11  2.780  18.60  1   1   4     2