Using ggplot2: Solutions

In [3]:
suppressPackageStartupMessages(library(tidyverse))
Warning message:
“Installed Rcpp (0.12.12) different from Rcpp used to build dplyr (0.12.11).
Please reinstall dplyr to avoid random crashes or undefined behavior.”
Warning message:
“package ‘dplyr’ was built under R version 3.4.1”
In [4]:
head(iris)
Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Species
5.1           3.5          1.4           0.2          setosa
4.9           3.0          1.4           0.2          setosa
4.7           3.2          1.3           0.2          setosa
4.6           3.1          1.5           0.2          setosa
5.0           3.6          1.4           0.2          setosa
5.4           3.9          1.7           0.4          setosa
In [5]:
options(repr.plot.width=4, repr.plot.height=3)

Map data attributes to aesthetics

In [6]:
g <- ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species))
g
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_5_1.png

Use different geometries for the same mapping

In [7]:
g + geom_point()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_7_1.png
In [8]:
g + geom_density2d()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_8_1.png
In [9]:
g + geom_smooth()
`geom_smooth()` using method = 'loess'
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_9_2.png

Combine geometries

In [10]:
g + geom_point() + geom_smooth()
`geom_smooth()` using method = 'loess'
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_11_2.png

Changing labels

In [11]:
g + geom_point() + geom_smooth() +
labs(x="Sepal Length", y="Sepal Width", title="Iris", subtitle="Plotted with ggplot")
`geom_smooth()` using method = 'loess'
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_13_2.png

Changing scales

In [12]:
g +
geom_jitter(width = 0.2) +
scale_y_log10()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_15_1.png

Group plots using facets

In [13]:
g + geom_point() + geom_smooth() +
facet_wrap(~ Species)
`geom_smooth()` using method = 'loess'
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_17_2.png
In [14]:
g + geom_point() + geom_smooth() +
facet_wrap(~ Species, ncol = 1)
`geom_smooth()` using method = 'loess'
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_18_2.png

Turning guides on and off

In [15]:
g + geom_point() + geom_smooth() +
facet_wrap(~ Species) +
guides(color="none")
`geom_smooth()` using method = 'loess'
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_20_2.png

Reshaping data.frames with gather and plotting

In [16]:
iris %>% gather(measure, value, -Species) %>% head
Species  measure       value
setosa   Sepal.Length  5.1
setosa   Sepal.Length  4.9
setosa   Sepal.Length  4.7
setosa   Sepal.Length  4.6
setosa   Sepal.Length  5.0
setosa   Sepal.Length  5.4
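
The same wide-to-long reshape can also be written with pivot_longer(), which newer tidyr releases recommend over gather(). The sketch below assumes tidyr 1.0.0 or later, which is newer than the version this notebook was run with; the name iris_long is only illustrative.

```r
library(tidyr)

# Same reshape as gather(measure, value, -Species), written with pivot_longer().
# Assumption: tidyr >= 1.0.0 is installed.
iris_long <- pivot_longer(iris, cols = -Species,
                          names_to = "measure", values_to = "value")

# 150 flowers x 4 measurements = 600 rows, in three columns
dim(iris_long)
head(iris_long)
```

The rows come out in a different order than with gather() (flower by flower rather than measurement by measurement), which does not affect the plots that follow.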
In [17]:
options(repr.plot.width=6, repr.plot.height=4)
In [18]:
g2 <- ggplot(iris %>% gather(measure, value, -Species),
            aes(x=Species, y=value, fill=Species, color=Species))
In [19]:
g2 + facet_wrap(~ measure) + geom_boxplot()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_25_1.png
In [20]:
g2 + facet_wrap(~ measure) + geom_jitter()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_26_1.png
In [21]:
g2 + facet_wrap(~ measure) + geom_bar(stat="identity")
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_27_1.png

Allowing different scales for each plot

In [22]:
g2 + facet_wrap(~ measure, scales="free") + geom_boxplot()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_29_1.png

Change coordinates

In [23]:
g2 +
geom_boxplot() +
coord_flip()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_31_1.png
In [24]:
options(repr.plot.width=6, repr.plot.height=6)
In [25]:
polar <- g2 +
facet_wrap(~ measure) +
geom_jitter(width = 0.2, size=1) +
coord_polar() +
guides(color="none", fill="none") +
labs(x="", y="")
In [26]:
polar
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_34_1.png

Transparency

In [27]:
ggplot(iris, aes(x=Sepal.Length, fill=Species)) +
geom_density(alpha=0.5)
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_36_1.png

Themes

In [28]:
ggplot(iris, aes(x=Sepal.Length, fill=Species)) +
geom_density(alpha=0.5) +
theme_bw()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_38_1.png
In [30]:
ggplot(iris, aes(x=Sepal.Length, fill=Species)) +
geom_density(alpha=0.5) +
theme_classic()
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_39_1.png
In [31]:
ggplot(iris, aes(x=Sepal.Length, fill=Species)) +
geom_density(alpha=0.5) +
theme_dark() +
theme(axis.text.x = element_text(colour = 'red', size=20),
      axis.text.y = element_text(color = 'blue', size=20))
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_40_1.png

Using color scales

In [32]:
options(repr.plot.width=4, repr.plot.height=3)

Discrete colors or fills

In [33]:
g3 <- ggplot(iris %>% gather(measure, value, -Species),
            aes(x=Species, y=value, fill=Species)) +
     geom_bar(stat="identity")
In [34]:
g3
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_45_1.png
In [35]:
g3 + scale_fill_brewer(type='seq')
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_46_1.png
In [36]:
g3 + scale_fill_brewer(type='div')
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_47_1.png
In [37]:
g3 + scale_fill_brewer(type='qual')
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_48_1.png
In [38]:
g3 + scale_fill_brewer(type='seq', palette = 'Reds')
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_49_1.png
In [39]:
g3 + scale_fill_brewer(type='seq', palette = 'Reds', direction = -1)
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_50_1.png

Continuous colors or fills

In [40]:
suppressPackageStartupMessages(library(genefilter))
In [41]:
n <- 20
m <- 50000
EXPRS <- matrix(rnorm(m * 2 * n), m, 2*n)
rownames(EXPRS) <- paste('g', 1:m, sep='')
colnames(EXPRS) <- paste('pt', 1:(2*n), sep='')
grp <- as.factor(rep(c("Control", "Treated"), each=n))
In [42]:
p.values <- rowttests(EXPRS, grp)$p.value
ii <- order(p.values)
TOPEXPRS <- EXPRS[ii[1:100], ]
In [43]:
M <- data.frame(t(TOPEXPRS)) %>% rownames_to_column("pid") %>% gather(gene, expression, -pid)
In [44]:
head(M)
pid  gene    expression
pt1  g36200   0.537766165
pt2  g36200  -0.447619190
pt3  g36200   1.262114958
pt4  g36200   0.205835640
pt5  g36200   0.292946391
pt6  g36200  -0.007104385
In [45]:
options(repr.plot.width=6, repr.plot.height=4)
In [46]:
g4 <- ggplot(M, aes(gene, pid, fill=expression)) +
      geom_tile(colour='white') +
      theme(axis.text.x = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks.x = element_blank(),
            axis.ticks.y = element_blank())
In [47]:
g4
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_59_1.png
In [48]:
g4 + scale_fill_gradient(low = "white", high="red")
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_60_1.png
In [49]:
g4 + scale_fill_gradient2(low = "darkgreen", high="darkblue")
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_61_1.png
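
scale_fill_gradient2() is a diverging scale: it also takes a mid color (white by default) and a midpoint (0 by default), which suits expression-like values centered on zero. A minimal sketch on a small made-up grid, with the defaults spelled out; the data frame d and its columns are hypothetical and stand in for M above.

```r
library(ggplot2)

# Small hypothetical grid of values centered near zero (stand-in for M)
set.seed(1)
d <- expand.grid(x = 1:5, y = 1:5)
d$z <- rnorm(nrow(d))

# Diverging scale: values below `midpoint` shade toward `low`,
# values above it shade toward `high`, and `mid` is used at the midpoint
p <- ggplot(d, aes(x, y, fill = z)) +
  geom_tile(colour = "white") +
  scale_fill_gradient2(low = "darkgreen", mid = "white",
                       high = "darkblue", midpoint = 0)
p
```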

Saving plots

In [50]:
ggsave('polar.png', polar)
Saving 7 x 7 in image

Check saved plot
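
One way to check the result is to confirm that the file exists and is not empty. The sketch below is an illustration rather than part of the original solution: it saves a stand-in plot to a temporary file (the names p and out are hypothetical) and sets width and height explicitly, since ggsave() otherwise falls back to the size of the current graphics device, as in the "Saving 7 x 7 in image" message above.

```r
library(ggplot2)

# Hypothetical stand-in plot; in the notebook this would be `polar`
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point()

# Save with explicit dimensions (in inches) instead of the device default
out <- file.path(tempdir(), "check.png")
ggsave(out, p, width = 6, height = 6)

# Confirm the file was written and is not empty
file.exists(out)
file.size(out) > 0
```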

Exercise

Hint: If nothing is plotted, wrap the entire R expression in a print() statement to see the error message.
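
A small illustration of the hint, assuming a deliberately misspelled column name (mpgg is hypothetical): building the plot object raises no error, because the aesthetics are only evaluated when the plot is printed. tryCatch() is used here only to capture the message as a string.

```r
library(ggplot2)

# Building the plot is lazy: the misspelled column goes unnoticed here
bad <- ggplot(mtcars, aes(wt, mpgg)) + geom_point()

# Printing forces the plot to be rendered, which surfaces the error
result <- tryCatch(print(bad), error = function(e) conditionMessage(e))
result
```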

In [51]:
head(mtcars)
                   mpg   cyl  disp  hp   drat  wt     qsec   vs  am  gear  carb
Mazda RX4          21.0  6    160   110  3.90  2.620  16.46  0   1   4     4
Mazda RX4 Wag      21.0  6    160   110  3.90  2.875  17.02  0   1   4     4
Datsun 710         22.8  4    108   93   3.85  2.320  18.61  1   1   4     1
Hornet 4 Drive     21.4  6    258   110  3.08  3.215  19.44  1   0   3     1
Hornet Sportabout  18.7  8    360   175  3.15  3.440  17.02  0   0   3     2
Valiant            18.1  6    225   105  2.76  3.460  20.22  1   0   3     1

1. Make a scatter plot with y=mpg and x=wt

In [52]:
g <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
In [53]:
g
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_69_1.png

2. Add a linear regression curve.

In [54]:
g + geom_smooth(method='lm')
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_71_1.png

3. Add the title ‘Fuel efficiency decreases with weight’, and rename the x and y axes to ‘Weight’ and ‘Miles per gallon’.

In [55]:
g + geom_smooth(method='lm') +
labs(x="Weight", y="Miles per gallon",
     title="Fuel efficiency decreases with weight")
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_73_1.png

4. Change the color of the scatter points to salmon.

In [56]:
g + geom_point(color='salmon') +
geom_smooth(method='lm') +
labs(x="Weight", y="Miles per gallon",
     title="Fuel efficiency decreases with weight")
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_75_1.png

5. Change the color of the scatter points to represent the horsepower hp.

In [57]:
g + geom_point(aes(color=hp)) +
geom_smooth(method='lm') +
labs(x="Weight", y="Miles per gallon",
     title="Fuel efficiency decreases with weight")
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_77_1.png

6. Use color brewer to set the color scale from Q5, using the Oranges sequential palette for the cyl variable.

In [58]:
g + geom_point(aes(color=as.factor(cyl))) +
geom_smooth(method='lm') +
scale_color_brewer(type='seq', palette = 'Oranges') +
labs(x="Weight", y="Miles per gallon",
     title="Fuel efficiency decreases with weight")
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_79_1.png

7. Make a density plot of mpg, fill by the factor cyl, and set the transparency to 0.5.

In [59]:
ggplot(mtcars, aes(mpg, fill=as.factor(cyl))) +
geom_density(alpha=0.5)
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_81_1.png

8. Repeat Q7, but use 3 separate plots. Remove the legend.

In [60]:
ggplot(mtcars, aes(mpg, fill=as.factor(cyl))) +
facet_wrap(~ cyl) +
geom_density(alpha=0.5) +
guides(fill="none")
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_83_1.png

9. Create a scatter plot with -log(p value) on the y-axis and SNP location on the x-axis, coloring by chromosome number. This is known as a Manhattan plot. Use the code below to simulate data for the plot. Use the Set3 palette of qual type in scale_color_brewer for the color scheme.

n <- 10000 # number of genes
position <- 1:n
chromosome <- factor(rep(1:10, each=n/10))
p.value <- runif(n)
df <- data.frame(position=position, chromosome=chromosome, p.value=p.value)
In [61]:
n <- 10000 # number of genes
position <- 1:n
chromosome <- factor(rep(1:10, each=n/10))
p.value <- runif(n)
df <- data.frame(position=position, chromosome=chromosome, p.value=p.value)
In [62]:
ggplot(df %>% mutate(log.p.value = -log(p.value)),
       aes(x=position, y=log.p.value, color=chromosome)) +
geom_point(size=0.5) +
scale_color_brewer(type='qual', palette='Set3')
../../_images/Computation_Wk4_Day1_AM_Using_ggplo2_Solutions_86_1.png
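
Note that log() in R is the natural logarithm, which is what the solution above plots. Manhattan plots are often drawn with -log10(p) instead, so that p = 0.001 sits at a height of 3. The variant below is a sketch of that convention, not a correction of the solution; the seed and the name manhattan are assumptions added for reproducibility and illustration.

```r
library(ggplot2)

# Simulated data as in the exercise; the seed is an added assumption
set.seed(42)
n <- 10000
df <- data.frame(position = 1:n,
                 chromosome = factor(rep(1:10, each = n/10)),
                 p.value = runif(n))

# Same plot as the solution, but on the base-10 scale
manhattan <- ggplot(df, aes(x = position, y = -log10(p.value), color = chromosome)) +
  geom_point(size = 0.5) +
  scale_color_brewer(type = 'qual', palette = 'Set3') +
  labs(y = "-log10(p value)")
manhattan
```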