Using ggplot2: Solutions

In [3]:
In [4]:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
5.1          3.5         1.4          0.2         setosa
4.9          3.0         1.4          0.2         setosa
4.7          3.2         1.3          0.2         setosa
4.6          3.1         1.5          0.2         setosa
5.0          3.6         1.4          0.2         setosa
5.4          3.9         1.7          0.4         setosa
In [5]:
options(repr.plot.width=4, repr.plot.height=3)

Map data attributes to aesthetics

In [6]:
g <- ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species))

Use different geometries for the same mapping

In [7]:
g + geom_point()
In [8]:
g + geom_density2d()
In [9]:
g + geom_smooth()
`geom_smooth()` using method = 'loess'

Combine geometries

In [10]:
g + geom_point() + geom_smooth()
`geom_smooth()` using method = 'loess'

Changing labels

In [11]:
g + geom_point() + geom_smooth() +
labs(x="Sepal Length", y="Sepal Width", title="Iris", subtitle="Plotted with ggplot")
`geom_smooth()` using method = 'loess'

Changing scales

In [12]:
g +
geom_jitter(width = 0.2) +
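The scale call that followed geom_jitter() in the cell above did not survive, so the original choice is unknown. As a minimal sketch, assuming the goal is only to show how a scale function modifies an axis, the block below rebuilds the same mapping and adds scale_x_continuous() and scale_y_log10(). The name g_scaled, the break positions and the log axis are illustrative assumptions, not the original solution.

```r
library(ggplot2)

# Same mapping as `g` above, rebuilt here so the block runs on its own
g_scaled <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_jitter(width = 0.2) +
  # Assumed choices for illustration: fixed x breaks and a log10 y axis
  scale_x_continuous(breaks = 4:8) +
  scale_y_log10()

g_scaled
```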

Group plots using facets

In [13]:
g + geom_point() + geom_smooth() +
facet_wrap(~ Species)
`geom_smooth()` using method = 'loess'
In [14]:
g + geom_point() + geom_smooth() +
facet_wrap(~ Species, ncol = 1)
`geom_smooth()` using method = 'loess'

Turning guides on and off

In [15]:
g + geom_point() + geom_smooth() +
facet_wrap(~ Species) +
`geom_smooth()` using method = 'loess'
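The last line of the cell above is missing. A minimal sketch of turning a guide off, assuming the intent was to hide the color legend (each facet already names its species): guides(color = "none") removes the legend for the color aesthetic only. The name p_no_legend is hypothetical.

```r
library(ggplot2)

p_no_legend <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  geom_smooth() +
  facet_wrap(~ Species) +
  # "none" switches off the legend for this aesthetic;
  # recent ggplot2 releases prefer it over FALSE
  guides(color = "none")

p_no_legend
```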

Reshaping data.frames with gather and plotting

In [16]:
iris %>% gather(measure, value, -Species) %>% head
Species measure      value
setosa  Sepal.Length 5.1
setosa  Sepal.Length 4.9
setosa  Sepal.Length 4.7
setosa  Sepal.Length 4.6
setosa  Sepal.Length 5.0
setosa  Sepal.Length 5.4
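gather() still works, but tidyr now marks it as superseded by pivot_longer(). As a sketch of the same reshaping with the newer function (the name iris_long is an assumption used only here): the resulting columns are the same, but the rows come out in a different order, because pivot_longer() walks through the data row by row instead of column by column.

```r
library(tidyr)

# Every column except Species is stacked into a measure/value pair
iris_long <- pivot_longer(iris,
                          cols = -Species,
                          names_to = "measure",
                          values_to = "value")

head(iris_long)
```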
In [17]:
options(repr.plot.width=6, repr.plot.height=4)
In [18]:
g2 <- ggplot(iris %>% gather(measure, value, -Species),
            aes(x=Species, y=value, fill=Species, color=Species))
In [19]:
g2 + facet_wrap(~ measure) + geom_boxplot()
In [20]:
g2 + facet_wrap(~ measure) + geom_jitter()
In [21]:
g2 + facet_wrap(~ measure) + geom_bar(stat="identity")

Allowing different scales for each plot

In [22]:
g2 + facet_wrap(~ measure, scales="free") + geom_boxplot()

Change coordinates

In [23]:
g2 +
geom_boxplot() +
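The coordinate call that ended the cell above is missing. As a hedged sketch of changing coordinates, assuming a simple flip of the axes is what was meant, coord_flip() swaps x and y so the boxplots run horizontally. The names iris_long and p_flipped are hypothetical.

```r
library(ggplot2)
library(tidyr)

# Long format, as used for `g2` above
iris_long <- gather(iris, measure, value, -Species)

p_flipped <- ggplot(iris_long, aes(x = Species, y = value, fill = Species, color = Species)) +
  geom_boxplot() +
  # Assumed coordinate change: swap the x and y axes
  coord_flip()

p_flipped
```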
In [24]:
options(repr.plot.width=6, repr.plot.height=6)
In [25]:
polar <- g2 +
facet_wrap(~ measure) +
geom_jitter(width = 0.2, size=1) +
coord_polar() +
guides(color="none", fill="none") +
labs(x="", y="")
In [26]:


In [27]:
ggplot(iris, aes(x=Sepal.Length, fill=Species)) +


In [28]:
ggplot(iris, aes(x=Sepal.Length, fill=Species)) +
geom_density(alpha=0.5) +
In [30]:
ggplot(iris, aes(x=Sepal.Length, fill=Species)) +
geom_density(alpha=0.5) +
In [31]:
ggplot(iris, aes(x=Sepal.Length, fill=Species)) +
geom_density(alpha=0.5) +
theme_dark() +
theme(axis.text.x = element_text(colour = 'red', size=20),
      axis.text.y = element_text(color = 'blue', size=20))

Using color scales

In [32]:
options(repr.plot.width=4, repr.plot.height=3)

Discrete colors or fills

In [33]:
g3 <- ggplot(iris %>% gather(measure, value, -Species),
            aes(x=Species, y=value, fill=Species)) +
In [34]:
In [35]:
g3 + scale_fill_brewer(type='seq')
In [36]:
g3 + scale_fill_brewer(type='div')
In [37]:
g3 + scale_fill_brewer(type='qual')
In [38]:
g3 + scale_fill_brewer(type='seq', palette = 'Reds')
In [39]:
g3 + scale_fill_brewer(type='seq', palette = 'Reds', direction = -1)

Continuous colors or fills

In [40]:
In [41]:
n <- 20
m <- 50000
EXPRS <- matrix(rnorm(m * 2 * n), m, 2*n)
rownames(EXPRS) <- paste('g', 1:m, sep='')
colnames(EXPRS) <- paste('pt', 1:(2*n), sep='')
grp <- as.factor(rep(c("Control", "Treated"), each=n))
In [42]:
p.values <- rowttests(EXPRS, grp)$p.value
ii <- order(p.values)
TOPEXPRS <- EXPRS[ii[1:100], ]
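rowttests() is not part of base R or the tidyverse; it is provided by the Bioconductor package genefilter, which has to be loaded before the cell above will run. If that package is not available, a slower base-R stand-in is sketched below, assuming an equal-variance two-sample t-test per row is an acceptable substitute. The seed, the reduced m and the top-10 cut are assumptions made only to keep the example small and fast.

```r
set.seed(1)   # assumed seed, only for reproducibility
n <- 20
m <- 200      # far fewer rows than the 50000 above, to keep apply() fast
EXPRS <- matrix(rnorm(m * 2 * n), m, 2 * n)
grp <- as.factor(rep(c("Control", "Treated"), each = n))

# One two-sample t-test per row (gene), keeping only the p-value
p.values <- apply(EXPRS, 1, function(x) t.test(x ~ grp, var.equal = TRUE)$p.value)

# Keep the 10 rows with the smallest p-values
ii <- order(p.values)
TOPEXPRS <- EXPRS[ii[1:10], ]
dim(TOPEXPRS)
```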
In [43]:
M <- data.frame(t(TOPEXPRS)) %>% rownames_to_column("pid") %>% gather(gene, expression, -pid)
In [44]:
pt1 g36200 0.537766165
pt2 g36200 -0.447619190
pt3 g36200 1.262114958
pt4 g36200 0.205835640
pt5 g36200 0.292946391
pt6 g36200 -0.007104385
In [45]:
options(repr.plot.width=6, repr.plot.height=4)
In [46]:
g4 <- ggplot(M, aes(gene, pid, fill=expression)) +
      geom_tile(colour='white') +
      theme(axis.text.x = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks.x = element_blank(),
            axis.ticks.y = element_blank())
In [47]:
In [48]:
g4 + scale_fill_gradient(low = "white", high="red")
In [49]:
g4 + scale_fill_gradient2(low = "darkgreen", high="darkblue")
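scale_fill_gradient2() builds a diverging scale that runs from low through mid to high; mid defaults to white and midpoint to 0, which suits expression values centered on zero. A small sketch on toy data with both arguments written out explicitly (toy and p_div are hypothetical names, and the data are simulated only for this example):

```r
library(ggplot2)

set.seed(1)   # assumed seed, only for reproducibility
toy <- expand.grid(gene = paste0("g", 1:5), pid = paste0("pt", 1:4))
toy$expression <- rnorm(nrow(toy))

p_div <- ggplot(toy, aes(gene, pid, fill = expression)) +
  geom_tile(colour = "white") +
  # Values below 0 shade toward green, values above 0 toward blue
  scale_fill_gradient2(low = "darkgreen", mid = "white", high = "darkblue",
                       midpoint = 0)

p_div
```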

Saving plots

In [50]:
ggsave('polar.png', polar)
Saving 7 x 7 in image
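The "Saving 7 x 7 in image" message appears because ggsave() falls back to the size of the current graphics device when no size is given. A minimal sketch with the size set explicitly, so the file comes out the same regardless of the device; the plot, the file name and the dimensions are assumptions chosen for illustration.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()

# Write to a temporary directory so the example leaves no stray files behind
out_file <- file.path(tempdir(), "iris_scatter.png")
ggsave(out_file, p, width = 6, height = 6, units = "in", dpi = 150)

file.exists(out_file)
```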

Check saved plot


Hint: If nothing is plotted, wrap the entire R expression in a print() statement to see the error message.
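The reason print() helps is that a ggplot object is only evaluated when it is drawn, so a mistake in the mapping does not raise an error when the object is created. A small sketch, using a deliberately misspelled column (mpgg) as the assumed mistake; bad_plot and msg are hypothetical names.

```r
library(ggplot2)

# `mpgg` is a deliberate typo: mtcars has no such column.
# Creating the plot object still succeeds.
bad_plot <- ggplot(mtcars, aes(wt, mpgg)) + geom_point()

# print() forces the plot to be built, which is where the error surfaces
msg <- tryCatch({
  print(bad_plot)
  "no error"
}, error = function(e) conditionMessage(e))

msg
```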

In [51]:
                   mpg  cyl disp hp  drat wt    qsec  vs am gear carb
Mazda RX4          21.0 6   160  110 3.90 2.620 16.46 0  1  4    4
Mazda RX4 Wag      21.0 6   160  110 3.90 2.875 17.02 0  1  4    4
Datsun 710         22.8 4   108  93  3.85 2.320 18.61 1  1  4    1
Hornet 4 Drive     21.4 6   258  110 3.08 3.215 19.44 1  0  3    1
Hornet Sportabout  18.7 8   360  175 3.15 3.440 17.02 0  0  3    2
Valiant            18.1 6   225  105 2.76 3.460 20.22 1  0  3    1

1. Make a scatter plot with y=mpg and x=wt

In [52]:
g <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
In [53]:

2. Add a linear regression curve.

In [54]:
g + geom_smooth(method='lm')

3. Add a title ‘Fuel efficiency decreases with weight’, and rename the x and y axis to ‘Weight’ and ‘Miles per gallon’.

In [55]:
g + geom_smooth(method='lm') +
labs(x="Weight", y="Miles per gallon",
     title="Fuel efficiency decreases with weight")

4. Change the color of the scatter points to salmon.

In [56]:
g + geom_point(color='salmon') +
geom_smooth(method='lm') +
labs(x="Weight", y="Miles per gallon",
     title="Fuel efficiency decreases with weight")

5. Change the color of the scatter points to represent the horsepower hp.

In [57]:
g + geom_point(aes(color=hp)) +
geom_smooth(method='lm') +
labs(x="Weight", y="Miles per gallon",
     title="Fuel efficiency decreases with weight")

6. Use color brewer to set the scale in Q4 with the Oranges sequential palette for the cyl variable.

In [58]:
g + geom_point(aes(color=as.factor(cyl))) +
geom_smooth(method='lm') +
scale_color_brewer(type='seq', palette = 'Oranges') +
labs(x="Weight", y="Miles per gallon",
     title="Fuel efficiency decreases with weight")

7. Make a density plot of mpg and fill by the factor cyl, and set the transparency to 0.5.

In [59]:
ggplot(mtcars, aes(mpg, fill=as.factor(cyl))) +
geom_density(alpha=0.5)

8. Repeat Q7, but use 3 separate plots. Remove the legend.

In [60]:
ggplot(mtcars, aes(mpg, fill=as.factor(cyl))) +
facet_wrap(~ cyl) +
geom_density(alpha=0.5) +
guides(fill="none")

9. Create a scatter plot with -log(p value) on the y-axis and SNP location on the x-axis, coloring by chromosome number. This is known as a Manhattan plot. Use the code below to simulate data for the plot. Use the Set3 palette of qual type in scale_color_brewer for the color scheme.

n <- 10000 # number of genes
position <- 1:n
chromosome <- factor(rep(1:10, each=n/10))
p.value <- runif(n)
df <- data.frame(position=position, chromosome=chromosome, p.value=p.value)
In [61]:
n <- 10000 # number of genes
position <- 1:n
chromosome <- factor(rep(1:10, each=n/10))
p.value <- runif(n)
df <- data.frame(position=position, chromosome=chromosome, p.value=p.value)
In [62]:
ggplot(df %>% mutate(log.p.value = -log(p.value)),
       aes(x=position, y=log.p.value, color=chromosome)) +
geom_point(size=0.5) +
scale_color_brewer(type='qual', palette='Set3')