R libraries and Bioconductor

Packages and Libraries

R is at heart a collection of ‘packages’. There is a ‘base’ system that contains the truly basic commands, such as the assignment operator <- or c(), the command to create a vector. In addition, there are ‘standard R’ packages that are included whenever you install R, whether you run it through the R kernel in a Jupyter notebook, at the command line, or with RStudio. (I’ve shown some examples of these different ways to run R in class.)

Libraries

Many packages, even those included in [standard R](https://www.r-project.org/), need to be ‘loaded’ before they can be used. In other words, they exist on your computer (or in your container), but the running R session doesn’t know about them. This is deliberate: every loaded package takes up computer memory (RAM) for its functions and variables. If all the available packages were loaded, you might not have any RAM left!

A consequence of this is that you often have to tell R explicitly that you want to use a particular package. You do that using library(). Let’s read in the titanic data set to have something to play with.

In [1]:
titanic <- read.csv("titanic.csv")
In [2]:
head(titanic)
| X | Name | PClass | Age | Sex | Survived | SexCode |
|---|---|---|---|---|---|---|
| 1 | Allen, Miss Elisabeth Walton | 1st | 29.00 | female | 1 | 1 |
| 2 | Allison, Miss Helen Loraine | 1st | 2.00 | female | 0 | 1 |
| 3 | Allison, Mr Hudson Joshua Creighton | 1st | 30.00 | male | 0 | 0 |
| 4 | Allison, Mrs Hudson JC (Bessie Waldo Daniels) | 1st | 25.00 | female | 0 | 1 |
| 5 | Allison, Master Hudson Trevor | 1st | 0.92 | male | 1 | 0 |
| 6 | Anderson, Mr Harry | 1st | 47.00 | male | 1 | 0 |

There is a cool R function that will allow us to look at some random rows from a data frame. It’s called sample_n. Let’s try it:

In [3]:
sample_n(titanic, 10)
Error in sample_n(titanic, 10): could not find function "sample_n"
Traceback:

Oops. It turns out sample_n is in the dplyr package. It’s installed in your container - but R doesn’t know that! Let’s tell R we want to use it:

In [4]:
library(dplyr)
Warning message:
“package ‘dplyr’ was built under R version 3.4.4”
Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

In [5]:
sample_n(titanic, 10)
| X | Name | PClass | Age | Sex | Survived | SexCode |
|---|---|---|---|---|---|---|
| 1123 | Pekoniemi, Mr Edvard | 3rd | NA | male | 1 | 0 |
| 688 | Brahim, Mr Youssef | 3rd | NA | male | 0 | 0 |
| 346 | Bowenur, Mr Solomon | 2nd | NA | male | 0 | 0 |
| 15 | Baumann, Mr John D | 1st | NA | male | 0 | 0 |
| 422 | Givard, Mr Hans Christensen | 2nd | 30 | male | 0 | 0 |
| 256 | Swift, Mrs Frederick Joel (Margaret Welles Barron) | 1st | 46 | female | 1 | 1 |
| 525 | Pallas y Castello, Mr Emilio | 2nd | NA | male | 1 | 0 |
| 994 | Marinko, Mr Dmitri | 3rd | NA | male | 0 | 0 |
| 364 | Carter, Mrs Ernest Courtenay (Lillian Hughes) | 2nd | 44 | female | 0 | 1 |
| 1270 | Van der Planke, Miss Augusta | 3rd | 18 | female | 0 | 1 |
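
What library() changes is R’s search path, the list of attached packages that R consults when it sees a bare function name. The sketch here illustrates this with tools, a package that ships with R but is usually not attached at startup (an assumption about a default session; the variable names are illustrative). It also shows the pkg::fun form, which calls a function from an installed package without attaching it.

```r
# Packages currently attached, i.e. visible by bare function name
attached_before <- search()

# pkg::fun reaches into an installed package without attaching it
ext <- tools::file_ext("titanic.csv")
print(ext)  # "csv"

# library() attaches the package; it now appears on the search path
library(tools)
newly_attached <- setdiff(search(), attached_before)
print(newly_attached)
```

By the same logic, dplyr::sample_n(titanic, 10) is a way to call sample_n without running library(dplyr) first, as long as dplyr is installed.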

Installed and installing packages

Now, dplyr is actually not part of standard R. It’s installed separately. There are a multitude of R packages out there. Anyone can write one (yes, even you!!!). They are shared with the public through the [CRAN archive](https://cran.r-project.org/). To be listed on CRAN, packages need to meet specific criteria for documentation, testing, etc.

You can check to see what packages are installed using installed.packages()

In [6]:
installed.packages()
| Package | LibPath | Version | Priority | Depends | Imports | LinkingTo | Suggests | Enhances | License | License_is_FOSS | License_restricts_use | OS_type | MD5sum | NeedsCompilation | Built |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| abind | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.4-5 | NA | R (>= 1.5.0) | methods, utils | NA | NA | NA | LGPL (>= 2) | NA | NA | NA | NA | no | 3.4.0 |
| acepack | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.4.1 | NA | NA | NA | NA | testthat | NA | MIT + file LICENSE | NA | NA | NA | NA | yes | 3.4.0 |
| AER | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2-5 | NA | R (>= 2.13.0), car (>= 2.0-19), lmtest, sandwich, survival (>= 2.37-5), zoo | stats, Formula (>= 0.2-0) | NA | boot, dynlm, effects, fGarch, forecast, foreign, ineq, KernSmooth, lattice, longmemo, MASS, mlogit, nlme, nnet, np, plm, pscl, quantreg, rgl, ROCR, rugarch, sampleSelection, scatterplot3d, strucchange, systemfit, truncreg, tseries, urca, vars | NA | GPL-2 \| GPL-3 | NA | NA | NA | NA | no | 3.4.0 |
| affy | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.56.0 | NA | R (>= 2.8.0), BiocGenerics (>= 0.1.12), Biobase (>= 2.5.5) | affyio (>= 1.13.3), BiocInstaller, graphics, grDevices, methods, preprocessCore, stats, utils, zlibbioc | preprocessCore | tkWidgets (>= 1.19.0), affydata, widgetTools | NA | LGPL (>= 2.0) | NA | NA | NA | NA | yes | 3.4.2 |
| affyio | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.48.0 | NA | R (>= 2.6.0) | zlibbioc, methods | NA | NA | NA | LGPL (>= 2) | NA | NA | NA | NA | yes | 3.4.2 |
| affyPLM | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.54.0 | NA | R (>= 2.6.0), BiocGenerics (>= 0.3.2), affy (>= 1.11.0), Biobase (>= 2.17.8), gcrma, stats, preprocessCore (>= 1.5.1) | zlibbioc, graphics, grDevices, methods | preprocessCore | affydata, MASS | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.2 |
| agricolae | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2-8 | NA | R (>= 2.10) | klaR, MASS, nlme, cluster, spdep, AlgDesign, graphics | NA | NA | NA | GPL | NA | NA | NA | NA | no | 3.4.1 |
| AlgDesign | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.1-7.3 | NA | NA | NA | NA | NA | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.0 |
| ALL | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.20.0 | NA | R (>= 2.10), Biobase (>= 2.5.5) | NA | NA | rpart | NA | Artistic-2.0 | NA | NA | NA | NA | no | 3.4.0 |
| Amelia | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.7.5 | NA | R (>= 3.0.2), Rcpp (>= 0.11) | foreign, utils, grDevices, graphics, methods, stats | Rcpp (>= 0.11), RcppArmadillo | tcltk, Zelig | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.4 |
| annotate | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.56.2 | NA | R (>= 2.10), AnnotationDbi (>= 1.27.5), XML | Biobase, DBI, xtable, graphics, utils, stats, methods, BiocGenerics (>= 0.13.8), RCurl | NA | hgu95av2.db, genefilter, Biostrings (>= 2.25.10), IRanges, rae230a.db, rae230aprobe, tkWidgets, GO.db, org.Hs.eg.db, org.Mm.eg.db, hom.Hs.inp.db, humanCHRLOC, Rgraphviz, RUnit, | NA | Artistic-2.0 | NA | NA | NA | NA | no | 3.4.4 |
| AnnotationDbi | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.40.0 | NA | R (>= 2.7.0), methods, utils, stats4, BiocGenerics (>= 0.23.1), Biobase (>= 1.17.0), IRanges | methods, utils, DBI, RSQLite, stats4, BiocGenerics, Biobase, S4Vectors (>= 0.9.25), IRanges | NA | DBI (>= 0.2-4), RSQLite (>= 0.6-4), hgu95av2.db, GO.db, org.Sc.sgd.db, org.At.tair.db, KEGG.db, RUnit, TxDb.Hsapiens.UCSC.hg19.knownGene, hom.Hs.inp.db, org.Hs.eg.db, reactome.db, AnnotationForge, graph, EnsDb.Hsapiens.v75, BiocStyle, knitr | NA | Artistic-2.0 | NA | NA | NA | NA | no | 3.4.2 |
| AnnotationFilter | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2.0 | NA | R (>= 3.4.0) | utils, methods, GenomicRanges, lazyeval | NA | BiocStyle, knitr, testthat, RSQLite, org.Hs.eg.db | NA | Artistic-2.0 | NA | NA | NA | NA | no | 3.4.2 |
| AnnotationHub | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.10.1 | NA | BiocGenerics (>= 0.15.10) | utils, methods, grDevices, RSQLite, BiocInstaller, curl, AnnotationDbi (>= 1.31.19), S4Vectors, interactiveDisplayBase, httr, yaml | NA | IRanges, GenomicRanges, GenomeInfoDb, VariantAnnotation, Rsamtools, rtracklayer, BiocStyle, knitr, AnnotationForge, rBiopaxParser, RUnit, GenomicFeatures, MSnbase, mzR, Biostrings, SummarizedExperiment, ExperimentHub, gdsfmt | AnnotationHubData | Artistic-2.0 | NA | NA | NA | NA | yes | 3.4.2 |
| aod | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.3 | NA | R (>= 2.0.0), methods, stats | NA | NA | MASS, boot, lme4 | NA | GPL (>= 2) | NA | NA | NA | NA | NA | 3.4.0 |
| ape | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 5.1 | NA | R (>= 3.2.0) | nlme, lattice, graphics, methods, stats, tools, utils, parallel, Rcpp (>= 0.12.0) | Rcpp | gee, expm, igraph | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.4 |
| arm | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.10-1 | NA | R (>= 3.1.0), MASS, Matrix (>= 1.0), stats, lme4 (>= 1.0) | abind, coda, graphics, grDevices, methods, nlme, utils | NA | NA | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.4 |
| assertthat | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.2.0 | NA | NA | tools | NA | testthat | NA | GPL-3 | NA | NA | NA | NA | no | 3.4.0 |
| backports | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.1.2 | NA | R (>= 3.0.0) | utils | NA | NA | NA | GPL-2 | NA | NA | NA | NA | yes | 3.4.3 |
| base | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 3.4.0 | base | NA | NA | NA | methods | NA | Part of R 3.4.0 | NA | NA | NA | NA | NA | 3.4.0 |
| base64 | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.0 | NA | NA | openssl | NA | NA | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.0 |
| base64enc | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1-3 | NA | R (>= 2.9.0) | NA | NA | NA | png | GPL-2 \| GPL-3 | NA | NA | NA | NA | yes | 3.4.0 |
| bayesm | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 3.1-0.1 | NA | R (>= 3.2.0) | Rcpp (>= 0.12.0), utils, stats, graphics, grDevices | Rcpp, RcppArmadillo | knitr, rmarkdown | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.1 |
| bayesplot | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.5.0 | NA | R (>= 3.1.0) | dplyr (>= 0.7.1), ggplot2 (>= 2.2.1), reshape2, stats, utils, rlang, ggridges | NA | arm, gridExtra (>= 2.2.1), knitr (>= 1.16), loo (>= 1.1.0), rmarkdown (>= 1.0.0), rstan (>= 2.14.1), rstanarm (>= 2.14.1), rstantools (>= 1.4.0), scales, shinystan (>= 2.3.0), testthat, vdiffr | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.4 |
| BDgraph | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.51 | NA | NA | Matrix, igraph | NA | NA | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.4 |
| beanplot | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2 | NA | NA | NA | NA | vioplot, lattice | NA | GPL-2 | NA | NA | NA | NA | no | 3.4.0 |
| BH | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.66.0-1 | NA | NA | NA | NA | NA | NA | BSL-1.0 | NA | NA | NA | NA | no | 3.4.3 |
| BiasedUrn | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.07 | NA | NA | NA | NA | NA | NA | GPL-3 | NA | NA | NA | NA | yes | 3.4.0 |
| bibtex | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.4.2 | NA | R (>= 3.0.2) | stringr, utils | NA | testthat | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.1 |
| bindr | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1.1 | NA | NA | NA | NA | testthat | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.4 |
| ucminf | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.1-4 | NA | NA | NA | NA | numDeriv | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.0 |
| utf8 | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.1.4 | NA | R (>= 2.10) | NA | NA | knitr, rmarkdown, testthat | NA | Apache License (== 2.0) \| file LICENSE | NA | NA | NA | NA | yes | 3.4.4 |
| utils | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 3.4.0 | base | NA | NA | NA | methods, XML | NA | Part of R 3.4.0 | NA | NA | NA | NA | yes | 3.4.0 |
| uuid | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1-2 | NA | R (>= 2.9.0) | NA | NA | NA | NA | MIT + file LICENSE | NA | NA | NA | NA | yes | 3.4.0 |
| VariantAnnotation | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.24.5 | NA | R (>= 2.8.0), methods, BiocGenerics (>= 0.15.3), GenomeInfoDb (>= 1.11.4), GenomicRanges (>= 1.27.6), SummarizedExperiment (>= 1.5.3), Rsamtools (>= 1.23.10) | utils, DBI, zlibbioc, Biobase, S4Vectors (>= 0.13.13), IRanges (>= 2.3.25), XVector (>= 0.5.6), Biostrings (>= 2.33.5), AnnotationDbi (>= 1.27.9), BSgenome (>= 1.37.6), rtracklayer (>= 1.25.16), GenomicFeatures (>= 1.27.4) | S4Vectors, IRanges, XVector, Biostrings, Rsamtools | RUnit, AnnotationHub, BSgenome.Hsapiens.UCSC.hg19, TxDb.Hsapiens.UCSC.hg19.knownGene, SNPlocs.Hsapiens.dbSNP.20101109, SIFT.Hsapiens.dbSNP132, SIFT.Hsapiens.dbSNP137, PolyPhen.Hsapiens.dbSNP131, snpStats, ggplot2, BiocStyle | NA | Artistic-2.0 | NA | NA | NA | NA | yes | 3.4.3 |
| vcd | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.4-4 | NA | R (>= 2.4.0), grid | stats, utils, MASS, grDevices, colorspace, lmtest | NA | KernSmooth, mvtnorm, kernlab, HSAUR, coin | NA | GPL-2 | NA | NA | NA | NA | no | 3.4.3 |
| vegan | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.5-2 | NA | permute (>= 0.9-0), lattice, R (>= 3.2.0) | MASS, cluster, mgcv | NA | parallel, tcltk, knitr | NA | GPL-2 | NA | NA | NA | NA | yes | 3.4.4 |
| verification | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.42 | NA | R (>= 2.10), methods, fields, boot, CircStats, MASS, dtw | graphics, stats | NA | NA | NA | GPL (>= 2) | NA | NA | NA | NA | no | 3.4.0 |
| VGAM | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.0-5 | NA | R (>= 3.4.0), methods, stats, stats4, splines | NA | NA | VGAMdata, MASS, mgcv | NA | GPL-3 | NA | NA | NA | NA | yes | 3.4.3 |
| VIM | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 4.7.0 | NA | R (>= 3.1.0),colorspace,grid,data.table(>= 1.9.4) | car, grDevices, robustbase, stats, sp, vcd,MASS,nnet,e1071,methods,Rcpp,utils,graphics,laeken | Rcpp | dplyr | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.0 |
| viridis | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.5.1 | NA | R (>= 2.10), viridisLite (>= 0.3.0) | stats, ggplot2 (>= 1.0.1), gridExtra | NA | hexbin (>= 1.27.0), scales, MASS, knitr, dichromat, colorspace, rasterVis, httr, mapproj, vdiffr, svglite (>= 1.2.0), testthat, covr, rmarkdown, rgdal | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.4 |
| viridisLite | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.3.0 | NA | R (>= 2.10) | NA | NA | hexbin (>= 1.27.0), ggplot2 (>= 1.0.1), testthat, covr | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.3 |
| wateRmelon | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.22.1 | NA | R (>= 2.10), Biobase, limma, methods, matrixStats, methylumi, lumi, ROC, IlluminaHumanMethylation450kanno.ilmn12.hg19, illuminaio | Biobase | NA | RPMM | minfi | GPL-3 | NA | NA | NA | NA | no | 3.4.3 |
| WhatIf | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.5-9 | NA | R (>= 2.3.1) | lpSolve, pbmcapply, Zelig (>= 5.0-17) | NA | testthat | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.1 |
| whisker | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.3-2 | NA | NA | NA | NA | markdown | NA | GPL-3 | NA | NA | NA | NA | no | 3.4.0 |
| withr | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.1.2 | NA | R (>= 3.0.2) | stats, graphics, grDevices | NA | testthat, covr, lattice, DBI, RSQLite, methods, knitr, rmarkdown | NA | GPL (>= 2) | NA | NA | NA | NA | no | 3.4.4 |
| xfun | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.3 | NA | NA | tools | NA | testit, parallel, rstudioapi, tinytex, mime, markdown, knitr, rmarkdown | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.4 |
| XML | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 3.98-1.12 | NA | R (>= 2.13.0), methods, utils | NA | NA | bitops, RCurl | NA | BSD_2_clause + file LICENSE | NA | NA | NA | NA | yes | 3.4.4 |
| xml2 | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2.0 | NA | R (>= 3.1.0) | Rcpp | Rcpp (>= 0.12.12) | testthat, curl, covr, knitr, rmarkdown, magrittr, httr | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.3 |
| xtable | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.8-2 | NA | R (>= 2.10.0) | stats, utils | NA | knitr, lsmeans, spdep, splm, sphet, plm, zoo, survival | NA | GPL (>= 2) | NA | NA | NA | NA | no | 3.4.0 |
| xts | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.11-0 | NA | zoo (>= 1.7-12) | methods | zoo | timeSeries, timeDate, tseries, chron, fts, tis, RUnit | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.4 |
| XVector | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.18.0 | NA | R (>= 2.8.0), methods, BiocGenerics (>= 0.19.2), S4Vectors (>= 0.15.14), IRanges (>= 2.9.18) | methods, zlibbioc, BiocGenerics, S4Vectors, IRanges | S4Vectors, IRanges | Biostrings, drosophila2probe, RUnit | NA | Artistic-2.0 | NA | NA | NA | NA | yes | 3.4.2 |
| yaml | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.1.19 | NA | NA | NA | NA | testthat | NA | BSD_3_clause + file LICENSE | NA | NA | NA | NA | yes | 3.4.4 |
| Zelig | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 5.1.6 | NA | survival | AER, Amelia, coda, dplyr (>= 0.3.0.2), Formula, geepack, jsonlite, sandwich, MASS, MatchIt, maxLik, MCMCpack, methods, quantreg, survey, VGAM | NA | ei, eiPack, knitr, networkD3, optmatch, rmarkdown, testthat, tidyverse, ZeligChoice, ZeligEI, zeligverse | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.3 |
| ZeligChoice | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.9-6 | NA | NA | dplyr, Formula, jsonlite, MASS, methods, VGAM, Zelig (>= 5.1-1), | NA | testthat, knitr, zeligverse | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.0 |
| ZeligEI | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1-2 | NA | eiPack | dplyr, ei, Formula, jsonlite, MASS, MCMCpack, methods, Zelig (>= 5.1-0), | NA | knitr, testthat | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.0 |
| zeligverse | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1.1 | NA | NA | Amelia, dplyr, MatchIt, purrr, rstudioapi, tibble, WhatIf, Zelig, ZeligChoice, ZeligEI | NA | testthat | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.0 |
| zip | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.0.0 | NA | NA | NA | NA | covr, testthat, withr | NA | CC0 | NA | NA | NA | NA | yes | 3.4.0 |
| zlibbioc | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.24.0 | NA | NA | NA | NA | NA | NA | Artistic-2.0 + file LICENSE | NA | NA | NA | NA | yes | 3.4.2 |
| zoo | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.8-3 | NA | R (>= 3.1.0), stats | utils, graphics, grDevices, lattice (>= 0.20-27) | NA | coda, chron, DAAG, fts, ggplot2, mondate, scales, strucchange, timeDate, timeSeries, tis, tseries, xts | NA | GPL-2 \| GPL-3 | NA | NA | NA | NA | yes | 3.4.4 |
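
The full listing is wide and long, so it is often easier to keep just a few columns or to ask about one package at a time. A minimal sketch, relying on the fact that installed.packages() returns a character matrix with one row per package; the names pkg_summary and is_installed are illustrative, not part of R:

```r
# installed.packages() returns a character matrix, one row per installed package
ip <- installed.packages()

# Keep only the columns that are usually of interest
pkg_summary <- data.frame(
  Package = unname(ip[, "Package"]),
  Version = unname(ip[, "Version"]),
  Built   = unname(ip[, "Built"]),
  stringsAsFactors = FALSE
)
print(head(pkg_summary))

# Hypothetical helper: is a given package installed in any library?
is_installed <- function(pkg) pkg %in% ip[, "Package"]

print(is_installed("stats"))  # TRUE, since stats ships with R
print(is_installed("dplyr"))  # depends on your installation
```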

You can install new packages using the command install.packages(), and uninstall them again using remove.packages().

In [7]:
install.packages("auk")
also installing the dependency ‘countrycode’


The downloaded binary packages are in
        /var/folders/3l/tbmzdkss71152d8t9n1f8nx40000gn/T//Rtmpi7M8yz/downloaded_packages
In [8]:
remove.packages("auk")
Removing package from ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library’
(as ‘lib’ is unspecified)
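
In scripts that other people will run, a common pattern is to install a package only when it is missing. A minimal sketch, assuming a CRAN mirror is configured for the session; ensure_package is a hypothetical helper name, not a function that R provides:

```r
# Hypothetical helper: install a CRAN package only if it is not already available
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Only reached when the package is missing; needs internet access
    install.packages(pkg)
  }
  # Report (invisibly) whether the package can now be loaded
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# 'stats' ships with R, so this call does not install anything
ok <- ensure_package("stats")
print(ok)
```

requireNamespace() is used for the check because it returns TRUE or FALSE instead of stopping with an error, which is what library() does when a package is missing.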

Bioconductor

CRAN is home to many, many R packages. But there is a whole other world out there when it comes to bioinformatics in R. It’s called Bioconductor. Bioconductor is a comprehensive toolkit for all things having to do with high-throughput sequencing data processing and analysis. In this course, we will use the Bioconductor package DESeq2 to perform differential expression analysis. That analysis is the end of the pipeline, after QC, clipping and trimming, aligning and counting.

Installing Bioconductor packages

Bioconductor has its own installation procedure (and its own criteria for documentation, testing, etc.), separate from CRAN. Let’s have a look at the page for DESeq2.

In [9]:
source("https://bioconductor.org/biocLite.R")
biocLite("DESeq2")
Bioconductor version 3.6 (BiocInstaller 1.28.0), ?biocLite for help
A new version of Bioconductor is available after installing the most recent
  version of R; see http://bioconductor.org/install
BioC_mirror: https://bioconductor.org
Using Bioconductor 3.6 (BiocInstaller 1.28.0), R 3.4.0 (2017-04-21).
Installing package(s) ‘DESeq2’

The downloaded binary packages are in
        /var/folders/3l/tbmzdkss71152d8t9n1f8nx40000gn/T//Rtmpi7M8yz/downloaded_packages
Old packages: 'caTools', 'dbplyr', 'diffusionMap', 'foreign', 'fpc', 'future',
  'matrixStats', 'plotly', 'Rcpp', 'robustbase', 'stringi', 'trimcluster'
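
The biocLite() installer used here belongs to older Bioconductor releases (the output reports Bioconductor 3.6 on R 3.4.0). On newer R versions, Bioconductor packages are installed through the BiocManager package from CRAN. A minimal sketch, assuming a recent R with internet access; install_bioc is a hypothetical helper name, and the call is left commented out so that nothing gets installed by accident:

```r
# Hypothetical helper wrapping the BiocManager installation route
install_bioc <- function(pkgs) {
  # BiocManager itself is an ordinary CRAN package
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # Installs the requested packages from the Bioconductor release
  # that matches the running version of R
  BiocManager::install(pkgs)
}

# Uncomment to install DESeq2 together with the airway example data:
# install_bioc(c("DESeq2", "airway"))
```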

DESeq2 and S4 Objects

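We’ll walk through an example using a sample data set called ‘airway’. It ships in its own Bioconductor data package, also called airway, which has to be installed before library("airway") will work. In the run recorded here it was not installed, which is why the cells in this section stop with errors. Airway is an object of type ‘SummarizedExperiment’. This kind of object is the basis for many objects used in Bioconductor packages.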

In [10]:
library("airway")
data("airway")
se <- airway
Error in library("airway"): there is no package called ‘airway’
Traceback:

1. library("airway")
2. stop(txt, domain = NA)
In [11]:
str(se)
Error in str(se): object 'se' not found
Traceback:

1. str(se)
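
SummarizedExperiment is an S4 class. S4 objects store their data in named ‘slots’, and packages usually provide accessor functions so that users do not have to touch the slots directly. The toy class here is a sketch of that idea using only the methods package that ships with R; ToyExperiment and sampleTable are made-up names and have nothing to do with the real Bioconductor classes.

```r
library(methods)

# A toy S4 class with two slots (names invented for illustration)
setClass("ToyExperiment",
         slots = c(counts = "matrix", samples = "data.frame"))

# An accessor: a generic function plus a method for our class
setGeneric("sampleTable", function(x) standardGeneric("sampleTable"))
setMethod("sampleTable", "ToyExperiment", function(x) x@samples)

# Create an object: 3 genes (rows) by 2 samples (columns)
toy <- new("ToyExperiment",
           counts  = matrix(1:6, nrow = 3,
                            dimnames = list(paste0("gene", 1:3), c("s1", "s2"))),
           samples = data.frame(dex = c("untrt", "trt"),
                                row.names = c("s1", "s2")))

print(isS4(toy))         # TRUE
print(slotNames(toy))    # "counts" "samples"
print(sampleTable(toy))  # the accessor returns the 'samples' slot
```

Accessors such as assays(), colData() and metadata() play the same role for real SummarizedExperiment objects.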

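This tutorial gives a great introduction to the SummarizedExperiment object. We’ll take a peek, and then move on to DESeq2. The accessor functions used in the next cells (assays, rowRanges, colData, metadata) are not part of base R; they become available once the SummarizedExperiment package is attached.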

In [12]:
assays(se)
Error in assays(se): could not find function "assays"
Traceback:

In [13]:
assays(se)$counts
Error in assays(se): could not find function "assays"
Traceback:

In [14]:
rowRanges(se)
Error in rowRanges(se): could not find function "rowRanges"
Traceback:

In [15]:
colData(se)
Error in colData(se): could not find function "colData"
Traceback:

In [16]:
metadata(se)
Error in metadata(se): could not find function "metadata"
Traceback:

In [17]:
# Just a list - we can add elements

metadata(se)$formula <- counts ~ dex + albut

metadata(se)
Error in metadata(se)$formula <- counts ~ dex + albut: object 'se' not found
Traceback:

In [18]:
# subset the first five transcripts and first three samples
se[1:5, 1:3]
Error in eval(expr, envir, enclos): object 'se' not found
Traceback:

In [19]:
assays(se[1:5,1:3])$counts

Error in assays(se[1:5, 1:3]): could not find function "assays"
Traceback:

In [20]:
library("DESeq2")


dds <- DESeqDataSet(se, design = ~ cell + dex)
dds


Warning message:
“package ‘DESeq2’ was built under R version 3.4.2”
Loading required package: S4Vectors
Warning message:
“package ‘S4Vectors’ was built under R version 3.4.2”
Loading required package: stats4
Loading required package: BiocGenerics
Warning message:
“package ‘BiocGenerics’ was built under R version 3.4.2”
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:dplyr’:

    combine, intersect, setdiff, union

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, cbind, colMeans, colnames,
    colSums, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match,
    mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
    table, tapply, union, unique, unsplit, which, which.max, which.min


Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:dplyr’:

    first, rename

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: IRanges
Warning message:
“package ‘IRanges’ was built under R version 3.4.2”
Attaching package: ‘IRanges’

The following objects are masked from ‘package:dplyr’:

    collapse, desc, slice

Loading required package: GenomicRanges
Warning message:
“package ‘GenomicRanges’ was built under R version 3.4.3”
Loading required package: GenomeInfoDb
Warning message:
“package ‘GenomeInfoDb’ was built under R version 3.4.2”
Loading required package: SummarizedExperiment
Warning message:
“package ‘SummarizedExperiment’ was built under R version 3.4.3”
Loading required package: Biobase
Warning message:
“package ‘Biobase’ was built under R version 3.4.2”
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: DelayedArray
Warning message:
“package ‘DelayedArray’ was built under R version 3.4.2”
Loading required package: matrixStats
Warning message:
“package ‘matrixStats’ was built under R version 3.4.3”
Attaching package: ‘matrixStats’

The following objects are masked from ‘package:Biobase’:

    anyMissing, rowMedians

The following object is masked from ‘package:dplyr’:

    count


Attaching package: ‘DelayedArray’

The following objects are masked from ‘package:matrixStats’:

    colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

The following object is masked from ‘package:base’:

    apply

Error in is(se, "RangedSummarizedExperiment"): object 'se' not found
Traceback:

1. DESeqDataSet(se, design = ~cell + dex)
2. is(se, "RangedSummarizedExperiment")
In [21]:
# keep only rows (genes) with at least 10 reads in total across all samples

keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]
Error in counts(dds): object 'dds' not found
Traceback:

1. rowSums(counts(dds))
2. counts(dds)
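
The filtering step does not depend on DESeq2 itself: counts(dds) returns a genes-by-samples matrix, and the rest is ordinary matrix indexing. A small sketch with an invented count matrix (counts_toy and its values are made up for illustration):

```r
# Toy count matrix: 4 genes (rows) by 3 samples (columns)
counts_toy <- matrix(c( 0,  1,  2,
                        5,  4,  3,
                        0,  0,  0,
                       20, 15, 30),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(paste0("gene", 1:4),
                                     paste0("sample", 1:3)))

# Total reads per gene, summed over all samples
totals <- rowSums(counts_toy)
print(totals)  # gene1 = 3, gene2 = 12, gene3 = 0, gene4 = 65

# Logical vector: TRUE for genes with at least 10 reads in total
keep_toy <- totals >= 10

# Keep only those rows; drop = FALSE preserves the matrix shape
filtered <- counts_toy[keep_toy, , drop = FALSE]
print(filtered)  # only gene2 and gene4 remain
```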
In [22]:
# Specify reference level
# (in the airway data the treatment variable is dex, with levels "untrt" and "trt")

dds$dex <- factor(dds$dex, levels = c("untrt","trt"))

#alternative
dds$dex <- relevel(dds$dex, ref = "untrt")
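
Why the reference level needs setting at all: R orders factor levels alphabetically by default, and the first level is treated as the reference (control) group. A base R sketch with a made-up treatment vector; the level names "trt" and "untrt" are the ones airway uses for its dex column:

```r
# A made-up treatment factor
dex <- factor(c("trt", "untrt", "trt", "untrt"))

# Alphabetical order puts "trt" first, so it would be the reference
print(levels(dex))  # "trt" "untrt"

# relevel() moves the chosen level to the front without changing the data
dex <- relevel(dex, ref = "untrt")
print(levels(dex))  # "untrt" "trt"

# The values themselves are unchanged
print(as.character(dex))
```

With "untrt" as the reference, fold changes are reported as treated relative to untreated.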
In [23]:
dds <- DESeq(dds)
res <- results(dds)
res
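
Putting the steps of this section together, here is a sketch of the whole analysis as one script. It assumes that both DESeq2 and the airway data package are installed, and it skips the analysis with a message if they are not. The explicit contrast in results() is an addition for clarity rather than something the cells above use.

```r
# Check for the two required Bioconductor packages without stopping on failure
have_pkgs <- requireNamespace("DESeq2", quietly = TRUE) &&
  requireNamespace("airway", quietly = TRUE)

if (have_pkgs) {
  suppressPackageStartupMessages({
    library(DESeq2)
    library(airway)
  })

  # Load the example SummarizedExperiment
  data("airway")

  # Build the DESeq2 data set; dex (treatment) is the variable of interest
  dds <- DESeqDataSet(airway, design = ~ cell + dex)

  # Keep genes with at least 10 reads in total
  dds <- dds[rowSums(counts(dds)) >= 10, ]

  # Make untreated samples the reference level
  dds$dex <- relevel(dds$dex, ref = "untrt")

  # Fit the model and extract treated vs. untreated results
  dds <- DESeq(dds)
  res <- results(dds, contrast = c("dex", "trt", "untrt"))

  # Show the genes with the smallest adjusted p-values
  print(head(res[order(res$padj), ]))
} else {
  message("DESeq2 and/or airway not installed; skipping the example.")
}
```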