R libraries and Bioconductor¶
Packages and Libraries¶
R is at heart a collection of ‘packages’. There is a ‘base’ system that
contains the truly basic commands, such as the assignment operator <-
or the command to create a vector. In addition to that, there are
‘standard R’ packages that are included whenever you install R, whether
as the R kernel for the Jupyter notebook or as a program to run at the
command line or with RStudio. (I’ve shown some examples of these
different ways to run R in class.)
Libraries¶
Many packages, even those included in [standard R](https://www.r-project.org/), need to be ‘loaded’ before they can be used. In other words, they exist on your computer (or in your container), but your R session doesn’t know about them yet. This is because loading a package uses computer memory (RAM) to hold all its functions and variables. If all the available packages were loaded, you might not have any RAM left!
A consequence of this is that you often have to tell R explicitly that
you want to use a particular package. You do that using library().
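To make the difference between an installed package and a loaded one concrete, here is a small sketch. It uses stats4, chosen only because it ships with R but is not attached by default (any such package would do), and search(), which lists the packages R currently looks things up in.

```r
# Packages attached before loading anything extra
before <- search()

# stats4 ships with R but is not attached at start-up (used here purely as an illustration)
library(stats4)

# The search path now has an entry for the newly loaded package
after <- search()
newly_attached <- setdiff(after, before)
print(newly_attached)
```

Running search() before and after a library() call is a quick way to see exactly what a package brought into your session.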
Let’s read in the titanic data set to have something to play with.
In [1]:
titanic <- read.csv("titanic.csv")
In [2]:
head(titanic)
X | Name | PClass | Age | Sex | Survived | SexCode |
---|---|---|---|---|---|---|
1 | Allen, Miss Elisabeth Walton | 1st | 29.00 | female | 1 | 1 |
2 | Allison, Miss Helen Loraine | 1st | 2.00 | female | 0 | 1 |
3 | Allison, Mr Hudson Joshua Creighton | 1st | 30.00 | male | 0 | 0 |
4 | Allison, Mrs Hudson JC (Bessie Waldo Daniels) | 1st | 25.00 | female | 0 | 1 |
5 | Allison, Master Hudson Trevor | 1st | 0.92 | male | 1 | 0 |
6 | Anderson, Mr Harry | 1st | 47.00 | male | 1 | 0 |
There is a cool R function that will allow us to look at some random
rows from a data frame. It’s called sample_n. Let’s try it:
In [3]:
sample_n(titanic, 10)
Error in sample_n(titanic, 10): could not find function "sample_n"
Traceback:
Oops. It turns out sample_n is in the dplyr package. It’s installed
in your container, but R doesn’t know that! Let’s tell R we want to use
it:
In [4]:
library(dplyr)
Warning message:
“package ‘dplyr’ was built under R version 3.4.4”
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
In [5]:
sample_n(titanic, 10)
(row name) | X | Name | PClass | Age | Sex | Survived | SexCode |
---|---|---|---|---|---|---|---|
1123 | 1123 | Pekoniemi, Mr Edvard | 3rd | NA | male | 1 | 0 |
688 | 688 | Brahim, Mr Youssef | 3rd | NA | male | 0 | 0 |
346 | 346 | Bowenur, Mr Solomon | 2nd | NA | male | 0 | 0 |
15 | 15 | Baumann, Mr John D | 1st | NA | male | 0 | 0 |
422 | 422 | Givard, Mr Hans Christensen | 2nd | 30 | male | 0 | 0 |
256 | 256 | Swift, Mrs Frederick Joel (Margaret Welles Barron) | 1st | 46 | female | 1 | 1 |
525 | 525 | Pallas y Castello, Mr Emilio | 2nd | NA | male | 1 | 0 |
994 | 994 | Marinko, Mr Dmitri | 3rd | NA | male | 0 | 0 |
364 | 364 | Carter, Mrs Ernest Courtenay (Lillian Hughes) | 2nd | 44 | female | 0 | 1 |
1270 | 1270 | Van der Planke, Miss Augusta | 3rd | 18 | female | 0 | 1 |
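The "could not find function" error earlier came from a package that was installed but not attached. The sketch below checks both conditions separately. The helper name package_status is made up for this example, and the dplyr lines only run if dplyr happens to be installed on your machine.

```r
# Hypothetical helper: is a package installed on disk, and is it attached to this session?
package_status <- function(pkg) {
  c(installed = nzchar(system.file(package = pkg)),
    attached  = paste0("package:", pkg) %in% search())
}

print(package_status("stats"))   # part of standard R and attached at start-up
print(package_status("dplyr"))   # result depends on your own setup

# An installed package can also be used without library() via the :: operator
if (package_status("dplyr")[["installed"]]) {
  print(dplyr::sample_n(head(mtcars), 3))   # mtcars is a built-in example data set
}
```

The pkg::function form is handy when you need a single function and would rather not attach the whole package (and its masking of other functions).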
Installed and installing packages¶
Now, dplyr is actually not part of standard R. It’s installed
separately. There are a multitude of R packages out there. Anyone can
write one (yes, even you!!!). They are shared with the public using the
[CRAN archive](https://cran.r-project.org/). In order to be listed on
CRAN, packages need to meet specific criteria for documentation,
testing, etc.
You can check to see what packages are installed using
installed.packages():
In [6]:
installed.packages()
(row name) | Package | LibPath | Version | Priority | Depends | Imports | LinkingTo | Suggests | Enhances | License | License_is_FOSS | License_restricts_use | OS_type | MD5sum | NeedsCompilation | Built |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
abind | abind | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.4-5 | NA | R (>= 1.5.0) | methods, utils | NA | NA | NA | LGPL (>= 2) | NA | NA | NA | NA | no | 3.4.0 |
acepack | acepack | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.4.1 | NA | NA | NA | NA | testthat | NA | MIT + file LICENSE | NA | NA | NA | NA | yes | 3.4.0 |
AER | AER | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2-5 | NA | R (>= 2.13.0), car (>= 2.0-19), lmtest, sandwich, survival (>= 2.37-5), zoo | stats, Formula (>= 0.2-0) | NA | boot, dynlm, effects, fGarch, forecast, foreign, ineq, KernSmooth, lattice, longmemo, MASS, mlogit, nlme, nnet, np, plm, pscl, quantreg, rgl, ROCR, rugarch, sampleSelection, scatterplot3d, strucchange, systemfit, truncreg, tseries, urca, vars | NA | GPL-2 | GPL-3 | NA | NA | NA | NA | no | 3.4.0 |
affy | affy | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.56.0 | NA | R (>= 2.8.0), BiocGenerics (>= 0.1.12), Biobase (>= 2.5.5) | affyio (>= 1.13.3), BiocInstaller, graphics, grDevices, methods, preprocessCore, stats, utils, zlibbioc | preprocessCore | tkWidgets (>= 1.19.0), affydata, widgetTools | NA | LGPL (>= 2.0) | NA | NA | NA | NA | yes | 3.4.2 |
affyio | affyio | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.48.0 | NA | R (>= 2.6.0) | zlibbioc, methods | NA | NA | NA | LGPL (>= 2) | NA | NA | NA | NA | yes | 3.4.2 |
affyPLM | affyPLM | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.54.0 | NA | R (>= 2.6.0), BiocGenerics (>= 0.3.2), affy (>= 1.11.0), Biobase (>= 2.17.8), gcrma, stats, preprocessCore (>= 1.5.1) | zlibbioc, graphics, grDevices, methods | preprocessCore | affydata, MASS | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.2 |
agricolae | agricolae | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2-8 | NA | R (>= 2.10) | klaR, MASS, nlme, cluster, spdep, AlgDesign, graphics | NA | NA | NA | GPL | NA | NA | NA | NA | no | 3.4.1 |
AlgDesign | AlgDesign | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.1-7.3 | NA | NA | NA | NA | NA | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.0 |
ALL | ALL | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.20.0 | NA | R (>= 2.10), Biobase (>= 2.5.5) | NA | NA | rpart | NA | Artistic-2.0 | NA | NA | NA | NA | no | 3.4.0 |
Amelia | Amelia | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.7.5 | NA | R (>= 3.0.2), Rcpp (>= 0.11) | foreign, utils, grDevices, graphics, methods, stats | Rcpp (>= 0.11), RcppArmadillo | tcltk, Zelig | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.4 |
annotate | annotate | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.56.2 | NA | R (>= 2.10), AnnotationDbi (>= 1.27.5), XML | Biobase, DBI, xtable, graphics, utils, stats, methods, BiocGenerics (>= 0.13.8), RCurl | NA | hgu95av2.db, genefilter, Biostrings (>= 2.25.10), IRanges, rae230a.db, rae230aprobe, tkWidgets, GO.db, org.Hs.eg.db, org.Mm.eg.db, hom.Hs.inp.db, humanCHRLOC, Rgraphviz, RUnit, | NA | Artistic-2.0 | NA | NA | NA | NA | no | 3.4.4 |
AnnotationDbi | AnnotationDbi | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.40.0 | NA | R (>= 2.7.0), methods, utils, stats4, BiocGenerics (>= 0.23.1), Biobase (>= 1.17.0), IRanges | methods, utils, DBI, RSQLite, stats4, BiocGenerics, Biobase, S4Vectors (>= 0.9.25), IRanges | NA | DBI (>= 0.2-4), RSQLite (>= 0.6-4), hgu95av2.db, GO.db, org.Sc.sgd.db, org.At.tair.db, KEGG.db, RUnit, TxDb.Hsapiens.UCSC.hg19.knownGene, hom.Hs.inp.db, org.Hs.eg.db, reactome.db, AnnotationForge, graph, EnsDb.Hsapiens.v75, BiocStyle, knitr | NA | Artistic-2.0 | NA | NA | NA | NA | no | 3.4.2 |
AnnotationFilter | AnnotationFilter | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2.0 | NA | R (>= 3.4.0) | utils, methods, GenomicRanges, lazyeval | NA | BiocStyle, knitr, testthat, RSQLite, org.Hs.eg.db | NA | Artistic-2.0 | NA | NA | NA | NA | no | 3.4.2 |
AnnotationHub | AnnotationHub | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.10.1 | NA | BiocGenerics (>= 0.15.10) | utils, methods, grDevices, RSQLite, BiocInstaller, curl, AnnotationDbi (>= 1.31.19), S4Vectors, interactiveDisplayBase, httr, yaml | NA | IRanges, GenomicRanges, GenomeInfoDb, VariantAnnotation, Rsamtools, rtracklayer, BiocStyle, knitr, AnnotationForge, rBiopaxParser, RUnit, GenomicFeatures, MSnbase, mzR, Biostrings, SummarizedExperiment, ExperimentHub, gdsfmt | AnnotationHubData | Artistic-2.0 | NA | NA | NA | NA | yes | 3.4.2 |
aod | aod | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.3 | NA | R (>= 2.0.0), methods, stats | NA | NA | MASS, boot, lme4 | NA | GPL (>= 2) | NA | NA | NA | NA | NA | 3.4.0 |
ape | ape | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 5.1 | NA | R (>= 3.2.0) | nlme, lattice, graphics, methods, stats, tools, utils, parallel, Rcpp (>= 0.12.0) | Rcpp | gee, expm, igraph | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.4 |
arm | arm | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.10-1 | NA | R (>= 3.1.0), MASS, Matrix (>= 1.0), stats, lme4 (>= 1.0) | abind, coda, graphics, grDevices, methods, nlme, utils | NA | NA | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.4 |
assertthat | assertthat | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.2.0 | NA | NA | tools | NA | testthat | NA | GPL-3 | NA | NA | NA | NA | no | 3.4.0 |
backports | backports | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.1.2 | NA | R (>= 3.0.0) | utils | NA | NA | NA | GPL-2 | NA | NA | NA | NA | yes | 3.4.3 |
base | base | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 3.4.0 | base | NA | NA | NA | methods | NA | Part of R 3.4.0 | NA | NA | NA | NA | NA | 3.4.0 |
base64 | base64 | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.0 | NA | NA | openssl | NA | NA | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.0 |
base64enc | base64enc | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1-3 | NA | R (>= 2.9.0) | NA | NA | NA | png | GPL-2 | GPL-3 | NA | NA | NA | NA | yes | 3.4.0 |
bayesm | bayesm | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 3.1-0.1 | NA | R (>= 3.2.0) | Rcpp (>= 0.12.0), utils, stats, graphics, grDevices | Rcpp, RcppArmadillo | knitr, rmarkdown | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.1 |
bayesplot | bayesplot | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.5.0 | NA | R (>= 3.1.0) | dplyr (>= 0.7.1), ggplot2 (>= 2.2.1), reshape2, stats, utils, rlang, ggridges | NA | arm, gridExtra (>= 2.2.1), knitr (>= 1.16), loo (>= 1.1.0), rmarkdown (>= 1.0.0), rstan (>= 2.14.1), rstanarm (>= 2.14.1), rstantools (>= 1.4.0), scales, shinystan (>= 2.3.0), testthat, vdiffr | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.4 |
BDgraph | BDgraph | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.51 | NA | NA | Matrix, igraph | NA | NA | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.4 |
beanplot | beanplot | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2 | NA | NA | NA | NA | vioplot, lattice | NA | GPL-2 | NA | NA | NA | NA | no | 3.4.0 |
BH | BH | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.66.0-1 | NA | NA | NA | NA | NA | NA | BSL-1.0 | NA | NA | NA | NA | no | 3.4.3 |
BiasedUrn | BiasedUrn | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.07 | NA | NA | NA | NA | NA | NA | GPL-3 | NA | NA | NA | NA | yes | 3.4.0 |
bibtex | bibtex | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.4.2 | NA | R (>= 3.0.2) | stringr, utils | NA | testthat | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.1 |
bindr | bindr | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1.1 | NA | NA | NA | NA | testthat | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.4 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
ucminf | ucminf | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.1-4 | NA | NA | NA | NA | numDeriv | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.0 |
utf8 | utf8 | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.1.4 | NA | R (>= 2.10) | NA | NA | knitr, rmarkdown, testthat | NA | Apache License (== 2.0) | file LICENSE | NA | NA | NA | NA | yes | 3.4.4 |
utils | utils | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 3.4.0 | base | NA | NA | NA | methods, XML | NA | Part of R 3.4.0 | NA | NA | NA | NA | yes | 3.4.0 |
uuid | uuid | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1-2 | NA | R (>= 2.9.0) | NA | NA | NA | NA | MIT + file LICENSE | NA | NA | NA | NA | yes | 3.4.0 |
VariantAnnotation | VariantAnnotation | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.24.5 | NA | R (>= 2.8.0), methods, BiocGenerics (>= 0.15.3), GenomeInfoDb (>= 1.11.4), GenomicRanges (>= 1.27.6), SummarizedExperiment (>= 1.5.3), Rsamtools (>= 1.23.10) | utils, DBI, zlibbioc, Biobase, S4Vectors (>= 0.13.13), IRanges (>= 2.3.25), XVector (>= 0.5.6), Biostrings (>= 2.33.5), AnnotationDbi (>= 1.27.9), BSgenome (>= 1.37.6), rtracklayer (>= 1.25.16), GenomicFeatures (>= 1.27.4) | S4Vectors, IRanges, XVector, Biostrings, Rsamtools | RUnit, AnnotationHub, BSgenome.Hsapiens.UCSC.hg19, TxDb.Hsapiens.UCSC.hg19.knownGene, SNPlocs.Hsapiens.dbSNP.20101109, SIFT.Hsapiens.dbSNP132, SIFT.Hsapiens.dbSNP137, PolyPhen.Hsapiens.dbSNP131, snpStats, ggplot2, BiocStyle | NA | Artistic-2.0 | NA | NA | NA | NA | yes | 3.4.3 |
vcd | vcd | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.4-4 | NA | R (>= 2.4.0), grid | stats, utils, MASS, grDevices, colorspace, lmtest | NA | KernSmooth, mvtnorm, kernlab, HSAUR, coin | NA | GPL-2 | NA | NA | NA | NA | no | 3.4.3 |
vegan | vegan | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.5-2 | NA | permute (>= 0.9-0), lattice, R (>= 3.2.0) | MASS, cluster, mgcv | NA | parallel, tcltk, knitr | NA | GPL-2 | NA | NA | NA | NA | yes | 3.4.4 |
verification | verification | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.42 | NA | R (>= 2.10), methods, fields, boot, CircStats, MASS, dtw | graphics, stats | NA | NA | NA | GPL (>= 2) | NA | NA | NA | NA | no | 3.4.0 |
VGAM | VGAM | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.0-5 | NA | R (>= 3.4.0), methods, stats, stats4, splines | NA | NA | VGAMdata, MASS, mgcv | NA | GPL-3 | NA | NA | NA | NA | yes | 3.4.3 |
VIM | VIM | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 4.7.0 | NA | R (>= 3.1.0),colorspace,grid,data.table(>= 1.9.4) | car, grDevices, robustbase, stats, sp, vcd,MASS,nnet,e1071,methods,Rcpp,utils,graphics,laeken | Rcpp | dplyr | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.0 |
viridis | viridis | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.5.1 | NA | R (>= 2.10), viridisLite (>= 0.3.0) | stats, ggplot2 (>= 1.0.1), gridExtra | NA | hexbin (>= 1.27.0), scales, MASS, knitr, dichromat, colorspace, rasterVis, httr, mapproj, vdiffr, svglite (>= 1.2.0), testthat, covr, rmarkdown, rgdal | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.4 |
viridisLite | viridisLite | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.3.0 | NA | R (>= 2.10) | NA | NA | hexbin (>= 1.27.0), ggplot2 (>= 1.0.1), testthat, covr | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.3 |
wateRmelon | wateRmelon | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.22.1 | NA | R (>= 2.10), Biobase, limma, methods, matrixStats, methylumi, lumi, ROC, IlluminaHumanMethylation450kanno.ilmn12.hg19, illuminaio | Biobase | NA | RPMM | minfi | GPL-3 | NA | NA | NA | NA | no | 3.4.3 |
WhatIf | WhatIf | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.5-9 | NA | R (>= 2.3.1) | lpSolve, pbmcapply, Zelig (>= 5.0-17) | NA | testthat | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.1 |
whisker | whisker | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.3-2 | NA | NA | NA | NA | markdown | NA | GPL-3 | NA | NA | NA | NA | no | 3.4.0 |
withr | withr | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.1.2 | NA | R (>= 3.0.2) | stats, graphics, grDevices | NA | testthat, covr, lattice, DBI, RSQLite, methods, knitr, rmarkdown | NA | GPL (>= 2) | NA | NA | NA | NA | no | 3.4.4 |
xfun | xfun | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.3 | NA | NA | tools | NA | testit, parallel, rstudioapi, tinytex, mime, markdown, knitr, rmarkdown | NA | MIT + file LICENSE | NA | NA | NA | NA | no | 3.4.4 |
XML | XML | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 3.98-1.12 | NA | R (>= 2.13.0), methods, utils | NA | NA | bitops, RCurl | NA | BSD_2_clause + file LICENSE | NA | NA | NA | NA | yes | 3.4.4 |
xml2 | xml2 | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.2.0 | NA | R (>= 3.1.0) | Rcpp | Rcpp (>= 0.12.12) | testthat, curl, covr, knitr, rmarkdown, magrittr, httr | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.3 |
xtable | xtable | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.8-2 | NA | R (>= 2.10.0) | stats, utils | NA | knitr, lsmeans, spdep, splm, sphet, plm, zoo, survival | NA | GPL (>= 2) | NA | NA | NA | NA | no | 3.4.0 |
xts | xts | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.11-0 | NA | zoo (>= 1.7-12) | methods | zoo | timeSeries, timeDate, tseries, chron, fts, tis, RUnit | NA | GPL (>= 2) | NA | NA | NA | NA | yes | 3.4.4 |
XVector | XVector | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.18.0 | NA | R (>= 2.8.0), methods, BiocGenerics (>= 0.19.2), S4Vectors (>= 0.15.14), IRanges (>= 2.9.18) | methods, zlibbioc, BiocGenerics, S4Vectors, IRanges | S4Vectors, IRanges | Biostrings, drosophila2probe, RUnit | NA | Artistic-2.0 | NA | NA | NA | NA | yes | 3.4.2 |
yaml | yaml | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 2.1.19 | NA | NA | NA | NA | testthat | NA | BSD_3_clause + file LICENSE | NA | NA | NA | NA | yes | 3.4.4 |
Zelig | Zelig | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 5.1.6 | NA | survival | AER, Amelia, coda, dplyr (>= 0.3.0.2), Formula, geepack, jsonlite, sandwich, MASS, MatchIt, maxLik, MCMCpack, methods, quantreg, survey, VGAM | NA | ei, eiPack, knitr, networkD3, optmatch, rmarkdown, testthat, tidyverse, ZeligChoice, ZeligEI, zeligverse | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.3 |
ZeligChoice | ZeligChoice | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.9-6 | NA | NA | dplyr, Formula, jsonlite, MASS, methods, VGAM, Zelig (>= 5.1-1), | NA | testthat, knitr, zeligverse | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.0 |
ZeligEI | ZeligEI | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1-2 | NA | eiPack | dplyr, ei, Formula, jsonlite, MASS, MCMCpack, methods, Zelig (>= 5.1-0), | NA | knitr, testthat | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.0 |
zeligverse | zeligverse | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 0.1.1 | NA | NA | Amelia, dplyr, MatchIt, purrr, rstudioapi, tibble, WhatIf, Zelig, ZeligChoice, ZeligEI | NA | testthat | NA | GPL (>= 3) | NA | NA | NA | NA | no | 3.4.0 |
zip | zip | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.0.0 | NA | NA | NA | NA | covr, testthat, withr | NA | CC0 | NA | NA | NA | NA | yes | 3.4.0 |
zlibbioc | zlibbioc | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.24.0 | NA | NA | NA | NA | NA | NA | Artistic-2.0 + file LICENSE | NA | NA | NA | NA | yes | 3.4.2 |
zoo | zoo | /Library/Frameworks/R.framework/Versions/3.4/Resources/library | 1.8-3 | NA | R (>= 3.1.0), stats | utils, graphics, grDevices, lattice (>= 0.20-27) | NA | coda, chron, DAAG, fts, ggplot2, mondate, scales, strucchange, timeDate, timeSeries, tis, tseries, xts | NA | GPL-2 | GPL-3 | NA | NA | NA | NA | yes | 3.4.4 |
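The full table is hard to read. Since installed.packages() returns a character matrix with one row per package, you can pull out just the rows and columns you care about. A minimal sketch, using base and stats only because they are present in every R installation:

```r
ip <- installed.packages()

# One row per installed package, one column per field
print(dim(ip))

# Look up a few fields for two packages that ship with R
print(ip[c("base", "stats"), c("Package", "Version", "Priority")])

# A single package's version is available more directly
print(packageVersion("stats"))
```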
You can install new packages using the command install.packages(), and remove them again with remove.packages():
In [7]:
install.packages("auk")
also installing the dependency ‘countrycode’
The downloaded binary packages are in
/var/folders/3l/tbmzdkss71152d8t9n1f8nx40000gn/T//Rtmpi7M8yz/downloaded_packages
In [8]:
remove.packages("auk")
Removing package from ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library’
(as ‘lib’ is unspecified)
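The message "as ‘lib’ is unspecified" refers to the library directory R falls back on when you don't name one. The sketch below shows how to see which directory that is; the commented-out lines show where the lib argument would go and are not run here.

```r
# All library directories R searches, in order
lib_paths <- .libPaths()
print(lib_paths)

# install.packages() and remove.packages() default to the first entry
default_lib <- lib_paths[1]

# Packages currently present in that library
pkgs_in_default <- rownames(installed.packages(lib.loc = default_lib))
print(head(pkgs_in_default))

# To be explicit about the target library (not run here):
# install.packages("auk", lib = default_lib)
# remove.packages("auk", lib = default_lib)
```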
Bioconductor¶
CRAN is home to many, many R packages. But there is a whole other world
out there when it comes to bioinformatics in R. It’s called
Bioconductor. Bioconductor is a
comprehensive toolkit for all things having to do with high-throughput
sequencing data processing and analysis. In this course, we will use the
Bioconductor package DESeq2 to perform differential expression
analysis. It’s the last step of the pipeline, after QC, clipping and
trimming, aligning and counting.
Installing Bioconductor packages¶
Bioconductor has its own installation procedure (and its own criteria for documentation, testing, etc.), separate from CRAN. Let’s have a look at the page for DESeq2 and then install it:
In [9]:
source("https://bioconductor.org/biocLite.R")
biocLite("DESeq2")
Bioconductor version 3.6 (BiocInstaller 1.28.0), ?biocLite for help
A new version of Bioconductor is available after installing the most recent
version of R; see http://bioconductor.org/install
BioC_mirror: https://bioconductor.org
Using Bioconductor 3.6 (BiocInstaller 1.28.0), R 3.4.0 (2017-04-21).
Installing package(s) ‘DESeq2’
The downloaded binary packages are in
/var/folders/3l/tbmzdkss71152d8t9n1f8nx40000gn/T//Rtmpi7M8yz/downloaded_packages
Old packages: 'caTools', 'dbplyr', 'diffusionMap', 'foreign', 'fpc', 'future',
'matrixStats', 'plotly', 'Rcpp', 'robustbase', 'stringi', 'trimcluster'
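The biocLite() call above matches the Bioconductor 3.6 / R 3.4 setup shown in the output. On newer installations (R 3.5 or later), Bioconductor packages are installed through the BiocManager package from CRAN instead. A minimal sketch, assuming such a newer setup; the helper name install_bioc_if_missing is made up for this example, and the call that actually downloads packages is left commented out.

```r
# Hypothetical helper: install only the Bioconductor packages that are not already present
install_bioc_if_missing <- function(pkgs) {
  is_installed <- nzchar(vapply(pkgs, function(p) system.file(package = p), character(1)))
  missing_pkgs <- pkgs[!is_installed]
  if (length(missing_pkgs) > 0) {
    # BiocManager itself lives on CRAN
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(missing_pkgs)
  }
  invisible(missing_pkgs)
}

# install_bioc_if_missing(c("DESeq2", "airway"))   # downloads packages, so not run here
```

Checking first avoids re-downloading large packages such as DESeq2 every time the notebook is run.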
DESeq2 and S4 Objects¶
We’ll walk through an example using a sample data set called ‘airway’, which comes from the Bioconductor data package of the same name. The airway object is of class ‘SummarizedExperiment’. This class is the basis for many objects used in Bioconductor packages.
In [10]:
library("airway")
data("airway")
se <- airway
Error in library("airway"): there is no package called ‘airway’
Traceback:
1. library("airway")
2. stop(txt, domain = NA)
In [11]:
str(se)
Error in str(se): object 'se' not found
Traceback:
1. str(se)
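This tutorial gives a great introduction to the SummarizedExperiment object. We’ll take a peek, and then move on to DESeq2. Note that the airway package was not installed when this notebook was run, so se was never created and the cells below show errors rather than output; install airway first to follow along.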
In [12]:
assays(se)
Error in assays(se): could not find function "assays"
Traceback:
In [13]:
assays(se)$counts
Error in assays(se): could not find function "assays"
Traceback:
In [14]:
rowRanges(se)
Error in rowRanges(se): could not find function "rowRanges"
Traceback:
In [15]:
colData(se)
Error in colData(se): could not find function "colData"
Traceback:
In [16]:
metadata(se)
Error in metadata(se): could not find function "metadata"
Traceback:
In [17]:
# Just a list - we can add elements
metadata(se)$formula <- counts ~ dex + albut
metadata(se)
Error in metadata(se)$formula <- counts ~ dex + albut: object 'se' not found
Traceback:
In [18]:
# subset the first five transcripts and first three samples
se[1:5, 1:3]
Error in eval(expr, envir, enclos): object 'se' not found
Traceback:
In [19]:
assays(se[1:5,1:3])$counts
Error in assays(se[1:5, 1:3]): could not find function "assays"
Traceback:
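Calls such as assays(se) and colData(se) look different from the usual data frame style because SummarizedExperiment is an S4 class: its parts live in slots and are reached through accessor functions. The toy class below is a made-up stand-in (it is not the real SummarizedExperiment, and the names ToyExperiment, toyCounts and toySamples are invented for this sketch), but it runs with only the methods package that ships with R.

```r
library(methods)

# Toy class: a count matrix plus one row of sample annotation per matrix column
setClass("ToyExperiment", slots = c(counts = "matrix", samples = "data.frame"))

# Accessors are generic functions with a method for our class
setGeneric("toyCounts", function(x) standardGeneric("toyCounts"))
setMethod("toyCounts", "ToyExperiment", function(x) x@counts)

setGeneric("toySamples", function(x) standardGeneric("toySamples"))
setMethod("toySamples", "ToyExperiment", function(x) x@samples)

# Subsetting keeps the count matrix and the sample annotation in sync
setMethod("[", "ToyExperiment", function(x, i, j, ..., drop = TRUE) {
  new("ToyExperiment",
      counts  = x@counts[i, j, drop = FALSE],
      samples = x@samples[j, , drop = FALSE])
})

# Made-up data: 4 genes by 3 samples
toy <- new("ToyExperiment",
           counts  = matrix(1:12, nrow = 4,
                            dimnames = list(paste0("gene", 1:4), paste0("s", 1:3))),
           samples = data.frame(dex = c("untrt", "trt", "untrt"),
                                row.names = paste0("s", 1:3)))

# First two genes and first two samples, in the same style as se[1:5, 1:3]
small <- toy[1:2, 1:2]
print(toyCounts(small))
print(toySamples(small))
```

The point of the design is that one subsetting call updates every component at once, so the counts and the sample information cannot drift out of step.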
In [20]:
library("DESeq2")
dds <- DESeqDataSet(se, design = ~ cell + dex)
dds
Warning message:
“package ‘DESeq2’ was built under R version 3.4.2”
Loading required package: S4Vectors
Warning message:
“package ‘S4Vectors’ was built under R version 3.4.2”
Loading required package: stats4
Loading required package: BiocGenerics
Warning message:
“package ‘BiocGenerics’ was built under R version 3.4.2”
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:dplyr’:
combine, intersect, setdiff, union
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, cbind, colMeans, colnames,
colSums, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match,
mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which, which.max, which.min
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:dplyr’:
first, rename
The following object is masked from ‘package:base’:
expand.grid
Loading required package: IRanges
Warning message:
“package ‘IRanges’ was built under R version 3.4.2”
Attaching package: ‘IRanges’
The following objects are masked from ‘package:dplyr’:
collapse, desc, slice
Loading required package: GenomicRanges
Warning message:
“package ‘GenomicRanges’ was built under R version 3.4.3”
Loading required package: GenomeInfoDb
Warning message:
“package ‘GenomeInfoDb’ was built under R version 3.4.2”
Loading required package: SummarizedExperiment
Warning message:
“package ‘SummarizedExperiment’ was built under R version 3.4.3”
Loading required package: Biobase
Warning message:
“package ‘Biobase’ was built under R version 3.4.2”
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: DelayedArray
Warning message:
“package ‘DelayedArray’ was built under R version 3.4.2”
Loading required package: matrixStats
Warning message:
“package ‘matrixStats’ was built under R version 3.4.3”
Attaching package: ‘matrixStats’
The following objects are masked from ‘package:Biobase’:
anyMissing, rowMedians
The following object is masked from ‘package:dplyr’:
count
Attaching package: ‘DelayedArray’
The following objects are masked from ‘package:matrixStats’:
colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
The following object is masked from ‘package:base’:
apply
Error in is(se, "RangedSummarizedExperiment"): object 'se' not found
Traceback:
1. DESeqDataSet(se, design = ~cell + dex)
2. is(se, "RangedSummarizedExperiment")
In [21]:
# remove rows (genes) with fewer than 10 counts in total across all samples
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]
Error in counts(dds): object 'dds' not found
Traceback:
1. rowSums(counts(dds))
2. counts(dds)
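Because the cell above could not run, here is the same filtering idea on a small made-up count matrix, using base R only. The numbers and the names counts_toy and filtered are invented for illustration.

```r
# Toy count matrix: 4 genes (rows) by 3 samples (columns); values are made up
counts_toy <- matrix(c( 0,  1,  2,
                        5,  3,  4,
                        0,  0,  0,
                       20, 15, 30),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Total count per gene across all samples
gene_totals <- rowSums(counts_toy)
print(gene_totals)

# Keep genes with at least 10 counts in total
keep <- gene_totals >= 10
filtered <- counts_toy[keep, , drop = FALSE]
print(filtered)
```

With a real DESeqDataSet the only difference is that the matrix comes from counts(dds) and the logical vector is used to subset dds itself.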
In [22]:
# Specify the reference level of the treatment factor.
# airway stores the treatment in the dex column (levels "untrt" and "trt"),
# which is the variable used in the design above.
dds$dex <- factor(dds$dex, levels = c("untrt", "trt"))
# alternative
dds$dex <- relevel(dds$dex, ref = "untrt")
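Why set a reference level at all? R treats the first level of a factor as the baseline, and by default levels are sorted alphabetically, which is not always the comparison you want. A small sketch on a plain factor, with made-up labels that use the same level names as the dex column in airway:

```r
# Made-up treatment labels
dex <- factor(c("untrt", "trt", "untrt", "trt"))
print(levels(dex))               # alphabetical by default

# Make "untrt" the reference (first) level
dex_releveled <- relevel(dex, ref = "untrt")
print(levels(dex_releveled))

# The values themselves are unchanged; only the level order differs
print(identical(as.character(dex), as.character(dex_releveled)))
```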
In [23]:
# Run the DESeq2 pipeline on the dds object created above,
# then extract the results table
dds <- DESeq(dds)
res <- results(dds)
res
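If you want to see these last steps run without the airway data, DESeq2 can simulate a small data set for you. The sketch below assumes DESeq2 is installed and does nothing otherwise; the counts are simulated, so the results carry no biological meaning, and the names dds_sim and res_sim are made up for this example.

```r
if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  set.seed(1)
  # Simulated counts: 1000 genes, 6 samples, two conditions
  dds_sim <- makeExampleDESeqDataSet(n = 1000, m = 6)
  # Same two calls as above: fit the model, then extract the results table
  dds_sim <- DESeq(dds_sim)
  res_sim <- results(dds_sim)
  print(head(res_sim))
  ran_deseq <- TRUE
} else {
  message("DESeq2 is not installed; skipping the simulated example")
  ran_deseq <- FALSE
}
```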