Setting up a local install of Jupyter with multiple kernels (Python 3.5, Python 2.7, R, Julia)¶
The only installation you are recommended to do is Anaconda (Python 3.5), so that you have a backup when the OIT version is flaky. The other kernels and the Docker version are not required, and you should only install them if you are comfortable with command-line installs. Even the Anaconda 3.5 installation is optional if the OIT version works well for you.
Note: I have only done this on OSX 10.11.2 (El Capitan) with Xcode and the command-line compilers installed.
To install Anaconda for Python 3.5¶
Download and install Anaconda Python 3.5 from https://www.continuum.io/downloads
Open a terminal
conda update conda
conda update anaconda
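As a quick sanity check (an optional step I have added, not part of the official instructions), print the versions and launch the notebook server; close the server with Ctrl-C when done:
conda --version
jupyter --version
jupyter notebook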
(OPTIONAL) To install Python 2.7 as well¶
Open a terminal
conda create -n py27 python=2.7 anaconda
source activate py27
ipython kernel install
source deactivate
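To confirm the new kernel registered (an optional check I have added), list the kernelspecs - a python2 entry should appear alongside python3:
jupyter kernelspec list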
(OPTIONAL) To install R¶
- If you want conda to manage your R packages
conda install -y -c r r-irkernel r-recommended r-essentials
Note: The bug that required the following workaround appears to have been fixed, so these commands should no longer be necessary:
wget https://anaconda.org/r/ncurses/5.9/download/osx-64/ncurses-5.9-1.tar.bz2 \
https://anaconda.org/r/nlopt/2.4.2/download/osx-64/nlopt-2.4.2-1.tar.bz2 \
&& conda install --yes ncurses-5.9-1.tar.bz2 nlopt-2.4.2-1.tar.bz2
- If you have an existing R installation that you want to use
Start R
install.packages(c('rzmq','repr','IRkernel','IRdisplay'),
repos = c('http://irkernel.github.io/', getOption('repos')))
IRkernel::installspec()
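Whichever route you took, you can optionally confirm the R kernel registered (a check I have added) by listing kernelspecs from a terminal - an ir entry should appear:
jupyter kernelspec list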
(OPTIONAL) To install Julia¶
Download and install Julia from http://julialang.org/downloads/
Start Julia
Pkg.add("IJulia")
Pkg.build("IJulia")
(OPTIONAL) Installing Spark via Docker¶
- Install Docker (https://docs.docker.com/engine/installation/)
- Launch the Docker Quickstart Terminal
Be patient - this can take a while the first time you do it
When done, it should show something like this
                        ##         .
                  ## ## ##        ==
               ## ## ## ## ##    ===
           /"""""""""""""""""\___/ ===
      ~~~ {~~ ~~~~ ~~~ ~~~~ ~~~ ~ /  ===- ~~~
           \______ o           __/
             \    \         __/
              \____\_______/
docker is configured to use the default machine with IP 192.168.99.100
For help getting started, check out the docs at https://docs.docker.com
Note the IP address given - you will need this to access the notebook.
In the Docker terminal
docker run -d -p 8888:8888 jupyter/all-spark-notebook
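If you want your notebooks to persist outside the container, a common variant (my addition - /home/jovyan/work is the conventional working directory in the Jupyter Docker images, so treat the path as an assumption) mounts the current folder into the container:
docker run -d -p 8888:8888 -v "$PWD":/home/jovyan/work jupyter/all-spark-notebook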
For how to connect to a Spark cluster, see the official instructions
Testing the Docker installation¶
Check by typing in the Docker terminal
docker ps
Be patient - this can take a while the first time you do it.
It should show something like
CONTAINER ID   IMAGE                        COMMAND                  CREATED         STATUS         PORTS                    NAMES
965a6a80bf44   jupyter/all-spark-notebook   "tini -- start-notebo"   4 minutes ago   Up 4 minutes   0.0.0.0:8888->8888/tcp   big_kilby
Note the machine name (mine is big_kilby, yours will likely be different).
Open your browser at the following URL http://192.168.99.100:8888 (Use the IP given above)
This should bring you to a Jupyter notebook. Open a Python 3 notebook from the drop-down menu and test:
import pyspark
sc = pyspark.SparkContext('local[*]')
# do something to prove it works
rdd = sc.parallelize(range(1000))
rdd.takeSample(False, 5)
If successful, you should get a list of 5 integers after a short delay.
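Before closing the notebook, it is good practice (my suggestion, not in the original steps) to release the Spark resources:
sc.stop()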
Save and exit the notebook.
Clean up in the Docker terminal
docker stop big_kilby
exit
Use the machine name found with docker ps in place of big_kilby.
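If you also want to delete the stopped container rather than just stop it (optional, my addition), use docker rm:
docker rm big_kilby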