Jupyter is a development tool that excels at two things in particular: communicating with code, and prototyping new concepts. To me, the difference is something like “writing in Python” versus “developing in Python.”
A note before we go too much further: Joel Grus recently gave a talk at JupyterCon all about why notebooks are terrible, and he’s mostly right. I think you should read his slides first, then willingly and completely throw caution to the wind and continue reading my post here.
Communicating with Code
Here I’ll use a Jupyter notebook to demonstrate a super-duper-simple Python concept – importing and using Python packages.
By showing the code, output, and well-formatted explanation all inline, the reader can very quickly digest what’s going on.
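As a minimal sketch of what such a notebook cell might contain (the package and numbers here are just illustrative):

```python
# A single notebook cell: import a package and use it immediately
import math

radius = 2.0
area = math.pi * radius ** 2
print(f"A circle of radius {radius} has area {area:.2f}")
```

In a notebook, the printed output appears directly beneath the cell, right alongside any formatted explanation above it.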
One of the big advantages of sharing your work in notebook format is that the reader can immediately begin interacting with the code. To understand this a little better, it helps to dive into how Jupyter itself works.
Jupyter Notebook and the (IPython) Kernel
The default Jupyter Notebook is implemented as a web application. Typically, this runs locally – you start a server on your machine and open a web browser to localhost to use it. But there’s no reason it couldn’t be running on the internet, providing an interactive playground for your reader. In fact, Google has built a hosted notebook service with just that in mind, at https://colab.research.google.com/. This is what most TensorFlow tutorials seem to be written with, to provide exactly this immediate interactivity.
Prototyping new concepts
It’s this interactivity that provides another great advantage of Jupyter and the Notebook concept. When working on new code, it can make a lot of sense to prototype it inside a notebook so that you can make quick edits live against the running execution environment. For instance, let’s say I’m prototyping a class. I have some code to build my class and some code to use my class, and I’m jumping back and forth between them to get the desired end result. In a traditional development scenario, I might write all the class code towards the top of the file and the usage code below that, set a breakpoint before the usage code to step through it, and maybe do some live monkeying around in the output terminal.
Alternatively, it might be more productive to do this in Notebook format. I define all my imports up top in one cell that I only run at the beginning of a session, to set up the environment. In the next cell, I have all my class definition code, which I can re-run at will to re-define what the class is. Then below that, I have the cell to use the class. Rather than re-running everything up until my breakpoint each time I change things, and maybe realizing that something somewhere unrelated has broken, requiring me to move the breakpoint and so forth, I can just re-run each piece as necessary. And if I want a completely clean execution to make sure everything works together as-currently-written, I can reset the kernel and re-run all the cells before the one I’m working on. That might look something like this.
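As a sketch of that cell layout – the three cells are shown back to back here, with a toy class whose name and behavior are purely illustrative:

```python
# --- Cell 1: imports -- run once at the start of a session ---
from dataclasses import dataclass

# --- Cell 2: class definition -- re-run at will to redefine the class ---
@dataclass
class Counter:
    count: int = 0

    def increment(self, by: int = 1) -> int:
        self.count += by
        return self.count

# --- Cell 3: usage -- re-run to exercise the current definition ---
c = Counter()
c.increment()
c.increment(5)
print(c.count)  # 6
```

Editing Cell 2 and re-running Cells 2 and 3 gives you a fresh class definition without rebuilding any of the session state from Cell 1.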
Getting Set Up on macOS
Okay, I’ve sold you – now how do you use Jupyter in anger to take over the world? Jupyter itself is pretty easy to install – basically:
```shell
$ pip install jupyter
$ jupyter notebook
```
That leaves a lot out, though. Namely, you’re only getting a kernel with your native, system python environment.
Using Stock Jupyter Notebook with virtualenvwrapper
The trick here is to install an IPython kernel from your virtualenv, and run Jupyter notebook from your system environment. It doesn’t so much matter where you run the notebook from, but installing all of Jupyter is a bit heavier than installing just ipykernel, so you can save a little in the way of installed packages by not doing it all inside the virtualenv. Full disclosure, this workflow is basically lifted from https://anbasile.github.io/programming/2017/06/25/jupyter-venv/. It looks something like this (assuming you already use virtualenvwrapper):
```shell
$ pip install jupyter
$ mkvirtualenv -p python3 my_virtualenv
(my_virtualenv) $ pip install ipykernel
(my_virtualenv) $ ipython kernel install --user --name=my_virtualenv_kernel
# From a different, non-venv terminal:
$ jupyter notebook
```
The notebook will launch and open a tab in your browser, and in the menu Kernel>Change kernel you’ll find a default or two, plus the my_virtualenv_kernel you just made, which operates in the virtualenv my_virtualenv.
I anticipate that this will be a regular workflow for me, so I put the following lines in my shell profile:
```shell
# https://anbasile.github.io/programming/2017/06/25/jupyter-venv/
alias venvkernel_install='pip install ipykernel; ipython kernel install --user --name=`basename $VIRTUAL_ENV`'
```
This way, when I make a new virtualenv, I can run the command venvkernel_install to make sure my Jupyter notebook has access to a Python kernel in this environment, named thusly. I could have put this in my $WORKON_HOME/postmkvirtualenv hook, but I decided that might just clutter my package listings and lead me to forget how my setup works.
Using VSCode with a Jupyter Kernel for Prototyping
The Notebook environment is super useful for writing in Python, but developing applications and complex systems with Python may not lend itself so well to this editor. You’ll find yourself under-utilizing some features (such as textual cells), and probably wanting others that Notebook doesn’t implement (like code completion, detailed syntax highlighting, linting, and many others). For this type of work, I much prefer a full-featured IDE, like plugin-equipped VSCode. Luckily, even serious Python developers have enjoyed the convenience of the Jupyter workflow, so many IDEs have either mainline or plugin-based support for live-running code in an IPython kernel. In the case of VSCode, the “Python” extension is a must-have, but the Jupyter extension brings the run-in-kernel Jupyter workflow to the rest of those IDE features. By marking areas of your code as “cells” with `# %%`, you can run those cells one at a time in the IPython kernel of your choice, with output displayed sequentially or individually in a separate window.
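For example, a plain .py file laid out for this workflow might look like the following (the data and seed are purely illustrative); in VSCode, each `# %%` line gets its own run-cell controls:

```python
# %% Setup: run once -- state that's expensive to rebuild
import random

random.seed(42)
data = [random.gauss(0, 1) for _ in range(10_000)]

# %% Analysis: tweak and re-run freely against the live `data`
mean = sum(data) / len(data)
print(f"mean of {len(data)} samples: {mean:.4f}")
```

The file remains an ordinary Python script – the `# %%` markers are just comments to the interpreter – so the same code runs fine outside the IDE.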
Using the command palette (Shift-Cmd-P), you can select an existing Notebook or start a new one, and from the left side of the status bar along the bottom, you can select which IPython kernel you’d like to use for running your cells – presumably, you pick the one matching the virtualenv of the project at hand.
This way, you can retain all the Jupyter advantages of setting up your application’s state once, then live-developing it in the running interpreter without ever leaving the comfort of the file you’re working on in your IDE.
These are the two ways I make use of the Jupyter ecosystem day to day – let me know in the comments if you have a different workflow you love!