Jupyter Notebook 101
An introduction to Jupyter Notebooks
A Jupyter Notebook is a document for creating and sharing computational work: it combines runnable code, the output of that code, and explanatory text. Just as Google Docs is an online counterpart to Microsoft Word, a Jupyter Notebook is a document that can execute code from a browser. Notebooks have the file extension .ipynb (short for IPython Notebook), and each cell of code is executed in a REPL (Read-Eval-Print Loop) fashion.
What is a REPL? A Read-Eval-Print Loop is an interactive programming environment that works in four steps:
- 1. Read: The user writes some code and sends it to be processed.
- 2. Eval: The code is evaluated.
- 3. Print: The results are printed out for the user.
- 4. Loop: Steps 1-3 repeat, so the user can work iteratively.
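The four steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how IPython is implemented; the function name `repl` and the list-of-strings input are assumptions made for the example.

```python
def repl(lines):
    """Feed each input line through read -> eval -> print until input runs out."""
    namespace = {}   # state that survives from one iteration to the next
    outputs = []
    for line in lines:                          # Loop
        source = line.strip()                   # Read
        try:
            result = eval(source, namespace)    # Eval (expressions)
        except SyntaxError:
            exec(source, namespace)             # Eval (statements such as assignments)
            result = None
        if result is not None:
            outputs.append(repr(result))
            print(outputs[-1])                  # Print
    return outputs


repl(["x = 2", "x * 21"])
```

The first line is an assignment, so nothing is printed; the second is an expression, so its value is echoed back, much like the output shown under a notebook cell.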
Jupyter is an open-source project. Many organizations and universities use it for data science, data discovery, and visualization workflows, and it is one of the most popular data science interfaces for code execution.
Accessed via a browser, it has a kernel (in the video below it is Python) and a UI to access the file system:
Jupyter Notebooks are built on the IPython kernel, known for its REPL capabilities. A REPL interface takes user input, executes the code, and presents the result to the user. This feedback loop can be repeated many times in each Notebook. IPython itself is an interactive command-line shell through which we can execute Python commands.
Jupyter Notebooks are made up of cells, each holding a small, self-contained piece of code. Each cell can be executed to get the result from a programming environment. For example, a cell might make a request to an API from a third-party system, and then return the result.
A cell with one line of code. The evaluation is printed below the cell.
In addition to code cells, you can add text content (in Markdown format!).
When the Markdown cell is run, it displays the formatted text inside the Notebook:
Rendering the markdown inside the notebook
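On disk, a .ipynb file is plain JSON, and code cells and Markdown cells differ mainly in their `cell_type` field. The sketch below builds a minimal two-cell notebook in the version 4 format using only the standard library; the cell contents are made up for illustration, and a notebook saved by Jupyter will carry more metadata than this.

```python
import json

# A minimal notebook: one Markdown cell followed by one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Weather report\n", "Notes for the reader go here."],
        },
        {
            "cell_type": "code",
            "execution_count": None,   # not run yet
            "metadata": {},
            "outputs": [],
            "source": ["print('hello from a code cell')"],
        },
    ],
}

# Serialize to the text that would be stored in a .ipynb file, then read it back.
ipynb_text = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(ipynb_text)["cells"]]
print(cell_types)
```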
Variables created in a cell are stored in the notebook kernel, and are available for use in subsequent cells:
A variable defined in the first cell can be referenced in the second cell
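One way to picture this is a single dictionary of names that every cell reads from and writes to. The sketch below imitates that with `exec` and a shared namespace; it is a simplified stand-in for what a kernel does, and the variable names are invented for the example.

```python
# Stand-in for the state a kernel keeps alive between cell runs.
kernel_namespace = {}

cell_1 = "city = 'Oslo'"
cell_2 = "greeting = 'Weather for ' + city"   # relies on a name defined in cell 1

for cell_source in (cell_1, cell_2):
    exec(cell_source, kernel_namespace)

print(kernel_namespace["greeting"])
```

If the second cell were run before the first, `city` would not exist yet and Python would raise a `NameError`, which is the same thing you see in a notebook when cells are run out of order.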
Cells in the notebook build on the results of previously run cells. One way to think of a cell is as a microservice. The microservices are called in order, and complete a full application when the Notebook is completed.
In the video below, we access a JSON file for an API key, and then make an API call for weather data:
Once we have collected the API data, we can load it into a dataframe and visualize the data.
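As a rough sketch of that flow without touching the network: read an API key from a JSON file, build the request URL, and flatten a response into rows that could be handed to a dataframe. Everything specific here is an assumption for illustration: the `api_key` field name, the `api.example.com` endpoint and its query parameters, and the sample response are all made up and do not come from the video.

```python
import json
import os
import tempfile
from urllib.parse import urlencode


def load_api_key(path):
    """Read the key from a JSON file shaped like {"api_key": "..."} (assumed layout)."""
    with open(path) as f:
        return json.load(f)["api_key"]


def build_weather_url(api_key, city):
    """Build a request URL for a hypothetical weather endpoint."""
    query = urlencode({"q": city, "appid": api_key})
    return "https://api.example.com/weather?" + query


def to_rows(response):
    """Keep only the fields we want to chart, one dict per row."""
    return [{"time": e["time"], "temp_c": e["temp_c"]} for e in response["forecast"]]


# Write a throwaway credentials file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "credentials.json")
    with open(key_path, "w") as f:
        json.dump({"api_key": "demo-key"}, f)
    url = build_weather_url(load_api_key(key_path), "Oslo")

# Made-up response standing in for what the API call would return.
sample_response = {
    "forecast": [
        {"time": "09:00", "temp_c": 4.5, "wind": 3},
        {"time": "12:00", "temp_c": 7.0, "wind": 5},
    ]
}
rows = to_rows(sample_response)
print(url)
print(rows)
```

In a notebook, each of these steps would typically live in its own cell, and a list of dicts like `rows` is a shape that dataframe libraries such as pandas accept directly.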
DevOps/SRE groups create Runbooks to automate their workflows. A Runbook consists of steps that must be carried out in order to complete a task. Many of these steps are accompanied by scripts that help in completing the step. By placing the Runbook inside a Notebook container, the code can actually be evaluated while the Runbook is being used!
Applying the Jupyter Notebook concept to infrastructure workflows simplifies the work of a DevOps/SRE engineer. It also helps in decoupling and debugging various systems quickly.
PS: unSkript uses Jupyter Notebook under the hood. You can build your own runbooks, or use those shared by many engineers at the awesome-cloud-ops repo, and run them via the Docker container.
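To make the Runbook idea concrete, here is a minimal sketch in which each step is a function, steps run in order, and later steps read what earlier steps recorded. The step names, the hard-coded disk figure, and the 85% threshold are all invented for the example; a real runbook step would call out to actual infrastructure, and this is not how unSkript implements its runbooks.

```python
def check_disk(context):
    # Hard-coded stand-in for a real measurement.
    context["disk_used_pct"] = 91
    return "disk usage recorded"


def decide_cleanup(context):
    # Builds on the result of the previous step, like a later notebook cell.
    context["needs_cleanup"] = context["disk_used_pct"] > 85
    return "cleanup needed" if context["needs_cleanup"] else "no action"


def run_runbook(steps):
    """Run the steps in order, sharing one context dict between them."""
    context = {}
    log = []
    for number, step in enumerate(steps, start=1):
        outcome = step(context)
        log.append(f"step {number} ({step.__name__}): {outcome}")
    return context, log


context, log = run_runbook([check_disk, decide_cleanup])
for line in log:
    print(line)
```

In a Notebook, each function body would become a cell, with a Markdown cell above it explaining what the step is for and when to stop.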
There are many ways to install and run Jupyter Notebooks. Over the years, cloud platforms and several newer startups have built products around Jupyter Notebooks, such as Google Colab, Deepnote, and Naas.ai.
I’m using the Anaconda Distribution of Jupyter Notebook. Search for “anaconda download”; the first link you find is probably from Anaconda.com, which distributes Jupyter Notebook along with several other products like Anaconda Server.
- 1. Download and install the Anaconda open-source Distribution; it fits our use case of building and running a basic Jupyter Notebook.
- 2. Open the Anaconda Navigator.
- 3. Launch Jupyter Notebook! (Make sure you are not launching JupyterLab.)
Next up, hit New -> Python 3 (ipykernel)
Each Jupyter notebook contains multiple cells, which contain Python code. The code in a cell is executed when you run that cell. Cells can also be configured for text (Markdown), so that you can add instructions and guidance for the code snippet that follows.
Please note that each cell carries over the programming context from the cells run before it.
A notebook (containing multiple cells) can be run in one go by using Cell -> Run All (see the Cell menu in the image above).
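Conceptually, running a whole notebook means walking the cells from top to bottom, skipping the Markdown ones, and executing each code cell against the same shared state. The sketch below imitates that on a hand-written notebook dictionary; `run_all` is a made-up helper for illustration, and Jupyter itself sends each cell to the kernel rather than calling `exec` like this.

```python
import contextlib
import io


def run_all(notebook):
    """Execute every code cell in order and collect what each one printed."""
    namespace = {}   # shared by all cells, like kernel state
    outputs = []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue   # Markdown cells are rendered, not executed
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec("".join(cell["source"]), namespace)
        outputs.append(buffer.getvalue())
    return outputs


# Hand-written example notebook with one Markdown cell and two code cells.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Setup\n"]},
        {"cell_type": "code", "source": ["total = 2 + 3\n"]},
        {"cell_type": "code", "source": ["print(total * 2)\n"]},
    ]
}
print(run_all(notebook))
```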
Alternatively, if you are a command line fan, you can use the command below to start Jupyter and open a specific notebook.
> jupyter notebook notebook.ipynb
As discussed above, unSkript uses open-source Jupyter Notebooks under the hood and provides a seamless way of debugging and triggering complex infrastructure scripts.
unSkript has many open-source runbooks (aka notebooks) at the awesome-cloud-ops GitHub repository. So give us a star and raise an issue if you feel we are missing something.
A few more resources for getting started with Jupyter Notebooks: