Example: Creating a custom environment from a requirements.txt in Jupyter¶

Alexander Dunkel, Institute of Cartography, TU Dresden

Suppose the author of a repository on GitHub used a specific Python environment. We want to re-create this environment in Jupyter.

The author provided a requirements.txt.
This requirements.txt is usually only a starting point. Two issues are immediately visible:
- Several packages are needed but not included in the list
- The author did not specify which Python version was used

There are more problems, which I go through step by step below.
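A quick way to see how each entry of a requirements file is pinned is to parse it line by line. The following is a minimal sketch and not part of the original repository: the helper name summarize_requirements is hypothetical, and the pattern only handles simple "name, operator, version" lines (no extras, URLs or multiple specifiers).

```python
import re

# A package name, optionally followed by one comparison operator and a version,
# e.g. "numpy==1.19.4" or just "torchvision".
REQ_PATTERN = re.compile(
    r"^([A-Za-z0-9_.\-]+)(?:\s*(==|>=|<=|~=|!=|<|>)\s*([^\s;#,]+))?"
)


def summarize_requirements(lines):
    """Return a list of (name, operator, version) tuples, one per requirement.

    Blank lines and comment lines are skipped. Unpinned packages are
    reported with operator and version set to None.
    """
    summary = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = REQ_PATTERN.match(stripped)
        if match:
            summary.append(match.groups())
    return summary


# Two sample lines; in practice, read them with open("requirements.txt").
sample = ["numpy==1.19.4", "torchvision"]
for name, operator, version in summarize_requirements(sample):
    status = f"{operator}{version}" if operator else "not pinned"
    print(f"{name}: {status}")
```

Entries reported as "not pinned" and exact "==" pins are the ones most likely to need attention when the dependency resolver fails.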

Starting point¶

PyTorch is scientific software and, even on Linux, conda still seems to be the preferred way to install these large scientific Python packages.

Create the environment using conda:

- conda is preferred here because it also installs C dependencies that cannot be installed with pip (e.g. g++, gdal bindings)
- venv does not work here because we cannot choose the Python version (it depends on the system Python(s) available)
- We pin pytorch==1.8.0 as the specific version we want and let conda decide which matching Python version to use
- torchvision==0.9.0 is needed because torchvision is tightly coupled with torch; the PyTorch website provides official installation instructions, also for past versions, which state 0.9.0 as the matching version
- We use --channel pytorch as the source of packages, according to the PyTorch instructions mentioned above
- We also use the cpuonly flag to indicate that we don't have a GPU available, which makes things easier (e.g. we don't need to care about CUDA compatibility)
- We use --prefix /envs/cagis_env as the location of the environment. This folder is bind-mounted from the outside, meaning that the environment is stored at a persistent location outside of this Jupyter container

Note: Remove the > /dev/null to see the (long) output in cells below.

Create environment¶

%%bash
conda create \
    --prefix /envs/cagis_env \
    pytorch==1.8.0 torchvision==0.9.0 cpuonly --channel pytorch  -y --quiet > /dev/null

Note that installing torch directly via requirements.txt did not work due to unsolvable dependencies.
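Because conda was left to pick the Python version, it is worth recording which one it chose. The sketch below asks an interpreter for its version via subprocess; the helper name python_version_of is hypothetical, and the environment path is the one used in the conda command above.

```python
import subprocess
import sys


def python_version_of(interpreter):
    """Return the output of `<interpreter> --version`, e.g. 'Python 3.x.y'."""
    result = subprocess.run(
        [interpreter, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Python 3 prints the version to stdout; fall back to stderr just in case.
    return (result.stdout or result.stderr).strip()


# Demonstrated on the interpreter running this cell. For the new environment,
# pass "/envs/cagis_env/bin/python" instead.
print(python_version_of(sys.executable))
```

Noting the reported version next to the pinned packages addresses the missing Python version in the original requirements.txt.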

Now, make a copy of requirements.txt (e.g. requirements-base.txt):

- Remove pytorch and add tensorflow-cpu>=2.6.0.
- We already installed pytorch above.
- tensorflow is used in the author's notebook, but was not added to the requirements.
- I also loosened the version pins a bit (>=2.6.0, <=0.8.1 etc.), which helps the dependency resolver.

numpy==1.19.4
geopandas<=0.8.1
matplotlib==3.4.3
dgl==0.6.1
scikit-learn==0.24.2
fiona==1.8.13
tensorflow-cpu>=2.6.0
torchvision
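The manual edits described above could also be scripted. This is only a sketch under assumptions: the original requirements.txt is not shown here, so the input lines below are invented samples, and the helper names requirement_name and make_base_requirements are hypothetical.

```python
import re


def requirement_name(line):
    """Extract the bare, lower-cased package name from a simple requirement line."""
    return re.split(r"[=<>~!\s;\[]", line.strip(), maxsplit=1)[0].lower()


def make_base_requirements(lines, drop, replace, extra):
    """Derive a new requirement list from an existing one.

    drop:    package names to remove (already installed by other means)
    replace: mapping of package name -> full replacement line (loosened pin)
    extra:   requirement lines to append if the package is not yet listed
    """
    result = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        name = requirement_name(stripped)
        if name in drop:
            continue
        result.append(replace.get(name, stripped))
    existing = {requirement_name(entry) for entry in result}
    for entry in extra:
        if requirement_name(entry) not in existing:
            result.append(entry)
    return result


# Invented sample input; the real file would be read from requirements.txt.
original = ["torch==1.8.0", "geopandas==0.8.1", "numpy==1.19.4"]
base = make_base_requirements(
    original,
    drop={"torch", "pytorch"},
    replace={"geopandas": "geopandas<=0.8.1"},
    extra=["tensorflow-cpu>=2.6.0"],
)
print("\n".join(base))
```

Writing the result to requirements-base.txt keeps the author's original file untouched, so the changes stay easy to compare.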

We will install these additional dependencies via pip.

For fiona, geopandas (etc.), we also need to install the C dependency libgdal-dev.

!apt-get update > /dev/null && apt-get install libgdal-dev -y > /dev/null

Install additional dependencies from requirements-base.txt:

!/envs/cagis_env/bin/python -m pip install -r requirements-base.txt

Link environment to jupyter¶

Finally, install ipykernel so that this environment can be loaded in Jupyter:

%%bash
/envs/cagis_env/bin/python -m pip install ipykernel
/envs/cagis_env/bin/python -m ipykernel install --user --name=cagis_env
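To confirm that the kernel was registered, one can look at the kernel.json file that the install step writes. The sketch below assumes the usual per-user location on Linux (~/.local/share/jupyter/kernels), which may differ if JUPYTER_DATA_DIR is set; the helper name kernel_argv is hypothetical.

```python
import json
from pathlib import Path


def kernel_argv(name, kernels_dir=None):
    """Return the launch command stored in <kernels_dir>/<name>/kernel.json.

    Returns None if no such kernel spec file exists.
    """
    if kernels_dir is None:
        # Assumed default location for `--user` installs on Linux.
        kernels_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    spec_file = Path(kernels_dir) / name / "kernel.json"
    if not spec_file.is_file():
        return None
    with spec_file.open(encoding="utf-8") as handle:
        return json.load(handle).get("argv")


# The first entry of the command should be the environment's interpreter.
print(kernel_argv("cagis_env"))
```

If this prints None, the kernel spec was not found at the assumed location; `jupyter kernelspec list` shows where Jupyter actually looks.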

Now, hit F5 (or click refresh) and load the cagis_env kernel in the top-right corner of JupyterLab.
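Once the new kernel is selected, a short cell can confirm that the notebook really runs inside the environment and which package versions were resolved. This is a sketch that assumes Python 3.8 or newer (for importlib.metadata) and that PyTorch is registered under the distribution name torch; the helper name environment_report is hypothetical.

```python
import sys
from importlib import metadata


def environment_report(names):
    """Map each distribution name to its installed version, or None if missing."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


# With the new kernel active, this should point to /envs/cagis_env.
print(sys.prefix)
print(environment_report(["torch", "torchvision", "numpy", "geopandas"]))
```

A value of None for any package indicates that the corresponding install step did not reach this environment.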

Create an HTML version of the notebook¶

!jupyter nbconvert --to html_toc \
    --output-dir=. dependencies.ipynb \
    --template=./nbconvert.tpl \
    --ExtractOutputPreprocessor.enabled=False >&- 2>&-