Managing Python Dependencies With Git Hooks

When developing Python applications, it’s good practice to “pin” your dependencies in a requirements.txt file with an explicit version number (==) for each library.

This enables repeatable deployments: source control always defines exactly one set of dependency versions. Otherwise, if requirements.txt lists Flask or Flask>=0.10.0, a deploy could install a new release of Flask that no one has actually tested locally.

The standard way to do this is to use a virtualenv and do:

pip install Flask
pip freeze > requirements.txt

Resulting in something like:

Flask==0.10.1
Jinja2==2.7.1
MarkupSafe==0.18
Werkzeug==0.9.4
argparse==1.2.1
itsdangerous==0.23
wsgiref==0.1.2

Unfortunately, that leaves a bunch of information about Flask’s dependencies in our requirements.txt.

Flask specifies its dependencies as loosely as possible (Jinja2>=2.4), and so does Jinja2 (MarkupSafe). If we pin the whole tree by hand like this, we’ll soon start missing out on new versions of libraries in our dependency tree, ones that might have better performance, new features, or pending deprecations. It’s also just a lot of noise.

Fortunately, with pip we can make a requirements.txt that only contains Flask and then do:

pip freeze -r requirements.txt > requirements-pinned.txt

Resulting in:

Flask==0.10.1
## The following requirements were added by pip --freeze:
Jinja2==2.7.1
MarkupSafe==0.18
Werkzeug==0.9.4
argparse==1.2.1
itsdangerous==0.23
wsgiref==0.1.2
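
To list only the entries pip added on top of requirements.txt, the marker comment in that output can be used as a split point. This is a minimal sketch: the function name added_by_freeze is hypothetical, and it matches only the start of the marker line because the exact wording may differ between pip versions.

```shell
#!/bin/bash
# Hypothetical helper: print the lines of a `pip freeze -r` output file ($1)
# that follow the "added by pip" marker comment.
added_by_freeze() {
    # Print a line only once the marker has been seen on an earlier line.
    awk 'found { print } /^## The following requirements were added by pip/ { found = 1 }' "$1"
}
```

Running added_by_freeze requirements-pinned.txt on the output above would print everything from Jinja2==2.7.1 down, leaving out Flask itself.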

We can store both requirements files in version control. requirements.txt says what versions we support for top-level dependencies and requirements-pinned.txt has an explicit version for every dependency in the entire dependency tree, to be used in our deployment script.
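
Since the deployment script relies on every line of requirements-pinned.txt carrying an explicit version, a small check can confirm that before deploying. This is only a sketch: the function name unpinned_lines is hypothetical, and it treats any non-comment line without == as unpinned.

```shell
#!/bin/bash
# Hypothetical helper: print every requirement line in the given file ($1)
# that is not pinned with "==". Prints nothing when everything is pinned.
unpinned_lines() {
    # Drop blank lines and comments, then keep only lines lacking "==".
    grep -v -E '^[[:space:]]*(#|$)' "$1" | grep -v '==' || true
}
```

A deploy script could refuse to continue when unpinned_lines requirements-pinned.txt prints anything.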

With git hooks, we can automate all of this.

This pre-commit hook ensures that locally-installed dependencies are pinned before every commit:

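#!/bin/bash

# requirements-pinned.txt is generated below, so refuse hand-staged edits to it.
if git diff --name-only --cached | grep -qxF "requirements-pinned.txt"; then
    echo "Modify requirements.txt instead of requirements-pinned.txt"
    exit 1
fi

# Set unstaged changes aside so only the staged state gets pinned. Stash only
# when there is something to stash, so the pop below never touches an older stash.
stashed=0
if ! git diff --quiet; then
    git stash -q --keep-index
    stashed=1
fi

# in case someone edited requirements.txt without actually installing the package yet
echo "Installing latest requirements from requirements.txt"
pip install -q -r requirements.txt

echo "Pinning dependencies to requirements-pinned.txt"
pip freeze -r requirements.txt > requirements-pinned.txt

git add requirements.txt requirements-pinned.txt

if [ "$stashed" -eq 1 ]; then
    git stash pop -q
fi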

This post-merge and post-checkout hook ensures that locally-installed dependencies always correspond to the pinned dependencies in the currently checked-out version of the code.

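#!/bin/bash
echo "Updating requirements from requirements-pinned.txt"
# Install every pinned line missing from the current environment.
# -r (GNU xargs) skips running pip when nothing is missing.
pip freeze -r requirements.txt | comm --nocheck-order -23 requirements-pinned.txt - | xargs -r pip install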
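
To see what the comm -23 step selects, here is a minimal sketch that runs the same comparison on two saved files. The function name missing_pins is hypothetical, and it sorts both inputs first instead of passing --nocheck-order, which is a swap from the hook rather than a copy of it.

```shell
#!/bin/bash
# Hypothetical helper: print lines of the pinned file ($1) that are absent
# from a saved `pip freeze` listing ($2).
missing_pins() {
    pinned_sorted=$(mktemp)
    frozen_sorted=$(mktemp)
    sort "$1" > "$pinned_sorted"
    sort "$2" > "$frozen_sorted"
    # -2 drops lines found only in the second file, -3 drops lines common to both,
    # leaving the lines that appear only in the pinned file.
    comm -23 "$pinned_sorted" "$frozen_sorted"
    rm -f "$pinned_sorted" "$frozen_sorted"
}
```

With Jinja2==2.7.1 pinned but Jinja2==2.7.0 installed, missing_pins prints only Jinja2==2.7.1, which is the kind of line the hook hands to pip install.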

git-hooks is a nice way to manage your git hooks. Just put them in a git_hooks directory in your repo, and you’re good to go with:

cd .git/ && ln -s ../git_hooks git_hooks
git hooks install

(Be sure to make them executable.)
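
One way to set the executable bit on every hook at once is sketched below. The function name make_hooks_executable is hypothetical, and it simply walks whatever files sit beneath the directory you give it, without assuming a particular layout.

```shell
#!/bin/bash
# Hypothetical helper: mark every regular file under a hooks directory ($1)
# as executable, including files in subdirectories.
make_hooks_executable() {
    find "$1" -type f -exec chmod +x {} +
}
```

From the repository root this would be called as make_hooks_executable git_hooks.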

git-hooks also lets developers run the pre-commit hook explicitly:

git hooks run pre-commit

git commit --no-verify can be used to avoid triggering the pre-commit hook.

This setup ensures that every installation of the codebase from version control runs on an explicit set of versions, defined in version control, for every dependency in the dependency tree, while taking no extra time for developers.

At the same time, it lets us specify flexible version ranges for our top-level dependencies, which makes dependency management easier for installations not from version control (i.e., anyone depending on us via PyPI), and we’re not forced to take any immediate action when one of our dependencies releases a new version. (There was a service that notified you of this whose name I don’t remember, but I think it’s now defunct.)
