Once we start working on more than one Python project, the question of creating an isolated environment for each project crops up. After all, we don't want any packages installed for the first project to affect the second one, and vice versa.
In this article, we will look at the different requirements that different projects have around environment management. Then, in follow-up articles, we will take a look at the available tools, the features they provide, and how those features map to these requirements.
Isolated Environments
Almost everyone has experienced the pain of one project needing one version of a library while another project needs a different version of the same library. Whichever version we install breaks the other project.
This is when we learn about Python's virtual environments, provided via the `venv` package. A virtual environment has its own Python version and its own dependencies, and it is isolated from any other virtual environment. This way you can have one project that runs on Python 3.9 and requires Django 3, while another project uses the latest features of Python 3.11 and uses Django 4.
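As a minimal sketch of creating and using one (Unix shell; the directory name and package versions here are just illustrative):

```
# Create an isolated environment in the .venv directory
python3 -m venv .venv

# Activate it (on Windows, use .venv\Scripts\activate instead)
source .venv/bin/activate

# Packages now install into .venv rather than the system Python
pip install "Django>=4,<5"
```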
Support for Multiple Python Versions
If your project is a package, you will probably want to test it against a range of Python versions. Maybe it should support all Python versions from 3.7 up to 3.11. Then you need a way to choose a Python version and run the codebase against it.
This is the first place where `venv` falls short. One virtual environment is tied to one particular Python version. If you want to test under different Python versions, you need to create a separate virtual environment for each one and set up the code in each separately. Messy!
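To see the overhead, here is roughly what that looks like by hand, assuming each interpreter is already installed and on the PATH (pytest is just an example test runner):

```
# One virtual environment per interpreter version
python3.9 -m venv .venv39
python3.11 -m venv .venv311

# Install and test the project under each one separately
.venv39/bin/pip install -e . && .venv39/bin/python -m pytest
.venv311/bin/pip install -e . && .venv311/bin/python -m pytest
```

Tools like tox automate exactly this loop.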
Installing Dependencies
Once you create a virtual environment, you will want to install your dependencies into it. `venv` won't do that, but you can use `pip`. You can either install dependencies manually, or use a `requirements.txt` file to define the dependencies and install all of them at once. Beyond that, you might need one set of dependencies while developing and a different set in production. `pip` doesn't have a good way to handle this apart from creating separate requirements files.
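A common convention (the file names here are just that, a convention, not something pip mandates) is to keep a second requirements file for development:

```
# Install the production dependencies listed in requirements.txt
pip install -r requirements.txt

# requirements-dev.txt typically starts with the line "-r requirements.txt"
# and then adds dev-only tools such as pytest or a linter
pip install -r requirements-dev.txt
```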
Reproducible Builds
Sometimes you have a bug in production that doesn't happen on the dev box. Upon investigation, it turns out that a different version of a library got installed in production. This could be because version X was the latest when the developers installed the dependency, but by the time it was installed in production, version Y was the latest.
You can handle this to an extent by specifying exact versions in `requirements.txt`. But sometimes a transitive dependency gets updated. So you need to maintain one file that lists the versioned dependencies your app directly cares about, and another file (called a lock file) that records version information for every transitive dependency as well, so that the exact same version of every dependency is installed every time. This is called a reproducible build.
This is possible to an extent using `pip freeze`, but it is very cumbersome.
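For reference, a rough sketch of that workflow (the lock file name is only a convention):

```
# On the dev box: install the top-level dependencies, then snapshot
# the exact version of everything that ended up in the environment
pip install -r requirements.txt
pip freeze > requirements.lock

# In production: install exactly the snapshotted versions
pip install -r requirements.lock
```

The cumbersome part is keeping `requirements.lock` up to date by hand every time a dependency changes.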
Secure Builds
In case you haven't been following, there has been a huge increase in what are called "supply chain attacks". The aim of these attacks is to get the victim to install a package that contains malware, which then steals access keys and API tokens or otherwise compromises the development or production system.
The way they work is by getting the system to install a different package than the one expected. Let's say I want to install Django 4 and I run `pip install django==4.0.0`. An attacker with access to the network route could intercept the request and return a tampered Django package with malicious code injected, and that malicious package would get installed on the system.
Or maybe the PyPI account of one of the package's maintainers got compromised, and the attacker uploaded new, malicious code to PyPI.
Ideally, we would like to know when the package we receive differs from what we expected. Some tools provide such secure builds by storing a hash of the package contents alongside the version in the lock file. If the hash of the downloaded package does not match the hash in the lock file, the tool rejects the package and raises an error.
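Plain pip offers this via hash-checking mode. A sketch, with a placeholder rather than Django's real hash (tools such as pip-compile can generate these entries with its --generate-hashes option):

```
# requirements.txt entry pinning both the version and the expected hash:
#
#   django==4.0.0 \
#       --hash=sha256:<placeholder, not the real hash>
#
# --require-hashes makes pip reject any download whose hash doesn't match
pip install --require-hashes -r requirements.txt
```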
Packaging, Build, and Upload
Some tools also have features to build pure-Python and binary wheels or source distributions of your package and upload them to PyPI or another package repository. I'm going to skip this part in these articles because Python packaging is a big topic by itself.
pyproject.toml Support
PEP 621 defined the `pyproject.toml` format for storing project metadata, so we can now declare dependencies in `pyproject.toml`. Not all tools can read this format; `pip`, for example, still expects dependencies in a `requirements.txt` file.
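For reference, a minimal PEP 621 sketch (the project name and version constraints are illustrative):

```
# pyproject.toml
[project]
name = "myapp"
version = "0.1.0"
requires-python = ">=3.9"
dependencies = [
    "django>=4,<5",
]

[project.optional-dependencies]
dev = ["pytest"]
```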
Summary
In this article, we had a look at some of the requirements projects have around managing their Python environments. Different projects have different needs: a solo developer may not care much about secure builds, while they could be a critical feature for a large corporation.
In the following articles, we will look at different tools that are available, discuss the common use cases, and go through which of the above features they support.