Python is a very reasonable "default" language of choice for a variety of tasks. At least on desktop platforms, it makes sense to start with Python and see how it goes. Unfortunately, on its way to popularity it lost much of the simplicity that made it fun in the first place. An especially egregious aspect of Python is its poor backward compatibility. In practice, this means that a codebase will only work reliably with a specific version of the Python interpreter.
Even worse, the same holds for many popular libraries. One codebase may only work with version 5.0.0 or lower of a certain library, while another codebase requires 5.0.1 or higher. Thus, working with a single combination of a Python interpreter and a bunch of libraries is simply not an option beyond the most basic experiments.
Working with third-party code is also not straightforward. Python has an official package repository, where any reasonable non-standard library is supposed to be present. Thus, code snippets found online often make no distinction between the standard library and third-party packages. For example, the following snippet will throw an exception on a freshly installed Python:
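A minimal sketch of such a snippet, assuming it imports both json and dotmap at the top. The try_import helper is hypothetical, added here only so that the missing package is reported instead of crashing the script:

```python
import importlib


def try_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError:
        return None


# json ships with Python, so this import always succeeds.
json_module = try_import("json")

# dotmap lives on PyPI; a plain "import dotmap" raises ModuleNotFoundError
# on a fresh interpreter, which try_import turns into None.
dotmap_module = try_import("dotmap")

print("json available:", json_module is not None)
print("dotmap available:", dotmap_module is not None)
```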
The problem is simple: json is a standard module, while dotmap is not, so it is supposed to be installed in advance. It is easy to install dotmap by typing
pip install dotmap
However, it simply installs the most recent version of dotmap globally for the current Python interpreter, which might cause issues with other codebases that require a different version of dotmap.
Summing up, there are several separate issues here:
We should be able to ensure the right version of the Python interpreter for our code.
We should be able to ensure the right combination of versions of third-party libraries and their presence in the system.
We should be able to isolate different setups from each other.
In other words, the basic problem is to recreate, on another machine, the Python environment in which our system is being developed.
The issues described above are well known in the Python world, and there are standard or semi-standard tools for dealing with them. However, they typically solve individual aspects of the problem rather than tackling it as a whole. I believe that nearly every Python project needs to deal with the full set of issues, and Poetry comes close to being an "all-purpose" solution.
Poetry can be installed via command line as follows:
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
# Windows PowerShell
(Invoke-WebRequest -Uri https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py -UseBasicParsing).Content | python -
Adding Poetry to Python Code
As discussed above, it is almost always desirable to make a local Python environment reproducible, so it makes sense to include the Poetry configuration in the project from the very start. However, it can also be added at any later time. To do so, navigate to the project directory and run
poetry init
This command asks a few questions in interactive mode and generates a plain text file named pyproject.toml. The file can also be created manually without Poetry, but the shell tool ensures that all the required elements are present.
The most important sections are [tool.poetry.dependencies], listing both the main project dependencies and the required Python version, and [tool.poetry.dev-dependencies], responsible for the development dependencies. The command line tool asks about dependencies, but they can also be added any time later.
When adding dependencies, it might be important to set constraints on their version numbers. By default, Poetry suggests the most recent version from PyPI, but it is also possible to add a library not listed on PyPI (such as a Git repository or a binary file). The dependency specification syntax is quite elaborate and requires an understanding of the semantic versioning system. I would say that in most cases the default option works fine.
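As an illustration of how semantic versioning enters the picture, the sketch below models caret requirements such as ^1.2.3, which Poetry's documentation describes as allowing updates that do not change the leftmost non-zero component. The function names are hypothetical, the code only handles full three-part versions, and it is a simplified model rather than Poetry's actual implementation:

```python
def parse_version(text):
    """Turn '1.3.24' into (1, 3, 24)."""
    return tuple(int(part) for part in text.split("."))


def caret_bounds(spec):
    """Return (lower, upper) for a full three-part caret spec like '^1.2.3'."""
    lower = parse_version(spec.lstrip("^"))
    parts = list(lower)
    # Bump the leftmost non-zero component and zero out everything after it.
    for index, value in enumerate(parts):
        if value != 0:
            upper = parts[:index] + [value + 1] + [0] * (len(parts) - index - 1)
            break
    else:
        # All components are zero: only the last one is bumped
        # (an assumption made for this sketch).
        upper = parts[:-1] + [parts[-1] + 1]
    return lower, tuple(upper)


def satisfies(version, spec):
    """Check lower <= version < upper for the caret spec."""
    lower, upper = caret_bounds(spec)
    return lower <= parse_version(version) < upper


print(caret_bounds("^1.2.3"))         # ((1, 2, 3), (2, 0, 0))
print(caret_bounds("^0.2.3"))         # ((0, 2, 3), (0, 3, 0))
print(satisfies("1.3.24", "^1.2.3"))  # True
print(satisfies("2.0.0", "^1.2.3"))   # False
```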
Adding a new main dependency is easy:
poetry add <name1> [... <nameN>]
A development dependency can be added in a similar manner:
poetry add --dev <name1> [... <nameN>]
"Development" dependencies are assumed to be necessary for project development only. In practice, this means we can tell Poetry to skip them during installation.
Removing dependencies is done similarly via the poetry remove command.
One may ask how to find out the list of dependencies for an existing project. Suppose we already have a Python environment with a number of installed libraries, and use it for some of our projects. How do we know which libraries are needed for any given project? Tools like pipreqs may help, but they are not 100% reliable, so, in general, some manual work is needed.
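To give an idea of what the manual part involves, here is a rough sketch that lists the top-level modules a piece of source code imports, using the standard ast module. The imported_modules function and the sample source are hypothetical; this is not how pipreqs itself works:

```python
import ast


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import of the project's own code.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


EXAMPLE = """
import json
import os.path
from dotmap import DotMap
from . import helpers
"""

print(sorted(imported_modules(EXAMPLE)))  # ['dotmap', 'json', 'os']
```

The result still has to be reviewed by hand: standard modules must be filtered out, and import names do not always match package names on PyPI.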
Handling Virtual Environments
There is one major difference between adding a dependency manually and using poetry add: the command-line tool immediately installs the dependency. Since dependencies are installed into an isolated project-specific Python environment, we have to discuss these environments first.
It is important to note that Poetry will not install Python interpreters specified in project requirements. It is expected that the required Python version is already present in the system.
Thus, for example, running
poetry add dotmap
will fail if Python 3.6 is not installed while pyproject.toml has the dependency
python = "3.6"
Calling any command that installs, updates, or removes project dependencies forces Poetry to modify the isolated virtual environment associated with the current project. If this environment doesn’t exist yet, it will be created.
By default, Poetry uses the Python version found in PATH. However, suppose that we need to use some other Python present in the system. In this case, we have to trigger the creation of a virtual environment prior to any poetry add (or poetry install) calls.
The most straightforward way to do it is to run poetry env use with a full path to the required Python version:
poetry env use C:\Python3.6\python.exe
Poetry remembers this setup, and will use Python 3.6 every time we run poetry add and similar commands from the project directory.
Technically, a virtual environment is just a directory where all environment-specific settings and libraries are stored. If it breaks for some reason, we can simply delete this directory and call poetry install or poetry env use to recreate it.
By default, all virtual environment directories are kept in the user home location. It is also possible to instruct Poetry to create virtual environments right inside the respective projects by calling
poetry config virtualenvs.in-project true
I think this setup is actually more convenient. If I delete a project, the corresponding virtual environment is deleted as well. The only real downside I see for now is the clutter in the project directory. When I search for a file inside a project, I have to exclude .venv every time, which is annoying.
A pyproject.toml file inside the project is enough to be able to run
poetry install [--no-dev]
and get all the necessary dependencies installed inside the corresponding virtual environment. As mentioned above, Poetry will also automatically create the environment if it is not present (alternatively, use poetry env use to choose the right Python version). Thus, this is the minimal working setup for preparing the system for an existing project: navigate to the project directory on the disk and run poetry install. The --no-dev option instructs Poetry to skip development dependencies.
The Poetry authors recommend taking one extra step, though. When Poetry installs dependencies, it writes down their exact version numbers into a file called poetry.lock. If we keep it as a part of the project (i.e., add it to version control), Poetry will make sure to install these specific versions during poetry install.
Let’s consider an example. Suppose we have a project that depends on the library dotmap. We add it with the command
poetry add dotmap@1.*
Poetry actually installs dotmap version 1.3.24 (the most recent version at the time of writing) and writes this information down in poetry.lock. Suppose that after some time I pull this project from the repository on another machine without poetry.lock. If I run poetry install, it will install the most recent dotmap version matching the mask 1.*. However, if I keep poetry.lock as well, dotmap version 1.3.24 will be installed.
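The difference between the two scenarios can be sketched as a toy model. This is not Poetry's resolver: the functions are hypothetical, and of the versions listed only 1.3.24 comes from the example above, the others are made up to stand for later releases:

```python
def parse_version(text):
    """Turn '1.3.24' into (1, 3, 24) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def matches_mask(version, mask):
    """Check a version against a wildcard mask such as '1.*'."""
    version_parts = version.split(".")
    mask_parts = mask.split(".")
    for index, mask_part in enumerate(mask_parts):
        if mask_part == "*":
            return True  # everything after the wildcard is accepted
        if index >= len(version_parts) or version_parts[index] != mask_part:
            return False
    return len(version_parts) == len(mask_parts)


def pick_version(available, mask, locked=None):
    """The locked version wins; otherwise take the newest version matching the mask."""
    if locked is not None:
        return locked
    candidates = [v for v in available if matches_mask(v, mask)]
    return max(candidates, key=parse_version)


# 1.3.24 comes from the text; the other versions are hypothetical.
available = ["1.3.24", "1.4.0", "2.0.0"]

print(pick_version(available, "1.*"))                   # no poetry.lock: 1.4.0
print(pick_version(available, "1.*", locked="1.3.24"))  # with poetry.lock: 1.3.24
```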
So, to prevent possible issues it is recommended to place poetry.lock under version control and update the libraries listed there manually. This is done by calling
poetry update [--no-dev]
The effect of this command is equivalent to deleting poetry.lock and running poetry install.
Running Project Code
The explicit way to run code inside a project-specific virtual environment is to use the poetry run command:
poetry run python main.py
poetry run executes any given command inside the project’s environment. It is also possible to open a command line shell to work with the environment’s tools without having to prefix them with poetry run all the time:
poetry shell
For now, I prefer the "explicit" version. It feels a bit overly verbose to type poetry run python instead of python, but in practice it is all done inside batch files or Visual Studio Code. Also, I tend to think that explicit is good in this case.
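To check that a script started through poetry run really uses the project's environment, a small diagnostic like the following can help. It is a sketch that relies on the standard sys.prefix and sys.base_prefix attributes, which differ inside a virtual environment; the in_virtualenv name is hypothetical:

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
```

Saved as main.py, it can be run both as python main.py and as poetry run python main.py to compare the reported interpreter paths.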