
Why I hate virtualenv and pip

I do not share the universal love for virtualenv (hereinafter venv) and pip. I believe they bring nothing but confusion, and even harm. Python programmers usually disagree with me, and venv + pip is considered the de facto standard in the Python community. Since I understand how unfounded my claims must sound, I decided to write this treatise. Of course, I do sometimes argue about this in person: I admit I like to needle people and watch how passionately they defend their position. But it always seems to me that in conversation I cannot fully justify my point of view. So, instead of endlessly trying to prove it out loud, I decided to write this article and simply show it to people. Maybe then some will agree with me, because right now almost nobody does. Or maybe the opposite will happen, and once my arguments are fully understood someone will refute them convincingly. Either way, I will be glad to see how it plays out.

venv and the illusion of isolation

Isolation, and easily reproducible pure-Python environments with no hidden dependencies on the underlying operating system, are definitely a good thing. The main purpose of venv is to provide a convenient way to isolate at the Python level. But not everything is rosy here: for Python packages that depend on system libraries, the isolation is only partial, covering just the Python component of those packages. If the developer is aware of this, it is not so bad; if not, he may run into serious and baffling problems.
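A minimal sketch of what I mean, assuming a venv with mysql-python installed: the Python half of the package lives inside the venv, but the C library it links against is still resolved from the host operating system.

    # Run under the venv's interpreter: sys.prefix points inside the venv,
    # but the shared library is located on the host system, outside of it.
    import sys
    from ctypes.util import find_library

    print(sys.prefix)                    # points inside the venv
    print(find_library("mysqlclient"))   # found on the system, or None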

Full isolation methods make venv redundant

There are several ways to isolate the entire file system. The most complete, but also the most ponderous, is a virtual machine under a hypervisor; a number of tools, such as Vagrant, provide exactly that. At the other end there are lightweight solutions such as chroot, or lightweight containers operating at the operating-system level; on Linux, that means LXC. Moreover, LXC can sit on a copy-on-write file system like btrfs and create an environment faster, and with less disk space, than even venv does.

venv is a deployment antipattern

I can feel some readers getting annoyed at the mention of technologies like LXC. Yes, in practice we cannot always guarantee that the target environment is compatible with LXC. And we cannot always get the superuser rights it requires (and all of that just to deploy our application!).
But I think that venv is not suitable for deployment either. Why? As mentioned at the beginning, the original goal of venv is only to provide convenient access to an interactively created Python sandbox. Deployment, on the other hand, is at least a semi-automatic and easily repeatable process. So trying to automate venv, forcing it to do automatically what it was designed to do by hand, turns out to be a more complicated and less trivial task than simply setting the PYTHONPATH environment variable at the program's entry point, as the sketch below illustrates. It is very easy to have pip install even a huge package like Django into an arbitrary folder (via the prefix option); at the very least that is much easier than driving venv indirectly and untangling its numerous shebangs. And do not forget that with venv you still have no control over the environment around the environment: you have to politely ask the administrator of the machine you are deploying to to install the MySQL client libraries and header files into the operating system itself, all just so you can compile mysql-python during deployment!
Distributing commercial software is not easy either, and venv is no help there.
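A hedged sketch of the alternative I mean (the install prefix and application name below are hypothetical): one entry-point script that extends the module search path, with no venv in sight.

    #!/usr/bin/env python
    # Hypothetical entry point for a deployed app. The dependencies were
    # installed by pip into /opt/myapp via its prefix option; instead of
    # activating a venv, we simply put that directory on the search path.
    import sys

    APP_LIBS = "/opt/myapp/lib/python2.7/site-packages"  # assumed prefix layout
    sys.path.insert(0, APP_LIBS)

    import django  # now resolved from APP_LIBS, not from the system
    print(django.get_version())

    # Equivalent from the shell, with no script changes at all:
    #   PYTHONPATH=/opt/myapp/lib/python2.7/site-packages python main.py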

venv is full of kludges

When you create a venv, it is not really empty. The entire standard Python library is copied into lib/, and include/ is stuffed with Python header files. The reason these directories exist seems far-fetched to me (more on that in the next section), but bin/ annoys me far more. In bin/ sit pip and easy_install, and venv mangles both of their shebangs so that they run not under the system interpreter but under the Python interpreter lying in the same directory. The shebangs of all scripts from additionally installed packages get mangled the same way. And you have to accommodate this behavior and keep an eye on the shebangs whenever you need to run scripts that live inside the venv "from the outside", for example through the system cron. You have to hard-code the path to the appropriate venv so that the script runs under the correct interpreter. That is at least as tedious as setting PATH/PYTHONPATH by hand. In fact, it is easier to do nothing at all, but I will come back to that a little later.
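A hedged sketch of that hard-coding (the venv path is hypothetical): since cron knows nothing about bin/activate, the only reliable way to make it run a script under the right interpreter is to bake the interpreter's path into the shebang.

    #!/srv/myapp/venv/bin/python
    # Hypothetical cron job. The venv's interpreter has to be hard-coded
    # in the shebang line above, and updated whenever the venv moves.
    import sys

    print(sys.executable)  # proves which interpreter actually ran us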

Oh, and I forgot to mention bin/activate

It sets the PATH environment variable and changes your prompt in the console. If you have always liked that, and thought of it as advanced technology, well, congratulations: it looks like you have been living under a rock. And so, for that matter, have your scripts. Windows .NET developers are laughing at you.
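For the record, the essence of what bin/activate does fits in a few lines; a sketch of the equivalent, assuming a hypothetical venv at /srv/myapp/venv:

    # Roughly what bin/activate amounts to, minus the shell integration:
    # export VIRTUAL_ENV and prepend the venv's bin/ to PATH.
    import os

    venv = "/srv/myapp/venv"  # hypothetical venv location
    os.environ["VIRTUAL_ENV"] = venv
    os.environ["PATH"] = os.path.join(venv, "bin") + os.pathsep + os.environ["PATH"]
    # The prompt change is pure cosmetics: PS1 gets "(venv) " glued on front.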

--no-site-packages

venv mangles sys.path in two ways. The --system-site-packages option attaches the venv's site-packages to the front of the existing path list, thereby making globally installed Python modules usable inside the venv. There is also --no-site-packages, which is the default and which, as you might guess, does not make that connection. Apparently this is why copies of things like the standard library and the header files get dumped right inside the venv. The very existence of this option, and the fact that it is the default, speak for themselves, in my opinion. Obviously venv's proponents do not want hidden dependencies between packages in the system and in the venv, and they do not want packages of the wrong versions accidentally leaking into the venv. However, their beloved venv always sits at the very beginning of the path list, so a small chance remains (no, I have not forgotten about pip freeze; we will get to it later). This fear may seem excessive, but here is the paradox: venv never provided 100% isolation in the first place! What good is being 100% sure you are not using the system version of mysql-python while being 100% sure you are using the system version of libmysqlclient? You cannot half use isolation and half ignore it!
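You can watch this ordering for yourself; a minimal sketch (the exact entries printed depend on your system):

    # Run under a venv's interpreter: its own site-packages sits at the
    # front of sys.path, ahead of the copied stdlib and, if
    # --system-site-packages was used, ahead of the global site-packages.
    import sys

    for path in sys.path:
        print(path)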

pip and venv: a great pair

Everyone thinks so because they were written by the same person, Ian Bicking. But the two programs have different philosophies and different uses. I dislike venv mostly for the things it makes people believe, but I admit it has its niche; in fact I use it myself from time to time for quick one-off tests. pip, on the other hand, should never have been born at all. It is only an "almost compatible" alternative to easy_install, with extra bells and whistles that would have been better left out entirely. Instead of it I prefer easy_install, together with less interactive tools like puppet, or simply building packages from source. This may look like a bias against pip, but it is not. I agree that typing pip install in the console is more pleasant than easy_install; easy_install sounds silly, and the underscore in the name is plainly impractical. I bet the name alone accounts for a share of pip's popularity.

pip builds from source, every time

Eggs in Python are like JARs in Java.
It seems that pip was deliberately deprived of easy_install's ability to install binary packages (eggs). Even though binary distribution was a significant part of the Python platform, and quite workable at that, someone apparently decided it was a bad idea. Of course, from the developers' point of view, compiling packages from source has an obvious upside: it spares them from building the package in advance for every supported platform (and shifts that work onto the user, who is no doubt delighted). But compilation becomes evil when there are only a few target platforms, you know exactly which ones they are, and you would like to build the package in advance so that the target machine needs no compiler at all (.NET and Java developers laugh at your problems once again). The biggest stupidity, though, is that if you use venv with --no-site-packages, then during development every member of your team has to rebuild every module each time a venv is recreated. And that is truly stupid, because you are not even developing those modules, so there is simply no point in rebuilding them.
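A hedged sketch of the pre-building that pip refuses to benefit from (the package and file names are hypothetical): setuptools can produce a binary egg once, on a build machine, and easy_install can then install it on matching targets without a compiler.

    # setup.py for a hypothetical package with a C extension. Running
    # `python setup.py bdist_egg` on a build machine produces a binary
    # egg; easy_install can install that egg compiler-free.
    from setuptools import setup, Extension

    setup(
        name="fastthing",
        version="1.0",
        packages=["fastthing"],
        ext_modules=[Extension("fastthing._speedups", ["src/speedups.c"])],
    )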

This damn requirements.txt

To declare the dependencies your package needs, you can list them in install_requires in setup.py. That is the Python way. setuptools/distribute implements this mechanism, and both easy_install and pip use it to automatically fetch those dependencies from PyPI and install them. For reasons too long to explain here, pip additionally allows you to list dependencies in a text file, usually called requirements.txt. Its syntax is essentially the same as in setup.py, but it can also include other files, and dependencies in it can be specified as file paths, URIs, and even links to Mercurial/Git repositories (we will get to all of that in the next section).
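A minimal sketch of the setuptools way (the package name and version pins are hypothetical):

    # setup.py: dependencies declared where the packaging machinery
    # expects them. Both easy_install and pip read install_requires and
    # resolve the names against PyPI, or any index you point them at.
    from setuptools import setup, find_packages

    setup(
        name="myapp",
        version="0.1",
        packages=find_packages(),
        install_requires=[
            "Django>=1.4",      # a minimum version, not a hard-coded location
            "python-dateutil",  # any available release will do
        ],
    )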

I agree that these features widen the possibilities, but I do not believe they are the reason requirements.txt exists. In my opinion, the real reason is that Python projects split into two classes: packages that are never used on their own and are only imported into existing projects, and those projects themselves. Developers who write only applications never fully learn how packaging works, so without a second thought they hard-code the entire set of modules they use, simply listing them in requirements.txt, because it is so convenient! Such developers then usually just advise their users to create a venv and roll the whole package into it with pip install -r requirements.txt.

As a result, we have a crowd of Python developers who consider requirements.txt a panacea for all problems. They will never even learn that setuptools exists. They are easily won over by the apparent simplicity of dumbly listing links to packages lying somewhere in the depths of the Internet, on websites or in version control systems. I am discouraged by their holy confidence in the "fantastic" pragmatism of this approach, and by the resulting urge to promote virtualenv + pip as a bundle of tools indispensable to everyone.

URIs as dependency paths suck

setuptools lets you specify a package by name and required version, and by default the package is downloaded from PyPI. PyPI provides the index, but you can also build your own (as simple HTML pages) and instruct the tools to consult it before the PyPI site. Whoever designed this was trying to give the developer the ability to bind to package names rather than to physical locations or web protocols. And he was thinking correctly.

If in requirements.txt you specify a path to a local file, or to a tarball lying on some website, you are in effect hard-coding that link, whereas the better solution would be a package index, which would, for example, let people set up mirrors of it on the local network. On top of that, you cannot specify a minimum version this way, only the exact current one. And one day that file will be moved or deleted, in short, it will vanish, and the code will suddenly stop working. Obviously we do not want that, right?

Well, there is another way. Let's specify the dependency like this:
git+https://github.org/my/stupid/fucking/package#egg=1.2.3

But this requires the user to have git on the machine, and on top of that pip has to download a complete copy of the repository. More often than not, people do not even use the version notation (the 1.2.3 in the example; translator's note) and just assume that the stable version lives in the master branch. All of this is sad. I know it is fashionable now to install everything straight from version control systems, but is it really so hard to keep these URLs out of your project? An already controversial decision becomes completely unjustified when everything can be done properly just by sweating a little over a correct setup.py.

If you like pip freeze, there’s something wrong with you

Being diligent about tracking and managing your dependencies is a good thing, and pip freeze can help with that: it lets you make sure no Python dependency has slipped past you in the middle of a development cycle. But if you think pip freeze hands you a list of dependencies ready to paste into requirements.txt (which, I remind you, you do not need), then you are also forced into --no-site-packages (which you do not need either) when creating each venv, and even then your real set of dependencies remains system-wide rather than purely Pythonic. Besides, there is no way to tell from its output which dependencies you installed directly and which were pulled in by other packages.
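A hedged sketch of the information pip freeze flattens away (pkg_resources ships with setuptools): each installed distribution declares its own requirements, so the direct-versus-pulled-in distinction is recoverable, but freeze's flat "name==version" output drops it.

    # List installed distributions together with what each one pulls in;
    # pip freeze prints only the left-hand column of this picture.
    import pkg_resources

    for dist in pkg_resources.working_set:
        pulled_in = ", ".join(str(req) for req in dist.requires()) or "nothing"
        print("%s==%s  (requires: %s)" % (dist.project_name, dist.version, pulled_in))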

On the other hand, if you suspect that your dependencies have wrecked your environment, try recreating it. With venv + pip that will take you forever (remember, everything has to be rebuilt from source). With LXC on a copy-on-write file system, and with all the dependencies you are not currently working on pre-built as binary eggs, you will track down the missing dependency very quickly, whether it sits at the system level or at the Python level.
In general, pip freeze is not such a bad command; the trouble is that people too often consider it indispensable, overlook its shortcomings, and misuse it.

Conclusion

This has been my critique: a completely subjective and perhaps in places controversial analysis of the usefulness of virtualenv and pip, and of the culture that has grown up around them. I really like Python as a language, but I like it much less as a platform, because it is fragmented by competing standards for package distribution and for the development process. In my case this means I spend more time fighting Python than working with it. I regularly talk to smart people who sincerely believe that venv and pip provide everything needed to develop, collaborate on, and deploy finished applications. I use neither venv nor pip during development.
And I hope that this article will, at the very least, convince the reader that it is both possible and necessary to understand how these tools work while staying critical of them.

From the translator:
To developers working under Windows: whether you decide to abandon pip or are simply looking for a way to install packages that refuse to build with it (for example, failing with the error unable to find vcvarsall.bat) and for which no binary releases are provided, I can recommend a wonderful site that gathers all sorts of compiled packages, in various versions, under its wing: Unofficial Windows Binaries for Python Extension Packages.

Source: https://habr.com/ru/post/206024/

