Nix OS, Guix OS, and declarative package management


In my article on how to pick a Linux distro I wrote a brief section on Nix OS and Guix OS. In this article, I describe in more detail what these Linux distros are, how they differ from all other distros (they are far more technologically advanced), and the difficulties this can cause for you as an end user of Linux software.

Declarative package management

Quoting myself in the article How to pick a Linux distro:

You could call the Nix and Guix software package repositories a form of expert system, which by 1980s standards would be considered a form of artificial intelligence. Although nowadays no one thinks of expert systems as AI (the definition of intelligence is always changing), still it is a highly advanced algorithm for building software. Unfortunately, most software nowadays, especially the Python and JavaScript ecosystems, are not well equipped to facilitate declarative package management, and Python and JavaScript are among the world's most often used programming languages. Therefore I think it is safe to say, the world is not yet ready for Nix and Guix, these distros are ahead of their time.

Nix and Guix use what are called declarative package management algorithms with pure lazy functional package configuration languages. What this ball of jargon really means is, you don't install software onto your computer, you declare what software should exist on your computer, and your package manager automatically computes for you exactly which pieces of software need to be installed to satisfy your demands. It then installs exactly those pieces of software, and nothing else. As long as you are careful to declare all the pieces of software you need, your computer system will work well.
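To make the idea of "declare what you want, let the package manager compute the rest" a little more concrete, here is a minimal sketch in Python. It is only a toy illustration, not how Nix or Guix are actually implemented, and the package names and the `DEPENDENCIES` table are invented for the example.

```python
# Hypothetical package database: each package maps to its direct dependencies.
DEPENDENCIES = {
    "editor": ["gui-toolkit", "libc"],
    "gui-toolkit": ["graphics-lib", "libc"],
    "graphics-lib": ["libc"],
    "libc": [],
    "game": ["graphics-lib", "audio-lib"],
    "audio-lib": ["libc"],
}


def closure(declared):
    """Return every package needed to satisfy the declared set, and nothing else."""
    needed = set()
    stack = list(declared)
    while stack:
        package = stack.pop()
        if package in needed:
            continue  # already accounted for
        needed.add(package)
        stack.extend(DEPENDENCIES[package])
    return needed


if __name__ == "__main__":
    # The user declares only "editor"; its dependencies are computed.
    print(sorted(closure(["editor"])))
```

Declaring only `editor` pulls in the GUI toolkit, the graphics library, and libc, while `game` and `audio-lib` stay out of the result because nothing that was declared needs them.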

You declare what software you want to install using a programming language, so right from the start, you need to have some basic computer programming skills to use these Linux distros. Also, you won't be able to understand much about what is going on without some understanding of the algorithms used by the package manager, and what problem these algorithms solve.

In brief:

The problem with software interdependency

The problem that declarative package management solves is that software is made out of many interdependent units of code, and this makes things very complicated. Each piece of software is maintained by a different group of people, in a different city or organization, in a different part of the world than all the other software components. How do you get software programmed by so many people spread around the world to all work together without breaking down? With rigorous testing, and this is where the concepts of software releases and version numbers become important. A release is a final product that has been fully tested and is safe to use by other software components. Each release has a version number. And the set of interdependent pieces of software and all of their versions is called a configuration.

This is where tracking the interdependence of software components, the software configuration, becomes difficult, because:

Using a package database

The solution to these software interdependency problems that Nix and Guix offer is to create what some would call an expert system, and have an algorithm compute the best possible set of interdependent software required to satisfy a set of requirements — hence making your software configuration declarative, so that the algorithm can compute the precise conditions that satisfy your declaration.

All pieces of software are stored in a database, and the portions of this database that you need are downloaded onto your computer. When you change configurations, say by installing new apps, or upgrading your apps, new software is added to the database. Even if you remove software, it remains in the database, but in parts that are inaccessible to the rest of the system. Also, everything in the database is content addressable.
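As a rough illustration of what "content addressable" means, here is a toy store in Python in which each item is filed under the cryptographic hash of its own contents. The `Store` class is hypothetical and exists only for this example; the real Nix and Guix stores are considerably more involved.

```python
import hashlib


class Store:
    """Toy content-addressable store: items are keyed by the hash of their contents."""

    def __init__(self):
        self.items = {}  # maps hex digest -> bytes

    def add(self, content: bytes) -> str:
        """Store the content and return the key it can be looked up by."""
        key = hashlib.sha256(content).hexdigest()
        self.items[key] = content  # identical content always lands on the same key
        return key

    def get(self, key: str) -> bytes:
        return self.items[key]


if __name__ == "__main__":
    store = Store()
    first = store.add(b"pretend this is a compiled program")
    second = store.add(b"pretend this is a compiled program")
    print(first == second, len(store.items))
```

Because the key is derived from the content, adding the same bytes twice stores them only once, and two different pieces of software can never overwrite each other.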

If you discover that the new apps are bad or broken, you can roll back the installation to the previous declaration, and your software will return to the exact configuration state it was in before your upgrade.
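Rolling back can be pictured as keeping a history of declarations and moving a pointer between them. The sketch below is a simplified Python model under that assumption; the `Profile` class and its method names are made up for illustration and are not the Nix or Guix interface.

```python
class Profile:
    """Toy profile: a history of declarations, each a frozen set of package names."""

    def __init__(self):
        self.generations = [frozenset()]  # generation 0 is the empty system
        self.current = 0

    def switch(self, declared):
        """Record a new generation and make it the active one."""
        self.generations.append(frozenset(declared))
        self.current = len(self.generations) - 1

    def rollback(self):
        """Step back to the previous generation; older generations are kept."""
        if self.current > 0:
            self.current -= 1

    def active(self):
        return self.generations[self.current]


if __name__ == "__main__":
    profile = Profile()
    profile.switch({"editor"})
    profile.switch({"editor", "broken-app"})
    profile.rollback()
    print(sorted(profile.active()))
```

Nothing is deleted by the rollback: the generation containing the broken app is still in the history, it is simply no longer the active one.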

And when you are building larger software applications to distribute to customers, for example Docker images, Flatpaks, or AppImages, the distribution images built by Nix or Guix contain only the declared software and its dependencies, so they are about as small as they can be. They are still often gigabytes in size, but this might be the absolute smallest that a large software application can reasonably be.

And if you take care to make sure that every single last piece of software in the entire computer system has its version properly tracked in the database, then your software build becomes fully reproducible, that is, you can verify, to a very high degree of confidence, that the software that is built and installed onto one computer is bit-for-bit identical to the software that was built and installed onto other computer systems. This is also made possible by the content addressable nature of the database: each piece of software is stored with its cryptographic hash, so if the hashes of two pieces of software match, this provides a reasonable guarantee that the two are identical, and if the hashes differ, the two are certainly different.
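The hash comparison described above can be sketched in a few lines of Python. This is an illustrative assumption about how one might fingerprint a build output represented as a mapping of file paths to file contents; the `tree_digest` function is hypothetical and is not the hashing scheme that Nix or Guix actually use.

```python
import hashlib


def tree_digest(files):
    """Hash a build output given as a dict of file path -> file contents (bytes)."""
    h = hashlib.sha256()
    for path in sorted(files):  # fixed order, so the result does not depend on dict order
        data = files[path]
        h.update(path.encode("utf-8"))
        h.update(b"\0")
        h.update(str(len(data)).encode("ascii"))  # length prefix keeps entries unambiguous
        h.update(b"\0")
        h.update(data)
    return h.hexdigest()


if __name__ == "__main__":
    built_on_machine_a = {"bin/app": b"\x7fELF...", "share/doc": b"manual"}
    built_on_machine_b = {"share/doc": b"manual", "bin/app": b"\x7fELF..."}
    print(tree_digest(built_on_machine_a) == tree_digest(built_on_machine_b))
```

If a single byte of a single file differs between the two builds, the digests will differ, which is what makes the comparison useful as a check for reproducibility.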

This property of reproducible builds is very attractive to engineers who might be responsible for ensuring some reasonable guarantee of correctness of the software on the computer systems they maintain, especially if people's valuable private information is at stake, or even if lives hang in the balance. And this is really one of the goals of both Nix OS and Guix OS: to demonstrate that it is possible to have reproducible software builds of every last piece of software on the operating system. I would say they have proved the concept; it is indeed possible.

Problems inherent in declarative package management

If you use Nix or Guix as your distro of choice, your computer will accumulate hours upon hours of time running the calculations necessary to satisfy the system software configurations that you have declared. It will spend many more hours downloading and installing the software, often dozens of gigabytes of it, and it might have to spend even more hours building and testing that software locally.

When you ask Nix or Guix to install new software, you change your package configuration, and this launches the software dependency calculations all over again. Calculations and software bundles are saved in the database, so you don't always need to compute everything again from scratch. But it is common to see software installation take anywhere from 30 minutes to several hours, depending on what software you are installing.

Also, without regular garbage collection (which means removing old calculations and software bundles in the database that are no longer used) you may end up with dozens of gigabytes of space on your computer's hard drive being wasted. But every time you collect garbage, you lose the results of those calculations for those pieces of software, and they may need to be re-computed if you ever roll back to a previous configuration, or if you install other new software that would have needed those older calculations.
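Garbage collection of this kind can be sketched as a reachability check: anything the current configuration still depends on is kept, and everything else is removed. The Python below is a toy model under that assumption; the `collect_garbage` function and the item names are invented for the example.

```python
def collect_garbage(store, roots, dependencies):
    """Remove everything in `store` not reachable from `roots`; return what was removed.

    store:        set of item names currently in the database
    roots:        items the configurations you still want to keep refer to directly
    dependencies: dict mapping an item to the items it depends on
    """
    live = set()
    stack = list(roots)
    while stack:
        item = stack.pop()
        if item in live:
            continue
        live.add(item)
        stack.extend(dependencies.get(item, []))
    garbage = store - live
    store.difference_update(garbage)  # modifies the caller's set in place
    return garbage


if __name__ == "__main__":
    store = {"editor-2", "editor-1", "libc", "old-lib"}
    dependencies = {"editor-2": ["libc"], "editor-1": ["libc", "old-lib"]}
    removed = collect_garbage(store, ["editor-2"], dependencies)
    print(sorted(removed), sorted(store))
```

Here `editor-1` and `old-lib` are collected because only `editor-2` is still a root. Rolling back to the configuration that used `editor-1` would now require fetching or building it again, which is the trade-off described above.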

Probably the biggest problem that I see with declarative package management is that the world of software engineering doesn't yet seem ready to use it. Two of the world's most popular programming languages, Python and JavaScript, have their own package management systems which are not as rigorously declarative as Nix or Guix. Attempts have been made to translate the Python and JavaScript package databases into a form that Nix or Guix can use to compute software configurations in a way that results in reproducible builds, but there is usually not enough information in these databases to do this properly. Furthermore, Python and JavaScript developers tend not to be overly concerned with correctly, accurately built software, and Python programmers have been known to get testy when dealing with Nix OS people.

Python and JavaScript are scripting languages that are easy to use, and quick to write for non-professionals. Mathematical rigor is just not a concern of many programmers in the milieu of Python or JavaScript applications. But Python and JavaScript applications can also be among the most popular applications in their respective fields of use, especially data science and machine learning, and so you can't just cut them out of your operating system.

Conclusions

Declarative package management, and operating systems like Nix and Guix which use this technique, are indeed fascinating and may become how software is engineered in the future. But there are problems with it that have yet to be solved, problems with:

Nix OS and Guix OS are ahead of their time. Personally, I think this level of rigor in software installation will become more popular in the future: as huge sums of money, and even people's lives, are increasingly at the mercy of computer software, the demand for correctness and rigor in building tractable software will likewise increase. But absent a number of costly or deadly software mistakes, or government regulations, that would force software engineering companies to build such rigorous systems, this technology will probably continue to be used only in niche applications.