zypper

If you use openSUSE you probably use YaST or zypper (command line tool) for package management. In either case you end up using the libzypp library.

A little bit of history: the name “ZYpp” is an abbreviation for Zen / YaST Packages Patches Patterns Products. YaST is a relatively well known long time SUSE application that stands for Yet Another Setup Tool. I have been using it since the late ’90s. But what is Zen? In early 2000s Novell acquired SuSE and Ximian. They both had their own package management software, the aforementioned YaST and Red Carpet, respectively. Novell decided to merge those separate package management systems, and rebranded Red Carpet into ZENworks. Thus, the name ZYpp was born.

So, it’s just another package manager, right? What’s so special about ZYpp? The answer involves such exciting topics as P vs NP and Cook’s theorem. So stick around and read on.

In computational complexity, P stands for types of problems that can be solved in polynomial time. NP stands for types of problems that can be solved in non-deterministic polynomial time. Intuitively, and overly simplistically, P is the type of problems where as the number of data points increases, the computation required may increase with it by at most a constant factor. In contrast, for NP problems the increase in computation would be exponential.

NP complete are NP problems where a specific statement or answer can be verified in polynomial time. Whether P = NP is a great unsolved problem of our time.

According to Cook’s theorem, boolean satisfiability problem (commonly abbreviated as SAT) is NP complete. This means that given a number of constraints, there’s no algorithm that we know that can solve whether those constraints are valid in polynomial time. However, if given a specific statement, it can be checked and verified in polynomial time.

So, what does all of this have to do with ZYpp? ZYpp is actually implements such an algorithm – it is a SAT solver. Or, more precisely, the libzypp library uses libsolv which is a SAT solver. When you ask zypper to install a certain package, using package metadata it has to determine (1) whether it can be installed and (2) what is the best path to take, including adding any missing required packages, to do so.

As of this writing there are over 38,000 packages in openSUSE Tumbleweed base OSS repo alone. Each of those packages contain a list of libraries along with their versions they provide, and another list of libraries along with version or version range restrictions they depend on. For example, if I look at the amarok package, I get this output (truncated for brevity):

~ % zypper info --requires amarok
...
Information for package amarok:
-------------------------------
Repository     : openSUSE-Tumbleweed-Oss
...
Requires       : [97]
    libc.so.6(GLIBC_2.14)(64bit)
    libstdc++.so.6()(64bit)
    libstdc++.so.6(GLIBCXX_3.4)(64bit)
    libstdc++.so.6(CXXABI_1.3)(64bit)
    libm.so.6()(64bit)
    libpthread.so.0()(64bit)
    ...
    libtag-extras.so.1()(64bit)
    libtag-extras1 >= 1.0
    /bin/sh

We can see amarok package has 97 dependencies, including a requirement for

libtag-extras.so.1
combined with
libtag-extras1
package being at least at version 1.0. That package itself will have its own libraries it proves and depends on. So the job of the algorithm is to determine based on this data how to best satisfy these requirements, if possible. In addition, when there’s a conflict, it must provide available solutions and let the user pick a desired option.

I’m not going to go into details of SAT solver implementation, but the next time you type in commands such as

sudo zypper in MozillaFirefox

to install Firefox, or

sudo zypper rm --clean-deps chromium

to remove Chromium (including its no longer used dependencies), or

sudo zypper packages --orphaned

to find orphaned packages, or

sudo zypper dup

to perform the distribution upgrade of the whole world, and expect the correct answer to all of those in less than a second most of the time, remember that under the hood cool stuff happens to figure out the best path to take to do so.