What we know about the xz Utils backdoor that almost infected the world

Posted by

Malware Detected Warning Screen with abstract binary code 3d digital concept
Enlarge / Malware Detected Warning Screen with abstract binary code 3d digital concept

Getty Images

On Friday, researchers revealed the discovery of a backdoor that was intentionally planted in xz Utils, an open-source data compression utility available on almost all installations of Linux and other Unix-like operating systems. The person or people behind this project likely spent years on it. They were likely very close to seeing the backdoor update merged into Debian and Red Hat, the two biggest distributions of Linux when an eagle-eyed software developer spotted something fishy.

“This might be the best executed supply chain attack we’ve seen described in the open, and it’s a nightmare scenario: malicious, competent, authorized upstream in a widely used library,” software and cryptography engineer Filippo Valsorda said of the effort, which came frightfully close to succeeding.

Researchers have spent the weekend gathering clues. Here’s what we know so far.

What is xz Utils?

xz Utils is nearly ubiquitous in Linux. It provides lossless data compression on virtually all Unix-like operating systems, including Linux. xz Utils provides critical functions for compressing and decompressing data during all kinds of all kinds of operations. xz Utils also supports the legacy .lzma format, making this component even more crucial.

What happened?

Andres Freund, a developer and engineer working on Microsoft’s PostgreSQL offerings, was recently troubleshooting performance problems a Debian system was experiencing with SSH, the most widely used protocol for remotely logging into devices over the Internet. Specifically, SSH logins were consuming too many CPU cycles and were generating errors with valgrind, a utility for monitoring computer memory.

Through a combination of sheer luck and Freund’s careful eye, he eventually discovered the problems were the result of updates that had been made to xz Utils. On Friday, Freund took to the Open Source Security List to disclose the updates were the result of someone intentionally planting a backdoor in the compression software.

What does the backdoor do?

Malicious code added to xz Utils versions 5.6.0 and 5.6.1 modified the way the software functions when performing operations related to lzma compression or decompression. When these functions involved SSH, they allowed for malicious code to be executed with root privileges. This code allowed someone in possession of a predetermined encryption key to log into the backdoored system over SSH. From then on, that person would have the same level of control as any authorized administrator.

How did this backdoor come to be?

It would appear that this backdoor was years in the making. In 2021, someone with the username JiaT575 made their first known commit to an open-source project. In retrospect, the change to the libarchive project is suspicious, because it replaced the safe_fprint funcion with a variant that’s long been recognized as less secure. No one noticed at the time.

The following year, JiaT575 submited a patch over the xz Utils mailing list, and almost immediately, a never-before-seen participant named Jigar Kumar joined the discussion and argued that Lasse Collin, the longtime maintainer of xz Utils, hadn’t been updating the software often or fast enough. Kumar, with the support of Dennis Ens and several other people who had never had a presence on the list, pressured Collin to bring on an additional developer to maintain the project.

In January 2023, JiaT75,made their first commit to xz Utils. In the months following, JiaT75, who used the name Jia Tan, became increasingly involved in xz Utils affairs. For instance, Tan replaced Collins’s contact information with their own on Microsoft’s oss-fuzz, a project that scans open-source software for signs of maliciousness. Tan also requested that oss-fuzz disable the ifunc function during testing, a change that prevented it from detecting the malicious changes Tan would soon make to xz Utils.

In February of this year, Tan issued commits for versions 5.6.0 and 5.6.1 of xz Utils. The updates implemented the backdoor. In the following weeks, Tan or others appeal to developers of Ubuntu, Red Hat, and Debian to merge the updates into their OSes. Eventually, one of the two updates made its way into the following releases, according to security firm Tenable:

Can you say more about what this backdoor does?

In a nutshell, it allows someone with the right private key to hijack sshd, the executable file responsible for making SSH connections, and from there to execute malicious commands. The backdoor is implemented through a five-stage loader that uses a series of simple but clever techniques to hide itself. It also provides the means for new payloads to be delivered without major changes being required.

Multiple people who have reverse engineered the updates have much more to say about the backdoor.

Developer Sam James provided this overview:

This backdoor has several components. At a high level:

  • The release tarballs upstream publishes don’t have the same code that GitHub has. This is common in C projects so that downstream consumers don’t need to remember how to run autotools and autoconf. The version of build-to-host.m4 in the release tarballs differs wildly from the upstream on GitHub.
  • There are crafted test files in the tests/ folder within the git repository too. These files are in the following commits:
  • A script called by build-to-host.m4 that unpacks this malicious test data and uses it to modify the build process.
  • IFUNC, a mechanism in glibc that allows for indirect function calls, is used to perform runtime hooking/redirection of OpenSSH’s authentication routines. IFUNC is a tool that is normally used for legitimate things, but in this case it is exploited for this attack path.

Normally upstream publishes release tarballs that are different than the automatically generated ones in GitHub. In these modified tarballs, a malicious version of build-to-host.m4 is included to execute a script during the build process.

This script (at least in versions 5.6.0 and 5.6.1) checks for various conditions like the architecture of the machine. Here is a snippet of the malicious script that gets unpacked by build-to-host.m4 and an explanation of what it does:

if ! (echo "$build" | grep -Eq "^x86_64" > /dev/null 2>&1) && (echo "$build" | grep -Eq "linux-gnu$" > /dev/null 2>&1);then
  • If amd64/x86_64 is the target of the build
  • And if the target uses the name linux-gnu (mostly checks for the use of glibc)

It also checks for the toolchain being used:

if test "x$GCC" != 'xyes' > /dev/null 2>&1;then
exit 0
fi
if test "x$CC" != 'xgcc' > /dev/null 2>&1;then
exit 0
fi
LDv=$LD" -v"
if ! $LDv 2>&1 | grep -qs 'GNU ld' > /dev/null 2>&1;then
exit 0

And if you are trying to build a Debian or Red Hat package:

if test -f "$srcdir/debian/rules" || test "x$RPM_ARCH" = "xx86_64";then

This attack thusly seems to be targeted at amd64 systems running glibc using either Debian or Red Hat derived distributions. Other systems may be vulnerable at this time, but we don’t know.

In an online interview, developer and reverse engineer HD Moore confirmed the Sam James suspicion that the backdoor targeted either Debian or Red Hat distributions.

“The attack was sneaky in that it only did the final steps of the backdoor if you were building the library on amd64 (intel x86 64-bit) and were building a Debian or a RPM package (instead of using it for a local installation),” he wrote.

Paraphrasing observations from researchers who collectively spent the weekend analyzing the malicius updates, he continued:

When verifying an SSH public key, if the public key matches a certain fingerprint function, the key contents are decrypted using a pre-shared key before the public key is actually verified. The decrypted contents are then passed directly to system.

If the fingerprint doesn’t match or the decrypted contents don’t match a certain format, it falls back to regular key verification and no-one’s the wiser.

The backdoor is super sneaky. It uses a little-known feature of the glibc to hook a function. It only triggers when the backdoored xz library gets loaded by a /usr/bin/sshd process on one of the affected distributions. There may be many other backdoors, but the one everyone is talking about uses the function indirection stuff to add the hook. The payload was encoded into fake xz test files and runs as a shellcode effectively, changing the SSH RSA key verification code so that a magic public key (sent during normal authentication) let the attacker gain access

​​Their grand scheme was:

1) sneakily backdoor the release tarballs, but not the source code

2) use sockpuppet accounts to convince the various Linux distributions to pull the latest version and package it

3) once those distributions shipped it, they could take over any downstream user/company system/etc

Additional technical analysis is available from the above Bluesky thread from Valsorda, researcher Kevin Beaumont and Freund’s Friday disclosure.

What more do we know about Jia Tan?

At the moment, extremely little, especially for someone entrusted to steward a piece of software as ubiquitous and as sensitive as xz Utils. This developer persona has touched dozens of other pieces of open-source software in the past few years. At the moment, it’s unknown if there was ever a real-world person behind this username or if Jia Tan is a completely fabricated individual.