A streamlined view into data integrity and confidentiality in modern Linux environments

Current state of encryption on Generic Linux Distributions

Linux has had support for Full Disk Encryption and technologies such as UEFI Secure Boot and TPMs for a long time. They are usually set up suboptimally in distributions.

Supported:

[x] Full Disk Encryption
[x] UEFI Secure Boot
[x] TPMs

Tools & Mechanisms

We can separate the mechanisms and tools available for confidentiality and integrity operations on our systems into two broad categories: disk encryption and data authentication.

Before listing the technologies and tools available in each of these categories, it is useful to explain the two concepts.

Disk encryption is the process of transforming the data on a disk in such a way that reading it in clear-text form is only possible if you possess a secret of some form, usually a password/passphrase.

Data authentication means that there are mechanisms in place verifying that no one can make changes to the data on disk unless they have a secret of some sort.

Technologies empowering disk encryption:

LUKS/cryptsetup
dm-crypt

Technologies empowering data authentication:

LUKS/cryptsetup
dm-verity
dm-integrity

Since we have stated that both disk encryption and data authentication require a secret of some form, we can talk about TPMs. We will focus on one facet of TPMs' capabilities: protecting secrets. In general, a TPM releases secret keys only if the code that booted the host can be authenticated. The process is roughly as follows:

every boot component is hashed with a cryptographic hash function before it is used
the resulting hashes are written to the TPM's Platform Configuration Registers (PCRs), essentially a small volatile memory
- each step of the process writes the hashes of the resources needed for the next boot step
- PCRs cannot be written freely: each new hash is combined with the value already stored in the PCR, and the result of that combination is what gets stored
- this process is called "measuring"
secrets are protected not only by these PCR hashes but are also encrypted with a "seed key" that is generated by the TPM chip itself
TPMs enforce a limit on unlock attempts per unit of time ("anti-hammering")
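
The measuring steps above can be sketched in a few lines of Python. This is a software simulation only, not code that talks to a TPM: it assumes a SHA-256 PCR bank and a PCR that starts as all zeroes, and the component contents are hypothetical stand-ins for real boot binaries.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for a SHA-256 PCR bank


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return the new PCR value: SHA-256(old PCR value || measurement)."""
    return hashlib.sha256(pcr + measurement).digest()


def measure_chain(components: list[bytes]) -> bytes:
    """Hash each boot component in order and fold it into a single PCR."""
    pcr = bytes(PCR_SIZE)  # assumption: the PCR starts as all zeroes
    for blob in components:
        digest = hashlib.sha256(blob).digest()  # hash before the component is used
        pcr = extend(pcr, digest)
    return pcr


# Hypothetical stand-ins for real boot binaries.
good_boot = [b"shim", b"bootloader", b"kernel"]
evil_boot = [b"shim", b"tampered bootloader", b"kernel"]

expected = measure_chain(good_boot)
print(measure_chain(good_boot) == expected)  # same chain, same PCR value
print(measure_chain(evil_boot) == expected)  # any change alters the final value
```

Because every new value depends on the previous one, a component cannot "undo" an earlier measurement, which is why a secret bound to an expected PCR value stays locked when any earlier boot step differs.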

Distribution exploitation of these practices

To understand how these technologies are used in practice in modern Linux distributions, we can take a look at the typical boot process of a distribution today (we are assuming - wrongly - that every system at hand is UEFI powered):

the UEFI firmware invokes "shim", a trivial EFI application stored in the EFI System Partition (ESP) that, when run, attempts to open and execute another application. The "shim" is signed with a key from Microsoft, whose keys are built into today's PCs/laptops. The "shim" is measured into the TPM.
The "shim" then loads the boot loader (often GRUB), which is signed with a private key owned by the vendor. The boot loader is stored in the ESP or in a separate boot partition. The components of the boot loader are measured into the TPM.
The boot loader calls the kernel and passes it an initial ramdisk image (initrd), which contains the first userspace code encountered in the boot process. The kernel is also signed by the vendor and is validated via the "shim". The initrd remains unvalidated. Sometimes, the kernel is also measured into the TPM.
The kernel unpacks the initrd image and invokes what is contained in it. This is the first point at which the system interacts with the user, asking for a password for the encrypted root file system. The initrd then uses that password to set up the encrypted volume. No TPM measuring takes place at this stage.
The initrd transitions into the root file system.
The OS itself is up. It asks the user for a username and a password. At this point no code authentication, no TPM measurements and no data decryption take place. The username/password combination is only used for unlocking a certain account.
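
To make the gaps in this chain easier to spot, the walkthrough above can be written down as a small table and queried. This is only a transcription of the steps described here, not something probed from a real machine; the `Stage` type and the `unauthenticated` helper are hypothetical names introduced for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str
    signed_by: Optional[str]  # who signs the component, None if unsigned
    measured: str             # "yes", "sometimes" or "no"


# Transcribed from the boot walkthrough above.
BOOT_CHAIN = [
    Stage("shim", "Microsoft", "yes"),
    Stage("boot loader", "vendor", "yes"),
    Stage("kernel", "vendor", "sometimes"),
    Stage("initrd", None, "no"),
    Stage("root file system / OS", None, "no"),
]


def unauthenticated(chain):
    """Return the stages that are neither signed nor reliably measured."""
    return [s.name for s in chain if s.signed_by is None and s.measured != "yes"]


print(unauthenticated(BOOT_CHAIN))
```

The stages it reports are the ones where modified code would run without being noticed, which is the weakness the strategies in the next section try to address.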

Encryption strategies

If we want to enforce a disk encryption and/or data authentication policy, there are three operating system areas to include in our encryption strategy. We can, of course, choose to focus on a specific area rather than on every one of them.

Authentication of OS binaries

Most Linux distributions store system-related binaries under /usr/. It generally contains no secret data - anyone can download the binaries off the Internet anyway, and the sources too - so encrypting it wastes CPU cycles, although beyond that it doesn't hurt much. What you can do, though, is use some form of data authentication to verify the integrity of the binary files that your operating system uses to perform tasks. This can be achieved by:

making /usr/ a dm-verity volume. dm-verity is a concept implemented in the Linux kernel that provides authenticity to read-only block devices: every read access is cryptographically verified against a top-level hash value. It makes the /usr/ tree entirely immutable in a very simple way. However, the traditional rpm/apt-based update logic cannot work in this mode.
making /usr/ a dm-integrity volume. dm-integrity is a concept provided by the Linux kernel that offers integrity guarantees to writable block devices, i.e. in some ways it can be considered to be a bit like dm-verity while permitting write access. There are multiple ways to use dm-integrity but the one that's most interesting in this use case would be using it in "stand-alone" mode, but with a keyed hash function (e.g. HMAC). This provides authenticity without encryption: if you make changes to the disk without knowing the secret this will be noticed on the next read attempt of the data and result in IO errors.
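
The two ideas can be contrasted with a minimal sketch. It is not the kernel's on-disk format: the binary tree layout, the duplication of the last node on odd levels, the 4096-byte block size, the key, and binding each tag to the block index are all assumptions made for the illustration. The first function mimics the dm-verity idea of one top-level hash covering a read-only image, the other two mimic the dm-integrity idea of a keyed tag stored per block.

```python
import hashlib
import hmac

BLOCK = 4096  # block size in bytes; an assumption for this sketch


def root_hash(blocks: list[bytes]) -> bytes:
    """dm-verity idea: hash every block, then hash pairs upward to one root.

    Expects at least one block.
    """
    level = [hashlib.sha256(b).digest() for b in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # sketch-only rule for odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def tag(key: bytes, index: int, block: bytes) -> bytes:
    """dm-integrity idea: a keyed (HMAC) tag per block, bound to its position."""
    message = index.to_bytes(8, "little") + block
    return hmac.new(key, message, hashlib.sha256).digest()


def read_block(key: bytes, index: int, block: bytes, stored_tag: bytes) -> bytes:
    """Return the block if its tag matches, else fail like an I/O error."""
    if not hmac.compare_digest(tag(key, index, block), stored_tag):
        raise IOError(f"integrity check failed for block {index}")
    return block


# Three distinct blocks standing in for a tiny /usr/ image.
blocks = [bytes([n]) * BLOCK for n in range(3)]
tampered = [blocks[0], b"\xff" + blocks[1][1:], blocks[2]]

trusted_root = root_hash(blocks)
print(root_hash(tampered) == trusted_root)  # a changed block changes the root

key = b"hypothetical-secret-key"
stored = tag(key, 1, blocks[1])
read_block(key, 1, blocks[1], stored)       # untouched data reads fine
try:
    read_block(key, 1, tampered[1], stored)
except IOError as err:
    print(err)
```

The design difference shows in what has to be trusted: the hash tree needs only one trusted root value but cannot follow writes, while the keyed tags can be recomputed on every write by whoever holds the key.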

Encryption/Authentication of OS configuration and state

The OS state and configuration stores, i.e. the contents of /etc/ and /var/, can be treated as the root file system. The root file system should be both encrypted and authenticated, since it might contain secret keys, user passwords, sensitive logs and similar. The setup of choice here is dm-crypt (LUKS) + dm-integrity, which provides both authenticity and encryption. The secret key must be provided somehow, ideally by the TPM.

Encryption/Authentication of the User's Home Directory

The data in the user's home directory should be encrypted, as it usually contains personal and confidential information about the user.

We have seen the boot process of a system in an earlier section, and we now know that during boot the data on the disk is decrypted after the invocation of the initrd image. For user-specific encryption to make sense, we need to move away from the concept of a system-wide key and towards a per-user key. That ensures that the user's password is what unlocks the user's data.

systemd provides a service called systemd-homed that implements this behavior in a safe way: each user gets their own LUKS volume stored in a loopback file in /home/, and this is enough to synthesize a user account. The encryption password for this volume is the user's account password, so it really is the password provided at login time that unlocks the user's data.
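
The per-user key idea can be illustrated with a toy model. This is not how systemd-homed or LUKS are implemented: PBKDF2 is used here only as a stand-in key derivation function available in the Python standard library, and the `UserVolume` class, its stored check value, and the iteration count are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # hypothetical work factor, chosen only for the sketch


def derive_key(password: str, salt: bytes) -> bytes:
    """Stretch a login password into a 256-bit key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS, dklen=32)


class UserVolume:
    """Toy model of one per-user volume: a salt plus a check value for its key."""

    def __init__(self, password: str):
        self.salt = os.urandom(16)  # unique per user, stored next to the volume
        key = derive_key(password, self.salt)
        self._check = hashlib.sha256(key).digest()

    def unlock(self, password: str) -> bool:
        """Return True only if the password derives this volume's key."""
        key = derive_key(password, self.salt)
        return hmac.compare_digest(hashlib.sha256(key).digest(), self._check)


alice = UserVolume("alice-login-password")
bob = UserVolume("bob-login-password")

print(alice.unlock("alice-login-password"))  # the owner's password works
print(alice.unlock("bob-login-password"))    # another user's password does not
```

The point of the sketch is that no system-wide key appears anywhere: each volume can only be opened with the password of the user who owns it.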

Wrapping up

So, what is the state of encryption in modern-day Linux distributions anyway? It's definitely complicated and fragmented.

And to be honest, I wouldn't expect it to be any other way. Vendors rarely agree even on which package manager to ship, so there is little prospect of a unified encryption strategy. It's up to the user, as is everything in the Linux way of doing things.

Speaking of idealized systems, I believe that simplicity is the key so that vendors can find a unified pattern of empowering their users to encrypt their data.

The ideal OS would be simpler, without so many moving parts - especially during the boot process. Since UEFI is there and so is the ESP, placing everything there would be the simplest solution. That would allow the firmware to authenticate the boot loader/kernel/initrd without any further component in place for this.

In the end, the Linux ecosystem has always been that of a diasporic community and it is this dispersion that feeds creativity and diversity.

DISCLAIMER ⚠️ This work is inspired by Lennart Poettering's publication "Authenticated Boot and Disk Encryption on Linux". My goal was to streamline the information so that it is digestible and usable in fast-paced research.