As a small start-up time optimization, you can pick the best-suited compression algorithm for the initial ramdisk.
The Initial Ramdisk
When a Linux system boots, it needs to mount the root filesystem.
This may be relatively complicated, as it may be on a software RAID, on LVM, encrypted…
To keep things manageable, an initial ramdisk can be used to get a small environment that has all the required modules and configuration to load the root filesystem.
On Arch Linux, this initial ramdisk is generated using mkinitcpio.
It takes multiple parameters to tune various aspects of the system and of the generated ramdisk.
One such parameter is COMPRESSION.
It compresses the ramdisk to make the resulting image smaller.
The manpage reads:
Defines a program to filter the generated image through. The kernel understands the compression formats yielded by the zstd, gzip, bzip2, lz4, lzop, lzma, and xz compressors. If unspecified, this setting defaults to zstd compression. In order to create an uncompressed image, define this variable as cat.
Another reason to compress the image is that it may reduce the start-up time. To understand why, imagine that the image is 100 MiB in size and only 20 MiB after compression. Let’s say that the disk reads 10 MiB per second and that the CPU can decompress the full image in 1 second. If we keep the image uncompressed, the disk will need 10 seconds to read it, while it needs only 2 seconds to read the compressed image. Adding the decompression time, the compressed version requires only 3 seconds.
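The arithmetic of this example can be sketched in a few lines of shell. The load_time function and its arguments are hypothetical names used only for illustration, and the model assumes that reading and decompressing happen one after the other rather than overlapping.

```shell
# Toy model: total load time = read time + decompression time.
# Usage: load_time SIZE_MIB DISK_MIB_PER_S DECOMPRESS_S
# Integer arithmetic only, which is enough for the round numbers used here.
load_time() {
    echo $(( $1 / $2 + $3 ))
}

echo "uncompressed: $(load_time 100 10 0) s"   # 100 MiB to read, nothing to decompress
echo "compressed:   $(load_time 20 10 1) s"    # 20 MiB to read, 1 s of decompression
```

With the numbers from the example, this prints 10 s for the uncompressed image and 3 s for the compressed one.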
The above example is quite simple, but it illustrates the trade-off between a bigger image that the disk will take longer to read and a smaller image that may take longer to decompress. It is thus more of a spectrum, where more CPU-intensive compression (and decompression) methods could result in a smaller image and less read from the disk but more CPU time:
more read,                      less read,
less CPU                        more CPU
◄────────────────────────────────────►
uncompressed       lz4          zstd
Then, the question is: is it worth compressing an image more (or at all), to get a faster start-up time?
To answer this question on a particular machine[1], let’s compare the time required to read and decompress various initial ramdisks.
I’m using the linux package in version 5.18.15-arch1-2 from the Arch Linux repository. Then, I generate various images (sudo mkinitcpio -p linux), one per compression setting in the mkinitcpio configuration: no compression, lz4, and zstd.
Each image is copied into a directory and renamed according to the compression used, for example:
cp /boot/initramfs-linux.img initramfs-linux.img.zstd
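The generate-and-copy steps could also be scripted. What follows is only a sketch, not the procedure used for this post: instead of editing the configuration and running the linux preset, it writes a temporary configuration per compressor and passes it to mkinitcpio with the -c (config) and -g (output image) options. The function names are hypothetical, the base configuration is assumed to live in /etc/mkinitcpio.conf, and the uncompressed image gets a .cat suffix here for uniformity.

```shell
# Print the base config with COMPRESSION forced to the given program.
# Usage: config_with_compression BASE_CONFIG COMPRESSOR
config_with_compression() {
    # Drop any active COMPRESSION= line (COMPRESSION_OPTIONS= is left alone),
    # then append the setting we want.
    grep -v '^COMPRESSION=' "$1"
    printf 'COMPRESSION="%s"\n' "$2"
}

# Build one image per compressor into the current directory (needs root).
# Usage: build_images [BASE_CONFIG]
build_images() {
    bi_base=${1:-/etc/mkinitcpio.conf}   # assumed default location
    for bi_comp in cat lz4 zstd; do
        bi_tmp=$(mktemp)
        config_with_compression "$bi_base" "$bi_comp" >"$bi_tmp"
        sudo mkinitcpio -c "$bi_tmp" -g "initramfs-linux.img.$bi_comp"
        rm -f "$bi_tmp"
    done
}
```

Nothing runs until build_images is called explicitly. Since it builds for the running kernel rather than through the preset, the resulting images may differ slightly from the ones measured below.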
The result is as follows:
$ file *.img*
initramfs-linux.img:      ASCII cpio archive (SVR4 with no CRC)
initramfs-linux.img.lz4:  LZ4 compressed data (v0.1-v0.9)
initramfs-linux.img.zstd: Zstandard compressed data (v0.8+), Dictionary ID: None
The images are compressing quite well too:
We could compare more algorithms and compression levels, but compression levels would need to be passed through COMPRESSION_OPTIONS, which the manpage discourages, as it can result in an unbootable image.
Let’s run some decompression commands and compare their run-time with hyperfine. On a quiet computer:
$ hyperfine \
    --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' \
    'lz4 -d <./initramfs-linux.img.lz4' \
    'zstd -d <initramfs-linux.img.zstd' \
    'cat <initramfs-linux.img'
Note that the command has a --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' argument.
This empties the OS file system caches to be closer to start-up conditions: when the computer starts, everything has to be read from the disk as the RAM is basically empty.
Without the --prepare argument, we get much shorter times, e.g. 45 ms for lz4.
Here are the results:
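| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| lz4 -d <./initramfs-linux.img.lz4 | 137.9 ± 13.5 | 122.8 | 157.4 | 1.00 |
| zstd -d <initramfs-linux.img.zstd | 164.9 ± 13.4 | 153.6 | 187.9 | 1.20 ± 0.15 |
| cat <initramfs-linux.img | 175.9 ± 19.0 | 157.9 | 218.8 | 1.28 ± 0.19 |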
lz4 is the fastest, followed by zstd, with no compression at all (cat) coming last.
If we go back to the sizes table, the trade-off between a smaller image but a slower decompression is clear.
Despite a ~30% smaller file size, zstd is still a bit slower to decompress than lz4, while no compression at all is even worse.
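As a sanity check, the Relative column of the results table can be recomputed from the mean times, dividing each mean by the lz4 mean. The relative function below is a hypothetical helper written for this illustration; it ignores the ± uncertainties.

```shell
# Ratio of a mean time to the baseline mean, rounded to two decimals.
# Usage: relative MEAN_MS BASELINE_MS
relative() {
    # LC_ALL=C keeps the decimal separator a dot regardless of the locale.
    LC_ALL=C awk -v mean="$1" -v base="$2" 'BEGIN { printf "%.2f\n", mean / base }'
}

echo "zstd vs lz4: $(relative 164.9 137.9)"
echo "cat  vs lz4: $(relative 175.9 137.9)"
```

This gives 1.20 for zstd and 1.28 for the uncompressed image, matching the table.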
The above results are based on runs on a particular machine. As mentioned, different machines will yield different results, depending on the relative performance of the disk and the CPU. It’s also a pretty small improvement in the grand scheme of things: only a few tens of milliseconds on a process that takes a couple of seconds. But I found it to be a nice example of how compression can make things faster, compared to no compression at all, because CPUs nowadays are so fast.
Appendix: Recording of the
This was done on a different run from the table above, as running the benchmark through Asciinema is sometimes a bit less stable.
- 2022-08-02: Slightly reworded the intro to account for https://fosstodon.org/@pixelherodev/108927525223368261
[1] The conclusions will in all likelihood change depending on the machine, namely the relative performance of the CPU and the disk.