Hi all,
ZSTD compression patches have been sent in a number of times over the
past few years. Every time, someone asks for benchmarks. Every time,
someone is concerned about compression time. Sometimes, someone provides
benchmarks.
But, as far as I can tell, nobody considered the compression parameters,
which have a significant impact on compression time and ratio.
So, I did some benchmarks myself, including all the compression levels
for each compressor.
Results:
The results are attached as SVG graphs and CSV data.
Summary:
- compression level, predictably, has a huge impact on compression time.
- compression level has virtually no impact on decompression time for
lz4 and zstd, and some effect on the others. Interestingly, xz decompresses
slightly faster at higher compression levels (perhaps cache-related).
- gzip compresses slightly faster than zstd at medium compression levels.
- bzip2 sucks: slow compression, very slow decompression, poor ratio.
- lzma decompresses slightly faster than xz, but is also slightly larger.
- xz is smallest but with very slow compression and decompression.
- lz4 decompresses fastest.
- zstd is a good balanced default.
- 7z is much faster than xz, even with wine overhead.
Files:
For the kernel, I did "make allmodconfig; sed -i -e '/=m$/d' .config"
with a 5.6 kernel and gcc 9.3.0 on x86_64, then concatenated vmlinux.bin
and vmlinux.relocs. For the initramfs, I used the Arch Linux fallback
initramfs with default hooks.
Versions:
gzip 1.10
bzip2, a block-sorting file compressor. Version 1.0.8, 13-Jul-2019.
xz (XZ Utils) 5.2.5
*** LZ4 command line interface 64-bits v1.9.2, by Yann Collet ***
lzop 1.04
LZO library 2.10
*** zstd command line interface 64-bits v1.4.4, by Yann Collet ***
7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21
Notes:
I used the userspace versions of the decompressors, not the kernel
version. This is particularly relevant for xz, as the kernel xzminidec
is significantly slower than xz.
pigz is faster than gzip, but I used gzip as a common baseline.
7-Zip was run through wine with a persistent wineserver.
I ran the benchmark on a Ryzen 1600, with turbo boost turned off. Each
test was run only once, on the basis that any noise wouldn't disrupt the
overall curve, and also I don't want to spend hours waiting for the
results.
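
For anyone who wants to reproduce the shape of these curves without the
command-line tools, here is a minimal sketch of the same idea: time one
compress/decompress round trip per level. It is not the script used for
the attached graphs. It only covers the compressors in the Python
standard library (zlib, bz2, lzma), and the names bench() and run_all()
are made up for this illustration.

```python
import bz2
import lzma
import time
import zlib


def bench(name, compress, decompress, data):
    """Time one compress/decompress round trip and return a result row."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    unpacked = decompress(packed)
    t2 = time.perf_counter()
    assert unpacked == data  # the round trip must be lossless
    return {
        "name": name,
        "ratio": len(packed) / len(data),  # smaller is better
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


def run_all(data, levels=range(1, 10)):
    """Run each stdlib compressor once per level (a single run, so noisy)."""
    rows = []
    for level in levels:
        rows.append(bench(f"zlib -{level}",
                          lambda d, l=level: zlib.compress(d, l),
                          zlib.decompress, data))
        rows.append(bench(f"bz2 -{level}",
                          lambda d, l=level: bz2.compress(d, l),
                          bz2.decompress, data))
        rows.append(bench(f"xz -{level}",
                          lambda d, l=level: lzma.compress(d, preset=l),
                          lzma.decompress, data))
    return rows
```

Feeding run_all() the bytes of a kernel image or initramfs and printing
each row gives one (ratio, compress time, decompress time) point per
level, which is the data behind graphs like the attached ones. Note that
the higher xz presets allocate a lot of memory when compressing.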
The current compression level defaults are:
- gzip -9
- bzip2 -9
- lzma -9
- xz --check=crc32 --x86 --lzma2=,dict=32MiB # except on ppc
- lzop -9
- lz4 -l -1
My conclusions:
- zstd is an improvement on almost all metrics.
- bzip2 and lzma should be removed post-haste.
- lzo should be removed once zstd is merged.
- compression level is important to consider for compression speed: the
default lz4 -1 compresses very fast but has a very poor compression
ratio. zstd -19 compresses barely better than zstd -18, but takes
significantly longer to compress.
- compression level should be configurable: lz4 -1 is useful, but so is
lz4 -9. zstd -1 is useful, but so is zstd -19. zstd -1 is useful for
developers who want kernel builds as fast as possible, zstd -19 for
everybody else.
- gzip is by far not the fastest compressor (even excluding cat)
- modern compressors (xz, lz4, zstd) decompress about as fast for each
compression level, only requiring more memory
- 7-Zip is much faster than xz, needs more research
- 7-Zip BCJ2 is slightly better than xz/BCJ. Better filters for all
architectures would probably be a good area of research, as apparently
BCJ/BCJ2 are intended only for 32-bit x86.
Thanks,
Alex.
Hi Alex,
(sorry... maybe my @gmx.com email is broken again...)
On Wed, Jul 01, 2020 at 10:35:48AM -0400, Alex Xu (Hello71) wrote:
>
> My conclusions:
>
> - zstd is an improvement on almost all metrics.
> - bzip2 and lzma should be removed post-haste.
I'm somewhat familiar with LZ4 and LZMA (xz) internals.
I'd like to add some notes from a first-principles perspective,
but I'm not sure whether I will join any further discussion
on this...
XZ uses LZMA2, which is based on LZMA.
It uses range coding. In principle, it has a better
compression ratio but the slowest speed (because it decodes
bit by bit with multiplications rather than with a lookup table).
Zstd, by contrast, uses Huffman coding (much like deflate, aka
gzip) and FSE (if I'm not wrong about Zstd)...
So in general (apart from the specific implementation),
the ordering from fastest decompression to best compression ratio is
LZ4 - Zstd - LZMA
Some parameters, such as compression level, affect the
LZ match finder (yeah, except for bzip2, all of these algorithms
are LZ-based) and the dictionary size. And some specific
compressors aren't well optimized (e.g. zlib).
Anyway, I think LZMA (xz) is still useful, and it is currently more
friendly to fixed-sized output compression than Zstd (but
yeah, I'm not familiar with all the ZSTD internals. I will dig
into that if I have more spare time).
> - lzo should be removed once zstd is merged.
> - compression level is important to consider for compression speed: the
> default lz4 -1 compresses very fast but has a very poor compression
> ratio. zstd -19 compresses barely better than zstd -18, but takes
> significantly longer to compress.
> - compression level should be configurable: lz4 -1 is useful, but so is
> lz4 -9. zstd -1 is useful, but so is zstd -19. zstd -1 is useful for
> developers who want kernel builds as fast as possible, zstd -19 for
> everybody else.
> - gzip is by far not the fastest compressor (even excluding cat)
> - modern compressors (xz, lz4, zstd) decompress about as fast for each
> compression level, only requiring more memory
lz4 has a fixed sliding window (dictionary, 64k), so it won't
require more memory at different compression levels when
decompressing.
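
For contrast, the memory an xz decoder needs follows the dictionary size
stored in the stream rather than a fixed window. A minimal sketch of that
using Python's lzma module, which wraps liblzma (the userspace library,
not the kernel decompressor). xz_compress() and fits_in() are
hypothetical helper names, and the 1/8 MiB dictionary sizes are chosen
only to keep the example cheap:

```python
import lzma

MiB = 1024 * 1024


def xz_compress(data, dict_size):
    """Compress to .xz with an explicit LZMA2 dictionary size and CRC32 check."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return lzma.compress(data, format=lzma.FORMAT_XZ,
                         check=lzma.CHECK_CRC32, filters=filters)


def fits_in(packed, memlimit):
    """Return True if the stream decodes within `memlimit` bytes of decoder memory."""
    try:
        lzma.LZMADecompressor(memlimit=memlimit).decompress(packed)
        return True
    except lzma.LZMAError:
        # liblzma refuses streams whose declared dictionary exceeds the limit.
        return False
```

With the same input, a stream written with an 8 MiB dictionary should be
rejected by fits_in(packed, 4 * MiB) but accepted with a 16 MiB limit,
while a 1 MiB dictionary stream fits in 4 MiB. The input size does not
matter; the decoder sizes its buffer from the stream header.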
Thanks,
Gao Xiang
Excerpts from Gao Xiang's message of July 1, 2020 11:50 am:
> Anyway, I think LZMA (xz) is still useful and which is more
> friendly to fixed-sized output compression than Zstd yet (But
> yeah, I'm not familiar with all ZSTD internals. I will dig
> into that if I've more extra time).
Yes, I agree. If you look at the graphs, LZMA2 (xz/7zip) still produces
smaller results, even compared to zstd maximum settings, so definitely
LZMA2 should be kept, at least for now. I am only suggesting removing
LZMA, since it has no benefits over xz and zstd combination (bigger than
xz, slower than zstd).
>> - modern compressors (xz, lz4, zstd) decompress about as fast for each
>> compression level, only requiring more memory
>
> lz4 has fixed sliding window (dictionary, 64k), so it won't
> require more memory among different compression level when
> decompressing.
Yes, this is true. I tried to simplify among all compressors, but I
think I simplified too much. Thanks for clarifying.
Cheers,
Alex.
On Wed, Jul 01, 2020 at 10:35:48AM -0400, Alex Xu (Hello71) wrote:
> ZSTD compression patches have been sent in a number of times over the
> past few years. Every time, someone asks for benchmarks. Every time,
> someone is concerned about compression time. Sometimes, someone provides
> benchmarks.
Where's the latest series for this, btw? I thought it had landed. :P It
seemed like it was done.
--
Kees Cook
On Thu, Jul 2, 2020 at 5:18 PM Kees Cook <[email protected]> wrote:
>
> On Wed, Jul 01, 2020 at 10:35:48AM -0400, Alex Xu (Hello71) wrote:
> > ZSTD compression patches have been sent in a number of times over the
> > past few years. Every time, someone asks for benchmarks. Every time,
> > someone is concerned about compression time. Sometimes, someone provides
> > benchmarks.
>
> Where's the latest series for this, btw? I thought it had landed. :P It
> seemed like it was done.
>
Hi,
Again, I would like to see this upstream, too.
Last time, I asked for a rebase against Linux v5.8-rc1 or later.
Beyond that rebase, the latest series "zstd-v5" of Nick T.'s
patchset needs zstd to be added on top of the following patch (see [1]):
commit 8dfb61dcbaceb19a5ded5e9c9dcf8d05acc32294
"kbuild: add variables for compression tools"
NOTE:
"zstd-v5" was based on Linux-next 20200408. Alternatively, download the
series from LKML patchwork, which applies cleanly against Linux v5.7;
the latter is what I did.
There was a follow-up to the above patch (see [2]):
commit e4a42c82e943b97ce124539fcd7a47445b43fa0d
"kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables"
Nevertheless, this is only the kernel side - user space, for example
Debian's initramfs-tools, needs adaptations as well (see [3]).
@Kees: Can you aid Nick T. to get this upstream? You know the
processes a bit better than me.
Regards,
- Sedat -
[0] https://github.com/terrelln/linux/tree/zstd-v5
[0] https://lore.kernel.org/patchwork/project/lkml/list/?series=437934
[1] https://git.kernel.org/linus/8dfb61dcbaceb19a5ded5e9c9dcf8d05acc32294
[2] https://git.kernel.org/linus/e4a42c82e943b97ce124539fcd7a47445b43fa0d
[3] https://bugs.debian.org/955469
On Fri, Jul 03, 2020 at 10:15:20AM +0200, Sedat Dilek wrote:
> On Thu, Jul 2, 2020 at 5:18 PM Kees Cook <[email protected]> wrote:
> >
> > On Wed, Jul 01, 2020 at 10:35:48AM -0400, Alex Xu (Hello71) wrote:
> > > ZSTD compression patches have been sent in a number of times over the
> > > past few years. Every time, someone asks for benchmarks. Every time,
> > > someone is concerned about compression time. Sometimes, someone provides
> > > benchmarks.
> >
> > Where's the latest series for this, btw? I thought it had landed. :P It
> > seemed like it was done.
> >
>
> Hi,
>
> Again, I would like to see this upstream, too.
>
> Last I asked for a rebase against Linux v5.8-rc1 or later.
>
> Beyond above adaptations, the latest series "zstd-v5" of Nick T.s
> patchset needs some addition of zstd to the patch (see [1]):
>
> commit 8dfb61dcbaceb19a5ded5e9c9dcf8d05acc32294
> "kbuild: add variables for compression tools"
>
> NOTE:
> "zstd-v5" was against Linux-next 20200408 or download the series from
> patchwork LKML which applies cleanly against Linux v5.7 - last is what
> I did.
>
> There was a follow-up to the above patch (see [2]):
>
> commit e4a42c82e943b97ce124539fcd7a47445b43fa0d
> "kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables"
Okay, cool. Yes, now is the right time to send an updated series based
on v5.8-rc2 with any outstanding adjustments/fixes made.
It seems v5 is here?
https://lore.kernel.org/lkml/[email protected]/
That wasn't sent "to" a maintainer, so it likely went unnoticed by either
akpm or the x86 maintainers. I think this should likely go via the x86
tree.
> Nevertheless, this is the kernel-side of doing - user-space like for
> example Debian's initramfs-tools needs adaptations (see [3]).
Right, but the kernel needs to implement the support first. :)
> @Kees: Can you aid Nick T. to get this upstream? You know the
> processes a bit better than me.
Sure; Nick, can you please rebase and handle any issues from v5? With
the result, send a v6 as you did for v5 before, but I would make your
"to" be:
Borislav Petkov <[email protected]>
Thomas Gleixner <[email protected]>
and keep the CC as you had it.
--
Kees Cook
On Fri, Jul 3, 2020 at 6:06 PM Kees Cook <[email protected]> wrote:
>
> On Fri, Jul 03, 2020 at 10:15:20AM +0200, Sedat Dilek wrote:
> > On Thu, Jul 2, 2020 at 5:18 PM Kees Cook <[email protected]> wrote:
> > >
> > > On Wed, Jul 01, 2020 at 10:35:48AM -0400, Alex Xu (Hello71) wrote:
> > > > ZSTD compression patches have been sent in a number of times over the
> > > > past few years. Every time, someone asks for benchmarks. Every time,
> > > > someone is concerned about compression time. Sometimes, someone provides
> > > > benchmarks.
> > >
> > > Where's the latest series for this, btw? I thought it had landed. :P It
> > > seemed like it was done.
> > >
> >
> > Hi,
> >
> > Again, I would like to see this upstream, too.
> >
> > Last I asked for a rebase against Linux v5.8-rc1 or later.
> >
> > Beyond above adaptations, the latest series "zstd-v5" of Nick T.s
> > patchset needs some addition of zstd to the patch (see [1]):
> >
> > commit 8dfb61dcbaceb19a5ded5e9c9dcf8d05acc32294
> > "kbuild: add variables for compression tools"
> >
> > NOTE:
> > "zstd-v5" was against Linux-next 20200408 or download the series from
> > patchwork LKML which applies cleanly against Linux v5.7 - last is what
> > I did.
> >
> > There was a follow-up to the above patch (see [2]):
> >
> > commit e4a42c82e943b97ce124539fcd7a47445b43fa0d
> > "kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables"
>
> Okay, cool. Yes, now is the right time to send an updated series based
> on v5.8-rc2 with any outstanding adjusted/fixes made.
>
> It seems v5 is here?
> https://lore.kernel.org/lkml/[email protected]/
>
> That wasn't sent "to" a maintainer, so it likely went unnoticed by either
> akpm or the x86 maintainers. I think this should likely go via the x86
> tree.
>
> > Nevertheless, this is the kernel-side of doing - user-space like for
> > example Debian's initramfs-tools needs adaptations (see [3]).
>
> Right, but the kernel needs to implement the support first. :)
>
> > @Kees: Can you aid Nick T. to get this upstream? You know the
> > processes a bit better than me.
>
> Sure; Nick, can you please rebase and handle any issues from v5? With
> the result, send a v6 as you did for v5 before, but I would make your
> "to" be:
>
> Borislav Petkov <[email protected]>
> Thomas Gleixner <[email protected]>
I got the hint to bring in Andrew Morton <[email protected]> [1],
so you might add him as well (he signed off on changes in lib/decompress*.c).
Norbert
[1] https://lwn.net/ml/linux-kernel/CADYdroP0zdz=QtuDFCXpkDohEAgGOc7hDHT8_NnqKuvi979J5Q@mail.gmail.com/
>
> and keep the CC as you had it.
>
> --
> Kees Cook
> On Jul 3, 2020, at 12:06 PM, Kees Cook <[email protected]> wrote:
>
> On Fri, Jul 03, 2020 at 10:15:20AM +0200, Sedat Dilek wrote:
>> On Thu, Jul 2, 2020 at 5:18 PM Kees Cook <[email protected]> wrote:
>>>
>>> On Wed, Jul 01, 2020 at 10:35:48AM -0400, Alex Xu (Hello71) wrote:
>>>> ZSTD compression patches have been sent in a number of times over the
>>>> past few years. Every time, someone asks for benchmarks. Every time,
>>>> someone is concerned about compression time. Sometimes, someone provides
>>>> benchmarks.
>>>
>>> Where's the latest series for this, btw? I thought it had landed. :P It
>>> seemed like it was done.
>>>
>>
>> Hi,
>>
>> Again, I would like to see this upstream, too.
>>
>> Last I asked for a rebase against Linux v5.8-rc1 or later.
>>
>> Beyond above adaptations, the latest series "zstd-v5" of Nick T.s
>> patchset needs some addition of zstd to the patch (see [1]):
>>
>> commit 8dfb61dcbaceb19a5ded5e9c9dcf8d05acc32294
>> "kbuild: add variables for compression tools"
>>
>> NOTE:
>> "zstd-v5" was against Linux-next 20200408 or download the series from
>> patchwork LKML which applies cleanly against Linux v5.7 - last is what
>> I did.
>>
>> There was a follow-up to the above patch (see [2]):
>>
>> commit e4a42c82e943b97ce124539fcd7a47445b43fa0d
>> "kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables"
>
> Okay, cool. Yes, now is the right time to send an updated series based
> on v5.8-rc2 with any outstanding adjusted/fixes made.
>
> It seems v5 is here?
> https://lore.kernel.org/lkml/[email protected]/
>
> That wasn't sent "to" a maintainer, so it likely went unnoticed by either
> akpm or the x86 maintainers. I think this should likely go via the x86
> tree.
>
>> Nevertheless, this is the kernel-side of doing - user-space like for
>> example Debian's initramfs-tools needs adaptations (see [3]).
>
> Right, but the kernel needs to implement the support first. :)
>
>> @Kees: Can you aid Nick T. to get this upstream? You know the
>> processes a bit better than me.
>
> Sure; Nick, can you please rebase and handle any issues from v5? With
> the result, send a v6 as you did for v5 before, but I would make your
> "to" be:
>
> Borislav Petkov <[email protected]>
> Thomas Gleixner <[email protected]>
>
> and keep the CC as you had it.
I’ll send it out today, thanks for the advice!
-Nick
Hello
I looked at the SVG graphs and it appears that the formula used wasn't
T_load+T_decompress, but was just T_decompress.
Without considering the time it takes to load the compressed data from
a storage device, the SVG graphs tell only half the story and might be
misleading.
There are three typical classes of device speed nowadays. The sequential
read speed of a large, non-fragmented compressed file falls into one of
the following:
100 MB/s: rotational disks and USB flash drives
500 MB/s: SATA SSD
2 GB/s: NVMe SSD
The read speeds of USB flash devices vary a lot, but recent
high-speed USB flash drives fall into the 100 MB/s category of
rotational disks. Taking USB flash read speed into consideration is
important for deciding which compression to use when creating the ISO
image of a Linux distribution.
In summary: instead of the single kernel-decomp.svg file, there should
be three kernel-read-decomp.svg files, one per device class. The same
applies to the initramfs-decomp.svg file.
As a rule of thumb, if the kernel and initramfs are stored on an NVMe
SSD, then simply select the fastest decompressor without considering
the compression ratio, or avoid compression altogether, in which
case T_decompress is zero.
The formula T_load+T_decompress assumes that loading and decompression
aren't executing in parallel. If they are, the formula should be
max(T_load, T_decompress).
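
The two formulas above can be sketched as a small model. This is only an
illustration: boot_time() and best() are hypothetical names, the device
speeds are the three round figures listed above, and any sizes or
decompression times passed in are the caller's own measurements, not data
from this thread.

```python
# Sequential read speeds in bytes per second, taken from the list above.
DEVICES = {
    "HDD / USB flash": 100e6,
    "SATA SSD": 500e6,
    "NVMe SSD": 2e9,
}


def boot_time(compressed_bytes, read_bytes_per_s, t_decompress, parallel=False):
    """Seconds to load and decompress an image on a given device."""
    t_load = compressed_bytes / read_bytes_per_s
    if parallel:
        # Loading and decompression overlap: the slower stage dominates.
        return max(t_load, t_decompress)
    # Loading finishes before decompression starts.
    return t_load + t_decompress


def best(candidates, read_bytes_per_s, parallel=False):
    """Pick the candidate with the lowest total time.

    `candidates` maps a label to (compressed_bytes, t_decompress_seconds).
    """
    return min(
        candidates,
        key=lambda name: boot_time(candidates[name][0], read_bytes_per_s,
                                   candidates[name][1], parallel),
    )
```

Running best() once per entry in DEVICES with measured (size, decompress
time) pairs yields the three per-device rankings that the proposed
kernel-read-decomp.svg graphs would show.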
Sincerely
Jan