From: Nick Terrell <[email protected]>
This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.
I have not yet release zstd-1.4.6 upstream. I want the zstd version in the kernel
to match up with a known upstream release, so we know exactly what code is
running. Whenever this patchset is ready for merge, I will cut a release at the
upstream commit that gets merged. This should not be necessary for future
releases.
The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:
1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unncessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.
This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.
I've updated zstd to upstream with one big patch because every commit must build,
so that precludes partial updates. Since the commit is 100% generated, I hope the
review burden is lightened. I considered replaying upstream commits, but that is
not possible because there have been ~3500 upstream commits since the last zstd
import, and the commits don't all build individually. The bulk update preserves
bisectablity because bugs can be bisected to the zstd version update. At that
point the update can be reverted, and we can work with upstream to find and fix
the bug. After this big switch in how the kernel consumes zstd, future patches
will be smaller, because they will only have one upstream release worth of
changes each.
This patchset changes the zstd API from a custom kernel API to the upstream API.
I considered wrapping the upstream API with a wrapper that is closer to the
kernel style guide. Following advise from https://lkml.org/lkml/2020/9/17/814
I've chosen to use the upstream API directly, to minimize opportunities to
introduce bugs, and because using the upstream API directly makes debugging and
communication with upstream easier.
This patchset comes in 3 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
compatibility wrapper so zstd can be upgraded without modifying any callers.
The second patch adds an indirection for the lib/decompress_unzstd.c including
of all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility
wrapper.
I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series.
I tested kernel and initramfs decompression in i386 and arm.
I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster
The latest zstd also offers bug fixes and a 1 KB reduction in stack uage during
compression. For example the recent problem with large kernel decompression has
been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.
Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.
Best,
Nick Terrell
v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown.
This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.
v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
https://github.com/facebook/zstd/pull/2324
Stack size warnings remain, but these aren't new, and the functions it warns on
are either unused or not in the maximum stack path. This patchset reduces zstd
compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
more stack reduction would require significant changes that have the potential
to introduce new bugs. However, I do hope to continue to reduce zstd stack
usage in future versions.
v3 -> v4:
* (3/9) Fix errors and warnings reported by Kernel Test Robot:
https://github.com/facebook/zstd/pull/2326
- Replace mem.h with a custom kernel implementation that matches the current
lib/zstd/mem.h in the kernel. This avoids calls to __builtin_bswap*() which
don't work on certain architectures, as exposed by the Kernel Test Robot.
- Remove ASAN/MSAN (un)poisoning code which doesn't work in the kernel, as
exposed by the Kernel Test Robot.
- I've fixed all of the valid cppcheck warnings reported, but there were many
false positives, where cppcheck was incorrectly analyzing the situation,
which I did not fix. I don't believe it is reasonable to expect that upstream
zstd silences all the static analyzer false positives. Upstream zstd uses
clang scan-build for its static analysis. We find that supporting multiple
static analysis tools multiplies the burden of silencing false positives,
without providing enough marginal value over running a single static analysis
tool.
Nick Terrell (9):
lib: zstd: Add zstd compatibility wrapper
lib: zstd: Add decompress_sources.h for decompress_unzstd
lib: zstd: Upgrade to latest upstream zstd version 1.4.6
crypto: zstd: Switch to zstd-1.4.6 API
btrfs: zstd: Switch to the zstd-1.4.6 API
f2fs: zstd: Switch to the zstd-1.4.6 API
squashfs: zstd: Switch to the zstd-1.4.6 API
lib: unzstd: Switch to the zstd-1.4.6 API
lib: zstd: Remove zstd compatibility wrapper
crypto/zstd.c | 22 +-
fs/btrfs/zstd.c | 46 +-
fs/f2fs/compress.c | 100 +-
fs/squashfs/zstd_wrapper.c | 7 +-
include/linux/zstd.h | 3021 ++++++++----
include/linux/zstd_errors.h | 76 +
lib/decompress_unzstd.c | 44 +-
lib/zstd/Makefile | 35 +-
lib/zstd/bitstream.h | 379 --
lib/zstd/common/bitstream.h | 437 ++
lib/zstd/common/compiler.h | 150 +
lib/zstd/common/cpu.h | 194 +
lib/zstd/common/debug.c | 24 +
lib/zstd/common/debug.h | 101 +
lib/zstd/common/entropy_common.c | 355 ++
lib/zstd/common/error_private.c | 55 +
lib/zstd/common/error_private.h | 66 +
lib/zstd/common/fse.h | 709 +++
lib/zstd/common/fse_decompress.c | 380 ++
lib/zstd/common/huf.h | 352 ++
lib/zstd/common/mem.h | 258 +
lib/zstd/common/zstd_common.c | 83 +
lib/zstd/common/zstd_deps.h | 124 +
lib/zstd/common/zstd_internal.h | 438 ++
lib/zstd/compress.c | 3485 --------------
lib/zstd/compress/fse_compress.c | 625 +++
lib/zstd/compress/hist.c | 165 +
lib/zstd/compress/hist.h | 75 +
lib/zstd/compress/huf_compress.c | 764 +++
lib/zstd/compress/zstd_compress.c | 4157 +++++++++++++++++
lib/zstd/compress/zstd_compress_internal.h | 1103 +++++
lib/zstd/compress/zstd_compress_literals.c | 158 +
lib/zstd/compress/zstd_compress_literals.h | 29 +
lib/zstd/compress/zstd_compress_sequences.c | 433 ++
lib/zstd/compress/zstd_compress_sequences.h | 54 +
lib/zstd/compress/zstd_compress_superblock.c | 849 ++++
lib/zstd/compress/zstd_compress_superblock.h | 32 +
lib/zstd/compress/zstd_cwksp.h | 465 ++
lib/zstd/compress/zstd_double_fast.c | 521 +++
lib/zstd/compress/zstd_double_fast.h | 32 +
lib/zstd/compress/zstd_fast.c | 496 ++
lib/zstd/compress/zstd_fast.h | 31 +
lib/zstd/compress/zstd_lazy.c | 1138 +++++
lib/zstd/compress/zstd_lazy.h | 61 +
lib/zstd/compress/zstd_ldm.c | 619 +++
lib/zstd/compress/zstd_ldm.h | 104 +
lib/zstd/compress/zstd_opt.c | 1200 +++++
lib/zstd/compress/zstd_opt.h | 50 +
lib/zstd/decompress.c | 2531 ----------
lib/zstd/decompress/huf_decompress.c | 1205 +++++
lib/zstd/decompress/zstd_ddict.c | 241 +
lib/zstd/decompress/zstd_ddict.h | 44 +
lib/zstd/decompress/zstd_decompress.c | 1836 ++++++++
lib/zstd/decompress/zstd_decompress_block.c | 1540 ++++++
lib/zstd/decompress/zstd_decompress_block.h | 62 +
.../decompress/zstd_decompress_internal.h | 195 +
lib/zstd/decompress_sources.h | 18 +
lib/zstd/entropy_common.c | 243 -
lib/zstd/error_private.h | 53 -
lib/zstd/fse.h | 575 ---
lib/zstd/fse_compress.c | 795 ----
lib/zstd/fse_decompress.c | 325 --
lib/zstd/huf.h | 212 -
lib/zstd/huf_compress.c | 772 ---
lib/zstd/huf_decompress.c | 960 ----
lib/zstd/mem.h | 151 -
lib/zstd/zstd_common.c | 75 -
lib/zstd/zstd_compress_module.c | 79 +
lib/zstd/zstd_decompress_module.c | 79 +
lib/zstd/zstd_internal.h | 273 --
lib/zstd/zstd_opt.h | 1014 ----
71 files changed, 24368 insertions(+), 13012 deletions(-)
create mode 100644 include/linux/zstd_errors.h
delete mode 100644 lib/zstd/bitstream.h
create mode 100644 lib/zstd/common/bitstream.h
create mode 100644 lib/zstd/common/compiler.h
create mode 100644 lib/zstd/common/cpu.h
create mode 100644 lib/zstd/common/debug.c
create mode 100644 lib/zstd/common/debug.h
create mode 100644 lib/zstd/common/entropy_common.c
create mode 100644 lib/zstd/common/error_private.c
create mode 100644 lib/zstd/common/error_private.h
create mode 100644 lib/zstd/common/fse.h
create mode 100644 lib/zstd/common/fse_decompress.c
create mode 100644 lib/zstd/common/huf.h
create mode 100644 lib/zstd/common/mem.h
create mode 100644 lib/zstd/common/zstd_common.c
create mode 100644 lib/zstd/common/zstd_deps.h
create mode 100644 lib/zstd/common/zstd_internal.h
delete mode 100644 lib/zstd/compress.c
create mode 100644 lib/zstd/compress/fse_compress.c
create mode 100644 lib/zstd/compress/hist.c
create mode 100644 lib/zstd/compress/hist.h
create mode 100644 lib/zstd/compress/huf_compress.c
create mode 100644 lib/zstd/compress/zstd_compress.c
create mode 100644 lib/zstd/compress/zstd_compress_internal.h
create mode 100644 lib/zstd/compress/zstd_compress_literals.c
create mode 100644 lib/zstd/compress/zstd_compress_literals.h
create mode 100644 lib/zstd/compress/zstd_compress_sequences.c
create mode 100644 lib/zstd/compress/zstd_compress_sequences.h
create mode 100644 lib/zstd/compress/zstd_compress_superblock.c
create mode 100644 lib/zstd/compress/zstd_compress_superblock.h
create mode 100644 lib/zstd/compress/zstd_cwksp.h
create mode 100644 lib/zstd/compress/zstd_double_fast.c
create mode 100644 lib/zstd/compress/zstd_double_fast.h
create mode 100644 lib/zstd/compress/zstd_fast.c
create mode 100644 lib/zstd/compress/zstd_fast.h
create mode 100644 lib/zstd/compress/zstd_lazy.c
create mode 100644 lib/zstd/compress/zstd_lazy.h
create mode 100644 lib/zstd/compress/zstd_ldm.c
create mode 100644 lib/zstd/compress/zstd_ldm.h
create mode 100644 lib/zstd/compress/zstd_opt.c
create mode 100644 lib/zstd/compress/zstd_opt.h
delete mode 100644 lib/zstd/decompress.c
create mode 100644 lib/zstd/decompress/huf_decompress.c
create mode 100644 lib/zstd/decompress/zstd_ddict.c
create mode 100644 lib/zstd/decompress/zstd_ddict.h
create mode 100644 lib/zstd/decompress/zstd_decompress.c
create mode 100644 lib/zstd/decompress/zstd_decompress_block.c
create mode 100644 lib/zstd/decompress/zstd_decompress_block.h
create mode 100644 lib/zstd/decompress/zstd_decompress_internal.h
create mode 100644 lib/zstd/decompress_sources.h
delete mode 100644 lib/zstd/entropy_common.c
delete mode 100644 lib/zstd/error_private.h
delete mode 100644 lib/zstd/fse.h
delete mode 100644 lib/zstd/fse_compress.c
delete mode 100644 lib/zstd/fse_decompress.c
delete mode 100644 lib/zstd/huf.h
delete mode 100644 lib/zstd/huf_compress.c
delete mode 100644 lib/zstd/huf_decompress.c
delete mode 100644 lib/zstd/mem.h
delete mode 100644 lib/zstd/zstd_common.c
create mode 100644 lib/zstd/zstd_compress_module.c
create mode 100644 lib/zstd/zstd_decompress_module.c
delete mode 100644 lib/zstd/zstd_internal.h
delete mode 100644 lib/zstd/zstd_opt.h
--
2.28.0
From: Nick Terrell <[email protected]>
Adds zstd_compat.h which provides the necessary functions from the
current zstd.h API. It is only active for zstd versions 1.4.6 and newer.
That means it is disabled currently, but will become active when a later
patch in this series updates the zstd library in the kernel to 1.4.6.
This header allows the zstd upgrade to 1.4.6 without changing any
callers, since they all include zstd through the compatibility wrapper.
Later patches in this series transition each caller away from the
compatibility wrapper. After all the callers have been transitioned away
from the compatibility wrapper, the final patch in this series deletes
it.
Signed-off-by: Nick Terrell <[email protected]>
---
crypto/zstd.c | 2 +-
fs/btrfs/zstd.c | 2 +-
fs/f2fs/compress.c | 2 +-
fs/squashfs/zstd_wrapper.c | 2 +-
include/linux/zstd_compat.h | 116 ++++++++++++++++++++++++++++++++++++
lib/decompress_unzstd.c | 2 +-
6 files changed, 121 insertions(+), 5 deletions(-)
create mode 100644 include/linux/zstd_compat.h
diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..dcda3cad3b5c 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
#include <linux/module.h>
#include <linux/net.h>
#include <linux/vmalloc.h>
-#include <linux/zstd.h>
+#include <linux/zstd_compat.h>
#include <crypto/internal/scompress.h>
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..a7367ff573d4 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
#include <linux/refcount.h>
#include <linux/sched.h>
#include <linux/slab.h>
-#include <linux/zstd.h>
+#include <linux/zstd_compat.h>
#include "misc.h"
#include "compression.h"
#include "ctree.h"
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 1dfb126a0cb2..e056f3a2b404 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,7 @@
#include <linux/backing-dev.h>
#include <linux/lzo.h>
#include <linux/lz4.h>
-#include <linux/zstd.h>
+#include <linux/zstd_compat.h>
#include "f2fs.h"
#include "node.h"
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index b7cb1faa652d..f8c512a6204e 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
#include <linux/mutex.h>
#include <linux/bio.h>
#include <linux/slab.h>
-#include <linux/zstd.h>
+#include <linux/zstd_compat.h>
#include <linux/vmalloc.h>
#include "squashfs_fs.h"
diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
new file mode 100644
index 000000000000..cda9208bf04a
--- /dev/null
+++ b/include/linux/zstd_compat.h
@@ -0,0 +1,116 @@
+/*
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under the BSD-style license found in the
+ * LICENSE file in the root directory of https://github.com/facebook/zstd.
+ * An additional grant of patent rights can be found in the PATENTS file in the
+ * same directory.
+ *
+ * This program is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ */
+
+#ifndef ZSTD_COMPAT_H
+#define ZSTD_COMPAT_H
+
+#include <linux/zstd.h>
+
+#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
+/*
+ * This header provides backwards compatibility for the zstd-1.4.6 library
+ * upgrade. This header allows us to upgrade the zstd library version without
+ * modifying any callers. Then we will migrate callers from the compatibility
+ * wrapper one at a time until none remain. At which point we will delete this
+ * header.
+ *
+ * It is temporary and will be deleted once the upgrade is complete.
+ */
+
+#include <linux/zstd_errors.h>
+
+static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
+{
+ return ZSTD_estimateCCtxSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
+{
+ return ZSTD_estimateCStreamSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_DCtxWorkspaceBound(void)
+{
+ return ZSTD_estimateDCtxSize();
+}
+
+static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
+{
+ return ZSTD_estimateDStreamSize(window_size);
+}
+
+static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
+{
+ if (wksp == NULL)
+ return NULL;
+ return ZSTD_initStaticCCtx(wksp, wksp_size);
+}
+
+static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
+{
+ ZSTD_CStream* cstream;
+ size_t ret;
+
+ if (wksp == NULL)
+ return NULL;
+
+ cstream = ZSTD_initStaticCStream(wksp, wksp_size);
+ if (cstream == NULL)
+ return NULL;
+
+ /* 0 means unknown in old API but means 0 in new API */
+ if (pledged_src_size == 0)
+ pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
+
+ ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
+ if (ZSTD_isError(ret))
+ return NULL;
+
+ return cstream;
+}
+#define ZSTD_initCStream ZSTD_initCStream_compat
+
+static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
+{
+ if (wksp == NULL)
+ return NULL;
+ return ZSTD_initStaticDCtx(wksp, wksp_size);
+}
+
+static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size)
+{
+ if (wksp == NULL)
+ return NULL;
+ (void)window_size;
+ return ZSTD_initStaticDStream(wksp, wksp_size);
+}
+#define ZSTD_initDStream ZSTD_initDStream_compat
+
+typedef ZSTD_frameHeader ZSTD_frameParams;
+
+static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const void* src, size_t src_size)
+{
+ return ZSTD_getFrameHeader(frame_params, src, src_size);
+}
+
+static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params)
+{
+ return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, NULL, 0, params);
+}
+#define ZSTD_compressCCtx ZSTD_compressCCtx_compat
+
+#endif /* ZSTD_VERSION_NUMBER >= 10406 */
+#endif /* ZSTD_COMPAT_H */
diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index 0ad2c15479ed..dbc290af26b4 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -77,7 +77,7 @@
#include <linux/decompress/mm.h>
#include <linux/kernel.h>
-#include <linux/zstd.h>
+#include <linux/zstd_compat.h>
/* 128MB is the maximum window size supported by zstd. */
#define ZSTD_WINDOWSIZE_MAX (1 << ZSTD_WINDOWLOG_MAX)
--
2.28.0
From: Nick Terrell <[email protected]>
Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.
This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.
Signed-off-by: Nick Terrell <[email protected]>
---
lib/decompress_unzstd.c | 6 +-----
lib/zstd/decompress_sources.h | 14 ++++++++++++++
2 files changed, 15 insertions(+), 5 deletions(-)
create mode 100644 lib/zstd/decompress_sources.h
diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dbc290af26b4..a79f705f236d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
#ifdef STATIC
# define UNZSTD_PREBOOT
# include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
#endif
#include <linux/decompress/mm.h>
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index 000000000000..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
--
2.28.0
From: Nick Terrell <[email protected]>
Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.
Signed-off-by: Nick Terrell <[email protected]>
---
crypto/zstd.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
#include <linux/module.h>
#include <linux/net.h>
#include <linux/vmalloc.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
#include <crypto/internal/scompress.h>
@@ -24,16 +24,15 @@ struct zstd_ctx {
void *dwksp;
};
-static ZSTD_parameters zstd_params(void)
-{
- return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
static int zstd_comp_init(struct zstd_ctx *ctx)
{
int ret = 0;
- const ZSTD_parameters params = zstd_params();
- const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+ const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+ if (ZSTD_isError(wksp_size)) {
+ ret = -EINVAL;
+ goto out_free;
+ }
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
- ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+ ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
static int zstd_decomp_init(struct zstd_ctx *ctx)
{
int ret = 0;
- const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+ const size_t wksp_size = ZSTD_estimateDCtxSize();
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
- ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+ ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
{
size_t out_len;
struct zstd_ctx *zctx = ctx;
- const ZSTD_parameters params = zstd_params();
- out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+ out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, ZSTD_DEF_LEVEL);
if (ZSTD_isError(out_len))
return -EINVAL;
*dlen = out_len;
--
2.28.0
From: Nick Terrell <[email protected]>
Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.
Signed-off-by: Nick Terrell <[email protected]>
---
fs/squashfs/zstd_wrapper.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index f8c512a6204e..add582409866 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
#include <linux/mutex.h>
#include <linux/bio.h>
#include <linux/slab.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
#include <linux/vmalloc.h>
#include "squashfs_fs.h"
@@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
goto failed;
wksp->window_size = max_t(size_t,
msblk->block_size, SQUASHFS_METADATA_SIZE);
- wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size);
+ wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size);
wksp->mem = vmalloc(wksp->mem_size);
if (wksp->mem == NULL)
goto failed;
@@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
struct bvec_iter_all iter_all = {};
struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
- stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
+ stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size);
if (!stream) {
ERROR("Failed to initialize zstd decompressor\n");
@@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
break;
if (ZSTD_isError(zstd_err)) {
- ERROR("zstd decompression error: %d\n",
- (int)ZSTD_getErrorCode(zstd_err));
+ ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err));
error = -EIO;
break;
}
--
2.28.0
From: Nick Terrell <[email protected]>
Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.
Signed-off-by: Nick Terrell <[email protected]>
---
fs/btrfs/zstd.c | 48 ++++++++++++++++++++++++++++--------------------
1 file changed, 28 insertions(+), 20 deletions(-)
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
#include <linux/refcount.h>
#include <linux/sched.h>
#include <linux/slab.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
#include "misc.h"
#include "compression.h"
#include "ctree.h"
@@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
size_t level_size =
max_t(size_t,
- ZSTD_CStreamWorkspaceBound(params.cParams),
- ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+ ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+ ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
max_size = max_t(size_t, max_size, level_size);
zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
*total_in = 0;
/* Initialize the stream */
- stream = ZSTD_initCStream(params, len, workspace->mem,
- workspace->size);
+ stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
if (!stream) {
- pr_warn("BTRFS: ZSTD_initCStream failed\n");
+ pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
ret = -EIO;
goto out;
}
+ {
+ size_t ret2;
+
+ ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+ if (ZSTD_isError(ret2)) {
+ pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n",
+ ZSTD_getErrorName(ret2));
+ ret = -EIO;
+ goto out;
+ }
+ }
/* map in the first page of input data */
in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
- pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
- ZSTD_getErrorCode(ret2));
+ pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+ ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
ret2 = ZSTD_endStream(stream, &workspace->out_buf);
if (ZSTD_isError(ret2)) {
- pr_debug("BTRFS: ZSTD_endStream returned %d\n",
- ZSTD_getErrorCode(ret2));
+ pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+ ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
unsigned long buf_start;
unsigned long total_out = 0;
- stream = ZSTD_initDStream(
- ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+ stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
- pr_debug("BTRFS: ZSTD_initDStream failed\n");
+ pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto done;
}
@@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
- pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
- ZSTD_getErrorCode(ret2));
+ pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+ ZSTD_getErrorName(ret2));
ret = -EIO;
goto done;
}
@@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
unsigned long pg_offset = 0;
char *kaddr;
- stream = ZSTD_initDStream(
- ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+ stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
- pr_warn("BTRFS: ZSTD_initDStream failed\n");
+ pr_warn("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto finish;
}
@@ -667,8 +675,8 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
- pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
- ZSTD_getErrorCode(ret2));
+ pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+ ZSTD_getErrorName(ret2));
ret = -EIO;
goto finish;
}
--
2.28.0
From: Nick Terrell <[email protected]>
Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is more efficient because it uses the single-pass API instead of
the streaming API. The streaming API is not necessary because the whole
input and output buffers are available. This saves memory because we
don't need to allocate a buffer for the window. It is also more
efficient because it saves unnecessary memcpy calls.
Compression memory increases from 168 KB to 204 KB because upstream
uses slightly more memory. Decompression memory decreases from 1.4 MB
to 158 KB.
Signed-off-by: Nick Terrell <[email protected]>
---
fs/f2fs/compress.c | 102 +++++++++++++++++----------------------------
1 file changed, 38 insertions(+), 64 deletions(-)
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index e056f3a2b404..b79efce81651 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,8 @@
#include <linux/backing-dev.h>
#include <linux/lzo.h>
#include <linux/lz4.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
+#include <linux/zstd_errors.h>
#include "f2fs.h"
#include "node.h"
@@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
static int zstd_init_compress_ctx(struct compress_ctx *cc)
{
ZSTD_parameters params;
- ZSTD_CStream *stream;
+ ZSTD_CCtx *ctx;
void *workspace;
unsigned int workspace_size;
params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
- workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
+ workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;
- stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
- if (!stream) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n",
+ ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
+ if (!ctx) {
+ printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_inittaticCStream failed\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
}
cc->private = workspace;
- cc->private2 = stream;
+ cc->private2 = ctx;
cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
return 0;
@@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc)
static int zstd_compress_pages(struct compress_ctx *cc)
{
- ZSTD_CStream *stream = cc->private2;
- ZSTD_inBuffer inbuf;
- ZSTD_outBuffer outbuf;
- int src_size = cc->rlen;
- int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
- int ret;
-
- inbuf.pos = 0;
- inbuf.src = cc->rbuf;
- inbuf.size = src_size;
-
- outbuf.pos = 0;
- outbuf.dst = cc->cbuf->cdata;
- outbuf.size = dst_size;
-
- ret = ZSTD_compressStream(stream, &outbuf, &inbuf);
- if (ZSTD_isError(ret)) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n",
- KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
- __func__, ZSTD_getErrorCode(ret));
- return -EIO;
- }
-
- ret = ZSTD_endStream(stream, &outbuf);
+ ZSTD_CCtx *ctx = cc->private2;
+ const size_t src_size = cc->rlen;
+ const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
+ ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0);
+ size_t ret;
+
+ ret = ZSTD_compress_advanced(
+ ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params);
if (ZSTD_isError(ret)) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n",
+ /*
+ * there is compressed data remained in intermediate buffer due to
+ * no more space in cbuf.cdata
+ */
+ if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
+ return -EAGAIN;
+ /* other compression errors return -EIO */
+ printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
- __func__, ZSTD_getErrorCode(ret));
+ __func__, ZSTD_getErrorName(ret));
return -EIO;
}
- /*
- * there is compressed data remained in intermediate buffer due to
- * no more space in cbuf.cdata
- */
- if (ret)
- return -EAGAIN;
-
- cc->clen = outbuf.pos;
+ cc->clen = ret;
return 0;
}
static int zstd_init_decompress_ctx(struct decompress_io_ctx *dic)
{
- ZSTD_DStream *stream;
+ ZSTD_DCtx *ctx;
void *workspace;
unsigned int workspace_size;
- workspace_size = ZSTD_DStreamWorkspaceBound(MAX_COMPRESS_WINDOW_SIZE);
+ workspace_size = ZSTD_estimateDCtxSize();
workspace = f2fs_kvmalloc(F2FS_I_SB(dic->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;
- stream = ZSTD_initDStream(MAX_COMPRESS_WINDOW_SIZE,
- workspace, workspace_size);
- if (!stream) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initDStream failed\n",
+ ctx = ZSTD_initStaticDCtx(workspace, workspace_size);
+ if (!ctx) {
+ printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticDCtx failed\n",
KERN_ERR, F2FS_I_SB(dic->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -401,7 +385,7 @@ static int zstd_init_decompress_ctx(struct decompress_io_ctx *dic)
}
dic->private = workspace;
- dic->private2 = stream;
+ dic->private2 = ctx;
return 0;
}
@@ -415,28 +399,18 @@ static void zstd_destroy_decompress_ctx(struct decompress_io_ctx *dic)
static int zstd_decompress_pages(struct decompress_io_ctx *dic)
{
- ZSTD_DStream *stream = dic->private2;
- ZSTD_inBuffer inbuf;
- ZSTD_outBuffer outbuf;
- int ret;
-
- inbuf.pos = 0;
- inbuf.src = dic->cbuf->cdata;
- inbuf.size = dic->clen;
-
- outbuf.pos = 0;
- outbuf.dst = dic->rbuf;
- outbuf.size = dic->rlen;
+ ZSTD_DCtx *ctx = dic->private2;
+ size_t ret;
- ret = ZSTD_decompressStream(stream, &outbuf, &inbuf);
+ ret = ZSTD_decompressDCtx(ctx, dic->rbuf, dic->rlen, dic->cbuf->cdata, dic->clen);
if (ZSTD_isError(ret)) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n",
+ printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_decompressDCtx failed, err: %s\n",
KERN_ERR, F2FS_I_SB(dic->inode)->sb->s_id,
- __func__, ZSTD_getErrorCode(ret));
+ __func__, ZSTD_getErrorName(ret));
return -EIO;
}
- if (dic->rlen != outbuf.pos) {
+ if (dic->rlen != ret) {
printk_ratelimited("%sF2FS-fs (%s): %s ZSTD invalid rlen:%zu, "
"expected:%lu\n", KERN_ERR,
F2FS_I_SB(dic->inode)->sb->s_id,
--
2.28.0
From: Nick Terrell <[email protected]>
Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.
Signed-off-by: Nick Terrell <[email protected]>
---
lib/decompress_unzstd.c | 40 ++++++++++++++--------------------------
1 file changed, 14 insertions(+), 26 deletions(-)
diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index a79f705f236d..d4685df0e120 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@
#include <linux/decompress/mm.h>
#include <linux/kernel.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
+#include <linux/zstd_errors.h>
/* 128MB is the maximum window size supported by zstd. */
#define ZSTD_WINDOWSIZE_MAX (1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf,
long out_len, long *in_pos,
void (*error)(char *x))
{
- const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+ const size_t wksp_size = ZSTD_estimateDCtxSize();
void *wksp = large_malloc(wksp_size);
- ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+ ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
int err;
size_t ret;
@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
{
ZSTD_inBuffer in;
ZSTD_outBuffer out;
- ZSTD_frameParams params;
void *in_allocated = NULL;
void *out_allocated = NULL;
void *wksp = NULL;
@@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
out.size = out_len;
/*
- * We need to know the window size to allocate the ZSTD_DStream.
- * Since we are streaming, we need to allocate a buffer for the sliding
- * window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
- * (8 MB), so it is important to use the actual value so as not to
- * waste memory when it is smaller.
+ * Zstd determines the workspace size from the window size written
+ * into the frame header. This ensures that we use the minimum value
+ * possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
+ * (1 GB), so it is very important to use the actual value.
*/
- ret = ZSTD_getFrameParams(¶ms, in.src, in.size);
+ wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
err = handle_zstd_error(ret, error);
if (err)
goto out;
- if (ret != 0) {
- error("ZSTD-compressed data has an incomplete frame header");
- err = -1;
- goto out;
- }
- if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
- error("ZSTD-compressed data has too large a window size");
+ wksp = large_malloc(wksp_size);
+ if (wksp == NULL) {
+ error("Out of memory while allocating ZSTD_DStream");
err = -1;
goto out;
}
-
- /*
- * Allocate the ZSTD_DStream now that we know how much memory is
- * required.
- */
- wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
- wksp = large_malloc(wksp_size);
- dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+ dstream = ZSTD_initStaticDStream(wksp, wksp_size);
if (dstream == NULL) {
- error("Out of memory while allocating ZSTD_DStream");
+ error("ZSTD_initStaticDStream failed");
err = -1;
goto out;
}
--
2.28.0
As you keep resend this I keep retelling you that should not do it.
Please provide a proper Linux API, and switch to that. Versioned APIs
have absolutely no business in the Linux kernel.
On Tue, Sep 29, 2020 at 11:53:09PM -0700, Nick Terrell wrote:
> From: Nick Terrell <[email protected]>
>
> This patchset upgrades the zstd library to the latest upstream release. The
> current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
> At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
> But, it is now possible to use upstream zstd directly in the kernel.
>
> I have not yet release zstd-1.4.6 upstream. I want the zstd version in the kernel
> to match up with a known upstream release, so we know exactly what code is
> running. Whenever this patchset is ready for merge, I will cut a release at the
> upstream commit that gets merged. This should not be necessary for future
> releases.
>
> The kernel zstd library is automatically generated from upstream zstd. A script
> makes the necessary changes and imports it into the kernel. The changes are:
>
> 1. Replace all libc dependencies with kernel replacements and rewrite includes.
> 2. Remove unncessary portability macros like: #if defined(_MSC_VER).
> 3. Use the kernel xxhash instead of bundling it.
>
> This automation gets tested every commit by upstream's continuous integration.
> When we cut a new zstd release, we will submit a patch to the kernel to update
> the zstd version in the kernel.
>
> I've updated zstd to upstream with one big patch because every commit must build,
> so that precludes partial updates. Since the commit is 100% generated, I hope the
> review burden is lightened. I considered replaying upstream commits, but that is
> not possible because there have been ~3500 upstream commits since the last zstd
> import, and the commits don't all build individually. The bulk update preserves
> bisectablity because bugs can be bisected to the zstd version update. At that
> point the update can be reverted, and we can work with upstream to find and fix
> the bug. After this big switch in how the kernel consumes zstd, future patches
> will be smaller, because they will only have one upstream release worth of
> changes each.
>
> This patchset changes the zstd API from a custom kernel API to the upstream API.
> I considered wrapping the upstream API with a wrapper that is closer to the
> kernel style guide. Following advise from https://lkml.org/lkml/2020/9/17/814
> I've chosen to use the upstream API directly, to minimize opportunities to
> introduce bugs, and because using the upstream API directly makes debugging and
> communication with upstream easier.
>
> This patchset comes in 3 parts:
> 1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
> compatibility wrapper so zstd can be upgraded without modifying any callers.
> The second patch adds an indirection for the lib/decompress_unzstd.c including
> of all decompression source files.
> 2. Import zstd-1.4.6. This patch is completely generated from upstream using
> automated tooling.
> 3. Update all callers to the zstd-1.4.6 API then delete the compatibility
> wrapper.
>
> I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
> using the compatibility wrapper, and after the final patch in this series.
>
> I tested kernel and initramfs decompression in i386 and arm.
>
> I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
> I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
> I found:
> * BtrFS zstd compression at levels 1 and 3 is 5% faster
> * BtrFS zstd decompression+read is 15% faster
> * SquashFS zstd decompression+read is 15% faster
> * F2FS zstd compression+write at level 3 is 8% faster
> * F2FS zstd decompression+read is 20% faster
> * ZRAM decompression+read is 30% faster
> * Kernel zstd decompression is 35% faster
> * Initramfs zstd decompression+build is 5% faster
>
> The latest zstd also offers bug fixes and a 1 KB reduction in stack uage during
> compression. For example the recent problem with large kernel decompression has
> been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.
>
> Please let me know if there is anything that I can do to ease the way for these
> patches. I think it is important because it gets large performance improvements,
> contains bug fixes, and is switching to a more maintainable model of consuming
> upstream zstd directly, making it easy to keep up to date.
>
> Best,
> Nick Terrell
>
> v1 -> v2:
> * Successfully tested F2FS with help from Chao Yu to fix my test.
> * (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown.
> This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.
>
> v2 -> v3:
> * (3/9) Silence warnings by Kernel Test Robot:
> https://github.com/facebook/zstd/pull/2324
> Stack size warnings remain, but these aren't new, and the functions it warns on
> are either unused or not in the maximum stack path. This patchset reduces zstd
> compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
> more stack reduction would require significant changes that have the potential
> to introduce new bugs. However, I do hope to continue to reduce zstd stack
> usage in future versions.
>
> v3 -> v4:
> * (3/9) Fix errors and warnings reported by Kernel Test Robot:
> https://github.com/facebook/zstd/pull/2326
> - Replace mem.h with a custom kernel implementation that matches the current
> lib/zstd/mem.h in the kernel. This avoids calls to __builtin_bswap*() which
> don't work on certain architectures, as exposed by the Kernel Test Robot.
> - Remove ASAN/MSAN (un)poisoning code which doesn't work in the kernel, as
> exposed by the Kernel Test Robot.
> - I've fixed all of the valid cppcheck warnings reported, but there were many
> false positives, where cppcheck was incorrectly analyzing the situation,
> which I did not fix. I don't believe it is reasonable to expect that upstream
> zstd silences all the static analyzer false positives. Upstream zstd uses
> clang scan-build for its static analysis. We find that supporting multiple
> static analysis tools multiplies the burden of silencing false positives,
> without providing enough marginal value over running a single static analysis
> tool.
>
> Nick Terrell (9):
> lib: zstd: Add zstd compatibility wrapper
> lib: zstd: Add decompress_sources.h for decompress_unzstd
> lib: zstd: Upgrade to latest upstream zstd version 1.4.6
> crypto: zstd: Switch to zstd-1.4.6 API
> btrfs: zstd: Switch to the zstd-1.4.6 API
> f2fs: zstd: Switch to the zstd-1.4.6 API
> squashfs: zstd: Switch to the zstd-1.4.6 API
> lib: unzstd: Switch to the zstd-1.4.6 API
> lib: zstd: Remove zstd compatibility wrapper
>
> crypto/zstd.c | 22 +-
> fs/btrfs/zstd.c | 46 +-
> fs/f2fs/compress.c | 100 +-
> fs/squashfs/zstd_wrapper.c | 7 +-
> include/linux/zstd.h | 3021 ++++++++----
> include/linux/zstd_errors.h | 76 +
> lib/decompress_unzstd.c | 44 +-
> lib/zstd/Makefile | 35 +-
> lib/zstd/bitstream.h | 379 --
> lib/zstd/common/bitstream.h | 437 ++
> lib/zstd/common/compiler.h | 150 +
> lib/zstd/common/cpu.h | 194 +
> lib/zstd/common/debug.c | 24 +
> lib/zstd/common/debug.h | 101 +
> lib/zstd/common/entropy_common.c | 355 ++
> lib/zstd/common/error_private.c | 55 +
> lib/zstd/common/error_private.h | 66 +
> lib/zstd/common/fse.h | 709 +++
> lib/zstd/common/fse_decompress.c | 380 ++
> lib/zstd/common/huf.h | 352 ++
> lib/zstd/common/mem.h | 258 +
> lib/zstd/common/zstd_common.c | 83 +
> lib/zstd/common/zstd_deps.h | 124 +
> lib/zstd/common/zstd_internal.h | 438 ++
> lib/zstd/compress.c | 3485 --------------
> lib/zstd/compress/fse_compress.c | 625 +++
> lib/zstd/compress/hist.c | 165 +
> lib/zstd/compress/hist.h | 75 +
> lib/zstd/compress/huf_compress.c | 764 +++
> lib/zstd/compress/zstd_compress.c | 4157 +++++++++++++++++
> lib/zstd/compress/zstd_compress_internal.h | 1103 +++++
> lib/zstd/compress/zstd_compress_literals.c | 158 +
> lib/zstd/compress/zstd_compress_literals.h | 29 +
> lib/zstd/compress/zstd_compress_sequences.c | 433 ++
> lib/zstd/compress/zstd_compress_sequences.h | 54 +
> lib/zstd/compress/zstd_compress_superblock.c | 849 ++++
> lib/zstd/compress/zstd_compress_superblock.h | 32 +
> lib/zstd/compress/zstd_cwksp.h | 465 ++
> lib/zstd/compress/zstd_double_fast.c | 521 +++
> lib/zstd/compress/zstd_double_fast.h | 32 +
> lib/zstd/compress/zstd_fast.c | 496 ++
> lib/zstd/compress/zstd_fast.h | 31 +
> lib/zstd/compress/zstd_lazy.c | 1138 +++++
> lib/zstd/compress/zstd_lazy.h | 61 +
> lib/zstd/compress/zstd_ldm.c | 619 +++
> lib/zstd/compress/zstd_ldm.h | 104 +
> lib/zstd/compress/zstd_opt.c | 1200 +++++
> lib/zstd/compress/zstd_opt.h | 50 +
> lib/zstd/decompress.c | 2531 ----------
> lib/zstd/decompress/huf_decompress.c | 1205 +++++
> lib/zstd/decompress/zstd_ddict.c | 241 +
> lib/zstd/decompress/zstd_ddict.h | 44 +
> lib/zstd/decompress/zstd_decompress.c | 1836 ++++++++
> lib/zstd/decompress/zstd_decompress_block.c | 1540 ++++++
> lib/zstd/decompress/zstd_decompress_block.h | 62 +
> .../decompress/zstd_decompress_internal.h | 195 +
> lib/zstd/decompress_sources.h | 18 +
> lib/zstd/entropy_common.c | 243 -
> lib/zstd/error_private.h | 53 -
> lib/zstd/fse.h | 575 ---
> lib/zstd/fse_compress.c | 795 ----
> lib/zstd/fse_decompress.c | 325 --
> lib/zstd/huf.h | 212 -
> lib/zstd/huf_compress.c | 772 ---
> lib/zstd/huf_decompress.c | 960 ----
> lib/zstd/mem.h | 151 -
> lib/zstd/zstd_common.c | 75 -
> lib/zstd/zstd_compress_module.c | 79 +
> lib/zstd/zstd_decompress_module.c | 79 +
> lib/zstd/zstd_internal.h | 273 --
> lib/zstd/zstd_opt.h | 1014 ----
> 71 files changed, 24368 insertions(+), 13012 deletions(-)
> create mode 100644 include/linux/zstd_errors.h
> delete mode 100644 lib/zstd/bitstream.h
> create mode 100644 lib/zstd/common/bitstream.h
> create mode 100644 lib/zstd/common/compiler.h
> create mode 100644 lib/zstd/common/cpu.h
> create mode 100644 lib/zstd/common/debug.c
> create mode 100644 lib/zstd/common/debug.h
> create mode 100644 lib/zstd/common/entropy_common.c
> create mode 100644 lib/zstd/common/error_private.c
> create mode 100644 lib/zstd/common/error_private.h
> create mode 100644 lib/zstd/common/fse.h
> create mode 100644 lib/zstd/common/fse_decompress.c
> create mode 100644 lib/zstd/common/huf.h
> create mode 100644 lib/zstd/common/mem.h
> create mode 100644 lib/zstd/common/zstd_common.c
> create mode 100644 lib/zstd/common/zstd_deps.h
> create mode 100644 lib/zstd/common/zstd_internal.h
> delete mode 100644 lib/zstd/compress.c
> create mode 100644 lib/zstd/compress/fse_compress.c
> create mode 100644 lib/zstd/compress/hist.c
> create mode 100644 lib/zstd/compress/hist.h
> create mode 100644 lib/zstd/compress/huf_compress.c
> create mode 100644 lib/zstd/compress/zstd_compress.c
> create mode 100644 lib/zstd/compress/zstd_compress_internal.h
> create mode 100644 lib/zstd/compress/zstd_compress_literals.c
> create mode 100644 lib/zstd/compress/zstd_compress_literals.h
> create mode 100644 lib/zstd/compress/zstd_compress_sequences.c
> create mode 100644 lib/zstd/compress/zstd_compress_sequences.h
> create mode 100644 lib/zstd/compress/zstd_compress_superblock.c
> create mode 100644 lib/zstd/compress/zstd_compress_superblock.h
> create mode 100644 lib/zstd/compress/zstd_cwksp.h
> create mode 100644 lib/zstd/compress/zstd_double_fast.c
> create mode 100644 lib/zstd/compress/zstd_double_fast.h
> create mode 100644 lib/zstd/compress/zstd_fast.c
> create mode 100644 lib/zstd/compress/zstd_fast.h
> create mode 100644 lib/zstd/compress/zstd_lazy.c
> create mode 100644 lib/zstd/compress/zstd_lazy.h
> create mode 100644 lib/zstd/compress/zstd_ldm.c
> create mode 100644 lib/zstd/compress/zstd_ldm.h
> create mode 100644 lib/zstd/compress/zstd_opt.c
> create mode 100644 lib/zstd/compress/zstd_opt.h
> delete mode 100644 lib/zstd/decompress.c
> create mode 100644 lib/zstd/decompress/huf_decompress.c
> create mode 100644 lib/zstd/decompress/zstd_ddict.c
> create mode 100644 lib/zstd/decompress/zstd_ddict.h
> create mode 100644 lib/zstd/decompress/zstd_decompress.c
> create mode 100644 lib/zstd/decompress/zstd_decompress_block.c
> create mode 100644 lib/zstd/decompress/zstd_decompress_block.h
> create mode 100644 lib/zstd/decompress/zstd_decompress_internal.h
> create mode 100644 lib/zstd/decompress_sources.h
> delete mode 100644 lib/zstd/entropy_common.c
> delete mode 100644 lib/zstd/error_private.h
> delete mode 100644 lib/zstd/fse.h
> delete mode 100644 lib/zstd/fse_compress.c
> delete mode 100644 lib/zstd/fse_decompress.c
> delete mode 100644 lib/zstd/huf.h
> delete mode 100644 lib/zstd/huf_compress.c
> delete mode 100644 lib/zstd/huf_decompress.c
> delete mode 100644 lib/zstd/mem.h
> delete mode 100644 lib/zstd/zstd_common.c
> create mode 100644 lib/zstd/zstd_compress_module.c
> create mode 100644 lib/zstd/zstd_decompress_module.c
> delete mode 100644 lib/zstd/zstd_internal.h
> delete mode 100644 lib/zstd/zstd_opt.h
>
> --
> 2.28.0
>
---end quoted text---
> On Sep 29, 2020, at 11:53 PM, Christoph Hellwig <[email protected]> wrote:
>
> As you keep resend this I keep retelling you that should not do it.
> Please provide a proper Linux API, and switch to that. Versioned APIs
> have absolutely no business in the Linux kernel.
The API is not versioned. We provide a stable ABI for a large section of our API,
and the parts that aren’t ABI stable don’t change in semantics, and undergo long
deprecation periods before being removed.
The change of callers is a one-time change to transition from the existing API
in the kernel, which was never upstream's API, to upstream's API.
-Nick
> On Tue, Sep 29, 2020 at 11:53:09PM -0700, Nick Terrell wrote:
>> From: Nick Terrell <[email protected]>
>>
>> This patchset upgrades the zstd library to the latest upstream release. The
>> current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
>> At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
>> But, it is now possible to use upstream zstd directly in the kernel.
>>
>> I have not yet release zstd-1.4.6 upstream. I want the zstd version in the kernel
>> to match up with a known upstream release, so we know exactly what code is
>> running. Whenever this patchset is ready for merge, I will cut a release at the
>> upstream commit that gets merged. This should not be necessary for future
>> releases.
>>
>> The kernel zstd library is automatically generated from upstream zstd. A script
>> makes the necessary changes and imports it into the kernel. The changes are:
>>
>> 1. Replace all libc dependencies with kernel replacements and rewrite includes.
>> 2. Remove unncessary portability macros like: #if defined(_MSC_VER).
>> 3. Use the kernel xxhash instead of bundling it.
>>
>> This automation gets tested every commit by upstream's continuous integration.
>> When we cut a new zstd release, we will submit a patch to the kernel to update
>> the zstd version in the kernel.
>>
>> I've updated zstd to upstream with one big patch because every commit must build,
>> so that precludes partial updates. Since the commit is 100% generated, I hope the
>> review burden is lightened. I considered replaying upstream commits, but that is
>> not possible because there have been ~3500 upstream commits since the last zstd
>> import, and the commits don't all build individually. The bulk update preserves
>> bisectablity because bugs can be bisected to the zstd version update. At that
>> point the update can be reverted, and we can work with upstream to find and fix
>> the bug. After this big switch in how the kernel consumes zstd, future patches
>> will be smaller, because they will only have one upstream release worth of
>> changes each.
>>
>> This patchset changes the zstd API from a custom kernel API to the upstream API.
>> I considered wrapping the upstream API with a wrapper that is closer to the
>> kernel style guide. Following advise from https://lkml.org/lkml/2020/9/17/814
>> I've chosen to use the upstream API directly, to minimize opportunities to
>> introduce bugs, and because using the upstream API directly makes debugging and
>> communication with upstream easier.
>>
>> This patchset comes in 3 parts:
>> 1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
>> compatibility wrapper so zstd can be upgraded without modifying any callers.
>> The second patch adds an indirection for the lib/decompress_unzstd.c including
>> of all decompression source files.
>> 2. Import zstd-1.4.6. This patch is completely generated from upstream using
>> automated tooling.
>> 3. Update all callers to the zstd-1.4.6 API then delete the compatibility
>> wrapper.
>>
>> I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
>> using the compatibility wrapper, and after the final patch in this series.
>>
>> I tested kernel and initramfs decompression in i386 and arm.
>>
>> I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
>> I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
>> I found:
>> * BtrFS zstd compression at levels 1 and 3 is 5% faster
>> * BtrFS zstd decompression+read is 15% faster
>> * SquashFS zstd decompression+read is 15% faster
>> * F2FS zstd compression+write at level 3 is 8% faster
>> * F2FS zstd decompression+read is 20% faster
>> * ZRAM decompression+read is 30% faster
>> * Kernel zstd decompression is 35% faster
>> * Initramfs zstd decompression+build is 5% faster
>>
>> The latest zstd also offers bug fixes and a 1 KB reduction in stack uage during
>> compression. For example the recent problem with large kernel decompression has
>> been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.
>>
>> Please let me know if there is anything that I can do to ease the way for these
>> patches. I think it is important because it gets large performance improvements,
>> contains bug fixes, and is switching to a more maintainable model of consuming
>> upstream zstd directly, making it easy to keep up to date.
>>
>> Best,
>> Nick Terrell
>>
>> v1 -> v2:
>> * Successfully tested F2FS with help from Chao Yu to fix my test.
>> * (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown.
>> This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.
>>
>> v2 -> v3:
>> * (3/9) Silence warnings by Kernel Test Robot:
>> https://github.com/facebook/zstd/pull/2324
>> Stack size warnings remain, but these aren't new, and the functions it warns on
>> are either unused or not in the maximum stack path. This patchset reduces zstd
>> compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
>> more stack reduction would require significant changes that have the potential
>> to introduce new bugs. However, I do hope to continue to reduce zstd stack
>> usage in future versions.
>>
>> v3 -> v4:
>> * (3/9) Fix errors and warnings reported by Kernel Test Robot:
>> https://github.com/facebook/zstd/pull/2326
>> - Replace mem.h with a custom kernel implementation that matches the current
>> lib/zstd/mem.h in the kernel. This avoids calls to __builtin_bswap*() which
>> don't work on certain architectures, as exposed by the Kernel Test Robot.
>> - Remove ASAN/MSAN (un)poisoning code which doesn't work in the kernel, as
>> exposed by the Kernel Test Robot.
>> - I've fixed all of the valid cppcheck warnings reported, but there were many
>> false positives, where cppcheck was incorrectly analyzing the situation,
>> which I did not fix. I don't believe it is reasonable to expect that upstream
>> zstd silences all the static analyzer false positives. Upstream zstd uses
>> clang scan-build for its static analysis. We find that supporting multiple
>> static analysis tools multiplies the burden of silencing false positives,
>> without providing enough marginal value over running a single static analysis
>> tool.
>>
>> Nick Terrell (9):
>> lib: zstd: Add zstd compatibility wrapper
>> lib: zstd: Add decompress_sources.h for decompress_unzstd
>> lib: zstd: Upgrade to latest upstream zstd version 1.4.6
>> crypto: zstd: Switch to zstd-1.4.6 API
>> btrfs: zstd: Switch to the zstd-1.4.6 API
>> f2fs: zstd: Switch to the zstd-1.4.6 API
>> squashfs: zstd: Switch to the zstd-1.4.6 API
>> lib: unzstd: Switch to the zstd-1.4.6 API
>> lib: zstd: Remove zstd compatibility wrapper
>>
>> crypto/zstd.c | 22 +-
>> fs/btrfs/zstd.c | 46 +-
>> fs/f2fs/compress.c | 100 +-
>> fs/squashfs/zstd_wrapper.c | 7 +-
>> include/linux/zstd.h | 3021 ++++++++----
>> include/linux/zstd_errors.h | 76 +
>> lib/decompress_unzstd.c | 44 +-
>> lib/zstd/Makefile | 35 +-
>> lib/zstd/bitstream.h | 379 --
>> lib/zstd/common/bitstream.h | 437 ++
>> lib/zstd/common/compiler.h | 150 +
>> lib/zstd/common/cpu.h | 194 +
>> lib/zstd/common/debug.c | 24 +
>> lib/zstd/common/debug.h | 101 +
>> lib/zstd/common/entropy_common.c | 355 ++
>> lib/zstd/common/error_private.c | 55 +
>> lib/zstd/common/error_private.h | 66 +
>> lib/zstd/common/fse.h | 709 +++
>> lib/zstd/common/fse_decompress.c | 380 ++
>> lib/zstd/common/huf.h | 352 ++
>> lib/zstd/common/mem.h | 258 +
>> lib/zstd/common/zstd_common.c | 83 +
>> lib/zstd/common/zstd_deps.h | 124 +
>> lib/zstd/common/zstd_internal.h | 438 ++
>> lib/zstd/compress.c | 3485 --------------
>> lib/zstd/compress/fse_compress.c | 625 +++
>> lib/zstd/compress/hist.c | 165 +
>> lib/zstd/compress/hist.h | 75 +
>> lib/zstd/compress/huf_compress.c | 764 +++
>> lib/zstd/compress/zstd_compress.c | 4157 +++++++++++++++++
>> lib/zstd/compress/zstd_compress_internal.h | 1103 +++++
>> lib/zstd/compress/zstd_compress_literals.c | 158 +
>> lib/zstd/compress/zstd_compress_literals.h | 29 +
>> lib/zstd/compress/zstd_compress_sequences.c | 433 ++
>> lib/zstd/compress/zstd_compress_sequences.h | 54 +
>> lib/zstd/compress/zstd_compress_superblock.c | 849 ++++
>> lib/zstd/compress/zstd_compress_superblock.h | 32 +
>> lib/zstd/compress/zstd_cwksp.h | 465 ++
>> lib/zstd/compress/zstd_double_fast.c | 521 +++
>> lib/zstd/compress/zstd_double_fast.h | 32 +
>> lib/zstd/compress/zstd_fast.c | 496 ++
>> lib/zstd/compress/zstd_fast.h | 31 +
>> lib/zstd/compress/zstd_lazy.c | 1138 +++++
>> lib/zstd/compress/zstd_lazy.h | 61 +
>> lib/zstd/compress/zstd_ldm.c | 619 +++
>> lib/zstd/compress/zstd_ldm.h | 104 +
>> lib/zstd/compress/zstd_opt.c | 1200 +++++
>> lib/zstd/compress/zstd_opt.h | 50 +
>> lib/zstd/decompress.c | 2531 ----------
>> lib/zstd/decompress/huf_decompress.c | 1205 +++++
>> lib/zstd/decompress/zstd_ddict.c | 241 +
>> lib/zstd/decompress/zstd_ddict.h | 44 +
>> lib/zstd/decompress/zstd_decompress.c | 1836 ++++++++
>> lib/zstd/decompress/zstd_decompress_block.c | 1540 ++++++
>> lib/zstd/decompress/zstd_decompress_block.h | 62 +
>> .../decompress/zstd_decompress_internal.h | 195 +
>> lib/zstd/decompress_sources.h | 18 +
>> lib/zstd/entropy_common.c | 243 -
>> lib/zstd/error_private.h | 53 -
>> lib/zstd/fse.h | 575 ---
>> lib/zstd/fse_compress.c | 795 ----
>> lib/zstd/fse_decompress.c | 325 --
>> lib/zstd/huf.h | 212 -
>> lib/zstd/huf_compress.c | 772 ---
>> lib/zstd/huf_decompress.c | 960 ----
>> lib/zstd/mem.h | 151 -
>> lib/zstd/zstd_common.c | 75 -
>> lib/zstd/zstd_compress_module.c | 79 +
>> lib/zstd/zstd_decompress_module.c | 79 +
>> lib/zstd/zstd_internal.h | 273 --
>> lib/zstd/zstd_opt.h | 1014 ----
>> 71 files changed, 24368 insertions(+), 13012 deletions(-)
>> create mode 100644 include/linux/zstd_errors.h
>> delete mode 100644 lib/zstd/bitstream.h
>> create mode 100644 lib/zstd/common/bitstream.h
>> create mode 100644 lib/zstd/common/compiler.h
>> create mode 100644 lib/zstd/common/cpu.h
>> create mode 100644 lib/zstd/common/debug.c
>> create mode 100644 lib/zstd/common/debug.h
>> create mode 100644 lib/zstd/common/entropy_common.c
>> create mode 100644 lib/zstd/common/error_private.c
>> create mode 100644 lib/zstd/common/error_private.h
>> create mode 100644 lib/zstd/common/fse.h
>> create mode 100644 lib/zstd/common/fse_decompress.c
>> create mode 100644 lib/zstd/common/huf.h
>> create mode 100644 lib/zstd/common/mem.h
>> create mode 100644 lib/zstd/common/zstd_common.c
>> create mode 100644 lib/zstd/common/zstd_deps.h
>> create mode 100644 lib/zstd/common/zstd_internal.h
>> delete mode 100644 lib/zstd/compress.c
>> create mode 100644 lib/zstd/compress/fse_compress.c
>> create mode 100644 lib/zstd/compress/hist.c
>> create mode 100644 lib/zstd/compress/hist.h
>> create mode 100644 lib/zstd/compress/huf_compress.c
>> create mode 100644 lib/zstd/compress/zstd_compress.c
>> create mode 100644 lib/zstd/compress/zstd_compress_internal.h
>> create mode 100644 lib/zstd/compress/zstd_compress_literals.c
>> create mode 100644 lib/zstd/compress/zstd_compress_literals.h
>> create mode 100644 lib/zstd/compress/zstd_compress_sequences.c
>> create mode 100644 lib/zstd/compress/zstd_compress_sequences.h
>> create mode 100644 lib/zstd/compress/zstd_compress_superblock.c
>> create mode 100644 lib/zstd/compress/zstd_compress_superblock.h
>> create mode 100644 lib/zstd/compress/zstd_cwksp.h
>> create mode 100644 lib/zstd/compress/zstd_double_fast.c
>> create mode 100644 lib/zstd/compress/zstd_double_fast.h
>> create mode 100644 lib/zstd/compress/zstd_fast.c
>> create mode 100644 lib/zstd/compress/zstd_fast.h
>> create mode 100644 lib/zstd/compress/zstd_lazy.c
>> create mode 100644 lib/zstd/compress/zstd_lazy.h
>> create mode 100644 lib/zstd/compress/zstd_ldm.c
>> create mode 100644 lib/zstd/compress/zstd_ldm.h
>> create mode 100644 lib/zstd/compress/zstd_opt.c
>> create mode 100644 lib/zstd/compress/zstd_opt.h
>> delete mode 100644 lib/zstd/decompress.c
>> create mode 100644 lib/zstd/decompress/huf_decompress.c
>> create mode 100644 lib/zstd/decompress/zstd_ddict.c
>> create mode 100644 lib/zstd/decompress/zstd_ddict.h
>> create mode 100644 lib/zstd/decompress/zstd_decompress.c
>> create mode 100644 lib/zstd/decompress/zstd_decompress_block.c
>> create mode 100644 lib/zstd/decompress/zstd_decompress_block.h
>> create mode 100644 lib/zstd/decompress/zstd_decompress_internal.h
>> create mode 100644 lib/zstd/decompress_sources.h
>> delete mode 100644 lib/zstd/entropy_common.c
>> delete mode 100644 lib/zstd/error_private.h
>> delete mode 100644 lib/zstd/fse.h
>> delete mode 100644 lib/zstd/fse_compress.c
>> delete mode 100644 lib/zstd/fse_decompress.c
>> delete mode 100644 lib/zstd/huf.h
>> delete mode 100644 lib/zstd/huf_compress.c
>> delete mode 100644 lib/zstd/huf_decompress.c
>> delete mode 100644 lib/zstd/mem.h
>> delete mode 100644 lib/zstd/zstd_common.c
>> create mode 100644 lib/zstd/zstd_compress_module.c
>> create mode 100644 lib/zstd/zstd_decompress_module.c
>> delete mode 100644 lib/zstd/zstd_internal.h
>> delete mode 100644 lib/zstd/zstd_opt.h
>>
>> --
>> 2.28.0
>>
> ---end quoted text---
> On Sep 29, 2020, at 11:53 PM, Nick Terrell <[email protected]> wrote:
>
> From: Nick Terrell <[email protected]>
It has been brought to my attention that patch 3 hasn’t made it to patchwork,
likely because it is too large. I’ll include a pull request in the next cover letter,
together with the patches (if needed).
Please pull from
[email protected]:terrelln/linux.git tags/v4-zstd-1.4.6
to get these changes.
> This patchset upgrades the zstd library to the latest upstream release. The
> current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
> At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
> But, it is now possible to use upstream zstd directly in the kernel.
>
> I have not yet release zstd-1.4.6 upstream. I want the zstd version in the kernel
> to match up with a known upstream release, so we know exactly what code is
> running. Whenever this patchset is ready for merge, I will cut a release at the
> upstream commit that gets merged. This should not be necessary for future
> releases.
>
> The kernel zstd library is automatically generated from upstream zstd. A script
> makes the necessary changes and imports it into the kernel. The changes are:
>
> 1. Replace all libc dependencies with kernel replacements and rewrite includes.
> 2. Remove unncessary portability macros like: #if defined(_MSC_VER).
> 3. Use the kernel xxhash instead of bundling it.
>
> This automation gets tested every commit by upstream's continuous integration.
> When we cut a new zstd release, we will submit a patch to the kernel to update
> the zstd version in the kernel.
>
> I've updated zstd to upstream with one big patch because every commit must build,
> so that precludes partial updates. Since the commit is 100% generated, I hope the
> review burden is lightened. I considered replaying upstream commits, but that is
> not possible because there have been ~3500 upstream commits since the last zstd
> import, and the commits don't all build individually. The bulk update preserves
> bisectablity because bugs can be bisected to the zstd version update. At that
> point the update can be reverted, and we can work with upstream to find and fix
> the bug. After this big switch in how the kernel consumes zstd, future patches
> will be smaller, because they will only have one upstream release worth of
> changes each.
>
> This patchset changes the zstd API from a custom kernel API to the upstream API.
> I considered wrapping the upstream API with a wrapper that is closer to the
> kernel style guide. Following advise from https://lkml.org/lkml/2020/9/17/814
> I've chosen to use the upstream API directly, to minimize opportunities to
> introduce bugs, and because using the upstream API directly makes debugging and
> communication with upstream easier.
>
> This patchset comes in 3 parts:
> 1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
> compatibility wrapper so zstd can be upgraded without modifying any callers.
> The second patch adds an indirection for the lib/decompress_unzstd.c including
> of all decompression source files.
> 2. Import zstd-1.4.6. This patch is completely generated from upstream using
> automated tooling.
> 3. Update all callers to the zstd-1.4.6 API then delete the compatibility
> wrapper.
>
> I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
> using the compatibility wrapper, and after the final patch in this series.
>
> I tested kernel and initramfs decompression in i386 and arm.
>
> I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
> I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
> I found:
> * BtrFS zstd compression at levels 1 and 3 is 5% faster
> * BtrFS zstd decompression+read is 15% faster
> * SquashFS zstd decompression+read is 15% faster
> * F2FS zstd compression+write at level 3 is 8% faster
> * F2FS zstd decompression+read is 20% faster
> * ZRAM decompression+read is 30% faster
> * Kernel zstd decompression is 35% faster
> * Initramfs zstd decompression+build is 5% faster
>
> The latest zstd also offers bug fixes and a 1 KB reduction in stack uage during
> compression. For example the recent problem with large kernel decompression has
> been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.
>
> Please let me know if there is anything that I can do to ease the way for these
> patches. I think it is important because it gets large performance improvements,
> contains bug fixes, and is switching to a more maintainable model of consuming
> upstream zstd directly, making it easy to keep up to date.
>
> Best,
> Nick Terrell
>
> v1 -> v2:
> * Successfully tested F2FS with help from Chao Yu to fix my test.
> * (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown.
> This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.
>
> v2 -> v3:
> * (3/9) Silence warnings by Kernel Test Robot:
> https://github.com/facebook/zstd/pull/2324
> Stack size warnings remain, but these aren't new, and the functions it warns on
> are either unused or not in the maximum stack path. This patchset reduces zstd
> compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
> more stack reduction would require significant changes that have the potential
> to introduce new bugs. However, I do hope to continue to reduce zstd stack
> usage in future versions.
>
> v3 -> v4:
> * (3/9) Fix errors and warnings reported by Kernel Test Robot:
> https://github.com/facebook/zstd/pull/2326
> - Replace mem.h with a custom kernel implementation that matches the current
> lib/zstd/mem.h in the kernel. This avoids calls to __builtin_bswap*() which
> don't work on certain architectures, as exposed by the Kernel Test Robot.
> - Remove ASAN/MSAN (un)poisoning code which doesn't work in the kernel, as
> exposed by the Kernel Test Robot.
> - I've fixed all of the valid cppcheck warnings reported, but there were many
> false positives, where cppcheck was incorrectly analyzing the situation,
> which I did not fix. I don't believe it is reasonable to expect that upstream
> zstd silences all the static analyzer false positives. Upstream zstd uses
> clang scan-build for its static analysis. We find that supporting multiple
> static analysis tools multiplies the burden of silencing false positives,
> without providing enough marginal value over running a single static analysis
> tool.
>
> Nick Terrell (9):
> lib: zstd: Add zstd compatibility wrapper
> lib: zstd: Add decompress_sources.h for decompress_unzstd
> lib: zstd: Upgrade to latest upstream zstd version 1.4.6
> crypto: zstd: Switch to zstd-1.4.6 API
> btrfs: zstd: Switch to the zstd-1.4.6 API
> f2fs: zstd: Switch to the zstd-1.4.6 API
> squashfs: zstd: Switch to the zstd-1.4.6 API
> lib: unzstd: Switch to the zstd-1.4.6 API
> lib: zstd: Remove zstd compatibility wrapper
>
> crypto/zstd.c | 22 +-
> fs/btrfs/zstd.c | 46 +-
> fs/f2fs/compress.c | 100 +-
> fs/squashfs/zstd_wrapper.c | 7 +-
> include/linux/zstd.h | 3021 ++++++++----
> include/linux/zstd_errors.h | 76 +
> lib/decompress_unzstd.c | 44 +-
> lib/zstd/Makefile | 35 +-
> lib/zstd/bitstream.h | 379 --
> lib/zstd/common/bitstream.h | 437 ++
> lib/zstd/common/compiler.h | 150 +
> lib/zstd/common/cpu.h | 194 +
> lib/zstd/common/debug.c | 24 +
> lib/zstd/common/debug.h | 101 +
> lib/zstd/common/entropy_common.c | 355 ++
> lib/zstd/common/error_private.c | 55 +
> lib/zstd/common/error_private.h | 66 +
> lib/zstd/common/fse.h | 709 +++
> lib/zstd/common/fse_decompress.c | 380 ++
> lib/zstd/common/huf.h | 352 ++
> lib/zstd/common/mem.h | 258 +
> lib/zstd/common/zstd_common.c | 83 +
> lib/zstd/common/zstd_deps.h | 124 +
> lib/zstd/common/zstd_internal.h | 438 ++
> lib/zstd/compress.c | 3485 --------------
> lib/zstd/compress/fse_compress.c | 625 +++
> lib/zstd/compress/hist.c | 165 +
> lib/zstd/compress/hist.h | 75 +
> lib/zstd/compress/huf_compress.c | 764 +++
> lib/zstd/compress/zstd_compress.c | 4157 +++++++++++++++++
> lib/zstd/compress/zstd_compress_internal.h | 1103 +++++
> lib/zstd/compress/zstd_compress_literals.c | 158 +
> lib/zstd/compress/zstd_compress_literals.h | 29 +
> lib/zstd/compress/zstd_compress_sequences.c | 433 ++
> lib/zstd/compress/zstd_compress_sequences.h | 54 +
> lib/zstd/compress/zstd_compress_superblock.c | 849 ++++
> lib/zstd/compress/zstd_compress_superblock.h | 32 +
> lib/zstd/compress/zstd_cwksp.h | 465 ++
> lib/zstd/compress/zstd_double_fast.c | 521 +++
> lib/zstd/compress/zstd_double_fast.h | 32 +
> lib/zstd/compress/zstd_fast.c | 496 ++
> lib/zstd/compress/zstd_fast.h | 31 +
> lib/zstd/compress/zstd_lazy.c | 1138 +++++
> lib/zstd/compress/zstd_lazy.h | 61 +
> lib/zstd/compress/zstd_ldm.c | 619 +++
> lib/zstd/compress/zstd_ldm.h | 104 +
> lib/zstd/compress/zstd_opt.c | 1200 +++++
> lib/zstd/compress/zstd_opt.h | 50 +
> lib/zstd/decompress.c | 2531 ----------
> lib/zstd/decompress/huf_decompress.c | 1205 +++++
> lib/zstd/decompress/zstd_ddict.c | 241 +
> lib/zstd/decompress/zstd_ddict.h | 44 +
> lib/zstd/decompress/zstd_decompress.c | 1836 ++++++++
> lib/zstd/decompress/zstd_decompress_block.c | 1540 ++++++
> lib/zstd/decompress/zstd_decompress_block.h | 62 +
> .../decompress/zstd_decompress_internal.h | 195 +
> lib/zstd/decompress_sources.h | 18 +
> lib/zstd/entropy_common.c | 243 -
> lib/zstd/error_private.h | 53 -
> lib/zstd/fse.h | 575 ---
> lib/zstd/fse_compress.c | 795 ----
> lib/zstd/fse_decompress.c | 325 --
> lib/zstd/huf.h | 212 -
> lib/zstd/huf_compress.c | 772 ---
> lib/zstd/huf_decompress.c | 960 ----
> lib/zstd/mem.h | 151 -
> lib/zstd/zstd_common.c | 75 -
> lib/zstd/zstd_compress_module.c | 79 +
> lib/zstd/zstd_decompress_module.c | 79 +
> lib/zstd/zstd_internal.h | 273 --
> lib/zstd/zstd_opt.h | 1014 ----
> 71 files changed, 24368 insertions(+), 13012 deletions(-)
> create mode 100644 include/linux/zstd_errors.h
> delete mode 100644 lib/zstd/bitstream.h
> create mode 100644 lib/zstd/common/bitstream.h
> create mode 100644 lib/zstd/common/compiler.h
> create mode 100644 lib/zstd/common/cpu.h
> create mode 100644 lib/zstd/common/debug.c
> create mode 100644 lib/zstd/common/debug.h
> create mode 100644 lib/zstd/common/entropy_common.c
> create mode 100644 lib/zstd/common/error_private.c
> create mode 100644 lib/zstd/common/error_private.h
> create mode 100644 lib/zstd/common/fse.h
> create mode 100644 lib/zstd/common/fse_decompress.c
> create mode 100644 lib/zstd/common/huf.h
> create mode 100644 lib/zstd/common/mem.h
> create mode 100644 lib/zstd/common/zstd_common.c
> create mode 100644 lib/zstd/common/zstd_deps.h
> create mode 100644 lib/zstd/common/zstd_internal.h
> delete mode 100644 lib/zstd/compress.c
> create mode 100644 lib/zstd/compress/fse_compress.c
> create mode 100644 lib/zstd/compress/hist.c
> create mode 100644 lib/zstd/compress/hist.h
> create mode 100644 lib/zstd/compress/huf_compress.c
> create mode 100644 lib/zstd/compress/zstd_compress.c
> create mode 100644 lib/zstd/compress/zstd_compress_internal.h
> create mode 100644 lib/zstd/compress/zstd_compress_literals.c
> create mode 100644 lib/zstd/compress/zstd_compress_literals.h
> create mode 100644 lib/zstd/compress/zstd_compress_sequences.c
> create mode 100644 lib/zstd/compress/zstd_compress_sequences.h
> create mode 100644 lib/zstd/compress/zstd_compress_superblock.c
> create mode 100644 lib/zstd/compress/zstd_compress_superblock.h
> create mode 100644 lib/zstd/compress/zstd_cwksp.h
> create mode 100644 lib/zstd/compress/zstd_double_fast.c
> create mode 100644 lib/zstd/compress/zstd_double_fast.h
> create mode 100644 lib/zstd/compress/zstd_fast.c
> create mode 100644 lib/zstd/compress/zstd_fast.h
> create mode 100644 lib/zstd/compress/zstd_lazy.c
> create mode 100644 lib/zstd/compress/zstd_lazy.h
> create mode 100644 lib/zstd/compress/zstd_ldm.c
> create mode 100644 lib/zstd/compress/zstd_ldm.h
> create mode 100644 lib/zstd/compress/zstd_opt.c
> create mode 100644 lib/zstd/compress/zstd_opt.h
> delete mode 100644 lib/zstd/decompress.c
> create mode 100644 lib/zstd/decompress/huf_decompress.c
> create mode 100644 lib/zstd/decompress/zstd_ddict.c
> create mode 100644 lib/zstd/decompress/zstd_ddict.h
> create mode 100644 lib/zstd/decompress/zstd_decompress.c
> create mode 100644 lib/zstd/decompress/zstd_decompress_block.c
> create mode 100644 lib/zstd/decompress/zstd_decompress_block.h
> create mode 100644 lib/zstd/decompress/zstd_decompress_internal.h
> create mode 100644 lib/zstd/decompress_sources.h
> delete mode 100644 lib/zstd/entropy_common.c
> delete mode 100644 lib/zstd/error_private.h
> delete mode 100644 lib/zstd/fse.h
> delete mode 100644 lib/zstd/fse_compress.c
> delete mode 100644 lib/zstd/fse_decompress.c
> delete mode 100644 lib/zstd/huf.h
> delete mode 100644 lib/zstd/huf_compress.c
> delete mode 100644 lib/zstd/huf_decompress.c
> delete mode 100644 lib/zstd/mem.h
> delete mode 100644 lib/zstd/zstd_common.c
> create mode 100644 lib/zstd/zstd_compress_module.c
> create mode 100644 lib/zstd/zstd_decompress_module.c
> delete mode 100644 lib/zstd/zstd_internal.h
> delete mode 100644 lib/zstd/zstd_opt.h
>
> --
> 2.28.0
>
On Wed, Sep 30, 2020 at 08:49:49PM +0000, Nick Terrell wrote:
> > On Sep 29, 2020, at 11:53 PM, Nick Terrell <[email protected]> wrote:
> >
> > From: Nick Terrell <[email protected]>
>
> It has been brought to my attention that patch 3 hasn’t made it to patchwork,
> likely because it is too large. I’ll include a pull request in the next cover letter,
> together with the patches (if needed).
The patch 3/9 saved to a file is 1.6M, over 35000 lines, the diffstat
says:
66 files changed, 24268 insertions(+), 12889 deletions(-)
Seriously, this is wrong in so many ways. There's the rationale for
one-time change etc, but the actual result is beyond what I would accept
and would not encourage anyone to merge as-is.
> On Oct 1, 2020, at 3:18 AM, David Sterba <[email protected]> wrote:
>
> On Wed, Sep 30, 2020 at 08:49:49PM +0000, Nick Terrell wrote:
>>> On Sep 29, 2020, at 11:53 PM, Nick Terrell <[email protected]> wrote:
>>>
>>> From: Nick Terrell <[email protected]>
>>
>> It has been brought to my attention that patch 3 hasn’t made it to patchwork,
>> likely because it is too large. I’ll include a pull request in the next cover letter,
>> together with the patches (if needed).
>
> The patch 3/9 saved to a file is 1.6M, over 35000 lines, the diffstat
> says:
>
> 66 files changed, 24268 insertions(+), 12889 deletions(-)
>
> Seriously, this is wrong in so many ways. There's the rationale for
> one-time change etc, but the actual result is beyond what I would accept
> and would not encourage anyone to merge as-is.
I’m open to suggestions on how to get a zstd update done better. I don’t
know of any way to break this patch up into smaller patches that all compile.
The code is all generated directly from upstream and modified to work in the
kernel by automated scripts.
I think the benefits of updating zstd are pretty clear: bug fixes, 3 years of testing,
features, debuggability, support from zstd upstream, and significant performance
improvements.
So I hope we can come up with a way forward to get this merged.
This large of a patch is a one-time change. But, the zstd updates in general
will be large, containing 100s of commits worth of changes (as opposed to
~3500 and a structure change in this diff). E.g. the upstream diff between
two upstream versions range from 50KB - 500KB. Zstd is an actively
maintained project, so there is going to be churn when consuming it. But it
also means that we’re actively supporting the project if any problems occur.
My view is that kernel developers don’t need to review upstreams zstd’s code. We
should focus on the diff from upstream, and ensuring that everything works in the
kernel environment. The imported code from upstream zstd is ~30K LOC, which is
too large for anyone to reasonably review.
As mentioned in the patch, this commit shows the diff from upstream zstd, which
is much more manageable:
https://github.com/terrelln/linux/commit/467c9ea1df1100db48c020c3c8b282a2a30f5116
I’ve generated it by importing upstream zstd as-is into the kernel file structure. Then
running the automation to generate the kernel patch from upstream and importing
it into the kernel on top of the upstream patch.
Best,
Nick
On Wed, Sep 30, 2020 at 08:05:45PM +0000, Nick Terrell wrote:
>
>
> > On Sep 29, 2020, at 11:53 PM, Christoph Hellwig <[email protected]> wrote:
> >
> > As you keep resend this I keep retelling you that should not do it.
> > Please provide a proper Linux API, and switch to that. Versioned APIs
> > have absolutely no business in the Linux kernel.
>
> The API is not versioned. We provide a stable ABI for a large section of our API,
> and the parts that aren???t ABI stable don???t change in semantics, and undergo long
> deprecation periods before being removed.
>
> The change of callers is a one-time change to transition from the existing API
> in the kernel, which was never upstream's API, to upstream's API.
Again, please transition it to a sane kernel API. We don't have an
"upstream" in this case.
On Thu, Oct 01, 2020 at 06:35:34PM +0000, Nick Terrell wrote:
> I???m open to suggestions on how to get a zstd update done better. I don???t
> know of any way to break this patch up into smaller patches that all compile.
> The code is all generated directly from upstream and modified to work in the
> kernel by automated scripts.
Documentation/process/submitting-patches.rst:
"Separate your changes
---------------------
Separate each **logical change** into a separate patch.
For example, if your changes include both bug fixes and performance
enhancements for a single driver, separate those changes into two
or more patches. If your changes include an API update, and a new
driver which uses that new API, separate those into two patches."
It's not that hard, is it? Please do your very basic homework instead
of pretending to be a special snowflake and then come back.
On 2 Oct 2020, at 2:54, Christoph Hellwig wrote:
> On Wed, Sep 30, 2020 at 08:05:45PM +0000, Nick Terrell wrote:
>>
>>
>>> On Sep 29, 2020, at 11:53 PM, Christoph Hellwig <[email protected]>
>>> wrote:
>>>
>>> As you keep resend this I keep retelling you that should not do it.
>>> Please provide a proper Linux API, and switch to that. Versioned
>>> APIs
>>> have absolutely no business in the Linux kernel.
>>
>> The API is not versioned. We provide a stable ABI for a large section
>> of our API,
>> and the parts that aren???t ABI stable don???t change in semantics,
>> and undergo long
>> deprecation periods before being removed.
>>
>> The change of callers is a one-time change to transition from the
>> existing API
>> in the kernel, which was never upstream's API, to upstream's API.
>
> Again, please transition it to a sane kernel API. We don't have an
> "upstream" in this case.
The upstream is the zstd project where all this code originates, and
where the active development takes place. As Eric Biggers pointed out,
it also receives a lot of Q/A separate from the kernel. I think we gain
a great deal by leveraging the testing and documentation of the zstd
project in the kernel interfaces we use.
We lose some consistency with the kernel coding style, but we gain the
ability to search for docs, issues, and fixes directly against the zstd
project and git repo.
-chris