2020-09-23 22:39:09

by Nick Terrell

Subject: [PATCH v3 0/9] Update to zstd-1.4.6

From: Nick Terrell <[email protected]>

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel
to match up with a known upstream release, so we know exactly what code is
running. Whenever this patchset is ready for merge, I will cut a release at the
upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation is tested on every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.
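
Change 1 in the list above can be pictured as a single header that names each libc dependency once, so that the import script only has to swap that one header. The following is a minimal user-space sketch of the idea, not the generated kernel file: the `ZSTD_memcpy`-style macro names and the `copy_literals` helper are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical dependency shim: library code calls only these macros.
 * This user-space build maps them to libc; a kernel build would map
 * the same names to kernel helpers instead.
 */
#define ZSTD_memcpy(dst, src, n)  memcpy((dst), (src), (n))
#define ZSTD_memset(dst, val, n)  memset((dst), (val), (n))

/* Illustrative helper: copies n bytes through the shim and returns n. */
static size_t copy_literals(void *dst, const void *src, size_t n)
{
	ZSTD_memcpy(dst, src, n);
	return n;
}
```

With this layout, library sources never include a libc header directly, which is what lets an automated rewrite of the includes stay small.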

I've updated zstd to upstream with one big patch because every commit must build,
so that precludes partial updates. Since the commit is 100% generated, I hope the
review burden is lightened. I considered replaying upstream commits, but that is
not possible because there have been ~3500 upstream commits since the last zstd
import, and the commits don't all build individually. The bulk update preserves
bisectability because bugs can be bisected to the zstd version update. At that
point the update can be reverted, and we can work with upstream to find and fix
the bug. After this big switch in how the kernel consumes zstd, future patches
will be smaller, because they will only have one upstream release worth of
changes each.

This patchset changes the zstd API from a custom kernel API to the upstream API.
I considered wrapping the upstream API with a wrapper that is closer to the
kernel style guide. Following advice from https://lkml.org/lkml/2020/9/17/814
I've chosen to use the upstream API directly, to minimize opportunities to
introduce bugs, and because using the upstream API directly makes debugging and
communication with upstream easier.

This patchset comes in 3 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
compatibility wrapper so zstd can be upgraded without modifying any callers.
The second patch adds a level of indirection for how lib/decompress_unzstd.c
includes the decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility
wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series.

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
compression.

Please let me know if there is anything that I can do to ease the way for these
patches. I think this update is important because it brings large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper so that pledged_src_size=0 means unknown.
This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.
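
The pledged_src_size fix above amounts to translating the old "0 means unknown" convention into an explicit sentinel before calling the upstream API. A minimal sketch, assuming the sentinel is an all-ones 64-bit value; `COMPAT_CONTENTSIZE_UNKNOWN` and `compat_pledged_src_size` are hypothetical names, not the wrapper's actual code.

```c
#include <stdint.h>

/* Stand-in for the upstream "size unknown" sentinel (assumed all-ones). */
#define COMPAT_CONTENTSIZE_UNKNOWN UINT64_MAX

/*
 * Hypothetical helper: the old kernel API used 0 for "unknown", so a
 * wrapper has to translate it before passing the size upstream.
 */
static uint64_t compat_pledged_src_size(uint64_t pledged_src_size)
{
	if (pledged_src_size == 0)
		return COMPAT_CONTENTSIZE_UNKNOWN;
	return pledged_src_size;
}
```

One consequence of this mapping is that a caller cannot pledge a genuinely empty input through the old interface, since 0 is always read as unknown.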

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
https://github.com/facebook/zstd/pull/2324
Stack size warnings remain, but these aren't new, and the functions it warns on
are either unused or not in the maximum stack path. This patchset reduces zstd
compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
more stack reduction would require significant changes that have the potential
to introduce new bugs. However, I do hope to continue to reduce zstd stack
usage in future versions.

Nick Terrell (9):
lib: zstd: Add zstd compatibility wrapper
lib: zstd: Add decompress_sources.h for decompress_unzstd
lib: zstd: Upgrade to latest upstream zstd version 1.4.6
crypto: zstd: Switch to zstd-1.4.6 API
btrfs: zstd: Switch to the zstd-1.4.6 API
f2fs: zstd: Switch to the zstd-1.4.6 API
squashfs: zstd: Switch to the zstd-1.4.6 API
lib: unzstd: Switch to the zstd-1.4.6 API
lib: zstd: Remove zstd compatibility wrapper

crypto/zstd.c | 22 +-
fs/btrfs/zstd.c | 46 +-
fs/f2fs/compress.c | 100 +-
fs/squashfs/zstd_wrapper.c | 7 +-
include/linux/zstd.h | 3021 ++++++++----
include/linux/zstd_errors.h | 76 +
lib/decompress_unzstd.c | 44 +-
lib/zstd/Makefile | 35 +-
lib/zstd/bitstream.h | 379 --
lib/zstd/common/bitstream.h | 437 ++
lib/zstd/common/compiler.h | 134 +
lib/zstd/common/cpu.h | 194 +
lib/zstd/common/debug.c | 24 +
lib/zstd/common/debug.h | 101 +
lib/zstd/common/entropy_common.c | 355 ++
lib/zstd/common/error_private.c | 55 +
lib/zstd/common/error_private.h | 66 +
lib/zstd/common/fse.h | 709 +++
lib/zstd/common/fse_decompress.c | 380 ++
lib/zstd/common/huf.h | 352 ++
lib/zstd/common/mem.h | 347 ++
lib/zstd/common/zstd_common.c | 83 +
lib/zstd/common/zstd_deps.h | 134 +
lib/zstd/common/zstd_internal.h | 434 ++
lib/zstd/compress.c | 3485 --------------
lib/zstd/compress/fse_compress.c | 625 +++
lib/zstd/compress/hist.c | 165 +
lib/zstd/compress/hist.h | 75 +
lib/zstd/compress/huf_compress.c | 764 +++
lib/zstd/compress/zstd_compress.c | 4160 +++++++++++++++++
lib/zstd/compress/zstd_compress_internal.h | 1103 +++++
lib/zstd/compress/zstd_compress_literals.c | 158 +
lib/zstd/compress/zstd_compress_literals.h | 29 +
lib/zstd/compress/zstd_compress_sequences.c | 433 ++
lib/zstd/compress/zstd_compress_sequences.h | 54 +
lib/zstd/compress/zstd_compress_superblock.c | 849 ++++
lib/zstd/compress/zstd_compress_superblock.h | 32 +
lib/zstd/compress/zstd_cwksp.h | 524 +++
lib/zstd/compress/zstd_double_fast.c | 521 +++
lib/zstd/compress/zstd_double_fast.h | 32 +
lib/zstd/compress/zstd_fast.c | 496 ++
lib/zstd/compress/zstd_fast.h | 31 +
lib/zstd/compress/zstd_lazy.c | 1138 +++++
lib/zstd/compress/zstd_lazy.h | 61 +
lib/zstd/compress/zstd_ldm.c | 619 +++
lib/zstd/compress/zstd_ldm.h | 104 +
lib/zstd/compress/zstd_opt.c | 1200 +++++
lib/zstd/compress/zstd_opt.h | 50 +
lib/zstd/decompress.c | 2531 ----------
lib/zstd/decompress/huf_decompress.c | 1205 +++++
lib/zstd/decompress/zstd_ddict.c | 241 +
lib/zstd/decompress/zstd_ddict.h | 44 +
lib/zstd/decompress/zstd_decompress.c | 1836 ++++++++
lib/zstd/decompress/zstd_decompress_block.c | 1540 ++++++
lib/zstd/decompress/zstd_decompress_block.h | 62 +
.../decompress/zstd_decompress_internal.h | 195 +
lib/zstd/decompress_sources.h | 18 +
lib/zstd/entropy_common.c | 243 -
lib/zstd/error_private.h | 53 -
lib/zstd/fse.h | 575 ---
lib/zstd/fse_compress.c | 795 ----
lib/zstd/fse_decompress.c | 325 --
lib/zstd/huf.h | 212 -
lib/zstd/huf_compress.c | 772 ---
lib/zstd/huf_decompress.c | 960 ----
lib/zstd/mem.h | 151 -
lib/zstd/zstd_common.c | 75 -
lib/zstd/zstd_compress_module.c | 79 +
lib/zstd/zstd_decompress_module.c | 79 +
lib/zstd/zstd_internal.h | 273 --
lib/zstd/zstd_opt.h | 1014 ----
71 files changed, 24509 insertions(+), 13012 deletions(-)
create mode 100644 include/linux/zstd_errors.h
delete mode 100644 lib/zstd/bitstream.h
create mode 100644 lib/zstd/common/bitstream.h
create mode 100644 lib/zstd/common/compiler.h
create mode 100644 lib/zstd/common/cpu.h
create mode 100644 lib/zstd/common/debug.c
create mode 100644 lib/zstd/common/debug.h
create mode 100644 lib/zstd/common/entropy_common.c
create mode 100644 lib/zstd/common/error_private.c
create mode 100644 lib/zstd/common/error_private.h
create mode 100644 lib/zstd/common/fse.h
create mode 100644 lib/zstd/common/fse_decompress.c
create mode 100644 lib/zstd/common/huf.h
create mode 100644 lib/zstd/common/mem.h
create mode 100644 lib/zstd/common/zstd_common.c
create mode 100644 lib/zstd/common/zstd_deps.h
create mode 100644 lib/zstd/common/zstd_internal.h
delete mode 100644 lib/zstd/compress.c
create mode 100644 lib/zstd/compress/fse_compress.c
create mode 100644 lib/zstd/compress/hist.c
create mode 100644 lib/zstd/compress/hist.h
create mode 100644 lib/zstd/compress/huf_compress.c
create mode 100644 lib/zstd/compress/zstd_compress.c
create mode 100644 lib/zstd/compress/zstd_compress_internal.h
create mode 100644 lib/zstd/compress/zstd_compress_literals.c
create mode 100644 lib/zstd/compress/zstd_compress_literals.h
create mode 100644 lib/zstd/compress/zstd_compress_sequences.c
create mode 100644 lib/zstd/compress/zstd_compress_sequences.h
create mode 100644 lib/zstd/compress/zstd_compress_superblock.c
create mode 100644 lib/zstd/compress/zstd_compress_superblock.h
create mode 100644 lib/zstd/compress/zstd_cwksp.h
create mode 100644 lib/zstd/compress/zstd_double_fast.c
create mode 100644 lib/zstd/compress/zstd_double_fast.h
create mode 100644 lib/zstd/compress/zstd_fast.c
create mode 100644 lib/zstd/compress/zstd_fast.h
create mode 100644 lib/zstd/compress/zstd_lazy.c
create mode 100644 lib/zstd/compress/zstd_lazy.h
create mode 100644 lib/zstd/compress/zstd_ldm.c
create mode 100644 lib/zstd/compress/zstd_ldm.h
create mode 100644 lib/zstd/compress/zstd_opt.c
create mode 100644 lib/zstd/compress/zstd_opt.h
delete mode 100644 lib/zstd/decompress.c
create mode 100644 lib/zstd/decompress/huf_decompress.c
create mode 100644 lib/zstd/decompress/zstd_ddict.c
create mode 100644 lib/zstd/decompress/zstd_ddict.h
create mode 100644 lib/zstd/decompress/zstd_decompress.c
create mode 100644 lib/zstd/decompress/zstd_decompress_block.c
create mode 100644 lib/zstd/decompress/zstd_decompress_block.h
create mode 100644 lib/zstd/decompress/zstd_decompress_internal.h
create mode 100644 lib/zstd/decompress_sources.h
delete mode 100644 lib/zstd/entropy_common.c
delete mode 100644 lib/zstd/error_private.h
delete mode 100644 lib/zstd/fse.h
delete mode 100644 lib/zstd/fse_compress.c
delete mode 100644 lib/zstd/fse_decompress.c
delete mode 100644 lib/zstd/huf.h
delete mode 100644 lib/zstd/huf_compress.c
delete mode 100644 lib/zstd/huf_decompress.c
delete mode 100644 lib/zstd/mem.h
delete mode 100644 lib/zstd/zstd_common.c
create mode 100644 lib/zstd/zstd_compress_module.c
create mode 100644 lib/zstd/zstd_decompress_module.c
delete mode 100644 lib/zstd/zstd_internal.h
delete mode 100644 lib/zstd/zstd_opt.h

--
2.28.0


2020-09-23 22:39:31

by Nick Terrell

Subject: [PATCH v3 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd

From: Nick Terrell <[email protected]>

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell <[email protected]>
---
lib/decompress_unzstd.c | 6 +-----
lib/zstd/decompress_sources.h | 14 ++++++++++++++
2 files changed, 15 insertions(+), 5 deletions(-)
create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dbc290af26b4..a79f705f236d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
#ifdef STATIC
# define UNZSTD_PREBOOT
# include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
#endif

#include <linux/decompress/mm.h>
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index 000000000000..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
--
2.28.0

2020-09-23 22:39:53

by Nick Terrell

Subject: [PATCH v3 4/9] crypto: zstd: Switch to zstd-1.4.6 API

From: Nick Terrell <[email protected]>

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.
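
The pattern this patch switches to is: estimate a workspace size, allocate it, then build the context inside that caller-owned memory. The toy sketch below shows only the shape of that pattern. `toy_ctx`, `toy_estimate_ctx_size` and `toy_init_static_ctx` are stand-ins invented for illustration, and the size formula is arbitrary. The 8-byte alignment check mirrors the documented requirement of the upstream static-init functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy context standing in for ZSTD_CCtx; not the real layout. */
struct toy_ctx {
	size_t wksp_size;
};

/* Toy analogue of an estimate function: bytes needed for a level. */
static size_t toy_estimate_ctx_size(int level)
{
	return sizeof(struct toy_ctx) + (size_t)level * 1024;
}

/*
 * Toy analogue of a static init function: builds the context inside
 * caller-owned memory and returns NULL if the buffer is unusable
 * (missing, not 8-byte aligned, or too small to hold the context).
 */
static struct toy_ctx *toy_init_static_ctx(void *wksp, size_t wksp_size)
{
	struct toy_ctx *ctx = wksp;

	if (!wksp || ((uintptr_t)wksp & 7) != 0)
		return NULL;
	if (wksp_size < sizeof(*ctx))
		return NULL;
	ctx->wksp_size = wksp_size;
	return ctx;
}
```

Because the context lives inside the workspace, freeing the workspace is the only teardown step, which matches how the callers in this series free cwksp and dwksp.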

Signed-off-by: Nick Terrell <[email protected]>
---
crypto/zstd.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
#include <linux/module.h>
#include <linux/net.h>
#include <linux/vmalloc.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
#include <crypto/internal/scompress.h>


@@ -24,16 +24,15 @@ struct zstd_ctx {
void *dwksp;
};

-static ZSTD_parameters zstd_params(void)
-{
- return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
static int zstd_comp_init(struct zstd_ctx *ctx)
{
int ret = 0;
- const ZSTD_parameters params = zstd_params();
- const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+ const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+ if (ZSTD_isError(wksp_size)) {
+ ret = -EINVAL;
+ goto out_free;
+ }

ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}

- ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+ ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
static int zstd_decomp_init(struct zstd_ctx *ctx)
{
int ret = 0;
- const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+ const size_t wksp_size = ZSTD_estimateDCtxSize();

ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}

- ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+ ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
{
size_t out_len;
struct zstd_ctx *zctx = ctx;
- const ZSTD_parameters params = zstd_params();

- out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+ out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, ZSTD_DEF_LEVEL);
if (ZSTD_isError(out_len))
return -EINVAL;
*dlen = out_len;
--
2.28.0

2020-09-23 22:40:12

by Nick Terrell

Subject: [PATCH v3 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

From: Nick Terrell <[email protected]>

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell <[email protected]>
---
fs/btrfs/zstd.c | 48 ++++++++++++++++++++++++++++--------------------
1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
#include <linux/refcount.h>
#include <linux/sched.h>
#include <linux/slab.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
#include "misc.h"
#include "compression.h"
#include "ctree.h"
@@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
size_t level_size =
max_t(size_t,
- ZSTD_CStreamWorkspaceBound(params.cParams),
- ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+ ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+ ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));

max_size = max_t(size_t, max_size, level_size);
zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
*total_in = 0;

/* Initialize the stream */
- stream = ZSTD_initCStream(params, len, workspace->mem,
- workspace->size);
+ stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
if (!stream) {
- pr_warn("BTRFS: ZSTD_initCStream failed\n");
+ pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
ret = -EIO;
goto out;
}
+ {
+ size_t ret2;
+
+ ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+ if (ZSTD_isError(ret2)) {
+ pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n",
+ ZSTD_getErrorName(ret2));
+ ret = -EIO;
+ goto out;
+ }
+ }

/* map in the first page of input data */
in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
- pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
- ZSTD_getErrorCode(ret2));
+ pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+ ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,

ret2 = ZSTD_endStream(stream, &workspace->out_buf);
if (ZSTD_isError(ret2)) {
- pr_debug("BTRFS: ZSTD_endStream returned %d\n",
- ZSTD_getErrorCode(ret2));
+ pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+ ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
unsigned long buf_start;
unsigned long total_out = 0;

- stream = ZSTD_initDStream(
- ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+ stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
- pr_debug("BTRFS: ZSTD_initDStream failed\n");
+ pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto done;
}
@@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
- pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
- ZSTD_getErrorCode(ret2));
+ pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+ ZSTD_getErrorName(ret2));
ret = -EIO;
goto done;
}
@@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
unsigned long pg_offset = 0;
char *kaddr;

- stream = ZSTD_initDStream(
- ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+ stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
- pr_warn("BTRFS: ZSTD_initDStream failed\n");
+ pr_warn("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto finish;
}
@@ -667,8 +675,8 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
- pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
- ZSTD_getErrorCode(ret2));
+ pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+ ZSTD_getErrorName(ret2));
ret = -EIO;
goto finish;
}
--
2.28.0

2020-09-23 22:40:54

by Nick Terrell

Subject: [PATCH v3 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

From: Nick Terrell <[email protected]>

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is more efficient because it uses the single-pass API instead of
the streaming API. The streaming API is not necessary because the whole
input and output buffers are available. This saves memory because we
don't need to allocate a buffer for the window. It is also more
efficient because it saves unnecessary memcpy calls.

Compression memory increases from 168 KB to 204 KB because upstream
uses slightly more memory. Decompression memory decreases from 1.4 MB
to 158 KB.
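
With the single-pass API, "output buffer too small" arrives as an error code rather than as leftover data, so the caller has to tell that case apart from real failures. The sketch below models that mapping in isolation. The `toy_*` names and the numeric codes are illustrative assumptions (the real codes live in zstd_errors.h); it relies only on the convention that an error is the negated code stored in a size_t.

```c
#include <errno.h>
#include <stddef.h>

/* Illustrative error codes; not copied from zstd_errors.h. */
enum toy_error {
	TOY_ERROR_NONE = 0,
	TOY_ERROR_GENERIC = 1,
	TOY_ERROR_DST_TOO_SMALL = 70,
	TOY_ERROR_MAX = 120,
};

/* An error is the negated code stored in a size_t. */
static size_t toy_make_error(enum toy_error code)
{
	return (size_t)0 - (size_t)code;
}

/* Any value above the negated maximum code is an error. */
static int toy_is_error(size_t ret)
{
	return ret > toy_make_error(TOY_ERROR_MAX);
}

/* Recover the code from an error value, or NONE for a plain size. */
static enum toy_error toy_error_code(size_t ret)
{
	if (!toy_is_error(ret))
		return TOY_ERROR_NONE;
	return (enum toy_error)((size_t)0 - ret);
}

/* Map a compression result to 0, -EAGAIN (dst too small) or -EIO. */
static int toy_compress_result_to_errno(size_t ret, size_t *clen)
{
	if (toy_is_error(ret)) {
		if (toy_error_code(ret) == TOY_ERROR_DST_TOO_SMALL)
			return -EAGAIN;
		return -EIO;
	}
	*clen = ret;
	return 0;
}
```

On success the return value is the compressed length, so the output length is left untouched on either error path.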

Signed-off-by: Nick Terrell <[email protected]>
---
fs/f2fs/compress.c | 102 +++++++++++++++++----------------------------
1 file changed, 38 insertions(+), 64 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index e056f3a2b404..b79efce81651 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,8 @@
#include <linux/backing-dev.h>
#include <linux/lzo.h>
#include <linux/lz4.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
+#include <linux/zstd_errors.h>

#include "f2fs.h"
#include "node.h"
@@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
static int zstd_init_compress_ctx(struct compress_ctx *cc)
{
ZSTD_parameters params;
- ZSTD_CStream *stream;
+ ZSTD_CCtx *ctx;
void *workspace;
unsigned int workspace_size;

params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
- workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
+ workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);

workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;

- stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
- if (!stream) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n",
+ ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
+ if (!ctx) {
+ printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticCCtx failed\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
}

cc->private = workspace;
- cc->private2 = stream;
+ cc->private2 = ctx;

cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
return 0;
@@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc)

static int zstd_compress_pages(struct compress_ctx *cc)
{
- ZSTD_CStream *stream = cc->private2;
- ZSTD_inBuffer inbuf;
- ZSTD_outBuffer outbuf;
- int src_size = cc->rlen;
- int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
- int ret;
-
- inbuf.pos = 0;
- inbuf.src = cc->rbuf;
- inbuf.size = src_size;
-
- outbuf.pos = 0;
- outbuf.dst = cc->cbuf->cdata;
- outbuf.size = dst_size;
-
- ret = ZSTD_compressStream(stream, &outbuf, &inbuf);
- if (ZSTD_isError(ret)) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n",
- KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
- __func__, ZSTD_getErrorCode(ret));
- return -EIO;
- }
-
- ret = ZSTD_endStream(stream, &outbuf);
+ ZSTD_CCtx *ctx = cc->private2;
+ const size_t src_size = cc->rlen;
+ const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
+ ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0);
+ size_t ret;
+
+ ret = ZSTD_compress_advanced(
+ ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params);
if (ZSTD_isError(ret)) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n",
+ /*
+ * there is compressed data remained in intermediate buffer due to
+ * no more space in cbuf.cdata
+ */
+ if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
+ return -EAGAIN;
+ /* other compression errors return -EIO */
+ printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
- __func__, ZSTD_getErrorCode(ret));
+ __func__, ZSTD_getErrorName(ret));
return -EIO;
}

- /*
- * there is compressed data remained in intermediate buffer due to
- * no more space in cbuf.cdata
- */
- if (ret)
- return -EAGAIN;
-
- cc->clen = outbuf.pos;
+ cc->clen = ret;
return 0;
}

static int zstd_init_decompress_ctx(struct decompress_io_ctx *dic)
{
- ZSTD_DStream *stream;
+ ZSTD_DCtx *ctx;
void *workspace;
unsigned int workspace_size;

- workspace_size = ZSTD_DStreamWorkspaceBound(MAX_COMPRESS_WINDOW_SIZE);
+ workspace_size = ZSTD_estimateDCtxSize();

workspace = f2fs_kvmalloc(F2FS_I_SB(dic->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;

- stream = ZSTD_initDStream(MAX_COMPRESS_WINDOW_SIZE,
- workspace, workspace_size);
- if (!stream) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initDStream failed\n",
+ ctx = ZSTD_initStaticDCtx(workspace, workspace_size);
+ if (!ctx) {
+ printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticDCtx failed\n",
KERN_ERR, F2FS_I_SB(dic->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -401,7 +385,7 @@ static int zstd_init_decompress_ctx(struct decompress_io_ctx *dic)
}

dic->private = workspace;
- dic->private2 = stream;
+ dic->private2 = ctx;

return 0;
}
@@ -415,28 +399,18 @@ static void zstd_destroy_decompress_ctx(struct decompress_io_ctx *dic)

static int zstd_decompress_pages(struct decompress_io_ctx *dic)
{
- ZSTD_DStream *stream = dic->private2;
- ZSTD_inBuffer inbuf;
- ZSTD_outBuffer outbuf;
- int ret;
-
- inbuf.pos = 0;
- inbuf.src = dic->cbuf->cdata;
- inbuf.size = dic->clen;
-
- outbuf.pos = 0;
- outbuf.dst = dic->rbuf;
- outbuf.size = dic->rlen;
+ ZSTD_DCtx *ctx = dic->private2;
+ size_t ret;

- ret = ZSTD_decompressStream(stream, &outbuf, &inbuf);
+ ret = ZSTD_decompressDCtx(ctx, dic->rbuf, dic->rlen, dic->cbuf->cdata, dic->clen);
if (ZSTD_isError(ret)) {
- printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n",
+ printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_decompressDCtx failed, err: %s\n",
KERN_ERR, F2FS_I_SB(dic->inode)->sb->s_id,
- __func__, ZSTD_getErrorCode(ret));
+ __func__, ZSTD_getErrorName(ret));
return -EIO;
}

- if (dic->rlen != outbuf.pos) {
+ if (dic->rlen != ret) {
printk_ratelimited("%sF2FS-fs (%s): %s ZSTD invalid rlen:%zu, "
"expected:%lu\n", KERN_ERR,
F2FS_I_SB(dic->inode)->sb->s_id,
--
2.28.0

2020-09-23 22:40:57

by Nick Terrell

Subject: [PATCH v3 8/9] lib: unzstd: Switch to the zstd-1.4.6 API

From: Nick Terrell <[email protected]>

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell <[email protected]>
---
lib/decompress_unzstd.c | 40 ++++++++++++++--------------------------
1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index a79f705f236d..d4685df0e120 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@

#include <linux/decompress/mm.h>
#include <linux/kernel.h>
-#include <linux/zstd_compat.h>
+#include <linux/zstd.h>
+#include <linux/zstd_errors.h>

/* 128MB is the maximum window size supported by zstd. */
#define ZSTD_WINDOWSIZE_MAX (1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf,
long out_len, long *in_pos,
void (*error)(char *x))
{
- const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+ const size_t wksp_size = ZSTD_estimateDCtxSize();
void *wksp = large_malloc(wksp_size);
- ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+ ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
int err;
size_t ret;

@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
{
ZSTD_inBuffer in;
ZSTD_outBuffer out;
- ZSTD_frameParams params;
void *in_allocated = NULL;
void *out_allocated = NULL;
void *wksp = NULL;
@@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
out.size = out_len;

/*
- * We need to know the window size to allocate the ZSTD_DStream.
- * Since we are streaming, we need to allocate a buffer for the sliding
- * window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
- * (8 MB), so it is important to use the actual value so as not to
- * waste memory when it is smaller.
+ * Zstd determines the workspace size from the window size written
+ * into the frame header. This ensures that we use the minimum value
+ * possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
+ * (1 GB), so it is very important to use the actual value.
*/
- ret = ZSTD_getFrameParams(&params, in.src, in.size);
+ wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
err = handle_zstd_error(ret, error);
if (err)
goto out;
- if (ret != 0) {
- error("ZSTD-compressed data has an incomplete frame header");
- err = -1;
- goto out;
- }
- if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
- error("ZSTD-compressed data has too large a window size");
+ wksp = large_malloc(wksp_size);
+ if (wksp == NULL) {
+ error("Out of memory while allocating ZSTD_DStream");
err = -1;
goto out;
}
-
- /*
- * Allocate the ZSTD_DStream now that we know how much memory is
- * required.
- */
- wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
- wksp = large_malloc(wksp_size);
- dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+ dstream = ZSTD_initStaticDStream(wksp, wksp_size);
if (dstream == NULL) {
- error("Out of memory while allocating ZSTD_DStream");
+ error("ZSTD_initStaticDStream failed");
err = -1;
goto out;
}
--
2.28.0