2024-04-26 11:10:19

by Cabiddu, Giovanni

[permalink] [raw]
Subject: [RFC PATCH 0/6] btrfs: offload zlib-deflate to accelerators

Add support for zlib compression and decompression through the acomp
APIs in BTRFS. This enables [de]compression operations to be offloaded
to accelerators. This is a rework of [1].

This set also re-enables zlib-deflate in the Crypto API and in the QAT
driver as they were removed in [2] since there was no user in kernel.
The re-enablement is done by reverting the commits that removed such
feature.

The code has been benchmarked on a system with the following specs:
* Dual socket Intel(R) Xeon(R) Platinum 8470N
* 512GB (16x32GB DDR5 4800 MT/s [4800 MT/s])
* 4 NVMe disks (349.3G INTEL SSDPE21K375GA)
* 2 QAT 4xxx devices, one per socket, configured for compression only
* Kernel 6.8.2

The test consisted of 4 processes running `dd` that wrote in parallel
50GB of data (Silesia corpus) to the 4 NVMe disks separately. We captured
disk write throughput, CPU utilization and compression ratio:

+---------------------------+---------+---------+---------+---------+
| | QAT-L9 | ZSTD-L3 | ZLIB-L3 | LZO-L1 |
+---------------------------+---------+---------+---------+---------+
| Disk Write TPUT (GiB/s) | 6.5 | 5.2 | 2.2 | 6.5 |
+---------------------------+---------+---------+---------+---------+
| CPU utils %age @208 cores | 4.56% | 15.67% | 12.79% | 19.85% |
+---------------------------+---------+---------+---------+---------+
| Compression Ratio | 34% | 35% | 37% | 58% |
+---------------------------+---------+---------+---------+---------+

From the results we see that BTRFS with QAT configured for zlib-deflate Level 9
provides the best throughput with less CPU utilization and better compression
ratio compared with software zstd-l3, zlib-l3 and lzo.

Limitations:
* The implementation is synchronous, even if acomp is an asynchronous API.
* The implementation tries always to use an acomp tfm even if only
zlib-deflate-scomp is present. This ignores the compression levels
configuration for zlib.
* There is no way to configure a compression level for acomp(zlib-deflate).
This is hardcoded in the acomp algorithm implementation/provider.

[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/

Giovanni Cabiddu (5):
Revert "crypto: testmgr - Remove zlib-deflate"
Revert "crypto: deflate - Remove zlib-deflate"
Revert "crypto: qat - Remove zlib-deflate"
Revert "crypto: qat - remove unused macros in qat_comp_alg.c"
crypto: qat - change compressor settings for QAT GEN4

Weigang Li (1):
btrfs: zlib: add support for zlib-deflate through acomp

crypto/deflate.c | 61 +++--
crypto/testmgr.c | 10 +
crypto/testmgr.h | 75 ++++++
.../crypto/intel/qat/qat_common/adf_gen4_dc.c | 4 +-
.../intel/qat/qat_common/qat_comp_algs.c | 138 ++++++++++-
fs/btrfs/zlib.c | 216 ++++++++++++++++++
6 files changed, 484 insertions(+), 20 deletions(-)

base-commit: ed265f7fd9a635d77c8022fc6d9a1b735dd4dfd7
--
2.44.0



2024-04-26 11:10:20

by Cabiddu, Giovanni

[permalink] [raw]
Subject: [RFC PATCH 1/6] Revert "crypto: testmgr - Remove zlib-deflate"

This reverts commit 30febae71c6182e0762dc7744737012b4f8e6a6d.
---
crypto/testmgr.c | 10 +++++++
crypto/testmgr.h | 75 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 85 insertions(+)

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 00f5a6cf341a..ab904ab74bee 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -5733,6 +5733,16 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.hash = __VECS(xxhash64_tv_template)
}
+ }, {
+ .alg = "zlib-deflate",
+ .test = alg_test_comp,
+ .fips_allowed = 1,
+ .suite = {
+ .comp = {
+ .comp = __VECS(zlib_deflate_comp_tv_template),
+ .decomp = __VECS(zlib_deflate_decomp_tv_template)
+ }
+ }
}, {
.alg = "zstd",
.test = alg_test_comp,
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 5350cfd9d325..71d87a2fd842 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -34752,6 +34752,81 @@ static const struct comp_testvec deflate_decomp_tv_template[] = {
},
};

+static const struct comp_testvec zlib_deflate_comp_tv_template[] = {
+ {
+ .inlen = 70,
+ .outlen = 44,
+ .input = "Join us now and share the software "
+ "Join us now and share the software ",
+ .output = "\x78\x5e\xf3\xca\xcf\xcc\x53\x28"
+ "\x2d\x56\xc8\xcb\x2f\x57\x48\xcc"
+ "\x4b\x51\x28\xce\x48\x2c\x4a\x55"
+ "\x28\xc9\x48\x55\x28\xce\x4f\x2b"
+ "\x29\x07\x71\xbc\x08\x2b\x01\x00"
+ "\x7c\x65\x19\x3d",
+ }, {
+ .inlen = 191,
+ .outlen = 129,
+ .input = "This document describes a compression method based on the DEFLATE"
+ "compression algorithm. This document defines the application of "
+ "the DEFLATE algorithm to the IP Payload Compression Protocol.",
+ .output = "\x78\x5e\x5d\xce\x41\x0a\xc3\x30"
+ "\x0c\x04\xc0\xaf\xec\x0b\xf2\x87"
+ "\xd2\xa6\x50\xe8\xc1\x07\x7f\x40"
+ "\xb1\x95\x5a\x60\x5b\xc6\x56\x0f"
+ "\xfd\x7d\x93\x1e\x42\xe8\x51\xec"
+ "\xee\x20\x9f\x64\x20\x6a\x78\x17"
+ "\xae\x86\xc8\x23\x74\x59\x78\x80"
+ "\x10\xb4\xb4\xce\x63\x88\x56\x14"
+ "\xb6\xa4\x11\x0b\x0d\x8e\xd8\x6e"
+ "\x4b\x8c\xdb\x7c\x7f\x5e\xfc\x7c"
+ "\xae\x51\x7e\x69\x17\x4b\x65\x02"
+ "\xfc\x1f\xbc\x4a\xdd\xd8\x7d\x48"
+ "\xad\x65\x09\x64\x3b\xac\xeb\xd9"
+ "\xc2\x01\xc0\xf4\x17\x3c\x1c\x1c"
+ "\x7d\xb2\x52\xc4\xf5\xf4\x8f\xeb"
+ "\x6a\x1a\x34\x4f\x5f\x2e\x32\x45"
+ "\x4e",
+ },
+};
+
+static const struct comp_testvec zlib_deflate_decomp_tv_template[] = {
+ {
+ .inlen = 128,
+ .outlen = 191,
+ .input = "\x78\x9c\x5d\x8d\x31\x0e\xc2\x30"
+ "\x10\x04\xbf\xb2\x2f\xc8\x1f\x10"
+ "\x04\x09\x89\xc2\x85\x3f\x70\xb1"
+ "\x2f\xf8\x24\xdb\x67\xd9\x47\xc1"
+ "\xef\x49\x68\x12\x51\xae\x76\x67"
+ "\xd6\x27\x19\x88\x1a\xde\x85\xab"
+ "\x21\xf2\x08\x5d\x16\x1e\x20\x04"
+ "\x2d\xad\xf3\x18\xa2\x15\x85\x2d"
+ "\x69\xc4\x42\x83\x23\xb6\x6c\x89"
+ "\x71\x9b\xef\xcf\x8b\x9f\xcf\x33"
+ "\xca\x2f\xed\x62\xa9\x4c\x80\xff"
+ "\x13\xaf\x52\x37\xed\x0e\x52\x6b"
+ "\x59\x02\xd9\x4e\xe8\x7a\x76\x1d"
+ "\x02\x98\xfe\x8a\x87\x83\xa3\x4f"
+ "\x56\x8a\xb8\x9e\x8e\x5c\x57\xd3"
+ "\xa0\x79\xfa\x02\x2e\x32\x45\x4e",
+ .output = "This document describes a compression method based on the DEFLATE"
+ "compression algorithm. This document defines the application of "
+ "the DEFLATE algorithm to the IP Payload Compression Protocol.",
+ }, {
+ .inlen = 44,
+ .outlen = 70,
+ .input = "\x78\x9c\xf3\xca\xcf\xcc\x53\x28"
+ "\x2d\x56\xc8\xcb\x2f\x57\x48\xcc"
+ "\x4b\x51\x28\xce\x48\x2c\x4a\x55"
+ "\x28\xc9\x48\x55\x28\xce\x4f\x2b"
+ "\x29\x07\x71\xbc\x08\x2b\x01\x00"
+ "\x7c\x65\x19\x3d",
+ .output = "Join us now and share the software "
+ "Join us now and share the software ",
+ },
+};
+
/*
* LZO test vectors (null-terminated strings).
*/
--
2.44.0


2024-04-26 11:10:31

by Cabiddu, Giovanni

[permalink] [raw]
Subject: [RFC PATCH 2/6] Revert "crypto: deflate - Remove zlib-deflate"

This reverts commit 62a465c25e99b9a98259a6b7f5bb759f5296d501.
---
crypto/deflate.c | 61 ++++++++++++++++++++++++++++++++++--------------
1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/crypto/deflate.c b/crypto/deflate.c
index 6e31e0db0e86..b2a46f6dc961 100644
--- a/crypto/deflate.c
+++ b/crypto/deflate.c
@@ -39,20 +39,24 @@ struct deflate_ctx {
struct z_stream_s decomp_stream;
};

-static int deflate_comp_init(struct deflate_ctx *ctx)
+static int deflate_comp_init(struct deflate_ctx *ctx, int format)
{
int ret = 0;
struct z_stream_s *stream = &ctx->comp_stream;

stream->workspace = vzalloc(zlib_deflate_workspacesize(
- -DEFLATE_DEF_WINBITS, MAX_MEM_LEVEL));
+ MAX_WBITS, MAX_MEM_LEVEL));
if (!stream->workspace) {
ret = -ENOMEM;
goto out;
}
- ret = zlib_deflateInit2(stream, DEFLATE_DEF_LEVEL, Z_DEFLATED,
- -DEFLATE_DEF_WINBITS, DEFLATE_DEF_MEMLEVEL,
- Z_DEFAULT_STRATEGY);
+ if (format)
+ ret = zlib_deflateInit(stream, 3);
+ else
+ ret = zlib_deflateInit2(stream, DEFLATE_DEF_LEVEL, Z_DEFLATED,
+ -DEFLATE_DEF_WINBITS,
+ DEFLATE_DEF_MEMLEVEL,
+ Z_DEFAULT_STRATEGY);
if (ret != Z_OK) {
ret = -EINVAL;
goto out_free;
@@ -64,7 +68,7 @@ static int deflate_comp_init(struct deflate_ctx *ctx)
goto out;
}

-static int deflate_decomp_init(struct deflate_ctx *ctx)
+static int deflate_decomp_init(struct deflate_ctx *ctx, int format)
{
int ret = 0;
struct z_stream_s *stream = &ctx->decomp_stream;
@@ -74,7 +78,10 @@ static int deflate_decomp_init(struct deflate_ctx *ctx)
ret = -ENOMEM;
goto out;
}
- ret = zlib_inflateInit2(stream, -DEFLATE_DEF_WINBITS);
+ if (format)
+ ret = zlib_inflateInit(stream);
+ else
+ ret = zlib_inflateInit2(stream, -DEFLATE_DEF_WINBITS);
if (ret != Z_OK) {
ret = -EINVAL;
goto out_free;
@@ -98,21 +105,21 @@ static void deflate_decomp_exit(struct deflate_ctx *ctx)
vfree(ctx->decomp_stream.workspace);
}

-static int __deflate_init(void *ctx)
+static int __deflate_init(void *ctx, int format)
{
int ret;

- ret = deflate_comp_init(ctx);
+ ret = deflate_comp_init(ctx, format);
if (ret)
goto out;
- ret = deflate_decomp_init(ctx);
+ ret = deflate_decomp_init(ctx, format);
if (ret)
deflate_comp_exit(ctx);
out:
return ret;
}

-static void *deflate_alloc_ctx(struct crypto_scomp *tfm)
+static void *gen_deflate_alloc_ctx(struct crypto_scomp *tfm, int format)
{
struct deflate_ctx *ctx;
int ret;
@@ -121,7 +128,7 @@ static void *deflate_alloc_ctx(struct crypto_scomp *tfm)
if (!ctx)
return ERR_PTR(-ENOMEM);

- ret = __deflate_init(ctx);
+ ret = __deflate_init(ctx, format);
if (ret) {
kfree(ctx);
return ERR_PTR(ret);
@@ -130,11 +137,21 @@ static void *deflate_alloc_ctx(struct crypto_scomp *tfm)
return ctx;
}

+static void *deflate_alloc_ctx(struct crypto_scomp *tfm)
+{
+ return gen_deflate_alloc_ctx(tfm, 0);
+}
+
+static void *zlib_deflate_alloc_ctx(struct crypto_scomp *tfm)
+{
+ return gen_deflate_alloc_ctx(tfm, 1);
+}
+
static int deflate_init(struct crypto_tfm *tfm)
{
struct deflate_ctx *ctx = crypto_tfm_ctx(tfm);

- return __deflate_init(ctx);
+ return __deflate_init(ctx, 0);
}

static void __deflate_exit(void *ctx)
@@ -269,7 +286,7 @@ static struct crypto_alg alg = {
.coa_decompress = deflate_decompress } }
};

-static struct scomp_alg scomp = {
+static struct scomp_alg scomp[] = { {
.alloc_ctx = deflate_alloc_ctx,
.free_ctx = deflate_free_ctx,
.compress = deflate_scompress,
@@ -279,7 +296,17 @@ static struct scomp_alg scomp = {
.cra_driver_name = "deflate-scomp",
.cra_module = THIS_MODULE,
}
-};
+}, {
+ .alloc_ctx = zlib_deflate_alloc_ctx,
+ .free_ctx = deflate_free_ctx,
+ .compress = deflate_scompress,
+ .decompress = deflate_sdecompress,
+ .base = {
+ .cra_name = "zlib-deflate",
+ .cra_driver_name = "zlib-deflate-scomp",
+ .cra_module = THIS_MODULE,
+ }
+} };

static int __init deflate_mod_init(void)
{
@@ -289,7 +316,7 @@ static int __init deflate_mod_init(void)
if (ret)
return ret;

- ret = crypto_register_scomp(&scomp);
+ ret = crypto_register_scomps(scomp, ARRAY_SIZE(scomp));
if (ret) {
crypto_unregister_alg(&alg);
return ret;
@@ -301,7 +328,7 @@ static int __init deflate_mod_init(void)
static void __exit deflate_mod_fini(void)
{
crypto_unregister_alg(&alg);
- crypto_unregister_scomp(&scomp);
+ crypto_unregister_scomps(scomp, ARRAY_SIZE(scomp));
}

subsys_initcall(deflate_mod_init);
--
2.44.0


2024-04-26 11:10:32

by Cabiddu, Giovanni

[permalink] [raw]
Subject: [RFC PATCH 3/6] Revert "crypto: qat - Remove zlib-deflate"

This reverts commit e9dd20e0e5f62d01d9404db2cf9824d1faebcf71.
---
.../intel/qat/qat_common/qat_comp_algs.c | 129 +++++++++++++++++-
1 file changed, 128 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/intel/qat/qat_common/qat_comp_algs.c b/drivers/crypto/intel/qat/qat_common/qat_comp_algs.c
index 2ba4aa22e092..79de04cfa012 100644
--- a/drivers/crypto/intel/qat/qat_common/qat_comp_algs.c
+++ b/drivers/crypto/intel/qat/qat_common/qat_comp_algs.c
@@ -100,6 +100,69 @@ static void qat_comp_resubmit(struct work_struct *work)
acomp_request_complete(areq, ret);
}

+static int parse_zlib_header(u16 zlib_h)
+{
+ int ret = -EINVAL;
+ __be16 header;
+ u8 *header_p;
+ u8 cmf, flg;
+
+ header = cpu_to_be16(zlib_h);
+ header_p = (u8 *)&header;
+
+ flg = header_p[0];
+ cmf = header_p[1];
+
+ if (cmf >> QAT_RFC_1950_CM_OFFSET > QAT_RFC_1950_CM_DEFLATE_CINFO_32K)
+ return ret;
+
+ if ((cmf & QAT_RFC_1950_CM_MASK) != QAT_RFC_1950_CM_DEFLATE)
+ return ret;
+
+ if (flg & QAT_RFC_1950_DICT_MASK)
+ return ret;
+
+ return 0;
+}
+
+static int qat_comp_rfc1950_callback(struct qat_compression_req *qat_req,
+ void *resp)
+{
+ struct acomp_req *areq = qat_req->acompress_req;
+ enum direction dir = qat_req->dir;
+ __be32 qat_produced_adler;
+
+ qat_produced_adler = cpu_to_be32(qat_comp_get_produced_adler32(resp));
+
+ if (dir == COMPRESSION) {
+ __be16 zlib_header;
+
+ zlib_header = cpu_to_be16(QAT_RFC_1950_COMP_HDR);
+ scatterwalk_map_and_copy(&zlib_header, areq->dst, 0, QAT_RFC_1950_HDR_SIZE, 1);
+ areq->dlen += QAT_RFC_1950_HDR_SIZE;
+
+ scatterwalk_map_and_copy(&qat_produced_adler, areq->dst, areq->dlen,
+ QAT_RFC_1950_FOOTER_SIZE, 1);
+ areq->dlen += QAT_RFC_1950_FOOTER_SIZE;
+ } else {
+ __be32 decomp_adler;
+ int footer_offset;
+ int consumed;
+
+ consumed = qat_comp_get_consumed_ctr(resp);
+ footer_offset = consumed + QAT_RFC_1950_HDR_SIZE;
+ if (footer_offset + QAT_RFC_1950_FOOTER_SIZE > areq->slen)
+ return -EBADMSG;
+
+ scatterwalk_map_and_copy(&decomp_adler, areq->src, footer_offset,
+ QAT_RFC_1950_FOOTER_SIZE, 0);
+
+ if (qat_produced_adler != decomp_adler)
+ return -EBADMSG;
+ }
+ return 0;
+}
+
static void qat_comp_generic_callback(struct qat_compression_req *qat_req,
void *resp)
{
@@ -221,6 +284,18 @@ static void qat_comp_alg_exit_tfm(struct crypto_acomp *acomp_tfm)
memset(ctx, 0, sizeof(*ctx));
}

+static int qat_comp_alg_rfc1950_init_tfm(struct crypto_acomp *acomp_tfm)
+{
+ struct crypto_tfm *tfm = crypto_acomp_tfm(acomp_tfm);
+ struct qat_compression_ctx *ctx = crypto_tfm_ctx(tfm);
+ int ret;
+
+ ret = qat_comp_alg_init_tfm(acomp_tfm);
+ ctx->qat_comp_callback = &qat_comp_rfc1950_callback;
+
+ return ret;
+}
+
static int qat_comp_alg_compress_decompress(struct acomp_req *areq, enum direction dir,
unsigned int shdr, unsigned int sftr,
unsigned int dhdr, unsigned int dftr)
@@ -316,6 +391,43 @@ static int qat_comp_alg_decompress(struct acomp_req *req)
return qat_comp_alg_compress_decompress(req, DECOMPRESSION, 0, 0, 0, 0);
}

+static int qat_comp_alg_rfc1950_compress(struct acomp_req *req)
+{
+ if (!req->dst && req->dlen != 0)
+ return -EINVAL;
+
+ if (req->dst && req->dlen <= QAT_RFC_1950_HDR_SIZE + QAT_RFC_1950_FOOTER_SIZE)
+ return -EINVAL;
+
+ return qat_comp_alg_compress_decompress(req, COMPRESSION, 0, 0,
+ QAT_RFC_1950_HDR_SIZE,
+ QAT_RFC_1950_FOOTER_SIZE);
+}
+
+static int qat_comp_alg_rfc1950_decompress(struct acomp_req *req)
+{
+ struct crypto_acomp *acomp_tfm = crypto_acomp_reqtfm(req);
+ struct crypto_tfm *tfm = crypto_acomp_tfm(acomp_tfm);
+ struct qat_compression_ctx *ctx = crypto_tfm_ctx(tfm);
+ struct adf_accel_dev *accel_dev = ctx->inst->accel_dev;
+ u16 zlib_header;
+ int ret;
+
+ if (req->slen <= QAT_RFC_1950_HDR_SIZE + QAT_RFC_1950_FOOTER_SIZE)
+ return -EBADMSG;
+
+ scatterwalk_map_and_copy(&zlib_header, req->src, 0, QAT_RFC_1950_HDR_SIZE, 0);
+
+ ret = parse_zlib_header(zlib_header);
+ if (ret) {
+ dev_dbg(&GET_DEV(accel_dev), "Error parsing zlib header\n");
+ return ret;
+ }
+
+ return qat_comp_alg_compress_decompress(req, DECOMPRESSION, QAT_RFC_1950_HDR_SIZE,
+ QAT_RFC_1950_FOOTER_SIZE, 0, 0);
+}
+
static struct acomp_alg qat_acomp[] = { {
.base = {
.cra_name = "deflate",
@@ -331,7 +443,22 @@ static struct acomp_alg qat_acomp[] = { {
.decompress = qat_comp_alg_decompress,
.dst_free = sgl_free,
.reqsize = sizeof(struct qat_compression_req),
-}};
+}, {
+ .base = {
+ .cra_name = "zlib-deflate",
+ .cra_driver_name = "qat_zlib_deflate",
+ .cra_priority = 4001,
+ .cra_flags = CRYPTO_ALG_ASYNC,
+ .cra_ctxsize = sizeof(struct qat_compression_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .init = qat_comp_alg_rfc1950_init_tfm,
+ .exit = qat_comp_alg_exit_tfm,
+ .compress = qat_comp_alg_rfc1950_compress,
+ .decompress = qat_comp_alg_rfc1950_decompress,
+ .dst_free = sgl_free,
+ .reqsize = sizeof(struct qat_compression_req),
+} };

int qat_comp_algs_register(void)
{
--
2.44.0


2024-04-26 11:10:42

by Cabiddu, Giovanni

[permalink] [raw]
Subject: [RFC PATCH 5/6] crypto: qat - change compressor settings for QAT GEN4

Enable dynamic compression by default for QAT GEN4 and change
compression level to 9.

Signed-off-by: Giovanni Cabiddu <[email protected]>
---
drivers/crypto/intel/qat/qat_common/adf_gen4_dc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/intel/qat/qat_common/adf_gen4_dc.c b/drivers/crypto/intel/qat/qat_common/adf_gen4_dc.c
index 5859238e37de..34f418b88738 100644
--- a/drivers/crypto/intel/qat/qat_common/adf_gen4_dc.c
+++ b/drivers/crypto/intel/qat/qat_common/adf_gen4_dc.c
@@ -22,7 +22,7 @@ static void qat_comp_build_deflate(void *ctx)
header->hdr_flags =
ICP_QAT_FW_COMN_HDR_FLAGS_BUILD(ICP_QAT_FW_COMN_REQ_FLAG_SET);
header->service_type = ICP_QAT_FW_COMN_REQ_CPM_FW_COMP;
- header->service_cmd_id = ICP_QAT_FW_COMP_CMD_STATIC;
+ header->service_cmd_id = ICP_QAT_FW_COMP_CMD_DYNAMIC;
header->comn_req_flags =
ICP_QAT_FW_COMN_FLAGS_BUILD(QAT_COMN_CD_FLD_TYPE_16BYTE_DATA,
QAT_COMN_PTR_TYPE_SGL);
@@ -35,7 +35,7 @@ static void qat_comp_build_deflate(void *ctx)
hw_comp_lower_csr.skip_ctrl = ICP_QAT_HW_COMP_20_BYTE_SKIP_3BYTE_LITERAL;
hw_comp_lower_csr.algo = ICP_QAT_HW_COMP_20_HW_COMP_FORMAT_ILZ77;
hw_comp_lower_csr.lllbd = ICP_QAT_HW_COMP_20_LLLBD_CTRL_LLLBD_ENABLED;
- hw_comp_lower_csr.sd = ICP_QAT_HW_COMP_20_SEARCH_DEPTH_LEVEL_1;
+ hw_comp_lower_csr.sd = ICP_QAT_HW_COMP_20_SEARCH_DEPTH_LEVEL_9;
hw_comp_lower_csr.hash_update = ICP_QAT_HW_COMP_20_SKIP_HASH_UPDATE_DONT_ALLOW;
hw_comp_lower_csr.edmm = ICP_QAT_HW_COMP_20_EXTENDED_DELAY_MATCH_MODE_EDMM_ENABLED;
hw_comp_upper_csr.nice = ICP_QAT_HW_COMP_20_CONFIG_CSR_NICE_PARAM_DEFAULT_VAL;
--
2.44.0


2024-04-26 11:10:43

by Cabiddu, Giovanni

[permalink] [raw]
Subject: [RFC PATCH 4/6] Revert "crypto: qat - remove unused macros in qat_comp_alg.c"

This reverts commit b4bf8295892924fca60d0704ac7cbc3b5897d233.
---
drivers/crypto/intel/qat/qat_common/qat_comp_algs.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/crypto/intel/qat/qat_common/qat_comp_algs.c b/drivers/crypto/intel/qat/qat_common/qat_comp_algs.c
index 79de04cfa012..b533984906ec 100644
--- a/drivers/crypto/intel/qat/qat_common/qat_comp_algs.c
+++ b/drivers/crypto/intel/qat/qat_common/qat_comp_algs.c
@@ -13,6 +13,15 @@
#include "qat_compression.h"
#include "qat_algs_send.h"

+#define QAT_RFC_1950_HDR_SIZE 2
+#define QAT_RFC_1950_FOOTER_SIZE 4
+#define QAT_RFC_1950_CM_DEFLATE 8
+#define QAT_RFC_1950_CM_DEFLATE_CINFO_32K 7
+#define QAT_RFC_1950_CM_MASK 0x0f
+#define QAT_RFC_1950_CM_OFFSET 4
+#define QAT_RFC_1950_DICT_MASK 0x20
+#define QAT_RFC_1950_COMP_HDR 0x785e
+
static DEFINE_MUTEX(algs_lock);
static unsigned int active_devs;

--
2.44.0


2024-04-26 11:11:47

by Cabiddu, Giovanni

[permalink] [raw]
Subject: [RFC PATCH 6/6] btrfs: zlib: add support for zlib-deflate through acomp

From: Weigang Li <[email protected]>

Add support for zlib compression and decompression through the acomp
APIs.
Input pages are added to an sg-list and sent to acomp in one request.
Since acomp is asynchronous, the thread is put to sleep and then the CPU
is freed up. Once compression is done, the acomp callback is triggered
and the thread is woke up.

This patch doesn't change the BTRFS disk format, this means that files
compressed by hardware engines can be de-compressed by the zlib software
library, and vice versa.

Limitations:
* The implementation tries always to use an acomp even if only
zlib-deflate-scomp is present
* Acomp does not provide a way to support compression levels
* Acomp is an asynchronous API but used here synchronously

Signed-off-by: Weigang Li <[email protected]>
---
fs/btrfs/zlib.c | 216 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 216 insertions(+)

diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
index e5b3f2003896..b5bbb8c97244 100644
--- a/fs/btrfs/zlib.c
+++ b/fs/btrfs/zlib.c
@@ -18,6 +18,8 @@
#include <linux/pagemap.h>
#include <linux/bio.h>
#include <linux/refcount.h>
+#include <crypto/acompress.h>
+#include <linux/scatterlist.h>
#include "compression.h"

/* workspace buffer size for s390 zlib hardware support */
@@ -33,6 +35,201 @@ struct workspace {

static struct workspace_manager wsm;

+static int acomp_comp_pages(struct address_space *mapping, u64 start,
+ unsigned long len, struct page **pages,
+ unsigned long *out_pages,
+ unsigned long *total_in,
+ unsigned long *total_out)
+{
+ unsigned int nr_src_pages = 0, nr_dst_pages = 0, nr_pages = 0;
+ struct sg_table in_sg = { 0 }, out_sg = { 0 };
+ struct page *in_page, *out_page, **in_pages;
+ struct crypto_acomp *tfm = NULL;
+ struct acomp_req *req = NULL;
+ struct crypto_wait wait;
+ int ret, i;
+
+ nr_src_pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ in_pages = kcalloc(nr_src_pages, sizeof(struct page *), GFP_KERNEL);
+ if (!in_pages) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ for (i = 0; i < nr_src_pages; i++) {
+ in_page = find_get_page(mapping, start >> PAGE_SHIFT);
+ out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+ if (!in_page || !out_page) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ in_pages[i] = in_page;
+ pages[i] = out_page;
+ nr_dst_pages += 1;
+ start += PAGE_SIZE;
+ }
+
+ ret = sg_alloc_table_from_pages(&in_sg, in_pages, nr_src_pages, 0,
+ nr_src_pages << PAGE_SHIFT, GFP_KERNEL);
+ if (ret)
+ goto out;
+
+ ret = sg_alloc_table_from_pages(&out_sg, pages, nr_dst_pages, 0,
+ nr_dst_pages << PAGE_SHIFT, GFP_KERNEL);
+ if (ret)
+ goto out;
+
+ crypto_init_wait(&wait);
+ tfm = crypto_alloc_acomp("zlib-deflate", 0, 0);
+ if (IS_ERR(tfm)) {
+ ret = PTR_ERR(tfm);
+ goto out;
+ }
+
+ req = acomp_request_alloc(tfm);
+ if (!req) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ acomp_request_set_params(req, in_sg.sgl, out_sg.sgl, len,
+ nr_dst_pages << PAGE_SHIFT);
+ acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+ crypto_req_done, &wait);
+
+ ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+ if (ret)
+ goto out;
+
+ *total_in = len;
+ *total_out = req->dlen;
+ nr_pages = (*total_out + PAGE_SIZE - 1) >> PAGE_SHIFT;
+
+out:
+ sg_free_table(&in_sg);
+ sg_free_table(&out_sg);
+
+ if (in_pages) {
+ for (i = 0; i < nr_src_pages; i++)
+ put_page(in_pages[i]);
+ kfree(in_pages);
+ }
+
+ /* free un-used out pages */
+ for (i = nr_pages; i < nr_dst_pages; i++)
+ put_page(pages[i]);
+
+ if (req)
+ acomp_request_free(req);
+
+ if (tfm)
+ crypto_free_acomp(tfm);
+
+ *out_pages = nr_pages;
+
+ return ret;
+}
+
+static int acomp_zlib_decomp_bio(struct page **in_pages,
+ struct compressed_bio *cb, size_t srclen,
+ unsigned long total_pages_in)
+{
+ unsigned int nr_dst_pages = BTRFS_MAX_COMPRESSED_PAGES;
+ struct sg_table in_sg = { 0 }, out_sg = { 0 };
+ struct bio *orig_bio = &cb->orig_bbio->bio;
+ char *data_out = NULL, *bv_buf = NULL;
+ int copy_len = 0, bytes_left = 0;
+ struct crypto_acomp *tfm = NULL;
+ struct page **out_pages = NULL;
+ struct acomp_req *req = NULL;
+ struct crypto_wait wait;
+ struct bio_vec bvec;
+ int ret, i = 0;
+
+ ret = sg_alloc_table_from_pages(&in_sg, in_pages, total_pages_in,
+ 0, srclen, GFP_KERNEL);
+ if (ret)
+ goto out;
+
+ out_pages = kcalloc(nr_dst_pages, sizeof(struct page *), GFP_KERNEL);
+ if (!out_pages) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ for (i = 0; i < nr_dst_pages; i++) {
+ out_pages[i] = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
+ if (!out_pages[i]) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ }
+
+ ret = sg_alloc_table_from_pages(&out_sg, out_pages, nr_dst_pages, 0,
+ nr_dst_pages << PAGE_SHIFT, GFP_KERNEL);
+ if (ret)
+ goto out;
+
+ crypto_init_wait(&wait);
+ tfm = crypto_alloc_acomp("zlib-deflate", 0, 0);
+ if (IS_ERR(tfm)) {
+ ret = PTR_ERR(tfm);
+ goto out;
+ }
+
+ req = acomp_request_alloc(tfm);
+ if (!req) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ acomp_request_set_params(req, in_sg.sgl, out_sg.sgl, srclen,
+ nr_dst_pages << PAGE_SHIFT);
+ acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+ crypto_req_done, &wait);
+
+ ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+ if (ret)
+ goto out;
+
+ /* Copy decompressed buffer to bio pages */
+ bytes_left = req->dlen;
+ for (i = 0; i < nr_dst_pages; i++) {
+ copy_len = bytes_left > PAGE_SIZE ? PAGE_SIZE : bytes_left;
+ data_out = kmap_local_page(out_pages[i]);
+
+ bvec = bio_iter_iovec(orig_bio, orig_bio->bi_iter);
+ bv_buf = kmap_local_page(bvec.bv_page);
+ memcpy(bv_buf, data_out, copy_len);
+ kunmap_local(bv_buf);
+
+ bio_advance(orig_bio, copy_len);
+ if (!orig_bio->bi_iter.bi_size)
+ break;
+ bytes_left -= copy_len;
+ if (bytes_left <= 0)
+ break;
+ }
+out:
+ sg_free_table(&in_sg);
+ sg_free_table(&out_sg);
+
+ if (out_pages) {
+ for (i = 0; i < nr_dst_pages; i++) {
+ if (out_pages[i])
+ put_page(out_pages[i]);
+ }
+ kfree(out_pages);
+ }
+
+ if (req)
+ acomp_request_free(req);
+ if (tfm)
+ crypto_free_acomp(tfm);
+
+ return ret;
+}
+
struct list_head *zlib_get_workspace(unsigned int level)
{
struct list_head *ws = btrfs_get_workspace(BTRFS_COMPRESS_ZLIB, level);
@@ -108,6 +305,15 @@ int zlib_compress_pages(struct list_head *ws, struct address_space *mapping,
unsigned long nr_dest_pages = *out_pages;
const unsigned long max_out = nr_dest_pages * PAGE_SIZE;

+ if (crypto_has_acomp("zlib-deflate", 0, 0)) {
+ ret = acomp_comp_pages(mapping, start, len, pages, out_pages,
+ total_in, total_out);
+ if (!ret)
+ return ret;
+
+ pr_warn("BTRFS: acomp compression failed: ret = %d\n", ret);
+ /* Fallback to SW implementation if HW compression failed */
+ }
*out_pages = 0;
*total_out = 0;
*total_in = 0;
@@ -281,6 +487,16 @@ int zlib_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
unsigned long buf_start;
struct page **pages_in = cb->compressed_pages;

+ if (crypto_has_acomp("zlib-deflate", 0, 0)) {
+ ret = acomp_zlib_decomp_bio(pages_in, cb, srclen,
+ total_pages_in);
+ if (!ret)
+ return ret;
+
+ pr_warn("BTRFS: acomp decompression failed, ret=%d\n", ret);
+ /* Fallback to SW implementation if HW decompression failed */
+ }
+
data_in = kmap_local_page(pages_in[page_in_index]);
workspace->strm.next_in = data_in;
workspace->strm.avail_in = min_t(size_t, srclen, PAGE_SIZE);
--
2.44.0


2024-04-29 13:57:44

by Josef Bacik

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] btrfs: zlib: add support for zlib-deflate through acomp

On Fri, Apr 26, 2024 at 11:54:29AM +0100, Giovanni Cabiddu wrote:
> From: Weigang Li <[email protected]>
>
> Add support for zlib compression and decompression through the acomp
> APIs.
> Input pages are added to an sg-list and sent to acomp in one request.
> Since acomp is asynchronous, the thread is put to sleep and then the CPU
> is freed up. Once compression is done, the acomp callback is triggered
> and the thread is woke up.
>
> This patch doesn't change the BTRFS disk format, this means that files
> compressed by hardware engines can be de-compressed by the zlib software
> library, and vice versa.
>
> Limitations:
> * The implementation tries always to use an acomp even if only
> zlib-deflate-scomp is present
> * Acomp does not provide a way to support compression levels

That's a non-starter. We can't just lie to the user about the compression level
that is being used. If the user just does "-o compress=zlib" then you need to
update btrfs_compress_set_level() to figure out the compression level that acomp
is going to use and set that appropriately, so we can report to the user what is
actually being used.

Additionally if a user specifies a compression level you need to make sure we
don't do acomp if it doesn't match what acomp is going to do.

Finally, for the normal code review, there's a bunch of things that need to be
fixed up before I take a closer look

- We don't use pr_(), we have btrfs specific printk helpers, please use those.
- We do 1 variable per line, fix up the variable declarations in your functions.

Thanks,

Josef

2024-04-29 15:23:53

by Cabiddu, Giovanni

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] btrfs: zlib: add support for zlib-deflate through acomp

On Mon, Apr 29, 2024 at 09:56:45AM -0400, Josef Bacik wrote:
> On Fri, Apr 26, 2024 at 11:54:29AM +0100, Giovanni Cabiddu wrote:
> > From: Weigang Li <[email protected]>
> >
> > Add support for zlib compression and decompression through the acomp
> > APIs.
> > Input pages are added to an sg-list and sent to acomp in one request.
> > Since acomp is asynchronous, the thread is put to sleep and then the CPU
> > is freed up. Once compression is done, the acomp callback is triggered
> > and the thread is woke up.
> >
> > This patch doesn't change the BTRFS disk format, this means that files
> > compressed by hardware engines can be de-compressed by the zlib software
> > library, and vice versa.
> >
> > Limitations:
> > * The implementation tries always to use an acomp even if only
> > zlib-deflate-scomp is present
> > * Acomp does not provide a way to support compression levels
>
> That's a non-starter. We can't just lie to the user about the compression level
> that is being used. If the user just does "-o compress=zlib" then you need to
> update btrfs_compress_set_level() to figure out the compression level that acomp
> is going to use and set that appropriately, so we can report to the user what is
> actually being used.
>
> Additionally if a user specifies a compression level you need to make sure we
> don't do acomp if it doesn't match what acomp is going to do.
Thanks for the feedback. We should then extend the acomp API to take the
compression level.
@Herbert, do you have any objection if we add the compression level to
the acomp tfm and we add an API to set it? Example:

tfm = crypto_alloc_acomp("deflate", 0, 0);
acomp_set_level(tfm, compression_level);

> Finally, for the normal code review, there's a bunch of things that need to be
> fixed up before I take a closer look
>
> - We don't use pr_(), we have btrfs specific printk helpers, please use those.
> - We do 1 variable per line, fix up the variable declarations in your functions.
I see that the code in fs/btrfs/zlib.c uses both pr_() and more than one
variable per line. If we change it, will mixed style be a concern?

--
Giovanni

2024-04-29 15:49:00

by David Sterba

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] btrfs: zlib: add support for zlib-deflate through acomp

On Mon, Apr 29, 2024 at 09:56:45AM -0400, Josef Bacik wrote:
> On Fri, Apr 26, 2024 at 11:54:29AM +0100, Giovanni Cabiddu wrote:
> > From: Weigang Li <[email protected]>
> >
> > Add support for zlib compression and decompression through the acomp
> > APIs.
> > Input pages are added to an sg-list and sent to acomp in one request.
> > Since acomp is asynchronous, the thread is put to sleep and then the CPU
> > is freed up. Once compression is done, the acomp callback is triggered
> > and the thread is woke up.
> >
> > This patch doesn't change the BTRFS disk format, this means that files
> > compressed by hardware engines can be de-compressed by the zlib software
> > library, and vice versa.
> >
> > Limitations:
> > * The implementation tries always to use an acomp even if only
> > zlib-deflate-scomp is present
> > * Acomp does not provide a way to support compression levels
>
> That's a non-starter. We can't just lie to the user about the compression level
> that is being used. If the user just does "-o compress=zlib" then you need to
> update btrfs_compress_set_level() to figure out the compression level that acomp
> is going to use and set that appropriately, so we can report to the user what is
> actually being used.
>
> Additionally if a user specifies a compression level you need to make sure we
> don't do acomp if it doesn't match what acomp is going to do.
>
> Finally, for the normal code review, there's a bunch of things that need to be
> fixed up before I take a closer look
>
> - We don't use pr_(), we have btrfs specific printk helpers, please use those.
> - We do 1 variable per line, fix up the variable declarations in your functions.

I'd skip the style and implementation details for now. The absence of
compression level support seems like the biggest problem, also in
combination with uncondtional use of the acomp interface. We'd have to
enhance the compression format specifier to make it configurable in the
sense: if accelerator is available use it, otherwise do CPU and
synchronous compression.

On the other hand, the compression levels are to trade off time and
space. If the QAT implementation with zlib level 9 is always better than
CPU compression then it's not that bad, not counting the possibly
misleading level to the users.

If QAT can also support ZSTD I'm not sure that lack of levels can work
there though, the memory overhead is bigger and it's a more complex
algorithm. Extending the acomp API with levels would be necessary.

Regarding the implementation, there are many allocations that set up the
async request. This is problematic as the compression is at the end of
the IO path and potentially called after memory pressure. We still do
some allocations there but also try not to fail due to ENOMEM, each
allocation is a new failure point. Anything that could be reused should
be in the workspace memory.

2024-04-29 16:04:34

by David Sterba

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] btrfs: zlib: add support for zlib-deflate through acomp

On Fri, Apr 26, 2024 at 11:54:29AM +0100, Giovanni Cabiddu wrote:
> From: Weigang Li <[email protected]>
> +static int acomp_comp_pages(struct address_space *mapping, u64 start,
> + unsigned long len, struct page **pages,
> + unsigned long *out_pages,
> + unsigned long *total_in,
> + unsigned long *total_out)
> +{
> + unsigned int nr_src_pages = 0, nr_dst_pages = 0, nr_pages = 0;
> + struct sg_table in_sg = { 0 }, out_sg = { 0 };
> + struct page *in_page, *out_page, **in_pages;
> + struct crypto_acomp *tfm = NULL;
> + struct acomp_req *req = NULL;
> + struct crypto_wait wait;
> + int ret, i;
> +
> + nr_src_pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
> + in_pages = kcalloc(nr_src_pages, sizeof(struct page *), GFP_KERNEL);

The maximum length is bounded so you could store the in_pages array in
zlib's workspace.

> + if (!in_pages) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + for (i = 0; i < nr_src_pages; i++) {
> + in_page = find_get_page(mapping, start >> PAGE_SHIFT);
> + out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM);

Output pages should be newly allocated by btrfs_alloc_compr_folio()

> + if (!in_page || !out_page) {
> + ret = -ENOMEM;
> + goto out;
> + }
> + in_pages[i] = in_page;
> + pages[i] = out_page;
> + nr_dst_pages += 1;
> + start += PAGE_SIZE;
> + }
> +
> + ret = sg_alloc_table_from_pages(&in_sg, in_pages, nr_src_pages, 0,
> + nr_src_pages << PAGE_SHIFT, GFP_KERNEL);

I'm not sure if the sg interface allows to use an existing buffer but
the input parameters are bounded in size and count so the allocation
should be dropped and replaced by workspace data.

> + if (ret)
> + goto out;
> +
> + ret = sg_alloc_table_from_pages(&out_sg, pages, nr_dst_pages, 0,
> + nr_dst_pages << PAGE_SHIFT, GFP_KERNEL);
> + if (ret)
> + goto out;
> +
> + crypto_init_wait(&wait);
> + tfm = crypto_alloc_acomp("zlib-deflate", 0, 0);

AFAIK the TFM should be allocated only once way before any IO is done
and then reused, this can trigger resolving the best implementation or
maybe even module loading.

> + if (IS_ERR(tfm)) {
> + ret = PTR_ERR(tfm);
> + goto out;
> + }
> +
> + req = acomp_request_alloc(tfm);

The request should be in workspace, the only initialization I see
setting the right ->tfm pointer.

> + if (!req) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + acomp_request_set_params(req, in_sg.sgl, out_sg.sgl, len,
> + nr_dst_pages << PAGE_SHIFT);
> + acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
> + crypto_req_done, &wait);
> +
> + ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
> + if (ret)
> + goto out;
> +
> + *total_in = len;
> + *total_out = req->dlen;
> + nr_pages = (*total_out + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +
> +out:
> + sg_free_table(&in_sg);
> + sg_free_table(&out_sg);
> +
> + if (in_pages) {
> + for (i = 0; i < nr_src_pages; i++)
> + put_page(in_pages[i]);
> + kfree(in_pages);

Pages returned back to the pool by btrfs_free_compr_folio()

> + }
> +
> + /* free un-used out pages */
> + for (i = nr_pages; i < nr_dst_pages; i++)
> + put_page(pages[i]);
> +
> + if (req)
> + acomp_request_free(req);
> +
> + if (tfm)
> + crypto_free_acomp(tfm);
> +
> + *out_pages = nr_pages;
> +
> + return ret;
> +}
> +
> +static int acomp_zlib_decomp_bio(struct page **in_pages,
> + struct compressed_bio *cb, size_t srclen,
> + unsigned long total_pages_in)
> +{
> + unsigned int nr_dst_pages = BTRFS_MAX_COMPRESSED_PAGES;
> + struct sg_table in_sg = { 0 }, out_sg = { 0 };
> + struct bio *orig_bio = &cb->orig_bbio->bio;
> + char *data_out = NULL, *bv_buf = NULL;
> + int copy_len = 0, bytes_left = 0;
> + struct crypto_acomp *tfm = NULL;
> + struct page **out_pages = NULL;
> + struct acomp_req *req = NULL;
> + struct crypto_wait wait;
> + struct bio_vec bvec;
> + int ret, i = 0;
> +
> + ret = sg_alloc_table_from_pages(&in_sg, in_pages, total_pages_in,
> + 0, srclen, GFP_KERNEL);

Any allocation here needs to be GFP_NOFS for now. Actually we'd need
memalloc_nofs_save/memalloc_nofs_restore around all compression and
decompression code that does not use GFP_NOFS directly and could call
other APIs that do GFP_KERNEL. Like crypto or sg.

> + if (ret)
> + goto out;
> +
> + out_pages = kcalloc(nr_dst_pages, sizeof(struct page *), GFP_KERNEL);
> + if (!out_pages) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + for (i = 0; i < nr_dst_pages; i++) {
> + out_pages[i] = alloc_page(GFP_NOFS | __GFP_HIGHMEM);
> + if (!out_pages[i]) {
> + ret = -ENOMEM;
> + goto out;
> + }
> + }
> +
> + ret = sg_alloc_table_from_pages(&out_sg, out_pages, nr_dst_pages, 0,
> + nr_dst_pages << PAGE_SHIFT, GFP_KERNEL);
> + if (ret)
> + goto out;
> +
> + crypto_init_wait(&wait);
> + tfm = crypto_alloc_acomp("zlib-deflate", 0, 0);
> + if (IS_ERR(tfm)) {
> + ret = PTR_ERR(tfm);
> + goto out;
> + }
> +
> + req = acomp_request_alloc(tfm);
> + if (!req) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + acomp_request_set_params(req, in_sg.sgl, out_sg.sgl, srclen,
> + nr_dst_pages << PAGE_SHIFT);
> + acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
> + crypto_req_done, &wait);
> +
> + ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
> + if (ret)
> + goto out;
> +
> + /* Copy decompressed buffer to bio pages */
> + bytes_left = req->dlen;
> + for (i = 0; i < nr_dst_pages; i++) {
> + copy_len = bytes_left > PAGE_SIZE ? PAGE_SIZE : bytes_left;
> + data_out = kmap_local_page(out_pages[i]);
> +
> + bvec = bio_iter_iovec(orig_bio, orig_bio->bi_iter);
> + bv_buf = kmap_local_page(bvec.bv_page);
> + memcpy(bv_buf, data_out, copy_len);
> + kunmap_local(bv_buf);
> +
> + bio_advance(orig_bio, copy_len);
> + if (!orig_bio->bi_iter.bi_size)
> + break;
> + bytes_left -= copy_len;
> + if (bytes_left <= 0)
> + break;
> + }
> +out:
> + sg_free_table(&in_sg);
> + sg_free_table(&out_sg);
> +
> + if (out_pages) {
> + for (i = 0; i < nr_dst_pages; i++) {
> + if (out_pages[i])
> + put_page(out_pages[i]);
> + }
> + kfree(out_pages);
> + }
> +
> + if (req)
> + acomp_request_free(req);
> + if (tfm)
> + crypto_free_acomp(tfm);
> +
> + return ret;
> +}
> +
> struct list_head *zlib_get_workspace(unsigned int level)
> {
> struct list_head *ws = btrfs_get_workspace(BTRFS_COMPRESS_ZLIB, level);
> @@ -108,6 +305,15 @@ int zlib_compress_pages(struct list_head *ws, struct address_space *mapping,
> unsigned long nr_dest_pages = *out_pages;
> const unsigned long max_out = nr_dest_pages * PAGE_SIZE;
>
> + if (crypto_has_acomp("zlib-deflate", 0, 0)) {
> + ret = acomp_comp_pages(mapping, start, len, pages, out_pages,
> + total_in, total_out);
> + if (!ret)
> + return ret;
> +
> + pr_warn("BTRFS: acomp compression failed: ret = %d\n", ret);
> + /* Fallback to SW implementation if HW compression failed */
> + }
> *out_pages = 0;
> *total_out = 0;
> *total_in = 0;
> @@ -281,6 +487,16 @@ int zlib_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
> unsigned long buf_start;
> struct page **pages_in = cb->compressed_pages;
>
> + if (crypto_has_acomp("zlib-deflate", 0, 0)) {
> + ret = acomp_zlib_decomp_bio(pages_in, cb, srclen,
> + total_pages_in);
> + if (!ret)
> + return ret;
> +
> + pr_warn("BTRFS: acomp decompression failed, ret=%d\n", ret);
> + /* Fallback to SW implementation if HW decompression failed */
> + }
> +
> data_in = kmap_local_page(pages_in[page_in_index]);
> workspace->strm.next_in = data_in;
> workspace->strm.avail_in = min_t(size_t, srclen, PAGE_SIZE);
> --
> 2.44.0
>

2024-04-29 16:14:37

by David Sterba

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] btrfs: zlib: add support for zlib-deflate through acomp

On Mon, Apr 29, 2024 at 04:21:46PM +0100, Cabiddu, Giovanni wrote:
> On Mon, Apr 29, 2024 at 09:56:45AM -0400, Josef Bacik wrote:
> > On Fri, Apr 26, 2024 at 11:54:29AM +0100, Giovanni Cabiddu wrote:
> > > From: Weigang Li <[email protected]>
> > >
> > > Add support for zlib compression and decompression through the acomp
> > > APIs.
> > > Input pages are added to an sg-list and sent to acomp in one request.
> > > Since acomp is asynchronous, the thread is put to sleep and then the CPU
> > > is freed up. Once compression is done, the acomp callback is triggered
> > > and the thread is woke up.
> > >
> > > This patch doesn't change the BTRFS disk format, this means that files
> > > compressed by hardware engines can be de-compressed by the zlib software
> > > library, and vice versa.
> > >
> > > Limitations:
> > > * The implementation tries always to use an acomp even if only
> > > zlib-deflate-scomp is present
> > > * Acomp does not provide a way to support compression levels
> >
> > That's a non-starter. We can't just lie to the user about the compression level
> > that is being used. If the user just does "-o compress=zlib" then you need to
> > update btrfs_compress_set_level() to figure out the compression level that acomp
> > is going to use and set that appropriately, so we can report to the user what is
> > actually being used.
> >
> > Additionally if a user specifies a compression level you need to make sure we
> > don't do acomp if it doesn't match what acomp is going to do.
> Thanks for the feedback. We should then extend the acomp API to take the
> compression level.
> @Herbert, do you have any objection if we add the compression level to
> the acomp tfm and we add an API to set it? Example:
>
> tfm = crypto_alloc_acomp("deflate", 0, 0);
> acomp_set_level(tfm, compression_level);
>
> > Finally, for the normal code review, there's a bunch of things that need to be
> > fixed up before I take a closer look
> >
> > - We don't use pr_(), we have btrfs specific printk helpers, please use those.
> > - We do 1 variable per line, fix up the variable declarations in your functions.
> I see that the code in fs/btrfs/zlib.c uses both pr_() and more than one
> variable per line. If we change it, will mixed style be a concern?

I have a work in progress to rework the messages in compression. The
messages with pr_() helpers are there for historical reasons and using
proper btrfs_info/... need extracting structures like fs_info from
various data. Josef's comment is valid but you can skip that for the QAT
series.