2024-02-22 08:12:01

by Barry Song

Subject: [PATCH v6 0/2] zswap: remove the memcpy if acomp is not sleepable

From: Barry Song <[email protected]>

In zswap, if we use zsmalloc, we cannot sleep while we map the
compressed memory, so we copy it to a temporary buffer. Knowing
that the algorithm won't sleep lets zswap avoid the memcpy.
Thus we introduce an API in crypto to expose whether acomp is
async, and zswap can use it to decide whether it can skip copying
to the tmp buffer.

-v6:
* add acked-by of Herbert, Thanks!
* remove patch 3/3 from the series, as that one will go
through crypto

Barry Song (2):
crypto: introduce: acomp_is_async to expose if comp drivers might
sleep
mm/zswap: remove the memcpy if acomp is not sleepable

include/crypto/acompress.h | 6 ++++++
mm/zswap.c | 6 ++++--
2 files changed, 10 insertions(+), 2 deletions(-)

--
2.34.1



2024-02-22 08:12:14

by Barry Song

Subject: [PATCH v6 1/2] crypto: introduce: acomp_is_async to expose if comp drivers might sleep

From: Barry Song <[email protected]>

acomp's users might want to know if acomp is really async to
optimize themselves. One typical user which can benefit from
the exposed async status is zswap.

In zswap, zsmalloc is the most commonly used allocator
(and perhaps the only one). For zsmalloc, we cannot sleep
while we map the compressed memory, so we copy it to a
temporary buffer. Knowing that the alg won't sleep lets
zswap avoid the need for a buffer. This shows noticeable
improvement in load/store latency of zswap.

Signed-off-by: Barry Song <[email protected]>
Acked-by: Herbert Xu <[email protected]>
---
include/crypto/acompress.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
index 574cffc90730..80e243611fe2 100644
--- a/include/crypto/acompress.h
+++ b/include/crypto/acompress.h
@@ -160,6 +160,12 @@ static inline void acomp_request_set_tfm(struct acomp_req *req,
 	req->base.tfm = crypto_acomp_tfm(tfm);
 }

+static inline bool acomp_is_async(struct crypto_acomp *tfm)
+{
+	return crypto_comp_alg_common(tfm)->base.cra_flags &
+	       CRYPTO_ALG_ASYNC;
+}
+
 static inline struct crypto_acomp *crypto_acomp_reqtfm(struct acomp_req *req)
 {
 	return __crypto_acomp_tfm(req->base.tfm);
--
2.34.1
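
[Editor's illustration, not part of the patch] The helper above is a single bit test on the algorithm's flags. A minimal userspace sketch of the same idea follows. All names here (MOCK_ALG_ASYNC, struct mock_alg, struct mock_acomp, mock_acomp_is_async) are hypothetical stand-ins and the flag value is illustrative; this is not the kernel's crypto API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's CRYPTO_ALG_ASYNC bit; the value is illustrative. */
#define MOCK_ALG_ASYNC 0x80u

/* Hypothetical miniature of an algorithm descriptor carrying cra_flags. */
struct mock_alg {
	uint32_t cra_flags;
};

/* Hypothetical miniature of a transform handle pointing at its algorithm. */
struct mock_acomp {
	const struct mock_alg *alg;
};

/* Mirrors the shape of acomp_is_async(): true only if the async bit is set. */
static bool mock_acomp_is_async(const struct mock_acomp *tfm)
{
	return (tfm->alg->cra_flags & MOCK_ALG_ASYNC) != 0;
}
```

The point of the sketch is that the answer depends entirely on what the driver advertises in its flags: other bits are ignored, and a driver that never sets the async bit is always reported as synchronous.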


2024-02-22 08:12:28

by Barry Song

Subject: [PATCH v6 2/2] mm/zswap: remove the memcpy if acomp is not sleepable

From: Barry Song <[email protected]>

Most compressors are actually CPU-based and won't sleep during
compression and decompression. We should remove the redundant
memcpy for them.
This patch checks if the algorithm is sleepable by testing the
CRYPTO_ALG_ASYNC algorithm flag.
Generally speaking, async and sleepable are semantically similar
but not equal. But for compress drivers, they are basically equal,
at least because of the facts below.
Firstly, scompress drivers - crypto/deflate.c, lz4.c, zstd.c,
lzo.c etc. - do not sleep. Secondly, zRAM has been using these
scompress drivers for years in atomic contexts, and has never
worried about those drivers going to sleep.
One exception is that an async driver can sometimes still return
synchronously, per Herbert's clarification. In this case, we
still have a redundant memcpy. But we can't know if one
particular acomp request will sleep or not unless crypto can
expose more details for each specific request from offload
drivers.

Signed-off-by: Barry Song <[email protected]>
Tested-by: Chengming Zhou <[email protected]>
Reviewed-by: Nhat Pham <[email protected]>
Acked-by: Yosry Ahmed <[email protected]>
Reviewed-by: Chengming Zhou <[email protected]>
---
mm/zswap.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 011e068eb355..de3c9e30bed7 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -162,6 +162,7 @@ struct crypto_acomp_ctx {
 	struct crypto_wait wait;
 	u8 *buffer;
 	struct mutex mutex;
+	bool is_sleepable;
 };

 /*
@@ -950,6 +951,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 		goto acomp_fail;
 	}
 	acomp_ctx->acomp = acomp;
+	acomp_ctx->is_sleepable = acomp_is_async(acomp);

 	req = acomp_request_alloc(acomp_ctx->acomp);
 	if (!req) {
@@ -1077,7 +1079,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct page *page)
 	mutex_lock(&acomp_ctx->mutex);

 	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
-	if (!zpool_can_sleep_mapped(zpool)) {
+	if (acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) {
 		memcpy(acomp_ctx->buffer, src, entry->length);
 		src = acomp_ctx->buffer;
 		zpool_unmap_handle(zpool, entry->handle);
@@ -1091,7 +1093,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct page *page)
 	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
 	mutex_unlock(&acomp_ctx->mutex);

-	if (zpool_can_sleep_mapped(zpool))
+	if (!acomp_ctx->is_sleepable || zpool_can_sleep_mapped(zpool))
 		zpool_unmap_handle(zpool, entry->handle);
 }


--
2.34.1
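
[Editor's illustration, not part of the patch] The two changed conditions in zswap_decompress() are complements of each other, so the handle is unmapped exactly once on every path: either early (after the copy to the temporary buffer) or late (after decompression). A small userspace sketch of that decision table follows. struct decompress_plan and plan_decompress() are hypothetical names introduced only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical summary of what the decompress path does for one entry. */
struct decompress_plan {
	bool copy_to_buffer; /* memcpy to the tmp buffer, then unmap before decompressing */
	bool unmap_after;    /* decompress straight from the mapping, unmap afterwards */
};

/*
 * Mirrors the two conditions in the patch:
 *   copy:        is_sleepable && !zpool_can_sleep_mapped()
 *   late unmap: !is_sleepable ||  zpool_can_sleep_mapped()
 */
static struct decompress_plan plan_decompress(bool is_sleepable,
					       bool pool_can_sleep_mapped)
{
	struct decompress_plan p;

	p.copy_to_buffer = is_sleepable && !pool_can_sleep_mapped;
	p.unmap_after = !is_sleepable || pool_can_sleep_mapped;
	return p;
}
```

For the common case described in the commit message (a CPU-based compressor on a pool that cannot sleep while mapped, such as zsmalloc), plan_decompress(false, false) selects the late unmap and skips the copy.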


2024-02-24 16:53:19

by Chris Li

Subject: Re: [PATCH v6 1/2] crypto: introduce: acomp_is_async to expose if comp drivers might sleep

On Thu, Feb 22, 2024 at 12:12 AM Barry Song <[email protected]> wrote:
>
> From: Barry Song <[email protected]>
>
> acomp's users might want to know if acomp is really async to
> optimize themselves. One typical user which can benefit from
> exposed async stat is zswap.
>
> In zswap, zsmalloc is the most commonly used allocator for
> (and perhaps the only one). For zsmalloc, we cannot sleep
> while we map the compressed memory, so we copy it to a
> temporary buffer. By knowing the alg won't sleep can help
> zswap to avoid the need for a buffer. This shows noticeable
> improvement in load/store latency of zswap.
>
> Signed-off-by: Barry Song <[email protected]>
> Acked-by: Herbert Xu <[email protected]>

Acked-by: Chris Li <[email protected]>

Chris

> ---
> include/crypto/acompress.h | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
> index 574cffc90730..80e243611fe2 100644
> --- a/include/crypto/acompress.h
> +++ b/include/crypto/acompress.h
> @@ -160,6 +160,12 @@ static inline void acomp_request_set_tfm(struct acomp_req *req,
> req->base.tfm = crypto_acomp_tfm(tfm);
> }
>
> +static inline bool acomp_is_async(struct crypto_acomp *tfm)
> +{
> + return crypto_comp_alg_common(tfm)->base.cra_flags &
> + CRYPTO_ALG_ASYNC;
> +}
> +
> static inline struct crypto_acomp *crypto_acomp_reqtfm(struct acomp_req *req)
> {
> return __crypto_acomp_tfm(req->base.tfm);
> --
> 2.34.1
>

2024-02-24 16:54:04

by Chris Li

Subject: Re: [PATCH v6 2/2] mm/zswap: remove the memcpy if acomp is not sleepable

Acked-by: Chris Li <[email protected]>

Chris

On Thu, Feb 22, 2024 at 12:12 AM Barry Song <[email protected]> wrote:
>
> From: Barry Song <[email protected]>
>
> Most compressors are actually CPU-based and won't sleep during
> compression and decompression. We should remove the redundant
> memcpy for them.
> This patch checks if the algorithm is sleepable by testing the
> CRYPTO_ALG_ASYNC algorithm flag.
> Generally speaking, async and sleepable are semantically similar
> but not equal. But for compress drivers, they are basically equal
> at least due to the below facts.
> Firstly, scompress drivers - crypto/deflate.c, lz4.c, zstd.c,
> lzo.c etc have no sleep. Secondly, zRAM has been using these
> scompress drivers for years in atomic contexts, and never
> worried those drivers going to sleep.
> One exception is that an async driver can sometimes still return
> synchronously per Herbert's clarification. In this case, we are
> still having a redundant memcpy. But we can't know if one
> particular acomp request will sleep or not unless crypto can
> expose more details for each specific request from offload
> drivers.
>
> Signed-off-by: Barry Song <[email protected]>
> Tested-by: Chengming Zhou <[email protected]>
> Reviewed-by: Nhat Pham <[email protected]>
> Acked-by: Yosry Ahmed <[email protected]>
> Reviewed-by: Chengming Zhou <[email protected]>
> ---
> mm/zswap.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 011e068eb355..de3c9e30bed7 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -162,6 +162,7 @@ struct crypto_acomp_ctx {
> struct crypto_wait wait;
> u8 *buffer;
> struct mutex mutex;
> + bool is_sleepable;
> };
>
> /*
> @@ -950,6 +951,7 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
> goto acomp_fail;
> }
> acomp_ctx->acomp = acomp;
> + acomp_ctx->is_sleepable = acomp_is_async(acomp);
>
> req = acomp_request_alloc(acomp_ctx->acomp);
> if (!req) {
> @@ -1077,7 +1079,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct page *page)
> mutex_lock(&acomp_ctx->mutex);
>
> src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> - if (!zpool_can_sleep_mapped(zpool)) {
> + if (acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) {
> memcpy(acomp_ctx->buffer, src, entry->length);
> src = acomp_ctx->buffer;
> zpool_unmap_handle(zpool, entry->handle);
> @@ -1091,7 +1093,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct page *page)
> BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
> mutex_unlock(&acomp_ctx->mutex);
>
> - if (zpool_can_sleep_mapped(zpool))
> + if (!acomp_ctx->is_sleepable || zpool_can_sleep_mapped(zpool))
> zpool_unmap_handle(zpool, entry->handle);
> }
>
> --
> 2.34.1
>

2024-03-08 11:57:58

by Barry Song

Subject: Re: [PATCH v6 0/2] zswap: remove the memcpy if acomp is not sleepable

Hi Andrew,

On Thu, Feb 22, 2024 at 4:11 PM Barry Song <[email protected]> wrote:
>
> From: Barry Song <[email protected]>
>
> In zswap, if we use zsmalloc, we cannot sleep while we map the
> compressed memory, so we copy it to a temporary buffer. By
> knowing the alg won't sleep can help zswap to avoid the
> memcpy.
> Thus we introduce an API in crypto to expose if acomp is async,
> and zswap can use it to decide if it can remove copying to the
> tmp buffer.
>
> -v6:
> * add acked-by of Herbert, Thanks!
> * remove patch 3/3 from the series, as that one will go
> through crypto

Can you please pull this into mm-tree? This used to have 3 patches.

Patch 3/3 was split out at Herbert's request and is already in
the crypto tree:
crypto: scomp - remove memcpy if sg_nents is 1 and pages are lowmem
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=77292bb8ca

Two driver fixes (which patch 1 needs) are also already in the crypto tree:
crypto: hisilicon/zip - fix the missing CRYPTO_ALG_ASYNC in cra_flags
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=db8ac88385

crypto: iaa - fix the missing CRYPTO_ALG_ASYNC in cra_flags
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=30dd94dba35

So it should be quite safe to pull this series into mm-tree now.

>
> Barry Song (2):
> crypto: introduce: acomp_is_async to expose if comp drivers might
> sleep
> mm/zswap: remove the memcpy if acomp is not sleepable
>
> include/crypto/acompress.h | 6 ++++++
> mm/zswap.c | 6 ++++--
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> --
> 2.34.1

Thanks
Barry

2024-03-09 03:23:43

by Andrew Morton

Subject: Re: [PATCH v6 0/2] zswap: remove the memcpy if acomp is not sleepable

On Fri, 8 Mar 2024 19:57:38 +0800 Barry Song <[email protected]> wrote:

> Hi Andrew,
>
> On Thu, Feb 22, 2024 at 4:11 PM Barry Song <[email protected]> wrote:
> >
> > From: Barry Song <[email protected]>
> >
> > In zswap, if we use zsmalloc, we cannot sleep while we map the
> > compressed memory, so we copy it to a temporary buffer. By
> > knowing the alg won't sleep can help zswap to avoid the
> > memcpy.
> > Thus we introduce an API in crypto to expose if acomp is async,
> > and zswap can use it to decide if it can remove copying to the
> > tmp buffer.
> >
> > -v6:
> > * add acked-by of Herbert, Thanks!
> > * remove patch 3/3 from the series, as that one will go
> > through crypto
>
> Can you please pull this into mm-tree? This used to have 3 patches.
>
> 3/3 was separated according to Herbert's requirements and has
> been in a crypto tree.
> crypto: scomp - remove memcpy if sg_nents is 1 and pages are lowmem
> https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=77292bb8ca
>
> Two drivers fixes(patch 1 needs) have also been in crypto tree:
> crypto: hisilicon/zip - fix the missing CRYPTO_ALG_ASYNC in cra_flags
> https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=db8ac88385
>
> crypto: iaa - fix the missing CRYPTO_ALG_ASYNC in cra_flags
> https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=30dd94dba35
>
> So it should be quite safe to pull this series into mm-tree now.

But this zswap change requires the presence of the other patches, yes?

So the mm.git tree alone will be buggy? And if mm.git merges ahead of
the other trees, there will be a window where mainline will be buggy?

If so, I think it would be better to merge the zswap patch in the next
merge window.


2024-03-09 03:59:01

by Barry Song

Subject: Re: [PATCH v6 0/2] zswap: remove the memcpy if acomp is not sleepable

On Sat, Mar 9, 2024 at 11:23 AM Andrew Morton <[email protected]> wrote:
>
> On Fri, 8 Mar 2024 19:57:38 +0800 Barry Song <[email protected]> wrote:
>
> > Hi Andrew,
> >
> > On Thu, Feb 22, 2024 at 4:11 PM Barry Song <[email protected]> wrote:
> > >
> > > From: Barry Song <[email protected]>
> > >
> > > In zswap, if we use zsmalloc, we cannot sleep while we map the
> > > compressed memory, so we copy it to a temporary buffer. By
> > > knowing the alg won't sleep can help zswap to avoid the
> > > memcpy.
> > > Thus we introduce an API in crypto to expose if acomp is async,
> > > and zswap can use it to decide if it can remove copying to the
> > > tmp buffer.
> > >
> > > -v6:
> > > * add acked-by of Herbert, Thanks!
> > > * remove patch 3/3 from the series, as that one will go
> > > through crypto
> >
> > Can you please pull this into mm-tree? This used to have 3 patches.
> >
> > 3/3 was separated according to Herbert's requirements and has
> > been in a crypto tree.
> > crypto: scomp - remove memcpy if sg_nents is 1 and pages are lowmem
> > https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=77292bb8ca
> >
> > Two drivers fixes(patch 1 needs) have also been in crypto tree:
> > crypto: hisilicon/zip - fix the missing CRYPTO_ALG_ASYNC in cra_flags
> > https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=db8ac88385
> >
> > crypto: iaa - fix the missing CRYPTO_ALG_ASYNC in cra_flags
> > https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=30dd94dba35
> >
> > So it should be quite safe to pull this series into mm-tree now.
>
> But this zswap change requires the presence of the other patches, yes?

As far as I understand, we rely on two driver fixes because those drivers didn't
set the correct cra_flags needed by our patch1. Without those fixes implemented,
two platforms might encounter issues: Intel with IAA (Intel Analytics
Accelerator) and Hisilicon with ZIP. Other platforms should be unaffected.

The two driver fixes have been merged into the crypto tree.

>
> So the mm.git tree alone will be buggy? And if mm.git merges ahead of
> the other trees, there will be a window where mainline will be buggy?

Before 6.9-rc1, there might be issues if mm enters Linus' tree before Herbert's
crypto tree. However, by 6.9-rc1, everything should be fine.

>
> If so, I think it would be better to merge the zswap patch in the next
> merge window.
>

Okay, I understand. Since this patch improves zswap's performance, I wanted
it to be integrated sooner to contribute. However, I'm perfectly willing to
respect your concerns and adhere to the community's best practices.

So please handle it in the best way you think :-)

Thanks
Barry
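
[Editor's illustration, not part of the thread] The dependency on the two driver fixes discussed above can be sketched as follows: zswap only sees the advertised flag, so an offload driver that may sleep but forgets to set the async bit would be treated as safe to call with the zpool mapping still held. Everything below (MOCK_ALG_ASYNC, struct mock_driver, skips_copy(), is_hazard()) is hypothetical and exists only to make that reasoning concrete; the may_sleep field stands for ground truth that the real kernel has no way to query.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's CRYPTO_ALG_ASYNC bit; the value is illustrative. */
#define MOCK_ALG_ASYNC 0x80u

/* Hypothetical driver model: advertised flags plus its real behaviour. */
struct mock_driver {
	const char *name;
	uint32_t cra_flags; /* what the driver advertises */
	bool may_sleep;     /* what the driver actually does */
};

/* zswap-like decision: the bounce copy is skipped when the async bit is clear. */
static bool skips_copy(const struct mock_driver *d)
{
	return (d->cra_flags & MOCK_ALG_ASYNC) == 0;
}

/* Hazard: the copy is skipped although the driver may sleep while mapped. */
static bool is_hazard(const struct mock_driver *d)
{
	return skips_copy(d) && d->may_sleep;
}
```

Under this model, a software compressor (flag clear, never sleeps) and a correctly flagged offload driver are both fine; only the offload driver with the missing flag is a hazard, which is the situation the hisilicon/zip and iaa fixes address.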

2024-03-09 04:37:02

by Andrew Morton

Subject: Re: [PATCH v6 0/2] zswap: remove the memcpy if acomp is not sleepable

On Sat, 9 Mar 2024 11:58:39 +0800 Barry Song <[email protected]> wrote:

> > >
> > > So it should be quite safe to pull this series into mm-tree now.
> >
> > But this zswap change requires the presence of the other patches, yes?
>
> As far as I understand, we rely on two driver fixes because those drivers didn't
> set the correct cra_flags needed by our patch1. Without those fixes implemented,
> two platforms might encounter issues: Intel with IAA (Intel Analytics
> Accelerator)
> and Hisilicon with ZIP. Other platforms should be unaffected.
>
> The two driver fixes have been merged into the crypto tree.
>
> >
> > So the mm.git tree alone will be buggy? And if mm.git merges ahead of
> > the other trees, there will be a window where mainline will be buggy?
>
> Before 6.9-rc1, there might be issues if mm enters Linus' tree before Herbert's
> crypto tree. However, by 6.9-rc1, everything should be fine.
>
> >
> > If so, I think it would be better to merge the zswap patch in the next
> > merge window.
> >
>
> Okay, I understand. Since this patch improves zswap's performance, I wanted
> it to be integrated sooner to contribute. However, I'm perfectly willing to
> respect your concerns and adhere to the community's best practices.
>

OK. I very much doubt if anyone is running those drivers on mm.git, so
adding it now isn't likely to hurt.

So I'll merge it now and shall aim to get it upstream very late in the
next merge window.

2024-03-09 04:42:58

by Andrew Morton

Subject: Re: [PATCH v6 0/2] zswap: remove the memcpy if acomp is not sleepable

On Fri, 8 Mar 2024 20:36:41 -0800 Andrew Morton <[email protected]> wrote:

> > Okay, I understand. Since this patch improves zswap's performance, I wanted
> > it to be integrated sooner to contribute. However, I'm perfectly willing to
> > respect your concerns and adhere to the community's best practices.
> >
>
> OK. I very much doubt if anyone is running those drivers on mm.git, so
> adding it now isn't likely to hurt.
>
> So I'll merge it now and shall aim to get it upstream very late in the
> next merge window.

Nope. mm.git won't build without acomp_is_async().

We can merge the zswap patch via the crypto tree. Acked-by: me.

Or please just resend the zswap change after 6.9-rc1 is released.



2024-03-09 05:08:34

by Barry Song

Subject: Re: [PATCH v6 0/2] zswap: remove the memcpy if acomp is not sleepable

On Sat, Mar 9, 2024 at 12:56 PM Barry Song <[email protected]> wrote:
>
> On Sat, Mar 9, 2024 at 12:42 PM Andrew Morton <[email protected]> wrote:
> >
> > On Fri, 8 Mar 2024 20:36:41 -0800 Andrew Morton <[email protected]> wrote:
> >
> > > > Okay, I understand. Since this patch improves zswap's performance, I wanted
> > > > it to be integrated sooner to contribute. However, I'm perfectly willing to
> > > > respect your concerns and adhere to the community's best practices.
> > > >
> > >
> > > OK. I very much doubt if anyone is running those drivers on mm.git, so
> > > adding it now isn't likely to hurt.
> > >
> > > So I'll merge it now and shall aim to get it upstream very late in the
> > > next merge window.
> >
> > Nope. mm.git won't build without acomp_is_async().
> >
> > We can merge the zswap patch via the crypto tree. Acked-by: me.
>
> Herbert Acked the acomp_is_async() patch in v5 instead of picking it up
> into crypto:
> https://lore.kernel.org/linux-mm/[email protected]/

More details: Herbert acked the acomp_is_async in v5 [1], while he requested
that patch 3/3 of v5 be split from the series and applied by crypto [2].
Patch 3/3 of v5 can function independently of 1/3 and 2/3, and it has already
been included in the crypto tree.

That is why v6 has only two left.

[1] https://lore.kernel.org/linux-mm/[email protected]/
[2] https://lore.kernel.org/linux-mm/ZdWLim6zYSl%[email protected]/


>
> >
> > Or please just resend the zswap change after 6.9-rc1 is released.

Thanks
Barry