LinuxLists.cc - [PATCH] zswap: do not allocate from atomic pool

2022-11-22 02:17:19

Subject: [PATCH] zswap: do not allocate from atomic pool

zswap_frontswap_load() should be called from preemptible
context (we even call mutex_lock() there) and it does not
look like we need to do GFP_ATOMIC allocaion for temp
buffer there. Use GFP_KERNEL instead.

Signed-off-by: Sergey Senozhatsky <[email protected]>
---
mm/zswap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 2d69c1d678fe..f6c89049cf70 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1314,7 +1314,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
}

if (!zpool_can_sleep_mapped(entry->pool->zpool)) {
- tmp = kmalloc(entry->length, GFP_ATOMIC);
+ tmp = kmalloc(entry->length, GFP_KERNEL);
if (!tmp) {
ret = -ENOMEM;
goto freeentry;
--
2.38.1.584.g0f3c55d4c2-goog

2022-11-22 02:34:54

by Andrew Morton

[permalink] [raw]

Subject: Re: [PATCH] zswap: do not allocate from atomic pool

On Tue, 22 Nov 2022 10:33:38 +0900 Sergey Senozhatsky <[email protected]> wrote:

> zswap_frontswap_load() should be called from preemptible
> context (we even call mutex_lock() there) and it does not
> look like we need to do GFP_ATOMIC allocaion for temp
> buffer there. Use GFP_KERNEL instead.
>
> ...
>
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1314,7 +1314,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> }
>
> if (!zpool_can_sleep_mapped(entry->pool->zpool)) {
> - tmp = kmalloc(entry->length, GFP_ATOMIC);
> + tmp = kmalloc(entry->length, GFP_KERNEL);
> if (!tmp) {
> ret = -ENOMEM;
> goto freeentry;

It seems strange to do

if (! can sleep)
do something which can sleep

or am I misreading the intent of zpool_driver.sleep_mapped? If so,
perhaps some explanatory code comments will help.

2022-11-22 03:05:50

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH] zswap: do not allocate from atomic pool

On (22/11/21 17:56), Andrew Morton wrote:
> > zswap_frontswap_load() should be called from preemptible
> > context (we even call mutex_lock() there) and it does not
> > look like we need to do GFP_ATOMIC allocaion for temp
> > buffer there. Use GFP_KERNEL instead.
> >
> > ...
> >
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -1314,7 +1314,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> > }
> >
> > if (!zpool_can_sleep_mapped(entry->pool->zpool)) {
> > - tmp = kmalloc(entry->length, GFP_ATOMIC);
> > + tmp = kmalloc(entry->length, GFP_KERNEL);
> > if (!tmp) {
> > ret = -ENOMEM;
> > goto freeentry;
>
> It seems strange to do

It does indeed.

> if (! can sleep)
> do something which can sleep

Some allocators enter the non-preemptible context in ->map callback and
exit that context in ->unmap. zswap uses async compression and needs to
wait for crypto to finish decompression of the mapped object, which it
cannot do when allocator disables preemption in ->map. So for allocators
that do this zswap allocates a temp buffer, then maps the object, copies
it to that temp buffer and unmaps the objects. So now it can wait for
async crypto because we are in preemptible context and we use the temp
copy of the object.

> or am I misreading the intent of zpool_driver.sleep_mapped? If so,
> perhaps some explanatory code comments will help.

I can add one. I assume at mm/zpool.c zpool_can_sleep_mapped() would be
the right place.

2022-11-22 03:27:00

by Johannes Weiner

[permalink] [raw]

Subject: Re: [PATCH] zswap: do not allocate from atomic pool

On Tue, Nov 22, 2022 at 10:33:38AM +0900, Sergey Senozhatsky wrote:
> zswap_frontswap_load() should be called from preemptible
> context (we even call mutex_lock() there) and it does not
> look like we need to do GFP_ATOMIC allocaion for temp
> buffer there. Use GFP_KERNEL instead.
>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---
> mm/zswap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 2d69c1d678fe..f6c89049cf70 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1314,7 +1314,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> }
>
> if (!zpool_can_sleep_mapped(entry->pool->zpool)) {
> - tmp = kmalloc(entry->length, GFP_ATOMIC);
> + tmp = kmalloc(entry->length, GFP_KERNEL);

There is another one in zswap_writeback_entry() that seems equally
arbitrary. They came in through the same commit, with no further
explanation as to this choice. Do you want to pick that up too?

2022-11-22 04:02:56

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH] zswap: do not allocate from atomic pool

On (22/11/21 22:16), Johannes Weiner wrote:
> On Tue, Nov 22, 2022 at 10:33:38AM +0900, Sergey Senozhatsky wrote:
> > zswap_frontswap_load() should be called from preemptible
> > context (we even call mutex_lock() there) and it does not
> > look like we need to do GFP_ATOMIC allocaion for temp
> > buffer there. Use GFP_KERNEL instead.
> >
> > Signed-off-by: Sergey Senozhatsky <[email protected]>
> > ---
> > mm/zswap.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 2d69c1d678fe..f6c89049cf70 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -1314,7 +1314,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> > }
> >
> > if (!zpool_can_sleep_mapped(entry->pool->zpool)) {
> > - tmp = kmalloc(entry->length, GFP_ATOMIC);
> > + tmp = kmalloc(entry->length, GFP_KERNEL);
>
> There is another one in zswap_writeback_entry() that seems equally
> arbitrary. They came in through the same commit, with no further
> explanation as to this choice. Do you want to pick that up too?

Yup, you patch it in https://lore.kernel.org/lkml/[email protected]/
and I guess it's there just by accident. We probably want a separate
patch instead that touches those GFP_ATOMIC allocations in both places.

So I have it like this at present

---

From 66e4acffc0926498f818d11f2f57b9c772131f6e Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <[email protected]>
Date: Tue, 22 Nov 2022 10:26:13 +0900
Subject: [PATCH] zswap: do not allocate from atomic pool

zswap_frontswap_load() should be called from preemptible
context (we even call mutex_lock() there) and it does not
look like we need to do GFP_ATOMIC allocaion for temp
buffer. The same applies to zswap_writeback_entry().

Use GFP_KERNEL for temporary buffer allocation in both
cases.

Signed-off-by: Johannes Weiner <[email protected]>
Signed-off-by: Nhat Pham <[email protected]>
Signed-off-by: Sergey Senozhatsky <[email protected]>
---
mm/zpool.c | 7 +++++++
mm/zswap.c | 4 ++--
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/zpool.c b/mm/zpool.c
index 68facc193496..f46c0d5e766c 100644
--- a/mm/zpool.c
+++ b/mm/zpool.c
@@ -387,6 +387,13 @@ bool zpool_evictable(struct zpool *zpool)
* zpool_can_sleep_mapped - Test if zpool can sleep when do mapped.
* @zpool: The zpool to test
*
+ * Some allocators enter non-preemptible context in ->map() callback (e.g.
+ * disable pagefaults) and exit that context in ->unmap(), which limits what
+ * we can do with the mapped object. For instance, we cannot wait for
+ * asynchronous crypto API to decompress such an object or take mutexes
+ * since those will call into the scheduler. This function tells us whether
+ * we use such an allocator.
+ *
* Returns: true if zpool can sleep; false otherwise.
*/
bool zpool_can_sleep_mapped(struct zpool *zpool)
diff --git a/mm/zswap.c b/mm/zswap.c
index 2d48fd59cc7a..3019f0bde194 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -958,7 +958,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
};

if (!zpool_can_sleep_mapped(pool)) {
- tmp = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+ tmp = kmalloc(PAGE_SIZE, GFP_KERNEL);
if (!tmp)
return -ENOMEM;
}
@@ -1311,7 +1311,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
}

if (!zpool_can_sleep_mapped(entry->pool->zpool)) {
- tmp = kmalloc(entry->length, GFP_ATOMIC);
+ tmp = kmalloc(entry->length, GFP_KERNEL);
if (!tmp) {
ret = -ENOMEM;
goto freeentry;
--
2.38.1.584.g0f3c55d4c2-goog

2022-11-24 03:59:36

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH] zswap: do not allocate from atomic pool

On (22/11/22 12:30), Sergey Senozhatsky wrote:
> zswap_frontswap_load() should be called from preemptible
> context (we even call mutex_lock() there) and it does not
> look like we need to do GFP_ATOMIC allocaion for temp
> buffer. The same applies to zswap_writeback_entry().
>
> Use GFP_KERNEL for temporary buffer allocation in both
> cases.
>
> Signed-off-by: Johannes Weiner <[email protected]>
> Signed-off-by: Nhat Pham <[email protected]>
> Signed-off-by: Sergey Senozhatsky <[email protected]>
> ---

Folks, how do we want to proceed with this? One of the hunks here
conflicts with https://lore.kernel.org/lkml/[email protected]/

Do we want to remove conflicting hunk from "[PATCH 1/6] zswap: fix writeback
lock ordering for zsmalloc" and pick this patch up?

> diff --git a/mm/zpool.c b/mm/zpool.c
> index 68facc193496..f46c0d5e766c 100644
> --- a/mm/zpool.c
> +++ b/mm/zpool.c
> @@ -387,6 +387,13 @@ bool zpool_evictable(struct zpool *zpool)
> * zpool_can_sleep_mapped - Test if zpool can sleep when do mapped.
> * @zpool: The zpool to test
> *
> + * Some allocators enter non-preemptible context in ->map() callback (e.g.
> + * disable pagefaults) and exit that context in ->unmap(), which limits what
> + * we can do with the mapped object. For instance, we cannot wait for
> + * asynchronous crypto API to decompress such an object or take mutexes
> + * since those will call into the scheduler. This function tells us whether
> + * we use such an allocator.
> + *
> * Returns: true if zpool can sleep; false otherwise.
> */
> bool zpool_can_sleep_mapped(struct zpool *zpool)
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 2d48fd59cc7a..3019f0bde194 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -958,7 +958,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
> };
>
> if (!zpool_can_sleep_mapped(pool)) {
> - tmp = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + tmp = kmalloc(PAGE_SIZE, GFP_KERNEL);
> if (!tmp)
> return -ENOMEM;
> }
> @@ -1311,7 +1311,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> }
>
> if (!zpool_can_sleep_mapped(entry->pool->zpool)) {
> - tmp = kmalloc(entry->length, GFP_ATOMIC);
> + tmp = kmalloc(entry->length, GFP_KERNEL);
> if (!tmp) {
> ret = -ENOMEM;
> goto freeentry;
> --
> 2.38.1.584.g0f3c55d4c2-goog
>

2022-11-24 04:43:23

by Andrew Morton

[permalink] [raw]

Subject: Re: [PATCH] zswap: do not allocate from atomic pool

On Thu, 24 Nov 2022 12:32:45 +0900 Sergey Senozhatsky <[email protected]> wrote:

> Folks, how do we want to proceed with this? One of the hunks here
> conflicts with https://lore.kernel.org/lkml/[email protected]/
>
> Do we want to remove conflicting hunk from "[PATCH 1/6] zswap: fix writeback
> lock ordering for zsmalloc" and pick this patch up?
>

The "Implement writeback for zsmalloc" series is clearly due for one or
more new versions, so I will drop that series and I will apply "zswap: do
not allocate from atomic pool". Let's ask Nhat Pham to prepare future
revisions of the "Implement writeback for zsmalloc" series against
mm-unstable.

2022-11-24 05:15:36

by Nhat Pham

[permalink] [raw]

Subject: Re: [PATCH] zswap: do not allocate from atomic pool

On Wed, Nov 23, 2022 at 7:42 PM Andrew Morton <[email protected]> wrote:
>
> On Thu, 24 Nov 2022 12:32:45 +0900 Sergey Senozhatsky <[email protected]> wrote:
>
> > Folks, how do we want to proceed with this? One of the hunks here
> > conflicts with https://lore.kernel.org/lkml/[email protected]/
> >
> > Do we want to remove conflicting hunk from "[PATCH 1/6] zswap: fix writeback
> > lock ordering for zsmalloc" and pick this patch up?
> >
>
> The "Implement writeback for zsmalloc" series is clearly due for one or
> more new versions, so I will drop that series and I will apply "zswap: do
> not allocate from atomic pool". Let's ask Nhat Pham to prepare future
> revisions of the "Implement writeback for zsmalloc" series against
> mm-unstable.

Will do! Apologies for the constant churning - but this should be a
quick and easy change
from my end. v7 should be out next week. Have a nice Thanksgiving everyone!