2023-10-06 16:00:50

by Johannes Weiner

Subject: [PATCH] mm: zswap: fix pool refcount bug around shrink_worker()

When a zswap store fails due to the limit, it acquires a pool
reference and queues the shrinker. When the shrinker runs, it drops
the reference. However, there can be multiple store attempts before
the shrinker wakes up and runs once. This results in reference leaks
and eventual saturation warnings for the pool refcount.

Fix this by dropping the reference again when the shrinker is already
queued. This ensures one reference per shrinker run.

Reported-by: Chris Mason <[email protected]>
Fixes: 45190f01dd40 ("mm/zswap.c: add allocation hysteresis if pool limit is hit")
Cc: [email protected] [5.6+]
Cc: Vitaly Wool <[email protected]>
Cc: Domenico Cerasuolo <[email protected]>
Cc: Nhat Pham <[email protected]>
Signed-off-by: Johannes Weiner <[email protected]>
---
mm/zswap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 083c693602b8..37d2b1cb2ecb 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1383,8 +1383,8 @@ bool zswap_store(struct folio *folio)

shrink:
pool = zswap_pool_last_get();
- if (pool)
- queue_work(shrink_wq, &pool->shrink_work);
+ if (pool && !queue_work(shrink_wq, &pool->shrink_work))
+ zswap_pool_put(pool);
goto reject;
}

--
2.42.0
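
For context, the fix relies on queue_work() returning false when the work
item is already pending on the workqueue. A minimal sketch of that "one
reference per queued worker" pattern, with made-up structure and function
names rather than the actual zswap code:

#include <linux/kref.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

/* Illustrative only: pool->shrink_work is assumed to have been set up
 * with INIT_WORK(&pool->shrink_work, shrink_worker) at allocation. */
struct pool {
	struct kref ref;
	struct work_struct shrink_work;
};

static void pool_release(struct kref *ref)
{
	kfree(container_of(ref, struct pool, ref));
}

static void request_shrink(struct pool *pool)
{
	kref_get(&pool->ref);	/* pin one reference for the worker */
	if (!queue_work(system_wq, &pool->shrink_work))
		/* already queued: the pending worker owns the only ref */
		kref_put(&pool->ref, pool_release);
}

static void shrink_worker(struct work_struct *work)
{
	struct pool *pool = container_of(work, struct pool, shrink_work);

	/* ... evict entries ... */
	kref_put(&pool->ref, pool_release);	/* ref taken at queue time */
}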


2023-10-06 21:42:31

by Nhat Pham

Subject: Re: [PATCH] mm: zswap: fix pool refcount bug around shrink_worker()

On Fri, Oct 6, 2023 at 9:00 AM Johannes Weiner <[email protected]> wrote:
>
> When a zswap store fails due to the limit, it acquires a pool
> reference and queues the shrinker. When the shrinker runs, it drops
> the reference. However, there can be multiple store attempts before
> the shrinker wakes up and runs once. This results in reference leaks
> and eventual saturation warnings for the pool refcount.
>
> Fix this by dropping the reference again when the shrinker is already
> queued. This ensures one reference per shrinker run.
>
> Reported-by: Chris Mason <[email protected]>
> Fixes: 45190f01dd40 ("mm/zswap.c: add allocation hysteresis if pool limit is hit")
> Cc: [email protected] [5.6+]
> Cc: Vitaly Wool <[email protected]>
> Cc: Domenico Cerasuolo <[email protected]>
> Cc: Nhat Pham <[email protected]>
> Signed-off-by: Johannes Weiner <[email protected]>
> ---
> mm/zswap.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 083c693602b8..37d2b1cb2ecb 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1383,8 +1383,8 @@ bool zswap_store(struct folio *folio)
>
> shrink:
> pool = zswap_pool_last_get();
> - if (pool)
> - queue_work(shrink_wq, &pool->shrink_work);
> + if (pool && !queue_work(shrink_wq, &pool->shrink_work))
> + zswap_pool_put(pool);
> goto reject;
> }
>
> --
> 2.42.0
>

Acked-by: Nhat Pham <[email protected]>

Random tangent: this asynchronous writeback mechanism has always
seemed kinda weird to me. We could have quite a bit of memory
inversion before the shrinker finally kicks in and frees up zswap
pool space. But I guess if it isn't broken, don't fix it.

Maybe a shrinker that proactively writes pages back as memory
pressure builds up could help ;)
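
As a rough illustration of that idea (not an actual proposal), a
pressure-driven shrinker built on the pre-6.7 shrinker API could look
something like the sketch below; every zswap_wb_* and zswap_*_entries
name is a hypothetical placeholder, not an existing zswap function:

#include <linux/shrinker.h>

/* Hypothetical: report how many stored entries could be written back. */
static unsigned long zswap_wb_count(struct shrinker *s,
				    struct shrink_control *sc)
{
	unsigned long nr = zswap_nr_stored_entries();	/* placeholder */

	return nr ? nr : SHRINK_EMPTY;
}

/* Hypothetical: write back up to nr_to_scan of the coldest entries,
 * returning how many were actually freed from the pool. */
static unsigned long zswap_wb_scan(struct shrinker *s,
				   struct shrink_control *sc)
{
	return zswap_writeback_entries(sc->nr_to_scan);	/* placeholder */
}

static struct shrinker zswap_wb_shrinker = {
	.count_objects	= zswap_wb_count,
	.scan_objects	= zswap_wb_scan,
	.seeks		= DEFAULT_SEEKS,
};

/* at init time: register_shrinker(&zswap_wb_shrinker, "zswap-writeback"); */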