2017-09-14 13:59:41

by Vitaly Wool

[permalink] [raw]
Subject: [PATCH] z3fold: fix stale list handling

Fix the situation when clear_bit() is called for page->private before
the page pointer is actually assigned. While at it, remove work_busy()
check because it is costly and does not give 100% guarantee anyway.

Signed-of-by: Vitaly Wool <[email protected]>
---
mm/z3fold.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index b04fa3ba1bf2..b2ba2ba585f3 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -250,6 +250,7 @@ static void __release_z3fold_page(struct z3fold_header *zhdr, bool locked)

WARN_ON(!list_empty(&zhdr->buddy));
set_bit(PAGE_STALE, &page->private);
+ clear_bit(NEEDS_COMPACTING, &page->private);
spin_lock(&pool->lock);
if (!list_empty(&page->lru))
list_del(&page->lru);
@@ -303,7 +304,6 @@ static void free_pages_work(struct work_struct *w)
list_del(&zhdr->buddy);
if (WARN_ON(!test_bit(PAGE_STALE, &page->private)))
continue;
- clear_bit(NEEDS_COMPACTING, &page->private);
spin_unlock(&pool->stale_lock);
cancel_work_sync(&zhdr->work);
free_z3fold_page(page);
@@ -624,10 +624,8 @@ static int z3fold_alloc(struct z3fold_pool *pool, size_t size, gfp_t gfp,
* stale pages list. cancel_work_sync() can sleep so we must make
* sure it won't be called in case we're in atomic context.
*/
- if (zhdr && (can_sleep || !work_pending(&zhdr->work) ||
- !unlikely(work_busy(&zhdr->work)))) {
+ if (zhdr && (can_sleep || !work_pending(&zhdr->work))) {
list_del(&zhdr->buddy);
- clear_bit(NEEDS_COMPACTING, &page->private);
spin_unlock(&pool->stale_lock);
if (can_sleep)
cancel_work_sync(&zhdr->work);
--
2.11.0


2017-09-14 21:15:34

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] z3fold: fix stale list handling

On Thu, 14 Sep 2017 15:59:36 +0200 Vitaly Wool <[email protected]> wrote:

> Fix the situation when clear_bit() is called for page->private before
> the page pointer is actually assigned. While at it, remove work_busy()
> check because it is costly and does not give 100% guarantee anyway.

Does this fix https://bugzilla.kernel.org/show_bug.cgi?id=196877 ? If
so, the bugzilla references and a reported-by should be added.

What are the end-user visible effects of the bug? Please always
include this info when fixing bugs.

Should this fix be backported into -stable kernels?

2017-09-15 08:34:57

by Vitaly Wool

[permalink] [raw]
Subject: Re: [PATCH] z3fold: fix stale list handling

Hi Andrew,

2017-09-14 23:15 GMT+02:00 Andrew Morton <[email protected]>:
> On Thu, 14 Sep 2017 15:59:36 +0200 Vitaly Wool <[email protected]> wrote:
>
>> Fix the situation when clear_bit() is called for page->private before
>> the page pointer is actually assigned. While at it, remove work_busy()
>> check because it is costly and does not give 100% guarantee anyway.
>
> Does this fix https://bugzilla.kernel.org/show_bug.cgi?id=196877 ? If
> so, the bugzilla references and a reported-by should be added.

I wish it did but it doesn't. The bug you are referring to happens
with the "unbuddied" list, and the current version of
z3fold_reclaim_page() just doesn't have that code.
This patch fixes the processing of "stale" lists, with stale lists
having been introduced with the per-CPU unbuddied lists patch, which
is pretty recent.
To fix https://bugzilla.kernel.org/show_bug.cgi?id=196877, we'll have
to either backport per-CPU unbuddied lists plus the two fixes, or
propose a separate fix.

> What are the end-user visible effects of the bug? Please always
> include this info when fixing bugs.

If page is NULL, clear_bit for page->private will result in a kernel crash.

> Should this fix be backported into -stable kernels?

No, this patch fixes the code that is not in any released kernel yet.

~vitaly