2015-04-21 04:20:44

by Naoya Horiguchi

Subject: [PATCH] mm: soft-offline: fix num_poisoned_pages counting on concurrent events

If multiple soft offline events hit one free page/hugepage concurrently,
soft_offline_page() can handle the free page/hugepage multiple times,
which increments the num_poisoned_pages counter more than once.
This patch fixes the miscounting by checking the return value of
TestSetPageHWPoison() for normal pages and the return value of
dequeue_hwpoisoned_huge_page() for hugepages.

Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: [email protected] # v3.14+
---
# This problem might happen before 3.14, but it's rare and non-critical,
# so I want this patch to be backported to stable trees only if the patch
# cleanly applies (i.e. v3.14+).
---
mm/memory-failure.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git v4.0.orig/mm/memory-failure.c v4.0/mm/memory-failure.c
index 2cc1d578144b..72a5224c8084 100644
--- v4.0.orig/mm/memory-failure.c
+++ v4.0/mm/memory-failure.c
@@ -1721,12 +1721,12 @@ int soft_offline_page(struct page *page, int flags)
 	} else if (ret == 0) { /* for free pages */
 		if (PageHuge(page)) {
 			set_page_hwpoison_huge_page(hpage);
-			dequeue_hwpoisoned_huge_page(hpage);
-			atomic_long_add(1 << compound_order(hpage),
+			if (!dequeue_hwpoisoned_huge_page(hpage))
+				atomic_long_add(1 << compound_order(hpage),
 					&num_poisoned_pages);
 		} else {
-			SetPageHWPoison(page);
-			atomic_long_inc(&num_poisoned_pages);
+			if (!TestSetPageHWPoison(page))
+				atomic_long_inc(&num_poisoned_pages);
 		}
 	}
 	unset_migratetype_isolate(page, MIGRATE_MOVABLE);
--
2.1.0
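
The counting rule for normal pages in the patch above can be illustrated
outside the kernel. The following is only a minimal userspace sketch using
C11 atomics, not kernel code: struct fake_page, num_poisoned,
test_set_hwpoison(), soft_offline_free_page() and demo_repeated_events() are
hypothetical stand-ins for struct page, num_poisoned_pages and
TestSetPageHWPoison(). It assumes the only relevant property is that the
test-and-set returns the previous flag value atomically, so only the first
event to poison a page bumps the counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: a single poison flag. */
struct fake_page {
	atomic_bool hwpoison;	/* models the PG_hwpoison bit */
};

/* Hypothetical stand-in for the global num_poisoned_pages counter. */
static atomic_long num_poisoned;

/* Models TestSetPageHWPoison(): set the flag, return its previous value. */
static bool test_set_hwpoison(struct fake_page *p)
{
	return atomic_exchange(&p->hwpoison, true);
}

/* Models one soft offline event hitting a free normal page. */
static void soft_offline_free_page(struct fake_page *p)
{
	/* Count the page only if this event is the one that poisoned it. */
	if (!test_set_hwpoison(p))
		atomic_fetch_add(&num_poisoned, 1);
}

/*
 * Deliver 'events' soft offline events to the same page and return the
 * resulting counter value. The events are delivered sequentially here;
 * the atomic exchange is what would keep the count correct if they raced.
 */
static long demo_repeated_events(int events)
{
	struct fake_page page;
	int i;

	assert(events >= 0);
	atomic_init(&page.hwpoison, false);
	atomic_store(&num_poisoned, 0);

	for (i = 0; i < events; i++)
		soft_offline_free_page(&page);

	return atomic_load(&num_poisoned);
}
```

With an unconditional set followed by an unconditional increment (the
pre-patch behaviour), three events on one page would leave the counter at 3;
in this sketch the counter stays at 1 however many events arrive.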


2015-04-24 00:43:04

by Dean Nelson

Subject: Re: [PATCH] mm: soft-offline: fix num_poisoned_pages counting on concurrent events

On 04/20/2015 11:18 PM, Naoya Horiguchi wrote:
> If multiple soft offline events hit one free page/hugepage concurrently,
> soft_offline_page() can handle the free page/hugepage multiple times,
> which increments the num_poisoned_pages counter more than once.
> This patch fixes the miscounting by checking the return value of
> TestSetPageHWPoison() for normal pages and the return value of
> dequeue_hwpoisoned_huge_page() for hugepages.
>
> Signed-off-by: Naoya Horiguchi <[email protected]>

Acked-by: Dean Nelson <[email protected]>


> Cc: [email protected] # v3.14+
> ---
> # This problem might happen before 3.14, but it's rare and non-critical,
> # so I want this patch to be backported to stable trees only if the patch
> # cleanly applies (i.e. v3.14+).
> ---
> mm/memory-failure.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git v4.0.orig/mm/memory-failure.c v4.0/mm/memory-failure.c
> index 2cc1d578144b..72a5224c8084 100644
> --- v4.0.orig/mm/memory-failure.c
> +++ v4.0/mm/memory-failure.c
> @@ -1721,12 +1721,12 @@ int soft_offline_page(struct page *page, int flags)
>  	} else if (ret == 0) { /* for free pages */
>  		if (PageHuge(page)) {
>  			set_page_hwpoison_huge_page(hpage);
> -			dequeue_hwpoisoned_huge_page(hpage);
> -			atomic_long_add(1 << compound_order(hpage),
> +			if (!dequeue_hwpoisoned_huge_page(hpage))
> +				atomic_long_add(1 << compound_order(hpage),
>  					&num_poisoned_pages);
>  		} else {
> -			SetPageHWPoison(page);
> -			atomic_long_inc(&num_poisoned_pages);
> +			if (!TestSetPageHWPoison(page))
> +				atomic_long_inc(&num_poisoned_pages);
>  		}
>  	}
>  	unset_migratetype_isolate(page, MIGRATE_MOVABLE);
>