2019-06-10 08:18:37

by Naoya Horiguchi

[permalink] [raw]
Subject: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

The pass/fail of soft offline should be judged by checking whether the
raw error page was finally contained or not (i.e. the result of
set_hwpoison_free_buddy_page()), but current code do not work like that.
So this patch is suggesting to fix it.

Signed-off-by: Naoya Horiguchi <[email protected]>
Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
Cc: <[email protected]> # v4.19+
---
mm/memory-failure.c | 2 ++
1 file changed, 2 insertions(+)

diff --git v5.2-rc3/mm/memory-failure.c v5.2-rc3_patched/mm/memory-failure.c
index fc8b517..7ea485e 100644
--- v5.2-rc3/mm/memory-failure.c
+++ v5.2-rc3_patched/mm/memory-failure.c
@@ -1733,6 +1733,8 @@ static int soft_offline_huge_page(struct page *page, int flags)
if (!ret) {
if (set_hwpoison_free_buddy_page(page))
num_poisoned_pages_inc();
+ else
+ ret = -EBUSY;
}
}
return ret;
--
2.7.0


2019-06-10 21:23:47

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

On Mon, 10 Jun 2019 17:18:05 +0900 Naoya Horiguchi <[email protected]> wrote:

> The pass/fail of soft offline should be judged by checking whether the
> raw error page was finally contained or not (i.e. the result of
> set_hwpoison_free_buddy_page()), but current code do not work like that.
> So this patch is suggesting to fix it.

Please describe the user-visible runtime effects of this change?

2019-06-10 22:53:08

by Naoya Horiguchi

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

On Mon, Jun 10, 2019 at 02:20:33PM -0700, Andrew Morton wrote:
> On Mon, 10 Jun 2019 17:18:05 +0900 Naoya Horiguchi <[email protected]> wrote:
>
> > The pass/fail of soft offline should be judged by checking whether the
> > raw error page was finally contained or not (i.e. the result of
> > set_hwpoison_free_buddy_page()), but current code do not work like that.
> > So this patch is suggesting to fix it.
>
> Please describe the user-visible runtime effects of this change?

Sorry, could you replace the description as follows (I inserted one sentence)?

The pass/fail of soft offline should be judged by checking whether the
raw error page was finally contained or not (i.e. the result of
set_hwpoison_free_buddy_page()), but current code do not work like that.
It might lead us to misjudge the test result when
set_hwpoison_free_buddy_page() fails. So this patch is suggesting to
fix it.

Thanks,
Naoya Horiguchi

2019-06-11 00:20:32

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

On 6/10/19 1:18 AM, Naoya Horiguchi wrote:
> The pass/fail of soft offline should be judged by checking whether the
> raw error page was finally contained or not (i.e. the result of
> set_hwpoison_free_buddy_page()), but current code do not work like that.
> So this patch is suggesting to fix it.
>
> Signed-off-by: Naoya Horiguchi <[email protected]>
> Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
> Cc: <[email protected]> # v4.19+

Reviewed-by: Mike Kravetz <[email protected]>

To follow-up on Andrew's comment/question about user visible effects. Without
this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may not offline the
original page and will not return an error. Are there any other visible
effects?

--
Mike Kravetz

2019-06-11 00:59:08

by Naoya Horiguchi

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

On Mon, Jun 10, 2019 at 05:19:45PM -0700, Mike Kravetz wrote:
> On 6/10/19 1:18 AM, Naoya Horiguchi wrote:
> > The pass/fail of soft offline should be judged by checking whether the
> > raw error page was finally contained or not (i.e. the result of
> > set_hwpoison_free_buddy_page()), but current code do not work like that.
> > So this patch is suggesting to fix it.
> >
> > Signed-off-by: Naoya Horiguchi <[email protected]>
> > Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
> > Cc: <[email protected]> # v4.19+
>
> Reviewed-by: Mike Kravetz <[email protected]>

Thank you, Mike.

>
> To follow-up on Andrew's comment/question about user visible effects. Without
> this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may not offline the
> original page and will not return an error.

Yes, that's right.

> Are there any other visible
> effects?

I can't think of other ones.

- Naoya

2019-06-11 08:16:25

by Anshuman Khandual

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails



On 06/11/2019 06:27 AM, Naoya Horiguchi wrote:
> On Mon, Jun 10, 2019 at 05:19:45PM -0700, Mike Kravetz wrote:
>> On 6/10/19 1:18 AM, Naoya Horiguchi wrote:
>>> The pass/fail of soft offline should be judged by checking whether the
>>> raw error page was finally contained or not (i.e. the result of
>>> set_hwpoison_free_buddy_page()), but current code do not work like that.
>>> So this patch is suggesting to fix it.
>>>
>>> Signed-off-by: Naoya Horiguchi <[email protected]>
>>> Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
>>> Cc: <[email protected]> # v4.19+
>>
>> Reviewed-by: Mike Kravetz <[email protected]>
>
> Thank you, Mike.
>
>>
>> To follow-up on Andrew's comment/question about user visible effects. Without
>> this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may not offline the
>> original page and will not return an error.
>
> Yes, that's right.

Then should this be included in the commit message as well ?

2019-06-12 08:35:30

by Naoya Horiguchi

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

On Tue, Jun 11, 2019 at 01:44:46PM +0530, Anshuman Khandual wrote:
>
>
> On 06/11/2019 06:27 AM, Naoya Horiguchi wrote:
> > On Mon, Jun 10, 2019 at 05:19:45PM -0700, Mike Kravetz wrote:
> >> On 6/10/19 1:18 AM, Naoya Horiguchi wrote:
> >>> The pass/fail of soft offline should be judged by checking whether the
> >>> raw error page was finally contained or not (i.e. the result of
> >>> set_hwpoison_free_buddy_page()), but current code do not work like that.
> >>> So this patch is suggesting to fix it.
> >>>
> >>> Signed-off-by: Naoya Horiguchi <[email protected]>
> >>> Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
> >>> Cc: <[email protected]> # v4.19+
> >>
> >> Reviewed-by: Mike Kravetz <[email protected]>
> >
> > Thank you, Mike.
> >
> >>
> >> To follow-up on Andrew's comment/question about user visible effects. Without
> >> this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may not offline the
> >> original page and will not return an error.
> >
> > Yes, that's right.
>
> Then should this be included in the commit message as well ?

Right, I'll clarify the point in the description.

Thanks,
- Naoya