2021-09-09 01:40:31

by Naoya Horiguchi

[permalink] [raw]
Subject: [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()

From: Naoya Horiguchi <[email protected]>

commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
unhandlable pages") changes the return value of __get_hwpoison_page() to
retry for transiently unhandlable cases. However, __get_hwpoison_page()
currently fails to properly judge buddy pages as handlable, so hard/soft
offline for buddy pages always fail as "unhandlable page". This is
totally regrettable.

So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
__get_hwpoison_page() returns different return values between buddy
pages and unhandlable pages as intended.

Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
Cc: <[email protected]>
Signed-off-by: Naoya Horiguchi <[email protected]>
---
mm/memory-failure.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
index 60df8fcd0444..3416c55be810 100644
--- v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c
+++ v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
@@ -1126,7 +1126,7 @@ static int page_action(struct page_state *ps, struct page *p,
*/
static inline bool HWPoisonHandlable(struct page *page)
{
- return PageLRU(page) || __PageMovable(page);
+ return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page);
}

static int __get_hwpoison_page(struct page *page)
--
2.25.1


2021-09-09 03:25:13

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()

On Wed, Sep 8, 2021 at 5:41 PM Naoya Horiguchi
<[email protected]> wrote:
>
> From: Naoya Horiguchi <[email protected]>
>
> commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
> unhandlable pages") changes the return value of __get_hwpoison_page() to
> retry for transiently unhandlable cases. However, __get_hwpoison_page()
> currently fails to properly judge buddy pages as handlable, so hard/soft
> offline for buddy pages always fail as "unhandlable page". This is
> totally regrettable.
>
> So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
> __get_hwpoison_page() returns different return values between buddy
> pages and unhandlable pages as intended.
>
> Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
> Cc: <[email protected]>
> Signed-off-by: Naoya Horiguchi <[email protected]>
> ---
> mm/memory-failure.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> index 60df8fcd0444..3416c55be810 100644
> --- v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c
> +++ v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> @@ -1126,7 +1126,7 @@ static int page_action(struct page_state *ps, struct page *p,
> */
> static inline bool HWPoisonHandlable(struct page *page)
> {
> - return PageLRU(page) || __PageMovable(page);
> + return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page);

It seems to work. Although this may change the return value of
get_any_page() to 1 when MF_COUNT_INCREASED is set. This may cause
soft offline to mishandle free buddy page, but MF_COUNT_INCREASED is
only set when madvise is used, and madvise definitely can't soft
offline free buddy page. It did take me a while to figure out this
trick. Maybe need some refactor?

Anyway this patch looks fine to me. Reviewed-by: Yang Shi <[email protected]>

> }
>
> static int __get_hwpoison_page(struct page *page)
> --
> 2.25.1
>

Subject: Re: [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()

On Wed, Sep 08, 2021 at 08:22:14PM -0700, Yang Shi wrote:
> On Wed, Sep 8, 2021 at 5:41 PM Naoya Horiguchi
> <[email protected]> wrote:
> >
> > From: Naoya Horiguchi <[email protected]>
> >
> > commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
> > unhandlable pages") changes the return value of __get_hwpoison_page() to
> > retry for transiently unhandlable cases. However, __get_hwpoison_page()
> > currently fails to properly judge buddy pages as handlable, so hard/soft
> > offline for buddy pages always fail as "unhandlable page". This is
> > totally regrettable.
> >
> > So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
> > __get_hwpoison_page() returns different return values between buddy
> > pages and unhandlable pages as intended.
> >
> > Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
> > Cc: <[email protected]>
> > Signed-off-by: Naoya Horiguchi <[email protected]>
> > ---
> > mm/memory-failure.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> > index 60df8fcd0444..3416c55be810 100644
> > --- v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c
> > +++ v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> > @@ -1126,7 +1126,7 @@ static int page_action(struct page_state *ps, struct page *p,
> > */
> > static inline bool HWPoisonHandlable(struct page *page)
> > {
> > - return PageLRU(page) || __PageMovable(page);
> > + return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page);
>
> It seems to work. Although this may change the return value of
> get_any_page() to 1 when MF_COUNT_INCREASED is set. This may cause
> soft offline to mishandle free buddy page, but MF_COUNT_INCREASED is
> only set when madvise is used, and madvise definitely can't soft
> offline free buddy page.

Yes, madvise(MADV_SOFT_OFFLINE) should not be called for free buddy pages,
because the interface covers only user-mapped pages.

> It did take me a while to figure out this
> trick. Maybe need some refactor?

I think so, the current code looks to me fragile for future change, but
I'm not sure how we can improve this.

One (maybe not related to MF_COUNT_INCREASED) room for refactoring is
"if (PageTransHuge(head))" block in __get_hwpoison_page(), which might
be old and useless code for now.

>
> Anyway this patch looks fine to me. Reviewed-by: Yang Shi <[email protected]>

Thank you for the review.

- Naoya

2021-09-10 11:17:57

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()

On 09.09.21 02:41, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <[email protected]>
>
> commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
> unhandlable pages") changes the return value of __get_hwpoison_page() to
> retry for transiently unhandlable cases. However, __get_hwpoison_page()
> currently fails to properly judge buddy pages as handlable, so hard/soft
> offline for buddy pages always fail as "unhandlable page". This is
> totally regrettable.
>
> So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
> __get_hwpoison_page() returns different return values between buddy
> pages and unhandlable pages as intended.
>
> Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
> Cc: <[email protected]>
> Signed-off-by: Naoya Horiguchi <[email protected]>
> ---
> mm/memory-failure.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> index 60df8fcd0444..3416c55be810 100644
> --- v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c
> +++ v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> @@ -1126,7 +1126,7 @@ static int page_action(struct page_state *ps, struct page *p,
> */
> static inline bool HWPoisonHandlable(struct page *page)
> {
> - return PageLRU(page) || __PageMovable(page);
> + return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page);
> }
>
> static int __get_hwpoison_page(struct page *page)
>

LGTM and I agree that code might desrve some cleanups.

Acked-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb