2020-08-30 08:15:57

by Muchun Song

Subject: [PATCH] mm/memory-failure: Fix return wrong value when isolate page fail

When page isolation fails, we should not return 0, because no page
has been marked HWPoison.

Signed-off-by: Muchun Song <[email protected]>
---
mm/memory-failure.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 696505f56910..4eb3c42ffe35 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1850,6 +1850,7 @@ static int __soft_offline_page(struct page *page)
 	} else {
 		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
 			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
+		ret = -EBUSY;
 	}
 	return ret;
 }
--
2.11.0
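
For readers who want to see the effect of the one-line change in isolation, here is a minimal userspace sketch of the return-value logic. It is not the kernel code: model_soft_offline() and its parameters are hypothetical names, and it assumes (as the commit message implies) that ret is 0 when the isolation-failed branch is reached.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the tail of __soft_offline_page().
 *
 * isolated:    whether the page could be isolated (assumption: the real
 *              code stores a nonzero value in ret on success).
 * migrate_ret: result of the migration step, 0 on success.
 * patched:     whether the "ret = -EBUSY;" line from the patch is applied.
 */
static int model_soft_offline(bool isolated, int migrate_ret, bool patched)
{
	int ret = isolated ? 1 : 0;

	if (ret) {
		/* Isolation worked, so the result is whatever migration reports. */
		ret = migrate_ret;
	} else {
		/* Isolation failed: without the patch, ret is still 0 here. */
		if (patched)
			ret = -EBUSY;
	}
	return ret;
}
```

In this model, an isolation failure reports success (0) without the patch and -EBUSY with it, while the isolation-succeeded path is unchanged.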


2020-08-30 19:45:29

by Qian Cai

Subject: Re: [PATCH] mm/memory-failure: Fix return wrong value when isolate page fail

On Sun, Aug 30, 2020 at 04:10:53PM +0800, Muchun Song wrote:
> When we isolate page fail, we should not return 0, because we do not
> set page HWPoison on any page.
>
> Signed-off-by: Muchun Song <[email protected]>

This seems to solve the problem for me where madvise(MADV_SOFT_OFFLINE)
would eventually run into ENOMEM:

https://lore.kernel.org/lkml/[email protected]/

Reviewed-by: Qian Cai <[email protected]>
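
For context, a minimal sketch of how userspace might issue the madvise(MADV_SOFT_OFFLINE) call mentioned above and observe its result. This is not the reproducer from the linked report; try_soft_offline_one_page() is a hypothetical helper. It assumes a Linux system, and the call is only expected to succeed with CAP_SYS_ADMIN on a kernel built with CONFIG_MEMORY_FAILURE. Note that a successful call takes a physical page out of service, so run it only on a test machine.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101 /* value from the Linux UAPI headers */
#endif

/*
 * Map one anonymous page, touch it so it is backed by a physical page,
 * and ask the kernel to soft-offline it.
 * Returns 0 on success or a negative errno value on failure.
 */
static int try_soft_offline_one_page(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *addr;
	int ret = 0;

	if (page_size <= 0)
		return -EINVAL;

	addr = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return -errno;

	/* Fault the page in before asking for it to be offlined. */
	memset(addr, 0xa5, (size_t)page_size);

	if (madvise(addr, (size_t)page_size, MADV_SOFT_OFFLINE) != 0)
		ret = -errno;

	munmap(addr, (size_t)page_size);
	return ret;
}
```

A caller can loop over this helper and log the first nonzero return value to see which error, if any, the kernel reports.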

> ---
> mm/memory-failure.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 696505f56910..4eb3c42ffe35 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1850,6 +1850,7 @@ static int __soft_offline_page(struct page *page)
> } else {
> pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
> pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
> + ret = -EBUSY;
> }
> return ret;
> }
> --
> 2.11.0
>
>

Subject: Re: [PATCH] mm/memory-failure: Fix return wrong value when isolate page fail

On Sun, Aug 30, 2020 at 03:44:18PM -0400, Qian Cai wrote:
> On Sun, Aug 30, 2020 at 04:10:53PM +0800, Muchun Song wrote:
> > When we isolate page fail, we should not return 0, because we do not
> > set page HWPoison on any page.
> >
> > Signed-off-by: Muchun Song <[email protected]>
>
> This seems solve the problem for me that madvise(MADV_SOFT_OFFLINE) will run
> into ENOMEM eventually:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Reviewed-by: Qian Cai <[email protected]>

Thank you very much, both of you!

Acked-by: Naoya Horiguchi <[email protected]>

> > ---
> > mm/memory-failure.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 696505f56910..4eb3c42ffe35 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1850,6 +1850,7 @@ static int __soft_offline_page(struct page *page)
> > } else {
> > pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
> > pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
> > + ret = -EBUSY;
> > }
> > return ret;
> > }
> > --
> > 2.11.0
> >
> >
>

2020-08-31 07:35:00

by Oscar Salvador

Subject: Re: [PATCH] mm/memory-failure: Fix return wrong value when isolate page fail

On Sun, Aug 30, 2020 at 04:10:53PM +0800, Muchun Song wrote:
> When we isolate page fail, we should not return 0, because we do not
> set page HWPoison on any page.
>
> Signed-off-by: Muchun Song <[email protected]>

Thanks for spotting this

Reviewed-by: Oscar Salvador <[email protected]>

> ---
> mm/memory-failure.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 696505f56910..4eb3c42ffe35 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1850,6 +1850,7 @@ static int __soft_offline_page(struct page *page)
> } else {
> pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
> pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
> + ret = -EBUSY;
> }
> return ret;
> }
> --
> 2.11.0
>
>

--
Oscar Salvador
SUSE L3

2020-08-31 09:46:54

by Pankaj Gupta

Subject: Re: [PATCH] mm/memory-failure: Fix return wrong value when isolate page fail

> When we isolate page fail, we should not return 0, because we do not
> set page HWPoison on any page.
>
> Signed-off-by: Muchun Song <[email protected]>
> ---
> mm/memory-failure.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 696505f56910..4eb3c42ffe35 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1850,6 +1850,7 @@ static int __soft_offline_page(struct page *page)
> } else {
> pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
> pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
> + ret = -EBUSY;
> }
> return ret;
> }
> --

Acked-by: Pankaj Gupta <[email protected]>
> 2.11.0
>
>