2021-07-10 11:47:04

by Zhang, Qiang

Subject: [PATCH] mm/page_alloc: avoid hard lockups in __alloc_pages_bulk()

From: Zqiang <[email protected]>

__alloc_pages_bulk() is mainly used for batch allocation of
order-0 pages. It allocates while holding pagesets.lock, so if too
many pages are requested, the hard lockup watchdog may be triggered.

Signed-off-by: Zqiang <[email protected]>
---
 mm/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d6e94cc8066c..1127db25507f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5315,6 +5315,8 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 		else
 			page_array[nr_populated] = page;
 		nr_populated++;
+
+		touch_nmi_watchdog();
 	}

 	local_unlock_irqrestore(&pagesets.lock, flags);
--
2.25.1


2021-07-10 18:48:32

by Andrew Morton

Subject: Re: [PATCH] mm/page_alloc: avoid hard lockups in __alloc_pages_bulk()

On Sat, 10 Jul 2021 19:29:29 +0800 [email protected] wrote:

> From: Zqiang <[email protected]>
>
> __alloc_pages_bulk() is mainly used for batch allocation of
> order-0 pages. It allocates while holding pagesets.lock, so if too
> many pages are requested, the hard lockup watchdog may be triggered.

Ouch. Has this been observed in testing? If so, can you please share
the kernel debug output from that event?

2021-07-10 21:12:46

by Matthew Wilcox

Subject: Re: [PATCH] mm/page_alloc: avoid hard lockups in __alloc_pages_bulk()

On Sat, Jul 10, 2021 at 11:46:13AM -0700, Andrew Morton wrote:
> On Sat, 10 Jul 2021 19:29:29 +0800 [email protected] wrote:
>
> > From: Zqiang <[email protected]>
> >
> > __alloc_pages_bulk() is mainly used for batch allocation of
> > order-0 pages. It allocates while holding pagesets.lock, so if too
> > many pages are requested, the hard lockup watchdog may be triggered.
>
> Ouch. Has this been observed in testing? If so, can you please share
> the kernel debug output from that event?

This should be fixed in the caller by asking for fewer pages.
The NFS and vmalloc cases have already been fixed for this.
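
A minimal userspace sketch of the caller-side approach suggested above,
assuming a caller that needs many pages splits the request into small
batches so that each call into the bulk allocator holds the lock only
briefly. The names bulk_alloc_stub(), alloc_pages_batched() and the cap
BULK_BATCH_MAX are hypothetical and chosen for illustration only; the
stub stands in for the real bulk allocator and is not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-call cap; the thread does not name a specific value. */
#define BULK_BATCH_MAX 100

/* Counts calls into the stub so the batching can be observed. */
static size_t bulk_calls;

/*
 * Userspace stand-in for a bulk page allocator: fills up to nr slots of
 * pages[] and returns how many it filled. In the kernel, this is the
 * part that would run with the lock held and interrupts disabled.
 */
static size_t bulk_alloc_stub(size_t nr, void **pages)
{
	size_t i;

	bulk_calls++;
	for (i = 0; i < nr; i++) {
		pages[i] = malloc(1);
		if (!pages[i])
			break;
	}
	return i;
}

/*
 * Caller-side batching: never request more than BULK_BATCH_MAX pages in
 * a single call, so each locked section stays short. Returns the number
 * of slots filled, which may be less than nr_pages on failure.
 */
static size_t alloc_pages_batched(size_t nr_pages, void **pages)
{
	size_t done = 0;

	assert(pages != NULL);

	while (done < nr_pages) {
		size_t want = nr_pages - done;
		size_t got;

		if (want > BULK_BATCH_MAX)
			want = BULK_BATCH_MAX;

		got = bulk_alloc_stub(want, pages + done);
		done += got;
		if (got < want)
			break;	/* allocation failed; report partial progress */

		/* A kernel caller in process context could reschedule here. */
	}
	return done;
}
```

With these assumptions, a request for 250 pages results in three calls
to the stub (100, 100 and 50 pages). The caller owns the returned
pointers and should free() each one when done.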

2021-07-15 10:48:25

by Mel Gorman

Subject: Re: [PATCH] mm/page_alloc: avoid hard lockups in __alloc_pages_bulk()

On Sat, Jul 10, 2021 at 10:57:53PM +0000, Zhang, Qiang wrote:
> ________________________________
> From: Matthew Wilcox <[email protected]>
> Sent: Sunday, July 11, 2021 05:10
> To: Andrew Morton
> Cc: Zhang, Qiang; [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH] mm/page_alloc: avoid hard lockups in __alloc_pages_bulk()
>
> On Sat, Jul 10, 2021 at 11:46:13AM -0700, Andrew Morton wrote:
> > On Sat, 10 Jul 2021 19:29:29 +0800 [email protected] wrote:
> >
> > > From: Zqiang <[email protected]>
> > >
> > > __alloc_pages_bulk() is mainly used for batch allocation of
> > > order-0 pages. It allocates while holding pagesets.lock, so if too
> > > many pages are requested, the hard lockup watchdog may be triggered.
> >
> > Ouch. Has this been observed in testing? If so, can you please share
> > the kernel debug output from that event?
>
> >This should be fixed in the caller by asking for fewer pages.
> >The NFS and vmalloc cases have already been fixed for this.
>
> The NFS and vmalloc cases have been fixed?
> I don't see any information about that.
>

AFAIK, NFS simply doesn't ask for a large enough number of pages to be
of concern. For vmalloc, it's somewhat theoretical that it can happen
for anything other than a stress test, but this exists:
https://lore.kernel.org/r/[email protected]

I had no objection to the patch but didn't feel strongly enough to say
anything about it either, given that it was triggered artificially.

--
Mel Gorman
SUSE Labs