2023-02-12 11:11:05

by Qi Zheng

Subject: [PATCH] mm: page_alloc: call panic() when memoryless node allocation fails

In free_area_init(), we continue to run after the pgdat allocation for a
memoryless node fails. However, later code (such as zonelist
initialization) does not handle the case where NODE_DATA(nid) is NULL,
which will cause a panic. Instead, it is better to call panic() directly
when this memory allocation fails during system boot.

Signed-off-by: Qi Zheng <[email protected]>
---
 mm/page_alloc.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0745aedebb37..588555754601 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8360,11 +8360,9 @@ void __init free_area_init(unsigned long *max_zone_pfn)

 			/* Allocator not initialized yet */
 			pgdat = arch_alloc_nodedata(nid);
-			if (!pgdat) {
-				pr_err("Cannot allocate %zuB for node %d.\n",
-				       sizeof(*pgdat), nid);
-				continue;
-			}
+			if (!pgdat)
+				panic("Cannot allocate %zuB for node %d.\n",
+				      sizeof(*pgdat), nid);
 			arch_refresh_nodedata(nid, pgdat);
 			free_area_init_memoryless_node(nid);

--
2.20.1



2023-02-13 18:53:37

by Andrew Morton

Subject: Re: [PATCH] mm: page_alloc: call panic() when memoryless node allocation fails

On Sun, 12 Feb 2023 19:10:27 +0800 Qi Zheng <[email protected]> wrote:

> In free_area_init(), we continue to run after the pgdat allocation for a
> memoryless node fails. However, later code (such as zonelist
> initialization) does not handle the case where NODE_DATA(nid) is NULL,
> which will cause a panic. Instead, it is better to call panic() directly
> when this memory allocation fails during system boot.
>
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8360,11 +8360,9 @@ void __init free_area_init(unsigned long *max_zone_pfn)
>
> 			/* Allocator not initialized yet */
> 			pgdat = arch_alloc_nodedata(nid);
> -			if (!pgdat) {
> -				pr_err("Cannot allocate %zuB for node %d.\n",
> -				       sizeof(*pgdat), nid);
> -				continue;
> -			}
> +			if (!pgdat)
> +				panic("Cannot allocate %zuB for node %d.\n",
> +				      sizeof(*pgdat), nid);
> 			arch_refresh_nodedata(nid, pgdat);
> 			free_area_init_memoryless_node(nid);

Have you actually hit this at runtime? If so, is there something we
can do to handle this failure more gracefully?


2023-02-14 03:15:27

by Qi Zheng

Subject: Re: [PATCH] mm: page_alloc: call panic() when memoryless node allocation fails



On 2023/2/14 02:53, Andrew Morton wrote:
> On Sun, 12 Feb 2023 19:10:27 +0800 Qi Zheng <[email protected]> wrote:
>
>> In free_area_init(), we continue to run after the pgdat allocation for a
>> memoryless node fails. However, later code (such as zonelist
>> initialization) does not handle the case where NODE_DATA(nid) is NULL,
>> which will cause a panic. Instead, it is better to call panic() directly
>> when this memory allocation fails during system boot.
>>
>> ...
>>
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -8360,11 +8360,9 @@ void __init free_area_init(unsigned long *max_zone_pfn)
>>
>> 			/* Allocator not initialized yet */
>> 			pgdat = arch_alloc_nodedata(nid);
>> -			if (!pgdat) {
>> -				pr_err("Cannot allocate %zuB for node %d.\n",
>> -				       sizeof(*pgdat), nid);
>> -				continue;
>> -			}
>> +			if (!pgdat)
>> +				panic("Cannot allocate %zuB for node %d.\n",
>> +				      sizeof(*pgdat), nid);
>> 			arch_refresh_nodedata(nid, pgdat);
>> 			free_area_init_memoryless_node(nid);
>
> Have you actually hit this at runtime? If so, is there something we

No, I just quickly tested the following change, which forces the skip
path, and found that the kernel crashes immediately afterwards in later
code (many paths do not handle this case):

@@ -8368,6 +8368,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 				       sizeof(*pgdat), nid);
 				continue;
 			}
+			continue;
 			arch_refresh_nodedata(nid, pgdat);
 			free_area_init_memoryless_node(nid);

> can do to handle this failure more gracefully?

My reasoning is that if an allocation as small as the pgdat structure
cannot be satisfied at boot time, there is no point in continuing to
run. So I chose to call panic() directly.

Thanks,
Qi
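
For illustration only, here is a minimal userspace sketch of the two
behaviours discussed in this thread. It is not kernel code: struct
node_data, node_table, alloc_node_data(), count_zones() and the fail_nid
parameter are hypothetical stand-ins for pg_data_t, NODE_DATA(),
arch_alloc_nodedata() and a later consumer such as zonelist setup, and
abort() stands in for panic().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_NODES 4

/* Hypothetical stand-in for the kernel's pg_data_t. */
struct node_data {
	int nid;
	int nr_zones;
};

/* Hypothetical stand-in for the table behind NODE_DATA(nid). */
static struct node_data *node_table[MAX_NODES];

/* Hypothetical allocator; fail_nid simulates a failure for one node. */
static struct node_data *alloc_node_data(int nid, int fail_nid)
{
	struct node_data *nd;

	if (nid == fail_nid)
		return NULL;
	nd = calloc(1, sizeof(*nd));
	if (nd) {
		nd->nid = nid;
		nd->nr_zones = 1;
	}
	return nd;
}

/* Old behaviour: log and skip. Returns how many entries were left NULL. */
static int init_nodes_skip_on_failure(int fail_nid)
{
	int nid, missing = 0;

	assert(fail_nid < MAX_NODES);
	for (nid = 0; nid < MAX_NODES; nid++) {
		node_table[nid] = alloc_node_data(nid, fail_nid);
		if (!node_table[nid]) {
			fprintf(stderr, "Cannot allocate %zuB for node %d.\n",
				sizeof(struct node_data), nid);
			missing++;
			continue;
		}
	}
	return missing;
}

/* Patched behaviour: stop at the allocation site (abort() models panic()). */
static void init_nodes_fail_fast(int fail_nid)
{
	int nid;

	assert(fail_nid < MAX_NODES);
	for (nid = 0; nid < MAX_NODES; nid++) {
		node_table[nid] = alloc_node_data(nid, fail_nid);
		if (!node_table[nid]) {
			fprintf(stderr, "Cannot allocate %zuB for node %d.\n",
				sizeof(struct node_data), nid);
			abort();
		}
	}
}

/*
 * A later pass that expects every entry to be valid. It returns -1 where
 * an unchecked consumer would dereference NULL, so the sketch stays
 * runnable instead of crashing.
 */
static int count_zones(void)
{
	int nid, total = 0;

	for (nid = 0; nid < MAX_NODES; nid++) {
		if (!node_table[nid])
			return -1;
		total += node_table[nid]->nr_zones;
	}
	return total;
}

/* Release everything and reset the table between experiments. */
static void free_nodes(void)
{
	int nid;

	for (nid = 0; nid < MAX_NODES; nid++) {
		free(node_table[nid]);
		node_table[nid] = NULL;
	}
}
```

With the skip variant, a simulated failure for one node leaves a NULL
entry that the later pass trips over, far from the allocation that
caused it. With the fail-fast variant the process stops at the failing
allocation, which is the behaviour the patch proposes for boot time.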
