2022-08-25 09:37:47

by Aneesh Kumar K.V

Subject: [PATCH] mm/demotion: Fix kernel error with memory hotplug

On memory hot unplug, the kernel removes the node memory type
from the associated memory tier. Use list_del_init() instead of
list_del() so that the same memory type can be added back
to a memory tier on hotplug.

Without this, we get the warning below and return an error when
adding the memory type to a new memory tier.

[ 33.596095] ------------[ cut here ]------------
[ 33.596099] WARNING: CPU: 3 PID: 667 at mm/memory-tiers.c:115 set_node_memory_tier+0xd6/0x2e0
[ 33.596109] Modules linked in: kmem

...

[ 33.596126] RIP: 0010:set_node_memory_tier+0xd6/0x2e0

....
[ 33.596196] memtier_hotplug_callback+0x48/0x68
[ 33.596204] blocking_notifier_call_chain+0x80/0xc0
[ 33.596211] online_pages+0x25e/0x280
[ 33.596218] memory_block_change_state+0x176/0x1f0
[ 33.596225] memory_subsys_online+0x37/0x40
[ 33.596230] online_store+0x9b/0x130
[ 33.596236] kernfs_fop_write_iter+0x128/0x1b0
[ 33.596242] vfs_write+0x24b/0x2c0
[ 33.596249] ksys_write+0x74/0xe0
[ 33.596254] do_syscall_64+0x43/0x90
[ 33.596259] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: mm/demotion: Add hotplug callbacks to handle new numa node onlined
Signed-off-by: Aneesh Kumar K.V <[email protected]>
---
mm/memory-tiers.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
index a20795bb0e07..ba844fe9cc8c 100644
--- a/mm/memory-tiers.c
+++ b/mm/memory-tiers.c
@@ -451,7 +451,7 @@ static bool clear_node_memory_tier(int node)
 		memtype = node_memory_types[node];
 		node_clear(node, memtype->nodes);
 		if (nodes_empty(memtype->nodes)) {
-			list_del(&memtype->tier_sibiling);
+			list_del_init(&memtype->tier_sibiling);
 			if (list_empty(&memtier->memory_types))
 				destroy_memory_tier(memtier);
 		}
--
2.37.2
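
The difference between the two helpers can be sketched in userspace C. This is a simplified stand-in for the kernel's list primitives, not the kernel code itself: `tier_add_memtype()` and `replug_result()` are hypothetical names, the poison addresses are illustrative, and the `list_empty()` guard on the re-add path is an assumption about what trips the warning.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Illustrative poison values; list_del() leaves such markers behind. */
#define LIST_POISON1 ((struct list_head *)0x100)
#define LIST_POISON2 ((struct list_head *)0x122)

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry from its neighbours without touching entry itself. */
static void list_unlink(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* list_del(): unlink and poison, so the entry never looks empty again. */
static void list_del(struct list_head *entry)
{
	list_unlink(entry);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/* list_del_init(): unlink and reinitialise, so list_empty(entry) is true. */
static void list_del_init(struct list_head *entry)
{
	list_unlink(entry);
	INIT_LIST_HEAD(entry);
}

/* Hypothetical add path: refuse an entry that still looks linked. */
int tier_add_memtype(struct list_head *tier, struct list_head *memtype)
{
	if (!list_empty(memtype))
		return -1;	/* assumed spot where the kernel warns and errors out */
	list_add(memtype, tier);
	return 0;
}

/* Plug, unplug with the chosen helper, then report the result of replugging. */
int replug_result(bool use_del_init)
{
	struct list_head tier, memtype;
	int rc;

	INIT_LIST_HEAD(&tier);
	INIT_LIST_HEAD(&memtype);
	rc = tier_add_memtype(&tier, &memtype);
	assert(rc == 0);	/* first add of a fresh entry succeeds */
	(void)rc;

	if (use_del_init)
		list_del_init(&memtype);
	else
		list_del(&memtype);

	return tier_add_memtype(&tier, &memtype);
}
```

In this sketch, calling `replug_result(false)` from a `main()` returns -1 because the poisoned entry fails the `list_empty()` test, while `replug_result(true)` returns 0.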


2022-08-25 12:56:42

by David Hildenbrand

Subject: Re: [PATCH] mm/demotion: Fix kernel error with memory hotplug

On 25.08.22 11:20, Aneesh Kumar K.V wrote:
> On memory hot unplug, the kernel removes the node memory type
> from the associated memory tier. Use list_del_init() instead of
> list_del() so that the same memory type can be added back
> to a memory tier on hotplug.
>
> Without this, we get the below warning and return error on
> adding memory type to a new memory tier.
>
> [ 33.596095] ------------[ cut here ]------------
> [ 33.596099] WARNING: CPU: 3 PID: 667 at mm/memory-tiers.c:115 set_node_memory_tier+0xd6/0x2e0
> [ 33.596109] Modules linked in: kmem
>
> ...
>
> [ 33.596126] RIP: 0010:set_node_memory_tier+0xd6/0x2e0
>
> ....
> [ 33.596196] memtier_hotplug_callback+0x48/0x68
> [ 33.596204] blocking_notifier_call_chain+0x80/0xc0
> [ 33.596211] online_pages+0x25e/0x280
> [ 33.596218] memory_block_change_state+0x176/0x1f0
> [ 33.596225] memory_subsys_online+0x37/0x40
> [ 33.596230] online_store+0x9b/0x130
> [ 33.596236] kernfs_fop_write_iter+0x128/0x1b0
> [ 33.596242] vfs_write+0x24b/0x2c0
> [ 33.596249] ksys_write+0x74/0xe0
> [ 33.596254] do_syscall_64+0x43/0x90
> [ 33.596259] entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Fixes: mm/demotion: Add hotplug callbacks to handle new numa node onlined

Do we have a proper 12-digit commit id as well?

Do we have to cc stable?

> Signed-off-by: Aneesh Kumar K.V <[email protected]>
> ---
> mm/memory-tiers.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
> index a20795bb0e07..ba844fe9cc8c 100644
> --- a/mm/memory-tiers.c
> +++ b/mm/memory-tiers.c
> @@ -451,7 +451,7 @@ static bool clear_node_memory_tier(int node)
>  		memtype = node_memory_types[node];
>  		node_clear(node, memtype->nodes);
>  		if (nodes_empty(memtype->nodes)) {
> -			list_del(&memtype->tier_sibiling);
> +			list_del_init(&memtype->tier_sibiling);
>  			if (list_empty(&memtier->memory_types))
>  				destroy_memory_tier(memtier);
>  		}


--
Thanks,

David / dhildenb

2022-08-25 13:04:42

by Aneesh Kumar K.V

Subject: Re: [PATCH] mm/demotion: Fix kernel error with memory hotplug

On 8/25/22 5:46 PM, David Hildenbrand wrote:
> On 25.08.22 11:20, Aneesh Kumar K.V wrote:
>> On memory hot unplug, the kernel removes the node memory type
>> from the associated memory tier. Use list_del_init() instead of
>> list_del() so that the same memory type can be added back
>> to a memory tier on hotplug.
>>
>> Without this, we get the below warning and return error on
>> adding memory type to a new memory tier.
>>
>> [ 33.596095] ------------[ cut here ]------------
>> [ 33.596099] WARNING: CPU: 3 PID: 667 at mm/memory-tiers.c:115 set_node_memory_tier+0xd6/0x2e0
>> [ 33.596109] Modules linked in: kmem
>>
>> ...
>>
>> [ 33.596126] RIP: 0010:set_node_memory_tier+0xd6/0x2e0
>>
>> ....
>> [ 33.596196] memtier_hotplug_callback+0x48/0x68
>> [ 33.596204] blocking_notifier_call_chain+0x80/0xc0
>> [ 33.596211] online_pages+0x25e/0x280
>> [ 33.596218] memory_block_change_state+0x176/0x1f0
>> [ 33.596225] memory_subsys_online+0x37/0x40
>> [ 33.596230] online_store+0x9b/0x130
>> [ 33.596236] kernfs_fop_write_iter+0x128/0x1b0
>> [ 33.596242] vfs_write+0x24b/0x2c0
>> [ 33.596249] ksys_write+0x74/0xe0
>> [ 33.596254] do_syscall_64+0x43/0x90
>> [ 33.596259] entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> Fixes: mm/demotion: Add hotplug callbacks to handle new numa node onlined
>
> Do we have a proper 12-digit commit id as well?
>
> Do we have to cc stable?
>

That patch is not yet merged upstream. It is in mm-unstable. I guess Andrew can fold the change
into the original patch?

-aneesh

2022-08-26 00:44:33

by Huang, Ying

Subject: Re: [PATCH] mm/demotion: Fix kernel error with memory hotplug

Aneesh Kumar K V <[email protected]> writes:

> On 8/25/22 5:46 PM, David Hildenbrand wrote:
>> On 25.08.22 11:20, Aneesh Kumar K.V wrote:
>>> On memory hot unplug, the kernel removes the node memory type
>>> from the associated memory tier. Use list_del_init() instead of
>>> list_del() so that the same memory type can be added back
>>> to a memory tier on hotplug.
>>>
>>> Without this, we get the below warning and return error on
>>> adding memory type to a new memory tier.
>>>
>>> [ 33.596095] ------------[ cut here ]------------
>>> [ 33.596099] WARNING: CPU: 3 PID: 667 at mm/memory-tiers.c:115 set_node_memory_tier+0xd6/0x2e0
>>> [ 33.596109] Modules linked in: kmem
>>>
>>> ...
>>>
>>> [ 33.596126] RIP: 0010:set_node_memory_tier+0xd6/0x2e0
>>>
>>> ....
>>> [ 33.596196] memtier_hotplug_callback+0x48/0x68
>>> [ 33.596204] blocking_notifier_call_chain+0x80/0xc0
>>> [ 33.596211] online_pages+0x25e/0x280
>>> [ 33.596218] memory_block_change_state+0x176/0x1f0
>>> [ 33.596225] memory_subsys_online+0x37/0x40
>>> [ 33.596230] online_store+0x9b/0x130
>>> [ 33.596236] kernfs_fop_write_iter+0x128/0x1b0
>>> [ 33.596242] vfs_write+0x24b/0x2c0
>>> [ 33.596249] ksys_write+0x74/0xe0
>>> [ 33.596254] do_syscall_64+0x43/0x90
>>> [ 33.596259] entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>>
>>> Fixes: mm/demotion: Add hotplug callbacks to handle new numa node onlined
>>
>> Do we have a proper 12-digit commit id as well?
>>
>> Do we have to cc stable?
>>
>
> That patch is not yet merged upstream. It is in mm-unstable. I guess Andrew can fold the change
> into the original patch?

I think it may be better to reply to the original patch and name this patch as
a fix, for example,

mm/demotion: Add hotplug callbacks to handle new numa node onlined fix

I found that Andrew has used this kind of name before for fixes.

Best Regards,
Huang, Ying

2022-08-26 03:12:28

by Andrew Morton

Subject: Re: [PATCH] mm/demotion: Fix kernel error with memory hotplug

On Fri, 26 Aug 2022 08:25:42 +0800 "Huang, Ying" <[email protected]> wrote:

> >> Do we have to cc stable?
> >>
> >
> > That patch is not yet merged upstream. It is in mm-unstable. I guess Andrew can fold the change
> > into the original patch?
>
> I think it may be better to reply to the original patch and name this patch as
> a fix, for example,
>
> mm/demotion: Add hotplug callbacks to handle new numa node onlined fix
>
> I found that Andrew has used this kind of name before for fixes.

Doesn't matter much - figuring out which-patch-did-this-patch-fix is,
shall we say, a common operation at akpm headquarters ;)

This was an easy one, thanks to the Fixes:. The patch didn't actually
apply at the desired point in the series, and that's pretty common.
All fixed up now, thanks.



2022-08-26 09:56:25

by David Hildenbrand

Subject: Re: [PATCH] mm/demotion: Fix kernel error with memory hotplug

On 25.08.22 14:53, Aneesh Kumar K V wrote:
> On 8/25/22 5:46 PM, David Hildenbrand wrote:
>> On 25.08.22 11:20, Aneesh Kumar K.V wrote:
>>> On memory hot unplug, the kernel removes the node memory type
>>> from the associated memory tier. Use list_del_init() instead of
>>> list_del() so that the same memory type can be added back
>>> to a memory tier on hotplug.
>>>
>>> Without this, we get the below warning and return error on
>>> adding memory type to a new memory tier.
>>>
>>> [ 33.596095] ------------[ cut here ]------------
>>> [ 33.596099] WARNING: CPU: 3 PID: 667 at mm/memory-tiers.c:115 set_node_memory_tier+0xd6/0x2e0
>>> [ 33.596109] Modules linked in: kmem
>>>
>>> ...
>>>
>>> [ 33.596126] RIP: 0010:set_node_memory_tier+0xd6/0x2e0
>>>
>>> ....
>>> [ 33.596196] memtier_hotplug_callback+0x48/0x68
>>> [ 33.596204] blocking_notifier_call_chain+0x80/0xc0
>>> [ 33.596211] online_pages+0x25e/0x280
>>> [ 33.596218] memory_block_change_state+0x176/0x1f0
>>> [ 33.596225] memory_subsys_online+0x37/0x40
>>> [ 33.596230] online_store+0x9b/0x130
>>> [ 33.596236] kernfs_fop_write_iter+0x128/0x1b0
>>> [ 33.596242] vfs_write+0x24b/0x2c0
>>> [ 33.596249] ksys_write+0x74/0xe0
>>> [ 33.596254] do_syscall_64+0x43/0x90
>>> [ 33.596259] entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>>
>>> Fixes: mm/demotion: Add hotplug callbacks to handle new numa node onlined
>>
>> Do we have a proper 12-digit commit id as well?
>>
>> Do we have to cc stable?
>>
>
> That patch is not yet merged upstream. It is in mm-unstable. I guess Andrew can fold the change
> into the original patch?
>

Please make that clearer next time somehow -- either via "[PATCH
mm-unstable]" or just by stating "Andrew, please squash this into XYZ".

I know akpm headquarters tracks all pending patches, but for other
reviewers this really helps to figure out how urgent this is and where
it applies (and it saves time).


--
Thanks,

David / dhildenb