2021-11-09 16:30:57

by Michal Hocko

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

I have opposed this patch http://lkml.kernel.org/r/[email protected]
There was no response to that feedback. I will not go as far as to nack
it explicitly because pcp allocator is not an area I would nack patches
but seriously, this issue needs a deeper look rather than a paper over
patch. I hope we do not want to do a similar thing to all callers of
cpu_to_mem.

On Mon 08-11-21 12:50:31, Andrew Morton wrote:
>
> The patch titled
> Subject: mm: fix panic in __alloc_pages
> has been added to the -mm tree. Its filename is
> mm-fix-panic-in-__alloc_pages.patch
>
> This patch should soon appear at
> https://ozlabs.org/~akpm/mmots/broken-out/mm-fix-panic-in-__alloc_pages.patch
> and later at
> https://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-panic-in-__alloc_pages.patch
>
> Before you just go and hit "reply", please:
> a) Consider who else should be cc'ed
> b) Prefer to cc a suitable mailing list as well
> c) Ideally: find the original patch on the mailing list and do a
> reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
>
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
>
> ------------------------------------------------------
> From: Alexey Makhalov <[email protected]>
> Subject: mm: fix panic in __alloc_pages
>
> There is a kernel panic caused by pcpu_alloc_pages() passing offlined and
> uninitialized node to alloc_pages_node() leading to panic by NULL
> dereferencing uninitialized NODE_DATA(nid).
>
> CPU2 has been hot-added
> BUG: unable to handle page fault for address: 0000000000001608
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> CPU: 0 PID: 1 Comm: systemd Tainted: G E 5.15.0-rc7+ #11
> Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW
>
> RIP: 0010:__alloc_pages+0x127/0x290
> Code: 4c 89 f0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 44 89 e0 48 8b 55 b8 c1 e8 0c 83 e0 01 88 45 d0 4c 89 c8 48 85 d2 0f 85 1a 01 00 00 <45> 3b 41 08 0f 82 10 01 00 00 48 89 45 c0 48 8b 00 44 89 e2 81 e2
> RSP: 0018:ffffc900006f3bc8 EFLAGS: 00010246
> RAX: 0000000000001600 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000cc2
> RBP: ffffc900006f3c18 R08: 0000000000000001 R09: 0000000000001600
> R10: ffffc900006f3a40 R11: ffff88813c9fffe8 R12: 0000000000000cc2
> R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000cc2
> FS: 00007f27ead70500(0000) GS:ffff88807ce00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000001608 CR3: 000000000582c003 CR4: 00000000001706b0
> Call Trace:
> pcpu_alloc_pages.constprop.0+0xe4/0x1c0
> pcpu_populate_chunk+0x33/0xb0
> pcpu_alloc+0x4d3/0x6f0
> __alloc_percpu_gfp+0xd/0x10
> alloc_mem_cgroup_per_node_info+0x54/0xb0
> mem_cgroup_alloc+0xed/0x2f0
> mem_cgroup_css_alloc+0x33/0x2f0
> css_create+0x3a/0x1f0
> cgroup_apply_control_enable+0x12b/0x150
> cgroup_mkdir+0xdd/0x110
> kernfs_iop_mkdir+0x4f/0x80
> vfs_mkdir+0x178/0x230
> do_mkdirat+0xfd/0x120
> __x64_sys_mkdir+0x47/0x70
> ? syscall_exit_to_user_mode+0x21/0x50
> do_syscall_64+0x43/0x90
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Panic can be easily reproduced by disabling udev rule for automatic
> onlining hot added CPU followed by CPU with memoryless node (NUMA node
> with CPU only) hot add.
>
> Hot adding CPU and memoryless node does not bring the node to online
> state. Memoryless node will be onlined only during the onlining its CPU.
>
> Node can be in one of the following states:
> 1. not present.(nid == NUMA_NO_NODE)
> 2. present, but offline (nid > NUMA_NO_NODE, node_online(nid) == 0,
> NODE_DATA(nid) == NULL)
> 3. present and online (nid > NUMA_NO_NODE, node_online(nid) > 0,
> NODE_DATA(nid) != NULL)
>
> Percpu code is doing allocations for all possible CPUs. The issue happens
> when it serves hot added but not yet onlined CPU when its node is in 2nd
> state. This node is not ready to use, fallback to numa_mem_id().
>
> Link: https://lkml.kernel.org/r/[email protected]
> Signed-off-by: Alexey Makhalov <[email protected]>
> Reviewed-by: David Hildenbrand <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Oscar Salvador <[email protected]>
> Cc: Dennis Zhou <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> ---
>
> mm/percpu-vm.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> --- a/mm/percpu-vm.c~mm-fix-panic-in-__alloc_pages
> +++ a/mm/percpu-vm.c
> @@ -84,15 +84,19 @@ static int pcpu_alloc_pages(struct pcpu_
>  			    gfp_t gfp)
>  {
>  	unsigned int cpu, tcpu;
> -	int i;
> +	int i, nid;
>
>  	gfp |= __GFP_HIGHMEM;
>
>  	for_each_possible_cpu(cpu) {
> +		nid = cpu_to_node(cpu);
> +		if (nid == NUMA_NO_NODE || !node_online(nid))
> +			nid = numa_mem_id();
> +
>  		for (i = page_start; i < page_end; i++) {
>  			struct page **pagep = &pages[pcpu_page_idx(cpu, i)];
>
> -			*pagep = alloc_pages_node(cpu_to_node(cpu), gfp, 0);
> +			*pagep = alloc_pages_node(nid, gfp, 0);
>  			if (!*pagep)
>  				goto err;
>  		}
> _
>
> Patches currently in -mm which might be from [email protected] are
>
> mm-fix-panic-in-__alloc_pages.patch

--
Michal Hocko
SUSE Labs
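
For readers following the quoted patch description, the three node states and the fallback it introduces can be modelled in plain userspace C. This is only a sketch under stated assumptions: the MODEL_* constants, struct model_node, classify_node() and pick_alloc_node() are hypothetical names that stand in for NUMA_NO_NODE, node_online(), NODE_DATA() and the numa_mem_id() fallback. None of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every name below is hypothetical. */
#define MODEL_NO_NODE   (-1)	/* stands in for NUMA_NO_NODE */
#define MODEL_MAX_NODES 4

struct model_node {
	bool online;		/* stands in for node_online(nid) */
	void *node_data;	/* stands in for NODE_DATA(nid) */
};

enum node_state {
	NODE_NOT_PRESENT,	/* state 1 in the patch description */
	NODE_PRESENT_OFFLINE,	/* state 2: node data not set up yet */
	NODE_PRESENT_ONLINE,	/* state 3: safe to allocate from */
};

static enum node_state classify_node(const struct model_node *nodes, int nid)
{
	if (nid == MODEL_NO_NODE)
		return NODE_NOT_PRESENT;
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return nodes[nid].online ? NODE_PRESENT_ONLINE : NODE_PRESENT_OFFLINE;
}

/* Use the CPU's own node only when it is online; otherwise fall back. */
static int pick_alloc_node(const struct model_node *nodes, int cpu_nid,
			   int fallback_nid)
{
	if (classify_node(nodes, cpu_nid) != NODE_PRESENT_ONLINE)
		return fallback_nid;
	return cpu_nid;
}
```

In this model a CPU whose node is in state 2 gets the fallback node, which mirrors what the patch does at this one call site. Other call sites that look up the node directly are not covered, which is the objection raised above.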


2021-11-09 17:08:28

by David Hildenbrand

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On 09.11.21 09:37, Michal Hocko wrote:
> I have opposed this patch http://lkml.kernel.org/r/[email protected]
> There was no response to that feedback. I will not go as far as to nack
> it explicitly because pcp allocator is not an area I would nack patches
> but seriously, this issue needs a deeper look rather than a paper over
> patch. I hope we do not want to do a similar thing to all callers of
> cpu_to_mem.

While we could move it into the !HOLES version of cpu_to_mem(), calling
cpu_to_mem() on an offline (and eventually not even present) CPU (with
an offline node) is really a corner case.

Instead of additional runtime overhead for all cpu_to_mem(), my take
would be to just do it for the random special cases. Sure, we can
document that people should be careful when calling cpu_to_mem() on
offline CPUs. But IMHO it's really a corner case.

--
Thanks,

David / dhildenb

2021-11-09 21:25:01

by Michal Hocko

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Tue 09-11-21 09:42:56, David Hildenbrand wrote:
> On 09.11.21 09:37, Michal Hocko wrote:
> > I have opposed this patch http://lkml.kernel.org/r/[email protected]
> > There was no response to that feedback. I will not go as far as to nack
> > it explicitly because pcp allocator is not an area I would nack patches
> > but seriously, this issue needs a deeper look rather than a paper over
> > patch. I hope we do not want to do a similar thing to all callers of
> > cpu_to_mem.
>
> While we could move it into the !HOLES version of cpu_to_mem(), calling
> cpu_to_mem() on an offline (and eventually not even present) CPU (with
> an offline node) is really a corner case.
>
> Instead of additional runtime overhead for all cpu_to_mem(), my take
> would be to just do it for the random special cases. Sure, we can
> document that people should be careful when calling cpu_to_mem() on
> offline CPUs. But IMHO it's really a corner case.

I suspect I haven't made myself clear enough. I do not think we should
be touching cpu_to_mem/cpu_to_node and handle this corner case. We
should be looking at the underlying problem instead. We cannot really
rely on cpu to be onlined to have a proper node association. We should
really look at the initialization code and handle this situation
properly. Memory less nodes are something we have been dealing with
already. This particular instance of the problem is new and we should
understand why.
--
Michal Hocko
SUSE Labs

2021-11-12 18:20:30

by Dennis Zhou

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

Hello,

On Tue, Nov 09, 2021 at 12:00:46PM +0100, Michal Hocko wrote:
> On Tue 09-11-21 09:42:56, David Hildenbrand wrote:
> > On 09.11.21 09:37, Michal Hocko wrote:
> > > I have opposed this patch http://lkml.kernel.org/r/[email protected]
> > > There was no response to that feedback. I will not go as far as to nack
> > > it explicitly because pcp allocator is not an area I would nack patches
> > > but seriously, this issue needs a deeper look rather than a paper over
> > > patch. I hope we do not want to do a similar thing to all callers of
> > > cpu_to_mem.
> >
> > While we could move it into the !HOLES version of cpu_to_mem(), calling
> > cpu_to_mem() on an offline (and eventually not even present) CPU (with
> > an offline node) is really a corner case.
> >
> > Instead of additional runtime overhead for all cpu_to_mem(), my take
> > would be to just do it for the random special cases. Sure, we can
> > document that people should be careful when calling cpu_to_mem() on
> > offline CPUs. But IMHO it's really a corner case.
>
> I suspect I haven't made myself clear enough. I do not think we should
> be touching cpu_to_mem/cpu_to_node and handle this corner case. We
> should be looking at the underlying problem instead. We cannot really
> rely on cpu to be onlined to have a proper node association. We should
> really look at the initialization code and handle this situation
> properly. Memory less nodes are something we have been dealing with
> already. This particular instance of the problem is new and we should
> understand why.
> --
> Michal Hocko
> SUSE Labs

So I think we're still short a solution here. This patch solves the side
effect but not the underlying problem related to cpu hotplug.

I'm fine with this going in as a stop gap because I imagine the fixes to
hotplug are a lot more intrusive, but do we have someone who can own
that work to fix hotplug? I think that should be a requirement for
taking this because clearly it's hotplug that's broken and not percpu.

Acked-by: Dennis Zhou <[email protected]>

Thanks,
Dennis

2021-11-15 10:41:30

by Michal Hocko

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Fri 12-11-21 13:20:20, Dennis Zhou wrote:
> Hello,
>
> On Tue, Nov 09, 2021 at 12:00:46PM +0100, Michal Hocko wrote:
> > On Tue 09-11-21 09:42:56, David Hildenbrand wrote:
> > > On 09.11.21 09:37, Michal Hocko wrote:
> > > > I have opposed this patch http://lkml.kernel.org/r/[email protected]
> > > > There was no response to that feedback. I will not go as far as to nack
> > > > it explicitly because pcp allocator is not an area I would nack patches
> > > > but seriously, this issue needs a deeper look rather than a paper over
> > > > patch. I hope we do not want to do a similar thing to all callers of
> > > > cpu_to_mem.
> > >
> > > While we could move it into the !HOLES version of cpu_to_mem(), calling
> > > cpu_to_mem() on an offline (and eventually not even present) CPU (with
> > > an offline node) is really a corner case.
> > >
> > > Instead of additional runtime overhead for all cpu_to_mem(), my take
> > > would be to just do it for the random special cases. Sure, we can
> > > document that people should be careful when calling cpu_to_mem() on
> > > offline CPUs. But IMHO it's really a corner case.
> >
> > I suspect I haven't made myself clear enough. I do not think we should
> > be touching cpu_to_mem/cpu_to_node and handle this corner case. We
> > should be looking at the underlying problem instead. We cannot really
> > rely on cpu to be onlined to have a proper node association. We should
> > really look at the initialization code and handle this situation
> > properly. Memory less nodes are something we have been dealing with
> > already. This particular instance of the problem is new and we should
> > understand why.
> > --
> > Michal Hocko
> > SUSE Labs
>
> So I think we're still short a solution here. This patch solves the side
> effect but not the underlying problem related to cpu hotplug.
>
> I'm fine with this going in as a stop gap because I imagine the fixes to
> hotplug are a lot more intrusive, but do we have someone who can own
> that work to fix hotplug? I think that should be a requirement for
> taking this because clearly it's hotplug that's broken and not percpu.

I have asked several times for details about the specific setup that has
led to the reported crash. Without much success so far. Reproduction
steps would be the first step. That would allow somebody to work on this
at least if Alexey doesn't have time to dive into this deeper.

I would be more inclined to a stop gap workaround if this was a more
widespread problem but a lack of other reports suggests this has been
a one off.

The final saying is yours of course.

> Acked-by: Dennis Zhou <[email protected]>
>
> Thanks,
> Dennis

--
Michal Hocko
SUSE Labs

2021-11-15 11:04:53

by Alexey Makhalov

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

Hi Michal,

>
> I have asked several times for details about the specific setup that has
> led to the reported crash. Without much success so far. Reproduction
> steps would be the first step. That would allow somebody to work on this
> at least if Alexey doesn't have time to dive into this deeper.
>

I didn’t know that repro steps are still not clear.

To reproduce the panic you need a system where you can hot add
a CPU that belongs to a memoryless NUMA node which is not yet present and
onlined. In other words, by hot adding the CPU, you add both the CPU and the
NUMA node at the same time.
I’m using the VMware hypervisor with a Linux VM configured so
that every (possible) CPU has its own NUMA node.
Before the CPU hot add, the udev rule for CPU onlining should be disabled.
After the CPU hot add event, the panic is triggered on the next
percpu allocation.

Let me know if this is enough or you need some extra information.

Thanks,
—Alexey



2021-11-15 12:59:09

by Michal Hocko

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Mon 15-11-21 11:04:16, Alexey Makhalov wrote:
> Hi Michal,
>
> >
> > I have asked several times for details about the specific setup that has
> > led to the reported crash. Without much success so far. Reproduction
> > steps would be the first step. That would allow somebody to work on this
> > at least if Alexey doesn't have time to dive into this deeper.
> >
>
> I didn’t know that repro steps are still not clear.
>
> To reproduce the panic you need to have a system, where you can hot add
> the CPU that belongs to memoryless NUMA node which is not present and onlined
> yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
> at the same time.

There seems to be something different in your setup because memory less
nodes have reportedly worked on x86. I suspect something must be
different in your setup. Maybe it is that you are adding a cpu that is
outside of possible cpus initialized during boot time. Those should have
their nodes initialized properly - at least per init_cpu_to_node. Your
report doesn't really explain how the cpu is hotadded. Maybe you are
trying to do something that has never been supported on x86.

It would be really great if you can provide more information in the
original email thread. E.g. boot time messages and then more details
about the hotplug operation as well (e.g. which cpu, the node
association, how it is injected to the guest etc.).

Thanks!
--
Michal Hocko
SUSE Labs

2021-11-16 00:14:16

by Alexey Makhalov

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree



> On Nov 15, 2021, at 4:58 AM, Michal Hocko <[email protected]> wrote:
>
> On Mon 15-11-21 11:04:16, Alexey Makhalov wrote:
>> Hi Michal,
>>
>>>
>>> I have asked several times for details about the specific setup that has
>>> led to the reported crash. Without much success so far. Reproduction
>>> steps would be the first step. That would allow somebody to work on this
>>> at least if Alexey doesn't have time to dive into this deeper.
>>>
>>
>> I didn’t know that repro steps are still not clear.
>>
>> To reproduce the panic you need to have a system, where you can hot add
>> the CPU that belongs to memoryless NUMA node which is not present and onlined
>> yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
>> at the same time.
>
> There seems to be something different in your setup because memory less
> nodes have reportedly worked on x86. I suspect something must be
> different in your setup. Maybe it is that you are adding a cpu that is
> outside of possible cpus initialized during boot time. Those should have
> their nodes initialized properly - at least per init_cpu_to_node. Your
> report doesn't really explain how the cpu is hotadded. Maybe you are
> trying to do something that has never been supported on x86.
Memoryless nodes are supported by x86. But hot add of such nodes is not quite
done.

>
> It would be really great if you can provide more information in the
> original email thread. E.g. boot time messages and then more details
> about the hotplug operation as well (e.g. which cpu, the node
> association, how it is injected to the guest etc.).
>
I’ll provide more information in the main thread.



Regards,
—Alexey



2021-11-16 03:52:49

by Dennis Zhou

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Mon, Nov 15, 2021 at 11:11:44PM +0000, Alexey Makhalov wrote:
>
>
> > On Nov 15, 2021, at 4:58 AM, Michal Hocko <[email protected]> wrote:
> >
> > On Mon 15-11-21 11:04:16, Alexey Makhalov wrote:
> >> Hi Michal,
> >>
> >>>
> >>> I have asked several times for details about the specific setup that has
> >>> led to the reported crash. Without much success so far. Reproduction
> >>> steps would be the first step. That would allow somebody to work on this
> >>> at least if Alexey doesn't have time to dive into this deeper.
> >>>
> >>
> >> I didn’t know that repro steps are still not clear.
> >>
> >> To reproduce the panic you need to have a system, where you can hot add
> >> the CPU that belongs to memoryless NUMA node which is not present and onlined
> >> yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
> >> at the same time.
> >
> > There seems to be something different in your setup because memory less
> > nodes have reportedly worked on x86. I suspect something must be
> > different in your setup. Maybe it is that you are adding a cpu that is
> > outside of possible cpus initialized during boot time. Those should have
> > their nodes initialized properly - at least per init_cpu_to_node. Your
> > report doesn't really explain how the cpu is hotadded. Maybe you are
> > trying to do something that has never been supported on x86.
> Memoryless nodes are supported by x86. But hot add of such nodes is not quite
> done.
>

I need some clarification here. It sounds like memoryless nodes work on
x86, but hotplug + memoryless nodes isn't a supported use case or you're
introducing it as a new use case?

If this is a new use case, then I'm inclined to say this patch should
NOT go in and a proper fix should be implemented on hotplug's side. I
don't want to be in the business of having/seeing this conversation
reoccur because we just papered over this issue in percpu.

Thanks,
Dennis

2021-11-16 12:30:56

by Christoph Lameter

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Mon, 15 Nov 2021, Dennis Zhou wrote:

> I need some clarification here. It sounds like memoryless nodes work on
> x86, but hotplug + memoryless nodes isn't a supported use case or you're
> introducing it as a new use case?

Could you do that step by step?

First add the new node and ensure everything is ok and that the memory is
online.

*After* that is done bring up the new processor and associate the
processor with *online* memory.
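
The ordering suggested here can be illustrated with a small userspace model. It is a sketch only: struct model_system, model_online_node() and model_online_cpu() are hypothetical names, and the model simply refuses to bring up a CPU whose node is not online yet. It does not describe how the kernel's hotplug code is structured.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every name below is hypothetical. */
#define MODEL_MAX_NODES 4
#define MODEL_MAX_CPUS  4

struct model_system {
	bool node_online[MODEL_MAX_NODES];
	bool cpu_online[MODEL_MAX_CPUS];
	int  cpu_node[MODEL_MAX_CPUS];	/* node each CPU belongs to */
};

/* Step 1: bring the node online. */
static void model_online_node(struct model_system *sys, int nid)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	sys->node_online[nid] = true;
}

/*
 * Step 2: bring the CPU online, but only once its node is online.
 * Returns 0 on success and -1 if the node has not been onlined yet.
 */
static int model_online_cpu(struct model_system *sys, int cpu)
{
	int nid;

	assert(cpu >= 0 && cpu < MODEL_MAX_CPUS);
	nid = sys->cpu_node[cpu];
	assert(nid >= 0 && nid < MODEL_MAX_NODES);

	if (!sys->node_online[nid])
		return -1;

	sys->cpu_online[cpu] = true;
	return 0;
}
```

With this ordering enforced, an online CPU always belongs to an online node. It does not help code that iterates over all possible CPUs, including ones that have not been onlined.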



2021-11-16 15:42:33

by Michal Hocko

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Tue 16-11-21 13:30:45, Christoph Lameter wrote:
> On Mon, 15 Nov 2021, Dennis Zhou wrote:
>
> > I need some clarification here. It sounds like memoryless nodes work on
> > x86, but hotplug + memoryless nodes isn't a supported use case or you're
> > introducing it as a new use case?
>
> Could you do that step by step?
>
> First add the new node and ensure everything is ok and that the memory is
> online.
>
> *After* that is done bring up the new processor and associate the
> processor with *online* memory.

We are discussing that in the original thread -
http://lkml.kernel.org/r/[email protected]

This patch is a workaround for that problem in the pcp code.


--
Michal Hocko
SUSE Labs

2021-12-14 10:11:59

by Michal Hocko

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Mon 15-11-21 22:52:27, Dennis Zhou wrote:
> On Mon, Nov 15, 2021 at 11:11:44PM +0000, Alexey Makhalov wrote:
> >
> >
> > > On Nov 15, 2021, at 4:58 AM, Michal Hocko <[email protected]> wrote:
> > >
> > > On Mon 15-11-21 11:04:16, Alexey Makhalov wrote:
> > >> Hi Michal,
> > >>
> > >>>
> > >>> I have asked several times for details about the specific setup that has
> > >>> led to the reported crash. Without much success so far. Reproduction
> > >>> steps would be the first step. That would allow somebody to work on this
> > >>> at least if Alexey doesn't have time to dive into this deeper.
> > >>>
> > >>
> > >> I didn’t know that repro steps are still not clear.
> > >>
> > >> To reproduce the panic you need to have a system, where you can hot add
> > >> the CPU that belongs to memoryless NUMA node which is not present and onlined
> > >> yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
> > >> at the same time.
> > >
> > > There seems to be something different in your setup because memory less
> > > nodes have reportedly worked on x86. I suspect something must be
> > > different in your setup. Maybe it is that you are adding a cpu that is
> > > > outside of possible cpus initialized during boot time. Those should have
> > > their nodes initialized properly - at least per init_cpu_to_node. Your
> > > report doesn't really explain how the cpu is hotadded. Maybe you are
> > > trying to do something that has never been supported on x86.
> > Memoryless nodes are supported by x86. But hot add of such nodes is not quite
> > done.
> >
>
> I need some clarification here. It sounds like memoryless nodes work on
> x86, but hotplug + memoryless nodes isn't a supported use case or you're
> introducing it as a new use case?
>
> If this is a new use case, then I'm inclined to say this patch should
> NOT go in and a proper fix should be implemented on hotplug's side. I
> don't want to be in the business of having/seeing this conversation
> reoccur because we just papered over this issue in percpu.

The patch still seems to be in the mmotm tree. I have sent a different
fix candidate [1] which should be more robust and cover also other potential
places.

[1] http://lkml.kernel.org/r/[email protected]
--
Michal Hocko
SUSE Labs

2021-12-14 20:57:54

by Andrew Morton

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Tue, 14 Dec 2021 11:11:54 +0100 Michal Hocko <[email protected]> wrote:

> > I need some clarification here. It sounds like memoryless nodes work on
> > x86, but hotplug + memoryless nodes isn't a supported use case or you're
> > introducing it as a new use case?
> >
> > If this is a new use case, then I'm inclined to say this patch should
> > NOT go in and a proper fix should be implemented on hotplug's side. I
> > don't want to be in the business of having/seeing this conversation
> > reoccur because we just papered over this issue in percpu.
>
> The patch still seems to be in the mmotm tree. I have sent a different
> fix candidate [1] which should be more robust and cover also other potential
> places.
>
> [1] http://lkml.kernel.org/r/[email protected]

Is cool, I'm paying attention.

We do want something short and simple for backporting to -stable (like
Alexey's patch) so please bear that in mind while preparing an
alternative.

2021-12-15 10:05:16

by Michal Hocko

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Tue 14-12-21 12:57:48, Andrew Morton wrote:
> On Tue, 14 Dec 2021 11:11:54 +0100 Michal Hocko <[email protected]> wrote:
>
> > > I need some clarification here. It sounds like memoryless nodes work on
> > > x86, but hotplug + memoryless nodes isn't a supported use case or you're
> > > introducing it as a new use case?
> > >
> > > If this is a new use case, then I'm inclined to say this patch should
> > > NOT go in and a proper fix should be implemented on hotplug's side. I
> > > don't want to be in the business of having/seeing this conversation
> > > reoccur because we just papered over this issue in percpu.
> >
> > The patch still seems to be in the mmotm tree. I have sent a different
> > fix candidate [1] which should be more robust and cover also other potential
> > places.
> >
> > [1] http://lkml.kernel.org/r/[email protected]
>
> Is cool, I'm paying attention.
>
> We do want something short and simple for backporting to -stable (like
> Alexey's patch) so please bear that in mind while preparing an
> alternative.

I think we want something that fixes the underlying problem. Please keep
in mind that the pcp allocation is not the only place to hit the issue.
We have more. I do not think we want to handle each and every one
separately.

I am definitely not going to push for my solution but if there is a
consensus this is the right approach then I do not think we really want
to implement these partial workarounds.

--
Michal Hocko
SUSE Labs

2021-12-15 12:21:17

by Michal Hocko

Subject: Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

On Wed 15-12-21 11:05:12, Michal Hocko wrote:
> On Tue 14-12-21 12:57:48, Andrew Morton wrote:
> > On Tue, 14 Dec 2021 11:11:54 +0100 Michal Hocko <[email protected]> wrote:
> >
> > > > I need some clarification here. It sounds like memoryless nodes work on
> > > > x86, but hotplug + memoryless nodes isn't a supported use case or you're
> > > > introducing it as a new use case?
> > > >
> > > > If this is a new use case, then I'm inclined to say this patch should
> > > > NOT go in and a proper fix should be implemented on hotplug's side. I
> > > > don't want to be in the business of having/seeing this conversation
> > > > reoccur because we just papered over this issue in percpu.
> > >
> > > The patch still seems to be in the mmotm tree. I have sent a different
> > > fix candidate [1] which should be more robust and cover also other potential
> > > places.
> > >
> > > [1] http://lkml.kernel.org/r/[email protected]
> >
> > Is cool, I'm paying attention.
> >
> > We do want something short and simple for backporting to -stable (like
> > Alexey's patch) so please bear that in mind while preparing an
> > alternative.
>
> I think we want something that fixes the underlying problem. Please keep
> in mind that the pcp allocation is not the only place to hit the issue.
> We have more. I do not think we want to handle each and every one
> separately.
>
> I am definitely not going to push for my solution but if there is a
> consensus this is the right approach then I do not think we really want
> to implement these partial workarounds.

Btw. I forgot to add that if we do not agree on the preallocation
approach then the approach should be something like
http://lkml.kernel.org/r/[email protected]
proposed by David.
--
Michal Hocko
SUSE Labs
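
The thread names a "preallocation approach" without showing it, so the following is only a guess at the general idea, written as a userspace model: set up node data for every possible node at init time, whether or not the node is online, so that a later lookup never returns NULL. All names here (struct model_pgdat, model_init_possible_nodes(), model_node_data_for()) are hypothetical, and the referenced patch may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace model only; every name below is hypothetical. */
#define MODEL_MAX_NODES 4

struct model_pgdat {
	int  nid;
	bool online;
};

/* Stands in for the per-node data array behind NODE_DATA(nid). */
static struct model_pgdat *model_node_data[MODEL_MAX_NODES];

/*
 * Allocate node data for every *possible* node, online or not.
 * Returns 0 on success, -1 on allocation failure.
 */
static int model_init_possible_nodes(const bool *possible, const bool *online)
{
	for (int nid = 0; nid < MODEL_MAX_NODES; nid++) {
		struct model_pgdat *pgdat;

		if (!possible[nid] || model_node_data[nid])
			continue;

		pgdat = calloc(1, sizeof(*pgdat));
		if (!pgdat)
			return -1;

		pgdat->nid = nid;
		pgdat->online = online[nid];
		model_node_data[nid] = pgdat;
	}
	return 0;
}

/* Lookup that is non-NULL for every possible node after init. */
static struct model_pgdat *model_node_data_for(int nid)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return model_node_data[nid];
}
```

In this model a lookup for a possible but offline node returns valid data rather than NULL, so callers no longer need a per-site check. Whether an allocation aimed at such a node then falls back sensibly is a separate question that the model does not address.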