2014-01-28 09:02:51

by Tang Chen

Subject: [PATCH 0/2] numa, mem-hotplug: Fix out-of-bounds array access in numa initialization.

Dave found a kernel crash during boot, caused by an out-of-bounds array
access. These two patches fix the problem.

Since I am in a hurry: if the changelogs are not good enough, I'll resend
a new version tomorrow.

Tang Chen (2):
numa, mem-hotplug: Initialize numa_kernel_nodes in
numa_clear_kernel_node_hotplug().
numa, mem-hotplug: Fix array index overflow when synchronizing nid to
memblock.reserved.

arch/x86/mm/numa.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)

--
1.8.3.1


2014-01-28 09:02:53

by Tang Chen

Subject: [PATCH 1/2] numa, mem-hotplug: Initialize numa_kernel_nodes in numa_clear_kernel_node_hotplug().

The on-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug()
was not initialized, so we need to initialize it.
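
A minimal user-space sketch of the hazard (illustrative only, not part of
the patch; this nodemask_t is a simplified stand-in for the kernel type):

#include <limits.h>
#include <stdio.h>
#include <string.h>

#define MAX_NUMNODES 1024
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Simplified stand-in for the kernel's nodemask_t: a fixed-size bitmap. */
typedef struct {
	unsigned long bits[(MAX_NUMNODES + BITS_PER_LONG - 1) / BITS_PER_LONG];
} nodemask_t;

int main(void)
{
	nodemask_t numa_kernel_nodes;	/* on the stack: contents undefined */

	/*
	 * The fix: zero the bitmap before setting any bits, which is what
	 * nodes_clear() does in the patch below. Without this, stray stack
	 * garbage makes random nodes look like kernel nodes.
	 */
	memset(&numa_kernel_nodes, 0, sizeof(numa_kernel_nodes));

	numa_kernel_nodes.bits[0] |= 1UL;	/* what node_set(0, mask) does */
	printf("word 0: %#lx\n", numa_kernel_nodes.bits[0]);
	return 0;
}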

Signed-off-by: Tang Chen <[email protected]>
Tested-by: Gu Zheng <[email protected]>
---
arch/x86/mm/numa.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 81b2750..00c9f09 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -569,6 +569,8 @@ static void __init numa_clear_kernel_node_hotplug(void)
unsigned long start, end;
struct memblock_type *type = &memblock.reserved;

+ nodes_clear(numa_kernel_nodes);
+
/* Mark all kernel nodes. */
for (i = 0; i < type->cnt; i++)
node_set(type->regions[i].nid, numa_kernel_nodes);
--
1.8.3.1

2014-01-28 09:03:59

by Tang Chen

Subject: [PATCH 2/2] numa, mem-hotplug: Fix array index overflow when synchronizing nid to memblock.reserved.

The following call path causes an out-of-bounds array access.

memblock_add_region() always sets the nid in memblock.reserved to
MAX_NUMNODES. In numa_register_memblks(), after we have set all the nids
in memblock.reserved to their correct values, we call setup_node_data(),
which uses memblock_alloc_nid() to allocate memory; the newly reserved
region again gets nid MAX_NUMNODES.

The nodemask_t type can be seen as a bit array whose valid indexes are
0 ~ MAX_NUMNODES-1.

So when node_set() is later called in numa_clear_kernel_node_hotplug(),
the nodemask_t is indexed with the value MAX_NUMNODES, which is outside
[0 ~ MAX_NUMNODES-1].

See below:

numa_init()
 |---> numa_register_memblks()
 |      |---> memblock_set_node(memory)    set correct nid in memblock.memory
 |      |---> memblock_set_node(reserved)  set correct nid in memblock.reserved
 |      |......
 |      |---> setup_node_data()
 |             |---> memblock_alloc_nid()  here, nid is set to MAX_NUMNODES (1024)
 |......
 |---> numa_clear_kernel_node_hotplug()
        |---> node_set()                   here, we have an index of 1024, out of bounds

This patch fixes the problem by moving the nid setting of memblock.reserved
into numa_clear_kernel_node_hotplug().
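
To make the overflow concrete, here is a small user-space sketch
(illustrative only; MAX_NUMNODES == 1024 and 64-bit longs are assumed):

#include <limits.h>
#include <stdio.h>

#define MAX_NUMNODES 1024
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_NUMNODES + BITS_PER_LONG - 1) / BITS_PER_LONG)

int main(void)
{
	/* The stale nid that memblock_add_region() leaves behind. */
	unsigned int nid = MAX_NUMNODES;
	unsigned int word = nid / BITS_PER_LONG;

	/*
	 * With 64-bit longs the bitmap has 16 words, indexes 0..15.
	 * Bit 1024 lands in word 16, one word past the end.
	 */
	printf("bitmap words: 0..%u\n", (unsigned int)(BITMAP_WORDS - 1));
	printf("node_set(%u) touches word %u -> %s\n", nid, word,
	       word < BITMAP_WORDS ? "in bounds" : "OUT OF BOUNDS");
	return 0;
}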

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Tang Chen <[email protected]>
Tested-by: Gu Zheng <[email protected]>
---
arch/x86/mm/numa.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 00c9f09..a183b43 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -493,14 +493,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
struct numa_memblk *mb = &mi->blk[i];
memblock_set_node(mb->start, mb->end - mb->start,
&memblock.memory, mb->nid);
-
- /*
- * At this time, all memory regions reserved by memblock are
- * used by the kernel. Set the nid in memblock.reserved will
- * mark out all the nodes the kernel resides in.
- */
- memblock_set_node(mb->start, mb->end - mb->start,
- &memblock.reserved, mb->nid);
}

/*
@@ -571,6 +563,17 @@ static void __init numa_clear_kernel_node_hotplug(void)

nodes_clear(numa_kernel_nodes);

+ /*
+ * At this time, all memory regions reserved by memblock are
+ * used by the kernel. Setting the nid in memblock.reserved marks
+ * out all the nodes the kernel resides in.
+ */
+ for (i = 0; i < numa_meminfo.nr_blks; i++) {
+ struct numa_memblk *mb = &numa_meminfo.blk[i];
+ memblock_set_node(mb->start, mb->end - mb->start,
+ &memblock.reserved, mb->nid);
+ }
+
/* Mark all kernel nodes. */
for (i = 0; i < type->cnt; i++)
node_set(type->regions[i].nid, numa_kernel_nodes);
--
1.8.3.1

2014-01-28 09:10:55

by David Rientjes

Subject: Re: [PATCH 1/2] numa, mem-hotplug: Initialize numa_kernel_nodes in numa_clear_kernel_node_hotplug().

On Tue, 28 Jan 2014, Tang Chen wrote:

> The on-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug()
> was not initialized, so we need to initialize it.
>
> Signed-off-by: Tang Chen <[email protected]>
> Tested-by: Gu Zheng <[email protected]>

Reported-by: David Rientjes <[email protected]>

> ---
> arch/x86/mm/numa.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index 81b2750..00c9f09 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -569,6 +569,8 @@ static void __init numa_clear_kernel_node_hotplug(void)
> unsigned long start, end;
> struct memblock_type *type = &memblock.reserved;
>
> + nodes_clear(numa_kernel_nodes);

Just initialize it with NODE_MASK_NONE.
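
For reference, the declaration-time form would look roughly like this (a
sketch against the kernel's nodemask API, not a tested patch):

static void __init numa_clear_kernel_node_hotplug(void)
{
	/*
	 * NODE_MASK_NONE is the kernel's all-bits-clear initializer for
	 * nodemask_t, so no separate nodes_clear() call is needed.
	 */
	nodemask_t numa_kernel_nodes = NODE_MASK_NONE;

	/* ... rest of the function unchanged ... */
}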

> +
> /* Mark all kernel nodes. */
> for (i = 0; i < type->cnt; i++)
> node_set(type->regions[i].nid, numa_kernel_nodes);

2014-01-28 11:48:53

by Ingo Molnar

Subject: Re: [PATCH 1/2] numa, mem-hotplug: Initialize numa_kernel_nodes in numa_clear_kernel_node_hotplug().


* David Rientjes <[email protected]> wrote:

> On Tue, 28 Jan 2014, Tang Chen wrote:
>
> > The on-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug()
> > was not initialized, so we need to initialize it.
> >
> > Signed-off-by: Tang Chen <[email protected]>
> > Tested-by: Gu Zheng <[email protected]>
>
> Reported-by: David Rientjes <[email protected]>

Agreed. Tang Chen, please also spell it out in the changelog:

David Rientjes reported a boot crash, caused by
commit XYZ ("foo: bar").

I find it somewhat annoying that you found time to credit a corporate
colleague with a Tested-by tag, who didn't even reply to the whole
thread to indicate his testing efforts, but you didn't find the time
to credit the original reporter of the bug who also reviewed your
patches ...

Thanks,

Ingo

2014-01-28 15:25:18

by Dave Jones

Subject: Re: [PATCH 2/2] numa, mem-hotplug: Fix array index overflow when synchronizing nid to memblock.reserved.

On Tue, Jan 28, 2014 at 05:05:16PM +0800, Tang Chen wrote:
> The following call path causes an out-of-bounds array access.
>
> memblock_add_region() always sets the nid in memblock.reserved to
> MAX_NUMNODES. In numa_register_memblks(), after we have set all the nids
> in memblock.reserved to their correct values, we call setup_node_data(),
> which uses memblock_alloc_nid() to allocate memory; the newly reserved
> region again gets nid MAX_NUMNODES.
>
> The nodemask_t type can be seen as a bit array whose valid indexes are
> 0 ~ MAX_NUMNODES-1.
>
> So when node_set() is later called in numa_clear_kernel_node_hotplug(),
> the nodemask_t is indexed with the value MAX_NUMNODES, which is outside
> [0 ~ MAX_NUMNODES-1].
>
> See below:
>
> numa_init()
>  |---> numa_register_memblks()
>  |      |---> memblock_set_node(memory)    set correct nid in memblock.memory
>  |      |---> memblock_set_node(reserved)  set correct nid in memblock.reserved
>  |      |......
>  |      |---> setup_node_data()
>  |             |---> memblock_alloc_nid()  here, nid is set to MAX_NUMNODES (1024)
>  |......
>  |---> numa_clear_kernel_node_hotplug()
>         |---> node_set()                   here, we have an index of 1024, out of bounds
>
> This patch fixes the problem by moving the nid setting of memblock.reserved
> into numa_clear_kernel_node_hotplug().
>
> Reported-by: Dave Jones <[email protected]>
> Signed-off-by: Tang Chen <[email protected]>
> Tested-by: Gu Zheng <[email protected]>
> ---
> arch/x86/mm/numa.c | 19 +++++++++++--------
> 1 file changed, 11 insertions(+), 8 deletions(-)

This does seem to solve the problem (in conjunction with David's variant of the other patch).

Dave

2014-01-28 23:33:41

by Tang Chen

Subject: Re: [PATCH 1/2] numa, mem-hotplug: Initialize numa_kernel_nodes in numa_clear_kernel_node_hotplug().

On 01/28/2014 07:48 PM, Ingo Molnar wrote:
>
> * David Rientjes <[email protected]> wrote:
>
>> On Tue, 28 Jan 2014, Tang Chen wrote:
>>
>>> The on-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug()
>>> was not initialized, so we need to initialize it.
>>>
>>> Signed-off-by: Tang Chen <[email protected]>
>>> Tested-by: Gu Zheng <[email protected]>
>>
>> Reported-by: David Rientjes <[email protected]>
>
> Agreed. Tang Chen, please also spell it out in the changelog:
>
> David Rientjes reported a boot crash, caused by
> commit XYZ ("foo: bar").
>
> I find it somewhat annoying that you found time to credit a corporate
> colleague with a Tested-by tag, who didn't even reply to the whole
> thread to indicate his testing efforts, but you didn't find the time
> to credit the original reporter of the bug who also reviewed your
> patches ...

Hi David, Ingo,

I'm sorry for missing the original reporter; I was paying too much
attention to the second patch. Andrew has added the missing info and
committed the patch. :)

Thanks.

>
> Thanks,
>
> Ingo
>

2014-01-29 01:40:46

by Gu Zheng

Subject: Re: [PATCH 1/2] numa, mem-hotplug: Initialize numa_kernel_nodes in numa_clear_kernel_node_hotplug().

Hi Ingo,
On 01/28/2014 07:48 PM, Ingo Molnar wrote:

>
> * David Rientjes <[email protected]> wrote:
>
>> On Tue, 28 Jan 2014, Tang Chen wrote:
>>
>>> The on-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug()
>>> was not initialized, so we need to initialize it.
>>>
>>> Signed-off-by: Tang Chen <[email protected]>
>>> Tested-by: Gu Zheng <[email protected]>
>>
>> Reported-by: David Rientjes <[email protected]>
>
> Agreed. Tang Chen, please also spell it out in the changelog:
>
> David Rientjes reported a boot crash, caused by
> commit XYZ ("foo: bar").
>
> I find it somewhat annoying that you found time to credit a corporate
> colleague with a Tested-by tag, who didn't even reply to the whole
> thread to indicate his testing efforts,

Sorry for failing to indicate the test result; I have been really busy with
some thorny issues. If Tang had not reminded me, I might have missed the
whole thread. But I think the "Tested-by:" tag is the best confirmation of
the testing effort: it indicates that the tester has tested the patch fairly
thoroughly in his environment.

Thanks,
Gu

> but you didn't find the time
> to credit the original reporter of the bug who also reviewed your
> patches ...
>
> Thanks,
>
> Ingo

2014-01-29 07:19:58

by Ingo Molnar

Subject: Re: [PATCH 1/2] numa, mem-hotplug: Initialize numa_kernel_nodes in numa_clear_kernel_node_hotplug().


* Gu Zheng <[email protected]> wrote:

> Hi Ingo,
> On 01/28/2014 07:48 PM, Ingo Molnar wrote:
>
> >
> > * David Rientjes <[email protected]> wrote:
> >
> >> On Tue, 28 Jan 2014, Tang Chen wrote:
> >>
> >>> The on-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug()
> >>> was not initialized, so we need to initialize it.
> >>>
> >>> Signed-off-by: Tang Chen <[email protected]>
> >>> Tested-by: Gu Zheng <[email protected]>
> >>
> >> Reported-by: David Rientjes <[email protected]>
> >
> > Agreed. Tang Chen, please also spell it out in the changelog:
> >
> > David Rientjes reported a boot crash, caused by
> > commit XYZ ("foo: bar").
> >
> > I find it somewhat annoying that you found time to credit a corporate
> > colleague with a Tested-by tag, who didn't even reply to the whole
> > thread to indicate his testing efforts,
>
> Sorry for failing to indicate the test result; I have been really
> busy with some thorny issues. If Tang had not reminded me, I might
> have missed the whole thread. But I think the "Tested-by:" tag is the
> best confirmation of the testing effort: it indicates that the tester
> has tested the patch fairly thoroughly in his environment.

Thanks for the effort!

Ingo

2014-02-04 00:55:47

by Josh Boyer

Subject: Re: [PATCH 2/2] numa, mem-hotplug: Fix array index overflow when synchronizing nid to memblock.reserved.

On Tue, Jan 28, 2014 at 10:24 AM, Dave Jones <[email protected]> wrote:
> On Tue, Jan 28, 2014 at 05:05:16PM +0800, Tang Chen wrote:
> > The following call path causes an out-of-bounds array access.
> >
> > memblock_add_region() always sets the nid in memblock.reserved to
> > MAX_NUMNODES. In numa_register_memblks(), after we have set all the nids
> > in memblock.reserved to their correct values, we call setup_node_data(),
> > which uses memblock_alloc_nid() to allocate memory; the newly reserved
> > region again gets nid MAX_NUMNODES.
> >
> > The nodemask_t type can be seen as a bit array whose valid indexes are
> > 0 ~ MAX_NUMNODES-1.
> >
> > So when node_set() is later called in numa_clear_kernel_node_hotplug(),
> > the nodemask_t is indexed with the value MAX_NUMNODES, which is outside
> > [0 ~ MAX_NUMNODES-1].
> >
> > See below:
> >
> > numa_init()
> >  |---> numa_register_memblks()
> >  |      |---> memblock_set_node(memory)    set correct nid in memblock.memory
> >  |      |---> memblock_set_node(reserved)  set correct nid in memblock.reserved
> >  |      |......
> >  |      |---> setup_node_data()
> >  |             |---> memblock_alloc_nid()  here, nid is set to MAX_NUMNODES (1024)
> >  |......
> >  |---> numa_clear_kernel_node_hotplug()
> >         |---> node_set()                   here, we have an index of 1024, out of bounds
> >
> > This patch fixes the problem by moving the nid setting of memblock.reserved
> > into numa_clear_kernel_node_hotplug().
> >
> > Reported-by: Dave Jones <[email protected]>
> > Signed-off-by: Tang Chen <[email protected]>
> > Tested-by: Gu Zheng <[email protected]>
> > ---
> > arch/x86/mm/numa.c | 19 +++++++++++--------
> > 1 file changed, 11 insertions(+), 8 deletions(-)
>
> This does seem to solve the problem (in conjunction with David's variant of the other patch).

Is this (and the first in the series) going to land in Linus' tree
soon? I don't see them in -rc1, and without these patches people are
still hitting the early boot oops Dave reported.

josh