2015-06-08 06:46:04

by Zhu Guihua

Subject: [PATCH] mm/memory hotplug: print the last vmemmap region at the end of hot add memory

When hot-adding two nodes in succession, we found that the vmemmap region
info is a bit messed up. The last region of node 2 is printed when node 3 is
hot added, like the following:
Initmem setup node 2 [mem 0x0000000000000000-0xffffffffffffffff]
On node 2 totalpages: 0
Built 2 zonelists in Node order, mobility grouping on. Total pages: 16090539
Policy zone: Normal
init_memory_mapping: [mem 0x40000000000-0x407ffffffff]
[mem 0x40000000000-0x407ffffffff] page 1G
[ffffea1000000000-ffffea10001fffff] PMD -> [ffff8a077d800000-ffff8a077d9fffff] on node 2
[ffffea1000200000-ffffea10003fffff] PMD -> [ffff8a077de00000-ffff8a077dffffff] on node 2
...
[ffffea101f600000-ffffea101f9fffff] PMD -> [ffff8a074ac00000-ffff8a074affffff] on node 2
[ffffea101fa00000-ffffea101fdfffff] PMD -> [ffff8a074a800000-ffff8a074abfffff] on node 2
Initmem setup node 3 [mem 0x0000000000000000-0xffffffffffffffff]
On node 3 totalpages: 0
Built 3 zonelists in Node order, mobility grouping on. Total pages: 16090539
Policy zone: Normal
init_memory_mapping: [mem 0x60000000000-0x607ffffffff]
[mem 0x60000000000-0x607ffffffff] page 1G
[ffffea101fe00000-ffffea101fffffff] PMD -> [ffff8a074a400000-ffff8a074a5fffff] on node 2 <=== node 2 ???
[ffffea1800000000-ffffea18001fffff] PMD -> [ffff8a074a600000-ffff8a074a7fffff] on node 3
[ffffea1800200000-ffffea18005fffff] PMD -> [ffff8a074a000000-ffff8a074a3fffff] on node 3
[ffffea1800600000-ffffea18009fffff] PMD -> [ffff8a0749c00000-ffff8a0749ffffff] on node 3
...

The cause is that the last region was not printed at the end of hot add
memory, and p_start, p_end, node_start were not reset, so when memory is hot
added to a new node, the new blocks are seen as not contiguous with the
pending region and the previous one is printed. So we print the last vmemmap
region at the end of hot add memory to avoid the confusion.

Signed-off-by: Zhu Guihua <[email protected]>
---
mm/memory_hotplug.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 457bde5..58fb223 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -513,6 +513,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
 			break;
 		err = 0;
 	}
+	vmemmap_populate_print_last();
 
 	return err;
 }
--
1.9.3
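
The stale "on node 2" line described in the changelog can be reproduced with a small user-space model. This is only a sketch under assumptions, not the kernel code: it assumes the pending region is kept in static variables (the p_start, p_end and node_start named above) and that a region is printed only when the next block is not contiguous with it. The names note_block, print_last, hot_add, run_demo, BLOCK_SIZE and the two counters are hypothetical, and all addresses are made up.

```c
#include <assert.h>
#include <stdio.h>

#define BLOCK_SIZE 0x200000ULL	/* 2 MiB, stands in for one PMD mapping */

/* Pending region; persists across hot-add operations. */
static unsigned long long addr_start, addr_end;
static unsigned long long p_start, p_end;
static int node_start;
static int have_region;

/* Hypothetical bookkeeping so the behaviour can be checked. */
static int lines_printed;
static int last_printed_node = -1;

static void print_region(void)
{
	printf(" [%llx-%llx] PMD -> [%llx-%llx] on node %d\n",
	       addr_start, addr_end - 1, p_start, p_end - 1, node_start);
	lines_printed++;
	last_printed_node = node_start;
}

/* Record one mapped block; print the pending region only when a new one starts. */
static void note_block(unsigned long long addr, unsigned long long p, int node)
{
	if (!have_region || p_end != p || node_start != node) {
		if (have_region)
			print_region();
		addr_start = addr;
		p_start = p;
		node_start = node;
		have_region = 1;
	}
	addr_end = addr + BLOCK_SIZE;
	p_end = p + BLOCK_SIZE;
}

/* Models the flush the patch adds: print what is pending and reset the state. */
static void print_last(void)
{
	if (have_region) {
		print_region();
		have_region = 0;
		p_start = p_end = 0;
		node_start = 0;
	}
}

/* Hot add nblocks contiguous blocks; flush != 0 models the patched behaviour. */
static void hot_add(unsigned long long addr, unsigned long long p,
		    int node, int nblocks, int flush)
{
	for (int i = 0; i < nblocks; i++)
		note_block(addr + i * BLOCK_SIZE, p + i * BLOCK_SIZE, node);
	if (flush)
		print_last();
}

static void run_demo(void)
{
	/* Unpatched: node 2's region stays pending after its hot add... */
	hot_add(0x1000000000ULL, 0x4000000000ULL, 2, 4, 0);
	assert(lines_printed == 0);
	/* ...and is printed in the middle of node 3's hot add. */
	hot_add(0x1800000000ULL, 0x8000000000ULL, 3, 4, 0);
	assert(last_printed_node == 2);
	print_last();	/* drain node 3's pending region */
	/* Patched: the region is printed at the end of its own hot add. */
	hot_add(0x2000000000ULL, 0xc000000000ULL, 4, 4, 1);
	assert(last_printed_node == 4);
}
```

In this model, calling run_demo() from main() prints node 2's region only while node 3 is being added, which matches the confusing log line; with the flush enabled, each region is printed before the hot add returns.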


2015-06-08 09:02:18

by Naoya Horiguchi

Subject: Re: [PATCH] mm/memory hotplug: print the last vmemmap region at the end of hot add memory

On Mon, Jun 08, 2015 at 02:44:41PM +0800, Zhu Guihua wrote:
> When hot-adding two nodes in succession, we found that the vmemmap region
> info is a bit messed up. The last region of node 2 is printed when node 3 is
> hot added, like the following:
> Initmem setup node 2 [mem 0x0000000000000000-0xffffffffffffffff]
> On node 2 totalpages: 0
> Built 2 zonelists in Node order, mobility grouping on. Total pages: 16090539
> Policy zone: Normal
> init_memory_mapping: [mem 0x40000000000-0x407ffffffff]
> [mem 0x40000000000-0x407ffffffff] page 1G
> [ffffea1000000000-ffffea10001fffff] PMD -> [ffff8a077d800000-ffff8a077d9fffff] on node 2
> [ffffea1000200000-ffffea10003fffff] PMD -> [ffff8a077de00000-ffff8a077dffffff] on node 2
> ...
> [ffffea101f600000-ffffea101f9fffff] PMD -> [ffff8a074ac00000-ffff8a074affffff] on node 2
> [ffffea101fa00000-ffffea101fdfffff] PMD -> [ffff8a074a800000-ffff8a074abfffff] on node 2
> Initmem setup node 3 [mem 0x0000000000000000-0xffffffffffffffff]
> On node 3 totalpages: 0
> Built 3 zonelists in Node order, mobility grouping on. Total pages: 16090539
> Policy zone: Normal
> init_memory_mapping: [mem 0x60000000000-0x607ffffffff]
> [mem 0x60000000000-0x607ffffffff] page 1G
> [ffffea101fe00000-ffffea101fffffff] PMD -> [ffff8a074a400000-ffff8a074a5fffff] on node 2 <=== node 2 ???
> [ffffea1800000000-ffffea18001fffff] PMD -> [ffff8a074a600000-ffff8a074a7fffff] on node 3
> [ffffea1800200000-ffffea18005fffff] PMD -> [ffff8a074a000000-ffff8a074a3fffff] on node 3
> [ffffea1800600000-ffffea18009fffff] PMD -> [ffff8a0749c00000-ffff8a0749ffffff] on node 3
> ...
>
> The cause is that the last region was not printed at the end of hot add
> memory, and p_start, p_end, node_start were not reset, so when memory is hot
> added to a new node, the new blocks are seen as not contiguous with the
> pending region and the previous one is printed. So we print the last vmemmap
> region at the end of hot add memory to avoid the confusion.
>
> Signed-off-by: Zhu Guihua <[email protected]>

Looks good to me.

Reviewed-by: Naoya Horiguchi <[email protected]>

> ---
> mm/memory_hotplug.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 457bde5..58fb223 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -513,6 +513,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
> break;
> err = 0;
> }
> + vmemmap_populate_print_last();
>
> return err;
> }
> --
> 1.9.3

2015-06-08 23:31:16

by Andrew Morton

Subject: Re: [PATCH] mm/memory hotplug: print the last vmemmap region at the end of hot add memory

On Mon, 8 Jun 2015 14:44:41 +0800 Zhu Guihua <[email protected]> wrote:

> When hot-adding two nodes in succession, we found that the vmemmap region
> info is a bit messed up. The last region of node 2 is printed when node 3 is
> hot added, like the following:
> Initmem setup node 2 [mem 0x0000000000000000-0xffffffffffffffff]
> On node 2 totalpages: 0
> Built 2 zonelists in Node order, mobility grouping on. Total pages: 16090539
> Policy zone: Normal
> init_memory_mapping: [mem 0x40000000000-0x407ffffffff]
> [mem 0x40000000000-0x407ffffffff] page 1G
> [ffffea1000000000-ffffea10001fffff] PMD -> [ffff8a077d800000-ffff8a077d9fffff] on node 2
> [ffffea1000200000-ffffea10003fffff] PMD -> [ffff8a077de00000-ffff8a077dffffff] on node 2
> ...
> [ffffea101f600000-ffffea101f9fffff] PMD -> [ffff8a074ac00000-ffff8a074affffff] on node 2
> [ffffea101fa00000-ffffea101fdfffff] PMD -> [ffff8a074a800000-ffff8a074abfffff] on node 2
> Initmem setup node 3 [mem 0x0000000000000000-0xffffffffffffffff]
> On node 3 totalpages: 0
> Built 3 zonelists in Node order, mobility grouping on. Total pages: 16090539
> Policy zone: Normal
> init_memory_mapping: [mem 0x60000000000-0x607ffffffff]
> [mem 0x60000000000-0x607ffffffff] page 1G
> [ffffea101fe00000-ffffea101fffffff] PMD -> [ffff8a074a400000-ffff8a074a5fffff] on node 2 <=== node 2 ???
> [ffffea1800000000-ffffea18001fffff] PMD -> [ffff8a074a600000-ffff8a074a7fffff] on node 3
> [ffffea1800200000-ffffea18005fffff] PMD -> [ffff8a074a000000-ffff8a074a3fffff] on node 3
> [ffffea1800600000-ffffea18009fffff] PMD -> [ffff8a0749c00000-ffff8a0749ffffff] on node 3
> ...
>
> The cause is that the last region was not printed at the end of hot add
> memory, and p_start, p_end, node_start were not reset, so when memory is hot
> added to a new node, the new blocks are seen as not contiguous with the
> pending region and the previous one is printed. So we print the last vmemmap
> region at the end of hot add memory to avoid the confusion.
>
> ...
>
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -513,6 +513,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
> break;
> err = 0;
> }
> + vmemmap_populate_print_last();
>
> return err;
> }

vmemmap_populate_print_last() is only available on x86_64, when
CONFIG_SPARSEMEM_VMEMMAP=y. Are you sure this won't break builds?

2015-06-09 03:42:22

by Zhu Guihua

Subject: Re: [PATCH] mm/memory hotplug: print the last vmemmap region at the end of hot add memory


On 06/09/2015 07:30 AM, Andrew Morton wrote:
> On Mon, 8 Jun 2015 14:44:41 +0800 Zhu Guihua <[email protected]> wrote:
>
>> When hot-adding two nodes in succession, we found that the vmemmap region
>> info is a bit messed up. The last region of node 2 is printed when node 3 is
>> hot added, like the following:
>> Initmem setup node 2 [mem 0x0000000000000000-0xffffffffffffffff]
>> On node 2 totalpages: 0
>> Built 2 zonelists in Node order, mobility grouping on. Total pages: 16090539
>> Policy zone: Normal
>> init_memory_mapping: [mem 0x40000000000-0x407ffffffff]
>> [mem 0x40000000000-0x407ffffffff] page 1G
>> [ffffea1000000000-ffffea10001fffff] PMD -> [ffff8a077d800000-ffff8a077d9fffff] on node 2
>> [ffffea1000200000-ffffea10003fffff] PMD -> [ffff8a077de00000-ffff8a077dffffff] on node 2
>> ...
>> [ffffea101f600000-ffffea101f9fffff] PMD -> [ffff8a074ac00000-ffff8a074affffff] on node 2
>> [ffffea101fa00000-ffffea101fdfffff] PMD -> [ffff8a074a800000-ffff8a074abfffff] on node 2
>> Initmem setup node 3 [mem 0x0000000000000000-0xffffffffffffffff]
>> On node 3 totalpages: 0
>> Built 3 zonelists in Node order, mobility grouping on. Total pages: 16090539
>> Policy zone: Normal
>> init_memory_mapping: [mem 0x60000000000-0x607ffffffff]
>> [mem 0x60000000000-0x607ffffffff] page 1G
>> [ffffea101fe00000-ffffea101fffffff] PMD -> [ffff8a074a400000-ffff8a074a5fffff] on node 2 <=== node 2 ???
>> [ffffea1800000000-ffffea18001fffff] PMD -> [ffff8a074a600000-ffff8a074a7fffff] on node 3
>> [ffffea1800200000-ffffea18005fffff] PMD -> [ffff8a074a000000-ffff8a074a3fffff] on node 3
>> [ffffea1800600000-ffffea18009fffff] PMD -> [ffff8a0749c00000-ffff8a0749ffffff] on node 3
>> ...
>>
>> The cause is that the last region was not printed at the end of hot add
>> memory, and p_start, p_end, node_start were not reset, so when memory is hot
>> added to a new node, the new blocks are seen as not contiguous with the
>> pending region and the previous one is printed. So we print the last vmemmap
>> region at the end of hot add memory to avoid the confusion.
>>
>> ...
>>
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -513,6 +513,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
>> break;
>> err = 0;
>> }
>> + vmemmap_populate_print_last();
>>
>> return err;
>> }
> vmemmap_populate_print_last() is only available on x86_64, when
> CONFIG_SPARSEMEM_VMEMMAP=y. Are you sure this won't break builds?

I tried this on i386 and on x86_64 when CONFIG_SPARSEMEM_VMEMMAP=n,
it builds ok.

Thanks,
Zhu


2015-06-09 20:29:17

by Andrew Morton

Subject: Re: [PATCH] mm/memory hotplug: print the last vmemmap region at the end of hot add memory

On Tue, 9 Jun 2015 11:41:28 +0800 Zhu Guihua <[email protected]> wrote:

> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -513,6 +513,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
> >> break;
> >> err = 0;
> >> }
> >> + vmemmap_populate_print_last();
> >>
> >> return err;
> >> }
> > vmemmap_populate_print_last() is only available on x86_64, when
> > CONFIG_SPARSEMEM_VMEMMAP=y. Are you sure this won't break builds?
>
> I tried this on i386 and on x86_64 when CONFIG_SPARSEMEM_VMEMMAP=n,
> it builds ok.

With powerpc:

akpm3:/usr/src/25> make allmodconfig
akpm3:/usr/src/25> make mm/memory_hotplug.o
akpm3:/usr/src/25> nm mm/memory_hotplug.o | grep vmemmap_populate_print_last
U .vmemmap_populate_print_last
akpm3:/usr/src/25> grep -r vmemmap_populate_print_last arch/powerpc
akpm3:/usr/src/25>

So I think that's going to break.

I expect ia64 will break also, but I didn't investigate.

2015-06-11 10:44:09

by Zhu Guihua

Subject: Re: [PATCH] mm/memory hotplug: print the last vmemmap region at the end of hot add memory


On 06/10/2015 04:29 AM, Andrew Morton wrote:
> On Tue, 9 Jun 2015 11:41:28 +0800 Zhu Guihua <[email protected]> wrote:
>
>>>> --- a/mm/memory_hotplug.c
>>>> +++ b/mm/memory_hotplug.c
>>>> @@ -513,6 +513,7 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
>>>> break;
>>>> err = 0;
>>>> }
>>>> + vmemmap_populate_print_last();
>>>>
>>>> return err;
>>>> }
>>> vmemmap_populate_print_last() is only available on x86_64, when
>>> CONFIG_SPARSEMEM_VMEMMAP=y. Are you sure this won't break builds?
>> I tried this on i386 and on x86_64 when CONFIG_SPARSEMEM_VMEMMAP=n,
>> it builds ok.
> With powerpc:
>
> akpm3:/usr/src/25> make allmodconfig
> akpm3:/usr/src/25> make mm/memory_hotplug.o
> akpm3:/usr/src/25> nm mm/memory_hotplug.o | grep vmemmap_populate_print_last
> U .vmemmap_populate_print_last
> akpm3:/usr/src/25> grep -r vmemmap_populate_print_last arch/powerpc
> akpm3:/usr/src/25>
>
> So I think that's going to break.
>
> I expect ia64 will break also, but I didn't investigate.

There is
void __weak __meminit vmemmap_populate_print_last(void)
in mm/sparse.c, so I think this won't break builds.

And I found that the function is already called from
void __init sparse_init(void), even without CONFIG_SPARSEMEM_VMEMMAP=y.

I also tried this on arm, and it builds ok too.
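
The weak-symbol mechanism referred to here can be sketched in user space. This is an illustration under assumptions, not the kernel source: WEAK_SYM is a hypothetical stand-in for the kernel's __weak annotation (it uses the GCC/Clang weak attribute), the fallback models the generic definition said to live in mm/sparse.c (here it only counts calls so the effect is observable), and add_pages_sketch, check_fallback and default_calls are made-up names.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's __weak (GCC/Clang attribute). */
#define WEAK_SYM __attribute__((weak))

/* Counts how often the generic fallback ran (demo only). */
static int default_calls;

/*
 * Generic fallback. If another object file provides a non-weak
 * definition with the same name, the linker uses that one instead;
 * otherwise this one satisfies the reference, so the link succeeds.
 */
void WEAK_SYM vmemmap_populate_print_last(void)
{
	default_calls++;
}

/* Caller modelled on the patched function: it calls the hook unconditionally. */
int add_pages_sketch(void)
{
	int err = 0;

	/* ... per-section work elided ... */
	vmemmap_populate_print_last();

	return err;
}

/* With no strong definition linked in, the fallback is what gets called. */
void check_fallback(void)
{
	int before = default_calls;

	assert(add_pages_sketch() == 0);
	assert(default_calls == before + 1);
}
```

In this model, an undefined reference in one object file (the "U" that nm reports) is resolved at link time by the weak definition unless an architecture supplies its own version.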