2018-07-12 02:59:31

by Cannon Matthews

Subject: [PATCH v2] mm: hugetlb: don't zero 1GiB bootmem pages.

When using 1GiB pages during early boot, use the new
memblock_virt_alloc_try_nid_raw() function to allocate memory without
zeroing it. Zeroing out hundreds or thousands of GiB in a single-core
memset() call is very slow, and can make early boot last upwards of
20-30 minutes on multi-TiB machines.

The memory does not need to be zeroed as the hugetlb pages are always
zeroed on page fault.

Tested: Booted with ~3800 1G pages, and it booted successfully in
roughly the same amount of time as with 0, as opposed to the 25+
minutes it would take before.

Signed-off-by: Cannon Matthews <[email protected]>
---
v2: removed the memset of the huge_bootmem_page area and added
INIT_LIST_HEAD instead.
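
For readers following the v2 note, here is a minimal userspace sketch of
what the "found:" path looks like once the allocation is no longer
zeroed. It is an illustration only, not the kernel code: list_head,
INIT_LIST_HEAD() and list_add() are simplified stand-ins for the kernel
primitives, while alloc_raw(), track_boot_page() and the one-field
struct hstate are hypothetical names made up for the example. The point
is that only the small tracking header at the start of the page is
written, and the rest of the page keeps whatever it contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified userspace stand-ins for the kernel list primitives. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *l)
{
	l->next = l;
	l->prev = l;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	struct list_head *next = head->next;

	next->prev = entry;
	entry->next = next;
	entry->prev = head;
	head->next = entry;
}

/* Hypothetical: reduced to a single field for the example. */
struct hstate { unsigned long size; };

/* Tracking header stored at the start of each boot-time huge page. */
struct huge_bootmem_page {
	struct list_head list;
	struct hstate *hstate;
};

static struct list_head huge_boot_pages = { &huge_boot_pages, &huge_boot_pages };

/* Hypothetical helper: simulates a raw allocation by filling it with junk. */
static void *alloc_raw(size_t size)
{
	void *p = malloc(size);

	if (p)
		memset(p, 0xAA, size);
	return p;
}

/* Mirrors the "found:" path: initialise the header fields and nothing else. */
static struct huge_bootmem_page *track_boot_page(struct hstate *h, size_t size)
{
	struct huge_bootmem_page *m = alloc_raw(size);

	if (!m)
		return NULL;
	INIT_LIST_HEAD(&m->list);
	list_add(&m->list, &huge_boot_pages);
	m->hstate = h;
	return m;
}
```

In this sketch list_add() overwrites both pointers of the new entry, so
the explicit INIT_LIST_HEAD() is best read as making sure no header
field is ever left holding stale data, in place of the memset that v1
used on the header.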

mm/hugetlb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3612fbb32e9d..488330f23f04 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2101,7 +2101,7 @@ int __alloc_bootmem_huge_page(struct hstate *h)
 	for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) {
 		void *addr;
 
-		addr = memblock_virt_alloc_try_nid_nopanic(
+		addr = memblock_virt_alloc_try_nid_raw(
 				huge_page_size(h), huge_page_size(h),
 				0, BOOTMEM_ALLOC_ACCESSIBLE, node);
 		if (addr) {
@@ -2119,6 +2119,7 @@ int __alloc_bootmem_huge_page(struct hstate *h)
 found:
 	BUG_ON(!IS_ALIGNED(virt_to_phys(m), huge_page_size(h)));
 	/* Put them into a private list first because mem_map is not up yet */
+	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages);
 	m->hstate = h;
 	return 1;
--
2.18.0.203.gfac676dfb9-goog
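
As a rough sanity check on the numbers in the commit message, the
figures from the "Tested:" paragraph can be turned into an implied
zeroing rate. This is back-of-the-envelope arithmetic only. It assumes
that exactly 3800 GiB were zeroed and that the whole 25 minutes went
into the memset(), neither of which the message states precisely. The
function names are hypothetical.

```c
/* Implied single-core zeroing throughput, in GiB per second. */
static double zeroing_rate_gib_per_s(double gib_zeroed, double minutes)
{
	return gib_zeroed / (minutes * 60.0);
}

/* Boot delay in minutes for zeroing gib at the given rate (GiB/s). */
static double zeroing_minutes(double gib, double rate_gib_per_s)
{
	return gib / rate_gib_per_s / 60.0;
}
```

With the assumed inputs, zeroing_rate_gib_per_s(3800.0, 25.0) comes out
at a little over 2.5 GiB/s, and at that rate every additional TiB of
1GiB pages would add several minutes of boot time.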



2018-07-12 03:02:26

by Mike Kravetz

Subject: Re: [PATCH v2] mm: hugetlb: don't zero 1GiB bootmem pages.

On 07/11/2018 02:33 PM, Cannon Matthews wrote:
> When using 1GiB pages during early boot, use the new
> memblock_virt_alloc_try_nid_raw() function to allocate memory without
> zeroing it. Zeroing out hundreds or thousands of GiB in a single core
> memset() call is very slow, and can make early boot last upwards of
> 20-30 minutes on multi TiB machines.
>
> The memory does not need to be zero'd as the hugetlb pages are always
> zero'd on page fault.
>
> Tested: Booted with ~3800 1G pages, and it booted successfully in
> roughly the same amount of time as with 0, as opposed to the 25+
> minutes it would take before.
>
> Signed-off-by: Cannon Matthews <[email protected]>

Thanks,

Acked-by: Mike Kravetz <[email protected]>

--
Mike Kravetz

> ---
> v2: removed the memset of the huge_bootmem_page area and added
> INIT_LIST_HEAD instead.
>
> mm/hugetlb.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 3612fbb32e9d..488330f23f04 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2101,7 +2101,7 @@ int __alloc_bootmem_huge_page(struct hstate *h)
> for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) {
> void *addr;
>
> - addr = memblock_virt_alloc_try_nid_nopanic(
> + addr = memblock_virt_alloc_try_nid_raw(
> huge_page_size(h), huge_page_size(h),
> 0, BOOTMEM_ALLOC_ACCESSIBLE, node);
> if (addr) {
> @@ -2119,6 +2119,7 @@ int __alloc_bootmem_huge_page(struct hstate *h)
> found:
> BUG_ON(!IS_ALIGNED(virt_to_phys(m), huge_page_size(h)));
> /* Put them into a private list first because mem_map is not up yet */
> + INIT_LIST_HEAD(&m->list);
> list_add(&m->list, &huge_boot_pages);
> m->hstate = h;
> return 1;
> --
> 2.18.0.203.gfac676dfb9-goog
>

2018-07-12 07:49:42

by Michal Hocko

Subject: Re: [PATCH v2] mm: hugetlb: don't zero 1GiB bootmem pages.

On Wed 11-07-18 14:33:13, Cannon Matthews wrote:
> When using 1GiB pages during early boot, use the new
> memblock_virt_alloc_try_nid_raw() function to allocate memory without
> zeroing it. Zeroing out hundreds or thousands of GiB in a single core
> memset() call is very slow, and can make early boot last upwards of
> 20-30 minutes on multi TiB machines.
>
> The memory does not need to be zero'd as the hugetlb pages are always
> zero'd on page fault.
>
> Tested: Booted with ~3800 1G pages, and it booted successfully in
> roughly the same amount of time as with 0, as opposed to the 25+
> minutes it would take before.
>
> Signed-off-by: Cannon Matthews <[email protected]>

Thanks for the updated version.

Acked-by: Michal Hocko <[email protected]>

> ---
> v2: removed the memset of the huge_bootmem_page area and added
> INIT_LIST_HEAD instead.
>
> mm/hugetlb.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 3612fbb32e9d..488330f23f04 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2101,7 +2101,7 @@ int __alloc_bootmem_huge_page(struct hstate *h)
> for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) {
> void *addr;
>
> - addr = memblock_virt_alloc_try_nid_nopanic(
> + addr = memblock_virt_alloc_try_nid_raw(
> huge_page_size(h), huge_page_size(h),
> 0, BOOTMEM_ALLOC_ACCESSIBLE, node);
> if (addr) {
> @@ -2119,6 +2119,7 @@ int __alloc_bootmem_huge_page(struct hstate *h)
> found:
> BUG_ON(!IS_ALIGNED(virt_to_phys(m), huge_page_size(h)));
> /* Put them into a private list first because mem_map is not up yet */
> + INIT_LIST_HEAD(&m->list);
> list_add(&m->list, &huge_boot_pages);
> m->hstate = h;
> return 1;
> --
> 2.18.0.203.gfac676dfb9-goog

--
Michal Hocko
SUSE Labs