This series aims to simplify hugetlb vmemmap and improve its readability
and is based on next-20220610.
Muchun Song (6):
  mm: hugetlb_vmemmap: delete hugetlb_optimize_vmemmap_enabled()
  mm: hugetlb_vmemmap: optimize vmemmap_optimize_mode handling
  mm: hugetlb_vmemmap: introduce the name HVO
  mm: hugetlb_vmemmap: move vmemmap code related to HugeTLB to
    hugetlb_vmemmap.c
  mm: hugetlb_vmemmap: replace early_param() with core_param()
  mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability
Documentation/admin-guide/kernel-parameters.txt | 7 +-
Documentation/admin-guide/mm/hugetlbpage.rst | 3 +-
Documentation/admin-guide/sysctl/vm.rst | 3 +-
arch/arm64/mm/flush.c | 13 +-
fs/Kconfig | 13 +-
include/linux/hugetlb.h | 7 +-
include/linux/mm.h | 7 -
include/linux/page-flags.h | 16 +-
mm/hugetlb.c | 11 +-
mm/hugetlb_vmemmap.c | 592 ++++++++++++++++++------
mm/hugetlb_vmemmap.h | 43 +-
mm/sparse-vmemmap.c | 391 ----------------
12 files changed, 509 insertions(+), 597 deletions(-)
base-commit: 6d0c806803170f120f8cb97b321de7bd89d3a791
--
2.11.0
It is inconvenient to mention the feature of optimizing vmemmap pages associated
with HugeTLB pages when communicating with others, since it has had no specific
or abbreviated name since it was first introduced. Let us give it the name HVO
(HugeTLB Vmemmap Optimization) from now on.
This commit also updates the documentation of "hugetlb_free_vmemmap" in the way
discussed in thread [1].
Link: https://lore.kernel.org/all/[email protected]/ [1]
Signed-off-by: Muchun Song <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 7 ++++---
Documentation/admin-guide/mm/hugetlbpage.rst | 3 +--
Documentation/admin-guide/sysctl/vm.rst | 3 +--
fs/Kconfig | 13 ++++++-------
mm/hugetlb_vmemmap.c | 8 ++++----
mm/hugetlb_vmemmap.h | 4 ++--
6 files changed, 18 insertions(+), 20 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 391b43fee93e..7539553b3fb0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1725,12 +1725,13 @@
hugetlb_free_vmemmap=
[KNL] Reguires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
enabled.
+ Control if HugeTLB Vmemmap Optimization (HVO) is enabled.
Allows heavy hugetlb users to free up some more
memory (7 * PAGE_SIZE for each 2MB hugetlb page).
- Format: { [oO][Nn]/Y/y/1 | [oO][Ff]/N/n/0 (default) }
+ Format: { on | off (default) }
- [oO][Nn]/Y/y/1: enable the feature
- [oO][Ff]/N/n/0: disable the feature
+ on: enable HVO
+ off: disable HVO
Built with CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=y,
the default is on.
diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index a90330d0a837..64e0d5c512e7 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -164,8 +164,7 @@ default_hugepagesz
will all result in 256 2M huge pages being allocated. Valid default
huge page size is architecture dependent.
hugetlb_free_vmemmap
- When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables optimizing
- unused vmemmap pages associated with each HugeTLB page.
+ When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables HVO.
When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
indicates the current number of pre-allocated huge pages of the default size.
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index d7374a1e8ac9..c9f35db973f0 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -569,8 +569,7 @@ This knob is not available when the size of 'struct page' (a structure defined
in include/linux/mm_types.h) is not power of two (an unusual system config could
result in this).
-Enable (set to 1) or disable (set to 0) the feature of optimizing vmemmap pages
-associated with each HugeTLB page.
+Enable (set to 1) or disable (set to 0) HugeTLB Vmemmap Optimization (HVO).
Once enabled, the vmemmap pages of subsequent allocation of HugeTLB pages from
buddy allocator will be optimized (7 pages per 2MB HugeTLB page and 4095 pages
diff --git a/fs/Kconfig b/fs/Kconfig
index 5976eb33535f..2f9fd840cb66 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -247,8 +247,7 @@ config HUGETLB_PAGE
#
# Select this config option from the architecture Kconfig, if it is preferred
-# to enable the feature of minimizing overhead of struct page associated with
-# each HugeTLB page.
+# to enable the feature of HugeTLB Vmemmap Optimization (HVO).
#
config ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
bool
@@ -259,14 +258,14 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
depends on SPARSEMEM_VMEMMAP
config HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON
- bool "Default optimizing vmemmap pages of HugeTLB to on"
+ bool "Default HugeTLB Vmemmap Optimization (HVO) to on"
default n
depends on HUGETLB_PAGE_OPTIMIZE_VMEMMAP
help
- When using HUGETLB_PAGE_OPTIMIZE_VMEMMAP, the optimizing unused vmemmap
- pages associated with each HugeTLB page is default off. Say Y here
- to enable optimizing vmemmap pages of HugeTLB by default. It can then
- be disabled on the command line via hugetlb_free_vmemmap=off.
+ When using HUGETLB_PAGE_OPTIMIZE_VMEMMAP, the HugeTLB Vmemmap
+ Optimization (HVO) is off by default. Say Y here to enable HVO
+ by default. It can then be disabled on the command line via
+ hugetlb_free_vmemmap=off or sysctl.
config MEMFD_CREATE
def_bool TMPFS || HUGETLBFS
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 132dc83f0130..c10540993577 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -1,8 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
/*
- * Optimize vmemmap pages associated with HugeTLB
+ * HugeTLB Vmemmap Optimization (HVO)
*
- * Copyright (c) 2020, Bytedance. All rights reserved.
+ * Copyright (c) 2020, ByteDance. All rights reserved.
*
* Author: Muchun Song <[email protected]>
*
@@ -120,8 +120,8 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
/*
* There are only (RESERVE_VMEMMAP_SIZE / sizeof(struct page)) struct
- * page structs that can be used when CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP,
- * so add a BUILD_BUG_ON to catch invalid usage of the tail struct page.
+ * page structs that can be used when HVO is enabled, add a BUILD_BUG_ON
+ * to catch invalid usage of the tail page structs.
*/
BUILD_BUG_ON(__NR_USED_SUBPAGE >=
RESERVE_VMEMMAP_SIZE / sizeof(struct page));
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 109b0a53b6fe..ba66fadad9fc 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -1,8 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
/*
- * Optimize vmemmap pages associated with HugeTLB
+ * HugeTLB Vmemmap Optimization (HVO)
*
- * Copyright (c) 2020, Bytedance. All rights reserved.
+ * Copyright (c) 2020, ByteDance. All rights reserved.
*
* Author: Muchun Song <[email protected]>
*/
--
2.11.0
We hold another reference to hugetlb_optimize_vmemmap_key when
switching vmemmap_optimize_mode on, because we used the static_key to
tell memory_hotplug that memory_hotplug.memmap_on_memory should be
overridden. However, this rule went away when we introduced
SECTION_CANNOT_OPTIMIZE_VMEMMAP. Therefore, we can simplify
vmemmap_optimize_mode handling by not holding another reference
to hugetlb_optimize_vmemmap_key.
Signed-off-by: Muchun Song <[email protected]>
---
include/linux/page-flags.h | 6 ++---
mm/hugetlb_vmemmap.c | 65 +++++-----------------------------------------
2 files changed, 9 insertions(+), 62 deletions(-)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index b8b992cb201c..da7ccc3b16ad 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -200,8 +200,7 @@ enum pageflags {
#ifndef __GENERATING_BOUNDS_H
#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-DECLARE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
- hugetlb_optimize_vmemmap_key);
+DECLARE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
/*
* If the feature of optimizing vmemmap pages associated with each HugeTLB
@@ -221,8 +220,7 @@ DECLARE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
*/
static __always_inline const struct page *page_fixed_fake_head(const struct page *page)
{
- if (!static_branch_maybe(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
- &hugetlb_optimize_vmemmap_key))
+ if (!static_branch_unlikely(&hugetlb_optimize_vmemmap_key))
return page;
/*
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index e20a7082f2f8..132dc83f0130 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -23,42 +23,15 @@
#define RESERVE_VMEMMAP_NR 1U
#define RESERVE_VMEMMAP_SIZE (RESERVE_VMEMMAP_NR << PAGE_SHIFT)
-enum vmemmap_optimize_mode {
- VMEMMAP_OPTIMIZE_OFF,
- VMEMMAP_OPTIMIZE_ON,
-};
-
-DEFINE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
- hugetlb_optimize_vmemmap_key);
+DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);
-static enum vmemmap_optimize_mode vmemmap_optimize_mode =
+static bool vmemmap_optimize_enabled =
IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
-static void vmemmap_optimize_mode_switch(enum vmemmap_optimize_mode to)
-{
- if (vmemmap_optimize_mode == to)
- return;
-
- if (to == VMEMMAP_OPTIMIZE_OFF)
- static_branch_dec(&hugetlb_optimize_vmemmap_key);
- else
- static_branch_inc(&hugetlb_optimize_vmemmap_key);
- WRITE_ONCE(vmemmap_optimize_mode, to);
-}
-
static int __init hugetlb_vmemmap_early_param(char *buf)
{
- bool enable;
- enum vmemmap_optimize_mode mode;
-
- if (kstrtobool(buf, &enable))
- return -EINVAL;
-
- mode = enable ? VMEMMAP_OPTIMIZE_ON : VMEMMAP_OPTIMIZE_OFF;
- vmemmap_optimize_mode_switch(mode);
-
- return 0;
+ return kstrtobool(buf, &vmemmap_optimize_enabled);
}
early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_early_param);
@@ -103,7 +76,7 @@ static unsigned int optimizable_vmemmap_pages(struct hstate *h,
unsigned long pfn = page_to_pfn(head);
unsigned long end = pfn + pages_per_huge_page(h);
- if (READ_ONCE(vmemmap_optimize_mode) == VMEMMAP_OPTIMIZE_OFF)
+ if (!READ_ONCE(vmemmap_optimize_enabled))
return 0;
for (; pfn < end; pfn += PAGES_PER_SECTION) {
@@ -155,7 +128,6 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
if (!is_power_of_2(sizeof(struct page))) {
pr_warn_once("cannot optimize vmemmap pages because \"struct page\" crosses page boundaries\n");
- static_branch_disable(&hugetlb_optimize_vmemmap_key);
return;
}
@@ -176,36 +148,13 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
}
#ifdef CONFIG_PROC_SYSCTL
-static int hugetlb_optimize_vmemmap_handler(struct ctl_table *table, int write,
- void *buffer, size_t *length,
- loff_t *ppos)
-{
- int ret;
- enum vmemmap_optimize_mode mode;
- static DEFINE_MUTEX(sysctl_mutex);
-
- if (write && !capable(CAP_SYS_ADMIN))
- return -EPERM;
-
- mutex_lock(&sysctl_mutex);
- mode = vmemmap_optimize_mode;
- table->data = &mode;
- ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
- if (write && !ret)
- vmemmap_optimize_mode_switch(mode);
- mutex_unlock(&sysctl_mutex);
-
- return ret;
-}
-
static struct ctl_table hugetlb_vmemmap_sysctls[] = {
{
.procname = "hugetlb_optimize_vmemmap",
- .maxlen = sizeof(enum vmemmap_optimize_mode),
+ .data = &vmemmap_optimize_enabled,
+ .maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = hugetlb_optimize_vmemmap_handler,
- .extra1 = SYSCTL_ZERO,
- .extra2 = SYSCTL_ONE,
+ .proc_handler = proc_dobool,
},
{ }
};
--
2.11.0
On Mon, Jun 13, 2022 at 02:35:08PM +0800, Muchun Song wrote:
> We hold an another reference to hugetlb_optimize_vmemmap_key when
> making vmemmap_optimize_mode on, because we use static_key to tell
> memory_hotplug that memory_hotplug.memmap_on_memory should be
> overridden. However, this rule has gone when we have introduced
> SECTION_CANNOT_OPTIMIZE_VMEMMAP. Therefore, we could simplify
> vmemmap_optimize_mode handling by not holding an another reference
> to hugetlb_optimize_vmemmap_key.
>
> Signed-off-by: Muchun Song <[email protected]>
LGTM, and it looks way nicer, so
Reviewed-by: Oscar Salvador <[email protected]>
One question below though
> -static enum vmemmap_optimize_mode vmemmap_optimize_mode =
> +static bool vmemmap_optimize_enabled =
> IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
So, by default vmemmap_optimize_enabled will be on if we have
CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON, but we can always override that
via cmdline, as below, right?
>
> -static void vmemmap_optimize_mode_switch(enum vmemmap_optimize_mode to)
> -{
> - if (vmemmap_optimize_mode == to)
> - return;
> -
> - if (to == VMEMMAP_OPTIMIZE_OFF)
> - static_branch_dec(&hugetlb_optimize_vmemmap_key);
> - else
> - static_branch_inc(&hugetlb_optimize_vmemmap_key);
> - WRITE_ONCE(vmemmap_optimize_mode, to);
> -}
> -
> static int __init hugetlb_vmemmap_early_param(char *buf)
> {
> - bool enable;
> - enum vmemmap_optimize_mode mode;
> -
> - if (kstrtobool(buf, &enable))
> - return -EINVAL;
> -
> - mode = enable ? VMEMMAP_OPTIMIZE_ON : VMEMMAP_OPTIMIZE_OFF;
> - vmemmap_optimize_mode_switch(mode);
> -
> - return 0;
> + return kstrtobool(buf, &vmemmap_optimize_enabled);
> }
> early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_early_param);
>
> @@ -103,7 +76,7 @@ static unsigned int optimizable_vmemmap_pages(struct hstate *h,
> unsigned long pfn = page_to_pfn(head);
> unsigned long end = pfn + pages_per_huge_page(h);
>
> - if (READ_ONCE(vmemmap_optimize_mode) == VMEMMAP_OPTIMIZE_OFF)
> + if (!READ_ONCE(vmemmap_optimize_enabled))
> return 0;
>
> for (; pfn < end; pfn += PAGES_PER_SECTION) {
> @@ -155,7 +128,6 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>
> if (!is_power_of_2(sizeof(struct page))) {
> pr_warn_once("cannot optimize vmemmap pages because \"struct page\" crosses page boundaries\n");
> - static_branch_disable(&hugetlb_optimize_vmemmap_key);
> return;
> }
>
> @@ -176,36 +148,13 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> }
>
> #ifdef CONFIG_PROC_SYSCTL
> -static int hugetlb_optimize_vmemmap_handler(struct ctl_table *table, int write,
> - void *buffer, size_t *length,
> - loff_t *ppos)
> -{
> - int ret;
> - enum vmemmap_optimize_mode mode;
> - static DEFINE_MUTEX(sysctl_mutex);
> -
> - if (write && !capable(CAP_SYS_ADMIN))
> - return -EPERM;
> -
> - mutex_lock(&sysctl_mutex);
> - mode = vmemmap_optimize_mode;
> - table->data = &mode;
> - ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
> - if (write && !ret)
> - vmemmap_optimize_mode_switch(mode);
> - mutex_unlock(&sysctl_mutex);
> -
> - return ret;
> -}
> -
> static struct ctl_table hugetlb_vmemmap_sysctls[] = {
> {
> .procname = "hugetlb_optimize_vmemmap",
> - .maxlen = sizeof(enum vmemmap_optimize_mode),
> + .data = &vmemmap_optimize_enabled,
> + .maxlen = sizeof(int),
> .mode = 0644,
> - .proc_handler = hugetlb_optimize_vmemmap_handler,
> - .extra1 = SYSCTL_ZERO,
> - .extra2 = SYSCTL_ONE,
> + .proc_handler = proc_dobool,
> },
> { }
> };
> --
> 2.11.0
>
>
--
Oscar Salvador
SUSE Labs
On Mon, Jun 13, 2022 at 02:35:09PM +0800, Muchun Song wrote:
> It it inconvenient to mention the feature of optimizing vmemmap pages associated
> with HugeTLB pages when communicating with others since there is no specific or
> abbreviated name for it when it is first introduced. Let us give it a name HVO
> (HugeTLB Vmemmap Optimization) from now.
>
> This commit also updates the document about "hugetlb_free_vmemmap" by the way
> discussed in thread [1].
>
> Link: https://lore.kernel.org/all/[email protected]/ [1]
> Signed-off-by: Muchun Song <[email protected]>
For Documentation/admin-guide/kernel-parameters.txt, I think it gets much
clearer.
About the name, I do not have a strong opinion.
Reviewed-by: Oscar Salvador <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 7 ++++---
> Documentation/admin-guide/mm/hugetlbpage.rst | 3 +--
> Documentation/admin-guide/sysctl/vm.rst | 3 +--
> fs/Kconfig | 13 ++++++-------
> mm/hugetlb_vmemmap.c | 8 ++++----
> mm/hugetlb_vmemmap.h | 4 ++--
> 6 files changed, 18 insertions(+), 20 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 391b43fee93e..7539553b3fb0 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1725,12 +1725,13 @@
> hugetlb_free_vmemmap=
> [KNL] Reguires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
> enabled.
> + Control if HugeTLB Vmemmap Optimization (HVO) is enabled.
> Allows heavy hugetlb users to free up some more
> memory (7 * PAGE_SIZE for each 2MB hugetlb page).
> - Format: { [oO][Nn]/Y/y/1 | [oO][Ff]/N/n/0 (default) }
> + Format: { on | off (default) }
>
> - [oO][Nn]/Y/y/1: enable the feature
> - [oO][Ff]/N/n/0: disable the feature
> + on: enable HVO
> + off: disable HVO
>
> Built with CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=y,
> the default is on.
> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> index a90330d0a837..64e0d5c512e7 100644
> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> @@ -164,8 +164,7 @@ default_hugepagesz
> will all result in 256 2M huge pages being allocated. Valid default
> huge page size is architecture dependent.
> hugetlb_free_vmemmap
> - When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables optimizing
> - unused vmemmap pages associated with each HugeTLB page.
> + When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables HVO.
>
> When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
> indicates the current number of pre-allocated huge pages of the default size.
> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> index d7374a1e8ac9..c9f35db973f0 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -569,8 +569,7 @@ This knob is not available when the size of 'struct page' (a structure defined
> in include/linux/mm_types.h) is not power of two (an unusual system config could
> result in this).
>
> -Enable (set to 1) or disable (set to 0) the feature of optimizing vmemmap pages
> -associated with each HugeTLB page.
> +Enable (set to 1) or disable (set to 0) HugeTLB Vmemmap Optimization (HVO).
>
> Once enabled, the vmemmap pages of subsequent allocation of HugeTLB pages from
> buddy allocator will be optimized (7 pages per 2MB HugeTLB page and 4095 pages
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 5976eb33535f..2f9fd840cb66 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -247,8 +247,7 @@ config HUGETLB_PAGE
>
> #
> # Select this config option from the architecture Kconfig, if it is preferred
> -# to enable the feature of minimizing overhead of struct page associated with
> -# each HugeTLB page.
> +# to enable the feature of HugeTLB Vmemmap Optimization (HVO).
> #
> config ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
> bool
> @@ -259,14 +258,14 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
> depends on SPARSEMEM_VMEMMAP
>
> config HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON
> - bool "Default optimizing vmemmap pages of HugeTLB to on"
> + bool "Default HugeTLB Vmemmap Optimization (HVO) to on"
> default n
> depends on HUGETLB_PAGE_OPTIMIZE_VMEMMAP
> help
> - When using HUGETLB_PAGE_OPTIMIZE_VMEMMAP, the optimizing unused vmemmap
> - pages associated with each HugeTLB page is default off. Say Y here
> - to enable optimizing vmemmap pages of HugeTLB by default. It can then
> - be disabled on the command line via hugetlb_free_vmemmap=off.
> + When using HUGETLB_PAGE_OPTIMIZE_VMEMMAP, the HugeTLB Vmemmap
> + Optimization (HVO) is off by default. Say Y here to enable HVO
> + by default. It can then be disabled on the command line via
> + hugetlb_free_vmemmap=off or sysctl.
>
> config MEMFD_CREATE
> def_bool TMPFS || HUGETLBFS
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 132dc83f0130..c10540993577 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -1,8 +1,8 @@
> // SPDX-License-Identifier: GPL-2.0
> /*
> - * Optimize vmemmap pages associated with HugeTLB
> + * HugeTLB Vmemmap Optimization (HVO)
> *
> - * Copyright (c) 2020, Bytedance. All rights reserved.
> + * Copyright (c) 2020, ByteDance. All rights reserved.
> *
> * Author: Muchun Song <[email protected]>
> *
> @@ -120,8 +120,8 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>
> /*
> * There are only (RESERVE_VMEMMAP_SIZE / sizeof(struct page)) struct
> - * page structs that can be used when CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP,
> - * so add a BUILD_BUG_ON to catch invalid usage of the tail struct page.
> + * page structs that can be used when HVO is enabled, add a BUILD_BUG_ON
> + * to catch invalid usage of the tail page structs.
> */
> BUILD_BUG_ON(__NR_USED_SUBPAGE >=
> RESERVE_VMEMMAP_SIZE / sizeof(struct page));
> diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
> index 109b0a53b6fe..ba66fadad9fc 100644
> --- a/mm/hugetlb_vmemmap.h
> +++ b/mm/hugetlb_vmemmap.h
> @@ -1,8 +1,8 @@
> // SPDX-License-Identifier: GPL-2.0
> /*
> - * Optimize vmemmap pages associated with HugeTLB
> + * HugeTLB Vmemmap Optimization (HVO)
> *
> - * Copyright (c) 2020, Bytedance. All rights reserved.
> + * Copyright (c) 2020, ByteDance. All rights reserved.
> *
> * Author: Muchun Song <[email protected]>
> */
> --
> 2.11.0
>
>
--
Oscar Salvador
SUSE Labs
On Mon, Jun 13, 2022 at 10:10:08AM +0200, Oscar Salvador wrote:
> On Mon, Jun 13, 2022 at 02:35:08PM +0800, Muchun Song wrote:
> > We hold an another reference to hugetlb_optimize_vmemmap_key when
> > making vmemmap_optimize_mode on, because we use static_key to tell
> > memory_hotplug that memory_hotplug.memmap_on_memory should be
> > overridden. However, this rule has gone when we have introduced
> > SECTION_CANNOT_OPTIMIZE_VMEMMAP. Therefore, we could simplify
> > vmemmap_optimize_mode handling by not holding an another reference
> > to hugetlb_optimize_vmemmap_key.
> >
> > Signed-off-by: Muchun Song <[email protected]>
>
> LGTM, and it looks way nicer, so
>
> Reviewed-by: Oscar Salvador <[email protected]>
>
Thanks for taking a look.
> One question below though
>
> > -static enum vmemmap_optimize_mode vmemmap_optimize_mode =
> > +static bool vmemmap_optimize_enabled =
> > IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
>
> So, by default vmemmap_optimize_enabled will be on if we have
> CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON, but we can always override that
> via cmdline, as below, right?
>
Totally right. CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON only controls
whether the feature is enabled by default, i.e. when the user does not
explicitly turn it off via cmdline.
Thanks.
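To make the precedence described above concrete, here is a minimal userspace
sketch. It is not the kernel code: HVO_DEFAULT_ON, parse_bool() and
hvo_enabled_at_boot() are hypothetical names invented for illustration, and
parse_bool() accepts only a small subset of what the kernel's kstrtobool()
does. It assumes a rejected value leaves the build-time default in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON. */
#ifndef HVO_DEFAULT_ON
#define HVO_DEFAULT_ON 0
#endif

/*
 * Simplified boolean parser (assumption: only these six spellings).
 * Returns 0 on success, -1 on bad input; *res is untouched on failure.
 */
static int parse_bool(const char *s, bool *res)
{
	assert(res != NULL);
	if (!s)
		return -1;
	if (!strcmp(s, "on") || !strcmp(s, "1") || !strcmp(s, "y")) {
		*res = true;
		return 0;
	}
	if (!strcmp(s, "off") || !strcmp(s, "0") || !strcmp(s, "n")) {
		*res = false;
		return 0;
	}
	return -1;
}

/*
 * build_default models the Kconfig default; arg models the value given to
 * hugetlb_free_vmemmap= on the command line (NULL: parameter absent).
 */
static bool hvo_enabled_at_boot(bool build_default, const char *arg)
{
	bool enabled = build_default;

	if (arg)
		(void)parse_bool(arg, &enabled); /* bad input keeps the default */
	return enabled;
}
```

With this model, hvo_enabled_at_boot(HVO_DEFAULT_ON, NULL) returns the build
default, while passing "off" or "on" overrides it in either direction.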
On 13.06.22 08:35, Muchun Song wrote:
> It it inconvenient to mention the feature of optimizing vmemmap pages associated
> with HugeTLB pages when communicating with others since there is no specific or
> abbreviated name for it when it is first introduced. Let us give it a name HVO
> (HugeTLB Vmemmap Optimization) from now.
>
> This commit also updates the document about "hugetlb_free_vmemmap" by the way
> discussed in thread [1].
>
> Link: https://lore.kernel.org/all/[email protected]/ [1]
> Signed-off-by: Muchun Song <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 7 ++++---
> Documentation/admin-guide/mm/hugetlbpage.rst | 3 +--
> Documentation/admin-guide/sysctl/vm.rst | 3 +--
> fs/Kconfig | 13 ++++++-------
> mm/hugetlb_vmemmap.c | 8 ++++----
> mm/hugetlb_vmemmap.h | 4 ++--
> 6 files changed, 18 insertions(+), 20 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 391b43fee93e..7539553b3fb0 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1725,12 +1725,13 @@
> hugetlb_free_vmemmap=
> [KNL] Reguires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
> enabled.
> + Control if HugeTLB Vmemmap Optimization (HVO) is enabled.
> Allows heavy hugetlb users to free up some more
> memory (7 * PAGE_SIZE for each 2MB hugetlb page).
> - Format: { [oO][Nn]/Y/y/1 | [oO][Ff]/N/n/0 (default) }
> + Format: { on | off (default) }
>
> - [oO][Nn]/Y/y/1: enable the feature
> - [oO][Ff]/N/n/0: disable the feature
> + on: enable HVO
> + off: disable HVO
>
> Built with CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=y,
> the default is on.
> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> index a90330d0a837..64e0d5c512e7 100644
> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> @@ -164,8 +164,7 @@ default_hugepagesz
> will all result in 256 2M huge pages being allocated. Valid default
> huge page size is architecture dependent.
> hugetlb_free_vmemmap
> - When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables optimizing
> - unused vmemmap pages associated with each HugeTLB page.
> + When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables HVO.
Heh, it would be convenient to call this
CONFIG_HUGETLB_PAGE_VMEMMAP_OPTIMIZATION (HVO) then.
--
Thanks,
David / dhildenb
On Mon, Jun 13, 2022 at 02:35:08PM +0800, Muchun Song wrote:
> We hold an another reference to hugetlb_optimize_vmemmap_key when
> making vmemmap_optimize_mode on, because we use static_key to tell
> memory_hotplug that memory_hotplug.memmap_on_memory should be
> overridden. However, this rule has gone when we have introduced
> SECTION_CANNOT_OPTIMIZE_VMEMMAP. Therefore, we could simplify
> vmemmap_optimize_mode handling by not holding an another reference
> to hugetlb_optimize_vmemmap_key.
>
> Signed-off-by: Muchun Song <[email protected]>
> ---
> include/linux/page-flags.h | 6 ++---
> mm/hugetlb_vmemmap.c | 65 +++++-----------------------------------------
> 2 files changed, 9 insertions(+), 62 deletions(-)
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index b8b992cb201c..da7ccc3b16ad 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -200,8 +200,7 @@ enum pageflags {
> #ifndef __GENERATING_BOUNDS_H
>
> #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
> -DECLARE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
> - hugetlb_optimize_vmemmap_key);
> +DECLARE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
>
> /*
> * If the feature of optimizing vmemmap pages associated with each HugeTLB
> @@ -221,8 +220,7 @@ DECLARE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
> */
> static __always_inline const struct page *page_fixed_fake_head(const struct page *page)
> {
> - if (!static_branch_maybe(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
> - &hugetlb_optimize_vmemmap_key))
> + if (!static_branch_unlikely(&hugetlb_optimize_vmemmap_key))
> return page;
This also means that we do not incur the extra page_fixed_fake_head checks
if there are no vmemmap optimized hugetlb pages. Nice!
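A rough userspace model of that effect, for illustration only. It assumes
that, once the extra reference is gone, the key is held only by references
taken per optimized HugeTLB page. A plain counter stands in for the static
key, and every name below is hypothetical rather than kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter standing in for hugetlb_optimize_vmemmap_key. */
static int hvo_key_count;

/* Assumed to be called when a HugeTLB page's vmemmap gets optimized. */
static void hvo_key_inc(void)
{
	hvo_key_count++;
}

/* Assumed to be called when an optimized page's vmemmap is restored. */
static void hvo_key_dec(void)
{
	assert(hvo_key_count > 0);
	hvo_key_count--;
}

/*
 * Models the branch at the top of page_fixed_fake_head(): the slow path
 * is only taken while at least one optimized page exists.
 */
static bool fake_head_check_needed(void)
{
	return hvo_key_count > 0;
}
```

In this model the check stays off at boot even when the feature is enabled,
and only turns on between the first hvo_key_inc() and the matching
hvo_key_dec().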
Reviewed-by: Mike Kravetz <[email protected]>
--
Mike Kravetz
On Mon, Jun 13, 2022 at 02:35:09PM +0800, Muchun Song wrote:
> It it inconvenient to mention the feature of optimizing vmemmap pages associated
> with HugeTLB pages when communicating with others since there is no specific or
> abbreviated name for it when it is first introduced. Let us give it a name HVO
> (HugeTLB Vmemmap Optimization) from now.
>
> This commit also updates the document about "hugetlb_free_vmemmap" by the way
> discussed in thread [1].
>
> Link: https://lore.kernel.org/all/[email protected]/ [1]
> Signed-off-by: Muchun Song <[email protected]>
> ---
> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> index a90330d0a837..64e0d5c512e7 100644
> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> @@ -164,8 +164,7 @@ default_hugepagesz
> will all result in 256 2M huge pages being allocated. Valid default
> huge page size is architecture dependent.
> hugetlb_free_vmemmap
> - When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables optimizing
> - unused vmemmap pages associated with each HugeTLB page.
> + When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables HVO.
I think we need to define HVO before using it here. Readers may be able
to determine the meaning, but to be sure I would suggest:
When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set this enables
HugeTLB Vmemmap Optimization (HVO).
Everything else looks good to me.
Reviewed-by: Mike Kravetz <[email protected]>
--
Mike Kravetz
On Mon, Jun 13, 2022 at 02:19:45PM -0700, Mike Kravetz wrote:
> On Mon, Jun 13, 2022 at 02:35:09PM +0800, Muchun Song wrote:
> > It it inconvenient to mention the feature of optimizing vmemmap pages associated
> > with HugeTLB pages when communicating with others since there is no specific or
> > abbreviated name for it when it is first introduced. Let us give it a name HVO
> > (HugeTLB Vmemmap Optimization) from now.
> >
> > This commit also updates the document about "hugetlb_free_vmemmap" by the way
> > discussed in thread [1].
> >
> > Link: https://lore.kernel.org/all/[email protected]/ [1]
> > Signed-off-by: Muchun Song <[email protected]>
> > ---
>
> > diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> > index a90330d0a837..64e0d5c512e7 100644
> > --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> > +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> > @@ -164,8 +164,7 @@ default_hugepagesz
> > will all result in 256 2M huge pages being allocated. Valid default
> > huge page size is architecture dependent.
> > hugetlb_free_vmemmap
> > - When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables optimizing
> > - unused vmemmap pages associated with each HugeTLB page.
> > + When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables HVO.
>
> I think we need to define HVO before using it here. Readers may be able
> to determine the meaning, but to be sure I would suggest:
>
Agree.
> When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set this enables
> HugeTLB Vmemmap Optimization (HVO).
>
I would use this. Thanks.
> Everything else looks good to me.
> Reviewed-by: Mike Kravetz <[email protected]>
Thanks for your review.
On Mon, Jun 13, 2022 at 05:39:59PM +0200, David Hildenbrand wrote:
> On 13.06.22 08:35, Muchun Song wrote:
> > It it inconvenient to mention the feature of optimizing vmemmap pages associated
> > with HugeTLB pages when communicating with others since there is no specific or
> > abbreviated name for it when it is first introduced. Let us give it a name HVO
> > (HugeTLB Vmemmap Optimization) from now.
> >
> > This commit also updates the document about "hugetlb_free_vmemmap" by the way
> > discussed in thread [1].
> >
> > Link: https://lore.kernel.org/all/[email protected]/ [1]
> > Signed-off-by: Muchun Song <[email protected]>
> > ---
> > Documentation/admin-guide/kernel-parameters.txt | 7 ++++---
> > Documentation/admin-guide/mm/hugetlbpage.rst | 3 +--
> > Documentation/admin-guide/sysctl/vm.rst | 3 +--
> > fs/Kconfig | 13 ++++++-------
> > mm/hugetlb_vmemmap.c | 8 ++++----
> > mm/hugetlb_vmemmap.h | 4 ++--
> > 6 files changed, 18 insertions(+), 20 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index 391b43fee93e..7539553b3fb0 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -1725,12 +1725,13 @@
> > hugetlb_free_vmemmap=
> > [KNL] Reguires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
> > enabled.
> > + Control if HugeTLB Vmemmap Optimization (HVO) is enabled.
> > Allows heavy hugetlb users to free up some more
> > memory (7 * PAGE_SIZE for each 2MB hugetlb page).
> > - Format: { [oO][Nn]/Y/y/1 | [oO][Ff]/N/n/0 (default) }
> > + Format: { on | off (default) }
> >
> > - [oO][Nn]/Y/y/1: enable the feature
> > - [oO][Ff]/N/n/0: disable the feature
> > + on: enable HVO
> > + off: disable HVO
> >
> > Built with CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=y,
> > the default is on.
> > diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> > index a90330d0a837..64e0d5c512e7 100644
> > --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> > +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> > @@ -164,8 +164,7 @@ default_hugepagesz
> > will all result in 256 2M huge pages being allocated. Valid default
> > huge page size is architecture dependent.
> > hugetlb_free_vmemmap
> > - When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables optimizing
> > - unused vmemmap pages associated with each HugeTLB page.
> > + When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables HVO.
>
> Heh, it would be convenient to call this
>
> CONFIG_HUGETLB_PAGE_VMEMMAP_OPTIMIZATION (HVO) then.
>
Thanks for pointing it out. I will take Mike's suggestion and change it to:
When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set this enables
HugeTLB Vmemmap Optimization (HVO).
Thanks.
On 6/13/22 07:35, Muchun Song wrote:
> It it inconvenient to mention the feature of optimizing vmemmap pages associated
> with HugeTLB pages when communicating with others since there is no specific or
> abbreviated name for it when it is first introduced. Let us give it a name HVO
> (HugeTLB Vmemmap Optimization) from now.
>
Just thought I would throw out this suggestion, even though I am probably too late.
I find the term "vmemmap deduplication" more self-explanatory (at least for me)
to refer to your technique, and similarly s/optimize/dedup. Or vmemmap tail page
deduplication (too verbose maybe), because really that's what this optimization is all
about. OTOH it would slightly deviate from what may be established now
in hugetlb code.
On Wed, Jun 15, 2022 at 03:51:51PM +0100, Joao Martins wrote:
> On 6/13/22 07:35, Muchun Song wrote:
> > It it inconvenient to mention the feature of optimizing vmemmap pages associated
> > with HugeTLB pages when communicating with others since there is no specific or
> > abbreviated name for it when it is first introduced. Let us give it a name HVO
> > (HugeTLB Vmemmap Optimization) from now.
> >
>
> Just thought I would throw this suggestion, even though I am probably too late.
>
Not too late, we are still discussing the name.
> I find the term "vmemmap deduplication" more self-explanatory (at least for me)
> to refer to your technique ,and similarly s/optimize/dedup. Or vmemmap tail page
> deduplication (too verbose maybe) because really that's what this optimization is all
> about. OTOH it would slightly deviate from what maybe established now
> in hugetlb code.
>
Well, I have looked up the word "deduplication", which refers to a method of
eliminating a dataset’s redundant data. I agree with you that "deduplication"
is more expressive for my technique. So I am thinking of renaming "HVO" to "HVD
(HugeTLB Vmemmap Deduplication)". In this series (patch 6), I have renamed
hugetlb_vmemmap_alloc/free to hugetlb_vmemmap_optimize/restore. I am also
thinking of replacing them with:
hugetlb_vmemmap_deduplicate vs hugetlb_vmemmap_duplicate.
Many other places in hugetlb_vmemmap.c use the word "optimize"; most of them probably do
not need to be changed, since "deduplication" is also an __optimization__ technique.
Hi Mike and David:
What is your opinion on this? I would like to hear your thoughts.
Thanks.
On 06/16/22 11:28, Muchun Song wrote:
> On Wed, Jun 15, 2022 at 03:51:51PM +0100, Joao Martins wrote:
> > On 6/13/22 07:35, Muchun Song wrote:
> > > It it inconvenient to mention the feature of optimizing vmemmap pages associated
> > > with HugeTLB pages when communicating with others since there is no specific or
> > > abbreviated name for it when it is first introduced. Let us give it a name HVO
> > > (HugeTLB Vmemmap Optimization) from now.
> > >
> >
> > Just thought I would throw this suggestion, even though I am probably too late.
> >
>
> Not too late, we are still discussing the name.
>
> > I find the term "vmemmap deduplication" more self-explanatory (at least for me)
> > to refer to your technique ,and similarly s/optimize/dedup. Or vmemmap tail page
> > deduplication (too verbose maybe) because really that's what this optimization is all
> > about. OTOH it would slightly deviate from what maybe established now
> > in hugetlb code.
> >
>
> Well, I have looked up this word "deduplication" which refers to a method of
> eliminating a dataset’s redundant data. At least I agree with you "deduplication"
> is more expressive for my technique. So I am thinking of renaming "HVO" to "HVD (
> HugeTLB Vmemmap Deduplication)". In this series (patch 6), I have renamed
> hugetlb_vmemmap_alloc/free to hugetlb_vmemmmap_optimize/restore. I am also
> thinking of replacing it to:
>
> hugetlb_vmemmmap_deduplicate vs hugetlb_vmemmmap_duplicate.
>
> Many other places in hugetlb_vmemmap.c use "optimize" word, maybe most of them do
> not need to be changed since "deduplication" is also a __optimization__ technique.
>
> Hi Mike and David:
>
> What your opinion on this? I want to hear some thoughts from you.
I can understand Joao's preference for deduplication. However, I can
also understand just using the term optimization. IMO, neither is far
superior to the other. It is mostly a matter of personal preference.
My preference would be to leave it as named in this series unless
someone has a strong preference for changing.
--
Mike Kravetz
On Thu, Jun 16, 2022 at 03:27:40PM -0700, Mike Kravetz wrote:
> On 06/16/22 11:28, Muchun Song wrote:
> > On Wed, Jun 15, 2022 at 03:51:51PM +0100, Joao Martins wrote:
> > > On 6/13/22 07:35, Muchun Song wrote:
> > > > It it inconvenient to mention the feature of optimizing vmemmap pages associated
> > > > with HugeTLB pages when communicating with others since there is no specific or
> > > > abbreviated name for it when it is first introduced. Let us give it a name HVO
> > > > (HugeTLB Vmemmap Optimization) from now.
> > > >
> > >
> > > Just thought I would throw this suggestion, even though I am probably too late.
> > >
> >
> > Not too late, we are still discussing the name.
> >
> > > I find the term "vmemmap deduplication" more self-explanatory (at least for me)
> > > to refer to your technique ,and similarly s/optimize/dedup. Or vmemmap tail page
> > > deduplication (too verbose maybe) because really that's what this optimization is all
> > > about. OTOH it would slightly deviate from what maybe established now
> > > in hugetlb code.
> > >
> >
> > Well, I have looked up this word "deduplication" which refers to a method of
> > eliminating a dataset’s redundant data. At least I agree with you "deduplication"
> > is more expressive for my technique. So I am thinking of renaming "HVO" to "HVD (
> > HugeTLB Vmemmap Deduplication)". In this series (patch 6), I have renamed
> > hugetlb_vmemmap_alloc/free to hugetlb_vmemmmap_optimize/restore. I am also
> > thinking of replacing it to:
> >
> > hugetlb_vmemmmap_deduplicate vs hugetlb_vmemmmap_duplicate.
> >
> > Many other places in hugetlb_vmemmap.c use "optimize" word, maybe most of them do
> > not need to be changed since "deduplication" is also a __optimization__ technique.
> >
> > Hi Mike and David:
> >
> > What your opinion on this? I want to hear some thoughts from you.
>
> I can understand Joao's preference for deduplication. However, I can
> also understand just using the term optimization. IMO, neither is far
> superior to the other. It is mostly a matter of personal preference.
>
> My preference would be to leave it as named in this series unless
> someone has a strong preference for changing.
>
All right. I'll keep the HVO name.
Thanks.