2022-03-02 13:23:10

by Muchun Song

[permalink] [raw]
Subject: [PATCH v2 0/3] add hugetlb_free_vmemmap sysctl

This series amis to add hugetlb_free_vmemmap sysctl to enable the feature
of freeing vmemmap pages of HugeTLB pages.

v2:
- Fix compilation when !CONFIG_MHP_MEMMAP_ON_MEMORY reported by kernel
test robot <[email protected]>.
- Move sysctl code from kernel/sysctl.c to mm/hugetlb_vmemmap.c.

Muchun Song (3):
mm: hugetlb: disable freeing vmemmap pages when struct page crosses
page boundaries
sysctl: allow to set extra1 to SYSCTL_ONE
mm: hugetlb: add hugetlb_free_vmemmap sysctl

Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
include/linux/memory_hotplug.h | 9 +++++++
kernel/sysctl.c | 2 +-
mm/hugetlb_vmemmap.c | 43 ++++++++++++++++++++++++++++++++-
mm/hugetlb_vmemmap.h | 4 ++-
mm/memory_hotplug.c | 5 ++++
6 files changed, 73 insertions(+), 3 deletions(-)

--
2.11.0


2022-03-02 15:14:30

by Muchun Song

[permalink] [raw]
Subject: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries

If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
of "struct page" is not power of two, we cannot optimize vmemmap pages
of HugeTLB pages. We should disable this feature in this case.

Signed-off-by: Muchun Song <[email protected]>
---
mm/hugetlb_vmemmap.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index b3118dba0518..836d1117f08b 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -121,6 +121,17 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
if (!hugetlb_free_vmemmap_enabled())
return;

+ if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
+ !is_power_of_2(sizeof(struct page))) {
+ /*
+ * The hugetlb_free_vmemmap_enabled_key can be enabled when
+ * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
+ * be disabled if "struct page" crosses page boundaries.
+ */
+ static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
+ return;
+ }
+
vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
/*
* The head page is not to be freed to buddy allocator, the other tail
--
2.11.0

2022-03-02 20:38:07

by Muchun Song

[permalink] [raw]
Subject: [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl

We must add "hugetlb_free_vmemmap=on" to boot cmdline and reboot the
server to enable the feature of freeing vmemmap pages of HugeTLB
pages. Rebooting usually taske a long time. Add a sysctl to enable
the feature at runtime and do not need to reboot.

Signed-off-by: Muchun Song <[email protected]>
---
Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
include/linux/memory_hotplug.h | 9 +++++++
mm/hugetlb_vmemmap.c | 42 ++++++++++++++++++++++++++++-----
mm/hugetlb_vmemmap.h | 4 +++-
mm/memory_hotplug.c | 5 ++++
5 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index f4804ce37c58..01f18e6cc227 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -561,6 +561,19 @@ Change the minimum size of the hugepage pool.
See Documentation/admin-guide/mm/hugetlbpage.rst


+hugetlb_free_vmemmap
+====================
+
+A toggle value indicating if vmemmap pages are allowed to be optimized.
+If it is off (0), then it can be set true (1). Once true, the vmemmap
+pages associated with each HugeTLB page will be optimized, and the toggle
+cannot be set back to false. It only optimizes the subsequent allocation
+of HugeTLB pages from buddy system, while already allocated HugeTLB pages
+will not be optimized.
+
+See Documentation/admin-guide/mm/hugetlbpage.rst
+
+
nr_hugepages_mempolicy
======================

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index e0b2209ab71c..20d7edf62a6a 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -351,4 +351,13 @@ void arch_remove_linear_mapping(u64 start, u64 size);
extern bool mhp_supports_memmap_on_memory(unsigned long size);
#endif /* CONFIG_MEMORY_HOTPLUG */

+#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
+bool mhp_memmap_on_memory(void);
+#else
+static inline bool mhp_memmap_on_memory(void)
+{
+ return false;
+}
+#endif
+
#endif /* __LINUX_MEMORY_HOTPLUG_H */
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 836d1117f08b..3bcc8f25bd50 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -10,6 +10,7 @@

#define pr_fmt(fmt) "HugeTLB: " fmt

+#include <linux/memory_hotplug.h>
#include "hugetlb_vmemmap.h"

/*
@@ -118,17 +119,14 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
BUILD_BUG_ON(__NR_USED_SUBPAGE >=
RESERVE_VMEMMAP_SIZE / sizeof(struct page));

- if (!hugetlb_free_vmemmap_enabled())
- return;
-
- if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
- !is_power_of_2(sizeof(struct page))) {
+ if (!is_power_of_2(sizeof(struct page))) {
/*
* The hugetlb_free_vmemmap_enabled_key can be enabled when
* CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
* be disabled if "struct page" crosses page boundaries.
*/
- static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
+ if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON))
+ static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
return;
}

@@ -147,3 +145,35 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
h->name);
}
+
+static struct ctl_table hugetlb_vmemmap_sysctls[] = {
+ {
+ .procname = "hugetlb_free_vmemmap",
+ .data = &hugetlb_free_vmemmap_enabled_key.key,
+ .mode = 0644,
+ /* only handle a transition from default "0" to "1" */
+ .proc_handler = proc_do_static_key,
+ .extra1 = SYSCTL_ONE,
+ .extra2 = SYSCTL_ONE,
+ },
+ { }
+};
+
+static __init int hugetlb_vmemmap_sysctls_init(void)
+{
+ /*
+ * The vmemmap pages cannot be optimized if
+ * "memory_hotplug.memmap_on_memory" is enabled unless
+ * "hugetlb_free_vmemmap" is enabled as well since
+ * "hugetlb_free_vmemmap" takes precedence over
+ * "memory_hotplug.memmap_on_memory".
+ */
+ if (mhp_memmap_on_memory() && !hugetlb_free_vmemmap_enabled())
+ return 0;
+
+ if (is_power_of_2(sizeof(struct page)))
+ register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
+
+ return 0;
+}
+late_initcall(hugetlb_vmemmap_sysctls_init);
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index cb2bef8f9e73..b67a159027f4 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -21,7 +21,9 @@ void hugetlb_vmemmap_init(struct hstate *h);
*/
static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
{
- return h->nr_free_vmemmap_pages;
+ if (hugetlb_free_vmemmap_enabled())
+ return h->nr_free_vmemmap_pages;
+ return 0;
}
#else
static inline int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c226a337c1ef..c2115e566abc 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -50,6 +50,11 @@ static bool memmap_on_memory __ro_after_init;
#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
module_param(memmap_on_memory, bool, 0444);
MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
+
+bool mhp_memmap_on_memory(void)
+{
+ return memmap_on_memory;
+}
#endif

enum {
--
2.11.0

2022-03-02 22:26:30

by Muchun Song

[permalink] [raw]
Subject: [PATCH v2 2/3] sysctl: allow to set extra1 to SYSCTL_ONE

Some sysctls only allow to be enabled and cannot be set back to be
disabled. But proc_do_static_key() does not consider this situation,
which set ->extra1 to SYSCTL_ZERO unconditionally. This patch add
the ability to set ->extra1 to SYSCTL_ONE, which will be used in
the next patch.

Signed-off-by: Muchun Song <[email protected]>
---
kernel/sysctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 64065abf361e..ab3e9c937268 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1631,7 +1631,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
.data = &val,
.maxlen = sizeof(val),
.mode = table->mode,
- .extra1 = SYSCTL_ZERO,
+ .extra1 = table->extra1 == SYSCTL_ONE ? SYSCTL_ONE : SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
};

--
2.11.0

2022-03-02 23:59:05

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl

On Wed, Mar 02, 2022 at 04:37:58PM +0800, Muchun Song wrote:
> We must add "hugetlb_free_vmemmap=on" to boot cmdline and reboot the
> server to enable the feature of freeing vmemmap pages of HugeTLB
> pages. Rebooting usually taske a long time. Add a sysctl to enable
> the feature at runtime and do not need to reboot.
>
> Signed-off-by: Muchun Song <[email protected]>
> ---
> Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
> include/linux/memory_hotplug.h | 9 +++++++
> mm/hugetlb_vmemmap.c | 42 ++++++++++++++++++++++++++++-----
> mm/hugetlb_vmemmap.h | 4 +++-
> mm/memory_hotplug.c | 5 ++++
> 5 files changed, 66 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> index f4804ce37c58..01f18e6cc227 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -561,6 +561,19 @@ Change the minimum size of the hugepage pool.
> See Documentation/admin-guide/mm/hugetlbpage.rst
>
>
> +hugetlb_free_vmemmap
> +====================
> +
> +A toggle value indicating if vmemmap pages are allowed to be optimized.
> +If it is off (0), then it can be set true (1). Once true, the vmemmap
> +pages associated with each HugeTLB page will be optimized, and the toggle
> +cannot be set back to false. It only optimizes the subsequent allocation
> +of HugeTLB pages from buddy system, while already allocated HugeTLB pages
> +will not be optimized.

The commit log or documentation does not descrie why its safe to toggle
one way and not the other?

Luis

> +
> +See Documentation/admin-guide/mm/hugetlbpage.rst
> +
> +
> nr_hugepages_mempolicy
> ======================
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index e0b2209ab71c..20d7edf62a6a 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -351,4 +351,13 @@ void arch_remove_linear_mapping(u64 start, u64 size);
> extern bool mhp_supports_memmap_on_memory(unsigned long size);
> #endif /* CONFIG_MEMORY_HOTPLUG */
>
> +#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
> +bool mhp_memmap_on_memory(void);
> +#else
> +static inline bool mhp_memmap_on_memory(void)
> +{
> + return false;
> +}
> +#endif
> +
> #endif /* __LINUX_MEMORY_HOTPLUG_H */
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 836d1117f08b..3bcc8f25bd50 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -10,6 +10,7 @@
>
> #define pr_fmt(fmt) "HugeTLB: " fmt
>
> +#include <linux/memory_hotplug.h>
> #include "hugetlb_vmemmap.h"
>
> /*
> @@ -118,17 +119,14 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> BUILD_BUG_ON(__NR_USED_SUBPAGE >=
> RESERVE_VMEMMAP_SIZE / sizeof(struct page));
>
> - if (!hugetlb_free_vmemmap_enabled())
> - return;
> -
> - if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
> - !is_power_of_2(sizeof(struct page))) {
> + if (!is_power_of_2(sizeof(struct page))) {
> /*
> * The hugetlb_free_vmemmap_enabled_key can be enabled when
> * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
> * be disabled if "struct page" crosses page boundaries.
> */
> - static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
> + if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON))
> + static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
> return;
> }
>
> @@ -147,3 +145,35 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
> h->name);
> }
> +
> +static struct ctl_table hugetlb_vmemmap_sysctls[] = {
> + {
> + .procname = "hugetlb_free_vmemmap",
> + .data = &hugetlb_free_vmemmap_enabled_key.key,
> + .mode = 0644,
> + /* only handle a transition from default "0" to "1" */
> + .proc_handler = proc_do_static_key,
> + .extra1 = SYSCTL_ONE,
> + .extra2 = SYSCTL_ONE,
> + },
> + { }
> +};
> +
> +static __init int hugetlb_vmemmap_sysctls_init(void)
> +{
> + /*
> + * The vmemmap pages cannot be optimized if
> + * "memory_hotplug.memmap_on_memory" is enabled unless
> + * "hugetlb_free_vmemmap" is enabled as well since
> + * "hugetlb_free_vmemmap" takes precedence over
> + * "memory_hotplug.memmap_on_memory".
> + */
> + if (mhp_memmap_on_memory() && !hugetlb_free_vmemmap_enabled())
> + return 0;
> +
> + if (is_power_of_2(sizeof(struct page)))
> + register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
> +
> + return 0;
> +}
> +late_initcall(hugetlb_vmemmap_sysctls_init);
> diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
> index cb2bef8f9e73..b67a159027f4 100644
> --- a/mm/hugetlb_vmemmap.h
> +++ b/mm/hugetlb_vmemmap.h
> @@ -21,7 +21,9 @@ void hugetlb_vmemmap_init(struct hstate *h);
> */
> static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
> {
> - return h->nr_free_vmemmap_pages;
> + if (hugetlb_free_vmemmap_enabled())
> + return h->nr_free_vmemmap_pages;
> + return 0;
> }
> #else
> static inline int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index c226a337c1ef..c2115e566abc 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -50,6 +50,11 @@ static bool memmap_on_memory __ro_after_init;
> #ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
> module_param(memmap_on_memory, bool, 0444);
> MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
> +
> +bool mhp_memmap_on_memory(void)
> +{
> + return memmap_on_memory;
> +}
> #endif
>
> enum {
> --
> 2.11.0
>

2022-03-03 00:20:58

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries

On Wed, Mar 02, 2022 at 04:37:56PM +0800, Muchun Song wrote:
> If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
> of "struct page" is not power of two, we cannot optimize vmemmap pages
> of HugeTLB pages. We should disable this feature in this case.

The commit log does not descrie what happens if this is left enabled in
that case? Is this a fix? Why would it be a fix? Was something failing?
How did you spot this issue? What are the consequences of not applying
this patch?

Luis

2022-03-03 00:58:11

by Mike Kravetz

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries

On 3/2/22 00:37, Muchun Song wrote:
> If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
> of "struct page" is not power of two, we cannot optimize vmemmap pages
> of HugeTLB pages. We should disable this feature in this case.

I'll let you reply to the question from Luis, but IIUC there is no issue
today as "struct page" is certainly a power of two. This is more future
looking. Correct?

> Signed-off-by: Muchun Song <[email protected]>
> ---
> mm/hugetlb_vmemmap.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index b3118dba0518..836d1117f08b 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -121,6 +121,17 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> if (!hugetlb_free_vmemmap_enabled())
> return;
>
> + if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
> + !is_power_of_2(sizeof(struct page))) {
> + /*
> + * The hugetlb_free_vmemmap_enabled_key can be enabled when
> + * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
> + * be disabled if "struct page" crosses page boundaries.
> + */
> + static_branch_disable(&hugetlb_free_vmemmap_enabled_key);

Should we possibly print a warning here as in the routine early_hugetlb_free_vmemmap_param? This is called once per hstate, so
perhaps pr_warn_once.

--
Mike Kravetz

> + return;
> + }
> +
> vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> /*
> * The head page is not to be freed to buddy allocator, the other tail


2022-03-03 02:36:02

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries

On Thu, Mar 3, 2022 at 8:25 AM Mike Kravetz <[email protected]> wrote:
>
> On 3/2/22 00:37, Muchun Song wrote:
> > If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
> > of "struct page" is not power of two, we cannot optimize vmemmap pages
> > of HugeTLB pages. We should disable this feature in this case.
>
> I'll let you reply to the question from Luis, but IIUC there is no issue
> today as "struct page" is certainly a power of two. This is more future
> looking. Correct?

Partly right. The size of "struct page" is not the power of two if
!CONFIG_MEMCG && CONFIG_SLAB on x86_64. But it is not
a conventional configuration nowadays. So it is not a critical
problem. I am not sure if a Fixes tag is necessary.

>
> > Signed-off-by: Muchun Song <[email protected]>
> > ---
> > mm/hugetlb_vmemmap.c | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index b3118dba0518..836d1117f08b 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -121,6 +121,17 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> > if (!hugetlb_free_vmemmap_enabled())
> > return;
> >
> > + if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
> > + !is_power_of_2(sizeof(struct page))) {
> > + /*
> > + * The hugetlb_free_vmemmap_enabled_key can be enabled when
> > + * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
> > + * be disabled if "struct page" crosses page boundaries.
> > + */
> > + static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
>
> Should we possibly print a warning here as in the routine early_hugetlb_free_vmemmap_param? This is called once per hstate, so
> perhaps pr_warn_once.

Good point. Will do.

Thanks.

2022-03-03 02:46:45

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries

On Thu, Mar 3, 2022 at 5:21 AM Luis Chamberlain <[email protected]> wrote:
>
> On Wed, Mar 02, 2022 at 04:37:56PM +0800, Muchun Song wrote:
> > If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
> > of "struct page" is not power of two, we cannot optimize vmemmap pages
> > of HugeTLB pages. We should disable this feature in this case.
>
> The commit log does not describe what happens if this is left enabled in
> that case? Is this a fix? Why would it be a fix? Was something failing?
> How did you spot this issue? What are the consequences of not applying
> this patch?
>

If the size of "struct page" is not the power of two and this feature is
enabled, then the vmemmap pages of HugeTLB will be corrupted
after remapping (panic is about to happen in theory). But this only
exists when !CONFIG_MEMCG && CONFIG_SLAB on x86_64.
However, it is not a conventional configuration nowadays. So it is
not a real word issue, just the result of a code review. But we cannot
prevent someone from configuring that combined configure. OK,
this information should go to the commit log. Will update it.

Thanks.

2022-03-03 11:21:55

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl

On Thu, Mar 3, 2022 at 5:25 AM Luis Chamberlain <[email protected]> wrote:
>
> On Wed, Mar 02, 2022 at 04:37:58PM +0800, Muchun Song wrote:
> > We must add "hugetlb_free_vmemmap=on" to boot cmdline and reboot the
> > server to enable the feature of freeing vmemmap pages of HugeTLB
> > pages. Rebooting usually taske a long time. Add a sysctl to enable
> > the feature at runtime and do not need to reboot.
> >
> > Signed-off-by: Muchun Song <[email protected]>
> > ---
> > Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
> > include/linux/memory_hotplug.h | 9 +++++++
> > mm/hugetlb_vmemmap.c | 42 ++++++++++++++++++++++++++++-----
> > mm/hugetlb_vmemmap.h | 4 +++-
> > mm/memory_hotplug.c | 5 ++++
> > 5 files changed, 66 insertions(+), 7 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > index f4804ce37c58..01f18e6cc227 100644
> > --- a/Documentation/admin-guide/sysctl/vm.rst
> > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > @@ -561,6 +561,19 @@ Change the minimum size of the hugepage pool.
> > See Documentation/admin-guide/mm/hugetlbpage.rst
> >
> >
> > +hugetlb_free_vmemmap
> > +====================
> > +
> > +A toggle value indicating if vmemmap pages are allowed to be optimized.
> > +If it is off (0), then it can be set true (1). Once true, the vmemmap
> > +pages associated with each HugeTLB page will be optimized, and the toggle
> > +cannot be set back to false. It only optimizes the subsequent allocation
> > +of HugeTLB pages from buddy system, while already allocated HugeTLB pages
> > +will not be optimized.
>
> The commit log or documentation does not descrie why its safe to toggle
> one way and not the other?
>

I thought it was easy to handle the transition from disable to enable
(code is simple). I might be wrong. I'll try to handle the other side in
the next version if it is not hard to handle.

Thanks Luis.

2022-03-03 19:10:05

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl

On Thu, Mar 03, 2022 at 07:15:05PM +0800, Muchun Song wrote:
> On Thu, Mar 3, 2022 at 5:25 AM Luis Chamberlain <[email protected]> wrote:
> >
> > On Wed, Mar 02, 2022 at 04:37:58PM +0800, Muchun Song wrote:
> > > We must add "hugetlb_free_vmemmap=on" to boot cmdline and reboot the
> > > server to enable the feature of freeing vmemmap pages of HugeTLB
> > > pages. Rebooting usually taske a long time. Add a sysctl to enable
> > > the feature at runtime and do not need to reboot.
> > >
> > > Signed-off-by: Muchun Song <[email protected]>
> > > ---
> > > Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
> > > include/linux/memory_hotplug.h | 9 +++++++
> > > mm/hugetlb_vmemmap.c | 42 ++++++++++++++++++++++++++++-----
> > > mm/hugetlb_vmemmap.h | 4 +++-
> > > mm/memory_hotplug.c | 5 ++++
> > > 5 files changed, 66 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > > index f4804ce37c58..01f18e6cc227 100644
> > > --- a/Documentation/admin-guide/sysctl/vm.rst
> > > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > > @@ -561,6 +561,19 @@ Change the minimum size of the hugepage pool.
> > > See Documentation/admin-guide/mm/hugetlbpage.rst
> > >
> > >
> > > +hugetlb_free_vmemmap
> > > +====================
> > > +
> > > +A toggle value indicating if vmemmap pages are allowed to be optimized.
> > > +If it is off (0), then it can be set true (1). Once true, the vmemmap
> > > +pages associated with each HugeTLB page will be optimized, and the toggle
> > > +cannot be set back to false. It only optimizes the subsequent allocation
> > > +of HugeTLB pages from buddy system, while already allocated HugeTLB pages
> > > +will not be optimized.
> >
> > The commit log or documentation does not descrie why its safe to toggle
> > one way and not the other?
> >
>
> I thought it was easy to handle the transition from disable to enable
> (code is simple). I might be wrong. I'll try to handle the other side in
> the next version if it is not hard to handle.

You should do the homework and explain why something is not possible.
And if you are enabling to disable something why is it safe to do so
at runtime?

Luis