2022-03-24 05:58:06

by Muchun Song

Subject: [PATCH v5 0/4] add hugetlb_free_vmemmap sysctl

This series is based on next-20220310.

This series aims to add a hugetlb_free_vmemmap sysctl to enable the feature
of freeing vmemmap pages of HugeTLB pages.

v5:
- Fix the series not working properly when one is working off of a very
clean build, as reported by Luis Chamberlain.
- Add Suggested-by for Luis Chamberlain.

Thanks.

v4:
- Introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2 inspired by Luis.

v3:
- Add pr_warn_once() (Mike).
- Handle the transition from enabling to disabling (Luis)

v2:
- Fix compilation when !CONFIG_MHP_MEMMAP_ON_MEMORY reported by kernel
test robot <[email protected]>.
- Move sysctl code from kernel/sysctl.c to mm/hugetlb_vmemmap.c.

Muchun Song (4):
mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2
mm: memory_hotplug: override memmap_on_memory when
hugetlb_free_vmemmap=on
sysctl: allow to set extra1 to SYSCTL_ONE
mm: hugetlb_vmemmap: add hugetlb_free_vmemmap sysctl

Documentation/admin-guide/sysctl/vm.rst | 14 +++++
Kbuild | 14 +++++
fs/Kconfig | 1 +
include/linux/memory_hotplug.h | 9 +++
include/linux/mm_types.h | 2 +
kernel/sysctl.c | 2 +-
mm/Kconfig | 3 +
mm/hugetlb_vmemmap.c | 107 ++++++++++++++++++++++++--------
mm/hugetlb_vmemmap.h | 4 +-
mm/memory_hotplug.c | 27 ++++++--
mm/struct_page_size.c | 19 ++++++
scripts/check_struct_page_po2.sh | 9 +++
12 files changed, 177 insertions(+), 34 deletions(-)
create mode 100644 mm/struct_page_size.c
create mode 100755 scripts/check_struct_page_po2.sh

--
2.11.0


2022-03-24 11:41:19

by Muchun Song

Subject: [PATCH v5 1/4] mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2

If the size of "struct page" is not a power of two and this
feature is enabled, the vmemmap pages of HugeTLB will be
corrupted after remapping (in theory, a panic is about to happen).
This can only occur when !CONFIG_MEMCG && !CONFIG_SLUB on
x86_64, which is not a conventional configuration nowadays,
so it is not a real-world issue, just the result of a code review.
Still, we have to prevent anyone from selecting that combination.
To avoid scattering checks like "is_power_of_2(sizeof(struct page))"
throughout mm/hugetlb_vmemmap.c, introduce
STRUCT_PAGE_SIZE_IS_POWER_OF_2 to detect whether the size of struct
page is a power of 2, and make this feature depend on this new
config option. This prevents anyone from creating such an unexpected
configuration.
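
As an illustration of why a non-power-of-two size matters, here is a
minimal user-space sketch (not part of the patch). It assumes a
4096-byte page, and all "sketch_" names are hypothetical. It counts how
many elements of a densely packed, page-aligned array straddle a page
boundary. The 56-byte size used in the note below is only an
illustrative non-power-of-two value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/* Same idea as the kernel's is_power_of_2(): exactly one bit is set. */
static bool sketch_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Count the elements of a packed array (starting at a page-aligned
 * offset) whose first and last byte land on different pages.
 */
static size_t sketch_count_straddlers(size_t elem_size, size_t nr_elems)
{
	size_t i, count = 0;

	assert(elem_size > 0 && elem_size <= SKETCH_PAGE_SIZE);
	for (i = 0; i < nr_elems; i++) {
		size_t first = i * elem_size;
		size_t last = first + elem_size - 1;

		if (first / SKETCH_PAGE_SIZE != last / SKETCH_PAGE_SIZE)
			count++;
	}
	return count;
}
```

With a 64-byte element no entry ever crosses a page, while a 56-byte
element does (for example, the element starting at offset 4088), which
is the situation the removed runtime check warned about.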

Signed-off-by: Muchun Song <[email protected]>
Suggested-by: Luis Chamberlain <[email protected]>
---
Kbuild | 14 ++++++++++++++
fs/Kconfig | 1 +
include/linux/mm_types.h | 2 ++
mm/Kconfig | 3 +++
mm/hugetlb_vmemmap.c | 6 ------
mm/struct_page_size.c | 19 +++++++++++++++++++
scripts/check_struct_page_po2.sh | 9 +++++++++
7 files changed, 48 insertions(+), 6 deletions(-)
create mode 100644 mm/struct_page_size.c
create mode 100755 scripts/check_struct_page_po2.sh

diff --git a/Kbuild b/Kbuild
index fa441b98c9f6..21415c3b2728 100644
--- a/Kbuild
+++ b/Kbuild
@@ -37,6 +37,20 @@ $(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
$(call filechk,offsets,__ASM_OFFSETS_H__)

#####
+# Generate struct_page_size.h.
+
+struct_page_size-file := include/generated/struct_page_size.h
+
+always-y := $(struct_page_size-file)
+targets := mm/struct_page_size.s
+
+mm/struct_page_size.s: $(timeconst-file) $(bounds-file)
+
+$(struct_page_size-file): mm/struct_page_size.s FORCE
+ $(call filechk,offsets,__LINUX_STRUCT_PAGE_SIZE_H__)
+ $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig
+
+#####
# Check for missing system calls

always-y += missing-syscalls
diff --git a/fs/Kconfig b/fs/Kconfig
index 7f2455e8e18a..856d2e9f5aef 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -249,6 +249,7 @@ config HUGETLB_PAGE_FREE_VMEMMAP
def_bool HUGETLB_PAGE
depends on X86_64
depends on SPARSEMEM_VMEMMAP
+ depends on STRUCT_PAGE_SIZE_IS_POWER_OF_2

config HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON
bool "Default freeing vmemmap pages of HugeTLB to on"
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8834e38c06a4..5fbff44a4310 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -223,6 +223,7 @@ struct page {
#endif
} _struct_page_alignment;

+#ifndef __GENERATING_STRUCT_PAGE_SIZE_IS_POWER_OF_2_H
/**
* struct folio - Represents a contiguous set of bytes.
* @flags: Identical to the page flags.
@@ -844,5 +845,6 @@ enum fault_flag {
FAULT_FLAG_INSTRUCTION = 1 << 8,
FAULT_FLAG_INTERRUPTIBLE = 1 << 9,
};
+#endif /* !__GENERATING_STRUCT_PAGE_SIZE_IS_POWER_OF_2_H */

#endif /* _LINUX_MM_TYPES_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 034d87953600..9314bd34f49e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -2,6 +2,9 @@

menu "Memory Management options"

+config STRUCT_PAGE_SIZE_IS_POWER_OF_2
+ def_bool $(success,test "$(shell, $(srctree)/scripts/check_struct_page_po2.sh)" = 1)
+
config SELECT_MEMORY_MODEL
def_bool y
depends on ARCH_SELECT_MEMORY_MODEL
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 791626983c2e..33ecb77c2b2a 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -194,12 +194,6 @@ EXPORT_SYMBOL(hugetlb_free_vmemmap_enabled_key);

static int __init early_hugetlb_free_vmemmap_param(char *buf)
{
- /* We cannot optimize if a "struct page" crosses page boundaries. */
- if (!is_power_of_2(sizeof(struct page))) {
- pr_warn("cannot free vmemmap pages because \"struct page\" crosses page boundaries\n");
- return 0;
- }
-
if (!buf)
return -EINVAL;

diff --git a/mm/struct_page_size.c b/mm/struct_page_size.c
new file mode 100644
index 000000000000..5749609aa1b3
--- /dev/null
+++ b/mm/struct_page_size.c
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generate definitions needed by the preprocessor.
+ * This code generates raw asm output which is post-processed
+ * to extract and format the required data.
+ */
+
+#define __GENERATING_STRUCT_PAGE_SIZE_IS_POWER_OF_2_H
+/* Include headers that define the enum constants of interest */
+#include <linux/mm_types.h>
+#include <linux/kbuild.h>
+#include <linux/log2.h>
+
+int main(void)
+{
+ DEFINE(STRUCT_PAGE_SIZE_IS_POWER_OF_2, is_power_of_2(sizeof(struct page)));
+
+ return 0;
+}
diff --git a/scripts/check_struct_page_po2.sh b/scripts/check_struct_page_po2.sh
new file mode 100755
index 000000000000..1764ef9a4f1d
--- /dev/null
+++ b/scripts/check_struct_page_po2.sh
@@ -0,0 +1,9 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Check if the size of "struct page" is power of 2
+
+file="include/generated/struct_page_size.h"
+if [ -f "$file" ]; then
+ grep STRUCT_PAGE_SIZE_IS_POWER_OF_2 "$file" | cut -d' ' -f3
+fi
--
2.11.0

2022-03-24 16:48:30

by Muchun Song

Subject: Re: [PATCH v5 1/4] mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2

On Thu, Mar 24, 2022 at 5:40 PM Chen, Rong A <[email protected]> wrote:
>
>
>
> On 3/24/2022 6:13 AM, Andrew Morton wrote:
> > On Thu, 24 Mar 2022 06:06:41 +0800 kernel test robot <[email protected]> wrote:
> >
> >> Hi Muchun,
> >>
> >> Thank you for the patch! Yet something to improve:
> >>
> >> [auto build test ERROR on hnaz-mm/master]
> >> [also build test ERROR on linus/master next-20220323]
> >> [cannot apply to mcgrof/sysctl-next v5.17]
> >> [If your patch is applied to the wrong git tree, kindly drop us a note.
> >> And when submitting patch, we suggest to use '--base' as documented in
> >> https://git-scm.com/docs/git-format-patch]
> >>
> >> url: https://github.com/0day-ci/linux/commits/Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
> >> base: https://github.com/hnaz/linux-mm master
> >> config: arc-randconfig-r043-20220323 (https://download.01.org/0day-ci/archive/20220324/[email protected]/config)
> >> compiler: arc-elf-gcc (GCC) 11.2.0
> >> reproduce (this is a W=1 build):
> >> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> >> chmod +x ~/bin/make.cross
> >> # https://github.com/0day-ci/linux/commit/64211be650af117819368a26d7b86c33df5deea4
> >> git remote add linux-review https://github.com/0day-ci/linux
> >> git fetch --no-tags linux-review Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
> >> git checkout 64211be650af117819368a26d7b86c33df5deea4
> >> # save the config file to linux build tree
> >> mkdir build_dir
> >> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arc prepare
> >>
> >> If you fix the issue, kindly add following tag as appropriate
> >> Reported-by: kernel test robot <[email protected]>
> >>
> >> All errors (new ones prefixed by >>):
> >>
> >>>> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such file or directory
> >
> > It would take a lot of talent for Muchun to have caused this!
> >
> > Methinks you just ran out of disk space?
>
> Hi Andrew,
>
> Thanks for the reply, I tried to apply this patch to the head of
> mainline and I still can reproduce the error in my local machine:
>
> $ wget -q -O -
> https://lore.kernel.org/lkml/[email protected]/raw
> | git apply -v
> $ mkdir build_dir && wget
> https://download.01.org/0day-ci/archive/20220324/[email protected]/config
> -O build_dir/.config
> $ COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross
> O=build_dir ARCH=arc olddefconfig prepare
> make --keep-going CONFIG_OF_ALL_DTBS=y CONFIG_DTC=y
> CROSS_COMPILE=/home/nfs/0day/gcc-11.2.0-nolibc/arc-elf/bin/arc-elf-
> --jobs=72 O=build_dir ARCH=arc olddefconfig prepare
> ...
> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such
> file or directory
> compilation terminated.
> make[3]: *** [../scripts/Makefile.build:121: kernel/bounds.s] Error 1
> make[3]: Target '__build' not remade because of errors.
> make[2]: *** [../Makefile:1191: prepare0] Error 2
> make[2]: Target 'prepare' not remade because of errors.
>

Would you help me to test the following patch? Thanks.

diff --git a/Kbuild b/Kbuild
index 21415c3b2728..a8477c011b1d 100644
--- a/Kbuild
+++ b/Kbuild
@@ -42,7 +42,7 @@ $(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
struct_page_size-file := include/generated/struct_page_size.h

always-y := $(struct_page_size-file)
-targets := mm/struct_page_size.s
+targets += mm/struct_page_size.s

mm/struct_page_size.s: $(timeconst-file) $(bounds-file)

2022-03-24 21:16:02

by Muchun Song

Subject: Re: [PATCH v5 1/4] mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2

On Thu, Mar 24, 2022 at 5:40 PM Chen, Rong A <[email protected]> wrote:
>
>
>
> On 3/24/2022 6:13 AM, Andrew Morton wrote:
> > On Thu, 24 Mar 2022 06:06:41 +0800 kernel test robot <[email protected]> wrote:
> >
> >> Hi Muchun,
> >>
> >> Thank you for the patch! Yet something to improve:
> >>
> >> [auto build test ERROR on hnaz-mm/master]
> >> [also build test ERROR on linus/master next-20220323]
> >> [cannot apply to mcgrof/sysctl-next v5.17]
> >> [If your patch is applied to the wrong git tree, kindly drop us a note.
> >> And when submitting patch, we suggest to use '--base' as documented in
> >> https://git-scm.com/docs/git-format-patch]
> >>
> >> url: https://github.com/0day-ci/linux/commits/Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
> >> base: https://github.com/hnaz/linux-mm master
> >> config: arc-randconfig-r043-20220323 (https://download.01.org/0day-ci/archive/20220324/[email protected]/config)
> >> compiler: arc-elf-gcc (GCC) 11.2.0
> >> reproduce (this is a W=1 build):
> >> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> >> chmod +x ~/bin/make.cross
> >> # https://github.com/0day-ci/linux/commit/64211be650af117819368a26d7b86c33df5deea4
> >> git remote add linux-review https://github.com/0day-ci/linux
> >> git fetch --no-tags linux-review Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
> >> git checkout 64211be650af117819368a26d7b86c33df5deea4
> >> # save the config file to linux build tree
> >> mkdir build_dir
> >> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arc prepare
> >>
> >> If you fix the issue, kindly add following tag as appropriate
> >> Reported-by: kernel test robot <[email protected]>
> >>
> >> All errors (new ones prefixed by >>):
> >>
> >>>> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such file or directory
> >
> > It would take a lot of talent for Muchun to have caused this!
> >
> > Methinks you just ran out of disk space?
>
> Hi Andrew,
>
> Thanks for the reply, I tried to apply this patch to the head of
> mainline and I still can reproduce the error in my local machine:
>
> $ wget -q -O -
> https://lore.kernel.org/lkml/[email protected]/raw
> | git apply -v
> $ mkdir build_dir && wget
> https://download.01.org/0day-ci/archive/20220324/[email protected]/config
> -O build_dir/.config
> $ COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross
> O=build_dir ARCH=arc olddefconfig prepare
> make --keep-going CONFIG_OF_ALL_DTBS=y CONFIG_DTC=y
> CROSS_COMPILE=/home/nfs/0day/gcc-11.2.0-nolibc/arc-elf/bin/arc-elf-
> --jobs=72 O=build_dir ARCH=arc olddefconfig prepare
> ...
> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such
> file or directory
> compilation terminated.
> make[3]: *** [../scripts/Makefile.build:121: kernel/bounds.s] Error 1
> make[3]: Target '__build' not remade because of errors.
> make[2]: *** [../Makefile:1191: prepare0] Error 2
> make[2]: Target 'prepare' not remade because of errors.
>

I'll look at this, thanks for your report.

2022-03-25 06:00:01

by kernel test robot

Subject: Re: [PATCH v5 1/4] mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2

Hi Muchun,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on hnaz-mm/master]
[also build test ERROR on linus/master next-20220323]
[cannot apply to mcgrof/sysctl-next v5.17]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
base: https://github.com/hnaz/linux-mm master
config: arc-randconfig-r043-20220323 (https://download.01.org/0day-ci/archive/20220324/[email protected]/config)
compiler: arc-elf-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/64211be650af117819368a26d7b86c33df5deea4
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
git checkout 64211be650af117819368a26d7b86c33df5deea4
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arc prepare

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such file or directory
compilation terminated.
make[2]: *** [scripts/Makefile.build:127: kernel/bounds.s] Error 1
make[2]: Target '__build' not remade because of errors.
make[1]: *** [Makefile:1261: prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:226: __sub-make] Error 2
make: Target 'prepare' not remade because of errors.

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-03-25 08:47:21

by Chen, Rong A

Subject: Re: [PATCH v5 1/4] mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2



On 3/24/2022 6:20 PM, Muchun Song wrote:
> On Thu, Mar 24, 2022 at 5:40 PM Chen, Rong A <[email protected]> wrote:
>>
>>
>>
>> On 3/24/2022 6:13 AM, Andrew Morton wrote:
>>> On Thu, 24 Mar 2022 06:06:41 +0800 kernel test robot <[email protected]> wrote:
>>>
>>>> Hi Muchun,
>>>>
>>>> Thank you for the patch! Yet something to improve:
>>>>
>>>> [auto build test ERROR on hnaz-mm/master]
>>>> [also build test ERROR on linus/master next-20220323]
>>>> [cannot apply to mcgrof/sysctl-next v5.17]
>>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>>> And when submitting patch, we suggest to use '--base' as documented in
>>>> https://git-scm.com/docs/git-format-patch]
>>>>
>>>> url: https://github.com/0day-ci/linux/commits/Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
>>>> base: https://github.com/hnaz/linux-mm master
>>>> config: arc-randconfig-r043-20220323 (https://download.01.org/0day-ci/archive/20220324/[email protected]/config)
>>>> compiler: arc-elf-gcc (GCC) 11.2.0
>>>> reproduce (this is a W=1 build):
>>>> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>>> chmod +x ~/bin/make.cross
>>>> # https://github.com/0day-ci/linux/commit/64211be650af117819368a26d7b86c33df5deea4
>>>> git remote add linux-review https://github.com/0day-ci/linux
>>>> git fetch --no-tags linux-review Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
>>>> git checkout 64211be650af117819368a26d7b86c33df5deea4
>>>> # save the config file to linux build tree
>>>> mkdir build_dir
>>>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arc prepare
>>>>
>>>> If you fix the issue, kindly add following tag as appropriate
>>>> Reported-by: kernel test robot <[email protected]>
>>>>
>>>> All errors (new ones prefixed by >>):
>>>>
>>>>>> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such file or directory
>>>
>>> It would take a lot of talent for Muchun to have caused this!
>>>
>>> Methinks you just ran out of disk space?
>>
>> Hi Andrew,
>>
>> Thanks for the reply, I tried to apply this patch to the head of
>> mainline and I still can reproduce the error in my local machine:
>>
>> $ wget -q -O -
>> https://lore.kernel.org/lkml/[email protected]/raw
>> | git apply -v
>> $ mkdir build_dir && wget
>> https://download.01.org/0day-ci/archive/20220324/[email protected]/config
>> -O build_dir/.config
>> $ COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross
>> O=build_dir ARCH=arc olddefconfig prepare
>> make --keep-going CONFIG_OF_ALL_DTBS=y CONFIG_DTC=y
>> CROSS_COMPILE=/home/nfs/0day/gcc-11.2.0-nolibc/arc-elf/bin/arc-elf-
>> --jobs=72 O=build_dir ARCH=arc olddefconfig prepare
>> ...
>> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such
>> file or directory
>> compilation terminated.
>> make[3]: *** [../scripts/Makefile.build:121: kernel/bounds.s] Error 1
>> make[3]: Target '__build' not remade because of errors.
>> make[2]: *** [../Makefile:1191: prepare0] Error 2
>> make[2]: Target 'prepare' not remade because of errors.
>>
>
> Would you help me to test the following patch? Thanks.

I have confirmed the patch can fix the issue.

Best Regards,
Rong Chen

>
> diff --git a/Kbuild b/Kbuild
> index 21415c3b2728..a8477c011b1d 100644
> --- a/Kbuild
> +++ b/Kbuild
> @@ -42,7 +42,7 @@ $(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
> struct_page_size-file := include/generated/struct_page_size.h
>
> always-y := $(struct_page_size-file)
> -targets := mm/struct_page_size.s
> +targets += mm/struct_page_size.s
>
> mm/struct_page_size.s: $(timeconst-file) $(bounds-file)
>

2022-03-25 13:45:27

by Luis Chamberlain

Subject: Re: [PATCH v5 0/4] add hugetlb_free_vmemmap sysctl

Masahiro,

can I trouble you to help review the first patch here? I thought
something like this might be possible, and Muchun has done some good
work to try it. If anyone can find a hole in that Kconfig hack, it would
be you. I'll bounce you a copy of the patches.

Luis

On Wed, Mar 23, 2022 at 08:55:19PM +0800, Muchun Song wrote:
> This series is based on next-20220310.
>
> This series aims to add a hugetlb_free_vmemmap sysctl to enable the feature
> of freeing vmemmap pages of HugeTLB pages.
>
> v5:
> - Fix the series not working properly when one is working off of a very
> clean build, as reported by Luis Chamberlain.
> - Add Suggested-by for Luis Chamberlain.
>
> Thanks.
>
> v4:
> - Introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2 inspired by Luis.
>
> v3:
> - Add pr_warn_once() (Mike).
> - Handle the transition from enabling to disabling (Luis)
>
> v2:
> - Fix compilation when !CONFIG_MHP_MEMMAP_ON_MEMORY reported by kernel
> test robot <[email protected]>.
> - Move sysctl code from kernel/sysctl.c to mm/hugetlb_vmemmap.c.
>
> Muchun Song (4):
> mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2
> mm: memory_hotplug: override memmap_on_memory when
> hugetlb_free_vmemmap=on
> sysctl: allow to set extra1 to SYSCTL_ONE
> mm: hugetlb_vmemmap: add hugetlb_free_vmemmap sysctl
>
> Documentation/admin-guide/sysctl/vm.rst | 14 +++++
> Kbuild | 14 +++++
> fs/Kconfig | 1 +
> include/linux/memory_hotplug.h | 9 +++
> include/linux/mm_types.h | 2 +
> kernel/sysctl.c | 2 +-
> mm/Kconfig | 3 +
> mm/hugetlb_vmemmap.c | 107 ++++++++++++++++++++++++--------
> mm/hugetlb_vmemmap.h | 4 +-
> mm/memory_hotplug.c | 27 ++++++--
> mm/struct_page_size.c | 19 ++++++
> scripts/check_struct_page_po2.sh | 9 +++
> 12 files changed, 177 insertions(+), 34 deletions(-)
> create mode 100644 mm/struct_page_size.c
> create mode 100755 scripts/check_struct_page_po2.sh
>
> --
> 2.11.0
>

2022-03-25 17:27:31

by Masahiro Yamada

Subject: Re: [PATCH v5 1/4] mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2

On Wed, Mar 23, 2022 at 9:57 PM Muchun Song <[email protected]> wrote:
>
> If the size of "struct page" is not a power of two and this
> feature is enabled, the vmemmap pages of HugeTLB will be
> corrupted after remapping (in theory, a panic is about to happen).
> This can only occur when !CONFIG_MEMCG && !CONFIG_SLUB on
> x86_64, which is not a conventional configuration nowadays,
> so it is not a real-world issue, just the result of a code review.
> Still, we have to prevent anyone from selecting that combination.
> To avoid scattering checks like "is_power_of_2(sizeof(struct page))"
> throughout mm/hugetlb_vmemmap.c, introduce
> STRUCT_PAGE_SIZE_IS_POWER_OF_2 to detect whether the size of struct
> page is a power of 2, and make this feature depend on this new
> config option. This prevents anyone from creating such an unexpected
> configuration.
>
> Signed-off-by: Muchun Song <[email protected]>
> Suggested-by: Luis Chamberlain <[email protected]>
> ---
> Kbuild | 14 ++++++++++++++
> fs/Kconfig | 1 +
> include/linux/mm_types.h | 2 ++
> mm/Kconfig | 3 +++
> mm/hugetlb_vmemmap.c | 6 ------
> mm/struct_page_size.c | 19 +++++++++++++++++++
> scripts/check_struct_page_po2.sh | 9 +++++++++
> 7 files changed, 48 insertions(+), 6 deletions(-)
> create mode 100644 mm/struct_page_size.c
> create mode 100755 scripts/check_struct_page_po2.sh
>
> diff --git a/Kbuild b/Kbuild
> index fa441b98c9f6..21415c3b2728 100644
> --- a/Kbuild
> +++ b/Kbuild
> @@ -37,6 +37,20 @@ $(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
> $(call filechk,offsets,__ASM_OFFSETS_H__)
>
> #####
> +# Generate struct_page_size.h.
> +
> +struct_page_size-file := include/generated/struct_page_size.h
> +
> +always-y := $(struct_page_size-file)
> +targets := mm/struct_page_size.s
> +
> +mm/struct_page_size.s: $(timeconst-file) $(bounds-file)
> +
> +$(struct_page_size-file): mm/struct_page_size.s FORCE
> + $(call filechk,offsets,__LINUX_STRUCT_PAGE_SIZE_H__)
> + $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig


No, please do not do this.
It is terrible to feed this back into Kconfig again.

If you know this happens on !CONFIG_MEMCG && !CONFIG_SLUB on x86_64,
why don't you add this dependency directly?


If you want to avoid the run-time check,
why don't you use BUILD_BUG_ON() ?
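
For readers unfamiliar with the idea, the following is a user-space
sketch of a compile-time check in the spirit of BUILD_BUG_ON(). It is
only an analogy: it uses C11 _Static_assert instead of the kernel
macro, and "struct sketch_page" and the SKETCH_ macro are hypothetical
stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct sketch_page {
	unsigned long words[8];
};

/* True when exactly one bit of n is set. */
#define SKETCH_IS_POWER_OF_2(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

/*
 * Compile-time check: the build stops here if the structure size is
 * not a power of two, so no run-time test is needed.
 */
_Static_assert(SKETCH_IS_POWER_OF_2(sizeof(struct sketch_page)),
	       "struct sketch_page size must be a power of two");

/* Small accessor so callers can inspect the size at run time. */
static size_t sketch_page_size(void)
{
	return sizeof(struct sketch_page);
}
```

The trade-off of this approach is that an unsupported configuration
turns into a build failure rather than the option quietly becoming
unavailable.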






> +
> +#####
> # Check for missing system calls
>
> always-y += missing-syscalls
> diff --git a/fs/Kconfig b/fs/Kconfig
> index 7f2455e8e18a..856d2e9f5aef 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -249,6 +249,7 @@ config HUGETLB_PAGE_FREE_VMEMMAP
> def_bool HUGETLB_PAGE
> depends on X86_64
> depends on SPARSEMEM_VMEMMAP
> + depends on STRUCT_PAGE_SIZE_IS_POWER_OF_2
>
> config HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON
> bool "Default freeing vmemmap pages of HugeTLB to on"
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 8834e38c06a4..5fbff44a4310 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -223,6 +223,7 @@ struct page {
> #endif
> } _struct_page_alignment;
>
> +#ifndef __GENERATING_STRUCT_PAGE_SIZE_IS_POWER_OF_2_H
> /**
> * struct folio - Represents a contiguous set of bytes.
> * @flags: Identical to the page flags.
> @@ -844,5 +845,6 @@ enum fault_flag {
> FAULT_FLAG_INSTRUCTION = 1 << 8,
> FAULT_FLAG_INTERRUPTIBLE = 1 << 9,
> };
> +#endif /* !__GENERATING_STRUCT_PAGE_SIZE_IS_POWER_OF_2_H */
>
> #endif /* _LINUX_MM_TYPES_H */
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 034d87953600..9314bd34f49e 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -2,6 +2,9 @@
>
> menu "Memory Management options"
>
> +config STRUCT_PAGE_SIZE_IS_POWER_OF_2
> + def_bool $(success,test "$(shell, $(srctree)/scripts/check_struct_page_po2.sh)" = 1)
> +
> config SELECT_MEMORY_MODEL
> def_bool y
> depends on ARCH_SELECT_MEMORY_MODEL
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 791626983c2e..33ecb77c2b2a 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -194,12 +194,6 @@ EXPORT_SYMBOL(hugetlb_free_vmemmap_enabled_key);
>
> static int __init early_hugetlb_free_vmemmap_param(char *buf)
> {
> - /* We cannot optimize if a "struct page" crosses page boundaries. */
> - if (!is_power_of_2(sizeof(struct page))) {
> - pr_warn("cannot free vmemmap pages because \"struct page\" crosses page boundaries\n");
> - return 0;
> - }
> -
> if (!buf)
> return -EINVAL;
>
> diff --git a/mm/struct_page_size.c b/mm/struct_page_size.c
> new file mode 100644
> index 000000000000..5749609aa1b3
> --- /dev/null
> +++ b/mm/struct_page_size.c
> @@ -0,0 +1,19 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Generate definitions needed by the preprocessor.
> + * This code generates raw asm output which is post-processed
> + * to extract and format the required data.
> + */
> +
> +#define __GENERATING_STRUCT_PAGE_SIZE_IS_POWER_OF_2_H
> +/* Include headers that define the enum constants of interest */
> +#include <linux/mm_types.h>
> +#include <linux/kbuild.h>
> +#include <linux/log2.h>
> +
> +int main(void)
> +{
> + DEFINE(STRUCT_PAGE_SIZE_IS_POWER_OF_2, is_power_of_2(sizeof(struct page)));
> +
> + return 0;
> +}
> diff --git a/scripts/check_struct_page_po2.sh b/scripts/check_struct_page_po2.sh
> new file mode 100755
> index 000000000000..1764ef9a4f1d
> --- /dev/null
> +++ b/scripts/check_struct_page_po2.sh
> @@ -0,0 +1,9 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Check if the size of "struct page" is power of 2
> +
> +file="include/generated/struct_page_size.h"
> +if [ -f "$file" ]; then
> + grep STRUCT_PAGE_SIZE_IS_POWER_OF_2 "$file" | cut -d' ' -f3
> +fi
> --
> 2.11.0
>


--
Best Regards
Masahiro Yamada

2022-03-25 17:41:45

by Muchun Song

Subject: [PATCH v5 2/4] mm: memory_hotplug: override memmap_on_memory when hugetlb_free_vmemmap=on

When both "hugetlb_free_vmemmap=on" and "memory_hotplug.memmap_on_memory"
are passed on the boot cmdline, the variable "memmap_on_memory" is set
to 1 even though the vmemmap pages will not be allocated from the
hot-added memory, since the former takes precedence over the latter.
In the next patch, we want to enable or disable the feature of freeing
vmemmap pages of HugeTLB via sysctl. When enabling it, we need a way to
know whether memory_hotplug.memmap_on_memory is in effect, since the
two features are not compatible. However, the variable
"memmap_on_memory" cannot indicate this today. Do not set
"memmap_on_memory" to 1 when both parameters are passed on the cmdline,
so that "memmap_on_memory" indicates whether this feature is really
enabled by the user.

Also introduce the mhp_memmap_on_memory() helper so that the definition
of "memmap_on_memory" can move into the scope of
CONFIG_MHP_MEMMAP_ON_MEMORY. In the next patch, mhp_memmap_on_memory()
will also be exported for use in hugetlb_vmemmap.c.
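
The precedence rule described above can be modelled in user space as
follows. This is a simplified sketch, not the kernel code: the
"sketch_" names are hypothetical, the two globals stand in for
hugetlb_free_vmemmap_enabled() and the "memmap_on_memory" variable, and
the boolean parser accepts far fewer spellings than param_set_bool().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for kernel state. */
static bool sketch_hugetlb_free_vmemmap;	/* models hugetlb_free_vmemmap_enabled() */
static bool sketch_memmap_on_memory;		/* models the memmap_on_memory variable */

/* Simplified boolean parser; returns 0 on success, -1 on a bad value. */
static int sketch_parse_bool(const char *val, bool *res)
{
	assert(res != NULL);
	if (!val || !strcmp(val, "1") || !strcmp(val, "y") || !strcmp(val, "on")) {
		*res = true;	/* a bare parameter counts as "true" */
		return 0;
	}
	if (!strcmp(val, "0") || !strcmp(val, "n") || !strcmp(val, "off")) {
		*res = false;
		return 0;
	}
	return -1;
}

/* Setter that ignores the request while vmemmap freeing is enabled. */
static int sketch_memmap_on_memory_set(const char *val)
{
	if (sketch_hugetlb_free_vmemmap)
		return 0;	/* request ignored: vmemmap freeing takes precedence */
	return sketch_parse_bool(val, &sketch_memmap_on_memory);
}
```

Because the setter leaves the variable untouched in the conflicting
case, a later reader of the variable sees only a value that actually
took effect.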

Signed-off-by: Muchun Song <[email protected]>
---
mm/memory_hotplug.c | 32 ++++++++++++++++++++++++++------
1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 416b38ca8def..da594b382829 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -42,14 +42,36 @@
#include "internal.h"
#include "shuffle.h"

+#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
+static int memmap_on_memory_set(const char *val, const struct kernel_param *kp)
+{
+ if (hugetlb_free_vmemmap_enabled())
+ return 0;
+ return param_set_bool(val, kp);
+}
+
+static const struct kernel_param_ops memmap_on_memory_ops = {
+ .flags = KERNEL_PARAM_OPS_FL_NOARG,
+ .set = memmap_on_memory_set,
+ .get = param_get_bool,
+};

/*
* memory_hotplug.memmap_on_memory parameter
*/
static bool memmap_on_memory __ro_after_init;
-#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
-module_param(memmap_on_memory, bool, 0444);
+module_param_cb(memmap_on_memory, &memmap_on_memory_ops, &memmap_on_memory, 0444);
MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
+
+static inline bool mhp_memmap_on_memory(void)
+{
+ return memmap_on_memory;
+}
+#else
+static inline bool mhp_memmap_on_memory(void)
+{
+ return false;
+}
#endif

enum {
@@ -1288,9 +1310,7 @@ bool mhp_supports_memmap_on_memory(unsigned long size)
* altmap as an alternative source of memory, and we do not exactly
* populate a single PMD.
*/
- return memmap_on_memory &&
- !hugetlb_free_vmemmap_enabled() &&
- IS_ENABLED(CONFIG_MHP_MEMMAP_ON_MEMORY) &&
+ return mhp_memmap_on_memory() &&
size == memory_block_size_bytes() &&
IS_ALIGNED(vmemmap_size, PMD_SIZE) &&
IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
@@ -2074,7 +2094,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
* We only support removing memory added with MHP_MEMMAP_ON_MEMORY in
* the same granularity it was added - a single memory block.
*/
- if (memmap_on_memory) {
+ if (mhp_memmap_on_memory()) {
nr_vmemmap_pages = walk_memory_blocks(start, size, NULL,
get_nr_vmemmap_pages_cb);
if (nr_vmemmap_pages) {
--
2.11.0

2022-03-25 18:06:26

by Muchun Song

Subject: [PATCH v5 4/4] mm: hugetlb_vmemmap: add hugetlb_free_vmemmap sysctl

Currently we must add "hugetlb_free_vmemmap=on" to the boot cmdline and
reboot the server to enable the feature of freeing vmemmap pages of
HugeTLB pages. Rebooting usually takes a long time. Add a sysctl to
enable or disable the feature at runtime without rebooting.

Disabling requires that there are no optimized HugeTLB pages in the
system. If you fail to disable it, you can set "nr_hugepages" to 0
and then retry.
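
The disable rule can be sketched in user space as a counter-guarded
toggle. This is an illustration under stated assumptions, not the
patch's implementation: the "sketch_" names are hypothetical, the
-EBUSY return value is an assumption, and the sketch is single-threaded
(the patch itself declares an rwsem next to its counter, which is not
modelled here).

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Number of HugeTLB pages whose vmemmap is currently optimized. */
static atomic_long sketch_optimized_pages;
static bool sketch_feature_enabled;

/* Allocating a HugeTLB page optimizes it only while the feature is on. */
static void sketch_alloc_hpage(void)
{
	if (sketch_feature_enabled)
		atomic_fetch_add(&sketch_optimized_pages, 1);
}

/* Freeing an optimized HugeTLB page drops the counter. */
static void sketch_free_optimized_hpage(void)
{
	long old = atomic_fetch_sub(&sketch_optimized_pages, 1);

	assert(old > 0);	/* must not free more pages than were optimized */
}

/* Enabling always succeeds; disabling is refused while pages remain. */
static int sketch_set_enabled(bool enable)
{
	if (!enable && atomic_load(&sketch_optimized_pages) != 0)
		return -EBUSY;	/* assumed error code, for illustration */
	sketch_feature_enabled = enable;
	return 0;
}
```

In this model, releasing every optimized page (the equivalent of
setting "nr_hugepages" to 0) brings the counter back to zero, after
which the disable request succeeds.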

Signed-off-by: Muchun Song <[email protected]>
---
Documentation/admin-guide/sysctl/vm.rst | 14 +++++
include/linux/memory_hotplug.h | 9 +++
mm/hugetlb_vmemmap.c | 101 +++++++++++++++++++++++++-------
mm/hugetlb_vmemmap.h | 4 +-
mm/memory_hotplug.c | 7 +--
5 files changed, 108 insertions(+), 27 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index f4804ce37c58..9e0e153ed935 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -561,6 +561,20 @@ Change the minimum size of the hugepage pool.
See Documentation/admin-guide/mm/hugetlbpage.rst


+hugetlb_free_vmemmap
+====================
+
+Enable (set to 1) or disable (set to 0) the feature of optimizing vmemmap
+pages associated with each HugeTLB page. Once enabled, the vmemmap pages of
+HugeTLB pages subsequently allocated from the buddy system will be optimized,
+whereas already allocated HugeTLB pages will not be optimized. If you fail
+to disable this feature, you can set "nr_hugepages" to 0 and then retry,
+since it can only be disabled when there are no optimized HugeTLB pages
+in the system.
+
+See Documentation/admin-guide/mm/hugetlbpage.rst
+
+
nr_hugepages_mempolicy
======================

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 1ce6f8044f1e..9b015b254e86 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -348,4 +348,13 @@ void arch_remove_linear_mapping(u64 start, u64 size);
extern bool mhp_supports_memmap_on_memory(unsigned long size);
#endif /* CONFIG_MEMORY_HOTPLUG */

+#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
+bool mhp_memmap_on_memory(void);
+#else
+static inline bool mhp_memmap_on_memory(void)
+{
+ return false;
+}
+#endif
+
#endif /* __LINUX_MEMORY_HOTPLUG_H */
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 33ecb77c2b2a..f920073d52ba 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -176,6 +176,7 @@
*/
#define pr_fmt(fmt) "HugeTLB: " fmt

+#include <linux/memory_hotplug.h>
#include "hugetlb_vmemmap.h"

/*
@@ -192,6 +193,10 @@ DEFINE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON,
hugetlb_free_vmemmap_enabled_key);
EXPORT_SYMBOL(hugetlb_free_vmemmap_enabled_key);

+/* The number of HugeTLB pages whose vmemmap pages are optimized. */
+static atomic_long_t optimized_pages = ATOMIC_LONG_INIT(0);
+static DECLARE_RWSEM(sysctl_rwsem);
+
static int __init early_hugetlb_free_vmemmap_param(char *buf)
{
if (!buf)
@@ -208,11 +213,6 @@ static int __init early_hugetlb_free_vmemmap_param(char *buf)
}
early_param("hugetlb_free_vmemmap", early_hugetlb_free_vmemmap_param);

-static inline unsigned long free_vmemmap_pages_size_per_hpage(struct hstate *h)
-{
- return (unsigned long)free_vmemmap_pages_per_hpage(h) << PAGE_SHIFT;
-}
-
/*
* Previously discarded vmemmap pages will be allocated and remapping
* after this function returns zero.
@@ -221,14 +221,18 @@ int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
{
int ret;
unsigned long vmemmap_addr = (unsigned long)head;
- unsigned long vmemmap_end, vmemmap_reuse;
+ unsigned long vmemmap_end, vmemmap_reuse, vmemmap_pages;

if (!HPageVmemmapOptimized(head))
return 0;

- vmemmap_addr += RESERVE_VMEMMAP_SIZE;
- vmemmap_end = vmemmap_addr + free_vmemmap_pages_size_per_hpage(h);
- vmemmap_reuse = vmemmap_addr - PAGE_SIZE;
+ vmemmap_addr += RESERVE_VMEMMAP_SIZE;
+ vmemmap_pages = free_vmemmap_pages_per_hpage(h);
+ vmemmap_end = vmemmap_addr + (vmemmap_pages << PAGE_SHIFT);
+ vmemmap_reuse = vmemmap_addr - PAGE_SIZE;
+
+ VM_BUG_ON_PAGE(!vmemmap_pages, head);
+
/*
* The pages which the vmemmap virtual address range [@vmemmap_addr,
* @vmemmap_end) are mapped to are freed to the buddy allocator, and
@@ -238,8 +242,14 @@ int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
*/
ret = vmemmap_remap_alloc(vmemmap_addr, vmemmap_end, vmemmap_reuse,
GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
- if (!ret)
+ if (!ret) {
ClearHPageVmemmapOptimized(head);
+ /*
+ * Paired with acquire semantic in
+ * hugetlb_free_vmemmap_handler().
+ */
+ atomic_long_dec_return_release(&optimized_pages);
+ }

return ret;
}
@@ -247,22 +257,28 @@ int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
void free_huge_page_vmemmap(struct hstate *h, struct page *head)
{
unsigned long vmemmap_addr = (unsigned long)head;
- unsigned long vmemmap_end, vmemmap_reuse;
+ unsigned long vmemmap_end, vmemmap_reuse, vmemmap_pages;

- if (!free_vmemmap_pages_per_hpage(h))
- return;
+ down_read(&sysctl_rwsem);
+ vmemmap_pages = free_vmemmap_pages_per_hpage(h);
+ if (!vmemmap_pages)
+ goto out;

- vmemmap_addr += RESERVE_VMEMMAP_SIZE;
- vmemmap_end = vmemmap_addr + free_vmemmap_pages_size_per_hpage(h);
- vmemmap_reuse = vmemmap_addr - PAGE_SIZE;
+ vmemmap_addr += RESERVE_VMEMMAP_SIZE;
+ vmemmap_end = vmemmap_addr + (vmemmap_pages << PAGE_SHIFT);
+ vmemmap_reuse = vmemmap_addr - PAGE_SIZE;

/*
* Remap the vmemmap virtual address range [@vmemmap_addr, @vmemmap_end)
* to the page which @vmemmap_reuse is mapped to, then free the pages
* which the range [@vmemmap_addr, @vmemmap_end] is mapped to.
*/
- if (!vmemmap_remap_free(vmemmap_addr, vmemmap_end, vmemmap_reuse))
+ if (!vmemmap_remap_free(vmemmap_addr, vmemmap_end, vmemmap_reuse)) {
SetHPageVmemmapOptimized(head);
+ atomic_long_inc(&optimized_pages);
+ }
+out:
+ up_read(&sysctl_rwsem);
}

void __init hugetlb_vmemmap_init(struct hstate *h)
@@ -278,9 +294,6 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
BUILD_BUG_ON(__NR_USED_SUBPAGE >=
RESERVE_VMEMMAP_SIZE / sizeof(struct page));

- if (!hugetlb_free_vmemmap_enabled())
- return;
-
vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
/*
* The head page is not to be freed to buddy allocator, the other tail
@@ -296,3 +309,51 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
h->name);
}
+
+static int hugetlb_free_vmemmap_handler(struct ctl_table *table, int write,
+ void *buffer, size_t *length,
+ loff_t *ppos)
+{
+ int ret;
+
+ down_write(&sysctl_rwsem);
+ /*
+ * Cannot be disabled when there is at least one optimized
+ * HugeTLB page in the system.
+ *
+ * The acquire semantic is paired with the release semantic in
+ * alloc_huge_page_vmemmap(). If we see @optimized_pages equal
+ * to 0, all the vmemmap remapping operations done by
+ * alloc_huge_page_vmemmap() are visible too, so that we can
+ * safely disable the static key.
+ */
+ table->extra1 = atomic_long_read_acquire(&optimized_pages) ?
+ SYSCTL_ONE : SYSCTL_ZERO;
+ ret = proc_do_static_key(table, write, buffer, length, ppos);
+ up_write(&sysctl_rwsem);
+
+ return ret;
+}
+
+static struct ctl_table hugetlb_vmemmap_sysctls[] = {
+ {
+ .procname = "hugetlb_free_vmemmap",
+ .data = &hugetlb_free_vmemmap_enabled_key.key,
+ .mode = 0644,
+ .proc_handler = hugetlb_free_vmemmap_handler,
+ },
+ { }
+};
+
+static __init int hugetlb_vmemmap_sysctls_init(void)
+{
+ /*
+ * The vmemmap pages cannot be optimized if
+ * "memory_hotplug.memmap_on_memory" is enabled.
+ */
+ if (!mhp_memmap_on_memory())
+ register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
+
+ return 0;
+}
+late_initcall(hugetlb_vmemmap_sysctls_init);
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index cb2bef8f9e73..b67a159027f4 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -21,7 +21,9 @@ void hugetlb_vmemmap_init(struct hstate *h);
*/
static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
{
- return h->nr_free_vmemmap_pages;
+ if (hugetlb_free_vmemmap_enabled())
+ return h->nr_free_vmemmap_pages;
+ return 0;
}
#else
static inline int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index da594b382829..793c04cfe46f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -63,15 +63,10 @@ static bool memmap_on_memory __ro_after_init;
module_param_cb(memmap_on_memory, &memmap_on_memory_ops, &memmap_on_memory, 0444);
MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");

-static inline bool mhp_memmap_on_memory(void)
+bool mhp_memmap_on_memory(void)
{
return memmap_on_memory;
}
-#else
-static inline bool mhp_memmap_on_memory(void)
-{
- return false;
-}
#endif

enum {
--
2.11.0

2022-03-25 18:52:21

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH v5 1/4] mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2

On Thu, Mar 24, 2022 at 6:40 PM Chen, Rong A <[email protected]> wrote:
>
>
>
> On 3/24/2022 6:20 PM, Muchun Song wrote:
> > On Thu, Mar 24, 2022 at 5:40 PM Chen, Rong A <[email protected]> wrote:
> >>
> >>
> >>
> >> On 3/24/2022 6:13 AM, Andrew Morton wrote:
> >>> On Thu, 24 Mar 2022 06:06:41 +0800 kernel test robot <[email protected]> wrote:
> >>>
> >>>> Hi Muchun,
> >>>>
> >>>> Thank you for the patch! Yet something to improve:
> >>>>
> >>>> [auto build test ERROR on hnaz-mm/master]
> >>>> [also build test ERROR on linus/master next-20220323]
> >>>> [cannot apply to mcgrof/sysctl-next v5.17]
> >>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
> >>>> And when submitting patch, we suggest to use '--base' as documented in
> >>>> https://git-scm.com/docs/git-format-patch]
> >>>>
> >>>> url: https://github.com/0day-ci/linux/commits/Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
> >>>> base: https://github.com/hnaz/linux-mm master
> >>>> config: arc-randconfig-r043-20220323 (https://download.01.org/0day-ci/archive/20220324/[email protected]/config)
> >>>> compiler: arc-elf-gcc (GCC) 11.2.0
> >>>> reproduce (this is a W=1 build):
> >>>> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> >>>> chmod +x ~/bin/make.cross
> >>>> # https://github.com/0day-ci/linux/commit/64211be650af117819368a26d7b86c33df5deea4
> >>>> git remote add linux-review https://github.com/0day-ci/linux
> >>>> git fetch --no-tags linux-review Muchun-Song/add-hugetlb_free_vmemmap-sysctl/20220323-205902
> >>>> git checkout 64211be650af117819368a26d7b86c33df5deea4
> >>>> # save the config file to linux build tree
> >>>> mkdir build_dir
> >>>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arc prepare
> >>>>
> >>>> If you fix the issue, kindly add following tag as appropriate
> >>>> Reported-by: kernel test robot <[email protected]>
> >>>>
> >>>> All errors (new ones prefixed by >>):
> >>>>
> >>>>>> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such file or directory
> >>>
> >>> It would take a lot of talent for Muchun to have caused this!
> >>>
> >>> Methinks you just ran out of disk space?
> >>
> >> Hi Andrew,
> >>
> >> Thanks for the reply, I tried to apply this patch to the head of
> >> mainline and I still can reproduce the error in my local machine:
> >>
> >> $ wget -q -O -
> >> https://lore.kernel.org/lkml/[email protected]/raw
> >> | git apply -v
> >> $ mkdir build_dir && wget
> >> https://download.01.org/0day-ci/archive/20220324/[email protected]/config
> >> -O build_dir/.config
> >> $ COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross
> >> O=build_dir ARCH=arc olddefconfig prepare
> >> make --keep-going CONFIG_OF_ALL_DTBS=y CONFIG_DTC=y
> >> CROSS_COMPILE=/home/nfs/0day/gcc-11.2.0-nolibc/arc-elf/bin/arc-elf-
> >> --jobs=72 O=build_dir ARCH=arc olddefconfig prepare
> >> ...
> >> cc1: fatal error: cannot open 'kernel/bounds.s' for writing: No such
> >> file or directory
> >> compilation terminated.
> >> make[3]: *** [../scripts/Makefile.build:121: kernel/bounds.s] Error 1
> >> make[3]: Target '__build' not remade because of errors.
> >> make[2]: *** [../Makefile:1191: prepare0] Error 2
> >> make[2]: Target 'prepare' not remade because of errors.
> >>
> >
> > Would you help me to test the following patch? Thanks.
>
> I have confirmed the patch can fix the issue.
>

Thanks Chen.

2022-03-25 19:26:16

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH v5 0/4] add hugetlb_free_vmemmap sysctl

On Thu, Mar 24, 2022 at 6:58 AM Luis Chamberlain <[email protected]> wrote:
>
> Masahiro,
>
> can I trouble you to help review the first patch here? I thought
> something like this might be possible, and Muchun has done some good
> work to try it. If anyone can find a hole in that kconfig hack it would
> be you. I'll bounce you a copy of the patches.
>
> Luis


Now, I took a look at it.
Please do not do such a horrible hack.

Thanks.

>
> On Wed, Mar 23, 2022 at 08:55:19PM +0800, Muchun Song wrote:
> > This series is based on next-20220310.
> >
> > This series aims to add a hugetlb_free_vmemmap sysctl to enable the feature
> > of freeing vmemmap pages of HugeTLB pages.
> >
> > v5:
> > - Fix not working properly if one is working off of a very clean build
> > reported by Luis Chamberlain.
> > - Add Suggested-by for Luis Chamberlain.
> >
> > Thanks.
> >
> > v4:
> > - Introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2 inspired by Luis.
> >
> > v3:
> > - Add pr_warn_once() (Mike).
> > - Handle the transition from enabling to disabling (Luis)
> >
> > v2:
> > - Fix compilation when !CONFIG_MHP_MEMMAP_ON_MEMORY reported by kernel
> > test robot <[email protected]>.
> > - Move sysctl code from kernel/sysctl.c to mm/hugetlb_vmemmap.c.
> >
> > Muchun Song (4):
> > mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2
> > mm: memory_hotplug: override memmap_on_memory when
> > hugetlb_free_vmemmap=on
> > sysctl: allow to set extra1 to SYSCTL_ONE
> > mm: hugetlb_vmemmap: add hugetlb_free_vmemmap sysctl
> >
> > Documentation/admin-guide/sysctl/vm.rst | 14 +++++
> > Kbuild | 14 +++++
> > fs/Kconfig | 1 +
> > include/linux/memory_hotplug.h | 9 +++
> > include/linux/mm_types.h | 2 +
> > kernel/sysctl.c | 2 +-
> > mm/Kconfig | 3 +
> > mm/hugetlb_vmemmap.c | 107 ++++++++++++++++++++++++--------
> > mm/hugetlb_vmemmap.h | 4 +-
> > mm/memory_hotplug.c | 27 ++++++--
> > mm/struct_page_size.c | 19 ++++++
> > scripts/check_struct_page_po2.sh | 9 +++
> > 12 files changed, 177 insertions(+), 34 deletions(-)
> > create mode 100644 mm/struct_page_size.c
> > create mode 100755 scripts/check_struct_page_po2.sh
> >
> > --
> > 2.11.0
> >



--
Best Regards
Masahiro Yamada

2022-03-31 02:59:49

by Muchun Song

[permalink] [raw]
Subject: Re: [PATCH v5 1/4] mm: hugetlb_vmemmap: introduce STRUCT_PAGE_SIZE_IS_POWER_OF_2

On Fri, Mar 25, 2022 at 1:11 PM Masahiro Yamada <[email protected]> wrote:
>
> On Wed, Mar 23, 2022 at 9:57 PM Muchun Song <[email protected]> wrote:
> >
> > If the size of "struct page" is not a power of two and this
> > feature is enabled, then the vmemmap pages of HugeTLB will be
> > corrupted after remapping (in theory, a panic is about to happen).
> > But this only happens when !CONFIG_MEMCG && !CONFIG_SLUB on
> > x86_64, which is not a conventional configuration nowadays.
> > So it is not a real-world issue, just the result of a code review.
> > But we have to prevent anyone from choosing that combination of
> > options. To avoid many checks like "is_power_of_2
> > (sizeof(struct page))" throughout mm/hugetlb_vmemmap.c, introduce
> > STRUCT_PAGE_SIZE_IS_POWER_OF_2 to detect whether the size of struct
> > page is a power of 2, and make this feature depend on this new
> > config. Then we can prevent anyone from selecting an unexpected
> > configuration.
> >
> > Signed-off-by: Muchun Song <[email protected]>
> > Suggested-by: Luis Chamberlain <[email protected]>
> > ---
> > Kbuild | 14 ++++++++++++++
> > fs/Kconfig | 1 +
> > include/linux/mm_types.h | 2 ++
> > mm/Kconfig | 3 +++
> > mm/hugetlb_vmemmap.c | 6 ------
> > mm/struct_page_size.c | 19 +++++++++++++++++++
> > scripts/check_struct_page_po2.sh | 9 +++++++++
> > 7 files changed, 48 insertions(+), 6 deletions(-)
> > create mode 100644 mm/struct_page_size.c
> > create mode 100755 scripts/check_struct_page_po2.sh
> >
> > diff --git a/Kbuild b/Kbuild
> > index fa441b98c9f6..21415c3b2728 100644
> > --- a/Kbuild
> > +++ b/Kbuild
> > @@ -37,6 +37,20 @@ $(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
> > $(call filechk,offsets,__ASM_OFFSETS_H__)
> >
> > #####
> > +# Generate struct_page_size.h.
> > +
> > +struct_page_size-file := include/generated/struct_page_size.h
> > +
> > +always-y := $(struct_page_size-file)
> > +targets := mm/struct_page_size.s
> > +
> > +mm/struct_page_size.s: $(timeconst-file) $(bounds-file)
> > +
> > +$(struct_page_size-file): mm/struct_page_size.s FORCE
> > + $(call filechk,offsets,__LINUX_STRUCT_PAGE_SIZE_H__)
> > + $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig
>
>
> No, please do not do this.
> It is terrible to feed this back to Kconfig again.

OK. I'll remove syncconfig.

>
> If you know this happens on !CONFIG_MEMCG && !CONFIG_SLUB on x86_64,
> why don't you add this dependency directly?

That is not enough, since the size of struct page also depends on
LAST_CPUPID_NOT_IN_PAGE_FLAGS && CONFIG_SLAB, and we cannot know
the value of LAST_CPUPID_NOT_IN_PAGE_FLAGS in Kconfig.

>
>
> If you want to avoid the run-time check,
> why don't you use BUILD_BUG_ON() ?
>

Now I have another solution that avoids the run-time check.
We could use a STRUCT_PAGE_SIZE_IS_POWER_OF_2 macro to do that,
like the following.

#ifdef STRUCT_PAGE_SIZE_IS_POWER_OF_2
/* code */
#endif
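
To make the two compile-time options discussed here concrete, below is a
rough userspace sketch in C11, not a proposed kernel change. struct
fake_page, IS_POWER_OF_2() and vmemmap_optimization_supported() are
hypothetical names used only for illustration. Since the preprocessor
cannot evaluate sizeof, STRUCT_PAGE_SIZE_IS_POWER_OF_2 has to be produced
by an earlier build step; the sketch simply defines it by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: true when n is a non-zero power of two. */
#define IS_POWER_OF_2(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

/* Hypothetical stand-in for struct page, 64 bytes by construction. */
struct fake_page {
	unsigned char bytes[64];
};

/*
 * Option 1, in the spirit of BUILD_BUG_ON(): refuse to build when the
 * size is not a power of two.
 */
static_assert(IS_POWER_OF_2(sizeof(struct fake_page)),
	      "struct fake_page size must be a power of two");

/*
 * Option 2: compile the feature out instead of failing the build.
 * Assumption: a real build would generate this macro; here it is
 * defined by hand.
 */
#define STRUCT_PAGE_SIZE_IS_POWER_OF_2 1

static int vmemmap_optimization_supported(void)
{
#ifdef STRUCT_PAGE_SIZE_IS_POWER_OF_2
	return 1;	/* feature code would live here */
#else
	return 0;	/* feature compiled out */
#endif
}
```

The first option stops the build for an unsupported layout, while the
second keeps the build working and silently drops the feature.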

Thanks.