v3 -
Used weak attribute method of defining arch_hugetlb_valid_size.
This eliminates changes to arch specific hugetlb.h files (Peter)
Updated documentation (Peter, Randy)
Fixed handling of implicitly specified gigantic page preallocation
in existing code and removed documentation of such. There is now
no difference between handling of gigantic and non-gigantic pages.
(Peter, Nitesh).
This requires the most review as there is a small change to
undocumented behavior. See patch 4 commit message for details.
Added Acks and Reviews (Mina, Peter)
v2 -
Fix build errors with patch 1 (Will)
Change arch_hugetlb_valid_size arg to unsigned long and remove
irrelevant 'extern' keyword (Christophe)
Documentation and other misc changes (Randy, Christophe, Mina)
Do not process command line options if !hugepages_supported()
(Dave, but it sounds like we may want to additional changes to
hugepages_supported() for x86? If that is needed I would prefer
a separate patch.)
Longpeng(Mike) reported a weird message from hugetlb command line processing
and proposed a solution [1]. While the proposed patch does address the
specific issue, there are other related issues in command line processing.
As hugetlbfs evolved, updates to command line processing have been made to
meet immediate needs and not necessarily in a coordinated manner. The result
is that some processing is done in arch specific code, some is done in arch
independent code and coordination is problematic. Semantics can vary between
architectures.
The patch series does the following:
- Define arch specific arch_hugetlb_valid_size routine used to validate
passed huge page sizes.
- Move hugepagesz= command line parsing out of arch specific code and into
an arch independent routine.
- Clean up command line processing to follow desired semantics and
document those semantics.
[1] https://lore.kernel.org/linux-mm/[email protected]
Mike Kravetz (4):
hugetlbfs: add arch_hugetlb_valid_size
hugetlbfs: move hugepagesz= parsing to arch independent code
hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
hugetlbfs: clean up command line processing
.../admin-guide/kernel-parameters.txt | 40 ++--
Documentation/admin-guide/mm/hugetlbpage.rst | 35 ++++
arch/arm64/mm/hugetlbpage.c | 30 +--
arch/powerpc/mm/hugetlbpage.c | 30 +--
arch/riscv/mm/hugetlbpage.c | 24 +--
arch/s390/mm/hugetlbpage.c | 24 +--
arch/sparc/mm/init_64.c | 43 +---
arch/x86/mm/hugetlbpage.c | 23 +--
include/linux/hugetlb.h | 2 +-
mm/hugetlb.c | 190 +++++++++++++++---
10 files changed, 271 insertions(+), 170 deletions(-)
--
2.25.2
With all hugetlb page processing done in a single file clean up code.
- Make code match desired semantics
- Update documentation with semantics
- Make all warnings and errors messages start with 'HugeTLB:'.
- Consistently name command line parsing routines.
- Check for hugepages_supported() before processing parameters.
- Add comments to code
- Describe some of the subtle interactions
- Describe semantics of command line arguments
This patch also fixes issues with implicitly setting the number of
gigantic huge pages to preallocate. Previously on X86 command line,
hugepages=2 default_hugepagesz=1G
would result in zero 1G pages being preallocated and,
# grep HugePages_Total /proc/meminfo
HugePages_Total: 0
# sysctl -a | grep nr_hugepages
vm.nr_hugepages = 2
vm.nr_hugepages_mempolicy = 2
# cat /proc/sys/vm/nr_hugepages
2
After this patch 2 gigantic pages will be preallocated and all the
proc, sysfs, sysctl and meminfo files will accurately reflect this.
To address the issue with gigantic pages, a small change in behavior
was made to command line processing. Previously the command line,
hugepages=128 default_hugepagesz=2M hugepagesz=2M hugepages=256
would result in the allocation of 256 2M huge pages. The value 128
would be ignored without any warning. After this patch, 128 2M pages
will be allocated and a warning message will be displayed indicating
the value of 256 is ignored. This change in behavior is required
because allocation of implicitly specified gigantic pages must be done
when the default_hugepagesz= is encountered for gigantic pages.
Previously the code waited until later in the boot process (hugetlb_init),
to allocate pages of default size. However the bootmem allocator required
for gigantic allocations is not available at this time.
Signed-off-by: Mike Kravetz <[email protected]>
---
.../admin-guide/kernel-parameters.txt | 40 +++--
Documentation/admin-guide/mm/hugetlbpage.rst | 35 ++++
mm/hugetlb.c | 159 ++++++++++++++----
3 files changed, 190 insertions(+), 44 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f2a93c8679e8..8cd78cc87a1c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -834,12 +834,15 @@
See also Documentation/networking/decnet.txt.
default_hugepagesz=
- [same as hugepagesz=] The size of the default
- HugeTLB page size. This is the size represented by
- the legacy /proc/ hugepages APIs, used for SHM, and
- default size when mounting hugetlbfs filesystems.
- Defaults to the default architecture's huge page size
- if not specified.
+ [HW] The size of the default HugeTLB page. This is
+ the size represented by the legacy /proc/ hugepages
+ APIs. In addition, this is the default hugetlb size
+ used for shmget(), mmap() and mounting hugetlbfs
+ filesystems. If not specified, defaults to the
+ architecture's default huge page size. Huge page
+ sizes are architecture dependent. See also
+ Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: size[KMG]
deferred_probe_timeout=
[KNL] Debugging option to set a timeout in seconds for
@@ -1479,13 +1482,24 @@
hugepages using the cma allocator. If enabled, the
boot-time allocation of gigantic hugepages is skipped.
- hugepages= [HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
- hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
- On x86-64 and powerpc, this option can be specified
- multiple times interleaved with hugepages= to reserve
- huge pages of different sizes. Valid pages sizes on
- x86-64 are 2M (when the CPU supports "pse") and 1G
- (when the CPU supports the "pdpe1gb" cpuinfo flag).
+ hugepages= [HW] Number of HugeTLB pages to allocate at boot.
+ If this follows hugepagesz (below), it specifies
+ the number of pages of hugepagesz to be allocated.
+ If this is the first HugeTLB parameter on the command
+ line, it specifies the number of pages to allocate for
+ the default huge page size. See also
+ Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: <integer>
+
+ hugepagesz=
+ [HW] The size of the HugeTLB pages. This is used in
+ conjunction with hugepages (above) to allocate huge
+ pages of a specific size at boot. The pair
+ hugepagesz=X hugepages=Y can be specified once for
+ each supported huge page size. Huge page sizes are
+ architecture dependent. See also
+ Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: size[KMG]
hung_task_panic=
[KNL] Should the hung task detector generate panics.
diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index 1cc0bc78d10e..5026e58826e2 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -100,6 +100,41 @@ with a huge page size selection parameter "hugepagesz=<size>". <size> must
be specified in bytes with optional scale suffix [kKmMgG]. The default huge
page size may be selected with the "default_hugepagesz=<size>" boot parameter.
+Hugetlb boot command line parameter semantics
+hugepagesz - Specify a huge page size. Used in conjunction with hugepages
+ parameter to preallocate a number of huge pages of the specified
+ size. Hence, hugepagesz and hugepages are typically specified in
+ pairs such as:
+ hugepagesz=2M hugepages=512
+ hugepagesz can only be specified once on the command line for a
+ specific huge page size. Valid huge page sizes are architecture
+ dependent.
+hugepages - Specify the number of huge pages to preallocate. This typically
+ follows a valid hugepagesz or default_hugepagesz parameter. However,
+ if hugepages is the first or only hugetlb command line parameter it
+ implicitly specifies the number of huge pages of default size to
+ allocate. If the number of huge pages of default size is implicitly
+ specified, it can not be overwritten by a hugepagesz,hugepages
+ parameter pair for the default size.
+ For example, on an architecture with 2M default huge page size:
+ hugepages=256 hugepagesz=2M hugepages=512
+ will result in 256 2M huge pages being allocated and a warning message
+ indicating that the hugepages=512 parameter is ignored. If a hugepages
+ parameter is preceded by an invalid hugepagesz parameter, it will
+ be ignored.
+default_hugepagesz - Specify the default huge page size. This parameter can
+ only be specified once on the command line. default_hugepagesz can
+ optionally be followed by the hugepages parameter to preallocate a
+ specific number of huge pages of default size. The number of default
+ sized huge pages to preallocate can also be implicitly specified as
+ mentioned in the hugepages section above. Therefore, on an
+ architecture with 2M default huge page size:
+ hugepages=256
+ default_hugepagesz=2M hugepages=256
+ hugepages=256 default_hugepagesz=2M
+ will all result in 256 2M huge pages being allocated. Valid default
+ huge page size is architecture dependent.
+
When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
indicates the current number of pre-allocated huge pages of the default size.
Thus, one can use the following command to dynamically allocate/deallocate
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0e6eb755ae94..e272ebc2b5cb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -59,8 +59,8 @@ __initdata LIST_HEAD(huge_boot_pages);
/* for command line parsing */
static struct hstate * __initdata parsed_hstate;
static unsigned long __initdata default_hstate_max_huge_pages;
-static unsigned long __initdata default_hstate_size;
static bool __initdata parsed_valid_hugepagesz = true;
+static bool __initdata parsed_default_hugepagesz;
/*
* Protects updates to hugepage_freelists, hugepage_activelist, nr_huge_pages,
@@ -3060,7 +3060,7 @@ static void __init hugetlb_sysfs_init(void)
err = hugetlb_sysfs_add_hstate(h, hugepages_kobj,
hstate_kobjs, &hstate_attr_group);
if (err)
- pr_err("Hugetlb: Unable to add hstate %s", h->name);
+ pr_err("HugeTLB: Unable to add hstate %s", h->name);
}
}
@@ -3164,7 +3164,7 @@ static void hugetlb_register_node(struct node *node)
nhs->hstate_kobjs,
&per_node_hstate_attr_group);
if (err) {
- pr_err("Hugetlb: Unable to add hstate %s for node %d\n",
+ pr_err("HugeTLB: Unable to add hstate %s for node %d\n",
h->name, node->dev.id);
hugetlb_unregister_node(node);
break;
@@ -3215,19 +3215,35 @@ static int __init hugetlb_init(void)
if (!hugepages_supported())
return 0;
- if (!size_to_hstate(default_hstate_size)) {
- if (default_hstate_size != 0) {
- pr_err("HugeTLB: unsupported default_hugepagesz %lu. Reverting to %lu\n",
- default_hstate_size, HPAGE_SIZE);
+ /*
+ * Make sure HPAGE_SIZE (HUGETLB_PAGE_ORDER) hstate exists. Some
+ * architectures depend on setup being done here.
+ */
+ hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
+ if (!parsed_default_hugepagesz) {
+ /*
+ * If we did not parse a default huge page size, set
+ * default_hstate_idx to HPAGE_SIZE hstate. And, if the
+ * number of huge pages for this default size was implicitly
+ * specified, set that here as well.
+ * Note that the implicit setting will overwrite an explicit
+ * setting. A warning will be printed in this case.
+ */
+ default_hstate_idx = hstate_index(size_to_hstate(HPAGE_SIZE));
+ if (default_hstate_max_huge_pages) {
+ if (default_hstate.max_huge_pages) {
+ char buf[32];
+
+ string_get_size(huge_page_size(&default_hstate),
+ 1, STRING_UNITS_2, buf, 32);
+ pr_warn("HugeTLB: Ignoring hugepages=%lu associated with %s page size\n",
+ default_hstate.max_huge_pages, buf);
+ pr_warn("HugeTLB: Using hugepages=%lu for number of default huge pages\n",
+ default_hstate_max_huge_pages);
+ }
+ default_hstate.max_huge_pages =
+ default_hstate_max_huge_pages;
}
-
- default_hstate_size = HPAGE_SIZE;
- hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
- }
- default_hstate_idx = hstate_index(size_to_hstate(default_hstate_size));
- if (default_hstate_max_huge_pages) {
- if (!default_hstate.max_huge_pages)
- default_hstate.max_huge_pages = default_hstate_max_huge_pages;
}
hugetlb_cma_check();
@@ -3287,20 +3303,34 @@ void __init hugetlb_add_hstate(unsigned int order)
parsed_hstate = h;
}
-static int __init hugetlb_nrpages_setup(char *s)
+/*
+ * hugepages command line processing
+ * hugepages normally follows a valid hugepagsz or default_hugepagsz
+ * specification. If not, ignore the hugepages value. hugepages can also
+ * be the first huge page command line option in which case it implicitly
+ * specifies the number of huge pages for the default size.
+ */
+static int __init hugepages_setup(char *s)
{
unsigned long *mhp;
static unsigned long *last_mhp;
+ if (!hugepages_supported()) {
+ pr_warn("HugeTLB: huge pages not supported, ignoring hugepages = %s\n", s);
+ return 0;
+ }
+
if (!parsed_valid_hugepagesz) {
- pr_warn("hugepages = %s preceded by "
- "an unsupported hugepagesz, ignoring\n", s);
+ pr_warn("HugeTLB: hugepages=%s does not follow a valid hugepagesz, ignoring\n", s);
parsed_valid_hugepagesz = true;
- return 1;
+ return 0;
}
+
/*
- * !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter yet,
- * so this hugepages= parameter goes to the "default hstate".
+ * !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter
+ * yet, so this hugepages= parameter goes to the "default hstate".
+ * Otherwise, it goes with the previously parsed hugepagesz or
+ * default_hugepagesz.
*/
else if (!hugetlb_max_hstate)
mhp = &default_hstate_max_huge_pages;
@@ -3308,8 +3338,8 @@ static int __init hugetlb_nrpages_setup(char *s)
mhp = &parsed_hstate->max_huge_pages;
if (mhp == last_mhp) {
- pr_warn("hugepages= specified twice without interleaving hugepagesz=, ignoring\n");
- return 1;
+ pr_warn("HugeTLB: hugepages= specified twice without interleaving hugepagesz=, ignoring hugepages=%s\n", s);
+ return 0;
}
if (sscanf(s, "%lu", mhp) <= 0)
@@ -3327,42 +3357,109 @@ static int __init hugetlb_nrpages_setup(char *s)
return 1;
}
-__setup("hugepages=", hugetlb_nrpages_setup);
+__setup("hugepages=", hugepages_setup);
+/*
+ * hugepagesz command line processing
+ * A specific huge page size can only be specified once with hugepagesz.
+ * hugepagesz is followed by hugepages on the command line. The global
+ * variable 'parsed_valid_hugepagesz' is used to determine if prior
+ * hugepagesz argument was valid.
+ */
static int __init hugepagesz_setup(char *s)
{
unsigned long size;
+ struct hstate *h;
+
+ parsed_valid_hugepagesz = false;
+ if (!hugepages_supported()) {
+ pr_warn("HugeTLB: huge pages not supported, ignoring hugepagesz = %s\n", s);
+ return 0;
+ }
size = (unsigned long)memparse(s, NULL);
if (!arch_hugetlb_valid_size(size)) {
- parsed_valid_hugepagesz = false;
- pr_err("HugeTLB: unsupported hugepagesz %s\n", s);
+ pr_err("HugeTLB: unsupported hugepagesz=%s\n", s);
return 0;
}
- if (size_to_hstate(size)) {
- pr_warn("HugeTLB: hugepagesz %s specified twice, ignoring\n", s);
- return 0;
+ h = size_to_hstate(size);
+ if (h) {
+ /*
+ * hstate for this size already exists. This is normally
+ * an error, but is allowed if the existing hstate is the
+ * default hstate. More specifically, it is only allowed if
+ * the number of huge pages for the default hstate was not
+ * previously specified.
+ */
+ if (!parsed_default_hugepagesz || h != &default_hstate ||
+ default_hstate.max_huge_pages) {
+ pr_warn("HugeTLB: hugepagesz=%s specified twice, ignoring\n", s);
+ return 0;
+ }
+
+ /*
+ * No need to call hugetlb_add_hstate() as hstate already
+ * exists. But, do set parsed_hstate so that a following
+ * hugepages= parameter will be applied to this hstate.
+ */
+ parsed_hstate = h;
+ parsed_valid_hugepagesz = true;
+ return 1;
}
hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
+ parsed_valid_hugepagesz = true;
return 1;
}
__setup("hugepagesz=", hugepagesz_setup);
+/*
+ * default_hugepagesz command line input
+ * Only one instance of default_hugepagesz allowed on command line.
+ */
static int __init default_hugepagesz_setup(char *s)
{
unsigned long size;
+ parsed_valid_hugepagesz = false;
+ if (!hugepages_supported()) {
+ pr_warn("HugeTLB: huge pages not supported, ignoring default_hugepagesz = %s\n", s);
+ return 0;
+ }
+
+ if (parsed_default_hugepagesz) {
+ pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s);
+ return 0;
+ }
+
size = (unsigned long)memparse(s, NULL);
if (!arch_hugetlb_valid_size(size)) {
- pr_err("HugeTLB: unsupported default_hugepagesz %s\n", s);
+ pr_err("HugeTLB: unsupported default_hugepagesz=%s\n", s);
return 0;
}
- default_hstate_size = size;
+ hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
+ parsed_valid_hugepagesz = true;
+ parsed_default_hugepagesz = true;
+ default_hstate_idx = hstate_index(size_to_hstate(size));
+
+ /*
+ * The number of default huge pages (for this size) could have been
+ * specified as the first hugetlb parameter: hugepages=X. If so,
+ * then default_hstate_max_huge_pages is set. If the default huge
+ * page size is gigantic (>= MAX_ORDER), then the pages must be
+ * allocated here from bootmem allocator.
+ */
+ if (default_hstate_max_huge_pages) {
+ default_hstate.max_huge_pages = default_hstate_max_huge_pages;
+ if (hstate_is_gigantic(&default_hstate))
+ hugetlb_hstate_alloc_pages(&default_hstate);
+ default_hstate_max_huge_pages = 0;
+ }
+
return 1;
}
__setup("default_hugepagesz=", default_hugepagesz_setup);
--
2.25.2
> On Apr 17, 2020, at 2:50 PM, Mike Kravetz <[email protected]> wrote:
>
> Longpeng(Mike) reported a weird message from hugetlb command line processing
> and proposed a solution [1]. While the proposed patch does address the
> specific issue, there are other related issues in command line processing.
> As hugetlbfs evolved, updates to command line processing have been made to
> meet immediate needs and not necessarily in a coordinated manner. The result
> is that some processing is done in arch specific code, some is done in arch
> independent code and coordination is problematic. Semantics can vary between
> architectures.
>
> The patch series does the following:
> - Define arch specific arch_hugetlb_valid_size routine used to validate
> passed huge page sizes.
> - Move hugepagesz= command line parsing out of arch specific code and into
> an arch independent routine.
> - Clean up command line processing to follow desired semantics and
> document those semantics.
>
> [1] https://lore.kernel.org/linux-mm/[email protected]
>
> Mike Kravetz (4):
> hugetlbfs: add arch_hugetlb_valid_size
> hugetlbfs: move hugepagesz= parsing to arch independent code
> hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
> hugetlbfs: clean up command line processing
Reverted this series fixed many undefined behaviors on arm64 with the config,
https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config
[ 54.172683][ T1] UBSAN: shift-out-of-bounds in ./include/linux/hugetlb.h:555:34
[ 54.180411][ T1] shift exponent 4294967285 is too large for 64-bit type 'unsigned long'
[ 54.188885][ T1] CPU: 130 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc2-next-20200420 #1
[ 54.197284][ T1] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019
[ 54.207888][ T1] Call trace:
[ 54.211100][ T1] dump_backtrace+0x0/0x224
[ 54.215565][ T1] show_stack+0x20/0x2c
[ 54.219651][ T1] dump_stack+0xfc/0x184
[ 54.223829][ T1] __ubsan_handle_shift_out_of_bounds+0x304/0x344
[ 54.230204][ T1] hugetlb_add_hstate+0x3ec/0x414
huge_page_size at include/linux/hugetlb.h:555
(inlined by) hugetlb_add_hstate at mm/hugetlb.c:3301
[ 54.235191][ T1] hugetlbpage_init+0x14/0x30
[ 54.239824][ T1] do_one_initcall+0x6c/0x144
[ 54.244446][ T1] do_initcall_level+0x158/0x1c4
[ 54.249336][ T1] do_initcalls+0x68/0xb0
[ 54.253597][ T1] do_basic_setup+0x28/0x30
[ 54.258049][ T1] kernel_init_freeable+0x19c/0x228
[ 54.263188][ T1] kernel_init+0x14/0x208
[ 54.267473][ T1] ret_from_fork+0x10/0x18
[ 55.534338][ T1] UBSAN: shift-out-of-bounds in ./include/linux/hugetlb.h:555:34
[ 55.542064][ T1] shift exponent 4294967285 is too large for 64-bit type 'unsigned long'
[ 55.550555][ T1] CPU: 129 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc2-next-20200420 #1
[ 55.558992][ T1] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019
[ 55.569659][ T1] Call trace:
[ 55.572898][ T1] dump_backtrace+0x0/0x224
[ 55.577335][ T1] show_stack+0x20/0x2c
[ 55.581442][ T1] dump_stack+0xfc/0x184
[ 55.585621][ T1] __ubsan_handle_shift_out_of_bounds+0x304/0x344
[ 55.592031][ T1] __hugetlb_cgroup_file_dfl_init+0x37c/0x384
[ 55.598062][ T1] hugetlb_cgroup_file_init+0x9c/0xd8
[ 55.603399][ T1] hugetlb_init+0x248/0x448
[ 55.607840][ T1] do_one_initcall+0x6c/0x144
[ 55.612493][ T1] do_initcall_level+0x158/0x1c4
[ 55.617404][ T1] do_initcalls+0x68/0xb0
[ 55.621664][ T1] do_basic_setup+0x28/0x30
[ 55.626107][ T1] kernel_init_freeable+0x19c/0x228
[ 55.631253][ T1] kernel_init+0x14/0x208
[ 55.635519][ T1] ret_from_fork+0x10/0x18
[ 153.283648][ T1] ================================================================================
[ 153.293078][ T1] UBSAN: shift-out-of-bounds in ./include/linux/hugetlb.h:555:34
[ 153.300841][ T1] shift exponent 4294967285 is too large for 64-bit type 'unsigned long'
[ 153.309185][ T1] CPU: 161 PID: 1 Comm: swapper/0 Tainted: G L 5.7.0-rc2-next-20200420 #1
[ 153.318879][ T1] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019
[ 153.329352][ T1] Call trace:
[ 153.332545][ T1] dump_backtrace+0x0/0x224
[ 153.336945][ T1] show_stack+0x20/0x2c
[ 153.341000][ T1] dump_stack+0xfc/0x184
[ 153.345149][ T1] __ubsan_handle_shift_out_of_bounds+0x304/0x344
[ 153.351465][ T1] hugetlbfs_fill_super+0x424/0x43c
[ 153.356560][ T1] vfs_get_super+0xcc/0x170
[ 153.360959][ T1] get_tree_nodev+0x28/0x34
[ 153.365358][ T1] hugetlbfs_get_tree+0xfc/0x128
[ 153.370193][ T1] vfs_get_tree+0x54/0x158
[ 153.374513][ T1] fc_mount+0x1c/0x5c
[ 153.378399][ T1] mount_one_hugetlbfs+0x54/0xc8
[ 153.383233][ T1] init_hugetlbfs_fs+0x18c/0x268
[ 153.388068][ T1] do_one_initcall+0x6c/0x144
[ 153.392647][ T1] do_initcall_level+0x158/0x1c4
[ 153.397480][ T1] do_initcalls+0x68/0xb0
[ 153.401706][ T1] do_basic_setup+0x28/0x30
[ 153.406105][ T1] kernel_init_freeable+0x19c/0x228
[ 153.411208][ T1] kernel_init+0x14/0x208
[ 153.415436][ T1] ret_from_fork+0x10/0x18
[ 194.312926][ T1828] UBSAN: shift-out-of-bounds in ./include/linux/hugetlb.h:584:11
[ 194.320664][ T1828] shift exponent 4294967285 is too large for 32-bit type 'int'
[ 194.328103][ T1828] CPU: 194 PID: 1828 Comm: systemd-journal Tainted: G L 5.7.0-rc2-next-20200420 #1
[ 194.338558][ T1828] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019
[ 194.349010][ T1828] Call trace:
[ 194.352183][ T1828] dump_backtrace+0x0/0x224
[ 194.356560][ T1828] show_stack+0x20/0x2c
[ 194.360595][ T1828] dump_stack+0xfc/0x184
[ 194.364724][ T1828] __ubsan_handle_shift_out_of_bounds+0x304/0x344
[ 194.371020][ T1828] hugetlb_total_pages+0x100/0x128
[ 194.376017][ T1828] vm_commit_limit+0x54/0xb0
[ 194.380484][ T1828] meminfo_proc_show+0x8f4/0xc4c
[ 194.385297][ T1828] seq_read+0x380/0x930
[ 194.389353][ T1828] pde_read+0x5c/0x78
[ 194.393232][ T1828] proc_reg_read+0x74/0xc0
[ 194.397528][ T1828] __vfs_read+0x84/0x1d0
[ 194.401646][ T1828] vfs_read+0xec/0x124
[ 194.405588][ T1828] ksys_read+0xb0/0x120
[ 194.409643][ T1828] __arm64_sys_read+0x54/0x88
[ 194.414195][ T1828] do_el0_svc+0x128/0x1dc
[ 194.418405][ T1828] el0_sync_handler+0x150/0x250
[ 194.423157][ T1828] el0_sync+0x164/0x180
[ 194.427425][ T1828] ================================================================================
[ 194.436930][ T1828] ================================================================================
[ 194.446355][ T1828] UBSAN: shift-out-of-bounds in mm/hugetlb.c:3564:23
[ 194.453190][ T1828] shift exponent 4294967285 is too large for 64-bit type 'unsigned long'
[ 194.461752][ T1828] CPU: 194 PID: 1828 Comm: systemd-journal Tainted: G L 5.7.0-rc2-next-20200420 #1
[ 194.472245][ T1828] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019
[ 194.482720][ T1828] Call trace:
[ 194.485909][ T1828] dump_backtrace+0x0/0x224
[ 194.490312][ T1828] show_stack+0x20/0x2c
[ 194.494368][ T1828] dump_stack+0xfc/0x184
[ 194.498513][ T1828] __ubsan_handle_shift_out_of_bounds+0x304/0x344
[ 194.504828][ T1828] hugetlb_report_meminfo+0x25c/0x2ac
[ 194.510103][ T1828] meminfo_proc_show+0xc08/0xc4c
[ 194.514938][ T1828] seq_read+0x380/0x930
[ 194.518993][ T1828] pde_read+0x5c/0x78
[ 194.522874][ T1828] proc_reg_read+0x74/0xc0
[ 194.527190][ T1828] __vfs_read+0x84/0x1d0
[ 194.531335][ T1828] vfs_read+0xec/0x124
[ 194.535304][ T1828] ksys_read+0xb0/0x120
[ 194.539371][ T1828] __arm64_sys_read+0x54/0x88
[ 194.543958][ T1828] do_el0_svc+0x128/0x1dc
[ 194.548187][ T1828] el0_sync_handler+0x150/0x250
[ 194.552936][ T1828] el0_sync+0x164/0x180
>
> .../admin-guide/kernel-parameters.txt | 40 ++--
> Documentation/admin-guide/mm/hugetlbpage.rst | 35 ++++
> arch/arm64/mm/hugetlbpage.c | 30 +--
> arch/powerpc/mm/hugetlbpage.c | 30 +--
> arch/riscv/mm/hugetlbpage.c | 24 +--
> arch/s390/mm/hugetlbpage.c | 24 +--
> arch/sparc/mm/init_64.c | 43 +---
> arch/x86/mm/hugetlbpage.c | 23 +--
> include/linux/hugetlb.h | 2 +-
> mm/hugetlb.c | 190 +++++++++++++++---
> 10 files changed, 271 insertions(+), 170 deletions(-)
>
> --
> 2.25.2
>
>
On 4/20/20 8:34 AM, Qian Cai wrote:
>
>
>> On Apr 17, 2020, at 2:50 PM, Mike Kravetz <[email protected]> wrote:
>>
>> Longpeng(Mike) reported a weird message from hugetlb command line processing
>> and proposed a solution [1]. While the proposed patch does address the
>> specific issue, there are other related issues in command line processing.
>> As hugetlbfs evolved, updates to command line processing have been made to
>> meet immediate needs and not necessarily in a coordinated manner. The result
>> is that some processing is done in arch specific code, some is done in arch
>> independent code and coordination is problematic. Semantics can vary between
>> architectures.
>>
>> The patch series does the following:
>> - Define arch specific arch_hugetlb_valid_size routine used to validate
>> passed huge page sizes.
>> - Move hugepagesz= command line parsing out of arch specific code and into
>> an arch independent routine.
>> - Clean up command line processing to follow desired semantics and
>> document those semantics.
>>
>> [1] https://lore.kernel.org/linux-mm/[email protected]
>>
>> Mike Kravetz (4):
>> hugetlbfs: add arch_hugetlb_valid_size
>> hugetlbfs: move hugepagesz= parsing to arch independent code
>> hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
>> hugetlbfs: clean up command line processing
>
> Reverted this series fixed many undefined behaviors on arm64 with the config,
>
> https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config
>
> [ 54.172683][ T1] UBSAN: shift-out-of-bounds in ./include/linux/hugetlb.h:555:34
> [ 54.180411][ T1] shift exponent 4294967285 is too large for 64-bit type 'unsigned long'
> [ 54.188885][ T1] CPU: 130 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc2-next-20200420 #1
> [ 54.197284][ T1] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019
> [ 54.207888][ T1] Call trace:
> [ 54.211100][ T1] dump_backtrace+0x0/0x224
> [ 54.215565][ T1] show_stack+0x20/0x2c
> [ 54.219651][ T1] dump_stack+0xfc/0x184
> [ 54.223829][ T1] __ubsan_handle_shift_out_of_bounds+0x304/0x344
> [ 54.230204][ T1] hugetlb_add_hstate+0x3ec/0x414
> huge_page_size at include/linux/hugetlb.h:555
> (inlined by) hugetlb_add_hstate at mm/hugetlb.c:3301
> [ 54.235191][ T1] hugetlbpage_init+0x14/0x30
> [ 54.239824][ T1] do_one_initcall+0x6c/0x144
> [ 54.244446][ T1] do_initcall_level+0x158/0x1c4
> [ 54.249336][ T1] do_initcalls+0x68/0xb0
> [ 54.253597][ T1] do_basic_setup+0x28/0x30
> [ 54.258049][ T1] kernel_init_freeable+0x19c/0x228
> [ 54.263188][ T1] kernel_init+0x14/0x208
> [ 54.267473][ T1] ret_from_fork+0x10/0x18
While rearranging the code (patch 3 in series), I made the incorrect
assumption that CONT_XXX_SIZE == (1UL << CONT_XXX_SHIFT). However,
this is not the case. Does the following patch fix these issues?
From b75cb4a0852e208bee8c4eb347dc076fcaa88859 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <[email protected]>
Date: Mon, 20 Apr 2020 10:41:18 -0700
Subject: [PATCH] arm64/hugetlb: fix hugetlb initialization
When calling hugetlb_add_hstate() to initialize a new hugetlb size,
be sure to use correct huge pages size order.
Signed-off-by: Mike Kravetz <[email protected]>
---
arch/arm64/mm/hugetlbpage.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 9ca840527296..a02411a1f19a 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -453,11 +453,11 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
static int __init hugetlbpage_init(void)
{
#ifdef CONFIG_ARM64_4K_PAGES
- hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
+ hugetlb_add_hstate(ilog2(PUD_SIZE) - PAGE_SHIFT);
#endif
- hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
- hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
- hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
+ hugetlb_add_hstate(ilog2(CONT_PMD_SIZE) - PAGE_SHIFT);
+ hugetlb_add_hstate(ilog2(PMD_SIZE) - PAGE_SHIFT);
+ hugetlb_add_hstate(ilog2(CONT_PTE_SIZE) - PAGE_SHIFT);
return 0;
}
--
2.25.2
On Mon, Apr 20, 2020 at 11:20:23AM -0700, Mike Kravetz wrote:
> On 4/20/20 8:34 AM, Qian Cai wrote:
> >
> >
> >> On Apr 17, 2020, at 2:50 PM, Mike Kravetz <[email protected]> wrote:
> >>
> >> Longpeng(Mike) reported a weird message from hugetlb command line processing
> >> and proposed a solution [1]. While the proposed patch does address the
> >> specific issue, there are other related issues in command line processing.
> >> As hugetlbfs evolved, updates to command line processing have been made to
> >> meet immediate needs and not necessarily in a coordinated manner. The result
> >> is that some processing is done in arch specific code, some is done in arch
> >> independent code and coordination is problematic. Semantics can vary between
> >> architectures.
> >>
> >> The patch series does the following:
> >> - Define arch specific arch_hugetlb_valid_size routine used to validate
> >> passed huge page sizes.
> >> - Move hugepagesz= command line parsing out of arch specific code and into
> >> an arch independent routine.
> >> - Clean up command line processing to follow desired semantics and
> >> document those semantics.
> >>
> >> [1] https://lore.kernel.org/linux-mm/[email protected]
> >>
> >> Mike Kravetz (4):
> >> hugetlbfs: add arch_hugetlb_valid_size
> >> hugetlbfs: move hugepagesz= parsing to arch independent code
> >> hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
> >> hugetlbfs: clean up command line processing
> >
> > Reverted this series fixed many undefined behaviors on arm64 with the config,
> >
> > https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config
> >
> > [ 54.172683][ T1] UBSAN: shift-out-of-bounds in ./include/linux/hugetlb.h:555:34
> > [ 54.180411][ T1] shift exponent 4294967285 is too large for 64-bit type 'unsigned long'
> > [ 54.188885][ T1] CPU: 130 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc2-next-20200420 #1
> > [ 54.197284][ T1] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019
> > [ 54.207888][ T1] Call trace:
> > [ 54.211100][ T1] dump_backtrace+0x0/0x224
> > [ 54.215565][ T1] show_stack+0x20/0x2c
> > [ 54.219651][ T1] dump_stack+0xfc/0x184
> > [ 54.223829][ T1] __ubsan_handle_shift_out_of_bounds+0x304/0x344
> > [ 54.230204][ T1] hugetlb_add_hstate+0x3ec/0x414
> > huge_page_size at include/linux/hugetlb.h:555
> > (inlined by) hugetlb_add_hstate at mm/hugetlb.c:3301
> > [ 54.235191][ T1] hugetlbpage_init+0x14/0x30
> > [ 54.239824][ T1] do_one_initcall+0x6c/0x144
> > [ 54.244446][ T1] do_initcall_level+0x158/0x1c4
> > [ 54.249336][ T1] do_initcalls+0x68/0xb0
> > [ 54.253597][ T1] do_basic_setup+0x28/0x30
> > [ 54.258049][ T1] kernel_init_freeable+0x19c/0x228
> > [ 54.263188][ T1] kernel_init+0x14/0x208
> > [ 54.267473][ T1] ret_from_fork+0x10/0x18
>
> While rearranging the code (patch 3 in series), I made the incorrect
> assumption that CONT_XXX_SIZE == (1UL << CONT_XXX_SHIFT). However,
> this is not the case. Does the following patch fix these issues?
>
> From b75cb4a0852e208bee8c4eb347dc076fcaa88859 Mon Sep 17 00:00:00 2001
> From: Mike Kravetz <[email protected]>
> Date: Mon, 20 Apr 2020 10:41:18 -0700
> Subject: [PATCH] arm64/hugetlb: fix hugetlb initialization
>
> When calling hugetlb_add_hstate() to initialize a new hugetlb size,
> be sure to use correct huge pages size order.
>
> Signed-off-by: Mike Kravetz <[email protected]>
> ---
> arch/arm64/mm/hugetlbpage.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 9ca840527296..a02411a1f19a 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -453,11 +453,11 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
> static int __init hugetlbpage_init(void)
> {
> #ifdef CONFIG_ARM64_4K_PAGES
> - hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
> + hugetlb_add_hstate(ilog2(PUD_SIZE) - PAGE_SHIFT);
> #endif
> - hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
> - hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
> - hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
> + hugetlb_add_hstate(ilog2(CONT_PMD_SIZE) - PAGE_SHIFT);
> + hugetlb_add_hstate(ilog2(PMD_SIZE) - PAGE_SHIFT);
> + hugetlb_add_hstate(ilog2(CONT_PTE_SIZE) - PAGE_SHIFT);
Might be clearer to leave the non CONT_* definitions alone and instead
convert the CONT versions along the lines of:
hugetlb_add_hstate(CONT_PMD_SHIFT + PMD_SHIFT - PAGE_SHIFT);
(untested, but I think that's right...)
But that doesn't matter until Qian has confirmed that your patch fixes the
issue.
Will
On Mon, 20 Apr 2020 at 20:23, Mike Kravetz <[email protected]> wrote:
>
> On 4/20/20 8:34 AM, Qian Cai wrote:
> >
> >
> >> On Apr 17, 2020, at 2:50 PM, Mike Kravetz <[email protected]> wrote:
> >>
> >> Longpeng(Mike) reported a weird message from hugetlb command line processing
> >> and proposed a solution [1]. While the proposed patch does address the
> >> specific issue, there are other related issues in command line processing.
> >> As hugetlbfs evolved, updates to command line processing have been made to
> >> meet immediate needs and not necessarily in a coordinated manner. The result
> >> is that some processing is done in arch specific code, some is done in arch
> >> independent code and coordination is problematic. Semantics can vary between
> >> architectures.
> >>
> >> The patch series does the following:
> >> - Define arch specific arch_hugetlb_valid_size routine used to validate
> >> passed huge page sizes.
> >> - Move hugepagesz= command line parsing out of arch specific code and into
> >> an arch independent routine.
> >> - Clean up command line processing to follow desired semantics and
> >> document those semantics.
> >>
> >> [1] https://lore.kernel.org/linux-mm/[email protected]
> >>
> >> Mike Kravetz (4):
> >> hugetlbfs: add arch_hugetlb_valid_size
> >> hugetlbfs: move hugepagesz= parsing to arch independent code
> >> hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
> >> hugetlbfs: clean up command line processing
> >
> > Reverted this series fixed many undefined behaviors on arm64 with the config,
> >
> > https://raw.githubusercontent.com/cailca/linux-mm/master/arm64.config
> >
> > [ 54.172683][ T1] UBSAN: shift-out-of-bounds in ./include/linux/hugetlb.h:555:34
> > [ 54.180411][ T1] shift exponent 4294967285 is too large for 64-bit type 'unsigned long'
> > [ 54.188885][ T1] CPU: 130 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc2-next-20200420 #1
> > [ 54.197284][ T1] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.11 06/18/2019
> > [ 54.207888][ T1] Call trace:
> > [ 54.211100][ T1] dump_backtrace+0x0/0x224
> > [ 54.215565][ T1] show_stack+0x20/0x2c
> > [ 54.219651][ T1] dump_stack+0xfc/0x184
> > [ 54.223829][ T1] __ubsan_handle_shift_out_of_bounds+0x304/0x344
> > [ 54.230204][ T1] hugetlb_add_hstate+0x3ec/0x414
> > huge_page_size at include/linux/hugetlb.h:555
> > (inlined by) hugetlb_add_hstate at mm/hugetlb.c:3301
> > [ 54.235191][ T1] hugetlbpage_init+0x14/0x30
> > [ 54.239824][ T1] do_one_initcall+0x6c/0x144
> > [ 54.244446][ T1] do_initcall_level+0x158/0x1c4
> > [ 54.249336][ T1] do_initcalls+0x68/0xb0
> > [ 54.253597][ T1] do_basic_setup+0x28/0x30
> > [ 54.258049][ T1] kernel_init_freeable+0x19c/0x228
> > [ 54.263188][ T1] kernel_init+0x14/0x208
> > [ 54.267473][ T1] ret_from_fork+0x10/0x18
>
> While rearranging the code (patch 3 in series), I made the incorrect
> assumption that CONT_XXX_SIZE == (1UL << CONT_XXX_SHIFT). However,
> this is not the case. Does the following patch fix these issues?
>
> From b75cb4a0852e208bee8c4eb347dc076fcaa88859 Mon Sep 17 00:00:00 2001
> From: Mike Kravetz <[email protected]>
> Date: Mon, 20 Apr 2020 10:41:18 -0700
> Subject: [PATCH] arm64/hugetlb: fix hugetlb initialization
>
> When calling hugetlb_add_hstate() to initialize a new hugetlb size,
> be sure to use correct huge pages size order.
>
> Signed-off-by: Mike Kravetz <[email protected]>
> ---
> arch/arm64/mm/hugetlbpage.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 9ca840527296..a02411a1f19a 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -453,11 +453,11 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
> static int __init hugetlbpage_init(void)
> {
> #ifdef CONFIG_ARM64_4K_PAGES
> - hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
> + hugetlb_add_hstate(ilog2(PUD_SIZE) - PAGE_SHIFT);
> #endif
> - hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
> - hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
> - hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
> + hugetlb_add_hstate(ilog2(CONT_PMD_SIZE) - PAGE_SHIFT);
> + hugetlb_add_hstate(ilog2(PMD_SIZE) - PAGE_SHIFT);
> + hugetlb_add_hstate(ilog2(CONT_PTE_SIZE) - PAGE_SHIFT);
>
> return 0;
> }
I build this for an arm64 kernel and ran it in qemu and it worked.
Cheers,
Anders
On 4/20/20 1:29 PM, Anders Roxell wrote:
> On Mon, 20 Apr 2020 at 20:23, Mike Kravetz <[email protected]> wrote:
>> On 4/20/20 8:34 AM, Qian Cai wrote:
>>>
>>> Reverted this series fixed many undefined behaviors on arm64 with the config,
>> While rearranging the code (patch 3 in series), I made the incorrect
>> assumption that CONT_XXX_SIZE == (1UL << CONT_XXX_SHIFT). However,
>> this is not the case. Does the following patch fix these issues?
>>
>> From b75cb4a0852e208bee8c4eb347dc076fcaa88859 Mon Sep 17 00:00:00 2001
>> From: Mike Kravetz <[email protected]>
>> Date: Mon, 20 Apr 2020 10:41:18 -0700
>> Subject: [PATCH] arm64/hugetlb: fix hugetlb initialization
>>
>> When calling hugetlb_add_hstate() to initialize a new hugetlb size,
>> be sure to use correct huge pages size order.
>>
>> Signed-off-by: Mike Kravetz <[email protected]>
>> ---
>> arch/arm64/mm/hugetlbpage.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
>> index 9ca840527296..a02411a1f19a 100644
>> --- a/arch/arm64/mm/hugetlbpage.c
>> +++ b/arch/arm64/mm/hugetlbpage.c
>> @@ -453,11 +453,11 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
>> static int __init hugetlbpage_init(void)
>> {
>> #ifdef CONFIG_ARM64_4K_PAGES
>> - hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
>> + hugetlb_add_hstate(ilog2(PUD_SIZE) - PAGE_SHIFT);
>> #endif
>> - hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
>> - hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
>> - hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
>> + hugetlb_add_hstate(ilog2(CONT_PMD_SIZE) - PAGE_SHIFT);
>> + hugetlb_add_hstate(ilog2(PMD_SIZE) - PAGE_SHIFT);
>> + hugetlb_add_hstate(ilog2(CONT_PTE_SIZE) - PAGE_SHIFT);
>>
>> return 0;
>> }
>
> I build this for an arm64 kernel and ran it in qemu and it worked.
Thanks for testing Anders!
Will, here is an updated version of the patch based on your suggestion.
I added the () for emphasis but that may just be noise for some. Also,
the naming differences and values for CONT_PTE may make some people
look twice. Not sure if being consistent here helps?
I have only built this. No testing.
From daf833ab6b806ecc0816d84d45dcbacc052a7eec Mon Sep 17 00:00:00 2001
From: Mike Kravetz <[email protected]>
Date: Mon, 20 Apr 2020 13:56:15 -0700
Subject: [PATCH] arm64/hugetlb: fix hugetlb initialization
When calling hugetlb_add_hstate() to initialize a new hugetlb size,
be sure to use correct huge pages size order.
Signed-off-by: Mike Kravetz <[email protected]>
---
arch/arm64/mm/hugetlbpage.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 9ca840527296..bed6dc7c4276 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -455,9 +455,9 @@ static int __init hugetlbpage_init(void)
#ifdef CONFIG_ARM64_4K_PAGES
hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
#endif
- hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
+ hugetlb_add_hstate((CONT_PMD_SHIFT + PMD_SHIFT) - PAGE_SHIFT);
hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
- hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
+ hugetlb_add_hstate((CONT_PTE_SHIFT + PAGE_SHIFT) - PAGE_SHIFT);
return 0;
}
--
2.25.2
On Mon, 20 Apr 2020 at 23:43, Mike Kravetz <[email protected]> wrote:
>
> On 4/20/20 1:29 PM, Anders Roxell wrote:
> > On Mon, 20 Apr 2020 at 20:23, Mike Kravetz <[email protected]> wrote:
> >> On 4/20/20 8:34 AM, Qian Cai wrote:
> >>>
> >>> Reverted this series fixed many undefined behaviors on arm64 with the config,
> >> While rearranging the code (patch 3 in series), I made the incorrect
> >> assumption that CONT_XXX_SIZE == (1UL << CONT_XXX_SHIFT). However,
> >> this is not the case. Does the following patch fix these issues?
> >>
> >> From b75cb4a0852e208bee8c4eb347dc076fcaa88859 Mon Sep 17 00:00:00 2001
> >> From: Mike Kravetz <[email protected]>
> >> Date: Mon, 20 Apr 2020 10:41:18 -0700
> >> Subject: [PATCH] arm64/hugetlb: fix hugetlb initialization
> >>
> >> When calling hugetlb_add_hstate() to initialize a new hugetlb size,
> >> be sure to use correct huge pages size order.
> >>
> >> Signed-off-by: Mike Kravetz <[email protected]>
> >> ---
> >> arch/arm64/mm/hugetlbpage.c | 8 ++++----
> >> 1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> >> index 9ca840527296..a02411a1f19a 100644
> >> --- a/arch/arm64/mm/hugetlbpage.c
> >> +++ b/arch/arm64/mm/hugetlbpage.c
> >> @@ -453,11 +453,11 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
> >> static int __init hugetlbpage_init(void)
> >> {
> >> #ifdef CONFIG_ARM64_4K_PAGES
> >> - hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
> >> + hugetlb_add_hstate(ilog2(PUD_SIZE) - PAGE_SHIFT);
> >> #endif
> >> - hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
> >> - hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
> >> - hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
> >> + hugetlb_add_hstate(ilog2(CONT_PMD_SIZE) - PAGE_SHIFT);
> >> + hugetlb_add_hstate(ilog2(PMD_SIZE) - PAGE_SHIFT);
> >> + hugetlb_add_hstate(ilog2(CONT_PTE_SIZE) - PAGE_SHIFT);
> >>
> >> return 0;
> >> }
> >
> > I build this for an arm64 kernel and ran it in qemu and it worked.
>
> Thanks for testing Anders!
>
> Will, here is an updated version of the patch based on your suggestion.
> I added the () for emphasis but that may just be noise for some. Also,
> the naming differences and values for CONT_PTE may make some people
> look twice. Not sure if being consistent here helps?
>
> I have only built this. No testing.
>
> From daf833ab6b806ecc0816d84d45dcbacc052a7eec Mon Sep 17 00:00:00 2001
> From: Mike Kravetz <[email protected]>
> Date: Mon, 20 Apr 2020 13:56:15 -0700
> Subject: [PATCH] arm64/hugetlb: fix hugetlb initialization
>
> When calling hugetlb_add_hstate() to initialize a new hugetlb size,
> be sure to use correct huge pages size order.
>
> Signed-off-by: Mike Kravetz <[email protected]>
Tested-by: Anders Roxell <[email protected]>
I tested this patch on qemu-aarch64.
Cheers,
Anders
> ---
> arch/arm64/mm/hugetlbpage.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 9ca840527296..bed6dc7c4276 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -455,9 +455,9 @@ static int __init hugetlbpage_init(void)
> #ifdef CONFIG_ARM64_4K_PAGES
> hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
> #endif
> - hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
> + hugetlb_add_hstate((CONT_PMD_SHIFT + PMD_SHIFT) - PAGE_SHIFT);
> hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
> - hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
> + hugetlb_add_hstate((CONT_PTE_SHIFT + PAGE_SHIFT) - PAGE_SHIFT);
>
> return 0;
> }
> --
> 2.25.2
>
On Mon, Apr 20, 2020 at 02:40:05PM -0700, Mike Kravetz wrote:
> On 4/20/20 1:29 PM, Anders Roxell wrote:
> > On Mon, 20 Apr 2020 at 20:23, Mike Kravetz <[email protected]> wrote:
> >> On 4/20/20 8:34 AM, Qian Cai wrote:
> >>>
> >>> Reverted this series fixed many undefined behaviors on arm64 with the config,
> >> While rearranging the code (patch 3 in series), I made the incorrect
> >> assumption that CONT_XXX_SIZE == (1UL << CONT_XXX_SHIFT). However,
> >> this is not the case. Does the following patch fix these issues?
> >>
> >> From b75cb4a0852e208bee8c4eb347dc076fcaa88859 Mon Sep 17 00:00:00 2001
> >> From: Mike Kravetz <[email protected]>
> >> Date: Mon, 20 Apr 2020 10:41:18 -0700
> >> Subject: [PATCH] arm64/hugetlb: fix hugetlb initialization
> >>
> >> When calling hugetlb_add_hstate() to initialize a new hugetlb size,
> >> be sure to use correct huge pages size order.
> >>
> >> Signed-off-by: Mike Kravetz <[email protected]>
> >> ---
> >> arch/arm64/mm/hugetlbpage.c | 8 ++++----
> >> 1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> >> index 9ca840527296..a02411a1f19a 100644
> >> --- a/arch/arm64/mm/hugetlbpage.c
> >> +++ b/arch/arm64/mm/hugetlbpage.c
> >> @@ -453,11 +453,11 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
> >> static int __init hugetlbpage_init(void)
> >> {
> >> #ifdef CONFIG_ARM64_4K_PAGES
> >> - hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
> >> + hugetlb_add_hstate(ilog2(PUD_SIZE) - PAGE_SHIFT);
> >> #endif
> >> - hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
> >> - hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
> >> - hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
> >> + hugetlb_add_hstate(ilog2(CONT_PMD_SIZE) - PAGE_SHIFT);
> >> + hugetlb_add_hstate(ilog2(PMD_SIZE) - PAGE_SHIFT);
> >> + hugetlb_add_hstate(ilog2(CONT_PTE_SIZE) - PAGE_SHIFT);
> >>
> >> return 0;
> >> }
> >
> > I build this for an arm64 kernel and ran it in qemu and it worked.
>
> Thanks for testing Anders!
>
> Will, here is an updated version of the patch based on your suggestion.
> I added the () for emphasis but that may just be noise for some. Also,
> the naming differences and values for CONT_PTE may make some people
> look twice. Not sure if being consistent here helps?
Cheers, thanks for this. I think being consistent is worthwhile, as it's
the definitions themselves that are weird and we can conceivably clean
that up as a separate patch.
So,
Acked-by: Will Deacon <[email protected]>
Looks like Andrew already picked it up (thanks!)
Thanks,
Will
On Fri, 17 Apr 2020 11:50:45 -0700
Mike Kravetz <[email protected]> wrote:
> v3 -
> Used weak attribute method of defining arch_hugetlb_valid_size.
> This eliminates changes to arch specific hugetlb.h files (Peter)
> Updated documentation (Peter, Randy)
> Fixed handling of implicitly specified gigantic page preallocation
> in existing code and removed documentation of such. There is now
> no difference between handling of gigantic and non-gigantic pages.
> (Peter, Nitesh).
> This requires the most review as there is a small change to
> undocumented behavior. See patch 4 commit message for details.
> Added Acks and Reviews (Mina, Peter)
>
> v2 -
> Fix build errors with patch 1 (Will)
> Change arch_hugetlb_valid_size arg to unsigned long and remove
> irrelevant 'extern' keyword (Christophe)
> Documentation and other misc changes (Randy, Christophe, Mina)
> Do not process command line options if !hugepages_supported()
> (Dave, but it sounds like we may want to additional changes to
> hugepages_supported() for x86? If that is needed I would prefer
> a separate patch.)
>
> Longpeng(Mike) reported a weird message from hugetlb command line processing
> and proposed a solution [1]. While the proposed patch does address the
> specific issue, there are other related issues in command line processing.
> As hugetlbfs evolved, updates to command line processing have been made to
> meet immediate needs and not necessarily in a coordinated manner. The result
> is that some processing is done in arch specific code, some is done in arch
> independent code and coordination is problematic. Semantics can vary between
> architectures.
>
> The patch series does the following:
> - Define arch specific arch_hugetlb_valid_size routine used to validate
> passed huge page sizes.
> - Move hugepagesz= command line parsing out of arch specific code and into
> an arch independent routine.
> - Clean up command line processing to follow desired semantics and
> document those semantics.
>
> [1] https://lore.kernel.org/linux-mm/[email protected]
>
> Mike Kravetz (4):
> hugetlbfs: add arch_hugetlb_valid_size
> hugetlbfs: move hugepagesz= parsing to arch independent code
> hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
> hugetlbfs: clean up command line processing
>
> .../admin-guide/kernel-parameters.txt | 40 ++--
> Documentation/admin-guide/mm/hugetlbpage.rst | 35 ++++
> arch/arm64/mm/hugetlbpage.c | 30 +--
> arch/powerpc/mm/hugetlbpage.c | 30 +--
> arch/riscv/mm/hugetlbpage.c | 24 +--
> arch/s390/mm/hugetlbpage.c | 24 +--
> arch/sparc/mm/init_64.c | 43 +---
> arch/x86/mm/hugetlbpage.c | 23 +--
> include/linux/hugetlb.h | 2 +-
> mm/hugetlb.c | 190 +++++++++++++++---
> 10 files changed, 271 insertions(+), 170 deletions(-)
>
Looks good and works fine for s390, thanks for cleaning up!
Acked-by: Gerald Schaefer <[email protected]> # s390