2024-02-26 19:12:04

by Edgecombe, Rick P

Subject: [PATCH v2 0/9] Cover a guard gap corner case

Hi,

For v2, the notable change is a bug fix to not clobber the MMF_TOPDOWN
flag during fork. In the RFC this resulted in fork() children that didn't
exec getting the bottomup mapping behavior, which included the stress-ng
bigheap test. It turns out much of the 4% improvement seen was due to the
bottomup mapping direction. With the fix, the performance benefit was a
less surprising ~1%. Other than that, Kirill's style feedback was
addressed. There were no other functional changes (besides the bug fix).

Link to v1:
https://lore.kernel.org/lkml/[email protected]/

=======

In working on x86’s shadow stack feature, I came across some limitations
around the kernel’s handling of guard gaps. AFAICT these limitations are
not too important for the traditional stack usage of guard gaps, but have
bigger impact on shadow stack’s usage. And now in addition to x86, we have
two other architectures implementing shadow stack like features that plan
to use guard gaps. I wanted to see about addressing them, but I have not
worked on mmap() placement related code before, so would greatly
appreciate if people could take a look and point me in the right
direction.

The nature of the limitations of concern is as follows. In order to ensure
guard gaps between mappings, mmap() would need to consider two things:
1. That the new mapping isn't placed in any existing mapping's guard
gap.
2. That the new mapping isn't placed such that any existing mapping
ends up in *its* guard gap.
Currently mmap() never considers (2), and (1) is not considered in some
situations. A sketch of the two conditions follows.
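
To make the two conditions concrete, consider this hypothetical layout
(for illustration only, not from a real trace):

   | VMA A |  1 page hole  | VMA B |

With only (1) enforced, a one page shadow stack, which needs a guard page
below its start, can legally be placed in the hole: it avoids A's and B's
guard gaps, but nothing checks that A stays out of the *new* mapping's
gap, so the shadow stack ends up with no gap to A.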

When not passing an address hint, or passing one without
MAP_FIXED_NOREPLACE, (1) is enforced. With MAP_FIXED_NOREPLACE, (1) is not
enforced. With MAP_FIXED, (1) is not considered, but this seems to be
expected since MAP_FIXED can already clobber existing mappings. For
MAP_FIXED_NOREPLACE I would have guessed it should respect the guard gaps
of existing mappings, but the existing semantics are probably a little
ambiguous. These cases are sketched below.
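
A minimal userspace sketch of these cases (my reading of the current
behavior; the hint value and helper name are hypothetical):

#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Sketch only: how each mmap() mode relates to condition (1). */
static void hint_modes(void *hint, size_t len)
{
	int prot = PROT_READ | PROT_WRITE;

	/* No hint (or a plain hint without MAP_FIXED*): the search
	 * enforces (1) and skips other mappings' guard gaps. */
	void *a = mmap(NULL, len, prot,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/* MAP_FIXED_NOREPLACE: refuses to overlap an existing mapping
	 * (EEXIST), but may still land inside an existing mapping's
	 * guard gap, so (1) is not enforced. */
	void *b = mmap(hint, len, prot,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
		       -1, 0);

	/* MAP_FIXED: placed exactly at hint, possibly clobbering
	 * existing mappings, so (1) is not considered at all. */
	void *c = mmap(hint, len, prot,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

	(void)a; (void)b; (void)c;
}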

In this series I just tried to add enforcement of (2) for the normal (no
address hint) case, and only for the newer shadow stack memory (not
stacks). The reason is that in the no-address-hint situation, landing
next to a guard gap could come up naturally, and so be more influenceable
by attackers, such that two shadow stacks could end up adjacent without a
guard gap. The address-hint scenarios, by contrast, would require more
control: being able to call mmap() with specific arguments. As for why
not just fix the other corner cases anyway, I thought it might have some
greater possibility of affecting existing apps.

Thanks,

Rick

Rick Edgecombe (9):
mm: Switch mm->get_unmapped_area() to a flag
mm: Introduce arch_get_unmapped_area_vmflags()
mm: Use get_unmapped_area_vmflags()
thp: Add thp_get_unmapped_area_vmflags()
mm: Initialize struct vm_unmapped_area_info
mm: Take placement mappings gap into account
x86/mm: Implement HAVE_ARCH_UNMAPPED_AREA_VMFLAGS
x86/mm: Care about shadow stack guard gap during placement
selftests/x86: Add placement guard gap test for shstk

arch/alpha/kernel/osf_sys.c | 2 +-
arch/arc/mm/mmap.c | 2 +-
arch/arm/mm/mmap.c | 4 +-
arch/csky/abiv1/mmap.c | 2 +-
arch/loongarch/mm/mmap.c | 2 +-
arch/mips/mm/mmap.c | 2 +-
arch/parisc/kernel/sys_parisc.c | 2 +-
arch/powerpc/mm/book3s64/slice.c | 4 +-
arch/s390/mm/hugetlbpage.c | 6 +-
arch/s390/mm/mmap.c | 8 +-
arch/sh/mm/mmap.c | 4 +-
arch/sparc/kernel/sys_sparc_32.c | 2 +-
arch/sparc/kernel/sys_sparc_64.c | 19 ++--
arch/sparc/mm/hugetlbpage.c | 6 +-
arch/x86/include/asm/pgtable_64.h | 1 +
arch/x86/kernel/cpu/sgx/driver.c | 2 +-
arch/x86/kernel/sys_x86_64.c | 39 +++++--
arch/x86/mm/hugetlbpage.c | 6 +-
arch/x86/mm/mmap.c | 4 +-
drivers/char/mem.c | 2 +-
drivers/dax/device.c | 6 +-
fs/hugetlbfs/inode.c | 6 +-
fs/proc/inode.c | 15 +--
fs/ramfs/file-mmu.c | 2 +-
include/linux/huge_mm.h | 11 ++
include/linux/mm.h | 12 ++-
include/linux/mm_types.h | 6 +-
include/linux/sched/coredump.h | 5 +-
include/linux/sched/mm.h | 22 ++++
io_uring/io_uring.c | 2 +-
mm/debug.c | 6 --
mm/huge_memory.c | 23 ++--
mm/mmap.c | 101 +++++++++++++-----
mm/shmem.c | 11 +-
mm/util.c | 6 +-
.../testing/selftests/x86/test_shadow_stack.c | 67 +++++++++++-
36 files changed, 298 insertions(+), 122 deletions(-)

--
2.34.1



2024-02-26 19:12:25

by Edgecombe, Rick P

Subject: [PATCH v2 2/9] mm: Introduce arch_get_unmapped_area_vmflags()

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
1. That the new mapping isn't placed in any existing mapping's guard
gaps.
2. That the new mapping isn't placed such that any existing mapping
ends up in *its* guard gaps.

The long-standing behavior of mmap() is to ensure 1, but not take any care
around 2. So for example, if there is a PAGE_SIZE free area, and a mapping
of PAGE_SIZE with a type that has a guard gap is being placed, mmap() may
place the shadow stack in the PAGE_SIZE free area. Then the mapping that
is supposed to have a guard gap will not have a gap to the adjacent VMA.

In order to take the start gap into account, the maple tree search needs
to know the size of start gap the new mapping will need. The call chain
from do_mmap() to the actual maple tree search looks like this:

do_mmap(size, vm_flags, map_flags, ..)
	mm/mmap.c:get_unmapped_area(size, map_flags, ...)
		arch_get_unmapped_area(size, map_flags, ...)
			vm_unmapped_area(struct vm_unmapped_area_info)

One option would be to add another MAP_ flag to mean a one page start gap
(as is needed for shadow stack), but this consumes a flag unnecessarily.
Another option could be to simply increase the size passed in do_mmap()
by the start gap size, and adjust after the fact, but this will interfere
with the alignment requirements passed in struct vm_unmapped_area_info,
which are unknown to mmap.c. Instead, introduce variants of
arch_get_unmapped_area/_topdown() that take vm_flags. In future changes,
these variants can be used in mmap.c:get_unmapped_area() to allow the
vm_flags to be passed through to vm_unmapped_area(), while preserving the
normal arch_get_unmapped_area/_topdown() for the existing callers.
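
Roughly, after the full series the chain can carry the vm_flags through
like this (sketch; names taken from later patches in this series):

do_mmap(size, vm_flags, map_flags, ..)
	mm/mmap.c:__get_unmapped_area(size, map_flags, ..., vm_flags)
		arch_get_unmapped_area_vmflags(size, map_flags, ..., vm_flags)
			vm_unmapped_area(struct vm_unmapped_area_info)
				/* info.start_gap derived from vm_flags */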

Signed-off-by: Rick Edgecombe <[email protected]>
---
include/linux/sched/mm.h | 17 +++++++++++++++++
mm/mmap.c | 28 ++++++++++++++++++++++++++++
2 files changed, 45 insertions(+)

diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index cde946e926d8..7b44441865c5 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -191,6 +191,23 @@ unsigned long mm_get_unmapped_area(struct mm_struct *mm, struct file *filp,
unsigned long addr, unsigned long len,
unsigned long pgoff, unsigned long flags);

+extern unsigned long
+arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff,
+ unsigned long flags, vm_flags_t vm_flags);
+extern unsigned long
+arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff,
+ unsigned long flags, vm_flags_t);
+
+unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm,
+ struct file *filp,
+ unsigned long addr,
+ unsigned long len,
+ unsigned long pgoff,
+ unsigned long flags,
+ vm_flags_t vm_flags);
+
unsigned long
generic_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff,
diff --git a/mm/mmap.c b/mm/mmap.c
index b61bc515c729..2021bc040e81 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1802,6 +1802,34 @@ arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
}
#endif

+#ifndef HAVE_ARCH_UNMAPPED_AREA_VMFLAGS
+unsigned long
+arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
+{
+ return arch_get_unmapped_area(filp, addr, len, pgoff, flags);
+}
+
+unsigned long
+arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff,
+ unsigned long flags, vm_flags_t vm_flags)
+{
+ return arch_get_unmapped_area_topdown(filp, addr, len, pgoff, flags);
+}
+#endif
+
+unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, struct file *filp,
+ unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags,
+ vm_flags_t vm_flags)
+{
+ if (test_bit(MMF_TOPDOWN, &mm->flags))
+ return arch_get_unmapped_area_topdown_vmflags(filp, addr, len, pgoff,
+ flags, vm_flags);
+ return arch_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, vm_flags);
+}
+
unsigned long
get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
unsigned long pgoff, unsigned long flags)
--
2.34.1


2024-02-26 19:12:34

by Edgecombe, Rick P

Subject: [PATCH v2 1/9] mm: Switch mm->get_unmapped_area() to a flag

The mm_struct contains a function pointer *get_unmapped_area(), which
is set to either arch_get_unmapped_area() or
arch_get_unmapped_area_topdown() during the initialization of the mm.

Since the function pointer only ever points to two functions that are
named the same across all architectures, a function pointer is not really
required. In addition, future changes will want to add versions of the
functions that take additional arguments. So to save a pointer's worth of
bytes in mm_struct, and to prevent adding additional function pointers to
mm_struct in future changes, remove it and keep the information about
which get_unmapped_area() to use in a flag.

Add the new flag to MMF_INIT_MASK so it doesn't get clobbered on fork by
mmf_init_flags(). Most MM flags get clobbered on fork. In the pre-existing
behavior mm->get_unmapped_area() would get copied to the new mm in
dup_mm(), so not clobbering the flag preserves the existing behavior
around inheriting the topdown-ness.

Introduce a helper, mm_get_unmapped_area(), to easily convert code that
refers to the old function pointer to instead select and call either
arch_get_unmapped_area() or arch_get_unmapped_area_topdown() based on the
flag. Then drop the mm->get_unmapped_area() function pointer. Leave the
get_unmapped_area() pointer in struct file_operations alone. The main
purpose of this change is to reorganize in preparation for future changes,
but it also converts the calls of mm->get_unmapped_area() from indirect
branches into direct ones.

The stress-ng bigheap benchmark calls realloc a lot, which calls through
get_unmapped_area() in the kernel. On x86, the change yielded a ~1%
improvement there on a retpoline config.

In testing a few x86 configs, removing the pointer unfortunately didn't
result in any actual size reductions in the compiled layout of mm_struct.
But depending on compiler or arch alignment requirements, the change could
shrink the size of mm_struct.

Signed-off-by: Rick Edgecombe <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Acked-by: Liam R. Howlett <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
---
v2:
- Fix comment on MMF_TOPDOWN (Kirill, rppt)
- Move MMF_TOPDOWN to actually unused bit
- Add MMF_TOPDOWN to MMF_INIT_MASK so it doesn't get clobbered on fork,
and result in the children using the search up path.
- New lower performance results after above bug fix
- Add Reviews and Acks
---
arch/s390/mm/hugetlbpage.c | 2 +-
arch/s390/mm/mmap.c | 4 ++--
arch/sparc/kernel/sys_sparc_64.c | 15 ++++++---------
arch/sparc/mm/hugetlbpage.c | 2 +-
arch/x86/kernel/cpu/sgx/driver.c | 2 +-
arch/x86/mm/hugetlbpage.c | 2 +-
arch/x86/mm/mmap.c | 4 ++--
drivers/char/mem.c | 2 +-
drivers/dax/device.c | 6 +++---
fs/hugetlbfs/inode.c | 2 +-
fs/proc/inode.c | 15 ++++++++-------
fs/ramfs/file-mmu.c | 2 +-
include/linux/mm_types.h | 6 +-----
include/linux/sched/coredump.h | 5 ++++-
include/linux/sched/mm.h | 5 +++++
io_uring/io_uring.c | 2 +-
mm/debug.c | 6 ------
mm/huge_memory.c | 6 +++---
mm/mmap.c | 21 ++++++++++++++++++---
mm/shmem.c | 11 +++++------
mm/util.c | 6 +++---
21 files changed, 68 insertions(+), 58 deletions(-)

diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index 297a6d897d5a..c2d2850ec8d5 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -328,7 +328,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
goto check_asce_limit;
}

- if (mm->get_unmapped_area == arch_get_unmapped_area)
+ if (!test_bit(MMF_TOPDOWN, &mm->flags))
addr = hugetlb_get_unmapped_area_bottomup(file, addr, len,
pgoff, flags);
else
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index fc9a7dc26c5e..cd52d72b59cf 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -182,10 +182,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
*/
if (mmap_is_legacy(rlim_stack)) {
mm->mmap_base = mmap_base_legacy(random_factor);
- mm->get_unmapped_area = arch_get_unmapped_area;
+ clear_bit(MMF_TOPDOWN, &mm->flags);
} else {
mm->mmap_base = mmap_base(random_factor, rlim_stack);
- mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+ set_bit(MMF_TOPDOWN, &mm->flags);
}
}

diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 1e9a9e016237..1dbf7211666e 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -218,14 +218,10 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, unsigned long len, unsigned long pgoff, unsigned long flags)
{
unsigned long align_goal, addr = -ENOMEM;
- unsigned long (*get_area)(struct file *, unsigned long,
- unsigned long, unsigned long, unsigned long);
-
- get_area = current->mm->get_unmapped_area;

if (flags & MAP_FIXED) {
/* Ok, don't mess with it. */
- return get_area(NULL, orig_addr, len, pgoff, flags);
+ return mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags);
}
flags &= ~MAP_SHARED;

@@ -238,7 +234,8 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
align_goal = (64UL * 1024);

do {
- addr = get_area(NULL, orig_addr, len + (align_goal - PAGE_SIZE), pgoff, flags);
+ addr = mm_get_unmapped_area(current->mm, NULL, orig_addr,
+ len + (align_goal - PAGE_SIZE), pgoff, flags);
if (!(addr & ~PAGE_MASK)) {
addr = (addr + (align_goal - 1UL)) & ~(align_goal - 1UL);
break;
@@ -256,7 +253,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
* be obtained.
*/
if (addr & ~PAGE_MASK)
- addr = get_area(NULL, orig_addr, len, pgoff, flags);
+ addr = mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags);

return addr;
}
@@ -292,7 +289,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
gap == RLIM_INFINITY ||
sysctl_legacy_va_layout) {
mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
- mm->get_unmapped_area = arch_get_unmapped_area;
+ clear_bit(MMF_TOPDOWN, &mm->flags);
} else {
/* We know it's 32-bit */
unsigned long task_size = STACK_TOP32;
@@ -303,7 +300,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
gap = (task_size / 6 * 5);

mm->mmap_base = PAGE_ALIGN(task_size - gap - random_factor);
- mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+ set_bit(MMF_TOPDOWN, &mm->flags);
}
}

diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index b432500c13a5..38a1bef47efb 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -123,7 +123,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
(!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
- if (mm->get_unmapped_area == arch_get_unmapped_area)
+ if (!test_bit(MMF_TOPDOWN, &mm->flags))
return hugetlb_get_unmapped_area_bottomup(file, addr, len,
pgoff, flags);
else
diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
index 262f5fb18d74..22b65a5f5ec6 100644
--- a/arch/x86/kernel/cpu/sgx/driver.c
+++ b/arch/x86/kernel/cpu/sgx/driver.c
@@ -113,7 +113,7 @@ static unsigned long sgx_get_unmapped_area(struct file *file,
if (flags & MAP_FIXED)
return addr;

- return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+ return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
}

#ifdef CONFIG_COMPAT
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 5804bbae4f01..6d77c0039617 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -141,7 +141,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
}

get_unmapped_area:
- if (mm->get_unmapped_area == arch_get_unmapped_area)
+ if (!test_bit(MMF_TOPDOWN, &mm->flags))
return hugetlb_get_unmapped_area_bottomup(file, addr, len,
pgoff, flags);
else
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index c90c20904a60..a2cabb1c81e1 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -129,9 +129,9 @@ static void arch_pick_mmap_base(unsigned long *base, unsigned long *legacy_base,
void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
{
if (mmap_is_legacy())
- mm->get_unmapped_area = arch_get_unmapped_area;
+ clear_bit(MMF_TOPDOWN, &mm->flags);
else
- mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+ set_bit(MMF_TOPDOWN, &mm->flags);

arch_pick_mmap_base(&mm->mmap_base, &mm->mmap_legacy_base,
arch_rnd(mmap64_rnd_bits), task_size_64bit(0),
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 3c6670cf905f..9b80e622ae80 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -544,7 +544,7 @@ static unsigned long get_unmapped_area_zero(struct file *file,
}

/* Otherwise flags & MAP_PRIVATE: with no shmem object beneath it */
- return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+ return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
#else
return -ENOSYS;
#endif
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 93ebedc5ec8c..47c126d37b59 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -329,14 +329,14 @@ static unsigned long dax_get_unmapped_area(struct file *filp,
if ((off + len_align) < off)
goto out;

- addr_align = current->mm->get_unmapped_area(filp, addr, len_align,
- pgoff, flags);
+ addr_align = mm_get_unmapped_area(current->mm, filp, addr, len_align,
+ pgoff, flags);
if (!IS_ERR_VALUE(addr_align)) {
addr_align += (off - addr_align) & (align - 1);
return addr_align;
}
out:
- return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+ return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
}

static const struct address_space_operations dev_dax_aops = {
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index f757d4f7ad98..a63d2eee086f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -242,7 +242,7 @@ generic_hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
* If architectures have special needs, they should define their own
* version of hugetlb_get_unmapped_area.
*/
- if (mm->get_unmapped_area == arch_get_unmapped_area_topdown)
+ if (test_bit(MMF_TOPDOWN, &mm->flags))
return hugetlb_get_unmapped_area_topdown(file, addr, len,
pgoff, flags);
return hugetlb_get_unmapped_area_bottomup(file, addr, len,
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index b33e490e3fd9..6f4c2e21e68f 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -454,15 +454,16 @@ pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned lo
unsigned long len, unsigned long pgoff,
unsigned long flags)
{
- typeof_member(struct proc_ops, proc_get_unmapped_area) get_area;
-
- get_area = pde->proc_ops->proc_get_unmapped_area;
+ if (pde->proc_ops->proc_get_unmapped_area)
+ return pde->proc_ops->proc_get_unmapped_area(file, orig_addr,
+ len, pgoff,
+ flags);
#ifdef CONFIG_MMU
- if (!get_area)
- get_area = current->mm->get_unmapped_area;
+ else
+ return mm_get_unmapped_area(current->mm, file, orig_addr,
+ len, pgoff, flags);
#endif
- if (get_area)
- return get_area(file, orig_addr, len, pgoff, flags);
+
return orig_addr;
}

diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c
index c7a1aa3c882b..b45c7edc3225 100644
--- a/fs/ramfs/file-mmu.c
+++ b/fs/ramfs/file-mmu.c
@@ -35,7 +35,7 @@ static unsigned long ramfs_mmu_get_unmapped_area(struct file *file,
unsigned long addr, unsigned long len, unsigned long pgoff,
unsigned long flags)
{
- return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+ return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
}

const struct file_operations ramfs_file_operations = {
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 957ce38768b2..f01c01b4a4fc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -745,11 +745,7 @@ struct mm_struct {
} ____cacheline_aligned_in_smp;

struct maple_tree mm_mt;
-#ifdef CONFIG_MMU
- unsigned long (*get_unmapped_area) (struct file *filp,
- unsigned long addr, unsigned long len,
- unsigned long pgoff, unsigned long flags);
-#endif
+
unsigned long mmap_base; /* base of mmap area */
unsigned long mmap_legacy_base; /* base of mmap area in bottom-up allocations */
#ifdef CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES
diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index 02f5090ffea2..e62ff805cfc9 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -92,9 +92,12 @@ static inline int get_dumpable(struct mm_struct *mm)
#define MMF_VM_MERGE_ANY 30
#define MMF_VM_MERGE_ANY_MASK (1 << MMF_VM_MERGE_ANY)

+#define MMF_TOPDOWN 31 /* mm searches top down by default */
+#define MMF_TOPDOWN_MASK (1 << MMF_TOPDOWN)
+
#define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\
- MMF_VM_MERGE_ANY_MASK)
+ MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK)

static inline unsigned long mmf_init_flags(unsigned long flags)
{
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 9a19f1b42f64..cde946e926d8 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -8,6 +8,7 @@
#include <linux/mm_types.h>
#include <linux/gfp.h>
#include <linux/sync_core.h>
+#include <linux/sched/coredump.h>

/*
* Routines for handling mm_structs
@@ -186,6 +187,10 @@ arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff,
unsigned long flags);

+unsigned long mm_get_unmapped_area(struct mm_struct *mm, struct file *filp,
+ unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags);
+
unsigned long
generic_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff,
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9626a363f121..b3c4ec115989 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3569,7 +3569,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
#else
addr = 0UL;
#endif
- return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+ return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
}

#else /* !CONFIG_MMU */
diff --git a/mm/debug.c b/mm/debug.c
index ee533a5ceb79..32db5de8e1e7 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -162,9 +162,6 @@ EXPORT_SYMBOL(dump_vma);
void dump_mm(const struct mm_struct *mm)
{
pr_emerg("mm %px task_size %lu\n"
-#ifdef CONFIG_MMU
- "get_unmapped_area %px\n"
-#endif
"mmap_base %lu mmap_legacy_base %lu\n"
"pgd %px mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n"
"hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n"
@@ -190,9 +187,6 @@ void dump_mm(const struct mm_struct *mm)
"def_flags: %#lx(%pGv)\n",

mm, mm->task_size,
-#ifdef CONFIG_MMU
- mm->get_unmapped_area,
-#endif
mm->mmap_base, mm->mmap_legacy_base,
mm->pgd, atomic_read(&mm->mm_users),
atomic_read(&mm->mm_count),
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 86ee29b5c39c..e9ef43a719a5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -641,8 +641,8 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
if (len_pad < len || (off + len_pad) < off)
return 0;

- ret = current->mm->get_unmapped_area(filp, addr, len_pad,
- off >> PAGE_SHIFT, flags);
+ ret = mm_get_unmapped_area(current->mm, filp, addr, len_pad,
+ off >> PAGE_SHIFT, flags);

/*
* The failure might be due to length padding. The caller will retry
@@ -672,7 +672,7 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
if (ret)
return ret;

- return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+ return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
}
EXPORT_SYMBOL_GPL(thp_get_unmapped_area);

diff --git a/mm/mmap.c b/mm/mmap.c
index aa82eec17489..b61bc515c729 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1807,7 +1807,8 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
unsigned long pgoff, unsigned long flags)
{
unsigned long (*get_area)(struct file *, unsigned long,
- unsigned long, unsigned long, unsigned long);
+ unsigned long, unsigned long, unsigned long)
+ = NULL;

unsigned long error = arch_mmap_check(addr, len, flags);
if (error)
@@ -1817,7 +1818,6 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
if (len > TASK_SIZE)
return -ENOMEM;

- get_area = current->mm->get_unmapped_area;
if (file) {
if (file->f_op->get_unmapped_area)
get_area = file->f_op->get_unmapped_area;
@@ -1834,7 +1834,11 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
get_area = thp_get_unmapped_area;
}

- addr = get_area(file, addr, len, pgoff, flags);
+ if (get_area)
+ addr = get_area(file, addr, len, pgoff, flags);
+ else
+ addr = mm_get_unmapped_area(current->mm, file, addr, len,
+ pgoff, flags);
if (IS_ERR_VALUE(addr))
return addr;

@@ -1849,6 +1853,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,

EXPORT_SYMBOL(get_unmapped_area);

+unsigned long
+mm_get_unmapped_area(struct mm_struct *mm, struct file *file,
+ unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags)
+{
+ if (test_bit(MMF_TOPDOWN, &mm->flags))
+ return arch_get_unmapped_area_topdown(file, addr, len, pgoff, flags);
+ return arch_get_unmapped_area(file, addr, len, pgoff, flags);
+}
+EXPORT_SYMBOL(mm_get_unmapped_area);
+
/**
* find_vma_intersection() - Look up the first VMA which intersects the interval
* @mm: The process address space.
diff --git a/mm/shmem.c b/mm/shmem.c
index 0d1ce70bce38..a8cac92dff88 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2242,8 +2242,6 @@ unsigned long shmem_get_unmapped_area(struct file *file,
unsigned long uaddr, unsigned long len,
unsigned long pgoff, unsigned long flags)
{
- unsigned long (*get_area)(struct file *,
- unsigned long, unsigned long, unsigned long, unsigned long);
unsigned long addr;
unsigned long offset;
unsigned long inflated_len;
@@ -2253,8 +2251,8 @@ unsigned long shmem_get_unmapped_area(struct file *file,
if (len > TASK_SIZE)
return -ENOMEM;

- get_area = current->mm->get_unmapped_area;
- addr = get_area(file, uaddr, len, pgoff, flags);
+ addr = mm_get_unmapped_area(current->mm, file, uaddr, len, pgoff,
+ flags);

if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
return addr;
@@ -2311,7 +2309,8 @@ unsigned long shmem_get_unmapped_area(struct file *file,
if (inflated_len < len)
return addr;

- inflated_addr = get_area(NULL, uaddr, inflated_len, 0, flags);
+ inflated_addr = mm_get_unmapped_area(current->mm, NULL, uaddr,
+ inflated_len, 0, flags);
if (IS_ERR_VALUE(inflated_addr))
return addr;
if (inflated_addr & ~PAGE_MASK)
@@ -4757,7 +4756,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
unsigned long addr, unsigned long len,
unsigned long pgoff, unsigned long flags)
{
- return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+ return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
}
#endif

diff --git a/mm/util.c b/mm/util.c
index 744b4d7e3fae..10c836a75e66 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -452,17 +452,17 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)

if (mmap_is_legacy(rlim_stack)) {
mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
- mm->get_unmapped_area = arch_get_unmapped_area;
+ clear_bit(MMF_TOPDOWN, &mm->flags);
} else {
mm->mmap_base = mmap_base(random_factor, rlim_stack);
- mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+ set_bit(MMF_TOPDOWN, &mm->flags);
}
}
#elif defined(CONFIG_MMU) && !defined(HAVE_ARCH_PICK_MMAP_LAYOUT)
void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
{
mm->mmap_base = TASK_UNMAPPED_BASE;
- mm->get_unmapped_area = arch_get_unmapped_area;
+ clear_bit(MMF_TOPDOWN, &mm->flags);
}
#endif

--
2.34.1


2024-02-26 19:13:20

by Edgecombe, Rick P

Subject: [PATCH v2 6/9] mm: Take placement mappings gap into account

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
1. That the new mapping isn't placed in any existing mapping's guard
gaps.
2. That the new mapping isn't placed such that any existing mapping
ends up in *its* guard gaps.

The long-standing behavior of mmap() is to ensure 1, but not take any care
around 2. So for example, if there is a PAGE_SIZE free area, and a mapping
of PAGE_SIZE with a type that has a guard gap is being placed, mmap() may
place the shadow stack in the PAGE_SIZE free area. Then the mapping that
is supposed to have a guard gap will not have a gap to the adjacent VMA.

For MAP_GROWSDOWN/VM_GROWSDOWN and MAP_GROWSUP/VM_GROWSUP this has not
been a problem in practice because applications place these kinds of
mappings very early, when there are not many mappings to find a space
between. But shadow stacks may be placed throughout the lifetime of the
application.

Use the start_gap field to find a space that includes the guard gap for
the new mapping. Take care to not interfere with the alignment.
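
As a worked example of that interaction (made-up numbers, not from the
patch): suppose the search finds a free area at mas.index == 0x10000, and
the request has info->start_gap == 0x1000 (one page), info->align_mask ==
0xffff (64KB alignment) and info->align_offset == 0. Then:

	gap = mas.index + info->start_gap;
	/* gap == 0x11000 */
	gap += (info->align_offset - gap) & info->align_mask;
	/* gap == 0x11000 + 0xf000 == 0x20000 */

The mapping lands at the 64KB aligned address 0x20000, and its guard page
at 0x1f000 still falls inside the free area. Since the search length is
padded by align_mask + start_gap up front, the free area found is
guaranteed to be large enough for both adjustments.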

Signed-off-by: Rick Edgecombe <[email protected]>
---
v2:
- Remove VM_UNMAPPED_START_GAP_SET and have struct vm_unmapped_area_info
initialized with zeros (in another patch). (Kirill)
- Drop unrelated space change (Kirill)
- Add comment around interactions of alignment and start gap step
(Kirill)
---
include/linux/mm.h | 1 +
mm/mmap.c | 12 +++++++++---
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a091181ef65a..19fc1eb86b9a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3406,6 +3406,7 @@ struct vm_unmapped_area_info {
unsigned long high_limit;
unsigned long align_mask;
unsigned long align_offset;
+ unsigned long start_gap;
};

extern unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info);
diff --git a/mm/mmap.c b/mm/mmap.c
index 33af683a643f..3d7642eb84ea 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1574,7 +1574,7 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
MA_STATE(mas, &current->mm->mm_mt, 0, 0);

/* Adjust search length to account for worst case alignment overhead */
- length = info->length + info->align_mask;
+ length = info->length + info->align_mask + info->start_gap;
if (length < info->length)
return -ENOMEM;

@@ -1586,7 +1586,13 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
if (mas_empty_area(&mas, low_limit, high_limit - 1, length))
return -ENOMEM;

- gap = mas.index;
+ /*
+ * Adjust for the gap first so it doesn't interfere with the
+ * later alignment. The first step is the minimum needed to
+ * fulfill the start gap, the next step is the minimum to align
+ * that. It is the minimum needed to fulfill both.
+ */
+ gap = mas.index + info->start_gap;
gap += (info->align_offset - gap) & info->align_mask;
tmp = mas_next(&mas, ULONG_MAX);
if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */
@@ -1625,7 +1631,7 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)

MA_STATE(mas, &current->mm->mm_mt, 0, 0);
/* Adjust search length to account for worst case alignment overhead */
- length = info->length + info->align_mask;
+ length = info->length + info->align_mask + info->start_gap;
if (length < info->length)
return -ENOMEM;

--
2.34.1


2024-02-26 19:13:43

by Edgecombe, Rick P

Subject: [PATCH v2 7/9] x86/mm: Implement HAVE_ARCH_UNMAPPED_AREA_VMFLAGS

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
1. That the new mapping isn't placed in any existing mapping's guard
gaps.
2. That the new mapping isn't placed such that any existing mapping
ends up in *its* guard gaps.

The long-standing behavior of mmap() is to ensure 1, but not take any care
around 2. So for example, if there is a PAGE_SIZE free area, and a mapping
of PAGE_SIZE with a type that has a guard gap is being placed, mmap() may
place the shadow stack in the PAGE_SIZE free area. Then the mapping that
is supposed to have a guard gap will not have a gap to the adjacent VMA.

Add x86 arch implementations of arch_get_unmapped_area_vmflags/_topdown()
so future changes can allow the guard gap of the type of VMA being placed
to be taken into account. This will be used for shadow stack memory.

Signed-off-by: Rick Edgecombe <[email protected]>
---
v2:
- Remove unnecessary added extern
---
arch/x86/include/asm/pgtable_64.h | 1 +
arch/x86/kernel/sys_x86_64.c | 25 ++++++++++++++++++++-----
2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index a629b1b9f65a..eb09a11621ad 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -244,6 +244,7 @@ extern void cleanup_highmap(void);

#define HAVE_ARCH_UNMAPPED_AREA
#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#define HAVE_ARCH_UNMAPPED_AREA_VMFLAGS

#define PAGE_AGP PAGE_KERNEL_NOCACHE
#define HAVE_PAGE_AGP 1
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 6e5d4fa5fc42..95cb9efc47fb 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -120,8 +120,8 @@ static void find_start_end(unsigned long addr, unsigned long flags,
}

unsigned long
-arch_get_unmapped_area(struct file *filp, unsigned long addr,
- unsigned long len, unsigned long pgoff, unsigned long flags)
+arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
@@ -158,9 +158,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
}

unsigned long
-arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
- const unsigned long len, const unsigned long pgoff,
- const unsigned long flags)
+arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long addr0,
+ unsigned long len, unsigned long pgoff,
+ unsigned long flags, vm_flags_t vm_flags)
{
struct vm_area_struct *vma;
struct mm_struct *mm = current->mm;
@@ -230,3 +230,18 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
*/
return arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
}
+
+unsigned long
+arch_get_unmapped_area(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+ return arch_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, 0);
+}
+
+unsigned long
+arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr,
+ const unsigned long len, const unsigned long pgoff,
+ const unsigned long flags)
+{
+ return arch_get_unmapped_area_topdown_vmflags(filp, addr, len, pgoff, flags, 0);
+}
--
2.34.1


2024-02-26 19:13:46

by Edgecombe, Rick P

Subject: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

Future changes will need to add a field to struct vm_unmapped_area_info.
This would cause trouble for any archs that don't initialize the
struct. Currently every user sets each field manually, so if a new field
is added, existing users will not initialize it and the core code parsing
the struct will see garbage in the new field.

It could be possible to initialize the new field for each arch to 0, but
instead simply initialize the struct with C99 struct initialization syntax.
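
A minimal sketch of the difference (start_gap being the new field that a
later patch in this series adds):

	/* Before: any field not explicitly assigned afterwards holds
	 * stack garbage, including a newly added one like start_gap. */
	struct vm_unmapped_area_info info;

	/* After: every field starts out zero, including fields added
	 * in the future. */
	struct vm_unmapped_area_info info = {};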

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Suggested-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Link: https://lore.kernel.org/lkml/3ynogxcgokc6i6xojbxzzwqectg472laes24u7jmtktlxcch5e@dfytra3ia3zc/#t
---
Hi archs,

For some context, this is part of a larger series to improve shadow stack
guard gaps. It involves plumbing a new field via
struct vm_unmapped_area_info. The first user is x86, but arm and riscv may
likely use it as well. The change is compile tested only for non-x86 but
seems like a relatively safe one.

Thanks,

Rick

v2:
- New patch
---
arch/alpha/kernel/osf_sys.c | 2 +-
arch/arc/mm/mmap.c | 2 +-
arch/arm/mm/mmap.c | 4 ++--
arch/csky/abiv1/mmap.c | 2 +-
arch/loongarch/mm/mmap.c | 2 +-
arch/mips/mm/mmap.c | 2 +-
arch/parisc/kernel/sys_parisc.c | 2 +-
arch/powerpc/mm/book3s64/slice.c | 4 ++--
arch/s390/mm/hugetlbpage.c | 4 ++--
arch/s390/mm/mmap.c | 4 ++--
arch/sh/mm/mmap.c | 4 ++--
arch/sparc/kernel/sys_sparc_32.c | 2 +-
arch/sparc/kernel/sys_sparc_64.c | 4 ++--
arch/sparc/mm/hugetlbpage.c | 4 ++--
arch/x86/kernel/sys_x86_64.c | 4 ++--
arch/x86/mm/hugetlbpage.c | 4 ++--
fs/hugetlbfs/inode.c | 4 ++--
mm/mmap.c | 4 ++--
18 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 5db88b627439..dd6801bb9240 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1218,7 +1218,7 @@ static unsigned long
arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
unsigned long limit)
{
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

info.flags = 0;
info.length = len;
diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c
index 3c1c7ae73292..6549b3375f54 100644
--- a/arch/arc/mm/mmap.c
+++ b/arch/arc/mm/mmap.c
@@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

/*
* We enforce the MAP_FIXED case.
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index a0f8a0ca0788..525795578c29 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -34,7 +34,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
struct vm_area_struct *vma;
int do_align = 0;
int aliasing = cache_is_vipt_aliasing();
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

/*
* We only need to do colour alignment if either the I or D
@@ -87,7 +87,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
unsigned long addr = addr0;
int do_align = 0;
int aliasing = cache_is_vipt_aliasing();
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

/*
* We only need to do colour alignment if either the I or D
diff --git a/arch/csky/abiv1/mmap.c b/arch/csky/abiv1/mmap.c
index 6792aca49999..726659d41fa9 100644
--- a/arch/csky/abiv1/mmap.c
+++ b/arch/csky/abiv1/mmap.c
@@ -28,7 +28,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
int do_align = 0;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

/*
* We only need to do colour alignment if either the I or D
diff --git a/arch/loongarch/mm/mmap.c b/arch/loongarch/mm/mmap.c
index a9630a81b38a..664bf4abfdcf 100644
--- a/arch/loongarch/mm/mmap.c
+++ b/arch/loongarch/mm/mmap.c
@@ -24,7 +24,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
struct vm_area_struct *vma;
unsigned long addr = addr0;
int do_color_align;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (unlikely(len > TASK_SIZE))
return -ENOMEM;
diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index 00fe90c6db3e..6321b53dc995 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -34,7 +34,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
struct vm_area_struct *vma;
unsigned long addr = addr0;
int do_color_align;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (unlikely(len > TASK_SIZE))
return -ENOMEM;
diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index 98af719d5f85..e87c0e325abf 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -104,7 +104,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
struct vm_area_struct *vma, *prev;
unsigned long filp_pgoff;
int do_color_align;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (unlikely(len > TASK_SIZE))
return -ENOMEM;
diff --git a/arch/powerpc/mm/book3s64/slice.c b/arch/powerpc/mm/book3s64/slice.c
index c0b58afb9a47..5884f384866f 100644
--- a/arch/powerpc/mm/book3s64/slice.c
+++ b/arch/powerpc/mm/book3s64/slice.c
@@ -282,7 +282,7 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
{
int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
unsigned long found, next_end;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

info.flags = 0;
info.length = len;
@@ -326,7 +326,7 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
{
int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
unsigned long found, prev;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};
unsigned long min_addr = max(PAGE_SIZE, mmap_min_addr);

info.flags = VM_UNMAPPED_AREA_TOPDOWN;
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index c2d2850ec8d5..7f68485feea0 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -258,7 +258,7 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
unsigned long pgoff, unsigned long flags)
{
struct hstate *h = hstate_file(file);
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

info.flags = 0;
info.length = len;
@@ -274,7 +274,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
unsigned long pgoff, unsigned long flags)
{
struct hstate *h = hstate_file(file);
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};
unsigned long addr;

info.flags = VM_UNMAPPED_AREA_TOPDOWN;
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index cd52d72b59cf..df88496e2903 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -77,7 +77,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (len > TASK_SIZE - mmap_min_addr)
return -ENOMEM;
@@ -116,7 +116,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp, unsigned long ad
{
struct vm_area_struct *vma;
struct mm_struct *mm = current->mm;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

/* requested length too big for entire address space */
if (len > TASK_SIZE - mmap_min_addr)
diff --git a/arch/sh/mm/mmap.c b/arch/sh/mm/mmap.c
index b82199878b45..6aee5f761e08 100644
--- a/arch/sh/mm/mmap.c
+++ b/arch/sh/mm/mmap.c
@@ -57,7 +57,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
int do_colour_align;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (flags & MAP_FIXED) {
/* We do not accept a shared mapping if it would violate
@@ -106,7 +106,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
struct mm_struct *mm = current->mm;
unsigned long addr = addr0;
int do_colour_align;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (flags & MAP_FIXED) {
/* We do not accept a shared mapping if it would violate
diff --git a/arch/sparc/kernel/sys_sparc_32.c b/arch/sparc/kernel/sys_sparc_32.c
index 082a551897ed..7e781dbfd052 100644
--- a/arch/sparc/kernel/sys_sparc_32.c
+++ b/arch/sparc/kernel/sys_sparc_32.c
@@ -41,7 +41,7 @@ SYSCALL_DEFINE0(getpagesize)

unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags)
{
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (flags & MAP_FIXED) {
/* We do not accept a shared mapping if it would violate
diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 1dbf7211666e..fc48ab3f83af 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -93,7 +93,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
struct vm_area_struct * vma;
unsigned long task_size = TASK_SIZE;
int do_color_align;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (flags & MAP_FIXED) {
/* We do not accept a shared mapping if it would violate
@@ -154,7 +154,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
unsigned long task_size = STACK_TOP32;
unsigned long addr = addr0;
int do_color_align;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

/* This should only ever run for 32-bit processes. */
BUG_ON(!test_thread_flag(TIF_32BIT));
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index 38a1bef47efb..614e2c46d781 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -31,7 +31,7 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *filp,
{
struct hstate *h = hstate_file(filp);
unsigned long task_size = TASK_SIZE;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

if (test_thread_flag(TIF_32BIT))
task_size = STACK_TOP32;
@@ -63,7 +63,7 @@ hugetlb_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
struct hstate *h = hstate_file(filp);
struct mm_struct *mm = current->mm;
unsigned long addr = addr0;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

/* This should only ever run for 32-bit processes. */
BUG_ON(!test_thread_flag(TIF_32BIT));
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index c783aeb37dce..6e5d4fa5fc42 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -125,7 +125,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};
unsigned long begin, end;

if (flags & MAP_FIXED)
@@ -165,7 +165,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
struct vm_area_struct *vma;
struct mm_struct *mm = current->mm;
unsigned long addr = addr0;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

/* requested length too big for entire address space */
if (len > TASK_SIZE)
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 6d77c0039617..88726bd1f72d 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -51,7 +51,7 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
unsigned long pgoff, unsigned long flags)
{
struct hstate *h = hstate_file(file);
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

info.flags = 0;
info.length = len;
@@ -74,7 +74,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
unsigned long pgoff, unsigned long flags)
{
struct hstate *h = hstate_file(file);
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

info.flags = VM_UNMAPPED_AREA_TOPDOWN;
info.length = len;
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index a63d2eee086f..848b2239a215 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -165,7 +165,7 @@ hugetlb_get_unmapped_area_bottomup(struct file *file, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags)
{
struct hstate *h = hstate_file(file);
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

info.flags = 0;
info.length = len;
@@ -181,7 +181,7 @@ hugetlb_get_unmapped_area_topdown(struct file *file, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags)
{
struct hstate *h = hstate_file(file);
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};

info.flags = VM_UNMAPPED_AREA_TOPDOWN;
info.length = len;
diff --git a/mm/mmap.c b/mm/mmap.c
index e02bb17fef5b..33af683a643f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1699,7 +1699,7 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr,
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma, *prev;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};
const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);

if (len > mmap_end - mmap_min_addr)
@@ -1747,7 +1747,7 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
{
struct vm_area_struct *vma, *prev;
struct mm_struct *mm = current->mm;
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {};
const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);

/* requested length too big for entire address space */
--
2.34.1


2024-02-26 19:13:55

by Edgecombe, Rick P

Subject: [PATCH v2 8/9] x86/mm: Care about shadow stack guard gap during placement

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
1. That the new mapping isn't placed in any existing mapping's guard
gaps.
2. That the new mapping isn't placed such that any existing mapping
ends up in *its* guard gaps.

The long-standing behavior of mmap() is to ensure 1, but not take any care
around 2. So for example, if there is a PAGE_SIZE free area, and a mapping
of PAGE_SIZE with a type that has a guard gap is being placed, mmap() may
place the shadow stack in the PAGE_SIZE free area. Then the mapping that
is supposed to have a guard gap will not have a gap to the adjacent VMA.

Now that the vm_flags are passed into the arch get_unmapped_area()
variants, and vm_unmapped_area() is ready to consider them, have
VM_SHADOW_STACK mappings get guard gap consideration for scenario 2.

Signed-off-by: Rick Edgecombe <[email protected]>
---
arch/x86/kernel/sys_x86_64.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 95cb9efc47fb..f44afb3345db 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -119,6 +119,14 @@ static void find_start_end(unsigned long addr, unsigned long flags,
*end = task_size_64bit(addr > DEFAULT_MAP_WINDOW);
}

+static inline unsigned long stack_guard_placement(vm_flags_t vm_flags)
+{
+ if (vm_flags & VM_SHADOW_STACK)
+ return PAGE_SIZE;
+
+ return 0;
+}
+
unsigned long
arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsigned long len,
unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
@@ -150,6 +158,7 @@ arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsigned l
info.high_limit = end;
info.align_mask = 0;
info.align_offset = pgoff << PAGE_SHIFT;
+ info.start_gap = stack_guard_placement(vm_flags);
if (filp) {
info.align_mask = get_align_mask();
info.align_offset += get_align_bits();
@@ -199,6 +208,7 @@ arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long addr0,
info.low_limit = PAGE_SIZE;

info.high_limit = get_mmap_base(0);
+ info.start_gap = stack_guard_placement(vm_flags);

/*
* If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
--
2.34.1


2024-02-26 19:14:08

by Edgecombe, Rick P

Subject: [PATCH v2 9/9] selftests/x86: Add placement guard gap test for shstk

The existing shadow stack test for guard gaps just checks that new
mappings are not placed in an existing mapping's guard gap. Add one that
checks that new mappings are not placed such that preexisting mappings
are in the new mapping's guard gap.

Signed-off-by: Rick Edgecombe <[email protected]>
---
.../testing/selftests/x86/test_shadow_stack.c | 67 +++++++++++++++++--
1 file changed, 63 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/x86/test_shadow_stack.c b/tools/testing/selftests/x86/test_shadow_stack.c
index 757e6527f67e..ee909a7927f9 100644
--- a/tools/testing/selftests/x86/test_shadow_stack.c
+++ b/tools/testing/selftests/x86/test_shadow_stack.c
@@ -556,7 +556,7 @@ struct node {
* looked at the shadow stack gaps.
* 5. See if it landed in the gap.
*/
-int test_guard_gap(void)
+int test_guard_gap_other_gaps(void)
{
void *free_area, *shstk, *test_map = (void *)0xFFFFFFFFFFFFFFFF;
struct node *head = NULL, *cur;
@@ -593,11 +593,64 @@ int test_guard_gap(void)
if (shstk - test_map - PAGE_SIZE != PAGE_SIZE)
return 1;

- printf("[OK]\tGuard gap test\n");
+ printf("[OK]\tGuard gap test, other mapping's gaps\n");

return 0;
}

+/* Tests respecting the guard gap of the mapping getting placed */
+int test_guard_gap_new_mappings_gaps(void)
+{
+ void *free_area, *shstk_start, *test_map = (void *)0xFFFFFFFFFFFFFFFF;
+ struct node *head = NULL, *cur;
+ int ret = 0;
+
+ free_area = mmap(0, PAGE_SIZE * 4, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ munmap(free_area, PAGE_SIZE * 4);
+
+ /* Test letting map_shadow_stack find a free space */
+ shstk_start = mmap(free_area, PAGE_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ if (shstk_start == MAP_FAILED || shstk_start != free_area)
+ return 1;
+
+ while (test_map > shstk_start) {
+ test_map = (void *)syscall(__NR_map_shadow_stack, 0, PAGE_SIZE, 0);
+ if (test_map == MAP_FAILED) {
+ printf("[INFO]\tmap_shadow_stack MAP_FAILED\n");
+ ret = 1;
+ break;
+ }
+
+ cur = malloc(sizeof(*cur));
+ cur->mapping = test_map;
+
+ cur->next = head;
+ head = cur;
+
+ if (test_map == free_area + PAGE_SIZE) {
+ printf("[INFO]\tNew mapping has other mapping in guard gap!\n");
+ ret = 1;
+ break;
+ }
+ }
+
+ while (head) {
+ cur = head;
+ head = cur->next;
+ munmap(cur->mapping, PAGE_SIZE);
+ free(cur);
+ }
+
+ munmap(shstk_start, PAGE_SIZE);
+
+ if (!ret)
+ printf("[OK]\tGuard gap test, placement mapping's gaps\n");
+
+ return ret;
+}
+
/*
* Too complicated to pull it out of the 32 bit header, but also get the
* 64 bit one needed above. Just define a copy here.
@@ -850,9 +903,15 @@ int main(int argc, char *argv[])
goto out;
}

- if (test_guard_gap()) {
+ if (test_guard_gap_other_gaps()) {
ret = 1;
- printf("[FAIL]\tGuard gap test\n");
+ printf("[FAIL]\tGuard gap test, other mappings' gaps\n");
+ goto out;
+ }
+
+ if (test_guard_gap_new_mappings_gaps()) {
+ ret = 1;
+ printf("[FAIL]\tGuard gap test, placement mapping's gaps\n");
goto out;
}

--
2.34.1


2024-02-26 19:40:36

by Edgecombe, Rick P

Subject: [PATCH v2 3/9] mm: Use get_unmapped_area_vmflags()

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
1. That the new mapping isn't placed in any existing mapping's guard
gaps.
2. That the new mapping isn't placed such that any existing mapping
ends up in *its* guard gaps.

The long-standing behavior of mmap() is to ensure 1, but not take any
care around 2. So, for example, if there is a PAGE_SIZE free area and a
PAGE_SIZE mapping of a type that has a guard gap (such as a shadow
stack) is being placed, mmap() may place the shadow stack in the
PAGE_SIZE free area. Then the mapping that is supposed to have a guard
gap will not have a gap to the adjacent VMA.
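
To illustrate with a hypothetical layout (4K pages):

   | mapping A | 4K free | mapping B |

A shadow stack placed into the 4K hole ends up directly adjacent to
mapping A, leaving no room for the guard gap it is supposed to have at
its start.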

Use mm_get_unmapped_area_vmflags() in do_mmap() so that future changes
can cause shadow stack mappings to be placed with a guard gap. Also use
the THP variant that takes vm_flags, such that THP shadow stack can get the
same treatment. Adjust the vm_flags calculation to happen earlier so that
the vm_flags can be passed into __get_unmapped_area().

Signed-off-by: Rick Edgecombe <[email protected]>
---
v2:
- Make get_unmapped_area() a static inline (Kirill)
---
include/linux/mm.h | 11 ++++++++++-
mm/mmap.c | 34 ++++++++++++++++------------------
2 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index da5219b48d52..a091181ef65a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3351,7 +3351,16 @@ extern int install_special_mapping(struct mm_struct *mm,
unsigned long randomize_stack_top(unsigned long stack_top);
unsigned long randomize_page(unsigned long start, unsigned long range);

-extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
+unsigned long
+__get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags);
+
+static inline unsigned long
+get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags)
+{
+ return __get_unmapped_area(file, addr, len, pgoff, flags, 0);
+}

extern unsigned long mmap_region(struct file *file, unsigned long addr,
unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
diff --git a/mm/mmap.c b/mm/mmap.c
index 2021bc040e81..ac7601d05e89 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1249,18 +1249,6 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
if (mm->map_count > sysctl_max_map_count)
return -ENOMEM;

- /* Obtain the address to map to. we verify (or select) it and ensure
- * that it represents a valid section of the address space.
- */
- addr = get_unmapped_area(file, addr, len, pgoff, flags);
- if (IS_ERR_VALUE(addr))
- return addr;
-
- if (flags & MAP_FIXED_NOREPLACE) {
- if (find_vma_intersection(mm, addr, addr + len))
- return -EEXIST;
- }
-
if (prot == PROT_EXEC) {
pkey = execute_only_pkey(mm);
if (pkey < 0)
@@ -1274,6 +1262,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) |
mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;

+ /* Obtain the address to map to. we verify (or select) it and ensure
+ * that it represents a valid section of the address space.
+ */
+ addr = __get_unmapped_area(file, addr, len, pgoff, flags, vm_flags);
+ if (IS_ERR_VALUE(addr))
+ return addr;
+
+ if (flags & MAP_FIXED_NOREPLACE) {
+ if (find_vma_intersection(mm, addr, addr + len))
+ return -EEXIST;
+ }
+
if (flags & MAP_LOCKED)
if (!can_do_mlock())
return -EPERM;
@@ -1831,8 +1831,8 @@ unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, struct file *fi
}

unsigned long
-get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
- unsigned long pgoff, unsigned long flags)
+__get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
{
unsigned long (*get_area)(struct file *, unsigned long,
unsigned long, unsigned long, unsigned long)
@@ -1865,8 +1865,8 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
if (get_area)
addr = get_area(file, addr, len, pgoff, flags);
else
- addr = mm_get_unmapped_area(current->mm, file, addr, len,
- pgoff, flags);
+ addr = mm_get_unmapped_area_vmflags(current->mm, file, addr, len,
+ pgoff, flags, vm_flags);
if (IS_ERR_VALUE(addr))
return addr;

@@ -1879,8 +1879,6 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
return error ? error : addr;
}

-EXPORT_SYMBOL(get_unmapped_area);
-
unsigned long
mm_get_unmapped_area(struct mm_struct *mm, struct file *file,
unsigned long addr, unsigned long len,
--
2.34.1


2024-02-26 19:46:21

by Edgecombe, Rick P

[permalink] [raw]
Subject: [PATCH v2 4/9] thp: Add thp_get_unmapped_area_vmflags()

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
1. That the new mapping isn’t placed in any existing mapping’s guard
gaps.
2. That the new mapping isn’t placed such that any existing mappings
end up in *its* guard gaps.

The long-standing behavior of mmap() is to ensure 1, but not take any
care around 2. So, for example, if there is a PAGE_SIZE free area and a
PAGE_SIZE mapping of a type that has a guard gap (such as a shadow
stack) is being placed, mmap() may place the shadow stack in the
PAGE_SIZE free area. Then the mapping that is supposed to have a guard
gap will not have a gap to the adjacent VMA.

Add a THP implementation of the vm_flags variant of get_unmapped_area().
Future changes will call this from mmap.c in the do_mmap() path to allow
shadow stacks to be placed with consideration taken for the start guard
gap. Shadow stack memory is always private and anonymous, so special
guard gap logic is not needed in a lot of cases, but it can be mapped by
THP, so it needs to be handled.

Signed-off-by: Rick Edgecombe <[email protected]>
---
include/linux/huge_mm.h | 11 +++++++++++
mm/huge_memory.c | 23 ++++++++++++++++-------
mm/mmap.c | 12 +++++++-----
3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index fa0350b0812a..ef7251dfd9f9 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -139,6 +139,9 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,

unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags);
+unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff, unsigned long flags,
+ vm_flags_t vm_flags);

void folio_prep_large_rmappable(struct folio *folio);
bool can_split_folio(struct folio *folio, int *pextra_pins);
@@ -286,6 +289,14 @@ static inline void folio_prep_large_rmappable(struct folio *folio) {}

#define thp_get_unmapped_area NULL

+static inline unsigned long
+thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff,
+ unsigned long flags, vm_flags_t vm_flags)
+{
+ return 0;
+}
+
static inline bool
can_split_folio(struct folio *folio, int *pextra_pins)
{
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e9ef43a719a5..f235f6d3ff62 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -628,7 +628,8 @@ static inline bool is_transparent_hugepage(struct folio *folio)

static unsigned long __thp_get_unmapped_area(struct file *filp,
unsigned long addr, unsigned long len,
- loff_t off, unsigned long flags, unsigned long size)
+ loff_t off, unsigned long flags, unsigned long size,
+ vm_flags_t vm_flags)
{
loff_t off_end = off + len;
loff_t off_align = round_up(off, size);
@@ -641,8 +642,8 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
if (len_pad < len || (off + len_pad) < off)
return 0;

- ret = mm_get_unmapped_area(current->mm, filp, addr, len_pad,
- off >> PAGE_SHIFT, flags);
+ ret = mm_get_unmapped_area_vmflags(current->mm, filp, addr, len_pad,
+ off >> PAGE_SHIFT, flags, vm_flags);

/*
* The failure might be due to length padding. The caller will retry
@@ -662,17 +663,25 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
return ret;
}

-unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
- unsigned long len, unsigned long pgoff, unsigned long flags)
+unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff, unsigned long flags,
+ vm_flags_t vm_flags)
{
unsigned long ret;
loff_t off = (loff_t)pgoff << PAGE_SHIFT;

- ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE);
+ ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE, vm_flags);
if (ret)
return ret;

- return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
+ return mm_get_unmapped_area_vmflags(current->mm, filp, addr, len, pgoff, flags,
+ vm_flags);
+}
+
+unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
+ unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+ return thp_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, 0);
}
EXPORT_SYMBOL_GPL(thp_get_unmapped_area);

diff --git a/mm/mmap.c b/mm/mmap.c
index ac7601d05e89..e02bb17fef5b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1857,16 +1857,18 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
*/
pgoff = 0;
get_area = shmem_get_unmapped_area;
- } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
- /* Ensures that larger anonymous mappings are THP aligned. */
- get_area = thp_get_unmapped_area;
}

- if (get_area)
+ if (get_area) {
addr = get_area(file, addr, len, pgoff, flags);
- else
+ } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+ /* Ensures that larger anonymous mappings are THP aligned. */
+ addr = thp_get_unmapped_area_vmflags(file, addr, len,
+ pgoff, flags, vm_flags);
+ } else {
addr = mm_get_unmapped_area_vmflags(current->mm, file, addr, len,
pgoff, flags, vm_flags);
+ }
if (IS_ERR_VALUE(addr))
return addr;

--
2.34.1


2024-02-27 07:03:43

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info



On 26/02/2024 at 20:09, Rick Edgecombe wrote:
> Future changes will need to add a field to struct vm_unmapped_area_info.
> This would cause trouble for any archs that don't initialize the
> struct. Currently every user sets each field, so if new fields are
> added, the core code parsing the struct will see garbage in the new
> field.
>
> It could be possible to initialize the new field for each arch to 0, but
> instead simply initialize the field with a C99 struct initializing syntax.

Why do a full init of the struct when all fields are rewritten a few
lines later?

If I take the example of powerpc function slice_find_area_bottomup():

struct vm_unmapped_area_info info;

info.flags = 0;
info.length = len;
info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
info.align_offset = 0;

For me it looks better to just add:

info.new_field = 0; /* or whatever value it needs to have */

Christophe


>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Suggested-by: Kirill A. Shutemov <[email protected]>
> Signed-off-by: Rick Edgecombe <[email protected]>
> Link: https://lore.kernel.org/lkml/3ynogxcgokc6i6xojbxzzwqectg472laes24u7jmtktlxcch5e@dfytra3ia3zc/#t
> ---
> Hi archs,
>
> For some context, this is part of a larger series to improve shadow stack
> guard gaps. It involves plumbing a new field via
> struct vm_unmapped_area_info. The first user is x86, but arm and riscv may
> likely use it as well. The change is compile tested only for non-x86 but
> seems like a relatively safe one.
>
> Thanks,
>
> Rick
>
> v2:
> - New patch
> ---
> arch/alpha/kernel/osf_sys.c | 2 +-
> arch/arc/mm/mmap.c | 2 +-
> arch/arm/mm/mmap.c | 4 ++--
> arch/csky/abiv1/mmap.c | 2 +-
> arch/loongarch/mm/mmap.c | 2 +-
> arch/mips/mm/mmap.c | 2 +-
> arch/parisc/kernel/sys_parisc.c | 2 +-
> arch/powerpc/mm/book3s64/slice.c | 4 ++--
> arch/s390/mm/hugetlbpage.c | 4 ++--
> arch/s390/mm/mmap.c | 4 ++--
> arch/sh/mm/mmap.c | 4 ++--
> arch/sparc/kernel/sys_sparc_32.c | 2 +-
> arch/sparc/kernel/sys_sparc_64.c | 4 ++--
> arch/sparc/mm/hugetlbpage.c | 4 ++--
> arch/x86/kernel/sys_x86_64.c | 4 ++--
> arch/x86/mm/hugetlbpage.c | 4 ++--
> fs/hugetlbfs/inode.c | 4 ++--
> mm/mmap.c | 4 ++--
> 18 files changed, 29 insertions(+), 29 deletions(-)
>
> diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
> index 5db88b627439..dd6801bb9240 100644
> --- a/arch/alpha/kernel/osf_sys.c
> +++ b/arch/alpha/kernel/osf_sys.c
> @@ -1218,7 +1218,7 @@ static unsigned long
> arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
> unsigned long limit)
> {
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> info.flags = 0;
> info.length = len;
> diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c
> index 3c1c7ae73292..6549b3375f54 100644
> --- a/arch/arc/mm/mmap.c
> +++ b/arch/arc/mm/mmap.c
> @@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
> {
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> /*
> * We enforce the MAP_FIXED case.
> diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
> index a0f8a0ca0788..525795578c29 100644
> --- a/arch/arm/mm/mmap.c
> +++ b/arch/arm/mm/mmap.c
> @@ -34,7 +34,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
> struct vm_area_struct *vma;
> int do_align = 0;
> int aliasing = cache_is_vipt_aliasing();
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> /*
> * We only need to do colour alignment if either the I or D
> @@ -87,7 +87,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
> unsigned long addr = addr0;
> int do_align = 0;
> int aliasing = cache_is_vipt_aliasing();
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> /*
> * We only need to do colour alignment if either the I or D
> diff --git a/arch/csky/abiv1/mmap.c b/arch/csky/abiv1/mmap.c
> index 6792aca49999..726659d41fa9 100644
> --- a/arch/csky/abiv1/mmap.c
> +++ b/arch/csky/abiv1/mmap.c
> @@ -28,7 +28,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma;
> int do_align = 0;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> /*
> * We only need to do colour alignment if either the I or D
> diff --git a/arch/loongarch/mm/mmap.c b/arch/loongarch/mm/mmap.c
> index a9630a81b38a..664bf4abfdcf 100644
> --- a/arch/loongarch/mm/mmap.c
> +++ b/arch/loongarch/mm/mmap.c
> @@ -24,7 +24,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
> struct vm_area_struct *vma;
> unsigned long addr = addr0;
> int do_color_align;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (unlikely(len > TASK_SIZE))
> return -ENOMEM;
> diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
> index 00fe90c6db3e..6321b53dc995 100644
> --- a/arch/mips/mm/mmap.c
> +++ b/arch/mips/mm/mmap.c
> @@ -34,7 +34,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
> struct vm_area_struct *vma;
> unsigned long addr = addr0;
> int do_color_align;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (unlikely(len > TASK_SIZE))
> return -ENOMEM;
> diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
> index 98af719d5f85..e87c0e325abf 100644
> --- a/arch/parisc/kernel/sys_parisc.c
> +++ b/arch/parisc/kernel/sys_parisc.c
> @@ -104,7 +104,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
> struct vm_area_struct *vma, *prev;
> unsigned long filp_pgoff;
> int do_color_align;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (unlikely(len > TASK_SIZE))
> return -ENOMEM;
> diff --git a/arch/powerpc/mm/book3s64/slice.c b/arch/powerpc/mm/book3s64/slice.c
> index c0b58afb9a47..5884f384866f 100644
> --- a/arch/powerpc/mm/book3s64/slice.c
> +++ b/arch/powerpc/mm/book3s64/slice.c
> @@ -282,7 +282,7 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
> {
> int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
> unsigned long found, next_end;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> info.flags = 0;
> info.length = len;
> @@ -326,7 +326,7 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
> {
> int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
> unsigned long found, prev;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
> unsigned long min_addr = max(PAGE_SIZE, mmap_min_addr);
>
> info.flags = VM_UNMAPPED_AREA_TOPDOWN;
> diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
> index c2d2850ec8d5..7f68485feea0 100644
> --- a/arch/s390/mm/hugetlbpage.c
> +++ b/arch/s390/mm/hugetlbpage.c
> @@ -258,7 +258,7 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
> unsigned long pgoff, unsigned long flags)
> {
> struct hstate *h = hstate_file(file);
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> info.flags = 0;
> info.length = len;
> @@ -274,7 +274,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
> unsigned long pgoff, unsigned long flags)
> {
> struct hstate *h = hstate_file(file);
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
> unsigned long addr;
>
> info.flags = VM_UNMAPPED_AREA_TOPDOWN;
> diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
> index cd52d72b59cf..df88496e2903 100644
> --- a/arch/s390/mm/mmap.c
> +++ b/arch/s390/mm/mmap.c
> @@ -77,7 +77,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
> {
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (len > TASK_SIZE - mmap_min_addr)
> return -ENOMEM;
> @@ -116,7 +116,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp, unsigned long ad
> {
> struct vm_area_struct *vma;
> struct mm_struct *mm = current->mm;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> /* requested length too big for entire address space */
> if (len > TASK_SIZE - mmap_min_addr)
> diff --git a/arch/sh/mm/mmap.c b/arch/sh/mm/mmap.c
> index b82199878b45..6aee5f761e08 100644
> --- a/arch/sh/mm/mmap.c
> +++ b/arch/sh/mm/mmap.c
> @@ -57,7 +57,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma;
> int do_colour_align;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (flags & MAP_FIXED) {
> /* We do not accept a shared mapping if it would violate
> @@ -106,7 +106,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
> struct mm_struct *mm = current->mm;
> unsigned long addr = addr0;
> int do_colour_align;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (flags & MAP_FIXED) {
> /* We do not accept a shared mapping if it would violate
> diff --git a/arch/sparc/kernel/sys_sparc_32.c b/arch/sparc/kernel/sys_sparc_32.c
> index 082a551897ed..7e781dbfd052 100644
> --- a/arch/sparc/kernel/sys_sparc_32.c
> +++ b/arch/sparc/kernel/sys_sparc_32.c
> @@ -41,7 +41,7 @@ SYSCALL_DEFINE0(getpagesize)
>
> unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags)
> {
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (flags & MAP_FIXED) {
> /* We do not accept a shared mapping if it would violate
> diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
> index 1dbf7211666e..fc48ab3f83af 100644
> --- a/arch/sparc/kernel/sys_sparc_64.c
> +++ b/arch/sparc/kernel/sys_sparc_64.c
> @@ -93,7 +93,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
> struct vm_area_struct * vma;
> unsigned long task_size = TASK_SIZE;
> int do_color_align;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (flags & MAP_FIXED) {
> /* We do not accept a shared mapping if it would violate
> @@ -154,7 +154,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
> unsigned long task_size = STACK_TOP32;
> unsigned long addr = addr0;
> int do_color_align;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> /* This should only ever run for 32-bit processes. */
> BUG_ON(!test_thread_flag(TIF_32BIT));
> diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
> index 38a1bef47efb..614e2c46d781 100644
> --- a/arch/sparc/mm/hugetlbpage.c
> +++ b/arch/sparc/mm/hugetlbpage.c
> @@ -31,7 +31,7 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *filp,
> {
> struct hstate *h = hstate_file(filp);
> unsigned long task_size = TASK_SIZE;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> if (test_thread_flag(TIF_32BIT))
> task_size = STACK_TOP32;
> @@ -63,7 +63,7 @@ hugetlb_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
> struct hstate *h = hstate_file(filp);
> struct mm_struct *mm = current->mm;
> unsigned long addr = addr0;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> /* This should only ever run for 32-bit processes. */
> BUG_ON(!test_thread_flag(TIF_32BIT));
> diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
> index c783aeb37dce..6e5d4fa5fc42 100644
> --- a/arch/x86/kernel/sys_x86_64.c
> +++ b/arch/x86/kernel/sys_x86_64.c
> @@ -125,7 +125,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
> {
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
> unsigned long begin, end;
>
> if (flags & MAP_FIXED)
> @@ -165,7 +165,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
> struct vm_area_struct *vma;
> struct mm_struct *mm = current->mm;
> unsigned long addr = addr0;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> /* requested length too big for entire address space */
> if (len > TASK_SIZE)
> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
> index 6d77c0039617..88726bd1f72d 100644
> --- a/arch/x86/mm/hugetlbpage.c
> +++ b/arch/x86/mm/hugetlbpage.c
> @@ -51,7 +51,7 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
> unsigned long pgoff, unsigned long flags)
> {
> struct hstate *h = hstate_file(file);
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> info.flags = 0;
> info.length = len;
> @@ -74,7 +74,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
> unsigned long pgoff, unsigned long flags)
> {
> struct hstate *h = hstate_file(file);
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> info.flags = VM_UNMAPPED_AREA_TOPDOWN;
> info.length = len;
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index a63d2eee086f..848b2239a215 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -165,7 +165,7 @@ hugetlb_get_unmapped_area_bottomup(struct file *file, unsigned long addr,
> unsigned long len, unsigned long pgoff, unsigned long flags)
> {
> struct hstate *h = hstate_file(file);
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> info.flags = 0;
> info.length = len;
> @@ -181,7 +181,7 @@ hugetlb_get_unmapped_area_topdown(struct file *file, unsigned long addr,
> unsigned long len, unsigned long pgoff, unsigned long flags)
> {
> struct hstate *h = hstate_file(file);
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> info.flags = VM_UNMAPPED_AREA_TOPDOWN;
> info.length = len;
> diff --git a/mm/mmap.c b/mm/mmap.c
> index e02bb17fef5b..33af683a643f 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1699,7 +1699,7 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr,
> {
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma, *prev;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
> const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
>
> if (len > mmap_end - mmap_min_addr)
> @@ -1747,7 +1747,7 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
> {
> struct vm_area_struct *vma, *prev;
> struct mm_struct *mm = current->mm;
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
> const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
>
> /* requested length too big for entire address space */

2024-02-27 15:03:19

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Tue, 2024-02-27 at 07:02 +0000, Christophe Leroy wrote:
> > It could be possible to initialize the new field for each arch to
> > 0, but
> > instead simply initialize the field with a C99 struct initializing
> > syntax.
>
> Why do a full init of the struct when all fields are rewritten a few
> lines later?
>
> If I take the example of powerpc function slice_find_area_bottomup():
>
>         struct vm_unmapped_area_info info;
>
>         info.flags = 0;
>         info.length = len;
>         info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
>         info.align_offset = 0;
>
> For me it looks better to just add:
>
>         info.new_field = 0; /* or whatever value it needs to have */

Hi,

Thanks for taking a look. Yes, I guess that should have some
justification. I was thinking of two reasons:
1. No future additions of optional parameters would need to make tree
wide changes like this.
2. The change is easier to review and get correct because the necessary
context is within a single line. For example, in that function some of
the members are set within a while loop. The place you pointed to seems
to be the correct one, but a diff that had the new field set after:
info.high_limit = addr;
...would look correct too, but not be.

What is the concern with C99 initialization? FWIW, the full series also
removes an indirect branch, and probably is a net win for performance
in this path.

2024-02-27 18:08:04

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Tue, Feb 27, 2024 at 07:02:59AM +0000, Christophe Leroy wrote:
>
>
On 26/02/2024 at 20:09, Rick Edgecombe wrote:
> > Future changes will need to add a field to struct vm_unmapped_area_info.
> > This would cause trouble for any archs that don't initialize the
> > struct. Currently every user sets each field, so if new fields are
> > added, the core code parsing the struct will see garbage in the new
> > field.
> >
> > It could be possible to initialize the new field for each arch to 0, but
> > instead simply initialize the field with a C99 struct initializing syntax.
>
> Why do a full init of the struct when all fields are rewritten a few
> lines later?

It's a nice change for robustness and makes future changes easier. It's
not actually wasteful since the compiler will throw away all redundant
stores.

> If I take the example of powerpc function slice_find_area_bottomup():
>
> struct vm_unmapped_area_info info;
>
> info.flags = 0;
> info.length = len;
> info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
> info.align_offset = 0;

But one cleanup that is possible from explicitly zero-initializing the
whole structure would be dropping all the individual "= 0" assignments.
:)

--
Kees Cook

2024-02-27 18:16:48

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info



On 27/02/2024 at 19:07, Kees Cook wrote:
> On Tue, Feb 27, 2024 at 07:02:59AM +0000, Christophe Leroy wrote:
>>
>>
>> On 26/02/2024 at 20:09, Rick Edgecombe wrote:
>>> Future changes will need to add a field to struct vm_unmapped_area_info.
>>> This would cause trouble for any archs that don't initialize the
>>> struct. Currently every user sets each field, so if new fields are
>>> added, the core code parsing the struct will see garbage in the new
>>> field.
>>>
>>> It could be possible to initialize the new field for each arch to 0, but
>>> instead simply initialize the field with a C99 struct initializing syntax.
>>
>> Why do a full init of the struct when all fields are rewritten a few
>> lines later?
>
> It's a nice change for robustness and makes future changes easier. It's
> not actually wasteful since the compiler will throw away all redundant
> stores.

Well, I tend to dislike default init at declaration because it often
hides a missed real init. When a field is not initialized, GCC should
emit a warning, at least when built with W=2 which sets
-Wmissing-field-initializers, shouldn't it?

>
>> If I take the example of powerpc function slice_find_area_bottomup():
>>
>> struct vm_unmapped_area_info info;
>>
>> info.flags = 0;
>> info.length = len;
>> info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
>> info.align_offset = 0;
>
> But one cleanup that is possible from explicitly zero-initializing the
> whole structure would be dropping all the individual "= 0" assignments.
> :)
>

Sure, if we decide to go that direction, all those 0 assignments become void.

2024-02-27 20:25:57

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Tue, 2024-02-27 at 18:16 +0000, Christophe Leroy wrote:
> > > Why do a full init of the struct when all fields are rewritten a
> > > few lines later?
> >
> > It's a nice change for robustness and makes future changes easier.
> > It's
> > not actually wasteful since the compiler will throw away all
> > redundant
> > stores.
>
> Well, I tend to dislike default init at declaration because it often
> hides missed real init. When a field is not initialized GCC should
> emit
> a warning, at least when built with W=2 which sets
> -Wmissing-field-initializers?

Sorry, I'm not following where you are going with this. There aren't
any struct vm_unmapped_area_info users that use initializers today, so
that warning won't apply in this case. Meanwhile, designated-style
struct initialization (which would zero new members) is very common,
and it doesn't get anything checked by that warning anyway. Anything
with this many members is probably going to use the designated style.

If we are optimizing to avoid bugs, the way this struct is used today
is not great. It is essentially being used as an argument passer.
Normally when a function signature changes but a caller is missed, the
compiler will of course complain loudly. But not here. So I think
zero-initializing it is probably safer than being set up to pass
garbage.
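
To make the zeroing semantics concrete, here is a minimal standalone
sketch (userspace C; the struct and its members, including an imagined
newly-added start_gap field, are just for illustration):

#include <stdio.h>

struct area_info {
	unsigned long flags;
	unsigned long length;
	unsigned long align_mask;
	unsigned long start_gap;	/* pretend this member was just added */
};

int main(void)
{
	/* Designated initializer: named members set, all others zeroed */
	struct area_info info = { .length = 4096 };

	/* Prints "0 4096 0 0" -- the unnamed new member is implicitly 0 */
	printf("%lu %lu %lu %lu\n", info.flags, info.length,
	       info.align_mask, info.start_gap);
	return 0;
}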

I'm trying to figure out what to do here. If I changed it so that just
powerpc set the new field manually, then the convention across the
kernel would be for everything to be default zero, and future new
parameters could have a greater chance of turning into garbage on
powerpc, since it could be easy to miss that powerpc was special. Would
you prefer that?

Or maybe I could try a new vm_unmapped_area() that takes the extra
argument separately? The old callers could call the old function and
not need any arch updates. It all seems strange though, because
automatic zero initializing struct members is so common in the kernel.
But it also wouldn't add the cleanup Kees was pointing out. Hmm.
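
A rough sketch of that idea, just to make it concrete (the variant's
name and the separate start_gap parameter are hypothetical here):

unsigned long vm_unmapped_area_with_gap(struct vm_unmapped_area_info *info,
					unsigned long start_gap);

static inline unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info)
{
	/* Existing callers keep the old signature, so no arch updates */
	return vm_unmapped_area_with_gap(info, 0);
}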

Any preference? Or maybe am I missing your point and talking nonsense?

2024-02-28 13:22:46

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info



On 27/02/2024 at 21:25, Edgecombe, Rick P wrote:
> On Tue, 2024-02-27 at 18:16 +0000, Christophe Leroy wrote:
>>>> Why do a full init of the struct when all fields are rewritten a
>>>> few lines later?
>>>
>>> It's a nice change for robustness and makes future changes easier.
>>> It's
>>> not actually wasteful since the compiler will throw away all
>>> redundant
>>> stores.
>>
>> Well, I tend to dislike default init at declaration because it often
>> hides missed real init. When a field is not initialized GCC should
>> emit
>> a warning, at least when built with W=2 which sets
>> -Wmissing-field-initializers?
>
> Sorry, I'm not following where you are going with this. There aren't
> any struct vm_unmapped_area_info users that use initializers today, so
> that warning won't apply in this case. Meanwhile, designated style
> struct initialization (which would zero new members) is very common, as
> well as not get anything checked by that warning. Anything with this
> many members is probably going to use the designated style.
>
> If we are optimizing to avoid bugs, the way this struct is used today
> is not great. It is essentially being used as an argument passer.
> Normally when a function signature changes, but a caller is missed, of
> course the compiler will notice loudly. But not here. So I think
> probably zero initializing it is safer than being setup to pass
> garbage.

No worry. If everybody thinks that init at declaration is worth it in
that case, it is OK for me and I'm not going to ask for something
special on powerpc; my comment was more general, although I used
powerpc as an example.

My worry with initialisation at declaration is that it often hides
missing assignments. Let's take the following simple example:

char *colour(int num)
{
	char *name;

	if (num == 0) {
		name = "black";
	} else if (num == 1) {
		name = "white";
	} else if (num == 2) {
	} else {
		name = "no colour";
	}

	return name;
}


Here, GCC warns about a missing initialisation of variable 'name'.

But if I declare it as

char *name = "no colour";

Then GCC won't warn anymore that we are missing a value for when num is 2.

During my life I have so many times spent huge amounts of time
investigating issues and bugs due to missing assignments that went
undetected because of default initialisation at declaration.

>
> I'm trying to figure out what to do here. If I changed it so that just
> powerpc set the new field manually, then the convention across the
> kernel would be for everything to be default zero, and future other new
> parameters could have a greater chance of turning into garbage on
> powerpc. Since it could be easy to miss that powerpc was special. Would
> you prefer it?
>
> Or maybe I could try a new vm_unmapped_area() that takes the extra
> argument separately? The old callers could call the old function and
> not need any arch updates. It all seems strange though, because
> automatic zero initializing struct members is so common in the kernel.
> But it also wouldn't add the cleanup Kees was pointing out. Hmm.
>
> Any preference? Or maybe am I missing your point and talking nonsense?
>

So my preference would go to the addition of:

info.new_field = 0;

But that's very minor and if you think it is easier to manage and
maintain by performing {} initialisation at declaration, let's go for that.

Christophe

2024-02-28 16:45:27

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Mon, Feb 26, 2024 at 11:09:47AM -0800, Rick Edgecombe wrote:
> diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
> index 5db88b627439..dd6801bb9240 100644
> --- a/arch/alpha/kernel/osf_sys.c
> +++ b/arch/alpha/kernel/osf_sys.c
> @@ -1218,7 +1218,7 @@ static unsigned long
> arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
> unsigned long limit)
> {
> - struct vm_unmapped_area_info info;
> + struct vm_unmapped_area_info info = {};
>
> info.flags = 0;
> info.length = len;

Can we take a step forward and actually move the initialization into the
initializer? Something like below.

I understand that it is substantially more work, but I think it is useful.

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 5db88b627439..c40ddede3b13 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1218,14 +1218,12 @@ static unsigned long
arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
unsigned long limit)
{
- struct vm_unmapped_area_info info;
+ struct vm_unmapped_area_info info = {
+ .length = len,
+ .low_limit = addr,
+ .high_limit = limit,
+ };

- info.flags = 0;
- info.length = len;
- info.low_limit = addr;
- info.high_limit = limit;
- info.align_mask = 0;
- info.align_offset = 0;
return vm_unmapped_area(&info);
}

--
Kiryl Shutsemau / Kirill A. Shutemov

2024-02-28 17:02:14

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Wed, 2024-02-28 at 13:22 +0000, Christophe Leroy wrote:
> > Any preference? Or maybe am I missing your point and talking
> > nonsense?
> >
>
> So my preference would go to the addition of:
>
>         info.new_field = 0;
>
> But that's very minor and if you think it is easier to manage and
> maintain by performing {} initialisation at declaration, let's go for
> that.

Appreciate the clarification and help getting this right. I'm thinking
Kees' and now Kirill's point about this patch resulting in unnecessary
manual zero initialization of the structs is probably something that
needs to be addressed.

If I created a bunch of patches to change each call site, I think the
best is probably to do the designated field zero initialization
way.

But I can do something special for powerpc if you want. I'll first try
with powerpc matching the others, and if it seems objectionable, please
let me know.

Thanks,

Rick

2024-02-28 17:21:53

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Wed, Feb 28, 2024 at 01:22:09PM +0000, Christophe Leroy wrote:
> [...]
> My worry with initialisation at declaration is that it often hides missing
> assignments. Let's take the following simple example:
>
> char *colour(int num)
> {
> 	char *name;
>
> 	if (num == 0) {
> 		name = "black";
> 	} else if (num == 1) {
> 		name = "white";
> 	} else if (num == 2) {
> 	} else {
> 		name = "no colour";
> 	}
>
> 	return name;
> }
>
> Here, GCC warns about a missing initialisation of variable 'name'.

Sometimes. :( We build with -Wno-maybe-uninitialized because GCC gets
this wrong too often. Also, with large structs like this, all uninit
warnings get suppressed if anything takes them by reference. So, if
before your "return name" statement above, you had something like:

do_something(&name);

it won't warn with any option enabled.

> But if I declare it as
>
> char *name = "no colour";
>
> Then GCC won't warn anymore that we are missing a value for when num is 2.
>
> During my life I have so many times spent huge amount of time
> investigating issues and bugs due to missing assignments that were going
> undetected due to default initialisation at declaration.

I totally understand. If the "uninitialized" warnings were actually
reliable, I would agree. I look at it this way:

- initializations can be missed either in static initializers or via
run time initializers. (So the risk of mistake here is matched --
though I'd argue it's easier to *find* static initializers when adding
new struct members.)
- uninitialized warnings are inconsistent (this becomes an unknown risk)
- when a run time initializer is missed, the contents are whatever was
on the stack (high risk)
- when a static initializer is missed, the content is 0 (low risk)

I think unambiguous state (always 0) is significantly more important for
the safety of the system as a whole. Yes, individual cases may be bad
("what uid should this be? root?!") but from a general memory safety
perspective the value doesn't become potentially influenced by order of
operations, leftover stack memory, etc.
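
As a toy illustration of the last two bullets (standalone userspace C;
the struct is invented for the example):

#include <stdio.h>
#include <string.h>

struct req {
	int uid;
	int flags;
};

int main(void)
{
	struct req missed_runtime;	/* run time init missed */
	struct req missed_static = {};	/* static init: unambiguously zero */

	/* Simulate leftover data on the stack */
	memset(&missed_runtime, 0xff, sizeof(missed_runtime));

	printf("run time missed: uid=%d\n", missed_runtime.uid);	/* -1 */
	printf("static missed:   uid=%d\n", missed_static.uid);	/* 0 */
	return 0;
}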

I'd agree, lifting everything into a static initializer does seem
cleanest of all the choices.

-Kees

--
Kees Cook

2024-02-28 23:10:59

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info



On 28/02/2024 at 18:01, Edgecombe, Rick P wrote:
> On Wed, 2024-02-28 at 13:22 +0000, Christophe Leroy wrote:
>>> Any preference? Or maybe am I missing your point and talking
>>> nonsense?
>>>
>>
>> So my preference would go to the addition of:
>>
>>         info.new_field = 0;
>>
>> But that's very minor and if you think it is easier to manage and
>> maintain by performing {} initialisation at declaration, let's go for
>> that.
>
> Appreciate the clarification and help getting this right. I'm thinking
> Kees' and now Kirill's point about this patch resulting in unnecessary
> manual zero initialization of the structs is probably something that
> needs to be addressed.
>
> If I created a bunch of patches to change each call site, I think the
> best is probably to do the designated field zero initialization
> way.
>
> But I can do something special for powerpc if you want. I'll first try
> with powerpc matching the others, and if it seems objectionable, please
> let me know.
>

My comments were generic, not powerpc oriented. Please keep
powerpc as similar as possible to the others.

Christophe

2024-03-02 00:47:29

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Wed, 2024-02-28 at 09:21 -0800, Kees Cook wrote:
> I totally understand. If the "uninitialized" warnings were actually
> reliable, I would agree. I look at it this way:
>
> - initializations can be missed either in static initializers or via
>   run time initializers. (So the risk of mistake here is matched --
>   though I'd argue it's easier to *find* static initializers when
> adding
>   new struct members.)
> - uninitialized warnings are inconsistent (this becomes an unknown
> risk)
> - when a run time initializer is missed, the contents are whatever
> was
>   on the stack (high risk)
> - when a static initializer is missed, the content is 0 (low risk)
>
> I think unambiguous state (always 0) is significantly more important
> for
> the safety of the system as a whole. Yes, individual cases may be bad
> ("what uid should this be? root?!") but from a general memory safety
> perspective the value doesn't become potentially influenced by order
> of
> operations, leftover stack memory, etc.
>
> I'd agree, lifting everything into a static initializer does seem
> cleanest of all the choices.

Hi Kees,

Well, I just gave this a try. It is giving me flashbacks of when I last
had to do a tree wide change that I couldn't fully test and the
breakage was caught by Linus.

Could you let me know if you think this is worthwhile cleanup on its
own, outside of the guard gap improvements of this series? Because I
was thinking a more cowardly approach could be a new vm_unmapped_area()
variant that takes the new start gap member as a separate argument
outside of struct vm_unmapped_area_info. It would be kind of strange to
keep them separate, but it would be less likely to bump something.

Thanks,

Rick

2024-03-02 01:51:34

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Sat, Mar 02, 2024 at 12:47:08AM +0000, Edgecombe, Rick P wrote:
> On Wed, 2024-02-28 at 09:21 -0800, Kees Cook wrote:
> > I totally understand. If the "uninitialized" warnings were actually
> > reliable, I would agree. I look at it this way:
> >
> > - initializations can be missed either in static initializers or via
> >   run time initializers. (So the risk of mistake here is matched --
> >   though I'd argue it's easier to *find* static initializers when
> > adding
> >   new struct members.)
> > - uninitialized warnings are inconsistent (this becomes an unknown
> > risk)
> > - when a run time initializer is missed, the contents are whatever
> > was
> >   on the stack (high risk)
> > - when a static initializer is missed, the content is 0 (low risk)
> >
> > I think unambiguous state (always 0) is significantly more important
> > for
> > the safety of the system as a whole. Yes, individual cases may be bad
> > ("what uid should this be? root?!") but from a general memory safety
> > perspective the value doesn't become potentially influenced by order
> > of
> > operations, leftover stack memory, etc.
> >
> > I'd agree, lifting everything into a static initializer does seem
> > cleanest of all the choices.
>
> Hi Kees,
>
> Well, I just gave this a try. It is giving me flashbacks of when I last
> had to do a tree wide change that I couldn't fully test and the
> breakage was caught by Linus.

Yeah, testing isn't fun for these kinds of things. This is traditionally
why the "obviously correct" changes tend to have an easier time landing
(i.e. adding "= {}" to all of them).

> Could you let me know if you think this is worthwhile cleanup on its
> own, outside of the guard gap improvements of this series? Because I
> was thinking a more cowardly approach could be a new vm_unmapped_area()
> variant that takes the new start gap member as a separate argument
> outside of struct vm_unmapped_area_info. It would be kind of strange to
> keep them separate, but it would be less likely to bump something.

I think you want a new member -- AIUI, that's what that struct is for.

Looking at this resulting set of patches, I do kinda think just adding
the "= {}" in a single patch is more sensible. Having to split things
that are known at the top of the function from the stuff known at the
existing initialization time is rather awkward.

Personally, I think a single patch that sets "= {}" for all of them and
drop the all the "= 0" or "= NULL" assignments would be the cleanest way
to go.
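
Concretely, reusing the alpha example quoted earlier in the thread, it
would end up looking something like (sketch only):

	struct vm_unmapped_area_info info = {};

	info.length = len;
	info.low_limit = addr;
	info.high_limit = limit;
	return vm_unmapped_area(&info);

with the "info.flags = 0;", "info.align_mask = 0;" and
"info.align_offset = 0;" lines dropped as redundant.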

-Kees

--
Kees Cook

2024-03-04 18:01:47

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info



On 02/03/2024 at 02:51, Kees Cook wrote:
> On Sat, Mar 02, 2024 at 12:47:08AM +0000, Edgecombe, Rick P wrote:
>> On Wed, 2024-02-28 at 09:21 -0800, Kees Cook wrote:
>>> I totally understand. If the "uninitialized" warnings were actually
>>> reliable, I would agree. I look at it this way:
>>>
>>> - initializations can be missed either in static initializers or via
>>>   run time initializers. (So the risk of mistake here is matched --
>>>   though I'd argue it's easier to *find* static initializers when
>>> adding
>>>   new struct members.)
>>> - uninitialized warnings are inconsistent (this becomes an unknown
>>> risk)
>>> - when a run time initializer is missed, the contents are whatever
>>> was
>>>   on the stack (high risk)
>>> - when a static initializer is missed, the content is 0 (low risk)
>>>
>>> I think unambiguous state (always 0) is significantly more important
>>> for
>>> the safety of the system as a whole. Yes, individual cases may be bad
>>> ("what uid should this be? root?!") but from a general memory safety
>>> perspective the value doesn't become potentially influenced by order
>>> of
>>> operations, leftover stack memory, etc.
>>>
>>> I'd agree, lifting everything into a static initializer does seem
>>> cleanest of all the choices.
>>
>> Hi Kees,
>>
>> Well, I just gave this a try. It is giving me flashbacks of when I last
>> had to do a tree wide change that I couldn't fully test and the
>> breakage was caught by Linus.
>
> Yeah, testing isn't fun for these kinds of things. This is traditionally
> why the "obviously correct" changes tend to have an easier time landing
> (i.e. adding "= {}" to all of them).
>
>> Could you let me know if you think this is worthwhile cleanup on its
>> own, outside of the guard gap improvements of this series? Because I
>> was thinking a more cowardly approach could be a new vm_unmapped_area()
>> variant that takes the new start gap member as a separate argument
>> outside of struct vm_unmapped_area_info. It would be kind of strange to
>> keep them separate, but it would be less likely to bump something.
>
> I think you want a new member -- AIUI, that's what that struct is for.
>
> Looking at this resulting set of patches, I do kinda think just adding
> the "= {}" in a single patch is more sensible. Having to split things
> that are know at the top of the function from the stuff known at the
> existing initialization time is rather awkward.
>
> Personally, I think a single patch that sets "= {}" for all of them and
> drop the all the "= 0" or "= NULL" assignments would be the cleanest way
> to go.

I agree with Kees, set = {} and drop all the "something = 0;" stuff.

Christophe

2024-03-04 18:04:10

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH v2 5/9] mm: Initialize struct vm_unmapped_area_info

On Mon, 2024-03-04 at 18:00 +0000, Christophe Leroy wrote:
> > Personally, I think a single patch that sets "= {}" for all of them
> > and
> > drop the all the "= 0" or "= NULL" assignments would be the
> > cleanest way
> > to go.
>
> I agree with Kees, set = {} and drop all the "something = 0;" stuff.

Thanks. Now some of the archs have very nicely acked and reviewed the
existing patches. I'll leave those as is, and do this for anyone that
doesn't respond.