2023-10-17 09:08:38

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 0/8] Introduce mseal() syscall

From: Jeff Xu <[email protected]>

This patchset proposes a new mseal() syscall for the Linux kernel.

Modern CPUs support memory permissions such as RW and NX bits. Linux has
supported NX since the release of kernel version 2.6.8 in August 2004 [1].
The memory permission feature improves security stance on memory
corruption bugs, i.e. the attacker can’t just write to arbitrary memory
and point the code to it, the memory has to be marked with X bit, or
else an exception will happen. The protection is set by mmap(2),
mprotect(2), mremap(2).

Memory sealing additionally protects the mapping itself against
modifications. This is useful to mitigate memory corruption issues where
a corrupted pointer is passed to a memory management syscall. For example,
such an attacker primitive can break control-flow integrity guarantees
since read-only memory that is supposed to be trusted can become writable
or .text pages can get remapped. Memory sealing can automatically be
applied by the runtime loader to seal .text and .rodata pages and
applications can additionally seal security critical data at runtime.
A similar feature already exists in the XNU kernel with the
VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the mimmutable syscall [4].
Also, Chrome wants to adopt this feature for their CFI work [2] and this
patchset has been designed to be compatible with the Chrome use case.

The new mseal() is an architecture independent syscall, and with
following signature:

mseal(void addr, size_t len, unsigned long types, unsigned long flags)

addr/len: memory range. Must be continuous/allocated memory, or else
mseal() will fail and no VMA is updated. For details on acceptable
arguments, please refer to comments in mseal.c. Those are also fully
covered by the selftest.

types: bit mask to specify which syscall to seal.

Five syscalls can be sealed, as specified by bitmasks:
MM_SEAL_MPROTECT: Deny mprotect(2)/pkey_mprotect(2).
MM_SEAL_MUNMAP: Deny munmap(2).
MM_SEAL_MMAP: Deny mmap(2).
MM_SEAL_MREMAP: Deny mremap(2).
MM_SEAL_MSEAL: Deny adding a new seal type.

Each bit represents sealing for one specific syscall type, e.g.
MM_SEAL_MPROTECT will deny mprotect syscall. The consideration of bitmask
is that the API is extendable, i.e. when needed, the sealing can be
extended to madvise, mlock, etc. Backward compatibility is also easy.

The kernel will remember which seal types are applied, and the application
doesn’t need to repeat all existing seal types in the next mseal(). Once
a seal type is applied, it can’t be unsealed. Call mseal() on an existing
seal type is a no-action, not a failure.

MM_SEAL_MSEAL will deny mseal() calls that try to add a new seal type.

Internally, vm_area_struct adds a new field vm_seals, to store the bit
masks.

For the affected syscalls, such as mprotect, a check(can_modify_mm) for
sealing is added, this usually happens at the early point of the syscall,
before any update is made to VMAs. The effect of that is: if any of the
VMAs in the given address range fails the sealing check, none of the VMA
will be updated.

The idea that inspired this patch comes from Stephen Röttger’s work in
V8 CFI [5], Chrome browser in ChromeOS will be the first user of this API.

[1] https://kernelnewbies.org/Linux_2_6_8
[2] https://v8.dev/blog/control-flow-integrity
[3] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b9f69dbd3c8c3fd30a/osfmk/mach/vm_statistics.h#L274
[4] https://man.openbsd.org/mimmutable.2
[5] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXgeaRHo/edit#heading=h.bvaojj9fu6hc

PATCH history:

v1:
Use _BITUL to define MM_SEAL_XX type.
Use unsigned long for seal type in sys_mseal() and other functions.
Remove internal VM_SEAL_XX type and convert_user_seal_type().
Remove MM_ACTION_XX type.
Remove caller_origin(ON_BEHALF_OF_XX) and replace with sealing bitmask.
Add more comments in code.
Add detailed commit message.

v0:
https://lore.kernel.org/lkml/[email protected]/

Jeff Xu (8):
mseal: Add mseal(2) syscall.
mseal: Wire up mseal syscall
mseal: add can_modify_mm and can_modify_vma
mseal: Check seal flag for mprotect(2)
mseal: Check seal flag for munmap(2)
mseal: Check seal flag for mremap(2)
mseal:Check seal flag for mmap(2)
selftest mm/mseal mprotect/munmap/mremap/mmap

arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 2 +
arch/ia64/kernel/syscalls/syscall.tbl | 1 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
fs/aio.c | 5 +-
include/linux/mm.h | 44 +-
include/linux/mm_types.h | 7 +
include/linux/syscalls.h | 2 +
include/uapi/asm-generic/unistd.h | 5 +-
include/uapi/linux/mman.h | 6 +
ipc/shm.c | 3 +-
kernel/sys_ni.c | 1 +
mm/Kconfig | 8 +
mm/Makefile | 1 +
mm/internal.h | 4 +-
mm/mmap.c | 57 +-
mm/mprotect.c | 15 +
mm/mremap.c | 30 +-
mm/mseal.c | 268 ++++
mm/nommu.c | 6 +-
mm/util.c | 8 +-
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/mseal_test.c | 1428 +++++++++++++++++++
37 files changed, 1891 insertions(+), 28 deletions(-)
create mode 100644 mm/mseal.c
create mode 100644 tools/testing/selftests/mm/mseal_test.c

--
2.42.0.655.g421f12c284-goog


2023-10-17 09:09:03

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 1/8] mseal: Add mseal(2) syscall.

From: Jeff Xu <[email protected]>

This patchset proposes a new mseal() syscall for the Linux kernel.

Modern CPUs support memory permissions such as RW and NX bits. Linux has
supported NX since the release of kernel version 2.6.8 in August 2004 [1].
The memory permission feature improves security stance on memory
corruption bugs, i.e. the attacker can’t just write to arbitrary memory
and point the code to it, the memory has to be marked with X bit, or
else an exception will happen.

Memory sealing additionally protects the mapping itself against
modifications. This is useful to mitigate memory corruption issues where
a corrupted pointer is passed to a memory management syscall. For example,
such an attacker primitive can break control-flow integrity guarantees
since read-only memory that is supposed to be trusted can become writable
or .text pages can get remapped. Memory sealing can automatically be
applied by the runtime loader to seal .text and .rodata pages and
applications can additionally seal security critical data at runtime.
A similar feature already exists in the XNU kernel with the
VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the mimmutable syscall [4].
Also, Chrome wants to adopt this feature for their CFI work [2] and this
patchset has been designed to be compatible with the Chrome use case.

The new mseal() is an architecture independent syscall, and with
following signature:

mseal(void addr, size_t len, unsigned long types, unsigned long flags)

addr/len: memory range. Must be continuous/allocated memory, or else
mseal() will fail and no VMA is updated. For details on acceptable
arguments, please refer to comments in mseal.c. Those are also fully
covered by the selftest.

types: bit mask to specify which syscall to seal.

Five syscalls can be sealed, as specified by bitmasks:
MM_SEAL_MPROTECT: Deny mprotect(2)/pkey_mprotect(2).
MM_SEAL_MUNMAP: Deny munmap(2).
MM_SEAL_MMAP: Deny mmap(2).
MM_SEAL_MREMAP: Deny mremap(2).
MM_SEAL_MSEAL: Deny adding a new seal type.

Each bit represents sealing for one specific syscall type, e.g.
MM_SEAL_MPROTECT will deny mprotect syscall. The consideration of bitmask
is that the API is extendable, i.e. when needed, the sealing can be
extended to madvise, mlock, etc. Backward compatibility is also easy.

The kernel will remember which seal types are applied, and the application
doesn’t need to repeat all existing seal types in the next mseal(). Once
a seal type is applied, it can’t be unsealed. Call mseal() on an existing
seal type is a no-action, not a failure.

MM_SEAL_MSEAL will deny mseal() calls that try to add a new seal type.

Internally, vm_area_struct adds a new field vm_seals, to store the bit
masks.

For the affected syscalls, such as mprotect, a check(can_modify_mm) for
sealing is added, this usually happens at the early point of the syscall,
before any update is made to VMAs. The effect of that is: if any of the
VMAs in the given address range fails the sealing check, none of the VMA
will be updated.

The idea that inspired this patch comes from Stephen Röttger’s work in
V8 CFI [5], Chrome browser in ChromeOS will be the first user of this API.

In addition, Stephen is working on glibc change to add sealing support
into the dynamic linker to seal all non-writable segments at startup. When
that work is completed, all applications can automatically benefit from
these new protections.

[1] https://kernelnewbies.org/Linux_2_6_8
[2] https://v8.dev/blog/control-flow-integrity
[3] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b9f69dbd3c8c3fd30a/osfmk/mach/vm_statistics.h#L274
[4] https://man.openbsd.org/mimmutable.2
[5] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXgeaRHo/edit#heading=h.bvaojj9fu6hc

Signed-off-by: Jeff Xu <[email protected]>
---
include/linux/mm.h | 11 ++
include/linux/mm_types.h | 7 ++
include/linux/syscalls.h | 2 +
include/uapi/linux/mman.h | 6 +
kernel/sys_ni.c | 1 +
mm/Kconfig | 8 ++
mm/Makefile | 1 +
mm/mmap.c | 14 +++
mm/mseal.c | 230 ++++++++++++++++++++++++++++++++++++++
9 files changed, 280 insertions(+)
create mode 100644 mm/mseal.c

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 53efddc4d178..b511932df033 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -30,6 +30,7 @@
#include <linux/kasan.h>
#include <linux/memremap.h>
#include <linux/slab.h>
+#include <uapi/linux/mman.h>

struct mempolicy;
struct anon_vma;
@@ -257,6 +258,16 @@ extern struct rw_semaphore nommu_region_sem;
extern unsigned int kobjsize(const void *objp);
#endif

+/*
+ * MM_SEAL_ALL is all supported flags in mseal().
+ */
+#define MM_SEAL_ALL ( \
+ MM_SEAL_MSEAL | \
+ MM_SEAL_MPROTECT | \
+ MM_SEAL_MUNMAP | \
+ MM_SEAL_MMAP | \
+ MM_SEAL_MREMAP)
+
/*
* vm_flags in vm_area_struct, see mm_types.h.
* When changing, update also include/trace/events/mmflags.h
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 36c5b43999e6..17d80f5a73dc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -660,6 +660,13 @@ struct vm_area_struct {
struct vma_numab_state *numab_state; /* NUMA Balancing state */
#endif
struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
+#ifdef CONFIG_MSEAL
+ /*
+ * bit masks for seal.
+ * need this since vm_flags is full.
+ */
+ unsigned long vm_seals; /* seal flags, see mm.h. */
+#endif
} __randomize_layout;

#ifdef CONFIG_SCHED_MM_CID
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index c0cb22cd607d..dbc8d4f76646 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -802,6 +802,8 @@ asmlinkage long sys_process_mrelease(int pidfd, unsigned int flags);
asmlinkage long sys_remap_file_pages(unsigned long start, unsigned long size,
unsigned long prot, unsigned long pgoff,
unsigned long flags);
+asmlinkage long sys_mseal(unsigned long start, size_t len, unsigned long types,
+ unsigned long flags);
asmlinkage long sys_mbind(unsigned long start, unsigned long len,
unsigned long mode,
const unsigned long __user *nmask,
diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h
index a246e11988d5..5ed4072cf4a6 100644
--- a/include/uapi/linux/mman.h
+++ b/include/uapi/linux/mman.h
@@ -55,4 +55,10 @@ struct cachestat {
__u64 nr_recently_evicted;
};

+#define MM_SEAL_MSEAL _BITUL(0)
+#define MM_SEAL_MPROTECT _BITUL(1)
+#define MM_SEAL_MUNMAP _BITUL(2)
+#define MM_SEAL_MMAP _BITUL(3)
+#define MM_SEAL_MREMAP _BITUL(4)
+
#endif /* _UAPI_LINUX_MMAN_H */
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 781de7cc6a4e..06fabf379e33 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -192,6 +192,7 @@ COND_SYSCALL(migrate_pages);
COND_SYSCALL(move_pages);
COND_SYSCALL(set_mempolicy_home_node);
COND_SYSCALL(cachestat);
+COND_SYSCALL(mseal);

COND_SYSCALL(perf_event_open);
COND_SYSCALL(accept4);
diff --git a/mm/Kconfig b/mm/Kconfig
index 264a2df5ecf5..db8a567cb4d3 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1258,6 +1258,14 @@ config LOCK_MM_AND_FIND_VMA
bool
depends on !STACK_GROWSUP

+config MSEAL
+ default n
+ bool "Enable mseal() system call"
+ depends on MMU
+ help
+ Enable the mseal() system call. Make memory areas's metadata immutable
+ by selected system calls, i.e. mprotect(), munmap(), mremap(), mmap().
+
source "mm/damon/Kconfig"

endmenu
diff --git a/mm/Makefile b/mm/Makefile
index ec65984e2ade..643d8518dac0 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -120,6 +120,7 @@ obj-$(CONFIG_PAGE_EXTENSION) += page_ext.o
obj-$(CONFIG_PAGE_TABLE_CHECK) += page_table_check.o
obj-$(CONFIG_CMA_DEBUGFS) += cma_debug.o
obj-$(CONFIG_SECRETMEM) += secretmem.o
+obj-$(CONFIG_MSEAL) += mseal.o
obj-$(CONFIG_CMA_SYSFS) += cma_sysfs.o
obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
obj-$(CONFIG_IDLE_PAGE_TRACKING) += page_idle.o
diff --git a/mm/mmap.c b/mm/mmap.c
index 514ced13c65c..414ac31aa9fa 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -730,6 +730,20 @@ static inline bool is_mergeable_vma(struct vm_area_struct *vma,
return false;
if (!anon_vma_name_eq(anon_vma_name(vma), anon_name))
return false;
+#ifdef CONFIG_MSEAL
+ /*
+ * If a VMA is sealed, it won't be merged with another VMA.
+ * This might be useful for diagnosis, i.e. the boundary used
+ * in the mseal() call will be preserved.
+ * There are chances of too many mseal() calls can create
+ * many segmentations. Considering mseal() usually comes
+ * with a careful memory layout design by the application,
+ * this might not be an issue in real world.
+ * Though, we could add merging support later if needed.
+ */
+ if (vma->vm_seals & MM_SEAL_ALL)
+ return 0;
+#endif
return true;
}

diff --git a/mm/mseal.c b/mm/mseal.c
new file mode 100644
index 000000000000..ffe4c4c3f1bc
--- /dev/null
+++ b/mm/mseal.c
@@ -0,0 +1,230 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Implement mseal() syscall.
+ *
+ * Copyright (c) 2023 Google, Inc.
+ *
+ * Author: Jeff Xu <[email protected]>
+ */
+
+#include <linux/mman.h>
+#include <linux/mm.h>
+#include <linux/syscalls.h>
+#include <linux/sched.h>
+#include "internal.h"
+
+static bool can_do_mseal(unsigned long types, unsigned long flags)
+{
+ /* check types is a valid bitmap */
+ if (types & ~MM_SEAL_ALL)
+ return false;
+
+ /* flags isn't used for now */
+ if (flags)
+ return false;
+
+ return true;
+}
+
+/*
+ * Check if a seal type can be added to VMA.
+ */
+static bool can_add_vma_seals(struct vm_area_struct *vma, unsigned long newSeals)
+{
+ /* When SEAL_MSEAL is set, reject if a new type of seal is added */
+ if ((vma->vm_seals & MM_SEAL_MSEAL) &&
+ (newSeals & ~(vma->vm_seals & MM_SEAL_ALL)))
+ return false;
+
+ return true;
+}
+
+static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
+ struct vm_area_struct **prev, unsigned long start,
+ unsigned long end, unsigned long addtypes)
+{
+ int ret = 0;
+
+ if (addtypes & ~(vma->vm_seals & MM_SEAL_ALL)) {
+ /*
+ * Handle split at start and end.
+ * Note: sealed VMA doesn't merge with other VMAs.
+ */
+ if (start != vma->vm_start) {
+ ret = split_vma(vmi, vma, start, 1);
+ if (ret)
+ goto out;
+ }
+
+ if (end != vma->vm_end) {
+ ret = split_vma(vmi, vma, end, 0);
+ if (ret)
+ goto out;
+ }
+
+ vma->vm_seals |= addtypes;
+ }
+
+out:
+ *prev = vma;
+ return ret;
+}
+
+/*
+ * Check for do_mseal:
+ * 1> start is part of a valid vma.
+ * 2> end is part of a valid vma.
+ * 3> No gap (unallocated address) between start and end.
+ * 4> requested seal type can be added in given address range.
+ */
+static int check_mm_seal(unsigned long start, unsigned long end,
+ unsigned long newtypes)
+{
+ struct vm_area_struct *vma;
+ unsigned long nstart = start;
+
+ VMA_ITERATOR(vmi, current->mm, start);
+
+ /* going through each vma to check */
+ for_each_vma_range(vmi, vma, end) {
+ if (vma->vm_start > nstart)
+ /* unallocated memory found */
+ return -ENOMEM;
+
+ if (!can_add_vma_seals(vma, newtypes))
+ return -EACCES;
+
+ if (vma->vm_end >= end)
+ return 0;
+
+ nstart = vma->vm_end;
+ }
+
+ return -ENOMEM;
+}
+
+/*
+ * Apply sealing.
+ */
+static int apply_mm_seal(unsigned long start, unsigned long end,
+ unsigned long newtypes)
+{
+ unsigned long nstart, nend;
+ struct vm_area_struct *vma, *prev = NULL;
+ struct vma_iterator vmi;
+ int error = 0;
+
+ vma_iter_init(&vmi, current->mm, start);
+ vma = vma_find(&vmi, end);
+
+ prev = vma_prev(&vmi);
+ if (start > vma->vm_start)
+ prev = vma;
+
+ nstart = start;
+
+ /* going through each vma to update */
+ for_each_vma_range(vmi, vma, end) {
+ nend = vma->vm_end;
+ if (nend > end)
+ nend = end;
+
+ error = mseal_fixup(&vmi, vma, &prev, nstart, nend, newtypes);
+ if (error)
+ break;
+
+ nstart = vma->vm_end;
+ }
+
+ return error;
+}
+
+/*
+ * mseal(2) seals the VM's meta data from
+ * selected syscalls.
+ *
+ * addr/len: VM address range.
+ *
+ * The address range by addr/len must meet:
+ * start (addr) must be in a valid VMA.
+ * end (addr + len) must be in a valid VMA.
+ * no gap (unallocated memory) between start and end.
+ * start (addr) must be page aligned.
+ *
+ * len: len will be page aligned implicitly.
+ *
+ * types: bit mask for sealed syscalls.
+ * MM_SEAL_MPROTECT: seal mprotect(2)/pkey_mprotect(2).
+ * MM_SEAL_MUNMAP: seal munmap(2).
+ * MM_SEAL_MMAP: seal mmap(2).
+ * MM_SEAL_MREMAP: seal mremap(2).
+ * MM_SEAL_MSEAL: adding new seal type will be rejected.
+ *
+ * flags: reserved.
+ *
+ * return values:
+ * zero: success
+ * -EINVAL:
+ * invalid seal type.
+ * invalid input flags.
+ * addr is not page aligned.
+ * addr + len overflow.
+ * -ENOMEM:
+ * addr is not a valid address (not allocated).
+ * end (addr + len) is not a valid address.
+ * a gap (unallocated memory) between start and end.
+ * -EACCES:
+ * MM_SEAL_MSEAL is set, adding a new seal is rejected.
+ *
+ * Note:
+ * user can call mseal(2) multiple times to add new seal types.
+ * adding an already added seal type is a no-action (no error).
+ * adding a new seal type after MM_SEAL_MSEAL will be rejected.
+ * unseal() or removing a seal type is not supported.
+ */
+static int do_mseal(unsigned long start, size_t len_in, unsigned long types,
+ unsigned long flags)
+{
+ int ret = 0;
+ unsigned long end;
+ struct mm_struct *mm = current->mm;
+ size_t len;
+
+ if (!can_do_mseal(types, flags))
+ return -EINVAL;
+
+ start = untagged_addr(start);
+ if (!PAGE_ALIGNED(start))
+ return -EINVAL;
+
+ len = PAGE_ALIGN(len_in);
+ /* Check to see whether len was rounded up from small -ve to zero */
+ if (len_in && !len)
+ return -EINVAL;
+
+ end = start + len;
+ if (end < start)
+ return -EINVAL;
+
+ if (end == start)
+ return 0;
+
+ if (mmap_write_lock_killable(mm))
+ return -EINTR;
+
+ ret = check_mm_seal(start, end, types);
+ if (ret)
+ goto out;
+
+ ret = apply_mm_seal(start, end, types);
+
+out:
+ mmap_write_unlock(current->mm);
+ return ret;
+}
+
+SYSCALL_DEFINE4(mseal, unsigned long, start, size_t, len, unsigned long, types, unsigned long,
+ flags)
+{
+ return do_mseal(start, len, types, flags);
+}
--
2.42.0.655.g421f12c284-goog

2023-10-17 09:09:07

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 3/8] mseal: add can_modify_mm and can_modify_vma

From: Jeff Xu <[email protected]>

can_modify_mm:
checks sealing flags for given memory range.

can_modify_vma:
checks sealing flags for given vma.

Signed-off-by: Jeff Xu <[email protected]>
---
include/linux/mm.h | 26 ++++++++++++++++++++++++++
mm/mseal.c | 42 ++++++++++++++++++++++++++++++++++++++++--
2 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b511932df033..b09df8501987 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3299,6 +3299,32 @@ static inline void mm_populate(unsigned long addr, unsigned long len)
static inline void mm_populate(unsigned long addr, unsigned long len) {}
#endif

+#ifdef CONFIG_MSEAL
+extern bool can_modify_mm(struct mm_struct *mm, unsigned long start,
+ unsigned long end, unsigned long checkSeals);
+
+extern bool can_modify_vma(struct vm_area_struct *vma,
+ unsigned long checkSeals);
+
+static inline unsigned long vma_seals(struct vm_area_struct *vma)
+{
+ return (vma->vm_seals & MM_SEAL_ALL);
+}
+
+#else
+static inline bool can_modify_mm(struct mm_struct *mm, unsigned long start,
+ unsigned long end, unsigned long checkSeals)
+{
+ return true;
+}
+
+static inline bool can_modify_vma(struct vm_area_struct *vma,
+ unsigned long checkSeals)
+{
+ return true;
+}
+#endif
+
/* These take the mm semaphore themselves */
extern int __must_check vm_brk(unsigned long, unsigned long);
extern int __must_check vm_brk_flags(unsigned long, unsigned long, unsigned long);
diff --git a/mm/mseal.c b/mm/mseal.c
index ffe4c4c3f1bc..3e9d1c732c38 100644
--- a/mm/mseal.c
+++ b/mm/mseal.c
@@ -26,6 +26,44 @@ static bool can_do_mseal(unsigned long types, unsigned long flags)
return true;
}

+/*
+ * check if a vma is sealed for modification.
+ * return true, if modification is allowed.
+ */
+bool can_modify_vma(struct vm_area_struct *vma,
+ unsigned long checkSeals)
+{
+ if (checkSeals & vma_seals(vma))
+ return false;
+
+ return true;
+}
+
+/*
+ * Check if the vmas of a memory range are allowed to be modified.
+ * the memory ranger can have a gap (unallocated memory).
+ * return true, if it is allowed.
+ */
+bool can_modify_mm(struct mm_struct *mm, unsigned long start, unsigned long end,
+ unsigned long checkSeals)
+{
+ struct vm_area_struct *vma;
+
+ VMA_ITERATOR(vmi, mm, start);
+
+ if (!checkSeals)
+ return true;
+
+ /* going through each vma to check */
+ for_each_vma_range(vmi, vma, end) {
+ if (!can_modify_vma(vma, checkSeals))
+ return false;
+ }
+
+ /* Allow by default. */
+ return true;
+}
+
/*
* Check if a seal type can be added to VMA.
*/
@@ -33,7 +71,7 @@ static bool can_add_vma_seals(struct vm_area_struct *vma, unsigned long newSeals
{
/* When SEAL_MSEAL is set, reject if a new type of seal is added */
if ((vma->vm_seals & MM_SEAL_MSEAL) &&
- (newSeals & ~(vma->vm_seals & MM_SEAL_ALL)))
+ (newSeals & ~(vma_seals(vma))))
return false;

return true;
@@ -45,7 +83,7 @@ static int mseal_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
{
int ret = 0;

- if (addtypes & ~(vma->vm_seals & MM_SEAL_ALL)) {
+ if (addtypes & ~(vma_seals(vma))) {
/*
* Handle split at start and end.
* Note: sealed VMA doesn't merge with other VMAs.
--
2.42.0.655.g421f12c284-goog

2023-10-17 09:09:12

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 2/8] mseal: Wire up mseal syscall

From: Jeff Xu <[email protected]>

Wire up mseal syscall for all architectures.

Signed-off-by: Jeff Xu <[email protected]>
---
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 2 ++
arch/ia64/kernel/syscalls/syscall.tbl | 1 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
include/uapi/asm-generic/unistd.h | 5 ++++-
19 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ad37569d0507..b5847d53102a 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -492,3 +492,4 @@
560 common set_mempolicy_home_node sys_ni_syscall
561 common cachestat sys_cachestat
562 common fchmodat2 sys_fchmodat2
+563 common mseal sys_mseal
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index c572d6c3dee0..b50c5ca5047d 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -466,3 +466,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index bd77253b62e0..6a28fb91b85d 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -39,7 +39,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)

-#define __NR_compat_syscalls 453
+#define __NR_compat_syscalls 454
#endif

#define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 78b68311ec81..1e9b3c098a8e 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -911,6 +911,8 @@ __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
__SYSCALL(__NR_cachestat, sys_cachestat)
#define __NR_fchmodat2 452
__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
+#define __NR_mseal 453
+__SYSCALL(__NR_mseal, sys_mseal)

/*
* Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 83d8609aec03..babe34d221ee 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -373,3 +373,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 259ceb125367..27cd3f7dbd5e 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -452,3 +452,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index a3798c2637fd..e49861f7c61f 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -458,3 +458,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 152034b8e0a0..78d15010cd77 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -391,3 +391,4 @@
450 n32 set_mempolicy_home_node sys_set_mempolicy_home_node
451 n32 cachestat sys_cachestat
452 n32 fchmodat2 sys_fchmodat2
+453 n32 mseal sys_mseal
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index cb5e757f6621..813614fedb72 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -367,3 +367,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 n64 cachestat sys_cachestat
452 n64 fchmodat2 sys_fchmodat2
+453 n64 mseal sys_mseal
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 1a646813afdc..01d88d3a6f3e 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -440,3 +440,4 @@
450 o32 set_mempolicy_home_node sys_set_mempolicy_home_node
451 o32 cachestat sys_cachestat
452 o32 fchmodat2 sys_fchmodat2
+453 o32 mseal sys_mseal
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index e97c175b56f9..d52d08f0a1ea 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -451,3 +451,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 20e50586e8a2..d38deba73a7b 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -539,3 +539,4 @@
450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 0122cc156952..cf3243c2978b 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -455,3 +455,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal sys_mseal
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index e90d585c4d3e..76f1cd33adaa 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -455,3 +455,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 4ed06c71c43f..d7728695d780 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -498,3 +498,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 2d0b1bd866ea..6d4cc386df22 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -457,3 +457,4 @@
450 i386 set_mempolicy_home_node sys_set_mempolicy_home_node
451 i386 cachestat sys_cachestat
452 i386 fchmodat2 sys_fchmodat2
+453 i386 mseal sys_mseal
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 814768249eae..73dcfc43d921 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -374,6 +374,7 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal

#
# Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index fc1a4f3c81d9..e8fd3bf35d73 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -423,3 +423,4 @@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common mseal sys_mseal
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index abe087c53b4b..0c945a798208 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -823,8 +823,11 @@ __SYSCALL(__NR_cachestat, sys_cachestat)
#define __NR_fchmodat2 452
__SYSCALL(__NR_fchmodat2, sys_fchmodat2)

+#define __NR_mseal 453
+__SYSCALL(__NR_mseal, sys_mseal)
+
#undef __NR_syscalls
-#define __NR_syscalls 453
+#define __NR_syscalls 454

/*
* 32 bit systems traditionally used different
--
2.42.0.655.g421f12c284-goog

2023-10-17 09:09:12

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 4/8] mseal: Check seal flag for mprotect(2)

From: Jeff Xu <[email protected]>

mprotect(2) changes protection of VMAs in the given address
range. Sealing will prevent unintended mprotect call.

What this patch does:
When a mprotect(2) is invoked, if one of its VMAs has MM_SEAL_MPROTECT
set from previous mseal(2) call, this mprotect(2) will fail, without
any VMA modified.

This patch is based on following:
1. do_mprotect_pkey() currently called in two places:
SYSCALL_DEFINE3(mprotect,...)
SYSCALL_DEFINE4(pkey_mprotect, ...)
and not in any other places, therefore omit changing the signature of
do_mprotect_pkey(), i.e. not passing checkSeals flag.

2. In do_mprotect_pkey(), calls can_modify_mm() before any
update is made on the VMAs.

Signed-off-by: Jeff Xu <[email protected]>
---
mm/mprotect.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 130db91d3a8c..6321c4d0aa3f 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -32,6 +32,7 @@
#include <linux/sched/sysctl.h>
#include <linux/userfaultfd_k.h>
#include <linux/memory-tiers.h>
+#include <uapi/linux/mman.h>
#include <asm/cacheflush.h>
#include <asm/mmu_context.h>
#include <asm/tlbflush.h>
@@ -753,6 +754,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
}
}

+ /*
+ * do_mprotect_pkey() currently called from two places:
+ * SYSCALL_DEFINE3(mprotect,...)
+ * SYSCALL_DEFINE4(pkey_mprotect, ...)
+ * and not from other places.
+ * Therefore, omit changing the signature of do_mprotect_pkey().
+ * Otherwise, we might need to add checkSeals and pass it
+ * from all callers of do_mprotect_pkey().
+ */
+ if (!can_modify_mm(current->mm, start, end, MM_SEAL_MPROTECT)) {
+ error = -EACCES;
+ goto out;
+ }
+
prev = vma_prev(&vmi);
if (start > vma->vm_start)
prev = vma;
--
2.42.0.655.g421f12c284-goog

2023-10-17 09:09:59

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 6/8] mseal: Check seal flag for mremap(2)

From: Jeff Xu <[email protected]>

mremap(2) can shrink/expand a VMA, or move a VMA to a fixed
address and overwriting or existing VMA. Sealing will
prevent unintended mremap(2) call.

What this patch does:
When a mremap(2) is invoked, if one of its VMAs has MM_SEAL_MREMAP
set from previous mseal(2) call, this mremap(2) will fail, without
any VMA modified.

This patch is based on following:
1. At syscall entry point: SYSCALL_DEFINE5(mremap,...)
There are two cases:
a. going into mremap_to().
b. not going into mremap_to().

2. For mremap_to() case.
Since mremap_to() is called only from SYSCALL_DEFINE5(mremap,..),
omit changing signature of mremap_to(), i.e. not passing
checkSeals flag.
In mremap_to(), it calls can_modify_mm() for src address and
dst address (when MREMAP_FIXED is used), before any update is
made to the VMAs.

3. For non mremap_to() case.
It is still part of SYSCALL_DEFINE5(mremap,...).
It calls can_modify_mm() to check sealing in the src address,
before any update is made to src VMAs.
Check for dest address is not needed, because dest memory is
allocated in current mremap(2) call.

Signed-off-by: Jeff Xu <[email protected]>
---
mm/mremap.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/mm/mremap.c b/mm/mremap.c
index ac363937f8c4..691fc32d37e4 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -836,7 +836,27 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
if ((mm->map_count + 2) >= sysctl_max_map_count - 3)
return -ENOMEM;

+ /*
+ * Check src address for sealing.
+ *
+ * Note: mremap_to() currently called from one place:
+ * SYSCALL_DEFINE4(pkey_mprotect, ...)
+ * and not in any other places.
+ * Therefore, omit changing the signature of mremap_to()
+ * Otherwise, we might need to add checkSeals and pass it
+ * from all callers of mremap_to().
+ */
+ if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_MREMAP))
+ return -EACCES;
+
if (flags & MREMAP_FIXED) {
+ /*
+ * Check dest address for sealing.
+ */
+ if (!can_modify_mm(mm, new_addr, new_addr + new_len,
+ MM_SEAL_MREMAP))
+ return -EACCES;
+
ret = do_munmap(mm, new_addr, new_len, uf_unmap_early);
if (ret)
goto out;
@@ -995,6 +1015,11 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
goto out;
}

+ if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_MREMAP)) {
+ ret = -EACCES;
+ goto out;
+ }
+
/*
* Always allow a shrinking remap: that just unmaps
* the unnecessary pages..
--
2.42.0.655.g421f12c284-goog

2023-10-17 09:09:59

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 7/8] mseal:Check seal flag for mmap(2)

From: Jeff Xu <[email protected]>

mmap(2) can change a protection of existing VMAs.
Sealing will prevent unintended mmap(2) call.

What this patch does:
When a mmap(2) is invoked, if one of its VMAs has MM_SEAL_MMAP set
from previous mseal(2) call, the mmap(2) will fail, without any
VMAs modified.

The patch is based on following:
There are two cases: with MMU, NO MMU.

For MMU case:
1. ksys_mmap_pgoff() currently are called in 2 places:
SYSCALL_DEFINE1(old_mmap, ...)
SYSCALL_DEFINE6(mmap_pgoff,...)
Since both are syscall entry point, omit adding
checkSeals in the signature of ksys_mmap_pgoff().

2. ksys_mmap_pgoff() calls vm_mmap_pgoff() with
checkSeals = MM_SEAL_MMAP, in turn, checkSeals flag is
passed into do_mmap(),
Note: Of all the call paths that goes into do_mmap(),
ksys_mmap_pgoff() is the only place where
checkSeals = MM_SEAL_MMAP. The rest has checkSeals = 0.

3. In do_mmap(), call can_modify_mm() before any update
is maded to the VMAs.

For NON-MMU case:
Set checkSeals = 0 for all cases.

Signed-off-by: Jeff Xu <[email protected]>
---
fs/aio.c | 5 +++--
include/linux/mm.h | 5 ++++-
ipc/shm.c | 3 ++-
mm/internal.h | 4 ++--
mm/mmap.c | 22 ++++++++++++++++++----
mm/nommu.c | 6 ++++--
mm/util.c | 8 +++++---
7 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index b3174da80ff6..7f4863d0082d 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -557,8 +557,9 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
}

ctx->mmap_base = do_mmap(ctx->aio_ring_file, 0, ctx->mmap_size,
- PROT_READ | PROT_WRITE,
- MAP_SHARED, 0, &unused, NULL);
+ PROT_READ | PROT_WRITE, MAP_SHARED, 0, &unused,
+ NULL, 0);
+
mmap_write_unlock(mm);
if (IS_ERR((void *)ctx->mmap_base)) {
ctx->mmap_size = 0;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index f2f316522f2a..9f496c9f2970 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3274,9 +3274,12 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
extern unsigned long mmap_region(struct file *file, unsigned long addr,
unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
struct list_head *uf);
+
extern unsigned long do_mmap(struct file *file, unsigned long addr,
unsigned long len, unsigned long prot, unsigned long flags,
- unsigned long pgoff, unsigned long *populate, struct list_head *uf);
+ unsigned long pgoff, unsigned long *populate, struct list_head *uf,
+ unsigned long checkSeals);
+
extern int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
unsigned long start, size_t len, struct list_head *uf,
bool unlock, unsigned long checkSeals);
diff --git a/ipc/shm.c b/ipc/shm.c
index 60e45e7045d4..3660f522ecba 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1662,7 +1662,8 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg,
goto invalid;
}

- addr = do_mmap(file, addr, size, prot, flags, 0, &populate, NULL);
+ addr = do_mmap(file, addr, size, prot, flags, 0, &populate, NULL,
+ 0);
*raddr = addr;
err = 0;
if (IS_ERR_VALUE(addr))
diff --git a/mm/internal.h b/mm/internal.h
index d1d4bf4e63c0..2c074d8c6abd 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -800,8 +800,8 @@ extern u64 hwpoison_filter_memcg;
extern u32 hwpoison_filter_enable;

extern unsigned long __must_check vm_mmap_pgoff(struct file *, unsigned long,
- unsigned long, unsigned long,
- unsigned long, unsigned long);
+ unsigned long, unsigned long, unsigned long, unsigned long,
+ unsigned long checkSeals);

extern void set_pageblock_order(void);
unsigned long reclaim_pages(struct list_head *folio_list);
diff --git a/mm/mmap.c b/mm/mmap.c
index 62d592f16f45..edcadd2bb394 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1197,7 +1197,8 @@ static inline bool file_mmap_ok(struct file *file, struct inode *inode,
unsigned long do_mmap(struct file *file, unsigned long addr,
unsigned long len, unsigned long prot,
unsigned long flags, unsigned long pgoff,
- unsigned long *populate, struct list_head *uf)
+ unsigned long *populate, struct list_head *uf,
+ unsigned long checkSeals)
{
struct mm_struct *mm = current->mm;
vm_flags_t vm_flags;
@@ -1365,6 +1366,9 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
vm_flags |= VM_NORESERVE;
}

+ if (!can_modify_mm(mm, addr, addr + len, MM_SEAL_MMAP))
+ return -EACCES;
+
addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
if (!IS_ERR_VALUE(addr) &&
((vm_flags & VM_LOCKED) ||
@@ -1411,7 +1415,17 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len,
return PTR_ERR(file);
}

- retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
+ /*
+ * vm_mmap_pgoff() currently called from two places:
+ * SYSCALL_DEFINE1(old_mmap, ...)
+ * SYSCALL_DEFINE6(mmap_pgoff,...)
+ * and not in any other places.
+ * Therefore, omit changing the signature of vm_mmap_pgoff()
+ * Otherwise, we might need to add checkSeals and pass it
+ * from all callers of vm_mmap_pgoff().
+ */
+ retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff,
+ MM_SEAL_MMAP);
out_fput:
if (file)
fput(file);
@@ -3016,8 +3030,8 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
flags |= MAP_LOCKED;

file = get_file(vma->vm_file);
- ret = do_mmap(vma->vm_file, start, size,
- prot, flags, pgoff, &populate, NULL);
+ ret = do_mmap(vma->vm_file, start, size, prot, flags, pgoff,
+ &populate, NULL, 0);
fput(file);
out:
mmap_write_unlock(mm);
diff --git a/mm/nommu.c b/mm/nommu.c
index 8dba41cfc44d..dc83651ee777 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1018,7 +1018,8 @@ unsigned long do_mmap(struct file *file,
unsigned long flags,
unsigned long pgoff,
unsigned long *populate,
- struct list_head *uf)
+ struct list_head *uf,
+ unsigned long checkSeals)
{
struct vm_area_struct *vma;
struct vm_region *region;
@@ -1262,7 +1263,8 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len,
goto out;
}

- retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
+ retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff,
+ 0);

if (file)
fput(file);
diff --git a/mm/util.c b/mm/util.c
index 4ed8b9b5273c..ca9d8c69267c 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -532,7 +532,8 @@ EXPORT_SYMBOL_GPL(account_locked_vm);

unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
unsigned long len, unsigned long prot,
- unsigned long flag, unsigned long pgoff)
+ unsigned long flag, unsigned long pgoff,
+ unsigned long checkseals)
{
unsigned long ret;
struct mm_struct *mm = current->mm;
@@ -544,7 +545,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
if (mmap_write_lock_killable(mm))
return -EINTR;
ret = do_mmap(file, addr, len, prot, flag, pgoff, &populate,
- &uf);
+ &uf, checkseals);
mmap_write_unlock(mm);
userfaultfd_unmap_complete(mm, &uf);
if (populate)
@@ -562,7 +563,8 @@ unsigned long vm_mmap(struct file *file, unsigned long addr,
if (unlikely(offset_in_page(offset)))
return -EINVAL;

- return vm_mmap_pgoff(file, addr, len, prot, flag, offset >> PAGE_SHIFT);
+ return vm_mmap_pgoff(file, addr, len, prot, flag, offset >> PAGE_SHIFT,
+ 0);
}
EXPORT_SYMBOL(vm_mmap);

--
2.42.0.655.g421f12c284-goog

2023-10-17 09:10:09

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 5/8] mseal: Check seal flag for munmap(2)

From: Jeff Xu <[email protected]>

munmap(2) unmap VMAs in the given address range.
Sealing will prevent unintended munmap(2) call.

What this patch does:
When a munmap(2) is invoked, if one of its VMAs has MM_SEAL_MUNMAP
set from previous mseal(2) call, this munmap(2) will fail,
without any VMA modified.

This patch is based on following:
1. At syscall entry point: SYSCALL_DEFINE2(munmap, ...)
Pass checkSeals = MM_SEAL_MUNMAP into __vm_munmap(),
in turn, to do_vmi_munmap().

Of all the call paths that call into do_vmi_munmap(),
this is the only place where checkSeals = MM_SEAL_MUNMAP.
The rest has checkSeals = 0.

2. In do_vmi_munmap(), calls can_modify_mm() before any
update is made to VMAs.

Signed-off-by: Jeff Xu <[email protected]>
---
include/linux/mm.h | 2 +-
mm/mmap.c | 21 +++++++++++++--------
mm/mremap.c | 5 +++--
3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b09df8501987..f2f316522f2a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3279,7 +3279,7 @@ extern unsigned long do_mmap(struct file *file, unsigned long addr,
unsigned long pgoff, unsigned long *populate, struct list_head *uf);
extern int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
unsigned long start, size_t len, struct list_head *uf,
- bool unlock);
+ bool unlock, unsigned long checkSeals);
extern int do_munmap(struct mm_struct *, unsigned long, size_t,
struct list_head *uf);
extern int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int behavior);
diff --git a/mm/mmap.c b/mm/mmap.c
index 414ac31aa9fa..62d592f16f45 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2601,6 +2601,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
* @len: The length of the range to munmap
* @uf: The userfaultfd list_head
* @unlock: set to true if the user wants to drop the mmap_lock on success
+ * @checkSeals: seal type to check.
*
* This function takes a @mas that is either pointing to the previous VMA or set
* to MA_START and sets it up to remove the mapping(s). The @len will be
@@ -2611,7 +2612,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
*/
int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
unsigned long start, size_t len, struct list_head *uf,
- bool unlock)
+ bool unlock, unsigned long checkSeals)
{
unsigned long end;
struct vm_area_struct *vma;
@@ -2623,6 +2624,9 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
if (end == start)
return -EINVAL;

+ if (!can_modify_mm(mm, start, end, checkSeals))
+ return -EACCES;
+
/* arch_unmap() might do unmaps itself. */
arch_unmap(mm, start, end);

@@ -2650,7 +2654,7 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
{
VMA_ITERATOR(vmi, mm, start);

- return do_vmi_munmap(&vmi, mm, start, len, uf, false);
+ return do_vmi_munmap(&vmi, mm, start, len, uf, false, 0);
}

unsigned long mmap_region(struct file *file, unsigned long addr,
@@ -2684,7 +2688,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
}

/* Unmap any existing mapping in the area */
- if (do_vmi_munmap(&vmi, mm, addr, len, uf, false))
+ if (do_vmi_munmap(&vmi, mm, addr, len, uf, false, 0))
return -ENOMEM;

/*
@@ -2909,7 +2913,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
return error;
}

-static int __vm_munmap(unsigned long start, size_t len, bool unlock)
+static int __vm_munmap(unsigned long start, size_t len, bool unlock,
+ unsigned long checkSeals)
{
int ret;
struct mm_struct *mm = current->mm;
@@ -2919,7 +2924,7 @@ static int __vm_munmap(unsigned long start, size_t len, bool unlock)
if (mmap_write_lock_killable(mm))
return -EINTR;

- ret = do_vmi_munmap(&vmi, mm, start, len, &uf, unlock);
+ ret = do_vmi_munmap(&vmi, mm, start, len, &uf, unlock, checkSeals);
if (ret || !unlock)
mmap_write_unlock(mm);

@@ -2929,14 +2934,14 @@ static int __vm_munmap(unsigned long start, size_t len, bool unlock)

int vm_munmap(unsigned long start, size_t len)
{
- return __vm_munmap(start, len, false);
+ return __vm_munmap(start, len, false, 0);
}
EXPORT_SYMBOL(vm_munmap);

SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len)
{
addr = untagged_addr(addr);
- return __vm_munmap(addr, len, true);
+ return __vm_munmap(addr, len, true, MM_SEAL_MUNMAP);
}


@@ -3168,7 +3173,7 @@ int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags)
if (ret)
goto limits_failed;

- ret = do_vmi_munmap(&vmi, mm, addr, len, &uf, 0);
+ ret = do_vmi_munmap(&vmi, mm, addr, len, &uf, 0, 0);
if (ret)
goto munmap_failed;

diff --git a/mm/mremap.c b/mm/mremap.c
index 056478c106ee..ac363937f8c4 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -715,7 +715,8 @@ static unsigned long move_vma(struct vm_area_struct *vma,
}

vma_iter_init(&vmi, mm, old_addr);
- if (!do_vmi_munmap(&vmi, mm, old_addr, old_len, uf_unmap, false)) {
+ if (!do_vmi_munmap(&vmi, mm, old_addr, old_len, uf_unmap, false,
+ 0)) {
/* OOM: unable to split vma, just get accounts right */
if (vm_flags & VM_ACCOUNT && !(flags & MREMAP_DONTUNMAP))
vm_acct_memory(old_len >> PAGE_SHIFT);
@@ -1009,7 +1010,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
}

ret = do_vmi_munmap(&vmi, mm, addr + new_len, old_len - new_len,
- &uf_unmap, true);
+ &uf_unmap, true, 0);
if (ret)
goto out;

--
2.42.0.655.g421f12c284-goog

2023-10-17 09:11:26

by Jeff Xu

[permalink] [raw]
Subject: [RFC PATCH v2 8/8] selftest mm/mseal mprotect/munmap/mremap/mmap

From: Jeff Xu <[email protected]>

selftest for sealing mprotect/munmap/mremap/mmap

Signed-off-by: Jeff Xu <[email protected]>
---
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/mseal_test.c | 1428 +++++++++++++++++++++++
2 files changed, 1429 insertions(+)
create mode 100644 tools/testing/selftests/mm/mseal_test.c

diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 6a9fc5693145..0c086cecc093 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -59,6 +59,7 @@ TEST_GEN_FILES += mlock2-tests
TEST_GEN_FILES += mrelease_test
TEST_GEN_FILES += mremap_dontunmap
TEST_GEN_FILES += mremap_test
+TEST_GEN_FILES += mseal_test
TEST_GEN_FILES += on-fault-limit
TEST_GEN_FILES += thuge-gen
TEST_GEN_FILES += transhuge-stress
diff --git a/tools/testing/selftests/mm/mseal_test.c b/tools/testing/selftests/mm/mseal_test.c
new file mode 100644
index 000000000000..d6ae09729394
--- /dev/null
+++ b/tools/testing/selftests/mm/mseal_test.c
@@ -0,0 +1,1428 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <sys/mman.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <stdbool.h>
+#include "../kselftest.h"
+#include <syscall.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <assert.h>
+
+#ifndef MM_SEAL_MSEAL
+#define MM_SEAL_MSEAL 0x1
+#endif
+
+#ifndef MM_SEAL_MPROTECT
+#define MM_SEAL_MPROTECT 0x2
+#endif
+
+#ifndef MM_SEAL_MUNMAP
+#define MM_SEAL_MUNMAP 0x4
+#endif
+
+#ifndef MM_SEAL_MMAP
+#define MM_SEAL_MMAP 0x8
+#endif
+
+#ifndef MM_SEAL_MREMAP
+#define MM_SEAL_MREMAP 0x10
+#endif
+
+#ifndef DEBUG
+#define LOG_TEST_ENTER() {}
+#else
+#define LOG_TEST_ENTER() { printf("%s\n", __func__); }
+#endif
+
+static int sys_mseal(void *start, size_t len, int types)
+{
+ int sret;
+
+ errno = 0;
+ sret = syscall(__NR_mseal, start, len, types, 0);
+ return sret;
+}
+
+int sys_mprotect(void *ptr, size_t size, unsigned long prot)
+{
+ int sret;
+
+ errno = 0;
+ sret = syscall(SYS_mprotect, ptr, size, prot);
+ return sret;
+}
+
+int sys_munmap(void *ptr, size_t size)
+{
+ int sret;
+
+ errno = 0;
+ sret = syscall(SYS_munmap, ptr, size);
+ return sret;
+}
+
+static int sys_madvise(void *start, size_t len, int types)
+{
+ int sret;
+
+ errno = 0;
+ sret = syscall(__NR_madvise, start, len, types);
+ return sret;
+}
+
+void *addr1 = (void *)0x50000000;
+void *addr2 = (void *)0x50004000;
+void *addr3 = (void *)0x50008000;
+void setup_single_address(int size, void **ptrOut)
+{
+ void *ptr;
+
+ ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+ assert(ptr != (void *)-1);
+ *ptrOut = ptr;
+}
+
+void setup_single_fixed_address(int size, void **ptrOut)
+{
+ void *ptr;
+
+ ptr = mmap(addr1, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+ assert(ptr == (void *)addr1);
+
+ *ptrOut = ptr;
+}
+
+void clean_single_address(void *ptr, int size)
+{
+ int ret;
+
+ ret = munmap(ptr, size);
+ assert(!ret);
+}
+
+void seal_mprotect_single_address(void *ptr, int size)
+{
+ int ret;
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
+ assert(!ret);
+}
+
+static void test_seal_addseals(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ /* adding seal one by one */
+ ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MMAP);
+ assert(!ret);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_addseals_combined(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ /* adding multiple seals */
+ ret = sys_mseal(ptr, size,
+ MM_SEAL_MPROTECT | MM_SEAL_MMAP | MM_SEAL_MREMAP |
+ MM_SEAL_MSEAL);
+ assert(!ret);
+
+ /* not adding more seal type, so ok. */
+ ret = sys_mseal(ptr, size,
+ MM_SEAL_MMAP | MM_SEAL_MREMAP | MM_SEAL_MSEAL);
+ assert(!ret);
+
+ /* not adding more seal type, so ok. */
+ ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_addseals_reject(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT | MM_SEAL_MSEAL);
+ assert(!ret);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ /* MM_SEAL_MSEAL is set, so not allow new seal type . */
+ ret = sys_mseal(ptr, size,
+ MM_SEAL_MPROTECT | MM_SEAL_MMAP | MM_SEAL_MSEAL);
+ assert(ret < 0);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MMAP);
+ assert(ret < 0);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MMAP | MM_SEAL_MSEAL);
+ assert(ret < 0);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_unmapped_start(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ // munmap 2 pages from ptr.
+ ret = sys_munmap(ptr, 2 * page_size);
+ assert(!ret);
+
+ // mprotect will fail because 2 pages from ptr are unmapped.
+ ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
+ assert(ret < 0);
+
+ // mseal will fail because 2 pages from ptr are unmapped.
+ ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
+ assert(ret < 0);
+
+ ret = sys_mseal(ptr + 2 * page_size, 2 * page_size, MM_SEAL_MSEAL);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_unmapped_middle(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ // munmap 2 pages from ptr + page.
+ ret = sys_munmap(ptr + page_size, 2 * page_size);
+ assert(!ret);
+
+ // mprotect will fail, since size is 4 pages.
+ ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
+ assert(ret < 0);
+
+ // mseal will fail as well.
+ ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
+ assert(ret < 0);
+
+ /* we still can add seal to the first page and last page*/
+ ret = sys_mseal(ptr, page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ ret = sys_mseal(ptr + 3 * page_size, page_size,
+ MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_unmapped_end(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ // unmap last 2 pages.
+ ret = sys_munmap(ptr + 2 * page_size, 2 * page_size);
+ assert(!ret);
+
+ //mprotect will fail since last 2 pages are unmapped.
+ ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
+ assert(ret < 0);
+
+ //mseal will fail as well.
+ ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
+ assert(ret < 0);
+
+ /* The first 2 pages is not sealed, and can add seals */
+ ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_multiple_vmas(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ // use mprotect to split the vma into 3.
+ ret = sys_mprotect(ptr + page_size, 2 * page_size,
+ PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ // mprotect will get applied to all 4 pages - 3 VMAs.
+ ret = sys_mprotect(ptr, size, PROT_READ);
+ assert(!ret);
+
+ // use mprotect to split the vma into 3.
+ ret = sys_mprotect(ptr + page_size, 2 * page_size,
+ PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ // mseal get applied to all 4 pages - 3 VMAs.
+ ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
+ assert(!ret);
+
+ // verify additional seal type will fail after MM_SEAL_MSEAL set.
+ ret = sys_mseal(ptr, page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(ret < 0);
+
+ ret = sys_mseal(ptr + page_size, 2 * page_size,
+ MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(ret < 0);
+
+ ret = sys_mseal(ptr + 3 * page_size, page_size,
+ MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(ret < 0);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_split_start(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ /* use mprotect to split at middle */
+ ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ /* seal the first page, this will split the VMA */
+ ret = sys_mseal(ptr, page_size, MM_SEAL_MSEAL);
+ assert(!ret);
+
+ /* can't add seal to the first page */
+ ret = sys_mseal(ptr, page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(ret < 0);
+
+ /* add seal to the remain 3 pages */
+ ret = sys_mseal(ptr + page_size, 3 * page_size,
+ MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_split_end(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_fixed_address(size, &ptr);
+
+ /* use mprotect to split at middle */
+ ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ /* seal the last page */
+ ret = sys_mseal(ptr + 3 * page_size, page_size, MM_SEAL_MSEAL);
+ assert(!ret);
+
+ /* adding seal to the last page is rejected. */
+ ret = sys_mseal(ptr + 3 * page_size, page_size,
+ MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(ret < 0);
+
+ /* Adding seals to the first 3 pages */
+ ret = sys_mseal(ptr, 3 * page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_invalid_input(void)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_fixed_address(size, &ptr);
+
+ /* invalid flag */
+ ret = sys_mseal(ptr, size, 0x20);
+ assert(ret < 0);
+
+ ret = sys_mseal(ptr, size, 0x31);
+ assert(ret < 0);
+
+ ret = sys_mseal(ptr, size, 0x3F);
+ assert(ret < 0);
+
+ /* unaligned address */
+ ret = sys_mseal(ptr + 1, 2 * page_size, MM_SEAL_MSEAL);
+ assert(ret < 0);
+
+ /* length too big */
+ ret = sys_mseal(ptr, 5 * page_size, MM_SEAL_MSEAL);
+ assert(ret < 0);
+
+ /* start is not in a valid VMA */
+ ret = sys_mseal(ptr - page_size, 5 * page_size, MM_SEAL_MSEAL);
+ assert(ret < 0);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_zero_length(void)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ ret = sys_mprotect(ptr, 0, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ /* seal 0 length will be OK, same as mprotect */
+ ret = sys_mseal(ptr, 0, MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ // verify the 4 pages are not sealed by previous call.
+ ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_twice(void)
+{
+ LOG_TEST_ENTER();
+ int ret;
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+
+ setup_single_address(size, &ptr);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ // apply the same seal will be OK. idempotent.
+ ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
+ assert(!ret);
+
+ ret = sys_mseal(ptr, size,
+ MM_SEAL_MPROTECT | MM_SEAL_MMAP | MM_SEAL_MREMAP |
+ MM_SEAL_MSEAL);
+ assert(!ret);
+
+ ret = sys_mseal(ptr, size,
+ MM_SEAL_MPROTECT | MM_SEAL_MMAP | MM_SEAL_MREMAP |
+ MM_SEAL_MSEAL);
+ assert(!ret);
+
+ ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ if (seal)
+ seal_mprotect_single_address(ptr, size);
+
+ ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_start_mprotect(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ if (seal)
+ seal_mprotect_single_address(ptr, page_size);
+
+ // the first page is sealed.
+ ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ // pages after the first page is not sealed.
+ ret = sys_mprotect(ptr + page_size, page_size * 3,
+ PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_end_mprotect(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ if (seal)
+ seal_mprotect_single_address(ptr + page_size, 3 * page_size);
+
+ /* first page is not sealed */
+ ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ /* last 3 page are sealed */
+ ret = sys_mprotect(ptr + page_size, page_size * 3,
+ PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect_unalign_len(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ if (seal)
+ seal_mprotect_single_address(ptr, page_size * 2 - 1);
+
+ // 2 pages are sealed.
+ ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ ret = sys_mprotect(ptr + page_size * 2, page_size,
+ PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect_unalign_len_variant_2(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+ if (seal)
+ seal_mprotect_single_address(ptr, page_size * 2 + 1);
+
+ // 3 pages are sealed.
+ ret = sys_mprotect(ptr, page_size * 3, PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ ret = sys_mprotect(ptr + page_size * 3, page_size,
+ PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect_two_vma(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ /* use mprotect to split */
+ ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ if (seal)
+ seal_mprotect_single_address(ptr, page_size * 4);
+
+ ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ ret = sys_mprotect(ptr + page_size * 2, page_size * 2,
+ PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect_two_vma_with_split(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ // use mprotect to split as two vma.
+ ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ // mseal can apply across 2 vma, also split them.
+ if (seal)
+ seal_mprotect_single_address(ptr + page_size, page_size * 2);
+
+ // the first page is not sealed.
+ ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ // the second page is sealed.
+ ret = sys_mprotect(ptr + page_size, page_size, PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ // the third page is sealed.
+ ret = sys_mprotect(ptr + 2 * page_size, page_size,
+ PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ // the fouth page is not sealed.
+ ret = sys_mprotect(ptr + 3 * page_size, page_size,
+ PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect_partial_mprotect(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ // seal one page.
+ if (seal)
+ seal_mprotect_single_address(ptr, page_size);
+
+ // mprotect first 2 page will fail, since the first page are sealed.
+ ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect_two_vma_with_gap(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ // use mprotect to split.
+ ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ // use mprotect to split.
+ ret = sys_mprotect(ptr + 3 * page_size, page_size,
+ PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ // use munmap to free two pages in the middle
+ ret = sys_munmap(ptr + page_size, 2 * page_size);
+ assert(!ret);
+
+ // mprotect will fail, because there is a gap in the address.
+ // notes, internally mprotect still updated the first page.
+ ret = sys_mprotect(ptr, 4 * page_size, PROT_READ);
+ assert(ret < 0);
+
+ // mseal will fail as well.
+ ret = sys_mseal(ptr, 4 * page_size, MM_SEAL_MPROTECT);
+ assert(ret < 0);
+
+ // unlike mprotect, the first page is not sealed.
+ ret = sys_mprotect(ptr, page_size, PROT_READ);
+ assert(ret == 0);
+
+ // the last page is not sealed.
+ ret = sys_mprotect(ptr + 3 * page_size, page_size, PROT_READ);
+ assert(ret == 0);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect_split(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ //use mprotect to split.
+ ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ //seal all 4 pages.
+ if (seal) {
+ ret = sys_mseal(ptr, 4 * page_size, MM_SEAL_MPROTECT);
+ assert(!ret);
+ }
+
+ //madvice is OK.
+ ret = sys_madvise(ptr, page_size * 2, MADV_WILLNEED);
+ assert(!ret);
+
+ //mprotect is sealed.
+ ret = sys_mprotect(ptr, 2 * page_size, PROT_READ);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+
+ ret = sys_mprotect(ptr + 2 * page_size, 2 * page_size, PROT_READ);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_mprotect_merge(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ // use mprotect to split one page.
+ ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ // seal first two pages.
+ if (seal) {
+ ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_MPROTECT);
+ assert(!ret);
+ }
+
+ ret = sys_madvise(ptr, page_size, MADV_WILLNEED);
+ assert(!ret);
+
+ // 2 pages are sealed.
+ ret = sys_mprotect(ptr, 2 * page_size, PROT_READ);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ // last 2 pages are not sealed.
+ ret = sys_mprotect(ptr + 2 * page_size, 2 * page_size, PROT_READ);
+ assert(ret == 0);
+
+ clean_single_address(ptr, size);
+}
+
+static void test_seal_munmap(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MUNMAP);
+ assert(!ret);
+ }
+
+ // 4 pages are sealed.
+ ret = sys_munmap(ptr, size);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+}
+
+/*
+ * allocate 4 pages,
+ * use mprotect to split it as two VMAs
+ * seal the whole range
+ * munmap will fail on both
+ */
+static void test_seal_munmap_two_vma(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ /* use mprotect to split */
+ ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
+ assert(!ret);
+
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MUNMAP);
+ assert(!ret);
+ }
+
+ ret = sys_munmap(ptr, page_size * 2);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+
+ ret = sys_munmap(ptr + page_size, page_size * 2);
+ if (seal)
+ assert(ret < 0);
+ else
+ assert(!ret);
+}
+
+/*
+ * allocate a VMA with 4 pages.
+ * munmap the middle 2 pages.
+ * seal the whole 4 pages, will fail.
+ * note: one of the pages are sealed
+ * munmap the first page will be OK.
+ * munmap the last page will be OK.
+ */
+static void test_seal_munmap_vma_with_gap(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ ret = sys_munmap(ptr + page_size, page_size * 2);
+ assert(!ret);
+
+ if (seal) {
+ // can't have gap in the middle.
+ ret = sys_mseal(ptr, size, MM_SEAL_MUNMAP);
+ assert(ret < 0);
+ }
+
+ ret = sys_munmap(ptr, page_size);
+ assert(!ret);
+
+ ret = sys_munmap(ptr + page_size * 2, page_size);
+ assert(!ret);
+
+ ret = sys_munmap(ptr, size);
+ assert(!ret);
+}
+
+static void test_munmap_start_freed(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+
+ // unmap the first page.
+ ret = sys_munmap(ptr, page_size);
+ assert(!ret);
+
+ // seal the last 3 pages.
+ if (seal) {
+ ret = sys_mseal(ptr + page_size, 3 * page_size, MM_SEAL_MUNMAP);
+ assert(!ret);
+ }
+
+ // unmap from the first page.
+ ret = sys_munmap(ptr, size);
+ if (seal) {
+ assert(ret < 0);
+
+ // use mprotect to verify page is not unmapped.
+ ret = sys_mprotect(ptr + page_size, 3 * page_size, PROT_READ);
+ assert(!ret);
+ } else
+ // note: this will be OK, even the first page is
+ // already unmapped.
+ assert(!ret);
+}
+
+static void test_munmap_end_freed(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+ // unmap last page.
+ ret = sys_munmap(ptr + page_size * 3, page_size);
+ assert(!ret);
+
+ // seal the first 3 pages.
+ if (seal) {
+ ret = sys_mseal(ptr, 3 * page_size, MM_SEAL_MUNMAP);
+ assert(!ret);
+ }
+
+ // unmap all pages.
+ ret = sys_munmap(ptr, size);
+ if (seal) {
+ assert(ret < 0);
+
+ // use mprotect to verify page is not unmapped.
+ ret = sys_mprotect(ptr, 3 * page_size, PROT_READ);
+ assert(!ret);
+ } else
+ assert(!ret);
+}
+
+static void test_munmap_middle_freed(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+
+ setup_single_address(size, &ptr);
+ // unmap 2 pages in the middle.
+ ret = sys_munmap(ptr + page_size, page_size * 2);
+ assert(!ret);
+
+ // seal the first page.
+ if (seal) {
+ ret = sys_mseal(ptr, page_size, MM_SEAL_MUNMAP);
+ assert(!ret);
+ }
+
+ // munmap all 4 pages.
+ ret = sys_munmap(ptr, size);
+ if (seal) {
+ assert(ret < 0);
+
+ // use mprotect to verify page is not unmapped.
+ ret = sys_mprotect(ptr, page_size, PROT_READ);
+ assert(!ret);
+ } else
+ assert(!ret);
+}
+
+void test_seal_mremap_shrink(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ // shrink from 4 pages to 2 pages.
+ ret2 = mremap(ptr, size, 2 * page_size, 0, 0);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else {
+ assert(ret2 != MAP_FAILED);
+ clean_single_address(ret2, 2 * page_size);
+ }
+ clean_single_address(ptr, size);
+}
+
+void test_seal_mremap_expand(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+ // ummap last 2 pages.
+ ret = sys_munmap(ptr + 2 * page_size, 2 * page_size);
+ assert(!ret);
+
+ if (seal) {
+ ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ // expand from 2 page to 4 pages.
+ ret2 = mremap(ptr, 2 * page_size, 4 * page_size, 0, 0);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else {
+ assert(ret2 == ptr);
+ clean_single_address(ret2, 4 * page_size);
+ }
+ clean_single_address(ptr, size);
+}
+
+void test_seal_mremap_move(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ // move from ptr to fixed address.
+ ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else {
+ assert(ret2 != MAP_FAILED);
+ clean_single_address(ret2, size);
+ }
+ clean_single_address(ptr, size);
+}
+
+void test_seal_mmap_overwrite_prot(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MMAP);
+ assert(!ret);
+ }
+
+ // use mmap to change protection.
+ ret2 = mmap(ptr, size, PROT_NONE,
+ MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else
+ assert(ret2 == ptr);
+
+ clean_single_address(ptr, size);
+}
+
+void test_seal_mremap_shrink_fixed(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ void *newAddr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+ setup_single_fixed_address(size, &newAddr);
+
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ // mremap to move and shrink to fixed address
+ ret2 = mremap(ptr, size, 2 * page_size, MREMAP_MAYMOVE | MREMAP_FIXED,
+ newAddr);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else
+ assert(ret2 == newAddr);
+
+ clean_single_address(ptr, size);
+ clean_single_address(newAddr, size);
+}
+
+void test_seal_mremap_expand_fixed(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ void *newAddr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(page_size, &ptr);
+ setup_single_fixed_address(size, &newAddr);
+
+ if (seal) {
+ ret = sys_mseal(newAddr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ // mremap to move and expand to fixed address
+ ret2 = mremap(ptr, page_size, size, MREMAP_MAYMOVE | MREMAP_FIXED,
+ newAddr);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else
+ assert(ret2 == newAddr);
+
+ clean_single_address(ptr, page_size);
+ clean_single_address(newAddr, size);
+}
+
+void test_seal_mremap_move_fixed(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ void *newAddr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+ setup_single_fixed_address(size, &newAddr);
+
+ if (seal) {
+ ret = sys_mseal(newAddr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ // mremap to move to fixed address
+ ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, newAddr);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else
+ assert(ret2 == newAddr);
+
+ clean_single_address(ptr, page_size);
+ clean_single_address(newAddr, size);
+}
+
+void test_seal_mremap_move_fixed_zero(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ void *newAddr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ /*
+ * MREMAP_FIXED can move the mapping to zero address
+ */
+ ret2 = mremap(ptr, size, 2 * page_size, MREMAP_MAYMOVE | MREMAP_FIXED,
+ 0);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else {
+ assert(ret2 == 0);
+ clean_single_address(ret2, 2 * page_size);
+ }
+ clean_single_address(ptr, size);
+}
+
+void test_seal_mremap_move_dontunmap(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ void *newAddr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ // mremap to move, and don't unmap src addr.
+ ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP, 0);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else {
+ assert(ret2 != MAP_FAILED);
+ clean_single_address(ret2, size);
+ }
+
+ clean_single_address(ptr, page_size);
+}
+
+void test_seal_mremap_move_dontunmap_anyaddr(bool seal)
+{
+ LOG_TEST_ENTER();
+ void *ptr;
+ void *newAddr;
+ unsigned long page_size = getpagesize();
+ unsigned long size = 4 * page_size;
+ int ret;
+ void *ret2;
+
+ setup_single_address(size, &ptr);
+
+ if (seal) {
+ ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
+ assert(!ret);
+ }
+
+ /*
+ * The 0xdeaddead should not have effect on dest addr
+ * when MREMAP_DONTUNMAP is set.
+ */
+ ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
+ 0xdeaddead);
+ if (seal) {
+ assert(ret2 == MAP_FAILED);
+ assert(errno == EACCES);
+ } else {
+ assert(ret2 != MAP_FAILED);
+ assert((long)ret2 != 0xdeaddead);
+ clean_single_address(ret2, size);
+ }
+
+ clean_single_address(ptr, page_size);
+}
+
+int main(int argc, char **argv)
+{
+ test_seal_invalid_input();
+ test_seal_addseals();
+ test_seal_addseals_combined();
+ test_seal_addseals_reject();
+ test_seal_unmapped_start();
+ test_seal_unmapped_middle();
+ test_seal_unmapped_end();
+ test_seal_multiple_vmas();
+ test_seal_split_start();
+ test_seal_split_end();
+
+ test_seal_zero_length();
+ test_seal_twice();
+
+ test_seal_mprotect(false);
+ test_seal_mprotect(true);
+
+ test_seal_start_mprotect(false);
+ test_seal_start_mprotect(true);
+
+ test_seal_end_mprotect(false);
+ test_seal_end_mprotect(true);
+
+ test_seal_mprotect_unalign_len(false);
+ test_seal_mprotect_unalign_len(true);
+
+ test_seal_mprotect_unalign_len_variant_2(false);
+ test_seal_mprotect_unalign_len_variant_2(true);
+
+ test_seal_mprotect_two_vma(false);
+ test_seal_mprotect_two_vma(true);
+
+ test_seal_mprotect_two_vma_with_split(false);
+ test_seal_mprotect_two_vma_with_split(true);
+
+ test_seal_mprotect_partial_mprotect(false);
+ test_seal_mprotect_partial_mprotect(true);
+
+ test_seal_mprotect_two_vma_with_gap(false);
+ test_seal_mprotect_two_vma_with_gap(true);
+
+ test_seal_mprotect_merge(false);
+ test_seal_mprotect_merge(true);
+
+ test_seal_mprotect_split(false);
+ test_seal_mprotect_split(true);
+
+ test_seal_munmap(false);
+ test_seal_munmap(true);
+ test_seal_munmap_two_vma(false);
+ test_seal_munmap_two_vma(true);
+ test_seal_munmap_vma_with_gap(false);
+ test_seal_munmap_vma_with_gap(true);
+
+ test_munmap_start_freed(false);
+ test_munmap_start_freed(true);
+ test_munmap_middle_freed(false);
+ test_munmap_middle_freed(true);
+ test_munmap_end_freed(false);
+ test_munmap_end_freed(true);
+
+ test_seal_mremap_shrink(false);
+ test_seal_mremap_shrink(true);
+ test_seal_mremap_expand(false);
+ test_seal_mremap_expand(true);
+ test_seal_mremap_move(false);
+ test_seal_mremap_move(true);
+
+ test_seal_mremap_shrink_fixed(false);
+ test_seal_mremap_shrink_fixed(true);
+ test_seal_mremap_expand_fixed(false);
+ test_seal_mremap_expand_fixed(true);
+ test_seal_mremap_move_fixed(false);
+ test_seal_mremap_move_fixed(true);
+ test_seal_mremap_move_dontunmap(false);
+ test_seal_mremap_move_dontunmap(true);
+ test_seal_mremap_move_fixed_zero(false);
+ test_seal_mremap_move_fixed_zero(true);
+ test_seal_mremap_move_dontunmap_anyaddr(false);
+ test_seal_mremap_move_dontunmap_anyaddr(true);
+
+ test_seal_mmap_overwrite_prot(false);
+ test_seal_mmap_overwrite_prot(true);
+
+ printf("OK\n");
+ return 0;
+}
--
2.42.0.655.g421f12c284-goog

2023-10-17 15:46:49

by Randy Dunlap

[permalink] [raw]
Subject: Re: [RFC PATCH v2 1/8] mseal: Add mseal(2) syscall.

nit:

On 10/17/23 02:08, [email protected] wrote:

| diff --git a/mm/Kconfig b/mm/Kconfig
| index 264a2df5ecf5..db8a567cb4d3 100644
| --- a/mm/Kconfig
| +++ b/mm/Kconfig
| @@ -1258,6 +1258,14 @@ config LOCK_MM_AND_FIND_VMA
| bool
| depends on !STACK_GROWSUP
|
| +config MSEAL
| + default n
| + bool "Enable mseal() system call"
| + depends on MMU
| + help
| + Enable the mseal() system call. Make memory areas's metadata immutable

areas'

$search_engine is your friend.

| + by selected system calls, i.e. mprotect(), munmap(), mremap(), mmap().


--
~Randy

2023-10-17 16:55:06

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH v2 5/8] mseal: Check seal flag for munmap(2)

On Tue, 17 Oct 2023 at 02:08, <[email protected]> wrote:
>
> Of all the call paths that call into do_vmi_munmap(),
> this is the only place where checkSeals = MM_SEAL_MUNMAP.
> The rest has checkSeals = 0.

Why?

None of this makes sense.

So you say "we can't munmap in this *one* place, but all others ignore
the sealing".

Crazy.

Linus

2023-10-17 17:05:42

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH v2 7/8] mseal:Check seal flag for mmap(2)

On Tue, 17 Oct 2023 at 02:08, <[email protected]> wrote:
>
> Note: Of all the call paths that goes into do_mmap(),
> ksys_mmap_pgoff() is the only place where
> checkSeals = MM_SEAL_MMAP. The rest has checkSeals = 0.

Again, this is all completely nonsensical.

First off, since seals only exist on existing vma's, the "MMAP seal"
makes no sense to begin with. You cannot mmap twice - and mmap'ing
over an existing mapping involves unmapping the old one.

So a "mmap seal" is nonsensical. What should protect a mapping is "you
cannot unmap this". That automatically means that you cannot mmap over
it.

In other words, all these sealing flag semantics are completely random
noise. None of this makes sense.

You need to EXPLAIN what the concept is.

Honestly, there is only two kinds of sealing that makes sense:

- you cannot change the permissions of some area

- you cannot unmap an area

where that first version might then have sub-cases (maybe you can make
permissions _stricter_, but not the other way)?

And dammit, once something is sealed, it is SEALED. None of this crazy
"one place honors the sealing, random other places do not".

I do *NOT* want to see another random patch series tomorrow that
modifies something small here.

I want to get an EXPLANATION and the whole "what the f*ck is the
concept". No more random rules. No more nonsensical code. No more of
this "one place honors seals, another one does not".

Seriously. As long as this is chock-full of these kinds of random
"this makes no sense", please don't send any patches AT ALL. Explain
the high-level rules first, and if you cannot explain why one random
place does something different from all the other random places, don't
even bother.

No more random code. No more random seals. None of this crazy "you
ostensibly can't unmap a vma, but you you can actually unmap it by
mmap'ing over it and then unmapping the new one".

Linus

2023-10-17 17:44:13

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH v2 7/8] mseal:Check seal flag for mmap(2)

On Tue, 17 Oct 2023 at 10:04, Linus Torvalds
<[email protected]> wrote:
>
> Honestly, there is only two kinds of sealing that makes sense:
>
> - you cannot change the permissions of some area
>
> - you cannot unmap an area

Actually, I guess at least theoretically, there could be three different things:

- you cannot move an area

although I do think that maybe just saying "you cannot unmap" might
also include "you cannot move".

But I guess it depends on whether you feel it's the virtual _address_
you are protecting, or whether it's the concept of mapping something.

I personally think that from a security perspective, what you want to
protect is a particular address. That implies that "seal from
unmapping" would thus also include "you can't move this area
elsewhere".

But at least conceptually, splitting "unmap" and "move" apart might
make some sense. I would like to hear a practical reason for it,
though.

Without that practical reason, I think the only two sane sealing operations are:

- SEAL_MUNMAP: "don't allow this mapping address to go away"

IOW no unmap, no shrinking, no moving mremap

- SEAL_MPROTECT: "don't allow any mapping permission changes"

Again, that permission case might end up being "don't allow
_additional_ permissions" and "don't allow taking permissions away".
Or it could be split by operation (ie "don't allow permission changes
to writability / readability / executability respectively").

I suspect there isn't a real-life example of splitting the
SEAL_MPROTECT (the same way I doubt there's a real-life example for
splitting the UNMAP into "unmap vs move"), so unless there is some
real reason, I'd keep the sealing minimal and to just those two flags.

We could always add more flags later, if there is a real use case
(IOW, if we start with "don't allow any permission changes", we could
add a flag later that just says "don't allow writability changes").

Linus

2023-10-18 07:02:35

by Jeff Xu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 7/8] mseal:Check seal flag for mmap(2)

Hi Linus,

On Tue, Oct 17, 2023 at 10:43 AM Linus Torvalds
<[email protected]> wrote:
>
> On Tue, 17 Oct 2023 at 10:04, Linus Torvalds
> <[email protected]> wrote:
> >
> > Honestly, there is only two kinds of sealing that makes sense:
> >
> > - you cannot change the permissions of some area
> >
> > - you cannot unmap an area
>
> Actually, I guess at least theoretically, there could be three different things:
>
> - you cannot move an area
>
Yes.

Actually, the newly added selftest covers some of the above:
1. can't change the permission of some areas.
test_seal_mprotect
test_seal_mmap_overwrite_prot

2. can't unmap an area (thus mmap() to the same address later)
test_seal_munmap

3. can't move to an area:
test_seal_mremap_move //can't move from a sealed area:
test_seal_mremap_move_fixed_zero //can't move from a sealed area to a
fixed address
test_seal_mremap_move_fixed //can't move to a sealed area.

4 can't expand or shrink the area:
test_seal_mremap_shrink
test_seal_mremap_expand

> although I do think that maybe just saying "you cannot unmap" might
> also include "you cannot move".
>
> But I guess it depends on whether you feel it's the virtual _address_
> you are protecting, or whether it's the concept of mapping something.
>
> I personally think that from a security perspective, what you want to
> protect is a particular address. That implies that "seal from
> unmapping" would thus also include "you can't move this area
> elsewhere".
>
> But at least conceptually, splitting "unmap" and "move" apart might
> make some sense. I would like to hear a practical reason for it,
> though.
>
> Without that practical reason, I think the only two sane sealing operations are:
>
> - SEAL_MUNMAP: "don't allow this mapping address to go away"
>
> IOW no unmap, no shrinking, no moving mremap
>
> - SEAL_MPROTECT: "don't allow any mapping permission changes"
>
I agree with the concept in general. The separation of two seal types
is easy to understand.

For mmap(MAP_FIXED), I know for a fact that it can modify permission of
an existing mapping, (as in selftest:test_seal_mmap_overwrite_prot).
I think it can also expand an existing VMA. This is not a problem, code-wise,
I mention it here, because it needs extra care when coding mmap() change.

> Again, that permission case might end up being "don't allow
> _additional_ permissions" and "don't allow taking permissions away".
> Or it could be split by operation (ie "don't allow permission changes
> to writability / readability / executability respectively").
>
Yes. If the application desires this, it can also be done.
i.e. seal of X bit, or seal of W bit, this will be similar to file sealing.
I discussed this with Stephan before, at this point of time, Chrome
doesn't have a use case.

> I suspect there isn't a real-life example of splitting the
> SEAL_MPROTECT (the same way I doubt there's a real-life example for
> splitting the UNMAP into "unmap vs move"), so unless there is some
> real reason, I'd keep the sealing minimal and to just those two flags.
>
I think two seal-type (permission and unmap/move/expand/shrink)
will work for the Chrome case. Stephen Röttger is an expert in Chrome,
on vacation/ be back soon. I will wait for Stephen to confirm.

> We could always add more flags later, if there is a real use case
> (IOW, if we start with "don't allow any permission changes", we could
> add a flag later that just says "don't allow writability changes").
>
Agreed 100%, thanks for understanding.

-Jeff


> Linus

2023-10-18 15:09:43

by Jeff Xu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 5/8] mseal: Check seal flag for munmap(2)

On Tue, Oct 17, 2023 at 9:54 AM Linus Torvalds
<[email protected]> wrote:
>
> On Tue, 17 Oct 2023 at 02:08, <[email protected]> wrote:
> >
> > Of all the call paths that call into do_vmi_munmap(),
> > this is the only place where checkSeals = MM_SEAL_MUNMAP.
> > The rest has checkSeals = 0.
>
> Why?
>
> None of this makes sense.
>
> So you say "we can't munmap in this *one* place, but all others ignore
> the sealing".
>
I apologize that previously, I described what this code does, and not reasoning.

In our threat model, as Stephen Röttger point out in [1], and I quote:

V8 exploits typically follow a similar pattern: an initial bug leads
to memory corruption but often the initial corruption is limited and
the attacker has to find a way to arbitrarily read/write in the whole
address space.

The memory correction is in the user space process, e.g. Chrome.
Attackers will try to modify permission of the memory, by calling
mprotect, or munmap then mmap to the same address but with different
permission, etc.

Sealing blocks mprotect/munmap/mremap/mmap call from the user space
process, e.g. Chrome.

At time of handling those 4 syscalls, we need to check the seal (
can_modify_mm), this requires locking the VMA (
mmap_write_lock_killable), and ideally, after validating the syscall
input. The reasonable place for can_modify_mm() is from utility
functions, such as do_mmap(), do_vmi_munmap(), etc.

However, there is no guarantee that do_mmap() and do_vmi_munmap() are
only reachable from mprotect/munmap/mremap/mmap syscall entry point
(SYSCALL_DEFINE_XX). In theory, the kernel can call those in other
scenarios, and some of them can be perfectly legit. Those other
scenarios are not covered by our threat model at this time. Therefore,
we need a flag, passed from the SYSCALL_DEFINE_XX entry , down to
can_modify_mm(), to differentiate those other scenarios.

Now, back to code, it did some optimization, i.e. doesn't pass the
flag from SYSCALL_DEFINE_XX in all cases. If SYSCALL_DEFINE_XX calls
do_a, and do_a has only one caller, I will set the flag in do_a,
instead of SYSCALL_DEFINE_XX. Doing this reduces the size of the
patchset, but it also makes the code less readable indeed. I could
remove this optimization in V3. I welcome suggestions to improve
readability on this.

When handing the mmap/munmap/mremap/mmap, once the code passed
can_modify_mm(), it means the memory area is not sealed, if the code
continues to call the other utility functions, we don't need to check
the seal again. This is the case for mremap(), the seal of src address
and dest address (when applicable) are checked first, later when the
code calls do_vmi_munmap(), it no longer needs to check the seal
again.

[1] https://v8.dev/blog/control-flow-integrity

-Jeff

2023-10-18 17:15:02

by Jeff Xu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 5/8] mseal: Check seal flag for munmap(2)

On Wed, Oct 18, 2023 at 8:08 AM Jeff Xu <[email protected]> wrote:
>
> On Tue, Oct 17, 2023 at 9:54 AM Linus Torvalds
> <[email protected]> wrote:
> >
> > On Tue, 17 Oct 2023 at 02:08, <[email protected]> wrote:
> > >
> > > Of all the call paths that call into do_vmi_munmap(),
> > > this is the only place where checkSeals = MM_SEAL_MUNMAP.
> > > The rest has checkSeals = 0.
> >
> > Why?
> >
> > None of this makes sense.
> >
> > So you say "we can't munmap in this *one* place, but all others ignore
> > the sealing".
> >
> I apologize that previously, I described what this code does, and not reasoning.
>
> In our threat model, as Stephen Röttger point out in [1], and I quote:
>
> V8 exploits typically follow a similar pattern: an initial bug leads
> to memory corruption but often the initial corruption is limited and
> the attacker has to find a way to arbitrarily read/write in the whole
> address space.
>
> The memory correction is in the user space process, e.g. Chrome.
> Attackers will try to modify permission of the memory, by calling
> mprotect, or munmap then mmap to the same address but with different
> permission, etc.
>
> Sealing blocks mprotect/munmap/mremap/mmap call from the user space
> process, e.g. Chrome.
>
> At time of handling those 4 syscalls, we need to check the seal (
> can_modify_mm), this requires locking the VMA (
> mmap_write_lock_killable), and ideally, after validating the syscall
> input. The reasonable place for can_modify_mm() is from utility
> functions, such as do_mmap(), do_vmi_munmap(), etc.
>
> However, there is no guarantee that do_mmap() and do_vmi_munmap() are
> only reachable from mprotect/munmap/mremap/mmap syscall entry point
> (SYSCALL_DEFINE_XX). In theory, the kernel can call those in other
> scenarios, and some of them can be perfectly legit. Those other
> scenarios are not covered by our threat model at this time. Therefore,
> we need a flag, passed from the SYSCALL_DEFINE_XX entry , down to
> can_modify_mm(), to differentiate those other scenarios.
>
> Now, back to code, it did some optimization, i.e. doesn't pass the
> flag from SYSCALL_DEFINE_XX in all cases. If SYSCALL_DEFINE_XX calls
> do_a, and do_a has only one caller, I will set the flag in do_a,
> instead of SYSCALL_DEFINE_XX. Doing this reduces the size of the
> patchset, but it also makes the code less readable indeed. I could
> remove this optimization in V3. I welcome suggestions to improve
> readability on this.
>
> When handing the mmap/munmap/mremap/mmap, once the code passed
> can_modify_mm(), it means the memory area is not sealed, if the code
> continues to call the other utility functions, we don't need to check
> the seal again. This is the case for mremap(), the seal of src address
> and dest address (when applicable) are checked first, later when the
> code calls do_vmi_munmap(), it no longer needs to check the seal
> again.
>
> [1] https://v8.dev/blog/control-flow-integrity
>
> -Jeff

There is also alternative approach:

For all the places that call do_vmi_munmap(), find out which
case should ignore the sealing flag legitimately, set an ignore_seal
flag and pass it down into do_vmi_munmap(). For the rest case,
use default behavior.

All future API will automatically be covered for sealing, by using default.

The risky side, if I missed a case that requires setting ignore_seal,
there will be a bug.

Also if a driver calls the utility functions to unmap a memory, the
seal will be checked as well. (Driver is not in our threat model,
but Chrome probably doesn't mind it.)

Which of those two approaches are better ? I appreciate the direction on this.

Thanks!
-Jeff


-Jeff

2023-10-18 18:28:10

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH v2 5/8] mseal: Check seal flag for munmap(2)

On Wed, 18 Oct 2023 at 10:14, Jeff Xu <[email protected]> wrote:
>
> There is also alternative approach:
>
> For all the places that call do_vmi_munmap(), find out which
> case should ignore the sealing flag legitimately,

NO.

Christ.

THERE ARE NO LEGITIMATE CASES OF IGNORING SEALING FLAGS.

If you ignore a sealing flag, it's not a sealing flag. It's random
crap, and claiming that it has *anything* to do with security is just
a cruel joke.

Really.

Stop this. I do not want to hear your excuses for garbage any more.
We're done. If I hear any more arguments for this sh*t, I will
literally put you in my ignore file, and will auto-NAK any future
patches.

This is simply not up for discussion. Any flag for "ignore sealing" is wrong.

We do have one special "unmap" case, namely "unmap_vmas()' called at
last mmput() -> __mmput() -> exit_mmap().

And yes, that is called at munmap() time too, but that's after the
point of no return after we've already removed the vma's from the VM
lists. So it's long after any error cases have been checked.

Linus

2023-10-18 19:08:28

by Jeff Xu

[permalink] [raw]
Subject: Re: [RFC PATCH v2 5/8] mseal: Check seal flag for munmap(2)

On Wed, Oct 18, 2023 at 11:27 AM Linus Torvalds
<[email protected]> wrote:
>
> On Wed, 18 Oct 2023 at 10:14, Jeff Xu <[email protected]> wrote:
> This is simply not up for discussion. Any flag for "ignore sealing" is wrong.
>
> We do have one special "unmap" case, namely "unmap_vmas()' called at
> last mmput() -> __mmput() -> exit_mmap().
>
> And yes, that is called at munmap() time too, but that's after the
> point of no return after we've already removed the vma's from the VM
> lists. So it's long after any error cases have been checked.
>
Ah. I see.
I didn't know there was no legit case, which is what I worried about before.
this flag can be removed.

2023-10-19 07:27:32

by Stephen Röttger

[permalink] [raw]
Subject: Re: [RFC PATCH v2 7/8] mseal:Check seal flag for mmap(2)

> Without that practical reason, I think the only two sane sealing operations are:
>
> - SEAL_MUNMAP: "don't allow this mapping address to go away"
>
> IOW no unmap, no shrinking, no moving mremap
>
> - SEAL_MPROTECT: "don't allow any mapping permission changes"
>
> Again, that permission case might end up being "don't allow
> _additional_ permissions" and "don't allow taking permissions away".
> Or it could be split by operation (ie "don't allow permission changes
> to writability / readability / executability respectively").
>
> I suspect there isn't a real-life example of splitting the
> SEAL_MPROTECT (the same way I doubt there's a real-life example for
> splitting the UNMAP into "unmap vs move"), so unless there is some
> real reason, I'd keep the sealing minimal and to just those two flags.

These two flags are exactly what we would use in Chrome. I can't think of a
use case for a more fine grained split either.


Attachments:
smime.p7s (3.91 kB)
S/MIME Cryptographic Signature

2023-10-19 09:19:44

by David Laight

[permalink] [raw]
Subject: RE: [RFC PATCH v2 0/8] Introduce mseal() syscall

From: [email protected]
> Sent: 17 October 2023 10:08
>
> This patchset proposes a new mseal() syscall for the Linux kernel.

I'm sure you can give it a better name, there isn't a 6 character
limit on identifiers!

FWIW you could also use mprotect(addr, len, IMMUTABLE);

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-10-20 13:57:56

by Muhammad Usama Anjum

[permalink] [raw]
Subject: Re: [RFC PATCH v2 6/8] mseal: Check seal flag for mremap(2)

On 10/17/23 2:08 PM, [email protected] wrote:
> From: Jeff Xu <[email protected]>
>
> mremap(2) can shrink/expand a VMA, or move a VMA to a fixed
> address and overwriting or existing VMA. Sealing will
> prevent unintended mremap(2) call.
>
> What this patch does:
> When a mremap(2) is invoked, if one of its VMAs has MM_SEAL_MREMAP
> set from previous mseal(2) call, this mremap(2) will fail, without
> any VMA modified.
>
> This patch is based on following:
> 1. At syscall entry point: SYSCALL_DEFINE5(mremap,...)
> There are two cases:
Maybe we can reduce the code duplication by bringing the check if memory is
sealed before call to mremap_to().

> a. going into mremap_to().
> b. not going into mremap_to().
>
> 2. For mremap_to() case.
> Since mremap_to() is called only from SYSCALL_DEFINE5(mremap,..),
> omit changing signature of mremap_to(), i.e. not passing
> checkSeals flag.
> In mremap_to(), it calls can_modify_mm() for src address and
> dst address (when MREMAP_FIXED is used), before any update is
> made to the VMAs.
>
> 3. For non mremap_to() case.
> It is still part of SYSCALL_DEFINE5(mremap,...).
> It calls can_modify_mm() to check sealing in the src address,
> before any update is made to src VMAs.
> Check for dest address is not needed, because dest memory is
> allocated in current mremap(2) call.
>
> Signed-off-by: Jeff Xu <[email protected]>
> ---
> mm/mremap.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/mm/mremap.c b/mm/mremap.c
> index ac363937f8c4..691fc32d37e4 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -836,7 +836,27 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
> if ((mm->map_count + 2) >= sysctl_max_map_count - 3)
> return -ENOMEM;
>
> + /*
> + * Check src address for sealing.
> + *
> + * Note: mremap_to() currently called from one place:
> + * SYSCALL_DEFINE4(pkey_mprotect, ...)
> + * and not in any other places.
> + * Therefore, omit changing the signature of mremap_to()
> + * Otherwise, we might need to add checkSeals and pass it
> + * from all callers of mremap_to().
> + */
> + if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_MREMAP))
> + return -EACCES;
> +
> if (flags & MREMAP_FIXED) {
> + /*
> + * Check dest address for sealing.
> + */
> + if (!can_modify_mm(mm, new_addr, new_addr + new_len,
> + MM_SEAL_MREMAP))
> + return -EACCES;
> +
Move these two checks to just before call to mremap_to() in sys_mremap() or
even earlier. Or even better move the first condition before mremap_to()
and second condition can be checked before call to mremap_to().

> ret = do_munmap(mm, new_addr, new_len, uf_unmap_early);
> if (ret)
> goto out;
> @@ -995,6 +1015,11 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
> goto out;
> }
>
> + if (!can_modify_mm(mm, addr, addr + old_len, MM_SEAL_MREMAP)) {
> + ret = -EACCES;
> + goto out;
> + }
> +
> /*
> * Always allow a shrinking remap: that just unmaps
> * the unnecessary pages..

--
BR,
Muhammad Usama Anjum

2023-10-20 14:25:51

by Muhammad Usama Anjum

[permalink] [raw]
Subject: Re: [RFC PATCH v2 8/8] selftest mm/mseal mprotect/munmap/mremap/mmap

On 10/17/23 2:08 PM, [email protected] wrote:
> From: Jeff Xu <[email protected]>
>
> selftest for sealing mprotect/munmap/mremap/mmap
>
> Signed-off-by: Jeff Xu <[email protected]>
> ---
> tools/testing/selftests/mm/Makefile | 1 +
> tools/testing/selftests/mm/mseal_test.c | 1428 +++++++++++++++++++++++
Please add the new MSEAL config to tools/testing/selftests/mm/config.
Please add the generated object in tools/testing/selftests/mm/.gitignore file.

> 2 files changed, 1429 insertions(+)
> create mode 100644 tools/testing/selftests/mm/mseal_test.c
>
> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> index 6a9fc5693145..0c086cecc093 100644
> --- a/tools/testing/selftests/mm/Makefile
> +++ b/tools/testing/selftests/mm/Makefile
> @@ -59,6 +59,7 @@ TEST_GEN_FILES += mlock2-tests
> TEST_GEN_FILES += mrelease_test
> TEST_GEN_FILES += mremap_dontunmap
> TEST_GEN_FILES += mremap_test
> +TEST_GEN_FILES += mseal_test
> TEST_GEN_FILES += on-fault-limit
> TEST_GEN_FILES += thuge-gen
> TEST_GEN_FILES += transhuge-stress
> diff --git a/tools/testing/selftests/mm/mseal_test.c b/tools/testing/selftests/mm/mseal_test.c
> new file mode 100644
> index 000000000000..d6ae09729394
> --- /dev/null
> +++ b/tools/testing/selftests/mm/mseal_test.c
> @@ -0,0 +1,1428 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#define _GNU_SOURCE
> +#include <sys/mman.h>
> +#include <stdint.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <sys/time.h>
> +#include <sys/resource.h>
> +#include <stdbool.h>
> +#include "../kselftest.h"
> +#include <syscall.h>
> +#include <errno.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <assert.h>
> +
> +#ifndef MM_SEAL_MSEAL
> +#define MM_SEAL_MSEAL 0x1
> +#endif
> +
> +#ifndef MM_SEAL_MPROTECT
> +#define MM_SEAL_MPROTECT 0x2
> +#endif
> +
> +#ifndef MM_SEAL_MUNMAP
> +#define MM_SEAL_MUNMAP 0x4
> +#endif
> +
> +#ifndef MM_SEAL_MMAP
> +#define MM_SEAL_MMAP 0x8
> +#endif
> +
> +#ifndef MM_SEAL_MREMAP
> +#define MM_SEAL_MREMAP 0x10
> +#endif
Please remove these. These macros would be picked up from the kernel
headers automatically.

> +
> +#ifndef DEBUG
> +#define LOG_TEST_ENTER() {}
> +#else
> +#define LOG_TEST_ENTER() { printf("%s\n", __func__); }
> +#endif
> +
> +static int sys_mseal(void *start, size_t len, int types)
> +{
> + int sret;
> +
> + errno = 0;
> + sret = syscall(__NR_mseal, start, len, types, 0);
> + return sret;
> +}
> +
> +int sys_mprotect(void *ptr, size_t size, unsigned long prot)
Why aren't you using the mprotect() wrapper already provided by glibC? Same
question for other syscalls. Maybe you are trying to avoid some glibc magic?

> +{
> + int sret;
> +
> + errno = 0;
> + sret = syscall(SYS_mprotect, ptr, size, prot);
> + return sret;
> +}
> +
> +int sys_munmap(void *ptr, size_t size)
> +{
> + int sret;
> +
> + errno = 0;
> + sret = syscall(SYS_munmap, ptr, size);
> + return sret;
> +}
> +
> +static int sys_madvise(void *start, size_t len, int types)
> +{
> + int sret;
> +
> + errno = 0;
> + sret = syscall(__NR_madvise, start, len, types);
> + return sret;
> +}
> +
> +void *addr1 = (void *)0x50000000;
> +void *addr2 = (void *)0x50004000;
> +void *addr3 = (void *)0x50008000;
> +void setup_single_address(int size, void **ptrOut)
> +{
> + void *ptr;
> +
> + ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> + assert(ptr != (void *)-1);
MAP_FAILED inlace of -1

> + *ptrOut = ptr;
> +}
> +
> +void setup_single_fixed_address(int size, void **ptrOut)
> +{
> + void *ptr;
> +
> + ptr = mmap(addr1, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> + assert(ptr == (void *)addr1);
> +
> + *ptrOut = ptr;
> +}
> +
> +void clean_single_address(void *ptr, int size)
> +{
> + int ret;
> +
> + ret = munmap(ptr, size);
> + assert(!ret);
> +}
> +
> +void seal_mprotect_single_address(void *ptr, int size)
> +{
> + int ret;
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
> + assert(!ret);
> +}
> +
> +static void test_seal_addseals(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + /* adding seal one by one */
> + ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MMAP);
> + assert(!ret);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_addseals_combined(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + /* adding multiple seals */
> + ret = sys_mseal(ptr, size,
> + MM_SEAL_MPROTECT | MM_SEAL_MMAP | MM_SEAL_MREMAP |
> + MM_SEAL_MSEAL);
> + assert(!ret);
The output should be conformed to the kselftets output format by using
ksft_exit_fail_msg() and ksft_test_result*() macros etc. Please find these
macros in kselftest.h.

> +
> + /* not adding more seal type, so ok. */
> + ret = sys_mseal(ptr, size,
> + MM_SEAL_MMAP | MM_SEAL_MREMAP | MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + /* not adding more seal type, so ok. */
> + ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_addseals_reject(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT | MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + /* MM_SEAL_MSEAL is set, so not allow new seal type . */
> + ret = sys_mseal(ptr, size,
> + MM_SEAL_MPROTECT | MM_SEAL_MMAP | MM_SEAL_MSEAL);
> + assert(ret < 0);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MMAP);
> + assert(ret < 0);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MMAP | MM_SEAL_MSEAL);
> + assert(ret < 0);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_unmapped_start(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + // munmap 2 pages from ptr.
Use consistent commenting style in the file.

> + ret = sys_munmap(ptr, 2 * page_size);
> + assert(!ret);
> +
> + // mprotect will fail because 2 pages from ptr are unmapped.
> + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
> + assert(ret < 0);
> +
> + // mseal will fail because 2 pages from ptr are unmapped.
> + ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
> + assert(ret < 0);
> +
> + ret = sys_mseal(ptr + 2 * page_size, 2 * page_size, MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_unmapped_middle(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + // munmap 2 pages from ptr + page.
> + ret = sys_munmap(ptr + page_size, 2 * page_size);
> + assert(!ret);
> +
> + // mprotect will fail, since size is 4 pages.
> + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
> + assert(ret < 0);
> +
> + // mseal will fail as well.
> + ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
> + assert(ret < 0);
> +
> + /* we still can add seal to the first page and last page*/
> + ret = sys_mseal(ptr, page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + ret = sys_mseal(ptr + 3 * page_size, page_size,
> + MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_unmapped_end(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + // unmap last 2 pages.
> + ret = sys_munmap(ptr + 2 * page_size, 2 * page_size);
> + assert(!ret);
> +
> + //mprotect will fail since last 2 pages are unmapped.
> + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
> + assert(ret < 0);
> +
> + //mseal will fail as well.
> + ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
> + assert(ret < 0);
> +
> + /* The first 2 pages is not sealed, and can add seals */
> + ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_multiple_vmas(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + // use mprotect to split the vma into 3.
> + ret = sys_mprotect(ptr + page_size, 2 * page_size,
> + PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + // mprotect will get applied to all 4 pages - 3 VMAs.
> + ret = sys_mprotect(ptr, size, PROT_READ);
> + assert(!ret);
> +
> + // use mprotect to split the vma into 3.
> + ret = sys_mprotect(ptr + page_size, 2 * page_size,
> + PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + // mseal get applied to all 4 pages - 3 VMAs.
> + ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + // verify additional seal type will fail after MM_SEAL_MSEAL set.
> + ret = sys_mseal(ptr, page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(ret < 0);
> +
> + ret = sys_mseal(ptr + page_size, 2 * page_size,
> + MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(ret < 0);
> +
> + ret = sys_mseal(ptr + 3 * page_size, page_size,
> + MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(ret < 0);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_split_start(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + /* use mprotect to split at middle */
> + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + /* seal the first page, this will split the VMA */
> + ret = sys_mseal(ptr, page_size, MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + /* can't add seal to the first page */
> + ret = sys_mseal(ptr, page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(ret < 0);
> +
> + /* add seal to the remain 3 pages */
> + ret = sys_mseal(ptr + page_size, 3 * page_size,
> + MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_split_end(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_fixed_address(size, &ptr);
> +
> + /* use mprotect to split at middle */
> + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + /* seal the last page */
> + ret = sys_mseal(ptr + 3 * page_size, page_size, MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + /* adding seal to the last page is rejected. */
> + ret = sys_mseal(ptr + 3 * page_size, page_size,
> + MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(ret < 0);
> +
> + /* Adding seals to the first 3 pages */
> + ret = sys_mseal(ptr, 3 * page_size, MM_SEAL_MSEAL | MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_invalid_input(void)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_fixed_address(size, &ptr);
> +
> + /* invalid flag */
> + ret = sys_mseal(ptr, size, 0x20);
> + assert(ret < 0);
> +
> + ret = sys_mseal(ptr, size, 0x31);
> + assert(ret < 0);
> +
> + ret = sys_mseal(ptr, size, 0x3F);
> + assert(ret < 0);
> +
> + /* unaligned address */
> + ret = sys_mseal(ptr + 1, 2 * page_size, MM_SEAL_MSEAL);
> + assert(ret < 0);
> +
> + /* length too big */
> + ret = sys_mseal(ptr, 5 * page_size, MM_SEAL_MSEAL);
> + assert(ret < 0);
> +
> + /* start is not in a valid VMA */
> + ret = sys_mseal(ptr - page_size, 5 * page_size, MM_SEAL_MSEAL);
> + assert(ret < 0);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_zero_length(void)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + ret = sys_mprotect(ptr, 0, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + /* seal 0 length will be OK, same as mprotect */
> + ret = sys_mseal(ptr, 0, MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + // verify the 4 pages are not sealed by previous call.
> + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_twice(void)
> +{
> + LOG_TEST_ENTER();
> + int ret;
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> +
> + setup_single_address(size, &ptr);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + // apply the same seal will be OK. idempotent.
> + ret = sys_mseal(ptr, size, MM_SEAL_MPROTECT);
> + assert(!ret);
> +
> + ret = sys_mseal(ptr, size,
> + MM_SEAL_MPROTECT | MM_SEAL_MMAP | MM_SEAL_MREMAP |
> + MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + ret = sys_mseal(ptr, size,
> + MM_SEAL_MPROTECT | MM_SEAL_MMAP | MM_SEAL_MREMAP |
> + MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + ret = sys_mseal(ptr, size, MM_SEAL_MSEAL);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal)
> + seal_mprotect_single_address(ptr, size);
> +
> + ret = sys_mprotect(ptr, size, PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_start_mprotect(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal)
> + seal_mprotect_single_address(ptr, page_size);
> +
> + // the first page is sealed.
> + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + // pages after the first page is not sealed.
> + ret = sys_mprotect(ptr + page_size, page_size * 3,
> + PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_end_mprotect(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal)
> + seal_mprotect_single_address(ptr + page_size, 3 * page_size);
> +
> + /* first page is not sealed */
> + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + /* last 3 page are sealed */
> + ret = sys_mprotect(ptr + page_size, page_size * 3,
> + PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect_unalign_len(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal)
> + seal_mprotect_single_address(ptr, page_size * 2 - 1);
> +
> + // 2 pages are sealed.
> + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + ret = sys_mprotect(ptr + page_size * 2, page_size,
> + PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect_unalign_len_variant_2(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> + if (seal)
> + seal_mprotect_single_address(ptr, page_size * 2 + 1);
> +
> + // 3 pages are sealed.
> + ret = sys_mprotect(ptr, page_size * 3, PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + ret = sys_mprotect(ptr + page_size * 3, page_size,
> + PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect_two_vma(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + /* use mprotect to split */
> + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + if (seal)
> + seal_mprotect_single_address(ptr, page_size * 4);
> +
> + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + ret = sys_mprotect(ptr + page_size * 2, page_size * 2,
> + PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect_two_vma_with_split(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + // use mprotect to split as two vma.
> + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + // mseal can apply across 2 vma, also split them.
> + if (seal)
> + seal_mprotect_single_address(ptr + page_size, page_size * 2);
> +
> + // the first page is not sealed.
> + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + // the second page is sealed.
> + ret = sys_mprotect(ptr + page_size, page_size, PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + // the third page is sealed.
> + ret = sys_mprotect(ptr + 2 * page_size, page_size,
> + PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + // the fouth page is not sealed.
> + ret = sys_mprotect(ptr + 3 * page_size, page_size,
> + PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect_partial_mprotect(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + // seal one page.
> + if (seal)
> + seal_mprotect_single_address(ptr, page_size);
> +
> + // mprotect first 2 page will fail, since the first page are sealed.
> + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ | PROT_WRITE);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect_two_vma_with_gap(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + // use mprotect to split.
> + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + // use mprotect to split.
> + ret = sys_mprotect(ptr + 3 * page_size, page_size,
> + PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + // use munmap to free two pages in the middle
> + ret = sys_munmap(ptr + page_size, 2 * page_size);
> + assert(!ret);
> +
> + // mprotect will fail, because there is a gap in the address.
> + // notes, internally mprotect still updated the first page.
> + ret = sys_mprotect(ptr, 4 * page_size, PROT_READ);
> + assert(ret < 0);
> +
> + // mseal will fail as well.
> + ret = sys_mseal(ptr, 4 * page_size, MM_SEAL_MPROTECT);
> + assert(ret < 0);
> +
> + // unlike mprotect, the first page is not sealed.
> + ret = sys_mprotect(ptr, page_size, PROT_READ);
> + assert(ret == 0);
> +
> + // the last page is not sealed.
> + ret = sys_mprotect(ptr + 3 * page_size, page_size, PROT_READ);
> + assert(ret == 0);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect_split(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + //use mprotect to split.
> + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + //seal all 4 pages.
> + if (seal) {
> + ret = sys_mseal(ptr, 4 * page_size, MM_SEAL_MPROTECT);
> + assert(!ret);
> + }
> +
> + //madvice is OK.
> + ret = sys_madvise(ptr, page_size * 2, MADV_WILLNEED);
> + assert(!ret);
> +
> + //mprotect is sealed.
> + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> +
> + ret = sys_mprotect(ptr + 2 * page_size, 2 * page_size, PROT_READ);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_mprotect_merge(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + // use mprotect to split one page.
> + ret = sys_mprotect(ptr, page_size, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + // seal first two pages.
> + if (seal) {
> + ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_MPROTECT);
> + assert(!ret);
> + }
> +
> + ret = sys_madvise(ptr, page_size, MADV_WILLNEED);
> + assert(!ret);
> +
> + // 2 pages are sealed.
> + ret = sys_mprotect(ptr, 2 * page_size, PROT_READ);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + // last 2 pages are not sealed.
> + ret = sys_mprotect(ptr + 2 * page_size, 2 * page_size, PROT_READ);
> + assert(ret == 0);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +static void test_seal_munmap(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MUNMAP);
> + assert(!ret);
> + }
> +
> + // 4 pages are sealed.
> + ret = sys_munmap(ptr, size);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +}
> +
> +/*
> + * allocate 4 pages,
> + * use mprotect to split it as two VMAs
> + * seal the whole range
> + * munmap will fail on both
> + */
> +static void test_seal_munmap_two_vma(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + /* use mprotect to split */
> + ret = sys_mprotect(ptr, page_size * 2, PROT_READ | PROT_WRITE);
> + assert(!ret);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MUNMAP);
> + assert(!ret);
> + }
> +
> + ret = sys_munmap(ptr, page_size * 2);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +
> + ret = sys_munmap(ptr + page_size, page_size * 2);
> + if (seal)
> + assert(ret < 0);
> + else
> + assert(!ret);
> +}
> +
> +/*
> + * allocate a VMA with 4 pages.
> + * munmap the middle 2 pages.
> + * seal the whole 4 pages, will fail.
> + * note: one of the pages are sealed
> + * munmap the first page will be OK.
> + * munmap the last page will be OK.
> + */
> +static void test_seal_munmap_vma_with_gap(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + ret = sys_munmap(ptr + page_size, page_size * 2);
> + assert(!ret);
> +
> + if (seal) {
> + // can't have gap in the middle.
> + ret = sys_mseal(ptr, size, MM_SEAL_MUNMAP);
> + assert(ret < 0);
> + }
> +
> + ret = sys_munmap(ptr, page_size);
> + assert(!ret);
> +
> + ret = sys_munmap(ptr + page_size * 2, page_size);
> + assert(!ret);
> +
> + ret = sys_munmap(ptr, size);
> + assert(!ret);
> +}
> +
> +static void test_munmap_start_freed(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> +
> + // unmap the first page.
> + ret = sys_munmap(ptr, page_size);
> + assert(!ret);
> +
> + // seal the last 3 pages.
> + if (seal) {
> + ret = sys_mseal(ptr + page_size, 3 * page_size, MM_SEAL_MUNMAP);
> + assert(!ret);
> + }
> +
> + // unmap from the first page.
> + ret = sys_munmap(ptr, size);
> + if (seal) {
> + assert(ret < 0);
> +
> + // use mprotect to verify page is not unmapped.
> + ret = sys_mprotect(ptr + page_size, 3 * page_size, PROT_READ);
> + assert(!ret);
> + } else
> + // note: this will be OK, even the first page is
> + // already unmapped.
> + assert(!ret);
> +}
> +
> +static void test_munmap_end_freed(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> + // unmap last page.
> + ret = sys_munmap(ptr + page_size * 3, page_size);
> + assert(!ret);
> +
> + // seal the first 3 pages.
> + if (seal) {
> + ret = sys_mseal(ptr, 3 * page_size, MM_SEAL_MUNMAP);
> + assert(!ret);
> + }
> +
> + // unmap all pages.
> + ret = sys_munmap(ptr, size);
> + if (seal) {
> + assert(ret < 0);
> +
> + // use mprotect to verify page is not unmapped.
> + ret = sys_mprotect(ptr, 3 * page_size, PROT_READ);
> + assert(!ret);
> + } else
> + assert(!ret);
> +}
> +
> +static void test_munmap_middle_freed(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> +
> + setup_single_address(size, &ptr);
> + // unmap 2 pages in the middle.
> + ret = sys_munmap(ptr + page_size, page_size * 2);
> + assert(!ret);
> +
> + // seal the first page.
> + if (seal) {
> + ret = sys_mseal(ptr, page_size, MM_SEAL_MUNMAP);
> + assert(!ret);
> + }
> +
> + // munmap all 4 pages.
> + ret = sys_munmap(ptr, size);
> + if (seal) {
> + assert(ret < 0);
> +
> + // use mprotect to verify page is not unmapped.
> + ret = sys_mprotect(ptr, page_size, PROT_READ);
> + assert(!ret);
> + } else
> + assert(!ret);
> +}
> +
> +void test_seal_mremap_shrink(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + // shrink from 4 pages to 2 pages.
> + ret2 = mremap(ptr, size, 2 * page_size, 0, 0);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else {
> + assert(ret2 != MAP_FAILED);
> + clean_single_address(ret2, 2 * page_size);
> + }
> + clean_single_address(ptr, size);
> +}
> +
> +void test_seal_mremap_expand(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> + // ummap last 2 pages.
> + ret = sys_munmap(ptr + 2 * page_size, 2 * page_size);
> + assert(!ret);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, 2 * page_size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + // expand from 2 page to 4 pages.
> + ret2 = mremap(ptr, 2 * page_size, 4 * page_size, 0, 0);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else {
> + assert(ret2 == ptr);
> + clean_single_address(ret2, 4 * page_size);
> + }
> + clean_single_address(ptr, size);
> +}
> +
> +void test_seal_mremap_move(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + // move from ptr to fixed address.
> + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, addr1);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else {
> + assert(ret2 != MAP_FAILED);
> + clean_single_address(ret2, size);
> + }
> + clean_single_address(ptr, size);
> +}
> +
> +void test_seal_mmap_overwrite_prot(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MMAP);
> + assert(!ret);
> + }
> +
> + // use mmap to change protection.
> + ret2 = mmap(ptr, size, PROT_NONE,
> + MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else
> + assert(ret2 == ptr);
> +
> + clean_single_address(ptr, size);
> +}
> +
> +void test_seal_mremap_shrink_fixed(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + void *newAddr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> + setup_single_fixed_address(size, &newAddr);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + // mremap to move and shrink to fixed address
> + ret2 = mremap(ptr, size, 2 * page_size, MREMAP_MAYMOVE | MREMAP_FIXED,
> + newAddr);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else
> + assert(ret2 == newAddr);
> +
> + clean_single_address(ptr, size);
> + clean_single_address(newAddr, size);
> +}
> +
> +void test_seal_mremap_expand_fixed(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + void *newAddr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(page_size, &ptr);
> + setup_single_fixed_address(size, &newAddr);
> +
> + if (seal) {
> + ret = sys_mseal(newAddr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + // mremap to move and expand to fixed address
> + ret2 = mremap(ptr, page_size, size, MREMAP_MAYMOVE | MREMAP_FIXED,
> + newAddr);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else
> + assert(ret2 == newAddr);
> +
> + clean_single_address(ptr, page_size);
> + clean_single_address(newAddr, size);
> +}
> +
> +void test_seal_mremap_move_fixed(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + void *newAddr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> + setup_single_fixed_address(size, &newAddr);
> +
> + if (seal) {
> + ret = sys_mseal(newAddr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + // mremap to move to fixed address
> + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_FIXED, newAddr);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else
> + assert(ret2 == newAddr);
> +
> + clean_single_address(ptr, page_size);
> + clean_single_address(newAddr, size);
> +}
> +
> +void test_seal_mremap_move_fixed_zero(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + void *newAddr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + /*
> + * MREMAP_FIXED can move the mapping to zero address
> + */
> + ret2 = mremap(ptr, size, 2 * page_size, MREMAP_MAYMOVE | MREMAP_FIXED,
> + 0);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else {
> + assert(ret2 == 0);
> + clean_single_address(ret2, 2 * page_size);
> + }
> + clean_single_address(ptr, size);
> +}
> +
> +void test_seal_mremap_move_dontunmap(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + void *newAddr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + // mremap to move, and don't unmap src addr.
> + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP, 0);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else {
> + assert(ret2 != MAP_FAILED);
> + clean_single_address(ret2, size);
> + }
> +
> + clean_single_address(ptr, page_size);
> +}
> +
> +void test_seal_mremap_move_dontunmap_anyaddr(bool seal)
> +{
> + LOG_TEST_ENTER();
> + void *ptr;
> + void *newAddr;
> + unsigned long page_size = getpagesize();
> + unsigned long size = 4 * page_size;
> + int ret;
> + void *ret2;
> +
> + setup_single_address(size, &ptr);
> +
> + if (seal) {
> + ret = sys_mseal(ptr, size, MM_SEAL_MREMAP);
> + assert(!ret);
> + }
> +
> + /*
> + * The 0xdeaddead should not have effect on dest addr
> + * when MREMAP_DONTUNMAP is set.
> + */
> + ret2 = mremap(ptr, size, size, MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
> + 0xdeaddead);
> + if (seal) {
> + assert(ret2 == MAP_FAILED);
> + assert(errno == EACCES);
> + } else {
> + assert(ret2 != MAP_FAILED);
> + assert((long)ret2 != 0xdeaddead);
> + clean_single_address(ret2, size);
> + }
> +
> + clean_single_address(ptr, page_size);
> +}
> +
> +int main(int argc, char **argv)
> +{
> + test_seal_invalid_input();
> + test_seal_addseals();
> + test_seal_addseals_combined();
> + test_seal_addseals_reject();
> + test_seal_unmapped_start();
> + test_seal_unmapped_middle();
> + test_seal_unmapped_end();
> + test_seal_multiple_vmas();
> + test_seal_split_start();
> + test_seal_split_end();
> +
> + test_seal_zero_length();
> + test_seal_twice();
> +
> + test_seal_mprotect(false);
> + test_seal_mprotect(true);
> +
> + test_seal_start_mprotect(false);
> + test_seal_start_mprotect(true);
> +
> + test_seal_end_mprotect(false);
> + test_seal_end_mprotect(true);
> +
> + test_seal_mprotect_unalign_len(false);
> + test_seal_mprotect_unalign_len(true);
> +
> + test_seal_mprotect_unalign_len_variant_2(false);
> + test_seal_mprotect_unalign_len_variant_2(true);
> +
> + test_seal_mprotect_two_vma(false);
> + test_seal_mprotect_two_vma(true);
> +
> + test_seal_mprotect_two_vma_with_split(false);
> + test_seal_mprotect_two_vma_with_split(true);
> +
> + test_seal_mprotect_partial_mprotect(false);
> + test_seal_mprotect_partial_mprotect(true);
> +
> + test_seal_mprotect_two_vma_with_gap(false);
> + test_seal_mprotect_two_vma_with_gap(true);
> +
> + test_seal_mprotect_merge(false);
> + test_seal_mprotect_merge(true);
> +
> + test_seal_mprotect_split(false);
> + test_seal_mprotect_split(true);
> +
> + test_seal_munmap(false);
> + test_seal_munmap(true);
> + test_seal_munmap_two_vma(false);
> + test_seal_munmap_two_vma(true);
> + test_seal_munmap_vma_with_gap(false);
> + test_seal_munmap_vma_with_gap(true);
> +
> + test_munmap_start_freed(false);
> + test_munmap_start_freed(true);
> + test_munmap_middle_freed(false);
> + test_munmap_middle_freed(true);
> + test_munmap_end_freed(false);
> + test_munmap_end_freed(true);
> +
> + test_seal_mremap_shrink(false);
> + test_seal_mremap_shrink(true);
> + test_seal_mremap_expand(false);
> + test_seal_mremap_expand(true);
> + test_seal_mremap_move(false);
> + test_seal_mremap_move(true);
> +
> + test_seal_mremap_shrink_fixed(false);
> + test_seal_mremap_shrink_fixed(true);
> + test_seal_mremap_expand_fixed(false);
> + test_seal_mremap_expand_fixed(true);
> + test_seal_mremap_move_fixed(false);
> + test_seal_mremap_move_fixed(true);
> + test_seal_mremap_move_dontunmap(false);
> + test_seal_mremap_move_dontunmap(true);
> + test_seal_mremap_move_fixed_zero(false);
> + test_seal_mremap_move_fixed_zero(true);
> + test_seal_mremap_move_dontunmap_anyaddr(false);
> + test_seal_mremap_move_dontunmap_anyaddr(true);
> +
> + test_seal_mmap_overwrite_prot(false);
> + test_seal_mmap_overwrite_prot(true);
> +
> + printf("OK\n");
> + return 0;
> +}

--
BR,
Muhammad Usama Anjum

2023-10-20 15:24:58

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH v2 8/8] selftest mm/mseal mprotect/munmap/mremap/mmap

On Fri, Oct 20, 2023 at 07:24:03PM +0500, Muhammad Usama Anjum wrote:

> Please remove these. These macros would be picked up from the kernel
> headers automatically.

As per the previous discussions, how does that work if you have O= build
directories?

I find this push to force people to do 'make headers' in order to use
simple selftests quite misguided. You're making it *harder* to use,
leading to less use.

2023-10-20 16:34:18

by Muhammad Usama Anjum

[permalink] [raw]
Subject: Re: [RFC PATCH v2 8/8] selftest mm/mseal mprotect/munmap/mremap/mmap

On 10/20/23 8:23 PM, Peter Zijlstra wrote:
> On Fri, Oct 20, 2023 at 07:24:03PM +0500, Muhammad Usama Anjum wrote:
>
>> Please remove these. These macros would be picked up from the kernel
>> headers automatically.
>
> As per the previous discussions, how does that work if you have O= build
> directories?
Then headers should be prepared in that O= directory first.
make headers O=abc && make -C tools/testing/selftests O=abc

>
> I find this push to force people to do 'make headers' in order to use
> simple selftests quite misguided. You're making it *harder* to use,
> leading to less use.
I'm just following what we have been doing over selftests mailing list to
fix build issues in different use cases and kselfest.rst. Let me share the
history:

Around 2 years ago, selftest Makefile used to prepare kernel headers from
source automatically and include them to build selftests. It had several
bugs. So they separated the header preparation from selftest build. After a
while people started getting build failures because they weren't building
headers which were previously built automatically. So someone had written a
patch (already in v6.6-rc6) to show informative error if headers aren't
present. So now selftests can't be built until headers are built.

The understanding here is that selftests come with kernel source and they
should be built using in-source kernel headers as people don't always have
updated headers. I think, if someone want to build just one selftest
without doing make headers, he should install kernel headers from source
before doing so instead of adding duplicate defines in the test itself. It
helps while development to not to keep the duplicate copy of these macros
in selftest as well.


--
BR,
Muhammad Usama Anjum