v2:
- Added R-bs
- Fix patch 1 issue on hugetlb swap entry [Mika]
This is a follow up of previous discussion here:
https://lore.kernel.org/r/20230324222707.GA3046@monkey
There, Mike correctly pointed out that the uffd-wp bit can also get lost when
Copy-On-Read triggers. Last time we didn't have a reproducer; I finally
wrote one and attached it as the last patch.
While at it, I decided to also add some more uffd-wp tests against fork(),
and I found more bugs. None of them were reported by anyone, probably
because none of us cares, but since they're still bugs and can be
reproduced by the unit test, I fixed them too in another patch.
Patches 1-2 are bug fixes, with stable copied.
The remaining patches 3-6 introduce unit tests to verify them (based on the
recent rework of the uffd unit tests). Note that not all the bugfixes in
patch 1 are verified (e.g. the changes to hugetlb hwpoison / migration
entries), but I assume they can be reviewed with careful eyes.
Thanks,
Peter Xu (6):
mm/hugetlb: Fix uffd-wp during fork()
mm/hugetlb: Fix uffd-wp bit lost when unsharing happens
selftests/mm: Add a few options for uffd-unit-test
selftests/mm: Extend and rename uffd pagemap test
selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS
selftests/mm: Add tests for RO pinning vs fork()
mm/hugetlb.c | 31 +-
tools/testing/selftests/mm/Makefile | 8 +-
tools/testing/selftests/mm/check_config.sh | 4 +-
tools/testing/selftests/mm/uffd-unit-tests.c | 318 +++++++++++++++++--
4 files changed, 314 insertions(+), 47 deletions(-)
--
2.39.1
A bunch of things were wrong:
- Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
rather than huge_pte_uffd_wp().
- When copying over a pte, we should drop uffd-wp bit when
!EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
- When doing early CoW for private hugetlb (e.g. when the parent page was
pinned), uffd-wp bit should be properly carried over if necessary.
No bug was reported, probably because most people do not even care about
these corner cases, but they are still bugs and can be exposed by the
recently introduced unit tests, so fix all of them in one shot.
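For readers who want to see the three rules above in isolation, here is a minimal userspace sketch. It is only a model: the bit layout and the toy_*() names are hypothetical and do not match the kernel's pte encoding; the real helpers are huge_pte_uffd_wp(), pte_swp_uffd_wp() and friends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy encoding; this is NOT the kernel's real pte layout. */
#define TOY_PRESENT     (UINT64_C(1) << 0)
#define TOY_WRITE       (UINT64_C(1) << 1)
#define TOY_UFFD_WP     (UINT64_C(1) << 2)	/* only valid for present ptes */
#define TOY_SWP_UFFD_WP (UINT64_C(1) << 3)	/* only valid for swap-style entries */

/* Rule 1: the accessor has to match the entry type (present vs swap). */
static inline bool toy_uffd_wp(uint64_t pte)
{
	if (pte & TOY_PRESENT)
		return (pte & TOY_UFFD_WP) != 0;
	return (pte & TOY_SWP_UFFD_WP) != 0;
}

/* Rule 2: the child keeps the bit only if its VMA is uffd-wp registered. */
static inline uint64_t toy_copy_pte(uint64_t src, bool dst_vma_uffd_wp)
{
	if (!dst_vma_uffd_wp)
		src &= (src & TOY_PRESENT) ? ~TOY_UFFD_WP : ~TOY_SWP_UFFD_WP;
	return src;
}

/* Rule 3: early CoW builds a fresh pte, so the bit is re-applied by hand. */
static inline uint64_t toy_early_cow(uint64_t old, bool dst_vma_uffd_wp)
{
	uint64_t newpte = TOY_PRESENT | TOY_WRITE;

	if (dst_vma_uffd_wp && toy_uffd_wp(old))
		newpte |= TOY_UFFD_WP;
	return newpte;
}

/* Small self-check of the model; aborts if any rule is violated. */
static inline void toy_selfcheck(void)
{
	uint64_t wp = TOY_PRESENT | TOY_UFFD_WP;

	assert(!toy_uffd_wp(toy_copy_pte(wp, false)));
	assert(toy_uffd_wp(toy_copy_pte(wp, true)));
	assert(toy_uffd_wp(toy_early_cow(wp, true)));
	assert(!toy_uffd_wp(toy_early_cow(wp, false)));
}
```

Calling toy_selfcheck() from any main() exercises all three rules. The point of the model is that a freshly built pte never carries the bit unless it is set explicitly.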
Cc: linux-stable <[email protected]>
Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
mm/hugetlb.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f16b25b1a6b9..0213efaf31be 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4953,11 +4953,15 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
static void
hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
- struct folio *new_folio)
+ struct folio *new_folio, pte_t old)
{
+ pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
+
__folio_mark_uptodate(new_folio);
hugepage_add_new_anon_rmap(new_folio, vma, addr);
- set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
+ if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
+ newpte = huge_pte_mkuffd_wp(newpte);
+ set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
folio_set_hugetlb_migratable(new_folio);
}
@@ -5032,14 +5036,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
*/
;
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
- bool uffd_wp = huge_pte_uffd_wp(entry);
-
- if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ if (!userfaultfd_wp(dst_vma))
entry = huge_pte_clear_uffd_wp(entry);
set_huge_pte_at(dst, addr, dst_pte, entry);
} else if (unlikely(is_hugetlb_entry_migration(entry))) {
swp_entry_t swp_entry = pte_to_swp_entry(entry);
- bool uffd_wp = huge_pte_uffd_wp(entry);
+ bool uffd_wp = pte_swp_uffd_wp(entry);
if (!is_readable_migration_entry(swp_entry) && cow) {
/*
@@ -5050,10 +5052,10 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
swp_offset(swp_entry));
entry = swp_entry_to_pte(swp_entry);
if (userfaultfd_wp(src_vma) && uffd_wp)
- entry = huge_pte_mkuffd_wp(entry);
+ entry = pte_swp_mkuffd_wp(entry);
set_huge_pte_at(src, addr, src_pte, entry);
}
- if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ if (!userfaultfd_wp(dst_vma))
entry = huge_pte_clear_uffd_wp(entry);
set_huge_pte_at(dst, addr, dst_pte, entry);
} else if (unlikely(is_pte_marker(entry))) {
@@ -5114,7 +5116,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
/* huge_ptep of dst_pte won't change as in child */
goto again;
}
- hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
+ hugetlb_install_folio(dst_vma, dst_pte, addr,
+ new_folio, src_pte_old);
spin_unlock(src_ptl);
spin_unlock(dst_ptl);
continue;
@@ -5132,6 +5135,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
entry = huge_pte_wrprotect(entry);
}
+ if (!userfaultfd_wp(dst_vma))
+ entry = huge_pte_clear_uffd_wp(entry);
+
set_huge_pte_at(dst, addr, dst_pte, entry);
hugetlb_count_add(npages, dst);
}
--
2.39.1
When we try to unshare a pinned page for a private hugetlb, uffd-wp bit can
get lost during unsharing. Fix it by carrying it over.
This should be very rare, only if an unsharing happened on a private
hugetlb page with uffd-wp protected (e.g. in a child which shares the same
page with parent with UFFD_FEATURE_EVENT_FORK enabled).
Cc: linux-stable <[email protected]>
Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection")
Reported-by: Mike Kravetz <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
mm/hugetlb.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0213efaf31be..cd3a9d8f4b70 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5637,13 +5637,16 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
spin_lock(ptl);
ptep = hugetlb_walk(vma, haddr, huge_page_size(h));
if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) {
+ pte_t newpte = make_huge_pte(vma, &new_folio->page, !unshare);
+
/* Break COW or unshare */
huge_ptep_clear_flush(vma, haddr, ptep);
mmu_notifier_invalidate_range(mm, range.start, range.end);
page_remove_rmap(old_page, vma, true);
hugepage_add_new_anon_rmap(new_folio, vma, haddr);
- set_huge_pte_at(mm, haddr, ptep,
- make_huge_pte(vma, &new_folio->page, !unshare));
+ if (huge_pte_uffd_wp(pte))
+ newpte = huge_pte_mkuffd_wp(newpte);
+ set_huge_pte_at(mm, haddr, ptep, newpte);
folio_set_hugetlb_migratable(new_folio);
/* Make the old page be freed below */
new_folio = page_folio(old_page);
--
2.39.1
Extend the uffd pagemap test to all types of memory, and add a parallel
test with EVENT_FORK enabled, where uffd-wp bits should persist rather than
be dropped.
While at it, rename the test to "wp-fork" to better show what it means,
and call the new test "wp-fork-with-event".
Before:
Testing pagemap on anon... done
After:
Testing wp-fork on anon... done
Testing wp-fork on shmem... done
Testing wp-fork on shmem-private... done
Testing wp-fork on hugetlb... done
Testing wp-fork on hugetlb-private... done
Testing wp-fork-with-event on anon... done
Testing wp-fork-with-event on shmem... done
Testing wp-fork-with-event on shmem-private... done
Testing wp-fork-with-event on hugetlb... done
Testing wp-fork-with-event on hugetlb-private... done
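What each of these sub-tests checks in the child boils down to one bit of the pagemap entry. A minimal sketch of that decision follows, assuming the documented pagemap layout where bit 57 reports uffd-wp; child_wp_verdict() is a hypothetical helper and is not part of the selftest.

```c
#include <assert.h>	/* for the checks shown in the usage note */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 57 of a /proc/<pid>/pagemap entry: pte is uffd-wp write-protected */
#define PM_UFFD_WP (UINT64_C(1) << 57)

/*
 * Hypothetical helper: returns NULL when the child's entry is as expected,
 * otherwise a short word describing what went wrong.
 *
 *   with_event == true:  the bit should persist across fork()
 *   with_event == false: the bit should be dropped in the child
 */
static inline const char *child_wp_verdict(uint64_t entry, bool with_event)
{
	bool wp = (entry & PM_UFFD_WP) != 0;

	if (wp == with_event)
		return NULL;
	return with_event ? "missing" : "stale";
}
```

For example, assert(child_wp_verdict(PM_UFFD_WP, true) == NULL) holds, while a set bit without EVENT_FORK yields "stale".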
Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 130 +++++++++++++++----
1 file changed, 106 insertions(+), 24 deletions(-)
diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index 452ca05a829d..739fc4d30342 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -227,25 +227,65 @@ static int pagemap_open(void)
err("pagemap uffd-wp bit error: 0x%"PRIx64, value); \
} while (0)
-static int pagemap_test_fork(bool present)
+typedef struct {
+ int parent_uffd, child_uffd;
+} fork_event_args;
+
+static void *fork_event_consumer(void *data)
{
- pid_t child = fork();
+ fork_event_args *args = data;
+ struct uffd_msg msg = { 0 };
+
+ /* Read until a full msg received */
+ while (uffd_read_msg(args->parent_uffd, &msg));
+
+ if (msg.event != UFFD_EVENT_FORK)
+ err("wrong message: %u\n", msg.event);
+
+ /* Just to be properly freed later */
+ args->child_uffd = msg.arg.fork.ufd;
+ return NULL;
+}
+
+static int pagemap_test_fork(int uffd, bool with_event)
+{
+ fork_event_args args = { .parent_uffd = uffd, .child_uffd = -1 };
+ pthread_t thread;
+ pid_t child;
uint64_t value;
int fd, result;
+ /* Prepare a thread to resolve EVENT_FORK */
+ if (with_event) {
+ if (pthread_create(&thread, NULL, fork_event_consumer, &args))
+ err("pthread_create()");
+ }
+
+ child = fork();
if (!child) {
/* Open the pagemap fd of the child itself */
fd = pagemap_open();
value = pagemap_get_entry(fd, area_dst);
/*
- * After fork() uffd-wp bit should be gone as long as we're
- * without UFFD_FEATURE_EVENT_FORK
+ * After fork(), we should handle uffd-wp bit differently:
+ *
+ * (1) when with EVENT_FORK, it should persist
+ * (2) when without EVENT_FORK, it should be dropped
*/
- pagemap_check_wp(value, false);
+ pagemap_check_wp(value, with_event);
/* Succeed */
exit(0);
}
waitpid(child, &result, 0);
+
+ if (with_event) {
+ if (pthread_join(thread, NULL))
+ err("pthread_join()");
+ if (args.child_uffd < 0)
+ err("Didn't receive child uffd");
+ close(args.child_uffd);
+ }
+
return result;
}
@@ -295,7 +335,8 @@ static void uffd_wp_unpopulated_test(uffd_test_args_t *args)
uffd_test_pass();
}
-static void uffd_pagemap_test(uffd_test_args_t *args)
+static void uffd_wp_fork_test_common(uffd_test_args_t *args,
+ bool with_event)
{
int pagemap_fd;
uint64_t value;
@@ -311,23 +352,42 @@ static void uffd_pagemap_test(uffd_test_args_t *args)
wp_range(uffd, (uint64_t)area_dst, page_size, true);
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, true);
- /* Make sure uffd-wp bit dropped when fork */
- if (pagemap_test_fork(true))
- err("Detected stall uffd-wp bit in child");
-
- /* Exclusive required or PAGEOUT won't work */
- if (!(value & PM_MMAP_EXCLUSIVE))
- err("multiple mapping detected: 0x%"PRIx64, value);
+ if (pagemap_test_fork(uffd, with_event)) {
+ uffd_test_fail("Detected %s uffd-wp bit in child in present pte",
+ with_event ? "missing" : "stall");
+ goto out;
+ }
- if (madvise(area_dst, page_size, MADV_PAGEOUT))
- err("madvise(MADV_PAGEOUT) failed");
+ /*
+ * This is an attempt for zapping the pgtable so as to test the
+ * markers.
+ *
+ * For private mappings, PAGEOUT will only work on exclusive ptes
+ * (PM_MMAP_EXCLUSIVE) which we should satisfy.
+ *
+ * For shared, PAGEOUT may not work. Use DONTNEED instead which
+ * plays a similar role of zapping (rather than freeing the page)
+ * to expose pte markers.
+ */
+ if (args->mem_type->shared) {
+ if (madvise(area_dst, page_size, MADV_DONTNEED))
+ err("MADV_DONTNEED");
+ } else {
+ /*
+ * NOTE: ignore retval because private-hugetlb doesn't yet
+ * support swapping, so it could fail.
+ */
+ madvise(area_dst, page_size, MADV_PAGEOUT);
+ }
/* Uffd-wp should persist even swapped out */
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, true);
- /* Make sure uffd-wp bit dropped when fork */
- if (pagemap_test_fork(false))
- err("Detected stall uffd-wp bit in child");
+ if (pagemap_test_fork(uffd, with_event)) {
+ uffd_test_fail("Detected %s uffd-wp bit in child in zapped pte",
+ with_event ? "missing" : "stall");
+ goto out;
+ }
/* Unprotect; this tests swap pte modifications */
wp_range(uffd, (uint64_t)area_dst, page_size, false);
@@ -338,9 +398,21 @@ static void uffd_pagemap_test(uffd_test_args_t *args)
*area_dst = 2;
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, false);
-
- close(pagemap_fd);
uffd_test_pass();
+out:
+ if (uffd_unregister(uffd, area_dst, nr_pages * page_size))
+ err("unregister failed");
+ close(pagemap_fd);
+}
+
+static void uffd_wp_fork_test(uffd_test_args_t *args)
+{
+ uffd_wp_fork_test_common(args, false);
+}
+
+static void uffd_wp_fork_with_event_test(uffd_test_args_t *args)
+{
+ uffd_wp_fork_test_common(args, true);
}
static void check_memory_contents(char *p)
@@ -836,10 +908,20 @@ uffd_test_case_t uffd_tests[] = {
.uffd_feature_required = 0,
},
{
- .name = "pagemap",
- .uffd_fn = uffd_pagemap_test,
- .mem_targets = MEM_ANON,
- .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP,
+ .name = "wp-fork",
+ .uffd_fn = uffd_wp_fork_test,
+ .mem_targets = MEM_ALL,
+ .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
+ },
+ {
+ .name = "wp-fork-with-event",
+ .uffd_fn = uffd_wp_fork_with_event_test,
+ .mem_targets = MEM_ALL,
+ .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM |
+ /* when set, child process should inherit uffd-wp bits */
+ UFFD_FEATURE_EVENT_FORK,
},
{
.name = "wp-unpopulated",
--
2.39.1
Namely:
"-f": add a wildcard filter for tests to run
"-l": list tests rather than running any
"-h": help msg
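Note that the "-f" filter is a plain substring match on the test name rather than a glob. A minimal sketch of the selection rule, where test_selected() is a hypothetical name used only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the filter rule: with no filter every test
 * runs; otherwise a test runs only if its name contains the filter string.
 */
static inline bool test_selected(const char *name, const char *filter)
{
	assert(name != NULL);
	return filter == NULL || strstr(name, filter) != NULL;
}
```

So "-f event" would select "wp-fork-with-event" but not "wp-fork", and "-f fork" would select both.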
Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 52 +++++++++++++++++---
1 file changed, 45 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index d871bf732e62..452ca05a829d 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -909,28 +909,65 @@ uffd_test_case_t uffd_tests[] = {
},
};
+static void usage(const char *prog)
+{
+ printf("usage: %s [-f TESTNAME]\n", prog);
+ puts("");
+ puts(" -f: test name to filter (e.g., event)");
+ puts(" -h: show the help msg");
+ puts(" -l: list tests only");
+ puts("");
+ exit(KSFT_FAIL);
+}
+
int main(int argc, char *argv[])
{
int n_tests = sizeof(uffd_tests) / sizeof(uffd_test_case_t);
int n_mems = sizeof(mem_types) / sizeof(mem_type_t);
+ const char *test_filter = NULL;
+ bool list_only = false;
uffd_test_case_t *test;
mem_type_t *mem_type;
uffd_test_args_t args;
char test_name[128];
const char *errmsg;
- int has_uffd;
+ int has_uffd, opt;
int i, j;
- has_uffd = test_uffd_api(false);
- has_uffd |= test_uffd_api(true);
+ while ((opt = getopt(argc, argv, "f:hl")) != -1) {
+ switch (opt) {
+ case 'f':
+ test_filter = optarg;
+ break;
+ case 'l':
+ list_only = true;
+ break;
+ case 'h':
+ default:
+ /* Unknown */
+ usage(argv[0]);
+ break;
+ }
+ }
+
+ if (!test_filter && !list_only) {
+ has_uffd = test_uffd_api(false);
+ has_uffd |= test_uffd_api(true);
- if (!has_uffd) {
- printf("Userfaultfd not supported or unprivileged, skip all tests\n");
- exit(KSFT_SKIP);
+ if (!has_uffd) {
+ printf("Userfaultfd not supported or unprivileged, skip all tests\n");
+ exit(KSFT_SKIP);
+ }
}
for (i = 0; i < n_tests; i++) {
test = &uffd_tests[i];
+ if (test_filter && !strstr(test->name, test_filter))
+ continue;
+ if (list_only) {
+ printf("%s\n", test->name);
+ continue;
+ }
for (j = 0; j < n_mems; j++) {
mem_type = &mem_types[j];
if (!(test->mem_targets & mem_type->mem_flag))
@@ -952,7 +989,8 @@ int main(int argc, char *argv[])
}
}
- uffd_test_report();
+ if (!list_only)
+ uffd_test_report();
return ksft_get_fail_cnt() ? KSFT_FAIL : KSFT_PASS;
}
--
2.39.1
The macro and facility can be reused in other tests too. Make it general.
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/mm/Makefile | 8 ++++----
tools/testing/selftests/mm/check_config.sh | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 63c03a6414fc..0ee00769b53f 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -161,8 +161,8 @@ warn_32bit_failure:
endif
endif
-# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
-$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
+# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
+$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)
$(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap
@@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh
EXTRA_CLEAN += local_config.mk local_config.h
-ifeq ($(COW_EXTRA_LIBS),)
+ifeq ($(IOURING_EXTRA_LIBS),)
all: warn_missing_liburing
warn_missing_liburing:
@echo ; \
- echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
+ echo "Warning: missing liburing support. Some tests will be skipped." ; \
echo
endif
diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
index bcba3af0acea..3954f4746161 100644
--- a/tools/testing/selftests/mm/check_config.sh
+++ b/tools/testing/selftests/mm/check_config.sh
@@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1
if [ -f $tmpfile_o ]; then
echo "#define LOCAL_CONFIG_HAVE_LIBURING 1" > $OUTPUT_H_FILE
- echo "COW_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
+ echo "IOURING_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
else
echo "// No liburing support found" > $OUTPUT_H_FILE
echo "# No liburing support found, so:" > $OUTPUT_MKFILE
- echo "COW_EXTRA_LIBS = " >> $OUTPUT_MKFILE
+ echo "IOURING_EXTRA_LIBS = " >> $OUTPUT_MKFILE
fi
rm ${tmpname}.*
--
2.39.1
Add a test suite (with 10 more sub-tests) to cover RO pinning against
fork() over uffd-wp. It covers both:
(1) Early CoW test in fork() when page pinned,
(2) page unshare due to RO longterm pin.
They are:
Testing wp-fork-pin on anon... done
Testing wp-fork-pin on shmem... done
Testing wp-fork-pin on shmem-private... done
Testing wp-fork-pin on hugetlb... done
Testing wp-fork-pin on hugetlb-private... done
Testing wp-fork-pin-with-event on anon... done
Testing wp-fork-pin-with-event on shmem... done
Testing wp-fork-pin-with-event on shmem-private... done
Testing wp-fork-pin-with-event on hugetlb... done
Testing wp-fork-pin-with-event on hugetlb-private... done
CONFIG_GUP_TEST is needed, or they'll be skipped:
Testing wp-fork-pin on anon... skipped [reason: Possibly CONFIG_GUP_TEST missing or unprivileged]
Note that the main target is private memory, but it doesn't hurt to also run
all of them over shared memory, because shared memory should work the same.
Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 144 ++++++++++++++++++-
1 file changed, 141 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index 739fc4d30342..269c86768a02 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -7,6 +7,8 @@
#include "uffd-common.h"
+#include "../../../../mm/gup_test.h"
+
#ifdef __NR_userfaultfd
/* The unit test doesn't need a large or random size, make it 32MB for now */
@@ -247,7 +249,53 @@ static void *fork_event_consumer(void *data)
return NULL;
}
-static int pagemap_test_fork(int uffd, bool with_event)
+typedef struct {
+ int gup_fd;
+ bool pinned;
+} pin_args;
+
+/*
+ * Returns 0 if succeed, <0 for errors. pin_pages() needs to be paired
+ * with unpin_pages(). Currently it needs to be RO longterm pin to satisfy
+ * all needs of the test cases (e.g., trigger unshare, trigger fork() early
+ * CoW, etc.).
+ */
+static int pin_pages(pin_args *args, void *buffer, size_t size)
+{
+ struct pin_longterm_test test = {
+ .addr = (uintptr_t)buffer,
+ .size = size,
+ /* Read-only pins */
+ .flags = 0,
+ };
+
+ if (args->pinned)
+ err("already pinned");
+
+ args->gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
+ if (args->gup_fd < 0)
+ return -errno;
+
+ if (ioctl(args->gup_fd, PIN_LONGTERM_TEST_START, &test)) {
+ /* Even if gup_test existed, can be an old gup_test / kernel */
+ close(args->gup_fd);
+ return -errno;
+ }
+ args->pinned = true;
+ return 0;
+}
+
+static void unpin_pages(pin_args *args)
+{
+ if (!args->pinned)
+ err("unpin without pin first");
+ if (ioctl(args->gup_fd, PIN_LONGTERM_TEST_STOP))
+ err("PIN_LONGTERM_TEST_STOP");
+ close(args->gup_fd);
+ args->pinned = false;
+}
+
+static int pagemap_test_fork(int uffd, bool with_event, bool test_pin)
{
fork_event_args args = { .parent_uffd = uffd, .child_uffd = -1 };
pthread_t thread;
@@ -264,7 +312,17 @@ static int pagemap_test_fork(int uffd, bool with_event)
child = fork();
if (!child) {
/* Open the pagemap fd of the child itself */
+ pin_args args = {};
+
fd = pagemap_open();
+
+ if (test_pin && pin_pages(&args, area_dst, page_size))
+ /*
+ * Normally when reach here we have pinned in
+ * previous tests, so shouldn't fail anymore
+ */
+ err("pin page failed in child");
+
value = pagemap_get_entry(fd, area_dst);
/*
* After fork(), we should handle uffd-wp bit differently:
@@ -273,6 +331,8 @@ static int pagemap_test_fork(int uffd, bool with_event)
* (2) when without EVENT_FORK, it should be dropped
*/
pagemap_check_wp(value, with_event);
+ if (test_pin)
+ unpin_pages(&args);
/* Succeed */
exit(0);
}
@@ -352,7 +412,7 @@ static void uffd_wp_fork_test_common(uffd_test_args_t *args,
wp_range(uffd, (uint64_t)area_dst, page_size, true);
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, true);
- if (pagemap_test_fork(uffd, with_event)) {
+ if (pagemap_test_fork(uffd, with_event, false)) {
uffd_test_fail("Detected %s uffd-wp bit in child in present pte",
with_event ? "missing" : "stall");
goto out;
@@ -383,7 +443,7 @@ static void uffd_wp_fork_test_common(uffd_test_args_t *args,
/* Uffd-wp should persist even swapped out */
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, true);
- if (pagemap_test_fork(uffd, with_event)) {
+ if (pagemap_test_fork(uffd, with_event, false)) {
uffd_test_fail("Detected %s uffd-wp bit in child in zapped pte",
with_event ? "missing" : "stall");
goto out;
@@ -415,6 +475,68 @@ static void uffd_wp_fork_with_event_test(uffd_test_args_t *args)
uffd_wp_fork_test_common(args, true);
}
+static void uffd_wp_fork_pin_test_common(uffd_test_args_t *args,
+ bool with_event)
+{
+ int pagemap_fd;
+ pin_args pin_args = {};
+
+ if (uffd_register(uffd, area_dst, page_size, false, true, false))
+ err("register failed");
+
+ pagemap_fd = pagemap_open();
+
+ /* Touch the page */
+ *area_dst = 1;
+ wp_range(uffd, (uint64_t)area_dst, page_size, true);
+
+ /*
+ * 1. First pin, then fork(). This tests fork() special path when
+ * doing early CoW if the page is private.
+ */
+ if (pin_pages(&pin_args, area_dst, page_size)) {
+ uffd_test_skip("Possibly CONFIG_GUP_TEST missing "
+ "or unprivileged");
+ close(pagemap_fd);
+ uffd_unregister(uffd, area_dst, page_size);
+ return;
+ }
+
+ if (pagemap_test_fork(uffd, with_event, false)) {
+ uffd_test_fail("Detected %s uffd-wp bit in early CoW of fork()",
+ with_event ? "missing" : "stall");
+ unpin_pages(&pin_args);
+ goto out;
+ }
+
+ unpin_pages(&pin_args);
+
+ /*
+ * 2. First fork(), then pin (in the child, where test_pin==true).
+ * This tests COR, aka, page unsharing on private memories.
+ */
+ if (pagemap_test_fork(uffd, with_event, true)) {
+ uffd_test_fail("Detected %s uffd-wp bit when RO pin",
+ with_event ? "missing" : "stall");
+ goto out;
+ }
+ uffd_test_pass();
+out:
+ if (uffd_unregister(uffd, area_dst, page_size))
+ err("register failed");
+ close(pagemap_fd);
+}
+
+static void uffd_wp_fork_pin_test(uffd_test_args_t *args)
+{
+ uffd_wp_fork_pin_test_common(args, false);
+}
+
+static void uffd_wp_fork_pin_with_event_test(uffd_test_args_t *args)
+{
+ uffd_wp_fork_pin_test_common(args, true);
+}
+
static void check_memory_contents(char *p)
{
unsigned long i, j;
@@ -923,6 +1045,22 @@ uffd_test_case_t uffd_tests[] = {
/* when set, child process should inherit uffd-wp bits */
UFFD_FEATURE_EVENT_FORK,
},
+ {
+ .name = "wp-fork-pin",
+ .uffd_fn = uffd_wp_fork_pin_test,
+ .mem_targets = MEM_ALL,
+ .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
+ },
+ {
+ .name = "wp-fork-pin-with-event",
+ .uffd_fn = uffd_wp_fork_pin_with_event_test,
+ .mem_targets = MEM_ALL,
+ .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM |
+ /* when set, child process should inherit uffd-wp bits */
+ UFFD_FEATURE_EVENT_FORK,
+ },
{
.name = "wp-unpopulated",
.uffd_fn = uffd_wp_unpopulated_test,
--
2.39.1
On Mon, 17 Apr 2023 15:53:13 -0400 Peter Xu <[email protected]> wrote:
> When we try to unshare a pinned page for a private hugetlb, uffd-wp bit can
> get lost during unsharing. Fix it by carrying it over.
>
> This should be very rare, only if an unsharing happened on a private
> hugetlb page with uffd-wp protected (e.g. in a child which shares the same
> page with parent with UFFD_FEATURE_EVENT_FORK enabled).
What are the user-visible consequences of the bug?
> Cc: linux-stable <[email protected]>
When proposing a backport, it's better to present the patch as a
standalone thing, against current -linus. I'll then queue it in
mm-hotfixes and shall send it upstream during this -rc cycle.
As presented, this patch won't go upstream until after 6.3 is released,
and as it comes later in time, more backporting effort might be needed.
I can rework things if this fix is reasonably urgent (the "user-visible
consequences" info is the guide). If not urgent, we can leave things
as they are.
Hi, Andrew,
On Mon, Apr 17, 2023 at 04:48:22PM -0700, Andrew Morton wrote:
> On Mon, 17 Apr 2023 15:53:13 -0400 Peter Xu <[email protected]> wrote:
>
> > When we try to unshare a pinned page for a private hugetlb, uffd-wp bit can
> > get lost during unsharing. Fix it by carrying it over.
> >
> > This should be very rare, only if an unsharing happened on a private
> > hugetlb page with uffd-wp protected (e.g. in a child which shares the same
> > page with parent with UFFD_FEATURE_EVENT_FORK enabled).
>
> What are the user-visible consequences of the bug?
When the above condition is met, one can lose the uffd-wp bit on the privately
mapped hugetlb page. It allows the page to be writable even if it should
still be wr-protected. I assume it can mean data loss.
However it's very hard to trigger. When I wrote the reproducer (provided in
the last patch) I needed to use the newest gup_test cmd introduced by David
to trigger it, because I don't even know another way to do a proper RO
longterm pin.
Besides that, it needs a bunch of other conditions all met:
(1) hugetlb being mapped privately,
(2) userfaultfd registered with WP and EVENT_FORK,
(3) the user app fork()s, then,
(4) RO longterm pin onto a wr-protected anonymous page.
If it's not impossible to hit in production, I'd say it's extremely rare.
>
> > Cc: linux-stable <[email protected]>
>
> When proposing a backport, it's better to present the patch as a
> standalone thing, against current -linus. I'll then queue it in
> mm-hotfixes and shall send it upstream during this -rc cycle.
>
> As presented, this patch won't go upstream until after 6.3 is released,
> and as it comes later in time, more backporting effort might be needed.
>
> I can rework things if this fix is reasonably urgent (the "user-visible
> consequences" info is the guide). If not urgent, we can leave things
> as they are.
IMHO it's not urgent, so it's suitable for mm-unstable (the current base of
this set; sorry if I forgot to mention it explicitly). I'll post (and
remember to post) patches on top of mm-stable if they're urgent, or e.g. for
bugs introduced in the current release.
I copied stable for the pure logic of fixing a bug in old kernels. The
consequence of hitting the bug is very bad, but the chance of hitting it is
very low.
Thanks,
--
Peter Xu
On 17.04.23 21:53, Peter Xu wrote:
> Namely:
>
> "-f": add a wildcard filter for tests to run
> "-l": list tests rather than running any
> "-h": help msg
>
Sounds helpful.
Reviewed-by: David Hildenbrand <[email protected]>
--
Thanks,
David / dhildenb