2023-04-13 23:12:50

by Peter Xu

Subject: [PATCH 0/6] mm/hugetlb: More fixes around uffd-wp vs fork() / RO pins

This is a follow-up to the previous discussion here:

https://lore.kernel.org/r/20230324222707.GA3046@monkey

There, Mike correctly pointed out that the uffd-wp bit can also get lost
when Copy-On-Read triggers. We didn't have a reproducer last time; I have
now written one and attached it as the last patch.

While at it, I also added some more uffd-wp tests against fork(), and I
found more bugs. Nobody reported any of them, probably because few people
care about these cases, but they are still bugs and the unit tests can
reproduce them, so I fixed them too in another patch.

Patches 1-2 are the bug fixes, with stable copied.

The remaining patches 3-6 introduce unit tests to verify the fixes (based
on the recent rework of the uffd unit tests). Note that not all the bug
fixes in patch 1 are verified (e.g. the changes to hugetlb hwpoison /
migration entries), but I assume those can be reviewed with careful eyes.

Thanks,

Peter Xu (6):
mm/hugetlb: Fix uffd-wp during fork()
mm/hugetlb: Fix uffd-wp bit lost when unsharing happens
selftests/mm: Add a few options for uffd-unit-test
selftests/mm: Extend and rename uffd pagemap test
selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS
selftests/mm: Add tests for RO pinning vs fork()

mm/hugetlb.c | 33 +-
tools/testing/selftests/mm/Makefile | 8 +-
tools/testing/selftests/mm/check_config.sh | 4 +-
tools/testing/selftests/mm/uffd-unit-tests.c | 318 +++++++++++++++++--
4 files changed, 315 insertions(+), 48 deletions(-)

--
2.39.1


2023-04-13 23:13:08

by Peter Xu

Subject: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()

A number of things were wrong:

- Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
rather than huge_pte_uffd_wp().

- When copying over a pte, we should drop uffd-wp bit when
!EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).

- When doing early CoW for private hugetlb (e.g. when the parent page was
pinned), uffd-wp bit should be properly carried over if necessary.

No bug has been reported, probably because most people do not care about
these corner cases, but they are still bugs and can be exposed by the
recently introduced unit tests, so fix all of them in one shot.
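
To illustrate the three rules above, here is a minimal userspace model. It
is only a sketch and not kernel code: struct toy_pte, the TOY_* bit
positions and all toy_*() helpers are hypothetical stand-ins for the real
pte encodings and accessors, which are arch-specific.

```c
#include <stdbool.h>

/* Hypothetical bit positions; real kernels use arch-specific encodings. */
#define TOY_PTE_UFFD_WP     (1u << 0)	/* uffd-wp bit of a present pte */
#define TOY_SWP_PTE_UFFD_WP (1u << 1)	/* uffd-wp bit of a swap-style pte */

struct toy_pte {
	bool is_swap;		/* migration/hwpoison-style entry, not present */
	unsigned int bits;
};

/* Rule 1: read the bit with the accessor matching the entry's encoding. */
bool toy_uffd_wp(struct toy_pte pte)
{
	unsigned int mask = pte.is_swap ? TOY_SWP_PTE_UFFD_WP : TOY_PTE_UFFD_WP;

	return (pte.bits & mask) != 0;
}

/* Rule 2: the child keeps the bit only if its VMA still has uffd-wp. */
struct toy_pte toy_copy_pte(struct toy_pte src, bool dst_vma_uffd_wp)
{
	struct toy_pte dst = src;

	if (!dst_vma_uffd_wp)
		dst.bits &= ~(TOY_PTE_UFFD_WP | TOY_SWP_PTE_UFFD_WP);
	return dst;
}

/* Rule 3: early CoW builds a fresh present pte but carries the bit over. */
struct toy_pte toy_early_cow(struct toy_pte old, bool dst_vma_uffd_wp)
{
	struct toy_pte newpte = { .is_swap = false, .bits = 0 };

	if (dst_vma_uffd_wp && toy_uffd_wp(old))
		newpte.bits |= TOY_PTE_UFFD_WP;
	return newpte;
}
```

In this model, reading a swap-style entry with the present-pte mask would
silently miss the bit, which is the kind of mistake the first item refers
to.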

Cc: linux-stable <[email protected]>
Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
Signed-off-by: Peter Xu <[email protected]>
---
mm/hugetlb.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f16b25b1a6b9..7320e64aacc6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4953,11 +4953,15 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)

static void
hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
- struct folio *new_folio)
+ struct folio *new_folio, pte_t old)
{
+ pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
+
__folio_mark_uptodate(new_folio);
hugepage_add_new_anon_rmap(new_folio, vma, addr);
- set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
+ if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
+ newpte = huge_pte_mkuffd_wp(newpte);
+ set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
folio_set_hugetlb_migratable(new_folio);
}
@@ -5032,14 +5036,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
*/
;
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
- bool uffd_wp = huge_pte_uffd_wp(entry);
-
- if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ if (!userfaultfd_wp(dst_vma))
entry = huge_pte_clear_uffd_wp(entry);
set_huge_pte_at(dst, addr, dst_pte, entry);
} else if (unlikely(is_hugetlb_entry_migration(entry))) {
swp_entry_t swp_entry = pte_to_swp_entry(entry);
- bool uffd_wp = huge_pte_uffd_wp(entry);

if (!is_readable_migration_entry(swp_entry) && cow) {
/*
@@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
swp_entry = make_readable_migration_entry(
swp_offset(swp_entry));
entry = swp_entry_to_pte(swp_entry);
- if (userfaultfd_wp(src_vma) && uffd_wp)
- entry = huge_pte_mkuffd_wp(entry);
+ if (userfaultfd_wp(src_vma) &&
+ pte_swp_uffd_wp(entry))
+ entry = pte_swp_mkuffd_wp(entry);
set_huge_pte_at(src, addr, src_pte, entry);
}
- if (!userfaultfd_wp(dst_vma) && uffd_wp)
+ if (!userfaultfd_wp(dst_vma))
entry = huge_pte_clear_uffd_wp(entry);
set_huge_pte_at(dst, addr, dst_pte, entry);
} else if (unlikely(is_pte_marker(entry))) {
@@ -5114,7 +5116,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
/* huge_ptep of dst_pte won't change as in child */
goto again;
}
- hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
+ hugetlb_install_folio(dst_vma, dst_pte, addr,
+ new_folio, src_pte_old);
spin_unlock(src_ptl);
spin_unlock(dst_ptl);
continue;
@@ -5132,6 +5135,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
entry = huge_pte_wrprotect(entry);
}

+ if (!userfaultfd_wp(dst_vma))
+ entry = huge_pte_clear_uffd_wp(entry);
+
set_huge_pte_at(dst, addr, dst_pte, entry);
hugetlb_count_add(npages, dst);
}
--
2.39.1

2023-04-13 23:13:21

by Peter Xu

Subject: [PATCH 3/6] selftests/mm: Add a few options for uffd-unit-test

Namely:

"-f": add a substring filter for tests to run
"-l": list tests rather than running any
"-h": help msg
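
As a sketch of how the "-f" filter selects tests: the patch below matches
with strstr(), so the filter is a plain substring match on the test name.
The helper name test_selected() is hypothetical and only used here for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * A NULL filter selects every test; otherwise a test is selected when
 * the filter string occurs anywhere in its name.
 */
bool test_selected(const char *name, const char *filter)
{
	return !filter || strstr(name, filter) != NULL;
}
```

With this rule, a filter of "event" would select any test whose name
contains that string and skip the others.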

Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 52 +++++++++++++++++---
1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index d871bf732e62..452ca05a829d 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -909,28 +909,65 @@ uffd_test_case_t uffd_tests[] = {
},
};

+static void usage(const char *prog)
+{
+ printf("usage: %s [-f TESTNAME]\n", prog);
+ puts("");
+ puts(" -f: test name to filter (e.g., event)");
+ puts(" -h: show the help msg");
+ puts(" -l: list tests only");
+ puts("");
+ exit(KSFT_FAIL);
+}
+
int main(int argc, char *argv[])
{
int n_tests = sizeof(uffd_tests) / sizeof(uffd_test_case_t);
int n_mems = sizeof(mem_types) / sizeof(mem_type_t);
+ const char *test_filter = NULL;
+ bool list_only = false;
uffd_test_case_t *test;
mem_type_t *mem_type;
uffd_test_args_t args;
char test_name[128];
const char *errmsg;
- int has_uffd;
+ int has_uffd, opt;
int i, j;

- has_uffd = test_uffd_api(false);
- has_uffd |= test_uffd_api(true);
+ while ((opt = getopt(argc, argv, "f:hl")) != -1) {
+ switch (opt) {
+ case 'f':
+ test_filter = optarg;
+ break;
+ case 'l':
+ list_only = true;
+ break;
+ case 'h':
+ default:
+ /* Unknown */
+ usage(argv[0]);
+ break;
+ }
+ }
+
+ if (!test_filter && !list_only) {
+ has_uffd = test_uffd_api(false);
+ has_uffd |= test_uffd_api(true);

- if (!has_uffd) {
- printf("Userfaultfd not supported or unprivileged, skip all tests\n");
- exit(KSFT_SKIP);
+ if (!has_uffd) {
+ printf("Userfaultfd not supported or unprivileged, skip all tests\n");
+ exit(KSFT_SKIP);
+ }
}

for (i = 0; i < n_tests; i++) {
test = &uffd_tests[i];
+ if (test_filter && !strstr(test->name, test_filter))
+ continue;
+ if (list_only) {
+ printf("%s\n", test->name);
+ continue;
+ }
for (j = 0; j < n_mems; j++) {
mem_type = &mem_types[j];
if (!(test->mem_targets & mem_type->mem_flag))
@@ -952,7 +989,8 @@ int main(int argc, char *argv[])
}
}

- uffd_test_report();
+ if (!list_only)
+ uffd_test_report();

return ksft_get_fail_cnt() ? KSFT_FAIL : KSFT_PASS;
}
--
2.39.1

2023-04-13 23:13:30

by Peter Xu

Subject: [PATCH 4/6] selftests/mm: Extend and rename uffd pagemap test

Extend it to all memory types, and add a parallel test for when EVENT_FORK
is enabled, where uffd-wp bits should persist rather than be dropped.

While at it, rename the test to "wp-fork" to better show what it means,
and name the new test "wp-fork-with-event".

Before:

Testing pagemap on anon... done

After:

Testing wp-fork on anon... done
Testing wp-fork on shmem... done
Testing wp-fork on shmem-private... done
Testing wp-fork on hugetlb... done
Testing wp-fork on hugetlb-private... done
Testing wp-fork-with-event on anon... done
Testing wp-fork-with-event on shmem... done
Testing wp-fork-with-event on shmem-private... done
Testing wp-fork-with-event on hugetlb... done
Testing wp-fork-with-event on hugetlb-private... done
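
The tests observe the uffd-wp state through /proc/self/pagemap. A minimal
sketch of that lookup follows, assuming the layout described in the
kernel's pagemap documentation (one 64-bit entry per virtual page, with
bit 57 reporting uffd-wp). The function names here are hypothetical and
are not the selftest's own helpers.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit 57 of a pagemap entry: pte is uffd-wp write-protected. */
#define PM_UFFD_WP_BIT	(1ULL << 57)

/* Decode the uffd-wp state from a raw pagemap entry. */
bool pagemap_entry_uffd_wp(uint64_t entry)
{
	return (entry & PM_UFFD_WP_BIT) != 0;
}

/* Read the pagemap entry covering addr; returns 0 on success, -1 on error. */
int pagemap_read_entry(const void *addr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset;
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	/* One 8-byte entry per virtual page, indexed by page number. */
	offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
		 (off_t)sizeof(uint64_t);
	n = pread(fd, entry, sizeof(*entry), offset);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

For "wp-fork" the child would expect pagemap_entry_uffd_wp() to be false
for the protected page, and for "wp-fork-with-event" it would expect true.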

Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 130 +++++++++++++++----
1 file changed, 106 insertions(+), 24 deletions(-)

diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index 452ca05a829d..739fc4d30342 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -227,25 +227,65 @@ static int pagemap_open(void)
err("pagemap uffd-wp bit error: 0x%"PRIx64, value); \
} while (0)

-static int pagemap_test_fork(bool present)
+typedef struct {
+ int parent_uffd, child_uffd;
+} fork_event_args;
+
+static void *fork_event_consumer(void *data)
{
- pid_t child = fork();
+ fork_event_args *args = data;
+ struct uffd_msg msg = { 0 };
+
+ /* Read until a full msg received */
+ while (uffd_read_msg(args->parent_uffd, &msg));
+
+ if (msg.event != UFFD_EVENT_FORK)
+ err("wrong message: %u\n", msg.event);
+
+ /* Just to be properly freed later */
+ args->child_uffd = msg.arg.fork.ufd;
+ return NULL;
+}
+
+static int pagemap_test_fork(int uffd, bool with_event)
+{
+ fork_event_args args = { .parent_uffd = uffd, .child_uffd = -1 };
+ pthread_t thread;
+ pid_t child;
uint64_t value;
int fd, result;

+ /* Prepare a thread to resolve EVENT_FORK */
+ if (with_event) {
+ if (pthread_create(&thread, NULL, fork_event_consumer, &args))
+ err("pthread_create()");
+ }
+
+ child = fork();
if (!child) {
/* Open the pagemap fd of the child itself */
fd = pagemap_open();
value = pagemap_get_entry(fd, area_dst);
/*
- * After fork() uffd-wp bit should be gone as long as we're
- * without UFFD_FEATURE_EVENT_FORK
+ * After fork(), we should handle uffd-wp bit differently:
+ *
+ * (1) when with EVENT_FORK, it should persist
+ * (2) when without EVENT_FORK, it should be dropped
*/
- pagemap_check_wp(value, false);
+ pagemap_check_wp(value, with_event);
/* Succeed */
exit(0);
}
waitpid(child, &result, 0);
+
+ if (with_event) {
+ if (pthread_join(thread, NULL))
+ err("pthread_join()");
+ if (args.child_uffd < 0)
+ err("Didn't receive child uffd");
+ close(args.child_uffd);
+ }
+
return result;
}

@@ -295,7 +335,8 @@ static void uffd_wp_unpopulated_test(uffd_test_args_t *args)
uffd_test_pass();
}

-static void uffd_pagemap_test(uffd_test_args_t *args)
+static void uffd_wp_fork_test_common(uffd_test_args_t *args,
+ bool with_event)
{
int pagemap_fd;
uint64_t value;
@@ -311,23 +352,42 @@ static void uffd_pagemap_test(uffd_test_args_t *args)
wp_range(uffd, (uint64_t)area_dst, page_size, true);
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, true);
- /* Make sure uffd-wp bit dropped when fork */
- if (pagemap_test_fork(true))
- err("Detected stall uffd-wp bit in child");
-
- /* Exclusive required or PAGEOUT won't work */
- if (!(value & PM_MMAP_EXCLUSIVE))
- err("multiple mapping detected: 0x%"PRIx64, value);
+ if (pagemap_test_fork(uffd, with_event)) {
+ uffd_test_fail("Detected %s uffd-wp bit in child in present pte",
+ with_event ? "missing" : "stall");
+ goto out;
+ }

- if (madvise(area_dst, page_size, MADV_PAGEOUT))
- err("madvise(MADV_PAGEOUT) failed");
+ /*
+ * This is an attempt for zapping the pgtable so as to test the
+ * markers.
+ *
+ * For private mappings, PAGEOUT will only work on exclusive ptes
+ * (PM_MMAP_EXCLUSIVE) which we should satisfy.
+ *
+ * For shared, PAGEOUT may not work. Use DONTNEED instead which
+ * plays a similar role of zapping (rather than freeing the page)
+ * to expose pte markers.
+ */
+ if (args->mem_type->shared) {
+ if (madvise(area_dst, page_size, MADV_DONTNEED))
+ err("MADV_DONTNEED");
+ } else {
+ /*
+ * NOTE: ignore retval because private-hugetlb doesn't yet
+ * support swapping, so it could fail.
+ */
+ madvise(area_dst, page_size, MADV_PAGEOUT);
+ }

/* Uffd-wp should persist even swapped out */
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, true);
- /* Make sure uffd-wp bit dropped when fork */
- if (pagemap_test_fork(false))
- err("Detected stall uffd-wp bit in child");
+ if (pagemap_test_fork(uffd, with_event)) {
+ uffd_test_fail("Detected %s uffd-wp bit in child in zapped pte",
+ with_event ? "missing" : "stall");
+ goto out;
+ }

/* Unprotect; this tests swap pte modifications */
wp_range(uffd, (uint64_t)area_dst, page_size, false);
@@ -338,9 +398,21 @@ static void uffd_pagemap_test(uffd_test_args_t *args)
*area_dst = 2;
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, false);
-
- close(pagemap_fd);
uffd_test_pass();
+out:
+ if (uffd_unregister(uffd, area_dst, nr_pages * page_size))
+ err("unregister failed");
+ close(pagemap_fd);
+}
+
+static void uffd_wp_fork_test(uffd_test_args_t *args)
+{
+ uffd_wp_fork_test_common(args, false);
+}
+
+static void uffd_wp_fork_with_event_test(uffd_test_args_t *args)
+{
+ uffd_wp_fork_test_common(args, true);
}

static void check_memory_contents(char *p)
@@ -836,10 +908,20 @@ uffd_test_case_t uffd_tests[] = {
.uffd_feature_required = 0,
},
{
- .name = "pagemap",
- .uffd_fn = uffd_pagemap_test,
- .mem_targets = MEM_ANON,
- .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP,
+ .name = "wp-fork",
+ .uffd_fn = uffd_wp_fork_test,
+ .mem_targets = MEM_ALL,
+ .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
+ },
+ {
+ .name = "wp-fork-with-event",
+ .uffd_fn = uffd_wp_fork_with_event_test,
+ .mem_targets = MEM_ALL,
+ .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM |
+ /* when set, child process should inherit uffd-wp bits */
+ UFFD_FEATURE_EVENT_FORK,
},
{
.name = "wp-unpopulated",
--
2.39.1

2023-04-13 23:13:32

by Peter Xu

Subject: [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS

The macro and facility can be reused in other tests too. Make it general.

Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/mm/Makefile | 8 ++++----
tools/testing/selftests/mm/check_config.sh | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 5a3434419403..9ffce175d5e6 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -161,8 +161,8 @@ warn_32bit_failure:
endif
endif

-# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
-$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
+# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
+$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)

$(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap

@@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh

EXTRA_CLEAN += local_config.mk local_config.h

-ifeq ($(COW_EXTRA_LIBS),)
+ifeq ($(IOURING_EXTRA_LIBS),)
all: warn_missing_liburing

warn_missing_liburing:
@echo ; \
- echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
+ echo "Warning: missing liburing support. Some tests will be skipped." ; \
echo
endif
diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
index bcba3af0acea..3954f4746161 100644
--- a/tools/testing/selftests/mm/check_config.sh
+++ b/tools/testing/selftests/mm/check_config.sh
@@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1

if [ -f $tmpfile_o ]; then
echo "#define LOCAL_CONFIG_HAVE_LIBURING 1" > $OUTPUT_H_FILE
- echo "COW_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
+ echo "IOURING_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
else
echo "// No liburing support found" > $OUTPUT_H_FILE
echo "# No liburing support found, so:" > $OUTPUT_MKFILE
- echo "COW_EXTRA_LIBS = " >> $OUTPUT_MKFILE
+ echo "IOURING_EXTRA_LIBS = " >> $OUTPUT_MKFILE
fi

rm ${tmpname}.*
--
2.39.1

2023-04-13 23:16:17

by Peter Xu

Subject: [PATCH 6/6] selftests/mm: Add tests for RO pinning vs fork()

Add two more tests (10 runs in total across the memory types) to cover RO
pinning against fork() with uffd-wp. Each covers both:

(1) Early CoW test in fork() when page pinned,
(2) page unshare due to RO longterm pin.

They are:

Testing wp-fork-pin on anon... done
Testing wp-fork-pin on shmem... done
Testing wp-fork-pin on shmem-private... done
Testing wp-fork-pin on hugetlb... done
Testing wp-fork-pin on hugetlb-private... done
Testing wp-fork-pin-with-event on anon... done
Testing wp-fork-pin-with-event on shmem... done
Testing wp-fork-pin-with-event on shmem-private... done
Testing wp-fork-pin-with-event on hugetlb... done
Testing wp-fork-pin-with-event on hugetlb-private... done

CONFIG_GUP_TEST is needed, otherwise they'll be skipped:

Testing wp-fork-pin on anon... skipped [reason: Possibly CONFIG_GUP_TEST missing or unprivileged]
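
A rough sketch of how such a skip can be detected up front: try to open
the gup_test debugfs file and report the error. The path is the one used
by the patch below; the helper name probe_gup_test() is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 if path can be opened read-write, -errno otherwise. */
int probe_gup_test(const char *path)
{
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return -errno;
	close(fd);
	return 0;
}
```

Calling probe_gup_test("/sys/kernel/debug/gup_test") fails when the file
is missing or not accessible to the caller. A successful open is not
sufficient on its own, since an older gup_test may not support the
PIN_LONGTERM_TEST_START ioctl.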

Note that only private pages matter here, but it does no harm to also run
all of them on shared memory.

Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/mm/uffd-unit-tests.c | 144 ++++++++++++++++++-
1 file changed, 141 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c
index 739fc4d30342..269c86768a02 100644
--- a/tools/testing/selftests/mm/uffd-unit-tests.c
+++ b/tools/testing/selftests/mm/uffd-unit-tests.c
@@ -7,6 +7,8 @@

#include "uffd-common.h"

+#include "../../../../mm/gup_test.h"
+
#ifdef __NR_userfaultfd

/* The unit test doesn't need a large or random size, make it 32MB for now */
@@ -247,7 +249,53 @@ static void *fork_event_consumer(void *data)
return NULL;
}

-static int pagemap_test_fork(int uffd, bool with_event)
+typedef struct {
+ int gup_fd;
+ bool pinned;
+} pin_args;
+
+/*
+ * Returns 0 if succeed, <0 for errors. pin_pages() needs to be paired
+ * with unpin_pages(). Currently it needs to be RO longterm pin to satisfy
+ * all needs of the test cases (e.g., trigger unshare, trigger fork() early
+ * CoW, etc.).
+ */
+static int pin_pages(pin_args *args, void *buffer, size_t size)
+{
+ struct pin_longterm_test test = {
+ .addr = (uintptr_t)buffer,
+ .size = size,
+ /* Read-only pins */
+ .flags = 0,
+ };
+
+ if (args->pinned)
+ err("already pinned");
+
+ args->gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
+ if (args->gup_fd < 0)
+ return -errno;
+
+ if (ioctl(args->gup_fd, PIN_LONGTERM_TEST_START, &test)) {
+ /* Even if gup_test existed, can be an old gup_test / kernel */
+ close(args->gup_fd);
+ return -errno;
+ }
+ args->pinned = true;
+ return 0;
+}
+
+static void unpin_pages(pin_args *args)
+{
+ if (!args->pinned)
+ err("unpin without pin first");
+ if (ioctl(args->gup_fd, PIN_LONGTERM_TEST_STOP))
+ err("PIN_LONGTERM_TEST_STOP");
+ close(args->gup_fd);
+ args->pinned = false;
+}
+
+static int pagemap_test_fork(int uffd, bool with_event, bool test_pin)
{
fork_event_args args = { .parent_uffd = uffd, .child_uffd = -1 };
pthread_t thread;
@@ -264,7 +312,17 @@ static int pagemap_test_fork(int uffd, bool with_event)
child = fork();
if (!child) {
/* Open the pagemap fd of the child itself */
+ pin_args args = {};
+
fd = pagemap_open();
+
+ if (test_pin && pin_pages(&args, area_dst, page_size))
+ /*
+ * Normally when reach here we have pinned in
+ * previous tests, so shouldn't fail anymore
+ */
+ err("pin page failed in child");
+
value = pagemap_get_entry(fd, area_dst);
/*
* After fork(), we should handle uffd-wp bit differently:
@@ -273,6 +331,8 @@ static int pagemap_test_fork(int uffd, bool with_event)
* (2) when without EVENT_FORK, it should be dropped
*/
pagemap_check_wp(value, with_event);
+ if (test_pin)
+ unpin_pages(&args);
/* Succeed */
exit(0);
}
@@ -352,7 +412,7 @@ static void uffd_wp_fork_test_common(uffd_test_args_t *args,
wp_range(uffd, (uint64_t)area_dst, page_size, true);
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, true);
- if (pagemap_test_fork(uffd, with_event)) {
+ if (pagemap_test_fork(uffd, with_event, false)) {
uffd_test_fail("Detected %s uffd-wp bit in child in present pte",
with_event ? "missing" : "stall");
goto out;
@@ -383,7 +443,7 @@ static void uffd_wp_fork_test_common(uffd_test_args_t *args,
/* Uffd-wp should persist even swapped out */
value = pagemap_get_entry(pagemap_fd, area_dst);
pagemap_check_wp(value, true);
- if (pagemap_test_fork(uffd, with_event)) {
+ if (pagemap_test_fork(uffd, with_event, false)) {
uffd_test_fail("Detected %s uffd-wp bit in child in zapped pte",
with_event ? "missing" : "stall");
goto out;
@@ -415,6 +475,68 @@ static void uffd_wp_fork_with_event_test(uffd_test_args_t *args)
uffd_wp_fork_test_common(args, true);
}

+static void uffd_wp_fork_pin_test_common(uffd_test_args_t *args,
+ bool with_event)
+{
+ int pagemap_fd;
+ pin_args pin_args = {};
+
+ if (uffd_register(uffd, area_dst, page_size, false, true, false))
+ err("register failed");
+
+ pagemap_fd = pagemap_open();
+
+ /* Touch the page */
+ *area_dst = 1;
+ wp_range(uffd, (uint64_t)area_dst, page_size, true);
+
+ /*
+ * 1. First pin, then fork(). This tests fork() special path when
+ * doing early CoW if the page is private.
+ */
+ if (pin_pages(&pin_args, area_dst, page_size)) {
+ uffd_test_skip("Possibly CONFIG_GUP_TEST missing "
+ "or unprivileged");
+ close(pagemap_fd);
+ uffd_unregister(uffd, area_dst, page_size);
+ return;
+ }
+
+ if (pagemap_test_fork(uffd, with_event, false)) {
+ uffd_test_fail("Detected %s uffd-wp bit in early CoW of fork()",
+ with_event ? "missing" : "stall");
+ unpin_pages(&pin_args);
+ goto out;
+ }
+
+ unpin_pages(&pin_args);
+
+ /*
+ * 2. First fork(), then pin (in the child, where test_pin==true).
+ * This tests COR, aka, page unsharing on private memories.
+ */
+ if (pagemap_test_fork(uffd, with_event, true)) {
+ uffd_test_fail("Detected %s uffd-wp bit when RO pin",
+ with_event ? "missing" : "stall");
+ goto out;
+ }
+ uffd_test_pass();
+out:
+ if (uffd_unregister(uffd, area_dst, page_size))
+ err("unregister failed");
+ close(pagemap_fd);
+}
+
+static void uffd_wp_fork_pin_test(uffd_test_args_t *args)
+{
+ uffd_wp_fork_pin_test_common(args, false);
+}
+
+static void uffd_wp_fork_pin_with_event_test(uffd_test_args_t *args)
+{
+ uffd_wp_fork_pin_test_common(args, true);
+}
+
static void check_memory_contents(char *p)
{
unsigned long i, j;
@@ -923,6 +1045,22 @@ uffd_test_case_t uffd_tests[] = {
/* when set, child process should inherit uffd-wp bits */
UFFD_FEATURE_EVENT_FORK,
},
+ {
+ .name = "wp-fork-pin",
+ .uffd_fn = uffd_wp_fork_pin_test,
+ .mem_targets = MEM_ALL,
+ .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
+ },
+ {
+ .name = "wp-fork-pin-with-event",
+ .uffd_fn = uffd_wp_fork_pin_with_event_test,
+ .mem_targets = MEM_ALL,
+ .uffd_feature_required = UFFD_FEATURE_PAGEFAULT_FLAG_WP |
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM |
+ /* when set, child process should inherit uffd-wp bits */
+ UFFD_FEATURE_EVENT_FORK,
+ },
{
.name = "wp-unpopulated",
.uffd_fn = uffd_wp_unpopulated_test,
--
2.39.1

2023-04-14 09:42:43

by David Hildenbrand

Subject: Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()

On 14.04.23 01:11, Peter Xu wrote:
> There're a bunch of things that were wrong:
>
> - Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
> rather than huge_pte_uffd_wp().
>
> - When copying over a pte, we should drop uffd-wp bit when
> !EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
>
> - When doing early CoW for private hugetlb (e.g. when the parent page was
> pinned), uffd-wp bit should be properly carried over if necessary.
>
> No bug reported probably because most people do not even care about these
> corner cases, but they are still bugs and can be exposed by the recent unit
> tests introduced, so fix all of them in one shot.
>
> Cc: linux-stable <[email protected]>
> Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
> Signed-off-by: Peter Xu <[email protected]>
> ---
> mm/hugetlb.c | 26 ++++++++++++++++----------
> 1 file changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f16b25b1a6b9..7320e64aacc6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4953,11 +4953,15 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
>
> static void
> hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
> - struct folio *new_folio)
> + struct folio *new_folio, pte_t old)
> {

Nit: The function now expects old to be !swap_pte. Which works perfectly
fine with existing code -- the function name is a bit generic and
misleading, unfortunately. IMHO, instead of factoring that functionality
out to desperately try keeping copy_hugetlb_page_range() somewhat
readable, we should just have factored out the complete copy+replace
into a copy_hugetlb_page() function -- similar to the ordinary page
handling -- which would have made copy_hugetlb_page_range() more
readable eventually.

Anyhow, unrelated.

> + pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
> +
> __folio_mark_uptodate(new_folio);
> hugepage_add_new_anon_rmap(new_folio, vma, addr);
> - set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
> + if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
> + newpte = huge_pte_mkuffd_wp(newpte);
> + set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
> hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
> folio_set_hugetlb_migratable(new_folio);
> }
> @@ -5032,14 +5036,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> */
> ;
> } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
> - bool uffd_wp = huge_pte_uffd_wp(entry);
> -
> - if (!userfaultfd_wp(dst_vma) && uffd_wp)
> + if (!userfaultfd_wp(dst_vma))
> entry = huge_pte_clear_uffd_wp(entry);
> set_huge_pte_at(dst, addr, dst_pte, entry);
> } else if (unlikely(is_hugetlb_entry_migration(entry))) {
> swp_entry_t swp_entry = pte_to_swp_entry(entry);
> - bool uffd_wp = huge_pte_uffd_wp(entry);
>
> if (!is_readable_migration_entry(swp_entry) && cow) {
> /*
> @@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> swp_entry = make_readable_migration_entry(
> swp_offset(swp_entry));
> entry = swp_entry_to_pte(swp_entry);
> - if (userfaultfd_wp(src_vma) && uffd_wp)
> - entry = huge_pte_mkuffd_wp(entry);
> + if (userfaultfd_wp(src_vma) &&
> + pte_swp_uffd_wp(entry))
> + entry = pte_swp_mkuffd_wp(entry);
> set_huge_pte_at(src, addr, src_pte, entry);
> }
> - if (!userfaultfd_wp(dst_vma) && uffd_wp)
> + if (!userfaultfd_wp(dst_vma))
> entry = huge_pte_clear_uffd_wp(entry);
> set_huge_pte_at(dst, addr, dst_pte, entry);
> } else if (unlikely(is_pte_marker(entry))) {
> @@ -5114,7 +5116,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> /* huge_ptep of dst_pte won't change as in child */
> goto again;
> }
> - hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
> + hugetlb_install_folio(dst_vma, dst_pte, addr,
> + new_folio, src_pte_old);
> spin_unlock(src_ptl);
> spin_unlock(dst_ptl);
> continue;
> @@ -5132,6 +5135,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> entry = huge_pte_wrprotect(entry);
> }
>
> + if (!userfaultfd_wp(dst_vma))
> + entry = huge_pte_clear_uffd_wp(entry);
> +
> set_huge_pte_at(dst, addr, dst_pte, entry);
> hugetlb_count_add(npages, dst);
> }

LGTM

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb

2023-04-14 09:55:14

by Mika Penttilä

Subject: Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()



On 14.4.2023 2.11, Peter Xu wrote:
> There're a bunch of things that were wrong:
>
> - Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
> rather than huge_pte_uffd_wp().
>
> - When copying over a pte, we should drop uffd-wp bit when
> !EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
>
> - When doing early CoW for private hugetlb (e.g. when the parent page was
> pinned), uffd-wp bit should be properly carried over if necessary.
>
> No bug reported probably because most people do not even care about these
> corner cases, but they are still bugs and can be exposed by the recent unit
> tests introduced, so fix all of them in one shot.
>
> Cc: linux-stable <[email protected]>
> Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
> Signed-off-by: Peter Xu <[email protected]>
> ---
> mm/hugetlb.c | 26 ++++++++++++++++----------
> 1 file changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f16b25b1a6b9..7320e64aacc6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4953,11 +4953,15 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
>
> static void
> hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
> - struct folio *new_folio)
> + struct folio *new_folio, pte_t old)
> {
> + pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
> +
> __folio_mark_uptodate(new_folio);
> hugepage_add_new_anon_rmap(new_folio, vma, addr);
> - set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
> + if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
> + newpte = huge_pte_mkuffd_wp(newpte);
> + set_huge_pte_at(vma->vm_mm, addr, ptep, newpte);
> hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
> folio_set_hugetlb_migratable(new_folio);
> }
> @@ -5032,14 +5036,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> */
> ;
> } else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
> - bool uffd_wp = huge_pte_uffd_wp(entry);
> -
> - if (!userfaultfd_wp(dst_vma) && uffd_wp)
> + if (!userfaultfd_wp(dst_vma))
> entry = huge_pte_clear_uffd_wp(entry);
> set_huge_pte_at(dst, addr, dst_pte, entry);
> } else if (unlikely(is_hugetlb_entry_migration(entry))) {
> swp_entry_t swp_entry = pte_to_swp_entry(entry);
> - bool uffd_wp = huge_pte_uffd_wp(entry);
>
> if (!is_readable_migration_entry(swp_entry) && cow) {
> /*
> @@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> swp_entry = make_readable_migration_entry(
> swp_offset(swp_entry));
> entry = swp_entry_to_pte(swp_entry);
> - if (userfaultfd_wp(src_vma) && uffd_wp)
> - entry = huge_pte_mkuffd_wp(entry);
> + if (userfaultfd_wp(src_vma) &&
> + pte_swp_uffd_wp(entry))
> + entry = pte_swp_mkuffd_wp(entry);


This looks interesting with pte_swp_uffd_wp and pte_swp_mkuffd_wp ?


> set_huge_pte_at(src, addr, src_pte, entry);
> }
> - if (!userfaultfd_wp(dst_vma) && uffd_wp)
> + if (!userfaultfd_wp(dst_vma))
> entry = huge_pte_clear_uffd_wp(entry);
> set_huge_pte_at(dst, addr, dst_pte, entry);
> } else if (unlikely(is_pte_marker(entry))) {
> @@ -5114,7 +5116,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> /* huge_ptep of dst_pte won't change as in child */
> goto again;
> }
> - hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
> + hugetlb_install_folio(dst_vma, dst_pte, addr,
> + new_folio, src_pte_old);
> spin_unlock(src_ptl);
> spin_unlock(dst_ptl);
> continue;
> @@ -5132,6 +5135,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> entry = huge_pte_wrprotect(entry);
> }
>
> + if (!userfaultfd_wp(dst_vma))
> + entry = huge_pte_clear_uffd_wp(entry);
> +
> set_huge_pte_at(dst, addr, dst_pte, entry);
> hugetlb_count_add(npages, dst);
> }


--Mika


2023-04-14 09:59:17

by David Hildenbrand

Subject: Re: [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS

On 14.04.23 01:11, Peter Xu wrote:
> The macro and facility can be reused in other tests too. Make it general.
>
> Signed-off-by: Peter Xu <[email protected]>
> ---
> tools/testing/selftests/mm/Makefile | 8 ++++----
> tools/testing/selftests/mm/check_config.sh | 4 ++--
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> index 5a3434419403..9ffce175d5e6 100644
> --- a/tools/testing/selftests/mm/Makefile
> +++ b/tools/testing/selftests/mm/Makefile
> @@ -161,8 +161,8 @@ warn_32bit_failure:
> endif
> endif
>
> -# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
> -$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
> +# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
> +$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)
>
> $(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap
>
> @@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh
>
> EXTRA_CLEAN += local_config.mk local_config.h
>
> -ifeq ($(COW_EXTRA_LIBS),)
> +ifeq ($(IOURING_EXTRA_LIBS),)
> all: warn_missing_liburing
>
> warn_missing_liburing:
> @echo ; \
> - echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
> + echo "Warning: missing liburing support. Some tests will be skipped." ; \
> echo
> endif
> diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
> index bcba3af0acea..3954f4746161 100644
> --- a/tools/testing/selftests/mm/check_config.sh
> +++ b/tools/testing/selftests/mm/check_config.sh
> @@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1
>
> if [ -f $tmpfile_o ]; then
> echo "#define LOCAL_CONFIG_HAVE_LIBURING 1" > $OUTPUT_H_FILE
> - echo "COW_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
> + echo "IOURING_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
> else
> echo "// No liburing support found" > $OUTPUT_H_FILE
> echo "# No liburing support found, so:" > $OUTPUT_MKFILE
> - echo "COW_EXTRA_LIBS = " >> $OUTPUT_MKFILE
> + echo "IOURING_EXTRA_LIBS = " >> $OUTPUT_MKFILE
> fi
>
> rm ${tmpname}.*

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb

2023-04-14 14:00:47

by Peter Xu

Subject: Re: [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS

On Fri, Apr 14, 2023 at 11:52:40AM +0200, David Hildenbrand wrote:
> On 14.04.23 01:11, Peter Xu wrote:
> > The macro and facility can be reused in other tests too. Make it general.
> >
> > Signed-off-by: Peter Xu <[email protected]>
> > ---
> > tools/testing/selftests/mm/Makefile | 8 ++++----
> > tools/testing/selftests/mm/check_config.sh | 4 ++--
> > 2 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> > index 5a3434419403..9ffce175d5e6 100644
> > --- a/tools/testing/selftests/mm/Makefile
> > +++ b/tools/testing/selftests/mm/Makefile
> > @@ -161,8 +161,8 @@ warn_32bit_failure:
> > endif
> > endif
> > -# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
> > -$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
> > +# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
> > +$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)
> > $(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap
> > @@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh
> > EXTRA_CLEAN += local_config.mk local_config.h
> > -ifeq ($(COW_EXTRA_LIBS),)
> > +ifeq ($(IOURING_EXTRA_LIBS),)
> > all: warn_missing_liburing
> > warn_missing_liburing:
> > @echo ; \
> > - echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
> > + echo "Warning: missing liburing support. Some tests will be skipped." ; \
> > echo
> > endif
> > diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
> > index bcba3af0acea..3954f4746161 100644
> > --- a/tools/testing/selftests/mm/check_config.sh
> > +++ b/tools/testing/selftests/mm/check_config.sh
> > @@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1
> > if [ -f $tmpfile_o ]; then
> > echo "#define LOCAL_CONFIG_HAVE_LIBURING 1" > $OUTPUT_H_FILE
> > - echo "COW_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
> > + echo "IOURING_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
> > else
> > echo "// No liburing support found" > $OUTPUT_H_FILE
> > echo "# No liburing support found, so:" > $OUTPUT_MKFILE
> > - echo "COW_EXTRA_LIBS = " >> $OUTPUT_MKFILE
> > + echo "IOURING_EXTRA_LIBS = " >> $OUTPUT_MKFILE
> > fi
> > rm ${tmpname}.*
>
> Reviewed-by: David Hildenbrand <[email protected]>

Oops, I planned to drop this patch but forgot. I was planning to use
io_uring, but only later found that it cannot take RO pins, so I switched
to gup_test as in your cow test. Hence this patch is not needed anymore.

But since it's already there and still looks good to have, let me keep it
around with your R-b then.

Thanks,

--
Peter Xu

2023-04-14 14:14:23

by Peter Xu

Subject: Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()

On Fri, Apr 14, 2023 at 12:45:29PM +0300, Mika Penttilä wrote:
> > } else if (unlikely(is_hugetlb_entry_migration(entry))) {
> > swp_entry_t swp_entry = pte_to_swp_entry(entry);
> > - bool uffd_wp = huge_pte_uffd_wp(entry);

[1]

> > if (!is_readable_migration_entry(swp_entry) && cow) {
> > /*
> > @@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> > swp_entry = make_readable_migration_entry(
> > swp_offset(swp_entry));
> > entry = swp_entry_to_pte(swp_entry);

[2]

> > - if (userfaultfd_wp(src_vma) && uffd_wp)
> > - entry = huge_pte_mkuffd_wp(entry);
> > + if (userfaultfd_wp(src_vma) &&
> > + pte_swp_uffd_wp(entry))
> > + entry = pte_swp_mkuffd_wp(entry);
>
>
> This looks interesting with pte_swp_uffd_wp and pte_swp_mkuffd_wp ?

Could you explain what you mean?

I think these helpers are the right ones to use, as afaict hugetlb
migration should follow the same pte format as !hugetlb. However, I
noticed I did it wrong when dropping the temp var: at [1], "entry"
still points to the src entry, but at [2] it already points to the
newly created one. So I think I can't drop the var; a fixup should look like:

===8<===
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 083aae35bff8..cd3a9d8f4b70 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5041,6 +5041,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
set_huge_pte_at(dst, addr, dst_pte, entry);
} else if (unlikely(is_hugetlb_entry_migration(entry))) {
swp_entry_t swp_entry = pte_to_swp_entry(entry);
+ bool uffd_wp = pte_swp_uffd_wp(entry);

if (!is_readable_migration_entry(swp_entry) && cow) {
/*
@@ -5050,8 +5051,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
swp_entry = make_readable_migration_entry(
swp_offset(swp_entry));
entry = swp_entry_to_pte(swp_entry);
- if (userfaultfd_wp(src_vma) &&
- pte_swp_uffd_wp(entry))
+ if (userfaultfd_wp(src_vma) && uffd_wp)
entry = pte_swp_mkuffd_wp(entry);
set_huge_pte_at(src, addr, src_pte, entry);
===8<===

Besides, did I miss something else?

Thanks,

--
Peter Xu

2023-04-14 14:27:01

by Mika Penttilä

Subject: Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()



On 14.4.2023 17.09, Peter Xu wrote:
> On Fri, Apr 14, 2023 at 12:45:29PM +0300, Mika Penttilä wrote:
>>> } else if (unlikely(is_hugetlb_entry_migration(entry))) {
>>> swp_entry_t swp_entry = pte_to_swp_entry(entry);
>>> - bool uffd_wp = huge_pte_uffd_wp(entry);
>
> [1]
>
>>> if (!is_readable_migration_entry(swp_entry) && cow) {
>>> /*
>>> @@ -5049,11 +5050,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>>> swp_entry = make_readable_migration_entry(
>>> swp_offset(swp_entry));
>>> entry = swp_entry_to_pte(swp_entry);
>
> [2]
>
>>> - if (userfaultfd_wp(src_vma) && uffd_wp)
>>> - entry = huge_pte_mkuffd_wp(entry);
>>> + if (userfaultfd_wp(src_vma) &&
>>> + pte_swp_uffd_wp(entry))
>>> + entry = pte_swp_mkuffd_wp(entry);
>>
>>
>> This looks interesting with pte_swp_uffd_wp and pte_swp_mkuffd_wp ?
>
> Could you explain what do you mean?
>

Yes, as you also noticed, you called pte_swp_mkuffd_wp(entry) iff
pte_swp_uffd_wp(entry), which is of course a nop.

But the fixup, which keeps the temp var, should work.

> I think these helpers are the right ones to use, as afaict hugetlb
> migration should follow the same pte format with !hugetlb. However, I
> noticed I did it wrong when dropping the temp var - when at [1], "entry"
> still points to the src entry, but at [2] it's already pointing to the
> newly created one.. so I think I can't drop the var, a fixup should like:
>
> ===8<===
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 083aae35bff8..cd3a9d8f4b70 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5041,6 +5041,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> set_huge_pte_at(dst, addr, dst_pte, entry);
> } else if (unlikely(is_hugetlb_entry_migration(entry))) {
> swp_entry_t swp_entry = pte_to_swp_entry(entry);
> + bool uffd_wp = pte_swp_uffd_wp(entry);
>
> if (!is_readable_migration_entry(swp_entry) && cow) {
> /*
> @@ -5050,8 +5051,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> swp_entry = make_readable_migration_entry(
> swp_offset(swp_entry));
> entry = swp_entry_to_pte(swp_entry);
> - if (userfaultfd_wp(src_vma) &&
> - pte_swp_uffd_wp(entry))
> + if (userfaultfd_wp(src_vma) && uffd_wp)
> entry = pte_swp_mkuffd_wp(entry);
> set_huge_pte_at(src, addr, src_pte, entry);
> ===8<===
>
> Besides, did I miss something else?
>
> Thanks,
>

--Mika

2023-04-14 14:33:01

by David Hildenbrand

Subject: Re: [PATCH 5/6] selftests/mm: Rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS

On 14.04.23 15:56, Peter Xu wrote:
> On Fri, Apr 14, 2023 at 11:52:40AM +0200, David Hildenbrand wrote:
>> On 14.04.23 01:11, Peter Xu wrote:
>>> The macro and facility can be reused in other tests too. Make it general.
>>>
>>> Signed-off-by: Peter Xu <[email protected]>
>>> ---
>>> tools/testing/selftests/mm/Makefile | 8 ++++----
>>> tools/testing/selftests/mm/check_config.sh | 4 ++--
>>> 2 files changed, 6 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
>>> index 5a3434419403..9ffce175d5e6 100644
>>> --- a/tools/testing/selftests/mm/Makefile
>>> +++ b/tools/testing/selftests/mm/Makefile
>>> @@ -161,8 +161,8 @@ warn_32bit_failure:
>>> endif
>>> endif
>>> -# cow_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
>>> -$(OUTPUT)/cow: LDLIBS += $(COW_EXTRA_LIBS)
>>> +# IOURING_EXTRA_LIBS may get set in local_config.mk, or it may be left empty.
>>> +$(OUTPUT)/cow: LDLIBS += $(IOURING_EXTRA_LIBS)
>>> $(OUTPUT)/mlock-random-test $(OUTPUT)/memfd_secret: LDLIBS += -lcap
>>> @@ -175,11 +175,11 @@ local_config.mk local_config.h: check_config.sh
>>> EXTRA_CLEAN += local_config.mk local_config.h
>>> -ifeq ($(COW_EXTRA_LIBS),)
>>> +ifeq ($(IOURING_EXTRA_LIBS),)
>>> all: warn_missing_liburing
>>> warn_missing_liburing:
>>> @echo ; \
>>> - echo "Warning: missing liburing support. Some COW tests will be skipped." ; \
>>> + echo "Warning: missing liburing support. Some tests will be skipped." ; \
>>> echo
>>> endif
>>> diff --git a/tools/testing/selftests/mm/check_config.sh b/tools/testing/selftests/mm/check_config.sh
>>> index bcba3af0acea..3954f4746161 100644
>>> --- a/tools/testing/selftests/mm/check_config.sh
>>> +++ b/tools/testing/selftests/mm/check_config.sh
>>> @@ -21,11 +21,11 @@ $CC -c $tmpfile_c -o $tmpfile_o >/dev/null 2>&1
>>> if [ -f $tmpfile_o ]; then
>>> echo "#define LOCAL_CONFIG_HAVE_LIBURING 1" > $OUTPUT_H_FILE
>>> - echo "COW_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
>>> + echo "IOURING_EXTRA_LIBS = -luring" > $OUTPUT_MKFILE
>>> else
>>> echo "// No liburing support found" > $OUTPUT_H_FILE
>>> echo "# No liburing support found, so:" > $OUTPUT_MKFILE
>>> - echo "COW_EXTRA_LIBS = " >> $OUTPUT_MKFILE
>>> + echo "IOURING_EXTRA_LIBS = " >> $OUTPUT_MKFILE
>>> fi
>>> rm ${tmpname}.*
>>
>> Reviewed-by: David Hildenbrand <[email protected]>
>
> Oops, I planned to drop this patch but I forgot.. I was planning to use
> iouring but only later found that it cannot take RO pins so switched to
> gup_test per your cow test. Hence this patch is not needed anymore.
>

Yeah, it's unfortunate ... I briefly thought about adding R/O fixed
buffer support, but it looked like more work than eventual benefit.

> But since it's already there and looks like still good to have.. let me
> keep it around with your R-b then.

Yes, makes sense to me.

--
Thanks,

David / dhildenb

2023-04-14 15:25:19

by Peter Xu

Subject: Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()

On Fri, Apr 14, 2023 at 05:23:12PM +0300, Mika Penttilä wrote:
> But the fixup not dropping the temp var should work.

OK, I see. I'll wait a few more days before sending a respin. Thanks,

--
Peter Xu

2023-04-14 22:26:34

by Mike Kravetz

Subject: Re: [PATCH 1/6] mm/hugetlb: Fix uffd-wp during fork()

On 04/13/23 19:11, Peter Xu wrote:
> There're a bunch of things that were wrong:
>
> - Reading uffd-wp bit from a swap entry should use pte_swp_uffd_wp()
> rather than huge_pte_uffd_wp().

That was/is quite confusing to me at least.

>
> - When copying over a pte, we should drop uffd-wp bit when
> !EVENT_FORK (aka, when !userfaultfd_wp(dst_vma)).
>
> - When doing early CoW for private hugetlb (e.g. when the parent page was
> pinned), uffd-wp bit should be properly carried over if necessary.
>
> No bug reported probably because most people do not even care about these
> corner cases, but they are still bugs and can be exposed by the recent unit
> tests introduced, so fix all of them in one shot.
>
> Cc: linux-stable <[email protected]>
> Fixes: bc70fbf269fd ("mm/hugetlb: handle uffd-wp during fork()")
> Signed-off-by: Peter Xu <[email protected]>
> ---
> mm/hugetlb.c | 26 ++++++++++++++++----------
> 1 file changed, 16 insertions(+), 10 deletions(-)

No issues except for losing information in the pte entry, as pointed out by Mika.

--
Mike Kravetz