This is v2 of uffd-wp shmem & hugetlbfs support, which completes uffd-wp as a
full kernel feature: before this series it only supported anonymous memory.
This patchset is based mostly on tag v5.12-rc8-mmots-2021-04-21-23-08;
however, I dropped a few patches from it that would otherwise cause weird
build errors. I believe I kept most of the -mm-next material, so hopefully
this series can be applied cleanly on top of Andrew's next -mm tree. If not,
I'll respin.
The whole series can also be found online [1].
The major change from v1->v2 is that I dropped the FAULT_FLAG_UFFD_WP flag
introduced in v1, as I think it's not needed: checking the flag:
vmf->flags & FAULT_FLAG_UFFD_WP
can be replaced safely by:
pte_swp_uffd_wp_special(vmf->orig_pte)
So we don't need to introduce yet another fault flag, and the code even
shrank by a few lines, which is good.
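For illustration only (fault-handler context assumed; the helper is
introduced later in this series), a minimal sketch of how the v2 check
replaces the v1 flag:

	/* v1 (dropped): if (vmf->flags & FAULT_FLAG_UFFD_WP) ... */
	/* v2: derive the same information from the pte we already read */
	if (pte_swp_uffd_wp_special(vmf->orig_pte)) {
		/* faulting upon a uffd-wp special swap pte */
	}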
Also, per discussion with Mike, I dropped the READ emulation when a thread
faults on a uffd-wp special swap pte; the WRITE fault is kept as is in this
series.
Nothing big really changed otherwise. Full changelog listed below.
v2:
- Add R-bs
- Added patch "mm/hugetlb: Drop __unmap_hugepage_range definition from
  hugetlb.h" as noticed/suggested by Mike Kravetz
- Fix commit message of patch "hugetlb/userfaultfd: Only drop uffd-wp special
pte if required" [MikeK]
- Removed comments for fields in zap_details since they're either incorrect
  or not helpful [Matthew]
- Rephrase commit message in patch "hugetlb/userfaultfd: Take care of
UFFDIO_COPY_MODE_WP" to explain better on why set dirty bit for UFFDIO_COPY
in hugetlbfs [MikeK]
- Don't emulate READ for uffd-wp-special on both shmem & hugetlbfs.
- Drop FAULT_FLAG_UFFD_WP flag, by checking vmf->orig_pte directly against
pte_swp_uffd_wp_special()
- Fix race condition of page fault handling on uffd-wp-special [Mike]
About Swap Special PTE
======================
In short, the so-called "swap special pte" in this patchset is a new type of
pte that didn't exist before, and it is first used by this series for
file-backed memory. Its purpose is to persist per-pte information even after
the ptes get dropped while the page cache still exists. For example, when
splitting a file-backed huge pmd, we simply drop the pmd entry and wait for
the next fault. That was fine in the past, since all the information in the
pte could be recovered from the page cache when the next page fault
triggers. However, uffd-wp is per-pte information that cannot be kept in the
page cache, so it needs to be maintained in the pgtable entry somehow, even
if that entry is about to be dropped. Here, instead of replacing the pte
with a none entry, we install the "swap special pte". When the next page
fault triggers, we can observe orig_pte to recover this information.
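As a simplified sketch of the install side (variable context assumed; this
condenses what pte_install_uffd_wp_if_needed() does later in this series for
a present, uffd-wp'ed pte):

	/* Zapping a wr-protected pte on file-backed memory: do not leave
	 * a none pte behind, install the marker so that the next fault
	 * can see it in orig_pte. */
	pteval = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm);
	if (!vma_is_anonymous(vma) &&
	    pte_present(pteval) && pte_uffd_wp(pteval))
		set_pte_at(mm, addr, pte, pte_swp_mkuffd_wp_special(vma));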
I'm copy-pasting part of the commit message from the patch "mm/swap:
Introduce the idea of special swap ptes", which explains this pte from
another angle:
We used to have special swap entries, like migration entries, hw-poison
entries, device private entries, etc.
Those "special swap entries" reside in the range that they need to be at least
swap entries first, and their types are decided by swp_type(entry).
This patch introduces another idea called "special swap ptes".
It's very easy to confuse them with "special swap entries", but a special
swap pte should never contain a swap entry at all. That means it is illegal
to call pte_to_swp_entry() upon a special swap pte.
Make the uffd-wp special pte the first special swap pte.
Before this patch, is_swap_pte()==true means one of the below:
(a.1) The pte has a normal swap entry (non_swap_entry()==false). For
example, when an anonymous page got swapped out.
(a.2) The pte has a special swap entry (non_swap_entry()==true). For
example, a migration entry, a hw-poison entry, etc.
After this patch, is_swap_pte()==true means one of the below, where case (b) is
added:
(a) The pte contains a swap entry.
(a.1) The pte has a normal swap entry (non_swap_entry()==false). For
example, when an anonymous page got swapped out.
(a.2) The pte has a special swap entry (non_swap_entry()==true). For
example, a migration entry, a hw-poison entry, etc.
(b) The pte does not contain a swap entry at all (so it cannot be passed
into pte_to_swp_entry()). For example, uffd-wp special swap pte.
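To make the taxonomy concrete, here is a sketch (mine, not part of the
quoted commit message) of how a pte walker tells the cases apart with the
helpers this series introduces:

	if (is_swap_pte(pte)) {
		if (is_swap_special_pte(pte)) {
			/* case (b): holds no swap entry; never call
			 * pte_to_swp_entry() upon it */
		} else {
			/* cases (a.1)/(a.2): pte_to_swp_entry() is legal */
			swp_entry_t entry = pte_to_swp_entry(pte);

			if (non_swap_entry(entry)) {
				/* (a.2): migration, hw-poison, ... */
			} else {
				/* (a.1): a real swap entry */
			}
		}
	}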
Hugetlbfs needs a similar thing because it's also file-backed. I directly
reused the same special pte there, though the shmem and hugetlb changes for
supporting this new pte differ, since the two don't share much code path.
Patch layout
============
Part (1): Shmem support; this is where the special swap pte is introduced.
Some zap rework is needed along the way:
shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
mm: Clear vmf->pte after pte_unmap_same() returns
mm/userfaultfd: Introduce special pte for unmapped file-backed mem
mm/swap: Introduce the idea of special swap ptes
shmem/userfaultfd: Handle uffd-wp special pte in page fault handler
mm: Drop first_index/last_index in zap_details
mm: Introduce zap_details.zap_flags
mm: Introduce ZAP_FLAG_SKIP_SWAP
mm: Pass zap_flags into unmap_mapping_pages()
shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed
shmem/userfaultfd: Allow wr-protect none pte for file-backed mem
shmem/userfaultfd: Allow file-backed mem to be uffd wr-protected on thps
shmem/userfaultfd: Handle the leftover special swap ptes
shmem/userfaultfd: Pass over uffd-wp special swap pte when fork()
Part (2): Hugetlb support. We need to disable huge pmd sharing for uffd-wp
because it's not compatible, just like uffd minor mode. The rest are the
changes required to teach hugetlbfs to understand the special swap pte that
is introduced with the uffd-wp change:
mm/hugetlb: Drop __unmap_hugepage_range definition from hugetlb.h
hugetlb/userfaultfd: Hook page faults for uffd write protection
hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT
hugetlb: Pass vma into huge_pte_alloc()
hugetlb/userfaultfd: Forbid huge pmd sharing when uffd enabled
mm/hugetlb: Introduce huge version of special swap pte helpers
mm/hugetlb: Move flush_hugetlb_tlb_range() into hugetlb.h
hugetlb/userfaultfd: Unshare all pmds for hugetlbfs when register wp
hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler
hugetlb/userfaultfd: Allow wr-protect none ptes
hugetlb/userfaultfd: Only drop uffd-wp special pte if required
Part (3): Enable both features in code and test
userfaultfd: Enable write protection for shmem & hugetlbfs
userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs
Tests
=====
I've tested it with the userfaultfd kselftest program, and also with
umapsort [2], which should be even stricter. Page swapping in/out was tested
during the umapsort runs.
If anyone would like to try umapsort, you'll need an extensively hacked
version of the umap library [3], because by default umap only supports
anonymous memory. So to test it, build [3] first, then [2].
Any comments would be greatly welcomed. Thanks,
[1] https://github.com/xzpeter/linux/tree/uffd-wp-shmem-hugetlbfs
[2] https://github.com/LLNL/umap-apps
[3] https://github.com/xzpeter/umap/tree/peter-shmem-hugetlbfs
Peter Xu (24):
shmem/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
mm: Clear vmf->pte after pte_unmap_same() returns
mm/userfaultfd: Introduce special pte for unmapped file-backed mem
mm/swap: Introduce the idea of special swap ptes
shmem/userfaultfd: Handle uffd-wp special pte in page fault handler
mm: Drop first_index/last_index in zap_details
mm: Introduce zap_details.zap_flags
mm: Introduce ZAP_FLAG_SKIP_SWAP
mm: Pass zap_flags into unmap_mapping_pages()
shmem/userfaultfd: Persist uffd-wp bit across zapping for file-backed
shmem/userfaultfd: Allow wr-protect none pte for file-backed mem
shmem/userfaultfd: Allow file-backed mem to be uffd wr-protected on
thps
shmem/userfaultfd: Handle the leftover special swap ptes
shmem/userfaultfd: Pass over uffd-wp special swap pte when fork()
mm/hugetlb: Drop __unmap_hugepage_range definition from hugetlb.h
hugetlb/userfaultfd: Hook page faults for uffd write protection
hugetlb/userfaultfd: Take care of UFFDIO_COPY_MODE_WP
hugetlb/userfaultfd: Handle UFFDIO_WRITEPROTECT
mm/hugetlb: Introduce huge version of special swap pte helpers
hugetlb/userfaultfd: Handle uffd-wp special pte in hugetlb pf handler
hugetlb/userfaultfd: Allow wr-protect none ptes
hugetlb/userfaultfd: Only drop uffd-wp special pte if required
mm/userfaultfd: Enable write protection for shmem & hugetlbfs
userfaultfd/selftests: Enable uffd-wp for shmem/hugetlbfs
arch/arm64/kernel/mte.c | 2 +-
arch/x86/include/asm/pgtable.h | 28 +++
fs/dax.c | 10 +-
fs/hugetlbfs/inode.c | 15 +-
fs/proc/task_mmu.c | 14 +-
fs/userfaultfd.c | 39 ++--
include/asm-generic/hugetlb.h | 10 +
include/asm-generic/pgtable_uffd.h | 3 +
include/linux/hugetlb.h | 30 ++-
include/linux/mm.h | 48 ++++-
include/linux/mm_inline.h | 43 +++++
include/linux/shmem_fs.h | 5 +-
include/linux/swapops.h | 39 +++-
include/linux/userfaultfd_k.h | 47 +++++
include/uapi/linux/userfaultfd.h | 7 +-
mm/gup.c | 2 +-
mm/hmm.c | 2 +-
mm/hugetlb.c | 160 +++++++++++++---
mm/khugepaged.c | 14 +-
mm/madvise.c | 4 +-
mm/memcontrol.c | 2 +-
mm/memory.c | 226 +++++++++++++++++------
mm/migrate.c | 4 +-
mm/mincore.c | 2 +-
mm/mprotect.c | 63 ++++++-
mm/mremap.c | 2 +-
mm/page_vma_mapped.c | 6 +-
mm/rmap.c | 8 +
mm/shmem.c | 40 +++-
mm/swapfile.c | 2 +-
mm/truncate.c | 17 +-
mm/userfaultfd.c | 37 ++--
tools/testing/selftests/vm/userfaultfd.c | 9 +-
33 files changed, 753 insertions(+), 187 deletions(-)
--
2.26.2
This patch introduces a very special swap-like pte for file-backed memory.
Currently it's only defined for x86_64, but any arch that can properly
define the UFFD_WP_SWP_PTE_SPECIAL value as requested should conceptually
work too.
We will use this special pte to arm the ptes that got either unmapped or
swapped out for a file-backed region that was previously wr-protected. This
special pte triggers a page fault just like a swap entry, because it
satisfies pte_none()==false && pte_present()==false. Then we can revive the
special pte into a normal pte backed by the page cache.
This idea is greatly inspired by Hugh and Andrea in the discussion, which is
referenced in the links below.
The other idea (from Hugh) is to use swp_type==1 and swp_offset==0 as the
special pte. The current solution (as pointed out by Andrea) is slightly
preferred in that we don't need any swp_entry_t knowledge at all to trap
these accesses. Meanwhile, we also reuse _PAGE_SWP_UFFD_WP from the
anonymous swp entries.
This patch only introduces the special pte and its operators; it is not yet
used anywhere, so there is no functional change.
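For illustration only (not part of the patch), the rules can be
sanity-checked on x86 with the helpers added below:

	pte_t pte = UFFD_WP_SWP_PTE_SPECIAL;

	WARN_ON(pte_none(pte));                 /* (1) won't fault as missing */
	WARN_ON(pte_present(pte));              /* (2) recognized as swap     */
	WARN_ON(!pte_swp_uffd_wp(pte));         /* (3) tests like a swap pte  */
	WARN_ON(!pte_swp_uffd_wp_special(pte)); /* (4) holds no swap entry    */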
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/
Suggested-by: Andrea Arcangeli <[email protected]>
Suggested-by: Hugh Dickins <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
arch/x86/include/asm/pgtable.h | 28 ++++++++++++++++++++++++++++
include/asm-generic/pgtable_uffd.h | 3 +++
include/linux/userfaultfd_k.h | 21 +++++++++++++++++++++
3 files changed, 52 insertions(+)
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index a02c67291cfcb..379bae343dd16 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1329,6 +1329,34 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
#endif
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+
+/*
+ * This is a very special swap-like pte that marks this pte as "wr-protected"
+ * by userfaultfd-wp. It should only exist for file-backed memory where the
+ * page (previously got wr-protected) has been unmapped or swapped out.
+ *
+ * For anonymous memories, the userfaultfd-wp _PAGE_SWP_UFFD_WP bit is kept
+ * along with a real swp entry instead.
+ *
+ * Let's make some rules for this special pte:
+ *
+ * (1) pte_none()==false, so that it'll not trigger a missing page fault.
+ *
+ * (2) pte_present()==false, so that it's recognized as swap (is_swap_pte).
+ *
+ * (3) pte_swp_uffd_wp()==true, so it can be tested just like a swap pte that
+ * contains a valid swap entry, so that we can check a swap pte always
+ * using "is_swap_pte() && pte_swp_uffd_wp()" without caring about whether
+ * there's one swap entry inside of the pte.
+ *
+ * (4) It should not be a valid swap pte anywhere, so that when we see this pte
+ * we know it does not contain a swap entry.
+ *
+ * For x86, the simplest special pte which satisfies all of above should be the
+ * pte with only _PAGE_SWP_UFFD_WP bit set (where swp_type==swp_offset==0).
+ */
+#define UFFD_WP_SWP_PTE_SPECIAL __pte(_PAGE_SWP_UFFD_WP)
+
static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
{
return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 828966d4c2811..95e9811ce9d1f 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -2,6 +2,9 @@
#define _ASM_GENERIC_PGTABLE_UFFD_H
#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+
+#define UFFD_WP_SWP_PTE_SPECIAL __pte(0)
+
static __always_inline int pte_uffd_wp(pte_t pte)
{
return 0;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 794d1538b8bac..bc733512c6905 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -140,6 +140,17 @@ extern int userfaultfd_unmap_prep(struct vm_area_struct *vma,
extern void userfaultfd_unmap_complete(struct mm_struct *mm,
struct list_head *uf);
+static inline pte_t pte_swp_mkuffd_wp_special(struct vm_area_struct *vma)
+{
+ WARN_ON_ONCE(vma_is_anonymous(vma));
+ return UFFD_WP_SWP_PTE_SPECIAL;
+}
+
+static inline bool pte_swp_uffd_wp_special(pte_t pte)
+{
+ return pte_same(pte, UFFD_WP_SWP_PTE_SPECIAL);
+}
+
#else /* CONFIG_USERFAULTFD */
/* mm helpers */
@@ -229,6 +240,16 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm,
{
}
+static inline pte_t pte_swp_mkuffd_wp_special(struct vm_area_struct *vma)
+{
+ return __pte(0);
+}
+
+static inline bool pte_swp_uffd_wp_special(pte_t pte)
+{
+ return false;
+}
+
#endif /* CONFIG_USERFAULTFD */
#endif /* _LINUX_USERFAULTFD_K_H */
--
2.26.2
Give unmap_mapping_pages() more flexibility by letting callers specify zap
flags, so that more information can be passed in than just "whether we'd
also like to zap cow pages". With the new flags parameter, we can remove the
boolean even_cows parameter, because even_cows==false equals
zap_flags==ZAP_FLAG_CHECK_MAPPING, while even_cows==true means no zap flag
is passed in (though in most cases we had even_cows==false).
No functional change intended.
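For callers the conversion is mechanical; as a sketch of the even_cows
mapping:

	/* even_cows == false becomes: */
	unmap_mapping_pages(mapping, start, nr, ZAP_FLAG_CHECK_MAPPING);
	/* even_cows == true becomes: */
	unmap_mapping_pages(mapping, start, nr, 0);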
Signed-off-by: Peter Xu <[email protected]>
---
fs/dax.c | 10 ++++++----
include/linux/mm.h | 4 ++--
mm/khugepaged.c | 3 ++-
mm/memory.c | 15 ++++++++-------
mm/truncate.c | 11 +++++++----
5 files changed, 25 insertions(+), 18 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 69216241392f2..20ca8d7d36ebb 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -517,7 +517,7 @@ static void *grab_mapping_entry(struct xa_state *xas,
xas_unlock_irq(xas);
unmap_mapping_pages(mapping,
xas->xa_index & ~PG_PMD_COLOUR,
- PG_PMD_NR, false);
+ PG_PMD_NR, ZAP_FLAG_CHECK_MAPPING);
xas_reset(xas);
xas_lock_irq(xas);
}
@@ -612,7 +612,8 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping,
* guaranteed to either see new references or prevent new
* references from being established.
*/
- unmap_mapping_pages(mapping, start_idx, end_idx - start_idx + 1, 0);
+ unmap_mapping_pages(mapping, start_idx, end_idx - start_idx + 1,
+ ZAP_FLAG_CHECK_MAPPING);
xas_lock_irq(&xas);
xas_for_each(&xas, entry, end_idx) {
@@ -743,9 +744,10 @@ static void *dax_insert_entry(struct xa_state *xas,
/* we are replacing a zero page with block mapping */
if (dax_is_pmd_entry(entry))
unmap_mapping_pages(mapping, index & ~PG_PMD_COLOUR,
- PG_PMD_NR, false);
+ PG_PMD_NR, ZAP_FLAG_CHECK_MAPPING);
else /* pte entry */
- unmap_mapping_pages(mapping, index, 1, false);
+ unmap_mapping_pages(mapping, index, 1,
+ ZAP_FLAG_CHECK_MAPPING);
}
xas_reset(xas);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2227e9107e53e..b8aa81a064a55 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1784,7 +1784,7 @@ extern int fixup_user_fault(struct mm_struct *mm,
unsigned long address, unsigned int fault_flags,
bool *unlocked);
void unmap_mapping_pages(struct address_space *mapping,
- pgoff_t start, pgoff_t nr, bool even_cows);
+ pgoff_t start, pgoff_t nr, unsigned long zap_flags);
void unmap_mapping_range(struct address_space *mapping,
loff_t const holebegin, loff_t const holelen, int even_cows);
#else
@@ -1804,7 +1804,7 @@ static inline int fixup_user_fault(struct mm_struct *mm, unsigned long address,
return -EFAULT;
}
static inline void unmap_mapping_pages(struct address_space *mapping,
- pgoff_t start, pgoff_t nr, bool even_cows) { }
+ pgoff_t start, pgoff_t nr, unsigned long zap_flags) { }
static inline void unmap_mapping_range(struct address_space *mapping,
loff_t const holebegin, loff_t const holelen, int even_cows) { }
#endif
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index e8b299aa32d06..64a36cd375359 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1831,7 +1831,8 @@ static void collapse_file(struct mm_struct *mm,
}
if (page_mapped(page))
- unmap_mapping_pages(mapping, index, 1, false);
+ unmap_mapping_pages(mapping, index, 1,
+ ZAP_FLAG_CHECK_MAPPING);
xas_lock_irq(&xas);
xas_set(&xas, index);
diff --git a/mm/memory.c b/mm/memory.c
index 5325c1c2cbd78..189f60853a51d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3224,7 +3224,10 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root,
* @mapping: The address space containing pages to be unmapped.
* @start: Index of first page to be unmapped.
* @nr: Number of pages to be unmapped. 0 to unmap to end of file.
- * @even_cows: Whether to unmap even private COWed pages.
+ * @zap_flags: Zap flags for the process. E.g., when ZAP_FLAG_CHECK_MAPPING is
+ * passed into it, we will only zap the pages that are in the same mapping
+ * specified in the @mapping parameter; otherwise we will not check mapping,
+ * IOW cow pages will be zapped too.
*
* Unmap the pages in this address space from any userspace process which
* has them mmaped. Generally, you want to remove COWed pages as well when
@@ -3232,17 +3235,14 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root,
* cache.
*/
void unmap_mapping_pages(struct address_space *mapping, pgoff_t start,
- pgoff_t nr, bool even_cows)
+ pgoff_t nr, unsigned long zap_flags)
{
pgoff_t first_index = start, last_index = start + nr - 1;
struct zap_details details = {
.zap_mapping = mapping,
- .zap_flags = ZAP_FLAG_SKIP_SWAP,
+ .zap_flags = zap_flags | ZAP_FLAG_SKIP_SWAP,
};
- if (!even_cows)
- details.zap_flags |= ZAP_FLAG_CHECK_MAPPING;
-
if (last_index < first_index)
last_index = ULONG_MAX;
@@ -3284,7 +3284,8 @@ void unmap_mapping_range(struct address_space *mapping,
hlen = ULONG_MAX - hba + 1;
}
- unmap_mapping_pages(mapping, hba, hlen, even_cows);
+ unmap_mapping_pages(mapping, hba, hlen, even_cows ?
+ 0 : ZAP_FLAG_CHECK_MAPPING);
}
EXPORT_SYMBOL(unmap_mapping_range);
diff --git a/mm/truncate.c b/mm/truncate.c
index 95af244b112a0..ba2cbe300e83e 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -172,7 +172,8 @@ truncate_cleanup_page(struct address_space *mapping, struct page *page)
{
if (page_mapped(page)) {
unsigned int nr = thp_nr_pages(page);
- unmap_mapping_pages(mapping, page->index, nr, false);
+ unmap_mapping_pages(mapping, page->index, nr,
+ ZAP_FLAG_CHECK_MAPPING);
}
if (page_has_private(page))
@@ -652,14 +653,15 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
* Zap the rest of the file in one hit.
*/
unmap_mapping_pages(mapping, index,
- (1 + end - index), false);
+ (1 + end - index),
+ ZAP_FLAG_CHECK_MAPPING);
did_range_unmap = 1;
} else {
/*
* Just zap this page
*/
unmap_mapping_pages(mapping, index,
- 1, false);
+ 1, ZAP_FLAG_CHECK_MAPPING);
}
}
BUG_ON(page_mapped(page));
@@ -685,7 +687,8 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
* get remapped later.
*/
if (dax_mapping(mapping)) {
- unmap_mapping_pages(mapping, start, end - start + 1, false);
+ unmap_mapping_pages(mapping, start, end - start + 1,
+ ZAP_FLAG_CHECK_MAPPING);
}
out:
cleancache_invalidate_inode(mapping);
--
2.26.2
We used to have special swap entries, like migration entries, hw-poison
entries, device private entries, etc.
Those "special swap entries" reside in the range that they need to be at least
swap entries first, and their types are decided by swp_type(entry).
This patch introduces another idea called "special swap ptes".
It's very easy to confuse them with "special swap entries", but a special
swap pte should never contain a swap entry at all. That means it is illegal
to call pte_to_swp_entry() upon a special swap pte.
Make the uffd-wp special pte the first special swap pte.
Before this patch, is_swap_pte()==true means one of the below:
(a.1) The pte has a normal swap entry (non_swap_entry()==false). For
example, when an anonymous page got swapped out.
(a.2) The pte has a special swap entry (non_swap_entry()==true). For
example, a migration entry, a hw-poison entry, etc.
After this patch, is_swap_pte()==true means one of the below, where case (b) is
added:
(a) The pte contains a swap entry.
(a.1) The pte has a normal swap entry (non_swap_entry()==false). For
example, when an anonymous page got swapped out.
(a.2) The pte has a special swap entry (non_swap_entry()==true). For
example, a migration entry, a hw-poison entry, etc.
(b) The pte does not contain a swap entry at all (so it cannot be passed
into pte_to_swp_entry()). For example, uffd-wp special swap pte.
Teach the whole mm core about this new idea. It's done by introducing
another helper called pte_has_swap_entry(), which covers cases (a.1) and
(a.2). Before this patch, it is the same as is_swap_pte() because there's no
special swap pte yet. Now, most of the previous users of is_swap_pte() in mm
core need to use the new helper pte_has_swap_entry() instead, to make sure
we won't try to parse a swap entry from a swap special pte (which does not
contain a swap entry at all!). We either handle the swap special pte
explicitly, or it'll naturally fall into the default "else" paths.
Warn properly (e.g., in do_swap_page()) when we see a special swap pte: we
should never call do_swap_page() upon those ptes, so just bail out early if
that happens.
Signed-off-by: Peter Xu <[email protected]>
---
arch/arm64/kernel/mte.c | 2 +-
fs/proc/task_mmu.c | 14 ++++++++------
include/linux/swapops.h | 39 ++++++++++++++++++++++++++++++++++++++-
mm/gup.c | 2 +-
mm/hmm.c | 2 +-
mm/khugepaged.c | 11 ++++++++++-
mm/madvise.c | 4 ++--
mm/memcontrol.c | 2 +-
mm/memory.c | 7 +++++++
mm/migrate.c | 4 ++--
mm/mincore.c | 2 +-
mm/mprotect.c | 2 +-
mm/mremap.c | 2 +-
mm/page_vma_mapped.c | 6 +++---
mm/swapfile.c | 2 +-
15 files changed, 78 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index b3c70a612c7a9..ebe213cba9136 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -30,7 +30,7 @@ static void mte_sync_page_tags(struct page *page, pte_t *ptep, bool check_swap)
{
pte_t old_pte = READ_ONCE(*ptep);
- if (check_swap && is_swap_pte(old_pte)) {
+ if (check_swap && pte_has_swap_entry(old_pte)) {
swp_entry_t entry = pte_to_swp_entry(old_pte);
if (!non_swap_entry(entry) && mte_restore_tags(entry, page))
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index fc9784544b241..4c95cc57a66a8 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -498,7 +498,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
if (pte_present(*pte)) {
page = vm_normal_page(vma, addr, *pte);
- } else if (is_swap_pte(*pte)) {
+ } else if (pte_has_swap_entry(*pte)) {
swp_entry_t swpent = pte_to_swp_entry(*pte);
if (!non_swap_entry(swpent)) {
@@ -518,8 +518,10 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
page = migration_entry_to_page(swpent);
else if (is_device_private_entry(swpent))
page = device_private_entry_to_page(swpent);
- } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap
- && pte_none(*pte))) {
+ } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) &&
+ mss->check_shmem_swap &&
+ /* Here swap special pte is the same as none pte */
+ (pte_none(*pte) || is_swap_special_pte(*pte)))) {
page = xa_load(&vma->vm_file->f_mapping->i_pages,
linear_page_index(vma, addr));
if (xa_is_value(page))
@@ -691,7 +693,7 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
if (pte_present(*pte)) {
page = vm_normal_page(vma, addr, *pte);
- } else if (is_swap_pte(*pte)) {
+ } else if (pte_has_swap_entry(*pte)) {
swp_entry_t swpent = pte_to_swp_entry(*pte);
if (is_migration_entry(swpent))
@@ -1075,7 +1077,7 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
ptent = pte_wrprotect(old_pte);
ptent = pte_clear_soft_dirty(ptent);
ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
- } else if (is_swap_pte(ptent)) {
+ } else if (pte_has_swap_entry(ptent)) {
ptent = pte_swp_clear_soft_dirty(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
}
@@ -1375,7 +1377,7 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
page = vm_normal_page(vma, addr, pte);
if (pte_soft_dirty(pte))
flags |= PM_SOFT_DIRTY;
- } else if (is_swap_pte(pte)) {
+ } else if (pte_has_swap_entry(pte)) {
swp_entry_t entry;
if (pte_swp_soft_dirty(pte))
flags |= PM_SOFT_DIRTY;
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 7dd57303bb0c3..7b7387d2892ff 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -5,6 +5,7 @@
#include <linux/radix-tree.h>
#include <linux/bug.h>
#include <linux/mm_types.h>
+#include <linux/userfaultfd_k.h>
#ifdef CONFIG_MMU
@@ -52,12 +53,48 @@ static inline pgoff_t swp_offset(swp_entry_t entry)
return entry.val & SWP_OFFSET_MASK;
}
-/* check whether a pte points to a swap entry */
+/*
+ * is_swap_pte() returns true for three cases:
+ *
+ * (a) The pte contains a swap entry.
+ *
+ * (a.1) The pte has a normal swap entry (non_swap_entry()==false). For
+ * example, when an anonymous page got swapped out.
+ *
+ * (a.2) The pte has a special swap entry (non_swap_entry()==true). For
+ * example, a migration entry, a hw-poison entry, etc.
+ *
+ * (b) The pte does not contain a swap entry at all (so it cannot be passed
+ * into pte_to_swp_entry()). For example, uffd-wp special swap pte.
+ */
static inline int is_swap_pte(pte_t pte)
{
return !pte_none(pte) && !pte_present(pte);
}
+/*
+ * A swap-like special pte should only be used as special marker to trigger a
+ * page fault. We should treat them similarly as pte_none() in most cases,
+ * except that it may contain some special information that can persist within
+ * the pte. Currently the only special swap pte is UFFD_WP_SWP_PTE_SPECIAL.
+ *
+ * Note: we should never call pte_to_swp_entry() upon a special swap pte,
+ * Because a swap special pte does not contain a swap entry!
+ */
+static inline bool is_swap_special_pte(pte_t pte)
+{
+ return pte_swp_uffd_wp_special(pte);
+}
+
+/*
+ * Returns true if the pte contains a swap entry. This includes not only the
+ * normal swp entry case, but also for migration entries, etc.
+ */
+static inline bool pte_has_swap_entry(pte_t pte)
+{
+ return is_swap_pte(pte) && !is_swap_special_pte(pte);
+}
+
/*
* Convert the arch-dependent pte representation of a swp_entry_t into an
* arch-independent swp_entry_t.
diff --git a/mm/gup.c b/mm/gup.c
index aa09535cf4d47..63a079e361a3d 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -473,7 +473,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
*/
if (likely(!(flags & FOLL_MIGRATION)))
goto no_page;
- if (pte_none(pte))
+ if (!pte_has_swap_entry(pte))
goto no_page;
entry = pte_to_swp_entry(pte);
if (!is_migration_entry(entry))
diff --git a/mm/hmm.c b/mm/hmm.c
index 943cb2ba44423..4dba5debf1630 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -237,7 +237,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
pte_t pte = *ptep;
uint64_t pfn_req_flags = *hmm_pfn;
- if (pte_none(pte)) {
+ if (pte_none(pte) || is_swap_special_pte(pte)) {
required_fault =
hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0);
if (required_fault)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index ea74da3232ab6..e8b299aa32d06 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1019,7 +1019,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
vmf.pte = pte_offset_map(pmd, address);
vmf.orig_pte = *vmf.pte;
- if (!is_swap_pte(vmf.orig_pte)) {
+ if (!pte_has_swap_entry(vmf.orig_pte)) {
pte_unmap(vmf.pte);
continue;
}
@@ -1246,6 +1246,15 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
_pte++, _address += PAGE_SIZE) {
pte_t pteval = *_pte;
if (is_swap_pte(pteval)) {
+ if (is_swap_special_pte(pteval)) {
+ /*
+ * Reuse SCAN_PTE_UFFD_WP. If there will be
+ * new users of is_swap_special_pte(), we'd
+ * better introduce a new result type.
+ */
+ result = SCAN_PTE_UFFD_WP;
+ goto out_unmap;
+ }
if (++unmapped <= khugepaged_max_ptes_swap) {
/*
* Always be strict with uffd-wp
diff --git a/mm/madvise.c b/mm/madvise.c
index 01fef79ac761b..c77499d21aac9 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -202,7 +202,7 @@ static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
pte = *(orig_pte + ((index - start) / PAGE_SIZE));
pte_unmap_unlock(orig_pte, ptl);
- if (pte_present(pte) || pte_none(pte))
+ if (!pte_has_swap_entry(pte))
continue;
entry = pte_to_swp_entry(pte);
if (unlikely(non_swap_entry(entry)))
@@ -594,7 +594,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
for (; addr != end; pte++, addr += PAGE_SIZE) {
ptent = *pte;
- if (pte_none(ptent))
+ if (pte_none(ptent) || is_swap_special_pte(ptent))
continue;
/*
* If the pte has swp_entry, just clear page table to
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3004afb6d0901..f3f21ce908dd2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5550,7 +5550,7 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma,
if (pte_present(ptent))
page = mc_handle_present_pte(vma, addr, ptent);
- else if (is_swap_pte(ptent))
+ else if (pte_has_swap_entry(ptent))
page = mc_handle_swap_pte(vma, ptent, &ent);
else if (pte_none(ptent))
page = mc_handle_file_pte(vma, addr, ptent, &ent);
diff --git a/mm/memory.c b/mm/memory.c
index 955a0bb6b855c..235857ccfaa11 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3323,6 +3323,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (!pte_unmap_same(vmf))
goto out;
+ /*
+ * We should never call do_swap_page upon a swap special pte; just be
+ * safe to bail out if it happens.
+ */
+ if (WARN_ON_ONCE(is_swap_special_pte(vmf->orig_pte)))
+ goto out;
+
entry = pte_to_swp_entry(vmf->orig_pte);
if (unlikely(non_swap_entry(entry))) {
if (is_migration_entry(entry)) {
diff --git a/mm/migrate.c b/mm/migrate.c
index 6b37d00890ca5..415961ed7a6cb 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -287,7 +287,7 @@ void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
spin_lock(ptl);
pte = *ptep;
- if (!is_swap_pte(pte))
+ if (!pte_has_swap_entry(pte))
goto out;
entry = pte_to_swp_entry(pte);
@@ -2381,7 +2381,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
pte = *ptep;
- if (pte_none(pte)) {
+ if (pte_none(pte) || is_swap_special_pte(pte)) {
if (vma_is_anonymous(vma)) {
mpfn = MIGRATE_PFN_MIGRATE;
migrate->cpages++;
diff --git a/mm/mincore.c b/mm/mincore.c
index 9122676b54d67..5728c3e6473f0 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -121,7 +121,7 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
for (; addr != end; ptep++, addr += PAGE_SIZE) {
pte_t pte = *ptep;
- if (pte_none(pte))
+ if (pte_none(pte) || is_swap_special_pte(pte))
__mincore_unmapped_range(addr, addr + PAGE_SIZE,
vma, vec);
else if (pte_present(pte))
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 94188df1ee557..b3def0a102bf4 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -139,7 +139,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
}
ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
pages++;
- } else if (is_swap_pte(oldpte)) {
+ } else if (pte_has_swap_entry(oldpte)) {
swp_entry_t entry = pte_to_swp_entry(oldpte);
pte_t newpte;
diff --git a/mm/mremap.c b/mm/mremap.c
index d22629ff8f3c0..67d2b84671a5a 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -124,7 +124,7 @@ static pte_t move_soft_dirty_pte(pte_t pte)
#ifdef CONFIG_MEM_SOFT_DIRTY
if (pte_present(pte))
pte = pte_mksoft_dirty(pte);
- else if (is_swap_pte(pte))
+ else if (pte_has_swap_entry(pte))
pte = pte_swp_mksoft_dirty(pte);
#endif
return pte;
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 86e3a3688d592..6b51759d9203f 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -36,7 +36,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
* For more details on device private memory see HMM
* (include/linux/hmm.h or mm/hmm.c).
*/
- if (is_swap_pte(*pvmw->pte)) {
+ if (pte_has_swap_entry(*pvmw->pte)) {
swp_entry_t entry;
/* Handle un-addressable ZONE_DEVICE memory */
@@ -89,7 +89,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
if (pvmw->flags & PVMW_MIGRATION) {
swp_entry_t entry;
- if (!is_swap_pte(*pvmw->pte))
+ if (!pte_has_swap_entry(*pvmw->pte))
return false;
entry = pte_to_swp_entry(*pvmw->pte);
@@ -97,7 +97,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
return false;
pfn = migration_entry_to_pfn(entry);
- } else if (is_swap_pte(*pvmw->pte)) {
+ } else if (pte_has_swap_entry(*pvmw->pte)) {
swp_entry_t entry;
/* Handle un-addressable ZONE_DEVICE memory */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 149e77454e3c5..8aa4be0746593 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1964,7 +1964,7 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
si = swap_info[type];
pte = pte_offset_map(pmd, addr);
do {
- if (!is_swap_pte(*pte))
+ if (!pte_has_swap_entry(*pte))
continue;
entry = pte_to_swp_entry(*pte);
--
2.26.2
Firstly, the comment in zap_pte_range() is misleading because it checks
against details rather than details->check_mapping, so it doesn't match what
the code does. Meanwhile, it's also confusing in that it doesn't explain why
passing in the details pointer means skipping all swap entries. A new user
of zap_details could easily miss this fact without reading all the way down
to zap_pte_range(), because there's no comment on zap_details mentioning it
at all, so swap entries could be erroneously skipped without being noticed.
This partly reverts 3e8715fdc03e ("mm: drop
zap_details::check_swap_entries"), but introduces the ZAP_FLAG_SKIP_SWAP
flag, which means the opposite of the previous "details" parameter: the
caller should explicitly set this to skip swap entries; otherwise swap
entries will always be considered (which is still the major case here).
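A sketch of the new rule for callers: skipping swap entries is no longer
implied by a non-NULL details pointer, it must be requested explicitly:

	struct zap_details details = {
		.zap_mapping = mapping,
		.zap_flags = ZAP_FLAG_SKIP_SWAP,	/* explicit opt-in */
	};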
Cc: Kirill A. Shutemov <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
include/linux/mm.h | 12 ++++++++++++
mm/memory.c | 8 +++++---
2 files changed, 17 insertions(+), 3 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 39c944bf7ed3a..2227e9107e53e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1704,6 +1704,8 @@ extern void user_shm_unlock(size_t, struct user_struct *);
/* Whether to check page->mapping when zapping */
#define ZAP_FLAG_CHECK_MAPPING BIT(0)
+/* Whether to skip zapping swap entries */
+#define ZAP_FLAG_SKIP_SWAP BIT(1)
/*
* Parameter block passed down to zap_pte_range in exceptional cases.
@@ -1726,6 +1728,16 @@ zap_check_mapping_skip(struct zap_details *details, struct page *page)
return details->zap_mapping != page_rmapping(page);
}
+/* Return true if skip swap entries, false otherwise */
+static inline bool
+zap_skip_swap(struct zap_details *details)
+{
+ if (!details)
+ return false;
+
+ return details->zap_flags & ZAP_FLAG_SKIP_SWAP;
+}
+
struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
pte_t pte);
struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index 94954436544f7..5325c1c2cbd78 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1284,8 +1284,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
continue;
}
- /* If details->check_mapping, we leave swap entries. */
- if (unlikely(details))
+ if (unlikely(zap_skip_swap(details)))
continue;
if (!non_swap_entry(entry))
@@ -3236,7 +3235,10 @@ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start,
pgoff_t nr, bool even_cows)
{
pgoff_t first_index = start, last_index = start + nr - 1;
- struct zap_details details = { .zap_mapping = mapping };
+ struct zap_details details = {
+ .zap_mapping = mapping,
+ .zap_flags = ZAP_FLAG_SKIP_SWAP,
+ };
if (!even_cows)
details.zap_flags |= ZAP_FLAG_CHECK_MAPPING;
--
2.26.2
File-backed memory is prone to being unmapped at any time. That means all
information in the pte will be dropped, including the uffd-wp flag.
Since the uffd-wp info cannot be stored in the page cache or swap cache,
persist this wr-protect information by installing the special uffd-wp marker
pte when we're about to unmap a uffd wr-protected pte. When the pte is
accessed again, we will know it was previously wr-protected by recognizing
the special pte.
Meanwhile, add a new flag ZAP_FLAG_DROP_FILE_UFFD_WP for the cases when we
don't want to persist this information, for example, when destroying the
whole vma or punching a hole in a shmem file. For the latter, we can only
drop the uffd-wp bit when holding the page lock, which means the
unmap_mapping_range() in shmem_fallocate() still requires zapping without
ZAP_FLAG_DROP_FILE_UFFD_WP because it's still racy with the page faults.
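As a sketch of when the new flag is (and is not) safe to use, per the
reasoning above (call sites simplified here; e.g. shmem_fallocate() actually
goes through unmap_mapping_range()):

	/* truncation, page lock held: safe to drop the uffd-wp bit */
	unmap_mapping_pages(mapping, index, nr,
			    ZAP_FLAG_CHECK_MAPPING |
			    ZAP_FLAG_DROP_FILE_UFFD_WP);

	/* shmem_fallocate() pre-unmap, racy with faults: must keep it */
	unmap_mapping_pages(mapping, start, nr, ZAP_FLAG_CHECK_MAPPING);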
Signed-off-by: Peter Xu <[email protected]>
---
include/linux/mm.h | 11 ++++++++++
include/linux/mm_inline.h | 43 +++++++++++++++++++++++++++++++++++++++
mm/memory.c | 42 +++++++++++++++++++++++++++++++++++++-
mm/rmap.c | 8 ++++++++
mm/truncate.c | 8 +++++++-
5 files changed, 110 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b8aa81a064a55..d6790ab0cf575 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1706,6 +1706,8 @@ extern void user_shm_unlock(size_t, struct user_struct *);
#define ZAP_FLAG_CHECK_MAPPING BIT(0)
/* Whether to skip zapping swap entries */
#define ZAP_FLAG_SKIP_SWAP BIT(1)
+/* Whether to completely drop uffd-wp entries for file-backed memory */
+#define ZAP_FLAG_DROP_FILE_UFFD_WP BIT(2)
/*
* Parameter block passed down to zap_pte_range in exceptional cases.
@@ -1738,6 +1740,15 @@ zap_skip_swap(struct zap_details *details)
return details->zap_flags & ZAP_FLAG_SKIP_SWAP;
}
+static inline bool
+zap_drop_file_uffd_wp(struct zap_details *details)
+{
+ if (!details)
+ return false;
+
+ return details->zap_flags & ZAP_FLAG_DROP_FILE_UFFD_WP;
+}
+
struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
pte_t pte);
struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 355ea1ee32bd7..c29a6ef3a642a 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -4,6 +4,8 @@
#include <linux/huge_mm.h>
#include <linux/swap.h>
+#include <linux/userfaultfd_k.h>
+#include <linux/swapops.h>
/**
* page_is_file_lru - should the page be on a file LRU or anon LRU?
@@ -104,4 +106,45 @@ static __always_inline void del_page_from_lru_list(struct page *page,
update_lru_size(lruvec, page_lru(page), page_zonenum(page),
-thp_nr_pages(page));
}
+
+/*
+ * If this pte is wr-protected by uffd-wp in any form, arm the special pte to
+ * replace a none pte. NOTE! This should only be called when *pte is already
+ * cleared so we will never accidentally replace something valuable. Meanwhile
+ * none pte also means we are not demoting the pte so if tlb flushed then we
+ * don't need to do it again; otherwise if tlb flush is postponed then it's
+ * even better.
+ *
+ * Must be called with pgtable lock held.
+ */
+static inline void
+pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
+ pte_t *pte, pte_t pteval)
+{
+#ifdef CONFIG_USERFAULTFD
+ bool arm_uffd_pte = false;
+
+ /* The current status of the pte should be "cleared" before calling */
+ WARN_ON_ONCE(!pte_none(*pte));
+
+ if (vma_is_anonymous(vma))
+ return;
+
+ /* A uffd-wp wr-protected normal pte */
+ if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval)))
+ arm_uffd_pte = true;
+
+ /*
+ * A uffd-wp wr-protected swap pte. Note: this should even work for
+ * pte_swp_uffd_wp_special() too.
+ */
+ if (unlikely(is_swap_pte(pteval) && pte_swp_uffd_wp(pteval)))
+ arm_uffd_pte = true;
+
+ if (unlikely(arm_uffd_pte))
+ set_pte_at(vma->vm_mm, addr, pte,
+ pte_swp_mkuffd_wp_special(vma));
+#endif
+}
+
#endif
diff --git a/mm/memory.c b/mm/memory.c
index 189f60853a51d..872fb59192277 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -73,6 +73,7 @@
#include <linux/perf_event.h>
#include <linux/ptrace.h>
#include <linux/vmalloc.h>
+#include <linux/mm_inline.h>
#include <trace/events/kmem.h>
@@ -1210,6 +1211,21 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
return ret;
}
+/*
+ * This function makes sure that we'll replace the none pte with an uffd-wp
+ * swap special pte marker when necessary. Must be with the pgtable lock held.
+ */
+static inline void
+zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *pte,
+ struct zap_details *details, pte_t pteval)
+{
+ if (zap_drop_file_uffd_wp(details))
+ return;
+
+ pte_install_uffd_wp_if_needed(vma, addr, pte, pteval);
+}
+
static unsigned long zap_pte_range(struct mmu_gather *tlb,
struct vm_area_struct *vma, pmd_t *pmd,
unsigned long addr, unsigned long end,
@@ -1247,6 +1263,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
ptent = ptep_get_and_clear_full(mm, addr, pte,
tlb->fullmm);
tlb_remove_tlb_entry(tlb, pte, addr);
+ zap_install_uffd_wp_if_needed(vma, addr, pte, details,
+ ptent);
if (unlikely(!page))
continue;
@@ -1271,6 +1289,22 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
continue;
}
+ /*
+ * If this is a special uffd-wp marker pte... Drop it only if
+ * enforced to do so.
+ */
+ if (unlikely(is_swap_special_pte(ptent))) {
+ WARN_ON_ONCE(!pte_swp_uffd_wp_special(ptent));
+ /*
+ * If this is a common unmap of ptes, keep this as is.
+ * Drop it only if this is a whole-vma destruction.
+ */
+ if (zap_drop_file_uffd_wp(details))
+ ptep_get_and_clear_full(mm, addr, pte,
+ tlb->fullmm);
+ continue;
+ }
+
entry = pte_to_swp_entry(ptent);
if (is_device_private_entry(entry)) {
struct page *page = device_private_entry_to_page(entry);
@@ -1281,6 +1315,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
rss[mm_counter(page)]--;
page_remove_rmap(page, false);
put_page(page);
+ zap_install_uffd_wp_if_needed(vma, addr, pte, details,
+ ptent);
continue;
}
@@ -1298,6 +1334,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
if (unlikely(!free_swap_and_cache(entry)))
print_bad_pte(vma, addr, ptent, NULL);
pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
+ zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent);
} while (pte++, addr += PAGE_SIZE, addr != end);
add_mm_rss_vec(mm, rss);
@@ -1497,12 +1534,15 @@ void unmap_vmas(struct mmu_gather *tlb,
unsigned long end_addr)
{
struct mmu_notifier_range range;
+ struct zap_details details = {
+ .zap_flags = ZAP_FLAG_DROP_FILE_UFFD_WP,
+ };
mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
start_addr, end_addr);
mmu_notifier_invalidate_range_start(&range);
for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next)
- unmap_single_vma(tlb, vma, start_addr, end_addr, NULL);
+ unmap_single_vma(tlb, vma, start_addr, end_addr, &details);
mmu_notifier_invalidate_range_end(&range);
}
diff --git a/mm/rmap.c b/mm/rmap.c
index b0fc27e77d6d7..5e25c57164fcf 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -72,6 +72,7 @@
#include <linux/page_idle.h>
#include <linux/memremap.h>
#include <linux/userfaultfd_k.h>
+#include <linux/mm_inline.h>
#include <asm/tlbflush.h>
@@ -1571,6 +1572,13 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
pteval = ptep_clear_flush(vma, address, pvmw.pte);
}
+ /*
+ * Now the pte is cleared. If this is uffd-wp armed pte, we
+ * may want to replace a none pte with a marker pte if it's
+ * file-backed, so we don't lose the tracking information.
+ */
+ pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval);
+
/* Move the dirty bit to the page. Now the pte is gone. */
if (pte_dirty(pteval))
set_page_dirty(page);
diff --git a/mm/truncate.c b/mm/truncate.c
index ba2cbe300e83e..65fed21e52bd0 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -173,7 +173,13 @@ truncate_cleanup_page(struct address_space *mapping, struct page *page)
if (page_mapped(page)) {
unsigned int nr = thp_nr_pages(page);
unmap_mapping_pages(mapping, page->index, nr,
- ZAP_FLAG_CHECK_MAPPING);
+ ZAP_FLAG_CHECK_MAPPING |
+ /*
+ * Now it's safe to drop uffd-wp because
+ * we're with page lock, and the page is
+ * being truncated.
+ */
+ ZAP_FLAG_DROP_FILE_UFFD_WP);
}
if (page_has_private(page))
--
2.26.2
Note that the special uffd-wp swap pte can be left over even if the page
under the pte got evicted. Normally when evicting a page, we will unmap the
ptes by walking through the reverse mapping. However, we never tracked such
information for the special swap ptes because they're not real mappings but
just markers. So we need to take care of the case where we see a marker that
is actually meaningless (because the page behind it got evicted).
We have already taken care of that in e.g. alloc_set_pte(), where we'll
treat the special swap pte as pte_none() when necessary. However, we also
need to teach userfaultfd itself, on both UFFDIO_COPY and page fault
handling, so that everything will still work as expected.
Signed-off-by: Peter Xu <[email protected]>
---
fs/userfaultfd.c | 15 +++++++++++++++
mm/shmem.c | 13 ++++++++++++-
2 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 5dd78238cc156..b34486a88b5f3 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -329,6 +329,21 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
*/
if (pte_none(*pte))
ret = true;
+ /*
+ * We also treat the swap special uffd-wp pte as the pte_none() here.
+ * This should in most cases be a missing event, as we never handle
+ * wr-protect upon a special uffd-wp swap pte - it should first be
+ * converted into a normal read request before handling wp. It just
+ * means the page/swap cache that backing this pte is gone, so this
+ * special pte is leftover.
+ *
+ * We can't simply replace it with a none pte because we're not with
+ * the pgtable lock here. Instead of taking it and clearing the pte,
+ * the easy way is to let UFFDIO_COPY understand this pte too when
+ * trying to install a new page onto it.
+ */
+ if (pte_swp_uffd_wp_special(*pte))
+ ret = true;
if (!pte_write(*pte) && (reason & VM_UFFD_WP))
ret = true;
pte_unmap(pte);
diff --git a/mm/shmem.c b/mm/shmem.c
index 8fbf7680f044c..a1f21736ad68e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2453,7 +2453,18 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
goto out_release_unlock;
ret = -EEXIST;
- if (!pte_none(*dst_pte))
+ /*
+ * Besides the none pte, we also allow UFFDIO_COPY to install a pte
+ * onto the uffd-wp swap special pte, because that pte should be the
+ * same as a pte_none() just in that it contains wr-protect information
+ * (which could only be dropped when unmap the memory).
+ *
+ * It's safe to drop that marker because we know this is part of a
+ * MISSING fault, and the caller is very clear about this page missing
+ * rather than wr-protected. Then we're sure the wr-protect bit is
+ * just a leftover so it's useless already.
+ */
+ if (!pte_none(*dst_pte) && !pte_swp_uffd_wp_special(*dst_pte))
goto out_release_unlock;
lru_cache_add(page);
--
2.26.2
Instead of introducing one variable for every new zap_details field, let's
introduce a flags field that can encode true/false information.
Let's start by using this flag to clean up the only check_mapping variable.
Firstly, the name "check_mapping" implies it is a boolean, but actually it
stores the mapping itself, just in a way that it won't be set if we don't
want to check the mapping.
To make things clearer, introduce the first zap flag ZAP_FLAG_CHECK_MAPPING,
so that we only check against the mapping if this bit is set. At the same
time, rename check_mapping into zap_mapping and set it always.
While at it, introduce another helper zap_check_mapping_skip() and use it in
zap_pte_range() properly.
Some old comments have been removed from zap_pte_range() because they were
duplicated, and now that we have the ZAP_FLAG_CHECK_MAPPING flag it's easy
to find this information by simply grepping for the flag.
It'll also make life easier when we want to e.g. pass zap_flags into callers
like unmap_mapping_pages() (instead of adding new booleans besides the
even_cows parameter).
Signed-off-by: Peter Xu <[email protected]>
---
include/linux/mm.h | 19 ++++++++++++++++++-
mm/memory.c | 31 ++++++++-----------------------
2 files changed, 26 insertions(+), 24 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9060b497f4d5c..39c944bf7ed3a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1702,13 +1702,30 @@ static inline bool can_do_mlock(void) { return false; }
extern int user_shm_lock(size_t, struct user_struct *);
extern void user_shm_unlock(size_t, struct user_struct *);
+/* Whether to check page->mapping when zapping */
+#define ZAP_FLAG_CHECK_MAPPING BIT(0)
+
/*
* Parameter block passed down to zap_pte_range in exceptional cases.
*/
struct zap_details {
- struct address_space *check_mapping; /* Check page->mapping if set */
+ struct address_space *zap_mapping;
+ unsigned long zap_flags;
};
+/* Return true if skip zapping this page, false otherwise */
+static inline bool
+zap_check_mapping_skip(struct zap_details *details, struct page *page)
+{
+ if (!details || !page)
+ return false;
+
+ if (!(details->zap_flags & ZAP_FLAG_CHECK_MAPPING))
+ return false;
+
+ return details->zap_mapping != page_rmapping(page);
+}
+
struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
pte_t pte);
struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index bcbce803e6850..94954436544f7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1242,16 +1242,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
struct page *page;
page = vm_normal_page(vma, addr, ptent);
- if (unlikely(details) && page) {
- /*
- * unmap_shared_mapping_pages() wants to
- * invalidate cache without truncating:
- * unmap shared but keep private pages.
- */
- if (details->check_mapping &&
- details->check_mapping != page_rmapping(page))
- continue;
- }
+ if (unlikely(zap_check_mapping_skip(details, page)))
+ continue;
ptent = ptep_get_and_clear_full(mm, addr, pte,
tlb->fullmm);
tlb_remove_tlb_entry(tlb, pte, addr);
@@ -1283,17 +1275,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
if (is_device_private_entry(entry)) {
struct page *page = device_private_entry_to_page(entry);
- if (unlikely(details && details->check_mapping)) {
- /*
- * unmap_shared_mapping_pages() wants to
- * invalidate cache without truncating:
- * unmap shared but keep private pages.
- */
- if (details->check_mapping !=
- page_rmapping(page))
- continue;
- }
-
+ if (unlikely(zap_check_mapping_skip(details, page)))
+ continue;
pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
rss[mm_counter(page)]--;
page_remove_rmap(page, false);
@@ -3253,9 +3236,11 @@ void unmap_mapping_pages(struct address_space *mapping, pgoff_t start,
pgoff_t nr, bool even_cows)
{
pgoff_t first_index = start, last_index = start + nr - 1;
- struct zap_details details = { };
+ struct zap_details details = { .zap_mapping = mapping };
+
+ if (!even_cows)
+ details.zap_flags |= ZAP_FLAG_CHECK_MAPPING;
- details.check_mapping = even_cows ? NULL : mapping;
if (last_index < first_index)
last_index = ULONG_MAX;
--
2.26.2
We don't have "huge" version of PTE_SWP_UFFD_WP_SPECIAL, instead when necessary
we split the thp if the huge page is uffd wr-protected previously.
However split the thp is not enough, because file-backed thp is handled totally
differently comparing to anonymous thps - rather than doing a real split, the
thp pmd will simply got dropped in __split_huge_pmd_locked().
That is definitely not enough if e.g. when there is a thp covers range [0, 2M)
but we want to wr-protect small page resides in [4K, 8K) range, because after
__split_huge_pmd() returns, there will be a none pmd.
Here we leverage the previously introduced change_protection_prepare() macro so
that we'll populate the pmd with a pgtable page. Then change_pte_range() will
do all the rest for us, e.g., install the uffd-wp swap special pte marker at
any pte that we'd like to wr-protect, under the protection of pgtable lock.
Signed-off-by: Peter Xu <[email protected]>
---
mm/mprotect.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6b63e3544b470..51c954afa4069 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -296,8 +296,16 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
}
if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
- if (next - addr != HPAGE_PMD_SIZE) {
+ if (next - addr != HPAGE_PMD_SIZE ||
+ /* Uffd wr-protecting a file-backed memory range */
+ unlikely(!vma_is_anonymous(vma) &&
+ (cp_flags & MM_CP_UFFD_WP))) {
__split_huge_pmd(vma, pmd, addr, false, NULL);
+ /*
+ * For file-backed, the pmd could have been
+ * gone; still provide a pte pgtable if needed.
+ */
+ change_protection_prepare(vma, pmd, addr, cp_flags);
} else {
int nr_ptes = change_huge_pmd(vma, pmd, addr,
newprot, cp_flags);
--
2.26.2
The uffd-wp special pte should be handled similarly to other uffd-wp
wr-protected ptes: we should pass it over when the dst_vma has VM_UFFD_WP
armed, and otherwise drop it.
Signed-off-by: Peter Xu <[email protected]>
---
mm/memory.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index 872fb59192277..f1cdc613b5887 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -715,8 +715,21 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
unsigned long vm_flags = dst_vma->vm_flags;
pte_t pte = *src_pte;
struct page *page;
- swp_entry_t entry = pte_to_swp_entry(pte);
+ swp_entry_t entry;
+
+ if (unlikely(is_swap_special_pte(pte))) {
+ /*
+ * uffd-wp special swap pte is the only possibility for now.
+ * If dst vma is registered with uffd-wp, copy it over.
+ * Otherwise, ignore this pte as if it's a none pte would work.
+ */
+ WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte));
+ if (userfaultfd_wp(dst_vma))
+ set_pte_at(dst_mm, addr, dst_pte, pte);
+ return 0;
+ }
+ entry = pte_to_swp_entry(pte);
if (likely(!non_swap_entry(entry))) {
if (swap_duplicate(entry) < 0)
return entry.val;
--
2.26.2
File-backed memories are prone to unmap/swap so the ptes are always unstable.
This could lead to userfaultfd-wp information got lost when unmapped or swapped
out on such types of memory, for example, shmem. To keep such an information
persistent, we will start to use the newly introduced swap-like special ptes to
replace a null pte when those ptes were removed.
Prepare this by handling such a special pte first before it is applied. Here
a new fault flag FAULT_FLAG_UFFD_WP is introduced. When this flag is set, it
means the current fault is to resolve a page access (either read or write) to
the uffd-wp special pte.
The handling of this special pte page fault is similar to missing fault, but it
should happen after the pte missing logic since the special pte is designed to
be a swap-like pte. Meanwhile it should be handled before do_swap_page() so
that the swap core logic won't be confused to see such an illegal swap pte.
This is a slow path of uffd-wp handling, because unmap of wr-protected shmem
ptes should be rare. So far it should only trigger in two conditions:
(1) When trying to punch holes in shmem_fallocate(), there will be a
pre-unmap optimization before evicting the page. That will create
unmapped shmem ptes with wr-protected pages covered.
(2) Swapping out of shmem pages
Because of this, the page fault handling is simplifed too by not sending the
wr-protect message in the 1st page fault, instead the page will be installed
read-only, so the message will be generated until the next do_wp_page() call.
Disable fault-around for such a special page fault, because the introduced
new flag (FAULT_FLAG_UFFD_WP) only applies to the current pte rather than all
the pages around it. Doing fault-around with the new flag could wrongly apply
it to the rest of the pages when installing ptes from the page cache on a
cache hit.
Signed-off-by: Peter Xu <[email protected]>
---
include/linux/userfaultfd_k.h | 11 +++++
mm/memory.c | 80 ++++++++++++++++++++++++++++++++---
2 files changed, 86 insertions(+), 5 deletions(-)
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index bc733512c6905..fefebe6e96560 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -89,6 +89,17 @@ static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
}
+/*
+ * Don't do fault around for FAULT_FLAG_UFFD_WP because it means we want to
+ * recover a previously wr-protected pte. This flag is per-pte information,
+ * so it could confuse all the pages around the current page when faulted in.
+ * Similar reason for MINOR mode faults.
+ */
+static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
+{
+ return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
+}
+
static inline bool userfaultfd_missing(struct vm_area_struct *vma)
{
return vma->vm_flags & VM_UFFD_MISSING;
diff --git a/mm/memory.c b/mm/memory.c
index 235857ccfaa11..02db41bad3340 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3786,6 +3786,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
{
struct vm_area_struct *vma = vmf->vma;
+ bool uffd_wp = pte_swp_uffd_wp_special(vmf->orig_pte);
bool write = vmf->flags & FAULT_FLAG_WRITE;
bool prefault = vmf->address != addr;
pte_t entry;
@@ -3798,6 +3799,8 @@ void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
if (write)
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+ if (unlikely(uffd_wp))
+ entry = pte_mkuffd_wp(pte_wrprotect(entry));
/* copy-on-write page */
if (write && !(vma->vm_flags & VM_SHARED)) {
inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
@@ -3865,8 +3868,12 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
vmf->address, &vmf->ptl);
ret = 0;
- /* Re-check under ptl */
- if (likely(pte_none(*vmf->pte)))
+
+ /*
+ * Re-check under ptl. Note: this will cover both none pte and
+ * uffd-wp-special swap pte
+ */
+ if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
do_set_pte(vmf, page, vmf->address);
else
ret = VM_FAULT_NOPAGE;
@@ -3970,9 +3977,21 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
}
+/* Return true if we should do read fault-around, false otherwise */
+static inline bool should_fault_around(struct vm_fault *vmf)
+{
+ /* No ->map_pages? No way to fault around... */
+ if (!vmf->vma->vm_ops->map_pages)
+ return false;
+
+ if (uffd_disable_fault_around(vmf->vma))
+ return false;
+
+ return fault_around_bytes >> PAGE_SHIFT > 1;
+}
+
static vm_fault_t do_read_fault(struct vm_fault *vmf)
{
- struct vm_area_struct *vma = vmf->vma;
vm_fault_t ret = 0;
/*
@@ -3980,7 +3999,7 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf)
* if page by the offset is not ready to be mapped (cold cache or
* something).
*/
- if (vma->vm_ops->map_pages && fault_around_bytes >> PAGE_SHIFT > 1) {
+ if (should_fault_around(vmf)) {
ret = do_fault_around(vmf);
if (ret)
return ret;
@@ -4293,6 +4312,57 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
return VM_FAULT_FALLBACK;
}
+static vm_fault_t uffd_wp_clear_special(struct vm_fault *vmf)
+{
+ vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
+ vmf->address, &vmf->ptl);
+ /*
+ * Be careful so that we will only recover a special uffd-wp pte into a
+ * none pte. Otherwise it means the pte could have changed, so retry.
+ */
+ if (pte_swp_uffd_wp_special(*vmf->pte))
+ pte_clear(vmf->vma->vm_mm, vmf->address, vmf->pte);
+ pte_unmap_unlock(vmf->pte, vmf->ptl);
+ return 0;
+}
+
+/*
+ * This is actually a page-missing access, but with uffd-wp special pte
+ * installed. It means this pte was wr-protected before being unmapped.
+ */
+static vm_fault_t uffd_wp_handle_special(struct vm_fault *vmf)
+{
+ /* Careful! vmf->pte unmapped after return */
+ if (!pte_unmap_same(vmf))
+ return 0;
+
+ /*
+ * Just in case there're leftover special ptes even after the region
+ * got unregistered - we can simply clear them.
+ */
+ if (unlikely(!userfaultfd_wp(vmf->vma) || vma_is_anonymous(vmf->vma)))
+ return uffd_wp_clear_special(vmf);
+
+ /*
+ * Here we share most code with do_fault(), in which we can identify
+ * whether this is "none pte fault" or "uffd-wp-special fault" by
+ * checking the vmf->orig_pte.
+ */
+ return do_fault(vmf);
+}
+
+static vm_fault_t do_swap_pte(struct vm_fault *vmf)
+{
+ /*
+ * We need to handle special swap ptes before handling ptes that
+ * contain swap entries, always.
+ */
+ if (unlikely(pte_swp_uffd_wp_special(vmf->orig_pte)))
+ return uffd_wp_handle_special(vmf);
+
+ return do_swap_page(vmf);
+}
+
/*
* These routines also need to handle stuff like marking pages dirty
* and/or accessed for architectures that don't do it in hardware (most
@@ -4367,7 +4437,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
}
if (!pte_present(vmf->orig_pte))
- return do_swap_page(vmf);
+ return do_swap_pte(vmf);
if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma))
return do_numa_page(vmf);
--
2.26.2
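For illustration, here is a minimal userspace sketch of the flow this fault
path serves on shmem: register a range with uffd-wp, wr-protect it (including
ptes that were never faulted in), then resolve the resulting wr-protect
message. It is only a sketch with assumptions - error handling is omitted, a
4K page size is hard-coded, and a real program would do the write from a
separate thread, since the monitoring thread shown here would block on its
own fault:

#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

int main(void)
{
        size_t len = 16UL << 12;        /* 16 pages, assuming 4K pages */
        int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
        struct uffdio_api api = { .api = UFFD_API };

        ioctl(uffd, UFFDIO_API, &api);

        /* MAP_SHARED|MAP_ANONYMOUS memory is shmem-backed */
        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        struct uffdio_register reg = {
                .range = { .start = (unsigned long)buf, .len = len },
                .mode = UFFDIO_REGISTER_MODE_WP,
        };
        ioctl(uffd, UFFDIO_REGISTER, &reg);

        /* Wr-protect the whole range, never-faulted ptes included */
        struct uffdio_writeprotect wp = {
                .range = { .start = (unsigned long)buf, .len = len },
                .mode = UFFDIO_WRITEPROTECT_MODE_WP,
        };
        ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);

        /* ... a separate thread writes to buf and blocks here ... */

        struct pollfd pfd = { .fd = uffd, .events = POLLIN };
        struct uffd_msg msg;

        poll(&pfd, 1, -1);
        read(uffd, &msg, sizeof(msg));
        if (msg.event == UFFD_EVENT_PAGEFAULT &&
            (msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP))
                printf("wp fault at 0x%llx\n",
                       (unsigned long long)msg.arg.pagefault.address);

        /* Resolve: un-protect the range (this also wakes the writer) */
        wp.mode = 0;
        ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
        return 0;
}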
Drop __unmap_hugepage_range() from the header since it's only used in
hugetlb.c.
Suggested-by: Mike Kravetz <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
include/linux/hugetlb.h | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index b92f25ccef588..eb134a75cad41 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -126,9 +126,6 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
struct vm_area_struct *vma,
unsigned long start, unsigned long end,
struct page *ref_page);
-void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
- unsigned long start, unsigned long end,
- struct page *ref_page);
void hugetlb_report_meminfo(struct seq_file *);
int hugetlb_report_node_meminfo(char *buf, int len, int nid);
void hugetlb_show_meminfo(void);
@@ -362,13 +359,6 @@ static inline void __unmap_hugepage_range_final(struct mmu_gather *tlb,
BUG();
}
-static inline void __unmap_hugepage_range(struct mmu_gather *tlb,
- struct vm_area_struct *vma, unsigned long start,
- unsigned long end, struct page *ref_page)
-{
- BUG();
-}
-
static inline vm_fault_t hugetlb_fault(struct mm_struct *mm,
struct vm_area_struct *vma, unsigned long address,
unsigned int flags)
--
2.26.2
File-backed memory differs from anonymous memory in that even if the pte is
missing, the data could still reside either in the file or in the page/swap
cache. So when wr-protecting a pte, we need to consider none ptes too.
We do that by installing the uffd-wp special swap pte as a marker. So when
there's a future write to the pte, the fault handler will take the special
path to first fault in the page read-only, then report to the userfaultfd
server with the wr-protect message.
On the other hand, when unprotecting a page, it's also possible that the pte
got unmapped but replaced by the special uffd-wp marker. Then we'll need to
be able to recover the uffd-wp special swap pte into a none pte, so that the
next access to the page will fault in correctly as usual when it triggers the
fault handler, rather than generating a uffd-wp message.
Special care needs to be taken throughout the change_protection_range()
process. Since we now allow the user to wr-protect a none pte, we need to be
able to pre-populate the page table entries when we see a !anonymous &&
MM_CP_UFFD_WP request; otherwise change_protection_range() will always skip
ranges where the pgtable entry does not exist.
Note that this patch only covers small pages (pte level) and does not cover
transparent huge pages yet; it will be the base for THPs too.
Signed-off-by: Peter Xu <[email protected]>
---
mm/mprotect.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
diff --git a/mm/mprotect.c b/mm/mprotect.c
index b3def0a102bf4..6b63e3544b470 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -29,6 +29,7 @@
#include <linux/uaccess.h>
#include <linux/mm_inline.h>
#include <linux/pgtable.h>
+#include <linux/userfaultfd_k.h>
#include <asm/cacheflush.h>
#include <asm/mmu_context.h>
#include <asm/tlbflush.h>
@@ -176,6 +177,32 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
set_pte_at(vma->vm_mm, addr, pte, newpte);
pages++;
}
+ } else if (unlikely(is_swap_special_pte(oldpte))) {
+ if (uffd_wp_resolve && !vma_is_anonymous(vma) &&
+ pte_swp_uffd_wp_special(oldpte)) {
+ /*
+ * This is uffd-wp special pte and we'd like to
+ * unprotect it. What we need to do is simply
+ * recover the pte into a none pte; the next
+ * page fault will fault in the page.
+ */
+ pte_clear(vma->vm_mm, addr, pte);
+ pages++;
+ }
+ } else {
+ /* It must be a none pte, or what else?.. */
+ WARN_ON_ONCE(!pte_none(oldpte));
+ if (unlikely(uffd_wp && !vma_is_anonymous(vma))) {
+ /*
+ * For file-backed mem, we need to be able to
+ * wr-protect even for a none pte! Because
+ * even if the pte is null, the page/swap cache
+ * could exist.
+ */
+ set_pte_at(vma->vm_mm, addr, pte,
+ pte_swp_mkuffd_wp_special(vma));
+ pages++;
+ }
}
} while (pte++, addr += PAGE_SIZE, addr != end);
arch_leave_lazy_mmu_mode();
@@ -209,6 +236,25 @@ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd)
return 0;
}
+/*
+ * File-backed vma allows uffd wr-protect upon none ptes, because even if pte
+ * is missing, page/swap cache could exist. When that happens, the wr-protect
+ * information will be stored in the page table entries with the marker (e.g.,
+ * PTE_SWP_UFFD_WP_SPECIAL). Prepare for that by always populating the page
+ * tables to pte level, so that we'll install the markers in change_pte_range()
+ * where necessary.
+ *
+ * Note that we only need to do this in pmd level, because if pmd does not
+ * exist, it means the whole range covered by the pmd entry (of a pud) does not
+ * contain any valid data but all zeros. Then nothing to wr-protect.
+ */
+#define change_protection_prepare(vma, pmd, addr, cp_flags) \
+ do { \
+ if (unlikely((cp_flags & MM_CP_UFFD_WP) && pmd_none(*pmd) && \
+ !vma_is_anonymous(vma))) \
+ WARN_ON_ONCE(pte_alloc(vma->vm_mm, pmd)); \
+ } while (0)
+
static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
pud_t *pud, unsigned long addr, unsigned long end,
pgprot_t newprot, unsigned long cp_flags)
@@ -227,6 +273,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
next = pmd_addr_end(addr, end);
+ change_protection_prepare(vma, pmd, addr, cp_flags);
+
/*
* Automatic NUMA balancing walks the tables with mmap_lock
* held for read. It's possible a parallel update to occur
--
2.26.2
Hook up hugetlbfs_fault() with the capability to handle userfaultfd-wp faults.
We do this slightly earlier than hugetlb_cow() so that we can avoid taking some
extra locks that we definitely don't need.
Reviewed-by: Mike Kravetz <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
mm/hugetlb.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 629aa4c2259c8..8e234ee9a15e2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4802,6 +4802,25 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
goto out_ptl;
+ /* Handle userfault-wp first, before trying to lock more pages */
+ if (userfaultfd_pte_wp(vma, huge_ptep_get(ptep)) &&
+ (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) {
+ struct vm_fault vmf = {
+ .vma = vma,
+ .address = haddr,
+ .flags = flags,
+ };
+
+ spin_unlock(ptl);
+ if (pagecache_page) {
+ unlock_page(pagecache_page);
+ put_page(pagecache_page);
+ }
+ mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ i_mmap_unlock_read(mapping);
+ return handle_userfault(&vmf, VM_UFFD_WP);
+ }
+
/*
* hugetlb_cow() requires page locks of pte_page(entry) and
* pagecache_page, so here we need take the former one
--
2.26.2
Teach the hugetlb page fault code to understand the uffd-wp special pte. For
example, when seeing such a pte we need to convert any write fault into a
read one (which is fake - we'll retry the write later if needed). Meanwhile,
for handle_userfault() we'll need to make sure to wait on the special swap
pte too, just like on a none pte.
Note that we also need to teach UFFDIO_COPY about this special pte across the
code path so that we can safely install a new page at this special pte, as
long as we know it's a stale entry.
Signed-off-by: Peter Xu <[email protected]>
---
fs/userfaultfd.c | 5 ++++-
mm/hugetlb.c | 26 ++++++++++++++++++++------
mm/userfaultfd.c | 5 ++++-
3 files changed, 28 insertions(+), 8 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index b34486a88b5f3..a41e0631af512 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -245,8 +245,11 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
/*
* Lockless access: we're in a wait_event so it's ok if it
* changes under us.
+ *
+ * Regarding uffd-wp special case, please refer to comments in
+ * userfaultfd_must_wait().
*/
- if (huge_pte_none(pte))
+ if (huge_pte_none(pte) || pte_swp_uffd_wp_special(pte))
ret = true;
if (!huge_pte_write(pte) && (reason & VM_UFFD_WP))
ret = true;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 071a8429ea190..d9ff7db14175d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4529,7 +4529,8 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma,
static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
struct vm_area_struct *vma,
struct address_space *mapping, pgoff_t idx,
- unsigned long address, pte_t *ptep, unsigned int flags)
+ unsigned long address, pte_t *ptep,
+ pte_t old_pte, unsigned int flags)
{
struct hstate *h = hstate_vma(vma);
vm_fault_t ret = VM_FAULT_SIGBUS;
@@ -4653,7 +4654,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
ptl = huge_pte_lock(h, mm, ptep);
ret = 0;
- if (!huge_pte_none(huge_ptep_get(ptep)))
+ if (!pte_same(huge_ptep_get(ptep), old_pte))
goto backout;
if (anon_rmap) {
@@ -4663,6 +4664,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
page_dup_rmap(page, true);
new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
&& (vma->vm_flags & VM_SHARED)));
+ /*
+ * If this pte was previously wr-protected, keep it wr-protected even
+ * if populated.
+ */
+ if (unlikely(pte_swp_uffd_wp_special(old_pte)))
+ new_pte = huge_pte_wrprotect(huge_pte_mkuffd_wp(new_pte));
set_huge_pte_at(mm, haddr, ptep, new_pte);
hugetlb_count_add(pages_per_huge_page(h), mm);
@@ -4778,8 +4785,13 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
mutex_lock(&hugetlb_fault_mutex_table[hash]);
entry = huge_ptep_get(ptep);
- if (huge_pte_none(entry)) {
- ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep, flags);
+ /*
+ * uffd-wp-special should be handled mostly the same as a none pte
+ * because it's basically a none pte with a special marker
+ */
+ if (huge_pte_none(entry) || pte_swp_uffd_wp_special(entry)) {
+ ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
+ entry, flags);
goto out_mutex;
}
@@ -4913,7 +4925,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
unsigned long size;
int vm_shared = dst_vma->vm_flags & VM_SHARED;
struct hstate *h = hstate_vma(dst_vma);
- pte_t _dst_pte;
+ pte_t _dst_pte, cur_pte;
spinlock_t *ptl;
int ret;
struct page *page;
@@ -4991,8 +5003,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
if (idx >= size)
goto out_release_unlock;
+ cur_pte = huge_ptep_get(dst_pte);
ret = -EEXIST;
- if (!huge_pte_none(huge_ptep_get(dst_pte)))
+ /* Please refer to shmem_mfill_atomic_pte() for uffd-wp special case */
+ if (!huge_pte_none(cur_pte) && !pte_swp_uffd_wp_special(cur_pte))
goto out_release_unlock;
if (vm_shared) {
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index ceb77ea24497e..2cd6ad5c3d8f8 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -274,6 +274,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
}
while (src_addr < src_start + len) {
+ pte_t pteval;
+
BUG_ON(dst_addr >= dst_start + len);
/*
@@ -296,8 +298,9 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
goto out_unlock;
}
+ pteval = huge_ptep_get(dst_pte);
if (mode != MCOPY_ATOMIC_CONTINUE &&
- !huge_pte_none(huge_ptep_get(dst_pte))) {
+ !huge_pte_none(pteval) && !pte_swp_uffd_wp_special(pteval)) {
err = -EEXIST;
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
i_mmap_unlock_read(mapping);
--
2.26.2
First, pass the wp_copy variable into hugetlb_mcopy_atomic_pte() throughout
the call stack. Then, apply the UFFD_WP bit if UFFDIO_COPY_MODE_WP is
specified with UFFDIO_COPY. Introduce huge_pte_mkuffd_wp() for this.
Hugetlb pages are only managed by hugetlbfs, so we're safe even without
setting the dirty bit in the huge pte if the page is installed read-only.
However we'd better still keep the dirty bit set for a read-only UFFDIO_COPY
pte (when the UFFDIO_COPY_MODE_WP bit is set), not only to match what we do
with shmem, but also because the page does contain dirty data that the kernel
just copied from userspace.
Signed-off-by: Peter Xu <[email protected]>
---
include/asm-generic/hugetlb.h | 5 +++++
include/linux/hugetlb.h | 6 ++++--
mm/hugetlb.c | 22 +++++++++++++++++-----
mm/userfaultfd.c | 12 ++++++++----
4 files changed, 34 insertions(+), 11 deletions(-)
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 8e1e6244a89da..548212eccbd61 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -27,6 +27,11 @@ static inline pte_t huge_pte_mkdirty(pte_t pte)
return pte_mkdirty(pte);
}
+static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
+{
+ return pte_mkuffd_wp(pte);
+}
+
static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot)
{
return pte_modify(pte, newprot);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index eb134a75cad41..e38077918330f 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -138,7 +138,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte,
unsigned long dst_addr,
unsigned long src_addr,
enum mcopy_atomic_mode mode,
- struct page **pagep);
+ struct page **pagep,
+ bool wp_copy);
#endif /* CONFIG_USERFAULTFD */
bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
struct vm_area_struct *vma,
@@ -318,7 +319,8 @@ static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
unsigned long dst_addr,
unsigned long src_addr,
enum mcopy_atomic_mode mode,
- struct page **pagep)
+ struct page **pagep,
+ bool wp_copy)
{
BUG();
return 0;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8e234ee9a15e2..20ee8fdf6507d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4884,7 +4884,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
unsigned long dst_addr,
unsigned long src_addr,
enum mcopy_atomic_mode mode,
- struct page **pagep)
+ struct page **pagep,
+ bool wp_copy)
{
bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
struct address_space *mapping;
@@ -4981,17 +4982,28 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
hugepage_add_new_anon_rmap(page, dst_vma, dst_addr);
}
- /* For CONTINUE on a non-shared VMA, don't set VM_WRITE for CoW. */
- if (is_continue && !vm_shared)
+ /*
+ * For either: (1) CONTINUE on a non-shared VMA, or (2) UFFDIO_COPY
+ * with wp flag set, don't set pte write bit.
+ */
+ if (wp_copy || (is_continue && !vm_shared))
writable = 0;
else
writable = dst_vma->vm_flags & VM_WRITE;
_dst_pte = make_huge_pte(dst_vma, page, writable);
- if (writable)
- _dst_pte = huge_pte_mkdirty(_dst_pte);
+ /*
+ * Always mark UFFDIO_COPY page dirty; note that this may not be
+ * extremely important for hugetlbfs for now since swapping is not
+ * supported, but we should still be clear in that this page cannot be
+ * thrown away at will, even if write bit not set.
+ */
+ _dst_pte = huge_pte_mkdirty(_dst_pte);
_dst_pte = pte_mkyoung(_dst_pte);
+ if (wp_copy)
+ _dst_pte = huge_pte_mkuffd_wp(_dst_pte);
+
set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
(void)huge_ptep_set_access_flags(dst_vma, dst_addr, dst_pte, _dst_pte,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 7adaebe222b8e..4f716838f1fdb 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -207,7 +207,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
unsigned long dst_start,
unsigned long src_start,
unsigned long len,
- enum mcopy_atomic_mode mode)
+ enum mcopy_atomic_mode mode,
+ bool wp_copy)
{
int vm_alloc_shared = dst_vma->vm_flags & VM_SHARED;
int vm_shared = dst_vma->vm_flags & VM_SHARED;
@@ -304,7 +305,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
}
err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma,
- dst_addr, src_addr, mode, &page);
+ dst_addr, src_addr, mode, &page,
+ wp_copy);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
i_mmap_unlock_read(mapping);
@@ -406,7 +408,8 @@ extern ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
unsigned long dst_start,
unsigned long src_start,
unsigned long len,
- enum mcopy_atomic_mode mode);
+ enum mcopy_atomic_mode mode,
+ bool wp_copy);
#endif /* CONFIG_HUGETLB_PAGE */
static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
@@ -526,7 +529,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
*/
if (is_vm_hugetlb_page(dst_vma))
return __mcopy_atomic_hugetlb(dst_mm, dst_vma, dst_start,
- src_start, len, mcopy_mode);
+ src_start, len, mcopy_mode,
+ wp_copy);
if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
goto out_unlock;
--
2.26.2
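To make the semantics concrete, here is a hedged sketch of the userspace side
this enables: a UFFDIO_COPY that resolves a missing fault on a wp-registered
hugetlb range while keeping the new mapping write-protected, so a later write
still raises a uffd-wp message. 'uffd', 'dst', 'src' and 'huge_page_size' are
assumed to be prepared by the caller:

#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

static int copy_resolve_wp(int uffd, void *dst, void *src,
                           unsigned long huge_page_size)
{
        struct uffdio_copy copy = {
                .dst = (unsigned long)dst,      /* huge-page aligned */
                .src = (unsigned long)src,
                .len = huge_page_size,
                .mode = UFFDIO_COPY_MODE_WP,
        };

        if (ioctl(uffd, UFFDIO_COPY, &copy))
                return -1;
        /* On success copy.copy reports the bytes copied (== len) */
        return 0;
}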
This prepares hugetlbfs to recognize swap special ptes as well, such as the
uffd-wp special swap pte.
Reviewed-by: Mike Kravetz <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
mm/hugetlb.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3cad5d7726614..071a8429ea190 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -93,6 +93,26 @@ static inline bool subpool_is_free(struct hugepage_subpool *spool)
return true;
}
+/*
+ * These are sister versions of is_swap_pte() and pte_has_swap_entry(). We
+ * need standalone ones because huge_pte_none() is handled differently from
+ * pte_none(). For more information, please refer to comments above
+ * is_swap_pte() and pte_has_swap_entry().
+ *
+ * Here we directly reuse the pte level of swap special ptes, for example, the
+ * pte_swp_uffd_wp_special(). It just stands for a huge page rather than a
+ * small page for hugetlbfs pages.
+ */
+static inline bool is_huge_swap_pte(pte_t pte)
+{
+ return !huge_pte_none(pte) && !pte_present(pte);
+}
+
+static inline bool huge_pte_has_swap_entry(pte_t pte)
+{
+ return is_huge_swap_pte(pte) && !is_swap_special_pte(pte);
+}
+
static inline void unlock_or_release_subpool(struct hugepage_subpool *spool,
unsigned long irq_flags)
{
@@ -3885,7 +3905,7 @@ bool is_hugetlb_entry_migration(pte_t pte)
{
swp_entry_t swp;
- if (huge_pte_none(pte) || pte_present(pte))
+ if (!huge_pte_has_swap_entry(pte))
return false;
swp = pte_to_swp_entry(pte);
if (is_migration_entry(swp))
@@ -3898,7 +3918,7 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
{
swp_entry_t swp;
- if (huge_pte_none(pte) || pte_present(pte))
+ if (!huge_pte_has_swap_entry(pte))
return false;
swp = pte_to_swp_entry(pte);
if (is_hwpoison_entry(swp))
--
2.26.2
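As a quick, non-normative summary, this is how the predicates classify a
hugetlb pte once the two new helpers are in place:

  pte state                        huge_pte_none  pte_present  is_huge_swap_pte  huge_pte_has_swap_entry
  none                             yes            no           no                no
  present (mapped huge page)       no             yes          no                no
  swap entry (migration/hwpoison)  no             no           yes               yes
  swap special (uffd-wp marker)    no             no           yes               no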
Teach the hugetlbfs code to wr-protect none ptes in case the page cache
exists for that pte. Meanwhile we also need to be able to recognize a uffd-wp
marker pte and remove it for uffd_wp_resolve.
While at it, introduce a variable "psize" to replace all calls to the huge
page size fetcher.
Reviewed-by: Mike Kravetz <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
mm/hugetlb.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d9ff7db14175d..fa9af9c893512 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5264,7 +5264,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
pte_t *ptep;
pte_t pte;
struct hstate *h = hstate_vma(vma);
- unsigned long pages = 0;
+ unsigned long pages = 0, psize = huge_page_size(h);
bool shared_pmd = false;
struct mmu_notifier_range range;
bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
@@ -5284,13 +5284,19 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
mmu_notifier_invalidate_range_start(&range);
i_mmap_lock_write(vma->vm_file->f_mapping);
- for (; address < end; address += huge_page_size(h)) {
+ for (; address < end; address += psize) {
spinlock_t *ptl;
- ptep = huge_pte_offset(mm, address, huge_page_size(h));
+ ptep = huge_pte_offset(mm, address, psize);
if (!ptep)
continue;
ptl = huge_pte_lock(h, mm, ptep);
if (huge_pmd_unshare(mm, vma, &address, ptep)) {
+ /*
+ * When uffd-wp is enabled on the vma, unshare
+ * shouldn't happen at all. Warn about it if it
+ * happens for some reason.
+ */
+ WARN_ON_ONCE(uffd_wp || uffd_wp_resolve);
pages++;
spin_unlock(ptl);
shared_pmd = true;
@@ -5314,12 +5320,23 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
else if (uffd_wp_resolve)
newpte = pte_swp_clear_uffd_wp(newpte);
set_huge_swap_pte_at(mm, address, ptep,
- newpte, huge_page_size(h));
+ newpte, psize);
pages++;
}
spin_unlock(ptl);
continue;
}
+ if (unlikely(is_swap_special_pte(pte))) {
+ WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte));
+ /*
+ * This is changing a non-present pte into a none pte,
+ * no need for huge_ptep_modify_prot_start/commit().
+ */
+ if (uffd_wp_resolve)
+ huge_pte_clear(mm, address, ptep, psize);
+ spin_unlock(ptl);
+ continue;
+ }
if (!huge_pte_none(pte)) {
pte_t old_pte;
@@ -5332,6 +5347,12 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
pte = huge_pte_clear_uffd_wp(pte);
huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
pages++;
+ } else {
+ /* None pte */
+ if (unlikely(uffd_wp))
+ /* Safe to modify directly (none->non-present). */
+ set_huge_pte_at(mm, address, ptep,
+ pte_swp_mkuffd_wp_special(vma));
}
spin_unlock(ptl);
}
--
2.26.2
This starts by passing cp_flags into hugetlb_change_protection() so that
hugetlb will be able to handle MM_CP_UFFD_WP[_RESOLVE] requests.
huge_pte_clear_uffd_wp() is introduced to handle the case where
UFFDIO_WRITEPROTECT is requested upon migrating huge page entries.
Reviewed-by: Mike Kravetz <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
include/asm-generic/hugetlb.h | 5 +++++
include/linux/hugetlb.h | 6 ++++--
mm/hugetlb.c | 13 ++++++++++++-
mm/mprotect.c | 3 ++-
mm/userfaultfd.c | 8 ++++++++
5 files changed, 31 insertions(+), 4 deletions(-)
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 548212eccbd61..181cdc3297e7b 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -32,6 +32,11 @@ static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
return pte_mkuffd_wp(pte);
}
+static inline pte_t huge_pte_clear_uffd_wp(pte_t pte)
+{
+ return pte_clear_uffd_wp(pte);
+}
+
static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot)
{
return pte_modify(pte, newprot);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index e38077918330f..652660fd6ec8a 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -187,7 +187,8 @@ struct page *follow_huge_pgd(struct mm_struct *mm, unsigned long address,
int pmd_huge(pmd_t pmd);
int pud_huge(pud_t pud);
unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
- unsigned long address, unsigned long end, pgprot_t newprot);
+ unsigned long address, unsigned long end, pgprot_t newprot,
+ unsigned long cp_flags);
bool is_hugetlb_entry_migration(pte_t pte);
void hugetlb_unshare_all_pmds(struct vm_area_struct *vma);
@@ -349,7 +350,8 @@ static inline void move_hugetlb_state(struct page *oldpage,
static inline unsigned long hugetlb_change_protection(
struct vm_area_struct *vma, unsigned long address,
- unsigned long end, pgprot_t newprot)
+ unsigned long end, pgprot_t newprot,
+ unsigned long cp_flags)
{
return 0;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 20ee8fdf6507d..3cad5d7726614 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5222,7 +5222,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
}
unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
- unsigned long address, unsigned long end, pgprot_t newprot)
+ unsigned long address, unsigned long end,
+ pgprot_t newprot, unsigned long cp_flags)
{
struct mm_struct *mm = vma->vm_mm;
unsigned long start = address;
@@ -5232,6 +5233,8 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
unsigned long pages = 0;
bool shared_pmd = false;
struct mmu_notifier_range range;
+ bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+ bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
/*
* In the case of shared PMDs, the area to flush could be beyond
@@ -5272,6 +5275,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
make_migration_entry_read(&entry);
newpte = swp_entry_to_pte(entry);
+ if (uffd_wp)
+ newpte = pte_swp_mkuffd_wp(newpte);
+ else if (uffd_wp_resolve)
+ newpte = pte_swp_clear_uffd_wp(newpte);
set_huge_swap_pte_at(mm, address, ptep,
newpte, huge_page_size(h));
pages++;
@@ -5285,6 +5292,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
old_pte = huge_ptep_modify_prot_start(vma, address, ptep);
pte = pte_mkhuge(huge_pte_modify(old_pte, newprot));
pte = arch_make_huge_pte(pte, vma, NULL, 0);
+ if (uffd_wp)
+ pte = huge_pte_mkuffd_wp(huge_pte_wrprotect(pte));
+ else if (uffd_wp_resolve)
+ pte = huge_pte_clear_uffd_wp(pte);
huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
pages++;
}
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 51c954afa4069..fe5a5b96a61f9 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -416,7 +416,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL);
if (is_vm_hugetlb_page(vma))
- pages = hugetlb_change_protection(vma, start, end, newprot);
+ pages = hugetlb_change_protection(vma, start, end, newprot,
+ cp_flags);
else
pages = change_protection_range(vma, start, end, newprot,
cp_flags);
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 4f716838f1fdb..ceb77ea24497e 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -653,6 +653,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
unsigned long len, bool enable_wp, bool *mmap_changing)
{
struct vm_area_struct *dst_vma;
+ unsigned long page_mask;
pgprot_t newprot;
int err;
@@ -689,6 +690,13 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
if (!vma_is_anonymous(dst_vma))
goto out_unlock;
+ if (is_vm_hugetlb_page(dst_vma)) {
+ err = -EINVAL;
+ page_mask = vma_kernel_pagesize(dst_vma) - 1;
+ if ((start & page_mask) || (len & page_mask))
+ goto out_unlock;
+ }
+
if (enable_wp)
newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
else
--
2.26.2
We've had all the necessary changes ready for both shmem and hugetlbfs.
Turn on all the shmem/hugetlbfs switches for userfaultfd-wp.
Now we can remove the VM_UFFD_WP check from vma_can_userfault() since it is
no longer needed. Meanwhile, we can expand UFFD_API_RANGE_IOCTLS_BASIC with
_UFFDIO_WRITEPROTECT too, because all existing types now support write
protection mode.
Since vma_can_userfault() will be used elsewhere, move it into
userfaultfd_k.h.
Signed-off-by: Peter Xu <[email protected]>
---
fs/userfaultfd.c | 19 -------------------
include/linux/userfaultfd_k.h | 15 +++++++++++++++
include/uapi/linux/userfaultfd.h | 7 +++++--
mm/userfaultfd.c | 10 +++-------
4 files changed, 23 insertions(+), 28 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index a41e0631af512..a436a1feb10db 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1275,25 +1275,6 @@ static __always_inline int validate_range(struct mm_struct *mm,
return 0;
}
-static inline bool vma_can_userfault(struct vm_area_struct *vma,
- unsigned long vm_flags)
-{
- /* FIXME: add WP support to hugetlbfs and shmem */
- if (vm_flags & VM_UFFD_WP) {
- if (is_vm_hugetlb_page(vma) || vma_is_shmem(vma))
- return false;
- }
-
- if (vm_flags & VM_UFFD_MINOR) {
- /* FIXME: Add minor fault interception for shmem. */
- if (!is_vm_hugetlb_page(vma))
- return false;
- }
-
- return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
- vma_is_shmem(vma);
-}
-
static int userfaultfd_register(struct userfaultfd_ctx *ctx,
unsigned long arg)
{
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index fefebe6e96560..95afd4814ab29 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -16,6 +16,7 @@
#include <linux/fcntl.h>
#include <linux/mm.h>
#include <asm-generic/pgtable_uffd.h>
+#include <linux/hugetlb_inline.h>
/* The set of all possible UFFD-related VM flags. */
#define __VM_UFFD_FLAGS (VM_UFFD_MISSING | VM_UFFD_WP | VM_UFFD_MINOR)
@@ -132,6 +133,20 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma)
return vma->vm_flags & __VM_UFFD_FLAGS;
}
+static inline bool vma_can_userfault(struct vm_area_struct *vma,
+ unsigned long vm_flags)
+{
+ if (vm_flags & VM_UFFD_MINOR) {
+ /* FIXME: Add minor fault interception for shmem. */
+ if (!is_vm_hugetlb_page(vma))
+ return false;
+ }
+
+ return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
+ vma_is_shmem(vma);
+}
+
+
extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *);
extern void dup_userfaultfd_complete(struct list_head *);
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index bafbeb1a26245..298fbd4e2d1d3 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -31,7 +31,8 @@
UFFD_FEATURE_MISSING_SHMEM | \
UFFD_FEATURE_SIGBUS | \
UFFD_FEATURE_THREAD_ID | \
- UFFD_FEATURE_MINOR_HUGETLBFS)
+ UFFD_FEATURE_MINOR_HUGETLBFS | \
+ UFFD_FEATURE_WP_HUGETLBFS_SHMEM)
#define UFFD_API_IOCTLS \
((__u64)1 << _UFFDIO_REGISTER | \
(__u64)1 << _UFFDIO_UNREGISTER | \
@@ -45,7 +46,8 @@
#define UFFD_API_RANGE_IOCTLS_BASIC \
((__u64)1 << _UFFDIO_WAKE | \
(__u64)1 << _UFFDIO_COPY | \
- (__u64)1 << _UFFDIO_CONTINUE)
+ (__u64)1 << _UFFDIO_CONTINUE | \
+ (__u64)1 << _UFFDIO_WRITEPROTECT)
/*
* Valid ioctl command number range with this API is from 0x00 to
@@ -196,6 +198,7 @@ struct uffdio_api {
#define UFFD_FEATURE_SIGBUS (1<<7)
#define UFFD_FEATURE_THREAD_ID (1<<8)
#define UFFD_FEATURE_MINOR_HUGETLBFS (1<<9)
+#define UFFD_FEATURE_WP_HUGETLBFS_SHMEM (1<<10)
__u64 features;
__u64 ioctls;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 2cd6ad5c3d8f8..3930e56aaefd8 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -445,7 +445,6 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
err = mfill_zeropage_pte(dst_mm, dst_pmd,
dst_vma, dst_addr);
} else {
- VM_WARN_ON_ONCE(wp_copy);
if (!zeropage)
err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd,
dst_vma, dst_addr,
@@ -682,15 +681,12 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
err = -ENOENT;
dst_vma = find_dst_vma(dst_mm, start, len);
- /*
- * Make sure the vma is not shared, that the dst range is
- * both valid and fully within a single existing vma.
- */
- if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+
+ if (!dst_vma)
goto out_unlock;
if (!userfaultfd_wp(dst_vma))
goto out_unlock;
- if (!vma_is_anonymous(dst_vma))
+ if (!vma_can_userfault(dst_vma, dst_vma->vm_flags))
goto out_unlock;
if (is_vm_hugetlb_page(dst_vma)) {
--
2.26.2
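With the feature bit in place, userspace can probe for file-backed uffd-wp
support during the UFFDIO_API handshake. A minimal sketch, assuming a freshly
opened uffd on which UFFDIO_API has not been called yet:

#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

static int have_uffd_wp_file_backed(int uffd)
{
        struct uffdio_api api = { .api = UFFD_API };

        /* With features == 0, the kernel reports the supported set back */
        if (ioctl(uffd, UFFDIO_API, &api))
                return 0;
        return !!(api.features & UFFD_FEATURE_WP_HUGETLBFS_SHMEM);
}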
The first_index/last_index parameters in zap_details are actually only used
in unmap_mapping_range_tree(). Meanwhile, that function is only called once,
by unmap_mapping_pages(). Instead of passing these two variables through the
whole stack of the page zapping code, remove them from zap_details and make
them parameters of unmap_mapping_range_tree(), which is inlined anyway.
Signed-off-by: Peter Xu <[email protected]>
---
include/linux/mm.h | 2 --
mm/memory.c | 20 ++++++++++----------
2 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 84fb1697b20ff..9060b497f4d5c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1707,8 +1707,6 @@ extern void user_shm_unlock(size_t, struct user_struct *);
*/
struct zap_details {
struct address_space *check_mapping; /* Check page->mapping if set */
- pgoff_t first_index; /* Lowest page->index to unmap */
- pgoff_t last_index; /* Highest page->index to unmap */
};
struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index 02db41bad3340..bcbce803e6850 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3213,20 +3213,20 @@ static void unmap_mapping_range_vma(struct vm_area_struct *vma,
}
static inline void unmap_mapping_range_tree(struct rb_root_cached *root,
+ pgoff_t first_index,
+ pgoff_t last_index,
struct zap_details *details)
{
struct vm_area_struct *vma;
pgoff_t vba, vea, zba, zea;
- vma_interval_tree_foreach(vma, root,
- details->first_index, details->last_index) {
-
+ vma_interval_tree_foreach(vma, root, first_index, last_index) {
vba = vma->vm_pgoff;
vea = vba + vma_pages(vma) - 1;
- zba = details->first_index;
+ zba = first_index;
if (zba < vba)
zba = vba;
- zea = details->last_index;
+ zea = last_index;
if (zea > vea)
zea = vea;
@@ -3252,17 +3252,17 @@ static inline void unmap_mapping_range_tree(struct rb_root_cached *root,
void unmap_mapping_pages(struct address_space *mapping, pgoff_t start,
pgoff_t nr, bool even_cows)
{
+ pgoff_t first_index = start, last_index = start + nr - 1;
struct zap_details details = { };
details.check_mapping = even_cows ? NULL : mapping;
- details.first_index = start;
- details.last_index = start + nr - 1;
- if (details.last_index < details.first_index)
- details.last_index = ULONG_MAX;
+ if (last_index < first_index)
+ last_index = ULONG_MAX;
i_mmap_lock_write(mapping);
if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)))
- unmap_mapping_range_tree(&mapping->i_mmap, &details);
+ unmap_mapping_range_tree(&mapping->i_mmap, first_index,
+ last_index, &details);
i_mmap_unlock_write(mapping);
}
--
2.26.2
As with shmem uffd-wp special ptes, only drop the uffd-wp special swap pte if
unmapping an entire vma, or if we are synchronized such that faults cannot
race with the unmap operation. This requires passing zap_flags all the way
down to the lowest-level hugetlb unmap routine: __unmap_hugepage_range.
In general, unmap calls originating in hugetlbfs code will pass the
ZAP_FLAG_DROP_FILE_UFFD_WP flag, as synchronization is in place to prevent
faults. The exception is hole punch, which first unmaps without any
synchronization. Later, when hole punch actually removes the page from the
file, it will check to see if there was a subsequent fault, and if so it
takes the hugetlb fault mutex while unmapping again. This second unmap will
pass in ZAP_FLAG_DROP_FILE_UFFD_WP.
The core justification for "whether to apply the ZAP_FLAG_DROP_FILE_UFFD_WP
flag when unmapping a hugetlb range" is (IMHO): we should never reach a state
where a page fault could erroneously fault in a page-cache page that was
wr-protected as writable, even for an extremely short period. That could
happen if e.g. we passed ZAP_FLAG_DROP_FILE_UFFD_WP in hugetlbfs_punch_hole()
when calling hugetlb_vmdelete_list(): if a page fault triggered after that
call and before the remove_inode_hugepages() right after it, the page cache
could be mapped writable again in that small window, which could cause data
corruption.
Reviewed-by: Mike Kravetz <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
fs/hugetlbfs/inode.c | 15 +++++++++------
include/linux/hugetlb.h | 8 +++++---
mm/hugetlb.c | 27 +++++++++++++++++++++------
mm/memory.c | 5 ++++-
4 files changed, 39 insertions(+), 16 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index a2a42335e8fd2..9b383c39756a5 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -399,7 +399,8 @@ static void remove_huge_page(struct page *page)
}
static void
-hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
+hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
+ unsigned long zap_flags)
{
struct vm_area_struct *vma;
@@ -432,7 +433,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
}
unmap_hugepage_range(vma, vma->vm_start + v_offset, v_end,
- NULL);
+ NULL, zap_flags);
}
}
@@ -510,7 +511,8 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
mutex_lock(&hugetlb_fault_mutex_table[hash]);
hugetlb_vmdelete_list(&mapping->i_mmap,
index * pages_per_huge_page(h),
- (index + 1) * pages_per_huge_page(h));
+ (index + 1) * pages_per_huge_page(h),
+ ZAP_FLAG_DROP_FILE_UFFD_WP);
i_mmap_unlock_write(mapping);
}
@@ -576,7 +578,8 @@ static void hugetlb_vmtruncate(struct inode *inode, loff_t offset)
i_mmap_lock_write(mapping);
i_size_write(inode, offset);
if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
- hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0);
+ hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0,
+ ZAP_FLAG_DROP_FILE_UFFD_WP);
i_mmap_unlock_write(mapping);
remove_inode_hugepages(inode, offset, LLONG_MAX);
}
@@ -609,8 +612,8 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
i_mmap_lock_write(mapping);
if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
hugetlb_vmdelete_list(&mapping->i_mmap,
- hole_start >> PAGE_SHIFT,
- hole_end >> PAGE_SHIFT);
+ hole_start >> PAGE_SHIFT,
+ hole_end >> PAGE_SHIFT, 0);
i_mmap_unlock_write(mapping);
remove_inode_hugepages(inode, hole_start, hole_end);
inode_unlock(inode);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 652660fd6ec8a..5fa84bbefa628 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -121,11 +121,12 @@ long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
unsigned long *, unsigned long *, long, unsigned int,
int *);
void unmap_hugepage_range(struct vm_area_struct *,
- unsigned long, unsigned long, struct page *);
+ unsigned long, unsigned long, struct page *,
+ unsigned long);
void __unmap_hugepage_range_final(struct mmu_gather *tlb,
struct vm_area_struct *vma,
unsigned long start, unsigned long end,
- struct page *ref_page);
+ struct page *ref_page, unsigned long zap_flags);
void hugetlb_report_meminfo(struct seq_file *);
int hugetlb_report_node_meminfo(char *buf, int len, int nid);
void hugetlb_show_meminfo(void);
@@ -358,7 +359,8 @@ static inline unsigned long hugetlb_change_protection(
static inline void __unmap_hugepage_range_final(struct mmu_gather *tlb,
struct vm_area_struct *vma, unsigned long start,
- unsigned long end, struct page *ref_page)
+ unsigned long end, struct page *ref_page,
+ unsigned long zap_flags)
{
BUG();
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fa9af9c893512..f73a236b5a835 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4096,7 +4096,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
unsigned long start, unsigned long end,
- struct page *ref_page)
+ struct page *ref_page, unsigned long zap_flags)
{
struct mm_struct *mm = vma->vm_mm;
unsigned long address;
@@ -4148,6 +4148,19 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
continue;
}
+ if (unlikely(is_swap_special_pte(pte))) {
+ WARN_ON_ONCE(!pte_swp_uffd_wp_special(pte));
+ /*
+ * Only drop the special swap uffd-wp pte if
+ * e.g. unmapping a vma or punching a hole (with proper
+ * lock held so that concurrent page fault won't happen).
+ */
+ if (zap_flags & ZAP_FLAG_DROP_FILE_UFFD_WP)
+ huge_pte_clear(mm, address, ptep, sz);
+ spin_unlock(ptl);
+ continue;
+ }
+
/*
* Migrating hugepage or HWPoisoned hugepage is already
* unmapped and its refcount is dropped, so just clear pte here.
@@ -4199,9 +4212,10 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
void __unmap_hugepage_range_final(struct mmu_gather *tlb,
struct vm_area_struct *vma, unsigned long start,
- unsigned long end, struct page *ref_page)
+ unsigned long end, struct page *ref_page,
+ unsigned long zap_flags)
{
- __unmap_hugepage_range(tlb, vma, start, end, ref_page);
+ __unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
/*
* Clear this flag so that x86's huge_pmd_share page_table_shareable
@@ -4217,12 +4231,13 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
}
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
- unsigned long end, struct page *ref_page)
+ unsigned long end, struct page *ref_page,
+ unsigned long zap_flags)
{
struct mmu_gather tlb;
tlb_gather_mmu(&tlb, vma->vm_mm);
- __unmap_hugepage_range(&tlb, vma, start, end, ref_page);
+ __unmap_hugepage_range(&tlb, vma, start, end, ref_page, zap_flags);
tlb_finish_mmu(&tlb);
}
@@ -4277,7 +4292,7 @@ static void unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
*/
if (!is_vma_resv_set(iter_vma, HPAGE_RESV_OWNER))
unmap_hugepage_range(iter_vma, address,
- address + huge_page_size(h), page);
+ address + huge_page_size(h), page, 0);
}
i_mmap_unlock_write(mapping);
}
diff --git a/mm/memory.c b/mm/memory.c
index f1cdc613b5887..99741c9254c5b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1515,8 +1515,11 @@ static void unmap_single_vma(struct mmu_gather *tlb,
* safe to do nothing in this case.
*/
if (vma->vm_file) {
+ unsigned long zap_flags = details ?
+ details->zap_flags : 0;
i_mmap_lock_write(vma->vm_file->f_mapping);
- __unmap_hugepage_range_final(tlb, vma, start, end, NULL);
+ __unmap_hugepage_range_final(tlb, vma, start, end,
+ NULL, zap_flags);
i_mmap_unlock_write(vma->vm_file->f_mapping);
}
} else
--
2.26.2
Now that we have added support for shmem and hugetlbfs, we can always turn
the uffd-wp test on.
Define HUGETLB_EXPECTED_IOCTLS to avoid using UFFD_API_RANGE_IOCTLS_BASIC,
because UFFD_API_RANGE_IOCTLS_BASIC is normally a superset of capabilities
that the test setup may not satisfy in full. E.g., when hugetlb is registered
without minor mode, we need to explicitly remove _UFFDIO_CONTINUE. The same
goes for uffd-wp: we'll need to explicitly remove _UFFDIO_WRITEPROTECT if the
range is not registered with uffd-wp.
In the long term, we may consider dropping the UFFD_API_* macros completely
from the uapi/linux/userfaultfd.h header, because otherwise a kernel header
update could easily break userspace.
Signed-off-by: Peter Xu <[email protected]>
---
tools/testing/selftests/vm/userfaultfd.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 6339aeaeeff8b..cfa6c0e960e6a 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -80,7 +80,7 @@ static int test_type;
static volatile bool test_uffdio_copy_eexist = true;
static volatile bool test_uffdio_zeropage_eexist = true;
/* Whether to test uffd write-protection */
-static bool test_uffdio_wp = false;
+static bool test_uffdio_wp = true;
/* Whether to test uffd minor faults */
static bool test_uffdio_minor = false;
@@ -299,6 +299,9 @@ struct uffd_test_ops {
(1 << _UFFDIO_ZEROPAGE) | \
(1 << _UFFDIO_WRITEPROTECT))
+#define HUGETLB_EXPECTED_IOCTLS ((1 << _UFFDIO_WAKE) | \
+ (1 << _UFFDIO_COPY))
+
static struct uffd_test_ops anon_uffd_test_ops = {
.expected_ioctls = ANON_EXPECTED_IOCTLS,
.allocate_area = anon_allocate_area,
@@ -314,7 +317,7 @@ static struct uffd_test_ops shmem_uffd_test_ops = {
};
static struct uffd_test_ops hugetlb_uffd_test_ops = {
- .expected_ioctls = UFFD_API_RANGE_IOCTLS_BASIC & ~(1 << _UFFDIO_CONTINUE),
+ .expected_ioctls = HUGETLB_EXPECTED_IOCTLS,
.allocate_area = hugetlb_allocate_area,
.release_pages = hugetlb_release_pages,
.alias_mapping = hugetlb_alias_mapping,
@@ -1374,8 +1377,6 @@ static void set_test_type(const char *type)
if (!strcmp(type, "anon")) {
test_type = TEST_ANON;
uffd_test_ops = &anon_uffd_test_ops;
- /* Only enable write-protect test for anonymous test */
- test_uffdio_wp = true;
} else if (!strcmp(type, "hugetlb")) {
test_type = TEST_HUGETLB;
uffd_test_ops = &hugetlb_uffd_test_ops;
--
2.26.2
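For reference, the selftest takes the test type, memory size in MiB, and
bounce count as arguments, so runs like the following (illustrative - the
exact syntax may differ between trees) now exercise the write-protect paths
for every memory type:
  ./userfaultfd anon 20 16
  ./userfaultfd shmem 20 16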
On Tue, Apr 27, 2021 at 12:12:58PM -0400, Peter Xu wrote:
> File-backed memories are prone to unmap/swap, so their ptes are always
> unstable. This could lead to userfaultfd-wp information getting lost when
> such memory (for example, shmem) is unmapped or swapped out. To keep that
> information persistent, we will start to use the newly introduced swap-like
> special ptes to replace the none ptes when those ptes are removed.
>
> Prepare for this by handling such a special pte first, before any are
> actually installed. Here
> a new fault flag FAULT_FLAG_UFFD_WP is introduced. When this flag is set, it
FAULT_FLAG_UFFD_WP does not exist any more. Obviously I should have touched up
the commit message when touching up the code...
> means the current fault is to resolve a page access (either read or write) to
> the uffd-wp special pte.
>
> The handling of this special pte page fault is similar to a missing fault,
> but it should happen after the pte-missing logic, since the special pte is
> designed to be a swap-like pte. Meanwhile it should be handled before
> do_swap_page() so that the swap core logic won't be confused by seeing such
> an illegal swap pte.
>
> This is a slow path of uffd-wp handling, because unmap of wr-protected shmem
> ptes should be rare. So far it should only trigger in two conditions:
>
> (1) When trying to punch holes in shmem_fallocate(), there will be a
> pre-unmap optimization before evicting the page. That will create
> unmapped shmem ptes with wr-protected pages covered.
>
> (2) Swapping out of shmem pages
>
> Because of this, the page fault handling is simplified too: we do not send
> the wr-protect message on the first page fault; instead, the page is
> installed read-only, so the message will not be generated until the next
> do_wp_page() call.
>
> Disable fault-around for such a special page fault, because the introduced
> new flag (FAULT_FLAG_UFFD_WP) only applies to the current pte rather than all the pages
Same here.
> around it. Doing fault-around with the new flag could wrongly apply it to
> the rest of the pages when installing ptes from the page cache on a cache hit.
>
> Signed-off-by: Peter Xu <[email protected]>
> ---
> include/linux/userfaultfd_k.h | 11 +++++
> mm/memory.c | 80 ++++++++++++++++++++++++++++++++---
> 2 files changed, 86 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index bc733512c6905..fefebe6e96560 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
> @@ -89,6 +89,17 @@ static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
> return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
> }
>
> +/*
> + * Don't do fault around for FAULT_FLAG_UFFD_WP because it means we want to
Same here...
> + * recover a previously wr-protected pte. This flag is per-pte information,
> + * so it could confuse all the pages around the current page when faulted in.
> + * Similar reason for MINOR mode faults.
> + */
> +static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
> +{
> + return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
> +}
--
Peter Xu
On Tue, Apr 27, 2021 at 12:13:09PM -0400, Peter Xu wrote:
> Hook up hugetlbfs_fault() with the capability to handle userfaultfd-wp faults.
>
> We do this slightly earlier than hugetlb_cow() so that we can avoid taking some
> extra locks that we definitely don't need.
>
> Reviewed-by: Mike Kravetz <[email protected]>
> Signed-off-by: Peter Xu <[email protected]>
> ---
> mm/hugetlb.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 629aa4c2259c8..8e234ee9a15e2 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4802,6 +4802,25 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
> goto out_ptl;
>
> + /* Handle userfault-wp first, before trying to lock more pages */
> + if (userfaultfd_pte_wp(vma, huge_ptep_get(ptep)) &&
I'm going to change this to:
- if (userfaultfd_pte_wp(vma, huge_ptep_get(ptep)) &&
+ if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(ptep)) &&
As, strictly speaking, it's not right to use userfaultfd_pte_wp() directly on
a huge pte... Currently huge_pte_uffd_wp() is missing in this version (it is
still pte_uffd_wp() anyway), but I'll add a separate patch to introduce all
these helpers (and also mention what an arch should do when the huge pte is
not handled the same way as a small pte). I'll try to keep Mike's R-b as this
change should be trivial, or please shoot. Thanks,
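For reference, the new helper would presumably mirror the other generic
wrappers in include/asm-generic/hugetlb.h, roughly like the sketch below (an
assumption on my side - that patch is not part of this series yet), with
architectures that encode huge ptes differently overriding it:

static inline bool huge_pte_uffd_wp(pte_t pte)
{
        return pte_uffd_wp(pte);
}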
> + (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) {
> + struct vm_fault vmf = {
> + .vma = vma,
> + .address = haddr,
> + .flags = flags,
> + };
> +
> + spin_unlock(ptl);
> + if (pagecache_page) {
> + unlock_page(pagecache_page);
> + put_page(pagecache_page);
> + }
> + mutex_unlock(&hugetlb_fault_mutex_table[hash]);
> + i_mmap_unlock_read(mapping);
> + return handle_userfault(&vmf, VM_UFFD_WP);
> + }
> +
> /*
> * hugetlb_cow() requires page locks of pte_page(entry) and
> * pagecache_page, so here we need take the former one
> --
> 2.26.2
>
--
Peter Xu
On Tue, Apr 27, 2021 at 12:12:53PM -0400, Peter Xu wrote:
> This is v2 of uffd-wp shmem & hugetlbfs support, which completes uffd-wp as a
> kernel full feature, as it only supports anonymous before this series.
Ping..
Thinking about a repost, as this series shouldn't be able to apply after we've
got more relevant patches into -mm. E.g., the full minor fault, and also some
small stuff like pagemap, as we need one more patch to support shmem/hugetlbfs
too.
Hugh, haven't received any further comment from you on shmem side (or on the
general idea). It would be great to still have some of your input.
Let me know if you prefer to read a fresh new version otherwise.
Thanks!
--
Peter Xu
On Wed, 12 May 2021, Peter Xu wrote:
> On Tue, Apr 27, 2021 at 12:12:53PM -0400, Peter Xu wrote:
> > This is v2 of uffd-wp shmem & hugetlbfs support, which completes uffd-wp as a
> > kernel full feature, as it only supports anonymous before this series.
>
> Ping..
>
> Thinking about a repost, as this series shouldn't be able to apply after we've
> got more relevant patches into -mm. E.g., the full minor fault, and also some
> small stuff like pagemap, as we need one more patch to support shmem/hugetlbfs
> too.
>
> Hugh, haven't received any further comment from you on shmem side (or on the
> general idea). It would be great to still have some of your input.
>
> Let me know if you prefer to read a fresh new version otherwise.
I am very sorry to let you down, Peter, repeatedly; but it is now very
clear that I shall *never* have time to review your patchset - I am too
slow, have too much else to attend to, and take too long each time to
sink myself deep enough into userfaultfd.
I realize that you're being considerate, and expecting no more than
a few comments from me, rather than asking for formal review; but it's
still too much for me to get into.
The only reason I was involved at all, was when you were wondering how
to handle the pagetable entries for shmem. I suggested one encoding,
Andrea suggested slightly differently: Andrea's was more elegant (no
"swap type" required), and it looked like you went with his - good.
I wonder whether you noticed
https://lore.kernel.org/linux-mm/[email protected]/
which might interfere. I've had no more time to look at that than yours,
so no opinion on it (and I don't know what happened to it after that).
Please keep uppermost in mind when modifying mm/shmem.c for userfaultfd,
the difference between shared and private; and be on guard against the
ways in which CONFIG_USERFAULTFD=y might open a door to abuse.
Thanks,
Hugh
Hugh,
On Fri, May 14, 2021 at 12:07:38AM -0700, Hugh Dickins wrote:
> On Wed, 12 May 2021, Peter Xu wrote:
> > On Tue, Apr 27, 2021 at 12:12:53PM -0400, Peter Xu wrote:
> > > This is v2 of uffd-wp shmem & hugetlbfs support, which completes uffd-wp as a
> > > kernel full feature, as it only supports anonymous before this series.
> >
> > Ping..
> >
> > Thinking about a repost, as this series probably won't apply cleanly
> > anymore after more relevant patches landed in -mm. E.g., the full minor
> > fault series, and also some small stuff like pagemap, as we need one more
> > patch to support shmem/hugetlbfs there too.
> >
> > Hugh, I haven't received any further comments from you on the shmem side
> > (or on the general idea). It would be great to still have some of your
> > input.
> >
> > Let me know if you'd prefer to read a fresh new version instead.
>
> I am very sorry to let you down, Peter, repeatedly; but it is now very
> clear that I shall *never* have time to review your patchset - I am too
> slow, have too much else to attend to, and take too long each time to
> sink myself deep enough into userfaultfd.
Never mind! It's just that I felt obliged to ask for your opinion, as you
contributed part of the idea and you're also the shmem maintainer. :) So
that's what I did before starting to bother Andrew (I know Andrew is 100%
busy.. that's also why I try my best not to ping Andrew for reviews of my
work; he can chime in anytime anyway since he's in the loop).
>
> I realize that you're being considerate, and expecting no more than
> a few comments from me, rather than asking for formal review; but it's
> still too much for me to get into.
I'm actually even prepared to receive a full-series NACK anytime. :) To me
it's more important to get the direction right first; I didn't receive one
during the RFC, so I moved on assuming no one thought it was wrong. However
it's indeed true that you never let anyone down (as far as I can see from the
other discussions): you do very in-depth reviews and hunt down every single
potential risk you notice, even in a rare error path - that makes you all too
attractive a reviewer to every patch writer!
>
> The only reason I was involved at all, was when you were wondering how
> to handle the pagetable entries for shmem. I suggested one encoding,
> Andrea suggested slightly differently: Andrea's was more elegant (no
> "swap type" required), and it looked like you went with his - good.
>
> I wonder whether you noticed
> https://lore.kernel.org/linux-mm/[email protected]/
> which might interfere. I've had no more time to look at that than yours,
> so no opinion on it (and I don't know what happened to it after that).
Thanks for the pointer. Looks like there'll be some slight rebase work, but
the ideas are totally orthogonal; we'll see who ends up doing the rebase
(yeah, probably me :).
Hmm, meanwhile if that were still an early version, I might go and suggest
renaming pfn_swap_entry_to_page() to start with pte_swp_*(), as it operates on
a swap pte, not a pfn. However that's probably too late for a v8 series, so
I'll give up on it. That series also mentions something like "special swap
pte"; I hope it won't get confused with what this series is proposing. We'll
see when it becomes a problem; so far it still seems okay.
>
> Please keep uppermost in mind when modifying mm/shmem.c for userfaultfd,
> the difference between shared and private; and be on guard against the
> ways in which CONFIG_USERFAULTFD=y might open a door to abuse.
Will do. Then I'll move this series forward.
Re shared/private, let me mention one thing, just in case it brings some peace
of mind: the most dangerous place for uffd-wp+shmem should be the
UFFDIO_WRITEPROTECT page-resolving ioctl, where we may want to re-grant the
write bit to ptes (for minor mode, the danger point is UFFDIO_CONTINUE
instead). However it should be even safer than UFFDIO_CONTINUE, as
UFFDIO_WRITEPROTECT never grants the write bit for real but leaves that
entirely to the page fault handler (in change_pte_range()):
} else if (uffd_wp_resolve) {
	/*
	 * Leave the write bit to be handled
	 * by PF interrupt handler, then
	 * things like COW could be properly
	 * handled.
	 */
	ptent = pte_clear_uffd_wp(ptent);
}
Meanwhile the newprot will never have the write bit either, afaik; from mwriteprotect_range():
newprot = vm_get_page_prot(dst_vma->vm_flags);
The last risk is the dirty_accountable trick in change_pte_range(), but as you
analyzed in the other thread, userfaultfd never uses MM_CP_DIRTY_ACCT, so it
should be safe too.
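To make the above concrete, here's an illustrative monitor-side helper for
the resolve step being discussed (made up for this email, not from the
series): it clears the uffd-wp bit on the faulting range and wakes the
faulting thread, while the write bit itself is only restored later by the
page fault handler, per the change_pte_range() snippet quoted above. The
uffd/addr/len parameters are placeholders taken from the fault message:

static void uffd_wp_resolve_range(int uffd, unsigned long addr,
				  unsigned long len)
{
	struct uffdio_writeprotect wp = {
		.range = { .start = addr, .len = len },
		/*
		 * WP bit cleared == resolve; no DONTWAKE, so the faulting
		 * thread is woken up to retry the access.
		 */
		.mode = 0,
	};

	ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}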
Thanks,
--
Peter Xu