2021-03-29 22:20:18

by Peter Xu

Subject: [PATCH v5 0/4] man2: update mm/userfaultfd manpages to latest

v5:
- add r-bs for Mike R.
- Fix spelling mistake "diable" [Mike R.]
- s/Starting from/Since/ for patch 2 (also replaced two existing ones in the
  same file) [Alex]
- s/un-write-protect/write-unprotect/ [Alex]
- s/The process was interrupted and need to retry/The process was interrupted;
  retry this call/ in the last patch. [Alex]

v4:
- Fixed a few "subordinate clauses" (SC) cases [Alex]
- Reword in ioctl_userfaultfd.2 to use bold font for the two modes referenced,
  so as to be clear on what is "both" referring to [Alex]

v3:
- Don't use "Currently", instead add "(since x.y)" mark where proper [Alex]
- Always use semantic newlines across the whole patchset [Alex]
- Use quote when possible, rather than escapes [Alex]
- Fix one missing replacement of ".BR" -> ".B" [Alex]
- Some other trivial rephrases here and there when fixing up above

v2 changes:
- Fix wordings as suggested [MikeR]
- convert ".BR" to ".B" where proper for the patchset [Alex]
- rearrange a few lines in the last two patches where they got messed up
- document more things, e.g. UFFDIO_COPY_MODE_WP; and also on how to resolve a
  wr-protect page fault.



There're two features missing in current manpage, namely:

  (1) Userfaultfd Thread-ID feature
  (2) Userfaultfd write protect mode

There's also a 3rd one which was just contributed from Axel - Axel, I think it
would be great if you can add that part too, probably after the whole
hugetlbfs/shmem minor mode reaches the linux master branch.

Please review, thanks.



Peter Xu (4):
  userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs
  userfaultfd.2: Add write-protect mode
  ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs
  ioctl_userfaultfd.2: Add write-protect mode docs

 man2/ioctl_userfaultfd.2 |  89 +++++++++++++++++++++++++++-
 man2/userfaultfd.2       | 121 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 203 insertions(+), 7 deletions(-)

--
2.26.2





2021-03-29 22:20:27

by Peter Xu

Subject: [PATCH v5 1/4] userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

UFFD_FEATURE_THREAD_ID is supported since Linux 4.14.

Acked-by: Mike Rapoport <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
man2/userfaultfd.2 | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2
index e7dc9f813..5c41e4816 100644
--- a/man2/userfaultfd.2
+++ b/man2/userfaultfd.2
@@ -77,6 +77,13 @@ When the last file descriptor referring to a userfaultfd object is closed,
all memory ranges that were registered with the object are unregistered
and unread events are flushed.
.\"
+.PP
+Since Linux 4.14, userfaultfd page fault message can selectively embed faulting
+thread ID information into the fault message.
+One needs to enable this feature explicitly using the
+.BR UFFD_FEATURE_THREAD_ID
+feature bit when initializing the userfaultfd context.
+By default, thread ID reporting is disabled.
.SS Usage
The userfaultfd mechanism is designed to allow a thread in a multithreaded
program to perform user-space paging for the other threads in the process.
@@ -229,6 +236,9 @@ struct uffd_msg {
struct {
__u64 flags; /* Flags describing fault */
__u64 address; /* Faulting address */
+ union {
+ __u32 ptid; /* Thread ID of the fault */
+ } feat;
} pagefault;

struct { /* Since Linux 4.11 */
@@ -358,6 +368,9 @@ otherwise it is a read fault.
.\" UFFD_PAGEFAULT_FLAG_WP is not yet supported.
.RE
.TP
+.I pagefault.feat.ptid
+The thread ID that triggered the page fault.
+.TP
.I fork.ufd
The file descriptor associated with the userfault object
created for the child created by
--
2.26.2
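
For readers who want to see how the feature described in this patch might be
used, here is a minimal sketch in C. It assumes a kernel and
<linux/userfaultfd.h> new enough to provide UFFD_FEATURE_THREAD_ID and the
pagefault.feat.ptid field (Linux 4.14 or later). The helper names
uffd_enable_thread_id() and uffd_fault_tid() are hypothetical and are not part
of the patch or of any library.

```c
#include <linux/userfaultfd.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: do the UFFDIO_API handshake on an already opened
 * userfaultfd descriptor, asking for thread-ID reporting.
 * Returns 0 if the feature was granted, -1 otherwise. */
static int uffd_enable_thread_id(int uffd)
{
    struct uffdio_api api;

    memset(&api, 0, sizeof(api));
    api.api = UFFD_API;
    api.features = UFFD_FEATURE_THREAD_ID;

    if (ioctl(uffd, UFFDIO_API, &api) == -1)
        return -1;

    /* On success the kernel reports the feature bits it supports. */
    return (api.features & UFFD_FEATURE_THREAD_ID) ? 0 : -1;
}

/* Hypothetical helper: return the faulting thread ID carried by a
 * page-fault message, or -1 if the message is not a page fault.
 * The value is only meaningful if UFFD_FEATURE_THREAD_ID was enabled. */
static long uffd_fault_tid(const struct uffd_msg *msg)
{
    if (msg->event != UFFD_EVENT_PAGEFAULT)
        return -1;
    return (long)msg->arg.pagefault.feat.ptid;
}
```

A monitor thread would typically call uffd_enable_thread_id() once, right after
userfaultfd(2) returns, and then call uffd_fault_tid() on each message it
read(2)s from the descriptor.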

2021-03-29 22:20:35

by Peter Xu

Subject: [PATCH v5 3/4] ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

UFFD_FEATURE_THREAD_ID is supported since Linux 4.14.

Acked-by: Mike Rapoport <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
man2/ioctl_userfaultfd.2 | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2
index 47ae5f473..d4a8375b8 100644
--- a/man2/ioctl_userfaultfd.2
+++ b/man2/ioctl_userfaultfd.2
@@ -208,6 +208,11 @@ signal will be sent to the faulting process.
Applications using this
feature will not require the use of a userfaultfd monitor for processing
memory accesses to the regions registered with userfaultfd.
+.TP
+.BR UFFD_FEATURE_THREAD_ID " (since Linux 4.14)"
+If this feature bit is set,
+.I uffd_msg.pagefault.feat.ptid
+will be set to the faulted thread ID for each page fault message.
.PP
The returned
.I ioctls
--
2.26.2

2021-03-29 22:20:36

by Peter Xu

Subject: [PATCH v5 2/4] userfaultfd.2: Add write-protect mode

Write-protect mode is supported starting from Linux 5.7.

Acked-by: Mike Rapoport <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
man2/userfaultfd.2 | 108 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 104 insertions(+), 4 deletions(-)

diff --git a/man2/userfaultfd.2 b/man2/userfaultfd.2
index 5c41e4816..474294c3d 100644
--- a/man2/userfaultfd.2
+++ b/man2/userfaultfd.2
@@ -78,6 +78,32 @@ all memory ranges that were registered with the object are unregistered
and unread events are flushed.
.\"
.PP
+Userfaultfd supports two modes of registration:
+.TP
+.BR UFFDIO_REGISTER_MODE_MISSING " (since 4.10)"
+When registered with
+.B UFFDIO_REGISTER_MODE_MISSING
+mode, the userspace will receive a page fault message
+when a missing page is accessed.
+The faulted thread will be stopped from execution until the page fault is
+resolved from the userspace by either an
+.B UFFDIO_COPY
+or an
+.B UFFDIO_ZEROPAGE
+ioctl.
+.TP
+.BR UFFDIO_REGISTER_MODE_WP " (since 5.7)"
+When registered with
+.B UFFDIO_REGISTER_MODE_WP
+mode, the userspace will receive a page fault message
+when a write-protected page is written.
+The faulted thread will be stopped from execution
+until the userspace write-unprotects the page using an
+.B UFFDIO_WRITEPROTECT
+ioctl.
+.PP
+Multiple modes can be enabled at the same time for the same memory range.
+.PP
Since Linux 4.14, userfaultfd page fault message can selectively embed faulting
thread ID information into the fault message.
One needs to enable this feature explicitly using the
@@ -107,7 +133,7 @@ the process that monitors userfaultfd and handles page faults
needs to be aware of the changes in the virtual memory layout
of the faulting process to avoid memory corruption.
.PP
-Starting from Linux 4.11,
+Since Linux 4.11,
userfaultfd can also notify the fault-handling threads about changes
in the virtual memory layout of the faulting process.
In addition, if the faulting process invokes
@@ -144,6 +170,17 @@ single threaded non-cooperative userfaultfd manager implementations.
.\" and limitations remaining in 4.11
.\" Maybe it's worth adding a dedicated sub-section...
.\"
+.PP
+Since Linux 5.7, userfaultfd is able to do
+synchronous page dirty tracking using the new write-protect register mode.
+One should check against the feature bit
+.B UFFD_FEATURE_PAGEFAULT_FLAG_WP
+before using this feature.
+Similar to the original userfaultfd missing mode, the write-protect mode will
+generate a userfaultfd message when the protected page is written.
+The user needs to resolve the page fault by unprotecting the faulted page and
+kicking the faulted thread to continue.
+For more information, please refer to "Userfaultfd write-protect mode" section.
.SS Userfaultfd operation
After the userfaultfd object is created with
.BR userfaultfd (),
@@ -179,7 +216,7 @@ or
.BR ioctl (2)
operations to resolve the page fault.
.PP
-Starting from Linux 4.14, if the application sets the
+Since Linux 4.14, if the application sets the
.B UFFD_FEATURE_SIGBUS
feature bit using the
.B UFFDIO_API
@@ -219,6 +256,65 @@ userfaultfd can be used only with anonymous private memory mappings.
Since Linux 4.11,
userfaultfd can be also used with hugetlbfs and shared memory mappings.
.\"
+.SS Userfaultfd write-protect mode (since 5.7)
+Since Linux 5.7, userfaultfd supports write-protect mode.
+The user needs to first check availability of this feature using
+.B UFFDIO_API
+ioctl against the feature bit
+.B UFFD_FEATURE_PAGEFAULT_FLAG_WP
+before using this feature.
+.PP
+To register with userfaultfd write-protect mode, the user needs to initiate the
+.B UFFDIO_REGISTER
+ioctl with mode
+.B UFFDIO_REGISTER_MODE_WP
+set.
+Note that it's legal to monitor the same memory range with multiple modes.
+For example, the user can do
+.B UFFDIO_REGISTER
+with the mode set to
+.BR "UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP" .
+When there is only
+.B UFFDIO_REGISTER_MODE_WP
+registered, the userspace will
+.I not
+receive any message when a missing page is written.
+Instead, the userspace will only receive a write-protect page fault message
+when an existing but write-protected page is written.
+.PP
+After the
+.B UFFDIO_REGISTER
+ioctl completed with
+.B UFFDIO_REGISTER_MODE_WP
+mode set,
+the user can write-protect any existing memory within the range using the ioctl
+.B UFFDIO_WRITEPROTECT
+where
+.I uffdio_writeprotect.mode
+should be set to
+.BR UFFDIO_WRITEPROTECT_MODE_WP .
+.PP
+When a write-protect event happens,
+the userspace will receive a page fault message whose
+.I uffd_msg.pagefault.flags
+has the
+.B UFFD_PAGEFAULT_FLAG_WP
+flag set.
+Note: since only writes can trigger this kind of fault,
+write-protect messages will always have the
+.B UFFD_PAGEFAULT_FLAG_WRITE
+bit set too, along with bit
+.BR UFFD_PAGEFAULT_FLAG_WP .
+.PP
+To resolve a write-protection page fault, the user should initiate another
+.B UFFDIO_WRITEPROTECT
+ioctl, whose
+.I uffdio_writeprotect.mode
+should have the flag
+.B UFFDIO_WRITEPROTECT_MODE_WP
+cleared upon the faulted page or range.
+.PP
+Write-protect mode only supports private anonymous memory.
.SS Reading from the userfaultfd structure
Each
.BR read (2)
@@ -364,8 +460,12 @@ flag (see
.BR ioctl_userfaultfd (2))
and this flag is set, this a write fault;
otherwise it is a read fault.
-.\"
-.\" UFFD_PAGEFAULT_FLAG_WP is not yet supported.
+.TP
+.B UFFD_PAGEFAULT_FLAG_WP
+If the address is in a range that was registered with the
+.B UFFDIO_REGISTER_MODE_WP
+flag, when this bit is set it means it's a write-protect fault.
+Otherwise it's a page missing fault.
.RE
.TP
.I pagefault.feat.ptid
--
2.26.2
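
The registration and fault-classification steps described in this patch can be
sketched in C roughly as follows. This is only an illustration under stated
assumptions: the header must define UFFDIO_REGISTER_MODE_WP and
UFFD_PAGEFAULT_FLAG_WP, and the names uffd_register_missing_wp(),
classify_fault() and enum fault_kind are hypothetical, not taken from the
patch.

```c
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical classification of a message read from the userfaultfd. */
enum fault_kind { FAULT_NOT_PAGEFAULT, FAULT_MISSING, FAULT_WP };

/* Hypothetical helper: register [addr, addr + len) for both missing and
 * write-protect tracking. Returns the ioctl result (0 or -1). */
static int uffd_register_missing_wp(int uffd, void *addr, size_t len)
{
    struct uffdio_register reg;

    memset(&reg, 0, sizeof(reg));
    reg.range.start = (unsigned long)addr;
    reg.range.len = len;
    reg.mode = UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP;

    return ioctl(uffd, UFFDIO_REGISTER, &reg);
}

/* Hypothetical helper: decide how a message should be resolved.
 * A write-protect fault carries UFFD_PAGEFAULT_FLAG_WP (together with
 * UFFD_PAGEFAULT_FLAG_WRITE); any other page fault is treated as missing. */
static enum fault_kind classify_fault(const struct uffd_msg *msg)
{
    if (msg->event != UFFD_EVENT_PAGEFAULT)
        return FAULT_NOT_PAGEFAULT;
    if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP)
        return FAULT_WP;
    return FAULT_MISSING;
}
```

A handler would resolve FAULT_MISSING with UFFDIO_COPY or UFFDIO_ZEROPAGE, and
FAULT_WP with a UFFDIO_WRITEPROTECT call whose mode has
UFFDIO_WRITEPROTECT_MODE_WP cleared.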

2021-03-29 22:20:42

by Peter Xu

Subject: [PATCH v5 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

Userfaultfd write-protect mode is supported starting from Linux 5.7.

Acked-by: Mike Rapoport <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
man2/ioctl_userfaultfd.2 | 84 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 81 insertions(+), 3 deletions(-)

diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2
index d4a8375b8..ca533a383 100644
--- a/man2/ioctl_userfaultfd.2
+++ b/man2/ioctl_userfaultfd.2
@@ -234,6 +234,11 @@ operation is supported.
The
.B UFFDIO_UNREGISTER
operation is supported.
+.TP
+.B 1 << _UFFDIO_WRITEPROTECT
+The
+.B UFFDIO_WRITEPROTECT
+operation is supported.
.PP
This
.BR ioctl (2)
@@ -322,9 +327,6 @@ Track page faults on missing pages.
.B UFFDIO_REGISTER_MODE_WP
Track page faults on write-protected pages.
.PP
-Currently, the only supported mode is
-.BR UFFDIO_REGISTER_MODE_MISSING .
-.PP
If the operation is successful, the kernel modifies the
.I ioctls
bit-mask field to indicate which
@@ -443,6 +445,16 @@ operation:
.TP
.B UFFDIO_COPY_MODE_DONTWAKE
Do not wake up the thread that waits for page-fault resolution
+.TP
+.B UFFDIO_COPY_MODE_WP
+Copy the page with read-only permission.
+This allows the user to trap the next write to the page,
+which will block and generate another write-protect userfault message.
+This is only used when both
+.B UFFDIO_REGISTER_MODE_MISSING
+and
+.B UFFDIO_REGISTER_MODE_WP
+modes are enabled for the registered range.
.PP
The
.I copy
@@ -654,6 +666,72 @@ field of the
structure was not a multiple of the system page size; or
.I len
was zero; or the specified range was otherwise invalid.
+.SS UFFDIO_WRITEPROTECT (Since Linux 5.7)
+Write-protect or write-unprotect a userfaultfd-registered memory range
+that was registered with mode
+.BR UFFDIO_REGISTER_MODE_WP .
+.PP
+The
+.I argp
+argument is a pointer to a
+.I uffdio_writeprotect
+structure as shown below:
+.PP
+.in +4n
+.EX
+struct uffdio_writeprotect {
+ struct uffdio_range range; /* Range to change write permission */
+ __u64 mode; /* Mode to change write permission */
+};
+.EE
+.in
+There're two mode bits that are supported in this structure:
+.TP
+.B UFFDIO_WRITEPROTECT_MODE_WP
+When this mode bit is set, the ioctl will be a write-protect operation upon the
+memory range specified by
+.IR range .
+Otherwise it'll be a write-unprotect operation upon the specified range,
+which can be used to resolve a userfaultfd write-protect page fault.
+.TP
+.B UFFDIO_WRITEPROTECT_MODE_DONTWAKE
+When this mode bit is set,
+do not wake up any thread that waits for page-fault resolution after the operation.
+This can be specified only if
+.B UFFDIO_WRITEPROTECT_MODE_WP
+is not specified.
+.PP
+This
+.BR ioctl (2)
+operation returns 0 on success.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EINVAL
+The
+.I start
+or the
+.I len
+field of the
+.I uffdio_range
+structure was not a multiple of the system page size; or
+.I len
+was zero; or the specified range was otherwise invalid.
+.TP
+.B EAGAIN
+The process was interrupted; retry this call.
+.TP
+.B ENOENT
+The range specified in
+.I range
+is not valid.
+For example, the virtual address does not exist,
+or is not registered with userfaultfd write-protect mode.
+.TP
+.B EFAULT
+Encountered a generic fault during processing.
.SH RETURN VALUE
See descriptions of the individual operations, above.
.SH ERRORS
--
2.26.2
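
As an illustration of the UFFDIO_WRITEPROTECT mode bits documented in this
patch, here is a small C sketch. It assumes headers from Linux 5.7 or later
(for struct uffdio_writeprotect and UFFDIO_WRITEPROTECT). The helpers
wp_fill() and wp_apply() are hypothetical names. Rejecting the combination of
MODE_WP and MODE_DONTWAKE in user space simply mirrors the rule stated in the
text above; it is a choice made for this sketch.

```c
#include <errno.h>
#include <linux/userfaultfd.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: fill a write-protect request.
 * protect != 0  -> write-protect the range (MODE_WP set)
 * protect == 0  -> write-unprotect the range, which resolves a fault
 * dontwake != 0 -> do not wake waiting threads (only valid when unprotecting)
 * Returns 0 on success, or -1 with errno set to EINVAL. */
static int wp_fill(struct uffdio_writeprotect *wp,
                   unsigned long start, unsigned long len,
                   int protect, int dontwake)
{
    if (protect && dontwake) {
        errno = EINVAL;
        return -1;
    }

    memset(wp, 0, sizeof(*wp));
    wp->range.start = start;
    wp->range.len = len;
    if (protect)
        wp->mode |= UFFDIO_WRITEPROTECT_MODE_WP;
    if (dontwake)
        wp->mode |= UFFDIO_WRITEPROTECT_MODE_DONTWAKE;
    return 0;
}

/* Hypothetical helper: submit the request on an open userfaultfd.
 * Returns 0 on success; on error returns -1 with errno set by ioctl(2). */
static int wp_apply(int uffd, struct uffdio_writeprotect *wp)
{
    return ioctl(uffd, UFFDIO_WRITEPROTECT, wp);
}
```

To resolve a write-protect fault at a page-aligned address, a handler would
call wp_fill() with protect set to 0 for that page and then wp_apply(); a
caller that gets EAGAIN from wp_apply() would retry the call.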

2021-04-01 18:29:26

by Alejandro Colomar

Subject: Re: [PATCH v5 0/4] man2: update mm/userfaultfd manpages to latest

Hi Peter,

On 3/30/21 12:18 AM, Peter Xu wrote:
> v5:
> - add r-bs for Mike R.
> - Fix spelling mistake "diable" [Mike R.]
> - s/Starting from/Since/ for patch 2 (also replaced two existing ones in the
> same file) [Alex]
> - s/un-write-protect/write-unprotect/ [Alex]
> - s/The process was interrupted and need to retry/The process was interrupted;
> retry this call/ in the last patch. [Alex]
>
> v4:
> - Fixed a few "subordinate clauses" (SC) cases [Alex]
> - Reword in ioctl_userfaultfd.2 to use bold font for the two modes referenced,
> so as to be clear on what is "both" referring to [Alex]
>
> v3:
> - Don't use "Currently", instead add "(since x.y)" mark where proper [Alex]
> - Always use semantic newlines across the whole patchset [Alex]
> - Use quote when possible, rather than escapes [Alex]
> - Fix one missing replacement of ".BR" -> ".B" [Alex]
> - Some other trivial rephrases here and there when fixing up above
>
> v2 changes:
> - Fix wordings as suggested [MikeR]
> - convert ".BR" to ".B" where proper for the patchset [Alex]
> - rearrange a few lines in the last two patches where they got messed up
> - document more things, e.g. UFFDIO_COPY_MODE_WP; and also on how to resolve a
> wr-protect page fault.
>
> There're two features missing in current manpage, namely:
>
> (1) Userfaultfd Thread-ID feature
> (2) Userfaultfd write protect mode
>
> There's also a 3rd one which was just contributed from Axel - Axel, I think it
> would be great if you can add that part too, probably after the whole
> hugetlbfs/shmem minor mode reaches the linux master branch.
>
> Please review, thanks.
>
> Peter Xu (4):
> userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs
> userfaultfd.2: Add write-protect mode
> ioctl_userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs
> ioctl_userfaultfd.2: Add write-protect mode docs

I applied all 4 patches (with a few minor fixes to 1/4 and 4/4 (cosmetic
fixes; some of them about the 80-col right margin)):
<https://github.com/alejandro-colomar/man-pages/tree/eb8f2001d493d458d08b9b87605ed2ac453c7f5f>

Thanks!

Alex

>
> man2/ioctl_userfaultfd.2 | 89 +++++++++++++++++++++++++++-
> man2/userfaultfd.2 | 121 +++++++++++++++++++++++++++++++++++++--
> 2 files changed, 203 insertions(+), 7 deletions(-)
>


--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/

Subject: Re: [PATCH v5 0/4] man2: update mm/userfaultfd manpages to latest

Hi Alex,

> I applied all 4 patches (with a few minor fixes to 1/4 and 4/4 (cosmetic
> fixes; some of them about the 80-col right margin)):
> <https://github.com/alejandro-colomar/man-pages/tree/eb8f2001d493d458d08b9b87605ed2ac453c7f5f>

How big is your current queue of pending patches from others?

Thanks,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/