2020-08-24 12:27:08

by David Howells

Subject: [PATCH 1/5] Add manpage for open_tree(2)

Add a manual page to document the open_tree() system call.

Signed-off-by: David Howells <[email protected]>
---

man2/open_tree.2 | 249 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 249 insertions(+)
create mode 100644 man2/open_tree.2

diff --git a/man2/open_tree.2 b/man2/open_tree.2
new file mode 100644
index 000000000..d480bd82f
--- /dev/null
+++ b/man2/open_tree.2
@@ -0,0 +1,249 @@
+'\" t
+.\" Copyright (c) 2020 David Howells <[email protected]>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH OPEN_TREE 2 2020-08-24 "Linux" "Linux Programmer's Manual"
+.SH NAME
+open_tree \- Pick or clone mount object and attach to fd
+.SH SYNOPSIS
+.nf
+.B #include <sys/types.h>
+.B #include <sys/mount.h>
+.B #include <unistd.h>
+.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
+.PP
+.BI "int open_tree(int " dirfd ", const char *" pathname ", unsigned int " flags );
+.fi
+.PP
+.IR Note :
+There is no glibc wrapper for this system call.
+.SH DESCRIPTION
+.BR open_tree ()
+picks the mount object specified by the pathname and attaches it to a new file
+descriptor or clones it and attaches the clone to the file descriptor. The
+resultant file descriptor is indistinguishable from one produced by
+.BR open "(2) with " O_PATH .
+.PP
+In the case that the mount object is cloned, the clone will be "unmounted" and
+destroyed when the file descriptor is closed if it is not otherwise mounted
+somewhere by calling
+.BR move_mount (2).
+.PP
+To select a mount object, no permissions are required on the object referred
+to by the path, but execute (search) permission is required on all of the
+directories in
+.I pathname
+that lead to the object.
+.PP
+Appropriate privilege (Linux: the
+.B CAP_SYS_ADMIN
+capability) is required to clone mount objects.
+.PP
+.BR open_tree ()
+uses
+.IR pathname ", " dirfd " and " flags
+to locate the target object in one of a variety of ways:
+.TP
+[*] By absolute path.
+.I pathname
+points to an absolute path and
+.I dirfd
+is ignored. The object is looked up by name, starting from the root of the
+filesystem as seen by the calling process.
+.TP
+[*] By cwd-relative path.
+.I pathname
+points to a relative path and
+.IR dirfd " is " AT_FDCWD .
+The object is looked up by name, starting from the current working directory.
+.TP
+[*] By dir-relative path.
+.I pathname
+points to a relative path and
+.I dirfd
+indicates a file descriptor pointing to a directory. The object is looked up
+by name, starting from the directory specified by
+.IR dirfd .
+.TP
+[*] By file descriptor.
+.I pathname
+is "",
+.I dirfd
+indicates a file descriptor and
+.B AT_EMPTY_PATH
+is set in
+.IR flags .
+The mount attached to the file descriptor is queried directly. The file
+descriptor may point to any type of file, not just a directory.
+.PP
+.I flags
+can be used to control the operation of the function and to influence a
+path-based lookup. A value for
+.I flags
+is constructed by OR'ing together zero or more of the following constants:
+.TP
+.BR AT_EMPTY_PATH
+.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
+If
+.I pathname
+is an empty string, operate on the file referred to by
+.IR dirfd
+(which may have been obtained from
+.BR open "(2) with"
+.BR O_PATH ", from " fsmount (2)
+or from another
+.BR open_tree ()).
+If
+.I dirfd
+is
+.BR AT_FDCWD ,
+the call operates on the current working directory.
+In this case,
+.I dirfd
+can refer to any type of file, not just a directory.
+This flag is Linux-specific; define
+.B _GNU_SOURCE
+.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
+to obtain its definition.
+.TP
+.BR AT_NO_AUTOMOUNT
+Don't automount the final ("basename") component of
+.I pathname
+if it is a directory that is an automount point. This flag allows the
+automount point itself to be picked up or a mount cloned that is rooted on the
+automount point. The
+.B AT_NO_AUTOMOUNT
+flag has no effect if the mount point has already been mounted over.
+This flag is Linux-specific; define
+.B _GNU_SOURCE
+.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
+to obtain its definition.
+.TP
+.B AT_SYMLINK_NOFOLLOW
+If
+.I pathname
+is a symbolic link, do not dereference it: instead pick up or clone a mount
+rooted on the link itself.
+.TP
+.B OPEN_TREE_CLOEXEC
+Set the close-on-exec flag for the new file descriptor. This will cause the
+file descriptor to be closed automatically when the process executes a new program.
+.TP
+.B OPEN_TREE_CLONE
+Rather than directly attaching the selected object to the file descriptor,
+clone the object, set the root of the new mount object to that point and
+attach the clone to the file descriptor.
+.TP
+.B AT_RECURSIVE
+This is only permitted in conjunction with \fBOPEN_TREE_CLONE\fP. It causes the
+entire mount subtree rooted at the selected spot to be cloned rather than just
+that one mount object.
+.SH RETURN VALUE
+On success, the new file descriptor is returned. On error, \-1 is returned,
+and
+.I errno
+is set appropriately.
+.SH ERRORS
+.TP
+.B EACCES
+Search permission is denied for one of the directories
+in the path prefix of
+.IR pathname .
+(See also
+.BR path_resolution (7).)
+.TP
+.B EBADF
+.I dirfd
+is not a valid open file descriptor.
+.TP
+.B EFAULT
+.I pathname
+is NULL or
+.IR pathname
+points to a location outside the process's accessible address space.
+.TP
+.B EINVAL
+Reserved flag specified in
+.IR flags .
+.TP
+.B ELOOP
+Too many symbolic links encountered while traversing the pathname.
+.TP
+.B ENAMETOOLONG
+.I pathname
+is too long.
+.TP
+.B ENOENT
+A component of
+.I pathname
+does not exist, or
+.I pathname
+is an empty string and
+.B AT_EMPTY_PATH
+was not specified in
+.IR flags .
+.TP
+.B ENOMEM
+Out of memory (i.e., kernel memory).
+.TP
+.B ENOTDIR
+A component of the path prefix of
+.I pathname
+is not a directory or
+.I pathname
+is relative and
+.I dirfd
+is a file descriptor referring to a file other than a directory.
+.SH VERSIONS
+.BR open_tree ()
+was added to Linux in kernel 5.2.
+.SH CONFORMING TO
+.BR open_tree ()
+is Linux-specific.
+.SH NOTES
+Glibc does not (yet) provide a wrapper for the
+.BR open_tree ()
+system call; call it using
+.BR syscall (2).
+.SH EXAMPLE
+The
+.BR open_tree ()
+function can be used like the following:
+.PP
+.RS
+.nf
+fd1 = open_tree(AT_FDCWD, "/mnt", 0);
+fd2 = open_tree(fd1, "",
+ AT_EMPTY_PATH | OPEN_TREE_CLONE | AT_RECURSIVE);
+move_mount(fd2, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
+.fi
+.RE
+.PP
+This would attach the path point for "/mnt" to fd1, then it would copy the
+entire subtree at the point referred to by fd1 and attach that to fd2; lastly,
+it would attach the clone to "/mnt2".
+.SH SEE ALSO
+.BR fsmount (2),
+.BR move_mount (2),
+.BR open (2)
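
Since the page notes that glibc provides no wrapper, a caller would have to
issue open_tree() through syscall(2). The following is a minimal sketch of
that, not a definitive implementation. The names sys_open_tree() and
open_tree_flags_ok() are hypothetical helpers, and the fallback constant
values are assumptions that should be checked against the kernel's UAPI
headers for the target architecture. The flag check mirrors the rule stated
in the page that AT_RECURSIVE is only permitted together with OPEN_TREE_CLONE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>        /* AT_FDCWD, O_CLOEXEC */
#include <stdbool.h>
#include <sys/syscall.h>  /* SYS_* numbers, where the headers know them */
#include <unistd.h>       /* syscall() */

/* Fallbacks: assumed values for headers that predate Linux 5.2. */
#ifndef SYS_open_tree
#define SYS_open_tree 428
#endif
#ifndef OPEN_TREE_CLONE
#define OPEN_TREE_CLONE 1
#endif
#ifndef OPEN_TREE_CLOEXEC
#define OPEN_TREE_CLOEXEC O_CLOEXEC
#endif
#ifndef AT_RECURSIVE
#define AT_RECURSIVE 0x8000
#endif

/* Hypothetical wrapper: forwards its arguments to the raw system call. */
static inline int sys_open_tree(int dirfd, const char *pathname,
                                unsigned int flags)
{
    return (int)syscall(SYS_open_tree, dirfd, pathname, flags);
}

/* Hypothetical pre-check: AT_RECURSIVE requires OPEN_TREE_CLONE. */
static inline bool open_tree_flags_ok(unsigned int flags)
{
    if ((flags & AT_RECURSIVE) && !(flags & OPEN_TREE_CLONE))
        return false;
    return true;
}
```

A caller might then write fd = sys_open_tree(AT_FDCWD, "/mnt",
OPEN_TREE_CLOEXEC) and test the result against -1 before using fd.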



2020-08-24 12:27:17

by David Howells

Subject: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

Add a manual page to document the fsopen() and fsmount() system calls.

Signed-off-by: David Howells <[email protected]>
---

man2/fsmount.2 | 1
man2/fsopen.2 | 245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 246 insertions(+)
create mode 100644 man2/fsmount.2
create mode 100644 man2/fsopen.2

diff --git a/man2/fsmount.2 b/man2/fsmount.2
new file mode 100644
index 000000000..2bf59fc3e
--- /dev/null
+++ b/man2/fsmount.2
@@ -0,0 +1 @@
+.so man2/fsopen.2
diff --git a/man2/fsopen.2 b/man2/fsopen.2
new file mode 100644
index 000000000..1d1bba238
--- /dev/null
+++ b/man2/fsopen.2
@@ -0,0 +1,245 @@
+'\" t
+.\" Copyright (c) 2020 David Howells <[email protected]>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH FSOPEN 2 2020-08-07 "Linux" "Linux Programmer's Manual"
+.SH NAME
+fsopen, fsmount \- Filesystem parameterisation and mount creation
+.SH SYNOPSIS
+.nf
+.B #include <sys/types.h>
+.B #include <sys/mount.h>
+.B #include <unistd.h>
+.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
+.PP
+.BI "int fsopen(const char *" fsname ", unsigned int " flags );
+.PP
+.BI "int fsmount(int " fd ", unsigned int " flags ", unsigned int " mount_attrs );
+.fi
+.PP
+.IR Note :
+There are no glibc wrappers for these system calls.
+.SH DESCRIPTION
+.PP
+.BR fsopen ()
+creates a blank filesystem configuration context within the kernel for the
+filesystem named in the
+.I fsname
+parameter, puts it into creation mode and attaches it to a file descriptor,
+which it then returns. The file descriptor can be marked close-on-exec by
+setting
+.B FSOPEN_CLOEXEC
+in
+.IR flags .
+.PP
+After calling fsopen(), the file descriptor should be passed to the
+.BR fsconfig (2)
+system call, using that to specify the desired filesystem and security
+parameters.
+.PP
+When the parameters are all set, the
+.BR fsconfig ()
+system call should then be called again with
+.B FSCONFIG_CMD_CREATE
+as the command argument to effect the creation.
+.RS
+.PP
+.BR "[!]\ NOTE" :
+Depending on the filesystem type and parameters, this may rather share an
+existing in-kernel filesystem representation instead of creating a new one.
+In such a case, the parameters specified may be discarded or may overwrite the
+parameters set by a previous mount - at the filesystem's discretion.
+.RE
+.PP
+The file descriptor also serves as a channel by which more comprehensive error,
+warning and information messages may be retrieved from the kernel using
+.BR read (2).
+.PP
+Once the creation command has been successfully run on a context, the context
+will not accept further configuration. At
+this point,
+.BR fsmount ()
+should be called to create a mount object.
+.PP
+.BR fsmount ()
+takes the file descriptor returned by
+.BR fsopen ()
+and creates a mount object for the filesystem root specified there. The
+attributes of the mount object are set from the
+.I mount_attrs
+parameter. The attributes specify the propagation and mount restrictions to
+be applied to accesses through this mount.
+.PP
+The mount object is then attached to a new file descriptor that looks like one
+created by
+.BR open "(2) with " O_PATH " or " open_tree (2).
+This can be passed to
+.BR move_mount (2)
+to attach the mount object to a mountpoint, thereby completing the process.
+.PP
+The file descriptor returned by fsmount() is marked close-on-exec if
+FSMOUNT_CLOEXEC is specified in
+.IR flags .
+.PP
+After fsmount() has completed, the context created by fsopen() is reset and
+moved to reconfiguration state, allowing the new superblock to be
+reconfigured. See
+.BR fspick (2)
+for details.
+.PP
+To use either of these calls, the caller requires the appropriate privilege
+(Linux: the
+.B CAP_SYS_ADMIN
+capability).
+.PP
+.SS Message Retrieval Interface
+The context file descriptor may be queried for message strings at any time by
+calling
+.BR read (2)
+on the file descriptor. This will return formatted messages that are prefixed
+to indicate their class:
+.TP
+\fB"e <message>"\fP
+An error message string was logged.
+.TP
+\fB"i <message>"\fP
+An informational message string was logged.
+.TP
+\fB"w <message>"\fP
+A warning message string was logged.
+.PP
+Messages are removed from the queue as they're read.
+.SH RETURN VALUE
+On success, both functions return a file descriptor. On error, \-1 is
+returned, and
+.I errno
+is set appropriately.
+.SH ERRORS
+The error values given below result from filesystem type independent
+errors.
+Each filesystem type may have its own special errors and its
+own special behavior.
+See the Linux kernel source code for details.
+.TP
+.B EBUSY
+The context referred to by
+.I fd
+is not in the right state to be used by
+.BR fsmount ().
+.TP
+.B EFAULT
+One of the pointer arguments points outside the user address space.
+.TP
+.B EINVAL
+.I flags
+had an invalid flag set.
+.TP
+.B EINVAL
+.I mount_attrs
+includes invalid
+.BR MOUNT_ATTR_*
+flags.
+.TP
+.B EMFILE
+The process has too many open files to create more.
+.TP
+.B ENFILE
+The system has too many open files to create more.
+.TP
+.B ENODEV
+The filesystem
+.I fsname
+is not available in the kernel.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the call.
+.TP
+.B EPERM
+The caller does not have the required privileges.
+.SH CONFORMING TO
+These functions are Linux-specific and should not be used in programs intended
+to be portable.
+.SH VERSIONS
+.BR fsopen "() and " fsmount ()
+were added to Linux in kernel 5.2.
+.SH NOTES
+Glibc does not (yet) provide a wrapper for the
+.BR fsopen "() or " fsmount "()"
+system calls; call them using
+.BR syscall (2).
+.SH EXAMPLES
+To illustrate the process, here's an example whereby this can be used to mount
+an ext4 filesystem on /dev/sdb1 onto /mnt.
+.PP
+.in +4n
+.nf
+sfd = fsopen("ext4", FSOPEN_CLOEXEC);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "source", "/dev/sdb1", 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "iversion", NULL, 0);
+fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_RELATIME);
+move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.fi
+.in
+.PP
+Here, an ext4 context is created first and attached to sfd. The context is
+then told where its source will be and given a set of options, after which a
+superblock is created. Then fsmount() is called to create a mount
+object and
+.BR move_mount (2)
+is called to attach it to its intended mountpoint.
+.PP
+And here's an example of mounting from an NFS server and setting a Smack
+security module label on it too:
+.PP
+.in +4n
+.nf
+sfd = fsopen("nfs", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "source", "example.com:/pub", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "nfsvers", "3", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "rsize", "65536", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "wsize", "65536", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "smackfsdef", "foolabel", 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "rdma", NULL, 0);
+fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mfd = fsmount(sfd, 0, MOUNT_ATTR_NODEV);
+move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.fi
+.in
+.PP
+.SH SEE ALSO
+.BR mountpoint (1),
+.BR fsconfig (2),
+.BR fspick (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR umount (2),
+.BR mount_namespaces (7),
+.BR path_resolution (7),
+.BR mount (8),
+.BR umount (8)
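
The message retrieval interface described in the page could be consumed
along the following lines. This is a sketch under stated assumptions, not
the kernel's or any library's API: fs_msg_classify() and fs_msg_drain() are
hypothetical names, each successful read(2) is assumed to return one whole
message, and the loop simply stops on the first read(2) that returns 0 or -1
without interpreting errno.

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

enum fs_msg_class { FS_MSG_ERROR, FS_MSG_WARNING, FS_MSG_INFO, FS_MSG_UNKNOWN };

/* Map the one-letter prefix ("e ", "w ", "i ") to a message class. */
static inline enum fs_msg_class fs_msg_classify(const char *msg)
{
    if (msg == NULL || msg[0] == '\0' || msg[1] != ' ')
        return FS_MSG_UNKNOWN;
    switch (msg[0]) {
    case 'e': return FS_MSG_ERROR;
    case 'w': return FS_MSG_WARNING;
    case 'i': return FS_MSG_INFO;
    default:  return FS_MSG_UNKNOWN;
    }
}

/*
 * Read messages from a context fd and print them to stderr with a label.
 * Returns the number of messages printed.
 */
static inline int fs_msg_drain(int fd)
{
    static const char *const label[] = { "error", "warning", "info", "other" };
    char buf[4096];
    ssize_t n;
    int count = 0;

    while ((n = read(fd, buf, sizeof(buf) - 1)) > 0) {
        enum fs_msg_class cls;

        buf[n] = '\0';
        cls = fs_msg_classify(buf);
        /* Skip the two-character prefix only when one was recognised. */
        fprintf(stderr, "[%s] %s\n", label[cls],
                cls == FS_MSG_UNKNOWN ? buf : buf + 2);
        count++;
    }
    return count;
}
```

A caller might invoke fs_msg_drain(sfd) after a failed fsconfig() call to
surface whatever detail the filesystem logged.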


2020-08-24 12:27:56

by David Howells

Subject: [PATCH 3/5] Add manpage for fspick(2)

Add a manual page to document the fspick() system call.

Signed-off-by: David Howells <[email protected]>
---

man2/fspick.2 | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 180 insertions(+)
create mode 100644 man2/fspick.2

diff --git a/man2/fspick.2 b/man2/fspick.2
new file mode 100644
index 000000000..72bf645dd
--- /dev/null
+++ b/man2/fspick.2
@@ -0,0 +1,180 @@
+'\" t
+.\" Copyright (c) 2020 David Howells <[email protected]>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH FSPICK 2 2020-08-24 "Linux" "Linux Programmer's Manual"
+.SH NAME
+fspick \- Select filesystem for reconfiguration
+.SH SYNOPSIS
+.nf
+.B #include <sys/types.h>
+.B #include <sys/mount.h>
+.B #include <unistd.h>
+.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
+.PP
+.BI "int fspick(int " dirfd ", const char *" pathname ", unsigned int " flags );
+.fi
+.PP
+.IR Note :
+There is no glibc wrapper for this system call.
+.SH DESCRIPTION
+.PP
+.BR fspick ()
+creates a new filesystem configuration context within the kernel and attaches a
+pre-existing superblock to it so that it can be reconfigured (similar to
+.BR mount (8)
+with the "-o remount" option). The configuration context is marked as being in
+reconfiguration mode and attached to a file descriptor, which is returned to
+the caller. The file descriptor can be marked close-on-exec by setting
+.B FSPICK_CLOEXEC
+in
+.IR flags .
+.PP
+The target is whichever superblock backs the object determined by
+.IR dirfd ", " pathname " and " flags .
+The following can be set in
+.I flags
+to control the pathwalk to that object:
+.TP
+.B FSPICK_SYMLINK_NOFOLLOW
+Don't follow symbolic links in the final component of the path.
+.TP
+.B FSPICK_NO_AUTOMOUNT
+Don't follow automounts in the final component of the path.
+.TP
+.B FSPICK_EMPTY_PATH
+Allow an empty string to be specified as the pathname. This allows
+.I dirfd
+to specify the target mount exactly.
+.PP
+After calling fspick(), the file descriptor should be passed to the
+.BR fsconfig (2)
+system call, using that to specify the desired changes to filesystem and
+security parameters.
+.PP
+When the parameters are all set, the
+.BR fsconfig ()
+system call should then be called again with
+.B FSCONFIG_CMD_RECONFIGURE
+as the command argument to effect the reconfiguration.
+.PP
+After the reconfiguration has taken place, the context is wiped clean (apart
+from the superblock attachment, which remains) and can be reused to make
+another reconfiguration.
+.PP
+The file descriptor also serves as a channel by which more comprehensive error,
+warning and information messages may be retrieved from the kernel using
+.BR read (2).
+.SS Message Retrieval Interface
+The context file descriptor may be queried for message strings at any time by
+calling
+.BR read (2)
+on the file descriptor. This will return formatted messages that are prefixed
+to indicate their class:
+.TP
+\fB"e <message>"\fP
+An error message string was logged.
+.TP
+\fB"i <message>"\fP
+An informational message string was logged.
+.TP
+\fB"w <message>"\fP
+A warning message string was logged.
+.PP
+Messages are removed from the queue as they're read and the queue has a limited
+depth of 8 messages, so it's possible for some to get lost.
+.SH RETURN VALUE
+On success, the function returns a file descriptor. On error, \-1 is returned,
+and
+.I errno
+is set appropriately.
+.SH ERRORS
+The error values given below result from filesystem type independent errors.
+Additionally, each filesystem type may have its own special errors and its own
+special behavior. See the Linux kernel source code for details.
+.TP
+.B EACCES
+A component of a path was not searchable.
+(See also
+.BR path_resolution (7).)
+.TP
+.B EFAULT
+.I pathname
+points outside the user address space.
+.TP
+.B EINVAL
+.I flags
+includes an undefined value.
+.TP
+.B ELOOP
+Too many symbolic links encountered during pathname resolution.
+.TP
+.B EMFILE
+The process has too many open files to create more.
+.TP
+.B ENFILE
+The system has too many open files to create more.
+.TP
+.B ENAMETOOLONG
+A pathname was longer than
+.BR MAXPATHLEN .
+.TP
+.B ENOENT
+A pathname was empty or had a nonexistent component.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the call.
+.TP
+.B EPERM
+The caller does not have the required privileges.
+.SH CONFORMING TO
+This function is Linux-specific and should not be used in programs intended
+to be portable.
+.SH VERSIONS
+.BR fsopen "(), " fsmount "() and " fspick ()
+were added to Linux in kernel 5.2.
+.SH EXAMPLES
+To illustrate the process, here's an example whereby this can be used to
+reconfigure a filesystem:
+.PP
+.in +4n
+.nf
+sfd = fspick(AT_FDCWD, "/mnt", FSPICK_NO_AUTOMOUNT | FSPICK_CLOEXEC);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "user_xattr", "false", 0);
+fsconfig(sfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
+.fi
+.in
+.PP
+.SH NOTES
+Glibc does not (yet) provide a wrapper for the
+.BR fspick "()"
+system call; call it using
+.BR syscall (2).
+.SH SEE ALSO
+.BR mountpoint (1),
+.BR fsconfig (2),
+.BR fsopen (2),
+.BR path_resolution (7),
+.BR mount (8)
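
The fspick() example in the page ignores return values. One way to apply a
list of fsconfig(2) steps and stop at the first failure is sketched below.
All names here (struct fs_step, fsconfig_fn, sys_fsconfig(),
fsconfig_dry_run(), fs_apply_steps()) are hypothetical, and the fallback
syscall number is an assumption to be checked against the kernel headers.
The callback indirection is a design choice that lets the sequencing logic
be exercised without privileges.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_fsconfig
#define SYS_fsconfig 431   /* assumed value for headers that predate Linux 5.2 */
#endif

/* One fsconfig(2) invocation: command, key and optional string value. */
struct fs_step {
    unsigned int cmd;
    const char *key;
    const char *value;
};

/* Signature shared by the raw system call wrapper and the dry-run stand-in. */
typedef int (*fsconfig_fn)(int fd, unsigned int cmd, const char *key,
                           const void *value, int aux);

/* Hypothetical wrapper: forwards its arguments to the raw system call. */
static inline int sys_fsconfig(int fd, unsigned int cmd, const char *key,
                               const void *value, int aux)
{
    return (int)syscall(SYS_fsconfig, fd, cmd, key, value, aux);
}

/* Stand-in that only logs the step and reports success. */
static inline int fsconfig_dry_run(int fd, unsigned int cmd, const char *key,
                                   const void *value, int aux)
{
    (void)fd;
    (void)aux;
    fprintf(stderr, "fsconfig cmd=%u key=%s value=%s\n", cmd,
            key ? key : "(none)", value ? (const char *)value : "(none)");
    return 0;
}

/*
 * Apply the steps in order and stop at the first call that returns -1.
 * Returns the number of steps that succeeded; a result smaller than
 * nsteps means steps[result] failed and errno describes why.
 */
static inline size_t fs_apply_steps(int fd, const struct fs_step *steps,
                                    size_t nsteps, fsconfig_fn do_fsconfig)
{
    size_t done = 0;

    while (done < nsteps) {
        const struct fs_step *s = &steps[done];

        if (do_fsconfig(fd, s->cmd, s->key, s->value, 0) == -1)
            break;
        done++;
    }
    return done;
}
```

With the real constants from the kernel headers, the three fsconfig() calls
of the example would become three struct fs_step entries passed to
fs_apply_steps(sfd, steps, 3, sys_fsconfig), with the result compared
against 3.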


Subject: Re: [PATCH 3/5] Add manpage for fspick(2)

Hello David,

On 8/24/20 2:24 PM, David Howells wrote:
> Add a manual page to document the fspick() system call.
>
> Signed-off-by: David Howells <[email protected]>
> ---
>
> man2/fspick.2 | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 180 insertions(+)
> create mode 100644 man2/fspick.2
>
> diff --git a/man2/fspick.2 b/man2/fspick.2
> new file mode 100644
> index 000000000..72bf645dd
> --- /dev/null
> +++ b/man2/fspick.2
> @@ -0,0 +1,180 @@
> +'\" t
> +.\" Copyright (c) 2020 David Howells <[email protected]>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of this
> +.\" manual under the conditions for verbatim copying, provided that the
> +.\" entire resulting derived work is distributed under the terms of a
> +.\" permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> +.\" responsibility for errors or omissions, or for damages resulting from
> +.\" the use of the information contained herein. The author(s) may not
> +.\" have taken the same level of care in the production of this manual,
> +.\" which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH FSPICK 2 2020-08-24 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +fspick \- Select filesystem for reconfiguration
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/types.h>
> +.B #include <sys/mount.h>
> +.B #include <unistd.h>
> +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> +.PP
> +.BI "int fspick(int " dirfd ", const char *" pathname ", unsigned int " flags );
> +.fi
> +.PP
> +.IR Note :
> +There is no glibc wrapper for this system call.
> +.SH DESCRIPTION
> +.PP
> +.BR fspick ()
> +creates a new filesystem configuration context within the kernel and attaches a
> +pre-existing superblock to it so that it can be reconfigured (similar to
> +.BR mount (8)
> +with the "-o remount" option). The configuration context is marked as being in
> +reconfiguration mode and attached to a file descriptor, which is returned to
> +the caller. The file descriptor can be marked close-on-exec by setting
> +.B FSPICK_CLOEXEC
> +in
> +.IR flags .
> +.PP
> +The target is whichever superblock backs the object determined by
> +.IR dirfd ", " pathname " and " flags .
> +The following can be set in
> +.I flags
> +to control the pathwalk to that object:
> +.TP
> +.B FSPICK_SYMLINK_NOFOLLOW
> +Don't follow symbolic links in the final component of the path.
> +.TP
> +.B FSPICK_NO_AUTOMOUNT
> +Don't follow automounts in the final component of the path.
> +.TP
> +.B FSPICK_EMPTY_PATH
> +Allow an empty string to be specified as the pathname. This allows
> +.I dirfd
> +to specify the target mount exactly.
> +.PP
> +After calling fspick(), the file descriptor should be passed to the
> +.BR fsconfig (2)
> +system call, using that to specify the desired changes to filesystem and

Better: s/using that/in order/

> +security parameters.
> +.PP
> +When the parameters are all set, the
> +.BR fsconfig ()
> +system call should then be called again with
> +.B FSCONFIG_CMD_RECONFIGURE
> +as the command argument to effect the reconfiguration.
> +.PP
> +After the reconfiguration has taken place, the context is wiped clean (apart
> +from the superblock attachment, which remains) and can be reused to make
> +another reconfiguration.
> +.PP
> +The file descriptor also serves as a channel by which more comprehensive error,
> +warning and information messages may be retrieved from the kernel using
> +.BR read (2).
> +.SS Message Retrieval Interface
> +The context file descriptor may be queried for message strings at any time by

s/descriptor/descriptor returned by fspick()/

> +calling
> +.BR read (2)
> +on the file descriptor. This will return formatted messages that are prefixed
> +to indicate their class:
> +.TP
> +\fB"e <message>"\fP
> +An error message string was logged.
> +.TP
> +\fB"i <message>"\fP
> +An informational message string was logged.
> +.TP
> +\fB"w <message>"\fP
> +A warning message string was logged.
> +.PP
> +Messages are removed from the queue as they're read and the queue has a limited
> +depth of 8 messages, so it's possible for some to get lost.

What if there are no pending error messages to retrieve? What does
read() do in that case? Please add an explanation here.

> +.SH RETURN VALUE
> +On success, the function returns a file descriptor. On error, \-1 is returned,
> +and
> +.I errno
> +is set appropriately.
> +.SH ERRORS
> +The error values given below result from filesystem type independent errors.
> +Additionally, each filesystem type may have its own special errors and its own
> +special behavior. See the Linux kernel source code for details.
> +.TP
> +.B EACCES
> +A component of a path was not searchable.
> +(See also
> +.BR path_resolution (7).)
> +.TP
> +.B EFAULT
> +.I pathname
> +points outside the user address space.
> +.TP
> +.B EINVAL
> +.I flags
> +includes an undefined value.
> +.TP
> +.B ELOOP
> +Too many symbolic links encountered during pathname resolution.
> +.TP
> +.B EMFILE
> +The process has too many open files to create more.
> +.TP
> +.B ENFILE
> +The system has too many open files to create more.
> +.TP
> +.B ENAMETOOLONG
> +A pathname was longer than
> +.BR MAXPATHLEN .

MAXPATHLEN is not, I think, a constant known in user space. What is this?
Should it be PATH_MAX?

> +.TP
> +.B ENOENT
> +A pathname was empty or had a nonexistent component.
> +.TP
> +.B ENOMEM
> +The kernel could not allocate sufficient memory to complete the call.
> +.TP
> +.B EPERM
> +The caller does not have the required privileges.

Please note the necessary capability here. Also, there was no mention of
capabilities/privileges in DESCRIPTION. Should there have been?

> +.SH CONFORMING TO
> +This function is Linux-specific and should not be used in programs intended
> +to be portable.
> +.SH VERSIONS
> +.BR fsopen "(), " fsmount "() and " fspick ()
> +were added to Linux in kernel 5.2.
> +.SH EXAMPLES
> +To illustrate the process, here's an example whereby this can be used to
> +reconfigure a filesystem:
> +.PP
> +.in +4n
> +.nf
> +sfd = fspick(AT_FDCWD, "/mnt", FSPICK_NO_AUTOMOUNT | FSPICK_CLOEXEC);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "user_xattr", "false", 0);
> +fsconfig(sfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
> +.fi
> +.in
> +.PP
> +.SH NOTES
> +Glibc does not (yet) provide a wrapper for the
> +.BR fspick "()"
> +system call; call it using
> +.BR syscall (2).
> +.SH SEE ALSO
> +.BR mountpoint (1),
> +.BR fsconfig (2),
> +.BR fsopen (2),
> +.BR path_resolution (7),
> +.BR mount (8)

Thanks,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Subject: Re: [PATCH 1/5] Add manpage for open_tree(2)

Hello David,

Can I ask that you please reply to each of my mails, rather than
just sending out a new patch series (which of course I would also
like you to do). Some things that I mentioned in the last mails
got lost, and I end up having to repeat them.

So, even where I say "please change this", a reply of "done", or a reason
why you declined the suggested change, would be useful.
But in any case, a few words in reply to explain the other changes
that you make would be helpful.

Also, some of my questions now will get a little more complex, and as
well as you updating the pages, I think a little discussion may be
required in some cases.

On 8/24/20 2:24 PM, David Howells wrote:
> Add a manual page to document the open_tree() system call.
>
> Signed-off-by: David Howells <[email protected]>
> ---
>
> man2/open_tree.2 | 249 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 249 insertions(+)
> create mode 100644 man2/open_tree.2
>
> diff --git a/man2/open_tree.2 b/man2/open_tree.2
> new file mode 100644
> index 000000000..d480bd82f
> --- /dev/null
> +++ b/man2/open_tree.2
> @@ -0,0 +1,249 @@
> +'\" t
> +.\" Copyright (c) 2020 David Howells <[email protected]>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of this
> +.\" manual under the conditions for verbatim copying, provided that the
> +.\" entire resulting derived work is distributed under the terms of a
> +.\" permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> +.\" responsibility for errors or omissions, or for damages resulting from
> +.\" the use of the information contained herein. The author(s) may not
> +.\" have taken the same level of care in the production of this manual,
> +.\" which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH OPEN_TREE 2 2020-08-24 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +open_tree \- Pick or clone mount object and attach to fd
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/types.h>
> +.B #include <sys/mount.h>
> +.B #include <unistd.h>
> +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> +.PP
> +.BI "int open_tree(int " dirfd ", const char *" pathname ", unsigned int " flags );
> +.fi
> +.PP
> +.IR Note :
> +There are no glibc wrappers for these system calls.
> +.SH DESCRIPTION
> +.BR open_tree ()
> +picks the mount object specified by the pathname and attaches it to a new file

The terminology "pick" is unusual, and you never really explain what
it means. Is there better terminology? In any case, can you add a few
words to explain what the term ("pick" or whatever alternative you
come up with) means.

> +descriptor or clones it and attaches the clone to the file descriptor. The

Please replace "it" by a noun (phrase) -- maybe: "the mount object"?

> +resultant file descriptor is indistinguishable from one produced by
> +.BR open "(2) with " O_PATH .

What is the significance of that last piece? Can you add some words
about why the fact that the resulting FD is indistinguishable from one
produced by open() O_PATH matters or is useful?

> +.PP
> +In the case that the mount object is cloned, the clone will be "unmounted" and

You place "unmounted" in quotes. Why? Is this to signify that the
unmount is somehow different from other unmounts? If so, please
explain how it is different. If not, then I think we can lose the double
quotes.

> +destroyed when the file descriptor is closed if it is not otherwise mounted
> +somewhere by calling
> +.BR move_mount (2).
> +.PP
> +To select a mount object, no permissions are required on the object referred

Here you use the word "select". Is this the same as "pick"? If yes, please
use the same term.

> +to by the path, but execute (search) permission is required on all of the

s/the path/.I pathname/ ?

(Where pathname == "the pathname argument")

> +directories in
> +.I pathname
> +that lead to the object.
> +.PP
> +Appropriate privilege (Linux: the

s/Linux: //
(This is a Linux specific system call...)

> +.B CAP_SYS_ADMIN
> +capability) is required to clone mount objects.
> +.PP
> +.BR open_tree ()
> +uses
> +.IR pathname ", " dirfd " and " flags
> +to locate the target object in one of a variety of ways:
> +.TP
> +[*] By absolute path.
> +.I pathname
> +points to an absolute path and
> +.I dirfd
> +is ignored. The object is looked up by name, starting from the root of the
> +filesystem as seen by the calling process.
> +.TP
> +[*] By cwd-relative path.
> +.I pathname
> +points to a relative path and
> +.IR dirfd " is " AT_FDCWD .
> +The object is looked up by name, starting from the current working directory.
> +.TP
> +[*] By dir-relative path.
> +.I pathname
> +points to a relative path and
> +.I dirfd
> +indicates a file descriptor pointing to a directory. The object is looked up
> +by name, starting from the directory specified by
> +.IR dirfd .
> +.TP
> +[*] By file descriptor.
> +.I pathname
> +is "",
> +.I dirfd
> +indicates a file descriptor and
> +.B AT_EMPTY_PATH
> +is set in
> +.IR flags .
> +The mount attached to the file descriptor is queried directly. The file
> +descriptor may point to any type of file, not just a directory.

I want to check here. Is it really *any* type of file? Can it be a UNIX
domain socket or a char/block device or a FIFO?

> +.PP
> +.I flags
> +can be used to control the operation of the function and to influence a
> +path-based lookup. A value for
> +.I flags
> +is constructed by OR'ing together zero or more of the following constants:
> +.TP
> +.BR AT_EMPTY_PATH
> +.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
> +If
> +.I pathname
> +is an empty string, operate on the file referred to by
> +.IR dirfd
> +(which may have been obtained from
> +.BR open "(2) with"
> +.BR O_PATH ", from " fsmount (2)
> +or from another

s/another/a previous call to/

> +.BR open_tree ()).
> +If
> +.I dirfd
> +is
> +.BR AT_FDCWD ,
> +the call operates on the current working directory.
> +In this case,
> +.I dirfd
> +can refer to any type of file, not just a directory.
> +This flag is Linux-specific; define
> +.B _GNU_SOURCE
> +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> +to obtain its definition.
> +.TP
> +.BR AT_NO_AUTOMOUNT
> +Don't automount the final ("basename") component of
> +.I pathname
> +if it is a directory that is an automount point. This flag allows the
> +automount point itself to be picked up or a mount cloned that is rooted on the
> +automount point. The
> +.B AT_NO_AUTOMOUNT
> +flag has no effect if the mount point has already been mounted over.
> +This flag is Linux-specific; define
> +.B _GNU_SOURCE
> +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> +to obtain its definition.
> +.TP
> +.B AT_SYMLINK_NOFOLLOW
> +If
> +.I pathname
> +is a symbolic link, do not dereference it: instead pick up or clone a mount
> +rooted on the link itself.
> +.TP
> +.B OPEN_TREE_CLOEXEC
> +Set the close-on-exec flag for the new file descriptor. This will cause the
> +file descriptor to be closed automatically when a process exec's.
> +.TP
> +.B OPEN_TREE_CLONE
> +Rather than directly attaching the selected object to the file descriptor,
> +clone the object, set the root of the new mount object to that point and

Could you expand on "that point" a little. It's not quite clear to me what
you mean there.

> +attach the clone to the file descriptor.
> +.TP
> +.B AT_RECURSIVE
> +This is only permitted in conjunction with OPEN_TREE_CLONE. It causes the
> +entire mount subtree rooted at the selected spot to be cloned rather than just

Is there a better word than "spot"?

> +that one mount object.
> +.SH RETURN VALUE
> +On success, the new file descriptor is returned. On error, \-1 is returned,
> +and
> +.I errno
> +is set appropriately.
> +.SH ERRORS
> +.TP
> +.B EACCES
> +Search permission is denied for one of the directories
> +in the path prefix of
> +.IR pathname .
> +(See also
> +.BR path_resolution (7).)
> +.TP
> +.B EBADF
> +.I dirfd
> +is not a valid open file descriptor.
> +.TP
> +.B EFAULT
> +.I pathname
> +is NULL or
> +.IR pathname
> +points to a location outside the process's accessible address space.
> +.TP
> +.B EINVAL
> +Reserved flag specified in
> +.IR flags .
> +.TP
> +.B ELOOP
> +Too many symbolic links encountered while traversing the pathname.
> +.TP
> +.B ENAMETOOLONG
> +.I pathname
> +is too long.
> +.TP
> +.B ENOENT
> +A component of
> +.I pathname
> +does not exist, or
> +.I pathname
> +is an empty string and
> +.B AT_EMPTY_PATH
> +was not specified in
> +.IR flags .
> +.TP
> +.B ENOMEM
> +Out of memory (i.e., kernel memory).
> +.TP
> +.B ENOTDIR
> +A component of the path prefix of
> +.I pathname
> +is not a directory or
> +.I pathname
> +is relative and
> +.I dirfd
> +is a file descriptor referring to a file other than a directory.
> +.SH VERSIONS
> +.BR open_tree ()
> +was added to Linux in kernel 5.2.
> +.SH CONFORMING TO
> +.BR open_tree ()
> +is Linux-specific.
> +.SH NOTES
> +Glibc does not (yet) provide a wrapper for the
> +.BR open_tree ()
> +system call; call it using
> +.BR syscall (2).

What's the current status with respect to glibc support? Is it coming/is
someone working on this?

> +.SH EXAMPLE

s/EXAMPLE/EXAMPLES/
(That's the standard section header name these days.)

> +The
> +.BR open_tree ()
> +function can be used like the following:

The following example does a recursive bind mount, right?
Can you please add some words to say that explicitly.

> +.PP
> +.RS
> +.nf
> +fd1 = open_tree(AT_FDCWD, "/mnt", 0);
> +fd2 = open_tree(fd1, "",
> + AT_EMPTY_PATH | OPEN_TREE_CLONE | AT_RECURSIVE);
> +move_mount(fd2, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
> +.fi
> +.RE
> +.PP
> +This would attach the path point for "/mnt" to fd1, then it would copy the

What is a "path point"? This is not standard terminology. Can you
replace this with something better?

> +entire subtree at the point referred to by fd1 and attach that to fd2; lastly,
> +it would attach the clone to "/mnt2".
> +.SH SEE ALSO
> +.BR fsmount (2),
> +.BR move_mount (2),
> +.BR open (2)
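
Because glibc provided no wrappers at the time of this thread, the three-line EXAMPLE above cannot be compiled as written. The following is a sketch of the same recursive bind mount using syscall(2), with error checking added. It is not part of the patch: the helper names (sys_open_tree(), sys_move_mount(), bind_recursive()) are hypothetical, and the fallback syscall numbers and flag values are assumptions copied from the kernel UAPI headers for the architectures that share the unified syscall numbering (alpha is an exception).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>          /* AT_FDCWD, O_CLOEXEC */
#include <unistd.h>
#include <sys/syscall.h>

/* Fallbacks for older headers; values assumed from the kernel UAPI headers. */
#ifndef SYS_open_tree
#define SYS_open_tree 428
#endif
#ifndef SYS_move_mount
#define SYS_move_mount 429
#endif
#ifndef OPEN_TREE_CLONE
#define OPEN_TREE_CLONE 1
#endif
#ifndef OPEN_TREE_CLOEXEC
#define OPEN_TREE_CLOEXEC O_CLOEXEC
#endif
#ifndef AT_RECURSIVE
#define AT_RECURSIVE 0x8000
#endif
#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif
#ifndef MOVE_MOUNT_F_EMPTY_PATH
#define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
#endif

/* Thin, hypothetical wrappers around the raw system calls. */
static int sys_open_tree(int dirfd, const char *pathname, unsigned int flags)
{
    return (int)syscall(SYS_open_tree, dirfd, pathname, flags);
}

static int sys_move_mount(int from_dirfd, const char *from_path,
                          int to_dirfd, const char *to_path,
                          unsigned int flags)
{
    return (int)syscall(SYS_move_mount, from_dirfd, from_path,
                        to_dirfd, to_path, flags);
}

/*
 * Recursively bind-mount src onto dst, following the same three steps as
 * the quoted example. Returns 0 on success, or -1 with errno set.
 * Cloning requires CAP_SYS_ADMIN.
 */
static int bind_recursive(const char *src, const char *dst)
{
    int fd1, fd2 = -1, rc = -1, saved;

    assert(src != NULL && dst != NULL);

    /* Step 1: get an O_PATH-like descriptor for src (no clone yet). */
    fd1 = sys_open_tree(AT_FDCWD, src, OPEN_TREE_CLOEXEC);
    if (fd1 == -1)
        return -1;

    /* Step 2: clone the whole mount subtree that fd1 refers to. */
    fd2 = sys_open_tree(fd1, "", AT_EMPTY_PATH | OPEN_TREE_CLONE |
                                 AT_RECURSIVE | OPEN_TREE_CLOEXEC);
    if (fd2 != -1)
        /* Step 3: attach the cloned subtree at dst. */
        rc = sys_move_mount(fd2, "", AT_FDCWD, dst, MOVE_MOUNT_F_EMPTY_PATH);

    saved = errno;          /* keep the first failure's errno across close() */
    if (fd2 != -1)
        close(fd2);
    close(fd1);
    errno = saved;
    return rc;
}
```

A caller would use it as bind_recursive("/mnt", "/mnt2"). If the clone in fd2 is never attached with move_mount(), closing fd2 discards it, as the DESCRIPTION states.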

Thanks,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Subject: Re: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

Hello David,

On 8/24/20 2:25 PM, David Howells wrote:
> Add a manual page to document the fsopen() and fsmount() system calls.
>
> Signed-off-by: David Howells <[email protected]>
> ---
>
> man2/fsmount.2 | 1
> man2/fsopen.2 | 245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 246 insertions(+)
> create mode 100644 man2/fsmount.2
> create mode 100644 man2/fsopen.2
>
> diff --git a/man2/fsmount.2 b/man2/fsmount.2
> new file mode 100644
> index 000000000..2bf59fc3e
> --- /dev/null
> +++ b/man2/fsmount.2
> @@ -0,0 +1 @@
> +.so man2/fsopen.2
> diff --git a/man2/fsopen.2 b/man2/fsopen.2
> new file mode 100644
> index 000000000..1d1bba238
> --- /dev/null
> +++ b/man2/fsopen.2
> @@ -0,0 +1,245 @@
> +'\" t
> +.\" Copyright (c) 2020 David Howells <[email protected]>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of this
> +.\" manual under the conditions for verbatim copying, provided that the
> +.\" entire resulting derived work is distributed under the terms of a
> +.\" permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> +.\" responsibility for errors or omissions, or for damages resulting from
> +.\" the use of the information contained herein. The author(s) may not
> +.\" have taken the same level of care in the production of this manual,
> +.\" which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH FSOPEN 2 2020-08-07 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +fsopen, fsmount \- Filesystem parameterisation and mount creation
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/types.h>
> +.B #include <sys/mount.h>
> +.B #include <unistd.h>
> +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> +.PP
> +.BI "int fsopen(const char *" fsname ", unsigned int " flags );
> +.PP
> +.BI "int fsmount(int " fd ", unsigned int " flags ", unsigned int " mount_attrs );
> +.fi
> +.PP
> +.IR Note :
> +There are no glibc wrappers for these system calls.
> +.SH DESCRIPTION
> +.PP
> +.BR fsopen ()
> +creates a blank filesystem configuration context within the kernel for the
> +filesystem named in the
> +.I fsname
> +parameter, puts it into creation mode and attaches it to a file descriptor,
> +which it then returns.

In the preceding sentence, "it" is used three times, with two *different*
referents. That's quite hard on the reader.

How about:

[[
.BR fsopen ()
creates a blank filesystem configuration context within the kernel for the
filesystem named in the
.I fsname
parameter, puts the context into creation mode and
attaches it to a file descriptor;
.BR fsopen ()
returns the file descriptor as the function result.
]]

> The file descriptor can be marked close-on-exec by
> +setting
> +.B FSOPEN_CLOEXEC
> +in
> +.IR flags .
> +.PP
> +After calling fsopen(), the file descriptor should be passed to the
> +.BR fsconfig (2)
> +system call, using that to specify the desired filesystem and security
> +parameters.
> +.PP
> +When the parameters are all set, the
> +.BR fsconfig ()
> +system call should then be called again with
> +.B FSCONFIG_CMD_CREATE
> +as the command argument to effect the creation.
> +.RS
> +.PP
> +.BR "[!]\ NOTE" :
> +Depending on the filesystem type and parameters, this may rather share an

Please replace "this" with a noun (phrase), since it is a little
unclear what "this" refers to.

> +existing in-kernel filesystem representation instead of creating a new one.
> +In such a case, the parameters specified may be discarded or may overwrite the
> +parameters set by a previous mount - at the filesystem's discretion.
> +.RE
> +.PP
> +The file descriptor also serves as a channel by which more comprehensive error,
> +warning and information messages may be retrieved from the kernel using
> +.BR read (2).
> +.PP
> +Once the creation command has been successfully run on a context, the context
> +will not accept further configuration. At
> +this point,
> +.BR fsmount ()
> +should be called to create a mount object.
> +.PP
> +.BR fsmount ()
> +takes the file descriptor returned by
> +.BR fsopen ()
> +and creates a mount object for the filesystem root specified there. The
> +attributes of the mount object are set from the
> +.I mount_attrs
> +parameter. The attributes specify the propagation and mount restrictions to
> +be applied to accesses through this mount.

Can we please have a list of the available attributes here, with a
description of each attribute.
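
Until such a list is added to the page, the attribute bits can be read from the kernel's <linux/mount.h>. The sketch below reproduces those values and the kind of sanity check a caller might make before passing mount_attrs to fsmount(). It is an illustration rather than the kernel's actual code: mount_attrs_plausible() is a hypothetical helper, and the set of bits is assumed to be the one present in Linux 5.2 (later kernels may define more).

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in <linux/mount.h> in Linux 5.2 (fallback copies). */
#ifndef MOUNT_ATTR_RDONLY
#define MOUNT_ATTR_RDONLY       0x00000001  /* mount read-only                 */
#define MOUNT_ATTR_NOSUID       0x00000002  /* ignore suid and sgid bits       */
#define MOUNT_ATTR_NODEV        0x00000004  /* disallow access to device files */
#define MOUNT_ATTR_NOEXEC       0x00000008  /* disallow program execution      */
#define MOUNT_ATTR__ATIME       0x00000070  /* mask for the atime setting      */
#define MOUNT_ATTR_RELATIME     0x00000000  /* update atime relative to mtime  */
#define MOUNT_ATTR_NOATIME      0x00000010  /* do not update access times      */
#define MOUNT_ATTR_STRICTATIME  0x00000020  /* always update atime             */
#define MOUNT_ATTR_NODIRATIME   0x00000080  /* do not update directory atimes  */
#endif

/*
 * Hypothetical helper: return true if attrs uses only the bits listed
 * above and selects exactly one atime mode.
 */
static bool mount_attrs_plausible(unsigned int attrs)
{
    const unsigned int known =
        MOUNT_ATTR_RDONLY | MOUNT_ATTR_NOSUID | MOUNT_ATTR_NODEV |
        MOUNT_ATTR_NOEXEC | MOUNT_ATTR__ATIME | MOUNT_ATTR_NODIRATIME;

    /* Invariant of this sketch: the atime field is covered by the mask. */
    assert((known & MOUNT_ATTR__ATIME) == MOUNT_ATTR__ATIME);

    if (attrs & ~known)
        return false;           /* unknown bit, e.g. an MS_* flag */

    switch (attrs & MOUNT_ATTR__ATIME) {
    case MOUNT_ATTR_RELATIME:
    case MOUNT_ATTR_NOATIME:
    case MOUNT_ATTR_STRICTATIME:
        return true;
    default:
        return false;           /* conflicting atime settings */
    }
}
```

Note that the MOUNT_ATTR_* values are a separate namespace from the MS_* flags of mount(2). Some values coincide (both the NODEV flags are 4), but others do not, so MS_* constants should not be passed in mount_attrs.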

> +.PP
> +The mount object is then attached to a new file descriptor that looks like one
> +created by
> +.BR open "(2) with " O_PATH " or " open_tree (2).
> +This can be passed to
> +.BR move_mount (2)
> +to attach the mount object to a mountpoint, thereby completing the process.

s/mountpoint/mount point/

In the preceding paragraph, the description is a bit unclear. (Again,
overuse of pronouns ("this") does not help.) I think it
would be better to say something like:

[[
.BR fsmount ()
attaches the mount object to a new file descriptor that looks like one
created by
.BR open "(2) with " O_PATH " or " open_tree (2).
This file descriptor can be passed to
.BR move_mount (2)
to attach the mount object to a mount point, thereby completing the process.
]]

But, please also replace "the process" with a more meaningful phrase.

> +.PP
> +The file descriptor returned by fsmount() is marked close-on-exec if
> +FSMOUNT_CLOEXEC is specified in
> +.IR flags .
> +.PP
> +After fsmount() has completed, the context created by fsopen() is reset and
> +moved to reconfiguration state, allowing the new superblock to be
> +reconfigured. See
> +.BR fspick (2)
> +for details.
> +.PP
> +To use either of these calls, the caller requires the appropriate privilege
> +(Linux: the

s/Linux: //
(this is after all a Linux-specific system call)

> +.B CAP_SYS_ADMIN
> +capability).
> +.PP
> +.SS Message Retrieval Interface
> +The context file descriptor may be queried for message strings at any time by

s/The context file descriptor/
The context file descriptor returned by fsopen()/

> +calling
> +.BR read (2)
> +on the file descriptor. This will return formatted messages that are prefixed
> +to indicate their class:
> +.TP
> +\fB"e <message>"\fP
> +An error message string was logged.
> +.TP
> +\fB"i <message>"\fP
> +An informational message string was logged.
> +.TP
> +\fB"w <message>"\fP
> +A warning message string was logged.
> +.PP
> +Messages are removed from the queue as they're read.

What if there are no pending error messages to retrieve? What does
read() do in that case? Please add an explanation here.
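
The message format described above is simple to consume in a program. As a minimal sketch, assuming each read(2) yields one message of the form "<class letter> <text>" as the quoted text states, a caller could split the class from the text as follows. The type and function names (struct fs_msg, parse_fs_message()) are hypothetical and do not come from the patch or the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the message classes named in the page. */
enum fs_msg_class {
    FS_MSG_ERROR,       /* "e <message>" */
    FS_MSG_WARNING,     /* "w <message>" */
    FS_MSG_INFO,        /* "i <message>" */
    FS_MSG_UNKNOWN      /* anything that does not match the format */
};

struct fs_msg {
    enum fs_msg_class cls;
    const char *text;   /* points into the caller's buffer, not a copy */
};

/* Split one NUL-terminated message into its class and its text. */
static struct fs_msg parse_fs_message(const char *line)
{
    struct fs_msg m = { FS_MSG_UNKNOWN, line };

    assert(line != NULL);

    /* Expect a single class letter followed by one space. */
    if (line[0] == '\0' || line[1] != ' ')
        return m;

    switch (line[0]) {
    case 'e': m.cls = FS_MSG_ERROR;   break;
    case 'w': m.cls = FS_MSG_WARNING; break;
    case 'i': m.cls = FS_MSG_INFO;    break;
    default:  return m;     /* unknown prefix: leave text untouched */
    }
    m.text = line + 2;      /* skip the class letter and the space */
    return m;
}
```

The buffer filled by read(2) would need to be NUL-terminated by the caller before being passed to this function.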

> +.SH RETURN VALUE
> +On success, both functions return a file descriptor. On error, \-1 is
> +returned, and
> +.I errno
> +is set appropriately.
> +.SH ERRORS
> +The error values given below result from filesystem type independent
> +errors.
> +Each filesystem type may have its own special errors and its
> +own special behavior.
> +See the Linux kernel source code for details.
> +.TP
> +.B EBUSY
> +The context referred to by
> +.I fd
> +is not in the right state to be used by
> +.BR fsmount ().
> +.TP
> +.B EFAULT
> +One of the pointer arguments points outside the user address space.
> +.TP
> +.B EINVAL
> +.I flags
> +had an invalid flag set.
> +.TP
> +.B EINVAL
> +.I mount_attrs,
> +includes invalid
> +.BR MOUNT_ATTR_*
> +flags.
> +.TP
> +.B EMFILE
> +The process has too many open files to create more.
> +.TP
> +.B ENFILE
> +The system has too many open files to create more.
> +.TP
> +.B ENODEV
> +The filesystem
> +.I fsname
> +is not available in the kernel.
> +.TP
> +.B ENOMEM
> +The kernel could not allocate sufficient memory to complete the call.
> +.TP
> +.B EPERM
> +The caller does not have the required privileges.

Please name the required capability.

> +.SH CONFORMING TO
> +These functions are Linux-specific and should not be used in programs intended
> +to be portable.
> +.SH VERSIONS
> +.BR fsopen "(), and " fsmount ()
> +were added to Linux in kernel 5.2.
> +.SH NOTES
> +Glibc does not (yet) provide a wrapper for the
> +.BR fsopen "() or " fsmount "()"
> +system calls; call them using
> +.BR syscall (2).
> +.SH EXAMPLES
> +To illustrate the process, here's an example whereby this can be used to mount

Please replace "this" by a noun (phrase).

> +an ext4 filesystem on /dev/sdb1 onto /mnt.
> +.PP
> +.in +4n
> +.nf
> +sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "source", "/dev/sdb1", 0);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "user_attr", NULL, 0);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "iversion", NULL, 0);
> +fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> +mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_RELATIME);
> +move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> +.fi
> +.in
> +.PP
> +Here, an ext4 context is created first and attached to sfd. The context is
> +then told where its source will be, given a bunch of options and a superblock
> +record object is then created. Then fsmount() is called to create a mount
> +object and
> +.BR move_mount (2)
> +is called to attach it to its intended mountpoint.

s/mountpoint/mount point/

> +.PP
> +And here's an example of mounting from an NFS server and setting a Smack
> +security module label on it too:

Please replace "it" with a noun (phrase).

> +.PP
> +.in +4n
> +.nf
> +sfd = fsopen("nfs", 0);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "source", "example.com:/pub", 0);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "nfsvers", "3", 0);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "rsize", "65536", 0);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "wsize", "65536", 0);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "smackfsdef", "foolabel", 0);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "rdma", NULL, 0);
> +fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> +mfd = fsmount(sfd, 0, MOUNT_ATTR_NODEV);
> +move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> +.fi
> +.in
> +.PP
> +.SH SEE ALSO
> +.BR mountpoint (1),
> +.BR fsconfig (2),
> +.BR fspick (2),
> +.BR move_mount (2),
> +.BR open_tree (2),
> +.BR umount (2),
> +.BR mount_namespaces (7),
> +.BR path_resolution (7),
> +.BR mount (8),
> +.BR umount (8)
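
The EXAMPLES above omit error handling and rely on wrappers that glibc did not provide at the time. As a hedged sketch of the same sequence (fsopen, fsconfig, fsmount, move_mount) using syscall(2), the following C function mounts a filesystem read-only with the nodev attribute and prints any queued context messages on failure. It is not part of the patch: mount_readonly_nodev() and drain_messages() are hypothetical names, the EX_* constants are local copies of values assumed from <linux/mount.h>, and the fallback syscall numbers assume the unified numbering used by most architectures.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>          /* AT_FDCWD */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Fallback syscall numbers for older headers (assumed, see lead-in). */
#ifndef SYS_move_mount
#define SYS_move_mount 429
#endif
#ifndef SYS_fsopen
#define SYS_fsopen 430
#endif
#ifndef SYS_fsconfig
#define SYS_fsconfig 431
#endif
#ifndef SYS_fsmount
#define SYS_fsmount 432
#endif

/* Local copies of <linux/mount.h> values, prefixed to avoid clashes. */
enum {
    EX_FSOPEN_CLOEXEC          = 0x1,
    EX_FSMOUNT_CLOEXEC         = 0x1,
    EX_FSCONFIG_SET_FLAG       = 0,
    EX_FSCONFIG_SET_STRING     = 1,
    EX_FSCONFIG_CMD_CREATE     = 6,
    EX_MOUNT_ATTR_NODEV        = 0x4,
    EX_MOVE_MOUNT_F_EMPTY_PATH = 0x4
};

/* Print whatever messages are queued on the context file descriptor. */
static void drain_messages(int sfd)
{
    char buf[1024];
    ssize_t n;

    /* Stop on the first read() that returns no data or fails. */
    while ((n = read(sfd, buf, sizeof(buf) - 1)) > 0) {
        buf[n] = '\0';
        fprintf(stderr, "fs context: %s\n", buf);
    }
}

/* Returns 0 on success, or -1 with errno set to the first failure. */
static int mount_readonly_nodev(const char *fstype, const char *source,
                                const char *target)
{
    int sfd, mfd = -1, rc = -1, saved;

    assert(fstype != NULL && source != NULL && target != NULL);

    sfd = (int)syscall(SYS_fsopen, fstype, EX_FSOPEN_CLOEXEC);
    if (sfd == -1)
        return -1;

    /* Configure the context, then ask the kernel to create the superblock. */
    if (syscall(SYS_fsconfig, sfd, EX_FSCONFIG_SET_STRING,
                "source", source, 0) == -1 ||
        syscall(SYS_fsconfig, sfd, EX_FSCONFIG_SET_FLAG,
                "ro", NULL, 0) == -1 ||
        syscall(SYS_fsconfig, sfd, EX_FSCONFIG_CMD_CREATE,
                NULL, NULL, 0) == -1)
        goto out;

    /* Create the mount object; mount_attrs takes MOUNT_ATTR_* values. */
    mfd = (int)syscall(SYS_fsmount, sfd, EX_FSMOUNT_CLOEXEC,
                       EX_MOUNT_ATTR_NODEV);
    if (mfd == -1)
        goto out;

    /* Attach the mount object to the mount point. */
    rc = (int)syscall(SYS_move_mount, mfd, "", AT_FDCWD, target,
                      EX_MOVE_MOUNT_F_EMPTY_PATH);
out:
    saved = errno;
    if (rc == -1)
        drain_messages(sfd);
    if (mfd != -1)
        close(mfd);
    close(sfd);
    errno = saved;
    return rc;
}
```

For the first quoted example, a call such as mount_readonly_nodev("ext4", "/dev/sdb1", "/mnt") would correspond to a reduced option set. The caller needs CAP_SYS_ADMIN, as stated in the DESCRIPTION.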

Thanks,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Subject: Re: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

Hi David,

One further thought...

> +++ b/man2/fsopen.2
[...]
> +.BR fsopen ()
> +creates a blank filesystem configuration context within the kernel for the
> +filesystem named in the
> +.I fsname
> +parameter, puts it into creation mode and attaches it to a file descriptor,
> +which it then returns.

The term "filesystem configuration context" is introduced, but never
really explained. I think it would be very helpful to have a sentence
or three that explains this concept at the start of the page.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

2020-09-02 16:18:38

by David Howells

Subject: Re: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

Michael Kerrisk (man-pages) <[email protected]> wrote:

> The term "filesystem configuration context" is introduced, but never
> really explained. I think it would be very helpful to have a sentence
> or three that explains this concept at the start of the page.

Does that need a .7 manpage?

David

Subject: Re: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

On Wed, 2 Sep 2020 at 18:14, David Howells <[email protected]> wrote:
>
> Michael Kerrisk (man-pages) <[email protected]> wrote:
>
> > The term "filesystem configuration context" is introduced, but never
> > really explained. I think it would be very helpful to have a sentence
> > or three that explains this concept at the start of the page.
>
> Does that need a .7 manpage?

I was hoping a sentence or a paragraph in this page might suffice. Do
you think more is required?

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Subject: Re: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

Hi David,

A ping for these five patches please!

Cheers,

Michael

On Wed, 2 Sep 2020 at 22:14, Michael Kerrisk (man-pages)
<[email protected]> wrote:
>
> On Wed, 2 Sep 2020 at 18:14, David Howells <[email protected]> wrote:
> >
> > Michael Kerrisk (man-pages) <[email protected]> wrote:
> >
> > > The term "filesystem configuration context" is introduced, but never
> > > really explained. I think it would be very helpful to have a sentence
> > > or three that explains this concept at the start of the page.
> >
> > Does that need a .7 manpage?
>
> I was hoping a sentence or a paragraph in this page might suffice. Do
> you think more is required?
>
> Cheers,
>
> Michael



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Subject: Re: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

Hi David,

Another ping for these five patches please!

Cheers,

Michael

On Fri, 11 Sep 2020 at 14:44, Michael Kerrisk (man-pages)
<[email protected]> wrote:
>
> Hi David,
>
> A ping for these five patches please!
>
> Cheers,
>
> Michael
>
> On Wed, 2 Sep 2020 at 22:14, Michael Kerrisk (man-pages)
> <[email protected]> wrote:
> >
> > On Wed, 2 Sep 2020 at 18:14, David Howells <[email protected]> wrote:
> > >
> > > Michael Kerrisk (man-pages) <[email protected]> wrote:
> > >
> > > > The term "filesystem configuration context" is introduced, but never
> > > > really explained. I think it would be very helpful to have a sentence
> > > > or three that explains this concept at the start of the page.
> > >
> > > Does that need a .7 manpage?
> >
> > I was hoping a sentence or a paragraph in this page might suffice. Do
> > you think more is required?
> >
> > Cheers,
> >
> > Michael



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Subject: Re: [PATCH 1/5] Add manpage for open_tree(2)

Hello David,

Ping!

Thanks,

Michael

On Thu, 27 Aug 2020 at 13:01, Michael Kerrisk (man-pages)
<[email protected]> wrote:
>
> Hello David,
>
> Can I ask that you please reply to each of my mails, rather than
> just sending out a new patch series (which of course I would also
> like you to do). Some things that I mentioned in the last mails
> got lost, and I end up having to repeat them.
>
> So, even where I say "please change this", a reply of "done", or a
> reason why you declined the suggested change, would be useful.
> But in any case, a few words in reply to explain the other changes
> that you make would be helpful.
>
> Also, some of my questions now will get a little more complex, and as
> well as you updating the pages, I think a little discussion may be
> required in some cases.
>
> On 8/24/20 2:24 PM, David Howells wrote:
> > Add a manual page to document the open_tree() system call.
> >
> > Signed-off-by: David Howells <[email protected]>
> > ---
> >
> > man2/open_tree.2 | 249 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 249 insertions(+)
> > create mode 100644 man2/open_tree.2
> >
> > diff --git a/man2/open_tree.2 b/man2/open_tree.2
> > new file mode 100644
> > index 000000000..d480bd82f
> > --- /dev/null
> > +++ b/man2/open_tree.2
> > @@ -0,0 +1,249 @@
> > +'\" t
> > +.\" Copyright (c) 2020 David Howells <[email protected]>
> > +.\"
> > +.\" %%%LICENSE_START(VERBATIM)
> > +.\" Permission is granted to make and distribute verbatim copies of this
> > +.\" manual provided the copyright notice and this permission notice are
> > +.\" preserved on all copies.
> > +.\"
> > +.\" Permission is granted to copy and distribute modified versions of this
> > +.\" manual under the conditions for verbatim copying, provided that the
> > +.\" entire resulting derived work is distributed under the terms of a
> > +.\" permission notice identical to this one.
> > +.\"
> > +.\" Since the Linux kernel and libraries are constantly changing, this
> > +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> > +.\" responsibility for errors or omissions, or for damages resulting from
> > +.\" the use of the information contained herein. The author(s) may not
> > +.\" have taken the same level of care in the production of this manual,
> > +.\" which is licensed free of charge, as they might when working
> > +.\" professionally.
> > +.\"
> > +.\" Formatted or processed versions of this manual, if unaccompanied by
> > +.\" the source, must acknowledge the copyright and authors of this work.
> > +.\" %%%LICENSE_END
> > +.\"
> > +.TH OPEN_TREE 2 2020-08-24 "Linux" "Linux Programmer's Manual"
> > +.SH NAME
> > +open_tree \- Pick or clone mount object and attach to fd
> > +.SH SYNOPSIS
> > +.nf
> > +.B #include <sys/types.h>
> > +.B #include <sys/mount.h>
> > +.B #include <unistd.h>
> > +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> > +.PP
> > +.BI "int open_tree(int " dirfd ", const char *" pathname ", unsigned int " flags );
> > +.fi
> > +.PP
> > +.IR Note :
> > +There are no glibc wrappers for these system calls.
> > +.SH DESCRIPTION
> > +.BR open_tree ()
> > +picks the mount object specified by the pathname and attaches it to a new file
>
> The terminology "pick" is unusual, and you never really explain what
> it means. Is there better terminology? In any case, can you add a few
> words to explain what the term ("pick" or whatever alternative you
> come up with) means.
>
> > +descriptor or clones it and attaches the clone to the file descriptor. The
>
> Please replace "it" by a noun (phrase) -- maybe: "the mount object"?
>
> > +resultant file descriptor is indistinguishable from one produced by
> > +.BR open "(2) with " O_PATH .
>
> What is the significance of that last piece? Can you add some words
> about why the fact that the resulting FD is indistinguishable from one
> produced by open() O_PATH matters or is useful?
>
> > +.PP
> > +In the case that the mount object is cloned, the clone will be "unmounted" and
>
> You place "unmounted" in quotes. Why? Is this to signify that the
> unmount is somehow different from other unmounts? If so, please
> explain how it is different. If not, then I think we can lose the double
> quotes.
>
> > +destroyed when the file descriptor is closed if it is not otherwise mounted
> > +somewhere by calling
> > +.BR move_mount (2).
> > +.PP
> > +To select a mount object, no permissions are required on the object referred
>
> Here you use the word "select". Is this the same as "pick"? If yes, please
> use the same term.
>
> > +to by the path, but execute (search) permission is required on all of the
>
> s/the path/.I pathname/ ?
>
> (Where pathname == "the pathname argument")
>
> > +directories in
> > +.I pathname
> > +that lead to the object.
> > +.PP
> > +Appropriate privilege (Linux: the
>
> s/Linux: //
> (This is a Linux specific system call...)
>
> > +.B CAP_SYS_ADMIN
> > +capability) is required to clone mount objects.
> > +.PP
> > +.BR open_tree ()
> > +uses
> > +.IR pathname ", " dirfd " and " flags
> > +to locate the target object in one of a variety of ways:
> > +.TP
> > +[*] By absolute path.
> > +.I pathname
> > +points to an absolute path and
> > +.I dirfd
> > +is ignored. The object is looked up by name, starting from the root of the
> > +filesystem as seen by the calling process.
> > +.TP
> > +[*] By cwd-relative path.
> > +.I pathname
> > +points to a relative path and
> > +.IR dirfd " is " AT_FDCWD .
> > +The object is looked up by name, starting from the current working directory.
> > +.TP
> > +[*] By dir-relative path.
> > +.I pathname
> > +points to relative path and
> > +.I dirfd
> > +indicates a file descriptor pointing to a directory. The object is looked up
> > +by name, starting from the directory specified by
> > +.IR dirfd .
> > +.TP
> > +[*] By file descriptor.
> > +.I pathname
> > +is "",
> > +.I dirfd
> > +indicates a file descriptor and
> > +.B AT_EMPTY_PATH
> > +is set in
> > +.IR flags .
> > +The mount attached to the file descriptor is queried directly. The file
> > +descriptor may point to any type of file, not just a directory.
>
> I want to check here. Is it really *any* type of file? Can it be a UNIX
> domain socket or a char/block device or a FIFO?
>
> > +.PP
> > +.I flags
> > +can be used to control the operation of the function and to influence a
> > +path-based lookup. A value for
> > +.I flags
> > +is constructed by OR'ing together zero or more of the following constants:
> > +.TP
> > +.BR AT_EMPTY_PATH
> > +.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
> > +If
> > +.I pathname
> > +is an empty string, operate on the file referred to by
> > +.IR dirfd
> > +(which may have been obtained from
> > +.BR open "(2) with"
> > +.BR O_PATH ", from " fsmount (2)
> > +or from another
>
> s/another/a previous call to/
>
> > +.BR open_tree ()).
> > +If
> > +.I dirfd
> > +is
> > +.BR AT_FDCWD ,
> > +the call operates on the current working directory.
> > +In this case,
> > +.I dirfd
> > +can refer to any type of file, not just a directory.
> > +This flag is Linux-specific; define
> > +.B _GNU_SOURCE
> > +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> > +to obtain its definition.
> > +.TP
> > +.BR AT_NO_AUTOMOUNT
> > +Don't automount the final ("basename") component of
> > +.I pathname
> > +if it is a directory that is an automount point. This flag allows the
> > +automount point itself to be picked up or a mount cloned that is rooted on the
> > +automount point. The
> > +.B AT_NO_AUTOMOUNT
> > +flag has no effect if the mount point has already been mounted over.
> > +This flag is Linux-specific; define
> > +.B _GNU_SOURCE
> > +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> > +to obtain its definition.
> > +.TP
> > +.B AT_SYMLINK_NOFOLLOW
> > +If
> > +.I pathname
> > +is a symbolic link, do not dereference it: instead pick up or clone a mount
> > +rooted on the link itself.
> > +.TP
> > +.B OPEN_TREE_CLOEXEC
> > +Set the close-on-exec flag for the new file descriptor. This will cause the
> > +file descriptor to be closed automatically when a process exec's.
> > +.TP
> > +.B OPEN_TREE_CLONE
> > +Rather than directly attaching the selected object to the file descriptor,
> > +clone the object, set the root of the new mount object to that point and
>
> Could you expand on "that point" a little. It's not quite clear to me what
> you mean there.
>
> > +attach the clone to the file descriptor.
> > +.TP
> > +.B AT_RECURSIVE
> > +This is only permitted in conjunction with OPEN_TREE_CLONE. It causes the
> > +entire mount subtree rooted at the selected spot to be cloned rather than just
>
> Is there a better word than "spot"?
>
> > +that one mount object.
> > +.SH RETURN VALUE
> > +On success, the new file descriptor is returned. On error, \-1 is returned,
> > +and
> > +.I errno
> > +is set appropriately.
> > +.SH ERRORS
> > +.TP
> > +.B EACCES
> > +Search permission is denied for one of the directories
> > +in the path prefix of
> > +.IR pathname .
> > +(See also
> > +.BR path_resolution (7).)
> > +.TP
> > +.B EBADF
> > +.I dirfd
> > +is not a valid open file descriptor.
> > +.TP
> > +.B EFAULT
> > +.I pathname
> > +is NULL or
> > +.IR pathname
> > +points to a location outside the process's accessible address space.
> > +.TP
> > +.B EINVAL
> > +Reserved flag specified in
> > +.IR flags .
> > +.TP
> > +.B ELOOP
> > +Too many symbolic links encountered while traversing the pathname.
> > +.TP
> > +.B ENAMETOOLONG
> > +.I pathname
> > +is too long.
> > +.TP
> > +.B ENOENT
> > +A component of
> > +.I pathname
> > +does not exist, or
> > +.I pathname
> > +is an empty string and
> > +.B AT_EMPTY_PATH
> > +was not specified in
> > +.IR flags .
> > +.TP
> > +.B ENOMEM
> > +Out of memory (i.e., kernel memory).
> > +.TP
> > +.B ENOTDIR
> > +A component of the path prefix of
> > +.I pathname
> > +is not a directory or
> > +.I pathname
> > +is relative and
> > +.I dirfd
> > +is a file descriptor referring to a file other than a directory.
> > +.SH VERSIONS
> > +.BR open_tree ()
> > +was added to Linux in kernel 5.2.
> > +.SH CONFORMING TO
> > +.BR open_tree ()
> > +is Linux-specific.
> > +.SH NOTES
> > +Glibc does not (yet) provide a wrapper for the
> > +.BR open_tree ()
> > +system call; call it using
> > +.BR syscall (2).
>
> What's the current status with respect to glibc support? Is it coming/is
> someone working on this?
>
> > +.SH EXAMPLE
>
> s/EXAMPLE/EXAMPLES/
> (That's the standard section header name these days.)
>
> > +The
> > +.BR open_tree ()
> > +function can be used like the following:
>
> The following example does a recursive bind mount, right?
> Can you please add some words to say that explicitly.
>
> > +.PP
> > +.RS
> > +.nf
> > +fd1 = open_tree(AT_FDCWD, "/mnt", 0);
> > +fd2 = open_tree(fd1, "",
> > + AT_EMPTY_PATH | OPEN_TREE_CLONE | AT_RECURSIVE);
> > +move_mount(fd2, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
> > +.fi
> > +.RE
> > +.PP
> > +This would attach the path point for "/mnt" to fd1, then it would copy the
>
> What is a "path point"? This is not standard terminology. Can you
> replace this with something better?
>
> > +entire subtree at the point referred to by fd1 and attach that to fd2; lastly,
> > +it would attach the clone to "/mnt2".
> > +.SH SEE ALSO
> > +.BR fsmount (2),
> > +.BR move_mount (2),
> > +.BR open (2)
>
> Thanks,
>
> Michael
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
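
The four lookup modes listed in the quoted DESCRIPTION (absolute path,
cwd-relative path, dir-relative path, and by file descriptor with
AT_EMPTY_PATH) can be sketched as a small userspace classifier. This is only
an illustration of the rules as the draft page states them, not kernel code;
the enum and the function name classify_lookup() are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* needed for AT_EMPTY_PATH in <fcntl.h> */
#endif
#include <fcntl.h>           /* AT_FDCWD, AT_EMPTY_PATH */
#include <stddef.h>          /* NULL */

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000 /* fallback value from the Linux UAPI headers */
#endif

/* Hypothetical names, used only for this illustration. */
enum lookup_mode {
    LOOKUP_INVALID,       /* would fail (EFAULT or ENOENT per the ERRORS list) */
    LOOKUP_ABSOLUTE,      /* pathname starts with '/', dirfd is ignored */
    LOOKUP_CWD_RELATIVE,  /* relative pathname, dirfd == AT_FDCWD */
    LOOKUP_DIR_RELATIVE,  /* relative pathname, dirfd is a directory fd */
    LOOKUP_BY_FD          /* pathname is "", AT_EMPTY_PATH is set */
};

/* Decide which of the documented lookup modes a call would use. */
enum lookup_mode classify_lookup(int dirfd, const char *pathname,
                                 unsigned int flags)
{
    if (pathname == NULL)
        return LOOKUP_INVALID;                 /* EFAULT */
    if (pathname[0] == '\0')                   /* empty string */
        return (flags & AT_EMPTY_PATH) ? LOOKUP_BY_FD
                                       : LOOKUP_INVALID;  /* ENOENT */
    if (pathname[0] == '/')
        return LOOKUP_ABSOLUTE;
    return dirfd == AT_FDCWD ? LOOKUP_CWD_RELATIVE : LOOKUP_DIR_RELATIVE;
}
```

With this sketch, classify_lookup(AT_FDCWD, "sub/dir", 0) yields
LOOKUP_CWD_RELATIVE, and an empty pathname without AT_EMPTY_PATH is reported
as invalid, matching the ENOENT entry in the ERRORS section.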

Subject: Re: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

Hello David,

Ping!

Thanks,

Michael

On Fri, 16 Oct 2020 at 08:50, Michael Kerrisk (man-pages)
<[email protected]> wrote:
>
> Hi David,
>
> Another ping for these five patches please!
>
> Cheers,
>
> Michael
>
> On Fri, 11 Sep 2020 at 14:44, Michael Kerrisk (man-pages)
> <[email protected]> wrote:
> >
> > Hi David,
> >
> > A ping for these five patches please!
> >
> > Cheers,
> >
> > Michael
> >
> > On Wed, 2 Sep 2020 at 22:14, Michael Kerrisk (man-pages)
> > <[email protected]> wrote:
> > >
> > > On Wed, 2 Sep 2020 at 18:14, David Howells <[email protected]> wrote:
> > > >
> > > > Michael Kerrisk (man-pages) <[email protected]> wrote:
> > > >
> > > > > The term "filesystem configuration context" is introduced, but never
> > > > > really explained. I think it would be very helpful to have a sentence
> > > > > or three that explains this concept at the start of the page.
> > > >
> > > > Does that need a .7 manpage?
> > >
> > > I was hoping a sentence or a paragraph in this page might suffice. Do
> > > you think more is required?
> > >
> > > Cheers,
> > >
> > > Michael
> > >
> > > --
> > > Michael Kerrisk
> > > Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> > > Linux/UNIX System Programming Training: http://man7.org/training/



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Subject: Re: [PATCH 3/5] Add manpage for fspick(2)

Hello David,

Ping!

Thanks,

Michael

On Mon, 24 Aug 2020 at 14:25, David Howells <[email protected]> wrote:
>
> Add a manual page to document the fspick() system call.
>
> Signed-off-by: David Howells <[email protected]>
> ---
>
> man2/fspick.2 | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 180 insertions(+)
> create mode 100644 man2/fspick.2
>
> diff --git a/man2/fspick.2 b/man2/fspick.2
> new file mode 100644
> index 000000000..72bf645dd
> --- /dev/null
> +++ b/man2/fspick.2
> @@ -0,0 +1,180 @@
> +'\" t
> +.\" Copyright (c) 2020 David Howells <[email protected]>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of this
> +.\" manual under the conditions for verbatim copying, provided that the
> +.\" entire resulting derived work is distributed under the terms of a
> +.\" permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> +.\" responsibility for errors or omissions, or for damages resulting from
> +.\" the use of the information contained herein. The author(s) may not
> +.\" have taken the same level of care in the production of this manual,
> +.\" which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH FSPICK 2 2020-08-24 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +fspick \- Select filesystem for reconfiguration
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/types.h>
> +.B #include <sys/mount.h>
> +.B #include <unistd.h>
> +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> +.PP
> +.BI "int fspick(int " dirfd ", const char *" pathname ", unsigned int " flags );
> +.fi
> +.PP
> +.IR Note :
> +There is no glibc wrapper for this system call.
> +.SH DESCRIPTION
> +.PP
> +.BR fspick ()
> +creates a new filesystem configuration context within the kernel and attaches a
> +pre-existing superblock to it so that it can be reconfigured (similar to
> +.BR mount (8)
> +with the "-o remount" option). The configuration context is marked as being in
> +reconfiguration mode and attached to a file descriptor, which is returned to
> +the caller. The file descriptor can be marked close-on-exec by setting
> +.B FSPICK_CLOEXEC
> +in
> +.IR flags .
> +.PP
> +The target is whichever superblock backs the object determined by
> +.IR dirfd ", " pathname " and " flags .
> +The following can be set in
> +.I flags
> +to control the pathwalk to that object:
> +.TP
> +.B FSPICK_SYMLINK_NOFOLLOW
> +Don't follow symbolic links in the final component of the path.
> +.TP
> +.B FSPICK_NO_AUTOMOUNT
> +Don't follow automounts in the final component of the path.
> +.TP
> +.B FSPICK_EMPTY_PATH
> +Allow an empty string to be specified as the pathname. This allows
> +.I dirfd
> +to specify the target mount exactly.
> +.PP
> +After calling fspick(), the file descriptor should be passed to the
> +.BR fsconfig (2)
> +system call, using that to specify the desired changes to filesystem and
> +security parameters.
> +.PP
> +When the parameters are all set, the
> +.BR fsconfig ()
> +system call should then be called again with
> +.B FSCONFIG_CMD_RECONFIGURE
> +as the command argument to effect the reconfiguration.
> +.PP
> +After the reconfiguration has taken place, the context is wiped clean (apart
> +from the superblock attachment, which remains) and can be reused to make
> +another reconfiguration.
> +.PP
> +The file descriptor also serves as a channel by which more comprehensive error,
> +warning and information messages may be retrieved from the kernel using
> +.BR read (2).
> +.SS Message Retrieval Interface
> +The context file descriptor may be queried for message strings at any time by
> +calling
> +.BR read (2)
> +on the file descriptor. This will return formatted messages that are prefixed
> +to indicate their class:
> +.TP
> +\fB"e <message>"\fP
> +An error message string was logged.
> +.TP
> +\fB"i <message>"\fP
> +An informational message string was logged.
> +.TP
> +\fB"w <message>"\fP
> +A warning message string was logged.
> +.PP
> +Messages are removed from the queue as they're read and the queue has a limited
> +depth of 8 messages, so it's possible for some to get lost.
> +.SH RETURN VALUE
> +On success, the function returns a file descriptor. On error, \-1 is returned,
> +and
> +.I errno
> +is set appropriately.
> +.SH ERRORS
> +The error values given below result from filesystem type independent errors.
> +Additionally, each filesystem type may have its own special errors and its own
> +special behavior. See the Linux kernel source code for details.
> +.TP
> +.B EACCES
> +A component of a path was not searchable.
> +(See also
> +.BR path_resolution (7).)
> +.TP
> +.B EFAULT
> +.I pathname
> +points outside the user address space.
> +.TP
> +.B EINVAL
> +.I flags
> +includes an undefined value.
> +.TP
> +.B ELOOP
> +Too many links encountered during pathname resolution.
> +.TP
> +.B EMFILE
> +The process has too many open files to create more.
> +.TP
> +.B ENFILE
> +The system has too many open files to create more.
> +.TP
> +.B ENAMETOOLONG
> +A pathname was longer than
> +.BR MAXPATHLEN .
> +.TP
> +.B ENOENT
> +A pathname was empty or had a nonexistent component.
> +.TP
> +.B ENOMEM
> +The kernel could not allocate sufficient memory to complete the call.
> +.TP
> +.B EPERM
> +The caller does not have the required privileges.
> +.SH CONFORMING TO
> +These functions are Linux-specific and should not be used in programs intended
> +to be portable.
> +.SH VERSIONS
> +.BR fsopen "(), " fsmount "() and " fspick ()
> +were added to Linux in kernel 5.2.
> +.SH EXAMPLES
> +To illustrate the process, here's an example whereby this can be used to
> +reconfigure a filesystem:
> +.PP
> +.in +4n
> +.nf
> +sfd = fspick(AT_FDCWD, "/mnt", FSPICK_NO_AUTOMOUNT | FSPICK_CLOEXEC);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "user_xattr", "false", 0);
> +fsconfig(sfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
> +.fi
> +.in
> +.PP
> +.SH NOTES
> +Glibc does not (yet) provide a wrapper for the
> +.BR fspick "()"
> +system call; call it using
> +.BR syscall (2).
> +.SH SEE ALSO
> +.BR mountpoint (1),
> +.BR fsconfig (2),
> +.BR fsopen (2),
> +.BR path_resolution (7),
> +.BR mount (8)
>
>


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
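
Since there is no glibc wrapper yet, the reconfiguration sequence from the
quoted EXAMPLES section has to go through syscall(2). Below is a minimal
sketch of that, plus a tiny helper for the "e"/"i"/"w" message prefixes
described under "Message Retrieval Interface". All helper names
(sys_fsconfig, remount_readonly, fs_message_class) and the X_-prefixed
constants are hypothetical; the numeric values are copied from the kernel
UAPI headers as I understand them, and the fallback syscall numbers assume
an architecture that uses the common syscall table.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD */
#include <stddef.h>       /* NULL */
#include <sys/syscall.h>  /* SYS_* numbers, where the libc headers know them */
#include <unistd.h>       /* syscall(), close() */

/* Fallback numbers; assumption: common syscall table (alpha, for one, differs). */
#ifndef SYS_fsconfig
#define SYS_fsconfig 431
#endif
#ifndef SYS_fspick
#define SYS_fspick 433
#endif

/* Local copies of UAPI values; prefixed so they cannot clash with headers. */
#define X_FSPICK_CLOEXEC            0x1
#define X_FSPICK_NO_AUTOMOUNT       0x4
#define X_FSCONFIG_SET_FLAG         0
#define X_FSCONFIG_CMD_RECONFIGURE  7

/* Thin wrapper around the raw fsconfig system call. */
int sys_fsconfig(int fd, unsigned int cmd, const char *key,
                 const void *value, int aux)
{
    return (int)syscall(SYS_fsconfig, fd, cmd, key, value, aux);
}

/* Reconfigure the filesystem backing 'path' read-only.
 * Returns 0 on success, -1 with errno set on failure. */
int remount_readonly(const char *path)
{
    int sfd = (int)syscall(SYS_fspick, AT_FDCWD, path,
                           X_FSPICK_NO_AUTOMOUNT | X_FSPICK_CLOEXEC);
    if (sfd < 0)
        return -1;

    int ret = sys_fsconfig(sfd, X_FSCONFIG_SET_FLAG, "ro", NULL, 0);
    if (ret == 0)
        ret = sys_fsconfig(sfd, X_FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);

    int saved = errno;    /* keep the interesting errno across close() */
    close(sfd);
    errno = saved;
    return ret;
}

/* Classify a message read from the context fd by its prefix.
 * Returns 'e', 'i' or 'w', or 0 if the prefix is not recognised. */
int fs_message_class(const char *msg)
{
    if (msg == NULL || msg[0] == '\0' || msg[1] != ' ')
        return 0;
    switch (msg[0]) {
    case 'e': case 'i': case 'w':
        return msg[0];
    default:
        return 0;
    }
}
```

A caller would typically invoke remount_readonly("/mnt") with sufficient
privilege and, on failure, read(2) from the context fd before closing it to
collect any "e ..." messages. The sketch closes the fd immediately to stay
short, and it leaves out the filesystem-specific "user_xattr" parameter from
the quoted example.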

2021-02-25 19:07:25

by Aurélien Aptel

Subject: Re: [PATCH 1/5] Add manpage for open_tree(2)


I was looking at this to possibly give it a go in mount.cifs (cifs-utils).

Sorry if this has been debated before but is there an interest in
converting those man pages to RST? We already switched in cifs-utils.
Iterating on patchsets is quite daunting in roff.

Cheers,
--
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97 8C99 03C8 A49B 521B D5D3
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, DE
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 247165 (AG München)

Subject: Re: [PATCH 1/5] Add manpage for open_tree(2)

Hello David,

I've pinged on these manual pages for the new mount API already a few
times in the past.

I would really like to get them out the door, but some work is
required, and I can't do it on my own; I need your help. In
particular, there are a number of open questions whose answers I do not
feel confident guessing.

How can I get your help please with completing these pages?

I will ping on all of the other mails, just to raise all the patches
to the top of the inbox.

Thanks,

Michael


On Thu, 27 Aug 2020 at 13:01, Michael Kerrisk (man-pages)
<[email protected]> wrote:
>
> Hello David,
>
> Can I ask that you please reply to each of my mails, rather than
> just sending out a new patch series (which of course I would also
> like you to do). Some things that I mentioned in the last mails
> got lost, and I end up having to repeat them.
>
> So, even where I say "please change this", a reply of "done", or a
> reason why you declined the suggested change, is useful.
> But in any case, a few words in reply to explain the other changes
> that you make would be helpful.
>
> Also, some of my questions now will get a little more complex, and as
> well as you updating the pages, I think a little discussion may be
> required in some cases.
>
> On 8/24/20 2:24 PM, David Howells wrote:
> > Add a manual page to document the open_tree() system call.
> >
> > Signed-off-by: David Howells <[email protected]>
> > ---
> >
> > man2/open_tree.2 | 249 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 249 insertions(+)
> > create mode 100644 man2/open_tree.2
> >
> > diff --git a/man2/open_tree.2 b/man2/open_tree.2
> > new file mode 100644
> > index 000000000..d480bd82f
> > --- /dev/null
> > +++ b/man2/open_tree.2
> > @@ -0,0 +1,249 @@
> > +'\" t
> > +.\" Copyright (c) 2020 David Howells <[email protected]>
> > +.\"
> > +.\" %%%LICENSE_START(VERBATIM)
> > +.\" Permission is granted to make and distribute verbatim copies of this
> > +.\" manual provided the copyright notice and this permission notice are
> > +.\" preserved on all copies.
> > +.\"
> > +.\" Permission is granted to copy and distribute modified versions of this
> > +.\" manual under the conditions for verbatim copying, provided that the
> > +.\" entire resulting derived work is distributed under the terms of a
> > +.\" permission notice identical to this one.
> > +.\"
> > +.\" Since the Linux kernel and libraries are constantly changing, this
> > +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> > +.\" responsibility for errors or omissions, or for damages resulting from
> > +.\" the use of the information contained herein. The author(s) may not
> > +.\" have taken the same level of care in the production of this manual,
> > +.\" which is licensed free of charge, as they might when working
> > +.\" professionally.
> > +.\"
> > +.\" Formatted or processed versions of this manual, if unaccompanied by
> > +.\" the source, must acknowledge the copyright and authors of this work.
> > +.\" %%%LICENSE_END
> > +.\"
> > +.TH OPEN_TREE 2 2020-08-24 "Linux" "Linux Programmer's Manual"
> > +.SH NAME
> > +open_tree \- Pick or clone mount object and attach to fd
> > +.SH SYNOPSIS
> > +.nf
> > +.B #include <sys/types.h>
> > +.B #include <sys/mount.h>
> > +.B #include <unistd.h>
> > +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> > +.PP
> > +.BI "int open_tree(int " dirfd ", const char *" pathname ", unsigned int " flags );
> > +.fi
> > +.PP
> > +.IR Note :
> > +There is no glibc wrapper for this system call.
> > +.SH DESCRIPTION
> > +.BR open_tree ()
> > +picks the mount object specified by the pathname and attaches it to a new file
>
> The terminology "pick" is unusual, and you never really explain what
> it means. Is there better terminology? In any case, can you add a few
> words to explain what the term ("pick" or whatever alternative you
> come up with) means.
>
> > +descriptor or clones it and attaches the clone to the file descriptor. The
>
> Please replace "it" by a noun (phrase) -- maybe: "the mount object"?
>
> > +resultant file descriptor is indistinguishable from one produced by
> > +.BR open "(2) with " O_PATH .
>
> What is the significance of that last piece? Can you add some words
> about why the fact that the resulting FD is indistinguishable from one
> produced by open() O_PATH matters or is useful?
>
> > +.PP
> > +In the case that the mount object is cloned, the clone will be "unmounted" and
>
> You place "unmounted" in quotes. Why? Is this to signify that the
> unmount is somehow different from other unmounts? If so, please
> explain how it is different. If not, then I think we can lose the double
> quotes.
>
> > +destroyed when the file descriptor is closed if it is not otherwise mounted
> > +somewhere by calling
> > +.BR move_mount (2).
> > +.PP
> > +To select a mount object, no permissions are required on the object referred
>
> Here you use the word "select". Is this the same as "pick"? If yes, please
> use the same term.
>
> > +to by the path, but execute (search) permission is required on all of the
>
> s/the path/.I pathname/ ?
>
> (Where pathname == "the pathname argument")
>
> > +directories in
> > +.I pathname
> > +that lead to the object.
> > +.PP
> > +Appropriate privilege (Linux: the
>
> s/Linux: //
> (This is a Linux specific system call...)
>
> > +.B CAP_SYS_ADMIN
> > +capability) is required to clone mount objects.
> > +.PP
> > +.BR open_tree ()
> > +uses
> > +.IR pathname ", " dirfd " and " flags
> > +to locate the target object in one of a variety of ways:
> > +.TP
> > +[*] By absolute path.
> > +.I pathname
> > +points to an absolute path and
> > +.I dirfd
> > +is ignored. The object is looked up by name, starting from the root of the
> > +filesystem as seen by the calling process.
> > +.TP
> > +[*] By cwd-relative path.
> > +.I pathname
> > +points to a relative path and
> > +.IR dirfd " is " AT_FDCWD .
> > +The object is looked up by name, starting from the current working directory.
> > +.TP
> > +[*] By dir-relative path.
> > +.I pathname
> > +points to a relative path and
> > +.I dirfd
> > +indicates a file descriptor pointing to a directory. The object is looked up
> > +by name, starting from the directory specified by
> > +.IR dirfd .
> > +.TP
> > +[*] By file descriptor.
> > +.I pathname
> > +is "",
> > +.I dirfd
> > +indicates a file descriptor and
> > +.B AT_EMPTY_PATH
> > +is set in
> > +.IR flags .
> > +The mount attached to the file descriptor is queried directly. The file
> > +descriptor may point to any type of file, not just a directory.
>
> I want to check here. Is it really *any* type of file? Can it be a UNIX
> domain socket or a char/block device or a FIFO?
>
> > +.PP
> > +.I flags
> > +can be used to control the operation of the function and to influence a
> > +path-based lookup. A value for
> > +.I flags
> > +is constructed by OR'ing together zero or more of the following constants:
> > +.TP
> > +.BR AT_EMPTY_PATH
> > +.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
> > +If
> > +.I pathname
> > +is an empty string, operate on the file referred to by
> > +.IR dirfd
> > +(which may have been obtained from
> > +.BR open "(2) with"
> > +.BR O_PATH ", from " fsmount (2)
> > +or from another
>
> s/another/a previous call to/
>
> > +.BR open_tree ()).
> > +If
> > +.I dirfd
> > +is
> > +.BR AT_FDCWD ,
> > +the call operates on the current working directory.
> > +In this case,
> > +.I dirfd
> > +can refer to any type of file, not just a directory.
> > +This flag is Linux-specific; define
> > +.B _GNU_SOURCE
> > +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> > +to obtain its definition.
> > +.TP
> > +.BR AT_NO_AUTOMOUNT
> > +Don't automount the final ("basename") component of
> > +.I pathname
> > +if it is a directory that is an automount point. This flag allows the
> > +automount point itself to be picked up or a mount cloned that is rooted on the
> > +automount point. The
> > +.B AT_NO_AUTOMOUNT
> > +flag has no effect if the mount point has already been mounted over.
> > +This flag is Linux-specific; define
> > +.B _GNU_SOURCE
> > +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> > +to obtain its definition.
> > +.TP
> > +.B AT_SYMLINK_NOFOLLOW
> > +If
> > +.I pathname
> > +is a symbolic link, do not dereference it: instead pick up or clone a mount
> > +rooted on the link itself.
> > +.TP
> > +.B OPEN_TREE_CLOEXEC
> > +Set the close-on-exec flag for the new file descriptor. This will cause the
> > +file descriptor to be closed automatically when a process exec's.
> > +.TP
> > +.B OPEN_TREE_CLONE
> > +Rather than directly attaching the selected object to the file descriptor,
> > +clone the object, set the root of the new mount object to that point and
>
> Could you expand on "that point" a little. It's not quite clear to me what
> you mean there.
>
> > +attach the clone to the file descriptor.
> > +.TP
> > +.B AT_RECURSIVE
> > +This is only permitted in conjunction with OPEN_TREE_CLONE. It causes the
> > +entire mount subtree rooted at the selected spot to be cloned rather than just
>
> Is there a better word than "spot"?
>
> > +that one mount object.
> > +.SH RETURN VALUE
> > +On success, the new file descriptor is returned. On error, \-1 is returned,
> > +and
> > +.I errno
> > +is set appropriately.
> > +.SH ERRORS
> > +.TP
> > +.B EACCES
> > +Search permission is denied for one of the directories
> > +in the path prefix of
> > +.IR pathname .
> > +(See also
> > +.BR path_resolution (7).)
> > +.TP
> > +.B EBADF
> > +.I dirfd
> > +is not a valid open file descriptor.
> > +.TP
> > +.B EFAULT
> > +.I pathname
> > +is NULL or
> > +.IR pathname
> > +points to a location outside the process's accessible address space.
> > +.TP
> > +.B EINVAL
> > +Reserved flag specified in
> > +.IR flags .
> > +.TP
> > +.B ELOOP
> > +Too many symbolic links encountered while traversing the pathname.
> > +.TP
> > +.B ENAMETOOLONG
> > +.I pathname
> > +is too long.
> > +.TP
> > +.B ENOENT
> > +A component of
> > +.I pathname
> > +does not exist, or
> > +.I pathname
> > +is an empty string and
> > +.B AT_EMPTY_PATH
> > +was not specified in
> > +.IR flags .
> > +.TP
> > +.B ENOMEM
> > +Out of memory (i.e., kernel memory).
> > +.TP
> > +.B ENOTDIR
> > +A component of the path prefix of
> > +.I pathname
> > +is not a directory or
> > +.I pathname
> > +is relative and
> > +.I dirfd
> > +is a file descriptor referring to a file other than a directory.
> > +.SH VERSIONS
> > +.BR open_tree ()
> > +was added to Linux in kernel 5.2.
> > +.SH CONFORMING TO
> > +.BR open_tree ()
> > +is Linux-specific.
> > +.SH NOTES
> > +Glibc does not (yet) provide a wrapper for the
> > +.BR open_tree ()
> > +system call; call it using
> > +.BR syscall (2).
>
> What's the current status with respect to glibc support? Is it coming/is
> someone working on this?
>
> > +.SH EXAMPLE
>
> s/EXAMPLE/EXAMPLES/
> (That's the standard section header name these days.)
>
> > +The
> > +.BR open_tree ()
> > +function can be used like the following:
>
> The following example does a recursive bind mount, right?
> Can you please add some words to say that explicitly.
>
> > +.PP
> > +.RS
> > +.nf
> > +fd1 = open_tree(AT_FDCWD, "/mnt", 0);
> > +fd2 = open_tree(fd1, "",
> > + AT_EMPTY_PATH | OPEN_TREE_CLONE | AT_RECURSIVE);
> > +move_mount(fd2, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
> > +.fi
> > +.RE
> > +.PP
> > +This would attach the path point for "/mnt" to fd1, then it would copy the
>
> What is a "path point"? This is not standard terminology. Can you
> replace this with something better?
>
> > +entire subtree at the point referred to by fd1 and attach that to fd2; lastly,
> > +it would attach the clone to "/mnt2".
> > +.SH SEE ALSO
> > +.BR fsmount (2),
> > +.BR move_mount (2),
> > +.BR open (2)
>
> Thanks,
>
> Michael
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
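
For anyone who wants to try the recursive bind mount from the quoted EXAMPLE
section before glibc provides wrappers, here is a minimal sketch that goes
through syscall(2). The helper names (sys_open_tree, sys_move_mount,
open_tree_flags_ok, bind_mount_recursive) are made up for illustration; the
fallback constants are copied from the kernel UAPI headers as I understand
them, and the fallback syscall numbers assume an architecture that uses the
common syscall table. The sketch folds the fd1/fd2 steps of the example into
a single open_tree() call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD, AT_EMPTY_PATH, O_CLOEXEC */
#include <sys/syscall.h>  /* SYS_* numbers, where the libc headers know them */
#include <unistd.h>       /* syscall(), close() */

/* Fallbacks for older headers; assumption: common syscall table. */
#ifndef SYS_open_tree
#define SYS_open_tree 428
#endif
#ifndef SYS_move_mount
#define SYS_move_mount 429
#endif
#ifndef OPEN_TREE_CLONE
#define OPEN_TREE_CLONE 1
#endif
#ifndef OPEN_TREE_CLOEXEC
#define OPEN_TREE_CLOEXEC O_CLOEXEC
#endif
#ifndef AT_RECURSIVE
#define AT_RECURSIVE 0x8000
#endif
#ifndef MOVE_MOUNT_F_EMPTY_PATH
#define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
#endif

/* Thin wrappers around the raw system calls. */
int sys_open_tree(int dirfd, const char *pathname, unsigned int flags)
{
    return (int)syscall(SYS_open_tree, dirfd, pathname, flags);
}

int sys_move_mount(int from_dirfd, const char *from_path,
                   int to_dirfd, const char *to_path, unsigned int flags)
{
    return (int)syscall(SYS_move_mount, from_dirfd, from_path,
                        to_dirfd, to_path, flags);
}

/* The page says AT_RECURSIVE is only permitted together with
 * OPEN_TREE_CLONE; this check mirrors that rule in userspace. */
int open_tree_flags_ok(unsigned int flags)
{
    return !(flags & AT_RECURSIVE) || (flags & OPEN_TREE_CLONE);
}

/* Clone the mount subtree at 'src' and attach the clone at 'dst'.
 * Returns 0 on success, -1 with errno set on failure.
 * Cloning requires CAP_SYS_ADMIN, as noted in the DESCRIPTION. */
int bind_mount_recursive(const char *src, const char *dst)
{
    unsigned int flags = OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC | AT_RECURSIVE;

    if (!open_tree_flags_ok(flags)) {
        errno = EINVAL;
        return -1;
    }

    int fd = sys_open_tree(AT_FDCWD, src, flags);
    if (fd < 0)
        return -1;

    int ret = sys_move_mount(fd, "", AT_FDCWD, dst, MOVE_MOUNT_F_EMPTY_PATH);

    int saved = errno;    /* keep the interesting errno across close() */
    close(fd);            /* if the clone was never attached, closing drops it */
    errno = saved;
    return ret;
}
```

Calling bind_mount_recursive("/mnt", "/mnt2") corresponds to the three-line
example in the page; passing AT_RECURSIVE without OPEN_TREE_CLONE is rejected
up front by open_tree_flags_ok() rather than left to the kernel.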

Subject: Re: [PATCH 3/5] Add manpage for fspick(2)

Hello David,

As noted in another mail, I will ping on all of the mails, just to
raise all the patches to the top of the inbox.

Thanks,

Michael

On Thu, 27 Aug 2020 at 13:05, Michael Kerrisk (man-pages)
<[email protected]> wrote:
>
> Hello David,
>
> On 8/24/20 2:24 PM, David Howells wrote:
> > Add a manual page to document the fspick() system call.
> >
> > Signed-off-by: David Howells <[email protected]>
> > ---
> >
> > man2/fspick.2 | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 180 insertions(+)
> > create mode 100644 man2/fspick.2
> >
> > diff --git a/man2/fspick.2 b/man2/fspick.2
> > new file mode 100644
> > index 000000000..72bf645dd
> > --- /dev/null
> > +++ b/man2/fspick.2
> > @@ -0,0 +1,180 @@
> > +'\" t
> > +.\" Copyright (c) 2020 David Howells <[email protected]>
> > +.\"
> > +.\" %%%LICENSE_START(VERBATIM)
> > +.\" Permission is granted to make and distribute verbatim copies of this
> > +.\" manual provided the copyright notice and this permission notice are
> > +.\" preserved on all copies.
> > +.\"
> > +.\" Permission is granted to copy and distribute modified versions of this
> > +.\" manual under the conditions for verbatim copying, provided that the
> > +.\" entire resulting derived work is distributed under the terms of a
> > +.\" permission notice identical to this one.
> > +.\"
> > +.\" Since the Linux kernel and libraries are constantly changing, this
> > +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> > +.\" responsibility for errors or omissions, or for damages resulting from
> > +.\" the use of the information contained herein. The author(s) may not
> > +.\" have taken the same level of care in the production of this manual,
> > +.\" which is licensed free of charge, as they might when working
> > +.\" professionally.
> > +.\"
> > +.\" Formatted or processed versions of this manual, if unaccompanied by
> > +.\" the source, must acknowledge the copyright and authors of this work.
> > +.\" %%%LICENSE_END
> > +.\"
> > +.TH FSPICK 2 2020-08-24 "Linux" "Linux Programmer's Manual"
> > +.SH NAME
> > +fspick \- Select filesystem for reconfiguration
> > +.SH SYNOPSIS
> > +.nf
> > +.B #include <sys/types.h>
> > +.B #include <sys/mount.h>
> > +.B #include <unistd.h>
> > +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> > +.PP
> > +.BI "int fspick(int " dirfd ", const char *" pathname ", unsigned int " flags );
> > +.fi
> > +.PP
> > +.IR Note :
> > +There is no glibc wrapper for this system call.
> > +.SH DESCRIPTION
> > +.PP
> > +.BR fspick ()
> > +creates a new filesystem configuration context within the kernel and attaches a
> > +pre-existing superblock to it so that it can be reconfigured (similar to
> > +.BR mount (8)
> > +with the "-o remount" option). The configuration context is marked as being in
> > +reconfiguration mode and attached to a file descriptor, which is returned to
> > +the caller. The file descriptor can be marked close-on-exec by setting
> > +.B FSPICK_CLOEXEC
> > +in
> > +.IR flags .
> > +.PP
> > +The target is whichever superblock backs the object determined by
> > +.IR dirfd ", " pathname " and " flags .
> > +The following can be set in
> > +.I flags
> > +to control the pathwalk to that object:
> > +.TP
> > +.B FSPICK_SYMLINK_NOFOLLOW
> > +Don't follow symbolic links in the final component of the path.
> > +.TP
> > +.B FSPICK_NO_AUTOMOUNT
> > +Don't follow automounts in the final component of the path.
> > +.TP
> > +.B FSPICK_EMPTY_PATH
> > +Allow an empty string to be specified as the pathname. This allows
> > +.I dirfd
> > +to specify the target mount exactly.
> > +.PP
> > +After calling fspick(), the file descriptor should be passed to the
> > +.BR fsconfig (2)
> > +system call, using that to specify the desired changes to filesystem and
>
> Better: s/using that/in order/
>
> > +security parameters.
> > +.PP
> > +When the parameters are all set, the
> > +.BR fsconfig ()
> > +system call should then be called again with
> > +.B FSCONFIG_CMD_RECONFIGURE
> > +as the command argument to effect the reconfiguration.
> > +.PP
> > +After the reconfiguration has taken place, the context is wiped clean (apart
> > +from the superblock attachment, which remains) and can be reused to make
> > +another reconfiguration.
> > +.PP
> > +The file descriptor also serves as a channel by which more comprehensive error,
> > +warning and information messages may be retrieved from the kernel using
> > +.BR read (2).
> > +.SS Message Retrieval Interface
> > +The context file descriptor may be queried for message strings at any time by
>
> s/descriptor/descriptor returned by fspick()/
>
> > +calling
> > +.BR read (2)
> > +on the file descriptor. This will return formatted messages that are prefixed
> > +to indicate their class:
> > +.TP
> > +\fB"e <message>"\fP
> > +An error message string was logged.
> > +.TP
> > +\fB"i <message>"\fP
> > +An informational message string was logged.
> > +.TP
> > +\fB"w <message>"\fP
> > +A warning message string was logged.
> > +.PP
> > +Messages are removed from the queue as they're read and the queue has a limited
> > +depth of 8 messages, so it's possible for some to get lost.
>
> What if there are no pending error messages to retrieve? What does
> read() do in that case? Please add an explanation here.
>
> > +.SH RETURN VALUE
> > +On success, the function returns a file descriptor. On error, \-1 is returned,
> > +and
> > +.I errno
> > +is set appropriately.
> > +.SH ERRORS
> > +The error values given below result from filesystem type independent errors.
> > +Additionally, each filesystem type may have its own special errors and its own
> > +special behavior. See the Linux kernel source code for details.
> > +.TP
> > +.B EACCES
> > +A component of a path was not searchable.
> > +(See also
> > +.BR path_resolution (7).)
> > +.TP
> > +.B EFAULT
> > +.I pathname
> > +points outside the user address space.
> > +.TP
> > +.B EINVAL
> > +.I flags
> > +includes an undefined value.
> > +.TP
> > +.B ELOOP
> > +Too many links encountered during pathname resolution.
> > +.TP
> > +.B EMFILE
> > +The process has too many open files to create more.
> > +.TP
> > +.B ENFILE
> > +The system has too many open files to create more.
> > +.TP
> > +.B ENAMETOOLONG
> > +A pathname was longer than
> > +.BR MAXPATHLEN .
>
> MAXPATHLEN is not, I think, a constant known in user space. What is this?
> Should it be PATH_MAX?
>
> > +.TP
> > +.B ENOENT
> > +A pathname was empty or had a nonexistent component.
> > +.TP
> > +.B ENOMEM
> > +The kernel could not allocate sufficient memory to complete the call.
> > +.TP
> > +.B EPERM
> > +The caller does not have the required privileges.
>
> Please note the necessary capability here. Also, there was no mention of
> capabilities/privileges in DESCRIPTION. Should there have been?
>
> > +.SH CONFORMING TO
> > +These functions are Linux-specific and should not be used in programs intended
> > +to be portable.
> > +.SH VERSIONS
> > +.BR fsopen "(), " fsmount "() and " fspick ()
> > +were added to Linux in kernel 5.2.
> > +.SH EXAMPLES
> > +To illustrate the process, here's an example whereby this can be used to
> > +reconfigure a filesystem:
> > +.PP
> > +.in +4n
> > +.nf
> > +sfd = fspick(AT_FDCWD, "/mnt", FSPICK_NO_AUTOMOUNT | FSPICK_CLOEXEC);
> > +fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> > +fsconfig(sfd, FSCONFIG_SET_STRING, "user_xattr", "false", 0);
> > +fsconfig(sfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
> > +.fi
> > +.in
> > +.PP
> > +.SH NOTES
> > +Glibc does not (yet) provide a wrapper for the
> > +.BR fspick "()"
> > +system call; call it using
> > +.BR syscall (2).
> > +.SH SEE ALSO
> > +.BR mountpoint (1),
> > +.BR fsconfig (2),
> > +.BR fsopen (2),
> > +.BR path_resolution (7),
> > +.BR mount (8)
>
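Since the NOTES section says fspick() has to be invoked through syscall(2), the following is a minimal sketch of what that could look like, with the error checking that the EXAMPLES snippet omits. The helper names xfspick(), xfsconfig() and remount_ro() are hypothetical. The fallback syscall numbers and flag values are the ones used by Linux 5.2 on most architectures and are only defined here in case the installed headers lack them; treat them as assumptions on other platforms.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>          /* AT_FDCWD */
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks (assumption: generic numbering used since Linux 5.2). */
#ifndef SYS_fsconfig
#define SYS_fsconfig 431
#endif
#ifndef SYS_fspick
#define SYS_fspick 433
#endif
#ifndef FSPICK_CLOEXEC
#define FSPICK_CLOEXEC      0x00000001
#endif
#ifndef FSPICK_NO_AUTOMOUNT
#define FSPICK_NO_AUTOMOUNT 0x00000004
#endif
/* fsconfig() commands; these are enum values in <linux/mount.h>. */
#define X_FSCONFIG_SET_FLAG        0
#define X_FSCONFIG_CMD_RECONFIGURE 7

static int xfspick(int dfd, const char *path, unsigned int flags)
{
    return (int)syscall(SYS_fspick, dfd, path, flags);
}

static int xfsconfig(int fd, unsigned int cmd, const char *key,
                     const void *value, int aux)
{
    return (int)syscall(SYS_fsconfig, fd, cmd, key, value, aux);
}

/* Reconfigure the filesystem mounted at path as read-only.
 * Returns 0 on success, or -1 with errno set. */
static int remount_ro(const char *path)
{
    int ret = -1, saved;
    int sfd = xfspick(AT_FDCWD, path, FSPICK_NO_AUTOMOUNT | FSPICK_CLOEXEC);

    if (sfd == -1)
        return -1;
    if (xfsconfig(sfd, X_FSCONFIG_SET_FLAG, "ro", NULL, 0) == 0 &&
        xfsconfig(sfd, X_FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0) == 0)
        ret = 0;
    saved = errno;          /* close() must not clobber the error */
    close(sfd);
    errno = saved;
    return ret;
}
```

On failure the caller can inspect errno; an unprivileged caller would normally expect the call to fail.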
> Thanks,
>
> Michael
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/




Subject: Re: [PATCH 4/5] Add manpage for fsopen(2) and fsmount(2)

Hello David,

As noted in another mail, I will ping on all of the mails, just to
raise all the patches to the top of the inbox.

Thanks,

Michael

On Thu, 27 Aug 2020 at 13:07, Michael Kerrisk (man-pages)
<[email protected]> wrote:
>
> Hello David,
>
> On 8/24/20 2:25 PM, David Howells wrote:
> > Add a manual page to document the fsopen() and fsmount() system calls.
> >
> > Signed-off-by: David Howells <[email protected]>
> > ---
> >
> > man2/fsmount.2 | 1
> > man2/fsopen.2 | 245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 246 insertions(+)
> > create mode 100644 man2/fsmount.2
> > create mode 100644 man2/fsopen.2
> >
> > diff --git a/man2/fsmount.2 b/man2/fsmount.2
> > new file mode 100644
> > index 000000000..2bf59fc3e
> > --- /dev/null
> > +++ b/man2/fsmount.2
> > @@ -0,0 +1 @@
> > +.so man2/fsopen.2
> > diff --git a/man2/fsopen.2 b/man2/fsopen.2
> > new file mode 100644
> > index 000000000..1d1bba238
> > --- /dev/null
> > +++ b/man2/fsopen.2
> > @@ -0,0 +1,245 @@
> > +'\" t
> > +.\" Copyright (c) 2020 David Howells <[email protected]>
> > +.\"
> > +.\" %%%LICENSE_START(VERBATIM)
> > +.\" Permission is granted to make and distribute verbatim copies of this
> > +.\" manual provided the copyright notice and this permission notice are
> > +.\" preserved on all copies.
> > +.\"
> > +.\" Permission is granted to copy and distribute modified versions of this
> > +.\" manual under the conditions for verbatim copying, provided that the
> > +.\" entire resulting derived work is distributed under the terms of a
> > +.\" permission notice identical to this one.
> > +.\"
> > +.\" Since the Linux kernel and libraries are constantly changing, this
> > +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> > +.\" responsibility for errors or omissions, or for damages resulting from
> > +.\" the use of the information contained herein. The author(s) may not
> > +.\" have taken the same level of care in the production of this manual,
> > +.\" which is licensed free of charge, as they might when working
> > +.\" professionally.
> > +.\"
> > +.\" Formatted or processed versions of this manual, if unaccompanied by
> > +.\" the source, must acknowledge the copyright and authors of this work.
> > +.\" %%%LICENSE_END
> > +.\"
> > +.TH FSOPEN 2 2020-08-07 "Linux" "Linux Programmer's Manual"
> > +.SH NAME
> > +fsopen, fsmount \- Filesystem parameterisation and mount creation
> > +.SH SYNOPSIS
> > +.nf
> > +.B #include <sys/types.h>
> > +.B #include <sys/mount.h>
> > +.B #include <unistd.h>
> > +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> > +.PP
> > +.BI "int fsopen(const char *" fsname ", unsigned int " flags );
> > +.PP
> > +.BI "int fsmount(int " fd ", unsigned int " flags ", unsigned int " mount_attrs );
> > +.fi
> > +.PP
> > +.IR Note :
> > +There are no glibc wrappers for these system calls.
> > +.SH DESCRIPTION
> > +.PP
> > +.BR fsopen ()
> > +creates a blank filesystem configuration context within the kernel for the
> > +filesystem named in the
> > +.I fsname
> > +parameter, puts it into creation mode and attaches it to a file descriptor,
> > +which it then returns.
>
> In the preceding sentence, "it" is used three times, with two *different*
> referents. That's quite hard on the reader.
>
> How about:
>
> [[
> .BR fsopen ()
> creates a blank filesystem configuration context within the kernel for the
> filesystem named in the
> .I fsname
> parameter, puts the context into creation mode and
> attaches it to a file descriptor;
> .BR fsopen ()
> returns the file descriptor as the function result.
> ]]
>
> > The file descriptor can be marked close-on-exec by
> > +setting
> > +.B FSOPEN_CLOEXEC
> > +in
> > +.IR flags .
> > +.PP
> > +After calling fsopen(), the file descriptor should be passed to the
> > +.BR fsconfig (2)
> > +system call, using that to specify the desired filesystem and security
> > +parameters.
> > +.PP
> > +When the parameters are all set, the
> > +.BR fsconfig ()
> > +system call should then be called again with
> > +.B FSCONFIG_CMD_CREATE
> > +as the command argument to effect the creation.
> > +.RS
> > +.PP
> > +.BR "[!]\ NOTE" :
> > +Depending on the filesystem type and parameters, this may rather share an
>
> Please replace "this" with a noun (phrase), since it is a little
> unclear what "this" refers to.
>
> > +existing in-kernel filesystem representation instead of creating a new one.
> > +In such a case, the parameters specified may be discarded or may overwrite the
> > +parameters set by a previous mount - at the filesystem's discretion.
> > +.RE
> > +.PP
> > +The file descriptor also serves as a channel by which more comprehensive error,
> > +warning and information messages may be retrieved from the kernel using
> > +.BR read (2).
> > +.PP
> > +Once the creation command has been successfully run on a context, the context
> > +will not accept further configuration. At
> > +this point,
> > +.BR fsmount ()
> > +should be called to create a mount object.
> > +.PP
> > +.BR fsmount ()
> > +takes the file descriptor returned by
> > +.BR fsopen ()
> > +and creates a mount object for the filesystem root specified there. The
> > +attributes of the mount object are set from the
> > +.I mount_attrs
> > +parameter. The attributes specify the propagation and mount restrictions to
> > +be applied to accesses through this mount.
>
> Can we please have a list of the available attributes here, with a
> description of each attribute.
>
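Pending a proper list in the page itself, here is a sketch of the MOUNT_ATTR_* values as I understand them from the Linux 5.2 <linux/mount.h> header, together with a hypothetical helper, mount_attrs_plausible(), that mirrors the kind of check which leads to the EINVAL error for mount_attrs. The helper name and the exact validation rules are assumptions for illustration; the kernel is the authority on what it accepts.

```c
#include <stdbool.h>

/* Values as in <linux/mount.h> (Linux 5.2); defined only if missing. */
#ifndef MOUNT_ATTR_RDONLY
#define MOUNT_ATTR_RDONLY      0x00000001 /* Mount read-only */
#define MOUNT_ATTR_NOSUID      0x00000002 /* Ignore suid and sgid bits */
#define MOUNT_ATTR_NODEV       0x00000004 /* Disallow access to device files */
#define MOUNT_ATTR_NOEXEC      0x00000008 /* Disallow program execution */
#define MOUNT_ATTR__ATIME      0x00000070 /* Mask for the atime setting */
#define MOUNT_ATTR_RELATIME    0x00000000 /* Update atime relative to mtime/ctime */
#define MOUNT_ATTR_NOATIME     0x00000010 /* Do not update access times */
#define MOUNT_ATTR_STRICTATIME 0x00000020 /* Always update atime */
#define MOUNT_ATTR_NODIRATIME  0x00000080 /* Do not update directory atimes */
#endif

/*
 * Return true if attrs contains only known MOUNT_ATTR_* bits and
 * exactly one of the three atime settings.
 */
static bool mount_attrs_plausible(unsigned int attrs)
{
    const unsigned int known =
        MOUNT_ATTR_RDONLY | MOUNT_ATTR_NOSUID | MOUNT_ATTR_NODEV |
        MOUNT_ATTR_NOEXEC | MOUNT_ATTR__ATIME | MOUNT_ATTR_NODIRATIME;

    if (attrs & ~known)
        return false;

    switch (attrs & MOUNT_ATTR__ATIME) {
    case MOUNT_ATTR_RELATIME:
    case MOUNT_ATTR_NOATIME:
    case MOUNT_ATTR_STRICTATIME:
        return true;
    default:
        return false;       /* e.g. NOATIME and STRICTATIME combined */
    }
}
```

Note that the MS_* flags of mount(2) use a different numbering; MS_RELATIME, for instance, is (1 << 21), which falls outside the set of bits accepted above.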
> > +.PP
> > +The mount object is then attached to a new file descriptor that looks like one
> > +created by
> > +.BR open "(2) with " O_PATH " or " open_tree (2).
> > +This can be passed to
> > +.BR move_mount (2)
> > +to attach the mount object to a mountpoint, thereby completing the process.
>
> s/mountpoint/mount point/
>
> In the preceding paragraph, the description is a bit unclear. (Again,
> overuse of pronouns ("this") does not help.) I think it
> would be better to say something like:
>
> [[
> .BR fsmount ()
> attaches the mount object to a new file descriptor that looks like one
> created by
> .BR open "(2) with " O_PATH " or " open_tree (2).
> This file descriptor can be passed to
> .BR move_mount (2)
> to attach the mount object to a mount point, thereby completing the process.
> ]]
>
> But, please also replace "the process" with a more meaningful phrase.
>
> > +.PP
> > +The file descriptor returned by fsmount() is marked close-on-exec if
> > +FSMOUNT_CLOEXEC is specified in
> > +.IR flags .
> > +.PP
> > +After fsmount() has completed, the context created by fsopen() is reset and
> > +moved to reconfiguration state, allowing the new superblock to be
> > +reconfigured. See
> > +.BR fspick (2)
> > +for details.
> > +.PP
> > +To use either of these calls, the caller requires the appropriate privilege
> > +(Linux: the
>
> s/Linux: //
> (this is after all a Linux-specific system call)
>
> > +.B CAP_SYS_ADMIN
> > +capability).
> > +.PP
> > +.SS Message Retrieval Interface
> > +The context file descriptor may be queried for message strings at any time by
>
> s/The context file descriptor/
> The context file descriptor returned by fsopen()/
>
> > +calling
> > +.BR read (2)
> > +on the file descriptor. This will return formatted messages that are prefixed
> > +to indicate their class:
> > +.TP
> > +\fB"e <message>"\fP
> > +An error message string was logged.
> > +.TP
> > +\fB"i <message>"\fP
> > +An informational message string was logged.
> > +.TP
> > +\fB"w <message>"\fP
> > +A warning message string was logged.
> > +.PP
> > +Messages are removed from the queue as they're read.
>
> What if there are no pending error messages to retrieve? What does
> read() do in that case? Please add an explanation here.
>
> > +.SH RETURN VALUE
> > +On success, both functions return a file descriptor. On error, \-1 is
> > +returned, and
> > +.I errno
> > +is set appropriately.
> > +.SH ERRORS
> > +The error values given below result from filesystem type independent
> > +errors.
> > +Each filesystem type may have its own special errors and its
> > +own special behavior.
> > +See the Linux kernel source code for details.
> > +.TP
> > +.B EBUSY
> > +The context referred to by
> > +.I fd
> > +is not in the right state to be used by
> > +.BR fsmount ().
> > +.TP
> > +.B EFAULT
> > +One of the pointer arguments points outside the user address space.
> > +.TP
> > +.B EINVAL
> > +.I flags
> > +had an invalid flag set.
> > +.TP
> > +.B EINVAL
> > +.I mount_attrs
> > +includes invalid
> > +.BR MOUNT_ATTR_*
> > +flags.
> > +.TP
> > +.B EMFILE
> > +The process has too many open files to create more.
> > +.TP
> > +.B ENFILE
> > +The system has too many open files to create more.
> > +.TP
> > +.B ENODEV
> > +The filesystem
> > +.I fsname
> > +is not available in the kernel.
> > +.TP
> > +.B ENOMEM
> > +The kernel could not allocate sufficient memory to complete the call.
> > +.TP
> > +.B EPERM
> > +The caller does not have the required privileges.
>
> Please name the required capability.
>
> > +.SH CONFORMING TO
> > +These functions are Linux-specific and should not be used in programs intended
> > +to be portable.
> > +.SH VERSIONS
> > +.BR fsopen "(), and " fsmount ()
> > +were added to Linux in kernel 5.2.
> > +.SH NOTES
> > +Glibc does not (yet) provide a wrapper for the
> > +.BR fsopen "() or " fsmount "()"
> > +system calls; call them using
> > +.BR syscall (2).
> > +.SH EXAMPLES
> > +To illustrate the process, here's an example whereby this can be used to mount
>
> Please replace "this" by a noun (phrase).
>
> > +an ext4 filesystem on /dev/sdb1 onto /mnt.
> > +.PP
> > +.in +4n
> > +.nf
> > +sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> > +fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> > +fsconfig(sfd, FSCONFIG_SET_STRING, "source", "/dev/sdb1", 0);
> > +fsconfig(sfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0);
> > +fsconfig(sfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
> > +fsconfig(sfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
> > +fsconfig(sfd, FSCONFIG_SET_FLAG, "iversion", NULL, 0);
> > +fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> > +mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_RELATIME);
> > +move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> > +.fi
> > +.in
> > +.PP
> > +Here, an ext4 context is created first and attached to sfd. The context is
> > +then told where its source will be, given a bunch of options and a superblock
> > +record object is then created. Then fsmount() is called to create a mount
> > +object and
> > +.BR move_mount (2)
> > +is called to attach it to its intended mountpoint.
>
> s/mountpoint/mount point/
>
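For readers who want to try the sequence described above, here is a minimal sketch of the same fsopen(), fsconfig(), fsmount(), move_mount() steps with error checking, invoked through syscall(2) since glibc has no wrappers. All helper names (xfsopen(), xfsconfig(), xfsmount(), xmove_mount(), mount_ro()) are hypothetical. The fallback syscall numbers and flag values are those of Linux 5.2 on most architectures and are assumptions elsewhere. The third argument to fsmount() is a MOUNT_ATTR_* value, in line with the EINVAL entry in ERRORS.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>          /* AT_FDCWD */
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks (assumption: generic numbering used since Linux 5.2). */
#ifndef SYS_move_mount
#define SYS_move_mount 429
#endif
#ifndef SYS_fsopen
#define SYS_fsopen 430
#endif
#ifndef SYS_fsconfig
#define SYS_fsconfig 431
#endif
#ifndef SYS_fsmount
#define SYS_fsmount 432
#endif

/* Local copies of the 5.2 uapi values, prefixed to avoid header clashes. */
#define X_FSOPEN_CLOEXEC          0x00000001
#define X_FSMOUNT_CLOEXEC         0x00000001
#define X_FSCONFIG_SET_FLAG       0
#define X_FSCONFIG_SET_STRING     1
#define X_FSCONFIG_CMD_CREATE     6
#define X_MOUNT_ATTR_RDONLY       0x00000001
#define X_MOVE_MOUNT_F_EMPTY_PATH 0x00000004

static int xfsopen(const char *fsname, unsigned int flags)
{
    return (int)syscall(SYS_fsopen, fsname, flags);
}

static int xfsconfig(int fd, unsigned int cmd, const char *key,
                     const void *value, int aux)
{
    return (int)syscall(SYS_fsconfig, fd, cmd, key, value, aux);
}

static int xfsmount(int fd, unsigned int flags, unsigned int attrs)
{
    return (int)syscall(SYS_fsmount, fd, flags, attrs);
}

static int xmove_mount(int from_dfd, const char *from_path,
                       int to_dfd, const char *to_path, unsigned int flags)
{
    return (int)syscall(SYS_move_mount, from_dfd, from_path,
                        to_dfd, to_path, flags);
}

/* Mount source (of type fstype) read-only on target.
 * Returns 0 on success, or -1 with errno set. */
static int mount_ro(const char *fstype, const char *source, const char *target)
{
    int mfd = -1, ret = -1, saved;
    int sfd = xfsopen(fstype, X_FSOPEN_CLOEXEC);

    if (sfd == -1)
        return -1;
    if (xfsconfig(sfd, X_FSCONFIG_SET_STRING, "source", source, 0) == -1 ||
        xfsconfig(sfd, X_FSCONFIG_SET_FLAG, "ro", NULL, 0) == -1 ||
        xfsconfig(sfd, X_FSCONFIG_CMD_CREATE, NULL, NULL, 0) == -1)
        goto out;
    mfd = xfsmount(sfd, X_FSMOUNT_CLOEXEC, X_MOUNT_ATTR_RDONLY);
    if (mfd == -1)
        goto out;
    ret = xmove_mount(mfd, "", AT_FDCWD, target, X_MOVE_MOUNT_F_EMPTY_PATH);
out:
    saved = errno;          /* close() must not clobber the error */
    if (mfd != -1)
        close(mfd);
    close(sfd);
    errno = saved;
    return ret;
}
```

A call such as mount_ro("ext4", "/dev/sdb1", "/mnt") would correspond to a cut-down version of the example in the patch; it needs CAP_SYS_ADMIN to succeed.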
> > +.PP
> > +And here's an example of mounting from an NFS server and setting a Smack
> > +security module label on it too:
>
> Please replace "it" with a noun (phrase).
>
> > +.PP
> > +.in +4n
> > +.nf
> > +sfd = fsopen("nfs", 0);
> > +fsconfig(sfd, FSCONFIG_SET_STRING, "source", "example.com:/pub", 0);
> > +fsconfig(sfd, FSCONFIG_SET_STRING, "nfsvers", "3", 0);
> > +fsconfig(sfd, FSCONFIG_SET_STRING, "rsize", "65536", 0);
> > +fsconfig(sfd, FSCONFIG_SET_STRING, "wsize", "65536", 0);
> > +fsconfig(sfd, FSCONFIG_SET_STRING, "smackfsdef", "foolabel", 0);
> > +fsconfig(sfd, FSCONFIG_SET_FLAG, "rdma", NULL, 0);
> > +fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> > +mfd = fsmount(sfd, 0, MOUNT_ATTR_NODEV);
> > +move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> > +.fi
> > +.in
> > +.PP
> > +.SH SEE ALSO
> > +.BR mountpoint (1),
> > +.BR fsconfig (2),
> > +.BR fspick (2),
> > +.BR move_mount (2),
> > +.BR open_tree (2),
> > +.BR umount (2),
> > +.BR mount_namespaces (7),
> > +.BR path_resolution (7),
> > +.BR mount (8),
> > +.BR umount (8)
>
> Thanks,
>
> Michael
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/


