Add manual pages to document the move_mount and open_tree() system calls.
Signed-off-by: David Howells <[email protected]>
---
man2/move_mount.2 | 275 +++++++++++++++++++++++++++++++++++++++++++++++++++++
man2/open_tree.2 | 260 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 535 insertions(+)
create mode 100644 man2/move_mount.2
create mode 100644 man2/open_tree.2
diff --git a/man2/move_mount.2 b/man2/move_mount.2
new file mode 100644
index 000000000..a9720ea6f
--- /dev/null
+++ b/man2/move_mount.2
@@ -0,0 +1,275 @@
+'\" t
+.\" Copyright (c) 2019 David Howells <[email protected]>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH MOVE_MOUNT 2 2019-10-10 "Linux" "Linux Programmer's Manual"
+.SH NAME
+move_mount \- Move mount objects around the filesystem topology
+.SH SYNOPSIS
+.nf
+.B #include <sys/types.h>
+.br
+.B #include <sys/mount.h>
+.br
+.B #include <unistd.h>
+.br
+.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
+.PP
+.BI "int move_mount(int " from_dirfd ", const char *" from_pathname ","
+.BI " int " to_dirfd ", const char *" to_pathname ","
+.BI " unsigned int " flags );
+.fi
+.PP
+.IR Note :
+There is no glibc wrapper for this system call.
+.SH DESCRIPTION
+The
+.BR move_mount ()
+call moves a mount from one place to another; it can also be used to attach an
+unattached mount created by
+.BR fsmount "() or " open_tree "() with " OPEN_TREE_CLONE .
+.PP
+If
+.BR move_mount ()
+is called repeatedly with a file descriptor that refers to a mount object,
+then the object will be attached/moved the first time and then moved again and
+again and again, detaching it from the previous mountpoint each time.
+.PP
+To access the source mount object or the destination mountpoint, no
+permissions are required on the object itself, but if either pathname is
+supplied, execute (search) permission is required on all of the directories
+specified in
+.IR from_pathname " or " to_pathname .
+.PP
+The caller does, however, require the appropriate capabilities or permission
+to effect a mount.
+.PP
+.BR move_mount ()
+uses
+.IR from_pathname ", " from_dirfd " and part of " flags
+to locate the mount object to be moved and
+.IR to_pathname ", " to_dirfd " and another part of " flags
+to locate the destination mountpoint. Each lookup can be done in one of a
+variety of ways:
+.TP
+[*] By absolute path.
+The pathname points to an absolute path and the dirfd is ignored. The file is
+looked up by name, starting from the root of the filesystem as seen by the
+calling process.
+.TP
+[*] By cwd-relative path.
+The pathname points to a relative path and the dirfd is
+.IR AT_FDCWD .
+The file is looked up by name, starting from the current working directory.
+.TP
+[*] By dir-relative path.
+The pathname points to relative path and the dirfd indicates a file descriptor
+pointing to a directory. The file is looked up by name, starting from the
+directory specified by
+.IR dirfd .
+.TP
+[*] By file descriptor.
+The pathname points to "", the dirfd points directly to the mount object to
+move or the destination mount point and the appropriate
+.B *_EMPTY_PATH
+flag is set.
+.PP
+.I flags
+can be used to influence a path-based lookup. A value for
+.I flags
+is constructed by OR'ing together zero or more of the following constants:
+.TP
+.BR MOVE_MOUNT_F_EMPTY_PATH
+.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
+If
+.I from_pathname
+is an empty string, operate on the file referred to by
+.IR from_dirfd
+(which may have been obtained using the
+.BR open (2)
+.B O_PATH
+flag or
+.BR open_tree ())
+If
+.I from_dirfd
+is
+.BR AT_FDCWD ,
+the call operates on the current working directory.
+In this case,
+.I from_dirfd
+can refer to any type of file, not just a directory.
+This flag is Linux-specific; define
+.B _GNU_SOURCE
+.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
+to obtain its definition.
+.TP
+.B MOVE_MOUNT_T_EMPTY_PATH
+As above, but operating on
+.IR to_pathname " and " to_dirfd .
+.TP
+.B MOVE_MOUNT_F_NO_AUTOMOUNT
+Don't automount the terminal ("basename") component of
+.I from_pathname
+if it is a directory that is an automount point. This allows a mount object
+that has an automount point at its root to be moved and prevents unintended
+triggering of an automount point.
+The
+.B MOVE_MOUNT_F_NO_AUTOMOUNT
+flag has no effect if the automount point has already been mounted over. This
+flag is Linux-specific; define
+.B _GNU_SOURCE
+.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
+to obtain its definition.
+.TP
+.B MOVE_MOUNT_T_NO_AUTOMOUNT
+As above, but operating on
+.IR to_pathname " and " to_dirfd .
+This allows an automount point to be manually mounted over.
+.TP
+.B MOVE_MOUNT_F_SYMLINKS
+If
+.I from_pathname
+is a symbolic link, then dereference it. The default for
+.BR move_mount ()
+is to not follow symlinks.
+.TP
+.B MOVE_MOUNT_T_SYMLINKS
+As above, but operating on
+.IR to_pathname " and " to_dirfd .
+
+.SH EXAMPLES
+The
+.BR move_mount ()
+function can be used like the following:
+.PP
+.RS
+.nf
+move_mount(AT_FDCWD, "/a", AT_FDCWD, "/b", 0);
+.fi
+.RE
+.PP
+This would move the object mounted on "/a" to "/b". It can also be used in
+conjunction with
+.BR open_tree "(2) or " open "(2) with " O_PATH :
+.PP
+.RS
+.nf
+fd = open_tree(AT_FDCWD, "/mnt", 0);
+move_mount(fd, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
+move_mount(fd, "", AT_FDCWD, "/mnt3", MOVE_MOUNT_F_EMPTY_PATH);
+move_mount(fd, "", AT_FDCWD, "/mnt4", MOVE_MOUNT_F_EMPTY_PATH);
+.fi
+.RE
+.PP
+This would attach the path point for "/mnt" to fd, then it would move the
+mount to "/mnt2", then move it to "/mnt3" and finally to "/mnt4".
+.PP
+It can also be used to attach new mounts:
+.PP
+.RS
+.nf
+sfd = fsopen("ext4", FSOPEN_CLOEXEC);
+fsconfig(sfd, FSCONFIG_SET_STRING, "source", "/dev/sda1", 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
+fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NODEV);
+move_mount(mfd, "", AT_FDCWD, "/home", MOVE_MOUNT_F_EMPTY_PATH);
+.fi
+.RE
+.PP
+Which would open the Ext4 filesystem mounted on "/dev/sda1", turn on user
+extended attribute support and create a mount object for it. Finally, the new
+mount object would be attached with
+.BR move_mount ()
+to "/home".
+
+
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.SH RETURN VALUE
+On success, 0 is returned. On error, \-1 is returned, and
+.I errno
+is set appropriately.
+.SH ERRORS
+.TP
+.B EACCES
+Search permission is denied for one of the directories
+in the path prefix of
+.IR pathname .
+(See also
+.BR path_resolution (7).)
+.TP
+.B EBADF
+.IR from_dirfd " or " to_dirfd
+is not a valid open file descriptor.
+.TP
+.B EFAULT
+.IR from_pathname " or " to_pathname
+is NULL or either one point to a location outside the process's accessible
+address space.
+.TP
+.B EINVAL
+Reserved flag specified in
+.IR flags .
+.TP
+.B ELOOP
+Too many symbolic links encountered while traversing the pathname.
+.TP
+.B ENAMETOOLONG
+.IR from_pathname " or " to_pathname
+is too long.
+.TP
+.B ENOENT
+A component of
+.IR from_pathname " or " to_pathname
+does not exist, or one is an empty string and the appropriate
+.B *_EMPTY_PATH
+was not specified in
+.IR flags .
+.TP
+.B ENOMEM
+Out of memory (i.e., kernel memory).
+.TP
+.B ENOTDIR
+A component of the path prefix of
+.IR from_pathname " or " to_pathname
+is not a directory or one or the other is relative and the appropriate
+.I *_dirfd
+is a file descriptor referring to a file other than a directory.
+.SH VERSIONS
+.BR move_mount ()
+was added to Linux in kernel 4.18.
+.SH CONFORMING TO
+.BR move_mount ()
+is Linux-specific.
+.SH NOTES
+Glibc does not (yet) provide a wrapper for the
+.BR move_mount ()
+system call; call it using
+.BR syscall (2).
+.SH SEE ALSO
+.BR fsmount (2),
+.BR fsopen (2),
+.BR open_tree (2)
diff --git a/man2/open_tree.2 b/man2/open_tree.2
new file mode 100644
index 000000000..cbebbfda8
--- /dev/null
+++ b/man2/open_tree.2
@@ -0,0 +1,260 @@
+'\" t
+.\" Copyright (c) 2019 David Howells <[email protected]>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH OPEN_TREE 2 2019-10-10 "Linux" "Linux Programmer's Manual"
+.SH NAME
+open_tree \- Pick or clone mount object and attach to fd
+.SH SYNOPSIS
+.nf
+.B #include <sys/types.h>
+.br
+.B #include <sys/mount.h>
+.br
+.B #include <unistd.h>
+.br
+.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
+.PP
+.BI "int open_tree(int " dirfd ", const char *" pathname ", unsigned int " flags );
+.fi
+.PP
+.IR Note :
+There are no glibc wrappers for these system calls.
+.SH DESCRIPTION
+.BR open_tree ()
+picks the mount object specified by the pathname and attaches it to a new file
+descriptor or clones it and attaches the clone to the file descriptor. The
+resultant file descriptor is indistinguishable from one produced by
+.BR open "(2) with " O_PATH .
+.PP
+In the case that the mount object is cloned, the clone will be "unmounted" and
+destroyed when the file descriptor is closed if it is not otherwise mounted
+somewhere by calling
+.BR move_mount (2).
+.PP
+To select a mount object, no permissions are required on the object referred
+to by the path, but execute (search) permission is required on all of the
+directories in
+.I pathname
+that lead to the object.
+.PP
+To clone an object, however, the caller must have mount capabilities and
+permissions.
+.PP
+.BR open_tree ()
+uses
+.IR pathname ", " dirfd " and " flags
+to locate the target object in one of a variety of ways:
+.TP
+[*] By absolute path.
+.I pathname
+points to an absolute path and
+.I dirfd
+is ignored. The object is looked up by name, starting from the root of the
+filesystem as seen by the calling process.
+.TP
+[*] By cwd-relative path.
+.I pathname
+points to a relative path and
+.IR dirfd " is " AT_FDCWD .
+The object is looked up by name, starting from the current working directory.
+.TP
+[*] By dir-relative path.
+.I pathname
+points to relative path and
+.I dirfd
+indicates a file descriptor pointing to a directory. The object is looked up
+by name, starting from the directory specified by
+.IR dirfd .
+.TP
+[*] By file descriptor.
+.I pathname
+is "",
+.I dirfd
+indicates a file descriptor and
+.B AT_EMPTY_PATH
+is set in
+.IR flags .
+The mount attached to the file descriptor is queried directly. The file
+descriptor may point to any type of file, not just a directory.
+
+.\"______________________________________________________________
+.PP
+.I flags
+can be used to control the operation of the function and to influence a
+path-based lookup. A value for
+.I flags
+is constructed by OR'ing together zero or more of the following constants:
+.TP
+.BR AT_EMPTY_PATH
+.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
+If
+.I pathname
+is an empty string, operate on the file referred to by
+.IR dirfd
+(which may have been obtained from
+.BR open "(2) with"
+.BR O_PATH ", from " fsmount (2)
+or from another
+.BR open_tree ()).
+If
+.I dirfd
+is
+.BR AT_FDCWD ,
+the call operates on the current working directory.
+In this case,
+.I dirfd
+can refer to any type of file, not just a directory.
+This flag is Linux-specific; define
+.B _GNU_SOURCE
+.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
+to obtain its definition.
+.TP
+.BR AT_NO_AUTOMOUNT
+Don't automount the terminal ("basename") component of
+.I pathname
+if it is a directory that is an automount point. This flag allows the
+automount point itself to be picked up or a mount cloned that is rooted on the
+automount point. The
+.B AT_NO_AUTOMOUNT
+flag has no effect if the mount point has already been mounted over.
+This flag is Linux-specific; define
+.B _GNU_SOURCE
+.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
+to obtain its definition.
+.TP
+.B AT_SYMLINK_NOFOLLOW
+If
+.I pathname
+is a symbolic link, do not dereference it: instead pick up or clone a mount
+rooted on the link itself.
+.TP
+.B OPEN_TREE_CLOEXEC
+Set the close-on-exec flag for the new file descriptor. This will cause the
+file descriptor to be closed automatically when a process exec's.
+.TP
+.B OPEN_TREE_CLONE
+Rather than directly attaching the selected object to the file descriptor,
+clone the object, set the root of the new mount object to that point and
+attach the clone to the file descriptor.
+.TP
+.B AT_RECURSIVE
+This is only permitted in conjunction with OPEN_TREE_CLONE. It causes the
+entire mount subtree rooted at the selected spot to be cloned rather than just
+that one mount object.
+
+
+.SH EXAMPLE
+The
+.BR open_tree ()
+function can be used like the following:
+.PP
+.RS
+.nf
+fd1 = open_tree(AT_FDCWD, "/mnt", 0);
+fd2 = open_tree(fd1, "",
+ AT_EMPTY_PATH | OPEN_TREE_CLONE | AT_RECURSIVE);
+move_mount(fd2, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
+.fi
+.RE
+.PP
+This would attach the path point for "/mnt" to fd1, then it would copy the
+entire subtree at the point referred to by fd1 and attach that to fd2; lastly,
+it would attach the clone to "/mnt2".
+
+
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.SH RETURN VALUE
+On success, the new file descriptor is returned. On error, \-1 is returned,
+and
+.I errno
+is set appropriately.
+.SH ERRORS
+.TP
+.B EACCES
+Search permission is denied for one of the directories
+in the path prefix of
+.IR pathname .
+(See also
+.BR path_resolution (7).)
+.TP
+.B EBADF
+.I dirfd
+is not a valid open file descriptor.
+.TP
+.B EFAULT
+.I pathname
+is NULL or
+.IR pathname
+point to a location outside the process's accessible address space.
+.TP
+.B EINVAL
+Reserved flag specified in
+.IR flags .
+.TP
+.B ELOOP
+Too many symbolic links encountered while traversing the pathname.
+.TP
+.B ENAMETOOLONG
+.I pathname
+is too long.
+.TP
+.B ENOENT
+A component of
+.I pathname
+does not exist, or
+.I pathname
+is an empty string and
+.B AT_EMPTY_PATH
+was not specified in
+.IR flags .
+.TP
+.B ENOMEM
+Out of memory (i.e., kernel memory).
+.TP
+.B ENOTDIR
+A component of the path prefix of
+.I pathname
+is not a directory or
+.I pathname
+is relative and
+.I dirfd
+is a file descriptor referring to a file other than a directory.
+.SH VERSIONS
+.BR open_tree ()
+was added to Linux in kernel 4.18.
+.SH CONFORMING TO
+.BR open_tree ()
+is Linux-specific.
+.SH NOTES
+Glibc does not (yet) provide a wrapper for the
+.BR open_tree ()
+system call; call it using
+.BR syscall (2).
+.SH SEE ALSO
+.BR fsmount (2),
+.BR move_mount (2),
+.BR open (2)
Add a manual page to document the fsopen(), fspick() and fsmount() system
calls.
Signed-off-by: David Howells <[email protected]>
---
man2/fsconfig.2 | 282 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
man2/fsmount.2 | 1
man2/fsopen.2 | 256 ++++++++++++++++++++++++++++++++++++++++++++++++++
man2/fspick.2 | 196 ++++++++++++++++++++++++++++++++++++++
4 files changed, 735 insertions(+)
create mode 100644 man2/fsconfig.2
create mode 100644 man2/fsmount.2
create mode 100644 man2/fsopen.2
create mode 100644 man2/fspick.2
diff --git a/man2/fsconfig.2 b/man2/fsconfig.2
new file mode 100644
index 000000000..8c8ae0842
--- /dev/null
+++ b/man2/fsconfig.2
@@ -0,0 +1,282 @@
+'\" t
+.\" Copyright (c) 2019 David Howells <[email protected]>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH FSCONFIG 2 2019-10-10 "Linux" "Linux Programmer's Manual"
+.SH NAME
+fsconfig \- Filesystem parameterisation
+.SH SYNOPSIS
+.nf
+.B #include <sys/types.h>
+.br
+.B #include <sys/mount.h>
+.br
+.B #include <unistd.h>
+.br
+.B #include <sys/mount.h>
+.PP
+.BI "int fsconfig(int *" fd ", unsigned int " cmd ", const char *" key ,
+.br
+.BI " const void __user *" value ", int " aux ");"
+.br
+.BI
+.fi
+.PP
+.IR Note :
+There is no glibc wrapper for this system call.
+.SH DESCRIPTION
+.PP
+.BR fsconfig ()
+is used to supply parameters to and issue commands against a filesystem
+configuration context as set up by
+.BR fsopen (2)
+or
+.BR fspick (2).
+The context is supplied attached to the file descriptor specified by
+.I fd
+argument.
+.PP
+The
+.I cmd
+argument indicates the command to be issued, where some of the commands simply
+supply parameters to the context. The meaning of
+.IR key ", " value " and " aux
+are command-dependent; unless required for the command, these should be set to
+NULL or 0.
+.PP
+The available commands are:
+.TP
+.B FSCONFIG_SET_FLAG
+Set the parameter named by
+.IR key
+to true. This may incur error
+.B EINVAL
+if the parameter requires an argument.
+.TP
+.B FSCONFIG_SET_STRING
+Set the parameter named by
+.I key
+to a string. This may incur error
+.B EINVAL
+if the parser doesn't want a parameter here, wants a non-string or the string
+cannot be interpreted appropriately.
+.I value
+points to a NUL-terminated string.
+.TP
+.B FSCONFIG_SET_BINARY
+Set the parameter named by
+.I key
+to be a binary blob argument. This may cause
+.B EINVAL
+to be returned if the filesystem parser isn't expecting a binary blob and it
+can't be converted to something usable.
+.I value
+points to the data and
+.I aux
+indicates the size of the data.
+.TP
+.B FSCONFIG_SET_PATH
+Set the parameter named by
+.I key
+to the object at the provided path.
+.I value
+should point to a NULL-terminated pathname string and aux may indicate
+.B AT_FDCWD
+or a file descriptor indicating a directory from which to begin a relative
+pathwalk. This may return any errors incurred by the pathwalk and may return
+.B EINVAL
+if the parameter isn't expecting a path.
+.IP
+Note that FSCONFIG_SET_STRING can be used instead, implying AT_FDCWD.
+.TP
+.B FSCONFIG_SET_PATH_EMPTY
+As FSCONFIG_SET_PATH, but with
+.B AT_EMPTY_PATH
+applied to the pathwalk.
+.TP
+.B FSCONFIG_SET_FD
+Set the parameter named by
+.I key
+to the file descriptor specified by
+.IR aux .
+This will incur
+.B EINVAL
+if the parameter doesn't expect a file descriptor or
+.B EBADF
+if the file descriptor is invalid.
+.IP
+Note that FSCONFIG_SET_STRING can be used instead with the file descriptor
+passed as a decimal string.
+.TP
+.B FSCONFIG_CMD_CREATE
+This command causes the filesystem to take the parameters set in the context
+and to try to create filesystem representation in the kernel. If it can share
+an existing one, it may do that instead if the filesystem type and parameters
+permit that. This is intended for use with
+.BR fsopen (2).
+.TP
+.B FSCONFIG_CMD_RECONFIGURE
+This command causes the filesystem to apply the parameters set in the context
+to an already existing filesystem representation in memory and to alter it.
+This is intended for use with
+.BR fspick (2),
+but may also by used against the context created by
+.BR fsopen()
+after
+.BR fsmount (2)
+has been called on it.
+
+.\"________________________________________________________
+.SH EXAMPLES
+.PP
+.in +4n
+.nf
+fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
+
+fsconfig(sfd, FSCONFIG_SET_STRING, "user_xattr", "false", 0);
+
+fsconfig(sfd, FSCONFIG_SET_BINARY, "ms_pac", pac_buffer, pac_size);
+
+fsconfig(sfd, FSCONFIG_SET_PATH, "journal", "/dev/sdd4", AT_FDCWD);
+
+dirfd = open("/dev/", O_PATH);
+fsconfig(sfd, FSCONFIG_SET_PATH, "journal", "sdd4", dirfd);
+
+fd = open("/overlays/mine/", O_PATH);
+fsconfig(sfd, FSCONFIG_SET_PATH_EMPTY, "lower_dir", "", fd);
+
+pipe(pipefds);
+fsconfig(sfd, FSCONFIG_SET_FD, "fd", NULL, pipefds[1]);
+.fi
+.in
+.PP
+
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.SH RETURN VALUE
+On success, the function returns 0. On error, \-1 is returned, and
+.I errno
+is set appropriately.
+.SH ERRORS
+The error values given below result from filesystem type independent
+errors.
+Each filesystem type may have its own special errors and its
+own special behavior.
+See the Linux kernel source code for details.
+.TP
+.B EACCES
+A component of a path was not searchable.
+(See also
+.BR path_resolution (7).)
+.TP
+.B EACCES
+Mounting a read-only filesystem was attempted without specifying the
+.RB ' ro '
+parameter.
+.TP
+.B EACCES
+A specified block device is located on a filesystem mounted with the
+.B MS_NODEV
+option.
+.\" mtk: Probably: write permission is required for MS_BIND, with
+.\" the error EPERM if not present; CAP_DAC_OVERRIDE is required.
+.TP
+.B EBADF
+The file descriptor given by
+.I fd
+or possibly by
+.I aux
+(depending on the command) is invalid.
+.TP
+.B EBUSY
+The context attached to
+.I fd
+is in the wrong state for the given command.
+.TP
+.B EBUSY
+The filesystem representation cannot be reconfigured read-only because it still
+holds files open for writing.
+.TP
+.B EFAULT
+One of the pointer arguments points outside the user address space.
+.TP
+.B EINVAL
+.I fd
+does not refer to a filesystem configuration context.
+.TP
+.B EINVAL
+One of the source parameters referred to an invalid superblock.
+.TP
+.B ELOOP
+Too many links encountered during pathname resolution.
+.TP
+.B ENAMETOOLONG
+A path name was longer than
+.BR MAXPATHLEN .
+.TP
+.B ENOENT
+A pathname was empty or had a nonexistent component.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the call.
+.TP
+.B ENOTBLK
+Once of the parameters does not refer to a block device (and a device was
+required).
+.TP
+.B ENOTDIR
+.IR pathname ,
+or a prefix of
+.IR source ,
+is not a directory.
+.TP
+.B EOPNOTSUPP
+The command given by
+.I cmd
+was not valid.
+.TP
+.B ENXIO
+The major number of a block device parameter is out of range.
+.TP
+.B EPERM
+The caller does not have the required privileges.
+.SH CONFORMING TO
+These functions are Linux-specific and should not be used in programs intended
+to be portable.
+.SH VERSIONS
+.BR fsconfig ()
+was added to Linux in kernel 5.1.
+.SH NOTES
+Glibc does not (yet) provide a wrapper for the
+.BR fspick ()
+system call; call it using
+.BR syscall (2).
+.SH SEE ALSO
+.BR mountpoint (1),
+.BR fsmount (2),
+.BR fsopen (2),
+.BR fspick (2),
+.BR mount_namespaces (7),
+.BR path_resolution (7)
diff --git a/man2/fsmount.2 b/man2/fsmount.2
new file mode 100644
index 000000000..2bf59fc3e
--- /dev/null
+++ b/man2/fsmount.2
@@ -0,0 +1 @@
+.so man2/fsopen.2
diff --git a/man2/fsopen.2 b/man2/fsopen.2
new file mode 100644
index 000000000..338ce93ed
--- /dev/null
+++ b/man2/fsopen.2
@@ -0,0 +1,256 @@
+'\" t
+.\" Copyright (c) 2019 David Howells <[email protected]>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH FSOPEN 2 2019-10-10 "Linux" "Linux Programmer's Manual"
+.SH NAME
+fsopen, fsmount \- Filesystem parameterisation and mount creation
+.SH SYNOPSIS
+.nf
+.B #include <sys/types.h>
+.br
+.B #include <sys/mount.h>
+.br
+.B #include <unistd.h>
+.br
+.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
+.br
+.BR "#include <sys/mount.h> "
+.PP
+.BI "int fsopen(const char *" fsname ", unsigned int " flags );
+.PP
+.BI "int fsmount(int " fd ", unsigned int " flags ", unsigned int " mount_attrs );
+.fi
+.PP
+.IR Note :
+There are no glibc wrappers for these system calls.
+.SH DESCRIPTION
+.PP
+.BR fsopen ()
+creates a blank filesystem configuration context within the kernel for the
+filesystem named in the
+.I fsname
+parameter, puts it into creation mode and attaches it to a file descriptor,
+which it then returns. The file descriptor can be marked close-on-exec by
+setting
+.B FSOPEN_CLOEXEC
+in
+.IR flags .
+.PP
+After calling fsopen(), the file descriptor should be passed to the
+.BR fsconfig (2)
+system call, using that to specify the desired filesystem and security
+parameters.
+.PP
+When the parameters are all set, the
+.BR fsconfig ()
+system call should then be called again with
+.B FSCONFIG_CMD_CREATE
+as the command argument to effect the creation.
+.RS
+.PP
+.BR "[!]\ NOTE" :
+Depending on the filesystem type and parameters, this may rather share an
+existing in-kernel filesystem representation instead of creating a new one.
+In such a case, the parameters specified may be discarded or may overwrite the
+parameters set by a previous mount - at the filesystem's discretion.
+.RE
+.PP
+The file descriptor also serves as a channel by which more comprehensive error,
+warning and information messages may be retrieved from the kernel using
+.BR read (2).
+
+.PP
+Once the creation command has been successfully run on a context, the context
+is switched into need-mount mode which prevents further configuration. At
+this point,
+.BR fsmount ()
+should be called to create a mount object.
+.PP
+.BR fsmount ()
+takes the file descriptor returned by
+.BR fsopen ()
+and creates a mount object for the filesystem root specified there. The
+attributes of the mount object are set from the
+.I mount_attrs
+parameter. The attributes specify the propagation and mount restrictions to
+be applied to accesses through this mount.
+.PP
+The mount object is then attached to a new file descriptor that looks like one
+created by
+.BR open "(2) with " O_PATH " or " open_tree (2).
+This can be passed to
+.BR move_mount (2)
+to attach the mount object to a mountpoint, thereby completing the process.
+.PP
+The file descriptor returned by fsmount() is marked close-on-exec if
+FSMOUNT_CLOEXEC is specified in
+.IR flags .
+.PP
+After fsmount() has completed, the context created by fsopen() is reset and
+moved to reconfiguration state, allowing the new superblock to be
+reconfigured. See
+.BR fspick (2)
+for details.
+.PP
+
+.\"________________________________________________________
+.SS Message Retrieval Interface
+The context file descriptor may be queried for message strings at any time by
+calling
+.BR read (2)
+on the file descriptor. This will return formatted messages that are prefixed
+to indicate their class:
+.TP
+\fB"e <message>"\fP
+An error message string was logged.
+.TP
+\fB"i <message>"\fP
+An informational message string was logged.
+.TP
+\fB"w <message>"\fP
+An warning message string was logged.
+.PP
+Messages are removed from the queue as they're read.
+
+.\"________________________________________________________
+.SH EXAMPLES
+To illustrate the process, here's an example whereby this can be used to mount
+an ext4 filesystem on /dev/sdb1 onto /mnt.
+.PP
+.in +4n
+.nf
+sfd = fsopen("ext4", FSOPEN_CLOEXEC);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "source", "/dev/sdb1", 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "user_attr", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "iversion", NULL, 0);
+fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MS_RELATIME);
+move_mount(mfd, "", sfd, AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.fi
+.in
+.PP
+Here, an ext4 context is created first and attached to sfd. This is then told
+where its source will be, given a bunch of options and created. Then
+fsmount() is called to create a mount object and
+.BR move_mount (2)
+is called to attach it to its intended mountpoint.
+.PP
+And here's an example of mounting from an NFS server and setting a Smack
+security module label on it too:
+.PP
+.in +4n
+.nf
+sfd = fsopen("nfs", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "source", "example.com/pub/linux", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "nfsvers", "3", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "rsize", "65536", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "wsize", "65536", 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "smackfsdef", "foolabel", 0);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "rdma", NULL, 0);
+fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mfd = fsmount(sfd, 0, MS_NODEV);
+move_mount(mfd, "", sfd, AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.fi
+.in
+.PP
+
+
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.SH RETURN VALUE
+On success, both functions return a file descriptor. On error, \-1 is
+returned, and
+.I errno
+is set appropriately.
+.SH ERRORS
+The error values given below result from filesystem type independent
+errors.
+Each filesystem type may have its own special errors and its
+own special behavior.
+See the Linux kernel source code for details.
+.TP
+.B EBUSY
+The context referred to by
+.I fd
+is not in the right state to be used by
+.BR fsmount ().
+.TP
+.B EFAULT
+One of the pointer arguments points outside the user address space.
+.TP
+.B EINVAL
+.I flags
+had an invalid flag set.
+.TP
+.B EINVAL
+.I mount_attrs,
+includes invalid
+.BR MOUNT_ATTR_*
+flags.
+.TP
+.B EMFILE
+The system has too many open files to create more.
+.TP
+.B ENFILE
+The process has too many open files to create more.
+.TP
+.B ENODEV
+Filesystem
+.I fsname
+not configured in the kernel.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the call.
+.TP
+.B EPERM
+The caller does not have the required privileges.
+.SH CONFORMING TO
+These functions are Linux-specific and should not be used in programs intended
+to be portable.
+.SH VERSIONS
+.BR fsopen "(), and " fsmount ()
+were added to Linux in kernel 5.1.
+.SH NOTES
+Glibc does not (yet) provide a wrapper for the
+.BR fsopen "() or " fsmount "()"
+system calls; call them using
+.BR syscall (2).
+.SH SEE ALSO
+.BR mountpoint (1),
+.BR fsconfig (2),
+.BR fspick (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR umount (2),
+.BR mount_namespaces (7),
+.BR path_resolution (7),
+.BR findmnt (8),
+.BR lsblk (8),
+.BR mount (8),
+.BR umount (8)
diff --git a/man2/fspick.2 b/man2/fspick.2
new file mode 100644
index 000000000..03f787007
--- /dev/null
+++ b/man2/fspick.2
@@ -0,0 +1,196 @@
+'\" t
+.\" Copyright (c) 2019 David Howells <[email protected]>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date. The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein. The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH FSPICK 2 2019-10-10 "Linux" "Linux Programmer's Manual"
+.SH NAME
+fspick \- Select filesystem for reconfiguration
+.SH SYNOPSIS
+.nf
+.B #include <sys/types.h>
+.br
+.B #include <sys/mount.h>
+.br
+.B #include <unistd.h>
+.br
+.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
+.PP
+.BI "int fspick(int " dirfd ", const char *" pathname ", unsigned int " flags );
+.fi
+.PP
+.IR Note :
+There is no glibc wrapper for this system call.
+.SH DESCRIPTION
+.PP
+.BR fspick ()
+creates a new filesystem configuration context within the kernel and attaches a
+pre-existing superblock to it so that it can be reconfigured (similar to
+.BR mount (8)
+with the "-o remount" option). The configuration context is marked as being in
+reconfiguration mode and attached to a file descriptor, which is returned to
+the caller. This can be marked close-on-exec by setting
+.B FSPICK_CLOEXEC
+in
+.IR flags .
+.PP
+The target is whichever superblock backs the object determined by
+.IR dfd ", " pathname " and " flags .
+The following can be set in
+.I flags
+to control the pathwalk to that object:
+.TP
+.B FSPICK_SYMLINK_NOFOLLOW
+Don't follow symbolic links in the terminal component of the path.
+.TP
+.B FSPICK_NO_AUTOMOUNT
+Don't follow automounts in the terminal component of the path.
+.TP
+.B FSPICK_EMPTY_PATH
+Allow an empty string to be specified as the pathname. This allows
+.I dirfd
+to specify a path exactly.
+.PP
+After calling fspick(), the file descriptor should be passed to the
+.BR fsconfig (2)
+system call, using that to specify the desired changes to filesystem and
+security parameters.
+.PP
+When the parameters are all set, the
+.BR fsconfig ()
+system call should then be called again with
+.B FSCONFIG_CMD_RECONFIGURE
+as the command argument to effect the reconfiguration.
+.PP
+After the reconfiguration has taken place, the context is wiped clean (apart
+from the superblock attachment, which remains) and can be reused to make
+another reconfiguration.
+.PP
+The file descriptor also serves as a channel by which more comprehensive error,
+warning and information messages may be retrieved from the kernel using
+.BR read (2).
+
+
+.\"________________________________________________________
+.SS Message Retrieval Interface
+The context file descriptor may be queried for message strings at any time by
+calling
+.BR read (2)
+on the file descriptor. This will return formatted messages that are prefixed
+to indicate their class:
+.TP
+\fB"e <message>"\fP
+An error message string was logged.
+.TP
+\fB"i <message>"\fP
+An informational message string was logged.
+.TP
+\fB"w <message>"\fP
+An warning message string was logged.
+.PP
+Messages are removed from the queue as they're read and the queue has a limited
+depth, so it's possible for some to get lost.
+
+.\"________________________________________________________
+.SH EXAMPLES
+To illustrate the process, here's an example whereby this can be used to
+reconfigure a filesystem:
+.PP
+.in +4n
+.nf
+sfd = fspick(AT_FDCWD, "/mnt", FSPICK_NO_AUTOMOUNT | FSPICK_CLOEXEC);
+fsconfig(sfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
+fsconfig(sfd, FSCONFIG_SET_STRING, "user_xattr", "false", 0);
+fsconfig(sfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
+.fi
+.in
+.PP
+
+
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.SH RETURN VALUE
+On success, the function returns a file descriptor. On error, \-1 is returned,
+and
+.I errno
+is set appropriately.
+.SH ERRORS
+The error values given below result from filesystem type independent
+errors.
+Each filesystem type may have its own special errors and its
+own special behavior.
+See the Linux kernel source code for details.
+.TP
+.B EACCES
+A component of a path was not searchable.
+(See also
+.BR path_resolution (7).)
+.TP
+.B EFAULT
+.I pathname
+points outside the user address space.
+.TP
+.B EINVAL
+.I flags
+includes an undefined value.
+.TP
+.B ELOOP
+Too many links encountered during pathname resolution.
+.TP
+.B EMFILE
+The system has too many open files to create more.
+.TP
+.B ENFILE
+The process has too many open files to create more.
+.TP
+.B ENAMETOOLONG
+A pathname was longer than
+.BR MAXPATHLEN .
+.TP
+.B ENOENT
+A pathname was empty or had a nonexistent component.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the call.
+.TP
+.B EPERM
+The caller does not have the required privileges.
+.SH CONFORMING TO
+These functions are Linux-specific and should not be used in programs intended
+to be portable.
+.SH VERSIONS
+.BR fsopen "(), " fsmount "() and " fspick ()
+were added to Linux in kernel 5.1.
+.SH NOTES
+Glibc does not (yet) provide a wrapper for the
+.BR fspick "()"
+system call; call it using
+.BR syscall (2).
+.SH SEE ALSO
+.BR mountpoint (1),
+.BR fsconfig (2),
+.BR fsopen (2),
+.BR path_resolution (7),
+.BR mount (8)
+
Hello David,
I got a bunch of mail with man-pages patches from you,
but I think there's some confusion, since the mails I
got were:
[MANPAGE] fsopen.2, fsmount.2
[MANPAGE] fspick.2
[MANPAGE] fsconfig.2
[MANPAGE] open_tree.2
[MANPAGE] move_mount.2
[PATCH 1/2] Add manpages for move_mount(2) and open_tree(2)
[PATCH 2/2] Add manpage for fsopen(2), fspick(2) and fsmount(2)
There are two problems here:
(1) There's obviously some duplication in the mails,
so I don't know which ones to detail with.
(2) Some of the mails still provide two man pages in one mail.
This would make review very difficult.
Could you resend please? One patch per page (and no duplicates).
Also, I greatly appreciate you CCing linux-kernel@ and
linux-fsdevel@, so that there might be wider exposure for
review, but could you also CC linux-man@ please.
Thanks,
Michael
On 10/10/19 7:05 PM, David Howells wrote:
> Add manual pages to document the move_mount and open_tree() system calls.
>
> Signed-off-by: David Howells <[email protected]>
> ---
>
> man2/move_mount.2 | 275 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> man2/open_tree.2 | 260 ++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 535 insertions(+)
> create mode 100644 man2/move_mount.2
> create mode 100644 man2/open_tree.2
>
> diff --git a/man2/move_mount.2 b/man2/move_mount.2
> new file mode 100644
> index 000000000..a9720ea6f
> --- /dev/null
> +++ b/man2/move_mount.2
> @@ -0,0 +1,275 @@
> +'\" t
> +.\" Copyright (c) 2019 David Howells <[email protected]>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of this
> +.\" manual under the conditions for verbatim copying, provided that the
> +.\" entire resulting derived work is distributed under the terms of a
> +.\" permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> +.\" responsibility for errors or omissions, or for damages resulting from
> +.\" the use of the information contained herein. The author(s) may not
> +.\" have taken the same level of care in the production of this manual,
> +.\" which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH MOVE_MOUNT 2 2019-10-10 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +move_mount \- Move mount objects around the filesystem topology
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/types.h>
> +.br
> +.B #include <sys/mount.h>
> +.br
> +.B #include <unistd.h>
> +.br
> +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> +.PP
> +.BI "int move_mount(int " from_dirfd ", const char *" from_pathname ","
> +.BI " int " to_dirfd ", const char *" to_pathname ","
> +.BI " unsigned int " flags );
> +.fi
> +.PP
> +.IR Note :
> +There is no glibc wrapper for this system call.
> +.SH DESCRIPTION
> +The
> +.BR move_mount ()
> +call moves a mount from one place to another; it can also be used to attach an
> +unattached mount created by
> +.BR fsmount "() or " open_tree "() with " OPEN_TREE_CLONE .
> +.PP
> +If
> +.BR move_mount ()
> +is called repeatedly with a file descriptor that refers to a mount object,
> +then the object will be attached/moved the first time and then moved again and
> +again and again, detaching it from the previous mountpoint each time.
> +.PP
> +To access the source mount object or the destination mountpoint, no
> +permissions are required on the object itself, but if either pathname is
> +supplied, execute (search) permission is required on all of the directories
> +specified in
> +.IR from_pathname " or " to_pathname .
> +.PP
> +The caller does, however, require the appropriate capabilities or permission
> +to effect a mount.
> +.PP
> +.BR move_mount ()
> +uses
> +.IR from_pathname ", " from_dirfd " and part of " flags
> +to locate the mount object to be moved and
> +.IR to_pathname ", " to_dirfd " and another part of " flags
> +to locate the destination mountpoint. Each lookup can be done in one of a
> +variety of ways:
> +.TP
> +[*] By absolute path.
> +The pathname points to an absolute path and the dirfd is ignored. The file is
> +looked up by name, starting from the root of the filesystem as seen by the
> +calling process.
> +.TP
> +[*] By cwd-relative path.
> +The pathname points to a relative path and the dirfd is
> +.IR AT_FDCWD .
> +The file is looked up by name, starting from the current working directory.
> +.TP
> +[*] By dir-relative path.
> +The pathname points to relative path and the dirfd indicates a file descriptor
> +pointing to a directory. The file is looked up by name, starting from the
> +directory specified by
> +.IR dirfd .
> +.TP
> +[*] By file descriptor.
> +The pathname points to "", the dirfd points directly to the mount object to
> +move or the destination mount point and the appropriate
> +.B *_EMPTY_PATH
> +flag is set.
> +.PP
> +.I flags
> +can be used to influence a path-based lookup. A value for
> +.I flags
> +is constructed by OR'ing together zero or more of the following constants:
> +.TP
> +.BR MOVE_MOUNT_F_EMPTY_PATH
> +.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
> +If
> +.I from_pathname
> +is an empty string, operate on the file referred to by
> +.IR from_dirfd
> +(which may have been obtained using the
> +.BR open (2)
> +.B O_PATH
> +flag or
> +.BR open_tree ())
> +If
> +.I from_dirfd
> +is
> +.BR AT_FDCWD ,
> +the call operates on the current working directory.
> +In this case,
> +.I from_dirfd
> +can refer to any type of file, not just a directory.
> +This flag is Linux-specific; define
> +.B _GNU_SOURCE
> +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> +to obtain its definition.
> +.TP
> +.B MOVE_MOUNT_T_EMPTY_PATH
> +As above, but operating on
> +.IR to_pathname " and " to_dirfd .
> +.TP
> +.B MOVE_MOUNT_F_NO_AUTOMOUNT
> +Don't automount the terminal ("basename") component of
> +.I from_pathname
> +if it is a directory that is an automount point. This allows a mount object
> +that has an automount point at its root to be moved and prevents unintended
> +triggering of an automount point.
> +The
> +.B MOVE_MOUNT_F_NO_AUTOMOUNT
> +flag has no effect if the automount point has already been mounted over. This
> +flag is Linux-specific; define
> +.B _GNU_SOURCE
> +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> +to obtain its definition.
> +.TP
> +.B MOVE_MOUNT_T_NO_AUTOMOUNT
> +As above, but operating on
> +.IR to_pathname " and " to_dirfd .
> +This allows an automount point to be manually mounted over.
> +.TP
> +.B MOVE_MOUNT_F_SYMLINKS
> +If
> +.I from_pathname
> +is a symbolic link, then dereference it. The default for
> +.BR move_mount ()
> +is to not follow symlinks.
> +.TP
> +.B MOVE_MOUNT_T_SYMLINKS
> +As above, but operating on
> +.IR to_pathname " and " to_dirfd .
> +
> +.SH EXAMPLES
> +The
> +.BR move_mount ()
> +function can be used like the following:
> +.PP
> +.RS
> +.nf
> +move_mount(AT_FDCWD, "/a", AT_FDCWD, "/b", 0);
> +.fi
> +.RE
> +.PP
> +This would move the object mounted on "/a" to "/b". It can also be used in
> +conjunction with
> +.BR open_tree "(2) or " open "(2) with " O_PATH :
> +.PP
> +.RS
> +.nf
> +fd = open_tree(AT_FDCWD, "/mnt", 0);
> +move_mount(fd, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
> +move_mount(fd, "", AT_FDCWD, "/mnt3", MOVE_MOUNT_F_EMPTY_PATH);
> +move_mount(fd, "", AT_FDCWD, "/mnt4", MOVE_MOUNT_F_EMPTY_PATH);
> +.fi
> +.RE
> +.PP
> +This would attach the path point for "/mnt" to fd, then it would move the
> +mount to "/mnt2", then move it to "/mnt3" and finally to "/mnt4".
> +.PP
> +It can also be used to attach new mounts:
> +.PP
> +.RS
> +.nf
> +sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> +fsconfig(sfd, FSCONFIG_SET_STRING, "source", "/dev/sda1", 0);
> +fsconfig(sfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
> +fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> +mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NODEV);
> +move_mount(mfd, "", AT_FDCWD, "/home", MOVE_MOUNT_F_EMPTY_PATH);
> +.fi
> +.RE
> +.PP
> +Which would open the Ext4 filesystem mounted on "/dev/sda1", turn on user
> +extended attribute support and create a mount object for it. Finally, the new
> +mount object would be attached with
> +.BR move_mount ()
> +to "/home".
> +
> +
> +.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.SH RETURN VALUE
> +On success, 0 is returned. On error, \-1 is returned, and
> +.I errno
> +is set appropriately.
> +.SH ERRORS
> +.TP
> +.B EACCES
> +Search permission is denied for one of the directories
> +in the path prefix of
> +.IR pathname .
> +(See also
> +.BR path_resolution (7).)
> +.TP
> +.B EBADF
> +.IR from_dirfd " or " to_dirfd
> +is not a valid open file descriptor.
> +.TP
> +.B EFAULT
> +.IR from_pathname " or " to_pathname
> +is NULL or either one point to a location outside the process's accessible
> +address space.
> +.TP
> +.B EINVAL
> +Reserved flag specified in
> +.IR flags .
> +.TP
> +.B ELOOP
> +Too many symbolic links encountered while traversing the pathname.
> +.TP
> +.B ENAMETOOLONG
> +.IR from_pathname " or " to_pathname
> +is too long.
> +.TP
> +.B ENOENT
> +A component of
> +.IR from_pathname " or " to_pathname
> +does not exist, or one is an empty string and the appropriate
> +.B *_EMPTY_PATH
> +was not specified in
> +.IR flags .
> +.TP
> +.B ENOMEM
> +Out of memory (i.e., kernel memory).
> +.TP
> +.B ENOTDIR
> +A component of the path prefix of
> +.IR from_pathname " or " to_pathname
> +is not a directory or one or the other is relative and the appropriate
> +.I *_dirfd
> +is a file descriptor referring to a file other than a directory.
> +.SH VERSIONS
> +.BR move_mount ()
> +was added to Linux in kernel 4.18.
> +.SH CONFORMING TO
> +.BR move_mount ()
> +is Linux-specific.
> +.SH NOTES
> +Glibc does not (yet) provide a wrapper for the
> +.BR move_mount ()
> +system call; call it using
> +.BR syscall (2).
> +.SH SEE ALSO
> +.BR fsmount (2),
> +.BR fsopen (2),
> +.BR open_tree (2)
> diff --git a/man2/open_tree.2 b/man2/open_tree.2
> new file mode 100644
> index 000000000..cbebbfda8
> --- /dev/null
> +++ b/man2/open_tree.2
> @@ -0,0 +1,260 @@
> +'\" t
> +.\" Copyright (c) 2019 David Howells <[email protected]>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of this
> +.\" manual under the conditions for verbatim copying, provided that the
> +.\" entire resulting derived work is distributed under the terms of a
> +.\" permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date. The author(s) assume no
> +.\" responsibility for errors or omissions, or for damages resulting from
> +.\" the use of the information contained herein. The author(s) may not
> +.\" have taken the same level of care in the production of this manual,
> +.\" which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH OPEN_TREE 2 2019-10-10 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +open_tree \- Pick or clone mount object and attach to fd
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/types.h>
> +.br
> +.B #include <sys/mount.h>
> +.br
> +.B #include <unistd.h>
> +.br
> +.BR "#include <fcntl.h> " "/* Definition of AT_* constants */"
> +.PP
> +.BI "int open_tree(int " dirfd ", const char *" pathname ", unsigned int " flags );
> +.fi
> +.PP
> +.IR Note :
> +There are no glibc wrappers for these system calls.
> +.SH DESCRIPTION
> +.BR open_tree ()
> +picks the mount object specified by the pathname and attaches it to a new file
> +descriptor or clones it and attaches the clone to the file descriptor. The
> +resultant file descriptor is indistinguishable from one produced by
> +.BR open "(2) with " O_PATH .
> +.PP
> +In the case that the mount object is cloned, the clone will be "unmounted" and
> +destroyed when the file descriptor is closed if it is not otherwise mounted
> +somewhere by calling
> +.BR move_mount (2).
> +.PP
> +To select a mount object, no permissions are required on the object referred
> +to by the path, but execute (search) permission is required on all of the
> +directories in
> +.I pathname
> +that lead to the object.
> +.PP
> +To clone an object, however, the caller must have mount capabilities and
> +permissions.
> +.PP
> +.BR open_tree ()
> +uses
> +.IR pathname ", " dirfd " and " flags
> +to locate the target object in one of a variety of ways:
> +.TP
> +[*] By absolute path.
> +.I pathname
> +points to an absolute path and
> +.I dirfd
> +is ignored. The object is looked up by name, starting from the root of the
> +filesystem as seen by the calling process.
> +.TP
> +[*] By cwd-relative path.
> +.I pathname
> +points to a relative path and
> +.IR dirfd " is " AT_FDCWD .
> +The object is looked up by name, starting from the current working directory.
> +.TP
> +[*] By dir-relative path.
> +.I pathname
> +points to relative path and
> +.I dirfd
> +indicates a file descriptor pointing to a directory. The object is looked up
> +by name, starting from the directory specified by
> +.IR dirfd .
> +.TP
> +[*] By file descriptor.
> +.I pathname
> +is "",
> +.I dirfd
> +indicates a file descriptor and
> +.B AT_EMPTY_PATH
> +is set in
> +.IR flags .
> +The mount attached to the file descriptor is queried directly. The file
> +descriptor may point to any type of file, not just a directory.
> +
> +.\"______________________________________________________________
> +.PP
> +.I flags
> +can be used to control the operation of the function and to influence a
> +path-based lookup. A value for
> +.I flags
> +is constructed by OR'ing together zero or more of the following constants:
> +.TP
> +.BR AT_EMPTY_PATH
> +.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
> +If
> +.I pathname
> +is an empty string, operate on the file referred to by
> +.IR dirfd
> +(which may have been obtained from
> +.BR open "(2) with"
> +.BR O_PATH ", from " fsmount (2)
> +or from another
> +.BR open_tree ()).
> +If
> +.I dirfd
> +is
> +.BR AT_FDCWD ,
> +the call operates on the current working directory.
> +In this case,
> +.I dirfd
> +can refer to any type of file, not just a directory.
> +This flag is Linux-specific; define
> +.B _GNU_SOURCE
> +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> +to obtain its definition.
> +.TP
> +.BR AT_NO_AUTOMOUNT
> +Don't automount the terminal ("basename") component of
> +.I pathname
> +if it is a directory that is an automount point. This flag allows the
> +automount point itself to be picked up or a mount cloned that is rooted on the
> +automount point. The
> +.B AT_NO_AUTOMOUNT
> +flag has no effect if the mount point has already been mounted over.
> +This flag is Linux-specific; define
> +.B _GNU_SOURCE
> +.\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed
> +to obtain its definition.
> +.TP
> +.B AT_SYMLINK_NOFOLLOW
> +If
> +.I pathname
> +is a symbolic link, do not dereference it: instead pick up or clone a mount
> +rooted on the link itself.
> +.TP
> +.B OPEN_TREE_CLOEXEC
> +Set the close-on-exec flag for the new file descriptor. This will cause the
> +file descriptor to be closed automatically when a process exec's.
> +.TP
> +.B OPEN_TREE_CLONE
> +Rather than directly attaching the selected object to the file descriptor,
> +clone the object, set the root of the new mount object to that point and
> +attach the clone to the file descriptor.
> +.TP
> +.B AT_RECURSIVE
> +This is only permitted in conjunction with OPEN_TREE_CLONE. It causes the
> +entire mount subtree rooted at the selected spot to be cloned rather than just
> +that one mount object.
> +
> +
> +.SH EXAMPLE
> +The
> +.BR open_tree ()
> +function can be used like the following:
> +.PP
> +.RS
> +.nf
> +fd1 = open_tree(AT_FDCWD, "/mnt", 0);
> +fd2 = open_tree(fd1, "",
> + AT_EMPTY_PATH | OPEN_TREE_CLONE | AT_RECURSIVE);
> +move_mount(fd2, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
> +.fi
> +.RE
> +.PP
> +This would attach the path point for "/mnt" to fd1, then it would copy the
> +entire subtree at the point referred to by fd1 and attach that to fd2; lastly,
> +it would attach the clone to "/mnt2".
> +
> +
> +.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
> +.SH RETURN VALUE
> +On success, the new file descriptor is returned. On error, \-1 is returned,
> +and
> +.I errno
> +is set appropriately.
> +.SH ERRORS
> +.TP
> +.B EACCES
> +Search permission is denied for one of the directories
> +in the path prefix of
> +.IR pathname .
> +(See also
> +.BR path_resolution (7).)
> +.TP
> +.B EBADF
> +.I dirfd
> +is not a valid open file descriptor.
> +.TP
> +.B EFAULT
> +.I pathname
> +is NULL or
> +.IR pathname
> +point to a location outside the process's accessible address space.
> +.TP
> +.B EINVAL
> +Reserved flag specified in
> +.IR flags .
> +.TP
> +.B ELOOP
> +Too many symbolic links encountered while traversing the pathname.
> +.TP
> +.B ENAMETOOLONG
> +.I pathname
> +is too long.
> +.TP
> +.B ENOENT
> +A component of
> +.I pathname
> +does not exist, or
> +.I pathname
> +is an empty string and
> +.B AT_EMPTY_PATH
> +was not specified in
> +.IR flags .
> +.TP
> +.B ENOMEM
> +Out of memory (i.e., kernel memory).
> +.TP
> +.B ENOTDIR
> +A component of the path prefix of
> +.I pathname
> +is not a directory or
> +.I pathname
> +is relative and
> +.I dirfd
> +is a file descriptor referring to a file other than a directory.
> +.SH VERSIONS
> +.BR open_tree ()
> +was added to Linux in kernel 4.18.
> +.SH CONFORMING TO
> +.BR open_tree ()
> +is Linux-specific.
> +.SH NOTES
> +Glibc does not (yet) provide a wrapper for the
> +.BR open_tree ()
> +system call; call it using
> +.BR syscall (2).
> +.SH SEE ALSO
> +.BR fsmount (2),
> +.BR move_mount (2),
> +.BR open (2)
>
>
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
Michael Kerrisk (man-pages) <[email protected]> wrote:
> [MANPAGE] fsopen.2, fsmount.2
> [MANPAGE] fspick.2
> [MANPAGE] fsconfig.2
> [MANPAGE] open_tree.2
> [MANPAGE] move_mount.2
Those were really aimed at non-doccy people to have a look over without
needing to apply the patch or strip the diff metadata.
Roff is hard enough to read as it is ;-)
> [PATCH 1/2] Add manpages for move_mount(2) and open_tree(2)
> [PATCH 2/2] Add manpage for fsopen(2), fspick(2) and fsmount(2)
Those were aimed specifically at you.
If it's okay with you, I'll wait a bit before splitting and resending the
patches, just in case anyone points out any problems.
David