2019-05-16 11:54:28

by David Howells

[permalink] [raw]
Subject: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]


Hi Linus, Al,

Here are some patches that make changes to the mount API UAPI and two of
them really need applying, before -rc1 - if they're going to be applied at
all.

The following changes are made:

(1) Make the file descriptors returned by open_tree(), fsopen(), fspick()
and fsmount() O_CLOEXEC by default and remove the flags that allow
this to be specified from the UAPI, shuffling other flags down as
appropriate. fcntl() can still be used to change the flag.

(2) Make the name of the anon inode object "[fscontext]" with square
brackets to match other users.

(3) Fix the numbering of the mount API syscalls to be in the common space
rather than in the arch-specific space.

(4) Wire up the mount API syscalls on all arches (it's only on x86
currently).

Thanks,
David
---
Christian Brauner (2):
uapi, fs: make all new mount api fds cloexec by default
uapi, fsopen: use square brackets around "fscontext"

David Howells (2):
uapi, x86: Fix the syscall numbering of the mount API syscalls
uapi: Wire up the mount API syscalls on non-x86 arches


arch/alpha/kernel/syscalls/syscall.tbl | 6 ++++++
arch/arm/tools/syscall.tbl | 6 ++++++
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 12 ++++++++++++
arch/ia64/kernel/syscalls/syscall.tbl | 6 ++++++
arch/m68k/kernel/syscalls/syscall.tbl | 6 ++++++
arch/microblaze/kernel/syscalls/syscall.tbl | 6 ++++++
arch/mips/kernel/syscalls/syscall_n32.tbl | 6 ++++++
arch/mips/kernel/syscalls/syscall_n64.tbl | 6 ++++++
arch/mips/kernel/syscalls/syscall_o32.tbl | 6 ++++++
arch/parisc/kernel/syscalls/syscall.tbl | 6 ++++++
arch/powerpc/kernel/syscalls/syscall.tbl | 6 ++++++
arch/s390/kernel/syscalls/syscall.tbl | 6 ++++++
arch/sh/kernel/syscalls/syscall.tbl | 6 ++++++
arch/sparc/kernel/syscalls/syscall.tbl | 6 ++++++
arch/x86/entry/syscalls/syscall_32.tbl | 12 ++++++------
arch/x86/entry/syscalls/syscall_64.tbl | 12 ++++++------
arch/xtensa/kernel/syscalls/syscall.tbl | 6 ++++++
fs/fsopen.c | 15 +++++++--------
fs/namespace.c | 11 ++++-------
include/uapi/asm-generic/unistd.h | 14 +++++++++++++-
include/uapi/linux/mount.h | 18 +++---------------
22 files changed, 136 insertions(+), 44 deletions(-)


2019-05-16 11:54:35

by David Howells

[permalink] [raw]
Subject: [PATCH 1/4] uapi, fs: make all new mount api fds cloexec by default [ver #2]

From: Christian Brauner <[email protected]>

This makes all file descriptors returned from new syscalls of the new mount
api cloexec by default.

>From a userspace perspective it is rarely the case that fds are supposed to
be inherited across exec. Having them not cloexec by default forces
userspace to remember to pass the <SPECIFIC>_CLOEXEC flag along or to
invoke fcntl() on the fd to prevent leaking it. And leaking the fd is a
much bigger issue than forgetting to remove the cloexec flag and failing to
inherit the fd.
For old fd types we can't break userspace. But for new ones we should
whenever reasonable make them cloexec by default (Examples of this policy
are the new seccomp notify fds and also pidfds.). If userspace wants to
inherit fds across exec they can remove the O_CLOEXEC flag and so opt in to
inheritance explicitly.

This patch also has the advantage that we can get rid of all the special
flags per file descriptor type for the new mount api. In total this lets us
remove 4 flags:
- FSMOUNT_CLOEXEC
- FSOPEN_CLOEXEC
- FSPICK_CLOEXEC
- OPEN_TREE_CLOEXEC

Signed-off-by: Christian Brauner <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

fs/fsopen.c | 13 ++++++-------
fs/namespace.c | 11 ++++-------
include/uapi/linux/mount.h | 18 +++---------------
3 files changed, 13 insertions(+), 29 deletions(-)

diff --git a/fs/fsopen.c b/fs/fsopen.c
index 3bb9c0c8cbcc..a38fa8c616cf 100644
--- a/fs/fsopen.c
+++ b/fs/fsopen.c
@@ -88,12 +88,12 @@ const struct file_operations fscontext_fops = {
/*
* Attach a filesystem context to a file and an fd.
*/
-static int fscontext_create_fd(struct fs_context *fc, unsigned int o_flags)
+static int fscontext_create_fd(struct fs_context *fc)
{
int fd;

fd = anon_inode_getfd("fscontext", &fscontext_fops, fc,
- O_RDWR | o_flags);
+ O_RDWR | O_CLOEXEC);
if (fd < 0)
put_fs_context(fc);
return fd;
@@ -126,7 +126,7 @@ SYSCALL_DEFINE2(fsopen, const char __user *, _fs_name, unsigned int, flags)
if (!ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN))
return -EPERM;

- if (flags & ~FSOPEN_CLOEXEC)
+ if (flags)
return -EINVAL;

fs_name = strndup_user(_fs_name, PAGE_SIZE);
@@ -149,7 +149,7 @@ SYSCALL_DEFINE2(fsopen, const char __user *, _fs_name, unsigned int, flags)
if (ret < 0)
goto err_fc;

- return fscontext_create_fd(fc, flags & FSOPEN_CLOEXEC ? O_CLOEXEC : 0);
+ return fscontext_create_fd(fc);

err_fc:
put_fs_context(fc);
@@ -169,8 +169,7 @@ SYSCALL_DEFINE3(fspick, int, dfd, const char __user *, path, unsigned int, flags
if (!ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN))
return -EPERM;

- if ((flags & ~(FSPICK_CLOEXEC |
- FSPICK_SYMLINK_NOFOLLOW |
+ if ((flags & ~(FSPICK_SYMLINK_NOFOLLOW |
FSPICK_NO_AUTOMOUNT |
FSPICK_EMPTY_PATH)) != 0)
return -EINVAL;
@@ -203,7 +202,7 @@ SYSCALL_DEFINE3(fspick, int, dfd, const char __user *, path, unsigned int, flags
goto err_fc;

path_put(&target);
- return fscontext_create_fd(fc, flags & FSPICK_CLOEXEC ? O_CLOEXEC : 0);
+ return fscontext_create_fd(fc);

err_fc:
put_fs_context(fc);
diff --git a/fs/namespace.c b/fs/namespace.c
index ffb13f0562b0..3d14e83787b1 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2369,11 +2369,8 @@ SYSCALL_DEFINE3(open_tree, int, dfd, const char *, filename, unsigned, flags)
int error;
int fd;

- BUILD_BUG_ON(OPEN_TREE_CLOEXEC != O_CLOEXEC);
-
if (flags & ~(AT_EMPTY_PATH | AT_NO_AUTOMOUNT | AT_RECURSIVE |
- AT_SYMLINK_NOFOLLOW | OPEN_TREE_CLONE |
- OPEN_TREE_CLOEXEC))
+ AT_SYMLINK_NOFOLLOW | OPEN_TREE_CLONE))
return -EINVAL;

if ((flags & (AT_RECURSIVE | OPEN_TREE_CLONE)) == AT_RECURSIVE)
@@ -2389,7 +2386,7 @@ SYSCALL_DEFINE3(open_tree, int, dfd, const char *, filename, unsigned, flags)
if (detached && !may_mount())
return -EPERM;

- fd = get_unused_fd_flags(flags & O_CLOEXEC);
+ fd = get_unused_fd_flags(O_CLOEXEC);
if (fd < 0)
return fd;

@@ -3352,7 +3349,7 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
if (!may_mount())
return -EPERM;

- if ((flags & ~(FSMOUNT_CLOEXEC)) != 0)
+ if (flags)
return -EINVAL;

if (attr_flags & ~(MOUNT_ATTR_RDONLY |
@@ -3457,7 +3454,7 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags,
}
file->f_mode |= FMODE_NEED_UNMOUNT;

- ret = get_unused_fd_flags((flags & FSMOUNT_CLOEXEC) ? O_CLOEXEC : 0);
+ ret = get_unused_fd_flags(O_CLOEXEC);
if (ret >= 0)
fd_install(ret, file);
else
diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
index 96a0240f23fe..c688e4ac843b 100644
--- a/include/uapi/linux/mount.h
+++ b/include/uapi/linux/mount.h
@@ -59,7 +59,6 @@
* open_tree() flags.
*/
#define OPEN_TREE_CLONE 1 /* Clone the target tree and attach the clone */
-#define OPEN_TREE_CLOEXEC O_CLOEXEC /* Close the file on execve() */

/*
* move_mount() flags.
@@ -72,18 +71,12 @@
#define MOVE_MOUNT_T_EMPTY_PATH 0x00000040 /* Empty to path permitted */
#define MOVE_MOUNT__MASK 0x00000077

-/*
- * fsopen() flags.
- */
-#define FSOPEN_CLOEXEC 0x00000001
-
/*
* fspick() flags.
*/
-#define FSPICK_CLOEXEC 0x00000001
-#define FSPICK_SYMLINK_NOFOLLOW 0x00000002
-#define FSPICK_NO_AUTOMOUNT 0x00000004
-#define FSPICK_EMPTY_PATH 0x00000008
+#define FSPICK_SYMLINK_NOFOLLOW 0x00000001
+#define FSPICK_NO_AUTOMOUNT 0x00000002
+#define FSPICK_EMPTY_PATH 0x00000004

/*
* The type of fsconfig() call made.
@@ -99,11 +92,6 @@ enum fsconfig_command {
FSCONFIG_CMD_RECONFIGURE = 7, /* Invoke superblock reconfiguration */
};

-/*
- * fsmount() flags.
- */
-#define FSMOUNT_CLOEXEC 0x00000001
-
/*
* Mount attributes.
*/

2019-05-16 11:55:01

by David Howells

[permalink] [raw]
Subject: [PATCH 4/4] uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]

Wire up the mount API syscalls on non-x86 arches.

Reported-by: Arnd Bergmann <[email protected]>
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
---

arch/alpha/kernel/syscalls/syscall.tbl | 6 ++++++
arch/arm/tools/syscall.tbl | 6 ++++++
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 12 ++++++++++++
arch/ia64/kernel/syscalls/syscall.tbl | 6 ++++++
arch/m68k/kernel/syscalls/syscall.tbl | 6 ++++++
arch/microblaze/kernel/syscalls/syscall.tbl | 6 ++++++
arch/mips/kernel/syscalls/syscall_n32.tbl | 6 ++++++
arch/mips/kernel/syscalls/syscall_n64.tbl | 6 ++++++
arch/mips/kernel/syscalls/syscall_o32.tbl | 6 ++++++
arch/parisc/kernel/syscalls/syscall.tbl | 6 ++++++
arch/powerpc/kernel/syscalls/syscall.tbl | 6 ++++++
arch/s390/kernel/syscalls/syscall.tbl | 6 ++++++
arch/sh/kernel/syscalls/syscall.tbl | 6 ++++++
arch/sparc/kernel/syscalls/syscall.tbl | 6 ++++++
arch/xtensa/kernel/syscalls/syscall.tbl | 6 ++++++
include/uapi/asm-generic/unistd.h | 14 +++++++++++++-
17 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 165f268beafc..9e7704e44f6d 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -467,3 +467,9 @@
535 common io_uring_setup sys_io_uring_setup
536 common io_uring_enter sys_io_uring_enter
537 common io_uring_register sys_io_uring_register
+538 common open_tree sys_open_tree
+539 common move_mount sys_move_mount
+540 common fsopen sys_fsopen
+541 common fsconfig sys_fsconfig
+542 common fsmount sys_fsmount
+543 common fspick sys_fspick
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 0393917eaa57..aaf479a9e92d 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -441,3 +441,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index f2a83ff6b73c..70e6882853c0 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -44,7 +44,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)

-#define __NR_compat_syscalls 428
+#define __NR_compat_syscalls 434
#endif

#define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 23f1a44acada..c39e90600bb3 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -874,6 +874,18 @@ __SYSCALL(__NR_io_uring_setup, sys_io_uring_setup)
__SYSCALL(__NR_io_uring_enter, sys_io_uring_enter)
#define __NR_io_uring_register 427
__SYSCALL(__NR_io_uring_register, sys_io_uring_register)
+#define __NR_open_tree 428
+__SYSCALL(__NR_open_tree, sys_open_tree)
+#define __NR_move_mount 429
+__SYSCALL(__NR_move_mount, sys_move_mount)
+#define __NR_fsopen 430
+__SYSCALL(__NR_fsopen, sys_fsopen)
+#define __NR_fsconfig 431
+__SYSCALL(__NR_fsconfig, sys_fsconfig)
+#define __NR_fsmount 432
+__SYSCALL(__NR_fsmount, sys_fsmount)
+#define __NR_fspick 433
+__SYSCALL(__NR_fspick, sys_fspick)

/*
* Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 56e3d0b685e1..e01df3f2f80d 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -348,3 +348,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index df4ec3ec71d1..7e3d0734b2f3 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -427,3 +427,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 4964947732af..26339e417695 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -433,3 +433,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 9392dfe33f97..0e2dd68ade57 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -366,3 +366,9 @@
425 n32 io_uring_setup sys_io_uring_setup
426 n32 io_uring_enter sys_io_uring_enter
427 n32 io_uring_register sys_io_uring_register
+428 n32 open_tree sys_open_tree
+429 n32 move_mount sys_move_mount
+430 n32 fsopen sys_fsopen
+431 n32 fsconfig sys_fsconfig
+432 n32 fsmount sys_fsmount
+433 n32 fspick sys_fspick
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index cd0c8aa21fba..5eebfa0d155c 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -342,3 +342,9 @@
425 n64 io_uring_setup sys_io_uring_setup
426 n64 io_uring_enter sys_io_uring_enter
427 n64 io_uring_register sys_io_uring_register
+428 n64 open_tree sys_open_tree
+429 n64 move_mount sys_move_mount
+430 n64 fsopen sys_fsopen
+431 n64 fsconfig sys_fsconfig
+432 n64 fsmount sys_fsmount
+433 n64 fspick sys_fspick
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index e849e8ffe4a2..3cc1374e02d0 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -415,3 +415,9 @@
425 o32 io_uring_setup sys_io_uring_setup
426 o32 io_uring_enter sys_io_uring_enter
427 o32 io_uring_register sys_io_uring_register
+428 o32 open_tree sys_open_tree
+429 o32 move_mount sys_move_mount
+430 o32 fsopen sys_fsopen
+431 o32 fsconfig sys_fsconfig
+432 o32 fsmount sys_fsmount
+433 o32 fspick sys_fspick
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index fe8ca623add8..c9e377d59232 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -424,3 +424,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 00f5a63c8d9a..103655d84b4b 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -509,3 +509,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 061418f787c3..e822b2964a83 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -430,3 +430,9 @@
425 common io_uring_setup sys_io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree sys_open_tree
+429 common move_mount sys_move_mount sys_move_mount
+430 common fsopen sys_fsopen sys_fsopen
+431 common fsconfig sys_fsconfig sys_fsconfig
+432 common fsmount sys_fsmount sys_fsmount
+433 common fspick sys_fspick sys_fspick
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 480b057556ee..016a727d4357 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -430,3 +430,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index a1dd24307b00..e047480b1605 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -473,3 +473,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 30084eaf8422..5fa0ee1c8e00 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -398,3 +398,9 @@
425 common io_uring_setup sys_io_uring_setup
426 common io_uring_enter sys_io_uring_enter
427 common io_uring_register sys_io_uring_register
+428 common open_tree sys_open_tree
+429 common move_mount sys_move_mount
+430 common fsopen sys_fsopen
+431 common fsconfig sys_fsconfig
+432 common fsmount sys_fsmount
+433 common fspick sys_fspick
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index dee7292e1df6..a87904daf103 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -832,9 +832,21 @@ __SYSCALL(__NR_io_uring_setup, sys_io_uring_setup)
__SYSCALL(__NR_io_uring_enter, sys_io_uring_enter)
#define __NR_io_uring_register 427
__SYSCALL(__NR_io_uring_register, sys_io_uring_register)
+#define __NR_open_tree 428
+__SYSCALL(__NR_open_tree, sys_open_tree)
+#define __NR_move_mount 429
+__SYSCALL(__NR_move_mount, sys_move_mount)
+#define __NR_fsopen 430
+__SYSCALL(__NR_fsopen, sys_fsopen)
+#define __NR_fsconfig 431
+__SYSCALL(__NR_fsconfig, sys_fsconfig)
+#define __NR_fsmount 432
+__SYSCALL(__NR_fsmount, sys_fsmount)
+#define __NR_fspick 433
+__SYSCALL(__NR_fspick, sys_fspick)

#undef __NR_syscalls
-#define __NR_syscalls 428
+#define __NR_syscalls 434

/*
* 32 bit systems traditionally used different

2019-05-16 13:04:05

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH 4/4] uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]

On Thu, May 16, 2019 at 12:52:34PM +0100, David Howells wrote:
> Wire up the mount API syscalls on non-x86 arches.
>
> Reported-by: Arnd Bergmann <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> Reviewed-by: Arnd Bergmann <[email protected]>

Reviewed-by: Christian Brauner <[email protected]>

> ---
>
> arch/alpha/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/arm/tools/syscall.tbl | 6 ++++++
> arch/arm64/include/asm/unistd.h | 2 +-
> arch/arm64/include/asm/unistd32.h | 12 ++++++++++++
> arch/ia64/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/m68k/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/microblaze/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/mips/kernel/syscalls/syscall_n32.tbl | 6 ++++++
> arch/mips/kernel/syscalls/syscall_n64.tbl | 6 ++++++
> arch/mips/kernel/syscalls/syscall_o32.tbl | 6 ++++++
> arch/parisc/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/powerpc/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/s390/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/sh/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/sparc/kernel/syscalls/syscall.tbl | 6 ++++++
> arch/xtensa/kernel/syscalls/syscall.tbl | 6 ++++++
> include/uapi/asm-generic/unistd.h | 14 +++++++++++++-
> 17 files changed, 110 insertions(+), 2 deletions(-)
>
> diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
> index 165f268beafc..9e7704e44f6d 100644
> --- a/arch/alpha/kernel/syscalls/syscall.tbl
> +++ b/arch/alpha/kernel/syscalls/syscall.tbl
> @@ -467,3 +467,9 @@
> 535 common io_uring_setup sys_io_uring_setup
> 536 common io_uring_enter sys_io_uring_enter
> 537 common io_uring_register sys_io_uring_register
> +538 common open_tree sys_open_tree
> +539 common move_mount sys_move_mount
> +540 common fsopen sys_fsopen
> +541 common fsconfig sys_fsconfig
> +542 common fsmount sys_fsmount
> +543 common fspick sys_fspick
> diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
> index 0393917eaa57..aaf479a9e92d 100644
> --- a/arch/arm/tools/syscall.tbl
> +++ b/arch/arm/tools/syscall.tbl
> @@ -441,3 +441,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
> index f2a83ff6b73c..70e6882853c0 100644
> --- a/arch/arm64/include/asm/unistd.h
> +++ b/arch/arm64/include/asm/unistd.h
> @@ -44,7 +44,7 @@
> #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
> #define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)
>
> -#define __NR_compat_syscalls 428
> +#define __NR_compat_syscalls 434
> #endif
>
> #define __ARCH_WANT_SYS_CLONE
> diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
> index 23f1a44acada..c39e90600bb3 100644
> --- a/arch/arm64/include/asm/unistd32.h
> +++ b/arch/arm64/include/asm/unistd32.h
> @@ -874,6 +874,18 @@ __SYSCALL(__NR_io_uring_setup, sys_io_uring_setup)
> __SYSCALL(__NR_io_uring_enter, sys_io_uring_enter)
> #define __NR_io_uring_register 427
> __SYSCALL(__NR_io_uring_register, sys_io_uring_register)
> +#define __NR_open_tree 428
> +__SYSCALL(__NR_open_tree, sys_open_tree)
> +#define __NR_move_mount 429
> +__SYSCALL(__NR_move_mount, sys_move_mount)
> +#define __NR_fsopen 430
> +__SYSCALL(__NR_fsopen, sys_fsopen)
> +#define __NR_fsconfig 431
> +__SYSCALL(__NR_fsconfig, sys_fsconfig)
> +#define __NR_fsmount 432
> +__SYSCALL(__NR_fsmount, sys_fsmount)
> +#define __NR_fspick 433
> +__SYSCALL(__NR_fspick, sys_fspick)
>
> /*
> * Please add new compat syscalls above this comment and update
> diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
> index 56e3d0b685e1..e01df3f2f80d 100644
> --- a/arch/ia64/kernel/syscalls/syscall.tbl
> +++ b/arch/ia64/kernel/syscalls/syscall.tbl
> @@ -348,3 +348,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
> index df4ec3ec71d1..7e3d0734b2f3 100644
> --- a/arch/m68k/kernel/syscalls/syscall.tbl
> +++ b/arch/m68k/kernel/syscalls/syscall.tbl
> @@ -427,3 +427,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
> index 4964947732af..26339e417695 100644
> --- a/arch/microblaze/kernel/syscalls/syscall.tbl
> +++ b/arch/microblaze/kernel/syscalls/syscall.tbl
> @@ -433,3 +433,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
> index 9392dfe33f97..0e2dd68ade57 100644
> --- a/arch/mips/kernel/syscalls/syscall_n32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
> @@ -366,3 +366,9 @@
> 425 n32 io_uring_setup sys_io_uring_setup
> 426 n32 io_uring_enter sys_io_uring_enter
> 427 n32 io_uring_register sys_io_uring_register
> +428 n32 open_tree sys_open_tree
> +429 n32 move_mount sys_move_mount
> +430 n32 fsopen sys_fsopen
> +431 n32 fsconfig sys_fsconfig
> +432 n32 fsmount sys_fsmount
> +433 n32 fspick sys_fspick
> diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
> index cd0c8aa21fba..5eebfa0d155c 100644
> --- a/arch/mips/kernel/syscalls/syscall_n64.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
> @@ -342,3 +342,9 @@
> 425 n64 io_uring_setup sys_io_uring_setup
> 426 n64 io_uring_enter sys_io_uring_enter
> 427 n64 io_uring_register sys_io_uring_register
> +428 n64 open_tree sys_open_tree
> +429 n64 move_mount sys_move_mount
> +430 n64 fsopen sys_fsopen
> +431 n64 fsconfig sys_fsconfig
> +432 n64 fsmount sys_fsmount
> +433 n64 fspick sys_fspick
> diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
> index e849e8ffe4a2..3cc1374e02d0 100644
> --- a/arch/mips/kernel/syscalls/syscall_o32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
> @@ -415,3 +415,9 @@
> 425 o32 io_uring_setup sys_io_uring_setup
> 426 o32 io_uring_enter sys_io_uring_enter
> 427 o32 io_uring_register sys_io_uring_register
> +428 o32 open_tree sys_open_tree
> +429 o32 move_mount sys_move_mount
> +430 o32 fsopen sys_fsopen
> +431 o32 fsconfig sys_fsconfig
> +432 o32 fsmount sys_fsmount
> +433 o32 fspick sys_fspick
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
> index fe8ca623add8..c9e377d59232 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -424,3 +424,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index 00f5a63c8d9a..103655d84b4b 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -509,3 +509,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index 061418f787c3..e822b2964a83 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -430,3 +430,9 @@
> 425 common io_uring_setup sys_io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree sys_open_tree
> +429 common move_mount sys_move_mount sys_move_mount
> +430 common fsopen sys_fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount sys_fsmount
> +433 common fspick sys_fspick sys_fspick
> diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
> index 480b057556ee..016a727d4357 100644
> --- a/arch/sh/kernel/syscalls/syscall.tbl
> +++ b/arch/sh/kernel/syscalls/syscall.tbl
> @@ -430,3 +430,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
> index a1dd24307b00..e047480b1605 100644
> --- a/arch/sparc/kernel/syscalls/syscall.tbl
> +++ b/arch/sparc/kernel/syscalls/syscall.tbl
> @@ -473,3 +473,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
> index 30084eaf8422..5fa0ee1c8e00 100644
> --- a/arch/xtensa/kernel/syscalls/syscall.tbl
> +++ b/arch/xtensa/kernel/syscalls/syscall.tbl
> @@ -398,3 +398,9 @@
> 425 common io_uring_setup sys_io_uring_setup
> 426 common io_uring_enter sys_io_uring_enter
> 427 common io_uring_register sys_io_uring_register
> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index dee7292e1df6..a87904daf103 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -832,9 +832,21 @@ __SYSCALL(__NR_io_uring_setup, sys_io_uring_setup)
> __SYSCALL(__NR_io_uring_enter, sys_io_uring_enter)
> #define __NR_io_uring_register 427
> __SYSCALL(__NR_io_uring_register, sys_io_uring_register)
> +#define __NR_open_tree 428
> +__SYSCALL(__NR_open_tree, sys_open_tree)
> +#define __NR_move_mount 429
> +__SYSCALL(__NR_move_mount, sys_move_mount)
> +#define __NR_fsopen 430
> +__SYSCALL(__NR_fsopen, sys_fsopen)
> +#define __NR_fsconfig 431
> +__SYSCALL(__NR_fsconfig, sys_fsconfig)
> +#define __NR_fsmount 432
> +__SYSCALL(__NR_fsmount, sys_fsmount)
> +#define __NR_fspick 433
> +__SYSCALL(__NR_fspick, sys_fspick)
>
> #undef __NR_syscalls
> -#define __NR_syscalls 428
> +#define __NR_syscalls 434
>
> /*
> * 32 bit systems traditionally used different
>

2019-05-16 14:58:31

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH 4/4] uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]

Hi David, Christian,

On Thu, May 16, 2019 at 1:54 PM David Howells <[email protected]> wrote:
> Wire up the mount API syscalls on non-x86 arches.
>
> Reported-by: Arnd Bergmann <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> Reviewed-by: Arnd Bergmann <[email protected]>

> +428 common open_tree sys_open_tree
> +429 common move_mount sys_move_mount
> +430 common fsopen sys_fsopen
> +431 common fsconfig sys_fsconfig
> +432 common fsmount sys_fsmount
> +433 common fspick sys_fspick

The first number conflicts with "[PATCH v1 1/2] pid: add pidfd_open()".

Note that none of this is part of linux-next.

Gr{oetje,eeting}s,

Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-05-16 15:02:39

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH 4/4] uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]

On Thu, May 16, 2019 at 04:56:10PM +0200, Geert Uytterhoeven wrote:
> Hi David, Christian,
>
> On Thu, May 16, 2019 at 1:54 PM David Howells <[email protected]> wrote:
> > Wire up the mount API syscalls on non-x86 arches.
> >
> > Reported-by: Arnd Bergmann <[email protected]>
> > Signed-off-by: David Howells <[email protected]>
> > Reviewed-by: Arnd Bergmann <[email protected]>
>
> > +428 common open_tree sys_open_tree
> > +429 common move_mount sys_move_mount
> > +430 common fsopen sys_fsopen
> > +431 common fsconfig sys_fsconfig
> > +432 common fsmount sys_fsmount
> > +433 common fspick sys_fspick
>
> The first number conflicts with "[PATCH v1 1/2] pid: add pidfd_open()".
>
> Note that none of this is part of linux-next.

Yep, already spotted this thanks to Arnd.
David, there's nothing you need to do of course. I'll change the syscall
number for pidfd_open(). Your patchset obviously has priority!

Thanks!
Christian

>
> Gr{oetje,eeting}s,
>
> Geert
>
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds

2019-05-16 16:25:17

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
>
> Hi Linus, Al,
>
> Here are some patches that make changes to the mount API UAPI and two of
> them really need applying, before -rc1 - if they're going to be applied at
> all.

I'm fine with 2--4, but I'm not convinced that cloexec-by-default crusade
makes any sense. Could somebody give coherent arguments in favour of
abandoning the existing conventions?

2019-05-16 16:33:22

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

On Thu, May 16, 2019 at 05:22:59PM +0100, Al Viro wrote:
> On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
> >
> > Hi Linus, Al,
> >
> > Here are some patches that make changes to the mount API UAPI and two of
> > them really need applying, before -rc1 - if they're going to be applied at
> > all.
>
> I'm fine with 2--4, but I'm not convinced that cloexec-by-default crusade
> makes any sense. Could somebody give coherent arguments in favour of
> abandoning the existing conventions?

To elaborate: existing syscalls (open, socket, pipe, accept, epoll_create,
etc., etc.) are not cloexec-by-default and that's not going to change,
simply because it would be break the living hell out of existing userland
code.

IOW, the userland has to worry about leaking stuff over sensitive execve(),
no matter what. All this change does is complicate things for userland
programmer - which syscall belongs to which class.

Where's the benefit? I could buy an argument about gradually changing
over to APIs that are cloexec-by-default across the board, except for
the obvious fact that it's not going to happen; not with the things
like open() involved.

2019-05-16 16:35:28

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

On Thu, May 16, 2019 at 05:22:59PM +0100, Al Viro wrote:
> On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
> >
> > Hi Linus, Al,
> >
> > Here are some patches that make changes to the mount API UAPI and two of
> > them really need applying, before -rc1 - if they're going to be applied at
> > all.
>
> I'm fine with 2--4, but I'm not convinced that cloexec-by-default crusade
> makes any sense. Could somebody give coherent arguments in favour of
> abandoning the existing conventions?

So as I said in the commit message. From a userspace perspective it's
more of an issue if one accidently leaks an fd to a task during exec.

Also, most of the time one does not want to inherit an fd during an
exec. It is a hazzle to always have to specify an extra flag.

As Al pointed out to me open() semantics are not going anywhere. Sure,
no argument there at all.
But the idea of making fds cloexec by default is only targeted at fds
that come from separate syscalls. fsopen(), open_tree_clone(), etc. they
all return fds independent of open() so it's really easy to have them
cloexec by default without regressing anyone and we also remove the need
for a bunch of separate flags for each syscall to turn them into
cloexec-fds. I mean, those for syscalls came with 4 separate flags to be
able to specify that the returned fd should be made cloexec. The other
way around, cloexec by default, fcntl() to remove the cloexec bit is way
saner imho.

Christian

2019-05-16 19:34:42

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

[linux-abi cc'd]

On Thu, May 16, 2019 at 06:31:52PM +0200, Christian Brauner wrote:
> On Thu, May 16, 2019 at 05:22:59PM +0100, Al Viro wrote:
> > On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
> > >
> > > Hi Linus, Al,
> > >
> > > Here are some patches that make changes to the mount API UAPI and two of
> > > them really need applying, before -rc1 - if they're going to be applied at
> > > all.
> >
> > I'm fine with 2--4, but I'm not convinced that cloexec-by-default crusade
> > makes any sense. Could somebody give coherent arguments in favour of
> > abandoning the existing conventions?
>
> So as I said in the commit message. From a userspace perspective it's
> more of an issue if one accidently leaks an fd to a task during exec.
>
> Also, most of the time one does not want to inherit an fd during an
> exec. It is a hazzle to always have to specify an extra flag.
>
> As Al pointed out to me open() semantics are not going anywhere. Sure,
> no argument there at all.
> But the idea of making fds cloexec by default is only targeted at fds
> that come from separate syscalls. fsopen(), open_tree_clone(), etc. they
> all return fds independent of open() so it's really easy to have them
> cloexec by default without regressing anyone and we also remove the need
> for a bunch of separate flags for each syscall to turn them into
> cloexec-fds. I mean, those for syscalls came with 4 separate flags to be
> able to specify that the returned fd should be made cloexec. The other
> way around, cloexec by default, fcntl() to remove the cloexec bit is way
> saner imho.

Re separate flags - it is, in principle, a valid argument. OTOH, I'm not
sure if they need to be separate - they all have the same value and
I don't see any reason for that to change...

Only tangentially related, but I wonder if something like close_range(from, to)
would be a more useful approach... That kind of open-coded loops is not
rare in userland and kernel-side code can do them much cheaper. Something
like
/* that exec is sensitive */
unshare(CLONE_FILES);
/* we don't want anything past stderr here */
close_range(3, ~0U);
execve(....);
on the userland side of thing. Comments?

2019-05-16 20:08:52

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

On May 16, 2019 6:50:22 PM GMT+02:00, Al Viro <[email protected]> wrote:
>[linux-abi cc'd]
>
>On Thu, May 16, 2019 at 06:31:52PM +0200, Christian Brauner wrote:
>> On Thu, May 16, 2019 at 05:22:59PM +0100, Al Viro wrote:
>> > On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
>> > >
>> > > Hi Linus, Al,
>> > >
>> > > Here are some patches that make changes to the mount API UAPI and
>two of
>> > > them really need applying, before -rc1 - if they're going to be
>applied at
>> > > all.
>> >
>> > I'm fine with 2--4, but I'm not convinced that cloexec-by-default
>crusade
>> > makes any sense. Could somebody give coherent arguments in favour
>of
>> > abandoning the existing conventions?
>>
>> So as I said in the commit message. From a userspace perspective it's
>> more of an issue if one accidently leaks an fd to a task during exec.
>>
>> Also, most of the time one does not want to inherit an fd during an
>> exec. It is a hazzle to always have to specify an extra flag.
>>
>> As Al pointed out to me open() semantics are not going anywhere.
>Sure,
>> no argument there at all.
>> But the idea of making fds cloexec by default is only targeted at fds
>> that come from separate syscalls. fsopen(), open_tree_clone(), etc.
>they
>> all return fds independent of open() so it's really easy to have them
>> cloexec by default without regressing anyone and we also remove the
>need
>> for a bunch of separate flags for each syscall to turn them into
>> cloexec-fds. I mean, those for syscalls came with 4 separate flags to
>be
>> able to specify that the returned fd should be made cloexec. The
>other
>> way around, cloexec by default, fcntl() to remove the cloexec bit is
>way
>> saner imho.
>
>Re separate flags - it is, in principle, a valid argument. OTOH, I'm
>not
>sure if they need to be separate - they all have the same value and
>I don't see any reason for that to change...
>
>Only tangentially related, but I wonder if something like
>close_range(from, to)
>would be a more useful approach... That kind of open-coded loops is
>not
>rare in userland and kernel-side code can do them much cheaper.
>Something
>like
> /* that exec is sensitive */
> unshare(CLONE_FILES);
> /* we don't want anything past stderr here */
> close_range(3, ~0U);
> execve(....);
>on the userland side of thing. Comments?

Very much in favor of that!
That'd be a neat new addition.

2019-05-16 21:43:36

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

> unshare(CLONE_FILES);
> /* we don't want anything past stderr here */
> close_range(3, ~0U);
> execve(....);

Yes please.

nextfd(2)
https://lkml.org/lkml/2012/4/1/71

fdmap(2)
https://marc.info/?t=150628366900006&r=1&w=4

I like fdmap more.

2019-05-16 23:26:11

by Dmitry V. Levin

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

[looks like linux-abi is a typo, Cc'ed linux-api instead]

On Thu, May 16, 2019 at 05:50:22PM +0100, Al Viro wrote:
> [linux-abi cc'd]
>
> On Thu, May 16, 2019 at 06:31:52PM +0200, Christian Brauner wrote:
> > On Thu, May 16, 2019 at 05:22:59PM +0100, Al Viro wrote:
> > > On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
> > > >
> > > > Hi Linus, Al,
> > > >
> > > > Here are some patches that make changes to the mount API UAPI and two of
> > > > them really need applying, before -rc1 - if they're going to be applied at
> > > > all.
> > >
> > > I'm fine with 2--4, but I'm not convinced that cloexec-by-default crusade
> > > makes any sense. Could somebody give coherent arguments in favour of
> > > abandoning the existing conventions?
> >
> > So as I said in the commit message. From a userspace perspective it's
> > more of an issue if one accidently leaks an fd to a task during exec.
> >
> > Also, most of the time one does not want to inherit an fd during an
> > exec. It is a hazzle to always have to specify an extra flag.
> >
> > As Al pointed out to me open() semantics are not going anywhere. Sure,
> > no argument there at all.
> > But the idea of making fds cloexec by default is only targeted at fds
> > that come from separate syscalls. fsopen(), open_tree_clone(), etc. they
> > all return fds independent of open() so it's really easy to have them
> > cloexec by default without regressing anyone and we also remove the need
> > for a bunch of separate flags for each syscall to turn them into
> > cloexec-fds. I mean, those for syscalls came with 4 separate flags to be
> > able to specify that the returned fd should be made cloexec. The other
> > way around, cloexec by default, fcntl() to remove the cloexec bit is way
> > saner imho.
>
> Re separate flags - it is, in principle, a valid argument. OTOH, I'm not
> sure if they need to be separate - they all have the same value and
> I don't see any reason for that to change...
>
> Only tangentially related, but I wonder if something like close_range(from, to)
> would be a more useful approach... That kind of open-coded loops is not
> rare in userland and kernel-side code can do them much cheaper. Something
> like
> /* that exec is sensitive */
> unshare(CLONE_FILES);
> /* we don't want anything past stderr here */
> close_range(3, ~0U);
> execve(....);
> on the userland side of thing. Comments?

glibc people need a syscall to implement closefrom properly, see
https://sourceware.org/bugzilla/show_bug.cgi?id=10353#c14


--
ldv

2019-05-17 07:29:13

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

On May 16, 2019 6:50:22 PM GMT+02:00, Al Viro <[email protected]> wrote:
>[linux-abi cc'd]
>
>On Thu, May 16, 2019 at 06:31:52PM +0200, Christian Brauner wrote:
>> On Thu, May 16, 2019 at 05:22:59PM +0100, Al Viro wrote:
>> > On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
>> > >
>> > > Hi Linus, Al,
>> > >
>> > > Here are some patches that make changes to the mount API UAPI and
>two of
>> > > them really need applying, before -rc1 - if they're going to be
>applied at
>> > > all.
>> >
>> > I'm fine with 2--4, but I'm not convinced that cloexec-by-default
>crusade
>> > makes any sense. Could somebody give coherent arguments in favour
>of
>> > abandoning the existing conventions?
>>
>> So as I said in the commit message. From a userspace perspective it's
>> more of an issue if one accidently leaks an fd to a task during exec.
>>
>> Also, most of the time one does not want to inherit an fd during an
>> exec. It is a hazzle to always have to specify an extra flag.
>>
>> As Al pointed out to me open() semantics are not going anywhere.
>Sure,
>> no argument there at all.
>> But the idea of making fds cloexec by default is only targeted at fds
>> that come from separate syscalls. fsopen(), open_tree_clone(), etc.
>they
>> all return fds independent of open() so it's really easy to have them
>> cloexec by default without regressing anyone and we also remove the
>need
>> for a bunch of separate flags for each syscall to turn them into
>> cloexec-fds. I mean, those for syscalls came with 4 separate flags to
>be
>> able to specify that the returned fd should be made cloexec. The
>other
>> way around, cloexec by default, fcntl() to remove the cloexec bit is
>way
>> saner imho.
>
>Re separate flags - it is, in principle, a valid argument. OTOH, I'm
>not
>sure if they need to be separate - they all have the same value and
>I don't see any reason for that to change...

One last thing I'd like to point out is that
we already have syscalls and ioctls that
return cloexec fds. So the consistency
argument is kinda dead.

If you still prefer to have cloexec flags
for the 4 new syscalls then yes,
if they could at least all have the same name
(FSMOUNT_CLOEXEC?) that would be good.

>
>Only tangentially related, but I wonder if something like
>close_range(from, to)
>would be a more useful approach... That kind of open-coded loops is
>not
>rare in userland and kernel-side code can do them much cheaper.
>Something
>like
> /* that exec is sensitive */
> unshare(CLONE_FILES);
> /* we don't want anything past stderr here */
> close_range(3, ~0U);
> execve(....);
>on the userland side of thing. Comments?

Said it before but, the list was mistyped so again:
I think that's a great idea.
I have a prototype for close_range(start, end, flags).
I'll wait after rc1 and then send it out.

Christian

2019-05-17 08:03:46

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

On May 16, 2019 10:23:31 PM GMT+02:00, "Dmitry V. Levin" <[email protected]> wrote:
>[looks like linux-abi is a typo, Cc'ed linux-api instead]
>
>On Thu, May 16, 2019 at 05:50:22PM +0100, Al Viro wrote:
>> [linux-abi cc'd]
>>
>> On Thu, May 16, 2019 at 06:31:52PM +0200, Christian Brauner wrote:
>> > On Thu, May 16, 2019 at 05:22:59PM +0100, Al Viro wrote:
>> > > On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
>> > > >
>> > > > Hi Linus, Al,
>> > > >
>> > > > Here are some patches that make changes to the mount API UAPI
>and two of
>> > > > them really need applying, before -rc1 - if they're going to be
>applied at
>> > > > all.
>> > >
>> > > I'm fine with 2--4, but I'm not convinced that cloexec-by-default
>crusade
>> > > makes any sense. Could somebody give coherent arguments in
>favour of
>> > > abandoning the existing conventions?
>> >
>> > So as I said in the commit message. From a userspace perspective
>it's
>> > more of an issue if one accidently leaks an fd to a task during
>exec.
>> >
>> > Also, most of the time one does not want to inherit an fd during an
>> > exec. It is a hazzle to always have to specify an extra flag.
>> >
>> > As Al pointed out to me open() semantics are not going anywhere.
>Sure,
>> > no argument there at all.
>> > But the idea of making fds cloexec by default is only targeted at
>fds
>> > that come from separate syscalls. fsopen(), open_tree_clone(), etc.
>they
>> > all return fds independent of open() so it's really easy to have
>them
>> > cloexec by default without regressing anyone and we also remove the
>need
>> > for a bunch of separate flags for each syscall to turn them into
>> > cloexec-fds. I mean, those for syscalls came with 4 separate flags
>to be
>> > able to specify that the returned fd should be made cloexec. The
>other
>> > way around, cloexec by default, fcntl() to remove the cloexec bit
>is way
>> > saner imho.
>>
>> Re separate flags - it is, in principle, a valid argument. OTOH, I'm
>not
>> sure if they need to be separate - they all have the same value and
>> I don't see any reason for that to change...
>>
>> Only tangentially related, but I wonder if something like
>close_range(from, to)
>> would be a more useful approach... That kind of open-coded loops is
>not
>> rare in userland and kernel-side code can do them much cheaper.
>Something
>> like
>> /* that exec is sensitive */
>> unshare(CLONE_FILES);
>> /* we don't want anything past stderr here */
>> close_range(3, ~0U);
>> execve(....);
>> on the userland side of thing. Comments?
>
>glibc people need a syscall to implement closefrom properly, see
>https://sourceware.org/bugzilla/show_bug.cgi?id=10353#c14

I have a prototype for close_range().
I'll send it out after rc1.

Christian

2019-05-17 08:19:56

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

On May 17, 2019 9:13:26 AM GMT+02:00, David Howells <[email protected]> wrote:
>Christian Brauner <[email protected]> wrote:
>
>> If you still prefer to have cloexec flags
>> for the 4 new syscalls then yes,
>> if they could at least all have the same name
>> (FSMOUNT_CLOEXEC?) that would be good.
>
>They don't all have the same value (see OPEN_TREE_CLOEXEC).
>
>Note that I also don't want to blindly #define them to O_CLOEXEC
>because it's
>not necessarily the same value on all arches. Currently it can be
>02000000,
>010000000 or 0x400000 for instance, which means that if it's sharing a
>mask
>with other flags, at least three bits have to be reserved for it or we
>have to
>have arch-dependent bit juggling.


Ugh. Right, I forgot about that entirely.

Christian

>
>One thing I like about your approach of just making them O_CLOEXEC by
>default
>and removing the constants is that it avoids this mess entirely.
>
>David

2019-05-17 08:59:28

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

Christian Brauner <[email protected]> wrote:

> If you still prefer to have cloexec flags
> for the 4 new syscalls then yes,
> if they could at least all have the same name
> (FSMOUNT_CLOEXEC?) that would be good.

They don't all have the same value (see OPEN_TREE_CLOEXEC).

Note that I also don't want to blindly #define them to O_CLOEXEC because it's
not necessarily the same value on all arches. Currently it can be 02000000,
010000000 or 0x400000 for instance, which means that if it's sharing a mask
with other flags, at least three bits have to be reserved for it or we have to
have arch-dependent bit juggling.

One thing I like about your approach of just making them O_CLOEXEC by default
and removing the constants is that it avoids this mess entirely.

David

2019-05-17 09:00:12

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

On Fri, May 17, 2019 at 9:13 AM David Howells <[email protected]> wrote:
>
> Christian Brauner <[email protected]> wrote:
>
> > If you still prefer to have cloexec flags
> > for the 4 new syscalls then yes,
> > if they could at least all have the same name
> > (FSMOUNT_CLOEXEC?) that would be good.
>
> They don't all have the same value (see OPEN_TREE_CLOEXEC).
>
> Note that I also don't want to blindly #define them to O_CLOEXEC because it's
> not necessarily the same value on all arches. Currently it can be 02000000,
> 010000000 or 0x400000 for instance, which means that if it's sharing a mask
> with other flags, at least three bits have to be reserved for it or we have to
> have arch-dependent bit juggling.
>
> One thing I like about your approach of just making them O_CLOEXEC by default
> and removing the constants is that it avoids this mess entirely.

+1.

Confusion caused by inconsistency of naming is going to hurt more than
inconsistency of semantics wrt. open(2).

Thanks,
Miklos