2020-09-16 00:25:22

by Rich Felker

[permalink] [raw]
Subject: [PATCH v2 0/2] changes for addding fchmodat2 syscall

I'm resubmitting this split into two parts, first blocking chmod of
symlinks in chmod_common, then adding the new syscall, as requested by
Christoph. This will make it impossible to chmod symlinks via the
O_PATH magic symlink path. I've also reordered the unsupported flags
test to come first.

Rich


2020-09-16 00:25:30

by Rich Felker

[permalink] [raw]
Subject: [PATCH v2 2/2] vfs: add fchmodat2 syscall

POSIX defines fchmodat as having a 4th argument, flags, that can be
AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
links is optional (EOPNOTSUPP allowed if not supported), but this flag
is important even on systems where symlinks do not have access modes,
since it's the only way to safely change the mode of a file which
might be asynchronously replaced with a symbolic link, without a race
condition whereby the link target is changed.

It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
musl libc and glibc do this, by opening an O_PATH file descriptor and
performing chmod on the corresponding magic symlink in /proc/self/fd.
However, this requires procfs to be mounted and accessible.

Signed-off-by: Rich Felker <[email protected]>
---
arch/alpha/kernel/syscalls/syscall.tbl | 1 +
arch/arm/tools/syscall.tbl | 1 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 2 ++
arch/ia64/kernel/syscalls/syscall.tbl | 1 +
arch/m68k/kernel/syscalls/syscall.tbl | 1 +
arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
arch/parisc/kernel/syscalls/syscall.tbl | 1 +
arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
arch/s390/kernel/syscalls/syscall.tbl | 1 +
arch/sh/kernel/syscalls/syscall.tbl | 1 +
arch/sparc/kernel/syscalls/syscall.tbl | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
fs/open.c | 17 ++++++++++++++---
include/linux/syscalls.h | 2 ++
include/uapi/asm-generic/unistd.h | 4 +++-
21 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ec8bed9e7b75..5648fa8be7a1 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -479,3 +479,4 @@
547 common openat2 sys_openat2
548 common pidfd_getfd sys_pidfd_getfd
549 common faccessat2 sys_faccessat2
+550 common fchmodat2 sys_fchmodat2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..b6b715bb3315 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -453,3 +453,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3b859596840d..b3b2019f8d16 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)

-#define __NR_compat_syscalls 440
+#define __NR_compat_syscalls 441
#endif

#define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9..cd0845f3c19f 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
__SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
#define __NR_faccessat2 439
__SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_fchmodat2 440
+__SYSCALL(__NR_fchmodat2, sys_fchmodat2)

/*
* Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index f52a41f4c340..7c3f8564d0f3 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -360,3 +360,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 81fc799d8392..063d875377bf 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index b4e263916f41..6aea8a435fd0 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -445,3 +445,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index f9df9edb67a4..a9205843251d 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -378,3 +378,4 @@
437 n32 openat2 sys_openat2
438 n32 pidfd_getfd sys_pidfd_getfd
439 n32 faccessat2 sys_faccessat2
+440 n32 fchmodat2 sys_fchmodat2
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 557f9954a2b9..31da28e2d6f3 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -354,3 +354,4 @@
437 n64 openat2 sys_openat2
438 n64 pidfd_getfd sys_pidfd_getfd
439 n64 faccessat2 sys_faccessat2
+440 n64 fchmodat2 sys_fchmodat2
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 195b43cf27c8..af0e38302ed8 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -427,3 +427,4 @@
437 o32 openat2 sys_openat2
438 o32 pidfd_getfd sys_pidfd_getfd
439 o32 faccessat2 sys_faccessat2
+440 o32 fchmodat2 sys_fchmodat2
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index def64d221cd4..379cdb44ca0b 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -437,3 +437,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index c2d737ff2e7b..ada11db506e6 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -529,3 +529,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 10456bc936fb..a4dae0abb353 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -442,3 +442,4 @@
437 common openat2 sys_openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2 sys_fchmodat2
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index ae0a00beea5f..b59b4408b85f 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -442,3 +442,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 4af114e84f20..e817416f81df 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -485,3 +485,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 9d1102873666..208b06650cef 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -444,3 +444,4 @@
437 i386 openat2 sys_openat2
438 i386 pidfd_getfd sys_pidfd_getfd
439 i386 faccessat2 sys_faccessat2
+440 i386 fchmodat2 sys_fchmodat2
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f30d6ae9a688..d9a591db72fb 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -361,6 +361,7 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2

#
# x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 6276e3c2d3fc..ff756cb2f5d7 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -410,3 +410,4 @@
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
439 common faccessat2 sys_faccessat2
+440 common fchmodat2 sys_fchmodat2
diff --git a/fs/open.c b/fs/open.c
index cdb7964aaa6e..f492c782c0ed 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -616,11 +616,16 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
return err;
}

-static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
+static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int flags)
{
struct path path;
int error;
unsigned int lookup_flags = LOOKUP_FOLLOW;
+
+ if (flags & ~AT_SYMLINK_NOFOLLOW)
+ return -EINVAL;
+ if (flags & AT_SYMLINK_NOFOLLOW)
+ lookup_flags &= ~LOOKUP_FOLLOW;
retry:
error = user_path_at(dfd, filename, lookup_flags, &path);
if (!error) {
@@ -634,15 +639,21 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
return error;
}

+SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
+ umode_t, mode, int, flags)
+{
+ return do_fchmodat(dfd, filename, mode, flags);
+}
+
SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename,
umode_t, mode)
{
- return do_fchmodat(dfd, filename, mode);
+ return do_fchmodat(dfd, filename, mode, 0);
}

SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
{
- return do_fchmodat(AT_FDCWD, filename, mode);
+ return do_fchmodat(AT_FDCWD, filename, mode, 0);
}

int chown_common(const struct path *path, uid_t user, gid_t group)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 75ac7f8ae93c..ced00c56eba7 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -435,6 +435,8 @@ asmlinkage long sys_chroot(const char __user *filename);
asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
asmlinkage long sys_fchmodat(int dfd, const char __user * filename,
umode_t mode);
+asmlinkage long sys_fchmodat2(int dfd, const char __user * filename,
+ umode_t mode, int flags);
asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
gid_t group, int flag);
asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 995b36c2ea7d..ebf5cdb3f444 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -859,9 +859,11 @@ __SYSCALL(__NR_openat2, sys_openat2)
__SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
#define __NR_faccessat2 439
__SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_fchmodat2 440
+__SYSCALL(__NR_fchmodat2, sys_fchmodat2)

#undef __NR_syscalls
-#define __NR_syscalls 440
+#define __NR_syscalls 441

/*
* 32 bit systems traditionally used different
--
2.21.0

2020-09-16 00:25:57

by Rich Felker

[permalink] [raw]
Subject: [PATCH v2 1/2] vfs: block chmod of symlinks

It was discovered while implementing userspace emulation of fchmodat
AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
it's not possible to target symlinks with chmod operations) that some
filesystems erroneously allow access mode of symlinks to be changed,
but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
a492b1e5ef). This inconsistency is non-conforming and wrong, and the
consensus seems to be that it was unintentional to allow link modes to
be changed in the first place.

Signed-off-by: Rich Felker <[email protected]>
---
fs/open.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/fs/open.c b/fs/open.c
index 9af548fb841b..cdb7964aaa6e 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
struct iattr newattrs;
int error;

+ /* Block chmod from getting to fs layer. Ideally the fs would either
+ * allow it or fail with EOPNOTSUPP, but some are buggy and return
+ * an error but change the mode, which is non-conforming and wrong. */
+ if (S_ISLNK(inode->i_mode))
+ return -EOPNOTSUPP;
+
error = mnt_want_write(path->mnt);
if (error)
return error;
--
2.21.0

2020-09-16 06:04:09

by Aleksa Sarai

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] vfs: add fchmodat2 syscall

On 2020-09-15, Rich Felker <[email protected]> wrote:
> POSIX defines fchmodat as having a 4th argument, flags, that can be
> AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
> links is optional (EOPNOTSUPP allowed if not supported), but this flag
> is important even on systems where symlinks do not have access modes,
> since it's the only way to safely change the mode of a file which
> might be asynchronously replaced with a symbolic link, without a race
> condition whereby the link target is changed.

Could we also add AT_EMPTY_PATH support, so that you can fchmod an open
file in a race-free way without needing to go through procfs?

> It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
> musl libc and glibc do this, by opening an O_PATH file descriptor and
> performing chmod on the corresponding magic symlink in /proc/self/fd.
> However, this requires procfs to be mounted and accessible.
>
> Signed-off-by: Rich Felker <[email protected]>
> ---
> arch/alpha/kernel/syscalls/syscall.tbl | 1 +
> arch/arm/tools/syscall.tbl | 1 +
> arch/arm64/include/asm/unistd.h | 2 +-
> arch/arm64/include/asm/unistd32.h | 2 ++
> arch/ia64/kernel/syscalls/syscall.tbl | 1 +
> arch/m68k/kernel/syscalls/syscall.tbl | 1 +
> arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
> arch/mips/kernel/syscalls/syscall_n32.tbl | 1 +
> arch/mips/kernel/syscalls/syscall_n64.tbl | 1 +
> arch/mips/kernel/syscalls/syscall_o32.tbl | 1 +
> arch/parisc/kernel/syscalls/syscall.tbl | 1 +
> arch/powerpc/kernel/syscalls/syscall.tbl | 1 +
> arch/s390/kernel/syscalls/syscall.tbl | 1 +
> arch/sh/kernel/syscalls/syscall.tbl | 1 +
> arch/sparc/kernel/syscalls/syscall.tbl | 1 +
> arch/x86/entry/syscalls/syscall_32.tbl | 1 +
> arch/x86/entry/syscalls/syscall_64.tbl | 1 +
> arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
> fs/open.c | 17 ++++++++++++++---
> include/linux/syscalls.h | 2 ++
> include/uapi/asm-generic/unistd.h | 4 +++-
> 21 files changed, 38 insertions(+), 5 deletions(-)
>
> diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
> index ec8bed9e7b75..5648fa8be7a1 100644
> --- a/arch/alpha/kernel/syscalls/syscall.tbl
> +++ b/arch/alpha/kernel/syscalls/syscall.tbl
> @@ -479,3 +479,4 @@
> 547 common openat2 sys_openat2
> 548 common pidfd_getfd sys_pidfd_getfd
> 549 common faccessat2 sys_faccessat2
> +550 common fchmodat2 sys_fchmodat2
> diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
> index 171077cbf419..b6b715bb3315 100644
> --- a/arch/arm/tools/syscall.tbl
> +++ b/arch/arm/tools/syscall.tbl
> @@ -453,3 +453,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
> index 3b859596840d..b3b2019f8d16 100644
> --- a/arch/arm64/include/asm/unistd.h
> +++ b/arch/arm64/include/asm/unistd.h
> @@ -38,7 +38,7 @@
> #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
> #define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)
>
> -#define __NR_compat_syscalls 440
> +#define __NR_compat_syscalls 441
> #endif
>
> #define __ARCH_WANT_SYS_CLONE
> diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
> index 734860ac7cf9..cd0845f3c19f 100644
> --- a/arch/arm64/include/asm/unistd32.h
> +++ b/arch/arm64/include/asm/unistd32.h
> @@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
> __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
> #define __NR_faccessat2 439
> __SYSCALL(__NR_faccessat2, sys_faccessat2)
> +#define __NR_fchmodat2 440
> +__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
>
> /*
> * Please add new compat syscalls above this comment and update
> diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
> index f52a41f4c340..7c3f8564d0f3 100644
> --- a/arch/ia64/kernel/syscalls/syscall.tbl
> +++ b/arch/ia64/kernel/syscalls/syscall.tbl
> @@ -360,3 +360,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
> index 81fc799d8392..063d875377bf 100644
> --- a/arch/m68k/kernel/syscalls/syscall.tbl
> +++ b/arch/m68k/kernel/syscalls/syscall.tbl
> @@ -439,3 +439,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
> index b4e263916f41..6aea8a435fd0 100644
> --- a/arch/microblaze/kernel/syscalls/syscall.tbl
> +++ b/arch/microblaze/kernel/syscalls/syscall.tbl
> @@ -445,3 +445,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
> index f9df9edb67a4..a9205843251d 100644
> --- a/arch/mips/kernel/syscalls/syscall_n32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
> @@ -378,3 +378,4 @@
> 437 n32 openat2 sys_openat2
> 438 n32 pidfd_getfd sys_pidfd_getfd
> 439 n32 faccessat2 sys_faccessat2
> +440 n32 fchmodat2 sys_fchmodat2
> diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
> index 557f9954a2b9..31da28e2d6f3 100644
> --- a/arch/mips/kernel/syscalls/syscall_n64.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
> @@ -354,3 +354,4 @@
> 437 n64 openat2 sys_openat2
> 438 n64 pidfd_getfd sys_pidfd_getfd
> 439 n64 faccessat2 sys_faccessat2
> +440 n64 fchmodat2 sys_fchmodat2
> diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
> index 195b43cf27c8..af0e38302ed8 100644
> --- a/arch/mips/kernel/syscalls/syscall_o32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
> @@ -427,3 +427,4 @@
> 437 o32 openat2 sys_openat2
> 438 o32 pidfd_getfd sys_pidfd_getfd
> 439 o32 faccessat2 sys_faccessat2
> +440 o32 fchmodat2 sys_fchmodat2
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
> index def64d221cd4..379cdb44ca0b 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -437,3 +437,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index c2d737ff2e7b..ada11db506e6 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -529,3 +529,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index 10456bc936fb..a4dae0abb353 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -442,3 +442,4 @@
> 437 common openat2 sys_openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2 sys_fchmodat2
> diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
> index ae0a00beea5f..b59b4408b85f 100644
> --- a/arch/sh/kernel/syscalls/syscall.tbl
> +++ b/arch/sh/kernel/syscalls/syscall.tbl
> @@ -442,3 +442,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
> index 4af114e84f20..e817416f81df 100644
> --- a/arch/sparc/kernel/syscalls/syscall.tbl
> +++ b/arch/sparc/kernel/syscalls/syscall.tbl
> @@ -485,3 +485,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index 9d1102873666..208b06650cef 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -444,3 +444,4 @@
> 437 i386 openat2 sys_openat2
> 438 i386 pidfd_getfd sys_pidfd_getfd
> 439 i386 faccessat2 sys_faccessat2
> +440 i386 fchmodat2 sys_fchmodat2
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index f30d6ae9a688..d9a591db72fb 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -361,6 +361,7 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
>
> #
> # x32-specific system call numbers start at 512 to avoid cache impact
> diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
> index 6276e3c2d3fc..ff756cb2f5d7 100644
> --- a/arch/xtensa/kernel/syscalls/syscall.tbl
> +++ b/arch/xtensa/kernel/syscalls/syscall.tbl
> @@ -410,3 +410,4 @@
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
> +440 common fchmodat2 sys_fchmodat2
> diff --git a/fs/open.c b/fs/open.c
> index cdb7964aaa6e..f492c782c0ed 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -616,11 +616,16 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> return err;
> }
>
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int flags)
> {
> struct path path;
> int error;
> unsigned int lookup_flags = LOOKUP_FOLLOW;
> +
> + if (flags & ~AT_SYMLINK_NOFOLLOW)
> + return -EINVAL;
> + if (flags & AT_SYMLINK_NOFOLLOW)
> + lookup_flags &= ~LOOKUP_FOLLOW;
> retry:
> error = user_path_at(dfd, filename, lookup_flags, &path);
> if (!error) {
> @@ -634,15 +639,21 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> return error;
> }
>
> +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> + umode_t, mode, int, flags)
> +{
> + return do_fchmodat(dfd, filename, mode, flags);
> +}
> +
> SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename,
> umode_t, mode)
> {
> - return do_fchmodat(dfd, filename, mode);
> + return do_fchmodat(dfd, filename, mode, 0);
> }
>
> SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
> {
> - return do_fchmodat(AT_FDCWD, filename, mode);
> + return do_fchmodat(AT_FDCWD, filename, mode, 0);
> }
>
> int chown_common(const struct path *path, uid_t user, gid_t group)
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 75ac7f8ae93c..ced00c56eba7 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -435,6 +435,8 @@ asmlinkage long sys_chroot(const char __user *filename);
> asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
> asmlinkage long sys_fchmodat(int dfd, const char __user * filename,
> umode_t mode);
> +asmlinkage long sys_fchmodat2(int dfd, const char __user * filename,
> + umode_t mode, int flags);
> asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
> gid_t group, int flag);
> asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index 995b36c2ea7d..ebf5cdb3f444 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -859,9 +859,11 @@ __SYSCALL(__NR_openat2, sys_openat2)
> __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
> #define __NR_faccessat2 439
> __SYSCALL(__NR_faccessat2, sys_faccessat2)
> +#define __NR_fchmodat2 440
> +__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
>
> #undef __NR_syscalls
> -#define __NR_syscalls 440
> +#define __NR_syscalls 441
>
> /*
> * 32 bit systems traditionally used different
> --
> 2.21.0
>

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>


Attachments:
(No filename) (12.87 kB)
signature.asc (235.00 B)
Download all attachments

2020-09-16 06:20:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Tue, Sep 15, 2020 at 08:22:54PM -0400, Rich Felker wrote:
> It was discovered while implementing userspace emulation of fchmodat
> AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
> it's not possible to target symlinks with chmod operations) that some
> filesystems erroneously allow access mode of symlinks to be changed,
> but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
> a492b1e5ef). This inconsistency is non-conforming and wrong, and the
> consensus seems to be that it was unintentional to allow link modes to
> be changed in the first place.
>
> Signed-off-by: Rich Felker <[email protected]>
> ---
> fs/open.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/open.c b/fs/open.c
> index 9af548fb841b..cdb7964aaa6e 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
> struct iattr newattrs;
> int error;
>
> + /* Block chmod from getting to fs layer. Ideally the fs would either
> + * allow it or fail with EOPNOTSUPP, but some are buggy and return
> + * an error but change the mode, which is non-conforming and wrong. */
> + if (S_ISLNK(inode->i_mode))
> + return -EOPNOTSUPP;

I still fail to understand why these "buggy" filesystems can not be
fixed. Why are you papering over a filesystem-specific-bug with this
core kernel change that we will forever have to keep?

thanks,

greg k-h

2020-09-16 06:20:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] vfs: add fchmodat2 syscall

On Tue, Sep 15, 2020 at 08:23:38PM -0400, Rich Felker wrote:
> POSIX defines fchmodat as having a 4th argument, flags, that can be
> AT_SYMLINK_NOFOLLOW. Support for changing the access mode of symbolic
> links is optional (EOPNOTSUPP allowed if not supported), but this flag
> is important even on systems where symlinks do not have access modes,
> since it's the only way to safely change the mode of a file which
> might be asynchronously replaced with a symbolic link, without a race
> condition whereby the link target is changed.
>
> It's possible to emulate AT_SYMLINK_NOFOLLOW in userspace, and both
> musl libc and glibc do this, by opening an O_PATH file descriptor and
> performing chmod on the corresponding magic symlink in /proc/self/fd.
> However, this requires procfs to be mounted and accessible.
>
> Signed-off-by: Rich Felker <[email protected]>

No kselftest for this new system call, or man page? How do we know this
actually works and what the expected outcome should be?

thanks,

greg k-h

2020-09-16 06:25:14

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Wed, Sep 16, 2020 at 08:18:15AM +0200, Greg KH wrote:
> On Tue, Sep 15, 2020 at 08:22:54PM -0400, Rich Felker wrote:
> > It was discovered while implementing userspace emulation of fchmodat
> > AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
> > it's not possible to target symlinks with chmod operations) that some
> > filesystems erroneously allow access mode of symlinks to be changed,
> > but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
> > a492b1e5ef). This inconsistency is non-conforming and wrong, and the
> > consensus seems to be that it was unintentional to allow link modes to
> > be changed in the first place.
> >
> > Signed-off-by: Rich Felker <[email protected]>
> > ---
> > fs/open.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/fs/open.c b/fs/open.c
> > index 9af548fb841b..cdb7964aaa6e 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
> > struct iattr newattrs;
> > int error;
> >
> > + /* Block chmod from getting to fs layer. Ideally the fs would either
> > + * allow it or fail with EOPNOTSUPP, but some are buggy and return
> > + * an error but change the mode, which is non-conforming and wrong. */
> > + if (S_ISLNK(inode->i_mode))
> > + return -EOPNOTSUPP;
>
> I still fail to understand why these "buggy" filesystems can not be
> fixed. Why are you papering over a filesystem-specific-bug with this
> core kernel change that we will forever have to keep?

Because checking this once in the VFS is much saner than trying to
patch up a gazillion file systems.

2020-09-16 06:27:10

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Tue, Sep 15, 2020 at 08:22:54PM -0400, Rich Felker wrote:
> It was discovered while implementing userspace emulation of fchmodat
> AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
> it's not possible to target symlinks with chmod operations) that some
> filesystems erroneously allow access mode of symlinks to be changed,
> but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
> a492b1e5ef). This inconsistency is non-conforming and wrong, and the
> consensus seems to be that it was unintentional to allow link modes to
> be changed in the first place.
>
> Signed-off-by: Rich Felker <[email protected]>
> ---
> fs/open.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/open.c b/fs/open.c
> index 9af548fb841b..cdb7964aaa6e 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
> struct iattr newattrs;
> int error;
>
> + /* Block chmod from getting to fs layer. Ideally the fs would either
> + * allow it or fail with EOPNOTSUPP, but some are buggy and return
> + * an error but change the mode, which is non-conforming and wrong. */
> + if (S_ISLNK(inode->i_mode))
> + return -EOPNOTSUPP;

Our usualy place for this would be setattr_prepare. Also the comment
style is off, and I don't think we should talk about buggy file systems
here, but a policy to not allow the chmod. I also suspect the right
error value is EINVAL - EOPNOTSUPP isn't really used in normal posix
file system interfaces.

2020-09-16 16:24:44

by Rich Felker

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Wed, Sep 16, 2020 at 07:25:53AM +0100, Christoph Hellwig wrote:
> On Tue, Sep 15, 2020 at 08:22:54PM -0400, Rich Felker wrote:
> > It was discovered while implementing userspace emulation of fchmodat
> > AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
> > it's not possible to target symlinks with chmod operations) that some
> > filesystems erroneously allow access mode of symlinks to be changed,
> > but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
> > a492b1e5ef). This inconsistency is non-conforming and wrong, and the
> > consensus seems to be that it was unintentional to allow link modes to
> > be changed in the first place.
> >
> > Signed-off-by: Rich Felker <[email protected]>
> > ---
> > fs/open.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/fs/open.c b/fs/open.c
> > index 9af548fb841b..cdb7964aaa6e 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
> > struct iattr newattrs;
> > int error;
> >
> > + /* Block chmod from getting to fs layer. Ideally the fs would either
> > + * allow it or fail with EOPNOTSUPP, but some are buggy and return
> > + * an error but change the mode, which is non-conforming and wrong. */
> > + if (S_ISLNK(inode->i_mode))
> > + return -EOPNOTSUPP;
>
> Our usualy place for this would be setattr_prepare. Also the comment
> style is off, and I don't think we should talk about buggy file systems
> here, but a policy to not allow the chmod. I also suspect the right
> error value is EINVAL - EOPNOTSUPP isn't really used in normal posix
> file system interfaces.

EINVAL is non-conforming here. POSIX defines it as unsupported mode or
flag. Lack of support for setting an access mode on symlinks is a
distinct failure reason, and POSIX does not allow overloading error
codes like this.

Even if it were permitted, it would be bad to do this because it would
make it impossible for the application to tell whether the cause of
failure is an invalid mode or that the filesystem/implementation
doesn't support modes on symlinks. This matters because one is usually
a fatal error and the other is a condition to ignore. Moreover, the
affected applications (e.g. coreutils, tar, etc.) already accept
EOPNOTSUPP here from libc. Defining a new error would break them
unless libc translated whatever the syscall returns to the expected
EOPNOTSUPP.

Rich

2020-09-16 20:26:47

by Rich Felker

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Wed, Sep 16, 2020 at 08:18:15AM +0200, Greg KH wrote:
> On Tue, Sep 15, 2020 at 08:22:54PM -0400, Rich Felker wrote:
> > It was discovered while implementing userspace emulation of fchmodat
> > AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
> > it's not possible to target symlinks with chmod operations) that some
> > filesystems erroneously allow access mode of symlinks to be changed,
> > but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
> > a492b1e5ef). This inconsistency is non-conforming and wrong, and the
> > consensus seems to be that it was unintentional to allow link modes to
> > be changed in the first place.
> >
> > Signed-off-by: Rich Felker <[email protected]>
> > ---
> > fs/open.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/fs/open.c b/fs/open.c
> > index 9af548fb841b..cdb7964aaa6e 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
> > struct iattr newattrs;
> > int error;
> >
> > + /* Block chmod from getting to fs layer. Ideally the fs would either
> > + * allow it or fail with EOPNOTSUPP, but some are buggy and return
> > + * an error but change the mode, which is non-conforming and wrong. */
> > + if (S_ISLNK(inode->i_mode))
> > + return -EOPNOTSUPP;
>
> I still fail to understand why these "buggy" filesystems can not be
> fixed. Why are you papering over a filesystem-specific-bug with this

Because that's what Christoph wanted, and it seems exposure of the
vector for applying chmod to symlinks was unintentional to begin with.
I have no preference how this is fixed as long as breakage is not
exposed to userspace via the new fchmodat2 syscall (since a broken
syscall would be worse than not having it at all).

> core kernel change that we will forever have to keep?

There's no fundamental reason it would have to be kept forever. The
contract remains "either it works and reports success, or it makes no
change and reports EOPNOTSUPP". It just can't do both.

Rich

2020-09-17 04:08:42

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Wed, Sep 16, 2020 at 07:25:53AM +0100, Christoph Hellwig wrote:
> On Tue, Sep 15, 2020 at 08:22:54PM -0400, Rich Felker wrote:
> > It was discovered while implementing userspace emulation of fchmodat
> > AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
> > it's not possible to target symlinks with chmod operations) that some
> > filesystems erroneously allow access mode of symlinks to be changed,
> > but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
> > a492b1e5ef). This inconsistency is non-conforming and wrong, and the
> > consensus seems to be that it was unintentional to allow link modes to
> > be changed in the first place.
> >
> > Signed-off-by: Rich Felker <[email protected]>
> > ---
> > fs/open.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/fs/open.c b/fs/open.c
> > index 9af548fb841b..cdb7964aaa6e 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
> > struct iattr newattrs;
> > int error;
> >
> > + /* Block chmod from getting to fs layer. Ideally the fs would either
> > + * allow it or fail with EOPNOTSUPP, but some are buggy and return
> > + * an error but change the mode, which is non-conforming and wrong. */
> > + if (S_ISLNK(inode->i_mode))
> > + return -EOPNOTSUPP;
>
> Our usualy place for this would be setattr_prepare. Also the comment
> style is off, and I don't think we should talk about buggy file systems
> here, but a policy to not allow the chmod. I also suspect the right
> error value is EINVAL - EOPNOTSUPP isn't really used in normal posix
> file system interfaces.

Er... Wasn't that an ACL-related crap? XFS calling posix_acl_chmod()
after it has committed to i_mode change, propagating the error to
caller of ->notify_change(), IIRC...

Put it another way, why do we want
if (!inode->i_op->set_acl)
return -EOPNOTSUPP;
in posix_acl_chmod(), when we have
if (!IS_POSIXACL(inode))
return 0;
right next to it? If nothing else, make that
if (!IS_POSIXACL(inode) || !inode->i_op->get_acl)
return 0; // piss off - nothing to adjust here

2020-09-17 04:16:43

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Thu, Sep 17, 2020 at 05:07:15AM +0100, Al Viro wrote:
> On Wed, Sep 16, 2020 at 07:25:53AM +0100, Christoph Hellwig wrote:
> > On Tue, Sep 15, 2020 at 08:22:54PM -0400, Rich Felker wrote:
> > > It was discovered while implementing userspace emulation of fchmodat
> > > AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
> > > it's not possible to target symlinks with chmod operations) that some
> > > filesystems erroneously allow access mode of symlinks to be changed,
> > > but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
> > > a492b1e5ef). This inconsistency is non-conforming and wrong, and the
> > > consensus seems to be that it was unintentional to allow link modes to
> > > be changed in the first place.
> > >
> > > Signed-off-by: Rich Felker <[email protected]>
> > > ---
> > > fs/open.c | 6 ++++++
> > > 1 file changed, 6 insertions(+)
> > >
> > > diff --git a/fs/open.c b/fs/open.c
> > > index 9af548fb841b..cdb7964aaa6e 100644
> > > --- a/fs/open.c
> > > +++ b/fs/open.c
> > > @@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
> > > struct iattr newattrs;
> > > int error;
> > >
> > > + /* Block chmod from getting to fs layer. Ideally the fs would either
> > > + * allow it or fail with EOPNOTSUPP, but some are buggy and return
> > > + * an error but change the mode, which is non-conforming and wrong. */
> > > + if (S_ISLNK(inode->i_mode))
> > > + return -EOPNOTSUPP;
> >
> > Our usualy place for this would be setattr_prepare. Also the comment
> > style is off, and I don't think we should talk about buggy file systems
> > here, but a policy to not allow the chmod. I also suspect the right
> > error value is EINVAL - EOPNOTSUPP isn't really used in normal posix
> > file system interfaces.
>
> Er... Wasn't that an ACL-related crap? XFS calling posix_acl_chmod()
> after it has committed to i_mode change, propagating the error to
> caller of ->notify_change(), IIRC...
>
> Put it another way, why do we want
> if (!inode->i_op->set_acl)
> return -EOPNOTSUPP;
> in posix_acl_chmod(), when we have
> if (!IS_POSIXACL(inode))
> return 0;
> right next to it? If nothing else, make that
> if (!IS_POSIXACL(inode) || !inode->i_op->get_acl)
> return 0; // piss off - nothing to adjust here

Arrgh... That'd break shmem and similar filesystems... Still, it
feels like we should _not_ bother in cases when there's no ACL
for that sucker; after all, if get_acl() returns NULL, we quietly
return 0 and that's it.

How about something like this instead?

diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 95882b3f5f62..2339160fabab 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -559,8 +559,6 @@ posix_acl_chmod(struct inode *inode, umode_t mode)

if (!IS_POSIXACL(inode))
return 0;
- if (!inode->i_op->set_acl)
- return -EOPNOTSUPP;

acl = get_acl(inode, ACL_TYPE_ACCESS);
if (IS_ERR_OR_NULL(acl)) {
@@ -569,6 +567,10 @@ posix_acl_chmod(struct inode *inode, umode_t mode)
return PTR_ERR(acl);
}

+ if (!inode->i_op->set_acl) {
+ posix_acl_release(acl);
+ return -EOPNOTSUPP;
+ }
ret = __posix_acl_chmod(&acl, GFP_KERNEL, mode);
if (ret)
return ret;

2020-09-17 19:12:40

by Rich Felker

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Thu, Sep 17, 2020 at 05:15:03AM +0100, Al Viro wrote:
> On Thu, Sep 17, 2020 at 05:07:15AM +0100, Al Viro wrote:
> > On Wed, Sep 16, 2020 at 07:25:53AM +0100, Christoph Hellwig wrote:
> > > On Tue, Sep 15, 2020 at 08:22:54PM -0400, Rich Felker wrote:
> > > > It was discovered while implementing userspace emulation of fchmodat
> > > > AT_SYMLINK_NOFOLLOW (using O_PATH and procfs magic symlinks; otherwise
> > > > it's not possible to target symlinks with chmod operations) that some
> > > > filesystems erroneously allow access mode of symlinks to be changed,
> > > > but return failure with EOPNOTSUPP (see glibc issue #14578 and commit
> > > > a492b1e5ef). This inconsistency is non-conforming and wrong, and the
> > > > consensus seems to be that it was unintentional to allow link modes to
> > > > be changed in the first place.
> > > >
> > > > Signed-off-by: Rich Felker <[email protected]>
> > > > ---
> > > > fs/open.c | 6 ++++++
> > > > 1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/fs/open.c b/fs/open.c
> > > > index 9af548fb841b..cdb7964aaa6e 100644
> > > > --- a/fs/open.c
> > > > +++ b/fs/open.c
> > > > @@ -570,6 +570,12 @@ int chmod_common(const struct path *path, umode_t mode)
> > > > struct iattr newattrs;
> > > > int error;
> > > >
> > > > + /* Block chmod from getting to fs layer. Ideally the fs would either
> > > > + * allow it or fail with EOPNOTSUPP, but some are buggy and return
> > > > + * an error but change the mode, which is non-conforming and wrong. */
> > > > + if (S_ISLNK(inode->i_mode))
> > > > + return -EOPNOTSUPP;
> > >
> > > Our usualy place for this would be setattr_prepare. Also the comment
> > > style is off, and I don't think we should talk about buggy file systems
> > > here, but a policy to not allow the chmod. I also suspect the right
> > > error value is EINVAL - EOPNOTSUPP isn't really used in normal posix
> > > file system interfaces.
> >
> > Er... Wasn't that an ACL-related crap? XFS calling posix_acl_chmod()
> > after it has committed to i_mode change, propagating the error to
> > caller of ->notify_change(), IIRC...
> >
> > Put it another way, why do we want
> > if (!inode->i_op->set_acl)
> > return -EOPNOTSUPP;
> > in posix_acl_chmod(), when we have
> > if (!IS_POSIXACL(inode))
> > return 0;
> > right next to it? If nothing else, make that
> > if (!IS_POSIXACL(inode) || !inode->i_op->get_acl)
> > return 0; // piss off - nothing to adjust here
>
> Arrgh... That'd break shmem and similar filesystems... Still, it
> feels like we should _not_ bother in cases when there's no ACL
> for that sucker; after all, if get_acl() returns NULL, we quietly
> return 0 and that's it.
>
> How about something like this instead?
>
> diff --git a/fs/posix_acl.c b/fs/posix_acl.c
> index 95882b3f5f62..2339160fabab 100644
> --- a/fs/posix_acl.c
> +++ b/fs/posix_acl.c
> @@ -559,8 +559,6 @@ posix_acl_chmod(struct inode *inode, umode_t mode)
>
> if (!IS_POSIXACL(inode))
> return 0;
> - if (!inode->i_op->set_acl)
> - return -EOPNOTSUPP;
>
> acl = get_acl(inode, ACL_TYPE_ACCESS);
> if (IS_ERR_OR_NULL(acl)) {
> @@ -569,6 +567,10 @@ posix_acl_chmod(struct inode *inode, umode_t mode)
> return PTR_ERR(acl);
> }
>
> + if (!inode->i_op->set_acl) {
> + posix_acl_release(acl);
> + return -EOPNOTSUPP;
> + }
> ret = __posix_acl_chmod(&acl, GFP_KERNEL, mode);
> if (ret)
> return ret;

Does this make chmod of links behave consistently (either succeeding
with return value 0, or returning -EOPNOTSUPP without doing anything)
for all filesystems? I'm fine with (and would probably prefer) this
fix if it's a complete one. If this goes in I think my patch 1/2 can
just be dropped and patch 2/2 behaves as intended; does that sound
correct to you?

Rich

2020-09-29 17:54:06

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] vfs: block chmod of symlinks

On Thu, Sep 17, 2020 at 05:15:03AM +0100, Al Viro wrote:
> Arrgh... That'd break shmem and similar filesystems... Still, it
> feels like we should _not_ bother in cases when there's no ACL
> for that sucker; after all, if get_acl() returns NULL, we quietly
> return 0 and that's it.
>
> How about something like this instead?

Do you plan to turn this into a submission?

Rich, can you share your original reproducer? I would be really
helpful to have it wired up in xfstests as a regression tests.