2023-05-11 15:29:36

by Christian Göttsche

Subject: [RFC PATCH v2] fs/xattr: add *at family syscalls

Add the four syscalls setxattrat(), getxattrat(), listxattrat() and
removexattrat(). Those can be used to operate on extended attributes,
especially security related ones, either relative to a pinned directory
or on a file descriptor without read access, avoiding a
/proc/<pid>/fd/<fd> detour, requiring a mounted procfs.

One use case will be setfiles(8) setting SELinux file contexts
("security.selinux") without race conditions.

Add XATTR flags to the private namespace of AT_* flags.

Use the do_{name}at() pattern from fs/open.c.

Use a single flag parameter for extended attribute flags (currently
XATTR_CREATE and XATTR_REPLACE) and *at() flags to not exceed six
syscall arguments in setxattrat().

Previous approach ("f*xattr: allow O_PATH descriptors"): https://lore.kernel.org/all/[email protected]/
v1 discussion: https://lore.kernel.org/all/[email protected]/

Signed-off-by: Christian Göttsche <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
---
v2:
- squash syscall introduction and wire up commits
- add AT_XATTR_CREATE and AT_XATTR_REPLACE constants
---
arch/alpha/kernel/syscalls/syscall.tbl | 4 +
arch/arm/tools/syscall.tbl | 4 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 8 ++
arch/ia64/kernel/syscalls/syscall.tbl | 4 +
arch/m68k/kernel/syscalls/syscall.tbl | 4 +
arch/microblaze/kernel/syscalls/syscall.tbl | 4 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 4 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 4 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 4 +
arch/parisc/kernel/syscalls/syscall.tbl | 4 +
arch/powerpc/kernel/syscalls/syscall.tbl | 4 +
arch/s390/kernel/syscalls/syscall.tbl | 4 +
arch/sh/kernel/syscalls/syscall.tbl | 4 +
arch/sparc/kernel/syscalls/syscall.tbl | 4 +
arch/x86/entry/syscalls/syscall_32.tbl | 4 +
arch/x86/entry/syscalls/syscall_64.tbl | 4 +
arch/xtensa/kernel/syscalls/syscall.tbl | 4 +
fs/xattr.c | 107 ++++++++++++++++----
include/asm-generic/audit_change_attr.h | 6 ++
include/linux/syscalls.h | 8 ++
include/uapi/asm-generic/unistd.h | 12 ++-
include/uapi/linux/fcntl.h | 2 +
23 files changed, 185 insertions(+), 24 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 8ebacf37a8cf..1dc58a5dc730 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -490,3 +490,7 @@
558 common process_mrelease sys_process_mrelease
559 common futex_waitv sys_futex_waitv
560 common set_mempolicy_home_node sys_ni_syscall
+561 common setxattrat sys_setxattrat
+562 common getxattrat sys_getxattrat
+563 common listxattrat sys_listxattrat
+564 common removexattrat sys_removexattrat
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index ac964612d8b0..f0e9d9d487f0 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -464,3 +464,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 037feba03a51..63a8a9c4abc1 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -39,7 +39,7 @@
#define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800)

-#define __NR_compat_syscalls 451
+#define __NR_compat_syscalls 455
#endif

#define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 604a2053d006..cd6ac63376d1 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -907,6 +907,14 @@ __SYSCALL(__NR_process_mrelease, sys_process_mrelease)
__SYSCALL(__NR_futex_waitv, sys_futex_waitv)
#define __NR_set_mempolicy_home_node 450
__SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
+#define __NR_setxattrat 451
+__SYSCALL(__NR_setxattrat, sys_setxattrat)
+#define __NR_getxattrat 452
+__SYSCALL(__NR_getxattrat, sys_getxattrat)
+#define __NR_listxattrat 453
+__SYSCALL(__NR_listxattrat, sys_listxattrat)
+#define __NR_removexattrat 454
+__SYSCALL(__NR_removexattrat, sys_removexattrat)

/*
* Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 72c929d9902b..fe9aea54222c 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -371,3 +371,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index b1f3940bc298..0847efdee734 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -450,3 +450,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 820145e47350..7f619bbc718d 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -456,3 +456,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 253ff994ed2e..5e4206c0aede 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -389,3 +389,7 @@
448 n32 process_mrelease sys_process_mrelease
449 n32 futex_waitv sys_futex_waitv
450 n32 set_mempolicy_home_node sys_set_mempolicy_home_node
+451 n32 setxattrat sys_setxattrat
+452 n32 getxattrat sys_getxattrat
+453 n32 listxattrat sys_listxattrat
+454 n32 removexattrat sys_removexattrat
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 3f1886ad9d80..df0f053e76cd 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -365,3 +365,7 @@
448 n64 process_mrelease sys_process_mrelease
449 n64 futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 n64 setxattrat sys_setxattrat
+452 n64 getxattrat sys_getxattrat
+453 n64 listxattrat sys_listxattrat
+454 n64 removexattrat sys_removexattrat
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 8f243e35a7b2..09ec31ad475f 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -438,3 +438,7 @@
448 o32 process_mrelease sys_process_mrelease
449 o32 futex_waitv sys_futex_waitv
450 o32 set_mempolicy_home_node sys_set_mempolicy_home_node
+451 o32 setxattrat sys_setxattrat
+452 o32 getxattrat sys_getxattrat
+453 o32 listxattrat sys_listxattrat
+454 o32 removexattrat sys_removexattrat
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 0e42fceb2d5e..0123f895a674 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -448,3 +448,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index a0be127475b1..06fd4153f0d1 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -537,3 +537,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index b68f47541169..9babd831fe1e 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -453,3 +453,7 @@
448 common process_mrelease sys_process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat sys_removexattrat
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 2de85c977f54..d4daa8afe45c 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -453,3 +453,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 4398cc6fb68d..510d5175f80a 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -496,3 +496,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 320480a8db4f..8488cc157fe0 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -455,3 +455,7 @@
448 i386 process_mrelease sys_process_mrelease
449 i386 futex_waitv sys_futex_waitv
450 i386 set_mempolicy_home_node sys_set_mempolicy_home_node
+451 i386 setxattrat sys_setxattrat
+452 i386 getxattrat sys_getxattrat
+453 i386 listxattrat sys_listxattrat
+454 i386 removexattrat sys_removexattrat
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index c84d12608cd2..f45d723d5a30 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -372,6 +372,10 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat

#
# Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 52c94ab5c205..dbafe441a83f 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -421,3 +421,7 @@
448 common process_mrelease sys_process_mrelease
449 common futex_waitv sys_futex_waitv
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
+451 common setxattrat sys_setxattrat
+452 common getxattrat sys_getxattrat
+453 common listxattrat sys_listxattrat
+454 common removexattrat sys_removexattrat
diff --git a/fs/xattr.c b/fs/xattr.c
index fcf67d80d7f9..a57ce39483d7 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -656,21 +656,34 @@ setxattr(struct mnt_idmap *idmap, struct dentry *d,
return error;
}

-static int path_setxattr(const char __user *pathname,
+static int do_setxattrat(int dfd, const char __user *pathname,
const char __user *name, const void __user *value,
- size_t size, int flags, unsigned int lookup_flags)
+ size_t size, int flags)
{
struct path path;
int error;
+ int lookup_flags;

+ /* AT_ and XATTR_ flags must not overlap. */
+ BUILD_BUG_ON(XATTR_CREATE != AT_XATTR_CREATE);
+ BUILD_BUG_ON(XATTR_REPLACE != AT_XATTR_REPLACE);
+ #define AT_XATTR__FLAGS (AT_XATTR_CREATE | AT_XATTR_REPLACE)
+ BUILD_BUG_ON(((AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH) & AT_XATTR__FLAGS) != 0);
+
+ if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH | AT_XATTR__FLAGS)) != 0)
+ return -EINVAL;
+
+ lookup_flags = (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW;
+ if (flags & AT_EMPTY_PATH)
+ lookup_flags |= LOOKUP_EMPTY;
retry:
- error = user_path_at(AT_FDCWD, pathname, lookup_flags, &path);
+ error = user_path_at(dfd, pathname, lookup_flags, &path);
if (error)
return error;
error = mnt_want_write(path.mnt);
if (!error) {
error = setxattr(mnt_idmap(path.mnt), path.dentry, name,
- value, size, flags);
+ value, size, flags & AT_XATTR__FLAGS);
mnt_drop_write(path.mnt);
}
path_put(&path);
@@ -681,18 +694,25 @@ static int path_setxattr(const char __user *pathname,
return error;
}

+SYSCALL_DEFINE6(setxattrat, int, dfd, const char __user *, pathname,
+ const char __user *, name, const void __user *, value,
+ size_t, size, int, flags)
+{
+ return do_setxattrat(dfd, pathname, name, value, size, flags);
+}
+
SYSCALL_DEFINE5(setxattr, const char __user *, pathname,
const char __user *, name, const void __user *, value,
size_t, size, int, flags)
{
- return path_setxattr(pathname, name, value, size, flags, LOOKUP_FOLLOW);
+ return do_setxattrat(AT_FDCWD, pathname, name, value, size, flags);
}

SYSCALL_DEFINE5(lsetxattr, const char __user *, pathname,
const char __user *, name, const void __user *, value,
size_t, size, int, flags)
{
- return path_setxattr(pathname, name, value, size, flags, 0);
+ return do_setxattrat(AT_FDCWD, pathname, name, value, size, flags | AT_SYMLINK_NOFOLLOW);
}

SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
@@ -775,14 +795,22 @@ getxattr(struct mnt_idmap *idmap, struct dentry *d,
return error;
}

-static ssize_t path_getxattr(const char __user *pathname,
+static ssize_t do_getxattrat(int dfd, const char __user *pathname,
const char __user *name, void __user *value,
- size_t size, unsigned int lookup_flags)
+ size_t size, int flags)
{
struct path path;
ssize_t error;
+ int lookup_flags;
+
+ if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
+ return -EINVAL;
+
+ lookup_flags = (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW;
+ if (flags & AT_EMPTY_PATH)
+ lookup_flags |= LOOKUP_EMPTY;
retry:
- error = user_path_at(AT_FDCWD, pathname, lookup_flags, &path);
+ error = user_path_at(dfd, pathname, lookup_flags, &path);
if (error)
return error;
error = getxattr(mnt_idmap(path.mnt), path.dentry, name, value, size);
@@ -794,16 +822,23 @@ static ssize_t path_getxattr(const char __user *pathname,
return error;
}

+SYSCALL_DEFINE6(getxattrat, int, dfd, const char __user *, pathname,
+ const char __user *, name, void __user *, value, size_t, size,
+ int, flags)
+{
+ return do_getxattrat(dfd, pathname, name, value, size, flags);
+}
+
SYSCALL_DEFINE4(getxattr, const char __user *, pathname,
const char __user *, name, void __user *, value, size_t, size)
{
- return path_getxattr(pathname, name, value, size, LOOKUP_FOLLOW);
+ return do_getxattrat(AT_FDCWD, pathname, name, value, size, 0);
}

SYSCALL_DEFINE4(lgetxattr, const char __user *, pathname,
const char __user *, name, void __user *, value, size_t, size)
{
- return path_getxattr(pathname, name, value, size, 0);
+ return do_getxattrat(AT_FDCWD, pathname, name, value, size, AT_SYMLINK_NOFOLLOW);
}

SYSCALL_DEFINE4(fgetxattr, int, fd, const char __user *, name,
@@ -853,13 +888,21 @@ listxattr(struct dentry *d, char __user *list, size_t size)
return error;
}

-static ssize_t path_listxattr(const char __user *pathname, char __user *list,
- size_t size, unsigned int lookup_flags)
+static ssize_t do_listxattrat(int dfd, const char __user *pathname, char __user *list,
+ size_t size, int flags)
{
struct path path;
ssize_t error;
+ int lookup_flags;
+
+ if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
+ return -EINVAL;
+
+ lookup_flags = (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW;
+ if (flags & AT_EMPTY_PATH)
+ lookup_flags |= LOOKUP_EMPTY;
retry:
- error = user_path_at(AT_FDCWD, pathname, lookup_flags, &path);
+ error = user_path_at(dfd, pathname, lookup_flags, &path);
if (error)
return error;
error = listxattr(path.dentry, list, size);
@@ -871,16 +914,22 @@ static ssize_t path_listxattr(const char __user *pathname, char __user *list,
return error;
}

+SYSCALL_DEFINE5(listxattrat, int, dfd, const char __user *, pathname, char __user *, list,
+ size_t, size, int, flags)
+{
+ return do_listxattrat(dfd, pathname, list, size, flags);
+}
+
SYSCALL_DEFINE3(listxattr, const char __user *, pathname, char __user *, list,
size_t, size)
{
- return path_listxattr(pathname, list, size, LOOKUP_FOLLOW);
+ return do_listxattrat(AT_FDCWD, pathname, list, size, 0);
}

SYSCALL_DEFINE3(llistxattr, const char __user *, pathname, char __user *, list,
size_t, size)
{
- return path_listxattr(pathname, list, size, 0);
+ return do_listxattrat(AT_FDCWD, pathname, list, size, AT_SYMLINK_NOFOLLOW);
}

SYSCALL_DEFINE3(flistxattr, int, fd, char __user *, list, size_t, size)
@@ -899,7 +948,7 @@ SYSCALL_DEFINE3(flistxattr, int, fd, char __user *, list, size_t, size)
/*
* Extended attribute REMOVE operations
*/
-static long
+static int
removexattr(struct mnt_idmap *idmap, struct dentry *d,
const char __user *name)
{
@@ -918,13 +967,21 @@ removexattr(struct mnt_idmap *idmap, struct dentry *d,
return vfs_removexattr(idmap, d, kname);
}

-static int path_removexattr(const char __user *pathname,
- const char __user *name, unsigned int lookup_flags)
+static int do_removexattrat(int dfd, const char __user *pathname,
+ const char __user *name, int flags)
{
struct path path;
int error;
+ int lookup_flags;
+
+ if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
+ return -EINVAL;
+
+ lookup_flags = (flags & AT_SYMLINK_NOFOLLOW) ? 0 : LOOKUP_FOLLOW;
+ if (flags & AT_EMPTY_PATH)
+ lookup_flags |= LOOKUP_EMPTY;
retry:
- error = user_path_at(AT_FDCWD, pathname, lookup_flags, &path);
+ error = user_path_at(dfd, pathname, lookup_flags, &path);
if (error)
return error;
error = mnt_want_write(path.mnt);
@@ -940,16 +997,22 @@ static int path_removexattr(const char __user *pathname,
return error;
}

+SYSCALL_DEFINE4(removexattrat, int, dfd, const char __user *, pathname,
+ const char __user *, name, int, flags)
+{
+ return do_removexattrat(dfd, pathname, name, flags);
+}
+
SYSCALL_DEFINE2(removexattr, const char __user *, pathname,
const char __user *, name)
{
- return path_removexattr(pathname, name, LOOKUP_FOLLOW);
+ return do_removexattrat(AT_FDCWD, pathname, name, 0);
}

SYSCALL_DEFINE2(lremovexattr, const char __user *, pathname,
const char __user *, name)
{
- return path_removexattr(pathname, name, 0);
+ return do_removexattrat(AT_FDCWD, pathname, name, AT_SYMLINK_NOFOLLOW);
}

SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
diff --git a/include/asm-generic/audit_change_attr.h b/include/asm-generic/audit_change_attr.h
index 331670807cf0..cc840537885f 100644
--- a/include/asm-generic/audit_change_attr.h
+++ b/include/asm-generic/audit_change_attr.h
@@ -11,9 +11,15 @@ __NR_lchown,
__NR_fchown,
#endif
__NR_setxattr,
+#ifdef __NR_setxattrat
+__NR_setxattrat,
+#endif
__NR_lsetxattr,
__NR_fsetxattr,
__NR_removexattr,
+#ifdef __NR_removexattrat
+__NR_removexattrat,
+#endif
__NR_lremovexattr,
__NR_fremovexattr,
#ifdef __NR_fchownat
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 33a0ee3bcb2e..0612661c9eca 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -350,23 +350,31 @@ asmlinkage long sys_io_uring_register(unsigned int fd, unsigned int op,
/* fs/xattr.c */
asmlinkage long sys_setxattr(const char __user *path, const char __user *name,
const void __user *value, size_t size, int flags);
+asmlinkage long sys_setxattrat(int dfd, const char __user *path, const char __user *name,
+ const void __user *value, size_t size, int flags);
asmlinkage long sys_lsetxattr(const char __user *path, const char __user *name,
const void __user *value, size_t size, int flags);
asmlinkage long sys_fsetxattr(int fd, const char __user *name,
const void __user *value, size_t size, int flags);
asmlinkage long sys_getxattr(const char __user *path, const char __user *name,
void __user *value, size_t size);
+asmlinkage long sys_getxattrat(int dfd, const char __user *path, const char __user *name,
+ void __user *value, size_t size, int flags);
asmlinkage long sys_lgetxattr(const char __user *path, const char __user *name,
void __user *value, size_t size);
asmlinkage long sys_fgetxattr(int fd, const char __user *name,
void __user *value, size_t size);
asmlinkage long sys_listxattr(const char __user *path, char __user *list,
size_t size);
+asmlinkage long sys_listxattrat(int dfd, const char __user *path, char __user *list,
+ size_t size, int flags);
asmlinkage long sys_llistxattr(const char __user *path, char __user *list,
size_t size);
asmlinkage long sys_flistxattr(int fd, char __user *list, size_t size);
asmlinkage long sys_removexattr(const char __user *path,
const char __user *name);
+asmlinkage long sys_removexattrat(int dfd, const char __user *path,
+ const char __user *name, int flags);
asmlinkage long sys_lremovexattr(const char __user *path,
const char __user *name);
asmlinkage long sys_fremovexattr(int fd, const char __user *name);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 45fa180cc56a..4fcc71612b7a 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -886,8 +886,18 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
#define __NR_set_mempolicy_home_node 450
__SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)

+/* fs/xattr.c */
+#define __NR_setxattrat 451
+__SYSCALL(__NR_setxattrat, sys_setxattrat)
+#define __NR_getxattrat 452
+__SYSCALL(__NR_getxattrat, sys_getxattrat)
+#define __NR_listxattrat 453
+__SYSCALL(__NR_listxattrat, sys_listxattrat)
+#define __NR_removexattrat 454
+__SYSCALL(__NR_removexattrat, sys_removexattrat)
+
#undef __NR_syscalls
-#define __NR_syscalls 451
+#define __NR_syscalls 455

/*
* 32 bit systems traditionally used different
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index e8c07da58c9f..b456547c8460 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -96,6 +96,8 @@
#define AT_FDCWD -100 /* Special value used to indicate
openat should use the current
working directory. */
+#define AT_XATTR_CREATE 0x1 /* setxattrat(2): set value, fail if attr already exists */
+#define AT_XATTR_REPLACE 0x2 /* setxattrat(2): set value, fail if attr does not exist */
#define AT_SYMLINK_NOFOLLOW 0x100 /* Do not follow symbolic links. */
#define AT_EACCESS 0x200 /* Test access permitted for
effective IDs, not real IDs. */
--
2.40.1



2023-05-15 10:38:31

by Christian Brauner

Subject: Re: [RFC PATCH v2] fs/xattr: add *at family syscalls

On Thu, May 11, 2023 at 05:08:02PM +0200, Christian Göttsche wrote:
> Add the four syscalls setxattrat(), getxattrat(), listxattrat() and
> removexattrat(). Those can be used to operate on extended attributes,
> especially security related ones, either relative to a pinned directory
> or on a file descriptor without read access, avoiding a
> /proc/<pid>/fd/<fd> detour, requiring a mounted procfs.
>
> One use case will be setfiles(8) setting SELinux file contexts
> ("security.selinux") without race conditions.
>
> Add XATTR flags to the private namespace of AT_* flags.
>
> Use the do_{name}at() pattern from fs/open.c.
>
> Use a single flag parameter for extended attribute flags (currently
> XATTR_CREATE and XATTR_REPLACE) and *at() flags to not exceed six
> syscall arguments in setxattrat().
>
> Previous approach ("f*xattr: allow O_PATH descriptors"): https://lore.kernel.org/all/[email protected]/
> v1 discussion: https://lore.kernel.org/all/[email protected]/
>
> Signed-off-by: Christian Göttsche <[email protected]>
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> CC: [email protected]
> ---

Fwiw, your header doesn't let me see who the mail was directly sent to
so I'm only able to reply to lists which is a bit pointless...

> v2:
> - squash syscall introduction and wire up commits
> - add AT_XATTR_CREATE and AT_XATTR_REPLACE constants

> +#define AT_XATTR_CREATE 0x1 /* setxattrat(2): set value, fail if attr already exists */
> +#define AT_XATTR_REPLACE 0x2 /* setxattrat(2): set value, fail if attr does not exist */

We really shouldn't waste any AT_* flags for this. Otherwise we'll run
out of them rather quickly. Two weeks ago we added another AT_* flag
which is up for merging for v6.5 iirc and I've glimpsed another AT_*
flag proposal in one of the talks at last week's Vancouver conference
extravaganza.

Even if we reuse 0x200 for AT_XATTR_CREATE (like we did for AT_EACCESS
and AT_REMOVEDIR) we still need another bit for AT_XATTR_REPLACE.

Plus, this is really ugly since AT_XATTR_{CREATE,REPLACE} really isn't
in any way related to lookup and we're mixing it in with lookup
modifying flags.

So my proposal for {g,s}etxattrat() would be:

struct xattr_args {
__aligned_u64 value;
__u32 size;
__u32 cmd;
};

So everything's nicely 64bit aligned in the struct. Use the @cmd member
to set either XATTR_REPLACE or XATTR_CREATE and treat it as a proper
enum and not as a flag argument like the old calls did.
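
As a rough illustration of the layout and of the "proper enum" handling of @cmd, here is a minimal userspace sketch. It assumes uint64_t/uint32_t as stand-ins for __aligned_u64/__u32, and it assumes that cmd == 0 means "no create/replace constraint", which the proposal does not spell out. The helper name xattr_args_check_cmd() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/xattr.h. */
#define XATTR_CREATE  0x1 /* set value, fail if attr already exists */
#define XATTR_REPLACE 0x2 /* set value, fail if attr does not exist */

/* Userspace stand-in for the proposed struct (not a released uapi type). */
struct xattr_args {
	uint64_t value; /* user pointer to the value buffer, carried as an integer */
	uint32_t size;  /* size of the value buffer in bytes */
	uint32_t cmd;   /* 0 (assumed: no constraint), XATTR_CREATE or XATTR_REPLACE */
};

/* The struct is expected to have no padding: 8 + 4 + 4 bytes. */
static_assert(sizeof(struct xattr_args) == 16, "unexpected padding");
static_assert(offsetof(struct xattr_args, size) == 8, "size follows value");
static_assert(offsetof(struct xattr_args, cmd) == 12, "cmd follows size");

/*
 * Hypothetical helper: accept exactly one known command value and
 * reject bitwise combinations such as XATTR_CREATE | XATTR_REPLACE.
 */
static int xattr_args_check_cmd(uint32_t cmd)
{
	switch (cmd) {
	case 0:
	case XATTR_CREATE:
	case XATTR_REPLACE:
		return 0;
	default:
		return -EINVAL;
	}
}
```

Treating @cmd as an enum means the combination 0x3 is rejected in one place, rather than relying on a later "both flags set" check as the flag-based calls do.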

So then we'd have:

setxattrat(int dfd, const char *path, const char __user *name,
struct xattr_args __user *args, size_t size, unsigned int flags)
getxattrat(int dfd, const char *path, const char __user *name,
struct xattr_args __user *args, size_t size, unsigned int flags)

The current in-kernel struct xattr_ctx would be renamed to struct
kernel_xattr_args and then we do the usual copy_struct_from_user()
dance:

struct xattr_args args;
err = copy_struct_from_user(&args, sizeof(args), uargs, usize);

and then go on to handle value/size for setxattrat()/getxattrat()
accordingly.
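
For readers unfamiliar with that dance, the size handling can be modelled in userspace roughly as follows. This is only a sketch: model_copy_struct() is a hypothetical name, plain memcpy() stands in for the user copy (so the -EFAULT case is not modelled), and the struct uses uint64_t/uint32_t in place of the kernel types.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the proposed struct. */
struct xattr_args {
	uint64_t value;
	uint32_t size;
	uint32_t cmd;
};

static_assert(sizeof(struct xattr_args) == 16, "unexpected padding");

/*
 * Model of the extensible-struct rules:
 *  - usize < ksize: older caller, the missing tail of dst is zero-filled
 *  - usize > ksize: newer caller, the extra bytes must all be zero,
 *    otherwise -E2BIG is returned
 *  - the common prefix is copied in every successful case
 */
static int model_copy_struct(void *dst, size_t ksize,
			     const void *src, size_t usize)
{
	size_t size = ksize < usize ? ksize : usize;
	const unsigned char *tail = (const unsigned char *)src + size;
	size_t i;

	if (usize > ksize) {
		for (i = 0; i < usize - ksize; i++)
			if (tail[i] != 0)
				return -E2BIG;
	} else if (usize < ksize) {
		memset((unsigned char *)dst + size, 0, ksize - size);
	}

	memcpy(dst, src, size);
	return 0;
}
```

The point of the scheme is that the struct can grow later without a new syscall: old binaries pass a smaller size and get zeroed defaults, while new binaries on an old kernel fail cleanly only if they actually use a field the kernel does not know.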

getxattr()/setxattr() aren't meaningfully filterable by seccomp already
so there's no point in not using a struct.

If that isn't very appealing then another option is to add a new flag
namespace just for setxattrat() similar to fspick() and move_mount()
duplicating the needed lookup modifying flags.
Thoughts?

2023-05-15 13:12:48

by Amir Goldstein

Subject: Re: [RFC PATCH v2] fs/xattr: add *at family syscalls

On Mon, May 15, 2023 at 1:33 PM Christian Brauner <[email protected]> wrote:
>
> On Thu, May 11, 2023 at 05:08:02PM +0200, Christian Göttsche wrote:
> > Add the four syscalls setxattrat(), getxattrat(), listxattrat() and
> > removexattrat(). Those can be used to operate on extended attributes,
> > especially security related ones, either relative to a pinned directory
> > or on a file descriptor without read access, avoiding a
> > /proc/<pid>/fd/<fd> detour, requiring a mounted procfs.
> >
> > One use case will be setfiles(8) setting SELinux file contexts
> > ("security.selinux") without race conditions.
> >
> > Add XATTR flags to the private namespace of AT_* flags.
> >
> > Use the do_{name}at() pattern from fs/open.c.
> >
> > Use a single flag parameter for extended attribute flags (currently
> > XATTR_CREATE and XATTR_REPLACE) and *at() flags to not exceed six
> > syscall arguments in setxattrat().
> >
> > Previous approach ("f*xattr: allow O_PATH descriptors"): https://lore.kernel.org/all/[email protected]/
> > v1 discussion: https://lore.kernel.org/all/[email protected]/
> >
> > Signed-off-by: Christian Göttsche <[email protected]>
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > CC: [email protected]
> > ---
>
> Fwiw, your header doesn't let me see who the mail was directly sent to
> so I'm only able to reply to lists which is a bit pointless...
>
> > v2:
> > - squash syscall introduction and wire up commits
> > - add AT_XATTR_CREATE and AT_XATTR_REPLACE constants
>
> > +#define AT_XATTR_CREATE 0x1 /* setxattrat(2): set value, fail if attr already exists */
> > +#define AT_XATTR_REPLACE 0x2 /* setxattrat(2): set value, fail if attr does not exist */
>
> We really shouldn't waste any AT_* flags for this. Otherwise we'll run
> out of them rather quickly. Two weeks ago we added another AT_* flag
> which is up for merging for v6.5 iirc and I've glimpsed another AT_*
> flag proposal in one of the talks at last week's Vancouver conference
> extravaganza.
>
> Even if we reuse 0x200 for AT_XATTR_CREATE (like we did for AT_EACCESS
> and AT_REMOVEDIR) we still need another bit for AT_XATTR_REPLACE.
>
> Plus, this is really ugly since AT_XATTR_{CREATE,REPLACE} really isn't
> in any way related to lookup and we're mixing it in with lookup
> modifying flags.
>
> So my proposal for {g,s}etxattrat() would be:
>
> struct xattr_args {
> __aligned_u64 value;
> __u32 size;
> __u32 cmd;
> };
>
> So everything's nicely 64bit aligned in the struct. Use the @cmd member
> to set either XATTR_REPLACE or XATTR_CREATE and treat it as a proper
> enum and not as a flag argument like the old calls did.
>
> So then we'd have:
>
> setxattrat(int dfd, const char *path, const char __user *name,
> struct xattr_args __user *args, size_t size, unsigned int flags)
> getxattrat(int dfd, const char *path, const char __user *name,
> struct xattr_args __user *args, size_t size, unsigned int flags)
>
> The current in-kernel struct xattr_ctx would be renamed to struct
> kernel_xattr_args and then we do the usual copy_struct_from_user()
> dance:
>
> struct xattr_args args;
> err = copy_struct_from_user(&args, sizeof(args), uargs, usize);
>
> and then go on to handle value/size for setxattrat()/getxattrat()
> accordingly.
>
> getxattr()/setxattr() aren't meaningfully filterable by seccomp already
> so there's no point in not using a struct.
>
> If that isn't very appealing then another option is to add a new flag
> namespace just for setxattrat() similar to fspick() and move_mount()
> duplicating the needed lookup modifying flags.
> Thoughts?

Here is a thought: I am not sure if I am sorry we did not discuss this API
issue in LSFMM or happy that we did not waste our time on this... :-/

I must say that I dislike redefined flag namespaces like FSPICK_*
just as much as I dislike overloading the AT_* namespace and TBH,
I am not crazy about avoiding this problem with xattr_args either.

A more sane solution IMO could have been:
- Use lower word of flags for generic AT_ flags
- Use the upper word of flags for syscall specific flags

So if it were up to me, I would vote for starting this practice:

+ /* Start of syscall specific range */
+ #define AT_XATTR_CREATE 0x10000 /* setxattrat(2): set value, fail if attr already exists */
+ #define AT_XATTR_REPLACE 0x20000 /* setxattrat(2): set value, fail if attr does not exist */

Which coincidentally happens to be in line with my AT_HANDLE_FID patch...

Sure, we will have some special cases like MOVE_MOUNT_* and
legacy pollution to the lower AT_ flags word, but as a generic solution
for syscalls that need the common AT_ lookup flags and just a few
private flags, that seems like the lesser evil to me.
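
As a rough illustration of the lower-word/upper-word split described above, here is a minimal userspace sketch in C. The mask names and the helper split_setxattrat_flags() are hypothetical and exist only for this example; the AT_XATTR_* values are the ones proposed in this mail and are not part of any released header.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical masks: lower word for generic AT_* flags, upper word private. */
#define AT_GENERIC_MASK  0x0000ffffu
#define AT_PRIVATE_MASK  0xffff0000u

/* Existing generic lookup flag, value as in the uapi fcntl header. */
#define AT_SYMLINK_NOFOLLOW 0x100u

/* Values proposed in this mail for the syscall specific range. */
#define AT_XATTR_CREATE  0x10000u
#define AT_XATTR_REPLACE 0x20000u

static_assert((AT_GENERIC_MASK & AT_PRIVATE_MASK) == 0,
              "generic and private words must not overlap");

/*
 * Split a combined flags word into the generic lookup part and the
 * setxattrat()-private part. Unknown private bits yield -EINVAL.
 * Validation of the generic part is left to the lookup code.
 */
int split_setxattrat_flags(uint32_t flags, uint32_t *lookup, uint32_t *xattr)
{
    uint32_t priv = flags & AT_PRIVATE_MASK;

    if (priv & ~(AT_XATTR_CREATE | AT_XATTR_REPLACE))
        return -EINVAL;

    *lookup = flags & AT_GENERIC_MASK;
    *xattr = priv;
    return 0;
}
```

With this layout a caller would pass, for example, AT_SYMLINK_NOFOLLOW | AT_XATTR_CREATE in a single argument, and the two halves never collide as long as generic flags stay in the lower word.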

Thanks,
Amir.

2023-05-15 13:54:06

by Christian Brauner

[permalink] [raw]
Subject: Re: [RFC PATCH v2] fs/xattr: add *at family syscalls

On Mon, May 15, 2023 at 04:04:21PM +0300, Amir Goldstein wrote:
> On Mon, May 15, 2023 at 1:33 PM Christian Brauner <[email protected]> wrote:
> >
> > On Thu, May 11, 2023 at 05:08:02PM +0200, Christian Göttsche wrote:
> > > Add the four syscalls setxattrat(), getxattrat(), listxattrat() and
> > > removexattrat(). Those can be used to operate on extended attributes,
> > > especially security related ones, either relative to a pinned directory
> > > or on a file descriptor without read access, avoiding a
> > > /proc/<pid>/fd/<fd> detour, requiring a mounted procfs.
> > >
> > > One use case will be setfiles(8) setting SELinux file contexts
> > > ("security.selinux") without race conditions.
> > >
> > > Add XATTR flags to the private namespace of AT_* flags.
> > >
> > > Use the do_{name}at() pattern from fs/open.c.
> > >
> > > Use a single flag parameter for extended attribute flags (currently
> > > XATTR_CREATE and XATTR_REPLACE) and *at() flags to not exceed six
> > > syscall arguments in setxattrat().
> > >
> > > Previous approach ("f*xattr: allow O_PATH descriptors"): https://lore.kernel.org/all/[email protected]/
> > > v1 discussion: https://lore.kernel.org/all/[email protected]/
> > >
> > > Signed-off-by: Christian Göttsche <[email protected]>
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > CC: [email protected]
> > > ---
> >
> > Fwiw, your header doesn't let me see who the mail was directly sent to
> > so I'm only able to reply to lists which is a bit pointless...
> >
> > > v2:
> > > - squash syscall introduction and wire up commits
> > > - add AT_XATTR_CREATE and AT_XATTR_REPLACE constants
> >
> > > +#define AT_XATTR_CREATE 0x1 /* setxattrat(2): set value, fail if attr already exists */
> > > +#define AT_XATTR_REPLACE 0x2 /* setxattrat(2): set value, fail if attr does not exist */
> >
> > We really shouldn't waste any AT_* flags for this. Otherwise we'll run
> > out of them rather quickly. Two weeks ago we added another AT_* flag
> > which is up for merging for v6.5 iirc and I've glimpsed another AT_*
> > flag proposal in one of the talks at last weeks Vancouver conference
> > extravaganza.
> >
> > Even if we reuse 0x200 for AT_XATTR_CREATE (like we did for AT_EACCESS
> > and AT_REMOVEDIR) we still need another bit for AT_XATTR_REPLACE.
> >
> > Plus, this is really ugly since AT_XATTR_{CREATE,REPLACE} really isn't
> > in any way related to lookup and we're mixing it in with lookup
> > modifying flags.
> >
> > So my proposal for {g,s}etxattrat() would be:
> >
> > struct xattr_args {
> > __aligned_u64 value;
> > __u32 size;
> > __u32 cmd;
> > };
> >
> > So everything's nicely 64bit aligned in the struct. Use the @cmd member
> > to set either XATTR_REPLACE or XATTR_CREATE and treat it as a proper
> > enum and not as a flag argument like the old calls did.
> >
> > So then we'd have:
> >
> > setxattrat(int dfd, const char *path, const char __user *name,
> > struct xattr_args __user *args, size_t size, unsigned int flags)
> > getxattrat(int dfd, const char *path, const char __user *name,
> > struct xattr_args __user *args, size_t size, unsigned int flags)
> >
> > The current in-kernel struct xattr_ctx would be renamed to struct
> > kernel_xattr_args and then we do the usual copy_struct_from_user()
> > dance:
> >
> > struct xattr_args args;
> > err = copy_struct_from_user(&args, sizeof(args), uargs, usize);
> >
> > and then go on to handle value/size for setxattrat()/getxattrat()
> > accordingly.
> >
> > getxattr()/setxattr() aren't meaningfully filterable by seccomp already
> > so there's not point in not using a struct.
> >
> > If that isn't very appealing then another option is to add a new flag
> > namespace just for setxattrat() similar to fspick() and move_mount()
> > duplicating the needed lookup modifying flags.
> > Thoughts?
>
> Here is a thought: I am not sure if I am sorry we did not discuss this API
> issue in LSFMM or happy that we did not waste our time on this... :-/
>
> I must say that I dislike redefined flag namespace like FSPICK_*
> just as much as I dislike overloading the AT_* namespace and TBH,
> I am not crazy about avoiding this problem with xattr_args either.
>
> A more sane solution IMO could have been:
> - Use lower word of flags for generic AT_ flags
> - Use the upper word of flags for syscall specific flags

We'd have 16 lower bits for AT_* flags and upper 16 bits for non-AT_*
flags. That might be ok but it isn't great because if we ever need to
extend the generally useful AT_* flags into the upper 16 bits, the
system calls that already use those bits for private flags wouldn't be
able to take them. And at the rate people keep suggesting new AT_*
flags that issue might arise quicker than we might think.

And we really don't want 64 bit flag arguments because of 32 bit
architectures as that gets really ugly to handle cleanly (Arnd has
talked a lot about issues in this area before).

>
> So if it were up to me, I would vote starting this practice:
>
> + /* Start of syscall specific range */
> + #define AT_XATTR_CREATE 0x10000 /* setxattrat(2): set value, fail if attr already exists */
> + #define AT_XATTR_REPLACE 0x20000 /* setxattrat(2): set value, fail if attr does not exist */
>
> Which coincidentally happens to be inline with my AT_HANDLE_FID patch...

This is different though. The reason AT_HANDLE_FID is acceptable is
because we need the ability to extend an existing system call and we're
reusing a bit that is already used in two other system calls. So we
avoid adding a new system call just to add another flag argument and
we're also not using up an additional AT_* bit. This makes it bearable
imho. But here we're talking about new system calls where we can avoid
this problem arising in the first place.

>
> Sure, we will have some special cases like MOVE_MOUNT_* and
> legacy pollution to the lower AT_ flags word, but as a generic solution
> for syscalls that need the common AT_ lookup flags and just a few
> private flags, that seems like the lesser evil to me.

It is fine to do this in some cases but we shouldn't encourage mixing
distinct flag namespaces, let alone advertise this as a generic
solution imho. The AT_XATTR_* flags aren't even flags; they behave like
an enum.
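
To make the "behaves like an enum" point concrete, here is a minimal sketch in C of how a single mode field could be validated. The helper name xattr_cmd_check() is hypothetical; XATTR_CREATE and XATTR_REPLACE carry their existing values from the uapi xattr header and are redefined here only so the example is self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Existing values from include/uapi/linux/xattr.h. */
#define XATTR_CREATE  0x1
#define XATTR_REPLACE 0x2

static_assert((XATTR_CREATE & XATTR_REPLACE) == 0,
              "the two modes use distinct bits");

/*
 * Hypothetical helper: treat the mode as an enum with exactly three
 * valid values instead of as a bitmask of independent flags.
 */
int xattr_cmd_check(uint32_t cmd)
{
    switch (cmd) {
    case 0:             /* create the attribute or replace its value */
    case XATTR_CREATE:  /* fail if the attribute already exists */
    case XATTR_REPLACE: /* fail if the attribute does not exist */
        return 0;
    default:            /* includes XATTR_CREATE | XATTR_REPLACE */
        return -EINVAL;
    }
}
```

Under this reading, combining both values is rejected as an invalid mode rather than being interpreted as two flags set at once.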

2023-05-15 14:34:08

by Amir Goldstein

[permalink] [raw]
Subject: Re: [RFC PATCH v2] fs/xattr: add *at family syscalls

On Mon, May 15, 2023 at 4:52 PM Christian Brauner <[email protected]> wrote:
>
> On Mon, May 15, 2023 at 04:04:21PM +0300, Amir Goldstein wrote:
> > On Mon, May 15, 2023 at 1:33 PM Christian Brauner <[email protected]> wrote:
> > >
> > > On Thu, May 11, 2023 at 05:08:02PM +0200, Christian Göttsche wrote:
> > > > Add the four syscalls setxattrat(), getxattrat(), listxattrat() and
> > > > removexattrat(). Those can be used to operate on extended attributes,
> > > > especially security related ones, either relative to a pinned directory
> > > > or on a file descriptor without read access, avoiding a
> > > > /proc/<pid>/fd/<fd> detour, requiring a mounted procfs.
> > > >
> > > > One use case will be setfiles(8) setting SELinux file contexts
> > > > ("security.selinux") without race conditions.
> > > >
> > > > Add XATTR flags to the private namespace of AT_* flags.
> > > >
> > > > Use the do_{name}at() pattern from fs/open.c.
> > > >
> > > > Use a single flag parameter for extended attribute flags (currently
> > > > XATTR_CREATE and XATTR_REPLACE) and *at() flags to not exceed six
> > > > syscall arguments in setxattrat().
> > > >
> > > > Previous approach ("f*xattr: allow O_PATH descriptors"): https://lore.kernel.org/all/[email protected]/
> > > > v1 discussion: https://lore.kernel.org/all/[email protected]/
> > > >
> > > > Signed-off-by: Christian Göttsche <[email protected]>
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > CC: [email protected]
> > > > ---
> > >
> > > Fwiw, your header doesn't let me see who the mail was directly sent to
> > > so I'm only able to reply to lists which is a bit pointless...
> > >
> > > > v2:
> > > > - squash syscall introduction and wire up commits
> > > > - add AT_XATTR_CREATE and AT_XATTR_REPLACE constants
> > >
> > > > +#define AT_XATTR_CREATE 0x1 /* setxattrat(2): set value, fail if attr already exists */
> > > > +#define AT_XATTR_REPLACE 0x2 /* setxattrat(2): set value, fail if attr does not exist */
> > >
> > > We really shouldn't waste any AT_* flags for this. Otherwise we'll run
> > > out of them rather quickly. Two weeks ago we added another AT_* flag
> > > which is up for merging for v6.5 iirc and I've glimpsed another AT_*
> > > flag proposal in one of the talks at last weeks Vancouver conference
> > > extravaganza.
> > >
> > > Even if we reuse 0x200 for AT_XATTR_CREATE (like we did for AT_EACCESS
> > > and AT_REMOVEDIR) we still need another bit for AT_XATTR_REPLACE.
> > >
> > > Plus, this is really ugly since AT_XATTR_{CREATE,REPLACE} really isn't
> > > in any way related to lookup and we're mixing it in with lookup
> > > modifying flags.
> > >
> > > So my proposal for {g,s}etxattrat() would be:
> > >
> > > struct xattr_args {
> > > __aligned_u64 value;
> > > __u32 size;
> > > __u32 cmd;
> > > };
> > >
> > > So everything's nicely 64bit aligned in the struct. Use the @cmd member
> > > to set either XATTR_REPLACE or XATTR_CREATE and treat it as a proper
> > > enum and not as a flag argument like the old calls did.
> > >
> > > So then we'd have:
> > >
> > > setxattrat(int dfd, const char *path, const char __user *name,
> > > struct xattr_args __user *args, size_t size, unsigned int flags)
> > > getxattrat(int dfd, const char *path, const char __user *name,
> > > struct xattr_args __user *args, size_t size, unsigned int flags)
> > >
> > > The current in-kernel struct xattr_ctx would be renamed to struct
> > > kernel_xattr_args and then we do the usual copy_struct_from_user()
> > > dance:
> > >
> > > struct xattr_args args;
> > > err = copy_struct_from_user(&args, sizeof(args), uargs, usize);
> > >
> > > and then go on to handle value/size for setxattrat()/getxattrat()
> > > accordingly.
> > >
> > > getxattr()/setxattr() aren't meaningfully filterable by seccomp already
> > > so there's not point in not using a struct.
> > >
> > > If that isn't very appealing then another option is to add a new flag
> > > namespace just for setxattrat() similar to fspick() and move_mount()
> > > duplicating the needed lookup modifying flags.
> > > Thoughts?
> >
> > Here is a thought: I am not sure if I am sorry we did not discuss this API
> > issue in LSFMM or happy that we did not waste our time on this... :-/
> >
> > I must say that I dislike redefined flag namespace like FSPICK_*
> > just as much as I dislike overloading the AT_* namespace and TBH,
> > I am not crazy about avoiding this problem with xattr_args either.
> >
> > A more sane solution IMO could have been:
> > - Use lower word of flags for generic AT_ flags
> > - Use the upper word of flags for syscall specific flags
>
> We'd have 16 lower bits for AT_* flags and upper 16 bits for non-AT_*
> flags. That might be ok but it isn't great because if we ever extend
> AT_* flags into the upper 16 bits that are generally useful for all
> AT_* flag taking system calls we'd not be able to use them. And at the
> rate people keep suggesting new AT_* flags that issue might arise
> quicker than we might think.
>
> And we really don't want 64 bit flag arguments because of 32 bit
> architectures as that gets really ugly to handle cleanly (Arnd has
> talked a lot about issues in this area before).
>
> >
> > So if it were up to me, I would vote starting this practice:
> >
> > + /* Start of syscall specific range */
> > + #define AT_XATTR_CREATE 0x10000 /* setxattrat(2): set value, fail if attr already exists */
> > + #define AT_XATTR_REPLACE 0x20000 /* setxattrat(2): set value, fail if attr does not exist */
> >
> > Which coincidentally happens to be inline with my AT_HANDLE_FID patch...
>
> This is different though. The reason AT_HANDLE_FID is acceptable is
> because we need the ability to extend an existing system call and we're
> reusing a bit that is already used in two other system calls. So we
> avoid adding a new system call just to add another flag argument and
> we're also not using up an additional AT_* bit. This makes it bearable
> imho. But here we're talking about new system calls where we can avoid
> this problem arising in the first place.
>
> >
> > Sure, we will have some special cases like MOVE_MOUNT_* and
> > legacy pollution to the lower AT_ flags word, but as a generic solution
> > for syscalls that need the common AT_ lookup flags and just a few
> > private flags, that seems like the lesser evil to me.
>
> It is fine to do this in some cases but we shouldn't encourage mixing
> distinct flag namespaces let alone advertising this as a generic
> solution imho. The AT_XATTR_* flags aren't even flags they behave like
> an enum.

OK. I see your point.
Also, wrt struct xattr_args, there is sort of a precedent with
XFS_IOC_ATTRMULTI_BY_HANDLE ioctl, struct xfs_attr_multiop
and flags XFS_IOC_ATTR_{CREATE,REPLACE}.

Just a nit, I would use xattr_args field names that are the
same as the setxattr() arg names, so s/cmd/flags/.
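
For illustration, this is roughly what the struct from earlier in the thread could look like with that rename, together with a userspace emulation of the copy_struct_from_user() size handling. This is only a sketch: copy_struct_sketch() is a hypothetical stand-in written for this example, not the kernel helper, and it omits the -EFAULT path since no real user memory is involved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout as proposed in the thread, with cmd renamed to flags. */
struct xattr_args {
    _Alignas(8) uint64_t value; /* pointer to the value buffer */
    uint32_t size;              /* size of the value buffer */
    uint32_t flags;             /* 0, XATTR_CREATE or XATTR_REPLACE */
};

static_assert(sizeof(struct xattr_args) == 16, "no padding expected");

/*
 * Emulates the extensible-struct rules: a shorter caller struct is
 * zero-extended, a longer one is accepted only if its unknown tail is
 * all zero bytes, otherwise -E2BIG is returned.
 */
int copy_struct_sketch(void *dst, size_t ksize, const void *src, size_t usize)
{
    size_t size = usize < ksize ? usize : ksize;
    const unsigned char *p = src;

    /* Newer caller: bytes beyond what we understand must be zero. */
    for (size_t i = ksize; i < usize; i++)
        if (p[i] != 0)
            return -E2BIG;

    /* Older caller: fields it does not know about default to zero. */
    memset(dst, 0, ksize);
    memcpy(dst, src, size);
    return 0;
}
```

The caller passes sizeof(struct xattr_args) as the size argument, which is what would let the struct grow later without adding another syscall.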

Thanks,
Amir.