Hi,
Changes from RFC v3:
- Removed the warning "fixes" patches, as they could hide potencial
bugs (Christian Brauner);
- Added "cred-specific" macros (Christian Brauner), from my side,
added a few '_' to the guards to signify that the newly introduced
helper macros are preferred.
- Changed a few guard() to scoped_guard() to fix the clang (17.0.6)
compilation error about 'goto' bypassing variable initialization;
Link to RFC v3:
https://lore.kernel.org/r/[email protected]/
Changes from RFC v2:
- Added separate patches for the warnings for the discarded const
when using the cleanup macros: one for DEFINE_GUARD() and one for
DEFINE_LOCK_GUARD_1() (I am uncertain if it's better to squash them
together);
- Reordered the series so the backing file patch is the first user of
the introduced helpers (Amir Goldstein);
- Change the definition of the cleanup "class" from a GUARD to a
LOCK_GUARD_1, which defines an implicit container, that allows us
to remove some variable declarations to store the overriden
credentials (Amir Goldstein);
- Replaced most of the uses of scoped_guard() with guard(), to reduce
the code churn, the remaining ones I wasn't sure if I was changing
the behavior: either they were nested (overrides "inside"
overrides) or something calls current_cred() (Amir Goldstein).
New questions:
- The backing file callbacks are now called with the "light"
overriden credentials, so they are kind of restricted in what they
can do with their credentials, is this acceptable in general?
- in ovl_rename() I had to manually call the "light" the overrides,
both using the guard() macro or using the non-light version causes
the workload to crash the kernel. I still have to investigate why
this is happening. Hints are appreciated.
Link to the RFC v2:
https://lore.kernel.org/r/[email protected]/
Original cover letter (lightly edited):
It was noticed that some workloads suffer from contention on
increasing/decrementing the ->usage counter in their credentials,
those refcount operations are associated with overriding/reverting the
current task credentials. (the linked thread adds more context)
In some specialized cases, overlayfs is one of them, the credentials
in question have a longer lifetime than the override/revert "critical
section". In the overlayfs case, the credentials are created when the
fs is mounted and destroyed when it's unmounted. In this case of long
lived credentials, the usage counter doesn't need to be
incremented/decremented.
Add a lighter version of credentials override/revert to be used in
these specialized cases. To make sure that the override/revert calls
are paired, add a cleanup guard macro. This was suggested here:
https://lore.kernel.org/all/20231219-marken-pochen-26d888fb9bb9@brauner/
With a small number of tweaks:
- Used inline functions instead of macros;
- A small change to store the credentials into the passed argument,
the guard is now defined as (note the added '_T ='):
DEFINE_GUARD(cred, const struct cred *, _T = override_creds_light(_T),
revert_creds_light(_T));
- Allow "const" arguments to be used with these kind of guards;
Some comments:
- If patch 1/5 and 2/5 are not a good idea (adding the cast), the
alternative I can see is using some kind of container for the
credentials;
- The only user for the backing file ops is overlayfs, so these
changes make sense, but may not make sense in the most general
case;
For the numbers, some from 'perf c2c', before this series:
(edited to fit)
#
# ----- HITM ----- Shared
# Num RmtHitm LclHitm Symbol Object Source:Line Node
# ..... ....... ....... .......................... ................ .................. ....
#
-------------------------
0 412 1028
-------------------------
41.50% 42.22% [k] revert_creds [kernel.vmlinux] atomic64_64.h:39 0 1
15.05% 10.60% [k] override_creds [kernel.vmlinux] atomic64_64.h:25 0 1
0.73% 0.58% [k] init_file [kernel.vmlinux] atomic64_64.h:25 0 1
0.24% 0.10% [k] revert_creds [kernel.vmlinux] cred.h:266 0 1
32.28% 37.16% [k] generic_permission [kernel.vmlinux] mnt_idmapping.h:81 0 1
9.47% 8.75% [k] generic_permission [kernel.vmlinux] mnt_idmapping.h:81 0 1
0.49% 0.58% [k] inode_owner_or_capable [kernel.vmlinux] mnt_idmapping.h:81 0 1
0.24% 0.00% [k] generic_permission [kernel.vmlinux] namei.c:354 0
-------------------------
1 50 103
-------------------------
100.00% 100.00% [k] update_cfs_group [kernel.vmlinux] atomic64_64.h:15 0 1
-------------------------
2 50 98
-------------------------
96.00% 96.94% [k] update_cfs_group [kernel.vmlinux] atomic64_64.h:15 0 1
2.00% 1.02% [k] update_load_avg [kernel.vmlinux] atomic64_64.h:25 0 1
0.00% 2.04% [k] update_load_avg [kernel.vmlinux] fair.c:4118 0
2.00% 0.00% [k] update_cfs_group [kernel.vmlinux] fair.c:3932 0 1
after this series:
#
# ----- HITM ----- Shared
# Num RmtHitm LclHitm Symbol Object Source:Line Node
# ..... ....... ....... .................... ................ ................ ....
#
-------------------------
0 54 88
-------------------------
100.00% 100.00% [k] update_cfs_group [kernel.vmlinux] atomic64_64.h:15 0 1
-------------------------
1 48 83
-------------------------
97.92% 97.59% [k] update_cfs_group [kernel.vmlinux] atomic64_64.h:15 0 1
2.08% 1.20% [k] update_load_avg [kernel.vmlinux] atomic64_64.h:25 0 1
0.00% 1.20% [k] update_load_avg [kernel.vmlinux] fair.c:4118 0 1
-------------------------
2 28 44
-------------------------
85.71% 79.55% [k] generic_permission [kernel.vmlinux] mnt_idmapping.h:81 0 1
14.29% 20.45% [k] generic_permission [kernel.vmlinux] mnt_idmapping.h:81 0 1
The contention is practically gone.
Link: https://lore.kernel.org/all/[email protected]/
Vinicius Costa Gomes (3):
cred: Add a light version of override/revert_creds()
fs: Optimize credentials reference count for backing file ops
overlayfs: Optimize credentials usage
fs/backing-file.c | 27 +++++-------------
fs/overlayfs/copy_up.c | 4 +--
fs/overlayfs/dir.c | 22 +++++++--------
fs/overlayfs/file.c | 63 +++++++++++++++++-------------------------
fs/overlayfs/inode.c | 60 +++++++++++++++-------------------------
fs/overlayfs/namei.c | 22 ++++-----------
fs/overlayfs/readdir.c | 16 +++--------
fs/overlayfs/util.c | 25 ++++++++---------
fs/overlayfs/xattrs.c | 33 +++++++++-------------
include/linux/cred.h | 25 +++++++++++++++++
kernel/cred.c | 6 ++--
11 files changed, 127 insertions(+), 176 deletions(-)
--
2.44.0
Add a light version of override/revert_creds(), this should only be
used when the credentials in question will outlive the critical
section and the critical section doesn't change the ->usage of the
credentials.
To make their usage less error prone, introduce cleanup guards to be
used like this:
cred_guard(credentials_to_override_and_restore);
or this:
cred_scoped_guard(credentials_to_override_and_restore) {
/* with credentials overridden */
}
Suggested-by: Christian Brauner <[email protected]>
Signed-off-by: Vinicius Costa Gomes <[email protected]>
---
include/linux/cred.h | 25 +++++++++++++++++++++++++
kernel/cred.c | 6 +++---
2 files changed, 28 insertions(+), 3 deletions(-)
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 2976f534a7a3..f4f3d55cd6a2 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -172,6 +172,31 @@ static inline bool cap_ambient_invariant_ok(const struct cred *cred)
cred->cap_inheritable));
}
+/*
+ * Override creds without bumping reference count. Caller must ensure
+ * reference remains valid or has taken reference. Almost always not the
+ * interface you want. Use override_creds()/revert_creds() instead.
+ */
+static inline const struct cred *override_creds_light(const struct cred *override_cred)
+{
+ const struct cred *old = current->cred;
+
+ rcu_assign_pointer(current->cred, override_cred);
+ return old;
+}
+
+static inline void revert_creds_light(const struct cred *revert_cred)
+{
+ rcu_assign_pointer(current->cred, revert_cred);
+}
+
+DEFINE_LOCK_GUARD_1(__cred, struct cred,
+ _T->lock = (struct cred *)override_creds_light(_T->lock),
+ revert_creds_light(_T->lock));
+
+#define cred_guard(_cred) guard(__cred)(((struct cred *)_cred))
+#define cred_scoped_guard(_cred) scoped_guard(__cred, ((struct cred *)_cred))
+
/**
* get_new_cred_many - Get references on a new set of credentials
* @cred: The new credentials to reference
diff --git a/kernel/cred.c b/kernel/cred.c
index 075cfa7c896f..da7da250f7c8 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -485,7 +485,7 @@ EXPORT_SYMBOL(abort_creds);
*/
const struct cred *override_creds(const struct cred *new)
{
- const struct cred *old = current->cred;
+ const struct cred *old;
kdebug("override_creds(%p{%ld})", new,
atomic_long_read(&new->usage));
@@ -499,7 +499,7 @@ const struct cred *override_creds(const struct cred *new)
* visible to other threads under RCU.
*/
get_new_cred((struct cred *)new);
- rcu_assign_pointer(current->cred, new);
+ old = override_creds_light(new);
kdebug("override_creds() = %p{%ld}", old,
atomic_long_read(&old->usage));
@@ -521,7 +521,7 @@ void revert_creds(const struct cred *old)
kdebug("revert_creds(%p{%ld})", old,
atomic_long_read(&old->usage));
- rcu_assign_pointer(current->cred, old);
+ revert_creds_light(old);
put_cred(override);
}
EXPORT_SYMBOL(revert_creds);
--
2.44.0
For backing file operations, users are expected to pass credentials
that will outlive the backing file common operations.
Use the specialized guard statements to override/revert the
credentials.
Signed-off-by: Vinicius Costa Gomes <[email protected]>
---
fs/backing-file.c | 27 +++++++--------------------
1 file changed, 7 insertions(+), 20 deletions(-)
diff --git a/fs/backing-file.c b/fs/backing-file.c
index 740185198db3..9610d5166736 100644
--- a/fs/backing-file.c
+++ b/fs/backing-file.c
@@ -140,7 +140,6 @@ ssize_t backing_file_read_iter(struct file *file, struct iov_iter *iter,
struct backing_file_ctx *ctx)
{
struct backing_aio *aio = NULL;
- const struct cred *old_cred;
ssize_t ret;
if (WARN_ON_ONCE(!(file->f_mode & FMODE_BACKING)))
@@ -153,7 +152,7 @@ ssize_t backing_file_read_iter(struct file *file, struct iov_iter *iter,
!(file->f_mode & FMODE_CAN_ODIRECT))
return -EINVAL;
- old_cred = override_creds(ctx->cred);
+ cred_guard(ctx->cred);
if (is_sync_kiocb(iocb)) {
rwf_t rwf = iocb_to_rw_flags(flags);
@@ -174,8 +173,6 @@ ssize_t backing_file_read_iter(struct file *file, struct iov_iter *iter,
backing_aio_cleanup(aio, ret);
}
out:
- revert_creds(old_cred);
-
if (ctx->accessed)
ctx->accessed(ctx->user_file);
@@ -187,7 +184,6 @@ ssize_t backing_file_write_iter(struct file *file, struct iov_iter *iter,
struct kiocb *iocb, int flags,
struct backing_file_ctx *ctx)
{
- const struct cred *old_cred;
ssize_t ret;
if (WARN_ON_ONCE(!(file->f_mode & FMODE_BACKING)))
@@ -210,7 +206,7 @@ ssize_t backing_file_write_iter(struct file *file, struct iov_iter *iter,
*/
flags &= ~IOCB_DIO_CALLER_COMP;
- old_cred = override_creds(ctx->cred);
+ cred_guard(ctx->cred);
if (is_sync_kiocb(iocb)) {
rwf_t rwf = iocb_to_rw_flags(flags);
@@ -222,12 +218,12 @@ ssize_t backing_file_write_iter(struct file *file, struct iov_iter *iter,
ret = backing_aio_init_wq(iocb);
if (ret)
- goto out;
+ return ret;
ret = -ENOMEM;
aio = kmem_cache_zalloc(backing_aio_cachep, GFP_KERNEL);
if (!aio)
- goto out;
+ return ret;
aio->orig_iocb = iocb;
aio->end_write = ctx->end_write;
@@ -240,9 +236,6 @@ ssize_t backing_file_write_iter(struct file *file, struct iov_iter *iter,
if (ret != -EIOCBQUEUED)
backing_aio_cleanup(aio, ret);
}
-out:
- revert_creds(old_cred);
-
return ret;
}
EXPORT_SYMBOL_GPL(backing_file_write_iter);
@@ -252,15 +245,13 @@ ssize_t backing_file_splice_read(struct file *in, loff_t *ppos,
unsigned int flags,
struct backing_file_ctx *ctx)
{
- const struct cred *old_cred;
ssize_t ret;
if (WARN_ON_ONCE(!(in->f_mode & FMODE_BACKING)))
return -EIO;
- old_cred = override_creds(ctx->cred);
+ cred_guard(ctx->cred);
ret = vfs_splice_read(in, ppos, pipe, len, flags);
- revert_creds(old_cred);
if (ctx->accessed)
ctx->accessed(ctx->user_file);
@@ -274,7 +265,6 @@ ssize_t backing_file_splice_write(struct pipe_inode_info *pipe,
unsigned int flags,
struct backing_file_ctx *ctx)
{
- const struct cred *old_cred;
ssize_t ret;
if (WARN_ON_ONCE(!(out->f_mode & FMODE_BACKING)))
@@ -284,11 +274,10 @@ ssize_t backing_file_splice_write(struct pipe_inode_info *pipe,
if (ret)
return ret;
- old_cred = override_creds(ctx->cred);
+ cred_guard(ctx->cred);
file_start_write(out);
ret = iter_file_splice_write(pipe, out, ppos, len, flags);
file_end_write(out);
- revert_creds(old_cred);
if (ctx->end_write)
ctx->end_write(ctx->user_file);
@@ -300,7 +289,6 @@ EXPORT_SYMBOL_GPL(backing_file_splice_write);
int backing_file_mmap(struct file *file, struct vm_area_struct *vma,
struct backing_file_ctx *ctx)
{
- const struct cred *old_cred;
int ret;
if (WARN_ON_ONCE(!(file->f_mode & FMODE_BACKING)) ||
@@ -312,9 +300,8 @@ int backing_file_mmap(struct file *file, struct vm_area_struct *vma,
vma_set_file(vma, file);
- old_cred = override_creds(ctx->cred);
+ cred_guard(ctx->cred);
ret = call_mmap(vma->vm_file, vma);
- revert_creds(old_cred);
if (ctx->accessed)
ctx->accessed(ctx->user_file);
--
2.44.0
File operations in overlayfs also check against the credentials of the
mounter task, stored in the superblock, this credentials will outlive
most of the operations. For these cases, use the recently introduced
guard statements to guarantee that override/revert_creds() are paired.
As the guard() statements are in effect non-trivial variable
declarations and initializations, in a few cases, replacing
ovl_override_creds()/revert_creds() by guard() statements would cause
those initializations to be skipped when 'goto' labels are used. In
those cases, use scoped_guard(), clang 17.0.6 emits an error in these
cases.
Signed-off-by: Vinicius Costa Gomes <[email protected]>
---
fs/overlayfs/copy_up.c | 4 +--
fs/overlayfs/dir.c | 22 +++++++--------
fs/overlayfs/file.c | 63 +++++++++++++++++-------------------------
fs/overlayfs/inode.c | 60 +++++++++++++++-------------------------
fs/overlayfs/namei.c | 22 ++++-----------
fs/overlayfs/readdir.c | 16 +++--------
fs/overlayfs/util.c | 25 ++++++++---------
fs/overlayfs/xattrs.c | 33 +++++++++-------------
8 files changed, 92 insertions(+), 153 deletions(-)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 0762575a1e70..14a55823fdf1 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -1180,7 +1180,6 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
static int ovl_copy_up_flags(struct dentry *dentry, int flags)
{
int err = 0;
- const struct cred *old_cred;
bool disconnected = (dentry->d_flags & DCACHE_DISCONNECTED);
/*
@@ -1200,7 +1199,7 @@ static int ovl_copy_up_flags(struct dentry *dentry, int flags)
if (err)
return err;
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
while (!err) {
struct dentry *next;
struct dentry *parent = NULL;
@@ -1225,7 +1224,6 @@ static int ovl_copy_up_flags(struct dentry *dentry, int flags)
dput(parent);
dput(next);
}
- revert_creds(old_cred);
return err;
}
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 0f8b4a719237..e68d5ba24ca7 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -684,12 +684,10 @@ static int ovl_symlink(struct mnt_idmap *idmap, struct inode *dir,
static int ovl_set_link_redirect(struct dentry *dentry)
{
- const struct cred *old_cred;
int err;
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
err = ovl_set_redirect(dentry, false);
- revert_creds(old_cred);
return err;
}
@@ -875,7 +873,6 @@ static void ovl_drop_nlink(struct dentry *dentry)
static int ovl_do_remove(struct dentry *dentry, bool is_dir)
{
int err;
- const struct cred *old_cred;
bool lower_positive = ovl_lower_positive(dentry);
LIST_HEAD(list);
@@ -894,12 +891,13 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
if (err)
goto out;
- old_cred = ovl_override_creds(dentry->d_sb);
- if (!lower_positive)
- err = ovl_remove_upper(dentry, is_dir, &list);
- else
- err = ovl_remove_and_whiteout(dentry, &list);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(dentry->d_sb)) {
+ if (!lower_positive)
+ err = ovl_remove_upper(dentry, is_dir, &list);
+ else
+ err = ovl_remove_and_whiteout(dentry, &list);
+ }
+
if (!err) {
if (is_dir)
clear_nlink(dentry->d_inode);
@@ -1146,7 +1144,7 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
goto out;
}
- old_cred = ovl_override_creds(old->d_sb);
+ old_cred = override_creds_light(ovl_creds(old->d_sb));
if (!list_empty(&list)) {
opaquedir = ovl_clear_empty(new, &list);
@@ -1279,7 +1277,7 @@ static int ovl_rename(struct mnt_idmap *idmap, struct inode *olddir,
out_unlock:
unlock_rename(new_upperdir, old_upperdir);
out_revert_creds:
- revert_creds(old_cred);
+ revert_creds_light(old_cred);
if (update_nlink)
ovl_nlink_end(new);
else
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 05536964d37f..2a76220d8889 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -34,7 +34,6 @@ static struct file *ovl_open_realfile(const struct file *file,
struct inode *inode = file_inode(file);
struct mnt_idmap *real_idmap;
struct file *realfile;
- const struct cred *old_cred;
int flags = file->f_flags | OVL_OPEN_FLAGS;
int acc_mode = ACC_MODE(flags);
int err;
@@ -42,7 +41,7 @@ static struct file *ovl_open_realfile(const struct file *file,
if (flags & O_APPEND)
acc_mode |= MAY_APPEND;
- old_cred = ovl_override_creds(inode->i_sb);
+ cred_guard(ovl_creds(inode->i_sb));
real_idmap = mnt_idmap(realpath->mnt);
err = inode_permission(real_idmap, realinode, MAY_OPEN | acc_mode);
if (err) {
@@ -54,7 +53,6 @@ static struct file *ovl_open_realfile(const struct file *file,
realfile = backing_file_open(&file->f_path, flags, realpath,
current_cred());
}
- revert_creds(old_cred);
pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
file, file, ovl_whatisit(inode, realinode), file->f_flags,
@@ -185,7 +183,6 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
{
struct inode *inode = file_inode(file);
struct fd real;
- const struct cred *old_cred;
loff_t ret;
/*
@@ -214,9 +211,8 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
ovl_inode_lock(inode);
real.file->f_pos = file->f_pos;
- old_cred = ovl_override_creds(inode->i_sb);
+ cred_guard(ovl_creds(inode->i_sb));
ret = vfs_llseek(real.file, offset, whence);
- revert_creds(old_cred);
file->f_pos = real.file->f_pos;
ovl_inode_unlock(inode);
@@ -388,7 +384,6 @@ static ssize_t ovl_splice_write(struct pipe_inode_info *pipe, struct file *out,
static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
{
struct fd real;
- const struct cred *old_cred;
int ret;
ret = ovl_sync_status(OVL_FS(file_inode(file)->i_sb));
@@ -401,9 +396,8 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
/* Don't sync lower file for fear of receiving EROFS error */
if (file_inode(real.file) == ovl_inode_upper(file_inode(file))) {
- old_cred = ovl_override_creds(file_inode(file)->i_sb);
+ cred_guard(ovl_creds(file_inode(file)->i_sb));
ret = vfs_fsync_range(real.file, start, end, datasync);
- revert_creds(old_cred);
}
fdput(real);
@@ -427,7 +421,6 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
{
struct inode *inode = file_inode(file);
struct fd real;
- const struct cred *old_cred;
int ret;
inode_lock(inode);
@@ -441,9 +434,8 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
if (ret)
goto out_unlock;
- old_cred = ovl_override_creds(file_inode(file)->i_sb);
- ret = vfs_fallocate(real.file, mode, offset, len);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(file_inode(file)->i_sb))
+ ret = vfs_fallocate(real.file, mode, offset, len);
/* Update size */
ovl_file_modified(file);
@@ -459,16 +451,14 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
static int ovl_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
{
struct fd real;
- const struct cred *old_cred;
int ret;
ret = ovl_real_fdget(file, &real);
if (ret)
return ret;
- old_cred = ovl_override_creds(file_inode(file)->i_sb);
+ cred_guard(ovl_creds(file_inode(file)->i_sb));
ret = vfs_fadvise(real.file, offset, len, advice);
- revert_creds(old_cred);
fdput(real);
@@ -487,7 +477,6 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
{
struct inode *inode_out = file_inode(file_out);
struct fd real_in, real_out;
- const struct cred *old_cred;
loff_t ret;
inode_lock(inode_out);
@@ -509,25 +498,25 @@ static loff_t ovl_copyfile(struct file *file_in, loff_t pos_in,
goto out_unlock;
}
- old_cred = ovl_override_creds(file_inode(file_out)->i_sb);
- switch (op) {
- case OVL_COPY:
- ret = vfs_copy_file_range(real_in.file, pos_in,
- real_out.file, pos_out, len, flags);
- break;
-
- case OVL_CLONE:
- ret = vfs_clone_file_range(real_in.file, pos_in,
- real_out.file, pos_out, len, flags);
- break;
-
- case OVL_DEDUPE:
- ret = vfs_dedupe_file_range_one(real_in.file, pos_in,
- real_out.file, pos_out, len,
- flags);
- break;
+ cred_scoped_guard(ovl_creds(file_inode(file_out)->i_sb)) {
+ switch (op) {
+ case OVL_COPY:
+ ret = vfs_copy_file_range(real_in.file, pos_in,
+ real_out.file, pos_out, len, flags);
+ break;
+
+ case OVL_CLONE:
+ ret = vfs_clone_file_range(real_in.file, pos_in,
+ real_out.file, pos_out, len, flags);
+ break;
+
+ case OVL_DEDUPE:
+ ret = vfs_dedupe_file_range_one(real_in.file, pos_in,
+ real_out.file, pos_out, len,
+ flags);
+ break;
+ }
}
- revert_creds(old_cred);
/* Update size */
ovl_file_modified(file_out);
@@ -579,7 +568,6 @@ static loff_t ovl_remap_file_range(struct file *file_in, loff_t pos_in,
static int ovl_flush(struct file *file, fl_owner_t id)
{
struct fd real;
- const struct cred *old_cred;
int err;
err = ovl_real_fdget(file, &real);
@@ -587,9 +575,8 @@ static int ovl_flush(struct file *file, fl_owner_t id)
return err;
if (real.file->f_op->flush) {
- old_cred = ovl_override_creds(file_inode(file)->i_sb);
+ cred_guard(ovl_creds(file_inode(file)->i_sb));
err = real.file->f_op->flush(real.file, id);
- revert_creds(old_cred);
}
fdput(real);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index c63b31a460be..56d7d6516a02 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -26,7 +26,6 @@ int ovl_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
bool full_copy_up = false;
struct dentry *upperdentry;
- const struct cred *old_cred;
err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
if (err)
@@ -79,9 +78,9 @@ int ovl_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
goto out_put_write;
inode_lock(upperdentry->d_inode);
- old_cred = ovl_override_creds(dentry->d_sb);
- err = ovl_do_notify_change(ofs, upperdentry, attr);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(dentry->d_sb))
+ err = ovl_do_notify_change(ofs, upperdentry, attr);
+
if (!err)
ovl_copyattr(dentry->d_inode);
inode_unlock(upperdentry->d_inode);
@@ -160,7 +159,6 @@ int ovl_getattr(struct mnt_idmap *idmap, const struct path *path,
struct dentry *dentry = path->dentry;
enum ovl_path_type type;
struct path realpath;
- const struct cred *old_cred;
struct inode *inode = d_inode(dentry);
bool is_dir = S_ISDIR(inode->i_mode);
int fsid = 0;
@@ -170,7 +168,7 @@ int ovl_getattr(struct mnt_idmap *idmap, const struct path *path,
metacopy_blocks = ovl_is_metacopy_dentry(dentry);
type = ovl_path_real(dentry, &realpath);
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
err = ovl_do_getattr(&realpath, stat, request_mask, flags);
if (err)
goto out;
@@ -281,8 +279,6 @@ int ovl_getattr(struct mnt_idmap *idmap, const struct path *path,
stat->nlink = dentry->d_inode->i_nlink;
out:
- revert_creds(old_cred);
-
return err;
}
@@ -292,7 +288,6 @@ int ovl_permission(struct mnt_idmap *idmap,
struct inode *upperinode = ovl_inode_upper(inode);
struct inode *realinode;
struct path realpath;
- const struct cred *old_cred;
int err;
/* Careful in RCU walk mode */
@@ -310,7 +305,8 @@ int ovl_permission(struct mnt_idmap *idmap,
if (err)
return err;
- old_cred = ovl_override_creds(inode->i_sb);
+ cred_guard(ovl_creds(inode->i_sb));
+
if (!upperinode &&
!special_file(realinode->i_mode) && mask & MAY_WRITE) {
mask &= ~(MAY_WRITE | MAY_APPEND);
@@ -318,7 +314,6 @@ int ovl_permission(struct mnt_idmap *idmap,
mask |= MAY_READ;
}
err = inode_permission(mnt_idmap(realpath.mnt), realinode, mask);
- revert_creds(old_cred);
return err;
}
@@ -327,15 +322,14 @@ static const char *ovl_get_link(struct dentry *dentry,
struct inode *inode,
struct delayed_call *done)
{
- const struct cred *old_cred;
const char *p;
if (!dentry)
return ERR_PTR(-ECHILD);
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(inode->i_sb));
p = vfs_get_link(ovl_dentry_real(dentry), done);
- revert_creds(old_cred);
+
return p;
}
@@ -466,11 +460,8 @@ struct posix_acl *do_ovl_get_acl(struct mnt_idmap *idmap,
acl = get_cached_acl_rcu(realinode, type);
} else {
- const struct cred *old_cred;
-
- old_cred = ovl_override_creds(inode->i_sb);
+ cred_guard(ovl_creds(inode->i_sb));
acl = ovl_get_acl_path(&realpath, posix_acl_xattr_name(type), noperm);
- revert_creds(old_cred);
}
return acl;
@@ -482,7 +473,6 @@ static int ovl_set_or_remove_acl(struct dentry *dentry, struct inode *inode,
int err;
struct path realpath;
const char *acl_name;
- const struct cred *old_cred;
struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
struct dentry *upperdentry = ovl_dentry_upper(dentry);
struct dentry *realdentry = upperdentry ?: ovl_dentry_lower(dentry);
@@ -496,10 +486,10 @@ static int ovl_set_or_remove_acl(struct dentry *dentry, struct inode *inode,
struct posix_acl *real_acl;
ovl_path_lower(dentry, &realpath);
- old_cred = ovl_override_creds(dentry->d_sb);
- real_acl = vfs_get_acl(mnt_idmap(realpath.mnt), realdentry,
- acl_name);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(dentry->d_sb))
+ real_acl = vfs_get_acl(mnt_idmap(realpath.mnt), realdentry,
+ acl_name);
+
if (IS_ERR(real_acl)) {
err = PTR_ERR(real_acl);
goto out;
@@ -519,12 +509,12 @@ static int ovl_set_or_remove_acl(struct dentry *dentry, struct inode *inode,
if (err)
goto out;
- old_cred = ovl_override_creds(dentry->d_sb);
- if (acl)
- err = ovl_do_set_acl(ofs, realdentry, acl_name, acl);
- else
- err = ovl_do_remove_acl(ofs, realdentry, acl_name);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(dentry->d_sb)) {
+ if (acl)
+ err = ovl_do_set_acl(ofs, realdentry, acl_name, acl);
+ else
+ err = ovl_do_remove_acl(ofs, realdentry, acl_name);
+ }
ovl_drop_write(dentry);
/* copy c/mtime */
@@ -591,7 +581,6 @@ static int ovl_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
{
int err;
struct inode *realinode = ovl_inode_realdata(inode);
- const struct cred *old_cred;
if (!realinode)
return -EIO;
@@ -599,9 +588,8 @@ static int ovl_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
if (!realinode->i_op->fiemap)
return -EOPNOTSUPP;
- old_cred = ovl_override_creds(inode->i_sb);
+ cred_guard(ovl_creds(inode->i_sb));
err = realinode->i_op->fiemap(realinode, fieinfo, start, len);
- revert_creds(old_cred);
return err;
}
@@ -649,7 +637,6 @@ int ovl_fileattr_set(struct mnt_idmap *idmap,
{
struct inode *inode = d_inode(dentry);
struct path upperpath;
- const struct cred *old_cred;
unsigned int flags;
int err;
@@ -661,7 +648,7 @@ int ovl_fileattr_set(struct mnt_idmap *idmap,
if (err)
goto out;
- old_cred = ovl_override_creds(inode->i_sb);
+ cred_guard(ovl_creds(inode->i_sb));
/*
* Store immutable/append-only flags in xattr and clear them
* in upper fileattr (in case they were set by older kernel)
@@ -672,7 +659,6 @@ int ovl_fileattr_set(struct mnt_idmap *idmap,
err = ovl_set_protattr(inode, upperpath.dentry, fa);
if (!err)
err = ovl_real_fileattr_set(&upperpath, fa);
- revert_creds(old_cred);
ovl_drop_write(dentry);
/*
@@ -726,15 +712,13 @@ int ovl_fileattr_get(struct dentry *dentry, struct fileattr *fa)
{
struct inode *inode = d_inode(dentry);
struct path realpath;
- const struct cred *old_cred;
int err;
ovl_path_real(dentry, &realpath);
- old_cred = ovl_override_creds(inode->i_sb);
+ cred_guard(ovl_creds(inode->i_sb));
err = ovl_real_fileattr_get(&realpath, fa);
ovl_fileattr_prot_flags(inode, fa);
- revert_creds(old_cred);
return err;
}
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 5764f91d283e..dd7c645140f9 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -953,15 +953,11 @@ static int ovl_maybe_validate_verity(struct dentry *dentry)
return err;
if (!ovl_test_flag(OVL_VERIFIED_DIGEST, inode)) {
- const struct cred *old_cred;
-
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
err = ovl_validate_verity(ofs, &metapath, &datapath);
if (err == 0)
ovl_set_flag(OVL_VERIFIED_DIGEST, inode);
-
- revert_creds(old_cred);
}
ovl_inode_unlock(inode);
@@ -975,7 +971,6 @@ static int ovl_maybe_lookup_lowerdata(struct dentry *dentry)
struct inode *inode = d_inode(dentry);
const char *redirect = ovl_lowerdata_redirect(inode);
struct ovl_path datapath = {};
- const struct cred *old_cred;
int err;
if (!redirect || ovl_dentry_lowerdata(dentry))
@@ -993,9 +988,9 @@ static int ovl_maybe_lookup_lowerdata(struct dentry *dentry)
if (ovl_dentry_lowerdata(dentry))
goto out;
- old_cred = ovl_override_creds(dentry->d_sb);
- err = ovl_lookup_data_layers(dentry, redirect, &datapath);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(dentry->d_sb))
+ err = ovl_lookup_data_layers(dentry, redirect, &datapath);
+
if (err)
goto out_err;
@@ -1030,7 +1025,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
unsigned int flags)
{
struct ovl_entry *oe = NULL;
- const struct cred *old_cred;
struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
struct ovl_entry *poe = OVL_E(dentry->d_parent);
struct ovl_entry *roe = OVL_E(dentry->d_sb->s_root);
@@ -1061,7 +1055,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
if (dentry->d_name.len > ofs->namelen)
return ERR_PTR(-ENAMETOOLONG);
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
upperdir = ovl_dentry_upper(dentry->d_parent);
if (upperdir) {
d.layer = &ofs->layers[0];
@@ -1342,7 +1336,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
ovl_dentry_init_reval(dentry, upperdentry, OVL_I_E(inode));
- revert_creds(old_cred);
if (origin_path) {
dput(origin_path->dentry);
kfree(origin_path);
@@ -1366,7 +1359,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
kfree(upperredirect);
out:
kfree(d.redirect);
- revert_creds(old_cred);
return ERR_PTR(err);
}
@@ -1374,7 +1366,6 @@ bool ovl_lower_positive(struct dentry *dentry)
{
struct ovl_entry *poe = OVL_E(dentry->d_parent);
const struct qstr *name = &dentry->d_name;
- const struct cred *old_cred;
unsigned int i;
bool positive = false;
bool done = false;
@@ -1390,7 +1381,7 @@ bool ovl_lower_positive(struct dentry *dentry)
if (!ovl_dentry_upper(dentry))
return true;
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
/* Positive upper -> have to look up lower to see whether it exists */
for (i = 0; !done && !positive && i < ovl_numlower(poe); i++) {
struct dentry *this;
@@ -1423,7 +1414,6 @@ bool ovl_lower_positive(struct dentry *dentry)
dput(this);
}
}
- revert_creds(old_cred);
return positive;
}
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 0ca8af060b0c..41e01fe3ae4a 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -273,9 +273,8 @@ static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data
int err;
struct ovl_cache_entry *p;
struct dentry *dentry, *dir = path->dentry;
- const struct cred *old_cred;
- old_cred = ovl_override_creds(rdd->dentry->d_sb);
+ cred_guard(ovl_creds(rdd->dentry->d_sb));
err = down_write_killable(&dir->d_inode->i_rwsem);
if (!err) {
@@ -290,7 +289,6 @@ static int ovl_check_whiteouts(const struct path *path, struct ovl_readdir_data
}
inode_unlock(dir->d_inode);
}
- revert_creds(old_cred);
return err;
}
@@ -753,10 +751,9 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
struct dentry *dentry = file->f_path.dentry;
struct ovl_fs *ofs = OVL_FS(dentry->d_sb);
struct ovl_cache_entry *p;
- const struct cred *old_cred;
int err;
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
if (!ctx->pos)
ovl_dir_reset(file);
@@ -808,7 +805,6 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
}
err = 0;
out:
- revert_creds(old_cred);
return err;
}
@@ -856,11 +852,9 @@ static struct file *ovl_dir_open_realfile(const struct file *file,
const struct path *realpath)
{
struct file *res;
- const struct cred *old_cred;
- old_cred = ovl_override_creds(file_inode(file)->i_sb);
+ cred_guard(ovl_creds(file_inode(file)->i_sb));
res = ovl_path_open(realpath, O_RDONLY | (file->f_flags & O_LARGEFILE));
- revert_creds(old_cred);
return res;
}
@@ -983,11 +977,9 @@ int ovl_check_empty_dir(struct dentry *dentry, struct list_head *list)
int err;
struct ovl_cache_entry *p, *n;
struct rb_root root = RB_ROOT;
- const struct cred *old_cred;
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
err = ovl_dir_read_merged(dentry, list, &root);
- revert_creds(old_cred);
if (err)
return err;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index d285d1d7baad..2f8edd1024ff 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -1133,7 +1133,6 @@ static void ovl_cleanup_index(struct dentry *dentry)
int ovl_nlink_start(struct dentry *dentry)
{
struct inode *inode = d_inode(dentry);
- const struct cred *old_cred;
int err;
if (WARN_ON(!inode))
@@ -1170,15 +1169,16 @@ int ovl_nlink_start(struct dentry *dentry)
if (d_is_dir(dentry) || !ovl_test_flag(OVL_INDEX, inode))
return 0;
- old_cred = ovl_override_creds(dentry->d_sb);
- /*
- * The overlay inode nlink should be incremented/decremented IFF the
- * upper operation succeeds, along with nlink change of upper inode.
- * Therefore, before link/unlink/rename, we store the union nlink
- * value relative to the upper inode nlink in an upper inode xattr.
- */
- err = ovl_set_nlink_upper(dentry);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(dentry->d_sb)) {
+ /*
+ * The overlay inode nlink should be incremented/decremented IFF the
+ * upper operation succeeds, along with nlink change of upper inode.
+ * Therefore, before link/unlink/rename, we store the union nlink
+ * value relative to the upper inode nlink in an upper inode xattr.
+ */
+ err = ovl_set_nlink_upper(dentry);
+ }
+
if (err)
goto out_drop_write;
@@ -1199,11 +1199,8 @@ void ovl_nlink_end(struct dentry *dentry)
ovl_drop_write(dentry);
if (ovl_test_flag(OVL_INDEX, inode) && inode->i_nlink == 0) {
- const struct cred *old_cred;
-
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
ovl_cleanup_index(dentry);
- revert_creds(old_cred);
}
ovl_inode_unlock(inode);
diff --git a/fs/overlayfs/xattrs.c b/fs/overlayfs/xattrs.c
index 383978e4663c..e4ce093a3a73 100644
--- a/fs/overlayfs/xattrs.c
+++ b/fs/overlayfs/xattrs.c
@@ -41,13 +41,11 @@ static int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char
struct dentry *upperdentry = ovl_i_dentry_upper(inode);
struct dentry *realdentry = upperdentry ?: ovl_dentry_lower(dentry);
struct path realpath;
- const struct cred *old_cred;
if (!value && !upperdentry) {
ovl_path_lower(dentry, &realpath);
- old_cred = ovl_override_creds(dentry->d_sb);
- err = vfs_getxattr(mnt_idmap(realpath.mnt), realdentry, name, NULL, 0);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(dentry->d_sb))
+ err = vfs_getxattr(mnt_idmap(realpath.mnt), realdentry, name, NULL, 0);
if (err < 0)
goto out;
}
@@ -64,15 +62,15 @@ static int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char
if (err)
goto out;
- old_cred = ovl_override_creds(dentry->d_sb);
- if (value) {
- err = ovl_do_setxattr(ofs, realdentry, name, value, size,
- flags);
- } else {
- WARN_ON(flags != XATTR_REPLACE);
- err = ovl_do_removexattr(ofs, realdentry, name);
+ cred_scoped_guard(ovl_creds(dentry->d_sb)) {
+ if (value) {
+ err = ovl_do_setxattr(ofs, realdentry, name, value, size,
+ flags);
+ } else {
+ WARN_ON(flags != XATTR_REPLACE);
+ err = ovl_do_removexattr(ofs, realdentry, name);
+ }
}
- revert_creds(old_cred);
ovl_drop_write(dentry);
/* copy c/mtime */
@@ -85,13 +83,11 @@ static int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char
void *value, size_t size)
{
ssize_t res;
- const struct cred *old_cred;
struct path realpath;
ovl_i_path_real(inode, &realpath);
- old_cred = ovl_override_creds(dentry->d_sb);
+ cred_guard(ovl_creds(dentry->d_sb));
res = vfs_getxattr(mnt_idmap(realpath.mnt), realpath.dentry, name, value, size);
- revert_creds(old_cred);
return res;
}
@@ -116,12 +112,10 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
ssize_t res;
size_t len;
char *s;
- const struct cred *old_cred;
size_t prefix_len, name_len;
- old_cred = ovl_override_creds(dentry->d_sb);
- res = vfs_listxattr(realdentry, list, size);
- revert_creds(old_cred);
+ cred_scoped_guard(ovl_creds(dentry->d_sb))
+ res = vfs_listxattr(realdentry, list, size);
if (res <= 0 || size == 0)
return res;
@@ -268,4 +262,3 @@ const struct xattr_handler * const *ovl_xattr_handlers(struct ovl_fs *ofs)
return ofs->config.userxattr ? ovl_user_xattr_handlers :
ovl_trusted_xattr_handlers;
}
-
--
2.44.0
Hi Vinicius,
kernel test robot noticed the following build warnings:
[auto build test WARNING on brauner-vfs/vfs.all]
[also build test WARNING on linus/master v6.9-rc2 next-20240403]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Vinicius-Costa-Gomes/cred-Add-a-light-version-of-override-revert_creds/20240403-101954
base: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all
patch link: https://lore.kernel.org/r/20240403021808.309900-3-vinicius.gomes%40intel.com
patch subject: [PATCH v1 2/3] fs: Optimize credentials reference count for backing file ops
config: i386-randconfig-061-20240403 (https://download.01.org/0day-ci/archive/20240403/[email protected]/config)
compiler: gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240403/[email protected]/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
sparse warnings: (new ones prefixed by >>)
fs/backing-file.c: note: in included file (through include/linux/sched/signal.h, include/linux/rcuwait.h, include/linux/percpu-rwsem.h, ...):
>> include/linux/cred.h:182:41: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected struct cred const *old @@ got struct cred const [noderef] __rcu *cred @@
include/linux/cred.h:182:41: sparse: expected struct cred const *old
include/linux/cred.h:182:41: sparse: got struct cred const [noderef] __rcu *cred
>> include/linux/cred.h:182:41: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected struct cred const *old @@ got struct cred const [noderef] __rcu *cred @@
include/linux/cred.h:182:41: sparse: expected struct cred const *old
include/linux/cred.h:182:41: sparse: got struct cred const [noderef] __rcu *cred
>> include/linux/cred.h:182:41: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected struct cred const *old @@ got struct cred const [noderef] __rcu *cred @@
include/linux/cred.h:182:41: sparse: expected struct cred const *old
include/linux/cred.h:182:41: sparse: got struct cred const [noderef] __rcu *cred
>> include/linux/cred.h:182:41: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected struct cred const *old @@ got struct cred const [noderef] __rcu *cred @@
include/linux/cred.h:182:41: sparse: expected struct cred const *old
include/linux/cred.h:182:41: sparse: got struct cred const [noderef] __rcu *cred
>> include/linux/cred.h:182:41: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected struct cred const *old @@ got struct cred const [noderef] __rcu *cred @@
include/linux/cred.h:182:41: sparse: expected struct cred const *old
include/linux/cred.h:182:41: sparse: got struct cred const [noderef] __rcu *cred
vim +182 include/linux/cred.h
58319057b78476 Andy Lutomirski 2015-09-04 174
dd60a254548056 Vinicius Costa Gomes 2024-04-02 175 /*
dd60a254548056 Vinicius Costa Gomes 2024-04-02 176 * Override creds without bumping reference count. Caller must ensure
dd60a254548056 Vinicius Costa Gomes 2024-04-02 177 * reference remains valid or has taken reference. Almost always not the
dd60a254548056 Vinicius Costa Gomes 2024-04-02 178 * interface you want. Use override_creds()/revert_creds() instead.
dd60a254548056 Vinicius Costa Gomes 2024-04-02 179 */
dd60a254548056 Vinicius Costa Gomes 2024-04-02 180 static inline const struct cred *override_creds_light(const struct cred *override_cred)
dd60a254548056 Vinicius Costa Gomes 2024-04-02 181 {
dd60a254548056 Vinicius Costa Gomes 2024-04-02 @182 const struct cred *old = current->cred;
dd60a254548056 Vinicius Costa Gomes 2024-04-02 183
dd60a254548056 Vinicius Costa Gomes 2024-04-02 184 rcu_assign_pointer(current->cred, override_cred);
dd60a254548056 Vinicius Costa Gomes 2024-04-02 185 return old;
dd60a254548056 Vinicius Costa Gomes 2024-04-02 186 }
dd60a254548056 Vinicius Costa Gomes 2024-04-02 187
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On Wed, 3 Apr 2024 at 04:18, Vinicius Costa Gomes
<[email protected]> wrote:
> - in ovl_rename() I had to manually call the "light" the overrides,
> both using the guard() macro or using the non-light version causes
> the workload to crash the kernel. I still have to investigate why
> this is happening. Hints are appreciated.
Don't know. Well, there's nesting (in ovl_nlink_end()) but I don't
see why that should be an issue.
I see why Amir suggested moving away from scoped guards, but that also
introduces the possibility of subtle bugs if we don't audit every one
of those sites carefully...
Maybe patchset should be restructured to first do the
override_creds_light() conversion without guards, and then move over
to guards. Or the other way round, I don't have a preference. But
mixing these two independent changes doesn't sound like a great idea
in any case.
Thanks,
Miklos
On Tue, Apr 02, 2024 at 07:18:05PM -0700, Vinicius Costa Gomes wrote:
> Hi,
>
> Changes from RFC v3:
> - Removed the warning "fixes" patches, as they could hide potencial
> bugs (Christian Brauner);
> - Added "cred-specific" macros (Christian Brauner), from my side,
> added a few '_' to the guards to signify that the newly introduced
> helper macros are preferred.
> - Changed a few guard() to scoped_guard() to fix the clang (17.0.6)
> compilation error about 'goto' bypassing variable initialization;
>
> Link to RFC v3:
>
> https://lore.kernel.org/r/[email protected]/
>
> Changes from RFC v2:
> - Added separate patches for the warnings for the discarded const
> when using the cleanup macros: one for DEFINE_GUARD() and one for
> DEFINE_LOCK_GUARD_1() (I am uncertain if it's better to squash them
> together);
> - Reordered the series so the backing file patch is the first user of
> the introduced helpers (Amir Goldstein);
> - Change the definition of the cleanup "class" from a GUARD to a
> LOCK_GUARD_1, which defines an implicit container, that allows us
> to remove some variable declarations to store the overriden
> credentials (Amir Goldstein);
> - Replaced most of the uses of scoped_guard() with guard(), to reduce
> the code churn, the remaining ones I wasn't sure if I was changing
> the behavior: either they were nested (overrides "inside"
> overrides) or something calls current_cred() (Amir Goldstein).
>
> New questions:
> - The backing file callbacks are now called with the "light"
> overriden credentials, so they are kind of restricted in what they
> can do with their credentials, is this acceptable in general?
Until we grow additional users, I think yes. Just needs to be
documented.
> - in ovl_rename() I had to manually call the "light" the overrides,
> both using the guard() macro or using the non-light version causes
> the workload to crash the kernel. I still have to investigate why
> this is happening. Hints are appreciated.
Do you have a reproducer? Do you have a splat from dmesg?
Miklos Szeredi <[email protected]> writes:
> On Wed, 3 Apr 2024 at 04:18, Vinicius Costa Gomes
> <[email protected]> wrote:
>
>> - in ovl_rename() I had to manually call the "light" the overrides,
>> both using the guard() macro or using the non-light version causes
>> the workload to crash the kernel. I still have to investigate why
>> this is happening. Hints are appreciated.
>
> Don't know. Well, there's nesting (in ovl_nlink_end()) but I don't
> see why that should be an issue.
>
> I see why Amir suggested moving away from scoped guards, but that also
> introduces the possibility of subtle bugs if we don't audit every one
> of those sites carefully...
>
> Maybe patchset should be restructured to first do the
> override_creds_light() conversion without guards, and then move over
> to guards. Or the other way round, I don't have a preference. But
> mixing these two independent changes doesn't sound like a great idea
> in any case.
Sounds good. Here's I am thinking:
patch 1: introduce *_creds_light()
patch 2: move backing-file.c to *_creds_light()
patch 3: move overlayfs to *_creds_light()
patch 4: introduce the guard helpers
patch 5: move backing-file.c to the guard helpers
patch 6: move overlayfs to the guard helpers
(and yeah, the subject of the patches will be better than these ;-)
Is this what you had in mind?
Cheers,
--
Vinicius
Christian Brauner <[email protected]> writes:
> On Tue, Apr 02, 2024 at 07:18:05PM -0700, Vinicius Costa Gomes wrote:
>> Hi,
>>
>> Changes from RFC v3:
>> - Removed the warning "fixes" patches, as they could hide potencial
>> bugs (Christian Brauner);
>> - Added "cred-specific" macros (Christian Brauner), from my side,
>> added a few '_' to the guards to signify that the newly introduced
>> helper macros are preferred.
>> - Changed a few guard() to scoped_guard() to fix the clang (17.0.6)
>> compilation error about 'goto' bypassing variable initialization;
>>
>> Link to RFC v3:
>>
>> https://lore.kernel.org/r/[email protected]/
>>
>> Changes from RFC v2:
>> - Added separate patches for the warnings for the discarded const
>> when using the cleanup macros: one for DEFINE_GUARD() and one for
>> DEFINE_LOCK_GUARD_1() (I am uncertain if it's better to squash them
>> together);
>> - Reordered the series so the backing file patch is the first user of
>> the introduced helpers (Amir Goldstein);
>> - Change the definition of the cleanup "class" from a GUARD to a
>> LOCK_GUARD_1, which defines an implicit container, that allows us
>> to remove some variable declarations to store the overriden
>> credentials (Amir Goldstein);
>> - Replaced most of the uses of scoped_guard() with guard(), to reduce
>> the code churn, the remaining ones I wasn't sure if I was changing
>> the behavior: either they were nested (overrides "inside"
>> overrides) or something calls current_cred() (Amir Goldstein).
>>
>> New questions:
>> - The backing file callbacks are now called with the "light"
>> overriden credentials, so they are kind of restricted in what they
>> can do with their credentials, is this acceptable in general?
>
> Until we grow additional users, I think yes. Just needs to be
> documented.
>
Will add some documentation for it, then.
>> - in ovl_rename() I had to manually call the "light" the overrides,
>> both using the guard() macro or using the non-light version causes
>> the workload to crash the kernel. I still have to investigate why
>> this is happening. Hints are appreciated.
>
> Do you have a reproducer? Do you have a splat from dmesg?
Just to be sure, with this version of the series the crash doesn't
happen. It was only happening when I was using the guard() macro
everywhere.
I just looked at my crash collection and couldn't find the splats, from
what I remember I lost connection to the machine, and wasn't able to
retrieve the splat.
I believe the crash and clang 17 compilation error point to the same
problem, that in ovl_rename() some 'goto' skips the declaration of the
(implicit) variable that the guard() macro generates. And it ends up
doing a revert_creds_light() on garbage memory when ovl_rename()
returns.
(if you want I can try and go back to "guard() everywhere" and try a bit
harder to get a splat)
Does that make sense?
Cheers,
--
Vinicius
On Wed, Apr 24, 2024 at 10:01 PM Vinicius Costa Gomes
<[email protected]> wrote:
>
> Miklos Szeredi <[email protected]> writes:
>
> > On Wed, 3 Apr 2024 at 04:18, Vinicius Costa Gomes
> > <[email protected]> wrote:
> >
> >> - in ovl_rename() I had to manually call the "light" the overrides,
> >> both using the guard() macro or using the non-light version causes
> >> the workload to crash the kernel. I still have to investigate why
> >> this is happening. Hints are appreciated.
> >
> > Don't know. Well, there's nesting (in ovl_nlink_end()) but I don't
> > see why that should be an issue.
> >
> > I see why Amir suggested moving away from scoped guards, but that also
> > introduces the possibility of subtle bugs if we don't audit every one
> > of those sites carefully...
> >
> > Maybe patchset should be restructured to first do the
> > override_creds_light() conversion without guards, and then move over
> > to guards. Or the other way round, I don't have a preference. But
> > mixing these two independent changes doesn't sound like a great idea
> > in any case.
>
> Sounds good. Here's I am thinking:
>
> patch 1: introduce *_creds_light()
> patch 2: move backing-file.c to *_creds_light()
> patch 3: move overlayfs to *_creds_light()
> patch 4: introduce the guard helpers
> patch 5: move backing-file.c to the guard helpers
> patch 6: move overlayfs to the guard helpers
>
> (and yeah, the subject of the patches will be better than these ;-)
>
> Is this what you had in mind?
>
I think this series would make a lot of sense.
It first addresses the issue that motivated your work
and I expect patch 3 would be rather simple to review.
Please use your best judgement to break patch 6 into more chewable
pieces because the current ovl patch is quite large to review in one go.
I will leave it up to you to find the right balance.
Also w.r.t the guard() vs. scoped_guard() question, remember that
there is another option that may be better than either in some cases -
re-factoring to a helper with a guard().
One example that jumps to me is ovl_copyfile() - seems nicer
to add a guard() in all the specific helpers then adding the scoped_guard()
around the switch statement.
But really this is a question of taste and avoiding unneeded code churn and
unneeded nested scopes. There is no one clear way, so please use your
judgement per case and we can debate on your choices in review.
Thanks,
Amir.
On Wed, Apr 24, 2024 at 12:15:25PM -0700, Vinicius Costa Gomes wrote:
> Christian Brauner <[email protected]> writes:
>
> > On Tue, Apr 02, 2024 at 07:18:05PM -0700, Vinicius Costa Gomes wrote:
> >> Hi,
> >>
> >> Changes from RFC v3:
> >> - Removed the warning "fixes" patches, as they could hide potencial
> >> bugs (Christian Brauner);
> >> - Added "cred-specific" macros (Christian Brauner), from my side,
> >> added a few '_' to the guards to signify that the newly introduced
> >> helper macros are preferred.
> >> - Changed a few guard() to scoped_guard() to fix the clang (17.0.6)
> >> compilation error about 'goto' bypassing variable initialization;
> >>
> >> Link to RFC v3:
> >>
> >> https://lore.kernel.org/r/[email protected]/
> >>
> >> Changes from RFC v2:
> >> - Added separate patches for the warnings for the discarded const
> >> when using the cleanup macros: one for DEFINE_GUARD() and one for
> >> DEFINE_LOCK_GUARD_1() (I am uncertain if it's better to squash them
> >> together);
> >> - Reordered the series so the backing file patch is the first user of
> >> the introduced helpers (Amir Goldstein);
> >> - Change the definition of the cleanup "class" from a GUARD to a
> >> LOCK_GUARD_1, which defines an implicit container, that allows us
> >> to remove some variable declarations to store the overriden
> >> credentials (Amir Goldstein);
> >> - Replaced most of the uses of scoped_guard() with guard(), to reduce
> >> the code churn, the remaining ones I wasn't sure if I was changing
> >> the behavior: either they were nested (overrides "inside"
> >> overrides) or something calls current_cred() (Amir Goldstein).
> >>
> >> New questions:
> >> - The backing file callbacks are now called with the "light"
> >> overriden credentials, so they are kind of restricted in what they
> >> can do with their credentials, is this acceptable in general?
> >
> > Until we grow additional users, I think yes. Just needs to be
> > documented.
> >
>
> Will add some documentation for it, then.
>
> >> - in ovl_rename() I had to manually call the "light" the overrides,
> >> both using the guard() macro or using the non-light version causes
> >> the workload to crash the kernel. I still have to investigate why
> >> this is happening. Hints are appreciated.
> >
> > Do you have a reproducer? Do you have a splat from dmesg?
>
> Just to be sure, with this version of the series the crash doesn't
> happen. It was only happening when I was using the guard() macro
> everywhere.
>
> I just looked at my crash collection and couldn't find the splats, from
> what I remember I lost connection to the machine, and wasn't able to
> retrieve the splat.
>
> I believe the crash and clang 17 compilation error point to the same
> problem, that in ovl_rename() some 'goto' skips the declaration of the
> (implicit) variable that the guard() macro generates. And it ends up
> doing a revert_creds_light() on garbage memory when ovl_rename()
> returns.
If this is a compiler bug this warrants at least a comment in the commit
message because right now people will be wondering why that place
doesn't use a guard. Ideally we can just use guards everywhere though
and report this as a bug against clang, I think.
>
> (if you want I can try and go back to "guard() everywhere" and try a bit
> harder to get a splat)
>
> Does that make sense?
Yes.
Christian Brauner <[email protected]> writes:
> On Wed, Apr 24, 2024 at 12:15:25PM -0700, Vinicius Costa Gomes wrote:
>> Christian Brauner <[email protected]> writes:
>>
>> > On Tue, Apr 02, 2024 at 07:18:05PM -0700, Vinicius Costa Gomes wrote:
>> >> Hi,
>> >>
>> >> Changes from RFC v3:
>> >> - Removed the warning "fixes" patches, as they could hide potencial
>> >> bugs (Christian Brauner);
>> >> - Added "cred-specific" macros (Christian Brauner), from my side,
>> >> added a few '_' to the guards to signify that the newly introduced
>> >> helper macros are preferred.
>> >> - Changed a few guard() to scoped_guard() to fix the clang (17.0.6)
>> >> compilation error about 'goto' bypassing variable initialization;
>> >>
>> >> Link to RFC v3:
>> >>
>> >> https://lore.kernel.org/r/[email protected]/
>> >>
>> >> Changes from RFC v2:
>> >> - Added separate patches for the warnings for the discarded const
>> >> when using the cleanup macros: one for DEFINE_GUARD() and one for
>> >> DEFINE_LOCK_GUARD_1() (I am uncertain if it's better to squash them
>> >> together);
>> >> - Reordered the series so the backing file patch is the first user of
>> >> the introduced helpers (Amir Goldstein);
>> >> - Change the definition of the cleanup "class" from a GUARD to a
>> >> LOCK_GUARD_1, which defines an implicit container, that allows us
>> >> to remove some variable declarations to store the overriden
>> >> credentials (Amir Goldstein);
>> >> - Replaced most of the uses of scoped_guard() with guard(), to reduce
>> >> the code churn, the remaining ones I wasn't sure if I was changing
>> >> the behavior: either they were nested (overrides "inside"
>> >> overrides) or something calls current_cred() (Amir Goldstein).
>> >>
>> >> New questions:
>> >> - The backing file callbacks are now called with the "light"
>> >> overriden credentials, so they are kind of restricted in what they
>> >> can do with their credentials, is this acceptable in general?
>> >
>> > Until we grow additional users, I think yes. Just needs to be
>> > documented.
>> >
>>
>> Will add some documentation for it, then.
>>
>> >> - in ovl_rename() I had to manually call the "light" the overrides,
>> >> both using the guard() macro or using the non-light version causes
>> >> the workload to crash the kernel. I still have to investigate why
>> >> this is happening. Hints are appreciated.
>> >
>> > Do you have a reproducer? Do you have a splat from dmesg?
>>
>> Just to be sure, with this version of the series the crash doesn't
>> happen. It was only happening when I was using the guard() macro
>> everywhere.
>>
>> I just looked at my crash collection and couldn't find the splats, from
>> what I remember I lost connection to the machine, and wasn't able to
>> retrieve the splat.
>>
>> I believe the crash and clang 17 compilation error point to the same
>> problem, that in ovl_rename() some 'goto' skips the declaration of the
>> (implicit) variable that the guard() macro generates. And it ends up
>> doing a revert_creds_light() on garbage memory when ovl_rename()
>> returns.
>
> If this is a compiler bug this warrants at least a comment in the commit
> message because right now people will be wondering why that place
> doesn't use a guard. Ideally we can just use guards everywhere though
> and report this as a bug against clang, I think.
>
I am seeing this like a bug/mising feature in gcc (at least in the
version I was using), as clang (correctly) refuses to compile the buggy
code (I agree with the error).
But I will add a comment to the code explaining why guard() cannot be
used in that case.
Cheers,
--
Vinicius
On Thu, Apr 25, 2024 at 10:12:34AM -0700, Vinicius Costa Gomes wrote:
> Christian Brauner <[email protected]> writes:
>
> > On Wed, Apr 24, 2024 at 12:15:25PM -0700, Vinicius Costa Gomes wrote:
> >> I believe the crash and clang 17 compilation error point to the same
> >> problem, that in ovl_rename() some 'goto' skips the declaration of the
> >> (implicit) variable that the guard() macro generates. And it ends up
> >> doing a revert_creds_light() on garbage memory when ovl_rename()
> >> returns.
> >
> > If this is a compiler bug this warrants at least a comment in the commit
> > message because right now people will be wondering why that place
> > doesn't use a guard. Ideally we can just use guards everywhere though
> > and report this as a bug against clang, I think.
> >
>
> I am seeing this like a bug/mising feature in gcc (at least in the
> version I was using), as clang (correctly) refuses to compile the buggy
> code (I agree with the error).
Indeed, your description of the issue and the fact clang refuses to
compile the problematic code makes me think that
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91951 is the relevant GCC
issue.
As an aside, just in case it comes up in the future, there is a
potential issue in clang's scope checking where it would attempt to
validate all labels in a function as potential destinations of 'asm
goto()' instances in that same function, rather than just the labels
that the 'asm goto()' could jump to, which can lead to false positive
errors about jumping past the initialization of a variable declared with
cleanup.
https://github.com/ClangBuiltLinux/linux/issues/1886
https://github.com/ClangBuiltLinux/linux/issues/2003
Cheers,
Nathan