2022-04-16 01:08:28

by Yang Xu (Fujitsu)

[permalink] [raw]
Subject: [PATCH v3 1/7] fs/inode: move sgid strip operation from inode_init_owner into inode_sgid_strip

This has no functional change. Just create and export inode_sgid_strip api for
the subsequent patch. This function is used to strip S_ISGID mode when init
a new inode.

Signed-off-by: Yang Xu <[email protected]>
---
v2->v3:
1.Use const struct inode * instead of struct inode *
2.replace sgid strip with inode_sgid_strip in a single patch
fs/inode.c | 24 ++++++++++++++++++++----
include/linux/fs.h | 3 ++-
2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 9d9b422504d1..1b569ad882ce 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2246,10 +2246,8 @@ void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode,
/* Directories are special, and always inherit S_ISGID */
if (S_ISDIR(mode))
mode |= S_ISGID;
- else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) &&
- !in_group_p(i_gid_into_mnt(mnt_userns, dir)) &&
- !capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID))
- mode &= ~S_ISGID;
+ else
+ inode_sgid_strip(mnt_userns, dir, &mode);
} else
inode_fsgid_set(inode, mnt_userns);
inode->i_mode = mode;
@@ -2405,3 +2403,21 @@ struct timespec64 current_time(struct inode *inode)
return timestamp_truncate(now, inode);
}
EXPORT_SYMBOL(current_time);
+
+void inode_sgid_strip(struct user_namespace *mnt_userns,
+ const struct inode *dir, umode_t *mode)
+{
+ if (!dir || !(dir->i_mode & S_ISGID))
+ return;
+ if ((*mode & (S_ISGID | S_IXGRP)) != (S_ISGID | S_IXGRP))
+ return;
+ if (S_ISDIR(*mode))
+ return;
+ if (in_group_p(i_gid_into_mnt(mnt_userns, dir)))
+ return;
+ if (capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID))
+ return;
+
+ *mode &= ~S_ISGID;
+}
+EXPORT_SYMBOL(inode_sgid_strip);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index bbde95387a23..4a617aaab6f6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1897,7 +1897,8 @@ extern long compat_ptr_ioctl(struct file *file, unsigned int cmd,
void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode,
const struct inode *dir, umode_t mode);
extern bool may_open_dev(const struct path *path);
-
+void inode_sgid_strip(struct user_namespace *mnt_userns,
+ const struct inode *dir, umode_t *mode);
/*
* This is the "filldir" function type, used by readdir() to let
* the kernel specify what kind of dirent layout it wants to have.
--
2.27.0


2022-04-16 01:59:51

by Yang Xu (Fujitsu)

[permalink] [raw]
Subject: [PATCH v3 3/7] xfs: Only do posix acl setup/release operation under CONFIG_XFS_POSIX_ACL

Usually, filesystem will use a function named as fs_init_acl function that belong
to acl.c and this function is externed in acl.h by using CONFIG_FS_POSIX_ACL.

If filesystem disable this switch, we should not call xfs_set_acl also not call
posix_acl_create/posix_acl_release because it is useless(We have do umask
strip in vfs).

Signed-off-by: Yang Xu <[email protected]>
---
fs/xfs/xfs_iops.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index b34e8e4344a8..9487e68bdd3d 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -146,10 +146,12 @@ xfs_create_need_xattr(
struct posix_acl *default_acl,
struct posix_acl *acl)
{
+#ifdef CONFIG_XFS_POSIX_ACL
if (acl)
return true;
if (default_acl)
return true;
+#endif
#if IS_ENABLED(CONFIG_SECURITY)
if (dir->i_sb->s_security)
return true;
@@ -184,9 +186,11 @@ xfs_generic_create(
rdev = 0;
}

+#ifdef CONFIG_XFS_POSIX_ACL
error = posix_acl_create(dir, &mode, &default_acl, &acl);
if (error)
return error;
+#endif

/* Verify mode is valid also for tmpfile case */
error = xfs_dentry_mode_to_name(&name, dentry, mode);
@@ -241,8 +245,10 @@ xfs_generic_create(
xfs_finish_inode_setup(ip);

out_free_acl:
+#ifdef CONFIG_XFS_POSIX_ACL
posix_acl_release(default_acl);
posix_acl_release(acl);
+#endif
return error;

out_cleanup_inode:
--
2.27.0

2022-04-16 02:12:49

by Yang Xu (Fujitsu)

[permalink] [raw]
Subject: [PATCH v3 7/7] ceph: Remove S_ISGID clear code in ceph_finish_async_create

Since vfs has stripped S_ISGID, we don't need this code any more.

Signed-off-by: Yang Xu <[email protected]>
---
fs/ceph/file.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 6c9e837aa1d3..8e3b99853333 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -651,10 +651,6 @@ static int ceph_finish_async_create(struct inode *dir, struct dentry *dentry,
/* Directories always inherit the setgid bit. */
if (S_ISDIR(mode))
mode |= S_ISGID;
- else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) &&
- !in_group_p(dir->i_gid) &&
- !capable_wrt_inode_uidgid(&init_user_ns, dir, CAP_FSETID))
- mode &= ~S_ISGID;
} else {
in.gid = cpu_to_le32(from_kgid(&init_user_ns, current_fsgid()));
}
--
2.27.0

2022-04-16 02:19:18

by Yang Xu (Fujitsu)

[permalink] [raw]
Subject: [PATCH v3 2/7] fs/namei.c: Add missing umask strip in vfs_tmpfile

If underflying filesystem doesn't enable own CONFIG_FS_POSIX_ACL, then
posix_acl_create can't be called. So we will miss umask strip, ie
use ext4 with noacl or disblae CONFIG_EXT4_FS_POSIX_ACL.

Reported-by: Christian Brauner (Microsoft) <[email protected]>
Signed-off-by: Yang Xu <[email protected]>
---
fs/namei.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/namei.c b/fs/namei.c
index 3f1829b3ab5b..bbc7c950bbdc 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3521,6 +3521,8 @@ struct dentry *vfs_tmpfile(struct user_namespace *mnt_userns,
child = d_alloc(dentry, &slash_name);
if (unlikely(!child))
goto out_err;
+ if (!IS_POSIXACL(dir))
+ mode &= ~current_umask();
error = dir->i_op->tmpfile(mnt_userns, dir, child, mode);
if (error)
goto out_err;
--
2.27.0

2022-04-16 02:29:59

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] fs/inode: move sgid strip operation from inode_init_owner into inode_sgid_strip

On Fri, Apr 15, 2022 at 07:02:17PM +0800, Yang Xu wrote:
> This has no functional change. Just create and export inode_sgid_strip api for
> the subsequent patch. This function is used to strip S_ISGID mode when init
> a new inode.
>
> Signed-off-by: Yang Xu <[email protected]>
> ---
> v2->v3:
> 1.Use const struct inode * instead of struct inode *
> 2.replace sgid strip with inode_sgid_strip in a single patch
> fs/inode.c | 24 ++++++++++++++++++++----
> include/linux/fs.h | 3 ++-
> 2 files changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/fs/inode.c b/fs/inode.c
> index 9d9b422504d1..1b569ad882ce 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -2246,10 +2246,8 @@ void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode,
> /* Directories are special, and always inherit S_ISGID */
> if (S_ISDIR(mode))
> mode |= S_ISGID;
> - else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) &&
> - !in_group_p(i_gid_into_mnt(mnt_userns, dir)) &&
> - !capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID))
> - mode &= ~S_ISGID;
> + else
> + inode_sgid_strip(mnt_userns, dir, &mode);
> } else
> inode_fsgid_set(inode, mnt_userns);
> inode->i_mode = mode;
> @@ -2405,3 +2403,21 @@ struct timespec64 current_time(struct inode *inode)
> return timestamp_truncate(now, inode);
> }
> EXPORT_SYMBOL(current_time);
> +
> +void inode_sgid_strip(struct user_namespace *mnt_userns,
> + const struct inode *dir, umode_t *mode)
> +{
> + if (!dir || !(dir->i_mode & S_ISGID))
> + return;
> + if ((*mode & (S_ISGID | S_IXGRP)) != (S_ISGID | S_IXGRP))
> + return;
> + if (S_ISDIR(*mode))
> + return;

I'd place that check first as this whole function is really only
relevant for non-directories.

Otherwise I can live with *mode being a pointer although I still find
this unpleasant API wise but the bikeshed does it's job without having
my color. :)

I'd like to do some good testing on this.

Acked-by: Christian Brauner (Microsoft) <[email protected]>

2022-04-18 06:45:43

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] fs/inode: move sgid strip operation from inode_init_owner into inode_sgid_strip

On Fri, Apr 15, 2022 at 04:09:24PM +0200, Christian Brauner wrote:
> > + inode_sgid_strip(mnt_userns, dir, &mode);
> > } else
> > inode_fsgid_set(inode, mnt_userns);
> > inode->i_mode = mode;
> > @@ -2405,3 +2403,21 @@ struct timespec64 current_time(struct inode *inode)
> > return timestamp_truncate(now, inode);
> > }
> > EXPORT_SYMBOL(current_time);
> > +
> > +void inode_sgid_strip(struct user_namespace *mnt_userns,
> > + const struct inode *dir, umode_t *mode)
> > +{
> > + if (!dir || !(dir->i_mode & S_ISGID))
> > + return;
> > + if ((*mode & (S_ISGID | S_IXGRP)) != (S_ISGID | S_IXGRP))
> > + return;
> > + if (S_ISDIR(*mode))
> > + return;
>
> I'd place that check first as this whole function is really only
> relevant for non-directories.
>
> Otherwise I can live with *mode being a pointer although I still find
> this unpleasant API wise but the bikeshed does it's job without having
> my color. :)

No, I think your instincts are correct. This should be

umode_t inode_sgid_strip(struct user_namespace *mnt_userns,
const struct inode *dir, umode_t mode)
{
if (S_ISDIR(mode) || !dir || !(dir->i_mode & S_ISGID))
return mode;
if (mode & (S_ISGID | S_IXGRP) != (S_ISGID | S_IXGRP))
return mode;
...

and the same for prepare_mode().

And really, I think this should be called inode_strip_sgid(). Right?

2022-04-18 08:00:36

by Yang Xu (Fujitsu)

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] fs/inode: move sgid strip operation from inode_init_owner into inode_sgid_strip

on 2022/4/15 22:09, Christian Brauner wrote:
> On Fri, Apr 15, 2022 at 07:02:17PM +0800, Yang Xu wrote:
>> This has no functional change. Just create and export inode_sgid_strip api for
>> the subsequent patch. This function is used to strip S_ISGID mode when init
>> a new inode.
>>
>> Signed-off-by: Yang Xu<[email protected]>
>> ---
>> v2->v3:
>> 1.Use const struct inode * instead of struct inode *
>> 2.replace sgid strip with inode_sgid_strip in a single patch
>> fs/inode.c | 24 ++++++++++++++++++++----
>> include/linux/fs.h | 3 ++-
>> 2 files changed, 22 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/inode.c b/fs/inode.c
>> index 9d9b422504d1..1b569ad882ce 100644
>> --- a/fs/inode.c
>> +++ b/fs/inode.c
>> @@ -2246,10 +2246,8 @@ void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode,
>> /* Directories are special, and always inherit S_ISGID */
>> if (S_ISDIR(mode))
>> mode |= S_ISGID;
>> - else if ((mode& (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)&&
>> - !in_group_p(i_gid_into_mnt(mnt_userns, dir))&&
>> - !capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID))
>> - mode&= ~S_ISGID;
>> + else
>> + inode_sgid_strip(mnt_userns, dir,&mode);
>> } else
>> inode_fsgid_set(inode, mnt_userns);
>> inode->i_mode = mode;
>> @@ -2405,3 +2403,21 @@ struct timespec64 current_time(struct inode *inode)
>> return timestamp_truncate(now, inode);
>> }
>> EXPORT_SYMBOL(current_time);
>> +
>> +void inode_sgid_strip(struct user_namespace *mnt_userns,
>> + const struct inode *dir, umode_t *mode)
>> +{
>> + if (!dir || !(dir->i_mode& S_ISGID))
>> + return;
>> + if ((*mode& (S_ISGID | S_IXGRP)) != (S_ISGID | S_IXGRP))
>> + return;
>> + if (S_ISDIR(*mode))
>> + return;
>
> I'd place that check first as this whole function is really only
> relevant for non-directories.
Sound reasonable.

Best Regards
Yang Xu
>
> Otherwise I can live with *mode being a pointer although I still find
> this unpleasant API wise but the bikeshed does it's job without having
> my color. :)
>
> I'd like to do some good testing on this.
>
> Acked-by: Christian Brauner (Microsoft)<[email protected]>

2022-04-18 08:28:30

by Yang Xu (Fujitsu)

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] ceph: Remove S_ISGID clear code in ceph_finish_async_create

on 2022/4/18 11:04, Xiubo Li wrote:
>
> On 4/15/22 7:02 PM, Yang Xu wrote:
>> Since vfs has stripped S_ISGID, we don't need this code any more.
>>
>> Signed-off-by: Yang Xu <[email protected]>
>> ---
>> fs/ceph/file.c | 4 ----
>> 1 file changed, 4 deletions(-)
>>
>> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
>> index 6c9e837aa1d3..8e3b99853333 100644
>> --- a/fs/ceph/file.c
>> +++ b/fs/ceph/file.c
>> @@ -651,10 +651,6 @@ static int ceph_finish_async_create(struct inode
>> *dir, struct dentry *dentry,
>> /* Directories always inherit the setgid bit. */
>> if (S_ISDIR(mode))
>> mode |= S_ISGID;
>> - else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) &&
>> - !in_group_p(dir->i_gid) &&
>> - !capable_wrt_inode_uidgid(&init_user_ns, dir, CAP_FSETID))
>> - mode &= ~S_ISGID;
>
> Could you point me where has done this for ceph ?

You can see the 6th patch, it added prepare_mode for tmpfile, open,
mknodat, mkdirat in vfs. The prepare_mode does inode sgid strip and
umask strip.

Best Regards
Yang Xu
>
> -- Xiubo
>
>
>> } else {
>> in.gid = cpu_to_le32(from_kgid(&init_user_ns, current_fsgid()));
>> }
>

2022-04-18 10:42:12

by Xiubo Li

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] ceph: Remove S_ISGID clear code in ceph_finish_async_create


On 4/15/22 7:02 PM, Yang Xu wrote:
> Since vfs has stripped S_ISGID, we don't need this code any more.
>
> Signed-off-by: Yang Xu <[email protected]>
> ---
> fs/ceph/file.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index 6c9e837aa1d3..8e3b99853333 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -651,10 +651,6 @@ static int ceph_finish_async_create(struct inode *dir, struct dentry *dentry,
> /* Directories always inherit the setgid bit. */
> if (S_ISDIR(mode))
> mode |= S_ISGID;
> - else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) &&
> - !in_group_p(dir->i_gid) &&
> - !capable_wrt_inode_uidgid(&init_user_ns, dir, CAP_FSETID))
> - mode &= ~S_ISGID;

Could you point me where has done this for ceph ?

-- Xiubo


> } else {
> in.gid = cpu_to_le32(from_kgid(&init_user_ns, current_fsgid()));
> }

2022-04-18 10:43:53

by Yang Xu (Fujitsu)

[permalink] [raw]
Subject: Re: [PATCH v3 1/7] fs/inode: move sgid strip operation from inode_init_owner into inode_sgid_strip

on 2022/4/18 11:08, Matthew Wilcox wrote:
> On Fri, Apr 15, 2022 at 04:09:24PM +0200, Christian Brauner wrote:
>>> + inode_sgid_strip(mnt_userns, dir,&mode);
>>> } else
>>> inode_fsgid_set(inode, mnt_userns);
>>> inode->i_mode = mode;
>>> @@ -2405,3 +2403,21 @@ struct timespec64 current_time(struct inode *inode)
>>> return timestamp_truncate(now, inode);
>>> }
>>> EXPORT_SYMBOL(current_time);
>>> +
>>> +void inode_sgid_strip(struct user_namespace *mnt_userns,
>>> + const struct inode *dir, umode_t *mode)
>>> +{
>>> + if (!dir || !(dir->i_mode& S_ISGID))
>>> + return;
>>> + if ((*mode& (S_ISGID | S_IXGRP)) != (S_ISGID | S_IXGRP))
>>> + return;
>>> + if (S_ISDIR(*mode))
>>> + return;
>>
>> I'd place that check first as this whole function is really only
>> relevant for non-directories.
>>
>> Otherwise I can live with *mode being a pointer although I still find
>> this unpleasant API wise but the bikeshed does it's job without having
>> my color. :)
>
> No, I think your instincts are correct. This should be
I can't understand why returning umode_t is better. So Does kernel have
some rules for adding new function I don't notice before? Just need a
reason.

ps: I will decide whether use pointer or use return umode_t value before
I send v4.

Best Regards
Yang Xu
>
> umode_t inode_sgid_strip(struct user_namespace *mnt_userns,
> const struct inode *dir, umode_t mode)
> {
> if (S_ISDIR(mode) || !dir || !(dir->i_mode& S_ISGID))
> return mode;
> if (mode& (S_ISGID | S_IXGRP) != (S_ISGID | S_IXGRP))
> return mode;
> ...
>
> and the same for prepare_mode().
>
> And really, I think this should be called inode_strip_sgid(). Right?