2021-01-21 13:36:24

by Christian Brauner

[permalink] [raw]
Subject: [PATCH v6 23/40] exec: handle idmapped mounts

When executing a setuid binary the kernel will verify in bprm_fill_uid()
that the inode has a mapping in the caller's user namespace before
setting the callers uid and gid. Let bprm_fill_uid() handle idmapped
mounts. If the inode is accessed through an idmapped mount it is mapped
according to the mount's user namespace. Afterwards the checks are
identical to non-idmapped mounts. If the initial user namespace is
passed nothing changes so non-idmapped mounts will see identical
behavior as before.

Link: https://lore.kernel.org/r/[email protected]
Cc: Christoph Hellwig <[email protected]>
Cc: David Howells <[email protected]>
Cc: Al Viro <[email protected]>
Cc: [email protected]
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
---
/* v2 */
unchanged

/* v3 */
unchanged

/* v4 */
- Serge Hallyn <[email protected]>:
- Use "mnt_userns" to refer to a vfsmount's userns everywhere to make
terminology consistent.

/* v5 */
unchanged
base-commit: 7c53f6b671f4aba70ff15e1b05148b10d58c2837

/* v6 */
base-commit: 19c329f6808995b142b3966301f217c831e7cf31

- Christoph Hellwig <[email protected]>:
- Use new file_mnt_user_ns() helper.
---
fs/exec.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index d803227805f6..48d1e8b1610b 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1580,6 +1580,7 @@ static void check_unsafe_exec(struct linux_binprm *bprm)
static void bprm_fill_uid(struct linux_binprm *bprm, struct file *file)
{
/* Handle suid and sgid on files */
+ struct user_namespace *mnt_userns;
struct inode *inode;
unsigned int mode;
kuid_t uid;
@@ -1596,13 +1597,15 @@ static void bprm_fill_uid(struct linux_binprm *bprm, struct file *file)
if (!(mode & (S_ISUID|S_ISGID)))
return;

+ mnt_userns = file_mnt_user_ns(file);
+
/* Be careful if suid/sgid is set */
inode_lock(inode);

/* reload atomically mode/uid/gid now that lock held */
mode = inode->i_mode;
- uid = inode->i_uid;
- gid = inode->i_gid;
+ uid = i_uid_into_mnt(mnt_userns, inode);
+ gid = i_gid_into_mnt(mnt_userns, inode);
inode_unlock(inode);

/* We ignore suid/sgid if there are no mappings for them in the ns */
--
2.30.0


2021-01-22 05:13:48

by James Morris

[permalink] [raw]
Subject: Re: [PATCH v6 23/40] exec: handle idmapped mounts

On Thu, 21 Jan 2021, Christian Brauner wrote:

> When executing a setuid binary the kernel will verify in bprm_fill_uid()
> that the inode has a mapping in the caller's user namespace before
> setting the callers uid and gid. Let bprm_fill_uid() handle idmapped
> mounts. If the inode is accessed through an idmapped mount it is mapped
> according to the mount's user namespace. Afterwards the checks are
> identical to non-idmapped mounts. If the initial user namespace is
> passed nothing changes so non-idmapped mounts will see identical
> behavior as before.
>
> Link: https://lore.kernel.org/r/[email protected]
> Cc: Christoph Hellwig <[email protected]>
> Cc: David Howells <[email protected]>
> Cc: Al Viro <[email protected]>
> Cc: [email protected]
> Reviewed-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Christian Brauner <[email protected]>


Reviewed-by: James Morris <[email protected]>


--
James Morris
<[email protected]>

2021-01-25 16:47:20

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH v6 23/40] exec: handle idmapped mounts

Christian Brauner <[email protected]> writes:

> When executing a setuid binary the kernel will verify in bprm_fill_uid()
> that the inode has a mapping in the caller's user namespace before
> setting the callers uid and gid. Let bprm_fill_uid() handle idmapped
> mounts. If the inode is accessed through an idmapped mount it is mapped
> according to the mount's user namespace. Afterwards the checks are
> identical to non-idmapped mounts. If the initial user namespace is
> passed nothing changes so non-idmapped mounts will see identical
> behavior as before.

This does not handle the v3 capabilites xattr with embeds a uid.
So at least at that level you are missing some critical conversions.

Eric

> Link: https://lore.kernel.org/r/[email protected]
> Cc: Christoph Hellwig <[email protected]>
> Cc: David Howells <[email protected]>
> Cc: Al Viro <[email protected]>
> Cc: [email protected]
> Reviewed-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Christian Brauner <[email protected]>
> ---
> /* v2 */
> unchanged
>
> /* v3 */
> unchanged
>
> /* v4 */
> - Serge Hallyn <[email protected]>:
> - Use "mnt_userns" to refer to a vfsmount's userns everywhere to make
> terminology consistent.
>
> /* v5 */
> unchanged
> base-commit: 7c53f6b671f4aba70ff15e1b05148b10d58c2837
>
> /* v6 */
> base-commit: 19c329f6808995b142b3966301f217c831e7cf31
>
> - Christoph Hellwig <[email protected]>:
> - Use new file_mnt_user_ns() helper.
> ---
> fs/exec.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index d803227805f6..48d1e8b1610b 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1580,6 +1580,7 @@ static void check_unsafe_exec(struct linux_binprm *bprm)
> static void bprm_fill_uid(struct linux_binprm *bprm, struct file *file)
> {
> /* Handle suid and sgid on files */
> + struct user_namespace *mnt_userns;
> struct inode *inode;
> unsigned int mode;
> kuid_t uid;
> @@ -1596,13 +1597,15 @@ static void bprm_fill_uid(struct linux_binprm *bprm, struct file *file)
> if (!(mode & (S_ISUID|S_ISGID)))
> return;
>
> + mnt_userns = file_mnt_user_ns(file);
> +
> /* Be careful if suid/sgid is set */
> inode_lock(inode);
>
> /* reload atomically mode/uid/gid now that lock held */
> mode = inode->i_mode;
> - uid = inode->i_uid;
> - gid = inode->i_gid;
> + uid = i_uid_into_mnt(mnt_userns, inode);
> + gid = i_gid_into_mnt(mnt_userns, inode);
> inode_unlock(inode);
>
> /* We ignore suid/sgid if there are no mappings for them in the ns */

2021-01-26 02:55:11

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH v6 23/40] exec: handle idmapped mounts

On Mon, Jan 25, 2021 at 05:44:04PM +0100, Christian Brauner wrote:
> On Mon, Jan 25, 2021 at 10:39:01AM -0600, Eric W. Biederman wrote:
> > Christian Brauner <[email protected]> writes:
> >
> > > When executing a setuid binary the kernel will verify in bprm_fill_uid()
> > > that the inode has a mapping in the caller's user namespace before
> > > setting the callers uid and gid. Let bprm_fill_uid() handle idmapped
> > > mounts. If the inode is accessed through an idmapped mount it is mapped
> > > according to the mount's user namespace. Afterwards the checks are
> > > identical to non-idmapped mounts. If the initial user namespace is
> > > passed nothing changes so non-idmapped mounts will see identical
> > > behavior as before.
> >
> > This does not handle the v3 capabilites xattr with embeds a uid.
> > So at least at that level you are missing some critical conversions.
>
> Thanks for looking. Vfs v3 caps are handled earlier in the series. I'm
> not sure what you're referring to here. There are tests in xfstests that
> verify vfs3 capability behavior.

*just* to make sure i'm not misunderstanding - s/vfs3/v3/ right?

2021-01-26 05:35:41

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v6 23/40] exec: handle idmapped mounts

On Mon, Jan 25, 2021 at 10:39:01AM -0600, Eric W. Biederman wrote:
> Christian Brauner <[email protected]> writes:
>
> > When executing a setuid binary the kernel will verify in bprm_fill_uid()
> > that the inode has a mapping in the caller's user namespace before
> > setting the callers uid and gid. Let bprm_fill_uid() handle idmapped
> > mounts. If the inode is accessed through an idmapped mount it is mapped
> > according to the mount's user namespace. Afterwards the checks are
> > identical to non-idmapped mounts. If the initial user namespace is
> > passed nothing changes so non-idmapped mounts will see identical
> > behavior as before.
>
> This does not handle the v3 capabilites xattr with embeds a uid.
> So at least at that level you are missing some critical conversions.

Thanks for looking. Vfs v3 caps are handled earlier in the series. I'm
not sure what you're referring to here. There are tests in xfstests that
verify vfs3 capability behavior.

Christian

2021-01-26 21:46:56

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v6 23/40] exec: handle idmapped mounts

On Mon, Jan 25, 2021 at 11:03:16AM -0600, Serge Hallyn wrote:
> On Mon, Jan 25, 2021 at 05:44:04PM +0100, Christian Brauner wrote:
> > On Mon, Jan 25, 2021 at 10:39:01AM -0600, Eric W. Biederman wrote:
> > > Christian Brauner <[email protected]> writes:
> > >
> > > > When executing a setuid binary the kernel will verify in bprm_fill_uid()
> > > > that the inode has a mapping in the caller's user namespace before
> > > > setting the callers uid and gid. Let bprm_fill_uid() handle idmapped
> > > > mounts. If the inode is accessed through an idmapped mount it is mapped
> > > > according to the mount's user namespace. Afterwards the checks are
> > > > identical to non-idmapped mounts. If the initial user namespace is
> > > > passed nothing changes so non-idmapped mounts will see identical
> > > > behavior as before.
> > >
> > > This does not handle the v3 capabilites xattr with embeds a uid.
> > > So at least at that level you are missing some critical conversions.
> >
> > Thanks for looking. Vfs v3 caps are handled earlier in the series. I'm
> > not sure what you're referring to here. There are tests in xfstests that
> > verify vfs3 capability behavior.
>
> *just* to make sure i'm not misunderstanding - s/vfs3/v3/ right?

Yes, in my mind it's always as "vfs v3 caps -> vfs3 caps". Sorry for the
confusion.

2021-01-27 08:22:53

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH v6 23/40] exec: handle idmapped mounts

On Mon, Jan 25, 2021 at 05:44:04PM +0100, Christian Brauner wrote:
> On Mon, Jan 25, 2021 at 10:39:01AM -0600, Eric W. Biederman wrote:
> > Christian Brauner <[email protected]> writes:
> >
> > > When executing a setuid binary the kernel will verify in bprm_fill_uid()
> > > that the inode has a mapping in the caller's user namespace before
> > > setting the callers uid and gid. Let bprm_fill_uid() handle idmapped
> > > mounts. If the inode is accessed through an idmapped mount it is mapped
> > > according to the mount's user namespace. Afterwards the checks are
> > > identical to non-idmapped mounts. If the initial user namespace is
> > > passed nothing changes so non-idmapped mounts will see identical
> > > behavior as before.
> >
> > This does not handle the v3 capabilites xattr with embeds a uid.
> > So at least at that level you are missing some critical conversions.
>
> Thanks for looking. Vfs v3 caps are handled earlier in the series. I'm
> not sure what you're referring to here. There are tests in xfstests that
> verify vfs3 capability behavior.
>
> Christian

So fwiw I just tested it manually as well. Scenario:

uid 1000 user ubuntu
uid 1001 user u2
sudo ./mount-idmapped --map-mount b:1000:1001:1 /home/ubuntu/ /mnt
su - u2
cp /usr/bin/id /mnt/
unshare -Urm
setcap cap_setuid=pe /mnt/id
At this point, from init_user_ns,
ubuntu@fscaps:~/mount-idmapped$ /home/u2/rcap /mnt/id
v3, rootid is 1001
ubuntu@fscaps:~/mount-idmapped$ /home/u2/rcap //home/ubuntu/id
v3, rootid is 1000

-serge