2015-02-21 01:04:29

by Andy Lutomirski

Subject: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

It's currently impossible to mount devpts in a user namespace that
has no root user, since ptmx can't be created. This adds the mount
options ptmxuid and ptmxgid, which override the default uid and gid
of 0 for ptmx.

These options are not shown in mountinfo because they have no effect
other than changing the initial ownership of ptmx, and, in particular, it
wouldn't make any sense to change them on remount. Instead, we
disallow them on remount.

This could be changed, but we'd probably want to fix the userns
behavior of uid and gid at the same time if we did so.

Signed-off-by: Andy Lutomirski <[email protected]>
---
Documentation/filesystems/devpts.txt | 4 +++
fs/devpts/inode.c | 58 ++++++++++++++++++++++++++----------
2 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/Documentation/filesystems/devpts.txt b/Documentation/filesystems/devpts.txt
index 68dffd87f9b7..7808e77d0d72 100644
--- a/Documentation/filesystems/devpts.txt
+++ b/Documentation/filesystems/devpts.txt
@@ -121,6 +121,10 @@ once), following user-space issues should be noted.

chmod 666 /dev/pts/ptmx

+ The ownership for /dev/pts/ptmx can be specified using the ptmxuid
+ and ptmxgid options. Both default to zero, which, in user namespaces
+ that have no root user, will cause mounting to fail.
+
7. A mount of devpts without the 'newinstance' option results in binding to
initial kernel mount. This behavior while preserving legacy semantics,
does not provide strict isolation in a container environment. i.e by
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index cfe8466f7fef..b60d1438c660 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -102,6 +102,8 @@ struct pts_mount_opts {
 	int setgid;
 	kuid_t uid;
 	kgid_t gid;
+	uid_t ptmx_uid;
+	gid_t ptmx_gid;
 	umode_t mode;
 	umode_t ptmxmode;
 	int newinstance;
@@ -109,8 +111,8 @@ struct pts_mount_opts {
 };

 enum {
-	Opt_uid, Opt_gid, Opt_mode, Opt_ptmxmode, Opt_newinstance, Opt_max,
-	Opt_err
+	Opt_uid, Opt_gid, Opt_ptmx_uid, Opt_ptmx_gid, Opt_mode, Opt_ptmxmode,
+	Opt_newinstance, Opt_max, Opt_err,
 };

 static const match_table_t tokens = {
@@ -118,6 +120,8 @@ static const match_table_t tokens = {
 	{Opt_gid, "gid=%u"},
 	{Opt_mode, "mode=%o"},
 #ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES
+	{Opt_ptmx_uid, "ptmxuid=%u"},
+	{Opt_ptmx_gid, "ptmxgid=%u"},
 	{Opt_ptmxmode, "ptmxmode=%o"},
 	{Opt_newinstance, "newinstance"},
 	{Opt_max, "max=%d"},
@@ -162,14 +166,17 @@ static int parse_mount_options(char *data, int op, struct pts_mount_opts *opts)
 	char *p;
 	kuid_t uid;
 	kgid_t gid;
-
-	opts->setuid  = 0;
-	opts->setgid  = 0;
-	opts->uid     = GLOBAL_ROOT_UID;
-	opts->gid     = GLOBAL_ROOT_GID;
-	opts->mode    = DEVPTS_DEFAULT_MODE;
+	bool setptmxid = false;
+
+	opts->setuid   = 0;
+	opts->setgid   = 0;
+	opts->uid      = GLOBAL_ROOT_UID;
+	opts->gid      = GLOBAL_ROOT_GID;
+	opts->ptmx_uid = 0;
+	opts->ptmx_gid = 0;
+	opts->mode     = DEVPTS_DEFAULT_MODE;
 	opts->ptmxmode = DEVPTS_DEFAULT_PTMX_MODE;
-	opts->max     = NR_UNIX98_PTY_MAX;
+	opts->max      = NR_UNIX98_PTY_MAX;

 	/* newinstance makes sense only on initial mount */
 	if (op == PARSE_MOUNT)
@@ -209,6 +216,22 @@ static int parse_mount_options(char *data, int op, struct pts_mount_opts *opts)
 			opts->mode = option & S_IALLUGO;
 			break;
 #ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES
+		case Opt_ptmx_uid:
+			if (match_int(&args[0], &option))
+				return -EINVAL;
+			if (op != PARSE_MOUNT)
+				return -EINVAL;
+			opts->ptmx_uid = option;
+			setptmxid = true;
+			break;
+		case Opt_ptmx_gid:
+			if (match_int(&args[0], &option))
+				return -EINVAL;
+			if (op != PARSE_MOUNT)
+				return -EINVAL;
+			opts->ptmx_gid = option;
+			setptmxid = true;
+			break;
 		case Opt_ptmxmode:
 			if (match_octal(&args[0], &option))
 				return -EINVAL;
@@ -232,6 +255,9 @@ static int parse_mount_options(char *data, int op, struct pts_mount_opts *opts)
 		}
 	}

+	if (setptmxid && !opts->newinstance)
+		return -EINVAL;
+
 	return 0;
 }

@@ -245,12 +271,12 @@ static int mknod_ptmx(struct super_block *sb)
 	struct dentry *root = sb->s_root;
 	struct pts_fs_info *fsi = DEVPTS_SB(sb);
 	struct pts_mount_opts *opts = &fsi->mount_opts;
-	kuid_t root_uid;
-	kgid_t root_gid;
+	kuid_t ptmx_uid;
+	kgid_t ptmx_gid;

-	root_uid = make_kuid(current_user_ns(), 0);
-	root_gid = make_kgid(current_user_ns(), 0);
-	if (!uid_valid(root_uid) || !gid_valid(root_gid))
+	ptmx_uid = make_kuid(current_user_ns(), opts->ptmx_uid);
+	ptmx_gid = make_kgid(current_user_ns(), opts->ptmx_gid);
+	if (!uid_valid(ptmx_uid) || !gid_valid(ptmx_gid))
 		return -EINVAL;

 	mutex_lock(&root->d_inode->i_mutex);
@@ -282,8 +308,8 @@ static int mknod_ptmx(struct super_block *sb)

 	mode = S_IFCHR|opts->ptmxmode;
 	init_special_inode(inode, mode, MKDEV(TTYAUX_MAJOR, 2));
-	inode->i_uid = root_uid;
-	inode->i_gid = root_gid;
+	inode->i_uid = ptmx_uid;
+	inode->i_gid = ptmx_gid;

 	d_add(dentry, inode);

--
2.3.0
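
For illustration only, here is a minimal user-space sketch of how a
caller might request the new options, assuming a kernel with this patch
applied. The option spellings follow the match table in the patch
("ptmxuid=%u", "ptmxgid=%u"), and "newinstance" is included because the
patch rejects the new options without it. The helper names
build_devpts_opts() and mount_private_devpts() are hypothetical and not
part of the patch.

```c
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>

/*
 * Build the devpts option string. Returns 0 on success, -1 if the
 * buffer is too small. ptmxmode=0666 is set explicitly because the
 * ownership options alone do not make ptmx accessible.
 */
int build_devpts_opts(char *buf, size_t len, uid_t uid, gid_t gid)
{
	int n = snprintf(buf, len,
			 "newinstance,ptmxmode=0666,ptmxuid=%u,ptmxgid=%u",
			 (unsigned int)uid, (unsigned int)gid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: mount a private devpts instance at target.
 * Returns 0 on success, -1 on failure (errno set by mount(2)).
 */
int mount_private_devpts(const char *target, uid_t uid, gid_t gid)
{
	char opts[128];

	if (build_devpts_opts(opts, sizeof(opts), uid, gid))
		return -1;
	return mount("devpts", target, "devpts", MS_NOSUID | MS_NOEXEC, opts);
}
```

A caller inside a user namespace with no root mapping would pass its own
uid and gid, for example mount_private_devpts("/dev/pts", getuid(),
getgid()), so that mknod_ptmx() finds a valid mapping.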


2015-04-02 10:12:14

by James Bottomley

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
> On Tue, 2015-03-31 at 17:08 +0300, James Bottomley wrote:
> > On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski wrote:
> > >
> > > I don't think that this is correct. That user can already create a
> > > nested userns and map themselves as 0 inside it. Then they can mount
> > > devpts.
> >
> > I don't mind if they create a container and control the isolated ttys in
> > that sub container in the VPS; that's fine. I do mind if they get
> > access to the ttys in the VPS.
> >
> > If you can convince me (and the rest of Linux) that the tty subsystem
> > should be mountable by an unprivileged user generally, then what you
> > propose is OK.
>
> That is controlled by the general rights to mount stuff. I.e. unless you
> have CAP_SYS_ADMIN in the VPS container you will not be able to mount
> devpts there. You can only do it in a subcontainer where you got
> permissions to mount via using user namespaces.

OK let me try again. Fine, if you want to speak capabilities, you've
given a non-root user an unexpected capability (the capability of
creating a ptmx device). But you haven't used a capability separation
to do this, you've just hard coded it via a mount parameter mechanism.

If you want to do this thing, do it properly, so it's acceptable to the
whole of Linux, not a special corner case for one particular type of
container.

Security breaches are created when people code in special, little used,
corner cases because they don't get as thoroughly tested and inspected
as generally applicable mechanisms.

What you want is to be able to use the tty subsystem as a non root user:
fine, but set that up globally, don't hide it in containers so a lot
fewer people care.

James

2015-04-02 14:07:01

by Andy Lutomirski

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

On Thu, Apr 2, 2015 at 3:12 AM, James Bottomley
<[email protected]> wrote:
> On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
>> On Tue, 2015-03-31 at 17:08 +0300, James Bottomley wrote:
>> > On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski wrote:
>> > >
>> > > I don't think that this is correct. That user can already create a
>> > > nested userns and map themselves as 0 inside it. Then they can mount
>> > > devpts.
>> >
>> > I don't mind if they create a container and control the isolated ttys in
>> > that sub container in the VPS; that's fine. I do mind if they get
>> > access to the ttys in the VPS.
>> >
>> > If you can convince me (and the rest of Linux) that the tty subsystem
>> > should be mountable by an unprivileged user generally, then what you
>> > propose is OK.
>>
>> That is controlled by the general rights to mount stuff. I.e. unless you
>> have CAP_SYS_ADMIN in the VPS container you will not be able to mount
>> devpts there. You can only do it in a subcontainer where you got
>> permissions to mount via using user namespaces.
>
> OK let me try again. Fine, if you want to speak capabilities, you've
> given a non-root user an unexpected capability (the capability of
> creating a ptmx device). But you haven't used a capability separation
> to do this, you've just hard coded it via a mount parameter mechanism.
>
> If you want to do this thing, do it properly, so it's acceptable to the
> whole of Linux, not a special corner case for one particular type of
> container.
>
> Security breaches are created when people code in special, little used,
> corner cases because they don't get as thoroughly tested and inspected
> as generally applicable mechanisms.
>
> What you want is to be able to use the tty subsystem as a non root user:
> fine, but set that up globally, don't hide it in containers so a lot
> fewer people care.

I tend to agree, and not just for the tty subsystem. This is an
attack surface issue. With unprivileged user namespaces, unprivileged
users can create mount namespaces (probably a good thing for bind
mounts, etc), network namespaces (reasonably safe by themselves),
network interfaces and iptables rules (scary), fresh
instances/superblocks of some filesystems (scariness depends on the fs
-- tmpfs is probably fine), and more.

I think we should have real controls for this, and this is mostly
Eric's domain. Eric? A silly issue that sometimes prevents devpts
from being mountable isn't a real control, though.

--Andy

>
> James
>
>



--
Andy Lutomirski
AMA Capital Management, LLC
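
As an illustration of the "silly issue" referred to above: mknod_ptmx()
translates ID 0 through the caller's user namespace and fails when no
mapping exists. The following is a simplified user-space model of that
lookup, not the kernel's code. The names model_extent,
model_map_id_down() and MODEL_INVALID_ID are made up for this sketch;
the field names mirror the layout of a uid_map line.

```c
#include <stddef.h>

/* Stand-in for an invalid kuid/kgid in this model. */
#define MODEL_INVALID_ID ((unsigned int)-1)

/* One line of a uid_map or gid_map file. */
struct model_extent {
	unsigned int first;       /* first ID inside the namespace */
	unsigned int lower_first; /* corresponding ID outside */
	unsigned int count;       /* length of the mapped range */
};

/* Translate an in-namespace ID to the outer ID, or report "unmapped". */
unsigned int model_map_id_down(const struct model_extent *map, size_t n,
			       unsigned int id)
{
	for (size_t i = 0; i < n; i++) {
		if (id >= map[i].first && id - map[i].first < map[i].count)
			return map[i].lower_first + (id - map[i].first);
	}
	return MODEL_INVALID_ID;
}
```

With a map that contains only "1000 1000 1", looking up ID 0 yields
MODEL_INVALID_ID, which corresponds to the uid_valid() failure that makes
the unpatched mount return -EINVAL.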

2015-04-02 14:29:50

by Alexander Larsson

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

On Thu, 2015-04-02 at 07:06 -0700, Andy Lutomirski wrote:
> On Thu, Apr 2, 2015 at 3:12 AM, James Bottomley
> <[email protected]> wrote:
> > On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
> >> On Tue, 2015-03-31 at 17:08 +0300, James Bottomley wrote:
> >> > On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski wrote:
> >> > >
> >> > > I don't think that this is correct. That user can already create a
> >> > > nested userns and map themselves as 0 inside it. Then they can mount
> >> > > devpts.
> >> >
> >> > I don't mind if they create a container and control the isolated ttys in
> >> > that sub container in the VPS; that's fine. I do mind if they get
> >> > access to the ttys in the VPS.
> >> >
> >> > If you can convince me (and the rest of Linux) that the tty subsystem
> >> > should be mountable by an unprivileged user generally, then what you
> >> > propose is OK.
> >>
> >> That is controlled by the general rights to mount stuff. I.e. unless you
> >> have CAP_SYS_ADMIN in the VPS container you will not be able to mount
> >> devpts there. You can only do it in a subcontainer where you got
> >> permissions to mount via using user namespaces.
> >
> > OK let me try again. Fine, if you want to speak capabilities, you've
> > given a non-root user an unexpected capability (the capability of
> > creating a ptmx device). But you haven't used a capability separation
> > to do this, you've just hard coded it via a mount parameter mechanism.
> >
> > If you want to do this thing, do it properly, so it's acceptable to the
> > whole of Linux, not a special corner case for one particular type of
> > container.
> >
> > Security breaches are created when people code in special, little used,
> > corner cases because they don't get as thoroughly tested and inspected
> > as generally applicable mechanisms.
> >
> > What you want is to be able to use the tty subsystem as a non root user:
> > fine, but set that up globally, don't hide it in containers so a lot
> > fewer people care.
>
> I tend to agree, and not just for the tty subsystem. This is an
> attack surface issue. With unprivileged user namespaces, unprivileged
> users can create mount namespaces (probably a good thing for bind
> mounts, etc), network namespaces (reasonably safe by themselves),
> network interfaces and iptables rules (scary), fresh
> instances/superblocks of some filesystems (scariness depends on the fs
> -- tmpfs is probably fine), and more.
>
> I think we should have real controls for this, and this is mostly
> Eric's domain. Eric? A silly issue that sometimes prevents devpts
> from being mountable isn't a real control, though.

I'm honestly surprised that non-root users are allowed to mount things in
general with user namespaces. This was long disabled for non-root users in
Fedora, but it is now enabled.

For instance, using loopback mounted files you could probably attack
some of the less well tested filesystem implementations by feeding them
fuzzed data.

Anyway, I don't see how this affects devpts though. If you're running in
a container (or uncontained) as a regular user with no mount
capabilities, you can already mount a devpts filesystem if you create a
subcontainer with user namespaces and map your uid to 0 in the
subcontainer. Then you get a new ptmx device that you can do whatever
you want with. The mount option would let you do the same, except be
your regular uid in the subcontainer.

The only difference outside of the subcontainer arises when the outer
container has no uid 0 mapped, yet the user has CAP_SYS_ADMIN in
that container. Then they can mount devpts in the outer container, where
before they could only mount it in an inner container.
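
The existing route described above (create a user namespace and map
your own uid to 0) can be sketched roughly as follows. This is an
illustrative sketch rather than code from the thread: the helper names
are hypothetical, error reporting is minimal, and it assumes a kernel
that permits unprivileged user namespaces.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Format one uid_map/gid_map line mapping a single outer ID to an inner ID. */
int format_id_map(char *buf, size_t len, unsigned int inner, unsigned int outer)
{
	int n = snprintf(buf, len, "%u %u 1\n", inner, outer);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a whole string to an existing file; returns 0 on success. */
int write_file(const char *path, const char *text)
{
	ssize_t want = (ssize_t)strlen(text);
	ssize_t got;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	got = write(fd, text, (size_t)want);
	close(fd);
	return got == want ? 0 : -1;
}

/* Enter new user and mount namespaces in which the caller is uid 0, gid 0. */
int become_root_in_new_userns(void)
{
	uid_t uid = getuid(); /* read before unshare(), while still mapped */
	gid_t gid = getgid();
	char map[64];

	if (unshare(CLONE_NEWUSER | CLONE_NEWNS))
		return -1;
	if (format_id_map(map, sizeof(map), 0, uid) ||
	    write_file("/proc/self/uid_map", map))
		return -1;
	/* Since Linux 3.19, setgroups must be denied before gid_map is written. */
	if (write_file("/proc/self/setgroups", "deny"))
		return -1;
	if (format_id_map(map, sizeof(map), 0, gid) ||
	    write_file("/proc/self/gid_map", map))
		return -1;
	return 0;
}
```

After become_root_in_new_userns() succeeds, a devpts mount with
"newinstance" works without the patch because ID 0 is mapped. The patch
is aimed at the case where the map line is, say, "1000 1000 1" instead.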

2015-04-02 14:33:56

by Andy Lutomirski

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

On Thu, Apr 2, 2015 at 7:29 AM, Alexander Larsson <[email protected]> wrote:
> On Thu, 2015-04-02 at 07:06 -0700, Andy Lutomirski wrote:
>> On Thu, Apr 2, 2015 at 3:12 AM, James Bottomley
>> <[email protected]> wrote:
>> > On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
>> >> On Tue, 2015-03-31 at 17:08 +0300, James Bottomley wrote:
>> >> > On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski wrote:
>> >> > >
>> >> > > I don't think that this is correct. That user can already create a
>> >> > > nested userns and map themselves as 0 inside it. Then they can mount
>> >> > > devpts.
>> >> >
>> >> > I don't mind if they create a container and control the isolated ttys in
>> >> > that sub container in the VPS; that's fine. I do mind if they get
>> >> > access to the ttys in the VPS.
>> >> >
>> >> > If you can convince me (and the rest of Linux) that the tty subsystem
>> >> > should be mountable by an unprivileged user generally, then what you
>> >> > propose is OK.
>> >>
>> >> That is controlled by the general rights to mount stuff. I.e. unless you
>> >> have CAP_SYS_ADMIN in the VPS container you will not be able to mount
>> >> devpts there. You can only do it in a subcontainer where you got
>> >> permissions to mount via using user namespaces.
>> >
>> > OK let me try again. Fine, if you want to speak capabilities, you've
>> > given a non-root user an unexpected capability (the capability of
>> > creating a ptmx device). But you haven't used a capability separation
>> > to do this, you've just hard coded it via a mount parameter mechanism.
>> >
>> > If you want to do this thing, do it properly, so it's acceptable to the
>> > whole of Linux, not a special corner case for one particular type of
>> > container.
>> >
>> > Security breaches are created when people code in special, little used,
>> > corner cases because they don't get as thoroughly tested and inspected
>> > as generally applicable mechanisms.
>> >
>> > What you want is to be able to use the tty subsystem as a non root user:
>> > fine, but set that up globally, don't hide it in containers so a lot
>> > fewer people care.
>>
>> I tend to agree, and not just for the tty subsystem. This is an
>> attack surface issue. With unprivileged user namespaces, unprivileged
>> users can create mount namespaces (probably a good thing for bind
>> mounts, etc), network namespaces (reasonably safe by themselves),
>> network interfaces and iptables rules (scary), fresh
>> instances/superblocks of some filesystems (scariness depends on the fs
>> -- tmpfs is probably fine), and more.
>>
>> I think we should have real controls for this, and this is mostly
>> Eric's domain. Eric? A silly issue that sometimes prevents devpts
>> from being mountable isn't a real control, though.
>
> I'm honestly surprised that non-root users are allowed to mount things in
> general with user namespaces. This was long disabled for non-root users in
> Fedora, but it is now enabled.
>
> For instance, using loopback mounted files you could probably attack
> some of the less well tested filesystem implementations by feeding them
> fuzzed data.
>

You actually can't do that right now. Filesystems have to opt in to
being mounted in unprivileged user namespaces, and no filesystems with
backing stores have opted in. devpts has, but it's buggy without this
patch IMO.

> Anyway, I don't see how this affects devpts though. If you're running in
> a container (or uncontained) as a regular user with no mount
> capabilities, you can already mount a devpts filesystem if you create a
> subcontainer with user namespaces and map your uid to 0 in the
> subcontainer. Then you get a new ptmx device that you can do whatever
> you want with. The mount option would let you do the same, except be
> your regular uid in the subcontainer.
>
> The only difference outside of the subcontainer arises when the outer
> container has no uid 0 mapped, yet the user has CAP_SYS_ADMIN in
> that container. Then they can mount devpts in the outer container, where
> before they could only mount it in an inner container.
>

Agreed. Also, devpts doesn't seem scary at all to me from a userns
perspective. Regular users on normal systems can already use ptmx,
and AFAICS basically all of the attack surface is already available
through the normal /dev/ptmx node.

--Andy
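
To illustrate the opt-in described in this message: in kernels of this
period the opt-in appears to be the FS_USERNS_MOUNT bit in the
filesystem type's fs_flags. Treat that name and the value used below as
assumptions to check against include/linux/fs.h. The following is a
user-space model with made-up names (model_fs_type, model_may_mount), not
kernel code, and it leaves out the capability checks that the real mount
path also performs.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-in for the kernel's opt-in flag (value assumed). */
#define MODEL_FS_USERNS_MOUNT 0x8

/* Minimal stand-in for a registered filesystem type. */
struct model_fs_type {
	const char *name;
	int fs_flags;
};

/*
 * Model of the gate: mounts from the initial user namespace are not
 * restricted by the flag; mounts from any other user namespace are
 * refused unless the filesystem has opted in.
 */
int model_may_mount(const struct model_fs_type *type, bool in_init_userns)
{
	if (in_init_userns)
		return 0;
	return (type->fs_flags & MODEL_FS_USERNS_MOUNT) ? 0 : -EPERM;
}
```

In this model a filesystem with a backing store, which has not set the
flag, is refused from inside a user namespace, while devpts, which has
set it, is allowed through to its own mount-time checks.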

2015-04-02 15:49:44

by Serge Hallyn

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

Quoting Andy Lutomirski ([email protected]):
> On Thu, Apr 2, 2015 at 7:29 AM, Alexander Larsson <[email protected]> wrote:
> > On Thu, 2015-04-02 at 07:06 -0700, Andy Lutomirski wrote:
> >> On Thu, Apr 2, 2015 at 3:12 AM, James Bottomley
> >> <[email protected]> wrote:
> >> > On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
> >> >> On Tue, 2015-03-31 at 17:08 +0300, James Bottomley wrote:
> >> >> > On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski wrote:
> >> >> > >
> >> >> > > I don't think that this is correct. That user can already create a
> >> >> > > nested userns and map themselves as 0 inside it. Then they can mount
> >> >> > > devpts.
> >> >> >
> >> >> > I don't mind if they create a container and control the isolated ttys in
> >> >> > that sub container in the VPS; that's fine. I do mind if they get
> >> >> > access to the ttys in the VPS.
> >> >> >
> >> >> > If you can convince me (and the rest of Linux) that the tty subsystem
> >> >> > should be mountable by an unprivileged user generally, then what you
> >> >> > propose is OK.
> >> >>
> >> >> That is controlled by the general rights to mount stuff. I.e. unless you
> >> >> have CAP_SYS_ADMIN in the VPS container you will not be able to mount
> >> >> devpts there. You can only do it in a subcontainer where you got
> >> >> permissions to mount via using user namespaces.
> >> >
> >> > OK let me try again. Fine, if you want to speak capabilities, you've
> >> > given a non-root user an unexpected capability (the capability of
> >> > creating a ptmx device). But you haven't used a capability separation
> >> > to do this, you've just hard coded it via a mount parameter mechanism.
> >> >
> >> > If you want to do this thing, do it properly, so it's acceptable to the
> >> > whole of Linux, not a special corner case for one particular type of
> >> > container.
> >> >
> >> > Security breaches are created when people code in special, little used,
> >> > corner cases because they don't get as thoroughly tested and inspected
> >> > as generally applicable mechanisms.
> >> >
> >> > What you want is to be able to use the tty subsystem as a non root user:
> >> > fine, but set that up globally, don't hide it in containers so a lot
> >> > fewer people care.
> >>
> >> I tend to agree, and not just for the tty subsystem. This is an
> >> attack surface issue. With unprivileged user namespaces, unprivileged
> >> users can create mount namespaces (probably a good thing for bind
> >> mounts, etc), network namespaces (reasonably safe by themselves),
> >> network interfaces and iptables rules (scary), fresh
> >> instances/superblocks of some filesystems (scariness depends on the fs
> >> -- tmpfs is probably fine), and more.
> >>
> >> I think we should have real controls for this, and this is mostly
> >> Eric's domain. Eric? A silly issue that sometimes prevents devpts
> >> from being mountable isn't a real control, though.
> >
> > I'm honestly surprised that non-root users are allowed to mount things in
> > general with user namespaces. This was long disabled for non-root users in
> > Fedora, but it is now enabled.
> >
> > For instance, using loopback mounted files you could probably attack
> > some of the less well tested filesystem implementations by feeding them
> > fuzzed data.
> >
>
> You actually can't do that right now. Filesystems have to opt in to
> being mounted in unprivileged user namespaces, and no filesystems with
> backing stores have opted in. devpts has, but it's buggy without this
> patch IMO.
>
> > Anyway, I don't see how this affects devpts though. If you're running in
> > a container (or uncontained) as a regular user with no mount
> > capabilities, you can already mount a devpts filesystem if you create a
> > subcontainer with user namespaces and map your uid to 0 in the
> > subcontainer. Then you get a new ptmx device that you can do whatever
> > you want with. The mount option would let you do the same, except be
> > your regular uid in the subcontainer.
> >
> > The only difference outside of the subcontainer arises when the outer
> > container has no uid 0 mapped, yet the user has CAP_SYS_ADMIN in
> > that container. Then they can mount devpts in the outer container, where
> > before they could only mount it in an inner container.
> >
>
> Agreed. Also, devpts doesn't seem scary at all to me from a userns
> perspective. Regular users on normal systems can already use ptmx,
> and AFAICS basically all of the attack surface is already available
> through the normal /dev/ptmx node.

I've been ignoring this thread because I was pretty sure I had acked the
original patch. If you don't have a record of that (or I'm plain wrong
and never did), please let me know.

2015-04-02 18:32:06

by Eric W. Biederman

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

Andy Lutomirski <[email protected]> writes:

> On Thu, Apr 2, 2015 at 7:29 AM, Alexander Larsson <[email protected]> wrote:
>> On Thu, 2015-04-02 at 07:06 -0700, Andy Lutomirski wrote:
>>> On Thu, Apr 2, 2015 at 3:12 AM, James Bottomley
>>> <[email protected]> wrote:
>>> > On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson wrote:
>>> >> On Tue, 2015-03-31 at 17:08 +0300, James Bottomley wrote:
>>> >> > On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski wrote:
>>> >> > >
>>> >> > > I don't think that this is correct. That user can already create a
>>> >> > > nested userns and map themselves as 0 inside it. Then they can mount
>>> >> > > devpts.
>>> >> >
>>> >> > I don't mind if they create a container and control the isolated ttys in
>>> >> > that sub container in the VPS; that's fine. I do mind if they get
>>> >> > access to the ttys in the VPS.
>>> >> >
>>> >> > If you can convince me (and the rest of Linux) that the tty subsystem
>>> >> > should be mountable by an unprivileged user generally, then what you
>>> >> > propose is OK.
>>> >>
>>> >> That is controlled by the general rights to mount stuff. I.e. unless you
>>> >> have CAP_SYS_ADMIN in the VPS container you will not be able to mount
>>> >> devpts there. You can only do it in a subcontainer where you got
>>> >> permissions to mount via using user namespaces.
>>> >
>>> > OK let me try again. Fine, if you want to speak capabilities, you've
>>> > given a non-root user an unexpected capability (the capability of
>>> > creating a ptmx device). But you haven't used a capability separation
>>> > to do this, you've just hard coded it via a mount parameter mechanism.
>>> >
>>> > If you want to do this thing, do it properly, so it's acceptable to the
>>> > whole of Linux, not a special corner case for one particular type of
>>> > container.
>>> >
>>> > Security breaches are created when people code in special, little used,
>>> > corner cases because they don't get as thoroughly tested and inspected
>>> > as generally applicable mechanisms.
>>> >
>>> > What you want is to be able to use the tty subsystem as a non root user:
>>> > fine, but set that up globally, don't hide it in containers so a lot
>>> > fewer people care.
>>>
>>> I tend to agree, and not just for the tty subsystem. This is an
>>> attack surface issue. With unprivileged user namespaces, unprivileged
>>> users can create mount namespaces (probably a good thing for bind
>>> mounts, etc), network namespaces (reasonably safe by themselves),
>>> network interfaces and iptables rules (scary), fresh
>>> instances/superblocks of some filesystems (scariness depends on the fs
>>> -- tmpfs is probably fine), and more.
>>>
>>> I think we should have real controls for this, and this is mostly
>>> Eric's domain. Eric? A silly issue that sometimes prevents devpts
>>> from being mountable isn't a real control, though.

I thought the controls for limiting how much of the userspace API
an application could use were called seccomp and seccomp2.

Do we need something like a PAM module so that we can set up these
controls during login?

>> I'm honestly surprised that non-root users are allowed to mount things in
>> general with user namespaces. This was long disabled for non-root users in
>> Fedora, but it is now enabled.
>>
>> For instance, using loopback mounted files you could probably attack
>> some of the less well tested filesystem implementations by feeding them
>> fuzzed data.
>>
>
> You actually can't do that right now. Filesystems have to opt in to
> being mounted in unprivileged user namespaces, and no filesystems with
> backing stores have opted in. devpts has, but it's buggy without this
> patch IMO.

Arguably you should use two user namespaces: the first to do what you
want to do as root, the second to run as the uid you want to run as.

>> Anyway, I don't see how this affects devpts though. If you're running in
>> a container (or uncontained) as a regular user with no mount
>> capabilities, you can already mount a devpts filesystem if you create a
>> subcontainer with user namespaces and map your uid to 0 in the
>> subcontainer. Then you get a new ptmx device that you can do whatever
>> you want with. The mount option would let you do the same, except be
>> your regular uid in the subcontainer.
>>
>> The only difference outside of the subcontainer arises when the outer
>> container has no uid 0 mapped, yet the user has CAP_SYS_ADMIN in
>> that container. Then they can mount devpts in the outer container, where
>> before they could only mount it in an inner container.
>>
>
> Agreed. Also, devpts doesn't seem scary at all to me from a userns
> perspective. Regular users on normal systems can already use ptmx,
> and AFAICS basically all of the attack surface is already available
> through the normal /dev/ptmx node.

My only real take is that there are a lot more places that you need to
tweak beyond devpts. So this patch seemed lacking and boring.

Beyond that, until I get the mount namespace sorted out, things are pretty
much in a feature freeze, because I can't multitask well enough to do
complicated patches and take feature patches at the same time.

Eric

2015-05-18 21:05:18

by Alexander Larsson

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

On Thu, 2015-03-26 at 12:29 -0700, Andy Lutomirski wrote:
> Ping? It's been over a month.

Ping again. I've tested this with
https://github.com/alexlarsson/xdg-app/tree/wip/userns
and this is the final kernel change needed to allow desktop sandboxing
without any raised privileges (setuid etc).

So,
Tested-by: [email protected]

And please, can we get some eyeballs on this, it really is very useful
(and very simple too).

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson Red Hat, Inc
[email protected] [email protected]
He's a hate-fuelled sweet-toothed cat burglar on his last day in the job.
She's a chain-smoking mute Hell's Angel who believes she is the
reincarnation of an ancient Egyptian queen. They fight crime!

2016-03-08 05:00:21

by Andy Lutomirski

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

On Thu, May 28, 2015 at 12:42 PM, Eric W. Biederman
<[email protected]> wrote:
> Andy Lutomirski <[email protected]> writes:
>
>> On Thu, May 28, 2015 at 10:01 AM, Alexander Larsson <[email protected]> wrote:
>>> On Thu, 2015-05-28 at 11:44 -0500, Eric W. Biederman wrote:
>>>> Andy Lutomirski <[email protected]> writes:
>>>>
>>>> > On Thu, Apr 2, 2015 at 11:27 AM, Eric W. Biederman
>>>> > <[email protected]> wrote:
>>>> > > Andy Lutomirski <[email protected]> writes:
>>>> > >
>>>> > > > On Thu, Apr 2, 2015 at 7:29 AM, Alexander Larsson <
>>>> > > > [email protected]> wrote:
>>>> > > > > On Thu, 2015-04-02 at 07:06 -0700, Andy Lutomirski wrote:
>>>> > > > > > On Thu, Apr 2, 2015 at 3:12 AM, James Bottomley
>>>> > > > > > <[email protected]> wrote:
>>>> > > > > > > On Tue, 2015-03-31 at 16:17 +0200, Alexander Larsson
>>>> > > > > > > wrote:
>>>> > > > > > > > On Tue, 2015-03-31 at 17:08 +0300, James Bottomley
>>>> > > > > > > > wrote:
>>>> > > > > > > > > On Tue, 2015-03-31 at 06:59 -0700, Andy Lutomirski
>>>> > > > > > > > > wrote:
>>>> > > > > > > > > >
>>>> > > > > > > > > > I don't think that this is correct. That user can
>>>> > > > > > > > > > already create a
>>>> > > > > > > > > > nested userns and map themselves as 0 inside it.
>>>> > > > > > > > > > Then they can mount
>>>> > > > > > > > > > devpts.
>>>> > > > > > > > >
>>>> > > > > > > > > I don't mind if they create a container and control
>>>> > > > > > > > > the isolated ttys in
>>>> > > > > > > > > that sub container in the VPS; that's fine. I do
>>>> > > > > > > > > mind if they get
>>>> > > > > > > > > access to the ttys in the VPS.
>>>> > > > > > > > >
>>>> > > > > > > > > If you can convince me (and the rest of Linux) that
>>>> > > > > > > > > the tty subsystem
>>>> > > > > > > > > should be mountable by an unprivileged user
>>>> > > > > > > > > generally, then what you
>>>> > > > > > > > > propose is OK.
>>>> > > > > > > >
>>>> > > > > > > > That is controlled by the general rights to mount
>>>> > > > > > > > stuff. I.e. unless you
>>>> > > > > > > > have CAP_SYS_ADMIN in the VPS container you will not be
>>>> > > > > > > > able to mount
>>>> > > > > > > > devpts there. You can only do it in a subcontainer
>>>> > > > > > > > where you got
>>>> > > > > > > > permissions to mount via using user namespaces.
>>>> > > > > > >
>>>> > > > > > > OK let me try again. Fine, if you want to speak
>>>> > > > > > > capabilities, you've
>>>> > > > > > > given a non-root user an unexpected capability (the
>>>> > > > > > > capability of
>>>> > > > > > > creating a ptmx device). But you haven't used a
>>>> > > > > > > capability separation
>>>> > > > > > > to do this, you've just hard coded it via a mount
>>>> > > > > > > parameter mechanism.
>>>> > > > > > >
>>>> > > > > > > If you want to do this thing, do it properly, so it's
>>>> > > > > > > acceptable to the
>>>> > > > > > > whole of Linux, not a special corner case for one
>>>> > > > > > > particular type of
>>>> > > > > > > container.
>>>> > > > > > >
>>>> > > > > > > Security breaches are created when people code in
>>>> > > > > > > special, little used,
>>>> > > > > > > corner cases because they don't get as thoroughly tested
>>>> > > > > > > and inspected
>>>> > > > > > > as generally applicable mechanisms.
>>>> > > > > > >
>>>> > > > > > > What you want is to be able to use the tty subsystem as a
>>>> > > > > > > non root user:
>>>> > > > > > > fine, but set that up globally, don't hide it in
>>>> > > > > > > containers so a lot
>>>> > > > > > > fewer people care.
>>>> > > > > >
>>>> > > > > > I tend to agree, and not just for the tty subsystem. This
>>>> > > > > > is an
>>>> > > > > > attack surface issue. With unprivileged user namespaces,
>>>> > > > > > unprivileged
>>>> > > > > > users can create mount namespaces (probably a good thing
>>>> > > > > > for bind
>>>> > > > > > mounts, etc), network namespaces (reasonably safe by
>>>> > > > > > themselves),
>>>> > > > > > network interfaces and iptables rules (scary), fresh
>>>> > > > > > instances/superblocks of some filesystems (scariness
>>>> > > > > > depends on the fs
>>>> > > > > > -- tmpfs is probably fine), and more.
>>>> > > > > >
>>>> > > > > > I think we should have real controls for this, and this is
>>>> > > > > > mostly
>>>> > > > > > Eric's domain. Eric? A silly issue that sometimes
>>>> > > > > > prevents devpts
>>>> > > > > > from being mountable isn't a real control, though.
>>>> > >
>>>> > > I thought the controls for limiting how much of the userspace API
>>>> > > an application could use were called seccomp and seccomp2.
>>>> > >
>>>> > > Do we need something like a PAM module so that we can set up
>>>> > > these
>>>> > > controls during login?
>>>> > >
>>>> > > > > I'm honestly surprised that non-root is allowed to mount
>>>> > > > > things in
>>>> > > > > general with user namespaces. This was long disabled for
>>>> > > > > non-root use in
>>>> > > > > Fedora, but it is now enabled.
>>>> > > > >
>>>> > > > > For instance, using loopback mounted files you could probably
>>>> > > > > attack
>>>> > > > > some of the less well tested filesystem implementations by
>>>> > > > > feeding them
>>>> > > > > fuzzed data.
>>>> > > > >
>>>> > > >
>>>> > > > You actually can't do that right now. Filesystems have to opt
>>>> > > > in to
>>>> > > > being mounted in unprivileged user namespaces, and no
>>>> > > > filesystems with
>>>> > > > backing stores have opted in. devpts has, but it's buggy
>>>> > > > without this
>>>> > > > patch IMO.
>>>> > >
>>>> > > Arguably you should use two user namespaces: the first to do
>>>> > > what you
>>>> > > want to as root, the second to run as the uid you want to run as.
>>>> > >
>>>> > > > > Anyway, I don't see how this affects devpts though. If you're
>>>> > > > > running in
>>>> > > > > a container (or uncontained), as a regular user with no
>>>> > > > > mount
>>>> > > > > capabilities you can already mount a devpts filesystem if you
>>>> > > > > create a
>>>> > > > > subcontainer with user namespaces and map your uid to 0 in
>>>> > > > > the
>>>> > > > > subcontainer. Then you get a new ptmx device that you can do
>>>> > > > > whatever
>>>> > > > > you want with. The mount option would let you do the same,
>>>> > > > > except be
>>>> > > > > your regular uid in the subcontainer.
>>>> > > > >
>>>> > > > > The only difference outside of the subcontainer is if
>>>> > > > > the outer
>>>> > > > > container has no uid 0 mapped, yet the user has CAP_SYS_ADMIN
>>>> > > > > rights in
>>>> > > > > that container: then he can mount devpts in the outer
>>>> > > > > container, where
>>>> > > > > before he could only mount it in an inner container.
>>>> > > > >
>>>> > > >
>>>> > > > Agreed. Also, devpts doesn't seem scary at all to me from a
>>>> > > > userns
>>>> > > > perspective. Regular users on normal systems can already use
>>>> > > > ptmx,
>>>> > > > and AFAICS basically all of the attack surface is already
>>>> > > > available
>>>> > > > through the normal /dev/ptmx node.
>>>> > >
>>>> > > My only real take is that there are a lot more places that you
>>>> > > need to
>>>> > > tweak beyond devpts. So this patch seemed lacking and boring.
>>>> > >
>>>> > > Beyond that until I get the mount namespace sorted out things are
>>>> > > pretty
>>>> > > much in a feature freeze because I can't multitask well enough to
>>>> > > do
>>>> > > complicated patches and take feature patches.
>>>> > >
>>>> >
>>>> > Eric, do you think you have time now to take a look at this patch?
>>>>
>>>> I am much closer. Escaping bind mounts is still not yet fixed but I
>>>> have code that almost works.
>>>>
>>>> My gut feel still says that two user namespaces, one where your 0 is
>>>> mapped to your uid and a second where your uid is identity mapped, is
>>>> the preferable configuration, and makes this patch unnecessary.
>>>
>>> I don't really understand this. My usecase is that I want a desktop app
>>> sandbox, it should run as the actual user that is running the graphical
>>> session mapped to its real uid. In this namespace i want a /dev/pts so
>>> that i can e.g. shell out to ssh and feed it a password on the tty
>>> prompt or similar. And i don't want to bind-mount in the host /dev/pts,
>>> because then the sandbox can read from the ttys of other apps.
>>>
>>> Where does the second namespace enter into this?
>>>
>>
>> I think Eric is suggesting making a user namespace that maps your uid
>> as 0, then making a mount namespace and mounting devpts, then making
>> *another* user namespace that maps your uid (seen as 0) back to
>> whatever nonzero number you want.
>>
>> That would probably work, but I think it's really ugly.
>
> I just looked and the number of places where we actually care if uid 0
> is mapped is very small. Mostly just the places that have to deal with
> setuid applications. So I think the maintenance burden is much smaller
> than I would have expected.
>
> That said if we update /dev/pts to handle being mounted by a non-root
> user I expect what we actually want is to use the fsuid and fsgid
> of the caller of mount. That is less code and it does the right thing
> without effort, and it makes sense even outside of a user namespace
> context.
>
> Something like:
>
> diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
> index add566303c68..8fdaa6740f23 100644
> --- a/fs/devpts/inode.c
> +++ b/fs/devpts/inode.c
> @@ -245,13 +245,8 @@ static int mknod_ptmx(struct super_block *sb)
>  	struct dentry *root = sb->s_root;
>  	struct pts_fs_info *fsi = DEVPTS_SB(sb);
>  	struct pts_mount_opts *opts = &fsi->mount_opts;
> -	kuid_t root_uid;
> -	kgid_t root_gid;
> -
> -	root_uid = make_kuid(current_user_ns(), 0);
> -	root_gid = make_kgid(current_user_ns(), 0);
> -	if (!uid_valid(root_uid) || !gid_valid(root_gid))
> -		return -EINVAL;
> +	kuid_t ptmx_uid = current_fsuid();
> +	kgid_t ptmx_gid = current_fsgid();
>
>  	mutex_lock(&d_inode(root)->i_mutex);
>
> @@ -282,8 +277,8 @@ static int mknod_ptmx(struct super_block *sb)
>
>  	mode = S_IFCHR|opts->ptmxmode;
>  	init_special_inode(inode, mode, MKDEV(TTYAUX_MAJOR, 2));
> -	inode->i_uid = root_uid;
> -	inode->i_gid = root_gid;
> +	inode->i_uid = ptmx_uid;
> +	inode->i_gid = ptmx_gid;
>
>  	d_add(dentry, inode);

Apparently alexl is encountering some annoyances related to the
current workaround, and the workaround is certainly ugly.

Your proposal seems like it could break some use cases involving
fscaps on a mount or mount-like binary.

What if we change it to use the owner of the userns that owns the
current mount ns? For anything that doesn't explicitly use
namespaces, this will be zero. For namespace users, it should do the
right thing.

--Andy
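
For readers following along: the check that the quoted diff removes
(make_kuid(current_user_ns(), 0) followed by uid_valid()) fails when uid 0
has no mapping in the mounting process's user namespace. A rough userspace
approximation of that condition is sketched below. It assumes the
/proc/<pid>/uid_map format of "inside-start outside-start length" per line;
the function names uid_is_mapped() and current_userns_has_root() are
hypothetical and not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true if `uid`, as seen inside the namespace, falls within any
 * extent of `map_text` (the contents of a uid_map file).
 */
bool uid_is_mapped(const char *map_text, unsigned long uid)
{
	unsigned long inside, outside, count;
	int consumed;

	assert(map_text != NULL);
	/* Each extent is "inside-start outside-start length". */
	while (sscanf(map_text, " %lu %lu %lu%n",
		      &inside, &outside, &count, &consumed) == 3) {
		if (uid >= inside && uid - inside < count)
			return true;
		map_text += consumed;
	}
	return false;
}

/*
 * Hypothetical helper: does the calling process's user namespace have a
 * uid 0? If the map cannot be read, this sketch assumes a classic system
 * where root exists.
 */
bool current_userns_has_root(void)
{
	char buf[16384];
	size_t n;
	FILE *f = fopen("/proc/self/uid_map", "r");

	if (!f)
		return true;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return uid_is_mapped(buf, 0);
}
```

In the initial namespace the map is typically a single extent starting at
0, so the first function reports uid 0 as mapped; in a namespace whose only
extent is "1000 1000 1" it does not, which is the case where the current
code returns -EINVAL.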

2016-03-08 09:16:42

by Alexander Larsson

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

On Mon, 2016-03-07 at 20:59 -0800, Andy Lutomirski wrote:
> On Thu, May 28, 2015 at 12:42 PM, Eric W. Biederman
> <[email protected]> wrote:
> > Andy Lutomirski <[email protected]> writes:
> > 
> Apparently alexl is encountering some annoyances related to the
> current workaround, and the workaround is certainly ugly.

It works, but it introduces an extra namespace that gets exposed to the
world, which is pretty ugly. For instance, entering the namespace
becomes hard. I can setns() into the intermediate user+mount namespace
without problems, but if i try to setns into the final user+mount ns
(it gets its own implicit mount ns) i get EPERM. I'm not sure exactly
why though...
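
For reference, the two-namespace workaround discussed in this thread could
be sketched roughly as follows. This is an illustration under assumptions,
not the code any particular sandbox uses: the function names, the return
codes, the temporary mount point under /tmp and the mount options are all
made up for the example, and it has to run in a forked child because it
changes the caller's namespaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum { WA_ERROR = -1, WA_OK = 0, WA_OUTER_FAILED = 1,
       WA_MOUNT_FAILED = 2, WA_INNER_FAILED = 3 };

static int write_file(const char *path, const char *text)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;		/* errno is from open() */
	ssize_t n = write(fd, text, strlen(text));
	int saved = errno;
	close(fd);
	errno = saved;
	return n == (ssize_t)strlen(text) ? 0 : -1;
}

/* Map one uid and one gid into the user namespace we just entered. */
static int write_id_maps(unsigned in_uid, unsigned out_uid,
			 unsigned in_gid, unsigned out_gid)
{
	char map[64];
	int n;

	/* Needed before gid_map on newer kernels; absent on old ones. */
	if (write_file("/proc/self/setgroups", "deny") != 0 && errno != ENOENT)
		return -1;
	n = snprintf(map, sizeof(map), "%u %u 1\n", in_uid, out_uid);
	assert(n > 0 && (size_t)n < sizeof(map));
	if (write_file("/proc/self/uid_map", map) != 0)
		return -1;
	n = snprintf(map, sizeof(map), "%u %u 1\n", in_gid, out_gid);
	assert(n > 0 && (size_t)n < sizeof(map));
	return write_file("/proc/self/gid_map", map);
}

static int workaround_child(const char *target, uid_t uid, gid_t gid)
{
	/* 1. Intermediate user+mount namespace where our uid appears as 0. */
	if (unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0 ||
	    write_id_maps(0, uid, 0, gid) != 0)
		return WA_OUTER_FAILED;
	/* 2. Mount devpts while a uid 0 exists, so ptmx can be created. */
	if (mount("devpts", target, "devpts", MS_NOSUID | MS_NOEXEC,
		  "newinstance,ptmxmode=0666") != 0)
		return WA_MOUNT_FAILED;
	/* 3. Second user namespace mapping 0 back to the original ids. */
	if (unshare(CLONE_NEWUSER) != 0 ||
	    write_id_maps(uid, 0, gid, 0) != 0)
		return WA_INNER_FAILED;
	return (geteuid() == uid && getegid() == gid) ? WA_OK : WA_INNER_FAILED;
}

int try_two_namespace_workaround(void)
{
	char target[] = "/tmp/devpts-wa-XXXXXX";
	uid_t uid = geteuid();	/* read ids before entering any namespace */
	gid_t gid = getegid();
	int status, result = WA_ERROR;
	pid_t pid, w;

	if (!mkdtemp(target))
		return WA_ERROR;
	pid = fork();
	if (pid == 0)
		_exit(workaround_child(target, uid, gid));
	if (pid > 0) {
		do {
			w = waitpid(pid, &status, 0);
		} while (w < 0 && errno == EINTR);
		if (w == pid && WIFEXITED(status))
			result = WEXITSTATUS(status);
	}
	rmdir(target);
	return result;
}
```

The intermediate namespace created in step 1 is the extra one that stays
visible to the rest of the system, which is the part described above as
ugly.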

> Your proposal seems like it could break some use cases involving
> fscaps on a mount or mount-like binary.
>
> What if we change it to use the owner of the userns that owns the
> current mount ns?  For anything that doesn't explicitly use
> namespaces, this will be zero.  For namespace users, it should do the
> right thing.

Any of these is fine with me. One nice thing would be if i could somehow
detect whether this was supported or not so that i can fall back on the
old workaround.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson Red Hat, Inc
[email protected] [email protected]
He's an all-American guitar-strumming househusband with no name. She's a
scantily clad impetuous former first lady who don't take no shit from
nobody. They fight crime!


2016-03-08 18:18:26

by Andy Lutomirski

Subject: Re: [PATCH] devpts: Add ptmx_uid and ptmx_gid options

On Tue, Mar 8, 2016 at 1:16 AM, Alexander Larsson <[email protected]> wrote:
> On Mon, 2016-03-07 at 20:59 -0800, Andy Lutomirski wrote:
>> On Thu, May 28, 2015 at 12:42 PM, Eric W. Biederman
>> <[email protected]> wrote:
>> > Andy Lutomirski <[email protected]> writes:
>> >
>> Apparently alexl is encountering some annoyances related to the
>> current workaround, and the workaround is certainly ugly.
>
> It works, but it introduces an extra namespace that gets exposed to the
> world, which is pretty ugly. For instance, entering the namespace
> becomes hard. I can setns() into the intermediate user+mount namespace
> without problems, but if i try to setns into the final user+mount ns
> (it gets its own implicit mount ns) i get EPERM. I'm not sure exactly
> why though...
>
>> Your proposal seems like it could break some use cases involving
>> fscaps on a mount or mount-like binary.
>>
>> What if we change it to use the owner of the userns that owns the
>> current mount ns? For anything that doesn't explicitly use
>> namespaces, this will be zero. For namespace users, it should do the
>> right thing.
>
> Any of these is fine with me. One nice thing would be if i could somehow
> detect whether this was supported or not so that i can fall back on the
> old workaround.

I'll send a patch.

I suppose the straightforward, if slightly awkward, way to detect it
is just to try it -- create a namespace and try to mount devpts.

--Andy
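
A minimal sketch of the "just try it" detection suggested above, under
stated assumptions: it forks a child, enters a fresh user+mount namespace
in which no uid 0 is mapped, and attempts a devpts mount on a temporary
directory. The function name probe_devpts_without_root(), the result codes,
the /tmp path and the mount options are hypothetical choices for this
example. If the caller is already root, the sketch maps it to uid 1 inside
the namespace so that the namespace still has no root user.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum { PROBE_ERROR = -1, PROBE_OK = 0, PROBE_MOUNT_FAILED = 1,
       PROBE_NO_USERNS = 2 };

static int write_file(const char *path, const char *text)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;		/* errno is from open() */
	ssize_t n = write(fd, text, strlen(text));
	int saved = errno;
	close(fd);
	errno = saved;
	return n == (ssize_t)strlen(text) ? 0 : -1;
}

static int probe_child(const char *target, uid_t uid, gid_t gid)
{
	/* Pick nonzero ids inside the namespace so that no root exists. */
	unsigned in_uid = uid ? (unsigned)uid : 1;
	unsigned in_gid = gid ? (unsigned)gid : 1;
	char map[64];
	int n;

	if (unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0)
		return PROBE_NO_USERNS;
	/* Needed before gid_map on newer kernels; absent on old ones. */
	if (write_file("/proc/self/setgroups", "deny") != 0 && errno != ENOENT)
		return PROBE_NO_USERNS;
	n = snprintf(map, sizeof(map), "%u %u 1\n", in_uid, (unsigned)uid);
	assert(n > 0 && (size_t)n < sizeof(map));
	if (write_file("/proc/self/uid_map", map) != 0)
		return PROBE_NO_USERNS;
	n = snprintf(map, sizeof(map), "%u %u 1\n", in_gid, (unsigned)gid);
	assert(n > 0 && (size_t)n < sizeof(map));
	if (write_file("/proc/self/gid_map", map) != 0)
		return PROBE_NO_USERNS;

	/* The actual probe: does devpts mount without a uid 0? */
	if (mount("devpts", target, "devpts", MS_NOSUID | MS_NOEXEC,
		  "newinstance,ptmxmode=0666") != 0)
		return PROBE_MOUNT_FAILED;
	return PROBE_OK;
}

int probe_devpts_without_root(void)
{
	char target[] = "/tmp/devpts-probe-XXXXXX";
	uid_t uid = geteuid();	/* read ids before entering the namespace */
	gid_t gid = getegid();
	int status, result = PROBE_ERROR;
	pid_t pid, w;

	if (!mkdtemp(target))
		return PROBE_ERROR;
	pid = fork();
	if (pid == 0)
		_exit(probe_child(target, uid, gid));
	if (pid > 0) {
		do {
			w = waitpid(pid, &status, 0);
		} while (w < 0 && errno == EINTR);
		if (w == pid && WIFEXITED(status))
			result = WEXITSTATUS(status);
	}
	rmdir(target);
	return result;
}
```

A caller would fall back to the old two-namespace workaround when this
returns PROBE_MOUNT_FAILED, and would treat PROBE_NO_USERNS as "user
namespaces are unavailable here", in which case neither approach applies.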