2008-12-04 15:20:40

by Michael Kerrisk

Subject: Could you write some CLONE_NEWUSER?

Hi Serge,

Thanks for CCing me on recent CLONE_NEWUSER patches.

Would you be willing to write some documentation for this flag? (It's
the only remaining undocumented flag in clone(2).) Plain text would
be fine -- I'll integrate it into the man page with suitable macros.

Cheers,

Michael


---------- Forwarded message ----------
From: Michael Kerrisk <[email protected]>
Date: Wed, Nov 19, 2008 at 3:04 PM
Subject: Current state of CLONE_NEWUSER?
To: Serge Hallyn <[email protected]>
Cc: Subrata Modak <[email protected]>, [email protected],
lkml <[email protected]>, [email protected],
[email protected], [email protected], [email protected]


Hi Serge,

What is the current status of CLONE_NEWUSER? I'm currently trying to
test this flag in preparation for documenting it in the clone(2) man
page, but am running into an ENOMEM error from the clone() call, which
seems to occur after a failure in kobject_init_and_add() in the
following call sequence:

clone_user_ns() --> alloc_uid() --> uids_user_create() -->
kobject_init_and_add()

Are there already some test programs somewhere? Is there any
documentation already available for this flag?

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git
man-pages online: http://www.kernel.org/doc/man-pages/online_pages.html
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html





2008-12-04 15:40:15

by Serge E. Hallyn

Subject: Re: Could you write some CLONE_NEWUSER?

Quoting Michael Kerrisk ([email protected]):
> Hi Serge,
>
> Thanks for CCing me on recent CLONE_NEWUSER patches.
>
> Would you be willing to write some documentation for this flag? (It's
> the only remaining undocumented flag in clone(2).) Plain text would
> be fine -- I'll integrate it into the man page with suitable macros.

Will do, will aim to send it to you by tomorrow.

-serge

2008-12-04 15:43:11

by Michael Kerrisk

Subject: Re: Could you write some CLONE_NEWUSER?

On Thu, Dec 4, 2008 at 10:34 AM, Serge E. Hallyn <[email protected]> wrote:
> Quoting Michael Kerrisk ([email protected]):
>> Hi Serge,
>>
>> Thanks for CCing me on recent CLONE_NEWUSER patches.
>>
>> Would you be willing to write some documentation for this flag? (It's
>> the only remaining undocumented flag in clone(2).) Plain text would
>> be fine -- I'll integrate it into the man page with suitable macros.
>
> Will do, will aim to send it to you by tomorrow.

Thanks Serge.



2008-12-04 19:04:58

by Serge E. Hallyn

Subject: Re: Could you write some CLONE_NEWUSER?

Quoting Michael Kerrisk ([email protected]):
> Hi Serge,
>
> Thanks for CCing me on recent CLONE_NEWUSER patches.
>
> Would you be willing to write some documentation for this flag? (It's
> the only remaining undocumented flag in clone(2).) Plain text would
> be fine -- I'll integrate it into the man page with suitable macros.

Well here is a start. David, writing this actually reminded me that
the per-user keys still aren't per-namespace. Did you say you were
looking at that, or should I send a patch (starting at
security/keys/key.c:key_user_lookup())?

Eric, if you get a second, could you please review?

thanks,
-serge


CLONE_NEWUSER
Start the child in a new user namespace.

User namespaces are very incomplete. When complete, they
will implement hierarchical userid namespaces designed to
be safely used without privilege. User namespaces are
unnamed, but for the sake of this explanation we will give
them a single-letter ID. Let us refer to userid 500 in user
namespace B as (B, 500). Assume a process owned by (B, 500)
passes CLONE_NEWUSER to clone(2). A new user namespace, C,
will be created. The new task will be owned by user
(C, 0). No userid in user namespace C will be able to
gain more access than (B, 500) could obtain. User (C, 500)
will be protected from (C, 501) as usual. Files created
by (C, 501) are owned by both (C, 501) and (B, 500), so
(B, 500) owns all files created in user namespace C. Likewise
(B, 500) can kill and ptrace any processes owned by (C, 501).
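
The ownership rule in the paragraph above can be sketched as a small user-space model. This is only an illustration of the described hierarchy, not kernel code: the names `struct user_ns`, `struct user_id` and `user_owns()` are hypothetical and do not correspond to the kernel's own structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a user namespace; not the kernel's structure. */
struct user_ns {
    const struct user_ns *creator_ns; /* namespace of the creating user, NULL for the initial one */
    unsigned int creator_uid;         /* userid (in creator_ns) that created this namespace */
};

/* A (namespace, userid) pair such as (B, 500). */
struct user_id {
    const struct user_ns *ns;
    unsigned int uid;
};

/*
 * Return true if `who` is `owner` itself, or is the user that created
 * owner's namespace, or the creator of that creator's namespace, and so on.
 */
bool user_owns(struct user_id who, struct user_id owner)
{
    assert(who.ns != NULL && owner.ns != NULL);

    for (;;) {
        if (who.ns == owner.ns && who.uid == owner.uid)
            return true;
        if (owner.ns->creator_ns == NULL)
            return false; /* reached the initial namespace without a match */

        /* Step up: the creator of this namespace also owns everything in it. */
        struct user_id up = { owner.ns->creator_ns, owner.ns->creator_uid };
        owner = up;
    }
}
```

With B as the initial namespace and C created by (B, 500), `user_owns()` is true for (B, 500) against (C, 501), and false for (C, 500) against (C, 501).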

In (!SECURE_NOROOT) mode, userid 0 gets privilege when executing
files. With user namespaces, userid 0 will still get these
privileges, but limited to namespaces it owns. For instance,
CAP_DAC_OVERRIDE will be targeted to files owned by the user's
user namespace, while CAP_SETUID is by nature per-namespace
and hence always safe.

Most of the permission checks to make this work are currently
unimplemented. If your kernel is compiled with CONFIG_USER_NS,
then you can create a new user namespace if you have
CAP_SYS_ADMIN, CAP_SETUID and CAP_SETGID capabilities. The
new task will be owned by userid and gid 0 in the new user
namespace. Current support is sufficient to provide separate
accounting, since uid 0 in each namespace is represented by a
different user struct.

Will return -EINVAL if called on a kernel compiled without
user namespace support (CONFIG_USER_NS=n), and -EPERM if
called by a process with insufficient privilege before support
is complete.
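
A minimal test sketch for the behaviour described above, assuming Linux and the glibc clone() wrapper. The helper names `try_clone_newuser()` and `explain_clone_error()` are made up for this example, and the error strings simply paraphrase the draft text (plus the ENOMEM case reported earlier in this thread). What the child prints for its uid and gid depends on the kernel version.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

/* Child entry point: report the IDs the child sees, then exit. */
static int child_fn(void *arg)
{
    (void)arg;
    printf("child: uid=%ld gid=%ld\n", (long)getuid(), (long)getgid());
    fflush(stdout); /* the child leaves via the exit syscall, so flush by hand */
    return 0;
}

/*
 * Try to start a child in a new user namespace.
 * Returns 0 on success, or the errno value of the failing call.
 */
int try_clone_newuser(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return ENOMEM;

    /* The stack grows downward on most architectures, so pass its top. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      CLONE_NEWUSER | SIGCHLD, NULL);
    if (pid == -1) {
        int err = errno;
        free(stack);
        return err;
    }

    int rc = 0;
    if (waitpid(pid, NULL, 0) == -1)
        rc = errno;
    free(stack);
    return rc;
}

/* Map the error codes discussed in this thread to a short explanation. */
const char *explain_clone_error(int err)
{
    switch (err) {
    case 0:      return "child started in a new user namespace";
    case EINVAL: return "kernel built without CONFIG_USER_NS";
    case EPERM:  return "caller lacks the required capabilities";
    case ENOMEM: return "allocation failed while setting up the namespace";
    default:     return "unexpected error";
    }
}
```

Calling `puts(explain_clone_error(try_clone_newuser()));` from a main() prints one line describing the outcome on the running kernel.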

2008-12-04 20:18:48

by Bryan Donlan

Subject: Re: Could you write some CLONE_NEWUSER?

On Thu, Dec 4, 2008 at 2:04 PM, Serge E. Hallyn <[email protected]> wrote:
> Quoting Michael Kerrisk ([email protected]):
>> Hi Serge,
>>
>> Thanks for CCing me on recent CLONE_NEWUSER patches.
>>
>> Would you be willing to write some documentation for this flag? (It's
>> the only remaining undocumented flag in clone(2).) Plain text would
>> be fine -- I'll integrate it into the man page with suitable macros.
>
> Well here is a start. David, writing this actually reminded me that
> the per-user keys still aren't per-namespace. Did you say you were
> looking at that, or should I send a patch (starting at
> security/keys/key.c:key_user_lookup())?
>
> Eric, if you get a second, could you please review?
>
> thanks,
> -serge
>
>
> CLONE_NEWUSER
> Start the child in a new user namespace.
>
> User namespaces are very incomplete. When complete, they
> will implement hierarchical userid namespaces designed to
> be safely used without privilege. User namespaces are
> unnamed, but for the sake of this explanation we will give
> them a single-letter ID. Let us refer to userid 500 in user
> namespace B as (B, 500). Assume a process owned by (B, 500)
> passes CLONE_NEWUSER to clone(2). A new user namespace, C,
> will be created. The new task will be owned by user
> (C, 0). No userid in user namespace C will be able to
> gain more access than (B, 500) could obtain. User (C, 500)
> will be protected from (C, 501) as usual. Files created
> by (C, 501) are owned by both (C, 501) and (B, 500), so
> (B, 500) owns all files created in user namespace C. Likewise
> (B, 500) can kill and ptrace any processes owned by (C, 501).

This is more of a general question than one about this
manpage, but how will files owned by user namespaces be represented on
the underlying filesystem? Since (C, 501) will be meaningless after a
reboot at the latest, it makes little sense to persist them...

2008-12-04 22:33:50

by Serge E. Hallyn

Subject: Re: Could you write some CLONE_NEWUSER?

Quoting Bryan Donlan ([email protected]):
> This is more of a general question than one about this
> manpage, but how will files owned by user namespaces be represented on
> the underlying filesystem? Since (C, 501) will be meaningless after a
> reboot at the latest, it makes little sense to persist them...

Yeah that's a very interesting question. Clearly persistent names for
the user namespaces are needed. Eric very much wanted to avoid having
the user namespaces be explicitly named, so we pursued the path of
having the filesystem handle the naming. So in my last patchset,
a mount option could register the mounter's user namespace name.
There would be a system-wide policy saying, for instance, that user
namespaces owned by (B, 500) can register themselves at C.

(The end of the discussion arising from that patchset is here:
https://lists.linux-foundation.org/pipermail/containers/2008-August/012793.html)

In the simplest case of no fs support for user namespaces, the mount
will be 'owned' by the userns which mounted it (no persistent name
needed for that). Users who are in a different namespace will only get
the 'user other' permission to the file/dir, and may not create files
there (since we wouldn't know which userid to place on it).
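
A rough sketch of that simplest case, as a user-space model only. The `userns_id` type, `struct mount_info` and both functions are hypothetical names invented for this illustration, and group permissions are left out to keep it short.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for a user namespace identity. */
typedef int userns_id;

/* Hypothetical per-mount record: the userns that performed the mount. */
struct mount_info {
    userns_id owner_ns;
};

/*
 * Return the rwx bits (0..7) that apply to the caller for a file on this
 * mount. Callers from another user namespace only ever get the 'other'
 * bits. Group checks are omitted in this sketch.
 */
unsigned int effective_perm(const struct mount_info *mnt, mode_t mode,
                            uid_t file_uid, userns_id caller_ns,
                            uid_t caller_uid)
{
    if (caller_ns != mnt->owner_ns)
        return mode & S_IRWXO;        /* foreign namespace: 'other' only */
    if (caller_uid == file_uid)
        return (mode & S_IRWXU) >> 6; /* same namespace and same userid */
    return mode & S_IRWXO;
}

/*
 * Namespace-level check only: a caller from a foreign namespace may not
 * create files, since no userid could be recorded for them. Ordinary
 * directory permission checks would still apply on top of this.
 */
bool create_allowed_by_ns(const struct mount_info *mnt, userns_id caller_ns)
{
    return caller_ns == mnt->owner_ns;
}
```

For a file with mode 0640 owned by userid 500, a caller with userid 500 in the mounting namespace gets read and write access, while userid 500 in any other namespace gets nothing.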

Then the fs can support user namespaces - however it wants. It could
just store (B, 500),(C, 501) in an xattr. Or it could just store the
userid and userns name of the lowest user (i.e. C and 0), and count on
knowing that (B, 500) owns user namespace C. We do want to provide
generic helpers in lib/fsuserns.c which any fs could use.

But yes, picking a meaningful persistent name for a user namespace
is an issue.

-serge