LinuxLists.cc - per-process namespace?

2004-06-29 18:47:25

Subject: per-process namespace?

Is there a way for an application to
1. fork its own namespace and modify it, and
2. still be able to see changes to the system namespace?

Al Viro's Per-process namespace implementation provides the first
feature. But is there any work done to do the second part? Is it worth
doing?

RP
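
A minimal sketch of the first feature (a process obtaining its own copy of the mount namespace), offered as an illustration rather than as the mechanism discussed in this thread. It assumes a Linux system whose libc exposes unshare(2), a call that was added to the kernel after this thread was written; at the time, the equivalent was clone(2) with CLONE_NEWNS. The function name is hypothetical, and the call normally needs CAP_SYS_ADMIN, so the sketch reports failure instead of raising.

```python
import ctypes
import os

CLONE_NEWNS = 0x00020000  # flag value from <linux/sched.h>


def try_private_mount_namespace() -> bool:
    """Fork a child that asks for its own copy of the mount namespace.

    Returns True if the child's unshare(CLONE_NEWNS) succeeded and False if
    the kernel refused (typically EPERM without CAP_SYS_ADMIN).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    pid = os.fork()
    if pid == 0:
        # Child: after a successful call it has its own copy of the mount table.
        rc = libc.unshare(CLONE_NEWNS)
        os._exit(0 if rc == 0 else 1)
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

The unshare happens in a forked child so that the calling process keeps its original namespace.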

2004-06-29 21:11:04

by Mike Waychison

Subject: Re: per-process namespace?

Ram Pai wrote:
> Is there a way for an application to
> 1. fork its own namespace and modify it, and
> 2. still be able to see changes to the system namespace?
>
> Al Viro's Per-process namespace implementation provides the first
> feature. But is there any work done to do the second part? Is it worth
> doing?
>
> RP

In what sense?

The current model has no definition for a 'system namespace'.

Accessing /proc/<pid>/mounts where <pid> is running in a different
namespace appears to work. As well, you can always fchdir back into
another namespace temporarily. As long as you don't reference any
files or directories using absolute paths (including following symlinks),
then you can already navigate the entire namespace.
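
A minimal sketch of the fchdir technique described above, assuming the caller already holds an open directory fd. Here the fd simply points at a temporary directory; in the scenario under discussion it would refer to a directory in another namespace. The helper names are hypothetical.

```python
import os
import tempfile


def read_via_fchdir(dir_fd: int, relpath: str) -> bytes:
    """Temporarily fchdir into dir_fd and read relpath relative to it."""
    if os.path.isabs(relpath):
        # Absolute paths resolve against the caller's own root, not dir_fd.
        raise ValueError("relpath must be relative")
    saved = os.open(".", os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fchdir(dir_fd)
        with open(relpath, "rb") as f:
            return f.read()
    finally:
        os.fchdir(saved)  # return to where we started
        os.close(saved)


def demo() -> bytes:
    """Create a file in a scratch directory and read it back through a dir fd."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "note"), "wb") as f:
            f.write(b"hello")
        dfd = os.open(tmp, os.O_RDONLY | os.O_DIRECTORY)
        try:
            return read_via_fchdir(dfd, "note")
        finally:
            os.close(dfd)
```

The check for absolute paths mirrors the caveat in the text: only relative lookups start from the directory the fd refers to.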

This falls apart though when there are no longer any processes keeping
that namespace alive. When this happens, the vfsmounts are unstitched
and you end up 'stuck' on a given mount :(.

Another caveat is that the current system disallows you from doing any
mount/umount's in another namespace (bogus security?).

- --
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
http://www.sun.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE: The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

2004-06-29 22:10:30

by Al Viro

Subject: Re: per-process namespace?

On Tue, Jun 29, 2004 at 05:10:21PM -0400, Mike Waychison wrote:
> Another caveat is that the current system disallows you from doing any
> mount/umount's in another namespace (bogus security?).

Nothing bogus here - namespace boundary _IS_ a trust boundary and that's
exactly the difference between symlinks and bindings - symlink attacks
are possible exactly because they allow you to modify visible tree topology
for other users.

Note that sharing parts of namespace (which is basically what automounter
wants and what we do not have yet) is a deliberate act of trust - same as
having a part of your address space shared with another process.

2004-06-29 22:25:29

by Ram Pai

Subject: Re: per-process namespace?

On Tue, 2004-06-29 at 14:10, Mike Waychison wrote:
> Ram Pai wrote:
> > Is there a way for an application to
> > 1. fork its own namespace and modify it, and
> > 2. still be able to see changes to the system namespace?
> >
> > Al Viro's Per-process namespace implementation provides the first
> > feature. But is there any work done to do the second part? Is it worth
> > doing?
> >
> > RP
>
> In what sense?
>
> The current model has no definition for a 'system namespace'.

By 'system namespace' I mean the very first, hand-crafted initial
namespace.

>
> Accessing /proc/<pid>/mounts where <pid> is running in a different
> namespace appears to work.

Are you sure? I don't see that to be the case. I just verified it on 2.6.7:
/proc/<pid>/mounts is a file, whereas /proc/<pid>/root is a symbolic link
to the root directory of the process. So the process with a cloned
namespace won't be able to access it through its own namespace.

> As well, you can always fchdir back into
> another namespace temporarily. As long as you don't reference any
> file/directories using absolute paths (including following symlinks),
> then you can already navigate the entire namespace.

If this feature is available then great!

>
> This falls apart though when there are no longer any processes keeping
> that namespace alive. When this happens, the vfsmount's are unstitched
> and you end up 'stuck' on a given mount :(.

> Another caveat is that the current system disallows you from doing any
> mount/umount's in another namespace (bogus security?).

2004-06-29 23:23:27

by Ram Pai

Subject: Re: per-process namespace?

On Tue, 2004-06-29 at 15:10, Al Viro wrote:

> Note that sharing parts of namespace (which is basically what automounter
> wants and what we do not have yet) is deliberate act of trust - same as
> having a part of your address space shared with other process.

Al, are you working on this feature (namespace sharing)?
If so, can we help? If not, do you have an estimate of the complexity
of this work?

Thanks,
RP

2004-06-30 13:15:55

by Mike Waychison

Subject: Re: per-process namespace?

Al Viro wrote:
> On Tue, Jun 29, 2004 at 05:10:21PM -0400, Mike Waychison wrote:
>
>>Another caveat is that the current system disallows you from doing any
>>mount/umount's in another namespace (bogus security?).
>
>
> Nothing bogus here - namespace boundary _IS_ a trust boundary and that's
> exactly the difference between symlinks and bindings - symlink attacks
> are possible exactly because they allow you to modify visible tree
> topology for other users.

Yet being able to access another namespace via a directory fd still breaks
that boundary. I'm not sure if it's a feature or not. If it is a
feature, I'd argue that having the fd means you are trusted to play in
that namespace, which implies the right to do things like call mount(2)
in it.

>
> Note that sharing parts of namespace (which is basically what automounter
> wants and what we do not have yet) is deliberate act of trust - same as
> having a part of your address space shared with other process.

Namespace sharing has been touched on before, but hasn't been discussed
publicly. My take on it is that in order to have namespace sharing work
in some semantically sane way, we need to be able to identify
owner-namespaces for shared branches of the vfsmount tree. This implies
making namespaces first-class primitives. Is this where we want to go
with this?

I see automounting as the only consumer of such a beast; are
there other possible users?

2004-06-30 13:30:52

by Mike Waychison

Subject: Re: per-process namespace?

Ram Pai wrote:
> On Tue, 2004-06-29 at 14:10, Mike Waychison wrote:
>
>>Ram Pai wrote:
>>
>>>Is there a way for an application to
>>>1. fork its own namespace and modify it, and
>>>2. still be able to see changes to the system namespace?
>>>
>>>Al Viro's Per-process namespace implementation provides the first
>>>feature. But is there any work done to do the second part? Is it worth
>>>doing?
>>>
>>>RP
>>
>>In what sense?
>>
>>The current model has no definition for a 'system namespace'.
>
>
> by 'system namespace' I mean the very first initial hand-crafted
> namespace.
>

The problem is that namespaces have no inherent hierarchy to them. Once
you create one, all relation to the parenting namespace is lost. You
can't even tell if you are in a different namespace from the 'system
namespace' other than by comparing /proc/self/mounts with /proc/1/mounts.
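
The comparison mentioned above can be sketched as follows. This is a heuristic only, assuming procfs is mounted at /proc and that the caller is allowed to read the other process's mounts file; the function names are hypothetical.

```python
from pathlib import Path


def mount_table(pid="self"):
    """Return the lines of /proc/<pid>/mounts (requires procfs at /proc)."""
    return Path(f"/proc/{pid}/mounts").read_text().splitlines()


def tables_differ(ours, theirs):
    """True if the two mount tables are not line-for-line identical."""
    return list(ours) != list(theirs)


def looks_like_other_namespace(other_pid=1):
    """Heuristic only: identical tables do not prove a shared namespace."""
    return tables_differ(mount_table("self"), mount_table(other_pid))
```

A differing table suggests a different namespace, but two separate namespaces can carry identical mount tables, so a False result settles nothing.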

>
>>Accessing /proc/<pid>/mounts where <pid> is running in a different
>>namespace appears to work.
>
>
> Are you sure? I don't see that to be the case. I just verified it on 2.6.7:
> /proc/<pid>/mounts is a file, whereas /proc/<pid>/root is a symbolic link
> to the root directory of the process. So the process with a cloned
> namespace won't be able to access it through its own namespace.
>
>

Yes. mounts gives you the mount table; root is a symbolic link. You
can obtain a directory fd across a fork or over a Unix socket. Proc doesn't
give you any magic files to access namespaces directly.
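
A minimal sketch of handing a directory fd over a Unix socket, assuming Python 3.9 or later for socket.send_fds and socket.recv_fds. Both ends of the socket pair live in one process here purely for illustration; in the scenario above they would belong to processes in different namespaces. The function name is hypothetical.

```python
import os
import socket
import tempfile


def pass_directory_fd(path: str) -> int:
    """Send an open directory fd across a Unix socket pair.

    Returns the received descriptor, which refers to the same directory.
    The caller is responsible for closing it.
    """
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    dfd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # At least one byte of ordinary data must accompany the descriptor.
        socket.send_fds(sender, [b"d"], [dfd])
        _data, fds, _flags, _addr = socket.recv_fds(receiver, 1, 1)
        return fds[0]
    finally:
        os.close(dfd)
        sender.close()
        receiver.close()
```

The received descriptor can then be passed to fchdir or used as the base for relative lookups.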

2004-06-30 18:15:46

by Ram Pai

Subject: Re: per-process namespace?

On Wed, 2004-06-30 at 06:15, Mike Waychison wrote:
> Al Viro wrote:
> > On Tue, Jun 29, 2004 at 05:10:21PM -0400, Mike Waychison wrote:
> >
> >>Another caveat is that the current system disallows you from doing any
> >>mount/umount's in another namespace (bogus security?).
> >
> >
> > Nothing bogus here - namespace boundary _IS_ a trust boundary and that's
> > exactly the difference between symlinks and bindings - symlink attacks
> > are possible exactly because they allow you to modify visible tree
> > topology for other users.
>
> Yet being able to access another namespace via a directory fd still breaks
> that boundary. I'm not sure if it's a feature or not. If it is a
> feature, I'd argue that having the fd means you are trusted to play in
> that namespace, which implies the right to do things like call mount(2)
> in it.
>
> >
> > Note that sharing parts of namespace (which is basically what automounter
> > wants and what we do not have yet) is deliberate act of trust - same as
> > having a part of your address space shared with other process.
>
> Namespace sharing has been touched on before, but hasn't been discussed
> publicly. My take on it is that in order to have namespace sharing work
> in some semantically sane way, we need to be able to identify
> owner-namespaces for shared branches of the vfsmount tree. This implies
> making namespaces first-class primitives. Is this where we want to go
> with this?
>
> I see automounting as the only consumer of such a beast; are
> there other possible users?

We have a customer requirement to tailor the namespace of each process
according to some environmental attributes. At the same time, the
customer wants each process to see the system mounts everywhere except
under some predefined local directory tree.

The per-process namespace concept comes in handy here except for the
static nature of the namespace. That is, changes to the system
namespace are not reflected in the child namespaces.

Perhaps the way to implement this is to
make some mount points maskable.

A newly forked namespace would see all the mounts in the parent
namespace, except the mounts on mount points that are
masked.
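
A toy model of the masking proposal above, purely to illustrate the intended semantics; it is not kernel code and no such interface is confirmed by this thread. Mount tables are modelled as dicts from mount point to source, and treating everything below a masked mount point as masked too is an assumption. All names are hypothetical.

```python
def visible_mounts(parent, masked):
    """Return the parent's mounts minus any at or below a masked mount point."""
    def is_masked(mount_point):
        return any(
            mount_point == m or mount_point.startswith(m.rstrip("/") + "/")
            for m in masked
        )
    return {mp: src for mp, src in parent.items() if not is_masked(mp)}


parent = {
    "/": "rootfs",
    "/home": "nfs:/home",
    "/local": "dev1",
    "/local/scratch": "tmpfs",
}
child_view = visible_mounts(parent, {"/local"})
```

Because the view is computed from the parent's table on demand, a mount added to the parent later shows up the next time the view is computed, unless it falls under a masked mount point.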

Eager to hear how Al Viro envisions this working,
RP

2004-07-01 00:15:36

by Serge E. Hallyn

Subject: Re: per-process namespace?

> The per-process namespace concept comes in handy here except for the
> static nature of the namespace. That is, changes to the system
> namespace are not reflected in the child namespaces.

Static?

It's not static! It's private, as advertised.

It sounds like you're asking (or your customer is asking) for
copy-on-write namespaces :)

2004-07-01 01:33:06

by Ram Pai

Subject: Re: per-process namespace?

On Wed, 2004-06-30 at 17:14, Serge E. Hallyn wrote:
> > The per-process namespace concept comes in handy here except for the
> > static nature of the namespace. That is, changes to the system
> > namespace are not reflected in the child namespaces.
>
> Static?
>
> It's not static! It's private, as advertised.

> It sounds like you're asking (or your customer is asking) for
> copy-on-write namespaces :)
>
Yes! Exactly right. A copy-on-write namespace. That way we don't even
have to invent new interfaces to define maskable mount-points that I
mentioned earlier.
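
One reading of the copy-on-write idea as used in this exchange can be modelled as an overlay: the child keeps a private layer for its own changes and falls through to the live system table for everything else. The sketch below uses Python's collections.ChainMap as a stand-in. It is an illustration of the semantics only, not a description of any kernel implementation, and the names are hypothetical.

```python
from collections import ChainMap


def fork_namespace(parent_mounts):
    """Return a child view: a private layer over a live view of the parent."""
    return ChainMap({}, parent_mounts)


system = {"/": "rootfs", "/home": "nfs:/home"}
child = fork_namespace(system)

child["/local"] = "tmpfs"            # lands in the child's private layer only
child["/home"] = "private-home"      # overrides the system entry for the child
system["/mnt/cdrom"] = "/dev/cdrom"  # later system change, visible to the child
```

Writes through the child never touch the system table, while lookups the child has not overridden reflect the system table as it currently stands.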