2005-10-02 22:08:50

by David Leimbach

Subject: /etc/mtab and per-process namespaces

I've been just playing around with the v9fs work and private
namespaces from yesterday's [October 1, 2005] top of tree from Linus'
git archive and I was looking at /etc/mtab's reaction to having
multiple namespaces with bind mounts.

I have a directory ("slash") and bind it to "/" in a new namespace
created with a clone call. /etc/mtab then has a line appended to it
for this bind.

/ /home/dave/slash none rw,bind 0 0

Then in the "global default" namespace I do the same thing. This adds
yet another line to my /etc/mtab with exactly the same contents.

I then exited both shells and /etc/mtab is left with both lines:

/ /home/dave/slash none rw,bind 0 0
/ /home/dave/slash none rw,bind 0 0

Did I just "leak" a namespace or is mtab just way off from reality now?

Also, if I check the procfs it seems to have no record of either of
these binds ever occurring. [/proc/mounts]
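
One way to measure how far /etc/mtab has drifted is to compare its mount points with the kernel's table. The following is a minimal Python sketch, assuming both inputs use the six-field fstab-style layout shown above. The function name `stale_mtab_entries` is hypothetical, and the comparison looks only at the mount point (second field).

```python
def stale_mtab_entries(mtab_text, proc_mounts_text):
    """Return mtab lines whose mount point is absent from the kernel's table."""
    def mount_points(text):
        points = set()
        for line in text.splitlines():
            fields = line.split()
            if len(fields) >= 2:
                points.add(fields[1])  # second field is the mount point
        return points

    live = mount_points(proc_mounts_text)
    stale = []
    for line in mtab_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] not in live:
            stale.append(line)
    return stale

# Possible use: pass in the contents of /etc/mtab and /proc/self/mounts.
```

Because each process may see a different /proc/self/mounts, the result depends on which namespace the caller runs in.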

Also, does it make sense to even think about adding a pid column
for the "private" namespaces, or perhaps just marking them as
"priv" or something to that effect?

It's not clear to me what the best way to deal with this would be.
Right now it appears to be broken or at least very inconsistent.

I can think of a lot of ways to use private namespaces to avoid
conflicts with software installations... much like the DragonFlyBSD
folks use variant symlinks to have two-level data about what version
of libx libx.so links to. I just think private namespaces make this a
bit more elegant to handle, especially in the context of Xen and
upcoming virtualization hardware from Intel and AMD. As such I find
this feature of Linux [and Plan 9/Inferno - where it came from] to be
pretty important.


- Dave


2005-10-04 19:14:49

by David Leimbach

Subject: Re: /etc/mtab and per-process namespaces

Hmm no responses on this thread a couple days now. I guess:

1) No one cares about private namespaces or the fact that they make
/etc/mtab totally inconsistent.
2) Private Namespaces aren't important to anyone and will never be
robust unless someone who cares, like me, takes it over somehow.
3) Everyone is busy with their own shit and doesn't want to deal with
me or mine right now.

I'm seriously hoping it's 3 :). 2 is acceptable too of course. I
think this is important and I want to know more about the innards
anyway. 1 would make me sad as I think Linux can really show other
Unixes what-for here when it comes to showing off how good the VFS can
be.

Linux has always been a bit of DIY, so I guess I just need to accept
that. It's not unlike the KDE development model. People who want
certain things done either motivate others to help or make a run for
it on their own, even in the face of adversity. Kind of more noble
that way I guess.

Dave

2005-10-04 19:18:22

by Christoph Hellwig

Subject: Re: /etc/mtab and per-process namespaces

I suspect no one cares about /etc/mtab. It's a pretty horrible
interface. Use /proc/self/mounts if you care about the mount table
for your current namespace; it's guaranteed to be up to date.
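
Reading that table from a program takes only a few lines. The following Python sketch parses fstab-style text such as the contents of /proc/self/mounts. The names `parse_mounts` and `unescape` are hypothetical, and the sketch assumes whitespace in paths is encoded as three-digit octal escapes (for example `\040` for a space), as in getmntent-style files.

```python
import re

_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")

def unescape(field):
    """Decode three-digit octal escapes (for example backslash-040 for a space)."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), field)

def parse_mounts(text):
    """Parse fstab-style mount lines into a list of dictionaries."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        entries.append({
            "source": unescape(fields[0]),
            "target": unescape(fields[1]),
            "fstype": fields[2],
            "options": fields[3].split(","),
        })
    return entries

# Possible use: parse_mounts(open("/proc/self/mounts").read())
```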

2005-10-04 19:43:03

by Al Viro

Subject: Re: /etc/mtab and per-process namespaces

On Tue, Oct 04, 2005 at 12:14:47PM -0700, David Leimbach wrote:
> Hmm no responses on this thread a couple days now. I guess:
>
> 1) No one cares about private namespaces or the fact that they make
> /etc/mtab totally inconsistent.
> 2) Private Namespaces aren't important to anyone and will never be
> robust unless someone who cares, like me, takes it over somehow.
> 3) Everyone is busy with their own shit and doesn't want to deal with
> me or mine right now.

4) If you insist on having /etc/mtab the same file in all namespaces,
you obviously will have its contents not matching at least some
of them. Either have it separate in each namespace where you want
to see it, or simply use /proc/self/mounts instead.

BTW, "private" is an odd term - they are all on the same footing; "system"
one is just the namespace of init (and those of its descendants that share
the namespace with it). Nothing special about it...

2005-10-04 19:48:58

by Michael Tokarev

Subject: Re: /etc/mtab and per-process namespaces

Christoph Hellwig wrote:
> I suspect not one cares about /etc/mtab. It's a pretty horrible
> interface. Use /proc/self/mounts if your care about the mount table
> for your current namespace, it's guranteed uptodate.

Well, it's up to date, but it isn't the same as mtab. Like:

/tmp/test on /mnt/test type ext2 (rw,loop=/dev/loop/0)
(mtab), vs
/dev/loop/0 /mnt/test ext2 rw 0 0

or:

tmpfs on /dev type tmpfs (rw,size=10M,mode=0755)
vs
tmpfs /dev tmpfs rw 0 0

i.e., sometimes the mtab format is more useful. Also, with the
loop-device example above, the mtab entry lets umount delete the
loop device for loop mounts.
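
The two layouts compared above differ mostly in presentation. The following Python sketch converts one fstab-style line into the "X on Y type Z (options)" form. The function name `to_mount_style` is hypothetical, and the sketch cannot recover options such as `loop=` or `size=` that the fstab-style line does not carry.

```python
def to_mount_style(line):
    """Render an fstab-style line as 'source on target type fstype (options)'."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected at least source, target and fstype")
    source, target, fstype = fields[:3]
    # A missing options field is shown as an empty pair of parentheses.
    options = fields[3] if len(fields) > 3 else ""
    return f"{source} on {target} type {fstype} ({options})"
```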

Another funky example:

losetup /dev/loop/0 /tmp/test
cd /dev/loop
mount 0 /mnt/test

now, mtab shows:

/dev/loop/0 /mnt/test ext2 rw 0 0

while /proc/mounts shows

0 /mnt/test ext2 rw 0 0

which is rather useless.

/mjt

2005-10-04 19:52:47

by David Leimbach

Subject: Re: /etc/mtab and per-process namespaces

On 10/4/05, Christoph Hellwig wrote:
> I suspect not one cares about /etc/mtab. It's a pretty horrible
> interface. Use /proc/self/mounts if your care about the mount table
> for your current namespace, it's guranteed uptodate.
>
>

Hmmm that works pretty well, but it's lacking in some ways.

/dev/hdc3 /root/slash reiserfs rw 0 0
/dev/hdc3 /home/dave/blah reiserfs rw 0 0

The above is not very descriptive.

/etc/mtab has:
/ /root/slash none rw,bind 0 0
/home/dave/public_html /home/dave/blah none rw,bind 0 0

Which tells me more about what I care about for 'bind' mounts.

However it does violate the "privacy" of the namespace by telling
everyone on the system how I have my stuff mounted :).

/proc/self/mounts does a much better job respecting this privacy but
doesn't give the information I really care about.

I think I'm looking for something like "ns" on Plan 9 or Inferno that
dumps out how my current namespace is constructed. Each process with
a private namespace should get different results for "ns".

Dave

2005-10-04 20:07:51

by David Leimbach

Subject: Re: /etc/mtab and per-process namespaces

On 10/4/05, Al Viro wrote:
> On Tue, Oct 04, 2005 at 12:14:47PM -0700, David Leimbach wrote:
> > Hmm no responses on this thread a couple days now. I guess:
> >
> > 1) No one cares about private namespaces or the fact that they make
> > /etc/mtab totally inconsistent.
> > 2) Private Namespaces aren't important to anyone and will never be
> > robust unless someone who cares, like me, takes it over somehow.
> > 3) Everyone is busy with their own shit and doesn't want to deal with
> > me or mine right now.
>
> 4) If you insist on having /etc/mtab the same file in all namespaces,
> you obviously will have its contents not matching at least some
> of them. Either have it separate in each namespace where you want
> to see it, or simply use /proc/self/mounts instead.

Well I guess it's my fault to some extent with the subject line. I
don't really care about /etc/mtab so much except that I'd like it to
be consistent if it is going to be there. I'd rather it do one of two
things. Show me my current process's namespace accurately or just the
stuff that's global to all namespaces. Right now it's kind of in
between.

Also since when I type "mount" it just spits out what's in "mtab" it
seems like that should be made more accurate... not /proc/self/mounts.

(it looks like you can just edit the file and stick whatever you want
in there... I just stuck the line:
"blah blah blah"
in there and got:
"blah on blah type blah ()"
from "mount" with no arguments)

>
> BTW, "private" is an odd term - they are all on the same footing; "system"
> one is just the namespace of init (and those of its descendents that share
> the namespace with it). Nothing special about it...
>

It's actually a bit more like "protected" I suppose [in an OO
inheritance sense]. If I use clone with the right flags my new
process has a namespace that doesn't get reflected in the other
process's. Sure I still inherit the parent's namespace, but I'm free
to bind to my heart's content in ways that other processes will not see
[unless they are children of the currently clone'd process].

Even in this "protected" namespace other processes can clearly see
what I'm doing via /etc/mtab. It seems there will always be a way to
sniff out this information though since all the files of the form
/proc/<pid>/mounts are read accessible by everyone.

Dave

2005-10-04 20:20:49

by Al Viro

Subject: Re: /etc/mtab and per-process namespaces

On Tue, Oct 04, 2005 at 01:07:48PM -0700, David Leimbach wrote:
> On 10/4/05, Al Viro wrote:
> > On Tue, Oct 04, 2005 at 12:14:47PM -0700, David Leimbach wrote:
> > > Hmm no responses on this thread a couple days now. I guess:
> > >
> > > 1) No one cares about private namespaces or the fact that they make
> > > /etc/mtab totally inconsistent.
> > > 2) Private Namespaces aren't important to anyone and will never be
> > > robust unless someone who cares, like me, takes it over somehow.
> > > 3) Everyone is busy with their own shit and doesn't want to deal with
> > > me or mine right now.
> >
> > 4) If you insist on having /etc/mtab the same file in all namespaces,
> > you obviously will have its contents not matching at least some
> > of them. Either have it separate in each namespace where you want
> > to see it, or simply use /proc/self/mounts instead.
>
> Well I guess it's my fault to some extent with the subject line. I
> don't really care about /etc/mtab so much except that I'd like it to
> be consistent if it is going to be there. I'd rather it do one of two
> things. Show me my current process's namespace accurately or just the
> stuff that's global to all namespaces. Right now it's kind of in
> between.

/etc/mtab is just a regular file; no more, no less. It's a place used by
mount(8) and several other programs. Kernel has nothing to do with it...

Obns: that can get tough. Note that Plan 9 one is an approximation that
works well enough for most uses; if you play with mounting/unmounting/renaming
in sufficiently perverted ways, you'll get unusable /proc/<pid>/ns. The
trouble being, they are luckier - they don't have to deal with many classes
of perversion we do, so their solution wouldn't work well for Linux.

2005-10-04 21:20:29

by Bodo Eggert

Subject: Re: /etc/mtab and per-process namespaces

Al Viro wrote:

> 4) If you insist on having /etc/mtab the same file in all namespaces,
> you obviously will have its contents not matching at least some
> of them. Either have it separate in each namespace where you want
> to see it, or simply use /proc/self/mounts instead.

So /proc/mounts should be a symlink to /proc/self/mounts?

--
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

2005-10-05 00:14:43

by Al Viro

Subject: Re: /etc/mtab and per-process namespaces

On Tue, Oct 04, 2005 at 11:20:12PM +0200, Bodo Eggert wrote:
> Al Viro wrote:
>
> > 4) If you insist on having /etc/mtab the same file in all namespaces,
> > you obviously will have its contents not matching at least some
> > of them. Either have it separate in each namespace where you want
> > to see it, or simply use /proc/self/mounts instead.
>
> So /proc/mounts should be a symlink to /proc/self/mounts?

; ls -l /proc/mounts
lrwxrwxrwx 1 root root 11 Oct 4 20:13 /proc/mounts -> self/mounts
;

like that, perhaps? IOW, it's been done that way for almost 4 years
already...
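
The same check can be made from a program. The following Python sketch reports where a symlink points; the helper name `symlink_target` is hypothetical. On a system like the one in the listing above, calling it on /proc/mounts would be expected to give the same target that ls shows.

```python
import os

def symlink_target(path):
    """Return the target of a symlink, or None if path is not a symlink."""
    try:
        return os.readlink(path)
    except OSError:
        # Raised when the path is missing or is not a symbolic link.
        return None

# Possible use: symlink_target("/proc/mounts")
```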

2005-10-05 16:30:20

by Ram Pai

Subject: Re: /etc/mtab and per-process namespaces

On Tue, Oct 04, 2005 at 12:14:47PM -0700, David Leimbach wrote:
> Hmm no responses on this thread a couple days now. I guess:
>
> 1) No one cares about private namespaces or the fact that they make
> /etc/mtab totally inconsistent.
> 2) Private Namespaces aren't important to anyone and will never be
> robust unless someone who cares, like me, takes it over somehow.
> 3) Everyone is busy with their own shit and doesn't want to deal with
> me or mine right now.
>
> I'm seriously hoping it's 3 :). 2 Is acceptable too of course. I
> think this is important and I want to know more about the innards
> anyway. 1 would make me sad as I think Linux can really show other
> Unix's what-for here when it comes to showing off how good the VFS can
> be.

This becomes even more interesting when shared subtrees get added to
the equation. One would like to know all the mounts in one's namespace
and then all the mounts they propagate to, which could include mounts in
other namespaces too.

I guess some interface that meets the following needs would eventually
be needed:

1. What are all the mounts in my namespace?
   A. What are the attributes of each of the mounts?
      a. where is it mounted
      b. who is its parent
      c. what is it mounted from
      d. what are the attributes of its mount
      e. what are its peer mounts (I suspect some kind of
         identifier has to be associated with each mount)
      f. if it has a master mount, where is it
      g. what are its slave mounts
      (note: e, f, g can point to mounts in other namespaces)
2. What are the attributes of my namespace?
   a. what is the parent namespace? (I suspect some kind of
      identifier has to be associated with each namespace;
      pid of the cloned process?)
   b. what are my child namespaces?
3. Which processes can access my namespace?
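
The questions above can be read as a data model. The following Python sketch shows one possible shape for it. Every class, field and function name here is hypothetical and does not describe any existing kernel interface; integer identifiers stand in for the per-mount and per-namespace identifiers the list suspects would be needed.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MountRecord:
    mount_id: int                    # identifier associated with each mount
    parent_id: Optional[int]         # who is its parent (None for the root)
    mount_point: str                 # where is it mounted
    source: str                      # what is it mounted from
    options: List[str]               # attributes of the mount
    peer_ids: List[int] = field(default_factory=list)   # peer mounts
    master_id: Optional[int] = None                      # master mount, if any
    slave_ids: List[int] = field(default_factory=list)  # slave mounts

@dataclass
class NamespaceRecord:
    namespace_id: int
    parent_namespace_id: Optional[int]
    child_namespace_ids: List[int] = field(default_factory=list)
    mounts: List[MountRecord] = field(default_factory=list)
    process_ids: List[int] = field(default_factory=list)  # who can access it

def children_of(namespace, mount_id):
    """Return the mounts in a namespace whose parent is mount_id."""
    return [m for m in namespace.mounts if m.parent_id == mount_id]
```

Peer, master and slave references are kept as plain identifiers because, as noted in the list, they can point at mounts in other namespaces.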


And I don't think /etc/mtab can do a decent job with this, because it
would not know where all the mounts propagate when it attempts a mount.
Only the kernel would know, and hence all the commands that depend on
/etc/mtab may have to depend on some /proc or maybe sysfs interface to
do a decent job.

RP

>
> Linux has always been a bit of DIY, so I guess I just need to accept
> that. It's not unlike the KDE development model. People who want
> certain things done either motivate others to help or make a run for
> it on their own, even in the face of adversity. Kind of more noble
> that way I guess.
>
> Dave

2005-10-14 02:11:11

by Mike Waychison

Subject: Re: /etc/mtab and per-process namespaces

Ram wrote:
> On Tue, Oct 04, 2005 at 12:14:47PM -0700, David Leimbach wrote:
>
>>Hmm no responses on this thread a couple days now. I guess:
>>
>>1) No one cares about private namespaces or the fact that they make
>>/etc/mtab totally inconsistent.
>>2) Private Namespaces aren't important to anyone and will never be
>>robust unless someone who cares, like me, takes it over somehow.
>>3) Everyone is busy with their own shit and doesn't want to deal with
>>me or mine right now.
>>
>>I'm seriously hoping it's 3 :). 2 Is acceptable too of course. I
>>think this is important and I want to know more about the innards
>>anyway. 1 would make me sad as I think Linux can really show other
>>Unix's what-for here when it comes to showing off how good the VFS can
>>be.
>
>
> This becomes even more intresting when sharedsubtree gets added to
> the equation. One would like to know all the mounts in its namesapace
> and than all the mounts it propagates to which could include mounts in
> other namespaces too..
>
> I guess some interface that meets the following needs would eventually
> be needed:
>
> 1. what are all the mounts in my namespace ?
> A. what are the attributes of each of the mounts?
> a. where is it mounted
> b. who is its parent
> c. what is it mounted from
> d. what are the attributes of its mount
> e. what are its peer mounts (I suspect some kind
> of identifier has
> to be associated with each mount)
> f. if it has a master mount where is it
> g. what are its slave mounts.at
> (note: e, f, g can point to mounts in other namespaces)
> 2. what are the attributes of my namespace?
> a. what is the parent namespace? ( I suspect some kind
> of identifier has to associated
> with each namespace, pid of the cloned
> process?)
> b. what are my children namespace?
>
> 3. which processes can access my namespace?
>
>
> And I don't think /etc/mtab can do a decent job with this, because it
> would not know where all the mounts propagate, when it attempts a mount.
> Only the kernel would know, and hence all the commands who depend on
> /etc/mtab may have to depend on some /proc or maybe /sysfs interface to
> do a descent job.
>

Or, you bite the bullet and fix /proc/mounts and let distributions bind
mount /proc/mounts over /etc/mtab.

Sun recognized this as a problem a long time ago and /etc/mnttab has
been magic for quite some time now.

Add to this the fact that a text-file /etc/mtab is busted because it's
whitespace-separated, and pieces blow up when you do things like:

mount filer:/export/mikew "/home/Mike Waychison"
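
The breakage is easy to reproduce: a naive split of such an entry yields seven fields instead of six. The following Python sketch shows one common remedy, octal-escaping whitespace before writing a field, which is the convention getmntent-style files use. The helper names `escape_field` and `format_entry` are hypothetical.

```python
def escape_field(value):
    """Escape space, tab, newline and backslash as three-digit octal codes."""
    out = []
    for ch in value:
        if ch in " \t\n\\":
            out.append("\\%03o" % ord(ch))  # a space becomes backslash-040
        else:
            out.append(ch)
    return "".join(out)

def format_entry(source, target, fstype, options):
    """Build one whitespace-separated mount-table line with escaped paths."""
    fields = [escape_field(source), escape_field(target), fstype, options, "0", "0"]
    return " ".join(fields)
```

With escaping in place the line splits back into exactly six fields, at the cost that every reader has to decode the escapes.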

Mike Waychison

2005-10-17 00:47:22

by Ian Kent

Subject: Re: /etc/mtab and per-process namespaces

On Thu, 13 Oct 2005, Mike Waychison wrote:

> Ram wrote:
> > On Tue, Oct 04, 2005 at 12:14:47PM -0700, David Leimbach wrote:
> >
> >>Hmm no responses on this thread a couple days now. I guess:
> >>
> >>1) No one cares about private namespaces or the fact that they make
> >>/etc/mtab totally inconsistent.
> >>2) Private Namespaces aren't important to anyone and will never be
> >>robust unless someone who cares, like me, takes it over somehow.
> >>3) Everyone is busy with their own shit and doesn't want to deal with
> >>me or mine right now.
> >>
> >>I'm seriously hoping it's 3 :). 2 Is acceptable too of course. I
> >>think this is important and I want to know more about the innards
> >>anyway. 1 would make me sad as I think Linux can really show other
> >>Unix's what-for here when it comes to showing off how good the VFS can
> >>be.
> >
> >
> > This becomes even more intresting when sharedsubtree gets added to
> > the equation. One would like to know all the mounts in its namesapace
> > and than all the mounts it propagates to which could include mounts in
> > other namespaces too..
> >
> > I guess some interface that meets the following needs would eventually
> > be needed:
> >
> > 1. what are all the mounts in my namespace ?
> > A. what are the attributes of each of the mounts?
> > a. where is it mounted
> > b. who is its parent
> > c. what is it mounted from
> > d. what are the attributes of its mount
> > e. what are its peer mounts (I suspect some kind
> > of identifier has
> > to be associated with each mount)
> > f. if it has a master mount where is it
> > g. what are its slave mounts.at
> > (note: e, f, g can point to mounts in other namespaces)
> > 2. what are the attributes of my namespace?
> > a. what is the parent namespace? ( I suspect some kind
> > of identifier has to associated
> > with each namespace, pid of the cloned
> > process?)
> > b. what are my children namespace?
> >
> > 3. which processes can access my namespace?
> >
> >
> > And I don't think /etc/mtab can do a decent job with this, because it
> > would not know where all the mounts propagate, when it attempts a mount.
> > Only the kernel would know, and hence all the commands who depend on
> > /etc/mtab may have to depend on some /proc or maybe /sysfs interface to
> > do a descent job.
> >
>
> Or, you bite the bullet and fix /proc/mounts and let distributions bind
> mount /proc/mounts over /etc/mtab.
>
> Sun recognized this as a problem a long time ago and /etc/mnttab has
> been magic for quite some time now.

Don't forget to update mount as well.

Ian



2005-10-22 13:24:04

by Dr. Greg Wettstein

Subject: Re: /etc/mtab and per-process namespaces

On Oct 13, 7:10pm, Mike Waychison wrote:
} Subject: Re: /etc/mtab and per-process namespaces

Good morning to everyone, really behind on e-mail, my apologies for
joining the thread late.

> Ram wrote:
> > On Tue, Oct 04, 2005 at 12:14:47PM -0700, David Leimbach wrote:
> >
> >>Hmm no responses on this thread a couple days now. I guess:
> >>
> >>1) No one cares about private namespaces or the fact that they make
> >>/etc/mtab totally inconsistent.
> >>2) Private Namespaces aren't important to anyone and will never be
> >>robust unless someone who cares, like me, takes it over somehow.
> >>3) Everyone is busy with their own shit and doesn't want to deal with
> >>me or mine right now.
> >>
> >>I'm seriously hoping it's 3 :). 2 Is acceptable too of course. I
> >>think this is important and I want to know more about the innards
> >>anyway. 1 would make me sad as I think Linux can really show other
> >>Unix's what-for here when it comes to showing off how good the VFS can
> >>be.

> Or, you bite the bullet and fix /proc/mounts and let distributions bind
> mount /proc/mounts over /etc/mtab.
>
> Sun recognized this as a problem a long time ago and /etc/mnttab has
> been magic for quite some time now.
>
> Add to this the fact that a textfile /etc/mtab is busted because it's
> whitespace seperated and pieces blows up and you do things like:
>
> mount filer:/export/mikew "/home/Mike Waychison"

As to the three options above, I believe number 3 would be operative.
Private namespaces are an extremely useful concept, and we are growing
increasingly dependent on them for systems management and
administration. I believe the issue is a chicken/egg problem: without
an update in tools, the concept of namespaces is less approachable
than it should be.

Mike's comments are very apt. The current situation with mount
support is untenable. Even working on private development machines it
gets confusing as to what is or is not mounted in various
shells/processes. The basic infrastructure is there with process
specific mount information (/proc/self/mounts) but mount and friends
are a bit problematic with respect to supporting this.

I'm working on a namespace toolkit to address these issues. I've got
a pretty basic tool, similar to sudo, which allows spawning processes
with a protected namespace. I'm adding a configuration system which
allows systems administrators to define a setup of bind mounts which
are automatically executed before the user is given their shell. I'm
also working up a PAM account module to go along with this. I would
certainly be open to suggestions as to what else people would consider
useful in such a toolkit.

I've been pondering the best way to take on the mount problem.
Current mount binaries seem to fall back to /proc/mounts if /etc/mtab
is not present. All bets are off of course if the mount binary is
used for the bind mount since a new /etc/mtab is created.
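
The fallback described above can be written down in a couple of lines. The following Python sketch mirrors the behavior as observed in this message and is not taken from the source of mount; the function name `pick_mount_table` and its default paths are assumptions.

```python
import os

def pick_mount_table(mtab="/etc/mtab", proc_mounts="/proc/mounts"):
    """Return the mount table to read, preferring mtab when it exists."""
    # os.path.exists follows symlinks, so a dangling /etc/mtab link counts
    # as missing and the /proc file is chosen instead.
    return mtab if os.path.exists(mtab) else proc_mounts
```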

I'm willing to whack on the mount binary a bit as part of this. The
obvious solution is to teach mount to act differently if it is running
in a private namespace. If anybody knows of a good way to detect this
I would be interested in knowing that. In newns (the namespace sudo
tool) I'm setting an environment variable for mount to detect on but a
system level approach would be more generic.

The other problem is the information exported in /proc/mounts. It
would seem problematic to modify its format but in order to serve as a
useful source of information for a modified mount binary it would need
to contain mount option information. Since this is definitely process
specific information it would seem to call for something in /proc
rather than /sysfs. Do we need a new pseudo-file?

I would certainly be interested in people's reflections on this. When
I get something a bit more shaken down I will roll up a preliminary
distribution and announce it.

> Mike Waychison

Best wishes for a pleasant weekend to everyone.

Greg

}-- End of excerpt from Mike Waychison

As always,
Dr. Greg 'GW' Wettstein
------------------------------------------------------------------------------
The Hurderos Project
Open Identity, Service and Authorization Management
http://www.hurderos.org

2005-10-22 17:01:26

by Rob Landley

Subject: Re: /etc/mtab and per-process namespaces

On Sunday 02 October 2005 17:08, David Leimbach wrote:
> I've been just playing around with the v9fs work and private
> namespaces from yesterday's [October 1, 2005] top of tree from Linus'
> git archive and I was looking at /etc/mtab's reaction to having
> multiple namespaces with bind mounts.

Oh you don't need namespaces to hork mtab. Do a mount from a chroot
environment. Or try to use --bind or --move mounts (at all) and watch it beg
for mercy. (I accidentally ran UserMode Linux as root once, using a hostfs
root filesystem to borrow the existing Linux's root filesystem, and its
mounts edited the parent system's /etc/mtab. Yeah, that was user error on my
part, but it's also the _only_ gotcha I've found when doing that.)

/etc/mtab is simply brittle. Personally, on systems I build, I
ln -s /proc/mounts /etc/mtab

Rob

2005-10-22 17:01:45

by Rob Landley

Subject: Re: /etc/mtab and per-process namespaces

On Sunday 16 October 2005 19:47, Ian Kent wrote:

> > Or, you bite the bullet and fix /proc/mounts and let distributions bind
> > mount /proc/mounts over /etc/mtab.
> >
> > Sun recognized this as a problem a long time ago and /etc/mnttab has
> > been magic for quite some time now.
>
> Don't forget to update mount as well.
>
> Ian

I'm the maintainer of the busybox mount command. We've had /etc/mtab support
be optional (you can configure it out) for a while now.

There was some fancy footwork trying to get umount to automatically free loop
devices and such, but as far as I know that's all resolved in subversion and
if we can ever get a 1.1 release out, it should all just work...

Rob

2005-10-22 17:26:47

by Bodo Eggert

Subject: Re: /etc/mtab and per-process namespaces

Dr. Greg Wettstein wrote:
> On Oct 13, 7:10pm, Mike Waychison wrote:
> } Subject: Re: /etc/mtab and per-process namespaces

> I've been pondering the best way to take on the mount problem.
> Current mount binaries seem to fall back to /proc/mounts if /etc/mtab
> is not present. All bets are off of course if the mount binary is
> used for the bind mount since a new /etc/mtab is created.
>
> I'm willing to whack on the mount binary a bit as part of this. The
> obvious solution is to teach mount to act differently if it is running
> in a private namespace. If anybody knows of a good way to detect this
> I would be interested in knowing that. In newns (the namespace sudo
> tool) I'm setting an environment variable for mount to detect on but a
> system level approach would be more generic.

- If named namespaces are to be implemented, you could check for a set
namespace ID. (You could also get rid of the persistent-namespace-daemon.)

- If secure user mounts are implemented, missing privileges will be a hint.
Privileged mounts will require a global configuration where a flag can
be set. (BTW: You'll also want to disable user mounts in global namespaces
to catch errors, abuse and exploits)

- Until any of these is implemented, users will run in a private namespace,
so UID != EUID will indicate a private namespace. Unfortunately this is
not secure, but:

- If the proc/mounts information is extended as described below, a different
behaviour wouldn't make sense anymore, would it?
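
The UID != EUID hint from the list above is simple to express. The following Python sketch does so under the same caveat given there: it is a heuristic and not secure. The function name `maybe_private_namespace` is hypothetical, and the explicit parameters exist only so the logic can be exercised without changing process credentials.

```python
import os

def maybe_private_namespace(uid=None, euid=None):
    """Heuristic only: treat a real/effective UID mismatch as a hint."""
    uid = os.getuid() if uid is None else uid
    euid = os.geteuid() if euid is None else euid
    return uid != euid
```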

> The other problem is the information exported in /proc/mounts. It
> would seem problematic to modify its format but in order to serve as a
> useful source of information for a modified mount binary it would need
> to contain mount option information. Since this is definitely process
> specific information it would seem to call for something in /proc
> rather than /sysfs. Do we need a new pseudo-file?

- The file format is broken for whitespace in filenames, so changing the
format in these cases by adding quoting won't actually break anything.

- The userspace options (loop device etc) can be encoded in the mount
options field, e.g.
'rw,mount=lo:"/home/Arthur Dent/iso":/dev/loop/42;ns=public,async'.

So if you _want_ to keep the format, you can IMHO do so. Maybe you can
think of something better, who knows? We'll need to upgrade the tools
to use non-broken semantics anyway, so as long as you keep the old file
around, this should be no real problem.
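
Any tool reading such an options field would need to split on commas while leaving quoted values intact. The following Python sketch does that for the example above. The function name `split_options` is hypothetical, and the quoting rule (double quotes, no escape sequences inside them) is an assumption, since the proposal does not spell one out.

```python
def split_options(options):
    """Split an options string on commas that are outside double quotes."""
    parts = []
    current = []
    in_quotes = False
    for ch in options:
        if ch == '"':
            in_quotes = not in_quotes  # toggle on every double quote
            current.append(ch)
        elif ch == "," and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts
```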

--
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

2005-10-29 00:07:26

by Ram Pai

Subject: Re: /etc/mtab and per-process namespaces

On Sat, 2005-10-22 at 06:23, Dr. Greg Wettstein wrote:
> On Oct 13, 7:10pm, Mike Waychison wrote:
> } Subject: Re: /etc/mtab and per-process namespaces
>
> Good morning to everyone, really behind on e-mail, my apologies for
> joining the thread late.
>
> > Ram wrote:
> > > On Tue, Oct 04, 2005 at 12:14:47PM -0700, David Leimbach wrote:
> > >
> > >>Hmm no responses on this thread a couple days now. I guess:
> > >>
> > >>1) No one cares about private namespaces or the fact that they make
> > >>/etc/mtab totally inconsistent.
> > >>2) Private Namespaces aren't important to anyone and will never be
> > >>robust unless someone who cares, like me, takes it over somehow.
> > >>3) Everyone is busy with their own shit and doesn't want to deal with
> > >>me or mine right now.
> > >>
> > >>I'm seriously hoping it's 3 :). 2 Is acceptable too of course. I
> > >>think this is important and I want to know more about the innards
> > >>anyway. 1 would make me sad as I think Linux can really show other
> > >>Unix's what-for here when it comes to showing off how good the VFS can
> > >>be.
>
> > Or, you bite the bullet and fix /proc/mounts and let distributions bind
> > mount /proc/mounts over /etc/mtab.
> >
> > Sun recognized this as a problem a long time ago and /etc/mnttab has
> > been magic for quite some time now.
> >
> > Add to this the fact that a textfile /etc/mtab is busted because it's
> > whitespace seperated and pieces blows up and you do things like:
> >
> > mount filer:/export/mikew "/home/Mike Waychison"
>
> As to the three options above, I believe number 3 would be operative.
> Private namespaces are extremely useful concepts, we are growing
> increasingly dependent on them for systems management and
> administration. I believe the issue is a chicken/egg problem, without
> an update in tools the concept of namespaces are less approachable
> than they should be.
>
> Mike's comments are very apt. The current situation with mount
> support is untenable. Even working on private development machines it
> gets confusing as to what is or is not mounted in various
> shells/processes. The basic infra-structure is there with process
> specific mount information (/proc/self/mounts) but mount and friends
> are a bit problematic with respect to supporting this.
>
> I'm working on a namespace toolkit to address these issues. I've got
> a pretty basic tool, similar to sudo, which allows spawning processes
> with a protected namespace. I'm adding a configuration system which
> allow systems administrators to define a setup of bind mounts which
> are automatically executed before the user is given their shell. I'm
> also working up a PAM account module to go along with this. I would
> certainly be open to suggestions as to what else people would consider
> useful in such a toolkit.
>
> I've been pondering the best way to take on the mount problem.
> Current mount binaries seem to fall back to /proc/mounts if /etc/mtab
> is not present. All bets are off of course if the mount binary is
> used for the bind mount since a new /etc/mtab is created.
>
> I'm willing to whack on the mount binary a bit as part of this. The
> obvious solution is to teach mount to act differently if it is running
> in a private namespace. If anybody knows of a good way to detect this
> I would be interested in knowing that. In newns (the namespace sudo
> tool) I'm setting an environment variable for mount to detect, but a
> system-level approach would be more generic.

Actually, there is a hackish way for a process to figure out whether it is
in a different namespace than the system namespace:

ls /proc/1/root

In the system namespace it will let you see the contents.
In a per-process namespace it will fail with "permission denied".

But I think we should figure out a cleaner way to decipher this,
and that would start with clearly defining the requirements, I think.
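
A minimal sketch of that probe, assuming Python and a hypothetical helper
named probe_root (not an existing tool). It only reports what the listing
attempt did. An unprivileged process is likely to get "permission denied"
for unrelated reasons, so the result probably only means something when
run as root.

```python
import os

def probe_root(path="/proc/1/root"):
    """Try to list `path` and report what happened.

    Returns "visible" if the directory could be listed, "denied" on a
    permission error, and "unknown" for any other failure (for example
    when /proc is not mounted).
    """
    try:
        os.listdir(path)
    except PermissionError:
        return "denied"
    except OSError:
        return "unknown"
    return "visible"
```

Going by the behaviour described above, "denied" would suggest a private
namespace and "visible" the system namespace.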

RP



2005-10-29 10:17:43

by Rob Landley

Subject: Re: /etc/mtab and per-process namespaces

On Friday 28 October 2005 19:06, Ram Pai wrote:

> > Mike's comments are very apt. The current situation with mount
> > support is untenable. Even working on private development machines it
> > gets confusing as to what is or is not mounted in various
> > shells/processes. The basic infra-structure is there with process
> > specific mount information (/proc/self/mounts) but mount and friends
> > are a bit problematic with respect to supporting this.

I fairly extensively rewrote busybox mount, and one of my goals was doing the
best job with /proc/mounts (only) support that I could. In some ways,
busybox's mount is better (such as the fact it can autodetect when you're
trying to mount a file and figure out it needs -o loop without being told).

If you want to try the busybox version of mount/losetup/umount, I hope it
does what you want, and I am willing to fix it if it doesn't. (P.S. To
use /proc/mounts, either configure it without /etc/mtab support or
symlink /etc/mtab to /proc/mounts.)
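
Reading /proc/mounts also sidesteps the whitespace problem raised earlier
in the thread, because the kernel writes a space inside a field as the
octal escape \040 (likewise tab, newline and backslash). A minimal sketch
of a parser, assuming Python; parse_mounts and unescape are hypothetical
names and are not part of busybox:

```python
import re

# Matches a backslash followed by exactly three octal digits, e.g. \040.
_OCTAL = re.compile(r"\\([0-7]{3})")

def unescape(field):
    """Turn octal escapes such as \\040 back into the characters they encode."""
    return _OCTAL.sub(lambda m: chr(int(m.group(1), 8)), field)

def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mountpoint, fstype, options)."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        entries.append(tuple(unescape(p) for p in parts[:4]))
    return entries
```

Splitting on whitespace is safe here only because the escaping guarantees
that no field contains a literal space.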

> > I'm working on a namespace toolkit to address these issues. I've got
> > a pretty basic tool, similar to sudo, which allows spawning processes
> > with a protected namespace. I'm adding a configuration system which
> > allow systems administrators to define a setup of bind mounts which
> > are automatically executed before the user is given their shell. I'm
> > also working up a PAM account module to go along with this. I would
> > certainly be open to suggestions as to what else people would consider
> > useful in such a toolkit.
> >
> > I've been pondering the best way to take on the mount problem.
> > Current mount binaries seem to fall back to /proc/mounts if /etc/mtab
> > is not present. All bets are off of course if the mount binary is
> > used for the bind mount since a new /etc/mtab is created.

Have you tried having /etc/mtab be a symlink to /proc/mounts?

> > I'm willing to whack on the mount binary a bit as part of this. The
> > obvious solution is to teach mount to act differently if it is running
> > in a private namespace. If anybody knows of a good way to detect this
> > I would be interested in knowing that. In newns (the namespace sudo
> > tool) I'm setting an environment variable for mount to detect on but a
> > system level approach would be more generic.
>
> actually there is a hackish way for a process to figure out if it is in
> a different namespace than the system namespace.
>
> ls /proc/1/root
>
> in a system namespace it will allow you to see the content.
> And in a per-process-namespace it will fail with permission denied.
>
> But I think we should figure out a cleaner way to decipher this,
> and that would start with clearly defining the requirements, I think.

The big thing I've never figured out how to do is make umount -a work in the
presence of multiple namespaces. (Should it just umount what it sees? I
don't know how to umount everything because I can't find everything...)

> RP

Rob

2005-10-31 19:11:42

by Ram Pai

Subject: Re: /etc/mtab and per-process namespaces

On Sat, 2005-10-29 at 03:16, Rob Landley wrote:
> On Friday 28 October 2005 19:06, Ram Pai wrote:
>
> > > Mike's comments are very apt. The current situation with mount
> > > support is untenable. Even working on private development machines it
> > > gets confusing as to what is or is not mounted in various
> > > shells/processes. The basic infra-structure is there with process
> > > specific mount information (/proc/self/mounts) but mount and friends
> > > are a bit problematic with respect to supporting this.
>
> I fairly extensively rewrote busybox mount, and one of my goals was doing the
> best job with /proc/mounts (only) support that I could. In some ways,
> busybox's mount is better (such as the fact it can autodetect when you're
> trying to mount a file and figure out it needs -o loop without being told).
>
> If you want try the busybox version of mount/losetup/umount, I hope it does
> what you want and am willing to fix it if it doesn't. (P.S. To
> use /proc/mounts either configure it without /etc/mtab support or
> symlink /etc/mtab to /proc/mounts.)
>
> > > I'm working on a namespace toolkit to address these issues. I've got
> > > a pretty basic tool, similar to sudo, which allows spawning processes
> > > with a protected namespace. I'm adding a configuration system which
> > > allow systems administrators to define a setup of bind mounts which
> > > are automatically executed before the user is given their shell. I'm
> > > also working up a PAM account module to go along with this. I would
> > > certainly be open to suggestions as to what else people would consider
> > > useful in such a toolkit.
> > >
> > > I've been pondering the best way to take on the mount problem.
> > > Current mount binaries seem to fall back to /proc/mounts if /etc/mtab
> > > is not present. All bets are off of course if the mount binary is
> > > used for the bind mount since a new /etc/mtab is created.
>
> Have you tried having /etc/mtab be a symlink to /proc/mounts?
>
> > > I'm willing to whack on the mount binary a bit as part of this. The
> > > obvious solution is to teach mount to act differently if it is running
> > > in a private namespace. If anybody knows of a good way to detect this
> > > I would be interested in knowing that. In newns (the namespace sudo
> > > tool) I'm setting an environment variable for mount to detect on but a
> > > system level approach would be more generic.
> >
> > actually there is a hackish way for a process to figure out if it is in
> > a different namespace than the system namespace.
> >
> > ls /proc/1/root
> >
> > in a system namespace it will allow you to see the content.
> > And in a per-process-namespace it will fail with permission denied.
> >
> > But I think we should figure out a cleaner way to decipher this,
> > and that would start with clearly defining the requirements, I think.
>
> The big thing I've never figured out how to do is make umount -a work in the
> presence of multiple namespaces. (Should it just umount what it sees? I
> don't know how to umount everything because I can't find everything...)

Yes, you won't find everything, since some of the mounts are in different
namespaces. Instead, unmount whatever you see. Or use /proc/mounts
to unmount whatever is there in the caller's own namespace.

RP


2005-10-31 23:27:56

by Rob Landley

Subject: Re: /etc/mtab and per-process namespaces

On Monday 31 October 2005 13:11, Ram Pai wrote:
> > The big thing I've never figured out how to do is make umount -a work in
> > the presence of multiple namespaces. (Should it just umount what it
> > sees? I don't know how to umount everything because I can't find
> > everything...)
>
> Yes you won't find everything, since some of them are in a different
> namespaces. Instead unmount whatever you see. Or use /proc/mounts
> to unmount whatever is there in its namespace.

But /proc/mounts is a symlink to self/mounts, and self is a symlink to $PID,
so after burrowing through the symlinks you wind up looking
at /proc/$PID/mounts.

My concern is that if I have init, as root, try to perform a umount -a, it
_still_ won't get the mounts belonging to child processes with a separate
namespace. There's no "global view" of mounts available anywhere.

On the other hand, if we fork a child process with its own namespace, the
child performs a private mount, and then we kill that child process, does
that hidden mount get umounted cleanly via refcounting? (Or does it leak?)

If killing the processes umounts their private mounts, all init has to do is
make sure all child processes are dead before doing a umount -a on what's
left. (Then, of course, there's FUSE. Does killing the FUSE helper prevent
the mount from being umounted?)

It's a bit conceptually persnickety, so far...

Rob

2005-11-01 00:01:38

by Ram Pai

Subject: Re: /etc/mtab and per-process namespaces

On Mon, 2005-10-31 at 15:27, Rob Landley wrote:
> On Monday 31 October 2005 13:11, Ram Pai wrote:
> > > The big thing I've never figured out how to do is make umount -a work in
> > > the presence of multiple namespaces. (Should it just umount what it
> > > sees? I don't know how to umount everything because I can't find
> > > everything...)
> >
> > Yes you won't find everything, since some of them are in a different
> > namespaces. Instead unmount whatever you see. Or use /proc/mounts
> > to unmount whatever is there in its namespace.
>
> But /proc/mounts is a symlink to self/mounts, and self is a symlink to $PID,
> so after burrowing through the symlinks you wind up looking
> at /proc/$PID/mounts.
>
> My concern is that if I have init, as root, try to perform a umount -a, it
> _still_ won't get the mounts belonging to child processes with a separate
> namespace. There's no "global view" of mounts available anywhere.

Having a "global view" is a debatable issue. What you are asking for
is a way for a process to access all the mounts, irrespective
of which namespace it belongs to.

I think the 'umount -a' semantics have to be refined to mean 'unmount all
the mounts belonging to its namespace'. And if you agree with those
semantics, then unmounting whatever is found in /proc/mounts would
suffice.
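
A rough sketch of those semantics, assuming Python and a hypothetical
helper named umount_plan. It only computes the order in which the mount
points visible in the caller's namespace could be unmounted and does not
unmount anything. Going through the list in reverse (most recent mount
first) is an assumption of this sketch, so that nested mounts come before
their parents.

```python
import re

def umount_plan(mounts_text):
    """Return the mount points listed in /proc/mounts-style text, last first."""
    plan = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # The second field is the mount point; decode octal escapes like \040.
        point = re.sub(r"\\([0-7]{3})",
                       lambda m: chr(int(m.group(1), 8)),
                       fields[1])
        plan.append(point)
    plan.reverse()
    return plan
```

Fed the contents of /proc/self/mounts, this lists only what the calling
process can see, which is the point of the refined semantics.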


>
> On the other hand, if we fork a child process with its own namespace, the
> child performs a private mount, and then we kill that child process, does
> that hidden mount get umounted cleanly via refcounting? (Or does it leak?)

Yes, all the mounts in the namespace will get cleaned up once no processes
have access to that namespace.

>
> If killing the processes umounts their private mounts, all init has to do is
> make sure all child processes are dead before doing a umount -a on what's
> left. (Then, of course, there's FUSE. Does killing the FUSE helper prevent
> the mount from being umounted?)

Again, as I said above, 'umount -a' just has to restrict itself to its
own namespace.

RP
>
> It's a bit conceptually persnickety, so far...
>
> Rob

2005-11-01 07:37:28

by Miklos Szeredi

Subject: Re: /etc/mtab and per-process namespaces

> (Then, of course, there's FUSE. Does killing the FUSE helper
> prevent the mount from being umounted?)

No. On clean exit (via INT, TERM, HUP handlers installed by the library)
it will lazy umount itself. Violent death of a filesystem daemon will
leave the mount intact, but it can still be umounted.

Miklos

2005-11-01 08:45:45

by Rob Landley

Subject: Re: /etc/mtab and per-process namespaces

On Tuesday 01 November 2005 01:36, Miklos Szeredi wrote:
> > (Then, of course, there's FUSE. Does killing the FUSE helper
> > prevent the mount from being umounted?)
>
> No. On clean exit (via INT, TERM, HUP handlers installed by library)
> it will lazy umount itself. Violent death of a filesystem daemon will
> leave the mount intact, but umountable.

Ok, so it sounds like the proper init-go-byebye procedure once namespaces get
deployed is for init to kill all child processes, umount -a what's left in
its namespace, and all is well. So no changes are needed to the umount -a
implementation...

Rob