> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Tim Hockin
> Sent: Tuesday, January 06, 2004 3:50 PM
> To: H. Peter Anvin
> Cc: autofs mailing list; Mike Waychison; Kernel Mailing List
> Subject: Re: [autofs] [RFC] Towards a Modern Autofs
>
<...snip...>
>
> > Pardon me for sounding harsh, but I'm seriously sick of the oft-repeated
> > idiocy that effectively boils down to "the daemon can die and would lose
> > its state, so let's put it all in the kernel." A dead daemon is a
> > painful recovery, admitted. It is also a THIS SHOULD NOT HAPPEN
>
> But it *does* happen.
>
> > condition. By cramming it into the kernel, you're in fact
> > making the system less stable, not more, because the kernel being tainted
> > with faulty code is a total system malfunction; a crashed userspace
> > daemon is
>
> I don't think this design crams anything into the kernel. It
> doesn't put a whole lot more into the kernel than is currently in there
> (expiry and new mount stuff, aside). All the work still happens in userland.
>
> The daemon as it stands does NOT handle namespaces, does NOT handle expiry
> well, and is a pretty sad copy of an old design.
>
> > "merely" a messy cleanup. In practice, the autofs daemon does not die
> > unless a careless system administrator kills it. It is a non-problem.
>
> I have some customers I'd love to send to you, if you really
> think that's true.
Speaking as a sysadmin with 300+ machines (some Linux, some Solaris)
using autofs, I can say that the Linux autofs daemon does die on
occasion, or at least some of its children become hung or unresponsive.
This happened to us with autofs3 and autofs4, leading me to contact Ian
Kent and become involved in testing new versions of autofs4. I don't
have any problems with the newest versions (4.1.0+), but with previous
code, 4.0.0pre10 for example, I found the ability to restart the daemon
invaluable. On those occasions where the autofs daemon gets confused
(loses track of mountpoints, gets corruption in its internal
representation of NIS maps, etc.), we can shut down the autofs daemon,
kill any remaining processes, and restart it from scratch. In most
cases restarting the daemon fixes the problem. It's worth noting that I
have seen this happen on Solaris 2.6 as well, but it is extremely rare.
On the Solaris machine there was no automount daemon to restart, so I was
forced to reboot it to regain access to the 'missing' mountpoint.
If you've read this far, what I'm trying to say is that having userspace
related code remain in userland is a good thing since you can restart
the daemon if something goes wrong. If you move all of this to
kernel-space you can't do anything about it if there is a problem. In
Solaris there is a command called 'automount' that tells the kernel to
re-read the automount maps; perhaps it resets the autofs subsystem in
the kernel as well. If Linux autofs had the same capability, we might
not need the daemon, but until then, having the daemon in userland is a
good thing.
On Tue, Jan 06, 2004 at 04:28:59PM -0600, Ogden, Aaron A. wrote:
[snip]
> having the daemon in userland is a
> good thing.
You and hpa are agreeing...
As another sysadmin with 300+ Linux and Solaris boxes, I second
your sentiments exactly. As my previous post today states, I am
having exactly the problem you describe, with automount daemons
becoming hung or unresponsive. Guess I should give 4.1.0 a try.
Of course the same argument applies to the NFS server, but they went
ahead and moved most of that into the kernel anyway for the
performance gain.
--
---------------------------------------------------------------
Paul Raines email: [email protected]
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street Charlestown, MA 02129 USA
On Tue, Jan 06, 2004 at 04:28:59PM -0600, Ogden, Aaron A. wrote:
> Solaris there is a command called 'automount' that tells the kernel to
> re-read the automount maps; perhaps it resets the autofs subsystem in
> the kernel as well. If Linux autofs had the same capability, we might
> not need the daemon, but until then, having the daemon in userland is a
> good thing.
That's more or less exactly what is proposed.
On Tue, 6 Jan 2004, Ogden, Aaron A. wrote:
> If you've read this far, what I'm trying to say is that having userspace
> related code remain in userland is a good thing since you can restart
> the daemon if something goes wrong.
Hear, hear. But...
> If you move all of this to
> kernel-space you can't do anything about it if there is a problem. In
> Solaris there is a command called 'automount' that tells the kernel to
> re-read the automount maps; perhaps it resets the autofs subsystem in
> the kernel as well. If Linux autofs had the same capability, we might
> not need the daemon, but until then, having the daemon in userland is a
> good thing.
To my mind the ideal design goes something like this:
1. You can mount a synthetic autofs filesystem on lots of directories,
including subdirs of other autofs filesystems.
2. Whenever anything tries to access one of those directories (for a
direct map) or one of its subdirs whether visible or not (indirect map), if
nothing is mounted on it [and it hasn't been told by a special flag that
it's non-mountable, see the /home/user/server{A,B} example], the autofs
kernel module runs a script in user space (in the namespace context of the
originally requesting process). Upon exit, if something is now mounted on
the subdir, fine. Otherwise, ENOENT. The module is not required to know
anything about autofs maps that the userspace helper may or may not
consult.
3. Periodically the module should check if mounted filesystems are
potentially unmountable (this seems to be inexpensive), and if so it should
run the userspace helper to unmount them. If the unmount fails, the helper
(not the kernel) should try to distinguish a race condition from a dead NFS
server, and whether the mount will be viable once the server comes back. If
not, it should be more aggressive than the present daemon in unmounting. At
present the module carefully keeps up-to-date a last_used field and a
timeout potentially different for each mount, but it's probably sufficient
to merely poll all the mount points periodically all at once, perhaps with
a one-time exemption when something is first mounted.
And that's *all* the complexity that should be in the kernel. That's quite
complex enough in my opinion. If the userspace helper needs state, it can
lock and read/write a file. I don't really see the need for the autofs
system to have state beyond "it's mounted".
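To make the helper protocol in point 2 concrete, here is a minimal sketch of
what such a userspace helper could look like. This is not actual autofs code:
the convention of passing the trigger path as argv[1], the lookup_map() stub,
and the example server and paths are assumptions made purely for illustration.
The module would only care that, after the helper exits, something is mounted
on the subdir; otherwise it answers the original access with ENOENT.

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical map lookup: trigger path -> NFS source.  A real helper
 * would consult the actual autofs maps (files, NIS, LDAP, ...).
 */
static const char *lookup_map(const char *path)
{
	if (strcmp(path, "/net/fileserver/home") == 0)
		return "fileserver:/home";
	return NULL;
}

int main(int argc, char *argv[])
{
	const char *src;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <trigger-path>\n", argv[0]);
		return 1;
	}

	src = lookup_map(argv[1]);
	if (!src)
		return 1;	/* no map entry: nothing mounted, access sees ENOENT */

	/* Hand the real work to mount(8); our exit status becomes its status. */
	execl("/bin/mount", "mount", "-t", "nfs", src, argv[1], (char *)NULL);
	perror("execl");
	return 1;
}

Everything map-related stays in userspace; the kernel only needs to notice
whether the helper left a filesystem mounted on the trigger point.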
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: [email protected] http://www.math.ucla.edu/~jimc (q.v. for PGP key)
Jim Carter wrote:
>
> To my mind the ideal design goes something like this:
>
> 1. You can mount a synthetic autofs filesystem on lots of directories,
> including subdirs of other autofs filesystems.
>
> 2. Whenever anything tries to access one of those directories (for a
> direct map) or one of its subdirs whether visible or not (indirect map), if
> nothing is mounted on it [and it hasn't been told by a special flag that
> it's non-mountable, see the /home/user/server{A,B} example], the autofs
> kernel module runs a script in user space (in the namespace context of the
> originally requesting process). Upon exit, if something is now mounted on
> the subdir, fine. Otherwise, ENOENT. The module is not required to know
> anything about autofs maps that the userspace helper may or may not
> consult.
>
> 3. Periodically the module should check if mounted filesystems are
> potentially unmountable (this seems to be inexpensive), and if so it should
> run the userspace helper to unmount them. If the unmount fails, the helper
> (not the kernel) should try to distinguish a race condition from a dead NFS
> server, and whether the mount will be viable once the server comes back. If
> not, it should be more aggressive than the present daemon in unmounting. At
> present the module carefully keeps up-to-date a last_used field and a
> timeout potentially different for each mount, but it's probably sufficient
> to merely poll all the mount points periodically all at once, perhaps with
> a one-time exemption when something is first mounted.
>
> And that's *all* the complexity that should be in the kernel. That's quite
> complex enough in my opinion. If the userspace helper needs state, it can
> lock and read/write a file. I don't really see the need for the autofs
> system to have state beyond "it's mounted".
>
What you've described above is more or less the autofs v3 design. There
are reasons why you really want to have a simple-minded timeout in the
kernel, mostly because attempting umount is more expensive than it
should be on some filesystems. It only needs to be statistically
accurate, though, and thus it does not introduce a race.
Once you have to deal with mount trees (multiple filesystems on the same
mount point which you want to have appear to userspace as a unit),
things get significantly more complex, unfortunately. Mounting is not a
problem, since the nonprivileged processes are simply held, but
umounting is, since in order to make sure there are no race conditions
userspace needs to be locked out from filesystem "a" while umounting
filesystem "a/b", *or* the equivalent of a direct mount autofs point has
to be imposed on node "a/b" of filesystem "a" which can be atomically
deleted together with the umounting of filesystem "a".
These are the mount traps Al Viro has been architecting.
-hpa
On Wed, 7 Jan 2004, H. Peter Anvin wrote:
>
> These are the mount traps Al Viro has been architecting.
>
Please tell me about these.
I haven't seen any discussion on the implementation.
Just a few sentences ....
Ian
On Thu, Jan 08, 2004 at 08:52:31PM +0800, Ian Kent wrote:
> On Wed, 7 Jan 2004, H. Peter Anvin wrote:
>
> >
> > These are the mount traps Al Viro has been architecting.
> >
>
> Please tell me about these.
>
> I haven't seen any discussion on the implementation.
>
> Just a few sentences ....
Special vfsmount mounted somewhere; has no superblock associated with it;
attempt to step on it triggers event; normal result of that event is to
get a normal mount on top of it, at which point usual chaining logics
will make sure that we don't see the trap until it's uncovered by removal
of covering filesystem. Trap (and everything mounted on it, etc.) can
be removed by normal lazy umount.
Basically, it's a single-point analog of autofs done entirely in VFS.
The job of automounter is to maintain the traps and react to events.
And yes, I should've done that months ago. Waaaaay too long backlog -
bdev work, dev_t stuff, netdev, yadda, yadda.
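Purely as an illustration of that division of labour ("maintain the traps and
react to events"), a hypothetical daemon-side event loop might look like the
sketch below. The /dev/mounttrap device, the trap_event record and the
/usr/sbin/autofs-helper program are all invented names for the sake of the
example; no such interface exists in the kernel.

#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Invented event record: what the kernel might report when a process
 * steps on a trap. */
struct trap_event {
	char path[4096];	/* path of the trap that was triggered */
};

static void handle_event(const struct trap_event *ev)
{
	pid_t pid = fork();

	if (pid == 0) {
		/* Child: run a helper that consults the maps and mounts a real
		 * filesystem over the trap; once covered, later lookups no
		 * longer see the trap until that filesystem is removed. */
		execl("/usr/sbin/autofs-helper", "autofs-helper",
		      ev->path, (char *)NULL);	/* hypothetical helper */
		_exit(1);
	}
	if (pid > 0)
		waitpid(pid, NULL, 0);
}

int main(void)
{
	struct trap_event ev;
	int fd = open("/dev/mounttrap", O_RDONLY);	/* invented device */

	if (fd < 0) {
		perror("open");
		return 1;
	}
	for (;;) {
		/* Block until the kernel reports a triggered trap, then react
		 * by covering it with a real mount. */
		if (read(fd, &ev, sizeof(ev)) != (ssize_t)sizeof(ev))
			break;
		handle_event(&ev);
	}
	close(fd);
	return 0;
}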
On Thu, 8 Jan 2004 [email protected] wrote:
> Basically, it's a single-point analog of autofs done entirely in VFS.
> The job of automounter is to maintain the traps and react to events.
>
> And yes, I should've done that months ago. Waaaaay too long backlog -
> bdev work, dev_t stuff, netdev, yadda, yadda.
>
So that's why Peter appears to have not made progress.
Yes. Tell me about the 24-hour days that feel like an hour, where it feels
like only an hour's progress has been made.
Ian
[email protected] wrote:
>On Thu, Jan 08, 2004 at 08:52:31PM +0800, Ian Kent wrote:
>
>>On Wed, 7 Jan 2004, H. Peter Anvin wrote:
>>
>>>These are the mount traps Al Viro has been architecting.
>>
>>Please tell me about these.
>>
>>I haven't seen any discussion on the implementation.
>>
>>Just a few sentences ....
>
>Special vfsmount mounted somewhere; has no superblock associated with it;
>attempt to step on it triggers event; normal result of that event is to
>get a normal mount on top of it, at which point usual chaining logics
>will make sure that we don't see the trap until it's uncovered by removal
>of covering filesystem. Trap (and everything mounted on it, etc.) can
>be removed by normal lazy umount.
>
>Basically, it's a single-point analog of autofs done entirely in VFS.
>The job of automounter is to maintain the traps and react to events.
>
Is there any clear advantage to doing this in the VFS other than saving
a superblock and a dentry/inode pair or two?
I remember talking to you about this, and I seem to recall that these
mount traps would probably communicate using a struct file, so a
trap-user would somehow receive events about when the trap was set
off. Will this communication model continue to work within a cloned
namespace? What happens if the trap-client closes the file?
--
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: [email protected]
http://www.sun.com
Mike Waychison wrote:
>>
>> Special vfsmount mounted somewhere; has no superblock associated with it;
>> attempt to step on it triggers event; normal result of that event is to
>> get a normal mount on top of it, at which point usual chaining logics
>> will make sure that we don't see the trap until it's uncovered by removal
>> of covering filesystem. Trap (and everything mounted on it, etc.) can
>> be removed by normal lazy umount.
>>
>> Basically, it's a single-point analog of autofs done entirely in VFS.
>> The job of automounter is to maintain the traps and react to events.
>>
> Is there any clear advantage to doing this in the VFS other than saving
> a superblock and a dentry/inode pair or two?
>
> I remember talking to you about this, and I seem to recall that these
> mount traps would probably communicate using a struct file, so a
> trap-user would somehow receive events about when the trap was set
> off. Will this communication model continue to work within a cloned
> namespace? What happens if the trap-client closes the file?
>
The biggest issue is to ensure that the appropriate atomicity guarantees
can be maintained. In particular, it must be possible to umount the
underlying filesystem and all mount traps on top of it atomically.
Anything less will result in race conditions.
-hpa
H. Peter Anvin wrote:
>Mike Waychison wrote:
>
>>>Special vfsmount mounted somewhere; has no superblock associated with it;
>>>attempt to step on it triggers event; normal result of that event is to
>>>get a normal mount on top of it, at which point usual chaining logics
>>>will make sure that we don't see the trap until it's uncovered by removal
>>>of covering filesystem. Trap (and everything mounted on it, etc.) can
>>>be removed by normal lazy umount.
>>>
>>>Basically, it's a single-point analog of autofs done entirely in VFS.
>>>The job of automounter is to maintain the traps and react to events.
>>
>>Is there any clear advantage to doing this in the VFS other than saving
>>a superblock and a dentry/inode pair or two?
>>
>>I remember talking to you about this, and I seem to recall that these
>>mount traps would probably communicate using a struct file, so a
>>trap-user would somehow receive events about when the trap was set
>>off. Will this communication model continue to work within a cloned
>>namespace? What happens if the trap-client closes the file?
>
>The biggest issue is to ensure that the appropriate atomicity guarantees
>can be maintained. In particular, it must be possible to umount the
>underlying filesystem and all mount traps on top of it atomically.
>Anything less will result in race conditions.
>
> -hpa
>
Unless I'm missing something, implementing this as a separate filesystem
type still has the appropriate atomicity guarantees as long as the VFS
supports complex expiry, whereby userspace would tag submounts as being
part of the overall expiry for a base mountpoint.
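As a purely hypothetical illustration of what such tagging could look like
from userspace, the sketch below uses an invented ioctl; the request number,
the argument structure and the paths do not correspond to any existing autofs
interface.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Invented request number and argument structure, for illustration only. */
#define AUTOFS_IOC_TAG_SUBMOUNT 0x9370

struct autofs_tag {
	char base[4096];	/* base autofs mountpoint */
	char submount[4096];	/* submount expired together with the base */
};

int main(void)
{
	struct autofs_tag tag;
	int fd = open("/net/fileserver", O_RDONLY);	/* hypothetical base mount */

	if (fd < 0) {
		perror("open");
		return 1;
	}
	memset(&tag, 0, sizeof(tag));
	strncpy(tag.base, "/net/fileserver", sizeof(tag.base) - 1);
	strncpy(tag.submount, "/net/fileserver/export/home", sizeof(tag.submount) - 1);

	/*
	 * Ask the filesystem to treat the submount as part of the base
	 * mountpoint's expiry unit, so that both can be umounted atomically.
	 */
	if (ioctl(fd, AUTOFS_IOC_TAG_SUBMOUNT, &tag) < 0)
		perror("ioctl (expected: this request is invented)");

	close(fd);
	return 0;
}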
--
Mike Waychison
Sun Microsystems, Inc.
Mike Waychison wrote:
>
> Unless I'm missing something, implementing this as a separate filesystem
> type still has the appropriate atomicity guarantees as long as the VFS
> supports complex expiry, whereby userspace would tag submounts as being
> part of the overall expiry for a base mountpoint.
>
It would, but it seems like a vastly more invasive change to the VFS
than ought to be necessary.
-hpa