We've had a number of cases recently whereby NFS servers think a client
has umounted a disk, but the client still has the disk mounted.
i.e. the client's entry in the server's /var/lib/nfs/rmtab has been
removed.
The clients mount the NFS disks using autofs.
This doesn't cause a problem in normal use (processes running on the
client can access the mounted file system OK) - but if the server
reboots, processes on the client get 'permission denied' when accessing
the mounted file system from the server (not surprising, as the server
has no record of the mount).
The problem _seems_ to occur with processes on the client that are in a
defunct state, and may have been so for days (or weeks) e.g. here are the
edited logs from a client and server:
client:
messages.4:Jan 20 13:00:04 client automount[505]: attempting to mount entry /export/server1
messages.4:Jan 20 13:09:55 client automount[2416]: expired /export/server1
messages.4:Jan 20 13:21:48 client automount[505]: attempting to mount entry /export/server1
server:
messages.4:Jan 20 13:00:04 server rpc.mountd: authenticated mount request from client:641 for /disk1 (/disk1)
messages.4:Jan 20 13:09:55 server rpc.mountd: authenticated unmount request from client:898 for /disk1 (/disk1)
messages.4:Jan 20 13:21:48 server rpc.mountd: authenticated mount request from client:709 for /disk1 (/disk1)
messages.4:Jan 20 14:09:40 server rpc.mountd: authenticated unmount request from client:912 for /disk1 (/disk1)
As you can see, rpc.mountd on the server has recorded an unmount request
at 14:09:40, but there is no corresponding automount 'expired' message
on the client for this file system ... the file system is still mounted
on the client (and usable) - however, if the server were to reboot, then
the client would no longer be able to access the file system.
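The comparison described here (a server-side unmount request with no matching client-side 'expired' message) can be automated. What follows is only a minimal sketch in Python, assuming the syslog layout seen in the excerpts and client and server clocks that agree to within a few seconds. The names unmatched_unmounts and slack_seconds are illustrative and are not part of autofs or nfs-utils.

```python
import re
from datetime import datetime

# Patterns follow the log excerpts in this thread; other syslog layouts are not handled.
STAMP = r"(\w{3} +\d+ \d\d:\d\d:\d\d)"
CLIENT_EXPIRED = re.compile(STAMP + r" \S+ automount\[\d+\]: expired")
SERVER_UNMOUNT = re.compile(STAMP + r" \S+ rpc\.mountd: authenticated unmount")


def _stamps(pattern, text):
    """Return the timestamps of every match of pattern in text."""
    # syslog omits the year, so a placeholder year is prepended before parsing.
    return [datetime.strptime("2000 " + m.group(1), "%Y %b %d %H:%M:%S")
            for m in pattern.finditer(text)]


def unmatched_unmounts(client_log, server_log, slack_seconds=5):
    """Return server unmount times with no client 'expired' message close by."""
    expired = _stamps(CLIENT_EXPIRED, client_log)
    return [t for t in _stamps(SERVER_UNMOUNT, server_log)
            if not any(abs((t - e).total_seconds()) <= slack_seconds
                       for e in expired)]
```

The sketch does not filter by host or file system, so it is only meaningful when each log has already been reduced to one client and one export, as in the edited excerpts.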
If I kill the parent process of the defunct process, autofs will not
umount the mount, but I can manually umount the file system and then
access it again as normal.
I'm using 2.4.18-based kernels on the client and server. The server is
using nfs-utils 0.3.3 and the client autofs 4.0.0pre10.
Could there be a case where automount tells the server it's umounting a
mount, but then finds it can't actually do the umount?
James Pearson
-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
>
> The clients mount the NFS disks using autofs
>
[...]
>
> I'm using a 2.4.18 based kernels on the client and server. The server is
> using nfs-utils 0.3.3 and the client autofs 4.0.0pre10
>
> Could there be a case where automount tells the server it's umounting a
> mount, but then finds it can't actually do the umount?
>
Automount doesn't tell the server anything. It's possible that
/bin/umount has this bug, however.
-hpa
--
PGP public key available - finger [email protected]
Key fingerprint: 2047/2A960705 BA 03 D3 2C 14 A8 A8 BD 1E DF FE 69 EE 35 BD 74
"The earth is but one country, and mankind its citizens." -- Bahá'u'lláh
Just Say No to Morden * The Shadows were defeated -- Babylon 5 is renewed!!
>>>>> " " == H Peter Anvin <[email protected]> writes:
> Automount doesn't tell the server anything. It's possible that
> /bin/umount has this bug, however.
That is correct. In the util-linux implementation of 'umount' the RPC
call is done before the actual syscall.
My guess is that this was decided upon in order to avoid races for the
case where some other process does another 'mount'.
Cheers,
Trond
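The effect of that ordering can be illustrated with a toy model. This is a sketch in Python, not the util-linux code: the Server and Client classes, their method names, and the use of a plain set for rmtab are all assumptions made for illustration.

```python
import errno


class Server:
    """Toy stand-in for rpc.mountd: rmtab is just a set of (client, path)."""

    def __init__(self):
        self.rmtab = set()

    def mnt(self, client, path):
        self.rmtab.add((client, path))

    def umnt(self, client, path):
        self.rmtab.discard((client, path))


class Client:
    """Toy client that tracks what is mounted and what is busy."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.mounted = set()
        self.busy = set()

    def mount(self, path):
        self.server.mnt(self.name, path)
        self.mounted.add(path)

    def umount_syscall(self, path):
        if path in self.busy:
            raise OSError(errno.EBUSY, "device is busy")
        self.mounted.discard(path)

    def umount_rpc_first(self, path):
        # Order described in the thread: tell the server, then try the syscall.
        self.server.umnt(self.name, path)
        self.umount_syscall(path)
```

In this model, calling umount_rpc_first on a busy path raises the 'device is busy' error after the server entry has already been dropped, so the client stays mounted while the server no longer records it.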
So, is there a possible workaround?
Thanks
James Pearson
Trond Myklebust wrote:
>
> >>>>> " " == H Peter Anvin <[email protected]> writes:
>
> > Automount doesn't tell the server anything. It's possible that
> > /bin/umount has this bug, however.
>
> That is correct. In the util-linux implementation of 'umount' the RPC
> call is done before the actual syscall.
> My guess is that this was decided upon in order to avoid races for the
> case where some other process does another 'mount'.
>
> Cheers,
> Trond
>>>>> " " == James Pearson <[email protected]> writes:
> So, is there a possible workaround? Thanks
Workaround: drop the entire RPC call. Failing that, you have a choice
of races...
A real fix would involve teaching the knfsd kernel processes to do
upcalls to userland in order to find out if a client is authorized or
not. This is feasible in 2.5.x, but not in 2.4.x (as the latter lacks
the machinery for doing upcalls to userland).
Cheers,
Trond
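One of the races alluded to can be sketched as follows. This is a toy model in Python under stated assumptions: the server keeps a single entry per client and export with no reference counting, a mount contacts rpc.mountd before making the mount syscall, and the replay function and its step names are invented for illustration. It shows what could happen if umount made the syscall first and sent the RPC afterwards while another process mounts in between.

```python
def replay(steps):
    """Apply a sequence of steps to a toy client/server state and return it."""
    # Start from a mounted file system that the server knows about.
    state = {"client_mounted": True, "server_entry": True}
    for step in steps:
        if step == "umount_syscall":
            state["client_mounted"] = False
        elif step == "umnt_rpc":
            state["server_entry"] = False
        elif step == "mnt_rpc":
            state["server_entry"] = True
        elif step == "mount_syscall":
            state["client_mounted"] = True
        else:
            raise ValueError("unknown step: " + step)
    return state


# Process A unmounts (syscall first); process B mounts before A's RPC is sent.
RACY = ["umount_syscall", "mnt_rpc", "mount_syscall", "umnt_rpc"]
# The same steps without interleaving.
SERIAL = ["umount_syscall", "umnt_rpc", "mnt_rpc", "mount_syscall"]
```

With RACY the late unmount RPC removes the entry that belongs to the new mount, leaving the client mounted and the server without a record. With SERIAL both sides end up consistent.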
What would be the side effects of dropping the RPC call? I guess the
rpc.mountd on the server will never get a umount request - will this
cause problems with the client autofs repeatedly mounting/umounting file
systems from the server?
With my particular problem, there was nothing in the system log on the
client to indicate that the NFS filesystem couldn't be umounted - i.e.
under what circumstances could umount think it's OK to umount, but the
actual umount fails - if I try to umount an NFS filesystem that has open
resources, I get a 'device is busy' error.
Thanks
James Pearson
Trond Myklebust wrote:
>
> >>>>> " " == James Pearson <[email protected]> writes:
>
> > So, is there a possible workaround? Thanks
>
> Workaround: drop the entire RPC call. Failing that, you have a choice
> of races...
>
> A real fix would involve teaching the knfsd kernel processes to do
> upcalls to userland in order to find out if a client is authorized or
> not. This is feasible in 2.5.x, but not in 2.4.x (as the latter lacks
> the machinery for doing upcalls to userland).
>
> Cheers,
> Trond
I've been looking into the umount code a bit more, trying to work out
what's going on. It appears that, because rpc.mountd on the server is
contacted before the umount system call is made, the entry in
/var/lib/nfs/rmtab on the server for the client mount is removed
regardless of whether the umount system call succeeds ...
This means that if you attempt a umount of an NFS file system, but get
an error like 'device is busy', then if the server reboots, the client
gets 'permission denied' on the NFS file system mount point ...
Surely this can't be right?
Fortunately, as autofs doesn't appear to attempt a umount until the
mount point is 'free', then you shouldn't have this problem - unless you
manually umount the mount point, or something goes wrong...
The man page for rpc.mountd states:
    The rmtab File
        For every mount request received from an NFS client,
        rpc.mountd adds an entry to the /var/state/nfs/rmtab file.
        When receiving an unmount request, that entry is removed.

        However, this file is mostly ornamental. One, the client
        can continue to use the file handle even after calling
        rpc.mountd's UMOUNT procedure. And two, if a client
        reboots without notifying rpc.mountd, a stale entry will
        remain in rmtab.
This appears to suggest that having 'stale' entries in rmtab is not a
problem, which in turn suggests that dropping the RPC umount call
wouldn't cause any adverse effects - and would also stop my 'permission
denied' problem ...
James Pearson
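A mismatch of this kind could be spotted before a server reboot by comparing the client's mount table with the server's rmtab. The following is a rough sketch in Python, assuming /proc/mounts-style text on the client and rmtab lines of the form host:path optionally followed by :count (an assumption about the file layout). The function names are illustrative, and host names are compared literally, so short names, fully qualified names and IP addresses will not match each other.

```python
def nfs_mounts(proc_mounts_text):
    """Yield (server, exported_path) for each NFS line of /proc/mounts text."""
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in {"nfs", "nfs4"} and ":" in fields[0]:
            server, _, path = fields[0].partition(":")
            yield server, path


def rmtab_paths(rmtab_text, client):
    """Return the exported paths that rmtab-style text records for client."""
    paths = set()
    for line in rmtab_text.splitlines():
        parts = line.strip().split(":")
        if len(parts) >= 2 and parts[0] == client:
            paths.add(parts[1])
    return paths


def unrecorded(proc_mounts_text, rmtab_text, client, server):
    """Paths the client has mounted from server that rmtab does not list."""
    recorded = rmtab_paths(rmtab_text, client)
    return sorted(path for srv, path in nfs_mounts(proc_mounts_text)
                  if srv == server and path not in recorded)
```

The two texts would have to be collected separately, for example by reading /proc/mounts on the client and copying rmtab from the server.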
James Pearson wrote:
>
> What would be the side effects of dropping the RPC call? I guess the
> rpc.mountd on the server will never get a umount request - will this
> cause problems with the client autofs repeatedly mounting/umounting file
> systems from the server?
>
> With my particular problem, there was nothing in the system log on the
> client to indicate that the NFS filesystem couldn't be umounted - i.e.
> under what circumstances could umount think it's OK to umount, but the
> actual umount fails - if I try to umount an NFS filesystem that has open
> resources, I get a 'device is busy' error.
>
> Thanks
>
> James Pearson
>
> Trond Myklebust wrote:
> >
> > >>>>> " " == James Pearson <[email protected]> writes:
> >
> > > So, is there a possible workaround? Thanks
> >
> > Workaround: drop the entire RPC call. Failing that, you have a choice
> > of races...
> >
> > A real fix would involve teaching the knfsd kernel processes to do
> > upcalls to userland in order to find out if a client is authorized or
> > not. This is feasible in 2.5.x, but not in 2.4.x (as the latter lacks
> > the machinery for doing upcalls to userland).
> >
> > Cheers,
> > Trond
> _______________________________________________
> autofs mailing list
> [email protected]
> http://linux.kernel.org/mailman/listinfo/autofs
>
> Fortunately, as autofs doesn't appear to attempt a umount until the
> mount point is 'free', then you shouldn't have this problem - unless you
> manually umount the mount point, or something goes wrong...
>
Well, it does, but the test is a sieve, rather than a true test, as there
is no protection against race conditions. It's mostly there to avoid the
high overhead of the umount operation when it's not necessary or
desirable.
A user can come in between the test and the umount request, and autofs
depends on /bin/umount signalling error without harmful side effects.
-hpa
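The gap between the test and the umount request is a check-then-act window. A minimal sketch in Python follows; the MountPoint class, the expire function and the between_check_and_umount hook are hypothetical and exist only to make the window visible. This is not the automount daemon's code.

```python
import errno


class MountPoint:
    """Toy mount point with a user count."""

    def __init__(self):
        self.users = 0
        self.mounted = True

    def looks_idle(self):
        return self.users == 0

    def umount(self):
        if self.users:
            raise OSError(errno.EBUSY, "device is busy")
        self.mounted = False


def expire(mp, between_check_and_umount=lambda: None):
    """Run the cheap idle test, then try to unmount; report what happened."""
    if not mp.looks_idle():
        return "skipped"          # the sieve avoided a pointless umount
    between_check_and_umount()    # window in which a new user can arrive
    try:
        mp.umount()
    except OSError:
        return "failed"           # relies on umount failing without side effects
    return "unmounted"
```

If the hook adds a user, expire returns "failed" and the mount stays in place, which is harmless only as long as the failed umount has no side effects of its own.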
-------------------------------------------------------
This SF.net email is sponsored by: SlickEdit Inc. Develop an edge.
The most comprehensive and flexible code editor you can use.
Code faster. C/C++, C#, Java, HTML, XML, many more. FREE 30-Day Trial.
http://www.slickedit.com/sourceforge
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
>>>>> " " == H Peter Anvin <[email protected]> writes:
> A user can come in between the test and the umount request, and
> autofs depends on /bin/umount signalling error without harmful
> side effects.
Right. Unfortunately harmful side effects are almost inevitable here
since stateful operations over RPC are inherently race-prone (support
for out-of-order message delivery just isn't part of the state game
8-)).
All we can do is reduce the number of races to a minimum and
hopefully fix knfsd at some point in the near future.
Cheers,
Trond
On Wed, 19 Feb 2003 01:56:59 -0800 (PST), H. Peter Anvin <[email protected]> wrote:
> Well, it does, but the test is a sieve, rather than a true test, as there
> is no protection against race conditions. It's mostly there to avoid the
> high overhead of the umount operation when it's not necessary or
> desirable.
>
> A user can come in between the test and the umount request, and autofs
> depends on /bin/umount signalling error without harmful side effects.
Isn't autofs v4 supposed to fix this? I remember Jeremy mentioning that
the filesystem would lock all accesses to a dentry scheduled to be
unmounted until the daemon replied to the request.
In fact, from his web page:
Previously, when the kernel sent an expire message to the daemon it was
more an hint than anything else. Now the kernel makes certain that the
tree isn't busy before sending an expire message to the daemon. It also
blocks all new users of the tree until the umount is complete, so that
they can't see a partially mounted tree.
Ion
--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.
Ion Badulescu wrote:
>
> Isn't autofs v4 supposed to fix this? I remember Jeremy mentioning that
> the filesystem would lock all accesses to a dentry scheduled to be
> unmounted until the daemon replied to the request.
>
That's fine and dandy, but you would have to atomically lock *all*
dentries belonging to the subtree, which I suspect (but don't know for
sure) is impossible.
-hpa
On Wed, 2003-02-19 at 10:16, H. Peter Anvin wrote:
> Ion Badulescu wrote:
> > Isn't autofs v4 supposed to fix this? I remember Jeremy mentioning that
> > the filesystem would lock all accesses to a dentry scheduled to be
> > unmounted until the daemon replied to the request.
> >
>
> That's fine and dandy, but you would have to atomically lock *all*
> dentries belonging to the subtree, which I suspect (but don't know for
> sure) is impossible.
I seem to have missed the rest of this thread. What's the context?
J
On 19 Feb 2003, Jeremy Fitzhardinge wrote:
> On Wed, 2003-02-19 at 10:16, H. Peter Anvin wrote:
> > Ion Badulescu wrote:
> > > Isn't autofs v4 supposed to fix this? I remember Jeremy mentioning that
> > > the filesystem would lock all accesses to a dentry scheduled to be
> > > unmounted until the daemon replied to the request.
> > >
> >
> > That's fine and dandy, but you would have to atomically lock *all*
> > dentries belonging to the subtree, which I suspect (but don't know for
> > sure) is impossible.
>
> I seem to have missed the rest of this thread. What's the context?
HPA was saying that autofs checks if a filesystem is busy before trying to
unmount it, but does not lock it, so that a remount request could come in
while umount is still running.
I replied with the above...
Ion
--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.
On Wed, 2003-02-19 at 19:26, Ion Badulescu wrote:
> > On Wed, 2003-02-19 at 10:16, H. Peter Anvin wrote:
> > > Ion Badulescu wrote:
> > > > Isn't autofs v4 supposed to fix this? I remember Jeremy mentioning that
> > > > the filesystem would lock all accesses to a dentry scheduled to be
> > > > unmounted until the daemon replied to the request.
> > > >
> > >
> > > That's fine and dandy, but you would have to atomically lock *all*
> > > dentries belonging to the subtree, which I suspect (but don't know for
> > > sure) is impossible.
> >
> > I seem to have missed the rest of this thread. What's the context?
>
> HPA was saying that autofs checks if a filesystem is busy before trying to
> unmount it, but does not lock it, so that a remount request could come in
> while umount is still running.
Autofs4 addresses this in two ways.
For the actual busyness check, it holds the dcache lock while traversing
the dentries, so that shouldn't change during the check. It also blocks
new name traversals while it is doing an expiry check, so once it has
decided a tree is not busy, and gets the daemon to do the umount, new
users are blocked until the umount is finished. Once the umount is
done, the new users will trigger a new mount.
J
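The behaviour described here (check under a lock, hold off new users until the umount is finished, then let them trigger a new mount) can be pictured with a user-space analogy. The sketch below is in Python and is not the autofs4 kernel code: the Tree class, its single gate lock and the do_umount callback are assumptions chosen to mirror the description.

```python
import threading


class Tree:
    """User-space analogy of an automounted tree guarded by one gate lock."""

    def __init__(self):
        self._gate = threading.Lock()  # stands in for blocking new traversals
        self.users = 0
        self.mounted = True
        self.mount_count = 1

    def enter(self):
        """A new user walks into the tree; waits while an expiry is running."""
        with self._gate:
            if not self.mounted:       # the walk triggers a fresh mount
                self.mounted = True
                self.mount_count += 1
            self.users += 1

    def leave(self):
        with self._gate:
            self.users -= 1

    def expire(self, do_umount):
        """Busy check and umount happen with new users held at the gate."""
        with self._gate:
            if self.users:
                return False
            do_umount()
            self.mounted = False
            return True
```

A user that calls enter while expire is running waits at the gate and then finds the tree unmounted, so it causes a new mount instead of racing with the umount.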