2003-02-03 07:58:01

by Chris Jensen

Subject: Stale file handle

Hi,
I'm getting stale file handle errors with a Linux file server.
I've been running the file server for a long time now along with another box
as the client and there has never been any problem (in fact they've both been
server and client). Both boxes run Linux From Scratch (the entire Linux OS
compiled from source) with kernel 2.4.20 (though they've both gone through
most of the 2.4.x series).
They've run fine together for over a year, but recently I have tried
interoperating them with other boxes and I'm getting a real headache from
stale file handles. I've tried RedHat 7.3 and OpenBSD 3.2. All I have to do
is mount a remote share and try to untar a file from the share; before the
end of the file I get a stale file handle error. I should note that both the
RedHat and OpenBSD systems were running on VMWare, but they had no other
network connectivity problems, either to the boxes sharing NFS or to anything
else.
I've tried upgrading the servers to the latest nfs-utils (1.0.1). There were
inconsistencies with DNS resolution (the server wasn't in DNS, so I was
connecting by IP), but I fixed that. Neither of these things has fixed the
problem. I've looked in the log files and can find nothing significant.
What's going on here? I'm tearing my hair out!

--
Chris Jensen

Public Key: http://drspirograph.com/public_key/

Wait: Did you know that there's a direct correlation between the decline of
Spirograph and the rise in gang activity? Think about it.
- Dr Spirograph (The Simpsons)



-------------------------------------------------------
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-02-03 13:26:49

by Trond Myklebust

Subject: Re: Stale file handle

>>>>> " " == Chris Jensen <[email protected]> writes:

> What's going on here? I'm tearing my hair out!

Nobody will be able to figure that out until you give a description of
your setup.

Cheers,
Trond



2003-02-03 23:38:47

by Chris Jensen

Subject: Re: Stale file handle

On Tue, 4 Feb 2003 12:26 am, Trond Myklebust wrote:
> >>>>> " " == Chris Jensen <[email protected]> writes:
> > What's going on here? I'm tearing my hair out!
>
> Nobody will be able to figure that out until you give a description of
> your setup.

The server is on Linux, kernel 2.4.20, nfs-utils 1.0.1, compiled with gcc
2.95.2.1, glibc 2.2.3.
All of these were compiled from source; nfs-utils was configured with
./configure --enable-nfsv3 --enable-secure-statd --prefix=/usr

The clients were default setups of RedHat 7.3 and OpenBSD 3.2, which were
actually VMWare virtual machines running on the server.
The server name resolution is being done via the hosts file on the client.
The clients were DHCP clients.
I simply used
mount server:/mnt/data /mnt
(trying both IP and name for server).
Then I tried both hard and soft mounts, and forcing nfsvers to 2 and 3.

What further information do you need? Like I said before, I couldn't find
anything significant in the logs (client or server).

--
Chris Jensen


2003-02-04 00:09:23

by Chris Jensen

Subject: Re: Stale file handle

On Tue, 4 Feb 2003 12:26 am, Trond Myklebust wrote:
> >>>>> " " == Chris Jensen <[email protected]> writes:
> > What's going on here? I'm tearing my hair out!
>
> Nobody will be able to figure that out until you give a description of
> your setup.

In addition, the other box that I mentioned, which has been working happily with
the server for over a year, has now shown the problem too.
I had just recently changed the mount options on that box from soft to hard;
reverting to soft seems to have fixed it.

--
Chris Jensen


2003-02-04 00:26:42

by Trond Myklebust

Subject: Re: Stale file handle

>>>>> " " == Chris Jensen <[email protected]> writes:

> What further information do you need? Like I said before, I
> couldn't find anything significant in the logs (client or
> server).

What type of filesystem are you using on the server?

Cheers,
Trond



2003-02-04 00:44:53

by Benjamin LaHaise

Subject: Re: Stale file handle

On Tue, Feb 04, 2003 at 01:26:33AM +0100, Trond Myklebust wrote:
> What type of filesystem are you using on the server?

I've managed to trigger a stale file handle error by accessing a file on
the server via a hardlink in a directory that is outside of the exported
path on ext2...

-ben
--
"Do you seek knowledge in time travel?"



2003-02-04 01:10:58

by Chris Jensen

Subject: Re: Stale file handle

On Tue, 4 Feb 2003 11:44 am, Benjamin LaHaise wrote:
> On Tue, Feb 04, 2003 at 01:26:33AM +0100, Trond Myklebust wrote:
> > What type of filesystem are you using on the server?
>
> I've managed to trigger a stale file handle error by accessing a file on
> the server via a hardlink in a directory that is outside of the exported
> path on ext2...

That's not what's happening in my case; however, I am referencing the file on
the share via a symlink on a local file system.

i.e.
mount server:/mnt/data /mnt
/usr/src/sources -> /mnt/sources
tar -zxvf /usr/src/sources/apps/somefile.tar.gz

--
Chris Jensen


2003-02-04 01:12:38

by Chris Jensen

Subject: Re: Stale file handle

> What type of filesystem are you using on the server?

Oh, yes, that :) FAT32 (vfat module)

--
Chris Jensen


2003-02-04 01:13:46

by Trond Myklebust

Subject: Re: Stale file handle

>>>>> " " == Benjamin LaHaise <[email protected]> writes:

> On Tue, Feb 04, 2003 at 01:26:33AM +0100, Trond Myklebust
> wrote:
>> What type of filesystem are you using on the server?

> I've managed to trigger a stale file handle error by accessing
> a file on the server via a hardlink in a directory that is
> outside of the exported path on ext2...

Yep. Rightly or wrongly, that is expected behaviour if you have
subtree checking enabled on that export (the default).

Cheers,
Trond



2003-02-04 01:25:32

by Trond Myklebust

Subject: Re: Stale file handle

>>>>> " " == Chris Jensen <[email protected]> writes:

>> What type of filesystem are you using on the server?
> Oh, yes, that :) FAT32 (vfat module)

Ouch. You are out of luck.

Unfortunately FAT filesystems are not very NFS-friendly as they lack a
reliable way to generate an NFS filehandle. (The NFS filehandle is a
way of uniquely identifying the file to the server in a manner that
does not depend on the pathname).

The current implementation in knfsd relies heavily on the file staying
in the server's 'dentry cache' until the NFS client has finished with
it. If/when memory pressure forces garbage collection on the dcache,
the server will lose its mapping between file and NFS filehandle, and
you will get a 'stale file handle' error being returned to the client.

Cheers,
Trond
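
The failure mode described above can be illustrated with a toy model. This
is only a sketch under loud assumptions: the class ToyServer, its
count-based eviction, and the integer handles are all hypothetical and are
not how knfsd or the kernel dcache are implemented. It only shows why a
handle that can be resolved solely through a cache entry turns stale once
that entry is evicted.

```python
import errno
from collections import OrderedDict


class ToyServer:
    """Toy model: a handle can only be resolved while its cache entry survives."""

    def __init__(self, cache_size):
        self.cache_size = cache_size
        self.cache = OrderedDict()  # handle -> path, oldest entry first
        self.next_handle = 1

    def lookup(self, path):
        """Hand out a new handle for path and remember it in the cache."""
        handle = self.next_handle
        self.next_handle += 1
        self.cache[handle] = path
        # Stand-in for memory pressure: evict the oldest entries over the limit.
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)
        return handle

    def read(self, handle):
        """Resolve a handle; raise ESTALE if its mapping has been evicted."""
        if handle not in self.cache:
            raise OSError(errno.ESTALE, "Stale file handle")
        return self.cache[handle]


server = ToyServer(cache_size=2)
h1 = server.lookup("/mnt/data/sources/apps/somefile.tar.gz")
print(server.read(h1))            # entry still cached, so the handle resolves
server.lookup("/mnt/data/b")
server.lookup("/mnt/data/c")      # third entry pushes the first one out
try:
    server.read(h1)
except OSError as exc:
    print("stale:", exc.errno == errno.ESTALE)
```

In this model a long-running operation such as untarring a large file fails
partway through as soon as unrelated lookups push its entry out, which
matches the symptom reported at the start of the thread.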



2003-02-04 01:37:00

by NeilBrown

Subject: Re: Stale file handle

On Tuesday February 4, Chris Jensen wrote:
> > What type of filesystem are you using on the server?
>
> Oh, yes, that :) FAT32 (vfat module)

Sorry. You lose.

kNFSd only begrudgingly supports FAT, and only to a limited extent. It
sometimes works, but don't depend on it. Particularly don't depend on
it after a server restart.

NeilBrown



2003-02-04 01:40:53

by Chris Jensen

Subject: Re: Stale file handle

> Unfortunately FAT filesystems are not very NFS-friendly as they lack a
> reliable way to generate an NFS filehandle. (The NFS filehandle is a
> way of uniquely identifying the file to the server in a manner that
> does not depend on the pathname).
>
> The current implementation in knfsd relies heavily on the file staying
> in the server's 'dentry cache' until the NFS client has finished with
> it. If/when memory pressure forces garbage collection on the dcache,
> the server will lose its mapping between file and NFS filehandle, and
> you will get a 'stale file handle' error being returned to the client.

So is it just a coincidence that this has worked for over a year? I wouldn't
have thought so, because there have often been times when I have been
compiling on the server while connecting with a client. Wouldn't this have
placed memory pressure on the server?

--
Chris Jensen


2003-02-04 08:49:12

by Trond Myklebust

Subject: Re: Stale file handle

>>>>> " " == Chris Jensen <[email protected]> writes:

> So is it just a coincidence that this has worked for over a
> year? I wouldn't have thought so, because there have often been

You ought to have the source available to you so feel free to take a
look. However unless you have been using some specially patched
version of the kernel, then VFAT support will be poor, as described.

I believe that a lot could be done to fix this particular issue.
Although it is clear that some compromises will need to be made
w.r.t. the NFSv3 specs, the stale filehandle thing could perhaps be
fixed by (for instance) encoding the beginning of the FAT chain of the
file + the parent directory in the filehandle. Such an encoding is
admittedly not safe under renames of the file, but it would already be
a huge improvement.

Cheers,
Trond
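
The encoding suggested above might look roughly like the following. This is
a minimal sketch, not kernel code: the 8-byte layout, the function names,
and the dictionary standing in for the on-disk directories are all
assumptions made for illustration. It packs the first cluster of the file
and of its parent directory into an opaque handle, so the server can
resolve the handle from the handle's own contents instead of from a cache.

```python
import struct

# Hypothetical layout: two unsigned 32-bit little-endian cluster numbers.
HANDLE_FORMAT = "<II"


def encode_handle(file_start_cluster, parent_start_cluster):
    """Pack the file's and the parent directory's first cluster into bytes."""
    return struct.pack(HANDLE_FORMAT, file_start_cluster, parent_start_cluster)


def decode_handle(handle):
    """Return (file_start_cluster, parent_start_cluster) from a handle."""
    return struct.unpack(HANDLE_FORMAT, handle)


def resolve(handle, directories):
    """Find the file by scanning the parent directory named in the handle.

    directories maps a directory's start cluster to a dict of
    {file start cluster: file name}. Returns None when the file is no
    longer in that directory, which models a stale handle.
    """
    file_cluster, parent_cluster = decode_handle(handle)
    return directories.get(parent_cluster, {}).get(file_cluster)


directories = {100: {2048: "somefile.tar.gz"}, 200: {}}
handle = encode_handle(2048, 100)
print(resolve(handle, directories))   # found via the parent directory

# Moving the file to another directory breaks the old handle, because the
# parent recorded in the handle no longer contains the file.
directories[200][2048] = directories[100].pop(2048)
print(resolve(handle, directories))
```

Nothing in resolve depends on cached state, so such a handle would survive
cache eviction. The last two lines show one way it could still go stale,
consistent with the caveat above that the encoding is not safe under
renames.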



2003-02-04 10:09:19

by NeilBrown

Subject: Re: Stale file handle

On February 4, Trond Myklebust wrote:
> >>>>> " " == Chris Jensen <[email protected]> writes:
>
> > So is it just a coincidence that this has worked for over a
> > year? I wouldn't have thought so, because there have often been
>
> You ought to have the source available to you so feel free to take a
> look. However unless you have been using some specially patched
> version of the kernel, then VFAT support will be poor, as described.
>
> I believe that a lot could be done to fix this particular issue.
> Although it is clear that some compromises will need to be made
> w.r.t. the NFSv3 specs, the stale filehandle thing could perhaps be
> fixed by (for instance) encoding the beginning of the FAT chain of the
> file + the parent directory in the filehandle. Such an encoding is
> admittedly not safe under renames of the file, but it would already be
> a huge improvement.

The filehandles for FAT files have had this information ever since 2.4
first supported FAT (which wasn't the very early 2.4s. 2.2 sort of
supported FAT.. sometimes, maybe).

However the filehandle decoding code doesn't actually use this
information. It seemed like a lot of work for little gain. I doubt
very much if I will ever bother making it work better. Of course if
anyone else wants to try ..... :-)

NeilBrown



2003-02-04 10:41:04

by Trond Myklebust

Subject: Re: Stale file handle

>>>>> " " == Neil Brown <[email protected]> writes:

>> I believe that a lot could be done to fix this particular
>> issue. Although it is clear that some compromises will need to
>> be made w.r.t. the NFSv3 specs, the stale filehandle thing
>> could perhaps be fixed by (for instance) encoding the beginning
>> of the FAT chain of the file + the parent directory in the
>> filehandle. Such an encoding is admittedly not safe under
>> renames of the file, but it would already be a huge
>> improvement.

> The filehandles for FAT files have had this information ever
> since 2.4 first supported FAT (which wasn't the very early
> 2.4s. 2.2 sort of supported FAT.. sometimes, maybe).

> However the filehandle decoding code doesn't actually use this
> information. It seemed like a lot of work for little gain. I
> doubt very much if I will ever bother making it work better.
> Of course if anyone else wants to try ..... :-)


The VFAT filesystem is listed as being maintained by Gordon
Chaffee. If this is still the case, perhaps he could be persuaded to
give it a try?
If not, then it sounds like a golden opportunity for someone who would
like an extra entry on their CV ;-)

Cheers,
Trond

