LinuxLists.cc - possible client stale filehandle bug?

2005-01-25 17:40:31

Subject: possible client stale filehandle bug?

Hi all,
I have lots of storage in a large Solaris samfs environment that is NFS
shared to a large number of Solaris and RHEL3 clients. Under some conditions,
Linux apps have been getting stale filehandles during the normal course of
their activity. Various file handling syscalls like read() or open() might
fail. Lots of rename and setattr calls seem to trigger the problem.
'ci' and 'cvs commit' are particularly good at this.

It seems that the Solaris clients never report any such errors, only the Linux
clients. However, watching 'snoop' on the Solaris NFS server, I see that it IS
returning stale file handles to both OSes, but Solaris clients seem to retry
the request several times; and the Linux clients immediately pass the error up
to the application.

Is there some condition that the 2.4 kernel is handling incorrectly?

Sample snippet from the 'snoop' on the Solaris server with a Solaris client
waiting...

rcf102.usc.edu -> almaak.usc.edu NFS C LOOKUP3 FH=B41B Entries.Log
almaak.usc.edu -> rcf102.usc.edu NFS R LOOKUP3 OK FH=7BFE
rcf102.usc.edu -> almaak.usc.edu TCP D=2049 S=610 Ack=3071279992 Seq=337022612 Len=0 Win=64240
rcf102.usc.edu -> almaak.usc.edu NFS C ACCESS3 FH=7BFE (read,modify,extend,execute)
almaak.usc.edu -> rcf102.usc.edu TCP D=610 S=2049 Ack=337022752 Seq=3071279992 Len=0 Win=64240
almaak.usc.edu -> rcf102.usc.edu NFS R ACCESS3 Stale NFS file handle
rcf102.usc.edu -> almaak.usc.edu NFS C LOOKUP3 FH=B41B Entries.Log
almaak.usc.edu -> rcf102.usc.edu NFS R LOOKUP3 OK FH=7BFE
rcf102.usc.edu -> almaak.usc.edu NFS C LOOKUP3 FH=B41B Entries.Log
almaak.usc.edu -> rcf102.usc.edu NFS R LOOKUP3 OK FH=7BFE
rcf102.usc.edu -> almaak.usc.edu TCP D=2049 S=610 Ack=3071280516 Seq=337023056 Len=0 Win=64240
rcf102.usc.edu -> almaak.usc.edu NFS C ACCESS3 FH=7BFE (read,modify,extend,execute)
almaak.usc.edu -> rcf102.usc.edu TCP D=610 S=2049 Ack=337023196 Seq=3071280516 Len=0 Win=64240
almaak.usc.edu -> rcf102.usc.edu NFS R ACCESS3 Stale NFS file handle
rcf102.usc.edu -> almaak.usc.edu NFS C LOOKUP3 FH=B41B Entries.Log
almaak.usc.edu -> rcf102.usc.edu NFS R LOOKUP3 OK FH=7BFE
rcf102.usc.edu -> almaak.usc.edu TCP D=2049 S=610 Ack=3071280796 Seq=337023348 Len=0 Win=64240
rcf102.usc.edu -> almaak.usc.edu NFS C LOOKUP3 FH=B41B Entries.Log
almaak.usc.edu -> rcf102.usc.edu NFS R LOOKUP3 OK FH=7BFE
rcf102.usc.edu -> almaak.usc.edu TCP D=2049 S=610 Ack=3071281040 Seq=337023500 Len=0 Win=64240
rcf102.usc.edu -> almaak.usc.edu NFS C ACCESS3 FH=7BFE (read,modify,extend,execute)
almaak.usc.edu -> rcf102.usc.edu TCP D=610 S=2049 Ack=337023640 Seq=3071281040 Len=0 Win=64240
almaak.usc.edu -> rcf102.usc.edu NFS R ACCESS3 Stale NFS file handle
rcf102.usc.edu -> almaak.usc.edu NFS C LOOKUP3 FH=B41B Entries.Log
almaak.usc.edu -> rcf102.usc.edu NFS R LOOKUP3 OK FH=7BFE
rcf102.usc.edu -> almaak.usc.edu NFS C LOOKUP3 FH=B41B Entries.Log
almaak.usc.edu -> rcf102.usc.edu NFS R LOOKUP3 OK FH=7BFE
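
The difference described above (one client retries after a stale file
handle, the other hands the error straight to the application) can be
illustrated with a small user-space sketch. It is not taken from either
client's source: retry_on_estale(), nfs_op_fn, struct fake_server and
fake_access() are hypothetical names, and the "server" is only a counter
that fails a fixed number of times before succeeding.

```c
#include <errno.h>

/* Hypothetical operation type: returns 0 on success or a negative errno. */
typedef int (*nfs_op_fn)(void *ctx);

/*
 * Run op(ctx). While it reports -ESTALE, run it again, for at most
 * max_retries additional attempts. Any other result is returned at once.
 * With max_retries == 0 the error is passed straight up to the caller.
 */
int retry_on_estale(nfs_op_fn op, void *ctx, int max_retries)
{
    int ret = op(ctx);

    while (ret == -ESTALE && max_retries-- > 0)
        ret = op(ctx);
    return ret;
}

/* Simulated server, for illustration only: fails stale_left times. */
struct fake_server {
    int stale_left;  /* remaining calls that will report a stale handle */
    int calls;       /* total number of calls seen so far */
};

int fake_access(void *ctx)
{
    struct fake_server *srv = ctx;

    srv->calls++;
    if (srv->stale_left > 0) {
        srv->stale_left--;
        return -ESTALE;
    }
    return 0;
}
```

Calling retry_on_estale(fake_access, &srv, 0) models a client that reports
the first ESTALE immediately; a positive max_retries models a client that
tries again a bounded number of times before giving up.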

--
Garrick Staples, Linux/HPCC Administrator
University of Southern California


2005-01-26 14:42:34

by Ian Kent

Subject: Re: possible client stale filehandle bug?

On Tue, 25 Jan 2005, Garrick Staples wrote:

>
> I'd be very happy to see any patches lying around that might do this
> behaviour. It would get me through the short term until Sun fixes this bug in
> samfs.

Don't hold your breath waiting!

Ian

-------------------------------------------------------
This SF.Net email is sponsored by: IntelliVIEW -- Interactive Reporting
Tool for open source databases. Create drag-&-drop reports. Save time
by over 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc.
Download a FREE copy at http://www.intelliview.com/go/osdn_nl
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-01-26 17:50:09

by Garrick Staples

Subject: Re: possible client stale filehandle bug?

On Wed, Jan 26, 2005 at 10:31:57PM +0800, [email protected] alleged:
> On Tue, 25 Jan 2005, Garrick Staples wrote:
>
> >
> >I'd be very happy to see any patches lying around that might do this
> >behaviour. It would get me through the short term until Sun fixes this
> >bug in samfs.
>
> Don't hold your breath waiting!

Now you are just being cruel!

--
Garrick Staples, Linux/HPCC Administrator
University of Southern California


2005-01-28 00:48:09

by Ian Kent

Subject: Re: possible client stale filehandle bug?

On Wed, 26 Jan 2005, Garrick Staples wrote:

> On Wed, Jan 26, 2005 at 10:31:57PM +0800, [email protected] alleged:
> > On Tue, 25 Jan 2005, Garrick Staples wrote:
> >
> > >
> > >I'd be very happy to see any patches lying around that might do this
> > >behaviour. It would get me through the short term until Sun fixes this
> > >bug in samfs.
> >
> > Don't hold your breath waiting!
>
> Now you are just being cruel!

We don't use SAMfs in the standard configuration either!

Oh joy, oh bliss!

Sun purchasing LSCI was a real bummer for us.

Ian


2005-01-26 06:07:59

by Trond Myklebust

Subject: Re: possible client stale filehandle bug?

On Tue, 25.01.2005 at 09:39 (-0800), Garrick Staples wrote:
> Hi all,
> I have lots of storage in a large Solaris samfs environment that is NFS
> shared to a large number of Solaris and RHEL3 clients. Under some conditions,
> Linux apps have been getting stale filehandles during the normal course of
> their activity. Various file handling syscalls like read() or open() might
> fail. Lots of rename and setattr calls seem to trigger the problem.
> 'ci' and 'cvs commit' are particularly good at this.

ESTALE is usually a sign that someone is deleting a file on the server
that is in use by the client. It is a sign that you are doing something
that violates the caching rules of NFS.

> It seems that the Solaris clients never report any such errors, only the Linux
> clients. However, watching 'snoop' on the Solaris NFS server, I see that it IS
> returning stale file handles to both OSes, but Solaris clients seem to retry
> the request several times; and the Linux clients immediately pass the error up
> to the application.
>
> Is there some condition that the 2.4 kernel is handling incorrectly?

I do not believe that Solaris redrives ESTALE on read, but they may do
it on open(). Linux does not redrive either case. See the many
discussions in the NFS list archives for why.

Cheers,
Trond

--
Trond Myklebust <[email protected]>


2005-01-26 06:35:51

by Garrick Staples

Subject: Re: possible client stale filehandle bug?

On Tue, Jan 25, 2005 at 10:06:27PM -0800, Trond Myklebust alleged:
> On Tue, 25.01.2005 at 09:39 (-0800), Garrick Staples wrote:
> > Hi all,
> > I have lots of storage in a large Solaris samfs environment that is NFS
> > shared to a large number of Solaris and RHEL3 clients. Under some conditions,
> > Linux apps have been getting stale filehandles during the normal course of
> > their activity. Various file handling syscalls like read() or open() might
> > fail. Lots of rename and setattr calls seem to trigger the problem.
> > 'ci' and 'cvs commit' are particularly good at this.
>
> ESTALE is usually a sign that someone is deleting a file on the server
> that is in use by the client. It is a sign that you are doing something
> that violates the caching rules of NFS.

Nothing of the kind is happening here. I've tested this a thousand times over
the last few days trying to find a solution. In this case, Sun's samfs
filesystem is definitely at fault and doing the wrong thing. Backline
engineers at Sun confirm this and are working on a fix.

The reason for _this_ email isn't because of the ESTALEs, it's regarding the
handling of the ESTALEs. Right now I need the Solaris client behaviour to
deal with this particular buggy server.

Incidentally, 2.6.10 never has a problem. Its behaviour never creates ESTALEs in
the first place.

> > It seems that the Solaris clients never report any such errors, only the Linux
> > clients. However, watching 'snoop' on the Solaris NFS server, I see that it IS
> > returning stale file handles to both OSes, but Solaris clients seem to retry
> > the request several times; and the Linux clients immediately pass the error up
> > to the application.
> >
> > Is there some condition that the 2.4 kernel is handling incorrectly?
>
> I do not believe that Solaris redrives ESTALE on read, but they may do
> it on open(). Linux does not redrive either case. See the many
> discussions in the NFS list archives for why.

Did you look at the 'snoop' bits in the previous email? During that time, the
process on the Solaris client is hanging in a write() call.

I'd be very happy to see any patches lying around that might do this
behaviour. It would get me through the short term until Sun fixes this bug in
samfs.

--
Garrick Staples, Linux/HPCC Administrator
University of Southern California


2005-02-16 21:23:16

by Neil Horman

Subject: Re: possible client stale filehandle bug?

Lever, Charles wrote:
> hi neil-
>
>
>>>>It seems that the Solaris clients never report any such errors, only the Linux
>>>>clients. However, watching 'snoop' on the Solaris NFS server, I see that it IS
>>>>returning stale file handles to both OSes, but Solaris clients seem to retry
>>>>the request several times; and the Linux clients immediately pass the error up
>>>>to the application.
>>>>
>>>>Is there some condition that the 2.4 kernel is handling incorrectly?
>>>
>>>I do not believe that Solaris redrives ESTALE on read, but they may do
>>>it on open(). Linux does not redrive either case. See the many
>>>discussions in the NFS list archives for why.
>>
>>Solaris does in fact retry operations on ESTALE errors, definitely on
>>open, and I think on read/readdir/stat/etc. as well. We had some
>>discussion about that here recently.
>
>
> as far as i know Solaris doesn't redrive on read or write, but only
> during pathname resolution. redriving a read or write will only work in
> the case where the server has taken the export offline temporarily; if
> the file handle really is bad, then redriving a read or write is
> probably safe, but won't accomplish anything.
>
> i have a patch that adds support for pathname resolution retry to 2.6
> (now in Trond's NFS_ALL for 2.6.11-rc4) and a pair of patches that
> implement this for RHEL 3.0 that i've sent to steve and al viro for
> review.

I agree, it probably doesn't re-drive on any operation that doesn't walk
a path, which is in line with what RHEL is doing currently. I didn't
mean to imply that Solaris retried ESTALE in all occurrences of the
event. Anywho, can you point me to your patches? I'd be interested to
know how you managed to implement retry on ESTALE without leaking into
the VFS, which I think you will recall was the big sticking point that
we were debating here.

Thanks! :)
Neil

--
/***************************************************
*Neil Horman
*Software Engineer
*Red Hat, Inc.
*[email protected]
*gpg keyid: 1024D / 0x92A74FA1
*http://pgp.mit.edu
***************************************************/


2005-02-16 21:47:43

by Neil Horman

Subject: Re: possible client stale filehandle bug?

Lever, Charles wrote:
>>I agree, it probably doesn't re-drive on any operation that doesn't walk
>>a path, which is in line with what RHEL is doing currently. I didn't
>>mean to imply that Solaris retried ESTALE in all occurrences of the
>>event. Anywho, can you point me to your patches? I'd be interested to
>>know how you managed to implement retry on ESTALE without leaking into
>>the VFS, which I think you will recall was the big sticking point that
>>we were debating here.
>
>
> the patches do touch fs/namei.c (it was al viro's suggestion) with a
> pretty simple change. and i think they are KABI friendly enough to be
> included in RHEL 3, once we are satisfied that the solution is
> effective.
>
> the cto-lookup-revalidate patch adds just enough of the 2.6
> lookup-intent logic to the 2.4 VFS layer to allow us to support NFS
> close-to-open in nfs_lookup_revalidate instead of in nfs_open. this
> resolves one of the most common ESTALE failure modes, where just the
> object at the end of the pathname has been replaced.
>
> the second patch applies on top of this. it adds logic to redrive
> pathname resolution if an ESTALE is encountered anywhere during a
> pathname lookup. it redrives it once from the top, asserting a flag
> that causes the VFS layer to abandon the dcache and use only real
> lookups for this resolution request. if the redriven resolution fails,
> we give up. this resolves the other typical ESTALE failure mode, where
> some or all of the path has been replaced, while avoiding retrying an
> unbounded number of times.

Fantastic, thanks!
Neil

--
/***************************************************
*Neil Horman
*Software Engineer
*Red Hat, Inc.
*[email protected]
*gpg keyid: 1024D / 0x92A74FA1
*http://pgp.mit.edu
***************************************************/


2005-02-24 19:34:22

by Trond Myklebust

Subject: Re: possible client stale filehandle bug?

On Wed, 16.02.2005 at 16:23 (-0500), Neil Horman wrote:
> I agree, it probably doesn't re-drive on any operation that doesn't walk
> a path, which is in line with what RHEL is doing currently. I didn't
> mean to imply that Solaris retried ESTALE in all occurrences of the
> event. Anywho, can you point me to your patches? I'd be interested to
> know how you managed to implement retry on ESTALE without leaking into
> the VFS, which I think you will recall was the big sticking point that
> we were debating here.

It does leak into the VFS, but in a way that is non-intrusive. The VFS
basically sets a new flag, LOOKUP_REVAL, in the struct nameidata flags
(which are passed down to d_op->d_revalidate() on Linux 2.4.x).
The NFS code takes that as a command to expire the caches and force a
revalidation.

The exact same trick is performed on Linux-2.6.x: see the patch
"linux-2.6.11-31-stale_retry.dif" in
http://client.linux-nfs.org/Linux-2.6.x/2.6.11-rc5/
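
As a rough user-space illustration of the flag-driven revalidation
described above (not the kernel code, and not the contents of the patch),
the idea might be sketched like this. SKETCH_LOOKUP_REVAL, struct
sketch_dentry and sketch_revalidate() are hypothetical stand-ins; the flag
value is arbitrary and the "server" is a boolean field.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the lookup flag; the value is arbitrary. */
#define SKETCH_LOOKUP_REVAL 0x20u

struct sketch_dentry {
    bool cache_fresh;     /* client believes its cached data is still valid */
    int  server_checks;   /* number of forced round trips to the "server" */
    bool server_says_ok;  /* result a real revalidation would produce */
};

/*
 * Returns 1 if the cached entry may be used, 0 if it must be dropped and
 * looked up again. When the caller passes SKETCH_LOOKUP_REVAL, the cached
 * state is expired first, so the answer always comes from the server.
 */
int sketch_revalidate(struct sketch_dentry *d, unsigned int flags)
{
    if (flags & SKETCH_LOOKUP_REVAL)
        d->cache_fresh = false;      /* expire the cache on request */

    if (d->cache_fresh)
        return 1;                    /* trust the cache, no round trip */

    d->server_checks++;              /* forced revalidation */
    d->cache_fresh = d->server_says_ok;
    return d->server_says_ok ? 1 : 0;
}
```

Without the flag a fresh-looking cache entry is trusted as-is; with the
flag set the same entry is rechecked, which is what lets a redriven lookup
see the server's current state.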

Cheers,
Trond

--
Trond Myklebust <[email protected]>


2005-01-26 13:11:53

by Neil Horman

Subject: Re: possible client stale filehandle bug?

diff -urNp linux-6090/fs/nfs/inode.c linux-6091/fs/nfs/inode.c
--- linux-6090/fs/nfs/inode.c
+++ linux-6091/fs/nfs/inode.c
@@ -1022,7 +1022,15 @@ int
nfs_revalidate(struct dentry *dentry)
{
struct inode *inode = dentry->d_inode;
- return nfs_revalidate_inode(NFS_SERVER(inode), inode);
+ int error;
+
+ error = nfs_revalidate_inode(NFS_SERVER(inode), inode);
+ if (error == -ESTALE) {
+ nfs_zap_caches(dentry->d_parent->d_inode);
+ d_drop(dentry);
+ }
+
+ return error;
}

/*
diff -urNp linux-6090/fs/nfs/xattr.c linux-6091/fs/nfs/xattr.c
--- linux-6090/fs/nfs/xattr.c
+++ linux-6091/fs/nfs/xattr.c
@@ -53,8 +53,13 @@ nfs_getxattr(struct dentry *dentry, cons
acl = ERR_PTR(-EOPNOTSUPP);
if (NFS_PROTO(inode)->version == 3 && NFS_PROTO(inode)->getacl)
acl = NFS_PROTO(inode)->getacl(inode, type);
- if (IS_ERR(acl))
+ if (IS_ERR(acl)) {
+ if (PTR_ERR(acl) == -ESTALE) {
+ nfs_zap_caches(dentry->d_parent->d_inode);
+ d_drop(dentry);
+ }
return PTR_ERR(acl);
+ }
else if (acl) {
if (type == ACL_TYPE_ACCESS && acl->a_count == 0)
error = -ENODATA;
diff -urNp linux-6090/fs/open.c linux-6091/fs/open.c
--- linux-6090/fs/open.c
+++ linux-6091/fs/open.c
@@ -807,6 +807,30 @@ asmlinkage long sys_open(const char * fi
if (fd >= 0) {
struct file *f = filp_open(tmp, flags, mode);
error = PTR_ERR(f);
+
+ /*
+ * ESTALE errors can be a pain. On some
+ * filesystems (e.g. NFS), ESTALE can often
+ * be resolved by retry, as the ESTALE resulted
+ * in a cache invalidation. We perform this
+ * retry here, once for every directory element
+ * in the path to avoid the case where the removal
+ * of the nth parent directory of the file we're
+ * trying to open results in n ESTALE errors.
+ */
+ if (error == -ESTALE) {
+ int nretries = 1;
+ char *cp;
+
+ for (cp = tmp; *cp; cp++) {
+ if (*cp == '/')
+ nretries++;
+ }
+ do {
+ f = filp_open(tmp, flags, mode);
+ error = PTR_ERR(f);
+ } while (error == -ESTALE && --nretries > 0);
+ }
if (IS_ERR(f))
goto out_error;
fd_install(fd, f);
diff -urNp linux-6090/fs/stat.c linux-6091/fs/stat.c
--- linux-6090/fs/stat.c
+++ linux-6091/fs/stat.c
@@ -143,8 +143,9 @@ static int cp_new_stat(struct inode * in
asmlinkage long sys_stat(char * filename, struct __old_kernel_stat * statbuf)
{
struct nameidata nd;
- int error;
+ int error, errcnt = 0;

+again:
error = user_path_walk(filename, &nd);
if (!error) {
error = do_revalidate(nd.dentry);
@@ -152,6 +153,10 @@ asmlinkage long sys_stat(char * filename
error = cp_old_stat(nd.dentry->d_inode, statbuf);
path_release(&nd);
}
+ if (error == -ESTALE && !errcnt) {
+ errcnt++;
+ goto again;
+ }
return error;
}
#endif
@@ -159,8 +164,9 @@ asmlinkage long sys_stat(char * filename
asmlinkage long sys_newstat(char * filename, struct stat * statbuf)
{
struct nameidata nd;
- int error;
+ int error, errcnt = 0;

+again:
error = user_path_walk(filename, &nd);
if (!error) {
error = do_revalidate(nd.dentry);
@@ -168,6 +174,11 @@ asmlinkage long sys_newstat(char * filen
error = cp_new_stat(nd.dentry->d_inode, statbuf);
path_release(&nd);
}
+ if (error == -ESTALE && !errcnt) {
+ errcnt++;
+ goto again;
+ }
+
return error;
}

@@ -180,8 +191,9 @@ asmlinkage long sys_newstat(char * filen
asmlinkage long sys_lstat(char * filename, struct __old_kernel_stat * statbuf)
{
struct nameidata nd;
- int error;
+ int error, errcnt = 0;

+again:
error = user_path_walk_link(filename, &nd);
if (!error) {
error = do_revalidate(nd.dentry);
@@ -189,6 +201,11 @@ asmlinkage long sys_lstat(char * filenam
error = cp_old_stat(nd.dentry->d_inode, statbuf);
path_release(&nd);
}
+ if (error == -ESTALE && !errcnt) {
+ errcnt++;
+ goto again;
+ }
+
return error;
}

@@ -197,8 +214,9 @@ asmlinkage long sys_lstat(char * filenam
asmlinkage long sys_newlstat(char * filename, struct stat * statbuf)
{
struct nameidata nd;
- int error;
+ int error, errcnt = 0;

+again:
error = user_path_walk_link(filename, &nd);
if (!error) {
error = do_revalidate(nd.dentry);
@@ -206,6 +224,12 @@ asmlinkage long sys_newlstat(char * file
error = cp_new_stat(nd.dentry->d_inode, statbuf);
path_release(&nd);
}
+
+ if (error == -ESTALE && !errcnt) {
+ errcnt++;
+ goto again;
+ }
+
return error;
}

@@ -340,8 +364,9 @@ static long cp_new_stat64(struct inode *
asmlinkage long sys_stat64(char * filename, struct stat64 * statbuf, long flags)
{
struct nameidata nd;
- int error;
+ int error, errcnt = 0;

+again:
error = user_path_walk(filename, &nd);
if (!error) {
error = do_revalidate(nd.dentry);
@@ -349,14 +374,20 @@ asmlinkage long sys_stat64(char * filena
error = cp_new_stat64(nd.dentry->d_inode, statbuf);
path_release(&nd);
}
+ if (error == -ESTALE && !errcnt) {
+ errcnt++;
+ goto again;
+ }
+
return error;
}

asmlinkage long sys_lstat64(char * filename, struct stat64 * statbuf, long flags)
{
struct nameidata nd;
- int error;
+ int error, errcnt = 0;

+again:
error = user_path_walk_link(filename, &nd);
if (!error) {
error = do_revalidate(nd.dentry);
@@ -364,6 +395,11 @@ asmlinkage long sys_lstat64(char * filen
error = cp_new_stat64(nd.dentry->d_inode, statbuf);
path_release(&nd);
}
+ if (error == -ESTALE && !errcnt) {
+ errcnt++;
+ goto again;
+ }
+
return error;
}
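
The fs/open.c hunk above sizes its retry loop by counting '/' characters
in the pathname. A stand-alone sketch of just that calculation, outside
the kernel, could look like the following; estale_retry_budget() is a
hypothetical name that does not appear in the patch.

```c
#include <stddef.h>

/*
 * Compute the retry budget the same way the open() hunk does: start at
 * one, then add one for every '/' in the path. The patch then repeats
 * the open at most this many times while the result is still -ESTALE.
 */
int estale_retry_budget(const char *path)
{
    int nretries = 1;
    const char *cp;

    for (cp = path; *cp; cp++) {
        if (*cp == '/')
            nretries++;
    }
    return nretries;
}
```

Note that the count is taken over the raw string, so doubled or trailing
slashes raise the budget even though they add no directory component.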

Attachments:

linux-2.4.21-nfs-estale.patch (5.28 kB)

2005-01-26 13:07:23

by Neil Horman

Subject: Re: possible client stale filehandle bug?

Trond Myklebust wrote:
> On Tue, 25.01.2005 at 09:39 (-0800), Garrick Staples wrote:
>
>>Hi all,
>> I have lots of storage in a large Solaris samfs environment that is NFS
>>shared to a large number of Solaris and RHEL3 clients. Under some conditions,
>>Linux apps have been getting stale filehandles during the normal course of
>>their activity. Various file handling syscalls like read() or open() might
>>fail. Lots of rename and setattr calls seem to trigger the problem.
>>'ci' and 'cvs commit' are particularly good at this.
>
>
> ESTALE is usually a sign that someone is deleting a file on the server
> that is in use by the client. It is a sign that you are doing something
> that violates the caching rules of NFS.
>
>
>>It seems that the Solaris clients never report any such errors, only the Linux
>>clients. However, watching 'snoop' on the Solaris NFS server, I see that it IS
>>returning stale file handles to both OSes, but Solaris clients seem to retry
>>the request several times; and the Linux clients immediately pass the error up
>>to the application.
>>
>>Is there some condition that the 2.4 kernel is handling incorrectly?
>
>
> I do not believe that Solaris redrives ESTALE on read, but they may do
> it on open(). Linux does not redrive either case. See the many
> discussions in the NFS list archives for why.
>

Solaris does in fact retry operations on ESTALE errors, definitely on
open, and I think on read/readdir/stat/etc. as well. We had some
discussion about that here recently.

--
/***************************************************
*Neil Horman
*Software Engineer
*Red Hat, Inc.
*[email protected]
*gpg keyid: 1024D / 0x92A74FA1
*http://pgp.mit.edu
***************************************************/


2005-02-16 21:18:47

by Lever, Charles

Subject: RE: possible client stale filehandle bug?

hi neil-

> >>It seems that the Solaris clients never report any such errors, only the Linux
> >>clients. However, watching 'snoop' on the Solaris NFS server, I see that it IS
> >>returning stale file handles to both OSes, but Solaris clients seem to retry
> >>the request several times; and the Linux clients immediately pass the error up
> >>to the application.
> >>
> >>Is there some condition that the 2.4 kernel is handling incorrectly?
> >
> > I do not believe that Solaris redrives ESTALE on read, but they may do
> > it on open(). Linux does not redrive either case. See the many
> > discussions in the NFS list archives for why.
>
> Solaris does in fact retry operations on ESTALE errors, definitely on
> open, and I think on read/readdir/stat/etc. as well. We had some
> discussion about that here recently.

as far as i know Solaris doesn't redrive on read or write, but only
during pathname resolution. redriving a read or write will only work in
the case where the server has taken the export offline temporarily; if
the file handle really is bad, then redriving a read or write is
probably safe, but won't accomplish anything.

i have a patch that adds support for pathname resolution retry to 2.6
(now in Trond's NFS_ALL for 2.6.11-rc4) and a pair of patches that
implement this for RHEL 3.0 that i've sent to steve and al viro for
review.


2005-02-16 21:42:44

by Lever, Charles

Subject: RE: possible client stale filehandle bug?

> I agree, it probably doesn't re-drive on any operation that doesn't walk
> a path, which is in line with what RHEL is doing currently. I didn't
> mean to imply that Solaris retried ESTALE in all occurrences of the
> event. Anywho, can you point me to your patches? I'd be interested to
> know how you managed to implement retry on ESTALE without leaking into
> the VFS, which I think you will recall was the big sticking point that
> we were debating here.

the patches do touch fs/namei.c (it was al viro's suggestion) with a
pretty simple change. and i think they are KABI friendly enough to be
included in RHEL 3, once we are satisfied that the solution is
effective.

the cto-lookup-revalidate patch adds just enough of the 2.6
lookup-intent logic to the 2.4 VFS layer to allow us to support NFS
close-to-open in nfs_lookup_revalidate instead of in nfs_open. this
resolves one of the most common ESTALE failure modes, where just the
object at the end of the pathname has been replaced.

the second patch applies on top of this. it adds logic to redrive
pathname resolution if an ESTALE is encountered anywhere during a
pathname lookup. it redrives it once from the top, asserting a flag
that causes the VFS layer to abandon the dcache and use only real
lookups for this resolution request. if the redriven resolution fails,
we give up. this resolves the other typical ESTALE failure mode, where
some or all of the path has been replaced, while avoiding retrying an
unbounded number of times.
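
the redrive-once behaviour of the second patch can be sketched in user
space roughly as follows. this is only an illustration of the control
flow, not the patch itself: SKETCH_BYPASS_CACHE, struct sketch_fs,
sketch_path_walk() and sketch_resolve() are hypothetical names, and the
dcache and server are reduced to a few boolean fields.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag telling the walk to ignore cached entries. */
#define SKETCH_BYPASS_CACHE 0x1u

struct sketch_fs {
    bool cache_stale;    /* cached handle no longer matches the server */
    bool path_exists;    /* a fresh lookup on the server would succeed */
    bool server_broken;  /* server reports ESTALE even for fresh lookups */
    int  walks;          /* number of resolution attempts made */
};

/* One pathname walk: uses the cache unless told to bypass it. */
int sketch_path_walk(struct sketch_fs *fs, unsigned int flags)
{
    fs->walks++;
    if (fs->server_broken)
        return -ESTALE;               /* nothing the client can do */
    if (!(flags & SKETCH_BYPASS_CACHE) && fs->cache_stale)
        return -ESTALE;               /* cached handle rejected */
    if (!fs->path_exists)
        return -ENOENT;               /* real lookup: object is gone */
    fs->cache_stale = false;          /* successful walk leaves a good cache */
    return 0;
}

/*
 * Walk once normally. On ESTALE, redrive exactly once from the top with
 * the cache bypassed. Whatever the second walk returns is final, so the
 * number of attempts is bounded at two.
 */
int sketch_resolve(struct sketch_fs *fs)
{
    int err = sketch_path_walk(fs, 0);

    if (err == -ESTALE)
        err = sketch_path_walk(fs, SKETCH_BYPASS_CACHE);
    return err;
}
```

in this sketch a stale cache entry over a path that still exists resolves
on the second walk, while a server that keeps answering ESTALE still
produces an error after two attempts rather than looping.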

Attachments:

linux-2.4.21-nfs-cto-lookup-revalidate.patch (4.62 kB)
linux-2.4.21-pathname-retry.patch (5.80 kB)

2005-02-24 20:44:03

by Neil Horman

Subject: Re: possible client stale filehandle bug?

On Thu, Feb 24, 2005 at 11:33:43AM -0800, Trond Myklebust wrote:
> On Wed, 16.02.2005 at 16:23 (-0500), Neil Horman wrote:
> > I agree, it probably doesn't re-drive on any operation that doesn't walk
> > a path, which is in line with what RHEL is doing currently. I didn't
> > mean to imply that Solaris retried ESTALE in all occurrences of the
> > event. Anywho, can you point me to your patches? I'd be interested to
> > know how you managed to implement retry on ESTALE without leaking into
> > the VFS, which I think you will recall was the big sticking point that
> > we were debating here.
>
> It does leak into the VFS, but in a way that is non-intrusive. The VFS
> basically sets a new flag, LOOKUP_REVAL, in the struct nameidata flags
> (which are passed down to d_op->d_revalidate() on Linux 2.4.x).
> The NFS code takes that as a command to expire the caches and force a
> revalidation.
>
> The exact same trick is performed on Linux-2.6.x: see the patch
> "linux-2.6.11-31-stale_retry.dif" in
> http://client.linux-nfs.org/Linux-2.6.x/2.6.11-rc5/
>
> Cheers,
> Trond
>
> --
> Trond Myklebust <[email protected]>
>

Thanks for the pointer!
Neil

--
/***************************************************
*Neil Horman
*Software Engineer
*Red Hat, Inc.
*[email protected]
*gpg keyid: 1024D / 0x92A74FA1
*http://pgp.mit.edu
***************************************************/
