LinuxLists.cc - [PATCH 0/3] enhanced ESTALE error handling

Attachments:

syscallgen.c (14.83 kB)
exec_test.c (42.00 B)

2008-03-10 20:23:25

by Peter Staubach

Subject: [PATCH 0/3] enhanced ESTALE error handling (v3)

Hi.

Here is version 3 of a patch set which modifies the system to
enhance the ESTALE error handling for system calls which take
pathnames as arguments. This patch set is essentially the
same as the v2 patches, but updated to reflect the current
state of the code around them.

The error, ESTALE, was originally introduced to handle the
situation where a file handle, which NFS uses to uniquely
identify a file on the server, no longer refers to a valid file
on the server. This can happen when the file is removed on the
server, either by an application on the server, some other
client accessing the server, or sometimes even by another
mounted file system from the same client. The NFS server also
returns this error when the file resides upon a file system
which is no longer exported. Additionally, some NFS servers
even change the file handle when a file is renamed, although
this practice is discouraged.

This error occurs even if a file or directory, with the same
name, is recreated on the server without the client being
aware of it. The file handle refers to a specific instance
of a file and deleting the file and then recreating it creates
a new instance of the file.

The error, ESTALE, is usually seen when cached directory
information is used to convert a pathname to a dentry/inode pair.
The information is discovered to be out of date or stale when a
subsequent operation is sent to the NFS server. This can easily
happen in system calls such as stat(2) when the pathname is
converted to a dentry/inode pair using cached information, but then
a subsequent GETATTR call to the server discovers that the file
handle is no longer valid.

This error can also occur when a change is made on the server
between the lookups of different components of the pathname,
or between a successful lookup and a subsequent operation.

System calls which take pathnames as arguments should never see
ESTALE errors from situations like this. These system calls
should either fail with an ENOENT error if the pathname cannot
be successfully translated to a dentry/inode pair or succeed
or fail based on their own semantics. In the above example,
stat(2), restarting at the pathname lookup will either cause the
system call to succeed or fail, depending upon whether the
file really exists or not.
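
As a rough illustration of the semantics described above, here is a
minimal user-space sketch. It is not part of the patch set, which does
this work inside the kernel. The helper name stat_retry_estale() and
the retry cap are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the patches): call stat(2) on a
 * pathname and, if it fails with ESTALE, redo the whole call so that
 * the pathname is looked up again. Any other error, such as ENOENT,
 * is reported to the caller unchanged.
 */
static int stat_retry_estale(const char *path, struct stat *st,
                             int max_retries)
{
    assert(path != NULL && st != NULL);

    for (int attempt = 0; ; attempt++) {
        if (stat(path, st) == 0)
            return 0;           /* succeeded on its own merits */
        if (errno != ESTALE)
            return -1;          /* a real failure, e.g. ENOENT */
        if (attempt >= max_retries)
            return -1;          /* give up; errno is still ESTALE */
    }
}
```

The cap on retries is a simplification which keeps the loop bounded.
With the patches applied, an application should not need a wrapper
like this for system calls which take pathnames.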

ESTALE errors which occur during the lookup process can be
handled by dropping the dentry which refers to the non-existent
file from the dcache and then restarting the lookup process.
Care is taken to ensure that forward progress is always being
made in order to avoid infinite loops.

ESTALE errors which occur during operations subsequent to the
lookup process can be handled by unwinding appropriately and
then performing the lookup process again. Eventually, either
the lookup process will succeed or fail correctly or the
subsequent operation will succeed or fail on its own merits.
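
The control flow described in the two paragraphs above can be modeled
roughly as follows. This is only a sketch under stated assumptions: the
function names, the callback types, the fake server, and the restart
cap are all hypothetical and do not come from the patches or from the
kernel's lookup code.

```c
#include <errno.h>

/* Hypothetical callbacks; they return 0 or a negative errno value. */
typedef int (*lookup_fn)(const char *path, int *handle, void *ctx);
typedef int (*op_fn)(int handle, void *ctx);

/*
 * Look up the path, then run the operation on the result. If either
 * step reports ESTALE, unwind and start again from the lookup. The
 * restart cap is an assumption which keeps the loop bounded.
 */
static int lookup_and_operate(const char *path, lookup_fn lookup,
                              op_fn op, void *ctx, int max_restarts)
{
    int restarts = 0;

    for (;;) {
        int handle;
        int err = lookup(path, &handle, ctx);

        if (err == 0)
            err = op(handle, ctx);
        if (err != -ESTALE)
            return err;         /* success, or a non-ESTALE failure */
        if (restarts++ >= max_restarts)
            return -ESTALE;     /* stop rather than loop forever */
    }
}

/* A fake server, used only to exercise the loop above. */
struct fake_server {
    int stale_ops_remaining;    /* operations which still report ESTALE */
    int lookups;                /* number of lookups performed so far */
};

static int fake_lookup(const char *path, int *handle, void *ctx)
{
    struct fake_server *srv = ctx;

    (void)path;
    srv->lookups++;
    *handle = srv->lookups;     /* pretend each lookup gives a new handle */
    return 0;
}

static int fake_getattr(int handle, void *ctx)
{
    struct fake_server *srv = ctx;

    (void)handle;
    if (srv->stale_ops_remaining > 0) {
        srv->stale_ops_remaining--;
        return -ESTALE;
    }
    return 0;
}
```

With a fake server which reports ESTALE twice, the third lookup
followed by the operation succeeds, and the caller never sees the
intermediate ESTALE errors.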

This support is desired in order to tighten up recovery from
discovering stale resources due to the loose cache consistency
semantics that file systems such as NFS employ. In particular,
there are several large Red Hat customers, converting from
Solaris to Linux, who desire this support in order that their
application environments continue to work.

The loose consistency model of file systems such as NFS is
exacerbated by the large granularity of timestamps available
for files on file systems such as ext3. The NFS client may not
be able to detect changes in directories due to multiple
changes occurring in the same second, for example.
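
To make the granularity problem concrete, here is a simplified sketch.
It assumes a cache which is validated only by comparing the modification
time of a directory in whole seconds. The function names are
hypothetical, and a real NFS client considers more attributes than this.

```c
#include <stdbool.h>
#include <time.h>

/* Truncate a timestamp to whole seconds, as a one-second-granularity
 * file system would store it. */
static time_t to_seconds(struct timespec ts)
{
    return ts.tv_sec;
}

/*
 * Hypothetical cache check: the cached directory contents are treated
 * as out of date only if the mtime seen on the server differs from the
 * mtime recorded when the cache was filled.
 */
static bool dir_changed(time_t cached_mtime, time_t server_mtime)
{
    return cached_mtime != server_mtime;
}
```

Two changes made at 100.1s and 100.9s both truncate to second 100, so
a client which cached the directory after the first change sees no
difference after the second one.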

Please note that system calls which do not take pathnames as
arguments or perhaps use file descriptors to identify the
file to be manipulated may still fail with ESTALE errors.
There is no recovery possible with these system calls like
there is with system calls which take pathnames as arguments.

This support was tested using the attached programs and
running multiple copies on mounted file systems which do not
share superblocks. When two or more copies of this program
are running, many ESTALE errors can be seen over the network.
Without these patches, the test program errors out almost
immediately. With these patches, the test program runs
for as long as one desires.

Comments?

Thanx...

ps

2008-03-10 22:42:32

by Andreas Dilger

Subject: Re: [PATCH 0/3] enhanced ESTALE error handling (v3)

On Mar 10, 2008 16:23 -0400, Peter Staubach wrote:
> Here is version 3 of a patch set which modifies the system to
> enhance the ESTALE error handling for system calls which take
> pathnames as arguments. This patch set is essentially the
> same as the v2 patches, but updated to reflect the current
> state of the code around them.

[snip long discussion of ESTALE causes]

> This support was tested using the attached programs and
> running multiple copies on mounted file systems which do not
> share superblocks. When two or more copies of this program
> are running, many ESTALE errors can be seen over the network.
> Without these patches, the test program errors out almost
> immediately. With these patches, the test program runs
> for as long one desires.

Have you tried "racer.sh"? That is a very stressful metadata tester
that does random operations on a handful of file and directory names.
It can be run on a single client, or on multiple clients and needs no
coordination between the clients. I guess it won't tell you if you
are getting ESTALE back correctly or not, but it can quickly find if
there are any problems with the retrying code.

I've attached an updated tarball of the original scripts here.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

Attachments:

(No filename) (1.30 kB)
racer-lustre.tar.gz (1.96 kB)