I've stumbled into a problem running 2.6.22.1 on both my NFS client and
my NFS server. I've just upgraded from 2.4.31, so I have no idea whether
this is a new problem or if it is known in the 2.6.x series.
Here's a high-level description of the context:
* The NFS server has a directory which is full.
* That directory is mounted on the NFS client.
* The NFS client tries to "mv local-file /nfs/remote-dir/"
* local-file is big (typically 700 MiB).
What happens is:
* The "mv" takes a long long time and eventually fails, of course.
* The load on the NFS server (initially at 0) increases to about 8.
* Any access to the NFS-mounted dir from the client whilst "mv" is in
progress stalls (e.g. ls -l /nfs/remote-dir).
I've tried to write my own "mv" in C to see which syscalls were involved.
What happens is:
* All the write() calls succeed with no error.
* The final close() returns -1 with either EINTR or ENOSPC.
I could not determine what makes close return EINTR or ENOSPC.
Problem is, under 2.4.31, the write() was immediately failing when writing
to a full NFS partition.
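For reference, here is a minimal sketch of the kind of copy loop described
above. It is not the actual test program: the names write_all() and
copy_file() are made up for illustration, and the added fsync() call is an
assumption about how to make a deferred error such as ENOSPC show up before
close() rather than only at close().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes and EINTR.
 * Returns 0 on success, -1 with errno set on failure. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy src to dst and name the call that first reported an error.
 * Returns 0 on success, or the errno value of the first failure. */
static int copy_file(const char *src, const char *dst)
{
    char buf[65536];
    ssize_t n;
    int err = 0;

    assert(src != NULL && dst != NULL);

    int in = open(src, O_RDONLY);
    if (in < 0)
        return errno;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        err = errno;
        close(in);
        return err;
    }

    while ((n = read(in, buf, sizeof buf)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;
            err = errno;
            fprintf(stderr, "read: %s\n", strerror(err));
            break;
        }
        if (write_all(out, buf, (size_t)n) < 0) {
            err = errno;
            fprintf(stderr, "write: %s\n", strerror(err));
            break;
        }
    }

    /* Ask for cached data to be flushed, so that a write error which
     * was deferred by client-side caching can be reported here. */
    if (err == 0 && fsync(out) < 0) {
        err = errno;
        fprintf(stderr, "fsync: %s\n", strerror(err));
    }

    /* close() can still fail, so its return value is checked too. */
    if (close(out) < 0 && err == 0) {
        err = errno;
        fprintf(stderr, "close: %s\n", strerror(err));
    }

    close(in);
    return err;
}
```

Calling copy_file("local-file", "/nfs/remote-dir/local-file") and looking at
which prefix is printed on stderr shows whether the error came from write(),
fsync() or close().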
This looks like an important bug, but I don't know if it is in the NFS-client
or NFS-server side. I'm tempted to say NFS-server, but that's more a hunch.
Raphael
On Wed, 2007-08-01 at 17:00 +0200, Raphael Manfredi wrote:
> I've stumbled into a problem running 2.6.22.1 on both my NFS client and
> my NFS server. I've just upgraded from 2.4.31, so I have no idea whether
> this is a new problem or if it is known in the 2.6.x series.
>
> Here's a high-level description of the context:
>
> * The NFS server has a directory which is full.
> * That directory is mounted on the NFS client.
> * The NFS client tries to "mv local-file /nfs/remote-dir/"
> * local-file is big (typically 700 MiB).
>
> What happens is:
>
> * The "mv" takes a long long time and eventually fails, of course.
> * The load on the NFS server (initially at 0) increases to about 8.
> * Any access to the NFS-mounted dir from the client whilst "mv" is in
> progress stalls (e.g. ls -l /nfs/remote-dir).
>
> I've tried to write my own "mv" in C to see which syscalls were involved.
> What happens is:
>
> * All the write() calls succeed with no error.
> * The final close() returns -1 with either EINTR or ENOSPC.
>
> I could not determine what makes close return EINTR or ENOSPC.
>
> Problem is, under 2.4.31, the write() was immediately failing when writing
> to a full NFS partition.
>
> This looks like an important bug, but I don't know if it is in the NFS-client
> or NFS-server side. I'm tempted to say NFS-server, but that's more a hunch.
The answer appears to be that some filesystems really _suck_ when they
have to return errors: they take forever to return to the user. When the
client then sends many WRITE requests (it can cache huge numbers of
them), the cumulative effect of the delays is quite noticeable, as you
can see above.
I've got a tentative client-side patch to deal with this sort of server.
Basically, when the client sees a cached write return an error, it
stops caching and starts doing O_SYNC-style writes until the error
condition clears. That won't fix the server-side problem, but it does
ensure that the application gets notified of the error as soon as
possible.
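To illustrate what "O_SYNC-style" means for an application, here is a
user-space sketch. It is not the client-side patch, which lives in the
kernel. It only shows the behaviour an application can already ask for
itself by opening the destination with O_SYNC, so that each write() waits
for the data to reach storage and can report an error directly. The
function name write_file_sync() is made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write len bytes of buf to path using synchronous writes.
 * Returns 0 on success, or the errno value of the first failing call. */
static int write_file_sync(const char *path, const char *buf, size_t len)
{
    int err = 0;

    assert(path != NULL && buf != NULL);

    /* O_SYNC: write() returns only once the data has been committed,
     * so errors are not deferred until fsync() or close(). */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0)
        return errno;

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            err = errno;   /* first error, reported to the caller */
            break;
        }
        buf += n;
        len -= (size_t)n;
    }

    if (close(fd) < 0 && err == 0)
        err = errno;
    return err;
}
```

The trade-off is throughput: every write() now waits for a round trip to the
server, which is why doing this only after an error has been seen is
attractive.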
Cheers
Trond