2008-05-27 13:32:23

by Talpey, Thomas

Subject: Re: Nfs filesystem corruption(?) after kmail crash

At 08:15 AM 5/27/2008, Alexander Borghgraef wrote:
>It varies. I've had occurrences where it lasted for 15mins, but recent
>ones have been too short to register.

When you say "lasted", do you mean the file with the problem starts to
work (i.e. shows attributes), or that it basically vanishes? I am thinking
that perhaps the client thinks the file exists, but the server disagrees.
If you have multiple mail servers and there's an application synchronization
issue, this could be the problem.

Also, are the clocks synchronized between your clients and the server?
Clock skew can make this kind of problem worse.
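
A quick way to put a number on the skew is to compare epoch timestamps from
the two machines. This is only a rough sketch: the helper name clock_skew and
the host name nfs-server in the usage comment are placeholders of mine, not
anything from your setup, and it assumes a date(1) that supports +%s.

```shell
# Print the absolute difference, in seconds, between two epoch timestamps.
# clock_skew is a hypothetical helper name used for illustration only.
clock_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# Example use (nfs-server is a placeholder host name):
#   clock_skew "$(date +%s)" "$(ssh nfs-server date +%s)"
```

The ssh round trip adds a little delay of its own, so treat a result of a
second or two as noise rather than real skew.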

>> If so, it might be interesting to run:
>> strace stat cur
>> ...and see what error it's returning.
>Ok, I'll do that when I get an error which lasts long enough.

And if it shows a hard error, please also turn on a few NFS client
debugging flags (as root), repeat the stat, and capture the kernel log:

rpcdebug -m nfs -s dircache lookupcache
stat cur
dmesg >/tmp/send-this-log
rpcdebug -m nfs -c dircache lookupcache
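
Since the recent failures have been too short to catch by hand, a small
polling loop might help. The following is only a sketch: the function name
catch_stat_failure, the default of 600 attempts and the one-second delay are
my own assumptions. It polls the path with stat(1) and, on the first failure,
appends a timestamp and stat's error message to a log file.

```shell
# Poll a path with stat(1) until it fails, then save the error message.
# Arguments: path, log file, max attempts (default 600), delay in seconds
# (default 1). Returns 0 if a failure was captured, 1 if stat never failed.
# catch_stat_failure is a hypothetical helper name used for illustration.
catch_stat_failure() {
    path=$1
    log=$2
    attempts=${3:-600}
    delay=${4:-1}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Keep only stderr: stat's normal output is discarded.
        if ! err=$(stat "$path" 2>&1 >/dev/null); then
            printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$err" >>"$log"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example use, run from the maildir that contains "cur":
#   catch_stat_failure cur /tmp/stat-failures.log
```

If it returns 0, that is the moment to run the strace and rpcdebug commands
above, while the failure is hopefully still in effect.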