2008-05-27 13:21:21

by Jeff Layton

Subject: Re: Nfs filesystem corruption(?) after kmail crash

On Tue, 27 May 2008 14:15:41 +0200
"Alexander Borghgraef" <[email protected]> wrote:

> On Mon, May 26, 2008 at 1:40 PM, Jeff Layton <[email protected]> wrote:
> >
> > The ???? fields usually pop up when stat() calls fail. The odd thing in
> > this case though is that stat() seemed to figure out that this was a
> > directory and not a plain file, but missed out on everything else.
> >
> > Is this problem persistent?
>
> It varies. I've had occurrences where it lasted for 15mins, but recent
> ones have been too short to register. I've had cases where an ls or cd
> into the affected directory showed nothing out of the ordinary, which
> leads me to believe that the error was just there long enough to cause
> the kmail crash, and then disappeared.
>
> > If so, it might be interesting to run:
> >
> > strace stat cur
> >
> > ...and see what error it's returning.
>
> Ok, I'll do that when I get an error which lasts long enough.
>
> > Even better would be to get a
> > capture on the wire at the same time and see if the server is returning
> > an error of some sort.
>
> I have absolutely no idea how to do that :-)
>

From the client, I usually do something like:

# tcpdump -i <interface> -s0 -w /tmp/nfs-stat.pcap host <server_ip_addr>

Start that up, run your test, and then kill it with Ctrl-C. You can
then open /tmp/nfs-stat.pcap with wireshark or some other analyzer and
inspect the packets that went between client and server during the
test.
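
Since the error seems to come and go quickly, a small polling loop
might help catch it while it lasts. What follows is only a rough
sketch under some assumptions: watch_stat, its arguments and the
STRACE_LOG variable are names made up for illustration, and it assumes
a POSIX shell with the coreutils stat command. If strace is installed,
it also saves a trace of the failing call.

```shell
#!/bin/sh
# Hypothetical helper: poll a path with stat until the call fails.
# usage: watch_stat <path> [tries] [interval_seconds]
# Returns 1 as soon as stat fails, 0 if it never failed.
watch_stat() {
    dir=$1
    tries=${2:-60}
    interval=${3:-1}
    log=${STRACE_LOG:-/tmp/stat-strace.log}   # assumed log location
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Capture only stderr from stat; its normal output is discarded.
        if ! err=$(stat "$dir" 2>&1 >/dev/null); then
            echo "$(date '+%Y-%m-%d %H:%M:%S') stat failed: $err"
            # Best effort: record the failing syscall if strace exists.
            if command -v strace >/dev/null 2>&1; then
                strace -o "$log" stat "$dir" >/dev/null 2>&1 || true
            fi
            return 1
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 0
}
```

With the tcpdump capture already running in another terminal,
something like "watch_stat cur 900" would check the directory once a
second for about 15 minutes and stop at the first failure.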

--
Jeff Layton <[email protected]>