2003-06-02 08:37:40

by Frank Cusack

Subject: nfs_refresh_inode: inode number mismatch (bug)

Hi,

I'm using a Frankenstein kernel: 2.4.21-rc3 with some -ac bits and the
2.5.69 NFS+RPC code backported to it. It's like the CITI kernel (for krb5),
but a little more aggressive about what gets backported.

I've found an issue with nfs_rmdir(), at least that's where I think it is.
Consider these two shell sequences:

    1                               2

    cd /nfs                         cd /nfs
    mkdir tmp
    echo foo > tmp/foo
    less tmp/foo
    [less waits for input]
                                    rm -rf tmp
    'v'
    [vi tries to access tmp/foo]

At this point, inode.c:__nfs_refresh_inode() prints the "inode number
mismatch" error. AFAICT, this is just noise, but the noise is driving
me crazy. :-) I'm using fsstress and I can't tell if the messages are
all because of this problem, or if my backport is bad as well.
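
For anyone who wants to script this rather than juggle two shells, here is a
rough sketch of the same interleaving collapsed into one process. It is an
illustration, not something taken from the kernel or from fsstress: the
function name reproduce() and its return value are made up for this example,
and an open file handle stands in for less. Point it at a directory on the
NFS mount to exercise the client; the temporary directory in the main block
is only there so the script runs anywhere. On some clients the rmtree step
may itself fail because of the .nfsXXX file, and reading the held file may
fail once the file is gone on the server, so treat the result as a hint.

```python
import os
import shutil
import tempfile


def reproduce(base_dir):
    """Hold tmp/foo open, remove tmp from under it, then look it up by path.

    Returns (data_read_via_open_handle, path_lookup_failed).
    """
    tmp = os.path.join(base_dir, "tmp")
    foo = os.path.join(tmp, "foo")

    # Sequence 1: mkdir tmp; echo foo > tmp/foo; less tmp/foo
    os.mkdir(tmp)
    with open(foo, "w") as f:
        f.write("foo\n")
    held = open(foo)  # stands in for less keeping the file open

    try:
        # Sequence 2: rm -rf tmp
        shutil.rmtree(tmp)

        # Sequence 1 again: 'v' in less makes vi look the file up by name
        try:
            os.stat(foo)
            lookup_failed = False
        except FileNotFoundError:
            lookup_failed = True

        # The handle opened before the removal is read last
        data = held.read()
    finally:
        held.close()
    return data, lookup_failed


if __name__ == "__main__":
    # Assumption: replace the temporary directory with the NFS mount point
    # (e.g. /nfs) to test the client; a local directory only shows the flow.
    with tempfile.TemporaryDirectory() as d:
        print(reproduce(d))
```

After a run against the NFS mount, the kernel log is the place to look for
the "inode number mismatch" message.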

Now, if sequence 2 is run on a different machine, there is no error!
So that hints to me that the local cache just needs to be cleared,
perhaps in nfs_rmdir() or maybe in nfs_unlink()/nfs_safe_remove().
I've tried a few things, but I'm not familiar enough with the code
and am not making headway.

The stock 2.4.20 has this problem as well. I haven't tried any 2.5
kernels. The 2.2 kernel doesn't have this problem, because it doesn't
allow you to unlink a .nfsXXX file while it's open (and therefore you
cannot remove the dir); clearly incorrect behavior.

This is against a NetApp server, although I can't see how the server would
matter.

Thanks for any advice, guidance, or hopefully a fix! BTW, I'm interested
to hear what tools folks use to stress NFS.

/fc


-------------------------------------------------------
This SF.net email is sponsored by: eBay
Get office equipment for less on eBay!
http://adfarm.mediaplex.com/ad/ck/711-11697-6916-5
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-06-02 09:46:33

by Frank Cusack

Subject: Re: nfs_refresh_inode: inode number mismatch (bug)

On Mon, Jun 02, 2003 at 01:37:31AM -0700, Frank Cusack wrote:
> I've found an issue with nfs_rmdir(), at least that's where I think it is.

A little more info ...

This seems related to NFSv3, and I'm going to guess readdirplus. If
I mount as NFSv2, I don't see the problem.

It just struck me that there are a lot of readdirplus-related logs just
before the "bug". I didn't think anything of it (because, *of course*
there is a bunch of activity there) but in a rare moment of clarity I
figured I should try v2.

I'll continue to look at this tomorrow, but I'm sure Trond or one of
the wizards here will instantly know where the problem is. :-)

/fc

