2008-01-21 22:08:35

by J. Bruce Fields

Subject: Re: nfs2/3 ESTALE bug on mount point (v2.6.24-rc8)

On Mon, Jan 21, 2008 at 03:28:51PM -0500, Erez Zadok wrote:
> In message <[email protected]>, "J. Bruce Fields" writes:
> > On Mon, Jan 21, 2008 at 01:19:30PM -0500, Erez Zadok wrote:
> > > Since around 2.6.24-rc5 or so I've had an occasional problem: I get an
> > > ESTALE error on the mount point after setting up a localhost exported mount
> > > point, and trying to mkdir something there (this is part of my setup scripts
> > > prior to running unionfs regression tests).
> > >
> > > I'm CC'ing both client and server maintainers/list, b/c I'm not certain
> > > where the problem is. The problem doesn't exist in 2.6.23 or earlier stable
> > > kernels. It doesn't appear in nfs4 either, only nfs2 and nfs3.
> > >
> > > The problem is seen intermittently, and is probably some form of a race. I
> > > was finally able to narrow it down a bit. I was able to write a shell
> > > script that for me reproduces the problem within a few minutes (I tried it
> > > on v2.6.24-rc8-74-ga7da60f and several different machine configurations).
> > >
> > > I've included the shell script below. Hopefully you can use it to track the
> > > problem down. The mkdir command in the middle of the script is the one
> > > that'll eventually get an ESTALE error and cause the script to abort; you
> > > can run "df" afterward to see the stale mount points.
> > >
> > > Notes: the one anecdotal factor that seems to make the bug appear sooner is
> > > if you increase the number of total mounts that the script below creates
> > > ($MAX in the script).
> >
> > OK, so to summarize:
> >
> > 1. create $MAX ext2 filesystem images, loopback-mount them, and export
> > the result.
> > 2. nfs-mount each of those $MAX exports.
> > 3. create a directory under each of those nfs-mounts.
> > 4. unmount and unexport
> >
> > Repeat that a thousand times, and eventually you get ESTALE at step 3?
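
(For reference, a minimal sketch of that loop; the paths, image size, and
export options below are made up, and the actual script from the report is
what was used:)

#!/bin/sh
# Rough sketch of the loop summarized above -- not the original script.
MAX=10
while :; do
    # 1. create $MAX ext2 images, loopback-mount them, and export the result
    for i in `seq 1 $MAX`; do
        dd if=/dev/zero of=/tmp/img$i bs=1M count=8 2>/dev/null
        mke2fs -Fq /tmp/img$i
        mkdir -p /exports/d$i /mnt/nfs$i
        mount -o loop /tmp/img$i /exports/d$i
        exportfs -o rw,no_root_squash localhost:/exports/d$i
    done
    # 2. nfs-mount each export
    for i in `seq 1 $MAX`; do
        mount -t nfs -o nfsvers=3 localhost:/exports/d$i /mnt/nfs$i
    done
    # 3. mkdir under each nfs mount -- this is what eventually returns ESTALE
    for i in `seq 1 $MAX`; do
        mkdir /mnt/nfs$i/dir || exit 1
    done
    # 4. unmount and unexport everything
    for i in `seq 1 $MAX`; do
        umount /mnt/nfs$i
        exportfs -u localhost:/exports/d$i
        umount /exports/d$i
        rm -f /tmp/img$i
    done
done
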
>
> Your description is correct.
>
> > I guess one step would be to see if it's possible to get a network trace
> > showing what happened in the bad case....
>
> Here you go. See the tcpdump in here:
>
> http://agora.fsl.cs.sunysb.edu/tmp/nfs/
>
> I captured it on an x86_64 machine using
>
> tcpdump -s 0 -i lo -w tcpdump2
>
> And it shows near the very end the ESTALE error.

Yep, thanks! So frame 107855 has the MNT reply that returns the
filehandle in question, which is used in an ACCESS call in frame 107855
that gets an ESTALE. Looks like an unhappy server!
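
(In case it helps to pull those packets out of the capture again, something
along these lines should list the MNT traffic plus any NFS replies carrying a
stale-filehandle error -- assuming a tshark built with the MOUNT and NFS
dissectors; older versions take -R instead of -Y:)

# NFS3ERR_STALE is status 70; "mount" matches the MOUNT protocol traffic
tshark -r tcpdump2 -Y 'mount || nfs.status == 70'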

> Do you think this could be related to nfs-utils? I find that I can easily
> trigger this problem on an FC7 machine with nfs-utils-1.1.0-4.fc7 (within
> 10-30 runs of the above loop); but so far I cannot trigger the problem on an
> FC6 machine with nfs-utils-1.0.10-14.fc6 (even after 300+ runs of the above
> loop).

Yes, it's quite likely, though on a quick skim through the git logs I
don't see an obviously related commit....

--b.


2008-01-28 04:38:28

by Erez Zadok

Subject: Re: [NFS] nfs2/3 ESTALE bug on mount point (v2.6.24-rc8)

In message <[email protected]>, "J. Bruce Fields" writes:
> On Mon, Jan 21, 2008 at 05:08:28PM -0500, bfields wrote:
> > On Mon, Jan 21, 2008 at 03:28:51PM -0500, Erez Zadok wrote:
> > >
> > > Here you go. See the tcpdump in here:
> > >
> > > http://agora.fsl.cs.sunysb.edu/tmp/nfs/
> > >
> > > I captured it on an x86_64 machine using
> > >
> > > tcpdump -s 0 -i lo -w tcpdump2
> > >
> > > And it shows near the very end the ESTALE error.
> >
> > Yep, thanks! So frame 107855 has the MNT reply that returns the
> > filehandle in question, which is used in an ACCESS call in frame 107855
> > that gets an ESTALE. Looks like an unhappy server!
> >
> > > Do you think this could be related to nfs-utils? I find that I can easily
> > > trigger this problem on an FC7 machine with nfs-utils-1.1.0-4.fc7 (within
> > > 10-30 runs of the above loop); but so far I cannot trigger the problem on an
> > > FC6 machine with nfs-utils-1.0.10-14.fc6 (even after 300+ runs of the above
> > > loop).
> >
> > Yes, it's quite likely, though on a quick skim through the git logs I
> > don't see an obviously related commit...
>
> It might help to turn on rpc cache debugging:
>
> echo 2048 >/proc/sys/sunrpc/rpc_debug
>
> and then capture the contents of the /proc/net/rpc/*/content files just
> after the failure.
>
> Possibly even better, though it'll produce a lot of stuff:
>
> strace -p `pidof rpc.mountd` -s4096 -otmp
>
> and then pass along "tmp".

You can find both an strace and content files in

http://agora.fsl.cs.sunysb.edu/tmp/nfs/

> And then of course if the regression is in nfs-utils then there's always
> a git-bisect as the debugging tool of last-resort: assuming you can
> reproduce the same regression between nfs-utils-1-0-10 and
> nfs-utils-1-1-0 from git://linux-nfs.org/nfs-utils, then all you'd need
> to do is clone that repo and do
>
> git bisect start
> git bisect good nfs-utils-1-0-10
> git bisect bad nfs-utils-1-1-0
>
> And it shouldn't take more than 8 tries.
>
> Sorry for not having any more clever suggestions....
>
> --b.

I tried to bisect nfs-utils, but it didn't work. First, the latest version
of nfs-utils didn't configure for me. It complained:

Unable to locate information required to use librpcsecgss. If you
have pkgconfig installed, you might try setting environment variable
PKG_CONFIG_PATH to /usr/local/lib/pkgconfig

The above appears to be the error you get if you don't have librpcsecgss API >=
0.10, but mine, on FC7, is 0.11. (I'm using a vanilla FC7.)
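
(If the librpcsecgss .pc file just landed somewhere pkg-config doesn't look
by default, pointing it there explicitly might get past this; the paths below
are guesses:)

# Tell pkg-config where the librpcsecgss metadata might live (guessed paths)
PKG_CONFIG_PATH=/usr/lib/pkgconfig:/usr/local/lib/pkgconfig ./configure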

So I ran configure --disable-gss and was finally able to build the utils.
But then mount.nfs was often hanging; stracing it revealed that mount(2) was
getting EACCES, as if the dir wasn't exported (but exportfs said it was). I
don't know whether disabling gss at configure time could have caused these
hangs.

I continued and tried a few more intermediate versions in the bisection, but
several of them failed at the autogen.sh, configure, or compile stage.
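
(Commits that won't build don't have to be marked good or bad; a recent
enough git can just skip them and pick a nearby revision to test instead:)

# Mark an unbuildable bisection point as untestable instead of good/bad
git bisect skip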

So I don't know what else I can do; this bug may have to be fixed the hard
way. (BTW, I can get you a self-contained VMware image that'll show the
bug, if you'd like.)

Cheers,
Erez.



2008-01-22 16:41:19

by J. Bruce Fields

Subject: Re: nfs2/3 ESTALE bug on mount point (v2.6.24-rc8)

On Mon, Jan 21, 2008 at 05:08:28PM -0500, bfields wrote:
> On Mon, Jan 21, 2008 at 03:28:51PM -0500, Erez Zadok wrote:
> >
> > Here you go. See the tcpdump in here:
> >
> > http://agora.fsl.cs.sunysb.edu/tmp/nfs/
> >
> > I captured it on an x86_64 machine using
> >
> > tcpdump -s 0 -i lo -w tcpdump2
> >
> > And it shows near the very end the ESTALE error.
>
> Yep, thanks! So frame 107855 has the MNT reply that returns the
> filehandle in question, which is used in an ACCESS call in frame 107855
> that gets an ESTALE. Looks like an unhappy server!
>
> > Do you think this could be related to nfs-utils? I find that I can easily
> > trigger this problem on an FC7 machine with nfs-utils-1.1.0-4.fc7 (within
> > 10-30 runs of the above loop); but so far I cannot trigger the problem on an
> > FC6 machine with nfs-utils-1.0.10-14.fc6 (even after 300+ runs of the above
> > loop).
>
> Yes, it's quite likely, though on a quick skim through the git logs I
> don't see an obviously related commit...

It might help to turn on rpc cache debugging:

echo 2048 >/proc/sys/sunrpc/rpc_debug

and then capture the contents of the /proc/net/rpc/*/content files just
after the failure.
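
Roughly something like this, say (the reproducer script name here is made up,
and the exact set of caches under /proc/net/rpc varies a bit by kernel
version):

# Turn on sunrpc cache debugging, run the reproducer until it fails, then
# snapshot every cache content file ("reproduce.sh" is a placeholder name)
echo 2048 >/proc/sys/sunrpc/rpc_debug
sh reproduce.sh
for f in /proc/net/rpc/*/content; do echo "== $f =="; cat $f; done > rpc-caches.txt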

Possibly even better, though it'll produce a lot of stuff:

strace -p `pidof rpc.mountd` -s4096 -otmp

and then pass along "tmp".

And then of course if the regression is in nfs-utils then there's always
a git-bisect as the debugging tool of last-resort: assuming you can
reproduce the same regression between nfs-utils-1-0-10 and
nfs-utils-1-1-0 from git://linux-nfs.org/nfs-utils, then all you'd need
to do is clone that repo and do

git bisect start
git bisect good nfs-utils-1-0-10
git bisect bad nfs-utils-1-1-0

And it shouldn't take more than 8 tries.
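
(And if the reproducer can be wrapped in a script that exits 0 on success and
non-zero on the ESTALE failure -- it would also have to install the freshly
built utilities and restart them -- then a recent enough git can automate the
whole thing; the wrapper name below is made up, and exit status 125 tells git
to skip commits that won't build:)

git bisect run sh -c 'sh autogen.sh && ./configure && make || exit 125; sh ./test-estale.sh'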

Sorry for not having any more clever suggestions....

--b.