2008-01-29 01:08:27

by [email protected]

[permalink] [raw]
Subject: Re: nfs2/3 ESTALE bug on mount point (v2.6.24-rc8)

On Sun, Jan 27, 2008 at 11:37:59PM -0500, Erez Zadok wrote:
> In message <[email protected]>, "J. Bruce Fields" writes:
> > On Mon, Jan 21, 2008 at 05:08:28PM -0500, bfields wrote:
> > > On Mon, Jan 21, 2008 at 03:28:51PM -0500, Erez Zadok wrote:
> > > >
> > > > Here you go. See the tcpdump in here:
> > > >
> > > > http://agora.fsl.cs.sunysb.edu/tmp/nfs/
> > > >
> > > > I captured it on an x86_64 machine using
> > > >
> > > > tcpdump -s 0 -i lo -w tcpdump2
> > > >
> > > > And it shows near the very end the ESTALE error.
> > >
> > > Yep, thanks! So frame 107855 has the MNT reply that returns the
> > > filehandle in question, which is used in an ACCESS call in frame 107855
> > > that gets an ESTALE. Looks like an unhappy server!
> > >
> > > > Do you think this could be related to nfs-utils? I find that I can easily
> > > > trigger this problem on an FC7 machine with nfs-utils-1.1.0-4.fc7 (within
> > > > 10-30 runs of the above loop); but so far I cannot trigger the problem on an
> > > > FC6 machine with nfs-utils-1.0.10-14.fc6 (even after 300+ runs of the above
> > > > loop).
> > >
> > > Yes, it's quite likely, though on a quick skim through the git logs I
> > > don't see an obviously related commit...
> >
> > It might help to turn on rpc cache debugging:
> >
> > echo 2048 >/proc/sys/sunrpc/rpc_debug
> >
> > and then capture the contents of the /proc/net/rpc/*/content files just
> > after the failure.
> >
> > Possibly even better, though it'll produce a lot of stuff:
> >
> > strace -p `pidof rpc.mountd` -s4096 -otmp
> >
> > and then pass along "tmp".
>
> You can find both an strace and content files in
>
> http://agora.fsl.cs.sunysb.edu/tmp/nfs/

The "content" files are all empty. That can't be right....

But anyway the strace does show a problem: grepping for reads and
writes to fd 5, which (based on the traffic) must be
/proc/net/rpc/nfsd.fh/channel, what you see is the kernel repeatedly
asking whether the server has exported a filesystem with a given uuid
(the big hex blob) to localhost.localdomain. And mountd responds with
an export each time:....

1295 read(5, "localhost.localdomain 6 \\x2527c026ed8b49b6a2138c0ef212af45\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x2527c026ed8b49b6a2138c0ef212af45 2147483647 /n/export/b5 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\x4a5b1c28abc84cc3abd6e072738b2418\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x4a5b1c28abc84cc3abd6e072738b2418 2147483647 /n/export/b6 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\x46177ac6944a41fda30a0cfb7adf0694\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x46177ac6944a41fda30a0cfb7adf0694 2147483647 /n/export/b0 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\x040be43d3d2a4433979d321ac075f6e5\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x040be43d3d2a4433979d321ac075f6e5 2147483647 /n/export/b1 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\xeac197ca26b444eeaf67c1f6cd92c24b\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\xeac197ca26b444eeaf67c1f6cd92c24b 2147483647 /n/export/b2 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\x8e07ea5612584a19ba209780971f3221\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x8e07ea5612584a19ba209780971f3221 2147483647 /n/export/b3 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\xa509393cc478496a9c0eec09681a5033\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\xa509393cc478496a9c0eec09681a5033 2147483647 /n/export/b4 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\x46177ac6944a41fda30a0cfb7adf0694\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x46177ac6944a41fda30a0cfb7adf0694 2147483647 /n/export/b0 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\x040be43d3d2a4433979d321ac075f6e5\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x040be43d3d2a4433979d321ac075f6e5 2147483647 /n/export/b1 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\xeac197ca26b444eeaf67c1f6cd92c24b\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\xeac197ca26b444eeaf67c1f6cd92c24b 2147483647 /n/export/b2 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\x8e07ea5612584a19ba209780971f3221\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x8e07ea5612584a19ba209780971f3221 2147483647 /n/export/b3 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\xa509393cc478496a9c0eec09681a5033\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\xa509393cc478496a9c0eec09681a5033 2147483647 /n/export/b4 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\xb97fa10332784bd2848a979af256ec61\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\xb97fa10332784bd2848a979af256ec61 2147483647 /n/export/b5 \n", 84) = 84
1295 read(5, "localhost.localdomain 6 \\x27a0f76db3fe44198a82d6370c63383f\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x27a0f76db3fe44198a82d6370c63383f 2147483647 /n/export/b6 \n", 84) = 84

.... except the last:

1295 read(5, "localhost.localdomain 6 \\x46177ac6944a41fda30a0cfb7adf0694\n", 128) = 59
1295 write(5, "localhost.localdomain 6 \\x46177ac6944a41fda30a0cfb7adf0694 2147483647 \n", 71) = 71

Here suddenly it's claiming there is no such export, even though it
actually identified it as /n/export/b0 just above.

What this means is that "found" is false in the check:

if (found)
qword_print(f, found_path);

at the end of nfs-utils/utils/mountd/cache.c:nfsd_fh(). Note that we're
in the FSID_UUID16_INUM case (that's the "6" in the reads and writes
above.

Anyway, it's clearly a bug in mountd someplace. Maybe in the blkid
library. Maybe a comparison of the strace leading up to the two results
would yield something?

I'm out of steam....

--b.

>
> > And then of course if the regression is in nfs-utils then there's always
> > a git-bisect as the debugging tool of last-resort: assuming you can
> > reproduce the same regression between nfs-utils-1-0-10 and
> > nfs-utils-1-1-0 from git://linux-nfs.org/nfs-utils, then all you'd need
> > to do is clone that repo and do
> >
> > git bisect start
> > git bisect good nfs-utils-1-0-10
> > git bisect bad nfs-utils-1-1-0
> >
> > And it shouldn't take more than 8 tries.
> >
> > Sorry for not having any more clever suggestions....
> >
> > --b.
>
> I tried to bisect nfs-utils but it didn't work. First, the latest version
> of nfs-utils didn't configure for me. It complained
>
> Unable to locate information required to use librpcsecgss. If you
> have pkgconfig installed, you might try setting environment variable
> PKG_CONFIG_PATH to /usr/local/lib/pkgconfig
>
> The above appears to be an error if you don't have librpcsecgss API >=
> 0.10. But mine, on FC7. is 0.11. (I'm using a vanilla FC7.)
>
> So I ran configure --disable-gss and was finally able to build the utils.
> But then, I was having mount.nfs hanging often; stracing it revealed that
> mount(2) was getting EACESS as if the dir wasn't exported (but exportfs said
> it was). I don't know if disabling gss at configure time could have
> resulted in these hangs.
>
> I continued and tried a few more intermediate versions in the bisection, and
> several of them failed to compile and/or configure and/or autogen.sh.
>
> So I don't know what else I can do; this bug may have to be fixed the hard
> way. (BTW, I can get you a self contained VMware image that'll show the
> bug, if you'd like.)
>
> Cheers,
> Erez.


2008-01-29 03:04:18

by Erez Zadok

[permalink] [raw]
Subject: Re: [NFS] nfs2/3 ESTALE bug on mount point (v2.6.24-rc8)

In message <[email protected]>, "J. Bruce Fields" writes:
> On Sun, Jan 27, 2008 at 11:37:59PM -0500, Erez Zadok wrote:

> > You can find both an strace and content files in
> >
> > http://agora.fsl.cs.sunysb.edu/tmp/nfs/

Sorry. I guess tar on /proc files doesn't really save the content of the
files.

I cat'ed the content files and alo placed an updated strace in the above
URL. Here's the content files. In this run, I got an ESTALE on
/n/lower/b2:

mkdir: cannot create directory `/n/lower/b2': Stale NFS file handle

Erez.

::::::::::::::
/proc/net/rpc/auth.unix.gid/content
::::::::::::::
#uid cnt: gids...
# expiry=1201575525 refcnt=1 flags=3
# 0 0:
::::::::::::::
/proc/net/rpc/auth.unix.ip/content
::::::::::::::
#class IP domain
# expiry=1201577204 refcnt=2 flags=1
nfsd 127.0.0.1 localhost.localdomain
# expiry=2147483647 refcnt=1 flags=1
nfsd 0.0.0.0 -test-client-
::::::::::::::
/proc/net/rpc/nfs4.idtoname/content
::::::::::::::
#domain type id [name]
::::::::::::::
/proc/net/rpc/nfs4.nametoid/content
::::::::::::::
#domain type name [id]
::::::::::::::
/proc/net/rpc/nfsd.export/content
::::::::::::::
#path domain(flags)
# expiry=1201577204 refcnt=1 flags=1
/n/export/b5 localhost.localdomain(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=a91d2562:3acf4709:97cf4c71:24c5dcfb)
# expiry=1201577205 refcnt=1 flags=1
/n/export/b6 localhost.localdomain(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=533ab699:d98c46e6:9e3084f6:2b447886)
# expiry=1201577207 refcnt=1 flags=1
/n/export/b1 localhost.localdomain(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=80754c2e:e5984d43:93d59492:65ee9933)
# expiry=1201577206 refcnt=1 flags=1
/n/export/b0 localhost.localdomain(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=6d5e7d95:a33948c3:a3b73aa6:cadaa639)
::::::::::::::
/proc/net/rpc/nfsd.fh/content
::::::::::::::
#domain fsidtype fsid [path]
# expiry=2147483647 refcnt=1 flags=1
localhost.localdomain 6 0x62251da90947cf3a714ccf97fbdcc524 /n/export/b5
# expiry=2147483647 refcnt=1 flags=3
# localhost.localdomain 6 0x0800bfebda4f9d24013f00ba702bbdb6
# expiry=2147483647 refcnt=1 flags=1
localhost.localdomain 6 0x2e4c7580434d98e59294d5933399ee65 /n/export/b1
# expiry=2147483647 refcnt=1 flags=1
localhost.localdomain 6 0x99b63a53e6468cd9f684309e8678442b /n/export/b6
# expiry=2147483647 refcnt=1 flags=1
localhost.localdomain 6 0x957d5e6dc34839a3a63ab7a339a6daca /n/export/b0

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
_______________________________________________
Please note that [email protected] is being discontinued.
Please subscribe to [email protected] instead.
http://vger.kernel.org/vger-lists.html#linux-nfs