2012-08-15 06:17:59

by Jamie Heilman

[permalink] [raw]
Subject: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

I really wish I could have nailed this down better, but I've had a
hard time reliably reproducing the problem during bisection, and I
haven't seen anyone report a similar sounding problem. Here's what
I've seen: since 3.5 I've been having spurious delays on my nfs
clients, noticable particularly when I open an mbox file in mutt over
an nfs v4 mount from a v3.5 or later server. The servers I've
reproduced this on are all uni-proc 32-bit systems... but then I
haven't tried SMP or 64-bit systems yet, it may or may not exist
there. When the delay occurs, it's quite noticable. I've never seen
one that takes less than 40 seconds to "unstick." I wrote a quick and
dirty reproduction tool, based on the syscalls mutt was doing that
triggered the problem, attached to this message. To use it, compile
the file as utime-test on an exported volume, then execute with (cd
/some/mount/point && strace -T ./utime-test) from a nfs4 client.

For whatever, reason I frequently find the second call to utime takes
an irritatingly long time to return and I see something like:
utime("utime-test.c", [2012/08/14-22:47:21, 2012/08/14-17:25:21]) = 0 <70.510913>
in the strace output.

I've reproduced this on Debian Squeeze / nfs-utils 1.2.2 based servers
(legacy idmapper, no user-space nfsidmap), as well as Debian Wheezy /
nfs-utils 1.2.6 (uses keyutils upcalls) servers, so I doubt it's a
user-space related issue... Attempts to bisect have been muddled,
I'll keep trying in the interim, but the best I've been able to pin
things down is that issue was probably introduced in the
419f4319495043a9507ac3e616be9ca60af09744 merge. I can't repo on a
kernel based on fb21affa49204acd409328415b49bfe90136653c. (I say
based on, because I have to apply the patch from
http://marc.info/?l=linux-nfs&m=133950479803025 or face additional
problems.)

I'll try to get full rcpdebug traces on client and server as the delay
is occuring in the hopes that helps pin things down, and post them
separately.

--
Jamie Heilman http://audible.transient.net/~jamie/


Attachments:
(No filename) (2.02 kB)
utime-test.c (696.00 B)
Download all attachments

2012-08-15 13:59:03

by Jamie Heilman

[permalink] [raw]
Subject: Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

Jamie Heilman wrote:
> I'll try to get full rcpdebug traces on client and server as the delay
> is occuring in the hopes that helps pin things down, and post them
> separately.

OK, here are the logs from client and server, where a run of my test program
under strace -T resulted in:

utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <0.150815>
open("utime-test.c", O_RDONLY) = 3 <0.242635>
close(3) = 0 <0.147768>
stat("utime-test.c", {st_mode=S_IFREG|0644, st_size=696, ...}) = 0 <0.002772>
utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <71.878058>

The client is a 64-bit v3.4.8 kernel, the server is 32-bit, v3.5.1 +
the two sunprc patches that will be in v3.5.2.

(The client's system clock is a touch faster than the server's, but
these logs start at the same instant.)


--
Jamie Heilman http://audible.transient.net/~jamie/


Attachments:
(No filename) (938.00 B)
server.gz (6.48 kB)
client.gz (16.98 kB)
Download all attachments

2012-08-15 15:02:55

by J. Bruce Fields

[permalink] [raw]
Subject: Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

On Wed, Aug 15, 2012 at 01:58:54PM +0000, Jamie Heilman wrote:
> Jamie Heilman wrote:
> > I'll try to get full rcpdebug traces on client and server as the delay
> > is occuring in the hopes that helps pin things down, and post them
> > separately.
>
> OK, here are the logs from client and server, where a run of my test program
> under strace -T resulted in:
>
> utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <0.150815>
> open("utime-test.c", O_RDONLY) = 3 <0.242635>
> close(3) = 0 <0.147768>
> stat("utime-test.c", {st_mode=S_IFREG|0644, st_size=696, ...}) = 0 <0.002772>
> utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <71.878058>
>
> The client is a 64-bit v3.4.8 kernel, the server is 32-bit, v3.5.1 +
> the two sunprc patches that will be in v3.5.2.
>
> (The client's system clock is a touch faster than the server's, but
> these logs start at the same instant.)

Thanks for all the details.

What's probably happening is that the client is returning a delegation
with the open. The setattr then breaks that delegation; you can see it
getting 10008 (NFS4ERR_DELAY) replies while the server waits for the
delegation to be returned. But for some reason the callback to break
the delegation isn't working. ("NFSD: warning: no callback path to
client 192.168.2.42/192.168.2.4 tcp UNIX 0: error -110" (110 is
ETIMEDOUT).) So instead you wait for the delegation to time out and get
forcibly revoked.

The reproducer might be more reliable if you did two opens.

It'd be worth looking at the traffic in wireshark. You should see
setattr, open, close, setattr, a DELAY reply to the setattr, a
CB_RECALL, and then a DELEGRETURN that gets a succesful reply. But for
some reason the DELEGRETURN isn't getting through in your case, I'm not
sure why. I can't reproduce that. You'll need to start wireshark
before you mount to make sure it knows how to parse the callbacks.

I don't see anything in that range of git commits that looks suspicious,
but I may well have overlooked something.

--b.

2012-08-16 20:18:42

by Jamie Heilman

[permalink] [raw]
Subject: Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

J. Bruce Fields wrote:
> On Wed, Aug 15, 2012 at 01:58:54PM +0000, Jamie Heilman wrote:
> > Jamie Heilman wrote:
> > > I'll try to get full rcpdebug traces on client and server as the delay
> > > is occuring in the hopes that helps pin things down, and post them
> > > separately.
> >
> > OK, here are the logs from client and server, where a run of my test program
> > under strace -T resulted in:
> >
> > utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <0.150815>
> > open("utime-test.c", O_RDONLY) = 3 <0.242635>
> > close(3) = 0 <0.147768>
> > stat("utime-test.c", {st_mode=S_IFREG|0644, st_size=696, ...}) = 0 <0.002772>
> > utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <71.878058>
> >
> > The client is a 64-bit v3.4.8 kernel, the server is 32-bit, v3.5.1 +
> > the two sunprc patches that will be in v3.5.2.
> >
> > (The client's system clock is a touch faster than the server's, but
> > these logs start at the same instant.)
>
> Thanks for all the details.
>
> What's probably happening is that the client is returning a delegation
> with the open. The setattr then breaks that delegation; you can see it
> getting 10008 (NFS4ERR_DELAY) replies while the server waits for the
> delegation to be returned. But for some reason the callback to break
> the delegation isn't working. ("NFSD: warning: no callback path to
> client 192.168.2.42/192.168.2.4 tcp UNIX 0: error -110" (110 is
> ETIMEDOUT).) So instead you wait for the delegation to time out and get
> forcibly revoked.
>
> The reproducer might be more reliable if you did two opens.

I made the change... not entirely sure it helped, but I think I've
bisected this reliably anyway. It came down to:

d5497fc693a446ce9100fcf4117c3f795ddfd0d2 is the first bad commit
commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2
Author: J. Bruce Fields <[email protected]>
Date: Mon May 14 22:06:49 2012 -0400

nfsd4: move rq_flavor into svc_cred

Move the rq_flavor into struct svc_cred, and use it in setclientid and
exchange_id comparisons as well.

Signed-off-by: J. Bruce Fields <[email protected]>

:040000 040000 165568aae0fc47067863ea8ca911494033cbf2ce 54244f0084f58df4844b09d56e8e6bad0e19d913 M fs
:040000 040000 bae5a74a85bda2aa2b155b0bdca1ea0028502164 207751ce30c8f49c68fb0a85d0772cb0fa426ae1 M include
:040000 040000 ef940cf0c558fe65a21a710e6c95b2cefb2996fd 459ac143acf5cb2b3b7f59df29c417cedbb23c4b M net

FWIW reverting this commit does seem to fix the problem.

> It'd be worth looking at the traffic in wireshark. You should see
> setattr, open, close, setattr, a DELAY reply to the setattr, a
> CB_RECALL, and then a DELEGRETURN that gets a succesful reply. But for
> some reason the DELEGRETURN isn't getting through in your case, I'm not
> sure why. I can't reproduce that. You'll need to start wireshark
> before you mount to make sure it knows how to parse the callbacks.

Captures from the server showing the delay (running 3.5.2) and the
previous behavior (running 3.4.9) are attached.

--
Jamie Heilman http://audible.transient.net/~jamie/


Attachments:
(No filename) (3.11 kB)
nfsd4-delay.pcap.gz (8.60 kB)
nfsd4-ok.pcap.gz (7.10 kB)
Download all attachments

2012-08-16 20:42:13

by J. Bruce Fields

[permalink] [raw]
Subject: Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

On Thu, Aug 16, 2012 at 08:18:38PM +0000, Jamie Heilman wrote:
> J. Bruce Fields wrote:
> > On Wed, Aug 15, 2012 at 01:58:54PM +0000, Jamie Heilman wrote:
> > > Jamie Heilman wrote:
> > > > I'll try to get full rcpdebug traces on client and server as the delay
> > > > is occuring in the hopes that helps pin things down, and post them
> > > > separately.
> > >
> > > OK, here are the logs from client and server, where a run of my test program
> > > under strace -T resulted in:
> > >
> > > utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <0.150815>
> > > open("utime-test.c", O_RDONLY) = 3 <0.242635>
> > > close(3) = 0 <0.147768>
> > > stat("utime-test.c", {st_mode=S_IFREG|0644, st_size=696, ...}) = 0 <0.002772>
> > > utime("utime-test.c", [2012/08/15-13:35:20, 2012/08/15-13:22:04]) = 0 <71.878058>
> > >
> > > The client is a 64-bit v3.4.8 kernel, the server is 32-bit, v3.5.1 +
> > > the two sunprc patches that will be in v3.5.2.
> > >
> > > (The client's system clock is a touch faster than the server's, but
> > > these logs start at the same instant.)
> >
> > Thanks for all the details.
> >
> > What's probably happening is that the client is returning a delegation
> > with the open. The setattr then breaks that delegation; you can see it
> > getting 10008 (NFS4ERR_DELAY) replies while the server waits for the
> > delegation to be returned. But for some reason the callback to break
> > the delegation isn't working. ("NFSD: warning: no callback path to
> > client 192.168.2.42/192.168.2.4 tcp UNIX 0: error -110" (110 is
> > ETIMEDOUT).) So instead you wait for the delegation to time out and get
> > forcibly revoked.
> >
> > The reproducer might be more reliable if you did two opens.
>
> I made the change... not entirely sure it helped, but I think I've
> bisected this reliably anyway. It came down to:

Hm, weird. In the good case the cb_recall's done with auth_unix, in the
bad case with auth_null. OK, that should be enough to go on....

Thanks for digging into this!

--b.

>
> d5497fc693a446ce9100fcf4117c3f795ddfd0d2 is the first bad commit
> commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2
> Author: J. Bruce Fields <[email protected]>
> Date: Mon May 14 22:06:49 2012 -0400
>
> nfsd4: move rq_flavor into svc_cred
>
> Move the rq_flavor into struct svc_cred, and use it in setclientid and
> exchange_id comparisons as well.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
>
> :040000 040000 165568aae0fc47067863ea8ca911494033cbf2ce 54244f0084f58df4844b09d56e8e6bad0e19d913 M fs
> :040000 040000 bae5a74a85bda2aa2b155b0bdca1ea0028502164 207751ce30c8f49c68fb0a85d0772cb0fa426ae1 M include
> :040000 040000 ef940cf0c558fe65a21a710e6c95b2cefb2996fd 459ac143acf5cb2b3b7f59df29c417cedbb23c4b M net
>
> FWIW reverting this commit does seem to fix the problem.
>
> > It'd be worth looking at the traffic in wireshark. You should see
> > setattr, open, close, setattr, a DELAY reply to the setattr, a
> > CB_RECALL, and then a DELEGRETURN that gets a succesful reply. But for
> > some reason the DELEGRETURN isn't getting through in your case, I'm not
> > sure why. I can't reproduce that. You'll need to start wireshark
> > before you mount to make sure it knows how to parse the callbacks.
>
> Captures from the server showing the delay (running 3.5.2) and the
> previous behavior (running 3.4.9) are attached.
>
> --
> Jamie Heilman http://audible.transient.net/~jamie/


2012-08-16 21:09:19

by J. Bruce Fields

[permalink] [raw]
Subject: Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

On Thu, Aug 16, 2012 at 04:42:08PM -0400, J. Bruce Fields wrote:
> Hm, weird. In the good case the cb_recall's done with auth_unix, in the
> bad case with auth_null. OK, that should be enough to go on....
>
> Thanks for digging into this!

This should fix it--could you confirm?

--b.

commit e950bebdac1f17121f972728489cdba43734d56d
Author: J. Bruce Fields <[email protected]>
Date: Thu Aug 16 17:01:21 2012 -0400

nfsd4: fix security flavor of NFSv4.0 callback

Commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2 "nfsd4: move rq_flavor
into svc_cred" forgot to remove cl_flavor from the client, leaving two
places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored.
After that patch, the latter was the one that was updated, but the
former was the one that the callback used.

Symptoms were a long delay on utime(). This is because the utime()
generated a setattr which recalled a delegation, but the cb_recall was
ignored by the client because it had the wrong security flavor.

Cc: [email protected]
Reported-by: Jamie Heilman <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index cbaf4f8..4c7bd35 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -651,12 +651,12 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c

if (clp->cl_minorversion == 0) {
if (!clp->cl_cred.cr_principal &&
- (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
+ (clp->cl_cred.cr_flavor >= RPC_AUTH_GSS_KRB5))
return -EINVAL;
args.client_name = clp->cl_cred.cr_principal;
args.prognumber = conn->cb_prog,
args.protocol = XPRT_TRANSPORT_TCP;
- args.authflavor = clp->cl_flavor;
+ args.authflavor = clp->cl_cred.cr_flavor;
clp->cl_cb_ident = conn->cb_ident;
} else {
if (!conn->cb_xprt)
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 167d7d8..9db0bb5 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -231,7 +231,6 @@ struct nfs4_client {
nfs4_verifier cl_verifier; /* generated by client */
time_t cl_time; /* time of last lease renewal */
struct sockaddr_storage cl_addr; /* client ipaddress */
- u32 cl_flavor; /* setclientid pseudoflavor */
struct svc_cred cl_cred; /* setclientid principal */
clientid_t cl_clientid; /* generated by server */
nfs4_verifier cl_confirm; /* generated by server */

2012-08-16 21:34:27

by Jamie Heilman

[permalink] [raw]
Subject: Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

J. Bruce Fields wrote:
> On Thu, Aug 16, 2012 at 04:42:08PM -0400, J. Bruce Fields wrote:
> > Hm, weird. In the good case the cb_recall's done with auth_unix, in the
> > bad case with auth_null. OK, that should be enough to go on....
> >
> > Thanks for digging into this!
>
> This should fix it--could you confirm?

Yep, looks like that fixes it.

> commit e950bebdac1f17121f972728489cdba43734d56d
> Author: J. Bruce Fields <[email protected]>
> Date: Thu Aug 16 17:01:21 2012 -0400
>
> nfsd4: fix security flavor of NFSv4.0 callback
>
> Commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2 "nfsd4: move rq_flavor
> into svc_cred" forgot to remove cl_flavor from the client, leaving two
> places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored.
> After that patch, the latter was the one that was updated, but the
> former was the one that the callback used.
>
> Symptoms were a long delay on utime(). This is because the utime()
> generated a setattr which recalled a delegation, but the cb_recall was
> ignored by the client because it had the wrong security flavor.
>
> Cc: [email protected]
> Reported-by: Jamie Heilman <[email protected]>
> Signed-off-by: J. Bruce Fields <[email protected]>
>
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index cbaf4f8..4c7bd35 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
> @@ -651,12 +651,12 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
>
> if (clp->cl_minorversion == 0) {
> if (!clp->cl_cred.cr_principal &&
> - (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
> + (clp->cl_cred.cr_flavor >= RPC_AUTH_GSS_KRB5))
> return -EINVAL;
> args.client_name = clp->cl_cred.cr_principal;
> args.prognumber = conn->cb_prog,
> args.protocol = XPRT_TRANSPORT_TCP;
> - args.authflavor = clp->cl_flavor;
> + args.authflavor = clp->cl_cred.cr_flavor;
> clp->cl_cb_ident = conn->cb_ident;
> } else {
> if (!conn->cb_xprt)
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 167d7d8..9db0bb5 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -231,7 +231,6 @@ struct nfs4_client {
> nfs4_verifier cl_verifier; /* generated by client */
> time_t cl_time; /* time of last lease renewal */
> struct sockaddr_storage cl_addr; /* client ipaddress */
> - u32 cl_flavor; /* setclientid pseudoflavor */
> struct svc_cred cl_cred; /* setclientid principal */
> clientid_t cl_clientid; /* generated by server */
> nfs4_verifier cl_confirm; /* generated by server */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
Jamie Heilman http://audible.transient.net/~jamie/

2012-08-17 13:08:51

by J. Bruce Fields

[permalink] [raw]
Subject: Re: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

On Thu, Aug 16, 2012 at 09:34:24PM +0000, Jamie Heilman wrote:
> J. Bruce Fields wrote:
> > On Thu, Aug 16, 2012 at 04:42:08PM -0400, J. Bruce Fields wrote:
> > > Hm, weird. In the good case the cb_recall's done with auth_unix, in the
> > > bad case with auth_null. OK, that should be enough to go on....
> > >
> > > Thanks for digging into this!
> >
> > This should fix it--could you confirm?
>
> Yep, looks like that fixes it.

Thanks!

I'll queue this up for 3.6 and stable.

--b.

>
> > commit e950bebdac1f17121f972728489cdba43734d56d
> > Author: J. Bruce Fields <[email protected]>
> > Date: Thu Aug 16 17:01:21 2012 -0400
> >
> > nfsd4: fix security flavor of NFSv4.0 callback
> >
> > Commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2 "nfsd4: move rq_flavor
> > into svc_cred" forgot to remove cl_flavor from the client, leaving two
> > places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored.
> > After that patch, the latter was the one that was updated, but the
> > former was the one that the callback used.
> >
> > Symptoms were a long delay on utime(). This is because the utime()
> > generated a setattr which recalled a delegation, but the cb_recall was
> > ignored by the client because it had the wrong security flavor.
> >
> > Cc: [email protected]
> > Reported-by: Jamie Heilman <[email protected]>
> > Signed-off-by: J. Bruce Fields <[email protected]>
> >
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index cbaf4f8..4c7bd35 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -651,12 +651,12 @@ static int setup_callback_client(struct nfs4_client *clp, struct nfs4_cb_conn *c
> >
> > if (clp->cl_minorversion == 0) {
> > if (!clp->cl_cred.cr_principal &&
> > - (clp->cl_flavor >= RPC_AUTH_GSS_KRB5))
> > + (clp->cl_cred.cr_flavor >= RPC_AUTH_GSS_KRB5))
> > return -EINVAL;
> > args.client_name = clp->cl_cred.cr_principal;
> > args.prognumber = conn->cb_prog,
> > args.protocol = XPRT_TRANSPORT_TCP;
> > - args.authflavor = clp->cl_flavor;
> > + args.authflavor = clp->cl_cred.cr_flavor;
> > clp->cl_cb_ident = conn->cb_ident;
> > } else {
> > if (!conn->cb_xprt)
> > diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> > index 167d7d8..9db0bb5 100644
> > --- a/fs/nfsd/state.h
> > +++ b/fs/nfsd/state.h
> > @@ -231,7 +231,6 @@ struct nfs4_client {
> > nfs4_verifier cl_verifier; /* generated by client */
> > time_t cl_time; /* time of last lease renewal */
> > struct sockaddr_storage cl_addr; /* client ipaddress */
> > - u32 cl_flavor; /* setclientid pseudoflavor */
> > struct svc_cred cl_cred; /* setclientid principal */
> > clientid_t cl_clientid; /* generated by server */
> > nfs4_verifier cl_confirm; /* generated by server */
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
>
> --
> Jamie Heilman http://audible.transient.net/~jamie/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html