2019-02-20 11:29:08

by James Pearson

Subject: nfsd thread limit and UDP ?

On a very busy NFSv3 server (running CentOS 6), we recently upped the
nfsd thread count to 1024 - but this caused client mount requests over
UDP to fail.

We configure all our clients to use TCP for NFS mounts, but the
automounter (automountd) on MacOS (up to version 10.12) sends a
'null call' to the NFS server over UDP before attempting the mount.
The server appears to ignore all UDP requests, so the automount
fails.

I can also reproduce the issue on a Linux client via:

mount -o udp,nfsvers=3 server:/export /mount/point

I've found, by trial and error, that the maximum number of nfsd
threads that can be run on the server is 1017 before UDP mount
requests fail.

Running tcpdump on the server shows the UDP requests arriving from the
client, but the server never replies.

It looks like more recent versions of MacOS do their test 'null
call' over TCP - so that is one 'solution' to this issue.

However, I'm interested to know if we're hitting some hard limit, or
if there are any settings we can tweak that could mitigate the
problem?

Thanks

James Pearson


2019-02-20 17:44:06

by J. Bruce Fields

Subject: Re: nfsd thread limit and UDP ?

On Wed, Feb 20, 2019 at 11:28:53AM +0000, James Pearson wrote:
> On a very busy NFSv3 server (running CentOS 6), we recently upped the
> nfsd thread count to 1024 - but this caused client mount requests over
> UDP to fail.
>
> We configure all our clients to use TCP for NFS mounts, but the
> automounter (automountd) on MacOS (up to version MacOS 10.12) sends a
> 'null call' to the NFS server over UDP before attempting the mount -
> but the server appears to ignore any UDP requests - and the automount
> fails
>
> I can also reproduce the issue on a Linux client via:
>
> mount -o udp,nfsvers=3 server:/export /mount/point
>
> I've found, by trial and error, that the maximum number of nfsd
> threads that can be run on the server is 1017 before UDP mount
> requests fail

Thanks for investigating, that's very weird and interesting!

Just looking through the UDP code in net/sunrpc/svcsock.c.... I wonder
if it's this:

    svc_sock_setbufsize(svsk->sk_sock,
                        (serv->sv_nrthreads+3) * serv->sv_max_mesg,
                        (serv->sv_nrthreads+3) * serv->sv_max_mesg);

sv_max_mesg will be about 2^20, so the result will be about 2^30 in your case.
Then svc_sock_setbufsize throws in another multiple of 2:

    sock->sk->sk_sndbuf = snd * 2;
    sock->sk->sk_rcvbuf = rcv * 2;

so we've got to be very close to overflowing sk_sndbuf and sk_rcvbuf, which are
ints.
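
The arithmetic can be checked in user space. The following is only a sketch: it assumes sv_max_mesg is 1 MiB plus one 4 KiB page (1052672 bytes), which is an assumption about this server and not a measured value, and the helper names are made up for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* Assumed value: 1 MiB payload plus one 4 KiB page. Not taken from the server. */
#define ASSUMED_MAX_MESG 1052672u

/* Bytes the old code ends up storing: (nrthreads + 3) * max_mesg, then doubled.
 * Computed in 64 bits so the model itself cannot overflow. */
static int64_t requested_bufsize(unsigned int nrthreads, unsigned int max_mesg)
{
	return (int64_t)(nrthreads + 3) * max_mesg * 2;
}

/* Nonzero if the result still fits in an int such as sk_sndbuf or sk_rcvbuf. */
static int fits_in_int(unsigned int nrthreads, unsigned int max_mesg)
{
	return requested_bufsize(nrthreads, max_mesg) <= INT_MAX;
}
```

Under that assumption, fits_in_int(1017, ASSUMED_MAX_MESG) holds (2147450880 bytes, just below INT_MAX at 2147483647), while fits_in_int(1018, ASSUMED_MAX_MESG) does not, which would line up with the 1017-thread limit reported above.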

--b.

2019-02-20 18:15:15

by J. Bruce Fields

Subject: Re: nfsd thread limit and UDP ?

Would it be possible for you to try this?

--b.

commit b45466587b37
Author: J. Bruce Fields <[email protected]>
Date: Wed Feb 20 12:54:50 2019 -0500

svcrpc: fix UDP on servers with lots of threads

James Pearson found that an NFS server stopped responding to UDP
requests if started with more than 1017 threads.

sv_max_mesg is about 2^20, so that is probably where the calculation
performed by

    svc_sock_setbufsize(svsk->sk_sock,
                        (serv->sv_nrthreads+3) * serv->sv_max_mesg,
                        (serv->sv_nrthreads+3) * serv->sv_max_mesg);

starts to overflow an int.

Reported-by: James Pearson <[email protected]>
Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index a6a060925e5d..43590a968b73 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -349,12 +349,16 @@ static ssize_t svc_recvfrom(struct svc_rqst *rqstp, struct kvec *iov,
 /*
  * Set socket snd and rcv buffer lengths
  */
-static void svc_sock_setbufsize(struct socket *sock, unsigned int snd,
-				unsigned int rcv)
+static void svc_sock_setbufsize(struct svc_sock *svsk, unsigned int nreqs)
 {
+	unsigned int max_mesg = svsk->sk_xprt.xpt_server->sv_max_mesg;
+	struct socket *sock = svsk->sk_sock;
+
+	nreqs = min(nreqs, INT_MAX / 2 / max_mesg);
+
 	lock_sock(sock->sk);
-	sock->sk->sk_sndbuf = snd * 2;
-	sock->sk->sk_rcvbuf = rcv * 2;
+	sock->sk->sk_sndbuf = nreqs * max_mesg * 2;
+	sock->sk->sk_rcvbuf = nreqs * max_mesg * 2;
 	sock->sk->sk_write_space(sock->sk);
 	release_sock(sock->sk);
 }
@@ -516,9 +520,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
 	     * provides an upper bound on the number of threads
 	     * which will access the socket.
 	     */
-	    svc_sock_setbufsize(svsk->sk_sock,
-				(serv->sv_nrthreads+3) * serv->sv_max_mesg,
-				(serv->sv_nrthreads+3) * serv->sv_max_mesg);
+	    svc_sock_setbufsize(svsk, serv->sv_nrthreads + 3);
 
 	clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
 	skb = NULL;
@@ -681,9 +683,7 @@ static void svc_udp_init(struct svc_sock *svsk, struct svc_serv *serv)
 	 * receive and respond to one request.
 	 * svc_udp_recvfrom will re-adjust if necessary
 	 */
-	svc_sock_setbufsize(svsk->sk_sock,
-			    3 * svsk->sk_xprt.xpt_server->sv_max_mesg,
-			    3 * svsk->sk_xprt.xpt_server->sv_max_mesg);
+	svc_sock_setbufsize(svsk, 3);
 
 	/* data might have come in before data_ready set up */
 	set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags);
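
For anyone wanting to sanity-check the clamp outside the kernel, here is a rough user-space model of the patched calculation. It is a sketch only: the function name is hypothetical, it takes max_mesg as a parameter instead of reading it from the server structure, and max_mesg must be nonzero.

```c
#include <limits.h>

/* User-space model of the patched helper: clamp nreqs first, then compute
 * the doubled buffer size. max_mesg must be nonzero. */
static int clamped_bufsize(unsigned int nreqs, unsigned int max_mesg)
{
	unsigned int limit = INT_MAX / 2 / max_mesg;

	if (nreqs > limit)
		nreqs = limit;
	/* nreqs * max_mesg <= INT_MAX / 2, so doubling cannot exceed INT_MAX. */
	return (int)(nreqs * max_mesg * 2);
}
```

Assuming max_mesg is 1052672 (1 MiB plus a 4 KiB page, an assumption and not a value from the report), a server with 2048 threads would call this with nreqs = 2051 and get the same 2147450880 bytes as one with 1017 threads, since both are clamped to 1020 requests.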

2019-02-21 04:18:22

by J. Bruce Fields

Subject: Re: nfsd thread limit and UDP ?

On Wed, Feb 20, 2019 at 11:28:53AM +0000, James Pearson wrote:
> On a very busy NFSv3 server (running CentOS 6), we recently upped the
> nfsd thread count to 1024 - but this caused client mount requests over
> UDP to fail.
>
> We configure all our clients to use TCP for NFS mounts, but the
> automounter (automountd) on MacOS (up to version MacOS 10.12) sends a
> 'null call' to the NFS server over UDP before attempting the mount -
> but the server appears to ignore any UDP requests - and the automount
> fails

By the way, you might also just turn off UDP. (Run rpc.nfsd with
the -U option.) Hopefully MacOS can handle that case.

--b.

2019-02-21 12:36:02

by James Pearson

Subject: Re: nfsd thread limit and UDP ?

On Thu, 21 Feb 2019 at 04:18, J. Bruce Fields <[email protected]> wrote:
>
> On Wed, Feb 20, 2019 at 11:28:53AM +0000, James Pearson wrote:
> > On a very busy NFSv3 server (running CentOS 6), we recently upped the
> > nfsd thread count to 1024 - but this caused client mount requests over
> > UDP to fail.
> >
> > We configure all our clients to use TCP for NFS mounts, but the
> > automounter (automountd) on MacOS (up to version MacOS 10.12) sends a
> > 'null call' to the NFS server over UDP before attempting the mount -
> > but the server appears to ignore any UDP requests - and the automount
> > fails
>
> By the way, you might also just turn off UDP. (Run rpc.nfsd with
> the -U option.) Hopefully MacOS can handle that case.

We tried that - but when we restarted nfs, some existing mounts hung
(not sure why, as we should be just using TCP everywhere) ... although
when tested on a test server, the MacOS automounter worked fine

I tried your patch - it doesn't apply 'as is' on a CentOS 6 kernel -
but with a bit of manual hacking, I can get it to fit

However, the net/sunrpc/svcsock.c in these kernels has an extra call
to svc_sock_setbufsize() :

    /* Initialize the socket */
    if (sock->type == SOCK_DGRAM)
        svc_udp_init(svsk, serv);
    else {
        /* initialise setting must have enough space to
         * receive and respond to one request.
         */
        svc_sock_setbufsize(svsk->sk_sock, 4 * serv->sv_max_mesg,
                            4 * serv->sv_max_mesg);
        svc_tcp_init(svsk, serv);
    }

I tried replacing that svc_sock_setbufsize() with:

svc_sock_setbufsize(svsk, 4);

but that just caused the whole machine to lock up shortly after
sunrpc.ko was loaded ...

However, things seem to work fine if I call a copy of the original
svc_sock_setbufsize() at that point in the code with the original args
...

i.e. mounts over UDP (and MacOS automounts) now work with nfsd threads
over 1017 (I tried 2048 ... and it worked)

Incidentally, I came across an old thread on this list that appears to
be related to this issue (well, it mentions a 1020 thread limit and
buffer size wraps in svc_sock_setbufsize() ???) :

https://www.spinics.net/lists/linux-nfs/msg34927.html

... but I'm not sure what the result of that was (nor if it is
actually related to the issue here) ?

Thanks

James Pearson

2019-02-21 15:20:28

by J. Bruce Fields

Subject: Re: nfsd thread limit and UDP ?

On Thu, Feb 21, 2019 at 12:35:46PM +0000, James Pearson wrote:
> On Thu, 21 Feb 2019 at 04:18, J. Bruce Fields <[email protected]> wrote:
> >
> > On Wed, Feb 20, 2019 at 11:28:53AM +0000, James Pearson wrote:
> > > On a very busy NFSv3 server (running CentOS 6), we recently upped the
> > > nfsd thread count to 1024 - but this caused client mount requests over
> > > UDP to fail.
> > >
> > > We configure all our clients to use TCP for NFS mounts, but the
> > > automounter (automountd) on MacOS (up to version MacOS 10.12) sends a
> > > 'null call' to the NFS server over UDP before attempting the mount -
> > > but the server appears to ignore any UDP requests - and the automount
> > > fails
> >
> > By the way, you might also just turn off UDP. (Run rpc.nfsd with
> > the -U option.) Hopefully MacOS can handle that case.
>
> We tried that - but when we restarted nfs, some existing mounts hung
> (not sure why, as we should be just using TCP everywhere) ... although
> when tested on a test server, the MacOS automounter worked fine

It's probably not a good idea to turn off UDP while there are existing
mounts, even if the mounts are supposedly TCP. At a guess, one of the
sideband protocols (NLM or NSM) is using UDP and that's causing
problems.

> I tried your patch - it doesn't apply 'as is' on a CentOS 6 kernel -
> but with a bit of manual hacking, I can get it to fit

Whoops, I missed at first that you were on an older kernel.

> However, the net/sunrpc/svcsock.c in these kernels has an extra call
> to svc_sock_setbufsize() :
>
>     /* Initialize the socket */
>     if (sock->type == SOCK_DGRAM)
>         svc_udp_init(svsk, serv);
>     else {
>         /* initialise setting must have enough space to
>          * receive and respond to one request.
>          */
>         svc_sock_setbufsize(svsk->sk_sock, 4 * serv->sv_max_mesg,
>                             4 * serv->sv_max_mesg);
>         svc_tcp_init(svsk, serv);
>     }
>
> I tried replacing that svc_sock_setbufsize() with:
>
> svc_sock_setbufsize(svsk, 4);
>
> but that just caused the whole machine to lock up shortly after
> sunrpc.ko was loaded ...

Looks like it's trying to dereference svsk->sk_xprt.xpt_server before
svc_tcp_init() has initialized it.

> However, things seem to work fine if I call a copy of the original
> svc_sock_setbufsize() at that point in the code with the original args
> ...
>
> i.e. mounts over UDP (and MacOS automounts) now work with nfsd threads
> over 1017 (I tried 2048 ... and it worked)

OK, I think that's evidence enough that this overflow was the problem
you were hitting, so I'll send that patch upstream.

> Incidentally, I came across an old thread on this list that appears to
> be related to this issue (well, it mentions a 1020 thread limit and
> buffer size wraps in svc_sock_setbufsize() ???) :
>
> https://www.spinics.net/lists/linux-nfs/msg34927.html
>
> ... but I'm not sure what the result of that was (nor if it is
> actually related to the issue here) ?

Yeah, see https://www.spinics.net/lists/linux-nfs/msg34932.html. So, I
knew about this problem and even made a patch before and then somehow
dropped it. I'm not sure how that happened. Anyway, I have it queued
up for 5.1 now, so that shouldn't happen again.

--b.