LinuxLists.cc - [PATCH 0/2] tweak knfsd session slot table sizing

2017-09-25 13:21:46

Subject: [PATCH 0/2] tweak knfsd session slot table sizing

From: "J. Bruce Fields" <[email protected]>

Current client and server defaults can limit NFSv4.1+ mounts to as few
as 4 per gigabytes of server RAM, and people have been running into the
limit on upgrading the protocol version from 4.0 to 4.1.

Bump the limit. Also return smaller ca_maxrequests as the limit
approaches instead of waiting till we have to fail CREATE_SESSION
completely.

A few years ago Trond also posted some patches to dynamically resize
existing sessions as limits approach--I'll revisit those at some point
too.

The numbers chosen here are honestly still a little arbitrary, and we
may tweak them again if we see more problems.

--b.

J. Bruce Fields (2):
nfsd: increase DRC cache limit
nfsd: give out fewer session slots as limit approaches

fs/nfsd/nfs4state.c | 5 +++++
fs/nfsd/nfssvc.c | 2 +-
2 files changed, 6 insertions(+), 1 deletion(-)

--
2.13.5

2017-09-25 13:21:46

by J. Bruce Fields

[permalink] [raw]

Subject: [PATCH 2/2] nfsd: give out fewer session slots as limit approaches

From: "J. Bruce Fields" <[email protected]>

Instead of granting client's full requests until we hit our DRC size
limit and then failing CREATE_SESSIONs (and hence mounts) completely,
start granting clients smaller slot tables as we approach the limit.

The factor chosen here is pretty much arbitrary.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfs4state.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 28e5bded6a4b..2fe930e92545 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1472,6 +1472,11 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
spin_lock(&nfsd_drc_lock);
avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION,
nfsd_drc_max_mem - nfsd_drc_mem_used);
+ /*
+ * Never use more than a third of the remaining memory,
+ * unless it's the only way to give this client a slot:
+ */
+ avail = clamp_t(int, avail, slotsize, avail/3);
num = min_t(int, num, avail / slotsize);
nfsd_drc_mem_used += num * slotsize;
spin_unlock(&nfsd_drc_lock);
--
2.13.5

2017-09-25 13:21:46

by J. Bruce Fields

[permalink] [raw]

Subject: [PATCH 1/2] nfsd: increase DRC cache limit

From: "J. Bruce Fields" <[email protected]>

An NFSv4.1+ client negotiates the size of its duplicate reply cache size
in the initial CREATE_SESSION request. The server preallocates the
memory for the duplicate reply cache to ensure that we'll never fail to
record the response to a nonidempotent operation.

To prevent a few CREATE_SESSIONs from consuming all of memory we set an
upper limit based on nr_free_buffer_pages(). 1/2^10 has been too
limiting in practice; 1/2^7 is still less than one percent.

Signed-off-by: J. Bruce Fields <[email protected]>
---
fs/nfsd/nfssvc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 7e3af3ef0917..6bbc717f40f2 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -446,7 +446,7 @@ void nfsd_reset_versions(void)
*/
static void set_max_drc(void)
{
- #define NFSD_DRC_SIZE_SHIFT 10
+ #define NFSD_DRC_SIZE_SHIFT 7
nfsd_drc_max_mem = (nr_free_buffer_pages()
>> NFSD_DRC_SIZE_SHIFT) * PAGE_SIZE;
nfsd_drc_mem_used = 0;
--
2.13.5

2019-09-19 01:28:08

by NeilBrown

[permalink] [raw]

Subject: Re: [PATCH 2/2] nfsd: give out fewer session slots as limit approaches

On Mon, Sep 25 2017, J. Bruce Fields wrote:

> From: "J. Bruce Fields" <[email protected]>
>
> Instead of granting client's full requests until we hit our DRC size
> limit and then failing CREATE_SESSIONs (and hence mounts) completely,
> start granting clients smaller slot tables as we approach the limit.
>
> The factor chosen here is pretty much arbitrary.

Hi Bruce....
I've been looking at this patch - and the various add-ons that fix it :-(
There seems to be another problem though.
Prior to this patch, avail would never exceed
nfsd_drc_max_mem - nfsd_drc_mem_used
since this patch, avail will never be less than slotsize, so it could
exceed the above.
This means that 'num' will never be less than 1 (i.e. never zero).
num * slotsize might exceed
nfsd_drc_max_mem - nfsd_drc_mem_used
and then nfsd_drc_mem_used would exceed nfsd_drc_max_mem

When that happens, the next call to nfsd4_get_drc_mem() will evaluate

total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;

which will be very large (unsigned long) indeed. Maybe not the intention.

I would have sent a patch to fix this, except that it bothers me that
nfsd4_get_drc_mem() could ever return 0 (it cannot at the moment, but
would after a "fix"). That would result in check_forechannel_attrs()
returning nfserr_jukebox, and the client retrying indefinitely (which
is exactly the symptom I have reported by a customer with a 4.12
kernel).
This isn't nice behaviour.

Given that the server makes no attempt to reclaim slot memory for
clients, would NFS4ERR_RESOURCE be a better error here?

Also, I'd like to suggest that the '1/3' heuristic be change to 1/16.
Assuming 30 slots get handed out normally (which my testing shows -
about 2k each, with an upper limit of 64k):
When 90 slots left, we hand out
30 (now 60 left)
20 (now 40 left)
13 (now 27 left)
9 (now 18 left)
6 (now 12 left)
4 (now 8 left)
2 (now 6 left)
2 (now 4 left)
1
1
1
1
which is a rapid decline as clients are added.
With 16, we hand out 30 at a time until 480 slots are left (30Meg)
then: 30 28 26 24 23 21 20 19 18 6 15 15 14 13 12 11 10 10 9 9 8 8 7 7
6 6 5 5 5 5 4 4 4 3 3 3 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1
slots per session

Am I convincing?

To make it more concrete: this is what I'm thinking of. Which bits do
you like?

Thanks,
NeilBrown

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7857942c5ca6..5d11ceaee998 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1573,11 +1573,15 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
/*
- * Never use more than a third of the remaining memory,
+ * Never use more than a 1/16 of the remaining memory,
* unless it's the only way to give this client a slot:
*/
- avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
+ avail = clamp_t(unsigned long, avail, slotsize, total_avail/16);
num = min_t(int, num, avail / slotsize);
+ if (nfsd_drc_mem_used + num * slotsize > nfsd_drc_max_mem)
+ /* Completely out of space - sorry */
+ num = 0;
+
nfsd_drc_mem_used += num * slotsize;
spin_unlock(&nfsd_drc_lock);

@@ -3172,7 +3176,7 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
*/
ca->maxreqs = nfsd4_get_drc_mem(ca);
if (!ca->maxreqs)
- return nfserr_jukebox;
+ return nfserr_resource;

return nfs_ok;
}

Attachments:

signature.asc (847.00 B)

2019-09-19 18:30:48

by J. Bruce Fields

[permalink] [raw]

Subject: Re: [PATCH 2/2] nfsd: give out fewer session slots as limit approaches

On Thu, Sep 19, 2019 at 11:08:10AM +1000, NeilBrown wrote:
> On Mon, Sep 25 2017, J. Bruce Fields wrote:
>
> > From: "J. Bruce Fields" <[email protected]>
> >
> > Instead of granting client's full requests until we hit our DRC size
> > limit and then failing CREATE_SESSIONs (and hence mounts) completely,
> > start granting clients smaller slot tables as we approach the limit.
> >
> > The factor chosen here is pretty much arbitrary.
>
> Hi Bruce....
> I've been looking at this patch - and the various add-ons that fix it :-(

It's got to be some kind of record. Sorry....

> There seems to be another problem though.
> Prior to this patch, avail would never exceed
> nfsd_drc_max_mem - nfsd_drc_mem_used
> since this patch, avail will never be less than slotsize, so it could
> exceed the above.
> This means that 'num' will never be less than 1 (i.e. never zero).
> num * slotsize might exceed
> nfsd_drc_max_mem - nfsd_drc_mem_used
> and then nfsd_drc_mem_used would exceed nfsd_drc_max_mem
>
> When that happens, the next call to nfsd4_get_drc_mem() will evaluate
>
> total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
>
> which will be very large (unsigned long) indeed. Maybe not the intention.

That all sounds correct.

> I would have sent a patch to fix this, except that it bothers me that
> nfsd4_get_drc_mem() could ever return 0 (it cannot at the moment, but
> would after a "fix"). That would result in check_forechannel_attrs()
> returning nfserr_jukebox, and the client retrying indefinitely (which
> is exactly the symptom I have reported by a customer with a 4.12
> kernel).
> This isn't nice behaviour.
>
> Given that the server makes no attempt to reclaim slot memory for
> clients, would NFS4ERR_RESOURCE be a better error here?

NFSv4 is confusing here: RESOURCE seems like it should be the right
error for this kind of thing, but it was actually just mean to be for
compounds that are too complicated in some way. And since NFSv4.1
(which added negotiated limits on compounds), it's not supposed to be
used at all.

Looking.... Oh, I overlooked this before:

https://tools.ietf.org/html/rfc5661#page-522

The server creates the session by recording the parameter values
used (including whether the CREATE_SESSION4_FLAG_PERSIST flag is
set and has been accepted by the server) and allocating space
for the session reply cache (if there is not enough space, the
server returns NFS4ERR_NOSPC).

So, slightly odd use of NOSPC, but that's the right error there.

> Also, I'd like to suggest that the '1/3' heuristic be change to 1/16.
> Assuming 30 slots get handed out normally (which my testing shows -
> about 2k each, with an upper limit of 64k):
> When 90 slots left, we hand out
> 30 (now 60 left)
> 20 (now 40 left)
> 13 (now 27 left)
> 9 (now 18 left)
> 6 (now 12 left)
> 4 (now 8 left)
> 2 (now 6 left)
> 2 (now 4 left)
> 1
> 1
> 1
> 1
> which is a rapid decline as clients are added.
> With 16, we hand out 30 at a time until 480 slots are left (30Meg)
> then: 30 28 26 24 23 21 20 19 18 6 15 15 14 13 12 11 10 10 9 9 8 8 7 7
> 6 6 5 5 5 5 4 4 4 3 3 3 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1
> 1 1
> slots per session
>
> Am I convincing?
>
> To make it more concrete: this is what I'm thinking of. Which bits do
> you like?

Except for the error return, it looks good to me.

--b.

>
> Thanks,
> NeilBrown
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 7857942c5ca6..5d11ceaee998 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1573,11 +1573,15 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
> total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
> avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
> /*
> - * Never use more than a third of the remaining memory,
> + * Never use more than a 1/16 of the remaining memory,
> * unless it's the only way to give this client a slot:
> */
> - avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
> + avail = clamp_t(unsigned long, avail, slotsize, total_avail/16);
> num = min_t(int, num, avail / slotsize);
> + if (nfsd_drc_mem_used + num * slotsize > nfsd_drc_max_mem)
> + /* Completely out of space - sorry */
> + num = 0;
> +
> nfsd_drc_mem_used += num * slotsize;
> spin_unlock(&nfsd_drc_lock);
>
> @@ -3172,7 +3176,7 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
> */
> ca->maxreqs = nfsd4_get_drc_mem(ca);
> if (!ca->maxreqs)
> - return nfserr_jukebox;
> + return nfserr_resource;
>
> return nfs_ok;
> }

2019-09-19 18:56:07

by J. Bruce Fields

[permalink] [raw]

Subject: Re: [PATCH 2/2] nfsd: give out fewer session slots as limit approaches

On Thu, Sep 19, 2019 at 12:22:11PM -0400, J. Bruce Fields wrote:
> Looking.... Oh, I overlooked this before:
>
> https://tools.ietf.org/html/rfc5661#page-522
>
> The server creates the session by recording the parameter values
> used (including whether the CREATE_SESSION4_FLAG_PERSIST flag is
> set and has been accepted by the server) and allocating space
> for the session reply cache (if there is not enough space, the
> server returns NFS4ERR_NOSPC).
>
> So, slightly odd use of NOSPC, but that's the right error there.

Looking back some more... last year Manjunath Patil sent a patch that
fixed that and I somehow ignored the part where he actually cited that
spec language and said, no, ENOSPC isn't for that:

https://lore.kernel.org/linux-nfs/[email protected]/

But, Trond seems to disagree with the spec here anyway:

There are a range of errors which we may need to handle by
destroying the session, and then creating a new one (mainly the
ones where the client and server slot handling get out of sync).
That's why returning NFS4ERR_NOSPC in response to CREATE_SESSION
is unhelpful, and is why the only sane response by the client
will be to treat is as a temporary error.

Other todo's from that thread:

- The protocol allows us to recall existing slots from clients.
We can use that when we get close to the limit. Trond wrote
patches to do that some time ago, though said he'd take a
different approach now:

https://lore.kernel.org/linux-nfs/CAABAsM6vDOaudUZYWH23oGiWGqX5Bd1YbCDnL6L=pxzMXgZzaw@mail.gmail.com/

- I seem to recall the client is asking for rather large slots.
It could be a careful look at the xdr code would show we don't
need as much?

- Maybe we should just keep allowing small sessions (1 slot?)
even past the limit. Worst case the subsequent kmalloc fails.

--b.

>
> > Also, I'd like to suggest that the '1/3' heuristic be change to 1/16.
> > Assuming 30 slots get handed out normally (which my testing shows -
> > about 2k each, with an upper limit of 64k):
> > When 90 slots left, we hand out
> > 30 (now 60 left)
> > 20 (now 40 left)
> > 13 (now 27 left)
> > 9 (now 18 left)
> > 6 (now 12 left)
> > 4 (now 8 left)
> > 2 (now 6 left)
> > 2 (now 4 left)
> > 1
> > 1
> > 1
> > 1
> > which is a rapid decline as clients are added.
> > With 16, we hand out 30 at a time until 480 slots are left (30Meg)
> > then: 30 28 26 24 23 21 20 19 18 6 15 15 14 13 12 11 10 10 9 9 8 8 7 7
> > 6 6 5 5 5 5 4 4 4 3 3 3 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1
> > 1 1
> > slots per session
> >
> > Am I convincing?
> >
> > To make it more concrete: this is what I'm thinking of. Which bits do
> > you like?
>
> Except for the error return, it looks good to me.
>
> --b.
>
> >
> > Thanks,
> > NeilBrown
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 7857942c5ca6..5d11ceaee998 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -1573,11 +1573,15 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
> > total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
> > avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
> > /*
> > - * Never use more than a third of the remaining memory,
> > + * Never use more than a 1/16 of the remaining memory,
> > * unless it's the only way to give this client a slot:
> > */
> > - avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
> > + avail = clamp_t(unsigned long, avail, slotsize, total_avail/16);
> > num = min_t(int, num, avail / slotsize);
> > + if (nfsd_drc_mem_used + num * slotsize > nfsd_drc_max_mem)
> > + /* Completely out of space - sorry */
> > + num = 0;
> > +
> > nfsd_drc_mem_used += num * slotsize;
> > spin_unlock(&nfsd_drc_lock);
> >
> > @@ -3172,7 +3176,7 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
> > */
> > ca->maxreqs = nfsd4_get_drc_mem(ca);
> > if (!ca->maxreqs)
> > - return nfserr_jukebox;
> > + return nfserr_resource;
> >
> > return nfs_ok;
> > }
>
>

2019-09-19 22:14:01

by J. Bruce Fields

[permalink] [raw]

Subject: Re: [PATCH 2/2] nfsd: give out fewer session slots as limit approaches

On Thu, Sep 19, 2019 at 01:17:30PM -0400, J. Bruce Fields wrote:
> - Maybe we should just keep allowing small sessions (1 slot?)
> even past the limit. Worst case the subsequent kmalloc fails.

So to be clear, I think that's what I'd do for now instead of trying to
find a better error return.

And probably put the 1/3->1/16 change in a separate patch.

--b.

>
> --b.
>
> >
> > > Also, I'd like to suggest that the '1/3' heuristic be change to 1/16.
> > > Assuming 30 slots get handed out normally (which my testing shows -
> > > about 2k each, with an upper limit of 64k):
> > > When 90 slots left, we hand out
> > > 30 (now 60 left)
> > > 20 (now 40 left)
> > > 13 (now 27 left)
> > > 9 (now 18 left)
> > > 6 (now 12 left)
> > > 4 (now 8 left)
> > > 2 (now 6 left)
> > > 2 (now 4 left)
> > > 1
> > > 1
> > > 1
> > > 1
> > > which is a rapid decline as clients are added.
> > > With 16, we hand out 30 at a time until 480 slots are left (30Meg)
> > > then: 30 28 26 24 23 21 20 19 18 6 15 15 14 13 12 11 10 10 9 9 8 8 7 7
> > > 6 6 5 5 5 5 4 4 4 3 3 3 3 3 3 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1
> > > 1 1
> > > slots per session
> > >
> > > Am I convincing?
> > >
> > > To make it more concrete: this is what I'm thinking of. Which bits do
> > > you like?
> >
> > Except for the error return, it looks good to me.
> >
> > --b.
> >
> > >
> > > Thanks,
> > > NeilBrown
> > >
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 7857942c5ca6..5d11ceaee998 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -1573,11 +1573,15 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
> > > total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
> > > avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
> > > /*
> > > - * Never use more than a third of the remaining memory,
> > > + * Never use more than a 1/16 of the remaining memory,
> > > * unless it's the only way to give this client a slot:
> > > */
> > > - avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
> > > + avail = clamp_t(unsigned long, avail, slotsize, total_avail/16);
> > > num = min_t(int, num, avail / slotsize);
> > > + if (nfsd_drc_mem_used + num * slotsize > nfsd_drc_max_mem)
> > > + /* Completely out of space - sorry */
> > > + num = 0;
> > > +
> > > nfsd_drc_mem_used += num * slotsize;
> > > spin_unlock(&nfsd_drc_lock);
> > >
> > > @@ -3172,7 +3176,7 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
> > > */
> > > ca->maxreqs = nfsd4_get_drc_mem(ca);
> > > if (!ca->maxreqs)
> > > - return nfserr_jukebox;
> > > + return nfserr_resource;
> > >
> > > return nfs_ok;
> > > }
> >
> >

2019-09-20 17:09:49

by NeilBrown

[permalink] [raw]

Subject: [PATCH 1/2] nfsd: handle drc over-allocation gracefully.

Since commit de766e570413 ("nfsd: give out fewer session slots as
limit approaches") nfsd4_get_drc_mem() ensures 'avail' is always at
least 'slotsize', so the return value 'num' is always at least 1.

This means that:
1/ nfsd_drc_mem_used could exceed nfsd_drc_max_mem, which becomes
a problem when we later perform an unsigned subtraction of
the larger from the smaller to calculate 'total_avail'.
2/ Check the return value of nfsd4_get_drc_mem() against
zero is pointless - the error code (which isn't actually correct
according to RFC-5661) is never returned.

So avoid the integer overflow, and discard the check.

Signed-off-by: NeilBrown <[email protected]>
---
fs/nfsd/nfs4state.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7857942c5ca6..8c09ac49fff2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1570,11 +1570,21 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
unsigned long avail, total_avail;

spin_lock(&nfsd_drc_lock);
- total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
+ if (nfsd_drc_max_mem > nfsd_drc_mem_used)
+ total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
+ else
+ /* We have handed out more space than we chose in
+ * set_max_drc() to allow. That isn't really a
+ * problem as long as that doesn't make us thing we
+ * have lots more due to integer overflow.
+ */
+ total_avail = 0;
avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
/*
* Never use more than a third of the remaining memory,
* unless it's the only way to give this client a slot:
+ * Note that we always return at least 1 here, even if
+ * we have exceeded nfsd_drc_max_mem.
*/
avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
num = min_t(int, num, avail / slotsize);
@@ -3169,10 +3179,10 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
* performance. When short on memory we therefore prefer to
* decrease number of slots instead of their size. Clients that
* request larger slots than they need will get poor results:
+ * Note that we always allow at least one slot, because our
+ * accounting is soft and provides no guarantees either way.
*/
ca->maxreqs = nfsd4_get_drc_mem(ca);
- if (!ca->maxreqs)
- return nfserr_jukebox;

return nfs_ok;
}
--
2.23.0

Attachments:

signature.asc (847.00 B)

2019-09-20 17:14:30

by NeilBrown

[permalink] [raw]

Subject: [PATCH 1/2 - vers2] nfsd: handle drc over-allocation gracefully.

Currently, if there are more clients than allowed for by the
space allocation in set_max_drc(), we fail a SESSION_CREATE
request with NFS4ERR_DELAY.
This means that the client retries indefinitely, which isn't
a user-friendly response.

The RFC requires NFS4ERR_NOSPC, but that would at best result in a
clean failure on the client, which is not much more friendly.

The current space allocation is a best-guess and doesn't provide any
guarantees, we could still run out of space when trying to allocate
drc space.

So fail more gracefully - always give out at least one slot.
If all clients used all the space in all slots, we might start getting
memory pressure, but that is possible anyway.

So ensure 'num' is always at least 1, and remove the test for it
being zero.

Signed-off-by: NeilBrown <[email protected]>
---
fs/nfsd/nfs4state.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)

Sorry, but I was confused about clamp_t().
If the 'max' is smaller than the 'min', then the 'max' wins, so
'avail' will never be more than total_avail/3.
I might have believed the comment here instead of the code there.

So the current code cannot overflow, but I think we agree that
failure is not good. So this patch avoids failure.
NeilBrown

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7857942c5ca6..084a30d77359 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1570,14 +1570,25 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
unsigned long avail, total_avail;

spin_lock(&nfsd_drc_lock);
- total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
+ if (nfsd_drc_max_mem > nfsd_drc_mem_used)
+ total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
+ else
+ /* We have handed out more space than we chose in
+ * set_max_drc() to allow. That isn't really a
+ * problem as long as that doesn't make us thing we
+ * have lots more due to integer overflow.
+ */
+ total_avail = 0;
avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
/*
* Never use more than a third of the remaining memory,
- * unless it's the only way to give this client a slot:
+ * unless it's the only way to give this client a slot.
+ * Give the client one slot even that would require
+ * over-allocation, it is better than failure.
*/
avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
num = min_t(int, num, avail / slotsize);
+ num = max_t(int, num, 1);
nfsd_drc_mem_used += num * slotsize;
spin_unlock(&nfsd_drc_lock);

@@ -3169,10 +3180,10 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
* performance. When short on memory we therefore prefer to
* decrease number of slots instead of their size. Clients that
* request larger slots than they need will get poor results:
+ * Note that we always allow at least one slot, because our
+ * accounting is soft and provides no guarantees either way.
*/
ca->maxreqs = nfsd4_get_drc_mem(ca);
- if (!ca->maxreqs)
- return nfserr_jukebox;

return nfs_ok;
}
--
2.23.0

Attachments:

signature.asc (847.00 B)

2019-09-20 17:15:27

by NeilBrown

[permalink] [raw]

Subject: [PATCH - 2/2] nfsd: degraded slot-count more gracefully as allocation nears exhaustion.

This original code in nfsd4_get_drc_mem() would hand out 30
slots (approximately NFSD_MAX_MEM_PER_SESSION bytes at slightly
over 2K per slot) to each requesting client until it ran out
of space, then it would possibly give one last client a reduced
allocation, then fail the allocation.

Since commit de766e570413 ("nfsd: give out fewer session slots as
limit approaches") the last 90 slots to be given to about 12
clients with quickly reducing slot counts (better than just 3
clients). This still seems unnecessarily hasty.
A subsequent patch allows over-allocation so every client gets
at least one slot, but that might be a bit restrictive.

The requested number of nfsd threads is the best guide we have to the
expected number of clients, so use that - if it is at least 8.

256 threads on a 256Meg machine - which is a lot for a tiny machine -
would result in nfsd_drc_max_mem being 2Meg, so 8K (3 slots) would be
available for the first client, and over 200 clients would get more
than 1 slot. So I don't think this change will be too debilitating on
poorly configured machines, though it does mean that a sensible
configuration is a little more important.

Signed-off-by: NeilBrown <[email protected]>
---
fs/nfsd/nfs4state.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 084a30d77359..c49af2ba3924 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1563,11 +1563,12 @@ static inline u32 slot_bytes(struct nfsd4_channel_attrs *ca)
* re-negotiate active sessions and reduce their slot usage to make
* room for new connections. For now we just fail the create session.
*/
-static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
+static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca, struct nfsd_net *nn)
{
u32 slotsize = slot_bytes(ca);
u32 num = ca->maxreqs;
unsigned long avail, total_avail;
+ unsigned int scale_factor;

spin_lock(&nfsd_drc_lock);
if (nfsd_drc_max_mem > nfsd_drc_mem_used)
@@ -1581,12 +1582,18 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
total_avail = 0;
avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
/*
- * Never use more than a third of the remaining memory,
+ * Never use more than a fraction of the remaining memory,
* unless it's the only way to give this client a slot.
+ * The chosen fraction is either 1/8 or 1/number of threads,
+ * whichever is smaller. This ensures there are adequate
+ * slots to support multiple clients per thread.
* Give the client one slot even that would require
* over-allocation, it is better than failure.
*/
- avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
+ scale_factor = max_t(unsigned int, 8, nn->nfsd_serv->sv_nrthreads);
+
+ avail = clamp_t(unsigned long, avail, slotsize,
+ total_avail/scale_factor);
num = min_t(int, num, avail / slotsize);
num = max_t(int, num, 1);
nfsd_drc_mem_used += num * slotsize;
@@ -3183,7 +3190,7 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
* Note that we always allow at least one slot, because our
* accounting is soft and provides no guarantees either way.
*/
- ca->maxreqs = nfsd4_get_drc_mem(ca);
+ ca->maxreqs = nfsd4_get_drc_mem(ca, nn);

return nfs_ok;
}
--
2.23.0

Attachments:

signature.asc (847.00 B)

2019-09-22 18:54:15

by J. Bruce Fields

[permalink] [raw]

Subject: Re: [PATCH 1/2 - vers2] nfsd: handle drc over-allocation gracefully.

On Fri, Sep 20, 2019 at 04:33:16PM +1000, NeilBrown wrote:
>
> Currently, if there are more clients than allowed for by the
> space allocation in set_max_drc(), we fail a SESSION_CREATE
> request with NFS4ERR_DELAY.
> This means that the client retries indefinitely, which isn't
> a user-friendly response.
>
> The RFC requires NFS4ERR_NOSPC, but that would at best result in a
> clean failure on the client, which is not much more friendly.
>
> The current space allocation is a best-guess and doesn't provide any
> guarantees, we could still run out of space when trying to allocate
> drc space.
>
> So fail more gracefully - always give out at least one slot.
> If all clients used all the space in all slots, we might start getting
> memory pressure, but that is possible anyway.
>
> So ensure 'num' is always at least 1, and remove the test for it
> being zero.
>
> Signed-off-by: NeilBrown <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 19 +++++++++++++++----
> 1 file changed, 15 insertions(+), 4 deletions(-)
>
> Sorry, but I was confused about clamp_t().
> If the 'max' is smaller than the 'min', then the 'max' wins, so
> 'avail' will never be more than total_avail/3.
> I might have believed the comment here instead of the code there.
>
> So the current code cannot overflow, but I think we agree that
> failure is not good. So this patch avoids failure.

Ah, got it, looks good. Applying for 5.4.

--b.

> NeilBrown
>
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 7857942c5ca6..084a30d77359 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1570,14 +1570,25 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
> unsigned long avail, total_avail;
>
> spin_lock(&nfsd_drc_lock);
> - total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
> + if (nfsd_drc_max_mem > nfsd_drc_mem_used)
> + total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
> + else
> + /* We have handed out more space than we chose in
> + * set_max_drc() to allow. That isn't really a
> + * problem as long as that doesn't make us thing we
> + * have lots more due to integer overflow.
> + */
> + total_avail = 0;
> avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
> /*
> * Never use more than a third of the remaining memory,
> - * unless it's the only way to give this client a slot:
> + * unless it's the only way to give this client a slot.
> + * Give the client one slot even that would require
> + * over-allocation, it is better than failure.
> */
> avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
> num = min_t(int, num, avail / slotsize);
> + num = max_t(int, num, 1);
> nfsd_drc_mem_used += num * slotsize;
> spin_unlock(&nfsd_drc_lock);
>
> @@ -3169,10 +3180,10 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
> * performance. When short on memory we therefore prefer to
> * decrease number of slots instead of their size. Clients that
> * request larger slots than they need will get poor results:
> + * Note that we always allow at least one slot, because our
> + * accounting is soft and provides no guarantees either way.
> */
> ca->maxreqs = nfsd4_get_drc_mem(ca);
> - if (!ca->maxreqs)
> - return nfserr_jukebox;
>
> return nfs_ok;
> }
> --
> 2.23.0
>