2024-01-13 00:20:50

by Dan Shelton

Subject: Increasing NFSD_MAX_OPS_PER_COMPOUND to 96

Hello!

We've been experiencing significant nfsd performance problems with a
customer who has a deeply nested filesystem hierarchy, lots of
subdirs, some of them 60-80 dirs deep (!!), which leads to an
exponential slowdown with nfsd accesses.

Some of the issues have been addressed by implementing a better
directory walker via multiple dir fds and openat() (instead of just
cwd+open()), but the nfsd side still was a pretty dramatic issue,
until we bumped #define NFSD_MAX_OPS_PER_COMPOUND in
linux-6.7/fs/nfsd/nfsd.h from 50 to 96. After that the nfsd side
performed MUCH better.
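
For readers unfamiliar with the technique, here is a minimal sketch of an openat()-style directory walk. It is an illustration in Python, not the walker referred to above, and the name walk_with_dirfds is made up for the example. Each entry is opened relative to its parent's descriptor (Python's dir_fd= argument maps to openat()), so no lookup re-resolves the full path from the root.

```python
import os
import stat


def walk_with_dirfds(top):
    """Yield (relative_path, is_dir) for every entry below top.

    Every stat/open is relative to the parent directory's descriptor
    (openat() semantics), so deep paths are never re-walked from the root.
    """
    root_fd = os.open(top, os.O_RDONLY | os.O_DIRECTORY)
    # Stack of (open directory fd, path of that directory relative to top).
    stack = [(root_fd, "")]
    while stack:
        dir_fd, rel = stack.pop()
        try:
            for name in sorted(os.listdir(dir_fd)):
                child_rel = os.path.join(rel, name)
                st = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
                if stat.S_ISDIR(st.st_mode):
                    # One path component looked up, relative to the parent fd.
                    child_fd = os.open(
                        name,
                        os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                        dir_fd=dir_fd,
                    )
                    stack.append((child_fd, child_rel))
                    yield child_rel, True
                else:
                    yield child_rel, False
        finally:
            os.close(dir_fd)
```

The sketch keeps one descriptor open per pending directory, so a very wide tree could hit the process fd limit; Python's standard os.fwalk() offers the same dir_fd-based traversal ready-made.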

Dan
--
Dan Shelton - Cluster Specialist Win/Lin/Bsd


2024-01-13 01:32:53

by Jeffrey Layton

Subject: Re: Increasing NFSD_MAX_OPS_PER_COMPOUND to 96

On Sat, 2024-01-13 at 01:19 +0100, Dan Shelton wrote:
> Hello!
>
> We've been experiencing significant nfsd performance problems with a
> customer who has a deeply nested filesystem hierarchy, lots of
> subdirs, some of them 60-80 dirs deep (!!), which leads to an
> exponential slowdown with nfsd accesses.
>
> Some of the issues have been addressed by implementing a better
> directory walker via multiple dir fds and openat() (instead of just
> cwd+open()), but the nfsd side still was a pretty dramatic issue,
> until we bumped #define NFSD_MAX_OPS_PER_COMPOUND in
> linux-6.7/fs/nfsd/nfsd.h from 50 to 96. After that the nfsd side
> performed MUCH better.
>

I guess your clients are trying to do a long pathwalk in a single
COMPOUND? Is this the windows client?

At first glance, I don't see any real downside to increasing that value.
Maybe we can bump it to 100 or so? What would probably be best is to
propose a patch so we can discuss the change formally.

Cheers,
--
Jeff Layton <[email protected]>

2024-01-13 01:47:56

by Dan Shelton

Subject: Re: Increasing NFSD_MAX_OPS_PER_COMPOUND to 96

On Sat, 13 Jan 2024 at 02:32, Jeff Layton <[email protected]> wrote:
>
> On Sat, 2024-01-13 at 01:19 +0100, Dan Shelton wrote:
> > Hello!
> >
> > We've been experiencing significant nfsd performance problems with a
> > customer who has a deeply nested filesystem hierarchy, lots of
> > subdirs, some of them 60-80 dirs deep (!!), which leads to an
> > > exponential slowdown with nfsd accesses.
> >
> > Some of the issues have been addressed by implementing a better
> > directory walker via multiple dir fds and openat() (instead of just
> > cwd+open()), but the nfsd side still was a pretty dramatic issue,
> > until we bumped #define NFSD_MAX_OPS_PER_COMPOUND in
> > > linux-6.7/fs/nfsd/nfsd.h from 50 to 96. After that the nfsd side
> > > performed MUCH better.
> >
>
> I guess your clients are trying to do a long pathwalk in a single
> COMPOUND?

Likely.

> Is this the windows client?

No, clients are Solaris 11, Linux and FreeBSD

>
> At first glance, I don't see any real downside to increasing that value.
> Maybe we can bump it to 100 or so? What would probably be best is to
> propose a patch so we can discuss the change formally.

OK. How does this work?

Dan
--
Dan Shelton - Cluster Specialist Win/Lin/Bsd

2024-01-13 01:53:21

by Chuck Lever

Subject: Re: Increasing NFSD_MAX_OPS_PER_COMPOUND to 96



> On Jan 12, 2024, at 8:47 PM, Dan Shelton <[email protected]> wrote:
>
> On Sat, 13 Jan 2024 at 02:32, Jeff Layton <[email protected]> wrote:
>>
>> On Sat, 2024-01-13 at 01:19 +0100, Dan Shelton wrote:
>>> Hello!
>>>
>>> We've been experiencing significant nfsd performance problems with a
>>> customer who has a deeply nested filesystem hierarchy, lots of
>>> subdirs, some of them 60-80 dirs deep (!!), which leads to an
>>> exponential slowdown with nfsd accesses.
>>>
>>> Some of the issues have been addressed by implementing a better
>>> directory walker via multiple dir fds and openat() (instead of just
>>> cwd+open()), but the nfsd side still was a pretty dramatic issue,
>>> until we bumped #define NFSD_MAX_OPS_PER_COMPOUND in
>>> linux-6.7/fs/nfsd/nfsd.h from 50 to 96. After that the nfsd side
>>> performed MUCH better.
>>>
>>
>> I guess your clients are trying to do a long pathwalk in a single
>> COMPOUND?
>
> Likely.

That's known bad client behavior, btw. It won't scale
in the number of path components.


>> Is this the windows client?
>
> No, clients are Solaris 11, Linux and FreeBSD

Solaris 11 is known to send COMPOUNDs that are too large
during mount, but the rest of the time these three client
implementations are not known to send large COMPOUNDs.


>> At first glance, I don't see any real downside to increasing that value.
>> Maybe we can bump it to 100 or so? What would probably be best is to
>> propose a patch so we can discuss the change formally.
>
> OK. How does this work?

Let's back up a minute.

I'd like to see raw packet captures with the current
MAX_OPS setting and the new larger one. Something is not
adding up.


--
Chuck Lever


2024-01-13 02:40:33

by Rick Macklem

Subject: Re: Increasing NFSD_MAX_OPS_PER_COMPOUND to 96

On Fri, Jan 12, 2024 at 5:53 PM Chuck Lever III <[email protected]> wrote:
>
>
>
> > On Jan 12, 2024, at 8:47 PM, Dan Shelton <[email protected]> wrote:
> >
> > On Sat, 13 Jan 2024 at 02:32, Jeff Layton <[email protected]> wrote:
> >>
> >> On Sat, 2024-01-13 at 01:19 +0100, Dan Shelton wrote:
> >>> Hello!
> >>>
> >>> We've been experiencing significant nfsd performance problems with a
> >>> customer who has a deeply nested filesystem hierarchy, lots of
> >>> subdirs, some of them 60-80 dirs deep (!!), which leads to an
> >>> exponential slowdown with nfsd accesses.
> >>>
> >>> Some of the issues have been addressed by implementing a better
> >>> directory walker via multiple dir fds and openat() (instead of just
> >>> cwd+open()), but the nfsd side still was a pretty dramatic issue,
> >>> until we bumped #define NFSD_MAX_OPS_PER_COMPOUND in
> >>> linux-6.7/fs/nfsd/nfsd.h from 50 to 96. After that the nfsd side
> >>> performed MUCH better.
> >>>
> >>
> >> I guess your clients are trying to do a long pathwalk in a single
> >> COMPOUND?
> >
> > Likely.
>
> That's known bad client behavior, btw. It won't scale
> in the number of path components.
>
>
> >> Is this the windows client?
> >
> > > No, clients are Solaris 11, Linux and FreeBSD
>
> Solaris 11 is known to send COMPOUNDs that are too large
> during mount, but the rest of the time these three client
> implementations are not known to send large COMPOUNDs.

Actually the FreeBSD client is the same as Solaris, in that it does the
entire mount path in one compound. If you were to attempt a mount
with more than 48 components, it would exceed 50 ops in the compound.
I don't think it can exceed 50 ops any other way.
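
To make the arithmetic concrete, here is a rough sketch. It assumes one LOOKUP per path component plus two fixed operations around the walk; that overhead of 2 is inferred from the 48-component figure above, not taken from the client source, and the function names are illustrative only.

```python
FIXED_OPS = 2  # assumed overhead, inferred from the 48-component figure


def mount_compound_ops(path, fixed_ops=FIXED_OPS):
    """Estimate the ops in a single-COMPOUND walk: one LOOKUP per component."""
    components = [c for c in path.split("/") if c]
    return len(components) + fixed_ops


def fits_in_one_compound(path, max_ops=50):
    """True if the whole walk stays within the server's per-COMPOUND limit."""
    return mount_compound_ops(path) <= max_ops
```

Under that assumption a 48-component path needs exactly 50 ops and still fits, a 49-component path needs 51 and does not, and an 80-component path (82 ops) only fits once the limit is raised to 96.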

rick

>
>
> >> At first glance, I don't see any real downside to increasing that value.
> >> Maybe we can bump it to 100 or so? What would probably be best is to
> >> propose a patch so we can discuss the change formally.
> >
> > OK. How does this work?
>
> Let's back up a minute.
>
> I'd like to see raw packet captures with the current
> MAX_OPS setting and the new larger one. Something is not
> adding up.
>
>
> --
> Chuck Lever
>
>

2024-01-13 07:20:26

by Roland Mainz

Subject: Re: Increasing NFSD_MAX_OPS_PER_COMPOUND to 96

On Sat, Jan 13, 2024 at 2:32 AM Jeff Layton <[email protected]> wrote:
> On Sat, 2024-01-13 at 01:19 +0100, Dan Shelton wrote:
> > We've been experiencing significant nfsd performance problems with a
> > customer who has a deeply nested filesystem hierarchy, lots of
> > subdirs, some of them 60-80 dirs deep (!!), which leads to an
> > exponential slowdown with nfsd accesses.
> >
> > Some of the issues have been addressed by implementing a better
> > directory walker via multiple dir fds and openat() (instead of just
> > cwd+open()), but the nfsd side still was a pretty dramatic issue,
> > until we bumped #define NFSD_MAX_OPS_PER_COMPOUND in
> > linux-6.7/fs/nfsd/nfsd.h from 50 to 96. After that the nfsd side
> > performed MUCH better.
>
> I guess your clients are trying to do a long pathwalk in a single
> COMPOUND?

Is there a problem with that (assuming NFSv4.1 session limits are honored)?

> Is this the windows client?

No, the ms-nfs41-client (see
https://github.com/kofemann/ms-nfs41-client) uses a limit of |16|, but
it is on our ToDo list to bump that to |128| (but honoring the limit
set by the NFSv4.1 server during session negotiation) since it now
supports very long paths ([1]) and this issue is a known performance
bottleneck.

[1]=Windows 10 version 1607 and later allow paths longer than the infamous
|MAX_PATH|, see https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry

> At first glance, I don't see any real downside to increasing that value.
> Maybe we can bump it to 100 or so? What would probably be best is to
> propose a patch so we can discuss the change formally.

AFAIK the limit in both Solaris 11 and Illumos (see
https://github.com/racktopsystems/illumos-gate/commit/27e4199512d9d7b1e5409904f13cd96d8a05ee6e)
is |NFS4_COMPOUND_LIMIT| (=|2048|), in both cases capped by the
limits negotiated at NFSv4.1 session creation, if I recall
correctly...
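
A minimal sketch of that capping, with made-up names: the client uses the smaller of its own built-in limit and the per-request operation limit the server returned at session creation (ca_maxoperations in the NFSv4.1 channel attributes), then splits a long walk into several COMPOUNDs. The per-COMPOUND overhead of 3 non-LOOKUP ops is an assumption for illustration, not a figure from any client.

```python
def effective_max_ops(client_limit, server_maxops):
    """Ops per COMPOUND the client may send: the smaller of both limits."""
    return min(client_limit, server_maxops)


def split_lookups(components, max_ops, overhead=3):
    """Split path components into chunks that each fit one COMPOUND.

    overhead is the assumed number of non-LOOKUP ops per COMPOUND.
    """
    per_compound = max_ops - overhead
    if per_compound < 1:
        raise ValueError("max_ops too small to carry any LOOKUP")
    return [
        components[i:i + per_compound]
        for i in range(0, len(components), per_compound)
    ]
```

With these assumptions, an 80-component walk against a server limit of 50 takes two COMPOUNDs for a client whose own limit is 2048, and seven for a client limited to 16.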

----

Bye,
Roland
--
__ . . __
(o.\ \/ /.o) [email protected]
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)