2023-08-31 22:42:59

by Jeffrey Layton

Subject: Re: [PATCH v2] NFSv4: Always ask for type with READDIR

On Thu, 2023-08-31 at 20:41 +0200, Cedric Blancher wrote:
> On Thu, 31 Aug 2023 at 02:17, Jeff Layton <[email protected]> wrote:
> >
> > On Wed, 2023-08-30 at 20:20 +0000, Trond Myklebust wrote:
> > > On Wed, 2023-08-30 at 16:10 -0400, Jeff Layton wrote:
> > > > On Wed, 2023-08-30 at 15:42 -0400, Benjamin Coddington wrote:
> > > > > Again we have claimed regressions for walking a directory tree, this time
> > > > > with the "find" utility which always tries to optimize away asking for any
> > > > > attributes until it has a complete list of entries. This behavior makes
> > > > > the readdir plus heuristic do the wrong thing, which causes a storm of
> > > > > GETATTRs to determine each entry's type in order to continue the walk.
> > > > >
> > > > > For v4 add the type attribute to each READDIR request to include it no
> > > > > matter the heuristic. This allows a simple `find` command to proceed
> > > > > quickly through a directory tree.
> > > > >
> > > >
> > > > The important bit here is that with v4, we can fill out d_type even when
> > > > "plus" is false, at little cost. The downside is that non-plus READDIR
> > > > replies will now be a bit larger on the wire. I think it's a worthwhile
> > > > tradeoff though.
> > >
> > > The reason why we never did it before is that for many servers, it
> > > forces them to go to the inode in order to retrieve the information.
> > >
> > > IOW: You might as well just do readdirplus.
> > >
> >
> > That makes total sense, given how this code has evolved.
> >
> > FWIW, the Linux NFS server already calls vfs_getattr for every dentry in
> > a v4 READDIR reply regardless of what the client requests. It has to in
> > order to detect junctions, so we're bringing in the inode no matter
> > what. Fetching the type is trivial, so I don't see this as costing
> > anything extra there.
> >
> > Mileage could vary on other servers with more synthetic filesystems, but
> > one would hope that most of them can also return the type cheaply.
>
> Do you have examples for such synthetic filesystems?
>

Synthetic is probably the wrong distinction here, actually.

If looking up the inode type info is expensive, then you'll feel it here
more with this change. That's true regardless of whether this is a
"normal" or "synthetic" fs.

I wouldn't expect a big performance hit from the Linux NFS server given
that we'll almost certainly have that info in-core, but other servers
(ganesha? some commercial servers?) could take a hit here.
--
Jeff Layton <[email protected]>
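
For readers following the d_type discussion above, here is a minimal sketch
of how a tree walker might use the type returned by readdir() and fall back
to a stat call only when the type is unknown. It assumes a Linux or BSD libc
whose "struct dirent" carries d_type (this field is not required by POSIX).
The helpers entry_is_dir() and count_subdirs() are hypothetical names made up
for illustration; this is not code from find(1) or from the patch. On an NFS
mount, each fallback stat is the kind of call that can turn into a GETATTR
when the client has no cached attributes for the entry.

```c
#define _DEFAULT_SOURCE   /* expose d_type and DT_* on glibc */
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>        /* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if the entry is a directory, 0 if it is
 * not, and -1 on error. *needed_stat is set to 1 when d_type was unknown
 * and a per-entry stat call was required.
 */
int entry_is_dir(int dfd, const struct dirent *de, int *needed_stat)
{
    struct stat st;

    assert(de != NULL && needed_stat != NULL);
    *needed_stat = 0;

    /* Fast path: the directory read already told us the type. */
    if (de->d_type != DT_UNKNOWN)
        return de->d_type == DT_DIR;

    /* Slow path: one extra stat per entry. */
    *needed_stat = 1;
    if (fstatat(dfd, de->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}

/*
 * Hypothetical helper: counts the subdirectories of `path` and reports in
 * *stat_calls how many entries needed the slow path. Returns -1 if the
 * directory cannot be opened.
 */
int count_subdirs(const char *path, int *stat_calls)
{
    DIR *dir;
    struct dirent *de;
    int ndirs = 0;

    assert(path != NULL && stat_calls != NULL);
    *stat_calls = 0;

    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL) {
        int needed_stat;

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (entry_is_dir(dirfd(dir), de, &needed_stat) == 1)
            ndirs++;
        *stat_calls += needed_stat;
    }
    closedir(dir);
    return ndirs;
}
```

Calling count_subdirs() on a directory and looking at *stat_calls gives a
rough idea of how many entries came back without a type on that filesystem.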


2023-09-04 05:45:38

by Rick Macklem

Subject: Re: [PATCH v2] NFSv4: Always ask for type with READDIR

On Thu, Aug 31, 2023 at 11:53 AM Jeff Layton <[email protected]> wrote:
>
>
> Synthetic is probably the wrong distinction here, actually.
>
> If looking up the inode type info is expensive, then you'll feel it here
> more with this change. That's true regardless of whether this is a
> "normal" or "synthetic" fs.

In case you are interested in an outsider's perspective...
I recently patched the FreeBSD server so that it does not need to
acquire a vnode to generate a Readdir reply when the entry is not a
directory and only the following attributes are requested:
  RDAttr_error, Mounted_on_FileID, FileID, Type
(FreeBSD has a d_type field in its "struct dirent".)
--> Adding a requirement for Type to the nordirplus case would not
have any negative effect on the FreeBSD server.

In some simple measurements I did using the FreeBSD client, this patch
improved Readdir RPC response time by about 5% for Readdirs asking only
for the above attributes.

I still need to acquire the vnode for directories, to check for
server file system mount points. I do not know whether what you
refer to as "junctions" are directory specific.

rick

>
> I wouldn't expect a big performance hit from the Linux NFS server given
> that we'll almost certainly have that info in-core, but other servers
> (ganesha? some commercial servers?) could take a hit here.
> --
> Jeff Layton <[email protected]>
>