2008-03-07 19:48:43

by Kyle Rose

Subject: READDIRPLUS max mount option

I have a very specific use for an NFS mount over a WAN, and allowing for
much larger expected READDIRPLUS requests actually improves performance
by at least a factor of 10 by eliminating the round-trip latency that
results from the application's single-threaded
readdir/stat/stat/stat/... behavior. Rather than maintain a hacked
kernel on my end, I'd rather the READDIRPLUS limit be a mount option.
Hence, the following patch. It defaults to the old behavior
(8*PAGE_SIZE), but with a properly-prepared mount binary will allow the
client to specify a limit.

I'm not subscribed to the list, so please CC me in any relevant discussion.

Kyle


Attachments:
readdirplusmax.patch (9.83 kB)
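
The access pattern described above (one readdir pass, then one stat per entry) can be sketched in user space as follows. This is only an illustration, not the application from the post: the function name scan_dir_with_stat is hypothetical, and whether each stat turns into a network round trip depends on whether the NFS client already has the attributes cached, which is what READDIRPLUS is meant to arrange.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <fcntl.h>      /* AT_SYMLINK_NOFOLLOW */
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: walk one directory in a single thread and stat
 * every entry, the way the application in the post is described.
 * Returns the number of successful stat calls, or -1 if the directory
 * cannot be opened.
 */
long scan_dir_with_stat(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long stat_calls = 0;
    struct dirent *ent;

    while ((ent = readdir(dir)) != NULL) {
        struct stat st;

        /* Skip the self and parent entries. */
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;

        /*
         * On an NFS mount, each of these calls may cost one round trip
         * if the attributes are not already cached on the client.
         */
        if (fstatat(dirfd(dir), ent->d_name, &st, AT_SYMLINK_NOFOLLOW) == 0)
            stat_calls++;
    }

    closedir(dir);
    return stat_calls;
}
```

On a high-latency link, the serial loop above is dominated by per-entry latency whenever the stat calls miss the attribute cache, which is consistent with the speedup reported in the post.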

2008-03-07 19:59:54

by Trond Myklebust

Subject: Re: READDIRPLUS max mount option


On Fri, 2008-03-07 at 14:37 -0500, Kyle Rose wrote:
> I have a very specific use for an NFS mount over a WAN, and allowing for
> much larger expected READDIRPLUS requests actually improves performance
> by at least a factor of 10 by eliminating the round-trip latency that
> results from the application's single-threaded
> readdir/stat/stat/stat/... behavior. Rather than maintain a hacked
> kernel on my end, I'd rather the READDIRPLUS limit be a mount option.
> Hence, the following patch. It defaults to the old behavior
> (8*PAGE_SIZE), but with a properly-prepared mount binary will allow the
> client to specify a limit.
>
> I'm not subscribed to the list, so please CC me in any relevant discussion.
>
> Kyle

(adding cc to [email protected])

The binary mount format is frozen forever, so the changes to nfs_mount.h
and nfs4_mount.h are definitely NACKed.

Otherwise, it would be nice to know why this absolutely has to be made a
mount option rather than just having a system-wide option (either a
module/boot parameter or a sysctl) to control the behaviour of all
mounts.

Cheers
Trond

2008-03-07 20:09:35

by Kyle Rose

Subject: Re: READDIRPLUS max mount option


> The binary mount format is frozen forever, so the changes to nfs_mount.h
> and nfs4_mount.h are definitely NACKed.
>
Ah. :-) So is there no way to add mount options, or is there a
different mechanism today?

> Otherwise, it would be nice to know why this absolutely has to be made a
> mount option rather than just having a system-wide option (either a
> module/boot parameter or a sysctl) to control the behaviour of all
> mounts.
>
I mount multiple remote file systems. Only one of them I own, so I'm
willing to potentially hammer it with huge READDIRPLUS requests, while
the others probably deserve more benign behavior. ;-)

In general, I think having system-wide defaults somewhere in proc is
helpful---and certainly superior to a constant in the source---but there
should really be mount-specific overrides wherever the system-wide
default might not be globally appropriate.

Kyle
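
The "system-wide default with a per-mount override" arrangement argued for above could look roughly like this. It is a sketch of the idea only: the struct, the field name rdplus_max, and the convention that 0 means "not set" are all hypothetical and do not come from the patch or from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical convention: 0 means the mount did not specify a limit. */
#define RDPLUS_UNSET 0

/* Hypothetical per-mount settings; not a real kernel structure. */
struct mount_settings {
    size_t rdplus_max;  /* per-mount READDIRPLUS limit in bytes, or RDPLUS_UNSET */
};

/*
 * Pick the limit for one mount: use the mount's own value if it set one,
 * otherwise fall back to the system-wide default.
 */
size_t effective_rdplus_limit(const struct mount_settings *m,
                              size_t system_default)
{
    assert(m != NULL);
    return m->rdplus_max != RDPLUS_UNSET ? m->rdplus_max : system_default;
}
```

With this shape, a mount that sets nothing behaves exactly as the global setting dictates, and only the one file system the poster owns would carry a larger value.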

2008-03-07 20:43:08

by Trond Myklebust

Subject: Re: READDIRPLUS max mount option


On Fri, 2008-03-07 at 15:09 -0500, Kyle Rose wrote:
> > The binary mount format is frozen forever, so the changes to nfs_mount.h
> > and nfs4_mount.h are definitely NACKed.
> >
> Ah. :-) So is there no way to add mount options, or is there a
> different mechanism today?

Newer versions of 'mount' should use the text-based interface.

> > Otherwise, it would be nice to know why this absolutely has to be made a
> > mount option rather than just having a system-wide option (either a
> > module/boot parameter or a sysctl) to control the behaviour of all
> > mounts.
> >
> I mount multiple remote file systems. Only one of them I own, so I'm
> willing to potentially hammer it with huge READDIRPLUS requests, while
> the others probably deserve more benign behavior. ;-)

The size of the actual READDIRPLUS requests is completely unaffected by
your patch. Your change actually means that the client will continue to
use READDIRPLUS on very large directories instead of falling back to
readdir.
The reason for falling back to readdir is that the value of readdirplus
tends to decrease with larger directories as the cost of caching all
those dentries, attributes and filehandles both on the client and the
server goes up.

If you want a faster readdir(), you will find that splitting those huge
directories up into smaller subdirs is an alternative solution that
tends to scale much better on both client and server.

> In general, I think having system-wide defaults somewhere in proc is
> helpful---and certainly superior to a constant in the source---but there
> should really be mount-specific overrides wherever the system-wide
> default might not be globally appropriate.

Having hundreds of mount options for minor tweaks is not an acceptable
practice. Each mount option needs to be abundantly justified.

Since we're talking about what is really a quite arbitrary limit, I can
certainly see an argument for why we might want a way to change it, but
I'm still not convinced that we need to be setting this parameter at the
mountpoint level.

Cheers
Trond
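
The fallback described above, where the limit decides whether READDIRPLUS is used at all rather than how large each request is, can be modelled as below. This is a user-space sketch under stated assumptions: the function names are hypothetical, and it assumes the limit is compared against the directory's reported size in bytes with an inclusive bound. Only the default of 8*PAGE_SIZE comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Default threshold as stated in the thread: 8 * PAGE_SIZE. */
size_t default_rdplus_limit(size_t page_size)
{
    assert(page_size > 0);
    return 8 * page_size;
}

/*
 * Hypothetical decision function: use READDIRPLUS only when the server
 * supports it and the directory is no larger than the limit. Larger
 * directories fall back to plain READDIR.
 * Assumption: dir_size is the directory's reported size in bytes.
 */
bool use_readdirplus(size_t dir_size, size_t limit, bool server_supports_plus)
{
    return server_supports_plus && dir_size <= limit;
}
```

Under this model, raising the limit does not change any individual request; it only widens the range of directory sizes for which the client keeps asking for attributes and filehandles along with the names.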

2008-03-07 21:04:23

by Kyle Rose

Subject: Re: READDIRPLUS max mount option


> The size of the actual READDIRPLUS requests is completely unaffected by
> your patch. Your change actually means that the client will continue to
> use READDIRPLUS on very large directories instead of falling back to
> readdir.
>
Sorry to be imprecise. "Size of request" should be "size of response"
or "cost of request". The meaning is clear, I think.

> If you want a faster readdir(), you will find that splitting those huge
> directories up into smaller subdirs is an alternative solution that
> tends to scale much better on both client and server.
>
Agreed that this is probably the least terrible of the available
solutions, but in my specific case it requires a more extensive
modification to my software than the relatively minor kernel change.

> Having hundreds of mount options for minor tweaks is not an acceptable
> practice. Each mount option needs to be abundantly justified.
>
Regarding your straw man, nobody's proposing hundreds of mount options.
I imagine the effort required to implement each one would keep such a
thing from happening. ;-)

> Since we're talking about what is really a quite arbitrary limit, I can
> certainly see an argument for why we might want a way to change it, but
> I'm still not convinced that we need to be setting this parameter at the
> mountpoint level.
>
Fair enough. A proc entry to alter this globally would be an acceptable
compromise for me, even if my local sysadmins might not like it.

Kyle

2008-03-09 16:16:48

by Jan Engelhardt

Subject: Re: READDIRPLUS max mount option


On Mar 7 2008 16:04, Kyle Rose wrote:
>> Since we're talking about what is really a quite arbitrary limit, I can
>> certainly see an argument for why we might want a way to change it, but
>> I'm still not convinced that we need to be setting this parameter at the
>> mountpoint level.
>
> Fair enough. A proc entry to alter this globally would be an acceptable
> compromise for me, even if my local sysadmins might not like it.
>
sysfs, no? A module parameter is easy for a global default and is
the minimum thing to do :)