2003-11-25 17:34:31

by James Pearson

Subject: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

Is it possible to force a different maximum [rw]size on the server
depending on whether the client uses udp or tcp?

Currently, when a client NFS mounts a file system without specifying a
[rw]size, it defaults to whatever NFSSVC_MAXBLKSIZE was defined as (at
compile time) on the server. However, this is the same for udp and tcp
clients.

What I would like is to have a smaller default [rw]size for udp clients,
but allow tcp clients to use 32k.

Is this possible?

Thanks

James Pearson


-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive? Does it
help you create better code? SHARE THE LOVE, and help us help
YOU! Click Here: http://sourceforge.net/donate/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-11-26 01:08:48

by Greg Banks

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

James Pearson wrote:
>
> Is it possible to force a different maximum [rw]size on the server
> depending on whether the client uses udp or tcp?

If someone writes the code, e.g. some new export options?

> Currently, when a client NFS mounts a file system without specifying a
> [rw]size, it defaults to whatever NFSSVC_MAXBLKSIZE was defined as (at
> compile time) on the server. However, this is the same for udp and tcp
> clients.

Also NFSSVC_MAXBLKSIZE acts as a limit even when [wr]size is specifically
set in mount options.
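
A minimal sketch of how the two rules above combine, assuming the behaviour is exactly as described in this thread: a mount with no [rw]size picks up the server's maximum, and an explicit [rw]size is capped at it. The function name and parameters are hypothetical, not kernel code.

```python
from typing import Optional


def effective_block_size(server_max: int, requested: Optional[int] = None) -> int:
    """Transfer size a mount ends up with, per the rules described in this thread.

    server_max -- the server's compile-time limit (NFSSVC_MAXBLKSIZE), in bytes
    requested  -- the client's rsize/wsize mount option, or None if not given
    """
    if server_max <= 0:
        raise ValueError("server_max must be positive")
    if requested is None:
        # No [rw]size given: the client defaults to the server's maximum.
        return server_max
    if requested <= 0:
        raise ValueError("requested must be positive")
    # Explicit [rw]size: still capped by the server's maximum.
    return min(requested, server_max)
```

With an 8K server limit, effective_block_size(8192, 32768) still gives 8192, while effective_block_size(32768, 4096) gives the client's own 4096.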

> What I would like is to have a smaller default [rw]size for udp clients,
> but allow tcp clients to use 32k.

What I would like is to allow UDP clients to use up to 48K and TCP
clients to use up to 4M. And on 2.4 kernels, ensure that MAXBLKSIZE
is at least PAGE_CACHE_SIZE.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.



2003-11-26 01:41:15

by Trond Myklebust

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

>>>>> " " == Greg Banks <[email protected]> writes:

> What I would like is to allow UDP clients to use up to 48K and
> TCP clients to use up to 4M.

Those are already on the roadmap, but haven't yet made it to the top
of the priority list...

They will never make it into 2.4.x though. The main reason is the
xdr_kmap() stuff, which doesn't really scale too well.

Cheers,
Trond



2003-11-26 01:51:36

by Greg Banks

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

Trond Myklebust wrote:
>
> >>>>> " " == Greg Banks <[email protected]> writes:
>
> > What I would like is to allow UDP clients to use up to 48K and
> > TCP clients to use up to 4M.
>
> Those are already on the roadmap, but haven't yet made it to the top
> of the priority list...

Are there preliminary patches or analysis/random thoughts somewhere online?

> They will never make it into 2.4.x though. The main reason is the
> xdr_kmap() stuff, which doesn't really scale too well.

Suits me.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.



2003-11-26 02:00:43

by Trond Myklebust

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

>>>>> " " == Greg Banks <[email protected]> writes:

>> Those are already on the roadmap, but haven't yet made it to
>> the top of the priority list...

> Are there preliminary patches or analysis/random thoughts
> somewhere online?

Most of the work should already have been done now that we've
eliminated MAX_IOVEC from the RPC code (those tests in nfs_xdr.h are
bogus and should be removed).

The remaining job is to make the 'pagevec' fields in nfs_read_data and
nfs_write_data support arbitrary size arrays.

Cheers,
Trond



2003-11-26 10:43:43

by James Pearson

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

Greg Banks wrote:
>
> James Pearson wrote:
> >
> > Is it possible to force a different maximum [rw]size on the server
> > depending on whether the client uses udp or tcp?
>
> If someone writes the code, e.g. some new export options?

Do you know if this would be possible with 2.4 kernels?

If it is possible, and someone could give me some pointers, I won't mind
giving it a go ...

James Pearson



2003-11-26 14:21:02

by Lever, Charles

Subject: RE: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

here's a vote for leaving these values variable rather than
fixed on the server side. 48KB maximum for UDP is great on
a clean network, but some folks would probably prefer setting
it to 8KB to ensure clients on more realistically congested
networks retain reasonable UDP performance.

is there any evidence to show that a 4MB TCP maximum will
have benefits over something smaller, like say 1MB ? as
the per-op costs go down when you add more data, the per-byte
costs will become the major part of the overhead for such
operations. there's probably a sweet spot where the sum
of both per-op and per-byte costs reach a minimum. as
far as i can tell that value is around 16KB in the current
Linux client implementations.
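
One way to picture the first half of that argument is a toy linear model in which each request costs a fixed per-op amount plus a per-byte amount times the request size. Both cost parameters are hypothetical placeholders, not measurements, and the model deliberately leaves out whatever makes very large requests more expensive per byte, so it shows only the diminishing return, not the sweet spot itself.

```python
def per_op_share(size_bytes: int, per_op_cost: float, per_byte_cost: float) -> float:
    """Fraction of one request's total cost that is fixed per-op overhead.

    per_op_cost and per_byte_cost are hypothetical and only need to share a unit.
    """
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    if per_op_cost < 0 or per_byte_cost <= 0:
        raise ValueError("costs must be non-negative, per_byte_cost positive")
    return per_op_cost / (per_op_cost + per_byte_cost * size_bytes)


def share_table(sizes, per_op_cost: float, per_byte_cost: float):
    """(size, per-op share) pairs for a list of candidate request sizes."""
    return [(s, per_op_share(s, per_op_cost, per_byte_cost)) for s in sizes]
```

Under this model the per-op share only ever falls as the size grows, so once it is already small, quadrupling the request size (say from 1MB to 4MB) changes the cost per byte very little.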


> -----Original Message-----
> From: James Pearson [mailto:[email protected]]
> Sent: Wednesday, November 26, 2003 5:44 AM
> To: Greg Banks; Trond Myklebust
> Cc: [email protected]
> Subject: Re: [NFS] Different NFSSVC_MAXBLKSIZE for udp and tcp clients?
>
> Greg Banks wrote:
> >
> > James Pearson wrote:
> > >
> > > Is it possible to force a different maximum [rw]size on the server
> > > depending on whether the client uses udp or tcp?
> >
> > If someone writes the code, e.g. some new export options?
>
> Do you know if this would be possible with 2.4 kernels?
>
> If it is possible, and someone could give me some pointers, I won't mind
> giving it a go ...
>
> James Pearson



2003-11-26 22:15:33

by NeilBrown

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

On Wednesday November 26, [email protected] wrote:
> Greg Banks wrote:
> >
> > James Pearson wrote:
> > >
> > > Is it possible to force a different maximum [rw]size on the server
> > > depending on whether the client uses udp or tcp?
> >
> > If someone writes the code, e.g. some new export options?
>
> Do you know if this would be possible with 2.4 kernels?
>
> If it is possible, and someone could give me some pointers, I won't mind
> giving it a go ...
>

I have yet to see the point, so presumably I am missing something....

request size is completely under the control of the client via rsize
and wsize (except that you cannot exceed the server's maximum). So
why do you want to add an option to the server? Why not just use the
already-existing-and-working option on the client?

NeilBrown



2003-11-27 00:08:17

by Greg Banks

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

"Lever, Charles" wrote:
>
> here's a vote for leaving these values variable rather than
> fixed on the server side. 48KB maximum for UDP is great on
> a clean network, but some folks would probably prefer setting
> it to 8KB to ensure clients on more realistically congested
> networks retain reasonable UDP performance.

Agreed.

> is there any evidence to show that a 4MB TCP maximum will
> have benefits over something smaller, like say 1MB ?

None at all on Linux. I mention that specific number only
because it's the hardcoded maximum on IRIX.

However there is evidence that 8K isn't enough. In particular,
it's smaller than an Altix client's PAGE_CACHE_SIZE so all writes
between two Altix boxes go synchronous.

> [...] there's probably a sweet spot where the sum
> of both per-op and per-byte costs reach a minimum.

Yes.

> as
> far as i can tell that value is around 16KB in the current
> Linux client implementations.

The value will be higher for Altix clients and servers, which
have more RAM, CPU, and larger pages than the machines you're
probably thinking of.

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.



2003-11-27 00:21:29

by Greg Banks

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

Neil Brown wrote:
>
> request size is completely under the control of the client via rsize
> and wsize (except that you cannot exceed the server's maximum).

The point is that the server's maximum is too low, at least when
the client and server are powerful machines with a clean network
between them (i.e. a significant part of SGI's customers).

When I say too low, I have two separate issues:

1. wsize<16K means an Altix client does synchronous writes, which
results in a catastrophic performance hit (writes proceed at
no more than 0.7% of line rate on gige). This is a hot issue
right now.

2. Experience with IRIX shows that (up to a point) better performance
can be achieved for some loads with larger transfer sizes. On
IRIX that point is beyond Linux' current hardcoded maximums. This
is more of a long-term issue.
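
To put the 0.7% figure from item 1 in absolute terms, here is a back-of-the-envelope conversion. It assumes a nominal 1 Gbit/s line rate and ignores Ethernet, IP and RPC framing overhead; the names are hypothetical.

```python
GIGE_BITS_PER_S = 1_000_000_000  # nominal gigabit Ethernet line rate (assumption)


def throughput_at_fraction(line_rate_bits_per_s: int, per_mille: int) -> int:
    """Bytes per second at the given fraction of line rate.

    per_mille is the fraction in thousandths, so 0.7% is 7.
    Integer arithmetic is used to avoid floating-point rounding.
    """
    if line_rate_bits_per_s < 0 or per_mille < 0:
        raise ValueError("arguments must be non-negative")
    bits_per_s = line_rate_bits_per_s * per_mille // 1000
    return bits_per_s // 8
```

throughput_at_fraction(GIGE_BITS_PER_S, 7) gives 875000, i.e. writes proceeding at well under 1 MB/s.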

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.



2003-11-27 00:36:04

by NeilBrown

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

On Thursday November 27, [email protected] wrote:
> Neil Brown wrote:
> >
> > request size is completely under the control of the client via rsize
> > and wsize (except that you cannot exceed the server's maximum).
>
> The point is that the server's maximum is too low, at least when
> the client and server are powerful machines with a clean network
> between them (i.e. a significant part of SGI's customers).

Why didn't you say so then (or did you and I missed it?).

The code in 2.4 is not ready to cope well with larger NFS packets. It
requires any nfs message to be linear in memory, so that means that a
large (multi-page) kmalloc has to succeed, and there is no guarantee
that it will.

For udp, any incoming packet that is not already 'linear' is
'linearize'd (in svc_udp_recvfrom). For a 32k write, that means a
kmalloc(32768) has to succeed, and very often free memory is too
fragmented for it to succeed.

For tcp, the request will be copied into a pre-allocated buffer. The
(large) buffer is allocated when the nfsd thread is started, and I
have several times found that I couldn't start as many nfsd threads as
I wanted because the system could not do the order-3 allocations that
were needed. (this was when I was playing with large block sizes).
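
For readers unfamiliar with allocation orders: the kernel page allocator hands out physically contiguous memory in blocks of 2**order pages. A minimal sketch of the arithmetic, assuming 4K pages by default and ignoring the RPC header space a real buffer also needs; the function name is hypothetical.

```python
def allocation_order(size_bytes: int, page_size: int = 4096) -> int:
    """Smallest order n such that page_size * 2**n >= size_bytes."""
    if size_bytes <= 0 or page_size <= 0:
        raise ValueError("size_bytes and page_size must be positive")
    pages = -(-size_bytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order
```

allocation_order(32768) is 3, i.e. eight contiguous 4K pages, which is the kind of request that fails once free memory is fragmented. With a larger page size the same byte count needs a lower order, e.g. allocation_order(32768, 16384) is 1.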

2.6 copes with page-lists and so can handle requests and replies that
are not contiguous in memory, so this is not an issue and MAXBLKSIZE
can safely be increased.

I suggest that you simply recompile the kernel with a larger
NFSSVC_MAXBLKSIZE and see how well it works.
I can see no reason for having different values for udp and tcp
though. Just make it uniformly 32768 and see how it goes.

NeilBrown



2003-11-27 00:53:17

by Trond Myklebust

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

>>>>> " " == Greg Banks <[email protected]> writes:

>> is there any evidence to show that a 4MB TCP maximum will have
>> benefits over something smaller, like say 1MB ?

> None at all on Linux. I mention that specific number only
> because it's the hardcoded maximum on IRIX.

Actually, there is plenty of evidence that this would help Linux too.

A couple of the 64-bit architectures with large page sizes would
definitely benefit, since the NFS client code currently is forced to
do synchronous reads/writes if the r/wsize is less than the page
size (yes - fixing this in the client is on the agenda too).
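
The rule described above reduces to a single comparison. A minimal sketch, assuming the client falls back to synchronous I/O exactly when the transfer size is smaller than the page size; the function name is hypothetical, and the 16K page size used in the note below is the Altix figure mentioned elsewhere in this thread.

```python
def forces_synchronous_io(xfer_size: int, page_size: int) -> bool:
    """True if an rsize/wsize of xfer_size is too small for a client
    whose page size is page_size, per the rule described above."""
    if xfer_size <= 0 or page_size <= 0:
        raise ValueError("xfer_size and page_size must be positive")
    return xfer_size < page_size
```

With 16K pages, forces_synchronous_io(8192, 16384) is True while forces_synchronous_io(16384, 16384) is False; with 4K pages an 8K transfer size is not a problem.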

Those that still want to use NFSv2, or that wish to write short files
under NFSv3 will benefit too (the latter from the fact that you can
coalesce the unstable writes + commit into 1 stable write). Ditto for
smallish O_SYNC writes.
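
A rough count of the RPCs involved in the short-file case, as a sketch only. It assumes a simplified client that sends one stable WRITE when the whole file fits in a single wsize-sized request, and otherwise sends unstable WRITEs followed by one COMMIT; a real client decides this on more than file size. The function name is hypothetical.

```python
def v3_write_rpc_count(file_size: int, wsize: int) -> int:
    """Number of RPCs needed to write and commit a file under the
    simplified NFSv3 model described above."""
    if file_size <= 0 or wsize <= 0:
        raise ValueError("file_size and wsize must be positive")
    if file_size <= wsize:
        # Whole file fits in one request: a single stable WRITE, no COMMIT.
        return 1
    unstable_writes = -(-file_size // wsize)  # ceiling division
    return unstable_writes + 1  # plus one COMMIT
```

For a 20K file, v3_write_rpc_count(20480, 8192) is 4 (three unstable WRITEs plus a COMMIT), whereas v3_write_rpc_count(20480, 32768) is 1.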

Cheers,
Trond



2003-11-27 01:34:19

by Greg Banks

Subject: Re: Different NFSSVC_MAXBLKSIZE for udp and tcp clients?

Neil Brown wrote:
>
> On Thursday November 27, [email protected] wrote:
> > Neil Brown wrote:
> >
> > The point is that the server's maximum is too low, [...]
>
> Why didn't you say so then (or did you and I missed it?).

Sorry, just being obscure ;-)

> The code in 2.4 is not ready to cope well with larger NFS packets. [...]
> For a 32k write, that means a
> kmalloc(32768) has to succeed, and very often free memory is too
> fragmented for it to succeed.
>
> For tcp, the request will be copied into a pre-allocated buffer. The
> (large) buffer is allocated when the nfsd thread is started,[...]

Ok, so if physical memory fragmentation is the only issue I'll just
try it. With 16K pages the allocation orders will be more reasonable.

> 2.6 copes with page-lists and so can handle requests and replies that
> are not contiguous in memory, so this is not an issue and MAXBLKSIZE
> can safely be increased.

Excellent, this sounds like it deals with my long-term issue.

> I suggest that you simply recompile the kernel with a larger
> NFSSVC_MAXBLKSIZE and see how well it works.
> I can see no reason for having different values for udp and tcp
> though. Just make it uniformly 32768 and see how it goes.

How about I conditionally define NFSSVC_MAXBLKSIZE to 32k if the
page size is sufficiently large? Would such a patch make it into 2.4?
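
The selection logic could look roughly like this, modelled in Python rather than as the actual preprocessor change. The 16K cut-off for "sufficiently large" and the 8K fallback are assumptions made for illustration; neither value is fixed by the proposal above, and all names are hypothetical.

```python
DEFAULT_MAXBLKSIZE = 8 * 1024     # assumed existing limit (assumption)
LARGE_MAXBLKSIZE = 32 * 1024      # the 32k proposed above
LARGE_PAGE_THRESHOLD = 16 * 1024  # hypothetical cut-off for "sufficiently large"


def nfssvc_maxblksize(page_size: int) -> int:
    """Pick the server's maximum block size from the page size."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if page_size >= LARGE_PAGE_THRESHOLD:
        # Large pages: 32k spans few pages, so contiguous allocation is cheap.
        return LARGE_MAXBLKSIZE
    # Small pages: keep the existing limit to avoid high-order allocations.
    return DEFAULT_MAXBLKSIZE
```

Under these assumptions a 4K-page machine keeps 8192 and a 16K-page machine gets 32768.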

Greg.
--
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

