We have a series of test transfers going, in which we shuttle data
from GFS -> NFS V3 over UDP -> NFS V3 over TCP -> Lustre.
On the NFS V3 over TCP link, we're seeing a lot of tinygrams, despite
having 8K NFS block sizes turned on and jumbo frames enabled (9000-byte
MTU).
The GFS machine runs Red Hat 9, and the first NFS server also runs Red
Hat 9. The machine copying from NFS to NFS is running AIX 5.1; the
machine copying NFS to Lustre is running RHEL 3.
I didn't check on the packet sizes of the other legs of the transfer.
I've verified that we do have jumbo packets being used some of the time,
on that AIX 5.1 -> RHEL 3 hop. However, we're still getting a pretty
large percentage of tinygrams.
Is there any way of cutting down on the tinygrams, to more effectively
utilize our large MTU? Is there perhaps any sort of "intent based"
packetizing in the standard NFS implementations on Red Hat 9, AIX 5.1,
and/or RHEL 3?
(Yes, we could short-circuit the AIX 5.1 part of the transfer, and that
would make things faster, but it wouldn't test what we need to test!)
Thanks!
--
Dan Stromberg DCS/NACS/UCI <[email protected]>
-------------------------------------------------------
This SF.net email is sponsored by: IT Product Guide on ITManagersJournal
Use IT products in your business? Tell us what you think of them. Give us
Your Opinions, Get Free ThinkGeek Gift Certificates! Click to find out more
http://productguide.itmanagersjournal.com/guidepromo.tmpl
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
what's a "tinygram"?
do you mean the NFS write requests aren't all "wsize" bytes? or do you
mean the TCP layer is segmenting into small IP packets? these are two
separate layers, and do not interact.
A tinygram is a small packet.
Many of the NFS packets I'm seeing are small - say about 200 or 300
bytes. Then from time to time there's a 7K packet, which is what I'd
like to see more of.
I don't think it's the TCP layer doing it, or we wouldn't be seeing our
7K packets from time to time. TCP shouldn't fragment these, because
we've turned on jumbo frames and set our MTUs to 9000. It almost has to
be one or both NFS layers doing many small transfers, I think.
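For what it's worth, some back-of-the-envelope arithmetic (a sketch, not
measured data) shows why the jumbo MTU should matter here; the 170-byte
combined IP/UDP/RPC/NFS header size is an assumed figure:

```python
import math

# Rough sketch: how many link-layer frames does one 8 KB NFS transfer
# occupy? The 170-byte header overhead is an assumption, and per-fragment
# header detail is ignored.
def frames_for_transfer(payload_bytes, mtu, headers=170):
    return math.ceil((payload_bytes + headers) / mtu)

print(frames_for_transfer(8192, 1500))  # → 6 frames at a standard MTU
print(frames_for_transfer(8192, 9000))  # → 1 frame with a 9000-byte MTU
```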
Someone just told me that NetApp servers can do intent-based NFS. Do
you concur?
--
Dan Stromberg DCS/NACS/UCI <[email protected]>
> A tinygram is a small packet.
>
> Many of the NFS packets I'm seeing are small - say about 200
> or 300 bytes. Then from time to time, there's a 7k packet,
> like I'd like to see more of.
do you know what's in the small packets? 200 to 300 bytes are typical
of most NFS operations (not READ or WRITE). maybe your application is
causing the client to generate lots of NFS requests, but only a few of
them are WRITEs.
> Someone just told me that netapp servers can do intent-based
> NFS. Do you concur?
i've never heard of "intent-based NFS." can you explain what this
means?
On Thu, 2004-10-21 at 11:56, Lever, Charles wrote:
> > A tinygram is a small packet.
> >
> > Many of the NFS packets I'm seeing are small - say about 200
> > or 300 bytes. Then from time to time, there's a 7k packet,
> > like I'd like to see more of.
>
> do you know what's in the small packets? 200 to 300 bytes are typical
> of most NFS operations (not READ or WRITE). maybe your application is
> causing the client to generate lots of NFS requests, but only a few of
> them are WRITEs.
This is the NFS portion of a 190-byte packet that appears to be fairly
representative, taken from tethereal:
Network File System
    Program Version: 3
    V3 Procedure: READ (6)
    file
        length: 36
        hash: 0x3305e54e
        type: unknown
        data: 01000006007900411A00000000000000
              001B8C1A000000000000000000057E72
              00000000
    offset: 1484812288
    count: 8192
Most of the files in this filesystem are large (data from simulation
runs in NetCDF format), but there certainly are some small ones.
Right now, our application is rsync. But that may change later.
> > Someone just told me that netapp servers can do intent-based
> > NFS. Do you concur?
>
> i've never heard of "intent-based NFS." can you explain what this
> means?
I believe it means that you bundle a bunch of operations together into
one large packet, and that the execution of later operations is
contingent on the success of earlier operations (or perhaps, more
generally, on the exit status of earlier operations - I'm not sure).
Lustre, I'm told, uses an intent-based protocol to speed up its
operations.
The FC2 nfs implementation (kernel 2.6.8-1) has a structure named
"intent", which -might- only be used in NFS v4.
There's some discussion of the data structure for intent-based NFS here:
http://seclists.org/lists/linux-kernel/2003/May/6040.html
Unfortunately, our AIX 5.1 machine does not support NFS v4. Anyone know
if AIX 5.3 does? I'll ask on an AIX mailing list too...
--
Dan Stromberg DCS/NACS/UCI <[email protected]>
Here are our packet lengths with counts, over 10000 packets:
count  packet length
    3     70
    1     74
    2     82
    3     98
  164    182
  180    186
 8827    190
   76    202
  407    286
   52   4266
    1   7418
  284   8362
Does this look normal for a network with jumbo frames enabled
transferring lots of mostly-large files?
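(A quick script over the histogram above suggests the picture is less
dire than the raw packet counts imply: most packets are tiny, but a bit
more than half of the bytes already travel in jumbo-sized frames. Just a
sketch; the lengths and counts are copied from the table above.)

```python
# Packet-length histogram from the 10000-packet capture above:
# {length: count}
hist = {70: 3, 74: 1, 82: 2, 98: 3, 182: 164, 186: 180, 190: 8827,
        202: 76, 286: 407, 4266: 52, 7418: 1, 8362: 284}

total_packets = sum(hist.values())
total_bytes = sum(length * count for length, count in hist.items())
jumbo_bytes = sum(length * count for length, count in hist.items()
                  if length > 1500)  # frames that need jumbo support

print(total_packets)                        # → 10000
print(round(jumbo_bytes / total_bytes, 2))  # → 0.58
```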
--
Dan Stromberg DCS/NACS/UCI <[email protected]>
> On Thu, 2004-10-21 at 11:56, Lever, Charles wrote:
> This is the NFS portion of a 190-byte packet that appears to
> be fairly representative, taken from tethereal:
>
> Network File System
>     Program Version: 3
>     V3 Procedure: READ (6)
>     file
>         length: 36
>         hash: 0x3305e54e
>         type: unknown
>         data: 01000006007900411A00000000000000
>               001B8C1A000000000000000000057E72
>               00000000
>     offset: 1484812288
>     count: 8192
>
> Most of the files in this filesystem are large (data from
> simulation runs in netcdf format), but there certainly are
> some small ones.
a READ request is small. the reply, however, is large. likewise, a
WRITE request is large, but the reply is small. requests like GETATTR,
ACCESS, LOOKUP, CREATE, REMOVE are all small requests.
so take a look at "nfsstat -c" on your client to understand what
requests are going over the wire. if you see few WRITEs but lots of
GETATTRs and LOOKUPs, that can explain why you are seeing small packets
on the wire.
> > > Someone just told me that netapp servers can do intent-based
> > > NFS. Do you concur?
> >
> > i've never heard of "intent-based NFS." can you explain what this
> > means?
>
> I believe it means that you bundle a bunch of operations
> together into one large packet, and the execution of later
> operations is contingent on the success of earlier operations
> (or perhaps more generally, the exit status of earlier
> operations - not sure).
you've managed to conflate a number of features. what you are referring
to here is known as compound RPCs, and is a feature of NFSv4.
this has nothing to do with how much data TCP puts in a single segment.
that is determined more by the client's network stack -- the use (or
lack of use) of TCP_CORK and the Nagle algorithm.
if you are seeing small datagrams, then this is probably UDP we're
talking about. TCP tries to pack as many bytes into each segment as it
can.
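(For anyone following along, here is a minimal user-space sketch of
those two knobs on Linux. The in-kernel NFS client sets them itself, so
this is purely illustrative, not an NFS tuning recipe.)

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disabling Nagle (TCP_NODELAY) pushes each small write out immediately,
# producing more small segments; leaving Nagle on lets the stack coalesce
# small writes into fewer, larger segments.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# TCP_CORK (Linux-specific) goes the other way: hold partial segments
# back until the cork is removed, so data leaves in full-sized segments.
if hasattr(socket, "TCP_CORK"):
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)

print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))  # → 1
s.close()
```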
> Lustre, I'm told, uses an intent-based protocol to speed up
> its operations.
now here, you're talking about the statefulness of the upper layer file
system protocol. NFSv4 uses OPEN and CLOSE to communicate to the server
the access intentions of the application. when the Linux VFS layer
invokes a file system's lookup method, it can now tell the file system
whether the lookup is for a DNLC refresh or because some application is
trying to open a file.
> The FC2 nfs implementation (kernel 2.6.8-1) has a structure
> named "intent", which -might- only be used in NFS v4.
there are a few NFSv3 areas where this can be used (exclusive create
being one).
> Unfortunately, our AIX 5.1 machine does not support NFS v4.
> Anyone know if AIX 5.3 does? I'll ask on an AIX mailing list too...
yes, AIX 5.3 supports NFSv4.
> Here are our packet lengths with counts, over 10000 packets:
>
> count  packet length
>     3     70
>     1     74
>     2     82
>     3     98
>   164    182
>   180    186
>  8827    190
>    76    202
>   407    286
>    52   4266
>     1   7418
>   284   8362
>
> Does this look normal for a network with jumbo frames enabled
> transferring lots of mostly-large files?
you are confusing the network transport with the upper layer protocol.
in addition i think you are looking at UDP traffic, not TCP.
note that 4266 = 170 + 4096, and that 8362 = 170 + 8192. 170 is the
size of the IP, UDP, RPC, and NFS headers, and the rest is the data
payload (multiple of the client's page size, 4096). anything smaller
than 300 is likely to be an NFS metadata op (GETATTR, LOOKUP, and the
like). that one 7000-odd byte packet is probably a READDIR.
if you want an analysis of the efficiency of the NFS client, use
"nfsstat -c" to decide whether your client is generating mostly metadata
ops, or whether these are really small reads and writes.
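(The header-plus-pages model above is easy to check mechanically. A
small sketch, using the ~170-byte header figure from this message:)

```python
def payload_pages(frame_len, headers=170, page=4096):
    """Return the payload size in pages if frame_len fits the
    headers-plus-whole-pages model, else None."""
    payload = frame_len - headers
    return payload // page if payload % page == 0 else None

print(payload_pages(4266))  # → 1  (4 KB payload)
print(payload_pages(8362))  # → 2  (8 KB payload: one full 8K READ reply)
print(payload_pages(190))   # → None (header-only request, no page payload)
```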
Yes, you're right. I was on the wrong server - rxvt lied to me.
hostname did not.
Upon doing a similar check on the right server, it's become clear that
while our Redhat 9 host is doing jumbo frames, our RHEL 3 host is not.
I've set the MTU to 9000 on the RHEL 3 host. Is there something else I
need to do to enable jumbo frames on RHEL 3? (The AIX 5.1 host this
RHEL 3 host is talking to is doing jumbo frames fine with the Redhat 9
host, so I assume the AIX 5.1 host is configured correctly in this
regard...)
Thanks!
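[Not an authoritative answer, but the usual way to make a 9000-byte MTU
persistent on RHEL 3 is the per-interface config file. A sketch,
assuming the NIC is eth0 and the driver supports jumbo frames:]

```shell
# Make the jumbo MTU persist across reboots (eth0 is an assumption --
# substitute the actual interface carrying the NFS traffic):
echo "MTU=9000" >> /etc/sysconfig/network-scripts/ifcfg-eth0

# Apply it now without a reboot:
ifconfig eth0 mtu 9000

# Verify the interface picked it up:
ip link show eth0
```

[Note that every switch port between the hosts must also pass jumbo
frames, or oversized frames may be dropped silently.]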
> > Here are our packet lengths with counts, over 10000 packets:
> >
> > count packet length
> > 3 70
> > 1 74
> > 2 82
> > 3 98
> > 164 182
> > 180 186
> > 8827 190
> > 76 202
> > 407 286
> > 52 4266
> > 1 7418
> > 284 8362
> >
> > Does this look normal for a network with jumbo frames enabled
> > transferring lots of mostly-large files?
>
> you are confusing the network transport with the upper layer protocol.
> in addition i think you are looking at UDP traffic, not TCP.
>
> note that 4266 = 170 + 4096, and that 8362 = 170 + 8192. 170 is the
> size of the IP, UDP, RPC, and NFS headers, and the rest is the data
> payload (multiple of the client's page size, 4096). anything smaller
> than 300 is likely to be an NFS metadata op (GETATTR, LOOKUP, and the
> like). that one 7000-odd byte packet is probably a READDIR.
>
> if you want an analysis of the efficiency of the NFS client, use
> "nfsstat -c" to decide whether your client is generating mostly metadata
> ops, or whether these are really small reads and writes.
>
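[Charles's rule of thumb can be applied mechanically to the histogram
above. A sketch only - the 170-byte header figure and 4096-byte page
size are taken from his numbers, and the labels are heuristic (indeed,
the 190-byte packets in this capture turned out to be READ requests,
not metadata):]

```python
# Classify captured packet lengths using the rule of thumb quoted above:
# roughly 170 bytes of IP + UDP + RPC + NFS headers ride on each
# data-bearing packet, and payloads are multiples of the client's
# 4096-byte page size.
HEADER_OVERHEAD = 170  # IP(20) + UDP(8) + RPC/NFS headers -- approximate
PAGE_SIZE = 4096

histogram = {  # packet length -> count, from the capture above
    70: 3, 74: 1, 82: 2, 98: 3, 182: 164, 186: 180,
    190: 8827, 202: 76, 286: 407, 4266: 52, 7418: 1, 8362: 284,
}

def classify(length):
    payload = length - HEADER_OVERHEAD
    if payload > 0 and payload % PAGE_SIZE == 0:
        return "READ/WRITE data (%d byte payload)" % payload
    if length < 300:
        return "small RPC (request or metadata op)"
    return "other (e.g. a READDIR reply)"

for length, count in sorted(histogram.items()):
    print("%5d x %5d  %s" % (count, length, classify(length)))
```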
> > On Thu, 2004-10-21 at 12:15, Dan Stromberg wrote:
> > > On Thu, 2004-10-21 at 11:56, Lever, Charles wrote:
> > > > > A tinygram is a small packet.
> > > > >
> > > > > Many of the NFS packets I'm seeing are small - say about 200
> > > > > or 300 bytes. Then from time to time, there's a 7k packet,
> > > > > like I'd like to see more of.
> > > >
> > > > do you know what's in the small packets? 200 to 300 bytes are
> > > > typical of most NFS operations (not READ or WRITE). maybe your
> > > > application is causing the client to generate lots of NFS
> > requests,
> > > > but only a few of them are WRITEs.
> > >
> > > This is the NFS portion of a 190 byte packet, that appears to be
> > > fairly representative, taken from tethereal:
> > >
> > > Network File System
> > > Program Version: 3
> > > V3 Procedure: READ (6)
> > > file
> > > length: 36
> > > hash: 0x3305e54e
> > > type: unknown
> > > data: 01000006007900411A00000000000000
> > > 001B8C1A000000000000000000057E72
> > > 00000000
> > > offset: 1484812288
> > > count: 8192
> > >
> > > Most of the files in this filesystem are large (data from
> > simulation
> > > runs in netcdf format), but there certainly are some small ones.
> > >
> > > Right now, our application is rsync. But that may change later.
> > >
> > > > > Someone just told me that netapp servers can do intent-based
> > > > > NFS. Do you concur?
> > > >
> > > > i've never heard of "intent-based NFS." can you explain
> > what this
> > > > means?
> > >
> > > I believe it means that you bundle a bunch of operations
> > together into
> > > one large packet, and the execution of later operations is
> > contingent
> > > on the success of earlier operations (or perhaps more
> > generally, the
> > > exit status of earlier operations - not sure).
> > >
> > > Lustre, I'm told, uses an intent-based protocol to speed up its
> > > operations.
> > >
> > > The FC2 nfs implementation (kernel 2.6.8-1) has a structure named
> > > "intent", which -might- only be used in NFS v4.
> > >
> > > There's some discussion of the data structure for intent-based NFS
> > > here:
> > >
> > > http://seclists.org/lists/linux-kernel/2003/May/6040.html
> > >
> > > Unfortunately, our AIX 5.1 machine does not support NFS v4. Anyone
> > > know if AIX 5.3 does? I'll ask on an AIX mailing list too...
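[As a toy illustration only - not how any of the NFS implementations
discussed here actually work - the batching-with-contingent-execution
idea described above looks roughly like this:]

```python
# Toy model of compound/"intent"-style batching: several operations are
# shipped in one request, execution stops at the first failure, and
# results are returned for the operations that did run.
def run_compound(ops):
    """ops: list of callables returning (ok, result). Executes in
    order, stopping at the first failure."""
    results = []
    for op in ops:
        ok, result = op()
        results.append(result)
        if not ok:
            break  # later operations are contingent on earlier success
    return results

# Hypothetical example: LOOKUP succeeds, OPEN fails, READ never runs.
ops = [
    lambda: (True, "LOOKUP ok"),
    lambda: (False, "OPEN: ENOENT"),
    lambda: (True, "READ 8192 bytes"),
]
print(run_compound(ops))  # ['LOOKUP ok', 'OPEN: ENOENT']
```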
From what I have seen, you need to umount and remount to get it to
use the jumbos. It appears to me (someone correct me if this is
wrong) that the MTU is set on a per-connection basis when the
connection is initially established, and does not appear to change
once established, at least not in the upward direction.
Roger
That sounds worth trying, but should I be seeing:
[root@esmft2 etc]# tracepath esmf04d
 1:  esmft2 (192.168.2.102)    asymm 65   0.260ms pmtu 1492
 1:  esmf04d (192.168.2.12)    0.294ms reached
     Resume: pmtu 1492 hops 1 back 1
?
Thanks!
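[For what it's worth, a pmtu of 1492 would hurt NFS over UDP badly,
because every 8 KB reply has to be chopped into IP fragments. A
back-of-the-envelope sketch - the 8-byte fragment alignment is standard
IP fragmentation behavior, and the 8362-byte figure is the large READ
reply seen in the earlier capture:]

```python
# Estimate how many IP fragments an NFS-over-UDP reply needs at a given
# path MTU. Each fragment carries its own 20-byte IP header, and
# per-fragment payload must be a multiple of 8 bytes.
IP_HEADER = 20

def fragments(datagram_len, mtu):
    """Number of IP fragments for a datagram of datagram_len bytes
    (UDP header + payload) over a link with the given MTU."""
    per_frag = (mtu - IP_HEADER) // 8 * 8  # 8-byte-aligned payload per fragment
    return -(-datagram_len // per_frag)    # ceiling division

reply = 8362 - IP_HEADER  # the 8 KB READ reply from the capture, minus IP header

print(fragments(reply, 1492))  # 6 fragments at the pmtu tracepath reported
print(fragments(reply, 9000))  # 1 frame if jumbo frames work end to end
```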
I suspect tracepath is either lying, or I'm misinterpreting its output,
because if I fire up an iperf pair from esmf04d to esmft2, I -do- see
some jumbo frames.
I'll try the remounting when I can.
Thanks folks.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > -------------------------------------------------------
> > > > > > > > > This SF.net email is sponsored by: IT Product Guide on
> > > > > > > > > ITManagersJournal Use IT products in your business? Tell
> > > > > > > > > us what you think of them. Give us Your Opinions, Get Free
> > > > > > > > > ThinkGeek Gift Certificates! Click to find out more
> > > > > > > > > http://productguide.itmanagersjournal.com/guid>
> > > > > > > > > epromo.tmpl
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > >
> > > > > > > > > NFS maillist - [email protected]
> > > > > > > > > https://lists.sourceforge.net/lists/listinfo/n> fs
> > > > > > > > >
> > > > > > > --
> > > > > > > Dan Stromberg DCS/NACS/UCI <[email protected]>
> > > > > > >
> > > > > > >
> > > > --
> > > > Dan Stromberg DCS/NACS/UCI <[email protected]>
> > > >
> > > >
> > > >
> > --
> > Dan Stromberg DCS/NACS/UCI <[email protected]>
--
Dan Stromberg DCS/NACS/UCI <[email protected]>
Rather than remount the filesystem we really want to optimize, which
would interrupt a very large file transfer, I instead NFS mounted a
directory from the same NFS server to the same NFS client, with the
same mount options.

And unfortunately, I'm continuing to see standard Ethernet frames rather
than jumbo frames with that new NFS v3/TCP mount with 8k rsize and
wsize.
However, as I indicated before, iperf -does- use jumbo frames.
Is it going to take a reboot?
Thanks!
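Before rebooting anything, it may be worth confirming that 9000-byte frames cross the path at all, independently of NFS. A minimal sketch, assuming IPv4 and the Linux iputils ping; the host name esmf04d stands in for whichever peer is being tested:

```shell
# Largest ICMP payload that fits an unfragmented 9000-byte MTU:
# 9000 - 20 (IP header) - 8 (ICMP header) = 8972 bytes
payload=$(( 9000 - 20 - 8 ))
echo "testing with ${payload}-byte payload"

# -M do sets the Don't Fragment bit, so a sub-9000 MTU anywhere on
# the path makes the ping fail instead of silently fragmenting.
ping -c 3 -M do -s "$payload" esmf04d \
    || echo "jumbo frames are NOT making it through"
```

If this succeeds while the NFS traffic still shows small frames, the difference is likely which connection (and hence which cached path MTU) each one is using.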
On Thu, 2004-10-21 at 15:21, Dan Stromberg wrote:
> I suspect tracepath is either lying, or I'm misinterpreting its output,
> because if I fire up an iperf pair from esmf04d to esmft2, I -do- see
> some jumbo frames.
>
> I'll try the remounting when I can.
>
> Thanks folks.
>
> On Thu, 2004-10-21 at 15:03, Dan Stromberg wrote:
> > That sounds worth trying, but should I be seeing:
> >
> > [root@esmft2 etc]# tracepath esmf04d
> >  1:  esmft2 (192.168.2.102)    asymm 65   0.260ms pmtu 1492
> >  1:  esmf04d (192.168.2.12)    0.294ms reached
> >      Resume: pmtu 1492 hops 1 back 1
> >
> > ?
> >
> > Thanks!
> >
> > On Thu, 2004-10-21 at 14:22, Roger Heflin wrote:
> > > From what I have seen you need to umount and remount to get it to
> > > use the jumbos. It appears to me (someone correct me if this is
> > > wrong) that the MTU is set on a per connection basis when the
> > > connection is initially established, and does not appear to change
> > > once established, at least not in the upward direction.
> > >
> > > Roger
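Under that theory, picking up a raised MTU amounts to tearing the mount down and re-establishing it. A sketch, with a hypothetical mount point and export path:

```shell
# A fresh mount builds a new TCP connection, which negotiates its
# segment size against the interface's current 9000-byte MTU.
umount /mnt/esmf04d
mount -t nfs -o tcp,nfsvers=3,rsize=8192,wsize=8192 \
    esmf04d:/export/data /mnt/esmf04d
```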
> > >
> > > -----Original Message-----
> > > From: [email protected]
> > > [mailto:[email protected]] On Behalf Of Dan Stromberg
> > > Sent: Thursday, October 21, 2004 3:52 PM
> > > To: Lever, Charles
> > > Cc: Dan Stromberg; Linux NFS Mailing List
> > > Subject: RE: [NFS] NFS and tinygrams
> > >
> > >
> > > Yes, you're right.  I was on the wrong server - rxvt lied to me;
> > > hostname did not.
> > >
> > > Upon doing a similar check on the right server, it's become clear that
> > > while our Redhat 9 host is doing jumbo frames, our RHEL 3 host is not.
> > >
> > > I've set the MTU to 9000 on the RHEL 3 host.  Is there something else I need
> > > to do to set jumbo frames on RHEL 3?  (The AIX 5.1 host this RHEL 3 host is
> > > talking to is doing jumbo frames fine with the Redhat 9 host, so I assume
> > > the AIX 5.1 host is configured fine in this regard...)
> > >
> > > Thanks!
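For what it's worth, on Red Hat-style systems the MTU can be set both live and persistently; eth0 here is an assumption, substitute the real interface:

```shell
# Takes effect immediately:
ifconfig eth0 mtu 9000

# Picked up by the RHEL network scripts on the next ifup/reboot:
echo 'MTU=9000' >> /etc/sysconfig/network-scripts/ifcfg-eth0
```

Note that the switch ports between the hosts also have to accept 9000-byte frames; otherwise oversized packets are simply dropped.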
> > >
> > > > > Here are our packet lengths with counts, over 10000 packets:
> > > > >
> > > > > count packet length
> > > > > 3 70
> > > > > 1 74
> > > > > 2 82
> > > > > 3 98
> > > > > 164 182
> > > > > 180 186
> > > > > 8827 190
> > > > > 76 202
> > > > > 407 286
> > > > > 52 4266
> > > > > 1 7418
> > > > > 284 8362
> > > > >
> > > > > Does this look normal for a network with jumbo frames enabled
> > > > > transferring lots of mostly-large files?
> > > >
> > > > you are confusing the network transport with the upper layer protocol.
> > > > in addition i think you are looking at UDP traffic, not TCP.
> > > >
> > > > note that 4266 = 170 + 4096, and that 8362 = 170 + 8192. 170 is the
> > > > size of the IP, UDP, RPC, and NFS headers, and the rest is the data
> > > > payload (multiple of the client's page size, 4096). anything smaller
> > > > than 300 is likely to be an NFS metadata op (GETATTR, LOOKUP, and the
> > > > like). that one 7000-odd byte packet is probably a READDIR.
> > > >
> > > > if you want an analysis of the efficiency of the NFS client, use
> > > > "nfsstat -c" to decide whether your client is generating mostly
> > > > metadata ops, or whether these are really small reads and writes.
> > > >
> > > > > On Thu, 2004-10-21 at 12:15, Dan Stromberg wrote:
> > > > > > On Thu, 2004-10-21 at 11:56, Lever, Charles wrote:
> > > > > > > > A tinygram is a small packet.
> > > > > > > >
> > > > > > > > Many of the NFS packets I'm seeing are small - say about 200
> > > > > > > > or 300 bytes. Then from time to time, there's a 7k packet,
> > > > > > > > like I'd like to see more of.
> > > > > > >
> > > > > > > do you know what's in the small packets? 200 to 300 bytes are
> > > > > > > typical of most NFS operations (not READ or WRITE). maybe your
> > > > > > > application is causing the client to generate lots of NFS
> > > > > requests,
> > > > > > > but only a few of them are WRITEs.
> > > > > >
> > > > > > This is the NFS portion of a 190 byte packet, that appears to be
> > > > > > fairly representative, taken from tethereal:
> > > > > >
> > > > > > Network File System
> > > > > > Program Version: 3
> > > > > > V3 Procedure: READ (6)
> > > > > > file
> > > > > > length: 36
> > > > > > hash: 0x3305e54e
> > > > > > type: unknown
> > > > > > data: 01000006007900411A00000000000000
> > > > > > 001B8C1A000000000000000000057E72
> > > > > > 00000000
> > > > > > offset: 1484812288
> > > > > > count: 8192
> > > > > >
> > > > > > Most of the files in this filesystem are large (data from
> > > > > simulation
> > > > > > runs in netcdf format), but there certainly are some small ones.
> > > > > >
> > > > > > Right now, our application is rsync. But that may change later.
> > > > > >
> > > > > > > > Someone just told me that netapp servers can do intent-based
> > > > > > > > NFS. Do you concur?
> > > > > > >
> > > > > > > i've never heard of "intent-based NFS." can you explain
> > > > > what this
> > > > > > > means?
> > > > > >
> > > > > > I believe it means that you bundle a bunch of operations
> > > > > together into
> > > > > > one large packet, and the execution of later operations is
> > > > > contingent
> > > > > > on the success of earlier operations (or perhaps more
> > > > > generally, the
> > > > > > exit status of earlier operations - not sure).
> > > > > >
> > > > > > Lustre, I'm told, uses an intent-based protocol to speed up its
> > > > > > operations.
> > > > > >
> > > > > > The FC2 nfs implementation (kernel 2.6.8-1) has a structure named
> > > > > > "intent", which -might- only be used in NFS v4.
> > > > > >
> > > > > > There's some discussion of the data structure for intent-based NFS
> > > > > > here:
> > > > > >
> > > > > > http://seclists.org/lists/linux-kernel/2003/May/6040.html
> > > > > >
> > > > > > Unfortunately, our AIX 5.1 machine does not support NFS v4.
> > > > > > Anyone know if AIX 5.3 does? I'll ask on an AIX mailing list too...
> > > > > >
--
Dan Stromberg DCS/NACS/UCI <[email protected]>
> Unfortunately, our AIX 5.1 machine does not support NFS v4. Anyone know
> if AIX 5.3 does? I'll ask on an AIX mailing list too...
>
> Dan Stromberg DCS/NACS/UCI <[email protected]>
Yes, AIX 5.3 does support NFSv4.
--
Tom Haynes