2005-07-13 18:30:40

by Chris Penney

Subject: NFS Client Performance Question

I have a large SMP SGI Itanium box running a 2.4.21 kernel and I'm
getting fairly bad NFS performance, esp. when doing random writes to
an unloaded NFS server. The box is rather busy doing computation and
a lot of i/o to /tmp; however, there is little network i/o (<100
KB/s). I use iozone with the following options to test "-c -e -i 2 -w
-s 16m" and use an unloaded Sun NFS server (same performance results
with Linux NFS servers, but they are all under load so I test with the
Sun). I only get ~2.5MB/s using the above test. I found an Intel box
still running a 2.4.20 kernel and it gets 15MB/s. On a 2.6 kernel box
the rate jumps to 46MB/s (nice work).

I did some sniffing on the Sun and noticed that the problem Itanium
system was not doing async writes (snips are from the start of the
write test):

[snip]
redhat -> server NFS C WRITE3 FH=40E5 at 1478656 for 4096 (ASYNC)
server -> redhat NFS R WRITE3 OK 4096 (ASYNC)
redhat -> server NFS C WRITE3 FH=40E5 at 1179648 for 4096 (ASYNC)
[snip]
itanium -> server NFS C WRITE3 FH=40E5 at 4239360 for 4096 (FSYNC)
server -> server NFS R WRITE3 OK 4096 (FSYNC)
server -> server NFS C WRITE3 FH=40E5 at 15101952 for 4096 (FSYNC)
[snip]
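
A per-client tally of the stability flag can make this split easier to see across a long capture. The following is only a minimal sketch: it assumes every WRITE3 call line has the same shape as the excerpt above, and the function name tally_write_modes is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
#   <src> -> <dst> NFS C WRITE3 FH=... at <offset> for <count> (ASYNC|FSYNC)
WRITE_CALL = re.compile(
    r"^(?P<src>\S+) -> (?P<dst>\S+)\s+NFS C WRITE3 .* at (?P<offset>\d+) "
    r"for (?P<count>\d+) \((?P<stable>\w+)\)$"
)


def tally_write_modes(lines):
    """Count WRITE3 calls per (source host, stability flag)."""
    counts = Counter()
    for line in lines:
        match = WRITE_CALL.match(line.strip())
        if match:  # replies ("NFS R ...") and other traffic are skipped
            counts[(match["src"], match["stable"])] += 1
    return counts


sample = [
    "redhat -> server NFS C WRITE3 FH=40E5 at 1478656 for 4096 (ASYNC)",
    "server -> redhat NFS R WRITE3 OK 4096 (ASYNC)",
    "redhat -> server NFS C WRITE3 FH=40E5 at 1179648 for 4096 (ASYNC)",
    "itanium -> server NFS C WRITE3 FH=40E5 at 4239360 for 4096 (FSYNC)",
]

for (host, mode), n in sorted(tally_write_modes(sample).items()):
    print(host, mode, n)
```

Only the call lines are counted, so each write is tallied once rather than once per call and once per reply.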

The client mount options are:
rw,nosuid,bg,hard,intr,nfsvers=3,tcp,rsize=32768,wsize=32768

So my question: Is the Itanium system not doing async nfs i/o because
nfract_sync has been exceeded due to local i/o to /tmp or is it
something else? Is there anything I can do to improve NFS performance
on this box?

Chris


-------------------------------------------------------
This SF.Net email is sponsored by the 'Do More With Dual!' webinar happening
July 14 at 8am PDT/11am EDT. We invite you to explore the latest in dual
core and dual graphics technology at this free one hour event hosted by HP,
AMD, and NVIDIA. To register visit http://www.hp.com/go/dualwebinar
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2005-07-13 18:44:25

by Lever, Charles

Subject: RE: NFS Client Performance Question

hi chris-

take a look at the NFS faq (http://nfs.sourceforge.net) i believe this
issue is addressed there.

the problem is your itanium uses pages larger than your r/wsize setting,
which forces the client into synchronous I/O mode for NFS. some
possible workarounds:

1. upgrade to a late model 2.6 kernel where this is fixed

2. reduce the page size on your client system

3. increase the r/wsize to something larger than your page size

4. compile your client with a larger maximum transfer size so you can
do 3.

the goal is to have r/wsize be equal to or greater than your client's
page size.
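
The rule in the last sentence can be written down as a simple check. This is only a sketch of the rule as stated in this message for the client discussed here; the function name and the sample values are illustrative assumptions, not measurements.

```python
KB = 1024


def meets_page_size_rule(page_size, rsize, wsize):
    """True when both transfer sizes are at least as large as the page size."""
    return rsize >= page_size and wsize >= page_size


# Illustrative combinations of (page size, rsize, wsize), all in bytes.
cases = [
    (4 * KB, 32 * KB, 32 * KB),   # small pages, large transfer size
    (16 * KB, 32 * KB, 32 * KB),  # 16 KB pages with 32 KB r/wsize
    (16 * KB, 8 * KB, 8 * KB),    # r/wsize smaller than the page size
]

for page_size, rsize, wsize in cases:
    ok = meets_page_size_rule(page_size, rsize, wsize)
    print(page_size, rsize, wsize, "ok" if ok else "r/wsize below page size")
```

Workarounds 2 through 4 all amount to changing one side of this comparison until it holds.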




2005-07-13 18:56:44

by Chris Penney

Subject: Re: NFS Client Performance Question

The Itanium page size is 16k (unless I'm mistaken) and the r/wsize is
32k so can that still be it? Or are you saying that it is because the
size of the write is only 4k?

Chris




2005-07-13 19:04:23

by Lever, Charles

Subject: RE: NFS Client Performance Question

reads and writes should be asynchronous if your page size is indeed 16KB
and your r/wsize is 32KB.

"/usr/bin/time -v date" will report your client's page size, and "cat
/proc/mounts" will show your actual rsize and wsize. you may find that
the server has negotiated these down to 8KB even if you requested larger
on your mount command line.
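
The same two checks can be scripted. The sketch below is hedged: the SAMPLE line is a hypothetical /proc/mounts entry in which the sizes came out as 8 KB, and nfs_transfer_sizes is an illustrative helper, not part of any NFS tooling. On a real client the text would come from reading /proc/mounts instead.

```python
import resource

# Hypothetical /proc/mounts entry (device, mountpoint, fstype, options, dump, pass).
SAMPLE = (
    "server:/s /s/nicfs5 nfs "
    "rw,nosuid,hard,intr,rsize=8192,wsize=8192,addr=1.1.1.1 0 0\n"
)


def nfs_transfer_sizes(mounts_text):
    """Return {mountpoint: (rsize, wsize)} for NFS entries in /proc/mounts text."""
    sizes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[2].startswith("nfs"):
            continue
        # Keep only key=value options; flags such as "rw" carry no value.
        opts = dict(opt.split("=", 1) for opt in fields[3].split(",") if "=" in opt)
        if "rsize" in opts and "wsize" in opts:
            sizes[fields[1]] = (int(opts["rsize"]), int(opts["wsize"]))
    return sizes


page_size = resource.getpagesize()  # page size of the machine running this
for mountpoint, (rsize, wsize) in nfs_transfer_sizes(SAMPLE).items():
    print(mountpoint, "rsize", rsize, "wsize", wsize, "page size", page_size)
```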




2005-07-13 19:11:13

by Peter Staubach

Subject: Re: NFS Client Performance Question

Chris Penney wrote:

>The Itanium page size is 16k (unless I'm mistaken) and the r/wsize is
>32k so can that still be it? Or are you saying that it is because the
>size of the write is only 4k?
>

Well, it is curious that the over-the-wire WRITE requests are still only 4K.
The Sun server is not restricting the transfer sizes, so apparently
something else is.

You say that the application on the client is doing random, 4K writes to
the file? Is the file opened with O_SYNC or anything special like that?

Thanx...

ps



2005-07-13 19:58:22

by Chris Penney

Subject: Re: NFS Client Performance Question

On 7/13/05, Peter Staubach <[email protected]> wrote:
> Chris Penney wrote:
>
> >The Itanium page size is 16k (unless I'm mistaken) and the r/wsize is
> >32k so can that still be it? Or are you saying that it is because the
> >size of the write is only 4k?
> >
>
> Well, it is curious that the over the wire WRITE requests are still only 4K.
> The Sun server is not restricting the transfer sizes, so apparently
> something else is.
>
> You say that the application on the client is doing random, 4K writes to
> the file? Is the file opened with O_SYNC or anything special like that?

I'm using a benchmark program (iozone) as it roughly simulates the
real application. The way I am running it is such that it will
randomly write 4k chunks of the file until the whole file is
rewritten. So each 4k write should correspond to a specific
chunk being written. What I don't understand is why they are not async
(I'm not an expert at all on NFS).
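
For readers without iozone at hand, the access pattern described above can be approximated as follows. This is a rough sketch of the pattern only, not of how iozone is implemented; the function name, the fixed seed and the 64 KB demo file are assumptions chosen to keep the example small.

```python
import os
import random
import tempfile


def random_rewrite(path, file_size, record_size, seed=0):
    """Rewrite every record of an existing file once, in random order."""
    offsets = list(range(0, file_size, record_size))
    random.Random(seed).shuffle(offsets)
    fd = os.open(path, os.O_WRONLY)  # plain open, no O_SYNC
    try:
        for offset in offsets:
            length = min(record_size, file_size - offset)
            os.pwrite(fd, b"x" * length, offset)
        os.fsync(fd)  # flush once at the end of the run
    finally:
        os.close(fd)
    return offsets


# Demo on a small local temporary file; point the path at an NFS mount to
# reproduce the pattern over the wire.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (64 * 1024))
demo_path = tmp.name

order = random_rewrite(demo_path, 64 * 1024, 4096)
with open(demo_path, "rb") as f:
    data = f.read()
os.unlink(demo_path)

print(len(order), "records rewritten")
```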

I reran the test using various chunk sizes and I found that if I use
'-r' to specify 8k or more it always uses async writes. If I use 6,
4, or 2k I get a lot of fsync writes. They aren't /always/ fsync
writes, but most of them are. I'll see a stream of fsync writes, then a
few async, and then back to fsync.

Chris



2005-07-13 20:16:31

by Trond Myklebust

Subject: Re: NFS Client Performance Question

On Wed, 13.07.2005 at 14:30 (-0400), Chris Penney wrote:
> The client mount options are:
> rw,nosuid,bg,hard,intr,nfsvers=3,tcp,rsize=32768,wsize=32768

Does this match the entry for this mountpoint in /proc/mounts ?

Cheers,
Trond




2005-07-13 20:40:48

by Chris Penney

Subject: Re: NFS Client Performance Question

On 7/13/05, Trond Myklebust <[email protected]> wrote:
> On Wed, 13.07.2005 at 14:30 (-0400), Chris Penney wrote:
> > The client mount options are:
> > rw,nosuid,bg,hard,intr,nfsvers=3,tcp,rsize=32768,wsize=32768
>
> Does this match the entry for this mountpoint in /proc/mounts ?

Yes, and if I do a streaming write test I do in fact see only 32k
async writes even with '-r 4k'.

The entry is:
server:/s on /s/nicfs5 type nfs
(rw,nosuid,bg,hard,intr,nfsvers=3,tcp,rsize=32768,wsize=32768,addr=1.1.1.1)
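
One way to picture why streaming 4k records can show up as 32k writes on the wire while scattered ones stay at 4k is a toy coalescing model. This is an assumption-laden illustration only: it merges records that are contiguous in issue order up to wsize, and it makes no claim about how the kernel client actually batches pages.

```python
def coalesce(offsets, record_size, wsize):
    """Merge back-to-back records into requests of at most wsize bytes.

    Returns a list of (offset, length) tuples in issue order.
    """
    requests = []
    for offset in offsets:
        if requests:
            start, length = requests[-1]
            contiguous = start + length == offset
            if contiguous and length + record_size <= wsize:
                requests[-1] = (start, length + record_size)
                continue
        requests.append((offset, record_size))
    return requests


# Streaming case: sixteen consecutive 4k records with wsize=32768.
streaming = list(range(0, 64 * 1024, 4096))
print(coalesce(streaming, 4096, 32768))

# Scattered case: offsets taken from the capture in the first message.
scattered = [1478656, 1179648, 4239360, 15101952]
print(coalesce(scattered, 4096, 32768))
```

In this model the streaming run collapses into two 32768-byte requests, while each scattered record stays a separate 4096-byte request.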

Chris

