2002-10-08 20:53:59

by Heflin, Roger A.

Subject: RE: 2.4.19 NFSALL performance oddity

Ok,

I applied the new patch. It does not appear to have made any difference
in my setup and tests, and it does not appear to break anything.
I am still analyzing the results, but I am not finding
anything obvious about what is going on. 2.2.19 appears to make a much
better client to a 2.4.19 server than 2.4.1[89][no patch/nfsall]
does. And 2.4.19 with no patches appears to make a better
server (for the writes) than 2.4.19 with the NFS patches
(using 2.2.19 as a client).

Using 2.4.xx as a client, a 2.4.1[89][nfsall/nopatch] server is
about the same on writes, but 2.4.19 nfsall is much better
on the reads than the previous 2.4.1[89] without
the nfsall patch.

I guess the read numbers with NFSall look pretty good.
The write numbers probably need to be better: they do look
OK (2.5x larger) with 2.2.19 as a client, but they don't look
good with 2.4.19 nfsall as a client.
And from 2.4.19 to 2.4.19 nfsall the write numbers got a bit worse
for a 2.2.19 client, but did not appear to change by a large amount
when using a 2.4.19 client.

I have an Excel spreadsheet of the various results that tries to
summarize all of the recent emails. If anyone wants this
spreadsheet, I will send it out.


2.4.19 NFSALLnew Server:
  2.4.19 NFSALLnew Client:
    write: 2.312 MB
    read:  8.875 MB
  2.4.18 Client:
    write: 2.125 MB
    read:  8.125 MB

2.4.19 NFSALLold Server:
  2.4.19 NFSALLnew Client:
    write: 2.312 MB
    read:  8.688 MB

2.4.18 no patches Server:
  2.4.19 NFSALL Client:
    write: 2.00 MB
    read:  2.50 MB
  2.4.19 nopatches Client:
    write: 2.0 MB
    read:  3.0 MB
  2.2.19 stock Client:
    write: 6.00 MB
    read:  10.0 MB
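
As a quick check on the figures above, here is a minimal Python sketch that computes how much faster one client is than another against the same server. The numbers are copied from the results as posted (units as given, "MB"); the RESULTS table layout and the speedup() helper are illustrative names, not anything from the test setup itself.

```python
# Figures copied from the posted results, in the units given ("MB").
# Keys are (server, client); values are (write, read).
RESULTS = {
    ("2.4.19 NFSALLnew", "2.4.19 NFSALLnew"): (2.312, 8.875),
    ("2.4.19 NFSALLnew", "2.4.18"): (2.125, 8.125),
    ("2.4.19 NFSALLold", "2.4.19 NFSALLnew"): (2.312, 8.688),
    ("2.4.18 no patches", "2.4.19 NFSALL"): (2.00, 2.50),
    ("2.4.18 no patches", "2.4.19 nopatches"): (2.0, 3.0),
    ("2.4.18 no patches", "2.2.19 stock"): (6.00, 10.0),
}


def speedup(server, baseline_client, other_client):
    """Return (write, read) ratios of other_client over baseline_client.

    Both clients must have been measured against the same server.
    """
    base_write, base_read = RESULTS[(server, baseline_client)]
    other_write, other_read = RESULTS[(server, other_client)]
    return other_write / base_write, other_read / base_read


if __name__ == "__main__":
    # Compare the 2.2.19 client with each 2.4.19 client on the unpatched server.
    for client in ("2.4.19 NFSALL", "2.4.19 nopatches"):
        write_ratio, read_ratio = speedup("2.4.18 no patches", client, "2.2.19 stock")
        print(f"2.2.19 stock vs {client}: write x{write_ratio:.2f}, read x{read_ratio:.2f}")
```

Only pairs measured against the same server are compared here, since a ratio taken across different servers would mix client and server effects.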



Roger
> -----Original Message-----
> From: Trond Myklebust [SMTP:[email protected]]
> Sent: Monday, October 07, 2002 5:21 PM
> To: Heflin, Roger A.
> Cc: [email protected]
> Subject: RE: 2.4.19 NFSALL performance oddity
>
>
> FYI: I updated 2.4.19-NFS_ALL on Saturday with a couple of
> bugfixes. Among them was one which fixes a queueing bottleneck when a
> timeout+resend occurs.
>
> Cheers,
> Trond


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2002-10-08 22:42:23

by Trond Myklebust

Subject: Re: RE: 2.4.19 NFSALL performance oddity

>>>>> " " == Roger A Heflin <Heflin> writes:

> Ok, I applied the new patch, it does not appear to have made
> any difference in my setup and tests, it also does not appear
> to break anything. I am still analyzing the results, but not
> finding anything obvious about what is going on. 2.2.19
> appears to make a much better client to a 2.4.19 server than
> 2.4.1[89][no patch/nfsall] does. And 2.4.19 with no patches
> appears to make a better server than 2.4.19 (for the writes)
> with the NFS patches (using 2.2.19 as a client).

2.4.19-NFS_ALL doesn't touch the server code unless you are also
applying my hacked up version of Neil's experimental NFS-over-TCP
server patches on top of the 'stock NFS_ALL'.

The direct comparison with 2.2.19 is interesting, but only makes sense
if you can follow up and do a binary search of 2.4.x kernels (start
off by testing something like 2.4.9, say) in order to find out where
the performance drop occurs. I simply wouldn't know where to start
looking otherwise...
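
The binary search described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the list of kernel versions is ordered oldest to newest, the oldest is known to be fast and the newest is known to be slow, and performance does not recover once it has dropped. The name first_slow_kernel and the is_fast callback are hypothetical; in practice is_fast would mean booting that kernel on the client, rerunning the same write/read test, and comparing the result against a chosen threshold.

```python
from typing import Callable, Sequence


def first_slow_kernel(versions: Sequence[str],
                      is_fast: Callable[[str], bool]) -> str:
    """Return the earliest version in `versions` that tests as slow.

    Assumes versions[0] is known fast and versions[-1] is known slow,
    so neither endpoint is re-tested. Each loop iteration halves the
    range that can still contain the first slow version.
    """
    if len(versions) < 2:
        raise ValueError("need at least one known-fast and one known-slow version")
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] fast, versions[hi] slow
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fast(versions[mid]):
            lo = mid   # the drop happened after versions[mid]
        else:
            hi = mid   # the drop happened at or before versions[mid]
    return versions[hi]
```

With roughly twenty candidate kernels between a known-good and a known-bad version, this needs about five test runs rather than one per kernel.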

Note that on my own setups, I'm seeing performance numbers with
2.4.19-NFS_ALL clients against both Linux 2.4.19 servers and Sun
Solaris servers of the same order as your 2.2.19 numbers...

Cheers,
Trond



2002-10-09 14:07:46

by Heflin, Roger A.

Subject: RE: RE: 2.4.19 NFSALL performance oddity


I have tried it both with and without Neil's patches; they don't
make it better and don't make it worse. Running the server/client
over TCP makes things a little bit worse.

We should be using sync on the mount option, correct?

All of my tests used sync on the mount and sync in the export.
Previously, using async on 2.2.xx (on the export side)
and not using the sync option on the mount appeared to cause
issues under heavy loads, i.e. the out of slots warnings, and
sometimes worse.

I know from some limited testing that the sync option does
make the I/O really bad on real disks.

I did look, and it appears that how the sync option on the
mount command behaves changed sometime in 2001, someplace
in the 2.4 series.

Roger




2002-10-09 14:13:50

by Trond Myklebust

Subject: RE: RE: 2.4.19 NFSALL performance oddity

>>>>> " " == Roger A Heflin <Heflin> writes:

> We should be using sync on the mount option, correct?

No. That shouldn't be necessary. The nfsd server will sync correctly
when told to do so by the client as long as the 'sync' option is set
in /etc/exports.

> All of my tests used sync on the mount and sync in the export,
> previously when using async on 2.2.xx (on the export side), and
> not using the sync option on the mount appeared to cause issues
> under heavy loads, ie the out of slots warnings, and sometimes
> worse.

You are saying that server-side caching makes things worse? Could you
please back that up with some numbers?

Cheers,
Trond



2002-10-09 14:17:50

by Lever, Charles

Subject: RE: RE: 2.4.19 NFSALL performance oddity

> I have tried it both with and without Neil's patches, they don't
> make it better and don't make it worse. Running the server/client
> tcp makes things a little bit worse.
>
> We should be using sync on the mount option, correct?

no, only use sync in the exports file on the server.

use the sync mount option on the client only if you require
all writes to be pushed to the server's disk before the client
gives control back to your application. (certain databases
require this option, but it is not normally needed).
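
For an application that needs this guarantee for only a few files, flushing per file is one alternative to mounting the whole filesystem with sync. The Python sketch below is a minimal illustration, assuming a POSIX client and assuming the NFS client pushes pending writes to the server when fsync() is called; write_durably is a hypothetical helper name, not something from this thread.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write `data` to `path` and return only after fsync() has completed.

    Assumption: on an NFS mount, fsync() makes the client send its
    pending writes to the server before returning. Everything written
    without this helper stays asynchronous.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # block until the OS reports the data as flushed
    finally:
        os.close(fd)
```

The trade-off is the same one described above: each call blocks until the data has been flushed, so it is best reserved for the writes that actually need it.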

> All of my tests used sync on the mount and sync in the export,
> previously when using async on 2.2.xx (on the export side),
> and not using the sync option on the mount appeared to cause
> issues under heavy loads, ie the out of slots warnings, and
> sometimes worse.

that might be an interesting issue to pursue some time.

> I did look and it appears that how the sync option on the
> mount command changed sometime in 2001, someplace
> in the 2.4 series.

the mount command is a user-land utility. i'm not sure
you could tie a change in its behavior to a specific
kernel version.




2002-10-09 15:01:47

by Heflin, Roger A.

Subject: RE: RE: 2.4.19 NFSALL performance oddity



> -----Original Message-----
> From: Lever, Charles [SMTP:[email protected]]
> Sent: Wednesday, October 09, 2002 9:18 AM
> To: Heflin, Roger A.
> Cc: '[email protected]'
> Subject: RE: [NFS] RE: 2.4.19 NFSALL performance oddity
>
> > I have tried it both with and without Neil's patches, they don't
> > make it better and don't make it worse. Running the server/client
> > tcp makes things a little bit worse.
> >
> > We should be using sync on the mount option, correct?
>
> no, only use sync in the exports file on the server.
>
> use the sync mount option on the client only if you require
> all writes to be pushed to the server's disk before the client
> gives control back to your application. (certain databases
> require this option, but it is not normally needed).
>
For some of the stuff we do this might be useful, but not if it
slows down the I/O.

> > All of my tests used sync on the mount and sync in the export,
> > previously when using async on 2.2.xx (on the export side),
> > and not using the sync option on the mount appeared to cause
> > issues under heavy loads, ie the out of slots warnings, and
> > sometimes worse.
>
> that might be an interesting issue to pursue some time.
>
I am trying to verify that I can still make this happen. I know
that with a 2.2.16 server and clients, with async on both the exports
and the mount, I would get clients running out of slots and crashing
if the disk server went down for an extended period of time,
and I also believe that I could sometimes get the server to
crash by overloading it with writes. I am currently trying to
duplicate this with a 2.4 server; in a few hours I should know.

We had at least one occasion where our monitoring server
went down (every machine was updating 3-4 files every
minute or so), and about 1/4 to 1/3 of the clients developed
issues with out of slot errors. The machine actually required
a reboot to make it useful again.

> > I did look and it appears that how the sync option on the
> > mount command changed sometime in 2001, someplace
> > in the 2.4 series.
>
> the mount command is a user-land utility. i'm not sure
> you could tie a change in its behavior to a specific
> kernel version.
>
There were some messages from Trond about changing
how the kernel interprets the sync option. Apparently, in
a version of 2.2, sync only synced the directory info and
not the data, and at some point it was changed
to sync the actual data. This is probably the change that
makes the difference in speed between the 2.2 and
2.4 clients. It does sound like a perfectly reasonable thing
to do, and it surprises me that it was not syncing everything
originally.

Roger

