2002-08-07 15:17:59

by Jeff Layton

Subject: RH 7.3 kernels and NFS performance

Hello,

I remember reading on the list about the terrible NFS
performance of the latest RH kernels. Would someone
care to summarize this for me (UDP or TCP, etc.)? Also,
does anyone have any rough estimates of the performance
hit?

Thanks!

Jeff Layton

Lockheed-Martin Aeronautical Company - Marietta




-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2002-08-07 15:37:42

by Rex Dieter

Subject: Re: RH 7.3 kernels and NFS performance

On Wednesday 07 August 2002 10:35 am, Jeff Layton wrote:

> I remember reading on the list about the terrible NFS
> performance of the latest RH kernels. Would someone
> care to summarize this for me (UDP or TCP, etc.)? Also,
> does anyone have any rough estimates of the performance
> hit?

The Hit is big and bad. When using the default NFS export options (in
particular sync), I see NFS write performance of ~50k with the 2.4.18-5
kernels. When set async, performance comes back up to par. I can give you
the bugzilla # I reported it under, if you like.

-- Rex



2002-08-07 15:45:41

by Lever, Charles

Subject: RE: RH 7.3 kernels and NFS performance

hi jeff-

> I remember reading on the list about the terrible NFS
> performance of the latest RH kernels. Would someone
> care to summarize this for me (UDP or TCP, etc.)? Also,
> does anyone have any rough estimates of the performance
> hit?

if you are referring to bad NFS client performance, this
is due to a bug in the Linux IP fragmentation logic which
causes it to send part of a fragmented packet, and drop
the rest, if it runs out of socket buffer space during
the fragmentation process.

thus it only affects NFS over UDP.
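
To make the failure mode concrete: each RPC over UDP travels as a single datagram, which IP has to split into several fragments on Ethernet, and losing any one fragment loses the whole datagram. A minimal Python sketch of the arithmetic, assuming a 1500-byte MTU, a 20-byte IPv4 header without options, and an 8192-byte UDP payload (illustrative values, not figures taken from this thread):

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def fragment_count(udp_payload, mtu=1500):
    """Return how many IPv4 fragments one UDP datagram needs."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil((udp_payload + UDP_HEADER) / per_fragment)


print(fragment_count(8192))
```

Under these assumptions an 8192-byte payload needs 6 fragments, so a sender that runs out of buffer space part-way through transmits some of them for nothing. TCP is not affected because it segments the data itself rather than relying on IP fragmentation.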

there is an easy workaround: enlarge the size of the
RPC transport socket's buffers. see the NFS FAQ for
instructions.

a bug was reported in the eepro100 driver too, and that
may have some effect on client performance.



2002-08-07 15:55:38

by Trond Myklebust

Subject: Re: RH 7.3 kernels and NFS performance

>>>>> " " == Rex Dieter <[email protected]> writes:

> The Hit is big and bad. When using the default NFS export
> options (in particular sync), I see NFS write performance of
> ~50k with the 2.4.18-5 kernels. When set async, performance
> comes back up to par. I can give you the bugzilla # I reported
> it under, if you like.

That is *not* a bug. The NFS protocol does not actually allow for the
sort of behaviour that 'async' provides. The main problem is that
actions like 'fsync()' on the client can no longer guarantee that the
data is physically written to disk on the server. IOW if the server
crashes, your data is lost...

The new default of 'sync' for exports is quite correct and should only
be changed by people who are aware of the consequences for data
integrity.
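
The guarantee in question can be exercised from the client side. Below is a minimal Python sketch, not a benchmark anyone in this thread used: it writes a file, calls fsync(), and returns the byte count and elapsed time. The function name, sizes, and the path in the usage comment are all hypothetical. With a 'sync' export, fsync() should not return until the server has committed the data to disk; with 'async' it may return sooner, which is faster but gives up the guarantee described above.

```python
import os
import time


def timed_fsync_write(path, total_bytes=1024 * 1024, block_size=8192):
    """Write total_bytes to path, fsync it, and return (bytes, seconds)."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            written += os.write(fd, chunk)
        # Ask for the data to be flushed to stable storage before timing stops.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written, time.perf_counter() - start


# Usage (hypothetical mount point):
#   nbytes, secs = timed_fsync_write("/mnt/nfs/fsync-test.bin")
#   print(nbytes / secs / 1024, "KB/s")
```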

Cheers,
Trond



2002-08-07 16:05:37

by Rex Dieter

Subject: Re: RH 7.3 kernels and NFS performance

On Wednesday 07 August 2002 10:55 am, Trond Myklebust wrote:
> >>>>> " " == Rex Dieter <[email protected]> writes:
> > The Hit is big and bad. When using the default NFS export
> > options (in particular sync), I see NFS write performance of
> > ~50k with the 2.4.18-5 kernels. When set async, performance
> > comes back up to par. I can give you the bugzilla # I reported
> > it under, if you like.

> That is *not* a bug.

It depends. I agree that defaulting to 'sync' is not a bug. I do *not*
agree that 50k writes is not a bug.

-- Rex



2002-08-07 16:17:56

by Trond Myklebust

Subject: Re: RH 7.3 kernels and NFS performance

>>>>> " " == Rex Dieter <[email protected]> writes:

> It depends. I agree that defaulting to 'sync' is not a bug. I
> do *not* agree that 50k writes is not a bug.

NFSv2 or v3?

Cheers,
Trond



2002-08-07 16:21:08

by Rex Dieter

Subject: Re: RH 7.3 kernels and NFS performance

On Wednesday 07 August 2002 11:17 am, Trond Myklebust wrote:
> >>>>> " " == Rex Dieter <[email protected]> writes:
> > It depends. I agree that defaulting to 'sync' is not a bug. I
> > do *not* agree that 50k writes is not a bug.
>
> NFSv2 or v3?

I tried primarily v3, but I think I remember testing v2 as well, with
similar results.

-- Rex



2002-08-07 16:25:49

by Trond Myklebust

Subject: Re: RH 7.3 kernels and NFS performance

>>>>> " " == Rex Dieter <[email protected]> writes:

> On Wednesday 07 August 2002 11:17 am, Trond Myklebust wrote:
>> >>>>> " " == Rex Dieter <[email protected]> writes:
>> > It depends. I agree that defaulting to 'sync' is not a bug.
>> > I do *not* agree that 50k writes is not a bug.
>>
>> NFSv2 or v3?

> I tried primarily v3, but I think I remember testing v2 as
> well, with similar results.

If you are getting that performance with v3, then it does indeed sound
like a bug somewhere. Try with a stock 2.4.19 kernel instead...

Cheers,
Trond



2002-08-07 16:43:28

by Rex Dieter

Subject: Re: RH 7.3 kernels and NFS performance

On Wed, 7 Aug 2002, Trond Myklebust wrote:

> >>>>> " " == Rex Dieter <[email protected]> writes:
>
> > On Wednesday 07 August 2002 11:17 am, Trond Myklebust wrote:
> >> >>>>> " " == Rex Dieter <[email protected]> writes:
> >> > It depends. I agree that defaulting to 'sync' is not a bug.
> >> > I do *not* agree that 50k writes is not a bug.
> >>
> >> NFSv2 or v3?
>
> > I tried primarily v3, but I think I remember testing v2 as
> > well, with similar results.
>
> If you are getting that performance with v3, then it does indeed sound
> like a bug somewhere. Try with a stock 2.4.19 kernel instead...

I just tested things with a test kernel (2.4.18-7) from redhat, and
sync write speeds were again way up there. Problem fixed, as far as I'm
concerned... hopefully, they'll release an errata soon.

--
Rex A. Dieter [email protected]
Computer System Administrator http://www.math.unl.edu/~rdieter/
Mathematics and Statistics
University of Nebraska Lincoln




2002-08-07 17:15:15

by Tom McNeal

Subject: Re: RH 7.3 kernels and NFS performance

Rex Dieter wrote:
>
> On Wed, 7 Aug 2002, Trond Myklebust wrote:
>
> >
> > If you are getting that performance with v3, then it does indeed sound
> > like a bug somewhere. Try with a stock 2.4.19 kernel instead...
>
> I just tested things with a test kernel (2.4.18-7) from redhat, and
> sync write speeds were again way up there. Problem fixed, as far as I'm
> concerned... hopefully, they'll release an errata soon.
>
> --
> Rex A. Dieter [email protected]
> Computer System Administrator http://www.math.unl.edu/~rdieter/
> Mathematics and Statistics
> University of Nebraska Lincoln

and

"Lever, Charles" wrote:
>
> hi jeff-
>
> > I remember reading on the list about the terrible NFS
> > performance of the latest RH kernels. Would someone
> > care to summarize this for me (UDP or TCP, etc.)? Also,
> > does anyone have any rough estimates of the performance
> > hit?
>
> if you are referring to bad NFS client performance, this
> is due to a bug in the Linux IP fragmentation logic which
> causes it to send part of a fragmented packet, and drop
> the rest, if it runs out of socket buffer space during
> the fragmentation process.
>
> thus it only affects NFS over UDP.
>
> there is an easy workaround: enlarge the size of the
> RPC transport socket's buffers. see the NFS FAQ for
> instructions.
>
> a bug was reported in the eepro100 driver too, and that
> may have some effect on client performance.
>

So if this is one and the same bug, then enlarging the transport
socket size is actually discussed in the performance section
of the howto documents (section 5.7) at
http://nfs.sourceforge.net/nfs-howto/performance.html#MEMLIMITS
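
Before changing anything, it helps to see what the current socket buffer limits are. The Python sketch below only reads the values Linux exposes under /proc/sys/net/core and lists those below a threshold. The 262144-byte threshold and the function names are assumptions made for illustration; take the actual recommended setting from the HOWTO section linked above.

```python
from pathlib import Path

# Assumed illustrative threshold, not a value stated in this thread.
TARGET_BYTES = 256 * 1024

LIMIT_NAMES = ("rmem_default", "rmem_max", "wmem_default", "wmem_max")


def read_limits(base="/proc/sys/net/core", names=LIMIT_NAMES):
    """Read socket buffer limits (in bytes) from files under base."""
    limits = {}
    for name in names:
        path = Path(base) / name
        if path.exists():
            limits[name] = int(path.read_text().split()[0])
    return limits


def too_small(limits, target=TARGET_BYTES):
    """Return the names of limits that are below target, sorted."""
    return sorted(name for name, value in limits.items() if value < target)


for name in too_small(read_limits()):
    print(name, "is below", TARGET_BYTES, "bytes")
```

Raising a limit means writing a larger number to the same file as root; the sketch deliberately stops short of doing that.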

Tom
--
------------------------------------------------------------
Tom McNeal [email protected] (650)906-0761 (cell)
------------------------------------------------------------



2002-08-07 19:04:17

by Bill Rugolsky Jr.

Subject: Re: RH 7.3 kernels and NFS performance

On Wed, Aug 07, 2002 at 11:43:24AM -0500, Rex Dieter wrote:
> I just tested things with a test kernel (2.4.18-7) from redhat, and
> sync write speeds were again way up there. Problem fixed, as far as I'm
> concerned... hopefully, they'll release an errata soon.

BTW, Arjan has included many of Trond's RPC patches in the latest
Rawhide kernels:

$ rpm -qp --changelog kernel-2.4.18-7.93.src.rpm | head -4
* Fri Aug 02 2002 Arjan van de Ven <[email protected]>

- added most of the NFS patchkit

Based on the patch file names, these look like Trond's 2.4.19-rc3 patches:

linux-2.4.19-00-fix_clnt.patch
linux-2.4.19-01-fix_kmap1.patch
linux-2.4.19-02-fix_kmap2.patch
linux-2.4.19-03-fix_kmap3.patch
linux-2.4.19-04-rpc_rep.patch
linux-2.4.19-05-rpc_rtt1.patch
linux-2.4.19-06-rpc_rtt2.patch
linux-2.4.19-07-rpc_rtt3.patch
linux-2.4.19-08-xprt_write.patch
linux-2.4.19-09-rpc_cong1.patch
linux-2.4.19-10-rpc_cong2.patch
linux-2.4.19-11-rpc_cong3.patch
linux-2.4.19-12-rpc_wspace.patch
linux-2.4.19-13-rpc_cleanup.patch
linux-2.4.19-16-rpcbuf.patch

This should be nearly current, though Trond's 2.4.19 patches are broken
down somewhat differently.

I hope that Trond is planning to submit most of this to Marcelo for
2.4.20, as Neil is submitting his NFS server patches, and it would be
very nice to have a 2.4.20 that is fully up-to-date w.r.t current NFS
patches for both client and server. [Those of us on this list for a
long time will remember a similar exercise with Alan for 2.2.19 --
gosh, that seems like ancient history. :-P]

In any case, Trond's work should receive some more widespread "end-user"
testing from the Limbo Beta and Rawhide crowd in the near future.

I've long pined [since RH 5.x] for a Red Hat kernel with working
NFS that played nice with the NetApp and Solaris. It has since become
a ritual to patch NFS into the Red Hat tree.

Many thanks to Trond and Neil for their tireless efforts!

It is difficult to overestimate how important working NFS is in deployments
like ours ...

Regards

Bill Rugolsky



2002-08-08 05:08:57

by seth vidal

Subject: Re: RH 7.3 kernels and NFS performance

> > > I tried primarily v3, but I think I remember testing v2 as
> > > well, with similar results.
> >
> > If you are getting that performance with v3, then it does indeed sound
> > like a bug somewhere. Try with a stock 2.4.19 kernel instead...
>
> I just tested things with a test kernel (2.4.18-7) from redhat, and
> sync write speeds were again way up there. Problem fixed, as far as I'm
> concerned... hopefully, they'll release an errata soon.

look carefully - the only patches in 2.4.18-X (where X >5) are to misc
network drivers. 2.4.18-5 with some network cards (specifically eepros and
3c905s) were terribly slow. 2.4.18-5e+ fix it.

-sv






2002-08-08 11:56:20

by Rex Dieter

Subject: Re: RH 7.3 kernels and NFS performance


----- Original Message -----
From: "Seth Vidal" <[email protected]>

> > I just tested things with a test kernel (2.4.18-7) from redhat, and
> > sync write speeds were again way up there. Problem fixed, as far as I'm
> > concerned... hopefully, they'll release an errata soon.
>
> look carefully - the only patches in 2.4.18-X (where X >5) are to misc
> network drivers. 2.4.18-5 with some network cards (specifically eepros and
> 3c905s) were terribly slow. 2.4.18-5e+ fix it.

I didn't look carefully, and its changelog didn't reveal anything, but I
can tell you that I HAD tried kernel-2.4.18-5e, and the problem persisted.
As of 2.4.18-7, the problem is gone. I may have to take a closer look
now... I'm curious.

-- Rex





