2008-06-24 20:31:27

by J. Bruce Fields

Subject: Re: [NFS] NFS performance debugging

On Tue, Jun 24, 2008 at 12:17:24PM +0200, Adrian von Bidder wrote:
> Hi again,
>
> Thanks for your replies (You too, Trond)

The custom around here is to leave everyone on the cc: line.

>
> On Monday 23 June 2008 21.28:36 you wrote:
>
> [... NFS performance ...]
>
> > In what way exactly is it sluggish?
>
> Starting KDE, opening documents, sometimes also closing oo.org and saving
> documents takes several seconds longer than on local disk.

"close" on nfs is an operation that requires a round-trip to the server
and waiting for the disk to commit any writes made before the close, so
if you've got to do a lot of those it can take time. Fooling with the
journaling on the exported filesystem may help.
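
One rough way to check whether close is really where the time goes is to time the write and the close separately on the NFS mount and compare against a local directory. The following is only a minimal sketch, assuming Python 3 on the client; the function name `time_write_close` and the temporary file names are made up for illustration and are not part of any NFS tool.

```python
import os
import tempfile
import time


def time_write_close(directory, payload=b"x" * 4096, rounds=20):
    """Return one (write_seconds, close_seconds) tuple per round.

    Each round creates a scratch file in `directory`, writes `payload`,
    closes the file, and removes it again. On an NFS mount the close is
    expected to include flushing the written data to the server.
    """
    samples = []
    for i in range(rounds):
        # Hypothetical scratch file name; the pid keeps parallel runs apart.
        path = os.path.join(directory, "closetest-%d-%d.tmp" % (os.getpid(), i))
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        t0 = time.perf_counter()
        os.write(fd, payload)
        t1 = time.perf_counter()
        os.close(fd)
        t2 = time.perf_counter()
        os.unlink(path)
        samples.append((t1 - t0, t2 - t1))
    return samples


def summarize(samples):
    """Return (mean_write_seconds, mean_close_seconds) for the samples."""
    n = len(samples)
    return (sum(w for w, _ in samples) / n, sum(c for _, c in samples) / n)
```

Calling `summarize(time_write_close("/path/on/nfs"))` and again with a local directory should give a first impression; if the mean close time on NFS is far larger than the mean write time, the commit on close is a likely suspect.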

> Certainly network latency (especially with these silly lots of small config
> files) takes some time, but I'm still surprised. At the same time, I don't
> have data to compare a "known good" NFS against ours, so perhaps NFS is
> indeed so slow?
>
> >
> > > tcpdump shows many "reply ERR 1448" etc. msgs whenever NFS activity is
> > > going on (both stat like with "find /home" or read/write with dd)
> >
> > I'm afraid I don't know how to read that tcpdump output.
>
> tcpdump "-vvv" doesn't give more information on these packets; at the same
> time wireshark doesn't show anything suspicious except tons of wrong TCP
> checksums caused (I hope...) by offloading.

Yes, that's normal.

> I'll have to look if I can get
> the raw traffic at the network switch to check this (but I think with 30%
> and more wrong tcp checksums, traffic would completely break down so I'm
> quite confident here.)
>
>
> Slightly different topic: is there an NFS related mailing list I can
> subscribe to? This one is apparently closed for new subscribers, and the
> bounce instructs me to send mail to [email protected] which
^^^^
vger

Where'd the typo in that address get introduced?

> bounces :-( Reading others' NFS postings might just give me more ideas on
> where to look.

Should be: http://vger.kernel.org/vger-lists.html#linux-nfs

> TODO today: play around with NFSv4 on the shaky assumption that nfsv3 is
> actually working but net latency is killing my performance.

Delegations *might* help if the problem is really open latency.

--b.

_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
_______________________________________________
Please note that [email protected] is being discontinued.
Please subscribe to [email protected] instead.
http://vger.kernel.org/vger-lists.html#linux-nfs



2008-06-25 07:03:12

by Adrian von Bidder

Subject: Re: [NFS] NFS performance debugging

On Tuesday 24 June 2008 22.29:31 J. Bruce Fields wrote:
> On Tue, Jun 24, 2008 at 12:17:24PM +0200, Adrian von Bidder wrote:

> > Starting KDE, opening documents, sometimes also closing oo.org and
> > saving documents takes several seconds longer than on local disk.
>
> "close" on nfs is an operation that requires a round-trip to the server
> and waiting for the disk to commit any writes made before the close, so
> if you've got to do a lot of those it can take time. Fooling with the
> journaling on the exported filesystem may help.

Are there tools to measure latencies on NFS? Given a network dump, the
desired output would be histograms of latencies by file operation. (Or maybe
I can catch the information on the client, on the VFS side instead of NFS?)

At this time, I really need to collect more data on where the problem is
since all I'm doing right now is fooling around based on assumptions... :-(
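
For the client-side variant, something like the following could be a starting point. It is only a sketch, assuming Python 3 on the client: it walks a directory tree, times stat, open, read and close on each regular file, and counts the results in power-of-ten microsecond buckets per operation. The names `bucket_us`, `measure_tree` and `print_histogram` are invented for this example, and it measures at the system call level, so results served from the client cache will look much faster than a cold run.

```python
import collections
import os
import stat
import time


def bucket_us(seconds):
    """Return the lower bound (in microseconds) of the power-of-ten bucket."""
    us = seconds * 1e6
    bound = 1
    while bound * 10 <= us:
        bound *= 10
    return bound


def timed(hist, op, func, *args):
    """Call func(*args), record its duration under `op`, return its result."""
    t0 = time.perf_counter()
    result = func(*args)
    hist[op][bucket_us(time.perf_counter() - t0)] += 1
    return result


def measure_tree(root, max_files=1000):
    """Return {operation: Counter({bucket_lower_bound_us: count})}."""
    hist = collections.defaultdict(collections.Counter)
    seen = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if seen >= max_files:
                return hist
            path = os.path.join(dirpath, name)
            try:
                st = timed(hist, "stat", os.stat, path)
                if not stat.S_ISREG(st.st_mode):
                    continue  # skip FIFOs, sockets and the like
                fd = timed(hist, "open", os.open, path, os.O_RDONLY)
            except OSError:
                continue  # unreadable or vanished file
            try:
                timed(hist, "read", os.read, fd, 65536)
            finally:
                timed(hist, "close", os.close, fd)
            seen += 1
    return hist


def print_histogram(hist):
    """Print one line per operation and bucket."""
    for op in sorted(hist):
        for bound in sorted(hist[op]):
            print("%-5s >= %7d us: %d" % (op, bound, hist[op][bound]))
```

Running `print_histogram(measure_tree("/home/someuser/.kde"))` (the path is just an example) once on NFS and once on a local copy would at least show which operation dominates, which is more than guessing.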

OTOH I'd suspect KDE/oo.org startup to be mostly reads of those config
files, so the problem shouldn't be close latencies. Assumptions again.

> > TODO today: play around with NFSv4 on the shaky assumption that nfsv3
> > is actually working but net latency is killing my performance.
>
> Delegations *might* help if the problem is really open latency.

First tries showed
* There are no ACLs on my files now
* user id mapping seems funny: some users map to nobody, others map
correctly. Huh?
* Performance seems to be ok (timing desktop applications is always
difficult, and so far I'm working against the production server with
varying load anyway...)

Haven't investigated these yet...

cheers
-- vbi


--
Today is Sweetmorn, the 30th day of Confusion in the YOLD 3174

