2007-11-20 21:17:14

by Peter Staubach

Subject: Re: [NFS] NFS Digest, Vol 18, Issue 70 (NFS performance problems)

Wendy Cheng wrote:
>
>> top - 15:50:56 up 20 days, 1:33, 9 users, load average: 3.42, 2.95, 2.38
>>
>> 19200 geo0501 15 0 75076 5224 3480 S 2 0.1 0:07.94 smbd
>> 2336 root 10 -5 0 0 0 S 1 0.0 57:07.70 kjournald
>> 2334 root 10 -5 0 0 0 S 1 0.0 33:19.89 kjournald
>> 2279 root 10 -5 0 0 0 S 0 0.0 15:10.98 md0_raid1
>> 2283 root 10 -5 0 0 0 S 0 0.0 24:45.79 md1_raid1
>> 3935 root 15 0 0 0 0 S 0 0.0 14:04.25 nfsd
>> 3943 root 15 0 0 0 0 S 0 0.0 14:18.43 nfsd
>> 3947 root 15 0 0 0 0 S 0 0.0 13:57.06 nfsd
>> 8325 ed0127 15 0 75044 4812 3264 S 0 0.1 0:01.29 smbd
>>
>
> Intuitively (based on ext3's journal threads info above) I would suspect this is due to the change of the export default option from "async" to "sync" between 2.6.9 and 2.6.18 kernels. So go to your /etc/exports file and explicitly set the export option to "async" to see whether you can get the performance back.
>
> e.g. changes "/server *(rw)" to "/server *(async, rw)".

While this may or may not restore your performance aspects, it
is not safe to make this change. The change was made for a
reason.

Please explore any and all other possibilities before making this change.
It is not free.

Thanx...

ps

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
_______________________________________________
Please note that [email protected] is being discontinued.
Please subscribe to [email protected] instead.
http://vger.kernel.org/vger-lists.html#linux-nfs



2007-11-20 21:25:35

by Wendy Cheng

Subject: Re: [NFS] NFS Digest, Vol 18, Issue 70 (NFS performance problems)

Peter Staubach wrote:
> Wendy Cheng wrote:
>>
>>> top - 15:50:56 up 20 days, 1:33, 9 users, load average: 3.42, 2.95, 2.38
>>>
>>> 19200 geo0501 15 0 75076 5224 3480 S 2 0.1 0:07.94 smbd
>>> 2336 root 10 -5 0 0 0 S 1 0.0 57:07.70 kjournald
>>> 2334 root 10 -5 0 0 0 S 1 0.0 33:19.89 kjournald
>>> 2279 root 10 -5 0 0 0 S 0 0.0 15:10.98 md0_raid1
>>> 2283 root 10 -5 0 0 0 S 0 0.0 24:45.79 md1_raid1
>>> 3935 root 15 0 0 0 0 S 0 0.0 14:04.25 nfsd
>>> 3943 root 15 0 0 0 0 S 0 0.0 14:18.43 nfsd
>>> 3947 root 15 0 0 0 0 S 0 0.0 13:57.06 nfsd
>>> 8325 ed0127 15 0 75044 4812 3264 S 0 0.1 0:01.29 smbd
>>
>> Intuitively (based on ext3's journal threads info above) I would
>> suspect this is due to the change of the export default option from
>> "async" to "sync" between 2.6.9 and 2.6.18 kernels. So go to your
>> /etc/exports file and explicitly set the export option to "async" to
>> see whether you can get the performance back.
>>
>> e.g. changes "/server *(rw)" to "/server *(async, rw)".
>
> While this may or may not restore your performance aspects, it
> is not safe to make this change. The change was made for a
> reason.
Not to start a flame war :) but please read his email. His *old* system,
which uses the "async" option, has been running fine for several years. Why
is the "async" option all of a sudden such a big issue?

-- Wendy
>
> Please any and all other possibilities before making this change.
> It is not free.
>
> Thanx...
>
> ps




2007-11-21 16:06:22

by Chuck Lever

Subject: Re: [NFS] NFS Digest, Vol 18, Issue 70 (NFS performance problems)



Attachments:
chuck.lever.vcf (315.00 B)
(No filename) (228.00 B)
(No filename) (362.00 B)

2007-11-26 03:12:24

by Wendy Cheng

Subject: Re: [NFS] NFS Digest, Vol 18, Issue 70 (NFS performance problems)

Chuck Lever wrote:

> Hi Wendy-
>
> That means his old system would have been exposed to data corruption
> issues if it crashes (panic, power outage, etc). Using "sync" became
> default because async is inherently careless about data integrity.
> The data loss is often entirely silent.
>
> This is explained in the Linux NFS FAQ, question B6.
>
> See http://nfs.sourceforge.net/index.php#faq_b6


Setting aside NFS for a moment... for a locally mounted filesystem, the
file data stays in the cache until write-back occurs. If the system
crashes, there is always a possibility that data will be lost. Journaling
filesystems such as EXT3 only ensure that meta-data is not corrupted; there
is no guarantee that data is saved unless the filesystem is mounted with
the "sync" option. Because of the non-trivial performance hit, filesystems
are rarely mounted with the "sync" option. Applications normally understand
the problem and apply fsync() and/or similar mechanisms whenever required.
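
For illustration only, here is a minimal sketch of that application-side
pattern, assuming a POSIX system and Python. The helper name durable_write
and the file name are made up for this example and come from nobody's
actual code in this thread.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path, then ask the kernel to push the file data and
    the parent directory entry to stable storage before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # drain Python's userspace buffer into the kernel
        os.fsync(f.fileno())   # request write-out of the file's dirty pages
    # Sync the parent directory as well, so the new name itself is flushed.
    # (Opening a directory like this is an assumption that holds on Linux.)
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        durable_write(target, b"important record\n")
        print("wrote", os.path.getsize(target), "bytes")
```

Without the fsync() calls the write would simply sit in the page cache
until write-back, which is the behaviour described above.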

For Linux NFS servers to deviate from this common practice, I assume from
reading the FAQ that something has been done (particularly on the client
end) to alleviate the performance hit? Could you elaborate on this?

Again, I'm not trying to argue and/or start a flamewar. I have a need to
understand more about this issue. The "sync" operation is very expensive
for us (cluster filesystem) and I'm under the gun to improve our NFS
file serving performance at this moment.

-- Wendy





2007-11-26 04:42:06

by Trond Myklebust

Subject: Re: [NFS] NFS Digest, Vol 18, Issue 70 (NFS performance problems)


On Sun, 2007-11-25 at 22:28 -0500, Wendy Cheng wrote:
> Setting aside NFS for a moment... for a locally mounted filesystem, the
> file data stays in the cache until write-back occurs. Upon crashing,
> there are always possibilities that the data could be lost. Journaling
> filesystems such as EXT3 can only ensure no meta-data corruption, there
> is no guarantee that data would be saved unless the filesystem is
> mounted with "sync" option. With non-trivial performance hits, most of
> the filesystems are hardly mounted with "sync" option. Applications
> normally understand the problem and whenever required, fsync() and/or
> similar mechanisms are applied.
>
> For Linux NFS servers to deviate from this common practice, by reading
> the FAQ, I assume something has been done (particularly from client
> ends) to alleviate the performance hit ? Could you elaborate more about
> this ?
>
> Again, I'm not trying to argue and/or start a flamewar. I have a need to
> understand more about this issue. The "sync" operation is very expensive
> for us (cluster filesystem) and I'm under the gun to improve our NFS
> file serving performance at this moment.

You've got it wrong. The 'async' option was the Linux-specific option
that violates the NFS spec, not 'sync'.

Please read the RFCs: NFS has always imposed strict requirements on the
server w.r.t. data integrity. 'async' violates those requirements
because it allows the server to cache data in circumstances where the
client is under the belief that the data is on permanent storage.
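
To make the distinction concrete, here is a toy model in Python. It is only
a sketch under simplifying assumptions: ToyServer, its dictionaries and its
method names are invented for illustration and are not how nfsd is
implemented. The point is when the reply is sent relative to when the data
reaches stable storage.

```python
class ToyServer:
    """Toy model of a file server with a volatile cache and a stable disk."""

    def __init__(self, async_export):
        self.async_export = async_export
        self.cache = {}   # volatile: lost on crash
        self.disk = {}    # stable: survives crash

    def write(self, name, data):
        self.cache[name] = data
        if not self.async_export:
            # "sync" behaviour: data is on stable storage before we reply.
            self.disk[name] = data
        # With "async" the reply goes out while the data is only cached.
        return "OK"

    def crash_and_reboot(self):
        self.cache = {}   # everything that was only cached is gone

    def read(self, name):
        return self.cache.get(name, self.disk.get(name))


if __name__ == "__main__":
    for mode in (False, True):
        server = ToyServer(async_export=mode)
        reply = server.write("report.txt", b"data")
        server.crash_and_reboot()
        label = "async" if mode else "sync"
        print(label, reply, server.read("report.txt"))
```

In the async case the client received "OK" and may already have discarded
its own copy, yet the data is gone after the reboot and nothing reports an
error.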

Trond




2007-11-26 05:02:36

by J. Bruce Fields

Subject: Re: [NFS] NFS Digest, Vol 18, Issue 70 (NFS performance problems)

On Sun, Nov 25, 2007 at 10:28:43PM -0500, Wendy Cheng wrote:
> Chuck Lever wrote:
>
> > Hi Wendy-
> >
> > That means his old system would have been exposed to data corruption
> > issues if it crashes (panic, power outage, etc). Using "sync" became
> > default because async is inherently careless about data integrity.
> > The data loss is often entirely silent.
> >
> > This is explained in the Linux NFS FAQ, question B6.
> >
> > See http://nfs.sourceforge.net/index.php#faq_b6
>
>
> Setting aside NFS for a moment... for a locally mounted filesystem, the
> file data stays in the cache until write-back occurs. Upon crashing,
> there are always possibilities that the data could be lost. Journaling
> filesystems such as EXT3 can only ensure no meta-data corruption, there
> is no guarantee that data would be saved unless the filesystem is
> mounted with "sync" option. With non-trivial performance hits, most of
> the filesystems are hardly mounted with "sync" option. Applications
> normally understand the problem and whenever required, fsync() and/or
> similar mechanisms are applied.

As far as I know, even an explicit fsync() is ineffective in the case of
NFSv2 when the async export option is set. (With v3 and v4 I think it
still works, since (from a quick check of the code) it does respect the
stable flag even on async exports.)

An application on a local disk goes down when the system goes down,
whereas an NFS server can reboot without the applications using it
exiting.

So while a well-designed application might be built to deal with the
situation where a mkdir() that a previous instance performed is no
longer there when it starts up again, it may not be ready to deal with a
directory it just created simply disappearing out from under it while
it's running.

(Stupid question: what would it take to give NFS the equivalent to
COMMIT for directory operations?)

--b.



Subject: Re: [NFS] NFS Digest, Vol 18, Issue 70 (NFS performance problems)

Thanks for the replies, everyone. Last week I didn't have the time to
send this mail. Some comments below:

On Wed, Nov 21, 2007 at 11:04:55AM -0500, Chuck Lever wrote:
> Wendy Cheng wrote:
> >Peter Staubach wrote:
> >>Wendy Cheng wrote:

> >>>Intuitively (based on ext3's journal threads info above) I would
> >>>suspect this is due to the change of the export default option from
> >>>"async" to "sync" between 2.6.9 and 2.6.18 kernels.

The problem with that idea is that my /etc/exports file has (always?) had
the "sync" option on. (I checked a backup file from 2005-03-14.) I think the
SUSE management software already used the "sync" option in /etc/exports
(probably when I first transitioned the server to Linux I used whatever
YaST chose as default).

The default export might have been "async", but unless the option "sync"
in /etc/exports was being ignored I was already using "sync". Nevertheless
I will try to change to async and test if it makes a difference.
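
For anyone who wants to check their own file the same way, here is a rough
Python sketch that reports whether "sync" or "async" is set explicitly for
each client on an exports line. It is a simplified parser written for this
illustration (the function name explicit_sync_mode is made up) and it
assumes the plain "path client(options)" form quoted in this thread; it
does not handle continuation lines, quoted paths or default option lists.

```python
import re


def explicit_sync_mode(line):
    """Map each client on one exports line to 'sync', 'async', or None
    (None means neither option is written, so the default applies)."""
    line = line.split("#", 1)[0].strip()   # drop comments and whitespace
    modes = {}
    # Find every client(option,option,...) group on the line.
    for client, opts in re.findall(r"(\S+?)\(([^)]*)\)", line):
        names = [o.strip() for o in opts.split(",")]
        if "async" in names:
            modes[client] = "async"
        elif "sync" in names:
            modes[client] = "sync"
        else:
            modes[client] = None
    return modes


if __name__ == "__main__":
    for example in ('/server *(rw)', '/server *(async, rw)', '/server *(rw,sync)'):
        print(example, "->", explicit_sync_mode(example))
```

A result of None for a client is the interesting case here, because then
the behaviour depends on whatever default the installed software uses.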

(one day later: )

I have now tried it and the load on the NFS server is much lower and KDE
logins seem to be reasonably fast now.

This doesn't mean that the drop in performance was due to "sync" versus
"async". That is, the old version could really have been using "sync", and
some other change (not a change from "async" to "sync") could have made
the performance drop a lot between those versions.

One thing I suspected was quotas, since the old version didn't seem to
handle them. But turning off quotas was the first thing I tried, to see
if it made a difference. It didn't.

BTW, part of the problem is due to KDE doing a lot of file activity.
I already knew that fvwm (what I personally use) did not take a long
time to log in, and I have now tried GNOME, which also started fast enough
(with "sync"). But KDE used to work...

> >>> So go to your
> >>>/etc/exports file and explicitly set the export option to "async" to
> >>>see whether you can get the performance back.

> >>While this may or may not restore your performance aspects, it
> >>is not safe to make this change. The change was made for a
> >>reason.

> That means his old system would have been exposed to data corruption
> issues if it crashes (panic,

Luckily it has been reliable. Some years ago a previous server crashed
a lot, but that was due to an obscure bug (XFS + SMP kernel + NFS = crash,
IIRC), and I don't know if it was ever fixed:

http://groups.google.com/group/alt.os.linux.suse/browse_frm/thread/f24dd8f878bb3ea3/7e6ffa45f3873716?hl=en&lnk=st#7e6ffa45f3873716

> power outage,

Of course, the server is on a UPS.

And of course, some hours after I wrote the above, the UPS had a hiccup
and the server crashed in the middle of the night. We had to change
its batteries.

> It's another case of where we perform better in older kernels but we are
> more correct in recent kernels... but our users don't appreciate the
> correctness improvement :-)

The correctness improvement doesn't matter if the performance is so low
that you can't use it. I hope this has solved the problem, because I was
getting desperate. I had thought about abandoning NFS on Linux and trying
OpenBSD, but so many things would be different (RAID, filesystems, backups)
that I really didn't want to.

BTW, is what I am doing rare? I have about 50 Linux computers (including
the mail server) mounting user areas from the NFS server. (Most of the
time only some of them are being used.) The users mostly use KDE (because
it has been the default option in SUSE for the past few years). This
sort of setup allows a user to log in on any of the computers and
have the same environment. So I would expect it to be widely used. But
when I asked about this in other places I never got a reply of the kind
"I am doing the same, and it works for me".



Thanks again.
--
rps
