2003-12-24 15:12:36

by Wade Hampton

Subject: Linux client on Solaris 7 NFS server

Merry Christmas and Happy Holidays!

I have a Fedora Core 1.0 client connecting to a
Solaris X86 Version 7 NFS server. The Solaris server
has locked up several times since I started accessing
it from Linux (this is the only change to the Solaris
system's environment). I was wondering if this
was an NFS problem on Solaris 7 or could be a
hardware problem with the NIC.

The options I was using for NFS mount were
auto,ro,bg,soft.

The NFS HOWTO says that
"Solaris servers are especially sensitive to packet size".
This is vague and does not say what the consequences of
default values, or of values other than 32768, might be.

Could "sensitive" mean bad performance, lost data, lost connections,
disconnects, locked up clients, or locked up servers?

Could this be a reason that the Solaris server is locking up?

Has anyone else experienced this?

If this is the case, does Sun have a patch that addresses this?

Any Solaris Admins out there that can tell me how I can
check the Solaris server to see if such a patch (if it exists)
is applied?

I did change my mount to the following options:
auto,rsize=32768,wsize=32768,ro,bg,soft.

So far, I have not had any server lockups (since yesterday).
However, I need to be sure that I am not introducing a problem
by using Linux boxes with these servers.....
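For reference, the two option sets above can be written out as explicit mount commands. This is only a sketch: the server name, export path and mount point are made-up placeholders, "auto" is left out because it only has meaning in /etc/fstab, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: the server name, export path and mount point are hypothetical.
SERVER="solaris7"
EXPORT="/export/data"
MOUNTPOINT="/mnt/solaris"

# Options from the message: the original set, then the set with explicit sizes.
OLD_OPTS="ro,bg,soft"
NEW_OPTS="rsize=32768,wsize=32768,ro,bg,soft"

# Print the commands for review instead of running them (mount needs root).
echo "mount -t nfs -o $OLD_OPTS $SERVER:$EXPORT $MOUNTPOINT"
echo "mount -t nfs -o $NEW_OPTS $SERVER:$EXPORT $MOUNTPOINT"
```

Printing first makes it easy to compare the two lines before trying either one on a machine that matters.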

Also, if my Fedora box is accessing the server via a WAN,
should I use NFS over TCP/IP? If my network goes down,
do I have to unmount and remount if I am using TCP/IP?

Thanks,
--
Wade Hampton



-------------------------------------------------------
This SF.net email is sponsored by: IBM Linux Tutorials.
Become an expert in LINUX or just sharpen your skills. Sign up for IBM's
Free Linux Tutorials. Learn everything from the bash shell to sys admin.
Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-12-24 22:30:49

by Steve Dickson

Subject: Re: Linux client on Solaris 7 NFS server

Wade Hampton wrote:

> I have a Fedora Core 1.0 client connecting to a
> Solaris X86 Version 7 NFS server. The Solaris server
> has locked up several times since I started accessing
> it from Linux (this is the only change to the Solaris
> system's environment). I was wondering if this
> was an NFS problem on Solaris 7 or could be a
> hardware problem with the NIC.

A network trace of the lockup would be good...

SteveD.




2003-12-29 16:49:23

by Lever, Charles

Subject: RE: Linux client on Solaris 7 NFS server

> Also, if my Fedora box is accessing the server via a WAN,
> should I use NFS over TCP/IP? If my network goes down,
> do I have to unmount and remount if I am using TCP/IP?

yes, you should use TCP when running NFS over a WAN.

generally, on recent kernels, network partitions will
not cause issues that require a umount/remount sequence.
if you are unsure about the reliability of TCP, may i
suggest you use it with the "soft" mount option and a
high retrans setting (see the "mount" and "nfs" man pages).
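a rough sketch of what such a mount could look like follows. the host, export and mount point are placeholders, and the timeo and retrans values are arbitrary examples rather than recommendations; check the "nfs" man page for what your kernel supports.

```shell
#!/bin/sh
# Sketch only: host, export and mount point are hypothetical placeholders.
SERVER="solaris7"
EXPORT="/export/data"
MOUNTPOINT="/mnt/solaris"

# tcp      use TCP instead of the UDP default
# soft     return an error to the application once retries are exhausted
# timeo    timeout in tenths of a second (value chosen arbitrarily)
# retrans  number of retries before a soft mount gives up (arbitrary value)
OPTS="ro,bg,tcp,soft,timeo=600,retrans=10,rsize=32768,wsize=32768"

# Print the command for review instead of running it (mount needs root).
echo "mount -t nfs -o $OPTS $SERVER:$EXPORT $MOUNTPOINT"
```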

(and btw, i agree with steve, if you can reproduce this
problem and capture a network trace, that would be helpful:
use "tcpdump -s0 -w /tmp/dump host server-name-here" to
capture a raw network trace).


-------------------------------------------------------
This SF.net email is sponsored by: IBM Linux Tutorials.
Become an expert in LINUX or just sharpen your skills. Sign up for IBM's
Free Linux Tutorials. Learn everything from the bash shell to sys admin.
Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-12-30 05:25:09

by dwight

Subject: RE: Linux client on Solaris 7 NFS server

On 2003-12-24 15:12, Wade Hampton wrote:
>
> I have a Fedora Core 1.0 client connecting to a
> Solaris X86 Version 7 NFS server. The Solaris server
> has locked up several times since I started accessing
> it from Linux (this is the only change to the Solaris
> system's environment). I was wondering if this
> was an NFS problem on Solaris 7 or could be a
> hardware problem with the NIC.
> ...

If the Solaris Server is locking up, this is a problem with
Solaris. It might be somehow triggered by Linux, but that is
absolutely no excuse for a server problem. The server should
just stay up.

Sun has some excellent resources available for making certain
your system is up-to-date. Go over to sun.com, and start with the
support link. They have this patch manager tool which will
examine your system, figure out what patches it needs, and
apply them. I don't recall if you need a support contract to
use it or not.

One of the other nice tools they have is their search of their
bugs database system. Here's one which may be related:

#57406: "NFS Server May Panic Upon Receipt of Certain Invalid
Client Requests" Oct 27, 2003.

That's a bad sign. They do list a number of patches for the
issue.

"Sensitive" should refer to performance. My experience is that
Linux NFS clients exhibit horrible performance issues with Sun
servers in standard, out-of-the-box configurations. That's in
comparison to using pure Sun or pure Linux Server/Client
combinations.

-dwight-





2003-12-30 15:50:47

by Lever, Charles

Subject: RE: RE: Linux client on Solaris 7 NFS server

hi dwight-

> If the Solaris Server is locking up, this is a problem with
> Solaris. It might be somehow triggered by Linux, but that is
> absolutely no excuse for a server problem. The server should
> just stay up.

i think "locking up" is a pretty general description of the
problem. we need to have precise details from wade about
the server's behavior. after all, it could be bad hardware,
rather than any problem specific to the NFS implementation.

> "Sensitive" should refer to performance. My experience is that
> Linux NFS clients exhibit horrible performance issues with Sun
> servers in standard, out-of-the-box configurations. That's in
> comparison to using pure Sun or pure Linux Server/Client
> combinations.

again, specific details about the problem would be very helpful.
just saying that "they exhibit horrible performance out of the
box" does not reflect the specifics of your environment, nor
what kind of analysis and tuning you did to resolve the issues.
was this a problem with default settings, or were there software,
network, or hardware issues you encountered?



2003-12-30 16:52:08

by dwight

Subject: Re: RE: Linux client on Solaris 7 NFS server

Hi Charles,

> hi dwight-
>
> i think "locking up" is a pretty general description of the
> problem. we need to have precise details from wade about
> the server's behavior. after all, it could be bad hardware,
> rather than any problem specific to the NFS implementation.

I agree; it could indeed be bad hardware. The point was that he should
be fully updated with the latest patches from Sun first, before debugging
an issue with Solaris.

But I must say that I'd be interested in knowing what packets are causing
issues with Solaris, if that is the case, and Wade has the time to
provide that information.

> > "Sensitive" should refer to performance. My experience is that
> > Linux NFS clients exhibit horrible performance issues with Sun
> > servers in standard, out-of-the-box configurations. That's in
> > comparison to using pure Sun or pure Linux Server/Client
> > combinations.
>
> again, specific details about the problem would be very helpful.
> just saying that "they exhibit horrible performance out of the
> box" does not reflect the specifics of your environment, nor
> what kind of analysis and tuning you did to resolve the issues.
> was this a problem with default settings, or were there software,
> network, or hardware issues you encountered?

Thank you for asking. I was referring to standard RedHat releases (7.x on up)
with the default settings. I'd be interested in knowing if you or anyone
else has any experience to the contrary.

As far as tuning goes, that gets rather extensive, from the MTU on up to the
NFS r/w sizes, as well as even the network topology. Since the options are
numerous and specific to one's environment, let me ask a simpler question: Has
anyone been able to tune a Linux-client and Solaris-server combination such
that the speed is comparable to a pure Linux or a pure Solaris client/server
combination? This is for heavy r/w access.

As far as the networking topology goes, let us assume a simple isolated
configuration with just one dumb switch handling the client and server.
No NIS, no automounts, etc., etc; just simple NFS mounts. This is the type
of environment that I have been looking at the issue with lately.

-dwight-






2003-12-30 17:58:33

by Lever, Charles

Subject: RE: RE: Linux client on Solaris 7 NFS server

> Thank you for asking. I was referring to standard RedHat releases (7.x on up)
> with the default settings. I'd be interested in knowing if you or anyone
> else has any experience to the contrary.
>
> As far as tuning goes, that gets rather extensive, from the MTU on up to the
> NFS r/w sizes, as well as even the network topology. Since the options are
> numerous and specific to one's environment, let me ask a simpler question: Has
> anyone been able to tune a Linux-client and Solaris-server combination such
> that the speed is comparable to a pure Linux or a pure Solaris client/server
> combination? This is for heavy r/w access.
>
> As far as the networking topology goes, let us assume a simple isolated
> configuration with just one dumb switch handling the client and server.
> No NIS, no automounts, etc., etc; just simple NFS mounts. This is the type
> of environment that I have been looking at the issue with lately.

does "tcp,rsize=32768,wsize=32768" work? the Linux defaults are
UDP and r/wsize=4096, which work adequately with Linux servers,
but probably are trouble for Solaris.

which Linux kernels, specifically, have you tried on your clients?

what performance do you see, and what do you expect?



2003-12-31 01:56:03

by Ian Kent

Subject: Re: RE: Linux client on Solaris 7 NFS server

On Tue, 30 Dec 2003 [email protected] wrote:

> Thank you for asking. I was referring to standard RedHat releases (7.x on up)
> with the default settings. I'd be interested in knowing if you or anyone
> else has any experience to the contrary.

I have 7.1 and 7.3 clients. Haven't had many problems over the years and
the recent 2.4 kernels are much improved.

Currently I'm running the RedHat 2.4.20-20.7 kernel.

I used UDP transport for a long time but have recently changed to TCP for
nearly all my clients.

>
> As far as tuning goes, that gets rather extensive, from the MTU on up to the
> NFS r/w sizes, as well as even the network topology. Since the options are
> numerous and specific to one's environment, let me ask a simpler question: Has
> anyone been able to tune a Linux-client and Solaris-server combination such
> that the speed is comparable to a pure Linux or a pure Solaris client/server
> combination? This is for heavy r/w access.

My usage is largely read.

Performance has improved quite a bit as the kernel version has increased.
I can get around 80% (ie. 80% wire speed) the throughput of a 'similar'
Sun. Problem is that most of the Sparc clients can saturate 100Mb (onto a
gigabit backbone) so I don't really know what the difference is.

Ian





2004-01-05 17:31:27

by dwight

Subject: Re: RE: Linux client on Solaris 7 NFS server

Ian Kent wrote:
> Currently I'm running the RedHat 2.4.20-20.7 kernel.
>
> I used UDP transport for a long time but have recently changed to TCP for
> nearly all my clients.
> ...
>
> My usage is largely read.
>
> Performance has improved quite a bit as the kernel version has increased.
> I can get around 80% (ie. 80% wire speed) the throughput of a 'similar'
> Sun. Problem is that most of the Sparc clients can saturate 100Mb (onto a
> gigabit backbone) so I don't really know what the difference is.
>
> Ian

Thanks for the information, Ian.

So if I understand you correctly, you are saying that your performance for
Linux clients and Solaris clients with a Solaris server is the same? Or that
any difference is not noticeable?

Would you happen to know what throughput you are seeing with the Linux
client? My own, as reported by Ethereal, is about 11 Mbs in this case.

My transactions are a combination of read and write (mimicking the common
behavior that we use in a production environment). I appreciate the information;
I'll try a test of pure reads.

And yes, I too have saturated a 100-Mb switch before, though not with NFS.
The statistics on the switch in the production environment show that we're
nowhere near saturating the switch. For my test environment, there's almost
nothing else going on at the time (I.e. one client, one server). While I could
just use a cross-over cable and eliminate the switch, I don't see any information
indicating that this would improve things at this time.

Best Regards,

-dwight-




2004-01-05 18:22:13

by Lever, Charles

Subject: RE: RE: Linux client on Solaris 7 NFS server

hi dwight-

please capture some network traces of both a fast run and
a slow run (using UDP for both, preferably). the traces
don't need to be large (in the neighborhood of 10MB), but
they do need to be raw format, and be sure you capture
at least 256 bytes of each frame (-s256).
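as a sketch, the two captures could be set up like this. the server name and the output file names are placeholders; the tcpdump options are -s for the number of bytes kept per frame and -w for writing raw packets to a file.

```shell
#!/bin/sh
# Sketch only: the server name and output paths are hypothetical.
SERVER="solaris7"

# Print a capture command for one test run.
#   -s 256  keep the first 256 bytes of every frame
#   -w      write raw packets to a file instead of decoding them
capture_cmd() {
    echo "tcpdump -s 256 -w /tmp/nfs-$1.pcap host $SERVER"
}

# One trace per run: start each capture just before the test and stop it
# with Ctrl-C once roughly 10 MB has been written.
capture_cmd fast
capture_cmd slow
```

keeping the fast and slow runs in separate files makes it easier to compare them side by side later.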

the v2/v3 performance delta is usually a red herring; it
is almost always the result of some other problem.

> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Monday, January 05, 2004 12:11 PM
> To: Lever, Charles
> Cc: [email protected]
> Subject: Re: [NFS] RE: Linux client on Solaris 7 NFS server
>
>
> Charles Lever wrote:
>
> > does "tcp,rsize=32768,wsize=32768" work? the Linux defaults are
> > UDP and r/wsize=4096, which work adequately with Linux servers,
> > but probably are trouble for Solaris.
> >
> > which Linux kernels, specifically, have you tried on your clients?
> >
> > what performance do you see, and what do you expect?
>
> Well, yes, it works, but there's no noticeable performance
> improvement. I'll caveat that with the observation that this was in
> a more complicated production environment. I haven't tried it yet
> with V3 in a semi-isolated environment.
>
> One of the more significant speed-ups I've noticed was with
> completely turning off Version 3 NFS support in the kernel, and
> forcing all transactions between Solaris and Linux to be Version 2.
> Just specifying V2 as a mount option improves things a bit; but using
> a 2.4.24-pre1 kernel without V3 support built in dropped the test
> time from about 6 minutes to 3.0 minutes. Just specifying V2 as a
> mount option drops the test time from 6 minutes to 4 min,
> 45 seconds.
>
> In comparison, the pure Linux server and client test ran about 30
> seconds (UDP and TCP); and the pure Solaris environment ran about
> the same. Mind you, the Solaris 8 systems have much slower CPUs in
> my test bed.
>
> Another rather interesting thing is that a Solaris client using a
> Linux server runs the test at the same speed as the pure Solaris or
> pure Linux environment. So this behaviour is specific to the
> Linux-client/Solaris-server situation.
>
> As for what kind of results I was expecting, I was hoping for
> something comparable to the other 3 pairs of combinations. Perhaps
> even just double would be of use; but not a factor of 6.
>
> As far as the kernels used, the same consistent results have been
> seen across the range of 2.4 kernels. Mostly the RH variations of
> 2.4.18, but earlier ones as well, and right now I'm looking at the
> 2.4.24 pre-releases. However, I will note that the ones prior to
> 2.4.18 were done in the production environment, and not with some
> attempt at isolation.
>
> Also, the problem seems unaffected by using the latest version of
> nfs-utils.
>
> The overall throughput seems to be pegged at about 11 Mbps IIRC.
> Googling shows that other people have also reported this issue over
> the years, and there is no clear solution.
>
> So, in summary, this seems to be a constant issue with Linux/Solaris
> interaction. There also seem to be issues with V3 in this
> environment.
>
> Any suggestions and/or ideas would be welcome.
>
> Best Regards,
>
> -dwight-



2004-01-05 19:46:08

by Wade Hampton

Subject: Re: RE: Linux client on Solaris 7 NFS server

Slightly off topic, but have either of you seen a problem with
a Solaris 7 server locking up or having networking problems
when moving data to a Linux client (read only)?

I locked up my Solaris 7 x86 server several times
with Fedora 1.0 (kernel 2.4.22-1.2115). I changed to
rsize=32768,wsize=32768 and have not seen a problem
since.

Thanks,
--
Wade Hampton





2004-01-05 22:34:09

by Wade Hampton

Subject: Re: RE: Linux client on Solaris 7 NFS server

Wade Hampton wrote:

> [snip brainfart]

This is what happens when you ask a question and then
forget you asked it.... Happy New Year....

I have not had any problems with the Solaris server
since I changed to the rsize=32768.... My system ran
for over a week without any issues. Maybe in a week
or so (after I deliver something working), I could go back
to the earlier setting, try to crash the server, and try to
capture packets, but unless the packets are very generic
(and edited), I could not post them as they are accessing
a proprietary system.

The reason I think it is Solaris is that it crashed at least
two times a couple of days apart when I was starting my
testing, but since I changed the rsize value on 12/23,
I have not had any NFS problems.....

I'll check to see if the referenced patch has been applied
to the server.

So much for the first day back after the new year :(

Thanks,
--
Wade Hampton





2004-01-06 04:42:32

by Ian Kent

Subject: Re: RE: Linux client on Solaris 7 NFS server

On Mon, 5 Jan 2004 [email protected] wrote:

> Ian Kent wrote:
> > Currently I'm running the RedHat 2.4.20-20.7 kernel.
> >
> > I used UDP transport for a long time but have recently changed to TCP for
> > nearly all my clients.
> > ...
> >
> > My usage is largely read.
> >
> > Performance has improved quite a bit as the kernel version has increased.
> > I can get around 80% (ie. 80% wire speed) the throughput of a 'similar'
> > Sun. Problem is that most of the Sparc clients can saturate 100Mb (onto a
> > gigabit backbone) so I don't really know what the difference is.
> >
> > Ian
>
> Thanks for the information, Ian.
>
> So if I understand you correctly, you are saying that your performance for
> Linux clients and Solaris clients with a Solaris server is the same? Or that
> any difference is not noticeable?

I had some figures, but I've deleted them.

Basically I'm saying, with a 100Mb(it)/sec interface I get just over
8 MB(ytes)/sec whereas I can consistently get over 10MB(ytes)/sec on most
Sparcs (Solaris). So the Sparcs are pushing their interfaces about as fast as
they can go but the Linux box is not. So there's still room for
improvement.
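To put those figures on one scale, here is a small sketch that converts payload rates in MB/s to Mbit/s and to a share of a 100 Mbit/s link. It assumes decimal megabytes and ignores Ethernet, IP and RPC header overhead, so the percentages are approximate; the function names are made up for the example.

```shell
#!/bin/sh
# Convert megabytes per second to megabits per second (1 byte = 8 bits).
mbytes_to_mbits() {
    awk -v mb="$1" 'BEGIN { printf "%.1f\n", mb * 8 }'
}

# Percentage of a link's nominal bit rate used by a given payload rate.
# $1 = payload rate in MB/s, $2 = link speed in Mbit/s.
link_utilisation() {
    awk -v mb="$1" -v link="$2" 'BEGIN { printf "%.0f\n", mb * 8 / link * 100 }'
}

echo "Linux client: $(mbytes_to_mbits 8) Mbit/s, $(link_utilisation 8 100)% of the link"
echo "Sparc client: $(mbytes_to_mbits 10) Mbit/s, $(link_utilisation 10 100)% of the link"
echo "Ceiling for a 100 Mbit/s link: $(awk 'BEGIN { printf "%.1f", 100 / 8 }') MB/s"
```

The same helpers are handy for checking whether a number quoted in "Mb" is meant as megabits or megabytes, since the two differ by a factor of 8.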

>
> Would you happen to know what throughput you are seeing with the Linux
> client? My own, as reported by Ethereal, is about 11 Mbs in this case.
>
> My transactions are a combination of read and write (mimicking the common
> behavior that we use in a production environment). I appreciate the information;
> I'll try a test of pure reads.
>
> And yes, I too have saturated a 100-Mb switch before, though not with NFS.
> The statistics on the switch in the production environment show that we're
> nowhere near saturating the switch. For my test environment, there's almost
> nothing else going on at the time (I.e. one client, one server). While I could
> just use a cross-over cable and eliminate the switch, I don't see any information
> indicating that this would improve things at this time.

I'm not talking about the switch the clients are connected to, but the
interface on the client. I expect the switches can deal with a good deal
more than a single client going flat tack.

Ian







2004-01-06 04:59:29

by Ian Kent

Subject: Re: RE: Linux client on Solaris 7 NFS server

On Mon, 5 Jan 2004 [email protected] wrote:

>
> Any suggestions and/or ideas would be welcome.

Sorry but I have to ask.

How many server threads are you using on the Solaris server?

Ian





2004-01-06 04:59:10

by Ian Kent

Subject: Re: RE: Linux client on Solaris 7 NFS server

On Mon, 5 Jan 2004, Wade Hampton wrote:

> Slightly off topic, but have either of you seen a problem with
> a Solaris 7 server locking up or having networking problems
> when moving data to a Linux client (read only)?
>
> I locked up my Solaris 7 x86 server several times
> with Fedora 1.0 (kernel 2.4.22-1.2115). I changed to
> rsize=32768,wsize=32768 and have not seen a problem
> since.

I don't use x86 Solaris, sorry.

Ian





2004-01-06 07:38:19

by dwight

Subject: Re: RE: Linux client on Solaris 7 NFS server

Hi Charles,

I actually have lots of data on this, since I ran tcpdump on every major
run when I was gathering data. Each tcpdump output file is about 20 MB.

That's the good news.

The bad news is I don't know if I can release that specific data. My
guess is that I'll have to cut over to a totally isolated subnet and
rerun things.

However, if such data can be released, and if there's enough interest in
looking at the tcpdump output, I'd be willing to go to the effort of doing
so for those who'd be interested in looking at the tcpdump data.

So I'll investigate further.

Best Regards,

-dwight-

From: "Lever, Charles" <[email protected]>
Date: Mon, 5 Jan 2004 10:21:59 -0800
> hi dwight-
>
> please capture some network traces of both a fast run and
> a slow run (using UDP for both, preferably). the traces
> don't need to be large (in the neighborhood of 10MB), but
> they do need to be raw format, and be sure you capture
> at least 256 bytes of each frame (-s256).
>
> the v2/v3 performance delta is usually a red herring; it
> is almost always the result of some other problem.






2004-01-06 07:41:20

by dwight

Subject: Re: RE: Linux client on Solaris 7 NFS server

Hi Wade,

I haven't seen such behaviour myself, but I do remember some bugid's
which seem related. Basically, as I mentioned before, you need to go to
sun.com, and run their patch expert tool. Use the web-based java applet;
avoid their java-based CLI technology - there are too many problems
with it, and it is basically useless IMHO.

Then just select the patches that you need, download the tarball,
extract it, and run the patching tool. It's pretty straightforward
if you do it by hand. Just be *certain* to do a `boot -r` right
afterwards if the README says to reboot the system.

Specifically, I recall seeing a big kernel-level patch related to NFS
hanging issues.

You can tell what level your kernel is patched to via `uname -a` on
Solaris IIRC. In general, if you have an older stock Solaris release,
you need to patch it.
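As a rough sketch of the check, something like the following could be run on the server. PATCH_ID is a placeholder, not a real patch number; the real ID has to come from the Sun advisory. The sketch relies on `showrev -p`, the Solaris command that lists installed patches, and does nothing useful on a non-Solaris host.

```shell
#!/bin/sh
# Sketch only: PATCH_ID is a placeholder, not a real patch number. Take the
# real ID from the Sun advisory that applies to your release.
PATCH_ID="000000"

# Succeeds if the given patch ID appears in the installed-patch list
# printed by showrev -p on Solaris.
patch_installed() {
    showrev -p | grep "Patch: $1" >/dev/null
}

# Only attempt the check on a Solaris host (uname -s reports SunOS there).
if [ "$(uname -s)" = "SunOS" ]; then
    uname -a    # the kernel patch level is part of the version field
    if patch_installed "$PATCH_ID"; then
        echo "patch $PATCH_ID is installed"
    else
        echo "patch $PATCH_ID is NOT installed"
    fi
else
    echo "not a Solaris host; run this on the server"
fi
```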

Regards,

-dwight-





2004-01-07 18:17:45

by dwight

Subject: Re: RE: Linux client on Solaris 7 NFS server

Thanks for the suggestion, Ian. It was a good one, and I've carried it a
little farther.

First, this is all single threaded at the user level. It's a mixture of
reads and writes; akin to a large compile (actually link) of a bunch of
files, creating one big binary. Just using a local disk, we're looking at
about 3-4 secs. Over NFS, it's 30 seconds, except for the Linux client case.

Doing just reads (dd of the 2.4.23 kernel source .tgz, which is about 35 MB),
it takes about 3-4 seconds. That's UDP, with a dd blocksize to match the
NFS r/w size of 8k, on a 100 Mbs network. So this matches your results.
This is for a Linux client talking to either a Linux or a Solaris server.

However, doing a write, I see different results. In the Linux/Linux case,
the result is again about 3-4 seconds. However, with a Linux-client/
Solaris-server, it takes about 12 seconds. This is a factor of 3-4.
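For anyone who wants to repeat this kind of test, here is a minimal sketch of a write pass and a read pass with dd in 8k blocks. It is not the exact procedure used above: TARGET_DIR is a hypothetical variable, the file is filled with zeros rather than a real tarball, and timing is only to the nearest second.

```shell
#!/bin/sh
# Sketch only. TARGET_DIR is a hypothetical variable: left unset, the test
# runs in a scratch directory; set it to an NFS mount point for a real run.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"
FILE="$TARGET_DIR/ddtest.bin"
BS=8192       # 8k blocks, matching the NFS r/w size used in the message
COUNT=4480    # 4480 x 8 KiB = 35 MiB, roughly the size of the tarball

# Write test: stream zeros to the target in 8k blocks.
START=$(date +%s)
dd if=/dev/zero of="$FILE" bs="$BS" count="$COUNT" 2>/dev/null
END=$(date +%s)
echo "write: $((END - START)) s"

# Read test: read the file back and discard it. On a real NFS mount the
# client may answer from its own cache unless the file system is remounted
# between the two steps.
START=$(date +%s)
dd if="$FILE" of=/dev/null bs="$BS" 2>/dev/null
END=$(date +%s)
echo "read: $((END - START)) s"

SIZE=$(wc -c < "$FILE" | tr -d ' ')
echo "file size: $SIZE bytes"
rm -f "$FILE"
```

Running it once against a Linux server and once against the Solaris server, with the same mount options, gives two pairs of numbers that can be compared directly.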

I'm wondering if the mixture of reads and writes is what is driving the
degradation up to a factor of 6.

I'll give the pure Solaris/Solaris results a try. Anecdotally, I've heard
that there are also similar issues with HP-UX and AIX, though I haven't
tried it out myself.

Best Regards,

-dwight-


Ian Kent wrote:

> I had some figures, but I've deleted them.
>
> Basically I'm saying, with a 100Mb(it)/sec interface I get just over
> 8 MB(ytes)/sec whereas I can consistently get over 10MB(ytes)/sec on most
> Sparcs (Solaris). So the Sparcs are pushing their interfaces about as fast as
> they can go but the Linux box is not. So there's still room for
> improvement.





-------------------------------------------------------
This SF.net email is sponsored by: Perforce Software.
Perforce is the Fast Software Configuration Management System offering
advanced branching capabilities and atomic changes on 50+ platforms.
Free Eval! http://www.perforce.com/perforce/loadprog.html
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2004-01-08 00:32:31

by Ian Kent

[permalink] [raw]
Subject: Re: RE: Linux client on Solaris 7 NFS server

On Wed, 7 Jan 2004 [email protected] wrote:

>
> Doing just reads (dd of the 2.4.23 kernel source .tgz, which is about 35 MB),
> it takes about 3-4 seconds. That's UDP, with a dd blocksize to match the
> NFS r/w size of 8k, on a 100 Mb/s network. So this matches your results.
> This is for a Linux client talking to either a Linux or a Solaris server.

Umm. I use 32k and TCP.





2004-01-05 17:10:55

by dwight

[permalink] [raw]
Subject: Re: RE: Linux client on Solaris 7 NFS server

Charles Lever wrote:

> does "tcp,rsize=32768,wsize=32768" work? the Linux defaults are
> UDP and r/wsize=4096, which work adequately with Linux servers,
> but probably are trouble for Solaris.
>
> which Linux kernels, specifically, have you tried on your clients?
>
> what performance do you see, and what do you expect?

Well, yes, it works, but there's no noticeable performance
improvement. I'll caveat that with the observation that this was in
a more complicated production environment. I haven't tried it yet
with V3 in a semi-isolated environment.
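
For reference, a small Python sketch of how the option string under
discussion maps onto a mount command line. It only builds the argument
list and does not run anything. tcp, udp, rsize, wsize and nfsvers are
standard Linux NFS mount options; the function name and the server,
export and mount point names are made up for illustration.

```python
def nfs_mount_command(server, export, mount_point, *, proto="tcp",
                      rsize=32768, wsize=32768, nfsvers=None, extra=()):
    """Return the argv list for an NFS mount with the given options."""
    opts = [proto, f"rsize={rsize}", f"wsize={wsize}"]
    if nfsvers is not None:
        # Force a protocol version, e.g. nfsvers=2 for NFS Version 2.
        opts.append(f"nfsvers={nfsvers}")
    opts.extend(extra)
    return ["mount", "-t", "nfs", "-o", ",".join(opts),
            f"{server}:{export}", mount_point]


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only.
    cmd = nfs_mount_command("solaris-server", "/export/build", "/mnt/build")
    print(" ".join(cmd))
    # Same mount forced to Version 2 over UDP with 8k transfers.
    cmd_v2 = nfs_mount_command("solaris-server", "/export/build",
                               "/mnt/build", proto="udp",
                               rsize=8192, wsize=8192, nfsvers=2)
    print(" ".join(cmd_v2))
```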

One of the more significant speed-ups I've noticed came from
completely turning off Version 3 NFS support in the kernel, forcing
all transactions between Solaris and Linux to be Version 2. Just
specifying V2 as a mount option improves things a bit, dropping the
test time from about 6 minutes to 4 minutes 45 seconds; but using a
2.4.24-pre1 kernel without V3 support built in dropped it from about
6 minutes to 3 minutes.

In comparison, the pure Linux server and client test ran about 30
seconds (UDP and TCP); and the pure Solaris environment ran about
the same. Mind you, the Solaris 8 systems have much slower CPUs in
my test bed.

Another rather interesting thing is that a Solaris client using a
Linux server runs the test at the same speed as the pure Solaris or
pure Linux environment. So this behaviour is specific to the
Linux-client/Solaris-server situation.

As for what kind of results I was expecting, I was hoping for
something comparable to the other 3 pairs of combinations. Perhaps
even just double would be of use; but not a factor of 6.
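
To put the test times side by side, here is a quick Python sketch of the
slowdown relative to the roughly 30 second result from the other
pairings. The timings are the ones quoted in this message; the labels and
the helper name are mine, and calling the 6 minute run the "starting
configuration" is an assumption about what it was measured with.

```python
BASELINE_S = 30  # pure Linux or pure Solaris pairing, about 30 seconds

# Linux client against Solaris server, times in seconds.
linux_client_solaris_server = {
    "starting configuration": 6 * 60,
    "V2 given as a mount option": 4 * 60 + 45,
    "kernel built without V3 support": 3 * 60,
}


def slowdown(seconds, baseline=BASELINE_S):
    """How many times slower a run is than the baseline."""
    return seconds / baseline


for label, secs in linux_client_solaris_server.items():
    print(f"{label}: {secs} s, {slowdown(secs):.1f}x the baseline")
```

On these numbers the factor of 6 corresponds to the best case, the kernel
built without V3 support; the other two runs come out at 12 and 9.5 times
the baseline.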

As far as the kernels used, the same consistent results have been
seen across the range of 2.4 kernels. Mostly the RH variations of
2.4.18, but earlier ones as well, and right now I'm looking at the
2.4.24 pre-releases. However, I will note that the ones prior to
2.4.18 were done in the production environment, and not with some
attempt at isolation.

Also, the problem seems unaffected by using the latest version of
nfs-utils.

The overall throughput seems to be pegged at about 11 Mbps IIRC.
Googling shows that other people have also reported this issue over
the years, and there is no clear solution.

So, in summary, this seems to be a constant issue with Linux/Solaris
interaction. There also seem to be issues with V3 in this
environment.

Any suggestions and/or ideas would be welcome.

Best Regards,

-dwight-



