2003-02-17 17:39:08

by Fabrizio Nesti

Subject: 2.4.20 TCP server + solaris client performance

Hello everybody,
we are seeing very low performance for NFS access from Solaris clients
to a Linux NFS server on RH8.0. We thought UDP was the cause, so we upgraded to
kernel 2.4.20 to try NFS over TCP.

Now the performance is still low, compared to a solaris server:
# time gtar xf /var/tmp/cvs-1.11.5.tar
Writing on Linux_2.4.20> real 0m22.132s
Writing on Solaris_8> real 0m7.174s

Both filesystems are mounted with proto=tcp,rsize=32768,wsize=32768.
Snooping the traffic however, it appears that the linux server is not
serving with size=32768, but with a maximum size of 8192.

- Is there a reason for this?
- Could this be the reason for the poor performance above?

Thanks in advance,
Fabrizio Nesti


PS: Some snoop traffic:
...
solaris -> linux NFS C CREATE3 FH=884D (EXCLUSIVE) check_cvs.in
linux -> solaris NFS R CREATE3 OK FH=174A
solaris -> linux NFS C SETATTR3 FH=174A
linux -> solaris NFS R SETATTR3 OK
solaris -> linux NFS C WRITE3 FH=174A at 0 for 8192 (ASYNC)
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849797604 Len=1460
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849799064 Len=1460
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849800524 Len=1460
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849801984 Len=1460
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849803444 Len=1056
linux -> solaris TCP D=793 S=2049 Ack=849804500 Seq=633960980 Len=0
...
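
A quick way to sanity-check an excerpt like the one above is to add up the TCP payload lengths. The following is only a sketch: the helper name `parse` and the variable names are illustrative, not part of snoop. It looks only at the five continuation segments shown, checks that the last sequence number plus its length equals the server's Ack, and sums the payload. The packet carrying the WRITE3 call itself is left out because its length does not appear in the excerpt.

```python
import re

# TCP lines copied from the snoop excerpt above.
SNOOP = """\
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849797604 Len=1460
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849799064 Len=1460
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849800524 Len=1460
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849801984 Len=1460
solaris -> linux TCP D=2049 S=793 Ack=633960980 Seq=849803444 Len=1056
linux -> solaris TCP D=793 S=2049 Ack=849804500 Seq=633960980 Len=0
"""

FIELD = re.compile(r"Ack=(\d+) Seq=(\d+) Len=(\d+)")

def parse(text):
    """Return (source, ack, seq, length) for every TCP line."""
    rows = []
    for line in text.splitlines():
        m = FIELD.search(line)
        if m:
            ack, seq, length = map(int, m.groups())
            rows.append((line.split()[0], ack, seq, length))
    return rows

rows = parse(SNOOP)
client = [r for r in rows if r[0] == "solaris"]
server_ack = [r for r in rows if r[0] == "linux"][-1][1]

# Total payload in the segments shown, and the end of the last segment.
payload = sum(length for _, _, _, length in client)
last_seq, last_len = client[-1][2], client[-1][3]

print("payload bytes in shown segments:", payload)
print("last segment ends at the server's Ack:", last_seq + last_len == server_ack)
```

The segments shown carry less than 8192 bytes in total; the remainder presumably travels in the packet that carries the WRITE3 call, together with the RPC header.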



-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-02-18 02:36:11

by Alan Powell

Subject: Re: 2.4.20 TCP server + solaris client performance

8192 block size is a Linux daemon limitation. Also,
don't switch to TCP unless you need to. UDP will be
faster if you have a decent network. Revert back to
UDP, and on the client side, run "nfsstat -c" and
monitor the number of retransmissions. If that number
doesn't increase, you have a clean network, and you
should stay on UDP.




__________________________________________________
Do you Yahoo!?
Yahoo! Shopping - Send Flowers for Valentine's Day
http://shopping.yahoo.com



2003-02-18 10:46:51

by Fabrizio Nesti

Subject: Re: 2.4.20 TCP server + solaris client performance

Sorry,
we switched to TCP because of the very same problem: poor NFS performance as
seen from any Solaris client. By poor I mean three to five times slower.

The same "tar xf" (a typical high load r/w usage) gives

                 linux server                solaris server
linux client     1 sec (udp) (caching?)      8 sec (for both tcp and udp)
solaris client   22/40 sec (tcp/udp)         10/8 sec (tcp/udp)

We are worried about the 22/40 sec figures (Solaris client against the Linux
server), since we bought a new Linux server to switch to, and we have some
Solaris clients.

Since Solaris NFS clients seem to prefer TCP, and following some messages
on the list, we tried that. (Even tuning does not improve the situation.)
Are we doing something wrong, or do we have to switch back to a Solaris
server?

Thanks again,
Fabrizio Nesti







2003-02-18 15:30:03

by Eric Whiting

Subject: Re: 2.4.20 TCP server + solaris client performance

Have you verified that your storage is properly set up? Is it IDE/SCSI/RAID? IDE
storage with DMA disabled will cause terrible performance. Please verify disk
speed on local writes and make sure it is faster than the network.

What version of Solaris are you running? Solaris 8 and newer have a lot of NFS
fixes...

We are getting good NFS numbers with 2.4.20 UDP NFS servers against solaris [89]
clients.

eric








2003-02-18 15:51:58

by Fabrizio Nesti

Subject: Re: 2.4.20 TCP server + solaris client performance

Thanks for the replies, but:

yes,
- the UDP server is a 4-disk SCSI RAID5. Very fast.
- The TCP (experimental) server is likewise very fast (DMA enabled).
- Typical time for the local operation (the same tar xf) is 1 sec or less.

solaris is : SunOS 5.8 Generic_108528-09 sun4u sparc SUNW,Ultra-5_10
linux are :
udp server: Linux 2.4.18-24.8.0smp #1 SMP i686 i686
tcp server: Linux 2.4.20 i686 athlon

Plain read or write (dd) is quite satisfactory, so the problems seem to
arise with many small reads and writes in quick succession (as with tar, cvs, etc.).
Also, it seems that we have no retransmission problems
(see the nfsstat -c output from Solaris below).

Puzzled.

Thanks to all,
cheers,
Fabrizio

PS: root@caslon:/root$ nfsstat -c

Client rpc:
Connection oriented:
calls         badcalls      badxids       timeouts      newcreds      badverfs
10994822      1645          30            16            0             0
timers        cantconn      nomem         interrupts
0             1598          0             29
Connectionless:
calls         badcalls      retrans       badxids       timeouts      newcreds
37885148      10303         6058          238           16180         0
badverfs      timers        nomem         cantsend
0             4267          0             0

Client nfs:
calls         badcalls      clgets        cltoomany
48204358      218           48204358      28
Version 2: (4019 calls)
null          getattr       setattr       root          lookup        readlink
0 0%          3909 97%      42 1%         0 0%          55 1%         6 0%
read          wrcache       write         create        remove        rename
0 0%          0 0%          0 0%          0 0%          0 0%          0 0%
link          symlink       mkdir         rmdir         readdir       statfs
0 0%          0 0%          0 0%          0 0%          6 0%          1 0%
Version 3: (48171346 calls)
null          getattr       setattr       lookup        access        readlink
0 0%          18461695 38%  482002 1%     8829223 18%   7849906 16%   16660 0%
read          write         create        mkdir         symlink       mknod
7750211 16%   3143812 6%    244856 0%     25973 0%      8847 0%       0 0%
remove        rmdir         rename        link          readdir       readdirplus
228254 0%     18425 0%      13163 0%      14493 0%      259741 0%     361588 0%
fsstat        fsinfo        pathconf      commit
50919 0%      740 0%        2075 0%       408763 0%

Client nfs_acl:
Version 2: (1 calls)
null          getacl        setacl        getattr       access
0 0%          0 0%          0 0%          1 100%        0 0%
Version 3: (28992 calls)
null          getacl        setacl
0 0%          28992 100%    0 0%
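
To put the "no retransmission problems" reading in numbers, the connectionless (UDP) counters above can be turned into rates. This is a minimal sketch with the three counters copied by hand from the output; the helper name `rate` is illustrative. It assumes the counters are cumulative since the client booted, so it only gives a long-term average.

```python
# Counters copied from the "Connectionless" section of the nfsstat -c output above.
udp_calls = 37885148
udp_retrans = 6058
udp_timeouts = 16180

def rate(events, calls):
    """Fraction of RPC calls affected by the given event counter."""
    return events / calls if calls else 0.0

retrans_rate = rate(udp_retrans, udp_calls)
timeout_rate = rate(udp_timeouts, udp_calls)

print(f"retransmissions: {retrans_rate:.4%} of UDP calls")
print(f"timeouts:        {timeout_rate:.4%} of UDP calls")
```

Both rates come out well below 0.1% of calls. Because the totals are cumulative, sampling nfsstat -c before and after a test run and comparing the two, as Alan suggested, says more about the current state of the network than a single snapshot.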









2003-02-18 17:35:45

by Lever, Charles

Subject: RE: 2.4.20 TCP server + solaris client performance

> 8192 block size is a Linux daemon limitation.

i think what alan meant is that the Linux NFS server
cannot respond to read or write requests larger than
8KB, although that will change soon (or already has).
the Linux NFS client supports rsize and wsize up to
32KB.

> Also,
> don't switch to TCP unless you need to. UDP will be
> faster if you have a decent network. Revert back to
> UDP, and on the client side, run "nfsstat -c" and
> monitor the number of retransmissions. If that number
> doesn't increase, you have a clean network, and you
> should stay on UDP.

TCP is a better default in the long run for the following
reasons:

1. many networks (even switched LANs) contain parts
that run at different speeds (GbE vs. 100Mb/s).
TCP is better at managing flow control in situations
like this, or in the infrequent cases where a network
congestion storm occurs.

2. UDP is exposed to some rare forms of silent data
corruption resulting from the IP ID field wrapping.

3. the performance overhead of TCP will shrink over time
so that there is no longer much advantage to using
UDP.

4. as time goes on, NFS over TCP will work well enough
in every network scenario, whereas UDP will always be
limited (UDP will probably never work well on WANs,
for example).
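
Point 2 in the list above can be made more concrete with some rough arithmetic. The IPv4 Identification field is 16 bits wide, so a sender reuses IDs after 65536 datagrams. The sketch below estimates how long that takes on a saturated link. The function name `seconds_to_wrap` and the datagram sizes are assumptions chosen for illustration, and header overhead is ignored.

```python
IP_ID_VALUES = 2 ** 16   # the IPv4 Identification field is 16 bits wide

def seconds_to_wrap(link_bits_per_second, datagram_bytes):
    """Time for a sender saturating the link to use every IP ID once.

    Assumes one IP ID per datagram and ignores all header overhead.
    """
    datagrams_per_second = link_bits_per_second / (datagram_bytes * 8)
    return IP_ID_VALUES / datagrams_per_second

# 1500 bytes: unfragmented full-size packets (worst case).
# 8192 bytes: roughly one 8KB NFS-over-UDP request, sent as several
# fragments that all share a single IP ID.
for name, speed in (("100Mb/s", 100e6), ("GbE", 1e9)):
    for size in (1500, 8192):
        t = seconds_to_wrap(speed, size)
        print(f"{name}, {size}-byte datagrams: wrap after about {t:.1f} s")
```

On GbE the IDs repeat within a few seconds under these assumptions. If fragments of a damaged datagram are still waiting for reassembly when the ID comes around again, fragments from two different datagrams can be combined, and the 16-bit UDP checksum will not always catch the result. TCP does not depend on IP fragmentation in this way.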

also note that future versions of NFS will not support UDP
at all, so it is best to start getting comfortable with
stream protocols now.

the real overhead of using TCP is probably not visible for
a server with disks so slow that it can't fill its local
network pipe.




2003-02-19 04:39:38

by NeilBrown

Subject: Re: 2.4.20 TCP server + solaris client performance

On Monday February 17, [email protected] wrote:
> Hello to everybody.
> we are reporting a very low performance for nfs access from Solaris clients
> to the linux nfs server on RH8.0. We thought it was udp and upgraded to
> kernel 2.4.20.
>
> Now the performance is still low, compared to a solaris server:
> # time gtar xf /var/tmp/cvs-1.11.5.tar
> Writing on Linux_2.4.20> real 0m22.132s
> Writing on Solaris_8> real 0m7.174s
>
> Both filesystems are mounted with proto=tcp,rsize=32768,wsize=32768.
> Snooping the traffic however, it appears that the linux server is not
> serving with size=32768, but with a maximum size of 8192.
>
> - Is there a reason for this?

Yes. Changing a #define in include/linux/nfsd/const.h and recompiling
might work. There is a small chance that it would cause problems
starting lots of nfsd threads or running with UDP.

> - May this be the reason for the poor performance above?

Unlikely. It could possibly cause a 10% difference, but not a 300%
difference.

What filesystem are you using on Linux?
ext3 with data=journal and preferably the journal on a separate device
gives quite good performance with NFS.
Also with ext3, I have found that the no_wdelay export option helps.

NeilBrown



-------------------------------------------------------
This SF.net email is sponsored by: SlickEdit Inc. Develop an edge.
The most comprehensive and flexible code editor you can use.
Code faster. C/C++, C#, Java, HTML, XML, many more. FREE 30-Day Trial.
http://www.slickedit.com/sourceforge
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-02-19 11:34:19

by Fabrizio Nesti

Subject: Re: 2.4.20 TCP server + solaris client performance

> > Now the performance is still low, compared to a solaris server:
> > # time gtar xf /var/tmp/cvs-1.11.5.tar
> > Writing from Linux_2.4.20> real 0m22.132s
> > Writing from Solaris_8> real 0m7.174s
> >

> > serving with size=32768, but with a maximum size of 8192.
> > - May this be the reason for the poor performance above?
>
> Unlikely. It could possibly cause a 10% difference, but not a 300%
> difference. What filesystem are you using on Linux?
> ext3 with data=journal and preferably the journal on a separate device
> gives quite good performance with NFS.
> Also with ext3, I have found that the no_wdelay export option helps.

I assumed the filesystem is OK, since it is very fast locally (< 1 sec).
In any case, it was ext3 on 4xSCSI320 (RAID5+LVM) in the UDP tests, and
ext3 on IDE133 for these TCP tests.

The network is a single 100 Mb/s full-duplex switch.

So, if neither UDP/TCP nor the r/wsize is the cause, what can it be?
All the more so after the email from

> Eric Whiting <[email protected]>
> ...
> We are getting good NFS numbers with 2.4.20 UDP NFS servers against solaris
> [89] clients.

So Eric, can you please show your configuration?

Also, it is strange that a standard out-of-the-box RH8.0 on that big server
performs so badly. Had we known this in advance, we wouldn't have chosen
Linux for serving... :(

Ok,
thanks and ciao,
Fabrizio




2003-02-19 14:07:37

by Ion Badulescu

Subject: Re: 2.4.20 TCP server + solaris client performance

On Wed, 19 Feb 2003 12:30:06 +0100 (MET), Fabrizio Nesti <[email protected]> wrote:

> Also it is strange that a standard out-of-the-box RH8.0 on that big server
> does perform so bad. Knowing this in advance, we wouldn't have chosen
> linux for serving... :(

Is RH8.0 using nfs-utils 1.0 or newer? If so, then the default server
export options have changed and the filesystem is now exported "sync",
instead of "async".

The other thing to remember is that "cto" (close-to-open) consistency,
which is enabled by default on the client, forces the client to do an
fsync on any file it closes, which leads to rather poor performance when
opening and closing lots of small files. Which is precisely what tar is
doing...

So, try exporting the filesystem "async" if you can tolerate data loss
in case of a server crash, and try mounting the filesystem with "nocto"
if you don't need inter-client consistency (and if you do then locking
the files with fcntl() is a much better solution anyway).

Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.



2003-02-19 15:31:12

by Fabrizio Nesti

Subject: Re: 2.4.20 TCP server + solaris client performance

> Is RH8.0 using nfs-utils 1.0 or newer?

1.0.1..

> If so, then the default server export options have changed and the
> filesystem is now exported "sync", instead of "async".

Yes we exported it async :) as confirmed by the snoop that I reported
earlier. See it also below. But nothing changed.

> The other thing to remember is that "cto"...
> ...

On the Solaris client? It is not in the mount_nfs man page, but it
worked. However, there was no performance improvement.

Also, I can't recognize it in the old traffic snoop..
I report below some of it during the write of one file from tar.

...It must be something much more fundamental to account for a performance
difference of 3-5 times.

One clue may be this: the "tar" process itself on Solaris uses 4% CPU while
writing to a local disk, and 40% or more while writing to the NFS disk
(not using -z gzip :). Also, mounting with NFSv2 seems 30% faster. But we do not
understand the reason.

Fabrizio


PS: Excerpt from network traffic during "tar"

solaris -> linux NFS C LOOKUP3 FH=884D Makefile.in
linux -> solaris NFS R LOOKUP3 No such file or directory
solaris -> linux NFS C LOOKUP3 FH=884D Makefile.in
linux -> solaris NFS R LOOKUP3 No such file or directory
solaris -> linux NFS C CREATE3 FH=884D (EXCLUSIVE) Makefile.in
linux -> solaris NFS R CREATE3 OK FH=174A
solaris -> linux NFS C SETATTR3 FH=174A
linux -> solaris NFS R SETATTR3 OK
solaris -> linux NFS C WRITE3 FH=174A at 0 for 8192 (ASYNC)
solaris -> linux TCP D=2049 S=793 .. Len=1460 Win=24820
solaris -> linux TCP D=2049 S=793 .. Len=1460 Win=24820
solaris -> linux TCP D=2049 S=793 .. Len=1460 Win=24820
solaris -> linux TCP D=2049 S=793 .. Len=1460 Win=24820
solaris -> linux TCP D=2049 S=793 .. Len=1056 Win=24820
linux -> solaris TCP D=793 S=2049 .. Len=0 Win=40880
linux -> solaris NFS R WRITE3 OK 8192 (ASYNC)
solaris -> linux NFS C WRITE3 FH=174A at 8192 for 3260 (ASYNC)
solaris -> linux TCP D=2049 S=793 .. Len=1460 Win=24820
solaris -> linux TCP D=2049 S=793 .. Len=504 Win=24820
linux -> solaris TCP D=793 S=2049 .. Len=0 Win=40880
linux -> solaris NFS R WRITE3 OK 3260 (ASYNC)
solaris -> linux NFS C COMMIT3 FH=174A at 0 for 16384
linux -> solaris NFS R COMMIT3 OK
solaris -> linux NFS C LOOKUP3 FH=884D Makefile.in
linux -> solaris NFS R LOOKUP3 OK FH=174A
solaris -> linux NFS C SETATTR3 FH=174A
linux -> solaris NFS R SETATTR3 OK
solaris -> linux NFS C LOOKUP3 FH=884D Makefile.in
linux -> solaris NFS R LOOKUP3 OK FH=174A
solaris -> linux NFS C SETATTR3 FH=174A
linux -> solaris NFS R SETATTR3 OK






2003-02-19 15:47:35

by Ion Badulescu

Subject: Re: 2.4.20 TCP server + solaris client performance

On Wed, 19 Feb 2003, Fabrizio Nesti wrote:

> Yes we exported it async :) as confirmed by the snoop that I reported
> earlier. See it also below. But nothing changed.

The snoop would not show anything, actually, since it's the server that
chooses to ignore the COMMIT call or not. The WRITE's will be async anyway
for NFSv3 (unlike for v2) precisely because the COMMIT call exists in v3.
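
The point about COMMIT can be illustrated with a toy model. This is not the actual nfsd code: the class and method names are invented for the sketch, and it simply assumes that a "sync" export flushes to disk before answering COMMIT while an "async" export answers without flushing. Under that assumption the traffic a sniffer sees is identical in both cases:

```python
from dataclasses import dataclass, field


@dataclass
class ToyNfsServer:
    """Toy NFSv3 server (hypothetical model, not real nfsd behaviour)."""
    export_async: bool
    dirty: int = 0      # bytes acknowledged but not yet on disk
    flushes: int = 0    # forced flushes to stable storage
    wire: list = field(default_factory=list)  # what a sniffer would show

    def write_unstable(self, nbytes: int) -> None:
        # The client asks for an unstable write, so the server may cache it.
        self.dirty += nbytes
        self.wire.append(f"C WRITE3 for {nbytes} (ASYNC)")
        self.wire.append(f"R WRITE3 OK {nbytes} (ASYNC)")

    def commit(self) -> None:
        self.wire.append("C COMMIT3")
        if not self.export_async and self.dirty:
            # Assumed "sync" behaviour: flush before replying to COMMIT.
            self.flushes += 1
            self.dirty = 0
        # Assumed "async" behaviour: reply OK without touching the disk.
        self.wire.append("R COMMIT3 OK")


def run(export_async: bool) -> ToyNfsServer:
    """Replay the two writes and the commit seen in the snoop excerpt."""
    server = ToyNfsServer(export_async)
    server.write_unstable(8192)
    server.write_unstable(3260)
    server.commit()
    return server


sync_server = run(export_async=False)
async_server = run(export_async=True)
print("same wire trace:", sync_server.wire == async_server.wire)
print("flushes (sync export): ", sync_server.flushes)
print("flushes (async export):", async_server.flushes)
```

The only difference between the two runs is internal to the server, which is why a capture alone cannot confirm how the filesystem was exported.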

> > The other thing to remember is that "cto"...
> > ...
>
> On the solaris client? It is not in the mount_nfs man page, but it
> worked. However, there was no performance improvement.

Yes, solaris definitely supports nocto since at least sol2.6, it is option
0x2000 in its flags field in the NFS mount structure.

> Also, I can't recognize it in the old traffic snoop..
> I report below some of it during the write of one file from tar.

If you could get timestamps for that snoop (or tcpdump) output, it would
be great. We could at least see who is taking too much time.

> One possible clue: the "tar" process itself on solaris uses 4% cpu while
> writing to a local disk, and 40% or more while writing to the nfs disk
> (not using -z gzip :). Also, mounting with nfs2 seems 30% faster, but we do
> not understand the reason.

So perhaps the Solaris client is inefficient? Maybe it's something in
the way the Linux NFS server replies that triggers a bug in the client?
Who knows...

Have you tried looking at two capture outputs, one with a Linux server and
one with a Solaris server, to see what the differences are?

Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.






2003-02-19 16:58:29

by Fabrizio Nesti

Subject: Re: 2.4.20 TCP server + solaris client performance

Hello, some more precise data as requested...

> > I report below some of it during the write of one file from tar.
> If you could get timestamps for that snoop (or tcpdump) output, it would
> be great. We could at least see who is taking too much time.

This is a comparison of the UDP traffic for a "tar xf", up to the
first file written (README), from the same solaris client to two different
servers, one linux and one solaris (apologies for the wide columns..):

TO LINUX SERVER:                                              TO SOLARIS SERVER:

0.00000 sunclient: NFS C ACCESS3 FH=1600 (lookup)             0.00000 sunclient: NFS C GETATTR3 FH=0102
0.00018 linuxserver: NFS R ACCESS3 OK (lookup)                0.00022 sunserver: NFS R GETATTR3 OK
0.00051 sunclient: NFS C GETATTR3 FH=1600
0.00063 linuxserver: NFS R GETATTR3 OK

0.00083 sunclient: NFS C MKDIR3 FH=1600 cvs-1.11.5            0.00049 sunclient: NFS C MKDIR3 FH=0102 cvs-1.11.5
0.00112 linuxserver: NFS R MKDIR3 OK FH=4E81                  0.00281 sunserver: NFS R MKDIR3 OK FH=0588
0.02229 sunclient: NFS C ACCESS3 FH=1600 (lookup)
0.02248 linuxserver: NFS R ACCESS3 OK (lookup)

0.02273 sunclient: NFS C LOOKUP3 FH=4E81 contrib              0.00322 sunclient: NFS C LOOKUP3 FH=0588 contrib
0.02290 linuxserver: NFS R LOOKUP3 No such file or directory  0.00345 sunserver: NFS R LOOKUP3 No such file or directory
0.02311 sunclient: NFS C MKDIR3 FH=4E81 contrib               0.00363 sunclient: NFS C MKDIR3 FH=0588 contrib
0.02353 linuxserver: NFS R MKDIR3 OK FH=4C81                  0.00565 sunserver: NFS R MKDIR3 OK FH=6324
0.02402 sunclient: NFS C ACCESS3 FH=4E81 (lookup)             0.00589 sunclient: NFS C ACCESS3 FH=0588 (lookup)
0.02418 linuxserver: NFS R ACCESS3 OK (lookup)                0.00611 sunserver: NFS R ACCESS3 OK (lookup)

0.02438 sunclient: NFS C LOOKUP3 FH=4C81 README               0.00627 sunclient: NFS C LOOKUP3 FH=6324 README
0.02453 linuxserver: NFS R LOOKUP3 No such file or directory  0.00650 sunserver: NFS R LOOKUP3 No such file or directory
0.02470 sunclient: NFS C LOOKUP3 FH=4C81 README               0.00664 sunclient: NFS C LOOKUP3 FH=6324 README
0.02483 linuxserver: NFS R LOOKUP3 No such file or directory  0.00685 sunserver: NFS R LOOKUP3 No such file or directory
0.02500 sunclient: NFS C CREATE3 FH=4C81 (EXCLUSIVE) README   0.00699 sunclient: NFS C CREATE3 FH=6324 (EXCLUSIVE) README
0.02537 linuxserver: NFS R CREATE3 OK FH=DD86                 0.00851 sunserver: NFS R CREATE3 OK FH=CA37
0.02588 sunclient: NFS C SETATTR3 FH=DD86                     0.00894 sunclient: NFS C SETATTR3 FH=CA37
0.02608 linuxserver: NFS R SETATTR3 OK                        0.00983 sunserver: NFS R SETATTR3 OK
0.04781 sunclient: UDP IP fragment ID=59208 Offset=0 MF=1     0.01063 sunclient: UDP IP fragment ID=46356 Offset=0 MF=1
0.04784 sunclient: UDP IP fragment ID=59208 Offset=1480 MF=1  0.01066 sunclient: UDP IP fragment ID=46356 Offset=1480 MF=1
0.04786 sunclient: UDP IP fragment ID=59208 Offset=2960 MF=1  0.01068 sunclient: UDP IP fragment ID=46356 Offset=2960 MF=1
0.04789 sunclient: UDP IP fragment ID=59208 Offset=4440 MF=0  0.01070 sunclient: UDP IP fragment ID=46356 Offset=4440 MF=0
0.04900 linuxserver: RPC R XID=991204799 Success              0.01176 sunserver: RPC R XID=991211441 Success
0.04943 sunclient: NFS C COMMIT3 FH=DD86 at 0 for 8192        0.01220 sunclient: NFS C COMMIT3 FH=CA37 at 0 for 8192
0.04959 linuxserver: NFS R COMMIT3 OK                         0.01419 sunserver: NFS R COMMIT3 OK
0.04990 sunclient: NFS C LOOKUP3 FH=4C81 README               0.01460 sunclient: NFS C ACCESS3 FH=6324 (lookup)
0.05006 linuxserver: NFS R LOOKUP3 OK FH=DD86                 0.01481 sunserver: NFS R ACCESS3 OK (lookup)
0.05032 sunclient: NFS C SETATTR3 FH=DD86                     0.01504 sunclient: NFS C SETATTR3 FH=CA37
0.05048 linuxserver: NFS R SETATTR3 OK                        0.01592 sunserver: NFS R SETATTR3 OK
0.07132 sunclient: NFS C LOOKUP3 FH=4C81 README               0.01750 sunclient: NFS C SETATTR3 FH=CA37
0.07152 linuxserver: NFS R LOOKUP3 OK FH=DD86                 0.01838 sunserver: NFS R SETATTR3 OK
0.07184 sunclient: NFS C SETATTR3 FH=DD86
0.07201 linuxserver: NFS R SETATTR3 OK

At this point linux traffic is already late wrt sun traffic...
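
One way to see who is taking the time is to charge each inter-packet gap to the host that sent the later packet. The sketch below does this for the Linux-server column of the trace, reduced to (timestamp, sender) pairs. The function names and the 10 ms threshold are arbitrary choices made for illustration.

```python
C, S = "sunclient", "linuxserver"

# (timestamp in seconds, sender), copied from the Linux-server column.
LINUX_TRACE = [
    (0.00000, C), (0.00018, S), (0.00051, C), (0.00063, S),
    (0.00083, C), (0.00112, S), (0.02229, C), (0.02248, S),
    (0.02273, C), (0.02290, S), (0.02311, C), (0.02353, S),
    (0.02402, C), (0.02418, S), (0.02438, C), (0.02453, S),
    (0.02470, C), (0.02483, S), (0.02500, C), (0.02537, S),
    (0.02588, C), (0.02608, S), (0.04781, C), (0.04784, C),
    (0.04786, C), (0.04789, C), (0.04900, S), (0.04943, C),
    (0.04959, S), (0.04990, C), (0.05006, S), (0.05032, C),
    (0.05048, S), (0.07132, C), (0.07152, S), (0.07184, C),
    (0.07201, S),
]


def idle_time_by_sender(trace):
    """Sum the gap before each packet, charged to that packet's sender."""
    totals = {}
    for (t_prev, _), (t_cur, sender) in zip(trace, trace[1:]):
        totals[sender] = totals.get(sender, 0.0) + (t_cur - t_prev)
    return totals


def long_gaps(trace, threshold=0.010):
    """Return (gap, sender, timestamp) for every gap above the threshold."""
    found = []
    for (t_prev, _), (t_cur, sender) in zip(trace, trace[1:]):
        gap = t_cur - t_prev
        if gap > threshold:
            found.append((round(gap, 5), sender, t_cur))
    return found


for host, seconds in idle_time_by_sender(LINUX_TRACE).items():
    print(f"{host:12s} waited {seconds * 1000:7.2f} ms before sending")
for gap, sender, when in long_gaps(LINUX_TRACE):
    print(f"gap of {gap * 1000:.1f} ms before {sender} packet at {when:.5f}")
```

On these figures the three gaps of roughly 20 ms each precede a packet from sunclient, which would point at the client side rather than at the server's reply latency, although a single trace cannot say why the client waits.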

These were mounted with UDP, and performance is still worse on the
linux side (roughly twice as slow).

TCP improves dramatically on solaris, but not on linux.
Tomorrow I'll pack up TCP data for kernel 2.4.20.

Ciao...
Fabrizio

PS:

--- nfsstat -m on the client for the two servers:

sunserver:
Flags: vers=3,proto=udp,sec=sys,hard,intr,link,symlink,acl,rsize=8192,wsize=8192,retrans=5,timeo=11
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60
Lookups: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
Reads: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
Writes: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
All: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)


linuxserver:
Flags: vers=3,proto=udp,sec=none,hard,intr,link,symlink,acl,rsize=8192,wsize=8192,retrans=5,timeo=11
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60
Lookups: srtt=1 (2ms), dev=1 (5ms), cur=0 (0ms)


--- And nfsstat -c for the client (retrans did not increase during this test)

Client rpc:
Connection oriented:
calls       badcalls    badxids     timeouts    newcreds    badverfs
2209288     117         5           1           0           0
timers      cantconn    nomem       interrupts
0           111         0           5
Connectionless:
calls       badcalls    retrans     badxids     timeouts    newcreds
14327877    17289       4498        105         21658       0
badverfs    timers      nomem       cantsend
0           2502        0           0

Client nfs:
calls       badcalls    clgets      cltoomany
16115088    132         16115088    0
Version 2: (1067 calls)
null        getattr     setattr     root        lookup      readlink
0 0%        1061 99%    0 0%        0 0%        5 0%        0 0%
read        wrcache     write       create      remove      rename
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
link        symlink     mkdir       rmdir       readdir     statfs
0 0%        0 0%        0 0%        0 0%        0 0%        1 0%
Version 3: (16109023 calls)
null        getattr     setattr     lookup      access      readlink
0 0%        5103618 31% 289621 1%   2757869 17% 772866 4%   8005 0%
read        write       create      mkdir       symlink     mknod
2955234 18% 3392759 21% 118810 0%   8634 0%     7433 0%     2 0%
remove      rmdir       rename      link        readdir     readdirplus
133306 0%   8222 0%     10810 0%    11005 0%    93489 0%    94502 0%
fsstat      fsinfo      pathconf    commit
38683 0%    475 0%      3118 0%     300562 1%

Client nfs_acl:
Version 2: (1 calls)
null        getacl      setacl      getattr     access
0 0%        0 0%        0 0%        1 100%      0 0%
Version 3: (4997 calls)
null        getacl      setacl
0 0%        4997 100%   0 0%







































Tomorrow we'll try to pack up the same analysis for TCP.




2003-02-19 17:56:47

by Eric Whiting

Subject: Re: 2.4.20 TCP server + solaris client performance

Fabrizio,

2.4.20 NFS server testing -- I had been using bonnie numbers to judge
performance. Users and applications had not reported problems. I had not tried
your tar -xf cvs.tar test.

I just tried your tar -xf cvs-1.11.5.tar test and I see numbers like yours
(except I don't see super fast solaris NFS numbers)

Client      Server           Time
-------------------------------------
solaris7    2.4.20           27.3
solaris7    solaris9         26.9
solaris9    solaris7         25.3
2.4.18      2.4.20            7.0  (defaults to async mounts right?)
2.4.20      2.4.18           15.1
linux       local (no NFS)    1.2  (including sync)

I think what we are seeing is partially the overhead in the creation of 500+
small files...
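
A rough back-of-the-envelope estimate supports this. The sketch below assumes that every file costs about as much as README did in Fabrizio's UDP trace earlier in the thread (from the first LOOKUP for README to the last SETATTR reply), and that the archive holds about 500 files. Both are simplifying assumptions, and the function name is made up for the example.

```python
def estimate_untar_seconds(per_file_seconds: float, n_files: int) -> float:
    """Total time if every file costs the same number of round trips."""
    return per_file_seconds * n_files


# Assumed file count ("500+" rounded down).
N_FILES = 500

# Per-file cost read off the trace: last SETATTR reply minus first LOOKUP.
LINUX_PER_FILE = 0.07201 - 0.02438    # Linux server column
SOLARIS_PER_FILE = 0.01838 - 0.00627  # Solaris server column

for name, cost in (("linux", LINUX_PER_FILE), ("solaris", SOLARIS_PER_FILE)):
    total = estimate_untar_seconds(cost, N_FILES)
    print(f"{name:8s} {cost * 1000:6.2f} ms/file -> about {total:5.1f} s")
```

The estimate ignores file sizes and directory creation, so it only indicates the order of magnitude that per-file overhead alone could explain.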

I'm not sure there is a major NFS UDP/TCP problem here -- but the differences
between your sun server and your linux server are interesting. Perhaps there is
a sync option that is making those numbers look different?

My solaris boxes are UFS. (are you running veritas?)
My linux boxes are reiserfs.
The above numbers are on 100M network.

Please try bonnie tests for additional insights. Different tests show different
bottlenecks and advantages. Bonnie testing usually shows other areas where linux
NFS can do better than solaris NFS.

eric



Fabrizio Nesti wrote:
> So, if it is not UDP/TCP or r/wsize the cause, what can it be?
> Even more, after the email from
>
> > Eric Whiting <[email protected]>
> > ...
> > We are getting good NFS numbers with 2.4.20 UDP NFS servers against solaris
> > [89] clients.
>
> So Eric, can you please show your configuration?
>
> Also it is strange that a standard out-of-the-box RH8.0 on that big server
> does perform so bad. Knowing this in advance, we wouldn't have chosen
> linux for serving... :(
>
> Ok,
> thanks and ciao,
> Fabrizio



2003-02-20 18:23:28

by Fabrizio Nesti

Subject: Re: 2.4.20 TCP server + solaris client performance

>I just tried your tar -xf cvs-1.11.5.tar test and I see numbers like yours
> (except I don't see super fast solaris NFS numbers)
> Client      Server           Time
> -------------------------------------
> solaris7    2.4.20           27.3
> solaris7    solaris9         26.9
> solaris9    solaris7         25.3
> 2.4.18      2.4.20            7.0  (defaults to async mounts right?)
> 2.4.20      2.4.18           15.1
> linux       local (no NFS)    1.2  (including sync)

Hello, your third line is indeed strange to us.
These are our times, in the full range of situations, still for tar xf.
We'll try some other tests (dd and bonnie) tomorrow.

    Client     Server            Proto  Time (sec)
-------------------------------------------------------------------------
 1) 2.4.18     solaris7          TCP     7 (rw=32k)  (7 for rw=8k)
 2) solaris8   solaris7          TCP     8 (rw=32k)  (15 for rw=8k)
 3) 2.4.18     solaris7          UDP     7
 4) solaris8   solaris7          UDP    15

 5) 2.4.18     2.4.20  (sync)    TCP    30 (rw=8k)
 6) 2.4.18     2.4.20  (async)   TCP    10 (rw=8k)
 7) solaris8   2.4.20  (sync)    TCP    55 (rw=8k)   60 (rw=1k)
 8) solaris8   2.4.20  (async)   TCP    40 (rw=8k)
 9) 2.4.18s    2.4.20  (sync)    UDP    15
 A) 2.4.18s    2.4.20  (async)   UDP     3
 B) solaris8   2.4.20  (sync)    UDP    53
 C) solaris8   2.4.20  (async)   UDP    34

 D) 2.4.18     2.4.18s (sync)    UDP    50 (both machines loaded)
 E) 2.4.18     2.4.18s (async)   UDP     3
 F) solaris8   2.4.18s (sync)    UDP    87 (both machines loaded)
 G) solaris8   2.4.18s (async)   UDP    33

Local Writes (no NFS):
    2.4.18s    (Xeon server, RAID5/ext3)          2 (including sync)
    2.4.18/20  (Athlon1900DDR test PC, ext3)      3 (including sync)
    solaris7   (E250 server, RAID5/UFS+logg)      3 (including sync)

solaris8 client is an U10 (and has no retransmission problems).

Comments:
- TCP does not help the present linux server. (on the contrary for pure
linux it's worse).
- For pure solaris, wsize=32k doubles the TCP speed, otherwise comparable
to UDP performance. We may try to enable it for linux also..
- Does solaris default to async? (Strange, but there's no server side flag).
If not, solaris is _very_ fast.

--
In other words, we switched from a SUN Enterprise250 to a quad Xeon (Dell),
to find performance loss from case 2) to G). :(
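
The size of that loss can be put in numbers with a few lines of arithmetic on the table above. The dictionary copies only the solaris8-client times used here (case 2 at rw=32k); the names are illustrative, not part of any tool.

```python
# Times in seconds for the solaris8 client, keyed by case label.
TIMES = {
    "2": 8,   # solaris7 server, TCP, rw=32k
    "8": 40,  # 2.4.20 server (async), TCP
    "C": 34,  # 2.4.20 server (async), UDP
    "G": 33,  # 2.4.18s server (async), UDP
}


def slowdown(case: str, baseline: str = "2") -> float:
    """How many times slower a case is than the baseline case."""
    return TIMES[case] / TIMES[baseline]


for case in ("8", "C", "G"):
    print(f"case {case} is {slowdown(case):.2f}x slower than case 2")
```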

Hoping for the best,
ciao,
Fabrizio


PS: Some nfsstat -m as seen from the solaris8 client.

1,2) solaris 7 server via TCP
from didot:/export/backup
Flags: vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,
rsize=<....>,wsize=<.....>,retrans=5,timeo=600
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

3,4) Solaris 7 server via UDP
Flags: vers=3,proto=udp,sec=sys,hard,intr,link,symlink,acl,
rsize=8192,wsize=8192,retrans=5,timeo=11
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60
Lookups: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
Reads: srtt=9 (22ms), dev=6 (30ms), cur=4 (80ms)
Writes: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
All: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)

5,6,7,8) Linux server via TCP:
Flags: vers=3,proto=tcp,sec=none,hard,intr,link,symlink,acl,
rsize=8192,wsize=8192,retrans=5,timeo=600
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

9,A,B,C,D,E,F,G) Linux servers via UDP
Flags: vers=3,proto=udp,sec=none,hard,intr,link,symlink,
rsize=8192,wsize=8192,retrans=5,timeo=11
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60
Lookups: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
Reads: srtt=7 (17ms), dev=4 (20ms), cur=2 (40ms)
Writes: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)
All: srtt=7 (17ms), dev=3 (15ms), cur=2 (40ms)





    Client     Server            Time (sec)
-----------------------------------------------
TCP:
 1) 2.4.18     solaris7           7    (rw=32k)  (7 for rw=8k)
 2) solaris8   solaris7           8    (rw=32k)  (15 for rw=8k)
 3) 2.4.18     2.4.20  (sync)    30    (rw=8k)
 4) 2.4.18     2.4.20  (async)   10    (rw=8k)
 5) solaris8   2.4.20  (sync)    55    (rw=8k)   60 (rw=1k)
 6) solaris8   2.4.20  (async)   40    (rw=8k)
UDP:
 7) 2.4.18     solaris7           7.5
 8) solaris8   solaris7          15
 9) 2.4.18s    2.4.20  (sync)    15
 A) 2.4.18s    2.4.20  (async)    2.5
 B) solaris8   2.4.20  (sync)    53
 C) solaris8   2.4.20  (async)   34

 9) 2.4.18     2.4.18s (sync)    50    (both machines loaded)
 A) 2.4.18     2.4.18s (async)    2.6
 B) solaris8   2.4.18s (sync)    87    (both machines loaded)
 C) solaris8   2.4.18s (async)   33









2003-03-07 23:59:53

by Eric Whiting

Subject: Re: 2.4.20 TCP server + solaris client performance

Neil/Fabrizio,

I'm still seeing the slow linux 2.4.20 nfs server when using a solaris client.
(as reported by Fabrizio Nesti <[email protected]> last month)

Summary: NFS operations related to the untar of a file are very/very slow on a
linux 2.4.20 NFS server (TCP and UDP). Bonnie streaming numbers look very good.
Just the file creation stuff is slow. 15 minutes instead of 2 minutes for the
tar -xf.

Is this an issue related to noac or sync options from the solaris client?

More benchmark data changing the test to highlight the problem.
I used 'time tar -xf linux-2.4.20.tar' for this testing.

eric


2.4.20 NFS server (TCP NFSV3 -- UDP numbers similar)
----------------------------------------------
0m 4s untar on local linux box (no sync included)
2m 18s Linux 2.4.19 NFS client (UDP)
15m 28s Solaris 2.7 NFS client (TCP)
26m 9s Solaris 2.9 NFS client (somewhat a traffic issue perhaps?)


NETAPPS NFS SERVER
-------------------
1m 19s Linux 2.4.20 client (UDP)
1m 54s Solaris 2.8 client
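
Timing runs like the ones above could be scripted roughly as follows. This is only a sketch: it builds a small stand-in archive in a temporary directory so that it runs anywhere, and the helper names are invented for the example. To exercise an NFS server, the destination directory would have to be pointed at a directory on the mount. Like a plain 'time tar -xf', it does not include a final sync.

```python
import io
import os
import tarfile
import tempfile
import time


def make_sample_archive(path, n_files=20, size=1024):
    """Build a small tar of regular files (stand-in for a real tarball)."""
    payload = b"x" * size
    with tarfile.open(path, "w") as tar:
        for i in range(n_files):
            info = tarfile.TarInfo(name=f"sample/file{i:03d}.txt")
            info.size = size
            tar.addfile(info, io.BytesIO(payload))


def time_untar(archive_path, dest_dir):
    """Extract into dest_dir; return (elapsed seconds, regular file count)."""
    # Newer Python versions accept an extraction filter; older ones do not.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    start = time.perf_counter()
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
        tar.extractall(dest_dir, **kwargs)
    elapsed = time.perf_counter() - start
    return elapsed, sum(1 for m in members if m.isfile())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        archive = os.path.join(work, "sample.tar")
        make_sample_archive(archive)
        dest = os.path.join(work, "out")  # point this at an NFS mount to test it
        os.mkdir(dest)
        seconds, n = time_untar(archive, dest)
        print(f"{n} files in {seconds:.3f}s ({1000 * seconds / n:.2f} ms per file)")
```

Reporting the time per file rather than the total makes runs with different archives easier to compare, since the slow part in this thread is file creation rather than streaming.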








2003-03-17 17:13:22

by Fabrizio Nesti

Subject: [NFS] 2.4.20 TCP server + solaris client performance

Hi to all.
I'd like to report a linux-server / solaris-client performance problem
that still seems to have no solution.

- The problem is that untarring a (small) archive on a solaris client is 4-5
times slower (!) with a linux server than with a solaris server. We tried
UDP as well as TCP (2.4.20, also with bs=32k) without improvement.
See details below..

There was a discussion on the linux-nfs list, from which I extract the last
message below.
http://marc.theaimsgroup.com/?t=104550454800002&r=1&w=2
There you'll also find some tcpdump output. If needed, don't hesitate
to ask me for numbers or further tests.

We really hope not to be forced to install Solaris on our new Linux servers :)

Cheers and thanks in advance to all,
Fabrizio




On Fri, 7 Mar 2003, Eric Whiting wrote:

> Neil/Fabrizio,
>
> I'm still seeing the slow linux 2.4.20 nfs server when using a solaris client.
> (as reported by Fabrizio Nesti <[email protected]> last month)
>
> Summary: NFS operations related to the untar of a file are very/very slow on a
> linux 2.4.20 NFS server (TCP and UDP). Bonnie streaming numbers look very good.
> Just the file creation stuff is slow. 15 minutes instead of 2 minutes for the
> tar -xf.
>
> Is this an issue related to noac or sync options from the solaris client?
>
> More benchmark data changing the test to highlight the problem.
> I used 'time tar -xf linux-2.4.20.tar' for this testing.
>
> eric
>
> 2.4.20 NFS server (TCP NFSV3 -- UDP numbers similar)
> ----------------------------------------------
> 0m 4s untar on local linux box (no sync included)
> 2m 18s Linux 2.4.19 NFS client (UDP)
> 15m 28s Solaris 2.7 NFS client (TCP)
> 26m 9s Solaris 2.9 NFS client (somewhat a traffic issue perhaps?)
>
> NETAPPS NFS SERVER
> -------------------
> 1m 19s Linux 2.4.20 client (UDP)
> 1m 54s Solaris 2.8 client

2003-03-17 22:53:43

by Wendy Cheng

Subject: Re: 2.4.20 TCP server + solaris client performance

What's the underlying file system you use?
ext2? ext3? Could you try SGI's xfs?

Wendy
-------





2003-06-05 14:48:54

by Fabrizio Nesti

Subject: Re: 2.4.20 TCP server + solaris client performance

Thanks to Tom Georgoulias for the solution.

This issue is solved by sun patch 108727-25.

bye,
Fabrizio

--------------------------------------------------------------------