2003-11-07 16:54:17

by Juergen Sauer

Subject: XFS NFS Fileserver, kernel: nfs: server server not responding, still trying

Hi list!

Since upgrading to kernel 2.4.22-xfs (straight from SGI CVS) I get the
following problem when copying larger files over NFS (local XFS ->
remote NFS file server, which is also entirely on XFS):

Nov 7 17:20:02 pc2 kernel: nfs: server server not responding, still trying
Nov 7 17:20:02 pc2 kernel: nfs: server server not responding, still trying
Nov 7 17:20:02 pc2 kernel: nfs: server server OK

This is my Test-Setup:
server: 64GB XFS-fs on ICP-Vortex (Hardware Raid5) SCSI controller

I tried everything mentioned on the NFS FAQ page to avoid this problem:
http://nfs.sourceforge.net/nfs-howto/performance.html#TIMEOUT

jojo@pc2:jojo $ ssh root@localhost
Linux pc2 2.4.22-xfs #2 Don Okt 23 17:01:40 CEST 2003 i686 unknown
root@pc2:root# nfsstat
[...]
Client rpc stats:
calls retrans authrefrsh
1483912 21137 0
[...]

root@pc2:root# echo 262144 > /proc/sys/net/core/rmem_default
root@pc2:root# echo 262144 > /proc/sys/net/core/rmem_max
root@pc2:root# mount /home -o remount,timeo=14,retrans=6
Copy Try -> Same Problem
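
As a rough guide to what timeo=14,retrans=6 means: timeo is given in tenths
of a second, and for NFS over UDP the client doubles the timeout after each
retransmission, up to a cap of 60 seconds. A minimal Python sketch of that
schedule follows. The function name is illustrative, and the kernel's
adaptive timeout estimation for READ and WRITE requests is deliberately
ignored, so real waits will differ:

```python
def backoff_schedule(timeo, retrans, cap_seconds=60.0):
    """Return the nominal wait (in seconds) before each retransmission.

    timeo is in tenths of a second, as in the mount option.
    Assumption: plain exponential backoff capped at cap_seconds; the
    adaptive RTT-based estimator used by the kernel is not modelled.
    """
    waits = []
    wait = timeo / 10.0
    for _ in range(retrans):
        waits.append(min(wait, cap_seconds))
        wait *= 2
    return waits


if __name__ == "__main__":
    print(backoff_schedule(14, 6))   # first setting tried above
    print(backoff_schedule(30, 12))  # second setting tried below
```

With timeo=30,retrans=12 the 60-second cap is reached from the sixth
retransmission on.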

root@pc2:root# mount /home -o remount,timeo=30,retrans=12
root@pc2:root# mount /home -o remount,timeo=30,retrans=12,rsize=32768,wsize=32768

root@pc2:jojo# nfsstat
...
Client rpc stats:
calls retrans authrefrsh
1752966 23343 0
...

root@pc2:jojo# mount /home -o remount,timeo=30,retrans=12,rsize=32768,wsize=32768,nfsvers=3
root@pc2:jojo# mount
server:/home on /home type nfs (rw,lock,timeo=30,retrans=12,rsize=32768,wsize=32768,nfsvers=3,addr=192.168.11.1)
Copy Try -> Same Problem
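
For reference, the two nfsstat samples above can be turned into a
retransmission rate, both overall and for the interval between the samples.
A minimal Python sketch; the function names are illustrative only, and the
numbers are the ones from the nfsstat output in this post:

```python
def retrans_rate(calls, retrans):
    """Fraction of RPC calls that had to be retransmitted."""
    return retrans / calls


def interval_rate(before, after):
    """Retransmission rate between two (calls, retrans) samples."""
    calls = after[0] - before[0]
    retrans = after[1] - before[1]
    return retrans / calls


if __name__ == "__main__":
    first = (1483912, 21137)    # first nfsstat sample
    second = (1752966, 23343)   # second nfsstat sample
    print(f"overall:  {retrans_rate(*first):.2%}")
    print(f"interval: {interval_rate(first, second):.2%}")
```

That works out to roughly 1.4% of all calls so far, and roughly 0.8% of the
calls made between the two samples.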

In terms of NFS performance, this 2.4.22-xfs kernel is the worst since 2.4.7
with reiserfs or the userspace NFS daemon.

How can we stop the problem?
Any ideas?

TIA !

Kind regards
Jürgen
automatiX Linux Support Crew
--
Jürgen Sauer - AutomatiX GmbH, +49-4209-4699, [email protected] **
** Das Linux Systemhaus - Service - Support - Server - Lösungen **
** http://www.automatix.de ICQ: #344389676 **



-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive? Does it
help you create better code? SHARE THE LOVE, and help us help
YOU! Click Here: http://sourceforge.net/donate/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-11-07 17:35:06

by Bogdan Costescu

Subject: Re: XFS NFS Fileserver, kernel: nfs: server server not responding, still trying

On Fri, 7 Nov 2003, Juergen Sauer wrote:

> calls retrans authrefrsh
> 1483912 21137 0
> ...
> calls retrans authrefrsh
> 1752966 23343 0

The number of "retrans" is pretty high...

> ...rsize=32768,wsize=32768...

... and I think that this is the problem. Try lowering the sizes to
something more closely matched to the network that you are using, like 8k
or even lower. Although you said that you followed the NFS FAQ ???
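
To illustrate why large rsize/wsize values hurt on a lossy link with NFS
over UDP: each request or reply travels as one UDP datagram, which IP splits
into MTU-sized fragments, and losing any one fragment loses the whole
datagram. A minimal Python sketch, assuming plain Ethernet with a 1500-byte
MTU and ignoring the RPC/NFS header bytes (both are assumptions, not details
from this thread):

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options
UDP_HEADER = 8   # bytes


def fragments_per_datagram(payload_bytes, mtu=1500):
    """Number of IP fragments needed for one UDP datagram."""
    # Fragment data must be a multiple of 8 bytes (1480 for MTU 1500).
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((payload_bytes + UDP_HEADER) / per_fragment)


def delivery_probability(payload_bytes, fragment_loss, mtu=1500):
    """Chance that every fragment of one datagram arrives.

    Assumes fragments are lost independently with the same probability.
    """
    return (1.0 - fragment_loss) ** fragments_per_datagram(payload_bytes, mtu)


if __name__ == "__main__":
    for size in (32768, 8192, 4096, 1024):
        n = fragments_per_datagram(size)
        p = delivery_probability(size, 0.01)  # hypothetical 1% loss
        print(f"{size:6d} bytes: {n:2d} fragments, {p:.1%} delivered")
```

On these assumptions a 32 KB transfer needs 23 fragments and an 8 KB
transfer needs 6, so at a hypothetical 1% fragment loss roughly 79% of the
32 KB datagrams would arrive intact versus about 94% of the 8 KB ones.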

--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [email protected]




-------------------------------------------------------
This SF.Net email sponsored by: ApacheCon 2003,
16-19 November in Las Vegas. Learn firsthand the latest
developments in Apache, PHP, Perl, XML, Java, MySQL,
WebDAV, and more! http://www.apachecon.com/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2003-11-08 08:54:23

by Juergen Sauer

Subject: Re: XFS NFS Fileserver, kernel: nfs: server server not responding, still trying

On Friday, 7 November 2003 19:01, Lever, Charles wrote:
> have you tried with NFS over TCP? you may have a
> network problem which prevents NFS over UDP from
> working efficiently.
Not yet.
With kernel 2.4.20-xfs on both server and client everything was fine;
with 2.4.22-xfs on the server and on the client the horror began.

On the client, 2.4.22-xfs is necessary because the hardware requires it.
The server is already back on 2.4.20-xfs, as described in other posts on
the list.

Kind regards
Jürgen
automatiX Linux Support Crew
--
Jürgen Sauer - AutomatiX GmbH, +49-4209-4699, [email protected] **
** Das Linux Systemhaus - Service - Support - Server - Lösungen **
** http://www.automatix.de ICQ: #344389676 **




2003-11-08 08:54:23

by Juergen Sauer

Subject: Re: XFS NFS Fileserver, kernel: nfs: server server not responding, still trying

On Friday, 7 November 2003 18:34, Bogdan Costescu wrote:
> On Fri, 7 Nov 2003, Juergen Sauer wrote:

> > calls retrans authrefrsh
> > 1483912 21137 0
> > ...
> > calls retrans authrefrsh
> > 1752966 23343 0

> The number of "retrans" is pretty high...
That's it. ;-<

> > ...rsize=32768,wsize=32768...
I tried both lower and higher values, including 8192, 4096 and 1024.

> ... and I think that this is the problem. Try lowering the sizes to
> something more closely matched to the network that you are using, like 8k
> or even lower. Although you said that you followed the NFS FAQ ???
Yep, I only posted after the behavior changed dramatically in
2.4.22-xfs compared with 2.4.20-xfs.

Thanks so far.

Kind regards
Jürgen
automatiX Linux Support Crew
--
Jürgen Sauer - AutomatiX GmbH, +49-4209-4699, [email protected] **
** Das Linux Systemhaus - Service - Support - Server - Lösungen **
** http://www.automatix.de ICQ: #344389676 **




2003-11-11 09:24:19

by Juergen Sauer

Subject: Solution/Work around: Re: XFS NFS Fileserver, kernel: nfs: server server not responding, still trying

On Saturday, 8 November 2003 09:42, Juergen Sauer wrote:
> On Friday, 7 November 2003 18:34, Bogdan Costescu wrote:
> > On Fri, 7 Nov 2003, Juergen Sauer wrote:
>
> > > calls retrans authrefrsh
> > > 1483912 21137 0
> > > ...
> > > calls retrans authrefrsh
> > > 1752966 23343 0
>
> > The number of "retrans" is pretty high...
> That's it. ;-<
>
> > > ...rsize=32768,wsize=32768...
> I tried both lower and higher values, including 8192, 4096 and 1024.
>
> > ... and I think that this is the problem. Try lowering the sizes to
> > something more closely matched to the network that you are using, like 8k
> > or even lower. Although you said that you followed the NFS FAQ ???
> Yep, I only posted after the behavior changed dramatically in
> 2.4.22-xfs compared with 2.4.20-xfs.

I found the reason for the trouble. ;-<<
The NVIDIA network driver is buggy. I opened an incident with NVIDIA support
for it; it is now a matter for the driver team.

Meanwhile, the workaround is to put this into modules.conf:
options nvnet optimization=0 speed=2 duplex=1

optimization=0 -> no throughput "optimization"
speed=2 -> 100 Mbit/s
duplex=1 -> no duplex

I never thought that a driver bug could bite me on Linux.
I am definitely used to getting excellent drivers with the kernel, from
the kernel developers. Those are the free/GPLed drivers.

The closed-source nvnet driver (NVIDIA nForce2 chipset) taints the kernel
and ruins the speed on kernel 2.4.
Only 3.33 MByte/s are left after those "optimizations".


Kind regards
Jürgen
automatiX Linux Support Crew
--
Jürgen Sauer - AutomatiX GmbH, +49-4209-4699, [email protected] **
** Das Linux Systemhaus - Service - Support - Server - Lösungen **
** http://www.automatix.de ICQ: #344389676 **


