2004-03-16 11:35:28

by Brasseur Valéry

Subject: linus 2.4.20 nfs performance problem

I have a setup where Linux machines (dual Xeon 2.4 GHz, 2 GB memory, kernel
2.4.20 + Trond patch) are accessing a NetApp 810 (or 825).

There are a lot of processes (about 400).
Under load, many processes are in D state, the filers are at around 5000
nfsops, and the network is near 50 Mbit/s for each machine.

The machines and the filer are connected through a switch; the machines are
at 100 Mbit and the filer is at 1 Gbit.

The CPU on the machines is near &à% and the load average is near 100 or
more.


I don't know where the bottleneck is...

Can somebody help me?


Thanks in advance,
valery



-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2004-03-16 13:39:27

by Fabiano Reis

Subject: Re: linus 2.4.20 nfs performance problem

It looks like you are hitting the maximum number of NFS operations per
mount point. See nfs.sourceforge.net

The maximum number of NFS operations per mount point for kernels < 2.5.47
is 256.

I'm getting the same problem with kernel 2.4.20. I upgraded an NFS client to
kernel 2.6 and I am still getting the same problem. Maybe a solution is to
upgrade both the server and the client to kernel 2.6 (I haven't tested that
yet).

-
Fabiano


2004-03-16 16:08:31

by Trond Myklebust

Subject: Re: linus 2.4.20 nfs performance problem

On Tue, 16/03/2004 at 08:37, Fabiano Reis wrote:
> It looks like you are hitting the maximum number of NFS operations per
> mount point. See nfs.sourceforge.net
>
> The maximum number of NFS operations per mount point for kernels < 2.5.47
> is 256.
>
> I'm getting the same problem with kernel 2.4.20. I upgraded an NFS client
> to kernel 2.6 and I am still getting the same problem. Maybe a solution is
> to upgrade both the server and the client to kernel 2.6 (I haven't tested
> that yet).

There is a second limit: the RPC client limits the number of RPC
requests it can have outstanding on the wire.

On TCP, that limit is always 16 requests. On UDP, there is a congestion
control algorithm that will reduce that limit dynamically if the network
is losing packets (or if the server is dropping them).

Note: 2.6.5-rc1 contains a patch by Chuck Lever that allows you to
change that upper limit of 16 requests by means of a /proc interface.

Cheers,
Trond
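
The effect of the outstanding-request limit described in this message can be
estimated with a little arithmetic. What follows is a rough sketch, not
anything taken from the kernel source: it assumes every request holds one slot
for a full round trip, so a single transport can complete at most
slots / round-trip-time requests per second. The round-trip times are
illustrative values, and the /proc path is the sunrpc sysctl name found on
later 2.6 kernels; the message only says "a /proc interface", so treat that
path as an assumption.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the tunable on later 2.6 kernels; the thread only
# says "a /proc interface", so this path is an assumption.
TCP_SLOT_SYSCTL = Path("/proc/sys/sunrpc/tcp_slot_table_entries")


def max_ops_per_sec(slots: int, rtt_ms: float) -> float:
    """Upper bound on completed RPCs per second for one transport.

    Assumes every request holds a slot for one full round trip.
    """
    if slots <= 0 or rtt_ms <= 0:
        raise ValueError("slots and rtt_ms must be positive")
    return slots * 1000.0 / rtt_ms


def read_slot_limit(path: Path = TCP_SLOT_SYSCTL) -> Optional[int]:
    """Return the configured slot count, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    slots = read_slot_limit() or 16  # fall back to the fixed limit of 16
    for rtt_ms in (0.5, 1.0, 5.0):  # illustrative round-trip times
        bound = max_ops_per_sec(slots, rtt_ms)
        print(f"{slots} slots, {rtt_ms} ms RTT -> at most {bound:.0f} ops/s")
```

Under these assumptions, 16 slots and a 5 ms round trip give a ceiling of
3200 requests per second per transport, so whether this limit matters depends
heavily on how quickly the server answers.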



2004-03-16 16:26:57

by Lever, Charles

Subject: RE: linus 2.4.20 nfs performance problem

i don't think these limits are the problem. the congestion control
and slot table limits should put these processes to sleep, not leave
them in D state, right? same for the per-mountpoint pages in flight
limit.

valery, your message was a little garbled. did you say that there
was very little idle left on the clients? is keystroke responsiveness
affected by this "overload" condition?


2004-03-16 16:29:52

by Brasseur Valéry

Subject: RE: linus 2.4.20 nfs performance problem

No, there is plenty of idle on the clients!
Only the load average is high!

Note: in our latest experiment we have seen that adding more memory
(i.e. we are now at 4 GB), and perhaps moving to a gigabit network, works
much better...


Valery BRASSEUR | Phone # +33 320 60 7982
Atosorigin Branche Multimedia | Fax # +33 320 60 7649
Tout krab-la mò an bari-la
(All the crabs are dead in the barrel).



2004-03-16 16:31:19

by Brasseur Valéry

Subject: RE: linus 2.4.20 nfs performance problem

For testing, we increased the limit from 256 to 1024 and recompiled.

The problem seems to be the same!


2004-03-16 16:56:04

by Fabiano Reis

Subject: Re: linus 2.4.20 nfs performance problem

How did you change it? Is it as simple as changing a #define in the kernel
header files and recompiling? I thought the patch for this was too complex.


2004-03-16 17:00:58

by Trond Myklebust

Subject: RE: linus 2.4.20 nfs performance problem

On Tue, 16/03/2004 at 11:26, Lever, Charles wrote:
> i don't think these limits are the problem. the congestion control
> and slot table limits should put these processes to sleep, not leave
> them in D state, right? same for the per-mountpoint pages in flight
> limit.

man top

w: S -- Process Status
The status of the task which can be one of:
'D' = uninterruptible sleep
'R' = running
'S' = sleeping
'T' = traced or stopped
'Z' = zombie

Cheers,
Trond
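
To see how many processes are in each of these states on a client, the state
letter can be read from /proc/<pid>/stat. A minimal sketch follows, assuming a
Linux /proc layout; the function names are made up for illustration. It is
worth keeping in mind that the Linux load average counts tasks in
uninterruptible sleep (D) as well as runnable ones, so a large number of D
processes can push the load average up while the CPU stays mostly idle.

```python
import os
from collections import Counter


def parse_state(stat_line: str) -> str:
    """Extract the state letter from one /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split after the last ')'.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_states(proc_root: str = "/proc") -> Counter:
    """Count processes per state letter (D, R, S, T, Z, ...)."""
    counts = Counter()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return counts  # no /proc here (not Linux, or not mounted)
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                counts[parse_state(fh.read())] += 1
        except (OSError, IndexError):
            continue  # process exited while we were scanning
    return counts


if __name__ == "__main__":
    for state, n in sorted(count_states().items()):
        print(state, n)
```

Running it repeatedly under load and watching the D count gives a quick
picture of how many processes are blocked at a given moment.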



2004-03-17 15:56:56

by David Dougall

Subject: RE: linus 2.4.20 nfs performance problem

I know in the past we had symptoms like this when we had mismatched
speeds between server and client. If you can, change all machines to
gigabit or 100Mbit.
--David Dougall
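
The message above does not say why the mismatch caused trouble. One mechanism
often cited for NFS over UDP, offered here purely as an assumption (the thread
never states which transport or transfer size is in use), is that each large
read or write travels as a single UDP datagram split into many IP fragments,
and losing any one fragment at a congested switch port loses the whole
datagram. The sketch below works through that arithmetic; the 1500-byte MTU,
the transfer sizes, and the 1% fragment loss rate are illustrative values, and
RPC/NFS header overhead is ignored.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options
UDP_HEADER = 8   # bytes


def fragments_per_datagram(payload_bytes: int, mtu: int = 1500) -> int:
    """Number of IP fragments needed to carry one UDP datagram."""
    per_fragment = mtu - IP_HEADER          # 1480 bytes of IP payload
    total = payload_bytes + UDP_HEADER      # UDP header rides in fragment 1
    return math.ceil(total / per_fragment)


def datagram_loss(fragment_loss: float, fragments: int) -> float:
    """Probability that at least one fragment is lost.

    Assumes each fragment is dropped independently with the same
    probability, which is a simplification.
    """
    return 1.0 - (1.0 - fragment_loss) ** fragments


if __name__ == "__main__":
    for size in (8192, 32768):              # illustrative transfer sizes
        n = fragments_per_datagram(size)
        p = datagram_loss(0.01, n)          # assumed 1% fragment loss
        print(f"{size} bytes: {n} fragments, datagram loss ~{p:.1%}")
```

Under these assumptions a small per-fragment loss rate turns into a much
larger per-request loss rate as the transfer size grows, which is one reason
to check for drops on the switch before changing anything else.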



2004-03-17 16:38:29

by Brasseur Valéry

Subject: RE: linus 2.4.20 nfs performance problem

We have done that: we put all clients on gigabit, like the NetApp filer.

At first it was worse! For now, the only thing that has given better results
is putting more memory in the clients, with the gigabit cards left in place
too.

valery