2003-06-25 08:56:55

by Matthias Kittner

Subject: Linux client network performance

[I am not a subscriber, please CC to my eMail!]

Hello,

we have the following problem:

- 2 Linux clients:
    2.4.19-16mdk
    2.4.20
- Solaris NFS server (nfs.server 1.21):
    SunOS 5.7

When we compile on one of the Linux machines, the whole network
sometimes hangs during the compile, or becomes so slow that an "ls"
takes between 10 s and 2 min to finish.

"snoop" on the NFS server machine shows a very large amount of "UDP
continuation" traffic between the Linux client and the Solaris server.
Killing the compile process stops these messages.

Can anyone help us? Is this a configuration problem or a bug?

Regards,
Matthias



-------------------------------------------------------
This SF.Net email is sponsored by: INetU
Attention Web Developers & Consultants: Become An INetU Hosting Partner.
Refer Dedicated Servers. We Manage Them. You Get 10% Monthly Commission!
INetU Dedicated Managed Hosting http://www.inetu.net/partner/index.php
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-06-25 14:29:46

by Lever, Charles

Subject: RE: Linux client network performance

hi matthias-

it sounds like you've hit the IP fragmentation bug.
you should use NFS/TCP (the "tcp" mount option).

probably it's your 2.4.19-based client that is
causing this problem.


2003-06-25 14:58:24

by Brasseur Valéry

Subject: RE: Linux client network performance

hi,

what's the "IP fragmentation bug" ?
what are the symptoms of this bug ?

thanks
valery

2003-06-25 17:40:36

by Lever, Charles

Subject: RE: Linux client network performance

hi valery-

> what's the "IP fragmentation bug" ?

the Linux IP layer sends packet fragments on the wire
as it fragments each datagram. it can run out of socket
buffer space in the middle of fragmenting a datagram.
the bug is that if it runs out of buffer space, it
stops fragmenting and drops the rest of the datagram,
even though the earlier fragments are already on the wire.

this leaves the receiving end with a bunch of fragments
it can't assemble into a whole datagram. this becomes
a problem when the sending end is perpetually running
out of socket space. this fills the receiving end's
reassembly queue with fragments that can't be used,
preventing all UDP traffic from getting to the server.

the fix is to have it continue fragmenting this datagram
even though the socket buffer is "full."
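
A toy model may make the difference between the buggy and the fixed
behaviour easier to see. This is only a sketch in Python, not the kernel
code: `fragment`, `buffer_space` and `keep_going_when_full` are made-up
names, and the 1480-byte fragment payload assumes a 1500-byte MTU with a
20-byte IPv4 header.

```python
def fragment(datagram_len, buffer_space, keep_going_when_full,
             fragment_payload=1480):
    """Return the sizes of the fragments that reach the wire.

    buffer_space is a toy stand-in for free socket buffer space (bytes).
    keep_going_when_full=False models the bug described in this thread;
    True models the fix (finish the datagram even if the buffer is "full").
    """
    sent = []
    offset = 0
    while offset < datagram_len:
        size = min(fragment_payload, datagram_len - offset)
        if size > buffer_space and not keep_going_when_full:
            # Buggy path: stop mid-datagram. Fragments already sent are
            # useless to the receiver but still occupy its reassembly queue.
            break
        sent.append(size)
        buffer_space -= size  # may go negative on the fixed path
        offset += size
    return sent


def can_reassemble(sent, datagram_len):
    """The receiver can rebuild the datagram only if every byte arrived."""
    return sum(sent) == datagram_len


buggy = fragment(8192, buffer_space=4000, keep_going_when_full=False)
fixed = fragment(8192, buffer_space=4000, keep_going_when_full=True)
print(len(buggy), can_reassemble(buggy, 8192))
print(len(fixed), can_reassemble(fixed, 8192))
```

In the buggy run only part of the datagram is sent, so the receiver is left
holding fragments it can never complete; in the fixed run all fragments go
out and the datagram can be reassembled.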

> what are the symptoms of this bug ?

you're using NFS/UDP with largish r/wsize. your
network features links of different speed (100Mb
mixed with GbE) between client and server, or is
routed rather than only switched.

you see lots of IP fragments on your network from
one or two NFS clients. you have periods of very slow
server and network performance, followed by periods of
normal performance.
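
To get a feel for why a large r/wsize matters here, the number of IP
fragments per UDP datagram can be estimated with a little arithmetic. The
sketch below is Python; `ip_fragments` is a made-up helper, and the
defaults assume IPv4 without options over a 1500-byte MTU. It ignores the
RPC and NFS headers that ride along with each write, so real requests are
slightly larger than the wsize alone.

```python
import math


def ip_fragments(udp_payload_len, mtu=1500, ip_header=20, udp_header=8):
    """Estimate how many IP fragments one UDP datagram needs.

    Every fragment except the last must carry a multiple of 8 payload
    bytes, so the usable space per fragment is rounded down accordingly.
    """
    per_fragment = (mtu - ip_header) // 8 * 8
    return math.ceil((udp_payload_len + udp_header) / per_fragment)


for wsize in (1024, 8192, 32768):
    print(wsize, ip_fragments(wsize))
```

Under these assumptions a 1k write fits in a single packet, an 8k write
needs 6 fragments and a 32k write needs 23, and losing any one of them
makes the whole datagram unusable.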


2003-06-25 18:09:34

by s o f i a

Subject: RE: Linux client network performance

Hello,

(pls cc me, a colleague just forwarded
this to me...)

I have seen this at a customer site.
And thank you so much for the clarification.

Can you please let me know:
- are there any identified workarounds?
- are there any papers or writeups on
various other symptoms of this issue?

At one site, the customer mentioned having
issues with Linux clients; they had to change
the NFS block size to 8k.

I wonder if you could comment on one case.
Even with the Linux client mounted at 8k, I
noticed that at a certain point, with a
write-intensive application writing to a
Solaris NFS server, the client had to send
two writes before the NFS server replied
(judging from the snoop output). I am not sure,
but I think the application was able to write
in blocks even smaller than 8k, and that seemed
to eliminate the NFS write problem on the
Linux client.

any thoughts on this issue?

thanks in advance,

sofia.



2003-06-25 19:40:36

by Lever, Charles

Subject: RE: Linux client network performance

hi sofia-

> Can you please let me know:
> - are there any identified workarounds?

yes, there are several.

1. use NFS/TCP

2. if you can't use NFS/TCP, upgrade your clients
to 2.4.20 or later

3. if you can't upgrade your clients, ensure the
default socket buffer size on your clients is
large before mounting any NFS servers (see
below).
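
The first two workarounds can be sketched in a few lines of Python. This
is only an illustration: `predates_2_4_20` and `nfs_tcp_mount_command` are
made-up helper names, the host and paths in the usage lines are
placeholders, and a version check is just a hint, since a vendor kernel
may carry backported fixes.

```python
import re


def predates_2_4_20(kernel_release):
    """True if a release string such as '2.4.19-16mdk' is older than 2.4.20."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", kernel_release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % kernel_release)
    return tuple(int(part) for part in match.groups()) < (2, 4, 20)


def nfs_tcp_mount_command(server, export, mountpoint, rsize=8192, wsize=8192):
    """Build (but do not run) a mount command that uses NFS over TCP."""
    options = "tcp,rsize=%d,wsize=%d" % (rsize, wsize)
    return ["mount", "-t", "nfs", "-o", options,
            "%s:%s" % (server, export), mountpoint]


# The two client kernels mentioned at the start of this thread:
for release in ("2.4.19-16mdk", "2.4.20"):
    print(release, predates_2_4_20(release))

# Placeholder server name and paths:
print(" ".join(nfs_tcp_mount_command("nfsserver", "/export/home", "/mnt/home")))
```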

> - are there any papers or writeups on
> various other symptoms of this issue?

http://www.netapp.com/tech_library/3183.html

contains instructions for deploying workarounds
in the appendix, including the socket buffer
enlargement workaround.

the symptoms that i described to valery are
what you should look for. other behaviors
could be the result of many different problems.

> At one site, the customer environment mentioned
> about having issues with Linux clients; they
> had to change NFS block size to 8k....

reducing r/wsize helps, but is not a guaranteed
workaround.

> Wondering if one can comment about this one case,
> I noticed even with the Linux client side
> mounted at 8k, at a certain point in time,
> with a certain application that is write intensive,
> to a Solaris NFS server, the client had to do
> (2) writes, before the NFS server replied ...
> (in looking at the snoop) ... i am not sure, but
> i think the application had the capability to
> write in even less the 8k blocksize, and it seems
> like this eliminated the problem of NFS-writes
> from the linux client...

that's not a very clear description of the problem.
it sounds like your client is retransmitting NFS
write operations, but there isn't enough in your
description to determine why that might occur.


2003-06-26 09:30:37

by Matthias Kittner

Subject: Re: Linux client network performance

Thanks in advance. I will try it out!

Regards,
Matthias

--
/\_ _ _ Matthias Kittner, +49(0)6151-30083-0
/\/_|_|_| Mornewegstr. 28, 64293 Darmstadt, Germany
\/vrcom.de     71F (22°C); Wind E, 5 mph
PTL!           1017 hPa; diesel




2003-06-26 16:50:35

by s o f i a

Subject: RE: Linux client network performance

Charles thank you so much for
your pointers. I have downloaded
the Netapp paper, and will review
it....

> that's not a very clear description of the problem.
> it sounds like your client is retransmitting NFS
> write operations, but there isn't enough in your
> description to determine why that might occur.

yes, you are right, and I apologize for the
vague characterization. the issue is actually this:

I have an application that performs an "NFS copy"
on a Linux NFS client. The client mounts a
*source* volume on a NetApp filer and copies
files to an NFS-mounted volume on a *destination*
host (Sun Solaris 8, VERITAS Foundation Suite).
Both have GigE network adapters.
The application's data movement is NFS over UDP.

The symptom I saw was that, at about 30-50%
completion of the operation, the client starts
retransmitting NFS write operations. Running
tcpdump on the Linux client and snoop on the
Solaris host, I saw the following sequence:

- first NFS write attempt from the client
- it is received at the destination
- second NFS write attempt from the client
- it is received at the destination
- the destination sends back a reply
- the NFS transaction completes

note: judging by the timestamps, the host is
replying to the FIRST NFS write attempt, not
the SECOND.
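
One way to make a sequence like this easier to reason about is to group
the captured packets by RPC transaction ID (XID), since a retransmitted
call reuses the XID of the original. The Python sketch below assumes the
capture has already been reduced to (timestamp, kind, xid) tuples; that
format and the name `summarize_rpc_trace` are hypothetical, and turning
real tcpdump or snoop output into such tuples is left out.

```python
from collections import defaultdict


def summarize_rpc_trace(events):
    """Count retransmissions and reply latency per RPC transaction.

    events: iterable of (timestamp_seconds, kind, xid), where kind is
    "call" or "reply". Only the first reply per XID is considered.
    """
    calls = defaultdict(list)
    replies = {}
    for timestamp, kind, xid in events:
        if kind == "call":
            calls[xid].append(timestamp)
        elif kind == "reply":
            replies.setdefault(xid, timestamp)

    summary = {}
    for xid, times in calls.items():
        times.sort()
        reply = replies.get(xid)
        summary[xid] = {
            # every call after the first one is a retransmission
            "retransmits": len(times) - 1,
            # time from the first call to the reply, if a reply was seen
            "first_call_to_reply": None if reply is None else reply - times[0],
        }
    return summary


# Made-up timestamps shaped like the sequence described above:
trace = [
    (0.0, "call", 7),   # first write attempt
    (0.7, "call", 7),   # second attempt, same XID
    (0.9, "reply", 7),  # single reply from the server
    (1.0, "call", 8),   # an ordinary write for comparison
    (1.1, "reply", 8),
]
for xid, info in sorted(summarize_rpc_trace(trace).items()):
    print(xid, info)
```

A long first-call-to-reply time next to a non-zero retransmit count would
suggest the server is answering slowly rather than losing requests, but
that reading is only a hint and needs checking against the real capture.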

any thoughts or comments will be most welcome.

thank you again,

- s o f i a
