2002-03-20 15:40:02

by I Lee Hetherington

Subject: Re: 2.4.18: NFS_ALL patch greatly hurting UDP speed

Did you ever get a chance to look at the tcpdumps I sent you?

Given that UDP is now *100* times slower with NFS_ALL vs. without,
something is fishy. We have always gotten good performance with UDP in
our network, even with the 1Gb/s to 100Mb/s switching. Maybe our
switches have big enough buffers to deal with the peaks.

--Lee Hetherington



_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2002-03-20 15:45:15

by Trond Myklebust

Subject: Re: 2.4.18: NFS_ALL patch greatly hurting UDP speed

>>>>> " " == I Lee Hetherington <I> writes:

> Did you ever get a chance to look at the tcpdumps I sent you?
> Given that UDP is now *100* times slower with NFS_ALL
> vs. without, something is fishy. We have always gotten good
> performance with UDP in our network, even with the 1Gb/s to
> 100Mb/s switching. Maybe our switches have big enough buffers
> to deal with the peaks.

> --Lee Hetherington

Yes. I already sent you a breakdown based on those tcpdumps. Here it is
again.

Cheers,
Trond

----

Interesting... On the 2.2.18 vanilla, the system is settling into a routine
where the client only sends 1 request at a time, and then waits for the
answer.

On 2.2.18 w/ NFS_ALL, the client attempts to send off 6 requests, but only
gets answered *very* slowly (one request at a time). Eventually (line number
113 in the tcpdump sequence) the replies start to miss fragments, and this is
where the problem starts...

My guess is that while you are able to send off 1 request at a time, the
GigE -> 100Mbit bridge is able to cache the replies. As the number of
'simultaneous' replies being sent off by the server increases, though, the
bridge is no longer able to cache enough, and so it starts to truncate the
messages (and the NFS read requests have to time out and retry).
Another sign of this is the fact that in the NFS_ALL tcpdump, there are
assorted 'Time-to-live exceeded' ICMP messages littered around the place. The
first one comes just after the loss of fragments, and is accompanied by a 2
second delay, during which all the reads that are sent time out without
receiving a single reply...
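
To put rough numbers on why losing fragments hurts so much: a UDP datagram is
only delivered if every one of its IP fragments arrives, so a single dropped
fragment costs the whole read reply and forces a timeout and retry. The sketch
below is illustrative only. The 8 KB read size, the 128 bytes of RPC/NFS reply
overhead, the 1500-byte MTU, and the independent per-fragment loss rate are all
assumptions, not values taken from the tcpdumps, and the function names are
hypothetical.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options
UDP_HEADER = 8   # bytes, carried only in the first fragment


def fragments_needed(udp_payload, mtu=1500):
    """Number of IP fragments for one UDP datagram with this payload size."""
    # Fragment data (except the last) must be a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((UDP_HEADER + udp_payload) / per_fragment)


def datagram_survival(fragment_loss, n_fragments):
    """Chance that all fragments arrive, assuming independent losses."""
    return (1.0 - fragment_loss) ** n_fragments


if __name__ == "__main__":
    # Assumed reply size: 8192 bytes of data plus ~128 bytes of RPC/NFS headers.
    n = fragments_needed(8192 + 128)
    for loss in (0.001, 0.01, 0.05, 0.2):
        ok = datagram_survival(loss, n)
        print(f"fragment loss {loss:.3f}: {n} fragments, "
              f"reply survives with probability {ok:.3f}")
```

Under these assumptions the reply loss rate is always worse than the
per-fragment loss rate, and with several reads in flight the bursts into the
100Mbit side are larger, which is when a small buffer would start dropping.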

Basically, as I said before, you are in a situation where TCP should always
be able to outperform UDP. I'm a bit surprised that this is not the case for
'vanilla' 2.2.18.

Cheers,
Trond



2002-03-20 15:50:13

by Trond Myklebust

Subject: Re: 2.4.18: NFS_ALL patch greatly hurting UDP speed

>>>>> " " == Trond Myklebust <[email protected]> writes:

> exceeded' ICMP messages littered around the place. The first
> one comes just after the loss of fragments, and is accompanied
> by a 2 second delay, during which all the reads that are sent
> time out without receiving a single reply...

Note: this 2 second period of silence appears to be what is really
causing the 100x slowdown. I've no idea what the switch is doing
during that time, but you might want to check whether the messages
sent during that period are in fact being received on the server.
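
As a back-of-the-envelope check on how a 2 second stall could account for a
slowdown of that size, here is a small sketch. It assumes a simple model in
which the client alternates between a short burst of useful transfer and a
stall during which nothing gets through. The burst lengths are made-up
illustrative values, not measurements from the tcpdumps, and the function name
is hypothetical.

```python
def slowdown(active_seconds, stall_seconds):
    """Factor by which throughput drops when each active burst of
    `active_seconds` is followed by a stall of `stall_seconds`."""
    if active_seconds <= 0:
        raise ValueError("active_seconds must be positive")
    return (active_seconds + stall_seconds) / active_seconds


if __name__ == "__main__":
    # Assumed burst lengths; only the 2 second stall comes from the thread.
    for active in (0.005, 0.02, 0.1, 1.0):
        factor = slowdown(active, 2.0)
        print(f"{active * 1000:7.1f} ms of transfer per 2 s stall: "
              f"about {factor:.0f}x slower")
```

In this model a slowdown of roughly 100x corresponds to only about 20 ms of
useful transfer between stalls, so the stalls would not need to be frequent in
absolute terms to dominate the total transfer time.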

Cheers,
Trond
