2001-10-03 12:33:33

by James D Strandboge

[permalink] [raw]
Subject: Re: status of nfs and tcp with 2.4

Forgive me, Trond, for sending a second reply to this, I am trying to
dig in and get my hands dirty, but want to make sure I understand the
problem.

> The biggest problem is to prevent the TCP server hogging all the
> threads when a client gets congested.
>
> With the UDP code, we use non-blocking I/O and simply drop all replies
> that don't get through. For TCP dropping replies is not acceptable as
> the client will only resend requests every ~60seconds. Currently, the
> code therefore uses blocking I/O something which means that if the
> socket blocks, you run out of free nfsd threads...

By 'when a client gets congested' my understanding is you mean 'when
a client is sending a lot to the server, and the server can't respond
quickly enough.' Therefore, dropping udp replies is ok, since the
client will just send it again, however, with tcp, the client will only
resend every 60 seconds and that is too slow, and it blocks the socket
in the meantime. Is my understanding correct?

> There are 2 possible strategies:
>
> 1 Allocate 1 thread per TCP connection

This seems to be the easier of the two to implement, however you opted
against this because we are putting an eventual limit on the number of
clients we can serve based on NFSD_MAXSERVS. Is this correct?

> 2 Use non-blocking I/O, but allow TCP connections to defer sending
> the reply until the socket is available (and allow the thread to
> service other requests while the socket is busy).
>
> I started work on (2) last autumn, <snip>

Are there patches for this that I could look at?

Jamie Strandboge

--
GPG/PGP Info
Email: [email protected]
ID: 26384A3A
Fingerprint: D9FF DF4A 2D46 A353 A289 E8F5 AA75 DCBE 2638 4A3A
--


2001-10-03 14:31:50

by Trond Myklebust

[permalink] [raw]
Subject: Re: status of nfs and tcp with 2.4

>>>>> " " == James D Strandboge <[email protected]> writes:

> By 'when a client gets congested' my understanding is you mean
> 'when a client is sending a lot to the server, and the server
> can't respond quickly enough.' Therefore, dropping udp replies

There are several scenarios. The one that worries me most on TCP
connections is when the TCP socket on the server gets swamped for some
reason, and the call to sendmsg() sleeps. This means that if one
client has fired off a load of requests, and then doesn't listen for
the reply, we can end up sleeping for a long time (and being
unavailable to other clients).

OTOH under UDP, we use nonblocking I/O, so the sendmsg() returns, and
the server can just drop the request (as the UDP allows for quick
resends). The thread in this scenario therefore never sleeps if a
client is unavailable. It can only sleep on (relatively fast) disk
I/O.


> is ok, since the client will just send it again, however, with
> tcp, the client will only resend every 60 seconds and that is
> too slow, and it blocks the socket in the meantime. Is my
> understanding correct?

That is correct. TCP is designed to be a reliable protocol, so clients
are allowed to assume that the server will reply to a request once it
has been sent.

>> There are 2 possible strategies:
>>
>> 1 Allocate 1 thread per TCP connection

> This seems to be the easier of the two to implement, however
> you opted against this because we are putting an eventual limit
> on the number of clients we can serve based on NFSD_MAXSERVS.
> Is this correct?

Well... Thread limits can be changed. My main objection is that it is
ugly. Why allocate a thread when what you want to do is to be able to
cope with sleeping? We have non-blocking I/O, and the tcp
'write_space()' socket routine (see the client use in
net/sunrpc/xprt.c) that was designed to enable a thread to get called
back once a socket is free.

>> 2 Use non-blocking I/O, but allow TCP connections to defer
>> sending the reply until the socket is available (and allow the
>> thread to service other requests while the socket is busy).
>>
>> I started work on (2) last autumn, <snip>

> Are there patches for this that I could look at?

http://www.fys.uio.no/~trondmy/src/pre_alpha/linux-2.4.0-test6-rpctcp.dif

It's a patch against linux-2.4.0-test6 and is basically at the 'toy'
stage. Definitely nowhere near ready for release. IIRC though it did
actually run fairly reliably.

Cheers,
Trond