2009-04-22 22:22:15

by Petr Vandrovec

Subject: Re: [Bug 13034] printk in xs_tcp_setup_socket needs rate limit ... and delay

Trond Myklebust wrote:
> (Switching to bugzilla email interface, and ccing linux-nfs)
>
> On Wed, 2009-04-22 at 20:49 +0000, bugzilla-daemon-590EEB7GvNiWaY/[email protected]
> wrote:
>> http://bugzilla.kernel.org/show_bug.cgi?id=13034
>>
>> --- Comment #12 from Petr Vandrovec <[email protected]> 2009-04-22 20:49:58 ---
>> Unfortunately I have no access to the server (they are some NetApp and EMC
>> storage devices maintained by company IT).
>>
>> It seems that they are all configured the same way, so after mount they start
>> timing out connections all at the same moment (about 10 minutes after mount, or
>> something like that), and the netstat output above was captured when I ran 'df'
>> after the connections moved from ESTABLISHED to TIME_WAIT on the client side.
>> Those TIME_WAIT entries disappear after 60 seconds, as expected.
>
> So these connections are basically timing out because the systems are
> idle? (FYI: the NFS convention is that clients are supposed to close the
> TCP connection if it has been idle for 5 minutes, whereas the servers
> usually close it if the client has been idle for 6 minutes)...

Yes, they are timing out because the system is idle. It seems to behave
the way it should: it is the client that closes them, 5 minutes after I
run 'df':

$ savedq=-1; df > /dev/null; while true; do
    q="`netstat -atn | grep 2049 | grep TIME_WAIT | wc -l`"
    if [ "$q" != "$savedq" ]; then echo -n "$q "; date; savedq="$q"; fi
    sleep 1
  done
0 Wed Apr 22 14:59:10 PDT 2009
28 Wed Apr 22 15:04:10 PDT 2009
0 Wed Apr 22 15:05:08 PDT 2009

So as far as I can tell, with the patch everything works as expected.
Petr
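
As a quick check of the timestamps above, here is a minimal sketch that computes the intervals between the three samples. The helper names to_seconds and elapsed are made up for illustration, and the arithmetic assumes all timestamps fall on the same day. Since the loop polls once per second, the results are only accurate to about a second.

```shell
# Hypothetical helper: convert HH:MM:SS to seconds since midnight.
to_seconds() {
    # Split the argument on ':' using a here-document (POSIX sh compatible).
    IFS=: read -r h m s <<EOF
$1
EOF
    # Strip one leading zero so values like "08" are not parsed as octal.
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Hypothetical helper: seconds between two same-day timestamps.
elapsed() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

elapsed 14:59:10 15:04:10   # 'df' run -> sockets show up in TIME_WAIT: 300
elapsed 15:04:10 15:05:08   # TIME_WAIT entries appear -> disappear: 58
```

The first interval is 300 seconds, which matches the 5-minute client idle timeout. The second is 58 seconds, which fits the expected 60-second TIME_WAIT period given the one-second polling.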


2009-04-23 00:53:39

by Trond Myklebust

Subject: Re: [Bug 13034] printk in xs_tcp_setup_socket needs rate limit ... and delay

On Wed, 2009-04-22 at 15:22 -0700, Petr Vandrovec wrote:
> Trond Myklebust wrote:
> > So these connections are basically timing out because the systems are
> > idle? (FYI: the NFS convention is that clients are supposed to close the
> > TCP connection if it has been idle for 5 minutes, whereas the servers
> > usually close it if the client has been idle for 6 minutes)...
>
> Yes, they are timing out because the system is idle. It seems to behave
> the way it should: it is the client that closes them, 5 minutes after I
> run 'df':

Good! The 5-minute idle timeout is the one case where we don't care about
preserving the port number (because there are no outstanding NFS
requests to replay to the server).

Jean, are you seeing the same behaviour (i.e. errors only on idle
timeout), and is the fix working for you?