2008-08-24 22:15:45

by Trond Myklebust

[permalink] [raw]
Subject: Re: NFS regression? Odd delays and lockups accessing an NFS export.

On Sun, 2008-08-24 at 23:09 +0100, Ian Campbell wrote:
> (added some quoting from previous mail to save replying twice)
>
> On Sun, 2008-08-24 at 15:19 -0400, Trond Myklebust wrote:
> > On Sun, 2008-08-24 at 15:17 -0400, Trond Myklebust wrote:
> > > >From the tcpdump, it looks as if the NFS server is failing to close the
> > > socket, when the client closes its side. You therefore end up getting
> > > stuck in the FIN_WAIT2 state (as netstat clearly shows above).
> > >
> > > Is the server keeping the client in this state for a very long
> > > period?
>
> Well, it had been around an hour and a half on this occasion. Next time
> it happens I can wait longer but I'm pretty sure I've come back from
> time away and it's been wedged for at least a day. How long would you
> expect it to remain in this state for?

The server should ideally start to close the socket as soon as it
receives the FIN from the client. I'll have a look at the code.

Trond



2008-08-27 14:43:03

by Tom Tucker

[permalink] [raw]
Subject: Re: NFS regression? Odd delays and lockups accessing an NFS export.

J. Bruce Fields wrote:
> On Mon, Aug 25, 2008 at 11:04:08AM -0500, Tom Tucker wrote:
>> Trond Myklebust wrote:
>>> On Sun, 2008-08-24 at 23:09 +0100, Ian Campbell wrote:
>>>> (added some quoting from previous mail to save replying twice)
>>>>
>>>> On Sun, 2008-08-24 at 15:19 -0400, Trond Myklebust wrote:
>>>>> On Sun, 2008-08-24 at 15:17 -0400, Trond Myklebust wrote:
>>>>>> >From the tcpdump, it looks as if the NFS server is failing to close the
>>>>>> socket, when the client closes its side. You therefore end up getting
>>>>>> stuck in the FIN_WAIT2 state (as netstat clearly shows above).
>>>>>>
>>>>>> Is the server keeping the client in this state for a very long
>>>>>> period?
>>>> Well, it had been around an hour and a half on this occasion. Next time
>>>> it happens I can wait longer but I'm pretty sure I've come back from
>>>> time away and it's been wedged for at least a day. How long would you
>>>> expect it to remain in this state for?
>>> The server should ideally start to close the socket as soon as it
>>> receives the FIN from the client. I'll have a look at the code.
>>>
>> I don't think it should matter how long the connection stays in FIN WAIT,
>> the client should reconnect anyway.
>>
>> Since the client seems to be the variable, I would think it might be an
>> issue with the client reconnect logic?
>>
>> That said, 2.6.25 is when the server side transport switch logic went in.
>
> Any chance you could help Trond figure out why the server might be doing
> this?
>
> If not, I'll get to it, but not as soon as I should.
>

Sure. I've actually tried to reproduce it here unsuccessfully.

As a starter, I would suggest turning on transport debugging:

# echo 256 > /proc/sys/sunrpc/rpc_debug

Here are my thoughts on how it is supposed to work. Trond if this doesn't match
your understanding, please let me know.

For the case where the client closes the connection first
because it's timeout is shorter (5min vs. 6), the sequence of events should be:


client server

close
sends FIN goes to FIN-WAIT-1
receives FIN replies with ACK
and goes to CLOSE-WAIT
receives ACK goes to FIN-WAIT-2
calls tcp_state_change callback on socket
svc_tcp_state_change sets XPT_CLOSE on
the transport and enqueues transport for
servicing by svc thread.

thread gets dequeued, calls svc_recv
svc_recv sees XPT_CLOSE and calls
svc_xprt_destroy that closes the socket

TCP sends FIN/ACK after close
receives FIN/ACK goes to TIME-WAIT

since state is TIME-WAIT and
reuse port is set, we can re-connect

There's a couple places we could be getting stuck:

- The state-change callback never gets called.
- The state-change callback gets called but svsk is null and it ignores the event
- svc_tcp_state_change enqueues the transport, but due to a race bug, the transport
doesn't actually get queued and since there is no activity it never closes
- something else

The message log with RPCDBG_SVCXPRT (256) enabled as shown above would help figure
out where it's happening.

If Ian is willing to create the log (or already has one), I'm certainly willing to
look at it.

Tom


> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


2008-08-30 15:47:51

by Ian Campbell

[permalink] [raw]
Subject: Re: NFS regression? Odd delays and lockups accessing an NFS export.

On Wed, 2008-08-27 at 09:43 -0500, Tom Tucker wrote:
> Sure. I've actually tried to reproduce it here unsuccessfully.
>
> As a starter, I would suggest turning on transport debugging:
>
> # echo 256 > /proc/sys/sunrpc/rpc_debug
> [...]
> If Ian is willing to create the log (or already has one), I'm
> certainly willing to look at it.

It produced only the following (is that what was expected?):

[146866.448112] -pid- proc flgs status -client- -prog- --rqstp- -timeout -rpcwait -action- ---ops--
[146866.448112] 30576 0001 00a0 0 f77a1600 100003 f7903340 15000 xprt_pending fa0ba88e fa0c9df4
[146866.448112] 30577 0004 0080 -11 f77a1600 100003 f7903000 0 xprt_sending fa0ba88e fa0c9df4

Ian.
--
Ian Campbell

Life is a concentration camp. You're stuck here and there's no way
out and you can only rage impotently against your persecutors.
-- Woody Allen


Attachments:
signature.asc (197.00 B)
This is a digitally signed message part

2008-08-25 10:00:37

by Ian Campbell

[permalink] [raw]
Subject: Re: NFS regression? Odd delays and lockups accessing an NFS export.

On Sun, 2008-08-24 at 18:15 -0400, Trond Myklebust wrote:
> The server should ideally start to close the socket as soon as it
> receives the FIN from the client. I'll have a look at the code.

The server is a 2.6.25 kernel from Debian backports.org. I was planning
to downgrade to the 2.6.24 "Etch and a half" kernel but was holding off
while this thread was in progress. If you think it would be a useful
data point I can do it sooner rather than later.

It's changing the client kernel which seems to make a difference since
running 2.6.24 against this server has given me no trouble. I guess the
newer client could expose a bug in the server though.

Ian.
--
Ian Campbell

I can't understand why a person will take a year or two to write a
novel when he can easily buy one for a few dollars.
-- Fred Allen


Attachments:
signature.asc (197.00 B)
This is a digitally signed message part

2008-08-25 16:04:08

by Tom Tucker

[permalink] [raw]
Subject: Re: NFS regression? Odd delays and lockups accessing an NFS export.

Trond Myklebust wrote:
> On Sun, 2008-08-24 at 23:09 +0100, Ian Campbell wrote:
>> (added some quoting from previous mail to save replying twice)
>>
>> On Sun, 2008-08-24 at 15:19 -0400, Trond Myklebust wrote:
>>> On Sun, 2008-08-24 at 15:17 -0400, Trond Myklebust wrote:
>>>> >From the tcpdump, it looks as if the NFS server is failing to close the
>>>> socket, when the client closes its side. You therefore end up getting
>>>> stuck in the FIN_WAIT2 state (as netstat clearly shows above).
>>>>
>>>> Is the server keeping the client in this state for a very long
>>>> period?
>> Well, it had been around an hour and a half on this occasion. Next time
>> it happens I can wait longer but I'm pretty sure I've come back from
>> time away and it's been wedged for at least a day. How long would you
>> expect it to remain in this state for?
>
> The server should ideally start to close the socket as soon as it
> receives the FIN from the client. I'll have a look at the code.
>

I don't think it should matter how long the connection stays in FIN WAIT,
the client should reconnect anyway.

Since the client seems to be the variable, I would think it might be an
issue with the client reconnect logic?

That said, 2.6.25 is when the server side transport switch logic went in.

Tom

> Trond
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2008-08-25 16:54:13

by Trond Myklebust

[permalink] [raw]
Subject: Re: NFS regression? Odd delays and lockups accessing an NFS export.

On Mon, 2008-08-25 at 11:04 -0500, Tom Tucker wrote:
> I don't think it should matter how long the connection stays in FIN WAIT,
> the client should reconnect anyway.

Which would break the replay cache, since we can't reconnect using the
same port.

> Since the client seems to be the variable, I would think it might be an
> issue with the client reconnect logic?

No.

Trond

2008-08-25 20:15:03

by Tom Tucker

[permalink] [raw]
Subject: Re: NFS regression? Odd delays and lockups accessing an NFS export.

Trond Myklebust wrote:
> On Mon, 2008-08-25 at 11:04 -0500, Tom Tucker wrote:
>> I don't think it should matter how long the connection stays in FIN WAIT,
>> the client should reconnect anyway.
>
> Which would break the replay cache, since we can't reconnect using the
> same port.
>
>> Since the client seems to be the variable, I would think it might be an
>> issue with the client reconnect logic?
>
> No.
>

Yeah, you're right. I got confused about which host was stuck in FIN_WAIT2.
It's the client as you pointed out earlier.

Thanks,
Tom


> Trond
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


2008-08-26 19:27:11

by J. Bruce Fields

[permalink] [raw]
Subject: Re: NFS regression? Odd delays and lockups accessing an NFS export.

On Mon, Aug 25, 2008 at 11:04:08AM -0500, Tom Tucker wrote:
> Trond Myklebust wrote:
>> On Sun, 2008-08-24 at 23:09 +0100, Ian Campbell wrote:
>>> (added some quoting from previous mail to save replying twice)
>>>
>>> On Sun, 2008-08-24 at 15:19 -0400, Trond Myklebust wrote:
>>>> On Sun, 2008-08-24 at 15:17 -0400, Trond Myklebust wrote:
>>>>> >From the tcpdump, it looks as if the NFS server is failing to close the
>>>>> socket, when the client closes its side. You therefore end up getting
>>>>> stuck in the FIN_WAIT2 state (as netstat clearly shows above).
>>>>>
>>>>> Is the server keeping the client in this state for a very long
>>>>> period?
>>> Well, it had been around an hour and a half on this occasion. Next time
>>> it happens I can wait longer but I'm pretty sure I've come back from
>>> time away and it's been wedged for at least a day. How long would you
>>> expect it to remain in this state for?
>>
>> The server should ideally start to close the socket as soon as it
>> receives the FIN from the client. I'll have a look at the code.
>>
>
> I don't think it should matter how long the connection stays in FIN WAIT,
> the client should reconnect anyway.
>
> Since the client seems to be the variable, I would think it might be an
> issue with the client reconnect logic?
>
> That said, 2.6.25 is when the server side transport switch logic went in.

Any chance you could help Trond figure out why the server might be doing
this?

If not, I'll get to it, but not as soon as I should.

--b.

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/