2009-11-26 15:20:59

by Trond Myklebust

Subject: Re: Fw: Deadlock regression in v2.6.31.6

On Thu, 2009-11-26 at 16:07 +0100, Stephen R. van den Berg wrote:
> On Thu, Nov 26, 2009 at 16:01, Trond Myklebust
> <[email protected]> wrote:
> > On Thu, 2009-11-26 at 00:11 +0100, Stephen R. van den Berg wrote:
> >> 1.2.3.167 is the Linux client kernel which locks up, 1.2.3.151 is the
> >> unfs server.
> >> It looks like the client terminates the TCP connection. The server
> >> confirms it, the client then sends a final acknowledge. At that point
> >> the client kernel locks up in the infinite loop.
>
> > OK. Have you tried running with RPCDBG_TRANS debugging enabled? I
> > suspect you might see a flood of 'sendmsg returned unrecognized error'
> > or 'connect returned unhandled error' messages if you do.
>
> The pcap trace is not conclusive (enough)?
> I can run with RPCDBG_TRANS enabled, if it is needed to further
> pinpoint the problem.

The pcap trace shows what is happening: the socket is getting closed
correctly, and so the RPC client needs to initiate a reconnection before
it can transmit again. What I don't understand is why it is failing to
do so...

Trond



2009-11-27 00:07:41

by Stephen R. van den Berg

Subject: Re: Fw: Deadlock regression in v2.6.31.6

On Thu, Nov 26, 2009 at 16:20, Trond Myklebust
<[email protected]> wrote:
>> I can run with RPCDBG_TRANS enabled, if it is needed to further
>> pinpoint the problem.

I'm not sure exactly how to enable RPCDBG_TRANS.
Instead, I activated the dprintks in xprtsock.c, which produce the
following (endless) output:

RPC: cfa94400 connect status 99 connected 0 sock state 7
RPC: xs_tcp_send_request(32896) = -32
RPC: xs_tcp_state_change client cfa94400...
RPC: state 7 conn 0 dead 0 zapped 1
RPC: xs_connect delayed xprt cfa94400 for 0 seconds
RPC: worker connecting xprt cfa94400 to address: addr=1.2.3.151 port=2049 proto=tcp
RPC: cfa94400 connect status 99 connected 0 sock state 7
RPC: xs_tcp_send_request(32896) = -32
RPC: xs_tcp_state_change client cfa94400...
RPC: state 7 conn 0 dead 0 zapped 1
RPC: xs_connect delayed xprt cfa94400 for 0 seconds
RPC: worker connecting xprt cfa94400 to address: addr=1.2.3.151 port=2049 proto=tcp
RPC: cfa94400 connect status 99 connected 0 sock state 7
RPC: xs_tcp_send_request(32896) = -32
RPC: xs_tcp_state_change client cfa94400...
RPC: state 7 conn 0 dead 0 zapped 1
RPC: xs_connect delayed xprt cfa94400 for 0 seconds
RPC: worker connecting xprt cfa94400 to address: addr=1.2.3.151 port=2049 proto=tcp

Does this tell you more?
--
Sincerely,
Stephen R. van den Berg.
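
Note on reading the log above: the numbers in the dprintk output are raw
errno values and TCP socket states. Assuming the client uses the generic
Linux errno numbering and the kernel's TCP state enumeration, "connect
status 99" would be EADDRNOTAVAIL, the "-32" from xs_tcp_send_request
would be EPIPE, and "sock state 7" would be TCP_CLOSE. The sketch below
is only a reading aid under those assumptions; the names LINUX_ERRNO,
TCP_STATES, errno_name and annotate are hypothetical helpers, not part
of the kernel or of any tool mentioned in this thread.

```python
import re

# Assumed values: generic Linux errno numbers (only a few relevant ones).
LINUX_ERRNO = {
    32: "EPIPE",
    99: "EADDRNOTAVAIL",
    104: "ECONNRESET",
    107: "ENOTCONN",
    110: "ETIMEDOUT",
    111: "ECONNREFUSED",
}

# Assumed values: the Linux kernel's TCP socket state enumeration.
TCP_STATES = {
    1: "TCP_ESTABLISHED",
    2: "TCP_SYN_SENT",
    3: "TCP_SYN_RECV",
    4: "TCP_FIN_WAIT1",
    5: "TCP_FIN_WAIT2",
    6: "TCP_TIME_WAIT",
    7: "TCP_CLOSE",
    8: "TCP_CLOSE_WAIT",
    9: "TCP_LAST_ACK",
    10: "TCP_LISTEN",
    11: "TCP_CLOSING",
}


def errno_name(value):
    """Map a positive or negative errno value to its symbolic name."""
    return LINUX_ERRNO.get(abs(value), "unknown")


def annotate(line):
    """Return symbolic notes for the numeric fields found in one log line."""
    notes = []
    match = re.search(r"connect status (\d+)", line)
    if match:
        notes.append("connect status " + errno_name(int(match.group(1))))
    match = re.search(r"sock state (\d+)", line)
    if match:
        state = TCP_STATES.get(int(match.group(1)), "unknown")
        notes.append("sock state " + state)
    # Matches a negative return value such as "xs_tcp_send_request(...) = -32".
    match = re.search(r"\)\s*=\s*(-\d+)", line)
    if match:
        notes.append("returned " + errno_name(int(match.group(1))))
    return notes


if __name__ == "__main__":
    for entry in (
        "RPC: cfa94400 connect status 99 connected 0 sock state 7",
        "RPC: xs_tcp_send_request(32896) = -32",
    ):
        print(entry, "->", annotate(entry))
```

Read this way, each iteration of the loop is a send failing on a closed
socket followed by a reconnect attempt that fails immediately and is
rescheduled with a 0-second delay. Why the connect fails is not
established by the decoding itself.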