2010-03-04 10:11:42

by Mi Jinlong

[permalink] [raw]
Subject: [RFC] SUNRPC connect timeout case network request delay

Hi,

When I use the nfs at RHEL with kernel-2.6.31 and nfs-utils-1.2.0,
I got a problem, but i don't sure.

The process:

Step1: [22:40:9] Open file /mnt/nfs/file(OK)
Step2: [22:40:9] Lock file
[22:40:9] Lock file success
Step3: [22:40:9] Reboot the nfs server
If reboot success, put <enter> to continue
Step4: [22:42:16] Write data to file
[22:42:16] Write data success
Step5: [22:42:16] Unlock file
[22:46:30] Unlock file success.
Step6: [22:46:30] Close file /mnt/nfs/file
[22:46:30] Close fiel /mnt/nfs/file success

The problem is at step5, unlock file takes 4 min, it's a long time than expected.
When traceing the kernel, I find SUNRPC call call_connect timeout many times,
one timeout is 1min.

I think it's a problem of kernel, but i don't know why, can someone help me ?

thanks,
Mi Jinlong



2010-03-04 16:59:24

by Chuck Lever III

[permalink] [raw]
Subject: Re: [RFC] SUNRPC connect timeout case network request delay

On 03/04/2010 05:12 AM, Mi Jinlong wrote:
> Hi,
>
> When I use the nfs at RHEL with kernel-2.6.31 and nfs-utils-1.2.0,
> I got a problem, but i don't sure.
>
> The process:
>
> Step1: [22:40:9] Open file /mnt/nfs/file(OK)
> Step2: [22:40:9] Lock file
> [22:40:9] Lock file success
> Step3: [22:40:9] Reboot the nfs server
> If reboot success, put<enter> to continue
> Step4: [22:42:16] Write data to file
> [22:42:16] Write data success
> Step5: [22:42:16] Unlock file
> [22:46:30] Unlock file success.
> Step6: [22:46:30] Close file /mnt/nfs/file
> [22:46:30] Close fiel /mnt/nfs/file success
>
> The problem is at step5, unlock file takes 4 min, it's a long time than expected.
> When traceing the kernel, I find SUNRPC call call_connect timeout many times,
> one timeout is 1min.

The kernel's TCP reconnect logic will retry until it succeeds, without
letting the upper level make progress. For some reason, it is having
difficulty reconnecting with your server.

> I think it's a problem of kernel, but i don't know why, can someone help me ?

# sudo rpcdebug -m rpc -s xprt trans

run your test, and then look in your kernel log. It might tell you what
error is occurring. Another good thing to do is capture a network trace
with

# sudo tcpdump -s 1540 -w /tmp/raw

while you run your test.

This is similar to the symptoms of other connect problems I've
encountered recently.

--
chuck[dot]lever[at]oracle[dot]com

2010-03-12 13:35:23

by Trond Myklebust

[permalink] [raw]
Subject: Re: [RFC] SUNRPC connect timeout case network request delay

On Fri, 2010-03-12 at 17:55 +0800, Mi Jinlong wrote:
> Hi,
>
> Thanks for your reply.
>
> Chuck Lever 写道:
> > On 03/08/2010 04:59 AM, Mi Jinlong wrote:
> >> Hi chuck,
> >>
> >> Thanks for your reply.
> >>
> >> Chuck Lever 写道:
> >>> On 03/04/2010 05:12 AM, Mi Jinlong wrote:
> >>>> Hi,
> >>>> Step4: [22:42:16] Write data to file
> >>>> [22:42:16] Write data success
> >>>> Step5: [22:42:16] Unlock file
> >>>> [22:46:30] Unlock file success.
> >>>> Step6: [22:46:30] Close file /mnt/nfs/file
> >>>> [22:46:30] Close fiel /mnt/nfs/file success
> >>>>
> >>>> The problem is at step5, unlock file takes 4 min, it's a long time
> >>>> than expected.
> >>>> When traceing the kernel, I find SUNRPC call call_connect timeout many
> >>>> times,
> >>>> one timeout is 1min.
> >>>
> >>> The kernel's TCP reconnect logic will retry until it succeeds, without
> >>> letting the upper level make progress. For some reason, it is having
> >>> difficulty reconnecting with your server.
> >>>
> >>>> I think it's a problem of kernel, but i don't know why, can someone
> >>>> help me ?
> >>>
> >>> # sudo rpcdebug -m rpc -s xprt trans
> >>
> >> After running this command, I got some important messages that I think.
> >>
> >> RPC: xs_connect delayed xprt for 3 seconds
> >> ...
> >> RPC: xs_connect delayed xprt for 6 seconds
> >> ...
> >> RPC: xs_connect delayed xprt for 12 seconds
> >> ...
> >> RPC: xs_connect delayed xprt for 24 seconds
> >> ...
> >> ...
> >> RPC: xs_connect delayed xprt for 300 seconds
> >>
> >> This message is printed at xs_connect, and the delay time is double
> >> there.
> >> IMO, when some data translate over through a socket, the socket should
> >> be released.
> >> But, it seems the socket isn't released through those messages above.
> >> Is it wrong, or there are some other reasons ?
> >
> > The code is trying to connect, but the ->connect call isn't working
> > somehow. The code backs off by doubling the timeout each time, so that
> > the connect attempts don't overload the server.
> >
> > This tells us that the code is attempting to connect, but not why the
> > connect attempt is failing.
>
> When reading the kernel codes, I find a problem at function xs_tcp_close.
> ....
> 772 static void xs_tcp_close(struct rpc_xprt *xprt)
> 773 {
> 774 if (test_and_clear_bit(XPRT_CONNECTION_CLOSE, &xprt->state))
> 775 xs_close(xprt);
> 776 else
> 777 xs_tcp_shutdown(xprt);
> 778 }
> ...
> If a task call xs_tcp_close to close the xprt's sock, many times it only call
> xs_tcp_shutdown to using the next layer's close function to close the socket
> connection.
> But after close the socket connection, the socket also exist, so the socket may
> be reused. Is it a problem ? I think after xs_tcp_shutdown, the socket should
> be released.

No it shouldn't. The whole point of the current code is to allow the RPC
client to _reuse_ the same port without having to wait for a TIME_WAIT.

The reason why we want to do that is because a lot of servers key their
duplicate reply caches on the port number. See
http://www.connectathon.org/talks96/werme1.html

Trond


2010-03-08 15:41:37

by Chuck Lever III

[permalink] [raw]
Subject: Re: [RFC] SUNRPC connect timeout case network request delay

On 03/08/2010 04:59 AM, Mi Jinlong wrote:
> Hi chuck,
>
> Thanks for your reply.
>
> Chuck Lever =E5=86=99=E9=81=93:
>> On 03/04/2010 05:12 AM, Mi Jinlong wrote:
>>> Hi,
>>> Step4: [22:42:16] Write data to file
>>> [22:42:16] Write data success
>>> Step5: [22:42:16] Unlock file
>>> [22:46:30] Unlock file success.
>>> Step6: [22:46:30] Close file /mnt/nfs/file
>>> [22:46:30] Close fiel /mnt/nfs/file success
>>>
>>> The problem is at step5, unlock file takes 4 min, it's a long time
>>> than expected.
>>> When traceing the kernel, I find SUNRPC call call_connect timeout m=
any
>>> times,
>>> one timeout is 1min.
>>
>> The kernel's TCP reconnect logic will retry until it succeeds, witho=
ut
>> letting the upper level make progress. For some reason, it is havin=
g
>> difficulty reconnecting with your server.
>>
>>> I think it's a problem of kernel, but i don't know why, can someone
>>> help me ?
>>
>> # sudo rpcdebug -m rpc -s xprt trans
>
> After running this command, I got some important messages that I th=
ink.
>
> RPC: xs_connect delayed xprt for 3 seconds
> ...
> RPC: xs_connect delayed xprt for 6 seconds
> ...
> RPC: xs_connect delayed xprt for 12 seconds
> ...
> RPC: xs_connect delayed xprt for 24 seconds
> ...
> ...
> RPC: xs_connect delayed xprt for 300 seconds
>
> This message is printed at xs_connect, and the delay time is double t=
here.
> IMO, when some data translate over through a socket, the socket shoul=
d be released.
> But, it seems the socket isn't released through those messages above.
> Is it wrong, or there are some other reasons ?

The code is trying to connect, but the ->connect call isn't working=20
somehow. The code backs off by doubling the timeout each time, so that=
=20
the connect attempts don't overload the server.

This tells us that the code is attempting to connect, but not why the=20
connect attempt is failing.

> At the latest kernel, this bug was fix by patch
> "NFS/RPC: fix problems with reestablish_timeout and related code."
> But I don't sure about this.

--=20
chuck[dot]lever[at]oracle[dot]com

2010-03-12 09:54:07

by Mi Jinlong

[permalink] [raw]
Subject: Re: [RFC] SUNRPC connect timeout case network request delay

Hi,

Thanks for your reply.

Chuck Lever =E5=86=99=E9=81=93:
> On 03/08/2010 04:59 AM, Mi Jinlong wrote:
>> Hi chuck,
>>
>> Thanks for your reply.
>>
>> Chuck Lever =E5=86=99=E9=81=93:
>>> On 03/04/2010 05:12 AM, Mi Jinlong wrote:
>>>> Hi,
>>>> Step4: [22:42:16] Write data to file
>>>> [22:42:16] Write data success
>>>> Step5: [22:42:16] Unlock file
>>>> [22:46:30] Unlock file success.
>>>> Step6: [22:46:30] Close file /mnt/nfs/file
>>>> [22:46:30] Close fiel /mnt/nfs/file success
>>>>
>>>> The problem is at step5, unlock file takes 4 min, it's a long time
>>>> than expected.
>>>> When traceing the kernel, I find SUNRPC call call_connect timeout =
many
>>>> times,
>>>> one timeout is 1min.
>>>
>>> The kernel's TCP reconnect logic will retry until it succeeds, with=
out
>>> letting the upper level make progress. For some reason, it is havi=
ng
>>> difficulty reconnecting with your server.
>>>
>>>> I think it's a problem of kernel, but i don't know why, can someon=
e
>>>> help me ?
>>>
>>> # sudo rpcdebug -m rpc -s xprt trans
>>
>> After running this command, I got some important messages that I t=
hink.
>>
>> RPC: xs_connect delayed xprt for 3 seconds
>> ...
>> RPC: xs_connect delayed xprt for 6 seconds
>> ...
>> RPC: xs_connect delayed xprt for 12 seconds
>> ...
>> RPC: xs_connect delayed xprt for 24 seconds
>> ...
>> ...
>> RPC: xs_connect delayed xprt for 300 seconds
>>
>> This message is printed at xs_connect, and the delay time is double
>> there.
>> IMO, when some data translate over through a socket, the socket shou=
ld
>> be released.
>> But, it seems the socket isn't released through those messages above=
=2E
>> Is it wrong, or there are some other reasons ?
>=20
> The code is trying to connect, but the ->connect call isn't working
> somehow. The code backs off by doubling the timeout each time, so th=
at
> the connect attempts don't overload the server.
>=20
> This tells us that the code is attempting to connect, but not why the
> connect attempt is failing.

When reading the kernel codes, I find a problem at function xs_tcp_cl=
ose.
....
772 static void xs_tcp_close(struct rpc_xprt *xprt)
773 {
774 if (test_and_clear_bit(XPRT_CONNECTION_CLOSE, &xprt->st=
ate))
775 xs_close(xprt);
776 else
777 xs_tcp_shutdown(xprt);
778 }
...
If a task call xs_tcp_close to close the xprt's sock, many times it o=
nly call=20
xs_tcp_shutdown to using the next layer's close function to close the=
socket
connection.
But after close the socket connection, the socket also exist, so the =
socket may
be reused. Is it a problem ? I think after xs_tcp_shutdown, the socke=
t should
be released.
=20
thanks,
Mi Jinlong


2010-03-08 09:58:03

by Mi Jinlong

[permalink] [raw]
Subject: Re: [RFC] SUNRPC connect timeout case network request delay

Hi chuck,

Thanks for your reply.

Chuck Lever =E5=86=99=E9=81=93:
> On 03/04/2010 05:12 AM, Mi Jinlong wrote:
>> Hi,
>> Step4: [22:42:16] Write data to file
>> [22:42:16] Write data success
>> Step5: [22:42:16] Unlock file
>> [22:46:30] Unlock file success.
>> Step6: [22:46:30] Close file /mnt/nfs/file
>> [22:46:30] Close fiel /mnt/nfs/file success
>>
>> The problem is at step5, unlock file takes 4 min, it's a long time
>> than expected.
>> When traceing the kernel, I find SUNRPC call call_connect timeout ma=
ny
>> times,
>> one timeout is 1min.
>=20
> The kernel's TCP reconnect logic will retry until it succeeds, withou=
t
> letting the upper level make progress. For some reason, it is having
> difficulty reconnecting with your server.
>=20
>> I think it's a problem of kernel, but i don't know why, can someone
>> help me ?
>=20
> # sudo rpcdebug -m rpc -s xprt trans

After running this command, I got some important messages that I think=
=2E

RPC: xs_connect delayed xprt for 3 seconds
...
RPC: xs_connect delayed xprt for 6 seconds
...
RPC: xs_connect delayed xprt for 12 seconds
...
RPC: xs_connect delayed xprt for 24 seconds
...
...
RPC: xs_connect delayed xprt for 300 seconds
=20
This message is printed at xs_connect, and the delay time is double the=
re.
IMO, when some data translate over through a socket, the socket should =
be released.
But, it seems the socket isn't released through those messages above.=20
Is it wrong, or there are some other reasons ?

At the latest kernel, this bug was fix by patch
"NFS/RPC: fix problems with reestablish_timeout and related code."
But I don't sure about this.

thanks,
Mi Jinlong