2005-07-14 21:25:36

by Haakon Riiser

[permalink] [raw]
Subject: "Server not responding" after periods of client inactivity

(Trying to post this for the fourth time -- nothing appeared on the
list the first three times.)

I have noticed that after periods of inactivity on the client
machine, the first NFS operation will always hang for around
15 seconds. My first guess was that the server had powered down
its disk drives, and that it was the spin-up time that caused the
delay, but when I had an ssh session open on the server at the
same time as the client started complaining about not getting a
reply, I saw that this was /not/ the case -- there is nothing
on the server that would explain why the client is stalling.
No load, and no delay when working directly on the server.

I did tcpdump (on the server side) while the client was hanging,
and this is what I found:

Source Time Packets
------ ---- -------
client 0.00 V3 ACCESS Call, FH:0x02120000
client 0.10 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 0.31 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 0.71 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 1.53 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 3.16 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 6.42 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 7.12 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 8.52 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 11.32 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
server 15.30 V3 ACCESS Reply

After this, there are no more delays for this shared file system;
any file operation on the same file system will succeed instantly.
However, the first operation on any /other/ NFS file system also
mounted on the client will still hang just like the above example.
I.e., the hang always happens exactly once for each mount point.

I have tried setting rsize=1024,wsize=1024, and I have tried both
tcp and udp, but nothing has helped so far. Any ideas? tcpdump
clearly shows that all the requests arrive at the server, so why
does the server wait 15 seconds before it replies?

NFS server:
Pentium III 650 MHz, 256 MB RAM
Fedora Core 3 (fully updated)
nfs-utils 1.0.6-52
kernel 2.6.11-1.35_FC3

NFS client:
Athlon XP2500+, 1 GB RAM
Slackware 10.1
nfs-utils 1.0.7
kernel 2.6.11.11

--
Haakon


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2005-07-30 13:11:15

by Haakon Riiser

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

(Replying to my own email, since I still haven't seen any comments.
The full original email is included below, since it's been a while
since I posted it.)

I have now tried this with another NFS client (Fedora Core 4)
connected to the NFS server simultaneously with client used
in the below problem report. The same problem happens on
the new client as well, even when the other client has just
experienced the hang.

That is, after the stalled NFS operation completes on client A,
it is still possible to observe the same problem on client B.
It looks to me like this is caused by the server disconnecting idle
clients, and when a new request suddenly occurs, it takes a while
before the server manages to resurrect the connection. I tried
adding a cron job that does 'ls -l /NFS-MOUNT-POINT >/dev/null'
every few minutes, but surprisingly, that didn't keep the the
connection alive. Is this because of client side caching? Is there
anything I can do? These hangs are getting very annoying; it
would be great if I could somehow change the idle timeout.

> I have noticed that after periods of inactivity on the client
> machine, the first NFS operation will always hang for around
> 15 seconds. My first guess was that the server had powered down
> its disk drives, and that it was the spin-up time that caused the
> delay, but when I had an ssh session open on the server at the
> same time as the client started complaining about not getting a
> reply, I saw that this was /not/ the case -- there is nothing
> on the server that would explain why the client is stalling.
> No load, and no delay when working directly on the server.
>
> I did tcpdump (on the server side) while the client was hanging,
> and this is what I found:
>
> Source Time Packets
> ------ ---- -------
> client 0.00 V3 ACCESS Call, FH:0x02120000
> client 0.10 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 0.31 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 0.71 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 1.53 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 3.16 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 6.42 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 7.12 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 8.52 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 11.32 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> server 15.30 V3 ACCESS Reply
>
> After this, there are no more delays for this shared file system;
> any file operation on the same file system will succeed instantly.
> However, the first operation on any /other/ NFS file system also
> mounted on the client will still hang just like the above example.
> I.e., the hang always happens exactly once for each mount point.
>
> I have tried setting rsize=1024,wsize=1024, and I have tried both
> tcp and udp, but nothing has helped so far. Any ideas? tcpdump
> clearly shows that all the requests arrive at the server, so why
> does the server wait 15 seconds before it replies?
>
> NFS server:
> Pentium III 650 MHz, 256 MB RAM
> Fedora Core 3 (fully updated)
> nfs-utils 1.0.6-52
> kernel 2.6.11-1.35_FC3
>
> NFS client:
> Athlon XP2500+, 1 GB RAM
> Slackware 10.1
> nfs-utils 1.0.7
> kernel 2.6.11.11

--
Haakon


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-30 14:16:09

by Trond Myklebust

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

lau den 30.07.2005 Klokka 15:10 (+0200) skreiv Haakon Riiser:
> I have now tried this with another NFS client (Fedora Core 4)
> connected to the NFS server simultaneously with client used
> in the below problem report. The same problem happens on
> the new client as well, even when the other client has just
> experienced the hang.
>
> That is, after the stalled NFS operation completes on client A,
> it is still possible to observe the same problem on client B.
> It looks to me like this is caused by the server disconnecting idle
> clients, and when a new request suddenly occurs, it takes a while
> before the server manages to resurrect the connection. I tried
> adding a cron job that does 'ls -l /NFS-MOUNT-POINT >/dev/null'
> every few minutes, but surprisingly, that didn't keep the the
> connection alive. Is this because of client side caching? Is there
> anything I can do? These hangs are getting very annoying; it
> would be great if I could somehow change the idle timeout.

If the server drops the connection, then the client will wait 15 seconds
before retrying. The reason for this is that client has to assume that
the server is disconnecting due to congestion issues.
Note that if congestion really is an issue, your Linux server should
normally send an error message to the effect of "too many open TCP
sockets, consider increasing the number of nfsd threads" to your syslog.
You can fiddle with RPC_REESTABLISH_TIMEOUT if you want to change the 15
second delay, but I wouldn't recommend this unless you are sure you know
what you are doing. (FYI, the fixed timeout is, BTW, soon due to be
replaced with an exponential backoff-based timeout.)

Otherwise, you should note that the client too will attempt to drop the
connection after 5 minutes of idle activity on the socket. That should
normally not lead to a 15 second wait, though.
If the client fails to disconnect the idle connection, then the server
will do so after 6 minutes (i.e. ~ 1 minute after the client timeout
should have occurred).

Cheers,
Trond



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-30 14:32:31

by Haakon Riiser

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

Trond,

> If the server drops the connection, then the client will wait 15 seconds
> before retrying. The reason for this is that client has to assume that
> the server is disconnecting due to congestion issues.
> Note that if congestion really is an issue, your Linux server should
> normally send an error message to the effect of "too many open TCP
> sockets, consider increasing the number of nfsd threads" to your syslog.

Congestion is almost definitely not the case here, since the NFS
server is located in my own home, and is typically only used by one
client at a time. During these timeouts, it isn't doing anything.
This is why I first thought that the problem was that the server
disks had spun down to save power.

Btw, one thing I haven't tried yet is upgrading to NFSv4. Do you
think that could help?

> You can fiddle with RPC_REESTABLISH_TIMEOUT if you want to change the 15
> second delay, but I wouldn't recommend this unless you are sure you know
> what you are doing. (FYI, the fixed timeout is, BTW, soon due to be
> replaced with an exponential backoff-based timeout.)
>
> Otherwise, you should note that the client too will attempt to drop the
> connection after 5 minutes of idle activity on the socket. That should
> normally not lead to a 15 second wait, though.
> If the client fails to disconnect the idle connection, then the server
> will do so after 6 minutes (i.e. ~ 1 minute after the client timeout
> should have occurred).

Hmm, what did you make of the result I got with tcpdump/Ethereal?
(Reposted below for convenience.) It looks like the problem is on
the server side.

Source Time Packets
------ ---- -------
client 0.00 V3 ACCESS Call, FH:0x02120000
client 0.10 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 0.31 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 0.71 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 1.53 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 3.16 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 6.42 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 7.12 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 8.52 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
client 11.32 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
server 15.30 V3 ACCESS Reply

--
Haakon


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-07-30 14:56:03

by Trond Myklebust

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

lau den 30.07.2005 Klokka 16:32 (+0200) skreiv Haakon Riiser:

> Btw, one thing I haven't tried yet is upgrading to NFSv4. Do you
> think that could help?

Normally it shouldn't. Sockets are managed by the RPC code, not the NFS
code. The only difference there is that the NFSv4 client has a
"heartbeat" that pings the server every 30 seconds or so in order to
tell the server it is still alive.

> > You can fiddle with RPC_REESTABLISH_TIMEOUT if you want to change the 15
> > second delay, but I wouldn't recommend this unless you are sure you know
> > what you are doing. (FYI, the fixed timeout is, BTW, soon due to be
> > replaced with an exponential backoff-based timeout.)
> >
> > Otherwise, you should note that the client too will attempt to drop the
> > connection after 5 minutes of idle activity on the socket. That should
> > normally not lead to a 15 second wait, though.
> > If the client fails to disconnect the idle connection, then the server
> > will do so after 6 minutes (i.e. ~ 1 minute after the client timeout
> > should have occurred).
>
> Hmm, what did you make of the result I got with tcpdump/Ethereal?
> (Reposted below for convenience.) It looks like the problem is on
> the server side.
>
> Source Time Packets
> ------ ---- -------
> client 0.00 V3 ACCESS Call, FH:0x02120000
> client 0.10 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 0.31 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 0.71 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 1.53 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 3.16 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 6.42 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 7.12 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 8.52 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> client 11.32 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
> server 15.30 V3 ACCESS Reply

This is a different problem that has nothing to do with connecting. If
the client can send RPC requests, then the connection has clearly been
set up.

If I were you, I'd look into what "mountd" is up to on the server when
this happens. The server will just silently drop packets if mountd is
slow to authorise the client.

Cheers,
Trond



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-08-09 19:06:54

by Haakon Riiser

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

Trond,

>> Source Time Packets
>> ------ ---- -------
>> client 0.00 V3 ACCESS Call, FH:0x02120000
>> client 0.10 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> client 0.31 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> client 0.71 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> client 1.53 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> client 3.16 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> client 6.42 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> client 7.12 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> client 8.52 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> client 11.32 [Retransmission of #1] V3 ACCESS Call, FH:0x02120000
>> server 15.30 V3 ACCESS Reply
>
> This is a different problem that has nothing to do with connecting. If
> the client can send RPC requests, then the connection has clearly been
> set up.
>
> If I were you, I'd look into what "mountd" is up to on the server when
> this happens. The server will just silently drop packets if mountd is
> slow to authorise the client.

OK, I've been trying for some time to reproduce the bug while
stracing mountd on the server, but it seems impossible to do it.
If I do

strace -p $(pidof rpc.mountd)

the bug never occurs. If I try to start strace immediately after
the hang begins, strace doesn't attach until the hang is over.
That is, it takes approximately 15 seconds (the entire duration of
the hang) before the

Process <PID> attached - interrupt to quit

message is displayed and anything else is printed by strace. Does
this tell you anything?

--
Haakon


-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-08-09 19:15:12

by Trond Myklebust

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

ty den 09.08.2005 Klokka 21:06 (+0200) skreiv Haakon Riiser:
> OK, I've been trying for some time to reproduce the bug while
> stracing mountd on the server, but it seems impossible to do it.
> If I do
>
> strace -p $(pidof rpc.mountd)
>
> the bug never occurs. If I try to start strace immediately after
> the hang begins, strace doesn't attach until the hang is over.
> That is, it takes approximately 15 seconds (the entire duration of
> the hang) before the
>
> Process <PID> attached - interrupt to quit
>
> message is displayed and anything else is printed by strace. Does
> this tell you anything?

Just means that the mountd process is stuck in a syscall. Have you tried
getting a thread dump when it happens?

echo "t" > /proc/sysrq-trigger

on the server then scour through the 'dmesg' output until you find out
where mountd is hiding.

Cheers,
Trond



-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-08-09 19:22:37

by Haakon Riiser

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

Trond,

> Have you tried getting a thread dump when it happens?
> echo "t" > /proc/sysrq-trigger
> on the server then scour through the 'dmesg' output until you find out
> where mountd is hiding.

Haven't tried that yet, but I'll get right on it. (That is, in an
hour or two, since that's about how long I have to wait before the
bug occurs. :-)

--
Haakon


-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-08-09 22:12:23

by Trond Myklebust

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

ty den 09.08.2005 Klokka 22:32 (+0200) skreiv Haakon Riiser:
> Done. Unfortunately, I couldn't find a call trace for mountd,
> I only got this:
>
> rpc.mountd R running 1880 2814 1 18906 (NOTLB)

Makes sense. That means that rpc.mountd is hanging in userland. That
would also explain why you can't immediately ptrace it.

It is worrying, though. rpc.mountd should almost always be in the
kernel, waiting in select().

Did you compile this version of rpc.mountd yourself? If not, is there
anything special/weird in your /etc/exports file that might cause the
rpc.mountd parser to be screwing up?

Cheers,
Trond



-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-08-09 22:21:18

by Haakon Riiser

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

Trond,

> Just means that the mountd process is stuck in a syscall. Have you tried
> getting a thread dump when it happens?
> echo "t" > /proc/sysrq-trigger
> on the server then scour through the 'dmesg' output until you find out
> where mountd is hiding.

Done. Unfortunately, I couldn't find a call trace for mountd,
I only got this:

rpc.mountd R running 1880 2814 1 18906 (NOTLB)

When mountd isn't hanging, I do get a call trace. Also, I
noticed that echo "t" > /proc/sysrq-trigger didn't return
immediately while mountd was hanging. Normally, it returns
almost instantaneously, but during the hang, it took a second or
two. (Returned long before the hang was over, though.)

Btw, I saved the entire dmesg output, just in case. Let me know if
you're interested, OK?

--
Haakon


-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2005-08-09 22:24:18

by Haakon Riiser

[permalink] [raw]
Subject: Re: "Server not responding" after periods of client inactivity

Trond,

>> rpc.mountd R running 1880 2814 1 18906 (NOTLB)
>
> Makes sense. That means that rpc.mountd is hanging in userland. That
> would also explain why you can't immediately ptrace it.

But shouldn't strace immediately attach, even if mountd is
busy-waiting in userspace? E.g., if I create a program that loops
forever, such as

for (;;) { }

I can still attach immediately, although nothing is printed since
there are no system calls (I've verified this). How can a program
running in userspace delay strace's initial attachment?

> It is worrying, though. rpc.mountd should almost always be in the
> kernel, waiting in select().
>
> Did you compile this version of rpc.mountd yourself? If not, is there
> anything special/weird in your /etc/exports file that might cause the
> rpc.mountd parser to be screwing up?

No and no. I'm using Fedora Core 3's rpc.mountd (nfs-utils-1.0.6-52),
and this is my /etc/exports:

/pub 10.0.0.4(rw,async,no_root_squash,no_subtree_check)
/raid 10.0.0.4(rw,async,no_subtree_check)
/pub 10.0.0.21(rw,async,no_root_squash,no_subtree_check)
/raid 10.0.0.21(rw,async,no_subtree_check)

Shouldn't have any problems with this, right?

--
Haakon


-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs