2008-10-17 12:33:28

by Max Kellermann

Subject: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

Hi,

Ian: this is a follow-up to your post "NFS regression? Odd delays and
lockups accessing an NFS export" a few weeks ago
(http://lkml.org/lkml/2008/9/27/42).

I am able to trigger this bug within a few minutes on a customer's
machine (large web hoster, a *lot* of NFS traffic).

Symptom: with 2.6.26 (2.6.27.1, too), load goes to 100+, dmesg says
"INFO: task migration/2:9 blocked for more than 120 seconds." with
varying task names. Except for the high load average, the machine
seems to work.

With git bisect, I was finally able to identify the guilty commit,
it's not "Ensure we zap only the access and acl caches when setting
new acls" like you guessed, Ian. According to my bisect,
6becedbb06072c5741d4057b9facecb4b3143711 is the origin of the problem.
e481fcf8563d300e7f8875cae5fdc41941d29de0 (its parent) works well.

Glauber: that is your patch "x86: minor adjustments for do_boot_cpu"
(http://lkml.org/lkml/2008/3/19/143). I don't understand this patch
well, and I fail to see a connection with the symptom, but maybe
somebody else does...

See patch below (applies to 2.6.27.1). So far, it looks like the
problem is solved on the server, no visible side effects.

Max


Revert "x86: minor adjustments for do_boot_cpu"

According to a bisect, Glauber Costa's patch induced high load and
"task ... blocked for more than 120 seconds" messages in dmesg. This
patch reverts 6becedbb06072c5741d4057b9facecb4b3143711.

Signed-off-by: Max Kellermann <[email protected]>
---

arch/x86/kernel/smpboot.c | 21 ++++++++-------------
1 files changed, 8 insertions(+), 13 deletions(-)


diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 7985c5b..789cf84 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -808,7 +808,7 @@ static int __cpuinit do_boot_cpu(int apicid, int cpu)
* Returns zero if CPU booted OK, else error code from wakeup_secondary_cpu.
*/
{
- unsigned long boot_error = 0;
+ unsigned long boot_error;
int timeout;
unsigned long start_ip;
unsigned short nmi_high = 0, nmi_low = 0;
@@ -828,7 +828,11 @@ static int __cpuinit do_boot_cpu(int apicid, int cpu)
}
#endif

- alternatives_smp_switch(1);
+ /*
+ * Save current MTRR state in case it was changed since early boot
+ * (e.g. by the ACPI SMI) to initialize new CPUs with MTRRs in sync:
+ */
+ mtrr_save_state();

c_idle.idle = get_idle_for_cpu(cpu);

@@ -873,6 +877,8 @@ do_rest:
/* start_ip had better be page-aligned! */
start_ip = setup_trampoline();

+ alternatives_smp_switch(1);
+
/* So we see what's up */
printk(KERN_INFO "Booting processor %d/%d ip %lx\n",
cpu, apicid, start_ip);
@@ -891,11 +897,6 @@ do_rest:
store_NMI_vector(&nmi_high, &nmi_low);

smpboot_setup_warm_reset_vector(start_ip);
- /*
- * Be paranoid about clearing APIC errors.
- */
- apic_write(APIC_ESR, 0);
- apic_read(APIC_ESR);
}

/*
@@ -986,12 +987,6 @@ int __cpuinit native_cpu_up(unsigned int cpu)
return -ENOSYS;
}

- /*
- * Save current MTRR state in case it was changed since early boot
- * (e.g. by the ACPI SMI) to initialize new CPUs with MTRRs in sync:
- */
- mtrr_save_state();
-
per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;

#ifdef CONFIG_X86_32


2008-10-17 14:32:14

by Glauber Costa

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Fri, Oct 17, 2008 at 02:32:07PM +0200, Max Kellermann wrote:
> Hi,
>
> Ian: this is a follow-up to your post "NFS regression? Odd delays and
> lockups accessing an NFS export" a few weeks ago
> (http://lkml.org/lkml/2008/9/27/42).
>
> I am able to trigger this bug within a few minutes on a customer's
> machine (large web hoster, a *lot* of NFS traffic).
>
> Symptom: with 2.6.26 (2.6.27.1, too), load goes to 100+, dmesg says
> "INFO: task migration/2:9 blocked for more than 120 seconds." with
> varying task names. Except for the high load average, the machine
> seems to work.
>
> With git bisect, I was finally able to identify the guilty commit,
> it's not "Ensure we zap only the access and acl caches when setting
> new acls" like you guessed, Ian. According to my bisect,
> 6becedbb06072c5741d4057b9facecb4b3143711 is the origin of the problem.
> e481fcf8563d300e7f8875cae5fdc41941d29de0 (its parent) works well.
>
> Glauber: that is your patch "x86: minor adjustments for do_boot_cpu"
> (http://lkml.org/lkml/2008/3/19/143). I don't understand this patch
> well, and I fail to see a connection with the symptom, but maybe
> somebody else does...
>
> See patch below (applies to 2.6.27.1). So far, it looks like the
> problem is solved on the server, no visible side effects.
>
> Max
That's probably something related to apic congestion.
Does the problem go away if the only thing you change is this:


> @@ -891,11 +897,6 @@ do_rest:
> store_NMI_vector(&nmi_high, &nmi_low);
>
> smpboot_setup_warm_reset_vector(start_ip);
> - /*
> - * Be paranoid about clearing APIC errors.
> - */
> - apic_write(APIC_ESR, 0);
> - apic_read(APIC_ESR);
> }


Please let me know.

2008-10-20 06:27:47

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

(adding back some CC's, please don't drop people)

On Fri, 2008-10-17 at 14:32 +0200, Max Kellermann wrote:
> Ian: this is a follow-up to your post "NFS regression? Odd delays and
> lockups accessing an NFS export" a few weeks ago
> (http://lkml.org/lkml/2008/9/27/42).
>
> I am able to trigger this bug within a few minutes on a customer's
> machine (large web hoster, a *lot* of NFS traffic).
>
> Symptom: with 2.6.26 (2.6.27.1, too), load goes to 100+, dmesg says
> "INFO: task migration/2:9 blocked for more than 120 seconds." with
> varying task names. Except for the high load average, the machine
> seems to work.
>
> With git bisect, I was finally able to identify the guilty commit,
> it's not "Ensure we zap only the access and acl caches when setting
> new acls" like you guessed, Ian. According to my bisect,
> 6becedbb06072c5741d4057b9facecb4b3143711 is the origin of the problem.
> e481fcf8563d300e7f8875cae5fdc41941d29de0 (its parent) works well.

The issue I see still occurs well before those changesets. I have seen
it with v2.6.25 but v2.6.24 survived for 7 days without issue (my
threshold for a good kernel is 7 days, hence bisecting is a bit
slow...).

So far I have bisected down to this range and am currently testing
acee478 which has been up for >4days.

$ git bisect visualize --pretty=oneline
bdc7f021f3a1fade77adf3c2d7f65690566fddfe NFS: Clean up the (commit|read|write)_setup() callback routines
3ff7576ddac06c3d07089e241b40826d24bbf1ac SUNRPC: Clean up the initialisation of priority queue scheduling info.
c970aa85e71bd581726c42df843f6f129db275ac SUNRPC: Clean up rpc_run_task
84115e1cd4a3614c4e566d4cce31381dce3dbef9 SUNRPC: Cleanup of rpc_task initialisation
ef818a28fac9bd214e676986d8301db0582b92a9 NFS: Stop sillyname renames and unmounts from racing
2f74c0a05612b9c2014b5b67833dba9b9f523948 NFSv4: Clean up the OPEN/CLOSE serialisation code
acee478afc6ff7e1b8852d9a4dca1ff36021414d NFS: Clean up the write request locking.
8b1f9ee56e21e505a3d5d3e33f823006d1abdbaf NFS: Optimise nfs_vm_page_mkwrite()
77f111929d024165e736e919187cff017279bebe NFS: Ensure that we eject stale inodes as soon as possible
d45b9d8baf41acb177abbbe6746b1dea094b8a28 NFS: Handle -ENOENT errors in unlink()/rmdir()/rename()
609005c319bc6062b95ed82e132884ed7e22cdb9 NFS: Sillyrename: in the case of a race, check aliases are really positive
fccca7fc6aab4e6b519e2d606ef34632e4f50e33 NFS: Fix a sillyrename race...

Note that this bisect is over fs/nfs only, so it's possible that I might
drop off the beginning and have to bisect the 3878 commits between
v2.6.24 and fccca7f. I hope not! acee478 looks good so far.

$ git bisect log
# bad: [4b119e21d0c66c22e8ca03df05d9de623d0eb50f] Linux 2.6.25
# good: [49914084e797530d9baaf51df9eda77babc98fa8] Linux 2.6.24
git-bisect start 'v2.6.25' 'v2.6.24' '--' 'fs/nfs'
# bad: [4c5680177012a2b5c0f3fdf58f4375dd84a1da67] NFS: Support non-IPv4 addresses in nfs_parsed_mount_data
git-bisect bad 4c5680177012a2b5c0f3fdf58f4375dd84a1da67
# bad: [d45273ed6f4613e81701c3e896d9db200c288fff] NFS: Clean up address comparison in __nfs_find_client()
git-bisect bad d45273ed6f4613e81701c3e896d9db200c288fff
# bad: [bdc7f021f3a1fade77adf3c2d7f65690566fddfe] NFS: Clean up the (commit|read|write)_setup() callback routines
git-bisect bad bdc7f021f3a1fade77adf3c2d7f65690566fddfe

Ian.
--
Ian Campbell

"It is easier to fight for principles than to live up to them."
-- Alfred Adler



2008-10-20 06:53:14

by Max Kellermann

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On 2008/10/17 16:33, Glauber Costa <[email protected]> wrote:
> That's probably something related to apic congestion.
> Does the problem go away if the only thing you change is this:
>
>
> > @@ -891,11 +897,6 @@ do_rest:
> > store_NMI_vector(&nmi_high, &nmi_low);
> >
> > smpboot_setup_warm_reset_vector(start_ip);
> > - /*
> > - * Be paranoid about clearing APIC errors.
> > - */
> > - apic_write(APIC_ESR, 0);
> > - apic_read(APIC_ESR);
> > }
>
>
> Please let me know.

Hello Glauber,

I have rebooted the server with 2.6.27.1 + this patchlet an hour ago.
No problems since.

Hardware: Compaq P4 Xeon server, Broadcom CMIC-WS / CIOB-X2 board.
Tell me if you need more detailed information.


On 2008/10/20 08:27, Ian Campbell <[email protected]> wrote:
> The issue I see still occurs well before those changesets. I have
> seen it with v2.6.25 but v2.6.24 survived for 7 days without issue
> (my threshold for a good kernel is 7 days, hence bisecting is a bit
> slow...).

Hello Ian,

it seems we're hunting down different bugs after all. Too bad, I
hoped I could have solved your problem, too. Our machine has been
running well over the weekend with the patch I posted; with faulty
kernels, the problem would occur after a few minutes.

Max

2008-10-20 07:43:55

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Mon, 2008-10-20 at 08:51 +0200, Max Kellermann wrote:
>
> On 2008/10/20 08:27, Ian Campbell <[email protected]> wrote:
> > The issue I see still occurs well before those changesets. I have
> > seen it with v2.6.25 but v2.6.24 survived for 7 days without issue
> > (my threshold for a good kernel is 7 days, hence bisecting is a bit
> > slow...).
>
> Hello Ian,
>
> it seems we're hunting down different bugs after all. Too bad, I
> hoped I could have solved your problem, too.

Thanks anyway, I'll just keep on bisecting ;-)

> Our machine has been
> running well over the weekend with the patch I posted; with faulty
> kernels, the problem would occur after a few minutes.
>
--
Ian Campbell

BOFH excuse #400:

We are Microsoft. What you are experiencing is not a problem; it is an undocumented feature.



2008-10-20 13:16:17

by Glauber Costa

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Mon, Oct 20, 2008 at 4:51 AM, Max Kellermann <[email protected]> wrote:
> On 2008/10/17 16:33, Glauber Costa <[email protected]> wrote:
>> That's probably something related to apic congestion.
>> Does the problem go away if the only thing you change is this:
>>
>>
>> > @@ -891,11 +897,6 @@ do_rest:
>> > store_NMI_vector(&nmi_high, &nmi_low);
>> >
>> > smpboot_setup_warm_reset_vector(start_ip);
>> > - /*
>> > - * Be paranoid about clearing APIC errors.
>> > - */
>> > - apic_write(APIC_ESR, 0);
>> > - apic_read(APIC_ESR);
>> > }
>>
>>
>> Please let me know.
>
> Hello Glauber,
>
> I have rebooted the server with 2.6.27.1 + this patchlet an hour ago.
> No problems since.
>
> Hardware: Compaq P4 Xeon server, Broadcom CMIC-WS / CIOB-X2 board.
> Tell me if you need more detailed information.
>

There's a patch in flight from cyrill that probably fixes your problem:
http://lkml.org/lkml/2008/9/15/93

The checks are obviously there for a reason, and we can't just wipe
them out unconditionally ;-) So can you check please that you are also
covered by the case provided?
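
For reference, the idea in that patch is, roughly, to touch the ESR only
when the local APIC is an integrated one. A rough sketch of that kind of
guard (not the actual patch at the link above, just the concept, assuming
the APIC_INTEGRATED()/apic_version[] helpers that arch/x86 already
provides):

	/*
	 * Sketch only: clear the APIC ESR only on integrated APICs,
	 * since external 82489DX APICs have no ESR register.
	 */
	if (APIC_INTEGRATED(apic_version[apicid])) {
		apic_write(APIC_ESR, 0);	/* clear stale error bits */
		apic_read(APIC_ESR);		/* flush the write */
	}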

> On 2008/10/20 08:27, Ian Campbell <[email protected]> wrote:
>> The issue I see still occurs well before those changesets. I have
>> seen it with v2.6.25 but v2.6.24 survived for 7 days without issue
>> (my threshold for a good kernel is 7 days, hence bisecting is a bit
>> slow...).
>
> Hello Ian,
>
> it seems we're hunting down different bugs after all. Too bad, I
> hoped I could have solved your problem, too. Our machine has been
> running well over the weekend with the patch I posted; with faulty
> kernels, the problem would occur after a few minutes.
>
> Max



--
Glauber Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

2008-10-20 14:14:19

by Max Kellermann

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On 2008/10/20 15:15, Glauber Costa <[email protected]> wrote:
> There's a patch in flight from cyrill that probably fixes your
> problem: http://lkml.org/lkml/2008/9/15/93
>
> The checks are obviously there for a reason, and we can't just wipe
> them out unconditionally ;-) So can you check please that you are
> also covered by the case provided?

Looks good: booted the machine 30 minutes ago, no problems so far.

Max

2008-10-20 14:21:18

by Cyrill Gorcunov

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

[Glauber Costa - Mon, Oct 20, 2008 at 11:15:56AM -0200]
| On Mon, Oct 20, 2008 at 4:51 AM, Max Kellermann <[email protected]> wrote:
| > On 2008/10/17 16:33, Glauber Costa <[email protected]> wrote:
| >> That's probably something related to apic congestion.
| >> Does the problem go away if the only thing you change is this:
| >>
| >>
| >> > @@ -891,11 +897,6 @@ do_rest:
| >> > store_NMI_vector(&nmi_high, &nmi_low);
| >> >
| >> > smpboot_setup_warm_reset_vector(start_ip);
| >> > - /*
| >> > - * Be paranoid about clearing APIC errors.
| >> > - */
| >> > - apic_write(APIC_ESR, 0);
| >> > - apic_read(APIC_ESR);
| >> > }
| >>
| >>
| >> Please let me know.
| >
| > Hello Glauber,
| >
| > I have rebooted the server with 2.6.27.1 + this patchlet an hour ago.
| > No problems since.
| >
| > Hardware: Compaq P4 Xeon server, Broadcom CMIC-WS / CIOB-X2 board.
| > Tell me if you need more detailed information.
| >
|
| There's a patch in flight from cyrill that probably fixes your problem:
| http://lkml.org/lkml/2008/9/15/93
|
| The checks are obviously there for a reason, and we can't just wipe
| them out unconditionally ;-) So can you check please that you are also
| covered by the case provided?

Actually, I wonder if it will help. Do Xeon processors really not
have an ESR register and not have an integrated APIC?

...

- Cyrill -

2008-10-20 14:34:24

by Cyrill Gorcunov

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

[Max Kellermann - Mon, Oct 20, 2008 at 04:12:58PM +0200]
| On 2008/10/20 15:15, Glauber Costa <[email protected]> wrote:
| > There's a patch in flight from cyrill that probably fixes your
| > problem: http://lkml.org/lkml/2008/9/15/93
| >
| > The checks are obviously there for a reason, and we can't just wipe
| > them out unconditionally ;-) So can you check please that you are
| > also covered by the case provided?
|
| Looks good: booted the machine 30 minutes ago, no problems so far.
|
| Max
|

Thanks, Max, for testing! (me -- still wondering, since the patch has
helped so far :-)

- Cyrill -

2008-11-01 11:46:06

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> So far I have bisected down to this range and am currently testing
> acee478 which has been up for >4days.

Another update. It has now bisected down to a small range

7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()

I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.

7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.

Ian.
--
Ian Campbell

There is no delight the equal of dread. As long as it is somebody
else's.
-- Clive Barker



2008-11-01 13:41:34

by Trond Myklebust

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > So far I have bisected down to this range and am currently testing
> > acee478 which has been up for >4days.
>
> Another update. It has now bisected down to a small range
>
> 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
>
> I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
>
> 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
>
> Ian.

Have you tested with the TCP RST fix yet? It has been merged into
mainline, so it should be in the latest 2.6.28-git, but I've attached it
so you can apply it to your test kernel...

Cheers
Trond


Attachments:
linux-2.6.27-001-respond_promptly_to_socket_errors.dif (4.31 kB)

2008-11-02 14:41:18

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
>
>
> Have you tested with the TCP RST fix yet? It has been merged into
> mainline, so it should be in the latest 2.6.28-git, but I've attached
> it so you can apply it to your test kernel...

I wasn't aware of it. I'll give it a go.

Thanks,
Ian.
>
--
Ian Campbell

His designs were strictly honourable, as the phrase is: that is, to rob
a lady of her fortune by way of marriage.
-- Henry Fielding, "Tom Jones"



2008-11-04 19:10:43

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > So far I have bisected down to this range and am currently testing
> > > acee478 which has been up for >4days.
> >
> > Another update. It has now bisected down to a small range
> >
> > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> >
> > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> >
> > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
> >
> > Ian.
>
> Have you tested with the TCP RST fix yet? It has been merged into
> mainline, so it should be in the latest 2.6.28-git, but I've attached it
> so you can apply it to your test kernel...

I cherry picked 2a9e1cfa23fb62da37739af81127dab5af095d99 onto v2.6.25
and unfortunately it has not fixed the issue. I'll go back to bisecting
with 3b948ae5be5e22532584113e2e02029519bbad8f.

Ian.

>
> Cheers
> Trond
>
--
Ian Campbell

Superior ability breeds superior ambition.
-- Spock, "Space Seed", stardate 3141.9



2008-11-07 02:12:50

by Kenneth Johansson

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sun, 02 Nov 2008 14:40:47 +0000, Ian Campbell wrote:

> On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
>>
>>
>> Have you tested with the TCP RST fix yet? It has been merged into
>> mainline, so it should be in the latest 2.6.28-git, but I've attached
>> it so you can apply it to your test kernel...
>
> I wasn't aware of it. I'll give it a go.
>
> Thanks,
> Ian.
>>

I think I'm having the same problem as you. At least I have a gut feeling it's NFS related.

What good and bad versions do you have so far in your bisecting?

I see the problem several times a day so it should be possible
to at least try one or two versions per day.

This is on a 2.6.27.2 client talking to a 2.6.26.3 server.
--------

sudo grep blocked /var/log/syslog.0
Nov 5 02:06:27 duo kernel: [ 5080.947067] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 02:08:40 duo kernel: [ 5214.091071] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 02:10:49 duo kernel: [ 5342.940064] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 02:33:15 duo kernel: [ 6688.338072] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 02:35:44 duo kernel: [ 6837.588072] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 02:38:12 duo kernel: [ 6985.765070] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 02:40:37 duo kernel: [ 7130.720067] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 06:56:00 duo kernel: [22454.090070] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 07:51:38 duo kernel: [25791.279105] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 09:39:33 duo kernel: [32267.016068] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 09:41:55 duo kernel: [32408.750061] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 09:44:17 duo kernel: [32550.484061] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 09:46:37 duo kernel: [32691.144064] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 16:26:25 duo kernel: [56678.536068] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 16:28:50 duo kernel: [56823.492067] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 16:31:17 duo kernel: [56970.594061] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 16:33:44 duo kernel: [57117.697062] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 16:51:04 duo kernel: [58158.153065] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 18:00:39 duo kernel: [62256.625050] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov 5 18:15:16 duo kernel: [63210.108080] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 18:24:08 duo kernel: [63741.610074] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 19:47:09 duo kernel: [68722.698056] INFO: task hald-addon-stor:7102 blocked for more than 120 seconds.
Nov 5 19:47:53 duo kernel: [68722.698307] INFO: task hald-addon-stor:7105 blocked for more than 120 seconds.
Nov 5 19:47:53 duo kernel: [68722.698513] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov 5 19:47:53 duo kernel: [68755.984030] INFO: task scsi_eh_12:2687 blocked for more than 120 seconds.
Nov 5 22:20:11 duo kernel: [77904.265068] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 22:23:10 duo kernel: [78083.580065] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 22:46:09 duo kernel: [79462.264081] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 5 23:22:05 duo kernel: [81599.010033] INFO: task scsi_eh_10:2675 blocked for more than 120 seconds.
Nov 5 23:22:05 duo kernel: [81599.010253] INFO: task hald-addon-stor:7097 blocked for more than 120 seconds.
Nov 5 23:22:05 duo kernel: [81599.010468] INFO: task hald-addon-stor:7102 blocked for more than 120 seconds.
Nov 5 23:22:05 duo kernel: [81599.010674] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov 6 01:46:31 duo kernel: [90209.346061] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 6 01:46:31 duo kernel: [90242.632048] INFO: task hald-addon-stor:7102 blocked for more than 120 seconds.
Nov 6 01:46:31 duo kernel: [90242.632230] INFO: task hald-addon-stor:7105 blocked for more than 120 seconds.
Nov 6 01:46:31 duo kernel: [90242.632368] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov 6 01:46:31 duo kernel: [90275.918024] INFO: task scsi_eh_12:2687 blocked for more than 120 seconds.
Nov 6 02:11:59 duo kernel: [91812.443070] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 6 02:14:12 duo kernel: [91945.587069] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 6 02:16:50 duo kernel: [92103.427069] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.
Nov 6 02:28:26 duo kernel: [92759.483053] INFO: task hald-addon-usb-:7009 blocked for more than 120 seconds.
Nov 6 02:28:26 duo kernel: [92759.483233] INFO: task hald-addon-stor:7105 blocked for more than 120 seconds.
Nov 6 02:28:26 duo kernel: [92759.483447] INFO: task hald-addon-stor:7110 blocked for more than 120 seconds.
Nov 6 02:28:26 duo kernel: [92792.769034] INFO: task scsi_eh_12:2687 blocked for more than 120 seconds.
Nov 6 02:58:49 duo kernel: [94622.425059] INFO: task cpufreq-applet:11956 blocked for more than 120 seconds.

2008-11-25 07:09:55

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > So far I have bisected down to this range and am currently testing
> > > acee478 which has been up for >4days.
> >
> > Another update. It has now bisected down to a small range
> >
> > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> >
> > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> >
> > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.

According to bisect:

e06799f958bf7f9f8fae15f0c6f519953fb0257c is first bad commit
commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
Author: Trond Myklebust <[email protected]>
Date: Mon Nov 5 15:44:12 2007 -0500

SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket

By using shutdown() rather than close() we allow the RPC client to wait
for the TCP close handshake to complete before we start trying to reconnect
using the same port.
We use shutdown(SHUT_WR) only instead of shutting down both directions,
however we wait until the server has closed the connection on its side.

Signed-off-by: Trond Myklebust <[email protected]>
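
The mechanism that commit message describes is, stripped of the SUNRPC
specifics, the ordinary TCP half-close pattern; a minimal userspace
sketch of it (not the xprtsock code itself) looks like this:

	#include <sys/socket.h>
	#include <unistd.h>

	/* Send our FIN with shutdown(SHUT_WR), then wait for the peer's
	 * FIN -- read() returning 0 -- before tearing the socket down,
	 * so the port is not reused while the old connection lingers. */
	static void half_close_and_wait(int sock)
	{
		char buf[512];
		ssize_t n;

		shutdown(sock, SHUT_WR);          /* we are done writing */
		do {
			n = read(sock, buf, sizeof(buf));
		} while (n > 0);                  /* drain until EOF or error */
		close(sock);                      /* full teardown only now */
	}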

I've started testing 2.6.26 + revert. It's been a long while since I
started this process so I'll also have a go at an up to date version.

Cheers,
Ian.
--
Ian Campbell

By failing to prepare, you are preparing to fail.



2008-11-25 13:28:38

by Trond Myklebust

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, 2008-11-25 at 07:09 +0000, Ian Campbell wrote:
> On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> > On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > > So far I have bisected down to this range and am currently testing
> > > > acee478 which has been up for >4days.
> > >
> > > Another update. It has now bisected down to a small range
> > >
> > > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> > >
> > > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> > >
> > > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
>
> According to bisect:
>
> e06799f958bf7f9f8fae15f0c6f519953fb0257c is first bad commit
> commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
> Author: Trond Myklebust <[email protected]>
> Date: Mon Nov 5 15:44:12 2007 -0500
>
> SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
>
> By using shutdown() rather than close() we allow the RPC client to wait
> for the TCP close handshake to complete before we start trying to reconnect
> using the same port.
> We use shutdown(SHUT_WR) only instead of shutting down both directions,
> however we wait until the server has closed the connection on its side.
>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> I've started testing 2.6.26 + revert. It's been a long while since I
> started this process so I'll also have a go at an up to date version.
>
> Cheers,

That would indicate that the server is failing to close the TCP
connection when the client closes on its end.

Could you remind me what server you are using? Also, does 'netstat -t'
show connections that are stuck in the CLOSE_WAIT state when you see the
hang?
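
To spell out what that would mean at the socket level: once the client
has done its half-close, the server's read() returns 0 and the server
socket sits in CLOSE_WAIT (with the client stuck in FIN_WAIT2) until the
server actually calls close(). A minimal userspace sketch of the server
side, not the svcsock code:

	#include <sys/socket.h>
	#include <unistd.h>

	/* Serve a connection until the client sends its FIN (read() == 0).
	 * Between that EOF and our close() the socket is in CLOSE_WAIT; if
	 * the close() never happens, the peer hangs in FIN_WAIT2. */
	static void serve_until_client_quits(int conn)
	{
		char buf[4096];
		ssize_t n;

		while ((n = read(conn, buf, sizeof(buf))) > 0)
			;               /* process request bytes here */

		close(conn);            /* without this, CLOSE_WAIT persists */
	}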

Trond

2008-11-25 13:39:55

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, 2008-11-25 at 08:28 -0500, Trond Myklebust wrote:
> On Tue, 2008-11-25 at 07:09 +0000, Ian Campbell wrote:
> > On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> > > On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > > > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > > > So far I have bisected down to this range and am currently testing
> > > > > acee478 which has been up for >4days.
> > > >
> > > > Another update. It has now bisected down to a small range
> > > >
> > > > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > > > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > > > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > > > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > > > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > > > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> > > >
> > > > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> > > >
> > > > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > > > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > > > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
> >
> > According to bisect:
> >
> > e06799f958bf7f9f8fae15f0c6f519953fb0257c is first bad commit
> > commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
> > Author: Trond Myklebust <[email protected]>
> > Date: Mon Nov 5 15:44:12 2007 -0500
> >
> > SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> >
> > By using shutdown() rather than close() we allow the RPC client to wait
> > for the TCP close handshake to complete before we start trying to reconnect
> > using the same port.
> > We use shutdown(SHUT_WR) only instead of shutting down both directions,
> > however we wait until the server has closed the connection on its side.
> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> >
> > I've started testing 2.6.26 + revert. It's been a long while since I
> > started this process so I'll also have a go at an up to date version.
> >
> > Cheers,
>
> That would indicate that the server is failing to close the TCP
> connection when the client closes on its end.
>
> Could you remind me what server you are using?

2.6.25-2-486 which is a Debian package from backports.org, changelog
indicates that it contains 2.6.25.7.

> Also, does 'netstat -t'
> show connections that are stuck in the CLOSE_WAIT state when you see the
> hang?

I'd have to wait for it to reproduce again to be 100% sure but according
to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.

Ian.

--
Ian Campbell
Current Noise: Diamond Head - It's Electric

"The only real way to look younger is not to be born so soon."
-- Charles Schulz, "Things I've Had to Learn Over and
Over and Over"

2008-11-25 13:57:31

by Trond Myklebust

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, 2008-11-25 at 13:38 +0000, Ian Campbell wrote:
> > That would indicate that the server is failing to close the TCP
> > connection when the client closes on its end.
> >
> > Could you remind me what server you are using?
>
> 2.6.25-2-486 which is a Debian package from backports.org, changelog
> indicates that it contains 2.6.25.7.

Hmm... It should normally close sockets when the state changes. There
might be a race, though...

> > Also, does 'netstat -t'
> > show connections that are stuck in the CLOSE_WAIT state when you see the
> > hang?
>
> I'd have to wait for it to reproduce again to be 100% sure but according
> to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
> I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.

That would be on the client side. I'm talking about the server.

Cheers
Trond

2008-11-25 14:05:21

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, 2008-11-25 at 08:57 -0500, Trond Myklebust wrote:
> On Tue, 2008-11-25 at 13:38 +0000, Ian Campbell wrote:
> > > That would indicate that the server is failing to close the TCP
> > > connection when the client closes on its end.
> > >
> > > Could you remind me what server you are using?
> >
> > 2.6.25-2-486 which is a Debian package from backports.org, changelog
> > indicates that it contains 2.6.25.7.
>
> Hmm... It should normally close sockets when the state changes. There
> might be a race, though...
>
> > > Also, does 'netstat -t'
> > > show connections that are stuck in the CLOSE_WAIT state when you see the
> > > hang?
> >
> > I'd have to wait for it to reproduce again to be 100% sure but according
> > to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
> > I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.
>
> That would be on the client side. I'm talking about the server.

Ah, OK. I'll abort my current test of 2.6.26+revert and wait for a repro
so I can netstat the server, give me a couple of days...

Ian.
--
Ian Campbell

It is more rational to sacrifice one life than six.
-- Spock, "The Galileo Seven", stardate 2822.3

2008-11-26 22:12:45

by Ian Campbell

Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, 2008-11-25 at 14:04 +0000, Ian Campbell wrote:
> On Tue, 2008-11-25 at 08:57 -0500, Trond Myklebust wrote:
> > On Tue, 2008-11-25 at 13:38 +0000, Ian Campbell wrote:
> > > > That would indicate that the server is failing to close the TCP
> > > > connection when the client closes on its end.
> > > >
> > > > Could you remind me what server you are using?
> > >
> > > 2.6.25-2-486 which is a Debian package from backports.org, changelog
> > > indicates that it contains 2.6.25.7.
> >
> > Hmm... It should normally close sockets when the state changes. There
> > might be a race, though...
> >
> > > > Also, does 'netstat -t'
> > > > show connections that are stuck in the CLOSE_WAIT state when you see the
> > > > hang?
> > >
> > > I'd have to wait for it to reproduce again to be 100% sure but according
> > > to http://lkml.indiana.edu/hypermail/linux/kernel/0808.3/0120.html
> > > I was seeing connections in FIN_WAIT2 but not CLOSE_WAIT.
> >
> > That would be on the client side. I'm talking about the server.
>
> Ah, OK. I'll abort my current test of 2.6.26+revert and wait for a repro
> so I can netstat the server, give me a couple of days...

So on the server I see the following. 192.168.1.4 is the problematic
client and 192.168.1.6 is the server.

Maybe not interesting but 192.168.1.5 also uses NFS for my $HOME and
runs 2.6.26 with no lockups.

# netstat -t -n
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 1 0 192.168.1.6:2049 192.168.1.4:723 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:920 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:890 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:698 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:705 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:943 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:915 CLOSE_WAIT
tcp 0 0 192.168.1.6:2049 192.168.1.5:783 ESTABLISHED
tcp 1 0 192.168.1.6:2049 192.168.1.4:998 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:758 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:955 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:845 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:827 CLOSE_WAIT
tcp 0 0 192.168.1.6:58464 128.31.0.36:80 ESTABLISHED
tcp 1 0 192.168.1.6:2049 192.168.1.4:754 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:837 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:918 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:865 CLOSE_WAIT
tcp 0 0 192.168.1.6:48343 192.168.1.5:832 ESTABLISHED
tcp 1 0 192.168.1.6:2049 192.168.1.4:840 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:883 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:785 CLOSE_WAIT
tcp 1 0 192.168.1.6:2049 192.168.1.4:720 CLOSE_WAIT
tcp6 0 0 ::ffff:192.168.1.6:22 ::ffff:192.168.1.:38206 ESTABLISHED
tcp6 0 0 ::ffff:192.168.1.6:143 ::ffff:192.168.1.:41308 ESTABLISHED
tcp6 0 0 ::ffff:192.168.1.6:143 ::ffff:192.168.1.:55784 ESTABLISHED
tcp6 0 0 ::ffff:192.168.1.6:22 ::ffff:192.168.1.:39046 ESTABLISHED

and on the client

# netstat -t -n
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.168.1.4:943 192.168.1.6:2049 FIN_WAIT2
tcp 0 0 192.168.1.4:33959 192.168.1.4:6543 ESTABLISHED
tcp 0 0 192.168.1.4:6543 192.168.1.4:54157 ESTABLISHED
tcp 0 0 127.0.0.1:13666 127.0.0.1:33364 ESTABLISHED
tcp 0 0 192.168.1.4:22 192.168.1.5:54696 ESTABLISHED
tcp 0 0 192.168.1.4:22 192.168.1.5:47599 ESTABLISHED
tcp 0 0 192.168.1.4:54156 192.168.1.4:6543 ESTABLISHED
tcp 0 0 192.168.1.4:6543 192.168.1.4:33957 ESTABLISHED
tcp 0 0 192.168.1.4:33957 192.168.1.4:6543 ESTABLISHED
tcp 0 0 192.168.1.4:54157 192.168.1.4:6543 ESTABLISHED
tcp 0 0 192.168.1.4:6543 192.168.1.4:54156 ESTABLISHED
tcp 0 0 192.168.1.4:6543 192.168.1.4:33959 ESTABLISHED
tcp 0 0 127.0.0.1:47756 127.0.0.1:6545 ESTABLISHED
tcp 0 0 127.0.0.1:33364 127.0.0.1:13666 ESTABLISHED
tcp 0 0 127.0.0.1:6545 127.0.0.1:47756 ESTABLISHED

>
> Ian.
--
Ian Campbell

Just once, I wish we would encounter an alien menace that wasn't
immune to bullets.
-- The Brigadier, "Dr. Who"



2008-12-01 00:17:30

by Trond Myklebust

Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

Can you see if the following 3 patches help? They're against 2.6.28-rc6,
but afaics the problems are pretty much the same on 2.6.26.

Cheers
Trond

2008-12-01 00:18:52

by Myklebust, Trond

Subject: Re: [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion

We want to ensure that connected sockets close down the connection when we
set XPT_CLOSE, so that we don't keep it hanging while cleaning up all the
stuff that is keeping a reference to the socket.

Signed-off-by: Trond Myklebust <[email protected]>
---

net/sunrpc/svcsock.c | 21 ++++++++++++++++++++-
1 files changed, 20 insertions(+), 1 deletions(-)


diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 95293f5..a1b048d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -59,6 +59,7 @@ static void svc_udp_data_ready(struct sock *, int);
static int svc_udp_recvfrom(struct svc_rqst *);
static int svc_udp_sendto(struct svc_rqst *);
static void svc_sock_detach(struct svc_xprt *);
+static void svc_tcp_sock_detach(struct svc_xprt *);
static void svc_sock_free(struct svc_xprt *);

static struct svc_xprt *svc_create_socket(struct svc_serv *, int,
@@ -1017,7 +1018,7 @@ static struct svc_xprt_ops svc_tcp_ops = {
.xpo_recvfrom = svc_tcp_recvfrom,
.xpo_sendto = svc_tcp_sendto,
.xpo_release_rqst = svc_release_skb,
- .xpo_detach = svc_sock_detach,
+ .xpo_detach = svc_tcp_sock_detach,
.xpo_free = svc_sock_free,
.xpo_prep_reply_hdr = svc_tcp_prep_reply_hdr,
.xpo_has_wspace = svc_tcp_has_wspace,
@@ -1282,6 +1283,24 @@ static void svc_sock_detach(struct svc_xprt *xprt)
sk->sk_state_change = svsk->sk_ostate;
sk->sk_data_ready = svsk->sk_odata;
sk->sk_write_space = svsk->sk_owspace;
+
+ if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
+ wake_up_interruptible(sk->sk_sleep);
+}
+
+/*
+ * Disconnect the socket, and reset the callbacks
+ */
+static void svc_tcp_sock_detach(struct svc_xprt *xprt)
+{
+ struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
+
+ dprintk("svc: svc_tcp_sock_detach(%p)\n", svsk);
+
+ svc_sock_detach(xprt);
+
+ if (!test_bit(XPT_LISTENER, &xprt->xpt_flags))
+ kernel_sock_shutdown(svsk->sk_sock, SHUT_RDWR);
}

/*

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-12-01 00:19:47

by Myklebust, Trond

Subject: Re: [PATCH 2/3] SUNRPC: We only need to call svc_delete_xprt() once...

Use XPT_DEAD to ensure that we only call xpo_detach & friends once.

Signed-off-by: Trond Myklebust <[email protected]>
---

net/sunrpc/svc_xprt.c | 17 +++++++++++------
1 files changed, 11 insertions(+), 6 deletions(-)


diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index bf5b5cd..a417064 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -838,6 +838,10 @@ void svc_delete_xprt(struct svc_xprt *xprt)
{
struct svc_serv *serv = xprt->xpt_server;

+ /* Only do this once */
+ if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags) != 0)
+ return;
+
dprintk("svc: svc_delete_xprt(%p)\n", xprt);
xprt->xpt_ops->xpo_detach(xprt);

@@ -851,13 +855,14 @@ void svc_delete_xprt(struct svc_xprt *xprt)
* while still attached to a queue, the queue itself
* is about to be destroyed (in svc_destroy).
*/
- if (!test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) {
- BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
- if (test_bit(XPT_TEMP, &xprt->xpt_flags))
- serv->sv_tmpcnt--;
- svc_xprt_put(xprt);
- }
+ if (test_bit(XPT_TEMP, &xprt->xpt_flags))
+ serv->sv_tmpcnt--;
spin_unlock_bh(&serv->sv_lock);
+
+ /* FIXME: Is this really needed here? */
+ BUG_ON(atomic_read(&xprt->xpt_ref.refcount) < 2);
+
+ svc_xprt_put(xprt);
}

void svc_close_xprt(struct svc_xprt *xprt)

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-12-01 00:20:30

by Myklebust, Trond

Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

Aside from being racy (there is nothing preventing someone setting XPT_DEAD
after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
wrong to assume that transports which have called svc_delete_xprt() might
not need to be re-enqueued.

See the list of deferred requests, which is currently never going to
be cleared if the revisit call happens after svc_delete_xprt(). In this
case, the deferred request will currently keep a reference to the transport
forever.

The fix should be to allow dead transports to be enqueued in order to clear
the deferred requests, then change the order of processing in svc_recv() so
that we pick up deferred requests before we do the XPT_CLOSE processing.

Signed-off-by: Trond Myklebust <[email protected]>
---

net/sunrpc/svc_xprt.c | 124 +++++++++++++++++++++++++++----------------------
1 files changed, 69 insertions(+), 55 deletions(-)


diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index a417064..b54cf84 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -297,10 +297,15 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
struct svc_serv *serv = xprt->xpt_server;
struct svc_pool *pool;
struct svc_rqst *rqstp;
+ unsigned long flags;
int cpu;

- if (!(xprt->xpt_flags &
- ((1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED))))
+ flags = xprt->xpt_flags &
+ (1UL<<XPT_CONN | 1UL<<XPT_DATA | 1UL<<XPT_CLOSE |
+ 1UL<<XPT_DEAD | 1UL<<XPT_DEFERRED);
+ if (flags == 0)
+ return;
+ if ((flags & 1UL<<XPT_DEAD) != 0 && (flags & 1UL<<XPT_DEFERRED) == 0)
return;

cpu = get_cpu();
@@ -315,12 +320,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
"svc_xprt_enqueue: "
"threads and transports both waiting??\n");

- if (test_bit(XPT_DEAD, &xprt->xpt_flags)) {
- /* Don't enqueue dead transports */
- dprintk("svc: transport %p is dead, not enqueued\n", xprt);
- goto out_unlock;
- }
-
/* Mark transport as busy. It will remain in this state until
* the provider calls svc_xprt_received. We update XPT_BUSY
* atomically because it also guards against trying to enqueue
@@ -566,6 +565,7 @@ static void svc_check_conn_limits(struct svc_serv *serv)
int svc_recv(struct svc_rqst *rqstp, long timeout)
{
struct svc_xprt *xprt = NULL;
+ struct svc_xprt *newxpt;
struct svc_serv *serv = rqstp->rq_server;
struct svc_pool *pool = rqstp->rq_pool;
int len, i;
@@ -673,62 +673,76 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
spin_unlock_bh(&pool->sp_lock);

len = 0;
+
+ /*
+ * Deal with deferred requests first, since they need to be
+ * dequeued and dropped if the transport has been closed.
+ */
+ rqstp->rq_deferred = svc_deferred_dequeue(xprt);
+ if (rqstp->rq_deferred) {
+ svc_xprt_received(xprt);
+ len = svc_deferred_recv(rqstp);
+ }
+
if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
dprintk("svc_recv: found XPT_CLOSE\n");
svc_delete_xprt(xprt);
- } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
- struct svc_xprt *newxpt;
- newxpt = xprt->xpt_ops->xpo_accept(xprt);
- if (newxpt) {
- /*
- * We know this module_get will succeed because the
- * listener holds a reference too
- */
- __module_get(newxpt->xpt_class->xcl_owner);
- svc_check_conn_limits(xprt->xpt_server);
- spin_lock_bh(&serv->sv_lock);
- set_bit(XPT_TEMP, &newxpt->xpt_flags);
- list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
- serv->sv_tmpcnt++;
- if (serv->sv_temptimer.function == NULL) {
- /* setup timer to age temp transports */
- setup_timer(&serv->sv_temptimer,
- svc_age_temp_xprts,
- (unsigned long)serv);
- mod_timer(&serv->sv_temptimer,
- jiffies + svc_conn_age_period * HZ);
- }
- spin_unlock_bh(&serv->sv_lock);
- svc_xprt_received(newxpt);
- }
- svc_xprt_received(xprt);
- } else {
- dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
- rqstp, pool->sp_id, xprt,
- atomic_read(&xprt->xpt_ref.refcount));
- rqstp->rq_deferred = svc_deferred_dequeue(xprt);
- if (rqstp->rq_deferred) {
- svc_xprt_received(xprt);
- len = svc_deferred_recv(rqstp);
- } else
+ goto drop_request;
+ }
+
+ if (!test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
+ if (len == 0) {
+ dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
+ rqstp, pool->sp_id, xprt,
+ atomic_read(&xprt->xpt_ref.refcount));
len = xprt->xpt_ops->xpo_recvfrom(rqstp);
+
+ /* No data, incomplete (TCP) read, or accept() */
+ if (len == 0 || len == -EAGAIN)
+ goto drop_request;
+ }
+
dprintk("svc: got len=%d\n", len);
- }

- /* No data, incomplete (TCP) read, or accept() */
- if (len == 0 || len == -EAGAIN) {
- rqstp->rq_res.len = 0;
- svc_xprt_release(rqstp);
- return -EAGAIN;
+ clear_bit(XPT_OLD, &xprt->xpt_flags);
+
+ rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
+ rqstp->rq_chandle.defer = svc_defer;
+
+ if (serv->sv_stats)
+ serv->sv_stats->netcnt++;
+ return len;
}
- clear_bit(XPT_OLD, &xprt->xpt_flags);

- rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
- rqstp->rq_chandle.defer = svc_defer;
+ newxpt = xprt->xpt_ops->xpo_accept(xprt);
+ if (newxpt) {
+ /*
+ * We know this module_get will succeed because the
+ * listener holds a reference too
+ */
+ __module_get(newxpt->xpt_class->xcl_owner);
+ svc_check_conn_limits(xprt->xpt_server);
+ spin_lock_bh(&serv->sv_lock);
+ set_bit(XPT_TEMP, &newxpt->xpt_flags);
+ list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
+ serv->sv_tmpcnt++;
+ if (serv->sv_temptimer.function == NULL) {
+ /* setup timer to age temp transports */
+ setup_timer(&serv->sv_temptimer,
+ svc_age_temp_xprts,
+ (unsigned long)serv);
+ mod_timer(&serv->sv_temptimer,
+ jiffies + svc_conn_age_period * HZ);
+ }
+ spin_unlock_bh(&serv->sv_lock);
+ svc_xprt_received(newxpt);
+ }
+ svc_xprt_received(xprt);

- if (serv->sv_stats)
- serv->sv_stats->netcnt++;
- return len;
+drop_request:
+ rqstp->rq_res.len = 0;
+ svc_xprt_release(rqstp);
+ return -EAGAIN;
}
EXPORT_SYMBOL(svc_recv);


--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-12-01 00:29:52

by Myklebust, Trond

Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> but afaics the problems are pretty much the same on 2.6.26.
>
> Cheers
> Trond

Sorry... I forgot to add that these 3 patches need to be applied to the
nfs server, not the client.

Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-12-01 22:10:32

by Ian Campbell

Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> but afaics the problems are pretty much the same on 2.6.26.

Thanks.

The server was actually running 2.6.25.7, but the matching sources have
since been removed from backports.org, so I've reproduced with 2.6.26 and
will now add the patches.

Ian.

--
Ian Campbell

It has been said that man is a rational animal. All my life I have
been searching for evidence which could support this.
-- Bertrand Russell



2008-12-02 15:37:30

by Myklebust, Trond

Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, 2008-12-02 at 16:22 +0100, Kasparek Tomas wrote:
> On Sun, Nov 30, 2008 at 07:29:40PM -0500, Trond Myklebust wrote:
> > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > but afaics the problems are pretty much the same on 2.6.26.
> >
> > Sorry... I forgot to add that these 3 patches need to be applied to the
> > nfs server, not the client.
>
> Hi,
>
> I have the problem on the client side and cannot change the server (FreeBSD 7.0).
> These patches do not change the situation (and they are probably not
> supposed to, I was just giving it a try). After a few minutes I got this on
> the client with 2.6.28-rc6 with the patches:
>
> tcp 0 0 147.229.12.146:674 147.229.176.14:2049 FIN_WAIT2
>
> Applying the revert of e06799f958bf7f9f8fae15f0c6f519953fb0257c suggested by Ian
> does help, on the other hand (with 2.6.27.4).

Then I suggest working around the problem by reducing the value of the
sysctl net.ipv4.tcp_fin_timeout on the client.
--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-12-02 15:39:19

by Kasparek Tomas

Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sun, Nov 30, 2008 at 07:29:40PM -0500, Trond Myklebust wrote:
> On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > but afaics the problems are pretty much the same on 2.6.26.
>
> Sorry... I forgot to add that these 3 patches need to be applied to the
> nfs server, not the client.

Hi,

I have the problem on the client side and cannot change the server (FreeBSD 7.0).
These patches do not change the situation (and they are probably not
supposed to, I was just giving it a try). After a few minutes I got this on
the client with 2.6.28-rc6 with the patches:

tcp 0 0 147.229.12.146:674 147.229.176.14:2049 FIN_WAIT2

Applying the revert of e06799f958bf7f9f8fae15f0c6f519953fb0257c suggested by Ian
does help, on the other hand (with 2.6.27.4).

Bye

--

Tomas Kasparek, PhD student E-mail: [email protected]
CVT FIT VUT Brno, L127 Web: http://www.fit.vutbr.cz/~kasparek
Bozetechova 1, 612 66 Fax: +420 54114-1270
Brno, Czech Republic Phone: +420 54114-1220

jabber: [email protected]
GPG: 2F1E 1AAF FD3B CFA3 1537 63BD DCBE 18FF A035 53BC

2008-12-02 16:29:23

by Kasparek Tomas

Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, Dec 02, 2008 at 10:37:02AM -0500, Trond Myklebust wrote:
> On Tue, 2008-12-02 at 16:22 +0100, Kasparek Tomas wrote:
> > On Sun, Nov 30, 2008 at 07:29:40PM -0500, Trond Myklebust wrote:
> > > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > > but afaics the problems are pretty much the same on 2.6.26.
> > >
> > > Sorry... I forgot to add that these 3 patches need to be applied to the
> > > nfs server, not the client.
> >
> > Hi,
> >
> > I have the problem on client side and can not change server (FreeBSD 7.0).
> > these patches does not change the situation (and they are probably not
> > supposed to do so, just giving it a try). After few minutes I got this on
> > the client with 2.6.28-rc6 with patches:
> >
> > tcp 0 0 147.229.12.146:674 147.229.176.14:2049 FIN_WAIT2
> >
> > Applying reverse e06799f958bf7f9f8fae15f0c6f519953fb0257c suggested by Ian
> > does help on the other side (with 2.6.27.4).
>
> Then I suggest working around the problem by reducing the value of the
> sysctl net.ipv4.tcp_fin_timeout on the client.

I did try that. The value is in seconds and defaults to 60; these
connections are still there after several hours. Changing it to 10 seconds
gives the same behaviour. (BTW, the server has not changed in the last several months.)

--

Tomas Kasparek, PhD student E-mail: [email protected]
CVT FIT VUT Brno, L127 Web: http://www.fit.vutbr.cz/~kasparek
Bozetechova 1, 612 66 Fax: +420 54114-1270
Brno, Czech Republic Phone: +420 54114-1220

jabber: [email protected]
GPG: 2F1E 1AAF FD3B CFA3 1537 63BD DCBE 18FF A035 53BC

2008-12-02 18:15:48

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, 2008-12-02 at 17:26 +0100, Kasparek Tomas wrote:

> Did tried. The number should be seconds and defaults to 60, These
> connections are still there after several hours. Changing it to 10 (sec)
> and same behaviour. (BTW The server did not changed in last several months)

Are you seeing the same behaviour with 'netstat -t'?
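
(That is, something along these lines on the client, filtered to the NFS
server; a sketch, the port is just the standard NFS port from your earlier
output:

  netstat -tn | grep :2049
)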

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-12-06 12:17:22

by Ian Campbell

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Mon, 2008-12-01 at 22:09 +0000, Ian Campbell wrote:
> On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > but afaics the problems are pretty much the same on 2.6.26.
>
> Thanks.
>
> The server was actually running 2.6.25.7 but the matching sources have
> since been removed the backports.org so I've reproduce with 2.6.26 and
> now I'll add the patches.

Just a small progress report. Anecdotally I thought that unpatched
2.6.26.7 was worse than 2.6.25.7, mostly because it hung twice in the ~1
day I was running it where previously it was less frequent than once per
day.

With the patched server the client ran OK for 2.5 days, then mysteriously
hung; the logs show none of the normal symptoms, and my wife reset it
before I got home, so I've no real clue what happened, but I'm inclined to
think it was unrelated for now. I'll get back to you in a week or so if
the problem hasn't reoccurred.

Ian.
--
Ian Campbell

It's later than you think.



2008-12-14 18:24:45

by Ian Campbell

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sat, 2008-12-06 at 12:16 +0000, Ian Campbell wrote:
> On Mon, 2008-12-01 at 22:09 +0000, Ian Campbell wrote:
> > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > but afaics the problems are pretty much the same on 2.6.26.
> >
> > Thanks.
> >
> > The server was actually running 2.6.25.7 but the matching sources have
> > since been removed the backports.org so I've reproduce with 2.6.26 and
> > now I'll add the patches.
>
> Just a small progress report. Anecdotally I thought that unpatched
> 2.6.26.7 was worse than 2.6.25.7, mostly because it hung twice in the ~1
> day I was running it where previously it was less frequent than once per
> day.
>
> With the patched server the client ran OK for 2.5 days then mysteriously
> hung, the logs show none of the normal symptoms and my wife reset it
> before I got home so I've no real clue what happened but I'm inclined to
> think it was unrelated for now. I'll get back to you in a week or so if
> the problem hasn't reoccurred.

$ uptime
18:15:29 up 9 days, 22 min, 1 user, load average: 0.74, 0.64, 0.46

This is on the problematic client, so it looks like the server side fix
has sorted it. Thanks very much Trond.

Ian.
--
Ian Campbell

You have only to mumble a few words in church to get married and few words
in your sleep to get divorced.



2008-12-16 17:55:46

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Sun, Dec 14, 2008 at 06:24:05PM +0000, Ian Campbell wrote:
> On Sat, 2008-12-06 at 12:16 +0000, Ian Campbell wrote:
> > On Mon, 2008-12-01 at 22:09 +0000, Ian Campbell wrote:
> > > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > > but afaics the problems are pretty much the same on 2.6.26.
> > >
> > > Thanks.
> > >
> > > The server was actually running 2.6.25.7 but the matching sources have
> > > since been removed the backports.org so I've reproduce with 2.6.26 and
> > > now I'll add the patches.
> >
> > Just a small progress report. Anecdotally I thought that unpatched
> > 2.6.26.7 was worse than 2.6.25.7, mostly because it hung twice in the ~1
> > day I was running it where previously it was less frequent than once per
> > day.
> >
> > With the patched server the client ran OK for 2.5 days then mysteriously
> > hung, the logs show none of the normal symptoms and my wife reset it
> > before I got home so I've no real clue what happened but I'm inclined to
> > think it was unrelated for now. I'll get back to you in a week or so if
> > the problem hasn't reoccurred.
>
> $ uptime
> 18:15:29 up 9 days, 22 min, 1 user, load average: 0.74, 0.64, 0.46
>
> This is on the problematic client, so it looks like the server side fix
> has sorted it. Thanks very much Trond.

Thanks for the testing! So this was with the following three patches
applied on the server on top of 2.6.26?

[PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion
[PATCH 2/3] SUNRPC: We only need to call svc_delete_xprt() once...
[PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

I'll try to take a look at these before I leave for the holidays,
assuming the versions Trond posted on Nov. 30 are the latest.

--b.

2008-12-16 18:40:02

by Ian Campbell

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, 2008-12-16 at 12:55 -0500, J. Bruce Fields wrote:
> On Sun, Dec 14, 2008 at 06:24:05PM +0000, Ian Campbell wrote:
> > On Sat, 2008-12-06 at 12:16 +0000, Ian Campbell wrote:
> > > On Mon, 2008-12-01 at 22:09 +0000, Ian Campbell wrote:
> > > > On Sun, 2008-11-30 at 19:17 -0500, Trond Myklebust wrote:
> > > > > Can you see if the following 3 patches help? They're against 2.6.28-rc6,
> > > > > but afaics the problems are pretty much the same on 2.6.26.
> > > >
> > > > Thanks.
> > > >
> > > > The server was actually running 2.6.25.7 but the matching sources have
> > > > since been removed the backports.org so I've reproduce with 2.6.26 and
> > > > now I'll add the patches.
> > >
> > > Just a small progress report. Anecdotally I thought that unpatched
> > > 2.6.26.7 was worse than 2.6.25.7, mostly because it hung twice in the ~1
> > > day I was running it where previously it was less frequent than once per
> > > day.
> > >
> > > With the patched server the client ran OK for 2.5 days then mysteriously
> > > hung, the logs show none of the normal symptoms and my wife reset it
> > > before I got home so I've no real clue what happened but I'm inclined to
> > > think it was unrelated for now. I'll get back to you in a week or so if
> > > the problem hasn't reoccurred.
> >
> > $ uptime
> > 18:15:29 up 9 days, 22 min, 1 user, load average: 0.74, 0.64, 0.46
> >
> > This is on the problematic client, so it looks like the server side fix
> > has sorted it. Thanks very much Trond.
>
> Thanks for the testing! So this was with the following three patches
> applied on the server on top of 2.6.26?
>
> [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion
> [PATCH 2/3] SUNRPC: We only need to call svc_delete_xprt() once...
> [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

That's right, it was actually 2.6.26.7 FWIW.

> I'll try to take a look at these before I leave for the holidays,
> assuming the versions Trond posted on Nov. 30 are the latest.

Thanks.

Ian.
--
Ian Campbell

The light of a hundred stars does not equal the light of the moon.



2008-12-23 14:49:19

by Tom Tucker

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

Trond Myklebust wrote:
> On Wed, 2008-12-17 at 09:35 -0600, Tom Tucker wrote:
>
>> Trond Myklebust wrote:
>>
>>> Aside from being racy (there is nothing preventing someone setting XPT_DEAD
>>> after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
>>> wrong to assume that transports which have called svc_delete_xprt() might
>>> not need to be re-enqueued.
>>>
>> This is only true because now you allow transports with XPT_DEAD set to
>> be enqueued -- yes?
>>
>>
>>> See the list of deferred requests, which is currently never going to
>>> be cleared if the revisit call happens after svc_delete_xprt(). In this
>>> case, the deferred request will currently keep a reference to the transport
>>> forever.
>>>
>>>
>> I agree this is a possibility and it needs to be fixed. I'm concerned
>> that the root cause is still there though. I thought the test case was
>> the client side timing out the connection. Why are there deferred
>> requests sitting on what is presumably an idle connection?
>>
>
> I haven't said that they are the cause of this test case. I've said that
> deferred requests hold references to the socket that can obviously
> deadlock. That needs to be fixed regardless of whether or not it is the
> cause here.
>
> There are plenty of situations in which the client may choose to close
> the TCP socket even if there are outstanding requests. One of the most
> common is when the user signals the process, so that an RPC call that
> was partially transmitted (ran out of buffer space) gets cancelled
> before it can finish transmitting. In that case the client has no choice
> but to disconnect and immediately reconnect.
>
>
>>> The fix should be to allow dead transports to be enqueued in order to clear
>>> the deferred requests, then change the order of processing in svc_recv() so
>>> that we pick up deferred requests before we do the XPT_CLOSE processing.
>>>
>>>
>> Wouldn't it be simpler to clean up any deferred requests in the close
>> path instead of changing the meaning of XPT_DEAD and dispatching
>> N-threads to do the same?
>>
>
> AFAICS, deferred requests are the property of the cache until they
> expire or a downcall occurs. I'm not aware of any way to cancel only
> those deferred requests that hold a reference to this particular
> transport.
>
>
Ok, I think you're right, and I think that this fix is correct and makes
the symptom go away.

I may be completely confused here, but:

- The deferred requests should be getting cleaned up by timing out, and
that does not seem to be happening. (Is this true?)

- By releasing the underlying connection prior to releasing the
transport that manages it, we've converted the visible resource leak to
an invisible one.

- This has been around forever, and changing the client-side close
behavior to be graceful exposed this bug.

So I'm wondering if what we want to do here is to provide a mechanism
for canceling deferred requests for a particular transport. That would
let the generic transport driver force cancellation of deferred requests
when closing.

Tom

2008-12-23 23:39:28

by Tom Tucker

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

Tom Tucker wrote:
> Trond Myklebust wrote:
>> On Wed, 2008-12-17 at 09:35 -0600, Tom Tucker wrote:
>>
>>> Trond Myklebust wrote:
>>>
>>>> Aside from being racy (there is nothing preventing someone setting
>>>> XPT_DEAD
>>>> after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
>>>> wrong to assume that transports which have called svc_delete_xprt()
>>>> might
>>>> not need to be re-enqueued.
>>>>
>>> This is only true because now you allow transports with XPT_DEAD set
>>> to be enqueued -- yes?
>>>
>>>
>>>> See the list of deferred requests, which is currently never going to
>>>> be cleared if the revisit call happens after svc_delete_xprt(). In
>>>> this
>>>> case, the deferred request will currently keep a reference to the
>>>> transport
>>>> forever.
>>>>
>>>>
>>> I agree this is a possibility and it needs to be fixed. I'm
>>> concerned that the root cause is still there though. I thought the
>>> test case was the client side timing out the connection. Why are
>>> there deferred requests sitting on what is presumably an idle
>>> connection?
>>>
>>
>> I haven't said that they are the cause of this test case. I've said that
>> deferred requests hold references to the socket that can obviously
>> deadlock. That needs to be fixed regardless of whether or not it is the
>> cause here.
>>
>> There are plenty of situations in which the client may choose to close
>> the TCP socket even if there are outstanding requests. One of the most
>> common is when the user signals the process, so that an RPC call that
>> was partially transmitted (ran out of buffer space) gets cancelled
>> before it can finish transmitting. In that case the client has no choice
>> but to disconnect and immediately reconnect.
>>
>>
>>>> The fix should be to allow dead transports to be enqueued in order
>>>> to clear
>>>> the deferred requests, then change the order of processing in
>>>> svc_recv() so
>>>> that we pick up deferred requests before we do the XPT_CLOSE
>>>> processing.
>>>>
>>>>
>>> Wouldn't it be simpler to clean up any deferred requests in the
>>> close path instead of changing the meaning of XPT_DEAD and
>>> dispatching N-threads to do the same?
>>>
>>
>> AFAICS, deferred requests are the property of the cache until they
>> expire or a downcall occurs. I'm not aware of any way to cancel only
>> those deferred requests that hold a reference to this particular
>> transport.
>>
>>
> Ok, I think you're right, and I think that this fix is correct and
> makes the symptom go away.
>
> I may be completely confused here, but:
>
> - The deferred requests should be getting cleaned up by timing out,
> and that does not not seem to be happening, (Is this true?)
>
They are getting "cleaned up", but by the time they are, the transport is
dead; the request has been added to the deferred queue, but it won't get
processed because svc_xprt_enqueue won't "schedule" a dead transport.

> - By releasing the underlying connection prior to releasing the
> transport that manages it, we've converted the visible resource leek
> to an invisible one.
>
Not with your changes per the above.

> - This has been around forever and changing the client side close
> behavior graceful exposed this bug,
>
> So I'm wondering if what we want to do here is to provide a mechanism
> for canceling deferred requests for a particular transport. This would
> provide a mechanism for the generic transport driver to force
> cancellation of deferred requests when closing.
This is a new interface and we'd still need to handle requests sitting
on the transport's deferred queue. Probably not a good idea.

> Tom
>
>

2008-12-17 15:27:29

by Tom Tucker

[permalink] [raw]
Subject: Re: [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion

Trond Myklebust wrote:
> We want to ensure that connected sockets close down the connection when we
> set XPT_CLOSE, so that we don't keep it hanging while cleaning up all the
> stuff that is keeping a reference to the socket.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
>
> net/sunrpc/svcsock.c | 21 ++++++++++++++++++++-
> 1 files changed, 20 insertions(+), 1 deletions(-)
>
>
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 95293f5..a1b048d 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -59,6 +59,7 @@ static void svc_udp_data_ready(struct sock *, int);
> static int svc_udp_recvfrom(struct svc_rqst *);
> static int svc_udp_sendto(struct svc_rqst *);
> static void svc_sock_detach(struct svc_xprt *);
> +static void svc_tcp_sock_detach(struct svc_xprt *);
> static void svc_sock_free(struct svc_xprt *);
>
> static struct svc_xprt *svc_create_socket(struct svc_serv *, int,
> @@ -1017,7 +1018,7 @@ static struct svc_xprt_ops svc_tcp_ops = {
> .xpo_recvfrom = svc_tcp_recvfrom,
> .xpo_sendto = svc_tcp_sendto,
> .xpo_release_rqst = svc_release_skb,
> - .xpo_detach = svc_sock_detach,
> + .xpo_detach = svc_tcp_sock_detach,
> .xpo_free = svc_sock_free,
> .xpo_prep_reply_hdr = svc_tcp_prep_reply_hdr,
> .xpo_has_wspace = svc_tcp_has_wspace,
> @@ -1282,6 +1283,24 @@ static void svc_sock_detach(struct svc_xprt *xprt)
> sk->sk_state_change = svsk->sk_ostate;
> sk->sk_data_ready = svsk->sk_odata;
> sk->sk_write_space = svsk->sk_owspace;
> +
> + if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
> + wake_up_interruptible(sk->sk_sleep);
> +}
> +
> +/*
> + * Disconnect the socket, and reset the callbacks
> + */
> +static void svc_tcp_sock_detach(struct svc_xprt *xprt)
> +{
> + struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
> +
> + dprintk("svc: svc_tcp_sock_detach(%p)\n", svsk);
> +
> + svc_sock_detach(xprt);
> +
> + if (!test_bit(XPT_LISTENER, &xprt->xpt_flags))
> + kernel_sock_shutdown(svsk->sk_sock, SHUT_RDWR);

How is this different than what happens as an artifact of sock_release?

> }
>
> /*
>

2008-12-17 15:36:10

by Tom Tucker

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

Trond Myklebust wrote:
> Aside from being racy (there is nothing preventing someone setting XPT_DEAD
> after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
> wrong to assume that transports which have called svc_delete_xprt() might
> not need to be re-enqueued.

This is only true because now you allow transports with XPT_DEAD set to
be enqueued -- yes?

>
> See the list of deferred requests, which is currently never going to
> be cleared if the revisit call happens after svc_delete_xprt(). In this
> case, the deferred request will currently keep a reference to the transport
> forever.
>

I agree this is a possibility and it needs to be fixed. I'm concerned
that the root cause is still there though. I thought the test case was
the client side timing out the connection. Why are there deferred
requests sitting on what is presumably an idle connection?


> The fix should be to allow dead transports to be enqueued in order to clear
> the deferred requests, then change the order of processing in svc_recv() so
> that we pick up deferred requests before we do the XPT_CLOSE processing.
>

Wouldn't it be simpler to clean up any deferred requests in the close
path instead of changing the meaning of XPT_DEAD and dispatching
N-threads to do the same?

> Signed-off-by: Trond Myklebust <[email protected]>
> ---
>
> net/sunrpc/svc_xprt.c | 124 +++++++++++++++++++++++++++----------------------
> 1 files changed, 69 insertions(+), 55 deletions(-)
>
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index a417064..b54cf84 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -297,10 +297,15 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
> struct svc_serv *serv = xprt->xpt_server;
> struct svc_pool *pool;
> struct svc_rqst *rqstp;
> + unsigned long flags;
> int cpu;
>
> - if (!(xprt->xpt_flags &
> - ((1<<XPT_CONN)|(1<<XPT_DATA)|(1<<XPT_CLOSE)|(1<<XPT_DEFERRED))))
> + flags = xprt->xpt_flags &
> + (1UL<<XPT_CONN | 1UL<<XPT_DATA | 1UL<<XPT_CLOSE |
> + 1UL<<XPT_DEAD | 1UL<<XPT_DEFERRED);
> + if (flags == 0)
> + return;
> + if ((flags & 1UL<<XPT_DEAD) != 0 && (flags & 1UL<<XPT_DEFERRED) == 0)
> return;
>
> cpu = get_cpu();
> @@ -315,12 +320,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
> "svc_xprt_enqueue: "
> "threads and transports both waiting??\n");
>
> - if (test_bit(XPT_DEAD, &xprt->xpt_flags)) {
> - /* Don't enqueue dead transports */
> - dprintk("svc: transport %p is dead, not enqueued\n", xprt);
> - goto out_unlock;
> - }
> -
> /* Mark transport as busy. It will remain in this state until
> * the provider calls svc_xprt_received. We update XPT_BUSY
> * atomically because it also guards against trying to enqueue
> @@ -566,6 +565,7 @@ static void svc_check_conn_limits(struct svc_serv *serv)
> int svc_recv(struct svc_rqst *rqstp, long timeout)
> {
> struct svc_xprt *xprt = NULL;
> + struct svc_xprt *newxpt;
> struct svc_serv *serv = rqstp->rq_server;
> struct svc_pool *pool = rqstp->rq_pool;
> int len, i;
> @@ -673,62 +673,76 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
> spin_unlock_bh(&pool->sp_lock);
>
> len = 0;
> +
> + /*
> + * Deal with deferred requests first, since they need to be
> + * dequeued and dropped if the transport has been closed.
> + */
> + rqstp->rq_deferred = svc_deferred_dequeue(xprt);
> + if (rqstp->rq_deferred) {
> + svc_xprt_received(xprt);
> + len = svc_deferred_recv(rqstp);
> + }
> +
> if (test_bit(XPT_CLOSE, &xprt->xpt_flags)) {
> dprintk("svc_recv: found XPT_CLOSE\n");
> svc_delete_xprt(xprt);
> - } else if (test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
> - struct svc_xprt *newxpt;
> - newxpt = xprt->xpt_ops->xpo_accept(xprt);
> - if (newxpt) {
> - /*
> - * We know this module_get will succeed because the
> - * listener holds a reference too
> - */
> - __module_get(newxpt->xpt_class->xcl_owner);
> - svc_check_conn_limits(xprt->xpt_server);
> - spin_lock_bh(&serv->sv_lock);
> - set_bit(XPT_TEMP, &newxpt->xpt_flags);
> - list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
> - serv->sv_tmpcnt++;
> - if (serv->sv_temptimer.function == NULL) {
> - /* setup timer to age temp transports */
> - setup_timer(&serv->sv_temptimer,
> - svc_age_temp_xprts,
> - (unsigned long)serv);
> - mod_timer(&serv->sv_temptimer,
> - jiffies + svc_conn_age_period * HZ);
> - }
> - spin_unlock_bh(&serv->sv_lock);
> - svc_xprt_received(newxpt);
> - }
> - svc_xprt_received(xprt);
> - } else {
> - dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> - rqstp, pool->sp_id, xprt,
> - atomic_read(&xprt->xpt_ref.refcount));
> - rqstp->rq_deferred = svc_deferred_dequeue(xprt);
> - if (rqstp->rq_deferred) {
> - svc_xprt_received(xprt);
> - len = svc_deferred_recv(rqstp);
> - } else
> + goto drop_request;
> + }
> +
> + if (!test_bit(XPT_LISTENER, &xprt->xpt_flags)) {
> + if (len == 0) {
> + dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> + rqstp, pool->sp_id, xprt,
> + atomic_read(&xprt->xpt_ref.refcount));
> len = xprt->xpt_ops->xpo_recvfrom(rqstp);
> +
> + /* No data, incomplete (TCP) read, or accept() */
> + if (len == 0 || len == -EAGAIN)
> + goto drop_request;
> + }
> +
> dprintk("svc: got len=%d\n", len);
> - }
>
> - /* No data, incomplete (TCP) read, or accept() */
> - if (len == 0 || len == -EAGAIN) {
> - rqstp->rq_res.len = 0;
> - svc_xprt_release(rqstp);
> - return -EAGAIN;
> + clear_bit(XPT_OLD, &xprt->xpt_flags);
> +
> + rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
> + rqstp->rq_chandle.defer = svc_defer;
> +
> + if (serv->sv_stats)
> + serv->sv_stats->netcnt++;
> + return len;
> }
> - clear_bit(XPT_OLD, &xprt->xpt_flags);
>
> - rqstp->rq_secure = svc_port_is_privileged(svc_addr(rqstp));
> - rqstp->rq_chandle.defer = svc_defer;
> + newxpt = xprt->xpt_ops->xpo_accept(xprt);
> + if (newxpt) {
> + /*
> + * We know this module_get will succeed because the
> + * listener holds a reference too
> + */
> + __module_get(newxpt->xpt_class->xcl_owner);
> + svc_check_conn_limits(xprt->xpt_server);
> + spin_lock_bh(&serv->sv_lock);
> + set_bit(XPT_TEMP, &newxpt->xpt_flags);
> + list_add(&newxpt->xpt_list, &serv->sv_tempsocks);
> + serv->sv_tmpcnt++;
> + if (serv->sv_temptimer.function == NULL) {
> + /* setup timer to age temp transports */
> + setup_timer(&serv->sv_temptimer,
> + svc_age_temp_xprts,
> + (unsigned long)serv);
> + mod_timer(&serv->sv_temptimer,
> + jiffies + svc_conn_age_period * HZ);
> + }
> + spin_unlock_bh(&serv->sv_lock);
> + svc_xprt_received(newxpt);
> + }
> + svc_xprt_received(xprt);
>
> - if (serv->sv_stats)
> - serv->sv_stats->netcnt++;
> - return len;
> +drop_request:
> + rqstp->rq_res.len = 0;
> + svc_xprt_release(rqstp);
> + return -EAGAIN;
> }
> EXPORT_SYMBOL(svc_recv);
>
>

2008-12-17 18:13:57

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion

On Wed, 2008-12-17 at 09:27 -0600, Tom Tucker wrote:
> > + if (!test_bit(XPT_LISTENER, &xprt->xpt_flags))
> > + kernel_sock_shutdown(svsk->sk_sock, SHUT_RDWR);
>
> How is this different than what happens as an artifact of sock_release?

The point is that it is independent of whether or not something is
holding a reference to the svc_sock.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2008-12-17 18:59:38

by Tom Tucker

[permalink] [raw]
Subject: Re: [PATCH 1/3] SUNRPC: Ensure the server closes sockets in a timely fashion

Trond Myklebust wrote:
> On Wed, 2008-12-17 at 09:27 -0600, Tom Tucker wrote:
>>> + if (!test_bit(XPT_LISTENER, &xprt->xpt_flags))
>>> + kernel_sock_shutdown(svsk->sk_sock, SHUT_RDWR);
>> How is this different than what happens as an artifact of sock_release?
>
> The point is that it is independent of whether or not something is
> holding a reference to the svc_sock.

Thanks, makes sense.

>

2008-12-17 19:08:59

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 3/3] SUNRPC: svc_xprt_enqueue should not refuse to enqueue 'XPT_DEAD' transports

On Wed, 2008-12-17 at 09:35 -0600, Tom Tucker wrote:
> Trond Myklebust wrote:
> > Aside from being racy (there is nothing preventing someone setting XPT_DEAD
> > after the test in svc_xprt_enqueue, and before XPT_BUSY is set), it is
> > wrong to assume that transports which have called svc_delete_xprt() might
> > not need to be re-enqueued.
>
> This is only true because now you allow transports with XPT_DEAD set to
> be enqueued -- yes?
>
> >
> > See the list of deferred requests, which is currently never going to
> > be cleared if the revisit call happens after svc_delete_xprt(). In this
> > case, the deferred request will currently keep a reference to the transport
> > forever.
> >
>
> I agree this is a possibility and it needs to be fixed. I'm concerned
> that the root cause is still there though. I thought the test case was
> the client side timing out the connection. Why are there deferred
> requests sitting on what is presumably an idle connection?

I haven't said that they are the cause of this test case. I've said that
deferred requests hold references to the socket that can obviously
deadlock. That needs to be fixed regardless of whether or not it is the
cause here.

There are plenty of situations in which the client may choose to close
the TCP socket even if there are outstanding requests. One of the most
common is when the user signals the process, so that an RPC call that
was partially transmitted (ran out of buffer space) gets cancelled
before it can finish transmitting. In that case the client has no choice
but to disconnect and immediately reconnect.

> > The fix should be to allow dead transports to be enqueued in order to clear
> > the deferred requests, then change the order of processing in svc_recv() so
> > that we pick up deferred requests before we do the XPT_CLOSE processing.
> >
>
> Wouldn't it be simpler to clean up any deferred requests in the close
> path instead of changing the meaning of XPT_DEAD and dispatching
> N-threads to do the same?

AFAICS, deferred requests are the property of the cache until they
expire or a downcall occurs. I'm not aware of any way to cancel only
those deferred requests that hold a reference to this particular
transport.

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com

2009-01-07 22:21:39

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Tue, Dec 16, 2008 at 06:39:35PM +0000, Ian Campbell wrote:
> That's right, it was actually 2.6.26.7 FWIW.
>
> > I'll try to take a look at these before I leave for the holidays,
> > assuming the versions Trond posted on Nov. 30 are the latest.
>
> Thanks.

Sorry for getting behind.

If you got a chance to retest with the for-2.6.29 branch at

git://linux-nfs.org/~bfields/linux.git for-2.6.29

that'd be great; that's what I intend to send to Linus.
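
(Something along these lines should pick that branch up for testing; a
sketch only, assuming an existing clone of Linus's tree, and the remote
name "nfs-bfields" is just an example:

  git remote add nfs-bfields git://linux-nfs.org/~bfields/linux.git
  git fetch nfs-bfields
  git checkout -b test-for-2.6.29 nfs-bfields/for-2.6.29
)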

--b.

2009-01-08 18:20:27

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Wed, Jan 07, 2009 at 05:21:15PM -0500, J. Bruce Fields wrote:
> On Tue, Dec 16, 2008 at 06:39:35PM +0000, Ian Campbell wrote:
> > That's right, it was actually 2.6.26.7 FWIW.
> >
> > > I'll try to take a look at these before I leave for the holidays,
> > > assuming the versions Trond posted on Nov. 30 are the latest.
> >
> > Thanks.
>
> Sorry for getting behind.
>
> If you got a chance to retest with the for-2.6.29 branch at
>
> git://linux-nfs.org/~bfields/linux.git for-2.6.29
>
> that'd be great; that's what I intend to send to Linus.

(Merged now, so testing mainline as of today should work too.)

--b.

2009-01-08 21:23:15

by Ian Campbell

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Thu, 2009-01-08 at 13:20 -0500, J. Bruce Fields wrote:
> On Wed, Jan 07, 2009 at 05:21:15PM -0500, J. Bruce Fields wrote:
> > On Tue, Dec 16, 2008 at 06:39:35PM +0000, Ian Campbell wrote:
> > > That's right, it was actually 2.6.26.7 FWIW.
> > >
> > > > I'll try to take a look at these before I leave for the holidays,
> > > > assuming the versions Trond posted on Nov. 30 are the latest.
> > >
> > > Thanks.
> >
> > Sorry for getting behind.
> >
> > If you got a chance to retest with the for-2.6.29 branch at
> >
> > git://linux-nfs.org/~bfields/linux.git for-2.6.29
> >
> > that'd be great; that's what I intend to send to Linus.
>
> (Merged now, so testing mainline as of today should work too.)

The server isn't really a machine I want to test random kernels on; is
there some subset of those changesets which it would be useful for me to
pull back onto the 2.6.26 kernel I'm using to test? (I can most likely
manage the backporting myself.)

These two look like the relevant ones to me but I'm not sure:
22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion

I think 69b6 was in the set of three I tested previously and the other
two turned into 2294?
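
(If so, the backport attempt would look roughly like this; only a sketch,
assuming both commits have already been fetched, and the cherry-picks may
well need manual fix-ups against 2.6.26:

  git checkout -b nfs-backport v2.6.26
  git cherry-pick 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a
  git cherry-pick 22945e4a1c7454c97f5d8aee1ef526c83fef3223
)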

Ian.

Full list 238c6d54830c624f34ac9cf123ac04aebfca5013..nfs-bfields/for-2.6.29:
db43910cb42285a99f45f7e0a0a32e32d0b61dcf nfsd: get rid of NFSD_VERSION
87df4de8073f922a1f643b9fa6ba0412d5529ecf nfsd: last_byte_offset
4e65ebf08951326709817e654c149d0a94982e01 nfsd: delete wrong file comment from nfsd/nfs4xdr.c
df96fcf02a5fd2ae4e9b09e079dd6ef12d10ecd7 nfsd: git rid of nfs4_cb_null_ops declaration
0407717d8587f60003f4904bff27650cd836c00c nfsd: dprint each op status in nfsd4_proc_compound
b7aeda40d3010666d2c024c80557b6aa92a1a1ad nfsd: add etoosmall to nfserrno
30fa8c0157e4591ee2227aaa0b17cd3b0da5e6cb NFSD: FIDs need to take precedence over UUIDs
24c3767e41a6a59d32bb45abe899eb194e6bf1b8 SUNRPC: The sunrpc server code should not be used by out-of-tree modules
22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
9a8d248e2d2e9c880ac4561f27fea5dc200655bd nfsd: fix double-locks of directory mutex
2779e3ae39645515cb6c1126634f47c28c9e7190 svc: Move kfree of deferral record to common code
f05ef8db1abe68e3f6fc272efee51bc54ce528c5 CRED: Fix NFSD regression
0dba7c2a9ed3d4a1e58f5d94fffa9f44dbe012e6 NLM: Clean up flow of control in make_socks() function
d3fe5ea7cf815c037c90b1f1464ffc1ab5e8601b NLM: Refactor make_socks() function
55ef1274dddd4de387c54d110e354ffbb6cdc706 nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT
69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion
262a09823bb07c6aafb6c1d312cde613d0b90c85 NFSD: Add documenting comments for nfsctl interface
9e074856caf13ba83363f73759f5e395f74ccf41 NFSD: Replace open-coded integer with macro
54224f04ae95d86b27c0673cd773ebb120d86876 NFSD: Fix a handful of coding style issues in write_filehandle()
b046ccdc1f8171f6d0129dcc2a28d49187b4bf69 NFSD: clean up failover sysctl function naming
b064ec038a6180b13e5f89b6a30b42cb5ce8febc lockd: Enable NLM use of AF_INET6
57ef692588bc225853ca3267ca5b7cea2b07e058 NLM: Rewrite IPv4 privileged requester's check
d1208f70738c91f13b4eadb1b7a694082e439da2 NLM: nlm_privileged_requester() doesn't recognize mapped loopback address
49b5699b3fc22b363534c509c1b7dba06bc677bf NSM: Move nsm_create()
b7ba597fb964dfa44284904b3b3d74d44b8e1c42 NSM: Move nsm_use_hostnames to mon.c
8529bc51d30b8f001734b29b21a51b579c260f5b NSM: Move nsm_addr() to fs/lockd/mon.c
e6765b83977f07983c7a10e6bbb19d6c7bbfc3a4 NSM: Remove include/linux/lockd/sm_inter.h
94da7663db26530a8377f7219f8be8bd4d4822c2 NSM: Replace IP address as our nlm_reboot lookup key
77a3ef33e2de6fc8aabd7cb1700bfef81757c28a NSM: More clean up of nsm_get_handle()
b39b897c259fc1fd1998505f2b1d4ec1f115bce1 NSM: Refactor nsm_handle creation into a helper function
92fd91b998a5216a6d6606704e71d541a180216c NLM: Remove "create" argument from nsm_find()
8c7378fd2a5f22016542931b887a2ae98d146eaf NLM: Call nsm_reboot_lookup() instead of nsm_find()
3420a8c4359a189f7d854ed7075d151257415447 NSM: Add nsm_lookup() function
576df4634e37e46b441fefb91915184edb13bb94 NLM: Decode "priv" argument of NLMPROC_SM_NOTIFY as an opaque
7fefc9cb9d5f129c238d93166f705c96ca2e7e51 NLM: Change nlm_host_rebooted() to take a single nlm_reboot argument
cab2d3c99165abbba2943f1b269003b17fd3b1cb NSM: Encode the new "priv" cookie for NSMPROC_MON requests
7e44d3bea21fbb9494930d1cd35ca92a9a4a3279 NSM: Generate NSMPROC_MON's "priv" argument when nsm_handle is created
05f3a9af58180d24a9decedd71d4587935782d70 NSM: Remove !nsm check from nsm_release()
bc1cc6c4e476b60df48227165990c87a22db6bb7 NSM: Remove NULL pointer check from nsm_find()
5cf1c4b19db99d21d44c2ab457cfd44eb86b4439 NSM: Add dprintk() calls in nsm_find and nsm_release
67c6d107a689243979a2b5f15244b5261634a924 NSM: Move nsm_find() to fs/lockd/mon.c
03eb1dcbb799304b58730f4dba65812f49fb305e NSM: move to xdr_stream-based XDR encoders and decoders
36e8e668d3e6a61848a8921ddeb663b417299fa5 NSM: Move NSM program and procedure numbers to fs/lockd/mon.c
9c1bfd037f7ff8badaecb47418f109148d88bf45 NSM: Move NSM-related XDR data structures to lockd's xdr.h
0c7aef4569f8680951b7dee01dddffb9d2f809ff NSM: Check result of SM_UNMON upcall
356c3eb466fd1a12afd6448d90fba3922836e5f1 NLM: Move the public declaration of nsm_unmonitor() to lockd.h
c8c23c423dec49cb439697d3dc714e1500ff1610 NSM: Release nsmhandle in nlm_destroy_host
1e49323c4ab044d05bbc68cf13cadcbd4372468c NLM: Move the public declaration of nsm_monitor() to lockd.h
5d254b119823658cc318f88589c6c426b3d0a153 NSM: Make sure to return an error if the SM_MON call result is not zero
5bc74bef7c9b652f0f2aa9c5a8d5ac86881aba79 NSM: Remove BUG_ON() in nsm_monitor()
501c1ed3fb5c2648ba1709282c71617910917f66 NLM: Remove redundant printk() in nlmclnt_lock()
9fee49024ed19d849413df4ab6ec1a1a60aaae94 NSM: Use sm_name instead of h_name in nsm_monitor() and nsm_unmonitor()
29ed1407ed81086b778ebf12145b048ac3f7e10e NSM: Support IPv6 version of mon_name
f47534f7f0ac7727e05ec4274b764b181df2cf7f NSM: Use modern style for sm_name field in nsm_handle
5acf43155d1bcc412d892c73f64044f9a826cde6 NSM: convert printk(KERN_DEBUG) to a dprintk()
a4846750f090702e2fb848ac4fe5827bcef34060 NSM: Use C99 structure initializer to initialize nsm_args
afb03699dc0a920aed3322ad0e6895533941fb1e NLM: Add helper to handle IPv4 addresses
bc995801a09d1fead0bec1356bfd836911c8eed7 NLM: Support IPv6 scope IDs in nlm_display_address()
6999fb4016b2604c2f8a65586bba4a62a4b24ce7 NLM: Remove AF_UNSPEC arm in nlm_display_address()
1df40b609ad5a622904eb652109c287fe9c93ec5 NLM: Remove address eye-catcher buffers from nlm_host
7538ce1eb656a1477bedd5b1c202226e7abf5e7b NLM: Use modern style for pointer fields in nlm_host
c72a476b4b7ecadb80185de31236edb303c1a5d0 lockd: set svc_serv->sv_maxconn to a more reasonable value (try #3)
c9233eb7b0b11ef176d4bf68da2ce85464b6ec39 sunrpc: add sv_maxconn field to svc_serv (try #3)
548eaca46b3cf4419b6c2be839a106d8641ffb70 nfsd: document new filehandle fsid types
2bd9e7b62e6e1da3f881c40c73d93e9a212ce6de nfsd: Fix leaked memory in nfs4_make_rec_clidname
9346eff0dea1e5855fba25c9fe639d92a4db3135 nfsd: Minor cleanup of find_stateid
b3d47676d474ecd914c72049c87e71e5f0ffe040 nfsd: update fh_verify description


--
Ian Campbell

You're definitely on their list. The question to ask next is what list it is.



2009-01-08 21:26:38

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Thu, Jan 08, 2009 at 09:22:33PM +0000, Ian Campbell wrote:
> On Thu, 2009-01-08 at 13:20 -0500, J. Bruce Fields wrote:
> > On Wed, Jan 07, 2009 at 05:21:15PM -0500, J. Bruce Fields wrote:
> > > On Tue, Dec 16, 2008 at 06:39:35PM +0000, Ian Campbell wrote:
> > > > That's right, it was actually 2.6.26.7 FWIW.
> > > >
> > > > > I'll try to take a look at these before I leave for the holidays,
> > > > > assuming the versions Trond posted on Nov. 30 are the latest.
> > > >
> > > > Thanks.
> > >
> > > Sorry for getting behind.
> > >
> > > If you got a chance to retest with the for-2.6.29 branch at
> > >
> > > git://linux-nfs.org/~bfields/linux.git for-2.6.29
> > >
> > > that'd be great; that's what I intend to send to Linus.
> >
> > (Merged now, so testing mainline as of today should work too.)
>
> The server isn't really a machine I want to test random kernels on, is
> there some subset of those changesets which it would be useful for me to
> pull back onto the 2.6.26 kernel I'm using to test? (I can most like
> manage the backporting myself).
>
> These two look like the relevant ones to me but I'm not sure:
> 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
> 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion
>
> I think 69b6 was in the set of three I tested previously and the other
> two turned into 2294?

Yep, exactly. --b.

>
> Ian.
>
> Full list 238c6d54830c624f34ac9cf123ac04aebfca5013..nfs-bfields/for-2.6.29:
> db43910cb42285a99f45f7e0a0a32e32d0b61dcf nfsd: get rid of NFSD_VERSION
> 87df4de8073f922a1f643b9fa6ba0412d5529ecf nfsd: last_byte_offset
> 4e65ebf08951326709817e654c149d0a94982e01 nfsd: delete wrong file comment from nfsd/nfs4xdr.c
> df96fcf02a5fd2ae4e9b09e079dd6ef12d10ecd7 nfsd: git rid of nfs4_cb_null_ops declaration
> 0407717d8587f60003f4904bff27650cd836c00c nfsd: dprint each op status in nfsd4_proc_compound
> b7aeda40d3010666d2c024c80557b6aa92a1a1ad nfsd: add etoosmall to nfserrno
> 30fa8c0157e4591ee2227aaa0b17cd3b0da5e6cb NFSD: FIDs need to take precedence over UUIDs
> 24c3767e41a6a59d32bb45abe899eb194e6bf1b8 SUNRPC: The sunrpc server code should not be used by out-of-tree modules
> 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
> 9a8d248e2d2e9c880ac4561f27fea5dc200655bd nfsd: fix double-locks of directory mutex
> 2779e3ae39645515cb6c1126634f47c28c9e7190 svc: Move kfree of deferral record to common code
> f05ef8db1abe68e3f6fc272efee51bc54ce528c5 CRED: Fix NFSD regression
> 0dba7c2a9ed3d4a1e58f5d94fffa9f44dbe012e6 NLM: Clean up flow of control in make_socks() function
> d3fe5ea7cf815c037c90b1f1464ffc1ab5e8601b NLM: Refactor make_socks() function
> 55ef1274dddd4de387c54d110e354ffbb6cdc706 nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT
> 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion
> 262a09823bb07c6aafb6c1d312cde613d0b90c85 NFSD: Add documenting comments for nfsctl interface
> 9e074856caf13ba83363f73759f5e395f74ccf41 NFSD: Replace open-coded integer with macro
> 54224f04ae95d86b27c0673cd773ebb120d86876 NFSD: Fix a handful of coding style issues in write_filehandle()
> b046ccdc1f8171f6d0129dcc2a28d49187b4bf69 NFSD: clean up failover sysctl function naming
> b064ec038a6180b13e5f89b6a30b42cb5ce8febc lockd: Enable NLM use of AF_INET6
> 57ef692588bc225853ca3267ca5b7cea2b07e058 NLM: Rewrite IPv4 privileged requester's check
> d1208f70738c91f13b4eadb1b7a694082e439da2 NLM: nlm_privileged_requester() doesn't recognize mapped loopback address
> 49b5699b3fc22b363534c509c1b7dba06bc677bf NSM: Move nsm_create()
> b7ba597fb964dfa44284904b3b3d74d44b8e1c42 NSM: Move nsm_use_hostnames to mon.c
> 8529bc51d30b8f001734b29b21a51b579c260f5b NSM: Move nsm_addr() to fs/lockd/mon.c
> e6765b83977f07983c7a10e6bbb19d6c7bbfc3a4 NSM: Remove include/linux/lockd/sm_inter.h
> 94da7663db26530a8377f7219f8be8bd4d4822c2 NSM: Replace IP address as our nlm_reboot lookup key
> 77a3ef33e2de6fc8aabd7cb1700bfef81757c28a NSM: More clean up of nsm_get_handle()
> b39b897c259fc1fd1998505f2b1d4ec1f115bce1 NSM: Refactor nsm_handle creation into a helper function
> 92fd91b998a5216a6d6606704e71d541a180216c NLM: Remove "create" argument from nsm_find()
> 8c7378fd2a5f22016542931b887a2ae98d146eaf NLM: Call nsm_reboot_lookup() instead of nsm_find()
> 3420a8c4359a189f7d854ed7075d151257415447 NSM: Add nsm_lookup() function
> 576df4634e37e46b441fefb91915184edb13bb94 NLM: Decode "priv" argument of NLMPROC_SM_NOTIFY as an opaque
> 7fefc9cb9d5f129c238d93166f705c96ca2e7e51 NLM: Change nlm_host_rebooted() to take a single nlm_reboot argument
> cab2d3c99165abbba2943f1b269003b17fd3b1cb NSM: Encode the new "priv" cookie for NSMPROC_MON requests
> 7e44d3bea21fbb9494930d1cd35ca92a9a4a3279 NSM: Generate NSMPROC_MON's "priv" argument when nsm_handle is created
> 05f3a9af58180d24a9decedd71d4587935782d70 NSM: Remove !nsm check from nsm_release()
> bc1cc6c4e476b60df48227165990c87a22db6bb7 NSM: Remove NULL pointer check from nsm_find()
> 5cf1c4b19db99d21d44c2ab457cfd44eb86b4439 NSM: Add dprintk() calls in nsm_find and nsm_release
> 67c6d107a689243979a2b5f15244b5261634a924 NSM: Move nsm_find() to fs/lockd/mon.c
> 03eb1dcbb799304b58730f4dba65812f49fb305e NSM: move to xdr_stream-based XDR encoders and decoders
> 36e8e668d3e6a61848a8921ddeb663b417299fa5 NSM: Move NSM program and procedure numbers to fs/lockd/mon.c
> 9c1bfd037f7ff8badaecb47418f109148d88bf45 NSM: Move NSM-related XDR data structures to lockd's xdr.h
> 0c7aef4569f8680951b7dee01dddffb9d2f809ff NSM: Check result of SM_UNMON upcall
> 356c3eb466fd1a12afd6448d90fba3922836e5f1 NLM: Move the public declaration of nsm_unmonitor() to lockd.h
> c8c23c423dec49cb439697d3dc714e1500ff1610 NSM: Release nsmhandle in nlm_destroy_host
> 1e49323c4ab044d05bbc68cf13cadcbd4372468c NLM: Move the public declaration of nsm_monitor() to lockd.h
> 5d254b119823658cc318f88589c6c426b3d0a153 NSM: Make sure to return an error if the SM_MON call result is not zero
> 5bc74bef7c9b652f0f2aa9c5a8d5ac86881aba79 NSM: Remove BUG_ON() in nsm_monitor()
> 501c1ed3fb5c2648ba1709282c71617910917f66 NLM: Remove redundant printk() in nlmclnt_lock()
> 9fee49024ed19d849413df4ab6ec1a1a60aaae94 NSM: Use sm_name instead of h_name in nsm_monitor() and nsm_unmonitor()
> 29ed1407ed81086b778ebf12145b048ac3f7e10e NSM: Support IPv6 version of mon_name
> f47534f7f0ac7727e05ec4274b764b181df2cf7f NSM: Use modern style for sm_name field in nsm_handle
> 5acf43155d1bcc412d892c73f64044f9a826cde6 NSM: convert printk(KERN_DEBUG) to a dprintk()
> a4846750f090702e2fb848ac4fe5827bcef34060 NSM: Use C99 structure initializer to initialize nsm_args
> afb03699dc0a920aed3322ad0e6895533941fb1e NLM: Add helper to handle IPv4 addresses
> bc995801a09d1fead0bec1356bfd836911c8eed7 NLM: Support IPv6 scope IDs in nlm_display_address()
> 6999fb4016b2604c2f8a65586bba4a62a4b24ce7 NLM: Remove AF_UNSPEC arm in nlm_display_address()
> 1df40b609ad5a622904eb652109c287fe9c93ec5 NLM: Remove address eye-catcher buffers from nlm_host
> 7538ce1eb656a1477bedd5b1c202226e7abf5e7b NLM: Use modern style for pointer fields in nlm_host
> c72a476b4b7ecadb80185de31236edb303c1a5d0 lockd: set svc_serv->sv_maxconn to a more reasonable value (try #3)
> c9233eb7b0b11ef176d4bf68da2ce85464b6ec39 sunrpc: add sv_maxconn field to svc_serv (try #3)
> 548eaca46b3cf4419b6c2be839a106d8641ffb70 nfsd: document new filehandle fsid types
> 2bd9e7b62e6e1da3f881c40c73d93e9a212ce6de nfsd: Fix leaked memory in nfs4_make_rec_clidname
> 9346eff0dea1e5855fba25c9fe639d92a4db3135 nfsd: Minor cleanup of find_stateid
> b3d47676d474ecd914c72049c87e71e5f0ffe040 nfsd: update fh_verify description
>
>
> --
> Ian Campbell
>
> You're definitely on their list. The question to ask next is what list it is.

2009-01-12 09:46:36

by Ian Campbell

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Thu, 2009-01-08 at 16:26 -0500, J. Bruce Fields wrote:
> On Thu, Jan 08, 2009 at 09:22:33PM +0000, Ian Campbell wrote:

> > These two look like the relevant ones to me but I'm not sure:
> > 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred requests on transport destruction
> > 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server closes sockets in a timely fashion
> >
> > I think 69b6 was in the set of three I tested previously and the other
> > two turned into 2294?
>
> Yep, exactly.--b.

OK, I have patched my 2.6.26 kernel with these and it is now running.
It'll be about a week before I can say with any certainty that the issue
hasn't recurred.

Ian.
--
Ian Campbell
Current Noise: Pitchshifter - Subject To Status

Do I have a lifestyle yet?

2009-01-22 08:28:06

by Ian Campbell

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Thu, 2009-01-08 at 16:26 -0500, J. Bruce Fields wrote:
>
> > > (Merged now, so testing mainline as of today should work too.)
> >
> > The server isn't really a machine I want to test random kernels on,
> is
> > there some subset of those changesets which it would be useful for
> me to
> > pull back onto the 2.6.26 kernel I'm using to test? (I can most like
> > manage the backporting myself).
> >
> > These two look like the relevant ones to me but I'm not sure:
> > 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred
> requests on transport destruction
> > 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server
> closes sockets in a timely fashion
> >
> > I think 69b6 was in the set of three I tested previously and the
> other
> > two turned into 2294?
>
> Yep, exactly.--b.

The client machine now has an uptime of ten days without error after
these two patches were applied to the server.

Thanks everybody,
Ian.

--
Ian Campbell

I used to think that the brain was the most wonderful organ in
my body. Then I realized who was telling me this.
-- Emo Phillips



2009-01-22 16:44:44

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/3] NFS regression in 2.6.26?, "task blocked for more than 120 seconds"

On Thu, Jan 22, 2009 at 08:27:40AM +0000, Ian Campbell wrote:
> On Thu, 2009-01-08 at 16:26 -0500, J. Bruce Fields wrote:
> >
> > > > (Merged now, so testing mainline as of today should work too.)
> > >
> > > The server isn't really a machine I want to test random kernels on,
> > is
> > > there some subset of those changesets which it would be useful for
> > me to
> > > pull back onto the 2.6.26 kernel I'm using to test? (I can most like
> > > manage the backporting myself).
> > >
> > > These two look like the relevant ones to me but I'm not sure:
> > > 22945e4a1c7454c97f5d8aee1ef526c83fef3223 svc: Clean up deferred
> > requests on transport destruction
> > > 69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a SUNRPC: Ensure the server
> > closes sockets in a timely fashion
> > >
> > > I think 69b6 was in the set of three I tested previously and the
> > other
> > > two turned into 2294?
> >
> > Yep, exactly.--b.
>
> The client machine now has an uptime of ten days without error after
> these two patches were applied to the server.
>
> Thanks everybody,

Very good, so upstream should be OK. Thanks for the testing!

--b.