2010-08-25 16:12:34

by davidr

[permalink] [raw]
Subject: fsync/wb deadlocks in 2.6.32

Hello all,

I have ~100 nfs clients running Ubuntu 10.04 LTS, and under moderate and
heavy v3 write loads, I will periodically get deadlocks in nfs_do_fsync().
Unfortunately, it's rare enough that I've not been able to come up with
a test case that works reliably. The usage pattern looks like this:

1. 8 jobs are started on each of 100 nodes (each node has 8 cores)
2. These jobs stat(), read() and close() unique files of size 10-20MB
on the source NFS filesystem.
3. They open(), write(), and close() the files on the target NFS
filesystem (not the same as the source filesystem). Occasionally, the
clients will insert a mkdir() before the open().
4. Steps 2-3 are repeated for a total of ~20m files (as in, all clients
copy a total of ~20m files cumulatively). A rough sketch of this loop
in code follows.
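
A minimal, hypothetical reproducer of the per-file loop (paths, buffer
size, and error handling are illustrative, not our actual job code):

/* Hypothetical sketch of one job's copy loop (steps 2-3 above);
 * not our actual code. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static void copy_one(const char *src, const char *dst)
{
	char buf[1 << 16];
	ssize_t n;
	struct stat st;

	if (stat(src, &st) < 0) {	/* step 2: stat() the source */
		perror("stat");
		return;
	}
	int in = open(src, O_RDONLY);	/* step 2: read() the source */
	int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644); /* step 3 */
	if (in >= 0 && out >= 0) {
		while ((n = read(in, buf, sizeof(buf))) > 0)
			if (write(out, buf, (size_t)n) != n) {
				perror("write");
				break;
			}
	}
	if (in >= 0)
		close(in);
	if (out >= 0)
		close(out);	/* close() of the target file is where the
				 * nfs_do_fsync() hang below shows up */
}

int main(int argc, char **argv)
{
	if (argc == 3)
		copy_one(argv[1], argv[2]);
	return 0;
}

Eight of these run concurrently per node, each over its own list of
10-20MB files.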

After an hour or two, at least one of these nodes gives a series of these
messages:

[88792.122324] INFO: task awk:7184 blocked for more than 120 seconds.
[88792.122643] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[88792.122990] python2.6 D 0000000000000000 0 7184 7150 0x00000000
[88792.122992] ffff8806313cfb78 0000000000000046 0000000000015bc0
0000000000015bc0
[88792.122995] ffff8806267483c0 ffff8806313cffd8 0000000000015bc0
ffff880626748000
[88792.122997] 0000000000015bc0 ffff8806313cffd8 0000000000015bc0
ffff8806267483c0
[88792.122999] Call Trace:
[88792.123010] [<ffffffffa02a82b0>] ?
nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[88792.123014] [<ffffffff8153ebb7>] io_schedule+0x47/0x70
[88792.123019] [<ffffffffa02a82be>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
[88792.123021] [<ffffffff8153f40f>] __wait_on_bit+0x5f/0x90
[88792.123027] [<ffffffffa02a82b0>] ?
nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
[88792.123029] [<ffffffff8153f4b8>] out_of_line_wait_on_bit+0x78/0x90
[88792.123033] [<ffffffff81085360>] ? wake_bit_function+0x0/0x40
[88792.123038] [<ffffffffa02a829f>] nfs_wait_on_request+0x2f/0x40 [nfs]
[88792.123044] [<ffffffffa02ac6af>] nfs_wait_on_requests_locked+0x7f/0xd0 [nfs]
[88792.123051] [<ffffffffa02adaee>] nfs_sync_mapping_wait+0x9e/0x1a0 [nfs]
[88792.123057] [<ffffffffa02aded9>] nfs_write_mapping+0x79/0xb0 [nfs]
[88792.123061] [<ffffffff81155d9f>] ? __d_free+0x3f/0x60
[88792.123063] [<ffffffff8115e4c0>] ? mntput_no_expire+0x30/0x110
[88792.123069] [<ffffffffa02adf47>] nfs_wb_all+0x17/0x20 [nfs]
[88792.123073] [<ffffffffa029ceba>] nfs_do_fsync+0x2a/0x60 [nfs]
[88792.123077] [<ffffffffa029d105>] nfs_file_flush+0x75/0xa0 [nfs]
[88792.123079] [<ffffffff8114051c>] filp_close+0x3c/0x90
[88792.123082] [<ffffffff81068d8f>] put_files_struct+0x7f/0xf0
[88792.123084] [<ffffffff81068e54>] exit_files+0x54/0x70
[88792.123086] [<ffffffff8106b3ab>] do_exit+0x14b/0x380
[88792.123088] [<ffffffff8106b635>] do_group_exit+0x55/0xd0
[88792.123089] [<ffffffff8106b6c7>] sys_exit_group+0x17/0x20
[88792.123092] [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b

At that point, all writing processes on the client go into iowait and
never return until the client is rebooted. In any given 24-hour period,
usually no more than 5 of my clients will exhibit this problem, and
frequently it's only 1 or 2 (although not the same ones from test to
test).

I tried Ubuntu kernels 2.6.32.24.25 and 2.6.32.24.41, as well as a
stock kernel.org build of 2.6.32.18, none of which made any noticeable
difference.

Here are the current mount options:
async,nocto,proto=udp,auto,intr,noatime,nodiratime, \
rsize=32768,rw,vers=3,wsize=32768

I've tried tcp/udp, cto/nocto (i.e., grasping at straws), and none of
those options appear to have any effect either.

As far as I can tell, the problem appears to be unrelated to the
NFS server. We've seen these hangs while writing to a RHEL server
(2.6.18-92.1.22.el5) as well as an F5 ARX NFS proxy.

If anyone has seen this before, knows what it is, or needs more info
from me, please let me know.

Thanks,

David


2010-08-25 17:20:22

by davidr

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

On Wed, Aug 25, 2010 at 12:07 PM, Joe Landman
<[email protected]> wrote:
>> [88792.122324] INFO: task awk:7184 blocked for more than 120 seconds.
>
> Did you get an "nfs server not responding" type message in the logs?

Unfortunately not; there are no messages like that on any of the
clients, deadlocked or not. From an I/O perspective, the NFS servers
have capacity and bandwidth to spare. We haven't been able to identify
any load-related causes for this, although that's not to say they're
not there. We *have* seen this problem under a lighter load, with just
4-8 servers running the same 8 processes per server, but only rarely,
which I suspect is simply down to the rarity of the problem itself.

This happens over both GigE and IPoIB (QDR).

2010-08-27 14:16:08

by davidr

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

Hi all,

I'm guessing this is uncommon and nobody here has seen it. One of my
friends looked through the list archives and discovered commit
0702099bd86c33c2dcdbd3963433a61f3f503901, which looked relevant. I
backported it to 2.6.32.18 (if you can call anything involving a
one-line patch "backporting" :) ), and the problem has not yet returned.

That said, I'm not sure whether this actually corrects the problem or
just pushes it deeper, into a place where it won't hang the host but
is still unsafe, since the original commit was against 2.6.35+. Any
comments?

Thanks!

David



--- linux-2.6.32.18.orig/fs/nfs/file.c	2010-08-10 12:45:57.000000000 -0500
+++ linux-2.6.32.18/fs/nfs/file.c	2010-08-20 10:15:37.608665292 -0500
@@ -220,7 +220,7 @@ static int nfs_do_fsync(struct nfs_open_
 	have_error |= test_bit(NFS_CONTEXT_ERROR_WRITE, &ctx->flags);
 	if (have_error)
 		ret = xchg(&ctx->error, 0);
-	if (!ret)
+	if (!ret && status < 0)
 		ret = status;
 	return ret;
 }
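
For anyone curious why a one-line guard matters: as I understand it,
the writeback path can legitimately return a positive value rather
than 0-or-negative-errno, and without the "status < 0" check that
positive value leaks out of nfs_do_fsync() as if it were a meaningful
return. A standalone toy illustration of the pattern (not kernel code;
all names here are made up):

#include <stdio.h>

/* Toy stand-in for the writeback call: may return a positive
 * count on success instead of 0-or-negative-errno. */
static int do_writeback(void)
{
	return 3;
}

static int do_fsync_buggy(void)
{
	int ret = 0;
	int status = do_writeback();

	if (!ret)
		ret = status;	/* bug: a positive status escapes */
	return ret;
}

static int do_fsync_fixed(void)
{
	int ret = 0;
	int status = do_writeback();

	if (!ret && status < 0)	/* fix: only real errors propagate */
		ret = status;
	return ret;
}

int main(void)
{
	printf("buggy: %d, fixed: %d\n",
	       do_fsync_buggy(), do_fsync_fixed());
	return 0;
}

Whether that stray positive return is also what wedges the writeback
wait here, I can't say; hence my question above.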

by Christoph Lameter

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

We got another instance yesterday. A large number of machines
simultaneously started spewing the kernel messages. Had to reboot to get
the state cleared up. Very very annoying. Anyone know a reason for these
writeback deadlocks?


2010-08-25 17:17:36

by Joe Landman

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

[email protected] wrote:
> Hello all,
>
> I have ~100 nfs clients running Ubuntu 10.04 LTS, and under moderate and
> heavy v3 write loads, I will periodically get deadlocks in nfs_do_fsync().
> Unfortunately, it's rare enough that I've not been able to come up with
> a test case that works reliably. The usage pattern looks like this:
>
> 1. 8 jobs are started on each of 100 nodes (each node has 8 cores)
> 2. These jobs stat(), read() and close() unique files of size 10-20MB
> on the source NFS filesystem.
> 3. They open(), write(), and close() the files on the target NFS
> filesystem (not the same as the source filesystem). Occasionally, the
> clients will insert a mkdir() before the open().
> 4. Steps 2-3 are repeated for a total of ~20m files (as in, all clients copy a
> total of 20m files cumulatively)
>
> After an hour or two, at least one of these nodes gives a series of these
> messages:
>
> [88792.122324] INFO: task awk:7184 blocked for more than 120 seconds.

Did you get an "nfs server not responding" type message in the logs?

[...]

> Here are the current mount options:
> async,nocto,proto=udp,auto,intr,noatime,nodiratime, \
> rsize=32768,rw,vers=3,wsize=32768
>
> I've tried tcp/udp, cto/nocto (i.e., grasping at straws), and none of
> those options appear to have any effect either.

If you do get the "nfs server not responding" message, you might switch to

proto=tcp,retrans=10

or similar (try varying retrans=N if you just want to play with that).

> As far as I can tell, the problem appears to be unrelated to the
> NFS server. We've seen these hangs while writing to a RHEL server
> (2.6.18-92.1.22.el5) as well as an F5 ARX NFS proxy.

Is it possible that a server or switch was overloaded during this
interval?

> If anyone has seen this before, knows what it is, or needs more info
> from me, please let me know.

We've seen things like this, usually under heavy load, when we hit a
spot of resource contention (usually the network or the server being
too busy).



>
> Thanks,
>
> David


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: [email protected]
web : http://scalableinformatics.com
http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615

2010-08-30 16:19:31

by Kian Mohageri

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

On Mon, Aug 30, 2010 at 9:04 AM, Christoph Lameter <[email protected]> wrote:
> On Fri, 27 Aug 2010, Kian Mohageri wrote:
>
>> Just happened upon this message.  My symptoms are a little different,
>> however, and I'm still investigating the possibility of a faulty drive
>> on the NFS server... but thought I'd chime in anyway:
>
> It's a bit troublesome that a faulty drive on an NFS server could cause
> kernel backtraces to show up on the NFS client. The faulty NFS server
> should also give you some indication that there are issues with the drive.
> Does it?
>

Some other messages in the logs on the NFS server pointed me to the
possibility of disk failure, for example (there are more instances of
similar messages, and they correspond to times when I see NFS
problems):

Aug 24 08:17:51 www01 kernel: [143799.812353] ata3.00: configured for UDMA/133
Aug 24 08:17:51 www01 kernel: [143799.812365] ata3: EH complete
Aug 24 08:17:58 www01 kernel: [143806.844363] ata3.00: configured for UDMA/133
Aug 24 08:17:58 www01 kernel: [143806.844372] ata3: EH complete
Aug 24 08:18:05 www01 kernel: [143813.868368] ata3.00: configured for UDMA/133
Aug 24 08:18:05 www01 kernel: [143813.868382] sd 2:0:0:0: [sda]
Unhandled sense code
Aug 24 08:18:05 www01 kernel: [143813.868383] sd 2:0:0:0: [sda]
Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Aug 24 08:18:05 www01 kernel: [143813.868386] sd 2:0:0:0: [sda] Sense
Key : Medium Error [current] [descriptor]
Aug 24 08:18:05 www01 kernel: [143813.868390] Descriptor sense data
with sense descriptors (in hex):
Aug 24 08:18:05 www01 kernel: [143813.868392] 72 03 11 04 00
00 00 0c 00 0a 80 00 00 00 00 00
Aug 24 08:18:05 www01 kernel: [143813.868398] 03 41 18 c8
Aug 24 08:18:05 www01 kernel: [143813.868400] sd 2:0:0:0: [sda] Add.
Sense: Unrecovered read error - auto reallocate failed
Aug 24 08:18:05 www01 kernel: [143813.868404] sd 2:0:0:0: [sda] CDB:
Read(10): 28 00 03 41 18 c8 00 00 08 00
Aug 24 08:18:05 www01 kernel: [143813.868456] ata3: EH complete
Aug 24 08:18:12 www01 kernel: [143820.892365] ata3.00: configured for UDMA/133
Aug 24 08:18:12 www01 kernel: [143820.892375] ata3: EH complete
Aug 24 08:18:19 www01 kernel: [143827.917368] ata3.00: configured for UDMA/133
Aug 24 08:18:19 www01 kernel: [143827.917381] ata3: EH complete
Aug 24 08:18:26 www01 kernel: [143834.940364] ata3.00: configured for UDMA/133
Aug 24 08:18:26 www01 kernel: [143834.940378] ata3: EH complete
Aug 24 08:18:33 www01 kernel: [143841.964365] ata3.00: configured for UDMA/133
Aug 24 08:18:33 www01 kernel: [143841.964372] ata3: EH complete
Aug 24 08:18:41 www01 kernel: [143848.992358] ata3.00: configured for UDMA/133
Aug 24 08:18:41 www01 kernel: [143848.992374] ata3: EH complete
Aug 24 08:18:48 www01 kernel: [143856.016368] ata3.00: configured for UDMA/133
Aug 24 08:18:48 www01 kernel: [143856.016381] sd 2:0:0:0: [sda]
Unhandled sense code
Aug 24 08:18:48 www01 kernel: [143856.016383] sd 2:0:0:0: [sda]
Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Aug 24 08:18:48 www01 kernel: [143856.016386] sd 2:0:0:0: [sda] Sense
Key : Medium Error [current] [descriptor]
Aug 24 08:18:48 www01 kernel: [143856.016389] Descriptor sense data
with sense descriptors (in hex):
Aug 24 08:18:48 www01 kernel: [143856.016391] 72 03 11 04 00
00 00 0c 00 0a 80 00 00 00 00 00
Aug 24 08:18:48 www01 kernel: [143856.016397] 03 ca d8 a0
Aug 24 08:18:48 www01 kernel: [143856.016400] sd 2:0:0:0: [sda] Add.
Sense: Unrecovered read error - auto reallocate failed
Aug 24 08:18:48 www01 kernel: [143856.016403] sd 2:0:0:0: [sda] CDB:
Read(10): 28 00 03 ca d8 a0 00 00 08 00
Aug 24 08:18:48 www01 kernel: [143856.016459] ata3: EH complete
Aug 24 08:18:55 www01 kernel: [143863.040364] ata3.00: configured for UDMA/133
Aug 24 08:18:55 www01 kernel: [143863.040374] ata3: EH complete
Aug 24 08:19:02 www01 kernel: [143870.064363] ata3.00: configured for UDMA/133
Aug 24 08:19:02 www01 kernel: [143870.064379] ata3: EH complete
Aug 24 08:19:09 www01 kernel: [143877.088360] ata3.00: configured for UDMA/133
Aug 24 08:19:09 www01 kernel: [143877.088376] ata3: EH complete
Aug 24 08:19:12 www01 kernel: [143880.704093] kjournald D
0000000000000002 0 309 2 0x00000000
Aug 24 08:19:12 www01 kernel: [143880.704097] ffff88012fad8710
0000000000000046 0000000000000002 0000000000015640
Aug 24 08:19:12 www01 kernel: [143880.704101] 0000000000015640
0000000000015640 000000000000f8a0 ffff88012bcbdfd8
Aug 24 08:19:12 www01 kernel: [143880.704104] 0000000000015640
0000000000015640 ffff88012bccb170 ffff88012bccb468
Aug 24 08:19:12 www01 kernel: [143880.704107] Call Trace:
Aug 24 08:19:12 www01 kernel: [143880.704116] [<ffffffff8103fe62>] ?
update_curr+0xa6/0x147
Aug 24 08:19:12 www01 kernel: [143880.704121] [<ffffffff810170d9>] ?
read_tsc+0xa/0x20
Aug 24 08:19:12 www01 kernel: [143880.704125] [<ffffffff8110d2f8>] ?
sync_buffer+0x0/0x40
Aug 24 08:19:12 www01 kernel: [143880.704129] [<ffffffff812f9549>] ?
io_schedule+0x73/0xb7
Aug 24 08:19:12 www01 kernel: [143880.704132] [<ffffffff8110d333>] ?
sync_buffer+0x3b/0x40
Aug 24 08:19:12 www01 kernel: [143880.704134] [<ffffffff812f9a56>] ?
__wait_on_bit+0x41/0x70
Aug 24 08:19:12 www01 kernel: [143880.704136] [<ffffffff8110d2f8>] ?
sync_buffer+0x0/0x40
Aug 24 08:19:12 www01 kernel: [143880.704139] [<ffffffff812f9af0>] ?
out_of_line_wait_on_bit+0x6b/0x77
Aug 24 08:19:12 www01 kernel: [143880.704143] [<ffffffff81064b28>] ?
wake_bit_function+0x0/0x23
Aug 24 08:19:12 www01 kernel: [143880.704158] [<ffffffffa01391d1>] ?
journal_commit_transaction+0x508/0xe2b [jbd]
Aug 24 08:19:12 www01 kernel: [143880.704163] [<ffffffff8105a4ac>] ?
lock_timer_base+0x26/0x4b
Aug 24 08:19:12 www01 kernel: [143880.704167] [<ffffffffa013c423>] ?
kjournald+0xdf/0x226 [jbd]
Aug 24 08:19:12 www01 kernel: [143880.704169] [<ffffffff81064afa>] ?
autoremove_wake_function+0x0/0x2e
Aug 24 08:19:12 www01 kernel: [143880.704173] [<ffffffffa013c344>] ?
kjournald+0x0/0x226 [jbd]
Aug 24 08:19:12 www01 kernel: [143880.704176] [<ffffffff8106482d>] ?
kthread+0x79/0x81
Aug 24 08:19:12 www01 kernel: [143880.704179] [<ffffffff81011baa>] ?
child_rip+0xa/0x20
Aug 24 08:19:12 www01 kernel: [143880.704181] [<ffffffff810647b4>] ?
kthread+0x0/0x81
Aug 24 08:19:12 www01 kernel: [143880.704183] [<ffffffff81011ba0>] ?
child_rip+0x0/0x20


I'm still running diagnostics on the disk, but SMART did complain
about at least 1 thing:

Currently unreadable (pending) sectors detected:
/dev/sda [SAT] - 48 Time(s)
5 unreadable sectors detected

The numbers are all within their "safe" ranges, though, and I ran an
extended test last night which the drive passed :\ Of course, hardware
and software don't always fail predictably, but the server ran
seemingly fine all weekend.

Not sure if there's other information that would be valuable, but let
me know and I'll provide what I can if it's of use to anyone.

by Christoph Lameter

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

On Fri, 27 Aug 2010, Kian Mohageri wrote:

> Just happened upon this message. My symptoms are a little different,
> however, and I'm still investigating the possibility of a faulty drive
> on the NFS server... but thought I'd chime in anyway:

It's a bit troublesome that a faulty drive on an NFS server could cause
kernel backtraces to show up on the NFS client. The faulty NFS server
should also give you some indication that there are issues with the drive.
Does it?

Plus this is an async NFS configuration. Why does the NFS server fsync and
wait at all?

2010-08-27 19:01:14

by Kian Mohageri

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

On Fri, Aug 27, 2010 at 7:16 AM, <[email protected]> wrote:
> Hi all,
>
> I'm guessing this is uncommon and nobody here has seen it. One of my
> friends looked through the list archives and discovered commit
> 0702099bd86c33c2dcdbd3963433a61f3f503901, which looked relevant. I
> backported it to 2.6.32.18 (if you can call anything involving a
> one-line patch "backporting" :) ), and the problem has not yet returned.
>

Hi David,

Just happened upon this message. My symptoms are a little different,
however, and I'm still investigating the possibility of a faulty drive
on the NFS server... but thought I'd chime in anyway:

[141720.614673] INFO: task apache2:21298 blocked for more than 120 seconds.
[141720.614704] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[141720.614749] apache2 D 0000000000000000 0 21298 1697 0x00000000
[141720.614752] ffffffff8145b1f0 0000000000000046 0000000000000000
0000000000000000
[141720.614755] ffff88007c80fc98 0000000000000000 000000000000f8a0
ffff88007c80ffd8
[141720.614758] 0000000000015640 0000000000015640 ffff880029b69530
ffff880029b69828
[141720.614761] Call Trace:
[141720.614764] [<ffffffff810bb72a>] ? pagevec_lookup_tag+0x1a/0x21
[141720.614771] [<ffffffffa03f0d54>] ?
nfs_wait_bit_uninterruptible+0x0/0xd [nfs]
[141720.614773] [<ffffffff812f9549>] ? io_schedule+0x73/0xb7
[141720.614780] [<ffffffffa03f0d5d>] ?
nfs_wait_bit_uninterruptible+0x9/0xd [nfs]
[141720.614783] [<ffffffff812f9a56>] ? __wait_on_bit+0x41/0x70
[141720.614785] [<ffffffff81190848>] ? __lookup_tag+0xad/0x11b
[141720.614792] [<ffffffffa03f0d54>] ?
nfs_wait_bit_uninterruptible+0x0/0xd [nfs]
[141720.614795] [<ffffffff812f9af0>] ? out_of_line_wait_on_bit+0x6b/0x77
[141720.614797] [<ffffffff81064b28>] ? wake_bit_function+0x0/0x23
[141720.614805] [<ffffffffa03f4cff>] ? nfs_sync_mapping_wait+0xfa/0x227 [nfs]
[141720.614812] [<ffffffffa03f54e6>] ? nfs_write_mapping+0x69/0x8e [nfs]
[141720.614815] [<ffffffff810cf8f5>] ? remove_vma+0x6b/0x72
[141720.614821] [<ffffffffa03e83af>] ? nfs_do_fsync+0x1c/0x3c [nfs]
[141720.614823] [<ffffffff810ec3ae>] ? filp_close+0x37/0x62
[141720.614826] [<ffffffff8104f768>] ? put_files_struct+0x64/0xc1
[141720.614828] [<ffffffff81051016>] ? do_exit+0x225/0x6b5
[141720.614831] [<ffffffff8105151c>] ? do_group_exit+0x76/0x9d
[141720.614833] [<ffffffff81051555>] ? sys_exit_group+0x12/0x16
[141720.614836] [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[141758.556016] nfs: server 10.20.153.68 not responding, still trying
[141758.556790] nfs: server 10.20.153.68 OK


2 clients + 1 server, all are Debian Squeeze (2.6.32-5-amd64).

Client mount options:
10.20.153.68:/ on /mnt/data type nfs4
(rw,sync,noatime,lookupcache=none,noac,addr=10.20.153.68,clientaddr=10.20.153.70)

Server:
/srv/nfs4
10.20.153.70/26(rw,sync,fsid=0,no_subtree_check,no_root_squash)


ii nfs-common 1:1.2.2-1
NFS support files common to client and serve
ii nfs-kernel-server 1:1.2.2-1
support for NFS kernel server

I'm seeing the same iowait behavior you are, but of course, if it's
the server at fault, I might expect that.

-Kian

2010-09-01 04:20:39

by Myklebust, Trond

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

On Tue, 2010-08-31 at 09:36 -0500, Christoph Lameter wrote:
> We got another instance yesterday. A large number of machines
> simultaneously started spewing the kernel messages. Had to reboot to get
> the state cleared up. Very very annoying. Anyone know a reason for these
> writeback deadlocks?
>

Hi Christoph,

From what I could see in your original email to me, you appear to be
using NFS over UDP on IPoIB on a 2.6.32-18 kernel. Is that still the
case? If so, does anything change if you change to use TCP and/or an
ordinary NIC?

Your link to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/561210
appears to point to the kswapd bug, which should be hacked around in
later 2.6.32 stable versions.

Cheers
Trond

by Christoph Lameter

[permalink] [raw]
Subject: Re: fsync/wb deadlocks in 2.6.32

On Wed, 1 Sep 2010, Trond Myklebust wrote:

> From what I could see in your original email to me, you appear to be
> using NFS over UDP on IPoIB on a 2.6.32-18 kernel. Is that still the
> case? If so, does anything change if you change to use TCP and/or an
> ordinary NIC?

It is still the case. We have changed to udp from tcp and saw no change.
We have not tried to use an Ethernet NIC instead.

> Your link to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/561210
> appears to be the kswapd bug, which should be hacked around in later
> 2.6.32 stable versions.

We are running 2.6.32.18, which should have the fix, but the issue is
still there.

Is there any other patch available?