2009-07-04 13:03:49

by Pekka Enberg

Subject: Re: [Bug 13631] BUG/panic - update_curr

Hi Brad,

[email protected] wrote:
> --- Comment #14 from Brad Plant <[email protected]> 2009-07-03 15:23:06 ---
> Created an attachment (id=22193)
> --> (http://bugzilla.kernel.org/attachment.cgi?id=22193)
> BUG kmalloc-16: Redzone overwritten
>
> (In reply to comment #13)
>> Looking at the bug report, I'd be pretty surprised if this would be a
>> SLUB bug. It seems more likely that there's some memory corruption going
>> on under heavy load and SLAB just happens to have a different layout of
>> slab objects or something.
>>
>> Did you run the test with CONFIG_SLAB_DEBUG, btw?
>
> I tried slub debugging first. I tried to make it crash for a while but of
> course it wouldn't do it when I wanted it to. I had given up on trying to crash
> slub and was just rebooting the node to change the kernel when I hit the
> jackpot.
>
> Does this suggest ocfs2 is corrupting the memory?

Yup, that would be the prime suspect here. Let's cc the ocfs2 developers
and LKML. The redzone corruption report can be found here:

http://bugzilla.kernel.org/attachment.cgi?id=22193

Pekka