Hi all,
I've recently run into an issue with ext4 and delayed allocation where
my system will silently hang just seconds into a dbench run. I first
saw this using 2.6.30.9, but it still happens with 2.6.32-rc8. I've
tried with an SMP x86_64 system, and a UP i386 system. The SMP x86_64
system hangs every time for me within 15 seconds, but the UP i386
system never hangs on me. If I remount with -o nodelalloc on the SMP
system, the hang goes away. There are no messages printed out by the
kernel; it just locks hard. Has anyone seen this before? I'm just
running "dbench 500" to produce this behavior.
-Justin
Hi,
> I've recently run into an issue with ext4 and delayed allocation where
> my system will silently hang just seconds into a dbench run. I first
> saw this using 2.6.30.9, but it still happens with 2.6.32-rc8. I've
> tried with an SMP x86_64 system, and a UP i386 system. The SMP x86_64
> system hangs every time for me within 15 seconds, but the UP i386
> system never hangs on me. If I remount with -o nodelalloc on the SMP
> system, the hang goes away. There are no messages printed out by the
> kernel; it just locks hard. Has anyone seen this before? I'm just
> running "dbench 500" to produce this behavior.
Can you still switch consoles after the system hangs (it's good to
debug this on a text console)? If yes, could you press Alt-Sysrq-w and
take a picture of the console by digital camera or so (take pictures of
as many screens as possible using console scrollback)? Thanks.
Honza
--
Jan Kara <[email protected]>
SuSE CR Labs
On Wed, Dec 9, 2009 at 5:23 PM, Justin Maggard <[email protected]> wrote:
> On Tue, Dec 8, 2009 at 7:18 AM, Jan Kara <[email protected]> wrote:
>> Can you still switch consoles after the system hangs (it's good to
>> debug this on a text console)? If yes, could you press Alt-Sysrq-w and
>> take a picture of the console by digital camera or so (take pictures of
>> as many screens as possible using console scrollback)? Thanks.
>
> Thanks for the response. Console scrollback doesn't seem to work for
> me in that mode. I tried it three times. Two of the times the list
> was empty. The other time, the screen just listed a bunch of rm
> processes.
I have a little more information to add. After noticing the recent
"Fix potential quota deadlock" patch on the mailing list, I figured it
would be worth a shot to try it without quotas enabled. This also
avoids the system hang. I tried applying that patch, but still had
the same symptoms using that kernel. So I'm seeing a consistent
system hang with ext4 when delalloc and quotas are enabled on an SMP
system. With either quotas or delalloc disabled, it doesn't hang.
Both enabled on a single processor system also doesn't hang.
-Justin
Justin Maggard <[email protected]> writes:
> On Wed, Dec 9, 2009 at 5:23 PM, Justin Maggard <[email protected]> wrote:
>> On Tue, Dec 8, 2009 at 7:18 AM, Jan Kara <[email protected]> wrote:
>>> Can you still switch consoles after the system hangs (it's good to
>>> debug this on a text console)? If yes, could you press Alt-Sysrq-w and
>>> take a picture of the console by digital camera or so (take pictures of
>>> as many screens as possible using console scrollback)? Thanks.
>>
>> Thanks for the response. Console scrollback doesn't seem to work for
>> me in that mode. I tried it three times. Two of the times the list
>> was empty. The other time, the screen just listed a bunch of rm
>> processes.
>
> I have a little more information to add. After noticing the recent
> "Fix potential quota deadlock" patch on the mailing list, I figured it
> would be worth a shot to try it without quotas enabled. This also
> avoids the system hang. I tried applying that patch, but still had
> the same symptoms using that kernel. So I'm seeing a consistent
> system hang with ext4 when delalloc and quotas are enabled on an SMP
> system. With either quotas or delalloc disabled, it doesn't hang.
> Both enabled on a single processor system also doesn't hang.
You also may try another patch
"[PATCH] ext4: fix sleep inside spinlock issue aka #14739 V2"
>
> -Justin
On Thu, Dec 10, 2009 at 5:29 PM, Dmitry Monakhov <[email protected]> wrote:
> Justin Maggard <[email protected]> writes:
>> I have a little more information to add. After noticing the recent
>> "Fix potential quota deadlock" patch on the mailing list, I figured it
>> would be worth a shot to try it without quotas enabled. This also
>> avoids the system hang. I tried applying that patch, but still had
>> the same symptoms using that kernel. So I'm seeing a consistent
>> system hang with ext4 when delalloc and quotas are enabled on an SMP
>> system. With either quotas or delalloc disabled, it doesn't hang.
>> Both enabled on a single processor system also doesn't hang.
> You also may try another patch
> "[PATCH] ext4: fix sleep inside spinlock issue aka #14739 V2"
Yes, that patch looks like it did the trick. Thanks!
Justin Maggard <[email protected]> writes:
> On Thu, Dec 10, 2009 at 5:29 PM, Dmitry Monakhov <[email protected]> wrote:
>> Justin Maggard <[email protected]> writes:
>>> I have a little more information to add. After noticing the recent
>>> "Fix potential quota deadlock" patch on the mailing list, I figured it
>>> would be worth a shot to try it without quotas enabled. This also
>>> avoids the system hang. I tried applying that patch, but still had
>>> the same symptoms using that kernel. So I'm seeing a consistent
>>> system hang with ext4 when delalloc and quotas are enabled on an SMP
>>> system. With either quotas or delalloc disabled, it doesn't hang.
>>> Both enabled on a single processor system also doesn't hang.
>> You also may try another patch
>> "[PATCH] ext4: fix sleep inside spinlock issue aka #14739 V2"
>
> Yes, that patch looks like it did the trick. Thanks!
Ohhh... in fact I have to apologize: my patch is slightly wrong.
After i_block_reservation_lock is reacquired the second time,
i_reserved_meta_blocks may have changed, so we have to
add to it instead of assigning:
- EXT4_I(inode)->i_reserved_meta_blocks = mdblocks;
+ EXT4_I(inode)->i_reserved_meta_blocks += md_needed;
As a result, a quota leak is possible under a heavy SMP stress test.
I've already sent a corrected version.
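
To illustrate why the assignment loses updates, here is a minimal
userspace sketch, not the kernel code. It replays one interleaving
deterministically instead of using real threads or locks. Only the names
i_reserved_meta_blocks, mdblocks and md_needed come from the diff above;
struct inode_info, concurrent_reserve() and reserve_with_race() are
hypothetical, and the sketch assumes that mdblocks is a total computed
before the lock was dropped while md_needed is the delta for this call.

```c
/* Hypothetical stand-in for the per-inode reservation state; only the
 * field name i_reserved_meta_blocks is taken from the thread. */
struct inode_info {
    unsigned int i_reserved_meta_blocks;
};

/* Another writer reserves metadata blocks while the lock is dropped. */
static void concurrent_reserve(struct inode_info *ei, unsigned int md_needed)
{
    ei->i_reserved_meta_blocks += md_needed;
}

/*
 * Replay one reservation: take a snapshot "under the lock", let a
 * concurrent writer in while the lock is dropped, then apply the update
 * after the lock is "reacquired". use_add selects the corrected "+="
 * update; otherwise the stale total is assigned.
 * Returns the final value of i_reserved_meta_blocks.
 */
static unsigned int reserve_with_race(unsigned int initial,
                                      unsigned int md_needed,
                                      unsigned int other_md_needed,
                                      int use_add)
{
    struct inode_info ei = { .i_reserved_meta_blocks = initial };

    /* "Locked": total computed from the counter as seen right now
     * (assumption: mdblocks = current reservation + md_needed). */
    unsigned int mdblocks = ei.i_reserved_meta_blocks + md_needed;

    /* "Unlocked": another CPU changes the counter. */
    concurrent_reserve(&ei, other_md_needed);

    /* "Relocked": apply the update. */
    if (use_add)
        ei.i_reserved_meta_blocks += md_needed;   /* keeps the other update */
    else
        ei.i_reserved_meta_blocks = mdblocks;     /* overwrites it */

    return ei.i_reserved_meta_blocks;
}
```

With an initial reservation of 10, md_needed of 2, and a concurrent
reservation of 3, the assignment ends at 12 and drops the other
writer's 3 blocks, while the addition ends at 15. When nothing changes
the counter in between, both variants give the same result, which would
be consistent with the problem showing up only under SMP load.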