2014-06-12 05:35:11

by John Stultz

[permalink] [raw]
Subject: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

I've been seeing some ext4 corruption with recent kernels under qemu-system-arm.

This issue seems to crop up after shutting down uncleanly (terminating
qemu), shortly after booting about 50% of the time.

ext4/mmc related dmesg details are:
[ 3.206809] mmci-pl18x mb:mmci: mmc0: PL181 manf 41 rev0 at
0x10005000 irq 41,42 (pio)
[ 3.268316] mmc0: new SDHC card at address 4567
[ 3.281963] mmcblk0: mmc0:4567 QEMU! 2.00 GiB
[ 3.315699] mmcblk0: p1 p2 p3 p4 < p5 p6 >
...
[ 11.806169] EXT4-fs (mmcblk0p5): Ignoring removed nomblk_io_submit option
[ 11.904714] EXT4-fs (mmcblk0p5): recovery complete
[ 11.905854] EXT4-fs (mmcblk0p5): mounted filesystem with ordered
data mode. Opts: nomblk_io_submit,errors=panic
...
[ 91.558824] EXT4-fs error (device mmcblk0p5):
ext4_mb_generate_buddy:756: group 1, 2252 clusters in bitmap, 2284 in
gd; block bitmap corrupt.
[ 91.560641] Aborting journal on device mmcblk0p5-8.
[ 91.562589] Kernel panic - not syncing: EXT4-fs (device mmcblk0p5):
panic forced after error
[ 91.562589]
[ 91.563486] CPU: 0 PID: 1 Comm: init Not tainted 3.15.0-rc1 #560
[ 91.564616] [<c00116e5>] (unwind_backtrace) from [<c000f3b1>]
(show_stack+0x11/0x14)
[ 91.565154] [<c000f3b1>] (show_stack) from [<c04262b1>]
(dump_stack+0x59/0x7c)
[ 91.565666] [<c04262b1>] (dump_stack) from [<c0423297>] (panic+0x67/0x178)
[ 91.566147] [<c0423297>] (panic) from [<c0134bb1>]
(ext4_handle_error+0x69/0x74)
[ 91.566659] [<c0134bb1>] (ext4_handle_error) from [<c0135437>]
(__ext4_grp_locked_error+0x6b/0x160)
[ 91.567223] [<c0135437>] (__ext4_grp_locked_error) from
[<c0143079>] (ext4_mb_generate_buddy+0x1b1/0x29c)
[ 91.567860] [<c0143079>] (ext4_mb_generate_buddy) from [<c01447e5>]
(ext4_mb_init_cache+0x219/0x4e0)
[ 91.568473] [<c01447e5>] (ext4_mb_init_cache) from [<c0144b67>]
(ext4_mb_init_group+0xbb/0x138)
[ 91.569021] [<c0144b67>] (ext4_mb_init_group) from [<c0144cd7>]
(ext4_mb_good_group+0xf3/0xfc)
[ 91.569659] [<c0144cd7>] (ext4_mb_good_group) from [<c0145c8f>]
(ext4_mb_regular_allocator+0x153/0x2c4)
[ 91.570250] [<c0145c8f>] (ext4_mb_regular_allocator) from
[<c0148095>] (ext4_mb_new_blocks+0x2fd/0x4e4)
[ 91.570868] [<c0148095>] (ext4_mb_new_blocks) from [<c013f931>]
(ext4_ext_map_blocks+0x965/0x10bc)
[ 91.571444] [<c013f931>] (ext4_ext_map_blocks) from [<c0122c8b>]
(ext4_map_blocks+0xfb/0x36c)
[ 91.571992] [<c0122c8b>] (ext4_map_blocks) from [<c01263b1>]
(mpage_map_and_submit_extent+0x99/0x5f0)
[ 91.572614] [<c01263b1>] (mpage_map_and_submit_extent) from
[<c0126bc1>] (ext4_writepages+0x2b9/0x4e8)
[ 91.573201] [<c0126bc1>] (ext4_writepages) from [<c0094ae9>]
(do_writepages+0x19/0x28)
[ 91.573709] [<c0094ae9>] (do_writepages) from [<c008c811>]
(__filemap_fdatawrite_range+0x3d/0x44)
[ 91.574265] [<c008c811>] (__filemap_fdatawrite_range) from
[<c008c883>] (filemap_flush+0x23/0x28)
[ 91.574854] [<c008c883>] (filemap_flush) from [<c012bf75>]
(ext4_rename+0x2f9/0x3e4)
[ 91.575360] [<c012bf75>] (ext4_rename) from [<c00c3363>]
(vfs_rename+0x183/0x45c)
[ 91.575911] [<c00c3363>] (vfs_rename) from [<c00c3867>]
(SyS_renameat2+0x22b/0x26c)
[ 91.576460] [<c00c3867>] (SyS_renameat2) from [<c00c38df>]
(SyS_rename+0x1f/0x24)
[ 91.576961] [<c00c38df>] (SyS_rename) from [<c000cd01>]
(ret_fast_syscall+0x1/0x5c)


Bisecting this points to: e7f3d22289e4307b3071cc18b1d8ecc6598c0be4
(mmc: mmci: Handle CMD irq before DATA irq). Which I guess shouldn't
be surprising, as I saw problems with that patch earlier in the
3.15-rc cycle:
https://lkml.org/lkml/2014/4/14/824

However that discussion petered out (possibly my fault for not
following up) as to if it was an issue with the patch or a issue with
qemu. Then the original issue disappeared for me, which I figured was
due to a fix upstream, but now I'm guessing coincided with me updating
my system and getting qemu v2.0 (where as previously I was on 1.5).

$ qemu-system-arm -version
QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.1), Copyright
(c) 2003-2008 Fabrice Bellard

While the previous behavior was annoying and kept my emulated
environments from booting, this while a bit more rare and subtle eats
the disks, which is much more painful for my testing.

Unfortunately reverting the change (manually, as it doesn't revert
cleanly anymore) doesn't seem to completely avoid the issue, so the
bisection may have gone slightly astray (though it is interesting it
landed on the same commit I earlier had trouble with). So I'll
back-track and double check some of the last few "good" results to
validate I didn't just luck into 3 good boots accidentally. I'll also
review my revert in case I missed something subtle in doing it
manually.

Anyway, if there is any thoughts on how to better chase this down and
debug it, I'd appreciate it! I can also provide reproduction
instructions with a pre-built Linaro android disk image and hand built
kernel if anyone wants to debug this themselves.

thanks
-john


2014-06-12 12:09:54

by Ulf Hansson

[permalink] [raw]
Subject: Re: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

On 12 June 2014 07:35, John Stultz <[email protected]> wrote:
> I've been seeing some ext4 corruption with recent kernels under qemu-system-arm.
>
> This issue seems to crop up after shutting down uncleanly (terminating
> qemu), shortly after booting about 50% of the time.
>
> ext4/mmc related dmesg details are:
> [ 3.206809] mmci-pl18x mb:mmci: mmc0: PL181 manf 41 rev0 at
> 0x10005000 irq 41,42 (pio)
> [ 3.268316] mmc0: new SDHC card at address 4567
> [ 3.281963] mmcblk0: mmc0:4567 QEMU! 2.00 GiB
> [ 3.315699] mmcblk0: p1 p2 p3 p4 < p5 p6 >
> ...
> [ 11.806169] EXT4-fs (mmcblk0p5): Ignoring removed nomblk_io_submit option
> [ 11.904714] EXT4-fs (mmcblk0p5): recovery complete
> [ 11.905854] EXT4-fs (mmcblk0p5): mounted filesystem with ordered
> data mode. Opts: nomblk_io_submit,errors=panic
> ...
> [ 91.558824] EXT4-fs error (device mmcblk0p5):
> ext4_mb_generate_buddy:756: group 1, 2252 clusters in bitmap, 2284 in
> gd; block bitmap corrupt.
> [ 91.560641] Aborting journal on device mmcblk0p5-8.
> [ 91.562589] Kernel panic - not syncing: EXT4-fs (device mmcblk0p5):
> panic forced after error
> [ 91.562589]
> [ 91.563486] CPU: 0 PID: 1 Comm: init Not tainted 3.15.0-rc1 #560
> [ 91.564616] [<c00116e5>] (unwind_backtrace) from [<c000f3b1>]
> (show_stack+0x11/0x14)
> [ 91.565154] [<c000f3b1>] (show_stack) from [<c04262b1>]
> (dump_stack+0x59/0x7c)
> [ 91.565666] [<c04262b1>] (dump_stack) from [<c0423297>] (panic+0x67/0x178)
> [ 91.566147] [<c0423297>] (panic) from [<c0134bb1>]
> (ext4_handle_error+0x69/0x74)
> [ 91.566659] [<c0134bb1>] (ext4_handle_error) from [<c0135437>]
> (__ext4_grp_locked_error+0x6b/0x160)
> [ 91.567223] [<c0135437>] (__ext4_grp_locked_error) from
> [<c0143079>] (ext4_mb_generate_buddy+0x1b1/0x29c)
> [ 91.567860] [<c0143079>] (ext4_mb_generate_buddy) from [<c01447e5>]
> (ext4_mb_init_cache+0x219/0x4e0)
> [ 91.568473] [<c01447e5>] (ext4_mb_init_cache) from [<c0144b67>]
> (ext4_mb_init_group+0xbb/0x138)
> [ 91.569021] [<c0144b67>] (ext4_mb_init_group) from [<c0144cd7>]
> (ext4_mb_good_group+0xf3/0xfc)
> [ 91.569659] [<c0144cd7>] (ext4_mb_good_group) from [<c0145c8f>]
> (ext4_mb_regular_allocator+0x153/0x2c4)
> [ 91.570250] [<c0145c8f>] (ext4_mb_regular_allocator) from
> [<c0148095>] (ext4_mb_new_blocks+0x2fd/0x4e4)
> [ 91.570868] [<c0148095>] (ext4_mb_new_blocks) from [<c013f931>]
> (ext4_ext_map_blocks+0x965/0x10bc)
> [ 91.571444] [<c013f931>] (ext4_ext_map_blocks) from [<c0122c8b>]
> (ext4_map_blocks+0xfb/0x36c)
> [ 91.571992] [<c0122c8b>] (ext4_map_blocks) from [<c01263b1>]
> (mpage_map_and_submit_extent+0x99/0x5f0)
> [ 91.572614] [<c01263b1>] (mpage_map_and_submit_extent) from
> [<c0126bc1>] (ext4_writepages+0x2b9/0x4e8)
> [ 91.573201] [<c0126bc1>] (ext4_writepages) from [<c0094ae9>]
> (do_writepages+0x19/0x28)
> [ 91.573709] [<c0094ae9>] (do_writepages) from [<c008c811>]
> (__filemap_fdatawrite_range+0x3d/0x44)
> [ 91.574265] [<c008c811>] (__filemap_fdatawrite_range) from
> [<c008c883>] (filemap_flush+0x23/0x28)
> [ 91.574854] [<c008c883>] (filemap_flush) from [<c012bf75>]
> (ext4_rename+0x2f9/0x3e4)
> [ 91.575360] [<c012bf75>] (ext4_rename) from [<c00c3363>]
> (vfs_rename+0x183/0x45c)
> [ 91.575911] [<c00c3363>] (vfs_rename) from [<c00c3867>]
> (SyS_renameat2+0x22b/0x26c)
> [ 91.576460] [<c00c3867>] (SyS_renameat2) from [<c00c38df>]
> (SyS_rename+0x1f/0x24)
> [ 91.576961] [<c00c38df>] (SyS_rename) from [<c000cd01>]
> (ret_fast_syscall+0x1/0x5c)
>
>
> Bisecting this points to: e7f3d22289e4307b3071cc18b1d8ecc6598c0be4
> (mmc: mmci: Handle CMD irq before DATA irq). Which I guess shouldn't
> be surprising, as I saw problems with that patch earlier in the
> 3.15-rc cycle:
> https://lkml.org/lkml/2014/4/14/824
>
> However that discussion petered out (possibly my fault for not
> following up) as to if it was an issue with the patch or a issue with
> qemu. Then the original issue disappeared for me, which I figured was
> due to a fix upstream, but now I'm guessing coincided with me updating
> my system and getting qemu v2.0 (where as previously I was on 1.5).
>
> $ qemu-system-arm -version
> QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.1), Copyright
> (c) 2003-2008 Fabrice Bellard
>
> While the previous behavior was annoying and kept my emulated
> environments from booting, this while a bit more rare and subtle eats
> the disks, which is much more painful for my testing.
>
> Unfortunately reverting the change (manually, as it doesn't revert
> cleanly anymore) doesn't seem to completely avoid the issue, so the
> bisection may have gone slightly astray (though it is interesting it
> landed on the same commit I earlier had trouble with). So I'll
> back-track and double check some of the last few "good" results to
> validate I didn't just luck into 3 good boots accidentally. I'll also
> review my revert in case I missed something subtle in doing it
> manually.
>
> Anyway, if there is any thoughts on how to better chase this down and
> debug it, I'd appreciate it! I can also provide reproduction
> instructions with a pre-built Linaro android disk image and hand built
> kernel if anyone wants to debug this themselves.

According to your log, the primecell-periphid is 0x00041181, which
means mmci will be using the arm_variant.

A simple fix; for the arm_variant, go back to use the old behaviour.

A quite simple fix; Invent a new primecell-periphid and a new
corresponding variant and use the old behaviour for this variant. The
new primecell-periphid then needs to be provided through DT for the
QEMU dtb.

Is there any of the above solution you see as the preferred one?

Kind regards
Uffe

2014-06-12 12:16:12

by Peter Maydell

[permalink] [raw]
Subject: Re: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

On 12 June 2014 13:09, Ulf Hansson <[email protected]> wrote:
> A simple fix; for the arm_variant, go back to use the old behaviour.
>
> A quite simple fix; Invent a new primecell-periphid and a new
> corresponding variant and use the old behaviour for this variant. The
> new primecell-periphid then needs to be provided through DT for the
> QEMU dtb.
>
> Is there any of the above solution you see as the preferred one?

Those both sound like workarounds, not fixes, to me. Somebody
needs to identify whether the bug here is in:
* the kernel (unlikely, but possibly the kernel has a race
condition that only gets triggered by QEMU's "operations
that take time in h/w happen instantaneously in emulation"
behaviour)
* the QEMU model of the PL181
* the QEMU model of the SD card
and then fix whichever of these is not conforming to the
specs/docs/etc.

Also, there's no such thing as a "QEMU dtb", at least for
most of our board models. QEMU models the actual hardware
(sometimes buggily or incompletely) and so should use the
exact same dtb you would use with the hardware.

thanks
-- PMM

2014-06-12 23:51:07

by John Stultz

[permalink] [raw]
Subject: Re: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

On Wed, Jun 11, 2014 at 10:35 PM, John Stultz <[email protected]> wrote:
> Bisecting this points to: e7f3d22289e4307b3071cc18b1d8ecc6598c0be4
> (mmc: mmci: Handle CMD irq before DATA irq). Which I guess shouldn't
> be surprising, as I saw problems with that patch earlier in the
> 3.15-rc cycle:
> https://lkml.org/lkml/2014/4/14/824
>
[...]
>
> Unfortunately reverting the change (manually, as it doesn't revert
> cleanly anymore) doesn't seem to completely avoid the issue, so the
> bisection may have gone slightly astray (though it is interesting it
> landed on the same commit I earlier had trouble with). So I'll
> back-track and double check some of the last few "good" results to
> validate I didn't just luck into 3 good boots accidentally. I'll also
> review my revert in case I missed something subtle in doing it
> manually.

So I'm getting some baffling results. I started going back over the
git bisect logs to see if I had mis-marked a revision as good due to
the issue just not reproducing.

However, despite many many reboots the last good commit in my branch
- bb5cba40dc7f079ea7ee3ae760b7c388b6eb5fc3 (mmc: block: Fixup busy
detection while...) doesn't ever show the issue. While the immediately
following commit which bisect found -
e7f3d22289e4307b3071cc18b1d8ecc6598c0be4 (mmc: mmci: Handle CMD irq
before DATA irq) always does.

The immensely frustrating part is while backing that single change off
from its commit sha always makes the issue go away, reverting that
change from on top of v3.15 doesn't. The issue persists. Since it
doesn't revert cleanly, I also reverted a following patch that it
interacted with 8d94b54d99ea968a9d188ca0e68793ebed601220 (mmc: mmci:
Enable support for busy detection....) to make sure I didn't miss some
dependency and the issue *still* crops up. In fact, doing a git diff
bb5cba40dc7f079ea7ee3ae760b7c388b6eb5fc3..v3.15 drivers/mmc/ doesn't
seem to resolve the issue.

So I'm really at a bit of a loss on what to do next. While it seems
that the "mmci: Handle CMD irq before DATA..." commit is problematic,
there also seems to be some other commit in v3.15 which results in the
same problematic behavior. I may try to bisect again between the
first bad commit and v3.15, reverting the bad commit each time to see
if I can chase it down, but if anyone has better debugging tools here,
I'd greatly appreciate it.

Again, I'm happy to help interested folks get this reproducing on
their own machine for debugging.

thanks
-john

2014-06-13 11:35:38

by Ulf Hansson

[permalink] [raw]
Subject: Re: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

On 12 June 2014 14:15, Peter Maydell <[email protected]> wrote:
> On 12 June 2014 13:09, Ulf Hansson <[email protected]> wrote:
>> A simple fix; for the arm_variant, go back to use the old behaviour.
>>
>> A quite simple fix; Invent a new primecell-periphid and a new
>> corresponding variant and use the old behaviour for this variant. The
>> new primecell-periphid then needs to be provided through DT for the
>> QEMU dtb.
>>
>> Is there any of the above solution you see as the preferred one?
>
> Those both sound like workarounds, not fixes, to me. Somebody
> needs to identify whether the bug here is in:
> * the kernel (unlikely, but possibly the kernel has a race
> condition that only gets triggered by QEMU's "operations
> that take time in h/w happen instantaneously in emulation"
> behaviour)
> * the QEMU model of the PL181
> * the QEMU model of the SD card
> and then fix whichever of these is not conforming to the
> specs/docs/etc.

You are right. But...

Since we (or actually me) have made the ARM model to break (it worked
nicely before), I just wanted to restore the behaviour as a quick fix.
I believe going into this in detail can take some more time,
especially if it's related to the ARM model, right!?

Kind regards
Uffe

>
> Also, there's no such thing as a "QEMU dtb", at least for
> most of our board models. QEMU models the actual hardware
> (sometimes buggily or incompletely) and so should use the
> exact same dtb you would use with the hardware.
>
> thanks
> -- PMM

2014-06-13 12:28:34

by Ulf Hansson

[permalink] [raw]
Subject: Re: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

On 13 June 2014 01:51, John Stultz <[email protected]> wrote:
> On Wed, Jun 11, 2014 at 10:35 PM, John Stultz <[email protected]> wrote:
>> Bisecting this points to: e7f3d22289e4307b3071cc18b1d8ecc6598c0be4
>> (mmc: mmci: Handle CMD irq before DATA irq). Which I guess shouldn't
>> be surprising, as I saw problems with that patch earlier in the
>> 3.15-rc cycle:
>> https://lkml.org/lkml/2014/4/14/824
>>
> [...]
>>
>> Unfortunately reverting the change (manually, as it doesn't revert
>> cleanly anymore) doesn't seem to completely avoid the issue, so the
>> bisection may have gone slightly astray (though it is interesting it
>> landed on the same commit I earlier had trouble with). So I'll
>> back-track and double check some of the last few "good" results to
>> validate I didn't just luck into 3 good boots accidentally. I'll also
>> review my revert in case I missed something subtle in doing it
>> manually.
>
> So I'm getting some baffling results. I started going back over the
> git bisect logs to see if I had mis-marked a revision as good due to
> the issue just not reproducing.
>
> However, despite many many reboots the last good commit in my branch
> - bb5cba40dc7f079ea7ee3ae760b7c388b6eb5fc3 (mmc: block: Fixup busy
> detection while...) doesn't ever show the issue. While the immediately
> following commit which bisect found -
> e7f3d22289e4307b3071cc18b1d8ecc6598c0be4 (mmc: mmci: Handle CMD irq
> before DATA irq) always does.
>
> The immensely frustrating part is while backing that single change off
> from its commit sha always makes the issue go away, reverting that
> change from on top of v3.15 doesn't. The issue persists. Since it
> doesn't revert cleanly, I also reverted a following patch that it
> interacted with 8d94b54d99ea968a9d188ca0e68793ebed601220 (mmc: mmci:
> Enable support for busy detection....) to make sure I didn't miss some
> dependency and the issue *still* crops up. In fact, doing a git diff
> bb5cba40dc7f079ea7ee3ae760b7c388b6eb5fc3..v3.15 drivers/mmc/ doesn't
> seem to resolve the issue.
>
> So I'm really at a bit of a loss on what to do next. While it seems
> that the "mmci: Handle CMD irq before DATA..." commit is problematic,
> there also seems to be some other commit in v3.15 which results in the
> same problematic behavior. I may try to bisect again between the
> first bad commit and v3.15, reverting the bad commit each time to see
> if I can chase it down, but if anyone has better debugging tools here,
> I'd greatly appreciate it.
>
> Again, I'm happy to help interested folks get this reproducing on
> their own machine for debugging.
>

Hi John,

I have quickly implemented my proposal 1). I am testing them on real
HW now, will post the patches as soon as I can and keep you on cc.

I would also really appreciate if you could help out giving them a
quick try for your QEMU environment.

Kind regards
Uffe

2014-06-16 07:22:17

by Jeff Chua

[permalink] [raw]
Subject: Re: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

On Fri, Jun 13, 2014 at 8:28 PM, Ulf Hansson <[email protected]> wrote:
> On 13 June 2014 01:51, John Stultz <[email protected]> wrote:
>> On Wed, Jun 11, 2014 at 10:35 PM, John Stultz [email protected]> wrote:

> I have quickly implemented my proposal 1). I am testing them on real
> HW now, will post the patches as soon as I can and keep you on cc.
>
> I would also really appreciate if you could help out giving them a
> quick try for your QEMU environment.

Please cc me the patch. I'm seeing my host's reiserfs corrupted with
qemu all over the places in linux-3.15.0 (linux-3.16-rc1). Pretty sure
it's qemu as it doesn't seem to happen if I don't run qemu.

Thanks,
Jeff

2014-06-16 13:02:04

by Ulf Hansson

[permalink] [raw]
Subject: Re: [Regression] 3.15 mmc related ext4 corruption with qemu-system-arm

On 16 June 2014 09:22, Jeff Chua <[email protected]> wrote:
> On Fri, Jun 13, 2014 at 8:28 PM, Ulf Hansson <[email protected]> wrote:
>> On 13 June 2014 01:51, John Stultz <[email protected]> wrote:
>>> On Wed, Jun 11, 2014 at 10:35 PM, John Stultz [email protected]> wrote:
>
>> I have quickly implemented my proposal 1). I am testing them on real
>> HW now, will post the patches as soon as I can and keep you on cc.
>>
>> I would also really appreciate if you could help out giving them a
>> quick try for your QEMU environment.
>
> Please cc me the patch. I'm seeing my host's reiserfs corrupted with
> qemu all over the places in linux-3.15.0 (linux-3.16-rc1). Pretty sure
> it's qemu as it doesn't seem to happen if I don't run qemu.
>
> Thanks,
> Jeff

Hi Jeff,

I posted them prior I noticed this, sorry. There are three patches for
the mmci driver, just search for "mmci".

Kind regards
Uffe