LinuxLists.cc - [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

2009-10-16 09:16:38

Subject: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Fri, Oct 16, 2009 at 12:28:18AM -0400, Parag Warudkar wrote:
> So I have been experimenting with various root file systems on my
> laptop running latest git. This laptop some times has problems waking
> up from sleep and that results in it needing a hard reset and
> subsequently unclean file system.

A number of people have reported this, and there is some discussion
and some suggestions that I've made here:

http://bugzilla.kernel.org/show_bug.cgi?id=14354

It's been very frustrating because I have not been able to replicate
it myself; I've been very much looking for someone who is (a) willing
to work with me on this, and perhaps willing to risk running fsck
frequently, perhaps after every single unclean shutdown, and (b) who
can reliably reproduce this problem. On my system, which is a T400
running 9.04 with the latest git kernels, I've not been able to
reproduce it despite many attempts (e.g., suspending the machine and
then pulling the battery and power; pulling the battery and power, or
running "echo c > /proc/sysrq-trigger", while doing a "make -j4", so
that the system is shut down uncleanly under load).

So if you can come up with a reliable reproduction case, and don't
mind doing some experiments and/or exchanging debugging correspondence
with me, please let me know. I'd **really** appreciate the help.

Information that would be helpful to me would be:

a) Detailed hardware information (what type of disk/SSD, what type of
laptop, hardware configuration, etc.)

b) Detailed software information (what version of the kernel are you
using including any special patches, what distro and version are you
using, are you using LVM or dm-crypt, what partition or partitions did
you have mounted, was the failing partition a root partition or some
other mounted partition, etc.)

c) Detailed reproduction recipe (what programs were you running before
the crash/failed suspend/resume, etc.)

If you do decide to go hunting this problem, one thing I would
strongly suggest is either to use "tune2fs -c 1 /dev/XXX" to
force a fsck after every reboot, or if you are using LVM, to use the
e2croncheck script (found as an attachment in the above bugzilla entry
or in the e2fsprogs sources in the contrib directory) to take a
snapshot and then check the snapshot right after you reboot and login
to your system. The reported file system corruptions seem to involve
the block allocation bitmaps getting corrupted, and so you will
significantly reduce the chances of data loss if you run e2fsck as
soon as possible after the file system corruption happens. This helps
you not lose data, and it also helps us find the bug, since it helps
pinpoint the earliest possible point where the file system is getting
corrupted.

(I suspect that some bug reporters had their file system get corrupted
one or more boot sessions earlier, and by the time the corruption was
painfully obvious, they had lost data. Mercifully, running fsck
frequently is much less painful on a freshly created ext4 filesystem,
and of course if you are using an SSD.)

If you can reliably reproduce the problem, it would be great to get a
bisection, or at least a confirmation that the problem doesn't exist
on 2.6.31, but does exist on 2.6.32-rcX kernels. At this point I'm
reasonably sure it's a post-2.6.31 regression, but it would be good to
get a hard confirmation of that fact.

For people with a reliable reproduction case, one possible experiment
can be found here:

http://bugzilla.kernel.org/show_bug.cgi?id=14354#c18

Another thing you might try is to try reverting these commits one at a
time, and see if they make the problem go away: d0646f7, 5534fb5,
7178057. These are three commits that seem most likely, but there are
only 93 ext4-related commits, so doing a "git bisect start v2.6.31
v2.6.32-rc5 -- fs/ext4 fs/jbd2" should only take at most seven compile
tests --- assuming this is indeed a 2.6.31 regression and the problem
is an ext4-specific code change, as opposed to some other recent
change in the writeback code or some device driver which is
interacting badly with ext4.

If that assumption isn't true and so a git bisect limited to fs/ext4
and fs/jbd2 doesn't find a bad commit which when reverted makes the
problem go away, we could try a full bisection search via "git bisect
start v2.6.31 v2.6.31-rc3", which would take approximately 14 compile
tests, but hopefully that wouldn't be necessary.
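
For reference, the compile-test estimates above follow from bisection halving the candidate range on every round. A minimal sketch of that arithmetic in Python (the function names are made up for illustration, and git's own step estimate can differ by one):

```python
def max_bisect_steps(num_candidates: int) -> int:
    """Worst-case number of build-and-test rounds needed to isolate one bad
    commit among num_candidates, assuming each round halves the range."""
    if num_candidates < 1:
        raise ValueError("need at least one candidate commit")
    # ceil(log2(n)), computed with integers to avoid floating-point rounding
    return (num_candidates - 1).bit_length()


def max_candidates(steps: int) -> int:
    """Largest range that `steps` halving rounds can fully narrow down."""
    return 2 ** steps


if __name__ == "__main__":
    print(max_bisect_steps(93))  # the 93 ext4-related commits mentioned above
    print(max_candidates(14))    # range size that 14 rounds can cover
```

With 93 candidate commits this gives 7 rounds, and 14 rounds are enough for a range of up to 16384 commits.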

I'm going to be at the kernel summit in Tokyo next week, so my e-mail
latency will be a bit longer than normal, which is one of the reasons
why I've left a goodly list of potential experiments for people to
try. If you can come up with a reliable regression, and are willing
to work with me or to try some of the above mentioned tests, I'll
definitely buy you a real (or virtual) beer.

Given that a number of people have reported losing data as a result,
it would **definitely** be a good thing to get this fixed before
2.6.32 is released.

Thanks,

- Ted

2009-10-16 14:14:08

by Theodore Ts'o

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Fri, Oct 16, 2009 at 05:15:58AM -0400, Theodore Tso wrote:
> These are three commits that seem most likely, but there are
> only 93 ext4-related commits, so doing a "git bisect start v2.6.31
> v2.6.32-rc5 -- fs/ext4 fs/jbd2"

One correction. The git bisect start usage is:

git bisect start [<bad> [<good>...]] [--] [<paths>...]

So the correct git bisect start command should be:

git bisect start v2.6.32-rc5 v2.6.31 -- fs/ext4 fs/jbd2

And similarly, "git bisect start v2.6.31 v2.6.31-rc3" should have been:

git bisect start v2.6.31-rc3 v2.6.31

My apologies for any confusion.
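
As a sanity check on the argument order, here is a small sketch in Python that only builds the command line and does not run git. The helper name bisect_start_argv is hypothetical; the ordering (bad revision first, then one or more good revisions, then "--" and any paths) follows the usage line quoted above.

```python
import shlex


def bisect_start_argv(bad, goods, paths=()):
    """Build the argument list for `git bisect start`.

    The bad (newer, broken) revision comes first, followed by the known-good
    revisions; optional paths are separated from revisions by "--".
    """
    argv = ["git", "bisect", "start", bad, *goods]
    if paths:
        argv += ["--", *paths]
    return argv


if __name__ == "__main__":
    # Bisection limited to the ext4 and jbd2 directories
    limited = bisect_start_argv("v2.6.32-rc5", ["v2.6.31"], ["fs/ext4", "fs/jbd2"])
    print(shlex.join(limited))
```

The returned list can be handed to subprocess.run() unchanged if you want a script to start the bisection for you.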

- Ted

2009-10-16 19:14:28

by Ric Wheeler

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On 10/16/2009 05:15 AM, Theodore Tso wrote:
> On Fri, Oct 16, 2009 at 12:28:18AM -0400, Parag Warudkar wrote:
>
>> So I have been experimenting with various root file systems on my
>> laptop running latest git. This laptop some times has problems waking
>> up from sleep and that results in it needing a hard reset and
>> subsequently unclean file system.
>>
> A number of people have reported this, and there is some discussion
> and some suggestions that I've made here:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=14354
>
> It's been very frustrating because I have not been able to replicate
> it myself; I've been very much looking for someone who is (a) willing
> to work with me on this, and perhaps willing to risk running fsck
> frequently, perhaps after every single unclean shutdown, and (b) who
> can reliably reproduce this problem. On my system, which is a T400
> running 9.04 with the latest git kernels, I've not been able to
> reproduce it, despite many efforts to try to reproduce it. (i.e.,
> suspend the machine and then pull the battery and power; pulling the
> battery and power, "echo c> /proc/sysrq-trigger", etc., while
> doing "make -j4" when the system is being uncleanly shutdown)
>

I wonder if we might have better luck if we tested using an external
(e-sata or USB connected) S-ATA drive.

Instead of pulling the drive's data connection, most of these have an
external power source that could be turned off so the drive firmware
won't have a chance to flush the volatile write cache. Note that some
drives automatically write back the cache if they have power and see a
bus disconnect, so hot unplugging just the e-sata or usb cable does not
do the trick.

Given the number of cheap external drives, this should be easy to test
at home....

Ric

2009-10-16 22:24:18

by Parag Warudkar

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Fri, Oct 16, 2009 at 5:15 AM, Theodore Tso <[email protected]> wrote:
>
> A number of people have reported this, and there is some discussion
> and some suggestions that I've made here:
>
>         http://bugzilla.kernel.org/show_bug.cgi?id=14354

Ok, I went through this bug report and here are some immediately
useful things to note -
1) I am not running Karmic - I am running Jaunty x86_64 clean install
2) I am not using dm/lvm or anything fancy for that matter
3) So far, I have been able to reproduce this problem by just hitting
the power button on my laptop when it is doing nothing. It also
happens when waking up from s2ram and the laptop wasn't doing anything
when it was suspended (I mean I wasn't copying/deleting stuff, wasn't
running make - laptop was

> So if you can come up with a reliable reproduction case, and don't
> mind doing some experiments and/or exchanging debugging correspondence
> with me, please let me know. I'd **really** appreciate the help.

My laptop is a reliable test case - for me at least! I tried just now
to abruptly reset the laptop and upon reboot there was fsck followed
by another reboot only to have X fail to start and NetworkManager
segfault. At this point I am pretty sure I can reproduce it just by
power cycling the laptop using the power button.
After another fsck and a reboot it finally comes up.

>
> Information that would be helpful to me would be:
>
> a) Detailed hardware information (what type of disk/SSD, what type of
> laptop, hardware configuration, etc.)

SSD is a Corsair P256 with the latest firmware; it has been used in
another machine without any issues.
Laptop is HP EliteBook 8530p, 4GB RAM, Intel T9400 CPU, Intel WiFi,
ATI 3650 GPU.
No proprietary drivers ever loaded.

>
> b) Detailed software information (what version of the kernel are you
> using including any special patches, what distro and version are you
> using, are you using LVM or dm-crypt, what partition or partitions did
> you have mounted, was the failing partition a root partition or some
> other mounted partition, etc.)
It happens on current custom compiled git - both on custom and minimal
localmodconfig.
It also happens on Ubuntu daily kernel PPA builds.

>
> c) Detailed reproduction recipe (what programs were you running before
> the crash/failed suspend/resume, etc.)
>
Really nothing special - I boot to desktop, maybe open Firefox for
a few minutes and then try a reset.
fsck then reports a bunch of errors and forces a reboot. On reboot X
fails to start, and the file system, although mounted rw, cannot be
written to - vim for instance won't open any file due to write errors.
Another fsck finds a few more problems (or sometimes not) and a reboot
brings it back to desktop.

So my problem is not corruption really but the amount and nature of
errors fsck encounters and corrects on unclean shutdown and the write
failures until another fsck -f finds more problems and reboots. None
of this happens on any other filesystem including the /boot ext3 fs on
the same disk.

>
> If you do decide to go hunting this problem, one thing I would
> strongly suggest is that either to use "tune2fs -c 1 /dev/XXX" to
> force a fsck after every reboot, or if you are using LVM, to use the
> e2croncheck script (found as an attachment in the above bugzilla entry
> or in the e2fsprogs sources in the contrib directory) to take a
> snapshot and then check the snapshot right after you reboot and login
> to your system. The reported file system corruptions seem to involve
> the block allocation bitmaps getting corrupted, and so you will
> significantly reduce the chances of data loss if you run e2fsck as
> soon as possible after the file system corruption happens. This helps
> you not lose data, and it also helps us find the bug, since it helps
> pinpoint the earliest possible point where the file system is getting
> corrupted.

I have enabled fsck on every mount but I am not certain ongoing "clean
state" corruption is the problem in my case.
Things have worked well without any trouble as long as I don't end up
doing an unclean shutdown.

[snip]

> I'm going to be at the kernel summit in Tokyo next week, so my e-mail
> latency will be a bit longer than normal, which is one of the reason
> why I've left a goodly list of potential experiments for people to
> try. If you can come up with a reliable regression, and are willing
> to work with me or to try some of the above mentioned tests, I'll
> definitely buy you a real (or virtual) beer.
>

I will try the things you mentioned - finding out whether this happens
in pre-.32 kernels is the first one on my list, followed by reverting
the specific commits you mentioned, followed if necessary by complete
bisection.

I am afraid however this is not a regression - at least not a recent
one, as I have had this experience with ext4 and unclean shutdowns
for a long time.
And that's on different hardware/different disks.

Thanks,

Parag

2009-10-25 19:04:12

by Pavel Machek

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

Hi!

>>> So I have been experimenting with various root file systems on my
>>> laptop running latest git. This laptop some times has problems waking
>>> up from sleep and that results in it needing a hard reset and
>>> subsequently unclean file system.
>>>
>> A number of people have reported this, and there is some discussion
>> and some suggestions that I've made here:
>>
>> http://bugzilla.kernel.org/show_bug.cgi?id=14354
>>
>> It's been very frustrating because I have not been able to replicate
>> it myself; I've been very much looking for someone who is (a) willing
>> to work with me on this, and perhaps willing to risk running fsck
>> frequently, perhaps after every single unclean shutdown, and (b) who
>> can reliably reproduce this problem. On my system, which is a T400
>> running 9.04 with the latest git kernels, I've not been able to
>> reproduce it, despite many efforts to try to reproduce it. (i.e.,
>> suspend the machine and then pull the battery and power; pulling the
>> battery and power, "echo c> /proc/sysrq-trigger", etc., while
>> doing "make -j4" when the system is being uncleanly shutdown)
>>
>
> I wonder if we might have better luck if we tested using an external
> (e-sata or USB connected) S-ATA drive.
>
> Instead of pulling the drive's data connection, most of these have an
> external power source that could be turned off so the drive firmware
> won't have a chance to flush the volatile write cache. Note that some
> drives automatically write back the cache if they have power and see a
> bus disconnect, so hot unplugging just the e-sata or usb cable does not
> do the trick.
>
> Given the number of cheap external drives, this should be easy to test
> at home....

Do they support barriers?

(Anyway, you may want to use some kind of VM for testing. That should
make the testing cycle shorter, easier to reproduce *and* more repeatable.)

Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-10-26 13:46:28

by Ric Wheeler

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On 10/25/2009 02:22 AM, Pavel Machek wrote:
> Hi!
>
>>>> So I have been experimenting with various root file systems on my
>>>> laptop running latest git. This laptop some times has problems waking
>>>> up from sleep and that results in it needing a hard reset and
>>>> subsequently unclean file system.
>>>>
>>> A number of people have reported this, and there is some discussion
>>> and some suggestions that I've made here:
>>>
>>> http://bugzilla.kernel.org/show_bug.cgi?id=14354
>>>
>>> It's been very frustrating because I have not been able to replicate
>>> it myself; I've been very much looking for someone who is (a) willing
>>> to work with me on this, and perhaps willing to risk running fsck
>>> frequently, perhaps after every single unclean shutdown, and (b) who
>>> can reliably reproduce this problem. On my system, which is a T400
>>> running 9.04 with the latest git kernels, I've not been able to
>>> reproduce it, despite many efforts to try to reproduce it. (i.e.,
>>> suspend the machine and then pull the battery and power; pulling the
>>> battery and power, "echo c> /proc/sysrq-trigger", etc., while
>>> doing "make -j4" when the system is being uncleanly shutdown)
>>>
>>
>> I wonder if we might have better luck if we tested using an external
>> (e-sata or USB connected) S-ATA drive.
>>
>> Instead of pulling the drive's data connection, most of these have an
>> external power source that could be turned off so the drive firmware
>> won't have a chance to flush the volatile write cache. Note that some
>> drives automatically write back the cache if they have power and see a
>> bus disconnect, so hot unplugging just the e-sata or usb cable does not
>> do the trick.
>>
>> Given the number of cheap external drives, this should be easy to test
>> at home....
>
> Do they support barriers?
>
> (Anyway, you may want to use some kind of VM for testing. That should
> make the testing cycle shorter, easier to reproduce *and* more repeatable.)
>
> Pavel
>

The drives themselves will support barriers - they are the same S-ATA/ATA drives
you get normally for your desktop, etc.

I think that e-SATA would have no trouble (but fewer boxes have that external
S-ATA port). Not sure how reliable the SCSI -> USB -> ATA conversion is for USB
drives though (a lot of moving pieces there!).

VM testing is a good idea, but I worry that the virtual IO stack support for
data integrity is still somewhat shaky. Christoph was working on fixing various
bits and pieces I think...

ric

2009-10-26 15:42:36

by Linus Torvalds

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

Just a note: I'm seeing this too on my new Dell laptop, with a fresh
Fedora-12 Beta install (+ "yum update" + current -git kernel).

The root filesystem was seriously scrogged, and I had the RPM databases
corrupted. Don't know what else is wrong, I'm rebuilding them now.

I had a few unclean shutdowns (debugging wireless driver etc), but they
weren't during any heavy disk activity. And I'm pretty sure they weren't
during any yum update, so the rpm database corruption smells like ext4 is
not writing back inode information in a timely manner at all (ie any rpm
database activity would have happened much earlier).

Example kernel messages appended. I probably missed a few. I do not have
fsck logs (this was all on the root filesystem, nowhere to log them), but
they weren't pretty.

I can do pretty much anything to this laptop (I'm just testing it), so I'm
open to testing.

Linus

---
Oct 25 10:46:00 localhost kernel: device-mapper: multipath: version 1.1.0 loaded
Oct 25 10:46:00 localhost kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode
Oct 25 10:46:00 localhost kernel: scsi 4:0:0:0: Direct-Access Generic- Multi-Card 1.00 PQ: 0 ANSI: 0 CCS
Oct 25 10:46:00 localhost kernel: sd 4:0:0:0: Attached scsi generic sg1 type 0
Oct 25 10:46:00 localhost kernel: sd 4:0:0:0: [sdb] Attached SCSI removable disk
Oct 25 10:46:00 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 2: 5553 blocks in bitmap, 5546 in gd
Oct 25 10:46:00 localhost kernel: JBD: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
...
Oct 25 10:46:56 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 133089
Oct 25 10:46:56 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 133089
Oct 25 10:46:56 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 133089
Oct 25 10:46:56 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 133089
Oct 25 10:46:56 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 133089
Oct 25 10:46:56 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 133089
Oct 25 10:46:56 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 133089
...
Oct 25 10:46:59 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 17: 3663 blocks in bitmap, 3664 in gd
...
Oct 26 07:54:58 localhost kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode
Oct 26 07:54:58 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 3: 4789 blocks in bitmap, 3785 in gd
Oct 26 07:54:58 localhost kernel: JBD: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Oct 26 07:54:58 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 5832
Oct 26 07:54:58 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21482
Oct 26 07:54:58 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21485
Oct 26 07:54:58 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21486
...
Oct 26 07:55:22 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 657981
Oct 26 07:55:22 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 657981
...
Oct 26 08:11:39 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21486
Oct 26 08:11:40 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21482
Oct 26 08:11:41 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21485
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 6: 6342 blocks in bitmap, 6330 in gd
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 16: 17220 blocks in bitmap, 18932 in gd
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 26: 6769 blocks in bitmap, 4721 in gd
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 27: 15419 blocks in bitmap, 1081 in gd
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 28: 8309 blocks in bitmap, 1879 in gd
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 29: 5982 blocks in bitmap, 1880 in gd
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 30: 9476 blocks in bitmap, 3886 in gd
Oct 26 08:12:12 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21482
Oct 26 08:12:12 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21485
Oct 26 08:12:12 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21486
Oct 26 08:12:14 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21482
Oct 26 08:12:14 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21485
Oct 26 08:12:14 localhost kernel: EXT4-fs error (device dm-0): ext4_lookup: deleted inode referenced: 21486
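
To get a quick overview of messages like these, the ext4_mb_generate_buddy lines can be reduced to one row per report. This is only a sketch: the regular expression is written against the message format seen in this log and may not match other kernel versions, and buddy_mismatches is a made-up helper name. The sample lines are copied from the log above.

```python
import re

# Matches the "group N: X blocks in bitmap, Y in gd" part of the error message
BUDDY_RE = re.compile(
    r"ext4_mb_generate_buddy: EXT4-fs: group (\d+): "
    r"(\d+) blocks in bitmap, (\d+) in gd"
)


def buddy_mismatches(log_text):
    """Return one (group, in_bitmap, in_gd, in_bitmap - in_gd) tuple per message."""
    rows = []
    for match in BUDDY_RE.finditer(log_text):
        group, in_bitmap, in_gd = (int(value) for value in match.groups())
        rows.append((group, in_bitmap, in_gd, in_bitmap - in_gd))
    return rows


SAMPLE = """\
Oct 25 10:46:00 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 2: 5553 blocks in bitmap, 5546 in gd
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 16: 17220 blocks in bitmap, 18932 in gd
Oct 26 08:12:11 localhost kernel: EXT4-fs error (device dm-0): ext4_mb_generate_buddy: EXT4-fs: group 27: 15419 blocks in bitmap, 1081 in gd
"""

if __name__ == "__main__":
    for group, in_bitmap, in_gd, delta in buddy_mismatches(SAMPLE):
        print(f"group {group}: bitmap={in_bitmap} gd={in_gd} diff={delta:+d}")
```

A large positive or negative difference, as for group 27, shows how far the bitmap and the group descriptor disagree for that block group.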

2009-10-27 10:00:31

by Aneesh Kumar K.V

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

Can you try this patch?

commit a8836b1d6f92273e001012c7705ae8f4c3d5fb65
Author: Aneesh Kumar K.V <[email protected]>
Date:   Tue Oct 27 15:36:38 2009 +0530

    ext4: discard preallocation during truncate

    We need to make sure when we drop and reacquire the inode's
    i_data_sem we discard the inode preallocation. Otherwise we
    could have blocks marked as free in bitmap but still belonging
    to prealloc space.

    Signed-off-by: Aneesh Kumar K.V <[email protected]>

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5c5bc5d..a1ef1c3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -209,6 +209,12 @@ static int try_to_extend_transaction(handle_t *handle, struct inode *inode)
 	up_write(&EXT4_I(inode)->i_data_sem);
 	ret = ext4_journal_restart(handle, blocks_for_truncate(inode));
 	down_write(&EXT4_I(inode)->i_data_sem);
+	/*
+	 * We have dropped i_data_sem. So somebody else could have done
+	 * block allocation. So discard the prealloc space created as a
+	 * part of block allocation
+	 */
+	ext4_discard_preallocations(inode);

 	return ret;
 }

2009-10-29 20:11:05

by Mingming Cao

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Tue, 2009-10-27 at 15:45 +0530, Aneesh Kumar K.V wrote:
> Can you try this patch ?
>
> commit a8836b1d6f92273e001012c7705ae8f4c3d5fb65
> Author: Aneesh Kumar K.V <[email protected]>
> Date: Tue Oct 27 15:36:38 2009 +0530
>
> ext4: discard preallocation during truncate
>
> We need to make sure when we drop and reacquire the inode's
> i_data_sem we discard the inode preallocation. Otherwise we
> could have blocks marked as free in bitmap but still belonging
> to prealloc space.
>
> Signed-off-by: Aneesh Kumar K.V <[email protected]>
>

Makes sense. Reviewed-by: Mingming Cao <[email protected]>

> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 5c5bc5d..a1ef1c3 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -209,6 +209,12 @@ static int try_to_extend_transaction(handle_t *handle, struct inode *inode)
> up_write(&EXT4_I(inode)->i_data_sem);
> ret = ext4_journal_restart(handle, blocks_for_truncate(inode));
> down_write(&EXT4_I(inode)->i_data_sem);
> + /*
> + * We have dropped i_data_sem. So somebody else could have done
> + * block allocation. So discard the prealloc space created as a
> + * part of block allocation
> + */
> + ext4_discard_preallocations(inode);
>
> return ret;
> }

2009-10-29 21:30:59

by Parag Warudkar

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Tue, Oct 27, 2009 at 6:15 AM, Aneesh Kumar K.V
<[email protected]> wrote:
> Can you try this patch ?
>
> commit a8836b1d6f92273e001012c7705ae8f4c3d5fb65
> Author: Aneesh Kumar K.V <[email protected]>
> Date:   Tue Oct 27 15:36:38 2009 +0530
>
>    ext4: discard preallocation during truncate
>
>    We need to make sure when we drop and reacquire the inode's
>    i_data_sem we discard the inode preallocation. Otherwise we
>    could have blocks marked as free in bitmap but still belonging
>    to prealloc space.

Just wanted to let you know that I have applied this patch and one
unclean shutdown later it seems to have not given me any trouble.

I will continue testing it - hopefully I won't have to reformat this
time ( every time I tested previously I ended up having weird issues
that I decided to get rid of by reformatting /).

Parag

2009-10-29 21:39:03

by Eric Sandeen

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

Parag Warudkar wrote:
> On Tue, Oct 27, 2009 at 6:15 AM, Aneesh Kumar K.V
> <[email protected]> wrote:
>> Can you try this patch ?
>>
>> commit a8836b1d6f92273e001012c7705ae8f4c3d5fb65
>> Author: Aneesh Kumar K.V <[email protected]>
>> Date: Tue Oct 27 15:36:38 2009 +0530
>>
>> ext4: discard preallocation during truncate
>>
>> We need to make sure when we drop and reacquire the inode's
>> i_data_sem we discard the inode preallocation. Otherwise we
>> could have blocks marked as free in bitmap but still belonging
>> to prealloc space.
>
> Just wanted to let you know that I have applied this patch and one
> unclean shutdown later it seems to have not given me any trouble.
>
> I will continue testing it - hopefully I won't have to reformat this
> time ( every time I tested previously I ended up having weird issues
> that I decided to get rid of by reformatting /).

I've been running my testcase, and I just hit the usual corruption with
this patch in place after 8 iterations, I'm afraid.

-Eric

> Parag

2009-10-29 21:42:42

by Theodore Ts'o

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Thu, Oct 29, 2009 at 05:25:17PM -0400, Parag Warudkar wrote:
> > Author: Aneesh Kumar K.V <[email protected]>
> > Date: Tue Oct 27 15:36:38 2009 +0530
> >
> >     ext4: discard preallocation during truncate
> >
> Just wanted to let you know that I have applied this patch and one
> unclean shutdown later it seems to have not given me any trouble.
>
> I will continue testing it - hopefully I won't have to reformat this
> time ( every time I tested previously I ended up having weird issues
> that I decided to get rid of by reformatting /).

Cool!

It looked like Avery Fisher had reported that he had tried the patch
and it didn't help him, and so I had put it in the category of "good patch,
makes sense, but probably not the one that would solve this
regression". But if it works for you, I'll accelerate getting it to
Linus. Do let us know if you have any additional problems with this
patch, please!

- Ted

2009-10-29 21:52:30

by Parag Warudkar

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Thu, Oct 29, 2009 at 5:42 PM, Theodore Tso <[email protected]> wrote:
> On Thu, Oct 29, 2009 at 05:25:17PM -0400, Parag Warudkar wrote:
>> > Author: Aneesh Kumar K.V <[email protected]>
>> > Date: Tue Oct 27 15:36:38 2009 +0530
>> >
>> >     ext4: discard preallocation during truncate
>> >
>> Just wanted to let you know that I have applied this patch and one
>> unclean shutdown later it seems to have not given me any trouble.
>>
>> I will continue testing it - hopefully I won't have to reformat this
>> time (every time I tested previously I ended up having weird issues
>> that I decided to get rid of by reformatting /).
>
> Cool!
>
> It looked like Avery Fisher had reported that he had tried the patch
> and it didn't help him, and so I had put it in the category of "good patch,
> makes sense, but probably not the one that would solve this
> regression". But if it works for you, I'll accelerate getting it to
> Linus. Do let us know if you have any additional problems with this
> patch, please!
>

I have gone through 3 unclean shutdowns so far - this time on Ubuntu
Karmic Release, today's -git and this patch.
Nothing odd to report.

[ Goes and crashes it again - comes back clean! ]

Since literally all of my previous attempts resulted in problems,
something seems to have improved. There is no way to tell yet whether
this will last, or whether it was Aneesh's patch alone that improved
things: the last time I tried this I ended up reinstalling, I switched
from Jaunty to Karmic, and I did not test mainline for a few days.

At any rate, I will let everyone know if I see anything different.

Thanks
Parag

2009-10-30 08:16:02

by Theodore Ts'o

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Thu, Oct 29, 2009 at 04:38:57PM -0500, Eric Sandeen wrote:
>
> I've been running my testcase, and I just hit the usual corruption with
> this patch in place after 8 iterations, I'm afraid.

Eric, since you have a relatively controllable reproduction case, have
you tried reproducing Alexey's bisection results? Specifically he
seemed to find that commit fe188c0e shows no problem, and commit
91ac6f43 is the first commit with problems?

Having to do multiple iterations will make doing a bisection a major
pain, but maybe we'll get something out of that.

Other things that might be worth doing given that you have a test case
would be to try reverting commit 91ac6f43, and see if that helps, and
to try this patch: http://bugzilla.kernel.org/attachment.cgi?id=23468

Or have you tried some of these experiments already?

Regards,

- Ted

2009-10-30 13:54:32

by Eric Sandeen

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

Theodore Tso wrote:
> On Thu, Oct 29, 2009 at 04:38:57PM -0500, Eric Sandeen wrote:
>> I've been running my testcase, and I just hit the usual corruption with
>> this patch in place after 8 iterations, I'm afraid.
>
> Eric, since you have a relatively controllable reproduction case, have
> you tried reproducing Alexey's bisection results? Specifically he
> seemed to find that commit fe188c0e shows no problem, and commit
> 91ac6f43 is the first commit with problems?

I can try it but I have very little faith in that result to be honest.

> Having to do multiple iterations will make doing a bisection a major
> pain, but maybe we'll get something out of that.

Well I've been doing bisects but I'm getting skeptical of the results;
either my testcase isn't reliable enough or all the merges are confusing
git-bisect (?) Anyway it keeps ending up on nonsensical commits.

> Other things that might be worth doing given that you have a test case
> would be to try reverting commit 91ac6f43, and see if that helps, and
> to try this patch: http://bugzilla.kernel.org/attachment.cgi?id=23468
>
> Or have you tried some of these experiments already?
>
> Regards,
>
> - Ted

After talking to Aneesh last night, I think other good spot-checks will
be to revert 487caeef9fc08c0565e082c40a8aaf58dad92bbb, and to test Jan's
sync patches.

-Eric
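
One way to make a bisection less sensitive to an intermittent reproducer is to call a revision good only after the test has passed several times in a row. What follows is a minimal sketch of that idea, not Eric's actual setup: the repeat_test helper is hypothetical, and the command it runs stands in for whatever crash-and-fsck cycle is used. Corruption showed up only after 8 iterations in the report above, so the count should sit well above that.

```shell
#!/bin/sh
# repeat_test N CMD...: run CMD up to N times (hypothetical helper).
# Returns 1 at the first failure, which "git bisect run" reads as "bad".
# Returns 0 only if all N iterations pass, which it reads as "good".
repeat_test() {
    iterations=$1
    shift
    i=1
    while [ "$i" -le "$iterations" ]; do
        if ! "$@"; then
            echo "iteration $i failed: marking revision bad"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $iterations iterations passed: marking revision good"
    return 0
}
```

If the function were saved as repeat_test.sh, it could be driven with something like "git bisect run sh -c '. ./repeat_test.sh; repeat_test 20 ./crash-and-fsck.sh'", where crash-and-fsck.sh is an assumed script that builds the kernel, crashes a test machine and exits non-zero when fsck finds corruption. A false "good" is still possible; more iterations only make it less likely.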

2009-10-30 19:56:35

by Andreas Dilger

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On 2009-10-29, at 15:38, Eric Sandeen wrote:
> Parag Warudkar wrote:
>> On Tue, Oct 27, 2009 at 6:15 AM, Aneesh Kumar K.V
>> <[email protected]> wrote:
>>> Can you try this patch ?
>>>
>>> commit a8836b1d6f92273e001012c7705ae8f4c3d5fb65
>>> Author: Aneesh Kumar K.V <[email protected]>
>>> Date: Tue Oct 27 15:36:38 2009 +0530
>>>
>>> ext4: discard preallocation during truncate
>>>
>>> We need to make sure when we drop and reacquire the inode's
>>> i_data_sem we discard the inode preallocation. Otherwise we
>>> could have blocks marked as free in bitmap but still belonging
>>> to prealloc space.
>> Just wanted to let you know that I have applied this patch and one
>> unclean shutdown later it seems to have not given me any trouble.
>> I will continue testing it - hopefully I won't have to reformat this
>> time ( every time I tested previously I ended up having weird issues
>> that I decided to get rid of by reformatting /).
>
> I've been running my testcase, and I just hit the usual corruption
> with this patch in place after 8 iterations, I'm afraid.

I wonder if there are multiple problems involved here? Eric, it seems
possible that your reproducer is exercising a similar, though unrelated
codepath. I would agree with Ted that if this patch is solving the
problem for some of the users it is definitely worth landing, even if it
ends up not solving all of the problems.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2009-10-31 09:15:36

by Theodore Ts'o

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Fri, Oct 30, 2009 at 01:56:27PM -0600, Andreas Dilger wrote:
> I wonder if there are multiple problems involved here? Eric, it seems
> possible that your reproducer is exercising a similar, though unrelated
> codepath.

Note that Aneesh has published two patches which insert a call to
ext4_discard_preallocations(). One patch inserts it into the
fs/ext4/inode.c truncate path (for direct/indirect-mapped inodes) and
the other inserts it into the fs/ext4/extents.c truncate path (for
extent-mapped inodes). As near as I can tell both patches are
necessary, and it looks to me like they should be combined into a
single patch, since commit 487caeef9 affects both truncate paths.
Aneesh, do you concur?

Like Andreas, I am suspicious that there may be multiple problems
occurring here, so here is a proposed plan of attack.

Step 1) Sanity check that commit 0a80e986 shows the problem. This is
immediately after the first batch of ext4 patches which I sent to
Linus during the post-2.6.31 merge window. Patches in the middle of
this first batch have been reported by Avery as showing the problem,
and while we may have some "git bisect good" revisions that were
really bad, in general if a revision is reported bad, the problem is
probably there at that version and successive versions. Hence, I'm
_pretty_ sure that 0a80e986 should demonstrate the problem.

Step 2) Sanity check that commit ab86e576 does _not_ show the problem.
This commit corresponds to 2.6.31-git6, and there are no ext4 patches
that I pushed before that point. There are three commits that show up
in response to the command "git log v2.6.31..v2.6.31-git6 -- fs/ext4
fs/jbd2", but they weren't pushed by me. Although come to think of
it, Jan Kara's commit 0d34ec62, "ext4: Remove syncing logic from
ext4_file_write" is one we might want to look at very carefully if
commit ab86e576 also shows the problem....

Step 3) Assuming that Step 1 and Step 2 are as I expect, with commit
ab86e576 being "good", and commit 0a80e986 being "bad", we will have
localized the problem commit(s) to the 63 commits that were initially
pushed to Linus during the merge window. One of the commits is
487caeef9, which Aneesh has argued convincingly seems to be
problematic; fixing it seems to solve at least one or two reporters'
problems, but clearly isn't a complete solution. So let's try to
narrow things down further by testing this branch:

git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git test-history

This branch corresponds to commit 0a80e986 (from Step 1), but with the
problematic commit 487caeef9 removed. It was generated by applying
the following guilt patch series to v2.6.31-git6:

git://repo.or.cz/ext4-patch-queue.git test-history

The advantage of starting with the head of test-history is that if
there are multiple problematic commits, this should show the problem
(just as reverting 487caeef9 would) --- but since 487caeef9 is
actually removed, we can now do a "git bisect start test-history
v2.6.31-git6" and hopefully be able to localize whatever additional
commits might be bad.

(We could also keep applying and unapplying the patch corresponding to
the revert of 487caeef9 while doing a bisection, but that tends to be
error prone.)

Does that sound like a plan?

- Ted
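
The plan above can be written down as a shell session. This is only a sketch: the commit IDs, branch name and repository URL are taken verbatim from the message, while the run helper and the DRY_RUN switch are made up for illustration. The build, crash and fsck work that has to happen after each checkout and each bisect step is left out.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 inside
# a kernel git tree to actually execute them. Both names are illustrative.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

bisect_plan() {
    # Step 1: 0a80e986 is expected to show the problem.
    run git checkout 0a80e986
    # Step 2: ab86e576 (2.6.31-git6) is expected NOT to show it.
    run git checkout ab86e576
    # The three ext4/jbd2 commits that went in before the ext4 patches.
    run git log v2.6.31..v2.6.31-git6 -- fs/ext4 fs/jbd2
    # Step 3: fetch the history with 487caeef9 removed, then bisect it.
    run git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git test-history:test-history
    run git bisect start test-history v2.6.31-git6
}
```

Calling bisect_plan with the default settings prints the five git commands without touching any repository. Each real step would still need a kernel build and several unclean shutdowns before it could be judged good or bad.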

2009-10-31 15:25:54

by Aneesh Kumar K.V

Subject: Re: [Bug 14354] Re: ext4 increased intolerance to unclean shutdown?

On Sat, Oct 31, 2009 at 05:15:28AM -0400, Theodore Tso wrote:
> On Fri, Oct 30, 2009 at 01:56:27PM -0600, Andreas Dilger wrote:
> > I wonder if there are multiple problems involved here? Eric, it seems
> > possible that your reproducer is exercising a similar, though unrelated
> > codepath.
>
> Note that Aneesh has published two patches which insert a call to
> ext4_discard_preallocations(). One patch inserts it into the
> fs/ext4/inode.c truncate path (for direct/indirect-mapped inodes) and
> the other inserts it into the fs/ext4/extents.c truncate path (for
> extent-mapped inodes). As near as I can tell both patches are
> necessary, and it looks to me like they should be combined into a
> single patch, since commit 487caeef9 affects both truncate paths.
> Aneesh, do you concur?
>

We need only the patch that drops prealloc space in
ext4_truncate_restart_trans. ext4_ext_truncate_extend_restart calls
ext4_truncate_restart_trans, so adding the prealloc space dropping in
ext4_truncate_restart_trans should handle both direct/indirect-mapped
inodes and extent-mapped inodes.

-aneesh