LinuxLists.cc - Massive ext4 filesystem corruption after a failed s2disk/ram cycle

2009-10-06 21:07:38

Subject: Massive ext4 filesystem corruption after a failed s2disk/ram cycle

Hi,

Just prior to the 2.6.32 cycle I tried the -next tree and noticed that after a
failed s2ram (here it works only once, and I test once in a while to see
if it got fixed accidentally) I got a minor filesystem corruption. I am sorry I
didn't report that back then.

Now I have installed 2.6.32-rc2 (well -rc1...) and things were sort of
ok; I even thought that hibernation was once again stable
(somewhere in the not-so-distant past, hibernation, which used to
work, began to fail randomly on resume).

A few days ago, I got a read-only filesystem again, an fsck, a few more
corrupted files... It should have rung a bell for me (I still
used hibernation, trying to understand why it fails sometimes).

Yesterday, however, I decided to fix that once and for all, and for
that I set up a loop + rtc wakealarm to make it cycle through
hibernation.
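
A loop of this kind could look roughly like the sketch below. It is not the script used here: it assumes the standard sysfs files /sys/class/rtc/rtc0/wakealarm and /sys/power/state, assumes rtc0 is the RTC that can wake the machine, and the function names and timings are made up for illustration. It has to run as root.

```python
import time
from pathlib import Path

# Standard sysfs locations. That rtc0 is the wakeup-capable RTC is an assumption.
WAKEALARM = Path("/sys/class/rtc/rtc0/wakealarm")
POWER_STATE = Path("/sys/power/state")


def arm_wakealarm(seconds_from_now, wakealarm=WAKEALARM, now=None):
    """Program the RTC alarm; return the epoch time that was written."""
    now = int(time.time()) if now is None else int(now)
    wake_at = now + int(seconds_from_now)
    wakealarm.write_text("0\n")           # clear any alarm that is still pending
    wakealarm.write_text(f"{wake_at}\n")  # seconds since the epoch
    return wake_at


def hibernate_cycles(count, sleep_seconds=60, settle_seconds=30,
                     wakealarm=WAKEALARM, power_state=POWER_STATE):
    """Hibernate `count` times, letting the RTC alarm wake the machine each time."""
    for cycle in range(1, count + 1):
        arm_wakealarm(sleep_seconds, wakealarm)
        power_state.write_text("disk\n")  # returns only after resume (or failure)
        print(f"cycle {cycle} resumed")
        time.sleep(settle_seconds)        # give the system time to settle


# Example (hypothetical values, needs root): hibernate_cycles(20)
```

Nothing here checks the filesystem between cycles; a real test loop would probably also want to log dmesg output after each resume.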

Needless to say I didn't run that loop for more than maybe 3 cycles (and no
failures), but noticed that the rtc clock is dead on resume.

I sort of fixed that (this is hpet emulation that strikes again), I will
post when I test the fix (trivial), because when I had rebooted the
system into the modified kernel, I got that readonly filesystem again,
and this time the damage had spread over lots of files.
(I have even lost most of dpkg database..., many programs,
libraries,..., settings)

Yet, thanks to Linux flexibility, after a day, and some study of
nautilus source, I had the system recovered fully.
(Now am doing backups.....)

But I don't want that to happen again...

Another clue that I have seen was that ext4 driver reported that it
aborts journal replay.

I know that for now there is not much you can do, but just to let you
know that something is there...

What is especially interesting is that there was no s2ram/disk failure
preceding the corruption, but my theory is that the corruption went
undetected for a while after the last failure, which probably led to such bad
consequences.

You do sync file-systems before entering the hibernation, don't you?

Best regards,
Maxim Levitsky

2009-10-06 21:42:56

by Theodore Ts'o

Subject: Re: Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Tue, Oct 06, 2009 at 11:06:55PM +0200, Maxim Levitsky wrote:
>
> Just prior to 2.6.32 cycle I tried -next tree and noticed that after a
> failed s2ram (here it works only once, and I test once in a whileto see
> if fixed accidentally) I got a minor filesystem corruption. I am sorry I
> didn't report that back then.

When you say filesystem corruption, it's important to indicate whether
you meant that (a) you noticed that some files had corrupted
contents, (b) the kernel complained that the filesystem was corrupted,
and remounted the filesystem read-only, or (c) e2fsck found and fixed
errors.

Also, when you found errors of either class (a) or (b), did you run
e2fsck to find and fix any potential errors? In a few places it
sounded like the kernel had complained about errors, but you had
ignored them and hadn't run e2fsck to fix them. I hope that was just
me misunderstanding what you wrote! Can you clarify?

- Ted

2009-10-06 21:57:35

by Rafael J. Wysocki

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Tuesday 06 October 2009, Maxim Levitsky wrote:
> Hi,
>
> Just prior to 2.6.32 cycle I tried -next tree and noticed that after a
> failed s2ram (here it works only once, and I test once in a whileto see
> if fixed accidentally) I got a minor filesystem corruption. I am sorry I
> didn't report that back then.
>
> Now I have installed 2.6.32-rc2 (well -rc1...) and things were sort of
> ok, I have even thought that hibernation is once again stable
> (somewhere in the not that distinct past the hibernation which used to
> work, began to fail randomly on resume)
>
> Few days ago, I got a read-only filesystem again, an fsck, few more
> corrupted files..., It should have had rung the bell for me (I have
> still used hibernation, trying to understand why it fails sometimes)
>
> Yesterday, however, I have decided to fix that once and for all, and for
> that I have set up a loop + rtc wakealarm to make it cycle through
> hibernation.
>
> Needless to say I didn't run that loop more that maybe 3 cycles (and no
> failures), but noticed that rtc clock is dead on resume.
>
> I sort of fixed that (this is hpet emulation that strikes again), I will
> post when I test the fix (trivial), because when I had rebooted the
> system into the modified kernel, I got that readonly filesystem again,
> and this time the damage had spread over lots of files.
> (I have even lost most of dpkg database..., many programs,
> libraries,..., settings)
>
> Yet, thanks to Linux flexibility, after a day, and some study of
> nautilus source, I had the system recovered fully.
> (Now am doing backups.....)
>
> But I don't want that to happen again...
>
> Another clue that I have seen was that ext4 driver reported that it
> aborts journal replay.
>
> I know that for now there is not much you can do, but just to let you
> know that something is there...
>
> What is especially interesting is that there were no s2ram'disk faulure
> preceding the corruption, but my theory is that corruption wasn't
> detected for a while from last failure, probably giving such bad
> consequences.
>
> You do sync file-systems before entering the hibernation, don't you?

Yes, a sync is there, but it is not effective on some filesystems.

Thanks,
Rafael

2009-10-06 22:54:38

by Henrique de Moraes Holschuh

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Tue, 06 Oct 2009, Rafael J. Wysocki wrote:
> > You do sync file-systems before entering the hibernation, don't you?
>
> Yes, a sync is there, but it is not effective on some filesystems.

Which ones?

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2009-10-06 23:04:18

by Maxim Levitsky

Subject: Re: Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Tue, 2009-10-06 at 17:42 -0400, Theodore Tso wrote:
> On Tue, Oct 06, 2009 at 11:06:55PM +0200, Maxim Levitsky wrote:
> >
> > Just prior to 2.6.32 cycle I tried -next tree and noticed that after a
> > failed s2ram (here it works only once, and I test once in a whileto see
> > if fixed accidentally) I got a minor filesystem corruption. I am sorry I
> > didn't report that back then.
>
> When you say filesystem corruption, it's important to indicate whether
> you meant that (a) you noticed that some files were had corrupted
> contents, (b) the kernel complained that the filesystem was corrupted,
> and remounted the filesystem read-only, or (c) e2fsck found and fixed
> errors.
>
> Also, when you found errors of either class (a) or (b), did you run
> e2fsck to find and fix any potential errors? In a few places it
> sounded like the kernel had complained about errors, but you had
> ignored them and hadn't run e2fsck to fix them. I hope that was just
> me misunderstanding what you wrote! Can you clarify?

Sure, kernel noticed errors, and remounted the filesystem R/O (I didn't
write anything down. really sorry)

I had rebooted the system.
Then startup scripts had booted the system to root shell

I had run fsck on the filesystem. It had plenty of files with shared
blocks, many orphaned inodes, errors in free bitmaps.

Then, after the fsck, I got many missing files (many probably went to
lost+found), some had garbage, some became truncated (0 size)

The affected files were mostly those from a recent dpkg update.

I use ubuntu 9.10, and (almost) latest -git of kernel tree.

Best regards,
Maxim Levitsky

2009-10-06 23:02:00

by Rafael J. Wysocki

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Wednesday 07 October 2009, Henrique de Moraes Holschuh wrote:
> On Tue, 06 Oct 2009, Rafael J. Wysocki wrote:
> > > You do sync file-systems before entering the hibernation, don't you?
> >
> > Yes, a sync is there, but it is not effective on some filesystems.
>
> Which ones?

XFS for one example.

2009-10-07 01:30:38

by Henrique de Moraes Holschuh

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Wed, 07 Oct 2009, Rafael J. Wysocki wrote:
> On Wednesday 07 October 2009, Henrique de Moraes Holschuh wrote:
> > On Tue, 06 Oct 2009, Rafael J. Wysocki wrote:
> > > > You do sync file-systems before entering the hibernation, don't you?
> > >
> > > Yes, a sync is there, but it is not effective on some filesystems.
> >
> > Which ones?
>
> XFS for one example.

Interesting. So XFS is not only a Bad Idea for /, but also for anything
that might enter S3/S4. Not nice. I sure hope it doesn't do a
half-assed job of flushing and checkpointing itself during machine
shutdown/restart like it apparently does when told to "sync" before
S3/S4...

Would you be so kind as to disclose to us, the uninitiated, which other
filesystems are unsafe when faced with a sleep/suspend request?

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2009-10-07 02:26:08

by Daniel Pittman

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

Henrique de Moraes Holschuh <[email protected]> writes:
> On Wed, 07 Oct 2009, Rafael J. Wysocki wrote:
>> On Wednesday 07 October 2009, Henrique de Moraes Holschuh wrote:
>> > On Tue, 06 Oct 2009, Rafael J. Wysocki wrote:
>> > > > You do sync file-systems before entering the hibernation, don't you?
>> > >
>> > > Yes, a sync is there, but it is not effective on some filesystems.
>> >
>> > Which ones?
>>
>> XFS for one example.
>
> Interesting. So XFS is not only a Bad Idea for /, but also for anything
> that might enter S3/S4. Not nice. I sure hope it doesn't do a
> half-assed job of flushing and checkpointing itself during machine
> shutdown/restart like it apparently does when told to "sync" before
> S3/S4...

For what it is worth, I would also be quite interested to know /why/ XFS is
bad in this regard. Is it just the previously stated "XFS writes to disk
despite freezing kernel threads" issue, or something deeper?

Daniel
--
✣ Daniel Pittman ✉ [email protected] ☎ +61 401 155 707
♽ made with 100 percent post-consumer electrons

2009-10-07 14:26:31

by Jindrich Makovicka

Subject: Re: Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Wed, 07 Oct 2009 01:02:25 +0200
Maxim Levitsky <[email protected]> wrote:

> On Tue, 2009-10-06 at 17:42 -0400, Theodore Tso wrote:
> > On Tue, Oct 06, 2009 at 11:06:55PM +0200, Maxim Levitsky wrote:
> > >
> > > Just prior to 2.6.32 cycle I tried -next tree and noticed that
> > > after a failed s2ram (here it works only once, and I test once in
> > > a whileto see if fixed accidentally) I got a minor filesystem
> > > corruption. I am sorry I didn't report that back then.
> >
>
> Sure, kernel noticed errors, and remounted the filesystem R/O (I
> didn't write anything down. really sorry)
>
> I had rebooted the system.
> Then startup scripts had booted the system to root shell
>
> I had run fsck on the filesystem. It had plenty of files with shared
> blocks, many orphaned inodes, errors in free bitmaps.
>
>
> Then, after the fsck, I got many missing files (many probably went to
> lost+found), some had garbage, some became truncated (0 size)
>
> Mostly were affected files that were from recent dpkg update.
>
>
> I use ubuntu 9.10, and (almost) latest -git of kernel tree.

I encountered something very similar yesterday, with 2.6.32-rc3. When
doing sync after accidentally removing a mounted USB stick, sync got
stuck, so I resorted to SysRq+S/U/B. Unfortunately this was also just
after an apt-get upgrade. The result was the same, corrupted ext4
partitions, shared blocks, orphaned inodes, free bitmap errors.

Only recently written files seem to be affected, in my case
upgraded stuff in / and configuration files in /home.

--
Jindrich Makovicka

2009-10-07 16:16:53

by Christoph Hellwig

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Wed, Oct 07, 2009 at 01:14:10PM +1100, Daniel Pittman wrote:
> For what it is worth, I would also be quite interested to know /why/ XFS is
> bad in this regard. Is it just the previously stated "XFS writes to disk
> despite freezing kernel threads" issue, or something deeper?

sync pushes out all data to disk, but in a journaling filesystem that
might just be the log, not the "normal" place on disk. For a boot
loader to deal with it properly it actually needs to do a replay of
the log. Grub does so for reiserfs but not for XFS for some reason.
I don't know why problems don't trigger more often with ext3, though.
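
The effect can be illustrated with a toy model. The sketch below is not how any real filesystem or GRUB works; the class and method names are invented, and the "disk" is just two Python containers. It only shows that after a sync the new contents may exist solely in the log, so a reader that ignores the log sees stale blocks.

```python
class ToyJournaledDisk:
    """Toy model: 'home' is the normal on-disk location, 'log' is the journal."""

    def __init__(self, home=None):
        self.home = dict(home or {})  # block number -> data at its final place
        self.log = []                 # durable journal records: (block, data)
        self.dirty = {}               # in-memory changes not yet on disk at all

    def write(self, block, data):
        self.dirty[block] = data      # buffered in memory only

    def sync(self):
        # Everything becomes durable, but only by being appended to the log.
        self.log.extend(self.dirty.items())
        self.dirty.clear()

    def checkpoint(self):
        # Later, the log is written back to the normal locations.
        for block, data in self.log:
            self.home[block] = data
        self.log.clear()

    def read_without_replay(self, block):
        # What a reader sees if it ignores the journal entirely.
        return self.home.get(block)

    def read_with_replay(self, block):
        # What a reader sees if it applies the journal first (newest record wins).
        for logged_block, data in reversed(self.log):
            if logged_block == block:
                return data
        return self.home.get(block)


disk = ToyJournaledDisk({7: "old kernel image"})
disk.write(7, "new kernel image")
disk.sync()
print(disk.read_without_replay(7))  # stale contents
print(disk.read_with_replay(7))     # contents after replaying the log
```

In this model, only checkpoint() makes the two reads agree, which is the step a plain sync does not guarantee.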

2009-10-10 03:28:14

by Maxim Levitsky

Subject: ext4 filesystem corruption

I have more information on that issue.

First of all this isn't related to s2ram/disk.
Second, this happened here again 3 times.

Now kernel complains loudly about access to freed inode.

After a reboot fsck tells the following:

- Some directory entries point to freed inodes,
Which means these files are gone, but I never deleted some of them

- Some inodes have shared blocks

- Some orphaned inodes found

- Free block counts/bitmaps corrupted.

That all happens without any s2ram/disk cycle.

However, an unusual situation did happen today.
I had installed an update to the mountall ubuntu package, and it hosed the
whole boot process.
I had to reboot many times, and once held the power button for 4
seconds.

I also often used SYSRQ+U/SYSRQ+B.

On the contrary, I did several s2disk cycles, and one did fail, but
there was no corruption.

I must say that until now, I had never seen any ext3/ext4 corruption,
even though there were many many crashes, power failures, forced
reboots, etc...

Best regards,
Maxim Levitsky

2009-11-04 02:18:06

by KOSAKI Motohiro

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

> On Wed, Oct 07, 2009 at 01:14:10PM +1100, Daniel Pittman wrote:
> > For what it is worth, I would also be quite interested to know /why/ XFS is
> > bad in this regard. Is it just the previously stated "XFS writes to disk
> > despite freezing kernel threads" issue, or something deeper?
>
> sync pushes out all data to disk, but in a journaling filesystem that
> might just but the log not the "normal" place on disk. For a boot
> loader to deal with it properly it actually needs to do an replay of
> the log. Grub does so for reiserfs but not for XFS for some reason.
> I don't know why problems don't trigger more often with ext3, though.

I'm sorry for the long-delayed and off-topic response. I discussed this
issue with okuji-san (GRUB2 maintainer) several months ago.
He really wishes Linux would implement a real sync.

A bootloader has many more constraints than an OS (mainly caused by size constraints).
It can't implement journal log replay logic for _all_ filesystems. Why can't we
implement a strong sync syscall? I don't think this is a PM or bootloader fault.

2009-11-05 09:56:37

by Henrique de Moraes Holschuh

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Wed, 04 Nov 2009, KOSAKI Motohiro wrote:
> > On Wed, Oct 07, 2009 at 01:14:10PM +1100, Daniel Pittman wrote:
> > > For what it is worth, I would also be quite interested to know /why/ XFS is
> > > bad in this regard. Is it just the previously stated "XFS writes to disk
> > > despite freezing kernel threads" issue, or something deeper?
> >
> > sync pushes out all data to disk, but in a journaling filesystem that
> > might just but the log not the "normal" place on disk. For a boot
> > loader to deal with it properly it actually needs to do an replay of
> > the log. Grub does so for reiserfs but not for XFS for some reason.
> > I don't know why problems don't trigger more often with ext3, though.
>
> I'm sorry for the long delayed and offtopic responce. I discussed this
> issue with okuji-san (GRUB2 maintainer) at several month ago.
> He really wish linux implement real sync.

This is not about real sync. It is about the box being able to reboot after
a crash or power failure.

GRUB2 is broken in that regard, at least in its peecee-BIOS version: last
time I checked, it doesn't sort RAID components so that it won't boot from
failed or out-of-sync older components, it can't deal with some of the
filesystems being unclean...

> A bootloader has much constraint than OS (mainly caused by size constraint).
> it can't implemnt jornal log replay logic for _all_ filesystem. Why can't we
> implement storong sync syscall? I don't think this is PM nor bootloader fault.

A bootloader that can't boot a system that went through an unclean shutdown
is quite broken.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2009-11-07 22:22:38

by Thomas Fjellstrom

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Thu November 5 2009, Henrique de Moraes Holschuh wrote:
> On Wed, 04 Nov 2009, KOSAKI Motohiro wrote:
> > > On Wed, Oct 07, 2009 at 01:14:10PM +1100, Daniel Pittman wrote:
> > > > For what it is worth, I would also be quite interested to know
> > > > /why/ XFS is bad in this regard. Is it just the previously stated
> > > > "XFS writes to disk despite freezing kernel threads" issue, or
> > > > something deeper?
> > >
> > > sync pushes out all data to disk, but in a journaling filesystem that
> > > might just but the log not the "normal" place on disk. For a boot
> > > loader to deal with it properly it actually needs to do an replay of
> > > the log. Grub does so for reiserfs but not for XFS for some reason.
> > > I don't know why problems don't trigger more often with ext3, though.
> >
> > I'm sorry for the long delayed and offtopic responce. I discussed this
> > issue with okuji-san (GRUB2 maintainer) at several month ago.
> > He really wish linux implement real sync.
>
> This is not about real sync. It is about the box being able to reboot
> after a crash or power failure.
>
> GRUB2 is broken in that regard, at least in its peecee-BIOS version:
> last time I checked, it doesn't sort RAID components so that it won't
> boot from failed or out-of-sync older components, it can't deal with
> some of the filesystems being unclean...
>
> > A bootloader has much constraint than OS (mainly caused by size
> > constraint). it can't implemnt jornal log replay logic for _all_
> > filesystem. Why can't we implement storong sync syscall? I don't think
> > this is PM nor bootloader fault.
>
> A bootloader that can't boot a system that went through an unclean
> shutdown is quite broken.
>

It can barely boot a system that's gone through a clean shutdown. "bios read
error" and all that.

--
Thomas Fjellstrom
[email protected]

2009-11-08 08:29:06

by Dave Chinner

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Wed, Nov 04, 2009 at 11:18:05AM +0900, KOSAKI Motohiro wrote:
> > On Wed, Oct 07, 2009 at 01:14:10PM +1100, Daniel Pittman wrote:
> > > For what it is worth, I would also be quite interested to know
> > > /why/ XFS is bad in this regard. Is it just the previously
> > > stated "XFS writes to disk despite freezing kernel threads"
> > > issue, or something deeper?
> >
> > sync pushes out all data to disk, but in a journaling filesystem
> > that might just but the log not the "normal" place on disk. For
> > a boot loader to deal with it properly it actually needs to do
> > an replay of the log. Grub does so for reiserfs but not for XFS
> > for some reason. I don't know why problems don't trigger more
> > often with ext3, though.
>
> I'm sorry for the long delayed and offtopic responce. I discussed
> this issue with okuji-san (GRUB2 maintainer) at several month ago.
> He really wish linux implement real sync.
>
> A bootloader has much constraint than OS (mainly caused by size
> constraint). it can't implemnt jornal log replay logic for _all_
> filesystem. Why can't we implement storong sync syscall? I don't
> think this is PM nor bootloader fault.

We already have an ioctl that does what you want: FIFREEZE.
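
A minimal sketch of calling it from user space, assuming the generic Linux ioctl number encoding (as on x86 and ARM; some architectures encode the direction bits differently) and CAP_SYS_ADMIN. The helper names are made up for illustration; FIFREEZE and FITHAW are defined in linux/fs.h as _IOWR('X', 119, int) and _IOWR('X', 120, int).

```python
import contextlib
import fcntl
import os


def _iowr(type_char, nr, size):
    # Generic Linux encoding: direction (read|write = 3) in the top two bits,
    # then argument size, then the type character, then the command number.
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


FIFREEZE = _iowr("X", 119, 4)  # _IOWR('X', 119, int)
FITHAW = _iowr("X", 120, 4)    # _IOWR('X', 120, int)


@contextlib.contextmanager
def frozen(mountpoint):
    """Freeze the filesystem mounted at `mountpoint`, thaw it again on exit."""
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FIFREEZE, 0)
        try:
            yield
        finally:
            fcntl.ioctl(fd, FITHAW, 0)
    finally:
        os.close(fd)


# Example (needs root; "/mnt/data" is a hypothetical mount point):
# with frozen("/mnt/data"):
#     pass  # the filesystem is quiesced and writers block in here
```

Freezing the filesystem that holds the running program or its shell can hang the session, so this is best tried on a scratch mount.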

Cheers,

Dave.
--
Dave Chinner
[email protected]

2009-11-08 16:49:28

by Christoph Hellwig

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Sun, Nov 08, 2009 at 07:29:05PM +1100, Dave Chinner wrote:
> We already have an ioctl that does what you want: FIFREEZE.

Doesn't really help, as there is no guarantee that important metadata isn't
modified again before the bootloader accesses it, but that's a fate
shared with any other kind of super sync. The only way to really fix
the problem is to implement proper (in-memory) log recovery in the
bootloader, especially as it doesn't only have to deal with the
relatively easy case of clean shutdowns but also needs to deal with the
case of an unclean shutdown with major amounts of updates to the lookup
and allocation data structures in the log.
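
As a rough illustration of what in-memory log recovery means, here is a toy sketch. The record format is invented and bears no relation to the real XFS or jbd2 log formats, which also have to handle checksums, revocations, wrap-around and so on. The idea is only that the reader builds an overlay from committed transactions and never writes to the disk.

```python
def build_overlay(log):
    """Return {block: data} for committed transactions only.

    `log` is a list of records in on-disk order, either
    ("write", txid, block, data) or ("commit", txid).
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    overlay = {}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            overlay[rec[2]] = rec[3]  # later records override earlier ones
    return overlay


def read_block(disk, overlay, block):
    """Read through the overlay first; the disk itself is never modified."""
    return overlay.get(block, disk.get(block))


# Unclean shutdown: transaction 10 committed, transaction 11 did not.
disk = {1: "directory v1", 2: "bitmap v1"}
log = [
    ("write", 10, 1, "directory v2"),
    ("commit", 10),
    ("write", 11, 2, "bitmap v2"),
]
overlay = build_overlay(log)
print(read_block(disk, overlay, 1))  # committed update is visible
print(read_block(disk, overlay, 2))  # uncommitted update is ignored
```

Keeping the replay in memory means the bootloader does not need write support for the filesystem, at the cost of redoing the work on every boot until the kernel replays the log for real.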

IMHO the best option is to have a separate partition for /boot with a
very simple filesystem that we can expect boot loader developers to
implement fully and correctly.

2009-11-09 09:42:51

by Henrique de Moraes Holschuh

Subject: Re: [linux-pm] Massive ext4 filesystem corruption after a failed s2disk/ram cycle

On Sun, 08 Nov 2009, Christoph Hellwig wrote:
> IMHO the best option is to have a separate partition for /boot with a
> very simple filesystem that we can expect boot loader developers to
> implement fully and correctly.

Agreed. And one that is not all but abandoned kernel-side, or you will risk
bugs there.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh