2010-06-07 20:45:41

by Jeffrey Merkey

Subject: EXT3 File System Corruption 2.6.34

Still seeing file system corruption after journal recovery in EXT3.
It's easy to reproduce, though the symptoms vary. One way is to
rebuild a program and while the program is being compiled just shut
off power to the system by pulling the plug. I am seeing the
/root/.viminfo file trashed after recovery if Vim was active during
poweroff. I am also seeing object modules getting built which the LD
linker claims are "invalid" following a recovery event. I suspect a
bug in the buffer cache since deleting the file still causes the old
data to be returned from buffer cache even when the sectors are
overwritten, but both are interrelated. Seems in some way related to
EXT3 recovery which results in the buffer cache returning old sectors
and junk.

It is not hard to reproduce. The symptoms are always a little different,
but the /root/.viminfo file getting nuked seems to be a common effect of
this bug.

Jeff


2010-06-07 21:00:00

by Eric Sandeen

Subject: Re: EXT3 File System Corruption 2.6.34

Jeffrey Merkey wrote:
> Still seeing file system corruption after journal recovery in EXT3.
> It's easy to reproduce, though the symptoms vary. One way is to
> rebuild a program and while the program is being compiled just shut
> off power to the system by pulling the plug. I am seeing the
> /root/.viminfo file trashed after recovery if Vim was active during
> poweroff. I am also seeing object modules getting built which the LD
> linker claims are "invalid" following a recovery event. I suspect a
> bug in the buffer cache since deleting the file still causes the old
> data to be returned from buffer cache even when the sectors are
> overwritten, but both are interrelated. Seems in some way related to
> EXT3 recovery which results in the buffer cache returning old sectors
> and junk.
>
> It is not hard to reproduce. The symptoms are always a little different,
> but the /root/.viminfo file getting nuked seems to be a common effect of
> this bug.

"file system corruption" usually means corrupted metadata, but I guess
here you mean file corruption, i.e. corrupted data.

If you have buffered data in the cache, it will be lost when you pull
the plug. If your userspace doesn't sync it, this is expected. But it's
not clear to me what you're seeing.
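To illustrate what "userspace syncing it" can look like, here is a minimal Python sketch, not taken from this thread. The function name durable_write is hypothetical. It assumes a POSIX system and a disk that honours flush requests (which is where the barrier mount option comes in); if the drive's write cache ignores them, fsync alone may not be enough.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # push Python's buffer to the kernel
            os.fsync(tmp.fileno())   # ask the kernel to push it to the disk
        os.replace(tmp_path, path)   # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is persistent (POSIX systems).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A program that writes its state this way should find either the previous or the new contents after a power cut, rather than a partially written file. Data that was only written, and never synced, can still be lost.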

I'm also not clear on what you mean about deleting the file and having old
data returned. Maybe a little cut and paste from the screen would help
explain what you see.

I'd also check CONFIG_EXT3_DEFAULTS_TO_ORDERED and be sure you're
using data=ordered mode by default.

-Eric

> Jeff

2010-06-07 21:04:29

by Valdis Klētnieks

Subject: Re: EXT3 File System Corruption 2.6.34

On Mon, 07 Jun 2010 14:45:38 MDT, Jeffrey Merkey said:
> Still seeing file system corruption after journal recovery in EXT3.

Are you getting bit by one of these mount options? (from 'man mount')
There were changes a few releases ago, might want to check what
your kernel build defaulted it to in your 2.6.34.

data={journal|ordered|writeback}
       Specifies the journalling mode for file data. Metadata is
       always journaled. To use modes other than ordered on the root
       filesystem, pass the mode to the kernel as boot parameter, e.g.
       rootflags=data=journal.

       journal
              All data is committed into the journal prior to being
              written into the main filesystem.

       ordered
              This is the default mode. All data is forced directly
              out to the main file system prior to its metadata being
              committed to the journal.

       writeback
              Data ordering is not preserved - data may be written into
              the main filesystem after its metadata has been committed
              to the journal. This is rumoured to be the highest-
              throughput option. It guarantees internal filesystem
              integrity, however it can allow old data to appear in
              files after a crash and journal recovery.

barrier=0 / barrier=1
       This enables/disables barriers. barrier=0 disables it,
       barrier=1 enables it. Write barriers enforce proper on-disk
       ordering of journal commits, making volatile disk write caches
       safe to use, at some performance penalty. The ext3 filesystem
       does not enable write barriers by default. Be sure to enable
       barriers unless your disks are battery-backed one way or
       another. Otherwise you risk filesystem corruption in case of
       power failure.
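
One way to see which of these options a mounted ext3 filesystem reports is to read its line in /proc/mounts. The following Python sketch is an illustration only; the function name journal_options is hypothetical. It assumes the standard /proc/mounts field layout, and it assumes that an option the kernel leaves at its built-in default may not be listed at all, so a missing key does not prove which mode is in effect.

```python
def journal_options(mounts_text: str, mount_point: str = "/") -> dict:
    """Return the data= and barrier= options listed for an ext3 mount."""
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[1] != mount_point or fields[2] != "ext3":
            continue
        # A later entry for the same mount point shadows an earlier one.
        found = {}
        for option in fields[3].split(","):
            key, _, value = option.partition("=")
            if key in ("data", "barrier"):
                found[key] = value
    return found
```

On a running system it can be fed the contents of /proc/mounts, for example journal_options(open("/proc/mounts").read()). An empty result means that neither option was listed for that mount point.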



2010-06-08 02:37:10

by Jeffrey Merkey

Subject: Re: EXT3 File System Corruption 2.6.34

Cool. I'll use that from now on. wonder if the source code came from
xdump 10 years ago ... LOL

:)

Jeff

On Mon, Jun 7, 2010 at 8:22 PM, Eric Sandeen <[email protected]> wrote:
> Jeffrey Merkey wrote:
>> Well, I will set this as default from now on. Tell Evil Emperor Linus
>> to put the fucking thing back the way it was so default kernel builds
>> are not fucked up.
>
> ... it was a long discussion I won't re-hash.
>
>
>> Jeff
>>
>> here is the source to the xdump.c file. may be useful in the future
>> to someone who needs a tool to dump files to post to the list. Anyway
>> -- easier to use than that fucking hexedit program.
>>
>> :-)
>>
>
>
> you don't like hexdump -C? more or less the same.
>
> # hexdump -C a8866020.pdf | more
> 00000000  25 50 44 46 2d 31 2e 32  0d 0a 25 e2 e3 cf d3 0d  |%PDF-1.2..%.....|
> 00000010  0a 32 20 30 20 6f 62 6a  0d 0a 3c 3c 0d 0a 2f 4c  |.2 0 obj..<<../L|
> 00000020  65 6e 67 74 68 20 31 37  32 37 34 0d 0a 2f 46 69  |ength 17274../Fi|
> 00000030  6c 74 65 72 20 2f 46 6c  61 74 65 44 65 63 6f 64  |lter /FlateDecod|
> 00000040  65 0d 0a 3e 3e 0d 0a 73  74 72 65 61 6d 0d 0a 48  |e..>>..stream..H|
> .....
>
> -Eric
>

2010-06-08 05:40:59

by Valdis Klētnieks

Subject: Re: EXT3 File System Corruption 2.6.34

On Mon, 07 Jun 2010 20:37:06 MDT, Jeffrey Merkey said:
> Cool. I'll use that from now on. wonder if the source code came from
> xdump 10 years ago ... LOL

Probably not, given that 'man hexdump' says:

BSD April 18, 1994 BSD

Plus, they obviously rolled the code for '-e formatstring' themselves, nobody
could have been so desperate to steal that code. ;)

hexdump -e '"%08.08_ax " 4/4 "%08X " " " 4/4 "%08x " ' -e '" *" 32/1 "%_p"' -e '"*\n"'

That's so old-skool it hurts. :)
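
For anyone who wants the row layout of the hexdump -C output quoted earlier in the thread without writing a format string, here is a small Python sketch. It is neither xdump nor hexdump source; the function name canonical_dump is hypothetical. It only reproduces the layout of the data rows: unlike the real tool, it does not collapse repeated rows into "*" and does not print a final offset line.

```python
def canonical_dump(data: bytes, width: int = 16) -> list:
    """Format bytes as offset, hex columns and printable ASCII, one row per line."""
    half = width // 2
    hex_width = 3 * half - 1          # e.g. 23 characters for 8 bytes
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        left = " ".join(f"{b:02x}" for b in row[:half]).ljust(hex_width)
        right = " ".join(f"{b:02x}" for b in row[half:]).ljust(hex_width)
        # Bytes outside printable ASCII are shown as '.'.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        lines.append(f"{offset:08x}  {left}  {right}  |{text}|")
    return lines
```

Printing "\n".join(canonical_dump(open(name, "rb").read())) for some file name gives a listing suitable for pasting into a mail.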




2010-06-08 14:45:00

by Jeffrey Merkey

Subject: Re: EXT3 File System Corruption 2.6.34

On Mon, Jun 7, 2010 at 11:40 PM, <[email protected]> wrote:
> On Mon, 07 Jun 2010 20:37:06 MDT, Jeffrey Merkey said:
>> Cool. I'll use that from now on. wonder if the source code came from
>> xdump 10 years ago ... LOL
>
> Probably not, given that 'man hexdump' says:
>
> BSD                            April 18, 1994                            BSD
>
> Plus, they obviously rolled the code for '-e formatstring' themselves, nobody
> could have been so desperate to steal that code. ;)
>
> hexdump -e '"%08.08_ax " 4/4 "%08X " " " 4/4 "%08x " ' -e '" *" 32/1 "%_p"' -e '"*\n"'
>
> That's so old-skool it hurts. :)
>
>
>

LOL. :)

Jeff