2003-08-12 10:14:41

by Matt Bernstein

Subject: 2.6.0-test2-mm1, ext3 (external journal): nasty filesystem corruption under high load

Hi,

Admittedly I was being pathological, but I've got a new toy to play with!
Our new server is a dual-Athlon, 1.5G RAM (the other .5 failed memtest) +
about 6GB swap, with 15x70GB drives running under gdth.o with 12 as the
RAID-5 set, and the journal on 2 as a RAID-1 pair. System on IDE for now.

It's currently running Red Hat "severn" + 2.6.0-test2-mm1 (with PREEMPT
for now), and this particular stress test was attempting to build
2.6.0-test3-mm1 with the scary invocation "make -j". More info on request.

I saw thousands of messages like:
cc1: page allocation failure. order:0, mode:0x20
(where only the process names might change). I don't know how Bad this is.

Amazingly I could still ssh in to the box and discover that its load had
more than likely broken 1000. However, the compile started to complain
bitterly about non-ASCII characters in source files, and indeed corruption
did occur (random overwriting, it would appear).

I have a couple more weeks I can play with this box before it has to go
into production (running much older brains), and can do more tests if
anyone thinks it might be useful.

I would like to suggest the "make -j" test for those developers with
enough memory (and fast enough swap).

Matt


2003-08-12 11:54:15

by NeilBrown

Subject: Re: 2.6.0-test2-mm1, ext3 (external journal): nasty filesystem corruption under high load

On Tuesday August 12, mb/[email protected] wrote:
> Hi,
>
> Admittedly I was being pathological, but I've got a new toy to play with!
> Our new server is a dual-Athlon, 1.5G RAM (the other .5 failed memtest) +
> about 6GB swap, with 15x70GB drives running under gdth.o with 12 as the
> RAID-5 set, and the journal on 2 as a RAID-1 pair. System on IDE for now.
>
> It's currently running Red Hat "severn" + 2.6.0-test2-mm1 (with PREEMPT
> for now), and this particular stress test was attempting to build
> 2.6.0-test3-mm1 with the scary invocation "make -j". More info on request.
>
> I saw thousands of messages like:
> cc1: page allocation failure. order:0, mode:0x20
> (where only the process names might change). I don't know how Bad
> this is.

I think this is just noise. 0x20 is GFP_ATOMIC, and you expect atomic
allocations to fail sometimes.
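
For anyone decoding the "mode:" field in these messages, here is a minimal
sketch of how the value breaks down into flag bits. The bit values are
recalled from the 2.6.0-era include/linux/gfp.h and should be treated as an
assumption to check against your own tree; the SKETCH_ names, decode_gfp()
and gfp_may_sleep() are hypothetical helpers, not kernel interfaces.

```c
#include <stddef.h>
#include <string.h>

/* Flag bits as recalled from 2.6.0-era include/linux/gfp.h (assumption:
 * verify against your own tree before relying on them). */
#define SKETCH_GFP_DMA      0x01u
#define SKETCH_GFP_HIGHMEM  0x02u
#define SKETCH_GFP_WAIT     0x10u  /* caller may sleep */
#define SKETCH_GFP_HIGH     0x20u  /* may dip into emergency reserves */
#define SKETCH_GFP_IO       0x40u  /* may start physical I/O */
#define SKETCH_GFP_FS       0x80u  /* may call into the filesystem */

struct gfp_name {
	unsigned int bit;
	const char *name;
};

static const struct gfp_name gfp_names[] = {
	{ SKETCH_GFP_DMA,     "__GFP_DMA" },
	{ SKETCH_GFP_HIGHMEM, "__GFP_HIGHMEM" },
	{ SKETCH_GFP_WAIT,    "__GFP_WAIT" },
	{ SKETCH_GFP_HIGH,    "__GFP_HIGH" },
	{ SKETCH_GFP_IO,      "__GFP_IO" },
	{ SKETCH_GFP_FS,      "__GFP_FS" },
};

/* Write a "|"-separated list of the known flag names set in mode into buf
 * (len must be at least 1). Unknown bits are silently ignored. Returns buf. */
static char *decode_gfp(unsigned int mode, char *buf, size_t len)
{
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof gfp_names / sizeof gfp_names[0]; i++) {
		if (!(mode & gfp_names[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, gfp_names[i].name, len - strlen(buf) - 1);
	}
	return buf;
}

/* An allocation can block and wait for memory only if the WAIT bit is set. */
static int gfp_may_sleep(unsigned int mode)
{
	return (mode & SKETCH_GFP_WAIT) != 0;
}
```

Under these assumed values, 0x20 decodes to __GFP_HIGH alone, with no WAIT
bit. Such an allocation cannot sleep until memory is reclaimed, so it simply
fails when no page is free at that instant, which is why failures are
expected under heavy memory pressure.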

>
> Amazingly I could still ssh in to the box and discover that its load had
> more than likely broken 1000. However, the compile started to complain
> bitterly about non-ASCII characters in source files, and indeed corruption
> did occur (random overwriting, it would appear).

Almost certainly a raid5 bug, fixed by the following patch.

NeilBrown

==========================================================================
Disable raid5 handling of read-ahead

raid5 tries to honour RWA_MASK, but messes it up and can return bad data.
Just ignore RWA_MASK for now.

----------- Diffstat output ------------
./drivers/md/raid5.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~ 2003-08-11 09:01:44.000000000 +1000
+++ ./drivers/md/raid5.c 2003-08-11 09:01:44.000000000 +1000
@@ -1326,7 +1326,7 @@ static int make_request (request_queue_t
 			(unsigned long long)new_sector,
 			(unsigned long long)logical_sector);

-		sh = get_active_stripe(conf, new_sector, pd_idx, (bi->bi_rw&RWA_MASK));
+		sh = get_active_stripe(conf, new_sector, pd_idx, 0/*(bi->bi_rw&RWA_MASK)*/);
 		if (sh) {

 			add_stripe_bio(sh, bi, dd_idx, (bi->bi_rw&RW_MASK));

2003-08-12 12:23:31

by Matt Bernstein

Subject: Re: 2.6.0-test2-mm1, ext3 (external journal): nasty filesystem corruption under high load

At 21:53 +1000 Neil Brown wrote:

>> Admittedly I was being pathological, but I've got a new toy to play with!
>> Our new server is a dual-Athlon, 1.5G RAM (the other .5 failed memtest) +
>> about 6GB swap, with 15x70GB drives running under gdth.o with 12 as the
>> RAID-5 set, and the journal on 2 as a RAID-1 pair. System on IDE for now.
[snip]
>> Amazingly I could still ssh in to the box and discover that its load had
>> more than likely broken 1000. However, the compile started to complain
>> bitterly about non-ASCII characters in source files, and indeed corruption
>> did occur (random overwriting, it would appear).
>
>Almost certainly a raid5 bug, fixed by the following patch.

Sorry, I should have made it clearer that it's hardware RAID--gdth.o is
the driver for our ICP vortex card. (Actually of course it's gdth.ko)

Matt