2002-07-01 07:57:47

by Helge Hafting

Subject: 2.5.24-dj1,smp,ext2,raid0: I got random zero blocks in my files.

2.5.24-dj1 gave me files with zeroed blocks inside.
What I did: I untarred the source for lyx 1.2.0
and tried to compile it, several times.

gcc and make choked on occasional blocks of zeroes
inside files, different places each time.
Going back to 2.5.18 fixed it.

This isn't all that surprising considering that
the raid driver logs complaints about requests
bigger than 32k, which is the stripe size.
I believed this worked by retrying with much smaller
requests, perhaps I am wrong?

The filesystems use 4k blocks.
I haven't seen any trouble on non-raid or raid-1
partitions.

Helge Hafting


2002-07-01 08:10:41

by Andrew Morton

Subject: Re: 2.5.24-dj1,smp,ext2,raid0: I got random zero blocks in my files.

Helge Hafting wrote:
>
> 2.5.24-dj1 gave me files with zeroed blocks inside.
> What I did: I untarred the source for lyx 1.2.0
> and tried to compile it, several times.
>
> gcc and make choked on occasional blocks of zeroes
> inside files, different places each time.
> Going back to 2.5.18 fixed it.
>
> This isn't all that surprising considering that
> the raid driver logs complaints about requests
> bigger than 32k, which is the stripe size.
> I believed this worked by retrying with much smaller
> requests, perhaps I am wrong?
>
> The filesystems use 4k blocks.
> I haven't seen any trouble on non-raid or raid-1
> partitions.

Yes, the large BIO stuff went into 2.5.19. RAID0 doesn't
like those big BIOs. Jens is cooking up a fix for that.

Just to confirm that this is the problem, could you please
set MPAGE_BIO_MAX_SIZE to 32768 in fs/mpage.c and see if the
failure goes away?
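
The experiment above amounts to capping the size of each BIO that the
mpage code assembles. A minimal user-space sketch of such a cap, assuming
MPAGE_BIO_MAX_SIZE is a byte count defined as a macro; the helper
capped_bio_bytes() is hypothetical and is not kernel code:

```c
#include <stddef.h>

/* Assumption: mirrors the macro named in fs/mpage.c, set here to the
 * 32k raid0 stripe size for the test. */
#define MPAGE_BIO_MAX_SIZE 32768

/* Hypothetical helper: number of bytes that may go into one BIO
 * when the caller wants to submit `wanted` bytes. */
static size_t capped_bio_bytes(size_t wanted)
{
    return wanted < MPAGE_BIO_MAX_SIZE ? wanted : MPAGE_BIO_MAX_SIZE;
}
```

Under this sketch a 64k request would be limited to 32k per BIO, while a
single 4k block is unaffected.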


2002-07-04 00:30:07

by NeilBrown

Subject: Re: 2.5.24-dj1,smp,ext2,raid0: I got random zero blocks in my files.

On Monday July 1, Helge Hafting wrote:
> 2.5.24-dj1 gave me files with zeroed blocks inside.
> What I did: I untarred the source for lyx 1.2.0
> and tried to compile it, several times.
>
> gcc and make choked on occasional blocks of zeroes
> inside files, different places each time.
> Going back to 2.5.18 fixed it.
>
> This isn't all that surprising considering that
> the raid driver logs complaints about requests
> bigger than 32k, which is the stripe size.
> I believed this worked by retrying with much smaller
> requests, perhaps I am wrong?

You are wrong. It doesn't re-try. It just fails.
raid0 does not work in 2.5 yet. Don't even bother trying.
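
To illustrate the kind of request that gets rejected, here is a rough
user-space sketch. It assumes the driver refuses any request that does
not fit inside a single 32k chunk; the function name and the byte-based
interface are hypothetical, and this is not the raid0 driver's code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stripe (chunk) size reported in the thread: 32k. */
#define CHUNK_BYTES 32768u

/* Hypothetical check: true if a request of `len` bytes starting at
 * byte `offset` fits entirely inside one chunk. Assumes len > 0. */
static bool fits_in_one_chunk(uint64_t offset, uint64_t len)
{
    uint64_t within = offset % CHUNK_BYTES;  /* position inside the chunk */
    return within + len <= CHUNK_BYTES;
}
```

Under this model an aligned 4k block always fits, while any request
larger than 32k cannot, which is consistent with the driver complaints
quoted above.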

NeilBrown