2003-11-01 08:33:22

by Ville Herva

[permalink] [raw]
Subject: Re: Something corrupts raid5 disks slightly during reboot

On Fri, Oct 31, 2003 at 05:57:33PM -0800, you [Mike Fedyk] wrote:
> On Fri, Oct 31, 2003 at 07:41:30PM -0600, Jeffrey E. Hundstad wrote:
> > Try:
> >
> > hdparm -W0 /dev/hdX
> >
> > for each of your ide drives. This turns off write-caching which is
> > usually a bad thing with ide drives anyway.
> >
>
> Also try installing smartmontools, and run smartmon -a on each of the
> drives. It might tell you one of the drives is going bad...

I am monitoring all my drives with smart constantly, and they haven't shown
any symptoms. The corruption only happens upon reboot, which is a quite
rare event for a server.

Also, I find that smart rarely gives much useful warnings beforehand when a
drive is about to fail. And when the drive fails I usually get a good doze
of UncorrectableErrors into the log, not silent corruption (and I've seen a
lot drives to fail ;( ). Silent corruption is usually caused by the chipset
or driver (seen that, too ;( ), but it has usually happened under stress,
not when nothing much is being written to the drive.


-- v --

[email protected]