2003-11-02 03:19:31

by Florian Reitmeir

Subject: many ide drives, raid0/raid5

Hi,

I'm running kernel 2.6.0-test6-mm1 on a machine with 15 IDE drives. Everything worked fine (uptime about 30 days). The setup was a few RAID5s with RAID0s below them and EVMS on top.

Here's "cat /proc/mdstat" to make it clearer:

============ cut
Personalities : [raid0] [raid5]
md4 : active raid5 hdp[2] hdo[1] hdn[0]
120627072 blocks level 5, 32k chunk, algorithm 2 [4/3] [UUU_]

md3 : active raid0 hdm[1] hdl[0]
49956352 blocks 32k chunks

md0 : inactive hdc[0]
80418176 blocks
unused devices: <none>
============ cut

Some drives are missing here:
md0 is missing one drive,
md1 is complete,
md2 as well.
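
If mdadm is installed, something like this should show what the drives themselves think (the device names are just the md4 members from the mdstat above, adjust as needed):

============ cut
# dump the MD superblock of each member drive; the event counters,
# array UUID and device index should agree across the set
mdadm --examine /dev/hdn /dev/hdo /dev/hdp

# or scan all drives for MD superblocks at once
mdadm --examine --scan
============ cut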

When I run "evms_activate" I get:


========= CUT
MDRaid5RegMgr: Region md/md4 object index 3 is faulty. Array may be degraded.
MDRaid5RegMgr: Region md/md4 disks array not zeroed
MDRaid5RegMgr: Region md/md4 has disk counts that are not correct.
MDRaid5RegMgr: RAID5 array md/md4 is missing the member md/md3 with RAID index 3. The array is running in degrade mode.
LvmRegMgr: Container lvm/rubbish has incorrect number of objects!
LvmRegMgr: Looking for 3 objects, found 2 objects.
LvmRegMgr: A UUID is recorded for PV 3, but PV 3 was not found.
LvmRegMgr: UUID: vWAf1j-veQx-IDk7-SaJ0-dhga-Lase-rQ9ydR
LvmRegMgr: Container lvm/rubbish has a UUID recorded for PV 3, but PV 3 was not found. Would you like to remove PV 3 from container lvm/rubbish *PERMANENTLY*?

You should only remove this PV if you know the PV will *NEVER* be available again. If you think it is just temporarily missing, do not remove it from the container.
evms_activate: Responding with default selection "Don't Remove".
LvmRegMgr: Would you like to fix the metadata for container lvm/rubbish?
evms_activate: Responding with default selection "Don't Fix".
LvmRegMgr: Region lvm/rubbish/basket has an incomplete LE map.
Missing 7327 out of 14359 LEs.
MDRaid5RegMgr: RAID5 array md/md2 is missing the member with RAID index 3. The array is running in degrade mode.
MDRaid0RegMgr: Region md/md1 object index incorrect: is 0, should be 1
MDRaid0RegMgr: Region md/md1 object index 1 is greater than nr_disks.
MDRaid0RegMgr: Region md/md1 object index 1 is in invalid state.
MDRaid0RegMgr: Region md/md1 disk counts incorrect
Engine: Error code 5 (Input/output error) when reading the primary copy of feature header on object lvm/rubbish/basket.
Engine: Error code 5 (Input/output error) when reading the secondary copy of feature header on object lvm/rubbish/basket.
MDRaid0RegMgr: Region md/md1 object index incorrect: is 0, should be 1
MDRaid0RegMgr: Region md/md1 object index 1 is greater than nr_disks.
MDRaid0RegMgr: Region md/md1 object index 1 is in invalid state.
MDRaid0RegMgr: Region md/md1 disk counts incorrect
MDRaid0RegMgr: Region md/md1 object index incorrect: is 0, should be 1
MDRaid0RegMgr: Region md/md1 object index 1 is greater than nr_disks.
MDRaid0RegMgr: Region md/md1 object index 1 is in invalid state.
MDRaid0RegMgr: Region md/md1 disk counts incorrect
========= CUT


Or in other words, it's a mess. I verified that all drives are found and work correctly, so it's "only" the metadata that is broken. I used evmsgui to configure the whole thing, so there is no raidtab file that I could use to force the configuration.
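
If it comes to that, I assume the superblocks on the drives themselves could be used to force the arrays back together even without a raidtab, e.g. with mdadm (the md number and device names below are only examples, taken from the mdstat above):

============ cut
# rebuild a config file from whatever superblocks are still readable
mdadm --examine --scan > /etc/mdadm.conf

# then try to force-assemble the degraded raid5 from its members
mdadm --assemble --force /dev/md4 /dev/hdn /dev/hdo /dev/hdp
============ cut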

I also tried newer kernel versions, but I think there is a timing problem: I use Promise IDE PCI controllers, and the new test9 always tries to reset them with UDMA (I think), which doesn't work with that many drives.
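
If it really is the UDMA reset that fails, disabling DMA might at least get the newer kernels to boot; a possible way (untested here):

============ cut
# kernel boot parameter: disable DMA for the whole IDE subsystem
ide=nodma

# or per drive, after boot
hdparm -d0 /dev/hdn
============ cut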

So, is there someone who can help me?

--
mfG Florian Reitmeir


2003-11-02 11:04:19

by Dave Gilbert (Home)

Subject: Re: many ide drives, raid0/raid5

* Florian Reitmeir ([email protected]) wrote:
> Hi,
>
> I'm running kernel 2.6.0-test6-mm1 on a machine with 15 IDE
> drives. Everything worked fine (uptime about 30 days). The setup was
> a few RAID5s with RAID0s below them and EVMS on top.

>
> Here's "cat /proc/mdstat" to make it clearer:
>
> ============ cut
> Personalities : [raid0] [raid5]
> md4 : active raid5 hdp[2] hdo[1] hdn[0]
> 120627072 blocks level 5, 32k chunk, algorithm 2 [4/3] [UUU_]

This isn't too unusual; if, as you say, the drives are fine, then I'd
agree it is probably a controller issue. I've got a RAID that does this
regularly, and you'll normally find an IDE error of some type somewhere
in the log (typically a drive dropping out of DMA because it was busy
or not responding).
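
For example, something like this usually turns it up (log paths vary
by distribution):

  # look for the IDE layer complaining about a drive or dropping DMA
  dmesg | grep -iE 'hd[a-p]|dma'
  grep -iE 'hd[a-p].*(error|dma)' /var/log/kern.log   # or /var/log/messages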

I've given up on using multiple IDE PCI cards for this - they only just
work, and occasionally you'll get a glitch like this. I've tried
a mixture of Promise and HPT cards. In the end I gave up and got
a 3ware IDE RAID card, which is working fine (I think you could also use it
as a large multichannel IDE card and run soft RAID if you want).
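
If the card exports the drives individually (I believe it shows them
as /dev/sd* when you don't build a hardware array on it), the soft-RAID
side would look much the same as now, e.g.:

  # example only: a 4-drive soft raid5 over the exported disks
  mdadm --create /dev/md0 --level=5 --raid-devices=4 \
        /dev/sda /dev/sdb /dev/sdc /dev/sdd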

Dave
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/