2007-12-01 11:19:37

by Justin Piszcz

Subject: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?

Quick question,

Set up a new machine last night with two Raptor 150 disks. Set up RAID1 as
I do everywhere else, with 0.90.03 superblocks (for compatibility with
LILO; if you use 1.x superblocks with LILO you can't boot), and then:

/dev/sda1+sdb1 <-> /dev/md0 <-> swap
/dev/sda2+sdb2 <-> /dev/md1 <-> /boot (ext3)
/dev/sda3+sdb3 <-> /dev/md2 <-> / (xfs)

All works fine, no issues...

Quick question though: I turned off the machine, disconnected /dev/sda,
and booted from /dev/sdb with no problems; the arrays showed as degraded
RAID1. I then turned the machine off and re-attached the first drive. When
I booted, my first partition had either re-synced by itself or had never
been degraded. Which is it?

So two questions:

1) If it rebuilt by itself, how come it only rebuilt /dev/md0?
2) If it did not rebuild, is it because the kernel knows it does not need
to re-calculate parity etc for swap?

I had to:

mdadm /dev/md1 -a /dev/sda2
and
mdadm /dev/md2 -a /dev/sda3

to rebuild /boot and /, which worked fine. I am just curious why it works
like this; I figured it would be all or nothing.

More info:

Not using ANY initramfs/initrd images, everything is compiled into 1
kernel image (makes things MUCH simpler and the expected device layout etc
is always the same, unlike initrd/etc).

Justin.
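
[Editor's sketch] The degraded state described above is visible in
/proc/mdstat, where each array carries a member status string such as
[UU] (all members present) or [_U] (one missing). The function below is
only a rough illustration of spotting which arrays still need a manual
"mdadm /dev/mdX -a ..."; the name degraded_arrays is hypothetical and
not part of mdadm.

```shell
# List md arrays whose member status string (e.g. [_U]) shows a missing
# member. Reads /proc/mdstat by default; a file path may be passed instead.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }   # remember the array this block describes
        name != "" && match($0, /\[[U_]+\]/) {
            # an underscore in the status string marks a missing member
            if (index(substr($0, RSTART, RLENGTH), "_"))
                print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Usage, for example: for md in $(degraded_arrays); do echo "$md is degraded"; done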


2007-12-01 12:10:38

by Jan Engelhardt

Subject: Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?


On Dec 1 2007 06:19, Justin Piszcz wrote:

> RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
> you use 1.x superblocks with LILO you can't boot)

Says who? (Don't use LILO ;-)

>, and then:
>
> /dev/sda1+sdb1 <-> /dev/md0 <-> swap
> /dev/sda2+sdb2 <-> /dev/md1 <-> /boot (ext3)
> /dev/sda3+sdb3 <-> /dev/md2 <-> / (xfs)
>
> All works fine, no issues...
>
> Quick question though, I turned off the machine, disconnected /dev/sda
> from the machine, boot from /dev/sdb, no problems, shows as degraded
> RAID1. Turn the machine off. Re-attach the first drive. When I boot
> my first partition either re-synced by itself or it was not degraded,
> which is it?

If md0 was not touched (written to) after you disconnected sda, it also
should not be in a degraded state.
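
[Editor's sketch] One way to check this by hand: md keeps an event
counter in each member's superblock, and "mdadm --examine" prints it in
the Events field. The helpers below compare that field for two members.
They assume that matching counters are what let sda1 rejoin md0 without
a resync, which this thread does not confirm; the function names are
hypothetical.

```shell
# Print the "Events" field from `mdadm --examine` output read on stdin.
events_of() {
    awk -F' : ' '/^[[:space:]]*Events[[:space:]]*:/ {
        gsub(/[[:space:]]/, "", $2)   # strip padding around the value
        print $2
        exit
    }'
}

# Succeed if both member devices report the same event counter.
# Needs root and real md members, so it is only defined here, not run.
same_events() {
    a=$(mdadm --examine "$1" | events_of)
    b=$(mdadm --examine "$2" | events_of)
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Usage, for example: same_events /dev/sda1 /dev/sdb1 && echo "counters match"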

> So two questions:
>
> 1) If it rebuilt by itself, how come it only rebuilt /dev/md0?

So md1/md2 was NOT rebuilt?

> 2) If it did not rebuild, is it because the kernel knows it does not
> need to re-calculate parity etc for swap?

The kernel usually does not know what's inside an md device. And it
should not try to be smart.

> I had to:
>
> mdadm /dev/md1 -a /dev/sda2
> and
> mdadm /dev/md2 -a /dev/sda3
>
> To rebuild the /boot and /, which worked fine, I am just curious
> though why it works like this, I figured it would be all or nothing.

Devices are not automatically re-added. Who knows, maybe you inserted a
different disk as sda that you don't want to be overwritten.

> More info:
>
> Not using ANY initramfs/initrd images, everything is compiled into 1
> kernel image (makes things MUCH simpler and the expected device layout
> etc is always the same, unlike initrd/etc).
>
My expected device layout is also always the same, _with_ initrd. Why?
Simply because mdadm.conf is copied to the initrd, and mdadm will
use your defined order.
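
[Editor's sketch] A minimal way to produce such an mdadm.conf, assuming
the arrays already carry superblocks: "mdadm --examine --scan" prints
one ARRAY line per array it finds, and "DEVICE partitions" tells mdadm
to consider every partition listed in /proc/partitions. The function
name is hypothetical, and how the file ends up inside the initrd depends
on the distribution's initrd tooling, which is not shown here.

```shell
# Emit a minimal mdadm.conf on stdout: a DEVICE line followed by the
# ARRAY lines that mdadm derives from the superblocks it can find.
make_mdadm_conf() {
    echo "DEVICE partitions"
    mdadm --examine --scan
}
```

Usage, for example (as root): make_mdadm_conf > /etc/mdadm.conf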

2007-12-01 12:13:19

by Justin Piszcz

Subject: Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?



On Sat, 1 Dec 2007, Jan Engelhardt wrote:

>
> On Dec 1 2007 06:19, Justin Piszcz wrote:
>
>> RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
>> you use 1.x superblocks with LILO you can't boot)
>
> Says who? (Don't use LILO ;-)
I like LILO :)

>
>> , and then:
>>
>> /dev/sda1+sdb1 <-> /dev/md0 <-> swap
>> /dev/sda2+sdb2 <-> /dev/md1 <-> /boot (ext3)
>> /dev/sda3+sdb3 <-> /dev/md2 <-> / (xfs)
>>
>> All works fine, no issues...
>>
>> Quick question though, I turned off the machine, disconnected /dev/sda
>> from the machine, boot from /dev/sdb, no problems, shows as degraded
>> RAID1. Turn the machine off. Re-attach the first drive. When I boot
>> my first partition either re-synced by itself or it was not degraded,
>> which is it?
>
> If md0 was not touched (written to) after you disconnected sda, it also
> should not be in a degraded state.
>
>> So two questions:
>>
>> 1) If it rebuilt by itself, how come it only rebuilt /dev/md0?
>
> So md1/md2 was NOT rebuilt?
Correct.

>
>> 2) If it did not rebuild, is it because the kernel knows it does not
>> need to re-calculate parity etc for swap?
>
> Kernel does not know what's inside an md usually. And it should not
> try to be smart.
Ok.

>
>> I had to:
>>
>> mdadm /dev/md1 -a /dev/sda2
>> and
>> mdadm /dev/md2 -a /dev/sda3
>>
>> To rebuild the /boot and /, which worked fine, I am just curious
>> though why it works like this, I figured it would be all or nothing.
>
> Devices are not automatically readded. Who knows, maybe you inserted a
> different disk into sda which you don't want to be overwritten.
Makes sense; I just wanted to confirm that it was normal.

>
>> More info:
>>
>> Not using ANY initramfs/initrd images, everything is compiled into 1
>> kernel image (makes things MUCH simpler and the expected device layout
>> etc is always the same, unlike initrd/etc).
>>
> My expected device layout is also always the same, _with_ initrd. Why?
> Simply because mdadm.conf is copied to the initrd, and mdadm will
> use your defined order.
>
That is another way as well; people seem to be divided.

2007-12-01 12:17:58

by Jan Engelhardt

Subject: Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?


On Dec 1 2007 07:12, Justin Piszcz wrote:
> On Sat, 1 Dec 2007, Jan Engelhardt wrote:
>> On Dec 1 2007 06:19, Justin Piszcz wrote:
>>
>> > RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
>> > you use 1.x superblocks with LILO you can't boot)
>>
>> Says who? (Don't use LILO ;-)
>
> I like LILO :)

LILO cares much less about disk layout / filesystems than GRUB does,
so I would have expected LILO to cope with all sorts of superblocks.
OTOH I would suspect GRUB to only handle 0.90 and 1.0, where the MDSB
is at the end of the disk <=> the filesystem SB is at the very beginning.

>> > So two questions:
>> >
>> > 1) If it rebuilt by itself, how come it only rebuilt /dev/md0?
>>
>> So md1/md2 was NOT rebuilt?
>
> Correct.

Well, they should be, after they are re-added using -a.
If they still are not, then perhaps another resync is in progress.

2007-12-01 12:23:16

by Justin Piszcz

Subject: Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?



On Sat, 1 Dec 2007, Jan Engelhardt wrote:

>
> On Dec 1 2007 07:12, Justin Piszcz wrote:
>> On Sat, 1 Dec 2007, Jan Engelhardt wrote:
>>> On Dec 1 2007 06:19, Justin Piszcz wrote:
>>>
>>>> RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
>>>> you use 1.x superblocks with LILO you can't boot)
>>>
>>> Says who? (Don't use LILO ;-)
>>
>> I like LILO :)
>
> LILO cares much less about disk layout / filesystems than GRUB does,
> so I would have expected LILO to cope with all sorts of superblocks.
> OTOH I would suspect GRUB to only handle 0.90 and 1.0, where the MDSB
> is at the end of the disk <=> the filesystem SB is at the very beginning.
>
>>>> So two questions:
>>>>
>>>> 1) If it rebuilt by itself, how come it only rebuilt /dev/md0?
>>>
>>> So md1/md2 was NOT rebuilt?
>>
>> Correct.
>
> Well it should, after they are readded using -a.
> If they still don't, then perhaps another resync is in progress.
>

There was nothing in progress; md0 was synced up and md1 and md2 were
degraded.

2007-12-05 19:30:20

by Nix

Subject: Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?

On 1 Dec 2007, Jan Engelhardt uttered the following:

>
> On Dec 1 2007 06:19, Justin Piszcz wrote:
>
>> RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
>> you use 1.x superblocks with LILO you can't boot)
>
> Says who? (Don't use LILO ;-)

Well, your kernels must be on a 0.90-superblocked RAID-0 or RAID-1
device. LILO can't handle booting off 1.x superblocks or RAID-[56]
(not that I could really hope for the latter).

But that's just /boot, not everything else.

>>
>> Not using ANY initramfs/initrd images, everything is compiled into 1
>> kernel image (makes things MUCH simpler and the expected device layout
>> etc is always the same, unlike initrd/etc).
>>
> My expected device layout is also always the same, _with_ initrd. Why?
> Simply because mdadm.conf is copied to the initrd, and mdadm will
> use your defined order.

Of course the same is true of initramfs, which can give you the 1 kernel
image back, too. (It's also nicer in that you can autoassemble
e.g. LVM-on-RAID, or even LVM-on-RAID-over-nbd if you so desire.)

--
`The rest is a tale of post and counter-post.' --- Ian Rawlings
describes USENET

2007-12-06 15:09:04

by Jan Engelhardt

Subject: Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?


On Dec 5 2007 19:29, Nix wrote:
>>
>> On Dec 1 2007 06:19, Justin Piszcz wrote:
>>
>>> RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
>>> you use 1.x superblocks with LILO you can't boot)
>>
>> Says who? (Don't use LILO ;-)
>
>Well, your kernels must be on a 0.90-superblocked RAID-0 or RAID-1
>device. It can't handle booting off 1.x superblocks nor RAID-[56]
>(not that I could really hope for the latter).

If the superblock is at the end (which is the case for 0.90 and 1.0),
then the offsets for a specific block on /dev/mdX match the ones for /dev/sda,
so it should be "easy" to use lilo on 1.0 too, no?
(Yes, it will not work with 1.1 or 1.2.)
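
[Editor's sketch] The placement argument above can be made concrete
with a little arithmetic. The offsets below are the editor's
understanding of the md on-disk layout (0.90: 64 KiB-aligned, 64 KiB
before the end; 1.0: at least 8 KiB before the end, 4 KiB-aligned; 1.1:
at the start; 1.2: 4 KiB from the start). Treat them as assumptions and
confirm against "mdadm --examine" on a real member. The function name is
hypothetical.

```shell
# Print the sector (512-byte units) where the md superblock starts.
# $1 = metadata version (0.90, 1.0, 1.1, 1.2), $2 = member size in sectors.
md_sb_sector() {
    case "$1" in
        0.90) echo $(( ($2 & ~127) - 128 )) ;;  # round down to 64 KiB, back off 64 KiB
        1.0)  echo $(( ($2 - 16) & ~7 )) ;;     # back off 8 KiB, round down to 4 KiB
        1.1)  echo 0 ;;                         # very start of the member
        1.2)  echo 8 ;;                         # 4 KiB from the start
        *)    echo "unknown metadata version: $1" >&2; return 1 ;;
    esac
}
```

With 0.90 and 1.0 the superblock sits near the end, so the array's data
begins at sector 0 of each member and block offsets on /dev/mdX line up
with those on the underlying partition. With 1.1 and 1.2 the superblock
occupies the start, so the data is shifted and that shortcut no longer
holds.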

2007-12-07 07:30:46

by Nix

Subject: Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?

On 6 Dec 2007, Jan Engelhardt verbalised:
> On Dec 5 2007 19:29, Nix wrote:
>>>
>>> On Dec 1 2007 06:19, Justin Piszcz wrote:
>>>
>>>> RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
>>>> you use 1.x superblocks with LILO you can't boot)
>>>
>>> Says who? (Don't use LILO ;-)
>>
>>Well, your kernels must be on a 0.90-superblocked RAID-0 or RAID-1
>>device. It can't handle booting off 1.x superblocks nor RAID-[56]
>>(not that I could really hope for the latter).
>
> If the superblock is at the end (which is the case for 0.90 and 1.0),
> then the offsets for a specific block on /dev/mdX match the ones for /dev/sda,
> so it should be "easy" to use lilo on 1.0 too, no?

Sure, but you may have to hack /sbin/lilo to convince it to create the
superblock there at all. It's likely to recognise that this is an md
device without a v0.90 superblock and refuse to continue. (But I haven't
tested it.)

--
`The rest is a tale of post and counter-post.' --- Ian Rawlings
describes USENET

2007-12-07 08:36:30

by Jan Engelhardt

Subject: Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?


On Dec 7 2007 07:30, Nix wrote:
>On 6 Dec 2007, Jan Engelhardt verbalised:
>> On Dec 5 2007 19:29, Nix wrote:
>>>> On Dec 1 2007 06:19, Justin Piszcz wrote:
>>>>
>>>>> RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
>>>>> you use 1.x superblocks with LILO you can't boot)
>>>>
>>>> Says who? (Don't use LILO ;-)
>>>
>>>Well, your kernels must be on a 0.90-superblocked RAID-0 or RAID-1
>>>device. It can't handle booting off 1.x superblocks nor RAID-[56]
>>>(not that I could really hope for the latter).
>>
>> If the superblock is at the end (which is the case for 0.90 and 1.0),
>> then the offsets for a specific block on /dev/mdX match the ones for /dev/sda,
>> so it should be "easy" to use lilo on 1.0 too, no?
>
>Sure, but you may have to hack /sbin/lilo to convince it to create the
>superblock there at all. It's likely to recognise that this is an md
>device without a v0.90 superblock and refuse to continue. (But I haven't
>tested it.)
>
In that case, see above - move to a different bootloader.