2006-01-24 00:40:54

by NeilBrown

Subject: [PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

Here is the second release of my patches to support online reshaping
of a raid5 array, i.e. adding 1 or more devices and restriping the
whole thing.

This release fixes an assortment of bugs and adds checkpoint/restart
to the process (the last two patches).
This means that if your machine crashes, or if you have to stop an
array before the reshape is complete, md will notice and will restart
the reshape at an appropriate place.
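
The reshape shows up in /proc/mdstat much like a normal resync, so
after a restart you can see where it picked up:

  # reshape progress is reported alongside the usual array status
  cat /proc/mdstat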

There is still a small window (< 1 second) at the start of the reshape
during which a crash will cause unrecoverable corruption. My plan is
to resolve this in mdadm rather than md. The critical data will be copied
onto the new drive(s) prior to commencing the reshape. If there is a crash,
the kernel will refuse to re-assemble the array. mdadm will be able to
re-assemble it by first restoring the critical data and then letting
the remainder of the reshape run its course.

I will be changing the interface for starting a reshape slightly before
these patches become final. This will mean that the current 'mdadm' will
not be able to start a raid5 reshape.
This is partly to save people from risking the above-mentioned tiny hole,
but also to prepare for reshapes that change other aspects of the
shape, e.g. layout, chunksize, or level.
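
To give a feel for the direction, the eventual invocation will
probably look something like the following sketch. The flag names are
illustrative only and not final (--backup-file stands in for however
mdadm ends up storing the critical data):

  # grow a 4-disk raid5 to 5 disks, keeping a copy of the critical
  # section so an interrupted reshape can be re-assembled safely
  mdadm --grow /dev/md0 --raid-devices=5 \
        --backup-file=/root/md0-reshape-backup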

I am expecting that I will ultimately support online conversion of
raid5 to raid6 with only one extra device. This process is not
(efficiently) checkpointable and so will be at-your-risk.
Checkpointing such a process with anything like reasonable efficiency
requires a largish (multi-megabyte) temporary store, and doing so
will at best halve the speed. I will make sure the possibility of
adding this later is left open.

My thanks to those who have tested the first release, who have
provided feedback, who will test this release, and who contribute to
the discussion in any way.

NeilBrown



[PATCH 001 of 7] md: Split disks array out of raid5 conf structure so it is easier to grow.
[PATCH 002 of 7] md: Allow stripes to be expanded in preparation for expanding an array.
[PATCH 003 of 7] md: Infrastructure to allow normal IO to continue while array is expanding.
[PATCH 004 of 7] md: Core of raid5 resize process
[PATCH 005 of 7] md: Final stages of raid5 expand code.
[PATCH 006 of 7] md: Checkpoint and allow restart of raid5 reshape
[PATCH 007 of 7] md: Only checkpoint expansion progress occasionally.


2006-01-24 09:23:52

by Lars Marowsky-Bree

Subject: Re: [PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

On 2006-01-24T11:40:47, NeilBrown <[email protected]> wrote:

> I am expecting that I will ultimately support online conversion of
> raid5 to raid6 with only one extra device. This process is not
> (efficiently) checkpointable and so will be at-your-risk.

So the best way to go about that, if one wants to keep that option open
w/o that risk, would be to not create a raid5 in the first place, but a
raid6 with one disk missing?

Maybe even have mdadm default to that - as long as just one parity disk
is missing, no slowdown should happen, right?
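
E.g. something like this, with hypothetical device names, assuming
the existing 'missing' keyword works for raid6 the way it does for
raid5:

  mdadm --create /dev/md0 --level=6 --raid-devices=4 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 missing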


Sincerely,
Lars Marowsky-Brée

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge"
 -- Charles Darwin

2006-01-24 09:32:09

by NeilBrown

Subject: Re: [PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

On Tuesday January 24, [email protected] wrote:
> On 2006-01-24T11:40:47, NeilBrown <[email protected]> wrote:
>
> > I am expecting that I will ultimately support online conversion of
> > raid5 to raid6 with only one extra device. This process is not
> > (efficiently) checkpointable and so will be at-your-risk.
>
> So the best way to go about that, if one wants to keep that option open
> w/o that risk, would be to not create a raid5 in the first place, but a
> raid6 with one disk missing?
>
> Maybe even have mdadm default to that - as long as just one parity disk
> is missing, no slowdown should happen, right?

Not exactly....

raid6 has rotating parity drives, for both P and Q (the two different
'parity' blocks).
With one missing device, some Ps, some Qs, and some data would be
missing, and you would definitely get a slowdown trying to generate
some of it.

We could define a raid6 layout that didn't rotate Q. Then you would
be able to do what you suggest.
However it would then be no different from creating a normal raid5 and
supporting online conversion from raid5 to raid6-with-non-rotating-Q.
This conversion doesn't need a reshaping pass, just a recovery of the
now-missing device.

raid6-with-non-rotating-Q would have similar issues to raid4 - one
drive becomes a hot-spot for writes. I don't know how much of an
issue this really is though.
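
To illustrate, here is one possible rotation over 5 devices (md's
actual layouts differ in detail). 'D' is data, and the last column is
the device you would leave missing:

  rotating P and Q:             non-rotating Q:
    D  D  D  P  Q                 D  D  D  P  Q
    D  D  P  Q  D                 D  D  P  D  Q
    D  P  Q  D  D                 D  P  D  D  Q
    P  Q  D  D  D                 P  D  D  D  Q
    Q  D  D  D  P                 D  D  D  P  Q

In the left-hand layout the missing device holds data on three of
every five stripes (and P on one), so reads of those blocks need
reconstruction. In the right-hand layout it only ever holds Q, which
is exactly a raid5 plus an absent Q drive.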

NeilBrown