2010-01-14 00:12:16

by Mike Mestnik

Subject: How to use mkfs.ext4 "stride=" on RAID correctly?

What should this value be? From what I gather, it should be the length
of data stored on a single disk for each RAID-level block. If that's
the case, how is it determined that two given data blocks are on
separate drives? It seems to me that the stripe-width is also
essential in this regard, but the man page does not reflect this.

For example let's say that stride=1, then which of the following
blocks are not on the same drive as 1: 8 9 10?
The answer is dependent on the number of data disks, like so.
Where x = n - 1 or n depending on the RAID type.
if x = 2 then 9
if x = 3 then 8 and 10
if x = 5 then 8 and 9

There is no way to make this calculation without knowing x; furthermore,
calculating x based on both stride and stripe-width is roundabout...
Why not simply ask for x, the number of data disks, and have
stripe-width be the value that is calculated, as stride might not go
into stripe-width evenly, leaving you with a headache?
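
A minimal sketch of the arithmetic behind this suggestion, assuming the usual reading of the mke2fs options: stride is the RAID chunk size expressed in filesystem blocks, and stripe-width is stride multiplied by the number of data disks x. The function names and the 64 KiB chunk / 4 KiB block figures are illustrative assumptions, not values from this thread.

```python
def data_disks(total_disks, parity_disks=0):
    """x from the post: n for plain striping, n - 1 with one parity disk."""
    x = total_disks - parity_disks
    if x < 1:
        raise ValueError("need at least one data disk")
    return x


def stride_and_stripe_width(chunk_bytes, block_bytes, x):
    """Return (stride, stripe_width) in filesystem blocks.

    Assumes the chunk size is a whole multiple of the block size.
    """
    if chunk_bytes % block_bytes != 0:
        raise ValueError("chunk size must be a multiple of the block size")
    stride = chunk_bytes // block_bytes
    return stride, stride * x


def extended_options(chunk_bytes, block_bytes, x):
    """Format the values as an argument for mkfs.ext4 -E.

    The option spelling follows this thread; some man page versions
    write it as stripe_width.
    """
    stride, width = stride_and_stripe_width(chunk_bytes, block_bytes, x)
    return f"stride={stride},stripe-width={width}"


if __name__ == "__main__":
    # Hypothetical 4-disk array with one parity disk, 64 KiB chunks, 4 KiB blocks.
    x = data_disks(total_disks=4, parity_disks=1)
    print(extended_options(64 * 1024, 4096, x))
```

Computed this way, stripe-width is a whole multiple of stride by construction, so the uneven-division case only arises when the two values are entered by hand.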

Did I locate a bug?

Is there a better forum for this discussion?


2010-01-14 00:58:47

by Mike Mestnik

Subject: Re: How to use mkfs.ext4 "stride=" on RAID correctly?

On Wed, Jan 13, 2010 at 6:13 PM, Mike Mestnik <[email protected]> wrote:
> What should this value be? From what I gather it should be the length
> of data stored on a single disk for each RAID level block. If that's
> the case how is it that two given data blocks are calculated to be on
> separate drives? It seems to me that the stripe-width is also
> essential in this regard, but the man page does not reflect this.
>
> For example let's say that stride=1, then which of the following
> blocks are not on the same drive as 1: 8 9 10?
> The answer is dependent on the number of data disks, like so.
> Where x = n - 1 or n depending on the RAID type.
> if x = 2 then 9
> if x = 3 then 8 and 10
> if x = 5 then 8 and 9
>
Wait!!
I got this all wrong; one would need all of x, n, and stride to
successfully determine the disk used for a given stride.
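
As a rough illustration of why the disk count matters, here is a sketch of the block-to-disk mapping for plain striping only (no parity rotation), assuming zero-based filesystem block numbers and chunks laid out round-robin across the data disks. Parity layouts such as RAID-5 rotate the parity chunk and need a more involved mapping; the function name is hypothetical.

```python
def disk_for_block(block, stride, data_disks):
    """Index of the data disk holding a filesystem block.

    Assumes plain striping: consecutive chunks of `stride` blocks go to
    disks 0, 1, ..., data_disks - 1 and then wrap around.
    """
    if stride < 1 or data_disks < 1:
        raise ValueError("stride and data_disks must be positive")
    chunk_index = block // stride
    return chunk_index % data_disks


if __name__ == "__main__":
    # With stride=16 and 3 data disks, blocks 0-15 share one disk,
    # blocks 16-31 the next, and block 48 wraps back to the first.
    for block in (0, 15, 16, 47, 48):
        print(block, disk_for_block(block, stride=16, data_disks=3))
```

The same block number lands on a different disk as soon as either stride or the number of data disks changes, which is why neither value alone is enough.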

Seems to me mkfs is missing some parameters. What about [-g
blocks-per-group]...

http://tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-8.html

> There is no way to make this calculation without knowing x; furthermore,
> calculating x based on both stride and stripe-width is roundabout...
> Why not simply ask for x, the number of data disks, and have
> stripe-width be the value that is calculated, as stride might not go
> into stripe-width evenly, leaving you with a headache?
>
> Did I locate a bug?
>
> Is there a better forum for this discussion?
>

2010-01-15 17:51:17

by Ric Wheeler

Subject: Re: How to use mkfs.ext4 "stride=" on RAID correctly?

On 01/13/2010 07:58 PM, Mike Mestnik wrote:
> On Wed, Jan 13, 2010 at 6:13 PM, Mike Mestnik<[email protected]> wrote:
>
>> What should this value be? From what I gather it should be the length
>> of data stored on a single disk for each RAID level block. If that's
>> the case how is it that two given data blocks are calculated to be on
>> separate drives? It seems to me that the stripe-width is also
>> essential in this regard, but the man page does not reflect this.
>>
>> For example let's say that stride=1, then which of the following
>> blocks are not on the same drive as 1: 8 9 10?
>> The answer is dependent on the number of data disks, like so.
>> Where x = n - 1 or n depending on the RAID type.
>> if x = 2 then 9
>> if x = 3 then 8 and 10
>> if x = 5 then 8 and 9
>>
>>
> Wait!!
> I got this all wrong; one would need all of x, n, and stride to
> successfully determine the disk used for a given stride.
>
> Seems to me mkfs is missing some parameters. What about [-g
> blocks-per-group]...
>
> http://tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-8.html
>
>

Hi Mike,

Recent changes upstream export alignment and optimal IO size for you (at
least for software RAID/dm devices) and for some external arrays if the
vendor exports the information in a standard way.

Martin was the lead on that & work has been ongoing to add support up
the tool chain.

Hopefully, this will get easier to do so we all don't have to work out
the numbers each time we build on RAID :-)
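
For reference, a small sketch of how these exported values could be turned into mkfs numbers by hand. It assumes the kernel publishes them as `minimum_io_size` and `optimal_io_size` (in bytes) under `/sys/block/<device>/queue/`, that for md arrays the first corresponds to the chunk size and the second to the full stripe, and that a reported optimal size of 0 means the device exports nothing useful. The function names and the `md0` device are illustrative.

```python
from pathlib import Path


def read_topology(device, sysfs_root="/sys/block"):
    """Return (minimum_io_size, optimal_io_size) in bytes for a block device."""
    queue = Path(sysfs_root) / device / "queue"
    min_io = int((queue / "minimum_io_size").read_text().strip())
    opt_io = int((queue / "optimal_io_size").read_text().strip())
    return min_io, opt_io


def stride_from_topology(min_io, opt_io, block_bytes=4096):
    """Return (stride, stripe_width) in filesystem blocks, or None.

    None means the device did not report a usable stripe geometry.
    """
    if opt_io == 0 or min_io < block_bytes:
        return None
    return min_io // block_bytes, opt_io // block_bytes


if __name__ == "__main__":
    try:
        min_io, opt_io = read_topology("md0")  # hypothetical device name
    except OSError as exc:
        print("could not read topology:", exc)
    else:
        print(stride_from_topology(min_io, opt_io))
```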

Ric

>> There is no way to make this calculation without knowing x; furthermore,
>> calculating x based on both stride and stripe-width is roundabout...
>> Why not simply ask for x, the number of data disks, and have
>> stripe-width be the value that is calculated, as stride might not go
>> into stripe-width evenly, leaving you with a headache?
>>
>> Did I locate a bug?
>>
>> Is there a better forum for this discussion?
>>
>>


2010-01-16 02:59:39

by Mike Mestnik

Subject: Re: How to use mkfs.ext4 "stride=" on RAID correctly?

On Fri, Jan 15, 2010 at 11:51 AM, Ric Wheeler <[email protected]> wrote:
> On 01/13/2010 07:58 PM, Mike Mestnik wrote:
>>
>> On Wed, Jan 13, 2010 at 6:13 PM, Mike Mestnik<[email protected]> wrote:
>>
>>>
>>> What should this value be? From what I gather it should be the length
>>> of data stored on a single disk for each RAID level block. If that's
>>> the case how is it that two given data blocks are calculated to be on
>>> separate drives? It seems to me that the stripe-width is also
>>> essential in this regard, but the man page does not reflect this.
>>>
>>> For example let's say that stride=1, then which of the following
>>> blocks are not on the same drive as 1: 8 9 10?
>>> The answer is dependent on the number of data disks, like so.
>>> Where x = n - 1 or n depending on the RAID type.
>>> if x = 2 then 9
>>> if x = 3 then 8 and 10
>>> if x = 5 then 8 and 9
>>>
>>>
>>
>> Wait!!
>> I got this all wrong; one would need all of x, n, and stride to
>> successfully determine the disk used for a given stride.
>>
>> Seems to me mkfs is missing some parameters. What about [-g
>> blocks-per-group]...
>>
>> http://tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-8.html
>>
>>
>
> Hi Mike,
>
> Recent changes upstream export alignment and optimal IO size for you (at
> least for software RAID/dm devices) and for some external arrays if the
> vendor exports the information in a standard way.
>
Would this include eSATA DAS units?
Example: LaCie 4big Quadra

I can't discover or alter the IO size on this device and have not
received an answer from LaCie support. They did say something to the
effect of "We don't have this information in our documentation, so it
must be I.P. that I can't disclose."

> Martin was the lead on that & work has been ongoing to add support up the
> tool chain.
>
> Hopefully, this will get easier to do so we all don't have to work out the
> numbers each time we build on RAID :-)
>
> Ric
>
>>> There is no way to make this calculation without knowing x; furthermore,
>>> calculating x based on both stride and stripe-width is roundabout...
>>> Why not simply ask for x, the number of data disks, and have
>>> stripe-width be the value that is calculated, as stride might not go
>>> into stripe-width evenly, leaving you with a headache?
>>>
>>> Did I locate a bug?
>>>
>>> Is there a better forum for this discussion?
>>>
>>>
>>

2010-01-16 05:39:31

by Martin K. Petersen

Subject: Re: How to use mkfs.ext4 "stride=" on RAID correctly?

>>>>> "Mike" == Mike Mestnik <[email protected]> writes:

>> Recent changes upstream export alignment and optimal IO size for you
>> (at least for software RAID/dm devices) and for some external arrays
>> if the vendor exports the information in a standard way.
>>
Mike> Would this include eSATA DAS units? Example: LaCie 4big Quadra

No. Only SCSI devices report the necessary information. There are no
comparable fields in ATA-ACS.


Mike> I can't discover or alter the IO size on this device and have not
Mike> received an answer from LaCie support. They did say something to
Mike> the effect of "We don't have this information in our
Mike> documentation, so it must be I.P. that I can't disclose."

I.e. "we source this controller from a company in China and have no
idea".

--
Martin K. Petersen Oracle Linux Engineering