I have several RAID arrays (levels 0 and 1) in my machine and I have
noticed that RAID1 is much slower than I expected.
The arrays are made from two identical disks (/dev/hde, /dev/hdg). Some
numbers for read performance:
/dev/hde: 29 MB/s
/dev/hdg: 29 MB/s
/dev/md0: 27 MB/s (raid1)
/dev/md1: 56 MB/s (raid0)
/dev/md2: 27 MB/s (raid1)
These numbers come from hdparm -tT. I have noticed very poor
performance when reading a large file sequentially from RAID1 (I suppose
this is what hdparm does).
I have taken a look at the read balancing code in raid1.c and I have
found that when a sequential read happens no balancing is done, and so
all the reading is done from only one of the mirrors while the other
is idle.
I have tried to modify the balancing algorithm to balance sequential
accesses as well, but I got almost the same numbers.
I suspect the reason may be that some layer below is issuing reads
larger than the chunks across which I balance, so the same work is
being done twice; but I don't know how to verify this.
Does anybody know how this works?
Regards,
Jaime Medrano
Jaime Medrano wrote:
>
> I have taken a look at the read balancing code in raid1.c and I have
> found that when a sequential read happens no balancing is done, and so
> all the reading is done from only one of the mirrors while the other
> is idle.
Yes, this is expected. Sequential reads from RAID1 with the current
on-disk format are only as fast as the fastest single disk.
The reason for this is simple. Consider the on-disk layout (each
letter is a "block"):
Disk 1: ABCDEFGHIJK
Disk 2: ABCDEFGHIJK
If you read block A from disk 1, then to get more than single-disk
speed you would need to read block B from disk 2 *in parallel*; so far
so good. However, you then need to read block C, and to read it in
parallel you must read it from disk 1. But disk 1's head was at
block A, so you get a head seek. Or, if the drive is trying to be
intelligent, it will read block B into its own cache anyway and then
block C after that (which is the more common case). Etc., etc.
This latter case effectively means that disk 1 will still read ALL
blocks from the platter into the drive's cache, and of course disk 2
will do likewise. In just about all cases you care about, the platter
transfer rate is the limiting factor, not the "disk to host" rate. So
both disk 1 and disk 2 are reading ALL the data at platter speed, which
means the maximum rate at which you can get the data is platter speed.
Now if the disk wasn't smart and was doing seeks instead, it would
perform much, much worse due to the high cost of seeks....
The only way to make the "1 thread sequential read" case faster is by
modifying the disk layout to be
Disk 1: ACEGIKBDFHJ
Disk 2: ACEGIKBDFHJ
where disk 1 again reads block A, and disk 2 reads block B.
To read block C, disk 1 doesn't have to move its head or read and
discard a dummy block; it can read block C sequentially, and disk 2
can read block D the same way. That way each disk reads only the
relevant blocks, sequentially, and you get (in theory) 2x the
performance of one disk.
Greetings,
Arjan van de Ven
On Tue, Apr 30, 2002 at 01:38:16PM +0100, Arjan van de Ven wrote, very
roughly:
[that RAID 1 is only as fast in reading as the fastest disk because of
seeking over alternate blocks, and ]
> The only way to get the "1 thread sequential read" case faster is by
> modifying the disk layout to be
>
> Disk 1: ACEGIKBDFHJ
> Disk 2: ACEGIKBDFHJ
>
> where disk 1 again reads block A, and disk 2 reads block B. To read
> block C, disk 1 doesn't have to move its head or read and discard a
> dummy block; it can read block C sequentially, and disk 2 can read
> block D that way.
>
> That way the disks actually each only read the relevant blocks in a
> sequential way and you get (in theory) 2x the performance of 1 disk.
I am confused.
Assuming a big enough read is requested to allow parallelizing across
two disks, why can't the second disk be told not to read alternate
blocks, but to start reading sequential blocks from halfway through the
request?
Also, why does hdparm give me significantly faster read numbers on
/dev/md<whatever> than it does on /dev/hd<whatever>? I had assumed
there was parallelizing going on. Does this mean I would get a speed
improvement if I ran my single-disk notebook as a single-disk RAID 1,
because there is some bigger or better buffering going on in that code
even without parallelizing?
Thanks,
-kb
In article <[email protected]> you wrote:
>> No, you just distribute the reads round robin; this means each disk has
>> only half the seeks it had before.
> No, this is the way it was done a long time ago.
> It turns out to be an incredibly bad idea. In fact, it is the most CPU-efficient
> way of guaranteeing the largest average seek times on your disks ;)
> The RAID-1 code now looks at which disk worked closest to the wanted position
> last, and picks that disk for the seek.
That's right, it is done on the distance in sector numbers. That's a
simple compare; I'm not sure if one could do better.
raid1.c:raid1_read_balance()
Greetings
Bernd