All these tests were run on 2.4.21-rc2-ac1...
We start out with a three-disk raid1 array in degraded mode (one member missing):
md21 : active raid1 hdi9[1] hdg9[2]
3145856 blocks [3/2] [_UU]
There are three 500 MiB contiguous files in the volume root:
-rw-r--r-- 1 root root 524288000 May 11 13:53 file1
-rw-r--r-- 1 root root 524288000 May 11 13:54 file2
-rw-r--r-- 1 root root 524288000 May 11 13:55 file3
First test is a single sequential stream:
# umount /mnt/md21; mount /mnt/md21
# time dd if=/mnt/md21/file1 of=/dev/null bs=64k &
real 0m18.230s <==== 27.5 MiB/s
user 0m0.020s
sys 0m3.760s
Now two streams:
# umount /mnt/md21; mount /mnt/md21
# time dd if=/mnt/md21/file1 of=/dev/null bs=64k &
# time dd if=/mnt/md21/file2 of=/dev/null bs=64k &
real 0m18.065s <==== 27.6 MiB/s
user 0m0.040s
sys 0m4.280s
real 0m23.197s <==== 21.6 MiB/s
user 0m0.020s
sys 0m4.250s
Add the third disk back into the array:
# raidhotadd /dev/md21 /dev/hde9
[rebuilt 3000MiB in 313s == (3000+3000)/313 == 19MiB/s throughput]
Now rerun the two streams test:
# umount /mnt/md21; mount /mnt/md21
# time dd if=/mnt/md21/file1 of=/dev/null bs=64k &
# time dd if=/mnt/md21/file2 of=/dev/null bs=64k &
real 0m50.336s <==== 9.94 MiB/s (!)
user 0m0.030s
sys 0m4.350s
real 0m50.431s <==== 9.91 MiB/s (!)
user 0m0.030s
sys 0m4.200s
So 50% more hardware (disks, channels) gives a 60% performance drop
when using raid1 with sequential reads... and raid1.c:read_balance()
is the culprit.
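The 60% figure can be rechecked from the dd timings above. This is only a sketch of the arithmetic. It assumes the comparison is between the combined throughput of the two streams on the degraded array and on the full three-disk array; the variable names are made up for illustration.

```python
# Sizes and timings copied from the dd runs above.
FILE_BYTES = 524288000          # each test file, 500 MiB
MIB = 1024 * 1024


def mib_per_s(nbytes, seconds):
    """Throughput of one stream in MiB/s."""
    return nbytes / MIB / seconds


# Two concurrent streams on the degraded (two-member) array.
degraded = mib_per_s(FILE_BYTES, 18.065) + mib_per_s(FILE_BYTES, 23.197)
# The same two streams after the third disk was added back.
full = mib_per_s(FILE_BYTES, 50.336) + mib_per_s(FILE_BYTES, 50.431)

drop = 1 - full / degraded
print(f"degraded: {degraded:.1f} MiB/s, full: {full:.1f} MiB/s, drop: {drop:.0%}")
```

Under that assumption the combined rate falls from roughly 49 MiB/s to roughly 20 MiB/s, which matches the 60% quoted above.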
Uniform Multi-Platform E-IDE driver Revision: 7.00beta3-.2.4
HPT370: IDE controller at PCI slot 00:0d.0
HPT370: chipset revision 3
PDC20262: IDE controller at PCI slot 00:10.0
PDC20262: chipset revision 1
hde: MAXTOR 4K060H3, ATA DISK drive
hdg: MAXTOR 4K060H3, ATA DISK drive
hdi: MAXTOR 4K060H3, ATA DISK drive
hde: hde1 hde2 hde3 hde4 < hde5 hde6 hde7 hde8 hde9 hde10 >
hdg: hdg1 hdg2 hdg3 hdg4 < hdg5 hdg6 hdg7 hdg8 hdg9 hdg10 >
hdi: hdi1 hdi2 hdi3 hdi4 < hdi5 hdi6 hdi7 hdi8 hdi9 hdi10 >
Chuck Ebbert wrote:
> Add the third disk back into the array:
> # raidhotadd /dev/md21 /dev/hde9
> [rebuilt 3000MiB in 313s == (3000+3000)/313 == 19MiB/s throughput]
Why three drives in a Raid1? Raid one is just a mirror, or is the third
drive like a "hot" replace drive if one of the others fails?
--
Clemens Schwaighofer - IT Engineer & System Administration
==========================================================
Tequila Japan, 6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN
Tel: +81-(0)3-3545-7703 Fax: +81-(0)3-3545-7343
http://www.tequila.jp
==========================================================
On Mon, 2003-05-12 at 05:35, Clemens Schwaighofer wrote:
> Why three drives in a Raid1? Raid one is just a mirror, or is the third
> drive like a "hot" replace drive if one of the others fails?
With normal mirroring (one original, one copy) you do have the
redundancy and the speed boost on reads, but with one original and two
copies (I know AIX does this) you get into a scenario that is quite
handy. Say you run a large database in a 24/7 operation.
You want to back the database up, but you can only get 5-10 minutes
downtime on it. You then quiesce the database, split off the second copy
from the mirror, mount that as a separate filesystem and back that up
while the original with its first copy has already stepped back into
full use.
Once you have finished your backup, you add your split-off copy back to the
original and primary copy and you are back where you started.
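The split-off backup described above can be pictured with a toy model. This is a hedged sketch only: the `Mirror` class and its method names are invented for illustration and do not correspond to any md, AIX or LVM interface. It shows just the point-in-time property: the detached copy keeps the contents it had at the split while the remaining members go on taking writes.

```python
class Mirror:
    """Toy N-way mirror: every attached member holds the same block map."""

    def __init__(self, members):
        self.members = {name: {} for name in members}   # name -> {block: data}

    def write(self, block, data):
        # A write goes to every member that is currently attached.
        for blocks in self.members.values():
            blocks[block] = data

    def split(self, name):
        # Detach one member; it keeps the contents it had at this instant.
        return self.members.pop(name)

    def readd(self, name):
        # The returning member is treated as stale: it is overwritten
        # from a surviving member rather than trusted.
        source = next(iter(self.members.values()))
        self.members[name] = dict(source)


md = Mirror(["original", "copy1", "copy2"])
md.write(0, "row v1")

# Database quiesced here; split off the second copy, then resume.
frozen = md.split("copy2")
md.write(0, "row v2")            # production traffic continues on two members

backup = dict(frozen)            # back up the detached copy at leisure
md.readd("copy2")                # resynchronise from the live members

print(backup[0])                 # contents as of the split
print(md.members["copy2"][0])    # contents after the resync
```

The backup sees the pre-split data, and the re-added copy ends up identical to the live members, which is the "back where you started" state.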
HTH,
/Anders
Replying to Anders Karlsson:
> downtime on it. You then quiesce the database, split off the second copy
> from the mirror, mount that as a separate filesystem and back that up
> while the original with its first copy has already stepped back into
> full use.
Why not use snapshots for this?
--
Paul P 'Stingray' Komkoff Jr // http://stingr.net/key <- my pgp key
This message represents the official view of the voices in my head
On Mon, 2003-05-12 at 06:41, Paul P Komkoff Jr wrote:
> Replying to Anders Karlsson:
> > downtime on it. You then quiesce the database, split off the second copy
> > from the mirror, mount that as a separate filesystem and back that up
> > while the original with its first copy has already stepped back into
> > full use.
>
> Why not use snapshots for this?
Snapshots were not around then. AFAIK snapshotting in the LVM is a recent
thing. I know people were doing backups by splitting off the 2nd copy in
a mirror some eight years ago.
/Anders
Anders Karlsson wrote:
> On Mon, 2003-05-12 at 05:35, Clemens Schwaighofer wrote:
>
[db raid1 with 3 discs]
that sounds like a super special feature never needed in a Software (!!)
Raid thing (IMvHO).
I can only imagine a Hotspare Disc, that's all.
On Mon, 2003-05-12 at 09:40, Clemens Schwaighofer wrote:
> [db raid1 with 3 discs]
>
> that sounds like a super special feature never needed in a Software (!!)
> Raid thing (IMvHO).
I think it depends greatly on your needs. For small companies running
commercial unices, this might be the best solution based upon need and
cost. For even smaller outfits running Linux, the snapshot feature in
Linux LVM will do the job.
For large operations where money is not really the issue, a SAN with
some SAN Volume Controllers is probably the answer.
Then there is the issue of people that are just simply ultra-paranoid
about their data or where the 2nd copy is in fact off-site (using SCSI
extenders).
> I can only imagine a Hotspare Disc, that's all.
In the tradition of Unix, there is more than one way to skin a cat. ;-)
/Anders
Replying to Anders Karlsson:
> I think it depends greatly on your needs. For small companies running
> commercial unices, this might be the best solution based upon need and
As for the commercial market, Veritas vxfs has been doing snapshots for ages :)
> In the tradition of Unix, there is more than one way to skin a cat. ;-)
There are more efficient and less efficient ways ...
On Mon, 12 May 2003 15:20:34 +0400, Paul P Komkoff Jr said:
> Replying to Anders Karlsson:
> > I think it depends greatly on your needs. For small companies running
> > commercial unices, this might be the best solution based upon need and
>
> As for the commercial market, Veritas vxfs has been doing snapshots for ages :)
a) vxfs isn't free.
b) vxfs may not be available on the platform required by the application.
c) If you've already *got* platform A that doesn't have vxfs, you will probably
think *really* hard about migrating - there's the costs of buying the hardware,
software, and liveware....
> There are more efficient and less efficient ways ...
And often, a lot of inefficiency is overlooked. The cost of buying a new server
from another vendor, buying new licenses for all the software, installing it
all, arranging maintenance contracts, getting sysadmins with a clue regarding
the new system, and all the rest of the costs of a conversion can *easily*
outweigh any "less efficient" ways.
On 2003-05-12T17:40:18,
Clemens Schwaighofer said:
> that sounds like a super special feature never needed in a Software (!!)
> Raid thing (IMvHO).
No.
3way mirroring is actually rather useful. You can take a failure and
_still_ be fully redundant (ie, like a hot-spare, just already synced).
In theory, you could even read from three drives and correct errors on
one drive.
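The "read from three drives and correct errors on one" idea amounts to a majority vote across the copies. A minimal sketch follows, assuming each copy of a block is available as a bytes object. The function name `voted_read` is hypothetical, and nothing in this thread says the md driver does this; it normally serves a read from a single member.

```python
from collections import Counter


def voted_read(copies):
    """Return the value held by a strict majority of mirror copies.

    Raises ValueError when no value has a majority.
    """
    value, count = Counter(copies).most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no majority among mirror copies")
    return value


good = b"\x00\x01\x02\x03"
bad = b"\x00\xff\x02\x03"        # one member returns a corrupted block

print(voted_read([good, bad, good]))
```

With only two copies a mismatch cannot be settled this way, since neither value has a majority. That is the extra property a third mirror would buy in theory.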
Sincerely,
Lars Marowsky-Brée
--
SuSE Labs - Research & Development, SuSE Linux AG
"If anything can go wrong, it will." "Chance favors the prepared (mind)."
-- Capt. Edward A. Murphy -- Louis Pasteur
Clemens Schwaighofer wrote:
> Why three drives in a Raid1? Raid one is just a mirror, or is the third
> drive like a "hot" replace drive if one of the others fails?
The goal is to get better (read) performance, as well as extra
redundancy. The system is supposed to balance reads among the
available drives but in this case it breaks when there are more
than two disks.
I have a "changes way too much code" patch that fixes this; guess
I should at least post it and see what happens...
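To make "balance reads among the available drives" concrete, here is one possible policy: send each request to the mirror whose head was last left closest to the requested sector. This is only an illustrative sketch. It is not the kernel's raid1.c:read_balance() and it is not the patch mentioned above; the function names, chunk size and starting sectors are all assumptions.

```python
def pick_disk(heads, sector):
    """Choose the mirror whose last known head position is closest to sector."""
    return min(range(len(heads)), key=lambda d: abs(heads[d] - sector))


def run(streams, ndisks, chunk=128, steps=1000):
    """Interleave sequential streams; return sectors served per disk, per stream."""
    heads = [0] * ndisks
    served = [[0] * ndisks for _ in streams]
    pos = list(streams)                      # current sector of each stream
    for _ in range(steps):
        for s in range(len(streams)):
            d = pick_disk(heads, pos[s])
            pos[s] += chunk
            heads[d] = pos[s]                # head ends just past the request
            served[s][d] += chunk
    return served


# Two sequential readers starting far apart, three mirrors.
result = run(streams=[0, 1_000_000], ndisks=3)
for s, per_disk in enumerate(result):
    print(f"stream {s}: {per_disk}")
```

In this toy model each sequential stream settles onto its own disk after the first request, so the streams do not drag one head back and forth. Real drives, request queues and readahead are far more complicated than this model suggests.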
Anders Karlsson wrote:
> On Mon, 2003-05-12 at 09:40, Clemens Schwaighofer wrote:
>
> I think it depends greatly on your needs. For small companies running
> commercial unices, this might be the best solution based upon need and
> cost. For even smaller outfits running Linux, the snapshot feature in
> Linux LVM will do the job.
well, for private use yes, but even as a small company I would invest in a
rather more solid hardware RAID system than in software. I have seen so many
horrible data losses due to software raid or IDE HDs (which were in an
external hardware box, actually) that I don't trust this much anymore.
> Then there is the issue of people that are just simply ultra-paranoid
> about their data or where the 2nd copy is in fact off-site (using SCSI
> extenders).
well I always had a hot spare HD in my boxes because I was a bit
ultra-paranoid, actually I became ultra-paranoid :)
>>I can only imagine a Hotspare Disc, that's all.
> In the tradition of Unix, there is more than one way to skin a cat. ;-)
there is for Software, but when it comes to Hardware security, I trust
in real RAID (unless it is private adventures)
On Tue, 2003-05-13 at 03:24, Clemens Schwaighofer wrote:
> > I think it depends greatly on your needs. For small companies running
> > commercial unices, this might be the best solution based upon need and
> > cost. For even smaller outfits running Linux, the snapshot feature in
> > Linux LVM will do the job.
>
> well, for private use yes, but even as a small company I would invest in a
> rather more solid hardware RAID system than in software. I have seen so many
> horrible data losses due to software raid or IDE HDs (which were in an
> external hardware box, actually) that I don't trust this much anymore.
SSA quite happily does mirroring with two copies and has done so for many
years now. There are still instances where read performance outweighs
the cost of having to buy three times the storage you actually need.
And for the record, I was never talking about IDE storage... ;-)
Regards,
/Anders
Hi Anders.
>> Why three drives in a Raid1? Raid one is just a mirror, or is the
>> third drive like a "hot" replace drive if one of the others fails?
> With normal mirroring (one original, one copy) you do have the
> redundancy and the speed boost at reads, but at mirroring with one
> original and two copies (I know AIX does this), you get in to a
> scenario that is quite handy. Say you run a large database in a
> 24/7 operation. You want to back the database up, but you can only
> get 5-10 minutes downtime on it. You then quiesce the database,
> split off the second copy from the mirror, mount that as a
> separate file system and back that up while the original with its
> first copy has already stepped back into full use.
>
> Once you finished your backup, you add your split-off copy back to
> the original and primary copy and you are back where you started.
Does this cause any problems, with the third disc now being out of
date compared to the first two?
Best wishes from Riley.
---
* Nothing as pretty as a smile, nothing as ugly as a frown.
---
Hi Riley,
> > With normal mirroring (one original, one copy) you do have the
> > redundancy and the speed boost at reads, but at mirroring with one
> > original and two copies (I know AIX does this), you get in to a
> > scenario that is quite handy. Say you run a large database in a
> > 24/7 operation. You want to back the database up, but you can only
> > get 5-10 minutes downtime on it. You then quiesce the database,
> > split off the second copy from the mirror, mount that as a
> > separate file system and back that up while the original with its
> > first copy has already stepped back into full use.
> >
> > Once you finished your backup, you add your split-off copy back to
> > the original and primary copy and you are back where you started.
>
> Does this cause any problems, with the third disc now being out of
> date compared to the first two?
No, it should not cause problems as when you add the split-off copy back
into the mirror, it is treated as 'stale' and will get synchronised with
the original.
I would be very surprised if the Linux software md driver worked any
differently than this. Perhaps someone that knows it in-depth can add to
the conversation?
With the facilities of LVM 'snapshots' now being available, this
practice of splitting off one copy from a three-way mirror is perhaps
becoming redundant, but people will likely take the approach of "if it
ain't broken, don't fix it" and leave old backup methods as they are.
So if you work in the sysadm field, you might well come across this
practice.
Regards,
/Anders
On Thu, May 15, 2003 at 11:51:14AM +0100, Anders Karlsson wrote:
...
> No, it should not cause problems as when you add the split-off copy back
> into the mirror, it is treated as 'stale' and will get synchronised with
> the original.
Correct
If this was not the case, background resynchronization of standard
2-disk RAID-1 would be a really horrible feature (with half the reads
returning stale data from the new disk)...
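The point about stale members can be sketched as follows: reads are served only from members marked in sync, and a returning member gets that mark only after it has been fully copied over. This is a toy model under stated assumptions. The `Member` class, the `in_sync` flag and the function names are invented here and are not the md driver's data structures; the device names are borrowed from the array earlier in the thread.

```python
class Member:
    """One mirror member with its blocks and a flag saying whether to trust it."""

    def __init__(self, name, blocks, in_sync):
        self.name, self.blocks, self.in_sync = name, blocks, in_sync


def read(members, block):
    """Serve a read from the first in-sync member only."""
    for m in members:
        if m.in_sync:
            return m.blocks[block]
    raise IOError("no in-sync member")


def resync(members, target):
    """Copy every block from an in-sync member, then mark target in sync."""
    source = next(m for m in members if m.in_sync)
    for i, data in enumerate(source.blocks):
        target.blocks[i] = data
    target.in_sync = True


live = Member("hdg9", ["new0", "new1"], in_sync=True)
stale = Member("hde9", ["old0", "old1"], in_sync=False)
array = [stale, live]            # stale member listed first on purpose

before = [read(array, b) for b in range(2)]
resync(array, stale)
after = [read(array, b) for b in range(2)]
print(before, after)
```

Both lists contain only the new data: before the resync the stale member is skipped, and afterwards it holds the same blocks as the live one.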
>
> I would be very surprised if the Linux software md driver worked any
> differently than this. Perhaps someone that knows it in-depth can add to
> the conversation?
Unless there are bugs in the driver, your description is correct :)
>
> With the facilities of LVM 'snapshots' now being available, this
> practice of splitting off one copy from a three-way mirror is perhaps
> becoming redundant, but people will likely take the approach of "if it
> ain't broken, don't fix it" and leave old backup methods as they are.
> So if you work in the sysadm field, you might well come across this
> practice.
The really good argument for N>2 disk RAID-1 is still the seek-time and
multiple-readers performance benefits which you won't be addressing with
LVM snapshots.
--
................................................................
: [email protected] : And I see the elder races, :
:.........................: putrid forms of man :
: Jakob Østergaard : See him rise and claim the earth, :
: OZ9ABN : his downfall is at hand. :
:.........................:............{Konkhra}...............: