Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1758652AbYFIIny (ORCPT ); Mon, 9 Jun 2008 04:43:54 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1755861AbYFIInn (ORCPT ); Mon, 9 Jun 2008 04:43:43 -0400
Received: from 213.237.47.228.adsl.vbr.worldonline.dk ([213.237.47.228]:13693
        "EHLO rap.rap.dk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1755823AbYFIInm (ORCPT );
        Mon, 9 Jun 2008 04:43:42 -0400
Date: Mon, 9 Jun 2008 10:43:40 +0200
From: Keld Jørn Simonsen
To: thomas62186218@aol.com
Cc: dan.j.williams@gmail.com, jpiszcz@lucidpixels.com,
        linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
        xfs@oss.sgi.com, ap@solarrain.com
Subject: Re: Linux MD RAID 5 Benchmarks Across (3 to 10) 300 Gigabyte Veliciraptors
Message-ID: <20080609084340.GA22209@rap.rap.dk>
References: <8CA981CB5C2B4D6-E68-18E2@MBLK-M14.sysops.aol.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <8CA981CB5C2B4D6-E68-18E2@MBLK-M14.sysops.aol.com>
User-Agent: Mutt/1.5.11
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4946
Lines: 159

On Mon, Jun 09, 2008 at 03:51:07AM -0400, thomas62186218@aol.com wrote:
> Thank you for sharing these results. One issue that I consistently see
> with these results is miserable random IO performance. Looking at these
> numbers, even a low-end RAID controller with 128MB of cache will outrun
> md-based RAIDs in random IO benchmarks. In today's world of virtual
> machines, etc, random IO is far more common than sequential IO. What
> can be done with md (or something else) to alleviate this problem?

Have you got any numbers to back this up?
What benchmark are you using for random IO?

Anyway, the numbers that Justin reported were obtained with an outdated
motherboard. My take is that Linux MD RAID can outperform most HW RAID
by a factor of two on random IO.

Best regards
keld

> -Thomas
>
>
> -----Original Message-----
> From: Dan Williams
> To: Justin Piszcz
> Cc: linux-kernel@vger.kernel.org; linux-raid@vger.kernel.org;
> xfs@oss.sgi.com; Alan Piszcz
> Sent: Sat, 7 Jun 2008 6:46 pm
> Subject: Re: Linux MD RAID 5 Benchmarks Across (3 to 10) 300 Gigabyte
> Veliciraptors
>
> On Sat, Jun 7, 2008 at 7:22 AM, Justin Piszcz wrote:
> >First, the original benchmarks with 6-SATA drives with fixed formatting,
> >using right justification and the same decimal point precision throughout:
> >http://home.comcast.net/~jpiszcz/20080607/raid-benchmarks-decimal-fix-and-right-justified/disks.html
> >
> >Now for veliciraptors! Ever wonder what kind of speed is possible with
> >3 disk, 4,5,6,7,8,9,10-disk RAID5s? I ran a loop to find out, each run is
> >executed three times and the average is taken of all three runs per each
> >RAID5 disk set.
> >
> >In short? The 965 no longer does justice with faster drives, a new chipset
> >and motherboard are needed. After reading or writing to 4-5 veliciraptors
> >it saturates the bus/965 chipset.
> >
> >Here is a picture of the 12 veliciraptors I tested with:
> >http://home.comcast.net/~jpiszcz/20080607/raid5-benchmarks-3to10-veliciraptors/raptors.jpg
> >
> >Here are the bonnie++ results:
> >http://home.comcast.net/~jpiszcz/20080607/raid5-benchmarks-3to10-veliciraptors/veliciraptor-raid.html
> >
> >For those who want the results in text:
> >http://home.comcast.net/~jpiszcz/20080607/raid5-benchmarks-3to10-veliciraptors/veliciraptor-raid.txt
> >
> >System used, same/similar as before:
> >Motherboard: Intel DG965WH
> >Memory: 8GiB
> >Kernel: 2.6.25.4
> >Distribution: Debian Testing x86_64
> >Filesystem: XFS with default mkfs.xfs parameters [auto-optimized for SW RAID]
> >Mount options: defaults,noatime,nodiratime,logbufs=8,logbsize=262144 0 1
> >Chunk size: 1024KiB
> >RAID5 Layout: Default (left-symmetric)
> >Mdadm Superblock used: 0.90
> >
> >Optimizations used (last one is for the CFQ scheduler), it improves
> >performance by a modest 5-10MiB/s:
> >http://home.comcast.net/~jpiszcz/raid/20080601/raid5.html
> >
> ># Tell user what's going on.
> >echo "Optimizing RAID Arrays..."
> >
> ># Define DISKS.
> >cd /sys/block
> >DISKS=$(/bin/ls -1d sd[a-z])
> >
> ># Set read-ahead.
> ># > That's actually 65k x 512byte blocks so 32MiB
> >echo "Setting read-ahead to 32 MiB for /dev/md3"
> >blockdev --setra 65536 /dev/md3
> >
> ># Set stripe-cache_size for RAID5.
> >echo "Setting stripe_cache_size to 16 MiB for /dev/md3"
>
> Sorry to sound like a broken record, 16MiB is not correct.
>
> size=$((num_disks * 4 * 16384 / 1024))
> echo "Setting stripe_cache_size to $size MiB for /dev/md3"
>
> ...and commit 8b3e6cdc should improve the performance /
> stripe_cache_size ratio.
>
> >echo 16384 > /sys/block/md3/md/stripe_cache_size
> >
> ># Disable NCQ on all disks.
> >echo "Disabling NCQ on all disks..."
> >for i in $DISKS
> >do
> >  echo "Disabling NCQ on $i"
> >  echo 1 > /sys/block/"$i"/device/queue_depth
> >done
> >
> ># Fix slice_idle.
> ># See http://www.nextre.it/oracledocs/ioscheduler_03.html
> >echo "Fixing slice_idle to 0..."
> >for i in $DISKS
> >do
> >  echo "Changing slice_idle to 0 on $i"
> >  echo 0 > /sys/block/"$i"/queue/iosched/slice_idle
> >done
> >
>
> Thanks for putting this data together.
>
> Regards,
> Dan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
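
For reference, the arithmetic behind Dan's correction and the read-ahead comment in the quoted script can be checked with a small sketch. The function names stripe_cache_mib and readahead_mib are hypothetical helpers, not part of Justin's script, and the 4 KiB page size is an assumption taken from the "4" in Dan's formula. The sketch only prints numbers; it does not touch /sys or call blockdev.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; 4 KiB pages are assumed.

# Memory consumed by the md stripe cache, in MiB.
# Args: number of member disks, stripe_cache_size (entries),
#       page size in KiB (optional, defaults to 4).
stripe_cache_mib() {
    num_disks=$1
    entries=$2
    page_kib=${3:-4}
    echo $(( num_disks * page_kib * entries / 1024 ))
}

# Read-ahead in MiB for a value given to "blockdev --setra",
# which is expressed in 512-byte sectors.
readahead_mib() {
    sectors=$1
    echo $(( sectors * 512 / 1024 / 1024 ))
}

# Stripe cache footprint for each array size in the benchmark (3 to 10 disks).
for n in 3 4 5 6 7 8 9 10; do
    echo "$n disks: stripe_cache_size=16384 uses $(stripe_cache_mib "$n" 16384) MiB"
done

# Read-ahead value used in the quoted script.
echo "blockdev --setra 65536 = $(readahead_mib 65536) MiB read-ahead"
```

With these assumptions, the message printed for stripe_cache_size depends on the number of member disks rather than being a fixed 16 MiB, which is the point of the correction.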