Date: Tue, 3 Jun 2008 13:44:17 -0500
From: Bryan Mesich
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: Limits of the 965 chipset & 3 PCI-e cards/southbridge? ~774MiB/s peak for read, ~650MiB/s peak for write?
Message-ID: <20080603184417.GA23450@atlantis.cc.ndsu.NoDak.edu>

On Sun, Jun 01, 2008 at 05:45:39AM -0400, Justin Piszcz wrote:

> I am testing some drives for someone and was curious to see how far one can
> push the disks/backplane to their theoretical limit.

This testing would indeed only suggest theoretical limits.  In a
production environment, I think a person would be hard pressed to
reproduce these numbers.

> Does/has anyone done this with server intel board/would greater speeds be
> achievable?

Nope, but your post inspired me to give it a try.  My setup is as follows:

Kernel:          Linux 2.6.25.3-18 (Fedora 9)
Motherboard:     Intel SE7520BD2-DDR2
SATA Controller: (2) 8-port 3Ware 9550SX
Disks:           (12) 750GB Seagate ST3750640NS

Disks sd[a-h] are plugged into the first 3Ware controller while sd[i-l]
are plugged into the second controller.  Both 3Ware cards are plugged
into PCI-X 100 slots.  The disks are exported as "single disk" units and
write caching has been disabled.  The OS is loaded on sd[a-d] (small
10GB partitions, mirrored).

For my first test, I ran dd on a single disk:

dd if=/dev/sde of=/dev/null bs=1M

dstat -D sde

----total-cpu-usage---- --dsk/sde-- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0   7  53  40   0   0|  78M    0 | 526B  420B|   0     0 |1263  2559
  0   8  53  38   0   0|  79M    0 | 574B  420B|   0     0 |1262  2529
  0   7  54  39   0   0|  78M    0 | 390B  420B|   0     0 |1262  2576
  0   7  54  39   0   0|  76M    0 | 284B  420B|   0     0 |1216  2450
  0   8  54  38   0   0|  76M    0 | 376B  420B|   0     0 |1236  2489
  0   9  54  36   0   0|  79M    0 | 397B  420B|   0     0 |1265  2537
  0   9  54  37   0   0|  77M    0 | 344B  510B|   0     0 |1262  2872
  0   8  54  38   0   0|  75M    0 | 637B  420B|   0     0 |1214  2992
  0   8  53  38   0   0|  78M    0 | 422B  420B|   0     0 |1279  3179

And for a write:

dd if=/dev/zero of=/dev/sde bs=1M

dstat -D sde

----total-cpu-usage---- --dsk/sde-- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0   7   2  90   0   0|   0    73M| 637B  420B|   0     0 | 614   166
  0   7   0  93   0   0|   0    73M| 344B  420B|   0     0 | 586   105
  0   7   0  93   0   0|   0    75M| 344B  420B|   0     0 | 629   177
  0   7   0  93   0   0|   0    74M| 344B  420B|   0     0 | 600   103
  0   7   0  93   0   0|   0    73M| 875B  420B|   0     0 | 612   219
  0   8   0  92   0   0|   0    68M| 595B  420B|   0     0 | 546   374
  0   8   5  86   0   0|   0    76M| 132B  420B|   0     0 | 632   453
  0   9   0  91   0   0|   0    74M| 799B  420B|   0     0 | 596   421
  0   8   0  92   0   0|   0    74M| 693B  420B|   0     0 | 624   436
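A quick note in case anyone wants to reproduce the multi-disk runs
below: the sd[e-l] notation is just shorthand, since a single dd won't
read from several devices at once.  One way to start a sequential
reader per disk in parallel would be something like:

for d in /dev/sd[e-l]; do
    dd if=$d of=/dev/null bs=1M &   # one reader per disk, in the background
done
wait

with dstat running in a second terminal.  The write runs are the same
idea with if=/dev/zero of=$d, which of course destroys anything on the
disks.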
For my next test, I ran dd on 8 disks (sd[e-l]).  These are non-system
disks (the OS is installed on sd[a-d]) and they are split between the
3Ware controllers.  Here are my results:

dd if=/dev/sd[e-l] of=/dev/null bs=1M

dstat

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0  91   0   0   1   8| 397M    0 | 811B  306B|   0     0 |6194  6654
  0  91   0   0   1   7| 420M    0 | 158B  322B|   0     0 |6596  7097
  1  91   0   0   1   8| 415M    0 | 324B  322B|   0     0 |6406  6839
  1  91   0   0   1   8| 413M    0 | 316B  436B|   0     0 |6464  6941
  0  90   0   0   2   8| 419M    0 |  66B  306B|   0     0 |6588  7121
  1  91   0   0   2   7| 412M    0 | 461B  322B|   0     0 |6449  6916
  0  91   0   0   1   7| 415M    0 | 665B  436B|   0     0 |6535  7044
  0  92   0   0   1   7| 418M    0 | 299B  306B|   0     0 |6555  7028
  0  90   0   0   1   8| 412M    0 | 192B  436B|   0     0 |6496  7014

And for write:

dd if=/dev/zero of=/dev/sd[e-l] bs=1M

dstat

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  0  86   0   0   1  12|   0   399M| 370B  306B|   0     0 |3520   855
  0  87   0   0   1  12|   0   407M| 310B  322B|   0     0 |3506   813
  1  87   0   0   1  12|   0   413M| 218B  322B|   0     0 |3568   827
  0  87   0   0   0  12|   0   425M| 278B  322B|   0     0 |3641   785
  0  87   0   0   1  12|   0   430M| 310B  322B|   0     0 |3658   845
  0  86   0   0   1  14|   0   421M| 218B  322B|   0     0 |3605   756
  1  85   0   0   1  14|   0   417M| 627B  322B|   0     0 |3579   984
  0  84   0   0   1  14|   0   420M| 224B  436B|   0     0 |3548  1006
  0  86   0   0   1  13|   0   433M| 310B  306B|   0     0 |3679   836

It seems that I'm running into a wall around 420-430MB/s.  Assuming each
disk can push 75MB/s, 8 disks should push 600MB/s together.  This is
obviously not the case.  According to Intel's Technical Product
Specification:

http://download.intel.com/support/motherboards/server/se7520bd2/sb/se7520bd2_server_board_tps_r23.pdf

I think the IO contention (in my case) is due to the PXH (some rough
numbers on this in the P.S. below).

All in all, when it comes down to moving IO in reality, these tests are
pretty much useless in my opinion.  Filesystem overhead and other
operations limit the amount of IO that can be serviced by the PCI bus
and/or the block devices (although it's interesting to see whether the
theoretical speeds are possible).  For example, the box I used in the
above example will be used as a fibre channel target server.  Below is a
performance printout of a running fibre target with the same hardware as
tested above:

mayacli> show performance controller=fc1

read/sec  write/sec  IOPS
     16k       844k   141
     52k       548k    62
      1m       344k    64
     52k       132k    26
       0       208k    27
     12k       396k    42
    168k       356k    64
     32k        76k    16
    952k       248k   124
    860k       264k   132
      1m       544k   165
      1m       280k   166
    900k       344k   105
    340k       284k    60
      1m       280k   125
      1m       340k   138
    764k       592k   118
      1m       448k   127
      2m       356k   276
      2m       480k   174
      2m         8m   144
    540k       376k    89
    324k       380k    77
      4k       348k    71

This particular fibre target is providing storage to 8 initiators, 4 of
which are busy IMAP mail servers.  Granted this isn't the busiest time
of the year for us, but we're not coming even close to the numbers
mentioned in the above example.

As always, corrections to my above babble are appreciated and welcomed :-)

Bryan
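P.S.  Some rough numbers behind my PXH suspicion, assuming both 9550SX
cards sit on PCI-X segments hanging off the PXH:

  PCI-X, 64-bit @ 100MHz:       8 bytes * 100MHz ~= 800MB/s theoretical per bus
  4 test disks per controller:  4 * ~75MB/s      ~= 300MB/s per bus

Neither PCI-X bus should be anywhere near its own ceiling during the
8-disk test, so I'm pointing at the shared PXH rather than the
individual buses.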