2007-02-06 08:58:42

by Bernardo Innocenti

Subject: Writing performance problem with SAS1068

Hello,

I've stumbled onto a strange performance problem on a new server:
reading from disks is fast (70-80MB/s), but writing is extremely
slow (13-15MB/s). I've measured it like this:

dd if=/dev/zero of=/dev/sdd bs=4096 count=65536 conv=fdatasync
65536+0 records in
65536+0 records out
268435456 bytes (268 MB) copied, 17.7004 seconds, 15.2 MB/s
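
The rate dd prints can be re-derived from the byte count and the elapsed time, which is handy when comparing runs across machines. A minimal sketch, assuming dd's "MB" means 10^6 bytes (as GNU dd reports it); the helper name mbps is made up for illustration:

```shell
# mbps BYTES SECONDS: print throughput in decimal MB/s (1 MB = 1000000 bytes).
# "mbps" is a hypothetical helper name, not part of dd or coreutils.
mbps() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f MB/s\n", bytes / secs / 1000000 }'
}

# Figures copied from the dd run above.
mbps 268435456 17.7004
```

With the figures above this prints 15.2 MB/s, matching what dd reported.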

*but*: if I rebuild the kernel and change CONFIG_FUSION_MAX_SGE
from 40 (Fedora's default) to 128 (maximum value), it suddenly
gets much faster: 31MB/s!

Looks very much like an interrupt problem to me. Maybe
increasing the scatter-gather limit mitigates the problem of
missing completion notifications.

Evidence:

Exhibit A: custom kernel config for 2.6.18-1.2257.fc5.bernie
http://www.codewiz.org/helium_logs/config

Exhibit B: dmesg output from said kernel
http://www.codewiz.org/helium_logs/dmesg

Exhibit C: misc proc files, and all that
http://www.codewiz.org/helium_logs/

Exhibit D: motherboard and chipset specification
http://www.supermicro.com/products/motherboard/Xeon3000/3010/PDSME+.cfm


Circumstantial evidence:

- Seems to affect just the LSI SAS1068 PCI-X controller.
The on-board AHCI controller writes very fast (>60MB/s)

- I've seen a very similar writing bottleneck with a
Promise TX4 SATA controller (not PCI-X) on a server with
a similar motherboard (Supermicro with Mukilteo 3000).

- Passing mpt_msi_enable=1 doesn't change anything

- FreeBSD 6.2 is even slower: writes at 7MB/s

- OpenSolaris is much, much slower... less than 1MB/s.

- Windows Vista (rc something) writes at 90MB/s. Too
fast to believe, maybe dd from Cygwin is misbehaving.

--
// Bernardo Innocenti - Develer S.r.l., R&D dept.
\X/ http://www.develer.com/


2007-02-06 19:27:28

by Douglas Gilbert

Subject: Re: Writing performance problem with SAS1068

Bernardo Innocenti wrote:
> Hello,
>
> I've stumbled onto a strange performance problem on a new server:
> reading from disks is fast (70-80MB/s), but writing is extremely
> slow (13-15MB/s). I've measured it like this:
>
> dd if=/dev/zero of=/dev/sdd bs=4096 count=65536 conv=fdatasync
> 65536+0 records in
> 65536+0 records out
> 268435456 bytes (268 MB) copied, 17.7004 seconds, 15.2 MB/s

# dd if=/dev/zero of=/dev/sdj bs=4096 count=65536 conv=fdatasync
65536+0 records in
65536+0 records out
268435456 bytes (268 MB) copied, 2.24953 seconds, 119 MB/s

# dd if=/dev/zero of=/dev/sdd bs=4096 count=65536 conv=fdatasync
65536+0 records in
65536+0 records out
268435456 bytes (268 MB) copied, 2.3246 seconds, 115 MB/s

Both /dev/sdj and /dev/sdd connect via an expander to the same
SAS disk. /dev/sdj is via the LT aic94xx driver and a PCI-X HBA.
/dev/sdd is via the mptsas driver and a SAS1068 (PCIe) based HBA.
The kernel version is 2.6.20-rc5.

Looks good to me.

You may like to check that Write Cache Enable is on with:
'sdparm --get=WCE /dev/sdd'.
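
For reference, the check and the fix might look like the sketch below. It assumes sdparm is installed and that it is run as root; check_and_enable_wce is a hypothetical helper name, and the device path is whatever disk is being tested.

```shell
# check_and_enable_wce DEVICE: show, set, then re-read the Write Cache
# Enable (WCE) bit of a disk. Hypothetical helper; needs root and sdparm.
check_and_enable_wce() {
    dev="$1"
    sdparm --get=WCE "$dev"    # current value: 0 = write cache off, 1 = on
    sdparm --set=WCE "$dev"    # set WCE to 1 in the current mode page
    sdparm --get=WCE "$dev"    # read it back to confirm the change
}
```

It would be called as 'check_and_enable_wce /dev/sdd'. Adding --save to the --set call asks the drive to keep the setting in its saved mode page as well; whether that is honoured depends on the drive and on the controller's translation layer.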

Doug Gilbert

> *but*: if I rebuild the kernel and change CONFIG_FUSION_MAX_SGE
> from 40 (Fedora's default) to 128 (maximum value), it suddenly
> gets much faster: 31MB/s!
>
> Looks very much like an interrupt problem to me. Maybe
> increasing the scatter-gather limit mitigates the problem of
> missing completion notifications.
>
> Evidence:
>
> Exhibit A: custom kernel config for 2.6.18-1.2257.fc5.bernie
> http://www.codewiz.org/helium_logs/config
>
> Exhibit B: dmesg output from said kernel
> http://www.codewiz.org/helium_logs/dmesg
>
> Exhibit C: misc proc files, and all that
> http://www.codewiz.org/helium_logs/
>
> Exhibit D: motherboard and chipset specification
> http://www.supermicro.com/products/motherboard/Xeon3000/3010/PDSME+.cfm
>
>
> Circumstantial evidence:
>
> - Seems to affect just the LSI SAS1068 PCI-X controller.
> The on-board AHCI controller writes very fast (>60MB/s)
>
> - I've seen a very similar writing bottleneck with a
> Promise TX4 SATA controller (not PCI-X) on a server with
> a similar motherboard (Supermicro with Mukilteo 3000).
>
> - Passing mpt_msi_enable=1 doesn't change anything
>
> - FreeBSD 6.2 is even slower: writes at 7MB/s
>
> - OpenSolaris is much, much slower... less than 1MB/s.
>
> - Windows Vista (rc something) writes at 90MB/s. Too
> fast to believe, maybe dd from Cygwin is misbehaving.
>

2007-02-07 20:59:10

by Bernardo Innocenti

Subject: Re: Writing performance problem with SAS1068

Douglas Gilbert wrote:

> You may like to check that Write Cache Enable is on with:
> 'sdparm --get=WCE /dev/sdd'.

Yeah, yeah! Works fine once I toggled the WCE to 1. Writing
flies at 70MB/s, which is extremely good for those desktop-grade
disks (Seagate Barracuda SATA 250GB @7200RPM).

Thank you very much!

But... who do you think I should bug to make this the system
default? Does write caching need to be enabled by the driver
itself, in the SCSI layer or perhaps by the distro initscripts?

--
// Bernardo Innocenti - Develer S.r.l., R&D dept.
\X/ http://www.develer.com/
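
The thread does not settle where the default should live. As a stopgap, one possible approach is to enable the cache from a local boot script. The sketch below is only an illustration under that assumption: enable_wce_all is a hypothetical helper, the device list is up to the administrator, and it needs root and sdparm.

```shell
# enable_wce_all DEVICE...: try to turn on Write Cache Enable for each disk.
# Hypothetical helper meant to be called from a local init script.
enable_wce_all() {
    for dev in "$@"; do
        # sdparm exits non-zero if the mode page could not be changed.
        if sdparm --set=WCE "$dev"; then
            echo "WCE enabled on $dev"
        else
            echo "could not enable WCE on $dev" >&2
        fi
    done
}
```

A call such as 'enable_wce_all /dev/sdd' would go at the end of the script. Keep in mind that a volatile write cache trades safety for speed: data still in the cache can be lost on power failure.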