Dear all,
now I was able to do a performance test with an Intel ICH6R chipset.
Basic hardware data:
- Intel Pentium 4 Xeon at 3.2 GHz
- Intel ICH6R chipset, AHCI enabled
- Intel Hyperthreading On and Off
- 1 GB SDDR RAM
- SATA controller onboard (4x)
- SATA harddisks 250 GB
I used SuSE Linux 9.3 with Linux kernel 2.6.11.4-21.14-smp.
I tried to put data from memory to harddisk by writing blocks of 1 MB size
with 2 GB overall filesize. And I got the following results:
SuSE 9.3 - 32-bit Installation - Hyperthreading Off:
- CPU time 16% (sys, approx. 0% user) - Write speed 40-45 MB/s
SuSE 9.3 - 64-bit Installation - Hyperthreading Off:
- CPU time 14% (sys, approx. 0% user) - Write speed 50-60 MB/s
SuSE 9.3 - 64-bit Installation - Hyperthreading ON:
- CPU time 16% (sys, approx. 0% user) - Write speed 40-50 MB/s
Compared to ICH6R with AHCI OFF the only difference I can see is that with
AHCI the system seems to reac much faster on keyboard events and screen
redraw seems to be as fast as normal. It looks like that CPU usage has not
decreased that dramatically as I would have expected it.
Thus I did a small calculation:
Assuming that the processor gives workloads of (a) 1B (b) 1kB (c) 64kB to the
DMA controller in AHCI mode to write 45 MB/s to disk, I calculate for 10% CPU
time usage of the 3.2 GHz Pentium
(a) 10% * 3.2GHz / 45M calls = 7.3 CPU cycles per 1B call to DMA
(b) 10% * 3.2GHz / 45k calls = 7.4E+03 CPU cycles per 1kB call to DMA
(c) 10% * 3.2GHz / 720 calls = 4.8E+05 CPU cycles per 64kB call to DMA
For me (a) looks reasonable (some overhead per byte), but stupid - if
implemented. Giving bigger packages like (b) and (c) looks better to me, but
then I can't understand that huge overhead (1E3 to 1E5 cpu cycles per
package) for one package.
Is this normal or do I still have something wrong in my system?
Thank you for your help,
Martin
Martin A. Fink wrote:
> Compared to ICH6R with AHCI OFF the only difference I can see is that with
> AHCI the system seems to reac much faster on keyboard events and screen
> redraw seems to be as fast as normal. It looks like that CPU usage has not
> decreased that dramatically as I would have expected it.
>
> Thus I did a small calculation:
> Assuming that the processor gives workloads of (a) 1B (b) 1kB (c) 64kB to the
> DMA controller in AHCI mode to write 45 MB/s to disk, I calculate for 10% CPU
> time usage of the 3.2 GHz Pentium
> (a) 10% * 3.2GHz / 45M calls = 7.3 CPU cycles per 1B call to DMA
> (b) 10% * 3.2GHz / 45k calls = 7.4E+03 CPU cycles per 1kB call to DMA
> (c) 10% * 3.2GHz / 720 calls = 4.8E+05 CPU cycles per 64kB call to DMA
>
> For me (a) looks reasonable (some overhead per byte), but stupid - if
> implemented. Giving bigger packages like (b) and (c) looks better to me, but
> then I can't understand that huge overhead (1E3 to 1E5 cpu cycles per
> package) for one package.
>
> Is this normal or do I still have something wrong in my system?
It's not ata_piix or ahci that's eating up your cpu cycles. It's
memcpy from your user program to kernel buffer.
[root]# dd if=/dev/zero of=/dev/sda bs=1M &
[1] 2649
[root]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 3 0 6216 477328 3660 0 0 2579 31377 955 1380 2 8 38 51
0 3 0 6008 477616 3680 0 0 0 55296 404 170 0 14 0 86
0 3 0 5772 477808 3640 0 0 0 73728 392 207 0 18 0 82
0 3 0 5896 477680 3656 0 0 0 74240 394 207 0 18 0 82
0 3 0 6084 477520 3656 0 0 0 73728 393 205 0 18 0 82
1 2 0 6652 477204 3668 0 0 0 57440 401 197 0 16 0 84
0 3 0 6136 477496 3664 0 0 0 72168 394 195 0 17 0 83
0 3 0 6320 477316 3644 0 0 0 73728 392 207 0 19 0 81
[root]# dd if=/dev/zero of=/dev/sda bs=1M oflag=direct &
[1] 2657
[root]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 1 0 494568 224 3836 0 0 835 11149 480 454 1 3 78 18
0 1 0 494568 224 3836 0 0 0 70656 406 146 0 2 0 98
0 1 0 494568 224 3840 0 0 0 69632 393 144 0 1 0 99
0 1 0 494568 232 3832 0 0 0 69680 396 152 0 2 0 98
0 1 0 494568 232 3840 0 0 0 68608 392 142 0 1 0 99
1 1 0 494568 232 3840 0 0 0 69632 395 144 0 2 0 98
0 1 0 494576 232 3840 0 0 0 69632 393 143 0 2 0 98
0 1 0 494576 232 3840 0 0 0 70656 395 144 0 1 0 99
0 1 0 494576 232 3840 0 0 0 70656 394 148 0 2 0 98
--
tejun