2014-04-21 07:31:04

by Jet Chen

Subject: [libata/ahci] 8a4aeec8d2d: +138.4% perf-stat.dTLB-store-misses, +37.2% perf-stat.dTLB-load-misses

Hi Dan,

We noticed the changes below on

git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata for-next
commit 8a4aeec8d2d6a3edeffbdfae451cdf05cbf0fefd ("libata/ahci: accommodate tag ordered controllers")

2cf532f5e67c0cf 8a4aeec8d2d6a3edeffbdfae4
--------------- -------------------------
  88694337 ~39%   +138.4%  2.115e+08 ~46%  TOTAL perf-stat.dTLB-store-misses
    217057 ~ 0%    -31.3%     149221 ~ 3%  TOTAL interrupts.46:PCI-MSI-edge.ahci
 6.995e+08 ~20%    +37.2%  9.598e+08 ~25%  TOTAL perf-stat.dTLB-load-misses
    110302 ~ 0%    -28.9%      78402 ~ 2%  TOTAL interrupts.CAL
 3.168e+08 ~ 9%    +14.5%  3.627e+08 ~10%  TOTAL perf-stat.L1-dcache-prefetches
 2.553e+09 ~12%    +26.5%  3.228e+09 ~11%  TOTAL perf-stat.LLC-loads
 5.815e+08 ~ 6%    +27.3%  7.403e+08 ~11%  TOTAL perf-stat.LLC-stores
 3.662e+09 ~11%    +22.9%  4.501e+09 ~10%  TOTAL perf-stat.L1-dcache-load-misses
 2.155e+10 ~ 1%     +8.3%  2.333e+10 ~ 1%  TOTAL perf-stat.L1-dcache-store-misses
 3.619e+10 ~ 1%     +5.9%  3.832e+10 ~ 2%  TOTAL perf-stat.cache-references
 1.605e+10 ~ 1%     +4.3%  1.674e+10 ~ 1%  TOTAL perf-stat.L1-icache-load-misses
    239691 ~ 7%     -8.4%     219537 ~ 1%  TOTAL interrupts.RES
      3483 ~ 0%     -5.4%       3297 ~ 0%  TOTAL vmstat.system.in
 2.748e+08 ~ 1%     +4.3%  2.865e+08 ~ 0%  TOTAL perf-stat.cache-misses
  98935369 ~ 0%     +4.9%  1.038e+08 ~ 0%  TOTAL perf-stat.LLC-store-misses
       699 ~ 1%     -3.7%        673 ~ 1%  TOTAL iostat.sda.w_await
       698 ~ 1%     -3.7%        672 ~ 1%  TOTAL iostat.sda.await
    203893 ~ 0%     +3.7%     211474 ~ 0%  TOTAL iostat.sda.wkB/s
    203972 ~ 0%     +3.7%     211488 ~ 0%  TOTAL vmstat.io.bo
    618082 ~ 4%     -4.6%     589619 ~ 1%  TOTAL perf-stat.context-switches
 1.432e+12 ~ 1%     +3.0%  1.475e+12 ~ 0%  TOTAL perf-stat.L1-icache-loads
  3.35e+11 ~ 0%     +3.2%  3.456e+11 ~ 0%  TOTAL perf-stat.L1-dcache-stores
 1.486e+12 ~ 0%     +2.8%  1.527e+12 ~ 0%  TOTAL perf-stat.iTLB-loads
 3.006e+11 ~ 0%     +2.6%  3.084e+11 ~ 0%  TOTAL perf-stat.branch-instructions
 1.793e+12 ~ 0%     +2.8%  1.843e+12 ~ 0%  TOTAL perf-stat.cpu-cycles
 3.352e+11 ~ 1%     +2.9%  3.451e+11 ~ 0%  TOTAL perf-stat.dTLB-stores
 2.994e+11 ~ 1%     +3.1%  3.087e+11 ~ 0%  TOTAL perf-stat.branch-loads
  1.49e+12 ~ 0%     +2.9%  1.533e+12 ~ 0%  TOTAL perf-stat.instructions
  5.48e+11 ~ 0%     +2.8%  5.633e+11 ~ 0%  TOTAL perf-stat.dTLB-loads
 2.028e+11 ~ 1%     +2.9%  2.086e+11 ~ 1%  TOTAL perf-stat.bus-cycles
 5.484e+11 ~ 0%     +2.9%  5.644e+11 ~ 0%  TOTAL perf-stat.L1-dcache-loads
 1.829e+12 ~ 0%     +2.7%  1.877e+12 ~ 1%  TOTAL perf-stat.ref-cycles

Legend:
~XX% - stddev percent
[+-]XX% - change percent
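
For readers checking the table by hand, the change percent appears to be the relative difference between the right-hand (new commit) value and the left-hand (parent commit) value. The sketch below is an illustration only, not the test robot's own code; the function name change_percent is made up here, and the input numbers are copied from two rows of the table.

```python
def change_percent(base: float, head: float) -> float:
    """Relative change from the base commit's value to the head commit's, in percent."""
    return (head - base) / base * 100.0


# Values copied from the iostat.sda.wkB/s and interrupts.46:PCI-MSI-edge.ahci rows.
wkbs = change_percent(203893, 211474)
ahci_irq = change_percent(217057, 149221)

print(f"iostat.sda.wkB/s: {wkbs:+.1f}%")
print(f"interrupts.46:PCI-MSI-edge.ahci: {ahci_irq:+.1f}%")
```

Rows whose values are printed in rounded scientific notation (such as 2.115e+08) may differ from the reported percentage in the last digit when recomputed this way.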

The full stats_changes entries are attached for reference.

Thanks,
Jet

Attachments:
reproduce (453.00 B)
stats_changes (11.54 kB)

2014-04-22 17:11:59

by Dan Williams

Subject: Re: [libata/ahci] 8a4aeec8d2d: +138.4% perf-stat.dTLB-store-misses, +37.2% perf-stat.dTLB-load-misses

On Mon, Apr 21, 2014 at 12:29 AM, Jet Chen wrote:
> HI Dan,
>
> we noticed the below changes on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata for-next
> commit 8a4aeec8d2d6a3edeffbdfae451cdf05cbf0fefd ("libata/ahci: accommodate
> tag ordered controllers")

Hi, was this on simulated hardware or a real AHCI controller and disk?

It does appear this test noticed increased throughput:

203893 ~ 0% +3.7% 211474 ~ 0% TOTAL iostat.sda.wkB/s

I wonder if ap->last_tag can be moved to a hotter cacheline, but if
throughput goes up I can imagine it throws off the cpu statistics
quite a bit.
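
One rough way to separate "more work done" from "more cost per unit of work" is to divide each counter by the write throughput before comparing the two commits. The sketch below does this for three rows of the report. It assumes both runs lasted the same wall-clock time, which the report does not state, and the helper name per_throughput_change is hypothetical.

```python
def per_throughput_change(base, head, base_kbps, head_kbps):
    """Percent change of a counter after dividing it by write throughput."""
    return ((head / head_kbps) / (base / base_kbps) - 1.0) * 100.0


# iostat.sda.wkB/s for the parent commit and for 8a4aeec8d2d, from the report.
BASE_KBPS, HEAD_KBPS = 203893, 211474

# (parent value, new value) pairs copied from the report's table.
counters = {
    "perf-stat.instructions": (1.49e12, 1.533e12),
    "perf-stat.cpu-cycles": (1.793e12, 1.843e12),
    "perf-stat.dTLB-store-misses": (88694337, 2.115e8),
}

normalized = {
    name: per_throughput_change(base, head, BASE_KBPS, HEAD_KBPS)
    for name, (base, head) in counters.items()
}

for name, pct in normalized.items():
    print(f"{name}: {pct:+.1f}% per kB/s written")
```

Under that assumption, the counters that rose by about 3% come out roughly flat or slightly lower per unit of throughput, while dTLB-store-misses still more than doubles. Its reported stddev of ~39% to ~46% means that number should be read with caution.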

2014-04-23 08:21:20

by Jet Chen

Subject: Re: [libata/ahci] 8a4aeec8d2d: +138.4% perf-stat.dTLB-store-misses, +37.2% perf-stat.dTLB-load-misses

On 04/23/2014 01:11 AM, Dan Williams wrote:
> On Mon, Apr 21, 2014 at 12:29 AM, Jet Chen wrote:
>> HI Dan,
>>
>> we noticed the below changes on
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata for-next
>> commit 8a4aeec8d2d6a3edeffbdfae451cdf05cbf0fefd ("libata/ahci: accommodate
>> tag ordered controllers")
>
> Hi, was this on simulated hardware or a real AHCI controller and disk?
>

Testing was on a physical machine with a real AHCI controller.

root@bay ~# lspci | grep AHCI
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)

> It does appear this test noticed increased throughput:
>
> 203893 ~ 0% +3.7% 211474 ~ 0% TOTAL iostat.sda.wkB/s
>
> I wonder if ap->last_tag can be moved to a hotter cacheline, but if
> throughput goes up I can imagine it throws off the cpu statistics
> quite a bit.
>