Date: Fri, 24 Apr 2015 12:15:59 +1000
From: NeilBrown
To: Huang Ying
Cc: "shli@kernel.org", LKML, LKP ML
Subject: Re: [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses
Message-ID: <20150424121559.321677ce@notabene.brown>
In-Reply-To: <1429772159.25120.9.camel@intel.com>
References: <1429772159.25120.9.camel@intel.com>

On Thu, 23 Apr 2015 14:55:59 +0800 Huang Ying wrote:

> FYI, we noticed the below changes on
>
> git://neil.brown.name/md for-next
> commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")

Hi,
is there any chance that you could explain what some of this means?
There is lots of data and some very pretty graphs, but no explanation.

Which numbers are "good", which are "bad", and which is "worst"?
What do the graphs really show, and what would we like to see in them?

I think it is really great that you are doing this testing and reporting
the results.  It's just so sad that I completely fail to understand them.
Thanks,
NeilBrown

> testbox/testcase/testparams: lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd
>
> a87d7f782b47e030  878ee6792799e2f88bdcac3298
> ----------------  --------------------------
>          %stddev      %change          %stddev
>              \           |                 \
>       59035 ±  0%      +18.4%       69913 ±  1%  softirqs.SCHED
>        1330 ± 10%      +17.4%        1561 ±  4%  slabinfo.kmalloc-512.num_objs
>        1330 ± 10%      +17.4%        1561 ±  4%  slabinfo.kmalloc-512.active_objs
>      305908 ±  0%       -1.8%      300427 ±  0%  vmstat.io.bo
>           1 ±  0%     +100.0%           2 ±  0%  vmstat.procs.r
>        8266 ±  1%      -15.7%        6968 ±  0%  vmstat.system.cs
>       14819 ±  0%       -2.1%       14503 ±  0%  vmstat.system.in
>       18.20 ±  6%      +10.2%       20.05 ±  4%  perf-profile.cpu-cycles.raid_run_ops.handle_stripe.handle_active_stripes.raid5d.md_thread
>        1.94 ±  9%      +90.6%        3.70 ±  9%  perf-profile.cpu-cycles.async_xor.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
>        0.00 ±  0%       +Inf%       25.18 ±  3%  perf-profile.cpu-cycles.handle_active_stripes.isra.45.raid5d.md_thread.kthread.ret_from_fork
>        0.00 ±  0%       +Inf%       14.14 ±  4%  perf-profile.cpu-cycles.async_copy_data.isra.42.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
>        1.79 ±  7%     +102.9%        3.64 ±  9%  perf-profile.cpu-cycles.xor_blocks.async_xor.raid_run_ops.handle_stripe.handle_active_stripes
>        3.09 ±  4%      -10.8%        2.76 ±  4%  perf-profile.cpu-cycles.get_active_stripe.make_request.md_make_request.generic_make_request.submit_bio
>        0.80 ± 14%      +28.1%        1.02 ± 10%  perf-profile.cpu-cycles.mutex_lock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
>       14.78 ±  6%     -100.0%        0.00 ±  0%  perf-profile.cpu-cycles.async_copy_data.isra.38.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
>       25.68 ±  4%     -100.0%        0.00 ±  0%  perf-profile.cpu-cycles.handle_active_stripes.isra.41.raid5d.md_thread.kthread.ret_from_fork
>        1.23 ±  5%     +140.0%        2.96 ±  7%  perf-profile.cpu-cycles.xor_sse_5_pf64.xor_blocks.async_xor.raid_run_ops.handle_stripe
>        2.62 ±  6%      -95.6%        0.12 ± 33%  perf-profile.cpu-cycles.analyse_stripe.handle_stripe.handle_active_stripes.raid5d.md_thread
>        0.96 ±  9%      +17.5%        1.12 ±  2%  perf-profile.cpu-cycles.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
>   1.461e+10 ±  0%       -5.3%   1.384e+10 ±  1%  perf-stat.L1-dcache-load-misses
>   3.688e+11 ±  0%       -2.7%    3.59e+11 ±  0%  perf-stat.L1-dcache-loads
>   1.124e+09 ±  0%      -27.7%   8.125e+08 ±  0%  perf-stat.L1-dcache-prefetches
>   2.767e+10 ±  0%       -1.8%   2.717e+10 ±  0%  perf-stat.L1-dcache-store-misses
>   2.352e+11 ±  0%       -2.8%   2.287e+11 ±  0%  perf-stat.L1-dcache-stores
>   6.774e+09 ±  0%       -2.3%    6.62e+09 ±  0%  perf-stat.L1-icache-load-misses
>   5.571e+08 ±  0%      +40.5%   7.826e+08 ±  1%  perf-stat.LLC-load-misses
>   6.263e+09 ±  0%      -13.7%   5.407e+09 ±  1%  perf-stat.LLC-loads
>   1.914e+11 ±  0%       -4.2%   1.833e+11 ±  0%  perf-stat.branch-instructions
>   1.145e+09 ±  2%       -5.6%   1.081e+09 ±  0%  perf-stat.branch-load-misses
>   1.911e+11 ±  0%       -4.3%   1.829e+11 ±  0%  perf-stat.branch-loads
>   1.142e+09 ±  2%       -5.1%   1.083e+09 ±  0%  perf-stat.branch-misses
>   1.218e+09 ±  0%      +19.8%    1.46e+09 ±  0%  perf-stat.cache-misses
>   2.118e+10 ±  0%       -5.2%   2.007e+10 ±  0%  perf-stat.cache-references
>     2510308 ±  1%      -15.7%     2115410 ±  0%  perf-stat.context-switches
>       39623 ±  0%      +22.1%       48370 ±  1%  perf-stat.cpu-migrations
>   4.179e+08 ± 40%     +165.7%   1.111e+09 ± 35%  perf-stat.dTLB-load-misses
>   3.684e+11 ±  0%       -2.5%   3.592e+11 ±  0%  perf-stat.dTLB-loads
>   1.232e+08 ± 15%      +62.5%   2.002e+08 ± 27%  perf-stat.dTLB-store-misses
>   2.348e+11 ±  0%       -2.5%   2.288e+11 ±  0%  perf-stat.dTLB-stores
>     3577297 ±  2%       +8.7%     3888986 ±  1%  perf-stat.iTLB-load-misses
>   1.035e+12 ±  0%       -3.5%   9.988e+11 ±  0%  perf-stat.iTLB-loads
>   1.036e+12 ±  0%       -3.7%   9.978e+11 ±  0%  perf-stat.instructions
>         594 ± 30%     +130.3%        1369 ± 13%  sched_debug.cfs_rq[0]:/.blocked_load_avg
>          17 ± 10%      -28.2%          12 ± 23%  sched_debug.cfs_rq[0]:/.nr_spread_over
>         210 ± 21%      +42.1%         298 ± 28%  sched_debug.cfs_rq[0]:/.tg_runnable_contrib
>        9676 ± 21%      +42.1%       13754 ± 28%  sched_debug.cfs_rq[0]:/.avg->runnable_avg_sum
>         772 ± 25%     +116.5%        1672 ±  9%  sched_debug.cfs_rq[0]:/.tg_load_contrib
>        8402 ±  9%      +83.3%       15405 ± 11%  sched_debug.cfs_rq[0]:/.tg_load_avg
>        8356 ±  9%      +82.8%       15272 ± 11%  sched_debug.cfs_rq[1]:/.tg_load_avg
>         968 ± 25%     +100.8%        1943 ± 14%  sched_debug.cfs_rq[1]:/.blocked_load_avg
>       16242 ±  9%      -22.2%       12643 ± 14%  sched_debug.cfs_rq[1]:/.avg->runnable_avg_sum
>         353 ±  9%      -22.1%         275 ± 14%  sched_debug.cfs_rq[1]:/.tg_runnable_contrib
>        1183 ± 23%      +77.7%        2102 ± 12%  sched_debug.cfs_rq[1]:/.tg_load_contrib
>         181 ±  8%      -31.4%         124 ± 26%  sched_debug.cfs_rq[2]:/.tg_runnable_contrib
>        8364 ±  8%      -31.3%        5745 ± 26%  sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
>        8297 ±  9%      +81.7%       15079 ± 12%  sched_debug.cfs_rq[2]:/.tg_load_avg
>       30439 ± 13%      -45.2%       16681 ± 26%  sched_debug.cfs_rq[2]:/.exec_clock
>       39735 ± 14%      -48.3%       20545 ± 29%  sched_debug.cfs_rq[2]:/.min_vruntime
>        8231 ± 10%      +82.2%       15000 ± 12%  sched_debug.cfs_rq[3]:/.tg_load_avg
>        1210 ± 14%     +110.3%        2546 ± 30%  sched_debug.cfs_rq[4]:/.tg_load_contrib
>        8188 ± 10%      +82.8%       14964 ± 12%  sched_debug.cfs_rq[4]:/.tg_load_avg
>        8132 ± 10%      +83.1%       14890 ± 12%  sched_debug.cfs_rq[5]:/.tg_load_avg
>         749 ± 29%     +205.9%        2292 ± 34%  sched_debug.cfs_rq[5]:/.blocked_load_avg
>         963 ± 30%     +169.9%        2599 ± 33%  sched_debug.cfs_rq[5]:/.tg_load_contrib
>       37791 ± 32%      -38.6%       23209 ± 13%  sched_debug.cfs_rq[6]:/.min_vruntime
>         693 ± 25%     +132.2%        1609 ± 29%  sched_debug.cfs_rq[6]:/.blocked_load_avg
>       10838 ± 13%      -39.2%        6587 ± 13%  sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
>       29329 ± 27%      -33.2%       19577 ± 10%  sched_debug.cfs_rq[6]:/.exec_clock
>         235 ± 14%      -39.7%         142 ± 14%  sched_debug.cfs_rq[6]:/.tg_runnable_contrib
>        8085 ± 10%      +83.6%       14848 ± 12%  sched_debug.cfs_rq[6]:/.tg_load_avg
>         839 ± 25%     +128.5%        1917 ± 18%  sched_debug.cfs_rq[6]:/.tg_load_contrib
>        8051 ± 10%      +83.6%       14779 ± 12%  sched_debug.cfs_rq[7]:/.tg_load_avg
>         156 ± 34%      +97.9%         309 ± 19%  sched_debug.cpu#0.cpu_load[4]
>         160 ± 25%      +64.0%         263 ± 16%  sched_debug.cpu#0.cpu_load[2]
>         156 ± 32%      +83.7%         286 ± 17%  sched_debug.cpu#0.cpu_load[3]
>         164 ± 20%      -35.1%         106 ± 31%  sched_debug.cpu#2.cpu_load[0]
>         249 ± 15%      +80.2%         449 ± 10%  sched_debug.cpu#4.cpu_load[3]
>         231 ± 11%     +101.2%         466 ± 13%  sched_debug.cpu#4.cpu_load[2]
>         217 ± 14%     +189.9%         630 ± 38%  sched_debug.cpu#4.cpu_load[0]
>       71951 ±  5%      +21.6%       87526 ±  7%  sched_debug.cpu#4.nr_load_updates
>         214 ±  8%     +146.1%         527 ± 27%  sched_debug.cpu#4.cpu_load[1]
>         256 ± 17%      +75.7%         449 ± 13%  sched_debug.cpu#4.cpu_load[4]
>         209 ± 23%      +98.3%         416 ± 48%  sched_debug.cpu#5.cpu_load[2]
>       68024 ±  2%      +18.8%       80825 ±  1%  sched_debug.cpu#5.nr_load_updates
>         217 ± 26%      +74.9%         380 ± 45%  sched_debug.cpu#5.cpu_load[3]
>         852 ± 21%      -38.3%         526 ± 22%  sched_debug.cpu#6.curr->pid
>
> lkp-st02: Core2
> Memory: 8G
>
> perf-stat.cache-misses
>
>     [ASCII time-series plot, y-axis 0 .. 1.6e+09: bisect-bad (O) samples around 1.4e+09-1.6e+09, bisect-good (*) samples around 1.2e+09]
>
> perf-stat.L1-dcache-prefetches
>
>     [ASCII time-series plot, y-axis 0 .. 1.2e+09: bisect-good (*) samples around 1e+09, bisect-bad (O) samples around 8e+08]
>
> perf-stat.LLC-load-misses
>
>     [ASCII time-series plot, y-axis 0 .. 1e+09: bisect-bad (O) samples around 8e+08-9e+08, bisect-good (*) samples around 6e+08]
>
> perf-stat.context-switches
>
>     [ASCII time-series plot, y-axis 0 .. 3e+06: bisect-good (*) samples around 2.5e+06, bisect-bad (O) samples around 2e+06]
>
> vmstat.system.cs
>
>     [ASCII time-series plot, y-axis 0 .. 10000: bisect-good (*) samples around 8000-9000, bisect-bad (O) samples around 7000]
>
> [*] bisect-good sample
> [O] bisect-bad sample
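For reference, each row in the comparison table above gives the parent commit's mean on the left, the patched commit's mean on the right, "± N%" as the relative standard deviation across the test runs, and the middle column as the percentage change between the two means. A minimal sketch of that arithmetic for the softirqs.SCHED row (values copied from the table; the awk invocation is only an illustration, not part of the LKP tooling):

    # %change = (patched_mean - parent_mean) / parent_mean * 100
    awk 'BEGIN { printf "%+.1f%%\n", (69913 - 59035) / 59035 * 100 }'
    # prints +18.4%, matching the softirqs.SCHED row above
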
> To reproduce:
>
>   apt-get install ruby
>   git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>   cd lkp-tests
>   bin/setup-local job.yaml # the job file attached in this email
>   bin/run-local job.yaml
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
> Thanks,
> Ying Huang