Date: Thu, 30 Apr 2015 14:25:23 +0800
From: Yuanhan Liu
To: NeilBrown
Cc: Huang Ying, "shli@kernel.org", LKML, LKP ML, Fengguang Wu
Subject: Re: [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses
Message-ID: <20150430062523.GA25995@yliu-dev.sh.intel.com>
References: <1429772159.25120.9.camel@intel.com> <20150424121559.321677ce@notabene.brown>
In-Reply-To: <20150424121559.321677ce@notabene.brown>

On Fri, Apr 24, 2015 at 12:15:59PM +1000, NeilBrown wrote:
> On Thu, 23 Apr 2015 14:55:59 +0800 Huang Ying wrote:
>
> > FYI, we noticed the below changes on
> >
> > git://neil.brown.name/md for-next
> > commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")
>
> Hi,
>  is there any chance that you could explain what some of this means?
>  There is lots of data and some very pretty graphs, but no explanation.

Hi Neil,

(Sorry for the late response: Ying is on vacation.)

I guess you can simply ignore this report, as I already reported to you a
month ago that this patch makes fsmark perform better in most cases:

    https://lists.01.org/pipermail/lkp/2015-March/002411.html

>
> Which numbers are "good", which are "bad"?  Which is "worst".
> What do the graphs really show? and what would we like to see in them?
>
> I think it is really great that you are doing this testing and reporting the
> results.  It's just so sad that I completely fail to understand them.

Sorry, that is our fault: the reports are hard to understand, and this one
also duplicates the earlier fsmark report (well, the commit hash is
different ;). We need to take some time to make the data easier to
understand.

	--yliu
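For readers not familiar with LKP job names: dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd
in the quoted report below is a single sequential dd writer ("1dd") over an
11-disk software RAID5 array formatted with XFS, with the cfq I/O scheduler on
the member disks. A minimal hand-rolled sketch of an equivalent setup follows;
it is not the actual LKP job script, and the device names, mount point, 5M
block size and 300-second runtime are only guesses at what the job name encodes:

    # illustrative only -- device names and parameters are assumptions
    mdadm --create /dev/md0 --level=5 --raid-devices=11 /dev/sd[b-l]
    for d in b c d e f g h i j k l; do
            echo cfq > /sys/block/sd$d/queue/scheduler    # per-disk I/O scheduler
    done
    mkfs.xfs -f /dev/md0
    mkdir -p /mnt/raid5
    mount /dev/md0 /mnt/raid5
    # one buffered sequential writer, stopped after the assumed 300s runtime
    timeout 300 dd if=/dev/zero of=/mnt/raid5/testfile bs=5M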
>
> >
> >
> > testbox/testcase/testparams: lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd
> >
> > a87d7f782b47e030  878ee6792799e2f88bdcac3298
> > ----------------  --------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> >      59035 ±  0%     +18.4%      69913 ±  1%  softirqs.SCHED
> >       1330 ± 10%     +17.4%       1561 ±  4%  slabinfo.kmalloc-512.num_objs
> >       1330 ± 10%     +17.4%       1561 ±  4%  slabinfo.kmalloc-512.active_objs
> >     305908 ±  0%      -1.8%     300427 ±  0%  vmstat.io.bo
> >          1 ±  0%    +100.0%          2 ±  0%  vmstat.procs.r
> >       8266 ±  1%     -15.7%       6968 ±  0%  vmstat.system.cs
> >      14819 ±  0%      -2.1%      14503 ±  0%  vmstat.system.in
> >      18.20 ±  6%     +10.2%      20.05 ±  4%  perf-profile.cpu-cycles.raid_run_ops.handle_stripe.handle_active_stripes.raid5d.md_thread
> >       1.94 ±  9%     +90.6%       3.70 ±  9%  perf-profile.cpu-cycles.async_xor.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> >       0.00 ±  0%      +Inf%      25.18 ±  3%  perf-profile.cpu-cycles.handle_active_stripes.isra.45.raid5d.md_thread.kthread.ret_from_fork
> >       0.00 ±  0%      +Inf%      14.14 ±  4%  perf-profile.cpu-cycles.async_copy_data.isra.42.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> >       1.79 ±  7%    +102.9%       3.64 ±  9%  perf-profile.cpu-cycles.xor_blocks.async_xor.raid_run_ops.handle_stripe.handle_active_stripes
> >       3.09 ±  4%     -10.8%       2.76 ±  4%  perf-profile.cpu-cycles.get_active_stripe.make_request.md_make_request.generic_make_request.submit_bio
> >       0.80 ± 14%     +28.1%       1.02 ± 10%  perf-profile.cpu-cycles.mutex_lock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
> >      14.78 ±  6%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.async_copy_data.isra.38.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> >      25.68 ±  4%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.handle_active_stripes.isra.41.raid5d.md_thread.kthread.ret_from_fork
> >       1.23 ±  5%    +140.0%       2.96 ±  7%  perf-profile.cpu-cycles.xor_sse_5_pf64.xor_blocks.async_xor.raid_run_ops.handle_stripe
> >       2.62 ±  6%     -95.6%       0.12 ± 33%  perf-profile.cpu-cycles.analyse_stripe.handle_stripe.handle_active_stripes.raid5d.md_thread
> >       0.96 ±  9%     +17.5%       1.12 ±  2%  perf-profile.cpu-cycles.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
> >  1.461e+10 ±  0%      -5.3%  1.384e+10 ±  1%  perf-stat.L1-dcache-load-misses
> >  3.688e+11 ±  0%      -2.7%   3.59e+11 ±  0%  perf-stat.L1-dcache-loads
> >  1.124e+09 ±  0%     -27.7%  8.125e+08 ±  0%  perf-stat.L1-dcache-prefetches
> >  2.767e+10 ±  0%      -1.8%  2.717e+10 ±  0%  perf-stat.L1-dcache-store-misses
> >  2.352e+11 ±  0%      -2.8%  2.287e+11 ±  0%  perf-stat.L1-dcache-stores
> >  6.774e+09 ±  0%      -2.3%   6.62e+09 ±  0%  perf-stat.L1-icache-load-misses
> >  5.571e+08 ±  0%     +40.5%  7.826e+08 ±  1%  perf-stat.LLC-load-misses
> >  6.263e+09 ±  0%     -13.7%  5.407e+09 ±  1%  perf-stat.LLC-loads
> >  1.914e+11 ±  0%      -4.2%  1.833e+11 ±  0%  perf-stat.branch-instructions
> >  1.145e+09 ±  2%      -5.6%  1.081e+09 ±  0%  perf-stat.branch-load-misses
> >  1.911e+11 ±  0%      -4.3%  1.829e+11 ±  0%  perf-stat.branch-loads
> >  1.142e+09 ±  2%      -5.1%  1.083e+09 ±  0%  perf-stat.branch-misses
> >  1.218e+09 ±  0%     +19.8%   1.46e+09 ±  0%  perf-stat.cache-misses
> >  2.118e+10 ±  0%      -5.2%  2.007e+10 ±  0%  perf-stat.cache-references
> >    2510308 ±  1%     -15.7%    2115410 ±  0%  perf-stat.context-switches
> >      39623 ±  0%     +22.1%      48370 ±  1%  perf-stat.cpu-migrations
> >  4.179e+08 ± 40%    +165.7%  1.111e+09 ± 35%  perf-stat.dTLB-load-misses
> >  3.684e+11 ±  0%      -2.5%  3.592e+11 ±  0%  perf-stat.dTLB-loads
> >  1.232e+08 ± 15%     +62.5%  2.002e+08 ± 27%  perf-stat.dTLB-store-misses
> >  2.348e+11 ±  0%      -2.5%  2.288e+11 ±  0%  perf-stat.dTLB-stores
> >    3577297 ±  2%      +8.7%    3888986 ±  1%  perf-stat.iTLB-load-misses
> >  1.035e+12 ±  0%      -3.5%  9.988e+11 ±  0%  perf-stat.iTLB-loads
> >  1.036e+12 ±  0%      -3.7%  9.978e+11 ±  0%  perf-stat.instructions
> >        594 ± 30%    +130.3%       1369 ± 13%  sched_debug.cfs_rq[0]:/.blocked_load_avg
> >         17 ± 10%     -28.2%         12 ± 23%  sched_debug.cfs_rq[0]:/.nr_spread_over
> >        210 ± 21%     +42.1%        298 ± 28%  sched_debug.cfs_rq[0]:/.tg_runnable_contrib
> >       9676 ± 21%     +42.1%      13754 ± 28%  sched_debug.cfs_rq[0]:/.avg->runnable_avg_sum
> >        772 ± 25%    +116.5%       1672 ±  9%  sched_debug.cfs_rq[0]:/.tg_load_contrib
> >       8402 ±  9%     +83.3%      15405 ± 11%  sched_debug.cfs_rq[0]:/.tg_load_avg
> >       8356 ±  9%     +82.8%      15272 ± 11%  sched_debug.cfs_rq[1]:/.tg_load_avg
> >        968 ± 25%    +100.8%       1943 ± 14%  sched_debug.cfs_rq[1]:/.blocked_load_avg
> >      16242 ±  9%     -22.2%      12643 ± 14%  sched_debug.cfs_rq[1]:/.avg->runnable_avg_sum
> >        353 ±  9%     -22.1%        275 ± 14%  sched_debug.cfs_rq[1]:/.tg_runnable_contrib
> >       1183 ± 23%     +77.7%       2102 ± 12%  sched_debug.cfs_rq[1]:/.tg_load_contrib
> >        181 ±  8%     -31.4%        124 ± 26%  sched_debug.cfs_rq[2]:/.tg_runnable_contrib
> >       8364 ±  8%     -31.3%       5745 ± 26%  sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
> >       8297 ±  9%     +81.7%      15079 ± 12%  sched_debug.cfs_rq[2]:/.tg_load_avg
> >      30439 ± 13%     -45.2%      16681 ± 26%  sched_debug.cfs_rq[2]:/.exec_clock
> >      39735 ± 14%     -48.3%      20545 ± 29%  sched_debug.cfs_rq[2]:/.min_vruntime
> >       8231 ± 10%     +82.2%      15000 ± 12%  sched_debug.cfs_rq[3]:/.tg_load_avg
> >       1210 ± 14%    +110.3%       2546 ± 30%  sched_debug.cfs_rq[4]:/.tg_load_contrib
> >       8188 ± 10%     +82.8%      14964 ± 12%  sched_debug.cfs_rq[4]:/.tg_load_avg
> >       8132 ± 10%     +83.1%      14890 ± 12%  sched_debug.cfs_rq[5]:/.tg_load_avg
> >        749 ± 29%    +205.9%       2292 ± 34%  sched_debug.cfs_rq[5]:/.blocked_load_avg
> >        963 ± 30%    +169.9%       2599 ± 33%  sched_debug.cfs_rq[5]:/.tg_load_contrib
> >      37791 ± 32%     -38.6%      23209 ± 13%  sched_debug.cfs_rq[6]:/.min_vruntime
> >        693 ± 25%    +132.2%       1609 ± 29%  sched_debug.cfs_rq[6]:/.blocked_load_avg
> >      10838 ± 13%     -39.2%       6587 ± 13%  sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
> >      29329 ± 27%     -33.2%      19577 ± 10%  sched_debug.cfs_rq[6]:/.exec_clock
> >        235 ± 14%     -39.7%        142 ± 14%  sched_debug.cfs_rq[6]:/.tg_runnable_contrib
> >       8085 ± 10%     +83.6%      14848 ± 12%  sched_debug.cfs_rq[6]:/.tg_load_avg
> >        839 ± 25%    +128.5%       1917 ± 18%  sched_debug.cfs_rq[6]:/.tg_load_contrib
> >       8051 ± 10%     +83.6%      14779 ± 12%  sched_debug.cfs_rq[7]:/.tg_load_avg
> >        156 ± 34%     +97.9%        309 ± 19%  sched_debug.cpu#0.cpu_load[4]
> >        160 ± 25%     +64.0%        263 ± 16%  sched_debug.cpu#0.cpu_load[2]
> >        156 ± 32%     +83.7%        286 ± 17%  sched_debug.cpu#0.cpu_load[3]
> >        164 ± 20%     -35.1%        106 ± 31%  sched_debug.cpu#2.cpu_load[0]
> >        249 ± 15%     +80.2%        449 ± 10%  sched_debug.cpu#4.cpu_load[3]
> >        231 ± 11%    +101.2%        466 ± 13%  sched_debug.cpu#4.cpu_load[2]
> >        217 ± 14%    +189.9%        630 ± 38%  sched_debug.cpu#4.cpu_load[0]
> >      71951 ±  5%     +21.6%      87526 ±  7%  sched_debug.cpu#4.nr_load_updates
> >        214 ±  8%    +146.1%        527 ± 27%  sched_debug.cpu#4.cpu_load[1]
> >        256 ± 17%     +75.7%        449 ± 13%  sched_debug.cpu#4.cpu_load[4]
> >        209 ± 23%     +98.3%        416 ± 48%  sched_debug.cpu#5.cpu_load[2]
> >      68024 ±  2%     +18.8%      80825 ±  1%  sched_debug.cpu#5.nr_load_updates
> >        217 ± 26%     +74.9%        380 ± 45%  sched_debug.cpu#5.cpu_load[3]
> >        852 ± 21%     -38.3%        526 ± 22%  sched_debug.cpu#6.curr->pid
> >
> > lkp-st02: Core2
> > Memory: 8G
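The perf-stat.* rows above are system-wide event counts for the whole run, and
the perf-profile.cpu-cycles.* rows come from call-graph sampling; the LKP
harness gathers both automatically. As a rough manual sketch of how the
headline counters could be collected (the 300-second sleep standing in for the
job runtime is an assumption, not part of the LKP job):

    # headline counters from this report
    perf stat -a \
            -e LLC-loads,LLC-load-misses,cache-references,cache-misses \
            -e context-switches,cpu-migrations,instructions \
            -- sleep 300

    # call-graph samples behind the perf-profile.cpu-cycles.* lines
    perf record -a -g -e cycles -- sleep 300
    perf report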
> >
> >
> >                             perf-stat.cache-misses
> >
> >     [ASCII trend plot garbled in the archive and omitted; bisect-bad (O)
> >      samples sit around 1.4e+09-1.6e+09, bisect-good (*) samples around 1.2e+09]
> >
> >                          perf-stat.L1-dcache-prefetches
> >
> >     [ASCII trend plot garbled in the archive and omitted; bisect-good (*)
> >      samples sit around 1e+09, bisect-bad (O) samples around 8e+08]
> >
> >                            perf-stat.LLC-load-misses
> >
> >     [ASCII trend plot garbled in the archive and omitted; bisect-bad (O)
> >      samples sit around 7e+08-9e+08, bisect-good (*) samples around 6e+08]
> >
> >                           perf-stat.context-switches
> >
> >     [ASCII trend plot garbled in the archive and omitted; bisect-good (*)
> >      samples sit around 2.5e+06, bisect-bad (O) samples around 2e+06]
> >
> >                               vmstat.system.cs
> >
> >     [ASCII trend plot garbled in the archive and omitted; bisect-good (*)
> >      samples sit around 8000-9000, bisect-bad (O) samples around 6000-7000]
> >
> > [*] bisect-good sample
> > [O] bisect-bad sample
> >
> > To reproduce:
> >
> >         apt-get install ruby
> >         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> >         cd lkp-tests
> >         bin/setup-local job.yaml  # the job file attached in this email
> >         bin/run-local job.yaml
> >
> >
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are provided
> > for informational purposes only. Any difference in system hardware or software
> > design or configuration may affect actual performance.
> >
> >
> > Thanks,
> > Ying Huang
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/