Date: Tue, 21 Apr 2015 12:34:08 +0800
From: Yuanhan Liu
To: Jaegeuk Kim
Cc: lkp@01.org, Yuanhan Liu, LKML
Subject: performance changes on 78373b73: -46.6% fsmark.files_per_sec, and few more
Message-ID: <20150421043408.GH8084@yliu-dev.sh.intel.com>

FYI, we noticed changes in `fsmark.files_per_sec' introduced by commit
78373b7319abdf15050af5b1632c4c8b8b398f33:

> commit 78373b7319abdf15050af5b1632c4c8b8b398f33
> Author:     Jaegeuk Kim
> AuthorDate: Fri Mar 13 21:44:36 2015 -0700
> Commit:     Jaegeuk Kim
> CommitDate: Fri Apr 10 15:08:45 2015 -0700
>
>     f2fs: enhance multi-threads performance

3402e87cfb5e762f9c95071bf4a2ad65fd9392a2  78373b7319abdf15050af5b1632c4c8b8b398f33
----------------------------------------  ----------------------------------------
run  time(m)  metric_value ±stddev     run  time(m)  metric_value ±stddev     change  testbox/benchmark/testcase-params
---  -------  --------------------     ---  -------  --------------------     ------  ---------------------------------
  3      0.3       490.800    ±5.7       3      0.5       262.067    ±0.4     -46.6%  ivb44/fsmark/1x-64t-4BRD_12G-RAID0-f2fs-4M-30G-fsyncBeforeClose
  3      0.3       468.367    ±3.5       3      0.5       264.467    ±0.2     -43.5%  ivb44/fsmark/1x-64t-9BRD_6G-RAID0-f2fs-4M-30G-fsyncBeforeClose
  3      0.6       211.867    ±0.7       3      0.7       191.067    ±0.5      -9.8%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose

NOTE: here is some more information about those test parameters, to make it
clearer what the testcase does (see the example invocation below):

  1x: 'x' means iterations or loops, corresponding to the '-L' option of fsmark
  1t, 64t: 't' means the number of threads
  4M: the size of a single file, corresponding to the '-s' option of fsmark
  40G, 30G, 120G: the total test size
  4BRD_12G: BRD is a ramdisk; '4' means 4 ramdisks and '12G' means the size
      of each ramdisk, so 48G in total. A RAID array is assembled from those
      ramdisks.
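To make that concrete, the first testcase above would roughly correspond to
an fs_mark invocation like the one below. This is only my sketch of the
mapping, not the exact command LKP runs: the mount point is made up, and
'-n 120' is derived from the sizes (64 threads x 120 files x 4M = 30G in
total); '-S 1' is fs_mark's fsync-before-close mode:

  fs_mark -d /fs/md0 -n 120 -s 4194304 -t 64 -L 1 -S 1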
The change is a bit interesting, as you already stated clearly that this
patch is meant as a performance gain. The patch itself is clear, too: it
removes a mutex lock. So the only reasonable cause I can think of, without
digging too much, is that removing this lock reduces sleep time and allows
more processes to run, but somehow increases context switches and CPU usage
in the meantime somewhere. I guess this is what the following changes are
trying to tell us:

     29708 ±  2%   +5720.0%    1729051 ±  1%  fsmark.time.voluntary_context_switches
       302 ±  0%    +113.8%        647 ±  0%  fsmark.time.percent_of_cpu_this_job_got
     61.05 ±  0%    +214.0%     191.70 ±  0%  fsmark.time.system_time
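To illustrate what I mean, here is a minimal sketch of that kind of change.
This is a paraphrase for illustration only, not the actual diff; in
particular, the function names and the assumption that the removed lock is
a per-superblock mutex serializing the writepages path are mine:

	/* Sketch only -- not the actual patch. */
	static int sketch_write_data_pages_old(struct address_space *mapping,
					       struct writeback_control *wbc)
	{
		struct f2fs_sb_info *sbi = F2FS_M_SB(mapping);
		int ret;

		/* All 64 fsmark threads queue on one mutex: one thread
		 * issues writeback while the other 63 sleep for a long
		 * stretch. */
		mutex_lock(&sbi->writepages);
		ret = write_cache_pages(mapping, wbc, __f2fs_writepage, mapping);
		mutex_unlock(&sbi->writepages);
		return ret;
	}

	static int sketch_write_data_pages_new(struct address_space *mapping,
					       struct writeback_control *wbc)
	{
		/* No global serialization: all threads enter writeback
		 * concurrently and instead block briefly and repeatedly on
		 * finer-grained locks and per-page writeback -- many short
		 * sleeps rather than one long one, which would match the
		 * +5720% voluntary context switches and +113.8% CPU usage
		 * above. */
		return write_cache_pages(mapping, wbc, __f2fs_writepage, mapping);
	}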
FYI, here I list all the changes for the most outstanding case:

  3      0.3       490.800    ±5.7       3      0.5       262.067    ±0.4     -46.6%  ivb44/fsmark/1x-64t-4BRD_12G-RAID0-f2fs-4M-30G-fsyncBeforeClose

3402e87cfb5e762f  78373b7319abdf15050af5b163
----------------  --------------------------
         %stddev      %change         %stddev
             \           |                \
     29708 ±  2%   +5720.0%    1729051 ±  1%  fsmark.time.voluntary_context_switches
     61.05 ±  0%    +214.0%     191.70 ±  0%  fsmark.time.system_time
       302 ±  0%    +113.8%        647 ±  0%  fsmark.time.percent_of_cpu_this_job_got
     10476 ±  0%     +95.4%      20467 ±  5%  fsmark.time.minor_page_faults
       490 ±  5%     -46.6%        262 ±  0%  fsmark.files_per_sec
     20.21 ±  0%     +46.7%      29.65 ±  0%  fsmark.time.elapsed_time
     20.21 ±  0%     +46.7%      29.65 ±  0%  fsmark.time.elapsed_time.max
    226379 ±  0%     +32.5%     299882 ±  0%  fsmark.app_overhead
         0 ±  0%      +Inf%       1045 ±  2%  proc-vmstat.numa_pages_migrated
       209 ± 26%   +3272.3%       7059 ±  3%  cpuidle.C1E-IVT.usage
       228 ± 42%    +686.7%       1799 ± 14%  numa-meminfo.node0.Writeback
     14633 ±  5%   +7573.2%    1122849 ±  1%  cpuidle.C1-IVT.usage
         0 ±  0%      +Inf%       1045 ±  2%  proc-vmstat.pgmigrate_success
     29708 ±  2%   +5720.0%    1729051 ±  1%  time.voluntary_context_switches
     55663 ±  0%    +776.9%     488081 ±  0%  cpuidle.C6-IVT.usage
        56 ± 42%    +718.8%        464 ± 11%  numa-vmstat.node0.nr_writeback
       535 ± 29%    +334.4%       2325 ± 10%  meminfo.Writeback
       129 ± 30%    +295.6%        511 ±  4%  proc-vmstat.nr_writeback
     59.25 ±  5%     -74.2%      15.26 ±  3%  turbostat.CPU%c6
      2.58 ±  8%     -74.5%       0.66 ± 11%  turbostat.Pkg%pc2
 1.551e+08 ± 14%    +233.4%  5.171e+08 ±  4%  cpuidle.C1-IVT.time
     32564 ± 24%    +208.1%     100330 ±  5%  softirqs.RCU
     61.05 ±  0%    +214.0%     191.70 ±  0%  time.system_time
        60 ± 32%    +165.7%        160 ± 16%  numa-vmstat.node1.nr_writeback
         2 ±  0%    +200.0%          6 ±  0%  vmstat.procs.r
      3057 ±  2%    +166.1%       8136 ± 22%  numa-vmstat.node0.nr_mapped
     12240 ±  2%    +165.9%      32547 ± 22%  numa-meminfo.node0.Mapped
      6324 ±  3%    +148.4%      15709 ±  0%  proc-vmstat.nr_mapped
     25318 ±  3%    +148.6%      62931 ±  0%  meminfo.Mapped
       302 ±  0%    +113.8%        647 ±  0%  time.percent_of_cpu_this_job_got
     34.39 ±  8%    +102.9%      69.79 ±  0%  turbostat.CPU%c1
      3247 ± 32%    +124.9%       7303 ± 14%  numa-vmstat.node0.nr_active_anon
     12994 ± 32%    +124.8%      29215 ± 14%  numa-meminfo.node0.Active(anon)
      3256 ±  4%    +132.5%       7569 ± 23%  numa-vmstat.node1.nr_mapped
      3252 ± 32%    +124.6%       7304 ± 14%  numa-vmstat.node0.nr_anon_pages
     13036 ±  4%    +132.3%      30279 ± 23%  numa-meminfo.node1.Mapped
     13033 ± 31%    +124.1%      29214 ± 14%  numa-meminfo.node0.AnonPages
     10476 ±  0%     +95.4%      20467 ±  5%  time.minor_page_faults
    121752 ±  7%     -35.8%      78122 ±  2%  numa-meminfo.node0.Dirty
     30381 ±  7%     -36.0%      19456 ±  2%  numa-vmstat.node0.nr_dirty
    113996 ±  4%     +64.4%     187416 ±  0%  proc-vmstat.pgfault
     59301 ±  1%     +60.9%      95414 ±  1%  softirqs.SCHED
      1.92 ±  1%     -31.9%       1.31 ±  4%  turbostat.Pkg%pc6
        16 ± 35%     +54.2%         24 ± 10%  cpuidle.POLL.usage
     20.21 ±  0%     +46.7%      29.65 ±  0%  time.elapsed_time.max
     20.21 ±  0%     +46.7%      29.65 ±  0%  time.elapsed_time
      7114 ±  2%     +46.3%      10405 ±  0%  proc-vmstat.nr_active_anon
     28499 ±  2%     +46.2%      41659 ±  0%  meminfo.Active(anon)
     28617 ±  2%     +45.7%      41692 ±  0%  meminfo.AnonPages
      7138 ±  2%     +45.7%      10403 ±  0%  proc-vmstat.nr_anon_pages
     95342 ±  2%     +44.9%     138177 ±  6%  numa-meminfo.node0.Active
     82347 ±  2%     +32.3%     108962 ±  9%  numa-meminfo.node0.Active(file)
     20586 ±  2%     +32.3%      27240 ±  9%  numa-vmstat.node0.nr_active_file
    233023 ±  3%     -30.7%     161486 ±  0%  meminfo.Dirty
      0.19 ±  2%     +36.2%       0.26 ±  3%  time.user_time
     57900 ±  3%     -30.5%      40219 ±  0%  proc-vmstat.nr_dirty
    108006 ±  8%     -24.6%      81460 ±  2%  numa-meminfo.node1.Dirty
     26896 ±  8%     -24.4%      20331 ±  2%  numa-vmstat.node1.nr_dirty
    219977 ±  0%     +32.9%     292456 ±  1%  softirqs.TIMER
    197382 ±  1%     +32.6%     261733 ±  0%  meminfo.Active
     42210 ±  1%     +30.2%      54947 ±  0%  proc-vmstat.nr_active_file
    168883 ±  1%     +30.3%     220073 ±  0%  meminfo.Active(file)
       353 ±  4%     +21.7%        430 ±  3%  numa-vmstat.node0.nr_page_table_pages
      1425 ±  4%     +20.2%       1714 ±  4%  numa-meminfo.node0.PageTables
     86494 ±  2%     +28.0%     110745 ± 10%  numa-meminfo.node1.Active(file)
     21621 ±  2%     +28.0%      27683 ± 10%  numa-vmstat.node1.nr_active_file
     46.00 ±  4%     +20.8%      55.56 ±  3%  uptime.boot
    101939 ±  4%     +20.8%     123117 ±  8%  numa-meminfo.node1.Active
      1412 ±  9%     +10.8%       1565 ±  8%  numa-vmstat.node0.nr_alloc_batch
     90644 ±  6%     +22.4%     110984 ±  0%  proc-vmstat.pgfree
      2084 ±  4%     +15.7%       2412 ±  3%  uptime.idle
     18666 ±  1%      +9.2%      20390 ±  4%  slabinfo.anon_vma.num_objs
     18663 ±  1%      +9.3%      20390 ±  4%  slabinfo.anon_vma.active_objs
    435.51 ±  0%     -89.9%      43.95 ±  0%  iostat.md0.avgrq-sz
      6415 ±  0%    +875.4%      62579 ±  1%  iostat.md0.w/s
      5595 ±  1%   +1883.4%     110979 ±  1%  vmstat.system.cs
    280678 ±  0%    +315.2%    1165516 ±  0%  iostat.md0.avgqu-sz
      6.31 ±  0%    +123.9%      14.14 ±  0%  turbostat.%Busy
       195 ±  0%    +116.4%        423 ±  0%  turbostat.Avg_MHz
     26035 ±  0%     +76.8%      46037 ±  0%  vmstat.system.in
   1370424 ±  0%     -28.1%     985009 ±  0%  vmstat.io.bo
   1459306 ±  0%     -28.1%    1048622 ±  0%  iostat.md0.wkB/s
     72.28 ±  0%     +29.6%      93.70 ±  0%  turbostat.CorWatt
       102 ±  0%     +21.4%        124 ±  0%  turbostat.PkgWatt
      6.66 ±  2%      +8.1%       7.19 ±  0%  turbostat.RAMWatt