Subject: [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses
From: Huang Ying
To: shli@kernel.org
Cc: NeilBrown, LKML, LKP ML
Date: Thu, 23 Apr 2015 14:55:59 +0800
Message-ID: <1429772159.25120.9.camel@intel.com>

FYI, we noticed the below changes on

  git://neil.brown.name/md for-next
  commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")

testbox/testcase/testparams: lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd

a87d7f782b47e030  878ee6792799e2f88bdcac3298
----------------  --------------------------
         %stddev     %change         %stddev
             \          |                \
     59035 ±  0%     +18.4%      69913 ±  1%  softirqs.SCHED
      1330 ± 10%     +17.4%       1561 ±  4%  slabinfo.kmalloc-512.num_objs
      1330 ± 10%     +17.4%       1561 ±  4%  slabinfo.kmalloc-512.active_objs
    305908 ±  0%      -1.8%     300427 ±  0%  vmstat.io.bo
         1 ±  0%    +100.0%          2 ±  0%  vmstat.procs.r
      8266 ±  1%     -15.7%       6968 ±  0%  vmstat.system.cs
     14819 ±  0%      -2.1%      14503 ±  0%  vmstat.system.in
     18.20 ±  6%     +10.2%      20.05 ±  4%  perf-profile.cpu-cycles.raid_run_ops.handle_stripe.handle_active_stripes.raid5d.md_thread
      1.94 ±  9%     +90.6%       3.70 ±  9%  perf-profile.cpu-cycles.async_xor.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
      0.00 ±  0%      +Inf%      25.18 ±  3%  perf-profile.cpu-cycles.handle_active_stripes.isra.45.raid5d.md_thread.kthread.ret_from_fork
      0.00 ±  0%      +Inf%      14.14 ±  4%  perf-profile.cpu-cycles.async_copy_data.isra.42.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
      1.79 ±  7%    +102.9%       3.64 ±  9%  perf-profile.cpu-cycles.xor_blocks.async_xor.raid_run_ops.handle_stripe.handle_active_stripes
      3.09 ±  4%     -10.8%       2.76 ±  4%  perf-profile.cpu-cycles.get_active_stripe.make_request.md_make_request.generic_make_request.submit_bio
      0.80 ± 14%     +28.1%       1.02 ± 10%  perf-profile.cpu-cycles.mutex_lock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
     14.78 ±  6%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.async_copy_data.isra.38.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
     25.68 ±  4%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.handle_active_stripes.isra.41.raid5d.md_thread.kthread.ret_from_fork
      1.23 ±  5%    +140.0%       2.96 ±  7%  perf-profile.cpu-cycles.xor_sse_5_pf64.xor_blocks.async_xor.raid_run_ops.handle_stripe
      2.62 ±  6%     -95.6%       0.12 ± 33%  perf-profile.cpu-cycles.analyse_stripe.handle_stripe.handle_active_stripes.raid5d.md_thread
      0.96 ±  9%     +17.5%       1.12 ±  2%  perf-profile.cpu-cycles.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
 1.461e+10 ±  0%      -5.3%  1.384e+10 ±  1%  perf-stat.L1-dcache-load-misses
 3.688e+11 ±  0%      -2.7%   3.59e+11 ±  0%  perf-stat.L1-dcache-loads
 1.124e+09 ±  0%     -27.7%  8.125e+08 ±  0%  perf-stat.L1-dcache-prefetches
 2.767e+10 ±  0%      -1.8%  2.717e+10 ±  0%  perf-stat.L1-dcache-store-misses
 2.352e+11 ±  0%      -2.8%  2.287e+11 ±  0%  perf-stat.L1-dcache-stores
 6.774e+09 ±  0%      -2.3%   6.62e+09 ±  0%  perf-stat.L1-icache-load-misses
 5.571e+08 ±  0%     +40.5%  7.826e+08 ±  1%  perf-stat.LLC-load-misses
 6.263e+09 ±  0%     -13.7%  5.407e+09 ±  1%  perf-stat.LLC-loads
 1.914e+11 ±  0%      -4.2%  1.833e+11 ±  0%  perf-stat.branch-instructions
 1.145e+09 ±  2%      -5.6%  1.081e+09 ±  0%  perf-stat.branch-load-misses
 1.911e+11 ±  0%      -4.3%  1.829e+11 ±  0%  perf-stat.branch-loads
 1.142e+09 ±  2%      -5.1%  1.083e+09 ±  0%  perf-stat.branch-misses
 1.218e+09 ±  0%     +19.8%   1.46e+09 ±  0%  perf-stat.cache-misses
 2.118e+10 ±  0%      -5.2%  2.007e+10 ±  0%  perf-stat.cache-references
   2510308 ±  1%     -15.7%    2115410 ±  0%  perf-stat.context-switches
     39623 ±  0%     +22.1%      48370 ±  1%  perf-stat.cpu-migrations
 4.179e+08 ± 40%    +165.7%  1.111e+09 ± 35%  perf-stat.dTLB-load-misses
 3.684e+11 ±  0%      -2.5%  3.592e+11 ±  0%  perf-stat.dTLB-loads
 1.232e+08 ± 15%     +62.5%  2.002e+08 ± 27%  perf-stat.dTLB-store-misses
 2.348e+11 ±  0%      -2.5%  2.288e+11 ±  0%  perf-stat.dTLB-stores
   3577297 ±  2%      +8.7%    3888986 ±  1%  perf-stat.iTLB-load-misses
 1.035e+12 ±  0%      -3.5%  9.988e+11 ±  0%  perf-stat.iTLB-loads
 1.036e+12 ±  0%      -3.7%  9.978e+11 ±  0%  perf-stat.instructions
       594 ± 30%    +130.3%       1369 ± 13%  sched_debug.cfs_rq[0]:/.blocked_load_avg
        17 ± 10%     -28.2%         12 ± 23%  sched_debug.cfs_rq[0]:/.nr_spread_over
       210 ± 21%     +42.1%        298 ± 28%  sched_debug.cfs_rq[0]:/.tg_runnable_contrib
      9676 ± 21%     +42.1%      13754 ± 28%  sched_debug.cfs_rq[0]:/.avg->runnable_avg_sum
       772 ± 25%    +116.5%       1672 ±  9%  sched_debug.cfs_rq[0]:/.tg_load_contrib
      8402 ±  9%     +83.3%      15405 ± 11%  sched_debug.cfs_rq[0]:/.tg_load_avg
      8356 ±  9%     +82.8%      15272 ± 11%  sched_debug.cfs_rq[1]:/.tg_load_avg
       968 ± 25%    +100.8%       1943 ± 14%  sched_debug.cfs_rq[1]:/.blocked_load_avg
     16242 ±  9%     -22.2%      12643 ± 14%  sched_debug.cfs_rq[1]:/.avg->runnable_avg_sum
       353 ±  9%     -22.1%        275 ± 14%  sched_debug.cfs_rq[1]:/.tg_runnable_contrib
      1183 ± 23%     +77.7%       2102 ± 12%  sched_debug.cfs_rq[1]:/.tg_load_contrib
       181 ±  8%     -31.4%        124 ± 26%  sched_debug.cfs_rq[2]:/.tg_runnable_contrib
      8364 ±  8%     -31.3%       5745 ± 26%  sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
      8297 ±  9%     +81.7%      15079 ± 12%  sched_debug.cfs_rq[2]:/.tg_load_avg
     30439 ± 13%     -45.2%      16681 ± 26%  sched_debug.cfs_rq[2]:/.exec_clock
     39735 ± 14%     -48.3%      20545 ± 29%  sched_debug.cfs_rq[2]:/.min_vruntime
      8231 ± 10%     +82.2%      15000 ± 12%  sched_debug.cfs_rq[3]:/.tg_load_avg
      1210 ± 14%    +110.3%       2546 ± 30%  sched_debug.cfs_rq[4]:/.tg_load_contrib
      8188 ± 10%     +82.8%      14964 ± 12%  sched_debug.cfs_rq[4]:/.tg_load_avg
      8132 ± 10%     +83.1%      14890 ± 12%  sched_debug.cfs_rq[5]:/.tg_load_avg
       749 ± 29%    +205.9%       2292 ± 34%  sched_debug.cfs_rq[5]:/.blocked_load_avg
       963 ± 30%    +169.9%       2599 ± 33%  sched_debug.cfs_rq[5]:/.tg_load_contrib
     37791 ± 32%     -38.6%      23209 ± 13%  sched_debug.cfs_rq[6]:/.min_vruntime
       693 ± 25%    +132.2%       1609 ± 29%  sched_debug.cfs_rq[6]:/.blocked_load_avg
     10838 ± 13%     -39.2%       6587 ± 13%  sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
     29329 ± 27%     -33.2%      19577 ± 10%  sched_debug.cfs_rq[6]:/.exec_clock
       235 ± 14%     -39.7%        142 ± 14%  sched_debug.cfs_rq[6]:/.tg_runnable_contrib
      8085 ± 10%     +83.6%      14848 ± 12%  sched_debug.cfs_rq[6]:/.tg_load_avg
       839 ± 25%    +128.5%       1917 ± 18%  sched_debug.cfs_rq[6]:/.tg_load_contrib
      8051 ± 10%     +83.6%      14779 ± 12%  sched_debug.cfs_rq[7]:/.tg_load_avg
       156 ± 34%     +97.9%        309 ± 19%  sched_debug.cpu#0.cpu_load[4]
       160 ± 25%     +64.0%        263 ± 16%  sched_debug.cpu#0.cpu_load[2]
       156 ± 32%     +83.7%        286 ± 17%  sched_debug.cpu#0.cpu_load[3]
       164 ± 20%     -35.1%        106 ± 31%  sched_debug.cpu#2.cpu_load[0]
       249 ± 15%     +80.2%        449 ± 10%  sched_debug.cpu#4.cpu_load[3]
       231 ± 11%    +101.2%        466 ± 13%  sched_debug.cpu#4.cpu_load[2]
       217 ± 14%    +189.9%        630 ± 38%  sched_debug.cpu#4.cpu_load[0]
     71951 ±  5%     +21.6%      87526 ±  7%  sched_debug.cpu#4.nr_load_updates
       214 ±  8%    +146.1%        527 ± 27%  sched_debug.cpu#4.cpu_load[1]
       256 ± 17%     +75.7%        449 ± 13%  sched_debug.cpu#4.cpu_load[4]
       209 ± 23%     +98.3%        416 ± 48%  sched_debug.cpu#5.cpu_load[2]
     68024 ±  2%     +18.8%      80825 ±  1%  sched_debug.cpu#5.nr_load_updates
       217 ± 26%     +74.9%        380 ± 45%  sched_debug.cpu#5.cpu_load[3]
       852 ± 21%     -38.3%        526 ± 22%  sched_debug.cpu#6.curr->pid

lkp-st02: Core2
Memory: 8G

[ASCII trend charts omitted: per-run plots of perf-stat.cache-misses,
perf-stat.L1-dcache-prefetches, perf-stat.LLC-load-misses,
perf-stat.context-switches and vmstat.system.cs]

	[*] bisect-good sample
	[O] bisect-bad sample

To reproduce:

	apt-get install ruby
	git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
	cd lkp-tests
	bin/setup-local job.yaml  # the job file attached in this email
	bin/run-local   job.yaml

Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.

Thanks,
Ying Huang

[Attachment: job.yaml]

---
testcase: dd-write
default-monitors:
  wait: pre-test
  uptime:
  iostat:
  vmstat:
  numa-numastat:
  numa-vmstat:
  numa-meminfo:
  proc-vmstat:
  proc-stat:
  meminfo:
  slabinfo:
  interrupts:
  lock_stat:
  latency_stats:
  softirqs:
  bdi_dev_mapping:
  diskstats:
  nfsstat:
  cpuidle:
  cpufreq-stats:
  turbostat:
  pmeter:
  sched_debug:
    interval: 10
default-watchdogs:
  watch-oom:
  watchdog:
cpufreq_governor:
commit: a1a71cc4c0a53e29fe27cede9392b0ad816ee956
model: Core2
memory: 8G
nr_hdd_partitions: 12
wait_disks_timeout: 300
hdd_partitions: "/dev/disk/by-id/scsi-35000c5000???????"
swap_partitions:
runtime: 5m
disk: 11HDD
md: RAID5
iosched: cfq
fs: xfs
fs2:
monitors:
  perf-stat:
  perf-profile:
  ftrace:
    events:
    - balance_dirty_pages
    - bdi_dirty_ratelimit
    - global_dirty_state
    - writeback_single_inode
nr_threads: 1dd
dd:
testbox: lkp-st02
tbox_group: lkp-st02
kconfig: x86_64-rhel
enqueue_time: 2015-04-19 11:59:58.120063120 +08:00
head_commit: a1a71cc4c0a53e29fe27cede9392b0ad816ee956
base_commit: 39a8804455fb23f09157341d3ba7db6d7ae6ee76
branch: linux-devel/devel-hourly-2015042014
kernel: "/kernel/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/vmlinuz-4.0.0-09109-ga1a71cc"
user: lkp
queue: cyclic
rootfs: debian-x86_64-2015-02-07.cgz
result_root: "/result/lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd/debian-x86_64-2015-02-07.cgz/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/0"
LKP_SERVER: inn
job_file: "/lkp/scheduled/lkp-st02/cyclic_dd-write-300-5m-11HDD-RAID5-cfq-xfs-1dd-x86_64-rhel-HEAD-a1a71cc4c0a53e29fe27cede9392b0ad816ee956-0-20150419-35022-17ddag2.yaml"
dequeue_time: 2015-04-20 16:17:46.635323077 +08:00
nr_cpu: "$(nproc)"
initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
bootloader_append:
- root=/dev/ram0
- user=lkp
- job=/lkp/scheduled/lkp-st02/cyclic_dd-write-300-5m-11HDD-RAID5-cfq-xfs-1dd-x86_64-rhel-HEAD-a1a71cc4c0a53e29fe27cede9392b0ad816ee956-0-20150419-35022-17ddag2.yaml
- ARCH=x86_64
- kconfig=x86_64-rhel
- branch=linux-devel/devel-hourly-2015042014
- commit=a1a71cc4c0a53e29fe27cede9392b0ad816ee956
- BOOT_IMAGE=/kernel/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/vmlinuz-4.0.0-09109-ga1a71cc
- RESULT_ROOT=/result/lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd/debian-x86_64-2015-02-07.cgz/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/0
- LKP_SERVER=inn
- |2-
  earlyprintk=ttyS0,115200 rd.udev.log-priority=err systemd.log_target=journal systemd.log_level=warning debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2
  prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw
max_uptime: 1500
lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
modules_initrd: "/kernel/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/modules.cgz"
bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/fs.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/fs2.cgz"
job_state: finished
loadavg: 1.60 1.36 0.63 1/145 5859
start_time: '1429517927'
end_time: '1429518229'
version: "/lkp/lkp/.src-20150418-142223"
time_delta: '1429517881.362849165'

[Attachment: reproduce]

mdadm --stop /dev/md0
mdadm -q --create /dev/md0 --chunk=256 --level=raid5 --raid-devices=11 --force --assume-clean /dev/sdb /dev/sdg /dev/sdi /dev/sdh /dev/sdl /dev/sdf /dev/sdm /dev/sdk /dev/sdd /dev/sde /dev/sdc
mkfs -t xfs /dev/md0
mount -t xfs -o nobarrier,inode64 /dev/md0 /fs/md0

echo 1 > /sys/kernel/debug/tracing/events/writeback/balance_dirty_pages/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/bdi_dirty_ratelimit/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/global_dirty_state/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/writeback_single_inode/enable

dd if=/dev/zero of=/fs/md0/zero-1 status=noxfer &
sleep 300
killall -9 dd

_______________________________________________
LKP mailing list
LKP@linux.intel.com
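P.S. For readers cross-checking the report: the %change column in the comparison tables is the relative difference between the two kernels' mean values. A minimal sketch (the `pct_change` helper is illustrative, not part of lkp-tests):

```python
def pct_change(base, patched):
    """Percent change of the patched kernel's mean relative to the base kernel's."""
    return (patched - base) / base * 100.0

# Values taken from the tables above.
print(f"perf-stat.LLC-load-misses: {pct_change(5.571e+08, 7.826e+08):+.1f}%")  # +40.5%
print(f"vmstat.io.bo:              {pct_change(305908, 300427):+.1f}%")        # -1.8%
```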