Subject: Re: [lkp-robot] [x86/entry/64] 63e02a2a32: will-it-scale.per_process_ops -13.0% regression
From: Andy Lutomirski
Date: Sun, 3 Dec 2017 19:59:56 -0800
To: kernel test robot
Cc: Andy Lutomirski, Ingo Molnar, Thomas Gleixner, Borislav Petkov, Brian Gerst, Dave Hansen, Denys Vlasenko, "H. Peter Anvin", Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, Rik van Riel, LKML, Stephen Rothwell, lkp@01.org
In-Reply-To: <20171204030240.GX21779@yexl-desktop>
References: <20171204030240.GX21779@yexl-desktop>
X-Mailing-List: linux-kernel@vger.kernel.org

Thomas, has my fix for this landed?
--Andy

> On Dec 3, 2017, at 7:02 PM, kernel test robot wrote:
>
> Greeting,
>
> FYI, we noticed a -13.0% regression of will-it-scale.per_process_ops due to commit:
>
> commit: 63e02a2a3292d8815eac7be438c8c73d72a7bb93 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> in testcase: will-it-scale
> on test machine: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
> with following parameters:
>
>     test: poll1
>     cpufreq_governor: performance
>
> test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
> test-url: https://github.com/antonblanchard/will-it-scale
>
> In addition to that, the commit also has significant impact on the following tests:
>
> +------------------+---------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops -7.0% regression       |
> | test machine     | 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory |
> | test parameters  | cpufreq_governor=performance                                        |
> |                  | test=writeseek1                                                     |
> +------------------+---------------------------------------------------------------------+
> | testcase: change | aim9: aim9.brk_test.ops_per_sec -9.9% regression                    |
> | test machine     | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory    |
> | test parameters  | cpufreq_governor=performance                                        |
> |                  | test=brk_test                                                       |
> |                  | testtime=300s                                                       |
> +------------------+---------------------------------------------------------------------+
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
> To reproduce:
>
>     git clone https://github.com/intel/lkp-tests.git
>     cd lkp-tests
>     bin/lkp install job.yaml # job file is attached in this
email
>     bin/lkp run job.yaml
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>   gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/poll1/will-it-scale
>
> commit:
>   955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
>   63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
>
> 955cef1517a1be93 63e02a2a3292d8815eac7be438
> ---------------- --------------------------
> %stddev %change %stddev
> \ | \
> 7435674 -13.0% 6465918 will-it-scale.per_process_ops
> 5868564 -10.4% 5256868 will-it-scale.per_thread_ops
> 0.56 +8.0% 0.61 ± 2% will-it-scale.scalability
> 1947 -2.0% 1908 will-it-scale.time.system_time
> 562.79 +6.9% 601.69 will-it-scale.time.user_time
> 8.06 +0.8 8.86 ± 3% mpstat.cpu.usr%
> 4969 ± 83% -84.5% 769.00 ± 6% numa-meminfo.node1.Inactive(anon)
> 116.75 ± 63% +90.1% 222.00 ± 9% numa-vmstat.node0.nr_mlock
> 116.75 ± 63% +90.1% 222.00 ± 9% numa-vmstat.node0.nr_unevictable
> 116.75 ± 63% +90.1% 222.00 ± 9% numa-vmstat.node0.nr_zone_unevictable
> 1242 ± 83% -84.6% 191.25 ± 6% numa-vmstat.node1.nr_inactive_anon
> 1242 ± 83% -84.6% 191.25 ± 6% numa-vmstat.node1.nr_zone_inactive_anon
> 1414780 +7.7% 1524182 ± 3% sched_debug.cfs_rq:/.min_vruntime.max
> 144.71 ± 12% +17.8% 170.42 ± 2% sched_debug.cfs_rq:/.runnable_load_avg.max
> -568616 -29.5% -400842 sched_debug.cfs_rq:/.spread0.min
> 202980 ± 13% +56.8% 318219 ± 6% sched_debug.cpu.avg_idle.min
> 173545 ± 3% -13.9% 149414 ± 5% sched_debug.cpu.avg_idle.stddev
> 2.906e+12 -7.9% 2.676e+12 perf-stat.branch-instructions
> 0.01 ± 2% +2.0 2.00 perf-stat.branch-miss-rate%
> 2.405e+08 +22170.9% 5.356e+10 perf-stat.branch-misses
> 1.15 +11.6% 1.28 perf-stat.cpi
> 3.659e+12 -9.3% 3.318e+12 perf-stat.dTLB-loads
> 0.00 ± 6% +0.0 0.00 ± 3% perf-stat.dTLB-store-miss-rate%
> 2.869e+12 -8.8% 2.616e+12 perf-stat.dTLB-stores
> 1.406e+13 -9.7% 1.27e+13
perf-stat.instructions
> 0.87 -10.4% 0.78 perf-stat.ipc
> 13.72 ± 2% -13.7 0.00 perf-profile.calltrace.cycles.entry_SYSCALL_64
> 24.53 ± 2% -0.2 24.30 ± 3% perf-profile.calltrace.cycles.copy_user_generic_string._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
> 12.15 ± 3% -0.2 11.98 ± 3% perf-profile.calltrace.cycles.__fget_light.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
> 9.57 ± 3% -0.1 9.48 ± 4% perf-profile.calltrace.cycles.__fget.__fget_light.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
> 5.79 ± 6% -0.0 5.75 ± 3% perf-profile.calltrace.cycles.fput.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
> 32.25 ± 2% +1.5 33.78 ± 3% perf-profile.calltrace.cycles._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
> 3.99 ± 5% +1.6 5.56 ± 3% perf-profile.calltrace.cycles.__might_fault._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
> 65.36 ± 2% +2.0 67.34 ± 2% perf-profile.calltrace.cycles.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
> 68.87 ± 2% +3.1 72.01 ± 2% perf-profile.calltrace.cycles.sys_poll.entry_SYSCALL_64_fastpath
> 7.33 ± 35% +3.7 11.05 ± 23% perf-profile.calltrace.cycles.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
> 71.48 ± 2% +3.9 75.41 ± 2% perf-profile.calltrace.cycles.entry_SYSCALL_64_fastpath
> 9.50 ± 25% +4.0 13.49 ± 19% perf-profile.calltrace.cycles.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
> 10.06 ± 23% +4.0 14.05 ± 18% perf-profile.calltrace.cycles.secondary_startup_64
> 9.66 ± 24% +4.0 13.66 ± 19% perf-profile.calltrace.cycles.cpu_startup_entry.start_secondary.secondary_startup_64
> 9.66 ± 24% +4.0 13.66 ± 19% perf-profile.calltrace.cycles.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
> 9.66 ± 24% +4.0 13.66 ± 19% perf-profile.calltrace.cycles.start_secondary.secondary_startup_64
> 2.25 ± 3% +5.4 7.67 ± 3% perf-profile.calltrace.cycles.entry_SYSCALL_64_after_hwframe
> 13.72 ± 2% -13.7 0.00
perf-profile.children.cycles.entry_SYSCALL_64
> 24.53 ± 2% -0.2 24.31 ± 3% perf-profile.children.cycles.copy_user_generic_string
> 12.16 ± 3% -0.2 11.99 ± 3% perf-profile.children.cycles.__fget_light
> 9.57 ± 3% -0.1 9.48 ± 4% perf-profile.children.cycles.__fget
> 5.79 ± 6% -0.0 5.75 ± 3% perf-profile.children.cycles.fput
> 32.25 ± 2% +1.5 33.78 ± 3% perf-profile.children.cycles._copy_from_user
> 3.99 ± 5% +1.6 5.56 ± 3% perf-profile.children.cycles.__might_fault
> 65.36 ± 2% +2.0 67.34 ± 2% perf-profile.children.cycles.do_sys_poll
> 68.87 ± 2% +3.1 72.01 ± 2% perf-profile.children.cycles.sys_poll
> 7.42 ± 34% +3.7 11.14 ± 22% perf-profile.children.cycles.poll_idle
> 71.61 ± 2% +3.9 75.50 ± 2% perf-profile.children.cycles.entry_SYSCALL_64_fastpath
> 9.88 ± 23% +4.0 13.87 ± 19% perf-profile.children.cycles.cpuidle_enter_state
> 10.06 ± 23% +4.0 14.05 ± 18% perf-profile.children.cycles.secondary_startup_64
> 10.06 ± 23% +4.0 14.05 ± 18% perf-profile.children.cycles.cpu_startup_entry
> 9.66 ± 24% +4.0 13.66 ± 19% perf-profile.children.cycles.start_secondary
> 10.06 ± 23% +4.0 14.05 ± 18% perf-profile.children.cycles.do_idle
> 2.25 ± 3% +5.4 7.67 ± 3% perf-profile.children.cycles.entry_SYSCALL_64_after_hwframe
> 13.72 ± 2% -13.7 0.00 perf-profile.self.cycles.entry_SYSCALL_64
> 24.21 ± 2% -0.3 23.93 ± 2% perf-profile.self.cycles.copy_user_generic_string
> 9.47 ± 3% -0.1 9.41 ± 4% perf-profile.self.cycles.__fget
> 5.69 ± 5% +0.0 5.71 ± 3% perf-profile.self.cycles.fput
> 13.55 ± 4% +0.7 14.24 perf-profile.self.cycles.do_sys_poll
> 7.41 ± 34% +3.7 11.07 ± 22% perf-profile.self.cycles.poll_idle
> 2.25 ± 3% +5.4 7.67 ± 3% perf-profile.self.cycles.entry_SYSCALL_64_after_hwframe
>
> [ASCII plots not reproduced: will-it-scale.per_process_ops, perf-stat.instructions, perf-stat.branch-instructions, perf-stat.branch-misses, perf-stat.dTLB-stores, perf-stat.branch-miss-rate_, perf-stat.ipc, perf-stat.cpi, will-it-scale.time.user_time]
>
> [*] bisect-good sample
> [O] bisect-bad sample
>
> ***************************************************************************************************
> lkp-sb03: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>   gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/writeseek1/will-it-scale
>
> commit:
>   955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
>   63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
>
> 955cef1517a1be93 63e02a2a3292d8815eac7be438
> ---------------- --------------------------
> %stddev %change %stddev
> \ | \
> 1902014 -7.0% 1768039 will-it-scale.per_process_ops
> 1557647 -6.3% 1459046 will-it-scale.per_thread_ops
> 0.52 +4.0% 0.54 will-it-scale.scalability
> 2293 -1.8% 2251 will-it-scale.time.system_time
> 216.11 +19.7% 258.70 will-it-scale.time.user_time
> 1.453e+08 ± 6% +21.7% 1.769e+08 ± 9% cpuidle.POLL.time
> 3.43 +0.8 4.26 mpstat.cpu.usr%
> 284863 ± 6% +12.9% 321561 ± 3% softirqs.RCU
> 7178 ± 6% -11.3% 6368 slabinfo.kmalloc-96.active_objs
> 7218 ± 5% -10.6% 6450 slabinfo.kmalloc-96.num_objs
> 72.27 ± 6% +19.5% 86.39 ± 7% sched_debug.cfs_rq:/.load_avg.avg
> 107.67 ± 3% +31.1% 141.11 ± 19% sched_debug.cfs_rq:/.load_avg.stddev
> 50035 ± 23% +17.3% 58672 ± 24% sched_debug.cpu.load.stddev
> 7.58 ± 21% +65.4% 12.54 ± 11% sched_debug.cpu.nr_uninterruptible.max
> 3.143e+12 -4.7% 2.995e+12 perf-stat.branch-instructions
> 0.01 ± 2% +1.0 0.97 perf-stat.branch-miss-rate%
> 3.791e+08 ± 3% +7525.5% 2.891e+10 perf-stat.branch-misses
> 2.54e+08 +1.0% 2.566e+08 perf-stat.cache-misses
> 1.03 +6.3% 1.10 perf-stat.cpi
> 6.671e+12 -4.7% 6.361e+12 perf-stat.dTLB-loads
> 4.722e+12 -5.0% 4.485e+12
perf-stat.dTLB-stores
> 35.63 ± 12% -29.7 5.89 ± 20% perf-stat.iTLB-load-miss-rate%
> 8.119e+08 ± 8% +829.8% 7.549e+09 ± 2% perf-stat.iTLB-loads
> 1.563e+13 -5.3% 1.48e+13 perf-stat.instructions
> 0.97 -5.9% 0.91 perf-stat.ipc
> 5.97 -6.0 0.00 perf-profile.calltrace.cycles.entry_SYSCALL_64
> 7.43 ± 2% -0.1 7.29 ± 3% perf-profile.calltrace.cycles.find_lock_entry.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter
> 9.10 ± 2% -0.1 9.00 ± 3% perf-profile.calltrace.cycles.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
> 9.43 ± 2% -0.1 9.33 ± 3% perf-profile.calltrace.cycles.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write
> 19.45 -0.1 19.39 ± 2% perf-profile.calltrace.cycles.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
> 19.14 -0.0 19.10 perf-profile.calltrace.cycles.copy_user_generic_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
> 21.14 +0.0 21.15 ± 2% perf-profile.calltrace.cycles.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write
> 9.16 ± 10% +0.0 9.20 ± 41% perf-profile.calltrace.cycles.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
> 41.59 +0.1 41.71 ± 2% perf-profile.calltrace.cycles.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write.vfs_write
> 11.09 ± 8% +0.2 11.24 ± 31% perf-profile.calltrace.cycles.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
> 11.21 ± 8% +0.2 11.37 ± 31% perf-profile.calltrace.cycles.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
> 11.21 ± 8% +0.2 11.37 ± 31% perf-profile.calltrace.cycles.cpu_startup_entry.start_secondary.secondary_startup_64
> 11.21 ± 8% +0.2 11.37 ± 31%
perf-profile.calltrace.cycles.start_secondary.secondary_startup_64
> 11.68 ± 7% +0.2 11.90 ± 27% perf-profile.calltrace.cycles.secondary_startup_64
> 45.10 +0.3 45.37 ± 2% perf-profile.calltrace.cycles.__generic_file_write_iter.generic_file_write_iter.__vfs_write.vfs_write.sys_write
> 51.69 +0.3 52.02 ± 2% perf-profile.calltrace.cycles.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
> 50.28 +0.4 50.63 ± 2% perf-profile.calltrace.cycles.generic_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
> 61.80 +0.8 62.60 ± 3% perf-profile.calltrace.cycles.vfs_write.sys_write.entry_SYSCALL_64_fastpath
> 4.92 +0.9 5.80 ± 5% perf-profile.calltrace.cycles.__fdget_pos.sys_lseek.entry_SYSCALL_64_fastpath
> 4.96 +0.9 5.86 ± 3% perf-profile.calltrace.cycles.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
> 8.74 +1.0 9.75 ± 6% perf-profile.calltrace.cycles.sys_lseek.entry_SYSCALL_64_fastpath
> 69.88 +1.6 71.49 ± 3% perf-profile.calltrace.cycles.sys_write.entry_SYSCALL_64_fastpath
> 80.00 +2.9 82.90 ± 3% perf-profile.calltrace.cycles.entry_SYSCALL_64_fastpath
> 5.97 -6.0 0.00 perf-profile.children.cycles.entry_SYSCALL_64
> 7.43 ± 2% -0.1 7.29 ± 3% perf-profile.children.cycles.find_lock_entry
> 9.10 ± 2% -0.1 9.00 ± 3% perf-profile.children.cycles.shmem_getpage_gfp
> 9.43 ± 2% -0.1 9.33 ± 3% perf-profile.children.cycles.shmem_write_begin
> 19.45 -0.1 19.39 ± 2% perf-profile.children.cycles.copyin
> 19.14 -0.0 19.11 perf-profile.children.cycles.copy_user_generic_string
> 21.14 +0.0 21.15 ± 2% perf-profile.children.cycles.iov_iter_copy_from_user_atomic
> 9.46 ± 9% +0.1 9.56 ± 36% perf-profile.children.cycles.poll_idle
> 41.60 +0.1 41.72 ± 2% perf-profile.children.cycles.generic_perform_write
> 11.21 ± 8% +0.2 11.37 ± 31% perf-profile.children.cycles.start_secondary
> 11.56 ± 7% +0.2 11.76 ± 27% perf-profile.children.cycles.cpuidle_enter_state
> 11.69 ± 7% +0.2 11.90 ± 27% perf-profile.children.cycles.do_idle
> 11.68 ± 7% +0.2 11.90 ± 27%
perf-profile.children.cycles.secondary_startup_64
> 11.68 ± 7% +0.2 11.90 ± 27% perf-profile.children.cycles.cpu_startup_entry
> 45.10 +0.3 45.37 ± 2% perf-profile.children.cycles.__generic_file_write_iter
> 51.72 +0.3 52.03 ± 2% perf-profile.children.cycles.__vfs_write
> 50.28 +0.4 50.63 ± 2% perf-profile.children.cycles.generic_file_write_iter
> 61.84 +0.8 62.62 ± 3% perf-profile.children.cycles.vfs_write
> 8.74 +1.0 9.75 ± 6% perf-profile.children.cycles.sys_lseek
> 3.81 +1.6 5.38 ± 5% perf-profile.children.cycles.__fget_light
> 69.93 +1.6 71.50 ± 3% perf-profile.children.cycles.sys_write
> 9.88 +1.8 11.67 ± 3% perf-profile.children.cycles.__fdget_pos
> 80.23 +2.7 82.94 ± 3% perf-profile.children.cycles.entry_SYSCALL_64_fastpath
> 5.97 -6.0 0.00 perf-profile.self.cycles.entry_SYSCALL_64
> 18.93 -0.1 18.84 ± 2% perf-profile.self.cycles.copy_user_generic_string
> 9.39 ± 8% +0.0 9.42 ± 35% perf-profile.self.cycles.poll_idle
>
> ***************************************************************************************************
> lkp-ivb-d03: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
>   gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-ivb-d03/brk_test/aim9/300s
>
> commit:
>   955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
>   63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
>
> 955cef1517a1be93 63e02a2a3292d8815eac7be438
> ---------------- --------------------------
> %stddev %change %stddev
> \ | \
> 4124214 -9.9% 3717599 aim9.brk_test.ops_per_sec
> 272.29 -4.9% 259.03 aim9.time.system_time
> 27.71 +47.2% 40.78 aim9.time.user_time
> 12605 ± 9% -27.0% 9203 ± 10% cpuidle.POLL.usage
> 3.24 ± 2% +1.4 4.62 mpstat.cpu.usr%
> 4007 ± 3% -9.2% 3639 ± 4% slabinfo.anon_vma_chain.num_objs
> 9.80 -1.9% 9.61
turbostat.CorWatt
> 30309 -1.3% 29929 vmstat.system.cs
> 18905 -1.1% 18689 vmstat.system.in
> 716.67 ± 11% -22.7% 554.33 ± 6% sched_debug.cfs_rq:/.load_avg.avg
> 1.00 ± 11% -79.2% 0.21 ±173% sched_debug.cfs_rq:/.nr_spread_over.min
> 0.45 ± 55% +70.3% 0.76 ± 19% sched_debug.cfs_rq:/.nr_spread_over.stddev
> 521.82 ± 3% -10.2% 468.57 ± 2% sched_debug.cfs_rq:/.util_avg.avg
> 1.96 ± 7% +34.0% 2.62 ± 9% sched_debug.cpu.nr_running.max
> 0.68 ± 15% +42.9% 0.98 ± 15% sched_debug.cpu.nr_running.stddev
> 0.06 ± 19% +0.9 0.92 perf-stat.branch-miss-rate%
> 3.583e+08 ± 5% +1125.0% 4.389e+09 ± 28% perf-stat.branch-misses
> 9163065 -1.8% 8997254 perf-stat.context-switches
> 0.56 ± 2% +12.8% 0.63 ± 4% perf-stat.cpi
> 0.06 ±132% +0.2 0.23 ± 6% perf-stat.dTLB-load-miss-rate%
> 4.062e+08 ±142% +234.1% 1.357e+09 ± 8% perf-stat.dTLB-load-misses
> 9061724 ± 12% +22.0% 11056158 ± 6% perf-stat.dTLB-store-misses
> 11.72 ± 24% -6.6 5.08 ± 33% perf-stat.iTLB-load-miss-rate%
> 4.4e+08 ± 29% +135.5% 1.036e+09 ± 23% perf-stat.iTLB-loads
> 1.80 ± 2% -11.2% 1.60 ± 3% perf-stat.ipc
> 14.11 ± 88% -2.6 11.50 ± 86% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
> 14.22 ± 88% -2.6 11.63 ± 85% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
> 14.22 ± 88% -2.6 11.63 ± 85% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
> 14.22 ± 88% -2.6 11.63 ± 85% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
> 12.86 ± 92% -2.4 10.45 ± 97% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
> 45.20 ± 3% -1.4 43.82 perf-profile.calltrace.cycles-pp.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
> 16.60 ± 3% -0.9 15.74 ± 3% perf-profile.calltrace.cycles-pp.vma_merge.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
> 56.05 ± 2% -0.8 55.25
perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath
> 14.60 ± 3% -0.7 13.88 ± 2% perf-profile.calltrace.cycles-pp.__vma_adjust.vma_merge.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
> 54.84 ± 3% -0.7 54.15 perf-profile.calltrace.cycles-pp.sys_brk.entry_SYSCALL_64_fastpath
> 11.52 ± 9% -0.1 11.46 perf-profile.calltrace.cycles-pp.perf_event_mmap.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
> 6.30 ± 5% +0.2 6.48 ± 3% perf-profile.calltrace.cycles-pp.security_vm_enough_memory_mm.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
> 27.40 ± 3% +0.8 28.18 ± 4% perf-profile.calltrace.cycles-pp.secondary_startup_64
> 12.40 ± 94% +3.3 15.73 ± 62% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel
> 13.18 ± 88% +3.4 16.55 ± 57% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
> 13.18 ± 88% +3.4 16.55 ± 57% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64
> 13.18 ± 88% +3.4 16.55 ± 57% perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64
> 13.14 ± 88% +3.4 16.53 ± 57% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
> 14.22 ± 88% -2.6 11.63 ± 85% perf-profile.children.cycles-pp.start_secondary
> 45.83 ± 3% -1.2 44.59 perf-profile.children.cycles-pp.do_brk_flags
> 56.30 ± 2% -0.9 55.36 perf-profile.children.cycles-pp.entry_SYSCALL_64_fastpath
> 17.05 ± 3% -0.8 16.24 ± 3% perf-profile.children.cycles-pp.vma_merge
> 15.45 ± 3% -0.7 14.79 ± 2% perf-profile.children.cycles-pp.__vma_adjust
> 55.47 ± 3% -0.6 54.88 perf-profile.children.cycles-pp.sys_brk
> 12.21 ± 8% -0.1 12.08 perf-profile.children.cycles-pp.perf_event_mmap
> 6.40 ± 5% +0.2 6.57 ± 3% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
> 27.41 ± 3% +0.8 28.19 ± 4% perf-profile.children.cycles-pp.do_idle
> 27.30 ± 3% +0.8 28.07 ± 4% perf-profile.children.cycles-pp.cpuidle_enter_state
> 27.40 ± 3%
+0.8 28.18 ± 4% perf-profile.children.cycles-pp.secondary_startup_64
> 27.40 ± 3% +0.8 28.18 ± 4% perf-profile.children.cycles-pp.cpu_startup_entry
> 25.27 +0.9 26.19 perf-profile.children.cycles-pp.intel_idle
> 13.18 ± 88% +3.4 16.55 ± 57% perf-profile.children.cycles-pp.start_kernel
> 4.82 ± 9% +0.0 4.83 ± 5% perf-profile.self.cycles-pp.__vma_adjust
> 5.25 ± 9% +0.0 5.29 ± 2% perf-profile.self.cycles-pp.perf_event_mmap
> 5.33 ± 3% +0.4 5.75 ± 3% perf-profile.self.cycles-pp.do_brk_flags
> 25.26 +0.9 26.19 perf-profile.self.cycles-pp.intel_idle
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
> Thanks,
> Xiaolong