Greeting,
FYI, we noticed an 87.8% improvement of vm-scalability.throughput due to commit:
commit: 7fef431be9c9ac255838a9578331567b9dba4477 ("mm/page_alloc: place pages to tail in __free_pages_core()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: vm-scalability
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:
runtime: 300s
size: 512G
test: anon-wx-rand-mt
cpufreq_governor: performance
ucode: 0x5002f01
test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/512G/lkp-csl-2ap4/anon-wx-rand-mt/vm-scalability/0x5002f01
commit:
293ffa5ebb ("mm/page_alloc: move pages to tail in move_to_free_list()")
7fef431be9 ("mm/page_alloc: place pages to tail in __free_pages_core()")
293ffa5ebb9c08a7 7fef431be9c9ac255838a957833
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
7:4 -138% 1:4 perf-profile.children.cycles-pp.error_entry
%stddev %change %stddev
\ | \
18760 +87.9% 35243 ± 2% vm-scalability.median
3560088 +87.8% 6684531 ± 2% vm-scalability.throughput
211592 ± 2% +9.1% 230774 vm-scalability.time.involuntary_context_switches
992.50 ± 6% +125.5% 2238 ± 14% vm-scalability.time.major_page_faults
431224 ± 9% -31.9% 293851 ± 7% vm-scalability.time.minor_page_faults
43637 ± 4% +81.1% 79043 ± 7% vm-scalability.time.voluntary_context_switches
1.001e+09 +86.1% 1.863e+09 ± 2% vm-scalability.workload
23388316 ± 9% -49.2% 11881255 ± 7% meminfo.DirectMap2M
2990 +16.5% 3482 ± 2% vmstat.system.cs
235.75 ± 7% +80.3% 425.00 ± 3% slabinfo.taskstats.active_objs
235.75 ± 7% +80.3% 425.00 ± 3% slabinfo.taskstats.num_objs
1.36 -0.2 1.18 mpstat.cpu.all.irq%
0.11 ± 4% -0.1 0.05 ± 2% mpstat.cpu.all.soft%
0.41 ± 2% -0.0 0.37 ± 4% mpstat.cpu.all.sys%
2045 ± 35% +1270.4% 28024 ±146% numa-meminfo.node2.Shmem
49773 ± 2% -16.6% 41532 ± 9% numa-meminfo.node3.Active
49713 ± 2% -16.5% 41530 ± 9% numa-meminfo.node3.Active(anon)
511.00 ± 35% +1266.5% 6982 ±146% numa-vmstat.node2.nr_shmem
12460 ± 2% -16.6% 10395 ± 8% numa-vmstat.node3.nr_active_anon
12460 ± 2% -16.6% 10395 ± 8% numa-vmstat.node3.nr_zone_active_anon
2381 ± 7% -32.1% 1617 ± 13% sched_debug.cfs_rq:/.exec_clock.stddev
370383 ± 5% -20.6% 294007 ± 9% sched_debug.cfs_rq:/.min_vruntime.stddev
20.79 ± 27% -67.6% 6.74 ± 21% sched_debug.cfs_rq:/.nr_spread_over.avg
20.71 ± 22% -37.2% 13.01 ± 28% sched_debug.cfs_rq:/.nr_spread_over.stddev
371094 ± 5% -20.8% 294023 ± 10% sched_debug.cfs_rq:/.spread0.stddev
952874 ± 4% -15.8% 802020 ± 5% sched_debug.cpu.avg_idle.avg
2511684 ± 7% -41.5% 1468459 ± 11% sched_debug.cpu.avg_idle.max
338913 ± 7% -33.5% 225357 ± 18% sched_debug.cpu.avg_idle.min
297395 ± 8% -42.1% 172172 ± 6% sched_debug.cpu.avg_idle.stddev
834.46 ± 9% -13.5% 721.42 ± 2% sched_debug.cpu.clock_task.stddev
6671 ± 12% +18.7% 7922 ± 10% sched_debug.cpu.curr->pid.avg
581973 -12.8% 507618 sched_debug.cpu.max_idle_balance_cost.avg
1711617 ± 12% -47.2% 904006 ± 8% sched_debug.cpu.max_idle_balance_cost.max
175767 ± 4% -75.3% 43435 ± 29% sched_debug.cpu.max_idle_balance_cost.stddev
369.91 ± 7% +20.7% 446.38 ± 10% sched_debug.cpu.sched_goidle.avg
3580 ± 9% +25.5% 4493 ± 7% sched_debug.cpu.sched_goidle.max
150.04 ± 4% +48.1% 222.23 ± 13% sched_debug.cpu.sched_goidle.min
13744 ± 2% -12.6% 12016 ± 7% proc-vmstat.nr_active_anon
120.25 +22.0% 146.75 proc-vmstat.nr_dirtied
309.50 ± 5% +6.9% 331.00 proc-vmstat.nr_inactive_file
173.50 ± 26% -75.5% 42.50 ± 85% proc-vmstat.nr_isolated_anon
41374 -2.9% 40154 ± 2% proc-vmstat.nr_shmem
115.25 +20.6% 139.00 proc-vmstat.nr_written
13744 ± 2% -12.6% 12016 ± 7% proc-vmstat.nr_zone_active_anon
309.50 ± 5% +6.9% 331.00 proc-vmstat.nr_zone_inactive_file
1031073 +4.1% 1073838 proc-vmstat.numa_hit
937564 +4.6% 980248 proc-vmstat.numa_local
928663 +71.4% 1591534 ± 2% proc-vmstat.numa_pages_migrated
27789 -18.4% 22689 ± 11% proc-vmstat.pgactivate
4500429 +57.7% 7097147 proc-vmstat.pgalloc_normal
1564121 ± 2% -8.5% 1431381 ± 2% proc-vmstat.pgfault
4552435 ± 3% +55.2% 7067211 proc-vmstat.pgfree
23040 ± 17% +131.1% 53248 ± 8% proc-vmstat.pgmigrate_fail
928663 +71.4% 1591534 ± 2% proc-vmstat.pgmigrate_success
2284 +85.4% 4235 ± 2% proc-vmstat.thp_fault_alloc
8.179e+09 +83.7% 1.502e+10 ± 2% perf-stat.i.branch-instructions
0.09 ± 6% -0.0 0.06 ± 5% perf-stat.i.branch-miss-rate%
6391877 ± 4% +8.0% 6902221 ± 2% perf-stat.i.branch-misses
3.38e+08 +85.9% 6.284e+08 ± 2% perf-stat.i.cache-misses
3.867e+08 +85.2% 7.162e+08 ± 2% perf-stat.i.cache-references
2932 +17.7% 3450 ± 3% perf-stat.i.context-switches
16.32 -45.3% 8.92 ± 4% perf-stat.i.cpi
5.51e+11 -1.0% 5.458e+11 perf-stat.i.cpu-cycles
239.58 +10.5% 264.84 perf-stat.i.cpu-migrations
1692 -46.2% 909.72 ± 4% perf-stat.i.cycles-between-cache-misses
0.00 ± 13% -0.0 0.00 ± 11% perf-stat.i.dTLB-load-miss-rate%
9.761e+09 +83.4% 1.79e+10 ± 2% perf-stat.i.dTLB-loads
0.06 -0.0 0.04 ± 4% perf-stat.i.dTLB-store-miss-rate%
2189276 +26.5% 2770382 ± 7% perf-stat.i.dTLB-store-misses
3.575e+09 +82.2% 6.513e+09 ± 2% perf-stat.i.dTLB-stores
86.37 -1.2 85.15 perf-stat.i.iTLB-load-miss-rate%
1294147 ± 2% -4.4% 1236620 perf-stat.i.iTLB-load-misses
3.487e+10 +83.3% 6.39e+10 ± 2% perf-stat.i.instructions
31395 ± 2% +95.5% 61363 ± 2% perf-stat.i.instructions-per-iTLB-miss
0.07 ± 2% +77.1% 0.13 ± 2% perf-stat.i.ipc
3.86 ± 6% +112.2% 8.20 ± 13% perf-stat.i.major-faults
0.95 -25.7% 0.71 ± 5% perf-stat.i.metric.K/sec
114.91 +84.2% 211.60 ± 2% perf-stat.i.metric.M/sec
5051 ± 3% -8.7% 4613 ± 2% perf-stat.i.minor-faults
72.66 -2.1 70.61 perf-stat.i.node-load-miss-rate%
571417 +28.5% 734202 ± 7% perf-stat.i.node-load-misses
2.824e+08 +85.7% 5.245e+08 ± 2% perf-stat.i.node-store-misses
53934951 +88.4% 1.016e+08 ± 2% perf-stat.i.node-stores
5054 ± 3% -8.6% 4622 ± 2% perf-stat.i.page-faults
11.09 +1.1% 11.21 perf-stat.overall.MPKI
0.08 ± 5% -0.0 0.05 perf-stat.overall.branch-miss-rate%
15.76 -45.8% 8.54 ± 2% perf-stat.overall.cpi
1626 -46.6% 868.32 ± 2% perf-stat.overall.cycles-between-cache-misses
0.00 ± 10% -0.0 0.00 ± 9% perf-stat.overall.dTLB-load-miss-rate%
0.06 ± 2% -0.0 0.04 ± 4% perf-stat.overall.dTLB-store-miss-rate%
26749 ± 2% +92.9% 51602 ± 3% perf-stat.overall.instructions-per-iTLB-miss
0.06 +84.7% 0.12 ± 2% perf-stat.overall.ipc
57.23 ± 5% +6.3 63.56 ± 7% perf-stat.overall.node-load-miss-rate%
10539 -2.0% 10327 perf-stat.overall.path-length
8.066e+09 +84.8% 1.491e+10 ± 2% perf-stat.ps.branch-instructions
6271950 ± 4% +7.1% 6715322 ± 2% perf-stat.ps.branch-misses
3.334e+08 +87.1% 6.238e+08 ± 2% perf-stat.ps.cache-misses
3.815e+08 +86.4% 7.109e+08 ± 2% perf-stat.ps.cache-references
2911 +17.1% 3409 ± 2% perf-stat.ps.context-switches
234.93 +10.6% 259.91 perf-stat.ps.cpu-migrations
9.627e+09 +84.5% 1.776e+10 ± 2% perf-stat.ps.dTLB-loads
2168648 ± 2% +26.8% 2748820 ± 6% perf-stat.ps.dTLB-store-misses
3.526e+09 +83.3% 6.463e+09 ± 2% perf-stat.ps.dTLB-stores
1286470 ± 2% -4.5% 1229184 perf-stat.ps.iTLB-load-misses
512663 -7.4% 474596 ± 4% perf-stat.ps.iTLB-loads
3.439e+10 +84.4% 6.341e+10 ± 2% perf-stat.ps.instructions
3.92 ± 6% +104.8% 8.04 ± 13% perf-stat.ps.major-faults
4965 ± 2% -7.7% 4583 ± 2% perf-stat.ps.minor-faults
567778 +32.8% 753732 ± 10% perf-stat.ps.node-load-misses
2.784e+08 +86.9% 5.204e+08 ± 2% perf-stat.ps.node-store-misses
53248528 +89.6% 1.01e+08 ± 2% perf-stat.ps.node-stores
4969 ± 2% -7.6% 4591 ± 2% perf-stat.ps.page-faults
1.055e+13 +82.4% 1.924e+13 ± 2% perf-stat.total.instructions
63.20 ± 12% -49.6 13.64 ±157% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt
61.11 ± 13% -47.8 13.28 ±157% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
52.78 ± 15% -40.8 11.98 ±157% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
52.36 ± 15% -40.5 11.87 ±157% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
34.28 ± 19% -27.4 6.84 ±164% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
29.76 ± 20% -23.9 5.91 ±165% perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
27.02 ± 24% -21.7 5.30 ±165% perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
26.98 ± 24% -21.7 5.29 ±165% perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
23.29 ± 31% -18.8 4.49 ±166% perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues
20.30 ± 38% -16.5 3.75 ±173% perf-profile.calltrace.cycles-pp.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer
12.35 ± 68% -10.3 2.09 ±173% perf-profile.calltrace.cycles-pp.hrtimer_active.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle
8.33 ± 88% -8.1 0.25 ±173% perf-profile.calltrace.cycles-pp._raw_spin_lock.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
8.31 ± 88% -8.1 0.24 ±173% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault
8.35 ± 87% -7.5 0.84 ±173% perf-profile.calltrace.cycles-pp.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
10.11 ± 8% -7.3 2.84 ±162% perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
7.69 ± 6% -6.6 1.06 ±173% perf-profile.calltrace.cycles-pp.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
8.65 ± 8% -6.3 2.37 ±173% perf-profile.calltrace.cycles-pp.ktime_get.clockevents_program_event.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
6.79 ± 6% -5.9 0.89 ±173% perf-profile.calltrace.cycles-pp.do_softirq_own_stack.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
6.77 ± 6% -5.9 0.89 ±173% perf-profile.calltrace.cycles-pp.asm_call_sysvec_on_stack.do_softirq_own_stack.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
6.73 ± 6% -5.9 0.88 ±173% perf-profile.calltrace.cycles-pp.__softirqentry_text_start.asm_call_sysvec_on_stack.do_softirq_own_stack.irq_exit_rcu.sysvec_apic_timer_interrupt
10.38 ± 73% -5.2 5.13 ±173% perf-profile.calltrace.cycles-pp.asm_exc_page_fault
10.36 ± 74% -5.2 5.12 ±173% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
10.35 ± 74% -5.2 5.12 ±173% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
10.27 ± 74% -5.2 5.12 ±173% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
10.21 ± 74% -5.1 5.11 ±173% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
6.41 ± 19% -4.9 1.48 ±173% perf-profile.calltrace.cycles-pp.ktime_get_update_offsets_now.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
2.11 ± 17% -1.6 0.48 ±118% perf-profile.calltrace.cycles-pp.ret_from_fork
2.11 ± 17% -1.6 0.48 ±118% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
63.78 ± 12% -49.2 14.62 ±145% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
61.55 ± 13% -47.5 14.05 ±148% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
53.04 ± 15% -40.5 12.54 ±149% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
52.63 ± 15% -40.2 12.43 ±149% perf-profile.children.cycles-pp.hrtimer_interrupt
34.48 ± 19% -27.2 7.32 ±151% perf-profile.children.cycles-pp.__hrtimer_run_queues
29.89 ± 20% -23.6 6.29 ±153% perf-profile.children.cycles-pp.tick_sched_timer
27.12 ± 24% -21.5 5.63 ±154% perf-profile.children.cycles-pp.tick_sched_handle
27.09 ± 24% -21.5 5.61 ±154% perf-profile.children.cycles-pp.update_process_times
23.38 ± 31% -18.7 4.73 ±156% perf-profile.children.cycles-pp.scheduler_tick
20.67 ± 37% -16.6 4.03 ±159% perf-profile.children.cycles-pp.task_tick_fair
12.38 ± 68% -10.2 2.19 ±162% perf-profile.children.cycles-pp.hrtimer_active
10.17 ± 8% -7.1 3.08 ±146% perf-profile.children.cycles-pp.clockevents_program_event
7.90 ± 5% -6.6 1.31 ±142% perf-profile.children.cycles-pp.irq_exit_rcu
7.44 ± 9% -6.2 1.29 ±126% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
7.28 ± 5% -6.0 1.29 ±132% perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
7.24 ± 9% -6.0 1.25 ±125% perf-profile.children.cycles-pp.do_syscall_64
6.97 ± 6% -5.9 1.09 ±144% perf-profile.children.cycles-pp.do_softirq_own_stack
6.93 ± 6% -5.8 1.09 ±144% perf-profile.children.cycles-pp.__softirqentry_text_start
6.52 ± 18% -4.8 1.68 ±148% perf-profile.children.cycles-pp.ktime_get_update_offsets_now
4.84 ± 14% -3.9 0.91 ±138% perf-profile.children.cycles-pp.native_irq_return_iret
3.07 ± 5% -2.6 0.48 ±154% perf-profile.children.cycles-pp.rcu_core
2.55 ± 23% -2.0 0.55 ±139% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
2.24 ± 11% -1.9 0.37 ±141% perf-profile.children.cycles-pp.rebalance_domains
2.41 ± 19% -1.9 0.56 ±148% perf-profile.children.cycles-pp.update_load_avg
2.14 ± 24% -1.8 0.36 ±149% perf-profile.children.cycles-pp.load_balance
2.02 ± 85% -1.7 0.28 ±162% perf-profile.children.cycles-pp.update_cfs_group
2.20 ± 18% -1.5 0.66 ± 68% perf-profile.children.cycles-pp.ret_from_fork
1.77 ± 15% -1.5 0.26 ±162% perf-profile.children.cycles-pp.process_interval
1.77 ± 15% -1.5 0.26 ±162% perf-profile.children.cycles-pp.dispatch_events
1.74 ± 16% -1.5 0.26 ±162% perf-profile.children.cycles-pp.__run_perf_stat
2.11 ± 17% -1.5 0.65 ± 68% perf-profile.children.cycles-pp.kthread
1.68 ± 6% -1.4 0.27 ±144% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.62 ± 11% -1.4 0.23 ±154% perf-profile.children.cycles-pp.ksys_read
1.52 ± 12% -1.3 0.22 ±153% perf-profile.children.cycles-pp.vfs_read
1.56 ± 6% -1.2 0.35 ±156% perf-profile.children.cycles-pp.sync_regs
1.41 ± 7% -1.2 0.23 ±158% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
1.38 ± 29% -1.1 0.23 ±156% perf-profile.children.cycles-pp.find_busiest_group
1.42 ± 19% -1.1 0.28 ±158% perf-profile.children.cycles-pp.update_curr
1.32 ± 16% -1.1 0.18 ±157% perf-profile.children.cycles-pp.execve
1.32 ± 15% -1.1 0.18 ±157% perf-profile.children.cycles-pp.__x64_sys_execve
1.32 ± 15% -1.1 0.18 ±157% perf-profile.children.cycles-pp.do_execveat_common
1.33 ± 30% -1.1 0.27 ±156% perf-profile.children.cycles-pp.__schedule
1.27 ± 6% -1.1 0.21 ±159% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
1.25 ± 31% -1.0 0.22 ±155% perf-profile.children.cycles-pp.update_sd_lb_stats
1.12 ± 14% -1.0 0.15 ±155% perf-profile.children.cycles-pp.bprm_execve
1.13 ± 15% -1.0 0.17 ±150% perf-profile.children.cycles-pp.read
1.19 ± 29% -1.0 0.23 ±158% perf-profile.children.cycles-pp.schedule
1.25 ± 10% -0.9 0.33 ±148% perf-profile.children.cycles-pp.lapic_next_deadline
1.12 ± 16% -0.9 0.20 ±155% perf-profile.children.cycles-pp.rcu_sched_clock_irq
1.10 ± 10% -0.9 0.25 ±143% perf-profile.children.cycles-pp.irqtime_account_irq
1.06 ± 23% -0.8 0.22 ±160% perf-profile.children.cycles-pp.__update_load_avg_se
1.12 ± 15% -0.8 0.31 ± 77% perf-profile.children.cycles-pp.ksys_write
1.02 ± 24% -0.8 0.22 ±155% perf-profile.children.cycles-pp.__intel_pmu_enable_all
1.11 ± 16% -0.8 0.31 ± 77% perf-profile.children.cycles-pp.vfs_write
1.09 ± 15% -0.8 0.31 ± 78% perf-profile.children.cycles-pp.new_sync_write
0.90 ± 11% -0.8 0.13 ±151% perf-profile.children.cycles-pp.__libc_fork
0.92 ± 20% -0.8 0.16 ±152% perf-profile.children.cycles-pp.__libc_write
0.92 ± 20% -0.8 0.16 ±152% perf-profile.children.cycles-pp.generic_file_write_iter
0.91 ± 21% -0.8 0.16 ±152% perf-profile.children.cycles-pp.__generic_file_write_iter
0.90 ± 21% -0.7 0.16 ±151% perf-profile.children.cycles-pp.generic_perform_write
0.90 ± 15% -0.7 0.20 ±159% perf-profile.children.cycles-pp.update_rq_clock
0.71 ± 13% -0.6 0.11 ±148% perf-profile.children.cycles-pp.__do_sys_clone
0.70 ± 12% -0.6 0.11 ±148% perf-profile.children.cycles-pp.kernel_clone
0.76 ± 10% -0.6 0.17 ±156% perf-profile.children.cycles-pp.sched_clock_cpu
0.72 ± 40% -0.5 0.18 ±148% perf-profile.children.cycles-pp.calc_global_load_tick
0.65 ± 14% -0.5 0.14 ±153% perf-profile.children.cycles-pp.__remove_hrtimer
0.95 ± 23% -0.5 0.47 ± 36% perf-profile.children.cycles-pp.worker_thread
0.68 ± 4% -0.5 0.21 ± 83% perf-profile.children.cycles-pp.io_serial_in
0.58 ± 56% -0.4 0.14 ±153% perf-profile.children.cycles-pp.pick_next_task_fair
0.57 ± 9% -0.4 0.15 ±126% perf-profile.children.cycles-pp.irq_enter_rcu
0.85 ± 24% -0.4 0.44 ± 29% perf-profile.children.cycles-pp.process_one_work
0.44 ± 33% -0.2 0.20 ± 69% perf-profile.children.cycles-pp.uart_console_write
0.44 ± 33% -0.2 0.20 ± 65% perf-profile.children.cycles-pp.serial8250_console_write
0.41 ± 35% -0.2 0.18 ± 73% perf-profile.children.cycles-pp.serial8250_console_putchar
0.42 ± 34% -0.2 0.19 ± 68% perf-profile.children.cycles-pp.wait_for_xmitr
0.24 ± 17% -0.1 0.15 ± 28% perf-profile.children.cycles-pp.write
12.37 ± 68% -10.2 2.19 ±162% perf-profile.self.cycles-pp.hrtimer_active
6.26 ± 19% -4.7 1.59 ±148% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
4.83 ± 14% -3.9 0.91 ±139% perf-profile.self.cycles-pp.native_irq_return_iret
2.65 ± 41% -2.0 0.68 ±169% perf-profile.self.cycles-pp.task_tick_fair
2.02 ± 85% -1.7 0.28 ±162% perf-profile.self.cycles-pp.update_cfs_group
1.54 ± 6% -1.2 0.34 ±156% perf-profile.self.cycles-pp.sync_regs
1.36 ± 25% -1.1 0.24 ±154% perf-profile.self.cycles-pp.tick_sched_timer
1.25 ± 10% -0.9 0.33 ±148% perf-profile.self.cycles-pp.lapic_next_deadline
1.00 ± 17% -0.8 0.17 ±156% perf-profile.self.cycles-pp.rcu_sched_clock_irq
0.99 ± 36% -0.8 0.19 ±155% perf-profile.self.cycles-pp.update_sd_lb_stats
1.02 ± 24% -0.8 0.22 ±155% perf-profile.self.cycles-pp.__intel_pmu_enable_all
0.80 ± 8% -0.6 0.16 ±155% perf-profile.self.cycles-pp._raw_spin_lock
0.71 ± 40% -0.5 0.17 ±147% perf-profile.self.cycles-pp.calc_global_load_tick
0.64 ± 21% -0.5 0.15 ±154% perf-profile.self.cycles-pp.update_curr
0.46 ± 12% -0.4 0.10 ±144% perf-profile.self.cycles-pp.irqtime_account_irq
0.55 ± 19% -0.3 0.21 ± 83% perf-profile.self.cycles-pp.io_serial_in
2675279 +27.5% 3411087 ± 3% interrupts.CAL:Function_call_interrupts
13632 ± 2% +29.3% 17632 ± 4% interrupts.CPU0.CAL:Function_call_interrupts
12847 ± 2% +36.7% 17565 ± 4% interrupts.CPU0.TLB:TLB_shootdowns
14081 ± 3% +28.4% 18086 ± 2% interrupts.CPU1.CAL:Function_call_interrupts
12807 ± 2% +40.3% 17972 ± 4% interrupts.CPU1.TLB:TLB_shootdowns
13782 ± 2% +27.4% 17556 ± 4% interrupts.CPU10.CAL:Function_call_interrupts
12905 ± 2% +38.7% 17904 ± 4% interrupts.CPU10.TLB:TLB_shootdowns
13926 ± 2% +26.3% 17594 ± 4% interrupts.CPU100.CAL:Function_call_interrupts
13026 ± 3% +36.7% 17803 ± 4% interrupts.CPU100.TLB:TLB_shootdowns
13812 ± 2% +28.5% 17756 ± 4% interrupts.CPU101.CAL:Function_call_interrupts
13001 ± 3% +37.1% 17826 ± 4% interrupts.CPU101.TLB:TLB_shootdowns
13642 +29.0% 17603 ± 5% interrupts.CPU102.CAL:Function_call_interrupts
8483 -23.5% 6491 ± 21% interrupts.CPU102.NMI:Non-maskable_interrupts
8483 -23.5% 6491 ± 21% interrupts.CPU102.PMI:Performance_monitoring_interrupts
12794 ± 3% +38.9% 17774 ± 5% interrupts.CPU102.TLB:TLB_shootdowns
13763 +30.2% 17926 ± 5% interrupts.CPU103.CAL:Function_call_interrupts
12879 ± 3% +38.1% 17782 ± 4% interrupts.CPU103.TLB:TLB_shootdowns
13905 +25.1% 17396 ± 4% interrupts.CPU104.CAL:Function_call_interrupts
13152 ± 2% +34.5% 17693 ± 5% interrupts.CPU104.TLB:TLB_shootdowns
13722 +29.0% 17701 ± 4% interrupts.CPU105.CAL:Function_call_interrupts
71.75 ± 40% +73.9% 124.75 ± 18% interrupts.CPU105.RES:Rescheduling_interrupts
13011 ± 3% +38.8% 18057 ± 4% interrupts.CPU105.TLB:TLB_shootdowns
13981 ± 2% +24.6% 17423 ± 3% interrupts.CPU106.CAL:Function_call_interrupts
13058 ± 2% +36.4% 17812 ± 3% interrupts.CPU106.TLB:TLB_shootdowns
13856 ± 2% +26.7% 17554 ± 3% interrupts.CPU107.CAL:Function_call_interrupts
13031 ± 4% +36.8% 17831 ± 4% interrupts.CPU107.TLB:TLB_shootdowns
14077 ± 5% +26.4% 17788 ± 4% interrupts.CPU108.CAL:Function_call_interrupts
8467 ± 2% -32.8% 5694 ± 38% interrupts.CPU108.NMI:Non-maskable_interrupts
8467 ± 2% -32.8% 5694 ± 38% interrupts.CPU108.PMI:Performance_monitoring_interrupts
12699 ± 4% +43.1% 18167 ± 4% interrupts.CPU108.TLB:TLB_shootdowns
13822 ± 2% +27.7% 17647 ± 4% interrupts.CPU109.CAL:Function_call_interrupts
13095 ± 3% +37.2% 17971 ± 5% interrupts.CPU109.TLB:TLB_shootdowns
13585 ± 2% +29.3% 17560 ± 4% interrupts.CPU11.CAL:Function_call_interrupts
270.00 ± 11% +26.5% 341.50 ± 13% interrupts.CPU11.RES:Rescheduling_interrupts
12897 ± 3% +39.0% 17925 ± 4% interrupts.CPU11.TLB:TLB_shootdowns
13736 ± 3% +28.0% 17581 ± 4% interrupts.CPU110.CAL:Function_call_interrupts
12824 ± 5% +39.2% 17848 ± 3% interrupts.CPU110.TLB:TLB_shootdowns
13675 +29.2% 17670 ± 4% interrupts.CPU111.CAL:Function_call_interrupts
50.75 ± 18% +112.8% 108.00 ± 13% interrupts.CPU111.RES:Rescheduling_interrupts
12870 ± 3% +39.5% 17960 ± 4% interrupts.CPU111.TLB:TLB_shootdowns
14122 ± 3% +24.6% 17600 ± 4% interrupts.CPU112.CAL:Function_call_interrupts
13098 ± 5% +36.1% 17822 ± 4% interrupts.CPU112.TLB:TLB_shootdowns
14091 ± 3% +26.2% 17788 ± 3% interrupts.CPU113.CAL:Function_call_interrupts
13255 ± 3% +35.8% 17999 ± 3% interrupts.CPU113.TLB:TLB_shootdowns
13863 ± 2% +26.8% 17585 ± 5% interrupts.CPU114.CAL:Function_call_interrupts
8436 -10.5% 7554 ± 7% interrupts.CPU114.NMI:Non-maskable_interrupts
8436 -10.5% 7554 ± 7% interrupts.CPU114.PMI:Performance_monitoring_interrupts
62.50 ± 26% +143.2% 152.00 ± 36% interrupts.CPU114.RES:Rescheduling_interrupts
12966 ± 2% +37.5% 17828 ± 5% interrupts.CPU114.TLB:TLB_shootdowns
14496 ± 4% +21.3% 17579 ± 3% interrupts.CPU115.CAL:Function_call_interrupts
13123 ± 3% +36.0% 17847 ± 3% interrupts.CPU115.TLB:TLB_shootdowns
14188 ± 2% +23.9% 17580 ± 4% interrupts.CPU116.CAL:Function_call_interrupts
8472 -10.8% 7561 ± 7% interrupts.CPU116.NMI:Non-maskable_interrupts
8472 -10.8% 7561 ± 7% interrupts.CPU116.PMI:Performance_monitoring_interrupts
13270 ± 3% +33.8% 17760 ± 3% interrupts.CPU116.TLB:TLB_shootdowns
13999 ± 2% +26.5% 17703 ± 4% interrupts.CPU117.CAL:Function_call_interrupts
12964 ± 4% +37.5% 17820 ± 4% interrupts.CPU117.TLB:TLB_shootdowns
13934 ± 2% +26.9% 17677 ± 3% interrupts.CPU118.CAL:Function_call_interrupts
12986 ± 2% +38.0% 17920 ± 3% interrupts.CPU118.TLB:TLB_shootdowns
14015 ± 2% +26.2% 17690 ± 4% interrupts.CPU119.CAL:Function_call_interrupts
13191 ± 3% +35.7% 17898 ± 4% interrupts.CPU119.TLB:TLB_shootdowns
13743 +27.6% 17543 ± 4% interrupts.CPU12.CAL:Function_call_interrupts
13013 +37.8% 17937 ± 4% interrupts.CPU12.TLB:TLB_shootdowns
13063 +36.8% 17867 ± 2% interrupts.CPU120.TLB:TLB_shootdowns
13985 ± 4% +28.0% 17902 ± 4% interrupts.CPU121.CAL:Function_call_interrupts
12726 ± 6% +40.4% 17862 ± 5% interrupts.CPU121.TLB:TLB_shootdowns
13999 ± 3% +26.4% 17702 ± 4% interrupts.CPU122.CAL:Function_call_interrupts
12836 ± 3% +38.1% 17724 ± 4% interrupts.CPU122.TLB:TLB_shootdowns
13830 ± 2% +28.8% 17817 ± 4% interrupts.CPU123.CAL:Function_call_interrupts
8553 -27.1% 6234 ± 25% interrupts.CPU123.NMI:Non-maskable_interrupts
8553 -27.1% 6234 ± 25% interrupts.CPU123.PMI:Performance_monitoring_interrupts
12728 ± 4% +41.1% 17958 ± 5% interrupts.CPU123.TLB:TLB_shootdowns
13998 +25.8% 17616 ± 3% interrupts.CPU124.CAL:Function_call_interrupts
12892 ± 2% +37.8% 17768 ± 4% interrupts.CPU124.TLB:TLB_shootdowns
13952 ± 2% +25.7% 17534 ± 4% interrupts.CPU125.CAL:Function_call_interrupts
13101 ± 4% +34.6% 17636 ± 4% interrupts.CPU125.TLB:TLB_shootdowns
13893 +27.5% 17711 ± 4% interrupts.CPU126.CAL:Function_call_interrupts
12940 ± 2% +38.3% 17894 ± 5% interrupts.CPU126.TLB:TLB_shootdowns
13869 +28.0% 17754 ± 3% interrupts.CPU127.CAL:Function_call_interrupts
12908 ± 3% +38.7% 17901 ± 4% interrupts.CPU127.TLB:TLB_shootdowns
14037 +25.4% 17596 ± 3% interrupts.CPU128.CAL:Function_call_interrupts
13110 +35.6% 17773 ± 3% interrupts.CPU128.TLB:TLB_shootdowns
13961 +26.2% 17623 ± 3% interrupts.CPU129.CAL:Function_call_interrupts
12899 +37.3% 17709 ± 3% interrupts.CPU129.TLB:TLB_shootdowns
13708 +28.2% 17575 ± 4% interrupts.CPU13.CAL:Function_call_interrupts
13025 ± 2% +38.2% 17996 ± 5% interrupts.CPU13.TLB:TLB_shootdowns
14069 +24.8% 17554 ± 3% interrupts.CPU130.CAL:Function_call_interrupts
13047 +35.6% 17698 ± 3% interrupts.CPU130.TLB:TLB_shootdowns
13964 ± 2% +26.9% 17725 ± 4% interrupts.CPU131.CAL:Function_call_interrupts
13047 ± 4% +37.2% 17904 ± 4% interrupts.CPU131.TLB:TLB_shootdowns
14020 ± 2% +26.0% 17662 ± 3% interrupts.CPU132.CAL:Function_call_interrupts
13026 ± 3% +36.1% 17724 ± 3% interrupts.CPU132.TLB:TLB_shootdowns
13965 +26.6% 17685 ± 2% interrupts.CPU133.CAL:Function_call_interrupts
13027 +37.6% 17931 ± 3% interrupts.CPU133.TLB:TLB_shootdowns
14200 ± 2% +23.6% 17551 ± 4% interrupts.CPU134.CAL:Function_call_interrupts
13110 ± 2% +34.2% 17592 ± 4% interrupts.CPU134.TLB:TLB_shootdowns
14147 +24.0% 17548 ± 3% interrupts.CPU135.CAL:Function_call_interrupts
13167 +35.1% 17793 ± 3% interrupts.CPU135.TLB:TLB_shootdowns
13949 +27.5% 17790 ± 4% interrupts.CPU136.CAL:Function_call_interrupts
12917 +39.1% 17969 ± 4% interrupts.CPU136.TLB:TLB_shootdowns
13844 +29.6% 17937 ± 3% interrupts.CPU137.CAL:Function_call_interrupts
12817 ± 3% +41.0% 18076 ± 4% interrupts.CPU137.TLB:TLB_shootdowns
14052 ± 3% +27.3% 17892 ± 2% interrupts.CPU138.CAL:Function_call_interrupts
12979 +38.3% 17956 ± 3% interrupts.CPU138.TLB:TLB_shootdowns
14005 +26.3% 17691 ± 3% interrupts.CPU139.CAL:Function_call_interrupts
13070 +36.8% 17876 ± 3% interrupts.CPU139.TLB:TLB_shootdowns
13597 ± 2% +26.7% 17229 ± 5% interrupts.CPU14.CAL:Function_call_interrupts
239.00 ± 9% +44.9% 346.25 ± 15% interrupts.CPU14.RES:Rescheduling_interrupts
12941 ± 4% +36.5% 17661 ± 5% interrupts.CPU14.TLB:TLB_shootdowns
14046 ± 2% +26.0% 17692 ± 3% interrupts.CPU140.CAL:Function_call_interrupts
12970 ± 3% +37.5% 17835 ± 3% interrupts.CPU140.TLB:TLB_shootdowns
14019 +25.5% 17600 ± 4% interrupts.CPU141.CAL:Function_call_interrupts
12970 ± 2% +36.0% 17636 ± 4% interrupts.CPU141.TLB:TLB_shootdowns
14056 ± 3% +25.1% 17588 ± 4% interrupts.CPU142.CAL:Function_call_interrupts
13033 ± 3% +35.3% 17639 ± 4% interrupts.CPU142.TLB:TLB_shootdowns
13924 +29.6% 18045 ± 4% interrupts.CPU143.CAL:Function_call_interrupts
54.50 ± 43% +216.1% 172.25 ± 79% interrupts.CPU143.RES:Rescheduling_interrupts
12943 ± 2% +38.3% 17897 ± 6% interrupts.CPU143.TLB:TLB_shootdowns
15179 ± 4% +37.4% 20852 ± 4% interrupts.CPU144.CAL:Function_call_interrupts
12849 ± 2% +40.1% 18005 ± 3% interrupts.CPU144.TLB:TLB_shootdowns
14377 ± 5% +25.2% 18005 ± 3% interrupts.CPU145.CAL:Function_call_interrupts
12769 +40.5% 17937 ± 3% interrupts.CPU145.TLB:TLB_shootdowns
13942 ± 2% +34.7% 18780 ± 10% interrupts.CPU146.CAL:Function_call_interrupts
12724 +41.0% 17947 ± 4% interrupts.CPU146.TLB:TLB_shootdowns
13952 ± 2% +28.6% 17947 ± 2% interrupts.CPU147.CAL:Function_call_interrupts
12962 ± 3% +39.4% 18069 ± 2% interrupts.CPU147.TLB:TLB_shootdowns
14129 +26.8% 17911 ± 4% interrupts.CPU148.CAL:Function_call_interrupts
12950 ± 2% +39.1% 18019 ± 5% interrupts.CPU148.TLB:TLB_shootdowns
13946 +28.0% 17854 ± 3% interrupts.CPU149.CAL:Function_call_interrupts
12751 ± 2% +40.8% 17954 ± 4% interrupts.CPU149.TLB:TLB_shootdowns
13763 +26.2% 17376 ± 4% interrupts.CPU15.CAL:Function_call_interrupts
13068 ± 2% +35.8% 17753 ± 4% interrupts.CPU15.TLB:TLB_shootdowns
13661 +33.5% 18236 ± 2% interrupts.CPU150.CAL:Function_call_interrupts
12727 +41.1% 17963 ± 3% interrupts.CPU150.TLB:TLB_shootdowns
13928 +29.5% 18038 ± 4% interrupts.CPU151.CAL:Function_call_interrupts
12860 ± 2% +41.2% 18153 ± 3% interrupts.CPU151.TLB:TLB_shootdowns
13847 +28.7% 17828 ± 3% interrupts.CPU152.CAL:Function_call_interrupts
12872 ± 2% +40.2% 18044 ± 4% interrupts.CPU152.TLB:TLB_shootdowns
13871 +28.2% 17785 ± 2% interrupts.CPU153.CAL:Function_call_interrupts
12736 ± 2% +40.0% 17834 ± 4% interrupts.CPU153.TLB:TLB_shootdowns
13948 +28.8% 17964 ± 4% interrupts.CPU154.CAL:Function_call_interrupts
12850 ± 2% +40.0% 17989 ± 3% interrupts.CPU154.TLB:TLB_shootdowns
14278 +27.3% 18176 ± 3% interrupts.CPU155.CAL:Function_call_interrupts
156.25 ± 20% -41.0% 92.25 ± 57% interrupts.CPU155.RES:Rescheduling_interrupts
12998 +40.1% 18211 ± 3% interrupts.CPU155.TLB:TLB_shootdowns
13991 +28.2% 17943 ± 3% interrupts.CPU156.CAL:Function_call_interrupts
12835 ± 3% +40.9% 18083 ± 2% interrupts.CPU156.TLB:TLB_shootdowns
14051 ± 2% +27.4% 17895 ± 4% interrupts.CPU157.CAL:Function_call_interrupts
12951 ± 3% +38.6% 17955 ± 4% interrupts.CPU157.TLB:TLB_shootdowns
13976 +27.3% 17794 ± 3% interrupts.CPU158.CAL:Function_call_interrupts
12766 +40.0% 17874 ± 3% interrupts.CPU158.TLB:TLB_shootdowns
14240 ± 2% +24.2% 17687 ± 4% interrupts.CPU159.CAL:Function_call_interrupts
12845 +37.2% 17624 ± 5% interrupts.CPU159.TLB:TLB_shootdowns
13728 ± 2% +27.9% 17557 ± 4% interrupts.CPU16.CAL:Function_call_interrupts
13015 ± 3% +38.3% 17996 ± 4% interrupts.CPU16.TLB:TLB_shootdowns
13972 ± 2% +29.3% 18071 ± 4% interrupts.CPU160.CAL:Function_call_interrupts
8549 -33.8% 5662 ± 34% interrupts.CPU160.NMI:Non-maskable_interrupts
8549 -33.8% 5662 ± 34% interrupts.CPU160.PMI:Performance_monitoring_interrupts
12753 ± 2% +40.8% 17961 ± 3% interrupts.CPU160.TLB:TLB_shootdowns
13957 +29.0% 18010 ± 3% interrupts.CPU161.CAL:Function_call_interrupts
12846 ± 2% +39.9% 17972 ± 3% interrupts.CPU161.TLB:TLB_shootdowns
13784 +29.5% 17855 ± 3% interrupts.CPU162.CAL:Function_call_interrupts
99.25 ± 20% -51.6% 48.00 ± 36% interrupts.CPU162.RES:Rescheduling_interrupts
12468 ± 3% +42.6% 17775 ± 3% interrupts.CPU162.TLB:TLB_shootdowns
13937 +27.6% 17784 ± 2% interrupts.CPU163.CAL:Function_call_interrupts
12711 +38.7% 17633 ± 3% interrupts.CPU163.TLB:TLB_shootdowns
13867 +29.8% 18005 ± 3% interrupts.CPU164.CAL:Function_call_interrupts
8543 -33.6% 5675 ± 33% interrupts.CPU164.NMI:Non-maskable_interrupts
8543 -33.6% 5675 ± 33% interrupts.CPU164.PMI:Performance_monitoring_interrupts
12723 ± 3% +40.5% 17877 ± 4% interrupts.CPU164.TLB:TLB_shootdowns
13861 ± 2% +34.4% 18630 ± 8% interrupts.CPU165.CAL:Function_call_interrupts
50.75 ± 28% +166.0% 135.00 ± 44% interrupts.CPU165.RES:Rescheduling_interrupts
12695 ± 3% +39.9% 17759 ± 3% interrupts.CPU165.TLB:TLB_shootdowns
13953 +30.1% 18151 ± 3% interrupts.CPU166.CAL:Function_call_interrupts
8546 -17.7% 7029 ± 11% interrupts.CPU166.NMI:Non-maskable_interrupts
8546 -17.7% 7029 ± 11% interrupts.CPU166.PMI:Performance_monitoring_interrupts
12802 ± 2% +40.3% 17964 ± 3% interrupts.CPU166.TLB:TLB_shootdowns
13939 +28.7% 17939 ± 3% interrupts.CPU167.CAL:Function_call_interrupts
8544 -33.7% 5664 ± 33% interrupts.CPU167.NMI:Non-maskable_interrupts
8544 -33.7% 5664 ± 33% interrupts.CPU167.PMI:Performance_monitoring_interrupts
12598 +41.5% 17828 ± 3% interrupts.CPU167.TLB:TLB_shootdowns
17519 ± 13% +39.9% 24501 ± 10% interrupts.CPU168.CAL:Function_call_interrupts
12781 +41.6% 18095 ± 3% interrupts.CPU168.TLB:TLB_shootdowns
14114 +28.8% 18184 ± 5% interrupts.CPU169.CAL:Function_call_interrupts
12886 +38.8% 17891 ± 4% interrupts.CPU169.TLB:TLB_shootdowns
13583 +29.0% 17519 ± 3% interrupts.CPU17.CAL:Function_call_interrupts
8463 ± 2% -10.3% 7590 ± 7% interrupts.CPU17.NMI:Non-maskable_interrupts
8463 ± 2% -10.3% 7590 ± 7% interrupts.CPU17.PMI:Performance_monitoring_interrupts
236.00 ± 7% +35.2% 319.00 ± 13% interrupts.CPU17.RES:Rescheduling_interrupts
12921 ± 3% +39.7% 18047 ± 3% interrupts.CPU17.TLB:TLB_shootdowns
14127 +28.8% 18198 ± 7% interrupts.CPU170.CAL:Function_call_interrupts
12975 +35.2% 17544 ± 2% interrupts.CPU170.TLB:TLB_shootdowns
13780 ± 2% +30.0% 17909 ± 5% interrupts.CPU171.CAL:Function_call_interrupts
12651 ± 2% +40.5% 17771 ± 4% interrupts.CPU171.TLB:TLB_shootdowns
14082 +27.1% 17901 ± 3% interrupts.CPU172.CAL:Function_call_interrupts
12902 +39.5% 18004 ± 3% interrupts.CPU172.TLB:TLB_shootdowns
14194 +24.8% 17710 ± 4% interrupts.CPU173.CAL:Function_call_interrupts
13139 +34.2% 17638 ± 4% interrupts.CPU173.TLB:TLB_shootdowns
14282 +24.6% 17797 ± 4% interrupts.CPU174.CAL:Function_call_interrupts
38.25 ± 60% +109.2% 80.00 ± 55% interrupts.CPU174.RES:Rescheduling_interrupts
13161 +33.8% 17607 ± 4% interrupts.CPU174.TLB:TLB_shootdowns
14880 ± 6% +18.2% 17594 ± 4% interrupts.CPU175.CAL:Function_call_interrupts
13125 +33.5% 17521 ± 5% interrupts.CPU175.TLB:TLB_shootdowns
14166 +25.6% 17789 ± 3% interrupts.CPU176.CAL:Function_call_interrupts
13108 ± 2% +34.6% 17644 ± 3% interrupts.CPU176.TLB:TLB_shootdowns
13980 +26.8% 17725 ± 5% interrupts.CPU177.CAL:Function_call_interrupts
12932 ± 3% +37.7% 17804 ± 4% interrupts.CPU177.TLB:TLB_shootdowns
14042 +27.7% 17925 ± 3% interrupts.CPU178.CAL:Function_call_interrupts
13008 +37.4% 17868 ± 3% interrupts.CPU178.TLB:TLB_shootdowns
14139 +26.5% 17885 ± 4% interrupts.CPU179.CAL:Function_call_interrupts
13059 +35.9% 17751 ± 4% interrupts.CPU179.TLB:TLB_shootdowns
13703 +28.8% 17647 ± 3% interrupts.CPU18.CAL:Function_call_interrupts
237.25 ± 6% +37.0% 325.00 ± 17% interrupts.CPU18.RES:Rescheduling_interrupts
12993 ± 2% +39.5% 18119 ± 3% interrupts.CPU18.TLB:TLB_shootdowns
14163 +25.1% 17715 ± 5% interrupts.CPU180.CAL:Function_call_interrupts
12943 ± 3% +35.8% 17571 ± 5% interrupts.CPU180.TLB:TLB_shootdowns
14061 +28.1% 18005 ± 4% interrupts.CPU181.CAL:Function_call_interrupts
13021 ± 3% +37.2% 17858 ± 5% interrupts.CPU181.TLB:TLB_shootdowns
14211 +25.8% 17877 ± 5% interrupts.CPU182.CAL:Function_call_interrupts
13169 +34.9% 17771 ± 5% interrupts.CPU182.TLB:TLB_shootdowns
14170 +25.0% 17706 ± 4% interrupts.CPU183.CAL:Function_call_interrupts
13095 +34.5% 17618 ± 5% interrupts.CPU183.TLB:TLB_shootdowns
14162 +27.6% 18075 ± 4% interrupts.CPU184.CAL:Function_call_interrupts
12953 +38.3% 17919 ± 4% interrupts.CPU184.TLB:TLB_shootdowns
14137 +25.9% 17797 ± 4% interrupts.CPU185.CAL:Function_call_interrupts
13007 +35.8% 17669 ± 3% interrupts.CPU185.TLB:TLB_shootdowns
14255 +24.9% 17805 ± 4% interrupts.CPU186.CAL:Function_call_interrupts
13200 ± 2% +33.5% 17627 ± 4% interrupts.CPU186.TLB:TLB_shootdowns
14052 +28.6% 18064 ± 5% interrupts.CPU187.CAL:Function_call_interrupts
12862 ± 2% +36.8% 17592 ± 4% interrupts.CPU187.TLB:TLB_shootdowns
14522 ± 2% +24.4% 18066 ± 4% interrupts.CPU188.CAL:Function_call_interrupts
13035 +37.6% 17932 ± 3% interrupts.CPU188.TLB:TLB_shootdowns
14066 ± 2% +26.9% 17844 ± 5% interrupts.CPU189.CAL:Function_call_interrupts
12861 ± 3% +37.0% 17623 ± 4% interrupts.CPU189.TLB:TLB_shootdowns
13995 ± 2% +25.8% 17605 ± 4% interrupts.CPU19.CAL:Function_call_interrupts
13198 ± 3% +35.9% 17935 ± 3% interrupts.CPU19.TLB:TLB_shootdowns
14094 ± 2% +28.2% 18070 ± 5% interrupts.CPU190.CAL:Function_call_interrupts
12930 ± 2% +38.5% 17908 ± 5% interrupts.CPU190.TLB:TLB_shootdowns
13821 ± 2% +24.7% 17233 ± 3% interrupts.CPU191.CAL:Function_call_interrupts
12554 ± 3% +35.6% 17024 ± 3% interrupts.CPU191.TLB:TLB_shootdowns
13801 +26.1% 17402 ± 4% interrupts.CPU2.CAL:Function_call_interrupts
12964 ± 2% +36.0% 17634 ± 4% interrupts.CPU2.TLB:TLB_shootdowns
13736 ± 2% +26.7% 17409 ± 4% interrupts.CPU20.CAL:Function_call_interrupts
237.25 ± 11% +23.8% 293.75 ± 12% interrupts.CPU20.RES:Rescheduling_interrupts
13129 ± 3% +34.4% 17648 ± 4% interrupts.CPU20.TLB:TLB_shootdowns
13623 ± 3% +27.8% 17407 ± 3% interrupts.CPU21.CAL:Function_call_interrupts
8491 ± 2% -25.0% 6372 ± 20% interrupts.CPU21.NMI:Non-maskable_interrupts
8491 ± 2% -25.0% 6372 ± 20% interrupts.CPU21.PMI:Performance_monitoring_interrupts
12967 ± 4% +37.5% 17831 ± 3% interrupts.CPU21.TLB:TLB_shootdowns
13675 ± 2% +28.6% 17587 ± 4% interrupts.CPU22.CAL:Function_call_interrupts
8487 ± 2% -21.2% 6691 ± 23% interrupts.CPU22.NMI:Non-maskable_interrupts
8487 ± 2% -21.2% 6691 ± 23% interrupts.CPU22.PMI:Performance_monitoring_interrupts
12989 ± 2% +38.9% 18038 ± 4% interrupts.CPU22.TLB:TLB_shootdowns
13780 ± 2% +28.0% 17633 ± 3% interrupts.CPU23.CAL:Function_call_interrupts
8488 ± 2% -6.4% 7943 ± 3% interrupts.CPU23.NMI:Non-maskable_interrupts
8488 ± 2% -6.4% 7943 ± 3% interrupts.CPU23.PMI:Performance_monitoring_interrupts
225.00 ± 16% +24.2% 279.50 ± 7% interrupts.CPU23.RES:Rescheduling_interrupts
13128 ± 3% +37.4% 18042 ± 3% interrupts.CPU23.TLB:TLB_shootdowns
14048 ± 2% +32.1% 18560 ± 2% interrupts.CPU24.CAL:Function_call_interrupts
14018 ± 9% +31.6% 18450 ± 3% interrupts.CPU24.TLB:TLB_shootdowns
13791 +32.2% 18229 ± 6% interrupts.CPU25.CAL:Function_call_interrupts
12869 ± 3% +38.6% 17837 ± 4% interrupts.CPU25.TLB:TLB_shootdowns
13768 +28.0% 17623 ± 4% interrupts.CPU26.CAL:Function_call_interrupts
13055 +36.6% 17828 ± 4% interrupts.CPU26.TLB:TLB_shootdowns
14045 +26.1% 17709 ± 2% interrupts.CPU27.CAL:Function_call_interrupts
13327 +34.0% 17856 ± 3% interrupts.CPU27.TLB:TLB_shootdowns
13580 ± 2% +29.0% 17520 ± 3% interrupts.CPU28.CAL:Function_call_interrupts
12906 ± 2% +38.3% 17855 ± 3% interrupts.CPU28.TLB:TLB_shootdowns
13829 +27.5% 17636 ± 4% interrupts.CPU29.CAL:Function_call_interrupts
13096 ± 2% +37.2% 17964 ± 4% interrupts.CPU29.TLB:TLB_shootdowns
13729 ± 2% +25.9% 17291 ± 4% interrupts.CPU3.CAL:Function_call_interrupts
295.25 ± 6% +25.4% 370.25 ± 12% interrupts.CPU3.RES:Rescheduling_interrupts
13084 ± 3% +33.6% 17478 ± 5% interrupts.CPU3.TLB:TLB_shootdowns
13889 +25.9% 17486 ± 4% interrupts.CPU30.CAL:Function_call_interrupts
159.25 ± 27% +42.4% 226.75 ± 2% interrupts.CPU30.RES:Rescheduling_interrupts
13200 ± 2% +35.3% 17862 ± 4% interrupts.CPU30.TLB:TLB_shootdowns
13668 +27.7% 17456 ± 4% interrupts.CPU31.CAL:Function_call_interrupts
13029 ± 2% +36.4% 17774 ± 4% interrupts.CPU31.TLB:TLB_shootdowns
13719 ± 2% +26.8% 17401 ± 3% interrupts.CPU32.CAL:Function_call_interrupts
142.00 ± 29% +61.4% 229.25 ± 10% interrupts.CPU32.RES:Rescheduling_interrupts
13070 ± 3% +35.9% 17757 ± 4% interrupts.CPU32.TLB:TLB_shootdowns
13810 +26.6% 17485 ± 4% interrupts.CPU33.CAL:Function_call_interrupts
134.25 ± 29% +75.4% 235.50 ± 2% interrupts.CPU33.RES:Rescheduling_interrupts
13198 +36.0% 17953 ± 4% interrupts.CPU33.TLB:TLB_shootdowns
13611 +29.2% 17588 ± 3% interrupts.CPU34.CAL:Function_call_interrupts
129.25 ± 15% +83.9% 237.75 ± 12% interrupts.CPU34.RES:Rescheduling_interrupts
12924 ± 2% +39.1% 17972 ± 4% interrupts.CPU34.TLB:TLB_shootdowns
13772 +28.5% 17697 ± 2% interrupts.CPU35.CAL:Function_call_interrupts
136.00 ± 39% +86.6% 253.75 ± 19% interrupts.CPU35.RES:Rescheduling_interrupts
13212 ± 2% +35.6% 17912 ± 3% interrupts.CPU35.TLB:TLB_shootdowns
13822 +26.4% 17477 ± 4% interrupts.CPU36.CAL:Function_call_interrupts
176.25 ± 26% +38.0% 243.25 ± 7% interrupts.CPU36.RES:Rescheduling_interrupts
13079 ± 2% +36.0% 17785 ± 4% interrupts.CPU36.TLB:TLB_shootdowns
13735 ± 2% +27.1% 17462 ± 4% interrupts.CPU37.CAL:Function_call_interrupts
131.50 ± 36% +100.6% 263.75 ± 21% interrupts.CPU37.RES:Rescheduling_interrupts
13061 ± 3% +36.9% 17883 ± 4% interrupts.CPU37.TLB:TLB_shootdowns
13837 +25.3% 17332 ± 4% interrupts.CPU38.CAL:Function_call_interrupts
161.50 ± 10% +26.0% 203.50 ± 8% interrupts.CPU38.RES:Rescheduling_interrupts
13167 +34.7% 17737 ± 4% interrupts.CPU38.TLB:TLB_shootdowns
13630 +29.2% 17610 ± 3% interrupts.CPU39.CAL:Function_call_interrupts
155.00 ± 7% +38.7% 215.00 ± 9% interrupts.CPU39.RES:Rescheduling_interrupts
12856 +39.9% 17991 ± 4% interrupts.CPU39.TLB:TLB_shootdowns
13635 ± 2% +29.8% 17699 ± 4% interrupts.CPU4.CAL:Function_call_interrupts
285.00 ± 7% +22.0% 347.75 ± 19% interrupts.CPU4.RES:Rescheduling_interrupts
13045 ± 3% +38.9% 18123 ± 4% interrupts.CPU4.TLB:TLB_shootdowns
13760 +27.4% 17537 ± 4% interrupts.CPU40.CAL:Function_call_interrupts
137.25 ± 30% +51.9% 208.50 ± 19% interrupts.CPU40.RES:Rescheduling_interrupts
13067 ± 2% +37.4% 17948 ± 5% interrupts.CPU40.TLB:TLB_shootdowns
13680 ± 2% +29.2% 17670 ± 3% interrupts.CPU41.CAL:Function_call_interrupts
102.50 ± 18% +184.4% 291.50 ± 28% interrupts.CPU41.RES:Rescheduling_interrupts
12967 ± 3% +38.8% 17993 ± 4% interrupts.CPU41.TLB:TLB_shootdowns
13599 ± 2% +28.1% 17414 ± 4% interrupts.CPU42.CAL:Function_call_interrupts
12782 ± 3% +38.8% 17737 ± 4% interrupts.CPU42.TLB:TLB_shootdowns
13586 +28.5% 17464 ± 4% interrupts.CPU43.CAL:Function_call_interrupts
12842 +38.9% 17832 ± 4% interrupts.CPU43.TLB:TLB_shootdowns
13505 +29.4% 17469 ± 3% interrupts.CPU44.CAL:Function_call_interrupts
12683 +39.8% 17737 ± 2% interrupts.CPU44.TLB:TLB_shootdowns
13812 ± 2% +27.7% 17632 ± 5% interrupts.CPU45.CAL:Function_call_interrupts
13053 ± 2% +38.1% 18027 ± 4% interrupts.CPU45.TLB:TLB_shootdowns
13754 +26.6% 17417 ± 4% interrupts.CPU46.CAL:Function_call_interrupts
131.00 ± 24% +38.2% 181.00 ± 14% interrupts.CPU46.RES:Rescheduling_interrupts
13087 +36.0% 17795 ± 4% interrupts.CPU46.TLB:TLB_shootdowns
13831 ± 2% +26.7% 17530 ± 3% interrupts.CPU47.CAL:Function_call_interrupts
12993 +37.8% 17910 ± 4% interrupts.CPU47.TLB:TLB_shootdowns
13898 +27.8% 17763 ± 3% interrupts.CPU48.CAL:Function_call_interrupts
143.00 ± 35% +86.9% 267.25 ± 29% interrupts.CPU48.RES:Rescheduling_interrupts
13224 ± 2% +35.6% 17927 ± 4% interrupts.CPU48.TLB:TLB_shootdowns
14644 ± 8% +25.8% 18424 ± 4% interrupts.CPU49.CAL:Function_call_interrupts
12929 ± 3% +39.6% 18052 ± 4% interrupts.CPU49.TLB:TLB_shootdowns
13649 ± 2% +27.8% 17446 ± 5% interrupts.CPU5.CAL:Function_call_interrupts
13030 ± 3% +37.2% 17876 ± 7% interrupts.CPU5.TLB:TLB_shootdowns
14022 ± 4% +25.5% 17598 ± 4% interrupts.CPU50.CAL:Function_call_interrupts
12950 ± 3% +39.0% 17997 ± 5% interrupts.CPU50.TLB:TLB_shootdowns
13563 +30.3% 17671 ± 4% interrupts.CPU51.CAL:Function_call_interrupts
113.50 ± 30% +86.1% 211.25 ± 46% interrupts.CPU51.RES:Rescheduling_interrupts
12745 +41.8% 18067 ± 4% interrupts.CPU51.TLB:TLB_shootdowns
13556 +28.9% 17479 ± 2% interrupts.CPU52.CAL:Function_call_interrupts
92.25 ± 54% +110.3% 194.00 ± 34% interrupts.CPU52.RES:Rescheduling_interrupts
12761 +39.9% 17852 ± 3% interrupts.CPU52.TLB:TLB_shootdowns
13592 +28.7% 17496 ± 3% interrupts.CPU53.CAL:Function_call_interrupts
8573 -36.4% 5453 ± 22% interrupts.CPU53.NMI:Non-maskable_interrupts
8573 -36.4% 5453 ± 22% interrupts.CPU53.PMI:Performance_monitoring_interrupts
94.00 ± 56% +123.7% 210.25 ± 27% interrupts.CPU53.RES:Rescheduling_interrupts
12871 ± 2% +38.3% 17800 ± 3% interrupts.CPU53.TLB:TLB_shootdowns
13583 ± 2% +29.6% 17606 ± 4% interrupts.CPU54.CAL:Function_call_interrupts
87.00 ± 32% +127.3% 197.75 ± 23% interrupts.CPU54.RES:Rescheduling_interrupts
12916 ± 2% +40.1% 18094 ± 4% interrupts.CPU54.TLB:TLB_shootdowns
13590 +28.5% 17457 ± 3% interrupts.CPU55.CAL:Function_call_interrupts
12885 ± 2% +39.1% 17926 ± 3% interrupts.CPU55.TLB:TLB_shootdowns
13738 ± 2% +25.6% 17261 ± 4% interrupts.CPU56.CAL:Function_call_interrupts
79.75 ± 36% +157.7% 205.50 ± 22% interrupts.CPU56.RES:Rescheduling_interrupts
13049 ± 2% +35.2% 17640 ± 4% interrupts.CPU56.TLB:TLB_shootdowns
13678 +28.5% 17578 ± 3% interrupts.CPU57.CAL:Function_call_interrupts
117.50 ± 37% +69.4% 199.00 ± 8% interrupts.CPU57.RES:Rescheduling_interrupts
12837 ± 2% +39.5% 17912 ± 3% interrupts.CPU57.TLB:TLB_shootdowns
13422 +31.3% 17623 ± 4% interrupts.CPU58.CAL:Function_call_interrupts
8541 -24.2% 6478 ± 20% interrupts.CPU58.NMI:Non-maskable_interrupts
8541 -24.2% 6478 ± 20% interrupts.CPU58.PMI:Performance_monitoring_interrupts
12617 ± 3% +42.5% 17977 ± 3% interrupts.CPU58.TLB:TLB_shootdowns
13668 +29.3% 17674 ± 3% interrupts.CPU59.CAL:Function_call_interrupts
12861 +41.3% 18170 ± 2% interrupts.CPU59.TLB:TLB_shootdowns
13723 +25.8% 17260 ± 4% interrupts.CPU6.CAL:Function_call_interrupts
13180 ± 2% +33.9% 17655 ± 3% interrupts.CPU6.TLB:TLB_shootdowns
13794 +26.3% 17425 ± 5% interrupts.CPU60.CAL:Function_call_interrupts
8571 -26.3% 6316 ± 18% interrupts.CPU60.NMI:Non-maskable_interrupts
8571 -26.3% 6316 ± 18% interrupts.CPU60.PMI:Performance_monitoring_interrupts
77.75 ± 40% +166.9% 207.50 ± 31% interrupts.CPU60.RES:Rescheduling_interrupts
13053 +35.4% 17677 ± 6% interrupts.CPU60.TLB:TLB_shootdowns
13639 +30.6% 17811 ± 4% interrupts.CPU61.CAL:Function_call_interrupts
12847 ± 2% +41.6% 18191 ± 4% interrupts.CPU61.TLB:TLB_shootdowns
13644 +27.8% 17443 ± 2% interrupts.CPU62.CAL:Function_call_interrupts
8565 -24.6% 6462 ± 21% interrupts.CPU62.NMI:Non-maskable_interrupts
8565 -24.6% 6462 ± 21% interrupts.CPU62.PMI:Performance_monitoring_interrupts
12840 ± 3% +39.2% 17876 ± 2% interrupts.CPU62.TLB:TLB_shootdowns
13639 +27.8% 17432 ± 3% interrupts.CPU63.CAL:Function_call_interrupts
12878 ± 2% +37.7% 17734 ± 2% interrupts.CPU63.TLB:TLB_shootdowns
13910 ± 2% +27.6% 17743 ± 3% interrupts.CPU64.CAL:Function_call_interrupts
13103 +37.8% 18052 ± 3% interrupts.CPU64.TLB:TLB_shootdowns
13410 ± 2% +30.3% 17468 ± 4% interrupts.CPU65.CAL:Function_call_interrupts
12593 ± 5% +41.2% 17780 ± 3% interrupts.CPU65.TLB:TLB_shootdowns
13573 +30.1% 17659 ± 3% interrupts.CPU66.CAL:Function_call_interrupts
12678 ± 2% +43.0% 18124 ± 3% interrupts.CPU66.TLB:TLB_shootdowns
13754 +28.4% 17667 ± 3% interrupts.CPU67.CAL:Function_call_interrupts
69.50 ± 32% +91.4% 133.00 ± 15% interrupts.CPU67.RES:Rescheduling_interrupts
12994 ± 3% +39.1% 18079 ± 4% interrupts.CPU67.TLB:TLB_shootdowns
13627 +27.9% 17431 ± 3% interrupts.CPU68.CAL:Function_call_interrupts
8559 -36.3% 5452 ± 23% interrupts.CPU68.NMI:Non-maskable_interrupts
8559 -36.3% 5452 ± 23% interrupts.CPU68.PMI:Performance_monitoring_interrupts
60.00 ± 70% +106.2% 123.75 ± 30% interrupts.CPU68.RES:Rescheduling_interrupts
12837 ± 2% +38.5% 17782 ± 2% interrupts.CPU68.TLB:TLB_shootdowns
13617 +29.5% 17629 ± 2% interrupts.CPU69.CAL:Function_call_interrupts
12807 +40.5% 17988 ± 3% interrupts.CPU69.TLB:TLB_shootdowns
13963 +26.3% 17633 ± 4% interrupts.CPU7.CAL:Function_call_interrupts
13112 ± 2% +38.5% 18155 ± 4% interrupts.CPU7.TLB:TLB_shootdowns
13776 +26.8% 17472 ± 3% interrupts.CPU70.CAL:Function_call_interrupts
37.00 ± 44% +192.6% 108.25 ± 20% interrupts.CPU70.RES:Rescheduling_interrupts
13122 ± 2% +35.2% 17744 ± 3% interrupts.CPU70.TLB:TLB_shootdowns
13370 +32.1% 17667 ± 3% interrupts.CPU71.CAL:Function_call_interrupts
46.00 ± 19% +212.5% 143.75 ± 39% interrupts.CPU71.RES:Rescheduling_interrupts
12572 +43.2% 18000 ± 3% interrupts.CPU71.TLB:TLB_shootdowns
14836 ± 6% +21.3% 17996 ± 3% interrupts.CPU72.CAL:Function_call_interrupts
13785 ± 6% +33.4% 18391 ± 3% interrupts.CPU72.TLB:TLB_shootdowns
14247 ± 5% +23.6% 17615 ± 3% interrupts.CPU73.CAL:Function_call_interrupts
13249 +34.8% 17855 ± 4% interrupts.CPU73.TLB:TLB_shootdowns
13766 +31.9% 18163 ± 8% interrupts.CPU74.CAL:Function_call_interrupts
12972 +37.5% 17840 ± 4% interrupts.CPU74.TLB:TLB_shootdowns
13687 +28.7% 17619 ± 5% interrupts.CPU75.CAL:Function_call_interrupts
13061 ± 2% +37.9% 18013 ± 5% interrupts.CPU75.TLB:TLB_shootdowns
13696 +27.9% 17519 ± 4% interrupts.CPU76.CAL:Function_call_interrupts
45.50 ± 14% +193.4% 133.50 ± 52% interrupts.CPU76.RES:Rescheduling_interrupts
12969 ± 3% +38.5% 17958 ± 4% interrupts.CPU76.TLB:TLB_shootdowns
13724 +25.7% 17245 ± 5% interrupts.CPU77.CAL:Function_call_interrupts
13077 ± 2% +35.5% 17723 ± 5% interrupts.CPU77.TLB:TLB_shootdowns
13924 +24.7% 17360 ± 4% interrupts.CPU78.CAL:Function_call_interrupts
13279 ± 2% +43.7% 19082 ± 14% interrupts.CPU78.TLB:TLB_shootdowns
13710 +27.4% 17472 ± 4% interrupts.CPU79.CAL:Function_call_interrupts
13132 ± 2% +36.8% 17958 ± 4% interrupts.CPU79.TLB:TLB_shootdowns
13606 +28.2% 17437 ± 4% interrupts.CPU8.CAL:Function_call_interrupts
8475 ± 2% -21.9% 6618 ± 24% interrupts.CPU8.NMI:Non-maskable_interrupts
8475 ± 2% -21.9% 6618 ± 24% interrupts.CPU8.PMI:Performance_monitoring_interrupts
12901 +38.4% 17850 ± 4% interrupts.CPU8.TLB:TLB_shootdowns
13543 +28.1% 17353 ± 4% interrupts.CPU80.CAL:Function_call_interrupts
8642 -33.4% 5755 ± 30% interrupts.CPU80.NMI:Non-maskable_interrupts
8642 -33.4% 5755 ± 30% interrupts.CPU80.PMI:Performance_monitoring_interrupts
12899 +37.3% 17706 ± 2% interrupts.CPU80.TLB:TLB_shootdowns
13805 +25.9% 17377 ± 5% interrupts.CPU81.CAL:Function_call_interrupts
8641 -34.9% 5626 ± 32% interrupts.CPU81.NMI:Non-maskable_interrupts
8641 -34.9% 5626 ± 32% interrupts.CPU81.PMI:Performance_monitoring_interrupts
13172 +35.7% 17871 ± 5% interrupts.CPU81.TLB:TLB_shootdowns
13695 +27.3% 17430 ± 3% interrupts.CPU82.CAL:Function_call_interrupts
8638 -47.6% 4523 ± 23% interrupts.CPU82.NMI:Non-maskable_interrupts
8638 -47.6% 4523 ± 23% interrupts.CPU82.PMI:Performance_monitoring_interrupts
13071 ± 2% +38.2% 18059 ± 2% interrupts.CPU82.TLB:TLB_shootdowns
13910 +25.3% 17424 ± 5% interrupts.CPU83.CAL:Function_call_interrupts
8635 -36.4% 5491 ± 30% interrupts.CPU83.NMI:Non-maskable_interrupts
8635 -36.4% 5491 ± 30% interrupts.CPU83.PMI:Performance_monitoring_interrupts
13321 +34.4% 17901 ± 5% interrupts.CPU83.TLB:TLB_shootdowns
13663 +27.3% 17394 ± 3% interrupts.CPU84.CAL:Function_call_interrupts
13034 ± 2% +37.7% 17947 ± 3% interrupts.CPU84.TLB:TLB_shootdowns
13887 +24.8% 17337 ± 5% interrupts.CPU85.CAL:Function_call_interrupts
13158 +34.9% 17749 ± 6% interrupts.CPU85.TLB:TLB_shootdowns
13765 +25.8% 17311 ± 4% interrupts.CPU86.CAL:Function_call_interrupts
13069 ± 3% +35.4% 17692 ± 4% interrupts.CPU86.TLB:TLB_shootdowns
13825 +25.2% 17310 ± 4% interrupts.CPU87.CAL:Function_call_interrupts
8636 -35.5% 5568 ± 32% interrupts.CPU87.NMI:Non-maskable_interrupts
8636 -35.5% 5568 ± 32% interrupts.CPU87.PMI:Performance_monitoring_interrupts
13185 +33.9% 17650 ± 3% interrupts.CPU87.TLB:TLB_shootdowns
13849 +25.7% 17411 ± 5% interrupts.CPU88.CAL:Function_call_interrupts
13254 +33.9% 17753 ± 5% interrupts.CPU88.TLB:TLB_shootdowns
13724 +27.0% 17433 ± 4% interrupts.CPU89.CAL:Function_call_interrupts
12959 +37.2% 17774 ± 4% interrupts.CPU89.TLB:TLB_shootdowns
13792 ± 2% +27.2% 17543 ± 3% interrupts.CPU9.CAL:Function_call_interrupts
241.50 ± 2% +45.1% 350.50 ± 10% interrupts.CPU9.RES:Rescheduling_interrupts
13099 ± 3% +36.2% 17846 ± 4% interrupts.CPU9.TLB:TLB_shootdowns
13667 +26.7% 17313 ± 4% interrupts.CPU90.CAL:Function_call_interrupts
12835 +38.6% 17787 ± 3% interrupts.CPU90.TLB:TLB_shootdowns
13766 +26.5% 17407 ± 3% interrupts.CPU91.CAL:Function_call_interrupts
12970 +37.7% 17855 ± 3% interrupts.CPU91.TLB:TLB_shootdowns
13896 +25.1% 17383 ± 4% interrupts.CPU92.CAL:Function_call_interrupts
13145 +35.6% 17825 ± 4% interrupts.CPU92.TLB:TLB_shootdowns
13783 ± 2% +26.6% 17456 ± 5% interrupts.CPU93.CAL:Function_call_interrupts
13152 ± 3% +35.6% 17835 ± 6% interrupts.CPU93.TLB:TLB_shootdowns
13769 +26.2% 17383 ± 5% interrupts.CPU94.CAL:Function_call_interrupts
13156 ± 3% +36.2% 17918 ± 5% interrupts.CPU94.TLB:TLB_shootdowns
14112 ± 3% +23.4% 17415 ± 4% interrupts.CPU95.CAL:Function_call_interrupts
13326 +34.0% 17855 ± 4% interrupts.CPU95.TLB:TLB_shootdowns
15493 ± 6% +37.7% 21329 ± 7% interrupts.CPU96.CAL:Function_call_interrupts
12765 ± 3% +37.9% 17602 ± 4% interrupts.CPU96.TLB:TLB_shootdowns
13972 ± 2% +35.6% 18947 ± 7% interrupts.CPU97.CAL:Function_call_interrupts
8466 ± 2% -22.9% 6530 ± 21% interrupts.CPU97.NMI:Non-maskable_interrupts
8466 ± 2% -22.9% 6530 ± 21% interrupts.CPU97.PMI:Performance_monitoring_interrupts
12726 ± 4% +40.7% 17903 ± 5% interrupts.CPU97.TLB:TLB_shootdowns
13820 +28.5% 17766 ± 4% interrupts.CPU98.CAL:Function_call_interrupts
8467 ± 2% -23.8% 6452 ± 21% interrupts.CPU98.NMI:Non-maskable_interrupts
8467 ± 2% -23.8% 6452 ± 21% interrupts.CPU98.PMI:Performance_monitoring_interrupts
12822 ± 2% +38.5% 17756 ± 5% interrupts.CPU98.TLB:TLB_shootdowns
13765 ± 2% +26.2% 17377 ± 4% interrupts.CPU99.CAL:Function_call_interrupts
13015 ± 3% +35.1% 17585 ± 5% interrupts.CPU99.TLB:TLB_shootdowns
26261 ± 7% +26.3% 33155 ± 8% interrupts.RES:Rescheduling_interrupts
2491417 +37.7% 3430454 ± 4% interrupts.TLB:TLB_shootdowns
vm-scalability.throughput
7e+06 +-----------------------------------------------------------------+
| O O O O OO OO OO O O O OOO OO OO |
6.5e+06 |O+ O OO O O OO O O O O O |
| O O O O |
6e+06 |-+ |
| |
5.5e+06 |-+ |
| |
5e+06 |-+ |
| |
4.5e+06 |-+ |
| |
4e+06 |-+ |
| + +. + + .+ ++ |
3.5e+06 +-----------------------------------------------------------------+
vm-scalability.median
38000 +-------------------------------------------------------------------+
36000 |-O O O |
|O OOOO O OO O OOOO OOO OO O OO OOO O |
34000 |-+ O O O O O O O O O O |
32000 |-+ |
| |
30000 |-+ |
28000 |-+ |
26000 |-+ |
| |
24000 |-+ |
22000 |-+ |
| |
20000 |-+.+++++ .+++ +.++++ .+++++ .++ ++. + ++.+++ +.++++++. +++ +.+ |
18000 +-------------------------------------------------------------------+
vm-scalability.workload
2.2e+09 +-----------------------------------------------------------------+
| |
2e+09 |-O |
| |
|O OOOO O OOO O OO OOOOO O O O OOO OOO OO |
1.8e+09 |-+O OO O O O O O O |
| |
1.6e+09 |-+ |
| |
1.4e+09 |-+ |
| |
| |
1.2e+09 |-+ |
| +. ++ + ++ + +++. + + + ++. + ++ .+ ++ |
1e+09 +-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
On Wed, 21 Oct 2020, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed a 87.8% improvement of vm-scalability.throughput due to commit:
>
>
> commit: 7fef431be9c9ac255838a9578331567b9dba4477 ("mm/page_alloc: place pages to tail in __free_pages_core()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
>
> in testcase: vm-scalability
> on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
> with following parameters:
>
> runtime: 300s
> size: 512G
> test: anon-wx-rand-mt
> cpufreq_governor: performance
> ucode: 0x5002f01
>
> test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>
I'm curious why we are not able to reproduce this improvement on Skylake
and actually see a slight performance degradation, at least for
300s_128G_truncate_throughput.
Axel Rasmussen <[email protected]> can provide more details on our
results.
> On 23.10.2020 at 21:29, David Rientjes <[email protected]> wrote:
>
> On Wed, 21 Oct 2020, kernel test robot wrote:
>
>> Greeting,
>>
>> FYI, we noticed a 87.8% improvement of vm-scalability.throughput due to commit:
>>
>>
>> commit: 7fef431be9c9ac255838a9578331567b9dba4477 ("mm/page_alloc: place pages to tail in __free_pages_core()")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>>
>> in testcase: vm-scalability
>> on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
>> with following parameters:
>>
>> runtime: 300s
>> size: 512G
>> test: anon-wx-rand-mt
>> cpufreq_governor: performance
>> ucode: 0x5002f01
>>
>> test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
>> test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>>
>
> I'm curious why we are not able to reproduce this improvement on Skylake
> and actually see a slight performance degradation, at least for
> 300s_128G_truncate_throughput.
>
> Axel Rasmussen <[email protected]> can provide more details on our
> results.
>
As this patch only affects how we first place pages into the freelists when booting up, I'd be surprised if there were an observable change in actual numbers. Run your system for long enough and it's all going to be random in the freelists anyway.
Looks more like random measurement anomalies to me. But maybe there are corner cases where the initial state of the freelists affects a benchmark when run immediately after boot?
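
For illustration, a minimal, self-contained sketch of the head-versus-tail freelist ordering under discussion (not the actual mm/page_alloc.c code; the structures and function names below are invented for the example, and it only shows why placing boot-time-freed pages at the tail changes which pages the allocator hands out first):

/* Toy freelist: head placement (LIFO) vs. tail placement (FIFO).
 * Roughly the effect of the commit's change for pages freed at boot;
 * illustration only, not kernel code. */
#include <stdio.h>

struct page {
	unsigned long pfn;
	struct page *next;
};

struct freelist {
	struct page *head;
	struct page *tail;
};

/* Head placement: the most recently freed page is allocated next. */
static void free_to_head(struct freelist *fl, struct page *p)
{
	p->next = fl->head;
	fl->head = p;
	if (!fl->tail)
		fl->tail = p;
}

/* Tail placement: pages freed now (e.g. during boot) are allocated last. */
static void free_to_tail(struct freelist *fl, struct page *p)
{
	p->next = NULL;
	if (fl->tail)
		fl->tail->next = p;
	else
		fl->head = p;
	fl->tail = p;
}

/* Allocation always pops from the head in both schemes. */
static struct page *alloc_from(struct freelist *fl)
{
	struct page *p = fl->head;
	if (p) {
		fl->head = p->next;
		if (!fl->head)
			fl->tail = NULL;
	}
	return p;
}

int main(void)
{
	struct freelist lifo = { NULL, NULL }, fifo = { NULL, NULL };
	struct page a[4] = { { 10, NULL }, { 11, NULL }, { 12, NULL }, { 13, NULL } };
	struct page b[4] = { { 10, NULL }, { 11, NULL }, { 12, NULL }, { 13, NULL } };

	for (int i = 0; i < 4; i++) {
		free_to_head(&lifo, &a[i]);   /* freed last, handed out first */
		free_to_tail(&fifo, &b[i]);   /* freed last, handed out last */
	}
	printf("head placement hands out pfn %lu first\n", alloc_from(&lifo)->pfn);
	printf("tail placement hands out pfn %lu first\n", alloc_from(&fifo)->pfn);
	return 0;
}

So the only difference is the initial ordering of the freelists right after boot, which is why any benchmark effect would have to come from the first allocations after startup.
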
On Fri, Oct 23, 2020 at 12:29 PM David Rientjes <[email protected]> wrote:
>
> On Wed, 21 Oct 2020, kernel test robot wrote:
>
> > [...]
>
> I'm curious why we are not able to reproduce this improvement on Skylake
> and actually see a slight performance degradation, at least for
> 300s_128G_truncate_throughput.
>
> Axel Rasmussen <[email protected]> can provide more details on our
> results.
Right, our results show a slight regression on a Skylake machine [1],
and a slight performance increase on a Rome machine [2]. For these
tests, I used Linus' v5.9 tag as a baseline, and then applied this
patchset onto that tag as a test kernel (the patches applied cleanly
besides one comment, I didn't have to do any code fixups). This is
running the same anon-wx-rand-mt test defined in the upstream
lkp-tests job file:
https://github.com/intel/lkp-tests/blob/master/jobs/vm-scalability.yaml
I'm happy to provide any other information that might be useful, like
the kconfig I used, or some logs from the test itself.
[1]:
[*] KERNELS
----- BASE KERNEL (A) -----
Arch: x86_64
CommitId: bbf5c979011a099af5dc76498918ed7df445635b
Describe: v5.9
----- TEST KERNEL (B) -----
Arch: x86_64
CommitId: 0eed18403b89d685c736fd41d83312bc18d1fc74
Describe: v5.9-5-g0eed18403b89
[*] TAGS
LABEL | VALUE
--------------------------------+-------------------
kernel_version | 5.9.0-smp-DEV
kernel_version_major | 5
kernel_version_minor | 9
machine_architecture | x86_64
machine_config_memory | 393216
machine_config_num_cores | 112
machine_config_num_cpus | 2
machine_config_num_phys_cores | 56
machine_platform_genus | skylake
test_name | vm-scalability
[*] METRICS
LABEL                                 | COUNT | MIN              | MAX              | MEAN              | MEDIAN           | STDDEV                 | DIRECTION
--------------------------------------+-------+------------------+------------------+-------------------+------------------+------------------------+----------
300s_128G_truncate_throughput         |       |                  |                  |                   |                  |                        |
  (A) bbf5c979011a                    | 5     | 3.7552221368e+10 | 3.881560468e+10  | 3.83416430016e+10 | 3.8676061688e+10 | 5.123384177998683e+08  |
  (B) 0eed18403b89                    | 5     | 3.20600355e+10   | 3.862106519e+10  | 3.6402760077e+10  | 3.7563289678e+10 | 2.334955983229862e+09  |
                                      |       | -14.63%          | -0.50%           | -5.06%            | -2.88%           | +355.74%               | + is good
300s_512G_anon_wx_rand_mt_throughput  |       |                  |                  |                   |                  |                        |
  (A) bbf5c979011a                    | 5     | 8.127738e+06     | 8.850316e+06     | 8.4767952e+06     | 8.49689e+06      | 238015.6101665603      |
  (B) 0eed18403b89                    | 5     | 7.997802e+06     | 8.650092e+06     | 8.3851078e+06     | 8.501602e+06     | 232913.70310602157     |
                                      |       | -1.60%           | -2.26%           | -1.08%            | +0.06%           | -2.14%                 | + is good
[2]:
[*] KERNELS
----- BASE KERNEL (A) -----
Arch: x86_64
CommitId: bbf5c979011a099af5dc76498918ed7df445635b
Describe: v5.9
----- TEST KERNEL (B) -----
Arch: x86_64
CommitId: 0eed18403b89d685c736fd41d83312bc18d1fc74
Describe: v5.9-5-g0eed18403b89
[*] TAGS
LABEL | VALUE
--------------------------------+-------------------
kernel_version | 5.9.0-smp-DEV
kernel_version_major | 5
kernel_version_minor | 9
machine_architecture | x86_64
machine_config_memory | 1048576
machine_config_num_cores | 256
machine_config_num_cpus | 2
machine_config_num_phys_cores | 128
machine_platform_genus | rome
test_name | vm-scalability
[*] METRICS
LABEL                                 | COUNT | MIN              | MAX              | MEAN              | MEDIAN           | STDDEV                 | DIRECTION
--------------------------------------+-------+------------------+------------------+-------------------+------------------+------------------------+----------
300s_128G_truncate_throughput         |       |                  |                  |                   |                  |                        |
  (A) bbf5c979011a                    | 5     | 3.4145093376e+10 | 3.7176031393e+10 | 3.55926734966e+10 | 3.5521843244e+10 | 1.0127887857614994e+09 |
  (B) 0eed18403b89                    | 5     | 3.4908582472e+10 | 3.6828513899e+10 | 3.56578033004e+10 | 3.5495102793e+10 | 6.518510126266636e+08  |
                                      |       | +2.24%           | -0.93%           | +0.18%            | -0.08%           | -35.64%                | + is good
300s_512G_anon_wx_rand_mt_throughput  |       |                  |                  |                   |                  |                        |
  (A) bbf5c979011a                    | 5     | 5.041427e+06     | 5.27816e+06      | 5.1413284e+06     | 5.128602e+06     | 93566.28579269352      |
  (B) 0eed18403b89                    | 5     | 5.323419e+06     | 5.513787e+06     | 5.451148e+06      | 5.457595e+06     | 68242.81926767099      |
                                      |       | +5.59%           | +4.46%           | +6.03%            | +6.41%           | -27.06%                | + is good
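(For reference, the percentage rows appear to be computed per column as (B - A) / A; e.g. for the Rome 300s_512G_anon_wx_rand_mt_throughput MEAN: (5.451148e+06 - 5.1413284e+06) / 5.1413284e+06 ≈ +6.03%.)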
On 23.10.20 21:44, Axel Rasmussen wrote:
> On Fri, Oct 23, 2020 at 12:29 PM David Rientjes <[email protected]> wrote:
>>
>> On Wed, 21 Oct 2020, kernel test robot wrote:
>>
>>> [...]
>>
>> I'm curious why we are not able to reproduce this improvement on Skylake
>> and actually see a slight performance degradation, at least for
>> 300s_128G_truncate_throughput.
>>
>> Axel Rasmussen <[email protected]> can provide more details on our
>> results.
>
> Right, our results show a slight regression on a Skylake machine [1],
> and a slight performance increase on a Rome machine [2]. For these
> tests, I used Linus' v5.9 tag as a baseline, and then applied this
> patchset onto that tag as a test kernel (the patches applied cleanly
> besides one comment, I didn't have to do any code fixups). This is
> running the same anon-wx-rand-mt test defined in the upstream
> lkp-tests job file:
> https://github.com/intel/lkp-tests/blob/master/jobs/vm-scalability.yaml
Hi,
looking at the yaml, am I right that each test is run after a fresh boot?
As I already replied to David, this patch merely changes the initial
order of the freelists. The general end result is that lower memory
addresses will be allocated before higher memory addresses will be
allocated - within a zone, the first time memory is getting allocated.
Before, it was the other way around. Once a system has run for some time,
the freelists are randomized.
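To make that concrete, here is a small stand-alone toy model (plain C, not kernel code; all names are invented for illustration) of why tail placement hands out low addresses first right after boot, while head placement hands out high addresses first, assuming boot frees page frames in ascending order:

#include <stdio.h>

#define NPAGES 8

struct node { int pfn; struct node *next; };

static struct node pool[NPAGES];
static struct node *head, *tail;

/* old behavior: prepend freed pages, so the last page freed is allocated first */
static void free_to_head(struct node *n)
{
        n->next = head;
        head = n;
        if (!tail)
                tail = n;
}

/* FPI_TO_TAIL-like behavior: append freed pages, preserving boot order */
static void free_to_tail(struct node *n)
{
        n->next = NULL;
        if (tail)
                tail->next = n;
        else
                head = n;
        tail = n;
}

/* allocations always come from the head of the list */
static struct node *alloc_page_frame(void)
{
        struct node *n = head;

        if (n) {
                head = n->next;
                if (!head)
                        tail = NULL;
        }
        return n;
}

static void demo(void (*free_fn)(struct node *), const char *label)
{
        int pfn, i;

        head = tail = NULL;
        /* "boot" frees page frames 0..7 in ascending address order */
        for (pfn = 0; pfn < NPAGES; pfn++) {
                pool[pfn].pfn = pfn;
                free_fn(&pool[pfn]);
        }
        printf("%s: first allocations are pfns", label);
        for (i = 0; i < 3; i++)
                printf(" %d", alloc_page_frame()->pfn);
        printf("\n");
}

int main(void)
{
        demo(free_to_head, "head placement (old)");
        demo(free_to_tail, "tail placement (new)");
        return 0;
}

With head placement the first allocations are pfns 7 6 5 (highest addresses first); with tail placement they are pfns 0 1 2 (lowest first), matching the description above.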
There might be benchmarks/systems where this initial system state might
now be better suited - or worse. It doesn't really tell you that core-mm
is behaving better/worse now - it merely means that the initial system
state under which the benchmark was started affected the benchmark.
Looks like so far there is one benchmark+system where it's really
beneficial, there is one benchmark+system where it's slightly
beneficial, and one benchmark+system where there is a slight regression.
Something like the following would revert to the previous behavior:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 23f5066bd4a5..fac82420cc3d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1553,7 +1553,9 @@ void __free_pages_core(struct page *page, unsigned int order)
          * Bypass PCP and place fresh pages right to the tail, primarily
          * relevant for memory onlining.
          */
-        __free_pages_ok(page, order, FPI_TO_TAIL);
+        __free_pages_ok(page, order,
+                        system_state < SYSTEM_RUNNING ? FPI_NONE :
+                        FPI_TO_TAIL);
 }
 
 #ifdef CONFIG_NEED_MULTIPLE_NODES
(Or better, passing the expected behavior via MEMINIT_EARLY/... to
__free_pages_core().)
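Purely as an illustration of that alternative (a hypothetical, untested sketch; the plumbing of the context argument to the callers is omitted), it could look roughly like:

void __free_pages_core(struct page *page, unsigned int order,
                       enum meminit_context context)
{
        ...
        /*
         * Keep the old head placement for early boot, and only place
         * pages to the tail for memory onlining.
         */
        __free_pages_ok(page, order,
                        context == MEMINIT_EARLY ? FPI_NONE : FPI_TO_TAIL);
}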
But then, I am not convinced we should perform that change: having a
clean (initial) state might be true for these benchmarks, but it's far
from reality. The change in numbers doesn't show you that core-mm is
operating better/worse, just that the baseline for your tests changed due
to a changed initial system state.
Thanks!
--
Thanks,
David / dhildenb
On Mon, Oct 26, 2020 at 1:31 AM David Hildenbrand <[email protected]> wrote:
>
> On 23.10.20 21:44, Axel Rasmussen wrote:
> > On Fri, Oct 23, 2020 at 12:29 PM David Rientjes <[email protected]> wrote:
> >>
> >> On Wed, 21 Oct 2020, kernel test robot wrote:
> >>
> >>> [...]
> >>
> >> I'm curious why we are not able to reproduce this improvement on Skylake
> >> and actually see a slight performance degradation, at least for
> >> 300s_128G_truncate_throughput.
> >>
> >> Axel Rasmussen <[email protected]> can provide more details on our
> >> results.
> >
> > Right, our results show a slight regression on a Skylake machine [1],
> > and a slight performance increase on a Rome machine [2]. For these
> > tests, I used Linus' v5.9 tag as a baseline, and then applied this
> > patchset onto that tag as a test kernel (the patches applied cleanly
> > besides one comment, I didn't have to do any code fixups). This is
> > running the same anon-wx-rand-mt test defined in the upstream
> > lkp-tests job file:
> > https://github.com/intel/lkp-tests/blob/master/jobs/vm-scalability.yaml
>
> Hi,
>
> looking at the yaml, am I right that each test is run after a fresh boot?
Yes-ish. For the results I posted, the larger context would have been
something like:
- Kernel installed, machine freshly rebooted.
- Various machine management daemons start by default, some are
stopped so as not to interfere with the test.
- Some packages are installed on the machine (the thing which
orchestrates the testing in particular).
- The test is run.
So, the machine is somewhat fresh in the sense that it hasn't been
e.g. serving production traffic just before running the test, but it's
also not as clean as it could be. It seems plausible this difference
explains the difference in the results (I'm not too familiar with how
the Intel kernel test robot is implemented).
>
> As I already replied to David, this patch merely changes the initial
> order of the freelists. The general end result is that lower memory
> addresses will be allocated before higher memory addresses will be
> allocated - within a zone, the first time memory is getting allocated.
> Before, it was the other way around. Once a system has run for some time,
> the freelists are randomized.
>
> There might be benchmarks/systems where this initial system state might
> now be better suited - or worse. It doesn't really tell you that core-mm
> is behaving better/worse now - it merely means that the initial system
> state under which the benchmark was started affected the benchmark.
>
> Looks like so far there is one benchmark+system where it's really
> beneficial, there is one benchmark+system where it's slightly
> beneficial, and one benchmark+system where there is a slight regression.
>
>
> Something like the following would revert to the previous behavior:
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 23f5066bd4a5..fac82420cc3d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1553,7 +1553,9 @@ void __free_pages_core(struct page *page, unsigned int order)
>           * Bypass PCP and place fresh pages right to the tail, primarily
>           * relevant for memory onlining.
>           */
> -        __free_pages_ok(page, order, FPI_TO_TAIL);
> +        __free_pages_ok(page, order,
> +                        system_state < SYSTEM_RUNNING ? FPI_NONE :
> +                        FPI_TO_TAIL);
>  }
>
>  #ifdef CONFIG_NEED_MULTIPLE_NODES
>
>
> (Or better, passing the expected behavior via MEMINIT_EARLY/... to
> __free_pages_core().)
>
>
> But then, I am not convinced we should perform that change: having a
> clean (initial) state might be true for these benchmarks, but it's far
> from reality. The change in numbers doesn't show you that core-mm is
> operating better/worse, just that the baseline for your tests changed due
> to a changed initial system state.
Not to put words in David's mouth :) but at least from my perspective,
our original interest was "wow, an 87% improvement! maybe we should
deploy this patch to production!", and I'm mostly sharing my results
just to say "it actually doesn't seem to be a huge *general*
improvement", rather than to advocate for further changes / fixes.
IIUC the original motivation for this patch was to fix somewhat of an
edge case, not to make a very general improvement, so this seems fine.
>
> Thanks!
>
> --
> Thanks,
>
> David / dhildenb
>
> On 26.10.2020 at 19:11, Axel Rasmussen <[email protected]> wrote:
>
> On Mon, Oct 26, 2020 at 1:31 AM David Hildenbrand <[email protected]> wrote:
>>
>>> On 23.10.20 21:44, Axel Rasmussen wrote:
>>> On Fri, Oct 23, 2020 at 12:29 PM David Rientjes <[email protected]> wrote:
>>>>
>>>> On Wed, 21 Oct 2020, kernel test robot wrote:
>>>>
>>>>> [...]
>>>>
>>>> I'm curious why we are not able to reproduce this improvement on Skylake
>>>> and actually see a slight performance degradation, at least for
>>>> 300s_128G_truncate_throughput.
>>>>
>>>> Axel Rasmussen <[email protected]> can provide more details on our
>>>> results.
>>>
>>> Right, our results show a slight regression on a Skylake machine [1],
>>> and a slight performance increase on a Rome machine [2]. For these
>>> tests, I used Linus' v5.9 tag as a baseline, and then applied this
>>> patchset onto that tag as a test kernel (the patches applied cleanly
>>> besides one comment, I didn't have to do any code fixups). This is
>>> running the same anon-wx-rand-mt test defined in the upstream
>>> lkp-tests job file:
>>> https://github.com/intel/lkp-tests/blob/master/jobs/vm-scalability.yaml
>>
>> Hi,
>>
>> looking at the yaml, am I right that each test is run after a fresh boot?
>
> Yes-ish. For the results I posted, the larger context would have been
> something like:
>
> - Kernel installed, machine freshly rebooted.
> - Various machine management daemons start by default, some are
> stopped so as not to interfere with the test.
> - Some packages are installed on the machine (the thing which
> orchestrates the testing in particular).
> - The test is run.
>
> So, the machine is somewhat fresh in the sense that it hasn't been
> e.g. serving production traffic just before running the test, but it's
> also not as clean as it could be. It seems plausible this difference
> explains the difference in the results (I'm not too familiar with how
> the Intel kernel test robot is implemented).
Ah, okay. So most memory in the system is indeed untouched. Thanks!
>
>>
>> As I already replied to David, this patch merely changes the initial
>> order of the freelists. The general end result is that lower memory
>> addresses will be allocated before higher memory addresses will be
>> allocated - within a zone, the first time memory is getting allocated.
>> Before, it was the other way around. Once a system has run for some time,
>> the freelists are randomized.
>>
>> There might be benchmarks/systems where this initial system state might
>> now be better suited - or worse. It doesn't really tell you that core-mm
>> is behaving better/worse now - it merely means that the initial system
>> state under which the benchmark was started affected the benchmark.
>>
>> Looks like so far there is one benchmark+system where it's really
>> beneficial, there is one benchmark+system where it's slightly
>> beneficial, and one benchmark+system where there is a slight regression.
>>
>>
>> Something like the following would revert to the previous behavior:
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 23f5066bd4a5..fac82420cc3d 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1553,7 +1553,9 @@ void __free_pages_core(struct page *page, unsigned int order)
>>           * Bypass PCP and place fresh pages right to the tail, primarily
>>           * relevant for memory onlining.
>>           */
>> -        __free_pages_ok(page, order, FPI_TO_TAIL);
>> +        __free_pages_ok(page, order,
>> +                        system_state < SYSTEM_RUNNING ? FPI_NONE :
>> +                        FPI_TO_TAIL);
>>  }
>>
>>  #ifdef CONFIG_NEED_MULTIPLE_NODES
>>
>>
>> (Or better, passing the expected behavior via MEMINIT_EARLY/... to
>> __free_pages_core().)
>>
>>
>> But then, I am not convinced we should perform that change: having a
>> clean (initial) state might be true for these benchmarks, but it's far
>> from reality. The change in numbers doesn't show you that core-mm is
>> operating better/worse, just that the baseline for your tests changed due
>> to a changed initial system state.
>
> Not to put words in David's mouth :) but at least from my perspective,
> our original interest was "wow, an 87% improvement! maybe we should
> deploy this patch to production!", and I'm mostly sharing my results
> just to say "it actually doesn't seem to be a huge *general*
> improvement", rather than to advocate for further changes / fixes.
Ah, yes, I saw the +87% and thought "that can't be right".
> IIUC the original motivation for this patch was to fix somewhat of an
> edge case, not to make a very general improvement, so this seems fine.
>
Exactly.