2021-05-19 19:59:23

by kernel test robot

Subject: [smp] a32a4d8a81: netperf.Throughput_tps -2.1% regression



Greetings,

FYI, we noticed a -2.1% regression of netperf.Throughput_tps due to commit:


commit: a32a4d8a815c4eb6dc64b8962dc13a9dfae70868 ("smp: Run functions concurrently in smp_call_function_many_cond()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: netperf
on test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:

ip: ipv4
runtime: 300s
nr_threads: 1
cluster: cs-localhost
test: UDP_RR
cpufreq_governor: performance
ucode: 0x5003006

test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
cs-localhost/gcc-9/performance/ipv4/x86_64-rhel-8.3/1/debian-10.4-x86_64-20200603.cgz/300s/lkp-csl-2ap3/UDP_RR/netperf/0x5003006

commit:
v5.12-rc2
a32a4d8a81 ("smp: Run functions concurrently in smp_call_function_many_cond()")

v5.12-rc2 a32a4d8a815c4eb6dc64b8962dc
---------------- ---------------------------
%stddev %change %stddev
\ | \
116903 -2.1% 114404 netperf.Throughput_total_tps
116903 -2.1% 114404 netperf.Throughput_tps
35066769 -2.1% 34317990 netperf.time.voluntary_context_switches
35071059 -2.1% 34321258 netperf.workload
67295 +1.5% 68333 proc-vmstat.nr_anon_pages
463520 -2.1% 453603 vmstat.system.cs
535.28 ± 6% -8.3% 490.97 ± 10% sched_debug.cfs_rq:/.util_est_enqueued.max
0.02 ± 8% -10.8% 0.02 ± 4% sched_debug.cpu.nr_running.avg
76309820 ± 4% +320.0% 3.205e+08 ±158% cpuidle.C1.time
23409116 ± 3% +31.0% 30676822 ± 20% cpuidle.C1.usage
46720133 ± 2% -12.9% 40709940 ± 2% cpuidle.POLL.usage
5282 ±110% +317.0% 22029 ± 58% numa-vmstat.node3.nr_anon_pages
11998 ± 55% +138.7% 28637 ± 45% numa-vmstat.node3.nr_inactive_anon
11998 ± 55% +138.7% 28637 ± 45% numa-vmstat.node3.nr_zone_inactive_anon
8397 ±136% +588.7% 57827 ± 75% numa-meminfo.node3.AnonHugePages
21162 ±110% +316.7% 88189 ± 58% numa-meminfo.node3.AnonPages
48780 ± 54% +136.8% 115533 ± 45% numa-meminfo.node3.Inactive
48780 ± 54% +136.8% 115533 ± 45% numa-meminfo.node3.Inactive(anon)
467040 -2.1% 457094 perf-stat.i.context-switches
0.01 ±138% +0.0 0.03 ± 73% perf-stat.i.dTLB-store-miss-rate%
9.415e+08 -2.4% 9.188e+08 ± 2% perf-stat.i.dTLB-stores
0.01 ±137% +0.0 0.03 ± 73% perf-stat.overall.dTLB-store-miss-rate%
465472 -2.1% 455557 perf-stat.ps.context-switches
9.385e+08 -2.4% 9.158e+08 ± 2% perf-stat.ps.dTLB-stores
1.21 ± 14% +0.2 1.41 ± 5% perf-profile.calltrace.cycles-pp.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
2.05 ± 10% +0.3 2.33 ± 4% perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
0.06 ± 7% +0.0 0.08 ± 14% perf-profile.children.cycles-pp.__calc_delta
0.08 ± 19% +0.0 0.10 ± 9% perf-profile.children.cycles-pp._copy_to_user
0.09 ± 22% +0.0 0.12 ± 8% perf-profile.children.cycles-pp._copy_from_user
0.12 ± 20% +0.0 0.17 ± 13% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.14 ± 11% +0.1 0.19 ± 9% perf-profile.children.cycles-pp.skb_release_data
1.21 ± 14% +0.2 1.41 ± 5% perf-profile.children.cycles-pp.__ip_append_data
2.07 ± 11% +0.3 2.33 ± 4% perf-profile.children.cycles-pp.schedule_idle
0.06 ± 7% +0.0 0.08 ± 11% perf-profile.self.cycles-pp.__calc_delta
0.19 ± 8% +0.0 0.24 ± 6% perf-profile.self.cycles-pp.__softirqentry_text_start
0.24 ± 8% +0.1 0.29 ± 4% perf-profile.self.cycles-pp.__skb_recv_udp
0.14 ± 11% +0.1 0.19 ± 9% perf-profile.self.cycles-pp.skb_release_data
0.02 ±142% +0.1 0.08 ± 17% perf-profile.self.cycles-pp.sock_alloc_send_pskb
0.11 ± 17% +0.1 0.19 ± 13% perf-profile.self.cycles-pp.__ip_append_data
0.12 ± 34% +0.1 0.26 ± 22% perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
0.87 ± 13% +0.2 1.05 ± 6% perf-profile.self.cycles-pp._raw_spin_lock
1287 ± 42% +75.3% 2256 ± 14% interrupts.CPU111.CAL:Function_call_interrupts
1326 ± 43% +71.0% 2267 ± 13% interrupts.CPU119.CAL:Function_call_interrupts
1300 ± 45% +75.9% 2287 ± 37% interrupts.CPU120.CAL:Function_call_interrupts
1299 ± 45% +60.1% 2081 ± 28% interrupts.CPU128.CAL:Function_call_interrupts
1305 ± 45% +61.7% 2110 ± 29% interrupts.CPU131.CAL:Function_call_interrupts
1299 ± 45% +61.8% 2102 ± 28% interrupts.CPU139.CAL:Function_call_interrupts
66.67 ±133% -97.2% 1.83 ±155% interrupts.CPU14.TLB:TLB_shootdowns
1299 ± 45% +107.8% 2700 ± 33% interrupts.CPU142.CAL:Function_call_interrupts
301.83 ±128% -95.6% 13.17 ±140% interrupts.CPU149.RES:Rescheduling_interrupts
389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.NMI:Non-maskable_interrupts
389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.PMI:Performance_monitoring_interrupts
1299 ± 45% +60.2% 2081 ± 28% interrupts.CPU35.CAL:Function_call_interrupts
1244 ± 50% +66.8% 2076 ± 27% interrupts.CPU45.CAL:Function_call_interrupts
1300 ± 44% +59.5% 2075 ± 28% interrupts.CPU46.CAL:Function_call_interrupts
1.50 ± 63% +1422.2% 22.83 ±167% interrupts.CPU47.RES:Rescheduling_interrupts
467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.NMI:Non-maskable_interrupts
467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.PMI:Performance_monitoring_interrupts
306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.NMI:Non-maskable_interrupts
306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.PMI:Performance_monitoring_interrupts
1131 ± 27% +61.2% 1822 ± 35% interrupts.CPU85.CAL:Function_call_interrupts
1180 ± 31% +79.6% 2119 ± 24% interrupts.CPU86.CAL:Function_call_interrupts



netperf.Throughput_tps

121000 +------------------------------------------------------------------+
120000 |-+ :+ |
| : + |
119000 |-+ : + + |
118000 |-+ : : :+ + + + + |
|.+ : : : + + + :: + + : |
117000 |-++ +.: : +.+ + + +. : :.+ + : |
116000 |-+ + + :+ +.+ + + + |
115000 |-+ O + O O |
| O O O O O O O O O O O O |
114000 |-+ O O O O O O O |
113000 |-+ O O O O O |
| O O O O O |
112000 |-+ O O |
111000 +------------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure, Open Source Technology Center, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

Thanks,
Oliver Sang


Attachments:
(No filename) (9.36 kB)
config-5.12.0-rc2-00001-ga32a4d8a815c (175.54 kB)
job-script (8.04 kB)
job.yaml (5.47 kB)
reproduce (337.00 B)

2021-05-19 20:20:48

by Nadav Amit

Subject: Re: [smp] a32a4d8a81: netperf.Throughput_tps -2.1% regression

[ +PeterZ for reference ]


> On May 19, 2021, at 7:27 AM, kernel test robot <[email protected]> wrote:
>
>
>
> Greeting,
>
> FYI, we noticed a -2.1% regression of netperf.Throughput_tps due to commit:
>
>
> commit: a32a4d8a815c4eb6dc64b8962dc13a9dfae70868 ("smp: Run functions concurrently in smp_call_function_many_cond()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
>
> in testcase: netperf
> on test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
> with following parameters:
>
> ip: ipv4
> runtime: 300s
> nr_threads: 1
> cluster: cs-localhost
> test: UDP_RR
> cpufreq_governor: performance
> ucode: 0x5003006
>
>

[snip]

> commit:
> v5.12-rc2
> a32a4d8a81 ("smp: Run functions concurrently in smp_call_function_many_cond()")
>
> v5.12-rc2 a32a4d8a815c4eb6dc64b8962dc
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 116903 -2.1% 114404 netperf.Throughput_total_tps
> 116903 -2.1% 114404 netperf.Throughput_tps
> 35066769 -2.1% 34317990 netperf.time.voluntary_context_switches
> 35071059 -2.1% 34321258 netperf.workload
> 67295 +1.5% 68333 proc-vmstat.nr_anon_pages
> 463520 -2.1% 453603 vmstat.system.cs
> 535.28 ± 6% -8.3% 490.97 ± 10% sched_debug.cfs_rq:/.util_est_enqueued.max
> 0.02 ± 8% -10.8% 0.02 ± 4% sched_debug.cpu.nr_running.avg
> 76309820 ± 4% +320.0% 3.205e+08 ±158% cpuidle.C1.time
> 23409116 ± 3% +31.0% 30676822 ± 20% cpuidle.C1.usage
> 46720133 ± 2% -12.9% 40709940 ± 2% cpuidle.POLL.usage
> 5282 ±110% +317.0% 22029 ± 58% numa-vmstat.node3.nr_anon_pages
> 11998 ± 55% +138.7% 28637 ± 45% numa-vmstat.node3.nr_inactive_anon
> 11998 ± 55% +138.7% 28637 ± 45% numa-vmstat.node3.nr_zone_inactive_anon
> 8397 ±136% +588.7% 57827 ± 75% numa-meminfo.node3.AnonHugePages
> 21162 ±110% +316.7% 88189 ± 58% numa-meminfo.node3.AnonPages
> 48780 ± 54% +136.8% 115533 ± 45% numa-meminfo.node3.Inactive
> 48780 ± 54% +136.8% 115533 ± 45% numa-meminfo.node3.Inactive(anon)
> 467040 -2.1% 457094 perf-stat.i.context-switches
> 0.01 ±138% +0.0 0.03 ± 73% perf-stat.i.dTLB-store-miss-rate%
> 9.415e+08 -2.4% 9.188e+08 ± 2% perf-stat.i.dTLB-stores
> 0.01 ±137% +0.0 0.03 ± 73% perf-stat.overall.dTLB-store-miss-rate%
> 465472 -2.1% 455557 perf-stat.ps.context-switches
> 9.385e+08 -2.4% 9.158e+08 ± 2% perf-stat.ps.dTLB-stores
> 1.21 ± 14% +0.2 1.41 ± 5% perf-profile.calltrace.cycles-pp.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
> 2.05 ± 10% +0.3 2.33 ± 4% perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 0.06 ± 7% +0.0 0.08 ± 14% perf-profile.children.cycles-pp.__calc_delta
> 0.08 ± 19% +0.0 0.10 ± 9% perf-profile.children.cycles-pp._copy_to_user
> 0.09 ± 22% +0.0 0.12 ± 8% perf-profile.children.cycles-pp._copy_from_user
> 0.12 ± 20% +0.0 0.17 ± 13% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
> 0.14 ± 11% +0.1 0.19 ± 9% perf-profile.children.cycles-pp.skb_release_data
> 1.21 ± 14% +0.2 1.41 ± 5% perf-profile.children.cycles-pp.__ip_append_data
> 2.07 ± 11% +0.3 2.33 ± 4% perf-profile.children.cycles-pp.schedule_idle
> 0.06 ± 7% +0.0 0.08 ± 11% perf-profile.self.cycles-pp.__calc_delta
> 0.19 ± 8% +0.0 0.24 ± 6% perf-profile.self.cycles-pp.__softirqentry_text_start
> 0.24 ± 8% +0.1 0.29 ± 4% perf-profile.self.cycles-pp.__skb_recv_udp
> 0.14 ± 11% +0.1 0.19 ± 9% perf-profile.self.cycles-pp.skb_release_data
> 0.02 ±142% +0.1 0.08 ± 17% perf-profile.self.cycles-pp.sock_alloc_send_pskb
> 0.11 ± 17% +0.1 0.19 ± 13% perf-profile.self.cycles-pp.__ip_append_data
> 0.12 ± 34% +0.1 0.26 ± 22% perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
> 0.87 ± 13% +0.2 1.05 ± 6% perf-profile.self.cycles-pp._raw_spin_lock
> 1287 ± 42% +75.3% 2256 ± 14% interrupts.CPU111.CAL:Function_call_interrupts
> 1326 ± 43% +71.0% 2267 ± 13% interrupts.CPU119.CAL:Function_call_interrupts
> 1300 ± 45% +75.9% 2287 ± 37% interrupts.CPU120.CAL:Function_call_interrupts
> 1299 ± 45% +60.1% 2081 ± 28% interrupts.CPU128.CAL:Function_call_interrupts
> 1305 ± 45% +61.7% 2110 ± 29% interrupts.CPU131.CAL:Function_call_interrupts
> 1299 ± 45% +61.8% 2102 ± 28% interrupts.CPU139.CAL:Function_call_interrupts
> 66.67 ±133% -97.2% 1.83 ±155% interrupts.CPU14.TLB:TLB_shootdowns
> 1299 ± 45% +107.8% 2700 ± 33% interrupts.CPU142.CAL:Function_call_interrupts
> 301.83 ±128% -95.6% 13.17 ±140% interrupts.CPU149.RES:Rescheduling_interrupts
> 389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.NMI:Non-maskable_interrupts
> 389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.PMI:Performance_monitoring_interrupts
> 1299 ± 45% +60.2% 2081 ± 28% interrupts.CPU35.CAL:Function_call_interrupts
> 1244 ± 50% +66.8% 2076 ± 27% interrupts.CPU45.CAL:Function_call_interrupts
> 1300 ± 44% +59.5% 2075 ± 28% interrupts.CPU46.CAL:Function_call_interrupts
> 1.50 ± 63% +1422.2% 22.83 ±167% interrupts.CPU47.RES:Rescheduling_interrupts
> 467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.NMI:Non-maskable_interrupts
> 467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.PMI:Performance_monitoring_interrupts
> 306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.NMI:Non-maskable_interrupts
> 306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.PMI:Performance_monitoring_interrupts
> 1131 ± 27% +61.2% 1822 ± 35% interrupts.CPU85.CAL:Function_call_interrupts
> 1180 ± 31% +79.6% 2119 ± 24% interrupts.CPU86.CAL:Function_call_interrupts
>

Could it be a result of a regression that was resolved by commit
641acbf6fd6 ("smp: Micro-optimize smp_call_function_many_cond()")
or does this report mean that the performance regression also
happened on the -rc?


Attachments:
signature.asc (849.00 B)
Message signed with OpenPGP

2021-05-19 20:21:22

by Peter Zijlstra

Subject: Re: [smp] a32a4d8a81: netperf.Throughput_tps -2.1% regression

On Wed, May 19, 2021 at 06:17:35PM +0000, Nadav Amit wrote:
> > 1287 ± 42% +75.3% 2256 ± 14% interrupts.CPU111.CAL:Function_call_interrupts
> > 1326 ± 43% +71.0% 2267 ± 13% interrupts.CPU119.CAL:Function_call_interrupts
> > 1300 ± 45% +75.9% 2287 ± 37% interrupts.CPU120.CAL:Function_call_interrupts
> > 1299 ± 45% +60.1% 2081 ± 28% interrupts.CPU128.CAL:Function_call_interrupts
> > 1305 ± 45% +61.7% 2110 ± 29% interrupts.CPU131.CAL:Function_call_interrupts
> > 1299 ± 45% +61.8% 2102 ± 28% interrupts.CPU139.CAL:Function_call_interrupts
> > 66.67 ±133% -97.2% 1.83 ±155% interrupts.CPU14.TLB:TLB_shootdowns
> > 1299 ± 45% +107.8% 2700 ± 33% interrupts.CPU142.CAL:Function_call_interrupts
> > 301.83 ±128% -95.6% 13.17 ±140% interrupts.CPU149.RES:Rescheduling_interrupts
> > 389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.NMI:Non-maskable_interrupts
> > 389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.PMI:Performance_monitoring_interrupts
> > 1299 ± 45% +60.2% 2081 ± 28% interrupts.CPU35.CAL:Function_call_interrupts
> > 1244 ± 50% +66.8% 2076 ± 27% interrupts.CPU45.CAL:Function_call_interrupts
> > 1300 ± 44% +59.5% 2075 ± 28% interrupts.CPU46.CAL:Function_call_interrupts
> > 1.50 ± 63% +1422.2% 22.83 ±167% interrupts.CPU47.RES:Rescheduling_interrupts
> > 467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.NMI:Non-maskable_interrupts
> > 467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.PMI:Performance_monitoring_interrupts
> > 306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.NMI:Non-maskable_interrupts
> > 306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.PMI:Performance_monitoring_interrupts
> > 1131 ± 27% +61.2% 1822 ± 35% interrupts.CPU85.CAL:Function_call_interrupts
> > 1180 ± 31% +79.6% 2119 ± 24% interrupts.CPU86.CAL:Function_call_interrupts
> >

It looks to be sending *waay* more call IPIs, did we mess up the mask or
lose an optimization somewhere?

I'll go read the commit again...

2021-05-19 20:24:45

by Nadav Amit

Subject: Re: [smp] a32a4d8a81: netperf.Throughput_tps -2.1% regression



> On May 19, 2021, at 11:38 AM, Peter Zijlstra <[email protected]> wrote:
>
> On Wed, May 19, 2021 at 06:17:35PM +0000, Nadav Amit wrote:
>>> 1287 ± 42% +75.3% 2256 ± 14% interrupts.CPU111.CAL:Function_call_interrupts
>>> 1326 ± 43% +71.0% 2267 ± 13% interrupts.CPU119.CAL:Function_call_interrupts
>>> 1300 ± 45% +75.9% 2287 ± 37% interrupts.CPU120.CAL:Function_call_interrupts
>>> 1299 ± 45% +60.1% 2081 ± 28% interrupts.CPU128.CAL:Function_call_interrupts
>>> 1305 ± 45% +61.7% 2110 ± 29% interrupts.CPU131.CAL:Function_call_interrupts
>>> 1299 ± 45% +61.8% 2102 ± 28% interrupts.CPU139.CAL:Function_call_interrupts
>>> 66.67 ±133% -97.2% 1.83 ±155% interrupts.CPU14.TLB:TLB_shootdowns
>>> 1299 ± 45% +107.8% 2700 ± 33% interrupts.CPU142.CAL:Function_call_interrupts
>>> 301.83 ±128% -95.6% 13.17 ±140% interrupts.CPU149.RES:Rescheduling_interrupts
>>> 389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.NMI:Non-maskable_interrupts
>>> 389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.PMI:Performance_monitoring_interrupts
>>> 1299 ± 45% +60.2% 2081 ± 28% interrupts.CPU35.CAL:Function_call_interrupts
>>> 1244 ± 50% +66.8% 2076 ± 27% interrupts.CPU45.CAL:Function_call_interrupts
>>> 1300 ± 44% +59.5% 2075 ± 28% interrupts.CPU46.CAL:Function_call_interrupts
>>> 1.50 ± 63% +1422.2% 22.83 ±167% interrupts.CPU47.RES:Rescheduling_interrupts
>>> 467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.NMI:Non-maskable_interrupts
>>> 467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.PMI:Performance_monitoring_interrupts
>>> 306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.NMI:Non-maskable_interrupts
>>> 306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.PMI:Performance_monitoring_interrupts
>>> 1131 ± 27% +61.2% 1822 ± 35% interrupts.CPU85.CAL:Function_call_interrupts
>>> 1180 ± 31% +79.6% 2119 ± 24% interrupts.CPU86.CAL:Function_call_interrupts
>>>
>
> It looks to be sending *waay* more call IPIs, did we mess up the mask or
> loose an optimization somewhere?
>
> I'll go read the commit again…

As you know, I did mess up by calling arch_send_call_function_single_ipi()
instead of smp_call_function_single(), which could explain the extra IPIs.
But that was resolved by your subsequent patch.

For me, what stands out is the time in C1 spent after the patch.

I will try to reproduce the issue to figure it out, since so far I could
not find an error in the code.


Attachments:
signature.asc (849.00 B)
Message signed with OpenPGP