2020-07-21 00:17:59

by Chen, Rong A

Subject: [fsnotify] c738fbabb0: will-it-scale.per_process_ops -9.5% regression

Greetings,

FYI, we noticed a -9.5% regression of will-it-scale.per_process_ops due to commit:


commit: c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:

nr_task: 16
mode: process
test: open1
cpufreq_governor: performance
ucode: 0x5002f01

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a threads-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale

In addition, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -9.6% regression |
| test machine | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=16 |
| | test=open2 |
| | ucode=0x5002f01 |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -9.8% regression |
| test machine | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=50% |
| | test=open2 |
| | ucode=0x5002f01 |
+------------------+---------------------------------------------------------------------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/open1/will-it-scale/0x5002f01

commit:
71d734103e ("fsnotify: Rearrange fast path to minimise overhead when there is no watcher")
c738fbabb0 ("fsnotify: fold fsnotify() call into fsnotify_parent()")

71d734103edfa2b4 c738fbabb0ff62d0f9a9572e56e
---------------- ---------------------------
%stddev %change %stddev
\ | \
230517 -9.5% 208520 will-it-scale.per_process_ops
3688279 -9.5% 3336327 will-it-scale.workload
0.14 -0.0 0.13 ± 3% mpstat.cpu.all.usr%
18920 +1.3% 19175 vmstat.system.in
1326004 ± 28% +30.6% 1732214 ± 8% cpuidle.C1.time
16564 ± 36% +84.9% 30624 ± 15% cpuidle.C1.usage
1.25 ± 48% +86.7% 2.33 ± 20% sched_debug.cfs_rq:/.nr_spread_over.max
9.37 ± 22% -31.7% 6.40 ± 11% sched_debug.cpu.clock.stddev
3287 ± 4% -24.3% 2487 ± 11% slabinfo.fsnotify_mark_connector.active_objs
3287 ± 4% -24.3% 2487 ± 11% slabinfo.fsnotify_mark_connector.num_objs
94165 -1.5% 92776 proc-vmstat.nr_slab_unreclaimable
14685686 -8.2% 13486706 proc-vmstat.numa_hit
14685455 -8.2% 13486604 proc-vmstat.numa_local
56441317 -8.5% 51651910 proc-vmstat.pgalloc_normal
56554299 -8.5% 51766282 proc-vmstat.pgfree
1129 ± 87% +148.1% 2801 ± 32% numa-vmstat.node0.nr_inactive_anon
1129 ± 87% +148.1% 2801 ± 32% numa-vmstat.node0.nr_zone_inactive_anon
2215 ± 44% -31.9% 1508 ± 30% numa-vmstat.node2.nr_mapped
28641 ± 49% -82.5% 5004 ± 67% numa-vmstat.node3.nr_active_anon
28540 ± 49% -82.9% 4877 ± 70% numa-vmstat.node3.nr_anon_pages
266.25 ± 21% -65.5% 91.75 ± 20% numa-vmstat.node3.nr_page_table_pages
22919 ± 6% -18.0% 18797 ± 10% numa-vmstat.node3.nr_slab_unreclaimable
28641 ± 49% -82.5% 5004 ± 67% numa-vmstat.node3.nr_zone_active_anon
649878 ± 4% -25.9% 481856 ± 11% numa-vmstat.node3.numa_hit
593060 ± 6% -34.1% 390815 ± 13% numa-vmstat.node3.numa_local
4691 ± 83% +143.3% 11414 ± 31% numa-meminfo.node0.Inactive
4519 ± 87% +148.0% 11207 ± 31% numa-meminfo.node0.Inactive(anon)
8735 ± 44% -32.3% 5911 ± 27% numa-meminfo.node2.Mapped
114519 ± 49% -82.1% 20520 ± 68% numa-meminfo.node3.Active
114519 ± 49% -82.5% 20016 ± 67% numa-meminfo.node3.Active(anon)
81741 ± 59% -84.3% 12813 ± 92% numa-meminfo.node3.AnonHugePages
114110 ± 49% -82.9% 19510 ± 70% numa-meminfo.node3.AnonPages
906290 ± 10% -19.5% 729240 ± 6% numa-meminfo.node3.MemUsed
1072 ± 20% -65.6% 368.25 ± 20% numa-meminfo.node3.PageTables
91676 ± 6% -18.0% 75192 ± 10% numa-meminfo.node3.SUnreclaim
120360 ± 7% -17.0% 99920 ± 9% numa-meminfo.node3.Slab
2410 ±142% +926.2% 24739 ± 68% interrupts.CPU124.LOC:Local_timer_interrupts
19270 ± 81% -73.3% 5143 ±128% interrupts.CPU16.LOC:Local_timer_interrupts
10986 ± 91% -88.9% 1217 ±138% interrupts.CPU179.LOC:Local_timer_interrupts
579.25 ± 58% +547.7% 3752 ± 91% interrupts.CPU20.LOC:Local_timer_interrupts
21592 ± 96% +1060.3% 250538 ± 26% interrupts.CPU3.LOC:Local_timer_interrupts
367.75 ± 20% +1182.1% 4714 ±109% interrupts.CPU31.LOC:Local_timer_interrupts
372.25 ± 19% +1195.1% 4821 ± 64% interrupts.CPU32.LOC:Local_timer_interrupts
379.00 ± 15% +1211.7% 4971 ±100% interrupts.CPU33.LOC:Local_timer_interrupts
1459 ± 82% -73.3% 389.25 ± 9% interrupts.CPU45.LOC:Local_timer_interrupts
354.25 ± 18% +399.6% 1770 ± 95% interrupts.CPU53.LOC:Local_timer_interrupts
354.75 ± 20% +2367.8% 8754 ± 95% interrupts.CPU59.LOC:Local_timer_interrupts
377.75 ± 24% +2115.8% 8370 ±134% interrupts.CPU68.LOC:Local_timer_interrupts
28421 ±168% +597.9% 198364 ± 45% interrupts.CPU7.LOC:Local_timer_interrupts
4590 ±154% -93.3% 305.75 ± 10% interrupts.CPU71.LOC:Local_timer_interrupts
4612 ± 51% -82.8% 791.75 ± 72% interrupts.CPU75.LOC:Local_timer_interrupts
287703 ± 8% -82.1% 51605 ±128% interrupts.CPU99.LOC:Local_timer_interrupts
18559 ± 11% -42.7% 10626 ± 44% softirqs.CPU105.RCU
107621 ± 28% -46.1% 58018 ± 39% softirqs.CPU105.TIMER
2351 ±103% +301.0% 9431 ± 82% softirqs.CPU106.RCU
14847 ± 68% +259.2% 53325 ± 75% softirqs.CPU106.TIMER
13789 ± 35% -35.5% 8896 ± 10% softirqs.CPU179.TIMER
10467 ± 16% -18.0% 8578 ± 7% softirqs.CPU180.TIMER
2836 ± 49% +486.1% 16622 ± 25% softirqs.CPU3.RCU
16935 ± 34% +378.9% 81110 ± 23% softirqs.CPU3.TIMER
9767 ± 2% +23.7% 12080 ± 22% softirqs.CPU31.TIMER
10688 ± 9% +27.3% 13610 ± 14% softirqs.CPU48.TIMER
2604 ± 17% +140.9% 6274 ± 88% softirqs.CPU5.SCHED
9464 +12.4% 10638 ± 12% softirqs.CPU52.TIMER
9460 ± 3% +40.8% 13320 ± 41% softirqs.CPU68.TIMER
3307 ± 96% +305.3% 13404 ± 38% softirqs.CPU7.RCU
2515 ± 7% +590.0% 17354 ± 84% softirqs.CPU7.SCHED
18138 ± 76% +299.2% 72406 ± 26% softirqs.CPU7.TIMER
11539 ± 9% -17.6% 9503 ± 2% softirqs.CPU75.TIMER
11004 ± 13% -12.3% 9649 ± 5% softirqs.CPU81.TIMER
19004 ± 8% -78.6% 4062 ± 91% softirqs.CPU99.RCU
91631 ± 8% -73.9% 23943 ± 79% softirqs.CPU99.TIMER



will-it-scale.per_process_ops

235000 +------------------------------------------------------------------+
|.+..+.+..+ +..+.+..+.+.+.. |
230000 |-+ : : +.+.+..+ |
| : : |
| :.+.. .+ |
225000 |-+ + +.+..+ |
| |
220000 |-+ |
| O O O O |
215000 |-+ |
| O O O O O |
| O O O O O O O O |
210000 |-O O O O O O O |
| O O O |
205000 +------------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample

***************************************************************************************************
lkp-csl-2ap2: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/open2/will-it-scale/0x5002f01

commit:
71d734103e ("fsnotify: Rearrange fast path to minimise overhead when there is no watcher")
c738fbabb0 ("fsnotify: fold fsnotify() call into fsnotify_parent()")

71d734103edfa2b4 c738fbabb0ff62d0f9a9572e56e
---------------- ---------------------------
%stddev %change %stddev
\ | \
231143 -9.6% 208858 will-it-scale.per_process_ops
3698309 -9.6% 3341737 will-it-scale.workload
13950150 -8.1% 12819132 numa-numastat.node0.local_node
13950241 -8.1% 12819137 numa-numastat.node0.numa_hit
196.25 ±107% +1048.3% 2253 ± 58% numa-vmstat.node0.nr_inactive_anon
211.75 ±103% +981.8% 2290 ± 56% numa-vmstat.node0.nr_shmem
196.25 ±107% +1048.3% 2253 ± 58% numa-vmstat.node0.nr_zone_inactive_anon
2767 ± 18% -42.0% 1605 ± 37% numa-vmstat.node2.nr_mapped
7998 ± 5% +24.0% 9916 ± 4% slabinfo.eventpoll_pwq.active_objs
7998 ± 5% +24.0% 9916 ± 4% slabinfo.eventpoll_pwq.num_objs
12722 ± 5% +10.4% 14044 slabinfo.shmem_inode_cache.active_objs
12834 ± 5% +10.1% 14134 slabinfo.shmem_inode_cache.num_objs
924.75 ± 82% +882.8% 9088 ± 57% numa-meminfo.node0.Inactive
786.25 ±107% +1046.6% 9015 ± 58% numa-meminfo.node0.Inactive(anon)
848.50 ±103% +980.2% 9165 ± 56% numa-meminfo.node0.Shmem
44264 ±111% +123.6% 98992 ± 76% numa-meminfo.node1.AnonHugePages
11068 ± 18% -42.0% 6422 ± 37% numa-meminfo.node2.Mapped
836.25 ± 2% +3.9% 868.50 proc-vmstat.nr_page_table_pages
14706505 -8.0% 13532306 proc-vmstat.numa_hit
14706160 -8.0% 13531870 proc-vmstat.numa_local
56514816 -8.3% 51824913 proc-vmstat.pgalloc_normal
56591067 -8.2% 51934860 proc-vmstat.pgfree
0.28 ± 29% -35.5% 0.18 ± 15% sched_debug.cfs_rq:/.nr_spread_over.avg
3.12 ± 48% -50.7% 1.54 ± 24% sched_debug.cfs_rq:/.nr_spread_over.max
0.59 ± 34% -50.1% 0.29 ± 29% sched_debug.cfs_rq:/.nr_spread_over.stddev
872.50 ± 16% -28.4% 624.33 ± 25% sched_debug.cpu.nr_switches.min
82760 -20.6% 65730 ± 13% sched_debug.cpu.sched_goidle.max
9676 ± 91% -90.7% 897.75 ± 3% softirqs.CPU104.RCU
40157 ± 80% +176.3% 110969 ± 25% softirqs.CPU107.TIMER
15639 ± 36% -89.9% 1573 ± 9% softirqs.CPU11.RCU
72883 ± 33% -84.7% 11155 ± 10% softirqs.CPU11.TIMER
2311 ± 2% +258.5% 8287 ±105% softirqs.CPU112.SCHED
9516 ± 13% +63.5% 15558 ± 32% softirqs.CPU121.TIMER
17314 ± 42% -47.0% 9170 ± 6% softirqs.CPU147.TIMER
12408 ± 23% -27.1% 9050 ± 12% softirqs.CPU179.TIMER
9262 ± 22% +43.7% 13310 ± 21% softirqs.CPU27.TIMER
23438 ± 94% +155.8% 59950 ± 56% softirqs.CPU7.TIMER
183705 ± 38% +42.5% 261827 ± 15% interrupts.CPU101.LOC:Local_timer_interrupts
109313 ±102% +174.9% 300490 interrupts.CPU107.LOC:Local_timer_interrupts
205710 ± 50% -98.3% 3401 ± 74% interrupts.CPU11.LOC:Local_timer_interrupts
300.75 ± 25% +342.6% 1331 ±115% interrupts.CPU117.LOC:Local_timer_interrupts
17471 ± 88% -97.1% 505.75 ± 69% interrupts.CPU147.LOC:Local_timer_interrupts
475.25 ± 49% +346.2% 2120 ± 75% interrupts.CPU18.LOC:Local_timer_interrupts
2921 ±100% -91.2% 258.00 ± 45% interrupts.CPU183.LOC:Local_timer_interrupts
305.75 ± 22% +998.1% 3357 ± 93% interrupts.CPU30.LOC:Local_timer_interrupts
311.00 ± 26% +160.7% 810.75 ± 63% interrupts.CPU34.LOC:Local_timer_interrupts
3080 ±141% -89.7% 318.00 ± 7% interrupts.CPU37.LOC:Local_timer_interrupts
276.50 ± 17% +52.4% 421.25 ± 19% interrupts.CPU42.LOC:Local_timer_interrupts
548.25 ± 61% -43.5% 309.50 ± 7% interrupts.CPU44.LOC:Local_timer_interrupts
701.50 ± 59% +476.4% 4043 ± 70% interrupts.CPU56.LOC:Local_timer_interrupts
45593 ±171% +284.7% 175395 ± 67% interrupts.CPU7.LOC:Local_timer_interrupts
6.00 ±163% +637.5% 44.25 ± 94% interrupts.CPU72.RES:Rescheduling_interrupts



***************************************************************************************************
lkp-csl-2ap3: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap3/open2/will-it-scale/0x5002f01

commit:
71d734103e ("fsnotify: Rearrange fast path to minimise overhead when there is no watcher")
c738fbabb0 ("fsnotify: fold fsnotify() call into fsnotify_parent()")

71d734103edfa2b4 c738fbabb0ff62d0f9a9572e56e
---------------- ---------------------------
%stddev %change %stddev
\ | \
26529 -9.8% 23925 will-it-scale.per_process_ops
2546888 -9.8% 2296858 will-it-scale.workload
0.54 ± 4% -0.1 0.43 ± 4% mpstat.cpu.all.soft%
39.86 -4.0% 38.28 ± 2% boot-time.boot
6499 -4.9% 6180 ± 2% boot-time.idle
33149201 ± 6% -27.3% 24087303 ± 17% cpuidle.C1E.time
95442 ± 3% -21.0% 75408 ± 12% cpuidle.C1E.usage
99.34 ± 7% +39.3% 138.40 ± 17% sched_debug.cpu.clock.stddev
0.00 ± 13% +44.7% 0.00 ± 28% sched_debug.cpu.next_balance.stddev
9603 ± 7% -21.7% 7522 sched_debug.cpu.ttwu_count.max
975.84 ± 10% -15.2% 827.50 ± 6% sched_debug.cpu.ttwu_count.stddev
1695 ± 5% -24.4% 1281 ± 8% slabinfo.dmaengine-unmap-16.active_objs
1695 ± 5% -24.4% 1281 ± 8% slabinfo.dmaengine-unmap-16.num_objs
941.25 ± 4% -18.7% 765.00 ± 7% slabinfo.skbuff_fclone_cache.active_objs
941.25 ± 4% -18.7% 765.00 ± 7% slabinfo.skbuff_fclone_cache.num_objs
49736 ± 33% +123.5% 111141 ± 40% numa-meminfo.node1.Active
48772 ± 36% +127.9% 111137 ± 40% numa-meminfo.node1.Active(anon)
23841 ± 68% +183.7% 67649 ± 44% numa-meminfo.node1.AnonHugePages
48154 ± 37% +130.3% 110911 ± 40% numa-meminfo.node1.AnonPages
728587 ± 2% +11.6% 812918 ± 7% numa-meminfo.node1.MemUsed
2750817 ± 2% -13.2% 2388164 ± 3% numa-numastat.node1.local_node
2750948 ± 2% -13.2% 2388260 ± 3% numa-numastat.node1.numa_hit
2757763 ± 3% -12.0% 2428082 ± 4% numa-numastat.node2.local_node
2757943 ± 3% -12.0% 2428249 ± 4% numa-numastat.node2.numa_hit
2761126 -11.7% 2438785 ± 4% numa-numastat.node3.local_node
2761212 -11.7% 2438967 ± 4% numa-numastat.node3.numa_hit
10278834 -11.7% 9073752 proc-vmstat.numa_hit
10278355 -11.7% 9073178 proc-vmstat.numa_local
5810 ± 18% -76.3% 1379 ± 99% proc-vmstat.numa_pages_migrated
30111 ± 26% -90.3% 2912 ±137% proc-vmstat.numa_pte_updates
38829450 -12.3% 34068635 proc-vmstat.pgalloc_normal
1100989 -1.5% 1084378 proc-vmstat.pgfault
38905599 -12.3% 34116999 proc-vmstat.pgfree
5810 ± 18% -76.3% 1379 ± 99% proc-vmstat.pgmigrate_success
3924 ± 46% +900.3% 39258 ± 85% numa-vmstat.node0.numa_other
12197 ± 36% +128.0% 27807 ± 40% numa-vmstat.node1.nr_active_anon
12043 ± 37% +130.4% 27753 ± 40% numa-vmstat.node1.nr_anon_pages
1268 +33.2% 1690 ± 33% numa-vmstat.node1.nr_mapped
12197 ± 36% +128.0% 27807 ± 40% numa-vmstat.node1.nr_zone_active_anon
1832036 ± 4% -10.3% 1642592 ± 6% numa-vmstat.node2.numa_hit
1740980 ± 4% -9.9% 1568287 ± 6% numa-vmstat.node2.numa_local
1822776 ± 5% -9.8% 1643294 ± 5% numa-vmstat.node3.numa_hit
1731472 ± 5% -10.4% 1551925 ± 6% numa-vmstat.node3.numa_local
123.75 ± 40% -46.1% 66.75 ± 5% interrupts.CPU1.RES:Rescheduling_interrupts
135803 ± 7% +15.3% 156593 ± 7% interrupts.CPU107.LOC:Local_timer_interrupts
149908 ± 7% -21.5% 117714 ± 5% interrupts.CPU111.LOC:Local_timer_interrupts
157315 ± 7% -13.8% 135670 ± 13% interrupts.CPU112.LOC:Local_timer_interrupts
485.00 ± 5% +21.4% 589.00 ± 5% interrupts.CPU12.CAL:Function_call_interrupts
94331 ± 38% +58.4% 149451 ± 4% interrupts.CPU124.LOC:Local_timer_interrupts
89739 ± 51% +81.1% 162547 ± 8% interrupts.CPU138.LOC:Local_timer_interrupts
92784 ± 39% +57.9% 146526 ± 15% interrupts.CPU142.LOC:Local_timer_interrupts
139247 ± 12% +20.5% 167753 ± 3% interrupts.CPU147.LOC:Local_timer_interrupts
426.75 ± 6% +40.6% 600.00 ± 29% interrupts.CPU148.CAL:Function_call_interrupts
566.00 ± 16% -22.7% 437.75 ± 10% interrupts.CPU15.CAL:Function_call_interrupts
75.50 ± 22% -30.8% 52.25 ± 13% interrupts.CPU15.RES:Rescheduling_interrupts
121329 ± 20% +30.5% 158362 ± 8% interrupts.CPU155.LOC:Local_timer_interrupts
124618 ± 43% +44.1% 179592 ± 10% interrupts.CPU156.LOC:Local_timer_interrupts
424.25 ± 6% +17.7% 499.50 ± 12% interrupts.CPU16.CAL:Function_call_interrupts
68.75 ± 29% -52.4% 32.75 ± 13% interrupts.CPU162.RES:Rescheduling_interrupts
501.75 ± 10% -16.2% 420.50 ± 2% interrupts.CPU164.CAL:Function_call_interrupts
764.25 ± 27% -45.0% 420.50 ± 3% interrupts.CPU165.CAL:Function_call_interrupts
90.00 ± 35% -65.3% 31.25 ± 19% interrupts.CPU165.RES:Rescheduling_interrupts
133726 ± 36% +32.1% 176637 ± 5% interrupts.CPU167.LOC:Local_timer_interrupts
124249 ± 48% +50.7% 187215 ± 9% interrupts.CPU177.LOC:Local_timer_interrupts
140808 ± 6% +22.0% 171763 ± 9% interrupts.CPU180.LOC:Local_timer_interrupts
145005 ± 9% +33.0% 192841 ± 7% interrupts.CPU186.LOC:Local_timer_interrupts
77.00 ± 69% -54.2% 35.25 ± 5% interrupts.CPU186.RES:Rescheduling_interrupts
462.25 ± 17% +43.9% 665.00 ± 25% interrupts.CPU189.CAL:Function_call_interrupts
416.75 ± 2% +14.2% 475.75 ± 9% interrupts.CPU19.CAL:Function_call_interrupts
144205 ± 7% +24.5% 179585 ± 6% interrupts.CPU190.LOC:Local_timer_interrupts
198576 ± 6% -10.4% 177947 ± 3% interrupts.CPU22.LOC:Local_timer_interrupts
253121 ± 14% -24.9% 190117 ± 5% interrupts.CPU28.LOC:Local_timer_interrupts
241270 ± 15% -26.8% 176499 ± 4% interrupts.CPU42.LOC:Local_timer_interrupts
487.50 ± 14% -14.1% 419.00 interrupts.CPU44.CAL:Function_call_interrupts
204857 -12.1% 180159 ± 4% interrupts.CPU51.LOC:Local_timer_interrupts
212433 ± 13% -15.1% 180387 ± 5% interrupts.CPU54.LOC:Local_timer_interrupts
96.25 ± 13% -25.5% 71.75 ± 9% interrupts.CPU6.RES:Rescheduling_interrupts
192858 ± 4% -10.4% 172746 ± 7% interrupts.CPU72.LOC:Local_timer_interrupts
189260 ± 9% -9.1% 172040 ± 6% interrupts.CPU75.LOC:Local_timer_interrupts
224414 ± 20% -22.4% 174209 ± 4% interrupts.CPU78.LOC:Local_timer_interrupts
212931 ± 22% -27.0% 155382 ± 11% interrupts.CPU81.LOC:Local_timer_interrupts
196798 ± 4% -14.7% 167819 ± 4% interrupts.CPU83.LOC:Local_timer_interrupts
197057 ± 4% -15.3% 166862 ± 10% interrupts.CPU84.LOC:Local_timer_interrupts
191603 ± 4% -21.2% 150944 ± 8% interrupts.CPU90.LOC:Local_timer_interrupts
192164 ± 4% -16.0% 161414 ± 10% interrupts.CPU93.LOC:Local_timer_interrupts
197294 ± 2% -18.8% 160294 ± 7% interrupts.CPU94.LOC:Local_timer_interrupts
83.75 ± 46% -39.1% 51.00 ± 8% interrupts.CPU97.RES:Rescheduling_interrupts
63370 ± 4% +13.1% 71681 ± 7% softirqs.CPU100.TIMER
33919 ± 8% +17.7% 39933 ± 6% softirqs.CPU107.RCU
58170 ± 6% +13.2% 65866 ± 7% softirqs.CPU107.TIMER
35558 ± 9% -21.7% 27835 ± 3% softirqs.CPU111.RCU
63000 ± 6% -17.7% 51841 ± 4% softirqs.CPU111.TIMER
66128 ± 7% -11.8% 58335 ± 11% softirqs.CPU112.TIMER
35138 ± 9% +14.2% 40140 ± 7% softirqs.CPU119.RCU
17612 ± 66% +110.0% 36984 ± 6% softirqs.CPU124.RCU
42791 ± 29% +45.5% 62274 ± 4% softirqs.CPU124.TIMER
20661 ± 58% +60.7% 33193 ± 16% softirqs.CPU130.RCU
20568 ± 58% +95.8% 40282 ± 7% softirqs.CPU138.RCU
40747 ± 40% +63.4% 66563 ± 7% softirqs.CPU138.TIMER
25381 ± 45% +42.0% 36043 ± 13% softirqs.CPU139.RCU
49061 ± 8% +13.5% 55672 softirqs.CPU14.RCU
32504 ± 8% +23.9% 40272 ± 5% softirqs.CPU147.RCU
58341 ± 10% +17.9% 68785 ± 2% softirqs.CPU147.TIMER
49739 ± 9% +17.9% 58621 softirqs.CPU15.RCU
81506 ± 6% +11.4% 90763 softirqs.CPU15.TIMER
26093 ± 24% +47.4% 38461 ± 13% softirqs.CPU155.RCU
52387 ± 17% +24.4% 65149 ± 7% softirqs.CPU155.TIMER
53209 ± 36% +36.8% 72800 ± 8% softirqs.CPU156.TIMER
7548 ± 25% -33.1% 5046 ± 4% softirqs.CPU162.SCHED
30310 ± 53% +42.8% 43295 ± 6% softirqs.CPU167.RCU
35461 ± 10% +17.5% 41681 ± 9% softirqs.CPU168.RCU
27019 ± 54% +50.6% 40701 ± 4% softirqs.CPU174.RCU
29670 ± 52% +57.1% 46613 ± 11% softirqs.CPU177.RCU
52974 ± 41% +42.8% 75671 ± 8% softirqs.CPU177.TIMER
34091 ± 9% +25.2% 42690 ± 7% softirqs.CPU179.RCU
58482 ± 10% +21.2% 70860 ± 6% softirqs.CPU179.TIMER
33463 ± 7% +30.2% 43553 ± 10% softirqs.CPU180.RCU
58849 ± 6% +20.2% 70753 ± 8% softirqs.CPU180.TIMER
35205 ± 8% +35.7% 47765 ± 7% softirqs.CPU186.RCU
60079 ± 7% +28.7% 77301 ± 6% softirqs.CPU186.TIMER
36328 ± 9% +18.0% 42852 ± 10% softirqs.CPU188.RCU
62892 ± 8% +11.7% 70261 ± 8% softirqs.CPU188.TIMER
33741 ± 4% +33.2% 44938 ± 7% softirqs.CPU190.RCU
60176 ± 5% +20.7% 72657 ± 6% softirqs.CPU190.TIMER
41064 ± 7% +18.1% 48502 ± 14% softirqs.CPU191.RCU
81015 ± 5% -8.0% 74547 ± 3% softirqs.CPU22.TIMER
99890 ± 12% -22.1% 77778 ± 4% softirqs.CPU28.TIMER
6830 ± 41% -32.4% 4619 ± 12% softirqs.CPU39.SCHED
95663 ± 13% -24.0% 72722 ± 3% softirqs.CPU42.TIMER
51773 ± 5% -13.2% 44920 ± 4% softirqs.CPU51.RCU
85532 ± 11% -13.5% 73979 ± 5% softirqs.CPU54.TIMER
78232 ± 4% -8.2% 71835 ± 6% softirqs.CPU72.TIMER
77241 ± 7% -7.6% 71394 ± 5% softirqs.CPU75.TIMER
89520 ± 18% -18.8% 72701 ± 3% softirqs.CPU78.TIMER
85226 ± 20% -23.8% 64917 ± 10% softirqs.CPU81.TIMER
49438 ± 6% -14.8% 42097 ± 5% softirqs.CPU83.RCU
79490 ± 4% -12.8% 69335 ± 3% softirqs.CPU83.TIMER
50505 ± 4% -18.1% 41342 ± 11% softirqs.CPU84.RCU
79488 ± 3% -13.3% 68951 ± 9% softirqs.CPU84.TIMER
48600 ± 5% -24.5% 36673 ± 9% softirqs.CPU90.RCU
77399 ± 3% -18.2% 63329 ± 7% softirqs.CPU90.TIMER
47854 ± 7% -15.6% 40405 ± 9% softirqs.CPU93.RCU
77855 ± 4% -13.3% 67512 ± 8% softirqs.CPU93.TIMER
49558 ± 4% -20.5% 39403 ± 9% softirqs.CPU94.RCU
80364 -17.2% 66551 ± 6% softirqs.CPU94.TIMER
22283 ± 26% -78.3% 4842 ±122% softirqs.NET_RX





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


Attachments:
(No filename) (28.17 kB)
config-5.8.0-rc4-00085-gc738fbabb0ff6 (160.83 kB)
job-script (7.50 kB)
job.yaml (5.12 kB)
reproduce (347.00 B)

2020-07-21 16:00:19

by Amir Goldstein

Subject: Re: [fsnotify] c738fbabb0: will-it-scale.per_process_ops -9.5% regression

On Tue, Jul 21, 2020 at 3:15 AM kernel test robot <[email protected]> wrote:
>
> Greeting,
>
> FYI, we noticed a -9.5% regression of will-it-scale.per_process_ops due to commit:
>
>
> commit: c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")

Strange, that's a pretty dumb patch moving some inlined code from one
function to another (assuming there are no fsnotify marks in this test).

Unless I am missing something, the only thing that changes slightly is
an extra d_inode(file->f_path.dentry) dereference.
I can get rid of it.

Is it possible to ask for a re-test with the fix patch (attached)?

Thanks,
Amir.


Attachments:
fsnotify-pass-inode-to-fsnotify_parent.patch.txt (2.05 kB)

2020-07-24 02:48:41

by Chen, Rong A

Subject: Re: [fsnotify] c738fbabb0: will-it-scale.per_process_ops -9.5% regression



On 7/21/20 11:59 PM, Amir Goldstein wrote:
> On Tue, Jul 21, 2020 at 3:15 AM kernel test robot <[email protected]> wrote:
>> Greeting,
>>
>> FYI, we noticed a -9.5% regression of will-it-scale.per_process_ops due to commit:
>>
>>
>> commit: c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")
> Strange, that's a pretty dumb patch moving some inlined code from one
> function to another (assuming there are no fsnotify marks in this test).
>
> Unless I am missing something, the only thing that changes slightly is
> an extra d_inode(file->f_path.dentry) dereference.
> I can get rid of it.
>
> Is it possible to ask for a re-test with fix patch (attached)?

Hi Amir,

We failed to apply this patch; could you tell us the base commit or the
base branch?

Best Regards,
Rong Chen

2020-07-24 03:48:40

by Amir Goldstein

Subject: Re: [fsnotify] c738fbabb0: will-it-scale.per_process_ops -9.5% regression

On Fri, Jul 24, 2020 at 5:45 AM Rong Chen <[email protected]> wrote:
>
>
>
> On 7/21/20 11:59 PM, Amir Goldstein wrote:
> > On Tue, Jul 21, 2020 at 3:15 AM kernel test robot <[email protected]> wrote:
> >> Greeting,
> >>
> >> FYI, we noticed a -9.5% regression of will-it-scale.per_process_ops due to commit:
> >>
> >>
> >> commit: c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")
> > Strange, that's a pretty dumb patch moving some inlined code from one
> > function to another (assuming there are no fsnotify marks in this test).
> >
> > Unless I am missing something, the only thing that changes slightly is
> > an extra d_inode(file->f_path.dentry) dereference.
> > I can get rid of it.
> >
> > Is it possible to ask for a re-test with fix patch (attached)?
>
> Hi Amir,
>
> We failed to apply this patch, could you tell us the base commit or the
> base branch?
>

Hi Rong,

The patch is applied on top of the reported offending commit:
c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")

I pushed it to my github:
https://github.com/amir73il/linux/commits/for_lkp

Thanks,
Amir.

2020-07-26 11:54:27

by Amir Goldstein

Subject: Re: [fsnotify] c738fbabb0: will-it-scale.per_process_ops -9.5% regression

On Fri, Jul 24, 2020 at 6:47 AM Amir Goldstein <[email protected]> wrote:
>
> On Fri, Jul 24, 2020 at 5:45 AM Rong Chen <[email protected]> wrote:
> >
> >
> >
> > On 7/21/20 11:59 PM, Amir Goldstein wrote:
> > > On Tue, Jul 21, 2020 at 3:15 AM kernel test robot <[email protected]> wrote:
> > >> Greeting,
> > >>
> > >> FYI, we noticed a -9.5% regression of will-it-scale.per_process_ops due to commit:
> > >>
> > >>
> > >> commit: c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")
> > > Strange, that's a pretty dumb patch moving some inlined code from one
> > > function to another (assuming there are no fsnotify marks in this test).
> > >
> > > Unless I am missing something, the only thing that changes slightly is
> > > an extra d_inode(file->f_path.dentry) dereference.
> > > I can get rid of it.
> > >
> > > Is it possible to ask for a re-test with fix patch (attached)?
> >
> > Hi Amir,
> >
> > We failed to apply this patch, could you tell us the base commit or the
> > base branch?
> >
>
> Hi Rong,
>
> The patch is applied on top of the reported offending commit:
> c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")
>
> I pushed it to my github:
> https://github.com/amir73il/linux/commits/for_lkp
>

FWIW, I tried reproducing the reported regression on a local machine.

I ran the test twice on each of the branch commits:

26dc3d2bff62 fsnotify: pass inode to fsnotify_parent()
c738fbabb0ff fsnotify: fold fsnotify() call into fsnotify_parent()
71d734103edf fsnotify: Rearrange fast path to minimise overhead when there is no watcher
47aaabdedf36 fanotify: Avoid softlockups when reading many events

Not only did I not observe a regression with the reported commit,
but there was a slight improvement. And then there was yet another
improvement with the fix commit on top of it.

But it could be that I am doing something wrong, because I have zero
mileage with LKP.

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/defconfig/process/16/ubuntu/amir-lkp/open1/will-it-scale

commit:
47aaabdedf366ac5894c7fddec388832f0d8193e
71d734103edfa2b4c6657578a3082ee0e51d767e
c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a
26dc3d2bff623768cbbd0c8053ddd6390fd828d2

47aaabdedf366ac5  71d734103edfa2b4c6657578a30  c738fbabb0ff62d0f9a9572e56e  26dc3d2bff623768cbbd0c8053d
----------------  ---------------------------  ---------------------------  ---------------------------
fail:runs         %reproduction fail:runs      %reproduction fail:runs      %reproduction fail:runs
     45:2         -555%              34:2      -807%              29:2      -996%              25:2     dmesg.timestamp:last
     45:2         -555%              34:2      -807%              29:2      -996%              25:2     kmsg.timestamp:last
  %stddev         %change %stddev             %change %stddev             %change %stddev
  1097404         +1.7%    1116452            +2.7%    1126533            +3.5%    1135663             will-it-scale.16.processes
     0.02 ± 60%   +20.0%      0.03 ± 66%      +20.0%      0.03 ± 66%      -20.0%      0.02 ± 50%       will-it-scale.16.processes_idle
    68587         +1.7%      69778            +2.7%      70408            +3.5%      70978             will-it-scale.per_process_ops


Thanks,
Amir.

2020-07-27 02:09:17

by Xing Zhengjun

Subject: Re: [LKP] Re: [fsnotify] c738fbabb0: will-it-scale.per_process_ops -9.5% regression



On 7/24/2020 10:44 AM, Rong Chen wrote:
>
>
> On 7/21/20 11:59 PM, Amir Goldstein wrote:
>> On Tue, Jul 21, 2020 at 3:15 AM kernel test robot
>> <[email protected]> wrote:
>>> Greeting,
>>>
>>> FYI, we noticed a -9.5% regression of will-it-scale.per_process_ops
>>> due to commit:
>>>
>>>
>>> commit: c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold
>>> fsnotify() call into fsnotify_parent()")
>> Strange, that's a pretty dumb patch moving some inlined code from one
>> function to another (assuming there are no fsnotify marks in this test).
>>
>> Unless I am missing something, the only thing that changes slightly is
>> an extra d_inode(file->f_path.dentry) dereference.
>> I can get rid of it.
>>
>> Is it possible to ask for a re-test with fix patch (attached)?
>

I applied the fix patch, and the regression still exists.
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor/ucode:
lkp-csl-2ap2/will-it-scale/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/16/process/open1/performance/0x5002f01

commit:
71d734103edfa2b4c6657578a3082ee0e51d767e
c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a
5c32fe90f2a57e7c4da06be51f705aec6affceb6 (the base commit the fix patch applies on top of)
7f66797f773621d0ef6718df0ef2cf849814d114 (the fix patch)

71d734103edfa2b4  c738fbabb0ff62d0f9a9572e56e  5c32fe90f2a57e7c4da06be51f7  7f66797f773621d0ef6718df0ef
----------------  ---------------------------  ---------------------------  ---------------------------
  %stddev         %change %stddev             %change %stddev             %change %stddev
   229940         -9.8%     207333            -13.0%    199996            -11.7%    202927             will-it-scale.per_process_ops
  3679048         -9.8%    3317347            -13.0%   3199942            -11.7%   3246851             will-it-scale.workload



> Hi Amir,
>
> We failed to apply this patch, could you tell us the base commit or the
> base branch?
>
> Best Regards,
> Rong Chen
> _______________________________________________
> LKP mailing list -- [email protected]
> To unsubscribe send an email to [email protected]

--
Zhengjun Xing

2020-07-27 11:07:30

by Jan Kara

Subject: Re: [fsnotify] c738fbabb0: will-it-scale.per_process_ops -9.5% regression

On Sun 26-07-20 14:52:47, Amir Goldstein wrote:
> On Fri, Jul 24, 2020 at 6:47 AM Amir Goldstein <[email protected]> wrote:
> >
> > On Fri, Jul 24, 2020 at 5:45 AM Rong Chen <[email protected]> wrote:
> > >
> > >
> > >
> > > On 7/21/20 11:59 PM, Amir Goldstein wrote:
> > > > On Tue, Jul 21, 2020 at 3:15 AM kernel test robot <[email protected]> wrote:
> > > >> Greeting,
> > > >>
> > > >> FYI, we noticed a -9.5% regression of will-it-scale.per_process_ops due to commit:
> > > >>
> > > >>
> > > >> commit: c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")
> > > > Strange, that's a pretty dumb patch moving some inlined code from one
> > > > function to another (assuming there are no fsnotify marks in this test).
> > > >
> > > > Unless I am missing something, the only thing that changes slightly is
> > > > an extra d_inode(file->f_path.dentry) dereference.
> > > > I can get rid of it.
> > > >
> > > > Is it possible to ask for a re-test with fix patch (attached)?
> > >
> > > Hi Amir,
> > >
> > > We failed to apply this patch, could you tell us the base commit or the
> > > base branch?
> > >
> >
> > Hi Rong,
> >
> > The patch is applied on top of the reported offending commit:
> c738fbabb0ff62d0f9a9572e56e65d05a1b34c6a ("fsnotify: fold fsnotify() call into fsnotify_parent()")
> >
> > I pushed it to my github:
> > https://github.com/amir73il/linux/commits/for_lkp
> >
>
> FWIW, I tried reproducing the reported regression on a local machine.
>
> I ran the test twice on each of the branch commits:
>
> 26dc3d2bff62 fsnotify: pass inode to fsnotify_parent()
> c738fbabb0ff fsnotify: fold fsnotify() call into fsnotify_parent()
> 71d734103edf fsnotify: Rearrange fast path to minimise overhead when there is no watcher
> 47aaabdedf36 fanotify: Avoid softlockups when reading many events
>
> Not only did I not observe a regression with the reported commit,
> but there was a slight improvement. And then there was yet another
> improvement with the fix commit on top of it.

I suspect this may be closely related to code generation, code cacheline
alignment, etc., and thus depends heavily on the particular compiler version
and CPU. I've checked the commit myself and I agree it looks innocent, so
for these reasons I'm not particularly worried about this regression.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR