2021-02-19 13:05:11

by Valentin Schneider

Subject: [PATCH v2 0/7] sched/fair: misfit task load-balance tweaks

Hi folks,

Here is this year's series of misfit changes. On the menu:

o Patch 1 prevents pcpu kworkers from causing group_imbalanced
o Patch 2 is an independent active balance cleanup
o Patch 3 adds some more sched_asym_cpucapacity static branches
o Patch 4 introduces yet another margin for capacity to capacity
comparisons
o Patches 5-6 build on top of patch 4 and change capacity comparisons
throughout misfit load balancing
o Patch 7 aligns running and non-running misfit task cache hotness
considerations

IMO the somewhat controversial bit is patch 4, because it attempts to solve
margin issues by... Adding another margin. This does solve issues on
existing platforms (e.g. Pixel4), but we'll be back to square one the day
some "clever" folks spin a platform with two different CPU capacities less
than 5% apart.
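
For the record, the new helper in patch 4 considers cap1 "noticeably greater"
than cap2 only when cap1 * 1024 > cap2 * 1078, i.e. when the two are more
than ~5% apart. A hypothetical example with made-up capacities of 1024 and
980 (~4.5% apart):

  capacity_greater(1024, 980): 1024 * 1024 = 1048576 > 980 * 1078 = 1056440 -> false

so on such a platform, misfit upmigration from the 980-capacity CPUs would
once again be filtered out.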

This is based on top of today's tip/sched/core at:

c5e6fc08feb2 ("sched,x86: Allow !PREEMPT_DYNAMIC")

Testing
=======

I ran my usual [1] misfit tests on
o TC2
o Juno
o HiKey960
o Dragonboard845C
o RB5

RB5 has a similar topology to Pixel4 and highlights the problem of having
two different CPU capacity values above 819 (in this case 871 and 1024):
without these patches, CPU hogs (i.e. misfit tasks) running on the "medium"
CPUs will never be upmigrated to a "big" via misfit balance.

Revisions
=========

v1 -> v2
--------

o Collected Reviewed-by
o Minor comment and code cleanups

o Consolidated static key vs SD flag explanation (Dietmar)

Note to Vincent: I didn't measure the impact of adding said static key to
load_balance(); I do however believe it is a low hanging fruit. The
wrapper keeps things neat and tidy, and is also helpful for documenting
the intricacies of the static key status vs the presence of the SD flag
in a CPU's sched_domain hierarchy.

o Removed v1 patch 4 - root_domain.max_cpu_capacity is absolutely not what
I had convinced myself it was.
o Squashed capacity margin usage with removal of
group_smaller_{min, max}_capacity() (Vincent)
o Replaced v1 patch 7 with Lingutla's can_migrate_task() patch [2]
o Rewrote task_hot() modification changelog

Links
=====

[1]: https://lisa-linux-integrated-system-analysis.readthedocs.io/en/master/kernel_tests.html#lisa.tests.scheduler.misfit.StaggeredFinishes
[2]: http://lore.kernel.org/r/[email protected]

Cheers,
Valentin

Lingutla Chandrasekhar (1):
sched/fair: Ignore percpu threads for imbalance pulls

Valentin Schneider (6):
sched/fair: Clean up active balance nr_balance_failed trickery
sched/fair: Add more sched_asym_cpucapacity static branch checks
sched/fair: Introduce a CPU capacity comparison helper
sched/fair: Employ capacity_greater() throughout load_balance()
sched/fair: Filter out locally-unsolvable misfit imbalances
sched/fair: Relax task_hot() for misfit tasks

kernel/sched/fair.c | 128 ++++++++++++++++++++++++-------------------
kernel/sched/sched.h | 33 +++++++++++
2 files changed, 105 insertions(+), 56 deletions(-)

--
2.27.0


2021-02-19 13:05:38

by Valentin Schneider

Subject: [PATCH v2 1/7] sched/fair: Ignore percpu threads for imbalance pulls

From: Lingutla Chandrasekhar <[email protected]>

During load balancing, when the balancing group is unable to pull a task
from the busy group due to ->cpus_ptr constraints, it sets LBF_SOME_PINNED
in the lb env flags. As a consequence, sgc->imbalance is set at the parent
domain level, which causes the group to be classified as imbalanced so that
it can get help from another balancing CPU.

Consider a 4-CPU big.LITTLE system with CPUs 0-1 as LITTLEs and
CPUs 2-3 as Bigs, with the following scenario:
- CPU0 doing newly_idle balancing
- CPU1 running percpu kworker and RT task (small tasks)
- CPU2 running 2 big tasks
- CPU3 running 1 medium task

While CPU0 is doing a newly_idle load balance at MC level, it fails to
pull the percpu kworker from CPU1, sets LBF_SOME_PINNED in the lb env flags
and sets sgc->imbalance at the DIE level domain. As LBF_ALL_PINNED is not
cleared, it tries to redo the balancing after clearing CPU1 from the env
cpus, but it doesn't find another busiest_group, so CPU0 stops balancing at
MC level without clearing 'sgc->imbalance' and restarts the load balancing
at DIE level.

CPU0 (the balancing CPU) then finds the LITTLEs' group as busiest_group
with group type imbalanced; the Bigs' group, being classified at a level
below the imbalanced type, is ignored when picking the busiest group, and
the balancing is aborted without pulling any tasks (by that time, CPU1
might not have any running tasks).

Classifying the group as imbalanced because of percpu threads is a
suboptimal decision, so don't set LBF_SOME_PINNED for per-CPU kthreads.

Signed-off-by: Lingutla Chandrasekhar <[email protected]>
[Use kthread_is_per_cpu() rather than p->nr_cpus_allowed]
Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/fair.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8a8bd7b13634..2d4dcf1a3372 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7539,6 +7539,10 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
return 0;

+ /* Disregard pcpu kthreads; they are where they need to be. */
+ if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
+ return 0;
+
if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
int cpu;

--
2.27.0

2021-02-19 13:06:27

by Valentin Schneider

Subject: [PATCH v2 4/7] sched/fair: Introduce a CPU capacity comparison helper

During load-balance, groups classified as group_misfit_task are filtered
out if they do not pass

group_smaller_max_cpu_capacity(<candidate group>, <local group>);

which itself employs fits_capacity() to compare the sgc->max_capacity of
both groups.

Due to the underlying margin, fits_capacity(X, 1024) will return false for
any X > 819. Tough luck, the capacity_orig's on e.g. the Pixel 4 are
{261, 871, 1024}. If a CPU-bound task ends up on one of those "medium"
CPUs, misfit migration will never intentionally upmigrate it to a CPU of
higher capacity due to the aforementioned margin.
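
To put numbers on the cutoff (this is just the existing macro's arithmetic):

  fits_capacity(cap, max) := cap * 1280 < max * 1024

  fits_capacity(819, 1024): 819 * 1280 = 1048320 < 1048576 -> true
  fits_capacity(820, 1024): 820 * 1280 = 1049600 < 1048576 -> false
  fits_capacity(871, 1024): 871 * 1280 = 1114880 < 1048576 -> false

IOW anything above 1024 * 1024 / 1280 = 819.2 fails the check, so the
Pixel 4's mediums (871) are never seen as "smaller" than its bigs.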

One may argue the 20% margin of fits_capacity() is excessive with the advent
of counter-enhanced load tracking (APERF/MPERF, AMUs), but one point here
is that fits_capacity() is meant to compare a utilization value to a
capacity value, whereas here it is being used to compare two capacity
values. As CPU capacity and task utilization have different dynamics, a
sensible approach here would be to add a new helper dedicated to comparing
CPU capacities.

Reviewed-by: Qais Yousef <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/fair.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 24119f9ad191..cc16d0e0b9fb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -113,6 +113,13 @@ int __weak arch_asym_cpu_priority(int cpu)
*/
#define fits_capacity(cap, max) ((cap) * 1280 < (max) * 1024)

+/*
+ * The margin used when comparing CPU capacities.
+ * is 'cap1' noticeably greater than 'cap2'
+ *
+ * (default: ~5%)
+ */
+#define capacity_greater(cap1, cap2) ((cap1) * 1024 > (cap2) * 1078)
#endif

#ifdef CONFIG_CFS_BANDWIDTH
--
2.27.0

2021-02-19 13:06:37

by Valentin Schneider

Subject: [PATCH v2 2/7] sched/fair: Clean up active balance nr_balance_failed trickery

When triggering an active load balance, sd->nr_balance_failed is set to
such a value that any further can_migrate_task() using said sd will ignore
the output of task_hot().

This behaviour makes sense, as active load balance intentionally preempts a
rq's running task to migrate it right away, but this asynchronous write is
a bit shoddy, as the stopper thread might run active_load_balance_cpu_stop
before the sd->nr_balance_failed write either becomes visible to the
stopper's CPU or even happens on the CPU that appended the stopper work.

Add a struct lb_env flag to denote active balancing, and use it in
can_migrate_task(). Remove the sd->nr_balance_failed write that served the
same purpose.
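
For reference, the write being removed (see the hunk below) sets
sd->nr_balance_failed to sd->cache_nice_tries + 1, which is exactly what it
takes to trip the existing check in can_migrate_task() (paraphrased here):

	if (tsk_cache_hot <= 0 ||
	    env->sd->nr_balance_failed > env->sd->cache_nice_tries)
		return 1;

The new LBF_ACTIVE_LB flag makes that intent explicit instead.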

Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/fair.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2d4dcf1a3372..535ebc31c9a8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7394,6 +7394,7 @@ enum migration_type {
#define LBF_SOME_PINNED 0x08
#define LBF_NOHZ_STATS 0x10
#define LBF_NOHZ_AGAIN 0x20
+#define LBF_ACTIVE_LB 0x40

struct lb_env {
struct sched_domain *sd;
@@ -7583,10 +7584,14 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)

/*
* Aggressive migration if:
- * 1) destination numa is preferred
- * 2) task is cache cold, or
- * 3) too many balance attempts have failed.
+ * 1) active balance
+ * 2) destination numa is preferred
+ * 3) task is cache cold, or
+ * 4) too many balance attempts have failed.
*/
+ if (env->flags & LBF_ACTIVE_LB)
+ return 1;
+
tsk_cache_hot = migrate_degrades_locality(p, env);
if (tsk_cache_hot == -1)
tsk_cache_hot = task_hot(p, env);
@@ -9780,9 +9785,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
active_load_balance_cpu_stop, busiest,
&busiest->active_balance_work);
}
-
- /* We've kicked active balancing, force task migration. */
- sd->nr_balance_failed = sd->cache_nice_tries+1;
}
} else {
sd->nr_balance_failed = 0;
@@ -9938,7 +9940,8 @@ static int active_load_balance_cpu_stop(void *data)
* @dst_grpmask we need to make that test go away with lying
* about DST_PINNED.
*/
- .flags = LBF_DST_PINNED,
+ .flags = LBF_DST_PINNED |
+ LBF_ACTIVE_LB,
};

schedstat_inc(sd->alb_count);
--
2.27.0

2021-02-19 13:07:10

by Valentin Schneider

Subject: [PATCH v2 3/7] sched/fair: Add more sched_asym_cpucapacity static branch checks

Rik noted a while back that a handful of

sd->flags & SD_ASYM_CPUCAPACITY

& family in the CFS load-balancer code aren't guarded by the
sched_asym_cpucapacity static branch.

Turning those checks into NOPs for those who don't need it is fairly
straightforward, and hiding it in a helper leaves the code size unchanged in
all but one spot. It also gives us a place to document the differences between
checking the static key and checking the SD flag.

Suggested-by: Rik van Riel <[email protected]>
Reviewed-by: Qais Yousef <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/fair.c | 21 ++++++++-------------
kernel/sched/sched.h | 33 +++++++++++++++++++++++++++++++++
2 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 535ebc31c9a8..24119f9ad191 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6288,15 +6288,8 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
* sd_asym_cpucapacity rather than sd_llc.
*/
if (static_branch_unlikely(&sched_asym_cpucapacity)) {
+ /* See sd_has_asym_cpucapacity() */
sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, target));
- /*
- * On an asymmetric CPU capacity system where an exclusive
- * cpuset defines a symmetric island (i.e. one unique
- * capacity_orig value through the cpuset), the key will be set
- * but the CPUs within that cpuset will not have a domain with
- * SD_ASYM_CPUCAPACITY. These should follow the usual symmetric
- * capacity path.
- */
if (sd) {
i = select_idle_capacity(p, sd, target);
return ((unsigned)i < nr_cpumask_bits) ? i : target;
@@ -8440,7 +8433,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
continue;

/* Check for a misfit task on the cpu */
- if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
+ if (sd_has_asym_cpucapacity(env->sd) &&
sgs->group_misfit_task_load < rq->misfit_task_load) {
sgs->group_misfit_task_load = rq->misfit_task_load;
*sg_status |= SG_OVERLOAD;
@@ -8497,7 +8490,8 @@ static bool update_sd_pick_busiest(struct lb_env *env,
* CPUs in the group should either be possible to resolve
* internally or be covered by avg_load imbalance (eventually).
*/
- if (sgs->group_type == group_misfit_task &&
+ if (static_branch_unlikely(&sched_asym_cpucapacity) &&
+ sgs->group_type == group_misfit_task &&
(!group_smaller_max_cpu_capacity(sg, sds->local) ||
sds->local_stat.group_type != group_has_spare))
return false;
@@ -8580,7 +8574,7 @@ static bool update_sd_pick_busiest(struct lb_env *env,
* throughput. Maximize throughput, power/energy consequences are not
* considered.
*/
- if ((env->sd->flags & SD_ASYM_CPUCAPACITY) &&
+ if (sd_has_asym_cpucapacity(env->sd) &&
(sgs->group_type <= group_fully_busy) &&
(group_smaller_min_cpu_capacity(sds->local, sg)))
return false;
@@ -8703,7 +8697,7 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
}

/* Check if task fits in the group */
- if (sd->flags & SD_ASYM_CPUCAPACITY &&
+ if (sd_has_asym_cpucapacity(sd) &&
!task_fits_capacity(p, group->sgc->max_capacity)) {
sgs->group_misfit_task_load = 1;
}
@@ -9394,7 +9388,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
* Higher per-CPU capacity is considered better than balancing
* average load.
*/
- if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
+ if (sd_has_asym_cpucapacity(env->sd) &&
capacity_of(env->dst_cpu) < capacity &&
nr_running == 1)
continue;
@@ -10224,6 +10218,7 @@ static void nohz_balancer_kick(struct rq *rq)
}
}

+ /* See sd_has_asym_cpucapacity(). */
sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
if (sd) {
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 10a1522b1e30..a447b3f28792 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1484,6 +1484,39 @@ DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
extern struct static_key_false sched_asym_cpucapacity;

+/*
+ * Note that the static key is system-wide, but the visibility of
+ * SD_ASYM_CPUCAPACITY isn't. Thus the static key being enabled does not
+ * imply all CPUs can see asymmetry.
+ *
+ * Consider an asymmetric CPU capacity system such as:
+ *
+ * MC [ ]
+ * 0 1 2 3 4 5
+ * L L L L B B
+ *
+ * w/ arch_scale_cpu_capacity(L) < arch_scale_cpu_capacity(B)
+ *
+ * By default, booting this system will enable the sched_asym_cpucapacity
+ * static key, and all CPUs will see SD_ASYM_CPUCAPACITY set at their MC
+ * sched_domain.
+ *
+ * Further consider exclusive cpusets creating a "symmetric island":
+ *
+ * MC [ ][ ]
+ * 0 1 2 3 4 5
+ * L L L L B B
+ *
+ * Again, booting this will enable the static key, but CPUs 0-1 will *not* have
+ * SD_ASYM_CPUCAPACITY set in any of their sched_domain. This is the intended
+ * behaviour, as CPUs 0-1 should be treated as a regular, isolated SMP system.
+ */
+static inline bool sd_has_asym_cpucapacity(struct sched_domain *sd)
+{
+ return static_branch_unlikely(&sched_asym_cpucapacity) &&
+ sd->flags & SD_ASYM_CPUCAPACITY;
+}
+
struct sched_group_capacity {
atomic_t ref;
/*
--
2.27.0

2021-02-19 13:07:16

by Valentin Schneider

Subject: [PATCH v2 5/7] sched/fair: Employ capacity_greater() throughout load_balance()

Employ the new capacity_greater() helper for the capacity comparisons in
load_balance(). While at it, replace group_smaller_{min, max}_cpu_capacity()
with comparisons of the source group's min/max capacity and the destination
CPU's capacity.
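
In other words, the group-vs-group comparisons become comparisons against
the destination CPU (summarizing the hunks below):

  group_smaller_max_cpu_capacity(sg, sds->local)
    -> capacity_greater(capacity_of(env->dst_cpu), sg->sgc->max_capacity)

  group_smaller_min_cpu_capacity(sds->local, sg)
    -> capacity_greater(sg->sgc->min_capacity, capacity_of(env->dst_cpu))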

Reviewed-by: Qais Yousef <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/fair.c | 33 ++++-----------------------------
1 file changed, 4 insertions(+), 29 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cc16d0e0b9fb..af5ce083c982 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8320,26 +8320,6 @@ group_is_overloaded(unsigned int imbalance_pct, struct sg_lb_stats *sgs)
return false;
}

-/*
- * group_smaller_min_cpu_capacity: Returns true if sched_group sg has smaller
- * per-CPU capacity than sched_group ref.
- */
-static inline bool
-group_smaller_min_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
-{
- return fits_capacity(sg->sgc->min_capacity, ref->sgc->min_capacity);
-}
-
-/*
- * group_smaller_max_cpu_capacity: Returns true if sched_group sg has smaller
- * per-CPU capacity_orig than sched_group ref.
- */
-static inline bool
-group_smaller_max_cpu_capacity(struct sched_group *sg, struct sched_group *ref)
-{
- return fits_capacity(sg->sgc->max_capacity, ref->sgc->max_capacity);
-}
-
static inline enum
group_type group_classify(unsigned int imbalance_pct,
struct sched_group *group,
@@ -8491,15 +8471,10 @@ static bool update_sd_pick_busiest(struct lb_env *env,
if (!sgs->sum_h_nr_running)
return false;

- /*
- * Don't try to pull misfit tasks we can't help.
- * We can use max_capacity here as reduction in capacity on some
- * CPUs in the group should either be possible to resolve
- * internally or be covered by avg_load imbalance (eventually).
- */
+ /* Don't try to pull misfit tasks we can't help */
if (static_branch_unlikely(&sched_asym_cpucapacity) &&
sgs->group_type == group_misfit_task &&
- (!group_smaller_max_cpu_capacity(sg, sds->local) ||
+ (!capacity_greater(capacity_of(env->dst_cpu), sg->sgc->max_capacity) ||
sds->local_stat.group_type != group_has_spare))
return false;

@@ -8583,7 +8558,7 @@ static bool update_sd_pick_busiest(struct lb_env *env,
*/
if (sd_has_asym_cpucapacity(env->sd) &&
(sgs->group_type <= group_fully_busy) &&
- (group_smaller_min_cpu_capacity(sds->local, sg)))
+ (capacity_greater(sg->sgc->min_capacity, capacity_of(env->dst_cpu))))
return false;

return true;
@@ -9396,7 +9371,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
* average load.
*/
if (sd_has_asym_cpucapacity(env->sd) &&
- capacity_of(env->dst_cpu) < capacity &&
+ !capacity_greater(capacity_of(env->dst_cpu), capacity) &&
nr_running == 1)
continue;

--
2.27.0

2021-02-19 13:07:37

by Valentin Schneider

Subject: [PATCH v2 6/7] sched/fair: Filter out locally-unsolvable misfit imbalances

Consider the following (hypothetical) asymmetric CPU capacity topology,
with some amount of capacity pressure (RT | DL | IRQ | thermal):

DIE [ ]
MC [ ][ ]
0 1 2 3

| CPU | capacity_orig | capacity |
|-----+---------------+----------|
| 0 | 870 | 860 |
| 1 | 870 | 600 |
| 2 | 1024 | 850 |
| 3 | 1024 | 860 |

If CPU1 has a misfit task, then CPU0, CPU2 and CPU3 are valid candidates to
grant the task an uplift in CPU capacity. Consider CPU0 and CPU3 to be
sufficiently busy, i.e. they don't have enough spare capacity to accommodate
CPU1's misfit task. It would then fall on CPU2 to pull the task.

This currently won't happen, because CPU2 will fail

capacity_greater(capacity_of(CPU2), sg->sgc->max_capacity)

in update_sd_pick_busiest(), where 'sg' is the [0, 1] group at DIE
level. In this case, the max_capacity is that of CPU0, which is at this
point in time greater than that of CPU2. This comparison doesn't make
much sense, given that the only CPUs we should care about in this scenario
are CPU1 (the CPU with the misfit task) and CPU2 (the load-balance
destination CPU).
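
Plugging the above numbers in:

  capacity_greater(capacity_of(CPU2), sg->sgc->max_capacity)
    = capacity_greater(850, 860): 850 * 1024 = 870400 > 860 * 1078 = 927080 -> false

so the group gets filtered out, even though the comparison that actually
matters here would pass:

  capacity_greater(capacity_of(CPU2), capacity_of(CPU1))
    = capacity_greater(850, 600): 850 * 1024 = 870400 > 600 * 1078 = 646800 -> true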

Aggregate a misfit task's load into sgs->group_misfit_task_load only if
env->dst_cpu would grant it a capacity uplift. Separately track whether a
sched_group contains a misfit task to still classify it as
group_misfit_task and not pick it as busiest group when pulling from a
lower-capacity CPU (which is the current behaviour and prevents
down-migration).

Since find_busiest_queue() can now iterate over CPUs with a higher capacity
than the local CPU's, add a capacity check there.

Reviewed-by: Qais Yousef <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/fair.c | 39 ++++++++++++++++++++++++++++++---------
1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index af5ce083c982..ee172b384e29 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5747,6 +5747,12 @@ static unsigned long capacity_of(int cpu)
return cpu_rq(cpu)->cpu_capacity;
}

+/* Is CPU a's capacity noticeably greater than CPU b's? */
+static inline bool cpu_capacity_greater(int a, int b)
+{
+ return capacity_greater(capacity_of(a), capacity_of(b));
+}
+
static void record_wakee(struct task_struct *p)
{
/*
@@ -8061,7 +8067,8 @@ struct sg_lb_stats {
unsigned int group_weight;
enum group_type group_type;
unsigned int group_asym_packing; /* Tasks should be moved to preferred CPU */
- unsigned long group_misfit_task_load; /* A CPU has a task too big for its capacity */
+ unsigned long group_misfit_task_load; /* Task load that can be uplifted */
+ int group_has_misfit_task; /* A CPU has a task too big for its capacity */
#ifdef CONFIG_NUMA_BALANCING
unsigned int nr_numa_running;
unsigned int nr_preferred_running;
@@ -8334,7 +8341,7 @@ group_type group_classify(unsigned int imbalance_pct,
if (sgs->group_asym_packing)
return group_asym_packing;

- if (sgs->group_misfit_task_load)
+ if (sgs->group_has_misfit_task)
return group_misfit_task;

if (!group_has_capacity(imbalance_pct, sgs))
@@ -8420,10 +8427,21 @@ static inline void update_sg_lb_stats(struct lb_env *env,
continue;

/* Check for a misfit task on the cpu */
- if (sd_has_asym_cpucapacity(env->sd) &&
- sgs->group_misfit_task_load < rq->misfit_task_load) {
- sgs->group_misfit_task_load = rq->misfit_task_load;
- *sg_status |= SG_OVERLOAD;
+ if (!sd_has_asym_cpucapacity(env->sd) ||
+ !rq->misfit_task_load)
+ continue;
+
+ *sg_status |= SG_OVERLOAD;
+ sgs->group_has_misfit_task = true;
+
+ /*
+ * Don't attempt to maximize load for misfit tasks that can't be
+ * granted a CPU capacity uplift.
+ */
+ if (cpu_capacity_greater(env->dst_cpu, i)) {
+ sgs->group_misfit_task_load = max(
+ sgs->group_misfit_task_load,
+ rq->misfit_task_load);
}
}

@@ -8474,7 +8492,7 @@ static bool update_sd_pick_busiest(struct lb_env *env,
/* Don't try to pull misfit tasks we can't help */
if (static_branch_unlikely(&sched_asym_cpucapacity) &&
sgs->group_type == group_misfit_task &&
- (!capacity_greater(capacity_of(env->dst_cpu), sg->sgc->max_capacity) ||
+ (!sgs->group_misfit_task_load ||
sds->local_stat.group_type != group_has_spare))
return false;

@@ -9434,15 +9452,18 @@ static struct rq *find_busiest_queue(struct lb_env *env,
case migrate_misfit:
/*
* For ASYM_CPUCAPACITY domains with misfit tasks we
- * simply seek the "biggest" misfit task.
+ * simply seek the "biggest" misfit task we can
+ * accommodate.
*/
+ if (!cpu_capacity_greater(env->dst_cpu, i))
+ continue;
+
if (rq->misfit_task_load > busiest_load) {
busiest_load = rq->misfit_task_load;
busiest = rq;
}

break;
-
}
}

--
2.27.0

2021-02-19 13:07:46

by Valentin Schneider

Subject: [PATCH v2 7/7] sched/fair: Relax task_hot() for misfit tasks

Consider the following topology:

DIE [ ]
MC [ ][ ]
0 1 2 3

capacity_orig_of(x \in {0-1}) < capacity_orig_of(x \in {2-3})

w/ CPUs 2-3 idle and CPUs 0-1 running CPU hogs (util_avg=1024).

When CPU2 goes through load_balance() (via periodic / NOHZ balance), it
should pull one CPU hog from either CPU0 or CPU1 (this is misfit task
upmigration). However, should e.g. a pcpu kworker wake up on CPU0 just before
this load_balance() happens and preempt the CPU hog running there, we would
have, for the [0-1] group at CPU2's DIE level:

o sgs->sum_nr_running > sgs->group_weight
o sgs->group_capacity * 100 < sgs->group_util * imbalance_pct

IOW, this group is group_overloaded.

Considering CPU0 is picked by find_busiest_queue(), we would then visit the
preempted CPU hog in detach_tasks(). However, given it has just been
preempted by this pcpu kworker, task_hot() will prevent it from being
detached. We then leave load_balance() without having done anything.

Long story short, preempted misfit tasks are affected by task_hot(), while
currently running misfit tasks are intentionally preempted by the stopper
task to migrate them over to a higher-capacity CPU.

Align detach_tasks() with the active-balance logic and let it pick a
cache-hot misfit task when the destination CPU can provide a capacity
uplift.
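
As an illustration with made-up numbers (capacity_orig of 512 for CPUs 0-1,
1024 for CPUs 2-3, no pressure, no uclamp), the new condition in task_hot()
for the preempted hog (util_avg=1024) on CPU0 with CPU2 as destination:

  env->idle != CPU_NOT_IDLE                      -> true (periodic / NOHZ idle balance)
  task_fits_capacity(p, capacity_of(CPU0)):
    1024 * 1280 = 1310720 < 512 * 1024 = 524288  -> false, so !task_fits_capacity() holds
  cpu_capacity_greater(CPU2, CPU0):
    1024 * 1024 = 1048576 > 512 * 1078 = 551936  -> true

so task_hot() returns 0 and detach_tasks() is free to pull the hog.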

Reviewed-by: Qais Yousef <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/fair.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ee172b384e29..554430fd249c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7448,6 +7448,17 @@ static int task_hot(struct task_struct *p, struct lb_env *env)
if (env->sd->flags & SD_SHARE_CPUCAPACITY)
return 0;

+ /*
+ * On a (sane) asymmetric CPU capacity system, the increase in compute
+ * capacity should offset any potential performance hit caused by a
+ * migration.
+ */
+ if (sd_has_asym_cpucapacity(env->sd) &&
+ env->idle != CPU_NOT_IDLE &&
+ !task_fits_capacity(p, capacity_of(env->src_cpu)) &&
+ cpu_capacity_greater(env->dst_cpu, env->src_cpu))
+ return 0;
+
/*
* Buddy candidates are cache hot:
*/
--
2.27.0

2021-02-22 05:35:13

by Pavankumar Kondeti

Subject: Re: [PATCH v2 1/7] sched/fair: Ignore percpu threads for imbalance pulls

On Fri, Feb 19, 2021 at 12:59:57PM +0000, Valentin Schneider wrote:
> From: Lingutla Chandrasekhar <[email protected]>
>
> During load balancing, when the balancing group is unable to pull a task
> from the busy group due to ->cpus_ptr constraints, it sets LBF_SOME_PINNED
> in the lb env flags. As a consequence, sgc->imbalance is set at the parent
> domain level, which causes the group to be classified as imbalanced so that
> it can get help from another balancing CPU.
>
> Consider a 4-CPU big.LITTLE system with CPUs 0-1 as LITTLEs and
> CPUs 2-3 as Bigs, with the following scenario:
> - CPU0 doing newly_idle balancing
> - CPU1 running percpu kworker and RT task (small tasks)
> - CPU2 running 2 big tasks
> - CPU3 running 1 medium task
>
> While CPU0 is doing a newly_idle load balance at MC level, it fails to
> pull the percpu kworker from CPU1, sets LBF_SOME_PINNED in the lb env flags
> and sets sgc->imbalance at the DIE level domain. As LBF_ALL_PINNED is not
> cleared, it tries to redo the balancing after clearing CPU1 from the env
> cpus, but it doesn't find another busiest_group, so CPU0 stops balancing at
> MC level without clearing 'sgc->imbalance' and restarts the load balancing
> at DIE level.
>
> CPU0 (the balancing CPU) then finds the LITTLEs' group as busiest_group
> with group type imbalanced; the Bigs' group, being classified at a level
> below the imbalanced type, is ignored when picking the busiest group, and
> the balancing is aborted without pulling any tasks (by that time, CPU1
> might not have any running tasks).
>
> Classifying the group as imbalanced because of percpu threads is a
> suboptimal decision, so don't set LBF_SOME_PINNED for per-CPU kthreads.
>
> Signed-off-by: Lingutla Chandrasekhar <[email protected]>
> [Use kthread_is_per_cpu() rather than p->nr_cpus_allowed]
> Signed-off-by: Valentin Schneider <[email protected]>
> ---
> kernel/sched/fair.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 8a8bd7b13634..2d4dcf1a3372 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7539,6 +7539,10 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
> if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
> return 0;
>
> + /* Disregard pcpu kthreads; they are where they need to be. */
> + if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
> + return 0;
> +
> if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
> int cpu;
>

Looks good to me. Thanks Valentin for the help.

Thanks,
Pavan

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

2021-02-23 02:39:02

by kernel test robot

Subject: [sched/fair] b360fb5e59: stress-ng.vm-segv.ops_per_sec -13.9% regression


Greetings,

FYI, we noticed a -13.9% regression of stress-ng.vm-segv.ops_per_sec due to commit:


commit: b360fb5e5954a8a440ef95bf11257e2e7ea90340 ("[PATCH v2 1/7] sched/fair: Ignore percpu threads for imbalance pulls")
url: https://github.com/0day-ci/linux/commits/Valentin-Schneider/sched-fair-misfit-task-load-balance-tweaks/20210219-211028
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git c5e6fc08feb2b88dc5dac2f3c817e1c2a4cafda4

in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with following parameters:

nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: ext4
class: vm
test: vm-segv
cpufreq_governor: performance
ucode: 0x5003003




If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <[email protected]>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
vm/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/vm-segv/stress-ng/60s/0x5003003

commit:
c5e6fc08fe ("sched,x86: Allow !PREEMPT_DYNAMIC")
b360fb5e59 ("sched/fair: Ignore percpu threads for imbalance pulls")

c5e6fc08feb2b88d b360fb5e5954a8a440ef95bf112
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
1:6 -3% 1:6 perf-profile.children.cycles-pp.error_entry
0:6 -1% 0:6 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
11324 ? 3% -28.1% 8140 ? 3% stress-ng.time.involuntary_context_switches
6818 ? 15% +315.2% 28311 ? 12% stress-ng.time.major_page_faults
30952041 -12.8% 26988502 stress-ng.time.minor_page_faults
378.82 +5.3% 398.75 stress-ng.time.system_time
215.82 -10.0% 194.24 stress-ng.time.user_time
62102177 -13.9% 53448474 stress-ng.time.voluntary_context_switches
810348 -13.9% 698034 stress-ng.vm-segv.ops
13505 -13.9% 11633 stress-ng.vm-segv.ops_per_sec
9.76 -1.3% 9.63 iostat.cpu.system
311.06 +3.9% 323.18 pmeter.Average_Active_Power
1937515 -13.9% 1667746 vmstat.system.cs
2.33e+08 +15.2% 2.685e+08 cpuidle.C1.time
14107313 -9.8% 12722308 ? 2% cpuidle.C1.usage
51960611 -15.6% 43858596 cpuidle.POLL.usage
1.11 -0.3 0.85 mpstat.cpu.all.irq%
0.18 -0.0 0.16 mpstat.cpu.all.soft%
0.40 -0.1 0.35 mpstat.cpu.all.usr%
52529 ? 6% -35.8% 33706 ? 44% sched_debug.cfs_rq:/.min_vruntime.avg
38574 ? 10% -41.7% 22479 ? 41% sched_debug.cfs_rq:/.min_vruntime.min
624583 -56.9% 269004 ? 99% sched_debug.cpu.nr_switches.avg
421467 ? 6% -65.0% 147633 ?102% sched_debug.cpu.nr_switches.min
33017 ? 41% +71.9% 56752 ? 4% numa-vmstat.node0.nr_anon_pages
33861 ? 40% +68.8% 57163 ? 4% numa-vmstat.node0.nr_inactive_anon
33861 ? 40% +68.8% 57163 ? 4% numa-vmstat.node0.nr_zone_inactive_anon
31421 ? 44% -75.4% 7730 ? 31% numa-vmstat.node1.nr_anon_pages
32588 ? 42% -71.2% 9386 ? 25% numa-vmstat.node1.nr_inactive_anon
1949 ? 5% -15.9% 1638 ? 9% numa-vmstat.node1.nr_page_table_pages
32588 ? 42% -71.2% 9386 ? 25% numa-vmstat.node1.nr_zone_inactive_anon
7935635 ? 13% -19.3% 6407159 ? 6% numa-vmstat.node1.numa_hit
7732506 ? 14% -19.7% 6206498 ? 6% numa-vmstat.node1.numa_local
131518 ? 41% +72.1% 226297 ? 4% numa-meminfo.node0.AnonPages
140036 ? 38% +66.7% 233420 ? 4% numa-meminfo.node0.AnonPages.max
136203 ? 40% +68.2% 229070 ? 4% numa-meminfo.node0.Inactive
135279 ? 40% +68.9% 228531 ? 4% numa-meminfo.node0.Inactive(anon)
7950 ? 8% +15.4% 9178 ? 2% numa-meminfo.node0.PageTables
40320 ? 68% -79.8% 8138 ? 45% numa-meminfo.node1.AnonHugePages
125324 ? 44% -75.9% 30215 ? 32% numa-meminfo.node1.AnonPages
208760 ? 26% -48.1% 108370 ? 8% numa-meminfo.node1.AnonPages.max
129998 ? 42% -71.1% 37570 ? 25% numa-meminfo.node1.Inactive
129996 ? 42% -71.4% 37183 ? 25% numa-meminfo.node1.Inactive(anon)
7433 ? 6% -18.1% 6091 ? 9% numa-meminfo.node1.PageTables
9921 +2.0% 10121 proc-vmstat.nr_mapped
23344 -3.6% 22514 proc-vmstat.nr_slab_reclaimable
30383698 ? 2% -13.8% 26186115 proc-vmstat.numa_hit
30297093 ? 2% -13.9% 26099520 proc-vmstat.numa_local
1600 ?149% -100.0% 0.00 proc-vmstat.numa_pages_migrated
9337 ? 2% -12.5% 8174 ? 3% proc-vmstat.pgactivate
34255062 ? 2% -14.7% 29225155 proc-vmstat.pgalloc_normal
31199740 -12.6% 27277537 proc-vmstat.pgfault
34113059 ? 2% -14.7% 29083007 proc-vmstat.pgfree
1600 ?149% -100.0% 0.00 proc-vmstat.pgmigrate_success
345591 -8.1% 317680 proc-vmstat.pgreuse
11501 ? 5% +9.6% 12610 ? 6% softirqs.CPU12.RCU
10678 ? 5% +16.0% 12383 ? 2% softirqs.CPU16.RCU
10871 ? 4% +13.1% 12294 ? 2% softirqs.CPU17.RCU
10724 ? 2% +13.8% 12205 ? 3% softirqs.CPU18.RCU
10810 ? 4% +16.2% 12560 ? 3% softirqs.CPU19.RCU
10647 ? 6% +16.2% 12372 ? 6% softirqs.CPU20.RCU
10863 ? 3% +14.7% 12461 ? 3% softirqs.CPU21.RCU
11231 ? 5% +14.6% 12873 ? 6% softirqs.CPU22.RCU
11141 ? 6% +21.0% 13480 ? 8% softirqs.CPU64.RCU
11209 ? 6% +20.8% 13545 ? 2% softirqs.CPU65.RCU
11108 ? 3% +20.0% 13334 ? 6% softirqs.CPU66.RCU
11414 ? 9% +16.9% 13345 ? 6% softirqs.CPU67.RCU
11162 ? 4% +16.2% 12968 ? 9% softirqs.CPU68.RCU
11035 ? 5% +13.6% 12533 ? 4% softirqs.CPU69.RCU
11003 ? 5% +18.9% 13078 ? 8% softirqs.CPU70.RCU
11097 ? 4% +14.9% 12756 ? 5% softirqs.CPU71.RCU
76929 +14.7% 88227 slabinfo.anon_vma_chain.active_objs
1205 +14.4% 1379 slabinfo.anon_vma_chain.active_slabs
77192 +14.4% 88314 slabinfo.anon_vma_chain.num_objs
1205 +14.4% 1379 slabinfo.anon_vma_chain.num_slabs
23274 +12.2% 26121 slabinfo.kmalloc-512.active_objs
9748 ? 2% -10.0% 8776 ? 3% slabinfo.pid.active_objs
9748 ? 2% -10.0% 8776 ? 3% slabinfo.pid.num_objs
28777 -11.3% 25539 ? 2% slabinfo.proc_inode_cache.active_objs
28919 -11.3% 25647 ? 2% slabinfo.proc_inode_cache.num_objs
11256 ? 5% -23.8% 8581 ? 3% slabinfo.task_delay_info.active_objs
11256 ? 5% -23.8% 8581 ? 3% slabinfo.task_delay_info.num_objs
53293 +12.8% 60111 slabinfo.vm_area_struct.active_objs
1335 +12.6% 1503 slabinfo.vm_area_struct.active_slabs
53426 +12.6% 60156 slabinfo.vm_area_struct.num_objs
1335 +12.6% 1503 slabinfo.vm_area_struct.num_slabs
20536 ? 2% +25.3% 25732 ? 12% slabinfo.vmap_area.active_objs
24731 +23.5% 30551 ? 3% slabinfo.vmap_area.num_objs
4.799e+09 -12.2% 4.215e+09 perf-stat.i.branch-instructions
1.09 +0.1 1.14 perf-stat.i.branch-miss-rate%
50816825 -7.1% 47191966 perf-stat.i.branch-misses
4.52 ? 3% +5.6 10.16 ? 3% perf-stat.i.cache-miss-rate%
18266157 ? 2% +125.2% 41128630 ? 3% perf-stat.i.cache-misses
4.817e+08 -11.6% 4.258e+08 perf-stat.i.cache-references
1998506 -14.0% 1718841 perf-stat.i.context-switches
1.89 +12.4% 2.12 perf-stat.i.cpi
4.232e+10 -1.4% 4.175e+10 perf-stat.i.cpu-cycles
836.21 ? 4% +14.8% 959.82 ? 7% perf-stat.i.cpu-migrations
2507 -54.0% 1153 ? 5% perf-stat.i.cycles-between-cache-misses
4212845 ? 3% -10.4% 3775244 ? 2% perf-stat.i.dTLB-load-misses
6.137e+09 -12.7% 5.359e+09 perf-stat.i.dTLB-loads
0.03 ? 3% +0.0 0.04 ? 2% perf-stat.i.dTLB-store-miss-rate%
1127230 ? 3% -6.7% 1051240 ? 2% perf-stat.i.dTLB-store-misses
3.192e+09 -13.4% 2.766e+09 perf-stat.i.dTLB-stores
67.76 -0.7 67.08 perf-stat.i.iTLB-load-miss-rate%
17023777 -10.2% 15291627 perf-stat.i.iTLB-load-misses
7885583 -7.2% 7319880 perf-stat.i.iTLB-loads
2.237e+10 -12.5% 1.956e+10 perf-stat.i.instructions
0.53 -10.8% 0.48 perf-stat.i.ipc
111.04 ? 13% +310.1% 455.43 ? 11% perf-stat.i.major-faults
0.44 -1.4% 0.43 perf-stat.i.metric.GHz
0.06 ? 12% +21.9% 0.07 ? 14% perf-stat.i.metric.K/sec
152.39 -12.5% 133.27 perf-stat.i.metric.M/sec
494626 -12.8% 431405 perf-stat.i.minor-faults
77.46 +12.6 90.07 perf-stat.i.node-load-miss-rate%
3962536 ? 4% +195.0% 11688629 ? 3% perf-stat.i.node-load-misses
1000403 +7.6% 1076834 ? 2% perf-stat.i.node-loads
41.34 ? 2% +31.6 72.93 perf-stat.i.node-store-miss-rate%
1494039 ? 5% +198.5% 4459287 ? 3% perf-stat.i.node-store-misses
1975747 ? 3% -25.8% 1466690 ? 2% perf-stat.i.node-stores
520459 -12.8% 453999 perf-stat.i.page-faults
1.06 +0.1 1.12 perf-stat.overall.branch-miss-rate%
3.78 ? 2% +5.8 9.60 ? 3% perf-stat.overall.cache-miss-rate%
1.89 +12.7% 2.13 perf-stat.overall.cpi
2324 ? 2% -56.1% 1021 ? 3% perf-stat.overall.cycles-between-cache-misses
0.04 ? 3% +0.0 0.04 ? 2% perf-stat.overall.dTLB-store-miss-rate%
68.35 -0.7 67.65 perf-stat.overall.iTLB-load-miss-rate%
0.53 -11.2% 0.47 perf-stat.overall.ipc
79.66 +11.8 91.48 perf-stat.overall.node-load-miss-rate%
42.93 ? 3% +32.1 75.07 perf-stat.overall.node-store-miss-rate%
4.727e+09 -12.1% 4.154e+09 perf-stat.ps.branch-instructions
50025626 -7.1% 46471888 perf-stat.ps.branch-misses
17939323 ? 2% +124.6% 40288410 ? 3% perf-stat.ps.cache-misses
4.743e+08 -11.5% 4.196e+08 perf-stat.ps.cache-references
1968071 -13.9% 1694380 perf-stat.ps.context-switches
4.167e+10 -1.4% 4.11e+10 perf-stat.ps.cpu-cycles
821.15 ? 4% +14.8% 942.91 ? 7% perf-stat.ps.cpu-migrations
4147982 ? 3% -10.3% 3720585 ? 2% perf-stat.ps.dTLB-load-misses
6.043e+09 -12.6% 5.282e+09 perf-stat.ps.dTLB-loads
1109704 ? 3% -6.6% 1036060 ? 2% perf-stat.ps.dTLB-store-misses
3.144e+09 -13.3% 2.726e+09 perf-stat.ps.dTLB-stores
16765267 -10.1% 15068821 perf-stat.ps.iTLB-load-misses
7761893 -7.1% 7207514 perf-stat.ps.iTLB-loads
2.203e+10 -12.5% 1.928e+10 perf-stat.ps.instructions
108.63 ? 13% +309.6% 444.94 ? 12% perf-stat.ps.major-faults
487068 -12.7% 425183 perf-stat.ps.minor-faults
3881430 ? 4% +194.6% 11435790 ? 3% perf-stat.ps.node-load-misses
989805 +7.5% 1064249 ? 2% perf-stat.ps.node-loads
1464554 ? 5% +198.0% 4363838 ? 3% perf-stat.ps.node-store-misses
1945831 ? 3% -25.6% 1448410 ? 2% perf-stat.ps.node-stores
512507 -12.7% 447452 perf-stat.ps.page-faults
1.389e+12 -12.5% 1.215e+12 perf-stat.total.instructions
239831 -33.6% 159201 interrupts.CAL:Function_call_interrupts
2858 ? 21% -36.1% 1825 ? 17% interrupts.CPU0.CAL:Function_call_interrupts
76.83 ? 14% -73.1% 20.67 ? 24% interrupts.CPU0.TLB:TLB_shootdowns
3447 ? 10% -38.8% 2108 ? 18% interrupts.CPU1.CAL:Function_call_interrupts
2457 ? 16% -30.3% 1712 ? 15% interrupts.CPU10.CAL:Function_call_interrupts
2252 ? 16% -29.6% 1585 ? 22% interrupts.CPU11.CAL:Function_call_interrupts
2156 ? 23% -27.9% 1555 ? 13% interrupts.CPU12.CAL:Function_call_interrupts
378.33 ? 11% +42.1% 537.50 ? 19% interrupts.CPU13.RES:Rescheduling_interrupts
2358 ? 15% -40.2% 1409 ? 8% interrupts.CPU14.CAL:Function_call_interrupts
2216 ? 20% -36.0% 1418 ? 10% interrupts.CPU15.CAL:Function_call_interrupts
2387 ? 15% -24.2% 1811 ? 5% interrupts.CPU16.CAL:Function_call_interrupts
2329 ? 14% -36.0% 1491 ? 14% interrupts.CPU17.CAL:Function_call_interrupts
2509 ? 13% -39.2% 1525 ? 15% interrupts.CPU18.CAL:Function_call_interrupts
2333 ? 19% -32.9% 1566 ? 18% interrupts.CPU19.CAL:Function_call_interrupts
3082 ? 15% -36.0% 1972 ? 23% interrupts.CPU2.CAL:Function_call_interrupts
2019 ? 6% -24.8% 1518 ? 17% interrupts.CPU2.NMI:Non-maskable_interrupts
2019 ? 6% -24.8% 1518 ? 17% interrupts.CPU2.PMI:Performance_monitoring_interrupts
2456 ? 11% -36.1% 1569 ? 11% interrupts.CPU21.CAL:Function_call_interrupts
2668 ? 21% -37.7% 1662 ? 11% interrupts.CPU22.CAL:Function_call_interrupts
2361 ? 15% -34.6% 1543 ? 15% interrupts.CPU23.CAL:Function_call_interrupts
2651 ? 8% -22.4% 2057 ? 17% interrupts.CPU24.CAL:Function_call_interrupts
2936 ? 10% -42.9% 1675 ? 20% interrupts.CPU25.CAL:Function_call_interrupts
2734 ? 22% -37.4% 1712 ? 24% interrupts.CPU26.CAL:Function_call_interrupts
2614 ? 19% -34.1% 1723 ? 18% interrupts.CPU27.CAL:Function_call_interrupts
2774 ? 13% -46.3% 1490 ? 21% interrupts.CPU28.CAL:Function_call_interrupts
2454 ? 18% -44.0% 1373 ? 18% interrupts.CPU29.CAL:Function_call_interrupts
433.83 ? 16% -30.7% 300.67 ? 13% interrupts.CPU29.RES:Rescheduling_interrupts
2969 ? 19% -40.4% 1769 ? 15% interrupts.CPU3.CAL:Function_call_interrupts
2566 ? 10% -32.4% 1734 ? 13% interrupts.CPU30.CAL:Function_call_interrupts
2614 ? 16% -43.6% 1475 ? 26% interrupts.CPU31.CAL:Function_call_interrupts
2212 ? 11% -34.4% 1452 ? 11% interrupts.CPU32.CAL:Function_call_interrupts
2273 ? 16% -39.6% 1372 ? 23% interrupts.CPU33.CAL:Function_call_interrupts
2313 ? 20% -34.2% 1521 ? 18% interrupts.CPU36.CAL:Function_call_interrupts
2258 ? 19% -31.3% 1550 ? 25% interrupts.CPU38.CAL:Function_call_interrupts
2482 ? 16% -29.8% 1743 ? 14% interrupts.CPU4.CAL:Function_call_interrupts
2466 ? 16% -40.9% 1458 ? 23% interrupts.CPU40.CAL:Function_call_interrupts
2488 ? 21% -42.9% 1420 ? 21% interrupts.CPU41.CAL:Function_call_interrupts
2327 ? 16% -39.7% 1403 ? 19% interrupts.CPU42.CAL:Function_call_interrupts
2352 ? 16% -40.0% 1412 ? 20% interrupts.CPU44.CAL:Function_call_interrupts
2252 ? 15% -41.0% 1329 ? 26% interrupts.CPU46.CAL:Function_call_interrupts
2490 ? 13% -40.1% 1491 ? 23% interrupts.CPU47.CAL:Function_call_interrupts
2628 ? 13% -32.4% 1777 ? 17% interrupts.CPU48.CAL:Function_call_interrupts
2752 ? 14% -35.0% 1788 ? 12% interrupts.CPU49.CAL:Function_call_interrupts
2668 ? 15% -39.8% 1605 ? 20% interrupts.CPU5.CAL:Function_call_interrupts
2693 ? 12% -35.1% 1749 ? 15% interrupts.CPU50.CAL:Function_call_interrupts
2515 ? 17% -35.9% 1613 ? 22% interrupts.CPU51.CAL:Function_call_interrupts
2809 ? 16% -45.4% 1534 ? 14% interrupts.CPU52.CAL:Function_call_interrupts
2748 ? 11% -35.2% 1780 ? 22% interrupts.CPU53.CAL:Function_call_interrupts
2634 ? 12% -34.1% 1736 ? 9% interrupts.CPU54.CAL:Function_call_interrupts
2607 ? 11% -38.0% 1616 ? 9% interrupts.CPU55.CAL:Function_call_interrupts
2527 ? 19% -38.3% 1558 ? 13% interrupts.CPU57.CAL:Function_call_interrupts
2620 ? 12% -22.3% 2037 ? 15% interrupts.CPU58.CAL:Function_call_interrupts
2569 ? 10% -30.1% 1795 ? 16% interrupts.CPU61.CAL:Function_call_interrupts
2514 ? 8% -36.3% 1601 ? 25% interrupts.CPU62.CAL:Function_call_interrupts
2525 ? 16% -31.9% 1719 ? 25% interrupts.CPU63.CAL:Function_call_interrupts
2494 ? 8% -24.6% 1880 ? 11% interrupts.CPU64.CAL:Function_call_interrupts
506.50 ? 15% +26.9% 642.67 ? 9% interrupts.CPU65.RES:Rescheduling_interrupts
486.67 ? 9% +29.5% 630.33 ? 15% interrupts.CPU66.RES:Rescheduling_interrupts
120.00 ?126% -92.4% 9.17 ? 36% interrupts.CPU66.TLB:TLB_shootdowns
2444 ? 12% -21.9% 1908 ? 18% interrupts.CPU67.CAL:Function_call_interrupts
2403 ? 16% -32.6% 1620 ? 15% interrupts.CPU69.CAL:Function_call_interrupts
2673 ? 11% -37.2% 1679 ? 24% interrupts.CPU7.CAL:Function_call_interrupts
2360 ? 12% -27.0% 1723 ? 14% interrupts.CPU70.CAL:Function_call_interrupts
2408 ? 11% -31.9% 1639 ? 16% interrupts.CPU71.CAL:Function_call_interrupts
2591 ? 13% -36.9% 1635 ? 24% interrupts.CPU75.CAL:Function_call_interrupts
2793 ? 13% -39.7% 1685 ? 17% interrupts.CPU76.CAL:Function_call_interrupts
2662 ? 14% -37.3% 1668 ? 20% interrupts.CPU77.CAL:Function_call_interrupts
1899 ? 8% -25.9% 1408 ? 33% interrupts.CPU77.NMI:Non-maskable_interrupts
1899 ? 8% -25.9% 1408 ? 33% interrupts.CPU77.PMI:Performance_monitoring_interrupts
2624 ? 10% -35.4% 1694 ? 27% interrupts.CPU78.CAL:Function_call_interrupts
2816 ? 12% -35.0% 1829 ? 25% interrupts.CPU79.CAL:Function_call_interrupts
2393 ? 10% -36.2% 1525 ? 11% interrupts.CPU8.CAL:Function_call_interrupts
2465 ? 16% -33.6% 1637 ? 18% interrupts.CPU80.CAL:Function_call_interrupts
2430 ? 16% -41.1% 1432 ? 9% interrupts.CPU81.CAL:Function_call_interrupts
2369 ? 7% -25.5% 1765 ? 14% interrupts.CPU83.CAL:Function_call_interrupts
2509 ? 20% -40.1% 1504 ? 19% interrupts.CPU84.CAL:Function_call_interrupts
2624 ? 20% -34.7% 1713 ? 26% interrupts.CPU85.CAL:Function_call_interrupts
2329 ? 15% -29.0% 1654 ? 21% interrupts.CPU86.CAL:Function_call_interrupts
2599 ? 14% -39.2% 1580 ? 17% interrupts.CPU88.CAL:Function_call_interrupts
2413 ? 13% -39.0% 1471 ? 12% interrupts.CPU9.CAL:Function_call_interrupts
2492 ? 19% -36.1% 1592 ? 30% interrupts.CPU90.CAL:Function_call_interrupts
2325 ? 21% -32.6% 1567 ? 19% interrupts.CPU91.CAL:Function_call_interrupts
2557 ? 13% -42.3% 1474 ? 19% interrupts.CPU92.CAL:Function_call_interrupts
2576 ? 19% -45.2% 1412 ? 19% interrupts.CPU94.CAL:Function_call_interrupts
2571 ? 10% -46.4% 1377 ? 20% interrupts.CPU95.CAL:Function_call_interrupts
475.17 ? 15% -22.4% 368.50 ? 13% interrupts.CPU95.RES:Rescheduling_interrupts
5869 ? 2% -83.2% 988.50 ? 12% interrupts.TLB:TLB_shootdowns
2.74 ? 6% -0.3 2.40 ? 3% perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
2.66 ? 6% -0.3 2.33 ? 3% perf-profile.calltrace.cycles-pp.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry.start_secondary
1.88 ? 4% -0.2 1.64 ? 4% perf-profile.calltrace.cycles-pp.ptrace_request.arch_ptrace.__x64_sys_ptrace.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.89 ? 4% -0.2 1.65 ? 4% perf-profile.calltrace.cycles-pp.arch_ptrace.__x64_sys_ptrace.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.02 ? 5% -0.2 1.79 ? 3% perf-profile.calltrace.cycles-pp.__wake_up_common_lock.do_notify_parent_cldstop.ptrace_stop.ptrace_do_notify.ptrace_notify
1.87 ? 4% -0.2 1.64 ? 4% perf-profile.calltrace.cycles-pp.ptrace_resume.ptrace_request.arch_ptrace.__x64_sys_ptrace.do_syscall_64
1.90 ? 6% -0.2 1.67 ? 3% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.do_notify_parent_cldstop.ptrace_stop.ptrace_do_notify
1.73 ? 4% -0.2 1.50 ? 4% perf-profile.calltrace.cycles-pp.try_to_wake_up.ptrace_resume.ptrace_request.arch_ptrace.__x64_sys_ptrace
1.46 ? 6% -0.2 1.25 ? 4% perf-profile.calltrace.cycles-pp.schedule.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
1.42 ? 6% -0.2 1.22 ? 4% perf-profile.calltrace.cycles-pp.__sched_text_start.schedule.do_wait.kernel_wait4.__do_sys_wait4
0.63 ? 5% -0.2 0.46 ? 44% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.ptrace_resume
1.25 ? 6% -0.2 1.09 ? 3% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry
1.19 ? 5% -0.1 1.05 ? 4% perf-profile.calltrace.cycles-pp.do_notify_parent_cldstop.ptrace_stop.ptrace_do_notify.ptrace_notify.syscall_trace_enter
0.75 ? 7% -0.1 0.64 ? 3% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__sched_text_start.schedule.do_wait.kernel_wait4
0.83 ? 4% -0.1 0.73 ? 5% perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.ptrace_resume.ptrace_request
0.67 ? 7% -0.1 0.57 ? 4% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__sched_text_start.schedule.do_wait
0.86 ? 4% -0.1 0.76 ? 4% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.ptrace_resume.ptrace_request.arch_ptrace
0.61 ? 6% +0.1 0.69 ? 4% perf-profile.calltrace.cycles-pp.kmem_cache_free.remove_vma.__do_munmap.__vm_munmap.__x64_sys_munmap
0.78 ? 6% +0.1 0.88 ? 3% perf-profile.calltrace.cycles-pp.remove_vma.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
0.67 ? 8% +0.1 0.78 ? 4% perf-profile.calltrace.cycles-pp.anon_vma_interval_tree_insert.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm
0.58 ? 4% +0.1 0.69 ? 6% perf-profile.calltrace.cycles-pp.queued_read_lock_slowpath.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
1.09 ? 5% +0.1 1.22 ? 4% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap
1.35 ? 5% +0.2 1.53 ? 4% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap
1.36 ? 5% +0.2 1.54 ? 4% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
0.79 ? 5% +0.2 0.99 ? 4% perf-profile.calltrace.cycles-pp.ptrace_check_attach.__x64_sys_ptrace.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.00 ? 6% +0.2 2.24 ? 3% perf-profile.calltrace.cycles-pp.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm.copy_process
0.68 ? 7% +0.3 0.94 ? 6% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
0.67 ? 7% +0.3 0.93 ? 6% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
0.69 ? 7% +0.3 0.94 ? 6% perf-profile.calltrace.cycles-pp.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
2.31 ? 6% +0.3 2.57 ? 3% perf-profile.calltrace.cycles-pp.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
2.42 ? 6% +0.3 2.68 ? 3% perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
1.24 ? 6% +0.3 1.58 ? 7% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle
0.18 ?141% +0.4 0.55 ? 6% perf-profile.calltrace.cycles-pp.__vmalloc_node_range.dup_task_struct.copy_process.kernel_clone.__do_sys_clone
1.36 ? 6% +0.4 1.74 ? 6% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
1.94 ? 6% +0.4 2.36 ? 4% perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.__do_munmap.__vm_munmap
0.26 ?100% +0.5 0.73 ? 9% perf-profile.calltrace.cycles-pp.unlink_file_vma.free_pgtables.unmap_region.__do_munmap.__vm_munmap
0.00 +0.6 0.58 ? 4% perf-profile.calltrace.cycles-pp.wait_task_inactive.ptrace_check_attach.__x64_sys_ptrace.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.69 ? 5% +0.6 3.33 ? 3% perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
5.68 ? 5% -0.6 5.06 ? 3% perf-profile.children.cycles-pp.__sched_text_start
3.69 ? 5% -0.5 3.22 ? 3% perf-profile.children.cycles-pp.try_to_wake_up
2.79 ? 6% -0.4 2.42 ? 3% perf-profile.children.cycles-pp.schedule_idle
2.49 ? 5% -0.3 2.21 ? 4% perf-profile.children.cycles-pp.do_notify_parent_cldstop
2.23 ? 5% -0.3 1.96 ? 3% perf-profile.children.cycles-pp.__wake_up_common_lock
2.08 ? 5% -0.3 1.83 ? 3% perf-profile.children.cycles-pp.__wake_up_common
1.89 ? 4% -0.2 1.65 ? 4% perf-profile.children.cycles-pp.arch_ptrace
1.89 ? 4% -0.2 1.64 ? 4% perf-profile.children.cycles-pp.ptrace_request
1.87 ? 4% -0.2 1.64 ? 4% perf-profile.children.cycles-pp.ptrace_resume
1.68 ? 5% -0.2 1.47 ? 4% perf-profile.children.cycles-pp.enqueue_task_fair
1.70 ? 5% -0.2 1.50 ? 4% perf-profile.children.cycles-pp.ttwu_do_activate
1.33 ? 5% -0.2 1.14 ? 4% perf-profile.children.cycles-pp.enqueue_entity
1.45 ? 5% -0.2 1.26 ? 3% perf-profile.children.cycles-pp.pick_next_task_fair
1.47 ? 5% -0.2 1.30 ? 3% perf-profile.children.cycles-pp.dequeue_task_fair
1.17 ? 5% -0.2 1.02 ? 3% perf-profile.children.cycles-pp.update_load_avg
1.33 ? 5% -0.2 1.18 ? 2% perf-profile.children.cycles-pp.dequeue_entity
0.84 ? 7% -0.1 0.73 ? 4% perf-profile.children.cycles-pp.get_page_from_freelist
0.52 ? 8% -0.1 0.43 ? 3% perf-profile.children.cycles-pp.page_add_file_rmap
0.75 ? 4% -0.1 0.66 ? 3% perf-profile.children.cycles-pp.select_task_rq_fair
0.65 ? 4% -0.1 0.57 ? 3% perf-profile.children.cycles-pp.update_curr
0.52 ? 6% -0.1 0.44 ? 6% perf-profile.children.cycles-pp.___might_sleep
0.39 ? 5% -0.1 0.33 ? 5% perf-profile.children.cycles-pp.__switch_to_asm
0.50 ? 5% -0.1 0.44 ? 3% perf-profile.children.cycles-pp.wake_up_new_task
0.46 ? 5% -0.1 0.41 ? 5% perf-profile.children.cycles-pp.wp_page_copy
0.41 ? 4% -0.0 0.36 ? 2% perf-profile.children.cycles-pp.find_idlest_group
0.38 ? 6% -0.0 0.33 ? 3% perf-profile.children.cycles-pp.tick_nohz_idle_exit
0.39 ? 6% -0.0 0.34 ? 5% perf-profile.children.cycles-pp.sched_clock
0.37 ? 6% -0.0 0.33 ? 6% perf-profile.children.cycles-pp.native_sched_clock
0.28 ? 7% -0.0 0.24 ? 6% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.31 ? 3% -0.0 0.28 ? 3% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.23 ? 6% -0.0 0.19 ? 6% perf-profile.children.cycles-pp.rmqueue
0.18 ? 5% -0.0 0.14 ? 5% perf-profile.children.cycles-pp.tick_nohz_idle_enter
0.16 ? 4% -0.0 0.13 ? 5% perf-profile.children.cycles-pp.__cond_resched
0.23 ? 5% -0.0 0.20 ? 6% perf-profile.children.cycles-pp.reweight_entity
0.07 ? 11% -0.0 0.04 ? 71% perf-profile.children.cycles-pp.perf_iterate_sb
0.19 ? 5% -0.0 0.16 ? 4% perf-profile.children.cycles-pp.allocate_slab
0.26 ? 5% -0.0 0.23 ? 6% perf-profile.children.cycles-pp.update_ts_time_stats
0.14 ? 7% -0.0 0.11 ? 9% perf-profile.children.cycles-pp.free_pcppages_bulk
0.17 ? 4% -0.0 0.14 ? 4% perf-profile.children.cycles-pp.___perf_sw_event
0.20 ? 11% -0.0 0.17 ? 7% perf-profile.children.cycles-pp.sync_regs
0.15 ? 5% -0.0 0.12 ? 7% perf-profile.children.cycles-pp.free_unref_page_list
0.19 ? 9% -0.0 0.17 ? 6% perf-profile.children.cycles-pp.__put_user_nocheck_4
0.09 ? 7% -0.0 0.07 ? 11% perf-profile.children.cycles-pp.memcpy_erms
0.24 ? 4% -0.0 0.21 ? 4% perf-profile.children.cycles-pp.__task_pid_nr_ns
0.07 ? 6% -0.0 0.05 perf-profile.children.cycles-pp.rcu_eqs_enter
0.15 ? 4% -0.0 0.13 ? 5% perf-profile.children.cycles-pp.place_entity
0.07 ? 10% -0.0 0.05 ? 45% perf-profile.children.cycles-pp.perf_event_task
0.12 ? 5% -0.0 0.10 ? 6% perf-profile.children.cycles-pp.ttwu_queue_wakelist
0.08 ? 6% -0.0 0.06 ? 13% perf-profile.children.cycles-pp.arch_dup_task_struct
0.09 ? 8% -0.0 0.07 ? 7% perf-profile.children.cycles-pp.call_cpuidle
0.07 ? 7% -0.0 0.05 ? 45% perf-profile.children.cycles-pp.lru_add_drain
0.14 ? 5% -0.0 0.12 ? 3% perf-profile.children.cycles-pp.copy_page
0.07 ? 6% -0.0 0.06 ? 6% perf-profile.children.cycles-pp.rcu_all_qs
0.07 ? 7% -0.0 0.05 ? 7% perf-profile.children.cycles-pp.unfreeze_partials
0.09 ? 8% -0.0 0.07 ? 6% perf-profile.children.cycles-pp.__calc_delta
0.08 ? 14% +0.0 0.11 ? 11% perf-profile.children.cycles-pp.userfaultfd_unmap_prep
0.06 ? 11% +0.0 0.09 ? 10% perf-profile.children.cycles-pp.tick_nohz_irq_exit
0.14 ? 11% +0.0 0.18 ? 9% perf-profile.children.cycles-pp.__memcg_kmem_uncharge
0.33 ? 5% +0.0 0.37 ? 6% perf-profile.children.cycles-pp.__put_anon_vma
0.25 ? 5% +0.0 0.29 ? 6% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.18 ? 7% +0.0 0.22 ? 11% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.12 ? 10% +0.1 0.17 ? 14% perf-profile.children.cycles-pp.tick_nohz_next_event
0.02 ?141% +0.1 0.07 ? 13% perf-profile.children.cycles-pp.lock_page_lruvec_irqsave
0.16 ? 9% +0.1 0.21 ? 8% perf-profile.children.cycles-pp.__rb_erase_color
0.19 ? 11% +0.1 0.24 ? 9% perf-profile.children.cycles-pp.page_counter_uncharge
0.18 ? 9% +0.1 0.24 ? 9% perf-profile.children.cycles-pp.page_counter_cancel
0.26 ? 10% +0.1 0.32 ? 3% perf-profile.children.cycles-pp.find_get_task_by_vpid
0.49 ? 8% +0.1 0.55 ? 6% perf-profile.children.cycles-pp.__vmalloc_node_range
0.00 +0.1 0.06 ? 14% perf-profile.children.cycles-pp.calc_global_load_tick
0.21 ? 6% +0.1 0.28 ? 8% perf-profile.children.cycles-pp.drain_obj_stock
0.58 ? 4% +0.1 0.65 ? 3% perf-profile.children.cycles-pp.irq_exit_rcu
0.52 ? 5% +0.1 0.59 ? 2% perf-profile.children.cycles-pp.do_softirq_own_stack
0.16 ? 7% +0.1 0.23 ? 7% perf-profile.children.cycles-pp.scheduler_tick
0.00 +0.1 0.07 ? 18% perf-profile.children.cycles-pp.timekeeping_max_deferment
0.44 ? 5% +0.1 0.51 ? 9% perf-profile.children.cycles-pp.page_counter_try_charge
0.59 ? 5% +0.1 0.67 perf-profile.children.cycles-pp.__softirqentry_text_start
0.38 ? 5% +0.1 0.46 ? 4% perf-profile.children.cycles-pp._raw_read_lock
0.21 ? 13% +0.1 0.28 ? 12% perf-profile.children.cycles-pp.queued_write_lock_slowpath
0.28 ? 5% +0.1 0.37 ? 12% perf-profile.children.cycles-pp.vma_interval_tree_remove
0.34 ? 5% +0.1 0.43 ? 4% perf-profile.children.cycles-pp.anon_vma_interval_tree_remove
0.79 ? 6% +0.1 0.88 ? 3% perf-profile.children.cycles-pp.remove_vma
0.28 ? 13% +0.1 0.38 ? 7% perf-profile.children.cycles-pp.alloc_vmap_area
0.51 ? 6% +0.1 0.60 ? 6% perf-profile.children.cycles-pp.refill_obj_stock
0.29 ? 14% +0.1 0.39 ? 6% perf-profile.children.cycles-pp.__get_vm_area_node
0.28 ? 7% +0.1 0.38 ? 8% perf-profile.children.cycles-pp.tick_sched_handle
0.27 ? 7% +0.1 0.37 ? 8% perf-profile.children.cycles-pp.update_process_times
0.33 ? 7% +0.1 0.45 ? 7% perf-profile.children.cycles-pp.tick_sched_timer
0.67 ? 7% +0.1 0.79 ? 5% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
0.62 ? 6% +0.1 0.74 ? 4% perf-profile.children.cycles-pp.down_write
0.46 ? 6% +0.1 0.58 ? 4% perf-profile.children.cycles-pp.wait_task_inactive
0.48 ? 5% +0.1 0.61 ? 6% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.28 ? 6% +0.1 0.41 ? 7% perf-profile.children.cycles-pp.clockevents_program_event
1.11 ? 5% +0.1 1.25 ? 4% perf-profile.children.cycles-pp.release_pages
0.62 ? 4% +0.1 0.77 ? 4% perf-profile.children.cycles-pp.ktime_get
0.09 ? 24% +0.2 0.26 ? 11% perf-profile.children.cycles-pp.osq_lock
1.37 ? 5% +0.2 1.54 ? 4% perf-profile.children.cycles-pp.tlb_finish_mmu
1.36 ? 5% +0.2 1.54 ? 4% perf-profile.children.cycles-pp.tlb_flush_mmu
0.79 ? 5% +0.2 0.99 ? 4% perf-profile.children.cycles-pp.ptrace_check_attach
0.51 ? 4% +0.2 0.73 ? 9% perf-profile.children.cycles-pp.unlink_file_vma
2.00 ? 6% +0.2 2.24 ? 3% perf-profile.children.cycles-pp.anon_vma_clone
2.33 ? 6% +0.3 2.59 ? 3% perf-profile.children.cycles-pp.filemap_map_pages
2.43 ? 6% +0.3 2.69 ? 3% perf-profile.children.cycles-pp.do_fault
1.00 ? 5% +0.3 1.27 ? 6% perf-profile.children.cycles-pp.queued_read_lock_slowpath
0.22 ? 12% +0.3 0.49 ? 5% perf-profile.children.cycles-pp.cgroup_enter_frozen
0.90 ? 4% +0.3 1.18 ? 4% perf-profile.children.cycles-pp.hrtimer_interrupt
0.91 ? 4% +0.3 1.19 ? 4% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.26 ? 5% +0.3 0.57 ? 5% perf-profile.children.cycles-pp.cgroup_leave_frozen
0.19 ? 12% +0.3 0.51 ? 7% perf-profile.children.cycles-pp.rwsem_spin_on_owner
1.47 ? 3% +0.3 1.82 ? 3% perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
1.73 ? 3% +0.4 2.12 ? 4% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
1.95 ? 6% +0.4 2.38 ? 3% perf-profile.children.cycles-pp.unlink_anon_vmas
1.88 ? 3% +0.4 2.32 ? 3% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.83 ? 6% +0.6 1.42 ? 4% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.35 ? 13% +0.6 0.96 ? 6% perf-profile.children.cycles-pp.rwsem_down_write_slowpath
2.69 ? 5% +0.6 3.34 ? 3% perf-profile.children.cycles-pp.free_pgtables
1.23 ? 8% +0.8 2.06 ? 5% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
1.15 ? 5% -0.1 1.03 ? 4% perf-profile.self.cycles-pp.__sched_text_start
0.48 ? 8% -0.1 0.38 ? 4% perf-profile.self.cycles-pp.page_add_file_rmap
0.55 ? 5% -0.1 0.45 ? 5% perf-profile.self.cycles-pp.update_load_avg
0.71 ? 6% -0.1 0.62 ? 4% perf-profile.self.cycles-pp.update_rq_clock
0.51 ? 6% -0.1 0.43 ? 6% perf-profile.self.cycles-pp.___might_sleep
0.54 ? 3% -0.1 0.48 ? 5% perf-profile.self.cycles-pp.do_idle
0.39 ? 6% -0.1 0.33 ? 5% perf-profile.self.cycles-pp.__switch_to_asm
0.31 ? 5% -0.1 0.26 ? 4% perf-profile.self.cycles-pp.update_curr
0.24 ? 7% -0.0 0.20 ? 7% perf-profile.self.cycles-pp.obj_cgroup_charge
0.36 ? 4% -0.0 0.32 ? 3% perf-profile.self.cycles-pp.find_idlest_group
0.23 ? 5% -0.0 0.20 ? 6% perf-profile.self.cycles-pp.reweight_entity
0.22 ? 4% -0.0 0.19 ? 4% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.20 ? 10% -0.0 0.17 ? 7% perf-profile.self.cycles-pp.sync_regs
0.12 ? 11% -0.0 0.09 ? 11% perf-profile.self.cycles-pp.try_charge
0.09 ? 7% -0.0 0.06 ? 11% perf-profile.self.cycles-pp.memcpy_erms
0.36 ? 2% -0.0 0.34 ? 3% perf-profile.self.cycles-pp.native_irq_return_iret
0.23 ? 3% -0.0 0.20 ? 4% perf-profile.self.cycles-pp.__task_pid_nr_ns
0.12 ? 6% -0.0 0.09 ? 11% perf-profile.self.cycles-pp.ttwu_queue_wakelist
0.13 ? 5% -0.0 0.11 ? 4% perf-profile.self.cycles-pp.___perf_sw_event
0.16 ? 7% -0.0 0.14 ? 5% perf-profile.self.cycles-pp.dequeue_entity
0.14 ? 4% -0.0 0.12 ? 6% perf-profile.self.cycles-pp.__wake_up_common
0.14 ? 6% -0.0 0.12 ? 3% perf-profile.self.cycles-pp.copy_page
0.08 ? 5% -0.0 0.07 ? 7% perf-profile.self.cycles-pp.call_cpuidle
0.09 ? 8% -0.0 0.07 ? 6% perf-profile.self.cycles-pp.__calc_delta
0.07 ? 7% +0.0 0.08 perf-profile.self.cycles-pp.alloc_vmap_area
0.06 ? 7% +0.0 0.08 ? 10% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
0.07 ? 14% +0.0 0.11 ? 8% perf-profile.self.cycles-pp.userfaultfd_unmap_prep
0.08 ? 9% +0.0 0.11 ? 5% perf-profile.self.cycles-pp.drain_obj_stock
0.15 ? 8% +0.0 0.19 ? 5% perf-profile.self.cycles-pp.unlink_anon_vmas
0.16 ? 9% +0.0 0.21 ? 9% perf-profile.self.cycles-pp.page_counter_cancel
0.12 ? 11% +0.1 0.18 ? 11% perf-profile.self.cycles-pp.__rb_erase_color
0.27 ? 5% +0.1 0.33 ? 6% perf-profile.self.cycles-pp.cpuidle_enter_state
0.00 +0.1 0.06 ? 13% perf-profile.self.cycles-pp.cgroup_enter_frozen
0.15 ? 8% +0.1 0.21 ? 6% perf-profile.self.cycles-pp.find_get_task_by_vpid
0.00 +0.1 0.06 ? 14% perf-profile.self.cycles-pp.calc_global_load_tick
0.00 +0.1 0.07 ? 18% perf-profile.self.cycles-pp.timekeeping_max_deferment
0.38 ? 6% +0.1 0.45 ? 3% perf-profile.self.cycles-pp._raw_read_lock
0.06 ? 11% +0.1 0.15 ? 7% perf-profile.self.cycles-pp.rwsem_down_write_slowpath
0.33 ? 5% +0.1 0.43 ? 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_remove
0.27 ? 4% +0.1 0.37 ? 12% perf-profile.self.cycles-pp.vma_interval_tree_remove
0.81 ? 4% +0.1 0.92 ? 4% perf-profile.self.cycles-pp.kmem_cache_free
0.67 ? 8% +0.1 0.78 ? 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
0.31 ? 8% +0.1 0.43 ? 5% perf-profile.self.cycles-pp.ptrace_stop
0.85 ? 6% +0.1 0.97 ? 4% perf-profile.self.cycles-pp.release_pages
0.49 ? 6% +0.1 0.62 ? 5% perf-profile.self.cycles-pp.down_write
0.63 ? 6% +0.1 0.76 ? 4% perf-profile.self.cycles-pp.do_wait
0.17 ? 9% +0.2 0.32 ? 3% perf-profile.self.cycles-pp.wait_task_inactive
0.56 ? 6% +0.2 0.72 ? 4% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.43 ? 6% +0.2 0.59 ? 5% perf-profile.self.cycles-pp.ktime_get
0.09 ? 23% +0.2 0.26 ? 11% perf-profile.self.cycles-pp.osq_lock
0.18 ? 12% +0.3 0.50 ? 7% perf-profile.self.cycles-pp.rwsem_spin_on_owner
1.33 ? 6% +0.3 1.67 ? 3% perf-profile.self.cycles-pp.filemap_map_pages
1.22 ? 7% +0.8 2.04 ? 5% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath



pmeter.Average_Active_Power

328 +---------------------------------------------------------------------+
326 |-+ O O |
| O O O O O O O O O O O O O O O O |
324 |-O O O O O O O O O O O O OO |
322 |-+ O O O O O O O |
320 |-+ |
318 |-+ |
| |
316 |-+ + |
314 |-+ :: + |
312 |-+ +. : +. +. +. +. ++ + : .+ |
310 |-+ + + .++. : +. : + : +. + + .+ + : +.+ +.+ .+ |
|.++ + + + +.+.+ ++.++ +.+ + : .+ + |
308 |-+ + |
306 +---------------------------------------------------------------------+


stress-ng.time.user_time

220 +---------------------------------------------------------------------+
|.+ + .++.+. +.+ + .+ ++. |
215 |-++ ++.+ .++. + + + +.+ + ++.+.+ + :.+ : + ++.+.+ |
| + + +.+ + :+ + + + + +.+ |
210 |-+ + + + |
| |
205 |-+ |
| |
200 |-+ |
| |
195 |-+O O OO O O OO O O O OO O O O O O O O O OO O |
| O O O O O O O O O O O O O O O O |
190 |-+ O O |
| |
185 +---------------------------------------------------------------------+


stress-ng.time.system_time

405 +---------------------------------------------------------------------+
| O O OO OO |
400 |-+O O O O |
| O O O O O O OO O O OO OO OO O OO |
| O OO O O O O O O OO O |
395 |-+ O |
| |
390 |-+ |
| |
385 |-+ |
| |
| .+. +. .+. +. |
380 |-++ ++.++.+.++.+. +. .+ +.+ + ++.+.++.+ +.++.+. +.+ .+.+ |
|.+ + ++.+ +.+ + + |
375 +---------------------------------------------------------------------+


stress-ng.time.major_page_faults

55000 +-------------------------------------------------------------------+
50000 |-+ O |
| |
45000 |-O |
40000 |-+O O O |
35000 |-+ O O O O O O O O |
30000 |-+ O O O O O O O O |
| O O OO OO O O O O O O O O O |
25000 |-+ O O O O O |
20000 |-+ O |
15000 |-+ |
10000 |-+ |
|.+ .++.++.+.++.++.+.++.++. +.+.++.+ .+.++. +.+ .+. +. +.+.++.++.+ |
5000 |-++ + + + + + + |
0 +-------------------------------------------------------------------+


stress-ng.time.voluntary_context_switches

6.4e+07 +-----------------------------------------------------------------+
| +. + .+ +.+ |
6.2e+07 |.++ .+ ++. +.+ +.++ .++.++ .+ .+. :: + : : +.+ .+ |
| + + +. + + + + + + + + : : +.+ + |
6e+07 |-+ + ++ + + + |
| |
5.8e+07 |-+ |
| |
5.6e+07 |-+ |
| |
5.4e+07 |-+ O OO O O O O O O O O O O OO O O |
| OO OO O O O O O O OO O OO O O OO O O |
5.2e+07 |-+ OO OO |
| |
5e+07 +-----------------------------------------------------------------+


stress-ng.vm-segv.ops

820000 +------------------------------------------------------------------+
|.++ .+ +.+ ++ +.++ +.++.+ .+ .+ :: + + : + .+ |
800000 |-+ + + +. + :+ + .+ :.+ + :: :+ +.: + |
780000 |-+ + ++ + + + + + + |
| |
760000 |-+ |
| |
740000 |-+ |
| |
720000 |-+ |
700000 |-+ O O OO O O O O O |
| O OO O O O OO O OO OO O O OO OO O O OO O O O |
680000 |-O O OO O O |
| O |
660000 +------------------------------------------------------------------+


stress-ng.vm-segv.ops_per_sec

14000 +-------------------------------------------------------------------+
| |
13500 |.++ .+ .+ +.++.++.+ .++.++ + + + ++ +.++.+ |
| : + +. + :+ : + : + + :+ :+ + + + +.+ |
| : : +.+ + :+ : + + + + ++ |
13000 |-+ + + + |
| |
12500 |-+ |
| |
12000 |-+ |
| O O O O |
| O OO O O O O O OO O O OO O O O O O OO O |
11500 |-O O OO O O O O O O O O O O O |
| O |
11000 +-------------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang



2021-02-23 12:40:31

by Valentin Schneider

[permalink] [raw]
Subject: Re: [sched/fair] b360fb5e59: stress-ng.vm-segv.ops_per_sec -13.9% regression

On 23/02/21 10:30, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed a -13.9% regression of stress-ng.vm-segv.ops_per_sec due to commit:
>
>
> commit: b360fb5e5954a8a440ef95bf11257e2e7ea90340 ("[PATCH v2 1/7] sched/fair: Ignore percpu threads for imbalance pulls")
> url: https://github.com/0day-ci/linux/commits/Valentin-Schneider/sched-fair-misfit-task-load-balance-tweaks/20210219-211028
> base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git c5e6fc08feb2b88dc5dac2f3c817e1c2a4cafda4
>
> in testcase: stress-ng
> on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
> with following parameters:
>
> nr_threads: 10%
> disk: 1HDD
> testtime: 60s
> fs: ext4
> class: vm
> test: vm-segv
> cpufreq_governor: performance
> ucode: 0x5003003
>
>
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <[email protected]>
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml # job file is attached in this email
> bin/lkp split-job --compatible job.yaml
> bin/lkp run compatible-job.yaml
>
> =========================================================================================
> class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
> vm/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/vm-segv/stress-ng/60s/0x5003003
>
> commit:
> c5e6fc08fe ("sched,x86: Allow !PREEMPT_DYNAMIC")
> b360fb5e59 ("sched/fair: Ignore percpu threads for imbalance pulls")
>
> c5e6fc08feb2b88d b360fb5e5954a8a440ef95bf112
> ---------------- ---------------------------
> fail:runs %reproduction fail:runs
> | | |
> 1:6 -3% 1:6 perf-profile.children.cycles-pp.error_entry
> 0:6 -1% 0:6 perf-profile.self.cycles-pp.error_entry
> %stddev %change %stddev
> \ | \
> 11324 ± 3% -28.1% 8140 ± 3% stress-ng.time.involuntary_context_switches
> 6818 ± 15% +315.2% 28311 ± 12% stress-ng.time.major_page_faults
> 30952041 -12.8% 26988502 stress-ng.time.minor_page_faults

> 378.82 +5.3% 398.75 stress-ng.time.system_time
> 215.82 -10.0% 194.24 stress-ng.time.user_time
> 62102177 -13.9% 53448474 stress-ng.time.voluntary_context_switches
> 810348 -13.9% 698034 stress-ng.vm-segv.ops
> 13505 -13.9% 11633 stress-ng.vm-segv.ops_per_sec

My hunch was that this could be due to the balance interval no longer being
increased when load balance catches pcpu kworkers, but that's not the case:
LBF_ALL_PINNED would still be set, and that still doubles the balance
interval when no task was moved.
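
For reference, here is a simplified sketch of the relevant paths
(paraphrased from kernel/sched/fair.c with this patch applied; abridged,
not the exact hunks):

	/*
	 * can_migrate_task(), abridged: the pcpu kthread bail-out sits
	 * before the affinity check, so it neither sets LBF_SOME_PINNED
	 * nor clears LBF_ALL_PINNED.
	 */
	if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
		return 0;

	if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
		env->flags |= LBF_SOME_PINNED;
		/* ... possibly retry with another dst_cpu ... */
		return 0;
	}

	/* Found at least one task that could run on dst_cpu */
	env->flags &= ~LBF_ALL_PINNED;

	/*
	 * load_balance() tail, abridged: if nothing was moved and
	 * everything was pinned, the interval keeps doubling as before.
	 */
	if ((env.flags & LBF_ALL_PINNED &&
	     sd->balance_interval < MAX_PINNED_INTERVAL) ||
	    sd->balance_interval < sd->max_interval)
		sd->balance_interval *= 2;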

I'm not sure which stat to look at wrt softirqs; the overall mpstat
percentages suggest there weren't that many more:

> 1.11 -0.3 0.85 mpstat.cpu.all.irq%
> 0.18 -0.0 0.16 mpstat.cpu.all.soft%
> 0.40 -0.1 0.35 mpstat.cpu.all.usr%

But the per-CPU RCU softirq counts do show an increase:

> 11501 ± 5% +9.6% 12610 ± 6% softirqs.CPU12.RCU
> 10678 ± 5% +16.0% 12383 ± 2% softirqs.CPU16.RCU
> 10871 ± 4% +13.1% 12294 ± 2% softirqs.CPU17.RCU
> 10724 ± 2% +13.8% 12205 ± 3% softirqs.CPU18.RCU
> 10810 ± 4% +16.2% 12560 ± 3% softirqs.CPU19.RCU
> 10647 ± 6% +16.2% 12372 ± 6% softirqs.CPU20.RCU
> 10863 ± 3% +14.7% 12461 ± 3% softirqs.CPU21.RCU
> 11231 ± 5% +14.6% 12873 ± 6% softirqs.CPU22.RCU
> 11141 ± 6% +21.0% 13480 ± 8% softirqs.CPU64.RCU
> 11209 ± 6% +20.8% 13545 ± 2% softirqs.CPU65.RCU
> 11108 ± 3% +20.0% 13334 ± 6% softirqs.CPU66.RCU
> 11414 ± 9% +16.9% 13345 ± 6% softirqs.CPU67.RCU
> 11162 ± 4% +16.2% 12968 ± 9% softirqs.CPU68.RCU
> 11035 ± 5% +13.6% 12533 ± 4% softirqs.CPU69.RCU
> 11003 ± 5% +18.9% 13078 ± 8% softirqs.CPU70.RCU
> 11097 ± 4% +14.9% 12756 ± 5% softirqs.CPU71.RCU

2021-03-04 23:27:16

by Valentin Schneider

[permalink] [raw]
Subject: Re: [sched/fair] b360fb5e59: stress-ng.vm-segv.ops_per_sec -13.9% regression

On 23/02/21 12:36, Valentin Schneider wrote:
> On 23/02/21 10:30, kernel test robot wrote:
>> Greeting,
>>
>> FYI, we noticed a -13.9% regression of stress-ng.vm-segv.ops_per_sec due to commit:
>>
>>
>> commit: b360fb5e5954a8a440ef95bf11257e2e7ea90340 ("[PATCH v2 1/7] sched/fair: Ignore percpu threads for imbalance pulls")
>> url: https://github.com/0day-ci/linux/commits/Valentin-Schneider/sched-fair-misfit-task-load-balance-tweaks/20210219-211028
>> base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git c5e6fc08feb2b88dc5dac2f3c817e1c2a4cafda4
>>
>> in testcase: stress-ng
>> on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
>> with following parameters:
>>
>> nr_threads: 10%
>> disk: 1HDD
>> testtime: 60s
>> fs: ext4
>> class: vm
>> test: vm-segv
>> cpufreq_governor: performance
>> ucode: 0x5003003
>>

So I've been running this on my 32 CPU arm64 desktop with:
nr_threads: 10%
nr_threads: 50%
(20 iterations each)

In the 50% case I see a ~2% improvement; in the 10% case, a -0.3%
regression (another batch showed -0.08%). That's still far off from the
reported -14%. If it's really needed I can go find an x86 box to test this
on, but so far it looks like a fluke.