2021-04-22 12:38:21

by Peter Zijlstra

Subject: [PATCH 01/19] sched/fair: Add a few assertions


Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
kernel/sched/fair.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6246,6 +6246,11 @@ static int select_idle_sibling(struct ta
task_util = uclamp_task_util(p);
}

+ /*
+ * per-cpu select_idle_mask usage
+ */
+ lockdep_assert_irqs_disabled();
+
if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
asym_fits_capacity(task_util, target))
return target;
@@ -6711,8 +6716,6 @@ static int find_energy_efficient_cpu(str
* certain conditions an idle sibling CPU if the domain has SD_WAKE_AFFINE set.
*
* Returns the target CPU number.
- *
- * preempt must be disabled.
*/
static int
select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
@@ -6725,6 +6728,10 @@ select_task_rq_fair(struct task_struct *
/* SD_flags and WF_flags share the first nibble */
int sd_flag = wake_flags & 0xF;

+ /*
+ * required for stable ->cpus_allowed
+ */
+ lockdep_assert_held(&p->pi_lock);
if (wake_flags & WF_TTWU) {
record_wakee(p);




2021-05-12 10:31:59

by tip-bot2 for Peter Zijlstra

Subject: [tip: sched/core] sched/fair: Add a few assertions

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 9099a14708ce1dfecb6002605594a0daa319b555
Gitweb: https://git.kernel.org/tip/9099a14708ce1dfecb6002605594a0daa319b555
Author: Peter Zijlstra <[email protected]>
AuthorDate: Tue, 17 Nov 2020 18:19:35 -05:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Wed, 12 May 2021 11:43:26 +02:00

sched/fair: Add a few assertions

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Don Hiatt <[email protected]>
Tested-by: Hongyu Ning <[email protected]>
Tested-by: Vincent Guittot <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/sched/fair.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c209f68..6bdbb7b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6288,6 +6288,11 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
task_util = uclamp_task_util(p);
}

+ /*
+ * per-cpu select_idle_mask usage
+ */
+ lockdep_assert_irqs_disabled();
+
if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
asym_fits_capacity(task_util, target))
return target;
@@ -6781,8 +6786,6 @@ unlock:
* certain conditions an idle sibling CPU if the domain has SD_WAKE_AFFINE set.
*
* Returns the target CPU number.
- *
- * preempt must be disabled.
*/
static int
select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
@@ -6795,6 +6798,10 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
/* SD_flags and WF_flags share the first nibble */
int sd_flag = wake_flags & 0xF;

+ /*
+ * required for stable ->cpus_allowed
+ */
+ lockdep_assert_held(&p->pi_lock);
if (wake_flags & WF_TTWU) {
record_wakee(p);

2021-05-13 10:52:55

by Ning, Hongyu

Subject: Re: [tip: sched/core] sched/fair: Add a few assertions


On 2021/5/12 18:28, tip-bot2 for Peter Zijlstra wrote:
> The following commit has been merged into the sched/core branch of tip:
>
> Commit-ID: 9099a14708ce1dfecb6002605594a0daa319b555
> Gitweb: https://git.kernel.org/tip/9099a14708ce1dfecb6002605594a0daa319b555
> Author: Peter Zijlstra <[email protected]>
> AuthorDate: Tue, 17 Nov 2020 18:19:35 -05:00
> Committer: Peter Zijlstra <[email protected]>
> CommitterDate: Wed, 12 May 2021 11:43:26 +02:00
>
> sched/fair: Add a few assertions
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Tested-by: Don Hiatt <[email protected]>
> Tested-by: Hongyu Ning <[email protected]>
> Tested-by: Vincent Guittot <[email protected]>
> Link: https://lkml.kernel.org/r/[email protected]
> ---
> kernel/sched/fair.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>

Adding quick test results based on the tip tree sched/core merge commit:

====TEST INFO====
- kernel under test:
-- tip tree sched/core merge: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=60208dac643e24cbc62317de4e486fdcbbf05215
-- coresched_v10 kernel source: https://github.com/digitalocean/linux-coresched/commits/coresched/v10-v5.10.y

- test machine setup:
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 2
Core(s) per socket: 48
Socket(s): 2
NUMA node(s): 4

- performance test workloads:
-- A. sysbench cpu (192 threads) + sysbench cpu (192 threads)
-- B. sysbench cpu (192 threads) + sysbench mysql (192 threads)
-- C. uperf netperf.xml (192 threads over TCP or UDP protocol separately)
-- D. will-it-scale context_switch via pipe (192 threads)

- negative test:
-- A. continuously toggle coresched (enable/disable) through prctl on the task cookies of a PGID, while the uperf workload runs at full load with coresched on
-- B. continuously toggle SMT (on/off) via /sys/devices/system/cpu/smt/control, while the uperf workload runs at full load with coresched on
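
For readers who want to reproduce negative test B, the SMT toggle can be sketched in C as below. This is only an illustration and not the script used for the results in this report: the helper names smt_set() and smt_toggle() are hypothetical, the path is passed as a parameter so the helpers can be tried on a scratch file first, and writing to the real sysfs file requires root.

```c
#include <stdio.h>
#include <string.h>

/* Sysfs knob named in the negative test description. */
#define SMT_CONTROL_PATH "/sys/devices/system/cpu/smt/control"

/*
 * Hypothetical helper: write "on" or "off" to an SMT control file.
 * Returns 0 on success, -1 on an invalid state or an I/O failure.
 */
int smt_set(const char *path, const char *state)
{
	FILE *f;
	int ok;

	if (strcmp(state, "on") != 0 && strcmp(state, "off") != 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fputs(state, f) >= 0;
	/* Buffered write errors only show up at flush/close time. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: alternate off/on for the given number of rounds,
 * starting with "off". Stops at the first failure.
 */
int smt_toggle(const char *path, int rounds)
{
	for (int i = 0; i < rounds; i++) {
		if (smt_set(path, (i % 2) ? "on" : "off") != 0)
			return -1;
	}
	return 0;
}
```

Calling smt_toggle(SMT_CONTROL_PATH, n) as root while the workload is running would approximate the test; how long to wait between toggles is left to the caller.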

====TEST RESULTS====
- performance change key info:
-- workload B, coresched (cs_on): sysbench mysql performance dropped by around 20% vs coresched_v10
-- workload C, coresched (cs_on): uperf performance almost doubled vs coresched_v10
-- workload C, default (cs_off): uperf performance dropped by 25% or more vs coresched_v10; the same issue is seen on the v5.13-rc1 base (w/o coresched patchset)
-- workload D, coresched (cs_on): will-it-scale (wis) performance roughly doubled vs coresched_v10

- negative test summary:
no platform hang or kernel panic observed in either test

- performance info of workloads, normalized to the coresched_v10 results
-- performance workload A:
Note:
* no significant performance change compared to coresched_v10 (all results within about 5%)
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| | ** | coresched_tip_merge_base_v5.13-rc1 | coresched_tip_merge_base_v5.13-rc1 | ** | coresched_v10_base_v5.10.11 | coresched_v10_base_v5.10.11 |
+=======================================+======+======================================+========================================+=======+===============================+=================================+
| workload | ** | sysbench cpu * 192 | sysbench cpu * 192 | ** | sysbench cpu * 192 | sysbench cpu * 192 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| prctl/cgroup | ** | prctl on workload cpu_0 | prctl on workload cpu_1 | ** | cg_sysbench_cpu_0 | cg_sysbench_cpu_1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| record_item | ** | Tput_avg (events/s) | Tput_avg (events/s) | ** | Tput_avg (events/s) | Tput_avg (events/s) |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| coresched normalized vs coresched_v10 | ** | 0.97 | 1.05 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| default normalized vs coresched_v10 | ** | 1.03 | 0.95 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| smtoff normalized vs coresched_v10 | ** | 0.96 | 1.04 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+


-- performance workload B:
Note:
* coresched (cs_on): sysbench mysql performance dropped by around 20% vs coresched_v10
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| | ** | coresched_tip_merge_base_v5.13-rc1 | coresched_tip_merge_base_v5.13-rc1 | ** | coresched_v10_base_v5.10.11 | coresched_v10_base_v5.10.11 |
+=======================================+======+======================================+========================================+=======+===============================+=================================+
| workload | ** | sysbench cpu * 192 | sysbench mysql * 192 | ** | sysbench cpu * 192 | sysbench mysql * 192 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| prctl/cgroup | ** | prctl on workload cpu_0 | prctl on workload mysql_0 | ** | cg_sysbench_cpu_0 | cg_sysbench_mysql_0 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| record_item | ** | Tput_avg (events/s) | Tput_avg (events/s) | ** | Tput_avg (events/s) | Tput_avg (events/s) |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| coresched normalized vs coresched_v10 | ** | 1.02 | 0.81 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| default normalized vs coresched_v10 | ** | 1.01 | 0.94 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| smtoff normalized vs coresched_v10 | ** | 0.93 | 1.18 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+


-- performance workload C:
Note:
* coresched (cs_on): uperf performance almost doubled vs coresched_v10
* default (cs_off): uperf performance dropped by 25% or more vs coresched_v10; the same issue is seen on the v5.13-rc1 base (w/o coresched patchset)
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| | ** | coresched_tip_merge_base_v5.13-rc1 | coresched_tip_merge_base_v5.13-rc1 | ** | coresched_v10_base_v5.10.11 | coresched_v10_base_v5.10.11 |
+=======================================+======+======================================+========================================+=======+===============================+=================================+
| workload | ** | uperf netperf TCP * 192 | uperf netperf UDP * 192 | ** | uperf netperf TCP * 192 | uperf netperf UDP * 192 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| prctl/cgroup | ** | prctl on workload uperf | prctl on workload uperf | ** | cg_uperf | cg_uperf |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| record_item | ** | Tput_avg (Gb/s) | Tput_avg (Gb/s) | ** | Tput_avg (Gb/s) | Tput_avg (Gb/s) |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| coresched normalized vs coresched_v10 | ** | 1.83 | 1.93 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| default normalized vs coresched_v10 | ** | 0.75 | 0.71 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+
| smtoff normalized vs coresched_v10 | ** | 1 | 1.06 | ** | 1 | 1 |
+---------------------------------------+------+--------------------------------------+----------------------------------------+-------+-------------------------------+---------------------------------+


-- performance workload D:
Note:
* coresched (cs_on): will-it-scale (wis) performance roughly doubled vs coresched_v10
* default (cs_off) and smtoff: wis performance is better than coresched_v10
+---------------------------------------+------+--------------------------------------+-------+-------------------------------+
| | ** | coresched_tip_merge_base_v5.13-rc1 | ** | coresched_v10_base_v5.10.11 |
+=======================================+======+======================================+=======+===============================+
| workload | ** | will-it-scale * 192 | ** | will-it-scale * 192 |
| | | (pipe based context_switch) | | (pipe based context_switch) |
+---------------------------------------+------+--------------------------------------+-------+-------------------------------+
| prctl/cgroup | ** | prctl on workload wis | ** | cg_wis |
+---------------------------------------+------+--------------------------------------+-------+-------------------------------+
| record_item | ** | threads_avg | ** | threads_avg |
+---------------------------------------+------+--------------------------------------+-------+-------------------------------+
| coresched normalized vs coresched_v10 | ** | 2.01 | ** | 1.00 |
+---------------------------------------+------+--------------------------------------+-------+-------------------------------+
| default normalized vs coresched_v10 | ** | 1.13 | ** | 1.00 |
+---------------------------------------+------+--------------------------------------+-------+-------------------------------+
| smtoff normalized vs coresched_v10 | ** | 1.29 | ** | 1.00 |
+---------------------------------------+------+--------------------------------------+-------+-------------------------------+


-- notes on record_item:
* coresched normalized vs coresched_v10: SMT on, core scheduling enabled; test result divided by the coresched_v10 result under the same config
* default normalized vs coresched_v10: SMT on, core scheduling disabled; test result divided by the coresched_v10 result under the same config
* smtoff normalized vs coresched_v10: SMT off; test result divided by the coresched_v10 result under the same config
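
The normalization described in the notes above amounts to a single division per table cell. A minimal sketch in C, assuming raw throughput values are available for both kernels; the function names are hypothetical and no raw numbers from this test run are reproduced here:

```c
#include <math.h>

/*
 * Hypothetical helper: normalize a measured result against the
 * coresched_v10 result for the same configuration.
 * Returns NAN when the baseline is not a positive number.
 */
double normalize_vs_baseline(double result, double baseline)
{
	if (!(baseline > 0.0))
		return NAN;
	return result / baseline;
}

/*
 * Hypothetical helper: percentage change implied by a normalized value.
 * A normalized value of 0.81 corresponds to a drop of roughly 19%.
 */
double percent_change(double normalized)
{
	return (normalized - 1.0) * 100.0;
}
```

A value of 1 in the coresched_v10 columns is therefore just the baseline divided by itself, and values above 1 in the tip merge columns mean higher throughput than coresched_v10.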



-- Hongyu Ning