2021-11-24 08:56:57

by Yicong Yang

Subject: [PATCH] sched/fair: Clear target from cpus to scan in select_idle_cpu

Commit 56498cfb045d noticed that "When select_idle_cpu starts scanning for
an idle CPU, it starts with a target CPU that has already been checked
by select_idle_sibling. This patch starts with the next CPU instead."
That commit only changed the scan start CPU to target + 1 but left the
target in the scanning cpumask, so the target can still be checked
again on the last iteration. Fix this by clearing the target from the
cpus to scan.

Fixes: 56498cfb045d ("sched/fair: Avoid a second scan of target in select_idle_cpu")
Signed-off-by: Yicong Yang <[email protected]>
---
kernel/sched/fair.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6e476f6d9435..e1031e0da231 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6249,6 +6249,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
 		return -1;

 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+	cpumask_clear_cpu(target, cpus);

 	if (sched_feat(SIS_PROP) && !has_idle_core) {
 		u64 avg_cost, avg_idle, span_avg;
--
2.33.0
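
The rescan described in the commit message can be illustrated with a small
userspace model. This is only a sketch, not kernel code: `NR_CPUS`,
`scan_all_busy()` and the 32-bit mask are hypothetical stand-ins, and the loop
merely imitates a wrap-around walk that starts at target + 1, assuming no CPU
in the mask is idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* illustrative size, not a kernel constant */

/*
 * Count how many CPUs a wrap-around scan visits when none of them is idle.
 *
 * mask:   candidate CPUs (bit i set means CPU i is scanned)
 * target: CPU the caller has already checked
 *
 * The walk starts at target + 1 and wraps around, so target itself is the
 * last CPU reached if its bit is still set in the mask.
 */
static int scan_all_busy(uint32_t mask, int target, bool *target_visited)
{
	int visited = 0;
	bool saw_target = false;

	assert(target >= 0 && target < NR_CPUS);

	for (int i = 0; i < NR_CPUS; i++) {
		int cpu = (target + 1 + i) % NR_CPUS;

		if (!(mask & (1u << cpu)))
			continue;	/* CPU not in the scan mask */
		visited++;
		if (cpu == target)
			saw_target = true;
	}

	if (target_visited)
		*target_visited = saw_target;
	return visited;
}
```

With all eight bits set and target 3, the model visits eight CPUs and reaches
CPU 3 last; with bit 3 cleared first, it visits seven and never returns to
the target.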



2021-11-25 11:19:39

by Mel Gorman

Subject: Re: [PATCH] sched/fair: Clear target from cpus to scan in select_idle_cpu

On Wed, Nov 24, 2021 at 04:54:01PM +0800, Yicong Yang wrote:
> Commit 56498cfb045d noticed that "When select_idle_cpu starts scanning for
> an idle CPU, it starts with a target CPU that has already been checked
> by select_idle_sibling. This patch starts with the next CPU instead."
> That commit only changed the scan start CPU to target + 1 but left the
> target in the scanning cpumask, so the target can still be checked
> again on the last iteration. Fix this by clearing the target from the
> cpus to scan.
>
> Fixes: 56498cfb045d ("sched/fair: Avoid a second scan of target in select_idle_cpu")
> Signed-off-by: Yicong Yang <[email protected]>

Did you check the performance of this? When I tried something like this
in a different context, I found that clearing the bit cost more than
simply starting at target + 1. For the target to be rescanned, the whole
mask has to be scanned because no other CPU is idle, which is the
unlikely case. By clearing the bit, a cost is always incurred even if
the first CPU scanned is idle.

--
Mel Gorman
SUSE Labs

2021-11-25 12:48:56

by Yicong Yang

Subject: Re: [PATCH] sched/fair: Clear target from cpus to scan in select_idle_cpu

On 2021/11/25 19:17, Mel Gorman wrote:
> On Wed, Nov 24, 2021 at 04:54:01PM +0800, Yicong Yang wrote:
>> Commit 56498cfb045d noticed that "When select_idle_cpu starts scanning for
>> an idle CPU, it starts with a target CPU that has already been checked
>> by select_idle_sibling. This patch starts with the next CPU instead."
>> That commit only changed the scan start CPU to target + 1 but left the
>> target in the scanning cpumask, so the target can still be checked
>> again on the last iteration. Fix this by clearing the target from the
>> cpus to scan.
>>
>> Fixes: 56498cfb045d ("sched/fair: Avoid a second scan of target in select_idle_cpu")
>> Signed-off-by: Yicong Yang <[email protected]>
>
> Did you check the performance of this? When I tried something like this
> in a different context, I found that clearing the bit cost more than
> simply starting at target + 1. For the target to be rescanned, the whole
> mask has to be scanned because no other CPU is idle, which is the
> unlikely case. By clearing the bit, a cost is always incurred even if
> the first CPU scanned is idle.
>

Not yet; the patch came from reading the code. I've launched some tests and
we'll see the results tomorrow.

We traced the scanning here, and the case where the whole LLC is scanned
without finding an idle cpu is not negligible. On a 4-NUMA 128-core
Kunpeng 920 server tested with mysql, there is a ~1% probability of not
finding an idle cpu when the number of sysbench threads is 128. The
probability will increase as the load increases.
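
One rough way to weigh the two positions in this exchange is an expected-cost
comparison. This is only a sketch under assumed symbols that do not appear in
the thread: $p$ is the probability that a scan walks the whole mask without
finding an idle CPU (about 0.01 in the 128-thread sysbench trace),
$c_{\text{check}}$ is the cost of one redundant idle check of the target, and
$c_{\text{clear}}$ is the cost of clearing one bit on every call.

```latex
\[
  \underbrace{c_{\text{clear}}}_{\text{paid on every call}}
  \;<\;
  \underbrace{p \cdot c_{\text{check}}}_{\text{paid only after a full scan}}
  \quad\Longleftrightarrow\quad
  \frac{c_{\text{check}}}{c_{\text{clear}}} \;>\; \frac{1}{p} \;\approx\; 100
  \quad \text{for } p \approx 0.01 .
\]
```

Under this simplified model, clearing the bit pays off only if the redundant
check costs roughly a hundred times more than the clear, or if $p$ grows
with load. The model ignores scan-depth limits such as SIS_PROP and any cache
effects, so it does not replace measurement.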

2021-11-26 09:40:23

by Yicong Yang

Subject: Re: [PATCH] sched/fair: Clear target from cpus to scan in select_idle_cpu

On 2021/11/25 20:46, Yicong Yang wrote:
> On 2021/11/25 19:17, Mel Gorman wrote:
>> On Wed, Nov 24, 2021 at 04:54:01PM +0800, Yicong Yang wrote:
>>> Commit 56498cfb045d noticed that "When select_idle_cpu starts scanning for
>>> an idle CPU, it starts with a target CPU that has already been checked
>>> by select_idle_sibling. This patch starts with the next CPU instead."
>>> That commit only changed the scan start CPU to target + 1 but left the
>>> target in the scanning cpumask, so the target can still be checked
>>> again on the last iteration. Fix this by clearing the target from the
>>> cpus to scan.
>>>
>>> Fixes: 56498cfb045d ("sched/fair: Avoid a second scan of target in select_idle_cpu")
>>> Signed-off-by: Yicong Yang <[email protected]>
>>
>> Did you check the performance of this? When I tried something like this
>> in a different context, I found that clearing the bit cost more than
>> simply starting at target + 1. For the target to be rescanned, the whole
>> mask has to be scanned because no other CPU is idle, which is the
>> unlikely case. By clearing the bit, a cost is always incurred even if
>> the first CPU scanned is idle.
>>
>
> Not yet; the patch came from reading the code. I've launched some tests and
> we'll see the results tomorrow.
>
> We traced the scanning here, and the case where the whole LLC is scanned
> without finding an idle cpu is not negligible. On a 4-NUMA 128-core
> Kunpeng 920 server tested with mysql, there is a ~1% probability of not
> finding an idle cpu when the number of sysbench threads is 128. The
> probability will increase as the load increases.
>

Hi Mel,

I tested hackbench and tbench on our machine with
numactl -N 0 run-mmtests.sh -c $config

config-workload-hackbench-process-pipes
          5.16-rc1             5.16-rc1+patch
Amean 1     0.5178 (  0.00%)     0.5207 ( -0.56%)
Amean 4     1.0108 (  0.00%)     0.9274 (  8.25%)
Amean 7     1.9349 (  0.00%)     1.8508 (  4.35%)
Amean 12    3.4179 (  0.00%)     3.3170 (  2.95%)
Amean 21    5.9209 (  0.00%)     5.8878 (  0.56%)
Amean 30    6.8677 (  0.00%)     6.6241 *  3.55%*
Amean 48   10.3759 (  0.00%)     9.5785 *  7.69%*
Amean 64   13.4606 (  0.00%)    12.3713 *  8.09%*

config-network-tbench
          5.16-rc1             5.16-rc1+patch
Hmean 1     324.56 (  0.00%)     324.01 * -0.17%*
Hmean 2     650.91 (  0.00%)     646.89 * -0.62%*
Hmean 4    1291.16 (  0.00%)    1298.56 *  0.57%*
Hmean 8    2625.06 (  0.00%)    2615.81 * -0.35%*
Hmean 16   5293.86 (  0.00%)    5267.24 * -0.50%*
Hmean 32   8464.34 (  0.00%)    9578.40 * 13.16%*
Hmean 64   7417.02 (  0.00%)    7218.91 * -2.67%*
Hmean 128  6313.71 (  0.00%)    6180.67 * -2.11%*

Thanks.
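
For readers checking the percentage columns in the two tables, the figures are
consistent with a relative change against the 5.16-rc1 baseline, with the sign
flipped for the hackbench times (Amean, lower is better) so that a positive
value always means the patched kernel did better. The helpers below are a
hypothetical sketch of that arithmetic; the function names are made up for
illustration and this is not taken from the mmtests scripts.

```c
#include <assert.h>

/* Gain in percent for a metric where a smaller value is better (Amean). */
static double gain_lower_is_better(double base, double patched)
{
	assert(base > 0.0);
	return (base - patched) / base * 100.0;
}

/* Gain in percent for a metric where a larger value is better (Hmean). */
static double gain_higher_is_better(double base, double patched)
{
	assert(base > 0.0);
	return (patched - base) / base * 100.0;
}
```

For example, the hackbench row for 4 groups gives (1.0108 - 0.9274) / 1.0108,
about 8.25%, and the tbench row for 32 clients gives
(9578.40 - 8464.34) / 8464.34, about 13.16%, matching the reported columns.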