2019-04-09 17:37:46

by Valentin Schneider

Subject: [PATCH 0/2] sched/topology: Small fixes

Nothing too fancy here, just some fixlets resulting from staring at topology.c

Patch 1 skips writes to sched_groups when they are already initialized
Patch 2 fixes a derelict comment (spotted by Dietmar)

Valentin Schneider (2):
sched/topology: build_sched_groups: Skip duplicate group rewrites
sched/topology: Fix build_sched_groups() comment

kernel/sched/topology.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

--
2.20.1


2019-04-09 17:37:51

by Valentin Schneider

Subject: [PATCH 1/2] sched/topology: build_sched_groups: Skip duplicate group rewrites

While staring at build_sched_domains(), I realized that get_group()
does several duplicate (thus useless) writes.

If you take the Arm Juno r0 (LITTLEs = [0, 3, 4, 5], bigs = [1, 2]), the
sched_group build flow would look like this:

('MC[cpu]->sg' means 'per_cpu_ptr(&tl->data->sg, cpu)' with 'tl == MC')

build_sched_groups(MC[CPU0]->sd, CPU0)
  get_group(0) -> MC[CPU0]->sg
  get_group(3) -> MC[CPU3]->sg
  get_group(4) -> MC[CPU4]->sg
  get_group(5) -> MC[CPU5]->sg

build_sched_groups(DIE[CPU0]->sd, CPU0)
  get_group(0) -> DIE[CPU0]->sg
  get_group(1) -> DIE[CPU1]->sg <-----------------+
                                                  |
build_sched_groups(MC[CPU1]->sd, CPU1)            |
  get_group(1) -> MC[CPU1]->sg                    |
  get_group(2) -> MC[CPU2]->sg                    |
                                                  |
build_sched_groups(DIE[CPU1]->sd, CPU1)           ^
  get_group(1) -> DIE[CPU1]->sg } We've already set these two up here!
  get_group(3) -> DIE[CPU0]->sg }

From this point on, we will only use sched_groups that have been
previously visited & initialized. The only new operation will be
assigning the right group pointer to sd->groups.

On the Juno r0 we get 32 get_group() calls, every single one of them
writing to a sched_group->cpumask. However, all of the data structures
we need are set up after 8 visits (see above).
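
A quick breakdown of those numbers, just as a sanity check (assuming the MC +
DIE levels shown above, with get_group() called once per group covered by each
domain): at MC level, each of the 4 LITTLEs triggers 4 calls and each of the
2 bigs triggers 2, i.e. 4*4 + 2*2 = 20; at DIE level, each of the 6 CPUs
triggers 2, i.e. 6*2 = 12, which adds up to 32. The distinct groups are the
6 per-CPU MC groups plus the 2 per-cluster DIE groups, i.e. 8.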

Return early from get_group() if we've already visited (and thus
initialized) the sched_group we're looking at. Overlapping domains
are not affected as they do not use build_sched_groups().
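
For illustration, with the early return in place the MC-level pass for e.g.
CPU2 roughly becomes:

build_sched_groups(MC[CPU2]->sd, CPU2)
  get_group(2) -> MC[CPU2]->sg (already visited, returns early)
  get_group(1) -> MC[CPU1]->sg (already visited, returns early)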

Tested on a Juno and a 2 * (Xeon E5-2690) system.

Signed-off-by: Valentin Schneider <[email protected]>
---
FWIW I initially checked the refs for both sg && sg->sgc, but figured if
they weren't both 0 or > 1 then something must have gone wrong, so I
threw in a WARN_ON().
---
kernel/sched/topology.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 64bec54ded3e..6c0b7326f66e 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1059,6 +1059,7 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
 	struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu);
 	struct sched_domain *child = sd->child;
 	struct sched_group *sg;
+	bool already_visited;
 
 	if (child)
 		cpu = cpumask_first(sched_domain_span(child));
@@ -1066,9 +1067,14 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
 	sg = *per_cpu_ptr(sdd->sg, cpu);
 	sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
 
-	/* For claim_allocations: */
-	atomic_inc(&sg->ref);
-	atomic_inc(&sg->sgc->ref);
+	/* Increase refcounts for claim_allocations: */
+	already_visited = atomic_inc_return(&sg->ref) > 1;
+	/* sgc visits should follow a similar trend as sg */
+	WARN_ON(already_visited != (atomic_inc_return(&sg->sgc->ref) > 1));
+
+	/* If we have already visited that group, it's already initialized. */
+	if (already_visited)
+		return sg;
 
 	if (child) {
 		cpumask_copy(sched_group_span(sg), sched_domain_span(child));
--
2.20.1

2019-04-09 17:37:57

by Valentin Schneider

Subject: [PATCH 2/2] sched/topology: Fix build_sched_groups() comment

The comment was introduced (pre 2.6.12) by

commit 8a7a2318dc07 ("[PATCH] sched: consolidate sched domains")

and referred to sched_group->cpu_power. This was folded into
sched_group->sched_group_power in

commit 9c3f75cbd144 ("sched: Break out cpu_power from the sched_group structure")

The comment was then updated in

commit ced549fa5fc1 ("sched: Remove remaining dubious usage of "power"")

but should have replaced "sg->cpu_capacity" with
"sg->sched_group_capacity". Do that now.

Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/topology.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6c0b7326f66e..c65b31e9458b 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1093,8 +1093,8 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
 
 /*
  * build_sched_groups will build a circular linked list of the groups
- * covered by the given span, and will set each group's ->cpumask correctly,
- * and ->cpu_capacity to 0.
+ * covered by the given span, will set each group's ->cpumask correctly,
+ * and will initialize their ->sgc.
  *
  * Assumes the sched_domain tree is fully constructed
  */
--
2.20.1

2019-04-10 10:02:49

by Qais Yousef

Subject: Re: [PATCH 1/2] sched/topology: build_sched_groups: Skip duplicate group rewrites

On 04/09/19 18:35, Valentin Schneider wrote:
> While staring at build_sched_domains(), I realized that get_group()
> does several duplicate (thus useless) writes.
>
> If you take the Arm Juno r0 (LITTLEs = [0, 3, 4, 5], bigs = [1, 2]), the
> sched_group build flow would look like this:
>
> ('MC[cpu]->sg' means 'per_cpu_ptr(&tl->data->sg, cpu)' with 'tl == MC')
>
> build_sched_groups(MC[CPU0]->sd, CPU0)
>   get_group(0) -> MC[CPU0]->sg
>   get_group(3) -> MC[CPU3]->sg
>   get_group(4) -> MC[CPU4]->sg
>   get_group(5) -> MC[CPU5]->sg
>
> build_sched_groups(DIE[CPU0]->sd, CPU0)
>   get_group(0) -> DIE[CPU0]->sg
>   get_group(1) -> DIE[CPU1]->sg <-----------------+
>                                                   |
> build_sched_groups(MC[CPU1]->sd, CPU1)            |
>   get_group(1) -> MC[CPU1]->sg                    |
>   get_group(2) -> MC[CPU2]->sg                    |
>                                                   |
> build_sched_groups(DIE[CPU1]->sd, CPU1)           ^
>   get_group(1) -> DIE[CPU1]->sg } We've already set these two up here!
>   get_group(3) -> DIE[CPU0]->sg }
>
> From this point on, we will only use sched_groups that have been
> previously visited & initialized. The only new operation will be
> assigning the right group pointer to sd->groups.
>
> On the Juno r0 we get 32 get_group() calls, every single one of them
> writing to a sched_group->cpumask. However, all of the data structures
> we need are set up after 8 visits (see above).
>
> Return early from get_group() if we've already visited (and thus
> initialized) the sched_group we're looking at. Overlapping domains
> are not affected as they do not use build_sched_groups().
>
> Tested on a Juno and a 2 * (Xeon E5-2690) system.
>
> Signed-off-by: Valentin Schneider <[email protected]>
> ---
> FWIW I initially checked the refs for both sg && sg->sgc, but figured if
> they weren't both 0 or > 1 then something must have gone wrong, so I
> threw in a WARN_ON().
> ---
> kernel/sched/topology.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 64bec54ded3e..6c0b7326f66e 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1059,6 +1059,7 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
> struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu);
> struct sched_domain *child = sd->child;
> struct sched_group *sg;
> + bool already_visited;
>
> if (child)
> cpu = cpumask_first(sched_domain_span(child));
> @@ -1066,9 +1067,14 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
> sg = *per_cpu_ptr(sdd->sg, cpu);
> sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
>
> - /* For claim_allocations: */
> - atomic_inc(&sg->ref);
> - atomic_inc(&sg->sgc->ref);
> + /* Increase refcounts for claim_allocations: */
> + already_visited = atomic_inc_return(&sg->ref) > 1;
> + /* sgc visits should follow a similar trend as sg */
> + WARN_ON(already_visited != (atomic_inc_return(&sg->sgc->ref) > 1));

NIT: I think it would be better not to have any side effects inside the
WARN_ON() condition, i.e. do the atomic_inc_return() outside of it. Having two
bools, already_visited_sg and already_visited_sgc, would make the code more
readable too.
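
Something like the below is what I have in mind (untested sketch, just to
illustrate the suggestion, with the two bools declared at the top of
get_group()):

	/* Increase refcounts for claim_allocations: */
	already_visited_sg  = atomic_inc_return(&sg->ref) > 1;
	already_visited_sgc = atomic_inc_return(&sg->sgc->ref) > 1;

	/* sgc visits should still follow the same trend as sg visits */
	WARN_ON(already_visited_sg != already_visited_sgc);

	/* If we have already visited that group, it's already initialized. */
	if (already_visited_sg)
		return sg;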

Thanks

--
Qais Yousef

> +
> + /* If we have already visited that group, it's already initialized. */
> + if (already_visited)
> + return sg;
>
> if (child) {
> cpumask_copy(sched_group_span(sg), sched_domain_span(child));
> --
> 2.20.1
>

Subject: [tip:sched/core] sched/topology: Fix build_sched_groups() comment

Commit-ID: d8743230c9f4e92f370ecd2a90c680ddcede6ae5
Gitweb: https://git.kernel.org/tip/d8743230c9f4e92f370ecd2a90c680ddcede6ae5
Author: Valentin Schneider <[email protected]>
AuthorDate: Tue, 9 Apr 2019 18:35:46 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 10 Apr 2019 09:41:34 +0200

sched/topology: Fix build_sched_groups() comment

The comment was introduced (pre 2.6.12) by:

8a7a2318dc07 ("[PATCH] sched: consolidate sched domains")

and referred to sched_group->cpu_power. This was folded into
sched_group->sched_group_power in

commit 9c3f75cbd144 ("sched: Break out cpu_power from the sched_group structure")

The comment was then updated in:

ced549fa5fc1 ("sched: Remove remaining dubious usage of "power"")

but should have replaced "sg->cpu_capacity" with
"sg->sched_group_capacity". Do that now.

Signed-off-by: Valentin Schneider <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/topology.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 64bec54ded3e..90e1a870fb0d 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1087,8 +1087,8 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
 
 /*
  * build_sched_groups will build a circular linked list of the groups
- * covered by the given span, and will set each group's ->cpumask correctly,
- * and ->cpu_capacity to 0.
+ * covered by the given span, will set each group's ->cpumask correctly,
+ * and will initialize their ->sgc.
  *
  * Assumes the sched_domain tree is fully constructed
  */

2019-04-10 12:00:49

by Valentin Schneider

Subject: Re: [PATCH 1/2] sched/topology: build_sched_groups: Skip duplicate group rewrites

On 10/04/2019 10:27, Qais Yousef wrote:
[...]
>> @@ -1066,9 +1067,14 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
>> sg = *per_cpu_ptr(sdd->sg, cpu);
>> sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
>>
>> - /* For claim_allocations: */
>> - atomic_inc(&sg->ref);
>> - atomic_inc(&sg->sgc->ref);
>> + /* Increase refcounts for claim_allocations: */
>> + already_visited = atomic_inc_return(&sg->ref) > 1;
>> + /* sgc visits should follow a similar trend as sg */
>> + WARN_ON(already_visited != (atomic_inc_return(&sg->sgc->ref) > 1));
>
> NIT: I think it would be better not to have any side effects inside the
> WARN_ON() condition, i.e. do the atomic_inc_return() outside of it. Having two
> bools, already_visited_sg and already_visited_sgc, would make the code more
> readable too.
>

I agree it's not the nicest reading flow there is. It's already gone in
tip/sched/core but if needed I can resend with this extra separation.

> Thanks
>
> --
> Qais Yousef
>
>> +
>> + /* If we have already visited that group, it's already initialized. */
>> + if (already_visited)
>> + return sg;
>>
>> if (child) {
>> cpumask_copy(sched_group_span(sg), sched_domain_span(child));
>> --
>> 2.20.1
>>

2019-04-10 12:06:34

by Qais Yousef

Subject: Re: [PATCH 1/2] sched/topology: build_sched_groups: Skip duplicate group rewrites

On 04/10/19 11:17, Valentin Schneider wrote:
> On 10/04/2019 10:27, Qais Yousef wrote:
> [...]
> >> @@ -1066,9 +1067,14 @@ static struct sched_group *get_group(int cpu, struct sd_data *sdd)
> >> sg = *per_cpu_ptr(sdd->sg, cpu);
> >> sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
> >>
> >> - /* For claim_allocations: */
> >> - atomic_inc(&sg->ref);
> >> - atomic_inc(&sg->sgc->ref);
> >> + /* Increase refcounts for claim_allocations: */
> >> + already_visited = atomic_inc_return(&sg->ref) > 1;
> >> + /* sgc visits should follow a similar trend as sg */
> >> + WARN_ON(already_visited != (atomic_inc_return(&sg->sgc->ref) > 1));
> >
> > NIT: I think it would be better not to have any side effects inside the
> > WARN_ON() condition, i.e. do the atomic_inc_return() outside of it. Having two
> > bools, already_visited_sg and already_visited_sgc, would make the code more
> > readable too.
> >
>
> I agree it's not the nicest reading flow there is. It's already gone in
> tip/sched/core but if needed I can resend with this extra separation.

It was just a nit from my side. So for me it's not worth a resend if it's
already accepted.

--
Qais Yousef