When comparing the current struct sched_group with the yet-busiest
domain in update_sd_pick_busiest(), if the two groups have the same
group type, we're currently doing a bit of unnecessary work for any
group >= group_misfit_task. We compare the two groups, and return false
only if the group in question is not the busiest. Otherwise, we break
out of the switch, do an extra conditional check that's vacuously false
for any group type > group_fully_busy, and then always return true.
Let's just return directly in the switch statement instead. This doesn't
change the size of vmlinux with llvm 17 (not surprising given that all
of this is inlined in load_balance()), but it does shrink load_balance()
by 88 bytes on x86. Given that it also improves readability, this seems
worth doing.
As a bonus, remove an unnecessary goto in update_sd_lb_stats().
Signed-off-by: David Vernet <[email protected]>
---
kernel/sched/fair.c | 19 ++++---------------
1 file changed, 4 insertions(+), 15 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b803030c3a03..04bd655b81d3 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10006,9 +10006,7 @@ static bool update_sd_pick_busiest(struct lb_env *env,
switch (sgs->group_type) {
case group_overloaded:
/* Select the overloaded group with highest avg_load. */
- if (sgs->avg_load <= busiest->avg_load)
- return false;
- break;
+ return sgs->avg_load > busiest->avg_load;
case group_imbalanced:
/*
@@ -10019,18 +10017,14 @@ static bool update_sd_pick_busiest(struct lb_env *env,
case group_asym_packing:
/* Prefer to move from lowest priority CPU's work */
- if (sched_asym_prefer(sg->asym_prefer_cpu, sds->busiest->asym_prefer_cpu))
- return false;
- break;
+ return sched_asym_prefer(sds->busiest->asym_prefer_cpu, sg->asym_prefer_cpu);
case group_misfit_task:
/*
* If we have more than one misfit sg go with the biggest
* misfit.
*/
- if (sgs->group_misfit_task_load < busiest->group_misfit_task_load)
- return false;
- break;
+ return sgs->group_misfit_task_load >= busiest->group_misfit_task_load;
case group_smt_balance:
/*
@@ -10578,16 +10572,11 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
update_sg_lb_stats(env, sds, sg, sgs, &sg_status);
- if (local_group)
- goto next_group;
-
-
- if (update_sd_pick_busiest(env, sds, sg, sgs)) {
+ if (!local_group && update_sd_pick_busiest(env, sds, sg, sgs)) {
sds->busiest = sg;
sds->busiest_stat = *sgs;
}
-next_group:
/* Now, start updating sd_lb_stats */
sds->total_load += sgs->group_load;
sds->total_capacity += sgs->group_capacity;
--
2.43.0
On 02/02/24 01:02, David Vernet wrote:
> When comparing the current struct sched_group with the yet-busiest
> domain in update_sd_pick_busiest(), if the two groups have the same
> group type, we're currently doing a bit of unnecessary work for any
> group >= group_misfit_task. We're comparing the two groups, and then
> returning only if false (the group in question is not the busiest).
> Otherwise, we break, do an extra unnecessary conditional check that's
> vacuously false for any group type > group_fully_busy, and then always
> return true.
>
> Let's just return directly in the switch statement instead. This doesn't
> change the size of vmlinux with llvm 17 (not surprising given that all
> of this is inlined in load_balance()), but it does shrink load_balance()
> by 88 bytes on x86. Given that it also improves readability, this seems
> worth doing.
>
> As a bonus, remove an unnecessary goto in update_sd_lb_stats().
>
Given that's a different scope than what the rest of the patch touches, I'd
rather see this as a separate patch.
Other than that:
Reviewed-by: Valentin Schneider <[email protected]>
On Fri, Feb 02, 2024 at 06:01:22PM +0100, Valentin Schneider wrote:
> On 02/02/24 01:02, David Vernet wrote:
> > [...]
> > As a bonus, remove an unnecessary goto in update_sd_lb_stats().
> >
>
> Given that's a different scope than what the rest of the patch touches, I'd
> rather see this as a separate patch.
>
> Other than that:
> Reviewed-by: Valentin Schneider <[email protected]>
Thanks, would you like me to send a follow-on series split into two with
your tag on both? Or were you just letting me know for next time?
We could also update this check to only do a strict greater than to
avoid unnecessary writes, but I figured it was preferable to have no
logical changes for this iteration:
return sgs->group_misfit_task_load >= busiest->group_misfit_task_load;
On 02/02/24 11:07, David Vernet wrote:
> On Fri, Feb 02, 2024 at 06:01:22PM +0100, Valentin Schneider wrote:
>> On 02/02/24 01:02, David Vernet wrote:
>> > [...]
>> >
>> > As a bonus, remove an unnecessary goto in update_sd_lb_stats().
>> >
>>
>> Given that's a different scope than what the rest of the patch touches, I'd
>> rather see this as a separate patch.
>>
>> Other than that:
>> Reviewed-by: Valentin Schneider <[email protected]>
>
> Thanks, would you like me to send a follow-on series split into two with
> your tag on both? Or were you just letting me know for next time?
>
Well, I'm not picking up any patches, just reviewing them :) So yes I'd say
re-send with the split and feel free to apply the tag on both.
> We could also update this check to only do a strict greater than to
> avoid unnecessary writes, but I figured it was preferable to have no
> logical changes for this iteration:
>
> return sgs->group_misfit_task_load >= busiest->group_misfit_task_load;
That's a good point, I don't think there was a specific reason for going
with a lower-than rather than a lower-or-equal back then:
cad68e552e77 ("sched/fair: Consider misfit tasks when load-balancing")
On Fri, Feb 02, 2024 at 06:40:21PM +0100, Valentin Schneider wrote:
> On 02/02/24 11:07, David Vernet wrote:
> > On Fri, Feb 02, 2024 at 06:01:22PM +0100, Valentin Schneider wrote:
> >> On 02/02/24 01:02, David Vernet wrote:
> >> > [...]
> >> > As a bonus, remove an unnecessary goto in update_sd_lb_stats().
> >> >
> >>
> >> Given that's a different scope than what the rest of the patch touches, I'd
> >> rather see this as a separate patch.
> >>
> >> Other than that:
> >> Reviewed-by: Valentin Schneider <[email protected]>
> >
> > Thanks, would you like me to send a follow-on series split into two with
> > your tag on both? Or were you just letting me know for next time?
> >
>
> Well, I'm not picking up any patches, just reviewing them :) So yes I'd say
> re-send with the split and feel free to apply the tag on both.
Sounds good, I'll send a follow up at some point tomorrow or early next
week.
> > We could also update this check to only do a strict greater than to
> > avoid unnecessary writes, but I figured it was preferable to have no
> > logical changes for this iteration:
> >
> > return sgs->group_misfit_task_load >= busiest->group_misfit_task_load;
>
> That's a good point, I don't think there was a specific reason for going
> with a lower-than rather than a lower-or-equal back then:
> cad68e552e77 ("sched/fair: Consider misfit tasks when load-balancing")
Yeah, the goal is to choose the group with the highest misfit task load,
so it seems pretty straightforward that we don't gain anything from
choosing a new group that has the same load as the previous busiest.
When I send out the follow-on patch set, I'll do it as 3 separate
patches (unless I hear otherwise from someone else).
Thanks,
David