2022-04-24 15:35:18

by Borislav Petkov

[permalink] [raw]
Subject: [GIT PULL] sched/urgent for 5.18-rc4

Hi Linus,

please pull a single urgent scheduler fix for 5.18.

Thx.

---

The following changes since commit b2d229d4ddb17db541098b83524d901257e93845:

Linux 5.18-rc3 (2022-04-17 13:57:31 -0700)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/sched_urgent_for_v5.18_rc4

for you to fetch changes up to 40f5aa4c5eaebfeaca4566217cb9c468e28ed682:

sched/pelt: Fix attach_entity_load_avg() corner case (2022-04-19 21:15:41 +0200)

----------------------------------------------------------------
- Fix a corner case when calculating sched runqueue variables

----------------------------------------------------------------
kuyo chang (1):
sched/pelt: Fix attach_entity_load_avg() corner case

kernel/sched/fair.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

--
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Ivo Totev, HRB 36809, AG Nürnberg


2022-04-25 06:03:00

by pr-tracker-bot

[permalink] [raw]
Subject: Re: [GIT PULL] sched/urgent for 5.18-rc4

The pull request you sent on Sun, 24 Apr 2022 11:55:28 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git tags/sched_urgent_for_v5.18_rc4

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/42740a2ff5d3f2cc0c73876dfb37ed0b88d926fd

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

2022-04-25 06:50:32

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] sched/urgent for 5.18-rc4

On Sun, Apr 24, 2022 at 2:55 AM Borislav Petkov <[email protected]> wrote:
>
> - Fix a corner case when calculating sched runqueue variables

This worries me.

It now does:

+ if (se_weight(se) < se->avg.load_sum)
+ se->avg.load_sum = div_u64(se->avg.load_sum, se_weight(se));

and at no point does it check if se_weight(se) is zero.

It *used* to check for that divide-by-zero issue, so from what I can
tell, a zero se_weight() is actually possible.

Now, it's entirely possible that no, se_weight() can never go down to
zero. But it's not obvious,. and the commit message doesn't mention
this change at all.

So I pulled, but then after looking at it I unpulled again in the
hopes that somebody will clarify the issue for me.

And scale_load_down() (in se_weight()) does try to make the result be
at least 2 on 64-bit, but only if the original wasn't zero. Very
confusing.

So can somebody please tell me why se_weight() cannot be 0, and why we
_used_ to check for zero? Because that commit sure as heck doesn't
explain it.

And - as usual with the -tip tree - the "Link:" thing is almost
entirely pointless. It doesn't actually point to any discussion of the
problems, it only points to the patch submission.

I realize that is convenient for automation, but it's really not
generally a very useful link. It would be much more useful to link to
whatever problem report that actually *causes* the submission, not to
the submission itself. We already see the end result in the commit,
it's the "how did we get here" that is the most interesting part.

And no, I don't see any explanation for "why se_weight() cannot be
zero" in that submission thread either.

Somebody please hit me over the head with a clue bat.

Linus

2022-04-25 08:56:37

by Vincent Guittot

[permalink] [raw]
Subject: Re: [GIT PULL] sched/urgent for 5.18-rc4

On Sun, 24 Apr 2022 at 21:00, Linus Torvalds
<[email protected]> wrote:
>
> On Sun, Apr 24, 2022 at 2:55 AM Borislav Petkov <[email protected]> wrote:
> >
> > - Fix a corner case when calculating sched runqueue variables
>
> This worries me.
>
> It now does:
>
> + if (se_weight(se) < se->avg.load_sum)
> + se->avg.load_sum = div_u64(se->avg.load_sum, se_weight(se));
>
> and at no point does it check if se_weight(se) is zero.
>
> It *used* to check for that divide-by-zero issue, so from what I can
> tell, a zero se_weight() is actually possible.
>
> Now, it's entirely possible that no, se_weight() can never go down to
> zero. But it's not obvious,. and the commit message doesn't mention
> this change at all.
>
> So I pulled, but then after looking at it I unpulled again in the
> hopes that somebody will clarify the issue for me.
>
> And scale_load_down() (in se_weight()) does try to make the result be
> at least 2 on 64-bit, but only if the original wasn't zero. Very
> confusing.
>
> So can somebody please tell me why se_weight() cannot be 0, and why we
> _used_ to check for zero? Because that commit sure as heck doesn't
> explain it.

For task, weight can't be null
For task group, weight is initialized to nice 0 in init_tg_cfs_entry()
and then it's clamp in calc_group_shares in order to not be null
Then since 26cf52229efc ("sched: Avoid scale real weight down to
zero"), scale_load_down can't return null value.

In fact, the condition if (se_weight(se)) was not needed any more and
should have been removed with commit 26cf52229efc

>
> And - as usual with the -tip tree - the "Link:" thing is almost
> entirely pointless. It doesn't actually point to any discussion of the
> problems, it only points to the patch submission.
>
> I realize that is convenient for automation, but it's really not
> generally a very useful link. It would be much more useful to link to
> whatever problem report that actually *causes* the submission, not to
> the submission itself. We already see the end result in the commit,
> it's the "how did we get here" that is the most interesting part.
>
> And no, I don't see any explanation for "why se_weight() cannot be
> zero" in that submission thread either.
>
> Somebody please hit me over the head with a clue bat.
>
> Linus