2023-10-06 10:25:46

by Ingo Molnar

[permalink] [raw]
Subject: [PATCH 3/3] sched/nohz: Remove weird error handling from find_new_ilb()

find_new_ilb() returns nr_cpu_ids on failure - which is a weird
choice in itself: not only is it a global variable, it is
a +1 out of bounds CPU index...

Its only user, kick_ilb(), then checks the return against nr_cpu_ids
to decide to return.

Instead of this, use a standard -1 return on failure to find an
idle CPU, as the argument is signed already.

Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
---
kernel/sched/fair.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4e90d15bd77..dad60576cf56 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11572,7 +11572,7 @@ static inline int find_new_ilb(void)
return ilb_cpu;
}

- return nr_cpu_ids;
+ return -1;
}

/*
@@ -11593,8 +11593,7 @@ static void kick_ilb(unsigned int flags)
nohz.next_balance = jiffies+1;

ilb_cpu = find_new_ilb();
-
- if (ilb_cpu >= nr_cpu_ids)
+ if (ilb_cpu < 0)
return;

/*
--
2.39.2


2023-10-06 11:01:44

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 3/3] sched/nohz: Remove weird error handling from find_new_ilb()


* Peter Zijlstra <[email protected]> wrote:

> On Fri, Oct 06, 2023 at 12:25:18PM +0200, Ingo Molnar wrote:
> > find_new_ilb() returns nr_cpu_ids on failure - which is a weird
> > choice in itself: not only is it a global variable, it is
> > a +1 out of bounds CPU index...
>
> FWIW this is what all the cpumask bitops return when they've exhausted
> the mask. Eg. no bits left set etc..

yeah, which then results in type-forcing uglies like:

kernel/events/core.c: if ((unsigned)cpu >= nr_cpu_ids) {
kernel/events/core.c: if ((unsigned)cpu >= nr_cpu_ids) {
kernel/smp.c: if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu)) {

:-/

So I don't think this is a particularly well thought-out interface.

Thanks,

Ingo

Subject: [tip: sched/core] sched/nohz: Remove unnecessarily complex error handling pattern from find_new_ilb()

The following commit has been merged into the sched/core branch of tip:

Commit-ID: f4bb5705114530cd775a5a649b666755b3efe7aa
Gitweb: https://git.kernel.org/tip/f4bb5705114530cd775a5a649b666755b3efe7aa
Author: Ingo Molnar <[email protected]>
AuthorDate: Fri, 06 Oct 2023 12:25:18 +02:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 09 Oct 2023 12:21:23 +02:00

sched/nohz: Remove unnecessarily complex error handling pattern from find_new_ilb()

find_new_ilb() returns nr_cpu_ids on failure - which is the usual
cpumask bitops return pattern, but is weird & unnecessary in this
context: not only is it a global variable, it it is a +1 out of
bounds CPU index and also has different signedness ...

Its only user, kick_ilb(), then checks the return against nr_cpu_ids
to decide to return. There's no other use.

So instead of this, use a standard -1 return on failure to find an
idle CPU, as the argument is signed already.

Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/sched/fair.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f82b301..19bb4ac 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11528,7 +11528,7 @@ static inline int find_new_ilb(void)
return ilb_cpu;
}

- return nr_cpu_ids;
+ return -1;
}

/*
@@ -11549,8 +11549,7 @@ static void kick_ilb(unsigned int flags)
nohz.next_balance = jiffies+1;

ilb_cpu = find_new_ilb();
-
- if (ilb_cpu >= nr_cpu_ids)
+ if (ilb_cpu < 0)
return;

/*

2023-10-09 15:13:16

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH 3/3] sched/nohz: Remove weird error handling from find_new_ilb()

On 06/10/23 13:01, Ingo Molnar wrote:
> * Peter Zijlstra <[email protected]> wrote:
>
>> On Fri, Oct 06, 2023 at 12:25:18PM +0200, Ingo Molnar wrote:
>> > find_new_ilb() returns nr_cpu_ids on failure - which is a weird
>> > choice in itself: not only is it a global variable, it is
>> > a +1 out of bounds CPU index...
>>
>> FWIW this is what all the cpumask bitops return when they've exhausted
>> the mask. Eg. no bits left set etc..
>
> yeah, which then results in type-forcing uglies like:
>
> kernel/events/core.c: if ((unsigned)cpu >= nr_cpu_ids) {
> kernel/events/core.c: if ((unsigned)cpu >= nr_cpu_ids) {
> kernel/smp.c: if ((unsigned)cpu >= nr_cpu_ids || !cpu_online(cpu)) {

I can't see why we'd want smp_call_function_single*() /
generic_exec_single() to take a signed int as input, shouldn't this just be
unsigned?

The perf thing does look like it wants signed though...

>
> :-/
>
> So I don't think this is a particularly well thought-out interface.
>
> Thanks,
>
> Ingo