LinuxLists.cc - [PATCH] sched/fair: simplify __calc

2024-03-06 22:29:45

Subject: [PATCH] sched/fair: simplify __calc_delta()

Based on how __calc_delta() is called now, the input parameter, weight
is always NICE_0_LOAD. I think we don't need it as an input parameter
now?

Also, when weight is always NICE_0_LOAD, the initial fact value is
always 2^10, and the first fact_hi will always be 0. Thus, we can get
rid of the first if bock.

The previous comment "(delta_exec * (weight * lw->inv_weight)) >>
WMULT_SHIFT" seems to be assuming that lw->weight * lw->inv_weight is
always (approximately) equal to 2^WMULT_SHIFT. However, when
CONFIG_64BIT is set, lw->weight * lw->inv_weight is (approximately)
equal to 2^WMULT_SHIFT * 2^10. What remains true for both CONFIG_32BIT
and CONFIG_64BIT is: scale_load_down(lw->weight) * lw->inv_weight is
(approximately) equal to 2^WMULT_SHIFT. (Correct me if I am wrong.)

Also updated the comment for calc_delta_fair() to make it more
accurate.

Signed-off-by: Dawei Li <[email protected]>
---
kernel/sched/fair.c | 29 ++++++++++-------------------
1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6a16129f9a5c..c5cdb15f7d62 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -252,32 +252,23 @@ static void __update_inv_weight(struct load_weight *lw)
}

/*
- * delta_exec * weight / lw.weight
+ * delta_exec * NICE_0_LOAD / lw->weight
* OR
- * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
+ * (delta_exec * scale_load_down(NICE_0_LOAD) * lw->inv_weight) >> WMULT_SHIFT
*
- * Either weight := NICE_0_LOAD and lw \e sched_prio_to_wmult[], in which case
- * we're guaranteed shift stays positive because inv_weight is guaranteed to
- * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
- *
- * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
- * weight/lw.weight <= 1, and therefore our shift will also be positive.
+ * We're guaranteed shift stays positive because inv_weight is guaranteed to
+ * fit 32 bits, and scale_load_down(NICE_0_LOAD) gives another 10 bits;
+ * therefore shift >= 22.
*/
-static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
+static u64 __calc_delta(u64 delta_exec, struct load_weight *lw)
{
- u64 fact = scale_load_down(weight);
- u32 fact_hi = (u32)(fact >> 32);
+ u64 fact = scale_load_down(NICE_0_LOAD);
+ int fact_hi;
int shift = WMULT_SHIFT;
int fs;

__update_inv_weight(lw);

- if (unlikely(fact_hi)) {
- fs = fls(fact_hi);
- shift -= fs;
- fact >>= fs;
- }
-
fact = mul_u32_u32(fact, lw->inv_weight);

fact_hi = (u32)(fact >> 32);
@@ -291,12 +282,12 @@ static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight
}

/*
- * delta /= w
+ * delta *= NICE_0_LOAD / se->load.weight
*/
static inline u64 calc_delta_fair(u64 delta, struct sched_entity *se)
{
if (unlikely(se->load.weight != NICE_0_LOAD))
- delta = __calc_delta(delta, NICE_0_LOAD, &se->load);
+ delta = __calc_delta(delta, &se->load);

return delta;
}
--
2.40.1

2024-03-12 13:18:31

by Pierre Gondois

[permalink] [raw]

Subject: Re: [PATCH] sched/fair: simplify __calc_delta()

Hello Dawei,

On 3/6/24 23:28, Dawei Li wrote:
> Based on how __calc_delta() is called now, the input parameter, weight
> is always NICE_0_LOAD. I think we don't need it as an input parameter
> now?

Maybe
5e963f2bd4654a202a8a05aa3a86cb0300b10e6c ("sched/fair: Commit to EEVDF")
should be referenced to explain that the case where (weight =< lw.weight)
doesn't exist anymore and that NICE_0_LOAD could be incorporated in
__calc_delta() directly.

Also I think indirect forms are preferred in general:
"I think we don't need it as an input parameter now ?" ->
"The 'weight' parameter doesn't seem to be required anymore"
(same note for the whole commit message)

>
> Also, when weight is always NICE_0_LOAD, the initial fact value is
> always 2^10, and the first fact_hi will always be 0. Thus, we can get
> rid of the first if bock.
>
> The previous comment "(delta_exec * (weight * lw->inv_weight)) >>
> WMULT_SHIFT" seems to be assuming that lw->weight * lw->inv_weight is
> always (approximately) equal to 2^WMULT_SHIFT. However, when
> CONFIG_64BIT is set, lw->weight * lw->inv_weight is (approximately)
> equal to 2^WMULT_SHIFT * 2^10. What remains true for both CONFIG_32BIT
> and CONFIG_64BIT is: scale_load_down(lw->weight) * lw->inv_weight is
> (approximately) equal to 2^WMULT_SHIFT. (Correct me if I am wrong.)

I think the comment is more about explaining that:
X * lw.weight
equals:
X * lw->inv_weight >> WMULT_SHIFT

Also, if CONFIG_64BIT is set, we should have:
weight / lw.weight == scale_load_down(lw->weight) * 2**10 * lw->inv_weight

So IIUC, either both lines should be update, either none.
(meaning that:
delta_exec * NICE_0_LOAD / lw->weight
should be changed to
delta_exec * scale_load_down(NICE_0_LOAD) / lw->weight
)

I assume it's better to let the comment as is.

>
> Also updated the comment for calc_delta_fair() to make it more
> accurate.
>
> Signed-off-by: Dawei Li <[email protected]>
> ---
> kernel/sched/fair.c | 29 ++++++++++-------------------
> 1 file changed, 10 insertions(+), 19 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6a16129f9a5c..c5cdb15f7d62 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -252,32 +252,23 @@ static void __update_inv_weight(struct load_weight *lw)
> }
>
> /*
> - * delta_exec * weight / lw.weight
> + * delta_exec * NICE_0_LOAD / lw->weight
> * OR
> - * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
> + * (delta_exec * scale_load_down(NICE_0_LOAD) * lw->inv_weight) >> WMULT_SHIFT
> *
> - * Either weight := NICE_0_LOAD and lw \e sched_prio_to_wmult[], in which case
> - * we're guaranteed shift stays positive because inv_weight is guaranteed to
> - * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
> - *
> - * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
> - * weight/lw.weight <= 1, and therefore our shift will also be positive.
> + * We're guaranteed shift stays positive because inv_weight is guaranteed to
> + * fit 32 bits, and scale_load_down(NICE_0_LOAD) gives another 10 bits;
> + * therefore shift >= 22.
> */
> -static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
> +static u64 __calc_delta(u64 delta_exec, struct load_weight *lw)
> {
> - u64 fact = scale_load_down(weight);
> - u32 fact_hi = (u32)(fact >> 32);
> + u64 fact = scale_load_down(NICE_0_LOAD);
> + int fact_hi;
> int shift = WMULT_SHIFT;
> int fs;

NIT: maybe re-ordering the variables to have a reverse tree

Otherwise, the patch looks good to me,
Regards,
Pierre

2024-03-12 23:26:19

by Dawei Li

[permalink] [raw]

Subject: Re: [PATCH] sched/fair: simplify __calc_delta()

Hi Pierre,
Thank you for the review!

On Tue, Mar 12, 2024 at 6:18 AM Pierre Gondois <[email protected]> wrote:
>
> Hello Dawei,
>
> On 3/6/24 23:28, Dawei Li wrote:
> > Based on how __calc_delta() is called now, the input parameter, weight
> > is always NICE_0_LOAD. I think we don't need it as an input parameter
> > now?
>
> Maybe
> 5e963f2bd4654a202a8a05aa3a86cb0300b10e6c ("sched/fair: Commit to EEVDF")
> should be referenced to explain that the case where (weight =< lw.weight)
> doesn't exist anymore and that NICE_0_LOAD could be incorporated in
> __calc_delta() directly.
>
>
> Also I think indirect forms are preferred in general:
> "I think we don't need it as an input parameter now ?" ->
> "The 'weight' parameter doesn't seem to be required anymore"
> (same note for the whole commit message)
>
> >
> > Also, when weight is always NICE_0_LOAD, the initial fact value is
> > always 2^10, and the first fact_hi will always be 0. Thus, we can get
> > rid of the first if bock.
> >
> > The previous comment "(delta_exec * (weight * lw->inv_weight)) >>
> > WMULT_SHIFT" seems to be assuming that lw->weight * lw->inv_weight is
> > always (approximately) equal to 2^WMULT_SHIFT. However, when
> > CONFIG_64BIT is set, lw->weight * lw->inv_weight is (approximately)
> > equal to 2^WMULT_SHIFT * 2^10. What remains true for both CONFIG_32BIT
> > and CONFIG_64BIT is: scale_load_down(lw->weight) * lw->inv_weight is
> > (approximately) equal to 2^WMULT_SHIFT. (Correct me if I am wrong.)
>
> I think the comment is more about explaining that:
> X * lw.weight
> equals:
> X * lw->inv_weight >> WMULT_SHIFT
>
I assume you mean
X / lw->weight
equals:
X * lw->inv_weight >> WMULT_SHIFT
However, this is not always true, and that's why I'd like to revise
it. It is true for
CONFIG_32BIT. However, For CONFIG_64BIT, we have lw->weight * lw->inv_weight =
2**WMULT_SHIFT * 2**10. Thus,
X / lw->weight
equals:
X * lw->inv_weight >> (WMULT_SHIFT + 10)

> Also, if CONFIG_64BIT is set, we should have:
> weight / lw.weight == scale_load_down(lw->weight) * 2**10 * lw->inv_weight
>

weight / lw->weight should be equal to scale_load_down(weight) /
scale_load_down(lw->weight)
= scale_load_down(weight) * lw->inv_weight / (2**WMULT_SHIFT)
Right?

> So IIUC, either both lines should be update, either none.
> (meaning that:
> delta_exec * NICE_0_LOAD / lw->weight
> should be changed to
> delta_exec * scale_load_down(NICE_0_LOAD) / lw->weight

I think this is not correct? scale_load_down(NICE_0_LOAD) is the true
weight, as mapped
directly from the task's nice/priority value, while lw->weight is the
scaled_up load.
Their units/scales don't match.

I am quite new to the source code. I could be wrong. But would like to
see more clarifications
on this.

> )
>
> I assume it's better to let the comment as is.
>
>
> >
> > Also updated the comment for calc_delta_fair() to make it more
> > accurate.
> >
> > Signed-off-by: Dawei Li <[email protected]>
> > ---
> > kernel/sched/fair.c | 29 ++++++++++-------------------
> > 1 file changed, 10 insertions(+), 19 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 6a16129f9a5c..c5cdb15f7d62 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -252,32 +252,23 @@ static void __update_inv_weight(struct load_weight *lw)
> > }
> >
> > /*
> > - * delta_exec * weight / lw.weight
> > + * delta_exec * NICE_0_LOAD / lw->weight
> > * OR
> > - * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
> > + * (delta_exec * scale_load_down(NICE_0_LOAD) * lw->inv_weight) >> WMULT_SHIFT
> > *
> > - * Either weight := NICE_0_LOAD and lw \e sched_prio_to_wmult[], in which case
> > - * we're guaranteed shift stays positive because inv_weight is guaranteed to
> > - * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
> > - *
> > - * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
> > - * weight/lw.weight <= 1, and therefore our shift will also be positive.
> > + * We're guaranteed shift stays positive because inv_weight is guaranteed to
> > + * fit 32 bits, and scale_load_down(NICE_0_LOAD) gives another 10 bits;
> > + * therefore shift >= 22.
> > */
> > -static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
> > +static u64 __calc_delta(u64 delta_exec, struct load_weight *lw)
> > {
> > - u64 fact = scale_load_down(weight);
> > - u32 fact_hi = (u32)(fact >> 32);
> > + u64 fact = scale_load_down(NICE_0_LOAD);
> > + int fact_hi;
> > int shift = WMULT_SHIFT;
> > int fs;
>
> NIT: maybe re-ordering the variables to have a reverse tree
>
> Otherwise, the patch looks good to me,
> Regards,
> Pierre

2024-03-13 10:21:42

by Pierre Gondois

[permalink] [raw]

Subject: Re: [PATCH] sched/fair: simplify __calc_delta()

Hello Dawei,

On 3/13/24 00:25, Dawei Li wrote:
> Hi Pierre,
> Thank you for the review!
>
> On Tue, Mar 12, 2024 at 6:18 AM Pierre Gondois <[email protected]> wrote:
>>
>> Hello Dawei,
>>
>> On 3/6/24 23:28, Dawei Li wrote:
>>> Based on how __calc_delta() is called now, the input parameter, weight
>>> is always NICE_0_LOAD. I think we don't need it as an input parameter
>>> now?
>>
>> Maybe
>> 5e963f2bd4654a202a8a05aa3a86cb0300b10e6c ("sched/fair: Commit to EEVDF")
>> should be referenced to explain that the case where (weight =< lw.weight)
>> doesn't exist anymore and that NICE_0_LOAD could be incorporated in
>> __calc_delta() directly.
>>
>>
>> Also I think indirect forms are preferred in general:
>> "I think we don't need it as an input parameter now ?" ->
>> "The 'weight' parameter doesn't seem to be required anymore"
>> (same note for the whole commit message)
>>
>>>
>>> Also, when weight is always NICE_0_LOAD, the initial fact value is
>>> always 2^10, and the first fact_hi will always be 0. Thus, we can get
>>> rid of the first if bock.
>>>
>>> The previous comment "(delta_exec * (weight * lw->inv_weight)) >>
>>> WMULT_SHIFT" seems to be assuming that lw->weight * lw->inv_weight is
>>> always (approximately) equal to 2^WMULT_SHIFT. However, when
>>> CONFIG_64BIT is set, lw->weight * lw->inv_weight is (approximately)
>>> equal to 2^WMULT_SHIFT * 2^10. What remains true for both CONFIG_32BIT
>>> and CONFIG_64BIT is: scale_load_down(lw->weight) * lw->inv_weight is
>>> (approximately) equal to 2^WMULT_SHIFT. (Correct me if I am wrong.)
>>
>> I think the comment is more about explaining that:
>> X * lw.weight
>> equals:
>> X * lw->inv_weight >> WMULT_SHIFT
>>
> I assume you mean
> X / lw->weight
> equals:
> X * lw->inv_weight >> WMULT_SHIFT

Yes right indeed.

> However, this is not always true, and that's why I'd like to revise
> it. It is true for
> CONFIG_32BIT. However, For CONFIG_64BIT, we have lw->weight * lw->inv_weight =
> 2**WMULT_SHIFT * 2**10. Thus,
> X / lw->weight
> equals:
> X * lw->inv_weight >> (WMULT_SHIFT + 10)

Ok yes, you're correct indeed.
The equality is always correct when scale_load_down() is used,

Regards,
Pierre

>
>
>> Also, if CONFIG_64BIT is set, we should have:
>> weight / lw.weight == scale_load_down(lw->weight) * 2**10 * lw->inv_weight
>>
>
> weight / lw->weight should be equal to scale_load_down(weight) /
> scale_load_down(lw->weight)
> = scale_load_down(weight) * lw->inv_weight / (2**WMULT_SHIFT)
> Right?
>
>> So IIUC, either both lines should be update, either none.
>> (meaning that:
>> delta_exec * NICE_0_LOAD / lw->weight
>> should be changed to
>> delta_exec * scale_load_down(NICE_0_LOAD) / lw->weight
>
> I think this is not correct? scale_load_down(NICE_0_LOAD) is the true
> weight, as mapped
> directly from the task's nice/priority value, while lw->weight is the
> scaled_up load.
> Their units/scales don't match.
>