2022-11-15 04:02:42

by Yun Zhou

Subject: [PATCH] timers: fix LVL_START macro

The number of buckets per level should be LVL_SIZE, not LVL_SIZE-1.

Signed-off-by: Yun Zhou <[email protected]>
---
kernel/time/timer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 717fcb9fb14a..1116b208093e 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -161,7 +161,7 @@ EXPORT_SYMBOL(jiffies_64);
  * time. We start from the last possible delta of the previous level
  * so that we can later add an extra LVL_GRAN(n) to n (see calc_index()).
  */
-#define LVL_START(n) ((LVL_SIZE - 1) << (((n) - 1) * LVL_CLK_SHIFT))
+#define LVL_START(n) (LVL_SIZE << (((n) - 1) * LVL_CLK_SHIFT))

 /* Size of each clock level */
 #define LVL_BITS 6
--
2.35.2



2022-11-15 12:38:28

by Frederic Weisbecker

Subject: Re: [PATCH] timers: fix LVL_START macro

Hi Yun Zhou,

On Tue, Nov 15, 2022 at 10:56:14AM +0800, Yun Zhou wrote:
> The number of buckets per level should be LVL_SIZE, not LVL_SIZE-1.
>
> Signed-off-by: Yun Zhou <[email protected]>
> ---
> kernel/time/timer.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 717fcb9fb14a..1116b208093e 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -161,7 +161,7 @@ EXPORT_SYMBOL(jiffies_64);
> * time. We start from the last possible delta of the previous level
> * so that we can later add an extra LVL_GRAN(n) to n (see calc_index()).
> */
> -#define LVL_START(n) ((LVL_SIZE - 1) << (((n) - 1) * LVL_CLK_SHIFT))
> +#define LVL_START(n) (LVL_SIZE << (((n) - 1) * LVL_CLK_SHIFT))

See the comment above:

"We start from the last possible delta of the previous level
so that we can later add an extra LVL_GRAN(n) to n (see calc_index())."
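
The arithmetic behind that comment can be checked with a small sketch. This is standalone userspace C, not kernel code: the LVL_* constants are taken from kernel/time/timer.c, while LVL_START_CURRENT, LVL_START_PATCHED and bucket_distance() are hypothetical names introduced only for this illustration. It assumes the bucket is chosen by adding one LVL_GRAN() and shifting, as the comment describes.

```c
#include <assert.h>

/* Constants as defined in kernel/time/timer.c. */
#define LVL_CLK_SHIFT 3
#define LVL_BITS      6
#define LVL_SIZE      (1UL << LVL_BITS)
#define LVL_SHIFT(n)  ((n) * LVL_CLK_SHIFT)
#define LVL_GRAN(n)   (1UL << LVL_SHIFT(n))

/* Hypothetical names: the existing macro and the one proposed by the patch. */
#define LVL_START_CURRENT(n) ((LVL_SIZE - 1) << (((n) - 1) * LVL_CLK_SHIFT))
#define LVL_START_PATCHED(n) (LVL_SIZE << (((n) - 1) * LVL_CLK_SHIFT))

/*
 * Number of level-lvl buckets between the wheel position (clk) and the
 * bucket selected after rounding up by one LVL_GRAN(lvl).
 */
static unsigned long bucket_distance(unsigned long clk, unsigned long delta,
                                     unsigned int lvl)
{
    unsigned long expires = clk + delta;

    return ((expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl)) -
           (clk >> LVL_SHIFT(lvl));
}

/* Sanity check of the level 3 start under both definitions. */
static void check_level3_start(void)
{
    assert(LVL_START_CURRENT(3) == 4032);
    assert(LVL_START_PATCHED(3) == 4096);
}
```

With clk = 0, the largest delta that still selects level 2 gives bucket_distance(0, LVL_START_CURRENT(3) - 1, 2) == 63, the last of the 64 buckets in that level. With the patched macro, bucket_distance(0, LVL_START_PATCHED(3) - 1, 2) == 64, which masks to the same index as the current wheel position. Under the assumptions of this sketch, that is the room the existing definition leaves for the extra LVL_GRAN().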

Thanks.

>
> /* Size of each clock level */
> #define LVL_BITS 6
> --
> 2.35.2
>

2022-11-15 23:12:29

by Frederic Weisbecker

Subject: Re: [PATCH] timers: fix LVL_START macro

On Tue, Nov 15, 2022 at 01:15:11PM +0000, Zhou, Yun wrote:
> Hi Frederic,
>
> The issue now is that a timer may be placed in a bucket of the next level up. For example, with HZ=1000 and expires=4090, it should be in level 2, but it is currently placed in level 3. Is this expected?
>
> * HZ 1000 steps
> * Level Offset  Granularity    Range
> *   0        0     1 ms            0 ms -     63 ms
> *   1       64     8 ms           64 ms -    511 ms
> *   2      128    64 ms          512 ms -   4095 ms (512ms - ~4s)
> *   3      192   512 ms         4096 ms -  32767 ms (~4s - ~32s)
> *   4      256  4096 ms (~4s)  32768 ms - 262143 ms (~32s - ~4m)

The rule is that a timer is not allowed to expire too early, but it can expire
a bit late. That is why the expiry is always rounded up. So in the case of 4090,
we have the choice between:

1) expiring in level 2 after 4096 - 64 = 4032 ms
2) expiring in level 3 after 4096 ms

Option 1) rounds down and expires too early. Option 2) rounds up and expires a
bit late. So the second solution is preferred.
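
The two options above can be reproduced with a few lines of arithmetic. This is a simplified userspace sketch, not the kernel's calc_wheel_index(): the LVL_* macros follow kernel/time/timer.c (LVL_DEPTH 9 is the value used for HZ > 100), while pick_level(), bucket_expiry_round_up() and bucket_expiry_round_down() are hypothetical helpers. It assumes the wheel clock is at 0, so expires and the delta are the same number, and it ignores the kernel's handling of timeouts beyond the wheel capacity.

```c
#include <assert.h>

/* Constants as defined in kernel/time/timer.c. */
#define LVL_CLK_SHIFT 3
#define LVL_BITS      6
#define LVL_SIZE      (1UL << LVL_BITS)
#define LVL_SHIFT(n)  ((n) * LVL_CLK_SHIFT)
#define LVL_GRAN(n)   (1UL << LVL_SHIFT(n))
#define LVL_START(n)  ((LVL_SIZE - 1) << (((n) - 1) * LVL_CLK_SHIFT))
#define LVL_DEPTH     9

/* Hypothetical helper: pick the level whose start is the last one <= delta. */
static unsigned int pick_level(unsigned long delta)
{
    unsigned int lvl = 0;

    while (lvl < LVL_DEPTH - 1 && delta >= LVL_START(lvl + 1))
        lvl++;
    return lvl;
}

/* Round up: add one granule of the level, then truncate to the granule. */
static unsigned long bucket_expiry_round_up(unsigned long expires,
                                            unsigned int lvl)
{
    return ((expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl);
}

/* Round down: truncate to the granule without adding anything. */
static unsigned long bucket_expiry_round_down(unsigned long expires,
                                              unsigned int lvl)
{
    return (expires >> LVL_SHIFT(lvl)) << LVL_SHIFT(lvl);
}

/* The 4090 example from the thread, under the assumptions above. */
static void check_4090_example(void)
{
    assert(pick_level(4090) == 3);
    assert(bucket_expiry_round_up(4090, 3) == 4096);   /* option 2 */
    assert(bucket_expiry_round_down(4090, 2) == 4032); /* option 1 */
}
```

Rounding down in level 2 gives 4032, which is earlier than the requested 4090, while rounding up in level 3 gives 4096, which is 6 ms late but never early.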

Thanks.

2022-11-17 00:33:26

by Thomas Gleixner

Subject: Re: [PATCH] timers: fix LVL_START macro

On Tue, Nov 15 2022 at 23:40, Frederic Weisbecker wrote:
> On Tue, Nov 15, 2022 at 01:15:11PM +0000, Zhou, Yun wrote:
>> Hi Frederic,
>>
>> The issue now is that a timer may be placed in a bucket of the next level up. For example, with HZ=1000 and expires=4090, it should be in level 2, but it is currently placed in level 3. Is this expected?
>>
>> * HZ 1000 steps
>> * Level Offset  Granularity    Range
>> *   0        0     1 ms            0 ms -     63 ms
>> *   1       64     8 ms           64 ms -    511 ms
>> *   2      128    64 ms          512 ms -   4095 ms (512ms - ~4s)
>> *   3      192   512 ms         4096 ms -  32767 ms (~4s - ~32s)
>> *   4      256  4096 ms (~4s)  32768 ms - 262143 ms (~32s - ~4m)
>
> The rule is that a timer is not allowed to expire too early. But it can expire
> a bit late. Hence why it is always rounded up. So in the case of 4090, we have
> the choice between:
>
> 1) expiring at bucket 2 after 4096 - 64 = 4032 ms
> 2) expiring at bucket 3 after 4096 ms
>
> The 1) rounds down and expires too early. The 2) rounds up and expires a bit
> late. So the second solution is preferred.

It's not only preferred, it's required simply because the timer wheel
has only one guarantee: Not to expire early.

Timer wheel based timers are fundamentally not precise unless the timeout
is short and hits the first level.

But even hrtimers which are designed to be precise have only one real
guarantee: Not to expire early.

hrtimers do not have the side effect of batching on long timeouts like
timer wheel based timers have, but that's it.

Timers in the kernel come with a choice:

- Imprecise and inexpensive to arm and cancel (timer_list)
- Precise and expensive to arm and cancel (hrtimer)

You can't have both. That's well documented.

Thanks,

tglx

2022-11-17 12:40:00

by Frederic Weisbecker

Subject: Re: [PATCH] timers: fix LVL_START macro

On Thu, Nov 17, 2022 at 12:48:05AM +0100, Thomas Gleixner wrote:
> On Tue, Nov 15 2022 at 23:40, Frederic Weisbecker wrote:
> > On Tue, Nov 15, 2022 at 01:15:11PM +0000, Zhou, Yun wrote:
> >> Hi Frederic,
> >>
> >> The issue now is that a timer may be placed in a bucket of the next level up. For example, with HZ=1000 and expires=4090, it should be in level 2, but it is currently placed in level 3. Is this expected?
> >>
> >> * HZ 1000 steps
> >> * Level Offset  Granularity    Range
> >> *   0        0     1 ms            0 ms -     63 ms
> >> *   1       64     8 ms           64 ms -    511 ms
> >> *   2      128    64 ms          512 ms -   4095 ms (512ms - ~4s)
> >> *   3      192   512 ms         4096 ms -  32767 ms (~4s - ~32s)
> >> *   4      256  4096 ms (~4s)  32768 ms - 262143 ms (~32s - ~4m)
> >
> > The rule is that a timer is not allowed to expire too early. But it can expire
> > a bit late. Hence why it is always rounded up. So in the case of 4090, we have
> > the choice between:
> >
> > 1) expiring at bucket 2 after 4096 - 64 = 4032 ms
> > 2) expiring at bucket 3 after 4096 ms
> >
> > The 1) rounds down and expires too early. The 2) rounds up and expires a bit
> > late. So the second solution is preferred.
>
> It's not only preferred, it's required simply because the timer wheel
> has only one guarantee: Not to expire early.
>
> Timer wheel based timers are fundamentally not precise unless the timeout
> is short and hits the first level.
>
> But even hrtimers which are designed to be precise have only one real
> guarantee: Not to expire early.
>
> hrtimers do not have the side effect of batching on long timeouts like
> timer wheel based timers have, but that's it.
>
> Timers in the kernel come with a choice:
>
> - Imprecise and inexpensive to arm and cancel (timer_list)
> - Precise and expensive to arm and cancel (hrtimer)
>
> You can't have both. That's well documented.

Actually I'm pretty sure we can manage imprecise and expensive to arm and
cancel. It's a matter of willpower!

Anyway, thanks for confirming what I thought about timers guarantees.

Thanks.

>
> Thanks,
>
> tglx