2023-12-19 08:29:37

by Yu Liao

[permalink] [raw]
Subject: [PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled

It was found that running the LTP hotplug stress test on a aarch64
system could produce rcu_sched stall warnings.

The issue is the following:

CPU1 (owns the broadcast hrtimer) CPU2

tick_broadcast_enter()
//shut down local timer device
...
tick_broadcast_exit()
//exits with tick_broadcast_force_mask set,
timer device remains disabled

initiates offlining of CPU1
take_cpu_down()
//CPU1 shuts down and does
not send broadcast IPI anymore
takedown_cpu()
hotplug_cpu__broadcast_tick_pull()
//move broadcast hrtimer to this CPU
clockevents_program_event()
bc_set_next()
hrtimer_start()
//does not call hrtimer_reprogram()
to program timer device if expires
equals dev->next_event, so the timer
device remains disabled.

CPU2 takes over the broadcast duty but local timer device is disabled,
causing many CPUs to become stuck.

Fix this by calling tick_program_event() to reprogram the local timer
device in this scenario.

Signed-off-by: Yu Liao <[email protected]>
---
kernel/time/tick-broadcast-hrtimer.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/time/tick-broadcast-hrtimer.c b/kernel/time/tick-broadcast-hrtimer.c
index e28f9210f8a1..6a4a612581fb 100644
--- a/kernel/time/tick-broadcast-hrtimer.c
+++ b/kernel/time/tick-broadcast-hrtimer.c
@@ -42,10 +42,22 @@ static int bc_shutdown(struct clock_event_device *evt)
*/
static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
{
+ ktime_t next_event = this_cpu_ptr(&tick_cpu_device)->evtdev->next_event;
+
/*
- * This is called either from enter/exit idle code or from the
- * broadcast handler. In all cases tick_broadcast_lock is held.
- *
+ * This can be called from CPU offline operation to move broadcast
+ * assignment. If tick_broadcast_force_mask is set, the CPU local
+ * timer device may be disabled. And hrtimer_reprogram() will not
+ * called if the timer is not the first expiring timer. Reprogram
+ * the cpu local timer device to ensure we can take over the
+ * broadcast duty.
+ */
+ if (tick_check_broadcast_expired() && expires >= next_event)
+ tick_program_event(next_event, 1);
+
+ /*
+ * This is called from enter/exit idle code, broadcast handler or
+ * CPU offline operation. In all cases tick_broadcast_lock is held.
* hrtimer_cancel() cannot be called here neither from the
* broadcast handler nor from the enter/exit idle code. The idle
* code can run into the problem described in bc_shutdown() and the
--
2.33.0



2024-01-12 07:40:46

by Yu Liao

[permalink] [raw]
Subject: Re: [PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled

Hi Thomas,

Kindly ping..

On 2023/12/18 10:58, Yu Liao wrote:
> It was found that running the LTP hotplug stress test on a aarch64
> system could produce rcu_sched stall warnings.
>
> The issue is the following:
>
> CPU1 (owns the broadcast hrtimer) CPU2
>
> tick_broadcast_enter()
> //shut down local timer device
> ...
> tick_broadcast_exit()
> //exits with tick_broadcast_force_mask set,
> timer device remains disabled
>
> initiates offlining of CPU1
> take_cpu_down()
> //CPU1 shuts down and does
> not send broadcast IPI anymore
> takedown_cpu()
> hotplug_cpu__broadcast_tick_pull()
> //move broadcast hrtimer to this CPU
> clockevents_program_event()
> bc_set_next()
> hrtimer_start()
> //does not call hrtimer_reprogram()
> to program timer device if expires
> equals dev->next_event, so the timer
> device remains disabled.
>
> CPU2 takes over the broadcast duty but local timer device is disabled,
> causing many CPUs to become stuck.
>
> Fix this by calling tick_program_event() to reprogram the local timer
> device in this scenario.
>
> Signed-off-by: Yu Liao <[email protected]>
> ---
> kernel/time/tick-broadcast-hrtimer.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/time/tick-broadcast-hrtimer.c b/kernel/time/tick-broadcast-hrtimer.c
> index e28f9210f8a1..6a4a612581fb 100644
> --- a/kernel/time/tick-broadcast-hrtimer.c
> +++ b/kernel/time/tick-broadcast-hrtimer.c
> @@ -42,10 +42,22 @@ static int bc_shutdown(struct clock_event_device *evt)
> */
> static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
> {
> + ktime_t next_event = this_cpu_ptr(&tick_cpu_device)->evtdev->next_event;
> +
> /*
> - * This is called either from enter/exit idle code or from the
> - * broadcast handler. In all cases tick_broadcast_lock is held.
> - *
> + * This can be called from CPU offline operation to move broadcast
> + * assignment. If tick_broadcast_force_mask is set, the CPU local
> + * timer device may be disabled. And hrtimer_reprogram() will not
> + * called if the timer is not the first expiring timer. Reprogram
> + * the cpu local timer device to ensure we can take over the
> + * broadcast duty.
> + */
> + if (tick_check_broadcast_expired() && expires >= next_event)
> + tick_program_event(next_event, 1);
> +
> + /*
> + * This is called from enter/exit idle code, broadcast handler or
> + * CPU offline operation. In all cases tick_broadcast_lock is held.
> * hrtimer_cancel() cannot be called here neither from the
> * broadcast handler nor from the enter/exit idle code. The idle
> * code can run into the problem described in bc_shutdown() and the


2024-01-12 15:35:29

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled

On Fri, Jan 12 2024 at 15:40, Yu Liao wrote:
> Hi Thomas,
>
> Kindly ping..

It's in my backlog ...

2024-01-24 19:41:32

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled

On Mon, Dec 18 2023 at 10:58, Yu Liao wrote:
> static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
> {
> + ktime_t next_event = this_cpu_ptr(&tick_cpu_device)->evtdev->next_event;
> +
> /*
> - * This is called either from enter/exit idle code or from the
> - * broadcast handler. In all cases tick_broadcast_lock is held.
> - *
> + * This can be called from CPU offline operation to move broadcast
> + * assignment. If tick_broadcast_force_mask is set, the CPU local
> + * timer device may be disabled. And hrtimer_reprogram() will not
> + * called if the timer is not the first expiring timer. Reprogram
> + * the cpu local timer device to ensure we can take over the
> + * broadcast duty.
> + */
> + if (tick_check_broadcast_expired() && expires >= next_event)
> + tick_program_event(next_event, 1);

I'm not really enthused about another conditional here and that
condition is more than obscure.

The problem is that the local clockevent might be shut down, right?

So checking for that state is the right thing to do and the proper place
is in hotplug_cpu__broadcast_tick_pull(), no?

Thanks,

tglx