tick_broadcast_enter returns 0 when CPU can switch to broadcast
timer and non-zero otherwise. However when GENERIC_CLOCKEVENTS_BROADCAST
and TICK_ONESHOT are disabled, tick_broadcast_oneshot_control returns 0
which indicates to the CPUIdle framework that the CPU can enter deeper
idle states even when the CPU local timer will be shutdown. If the
target state needs broadcast but not broadcast timer is available, then
the CPU can not resume back from that idle state.
This patch returns error when there's no broadcast timer support
available so that CPUIdle framework prevents the CPU from entering any
idle states losing the local timer.
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Preeti U Murthy <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Reported-and-Tested-by: Suzuki Poulose <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
---
include/linux/tick.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 3741ba1a652c..0624968a0bcc 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -70,7 +70,10 @@ static inline void tick_broadcast_control(enum tick_broadcast_mode mode) { }
#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
extern int tick_broadcast_oneshot_control(enum tick_broadcast_state state);
#else
-static inline int tick_broadcast_oneshot_control(enum tick_broadcast_state state) { return 0; }
+static inline int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+{
+ return -ENODEV;
+}
#endif
static inline void tick_broadcast_enable(void)
--
1.9.1
On Thu, 25 Jun 2015, Sudeep Holla wrote:
> tick_broadcast_enter returns 0 when CPU can switch to broadcast
> timer and non-zero otherwise. However when GENERIC_CLOCKEVENTS_BROADCAST
> and TICK_ONESHOT are disabled, tick_broadcast_oneshot_control returns 0
> which indicates to the CPUIdle framework that the CPU can enter deeper
> idle states even when the CPU local timer will be shutdown. If the
> target state needs broadcast but not broadcast timer is available, then
> the CPU can not resume back from that idle state.
>
> This patch returns error when there's no broadcast timer support
> available so that CPUIdle framework prevents the CPU from entering any
> idle states losing the local timer.
That's wrong and breaks stuff which does not require the broadcast
nonsense.
If TICK_ONESHOT is disabled, then everything is in periodic mode and
tick_broadcast_enter() rightfully returns 0. Ditto for 'highres=off'
on the command line.
But there is a case which is not correctly handled right now. That's
what you are trying to solve in the wrong way.
If
GENERIC_CLOCKEVENTS_BROADCAST=n
or
GENERIC_CLOCKEVENTS_BROADCAST=y and no broadcast device is available,
AND cpu local tick device has the C3STOP flag set,
then we have no way to tell the idle code that going deep is not
allowed.
So we need to be smarter than blindly changing a return
value. Completely untested patch below.
Thanks,
tglx
---
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 4191b5623a28..8731a58dd747 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -63,11 +63,7 @@ extern void tick_broadcast_control(enum tick_broadcast_mode mode);
static inline void tick_broadcast_control(enum tick_broadcast_mode mode) { }
#endif /* BROADCAST */
-#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
extern int tick_broadcast_oneshot_control(enum tick_broadcast_state state);
-#else
-static inline int tick_broadcast_oneshot_control(enum tick_broadcast_state state) { return 0; }
-#endif
static inline void tick_broadcast_enable(void)
{
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index d39f32cdd1b5..52c8d01b956b 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -662,24 +662,15 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
}
-/**
- * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
- * @state: The target state (enter/exit)
- *
- * The system enters/leaves a state, where affected devices might stop
- * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
- *
- * Called with interrupts disabled, so clockevents_lock is not
- * required here because the local clock event device cannot go away
- * under us.
- */
-int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
{
struct clock_event_device *bc, *dev;
- struct tick_device *td;
int cpu, ret = 0;
ktime_t now;
+ if (!tick_broadcast_device.evtdev)
+ return -EBUSY;
+
/*
* Periodic mode does not care about the enter/exit of power
* states
@@ -687,15 +678,7 @@ int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
return 0;
- /*
- * We are called with preemtion disabled from the depth of the
- * idle code, so we can't be moved away.
- */
- td = this_cpu_ptr(&tick_cpu_device);
- dev = td->evtdev;
-
- if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
- return 0;
+ dev = this_cpu_ptr(&tick_cpu_device)->evtdev;
raw_spin_lock(&tick_broadcast_lock);
bc = tick_broadcast_device.evtdev;
@@ -938,6 +921,13 @@ bool tick_broadcast_oneshot_available(void)
return bc ? bc->features & CLOCK_EVT_FEAT_ONESHOT : false;
}
+#else
+int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+{
+ if (!tick_broadcast_device.evtdev)
+ return -EBUSY;
+ return 0;
+}
#endif
void __init tick_broadcast_init(void)
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 17f144450050..f66c11a30348 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -342,6 +342,27 @@ out_bc:
tick_install_broadcast_device(newdev);
}
+/**
+ * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
+ * @state: The target state (enter/exit)
+ *
+ * The system enters/leaves a state, where affected devices might stop
+ * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
+ *
+ * Called with interrupts disabled, so clockevents_lock is not
+ * required here because the local clock event device cannot go away
+ * under us.
+ */
+int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+{
+ struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
+
+ if (!(td->evtdev->features & CLOCK_EVT_FEAT_C3STOP))
+ return 0;
+
+ return __tick_broadcast_oneshot_control(state);
+}
+
#ifdef CONFIG_HOTPLUG_CPU
/*
* Transfer the do_timer job away from a dying cpu.
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index 42fdf4958bcc..a4a8d4e9baa1 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -71,4 +71,14 @@ extern void tick_cancel_sched_timer(int cpu);
static inline void tick_cancel_sched_timer(int cpu) { }
#endif
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
+extern int __tick_broadcast_oneshot_control(enum tick_broadcast_state state);
+#else
+static inline int
+__tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+{
+ return -EBUSY;
+}
+#endif
+
#endif
On 25/06/15 14:55, Thomas Gleixner wrote:
> On Thu, 25 Jun 2015, Sudeep Holla wrote:
>
>> tick_broadcast_enter returns 0 when CPU can switch to broadcast
>> timer and non-zero otherwise. However when GENERIC_CLOCKEVENTS_BROADCAST
>> and TICK_ONESHOT are disabled, tick_broadcast_oneshot_control returns 0
>> which indicates to the CPUIdle framework that the CPU can enter deeper
>> idle states even when the CPU local timer will be shutdown. If the
>> target state needs broadcast but not broadcast timer is available, then
>> the CPU can not resume back from that idle state.
>>
>> This patch returns error when there's no broadcast timer support
>> available so that CPUIdle framework prevents the CPU from entering any
>> idle states losing the local timer.
>
> That's wrong and breaks stuff which does not require the broadcast
> nonsense.
>
OK, sorry for not considering that case.
> If TICK_ONESHOT is disabled, then everything is in periodic mode and
> tick_broadcast_enter() rightfully returns 0. Ditto for 'highres=off'
> on the command line.
>
> But there is a case which is not correctly handled right now. That's
> what you are trying to solve in the wrong way.
>
Correct I was trying to solve exactly the case mentioned below.
> If
> GENERIC_CLOCKEVENTS_BROADCAST=n
>
> or
>
> GENERIC_CLOCKEVENTS_BROADCAST=y and no broadcast device is available,
>
> AND cpu local tick device has the C3STOP flag set,
>
> then we have no way to tell the idle code that going deep is not
> allowed.
>
> So we need to be smarter than blindly changing a return
> value. Completely untested patch below.
>
Agreed, thanks for the quick patch, I have tested it and it works fine.
You can add
Tested-by: Sudeep Holla <[email protected]>
Regards,
Sudeep
On 06/25/2015 07:25 PM, Thomas Gleixner wrote:
> But there is a case which is not correctly handled right now. That's
> what you are trying to solve in the wrong way.
>
> If
> GENERIC_CLOCKEVENTS_BROADCAST=n
>
> or
>
> GENERIC_CLOCKEVENTS_BROADCAST=y and no broadcast device is available,
>
> AND cpu local tick device has the C3STOP flag set,
>
> then we have no way to tell the idle code that going deep is not
> allowed.
>
> So we need to be smarter than blindly changing a return
> value. Completely untested patch below.
>
> Thanks,
>
> tglx
> ---
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 4191b5623a28..8731a58dd747 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -63,11 +63,7 @@ extern void tick_broadcast_control(enum tick_broadcast_mode mode);
> static inline void tick_broadcast_control(enum tick_broadcast_mode mode) { }
> #endif /* BROADCAST */
>
> -#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
> extern int tick_broadcast_oneshot_control(enum tick_broadcast_state state);
> -#else
> -static inline int tick_broadcast_oneshot_control(enum tick_broadcast_state state) { return 0; }
> -#endif
>
> static inline void tick_broadcast_enable(void)
> {
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index d39f32cdd1b5..52c8d01b956b 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -662,24 +662,15 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
> clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
> }
>
> -/**
> - * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
> - * @state: The target state (enter/exit)
> - *
> - * The system enters/leaves a state, where affected devices might stop
> - * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
> - *
> - * Called with interrupts disabled, so clockevents_lock is not
> - * required here because the local clock event device cannot go away
> - * under us.
> - */
> -int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> +int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> {
> struct clock_event_device *bc, *dev;
> - struct tick_device *td;
> int cpu, ret = 0;
> ktime_t now;
>
> + if (!tick_broadcast_device.evtdev)
> + return -EBUSY;
> +
> /*
> * Periodic mode does not care about the enter/exit of power
> * states
> @@ -687,15 +678,7 @@ int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
> return 0;
>
> - /*
> - * We are called with preemtion disabled from the depth of the
> - * idle code, so we can't be moved away.
> - */
> - td = this_cpu_ptr(&tick_cpu_device);
> - dev = td->evtdev;
> -
> - if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
> - return 0;
> + dev = this_cpu_ptr(&tick_cpu_device)->evtdev;
>
> raw_spin_lock(&tick_broadcast_lock);
> bc = tick_broadcast_device.evtdev;
> @@ -938,6 +921,13 @@ bool tick_broadcast_oneshot_available(void)
> return bc ? bc->features & CLOCK_EVT_FEAT_ONESHOT : false;
> }
>
> +#else
> +int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> +{
> + if (!tick_broadcast_device.evtdev)
> + return -EBUSY;
> + return 0;
> +}
> #endif
>
> void __init tick_broadcast_init(void)
> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> index 17f144450050..f66c11a30348 100644
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -342,6 +342,27 @@ out_bc:
> tick_install_broadcast_device(newdev);
> }
>
> +/**
> + * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
> + * @state: The target state (enter/exit)
> + *
> + * The system enters/leaves a state, where affected devices might stop
> + * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
> + *
> + * Called with interrupts disabled, so clockevents_lock is not
> + * required here because the local clock event device cannot go away
> + * under us.
> + */
> +int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> +{
> + struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
> +
> + if (!(td->evtdev->features & CLOCK_EVT_FEAT_C3STOP))
> + return 0;
> +
> + return __tick_broadcast_oneshot_control(state);
> +}
> +
> #ifdef CONFIG_HOTPLUG_CPU
> /*
> * Transfer the do_timer job away from a dying cpu.
> diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
> index 42fdf4958bcc..a4a8d4e9baa1 100644
> --- a/kernel/time/tick-sched.h
> +++ b/kernel/time/tick-sched.h
> @@ -71,4 +71,14 @@ extern void tick_cancel_sched_timer(int cpu);
> static inline void tick_cancel_sched_timer(int cpu) { }
> #endif
>
> +#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
> +extern int __tick_broadcast_oneshot_control(enum tick_broadcast_state state);
> +#else
> +static inline int
> +__tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> +{
> + return -EBUSY;
> +}
> +#endif
> +
> #endif
Reviewed-by: Preeti U Murthy <[email protected]>
>
On 06/25/2015 09:00 PM, Sudeep Holla wrote:
>
>
> On 25/06/15 14:55, Thomas Gleixner wrote:
>> On Thu, 25 Jun 2015, Sudeep Holla wrote:
>>
>>> tick_broadcast_enter returns 0 when CPU can switch to broadcast
>>> timer and non-zero otherwise. However when GENERIC_CLOCKEVENTS_BROADCAST
>>> and TICK_ONESHOT are disabled, tick_broadcast_oneshot_control returns 0
>>> which indicates to the CPUIdle framework that the CPU can enter deeper
>>> idle states even when the CPU local timer will be shutdown. If the
>>> target state needs broadcast but not broadcast timer is available, then
>>> the CPU can not resume back from that idle state.
>>>
>>> This patch returns error when there's no broadcast timer support
>>> available so that CPUIdle framework prevents the CPU from entering any
>>> idle states losing the local timer.
>>
>> That's wrong and breaks stuff which does not require the broadcast
>> nonsense.
>>
>
> OK, sorry for not considering that case.
>
>> If TICK_ONESHOT is disabled, then everything is in periodic mode and
>> tick_broadcast_enter() rightfully returns 0. Ditto for 'highres=off'
>> on the command line.
>>
>> But there is a case which is not correctly handled right now. That's
>> what you are trying to solve in the wrong way.
>>
>
> Correct I was trying to solve exactly the case mentioned below.
>
>> If
>> GENERIC_CLOCKEVENTS_BROADCAST=n
>>
>> or
>>
>> GENERIC_CLOCKEVENTS_BROADCAST=y and no broadcast device is available,
>>
>> AND cpu local tick device has the C3STOP flag set,
>>
>> then we have no way to tell the idle code that going deep is not
>> allowed.
>>
>> So we need to be smarter than blindly changing a return
>> value. Completely untested patch below.
>>
>
> Agreed, thanks for the quick patch, I have tested it and it works fine.
> You can add
>
> Tested-by: Sudeep Holla <[email protected]>
What about the case where GENERIC_CLOCKEVENTS_BROADCAST=y and
TICK_ONESHOT=n (HZ_PERIODIC=y) ? Have you tested this ?
This will hang the kernel at boot if you are using the hrtimer mode of
broadcast. This is because the local timers of all cpus are shutdown
when the cpuidle driver registers itself, on finding out that there are
idle states where local tick devices stop. The broadcast tick device is
then in charge of waking up the cpus at every period. In hrtimer mode of
broadcast, there is no such real device and we hang.
There was a patch sent out recently to fix this on powerpc.
https://lkml.org/lkml/2015/6/24/42
Regards
Preeti U Murthy
>
> Regards,
> Sudeep
>
On Fri, 26 Jun 2015, Preeti U Murthy wrote:
> What about the case where GENERIC_CLOCKEVENTS_BROADCAST=y and
> TICK_ONESHOT=n (HZ_PERIODIC=y) ? Have you tested this ?
>
> This will hang the kernel at boot if you are using the hrtimer mode of
> broadcast. This is because the local timers of all cpus are shutdown
> when the cpuidle driver registers itself, on finding out that there are
> idle states where local tick devices stop. The broadcast tick device is
> then in charge of waking up the cpus at every period. In hrtimer mode of
> broadcast, there is no such real device and we hang.
Hmm, no. tick-broadcast-hrtimer.o depends on TICK_ONESHOT=y. So this
is covered by the check for the broadcast device, which is NULL.
But there is another variant:
GENERIC_CLOCKEVENTS_BROADCAST=y TICK_ONESHOT=y and 'highres=off
nohz=off' on the kernel command line.
So we need to have the broadcast_needs_cpu() check in that case as
well. Updated patch below.
Thanks,
tglx
---
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 4191b5623a28..8731a58dd747 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -63,11 +63,7 @@ extern void tick_broadcast_control(enum tick_broadcast_mode mode);
static inline void tick_broadcast_control(enum tick_broadcast_mode mode) { }
#endif /* BROADCAST */
-#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
extern int tick_broadcast_oneshot_control(enum tick_broadcast_state state);
-#else
-static inline int tick_broadcast_oneshot_control(enum tick_broadcast_state state) { return 0; }
-#endif
static inline void tick_broadcast_enable(void)
{
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index d39f32cdd1b5..281ce29d295e 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -662,46 +662,39 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
}
-/**
- * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
- * @state: The target state (enter/exit)
- *
- * The system enters/leaves a state, where affected devices might stop
- * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
- *
- * Called with interrupts disabled, so clockevents_lock is not
- * required here because the local clock event device cannot go away
- * under us.
- */
-int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
{
struct clock_event_device *bc, *dev;
- struct tick_device *td;
int cpu, ret = 0;
ktime_t now;
- /*
- * Periodic mode does not care about the enter/exit of power
- * states
- */
- if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
- return 0;
+ if (!tick_broadcast_device.evtdev)
+ return -EBUSY;
- /*
- * We are called with preemtion disabled from the depth of the
- * idle code, so we can't be moved away.
- */
- td = this_cpu_ptr(&tick_cpu_device);
- dev = td->evtdev;
-
- if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
- return 0;
+ dev = this_cpu_ptr(&tick_cpu_device)->evtdev;
raw_spin_lock(&tick_broadcast_lock);
bc = tick_broadcast_device.evtdev;
cpu = smp_processor_id();
if (state == TICK_BROADCAST_ENTER) {
+ /*
+ * If the current CPU owns the hrtimer broadcast
+ * mechanism, it cannot go deep idle and we do not add
+ * the CPU to the broadcast mask. We don't have to go
+ * through the EXIT path as the local timer is not
+ * shutdown.
+ */
+ ret = broadcast_needs_cpu(bc, cpu);
+
+ /*
+ * If the hrtimer broadcast check tells us that the
+ * cpu cannot go deep idle, or if the broadcast device
+ * is in periodic mode, we just return.
+ */
+ if (ret || tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
+ goto out;
+
if (!cpumask_test_and_set_cpu(cpu, tick_broadcast_oneshot_mask)) {
WARN_ON_ONCE(cpumask_test_cpu(cpu, tick_broadcast_pending_mask));
broadcast_shutdown_local(bc, dev);
@@ -717,16 +710,6 @@ int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
dev->next_event.tv64 < bc->next_event.tv64)
tick_broadcast_set_event(bc, cpu, dev->next_event);
}
- /*
- * If the current CPU owns the hrtimer broadcast
- * mechanism, it cannot go deep idle and we remove the
- * CPU from the broadcast mask. We don't have to go
- * through the EXIT path as the local timer is not
- * shutdown.
- */
- ret = broadcast_needs_cpu(bc, cpu);
- if (ret)
- cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
} else {
if (cpumask_test_and_clear_cpu(cpu, tick_broadcast_oneshot_mask)) {
clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT);
@@ -938,6 +921,13 @@ bool tick_broadcast_oneshot_available(void)
return bc ? bc->features & CLOCK_EVT_FEAT_ONESHOT : false;
}
+#else
+int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+{
+ if (!tick_broadcast_device.evtdev)
+ return -EBUSY;
+ return 0;
+}
#endif
void __init tick_broadcast_init(void)
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 17f144450050..f66c11a30348 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -342,6 +342,27 @@ out_bc:
tick_install_broadcast_device(newdev);
}
+/**
+ * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
+ * @state: The target state (enter/exit)
+ *
+ * The system enters/leaves a state, where affected devices might stop
+ * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
+ *
+ * Called with interrupts disabled, so clockevents_lock is not
+ * required here because the local clock event device cannot go away
+ * under us.
+ */
+int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+{
+ struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
+
+ if (!(td->evtdev->features & CLOCK_EVT_FEAT_C3STOP))
+ return 0;
+
+ return __tick_broadcast_oneshot_control(state);
+}
+
#ifdef CONFIG_HOTPLUG_CPU
/*
* Transfer the do_timer job away from a dying cpu.
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index 42fdf4958bcc..a4a8d4e9baa1 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -71,4 +71,14 @@ extern void tick_cancel_sched_timer(int cpu);
static inline void tick_cancel_sched_timer(int cpu) { }
#endif
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
+extern int __tick_broadcast_oneshot_control(enum tick_broadcast_state state);
+#else
+static inline int
+__tick_broadcast_oneshot_control(enum tick_broadcast_state state)
+{
+ return -EBUSY;
+}
+#endif
+
#endif
On 26/06/15 06:08, Preeti U Murthy wrote:
> On 06/25/2015 09:00 PM, Sudeep Holla wrote:
>>
>>
>> On 25/06/15 14:55, Thomas Gleixner wrote:
>>> On Thu, 25 Jun 2015, Sudeep Holla wrote:
>>>
>>>> tick_broadcast_enter returns 0 when CPU can switch to broadcast
>>>> timer and non-zero otherwise. However when GENERIC_CLOCKEVENTS_BROADCAST
>>>> and TICK_ONESHOT are disabled, tick_broadcast_oneshot_control returns 0
>>>> which indicates to the CPUIdle framework that the CPU can enter deeper
>>>> idle states even when the CPU local timer will be shutdown. If the
>>>> target state needs broadcast but not broadcast timer is available, then
>>>> the CPU can not resume back from that idle state.
>>>>
>>>> This patch returns error when there's no broadcast timer support
>>>> available so that CPUIdle framework prevents the CPU from entering any
>>>> idle states losing the local timer.
>>>
>>> That's wrong and breaks stuff which does not require the broadcast
>>> nonsense.
>>>
>>
>> OK, sorry for not considering that case.
>>
>>> If TICK_ONESHOT is disabled, then everything is in periodic mode and
>>> tick_broadcast_enter() rightfully returns 0. Ditto for 'highres=off'
>>> on the command line.
>>>
>>> But there is a case which is not correctly handled right now. That's
>>> what you are trying to solve in the wrong way.
>>>
>>
>> Correct I was trying to solve exactly the case mentioned below.
>>
>>> If
>>> GENERIC_CLOCKEVENTS_BROADCAST=n
>>>
>>> or
>>>
>>> GENERIC_CLOCKEVENTS_BROADCAST=y and no broadcast device is available,
>>>
>>> AND cpu local tick device has the C3STOP flag set,
>>>
>>> then we have no way to tell the idle code that going deep is not
>>> allowed.
>>>
>>> So we need to be smarter than blindly changing a return
>>> value. Completely untested patch below.
>>>
>>
>> Agreed, thanks for the quick patch, I have tested it and it works fine.
>> You can add
>>
>> Tested-by: Sudeep Holla <[email protected]>
>
> What about the case where GENERIC_CLOCKEVENTS_BROADCAST=y and
> TICK_ONESHOT=n (HZ_PERIODIC=y) ? Have you tested this ?
>
Yes I did test this config, but not the one through cmdline which tglx
is suggesting in the other mail. It doesn't hang but all cpus are in
shallow idle states(WFI in ARM) and no progress in the boot. But IMO
that's different issue and that config is not tested for long time and
need more investigation. I will get into that ASAP but that's least used
configuration.
> This will hang the kernel at boot if you are using the hrtimer mode of
> broadcast. This is because the local timers of all cpus are shutdown
> when the cpuidle driver registers itself, on finding out that there are
> idle states where local tick devices stop. The broadcast tick device is
> then in charge of waking up the cpus at every period. In hrtimer mode of
> broadcast, there is no such real device and we hang.
>
No sure what you mean by this. IIUC when you select HIGH_RES_TIMERS,
TICK_ONESHOT is selected by default. So I don't understand how to get
HZ_PERIODIC=y, HIGH_RES_TIMERS=n and hrtimer mode of broadcast. Am I
missing something ?
> There was a patch sent out recently to fix this on powerpc.
> https://lkml.org/lkml/2015/6/24/42
>
Yes I saw that and IIUC you don't register the idle states where local
timer stops, correct ? But I am not seeing the hang on ARM as described
in the log, I will spend more time to check if I am missing something
and not testing the right configuration.
Regards,
Sudeep
On 06/26/2015 01:17 PM, Thomas Gleixner wrote:
> On Fri, 26 Jun 2015, Preeti U Murthy wrote:
>> What about the case where GENERIC_CLOCKEVENTS_BROADCAST=y and
>> TICK_ONESHOT=n (HZ_PERIODIC=y) ? Have you tested this ?
>>
>> This will hang the kernel at boot if you are using the hrtimer mode of
>> broadcast. This is because the local timers of all cpus are shutdown
>> when the cpuidle driver registers itself, on finding out that there are
>> idle states where local tick devices stop. The broadcast tick device is
>> then in charge of waking up the cpus at every period. In hrtimer mode of
>> broadcast, there is no such real device and we hang.
>
> Hmm, no. tick-broadcast-hrtimer.o depends on TICK_ONESHOT=y. So this
> is covered by the check for the broadcast device, which is NULL.
>
> But there is another variant:
>
> GENERIC_CLOCKEVENTS_BROADCAST=y TICK_ONESHOT=y and 'highres=off
> nohz=off' on the kernel command line.
Can this happen at all? It is during tick_init_highres() or
tick_nohz_switch_to_nohz() that we switch to oneshot mode, not otherwise
AFAICT.
I was actually talking of the following scenario. In periodic mode,
where GENERIC_CLOCKEVENTS_BROADCAST=y, the arch can execute
tick_setup_hrtimer_broadcast(), which will return nothing as you point
out above. So there is no broadcast clockevent device.
When the cpuidle driver registers with the cpuidle core however,
cpuidle_setup_broadcast_timer() on every cpu is executed if it finds
that there is an idle state where ticks stop.
cpuidle_setup_broadcast_timer()
tick_broadcast_enable()
tick_broadcast_control(BROADCAST_ON)
bc = tick_broadcast_device.evtdev which is NULL in this case
TICK_BROADCAST_ON:
checks for periodic mode of the broadcast device - succeeds
although we haven't registered a broadcast device because
value of TICKDEV_PERIODIC is 0, the default value of td.mode.
clockevents_shutdown(dev)
At this point all cpus stop.
Regards
Preeti U Murthy
On Fri, 26 Jun 2015, Preeti U Murthy wrote:
> On 06/26/2015 01:17 PM, Thomas Gleixner wrote:
> > On Fri, 26 Jun 2015, Preeti U Murthy wrote:
> >> What about the case where GENERIC_CLOCKEVENTS_BROADCAST=y and
> >> TICK_ONESHOT=n (HZ_PERIODIC=y) ? Have you tested this ?
> >>
> >> This will hang the kernel at boot if you are using the hrtimer mode of
> >> broadcast. This is because the local timers of all cpus are shutdown
> >> when the cpuidle driver registers itself, on finding out that there are
> >> idle states where local tick devices stop. The broadcast tick device is
> >> then in charge of waking up the cpus at every period. In hrtimer mode of
> >> broadcast, there is no such real device and we hang.
> >
> > Hmm, no. tick-broadcast-hrtimer.o depends on TICK_ONESHOT=y. So this
> > is covered by the check for the broadcast device, which is NULL.
> >
> > But there is another variant:
> >
> > GENERIC_CLOCKEVENTS_BROADCAST=y TICK_ONESHOT=y and 'highres=off
> > nohz=off' on the kernel command line.
>
> Can this happen at all? It is during tick_init_highres() or
> tick_nohz_switch_to_nohz() that we switch to oneshot mode, not otherwise
> AFAICT.
And how does that matter? If 'highres=off nohz=off' is on the kernel
command line none of the switchovers happens. So system stays in
periodic mode and the broadcast hrtimer thing is registered, right?
> I was actually talking of the following scenario. In periodic mode,
> where GENERIC_CLOCKEVENTS_BROADCAST=y, the arch can execute
> tick_setup_hrtimer_broadcast(), which will return nothing as you point
> out above. So there is no broadcast clockevent device.
>
> When the cpuidle driver registers with the cpuidle core however,
> cpuidle_setup_broadcast_timer() on every cpu is executed if it finds
> that there is an idle state where ticks stop.
>
> cpuidle_setup_broadcast_timer()
> tick_broadcast_enable()
> tick_broadcast_control(BROADCAST_ON)
> bc = tick_broadcast_device.evtdev which is NULL in this case
>
> TICK_BROADCAST_ON:
> checks for periodic mode of the broadcast device - succeeds
> although we haven't registered a broadcast device because
> value of TICKDEV_PERIODIC is 0, the default value of td.mode.
>
> clockevents_shutdown(dev)
>
> At this point all cpus stop.
Right. That's a different one, if tick_broadcast_device.evtdev == NULL.
We should not stop any cpu local timer in that case. Combined with the
patch I sent, we prevent the idle stuff from going into a state where
the cpu local timers stop.
Thanks,
tglx
On 06/26/2015 01:17 PM, Thomas Gleixner wrote:
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index d39f32cdd1b5..281ce29d295e 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -662,46 +662,39 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
> clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
> }
>
> -/**
> - * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
> - * @state: The target state (enter/exit)
> - *
> - * The system enters/leaves a state, where affected devices might stop
> - * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
> - *
> - * Called with interrupts disabled, so clockevents_lock is not
> - * required here because the local clock event device cannot go away
> - * under us.
> - */
> -int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> +int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> {
> struct clock_event_device *bc, *dev;
> - struct tick_device *td;
> int cpu, ret = 0;
> ktime_t now;
>
> - /*
> - * Periodic mode does not care about the enter/exit of power
> - * states
> - */
> - if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
> - return 0;
> + if (!tick_broadcast_device.evtdev)
> + return -EBUSY;
>
> - /*
> - * We are called with preemtion disabled from the depth of the
> - * idle code, so we can't be moved away.
> - */
> - td = this_cpu_ptr(&tick_cpu_device);
> - dev = td->evtdev;
> -
> - if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
> - return 0;
> + dev = this_cpu_ptr(&tick_cpu_device)->evtdev;
>
> raw_spin_lock(&tick_broadcast_lock);
> bc = tick_broadcast_device.evtdev;
> cpu = smp_processor_id();
>
> if (state == TICK_BROADCAST_ENTER) {
> + /*
> + * If the current CPU owns the hrtimer broadcast
> + * mechanism, it cannot go deep idle and we do not add
> + * the CPU to the broadcast mask. We don't have to go
> + * through the EXIT path as the local timer is not
> + * shutdown.
> + */
> + ret = broadcast_needs_cpu(bc, cpu);
> +
> + /*
> + * If the hrtimer broadcast check tells us that the
> + * cpu cannot go deep idle, or if the broadcast device
> + * is in periodic mode, we just return.
> + */
> + if (ret || tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
> + goto out;
The check on PERIODIC mode is good, but I don't see the point of moving
broadcast_needs_cpu() up above. broadcast_shutdown_local() calls
broadcast_needs_cpu() internally.
Besides, by jumping to 'out', we will miss programming the broadcast
hrtimer in tick_broadcast_set_event() below, if the cpu happen to be the
broadcast cpu(which is why it was not allowed to go to deep idle).
> +
> if (!cpumask_test_and_set_cpu(cpu, tick_broadcast_oneshot_mask)) {
> WARN_ON_ONCE(cpumask_test_cpu(cpu, tick_broadcast_pending_mask));
> broadcast_shutdown_local(bc, dev);
> @@ -717,16 +710,6 @@ int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
> dev->next_event.tv64 < bc->next_event.tv64)
> tick_broadcast_set_event(bc, cpu, dev->next_event);
> }
> - /*
> - * If the current CPU owns the hrtimer broadcast
> - * mechanism, it cannot go deep idle and we remove the
> - * CPU from the broadcast mask. We don't have to go
> - * through the EXIT path as the local timer is not
> - * shutdown.
> - */
> - ret = broadcast_needs_cpu(bc, cpu);
> - if (ret)
> - cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
> } else {
> if (cpumask_test_and_clear_cpu(cpu, tick_broadcast_oneshot_mask)) {
> clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT);
> @@ -938,6 +921,13 @@ bool tick_broadcast_oneshot_available(void)
> return bc ? bc->features & CLOCK_EVT_FEAT_ONESHOT : false;
> }
>
Regards
Preeti U Murthy
On 06/26/2015 05:20 PM, Thomas Gleixner wrote:
> On Fri, 26 Jun 2015, Preeti U Murthy wrote:
>> On 06/26/2015 01:17 PM, Thomas Gleixner wrote:
>>> On Fri, 26 Jun 2015, Preeti U Murthy wrote:
>>>> What about the case where GENERIC_CLOCKEVENTS_BROADCAST=y and
>>>> TICK_ONESHOT=n (HZ_PERIODIC=y) ? Have you tested this ?
>>>>
>>>> This will hang the kernel at boot if you are using the hrtimer mode of
>>>> broadcast. This is because the local timers of all cpus are shutdown
>>>> when the cpuidle driver registers itself, on finding out that there are
>>>> idle states where local tick devices stop. The broadcast tick device is
>>>> then in charge of waking up the cpus at every period. In hrtimer mode of
>>>> broadcast, there is no such real device and we hang.
>>>
>>> Hmm, no. tick-broadcast-hrtimer.o depends on TICK_ONESHOT=y. So this
>>> is covered by the check for the broadcast device, which is NULL.
>>>
>>> But there is another variant:
>>>
>>> GENERIC_CLOCKEVENTS_BROADCAST=y TICK_ONESHOT=y and 'highres=off
>>> nohz=off' on the kernel command line.
>>
>> Can this happen at all? It is during tick_init_highres() or
>> tick_nohz_switch_to_nohz() that we switch to oneshot mode, not otherwise
>> AFAICT.
>
> And how does that matter? If 'highres=off nohz=off' is on the kernel
> command line none of the switchovers happens. So system stays in
> periodic mode and the broadcast hrtimer thing is registered, right?
Yes we are good here. I overlooked the fact that we could disable high
resolution/nohz just before boot.
>
>> I was actually talking of the following scenario. In periodic mode,
>> where GENERIC_CLOCKEVENTS_BROADCAST=y, the arch can execute
>> tick_setup_hrtimer_broadcast(), which will return nothing as you point
>> out above. So there is no broadcast clockevent device.
>>
>> When the cpuidle driver registers with the cpuidle core however,
>> cpuidle_setup_broadcast_timer() on every cpu is executed if it finds
>> that there is an idle state where ticks stop.
>>
>> cpuidle_setup_broadcast_timer()
>> tick_broadcast_enable()
>> tick_broadcast_control(BROADCAST_ON)
>> bc = tick_broadcast_device.evtdev which is NULL in this case
>>
>> TICK_BROADCAST_ON:
>> checks for periodic mode of the broadcast device - succeeds
>> although we haven't registered a broadcast device because
>> value of TICKDEV_PERIODIC is 0, the default value of td.mode.
>>
>> clockevents_shutdown(dev)
>>
>> At this point all cpus stop.
>
> Right. That's a different one, if tick_broadcast_device.evtdev == NULL.
>
> We should not stop any cpu local timer in that case. Combined with the
> patch I sent, we prevent the idle stuff from going into a state where
> the cpu local timers stop.
Right. We need a check above too.
Regards
Preeti U Murthy
>
> Thanks,
>
> tglx
>
On Fri, 26 Jun 2015, Preeti U Murthy wrote:
> On 06/26/2015 01:17 PM, Thomas Gleixner wrote:
> > if (state == TICK_BROADCAST_ENTER) {
> > + /*
> > + * If the current CPU owns the hrtimer broadcast
> > + * mechanism, it cannot go deep idle and we do not add
> > + * the CPU to the broadcast mask. We don't have to go
> > + * through the EXIT path as the local timer is not
> > + * shutdown.
> > + */
> > + ret = broadcast_needs_cpu(bc, cpu);
> > +
> > + /*
> > + * If the hrtimer broadcast check tells us that the
> > + * cpu cannot go deep idle, or if the broadcast device
> > + * is in periodic mode, we just return.
> > + */
> > + if (ret || tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
> > + goto out;
>
> The check on PERIODIC mode is good, but I don't see the point of moving
> broadcast_needs_cpu() up above. broadcast_shutdown_local() calls
> broadcast_needs_cpu() internally.
>
> Besides, by jumping to 'out', we will miss programming the broadcast
> hrtimer in tick_broadcast_set_event() below, if the cpu happen to be the
> broadcast cpu(which is why it was not allowed to go to deep idle).
Hmm. Need to think a bit more about this convoluted maze ...
On 06/26/2015 06:08 PM, Thomas Gleixner wrote:
> On Fri, 26 Jun 2015, Preeti U Murthy wrote:
>> On 06/26/2015 01:17 PM, Thomas Gleixner wrote:
>>> if (state == TICK_BROADCAST_ENTER) {
>>> + /*
>>> + * If the current CPU owns the hrtimer broadcast
>>> + * mechanism, it cannot go deep idle and we do not add
>>> + * the CPU to the broadcast mask. We don't have to go
>>> + * through the EXIT path as the local timer is not
>>> + * shutdown.
>>> + */
>>> + ret = broadcast_needs_cpu(bc, cpu);
>>> +
>>> + /*
>>> + * If the hrtimer broadcast check tells us that the
>>> + * cpu cannot go deep idle, or if the broadcast device
>>> + * is in periodic mode, we just return.
>>> + */
>>> + if (ret || tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
>>> + goto out;
>>
>> The check on PERIODIC mode is good, but I don't see the point of moving
>> broadcast_needs_cpu() up above. broadcast_shutdown_local() calls
>> broadcast_needs_cpu() internally.
>>
>> Besides, by jumping to 'out', we will miss programming the broadcast
>> hrtimer in tick_broadcast_set_event() below, if the cpu happen to be the
>> broadcast cpu(which is why it was not allowed to go to deep idle).
>
> Hmm. Need to think a bit more about this convoluted maze ...
I think you cover all cases just by having that check on periodic mode.
This covers the nohz_full=n,highres=n, TICK_ONESHOT=y and
GENERIC_CLOCKEVENTS_BROADCAST=y. broadcast_needs_cpu() should remain
where it was though.
And of course, the additional patch on tick_broadcast_device.evtdev ==
NULL in BROADCAST_ON.
Regards
Preeti U Murthy
>
On 26/06/15 13:34, Preeti U Murthy wrote:
> On 06/26/2015 01:17 PM, Thomas Gleixner wrote:
>> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
>> index d39f32cdd1b5..281ce29d295e 100644
>> --- a/kernel/time/tick-broadcast.c
>> +++ b/kernel/time/tick-broadcast.c
>> @@ -662,46 +662,39 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
>> clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
>> }
>>
>> -/**
>> - * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
>> - * @state: The target state (enter/exit)
>> - *
>> - * The system enters/leaves a state, where affected devices might stop
>> - * Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
>> - *
>> - * Called with interrupts disabled, so clockevents_lock is not
>> - * required here because the local clock event device cannot go away
>> - * under us.
>> - */
>> -int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
>> +int __tick_broadcast_oneshot_control(enum tick_broadcast_state state)
>> {
>> struct clock_event_device *bc, *dev;
>> - struct tick_device *td;
>> int cpu, ret = 0;
>> ktime_t now;
>>
>> - /*
>> - * Periodic mode does not care about the enter/exit of power
>> - * states
>> - */
>> - if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
>> - return 0;
>> + if (!tick_broadcast_device.evtdev)
>> + return -EBUSY;
>>
>> - /*
>> - * We are called with preemtion disabled from the depth of the
>> - * idle code, so we can't be moved away.
>> - */
>> - td = this_cpu_ptr(&tick_cpu_device);
>> - dev = td->evtdev;
>> -
>> - if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
>> - return 0;
>> + dev = this_cpu_ptr(&tick_cpu_device)->evtdev;
>>
>> raw_spin_lock(&tick_broadcast_lock);
>> bc = tick_broadcast_device.evtdev;
>> cpu = smp_processor_id();
>>
>> if (state == TICK_BROADCAST_ENTER) {
>> + /*
>> + * If the current CPU owns the hrtimer broadcast
>> + * mechanism, it cannot go deep idle and we do not add
>> + * the CPU to the broadcast mask. We don't have to go
>> + * through the EXIT path as the local timer is not
>> + * shutdown.
>> + */
>> + ret = broadcast_needs_cpu(bc, cpu);
>> +
>> + /*
>> + * If the hrtimer broadcast check tells us that the
>> + * cpu cannot go deep idle, or if the broadcast device
>> + * is in periodic mode, we just return.
>> + */
>> + if (ret || tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
>> + goto out;
>
> The check on PERIODIC mode is good, but I don't see the point of moving
> broadcast_needs_cpu() up above. broadcast_shutdown_local() calls
> broadcast_needs_cpu() internally.
>
> Besides, by jumping to 'out', we will miss programming the broadcast
> hrtimer in tick_broadcast_set_event() below, if the cpu happen to be the
> broadcast cpu(which is why it was not allowed to go to deep idle).
>
I tested the updated patch and found issues. I am seeing some random
behaviour(with GENERIC_CLOCKEVENTS_BROADCAST=y TICK_ONESHOT=y):
1. sometimes all the CPUs have entered deeper idle states(though very
rare, finding it difficult to hit this scenario)
2. few other times I see one CPU in shallow state which matches the
above explanation of missing hrtimer programming.
Regards,
Sudeep
On Fri, 26 Jun 2015, Thomas Gleixner wrote:
> On Fri, 26 Jun 2015, Preeti U Murthy wrote:
> > On 06/26/2015 01:17 PM, Thomas Gleixner wrote:
> > > if (state == TICK_BROADCAST_ENTER) {
> > > + /*
> > > + * If the current CPU owns the hrtimer broadcast
> > > + * mechanism, it cannot go deep idle and we do not add
> > > + * the CPU to the broadcast mask. We don't have to go
> > > + * through the EXIT path as the local timer is not
> > > + * shutdown.
> > > + */
> > > + ret = broadcast_needs_cpu(bc, cpu);
> > > +
> > > + /*
> > > + * If the hrtimer broadcast check tells us that the
> > > + * cpu cannot go deep idle, or if the broadcast device
> > > + * is in periodic mode, we just return.
> > > + */
> > > + if (ret || tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
> > > + goto out;
> >
> > The check on PERIODIC mode is good, but I don't see the point of moving
> > broadcast_needs_cpu() up above. broadcast_shutdown_local() calls
> > broadcast_needs_cpu() internally.
> >
> > Besides, by jumping to 'out', we will miss programming the broadcast
> > hrtimer in tick_broadcast_set_event() below, if the cpu happen to be the
> > broadcast cpu(which is why it was not allowed to go to deep idle).
>
> Hmm. Need to think a bit more about this convoluted maze ...
Actually, that does not matter at all because the CPU which runs the
broadcast does not need to program its own next event.
That's just redundant because the next event on this cpu is already
programmed in the cpu local timer device and the associated hrtimer is
in the tree. The cpu does not go into deep idle so it just works and
we spare the set/clear bit dance along with the hrtimer update.
Now, there is another caveat. If the cpu is not holding the broadcast
device and has the first expiring timer then the broadcast device
might migrate over and we should clear the cpu in the
broadcast_oneshot_mask.
I so wish we had never invented that broadcast crap at all.
Thanks,
tglx