2023-08-11 17:31:55

by Frederic Weisbecker

Subject: [PATCH 00/10] timers/cpuidle: Fixes and cleanups

Hi,

The first part of the series is cpuidle callback fixes against timers,
some of which haven't been signed off by Peter yet.

Another part is removing the overhead of useless TIF_NR_POLLING clearing.

The rest adds comments.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
timers/idle

HEAD: b66dd12bb29cca558b9323f2b270a7dae8f56c48

Thanks,
Frederic
---

Frederic Weisbecker (8):
x86: Add a comment about the "magic" behind shadow sti before mwait
cpuidle: Report illegal tick stopped while polling
cpuidle: Comment about timers requirements VS idle handler
cpuidle: Remove unnecessary current_clr_polling_and_test() from haltpoll
cpuidle: Remove unnecessary current_clr_polling() on poll_idle()
x86: Remove __current_clr_polling() from mwait_idle()
x86: Remove the current_clr_polling() call upon mwait exit
sched/timers: Explain why idle task schedules out on remote timer enqueue

Peter Zijlstra (2):
cpuidle: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram
loongson: Fix idle VS timer enqueue


arch/loongarch/kernel/genex.S | 32 ++++++++++++++++++--------------
arch/loongarch/kernel/idle.c | 1 -
arch/x86/include/asm/mwait.h | 21 ++++++++++++++++++---
arch/x86/kernel/process.c | 1 -
drivers/acpi/acpi_pad.c | 1 +
drivers/cpuidle/cpuidle-haltpoll.c | 5 +----
drivers/cpuidle/poll_state.c | 10 ++++++++--
drivers/idle/intel_idle.c | 19 +++++++------------
kernel/sched/core.c | 22 ++++++++++++++++++++++
kernel/sched/idle.c | 30 ++++++++++++++++++++++++++++++
10 files changed, 105 insertions(+), 37 deletions(-)


2023-08-11 17:31:56

by Frederic Weisbecker

Subject: [PATCH 03/10] cpuidle: Report illegal tick stopped while polling

poll_idle() can't be called while the tick is stopped because it enables
interrupts and only polls on TIF_NEED_RESCHED, which doesn't tell whether an
interrupt has queued a timer that would require a tick re-programming.

There is no point anyway in polling with the tick stopped, so add a check
to make sure it never happens.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
drivers/cpuidle/poll_state.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 9b6d90a72601..009f46f121ae 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -7,6 +7,7 @@
#include <linux/sched.h>
#include <linux/sched/clock.h>
#include <linux/sched/idle.h>
+#include <linux/tick.h>

#define POLL_IDLE_RELAX_COUNT 200

@@ -19,6 +20,13 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,

dev->poll_time_limit = false;

+ /*
+ * This re-enables IRQs and only polls on TIF_NEED_RESCHED.
+ * A timer queued by an interrupt here may go unnoticed if
+ * the tick is stopped.
+ */
+ WARN_ON_ONCE(tick_nohz_tick_stopped());
+
raw_local_irq_enable();
if (!current_set_polling_and_test()) {
unsigned int loop_count = 0;
--
2.34.1


2023-08-11 17:34:52

by Frederic Weisbecker

Subject: [PATCH 09/10] x86: Remove the current_clr_polling() call upon mwait exit

mwait_idle_with_hints() is mostly called from cpuidle, which already sets
TIF_NR_POLLING back and folds TIF_NEED_RESCHED if necessary.

The only non-cpuidle caller is power_saving_thread(), which acts as an idle
loop and is the only reason why mwait_idle_with_hints() carries a costly
fully ordered atomic operation upon idle exit.

Move this overhead into power_saving_thread() instead, which already
duplicates quite some cpuidle code.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
arch/x86/include/asm/mwait.h | 1 -
drivers/acpi/acpi_pad.c | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 920426d691ce..c1be3775b94a 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -134,7 +134,6 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
}
}
}
- current_clr_polling();
}

/*
diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
index 7a453c5ff303..5a44afafe185 100644
--- a/drivers/acpi/acpi_pad.c
+++ b/drivers/acpi/acpi_pad.c
@@ -174,6 +174,7 @@ static int power_saving_thread(void *data)
stop_critical_timings();

mwait_idle_with_hints(power_saving_mwait_eax, 1);
+ current_clr_polling();

start_critical_timings();
tick_broadcast_exit();
--
2.34.1


2023-08-11 17:36:11

by Frederic Weisbecker

Subject: [PATCH 07/10] cpuidle: Remove unnecessary current_clr_polling() on poll_idle()

There is no point in clearing TIF_NR_POLLING and folding TIF_NEED_RESCHED
upon poll_idle() exit because cpuidle_idle_call() is going to set
TIF_NR_POLLING again anyway. Also, if TIF_NEED_RESCHED is set, it will be
folded and TIF_NR_POLLING will be cleared at the end of do_idle().

Signed-off-by: Frederic Weisbecker <[email protected]>
---
drivers/cpuidle/poll_state.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 009f46f121ae..44a656464d06 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -48,8 +48,6 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
}
raw_local_irq_disable();

- current_clr_polling();
-
return index;
}

--
2.34.1


2023-08-11 17:48:10

by Frederic Weisbecker

Subject: [PATCH 01/10] x86: Add a comment about the "magic" behind shadow sti before mwait

Add a note to make sure we never miss and break the requirements behind
it.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
arch/x86/include/asm/mwait.h | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 778df05f8539..341ee4f1d91e 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -87,6 +87,15 @@ static __always_inline void __mwaitx(unsigned long eax, unsigned long ebx,
:: "a" (eax), "b" (ebx), "c" (ecx));
}

+/*
+ * Re-enable interrupts right upon calling mwait in such a way that
+ * no interrupt can fire _before_ the execution of mwait, ie: no
+ * instruction must be placed between "sti" and "mwait".
+ *
+ * This is necessary because if an interrupt queues a timer before
+ * executing mwait, it would otherwise go unnoticed and the next tick
+ * would not be reprogrammed accordingly before mwait ever wakes up.
+ */
static __always_inline void __sti_mwait(unsigned long eax, unsigned long ecx)
{
mds_idle_clear_cpu_buffers();
--
2.34.1


2023-08-11 17:52:26

by Frederic Weisbecker

Subject: [PATCH 10/10] sched/timers: Explain why idle task schedules out on remote timer enqueue

Trying to avoid that didn't bring much value after testing, so add a
comment explaining why.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/sched/core.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c52c2eba7c73..e53b892167ad 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1135,6 +1135,28 @@ static void wake_up_idle_cpu(int cpu)
if (cpu == smp_processor_id())
return;

+ /*
+ * Set TIF_NEED_RESCHED and send an IPI if in the non-polling
+ * part of the idle loop. This forces an exit from the idle loop
+ * and a round trip to schedule(). Now this could be optimized
+ * because a simple new idle loop iteration is enough to
+ * re-evaluate the next tick. Provided some re-ordering of tick
+ * nohz functions that would need to follow TIF_NR_POLLING
+ * clearing:
+ *
+ * - On most archs, a simple fetch_or on ti::flags with a
+ * "0" value would be enough to know if an IPI needs to be sent.
+ *
+ * - x86 needs to perform a last need_resched() check between
+ * monitor and mwait which doesn't take timers into account.
+ * There a dedicated TIF_TIMER flag would be required to
+ * fetch_or here and be checked along with TIF_NEED_RESCHED
+ * before mwait().
+ *
+ * However, remote timer enqueue is not such a frequent event
+ * and testing of the above solutions didn't appear to report
+ * much benefits.
+ */
if (set_nr_and_not_polling(rq->idle))
smp_send_reschedule(cpu);
else
--
2.34.1


2023-08-11 18:00:15

by Frederic Weisbecker

Subject: [PATCH 04/10] loongson: Fix idle VS timer enqueue

From: Peter Zijlstra <[email protected]>

Loongson re-enables interrupts in its idle routine and performs a
TIF_NEED_RESCHED check afterwards before putting the CPU to sleep.

The IRQs firing between the check and the idling instruction may set the
TIF_NEED_RESCHED flag. In order to deal with such a race, IRQs
interrupting __arch_cpu_idle() roll back their return address to the
beginning of __arch_cpu_idle() so that TIF_NEED_RESCHED is checked
again before going back to sleep.

However idle IRQs can also queue timers that may require a tick
reprogramming through a new generic idle loop iteration, but those timers
would go unnoticed here because __arch_cpu_idle() only checks
TIF_NEED_RESCHED. It doesn't check for pending timers.

Fix this by fast-forwarding the idle IRQs' return address to the end of the
idle routine instead of the beginning, so that the generic idle loop
handles both TIF_NEED_RESCHED and pending timers.

Fixes: 0603839b18f4 (LoongArch: Add exception/interrupt handling)
Tested-by: Bibo, Mao <[email protected]>
Not-yet-signed-off-by: Peter Zijlstra <[email protected]>
Cc: WANG Xuerui <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
---
arch/loongarch/kernel/genex.S | 32 ++++++++++++++++++--------------
arch/loongarch/kernel/idle.c | 1 -
2 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
index 78f066384657..359d693f112e 100644
--- a/arch/loongarch/kernel/genex.S
+++ b/arch/loongarch/kernel/genex.S
@@ -18,27 +18,31 @@

.align 5
SYM_FUNC_START(__arch_cpu_idle)
- /* start of rollback region */
- LONG_L t0, tp, TI_FLAGS
- nop
- andi t0, t0, _TIF_NEED_RESCHED
- bnez t0, 1f
- nop
- nop
- nop
+ /* start of idle interrupt region */
+ move t0, CSR_CRMD_IE
+ csrxchg t0, t0, LOONGARCH_CSR_CRMD
+ /*
+ * If an interrupt lands here; between enabling interrupts above and
+ * going idle on the next instruction, we must *NOT* go idle since the
+ * interrupt could have set TIF_NEED_RESCHED or caused a timer to need
+ * reprogramming. Fall through -- see handle_vint() below -- and have
+ * the idle loop take care of things.
+ */
idle 0
- /* end of rollback region */
-1: jr ra
+ nop
+ /* end of idle interrupt region */
+SYM_INNER_LABEL(__arch_cpu_idle_exit, SYM_L_LOCAL)
+ jr ra
SYM_FUNC_END(__arch_cpu_idle)

SYM_FUNC_START(handle_vint)
BACKUP_T0T1
SAVE_ALL
- la_abs t1, __arch_cpu_idle
+ la_abs t1, __arch_cpu_idle_exit
LONG_L t0, sp, PT_ERA
- /* 32 byte rollback region */
- ori t0, t0, 0x1f
- xori t0, t0, 0x1f
+ /* 16 byte idle interrupt region */
+ ori t0, t0, 0x0f
+ addi.d t0, t0, 1
bne t0, t1, 1f
LONG_S t0, sp, PT_ERA
1: move a0, sp
diff --git a/arch/loongarch/kernel/idle.c b/arch/loongarch/kernel/idle.c
index 0b5dd2faeb90..5ba72d229920 100644
--- a/arch/loongarch/kernel/idle.c
+++ b/arch/loongarch/kernel/idle.c
@@ -11,7 +11,6 @@

void __cpuidle arch_cpu_idle(void)
{
- raw_local_irq_enable();
__arch_cpu_idle(); /* idle instruction needs irq enabled */
raw_local_irq_disable();
}
--
2.34.1


2023-08-11 18:08:16

by Rafael J. Wysocki

Subject: Re: [PATCH 07/10] cpuidle: Remove unnecessary current_clr_polling() on poll_idle()

On Fri, Aug 11, 2023 at 7:01 PM Frederic Weisbecker <[email protected]> wrote:
>
> There is no point in clearing TIF_NR_POLLING and folding TIF_NEED_RESCHED
> upon poll_idle() exit because cpuidle_idle_call() is going to set
> TIF_NR_POLLING again anyway. Also, if TIF_NEED_RESCHED is set, it will be
> folded and TIF_NR_POLLING will be cleared at the end of do_idle().
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Acked-by: Rafael J. Wysocki <[email protected]>

> ---
> drivers/cpuidle/poll_state.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
> index 009f46f121ae..44a656464d06 100644
> --- a/drivers/cpuidle/poll_state.c
> +++ b/drivers/cpuidle/poll_state.c
> @@ -48,8 +48,6 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
> }
> raw_local_irq_disable();
>
> - current_clr_polling();
> -
> return index;
> }
>
> --
> 2.34.1
>

2023-08-11 18:09:38

by Rafael J. Wysocki

Subject: Re: [PATCH 10/10] sched/timers: Explain why idle task schedules out on remote timer enqueue

On Fri, Aug 11, 2023 at 7:01 PM Frederic Weisbecker <[email protected]> wrote:
>
> Trying to avoid that didn't bring much value after testing, so add a
> comment explaining why.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Acked-by: Rafael J. Wysocki <[email protected]>

> ---
> kernel/sched/core.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index c52c2eba7c73..e53b892167ad 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1135,6 +1135,28 @@ static void wake_up_idle_cpu(int cpu)
> if (cpu == smp_processor_id())
> return;
>
> + /*
> + * Set TIF_NEED_RESCHED and send an IPI if in the non-polling
> + * part of the idle loop. This forces an exit from the idle loop
> + * and a round trip to schedule(). Now this could be optimized
> + * because a simple new idle loop iteration is enough to
> + * re-evaluate the next tick. Provided some re-ordering of tick
> + * nohz functions that would need to follow TIF_NR_POLLING
> + * clearing:
> + *
> + * - On most archs, a simple fetch_or on ti::flags with a
> + * "0" value would be enough to know if an IPI needs to be sent.
> + *
> + * - x86 needs to perform a last need_resched() check between
> + * monitor and mwait which doesn't take timers into account.
> + * There a dedicated TIF_TIMER flag would be required to
> + * fetch_or here and be checked along with TIF_NEED_RESCHED
> + * before mwait().
> + *
> + * However, remote timer enqueue is not such a frequent event
> + * and testing of the above solutions didn't appear to report
> + * much benefits.
> + */
> if (set_nr_and_not_polling(rq->idle))
> smp_send_reschedule(cpu);
> else
> --
> 2.34.1
>

2023-08-11 18:23:32

by Rafael J. Wysocki

Subject: Re: [PATCH 01/10] x86: Add a comment about the "magic" behind shadow sti before mwait

On Fri, Aug 11, 2023 at 7:00 PM Frederic Weisbecker <[email protected]> wrote:
>
> Add a note to make sure the requirements behind it are never missed and
> broken.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Acked-by: Rafael J. Wysocki <[email protected]>

> ---
> arch/x86/include/asm/mwait.h | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> index 778df05f8539..341ee4f1d91e 100644
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -87,6 +87,15 @@ static __always_inline void __mwaitx(unsigned long eax, unsigned long ebx,
> :: "a" (eax), "b" (ebx), "c" (ecx));
> }
>
> +/*
> + * Re-enable interrupts right upon calling mwait in such a way that
> + * no interrupt can fire _before_ the execution of mwait, ie: no
> + * instruction must be placed between "sti" and "mwait".
> + *
> + * This is necessary because if an interrupt queues a timer before
> + * executing mwait, it would otherwise go unnoticed and the next tick
> + * would not be reprogrammed accordingly before mwait ever wakes up.
> + */
> static __always_inline void __sti_mwait(unsigned long eax, unsigned long ecx)
> {
> mds_idle_clear_cpu_buffers();
> --
> 2.34.1
>

2023-08-11 18:39:30

by Rafael J. Wysocki

Subject: Re: [PATCH 03/10] cpuidle: Report illegal tick stopped while polling

On Fri, Aug 11, 2023 at 7:01 PM Frederic Weisbecker <[email protected]> wrote:
>
> poll_idle() can't be called while the tick is stopped because it enables
> interrupts and only polls on TIF_NEED_RESCHED, which doesn't tell whether an
> interrupt has queued a timer that would require a tick re-programming.
>
> There is no point anyway in polling with the tick stopped, so add a check
> to make sure it never happens.

I'd rather update governors so they never use polling states when the
tick has been stopped and then add the WARN_ON().

> Signed-off-by: Frederic Weisbecker <[email protected]>
> ---
> drivers/cpuidle/poll_state.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
> index 9b6d90a72601..009f46f121ae 100644
> --- a/drivers/cpuidle/poll_state.c
> +++ b/drivers/cpuidle/poll_state.c
> @@ -7,6 +7,7 @@
> #include <linux/sched.h>
> #include <linux/sched/clock.h>
> #include <linux/sched/idle.h>
> +#include <linux/tick.h>
>
> #define POLL_IDLE_RELAX_COUNT 200
>
> @@ -19,6 +20,13 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
>
> dev->poll_time_limit = false;
>
> + /*
> + * This re-enables IRQs and only polls on TIF_NEED_RESCHED.
> + * A timer queued by an interrupt here may go unnoticed if
> + * the tick is stopped.
> + */
> + WARN_ON_ONCE(tick_nohz_tick_stopped());
> +
> raw_local_irq_enable();
> if (!current_set_polling_and_test()) {
> unsigned int loop_count = 0;
> --
> 2.34.1
>

2023-08-11 18:54:11

by Rafael J. Wysocki

Subject: Re: [PATCH 09/10] x86: Remove the current_clr_polling() call upon mwait exit

On Fri, Aug 11, 2023 at 7:01 PM Frederic Weisbecker <[email protected]> wrote:
>
> mwait_idle_with_hints() is mostly called from cpuidle, which already sets
> TIF_NR_POLLING back and folds TIF_NEED_RESCHED if necessary.
>
> The only non-cpuidle caller is power_saving_thread(), which acts as an idle
> loop and is the only reason why mwait_idle_with_hints() carries a costly
> fully ordered atomic operation upon idle exit.
>
> Move this overhead into power_saving_thread() instead, which already
> duplicates quite some cpuidle code.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>

Acked-by: Rafael J. Wysocki <[email protected]>

> ---
> arch/x86/include/asm/mwait.h | 1 -
> drivers/acpi/acpi_pad.c | 1 +
> 2 files changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> index 920426d691ce..c1be3775b94a 100644
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -134,7 +134,6 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
> }
> }
> }
> - current_clr_polling();
> }
>
> /*
> diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
> index 7a453c5ff303..5a44afafe185 100644
> --- a/drivers/acpi/acpi_pad.c
> +++ b/drivers/acpi/acpi_pad.c
> @@ -174,6 +174,7 @@ static int power_saving_thread(void *data)
> stop_critical_timings();
>
> mwait_idle_with_hints(power_saving_mwait_eax, 1);
> + current_clr_polling();
>
> start_critical_timings();
> tick_broadcast_exit();
> --
> 2.34.1
>

2023-08-14 05:18:08

by Bibo Mao

Subject: Re: [PATCH 04/10] loongson: Fix idle VS timer enqueue

+Add huacai

The asm instruction move should be replaced with li.w; the rest looks good to me.

diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
index 359d693f112e..8a98023ecafd 100644
--- a/arch/loongarch/kernel/genex.S
+++ b/arch/loongarch/kernel/genex.S
@@ -19,7 +19,7 @@
.align 5
SYM_FUNC_START(__arch_cpu_idle)
/* start of idle interrupt region */
- move t0, CSR_CRMD_IE
+ li.w t0, CSR_CRMD_IE
csrxchg t0, t0, LOONGARCH_CSR_CRMD
/*
* If an interrupt lands here; between enabling interrupts above and

By the way, __arch_cpu_idle() is called by machine_halt()/machine_power_off() etc.,
so IRQs will be enabled with the new patch. Do we need to add something like this?

diff --git a/arch/loongarch/kernel/reset.c b/arch/loongarch/kernel/reset.c
index 1ef8c6383535..9ecd42c0c804 100644
--- a/arch/loongarch/kernel/reset.c
+++ b/arch/loongarch/kernel/reset.c
@@ -20,6 +20,11 @@
void (*pm_power_off)(void);
EXPORT_SYMBOL(pm_power_off);

+static __always_inline void native_halt(void)
+{
+ asm volatile("idle 0": : :"memory");
+}
+
void machine_halt(void)
{
#ifdef CONFIG_SMP
@@ -32,9 +37,9 @@ void machine_halt(void)
pr_notice("\n\n** You can safely turn off the power now **\n\n");
console_flush_on_panic(CONSOLE_FLUSH_PENDING);

- while (true) {
- __arch_cpu_idle();
- }
+ while (1) {
+ native_halt();
+ };
}

void machine_power_off(void)
@@ -52,9 +57,9 @@ void machine_power_off(void)
efi.reset_system(EFI_RESET_SHUTDOWN, EFI_SUCCESS, 0, NULL);
#endif

- while (true) {
- __arch_cpu_idle();
- }
+ while (1) {
+ native_halt();
+ };
}

void machine_restart(char *command)
@@ -73,7 +78,7 @@ void machine_restart(char *command)
if (!acpi_disabled)
acpi_reboot();

- while (true) {
- __arch_cpu_idle();
- }
+ while (1) {
+ native_halt();
+ };
}

Regards
Bibo Mao

On 2023/8/12 01:00, Frederic Weisbecker wrote:
> From: Peter Zijlstra <[email protected]>
>
> Loongson re-enables interrupts in its idle routine and performs a
> TIF_NEED_RESCHED check afterwards before putting the CPU to sleep.
>
> The IRQs firing between the check and the idling instruction may set the
> TIF_NEED_RESCHED flag. In order to deal with such a race, IRQs
> interrupting __arch_cpu_idle() roll back their return address to the
> beginning of __arch_cpu_idle() so that TIF_NEED_RESCHED is checked
> again before going back to sleep.
>
> However idle IRQs can also queue timers that may require a tick
> reprogramming through a new generic idle loop iteration, but those timers
> would go unnoticed here because __arch_cpu_idle() only checks
> TIF_NEED_RESCHED. It doesn't check for pending timers.
>
> Fix this by fast-forwarding the idle IRQs' return address to the end of the
> idle routine instead of the beginning, so that the generic idle loop
> handles both TIF_NEED_RESCHED and pending timers.
>
> Fixes: 0603839b18f4 (LoongArch: Add exception/interrupt handling)
> Tested-by: Bibo, Mao <[email protected]>
> Not-yet-signed-off-by: Peter Zijlstra <[email protected]>
> Cc: WANG Xuerui <[email protected]>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> ---
> arch/loongarch/kernel/genex.S | 32 ++++++++++++++++++--------------
> arch/loongarch/kernel/idle.c | 1 -
> 2 files changed, 18 insertions(+), 15 deletions(-)
>
> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> index 78f066384657..359d693f112e 100644
> --- a/arch/loongarch/kernel/genex.S
> +++ b/arch/loongarch/kernel/genex.S
> @@ -18,27 +18,31 @@
>
> .align 5
> SYM_FUNC_START(__arch_cpu_idle)
> - /* start of rollback region */
> - LONG_L t0, tp, TI_FLAGS
> - nop
> - andi t0, t0, _TIF_NEED_RESCHED
> - bnez t0, 1f
> - nop
> - nop
> - nop
> + /* start of idle interrupt region */
> + move t0, CSR_CRMD_IE
> + csrxchg t0, t0, LOONGARCH_CSR_CRMD
> + /*
> + * If an interrupt lands here; between enabling interrupts above and
> + * going idle on the next instruction, we must *NOT* go idle since the
> + * interrupt could have set TIF_NEED_RESCHED or caused a timer to need
> + * reprogramming. Fall through -- see handle_vint() below -- and have
> + * the idle loop take care of things.
> + */
> idle 0
> - /* end of rollback region */
> -1: jr ra
> + nop
> + /* end of idle interrupt region */
> +SYM_INNER_LABEL(__arch_cpu_idle_exit, SYM_L_LOCAL)
> + jr ra
> SYM_FUNC_END(__arch_cpu_idle)
>
> SYM_FUNC_START(handle_vint)
> BACKUP_T0T1
> SAVE_ALL
> - la_abs t1, __arch_cpu_idle
> + la_abs t1, __arch_cpu_idle_exit
> LONG_L t0, sp, PT_ERA
> - /* 32 byte rollback region */
> - ori t0, t0, 0x1f
> - xori t0, t0, 0x1f
> + /* 16 byte idle interrupt region */
> + ori t0, t0, 0x0f
> + addi.d t0, t0, 1
> bne t0, t1, 1f
> LONG_S t0, sp, PT_ERA
> 1: move a0, sp
> diff --git a/arch/loongarch/kernel/idle.c b/arch/loongarch/kernel/idle.c
> index 0b5dd2faeb90..5ba72d229920 100644
> --- a/arch/loongarch/kernel/idle.c
> +++ b/arch/loongarch/kernel/idle.c
> @@ -11,7 +11,6 @@
>
> void __cpuidle arch_cpu_idle(void)
> {
> - raw_local_irq_enable();
> __arch_cpu_idle(); /* idle instruction needs irq enabled */
> raw_local_irq_disable();
> }


2023-08-29 12:18:34

by Frederic Weisbecker

Subject: Re: [PATCH 00/10] timers/cpuidle: Fixes and cleanups

On Tue, Aug 15, 2023 at 06:10:43PM +0200, Peter Zijlstra wrote:
> On Fri, Aug 11, 2023 at 07:00:39PM +0200, Frederic Weisbecker wrote:
> > Hi,
> >
> > The first part of the series is cpuidle callback fixes against timers,
> > some of which haven't been signed off by Peter yet.
> >
> > Another part is removing the overhead of useless TIF_NR_POLLING clearing.
>
> So I've again forgotten why we don't simply set TIF_NEED_RESCHED if we
> need the timer re-programmed. That is by far the simplest fix.
>
> I'm sure we talked about it, but this was a long time ago and I can't
> remember :/

I don't think we did, but the rationale is that by forcing TIF_NEED_RESCHED
you force a needless timer restart (which is then going to be cancelled
shortly after) and a round trip to the scheduler with the rq lock
overhead, etc...

Just for the fun I just tried the following change:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c52c2eba7c73..ec43d135cf65 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1132,8 +1132,10 @@ static void wake_up_idle_cpu(int cpu)
{
struct rq *rq = cpu_rq(cpu);

- if (cpu == smp_processor_id())
+ if (cpu == smp_processor_id()) {
+ set_tsk_need_resched(current);
return;
+ }

if (set_nr_and_not_polling(rq->idle))
smp_send_reschedule(cpu);


Then I computed the average of 100 runs of "make clean -C tools/perf; make -C
tools/perf/" before and after this patch.

I observed a regression: on average, 1.27% less time spent in C-states.

So this has a measurable impact.

>
> Anyway, the patches look good, except I think there's a whole bunch of
> architectures that still need fixing. In particular since loongson
> 'borrowed' the whole lot from MIPS, they need an identical fix. But I'm
> sure there's more architectures affected.

MIPS at least, yes. I only did a quick check and it seems that most archs
use a "wfi"-like instruction. I'll check the others.

Thanks.

2023-08-29 14:02:20

by Frederic Weisbecker

Subject: Re: [PATCH 03/10] cpuidle: Report illegal tick stopped while polling

On Fri, Aug 11, 2023 at 07:37:55PM +0200, Rafael J. Wysocki wrote:
> On Fri, Aug 11, 2023 at 7:01 PM Frederic Weisbecker <[email protected]> wrote:
> >
> > poll_idle() can't be called while the tick is stopped because it enables
> > interrupts and only polls on TIF_NEED_RESCHED, which doesn't tell whether an
> > interrupt has queued a timer that would require a tick re-programming.
> >
> > There is no point anyway in polling with the tick stopped, so add a check
> > to make sure it never happens.
>
> I'd rather update governors so they never use polling states when the
> tick has been stopped and then add the WARN_ON().

Sounds good, let's put this one aside.

In the meantime feel free to pick up those you acked and see fit in your
queue.

Thanks!