2024-02-29 14:23:54

by Thomas Gleixner

Subject: [patch 0/6] x86/idle: Cure RCU violations and cleanups

Boris reported that an RCU-related warning triggers in the tracer code on
AMD machines which are affected by Erratum 400. On those CPUs the local
APIC timer stops in the C1E halt state. This is handled by a special idle
function which invokes tick_broadcast_enter()/exit() around HALT. These
functions can end up in lockdep or tracing which use RCU protected data,
but the core code already set RCU to idle which means that the RCU
protection is no longer given.

This series fixes this by handling the tick broadcast conditionally in the
core idle function. While working on it I noticed a few bogosities in the
related code and cleaned that up on top.
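
To illustrate the ordering problem described above, here is a minimal
user-space sketch in C. It is a toy model, not the code from the series:
every name in it (rcu_watching, model_tick_broadcast(), model_idle_old(),
model_idle_new(), needs_broadcast) is hypothetical and merely stands in for
the real RCU state, the tick broadcast enter/exit calls and the idle
routines. It only shows why invoking the broadcast helpers after RCU went
idle is a violation, and how doing it conditionally in the core idle path,
before RCU goes idle and after it comes back, avoids that.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy model only: none of these are the real kernel symbols. */
static bool rcu_watching = true;  /* true while RCU protection is in effect */
static int rcu_violations;        /* RCU-dependent work done while RCU is idle */

/* Stand-in for tick broadcast enter/exit, which may reach lockdep or tracing. */
static void model_tick_broadcast(const char *what)
{
	if (!rcu_watching)
		rcu_violations++;
	printf("tick broadcast %s, rcu watching: %d\n", what, rcu_watching);
}

/* Stand-in for the HALT instruction. */
static void model_halt(void)
{
}

/* Old ordering: the arch idle routine does the broadcast after RCU went idle. */
static void model_idle_old(void)
{
	rcu_watching = false;           /* core code sets RCU to idle */
	model_tick_broadcast("enter");  /* violation: RCU is already idle */
	model_halt();
	model_tick_broadcast("exit");   /* violation: RCU is still idle */
	rcu_watching = true;
}

/* New ordering: the core idle path does the broadcast conditionally. */
static void model_idle_new(bool needs_broadcast)
{
	if (needs_broadcast)
		model_tick_broadcast("enter");  /* RCU is still watching */
	rcu_watching = false;
	model_halt();
	rcu_watching = true;
	if (needs_broadcast)
		model_tick_broadcast("exit");   /* RCU is watching again */
}
```

In this model a call to model_idle_old() bumps rcu_violations twice, while
model_idle_new(true) leaves it untouched.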

The series is also available from git:

git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/core

Thanks,

tglx
---
arch/x86/Kconfig | 1
arch/x86/include/asm/processor.h | 2
arch/x86/kernel/cpu/common.c | 4 -
arch/x86/kernel/process.c | 89 +++++++++++----------------------------
include/linux/cpu.h | 2
include/linux/tick.h | 3 +
kernel/sched/idle.c | 21 +++++++++
kernel/time/Kconfig | 5 ++
8 files changed, 62 insertions(+), 65 deletions(-)


2024-03-01 19:15:44

by Borislav Petkov

Subject: Re: [patch 0/6] x86/idle: Cure RCU violations and cleanups

On Thu, Feb 29, 2024 at 03:23:35PM +0100, Thomas Gleixner wrote:
> Boris reported that an RCU-related warning triggers in the tracer code on
> AMD machines which are affected by Erratum 400. On those CPUs the local
> APIC timer stops in the C1E halt state. This is handled by a special idle
> function which invokes tick_broadcast_enter()/exit() around HALT. These
> functions can end up in lockdep or tracing which use RCU protected data,
> but the core code already set RCU to idle which means that the RCU
> protection is no longer given.
>
> This series fixes this by handling the tick broadcast conditionally in the
> core idle function. While working on it I noticed a few bogosities in the
> related code and cleaned that up on top.
>
> The series is also available from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/core
>
> Thanks,
>
> tglx
> ---
> arch/x86/Kconfig | 1
> arch/x86/include/asm/processor.h | 2
> arch/x86/kernel/cpu/common.c | 4 -
> arch/x86/kernel/process.c | 89 +++++++++++----------------------------
> include/linux/cpu.h | 2
> include/linux/tick.h | 3 +
> kernel/sched/idle.c | 21 +++++++++
> kernel/time/Kconfig | 5 ++
> 8 files changed, 62 insertions(+), 65 deletions(-)

I refreshed my local branch with all your fixed patches and it still
works.

Tested-by: Borislav Petkov (AMD) <[email protected]>

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette