2014-02-10 03:42:26

by poma

[permalink] [raw]
Subject: WARNING: CPU: 1 PID: 0 at kernel/time/tick-broadcast.c:668 tick_broadcast_oneshot_control+0x17d/0x190()


Let's have one last round of Champions!


[ 83.558497] ------------[ cut here ]------------
[ 83.558503] WARNING: CPU: 1 PID: 0 at
kernel/time/tick-broadcast.c:668
tick_broadcast_oneshot_control+0x17d/0x190()
[ 83.558527] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
xt_CHECKSUM iptable_mangle fuse ip6table_filter ip6_tables ebtable_nat
ebtables kvm_amd kvm ppdev microcode serio_raw nouveau mxm_wmi video
i2c_algo_bit ttm edac_core edac_mce_amd drm_kms_helper k10temp drm
parport_serial parport_pc parport wmi shpchp i2c_nforce2 i2c_core
acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd sunrpc binfmt_misc
ata_generic pata_acpi pata_amd 8021q garp mrp tun bridge stp llc bonding
[ 83.558530] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
3.14.0-0.rc1.git5.1.fc21.1.x86_64 #1
[ 83.558530] Hardware name: Gigabyte Technology Co., Ltd.
M720-US3/M720-US3, BIOS F7n 09/07/2010
[ 83.558533] 0000000000000000 184f2fadf588e594 ffff880128bb9dd0
ffffffff817cf05d
[ 83.558535] 0000000000000000 ffff880128bb9e08 ffffffff810966cd
0000000000000001
[ 83.558536] 0000000000000001 ffff88012a80df00 0000000000000092
ffffffff81c28940
[ 83.558537] Call Trace:
[ 83.558541] [<ffffffff817cf05d>] dump_stack+0x4d/0x66
[ 83.558544] [<ffffffff810966cd>] warn_slowpath_common+0x7d/0xa0
[ 83.558545] [<ffffffff810967fa>] warn_slowpath_null+0x1a/0x20
[ 83.558547] [<ffffffff811243dd>]
tick_broadcast_oneshot_control+0x17d/0x190
[ 83.558549] [<ffffffff81122dd8>] clockevents_notify+0x178/0x1a0
[ 83.558551] [<ffffffff81025b17>] amd_e400_idle+0x87/0x130
[ 83.558553] [<ffffffff8102639e>] arch_cpu_idle+0x2e/0x40
[ 83.558555] [<ffffffff8110d375>] cpu_startup_entry+0xf5/0x420
[ 83.558558] [<ffffffff8104dc90>] start_secondary+0x240/0x300
[ 83.558559] ---[ end trace d6d150ba17d5207c ]---


https://bugzilla.redhat.com/attachment.cgi?id=861203
https://bugzilla.redhat.com/show_bug.cgi?id=1063108
Also,
https://bugzilla.redhat.com/show_bug.cgi?id=1033310
https://bugzilla.redhat.com/show_bug.cgi?id=1031296
https://bugzilla.redhat.com/show_bug.cgi?id=996973
https://bugzilla.redhat.com/show_bug.cgi?id=989718

>From kernel-3.10 to 3.14,
http://thread.gmane.org/gmane.linux.kernel/1529924
http://thread.gmane.org/gmane.linux.power-management.general/37437


poma


At 600km above planet Earth the temperature
fluctuates between +125 and -100 degrees Celsius

There is nothing to carry sound
No air pressure
No oxygen

Life in space is impossible

LINUX KERNEL LIST


2014-02-10 10:06:54

by Thomas Gleixner

[permalink] [raw]
Subject: Re: WARNING: CPU: 1 PID: 0 at kernel/time/tick-broadcast.c:668 tick_broadcast_oneshot_control+0x17d/0x190()

On Mon, 10 Feb 2014, poma wrote:

> [ 83.558551] [<ffffffff81025b17>] amd_e400_idle+0x87/0x130

So this seems to happen only on AMD machines which use that e400 idle
mode. I have no idea at the moment whats wrong there. I'll find one of
those machines and try to reproduce.

Thanks,

tglx



2014-02-10 18:59:50

by poma

[permalink] [raw]
Subject: Re: WARNING: CPU: 1 PID: 0 at kernel/time/tick-broadcast.c:668 tick_broadcast_oneshot_control+0x17d/0x190()

On 10.02.2014 11:06, Thomas Gleixner wrote:
> On Mon, 10 Feb 2014, poma wrote:
>
>> [ 83.558551] [<ffffffff81025b17>] amd_e400_idle+0x87/0x130
>
> So this seems to happen only on AMD machines which use that e400 idle
> mode. I have no idea at the moment whats wrong there. I'll find one of
> those machines and try to reproduce.
>
> Thanks,
>
> tglx

Thanks for your response! :)
https://bugzilla.redhat.com/show_bug.cgi?id=1031296#c24


poma

2014-02-11 08:24:12

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: WARNING: CPU: 1 PID: 0 at kernel/time/tick-broadcast.c:668 tick_broadcast_oneshot_control+0x17d/0x190()

On Mon, Feb 10, 2014 at 07:59:39PM +0100, poma wrote:
> On 10.02.2014 11:06, Thomas Gleixner wrote:
> > On Mon, 10 Feb 2014, poma wrote:
> >
> >> [ 83.558551] [<ffffffff81025b17>] amd_e400_idle+0x87/0x130
> >
> > So this seems to happen only on AMD machines which use that e400 idle
> > mode. I have no idea at the moment whats wrong there. I'll find one of
> > those machines and try to reproduce.

I tried to debug that warn as well. Even if I found machine with proper
family and model number, HW C1E bug do not happen there, hence I just
hack kernel to always use amd_e400_idle (and remove AMD rdmsr specific
instructions to do not crash). That make issue 100% reproducible when
suspend/resume.

It happens when cpu become idle, call CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
but before CLOCK_EVT_NOTIFY_BROADCAST_EXIT, interrupt trigger on that
cpu. IRQ is handled by hrtimer code, which want to switch to hres and
call:

tick_switch_to_oneshot() -> ... -> tick_broadcast_setup_oneshot()

Since we have already proper handler there, last procedure clear
tick_broadcast_oneshot_mask, but tick_broadcast_pending_mask stay
set. When amd_e400_idle next time call CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
the warning will happen.

I came with a below patch, which also clear pending mask, but perhaps
oneshot_mask should not be cleared on tick_broadcast_setup_oneshot(),
or should be cleared only conditionally, or some other solution is
needed. Anyway, patch make the warning gone on my hacked setup, I was
waiting for testing results on real C1E hardware.

Thanks
Stanislaw

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 43780ab..98977a5 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -756,6 +756,7 @@ out:
static void tick_broadcast_clear_oneshot(int cpu)
{
cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
+ cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
}

static void tick_broadcast_init_next_event(struct cpumask *mask,

2014-02-11 14:25:00

by Thomas Gleixner

[permalink] [raw]
Subject: Re: WARNING: CPU: 1 PID: 0 at kernel/time/tick-broadcast.c:668 tick_broadcast_oneshot_control+0x17d/0x190()

On Mon, 10 Feb 2014, Thomas Gleixner wrote:
> On Mon, 10 Feb 2014, poma wrote:
>
> > [ 83.558551] [<ffffffff81025b17>] amd_e400_idle+0x87/0x130
>
> So this seems to happen only on AMD machines which use that e400 idle
> mode. I have no idea at the moment whats wrong there. I'll find one of
> those machines and try to reproduce.

Found it. Patch below.

Thanks,

tglx
----
Subject: tick: Clear broadcast pending bit when switching to oneshot
From: Thomas Gleixner <[email protected]>
Date: Tue, 11 Feb 2014 14:35:40 +0100

AMD systems which use the C1E workaround in the amd_e400_idle routine
trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU.

The reason is that the idle routine of those AMD systems switches the
cpu into forced broadcast mode early on before the newly brought up
CPU can switch over to high resolution / NOHZ mode. The timer related
CPU1 bringup looks like this:

clockevent_register_device(local_apic);
tick_setup(local_apic);
...
idle()
tick_broadcast_on_off(FORCE);
tick_broadcast_oneshot_control(ENTER)
cpumask_set(cpu, broadcast_oneshot_mask);
halt();

Now the broadcast interrupt on CPU0 sets CPU1 in the
broadcast_pending_mask and wakes CPU1. So CPU1 continues:

local_apic_timer_interrupt()
tick_handle_periodic();
softirq()
tick_init_highres();
cpumask_clr(cpu, broadcast_oneshot_mask);

tick_broadcast_oneshot_control(ENTER)
WARN_ON(cpumask_test(cpu, broadcast_pending_mask);

So while we remove CPU1 from the broadcast_oneshot_mask when we switch
over to highres mode, we do not clear the pending bit, which then
triggers the warning when we go back to idle.

The reason why this is only visible on C1E affected AMD systems is
that the other machines enter the deep sleep states via
acpi_idle/intel_idle and exit the broadcast mode before executing the
remote triggered local_apic_timer_interrupt. So the pending bit is
already cleared when the switch over to highres mode is clearing the
oneshot mask.

The solution is simple: Clear the pending bit together with the mask
bit when we switch over to highres mode.

Reported-by: poma <[email protected]>
Cc: [email protected] # 3.10+
Signed-off-by: Thomas Gleixner <[email protected]>
---
kernel/time/tick-broadcast.c | 1 +
1 file changed, 1 insertion(+)

Index: linux-2.6/kernel/time/tick-broadcast.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-broadcast.c
+++ linux-2.6/kernel/time/tick-broadcast.c
@@ -756,6 +756,7 @@ out:
static void tick_broadcast_clear_oneshot(int cpu)
{
cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
+ cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
}

static void tick_broadcast_init_next_event(struct cpumask *mask,

2014-02-11 16:07:50

by Thomas Gleixner

[permalink] [raw]
Subject: Re: WARNING: CPU: 1 PID: 0 at kernel/time/tick-broadcast.c:668 tick_broadcast_oneshot_control+0x17d/0x190()

On Tue, 11 Feb 2014, Stanislaw Gruszka wrote:

> On Mon, Feb 10, 2014 at 07:59:39PM +0100, poma wrote:
> > On 10.02.2014 11:06, Thomas Gleixner wrote:
> > > On Mon, 10 Feb 2014, poma wrote:
> > >
> > >> [ 83.558551] [<ffffffff81025b17>] amd_e400_idle+0x87/0x130
> > >
> > > So this seems to happen only on AMD machines which use that e400 idle
> > > mode. I have no idea at the moment whats wrong there. I'll find one of
> > > those machines and try to reproduce.
>
> I tried to debug that warn as well. Even if I found machine with proper
> family and model number, HW C1E bug do not happen there, hence I just
> hack kernel to always use amd_e400_idle (and remove AMD rdmsr specific
> instructions to do not crash). That make issue 100% reproducible when
> suspend/resume.

It's also reproducible on cpu online/offline.

> It happens when cpu become idle, call CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
> but before CLOCK_EVT_NOTIFY_BROADCAST_EXIT, interrupt trigger on that
> cpu. IRQ is handled by hrtimer code, which want to switch to hres and
> call:
>
> tick_switch_to_oneshot() -> ... -> tick_broadcast_setup_oneshot()
>
> Since we have already proper handler there, last procedure clear
> tick_broadcast_oneshot_mask, but tick_broadcast_pending_mask stay
> set. When amd_e400_idle next time call CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
> the warning will happen.
>
> I came with a below patch, which also clear pending mask, but perhaps

Fun. I came up with the exact same solution independent of you and I
tested it on real C1E contaminated hardware.

> oneshot_mask should not be cleared on tick_broadcast_setup_oneshot(),
> or should be cleared only conditionally, or some other solution is

We can do it unconditionally. It creates consistent state in all
corner cases.

There are other solutions to the problem, but that needs a major
rework of the broadcast code. I so wish that this mess would have
never been necessary at all ...

Thanks,

tglx

2014-02-11 16:28:25

by Stanislaw Gruszka

[permalink] [raw]
Subject: Re: WARNING: CPU: 1 PID: 0 at kernel/time/tick-broadcast.c:668 tick_broadcast_oneshot_control+0x17d/0x190()

> > I came with a below patch, which also clear pending mask, but perhaps
>
> Fun. I came up with the exact same solution independent of you and I
> tested it on real C1E contaminated hardware.
>
> > oneshot_mask should not be cleared on tick_broadcast_setup_oneshot(),
> > or should be cleared only conditionally, or some other solution is
>
> We can do it unconditionally. It creates consistent state in all
> corner cases.
>
> There are other solutions to the problem, but that needs a major
> rework of the broadcast code. I so wish that this mess would have
> never been necessary at all ...

Thomas, please post/apply patch, which you think is the most
appropriate.

Thanks
Stanislaw

2014-02-11 20:45:10

by poma

[permalink] [raw]
Subject: Re: WARNING: CPU: 1 PID: 0 at kernel/time/tick-broadcast.c:668 tick_broadcast_oneshot_control+0x17d/0x190()

On 11.02.2014 15:25, Thomas Gleixner wrote:
> On Mon, 10 Feb 2014, Thomas Gleixner wrote:
>> On Mon, 10 Feb 2014, poma wrote:
>>
>>> [ 83.558551] [<ffffffff81025b17>] amd_e400_idle+0x87/0x130
>>
>> So this seems to happen only on AMD machines which use that e400 idle
>> mode. I have no idea at the moment whats wrong there. I'll find one of
>> those machines and try to reproduce.
>
> Found it. Patch below.
>
> Thanks,
>
> tglx
> ----
> Subject: tick: Clear broadcast pending bit when switching to oneshot
> From: Thomas Gleixner <[email protected]>
> Date: Tue, 11 Feb 2014 14:35:40 +0100
>
> AMD systems which use the C1E workaround in the amd_e400_idle routine
> trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU.
>
> The reason is that the idle routine of those AMD systems switches the
> cpu into forced broadcast mode early on before the newly brought up
> CPU can switch over to high resolution / NOHZ mode. The timer related
> CPU1 bringup looks like this:
>
> clockevent_register_device(local_apic);
> tick_setup(local_apic);
> ...
> idle()
> tick_broadcast_on_off(FORCE);
> tick_broadcast_oneshot_control(ENTER)
> cpumask_set(cpu, broadcast_oneshot_mask);
> halt();
>
> Now the broadcast interrupt on CPU0 sets CPU1 in the
> broadcast_pending_mask and wakes CPU1. So CPU1 continues:
>
> local_apic_timer_interrupt()
> tick_handle_periodic();
> softirq()
> tick_init_highres();
> cpumask_clr(cpu, broadcast_oneshot_mask);
>
> tick_broadcast_oneshot_control(ENTER)
> WARN_ON(cpumask_test(cpu, broadcast_pending_mask);
>
> So while we remove CPU1 from the broadcast_oneshot_mask when we switch
> over to highres mode, we do not clear the pending bit, which then
> triggers the warning when we go back to idle.
>
> The reason why this is only visible on C1E affected AMD systems is
> that the other machines enter the deep sleep states via
> acpi_idle/intel_idle and exit the broadcast mode before executing the
> remote triggered local_apic_timer_interrupt. So the pending bit is
> already cleared when the switch over to highres mode is clearing the
> oneshot mask.
>
> The solution is simple: Clear the pending bit together with the mask
> bit when we switch over to highres mode.
>
> Reported-by: poma <[email protected]>
> Cc: [email protected] # 3.10+
> Signed-off-by: Thomas Gleixner <[email protected]>
> ---
> kernel/time/tick-broadcast.c | 1 +
> 1 file changed, 1 insertion(+)
>
> Index: linux-2.6/kernel/time/tick-broadcast.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/tick-broadcast.c
> +++ linux-2.6/kernel/time/tick-broadcast.c
> @@ -756,6 +756,7 @@ out:
> static void tick_broadcast_clear_oneshot(int cpu)
> {
> cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
> + cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
> }
>
> static void tick_broadcast_init_next_event(struct cpumask *mask,
>
>

Thanks!


poma

Subject: [tip:timers/urgent] tick: Clear broadcast pending bit when switching to oneshot

Commit-ID: dd5fd9b91a77b4c9c28b7ef9c181b1a875820d0a
Gitweb: http://git.kernel.org/tip/dd5fd9b91a77b4c9c28b7ef9c181b1a875820d0a
Author: Thomas Gleixner <[email protected]>
AuthorDate: Tue, 11 Feb 2014 14:35:40 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Thu, 13 Feb 2014 21:55:54 +0100

tick: Clear broadcast pending bit when switching to oneshot

AMD systems which use the C1E workaround in the amd_e400_idle routine
trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU.

The reason is that the idle routine of those AMD systems switches the
cpu into forced broadcast mode early on before the newly brought up
CPU can switch over to high resolution / NOHZ mode. The timer related
CPU1 bringup looks like this:

clockevent_register_device(local_apic);
tick_setup(local_apic);
...
idle()
tick_broadcast_on_off(FORCE);
tick_broadcast_oneshot_control(ENTER)
cpumask_set(cpu, broadcast_oneshot_mask);
halt();

Now the broadcast interrupt on CPU0 sets CPU1 in the
broadcast_pending_mask and wakes CPU1. So CPU1 continues:

local_apic_timer_interrupt()
tick_handle_periodic();
softirq()
tick_init_highres();
cpumask_clr(cpu, broadcast_oneshot_mask);

tick_broadcast_oneshot_control(ENTER)
WARN_ON(cpumask_test(cpu, broadcast_pending_mask);

So while we remove CPU1 from the broadcast_oneshot_mask when we switch
over to highres mode, we do not clear the pending bit, which then
triggers the warning when we go back to idle.

The reason why this is only visible on C1E affected AMD systems is
that the other machines enter the deep sleep states via
acpi_idle/intel_idle and exit the broadcast mode before executing the
remote triggered local_apic_timer_interrupt. So the pending bit is
already cleared when the switch over to highres mode is clearing the
oneshot mask.

The solution is simple: Clear the pending bit together with the mask
bit when we switch over to highres mode.

Stanislaw came up independently with the same patch by enforcing the
C1E workaround and debugging the fallout. I picked mine, because mine
has a changelog :)

Reported-by: poma <[email protected]>
Debugged-by: Stanislaw Gruszka <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Olaf Hering <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: Justin M. Forbes <[email protected]>
Cc: Josh Boyer <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected] # 3.10+
Signed-off-by: Thomas Gleixner <[email protected]>
---
kernel/time/tick-broadcast.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 43780ab..98977a5 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -756,6 +756,7 @@ out:
static void tick_broadcast_clear_oneshot(int cpu)
{
cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
+ cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
}

static void tick_broadcast_init_next_event(struct cpumask *mask,