2020-01-06 06:47:27

by Liu, Chuansheng

[permalink] [raw]
Subject: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work

In ICL platform, it is easy to hit bootup failure with panic
in thermal interrupt handler during early bootup stage.

Such issue makes my platform almost can not boot up with
latest kernel code.

The call stack is like:
kernel BUG at kernel/timer/timer.c:1152!

Call Trace:
__queue_delayed_work
queue_delayed_work_on
therm_throt_process
intel_thermal_interrupt
...

When one CPU is up, the irq is enabled prior to CPU UP
notification which will then initialize therm_worker.
Such race will cause the posssibility that interrupt
handler therm_throt_process() accesses uninitialized
therm_work, then system hit panic at very early bootup
stage.

In my ICL platform, it can be reproduced in several times
of reboot stress. With this fix, the system keeps alive
for more than 200 times of reboot stress.

Signed-off-by: Chuansheng Liu <[email protected]>
---
arch/x86/kernel/cpu/mce/therm_throt.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/therm_throt.c b/arch/x86/kernel/cpu/mce/therm_throt.c
index b38010b541d6..7320eb3ac029 100644
--- a/arch/x86/kernel/cpu/mce/therm_throt.c
+++ b/arch/x86/kernel/cpu/mce/therm_throt.c
@@ -86,6 +86,7 @@ struct _thermal_state {
unsigned long total_time_ms;
bool rate_control_active;
bool new_event;
+ bool therm_work_active;
u8 level;
u8 sample_index;
u8 sample_count;
@@ -359,7 +360,9 @@ static void therm_throt_process(bool new_event, int event, int level)

state->baseline_temp = temp;
state->last_interrupt_time = now;
- schedule_delayed_work_on(this_cpu, &state->therm_work, THERM_THROT_POLL_INTERVAL);
+ if (state->therm_work_active)
+ schedule_delayed_work_on(this_cpu, &state->therm_work,
+ THERM_THROT_POLL_INTERVAL);
} else if (old_event && state->last_interrupt_time) {
unsigned long throttle_time;

@@ -473,7 +476,8 @@ static int thermal_throttle_online(unsigned int cpu)

INIT_DELAYED_WORK(&state->package_throttle.therm_work, throttle_active_work);
INIT_DELAYED_WORK(&state->core_throttle.therm_work, throttle_active_work);
-
+ state->package_throttle.therm_work_active = true;
+ state->core_throttle.therm_work_active = true;
return thermal_throttle_add_dev(dev, cpu);
}

@@ -482,6 +486,8 @@ static int thermal_throttle_offline(unsigned int cpu)
struct thermal_state *state = &per_cpu(thermal_state, cpu);
struct device *dev = get_cpu_device(cpu);

+ state->package_throttle.therm_work_active = false;
+ state->core_throttle.therm_work_active = false;
cancel_delayed_work(&state->package_throttle.therm_work);
cancel_delayed_work(&state->core_throttle.therm_work);

--
2.17.1


2020-01-06 07:09:43

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work

On Mon, Jan 06, 2020 at 06:41:55AM +0000, Chuansheng Liu wrote:
> In ICL platform, it is easy to hit bootup failure with panic
> in thermal interrupt handler during early bootup stage.
>
> Such issue makes my platform almost can not boot up with
> latest kernel code.
>
> The call stack is like:
> kernel BUG at kernel/timer/timer.c:1152!
>
> Call Trace:
> __queue_delayed_work
> queue_delayed_work_on
> therm_throt_process
> intel_thermal_interrupt
> ...
>
> When one CPU is up, the irq is enabled prior to CPU UP
> notification which will then initialize therm_worker.

You mean the unmasking of the thermal vector at the end of
intel_init_thermal()?

If so, why don't you move that to the end of the notifier and unmask it
only after all the necessary work like setting up the workqueues etc, is
done, and save yourself adding yet another silly bool?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-01-06 07:13:00

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work

* Borislav Petkov <[email protected]> wrote:

> On Mon, Jan 06, 2020 at 06:41:55AM +0000, Chuansheng Liu wrote:
> > In ICL platform, it is easy to hit bootup failure with panic
> > in thermal interrupt handler during early bootup stage.
> >
> > Such issue makes my platform almost can not boot up with
> > latest kernel code.
> >
> > The call stack is like:
> > kernel BUG at kernel/timer/timer.c:1152!
> >
> > Call Trace:
> > __queue_delayed_work
> > queue_delayed_work_on
> > therm_throt_process
> > intel_thermal_interrupt
> > ...
> >
> > When one CPU is up, the irq is enabled prior to CPU UP
> > notification which will then initialize therm_worker.
>
> You mean the unmasking of the thermal vector at the end of
> intel_init_thermal()?
>
> If so, why don't you move that to the end of the notifier and unmask it
> only after all the necessary work like setting up the workqueues etc, is
> done, and save yourself adding yet another silly bool?

A debugging WARN_ON_ONCE() when the workqueue is not initialized yet
would also be useful I suspect. This would turn any remaining race-crash
boot failure in this area into a warning.

Thanks,

Ingo

2020-01-06 09:23:07

by Liu, Chuansheng

[permalink] [raw]
Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work



> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Monday, January 6, 2020 3:08 PM
> To: Liu, Chuansheng <[email protected]>
> Cc: [email protected]; Luck, Tony <[email protected]>;
> [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized
> therm_work
>
> On Mon, Jan 06, 2020 at 06:41:55AM +0000, Chuansheng Liu wrote:
> > In ICL platform, it is easy to hit bootup failure with panic
> > in thermal interrupt handler during early bootup stage.
> >
> > Such issue makes my platform almost can not boot up with
> > latest kernel code.
> >
> > The call stack is like:
> > kernel BUG at kernel/timer/timer.c:1152!
> >
> > Call Trace:
> > __queue_delayed_work
> > queue_delayed_work_on
> > therm_throt_process
> > intel_thermal_interrupt
> > ...
> >
> > When one CPU is up, the irq is enabled prior to CPU UP
> > notification which will then initialize therm_worker.
>
> You mean the unmasking of the thermal vector at the end of
> intel_init_thermal()?
Exactly, and there is one local CPU irq enable later too.

>
> If so, why don't you move that to the end of the notifier and unmask it
> only after all the necessary work like setting up the workqueues etc, is
> done, and save yourself adding yet another silly bool?
>
Thanks for your suggestion, I am just worried about the interrupt delay.
I traced there is about 2s gap between unmask interrupt and workqueue
Initialization. If you think it is OK to ignore this delay, I will make another
simple patch as you suggested????

Best Regards
Chuansheng


2020-01-06 09:25:25

by Liu, Chuansheng

[permalink] [raw]
Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work



> -----Original Message-----
> From: Ingo Molnar <[email protected]> On Behalf Of Ingo Molnar
> Sent: Monday, January 6, 2020 3:11 PM
> To: Borislav Petkov <[email protected]>
> Cc: Liu, Chuansheng <[email protected]>; [email protected];
> Luck, Tony <[email protected]>; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized
> therm_work
>
> * Borislav Petkov <[email protected]> wrote:
>
> > On Mon, Jan 06, 2020 at 06:41:55AM +0000, Chuansheng Liu wrote:
> > > In ICL platform, it is easy to hit bootup failure with panic
> > > in thermal interrupt handler during early bootup stage.
> > >
> > > Such issue makes my platform almost can not boot up with
> > > latest kernel code.
> > >
> > > The call stack is like:
> > > kernel BUG at kernel/timer/timer.c:1152!
> > >
> > > Call Trace:
> > > __queue_delayed_work
> > > queue_delayed_work_on
> > > therm_throt_process
> > > intel_thermal_interrupt
> > > ...
> > >
> > > When one CPU is up, the irq is enabled prior to CPU UP
> > > notification which will then initialize therm_worker.
> >
> > You mean the unmasking of the thermal vector at the end of
> > intel_init_thermal()?
> >
> > If so, why don't you move that to the end of the notifier and unmask it
> > only after all the necessary work like setting up the workqueues etc, is
> > done, and save yourself adding yet another silly bool?
>
> A debugging WARN_ON_ONCE() when the workqueue is not initialized yet
> would also be useful I suspect. This would turn any remaining race-crash
> boot failure in this area into a warning.
>
Thanks Ingo.
That's a good suggestion, I can try to make another patch about it later.

Best Regards
Chuansheng

2020-01-06 10:04:15

by Borislav Petkov

[permalink] [raw]
Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work

On January 6, 2020 10:22:06 AM GMT+01:00, "Liu, Chuansheng" <[email protected]> wrote:
>I traced there is about 2s gap between unmask interrupt and workqueue
>Initialization.

And that is a problem because?

You setup workqueue etc and *then* unmask the irq.

--
Sent from a small device: formatting sux and brevity is inevitable.

2020-01-06 10:12:10

by Liu, Chuansheng

[permalink] [raw]
Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work



> -----Original Message-----
> From: Boris Petkov <[email protected]>
> Sent: Monday, January 6, 2020 6:01 PM
> To: Liu, Chuansheng <[email protected]>
> Cc: [email protected]; Luck, Tony <[email protected]>;
> [email protected]; [email protected]; [email protected]
> Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized
> therm_work
>
> On January 6, 2020 10:22:06 AM GMT+01:00, "Liu, Chuansheng"
> <[email protected]> wrote:
> >I traced there is about 2s gap between unmask interrupt and workqueue
> >Initialization.
>
> And that is a problem because?
>
> You setup workqueue etc and *then* unmask the irq.
>
Some previous experience shows:
If there is critical thermal alert, we still can take action in kernel side in this
2s, even though the workqueue is not ready, but interrupt handler can work
well.


2020-01-06 10:38:34

by Borislav Petkov

[permalink] [raw]
Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work

On January 6, 2020 11:10:11 AM GMT+01:00, "Liu, Chuansheng" <[email protected]> wrote:
>If there is critical thermal alert, we still can take action in kernel
>side in this
>2s

What critical thermal alert during boot and what action can you take that early that can't wait 2 seconds?

--
Sent from a small device: formatting sux and brevity is inevitable.

2020-01-11 04:23:26

by Liu, Chuansheng

[permalink] [raw]
Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized therm_work

Hi Ingo,

> -----Original Message-----
> From: Ingo Molnar <[email protected]> On Behalf Of Ingo Molnar
> Sent: Monday, January 6, 2020 3:11 PM
> To: Borislav Petkov <[email protected]>
> Cc: Liu, Chuansheng <[email protected]>; [email protected];
> Luck, Tony <[email protected]>; [email protected]; [email protected];
> [email protected]
> Subject: Re: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized
> therm_work
>
> * Borislav Petkov <[email protected]> wrote:
>
> > On Mon, Jan 06, 2020 at 06:41:55AM +0000, Chuansheng Liu wrote:
> > > In ICL platform, it is easy to hit bootup failure with panic
> > > in thermal interrupt handler during early bootup stage.
> > >
> > > Such issue makes my platform almost can not boot up with
> > > latest kernel code.
> > >
> > > The call stack is like:
> > > kernel BUG at kernel/timer/timer.c:1152!
> > >
> > > Call Trace:
> > > __queue_delayed_work
> > > queue_delayed_work_on
> > > therm_throt_process
> > > intel_thermal_interrupt
> > > ...
> > >
> > > When one CPU is up, the irq is enabled prior to CPU UP
> > > notification which will then initialize therm_worker.
> >
> > You mean the unmasking of the thermal vector at the end of
> > intel_init_thermal()?
> >
> > If so, why don't you move that to the end of the notifier and unmask it
> > only after all the necessary work like setting up the workqueues etc, is
> > done, and save yourself adding yet another silly bool?
>
> A debugging WARN_ON_ONCE() when the workqueue is not initialized yet
> would also be useful I suspect. This would turn any remaining race-crash
> boot failure in this area into a warning.
>
Just checked the code, the WARN_ON_ONCE() is already there:
1622 WARN_ON_ONCE(timer->function != delayed_work_timer_fn);

With reproducing it, the corresponding log also shows before panic:
WARNING: CPU: 0 .... at kernel/workqueue.c:1622 __queue_delayed_work+0x73/0x90

Thanks for your reminder.