2016-03-02 10:42:31

by Hidehiro Kawai

[permalink] [raw]
Subject: [v2 PATCH 0/3] Use nmi_panic() in panic on NMI case

commit 1717f2096b54 ("panic, x86: Fix re-entrance problem due to
panic on NMI") and commit 58c5661f2144 ("panic, x86: Allow CPUs to
save registers even if looping in NMI context") introduced nmi_panic()
which prevents concurrent/recursive execution of panic(). It also
saves registers for the crash dump on x86.

However, there are some cases where NMI handlers still use panic().
This patch set partially replaces them with nmi_panic() in those
cases.

Changes since v1: https://lkml.org/lkml/2016/2/29/858
- Replace nmi_panic() macro with a function version instead of
exporting symbols referred by the macro (PATCH 1/3)
- Improve the patch descriptions (PATCH 2/3 and 3/3)
- Do small cleanups (PATCH 3/3)

---
Even if applying this patch set, some NMI or similar handlers (e.g.
MCE handler) remains to use panic(). This is because I can't test
them well and actual problems won't happen. For example, the
possibility that normal panic and panic on MCE happen simultaneously
is very low.

Hidehiro Kawai (3):
panic: Change nmi_panic from macro to function
ipmi/watchdog: Use nmi_panic() when kernel panics in NMI handler
hpwdt: Use nmi_panic() when kernel panics in NMI handler


drivers/char/ipmi/ipmi_watchdog.c | 2 +-
drivers/watchdog/hpwdt.c | 11 +++++------
include/linux/kernel.h | 22 ++--------------------
kernel/panic.c | 26 ++++++++++++++++++++++++++
4 files changed, 34 insertions(+), 27 deletions(-)


--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



2016-03-02 10:42:32

by Hidehiro Kawai

[permalink] [raw]
Subject: [v2 PATCH 2/3] ipmi/watchdog: Use nmi_panic() when kernel panics in NMI handler

commit 1717f2096b54 ("panic, x86: Fix re-entrance problem due to
panic on NMI") introduced nmi_panic() which prevents concurrent and
recursive execution of panic(). It also saves registers for the
crash dump on x86 by later commit 58c5661f2144 ("panic, x86: Allow
CPUs to save registers even if looping in NMI context").

ipmi_watchdog driver can call panic() from NMI handler, so replace
it with nmi_panic().

Signed-off-by: Hidehiro Kawai <[email protected]>
Acked-by: Corey Minyard <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Cc: [email protected]
---
drivers/char/ipmi/ipmi_watchdog.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c
index 096f0ce..4facc75 100644
--- a/drivers/char/ipmi/ipmi_watchdog.c
+++ b/drivers/char/ipmi/ipmi_watchdog.c
@@ -1140,7 +1140,7 @@ ipmi_nmi(unsigned int val, struct pt_regs *regs)
the timer. So do so. */
pretimeout_since_last_heartbeat = 1;
if (atomic_inc_and_test(&preop_panic_excl))
- panic(PFX "pre-timeout");
+ nmi_panic(regs, PFX "pre-timeout");
}

return NMI_HANDLED;


2016-03-02 10:42:34

by Hidehiro Kawai

[permalink] [raw]
Subject: [v2 PATCH 1/3] panic: Change nmi_panic from macro to function

Change nmi_panic() macro to a normal function for the portability.
Also, export it for modules.

Signed-off-by: Hidehiro Kawai <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Michal Nazarewicz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Nicolas Iooss <[email protected]>
Cc: Javi Merino <[email protected]>
Cc: Gobinda Charan Maji <[email protected]>
Cc: "Steven Rostedt (Red Hat)" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: HATAYAMA Daisuke <[email protected]>
Cc: Tejun Heo <[email protected]>
---
include/linux/kernel.h | 22 ++--------------------
kernel/panic.c | 26 ++++++++++++++++++++++++++
2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index f31638c..daf233f 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -255,7 +255,8 @@ extern long (*panic_blink)(int state);
__printf(1, 2)
void panic(const char *fmt, ...)
__noreturn __cold;
-void nmi_panic_self_stop(struct pt_regs *);
+__printf(2, 3)
+void nmi_panic(struct pt_regs *regs, const char *fmt, ...);
extern void oops_enter(void);
extern void oops_exit(void);
void print_oops_end_marker(void);
@@ -455,25 +456,6 @@ extern atomic_t panic_cpu;
#define PANIC_CPU_INVALID -1

/*
- * A variant of panic() called from NMI context. We return if we've already
- * panicked on this CPU. If another CPU already panicked, loop in
- * nmi_panic_self_stop() which can provide architecture dependent code such
- * as saving register state for crash dump.
- */
-#define nmi_panic(regs, fmt, ...) \
-do { \
- int old_cpu, cpu; \
- \
- cpu = raw_smp_processor_id(); \
- old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, cpu); \
- \
- if (old_cpu == PANIC_CPU_INVALID) \
- panic(fmt, ##__VA_ARGS__); \
- else if (old_cpu != cpu) \
- nmi_panic_self_stop(regs); \
-} while (0)
-
-/*
* Only to be used by arch init code. If the user over-wrote the default
* CONFIG_PANIC_TIMEOUT, honor it.
*/
diff --git a/kernel/panic.c b/kernel/panic.c
index d96469d..fb61e54 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -72,6 +72,32 @@ void __weak nmi_panic_self_stop(struct pt_regs *regs)

atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);

+/*
+ * A variant of panic() called from NMI context. We return if we've already
+ * panicked on this CPU. If another CPU already panicked, loop in
+ * nmi_panic_self_stop() which can provide architecture dependent code such
+ * as saving register state for crash dump.
+ */
+void nmi_panic(struct pt_regs *regs, const char *fmt, ...)
+{
+ static char buf[1024]; /* protected by panic_cpu */
+ va_list args;
+ int old_cpu, cpu;
+
+ cpu = raw_smp_processor_id();
+ old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, cpu);
+
+ if (old_cpu == PANIC_CPU_INVALID) {
+ va_start(args, fmt);
+ vsnprintf(buf, sizeof(buf), fmt, args);
+ va_end(args);
+
+ panic("%s", buf);
+ } else if (old_cpu != cpu)
+ nmi_panic_self_stop(regs);
+}
+EXPORT_SYMBOL(nmi_panic);
+
/**
* panic - halt the system
* @fmt: The text string to print


2016-03-02 10:42:29

by Hidehiro Kawai

[permalink] [raw]
Subject: [v2 PATCH 3/3] hpwdt: Use nmi_panic() when kernel panics in NMI handler

commit 1717f2096b54 ("panic, x86: Fix re-entrance problem due to
panic on NMI") introduced nmi_panic() which prevents concurrent and
recursive execution of panic(). It also saves registers for the
crash dump on x86 by later commit 58c5661f2144 ("panic, x86: Allow
CPUs to save registers even if looping in NMI context").

hpwdt driver can call panic() from NMI handler, so replace it with
nmi_panic(). Also, do some cleanups.

Changes since V1:
- Use direct return instead of goto
- Combine the panic message string into a single line

Signed-off-by: Hidehiro Kawai <[email protected]>
Cc: Thomas Mingarelli <[email protected]>
Cc: Wim Van Sebroeck <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: [email protected]
---
drivers/watchdog/hpwdt.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index 92443c3..a6c8797 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -483,7 +483,7 @@ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs)
static int die_nmi_called;

if (!hpwdt_nmi_decoding)
- goto out;
+ return NMI_DONE;

spin_lock_irqsave(&rom_lock, rom_pl);
if (!die_nmi_called && !is_icru && !is_uefi)
@@ -496,11 +496,11 @@ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs)

if (!is_icru && !is_uefi) {
if (cmn_regs.u1.ral == 0) {
- panic("An NMI occurred, "
- "but unable to determine source.\n");
+ nmi_panic(regs, "An NMI occurred, but unable to determine source.\n");
+ return NMI_HANDLED;
}
}
- panic("An NMI occurred. Depending on your system the reason "
+ nmi_panic(regs, "An NMI occurred. Depending on your system the reason "
"for the NMI is logged in any one of the following "
"resources:\n"
"1. Integrated Management Log (IML)\n"
@@ -508,8 +508,7 @@ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs)
"3. OA Forward Progress Log\n"
"4. iLO Event Log");

-out:
- return NMI_DONE;
+ return NMI_HANDLED;
}
#endif /* CONFIG_HPWDT_NMI_DECODING */



2016-03-02 13:18:29

by Michal Hocko

[permalink] [raw]
Subject: Re: [v2 PATCH 1/3] panic: Change nmi_panic from macro to function

On Wed 02-03-16 19:36:26, Hidehiro Kawai wrote:
[...]
> +void nmi_panic(struct pt_regs *regs, const char *fmt, ...)

Do we really need vargs? All the current users seem to be OK with a
simple string. This makes the code slightly more complicated without any
apparent reason.

> +{
> + static char buf[1024]; /* protected by panic_cpu */
> + va_list args;
> + int old_cpu, cpu;
> +
> + cpu = raw_smp_processor_id();
> + old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, cpu);
> +
> + if (old_cpu == PANIC_CPU_INVALID) {
> + va_start(args, fmt);
> + vsnprintf(buf, sizeof(buf), fmt, args);
> + va_end(args);
> +
> + panic("%s", buf);
> + } else if (old_cpu != cpu)
> + nmi_panic_self_stop(regs);
> +}
> +EXPORT_SYMBOL(nmi_panic);
> +
> /**
> * panic - halt the system
> * @fmt: The text string to print
>
>

--
Michal Hocko
SUSE Labs

2016-03-02 13:23:00

by Borislav Petkov

[permalink] [raw]
Subject: Re: [v2 PATCH 1/3] panic: Change nmi_panic from macro to function

On Wed, Mar 02, 2016 at 02:18:24PM +0100, Michal Hocko wrote:
> On Wed 02-03-16 19:36:26, Hidehiro Kawai wrote:
> [...]
> > +void nmi_panic(struct pt_regs *regs, const char *fmt, ...)
>
> Do we really need vargs? All the current users seem to be OK with a
> simple string. This makes the code slightly more complicated without any
> apparent reason.

I was just wondering the exactly same thing...

The contra-arg would be that in case someone wants to do nmi_panic()
with more than a string, then it won't work.

The question is, does nmi_panic() even need to dump something more than
regs and a string?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 17:03:17

by Guenter Roeck

[permalink] [raw]
Subject: Re: [v2 PATCH 1/3] panic: Change nmi_panic from macro to function

On 03/02/2016 05:18 AM, Michal Hocko wrote:
> On Wed 02-03-16 19:36:26, Hidehiro Kawai wrote:
> [...]
>> +void nmi_panic(struct pt_regs *regs, const char *fmt, ...)
>
> Do we really need vargs? All the current users seem to be OK with a
> simple string. This makes the code slightly more complicated without any
> apparent reason.
>
>> +{
>> + static char buf[1024]; /* protected by panic_cpu */

I am also not too happy with this additional stack allocation.
panic() itself takes varargs. Can those be passed on ?
I understand this would need something like vpanic(), so
maybe that isn't feasible.

Dropping the format, at least for now, might be a simpler option.

Guenter

>> + va_list args;
>> + int old_cpu, cpu;
>> +
>> + cpu = raw_smp_processor_id();
>> + old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, cpu);
>> +
>> + if (old_cpu == PANIC_CPU_INVALID) {
>> + va_start(args, fmt);
>> + vsnprintf(buf, sizeof(buf), fmt, args);
>> + va_end(args);
>> +
>> + panic("%s", buf);
>> + } else if (old_cpu != cpu)
>> + nmi_panic_self_stop(regs);
>> +}
>> +EXPORT_SYMBOL(nmi_panic);
>> +
>> /**
>> * panic - halt the system
>> * @fmt: The text string to print
>>
>>
>

2016-03-03 01:29:00

by Hidehiro Kawai

[permalink] [raw]
Subject: RE: [v2 PATCH 1/3] panic: Change nmi_panic from macro to function

Hi,

> From: Borislav Petkov [mailto:[email protected]]
> On Wed, Mar 02, 2016 at 02:18:24PM +0100, Michal Hocko wrote:
> > On Wed 02-03-16 19:36:26, Hidehiro Kawai wrote:
> > [...]
> > > +void nmi_panic(struct pt_regs *regs, const char *fmt, ...)
> >
> > Do we really need vargs? All the current users seem to be OK with a
> > simple string. This makes the code slightly more complicated without any
> > apparent reason.
>
> I was just wondering the exactly same thing...
>
> The contra-arg would be that in case someone wants to do nmi_panic()
> with more than a string, then it won't work.

It's not necessary to use vargs at this point, and passing a simple
string is OK for me. Even if someone wants to use vargs, we can modify
nmi_panic() without any changes in caller side. So, I'll remove it.

> The question is, does nmi_panic() even need to dump something more than
> regs and a string?

Hmm, printing regs, especially for RIP, would be useful for hang-up
cases, but I don't want to do much things in nmi_panic() as long as
it is already done by current callers. nmi_panic() can be called
concurrently on multiple CPUs, so it also needs another serialization
to print something.

By the way, I have a patch set to safely leave important information
by kexec's purgatory...but no one is interested in?
http://thread.gmane.org/gmane.linux.kernel.kexec/15382

Regards,
--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group