2018-07-24 19:29:05

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: [PATCH 0/7] powerpc: Modernize unhandled signals message

Hi, everyone.

This series was inspired by the need to modernize and display more
informative messages about unhandled signals.

The "unhandled signal NN" is not very informative. We thought it would
be helpful adding a human-readable message describing what the signal
number means, printing the VMA address, and dumping the instructions.

We can add more informative messages, like informing what each code of a
SIGSEGV signal means. We are open to suggestions.

I have collected some early feedback from Michael Ellerman about this
series and would love to hear more feedback from you all.

Before this series:

Jul 24 13:01:07 localhost kernel: pandafault[5989]: unhandled signal 11 at 00000000100007d0 nip 000000001000061c lr 00003fff85a75100 code 2

After this series:

Jul 24 13:08:01 localhost kernel: pandafault[10758]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffabc85100 code 2 in pandafault[10000000+10000]
Jul 24 13:08:01 localhost kernel: Instruction dump:
Jul 24 13:08:01 localhost kernel: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
Jul 24 13:08:01 localhost kernel: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040

Cheers
Murilo

Murilo Opsfelder Araujo (7):
powerpc/traps: Print unhandled signals in a separate function
powerpc/traps: Return early in show_signal_msg()
powerpc/reg: Add REG_FMT definition
powerpc/traps: Use REG_FMT in show_signal_msg()
powerpc/traps: Print VMA for unhandled signals
powerpc/traps: Print signal name for unhandled signals
powerpc/traps: Show instructions on exceptions

arch/powerpc/include/asm/reg.h | 6 +++
arch/powerpc/include/asm/stacktrace.h | 7 +++
arch/powerpc/kernel/process.c | 28 +++++-----
arch/powerpc/kernel/traps.c | 73 +++++++++++++++++++++++----
4 files changed, 89 insertions(+), 25 deletions(-)
create mode 100644 arch/powerpc/include/asm/stacktrace.h

--
2.17.1



2018-07-24 19:29:10

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: [PATCH 2/7] powerpc/traps: Return early in show_signal_msg()

Modify logic of show_signal_msg() to return early, if possible. Replace
printk_ratelimited() by printk() and a default rate limit burst to limit
displaying unhandled signals messages.

Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
---
arch/powerpc/kernel/traps.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index cbd3dc365193..4faab4705774 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -301,6 +301,13 @@ void user_single_step_siginfo(struct task_struct *tsk,
info->si_addr = (void __user *)regs->nip;
}

+static bool show_unhandled_signals_ratelimited(void)
+{
+ static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
+ DEFAULT_RATELIMIT_BURST);
+ return show_unhandled_signals && __ratelimit(&rs);
+}
+
static void show_signal_msg(int signr, struct pt_regs *regs, int code,
unsigned long addr)
{
@@ -309,11 +316,12 @@ static void show_signal_msg(int signr, struct pt_regs *regs, int code,
const char fmt64[] = KERN_INFO "%s[%d]: unhandled signal %d " \
"at %016lx nip %016lx lr %016lx code %x\n";

- if (show_unhandled_signals && unhandled_signal(current, signr)) {
- printk_ratelimited(regs->msr & MSR_64BIT ? fmt64 : fmt32,
- current->comm, current->pid, signr,
- addr, regs->nip, regs->link, code);
- }
+ if (!unhandled_signal(current, signr))
+ return;
+
+ printk(regs->msr & MSR_64BIT ? fmt64 : fmt32,
+ current->comm, current->pid, signr,
+ addr, regs->nip, regs->link, code);
}

void _exception_pkey(int signr, struct pt_regs *regs, int code,
@@ -326,7 +334,8 @@ void _exception_pkey(int signr, struct pt_regs *regs, int code,
return;
}

- show_signal_msg(signr, regs, code, addr);
+ if (show_unhandled_signals_ratelimited())
+ show_signal_msg(signr, regs, code, addr);

if (arch_irqs_disabled() && !arch_irq_disabled_regs(regs))
local_irq_enable();
--
2.17.1


2018-07-24 19:29:20

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: [PATCH 3/7] powerpc/reg: Add REG_FMT definition

Make REG definition, in arch/powerpc/kernel/process.c, generic enough by
renaming it to REG_FMT and placing it in arch/powerpc/include/asm/reg.h to be
used elsewhere.

Replace occurrences of REG by REG_FMT in arch/powerpc/kernel/process.c.

Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
---
arch/powerpc/include/asm/reg.h | 6 ++++++
arch/powerpc/kernel/process.c | 22 ++++++++++------------
2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 858aa7984ab0..d6c5c77383de 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1319,6 +1319,12 @@
#define PVR_ARCH_207 0x0f000004
#define PVR_ARCH_300 0x0f000005

+#ifdef CONFIG_PPC64
+#define REG_FMT "%016lx"
+#else
+#define REG_FMT "%08lx"
+#endif /* CONFIG_PPC64 */
+
/* Macros for setting and retrieving special purpose registers */
#ifndef __ASSEMBLY__
#define mfmsr() ({unsigned long rval; \
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 27f0caee55ea..b1af3390249c 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1381,11 +1381,9 @@ static void print_msr_bits(unsigned long val)
}

#ifdef CONFIG_PPC64
-#define REG "%016lx"
#define REGS_PER_LINE 4
#define LAST_VOLATILE 13
#else
-#define REG "%08lx"
#define REGS_PER_LINE 8
#define LAST_VOLATILE 12
#endif
@@ -1396,21 +1394,21 @@ void show_regs(struct pt_regs * regs)

show_regs_print_info(KERN_DEFAULT);

- printk("NIP: "REG" LR: "REG" CTR: "REG"\n",
+ printk("NIP: "REG_FMT" LR: "REG_FMT" CTR: "REG_FMT"\n",
regs->nip, regs->link, regs->ctr);
printk("REGS: %px TRAP: %04lx %s (%s)\n",
regs, regs->trap, print_tainted(), init_utsname()->release);
- printk("MSR: "REG" ", regs->msr);
+ printk("MSR: "REG_FMT" ", regs->msr);
print_msr_bits(regs->msr);
- pr_cont(" CR: %08lx XER: %08lx\n", regs->ccr, regs->xer);
+ pr_cont(" CR: "REG_FMT" XER: "REG_FMT"\n", regs->ccr, regs->xer);
trap = TRAP(regs);
if ((TRAP(regs) != 0xc00) && cpu_has_feature(CPU_FTR_CFAR))
- pr_cont("CFAR: "REG" ", regs->orig_gpr3);
+ pr_cont("CFAR: "REG_FMT" ", regs->orig_gpr3);
if (trap == 0x200 || trap == 0x300 || trap == 0x600)
#if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
- pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->dsisr);
+ pr_cont("DEAR: "REG_FMT" ESR: "REG_FMT" ", regs->dar, regs->dsisr);
#else
- pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, regs->dsisr);
+ pr_cont("DAR: "REG_FMT" DSISR: "REG_FMT" ", regs->dar, regs->dsisr);
#endif
#ifdef CONFIG_PPC64
pr_cont("IRQMASK: %lx ", regs->softe);
@@ -1423,7 +1421,7 @@ void show_regs(struct pt_regs * regs)
for (i = 0; i < 32; i++) {
if ((i % REGS_PER_LINE) == 0)
pr_cont("\nGPR%02d: ", i);
- pr_cont(REG " ", regs->gpr[i]);
+ pr_cont(REG_FMT " ", regs->gpr[i]);
if (i == LAST_VOLATILE && !FULL_REGS(regs))
break;
}
@@ -1433,8 +1431,8 @@ void show_regs(struct pt_regs * regs)
* Lookup NIP late so we have the best change of getting the
* above info out without failing
*/
- printk("NIP ["REG"] %pS\n", regs->nip, (void *)regs->nip);
- printk("LR ["REG"] %pS\n", regs->link, (void *)regs->link);
+ printk("NIP ["REG_FMT"] %pS\n", regs->nip, (void *)regs->nip);
+ printk("LR ["REG_FMT"] %pS\n", regs->link, (void *)regs->link);
#endif
show_stack(current, (unsigned long *) regs->gpr[1]);
if (!user_mode(regs))
@@ -2038,7 +2036,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
newsp = stack[0];
ip = stack[STACK_FRAME_LR_SAVE];
if (!firstframe || ip != lr) {
- printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
+ printk("["REG_FMT"] ["REG_FMT"] %pS", sp, ip, (void *)ip);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if ((ip == rth) && curr_frame >= 0) {
pr_cont(" (%pS)",
--
2.17.1


2018-07-24 19:29:27

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: [PATCH 4/7] powerpc/traps: Use REG_FMT in show_signal_msg()

Simplify the message format by using REG_FMT as the register format. This
avoids having two different formats and avoids checking for MSR_64BIT.

Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
---
arch/powerpc/kernel/traps.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 4faab4705774..047d980ac776 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -311,17 +311,13 @@ static bool show_unhandled_signals_ratelimited(void)
static void show_signal_msg(int signr, struct pt_regs *regs, int code,
unsigned long addr)
{
- const char fmt32[] = KERN_INFO "%s[%d]: unhandled signal %d " \
- "at %08lx nip %08lx lr %08lx code %x\n";
- const char fmt64[] = KERN_INFO "%s[%d]: unhandled signal %d " \
- "at %016lx nip %016lx lr %016lx code %x\n";
-
if (!unhandled_signal(current, signr))
return;

- printk(regs->msr & MSR_64BIT ? fmt64 : fmt32,
- current->comm, current->pid, signr,
- addr, regs->nip, regs->link, code);
+ pr_info("%s[%d]: unhandled signal %d at "REG_FMT \
+ " nip "REG_FMT" lr "REG_FMT" code %x\n",
+ current->comm, current->pid, signr, addr,
+ regs->nip, regs->link, code);
}

void _exception_pkey(int signr, struct pt_regs *regs, int code,
--
2.17.1


2018-07-24 19:29:33

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: [PATCH 5/7] powerpc/traps: Print VMA for unhandled signals

This adds VMA address in the message printed for unhandled signals, similarly to
what other architectures, like x86, print.

Before this patch, a page fault looked like:

Jul 11 15:56:25 localhost kernel: pandafault[61470]: unhandled signal 11 at 00000000100007d0 nip 000000001000061c lr 00007fff8d185100 code 2

After this patch, a page fault looks like:

Jul 11 16:04:11 localhost kernel: pandafault[6303]: unhandled signal 11 at 00000000100007d0 nip 000000001000061c lr 00007fff93c55100 code 2 in pandafault[10000000+10000]

Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
---
arch/powerpc/kernel/traps.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 047d980ac776..e6c43ef9fb50 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -315,9 +315,13 @@ static void show_signal_msg(int signr, struct pt_regs *regs, int code,
return;

pr_info("%s[%d]: unhandled signal %d at "REG_FMT \
- " nip "REG_FMT" lr "REG_FMT" code %x\n",
+ " nip "REG_FMT" lr "REG_FMT" code %x",
current->comm, current->pid, signr, addr,
regs->nip, regs->link, code);
+
+ print_vma_addr(KERN_CONT " in ", regs->nip);
+
+ pr_cont("\n");
}

void _exception_pkey(int signr, struct pt_regs *regs, int code,
--
2.17.1


2018-07-24 19:29:46

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: [PATCH 7/7] powerpc/traps: Show instructions on exceptions

Move show_instructions() declaration to arch/powerpc/include/asm/stacktrace.h
and include asm/stracktrace.h in arch/powerpc/kernel/process.c, which contains
the implementation.

Modify show_instructions() not to call __kernel_text_address(), allowing
userspace instruction dump. probe_kernel_address(), which returns -EFAULT if
something goes wrong, is still being called.

Call show_instructions() in arch/powerpc/kernel/traps.c to dump instructions at
faulty location, useful to debugging.

Before this patch, an unhandled signal message looked like:

Jul 24 09:57:00 localhost kernel: pandafault[10524]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffbd295100 code 2 in pandafault[10000000+10000]

After this patch, it looks like:

Jul 24 09:57:00 localhost kernel: pandafault[10524]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffbd295100 code 2 in pandafault[10000000+10000]
Jul 24 09:57:00 localhost kernel: Instruction dump:
Jul 24 09:57:00 localhost kernel: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
Jul 24 09:57:00 localhost kernel: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040

Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
---
arch/powerpc/include/asm/stacktrace.h | 7 +++++++
arch/powerpc/kernel/process.c | 6 +++---
arch/powerpc/kernel/traps.c | 3 +++
3 files changed, 13 insertions(+), 3 deletions(-)
create mode 100644 arch/powerpc/include/asm/stacktrace.h

diff --git a/arch/powerpc/include/asm/stacktrace.h b/arch/powerpc/include/asm/stacktrace.h
new file mode 100644
index 000000000000..46e5ef451578
--- /dev/null
+++ b/arch/powerpc/include/asm/stacktrace.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_STACKTRACE_H
+#define _ASM_POWERPC_STACKTRACE_H
+
+void show_instructions(struct pt_regs *regs);
+
+#endif /* _ASM_POWERPC_STACKTRACE_H */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index b1af3390249c..ee1d63e03c52 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -52,6 +52,7 @@
#include <asm/machdep.h>
#include <asm/time.h>
#include <asm/runlatch.h>
+#include <asm/stacktrace.h>
#include <asm/syscalls.h>
#include <asm/switch_to.h>
#include <asm/tm.h>
@@ -1261,7 +1262,7 @@ struct task_struct *__switch_to(struct task_struct *prev,

static int instructions_to_print = 16;

-static void show_instructions(struct pt_regs *regs)
+void show_instructions(struct pt_regs *regs)
{
int i;
unsigned long pc = regs->nip - (instructions_to_print * 3 / 4 *
@@ -1283,8 +1284,7 @@ static void show_instructions(struct pt_regs *regs)
pc = (unsigned long)phys_to_virt(pc);
#endif

- if (!__kernel_text_address(pc) ||
- probe_kernel_address((unsigned int __user *)pc, instr)) {
+ if (probe_kernel_address((unsigned int __user *)pc, instr)) {
pr_cont("XXXXXXXX ");
} else {
if (regs->nip == pc)
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index e55ee639d010..3beca17ac1b1 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -70,6 +70,7 @@
#include <asm/hmi.h>
#include <sysdev/fsl_pci.h>
#include <asm/kprobes.h>
+#include <asm/stacktrace.h>

#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
int (*__debugger)(struct pt_regs *regs) __read_mostly;
@@ -357,6 +358,8 @@ static void show_signal_msg(int signr, struct pt_regs *regs, int code,
print_vma_addr(KERN_CONT " in ", regs->nip);

pr_cont("\n");
+
+ show_instructions(regs);
}

void _exception_pkey(int signr, struct pt_regs *regs, int code,
--
2.17.1


2018-07-24 19:29:50

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: [PATCH 1/7] powerpc/traps: Print unhandled signals in a separate function

Isolate the logic of printing unhandled signals out of _exception_pkey(). No
functional change, only code rearrangement.

Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
---
arch/powerpc/kernel/traps.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 0e17dcb48720..cbd3dc365193 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -301,26 +301,32 @@ void user_single_step_siginfo(struct task_struct *tsk,
info->si_addr = (void __user *)regs->nip;
}

+static void show_signal_msg(int signr, struct pt_regs *regs, int code,
+ unsigned long addr)
+{
+ const char fmt32[] = KERN_INFO "%s[%d]: unhandled signal %d " \
+ "at %08lx nip %08lx lr %08lx code %x\n";
+ const char fmt64[] = KERN_INFO "%s[%d]: unhandled signal %d " \
+ "at %016lx nip %016lx lr %016lx code %x\n";
+
+ if (show_unhandled_signals && unhandled_signal(current, signr)) {
+ printk_ratelimited(regs->msr & MSR_64BIT ? fmt64 : fmt32,
+ current->comm, current->pid, signr,
+ addr, regs->nip, regs->link, code);
+ }
+}

void _exception_pkey(int signr, struct pt_regs *regs, int code,
- unsigned long addr, int key)
+ unsigned long addr, int key)
{
siginfo_t info;
- const char fmt32[] = KERN_INFO "%s[%d]: unhandled signal %d " \
- "at %08lx nip %08lx lr %08lx code %x\n";
- const char fmt64[] = KERN_INFO "%s[%d]: unhandled signal %d " \
- "at %016lx nip %016lx lr %016lx code %x\n";

if (!user_mode(regs)) {
die("Exception in kernel mode", regs, signr);
return;
}

- if (show_unhandled_signals && unhandled_signal(current, signr)) {
- printk_ratelimited(regs->msr & MSR_64BIT ? fmt64 : fmt32,
- current->comm, current->pid, signr,
- addr, regs->nip, regs->link, code);
- }
+ show_signal_msg(signr, regs, code, addr);

if (arch_irqs_disabled() && !arch_irq_disabled_regs(regs))
local_irq_enable();
--
2.17.1


2018-07-24 19:30:18

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: [PATCH 6/7] powerpc/traps: Print signal name for unhandled signals

This adds a human-readable name in the unhandled signal message.

Before this patch, a page fault looked like:

Jul 11 16:04:11 localhost kernel: pandafault[6303]: unhandled signal 11 at 00000000100007d0 nip 000000001000061c lr 00007fff93c55100 code 2 in pandafault[10000000+10000]

After this patch, a page fault looks like:

Jul 11 18:14:48 localhost kernel: pandafault[6352]: segfault (11) at 000000013a2a09f8 nip 000000013a2a086c lr 00007fffb63e5100 code 2 in pandafault[13a2a0000+10000]

Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
---
arch/powerpc/kernel/traps.c | 43 +++++++++++++++++++++++++++++++++----
1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index e6c43ef9fb50..e55ee639d010 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -96,6 +96,41 @@ EXPORT_SYMBOL(__debugger_fault_handler);
#define TM_DEBUG(x...) do { } while(0)
#endif

+static const char *signames[SIGRTMIN + 1] = {
+ "UNKNOWN",
+ "SIGHUP", // 1
+ "SIGINT", // 2
+ "SIGQUIT", // 3
+ "SIGILL", // 4
+ "unhandled trap", // 5 = SIGTRAP
+ "SIGABRT", // 6 = SIGIOT
+ "bus error", // 7 = SIGBUS
+ "floating point exception", // 8 = SIGFPE
+ "illegal instruction", // 9 = SIGILL
+ "SIGUSR1", // 10
+ "segfault", // 11 = SIGSEGV
+ "SIGUSR2", // 12
+ "SIGPIPE", // 13
+ "SIGALRM", // 14
+ "SIGTERM", // 15
+ "SIGSTKFLT", // 16
+ "SIGCHLD", // 17
+ "SIGCONT", // 18
+ "SIGSTOP", // 19
+ "SIGTSTP", // 20
+ "SIGTTIN", // 21
+ "SIGTTOU", // 22
+ "SIGURG", // 23
+ "SIGXCPU", // 24
+ "SIGXFSZ", // 25
+ "SIGVTALRM", // 26
+ "SIGPROF", // 27
+ "SIGWINCH", // 28
+ "SIGIO", // 29 = SIGPOLL = SIGLOST
+ "SIGPWR", // 30
+ "SIGSYS", // 31 = SIGUNUSED
+};
+
/*
* Trap & Exception support
*/
@@ -314,10 +349,10 @@ static void show_signal_msg(int signr, struct pt_regs *regs, int code,
if (!unhandled_signal(current, signr))
return;

- pr_info("%s[%d]: unhandled signal %d at "REG_FMT \
- " nip "REG_FMT" lr "REG_FMT" code %x",
- current->comm, current->pid, signr, addr,
- regs->nip, regs->link, code);
+ pr_info("%s[%d]: %s (%d) at "REG_FMT" nip "REG_FMT \
+ " lr "REG_FMT" code %x",
+ current->comm, current->pid, signames[signr],
+ signr, addr, regs->nip, regs->link, code);

print_vma_addr(KERN_CONT " in ", regs->nip);

--
2.17.1


2018-07-25 07:02:05

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH 0/7] powerpc: Modernize unhandled signals message

On Tue, 2018-07-24 at 16:27 -0300, Murilo Opsfelder Araujo wrote:
> Hi, everyone.
>
> This series was inspired by the need to modernize and display more
> informative messages about unhandled signals.
>
> The "unhandled signal NN" is not very informative. We thought it would
> be helpful adding a human-readable message describing what the signal
> number means, printing the VMA address, and dumping the instructions.
>
> We can add more informative messages, like informing what each code of a
> SIGSEGV signal means. We are open to suggestions.
>
> I have collected some early feedback from Michael Ellerman about this
> series and would love to hear more feedback from you all.

Nice.. the instruction dump would have been very handy when debugging the PCR
init issue I had a month or so back.

> Before this series:
>
> Jul 24 13:01:07 localhost kernel: pandafault[5989]: unhandled signal 11 at 00000000100007d0 nip 000000001000061c lr 00003fff85a75100 code 2
>
> After this series:
>
> Jul 24 13:08:01 localhost kernel: pandafault[10758]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffabc85100 code 2 in pandafault[10000000+10000]
> Jul 24 13:08:01 localhost kernel: Instruction dump:
> Jul 24 13:08:01 localhost kernel: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
> Jul 24 13:08:01 localhost kernel: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040

What happens if we get a sudden flood of these from different processes that
overlap their output? Are we going to be able to match up the process with
instruction dump?

Should we prefix every line with the PID to avoid this?

Mikey

2018-07-25 15:20:18

by Gustavo Romero

[permalink] [raw]
Subject: Re: [PATCH 6/7] powerpc/traps: Print signal name for unhandled signals

Hi Murilo,

LGTM.

Just a comment:

On 07/24/2018 04:27 PM, Murilo Opsfelder Araujo wrote:
> This adds a human-readable name in the unhandled signal message.
>
> Before this patch, a page fault looked like:
>
> Jul 11 16:04:11 localhost kernel: pandafault[6303]: unhandled signal 11 at 00000000100007d0 nip 000000001000061c lr 00007fff93c55100 code 2 in pandafault[10000000+10000]
>
> After this patch, a page fault looks like:
>
> Jul 11 18:14:48 localhost kernel: pandafault[6352]: segfault (11) at 000000013a2a09f8 nip 000000013a2a086c lr 00007fffb63e5100 code 2 in pandafault[13a2a0000+10000]

I _really_ don't want to bikeshed here, but I vouch for keeping the
"unhandled" word before the signal name, like:

[...] pandafault[6352]: unhandled segfault (11) at 000000013a2a09f8 nip [...]

because the issue reported here is really that we got a segfault _and_
there was no handler to catch it.

But feel free to wait for additional comments to decide it.


Cheers,
Gustavo


2018-07-25 15:43:38

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH 2/7] powerpc/traps: Return early in show_signal_msg()

Murilo Opsfelder Araujo <[email protected]> a écrit :

> Modify logic of show_signal_msg() to return early, if possible. Replace
> printk_ratelimited() by printk() and a default rate limit burst to limit
> displaying unhandled signals messages.

Can you explain more the benefits of this change ?

Christophe

>
> Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
> ---
> arch/powerpc/kernel/traps.c | 21 +++++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index cbd3dc365193..4faab4705774 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -301,6 +301,13 @@ void user_single_step_siginfo(struct task_struct *tsk,
> info->si_addr = (void __user *)regs->nip;
> }
>
> +static bool show_unhandled_signals_ratelimited(void)
> +{
> + static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> + DEFAULT_RATELIMIT_BURST);
> + return show_unhandled_signals && __ratelimit(&rs);
> +}
> +
> static void show_signal_msg(int signr, struct pt_regs *regs, int code,
> unsigned long addr)
> {
> @@ -309,11 +316,12 @@ static void show_signal_msg(int signr, struct
> pt_regs *regs, int code,
> const char fmt64[] = KERN_INFO "%s[%d]: unhandled signal %d " \
> "at %016lx nip %016lx lr %016lx code %x\n";
>
> - if (show_unhandled_signals && unhandled_signal(current, signr)) {
> - printk_ratelimited(regs->msr & MSR_64BIT ? fmt64 : fmt32,
> - current->comm, current->pid, signr,
> - addr, regs->nip, regs->link, code);
> - }
> + if (!unhandled_signal(current, signr))
> + return;
> +
> + printk(regs->msr & MSR_64BIT ? fmt64 : fmt32,
> + current->comm, current->pid, signr,
> + addr, regs->nip, regs->link, code);
> }
>
> void _exception_pkey(int signr, struct pt_regs *regs, int code,
> @@ -326,7 +334,8 @@ void _exception_pkey(int signr, struct pt_regs
> *regs, int code,
> return;
> }
>
> - show_signal_msg(signr, regs, code, addr);
> + if (show_unhandled_signals_ratelimited())
> + show_signal_msg(signr, regs, code, addr);
>
> if (arch_irqs_disabled() && !arch_irq_disabled_regs(regs))
> local_irq_enable();
> --
> 2.17.1



2018-07-25 15:50:51

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH 6/7] powerpc/traps: Print signal name for unhandled signals

Murilo Opsfelder Araujo <[email protected]> a écrit :

> This adds a human-readable name in the unhandled signal message.
>
> Before this patch, a page fault looked like:
>
> Jul 11 16:04:11 localhost kernel: pandafault[6303]: unhandled
> signal 11 at 00000000100007d0 nip 000000001000061c lr
> 00007fff93c55100 code 2 in pandafault[10000000+10000]
>
> After this patch, a page fault looks like:
>
> Jul 11 18:14:48 localhost kernel: pandafault[6352]: segfault
> (11) at 000000013a2a09f8 nip 000000013a2a086c lr 00007fffb63e5100
> code 2 in pandafault[13a2a0000+10000]
>
> Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
> ---
> arch/powerpc/kernel/traps.c | 43 +++++++++++++++++++++++++++++++++----
> 1 file changed, 39 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index e6c43ef9fb50..e55ee639d010 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -96,6 +96,41 @@ EXPORT_SYMBOL(__debugger_fault_handler);
> #define TM_DEBUG(x...) do { } while(0)
> #endif
>
> +static const char *signames[SIGRTMIN + 1] = {
> + "UNKNOWN",
> + "SIGHUP", // 1
> + "SIGINT", // 2
> + "SIGQUIT", // 3
> + "SIGILL", // 4
> + "unhandled trap", // 5 = SIGTRAP
> + "SIGABRT", // 6 = SIGIOT
> + "bus error", // 7 = SIGBUS
> + "floating point exception", // 8 = SIGFPE
> + "illegal instruction", // 9 = SIGILL
> + "SIGUSR1", // 10
> + "segfault", // 11 = SIGSEGV
> + "SIGUSR2", // 12
> + "SIGPIPE", // 13
> + "SIGALRM", // 14
> + "SIGTERM", // 15
> + "SIGSTKFLT", // 16
> + "SIGCHLD", // 17
> + "SIGCONT", // 18
> + "SIGSTOP", // 19
> + "SIGTSTP", // 20
> + "SIGTTIN", // 21
> + "SIGTTOU", // 22
> + "SIGURG", // 23
> + "SIGXCPU", // 24
> + "SIGXFSZ", // 25
> + "SIGVTALRM", // 26
> + "SIGPROF", // 27
> + "SIGWINCH", // 28
> + "SIGIO", // 29 = SIGPOLL = SIGLOST
> + "SIGPWR", // 30
> + "SIGSYS", // 31 = SIGUNUSED
> +};
> +
> /*
> * Trap & Exception support
> */
> @@ -314,10 +349,10 @@ static void show_signal_msg(int signr, struct
> pt_regs *regs, int code,
> if (!unhandled_signal(current, signr))
> return;
>
> - pr_info("%s[%d]: unhandled signal %d at "REG_FMT \
> - " nip "REG_FMT" lr "REG_FMT" code %x",
> - current->comm, current->pid, signr, addr,
> - regs->nip, regs->link, code);
> + pr_info("%s[%d]: %s (%d) at "REG_FMT" nip "REG_FMT \
> + " lr "REG_FMT" code %x",
> + current->comm, current->pid, signames[signr],
> + signr, addr, regs->nip, regs->link, code);

Are we sure that signr is always within the limits of the table ?

Christophe
>
> print_vma_addr(KERN_CONT " in ", regs->nip);
>
> --
> 2.17.1



2018-07-25 16:02:52

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH 7/7] powerpc/traps: Show instructions on exceptions

Murilo Opsfelder Araujo <[email protected]> a écrit :

> Move show_instructions() declaration to arch/powerpc/include/asm/stacktrace.h
> and include asm/stracktrace.h in arch/powerpc/kernel/process.c,
> which contains
> the implementation.
>
> Modify show_instructions() not to call __kernel_text_address(), allowing
> userspace instruction dump. probe_kernel_address(), which returns -EFAULT if
> something goes wrong, is still being called.
>
> Call show_instructions() in arch/powerpc/kernel/traps.c to dump
> instructions at
> faulty location, useful to debugging.

Shouldn't this part be in a second patch ?

Wouldn't it be better to also see regs in addition if we want to use
this to understand what happened ?
So you could call show_regs() instead of show_instructions() ?

Christophe

>
> Before this patch, an unhandled signal message looked like:
>
> Jul 24 09:57:00 localhost kernel: pandafault[10524]: segfault
> (11) at 00000000100007d0 nip 000000001000061c lr 00007fffbd295100
> code 2 in pandafault[10000000+10000]
>
> After this patch, it looks like:
>
> Jul 24 09:57:00 localhost kernel: pandafault[10524]: segfault
> (11) at 00000000100007d0 nip 000000001000061c lr 00007fffbd295100
> code 2 in pandafault[10000000+10000]
> Jul 24 09:57:00 localhost kernel: Instruction dump:
> Jul 24 09:57:00 localhost kernel: 4bfffeec 4bfffee8 3c401002
> 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
> Jul 24 09:57:00 localhost kernel: 392988d0 f93f0020 e93f0020
> 39400048 <99490000> 39200000 7d234b78 383f0040
>
> Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
> ---
> arch/powerpc/include/asm/stacktrace.h | 7 +++++++
> arch/powerpc/kernel/process.c | 6 +++---
> arch/powerpc/kernel/traps.c | 3 +++
> 3 files changed, 13 insertions(+), 3 deletions(-)
> create mode 100644 arch/powerpc/include/asm/stacktrace.h
>
> diff --git a/arch/powerpc/include/asm/stacktrace.h
> b/arch/powerpc/include/asm/stacktrace.h
> new file mode 100644
> index 000000000000..46e5ef451578
> --- /dev/null
> +++ b/arch/powerpc/include/asm/stacktrace.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_POWERPC_STACKTRACE_H
> +#define _ASM_POWERPC_STACKTRACE_H
> +
> +void show_instructions(struct pt_regs *regs);
> +
> +#endif /* _ASM_POWERPC_STACKTRACE_H */
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index b1af3390249c..ee1d63e03c52 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -52,6 +52,7 @@
> #include <asm/machdep.h>
> #include <asm/time.h>
> #include <asm/runlatch.h>
> +#include <asm/stacktrace.h>
> #include <asm/syscalls.h>
> #include <asm/switch_to.h>
> #include <asm/tm.h>
> @@ -1261,7 +1262,7 @@ struct task_struct *__switch_to(struct
> task_struct *prev,
>
> static int instructions_to_print = 16;
>
> -static void show_instructions(struct pt_regs *regs)
> +void show_instructions(struct pt_regs *regs)
> {
> int i;
> unsigned long pc = regs->nip - (instructions_to_print * 3 / 4 *
> @@ -1283,8 +1284,7 @@ static void show_instructions(struct pt_regs *regs)
> pc = (unsigned long)phys_to_virt(pc);
> #endif
>
> - if (!__kernel_text_address(pc) ||
> - probe_kernel_address((unsigned int __user *)pc, instr)) {
> + if (probe_kernel_address((unsigned int __user *)pc, instr)) {
> pr_cont("XXXXXXXX ");
> } else {
> if (regs->nip == pc)
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index e55ee639d010..3beca17ac1b1 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -70,6 +70,7 @@
> #include <asm/hmi.h>
> #include <sysdev/fsl_pci.h>
> #include <asm/kprobes.h>
> +#include <asm/stacktrace.h>
>
> #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
> int (*__debugger)(struct pt_regs *regs) __read_mostly;
> @@ -357,6 +358,8 @@ static void show_signal_msg(int signr, struct
> pt_regs *regs, int code,
> print_vma_addr(KERN_CONT " in ", regs->nip);
>
> pr_cont("\n");
> +
> + show_instructions(regs);
> }
>
> void _exception_pkey(int signr, struct pt_regs *regs, int code,
> --
> 2.17.1



2018-07-25 19:38:03

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: Re: [PATCH 0/7] powerpc: Modernize unhandled signals message

Hi, Mikey.

On Wed, Jul 25, 2018 at 05:00:21PM +1000, Michael Neuling wrote:
> On Tue, 2018-07-24 at 16:27 -0300, Murilo Opsfelder Araujo wrote:
> > Hi, everyone.
> >
> > This series was inspired by the need to modernize and display more
> > informative messages about unhandled signals.
> >
> > The "unhandled signal NN" is not very informative. We thought it would
> > be helpful adding a human-readable message describing what the signal
> > number means, printing the VMA address, and dumping the instructions.
> >
> > We can add more informative messages, like informing what each code of a
> > SIGSEGV signal means. We are open to suggestions.
> >
> > I have collected some early feedback from Michael Ellerman about this
> > series and would love to hear more feedback from you all.
>
> Nice.. the instruction dump would have been very handy when debugging the PCR
> init issue I had a month or so back.
>
> > Before this series:
> >
> > Jul 24 13:01:07 localhost kernel: pandafault[5989]: unhandled signal 11 at 00000000100007d0 nip 000000001000061c lr 00003fff85a75100 code 2
> >
> > After this series:
> >
> > Jul 24 13:08:01 localhost kernel: pandafault[10758]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffabc85100 code 2 in pandafault[10000000+10000]
> > Jul 24 13:08:01 localhost kernel: Instruction dump:
> > Jul 24 13:08:01 localhost kernel: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
> > Jul 24 13:08:01 localhost kernel: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040
>
> What happens if we get a sudden flood of these from different processes that
> overlap their output? Are we going to be able to match up the process with
> instruction dump?

As to the flood of messages, ___ratelimit() makes me think that we'll
likely see some warn messages informing how many show_signal_msg()
callbacks were suppressed, instead of interleaved messages and
instruction dumps.

As to matching process with instruction dump, I believe we'd need more
information to glue them together.

What if we modify show_instructions() to accept a string prefix to be
printed along with each line?

> Should we prefix every line with the PID to avoid this?

That's possible. An alternative would be prefixing each line with the
process name and its PID, as in the first line. For example:

pandafault[10758]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffabc85100 code 2 in pandafault[10000000+10000]
pandafault[10758]: Instruction dump:
pandafault[10758]: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
pandafault[10758]: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040

The above can be interleaved with other messages and we'll still be able
to match process and its corresponding instruction dump.

Cheers
Murilo


2018-07-25 19:42:28

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: Re: [PATCH 6/7] powerpc/traps: Print signal name for unhandled signals

Hi, Gustavo.

On Wed, Jul 25, 2018 at 12:19:00PM -0300, Gustavo Romero wrote:
> Hi Murilo,
>
> LGTM.
>
> Just a comment:
>
> On 07/24/2018 04:27 PM, Murilo Opsfelder Araujo wrote:
> > This adds a human-readable name in the unhandled signal message.
> >
> > Before this patch, a page fault looked like:
> >
> > Jul 11 16:04:11 localhost kernel: pandafault[6303]: unhandled signal 11 at 00000000100007d0 nip 000000001000061c lr 00007fff93c55100 code 2 in pandafault[10000000+10000]
> >
> > After this patch, a page fault looks like:
> >
> > Jul 11 18:14:48 localhost kernel: pandafault[6352]: segfault (11) at 000000013a2a09f8 nip 000000013a2a086c lr 00007fffb63e5100 code 2 in pandafault[13a2a0000+10000]
>
> I _really_ don't want to bikeshed here, but I vouch for keeping the
> "unhandled" word before the signal name, like:
>
> [...] pandafault[6352]: unhandled segfault (11) at 000000013a2a09f8 nip [...]
>
> because the issue reported here is really that we got a segfault _and_
> there was no handler to catch it.

Either way works for me.

> But feel free to wait for additional comments to decide it.

Sure. I intend to wait a couple of weeks to respin the series based on
community feedback.

Cheers
Murilo


2018-07-25 20:07:47

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: Re: [PATCH 6/7] powerpc/traps: Print signal name for unhandled signals

Hi, Christophe.

On Wed, Jul 25, 2018 at 05:49:27PM +0200, LEROY Christophe wrote:
> Murilo Opsfelder Araujo <[email protected]> a ?crit?:
>
> > This adds a human-readable name in the unhandled signal message.
> >
> > Before this patch, a page fault looked like:
> >
> > Jul 11 16:04:11 localhost kernel: pandafault[6303]: unhandled signal
> > 11 at 00000000100007d0 nip 000000001000061c lr 00007fff93c55100 code 2
> > in pandafault[10000000+10000]
> >
> > After this patch, a page fault looks like:
> >
> > Jul 11 18:14:48 localhost kernel: pandafault[6352]: segfault (11) at
> > 000000013a2a09f8 nip 000000013a2a086c lr 00007fffb63e5100 code 2 in
> > pandafault[13a2a0000+10000]
> >
> > Signed-off-by: Murilo Opsfelder Araujo <[email protected]>
> > ---
> > arch/powerpc/kernel/traps.c | 43 +++++++++++++++++++++++++++++++++----
> > 1 file changed, 39 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index e6c43ef9fb50..e55ee639d010 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -96,6 +96,41 @@ EXPORT_SYMBOL(__debugger_fault_handler);
> > #define TM_DEBUG(x...) do { } while(0)
> > #endif
> >
> > +static const char *signames[SIGRTMIN + 1] = {
> > + "UNKNOWN",
> > + "SIGHUP", // 1
> > + "SIGINT", // 2
> > + "SIGQUIT", // 3
> > + "SIGILL", // 4
> > + "unhandled trap", // 5 = SIGTRAP
> > + "SIGABRT", // 6 = SIGIOT
> > + "bus error", // 7 = SIGBUS
> > + "floating point exception", // 8 = SIGFPE
> > + "illegal instruction", // 9 = SIGILL
> > + "SIGUSR1", // 10
> > + "segfault", // 11 = SIGSEGV
> > + "SIGUSR2", // 12
> > + "SIGPIPE", // 13
> > + "SIGALRM", // 14
> > + "SIGTERM", // 15
> > + "SIGSTKFLT", // 16
> > + "SIGCHLD", // 17
> > + "SIGCONT", // 18
> > + "SIGSTOP", // 19
> > + "SIGTSTP", // 20
> > + "SIGTTIN", // 21
> > + "SIGTTOU", // 22
> > + "SIGURG", // 23
> > + "SIGXCPU", // 24
> > + "SIGXFSZ", // 25
> > + "SIGVTALRM", // 26
> > + "SIGPROF", // 27
> > + "SIGWINCH", // 28
> > + "SIGIO", // 29 = SIGPOLL = SIGLOST
> > + "SIGPWR", // 30
> > + "SIGSYS", // 31 = SIGUNUSED
> > +};
> > +
> > /*
> > * Trap & Exception support
> > */
> > @@ -314,10 +349,10 @@ static void show_signal_msg(int signr, struct
> > pt_regs *regs, int code,
> > if (!unhandled_signal(current, signr))
> > return;
> >
> > - pr_info("%s[%d]: unhandled signal %d at "REG_FMT \
> > - " nip "REG_FMT" lr "REG_FMT" code %x",
> > - current->comm, current->pid, signr, addr,
> > - regs->nip, regs->link, code);
> > + pr_info("%s[%d]: %s (%d) at "REG_FMT" nip "REG_FMT \
> > + " lr "REG_FMT" code %x",
> > + current->comm, current->pid, signames[signr],
> > + signr, addr, regs->nip, regs->link, code);
>
> Are we sure that signr is always within the limits of the table ?

Looking at the code, we only pass the following signals:

SIGBUS
SIGFPE
SIGILL
SIGSEGV
SIGTRAP

All of them are within the limits of the table. We've added other
signals for completeness.

Cheers
Murilo


2018-07-25 20:32:51

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: Re: [PATCH 2/7] powerpc/traps: Return early in show_signal_msg()

Hi, Christophe.

On Wed, Jul 25, 2018 at 05:42:28PM +0200, LEROY Christophe wrote:
> Murilo Opsfelder Araujo <[email protected]> a ?crit?:
>
> > Modify logic of show_signal_msg() to return early, if possible. Replace
> > printk_ratelimited() by printk() and a default rate limit burst to limit
> > displaying unhandled signals messages.
>
> Can you explain more the benefits of this change ?

Mainly is to improve readability of the function.

The conditions to display the message were coupled together in one
single `if` statement.

Besides that, patch 5/7 adds a call to print_vma_addr(), which is not
aware of any rate limit - it simply calls printk().

So splitting out the rate limit check outside show_signal_msg() makes it
easier to the caller decide if it wants to respect a printk rate limit
or not.

Cheers
Murilo


2018-07-25 20:53:09

by Murilo Opsfelder Araujo

[permalink] [raw]
Subject: Re: [PATCH 7/7] powerpc/traps: Show instructions on exceptions

Hi, Christophe.

On Wed, Jul 25, 2018 at 06:01:34PM +0200, LEROY Christophe wrote:
> Murilo Opsfelder Araujo <[email protected]> a ?crit?:
>
> > Move show_instructions() declaration to arch/powerpc/include/asm/stacktrace.h
> > and include asm/stracktrace.h in arch/powerpc/kernel/process.c, which
> > contains
> > the implementation.
> >
> > Modify show_instructions() not to call __kernel_text_address(), allowing
> > userspace instruction dump. probe_kernel_address(), which returns -EFAULT if
> > something goes wrong, is still being called.
> >
> > Call show_instructions() in arch/powerpc/kernel/traps.c to dump
> > instructions at
> > faulty location, useful to debugging.
>
> Shouldn't this part be in a second patch ?

Makes sense. Perhaps I should split this patch in two: one to remove
__kernel_text_address() check in show_instructions(), and another to
call show_instructions() in show_signal_msg().

> Wouldn't it be better to also see regs in addition if we want to use this to
> understand what happened ?
> So you could call show_regs() instead of show_instructions() ?

I see that show_regs() prints more information and calls
show_instructions() at the end if in privileged state.

I'm not sure about which situations we might want to call show_regs() -
and display a bunch of information - or just dump instructions for some
signals.

Isn't calling show_regs() in this case considered overkill?

Cheers
Murilo


2018-07-26 02:13:08

by Michael Neuling

[permalink] [raw]
Subject: Re: [PATCH 0/7] powerpc: Modernize unhandled signals message


> > Should we prefix every line with the PID to avoid this?
>
> That's possible. An alternative would be prefixing each line with the
> process name and its PID, as in the first line. For example:
>
> pandafault[10758]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffabc85100 code 2 in pandafault[10000000+10000]
> pandafault[10758]: Instruction dump:
> pandafault[10758]: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
> pandafault[10758]: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040
>
> The above can be interleaved with other messages and we'll still be able
> to match process and its corresponding instruction dump.

LGTM.

Thanks!
Mikey

2018-07-26 02:21:59

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 0/7] powerpc: Modernize unhandled signals message

Murilo Opsfelder Araujo <[email protected]> writes:
> On Wed, Jul 25, 2018 at 05:00:21PM +1000, Michael Neuling wrote:
>> On Tue, 2018-07-24 at 16:27 -0300, Murilo Opsfelder Araujo wrote:
>> > This series was inspired by the need to modernize and display more
>> > informative messages about unhandled signals.
...
>>
>> Nice.. the instruction dump would have been very handy when debugging the PCR
>> init issue I had a month or so back.

These things may be related :)

>> What happens if we get a sudden flood of these from different processes that
>> overlap their output? Are we going to be able to match up the process with
>> instruction dump?
>
> As to the flood of messages, ___ratelimit() makes me think that we'll
> likely see some warn messages informing how many show_signal_msg()
> callbacks were suppressed, instead of interleaved messages and
> instruction dumps.
>
> As to matching process with instruction dump, I believe we'd need more
> information to glue them together.
>
> What if we modify show_instructions() to accept a string prefix to be
> printed along with each line?
>
>> Should we prefix every line with the PID to avoid this?
>
> That's possible. An alternative would be prefixing each line with the
> process name and its PID, as in the first line. For example:
>
> pandafault[10758]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffabc85100 code 2 in pandafault[10000000+10000]
> pandafault[10758]: Instruction dump:
> pandafault[10758]: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
> pandafault[10758]: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040
>
> The above can be interleaved with other messages and we'll still be able
> to match process and its corresponding instruction dump.

Yeah prefixing with the comm and pid is nice.

Also the "Instruction dump:" line is a waste of space.

I prefer the x86 format, where it's prefixed with "code:", eg:

pandafault[10758]: segfault (11) at 00000000100007d0 nip 000000001000061c lr 00007fffabc85100 code 2 in pandafault[10000000+10000]
pandafault[10758]: code: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe
pandafault[10758]: code: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040

cheers