This is an attempt to resurrect Sumit's old patch series [1] that
allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
also to round up CPUs in kdb/kgdb. The last post from Sumit that I
could find was v7, so I called this series v8. I haven't copied all of
his old changelogs here, but you can find them from the link.
Since v7, I have:
* Addressed the small amount of feedback that was there for v7.
* Rebased.
* Added a new patch that prevents us from spamming the logs with idle
tasks.
* Added an extra patch to gracefully fall back to regular IPIs if
pseudo-NMIs aren't there.
Since there appear to be a few different patch series related to
being able to use NMIs to get stack traces of crashed systems, let me
try to organize them to the best of my understanding:
a) This series. On its own, a) will (among other things) enable stack
traces of all running processes with the soft lockup detector if
you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace" (see
the example command below). On its own, a) doesn't give a hard lockup
detector.
b) A different recently-posted series [2] that adds a hard lockup
detector based on perf. On its own, b) gives a stack crawl of the
locked up CPU but no stack crawls of other CPUs (even if they're
locked too). Together with a) + b) we get everything (full lockup
detect, full ability to get stack crawls).
c) The old Android "buddy" hard lockup detector [3] that I'm
considering trying to upstream. If b) lands then I believe c) would
be redundant (at least for arm64). c) on its own is really only
useful on arm64 for platforms that can print CPU_DBGPCSR somehow
(see [4]). a) + c) is roughly as good as a) + b).
[1] https://lore.kernel.org/linux-arm-kernel/[email protected]/
[2] https://lore.kernel.org/linux-arm-kernel/[email protected]/
[3] https://issuetracker.google.com/172213097
[4] https://issuetracker.google.com/172213129
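For reference, the "kernel.softlockup_all_cpu_backtrace" sysctl
mentioned in a) can be enabled like so (a standard sysctl knob, shown
here purely as an example):

  sysctl -w kernel.softlockup_all_cpu_backtrace=1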
Changes in v8:
- dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
- Add loongarch support, too
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
- "Tag the arm64 idle functions as __cpuidle" new for v8
- "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
- "Fallback to a regular IPI if NMI isn't enabled" new for v8
Douglas Anderson (3):
arm64: idle: Tag the arm64 idle functions as __cpuidle
kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled
Sumit Garg (7):
arm64: Add framework to turn IPI as NMI
irqchip/gic-v3: Enable support for SGIs to act as NMIs
arm64: smp: Assign and setup an IPI as NMI
nmi: backtrace: Allow runtime arch specific override
arm64: ipi_nmi: Add support for NMI backtrace
kgdb: Expose default CPUs roundup fallback mechanism
arm64: kgdb: Roundup cpus using IPI as NMI
arch/arm/include/asm/irq.h | 2 +-
arch/arm/kernel/smp.c | 3 +-
arch/arm64/include/asm/irq.h | 4 ++
arch/arm64/include/asm/nmi.h | 17 +++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/idle.c | 4 +-
arch/arm64/kernel/ipi_nmi.c | 103 +++++++++++++++++++++++++++++++
arch/arm64/kernel/kgdb.c | 18 ++++++
arch/arm64/kernel/smp.c | 8 +++
arch/loongarch/include/asm/irq.h | 2 +-
arch/loongarch/kernel/process.c | 3 +-
arch/mips/include/asm/irq.h | 2 +-
arch/mips/kernel/process.c | 3 +-
arch/powerpc/include/asm/nmi.h | 2 +-
arch/powerpc/kernel/stacktrace.c | 3 +-
arch/sparc/include/asm/irq_64.h | 2 +-
arch/sparc/kernel/process_64.c | 4 +-
arch/x86/include/asm/irq.h | 2 +-
arch/x86/kernel/apic/hw_nmi.c | 3 +-
drivers/irqchip/irq-gic-v3.c | 29 ++++++---
include/linux/kgdb.h | 13 ++++
include/linux/nmi.h | 12 ++--
kernel/debug/debug_core.c | 8 ++-
23 files changed, 217 insertions(+), 32 deletions(-)
create mode 100644 arch/arm64/include/asm/nmi.h
create mode 100644 arch/arm64/kernel/ipi_nmi.c
--
2.40.0.634.g4ca3ef3211-goog
From: Sumit Garg <[email protected]>
Introduce a framework to turn an IPI into an NMI using pseudo NMIs.
The main motivation for this feature is to have an IPI that can be
leveraged to invoke NMI functions on other CPUs.
The current prospective users are NMI backtrace and KGDB CPU round-up,
support for which is added via future patches.
Signed-off-by: Sumit Garg <[email protected]>
Reviewed-by: Masayoshi Mizuma <[email protected]>
Tested-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v8:
- dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
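For context, here is a rough sketch of how a caller ends up using the
new API; it mirrors what the kgdb patch later in this series does
(a hypothetical standalone example, not part of this patch):

  #include <linux/cpumask.h>
  #include <linux/smp.h>
  #include <asm/nmi.h>

  /* Sketch: NMI every online CPU except the current one, but only
   * if the pseudo-NMI IPI was successfully set up. */
  static void example_nmi_all_but_self(void)
  {
          struct cpumask mask;

          if (!arm64_supports_nmi())
                  return;

          cpumask_copy(&mask, cpu_online_mask);
          cpumask_clear_cpu(raw_smp_processor_id(), &mask);
          if (!cpumask_empty(&mask))
                  arm64_send_nmi(&mask);
  }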
arch/arm64/include/asm/nmi.h | 17 ++++++++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/ipi_nmi.c | 65 ++++++++++++++++++++++++++++++++++++
3 files changed, 83 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/include/asm/nmi.h
create mode 100644 arch/arm64/kernel/ipi_nmi.c
diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
new file mode 100644
index 000000000000..2cc4b4d4090e
--- /dev/null
+++ b/arch/arm64/include/asm/nmi.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_NMI_H
+#define __ASM_NMI_H
+
+#ifndef __ASSEMBLER__
+
+#include <linux/cpumask.h>
+
+extern bool arm64_supports_nmi(void);
+extern void arm64_send_nmi(cpumask_t *mask);
+
+void set_smp_dynamic_ipi(int ipi);
+void dynamic_ipi_setup(void);
+void dynamic_ipi_teardown(void);
+
+#endif /* !__ASSEMBLER__ */
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ceba6792f5b3..d57c8f99ca69 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -34,7 +34,7 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
cpufeature.o alternative.o cacheinfo.o \
smp.o smp_spin_table.o topology.o smccc-call.o \
syscall.o proton-pack.o idreg-override.o idle.o \
- patching.o
+ patching.o ipi_nmi.o
obj-$(CONFIG_COMPAT) += sys32.o signal32.o \
sys_compat.o
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
new file mode 100644
index 000000000000..712411eed949
--- /dev/null
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * NMI support for IPIs
+ *
+ * Copyright (C) 2020 Linaro Limited
+ * Author: Sumit Garg <[email protected]>
+ */
+
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/smp.h>
+
+#include <asm/nmi.h>
+
+static struct irq_desc *ipi_nmi_desc __read_mostly;
+static int ipi_nmi_id __read_mostly;
+
+bool arm64_supports_nmi(void)
+{
+ if (ipi_nmi_desc)
+ return true;
+
+ return false;
+}
+
+void arm64_send_nmi(cpumask_t *mask)
+{
+ if (WARN_ON_ONCE(!ipi_nmi_desc))
+ return;
+
+ __ipi_send_mask(ipi_nmi_desc, mask);
+}
+
+static irqreturn_t ipi_nmi_handler(int irq, void *data)
+{
+ /* nop, NMI handlers for special features can be added here. */
+
+ return IRQ_NONE;
+}
+
+void dynamic_ipi_setup(void)
+{
+ if (!ipi_nmi_desc)
+ return;
+
+ if (!prepare_percpu_nmi(ipi_nmi_id))
+ enable_percpu_nmi(ipi_nmi_id, IRQ_TYPE_NONE);
+}
+
+void dynamic_ipi_teardown(void)
+{
+ if (!ipi_nmi_desc)
+ return;
+
+ disable_percpu_nmi(ipi_nmi_id);
+ teardown_percpu_nmi(ipi_nmi_id);
+}
+
+void __init set_smp_dynamic_ipi(int ipi)
+{
+ if (!request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", &cpu_number)) {
+ ipi_nmi_desc = irq_to_desc(ipi);
+ ipi_nmi_id = ipi;
+ }
+}
--
2.40.0.634.g4ca3ef3211-goog
From: Sumit Garg <[email protected]>
Add support to handle SGIs as pseudo NMIs. Since SGIs (IPIs) default
to a special flow handler, handle_percpu_devid_fasteoi_ipi(), skip the
NMI handler update for SGIs.
Also, enable NMI support prior to gic_smp_init(), as the allocation of
SGIs as IRQs/NMIs happens as part of that routine.
Signed-off-by: Sumit Garg <[email protected]>
Reviewed-by: Masayoshi Mizuma <[email protected]>
Tested-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
(no changes since v1)
drivers/irqchip/irq-gic-v3.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index fd134e1f481a..b402a81fea59 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -482,6 +482,7 @@ static u32 gic_get_ppi_index(struct irq_data *d)
static int gic_irq_nmi_setup(struct irq_data *d)
{
struct irq_desc *desc = irq_to_desc(d->irq);
+ u32 idx;
if (!gic_supports_nmi())
return -EINVAL;
@@ -499,16 +500,22 @@ static int gic_irq_nmi_setup(struct irq_data *d)
return -EINVAL;
/* desc lock should already be held */
- if (gic_irq_in_rdist(d)) {
- u32 idx = gic_get_ppi_index(d);
+ switch (get_intid_range(d)) {
+ case SGI_RANGE:
+ break;
+ case PPI_RANGE:
+ case EPPI_RANGE:
+ idx = gic_get_ppi_index(d);
/* Setting up PPI as NMI, only switch handler for first NMI */
if (!refcount_inc_not_zero(&ppi_nmi_refs[idx])) {
refcount_set(&ppi_nmi_refs[idx], 1);
desc->handle_irq = handle_percpu_devid_fasteoi_nmi;
}
- } else {
+ break;
+ default:
desc->handle_irq = handle_fasteoi_nmi;
+ break;
}
gic_irq_set_prio(d, GICD_INT_NMI_PRI);
@@ -519,6 +526,7 @@ static int gic_irq_nmi_setup(struct irq_data *d)
static void gic_irq_nmi_teardown(struct irq_data *d)
{
struct irq_desc *desc = irq_to_desc(d->irq);
+ u32 idx;
if (WARN_ON(!gic_supports_nmi()))
return;
@@ -536,14 +544,20 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
return;
/* desc lock should already be held */
- if (gic_irq_in_rdist(d)) {
- u32 idx = gic_get_ppi_index(d);
+ switch (get_intid_range(d)) {
+ case SGI_RANGE:
+ break;
+ case PPI_RANGE:
+ case EPPI_RANGE:
+ idx = gic_get_ppi_index(d);
/* Tearing down NMI, only switch handler for last NMI */
if (refcount_dec_and_test(&ppi_nmi_refs[idx]))
desc->handle_irq = handle_percpu_devid_irq;
- } else {
+ break;
+ default:
desc->handle_irq = handle_fasteoi_irq;
+ break;
}
gic_irq_set_prio(d, GICD_INT_DEF_PRI);
@@ -1867,6 +1881,7 @@ static int __init gic_init_bases(void __iomem *dist_base,
gic_dist_init();
gic_cpu_init();
+ gic_enable_nmi_support();
gic_smp_init();
gic_cpu_pm_init();
@@ -1879,8 +1894,6 @@ static int __init gic_init_bases(void __iomem *dist_base,
gicv2m_init(handle, gic_data.domain);
}
- gic_enable_nmi_support();
-
return 0;
out_free:
--
2.40.0.634.g4ca3ef3211-goog
As per the (somewhat recent) comment before the definition of
`__cpuidle`, the tag is like `noinstr` but also marks a function so it
can be identified by cpu_in_idle(). Let's add this.
After doing this, when we dump stack traces of all processors using
nmi_cpu_backtrace(), instead of getting useless backtraces we get
things like:
NMI backtrace for cpu N skipped: idling at cpu_do_idle+0x94/0x98
Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v8:
- "Tag the arm64 idle functions as __cpuidle" new for v8
arch/arm64/kernel/idle.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
index c1125753fe9b..05cfb347ec26 100644
--- a/arch/arm64/kernel/idle.c
+++ b/arch/arm64/kernel/idle.c
@@ -20,7 +20,7 @@
* ensure that interrupts are not masked at the PMR (because the core will
* not wake up if we block the wake up signal in the interrupt controller).
*/
-void noinstr cpu_do_idle(void)
+void __cpuidle cpu_do_idle(void)
{
struct arm_cpuidle_irq_context context;
@@ -35,7 +35,7 @@ void noinstr cpu_do_idle(void)
/*
* This is our default idle handler.
*/
-void noinstr arch_cpu_idle(void)
+void __cpuidle arch_cpu_idle(void)
{
/*
* This should do all the clock switching and wait for interrupt
--
2.40.0.634.g4ca3ef3211-goog
From: Sumit Garg <[email protected]>
Assign an unused IPI which can be turned into an NMI using the
ipi_nmi framework. Also, invoke the corresponding dynamic IPI
setup/teardown APIs.
Signed-off-by: Sumit Garg <[email protected]>
Reviewed-by: Masayoshi Mizuma <[email protected]>
Tested-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v8:
- dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
arch/arm64/kernel/smp.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 4e8327264255..94ff063527c6 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -43,6 +43,7 @@
#include <asm/daifflags.h>
#include <asm/kvm_mmu.h>
#include <asm/mmu_context.h>
+#include <asm/nmi.h>
#include <asm/numa.h>
#include <asm/processor.h>
#include <asm/smp_plat.h>
@@ -938,6 +939,8 @@ static void ipi_setup(int cpu)
for (i = 0; i < nr_ipi; i++)
enable_percpu_irq(ipi_irq_base + i, 0);
+
+ dynamic_ipi_setup();
}
#ifdef CONFIG_HOTPLUG_CPU
@@ -950,6 +953,8 @@ static void ipi_teardown(int cpu)
for (i = 0; i < nr_ipi; i++)
disable_percpu_irq(ipi_irq_base + i);
+
+ dynamic_ipi_teardown();
}
#endif
@@ -971,6 +976,9 @@ void __init set_smp_ipi_range(int ipi_base, int n)
irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
}
+ if (n > nr_ipi)
+ set_smp_dynamic_ipi(ipi_base + nr_ipi);
+
ipi_irq_base = ipi_base;
/* Setup the boot CPU immediately */
--
2.40.0.634.g4ca3ef3211-goog
From: Sumit Garg <[email protected]>
Add a boolean return value to arch_trigger_cpumask_backtrace() to
support the use-case where an architecture detects at runtime whether
it supports NMI backtrace or would like to fall back to the default
implementation using SMP cross-calls.
Currently, one such architecture is arm64, whose pseudo-NMI feature is
only available on platforms with support for GICv3 or later.
Signed-off-by: Sumit Garg <[email protected]>
Tested-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v8:
- Add loongarch support, too
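To illustrate the intent, a caller can now react to the runtime result
instead of assuming success (a sketch only; the pr_warn() text is made
up):

  /* Try NMI-based backtraces; the arch may report at runtime that
   * it can't do them (e.g. no pseudo-NMI on arm64). */
  if (!trigger_all_cpu_backtrace())
          pr_warn("all-CPU backtrace not supported on this platform\n");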
arch/arm/include/asm/irq.h | 2 +-
arch/arm/kernel/smp.c | 3 ++-
arch/loongarch/include/asm/irq.h | 2 +-
arch/loongarch/kernel/process.c | 3 ++-
arch/mips/include/asm/irq.h | 2 +-
arch/mips/kernel/process.c | 3 ++-
arch/powerpc/include/asm/nmi.h | 2 +-
arch/powerpc/kernel/stacktrace.c | 3 ++-
arch/sparc/include/asm/irq_64.h | 2 +-
arch/sparc/kernel/process_64.c | 4 +++-
arch/x86/include/asm/irq.h | 2 +-
arch/x86/kernel/apic/hw_nmi.c | 3 ++-
include/linux/nmi.h | 12 ++++--------
13 files changed, 23 insertions(+), 20 deletions(-)
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index a7c2337b0c7d..e6b62c7d6f0e 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -32,7 +32,7 @@ void init_IRQ(void);
#ifdef CONFIG_SMP
#include <linux/cpumask.h>
-extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
#endif
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 0b8c25763adc..acb97d9219b1 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -849,7 +849,8 @@ static void raise_nmi(cpumask_t *mask)
__ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
}
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_nmi);
+ return true;
}
diff --git a/arch/loongarch/include/asm/irq.h b/arch/loongarch/include/asm/irq.h
index a115e8999c69..c7a152d6bf0c 100644
--- a/arch/loongarch/include/asm/irq.h
+++ b/arch/loongarch/include/asm/irq.h
@@ -40,7 +40,7 @@ void spurious_interrupt(void);
#define NR_IRQS_LEGACY 16
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask, bool exclude_self);
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask, bool exclude_self);
#define MAX_IO_PICS 2
#define NR_IRQS (64 + (256 * MAX_IO_PICS))
diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c
index fa2443c7afb2..8f7f818f5c4e 100644
--- a/arch/loongarch/kernel/process.c
+++ b/arch/loongarch/kernel/process.c
@@ -339,9 +339,10 @@ static void raise_backtrace(cpumask_t *mask)
}
}
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
+ return true;
}
#ifdef CONFIG_64BIT
diff --git a/arch/mips/include/asm/irq.h b/arch/mips/include/asm/irq.h
index 44f9824c1d8c..daf16173486a 100644
--- a/arch/mips/include/asm/irq.h
+++ b/arch/mips/include/asm/irq.h
@@ -77,7 +77,7 @@ extern int cp0_fdc_irq;
extern int get_c0_fdc_int(void);
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 093dbbd6b843..7d538571830a 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -750,9 +750,10 @@ static void raise_backtrace(cpumask_t *mask)
}
}
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
+ return true;
}
int mips_get_process_fp_mode(struct task_struct *task)
diff --git a/arch/powerpc/include/asm/nmi.h b/arch/powerpc/include/asm/nmi.h
index c3c7adef74de..135f65adcf63 100644
--- a/arch/powerpc/include/asm/nmi.h
+++ b/arch/powerpc/include/asm/nmi.h
@@ -12,7 +12,7 @@ static inline void watchdog_nmi_set_timeout_pct(u64 pct) {}
#endif
#ifdef CONFIG_NMI_IPI
-extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
#endif
diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 5de8597eaab8..0fee4bded7ba 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -221,8 +221,9 @@ static void raise_backtrace_ipi(cpumask_t *mask)
}
}
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace_ipi);
+ return true;
}
#endif /* defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_NMI_IPI) */
diff --git a/arch/sparc/include/asm/irq_64.h b/arch/sparc/include/asm/irq_64.h
index 154df2cf19f4..00a0051a9da0 100644
--- a/arch/sparc/include/asm/irq_64.h
+++ b/arch/sparc/include/asm/irq_64.h
@@ -87,7 +87,7 @@ static inline unsigned long get_softint(void)
return retval;
}
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index 91c2b8124527..f9aea1df3adf 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -236,7 +236,7 @@ static void __global_reg_poll(struct global_reg_snapshot *gp)
}
}
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
struct thread_info *tp = current_thread_info();
struct pt_regs *regs = get_irq_regs();
@@ -291,6 +291,8 @@ void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
memset(global_cpu_snapshot, 0, sizeof(global_cpu_snapshot));
spin_unlock_irqrestore(&global_cpu_snapshot_lock, flags);
+
+ return true;
}
#ifdef CONFIG_MAGIC_SYSRQ
diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index 768aa234cbb4..f731638cc38e 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -43,7 +43,7 @@ extern void init_ISA_irqs(void);
extern void __init init_IRQ(void);
#ifdef CONFIG_X86_LOCAL_APIC
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 34a992e275ef..e7dcd28bc824 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -34,10 +34,11 @@ static void nmi_raise_cpu_backtrace(cpumask_t *mask)
apic->send_IPI_mask(mask, NMI_VECTOR);
}
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self,
nmi_raise_cpu_backtrace);
+ return true;
}
static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 048c0b9aa623..7d8a77cd1e03 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -145,26 +145,22 @@ static inline void touch_nmi_watchdog(void)
#ifdef arch_trigger_cpumask_backtrace
static inline bool trigger_all_cpu_backtrace(void)
{
- arch_trigger_cpumask_backtrace(cpu_online_mask, false);
- return true;
+ return arch_trigger_cpumask_backtrace(cpu_online_mask, false);
}
static inline bool trigger_allbutself_cpu_backtrace(void)
{
- arch_trigger_cpumask_backtrace(cpu_online_mask, true);
- return true;
+ return arch_trigger_cpumask_backtrace(cpu_online_mask, true);
}
static inline bool trigger_cpumask_backtrace(struct cpumask *mask)
{
- arch_trigger_cpumask_backtrace(mask, false);
- return true;
+ return arch_trigger_cpumask_backtrace(mask, false);
}
static inline bool trigger_single_cpu_backtrace(int cpu)
{
- arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
- return true;
+ return arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
}
/* generic implementation */
--
2.40.0.634.g4ca3ef3211-goog
From: Sumit Garg <[email protected]>
Add a new API, kgdb_smp_call_nmi_hook(), to expose the default CPUs
roundup mechanism to a particular architecture as a runtime fallback if
it detects that NMI roundup isn't supported.
Currently, one such architecture is arm64, whose pseudo-NMI feature is
only available on platforms with support for GICv3 or later.
Signed-off-by: Sumit Garg <[email protected]>
Tested-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
(no changes since v1)
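For illustration, an architecture override would use the new fallback
roughly like this (a sketch; the real arm64 version appears later in
this series, and arch_supports_nmi_roundup() is a made-up placeholder):

  void kgdb_roundup_cpus(void)
  {
          /* arch_supports_nmi_roundup() is hypothetical. */
          if (!arch_supports_nmi_roundup()) {
                  /* No NMI: use the default IPI-based roundup. */
                  kgdb_smp_call_nmi_hook();
                  return;
          }

          /* ... NMI-based roundup of the other CPUs ... */
  }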
include/linux/kgdb.h | 12 ++++++++++++
kernel/debug/debug_core.c | 8 +++++++-
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 258cdde8d356..87713bd390f3 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -199,6 +199,18 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
extern void kgdb_call_nmi_hook(void *ignored);
+/**
+ * kgdb_smp_call_nmi_hook - Provide default fallback mechanism to
+ * round-up CPUs
+ *
+ * If you're using the default implementation of kgdb_roundup_cpus()
+ * this function will be called. If an arch detects at runtime that it
+ * does not support NMI-based roundup, it can fall back to the default
+ * mechanism using this API.
+ */
+
+extern void kgdb_smp_call_nmi_hook(void);
+
/**
* kgdb_roundup_cpus - Get other CPUs into a holding pattern
*
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index d5e9ccde3ab8..14d40a7d6a4b 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -238,7 +238,7 @@ NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
static DEFINE_PER_CPU(call_single_data_t, kgdb_roundup_csd) =
CSD_INIT(kgdb_call_nmi_hook, NULL);
-void __weak kgdb_roundup_cpus(void)
+void kgdb_smp_call_nmi_hook(void)
{
call_single_data_t *csd;
int this_cpu = raw_smp_processor_id();
@@ -269,6 +269,12 @@ void __weak kgdb_roundup_cpus(void)
kgdb_info[cpu].rounding_up = false;
}
}
+NOKPROBE_SYMBOL(kgdb_smp_call_nmi_hook);
+
+void __weak kgdb_roundup_cpus(void)
+{
+ kgdb_smp_call_nmi_hook();
+}
NOKPROBE_SYMBOL(kgdb_roundup_cpus);
#endif
--
2.40.0.634.g4ca3ef3211-goog
To save architectures from needing to wrap the call in #ifdefs, add a
stub no-op version of kgdb_nmicallback(), which returns 1 if it didn't
handle anything.
Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v8:
- "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
include/linux/kgdb.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 87713bd390f3..9ce628ee47cc 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -377,5 +377,6 @@ extern void kgdb_free_init_mem(void);
#define dbg_late_init()
static inline void kgdb_panic(const char *msg) {}
static inline void kgdb_free_init_mem(void) { }
+static int kgdb_nmicallback(int cpu, void *regs) { return 1; }
#endif /* ! CONFIG_KGDB */
#endif /* _KGDB_H_ */
--
2.40.0.634.g4ca3ef3211-goog
From: Sumit Garg <[email protected]>
Enable NMI backtrace support on arm64 using an IPI turned into an NMI,
leveraging pseudo-NMI support. It is now possible for users to get a
backtrace of a CPU stuck in a hard lockup using magic SysRq.
Signed-off-by: Sumit Garg <[email protected]>
Tested-by: Masayoshi Mizuma <[email protected]>
Tested-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
Changes in v8:
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
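As a usage example, with this applied (and pseudo-NMI enabled) the
standard SysRq trigger exercises the new path:

  # Dump a backtrace of all active CPUs via magic SysRq 'l':
  echo l > /proc/sysrq-trigger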
arch/arm64/include/asm/irq.h | 4 ++++
arch/arm64/kernel/ipi_nmi.c | 18 ++++++++++++++++--
2 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index fac08e18bcd5..dc35b9d23a81 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -6,6 +6,10 @@
#include <asm-generic/irq.h>
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+ bool exclude_self);
+#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
+
struct pt_regs;
int set_handle_irq(void (*handle_irq)(struct pt_regs *));
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index 712411eed949..c592e92b8cbf 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@
#include <linux/interrupt.h>
#include <linux/irq.h>
+#include <linux/nmi.h>
#include <linux/smp.h>
#include <asm/nmi.h>
@@ -31,11 +32,24 @@ void arm64_send_nmi(cpumask_t *mask)
__ipi_send_mask(ipi_nmi_desc, mask);
}
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+{
+ if (!ipi_nmi_desc)
+ return false;
+
+ nmi_trigger_cpumask_backtrace(mask, exclude_self, arm64_send_nmi);
+
+ return true;
+}
+
static irqreturn_t ipi_nmi_handler(int irq, void *data)
{
- /* nop, NMI handlers for special features can be added here. */
+ irqreturn_t ret = IRQ_NONE;
+
+ if (nmi_cpu_backtrace(get_irq_regs()))
+ ret = IRQ_HANDLED;
- return IRQ_NONE;
+ return ret;
}
void dynamic_ipi_setup(void)
--
2.40.0.634.g4ca3ef3211-goog
From: Sumit Garg <[email protected]>
arm64 platforms with GICv3 or later support pseudo NMIs, which can be
leveraged to round up CPUs that are stuck in a hard-lockup state with
interrupts disabled, something that wouldn't be possible with a normal
IPI.
So switch to rounding up CPUs using an IPI turned into an NMI. In case
a particular arm64 platform doesn't support pseudo NMIs, it will fall
back to the default kgdb CPUs roundup mechanism.
Signed-off-by: Sumit Garg <[email protected]>
Tested-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Douglas Anderson <[email protected]>
---
(no changes since v1)
arch/arm64/kernel/ipi_nmi.c | 5 +++++
arch/arm64/kernel/kgdb.c | 18 ++++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index c592e92b8cbf..2adaaf1519e5 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@
#include <linux/interrupt.h>
#include <linux/irq.h>
+#include <linux/kgdb.h>
#include <linux/nmi.h>
#include <linux/smp.h>
@@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
static irqreturn_t ipi_nmi_handler(int irq, void *data)
{
irqreturn_t ret = IRQ_NONE;
+ unsigned int cpu = smp_processor_id();
if (nmi_cpu_backtrace(get_irq_regs()))
ret = IRQ_HANDLED;
+ if (!kgdb_nmicallback(cpu, get_irq_regs()))
+ ret = IRQ_HANDLED;
+
return ret;
}
diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index cda9c1e9864f..2c85bc1df013 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -17,6 +17,7 @@
#include <asm/debug-monitors.h>
#include <asm/insn.h>
+#include <asm/nmi.h>
#include <asm/patching.h>
#include <asm/traps.h>
@@ -354,3 +355,20 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
return aarch64_insn_write((void *)bpt->bpt_addr,
*(u32 *)bpt->saved_instr);
}
+
+void kgdb_roundup_cpus(void)
+{
+ struct cpumask mask;
+
+ if (!arm64_supports_nmi()) {
+ kgdb_smp_call_nmi_hook();
+ return;
+ }
+
+ cpumask_copy(&mask, cpu_online_mask);
+ cpumask_clear_cpu(raw_smp_processor_id(), &mask);
+ if (cpumask_empty(&mask))
+ return;
+
+ arm64_send_nmi(&mask);
+}
--
2.40.0.634.g4ca3ef3211-goog
The current ipi_nmi implementation relies on the arm64 pseudo-NMI
support. This needs to be enabled in both the kernel config with
CONFIG_ARM64_PSEUDO_NMI and on the kernel command line with
"irqchip.gicv3_pseudo_nmi=1".
Let's add a fallback of using a regular IPI if the NMI isn't enabled.
This matches what arm32 does all the time, since there is no NMI there.
The reason for doing this is to make the trigger_all_cpu_backtrace()
class of functions work. While those functions all return a bool
indicating that the caller should try a fallback upon failure, an
inspection of the callers shows that nearly nobody implements a
fallback. It's better to at least provide something here.
Signed-off-by: Douglas Anderson <[email protected]>
---
I dunno what people think of this patch. If it's great, we could
actually drop some of the patches out of this series since some of
them are to account for the fact that we might not be able to register
an "ipi_nmi". If it's awful, it could simply be dropped.
Changes in v8:
- "Fallback to a regular IPI if NMI isn't enabled" new for v8
arch/arm64/kernel/ipi_nmi.c | 31 +++++++++++++++++++++++++------
1 file changed, 25 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index 2adaaf1519e5..02868752845c 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -16,6 +16,7 @@
static struct irq_desc *ipi_nmi_desc __read_mostly;
static int ipi_nmi_id __read_mostly;
+static bool is_nmi;
bool arm64_supports_nmi(void)
{
@@ -62,8 +63,12 @@ void dynamic_ipi_setup(void)
if (!ipi_nmi_desc)
return;
- if (!prepare_percpu_nmi(ipi_nmi_id))
- enable_percpu_nmi(ipi_nmi_id, IRQ_TYPE_NONE);
+ if (is_nmi) {
+ if (!prepare_percpu_nmi(ipi_nmi_id))
+ enable_percpu_nmi(ipi_nmi_id, IRQ_TYPE_NONE);
+ } else {
+ enable_percpu_irq(ipi_nmi_id, IRQ_TYPE_NONE);
+ }
}
void dynamic_ipi_teardown(void)
@@ -71,14 +76,28 @@ void dynamic_ipi_teardown(void)
if (!ipi_nmi_desc)
return;
- disable_percpu_nmi(ipi_nmi_id);
- teardown_percpu_nmi(ipi_nmi_id);
+ if (is_nmi) {
+ disable_percpu_nmi(ipi_nmi_id);
+ teardown_percpu_nmi(ipi_nmi_id);
+ } else {
+ disable_percpu_irq(ipi_nmi_id);
+ }
}
void __init set_smp_dynamic_ipi(int ipi)
{
+ int err;
+
if (!request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", &cpu_number)) {
- ipi_nmi_desc = irq_to_desc(ipi);
- ipi_nmi_id = ipi;
+ is_nmi = true;
+ } else {
+ err = request_percpu_irq(ipi, ipi_nmi_handler, "IPI", &cpu_number);
+ if (WARN_ON(err))
+ return;
+
+ irq_set_status_flags(ipi, IRQ_HIDDEN);
}
+
+ ipi_nmi_desc = irq_to_desc(ipi);
+ ipi_nmi_id = ipi;
}
--
2.40.0.634.g4ca3ef3211-goog
Hi,
On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <[email protected]> wrote:
>
> From: Sumit Garg <[email protected]>
>
> Assign an unused IPI which can be turned into an NMI using the
> ipi_nmi framework. Also, invoke the corresponding dynamic IPI
> setup/teardown APIs.
>
> Signed-off-by: Sumit Garg <[email protected]>
> Reviewed-by: Masayoshi Mizuma <[email protected]>
> Tested-by: Chen-Yu Tsai <[email protected]>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> Changes in v8:
> - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
>
> arch/arm64/kernel/smp.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 4e8327264255..94ff063527c6 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -43,6 +43,7 @@
> #include <asm/daifflags.h>
> #include <asm/kvm_mmu.h>
> #include <asm/mmu_context.h>
> +#include <asm/nmi.h>
> #include <asm/numa.h>
> #include <asm/processor.h>
> #include <asm/smp_plat.h>
> @@ -938,6 +939,8 @@ static void ipi_setup(int cpu)
>
> for (i = 0; i < nr_ipi; i++)
> enable_percpu_irq(ipi_irq_base + i, 0);
> +
> + dynamic_ipi_setup();
> }
>
> #ifdef CONFIG_HOTPLUG_CPU
> @@ -950,6 +953,8 @@ static void ipi_teardown(int cpu)
>
> for (i = 0; i < nr_ipi; i++)
> disable_percpu_irq(ipi_irq_base + i);
> +
> + dynamic_ipi_teardown();
> }
> #endif
>
> @@ -971,6 +976,9 @@ void __init set_smp_ipi_range(int ipi_base, int n)
> irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
> }
>
> + if (n > nr_ipi)
> + set_smp_dynamic_ipi(ipi_base + nr_ipi);
From thinking about this patch more, I'm guessing that the biggest
objection someone could have is that this uses up the "last" IPI slot
in any systems that are stuck with only 8. Is that the reason that
this patch series stagnated in the past, or was it something else?
If this is truly the concern that people have, it doesn't look like it
would be terribly hard to combine the existing IPI_CPU_STOP and
IPI_CPU_CRASH_STOP. Presumably we could just get rid of the "crash
stop" IPI and have the normal "stop" IPI do the crash if
"waiting_for_crash_ipi" is non-zero. If that's the thing blocking the
series from moving forward then I can add that to the series, or we
could just wait until someone actually needs another IPI. ;-)
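In case it helps the discussion, the combination could look roughly
like this (an untested sketch against arm64's smp.c, assuming the
existing waiting_for_crash_ipi / local_cpu_stop() /
ipi_cpu_crash_stop() pieces stay as they are):

  static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
  {
          /* One IPI for both cases: take the crash path only if a
           * crash stop was requested; otherwise do a normal stop. */
          if (atomic_read(&waiting_for_crash_ipi))
                  ipi_cpu_crash_stop(cpu, regs);
          else
                  local_cpu_stop();
  }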
-Doug
Hi,
On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <[email protected]> wrote:
>
> This is an attempt to resurrect Sumit's old patch series [1] that
> allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> could find was v7, so I called this series v8. I haven't copied all of
> his old changelogs here, but you can find them from the link.
>
> Since v7, I have:
> * Addressed the small amount of feedback that was there for v7.
> * Rebased.
> * Added a new patch that prevents us from spamming the logs with idle
> tasks.
> * Added an extra patch to gracefully fall back to regular IPIs if
> pseudo-NMIs aren't there.
>
> Since there appear to be a few different patch series related to
> being able to use NMIs to get stack traces of crashed systems, let me
> try to organize them to the best of my understanding:
>
> a) This series. On its own, a) will (among other things) enable stack
> traces of all running processes with the soft lockup detector if
> you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> its own, a) doesn't give a hard lockup detector.
>
> b) A different recently-posted series [2] that adds a hard lockup
> detector based on perf. On its own, b) gives a stack crawl of the
> locked up CPU but no stack crawls of other CPUs (even if they're
> locked too). Together with a) + b) we get everything (full lockup
> detect, full ability to get stack crawls).
>
> c) The old Android "buddy" hard lockup detector [3] that I'm
> considering trying to upstream. If b) lands then I believe c) would
> be redundant (at least for arm64). c) on its own is really only
> useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> (see [4]). a) + c) is roughly as good as a) + b).
>
> [1] https://lore.kernel.org/linux-arm-kernel/[email protected]/
> [2] https://lore.kernel.org/linux-arm-kernel/[email protected]/
> [3] https://issuetracker.google.com/172213097
> [4] https://issuetracker.google.com/172213129
>
> Changes in v8:
> - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
> - Add loongarch support, too
> - Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
> - "Tag the arm64 idle functions as __cpuidle" new for v8
> - "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
> - "Fallback to a regular IPI if NMI isn't enabled" new for v8
>
> Douglas Anderson (3):
> arm64: idle: Tag the arm64 idle functions as __cpuidle
> kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
> arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled
>
> Sumit Garg (7):
> arm64: Add framework to turn IPI as NMI
> irqchip/gic-v3: Enable support for SGIs to act as NMIs
> arm64: smp: Assign and setup an IPI as NMI
> nmi: backtrace: Allow runtime arch specific override
> arm64: ipi_nmi: Add support for NMI backtrace
> kgdb: Expose default CPUs roundup fallback mechanism
> arm64: kgdb: Roundup cpus using IPI as NMI
>
> arch/arm/include/asm/irq.h | 2 +-
> arch/arm/kernel/smp.c | 3 +-
> arch/arm64/include/asm/irq.h | 4 ++
> arch/arm64/include/asm/nmi.h | 17 +++++
> arch/arm64/kernel/Makefile | 2 +-
> arch/arm64/kernel/idle.c | 4 +-
> arch/arm64/kernel/ipi_nmi.c | 103 +++++++++++++++++++++++++++++++
> arch/arm64/kernel/kgdb.c | 18 ++++++
> arch/arm64/kernel/smp.c | 8 +++
> arch/loongarch/include/asm/irq.h | 2 +-
> arch/loongarch/kernel/process.c | 3 +-
> arch/mips/include/asm/irq.h | 2 +-
> arch/mips/kernel/process.c | 3 +-
> arch/powerpc/include/asm/nmi.h | 2 +-
> arch/powerpc/kernel/stacktrace.c | 3 +-
> arch/sparc/include/asm/irq_64.h | 2 +-
> arch/sparc/kernel/process_64.c | 4 +-
> arch/x86/include/asm/irq.h | 2 +-
> arch/x86/kernel/apic/hw_nmi.c | 3 +-
> drivers/irqchip/irq-gic-v3.c | 29 ++++++---
> include/linux/kgdb.h | 13 ++++
> include/linux/nmi.h | 12 ++--
> kernel/debug/debug_core.c | 8 ++-
> 23 files changed, 217 insertions(+), 32 deletions(-)
It's been 3 weeks and I haven't heard a peep on this series. That
means nobody has any objections and it's all good to land, right?
Right? :-P
Seriously, though, I really think we should figure out how to get this
landed. There's obviously interest from several different parties and
I'm chomping at the bit waiting to spin this series according to your
wishes. I also don't think there's anything super scary/ugly here. IMO
the ideal situation is that folks are OK with the idea presented in
the last patch in the series ("arm64: ipi_nmi: Fallback to a regular
IPI if NMI isn't enabled") and then I can re-spin this series to be
_much_ simpler. That being said, I'm also OK with dropping that patch
and starting the discussion for how to land the rest of the patches in
the series.
Please let me know!
-Doug
On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> Hi,
Hi Doug,
> On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <[email protected]> wrote:
> > This is an attempt to resurrect Sumit's old patch series [1] that
> > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > could find was v7, so I called this series v8. I haven't copied all of
> > his old changelogs here, but you can find them from the link.
> >
> > Since v7, I have:
> > * Addressed the small amount of feedback that was there for v7.
> > * Rebased.
> > * Added a new patch that prevents us from spamming the logs with idle
> > tasks.
> > * Added an extra patch to gracefully fall back to regular IPIs if
> > pseudo-NMIs aren't there.
> >
> > Since there appear to be a few different patch series related to
> > being able to use NMIs to get stack traces of crashed systems, let me
> > try to organize them to the best of my understanding:
> >
> > a) This series. On its own, a) will (among other things) enable stack
> > traces of all running processes with the soft lockup detector if
> > you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > its own, a) doesn't give a hard lockup detector.
> >
> > b) A different recently-posted series [2] that adds a hard lockup
> > detector based on perf. On its own, b) gives a stack crawl of the
> > locked up CPU but no stack crawls of other CPUs (even if they're
> > locked too). Together with a) + b) we get everything (full lockup
> > detect, full ability to get stack crawls).
> >
> > c) The old Android "buddy" hard lockup detector [3] that I'm
> > considering trying to upstream. If b) lands then I believe c) would
> > be redundant (at least for arm64). c) on its own is really only
> > useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > (see [4]). a) + c) is roughly as good as a) + b).
> It's been 3 weeks and I haven't heard a peep on this series. That
> means nobody has any objections and it's all good to land, right?
> Right? :-P
FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
support (and fixing that requires an overhaul of our DAIF / IRQ flag
management, which I've been chipping away at for a number of releases), so I
hadn't looked at this in detail yet because the foundations are still somewhat
dodgy.
I appreciate that this has been around for a while, and it's on my queue to
look at.
Thanks,
Mark.
Hi,
On Wed, May 10, 2023 at 9:30 AM Mark Rutland <[email protected]> wrote:
>
> On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> > Hi,
>
> Hi Doug,
>
> > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <[email protected]> wrote:
> > > This is an attempt to resurrect Sumit's old patch series [1] that
> > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > > could find was v7, so I called this series v8. I haven't copied all of
> > > his old changelogs here, but you can find them from the link.
> > >
> > > Since v7, I have:
> > > * Addressed the small amount of feedback that was there for v7.
> > > * Rebased.
> > > * Added a new patch that prevents us from spamming the logs with idle
> > > tasks.
> > > * Added an extra patch to gracefully fall back to regular IPIs if
> > > pseudo-NMIs aren't there.
> > >
> > > Since there appear to be a few different patch series related to
> > > being able to use NMIs to get stack traces of crashed systems, let me
> > > try to organize them to the best of my understanding:
> > >
> > > a) This series. On its own, a) will (among other things) enable stack
> > > traces of all running processes with the soft lockup detector if
> > > you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > > its own, a) doesn't give a hard lockup detector.
> > >
> > > b) A different recently-posted series [2] that adds a hard lockup
> > > detector based on perf. On its own, b) gives a stack crawl of the
> > > locked up CPU but no stack crawls of other CPUs (even if they're
> > > locked too). Together with a) + b) we get everything (full lockup
> > > detect, full ability to get stack crawls).
> > >
> > > c) The old Android "buddy" hard lockup detector [3] that I'm
> > > considering trying to upstream. If b) lands then I believe c) would
> > > be redundant (at least for arm64). c) on its own is really only
> > > useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > > (see [4]). a) + c) is roughly as good as a) + b).
>
> > It's been 3 weeks and I haven't heard a peep on this series. That
> > means nobody has any objections and it's all good to land, right?
> > Right? :-P
>
> FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
> support (and fixing that requires an overhaul of our DAIF / IRQ flag
> management, which I've been chipping away at for a number of releases), so I
> hadn't looked at this in detail yet because the foundations are still somewhat
> dodgy.
>
> I appreciate that this has been around for a while, and it's on my queue to
> look at.
Ah, thanks for the heads up! We've been thinking about turning this on
in production in ChromeOS because it will help us track down a whole
class of field-generated crash reports that are otherwise opaque to
us. It sounds as if maybe that's not a good idea quite yet? Do you
have any idea of how much farther along this needs to go? ...of
course, we've also run into issues with Mediatek devices because they
don't save/restore GICR registers properly [1]. In theory, we might be
able to work around that in the kernel.
In any case, even if there are bugs that would prevent turning this on
for production, it still seems like we could still land this series.
It simply wouldn't do anything until someone turned on pseudo NMIs,
which wouldn't happen till the kinks are worked out.
...actually, I guess I should say that if all the patches of the
current series do land then it actually _would_ still do something,
even without pseudo-NMI. Assuming the last patch looks OK, it would at
least start falling back to using regular IPIs to do backtraces. That
wouldn't get backtraces on hard locked up CPUs but it would be better
than what we have today where we don't get any backtraces. This would
get arm64 on par with arm32...
[1] https://issuetracker.google.com/281831288
On Wed, Apr 19, 2023 at 03:56:00PM -0700, Douglas Anderson wrote:
> As per the (somewhat recent) comment before the definition of
> `__cpuidle`, the tag is like `noinstr` but also marks a function so it
> can be identified by cpu_in_idle(). Let's add this.
>
> After doing this, when we dump stack traces of all processors using
> nmi_cpu_backtrace(), instead of getting useless backtraces we get
> things like:
>
> NMI backtrace for cpu N skipped: idling at cpu_do_idle+0x94/0x98
As a heads-up, this is only going to work in the trivial case where a CPU is
within the default cpu_do_idle(), and not for anything using PSCI
cpu_suspend() (which I'd *really* hope is the common case).
That doesn't get inlined, and the invocation is shared with other SMCCC users,
so we probably need more work there if culling idle backtraces is important.
I'm not averse to this change itself.
Mark.
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> Changes in v8:
> - "Tag the arm64 idle functions as __cpuidle" new for v8
>
> arch/arm64/kernel/idle.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
> index c1125753fe9b..05cfb347ec26 100644
> --- a/arch/arm64/kernel/idle.c
> +++ b/arch/arm64/kernel/idle.c
> @@ -20,7 +20,7 @@
> * ensure that interrupts are not masked at the PMR (because the core will
> * not wake up if we block the wake up signal in the interrupt controller).
> */
> -void noinstr cpu_do_idle(void)
> +void __cpuidle cpu_do_idle(void)
> {
> struct arm_cpuidle_irq_context context;
>
> @@ -35,7 +35,7 @@ void noinstr cpu_do_idle(void)
> /*
> * This is our default idle handler.
> */
> -void noinstr arch_cpu_idle(void)
> +void __cpuidle arch_cpu_idle(void)
> {
> /*
> * This should do all the clock switching and wait for interrupt
> --
> 2.40.0.634.g4ca3ef3211-goog
>
Hi,
On Wed, May 10, 2023 at 9:43 AM Mark Rutland <[email protected]> wrote:
>
> On Wed, Apr 19, 2023 at 03:56:00PM -0700, Douglas Anderson wrote:
> > As per the (somewhat recent) comment before the definition of
> > `__cpuidle`, the tag is like `noinstr` but also marks a function so it
> > can be identified by cpu_in_idle(). Let's add this.
> >
> > After doing this, when we dump stack traces of all processors
> > using nmi_cpu_backtrace(), instead of getting useless backtraces we
> > get things like:
> >
> > NMI backtrace for cpu N skipped: idling at cpu_do_idle+0x94/0x98
>
> As a heads-up, this is only going to work in the trivial case where a CPU is
> within the default cpu_do_idle(), and not for anything using PSCI
> cpu_suspend() (which I'd *really* hope is the common case).
Thanks for the review and the heads up! Right. It only catches shallow
idle. Specifically this was the stack trace when it was "caught" on a
sc7180-trogdor device:
cpu_do_idle+0x94/0x98
psci_enter_idle_state+0x58/0x70
cpuidle_enter_state+0xb8/0x260
cpuidle_enter+0x44/0x5c
do_idle+0x188/0x30c
...
I checked in kgdb and saw that "psci_enter_idle_state()" had "idx" as
0, which makes sense since __CPU_PM_CPU_IDLE_ENTER will directly call
cpu_do_idle() when idx is 0.
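(For reference, the relevant bit of __CPU_PM_CPU_IDLE_ENTER() in
include/linux/cpuidle.h is roughly:

  if (!idx) {
          cpu_do_idle();
          return idx;
  }

...so state 0 stays inside the __cpuidle-tagged function and gets
caught.)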
I agree that it doesn't catch every case. Certainly it's not too hard
to see a CPU here:
gic_cpu_sys_reg_init+0x1f8/0x314
gic_cpu_pm_notifier+0x40/0x78
raw_notifier_call_chain+0x5c/0x134
cpu_pm_notify+0x38/0x64
cpu_pm_exit+0x20/0x2c
psci_enter_idle_state+0x48/0x70
cpuidle_enter_state+0xb8/0x260
cpuidle_enter+0x44/0x5c
do_idle+0x188/0x30c
...
...and kgdb showed "idx" was 3.
That being said, catching some of the cases is better than catching
none of the cases. ;-)
In reality, I've seen cases where it catches most CPUs. For instance,
running soon after bootup on my sc7180-trogdor device:
echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
...and then having the "buddy" hardlockup detector [1] catch the
crash, the idle check caught all of the CPUs other than the one that
was locked up and the one that detected it. Specifically:
NMI backtrace for cpu 2 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 1 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 0 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 3 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 4 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 7 skipped: idling at cpu_do_idle+0x94/0x98
I haven't analyzed it, but I guess it's possible that when the buddy
hardlockup detector triggers, other CPUs are more likely to be in a
shallow idle? Certainly I seem to catch a lot more CPUs in a shallow
idle in buddy lockup reports...
[1] https://lore.kernel.org/r/[email protected]
> That doesn't get inlined, and the invocation is shared with other SMCCC users,
> so we probably need more work there if culling idle backtraces is important.
At this point I'd say that we should just take the low hanging fruit
(this patch). If someone later wants to try to do better they can.
> I'm not averse to this change itself.
Any chance you'd be willing to give any tags to it? :-P Do you need me
to add any of the above caveats to the commit message?
I also certainly wouldn't object to this patch landing even if others
aren't ready. It has no dependencies on any other patches and just
makes the debug messages prettier in some cases.
-Doug
Hi,
On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <[email protected]> wrote:
>
> To save architectures from needing to wrap the call in #ifdefs, add a
> stub no-op version of kgdb_nmicallback(), which returns 1 if it didn't
> handle anything.
>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> Changes in v8:
> - "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
>
> include/linux/kgdb.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index 87713bd390f3..9ce628ee47cc 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -377,5 +377,6 @@ extern void kgdb_free_init_mem(void);
> #define dbg_late_init()
> static inline void kgdb_panic(const char *msg) {}
> static inline void kgdb_free_init_mem(void) { }
> +static int kgdb_nmicallback(int cpu, void *regs) { return 1; }
FWIW: I just realized that the above needs an "inline" to make the
compiler not complain. I'm still hoping for more feedback on the
series, but I'll plan to fix that in the next spin.
-Doug
On Wed, Apr 19, 2023 at 03:56:03PM -0700, Douglas Anderson wrote:
> From: Sumit Garg <[email protected]>
>
> arm64 platforms with GICv3 or later support pseudo NMIs, which can
> be leveraged to round up CPUs that are stuck in a hard-lockup state
> with interrupts disabled, something that wouldn't be possible with a
> normal IPI.
>
> So switch to rounding up CPUs using an IPI turned into an NMI. In
> case a particular arm64 platform doesn't support pseudo NMIs, it will
> fall back to the default kgdb CPUs roundup mechanism.
>
> Signed-off-by: Sumit Garg <[email protected]>
> Tested-by: Chen-Yu Tsai <[email protected]>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> (no changes since v1)
>
> arch/arm64/kernel/ipi_nmi.c | 5 +++++
> arch/arm64/kernel/kgdb.c | 18 ++++++++++++++++++
> 2 files changed, 23 insertions(+)
>
> diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> index c592e92b8cbf..2adaaf1519e5 100644
> --- a/arch/arm64/kernel/ipi_nmi.c
> +++ b/arch/arm64/kernel/ipi_nmi.c
> @@ -8,6 +8,7 @@
>
> #include <linux/interrupt.h>
> #include <linux/irq.h>
> +#include <linux/kgdb.h>
> #include <linux/nmi.h>
> #include <linux/smp.h>
>
> @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> static irqreturn_t ipi_nmi_handler(int irq, void *data)
> {
> irqreturn_t ret = IRQ_NONE;
> + unsigned int cpu = smp_processor_id();
Does this play nice with CONFIG_DEBUG_PREEMPT? I may have missed
something about the NMI entry but a quick scan of the arm64
arch_irq_disabled() suggests that debug_smp_processor_id() will issue
warnings at this point...
Other than that, I didn't see anything I don't like here.
Daniel.
On Thu, May 11, 2023 at 07:34:30AM -0700, Doug Anderson wrote:
> Hi,
>
> On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <[email protected]> wrote:
> >
> > To save architectures from needing to wrap the call in #ifdefs, add a
> > stub no-op version of kgdb_nmicallback(), which returns 1 if it didn't
> > handle anything.
> >
> > Signed-off-by: Douglas Anderson <[email protected]>
> > ---
> >
> > Changes in v8:
> > - "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
> >
> > include/linux/kgdb.h | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > index 87713bd390f3..9ce628ee47cc 100644
> > --- a/include/linux/kgdb.h
> > +++ b/include/linux/kgdb.h
> > @@ -377,5 +377,6 @@ extern void kgdb_free_init_mem(void);
> > #define dbg_late_init()
> > static inline void kgdb_panic(const char *msg) {}
> > static inline void kgdb_free_init_mem(void) { }
> > +static int kgdb_nmicallback(int cpu, void *regs) { return 1; }
>
> FWIW: I just realized that the above needs an "inline" to make the
> compiler not complain. I'm still hoping for more feedback on the
> series, but I'll plan to fix that in the next spin.
On the next spin feel free to add:
Reviewed-by: Daniel Thompson <[email protected]>
Daniel.
On Wed, Apr 19, 2023 at 03:56:01PM -0700, Douglas Anderson wrote:
> From: Sumit Garg <[email protected]>
>
> > Add a new API, kgdb_smp_call_nmi_hook(), to expose the default
> > CPUs roundup mechanism to a particular architecture as a runtime
> > fallback if it detects that NMI roundup isn't supported.
>
> > Currently, one such architecture is arm64, whose pseudo-NMI
> > feature is only available on platforms with support for GICv3 or
> > later.
>
> Signed-off-by: Sumit Garg <[email protected]>
> Tested-by: Chen-Yu Tsai <[email protected]>
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
>
> (no changes since v1)
>
> include/linux/kgdb.h | 12 ++++++++++++
> kernel/debug/debug_core.c | 8 +++++++-
> 2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index 258cdde8d356..87713bd390f3 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -199,6 +199,18 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
>
> extern void kgdb_call_nmi_hook(void *ignored);
>
> +/**
> + * kgdb_smp_call_nmi_hook - Provide default fallback mechanism to
> + * round-up CPUs
> + *
> + * If you're using the default implementation of kgdb_roundup_cpus()
> > + * this function will be called. If an arch detects at runtime that it
> > + * does not support NMI-based roundup, it can fall back to the default
> > + * mechanism using this API.
> + */
> +
> +extern void kgdb_smp_call_nmi_hook(void);
Concept looks sensible but this is a terrible name for a command to
round up the CPUs using smp_call... functions. Whilst it is true that
kgdb_roundup_cpus() does use kgdb_call_nmi_hook() internally, that
doesn't mean we should name functions after it. They should be named
after what they do, not how they do it.
Something more like kgdb_roundup_cpus_with_smp_call() would be a much
better name.
Daniel.
Hi,
On Fri, May 12, 2023 at 7:00 AM Daniel Thompson
<[email protected]> wrote:
>
> On Wed, Apr 19, 2023 at 03:56:03PM -0700, Douglas Anderson wrote:
> > From: Sumit Garg <[email protected]>
> >
> > arm64 platforms with GICv3 or later support pseudo NMIs, which can
> > be leveraged to round up CPUs that are stuck in a hard-lockup state
> > with interrupts disabled, something that wouldn't be possible with
> > a normal IPI.
> >
> > So instead switch to roundup CPUs using IPI turned as NMI. And in
> > case a particular arm64 platform doesn't supports pseudo NMIs,
> > it will switch back to default kgdb CPUs roundup mechanism.
> >
> > Signed-off-by: Sumit Garg <[email protected]>
> > Tested-by: Chen-Yu Tsai <[email protected]>
> > Signed-off-by: Douglas Anderson <[email protected]>
> > ---
> >
> > (no changes since v1)
> >
> > arch/arm64/kernel/ipi_nmi.c | 5 +++++
> > arch/arm64/kernel/kgdb.c | 18 ++++++++++++++++++
> > 2 files changed, 23 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> > index c592e92b8cbf..2adaaf1519e5 100644
> > --- a/arch/arm64/kernel/ipi_nmi.c
> > +++ b/arch/arm64/kernel/ipi_nmi.c
> > @@ -8,6 +8,7 @@
> >
> > #include <linux/interrupt.h>
> > #include <linux/irq.h>
> > +#include <linux/kgdb.h>
> > #include <linux/nmi.h>
> > #include <linux/smp.h>
> >
> > @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> > static irqreturn_t ipi_nmi_handler(int irq, void *data)
> > {
> > irqreturn_t ret = IRQ_NONE;
> > + unsigned int cpu = smp_processor_id();
>
> Does this play nice with CONFIG_DEBUG_PREEMPT? I may have missed
> something about the NMI entry but a quick scan of the arm64
> arch_irqs_disabled() suggests that debug_smp_processor_id() will issue
> warnings at this point...
>
> Other than that, I didn't see anything I don't like here.
Good question. It seems to, at least on the sc7180-trogdor-based
system I tested.
Just to confirm, I printed out the values of some of the stuff that's
checked. When this function was called, I saw:
irqs_disabled() => true
raw_local_save_flags() => 0x000000f0
__irqflags_uses_pmr() => 1
The "__irqflags_uses_pmr" is the thing that gets set when we try to
enable pseudo-NMIs and they're actually there. That causes us to call
__pmr_irqs_disabled_flags() which will compare the flags (0xf0) to
GIC_PRIO_IRQON (0xe0). If flags is not the same as GIC_PRIO_IRQON then
interrupts are disabled.
...so, assuming I understood everything, I think we're OK.
I also tried to see what happened with the whole "fallback to use
regular IPIs if we don't have pseudo-NMIs enabled" (AKA patch #10 in
this series). In that case, I had:
irqs_disabled() => true
raw_local_save_flags() => 0x000000c0
__irqflags_uses_pmr() => 0
...in this case we end up in __daif_irqs_disabled_flags(). That checks
to see if the flags (0xc0) have the "I bit" (0x80) set. It is set, so
interrupts are disabled.
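Spelled out, the check boils down to roughly this (my paraphrase of
arch/arm64/include/asm/irqflags.h, approximate and for illustration
only):

	static bool irqs_disabled_flags_sketch(unsigned long flags)
	{
		if (__irqflags_uses_pmr())
			/* flags is the PMR value: anything other than
			 * GIC_PRIO_IRQON (0xe0) means masked, so the
			 * 0xf0 above => disabled. */
			return flags != GIC_PRIO_IRQON;

		/* flags is PSTATE.DAIF: the I bit (0x80) set means
		 * masked, so the 0xc0 above => disabled. */
		return flags & PSR_I_BIT;
	}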
-Doug
Hi,
On Fri, May 12, 2023 at 6:49 AM Daniel Thompson
<[email protected]> wrote:
>
> On Wed, Apr 19, 2023 at 03:56:01PM -0700, Douglas Anderson wrote:
> > From: Sumit Garg <[email protected]>
> >
> > Add a new API, kgdb_smp_call_nmi_hook(), to expose the default CPU
> > roundup mechanism to a particular architecture as a runtime fallback
> > if it detects that NMI roundup isn't supported.
> >
> > Currently, an example of such an architecture is arm64, whose
> > pseudo-NMI feature is only available on platforms with GICv3 or
> > later.
> >
> > Signed-off-by: Sumit Garg <[email protected]>
> > Tested-by: Chen-Yu Tsai <[email protected]>
> > Signed-off-by: Douglas Anderson <[email protected]>
> > ---
> >
> > (no changes since v1)
> >
> > include/linux/kgdb.h | 12 ++++++++++++
> > kernel/debug/debug_core.c | 8 +++++++-
> > 2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > index 258cdde8d356..87713bd390f3 100644
> > --- a/include/linux/kgdb.h
> > +++ b/include/linux/kgdb.h
> > @@ -199,6 +199,18 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> >
> > extern void kgdb_call_nmi_hook(void *ignored);
> >
> > +/**
> > + * kgdb_smp_call_nmi_hook - Provide default fallback mechanism to
> > + * round-up CPUs
> > + *
> > + * If you're using the default implementation of kgdb_roundup_cpus(),
> > + * this function will be called. And if an arch detects at runtime
> > + * that it doesn't support NMI-based roundup, it can fall back to
> > + * the default mechanism using this API.
> > + */
> > +
> > +extern void kgdb_smp_call_nmi_hook(void);
>
> Concept looks sensible, but this is a terrible name for a command to
> round up the CPUs using smp_call... functions. Whilst it is true that
> kgdb_roundup_cpus() does use kgdb_call_nmi_hook() internally, that
> doesn't mean we should name functions after it. They should be named
> after what they do, not how they do it.
>
> Something more like kgdb_roundup_cpus_with_smp_call() would be a much
> better name.
Sounds good. I'm happy to spin with this rename, though I was kinda
hoping to drop ${SUBJECT} patch if folks were OK with patch #10 in
this series [1]. I personally think that's the right way to go but
it's unclear to me if arm64 maintainers will think it's a hack
(despite the fact that arm32 implements the "nmi" functions with
regular IPIs).
For now, maybe I'll think positive thoughts and hope that folks will
have the time to review the series soon. If another few weeks go by
then I'll send a v9 with just your comments addressed. If nothing
else, maybe you can land the kgdb parts in a tree targeting v6.5 and
then when arm64 folks have the bandwidth then it will be easier to get
them landed.
[1] https://lore.kernel.org/r/20230419155341.v8.10.Ic3659997d6243139d0522fc3afcdfd88d7a5f030@changeid
-Doug
On Wed, 10 May 2023 at 22:20, Doug Anderson <[email protected]> wrote:
>
> Hi,
>
> On Wed, May 10, 2023 at 9:30 AM Mark Rutland <[email protected]> wrote:
> >
> > On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> > > Hi,
> >
> > Hi Doug,
> >
> > > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <[email protected]> wrote:
> > > > This is an attempt to resurrect Sumit's old patch series [1] that
> > > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > > > could find was v7, so I called this series v8. I haven't copied all of
> > > > his old changelogs here, but you can find them from the link.
> > > >
Thanks Doug for picking up this work and for all your additions/improvements.
> > > > Since v7, I have:
> > > > * Addressed the small amount of feedback that was there for v7.
> > > > * Rebased.
> > > > * Added a new patch that prevents us from spamming the logs with idle
> > > > tasks.
> > > > * Added an extra patch to gracefully fall back to regular IPIs if
> > > > pseudo-NMIs aren't there.
> > > >
> > > > Since there appear to be a few different patch series related to
> > > > being able to use NMIs to get stack traces of crashed systems, let me
> > > > try to organize them to the best of my understanding:
> > > >
> > > > a) This series. On its own, a) will (among other things) enable stack
> > > > traces of all running processes with the soft lockup detector if
> > > > you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > > > its own, a) doesn't give a hard lockup detector.
> > > >
> > > > b) A different recently-posted series [2] that adds a hard lockup
> > > > detector based on perf. On its own, b) gives a stack crawl of the
> > > > locked up CPU but no stack crawls of other CPUs (even if they're
> > > > locked too). Together with a) + b) we get everything (full lockup
> > > > detect, full ability to get stack crawls).
> > > >
> > > > c) The old Android "buddy" hard lockup detector [3] that I'm
> > > > considering trying to upstream. If b) lands then I believe c) would
> > > > be redundant (at least for arm64). c) on its own is really only
> > > > useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > > > (see [4]). a) + c) is roughly as good as a) + b).
> >
> > > It's been 3 weeks and I haven't heard a peep on this series. That
> > > means nobody has any objections and it's all good to land, right?
> > > Right? :-P
For me it was months of waiting without any feedback. So I think you are
lucky :) or at least better than me at poking arm64 maintainers.
> >
> > FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
> > support (and fixing that requires an overhaul of our DAIF / IRQ flag
> > management, which I've been chipping away at for a number of releases), so I
> > hadn't looked at this in detail yet because the foundations are still somewhat
> > dodgy.
> >
> > I appreciate that this has been around for a while, and it's on my queue to
> > look at.
>
> Ah, thanks for the heads up! We've been thinking about turning this on
> in production in ChromeOS because it will help us track down a whole
> class of field-generated crash reports that are otherwise opaque to
> us. It sounds as if maybe that's not a good idea quite yet? Do you
> have any idea of how much farther along this needs to go? ...of
> course, we've also run into issues with Mediatek devices because they
> don't save/restore GICR registers properly [1]. In theory, we might be
> able to work around that in the kernel.
>
> In any case, even if there are bugs that would prevent turning this on
> for production, it still seems like we could still land this series.
> It simply wouldn't do anything until someone turned on pseudo NMIs,
> which wouldn't happen till the kinks are worked out.
I agree here. We should be able to make the foundations robust later
on. IMHO, until we turn on features that exercise pseudo-NMIs, I am not
sure how we can gain true confidence in the underlying robustness.
-Sumit
>
> ...actually, I guess I should say that if all the patches of the
> current series do land then it actually _would_ still do something,
> even without pseudo-NMI. Assuming the last patch looks OK, it would at
> least start falling back to using regular IPIs to do backtraces. That
> wouldn't get backtraces on hard locked up CPUs but it would be better
> than what we have today where we don't get any backtraces. This would
> get arm64 on par with arm32...
>
> [1] https://issuetracker.google.com/281831288
Hi,
On Mon, May 15, 2023 at 4:21 PM Doug Anderson <[email protected]> wrote:
>
> Hi,
>
> On Fri, May 12, 2023 at 6:49 AM Daniel Thompson
> <[email protected]> wrote:
> >
> > On Wed, Apr 19, 2023 at 03:56:01PM -0700, Douglas Anderson wrote:
> > > From: Sumit Garg <[email protected]>
> > >
> > > Add a new API, kgdb_smp_call_nmi_hook(), to expose the default CPU
> > > roundup mechanism to a particular architecture as a runtime fallback
> > > if it detects that NMI roundup isn't supported.
> > >
> > > Currently, an example of such an architecture is arm64, whose
> > > pseudo-NMI feature is only available on platforms with GICv3 or
> > > later.
> > >
> > > Signed-off-by: Sumit Garg <[email protected]>
> > > Tested-by: Chen-Yu Tsai <[email protected]>
> > > Signed-off-by: Douglas Anderson <[email protected]>
> > > ---
> > >
> > > (no changes since v1)
> > >
> > > include/linux/kgdb.h | 12 ++++++++++++
> > > kernel/debug/debug_core.c | 8 +++++++-
> > > 2 files changed, 19 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > > index 258cdde8d356..87713bd390f3 100644
> > > --- a/include/linux/kgdb.h
> > > +++ b/include/linux/kgdb.h
> > > @@ -199,6 +199,18 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> > >
> > > extern void kgdb_call_nmi_hook(void *ignored);
> > >
> > > +/**
> > > + * kgdb_smp_call_nmi_hook - Provide default fallback mechanism to
> > > + * round-up CPUs
> > > + *
> > > + * If you're using the default implementation of kgdb_roundup_cpus(),
> > > + * this function will be called. And if an arch detects at runtime
> > > + * that it doesn't support NMI-based roundup, it can fall back to
> > > + * the default mechanism using this API.
> > > + */
> > > +
> > > +extern void kgdb_smp_call_nmi_hook(void);
> >
> > Concept looks sensible, but this is a terrible name for a command to
> > round up the CPUs using smp_call... functions. Whilst it is true that
> > kgdb_roundup_cpus() does use kgdb_call_nmi_hook() internally, that
> > doesn't mean we should name functions after it. They should be named
> > after what they do, not how they do it.
> >
> > Something more like kgdb_roundup_cpus_with_smp_call() would be a much
> > better name.
>
> Sounds good. I'm happy to spin with this rename, though I was kinda
> hoping to drop ${SUBJECT} patch if folks were OK with patch #10 in
> this series [1]. I personally think that's the right way to go but
> it's unclear to me if arm64 maintainers will think it's a hack
> (despite the fact that arm32 implements the "nmi" functions with
> regular IPIs).
>
> For now, maybe I'll think positive thoughts and hope that folks will
> have the time to review the series soon. If another few weeks go by
> then I'll send a v9 with just your comments addressed. If nothing
> else, maybe you can land the kgdb parts in a tree targeting v6.5 and
> then when arm64 folks have the bandwidth then it will be easier to get
> them landed.
>
> [1] https://lore.kernel.org/r/20230419155341.v8.10.Ic3659997d6243139d0522fc3afcdfd88d7a5f030@changeid
Breadcrumbs: I've dropped this patch and several others in v9 [1] by
embracing the idea of using a normal IPI as a fallback.
[1] https://lore.kernel.org/r/[email protected]/
Hi,
On Wed, May 10, 2023 at 9:42 AM Doug Anderson <[email protected]> wrote:
>
> Hi,
>
> On Wed, May 10, 2023 at 9:30 AM Mark Rutland <[email protected]> wrote:
> >
> > On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> > > Hi,
> >
> > Hi Doug,
> >
> > > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <[email protected]> wrote:
> > > > This is an attempt to resurrect Sumit's old patch series [1] that
> > > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > > > could find was v7, so I called this series v8. I haven't copied all of
> > > > his old changelogs here, but you can find them from the link.
> > > >
> > > > Since v7, I have:
> > > > * Addressed the small amount of feedback that was there for v7.
> > > > * Rebased.
> > > > * Added a new patch that prevents us from spamming the logs with idle
> > > > tasks.
> > > > * Added an extra patch to gracefully fall back to regular IPIs if
> > > > pseudo-NMIs aren't there.
> > > >
> > > > Since there appear to be a few different patch series related to
> > > > being able to use NMIs to get stack traces of crashed systems, let me
> > > > try to organize them to the best of my understanding:
> > > >
> > > > a) This series. On its own, a) will (among other things) enable stack
> > > > traces of all running processes with the soft lockup detector if
> > > > you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > > > its own, a) doesn't give a hard lockup detector.
> > > >
> > > > b) A different recently-posted series [2] that adds a hard lockup
> > > > detector based on perf. On its own, b) gives a stack crawl of the
> > > > locked up CPU but no stack crawls of other CPUs (even if they're
> > > > locked too). Together with a) + b) we get everything (full lockup
> > > > detect, full ability to get stack crawls).
> > > >
> > > > c) The old Android "buddy" hard lockup detector [3] that I'm
> > > > considering trying to upstream. If b) lands then I believe c) would
> > > > be redundant (at least for arm64). c) on its own is really only
> > > > useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > > > (see [4]). a) + c) is roughly as good as a) + b).
> >
> > > It's been 3 weeks and I haven't heard a peep on this series. That
> > > means nobody has any objections and it's all good to land, right?
> > > Right? :-P
> >
> > FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
> > support (and fixing that requires an overhaul of our DAIF / IRQ flag
> > management, which I've been chipping away at for a number of releases), so I
> > hadn't looked at this in detail yet because the foundations are still somewhat
> > dodgy.
> >
> > I appreciate that this has been around for a while, and it's on my queue to
> > look at.
>
> Ah, thanks for the heads up! We've been thinking about turning this on
> in production in ChromeOS because it will help us track down a whole
> class of field-generated crash reports that are otherwise opaque to
> us. It sounds as if maybe that's not a good idea quite yet? Do you
> have any idea of how much farther along this needs to go?
I'm still very interested in this topic. Do you have anything concrete
that is broken? Your reply gives me the vibe that there have been a
bunch of bugs found recently and, while there are no known issues,
you're worried that there might be something lingering still. Is that
correct, or is something concretely broken? Is this anything others
could help with? I could make an attempt, or we might be able to rope
in some of the others who are interested in this topic and more
GIC-knowledgeable.
I still have a goal of turning this on in production but obviously
don't want to make the device crashier because of it.
Also: if this really has known bugs, should we put a big WARN_ON splat
if anyone tries to turn on pseudo NMIs? Without that, it feels like
it's a bit of a footgun.
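Something at the spot where the command line option is parsed would
probably be enough; a rough sketch (modeled on the early param in
drivers/irqchip/irq-gic-v3.c, illustrative only):

	static bool enable_pseudo_nmi;

	static int __init early_enable_pseudo_nmi(char *buf)
	{
		int ret = kstrtobool(buf, &enable_pseudo_nmi);

		/* Make the footgun loud: splat whenever someone opts in. */
		WARN(!ret && enable_pseudo_nmi,
		     "irqchip.gicv3_pseudo_nmi: known soundness issues remain\n");
		return ret;
	}
	early_param("irqchip.gicv3_pseudo_nmi", early_enable_pseudo_nmi);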
> ...of
> course, we've also run into issues with Mediatek devices because they
> don't save/restore GICR registers properly [1]. In theory, we might be
> able to work around that in the kernel.
To follow up on this, we've formulated a plan for dealing with
Mediatek Chromebooks. I doubt anyone is truly interested, but if
anyone is, the details are in the public Google bug tracker [1]. None
of that should block landing the NMI backtrace series.
> In any case, even if there are bugs that would prevent turning this on
> for production, it still seems like we could still land this series.
> It simply wouldn't do anything until someone turned on pseudo NMIs,
> which wouldn't happen till the kinks are worked out.
>
> ...actually, I guess I should say that if all the patches of the
> current series do land then it actually _would_ still do something,
> even without pseudo-NMI. Assuming the last patch looks OK, it would at
> least start falling back to using regular IPIs to do backtraces. That
> wouldn't get backtraces on hard locked up CPUs but it would be better
> than what we have today where we don't get any backtraces. This would
> get arm64 on par with arm32...
The more I thought about it, the more I liked going whole hog on the
idea in patch #10. I've posted v9 [2], where I've fully embraced the
idea of falling back to a regular IPI if NMI isn't available.
Hopefully it looks good.
[1] https://issuetracker.google.com/281831288
[2] https://lore.kernel.org/r/[email protected]
-Doug