2020-10-29 15:01:06

by Sumit Garg

Subject: [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI

With pseudo NMI support available, it's possible to configure SGIs to be
triggered as pseudo NMIs running in NMI context. Kernel features can then
build on this:
- NMI backtrace can leverage an IPI turned into an NMI to get a backtrace
of a CPU stuck in a hard lockup, via magic SysRq.
- kgdb relies on NMI support to round up CPUs which are stuck in a hard
lockup state with interrupts disabled.

This patch-set adds a framework to turn an IPI into an NMI: the IPI is
triggered as a pseudo NMI, which in turn invokes the registered NMI
handlers.
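
For instance, a client feature is expected to use this framework roughly
as follows (a minimal sketch based on the helpers added in patch #1;
target_cpus is a placeholder cpumask):

    #include <asm/nmi.h>

    static void feature_kick_cpus(cpumask_t *target_cpus)
    {
            /* Pseudo NMIs available on this platform (GICv3 or later)? */
            if (!arm64_supports_nmi())
                    return;         /* caller falls back to a normal IPI */

            /*
             * Invokes the registered NMI handlers on the targeted CPUs in
             * NMI context, even if they sit with interrupts disabled.
             */
            arm64_send_nmi(target_cpus);
    }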

After this patch-set we should be able to get a backtrace for a CPU
stuck in a hard lockup. Have a look at the examples below from a hard
lockup testcase run on Developerbox:

$ echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT

NMI backtrace:
==============

# Issue Magic SysRq to dump backtrace

[ 376.894502] NMI backtrace for cpu 8
[ 376.894506] CPU: 8 PID: 555 Comm: bash Not tainted 5.9.0-rc3-00740-g06ff047-dirty #242
[ 376.894510] Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #73 Apr 6 2020
[ 376.894514] pstate: 40000005 (nZcv daif -PAN -UAO BTYPE=--)
[ 376.894517] pc : lkdtm_HARDLOCKUP+0x8/0x18
[ 376.894520] lr : lkdtm_do_action+0x24/0x30
[ 376.894524] sp : ffff800012cebd20
[ 376.894527] pmr_save: 00000060
[ 376.894530] x29: ffff800012cebd20 x28: ffff000875ae8000
[ 376.894540] x27: 0000000000000000 x26: 0000000000000000
[ 376.894550] x25: 000000000000001a x24: ffff800012cebe40
[ 376.894560] x23: 000000000000000b x22: ffff800010fc5040
[ 376.894569] x21: ffff000878b61000 x20: ffff8000113b2870
[ 376.894579] x19: 000000000000001b x18: 0000000000000010
[ 376.894588] x17: 0000000000000000 x16: 0000000000000000
[ 376.894598] x15: ffff000875ae8470 x14: 00000000000002ad
[ 376.894613] x13: 0000000000000000 x12: 0000000000000000
[ 376.894622] x11: 0000000000000007 x10: 00000000000009c0
[ 376.894631] x9 : ffff800012ceba80 x8 : ffff000875ae8a20
[ 376.894641] x7 : ffff00087f6b3280 x6 : ffff00087f6b3200
[ 376.894651] x5 : 0000000000000000 x4 : ffff00087f6a91f8
[ 376.894660] x3 : ffff00087f6b0120 x2 : 1aa310cec69eb500
[ 376.894670] x1 : 0000000000000000 x0 : 0000000000000060
[ 376.894679] Call trace:
[ 376.894683] lkdtm_HARDLOCKUP+0x8/0x18
[ 376.894686] direct_entry+0x124/0x1c0
[ 376.894689] full_proxy_write+0x60/0xb0
[ 376.894693] vfs_write+0xf0/0x230
[ 376.894696] ksys_write+0x6c/0xf8
[ 376.894699] __arm64_sys_write+0x1c/0x28
[ 376.894703] el0_svc_common.constprop.0+0x74/0x1f0
[ 376.894707] do_el0_svc+0x24/0x90
[ 376.894710] el0_sync_handler+0x180/0x2f8
[ 376.894713] el0_sync+0x158/0x180

KGDB:
=====

# Enter kdb via Magic SysRq

[6]kdb> btc
btc: cpu status: Currently on cpu 6
Available cpus: 0-5(I), 6, 7(I), 8, 9-23(I)
<snip>
Stack traceback for pid 555
0xffff000875ae8000 555 554 1 8 R 0xffff000875ae89c0 bash
CPU: 8 PID: 555 Comm: bash Not tainted 5.9.0-rc3-00740-g06ff047-dirty #242
Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #73 Apr 6 2020
Call trace:
dump_backtrace+0x0/0x1a0
show_stack+0x18/0x28
dump_stack+0xc0/0x11c
kgdb_cpu_enter+0x648/0x660
kgdb_nmicallback+0xa0/0xa8
ipi_kgdb_nmicallback+0x24/0x30
ipi_nmi_handler+0x48/0x60
handle_percpu_devid_fasteoi_ipi+0x74/0x88
generic_handle_irq+0x30/0x48
handle_domain_nmi+0x48/0x80
gic_handle_irq+0x18c/0x34c
el1_irq+0xcc/0x180
lkdtm_HARDLOCKUP+0x8/0x18
direct_entry+0x124/0x1c0
full_proxy_write+0x60/0xb0
vfs_write+0xf0/0x230
ksys_write+0x6c/0xf8
__arm64_sys_write+0x1c/0x28
el0_svc_common.constprop.0+0x74/0x1f0
do_el0_svc+0x24/0x90
el0_sync_handler+0x180/0x2f8
el0_sync+0x158/0x180
<snip>

Changes in v6:
- Two new patches: #4 and #6, which add a runtime fallback framework for
the sysrq backtrace and kgdb roundup features.
- Reversed the order of the NMI backtrace and kgdb roundup feature patches.
- Addressed other misc. comments from Marc.
- I haven't picked up any tags from v5 since I think there is major rework
involved. Masayoshi, could you please confirm if these features still
work for you?

Changes in v5:
- Rebased to head of upstream master.
- Remove redundant invocation of ipi_nmi_setup().
- Addressed misc. comments.

Changes in v4:
- Move IPI NMI framework to a separate file.
- Get rid of hard-coded IPI_CALL_NMI_FUNC allocation.
- Add NMI backtrace support leveraged via magic SYSRQ.

Changes in v3:
- Rebased to Marc's latest IPIs patch-set [1].

[1] https://lkml.org/lkml/2020/9/1/603

Changes since the RFC version [1]:
- Switched to the generic interrupt framework to turn an IPI into an NMI.
- Now depends on Marc's patch-set [2], which turns IPIs into normal
interrupts.
- Addressed misc. comments from Doug on patch #4.
- Posted the kgdb NMI printk() fixup separately; it has since evolved to
be solved via a different approach, changing kgdb's interception of
printk() in the common printk() code (see patch [3]).

[1] https://lkml.org/lkml/2020/4/24/328
[2] https://lkml.org/lkml/2020/5/19/710
[3] https://lkml.org/lkml/2020/5/20/418

Sumit Garg (7):
arm64: Add framework to turn IPI as NMI
irqchip/gic-v3: Enable support for SGIs to act as NMIs
arm64: smp: Assign and setup an IPI as NMI
nmi: backtrace: Allow runtime arch specific override
arm64: ipi_nmi: Add support for NMI backtrace
kgdb: roundup: Allow runtime arch specific override
arm64: kgdb: Roundup cpus using IPI as NMI

arch/arm/include/asm/irq.h | 2 +-
arch/arm/kernel/smp.c | 3 +-
arch/arm64/include/asm/irq.h | 6 +++
arch/arm64/include/asm/kgdb.h | 9 +++++
arch/arm64/include/asm/nmi.h | 17 ++++++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/ipi_nmi.c | 84 ++++++++++++++++++++++++++++++++++++++++
arch/arm64/kernel/kgdb.c | 35 +++++++++++++++++
arch/arm64/kernel/smp.c | 8 ++++
arch/mips/include/asm/irq.h | 2 +-
arch/mips/kernel/process.c | 3 +-
arch/powerpc/include/asm/nmi.h | 2 +-
arch/powerpc/kernel/kgdb.c | 3 +-
arch/powerpc/kernel/stacktrace.c | 3 +-
arch/sparc/include/asm/irq_64.h | 2 +-
arch/sparc/kernel/process_64.c | 4 +-
arch/sparc/kernel/smp_64.c | 3 +-
arch/x86/include/asm/irq.h | 2 +-
arch/x86/kernel/apic/hw_nmi.c | 3 +-
arch/x86/kernel/kgdb.c | 6 ++-
drivers/irqchip/irq-gic-v3.c | 29 ++++++++++----
include/linux/kgdb.h | 5 ++-
include/linux/nmi.h | 12 ++----
kernel/debug/debug_core.c | 10 ++++-
24 files changed, 221 insertions(+), 34 deletions(-)
create mode 100644 arch/arm64/include/asm/nmi.h
create mode 100644 arch/arm64/kernel/ipi_nmi.c

--
2.7.4


2020-10-29 15:01:09

by Sumit Garg

Subject: [PATCH v6 4/7] nmi: backtrace: Allow runtime arch specific override

Add a boolean return value to arch_trigger_cpumask_backtrace() to support
a use-case where a particular architecture detects at runtime whether it
supports NMI backtrace and, if it doesn't, falls back to the default
implementation using SMP cross-calls.

Currently the example of such an architecture is arm64, whose pseudo NMI
feature is only available on platforms with support for GICv3 or a later
version.
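
Callers of the trigger_*() wrappers already check this boolean and have
their own fallbacks. For instance, the SysRq show-all-CPUs handler does
roughly the following (abridged from drivers/tty/sysrq.c; details may
differ):

    static void sysrq_handle_showallcpus(int key)
    {
            /*
             * If the arch reports that NMI backtrace is unavailable (now
             * detectable at runtime with this patch), fall back to dumping
             * the current CPU and scheduling a task dump.
             */
            if (!trigger_all_cpu_backtrace()) {
                    struct pt_regs *regs = get_irq_regs();

                    if (regs)
                            show_regs(regs);
                    schedule_work(&sysrq_showallcpus);
            }
    }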

Signed-off-by: Sumit Garg <[email protected]>
---
arch/arm/include/asm/irq.h | 2 +-
arch/arm/kernel/smp.c | 3 ++-
arch/mips/include/asm/irq.h | 2 +-
arch/mips/kernel/process.c | 3 ++-
arch/powerpc/include/asm/nmi.h | 2 +-
arch/powerpc/kernel/stacktrace.c | 3 ++-
arch/sparc/include/asm/irq_64.h | 2 +-
arch/sparc/kernel/process_64.c | 4 +++-
arch/x86/include/asm/irq.h | 2 +-
arch/x86/kernel/apic/hw_nmi.c | 3 ++-
include/linux/nmi.h | 12 ++++--------
11 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 46d4114..54b0180 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -31,7 +31,7 @@ void handle_IRQ(unsigned int, struct pt_regs *);
void init_IRQ(void);

#ifdef CONFIG_SMP
-extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
#endif
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 48099c6e..bb20a43 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -856,7 +856,8 @@ static void raise_nmi(cpumask_t *mask)
__ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
}

-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_nmi);
+ return true;
}
diff --git a/arch/mips/include/asm/irq.h b/arch/mips/include/asm/irq.h
index c5d3517..34f3b42 100644
--- a/arch/mips/include/asm/irq.h
+++ b/arch/mips/include/asm/irq.h
@@ -78,7 +78,7 @@ extern int cp0_fdc_irq;

extern int get_c0_fdc_int(void);

-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace

diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 75ebd8d..d19e672 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -735,9 +735,10 @@ static void raise_backtrace(cpumask_t *mask)
}
}

-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
+ return true;
}

int mips_get_process_fp_mode(struct task_struct *task)
diff --git a/arch/powerpc/include/asm/nmi.h b/arch/powerpc/include/asm/nmi.h
index 84b4cfe..a5eb3e2 100644
--- a/arch/powerpc/include/asm/nmi.h
+++ b/arch/powerpc/include/asm/nmi.h
@@ -9,7 +9,7 @@ static inline void arch_touch_nmi_watchdog(void) {}
#endif

#if defined(CONFIG_NMI_IPI) && defined(CONFIG_STACKTRACE)
-extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
#endif
diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index b644065..22b112a 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -264,8 +264,9 @@ static void raise_backtrace_ipi(cpumask_t *mask)
}
}

-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace_ipi);
+ return true;
}
#endif /* defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_NMI_IPI) */
diff --git a/arch/sparc/include/asm/irq_64.h b/arch/sparc/include/asm/irq_64.h
index 4d748e9..35c01ff 100644
--- a/arch/sparc/include/asm/irq_64.h
+++ b/arch/sparc/include/asm/irq_64.h
@@ -87,7 +87,7 @@ static inline unsigned long get_softint(void)
return retval;
}

-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
bool exclude_self);
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace

diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index a75093b..9182001 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -248,7 +248,7 @@ static void __global_reg_poll(struct global_reg_snapshot *gp)
}
}

-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
struct thread_info *tp = current_thread_info();
struct pt_regs *regs = get_irq_regs();
@@ -303,6 +303,8 @@ void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
memset(global_cpu_snapshot, 0, sizeof(global_cpu_snapshot));

spin_unlock_irqrestore(&global_cpu_snapshot_lock, flags);
+
+ return true;
}

#ifdef CONFIG_MAGIC_SYSRQ
diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index 528c8a7..b7668e0 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -47,7 +47,7 @@ extern void init_ISA_irqs(void);
extern void __init init_IRQ(void);

#ifdef CONFIG_X86_LOCAL_APIC
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
bool exclude_self);

#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 34a992e..e7dcd28 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -34,10 +34,11 @@ static void nmi_raise_cpu_backtrace(cpumask_t *mask)
apic->send_IPI_mask(mask, NMI_VECTOR);
}

-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
{
nmi_trigger_cpumask_backtrace(mask, exclude_self,
nmi_raise_cpu_backtrace);
+ return true;
}

static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 750c7f3..cedbfc1 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -143,26 +143,22 @@ static inline void touch_nmi_watchdog(void)
#ifdef arch_trigger_cpumask_backtrace
static inline bool trigger_all_cpu_backtrace(void)
{
- arch_trigger_cpumask_backtrace(cpu_online_mask, false);
- return true;
+ return arch_trigger_cpumask_backtrace(cpu_online_mask, false);
}

static inline bool trigger_allbutself_cpu_backtrace(void)
{
- arch_trigger_cpumask_backtrace(cpu_online_mask, true);
- return true;
+ return arch_trigger_cpumask_backtrace(cpu_online_mask, true);
}

static inline bool trigger_cpumask_backtrace(struct cpumask *mask)
{
- arch_trigger_cpumask_backtrace(mask, false);
- return true;
+ return arch_trigger_cpumask_backtrace(mask, false);
}

static inline bool trigger_single_cpu_backtrace(int cpu)
{
- arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
- return true;
+ return arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
}

/* generic implementation */
--
2.7.4

2020-10-29 15:01:13

by Sumit Garg

Subject: [PATCH v6 1/7] arm64: Add framework to turn IPI as NMI

Introduce a framework to turn an IPI into an NMI using pseudo NMIs. The
main motivation for this feature is to have an IPI that can be leveraged
to invoke NMI functions on other CPUs.

The current prospective users are NMI backtrace and KGDB CPU round-up,
support for which is added via later patches.

Signed-off-by: Sumit Garg <[email protected]>
---
arch/arm64/include/asm/nmi.h | 17 ++++++++++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/ipi_nmi.c | 65 ++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 83 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/include/asm/nmi.h
create mode 100644 arch/arm64/kernel/ipi_nmi.c

diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
new file mode 100644
index 0000000..4cd14b6
--- /dev/null
+++ b/arch/arm64/include/asm/nmi.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_NMI_H
+#define __ASM_NMI_H
+
+#ifndef __ASSEMBLER__
+
+#include <linux/cpumask.h>
+
+extern bool arm64_supports_nmi(void);
+extern void arm64_send_nmi(cpumask_t *mask);
+
+void set_smp_dynamic_ipi(int ipi);
+void dynamic_ipi_setup(int cpu);
+void dynamic_ipi_teardown(int cpu);
+
+#endif /* !__ASSEMBLER__ */
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index bbaf0bc..525a1e0 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -17,7 +17,7 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
return_address.o cpuinfo.o cpu_errata.o \
cpufeature.o alternative.o cacheinfo.o \
smp.o smp_spin_table.o topology.o smccc-call.o \
- syscall.o proton-pack.o
+ syscall.o proton-pack.o ipi_nmi.o

targets += efi-entry.o

diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
new file mode 100644
index 0000000..a945dcf
--- /dev/null
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * NMI support for IPIs
+ *
+ * Copyright (C) 2020 Linaro Limited
+ * Author: Sumit Garg <[email protected]>
+ */
+
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/smp.h>
+
+#include <asm/nmi.h>
+
+static struct irq_desc *ipi_nmi_desc __read_mostly;
+static int ipi_nmi_id __read_mostly;
+
+bool arm64_supports_nmi(void)
+{
+ if (ipi_nmi_desc)
+ return true;
+
+ return false;
+}
+
+void arm64_send_nmi(cpumask_t *mask)
+{
+ if (WARN_ON_ONCE(!ipi_nmi_desc))
+ return;
+
+ __ipi_send_mask(ipi_nmi_desc, mask);
+}
+
+static irqreturn_t ipi_nmi_handler(int irq, void *data)
+{
+ /* nop, NMI handlers for special features can be added here. */
+
+ return IRQ_NONE;
+}
+
+void dynamic_ipi_setup(int cpu)
+{
+ if (!ipi_nmi_desc)
+ return;
+
+ if (!prepare_percpu_nmi(ipi_nmi_id))
+ enable_percpu_nmi(ipi_nmi_id, IRQ_TYPE_NONE);
+}
+
+void dynamic_ipi_teardown(int cpu)
+{
+ if (!ipi_nmi_desc)
+ return;
+
+ disable_percpu_nmi(ipi_nmi_id);
+ teardown_percpu_nmi(ipi_nmi_id);
+}
+
+void __init set_smp_dynamic_ipi(int ipi)
+{
+ if (!request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", &cpu_number)) {
+ ipi_nmi_desc = irq_to_desc(ipi);
+ ipi_nmi_id = ipi;
+ }
+}
--
2.7.4

2020-10-29 15:01:38

by Sumit Garg

Subject: [PATCH v6 3/7] arm64: smp: Assign and setup an IPI as NMI

Assign an unused IPI which can be turned into an NMI using the ipi_nmi
framework. Also, invoke the corresponding dynamic IPI setup/teardown
APIs.
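
For example, assuming the GICv3 driver hands over 8 SGIs via
set_smp_ipi_range() while the kernel itself consumes nr_ipi (currently 7)
of them, the n > nr_ipi check below holds and the one spare SGI at
ipi_base + nr_ipi is handed to the ipi_nmi framework to be set up as the
NMI IPI.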

Signed-off-by: Sumit Garg <[email protected]>
---
arch/arm64/kernel/smp.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 82e75fc..2e118e2 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -43,6 +43,7 @@
#include <asm/daifflags.h>
#include <asm/kvm_mmu.h>
#include <asm/mmu_context.h>
+#include <asm/nmi.h>
#include <asm/numa.h>
#include <asm/processor.h>
#include <asm/smp_plat.h>
@@ -962,6 +963,8 @@ static void ipi_setup(int cpu)

for (i = 0; i < nr_ipi; i++)
enable_percpu_irq(ipi_irq_base + i, 0);
+
+ dynamic_ipi_setup(cpu);
}

#ifdef CONFIG_HOTPLUG_CPU
@@ -974,6 +977,8 @@ static void ipi_teardown(int cpu)

for (i = 0; i < nr_ipi; i++)
disable_percpu_irq(ipi_irq_base + i);
+
+ dynamic_ipi_teardown(cpu);
}
#endif

@@ -995,6 +1000,9 @@ void __init set_smp_ipi_range(int ipi_base, int n)
irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
}

+ if (n > nr_ipi)
+ set_smp_dynamic_ipi(ipi_base + nr_ipi);
+
ipi_irq_base = ipi_base;

/* Setup the boot CPU immediately */
--
2.7.4

2020-10-29 15:02:16

by Sumit Garg

Subject: [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override

Add a new API, kgdb_arch_roundup_cpus(), that lets a particular
architecture override the default kgdb roundup. If the architecture
detects at runtime that it cannot support an NMI roundup, it can fall
back to the default implementation using async SMP cross-calls.

Currently the example of such an architecture is arm64, whose pseudo NMI
feature is only available on platforms with support for GICv3 or a later
version.
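
On such an architecture the override is expected to look roughly like
this (a sketch only; arch_has_nmi() and arch_send_nmi_roundup() are
placeholders for the arch's own helpers, see patch #7 for the actual
arm64 implementation):

    bool kgdb_arch_roundup_cpus(void)
    {
            /* Runtime capability check. */
            if (!arch_has_nmi())
                    return false;   /* core falls back to SMP cross-calls */

            arch_send_nmi_roundup();        /* NMI the other CPUs */
            return true;
    }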

Signed-off-by: Sumit Garg <[email protected]>
---
arch/powerpc/kernel/kgdb.c | 3 ++-
arch/sparc/kernel/smp_64.c | 3 ++-
arch/x86/kernel/kgdb.c | 6 ++++--
include/linux/kgdb.h | 5 +++--
kernel/debug/debug_core.c | 10 +++++++++-
5 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
index 4090802..126575d 100644
--- a/arch/powerpc/kernel/kgdb.c
+++ b/arch/powerpc/kernel/kgdb.c
@@ -125,9 +125,10 @@ static int kgdb_debugger_ipi(struct pt_regs *regs)
}

#ifdef CONFIG_SMP
-void kgdb_roundup_cpus(void)
+bool kgdb_arch_roundup_cpus(void)
{
smp_send_debugger_break();
+ return true;
}
#endif

diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index e38d8bf..c459c83 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1014,9 +1014,10 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
}

#ifdef CONFIG_KGDB
-void kgdb_roundup_cpus(void)
+bool kgdb_arch_roundup_cpus(void)
{
smp_cross_call(&xcall_kgdb_capture, 0, 0, 0);
+ return true;
}
#endif

diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index ff7878d..1b756d9 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -404,7 +404,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)

#ifdef CONFIG_SMP
/**
- * kgdb_roundup_cpus - Get other CPUs into a holding pattern
+ * kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
+ * in an architecture-specific manner
*
* On SMP systems, we need to get the attention of the other CPUs
* and get them be in a known state. This should do what is needed
@@ -414,9 +415,10 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
*
* On non-SMP systems, this is not called.
*/
-void kgdb_roundup_cpus(void)
+bool kgdb_arch_roundup_cpus(void)
{
apic_send_IPI_allbutself(NMI_VECTOR);
+ return true;
}
#endif

diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 0d6cf64..f9db5b8 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -200,7 +200,8 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
extern void kgdb_call_nmi_hook(void *ignored);

/**
- * kgdb_roundup_cpus - Get other CPUs into a holding pattern
+ * kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
+ * in an architecture-specific manner
*
* On SMP systems, we need to get the attention of the other CPUs
* and get them into a known state. This should do what is needed
@@ -210,7 +211,7 @@ extern void kgdb_call_nmi_hook(void *ignored);
*
* On non-SMP systems, this is not called.
*/
-extern void kgdb_roundup_cpus(void);
+extern bool kgdb_arch_roundup_cpus(void);

/**
* kgdb_arch_set_pc - Generic call back to the program counter
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 1e75a89..27e401c 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -241,13 +241,21 @@ void __weak kgdb_call_nmi_hook(void *ignored)
}
NOKPROBE_SYMBOL(kgdb_call_nmi_hook);

-void __weak kgdb_roundup_cpus(void)
+bool __weak kgdb_arch_roundup_cpus(void)
+{
+ return false;
+}
+
+static void kgdb_roundup_cpus(void)
{
call_single_data_t *csd;
int this_cpu = raw_smp_processor_id();
int cpu;
int ret;

+ if (kgdb_arch_roundup_cpus())
+ return;
+
for_each_online_cpu(cpu) {
/* No need to roundup ourselves */
if (cpu == this_cpu)
--
2.7.4

2020-10-29 15:02:26

by Sumit Garg

Subject: [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI

arm64 platforms with GICv3 or later support pseudo NMIs, which can be
leveraged to round up CPUs that are stuck in a hard lockup state with
interrupts disabled, something that wouldn't be possible with a normal
IPI.

So switch to rounding up CPUs using an IPI turned into an NMI. In case a
particular arm64 platform doesn't support pseudo NMIs, it will fall back
to the default kgdb CPU roundup mechanism.

Signed-off-by: Sumit Garg <[email protected]>
---
arch/arm64/include/asm/kgdb.h | 9 +++++++++
arch/arm64/kernel/ipi_nmi.c | 5 +++++
arch/arm64/kernel/kgdb.c | 35 +++++++++++++++++++++++++++++++++++
3 files changed, 49 insertions(+)

diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h
index 21fc85e..c3d2425 100644
--- a/arch/arm64/include/asm/kgdb.h
+++ b/arch/arm64/include/asm/kgdb.h
@@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void)
extern void kgdb_handle_bus_error(void);
extern int kgdb_fault_expected;

+#ifdef CONFIG_KGDB
+extern bool kgdb_ipi_nmicallback(int cpu, void *regs);
+#else
+static inline bool kgdb_ipi_nmicallback(int cpu, void *regs)
+{
+ return false;
+}
+#endif
+
#endif /* !__ASSEMBLY__ */

/*
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index 597dcf7..6ace182 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@

#include <linux/interrupt.h>
#include <linux/irq.h>
+#include <linux/kgdb.h>
#include <linux/nmi.h>
#include <linux/smp.h>

@@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
static irqreturn_t ipi_nmi_handler(int irq, void *data)
{
irqreturn_t ret = IRQ_NONE;
+ unsigned int cpu = smp_processor_id();

if (nmi_cpu_backtrace(get_irq_regs()))
ret = IRQ_HANDLED;

+ if (kgdb_ipi_nmicallback(cpu, get_irq_regs()))
+ ret = IRQ_HANDLED;
+
return ret;
}

diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index 1a157ca3..c26e710 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -17,6 +17,7 @@

#include <asm/debug-monitors.h>
#include <asm/insn.h>
+#include <asm/nmi.h>
#include <asm/traps.h>

struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
@@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
return aarch64_insn_write((void *)bpt->bpt_addr,
*(u32 *)bpt->saved_instr);
}
+
+bool kgdb_ipi_nmicallback(int cpu, void *regs)
+{
+ if (atomic_read(&kgdb_active) != -1) {
+ kgdb_nmicallback(cpu, regs);
+ return true;
+ }
+
+ return false;
+}
+
+static void kgdb_smp_callback(void *data)
+{
+ unsigned int cpu = smp_processor_id();
+
+ if (atomic_read(&kgdb_active) != -1)
+ kgdb_nmicallback(cpu, get_irq_regs());
+}
+
+bool kgdb_arch_roundup_cpus(void)
+{
+ struct cpumask mask;
+
+ if (!arm64_supports_nmi())
+ return false;
+
+ cpumask_copy(&mask, cpu_online_mask);
+ cpumask_clear_cpu(raw_smp_processor_id(), &mask);
+ if (cpumask_empty(&mask))
+ return false;
+
+ arm64_send_nmi(&mask);
+ return true;
+}
--
2.7.4

2020-10-29 15:02:47

by Sumit Garg

Subject: [PATCH v6 2/7] irqchip/gic-v3: Enable support for SGIs to act as NMIs

Add support to handle SGIs as pseudo NMIs. Since SGIs (IPIs) default to
a special flow handler, handle_percpu_devid_fasteoi_ipi(), skip the NMI
handler update in the case of SGIs.

Also, enable NMI support prior to gic_smp_init(), as the allocation of
SGIs as IRQs/NMIs happens as part of that routine.

Signed-off-by: Sumit Garg <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 16fecc0..7010ae2 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -461,6 +461,7 @@ static u32 gic_get_ppi_index(struct irq_data *d)
static int gic_irq_nmi_setup(struct irq_data *d)
{
struct irq_desc *desc = irq_to_desc(d->irq);
+ u32 idx;

if (!gic_supports_nmi())
return -EINVAL;
@@ -478,16 +479,22 @@ static int gic_irq_nmi_setup(struct irq_data *d)
return -EINVAL;

/* desc lock should already be held */
- if (gic_irq_in_rdist(d)) {
- u32 idx = gic_get_ppi_index(d);
+ switch (get_intid_range(d)) {
+ case SGI_RANGE:
+ break;
+ case PPI_RANGE:
+ case EPPI_RANGE:
+ idx = gic_get_ppi_index(d);

/* Setting up PPI as NMI, only switch handler for first NMI */
if (!refcount_inc_not_zero(&ppi_nmi_refs[idx])) {
refcount_set(&ppi_nmi_refs[idx], 1);
desc->handle_irq = handle_percpu_devid_fasteoi_nmi;
}
- } else {
+ break;
+ default:
desc->handle_irq = handle_fasteoi_nmi;
+ break;
}

gic_irq_set_prio(d, GICD_INT_NMI_PRI);
@@ -498,6 +505,7 @@ static int gic_irq_nmi_setup(struct irq_data *d)
static void gic_irq_nmi_teardown(struct irq_data *d)
{
struct irq_desc *desc = irq_to_desc(d->irq);
+ u32 idx;

if (WARN_ON(!gic_supports_nmi()))
return;
@@ -515,14 +523,20 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
return;

/* desc lock should already be held */
- if (gic_irq_in_rdist(d)) {
- u32 idx = gic_get_ppi_index(d);
+ switch (get_intid_range(d)) {
+ case SGI_RANGE:
+ break;
+ case PPI_RANGE:
+ case EPPI_RANGE:
+ idx = gic_get_ppi_index(d);

/* Tearing down NMI, only switch handler for last NMI */
if (refcount_dec_and_test(&ppi_nmi_refs[idx]))
desc->handle_irq = handle_percpu_devid_irq;
- } else {
+ break;
+ default:
desc->handle_irq = handle_fasteoi_irq;
+ break;
}

gic_irq_set_prio(d, GICD_INT_DEF_PRI);
@@ -1708,6 +1722,7 @@ static int __init gic_init_bases(void __iomem *dist_base,

gic_dist_init();
gic_cpu_init();
+ gic_enable_nmi_support();
gic_smp_init();
gic_cpu_pm_init();

@@ -1719,8 +1734,6 @@ static int __init gic_init_bases(void __iomem *dist_base,
gicv2m_init(handle, gic_data.domain);
}

- gic_enable_nmi_support();
-
return 0;

out_free:
--
2.7.4

2020-10-29 15:34:51

by Daniel Thompson

Subject: Re: [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override

On Thu, Oct 29, 2020 at 08:26:26PM +0530, Sumit Garg wrote:
> Add a new API, kgdb_arch_roundup_cpus(), that lets a particular
> architecture override the default kgdb roundup. If the architecture
> detects at runtime that it cannot support an NMI roundup, it can fall
> back to the default implementation using async SMP cross-calls.
>
> Currently the example of such an architecture is arm64, whose pseudo NMI
> feature is only available on platforms with support for GICv3 or a later
> version.
>
> Signed-off-by: Sumit Garg <[email protected]>
> ---
> arch/powerpc/kernel/kgdb.c | 3 ++-
> arch/sparc/kernel/smp_64.c | 3 ++-
> arch/x86/kernel/kgdb.c | 6 ++++--
> include/linux/kgdb.h | 5 +++--
> kernel/debug/debug_core.c | 10 +++++++++-
> 5 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
> index 4090802..126575d 100644
> --- a/arch/powerpc/kernel/kgdb.c
> +++ b/arch/powerpc/kernel/kgdb.c
> @@ -125,9 +125,10 @@ static int kgdb_debugger_ipi(struct pt_regs *regs)
> }
>
> #ifdef CONFIG_SMP
> -void kgdb_roundup_cpus(void)
> +bool kgdb_arch_roundup_cpus(void)
> {
> smp_send_debugger_break();
> + return true;
> }
> #endif
>
> diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
> index e38d8bf..c459c83 100644
> --- a/arch/sparc/kernel/smp_64.c
> +++ b/arch/sparc/kernel/smp_64.c
> @@ -1014,9 +1014,10 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
> }
>
> #ifdef CONFIG_KGDB
> -void kgdb_roundup_cpus(void)
> +bool kgdb_arch_roundup_cpus(void)
> {
> smp_cross_call(&xcall_kgdb_capture, 0, 0, 0);
> + return true;
> }
> #endif
>
> diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
> index ff7878d..1b756d9 100644
> --- a/arch/x86/kernel/kgdb.c
> +++ b/arch/x86/kernel/kgdb.c
> @@ -404,7 +404,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
>
> #ifdef CONFIG_SMP
> /**
> - * kgdb_roundup_cpus - Get other CPUs into a holding pattern
> + * kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> + * in an architecture-specific manner
> *
> * On SMP systems, we need to get the attention of the other CPUs
> * and get them be in a known state. This should do what is needed
> @@ -414,9 +415,10 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> *
> * On non-SMP systems, this is not called.
> */
> -void kgdb_roundup_cpus(void)
> +bool kgdb_arch_roundup_cpus(void)
> {
> apic_send_IPI_allbutself(NMI_VECTOR);
> + return true;
> }
> #endif
>
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index 0d6cf64..f9db5b8 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -200,7 +200,8 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> extern void kgdb_call_nmi_hook(void *ignored);
>
> /**
> - * kgdb_roundup_cpus - Get other CPUs into a holding pattern
> + * kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> + * in an architecture-specific manner
> *
> * On SMP systems, we need to get the attention of the other CPUs
> * and get them into a known state. This should do what is needed
> @@ -210,7 +211,7 @@ extern void kgdb_call_nmi_hook(void *ignored);
> *
> * On non-SMP systems, this is not called.
> */
> -extern void kgdb_roundup_cpus(void);
> +extern bool kgdb_arch_roundup_cpus(void);
>
> /**
> * kgdb_arch_set_pc - Generic call back to the program counter
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 1e75a89..27e401c 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -241,13 +241,21 @@ void __weak kgdb_call_nmi_hook(void *ignored)
> }
> NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
>
> -void __weak kgdb_roundup_cpus(void)
> +bool __weak kgdb_arch_roundup_cpus(void)
> +{
> + return false;

Do we really need to introduce all these boolean return values just to
call a bit of library code when one of the architectures wants to
fall back to a generic implementation?

Why not something more like:

void kgdb_smp_call_nmi_hook(void)
{
/* original weak version of kgdb_roundup_cpus goes here */
}

void __weak kgdb_roundup_cpus(void)
{
kgdb_smp_call_nmi_hook();
}

arm64 can now simply call the new library function if a fallback is needed.
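
Concretely, the arm64 override could then look something like this
(a sketch only, reusing helpers from this series):

    void kgdb_roundup_cpus(void)
    {
            if (!arm64_supports_nmi()) {
                    /* No pseudo NMI: use the generic roundup. */
                    kgdb_smp_call_nmi_hook();
                    return;
            }

            /* ... send the IPI-as-NMI to the other online CPUs ... */
    }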

Note that the same question applies to the backtrace changes...


Daniel.


> +}
> +
> +static void kgdb_roundup_cpus(void)
> {
> call_single_data_t *csd;
> int this_cpu = raw_smp_processor_id();
> int cpu;
> int ret;
>
> + if (kgdb_arch_roundup_cpus())
> + return;
> +
> for_each_online_cpu(cpu) {
> /* No need to roundup ourselves */
> if (cpu == this_cpu)
> --
> 2.7.4
>

2020-10-29 16:24:37

by Daniel Thompson

Subject: Re: [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI

On Thu, Oct 29, 2020 at 08:26:27PM +0530, Sumit Garg wrote:
> arm64 platforms with GICv3 or later support pseudo NMIs, which can be
> leveraged to round up CPUs that are stuck in a hard lockup state with
> interrupts disabled, something that wouldn't be possible with a normal
> IPI.
>
> So switch to rounding up CPUs using an IPI turned into an NMI. In case a
> particular arm64 platform doesn't support pseudo NMIs, it will fall back
> to the default kgdb CPU roundup mechanism.
>
> Signed-off-by: Sumit Garg <[email protected]>
> ---
> arch/arm64/include/asm/kgdb.h | 9 +++++++++
> arch/arm64/kernel/ipi_nmi.c | 5 +++++
> arch/arm64/kernel/kgdb.c | 35 +++++++++++++++++++++++++++++++++++
> 3 files changed, 49 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h
> index 21fc85e..c3d2425 100644
> --- a/arch/arm64/include/asm/kgdb.h
> +++ b/arch/arm64/include/asm/kgdb.h
> @@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void)
> extern void kgdb_handle_bus_error(void);
> extern int kgdb_fault_expected;
>
> +#ifdef CONFIG_KGDB
> +extern bool kgdb_ipi_nmicallback(int cpu, void *regs);
> +#else
> +static inline bool kgdb_ipi_nmicallback(int cpu, void *regs)
> +{
> + return false;
> +}
> +#endif
> +
> #endif /* !__ASSEMBLY__ */
>
> /*
> diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> index 597dcf7..6ace182 100644
> --- a/arch/arm64/kernel/ipi_nmi.c
> +++ b/arch/arm64/kernel/ipi_nmi.c
> @@ -8,6 +8,7 @@
>
> #include <linux/interrupt.h>
> #include <linux/irq.h>
> +#include <linux/kgdb.h>
> #include <linux/nmi.h>
> #include <linux/smp.h>
>
> @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> static irqreturn_t ipi_nmi_handler(int irq, void *data)
> {
> irqreturn_t ret = IRQ_NONE;
> + unsigned int cpu = smp_processor_id();
>
> if (nmi_cpu_backtrace(get_irq_regs()))
> ret = IRQ_HANDLED;
>
> + if (kgdb_ipi_nmicallback(cpu, get_irq_regs()))
> + ret = IRQ_HANDLED;
> +
> return ret;

It would be better to declare the existing return value of
kgdb_nmicallback() to be dangerously stupid and fix it so that it returns
an irqreturn_t (that's easy since most callers do not need to check the
return value).

Then this code simply becomes:

return kgdb_nmicallback(cpu, get_irq_regs());


> }
>
> diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
> index 1a157ca3..c26e710 100644
> --- a/arch/arm64/kernel/kgdb.c
> +++ b/arch/arm64/kernel/kgdb.c
> @@ -17,6 +17,7 @@
>
> #include <asm/debug-monitors.h>
> #include <asm/insn.h>
> +#include <asm/nmi.h>
> #include <asm/traps.h>
>
> struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
> @@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
> return aarch64_insn_write((void *)bpt->bpt_addr,
> *(u32 *)bpt->saved_instr);
> }
> +
> +bool kgdb_ipi_nmicallback(int cpu, void *regs)
> +{
> + if (atomic_read(&kgdb_active) != -1) {
> + kgdb_nmicallback(cpu, regs);
> + return true;
> + }
> +
> + return false;
> +}

I *really* don't like this function.

If the return code of kgdb_nmicallback() is broken then fix it, don't
just wrap it and invent a new criteria for the return code.

To be honest I don't actually think the logic in kgdb_nmicallback() is
broken. As mentioned above the return value has a weird definition (0
for "handled it OK" and 1 for "nothing for me to do") but the logic to
calculate the return code looks OK.


> +
> +static void kgdb_smp_callback(void *data)
> +{
> + unsigned int cpu = smp_processor_id();
> +
> + if (atomic_read(&kgdb_active) != -1)
> + kgdb_nmicallback(cpu, get_irq_regs());
> +}

This is unused. I presume it is litter from a previous revision of the
code and can be deleted?


> +
> +bool kgdb_arch_roundup_cpus(void)
> +{
> + struct cpumask mask;
> +
> + if (!arm64_supports_nmi())
> + return false;
> +
> + cpumask_copy(&mask, cpu_online_mask);
> + cpumask_clear_cpu(raw_smp_processor_id(), &mask);
> + if (cpumask_empty(&mask))
> + return false;

Why do we need to fall back if there is no work to do? There will still
be no work to do when we call the fallback.


Daniel.

2020-10-29 16:41:24

by Daniel Thompson

Subject: Re: [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI

On Thu, Oct 29, 2020 at 04:22:34PM +0000, Daniel Thompson wrote:
> On Thu, Oct 29, 2020 at 08:26:27PM +0530, Sumit Garg wrote:
> > arm64 platforms with GICv3 or later support pseudo NMIs, which can be
> > leveraged to round up CPUs that are stuck in a hard lockup state with
> > interrupts disabled, something that wouldn't be possible with a normal
> > IPI.
> >
> > So switch to rounding up CPUs using an IPI turned into an NMI. In case a
> > particular arm64 platform doesn't support pseudo NMIs, it will fall back
> > to the default kgdb CPU roundup mechanism.
> >
> > Signed-off-by: Sumit Garg <[email protected]>
> > ---
> > arch/arm64/include/asm/kgdb.h | 9 +++++++++
> > arch/arm64/kernel/ipi_nmi.c | 5 +++++
> > arch/arm64/kernel/kgdb.c | 35 +++++++++++++++++++++++++++++++++++
> > 3 files changed, 49 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h
> > index 21fc85e..c3d2425 100644
> > --- a/arch/arm64/include/asm/kgdb.h
> > +++ b/arch/arm64/include/asm/kgdb.h
> > @@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void)
> > extern void kgdb_handle_bus_error(void);
> > extern int kgdb_fault_expected;
> >
> > +#ifdef CONFIG_KGDB
> > +extern bool kgdb_ipi_nmicallback(int cpu, void *regs);
> > +#else
> > +static inline bool kgdb_ipi_nmicallback(int cpu, void *regs)
> > +{
> > + return false;
> > +}
> > +#endif
> > +
> > #endif /* !__ASSEMBLY__ */
> >
> > /*
> > diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> > index 597dcf7..6ace182 100644
> > --- a/arch/arm64/kernel/ipi_nmi.c
> > +++ b/arch/arm64/kernel/ipi_nmi.c
> > @@ -8,6 +8,7 @@
> >
> > #include <linux/interrupt.h>
> > #include <linux/irq.h>
> > +#include <linux/kgdb.h>
> > #include <linux/nmi.h>
> > #include <linux/smp.h>
> >
> > @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> > static irqreturn_t ipi_nmi_handler(int irq, void *data)
> > {
> > irqreturn_t ret = IRQ_NONE;
> > + unsigned int cpu = smp_processor_id();
> >
> > if (nmi_cpu_backtrace(get_irq_regs()))
> > ret = IRQ_HANDLED;
> >
> > + if (kgdb_ipi_nmicallback(cpu, get_irq_regs()))
> > + ret = IRQ_HANDLED;
> > +
> > return ret;
>
> It would be better to declare the existing return value of
> kgdb_nmicallback() to be dangerously stupid and fix it so that it returns
> an irqreturn_t (that's easy since most callers do not need to check the
> return value).
>
> Then this code simply becomes:
>
> return kgdb_nmicallback(cpu, get_irq_regs());

Actually, reflecting on this, maybe it is better to keep kgdb_nmicallin()
and kgdb_nmicallback() aligned w.r.t. return codes (even if they are a
little unusual).

I'm still not sure why we'd keep kgdb_ipi_nmicallback() though.
kgdb_nmicallback() is intended to be called from arch code...


Daniel.


>
>
> > }
> >
> > diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
> > index 1a157ca3..c26e710 100644
> > --- a/arch/arm64/kernel/kgdb.c
> > +++ b/arch/arm64/kernel/kgdb.c
> > @@ -17,6 +17,7 @@
> >
> > #include <asm/debug-monitors.h>
> > #include <asm/insn.h>
> > +#include <asm/nmi.h>
> > #include <asm/traps.h>
> >
> > struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
> > @@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
> > return aarch64_insn_write((void *)bpt->bpt_addr,
> > *(u32 *)bpt->saved_instr);
> > }
> > +
> > +bool kgdb_ipi_nmicallback(int cpu, void *regs)
> > +{
> > + if (atomic_read(&kgdb_active) != -1) {
> > + kgdb_nmicallback(cpu, regs);
> > + return true;
> > + }
> > +
> > + return false;
> > +}
>
> I *really* don't like this function.
>
> If the return code of kgdb_nmicallback() is broken then fix it, don't
> just wrap it and invent a new criteria for the return code.
>
> To be honest I don't actually think the logic in kgdb_nmicallback() is
> broken. As mentioned above the return value has a weird definition (0
> for "handled it OK" and 1 for "nothing for me to do") but the logic to
> calculate the return code looks OK.
>
>
> > +
> > +static void kgdb_smp_callback(void *data)
> > +{
> > + unsigned int cpu = smp_processor_id();
> > +
> > + if (atomic_read(&kgdb_active) != -1)
> > + kgdb_nmicallback(cpu, get_irq_regs());
> > +}
>
> This is unused. I presume it is litter from a previous revision of the
> code and can be deleted?
>
>
> > +
> > +bool kgdb_arch_roundup_cpus(void)
> > +{
> > + struct cpumask mask;
> > +
> > + if (!arm64_supports_nmi())
> > + return false;
> > +
> > + cpumask_copy(&mask, cpu_online_mask);
> > + cpumask_clear_cpu(raw_smp_processor_id(), &mask);
> > + if (cpumask_empty(&mask))
> > + return false;
>
> Why do we need to fall back if there is no work to do? There will still
> be no work to do when we call the fallback.
>
>
> Daniel.

2020-11-02 06:22:57

by Sumit Garg

Subject: Re: [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override

On Thu, 29 Oct 2020 at 20:51, Daniel Thompson
<[email protected]> wrote:
>
> On Thu, Oct 29, 2020 at 08:26:26PM +0530, Sumit Garg wrote:
> > Add a new API, kgdb_arch_roundup_cpus(), that lets a particular
> > architecture override the default kgdb roundup. If the architecture
> > detects at runtime that it cannot support an NMI roundup, it can fall
> > back to the default implementation using async SMP cross-calls.
> >
> > Currently the example of such an architecture is arm64, whose pseudo NMI
> > feature is only available on platforms with support for GICv3 or a later
> > version.
> >
> > Signed-off-by: Sumit Garg <[email protected]>
> > ---
> > arch/powerpc/kernel/kgdb.c | 3 ++-
> > arch/sparc/kernel/smp_64.c | 3 ++-
> > arch/x86/kernel/kgdb.c | 6 ++++--
> > include/linux/kgdb.h | 5 +++--
> > kernel/debug/debug_core.c | 10 +++++++++-
> > 5 files changed, 20 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
> > index 4090802..126575d 100644
> > --- a/arch/powerpc/kernel/kgdb.c
> > +++ b/arch/powerpc/kernel/kgdb.c
> > @@ -125,9 +125,10 @@ static int kgdb_debugger_ipi(struct pt_regs *regs)
> > }
> >
> > #ifdef CONFIG_SMP
> > -void kgdb_roundup_cpus(void)
> > +bool kgdb_arch_roundup_cpus(void)
> > {
> > smp_send_debugger_break();
> > + return true;
> > }
> > #endif
> >
> > diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
> > index e38d8bf..c459c83 100644
> > --- a/arch/sparc/kernel/smp_64.c
> > +++ b/arch/sparc/kernel/smp_64.c
> > @@ -1014,9 +1014,10 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
> > }
> >
> > #ifdef CONFIG_KGDB
> > -void kgdb_roundup_cpus(void)
> > +bool kgdb_arch_roundup_cpus(void)
> > {
> > smp_cross_call(&xcall_kgdb_capture, 0, 0, 0);
> > + return true;
> > }
> > #endif
> >
> > diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
> > index ff7878d..1b756d9 100644
> > --- a/arch/x86/kernel/kgdb.c
> > +++ b/arch/x86/kernel/kgdb.c
> > @@ -404,7 +404,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> >
> > #ifdef CONFIG_SMP
> > /**
> > - * kgdb_roundup_cpus - Get other CPUs into a holding pattern
> > + * kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> > + * in an architecture-specific manner
> > *
> > * On SMP systems, we need to get the attention of the other CPUs
> > * and get them be in a known state. This should do what is needed
> > @@ -414,9 +415,10 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> > *
> > * On non-SMP systems, this is not called.
> > */
> > -void kgdb_roundup_cpus(void)
> > +bool kgdb_arch_roundup_cpus(void)
> > {
> > apic_send_IPI_allbutself(NMI_VECTOR);
> > + return true;
> > }
> > #endif
> >
> > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > index 0d6cf64..f9db5b8 100644
> > --- a/include/linux/kgdb.h
> > +++ b/include/linux/kgdb.h
> > @@ -200,7 +200,8 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> > extern void kgdb_call_nmi_hook(void *ignored);
> >
> > /**
> > - * kgdb_roundup_cpus - Get other CPUs into a holding pattern
> > + * kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> > + * in an architecture-specific manner
> > *
> > * On SMP systems, we need to get the attention of the other CPUs
> > * and get them into a known state. This should do what is needed
> > @@ -210,7 +211,7 @@ extern void kgdb_call_nmi_hook(void *ignored);
> > *
> > * On non-SMP systems, this is not called.
> > */
> > -extern void kgdb_roundup_cpus(void);
> > +extern bool kgdb_arch_roundup_cpus(void);
> >
> > /**
> > * kgdb_arch_set_pc - Generic call back to the program counter
> > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > index 1e75a89..27e401c 100644
> > --- a/kernel/debug/debug_core.c
> > +++ b/kernel/debug/debug_core.c
> > @@ -241,13 +241,21 @@ void __weak kgdb_call_nmi_hook(void *ignored)
> > }
> > NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
> >
> > -void __weak kgdb_roundup_cpus(void)
> > +bool __weak kgdb_arch_roundup_cpus(void)
> > +{
> > + return false;
>
> Do we really need to introduce all these boolean return values just to
> call a bit of library code when one of the architectures wants to
> fall back to a generic implementation?
>
> Why not something more like:
>
> void kgdb_smp_call_nmi_hook(void)
> {
> /* original weak version of kgdb_roundup_cpus goes here */
> }
>
> void __weak kgdb_roundup_cpus(void)
> {
> kgdb_smp_call_nmi_hook();
> }
>
> arm64 can now simply call the new library function if a fallback is needed.
>

Makes sense, I will use this approach instead.

> Note that the same question applies to the backtrace changes...

In the case of backtrace, there are already multiple APIs wrapping
arch_trigger_cpumask_backtrace() as follows:

- trigger_all_cpu_backtrace()
- trigger_allbutself_cpu_backtrace()
- trigger_cpumask_backtrace()
- trigger_single_cpu_backtrace()

And each of them already returns a boolean and has its own fallback
mechanism. So we can't call a common fallback API from the arch-specific
code, and instead need to extend that boolean return to the arch-specific
code, which is implemented as part of patch #4.
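
For example, the RCU stall code already falls back per-CPU when the
wrapper reports failure, roughly:

    /* abridged from the RCU stall-warning code */
    if (!trigger_single_cpu_backtrace(cpu))
            dump_cpu_task(cpu);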

If you do have any better ideas then do let me know.

-Sumit

>
>
> Daniel.
>
>
> > +}
> > +
> > +static void kgdb_roundup_cpus(void)
> > {
> > call_single_data_t *csd;
> > int this_cpu = raw_smp_processor_id();
> > int cpu;
> > int ret;
> >
> > + if (kgdb_arch_roundup_cpus())
> > + return;
> > +
> > for_each_online_cpu(cpu) {
> > /* No need to roundup ourselves */
> > if (cpu == this_cpu)
> > --
> > 2.7.4
> >

2020-11-02 07:02:55

by Sumit Garg

Subject: Re: [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI

On Thu, 29 Oct 2020 at 22:09, Daniel Thompson
<[email protected]> wrote:
>
> On Thu, Oct 29, 2020 at 04:22:34PM +0000, Daniel Thompson wrote:
> > On Thu, Oct 29, 2020 at 08:26:27PM +0530, Sumit Garg wrote:
> > > arm64 platforms with GICv3 or later support pseudo NMIs, which can be
> > > leveraged to round up CPUs that are stuck in a hard lockup state with
> > > interrupts disabled, something that wouldn't be possible with a normal
> > > IPI.
> > >
> > > So switch to rounding up CPUs using an IPI turned into an NMI. In case a
> > > particular arm64 platform doesn't support pseudo NMIs, it will fall back
> > > to the default kgdb CPU roundup mechanism.
> > >
> > > Signed-off-by: Sumit Garg <[email protected]>
> > > ---
> > > arch/arm64/include/asm/kgdb.h | 9 +++++++++
> > > arch/arm64/kernel/ipi_nmi.c | 5 +++++
> > > arch/arm64/kernel/kgdb.c | 35 +++++++++++++++++++++++++++++++++++
> > > 3 files changed, 49 insertions(+)
> > >
> > > diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h
> > > index 21fc85e..c3d2425 100644
> > > --- a/arch/arm64/include/asm/kgdb.h
> > > +++ b/arch/arm64/include/asm/kgdb.h
> > > @@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void)
> > > extern void kgdb_handle_bus_error(void);
> > > extern int kgdb_fault_expected;
> > >
> > > +#ifdef CONFIG_KGDB
> > > +extern bool kgdb_ipi_nmicallback(int cpu, void *regs);
> > > +#else
> > > +static inline bool kgdb_ipi_nmicallback(int cpu, void *regs)
> > > +{
> > > + return false;
> > > +}
> > > +#endif
> > > +
> > > #endif /* !__ASSEMBLY__ */
> > >
> > > /*
> > > diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> > > index 597dcf7..6ace182 100644
> > > --- a/arch/arm64/kernel/ipi_nmi.c
> > > +++ b/arch/arm64/kernel/ipi_nmi.c
> > > @@ -8,6 +8,7 @@
> > >
> > > #include <linux/interrupt.h>
> > > #include <linux/irq.h>
> > > +#include <linux/kgdb.h>
> > > #include <linux/nmi.h>
> > > #include <linux/smp.h>
> > >
> > > @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> > > static irqreturn_t ipi_nmi_handler(int irq, void *data)
> > > {
> > > irqreturn_t ret = IRQ_NONE;
> > > + unsigned int cpu = smp_processor_id();
> > >
> > > if (nmi_cpu_backtrace(get_irq_regs()))
> > > ret = IRQ_HANDLED;
> > >
> > > + if (kgdb_ipi_nmicallback(cpu, get_irq_regs()))
> > > + ret = IRQ_HANDLED;
> > > +
> > > return ret;
> >
> > It would be better to declare the existing return value of
> > kgdb_nmicallback() to be dangerously stupid and fix it so that it returns
> > an irqreturn_t (that's easy since most callers do not need to check the
> > return value).
> >
> > Then this code simply becomes:
> >
> > return kgdb_nmicallback(cpu, get_irq_regs());
>
> Actually, reflecting on this, maybe it is better to keep kgdb_nmicallin()
> and kgdb_nmicallback() aligned w.r.t. return codes (even if they are a
> little unusual).
>
> I'm still not sure why we'd keep kgdb_ipi_nmicallback() though.
> kgdb_nmicallback() is intended to be called from arch code...
>

I added kgdb_ipi_nmicallback() just to add a check for "kgdb_active"
prior to entering kgdb, since here we are sharing the NMI between
backtrace and kgdb.

But after your comments, I looked carefully at kgdb_nmicallback()
and I see the "raw_spin_is_locked(&dbg_master_lock)" check as well. So
it looks sufficient to call kgdb_nmicallback() directly from the arch
code, and hence I will remove kgdb_ipi_nmicallback() in the next
version.
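
For reference, that guard looks roughly like this (abridged from
kernel/debug/debug_core.c around v5.9):

    int kgdb_nmicallback(int cpu, void *regs)
    {
    #ifdef CONFIG_SMP
            /* ... set up the per-CPU kgdb state (ks) ... */

            /* Only enter as a slave if a master CPU already holds kgdb. */
            if (kgdb_info[cpu].enter_kgdb == 0 &&
                raw_spin_is_locked(&dbg_master_lock)) {
                    kgdb_cpu_enter(ks, regs, DCPU_IS_SLAVE);
                    return 0;       /* entered the debugger */
            }
    #endif
            return 1;               /* nothing to do on this CPU */
    }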

>
> Daniel.
>
>
> >
> >
> > > }
> > >
> > > diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
> > > index 1a157ca3..c26e710 100644
> > > --- a/arch/arm64/kernel/kgdb.c
> > > +++ b/arch/arm64/kernel/kgdb.c
> > > @@ -17,6 +17,7 @@
> > >
> > > #include <asm/debug-monitors.h>
> > > #include <asm/insn.h>
> > > +#include <asm/nmi.h>
> > > #include <asm/traps.h>
> > >
> > > struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
> > > @@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
> > > return aarch64_insn_write((void *)bpt->bpt_addr,
> > > *(u32 *)bpt->saved_instr);
> > > }
> > > +
> > > +bool kgdb_ipi_nmicallback(int cpu, void *regs)
> > > +{
> > > + if (atomic_read(&kgdb_active) != -1) {
> > > + kgdb_nmicallback(cpu, regs);
> > > + return true;
> > > + }
> > > +
> > > + return false;
> > > +}
> >
> > I *really* don't like this function.
> >
> > If the return code of kgdb_nmicallback() is broken then fix it, don't
> > just wrap it and invent a new criteria for the return code.
> >
> > To be honest I don't actually think the logic in kgdb_nmicallback() is
> > broken. As mentioned above the return value has a weird definition (0
> > for "handled it OK" and 1 for "nothing for me to do") but the logic to
> > calculate the return code looks OK.
> >

Makes sense, will remove it instead.

> >
> > > +
> > > +static void kgdb_smp_callback(void *data)
> > > +{
> > > + unsigned int cpu = smp_processor_id();
> > > +
> > > + if (atomic_read(&kgdb_active) != -1)
> > > + kgdb_nmicallback(cpu, get_irq_regs());
> > > +}
> >
> > This is unused. I presume it is litter from a previous revision of the
> > code and can be deleted?
> >

Yeah.

> >
> > > +
> > > +bool kgdb_arch_roundup_cpus(void)
> > > +{
> > > + struct cpumask mask;
> > > +
> > > + if (!arm64_supports_nmi())
> > > + return false;
> > > +
> > > + cpumask_copy(&mask, cpu_online_mask);
> > > + cpumask_clear_cpu(raw_smp_processor_id(), &mask);
> > > + if (cpumask_empty(&mask))
> > > + return false;
> >
> > Why do we need to fall back if there is no work to do? There will still
> > be no work to do when we call the fallback.

Okay, won't switch back to fallback mode here.

-Sumit

> >
> >
> > Daniel.

2020-11-02 09:21:38

by Daniel Thompson

Subject: Re: [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override

On Mon, Nov 02, 2020 at 11:48:53AM +0530, Sumit Garg wrote:
> On Thu, 29 Oct 2020 at 20:51, Daniel Thompson
> <[email protected]> wrote:
> >
> > On Thu, Oct 29, 2020 at 08:26:26PM +0530, Sumit Garg wrote:
> > > Add a new API, kgdb_arch_roundup_cpus(), that lets a particular
> > > architecture override the default kgdb roundup. If the architecture
> > > detects at runtime that it cannot support an NMI roundup, it can fall
> > > back to the default implementation using async SMP cross-calls.
> > >
> > > Currently the example of such an architecture is arm64, whose pseudo NMI
> > > feature is only available on platforms with support for GICv3 or a later
> > > version.
> > >
> > > Signed-off-by: Sumit Garg <[email protected]>
> > > ---
> > > arch/powerpc/kernel/kgdb.c | 3 ++-
> > > arch/sparc/kernel/smp_64.c | 3 ++-
> > > arch/x86/kernel/kgdb.c | 6 ++++--
> > > include/linux/kgdb.h | 5 +++--
> > > kernel/debug/debug_core.c | 10 +++++++++-
> > > 5 files changed, 20 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
> > > index 4090802..126575d 100644
> > > --- a/arch/powerpc/kernel/kgdb.c
> > > +++ b/arch/powerpc/kernel/kgdb.c
> > > @@ -125,9 +125,10 @@ static int kgdb_debugger_ipi(struct pt_regs *regs)
> > > }
> > >
> > > #ifdef CONFIG_SMP
> > > -void kgdb_roundup_cpus(void)
> > > +bool kgdb_arch_roundup_cpus(void)
> > > {
> > > smp_send_debugger_break();
> > > + return true;
> > > }
> > > #endif
> > >
> > > diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
> > > index e38d8bf..c459c83 100644
> > > --- a/arch/sparc/kernel/smp_64.c
> > > +++ b/arch/sparc/kernel/smp_64.c
> > > @@ -1014,9 +1014,10 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
> > > }
> > >
> > > #ifdef CONFIG_KGDB
> > > -void kgdb_roundup_cpus(void)
> > > +bool kgdb_arch_roundup_cpus(void)
> > > {
> > > smp_cross_call(&xcall_kgdb_capture, 0, 0, 0);
> > > + return true;
> > > }
> > > #endif
> > >
> > > diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
> > > index ff7878d..1b756d9 100644
> > > --- a/arch/x86/kernel/kgdb.c
> > > +++ b/arch/x86/kernel/kgdb.c
> > > @@ -404,7 +404,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> > >
> > > #ifdef CONFIG_SMP
> > > /**
> > > - * kgdb_roundup_cpus - Get other CPUs into a holding pattern
> > > + * kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> > > + * in an architecture-specific manner
> > > *
> > > * On SMP systems, we need to get the attention of the other CPUs
> > > * and get them be in a known state. This should do what is needed
> > > @@ -414,9 +415,10 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> > > *
> > > * On non-SMP systems, this is not called.
> > > */
> > > -void kgdb_roundup_cpus(void)
> > > +bool kgdb_arch_roundup_cpus(void)
> > > {
> > > apic_send_IPI_allbutself(NMI_VECTOR);
> > > + return true;
> > > }
> > > #endif
> > >
> > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > > index 0d6cf64..f9db5b8 100644
> > > --- a/include/linux/kgdb.h
> > > +++ b/include/linux/kgdb.h
> > > @@ -200,7 +200,8 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> > > extern void kgdb_call_nmi_hook(void *ignored);
> > >
> > > /**
> > > - * kgdb_roundup_cpus - Get other CPUs into a holding pattern
> > > + * kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> > > + * in an architecture-specific manner
> > > *
> > > * On SMP systems, we need to get the attention of the other CPUs
> > > * and get them into a known state. This should do what is needed
> > > @@ -210,7 +211,7 @@ extern void kgdb_call_nmi_hook(void *ignored);
> > > *
> > > * On non-SMP systems, this is not called.
> > > */
> > > -extern void kgdb_roundup_cpus(void);
> > > +extern bool kgdb_arch_roundup_cpus(void);
> > >
> > > /**
> > > * kgdb_arch_set_pc - Generic call back to the program counter
> > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > > index 1e75a89..27e401c 100644
> > > --- a/kernel/debug/debug_core.c
> > > +++ b/kernel/debug/debug_core.c
> > > @@ -241,13 +241,21 @@ void __weak kgdb_call_nmi_hook(void *ignored)
> > > }
> > > NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
> > >
> > > -void __weak kgdb_roundup_cpus(void)
> > > +bool __weak kgdb_arch_roundup_cpus(void)
> > > +{
> > > + return false;
> >
> > Do we really need to introduce all these boolean return values just to
> > call a bit of library code when one of the architectures wants to
> > fall back to a generic implementation?
> >
> > Why not something more like:
> >
> > void kgdb_smp_call_nmi_hook(void)
> > {
> > /* original weak version of kgdb_roundup_cpus goes here */
> > }
> >
> > void __weak kgdb_roundup_cpus(void)
> > {
> > kgdb_smp_call_nmi_hook();
> > }
> >
> > arm64 can now simply call the new library function if a fallback is needed.
> >
>
> Makes sense, I will use this approach instead.
>
> > Note that the same question applies to the backtrace changes...
>
> In the case of backtrace, there are already multiple APIs wrapping
> arch_trigger_cpumask_backtrace() as follows:
>
> - trigger_all_cpu_backtrace()
> - trigger_allbutself_cpu_backtrace()
> - trigger_cpumask_backtrace()
> - trigger_single_cpu_backtrace()
>
> And each of them already returns a boolean and has its own fallback
> mechanism. So we can't call a common fallback API from the arch-specific
> code, and instead need to extend that boolean return to the arch-specific
> code, which is implemented as part of patch #4.

Understood. Thanks.


Daniel.