2020-11-04 08:20:09

by Yuichi Ito

Subject: [PATCH v2 0/3] Enable support IPI_CPU_CRASH_STOP to be pseudo-NMI

This patchset enables the IPI_CPU_CRASH_STOP IPI to be delivered as a
pseudo-NMI. This allows kdump to collect system information even when a
CPU is in a HARDLOCKUP state.

Only IPI_CPU_CRASH_STOP uses NMI and the other IPIs remain normal IRQs.
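
To illustrate why a pseudo-NMI can still reach a CPU that is spinning with
interrupts disabled, here is a minimal sketch of GIC priority masking. It
assumes that "interrupts disabled" is implemented by lowering the priority
mask (PMR) rather than by setting PSR.I. The helper irq_is_delivered() is
hypothetical, and the numeric values are recalled from the arm64/GICv3
headers of this kernel era, so treat them as assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; check them against the kernel tree you are using. */
#define GIC_PRIO_IRQON    0xe0u  /* PMR while interrupts are "enabled"  */
#define GIC_PRIO_IRQOFF   0x60u  /* PMR while interrupts are "disabled" */
#define GICD_INT_DEF_PRI  0xa0u  /* priority of a normal interrupt      */
#define GICD_INT_NMI_PRI  0x20u  /* priority of a pseudo-NMI            */

/*
 * Hypothetical helper: an interrupt is signalled to the CPU only when
 * its priority value is numerically lower than the priority mask.
 * Lower numbers mean higher priority.
 */
static bool irq_is_delivered(uint8_t prio, uint8_t pmr)
{
	return prio < pmr;
}
```

Under these assumptions, irq_is_delivered(GICD_INT_DEF_PRI, GIC_PRIO_IRQOFF)
is false while irq_is_delivered(GICD_INT_NMI_PRI, GIC_PRIO_IRQOFF) is true,
which is why only an IPI registered at NMI priority reaches the locked-up CPU.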

The patchset has been tested on FX1000.

It also uses part of Sumit's patch set that enables IPIs as NMIs [1].

[1] https://lore.kernel.org/lkml/1603983387-8738-3-git-send-email-sumit.garg@linaro.org/

$ echo 1 > /proc/sys/kernel/panic_on_rcu_stall
$ echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
: the kernel panics and the crash kernel boots
: makedumpfile saves the system state at the time of the HARDLOCKUP in vmcore.

crash utility:
#7 [fffffe00290afd30] lkdtm_HARDLOCKUP at fffffe0010857ee8
#8 [fffffe00290afd40] direct_entry at fffffe0010857c94
#9 [fffffe00290afd90] full_proxy_write at fffffe001055dea0
#10 [fffffe00290afdd0] vfs_write at fffffe001047533c
#11 [fffffe00290afe10] ksys_write at fffffe001047563c
#12 [fffffe00290afe60] __arm64_sys_write at fffffe00104756e8
#13 [fffffe00290afe70] do_el0_svc at fffffe00101590cc
#14 [fffffe00290afea0] el0_svc at fffffe0010147a30
#15 [fffffe00290afeb0] el0_sync_handler at fffffe001014835c
#16 [fffffe00290afff0] el0_sync at fffffe0010142c14

Changes in v2:
- Rebased to head of upstream master.
- Rebased to Sumit's latest IPIs patch-set [1].
- Added a conditional branch for local_irq_disable().

[1] https://lore.kernel.org/lkml/1603983387-8738-3-git-send-email-sumit.garg@linaro.org/

Sumit Garg (1):
irqchip/gic-v3: Enable support for SGIs to act as NMIs

Yuichi Ito (2):
arch64: smp: Register IPI_CPU_CRASH_STOP IPI as pseudo-NMI
arch64: smp: Disable priority masking when received NMI on PSR.I section

arch/arm64/kernel/smp.c | 44 +++++++++++++++++++++++++++++++++++---------
drivers/irqchip/irq-gic-v3.c | 29 +++++++++++++++++++++--------
2 files changed, 56 insertions(+), 17 deletions(-)

--
1.8.3.1


2020-11-04 08:21:40

by Yuichi Ito

Subject: [PATCH v2 1/3] irqchip/gic-v3: Enable support for SGIs to act as NMIs

From: Sumit Garg <sumit.garg@linaro.org>

Add support for handling SGIs as pseudo-NMIs. SGIs (IPIs) already default
to a special flow handler, handle_percpu_devid_fasteoi_ipi(), so skip the
NMI handler update for SGIs.

Also, enable NMI support before gic_smp_init(), because the allocation of
SGIs as IRQs/NMIs happens as part of that routine.
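
The handler selection described above can be modeled outside the kernel. The
sketch below is an illustration only: classify_intid() and nmi_flow_handler()
are hypothetical names, the INTID boundaries are the GICv3 ranges as I recall
them, and handlers are returned as strings instead of function pointers. It
mirrors only the switch statement; the real setup path rejects invalid
interrupts (for example LPIs) before reaching it.

```c
#include <stdint.h>

enum intid_range {
	SGI_RANGE, PPI_RANGE, SPI_RANGE,
	EPPI_RANGE, ESPI_RANGE, LPI_RANGE, INVALID_RANGE
};

/* Hypothetical classifier; boundaries are assumptions, not kernel code. */
static enum intid_range classify_intid(uint32_t intid)
{
	if (intid <= 15)
		return SGI_RANGE;
	if (intid <= 31)
		return PPI_RANGE;
	if (intid <= 1019)
		return SPI_RANGE;
	if (intid >= 1056 && intid <= 1119)
		return EPPI_RANGE;
	if (intid >= 4096 && intid <= 5119)
		return ESPI_RANGE;
	if (intid >= 8192)
		return LPI_RANGE;
	return INVALID_RANGE;
}

/* Name of the flow handler that is in place after NMI setup. */
static const char *nmi_flow_handler(uint32_t intid)
{
	switch (classify_intid(intid)) {
	case SGI_RANGE:
		/* SGIs keep their dedicated IPI handler. */
		return "handle_percpu_devid_fasteoi_ipi";
	case PPI_RANGE:
	case EPPI_RANGE:
		return "handle_percpu_devid_fasteoi_nmi";
	default:
		return "handle_fasteoi_nmi";
	}
}
```

For example, nmi_flow_handler(5) names the IPI handler, nmi_flow_handler(23)
the per-CPU NMI handler, and nmi_flow_handler(40) the generic NMI handler.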

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
drivers/irqchip/irq-gic-v3.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 16fecc0..7010ae2 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -461,6 +461,7 @@ static u32 gic_get_ppi_index(struct irq_data *d)
 static int gic_irq_nmi_setup(struct irq_data *d)
 {
 	struct irq_desc *desc = irq_to_desc(d->irq);
+	u32 idx;

 	if (!gic_supports_nmi())
 		return -EINVAL;
@@ -478,16 +479,22 @@ static int gic_irq_nmi_setup(struct irq_data *d)
 		return -EINVAL;

 	/* desc lock should already be held */
-	if (gic_irq_in_rdist(d)) {
-		u32 idx = gic_get_ppi_index(d);
+	switch (get_intid_range(d)) {
+	case SGI_RANGE:
+		break;
+	case PPI_RANGE:
+	case EPPI_RANGE:
+		idx = gic_get_ppi_index(d);

 		/* Setting up PPI as NMI, only switch handler for first NMI */
 		if (!refcount_inc_not_zero(&ppi_nmi_refs[idx])) {
 			refcount_set(&ppi_nmi_refs[idx], 1);
 			desc->handle_irq = handle_percpu_devid_fasteoi_nmi;
 		}
-	} else {
+		break;
+	default:
 		desc->handle_irq = handle_fasteoi_nmi;
+		break;
 	}

 	gic_irq_set_prio(d, GICD_INT_NMI_PRI);
@@ -498,6 +505,7 @@ static int gic_irq_nmi_setup(struct irq_data *d)
 static void gic_irq_nmi_teardown(struct irq_data *d)
 {
 	struct irq_desc *desc = irq_to_desc(d->irq);
+	u32 idx;

 	if (WARN_ON(!gic_supports_nmi()))
 		return;
@@ -515,14 +523,20 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
 		return;

 	/* desc lock should already be held */
-	if (gic_irq_in_rdist(d)) {
-		u32 idx = gic_get_ppi_index(d);
+	switch (get_intid_range(d)) {
+	case SGI_RANGE:
+		break;
+	case PPI_RANGE:
+	case EPPI_RANGE:
+		idx = gic_get_ppi_index(d);

 		/* Tearing down NMI, only switch handler for last NMI */
 		if (refcount_dec_and_test(&ppi_nmi_refs[idx]))
 			desc->handle_irq = handle_percpu_devid_irq;
-	} else {
+		break;
+	default:
 		desc->handle_irq = handle_fasteoi_irq;
+		break;
 	}

 	gic_irq_set_prio(d, GICD_INT_DEF_PRI);
@@ -1708,6 +1722,7 @@ static int __init gic_init_bases(void __iomem *dist_base,

 	gic_dist_init();
 	gic_cpu_init();
+	gic_enable_nmi_support();
 	gic_smp_init();
 	gic_cpu_pm_init();

@@ -1719,8 +1734,6 @@ static int __init gic_init_bases(void __iomem *dist_base,
 			gicv2m_init(handle, gic_data.domain);
 	}

-	gic_enable_nmi_support();
-
 	return 0;

 out_free:
--
1.8.3.1
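
The PPI branch in the diff above switches the flow handler only for the first
NMI user and restores it only when the last user goes away. A minimal
user-space sketch of that first-in/last-out pattern follows. The struct and
function names are hypothetical, and a plain counter stands in for the
kernel's refcount_t (the real code runs with the descriptor lock held).

```c
#include <stdbool.h>

/* Hypothetical stand-in for one ppi_nmi_refs[] slot plus its handler. */
struct ppi_slot {
	unsigned int nmi_users;  /* number of active NMI users of this PPI */
	bool nmi_handler;        /* true while the NMI flow handler is set */
};

static void ppi_nmi_setup(struct ppi_slot *s)
{
	/* Only the first user switches the handler. */
	if (s->nmi_users == 0)
		s->nmi_handler = true;
	s->nmi_users++;
}

static void ppi_nmi_teardown(struct ppi_slot *s)
{
	/* Ignore unbalanced teardown calls in this sketch. */
	if (s->nmi_users == 0)
		return;

	/* Only the last user restores the normal handler. */
	if (--s->nmi_users == 0)
		s->nmi_handler = false;
}
```

With two setup calls followed by two teardown calls, the handler flag stays
set after the first teardown and is cleared only by the second.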

2020-11-16 18:52:28

by Yuichi Ito

Subject: RE: [PATCH v2 0/3] Enable support IPI_CPU_CRASH_STOP to be pseudo-NMI

Hi Marc, Sumit

What should I do to get this patchset merged?
I would appreciate any advice you can give.

I have not tested it with ThunderX2 yet.

Best regards,

Yuichi Ito

> -----Original Message-----
> From: Yuichi Ito <[email protected]>
> Sent: Wednesday, November 4, 2020 5:06 PM
> To: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Ito,
> Yuichi/伊藤 有一 <[email protected]>
> Subject: [PATCH v2 0/3] Enable support IPI_CPU_CRASH_STOP to be
> pseudo-NMI
>
> This patchset enables IPI_CPU_CRASH_STOP IPI to be pseudo-NMI.
> This allows kdump to collect system information even when the CPU is in a
> HARDLOCKUP state.
>
> Only IPI_CPU_CRASH_STOP uses NMI and the other IPIs remain normal
> IRQs.
>
> The patch has been tested on FX1000.
>
> It also uses some of Sumit's IPI patch set for NMI.[1]
>
> [1] https://lore.kernel.org/lkml/1603983387-8738-3-git-send-email-sumit.garg@linaro.org/
>
> $ echo 1 > /proc/sys/kernel/panic_on_rcu_stall
> $ echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
> : kernel panics and crash kernel boot
> : makedumpfile saves the system state at HARDLOCKUP in vmcore.
>
> crash utility:
> #7 [fffffe00290afd30] lkdtm_HARDLOCKUP at fffffe0010857ee8
> #8 [fffffe00290afd40] direct_entry at fffffe0010857c94
> #9 [fffffe00290afd90] full_proxy_write at fffffe001055dea0
> #10 [fffffe00290afdd0] vfs_write at fffffe001047533c
> #11 [fffffe00290afe10] ksys_write at fffffe001047563c
> #12 [fffffe00290afe60] __arm64_sys_write at fffffe00104756e8
> #13 [fffffe00290afe70] do_el0_svc at fffffe00101590cc
> #14 [fffffe00290afea0] el0_svc at fffffe0010147a30
> #15 [fffffe00290afeb0] el0_sync_handler at fffffe001014835c
> #16 [fffffe00290afff0] el0_sync at fffffe0010142c14
>
> Changes in v1:
> - Rebased to head of upstream master.
> - Rebased to Sumit's latest IPIs patch-set [1].
>
> [1] https://lore.kernel.org/lkml/1603983387-8738-3-git-send-email-sumit.garg@linaro.org/
>
> - Add conditional branch of local_irq_disable().
>
> Sumit Garg (1):
> irqchip/gic-v3: Enable support for SGIs to act as NMIs
>
> Yuichi Ito (2):
> arch64: smp: Register IPI_CPU_CRASH_STOP IPI as pseudo-NMI
> arch64: smp: Disable priority masking when received NMI on PSR.I section
>
> arch/arm64/kernel/smp.c | 44 +++++++++++++++++++++++++++++++++++---------
> drivers/irqchip/irq-gic-v3.c | 29 +++++++++++++++++++++--------
> 2 files changed, 56 insertions(+), 17 deletions(-)
>
> --
> 1.8.3.1