2019-06-06 09:33:41

by Julien Thierry

Subject: [PATCH v3 0/8] arm64: IRQ priority masking and Pseudo-NMI fixes

Version one[1] of this series attempted to fix the issue reported by
Zenghui[2] when using the function_graph tracer with IRQ priority
masking.

Since then, I realized that priority masking and the use of Pseudo-NMIs
were more broken than I thought.

* Patches 1-2 are just some cleanup
* Patch 3 fixes a potential issue with not clobbering condition flags
in irqflags operations
* Patch 4 fixes an issue where calling C code in a Pseudo-NMI before
nmi_enter() could lead to races
* Patch 5 fixes the function_graph issue when using priority masking
* Patch 6 introduces some debug to hopefully avoid breaking things in
the future
* Patch 7 is a rebased version of the patch sent by Wei Li[3] fixing
an error that can happen on some platforms using the priority
masking feature
* Patch 8 re-enables the Pseudo-NMI selection

Changes since V2 [4]:
- Rebase on v5.2-rc3
- Clobber condition flags for asm that needs it, as pointed out by Marc Z.
and Robin M.
- Fix the naming of the new PMR bit value
- Introduce a helper for the debug conditions
- Use WARN_ONCE for debug that might be very noisy
- Re-enable Pseudo-NMI.

Changes since V1 [1]:
- Fix possible race condition between NMI and trace irqflags
- Simplify the representation of PSR.I in the PMR value
- Include Wei Li's fix
- Rebase on v5.1-rc7

[1] https://marc.info/?l=linux-arm-kernel&m=155542458004480&w=2
[2] https://www.spinics.net/lists/arm-kernel/msg716956.html
[3] https://www.spinics.net/lists/arm-kernel/msg722171.html
[4] https://lkml.org/lkml/2019/4/29/643

Cheers,

Julien


Julien Thierry (7):
arm64: Do not enable IRQs for ct_user_exit
arm64: irqflags: Pass flags as readonly operand to restore instruction
arm64: irqflags: Add condition flags to inline asm clobber list
arm64: Fix interrupt tracing in the presence of NMIs
arm64: Fix incorrect irqflag restore for priority masking
arm64: irqflags: Introduce explicit debugging for IRQ priorities
arm64: Allow selecting Pseudo-NMI again

Wei Li (1):
arm64: fix kernel stack overflow in kdump capture kernel

arch/arm64/Kconfig | 12 +++++-
arch/arm64/include/asm/arch_gicv3.h | 4 +-
arch/arm64/include/asm/cpufeature.h | 6 +++
arch/arm64/include/asm/daifflags.h | 75 +++++++++++++++++++++------------
arch/arm64/include/asm/irqflags.h | 79 +++++++++++++++++-----------------
arch/arm64/include/asm/kvm_host.h | 7 ++--
arch/arm64/include/asm/ptrace.h | 10 ++++-
arch/arm64/kernel/entry.S | 84 +++++++++++++++++++++++++++++--------
arch/arm64/kernel/irq.c | 26 ++++++++++++
arch/arm64/kernel/process.c | 2 +-
arch/arm64/kernel/smp.c | 6 +--
arch/arm64/kvm/hyp/switch.c | 2 +-
drivers/irqchip/irq-gic-v3.c | 6 +++
kernel/irq/irqdesc.c | 8 +++-
14 files changed, 227 insertions(+), 100 deletions(-)

--
1.9.1


2019-06-06 09:33:42

by Julien Thierry

Subject: [PATCH v3 2/8] arm64: irqflags: Pass flags as readonly operand to restore instruction

Flags are only read by the instructions doing the irqflags restore
operation. Pass the operand as read-only to the inline asm instead of
as read-write.
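
As an illustration (a standalone sketch, not part of the patch), the
difference between the two constraint kinds in GCC/Clang inline asm:

/*
 * With "+r" the compiler must assume the asm block modifies 'v' and
 * keep the value live afterwards; with a plain "r" input it knows 'v'
 * is only read, which gives it more freedom to optimize.
 */
static inline void use_readwrite(unsigned long v)
{
	asm volatile("" : "+r" (v) : : "memory");	/* read-write operand */
}

static inline void use_readonly(unsigned long v)
{
	asm volatile("" : : "r" (v) : "memory");	/* read-only input */
}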

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/include/asm/irqflags.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 62996318..9c93152 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -119,8 +119,8 @@ static inline void arch_local_irq_restore(unsigned long flags)
__msr_s(SYS_ICC_PMR_EL1, "%0")
"dsb sy",
ARM64_HAS_IRQ_PRIO_MASKING)
- : "+r" (flags)
:
+ : "r" (flags)
: "memory");
}

--
1.9.1

2019-06-06 09:33:54

by Julien Thierry

Subject: [PATCH v3 3/8] arm64: irqflags: Add condition flags to inline asm clobber list

Some of the inline assembly instructions use the condition flags and need
to include "cc" in the clobber list.

Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
Suggested-by: Marc Zyngier <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/include/asm/irqflags.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 9c93152..fbe1aba 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -92,7 +92,7 @@ static inline unsigned long arch_local_save_flags(void)
ARM64_HAS_IRQ_PRIO_MASKING)
: "=&r" (flags), "+r" (daif_bits)
: "r" ((unsigned long) GIC_PRIO_IRQOFF)
- : "memory");
+ : "cc", "memory");

return flags;
}
@@ -136,7 +136,7 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
ARM64_HAS_IRQ_PRIO_MASKING)
: "=&r" (res)
: "r" ((int) flags)
- : "memory");
+ : "cc", "memory");

return res;
}
--
1.9.1

2019-06-06 09:34:02

by Julien Thierry

Subject: [PATCH v3 5/8] arm64: Fix incorrect irqflag restore for priority masking

When using IRQ priority masking to disable interrupts, in order to deal
with the PSR.I state, local_irq_save() would convert the I bit into a
PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore()
potentially modifying the value of PMR in undesired locations due to the
state of PSR.I upon flag saving [1].

In an attempt to solve this issue in a less hackish manner, introduce
a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent
whether PSR.I is being used to disable interrupts, in which case it
takes precedence over the status of interrupt masking via PMR.

GIC_PRIO_IGNORE_PMR is chosen such that (<pmr_value> |
GIC_PRIO_IGNORE_PMR) does not mask more interrupts than <pmr_value>, as
some sections (e.g. arch_cpu_idle(), the interrupt acknowledge path)
require PMR not to mask interrupts that could be signaled to the
CPU when using only PSR.I.
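
Using the values this patch adds to ptrace.h (where the bit is named
GIC_PRIO_PSR_I_SET), the property can be sketched as follows;
illustrative only, not part of the patch:

#define GIC_PRIO_IRQON		0xc0
#define GIC_PRIO_IRQOFF		(GIC_PRIO_IRQON & ~0x80)	/* 0x40 */
#define GIC_PRIO_PSR_I_SET	(1 << 4)			/* 0x10 */

/*
 * A higher PMR value unmasks more GIC priorities, so ORing in the new
 * bit can only unmask interrupts, never mask additional ones:
 * 0x40 | 0x10 == 0x50 >= 0x40.
 */
_Static_assert((GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET) > GIC_PRIO_IRQOFF,
	       "setting the flag bit must not mask more interrupts");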

[1] https://www.spinics.net/lists/arm-kernel/msg716956.html

Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
Signed-off-by: Julien Thierry <[email protected]>
Reported-by: Zenghui Yu <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Wei Li <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: James Morse <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Oleg Nesterov <[email protected]>
---
arch/arm64/include/asm/arch_gicv3.h | 4 ++-
arch/arm64/include/asm/daifflags.h | 68 ++++++++++++++++++++++---------------
arch/arm64/include/asm/irqflags.h | 67 +++++++++++++++---------------------
arch/arm64/include/asm/kvm_host.h | 7 ++--
arch/arm64/include/asm/ptrace.h | 10 ++++--
arch/arm64/kernel/entry.S | 38 ++++++++++++++++++---
arch/arm64/kernel/process.c | 2 +-
arch/arm64/kernel/smp.c | 8 +++--
arch/arm64/kvm/hyp/switch.c | 2 +-
9 files changed, 123 insertions(+), 83 deletions(-)

diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index 14b41dd..9e991b6 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -163,7 +163,9 @@ static inline bool gic_prio_masking_enabled(void)

static inline void gic_pmr_mask_irqs(void)
{
- BUILD_BUG_ON(GICD_INT_DEF_PRI <= GIC_PRIO_IRQOFF);
+ BUILD_BUG_ON(GICD_INT_DEF_PRI < (GIC_PRIO_IRQOFF |
+ GIC_PRIO_PSR_I_SET));
+ BUILD_BUG_ON(GICD_INT_DEF_PRI >= GIC_PRIO_IRQON);
gic_write_pmr(GIC_PRIO_IRQOFF);
}

diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index db452aa..f93204f 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -18,6 +18,7 @@

#include <linux/irqflags.h>

+#include <asm/arch_gicv3.h>
#include <asm/cpufeature.h>

#define DAIF_PROCCTX 0
@@ -32,6 +33,11 @@ static inline void local_daif_mask(void)
:
:
: "memory");
+
+ /* Don't really care for a dsb here, we don't intend to enable IRQs */
+ if (system_uses_irq_prio_masking())
+ gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
+
trace_hardirqs_off();
}

@@ -43,7 +49,7 @@ static inline unsigned long local_daif_save(void)

if (system_uses_irq_prio_masking()) {
/* If IRQs are masked with PMR, reflect it in the flags */
- if (read_sysreg_s(SYS_ICC_PMR_EL1) <= GIC_PRIO_IRQOFF)
+ if (read_sysreg_s(SYS_ICC_PMR_EL1) != GIC_PRIO_IRQON)
flags |= PSR_I_BIT;
}

@@ -59,36 +65,44 @@ static inline void local_daif_restore(unsigned long flags)
if (!irq_disabled) {
trace_hardirqs_on();

- if (system_uses_irq_prio_masking())
- arch_local_irq_enable();
- } else if (!(flags & PSR_A_BIT)) {
- /*
- * If interrupts are disabled but we can take
- * asynchronous errors, we can take NMIs
- */
if (system_uses_irq_prio_masking()) {
- flags &= ~PSR_I_BIT;
+ gic_write_pmr(GIC_PRIO_IRQON);
+ dsb(sy);
+ }
+ } else if (system_uses_irq_prio_masking()) {
+ u64 pmr;
+
+ if (!(flags & PSR_A_BIT)) {
/*
- * There has been concern that the write to daif
- * might be reordered before this write to PMR.
- * From the ARM ARM DDI 0487D.a, section D1.7.1
- * "Accessing PSTATE fields":
- * Writes to the PSTATE fields have side-effects on
- * various aspects of the PE operation. All of these
- * side-effects are guaranteed:
- * - Not to be visible to earlier instructions in
- * the execution stream.
- * - To be visible to later instructions in the
- * execution stream
- *
- * Also, writes to PMR are self-synchronizing, so no
- * interrupts with a lower priority than PMR is signaled
- * to the PE after the write.
- *
- * So we don't need additional synchronization here.
+ * If interrupts are disabled but we can take
+ * asynchronous errors, we can take NMIs
*/
- arch_local_irq_disable();
+ flags &= ~PSR_I_BIT;
+ pmr = GIC_PRIO_IRQOFF;
+ } else {
+ pmr = GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET;
}
+
+ /*
+ * There has been concern that the write to daif
+ * might be reordered before this write to PMR.
+ * From the ARM ARM DDI 0487D.a, section D1.7.1
+ * "Accessing PSTATE fields":
+ * Writes to the PSTATE fields have side-effects on
+ * various aspects of the PE operation. All of these
+ * side-effects are guaranteed:
+ * - Not to be visible to earlier instructions in
+ * the execution stream.
+ * - To be visible to later instructions in the
+ * execution stream
+ *
+ * Also, writes to PMR are self-synchronizing, so no
+ * interrupts with a lower priority than PMR is signaled
+ * to the PE after the write.
+ *
+ * So we don't need additional synchronization here.
+ */
+ gic_write_pmr(pmr);
}

write_sysreg(flags, daif);
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index fbe1aba..b6f757f 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -67,43 +67,46 @@ static inline void arch_local_irq_disable(void)
*/
static inline unsigned long arch_local_save_flags(void)
{
- unsigned long daif_bits;
unsigned long flags;

- daif_bits = read_sysreg(daif);
-
- /*
- * The asm is logically equivalent to:
- *
- * if (system_uses_irq_prio_masking())
- * flags = (daif_bits & PSR_I_BIT) ?
- * GIC_PRIO_IRQOFF :
- * read_sysreg_s(SYS_ICC_PMR_EL1);
- * else
- * flags = daif_bits;
- */
asm volatile(ALTERNATIVE(
- "mov %0, %1\n"
- "nop\n"
- "nop",
- __mrs_s("%0", SYS_ICC_PMR_EL1)
- "ands %1, %1, " __stringify(PSR_I_BIT) "\n"
- "csel %0, %0, %2, eq",
- ARM64_HAS_IRQ_PRIO_MASKING)
- : "=&r" (flags), "+r" (daif_bits)
- : "r" ((unsigned long) GIC_PRIO_IRQOFF)
- : "cc", "memory");
+ "mrs %0, daif",
+ __mrs_s("%0", SYS_ICC_PMR_EL1),
+ ARM64_HAS_IRQ_PRIO_MASKING)
+ : "=&r" (flags)
+ :
+ : "memory");

return flags;
}

+static inline int arch_irqs_disabled_flags(unsigned long flags)
+{
+ int res;
+
+ asm volatile(ALTERNATIVE(
+ "and %w0, %w1, #" __stringify(PSR_I_BIT),
+ "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQOFF),
+ ARM64_HAS_IRQ_PRIO_MASKING)
+ : "=&r" (res)
+ : "r" ((int) flags)
+ : "memory");
+
+ return res;
+}
+
static inline unsigned long arch_local_irq_save(void)
{
unsigned long flags;

flags = arch_local_save_flags();

- arch_local_irq_disable();
+ /*
+ * There are too many states with IRQs disabled, just keep the current
+ * state if interrupts are already disabled/masked.
+ */
+ if (!arch_irqs_disabled_flags(flags))
+ arch_local_irq_disable();

return flags;
}
@@ -124,21 +127,5 @@ static inline void arch_local_irq_restore(unsigned long flags)
: "memory");
}

-static inline int arch_irqs_disabled_flags(unsigned long flags)
-{
- int res;
-
- asm volatile(ALTERNATIVE(
- "and %w0, %w1, #" __stringify(PSR_I_BIT) "\n"
- "nop",
- "cmp %w1, #" __stringify(GIC_PRIO_IRQOFF) "\n"
- "cset %w0, ls",
- ARM64_HAS_IRQ_PRIO_MASKING)
- : "=&r" (res)
- : "r" ((int) flags)
- : "cc", "memory");
-
- return res;
-}
#endif
#endif
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4bcd9c1..fdc0e1c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -608,11 +608,12 @@ static inline void kvm_arm_vhe_guest_enter(void)
* will not signal the CPU of interrupts of lower priority, and the
* only way to get out will be via guest exceptions.
* Naturally, we want to avoid this.
+ *
+ * local_daif_mask() already sets IGNORE_PMR, we just need a
+ * dsb to ensure the redistributor forwards EL2 IRQs to the CPU.
*/
- if (system_uses_irq_prio_masking()) {
- gic_write_pmr(GIC_PRIO_IRQON);
+ if (system_uses_irq_prio_masking())
dsb(sy);
- }
}

static inline void kvm_arm_vhe_guest_exit(void)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index b2de329..da22422 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -35,9 +35,15 @@
* means masking more IRQs (or at least that the same IRQs remain masked).
*
* To mask interrupts, we clear the most significant bit of PMR.
+ *
+ * Some code sections either automatically switch back to PSR.I or explicitly
+ * require to not use priority masking. If bit GIC_PRIO_PSR_I_SET is included
+ * in the priority mask, it indicates that PSR.I should be set and
+ * interrupt disabling temporarily does not rely on IRQ priorities.
*/
-#define GIC_PRIO_IRQON 0xf0
-#define GIC_PRIO_IRQOFF (GIC_PRIO_IRQON & ~0x80)
+#define GIC_PRIO_IRQON 0xc0
+#define GIC_PRIO_IRQOFF (GIC_PRIO_IRQON & ~0x80)
+#define GIC_PRIO_PSR_I_SET (1 << 4)

/* Additional SPSR bits not exposed in the UABI */
#define PSR_IL_BIT (1 << 20)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 46358a3..7f92e4b 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -258,6 +258,7 @@ alternative_else_nop_endif
/*
* Registers that may be useful after this macro is invoked:
*
+ * x20 - ICC_PMR_EL1
* x21 - aborted SP
* x22 - aborted PC
* x23 - aborted PSTATE
@@ -449,6 +450,24 @@ alternative_endif
.endm
#endif

+ .macro gic_prio_kentry_setup, tmp:req
+#ifdef CONFIG_ARM64_PSEUDO_NMI
+ alternative_if ARM64_HAS_IRQ_PRIO_MASKING
+ mov \tmp, #(GIC_PRIO_PSR_I_SET | GIC_PRIO_IRQON)
+ msr_s SYS_ICC_PMR_EL1, \tmp
+ alternative_else_nop_endif
+#endif
+ .endm
+
+ .macro gic_prio_irq_setup, pmr:req, tmp:req
+#ifdef CONFIG_ARM64_PSEUDO_NMI
+ alternative_if ARM64_HAS_IRQ_PRIO_MASKING
+ orr \tmp, \pmr, #GIC_PRIO_PSR_I_SET
+ msr_s SYS_ICC_PMR_EL1, \tmp
+ alternative_else_nop_endif
+#endif
+ .endm
+
.text

/*
@@ -627,6 +646,7 @@ el1_dbg:
cmp x24, #ESR_ELx_EC_BRK64 // if BRK64
cinc x24, x24, eq // set bit '0'
tbz x24, #0, el1_inv // EL1 only
+ gic_prio_kentry_setup tmp=x3
mrs x0, far_el1
mov x2, sp // struct pt_regs
bl do_debug_exception
@@ -644,12 +664,10 @@ ENDPROC(el1_sync)
.align 6
el1_irq:
kernel_entry 1
+ gic_prio_irq_setup pmr=x20, tmp=x1
enable_da_f

#ifdef CONFIG_ARM64_PSEUDO_NMI
-alternative_if ARM64_HAS_IRQ_PRIO_MASKING
- ldr x20, [sp, #S_PMR_SAVE]
-alternative_else_nop_endif
test_irqs_unmasked res=x0, pmr=x20
cbz x0, 1f
bl asm_nmi_enter
@@ -679,8 +697,9 @@ alternative_else_nop_endif

#ifdef CONFIG_ARM64_PSEUDO_NMI
/*
- * if IRQs were disabled when we received the interrupt, we have an NMI
- * and we are not re-enabling interrupt upon eret. Skip tracing.
+ * When using IRQ priority masking, we can get spurious interrupts while
+ * PMR is set to GIC_PRIO_IRQOFF. An NMI might also have occurred in a
+ * section with interrupts disabled. Skip tracing in those cases.
*/
test_irqs_unmasked res=x0, pmr=x20
cbz x0, 1f
@@ -809,6 +828,7 @@ el0_ia:
* Instruction abort handling
*/
mrs x26, far_el1
+ gic_prio_kentry_setup tmp=x0
enable_da_f
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_off
@@ -854,6 +874,7 @@ el0_sp_pc:
* Stack or PC alignment exception handling
*/
mrs x26, far_el1
+ gic_prio_kentry_setup tmp=x0
enable_da_f
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_off
@@ -888,6 +909,7 @@ el0_dbg:
* Debug exception handling
*/
tbnz x24, #0, el0_inv // EL0 only
+ gic_prio_kentry_setup tmp=x3
mrs x0, far_el1
mov x1, x25
mov x2, sp
@@ -909,7 +931,9 @@ ENDPROC(el0_sync)
el0_irq:
kernel_entry 0
el0_irq_naked:
+ gic_prio_irq_setup pmr=x20, tmp=x0
enable_da_f
+
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_off
#endif
@@ -931,6 +955,7 @@ ENDPROC(el0_irq)
el1_error:
kernel_entry 1
mrs x1, esr_el1
+ gic_prio_kentry_setup tmp=x2
enable_dbg
mov x0, sp
bl do_serror
@@ -941,6 +966,7 @@ el0_error:
kernel_entry 0
el0_error_naked:
mrs x1, esr_el1
+ gic_prio_kentry_setup tmp=x2
enable_dbg
mov x0, sp
bl do_serror
@@ -965,6 +991,7 @@ work_pending:
*/
ret_to_user:
disable_daif
+ gic_prio_kentry_setup tmp=x3
ldr x1, [tsk, #TSK_TI_FLAGS]
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
@@ -981,6 +1008,7 @@ ENDPROC(ret_to_user)
*/
.align 6
el0_svc:
+ gic_prio_kentry_setup tmp=x1
mov x0, sp
bl el0_svc_handler
b ret_to_user
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 3767fb2..58efc37 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -94,7 +94,7 @@ static void __cpu_do_idle_irqprio(void)
* be raised.
*/
pmr = gic_read_pmr();
- gic_write_pmr(GIC_PRIO_IRQON);
+ gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);

__cpu_do_idle();

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index bb4b3f0..4deaee3 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -192,11 +192,13 @@ static void init_gic_priority_masking(void)

WARN_ON(!(cpuflags & PSR_I_BIT));

- gic_write_pmr(GIC_PRIO_IRQOFF);
-
/* We can only unmask PSR.I if we can take aborts */
- if (!(cpuflags & PSR_A_BIT))
+ if (!(cpuflags & PSR_A_BIT)) {
+ gic_write_pmr(GIC_PRIO_IRQOFF);
write_sysreg(cpuflags & ~PSR_I_BIT, daif);
+ } else {
+ gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
+ }
}

/*
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 8799e0c..b89fcf0 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -615,7 +615,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
* Naturally, we want to avoid this.
*/
if (system_uses_irq_prio_masking()) {
- gic_write_pmr(GIC_PRIO_IRQON);
+ gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
dsb(sy);
}

--
1.9.1

2019-06-06 09:34:04

by Julien Thierry

Subject: [PATCH v3 6/8] arm64: irqflags: Introduce explicit debugging for IRQ priorities

Using IRQ priority masking to enable/disable interrupts is a bit
sensitive as it requires dealing with both ICC_PMR_EL1 and PSR.I.

Introduce some validity checks to both highlight the states in which
functions dealing with IRQ enabling/disabling can (not) be called, and
bark a warning when called in an unexpected state.

Since these checks are done on hotpaths, introduce a build option to
choose whether to do the checking.
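
A condensed sketch of the pattern (the full versions live in the diff
below): IS_ENABLED() folds to a compile-time constant, so when the
Kconfig option is off the whole check is optimized out of the hotpath.

static inline bool system_has_prio_mask_debugging(void)
{
	return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
	       system_uses_irq_prio_masking();
}

static inline void arch_local_irq_enable(void)
{
	if (system_has_prio_mask_debugging()) {
		u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);

		WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
	}
	/* ... unmask via daifclr or PMR, as before ... */
}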

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 11 +++++++++++
arch/arm64/include/asm/cpufeature.h | 6 ++++++
arch/arm64/include/asm/daifflags.h | 7 +++++++
arch/arm64/include/asm/irqflags.h | 14 +++++++++++++-
4 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 697ea05..8acc40e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1436,6 +1436,17 @@ config ARM64_PSEUDO_NMI

If unsure, say N

+if ARM64_PSEUDO_NMI
+config ARM64_DEBUG_PRIORITY_MASKING
+ bool "Debug interrupt priority masking"
+ help
+ This adds runtime checks to functions enabling/disabling
+ interrupts when using priority masking. The additional checks verify
+ the validity of ICC_PMR_EL1 when calling concerned functions.
+
+ If unsure, say N
+endif
+
config RELOCATABLE
bool
help
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index bc895c8..693a086 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -617,6 +617,12 @@ static inline bool system_uses_irq_prio_masking(void)
cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
}

+static inline bool system_has_prio_mask_debugging(void)
+{
+ return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
+ system_uses_irq_prio_masking();
+}
+
#define ARM64_SSBD_UNKNOWN -1
#define ARM64_SSBD_FORCE_DISABLE 0
#define ARM64_SSBD_KERNEL 1
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index f93204f..eca5bee 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -28,6 +28,10 @@
/* mask/save/unmask/restore all exceptions, including interrupts. */
static inline void local_daif_mask(void)
{
+ WARN_ON(system_has_prio_mask_debugging() &&
+ (read_sysreg_s(SYS_ICC_PMR_EL1) == (GIC_PRIO_IRQOFF |
+ GIC_PRIO_PSR_I_SET)));
+
asm volatile(
"msr daifset, #0xf // local_daif_mask\n"
:
@@ -62,6 +66,9 @@ static inline void local_daif_restore(unsigned long flags)
{
bool irq_disabled = flags & PSR_I_BIT;

+ WARN_ON(system_has_prio_mask_debugging() &&
+ !(read_sysreg(daif) & PSR_I_BIT));
+
if (!irq_disabled) {
trace_hardirqs_on();

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index b6f757f..cac2d2a 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -40,6 +40,12 @@
*/
static inline void arch_local_irq_enable(void)
{
+ if (system_has_prio_mask_debugging()) {
+ u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);
+
+ WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
+ }
+
asm volatile(ALTERNATIVE(
"msr daifclr, #2 // arch_local_irq_enable\n"
"nop",
@@ -53,6 +59,12 @@ static inline void arch_local_irq_enable(void)

static inline void arch_local_irq_disable(void)
{
+ if (system_has_prio_mask_debugging()) {
+ u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);
+
+ WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
+ }
+
asm volatile(ALTERNATIVE(
"msr daifset, #2 // arch_local_irq_disable",
__msr_s(SYS_ICC_PMR_EL1, "%0"),
@@ -86,7 +98,7 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)

asm volatile(ALTERNATIVE(
"and %w0, %w1, #" __stringify(PSR_I_BIT),
- "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQOFF),
+ "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQON),
ARM64_HAS_IRQ_PRIO_MASKING)
: "=&r" (res)
: "r" ((int) flags)
--
1.9.1

2019-06-06 09:34:38

by Julien Thierry

Subject: [PATCH v3 8/8] arm64: Allow selecting Pseudo-NMI again

Now that Pseudo-NMIs are fixed, allow the use of that option again.

This reverts commit 96a13f57b946be7a6c10405e4bd780c0b6b6fe63 ("arm64:
Kconfig: Make ARM64_PSEUDO_NMI depend on BROKEN for now").

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8acc40e..a9dde24 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1423,7 +1423,6 @@ config ARM64_MODULE_PLTS

config ARM64_PSEUDO_NMI
bool "Support for NMI-like interrupts"
- depends on BROKEN # [email protected]
select CONFIG_ARM_GIC_V3
help
Adds support for mimicking Non-Maskable Interrupts through the use of
--
1.9.1

2019-06-06 09:35:00

by Julien Thierry

Subject: [PATCH v3 4/8] arm64: Fix interrupt tracing in the presence of NMIs

In the presence of any form of instrumentation, nmi_enter() should be
done before calling any traceable code and any instrumentation code.

Currently, nmi_enter() is done in handle_domain_nmi(), which is much
too late as instrumentation code might get called before. Move the
nmi_enter/exit() calls to the arch IRQ vector handler.

On arm64, it is not possible to know if the IRQ vector handler was
called because of an NMI before acknowledging the interrupt. However, it
is possible to know whether normal interrupts could be taken in the
interrupted context (i.e. if taking an NMI in that context could
introduce a potential race condition).

When interrupting a context with IRQs disabled, call nmi_enter() as soon
as possible. In contexts with IRQs enabled, defer this to the interrupt
controller, which is in a better position to know if an interrupt taken
is an NMI.
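
In rough C terms, the resulting division of responsibility looks like
this (a simplified sketch; the actual entry code is in assembly):

void irq_vector(struct pt_regs *regs)
{
	/* IRQs masked in the interrupted context => this can be an NMI */
	bool maybe_nmi = !interrupts_enabled(regs);

	if (maybe_nmi)
		nmi_enter();		/* before any traceable code */

	handle_arch_irq(regs);		/* in the IRQs-enabled case, the
					   irqchip does nmi_enter() itself
					   once it knows the IRQ is an NMI */

	if (maybe_nmi)
		nmi_exit();
}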

Fixes: bc3c03ccb ("arm64: Enable the support of pseudo-NMIs")
Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
---
arch/arm64/kernel/entry.S | 44 +++++++++++++++++++++++++++++++++-----------
arch/arm64/kernel/irq.c | 17 +++++++++++++++++
drivers/irqchip/irq-gic-v3.c | 6 ++++++
kernel/irq/irqdesc.c | 8 ++++++--
4 files changed, 62 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 89ab6bd..46358a3 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -435,6 +435,20 @@ tsk .req x28 // current thread_info
irq_stack_exit
.endm

+#ifdef CONFIG_ARM64_PSEUDO_NMI
+ /*
+ * Set res to 0 if irqs were masked in interrupted context.
+ * Otherwise set res to non-0 value.
+ */
+ .macro test_irqs_unmasked res:req, pmr:req
+alternative_if ARM64_HAS_IRQ_PRIO_MASKING
+ sub \res, \pmr, #GIC_PRIO_IRQON
+alternative_else
+ mov \res, xzr
+alternative_endif
+ .endm
+#endif
+
.text

/*
@@ -631,19 +645,19 @@ ENDPROC(el1_sync)
el1_irq:
kernel_entry 1
enable_da_f
-#ifdef CONFIG_TRACE_IRQFLAGS
+
#ifdef CONFIG_ARM64_PSEUDO_NMI
alternative_if ARM64_HAS_IRQ_PRIO_MASKING
ldr x20, [sp, #S_PMR_SAVE]
-alternative_else
- mov x20, #GIC_PRIO_IRQON
-alternative_endif
- cmp x20, #GIC_PRIO_IRQOFF
- /* Irqs were disabled, don't trace */
- b.ls 1f
+alternative_else_nop_endif
+ test_irqs_unmasked res=x0, pmr=x20
+ cbz x0, 1f
+ bl asm_nmi_enter
+1:
#endif
+
+#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_off
-1:
#endif

irq_handler
@@ -662,14 +676,22 @@ alternative_else_nop_endif
bl preempt_schedule_irq // irq en/disable is done inside
1:
#endif
-#ifdef CONFIG_TRACE_IRQFLAGS
+
#ifdef CONFIG_ARM64_PSEUDO_NMI
/*
* if IRQs were disabled when we received the interrupt, we have an NMI
* and we are not re-enabling interrupt upon eret. Skip tracing.
*/
- cmp x20, #GIC_PRIO_IRQOFF
- b.ls 1f
+ test_irqs_unmasked res=x0, pmr=x20
+ cbz x0, 1f
+ bl asm_nmi_exit
+1:
+#endif
+
+#ifdef CONFIG_TRACE_IRQFLAGS
+#ifdef CONFIG_ARM64_PSEUDO_NMI
+ test_irqs_unmasked res=x0, pmr=x20
+ cbnz x0, 1f
#endif
bl trace_hardirqs_on
1:
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 92fa817..fdd9cb2 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -27,8 +27,10 @@
#include <linux/smp.h>
#include <linux/init.h>
#include <linux/irqchip.h>
+#include <linux/kprobes.h>
#include <linux/seq_file.h>
#include <linux/vmalloc.h>
+#include <asm/daifflags.h>
#include <asm/vmap_stack.h>

unsigned long irq_err_count;
@@ -76,3 +78,18 @@ void __init init_IRQ(void)
if (!handle_arch_irq)
panic("No interrupt controller found.");
}
+
+/*
+ * Stubs to make nmi_enter/exit() code callable from ASM
+ */
+asmlinkage void notrace asm_nmi_enter(void)
+{
+ nmi_enter();
+}
+NOKPROBE_SYMBOL(asm_nmi_enter);
+
+asmlinkage void notrace asm_nmi_exit(void)
+{
+ nmi_exit();
+}
+NOKPROBE_SYMBOL(asm_nmi_exit);
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index f44cd89..0bf0fc4 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -495,7 +495,13 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs

if (gic_supports_nmi() &&
unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
+ if (interrupts_enabled(regs))
+ nmi_enter();
+
gic_handle_nmi(irqnr, regs);
+
+ if (interrupts_enabled(regs))
+ nmi_exit();
return;
}

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index c52b737..a92b335 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -680,6 +680,8 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
* @hwirq: The HW irq number to convert to a logical one
* @regs: Register file coming from the low-level handling code
*
+ * This function must be called from an NMI context.
+ *
* Returns: 0 on success, or -EINVAL if conversion has failed
*/
int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
@@ -689,7 +691,10 @@ int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
unsigned int irq;
int ret = 0;

- nmi_enter();
+ /*
+ * NMI context needs to be setup earlier in order to deal with tracing.
+ */
+ WARN_ON(!in_nmi());

irq = irq_find_mapping(domain, hwirq);

@@ -702,7 +707,6 @@ int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
else
ret = -EINVAL;

- nmi_exit();
set_irq_regs(old_regs);
return ret;
}
--
1.9.1

2019-06-06 09:35:13

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v3 7/8] arm64: fix kernel stack overflow in kdump capture kernel

From: Wei Li <[email protected]>

When enabling the ARM64_PSEUDO_NMI feature in the kdump capture kernel,
it reports a kernel stack overflow exception:

[ 0.000000] CPU features: detected: IRQ priority masking
[ 0.000000] alternatives: patching kernel code
[ 0.000000] Insufficient stack space to handle exception!
[ 0.000000] ESR: 0x96000044 -- DABT (current EL)
[ 0.000000] FAR: 0x0000000000000040
[ 0.000000] Task stack: [0xffff0000097f0000..0xffff0000097f4000]
[ 0.000000] IRQ stack: [0x0000000000000000..0x0000000000004000]
[ 0.000000] Overflow stack: [0xffff80002b7cf290..0xffff80002b7d0290]
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.34-lw+ #3
[ 0.000000] pstate: 400003c5 (nZcv DAIF -PAN -UAO)
[ 0.000000] pc : el1_sync+0x0/0xb8
[ 0.000000] lr : el1_irq+0xb8/0x140
[ 0.000000] sp : 0000000000000040
[ 0.000000] pmr_save: 00000070
[ 0.000000] x29: ffff0000097f3f60 x28: ffff000009806240
[ 0.000000] x27: 0000000080000000 x26: 0000000000004000
[ 0.000000] x25: 0000000000000000 x24: ffff000009329028
[ 0.000000] x23: 0000000040000005 x22: ffff000008095c6c
[ 0.000000] x21: ffff0000097f3f70 x20: 0000000000000070
[ 0.000000] x19: ffff0000097f3e30 x18: ffffffffffffffff
[ 0.000000] x17: 0000000000000000 x16: 0000000000000000
[ 0.000000] x15: ffff0000097f9708 x14: ffff000089a382ef
[ 0.000000] x13: ffff000009a382fd x12: ffff000009824000
[ 0.000000] x11: ffff0000097fb7b0 x10: ffff000008730028
[ 0.000000] x9 : ffff000009440018 x8 : 000000000000000d
[ 0.000000] x7 : 6b20676e69686374 x6 : 000000000000003b
[ 0.000000] x5 : 0000000000000000 x4 : ffff000008093600
[ 0.000000] x3 : 0000000400000008 x2 : 7db2e689fc2b8e00
[ 0.000000] x1 : 0000000000000000 x0 : ffff0000097f3e30
[ 0.000000] Kernel panic - not syncing: kernel stack overflow
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.34-lw+ #3
[ 0.000000] Call trace:
[ 0.000000] dump_backtrace+0x0/0x1b8
[ 0.000000] show_stack+0x24/0x30
[ 0.000000] dump_stack+0xa8/0xcc
[ 0.000000] panic+0x134/0x30c
[ 0.000000] __stack_chk_fail+0x0/0x28
[ 0.000000] handle_bad_stack+0xfc/0x108
[ 0.000000] __bad_stack+0x90/0x94
[ 0.000000] el1_sync+0x0/0xb8
[ 0.000000] init_gic_priority_masking+0x4c/0x70
[ 0.000000] smp_prepare_boot_cpu+0x60/0x68
[ 0.000000] start_kernel+0x1e8/0x53c
[ 0.000000] ---[ end Kernel panic - not syncing: kernel stack overflow ]---

The reason is that init_gic_priority_masking() may unmask PSR.I while
the IRQ stacks are not yet initialized. If an "NMI" is raised at that
point, it takes this exception path with no IRQ stack to land on.

In this patch, we just write the PMR in smp_prepare_boot_cpu(), and
delay unmasking PSR.I until after the IRQ stacks are initialized, in
init_IRQ().
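
The intended boot-time ordering, as a sketch (simplified from the diff
below):

void __init smp_prepare_boot_cpu(void)
{
	/* ... */
	init_gic_priority_masking();	/* PMR written, PSR.I still set */
}

void __init init_IRQ(void)
{
	irqchip_init();			/* IRQ stacks are usable by now */
	if (system_uses_irq_prio_masking())
		local_daif_restore(DAIF_PROCCTX_NOIRQ);	/* unmask PSR.I */
}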

Fixes: e79321883842 ("arm64: Switch to PMR masking when starting CPUs")
Signed-off-by: Wei Li <[email protected]>
[JT: make init_gic_priority_masking() not modify daif, rebase on other
priority masking fixes]
Signed-off-by: Julien Thierry <[email protected]>
---
arch/arm64/kernel/irq.c | 9 +++++++++
arch/arm64/kernel/smp.c | 8 +-------
2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index fdd9cb2..e8daa7a 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -77,6 +77,15 @@ void __init init_IRQ(void)
irqchip_init();
if (!handle_arch_irq)
panic("No interrupt controller found.");
+
+ if (system_uses_irq_prio_masking()) {
+ /*
+ * Now that we have a stack for our IRQ handler, set
+ * the PMR/PSR pair to a consistent state.
+ */
+ WARN_ON(read_sysreg(daif) & PSR_A_BIT);
+ local_daif_restore(DAIF_PROCCTX_NOIRQ);
+ }
}

/*
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 4deaee3..83cdb0a 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -192,13 +192,7 @@ static void init_gic_priority_masking(void)

WARN_ON(!(cpuflags & PSR_I_BIT));

- /* We can only unmask PSR.I if we can take aborts */
- if (!(cpuflags & PSR_A_BIT)) {
- gic_write_pmr(GIC_PRIO_IRQOFF);
- write_sysreg(cpuflags & ~PSR_I_BIT, daif);
- } else {
- gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
- }
+ gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
}

/*
--
1.9.1

2019-06-07 15:56:26

by Julien Thierry

Subject: Re: [PATCH v3 4/8] arm64: Fix interrupt tracing in the presence of NMIs



On 07/06/2019 16:42, Marc Zyngier wrote:
> On 06/06/2019 10:31, Julien Thierry wrote:
>> In the presence of any form of instrumentation, nmi_enter() should be
>> done before calling any traceable code and any instrumentation code.
>>
>> Currently, nmi_enter() is done in handle_domain_nmi(), which is much
>> too late as instrumentation code might get called before. Move the
>> nmi_enter/exit() calls to the arch IRQ vector handler.
>>
>> On arm64, it is not possible to know if the IRQ vector handler was
>> called because of an NMI before acknowledging the interrupt. However, it
>> is possible to know whether normal interrupts could be taken in the
>> interrupted context (i.e. if taking an NMI in that context could
>> introduce a potential race condition).
>>
>> When interrupting a context with IRQs disabled, call nmi_enter() as soon
>> as possible. In contexts with IRQs enabled, defer this to the interrupt
>> controller, which is in a better position to know if an interrupt taken
>> is an NMI.
>>
>> Fixes: bc3c03ccb ("arm64: Enable the support of pseudo-NMIs")
>> Signed-off-by: Julien Thierry <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Jason Cooper <[email protected]>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: Mark Rutland <[email protected]>
>> ---
>> arch/arm64/kernel/entry.S | 44 +++++++++++++++++++++++++++++++++-----------
>> arch/arm64/kernel/irq.c | 17 +++++++++++++++++
>> drivers/irqchip/irq-gic-v3.c | 6 ++++++
>> kernel/irq/irqdesc.c | 8 ++++++--
>> 4 files changed, 62 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 89ab6bd..46358a3 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -435,6 +435,20 @@ tsk .req x28 // current thread_info
>> irq_stack_exit
>> .endm
>>
>> +#ifdef CONFIG_ARM64_PSEUDO_NMI
>> + /*
>> + * Set res to 0 if irqs were masked in interrupted context.
>
> Is that comment right? This macro seems to return 0 if PMR is equal to
> GIC_PRIO_IRQON, meaning that irqs are unmasked...

You're right, the comment is wrong. At least the references to the macro
seem to be doing the right thing and interpreting 0 as "irqs were
unmasked" :) .

>
>> + * Otherwise set res to non-0 value.
>> + */
>> + .macro test_irqs_unmasked res:req, pmr:req
>> +alternative_if ARM64_HAS_IRQ_PRIO_MASKING
>> + sub \res, \pmr, #GIC_PRIO_IRQON
>> +alternative_else
>> + mov \res, xzr
>> +alternative_endif
>> + .endm
>> +#endif
>> +
>> .text
>>
>> /*
>> @@ -631,19 +645,19 @@ ENDPROC(el1_sync)
>> el1_irq:
>> kernel_entry 1
>> enable_da_f
>> -#ifdef CONFIG_TRACE_IRQFLAGS
>> +
>> #ifdef CONFIG_ARM64_PSEUDO_NMI
>> alternative_if ARM64_HAS_IRQ_PRIO_MASKING
>> ldr x20, [sp, #S_PMR_SAVE]
>> -alternative_else
>> - mov x20, #GIC_PRIO_IRQON
>> -alternative_endif
>> - cmp x20, #GIC_PRIO_IRQOFF
>> - /* Irqs were disabled, don't trace */
>> - b.ls 1f
>> +alternative_else_nop_endif
>> + test_irqs_unmasked res=x0, pmr=x20
>> + cbz x0, 1f
>> + bl asm_nmi_enter
>> +1:
>> #endif
>> +
>> +#ifdef CONFIG_TRACE_IRQFLAGS
>> bl trace_hardirqs_off
>> -1:
>> #endif
>>
>> irq_handler
>> @@ -662,14 +676,22 @@ alternative_else_nop_endif
>> bl preempt_schedule_irq // irq en/disable is done inside
>> 1:
>> #endif
>> -#ifdef CONFIG_TRACE_IRQFLAGS
>> +
>> #ifdef CONFIG_ARM64_PSEUDO_NMI
>> /*
>> * if IRQs were disabled when we received the interrupt, we have an NMI
>> * and we are not re-enabling interrupt upon eret. Skip tracing.
>> */
>> - cmp x20, #GIC_PRIO_IRQOFF
>> - b.ls 1f
>> + test_irqs_unmasked res=x0, pmr=x20
>> + cbz x0, 1f
>> + bl asm_nmi_exit
>> +1:
>> +#endif
>> +
>> +#ifdef CONFIG_TRACE_IRQFLAGS
>> +#ifdef CONFIG_ARM64_PSEUDO_NMI
>> + test_irqs_unmasked res=x0, pmr=x20
>> + cbnz x0, 1f
>> #endif
>> bl trace_hardirqs_on
>> 1:
>> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
>> index 92fa817..fdd9cb2 100644
>> --- a/arch/arm64/kernel/irq.c
>> +++ b/arch/arm64/kernel/irq.c
>> @@ -27,8 +27,10 @@
>> #include <linux/smp.h>
>> #include <linux/init.h>
>> #include <linux/irqchip.h>
>> +#include <linux/kprobes.h>
>> #include <linux/seq_file.h>
>> #include <linux/vmalloc.h>
>> +#include <asm/daifflags.h>
>> #include <asm/vmap_stack.h>
>>
>> unsigned long irq_err_count;
>> @@ -76,3 +78,18 @@ void __init init_IRQ(void)
>> if (!handle_arch_irq)
>> panic("No interrupt controller found.");
>> }
>> +
>> +/*
>> + * Stubs to make nmi_enter/exit() code callable from ASM
>> + */
>> +asmlinkage void notrace asm_nmi_enter(void)
>> +{
>> + nmi_enter();
>> +}
>> +NOKPROBE_SYMBOL(asm_nmi_enter);
>> +
>> +asmlinkage void notrace asm_nmi_exit(void)
>> +{
>> + nmi_exit();
>> +}
>> +NOKPROBE_SYMBOL(asm_nmi_exit);
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index f44cd89..0bf0fc4 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -495,7 +495,13 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
>>
>> if (gic_supports_nmi() &&
>> unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
>> + if (interrupts_enabled(regs))
>> + nmi_enter();
>> +
>> gic_handle_nmi(irqnr, regs);
>> +
>> + if (interrupts_enabled(regs))
>> + nmi_exit();
>
> Just to be on the safe side, I'd rather sample interrupts_enabled(regs)
> and use the same value to decide whether to do nmi_exit or not.
>

Fair enough.
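
i.e., something along these lines (a sketch of the suggested change;
the final form may differ):

	bool do_nmi = interrupts_enabled(regs);

	if (do_nmi)
		nmi_enter();

	gic_handle_nmi(irqnr, regs);

	if (do_nmi)
		nmi_exit();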

In case there are no other issues, can this (and the comment) be amended
or shall I respin the series? (If there are other issues raised I'll just
respin).

Thanks,

--
Julien Thierry

2019-06-07 16:00:38

by Marc Zyngier

Subject: Re: [PATCH v3 4/8] arm64: Fix interrupt tracing in the presence of NMIs

On 06/06/2019 10:31, Julien Thierry wrote:
> In the presence of any form of instrumentation, nmi_enter() should be
> done before calling any traceable code and any instrumentation code.
>
> Currently, nmi_enter() is done in handle_domain_nmi(), which is much
> too late as instrumentation code might get called before. Move the
> nmi_enter/exit() calls to the arch IRQ vector handler.
>
> On arm64, it is not possible to know if the IRQ vector handler was
> called because of an NMI before acknowledging the interrupt. However, it
> is possible to know whether normal interrupts could be taken in the
> interrupted context (i.e. if taking an NMI in that context could
> introduce a potential race condition).
>
> When interrupting a context with IRQs disabled, call nmi_enter() as soon
> as possible. In contexts with IRQs enabled, defer this to the interrupt
> controller, which is in a better position to know if an interrupt taken
> is an NMI.
>
> Fixes: bc3c03ccb ("arm64: Enable the support of pseudo-NMIs")
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Jason Cooper <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Mark Rutland <[email protected]>
> ---
> arch/arm64/kernel/entry.S | 44 +++++++++++++++++++++++++++++++++-----------
> arch/arm64/kernel/irq.c | 17 +++++++++++++++++
> drivers/irqchip/irq-gic-v3.c | 6 ++++++
> kernel/irq/irqdesc.c | 8 ++++++--
> 4 files changed, 62 insertions(+), 13 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 89ab6bd..46358a3 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -435,6 +435,20 @@ tsk .req x28 // current thread_info
> irq_stack_exit
> .endm
>
> +#ifdef CONFIG_ARM64_PSEUDO_NMI
> + /*
> + * Set res to 0 if irqs were masked in interrupted context.

Is that comment right? This macro seems to return 0 if PMR is equal to
GIC_PRIO_IRQON, meaning that irqs are unmasked...

> + * Otherwise set res to non-0 value.
> + */
> + .macro test_irqs_unmasked res:req, pmr:req
> +alternative_if ARM64_HAS_IRQ_PRIO_MASKING
> + sub \res, \pmr, #GIC_PRIO_IRQON
> +alternative_else
> + mov \res, xzr
> +alternative_endif
> + .endm
> +#endif
> +
> .text
>
> /*
> @@ -631,19 +645,19 @@ ENDPROC(el1_sync)
> el1_irq:
> kernel_entry 1
> enable_da_f
> -#ifdef CONFIG_TRACE_IRQFLAGS
> +
> #ifdef CONFIG_ARM64_PSEUDO_NMI
> alternative_if ARM64_HAS_IRQ_PRIO_MASKING
> ldr x20, [sp, #S_PMR_SAVE]
> -alternative_else
> - mov x20, #GIC_PRIO_IRQON
> -alternative_endif
> - cmp x20, #GIC_PRIO_IRQOFF
> - /* Irqs were disabled, don't trace */
> - b.ls 1f
> +alternative_else_nop_endif
> + test_irqs_unmasked res=x0, pmr=x20
> + cbz x0, 1f
> + bl asm_nmi_enter
> +1:
> #endif
> +
> +#ifdef CONFIG_TRACE_IRQFLAGS
> bl trace_hardirqs_off
> -1:
> #endif
>
> irq_handler
> @@ -662,14 +676,22 @@ alternative_else_nop_endif
> bl preempt_schedule_irq // irq en/disable is done inside
> 1:
> #endif
> -#ifdef CONFIG_TRACE_IRQFLAGS
> +
> #ifdef CONFIG_ARM64_PSEUDO_NMI
> /*
> * if IRQs were disabled when we received the interrupt, we have an NMI
> * and we are not re-enabling interrupt upon eret. Skip tracing.
> */
> - cmp x20, #GIC_PRIO_IRQOFF
> - b.ls 1f
> + test_irqs_unmasked res=x0, pmr=x20
> + cbz x0, 1f
> + bl asm_nmi_exit
> +1:
> +#endif
> +
> +#ifdef CONFIG_TRACE_IRQFLAGS
> +#ifdef CONFIG_ARM64_PSEUDO_NMI
> + test_irqs_unmasked res=x0, pmr=x20
> + cbnz x0, 1f
> #endif
> bl trace_hardirqs_on
> 1:
> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
> index 92fa817..fdd9cb2 100644
> --- a/arch/arm64/kernel/irq.c
> +++ b/arch/arm64/kernel/irq.c
> @@ -27,8 +27,10 @@
> #include <linux/smp.h>
> #include <linux/init.h>
> #include <linux/irqchip.h>
> +#include <linux/kprobes.h>
> #include <linux/seq_file.h>
> #include <linux/vmalloc.h>
> +#include <asm/daifflags.h>
> #include <asm/vmap_stack.h>
>
> unsigned long irq_err_count;
> @@ -76,3 +78,18 @@ void __init init_IRQ(void)
> if (!handle_arch_irq)
> panic("No interrupt controller found.");
> }
> +
> +/*
> + * Stubs to make nmi_enter/exit() code callable from ASM
> + */
> +asmlinkage void notrace asm_nmi_enter(void)
> +{
> + nmi_enter();
> +}
> +NOKPROBE_SYMBOL(asm_nmi_enter);
> +
> +asmlinkage void notrace asm_nmi_exit(void)
> +{
> + nmi_exit();
> +}
> +NOKPROBE_SYMBOL(asm_nmi_exit);
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index f44cd89..0bf0fc4 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -495,7 +495,13 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
>
> if (gic_supports_nmi() &&
> unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
> + if (interrupts_enabled(regs))
> + nmi_enter();
> +
> gic_handle_nmi(irqnr, regs);
> +
> + if (interrupts_enabled(regs))
> + nmi_exit();

Just to be on the safe side, I'd rather sample interrupts_enabled(regs)
and use the same value to decide whether to do nmi_exit or not.

> return;
> }
>
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index c52b737..a92b335 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -680,6 +680,8 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
> * @hwirq: The HW irq number to convert to a logical one
> * @regs: Register file coming from the low-level handling code
> *
> + * This function must be called from an NMI context.
> + *
> * Returns: 0 on success, or -EINVAL if conversion has failed
> */
> int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
> @@ -689,7 +691,10 @@ int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
> unsigned int irq;
> int ret = 0;
>
> - nmi_enter();
> + /*
> + * NMI context needs to be setup earlier in order to deal with tracing.
> + */
> + WARN_ON(!in_nmi());
>
> irq = irq_find_mapping(domain, hwirq);
>
> @@ -702,7 +707,6 @@ int handle_domain_nmi(struct irq_domain *domain, unsigned int hwirq,
> else
> ret = -EINVAL;
>
> - nmi_exit();
> set_irq_regs(old_regs);
> return ret;
> }
> --
> 1.9.1
>

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2019-06-07 16:33:18

by Marc Zyngier

Subject: Re: [PATCH v3 6/8] arm64: irqflags: Introduce explicit debugging for IRQ priorities

On 06/06/2019 10:31, Julien Thierry wrote:
> Using IRQ priority masking to enable/disable interrupts is a bit
> sensitive as it requires dealing with both ICC_PMR_EL1 and PSR.I.
>
> Introduce some validity checks to both highlight the states in which
> functions dealing with IRQ enabling/disabling can (not) be called, and
> bark a warning when called in an unexpected state.
>
> Since these checks are done on hotpaths, introduce a build option to
> choose whether to do the checking.
>
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> ---
> arch/arm64/Kconfig | 11 +++++++++++
> arch/arm64/include/asm/cpufeature.h | 6 ++++++
> arch/arm64/include/asm/daifflags.h | 7 +++++++
> arch/arm64/include/asm/irqflags.h | 14 +++++++++++++-
> 4 files changed, 37 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 697ea05..8acc40e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1436,6 +1436,17 @@ config ARM64_PSEUDO_NMI
>
> If unsure, say N
>
> +if ARM64_PSEUDO_NMI
> +config ARM64_DEBUG_PRIORITY_MASKING
> + bool "Debug interrupt priority masking"
> + help
> + This adds runtime checks to functions enabling/disabling
> + interrupts when using priority masking. The additional checks verify
> + the validity of ICC_PMR_EL1 when calling concerned functions.
> +
> + If unsure, say N
> +endif
> +
> config RELOCATABLE
> bool
> help
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index bc895c8..693a086 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -617,6 +617,12 @@ static inline bool system_uses_irq_prio_masking(void)
> cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
> }
>
> +static inline bool system_has_prio_mask_debugging(void)
> +{
> + return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
> + system_uses_irq_prio_masking();
> +}
> +
> #define ARM64_SSBD_UNKNOWN -1
> #define ARM64_SSBD_FORCE_DISABLE 0
> #define ARM64_SSBD_KERNEL 1
> diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
> index f93204f..eca5bee 100644
> --- a/arch/arm64/include/asm/daifflags.h
> +++ b/arch/arm64/include/asm/daifflags.h
> @@ -28,6 +28,10 @@
> /* mask/save/unmask/restore all exceptions, including interrupts. */
> static inline void local_daif_mask(void)
> {
> + WARN_ON(system_has_prio_mask_debugging() &&
> + (read_sysreg_s(SYS_ICC_PMR_EL1) == (GIC_PRIO_IRQOFF |
> + GIC_PRIO_PSR_I_SET)));
> +
> asm volatile(
> "msr daifset, #0xf // local_daif_mask\n"
> :
> @@ -62,6 +66,9 @@ static inline void local_daif_restore(unsigned long flags)
> {
> bool irq_disabled = flags & PSR_I_BIT;
>
> + WARN_ON(system_has_prio_mask_debugging() &&
> + !(read_sysreg(daif) & PSR_I_BIT));
> +
> if (!irq_disabled) {
> trace_hardirqs_on();
>
> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> index b6f757f..cac2d2a 100644
> --- a/arch/arm64/include/asm/irqflags.h
> +++ b/arch/arm64/include/asm/irqflags.h
> @@ -40,6 +40,12 @@
> */
> static inline void arch_local_irq_enable(void)
> {
> + if (system_has_prio_mask_debugging()) {
> + u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);
> +
> + WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
> + }
> +
> asm volatile(ALTERNATIVE(
> "msr daifclr, #2 // arch_local_irq_enable\n"
> "nop",
> @@ -53,6 +59,12 @@ static inline void arch_local_irq_enable(void)
>
> static inline void arch_local_irq_disable(void)
> {
> + if (system_has_prio_mask_debugging()) {
> + u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);
> +
> + WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
> + }
> +
> asm volatile(ALTERNATIVE(
> "msr daifset, #2 // arch_local_irq_disable",
> __msr_s(SYS_ICC_PMR_EL1, "%0"),
> @@ -86,7 +98,7 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
>
> asm volatile(ALTERNATIVE(
> "and %w0, %w1, #" __stringify(PSR_I_BIT),
> - "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQOFF),
> + "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQON),

Err... Which version is the correct one? This one, or the previous one?

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2019-06-07 19:22:00

by Marc Zyngier

Subject: Re: [PATCH v3 5/8] arm64: Fix incorrect irqflag restore for priority masking

On 06/06/2019 10:31, Julien Thierry wrote:
> When using IRQ priority masking to disable interrupts, in order to deal
> with the PSR.I state, local_irq_save() would convert the I bit into a
> PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore()
> potentially modifying the value of PMR in undesired locations due to the
> state of PSR.I upon flag saving [1].
>
> In an attempt to solve this issue in a less hackish manner, introduce
> a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent
> whether PSR.I is being used to disable interrupts, in which case it
> takes precedence over the status of interrupt masking via PMR.
>
> GIC_PRIO_IGNORE_PMR is chosen such that (<pmr_value> |
> GIC_PRIO_IGNORE_PMR) does not mask more interrupts than <pmr_value>, as
> some sections (e.g. arch_cpu_idle(), the interrupt acknowledge path)
> require PMR not to mask interrupts that could be signaled to the
> CPU when using only PSR.I.
>

s/GIC_PRIO_IGNORE_PMR/GIC_PRIO_PSR_I_SET/

> [1] https://www.spinics.net/lists/arm-kernel/msg716956.html
>
> Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
> Signed-off-by: Julien Thierry <[email protected]>
> Reported-by: Zenghui Yu <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Wei Li <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: James Morse <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Cc: Oleg Nesterov <[email protected]>
> ---
> arch/arm64/include/asm/arch_gicv3.h | 4 ++-
> arch/arm64/include/asm/daifflags.h | 68 ++++++++++++++++++++++---------------
> arch/arm64/include/asm/irqflags.h | 67 +++++++++++++++---------------------
> arch/arm64/include/asm/kvm_host.h | 7 ++--
> arch/arm64/include/asm/ptrace.h | 10 ++++--
> arch/arm64/kernel/entry.S | 38 ++++++++++++++++++---
> arch/arm64/kernel/process.c | 2 +-
> arch/arm64/kernel/smp.c | 8 +++--
> arch/arm64/kvm/hyp/switch.c | 2 +-
> 9 files changed, 123 insertions(+), 83 deletions(-)
>
> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
> index 14b41dd..9e991b6 100644
> --- a/arch/arm64/include/asm/arch_gicv3.h
> +++ b/arch/arm64/include/asm/arch_gicv3.h
> @@ -163,7 +163,9 @@ static inline bool gic_prio_masking_enabled(void)
>
> static inline void gic_pmr_mask_irqs(void)
> {
> - BUILD_BUG_ON(GICD_INT_DEF_PRI <= GIC_PRIO_IRQOFF);
> + BUILD_BUG_ON(GICD_INT_DEF_PRI < (GIC_PRIO_IRQOFF |
> + GIC_PRIO_PSR_I_SET));
> + BUILD_BUG_ON(GICD_INT_DEF_PRI >= GIC_PRIO_IRQON);
> gic_write_pmr(GIC_PRIO_IRQOFF);
> }
>
> diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
> index db452aa..f93204f 100644
> --- a/arch/arm64/include/asm/daifflags.h
> +++ b/arch/arm64/include/asm/daifflags.h
> @@ -18,6 +18,7 @@
>
> #include <linux/irqflags.h>
>
> +#include <asm/arch_gicv3.h>
> #include <asm/cpufeature.h>
>
> #define DAIF_PROCCTX 0
> @@ -32,6 +33,11 @@ static inline void local_daif_mask(void)
> :
> :
> : "memory");
> +
> + /* Don't really care for a dsb here, we don't intend to enable IRQs */
> + if (system_uses_irq_prio_masking())
> + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
> +
> trace_hardirqs_off();
> }
>
> @@ -43,7 +49,7 @@ static inline unsigned long local_daif_save(void)
>
> if (system_uses_irq_prio_masking()) {
> /* If IRQs are masked with PMR, reflect it in the flags */
> - if (read_sysreg_s(SYS_ICC_PMR_EL1) <= GIC_PRIO_IRQOFF)
> + if (read_sysreg_s(SYS_ICC_PMR_EL1) != GIC_PRIO_IRQON)
> flags |= PSR_I_BIT;
> }
>
> @@ -59,36 +65,44 @@ static inline void local_daif_restore(unsigned long flags)
> if (!irq_disabled) {
> trace_hardirqs_on();
>
> - if (system_uses_irq_prio_masking())
> - arch_local_irq_enable();
> - } else if (!(flags & PSR_A_BIT)) {
> - /*
> - * If interrupts are disabled but we can take
> - * asynchronous errors, we can take NMIs
> - */
> if (system_uses_irq_prio_masking()) {
> - flags &= ~PSR_I_BIT;
> + gic_write_pmr(GIC_PRIO_IRQON);
> + dsb(sy);
> + }
> + } else if (system_uses_irq_prio_masking()) {
> + u64 pmr;
> +
> + if (!(flags & PSR_A_BIT)) {
> /*
> - * There has been concern that the write to daif
> - * might be reordered before this write to PMR.
> - * From the ARM ARM DDI 0487D.a, section D1.7.1
> - * "Accessing PSTATE fields":
> - * Writes to the PSTATE fields have side-effects on
> - * various aspects of the PE operation. All of these
> - * side-effects are guaranteed:
> - * - Not to be visible to earlier instructions in
> - * the execution stream.
> - * - To be visible to later instructions in the
> - * execution stream
> - *
> - * Also, writes to PMR are self-synchronizing, so no
> - * interrupts with a lower priority than PMR is signaled
> - * to the PE after the write.
> - *
> - * So we don't need additional synchronization here.
> + * If interrupts are disabled but we can take
> + * asynchronous errors, we can take NMIs
> */
> - arch_local_irq_disable();
> + flags &= ~PSR_I_BIT;
> + pmr = GIC_PRIO_IRQOFF;
> + } else {
> + pmr = GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET;
> }
> +
> + /*
> + * There has been concern that the write to daif
> + * might be reordered before this write to PMR.
> + * From the ARM ARM DDI 0487D.a, section D1.7.1
> + * "Accessing PSTATE fields":
> + * Writes to the PSTATE fields have side-effects on
> + * various aspects of the PE operation. All of these
> + * side-effects are guaranteed:
> + * - Not to be visible to earlier instructions in
> + * the execution stream.
> + * - To be visible to later instructions in the
> + * execution stream
> + *
> + * Also, writes to PMR are self-synchronizing, so no
> + * interrupts with a lower priority than PMR is signaled
> + * to the PE after the write.
> + *
> + * So we don't need additional synchronization here.
> + */
> + gic_write_pmr(pmr);
> }
>
> write_sysreg(flags, daif);
> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> index fbe1aba..b6f757f 100644
> --- a/arch/arm64/include/asm/irqflags.h
> +++ b/arch/arm64/include/asm/irqflags.h
> @@ -67,43 +67,46 @@ static inline void arch_local_irq_disable(void)
> */
> static inline unsigned long arch_local_save_flags(void)
> {
> - unsigned long daif_bits;
> unsigned long flags;
>
> - daif_bits = read_sysreg(daif);
> -
> - /*
> - * The asm is logically equivalent to:
> - *
> - * if (system_uses_irq_prio_masking())
> - * flags = (daif_bits & PSR_I_BIT) ?
> - * GIC_PRIO_IRQOFF :
> - * read_sysreg_s(SYS_ICC_PMR_EL1);
> - * else
> - * flags = daif_bits;
> - */
> asm volatile(ALTERNATIVE(
> - "mov %0, %1\n"
> - "nop\n"
> - "nop",
> - __mrs_s("%0", SYS_ICC_PMR_EL1)
> - "ands %1, %1, " __stringify(PSR_I_BIT) "\n"
> - "csel %0, %0, %2, eq",
> - ARM64_HAS_IRQ_PRIO_MASKING)
> - : "=&r" (flags), "+r" (daif_bits)
> - : "r" ((unsigned long) GIC_PRIO_IRQOFF)
> - : "cc", "memory");
> + "mrs %0, daif",
> + __mrs_s("%0", SYS_ICC_PMR_EL1),
> + ARM64_HAS_IRQ_PRIO_MASKING)
> + : "=&r" (flags)
> + :
> + : "memory");

I think this is worth a comment here, as you're changing the semantics
of arch_local_save_flags(). Maybe just indicating that the only thing
this should be used for is to carry the interrupt state, and nothing else.

>
> return flags;
> }
>
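
Purely as an illustration, the alternative above is logically
equivalent to the following C, in the spirit of the comment the old
code carried:

	if (system_uses_irq_prio_masking())
		flags = read_sysreg_s(SYS_ICC_PMR_EL1);	/* PMR value */
	else
		flags = read_sysreg(daif);		/* PSR.DAIF bits */
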
> +static inline int arch_irqs_disabled_flags(unsigned long flags)
> +{
> + int res;
> +
> + asm volatile(ALTERNATIVE(
> + "and %w0, %w1, #" __stringify(PSR_I_BIT),
> + "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQOFF),
> + ARM64_HAS_IRQ_PRIO_MASKING)
> + : "=&r" (res)
> + : "r" ((int) flags)
> + : "memory");
> +
> + return res;
> +}
> +
> static inline unsigned long arch_local_irq_save(void)
> {
> unsigned long flags;
>
> flags = arch_local_save_flags();
>
> - arch_local_irq_disable();
> + /*
> + * There are too many states with IRQs disabled, just keep the current
> + * state if interrupts are already disabled/masked.
> + */
> + if (!arch_irqs_disabled_flags(flags))
> + arch_local_irq_disable();
>
> return flags;
> }
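
For what it's worth, the nesting pattern this has to keep working is
the usual one (a sketch; with priority masking, flags may hold a PMR
value rather than DAIF bits):

	unsigned long flags;

	flags = arch_local_irq_save();		/* snapshot + mask */
	/* critical section */
	arch_local_irq_restore(flags);		/* put the old state back */

hence keeping the current state when interrupts are already masked.
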
> @@ -124,21 +127,5 @@ static inline void arch_local_irq_restore(unsigned long flags)
> : "memory");
> }
>
> -static inline int arch_irqs_disabled_flags(unsigned long flags)
> -{
> - int res;
> -
> - asm volatile(ALTERNATIVE(
> - "and %w0, %w1, #" __stringify(PSR_I_BIT) "\n"
> - "nop",
> - "cmp %w1, #" __stringify(GIC_PRIO_IRQOFF) "\n"
> - "cset %w0, ls",
> - ARM64_HAS_IRQ_PRIO_MASKING)
> - : "=&r" (res)
> - : "r" ((int) flags)
> - : "cc", "memory");
> -
> - return res;
> -}
> #endif
> #endif
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 4bcd9c1..fdc0e1c 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -608,11 +608,12 @@ static inline void kvm_arm_vhe_guest_enter(void)
> * will not signal the CPU of interrupts of lower priority, and the
> * only way to get out will be via guest exceptions.
> * Naturally, we want to avoid this.
> + *
> + * local_daif_mask() already sets IGNORE_PMR, we just need a

GIC_PRIO_PSR_I_SET?

> + * dsb to ensure the redistributor forwards EL2 IRQs to the CPU.
> */
> - if (system_uses_irq_prio_masking()) {
> - gic_write_pmr(GIC_PRIO_IRQON);
> + if (system_uses_irq_prio_masking())
> dsb(sy);
> - }
> }
>
> static inline void kvm_arm_vhe_guest_exit(void)
> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> index b2de329..da22422 100644
> --- a/arch/arm64/include/asm/ptrace.h
> +++ b/arch/arm64/include/asm/ptrace.h
> @@ -35,9 +35,15 @@
> * means masking more IRQs (or at least that the same IRQs remain masked).
> *
> * To mask interrupts, we clear the most significant bit of PMR.
> + *
> + * Some code sections either automatically switch back to PSR.I or explicitly
> + * require not to use priority masking. If bit GIC_PRIO_PSR_I_SET is included
> + * in the priority mask, it indicates that PSR.I should be set and
> + * interrupt disabling temporarily does not rely on IRQ priorities.
> */
> -#define GIC_PRIO_IRQON 0xf0
> -#define GIC_PRIO_IRQOFF (GIC_PRIO_IRQON & ~0x80)
> +#define GIC_PRIO_IRQON 0xc0
> +#define GIC_PRIO_IRQOFF (GIC_PRIO_IRQON & ~0x80)
> +#define GIC_PRIO_PSR_I_SET (1 << 4)
>
> /* Additional SPSR bits not exposed in the UABI */
> #define PSR_IL_BIT (1 << 20)
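
As a cross-check, and assuming the usual GICD_INT_DEF_PRI value of
0xa0, the resulting priority values line up as follows (illustrative
only; lower value means higher priority):

	GIC_PRIO_IRQON                        0xc0
	GIC_PRIO_IRQOFF                       0x40
	GIC_PRIO_PSR_I_SET                    0x10
	GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET  0x50
	GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET   0xd0

so normal IRQs (0xa0) are masked by GIC_PRIO_IRQOFF but not by
GIC_PRIO_IRQON, and OR-ing in GIC_PRIO_PSR_I_SET never masks more
interrupts than the base value, as the comment requires.
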
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 46358a3..7f92e4b 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -258,6 +258,7 @@ alternative_else_nop_endif
> /*
> * Registers that may be useful after this macro is invoked:
> *
> + * x20 - ICC_PMR_EL1
> * x21 - aborted SP
> * x22 - aborted PC
> * x23 - aborted PSTATE
> @@ -449,6 +450,24 @@ alternative_endif
> .endm
> #endif
>
> + .macro gic_prio_kentry_setup, tmp:req
> +#ifdef CONFIG_ARM64_PSEUDO_NMI
> + alternative_if ARM64_HAS_IRQ_PRIO_MASKING
> + mov x20, #(GIC_PRIO_PSR_I_SET | GIC_PRIO_IRQON)
> + msr_s SYS_ICC_PMR_EL1, x20

I find the implicit use of x20 quite dangerous, but hey. I guess that
the context makes that fairly explicit.

> + alternative_else_nop_endif
> +#endif
> + .endm
> +
> + .macro gic_prio_irq_setup, pmr:req, tmp:req
> +#ifdef CONFIG_ARM64_PSEUDO_NMI
> + alternative_if ARM64_HAS_IRQ_PRIO_MASKING
> + orr \tmp, \pmr, #GIC_PRIO_PSR_I_SET
> + msr_s SYS_ICC_PMR_EL1, \tmp
> + alternative_else_nop_endif
> +#endif
> + .endm
> +
> .text
>
> /*
> @@ -627,6 +646,7 @@ el1_dbg:
> cmp x24, #ESR_ELx_EC_BRK64 // if BRK64
> cinc x24, x24, eq // set bit '0'
> tbz x24, #0, el1_inv // EL1 only
> + gic_prio_kentry_setup tmp=x3
> mrs x0, far_el1
> mov x2, sp // struct pt_regs
> bl do_debug_exception
> @@ -644,12 +664,10 @@ ENDPROC(el1_sync)
> .align 6
> el1_irq:
> kernel_entry 1
> + gic_prio_irq_setup pmr=x20, tmp=x1
> enable_da_f
>
> #ifdef CONFIG_ARM64_PSEUDO_NMI
> -alternative_if ARM64_HAS_IRQ_PRIO_MASKING
> - ldr x20, [sp, #S_PMR_SAVE]
> -alternative_else_nop_endif
> test_irqs_unmasked res=x0, pmr=x20
> cbz x0, 1f
> bl asm_nmi_enter
> @@ -679,8 +697,9 @@ alternative_else_nop_endif
>
> #ifdef CONFIG_ARM64_PSEUDO_NMI
> /*
> - * if IRQs were disabled when we received the interrupt, we have an NMI
> - * and we are not re-enabling interrupt upon eret. Skip tracing.
> + * When using IRQ priority masking, we can get spurious interrupts while
> + * PMR is set to GIC_PRIO_IRQOFF. An NMI might also have occurred in a
> + * section with interrupts disabled. Skip tracing in those cases.
> */
> test_irqs_unmasked res=x0, pmr=x20
> cbz x0, 1f
> @@ -809,6 +828,7 @@ el0_ia:
> * Instruction abort handling
> */
> mrs x26, far_el1
> + gic_prio_kentry_setup tmp=x0
> enable_da_f
> #ifdef CONFIG_TRACE_IRQFLAGS
> bl trace_hardirqs_off
> @@ -854,6 +874,7 @@ el0_sp_pc:
> * Stack or PC alignment exception handling
> */
> mrs x26, far_el1
> + gic_prio_kentry_setup tmp=x0
> enable_da_f
> #ifdef CONFIG_TRACE_IRQFLAGS
> bl trace_hardirqs_off
> @@ -888,6 +909,7 @@ el0_dbg:
> * Debug exception handling
> */
> tbnz x24, #0, el0_inv // EL0 only
> + gic_prio_kentry_setup tmp=x3
> mrs x0, far_el1
> mov x1, x25
> mov x2, sp
> @@ -909,7 +931,9 @@ ENDPROC(el0_sync)
> el0_irq:
> kernel_entry 0
> el0_irq_naked:
> + gic_prio_irq_setup pmr=x20, tmp=x0
> enable_da_f
> +
> #ifdef CONFIG_TRACE_IRQFLAGS
> bl trace_hardirqs_off
> #endif
> @@ -931,6 +955,7 @@ ENDPROC(el0_irq)
> el1_error:
> kernel_entry 1
> mrs x1, esr_el1
> + gic_prio_kentry_setup tmp=x2
> enable_dbg
> mov x0, sp
> bl do_serror
> @@ -941,6 +966,7 @@ el0_error:
> kernel_entry 0
> el0_error_naked:
> mrs x1, esr_el1
> + gic_prio_kentry_setup tmp=x2
> enable_dbg
> mov x0, sp
> bl do_serror
> @@ -965,6 +991,7 @@ work_pending:
> */
> ret_to_user:
> disable_daif
> + gic_prio_kentry_setup tmp=x3
> ldr x1, [tsk, #TSK_TI_FLAGS]
> and x2, x1, #_TIF_WORK_MASK
> cbnz x2, work_pending
> @@ -981,6 +1008,7 @@ ENDPROC(ret_to_user)
> */
> .align 6
> el0_svc:
> + gic_prio_kentry_setup tmp=x1
> mov x0, sp
> bl el0_svc_handler
> b ret_to_user
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 3767fb2..58efc37 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -94,7 +94,7 @@ static void __cpu_do_idle_irqprio(void)
> * be raised.
> */
> pmr = gic_read_pmr();
> - gic_write_pmr(GIC_PRIO_IRQON);
> + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
>
> __cpu_do_idle();
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index bb4b3f0..4deaee3 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -192,11 +192,13 @@ static void init_gic_priority_masking(void)
>
> WARN_ON(!(cpuflags & PSR_I_BIT));
>
> - gic_write_pmr(GIC_PRIO_IRQOFF);
> -
> /* We can only unmask PSR.I if we can take aborts */
> - if (!(cpuflags & PSR_A_BIT))
> + if (!(cpuflags & PSR_A_BIT)) {
> + gic_write_pmr(GIC_PRIO_IRQOFF);
> write_sysreg(cpuflags & ~PSR_I_BIT, daif);
> + } else {
> + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
> + }
> }
>
> /*
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 8799e0c..b89fcf0 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -615,7 +615,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
> * Naturally, we want to avoid this.
> */
> if (system_uses_irq_prio_masking()) {
> - gic_write_pmr(GIC_PRIO_IRQON);
> + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
> dsb(sy);
> }
>
> --
> 1.9.1
>

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2019-06-10 07:51:35

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v3 5/8] arm64: Fix incorrect irqflag restore for priority masking



On 07/06/2019 17:29, Marc Zyngier wrote:
> On 06/06/2019 10:31, Julien Thierry wrote:
>> When using IRQ priority masking to disable interrupts, in order to deal
>> with the PSR.I state, local_irq_save() would convert the I bit into a
>> PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore()
>> potentially modifying the value of PMR in an undesired location due to the
>> state of PSR.I upon flag saving [1].
>>
>> In an attempt to solve this issue in a less hackish manner, introduce
>> a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent
>> whether PSR.I is being used to disable interrupts, in which case it
>> takes precedence over the status of interrupt masking via PMR.
>>
>> GIC_PRIO_IGNORE_PMR is chosen such that (<pmr_value> |
>> GIC_PRIO_IGNORE_PMR) does not mask more interrupts than <pmr_value> as
>> some sections (e.g. arch_cpu_idle(), interrupt acknowledge path)
>> require PMR not to mask interrupts that could be signaled to the
>> CPU when using only PSR.I.
>>
>
> s/GIC_PRIO_IGNORE_PMR/GIC_PRIO_PSR_I_SET/
>
>> [1] https://www.spinics.net/lists/arm-kernel/msg716956.html
>>
>> Fixes: commit 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
>> Signed-off-by: Julien Thierry <[email protected]>
>> Reported-by: Zenghui Yu <[email protected]>
>> Cc: Steven Rostedt <[email protected]>
>> Cc: Wei Li <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Christoffer Dall <[email protected]>
>> Cc: Marc Zyngier <[email protected]>
>> Cc: James Morse <[email protected]>
>> Cc: Suzuki K Poulose <[email protected]>
>> Cc: Oleg Nesterov <[email protected]>
>> ---
>> arch/arm64/include/asm/arch_gicv3.h | 4 ++-
>> arch/arm64/include/asm/daifflags.h | 68 ++++++++++++++++++++++---------------
>> arch/arm64/include/asm/irqflags.h | 67 +++++++++++++++---------------------
>> arch/arm64/include/asm/kvm_host.h | 7 ++--
>> arch/arm64/include/asm/ptrace.h | 10 ++++--
>> arch/arm64/kernel/entry.S | 38 ++++++++++++++++++---
>> arch/arm64/kernel/process.c | 2 +-
>> arch/arm64/kernel/smp.c | 8 +++--
>> arch/arm64/kvm/hyp/switch.c | 2 +-
>> 9 files changed, 123 insertions(+), 83 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
>> index 14b41dd..9e991b6 100644
>> --- a/arch/arm64/include/asm/arch_gicv3.h
>> +++ b/arch/arm64/include/asm/arch_gicv3.h
>> @@ -163,7 +163,9 @@ static inline bool gic_prio_masking_enabled(void)
>>
>> static inline void gic_pmr_mask_irqs(void)
>> {
>> - BUILD_BUG_ON(GICD_INT_DEF_PRI <= GIC_PRIO_IRQOFF);
>> + BUILD_BUG_ON(GICD_INT_DEF_PRI < (GIC_PRIO_IRQOFF |
>> + GIC_PRIO_PSR_I_SET));
>> + BUILD_BUG_ON(GICD_INT_DEF_PRI >= GIC_PRIO_IRQON);
>> gic_write_pmr(GIC_PRIO_IRQOFF);
>> }
>>
>> diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
>> index db452aa..f93204f 100644
>> --- a/arch/arm64/include/asm/daifflags.h
>> +++ b/arch/arm64/include/asm/daifflags.h
>> @@ -18,6 +18,7 @@
>>
>> #include <linux/irqflags.h>
>>
>> +#include <asm/arch_gicv3.h>
>> #include <asm/cpufeature.h>
>>
>> #define DAIF_PROCCTX 0
>> @@ -32,6 +33,11 @@ static inline void local_daif_mask(void)
>> :
>> :
>> : "memory");
>> +
>> + /* Don't really care for a dsb here, we don't intend to enable IRQs */
>> + if (system_uses_irq_prio_masking())
>> + gic_write_pmr(GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET);
>> +
>> trace_hardirqs_off();
>> }
>>
>> @@ -43,7 +49,7 @@ static inline unsigned long local_daif_save(void)
>>
>> if (system_uses_irq_prio_masking()) {
>> /* If IRQs are masked with PMR, reflect it in the flags */
>> - if (read_sysreg_s(SYS_ICC_PMR_EL1) <= GIC_PRIO_IRQOFF)
>> + if (read_sysreg_s(SYS_ICC_PMR_EL1) != GIC_PRIO_IRQON)
>> flags |= PSR_I_BIT;
>> }
>>
>> @@ -59,36 +65,44 @@ static inline void local_daif_restore(unsigned long flags)
>> if (!irq_disabled) {
>> trace_hardirqs_on();
>>
>> - if (system_uses_irq_prio_masking())
>> - arch_local_irq_enable();
>> - } else if (!(flags & PSR_A_BIT)) {
>> - /*
>> - * If interrupts are disabled but we can take
>> - * asynchronous errors, we can take NMIs
>> - */
>> if (system_uses_irq_prio_masking()) {
>> - flags &= ~PSR_I_BIT;
>> + gic_write_pmr(GIC_PRIO_IRQON);
>> + dsb(sy);
>> + }
>> + } else if (system_uses_irq_prio_masking()) {
>> + u64 pmr;
>> +
>> + if (!(flags & PSR_A_BIT)) {
>> /*
>> - * There has been concern that the write to daif
>> - * might be reordered before this write to PMR.
>> - * From the ARM ARM DDI 0487D.a, section D1.7.1
>> - * "Accessing PSTATE fields":
>> - * Writes to the PSTATE fields have side-effects on
>> - * various aspects of the PE operation. All of these
>> - * side-effects are guaranteed:
>> - * - Not to be visible to earlier instructions in
>> - * the execution stream.
>> - * - To be visible to later instructions in the
>> - * execution stream
>> - *
>> - * Also, writes to PMR are self-synchronizing, so no
>> - * interrupts with a lower priority than PMR is signaled
>> - * to the PE after the write.
>> - *
>> - * So we don't need additional synchronization here.
>> + * If interrupts are disabled but we can take
>> + * asynchronous errors, we can take NMIs
>> */
>> - arch_local_irq_disable();
>> + flags &= ~PSR_I_BIT;
>> + pmr = GIC_PRIO_IRQOFF;
>> + } else {
>> + pmr = GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET;
>> }
>> +
>> + /*
>> + * There has been concern that the write to daif
>> + * might be reordered before this write to PMR.
>> + * From the ARM ARM DDI 0487D.a, section D1.7.1
>> + * "Accessing PSTATE fields":
>> + * Writes to the PSTATE fields have side-effects on
>> + * various aspects of the PE operation. All of these
>> + * side-effects are guaranteed:
>> + * - Not to be visible to earlier instructions in
>> + * the execution stream.
>> + * - To be visible to later instructions in the
>> + * execution stream
>> + *
>> + * Also, writes to PMR are self-synchronizing, so no
>> + * interrupts with a lower priority than PMR are signaled
>> + * to the PE after the write.
>> + *
>> + * So we don't need additional synchronization here.
>> + */
>> + gic_write_pmr(pmr);
>> }
>>
>> write_sysreg(flags, daif);
>> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
>> index fbe1aba..b6f757f 100644
>> --- a/arch/arm64/include/asm/irqflags.h
>> +++ b/arch/arm64/include/asm/irqflags.h
>> @@ -67,43 +67,46 @@ static inline void arch_local_irq_disable(void)
>> */
>> static inline unsigned long arch_local_save_flags(void)
>> {
>> - unsigned long daif_bits;
>> unsigned long flags;
>>
>> - daif_bits = read_sysreg(daif);
>> -
>> - /*
>> - * The asm is logically equivalent to:
>> - *
>> - * if (system_uses_irq_prio_masking())
>> - * flags = (daif_bits & PSR_I_BIT) ?
>> - * GIC_PRIO_IRQOFF :
>> - * read_sysreg_s(SYS_ICC_PMR_EL1);
>> - * else
>> - * flags = daif_bits;
>> - */
>> asm volatile(ALTERNATIVE(
>> - "mov %0, %1\n"
>> - "nop\n"
>> - "nop",
>> - __mrs_s("%0", SYS_ICC_PMR_EL1)
>> - "ands %1, %1, " __stringify(PSR_I_BIT) "\n"
>> - "csel %0, %0, %2, eq",
>> - ARM64_HAS_IRQ_PRIO_MASKING)
>> - : "=&r" (flags), "+r" (daif_bits)
>> - : "r" ((unsigned long) GIC_PRIO_IRQOFF)
>> - : "cc", "memory");
>> + "mrs %0, daif",
>> + __mrs_s("%0", SYS_ICC_PMR_EL1),
>> + ARM64_HAS_IRQ_PRIO_MASKING)
>> + : "=&r" (flags)
>> + :
>> + : "memory");
>
> I think this is worth a comment here, as you're changing the semantics
> of arch_local_save_flags(). Maybe just indicating that the only thing
> this should be used for is to carry the interrupt state, and nothing else.
>

Arguably, this is what gets called by local_save_flags() which is arch
independent and, as far as I understand, is only aware of the interrupt
state being contained in the flags (arch might wish to store more stuff
in it, but overall, generic code cannot rely on it).

I'll still add a comment so that code directly calling
arch_local_save_flags() doesn't try to play with PSR.DA_F. (In such
cases it would probably be clearer for them to do direct DAIF
reads/writes IMO.)
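
Something like this, as a sketch, for code that really needs the full
exception state (using the daifflags helpers already shown in this
patch):

	unsigned long flags;

	flags = local_daif_save();	/* masks everything, returns old DAIF */
	/* ... code that must not take any exception ... */
	local_daif_restore(flags);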

>>
>> return flags;
>> }
>>
>> +static inline int arch_irqs_disabled_flags(unsigned long flags)
>> +{
>> + int res;
>> +
>> + asm volatile(ALTERNATIVE(
>> + "and %w0, %w1, #" __stringify(PSR_I_BIT),
>> + "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQOFF),
>> + ARM64_HAS_IRQ_PRIO_MASKING)
>> + : "=&r" (res)
>> + : "r" ((int) flags)
>> + : "memory");
>> +
>> + return res;
>> +}
>> +
>> static inline unsigned long arch_local_irq_save(void)
>> {
>> unsigned long flags;
>>
>> flags = arch_local_save_flags();
>>
>> - arch_local_irq_disable();
>> + /*
>> + * There are too many states with IRQs disabled, just keep the current
>> + * state if interrupts are already disabled/masked.
>> + */
>> + if (!arch_irqs_disabled_flags(flags))
>> + arch_local_irq_disable();
>>
>> return flags;
>> }
>> @@ -124,21 +127,5 @@ static inline void arch_local_irq_restore(unsigned long flags)
>> : "memory");
>> }
>>
>> -static inline int arch_irqs_disabled_flags(unsigned long flags)
>> -{
>> - int res;
>> -
>> - asm volatile(ALTERNATIVE(
>> - "and %w0, %w1, #" __stringify(PSR_I_BIT) "\n"
>> - "nop",
>> - "cmp %w1, #" __stringify(GIC_PRIO_IRQOFF) "\n"
>> - "cset %w0, ls",
>> - ARM64_HAS_IRQ_PRIO_MASKING)
>> - : "=&r" (res)
>> - : "r" ((int) flags)
>> - : "cc", "memory");
>> -
>> - return res;
>> -}
>> #endif
>> #endif
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 4bcd9c1..fdc0e1c 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -608,11 +608,12 @@ static inline void kvm_arm_vhe_guest_enter(void)
>> * will not signal the CPU of interrupts of lower priority, and the
>> * only way to get out will be via guest exceptions.
>> * Naturally, we want to avoid this.
>> + *
>> + * local_daif_mask() already sets IGNORE_PMR, we just need a
>
> GIC_PRIO_PSR_I_SET?

Yes.

>
>> + * dsb to ensure the redistributor forwards EL2 IRQs to the CPU.
>> */
>> - if (system_uses_irq_prio_masking()) {
>> - gic_write_pmr(GIC_PRIO_IRQON);
>> + if (system_uses_irq_prio_masking())
>> dsb(sy);
>> - }
>> }
>>
>> static inline void kvm_arm_vhe_guest_exit(void)
>> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
>> index b2de329..da22422 100644
>> --- a/arch/arm64/include/asm/ptrace.h
>> +++ b/arch/arm64/include/asm/ptrace.h
>> @@ -35,9 +35,15 @@
>> * means masking more IRQs (or at least that the same IRQs remain masked).
>> *
>> * To mask interrupts, we clear the most significant bit of PMR.
>> + *
>> + * Some code sections either automatically switch back to PSR.I or explicitly
>> + * require not to use priority masking. If bit GIC_PRIO_PSR_I_SET is included
>> + * in the priority mask, it indicates that PSR.I should be set and
>> + * interrupt disabling temporarily does not rely on IRQ priorities.
>> */
>> -#define GIC_PRIO_IRQON 0xf0
>> -#define GIC_PRIO_IRQOFF (GIC_PRIO_IRQON & ~0x80)
>> +#define GIC_PRIO_IRQON 0xc0
>> +#define GIC_PRIO_IRQOFF (GIC_PRIO_IRQON & ~0x80)
>> +#define GIC_PRIO_PSR_I_SET (1 << 4)
>>
>> /* Additional SPSR bits not exposed in the UABI */
>> #define PSR_IL_BIT (1 << 20)
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 46358a3..7f92e4b 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -258,6 +258,7 @@ alternative_else_nop_endif
>> /*
>> * Registers that may be useful after this macro is invoked:
>> *
>> + * x20 - ICC_PMR_EL1
>> * x21 - aborted SP
>> * x22 - aborted PC
>> * x23 - aborted PSTATE
>> @@ -449,6 +450,24 @@ alternative_endif
>> .endm
>> #endif
>>
>> + .macro gic_prio_kentry_setup, tmp:req
>> +#ifdef CONFIG_ARM64_PSEUDO_NMI
>> + alternative_if ARM64_HAS_IRQ_PRIO_MASKING
>> + mov x20, #(GIC_PRIO_PSR_I_SET | GIC_PRIO_IRQON)
>> + msr_s SYS_ICC_PMR_EL1, x20
>
> I find the implicit use of x20 quite dangerous, but hey. I guess that
> the context makes that fairly explicit.
>

Yes, this is not intentional, I must have messed up when reworking the
code. The macro has a tmp parameter for that purpose, I just forgot to
update the body of the macro...

Thanks for spotting it.
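
With the tmp argument actually used, the body would read as follows (a
sketch of the expected respin, not the posted patch):

	.macro gic_prio_kentry_setup, tmp:req
#ifdef CONFIG_ARM64_PSEUDO_NMI
	alternative_if ARM64_HAS_IRQ_PRIO_MASKING
	mov	\tmp, #(GIC_PRIO_PSR_I_SET | GIC_PRIO_IRQON)
	msr_s	SYS_ICC_PMR_EL1, \tmp
	alternative_else_nop_endif
#endif
	.endm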

Thanks,

--
Julien Thierry

2019-06-10 07:55:29

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v3 6/8] arm64: irqflags: Introduce explicit debugging for IRQ priorities



On 07/06/2019 17:31, Marc Zyngier wrote:
> On 06/06/2019 10:31, Julien Thierry wrote:
>> Using IRQ priority masking to enable/disable interrupts is a bit
>> sensitive as it requires to deal with both ICC_PMR_EL1 and PSR.I.
>>
>> Introduce some validity checks to both highlight the states in which
>> functions dealing with IRQ enabling/disabling can (not) be called, and
>> bark a warning when called in an unexpected state.
>>
>> Since these checks are done on hotpaths, introduce a build option to
>> choose whether to do the checking.
>>
>> Signed-off-by: Julien Thierry <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> ---
>> arch/arm64/Kconfig | 11 +++++++++++
>> arch/arm64/include/asm/cpufeature.h | 6 ++++++
>> arch/arm64/include/asm/daifflags.h | 7 +++++++
>> arch/arm64/include/asm/irqflags.h | 14 +++++++++++++-
>> 4 files changed, 37 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 697ea05..8acc40e 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -1436,6 +1436,17 @@ config ARM64_PSEUDO_NMI
>>
>> If unsure, say N
>>
>> +if ARM64_PSEUDO_NMI
>> +config ARM64_DEBUG_PRIORITY_MASKING
>> + bool "Debug interrupt priority masking"
>> + help
>> + This adds runtime checks to functions enabling/disabling
>> + interrupts when using priority masking. The additional checks verify
>> + the validity of ICC_PMR_EL1 when calling the affected functions.
>> +
>> + If unsure, say N
>> +endif
>> +
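
(For testing, a hypothetical .config fragment exercising these checks
would be:

	CONFIG_ARM64_PSEUDO_NMI=y
	CONFIG_ARM64_DEBUG_PRIORITY_MASKING=y

given the new option sits under the ARM64_PSEUDO_NMI guard.)
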
>> config RELOCATABLE
>> bool
>> help
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index bc895c8..693a086 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -617,6 +617,12 @@ static inline bool system_uses_irq_prio_masking(void)
>> cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
>> }
>>
>> +static inline bool system_has_prio_mask_debugging(void)
>> +{
>> + return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
>> + system_uses_irq_prio_masking();
>> +}
>> +
>> #define ARM64_SSBD_UNKNOWN -1
>> #define ARM64_SSBD_FORCE_DISABLE 0
>> #define ARM64_SSBD_KERNEL 1
>> diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
>> index f93204f..eca5bee 100644
>> --- a/arch/arm64/include/asm/daifflags.h
>> +++ b/arch/arm64/include/asm/daifflags.h
>> @@ -28,6 +28,10 @@
>> /* mask/save/unmask/restore all exceptions, including interrupts. */
>> static inline void local_daif_mask(void)
>> {
>> + WARN_ON(system_has_prio_mask_debugging() &&
>> + (read_sysreg_s(SYS_ICC_PMR_EL1) == (GIC_PRIO_IRQOFF |
>> + GIC_PRIO_PSR_I_SET)));
>> +
>> asm volatile(
>> "msr daifset, #0xf // local_daif_mask\n"
>> :
>> @@ -62,6 +66,9 @@ static inline void local_daif_restore(unsigned long flags)
>> {
>> bool irq_disabled = flags & PSR_I_BIT;
>>
>> + WARN_ON(system_has_prio_mask_debugging() &&
>> + !(read_sysreg(daif) & PSR_I_BIT));
>> +
>> if (!irq_disabled) {
>> trace_hardirqs_on();
>>
>> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
>> index b6f757f..cac2d2a 100644
>> --- a/arch/arm64/include/asm/irqflags.h
>> +++ b/arch/arm64/include/asm/irqflags.h
>> @@ -40,6 +40,12 @@
>> */
>> static inline void arch_local_irq_enable(void)
>> {
>> + if (system_has_prio_mask_debugging()) {
>> + u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);
>> +
>> + WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
>> + }
>> +
>> asm volatile(ALTERNATIVE(
>> "msr daifclr, #2 // arch_local_irq_enable\n"
>> "nop",
>> @@ -53,6 +59,12 @@ static inline void arch_local_irq_enable(void)
>>
>> static inline void arch_local_irq_disable(void)
>> {
>> + if (system_has_prio_mask_debugging()) {
>> + u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);
>> +
>> + WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
>> + }
>> +
>> asm volatile(ALTERNATIVE(
>> "msr daifset, #2 // arch_local_irq_disable",
>> __msr_s(SYS_ICC_PMR_EL1, "%0"),
>> @@ -86,7 +98,7 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
>>
>> asm volatile(ALTERNATIVE(
>> "and %w0, %w1, #" __stringify(PSR_I_BIT),
>> - "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQOFF),
>> + "eor %w0, %w1, #" __stringify(GIC_PRIO_IRQON),
>
> Err... Which version is the correct one? This one, or the previous one?
>

Argh, bad fixup of the patches. This one is the correct one (any state
!= GIC_PRIO_IRQON means interrupts are disabled), but the correct value
should already have been used in the previous patch.

Will fix that in the next posting.
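
For the record, with GIC_PRIO_IRQON the eor check is logically
equivalent to this C sketch:

	if (system_uses_irq_prio_masking())
		res = (flags != GIC_PRIO_IRQON);  /* any other state is masked */
	else
		res = flags & PSR_I_BIT;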

Thanks,

--
Julien Thierry

2019-06-10 12:09:56

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v3 5/8] arm64: Fix incorrect irqflag restore for priority masking



On 10/06/2019 08:49, Julien Thierry wrote:
>
>
> On 07/06/2019 17:29, Marc Zyngier wrote:
>> On 06/06/2019 10:31, Julien Thierry wrote:
>>> When using IRQ priority masking to disable interrupts, in order to deal
>>> with the PSR.I state, local_irq_save() would convert the I bit into a
>>> PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore()
>>> potentially modifying the value of PMR in an undesired location due to the
>>> state of PSR.I upon flag saving [1].
>>>
>>> In an attempt to solve this issue in a less hackish manner, introduce
>>> a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent
>>> whether PSR.I is being used to disable interrupts, in which case it
>>> takes precedence over the status of interrupt masking via PMR.
>>>
>>> GIC_PRIO_IGNORE_PMR is chosen such that (<pmr_value> |
>>> GIC_PRIO_IGNORE_PMR) does not mask more interrupts than <pmr_value> as
>>> some sections (e.g. arch_cpu_idle(), interrupt acknowledge path)
>>> require PMR not to mask interrupts that could be signaled to the
>>> CPU when using only PSR.I.
>>>
>>
>> s/GIC_PRIO_IGNORE_PMR/GIC_PRIO_PSR_I_SET/
>>
>>> [1] https://www.spinics.net/lists/arm-kernel/msg716956.html
>>>
>>> Fixes: commit 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
>>> Signed-off-by: Julien Thierry <[email protected]>
>>> Reported-by: Zenghui Yu <[email protected]>
>>> Cc: Steven Rostedt <[email protected]>
>>> Cc: Wei Li <[email protected]>
>>> Cc: Catalin Marinas <[email protected]>
>>> Cc: Will Deacon <[email protected]>
>>> Cc: Christoffer Dall <[email protected]>
>>> Cc: Marc Zyngier <[email protected]>
>>> Cc: James Morse <[email protected]>
>>> Cc: Suzuki K Poulose <[email protected]>
>>> Cc: Oleg Nesterov <[email protected]>
>>> ---
>>> arch/arm64/include/asm/arch_gicv3.h | 4 ++-
>>> arch/arm64/include/asm/daifflags.h | 68 ++++++++++++++++++++++---------------
>>> arch/arm64/include/asm/irqflags.h | 67 +++++++++++++++---------------------
>>> arch/arm64/include/asm/kvm_host.h | 7 ++--
>>> arch/arm64/include/asm/ptrace.h | 10 ++++--
>>> arch/arm64/kernel/entry.S | 38 ++++++++++++++++++---
>>> arch/arm64/kernel/process.c | 2 +-
>>> arch/arm64/kernel/smp.c | 8 +++--
>>> arch/arm64/kvm/hyp/switch.c | 2 +-
>>> 9 files changed, 123 insertions(+), 83 deletions(-)
>>>

[...]

>>> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
>>> index fbe1aba..b6f757f 100644
>>> --- a/arch/arm64/include/asm/irqflags.h
>>> +++ b/arch/arm64/include/asm/irqflags.h
>>> @@ -67,43 +67,46 @@ static inline void arch_local_irq_disable(void)
>>> */
>>> static inline unsigned long arch_local_save_flags(void)
>>> {
>>> - unsigned long daif_bits;
>>> unsigned long flags;
>>>
>>> - daif_bits = read_sysreg(daif);
>>> -
>>> - /*
>>> - * The asm is logically equivalent to:
>>> - *
>>> - * if (system_uses_irq_prio_masking())
>>> - * flags = (daif_bits & PSR_I_BIT) ?
>>> - * GIC_PRIO_IRQOFF :
>>> - * read_sysreg_s(SYS_ICC_PMR_EL1);
>>> - * else
>>> - * flags = daif_bits;
>>> - */
>>> asm volatile(ALTERNATIVE(
>>> - "mov %0, %1\n"
>>> - "nop\n"
>>> - "nop",
>>> - __mrs_s("%0", SYS_ICC_PMR_EL1)
>>> - "ands %1, %1, " __stringify(PSR_I_BIT) "\n"
>>> - "csel %0, %0, %2, eq",
>>> - ARM64_HAS_IRQ_PRIO_MASKING)
>>> - : "=&r" (flags), "+r" (daif_bits)
>>> - : "r" ((unsigned long) GIC_PRIO_IRQOFF)
>>> - : "cc", "memory");
>>> + "mrs %0, daif",
>>> + __mrs_s("%0", SYS_ICC_PMR_EL1),
>>> + ARM64_HAS_IRQ_PRIO_MASKING)
>>> + : "=&r" (flags)
>>> + :
>>> + : "memory");
>>
>> I think this is worth a comment here, as you're changing the semantics
>> of arch_local_save_flags(). Maybe just indicating that the only thing
>> this should be used for is to carry the interrupt state, and nothing else.
>>
>
> Arguably, this is what gets called by local_save_flags() which is arch
> independent and, as far as I understand, is only aware of the interrupt
> state being contained in the flags (arch might wish to store more stuff
> in it, but overall, generic code cannot rely on it).
>
> I'll still add a comment so that code directly calling
> arch_local_save_flags() doesn't try to play with PSR.DA_F. (In such
> cases it would probably be clearer for them to do direct DAIF
> reads/writes IMO.)
>

After checking, arch_local_save_flags() already has the following
comment above it:

/*
 * Save the current interrupt enable state.
 */

Which suggests you shouldn't rely on it carrying the debug state and
the other bits (they just happen to be there, maybe wrongfully...).

And users checking the flags should use arch_irqs_disabled_flags()
rather than "flags & PSR_I_BIT != 0".

Also, those semantics were already changed when we introduced priority
masking and included the PMR value in the irqflags.
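
To illustrate the difference (a sketch; the second form is the broken
one once flags may hold a PMR value):

	bool masked;

	masked = arch_irqs_disabled_flags(flags);	/* portable */
	masked = flags & PSR_I_BIT;			/* wrong with PMR masking */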

I'm not sure there is a lot more explanation to do in this patch in
particular.

Thanks,

--
Julien Thierry

2019-06-10 12:15:51

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 5/8] arm64: Fix incorrect irqflag restore for priority masking

On 10/06/2019 12:36, Julien Thierry wrote:
>
>
> On 10/06/2019 08:49, Julien Thierry wrote:
>>
>>
>> On 07/06/2019 17:29, Marc Zyngier wrote:
>>> On 06/06/2019 10:31, Julien Thierry wrote:
>>>> When using IRQ priority masking to disable interrupts, in order to deal
>>>> with the PSR.I state, local_irq_save() would convert the I bit into a
>>>> PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore()
>>>> potentially modifying the value of PMR in an undesired location due to the
>>>> state of PSR.I upon flag saving [1].
>>>>
>>>> In an attempt to solve this issue in a less hackish manner, introduce
>>>> a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent
>>>> whether PSR.I is being used to disable interrupts, in which case it
>>>> takes precedence over the status of interrupt masking via PMR.
>>>>
>>>> GIC_PRIO_IGNORE_PMR is chosen such that (<pmr_value> |
>>>> GIC_PRIO_IGNORE_PMR) does not mask more interrupts than <pmr_value> as
>>>> some sections (e.g. arch_cpu_idle(), interrupt acknowledge path)
>>>> require PMR not to mask interrupts that could be signaled to the
>>>> CPU when using only PSR.I.
>>>>
>>>
>>> s/GIC_PRIO_IGNORE_PMR/GIC_PRIO_PSR_I_SET/
>>>
>>>> [1] https://www.spinics.net/lists/arm-kernel/msg716956.html
>>>>
>>>> Fixes: commit 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
>>>> Signed-off-by: Julien Thierry <[email protected]>
>>>> Reported-by: Zenghui Yu <[email protected]>
>>>> Cc: Steven Rostedt <[email protected]>
>>>> Cc: Wei Li <[email protected]>
>>>> Cc: Catalin Marinas <[email protected]>
>>>> Cc: Will Deacon <[email protected]>
>>>> Cc: Christoffer Dall <[email protected]>
>>>> Cc: Marc Zyngier <[email protected]>
>>>> Cc: James Morse <[email protected]>
>>>> Cc: Suzuki K Poulose <[email protected]>
>>>> Cc: Oleg Nesterov <[email protected]>
>>>> ---
>>>> arch/arm64/include/asm/arch_gicv3.h | 4 ++-
>>>> arch/arm64/include/asm/daifflags.h | 68 ++++++++++++++++++++++---------------
>>>> arch/arm64/include/asm/irqflags.h | 67 +++++++++++++++---------------------
>>>> arch/arm64/include/asm/kvm_host.h | 7 ++--
>>>> arch/arm64/include/asm/ptrace.h | 10 ++++--
>>>> arch/arm64/kernel/entry.S | 38 ++++++++++++++++++---
>>>> arch/arm64/kernel/process.c | 2 +-
>>>> arch/arm64/kernel/smp.c | 8 +++--
>>>> arch/arm64/kvm/hyp/switch.c | 2 +-
>>>> 9 files changed, 123 insertions(+), 83 deletions(-)
>>>>
>
> [...]
>
>>>> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
>>>> index fbe1aba..b6f757f 100644
>>>> --- a/arch/arm64/include/asm/irqflags.h
>>>> +++ b/arch/arm64/include/asm/irqflags.h
>>>> @@ -67,43 +67,46 @@ static inline void arch_local_irq_disable(void)
>>>> */
>>>> static inline unsigned long arch_local_save_flags(void)
>>>> {
>>>> - unsigned long daif_bits;
>>>> unsigned long flags;
>>>>
>>>> - daif_bits = read_sysreg(daif);
>>>> -
>>>> - /*
>>>> - * The asm is logically equivalent to:
>>>> - *
>>>> - * if (system_uses_irq_prio_masking())
>>>> - * flags = (daif_bits & PSR_I_BIT) ?
>>>> - * GIC_PRIO_IRQOFF :
>>>> - * read_sysreg_s(SYS_ICC_PMR_EL1);
>>>> - * else
>>>> - * flags = daif_bits;
>>>> - */
>>>> asm volatile(ALTERNATIVE(
>>>> - "mov %0, %1\n"
>>>> - "nop\n"
>>>> - "nop",
>>>> - __mrs_s("%0", SYS_ICC_PMR_EL1)
>>>> - "ands %1, %1, " __stringify(PSR_I_BIT) "\n"
>>>> - "csel %0, %0, %2, eq",
>>>> - ARM64_HAS_IRQ_PRIO_MASKING)
>>>> - : "=&r" (flags), "+r" (daif_bits)
>>>> - : "r" ((unsigned long) GIC_PRIO_IRQOFF)
>>>> - : "cc", "memory");
>>>> + "mrs %0, daif",
>>>> + __mrs_s("%0", SYS_ICC_PMR_EL1),
>>>> + ARM64_HAS_IRQ_PRIO_MASKING)
>>>> + : "=&r" (flags)
>>>> + :
>>>> + : "memory");
>>>
>>> I think this is worth a comment here, as you're changing the semantics
>>> of arch_local_save_flags(). Maybe just indicating that the only thing
>>> this should be used for is to carry the interrupt state, and nothing else.
>>>
>>
>> Arguably, this is what gets called by local_save_flags() which is arch
>> independent and, as far as I understand, is only aware of the interrupt
>> state being contained in the flags (arch might wish to store more stuff
>> in it, but overall, generic code cannot rely on it).
>>
>> I'll still add a comment so that code directly calling
>> arch_local_save_flags() doesn't try to play with PSR.DA_F. (In such
>> cases it would probably be clearer for them to do direct DAIF
>> reads/writes IMO.)
>>
>
> After checking, arch_local_save_flags() already has the following
> comment above it:
>
> /*
>  * Save the current interrupt enable state.
>  */
>
> Which suggests you shouldn't rely on it carrying the debug state and
> the other bits (they just happen to be there, maybe wrongfully...).
>
> And users checking the flags should use arch_irqs_disabled_flags()
> rather than "flags & PSR_I_BIT != 0".
>
> Also, those semantics were already changed when we introduced priority
> masking and included the PMR value in the irqflags.
>
> I'm not sure there is a lot more explanation to do in this patch in
> particular.

Fair enough. I guess that if someone is fiddling with the flags in
ungodly ways, they deserve to be bitten...

Thanks,

M.
--
Jazz is not dead. It just smells funny...