2018-05-21 11:39:36

by Julien Thierry

Subject: [PATCH v3 0/6] arm64: provide pseudo NMI with GICv3

This series is a continuation of the work started by Daniel [1]. The goal
is to use GICv3 interrupt priorities to simulate an NMI.

To achieve this, we set two priorities: one for standard interrupts and
another, higher priority, for NMIs. Whenever we want to disable interrupts,
we mask the standard priority instead, so NMIs can still be raised. Some
corner cases, however, still require actually masking all interrupts,
effectively disabling the NMI.
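
As an illustration (a simplified sketch only, using the priority values and
the gic_write_pmr() helper introduced later in this series, not the actual
implementation):

	/* Lower value == higher priority at the GIC CPU interface. */
	#define GICD_INT_NMI_PRI	0xa0	/* priority kept by pseudo-NMIs */
	#define GICD_INT_DEF_PRI	0xc0	/* priority of normal interrupts */
	#define ICC_PMR_EL1_UNMASKED	0xf0
	#define ICC_PMR_EL1_MASKED	0xb0	/* 0xf0 with its enable bit cleared */

	static inline void pseudo_irq_disable(void)
	{
		/* 0xc0 no longer passes the mask, 0xa0 (NMI) still does. */
		gic_write_pmr(ICC_PMR_EL1_MASKED);
	}

	static inline void pseudo_irq_enable(void)
	{
		gic_write_pmr(ICC_PMR_EL1_UNMASKED);
	}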


Currently, only PPIs and SPIs can be set as NMIs. Since IPIs are currently
hardcoded IRQ numbers, there is no generic interface to set SGIs as NMIs
for now. I don't think there is any reason LPIs should be allowed to be set
as NMIs, as they do not have an active state.
When an NMI is active on a CPU, no other NMI can be triggered on the CPU.


I did a bit of testing on a board with 8 Cortex-A57 cores:

- "hackbench 200 process 1000" (average over 20 runs)
+-----------+----------+-----------+-----------------+
|           |  native  | PMR guest | v4.17-rc6 guest |
+-----------+----------+-----------+-----------------+
| PMR host  | 40.0336s | 39.3039s  | 39.2044s        |
| v4.17-rc6 | 40.4040s | 39.6011s  | 39.1147s        |
+-----------+----------+-----------+-----------------+

I'm not sure why guests appear to be faster than hosts; maybe it is
because the host has a full Ubuntu system while the guests just have a
simple rootfs with busybox...

It also seems the penalty from using PMR is cushioned by the removal of
the interrupt acknowledge loop in the GICv3 driver.


- Kernel build from defconfig:
PMR host: 13m45.743s
v4.17-rc6: 13m40.400s

The difference is ~0.65%; judging from different runs, this seems to be
within the noise.


Requirements to use this:
- Have a GICv3
- Have SCR_EL3.FIQ set to 1 while Linux runs, or a GIC with a single
  security state
- Select Kernel Features -> Use ICC system registers for IRQ masking
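
For example, the resulting config fragment would contain (using the option
added by this series):

	CONFIG_ARM_GIC_V3=y
	CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS=y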

* Patches 1 and 2 allow detecting and enabling the use of GICv3 system
registers during boot time.
* Patch 3 introduces the masking of IRQs using priorities, replacing IRQ
disabling.
* Patch 4 adds some utility functions.
* Patch 5 adds detection of the view Linux has of GICv3 priorities; without
this we cannot easily mask specific priorities in an accurate manner.
* Patch 6 adds the support for NMIs.


Changes since V2[2]:
* Series rebased to v4.17-rc6

* Adapt patches 1 and 2 to the rework of the cpufeatures framework

* Use the group0 detection scheme in the GICv3 driver to identify
the priority view, and drop the use of a fake interrupt

* Add the case for a GIC configured in a single security state

* Use local_daif_restore instead of local_irq_enable the first time
we enable interrupts after bp hardening in the handling of a kernel
entry. Otherwise PSR.I remains set...


Changes since V1[3]:
* Series rebased to v4.15-rc8.

* Check for arm64_early_features in this_cpu_has_cap (spotted by Suzuki).

* Fix issue where debug exceptions were not masked when enabling debug in
mdscr_el1.


Changes since RFC[4]:
* The series was rebased to v4.15-rc2 which implied some changes mainly
related to the work on exception entries and daif flags by James Morse.

- The first patch in the previous series was dropped because it was no
longer applicable.

- With the semantics James introduced of "inheriting" daif flags,
handling of PMR on exception entry is simplified, as PMR is not altered
by taking an exception and is already inherited from the previous state.

- James pointed out that taking a pseudo-NMI before reading the FAR_EL1
register should not be allowed as per the TRM (D10.2.29):
"FAR_EL1 is made UNKNOWN on an exception return from EL1."
So in this submission the PSR.I bit is cleared only after FAR_EL1 is read.

* For KVM, only deal with PMR unmasking/restoring in common code, while the
VHE-specific code makes sure the PSR.I bit is set when necessary.

* When detecting the GIC priority view (patch 5), wait for an actual
interrupt instead of trying only once.


[1] http://www.spinics.net/lists/arm-kernel/msg525077.html
[2] https://lkml.org/lkml/2018/1/17/335
[3] https://www.spinics.net/lists/arm-kernel/msg620763.html
[4] https://www.spinics.net/lists/arm-kernel/msg610736.html

Cheers,

Julien


Daniel Thompson (3):
arm64: cpufeature: Allow early detect of specific features
arm64: alternative: Apply alternatives early in boot process
arm64: irqflags: Use ICC sysregs to implement IRQ masking

Julien Thierry (3):
irqchip/gic: Add functions to access irq priorities
arm64: Detect current view of GIC priorities
arm64: Add support for pseudo-NMIs

 Documentation/arm64/booting.txt        |   5 +
 arch/arm64/Kconfig                     |  15 ++
 arch/arm64/include/asm/alternative.h   |   5 +-
 arch/arm64/include/asm/arch_gicv3.h    |  25 +++
 arch/arm64/include/asm/assembler.h     |  25 ++-
 arch/arm64/include/asm/cpufeature.h    |   2 +
 arch/arm64/include/asm/daifflags.h     |  36 ++--
 arch/arm64/include/asm/efi.h           |   5 +
 arch/arm64/include/asm/irqflags.h      | 131 ++++++++++++++
 arch/arm64/include/asm/kvm_host.h      |  14 ++
 arch/arm64/include/asm/processor.h     |   4 +
 arch/arm64/include/asm/ptrace.h        |  14 +-
 arch/arm64/include/asm/sysreg.h        |   1 +
 arch/arm64/kernel/alternative.c        |  39 +++-
 arch/arm64/kernel/asm-offsets.c        |   1 +
 arch/arm64/kernel/cpufeature.c         |   9 +-
 arch/arm64/kernel/entry.S              |  84 ++++++++-
 arch/arm64/kernel/head.S               |  37 ++++
 arch/arm64/kernel/process.c            |   6 +
 arch/arm64/kernel/smp.c                |  15 ++
 arch/arm64/kvm/hyp/switch.c            |  25 +++
 arch/arm64/mm/fault.c                  |   5 +-
 arch/arm64/mm/proc.S                   |  23 +++
 drivers/irqchip/irq-gic-common.c       |  10 ++
 drivers/irqchip/irq-gic-common.h       |   2 +
 drivers/irqchip/irq-gic-v3-its.c       |   2 +-
 drivers/irqchip/irq-gic-v3.c           | 320 ++++++++++++++++++++++++++-------
 include/linux/interrupt.h              |   1 +
 include/linux/irqchip/arm-gic-common.h |   6 +
 include/linux/irqchip/arm-gic.h        |   5 -
 30 files changed, 778 insertions(+), 94 deletions(-)

--
1.9.1


2018-05-21 11:36:45

by Julien Thierry

Subject: [PATCH v3 6/6] arm64: Add support for pseudo-NMIs

arm64 does not provide native NMIs. Emulate the NMI behaviour using GIC
priorities.

Add the ability to set an IRQ as an NMI, along with the handling of that NMI.

If the view of GIC priorities is the secure one
(i.e. SCR_EL3.FIQ == 0 && security enabled), do not allow the use of NMIs.
Emit a warning when attempting to set an IRQ as NMI under this scenario.
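
For illustration, a driver would set up one of its interrupts as an NMI
roughly as follows (a sketch only; the handler name, IRQ number and dev
cookie are made up, and the NMI property must be set while the IRQ is
still disabled):

	/* Flag the interrupt as an NMI before it is enabled. */
	err = irq_set_irqchip_state(irq, IRQCHIP_STATE_NMI, true);
	if (err)
		return err;	/* e.g. secure priority view, NMIs unsupported */

	/*
	 * The handler runs between nmi_enter()/nmi_exit(), so it must not
	 * sleep or take regular spinlocks.
	 */
	err = request_irq(irq, my_nmi_handler, 0, "my-nmi", dev);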

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
 arch/arm64/include/asm/arch_gicv3.h |   5 ++
 arch/arm64/include/asm/irqflags.h   |   6 ++
 arch/arm64/kernel/entry.S           |  56 ++++++++++++++
 drivers/irqchip/irq-gic-v3.c        | 141 ++++++++++++++++++++++++++++++++++
 include/linux/interrupt.h           |   1 +
 5 files changed, 209 insertions(+)

diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index 6ee27ec..935511f 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -124,6 +124,11 @@ static inline void gic_write_bpr1(u32 val)
write_sysreg_s(val, SYS_ICC_BPR1_EL1);
}

+static inline u32 gic_read_rpr(void)
+{
+ return read_sysreg_s(SYS_ICC_RPR_EL1);
+}
+
#define gic_read_typer(c) readq_relaxed(c)
#define gic_write_irouter(v, c) writeq_relaxed(v, c)
#define gic_read_lpir(c) readq_relaxed(c)
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 3d5d443..d25e7ee 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -217,6 +217,12 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
!(ARCH_FLAGS_GET_PMR(flags) & ICC_PMR_EL1_EN_BIT);
}

+/* Mask IRQs at CPU level instead of GIC level */
+static inline void arch_irqs_daif_disable(void)
+{
+ asm volatile ("msr daifset, #2" : : : "memory");
+}
+
void maybe_switch_to_sysreg_gic_cpuif(void);

#endif /* CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index a7f753f..a52d5f8 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -395,6 +395,18 @@ alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
mov sp, x19
.endm

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /* Should be checked on return from irq handlers */
+ .macro branch_if_was_nmi, tmp, target
+ alternative_if ARM64_HAS_SYSREG_GIC_CPUIF
+ mrs \tmp, daif
+ alternative_else
+ mov \tmp, #0
+ alternative_endif
+ tbnz \tmp, #7, \target // Exiting an NMI
+ .endm
+#endif
+
/*
* These are the registers used in the syscall handler, and allow us to
* have in theory up to 7 arguments to a function - x0 to x6.
@@ -615,12 +627,30 @@ ENDPROC(el1_sync)
el1_irq:
kernel_entry 1
enable_da_f
+
#ifdef CONFIG_TRACE_IRQFLAGS
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ ldr x20, [sp, #S_PMR_SAVE]
+ /* Irqs were disabled, don't trace */
+ tbz x20, ICC_PMR_EL1_EN_SHIFT, 1f
+#endif
bl trace_hardirqs_off
+1:
#endif

irq_handler

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /*
+ * If IRQs were disabled via PMR, this exception was an NMI.
+ * We might have interrupted a context with interrupts disabled that
+ * had set the NEED_RESCHED flag.
+ * Skip preemption and IRQ tracing if needed.
+ */
+ tbz x20, ICC_PMR_EL1_EN_SHIFT, untraced_irq_exit
+ branch_if_was_nmi x0, skip_preempt
+#endif
+
#ifdef CONFIG_PREEMPT
ldr w24, [tsk, #TSK_TI_PREEMPT] // get preempt count
cbnz w24, 1f // preempt count != 0
@@ -629,9 +659,13 @@ el1_irq:
bl el1_preempt
1:
#endif
+
+skip_preempt:
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_on
#endif
+
+untraced_irq_exit:
kernel_exit 1
ENDPROC(el1_irq)

@@ -862,6 +896,11 @@ el0_irq_naked:
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_on
#endif
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ branch_if_was_nmi x2, nmi_ret_to_user
+#endif
+
b ret_to_user
ENDPROC(el0_irq)

@@ -1162,8 +1201,15 @@ ENTRY(cpu_switch_to)
ldp x27, x28, [x8], #16
ldp x29, x9, [x8], #16
ldr lr, [x8]
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ mrs x10, daif
+ msr daifset, #2
+#endif
mov sp, x9
msr sp_el0, x1
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ msr daif, x10
+#endif
ret
ENDPROC(cpu_switch_to)
NOKPROBE(cpu_switch_to)
@@ -1357,3 +1403,13 @@ alternative_else_nop_endif
ENDPROC(__sdei_asm_handler)
NOKPROBE(__sdei_asm_handler)
#endif /* CONFIG_ARM_SDE_INTERFACE */
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * NMI return path to EL0
+ */
+nmi_ret_to_user:
+ ldr x1, [tsk, #TSK_TI_FLAGS]
+ b finish_ret_to_user
+ENDPROC(nmi_ret_to_user)
+#endif
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 3c44918..6d25ead 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -34,6 +34,8 @@
#include <linux/irqchip/arm-gic-v3.h>
#include <linux/irqchip/irq-partition-percpu.h>

+#include <trace/events/irq.h>
+
#include <asm/cputype.h>
#include <asm/exception.h>
#include <asm/smp_plat.h>
@@ -41,6 +43,8 @@

#include "irq-gic-common.h"

+#define GICD_INT_NMI_PRI 0xa0
+
struct redist_region {
void __iomem *redist_base;
phys_addr_t phys_base;
@@ -247,6 +251,87 @@ static void gic_unmask_irq(struct irq_data *d)
gic_poke_irq(d, GICD_ISENABLER);
}

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * Chip flow handler for SPIs set as NMI
+ */
+static void handle_fasteoi_nmi(struct irq_desc *desc)
+{
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct irqaction *action = desc->action;
+ unsigned int irq = irq_desc_get_irq(desc);
+ irqreturn_t res;
+
+ if (chip->irq_ack)
+ chip->irq_ack(&desc->irq_data);
+
+ trace_irq_handler_entry(irq, action);
+ res = action->handler(irq, action->dev_id);
+ trace_irq_handler_exit(irq, action, res);
+
+ if (chip->irq_eoi)
+ chip->irq_eoi(&desc->irq_data);
+}
+
+/*
+ * Chip flow handler for PPIs set as NMI
+ */
+static void handle_percpu_devid_nmi(struct irq_desc *desc)
+{
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct irqaction *action = desc->action;
+ unsigned int irq = irq_desc_get_irq(desc);
+ irqreturn_t res;
+
+ if (chip->irq_ack)
+ chip->irq_ack(&desc->irq_data);
+
+ trace_irq_handler_entry(irq, action);
+ res = action->handler(irq, raw_cpu_ptr(action->percpu_dev_id));
+ trace_irq_handler_exit(irq, action, res);
+
+ if (chip->irq_eoi)
+ chip->irq_eoi(&desc->irq_data);
+}
+
+static int gic_irq_set_irqchip_prio(struct irq_data *d, bool val)
+{
+ u8 prio;
+ irq_flow_handler_t handler;
+
+ if (gic_peek_irq(d, GICD_ISENABLER)) {
+ pr_err("Cannot set NMI property of enabled IRQ %u\n", d->irq);
+ return -EPERM;
+ }
+
+ if (val) {
+ prio = GICD_INT_NMI_PRI;
+
+ if (gic_irq(d) < 32)
+ handler = handle_percpu_devid_nmi;
+ else
+ handler = handle_fasteoi_nmi;
+ } else {
+ prio = GICD_INT_DEF_PRI;
+
+ if (gic_irq(d) < 32)
+ handler = handle_percpu_devid_irq;
+ else
+ handler = handle_fasteoi_irq;
+ }
+
+ /*
+ * Already in a locked context for the desc from calling
+ * irq_set_irqchip_state().
+ * It should be safe to simply modify the handler.
+ */
+ irq_to_desc(d->irq)->handle_irq = handler;
+ gic_set_irq_prio(gic_irq(d), gic_dist_base(d), prio);
+
+ return 0;
+}
+#endif
+
static int gic_irq_set_irqchip_state(struct irq_data *d,
enum irqchip_irq_state which, bool val)
{
@@ -268,6 +353,18 @@ static int gic_irq_set_irqchip_state(struct irq_data *d,
reg = val ? GICD_ICENABLER : GICD_ISENABLER;
break;

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ case IRQCHIP_STATE_NMI:
+ if (static_branch_likely(&have_non_secure_prio_view)) {
+ return gic_irq_set_irqchip_prio(d, val);
+ } else if (val) {
+ pr_warn("Failed to set IRQ %u as NMI, NMIs are unsupported\n",
+ gic_irq(d));
+ return -EINVAL;
+ }
+ return 0;
+#endif
+
default:
return -EINVAL;
}
@@ -295,6 +392,13 @@ static int gic_irq_get_irqchip_state(struct irq_data *d,
*val = !gic_peek_irq(d, GICD_ISENABLER);
break;

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ case IRQCHIP_STATE_NMI:
+ *val = (gic_get_irq_prio(gic_irq(d), gic_dist_base(d)) ==
+ GICD_INT_NMI_PRI);
+ break;
+#endif
+
default:
return -EINVAL;
}
@@ -365,6 +469,22 @@ static u64 gic_mpidr_to_affinity(unsigned long mpidr)
return aff;
}

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+static void do_handle_nmi(unsigned int hwirq, struct pt_regs *regs)
+{
+ struct pt_regs *old_regs = set_irq_regs(regs);
+ unsigned int irq;
+
+ nmi_enter();
+
+ irq = irq_find_mapping(gic_data.domain, hwirq);
+ generic_handle_irq(irq);
+
+ nmi_exit();
+ set_irq_regs(old_regs);
+}
+#endif
+
static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
{
u32 irqnr;
@@ -380,6 +500,25 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
int err;

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ if (static_branch_likely(&have_non_secure_prio_view)
+ && unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
+ /*
+ * We need to prevent other NMIs from occurring even after a
+ * priority drop.
+ * We keep the I flag set until PSTATE is restored by
+ * kernel_exit.
+ */
+ arch_irqs_daif_disable();
+
+ if (static_branch_likely(&supports_deactivate_key))
+ gic_write_eoir(irqnr);
+
+ do_handle_nmi(irqnr, regs);
+ return;
+ }
+#endif
+
if (static_branch_likely(&supports_deactivate_key))
gic_write_eoir(irqnr);
else {
@@ -1177,6 +1316,8 @@ static int __init gic_init_bases(void __iomem *dist_base,
#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
if (!gic_cpu_has_group0() || gic_dist_security_disabled())
static_branch_enable(&have_non_secure_prio_view);
+ else
+ pr_warn("SCR_EL3.FIQ set, cannot enable use of pseudo-NMIs\n");
#endif

return 0;
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5426627..02c794f 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -419,6 +419,7 @@ enum irqchip_irq_state {
IRQCHIP_STATE_ACTIVE, /* Is interrupt in progress? */
IRQCHIP_STATE_MASKED, /* Is interrupt masked? */
IRQCHIP_STATE_LINE_LEVEL, /* Is IRQ line high? */
+ IRQCHIP_STATE_NMI, /* Is IRQ an NMI? */
};

extern int irq_get_irqchip_state(unsigned int irq, enum irqchip_irq_state which,
--
1.9.1

2018-05-21 11:37:02

by Julien Thierry

Subject: [PATCH v3 4/6] irqchip/gic: Add functions to access irq priorities
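
These one-byte accessors are used later in the series to read and switch an
interrupt's priority, e.g. (with names from the pseudo-NMI patch):

	gic_set_irq_prio(gic_irq(d), gic_dist_base(d), GICD_INT_NMI_PRI);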

Signed-off-by: Julien Thierry <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
 drivers/irqchip/irq-gic-common.c | 10 ++++++++++
 drivers/irqchip/irq-gic-common.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
index 01e673c..910746f 100644
--- a/drivers/irqchip/irq-gic-common.c
+++ b/drivers/irqchip/irq-gic-common.c
@@ -98,6 +98,16 @@ int gic_configure_irq(unsigned int irq, unsigned int type,
return ret;
}

+void gic_set_irq_prio(unsigned int irq, void __iomem *base, u8 prio)
+{
+ writeb_relaxed(prio, base + GIC_DIST_PRI + irq);
+}
+
+u8 gic_get_irq_prio(unsigned int irq, void __iomem *base)
+{
+ return readb_relaxed(base + GIC_DIST_PRI + irq);
+}
+
void gic_dist_config(void __iomem *base, int gic_irqs,
void (*sync_access)(void))
{
diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h
index 3919cd7..1586dbd 100644
--- a/drivers/irqchip/irq-gic-common.h
+++ b/drivers/irqchip/irq-gic-common.h
@@ -35,6 +35,8 @@ void gic_dist_config(void __iomem *base, int gic_irqs,
void gic_cpu_config(void __iomem *base, void (*sync_access)(void));
void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
void *data);
+void gic_set_irq_prio(unsigned int irq, void __iomem *base, u8 prio);
+u8 gic_get_irq_prio(unsigned int irq, void __iomem *base);

void gic_set_kvm_info(const struct gic_kvm_info *info);

--
1.9.1

2018-05-21 11:39:45

by Julien Thierry

Subject: [PATCH v3 2/6] arm64: alternative: Apply alternatives early in boot process

From: Daniel Thompson <[email protected]>

Currently alternatives are applied very late in the boot process (and
a long time after we enable scheduling). Some alternative sequences,
such as those that alter the way CPU context is stored, must be applied
much earlier in the boot sequence.

Introduce apply_boot_alternatives() to allow some alternatives to be
applied immediately after we detect the CPU features of the boot CPU.

Since alternatives are now applied at different times, provide a function
to check, for a given feature, whether its alternatives have been applied.
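
The intended boot-time ordering then becomes (as wired up in
smp_prepare_boot_cpu() below):

	cpuinfo_store_boot_cpu();	/* detect boot CPU features */
	apply_boot_alternatives();	/* patch SCOPE_BOOT_CPU alternatives */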

Signed-off-by: Daniel Thompson <[email protected]>
[[email protected]: rename to fit new cpufeature framework better,
apply BOOT_SCOPE feature early in boot,
add per feature alternative checking]
Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
---
 arch/arm64/include/asm/alternative.h |  5 +++--
 arch/arm64/include/asm/cpufeature.h  |  2 ++
 arch/arm64/kernel/alternative.c      | 39 +++++++++++++++++++++++++++++++++---
 arch/arm64/kernel/cpufeature.c       |  7 ++++++-
 arch/arm64/kernel/smp.c              |  7 +++++++
 5 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index a91933b..36b0703 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -14,8 +14,6 @@
#include <linux/stddef.h>
#include <linux/stringify.h>

-extern int alternatives_applied;
-
struct alt_instr {
s32 orig_offset; /* offset to original instruction */
s32 alt_offset; /* offset to replacement instruction */
@@ -27,9 +25,12 @@ struct alt_instr {
typedef void (*alternative_cb_t)(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst);

+void __init apply_boot_alternatives(void);
void __init apply_alternatives_all(void);
void apply_alternatives(void *start, size_t length);

+bool feature_alternatives_applied(u16 cpufeature);
+
#define ALTINSTR_ENTRY(feature,cb) \
" .word 661b - .\n" /* label */ \
" .if " __stringify(cb) " == 0\n" \
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 09b0f2a..19efe4e 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -359,6 +359,8 @@ static inline int cpucap_default_scope(const struct arm64_cpu_capabilities *cap)
extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
extern struct static_key_false arm64_const_caps_ready;

+extern unsigned long boot_capabilities;
+
bool this_cpu_has_cap(unsigned int cap);

static inline bool cpu_have_feature(unsigned int num)
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 5c4bce4..29885eb 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -34,6 +34,8 @@

int alternatives_applied;

+DECLARE_BITMAP(alternatives_status, ARM64_NCAPS);
+
struct alt_region {
struct alt_instr *begin;
struct alt_instr *end;
@@ -122,7 +124,8 @@ static void patch_alternative(struct alt_instr *alt,
}
}

-static void __apply_alternatives(void *alt_region, bool use_linear_alias)
+static void __apply_alternatives(void *alt_region, bool use_linear_alias,
+ unsigned long feature_mask)
{
struct alt_instr *alt;
struct alt_region *region = alt_region;
@@ -132,6 +135,9 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
for (alt = region->begin; alt < region->end; alt++) {
int nr_inst;

+ if ((BIT(alt->cpufeature) & feature_mask) == 0)
+ continue;
+
/* Use ARM64_CB_PATCH as an unconditional patch */
if (alt->cpufeature < ARM64_CB_PATCH &&
!cpus_have_cap(alt->cpufeature))
@@ -142,6 +148,8 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
else
BUG_ON(alt->alt_len != alt->orig_len);

+ __set_bit(alt->cpufeature, alternatives_status);
+
pr_info_once("patching kernel code\n");

origptr = ALT_ORIG_PTR(alt);
@@ -178,7 +186,9 @@ static int __apply_alternatives_multi_stop(void *unused)
isb();
} else {
BUG_ON(alternatives_applied);
- __apply_alternatives(&region, true);
+
+ __apply_alternatives(&region, true, ~boot_capabilities);
+
/* Barriers provided by the cache flushing */
WRITE_ONCE(alternatives_applied, 1);
}
@@ -192,6 +202,24 @@ void __init apply_alternatives_all(void)
stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);
}

+/*
+ * This is called very early in the boot process (directly after we run
+ * a feature detect on the boot CPU). No need to worry about other CPUs
+ * here.
+ */
+void __init apply_boot_alternatives(void)
+{
+ struct alt_region region = {
+ .begin = (struct alt_instr *)__alt_instructions,
+ .end = (struct alt_instr *)__alt_instructions_end,
+ };
+
+ /* If called on non-boot cpu things could go wrong */
+ WARN_ON(smp_processor_id() != 0);
+
+ __apply_alternatives(&region, true, boot_capabilities);
+}
+
void apply_alternatives(void *start, size_t length)
{
struct alt_region region = {
@@ -199,5 +227,10 @@ void apply_alternatives(void *start, size_t length)
.end = start + length,
};

- __apply_alternatives(&region, false);
+ __apply_alternatives(&region, false, -1);
+}
+
+bool feature_alternatives_applied(u16 cpufeature)
+{
+ return test_bit(cpufeature, alternatives_status);
}
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e03e897..021ae87 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -52,6 +52,8 @@
DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
EXPORT_SYMBOL(cpu_hwcaps);

+unsigned long boot_capabilities;
+
/*
* Flag to indicate if we have computed the system wide
* capabilities based on the boot time active CPUs. This
@@ -1021,7 +1023,7 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
* that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
* do anything here.
*/
- if (!alternatives_applied)
+ if (!feature_alternatives_applied(ARM64_HAS_VIRT_HOST_EXTN))
write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
}
#endif
@@ -1346,6 +1348,9 @@ static void __update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
if (!cpus_have_cap(caps->capability) && caps->desc)
pr_info("%s %s\n", info, caps->desc);
cpus_set_cap(caps->capability);
+
+ if (scope_mask & SCOPE_BOOT_CPU)
+ __set_bit(caps->capability, &boot_capabilities);
}
}

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index f3e2e3a..b7fb909 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -410,6 +410,13 @@ void __init smp_prepare_boot_cpu(void)
*/
jump_label_init();
cpuinfo_store_boot_cpu();
+
+ /*
+ * We now know enough about the boot CPU to apply the
+ * alternatives that cannot wait until interrupt handling
+ * and/or scheduling is enabled.
+ */
+ apply_boot_alternatives();
}

static u64 __init of_get_cpu_mpidr(struct device_node *dn)
--
1.9.1

2018-05-21 11:39:56

by Julien Thierry

Subject: [PATCH v3 1/6] arm64: cpufeature: Allow early detect of specific features

From: Daniel Thompson <[email protected]>

Currently it is not possible to detect features of the boot CPU
until the other CPUs have been brought up.

This prevents us from reacting to features of the boot CPU until
fairly late in the boot process. To solve this we allow a subset
of features (that are likely to be common to all clusters) to be
detected based on the boot CPU alone.

Signed-off-by: Daniel Thompson <[email protected]>
[[email protected]: check non-boot cpu missing early features, avoid
duplicates between early features and normal
features]
Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
---
arch/arm64/kernel/cpufeature.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9d1b06d..e03e897 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1030,7 +1030,7 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
{
.desc = "GIC system register CPU interface",
.capability = ARM64_HAS_SYSREG_GIC_CPUIF,
- .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
.matches = has_useable_gicv3_cpuif,
.sys_reg = SYS_ID_AA64PFR0_EL1,
.field_pos = ID_AA64PFR0_GIC_SHIFT,
--
1.9.1

2018-05-21 11:40:11

by Julien Thierry

Subject: [PATCH v3 3/6] arm64: irqflags: Use ICC sysregs to implement IRQ masking

From: Daniel Thompson <[email protected]>

Currently irqflags is implemented using the PSR's I bit. It is possible
to implement irqflags by using the co-processor interface to the GIC.
Using the co-processor interface makes it feasible to simulate NMIs
using GIC interrupt prioritization.

This patch changes the irqflags macros to modify, save and restore
ICC_PMR_EL1. This has a substantial knock-on effect for the rest of
the kernel. There are four reasons for this:

1. The state of the PMR becomes part of the interrupt context and must be
saved and restored during exceptions. It is saved on the stack as part
of the saved context when an interrupt/exception is taken.

2. The hardware automatically sets the I bit (at boot, during traps, etc.),
masking interrupts. When the I bit is set by hardware we must add code
to switch from I bit masking to PMR masking:
- For IRQs, this is done after the interrupt has been acknowledged
avoiding the need to unmask.
- For other exceptions, this is done right after saving the context.

3. Some instructions, such as wfi, require that the PMR not be used
for interrupt masking. Before calling these instructions we must
switch from PMR masking to I bit masking.
This is also the case when KVM runs a guest: if the CPU receives
an interrupt from the host, interrupts must not be masked in PMR,
otherwise the GIC will not signal it to the CPU.

4. We use the alternatives system to allow a single kernel to boot and
be switched to the alternative masking approach at runtime.
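
To make the PMR-based masking concrete, with the values used in this series,
"interrupts disabled" amounts to clearing a single enable bit in PMR:

	ICC_PMR_EL1_UNMASKED = 0xf0
	ICC_PMR_EL1_MASKED   = 0xf0 ^ 0x40 = 0xb0

The GIC only signals interrupts whose priority is strictly below PMR, so a
normal interrupt at 0xc0 is now masked while a pseudo-NMI at 0xa0 (added in
the last patch) is still delivered.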

Signed-off-by: Daniel Thompson <[email protected]>
[[email protected]: changes reflected in commit,
message, fixes, renaming]
Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: James Morse <[email protected]>
---
 arch/arm64/Kconfig                     |  15 ++++
 arch/arm64/include/asm/arch_gicv3.h    |  20 ++++++
 arch/arm64/include/asm/assembler.h     |  25 ++++++-
 arch/arm64/include/asm/daifflags.h     |  36 +++++++---
 arch/arm64/include/asm/efi.h           |   5 ++
 arch/arm64/include/asm/irqflags.h      | 125 +++++++++++++++++++++++++++++
 arch/arm64/include/asm/kvm_host.h      |  14 ++++
 arch/arm64/include/asm/processor.h     |   4 ++
 arch/arm64/include/asm/ptrace.h        |  14 +++-
 arch/arm64/kernel/asm-offsets.c        |   1 +
 arch/arm64/kernel/entry.S              |  28 ++++++--
 arch/arm64/kernel/head.S               |  37 ++++++++++
 arch/arm64/kernel/process.c            |   6 ++
 arch/arm64/kernel/smp.c                |   8 +++
 arch/arm64/kvm/hyp/switch.c            |  25 +++++++
 arch/arm64/mm/fault.c                  |   5 +-
 arch/arm64/mm/proc.S                   |  23 ++++++
 drivers/irqchip/irq-gic-v3-its.c       |   2 +-
 drivers/irqchip/irq-gic-v3.c           |  82 +++++++++++----------
 include/linux/irqchip/arm-gic-common.h |   6 ++
 include/linux/irqchip/arm-gic.h        |   5 --
 21 files changed, 423 insertions(+), 63 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index eb2cf49..ab214b9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -938,6 +938,21 @@ config HARDEN_EL2_VECTORS

If unsure, say Y.

+config USE_ICC_SYSREGS_FOR_IRQFLAGS
+ bool "Use ICC system registers for IRQ masking"
+ select ARM_GIC_V3
+ help
+ Using the ICC system registers for IRQ masking makes it possible
+ to simulate NMI on ARM64 systems. This allows several interesting
+ features (especially debug features) to be used on these systems.
+
+ Say Y here to implement IRQ masking using ICC system
+ registers when the GIC System Registers are available. The changes
+ are applied dynamically using the alternatives system so it is safe
+ to enable this option on systems with older interrupt controllers.
+
+ If unsure, say N.
+
menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on COMPAT
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index e278f94..6ee27ec 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -76,6 +76,16 @@ static inline u64 gic_read_iar_cavium_thunderx(void)
return irqstat;
}

+static inline u32 gic_read_pmr(void)
+{
+ return read_sysreg_s(SYS_ICC_PMR_EL1);
+}
+
+static inline void gic_write_pmr(u32 val)
+{
+ write_sysreg_s(val, SYS_ICC_PMR_EL1);
+}
+
static inline void gic_write_ctlr(u32 val)
{
write_sysreg_s(val, SYS_ICC_CTLR_EL1);
@@ -140,5 +150,15 @@ static inline void gic_write_bpr1(u32 val)
#define gits_write_vpendbaser(v, c) writeq_relaxed(v, c)
#define gits_read_vpendbaser(c) readq_relaxed(c)

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+static inline void gic_start_pmr_masking(void)
+{
+ if (cpus_have_const_cap(ARM64_HAS_SYSREG_GIC_CPUIF)) {
+ gic_write_pmr(ICC_PMR_EL1_MASKED);
+ asm volatile ("msr daifclr, #2" : : : "memory");
+ }
+}
+#endif
+
#endif /* __ASSEMBLY__ */
#endif /* __ASM_ARCH_GICV3_H */
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 0bcc98d..9da68d2 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -23,6 +23,7 @@
#ifndef __ASM_ASSEMBLER_H
#define __ASM_ASSEMBLER_H

+#include <asm/alternative.h>
#include <asm/asm-offsets.h>
#include <asm/cpufeature.h>
#include <asm/debug-monitors.h>
@@ -62,12 +63,32 @@
/*
* Enable and disable interrupts.
*/
- .macro disable_irq
+ .macro disable_irq, tmp
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ mov \tmp, #ICC_PMR_EL1_MASKED
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
msr daifset, #2
+alternative_else
+ msr_s SYS_ICC_PMR_EL1, \tmp
+alternative_endif
+#else
+ msr daifset, #2
+#endif
.endm

- .macro enable_irq
+ .macro enable_irq, tmp
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ mov \tmp, #ICC_PMR_EL1_UNMASKED
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
msr daifclr, #2
+ nop
+alternative_else
+ msr_s SYS_ICC_PMR_EL1, \tmp
+ dsb sy
+alternative_endif
+#else
+ msr daifclr, #2
+#endif
.endm

.macro save_and_disable_irq, flags
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index 22e4c83..ba85822 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -18,9 +18,24 @@

#include <linux/irqflags.h>

+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+
#define DAIF_PROCCTX 0
#define DAIF_PROCCTX_NOIRQ PSR_I_BIT

+#else
+
+#define DAIF_PROCCTX \
+ (cpus_have_const_cap(ARM64_HAS_SYSREG_GIC_CPUIF) ? \
+ MAKE_ARCH_FLAGS(0, ICC_PMR_EL1_UNMASKED) : \
+ 0)
+
+#define DAIF_PROCCTX_NOIRQ \
+ (cpus_have_const_cap(ARM64_HAS_SYSREG_GIC_CPUIF) ? \
+ MAKE_ARCH_FLAGS(0, ICC_PMR_EL1_MASKED) : \
+ PSR_I_BIT)
+#endif
+
/* mask/save/unmask/restore all exceptions, including interrupts. */
static inline void local_daif_mask(void)
{
@@ -36,11 +51,8 @@ static inline unsigned long local_daif_save(void)
{
unsigned long flags;

- asm volatile(
- "mrs %0, daif // local_daif_save\n"
- : "=r" (flags)
- :
- : "memory");
+ flags = arch_local_save_flags();
+
local_daif_mask();

return flags;
@@ -54,17 +66,21 @@ static inline void local_daif_unmask(void)
:
:
: "memory");
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /* Unmask IRQs in PMR if needed */
+ if (cpus_have_const_cap(ARM64_HAS_SYSREG_GIC_CPUIF))
+ arch_local_irq_enable();
+#endif
}

static inline void local_daif_restore(unsigned long flags)
{
if (!arch_irqs_disabled_flags(flags))
trace_hardirqs_on();
- asm volatile(
- "msr daif, %0 // local_daif_restore"
- :
- : "r" (flags)
- : "memory");
+
+ arch_local_irq_restore(flags);
+
if (arch_irqs_disabled_flags(flags))
trace_hardirqs_off();
}
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 192d791..2c50025 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -42,7 +42,12 @@

efi_status_t __efi_rt_asm_wrapper(void *, const char *, ...);

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+#define ARCH_EFI_IRQ_FLAGS_MASK \
+ (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT | ARCH_FLAG_PMR_EN)
+#else
#define ARCH_EFI_IRQ_FLAGS_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
+#endif

/* arch specific definitions used by the stub code */

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 24692ed..3d5d443 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -18,7 +18,10 @@

#ifdef __KERNEL__

+#include <asm/alternative.h>
+#include <asm/cpufeature.h>
#include <asm/ptrace.h>
+#include <asm/sysreg.h>

/*
* Aarch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and
@@ -33,6 +36,7 @@
* unmask it at all other times.
*/

+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
/*
* CPU interrupt mask handling.
*/
@@ -96,5 +100,126 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
{
return flags & PSR_I_BIT;
}
+
+static inline void maybe_switch_to_sysreg_gic_cpuif(void) {}
+
+#else /* CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS */
+
+#define ARCH_FLAG_PMR_EN 0x1
+
+#define MAKE_ARCH_FLAGS(daif, pmr) \
+ ((daif) | (((pmr) >> ICC_PMR_EL1_EN_SHIFT) & ARCH_FLAG_PMR_EN))
+
+#define ARCH_FLAGS_GET_PMR(flags) \
+ ((((flags) & ARCH_FLAG_PMR_EN) << ICC_PMR_EL1_EN_SHIFT) \
+ | ICC_PMR_EL1_MASKED)
+
+#define ARCH_FLAGS_GET_DAIF(flags) ((flags) & ~ARCH_FLAG_PMR_EN)
+
+/*
+ * CPU interrupt mask handling.
+ */
+static inline unsigned long arch_local_irq_save(void)
+{
+ unsigned long flags, masked = ICC_PMR_EL1_MASKED;
+ unsigned long pmr = 0;
+
+ asm volatile(ALTERNATIVE(
+ "mrs %0, daif // arch_local_irq_save\n"
+ "msr daifset, #2\n"
+ "mov %1, #" __stringify(ICC_PMR_EL1_UNMASKED),
+ /* --- */
+ "mrs %0, daif\n"
+ "mrs_s %1, " __stringify(SYS_ICC_PMR_EL1) "\n"
+ "msr_s " __stringify(SYS_ICC_PMR_EL1) ", %2",
+ ARM64_HAS_SYSREG_GIC_CPUIF)
+ : "=&r" (flags), "=&r" (pmr)
+ : "r" (masked)
+ : "memory");
+
+ return MAKE_ARCH_FLAGS(flags, pmr);
+}
+
+static inline void arch_local_irq_enable(void)
+{
+ unsigned long unmasked = ICC_PMR_EL1_UNMASKED;
+
+ asm volatile(ALTERNATIVE(
+ "msr daifclr, #2 // arch_local_irq_enable\n"
+ "nop",
+ "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0\n"
+ "dsb sy",
+ ARM64_HAS_SYSREG_GIC_CPUIF)
+ :
+ : "r" (unmasked)
+ : "memory");
+}
+
+static inline void arch_local_irq_disable(void)
+{
+ unsigned long masked = ICC_PMR_EL1_MASKED;
+
+ asm volatile(ALTERNATIVE(
+ "msr daifset, #2 // arch_local_irq_disable",
+ "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0",
+ ARM64_HAS_SYSREG_GIC_CPUIF)
+ :
+ : "r" (masked)
+ : "memory");
+}
+
+/*
+ * Save the current interrupt enable state.
+ */
+static inline unsigned long arch_local_save_flags(void)
+{
+ unsigned long flags;
+ unsigned long pmr = 0;
+
+ asm volatile(ALTERNATIVE(
+ "mrs %0, daif // arch_local_save_flags\n"
+ "mov %1, #" __stringify(ICC_PMR_EL1_UNMASKED),
+ "mrs %0, daif\n"
+ "mrs_s %1, " __stringify(SYS_ICC_PMR_EL1),
+ ARM64_HAS_SYSREG_GIC_CPUIF)
+ : "=r" (flags), "=r" (pmr)
+ :
+ : "memory");
+
+ return MAKE_ARCH_FLAGS(flags, pmr);
+}
+
+/*
+ * restore saved IRQ state
+ */
+static inline void arch_local_irq_restore(unsigned long flags)
+{
+ unsigned long pmr = ARCH_FLAGS_GET_PMR(flags);
+
+ flags = ARCH_FLAGS_GET_DAIF(flags);
+
+ asm volatile(ALTERNATIVE(
+ "msr daif, %0 // arch_local_irq_restore\n"
+ "nop\n"
+ "nop",
+ "msr daif, %0\n"
+ "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%1\n"
+ "dsb sy",
+ ARM64_HAS_SYSREG_GIC_CPUIF)
+ :
+ : "r" (flags), "r" (pmr)
+ : "memory");
+}
+
+static inline int arch_irqs_disabled_flags(unsigned long flags)
+{
+ return (ARCH_FLAGS_GET_DAIF(flags) & (PSR_I_BIT)) |
+ !(ARCH_FLAGS_GET_PMR(flags) & ICC_PMR_EL1_EN_BIT);
+}
+
+void maybe_switch_to_sysreg_gic_cpuif(void);
+
+#endif /* CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS */
+
#endif
#endif
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 469de8a..1882534 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -24,6 +24,7 @@

#include <linux/types.h>
#include <linux/kvm_types.h>
+#include <asm/arch_gicv3.h>
#include <asm/cpufeature.h>
#include <asm/daifflags.h>
#include <asm/fpsimd.h>
@@ -433,6 +434,19 @@ static inline void kvm_fpsimd_flush_cpu_state(void)
static inline void kvm_arm_vhe_guest_enter(void)
{
local_daif_mask();
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /*
+ * Having IRQs masked via PMR when entering the guest means the GIC
+ * will not signal the CPU of interrupts of lower priority, and the
+ * only way to get out will be via guest exceptions.
+ * Naturally, we want to avoid this.
+ */
+ if (cpus_have_const_cap(ARM64_HAS_SYSREG_GIC_CPUIF)) {
+ gic_write_pmr(ICC_PMR_EL1_UNMASKED);
+ dsb(sy);
+ }
+#endif
}

static inline void kvm_arm_vhe_guest_exit(void)
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 7675989..5d3bed7 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -163,6 +163,10 @@ static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
memset(regs, 0, sizeof(*regs));
forget_syscall(regs);
regs->pc = pc;
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /* Have IRQs enabled by default */
+ regs->pmr_save = ICC_PMR_EL1_UNMASKED;
+#endif
}

static inline void start_thread(struct pt_regs *regs, unsigned long pc,
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 6069d66..aa1e948 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -25,6 +25,12 @@
#define CurrentEL_EL1 (1 << 2)
#define CurrentEL_EL2 (2 << 2)

+/* PMR values used to mask/unmask interrupts */
+#define ICC_PMR_EL1_EN_SHIFT 6
+#define ICC_PMR_EL1_EN_BIT (1 << ICC_PMR_EL1_EN_SHIFT) // PMR IRQ enable
+#define ICC_PMR_EL1_UNMASKED 0xf0
+#define ICC_PMR_EL1_MASKED (ICC_PMR_EL1_UNMASKED ^ ICC_PMR_EL1_EN_BIT)
+
/* AArch32-specific ptrace requests */
#define COMPAT_PTRACE_GETREGS 12
#define COMPAT_PTRACE_SETREGS 13
@@ -136,7 +142,7 @@ struct pt_regs {
#endif

u64 orig_addr_limit;
- u64 unused; // maintain 16 byte alignment
+ u64 pmr_save;
u64 stackframe[2];
};

@@ -171,8 +177,14 @@ static inline void forget_syscall(struct pt_regs *regs)
#define processor_mode(regs) \
((regs)->pstate & PSR_MODE_MASK)

+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
#define interrupts_enabled(regs) \
(!((regs)->pstate & PSR_I_BIT))
+#else
+#define interrupts_enabled(regs) \
+ ((!((regs)->pstate & PSR_I_BIT)) && \
+ ((regs)->pmr_save & ICC_PMR_EL1_EN_BIT))
+#endif

#define fast_interrupts_enabled(regs) \
(!((regs)->pstate & PSR_F_BIT))
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 5bdda65..1f6a0a9 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -78,6 +78,7 @@ int main(void)
DEFINE(S_ORIG_X0, offsetof(struct pt_regs, orig_x0));
DEFINE(S_SYSCALLNO, offsetof(struct pt_regs, syscallno));
DEFINE(S_ORIG_ADDR_LIMIT, offsetof(struct pt_regs, orig_addr_limit));
+ DEFINE(S_PMR_SAVE, offsetof(struct pt_regs, pmr_save));
DEFINE(S_STACKFRAME, offsetof(struct pt_regs, stackframe));
DEFINE(S_FRAME_SIZE, sizeof(struct pt_regs));
BLANK();
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ec2ee72..a7f753f 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -20,6 +20,7 @@

#include <linux/init.h>
#include <linux/linkage.h>
+#include <linux/irqchip/arm-gic-v3.h>

#include <asm/alternative.h>
#include <asm/assembler.h>
@@ -230,6 +231,16 @@ alternative_else_nop_endif
msr sp_el0, tsk
.endif

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /* Save pmr */
+alternative_if ARM64_HAS_SYSREG_GIC_CPUIF
+ mrs_s x20, SYS_ICC_PMR_EL1
+alternative_else
+ mov x20, #ICC_PMR_EL1_UNMASKED
+alternative_endif
+ str x20, [sp, #S_PMR_SAVE]
+#endif
+
/*
* Registers that may be useful after this macro is invoked:
*
@@ -240,9 +251,9 @@ alternative_else_nop_endif
.endm

.macro kernel_exit, el
- .if \el != 0
disable_daif

+ .if \el != 0
/* Restore the task's original addr_limit. */
ldr x20, [sp, #S_ORIG_ADDR_LIMIT]
str x20, [tsk, #TSK_TI_ADDR_LIMIT]
@@ -250,6 +261,15 @@ alternative_else_nop_endif
/* No need to restore UAO, it will be restored from SPSR_EL1 */
.endif

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /* Restore pmr, ensuring IRQs are off before restoring context. */
+alternative_if ARM64_HAS_SYSREG_GIC_CPUIF
+ ldr x20, [sp, #S_PMR_SAVE]
+ msr_s SYS_ICC_PMR_EL1, x20
+ dsb sy
+alternative_else_nop_endif
+#endif
+
ldp x21, x22, [sp, #S_PC] // load ELR, SPSR
.if \el == 0
ct_user_enter
@@ -872,7 +892,7 @@ ENDPROC(el0_error)
* and this includes saving x0 back into the kernel stack.
*/
ret_fast_syscall:
- disable_daif
+ disable_irq x21 // disable interrupts
str x0, [sp, #S_X0] // returned x0
ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for syscall tracing
and x2, x1, #_TIF_SYSCALL_WORK
@@ -882,7 +902,7 @@ ret_fast_syscall:
enable_step_tsk x1, x2
kernel_exit 0
ret_fast_syscall_trace:
- enable_daif
+ enable_irq x0 // enable interrupts
b __sys_trace_return_skipped // we already saved x0

/*
@@ -900,7 +920,7 @@ work_pending:
* "slow" syscall return path.
*/
ret_to_user:
- disable_daif
+ disable_irq x21 // disable interrupts
ldr x1, [tsk, #TSK_TI_FLAGS]
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b085306..47b2785 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -647,6 +647,43 @@ set_cpu_boot_mode_flag:
ret
ENDPROC(set_cpu_boot_mode_flag)

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * void maybe_switch_to_sysreg_gic_cpuif(void)
+ *
+ * Enable interrupt controller system register access if this feature
+ * has been detected by the alternatives system.
+ *
+ * Before we jump into generic code we must enable interrupt controller system
+ * register access because this is required by the irqflags macros. We must
+ * also mask interrupts at the PMR and unmask them within the PSR. That leaves
+ * us set up and ready for the kernel to make its first call to
+ * arch_local_irq_enable().
+ *
+ */
+ENTRY(maybe_switch_to_sysreg_gic_cpuif)
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+ b 1f
+alternative_else
+ mrs_s x0, SYS_ICC_SRE_EL1
+alternative_endif
+ orr x0, x0, #1
+ msr_s SYS_ICC_SRE_EL1, x0 // Set ICC_SRE_EL1.SRE==1
+ isb // Make sure SRE is now set
+ mrs x0, daif
+ tbz x0, #7, no_mask_pmr // Are interrupts on?
+ mov x0, ICC_PMR_EL1_MASKED
+ msr_s SYS_ICC_PMR_EL1, x0 // Prepare for unmask of I bit
+ msr daifclr, #2 // Clear the I bit
+ b 1f
+no_mask_pmr:
+ mov x0, ICC_PMR_EL1_UNMASKED
+ msr_s SYS_ICC_PMR_EL1, x0
+1:
+ ret
+ENDPROC(maybe_switch_to_sysreg_gic_cpuif)
+#endif
+
/*
* These values are written with the MMU off, but read with the MMU on.
* Writers will invalidate the corresponding address, discarding up to a
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index f08a2ed..0be3d25 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -65,6 +65,8 @@
EXPORT_SYMBOL(__stack_chk_guard);
#endif

+#include <asm/arch_gicv3.h>
+
/*
* Function pointers to optional machine specific functions
*/
@@ -230,6 +232,7 @@ void __show_regs(struct pt_regs *regs)
}

printk("sp : %016llx\n", sp);
+ printk("pmr_save: %08llx\n", regs->pmr_save);

i = top_reg;

@@ -355,6 +358,9 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
} else {
memset(childregs, 0, sizeof(struct pt_regs));
childregs->pstate = PSR_MODE_EL1h;
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ childregs->pmr_save = ICC_PMR_EL1_UNMASKED;
+#endif
if (IS_ENABLED(CONFIG_ARM64_UAO) &&
cpus_have_const_cap(ARM64_HAS_UAO))
childregs->pstate |= PSR_UAO_BIT;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index b7fb909..6aa21fd 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -185,6 +185,8 @@ asmlinkage void secondary_start_kernel(void)
struct mm_struct *mm = &init_mm;
unsigned int cpu;

+ maybe_switch_to_sysreg_gic_cpuif();
+
cpu = task_cpu(current);
set_my_cpu_offset(per_cpu_offset(cpu));

@@ -417,6 +419,12 @@ void __init smp_prepare_boot_cpu(void)
* and/or scheduling is enabled.
*/
apply_boot_alternatives();
+
+ /*
+ * Conditionally switch to GIC PMR for interrupt masking (this
+ * will be a nop if we are using normal interrupt masking)
+ */
+ maybe_switch_to_sysreg_gic_cpuif();
}

static u64 __init of_get_cpu_mpidr(struct device_node *dn)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index d964523..2c0f453 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -21,6 +21,9 @@

#include <kvm/arm_psci.h>

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+#include <asm/arch_gicv3.h>
+#endif
#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h>
@@ -442,6 +445,23 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
struct kvm_cpu_context *guest_ctxt;
bool fp_enabled;
u64 exit_code;
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ u32 host_pmr = ICC_PMR_EL1_UNMASKED;
+#endif
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /*
+ * Having IRQs masked via PMR when entering the guest means the GIC
+ * will not signal the CPU of interrupts of lower priority, and the
+ * only way to get out will be via guest exceptions.
+ * Naturally, we want to avoid this.
+ */
+ if (cpus_have_const_cap(ARM64_HAS_SYSREG_GIC_CPUIF)) {
+ host_pmr = gic_read_pmr();
+ gic_write_pmr(ICC_PMR_EL1_UNMASKED);
+ dsb(sy);
+ }
+#endif

vcpu = kern_hyp_va(vcpu);

@@ -496,6 +516,11 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
*/
__debug_switch_to_host(vcpu);

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ if (cpus_have_const_cap(ARM64_HAS_SYSREG_GIC_CPUIF))
+ gic_write_pmr(host_pmr);
+#endif
+
return exit_code;
}

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 4165485..7a18634 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -37,6 +37,7 @@
#include <asm/cmpxchg.h>
#include <asm/cpufeature.h>
#include <asm/exception.h>
+#include <asm/daifflags.h>
#include <asm/debug-monitors.h>
#include <asm/esr.h>
#include <asm/sysreg.h>
@@ -712,7 +713,7 @@ asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr,
if (addr > TASK_SIZE)
arm64_apply_bp_hardening();

- local_irq_enable();
+ local_daif_restore(DAIF_PROCCTX);
do_mem_abort(addr, esr, regs);
}

@@ -726,7 +727,7 @@ asmlinkage void __exception do_sp_pc_abort(unsigned long addr,
if (user_mode(regs)) {
if (instruction_pointer(regs) > TASK_SIZE)
arm64_apply_bp_hardening();
- local_irq_enable();
+ local_daif_restore(DAIF_PROCCTX);
}

info.si_signo = SIGBUS;
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 5f9a73a..7e74f06 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -20,6 +20,7 @@

#include <linux/init.h>
#include <linux/linkage.h>
+#include <linux/irqchip/arm-gic-v3.h>
#include <asm/assembler.h>
#include <asm/asm-offsets.h>
#include <asm/hwcap.h>
@@ -53,11 +54,33 @@
* cpu_do_idle()
*
* Idle the processor (wait for interrupt).
+ *
+ * If CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS is set we must do additional
+ * work to ensure that interrupts are not masked at the PMR (because the
+ * core will not wake up if we block the wake up signal in the interrupt
+ * controller).
*/
ENTRY(cpu_do_idle)
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+#endif
+ dsb sy // WFI may enter a low-power mode
+ wfi
+ ret
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_else
+ mrs x0, daif // save I bit
+ msr daifset, #2 // set I bit
+ mrs_s x1, SYS_ICC_PMR_EL1 // save PMR
+alternative_endif
+ mov x2, #ICC_PMR_EL1_UNMASKED
+ msr_s SYS_ICC_PMR_EL1, x2 // unmask at PMR
dsb sy // WFI may enter a low-power mode
wfi
+ msr_s SYS_ICC_PMR_EL1, x1 // restore PMR
+ msr daif, x0 // restore I bit
ret
+#endif
ENDPROC(cpu_do_idle)

#ifdef CONFIG_CPU_PM
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 5416f2b..9ac146c 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -62,7 +62,7 @@
#define LPI_PROPBASE_SZ ALIGN(BIT(LPI_NRBITS), SZ_64K)
#define LPI_PENDBASE_SZ ALIGN(BIT(LPI_NRBITS) / 8, SZ_64K)

-#define LPI_PROP_DEFAULT_PRIO 0xa0
+#define LPI_PROP_DEFAULT_PRIO GICD_INT_DEF_PRI

/*
* Collection structure - just an ID, and a redistributor address to
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index e5d1014..82cfacf 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -71,9 +71,6 @@ struct gic_chip_data {
#define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base)
#define gic_data_rdist_sgi_base() (gic_data_rdist_rd_base() + SZ_64K)

-/* Our default, arbitrary priority value. Linux only uses one anyway. */
-#define DEFAULT_PMR_VALUE 0xf0
-
static inline unsigned int gic_irq(struct irq_data *d)
{
return d->hwirq;
@@ -348,48 +345,55 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
{
u32 irqnr;

- do {
- irqnr = gic_read_iar();
+ irqnr = gic_read_iar();
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ isb();
+ /* Masking IRQs earlier would prevent acking the current interrupt */
+ gic_start_pmr_masking();
+#endif
+
+ if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
+ int err;

- if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
- int err;
+ if (static_branch_likely(&supports_deactivate_key))
+ gic_write_eoir(irqnr);
+ else {
+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ isb();
+#endif
+ }

- if (static_branch_likely(&supports_deactivate_key))
+ err = handle_domain_irq(gic_data.domain, irqnr, regs);
+ if (err) {
+ WARN_ONCE(true, "Unexpected interrupt received!\n");
+ if (static_branch_likely(&supports_deactivate_key)) {
+ if (irqnr < 8192)
+ gic_write_dir(irqnr);
+ } else {
gic_write_eoir(irqnr);
- else
- isb();
-
- err = handle_domain_irq(gic_data.domain, irqnr, regs);
- if (err) {
- WARN_ONCE(true, "Unexpected interrupt received!\n");
- if (static_branch_likely(&supports_deactivate_key)) {
- if (irqnr < 8192)
- gic_write_dir(irqnr);
- } else {
- gic_write_eoir(irqnr);
- }
}
- continue;
}
- if (irqnr < 16) {
- gic_write_eoir(irqnr);
- if (static_branch_likely(&supports_deactivate_key))
- gic_write_dir(irqnr);
+ return;
+ }
+ if (irqnr < 16) {
+ gic_write_eoir(irqnr);
+ if (static_branch_likely(&supports_deactivate_key))
+ gic_write_dir(irqnr);
#ifdef CONFIG_SMP
- /*
- * Unlike GICv2, we don't need an smp_rmb() here.
- * The control dependency from gic_read_iar to
- * the ISB in gic_write_eoir is enough to ensure
- * that any shared data read by handle_IPI will
- * be read after the ACK.
- */
- handle_IPI(irqnr, regs);
+ /*
+ * Unlike GICv2, we don't need an smp_rmb() here.
+ * The control dependency from gic_read_iar to
+ * the ISB in gic_write_eoir is enough to ensure
+ * that any shared data read by handle_IPI will
+ * be read after the ACK.
+ */
+ handle_IPI(irqnr, regs);
#else
- WARN_ONCE(true, "Unexpected SGI received!\n");
+ WARN_ONCE(true, "Unexpected SGI received!\n");
#endif
- continue;
- }
- } while (irqnr != ICC_IAR1_EL1_SPURIOUS);
+ return;
+ }
}

static void __init gic_dist_init(void)
@@ -565,8 +569,10 @@ static void gic_cpu_sys_reg_init(void)
val = read_gicreg(ICC_PMR_EL1);
group0 = val != 0;

+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
/* Set priority mask register */
- write_gicreg(DEFAULT_PMR_VALUE, ICC_PMR_EL1);
+ write_gicreg(ICC_PMR_EL1_UNMASKED, ICC_PMR_EL1);
+#endif

/*
* Some firmwares hand over to the kernel with the BPR changed from
diff --git a/include/linux/irqchip/arm-gic-common.h b/include/linux/irqchip/arm-gic-common.h
index 0a83b43..2c9a4b3 100644
--- a/include/linux/irqchip/arm-gic-common.h
+++ b/include/linux/irqchip/arm-gic-common.h
@@ -13,6 +13,12 @@
#include <linux/types.h>
#include <linux/ioport.h>

+#define GICD_INT_DEF_PRI 0xc0
+#define GICD_INT_DEF_PRI_X4 ((GICD_INT_DEF_PRI << 24) |\
+ (GICD_INT_DEF_PRI << 16) |\
+ (GICD_INT_DEF_PRI << 8) |\
+ GICD_INT_DEF_PRI)
+
enum gic_type {
GIC_V2,
GIC_V3,
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 68d8b1f..5f2129b 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -65,11 +65,6 @@
#define GICD_INT_EN_CLR_X32 0xffffffff
#define GICD_INT_EN_SET_SGI 0x0000ffff
#define GICD_INT_EN_CLR_PPI 0xffff0000
-#define GICD_INT_DEF_PRI 0xa0
-#define GICD_INT_DEF_PRI_X4 ((GICD_INT_DEF_PRI << 24) |\
- (GICD_INT_DEF_PRI << 16) |\
- (GICD_INT_DEF_PRI << 8) |\
- GICD_INT_DEF_PRI)

#define GICH_HCR 0x0
#define GICH_VTR 0x4
--
1.9.1

2018-05-21 11:40:18

by Julien Thierry

Subject: [PATCH v3 5/6] arm64: Detect current view of GIC priorities

The values non-secure EL1 needs to use for the PMR and RPR registers depend on
the value of SCR_EL3.FIQ.

The values non-secure EL1 sees from the distributor and redistributor
depend on whether security is enabled for the GIC or not.

Figure out what values we are dealing with, to know whether the values we use
for PMR and RPR match the priority values that have been configured in the
distributor and redistributors.

Also, add firmware requirements related to SCR_EL3.
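
As a concrete example of the matching involved: with security enabled, a
priority programmed as 0xa0 in the distributor is presented to the CPU
interface as

	(0xa0 >> 1) | 0x80 = 0xd0

With SCR_EL3.FIQ == 1, non-secure writes to PMR undergo the same
transformation, so the two views line up; with SCR_EL3.FIQ == 0 the PMR
value is used as-is and no longer matches the shifted (re)distributor
priorities.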

Signed-off-by: Julien Thierry <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
 Documentation/arm64/booting.txt |  5 +++
 arch/arm64/include/asm/sysreg.h |  1 +
 drivers/irqchip/irq-gic-v3.c    | 99 ++++++++++++++++++++++++++++-----------
 3 files changed, 80 insertions(+), 25 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 8d0df62..e387938 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -188,6 +188,11 @@ Before jumping into the kernel, the following conditions must be met:
the kernel image will be entered must be initialised by software at a
higher exception level to prevent execution in an UNKNOWN state.

+ - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
+ executing on.
+ - The value of SCR_EL3.FIQ must be the same as the one present at boot
+ time whenever the kernel is executing.
+
For systems with a GICv3 interrupt controller to be used in v3 mode:
- If EL3 is present:
ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 6171178..fb8320a 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -322,6 +322,7 @@
#define SYS_ICC_SRE_EL1 sys_reg(3, 0, 12, 12, 5)
#define SYS_ICC_IGRPEN0_EL1 sys_reg(3, 0, 12, 12, 6)
#define SYS_ICC_IGRPEN1_EL1 sys_reg(3, 0, 12, 12, 7)
+#define SYS_ICC_RPR_EL1 sys_reg(3, 0, 12, 11, 3)

#define SYS_CONTEXTIDR_EL1 sys_reg(3, 0, 13, 0, 1)
#define SYS_TPIDR_EL1 sys_reg(3, 0, 13, 0, 4)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 82cfacf..3c44918 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -63,6 +63,30 @@ struct gic_chip_data {
static struct gic_chip_data gic_data __read_mostly;
static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key);

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * The behaviours of RPR and PMR registers differ depending on the value of
+ * SCR_EL3.FIQ, and the behaviour of non-secure priority registers of the
+ * distributor and redistributors depends on whether security is enabled in the
+ * GIC.
+ *
+ * When security is enabled, non-secure priority values from the (re)distributor
+ * are presented to the GIC CPUIF as follows:
+ * (GIC_(R)DIST_PRI[irq] >> 1) | 0x80;
+ *
+ * If SCR_EL3.FIQ == 1, the values written to/read from PMR and RPR at non-secure
+ * EL1 are subject to a similar operation, thus matching the priorities presented
+ * from the (re)distributor when security is enabled.
+ *
+ * see GICv3/GICv4 Architecture Specification (IHI0069D):
+ * - section 4.8.1 Non-secure accesses to register fields for Secure interrupt
+ * priorities.
+ * - Figure 4-7 Secure read of the priority field for a Non-secure Group 1
+ * interrupt.
+ */
+DEFINE_STATIC_KEY_FALSE(have_non_secure_prio_view);
+#endif
+
static struct gic_kvm_info gic_v3_kvm_info;
static DEFINE_PER_CPU(bool, has_rss);

@@ -531,28 +555,26 @@ static void gic_update_vlpi_properties(void)
!gic_data.rdists.has_direct_lpi ? "no " : "");
}

-static void gic_cpu_sys_reg_init(void)
+/* Check whether the GIC is configured with a single security state */
+static inline bool gic_dist_security_disabled(void)
{
- int i, cpu = smp_processor_id();
- u64 mpidr = cpu_logical_map(cpu);
- u64 need_rss = MPIDR_RS(mpidr);
- bool group0;
- u32 val, pribits;
+ return readl_relaxed(gic_data.dist_base + GICD_CTLR) & GICD_CTLR_DS;
+}

- /*
- * Need to check that the SRE bit has actually been set. If
- * not, it means that SRE is disabled at EL2. We're going to
- * die painfully, and there is nothing we can do about it.
- *
- * Kindly inform the luser.
- */
- if (!gic_enable_sre())
- pr_err("GIC: unable to set SRE (disabled at EL2), panic ahead\n");
+static inline u32 gic_get_cpu_pri_bits(void)
+{
+ u32 pribits;

pribits = gic_read_ctlr();
pribits &= ICC_CTLR_EL1_PRI_BITS_MASK;
pribits >>= ICC_CTLR_EL1_PRI_BITS_SHIFT;
- pribits++;
+
+ return pribits + 1;
+}
+
+static inline bool gic_cpu_has_group0(void)
+{
+ u32 pmr_val;

/*
* Let's find out if Group0 is under control of EL3 or not by
@@ -565,13 +587,41 @@ static void gic_cpu_sys_reg_init(void)
* becomes 0x80. Reading it back returns 0, indicating that
* we don't have access to Group0.
*/
- write_gicreg(BIT(8 - pribits), ICC_PMR_EL1);
- val = read_gicreg(ICC_PMR_EL1);
- group0 = val != 0;
+ write_gicreg(BIT(8 - gic_get_cpu_pri_bits()), ICC_PMR_EL1);
+ pmr_val = read_gicreg(ICC_PMR_EL1);
+
+ return pmr_val != 0;
+}
+
+static void gic_cpu_sys_reg_init(void)
+{
+ int i, cpu = smp_processor_id();
+ u64 mpidr = cpu_logical_map(cpu);
+ u64 need_rss = MPIDR_RS(mpidr);
+ bool group0;
+ u32 pribits;
+
+ /*
+ * Need to check that the SRE bit has actually been set. If
+ * not, it means that SRE is disabled at EL2. We're going to
+ * die painfully, and there is nothing we can do about it.
+ *
+ * Kindly inform the luser.
+ */
+ if (!gic_enable_sre())
+ pr_err("GIC: unable to set SRE (disabled at EL2), panic ahead\n");
+
+ pribits = gic_get_cpu_pri_bits();
+
+ group0 = gic_cpu_has_group0();

#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
/* Set priority mask register */
write_gicreg(ICC_PMR_EL1_UNMASKED, ICC_PMR_EL1);
+#else
+ if (static_branch_likely(&have_non_secure_prio_view) && group0)
+ /* Configuration mismatch with the boot CPU */
+ WARN_ON(!gic_dist_security_disabled());
#endif

/*
@@ -825,12 +875,6 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
#endif

#ifdef CONFIG_CPU_PM
-/* Check whether it's single security state view */
-static bool gic_dist_security_disabled(void)
-{
- return readl_relaxed(gic_data.dist_base + GICD_CTLR) & GICD_CTLR_DS;
-}
-
static int gic_cpu_pm_notifier(struct notifier_block *self,
unsigned long cmd, void *v)
{
@@ -1130,6 +1174,11 @@ static int __init gic_init_bases(void __iomem *dist_base,
gic_cpu_init();
gic_cpu_pm_init();

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ if (!gic_cpu_has_group0() || gic_dist_security_disabled())
+ static_branch_enable(&have_non_secure_prio_view);
+#endif
+
return 0;

out_free:
--
1.9.1
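
To illustrate what the static key buys us, here is a hypothetical consumer
(the helper name is made up; the shift is the one described in the comment
added by this patch):

	/*
	 * Hypothetical helper, for illustration only: translate a
	 * (re)distributor priority into the value to program into PMR.
	 */
	static inline u8 dist_prio_as_pmr(u8 dist_prio)
	{
		/* Both sides shifted, or neither: values already match. */
		if (static_branch_likely(&have_non_secure_prio_view))
			return dist_prio;

		/* Group0 case: PMR is unshifted but the distributor view is. */
		return (dist_prio >> 1) | 0x80;
	}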

2018-05-21 11:46:08

by Suzuki K Poulose

[permalink] [raw]
Subject: Re: [PATCH v3 1/6] arm64: cpufeature: Allow early detect of specific features

On 21/05/18 12:35, Julien Thierry wrote:
> From: Daniel Thompson <[email protected]>
>
> Currently it is not possible to detect features of the boot CPU
> until the other CPUs have been brought up.
>
> This prevents us from reacting to features of the boot CPU until
> fairly late in the boot process. To solve this we allow a subset
> of features (that are likely to be common to all clusters) to be
> detected based on the boot CPU alone.
>
> Signed-off-by: Daniel Thompson <[email protected]>
> [[email protected]: check non-boot cpu missing early features, avoid
> duplicates between early features and normal
> features]
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>

nit: Since this is completely different from what Daniel started with,
you could simply reset the author to yourself. The boot CPU feature was
added keeping this user in mind.

The patch as such looks good to me.

Reviewed-by: Suzuki K Poulose <[email protected]>

> ---
> arch/arm64/kernel/cpufeature.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 9d1b06d..e03e897 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1030,7 +1030,7 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
> {
> .desc = "GIC system register CPU interface",
> .capability = ARM64_HAS_SYSREG_GIC_CPUIF,
> - .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> + .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
> .matches = has_useable_gicv3_cpuif,
> .sys_reg = SYS_ID_AA64PFR0_EL1,
> .field_pos = ID_AA64PFR0_GIC_SHIFT,
> --
> 1.9.1
>


2018-05-21 12:07:24

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH v3 1/6] arm64: cpufeature: Allow early detect of specific features

On Mon, May 21, 2018 at 12:45:22PM +0100, Suzuki K Poulose wrote:
> On 21/05/18 12:35, Julien Thierry wrote:
> > From: Daniel Thompson <[email protected]>
> >
> > Currently it is not possible to detect features of the boot CPU
> > until the other CPUs have been brought up.
> >
> > This prevents us from reacting to features of the boot CPU until
> > fairly late in the boot process. To solve this we allow a subset
> > of features (that are likely to be common to all clusters) to be
> > detected based on the boot CPU alone.
> >
> > Signed-off-by: Daniel Thompson <[email protected]>
> > [[email protected]: check non-boot cpu missing early features, avoid
> > duplicates between early features and normal
> > features]
> > Signed-off-by: Julien Thierry <[email protected]>
> > Cc: Catalin Marinas <[email protected]>
> > Cc: Will Deacon <[email protected]>
> > Cc: Suzuki K Poulose <[email protected]>
>
> nit: Since this is completely different from what Daniel started with,
> you could simply reset the author to yourself. The boot CPU feature was
> added keeping this user in mind.

Agree! It's no longer my patch.

If you want to retain any credit, then a Suggested-by: would be quite
sufficient.


Daniel.


>
> The patch as such looks good to me.
>
> Reviewed-by: Suzuki K Poulose <[email protected]>
>
> > ---
> > arch/arm64/kernel/cpufeature.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 9d1b06d..e03e897 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -1030,7 +1030,7 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
> > {
> > .desc = "GIC system register CPU interface",
> > .capability = ARM64_HAS_SYSREG_GIC_CPUIF,
> > - .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> > + .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
> > .matches = has_useable_gicv3_cpuif,
> > .sys_reg = SYS_ID_AA64PFR0_EL1,
> > .field_pos = ID_AA64PFR0_GIC_SHIFT,
> > --
> > 1.9.1
> >
>

2018-05-21 12:12:00

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v3 1/6] arm64: cpufeature: Allow early detect of specific features



On 21/05/18 13:06, Daniel Thompson wrote:
> On Mon, May 21, 2018 at 12:45:22PM +0100, Suzuki K Poulose wrote:
>> On 21/05/18 12:35, Julien Thierry wrote:
>>> From: Daniel Thompson <[email protected]>
>>>
>>> Currently it is not possible to detect features of the boot CPU
>>> until the other CPUs have been brought up.
>>>
>>> This prevents us from reacting to features of the boot CPU until
>>> fairly late in the boot process. To solve this we allow a subset
>>> of features (that are likely to be common to all clusters) to be
>>> detected based on the boot CPU alone.
>>>
>>> Signed-off-by: Daniel Thompson <[email protected]>
>>> [[email protected]: check non-boot cpu missing early features, avoid
>>> duplicates between early features and normal
>>> features]
>>> Signed-off-by: Julien Thierry <[email protected]>
>>> Cc: Catalin Marinas <[email protected]>
>>> Cc: Will Deacon <[email protected]>
>>> Cc: Suzuki K Poulose <[email protected]>
>>
>> nit: Since this is completely different from what Daniel started with,
>> you could simply reset the author to yourself. The boot CPU feature was
>> added keeping this user in mind.
>
> Agree! It's no longer my patch.
>
> If you want to retain any credit, then a Suggested-by: would be quite
> sufficient.
>

Good point, I didn't think of that.

I'll change this after I get other reviews for this version.

Thanks,

>
>>
>> The patch as such looks good to me.
>>
>> Reviewed-by: Suzuki K Poulose <[email protected]>
>>
>>> ---
>>> arch/arm64/kernel/cpufeature.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>> index 9d1b06d..e03e897 100644
>>> --- a/arch/arm64/kernel/cpufeature.c
>>> +++ b/arch/arm64/kernel/cpufeature.c
>>> @@ -1030,7 +1030,7 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
>>> {
>>> .desc = "GIC system register CPU interface",
>>> .capability = ARM64_HAS_SYSREG_GIC_CPUIF,
>>> - .type = ARM64_CPUCAP_SYSTEM_FEATURE,
>>> + .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
>>> .matches = has_useable_gicv3_cpuif,
>>> .sys_reg = SYS_ID_AA64PFR0_EL1,
>>> .field_pos = ID_AA64PFR0_GIC_SHIFT,
>>> --
>>> 1.9.1
>>>
>>

--
Julien Thierry

2018-05-25 02:28:03

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 3/6] arm64: irqflags: Use ICC sysregs to implement IRQ masking

Hi Julien,

On 21/05/18 12:35, Julien Thierry wrote:
> From: Daniel Thompson <[email protected]>
>
> Currently irqflags is implemented using the PSR's I bit. It is possible
> to implement irqflags by using the co-processor interface to the GIC.
> Using the co-processor interface makes it feasible to simulate NMIs
> using GIC interrupt prioritization.
>
> This patch changes the irqflags macros to modify, save and restore
> ICC_PMR_EL1. This has a substantial knock on effect for the rest of
> the kernel. There are four reasons for this:
>
> 1. The state of the PMR becomes part of the interrupt context and must be
> saved and restored during exceptions. It is saved on the stack as part
> of the saved context when an interrupt/exception is taken.
>
> 2. The hardware automatically masks the I bit (at boot, during traps, etc).
> When the I bit is set by hardware we must add code to switch from I
> bit masking and PMR masking:
> - For IRQs, this is done after the interrupt has been acknowledged
> avoiding the need to unmask.
> - For other exceptions, this is done right after saving the context.
>
> 3. Some instructions, such as wfi, require that the PMR not be used
> for interrupt masking. Before calling these instructions we must
> switch from PMR masking to I bit masking.
> This is also the case when KVM runs a guest, if the CPU receives
> an interrupt from the host, interrupts must not be masked in PMR
> otherwise the GIC will not signal it to the CPU.
>
> 4. We use the alternatives system to allow a single kernel to boot and
> be switched to the alternative masking approach at runtime.
>
> Signed-off-by: Daniel Thompson <[email protected]>
> [[email protected]: changes reflected in commit,
> message, fixes, renaming]
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Christoffer Dall <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Jason Cooper <[email protected]>
> Cc: James Morse <[email protected]>
> ---
> arch/arm64/Kconfig | 15 ++++
> arch/arm64/include/asm/arch_gicv3.h | 20 ++++++
> arch/arm64/include/asm/assembler.h | 25 ++++++-
> arch/arm64/include/asm/daifflags.h | 36 +++++++---
> arch/arm64/include/asm/efi.h | 5 ++
> arch/arm64/include/asm/irqflags.h | 125 +++++++++++++++++++++++++++++++++
> arch/arm64/include/asm/kvm_host.h | 14 ++++
> arch/arm64/include/asm/processor.h | 4 ++
> arch/arm64/include/asm/ptrace.h | 14 +++-
> arch/arm64/kernel/asm-offsets.c | 1 +
> arch/arm64/kernel/entry.S | 28 ++++++--
> arch/arm64/kernel/head.S | 37 ++++++++++
> arch/arm64/kernel/process.c | 6 ++
> arch/arm64/kernel/smp.c | 8 +++
> arch/arm64/kvm/hyp/switch.c | 25 +++++++
> arch/arm64/mm/fault.c | 5 +-
> arch/arm64/mm/proc.S | 23 ++++++
> drivers/irqchip/irq-gic-v3-its.c | 2 +-
> drivers/irqchip/irq-gic-v3.c | 82 +++++++++++----------
> include/linux/irqchip/arm-gic-common.h | 6 ++
> include/linux/irqchip/arm-gic.h | 5 --
> 21 files changed, 423 insertions(+), 63 deletions(-)
I've commented on this particular patch offline, but let me state it
on the list:

As it is, this patch is almost impossible to review. It turns the
interrupt masking upside down, messes with the GIC, hacks KVM... Too
many things change at once, and I find it very hard to build a mental
picture of the changes just by staring at it.

Can you please try to split it into related chunks, moving the enabling
of the feature right at the end, so that the reviewers can have a chance
to understand it? It should make it much easier to review.

Thanks,

M.
--
Jazz is not dead. It just smells funny...
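
As an aside, for readers trying to picture point 3 of the quoted commit
message, here is a minimal sketch of the kind of switch it describes before
executing wfi (hypothetical helper name; the series spreads the real logic
across entry.S, processor.h and the KVM hyp code):

	static inline void switch_to_psr_i_masking(void)
	{
		/* Mask IRQs with PSR.I first, so no unmasked window opens... */
		asm volatile("msr daifset, #2" : : : "memory");
		/*
		 * ...then unmask at the PMR level so that a pending IRQ can
		 * still wake up a subsequent wfi.
		 */
		write_sysreg_s(ICC_PMR_EL1_UNMASKED, SYS_ICC_PMR_EL1);
		isb();
	}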