2018-08-28 15:53:19

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 00/27] arm64: provide pseudo NMI with GICv3

Hi,

This series is a continuation of the work started by Daniel [1]. The goal
is to use GICv3 interrupt priorities to simulate an NMI.

The patches depend on the core API for NMIs patches [2].

To achieve this, set two priorities, one for standard interrupts and
another, higher priority, for NMIs. Whenever we want to disable interrupts,
we mask the standard priority instead so NMIs can still be raised. Some
corner cases though still require to actually mask all interrupts
effectively disabling the NMI.

Daniel Thompson ran some benchmarks [3] on the previous version showing a
small (<1%) performance drop when using interrupt priorities.

Currently, only PPIs and SPIs can be set as NMIs. IPIs being currently
hardcoded IRQ numbers, there isn't a generic interface to set SGIs as NMI
for now. LPIs being controlled by the ITS cannot be delivered as NMI.
When an NMI is active on a CPU, no other NMI can be triggered on the CPU.

Requirements to use this:
- Have GICv3
- SCR_EL3.FIQ is set to 1 when linux runs or have single security state
- Select Kernel Feature -> Use ICC system registers for IRQ masking


* Patches 1 to 3 aim at applying some alternatives early in the boot
process.

* Patches 4 to 7 ensure the logic of daifflags remains valid
after arch_local_irq flags use ICC_PMR_EL1.

* Patches 8 and 9 clean up GIC current priority definition to make it
easier to introduce a new priority

* Patches 10 to 16 prepare arch code for the use of priorities, saving and
restoring ICC_PMR_EL1 appropriately

* Patches 17 to 20 add the support to GICv3 driver to use priority masking
if required by the architecture

* Patches 21 to 23 make arm64 code use ICC_PMR_EL1 to enable/disable
interrupts, leaving PSR.I as often as possible

* Patches 24 to 27 add the support for NMIs to GICv3 driver


Changes since V4[4]:
* Rebased to v4.19-rc1
* Adapted GIC driver to the core NMI API
* Added option to disable priority masking on command line
* Added Daniel's Tested-by on patches related replacing PSR.I toggling with
PMR masking
* Fix scope matching for alternative features.
* Spotted some more places using PSR.I or daif and replaced with generic
interrupt functions

Changes since V3[5]:
* Big refactoring. As suggested by Marc Z., some of the bigger patches
needed to be split into smaller one.
* Try to reduce the amount of #ifdef for the new feature by introducing
an individual cpufeature for priority masking
* Do not track which alternatives have been applied (was a bit dodgy
anyway), and use an alternative for VHE cpu_enable callback
* Fix a build failure with arm by adding the correct RPR accessors
* Added Suggested-by tags for changes from comming or inspired by Daniel's
series. Do let me know if you feel I missed something and am not giving
you due credit.

Changes since V2[6]:
* Series rebase to v4.17-rc6
* Adapt pathces 1 and 2 to the rework of cpufeatures framework
* Use the group0 detection scheme in the GICv3 driver to identify
the priority view, and drop the use of a fake interrupt
* Add the case for a GIC configured in a single security state
* Use local_daif_restore instead of local_irq_enable the first time
we enable interrupts after a bp hardening in the handling of a kernel
entry. Otherwise PRS.I remains set...

Changes since V1[7]:
* Series rebased to v4.15-rc8.
* Check for arm64_early_features in this_cpu_has_cap (spotted by Suzuki).
* Fix issue where debug exception were not masked when enabling debug in
mdscr_el1.

Changes since RFC[8]:
* The series was rebased to v4.15-rc2 which implied some changes mainly
related to the work on exception entries and daif flags by James Morse.
- The first patch in the previous series was dropped because no longer
applicable.
- With the semantics James introduced of "inheriting" daif flags,
handling of PMR on exception entry is simplified as PMR is not altered
by taking an exception and already inherited from previous state.
- James pointed out that taking a PseudoNMI before reading the FAR_EL1
register should not be allowed as per the TRM (D10.2.29):
"FAR_EL1 is made UNKNOWN on an exception return from EL1."
So in this submission PSR.I bit is cleared only after FAR_EL1 is read.
* For KVM, only deal with PMR unmasking/restoring in common code, and VHE
specific code makes sure PSR.I bit is set when necessary.
* When detecting the GIC priority view (patch 5), wait for an actual
interrupt instead of trying only once.


[1] http://www.spinics.net/lists/arm-kernel/msg525077.html
[2] https://lkml.org/lkml/2018/8/28/661
[3] https://lkml.org/lkml/2018/7/20/803
[4] https://lkml.org/lkml/2018/7/24/321
[5] https://lkml.org/lkml/2018/5/21/276
[6] https://lkml.org/lkml/2018/1/17/335
[7] https://www.spinics.net/lists/arm-kernel/msg620763.html
[8] https://www.spinics.net/lists/arm-kernel/msg610736.html

Cheers,

Julien

-->

Daniel Thompson (1):
arm64: alternative: Apply alternatives early in boot process

Julien Thierry (26):
arm64: cpufeature: Set SYSREG_GIC_CPUIF as a boot system feature
arm64: cpufeature: Use alternatives for VHE cpu_enable
arm64: daifflags: Use irqflags functions for daifflags
arm64: Use daifflag_restore after bp_hardening
arm64: Delay daif masking for user return
arm64: xen: Use existing helper to check interrupt status
irqchip/gic: Unify GIC priority definitions
irqchip/gic: Lower priority of GIC interrupts
arm64: cpufeature: Add cpufeature for IRQ priority masking
arm64: Make PMR part of task context
arm64: Unmask PMR before going idle
arm/arm64: gic-v3: Add helper functions to manage IRQ priorities
arm64: kvm: Unmask PMR before entering guest
arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking
arm64: daifflags: Include PMR in daifflags restore operations
irqchip/gic-v3: Factor group0 detection into functions
irqchip/gic-v3: Do not overwrite PMR value
irqchip/gic-v3: Remove acknowledge loop
irqchip/gic-v3: Switch to PMR masking after IRQ acknowledge
arm64: Switch to PMR masking when starting CPUs
arm64: Add build option for IRQ masking via priority
arm64: Handle serror in NMI context
irqchip/gic-v3: Detect current view of GIC priorities
irqchip/gic-v3: Add base support for pseudo-NMI
irqchip/gic: Add functions to access irq priorities
irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI

Documentation/admin-guide/kernel-parameters.txt | 3 +
Documentation/arm64/booting.txt | 5 +
arch/arm/include/asm/arch_gicv3.h | 33 +++
arch/arm64/Kconfig | 15 ++
arch/arm64/include/asm/alternative.h | 3 +-
arch/arm64/include/asm/arch_gicv3.h | 33 +++
arch/arm64/include/asm/assembler.h | 17 +-
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/cpufeature.h | 2 +
arch/arm64/include/asm/daifflags.h | 29 ++-
arch/arm64/include/asm/efi.h | 3 +-
arch/arm64/include/asm/irqflags.h | 100 +++++++--
arch/arm64/include/asm/kvm_host.h | 12 +
arch/arm64/include/asm/processor.h | 1 +
arch/arm64/include/asm/ptrace.h | 13 +-
arch/arm64/include/asm/xen/events.h | 2 +-
arch/arm64/kernel/alternative.c | 28 ++-
arch/arm64/kernel/asm-offsets.c | 1 +
arch/arm64/kernel/cpufeature.c | 51 ++++-
arch/arm64/kernel/entry.S | 63 +++++-
arch/arm64/kernel/head.S | 35 +++
arch/arm64/kernel/process.c | 2 +
arch/arm64/kernel/smp.c | 12 +
arch/arm64/kernel/traps.c | 8 +-
arch/arm64/kvm/hyp/switch.c | 17 ++
arch/arm64/mm/fault.c | 5 +-
arch/arm64/mm/proc.S | 18 ++
drivers/irqchip/irq-gic-common.c | 10 +
drivers/irqchip/irq-gic-common.h | 2 +
drivers/irqchip/irq-gic-v3-its.c | 2 +-
drivers/irqchip/irq-gic-v3.c | 279 +++++++++++++++++++-----
include/linux/irqchip/arm-gic-common.h | 6 +
include/linux/irqchip/arm-gic.h | 5 -
33 files changed, 699 insertions(+), 119 deletions(-)

--
1.9.1


2018-08-28 15:53:21

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 01/27] arm64: cpufeature: Set SYSREG_GIC_CPUIF as a boot system feature

Signed-off-by: Julien Thierry <[email protected]>
Suggested-by: Daniel Thompson <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
arch/arm64/kernel/cpufeature.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e238b79..1e433ac 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1039,7 +1039,7 @@ static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
{
.desc = "GIC system register CPU interface",
.capability = ARM64_HAS_SYSREG_GIC_CPUIF,
- .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
.matches = has_useable_gicv3_cpuif,
.sys_reg = SYS_ID_AA64PFR0_EL1,
.field_pos = ID_AA64PFR0_GIC_SHIFT,
--
1.9.1


2018-08-28 15:53:27

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 02/27] arm64: cpufeature: Use alternatives for VHE cpu_enable

The cpu_enable callback for VHE feature requires all alternatives to have
been applied. This prevents applying VHE alternative separately from the
rest.

Use an alternative depending on VHE feature to know whether VHE
alternatives have already been applied.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Christoffer Dall <[email protected]>
---
arch/arm64/kernel/cpufeature.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 1e433ac..3bc1c8b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1014,6 +1014,8 @@ static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused

static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
{
+ u64 tmp = 0;
+
/*
* Copy register values that aren't redirected by hardware.
*
@@ -1022,8 +1024,15 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
* that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
* do anything here.
*/
- if (!alternatives_applied)
- write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
+ asm volatile(ALTERNATIVE(
+ "mrs %0, tpidr_el1\n"
+ "msr tpidr_el2, %0",
+ "nop\n"
+ "nop",
+ ARM64_HAS_VIRT_HOST_EXTN)
+ : "+r" (tmp)
+ :
+ : "memory");
}
#endif

--
1.9.1


2018-08-28 15:53:33

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 03/27] arm64: alternative: Apply alternatives early in boot process

From: Daniel Thompson <[email protected]>

Currently alternatives are applied very late in the boot process (and
a long time after we enable scheduling). Some alternative sequences,
such as those that alter the way CPU context is stored, must be applied
much earlier in the boot sequence.

Introduce apply_boot_alternatives() to allow some alternatives to be
applied immediately after we detect the CPU features of the boot CPU.

Signed-off-by: Daniel Thompson <[email protected]>
[[email protected]: rename to fit new cpufeature framework better,
apply BOOT_SCOPE feature early in boot]
Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
---
arch/arm64/include/asm/alternative.h | 3 +--
arch/arm64/include/asm/cpufeature.h | 2 ++
arch/arm64/kernel/alternative.c | 28 +++++++++++++++++++++++++---
arch/arm64/kernel/cpufeature.c | 5 +++++
arch/arm64/kernel/smp.c | 7 +++++++
5 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index 4b650ec..17f4554 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -14,8 +14,6 @@
#include <linux/stddef.h>
#include <linux/stringify.h>

-extern int alternatives_applied;
-
struct alt_instr {
s32 orig_offset; /* offset to original instruction */
s32 alt_offset; /* offset to replacement instruction */
@@ -27,6 +25,7 @@ struct alt_instr {
typedef void (*alternative_cb_t)(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr, int nr_inst);

+void __init apply_boot_alternatives(void);
void __init apply_alternatives_all(void);

#ifdef CONFIG_MODULES
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 1717ba1..e6c030a 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -357,6 +357,8 @@ static inline int cpucap_default_scope(const struct arm64_cpu_capabilities *cap)
extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
extern struct static_key_false arm64_const_caps_ready;

+extern unsigned long boot_capabilities;
+
bool this_cpu_has_cap(unsigned int cap);

static inline bool cpu_have_feature(unsigned int num)
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index b5d6039..70c2604 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -145,7 +145,8 @@ static void clean_dcache_range_nopatch(u64 start, u64 end)
} while (cur += d_size, cur < end);
}

-static void __apply_alternatives(void *alt_region, bool is_module)
+static void __apply_alternatives(void *alt_region, bool is_module,
+ unsigned long feature_mask)
{
struct alt_instr *alt;
struct alt_region *region = alt_region;
@@ -155,6 +156,9 @@ static void __apply_alternatives(void *alt_region, bool is_module)
for (alt = region->begin; alt < region->end; alt++) {
int nr_inst;

+ if ((BIT(alt->cpufeature) & feature_mask) == 0)
+ continue;
+
/* Use ARM64_CB_PATCH as an unconditional patch */
if (alt->cpufeature < ARM64_CB_PATCH &&
!cpus_have_cap(alt->cpufeature))
@@ -213,7 +217,7 @@ static int __apply_alternatives_multi_stop(void *unused)
isb();
} else {
BUG_ON(alternatives_applied);
- __apply_alternatives(&region, false);
+ __apply_alternatives(&region, false, ~boot_capabilities);
/* Barriers provided by the cache flushing */
WRITE_ONCE(alternatives_applied, 1);
}
@@ -227,6 +231,24 @@ void __init apply_alternatives_all(void)
stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);
}

+/*
+ * This is called very early in the boot process (directly after we run
+ * a feature detect on the boot CPU). No need to worry about other CPUs
+ * here.
+ */
+void __init apply_boot_alternatives(void)
+{
+ struct alt_region region = {
+ .begin = (struct alt_instr *)__alt_instructions,
+ .end = (struct alt_instr *)__alt_instructions_end,
+ };
+
+ /* If called on non-boot cpu things could go wrong */
+ WARN_ON(smp_processor_id() != 0);
+
+ __apply_alternatives(&region, false, boot_capabilities);
+}
+
#ifdef CONFIG_MODULES
void apply_alternatives_module(void *start, size_t length)
{
@@ -235,6 +257,6 @@ void apply_alternatives_module(void *start, size_t length)
.end = start + length,
};

- __apply_alternatives(&region, true);
+ __apply_alternatives(&region, true, -1);
}
#endif
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 3bc1c8b..0d1e41e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -52,6 +52,8 @@
DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
EXPORT_SYMBOL(cpu_hwcaps);

+unsigned long boot_capabilities;
+
/*
* Flag to indicate if we have computed the system wide
* capabilities based on the boot time active CPUs. This
@@ -1375,6 +1377,9 @@ static void __update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
if (!cpus_have_cap(caps->capability) && caps->desc)
pr_info("%s %s\n", info, caps->desc);
cpus_set_cap(caps->capability);
+
+ if (caps->type & SCOPE_BOOT_CPU)
+ __set_bit(caps->capability, &boot_capabilities);
}
}

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 25fcd22..22c9a0a 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -414,6 +414,13 @@ void __init smp_prepare_boot_cpu(void)
*/
jump_label_init();
cpuinfo_store_boot_cpu();
+
+ /*
+ * We now know enough about the boot CPU to apply the
+ * alternatives that cannot wait until interrupt handling
+ * and/or scheduling is enabled.
+ */
+ apply_boot_alternatives();
}

static u64 __init of_get_cpu_mpidr(struct device_node *dn)
--
1.9.1


2018-08-28 15:53:34

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 04/27] arm64: daifflags: Use irqflags functions for daifflags

Some of the work done in daifflags save/restore is already provided
by irqflags functions. Daifflags should always be a superset of irqflags
(it handles irq status + status of other flags). Modifying behaviour of
irqflags should alter the behaviour of daifflags.

Use irqflags_save/restore functions for the corresponding daifflags
operation.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: James Morse <[email protected]>
---
arch/arm64/include/asm/daifflags.h | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index 22e4c83..8d91f22 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -36,11 +36,8 @@ static inline unsigned long local_daif_save(void)
{
unsigned long flags;

- asm volatile(
- "mrs %0, daif // local_daif_save\n"
- : "=r" (flags)
- :
- : "memory");
+ flags = arch_local_save_flags();
+
local_daif_mask();

return flags;
@@ -60,11 +57,9 @@ static inline void local_daif_restore(unsigned long flags)
{
if (!arch_irqs_disabled_flags(flags))
trace_hardirqs_on();
- asm volatile(
- "msr daif, %0 // local_daif_restore"
- :
- : "r" (flags)
- : "memory");
+
+ arch_local_irq_restore(flags);
+
if (arch_irqs_disabled_flags(flags))
trace_hardirqs_off();
}
--
1.9.1


2018-08-28 15:53:38

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 05/27] arm64: Use daifflag_restore after bp_hardening

For EL0 entries requiring bp_hardening, daif status is kept at
DAIF_PROCCTX_NOIRQ until after hardening has been done. Then interrupts
are enabled through local_irq_enable().

Before using local_irq_* functions, daifflags should be properly restored
to a state where IRQs are enabled.

Enable IRQs by restoring DAIF_PROCCTX state after bp hardening.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: James Morse <[email protected]>
---
arch/arm64/mm/fault.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 50b30ff..dfa2c43 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -37,6 +37,7 @@
#include <asm/cmpxchg.h>
#include <asm/cpufeature.h>
#include <asm/exception.h>
+#include <asm/daifflags.h>
#include <asm/debug-monitors.h>
#include <asm/esr.h>
#include <asm/sysreg.h>
@@ -771,7 +772,7 @@ asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr,
if (addr > TASK_SIZE)
arm64_apply_bp_hardening();

- local_irq_enable();
+ local_daif_restore(DAIF_PROCCTX);
do_mem_abort(addr, esr, regs);
}

@@ -785,7 +786,7 @@ asmlinkage void __exception do_sp_pc_abort(unsigned long addr,
if (user_mode(regs)) {
if (instruction_pointer(regs) > TASK_SIZE)
arm64_apply_bp_hardening();
- local_irq_enable();
+ local_daif_restore(DAIF_PROCCTX);
}

clear_siginfo(&info);
--
1.9.1


2018-08-28 15:53:45

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 07/27] arm64: xen: Use existing helper to check interrupt status

The status of interrupts might depend on more than just pstate. Use
interrupts_disabled() instead of raw_irqs_disabled_flags() to take the full
context into account.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/include/asm/xen/events.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/xen/events.h b/arch/arm64/include/asm/xen/events.h
index 4e22b7a..2788e95 100644
--- a/arch/arm64/include/asm/xen/events.h
+++ b/arch/arm64/include/asm/xen/events.h
@@ -14,7 +14,7 @@ enum ipi_vector {

static inline int xen_irqs_disabled(struct pt_regs *regs)
{
- return raw_irqs_disabled_flags((unsigned long) regs->pstate);
+ return !interrupts_enabled(regs);
}

#define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
--
1.9.1


2018-08-28 15:53:51

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 10/27] arm64: cpufeature: Add cpufeature for IRQ priority masking

Add a cpufeature indicating whether a cpu supports masking interrupts
by priority.

Add command line option to disable that feature at runtime.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 3 +++
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/kernel/cpufeature.c | 31 +++++++++++++++++++++++++
3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 9871e64..d3e1170 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2764,6 +2764,9 @@
noexec=on: enable non-executable mappings (default)
noexec=off: disable non-executable mappings

+ nogicprios [ARM64]
+ Disable usage of GIC priorities to toggle interrupt status.
+
nosmap [X86]
Disable SMAP (Supervisor Mode Access Prevention)
even if it is supported by processor.
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index ae1f704..8cc2ae5 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -51,7 +51,8 @@
#define ARM64_SSBD 30
#define ARM64_MISMATCHED_CACHE_TYPE 31
#define ARM64_HAS_STAGE2_FWB 32
+#define ARM64_HAS_IRQ_PRIO_MASKING 33

-#define ARM64_NCAPS 33
+#define ARM64_NCAPS 34

#endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0d1e41e..2f2c557 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1046,6 +1046,22 @@ static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
WARN_ON(val & (7 << 27 | 7 << 21));
}

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+static bool nogicprios = false;
+
+static int __init early_nogicprios(char *p)
+{
+ nogicprios = true;
+ return 0;
+}
+early_param("nogicprios", early_nogicprios);
+
+static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry, int scope)
+{
+ return !nogicprios && has_useable_gicv3_cpuif(entry, scope);
+}
+#endif
+
static const struct arm64_cpu_capabilities arm64_features[] = {
{
.desc = "GIC system register CPU interface",
@@ -1233,6 +1249,21 @@ static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
.cpu_enable = cpu_enable_hw_dbm,
},
#endif
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ {
+ /*
+ * Depends on having GICv3
+ */
+ .desc = "IRQ priority masking",
+ .capability = ARM64_HAS_IRQ_PRIO_MASKING,
+ .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
+ .matches = can_use_gic_priorities,
+ .sys_reg = SYS_ID_AA64PFR0_EL1,
+ .field_pos = ID_AA64PFR0_GIC_SHIFT,
+ .sign = FTR_UNSIGNED,
+ .min_field_value = 1,
+ },
+#endif
{},
};

--
1.9.1


2018-08-28 15:54:07

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 15/27] arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking

Instead disabling interrupts by setting the PSR.I bit, use a priority
higher than the one used for interrupts to mask them via PMR.

The value chosen for PMR to enable/disable interrupts encodes the status
of interrupts on a single bit. This information is stored in the irqflags
values used when saving/restoring IRQ status.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Suggested-by: Daniel Thompson <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Oleg Nesterov <[email protected]>
---
arch/arm64/include/asm/assembler.h | 17 ++++++-
arch/arm64/include/asm/efi.h | 3 +-
arch/arm64/include/asm/irqflags.h | 97 ++++++++++++++++++++++++++++++--------
arch/arm64/include/asm/ptrace.h | 10 ++--
arch/arm64/kernel/entry.S | 2 +-
5 files changed, 102 insertions(+), 27 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 0bcc98d..0b2dcfd 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -23,6 +23,7 @@
#ifndef __ASM_ASSEMBLER_H
#define __ASM_ASSEMBLER_H

+#include <asm/alternative.h>
#include <asm/asm-offsets.h>
#include <asm/cpufeature.h>
#include <asm/debug-monitors.h>
@@ -62,12 +63,24 @@
/*
* Enable and disable interrupts.
*/
- .macro disable_irq
+ .macro disable_irq, tmp
+ mov \tmp, #ICC_PMR_EL1_MASKED
+alternative_if_not ARM64_HAS_IRQ_PRIO_MASKING
msr daifset, #2
+alternative_else
+ msr_s SYS_ICC_PMR_EL1, \tmp
+alternative_endif
.endm

- .macro enable_irq
+ .macro enable_irq, tmp
+ mov \tmp, #ICC_PMR_EL1_UNMASKED
+alternative_if_not ARM64_HAS_IRQ_PRIO_MASKING
msr daifclr, #2
+ nop
+alternative_else
+ msr_s SYS_ICC_PMR_EL1, \tmp
+ dsb sy
+alternative_endif
.endm

.macro save_and_disable_irq, flags
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 7ed3208..3e06891 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -42,7 +42,8 @@

efi_status_t __efi_rt_asm_wrapper(void *, const char *, ...);

-#define ARCH_EFI_IRQ_FLAGS_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
+#define ARCH_EFI_IRQ_FLAGS_MASK \
+ (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT | ARCH_FLAG_PMR_EN)

/* arch specific definitions used by the stub code */

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 24692ed..193cfd0 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -18,7 +18,27 @@

#ifdef __KERNEL__

+#include <asm/alternative.h>
+#include <asm/cpufeature.h>
#include <asm/ptrace.h>
+#include <asm/sysreg.h>
+
+
+/*
+ * When ICC_PMR_EL1 is used for interrupt masking, only the bit indicating
+ * whether the normal interrupts are masked is kept along with the daif
+ * flags.
+ */
+#define ARCH_FLAG_PMR_EN 0x1
+
+#define MAKE_ARCH_FLAGS(daif, pmr) \
+ ((daif) | (((pmr) >> ICC_PMR_EL1_EN_SHIFT) & ARCH_FLAG_PMR_EN))
+
+#define ARCH_FLAGS_GET_PMR(flags) \
+ ((((flags) & ARCH_FLAG_PMR_EN) << ICC_PMR_EL1_EN_SHIFT) \
+ | ICC_PMR_EL1_MASKED)
+
+#define ARCH_FLAGS_GET_DAIF(flags) ((flags) & ~ARCH_FLAG_PMR_EN)

/*
* Aarch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and
@@ -38,31 +58,50 @@
*/
static inline unsigned long arch_local_irq_save(void)
{
- unsigned long flags;
- asm volatile(
+ unsigned long flags, masked = ICC_PMR_EL1_MASKED;
+ unsigned long pmr = 0;
+
+ asm volatile(ALTERNATIVE(
"mrs %0, daif // arch_local_irq_save\n"
- "msr daifset, #2"
- : "=r" (flags)
- :
+ "msr daifset, #2\n"
+ "mov %1, #" __stringify(ICC_PMR_EL1_UNMASKED),
+ /* --- */
+ "mrs %0, daif\n"
+ "mrs_s %1, " __stringify(SYS_ICC_PMR_EL1) "\n"
+ "msr_s " __stringify(SYS_ICC_PMR_EL1) ", %2",
+ ARM64_HAS_IRQ_PRIO_MASKING)
+ : "=&r" (flags), "=&r" (pmr)
+ : "r" (masked)
: "memory");
- return flags;
+
+ return MAKE_ARCH_FLAGS(flags, pmr);
}

static inline void arch_local_irq_enable(void)
{
- asm volatile(
- "msr daifclr, #2 // arch_local_irq_enable"
- :
+ unsigned long unmasked = ICC_PMR_EL1_UNMASKED;
+
+ asm volatile(ALTERNATIVE(
+ "msr daifclr, #2 // arch_local_irq_enable\n"
+ "nop",
+ "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0\n"
+ "dsb sy",
+ ARM64_HAS_IRQ_PRIO_MASKING)
:
+ : "r" (unmasked)
: "memory");
}

static inline void arch_local_irq_disable(void)
{
- asm volatile(
- "msr daifset, #2 // arch_local_irq_disable"
- :
+ unsigned long masked = ICC_PMR_EL1_MASKED;
+
+ asm volatile(ALTERNATIVE(
+ "msr daifset, #2 // arch_local_irq_disable",
+ "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0",
+ ARM64_HAS_IRQ_PRIO_MASKING)
:
+ : "r" (masked)
: "memory");
}

@@ -72,12 +111,19 @@ static inline void arch_local_irq_disable(void)
static inline unsigned long arch_local_save_flags(void)
{
unsigned long flags;
- asm volatile(
- "mrs %0, daif // arch_local_save_flags"
- : "=r" (flags)
+ unsigned long pmr = 0;
+
+ asm volatile(ALTERNATIVE(
+ "mrs %0, daif // arch_local_save_flags\n"
+ "mov %1, #" __stringify(ICC_PMR_EL1_UNMASKED),
+ "mrs %0, daif\n"
+ "mrs_s %1, " __stringify(SYS_ICC_PMR_EL1),
+ ARM64_HAS_IRQ_PRIO_MASKING)
+ : "=r" (flags), "=r" (pmr)
:
: "memory");
- return flags;
+
+ return MAKE_ARCH_FLAGS(flags, pmr);
}

/*
@@ -85,16 +131,27 @@ static inline unsigned long arch_local_save_flags(void)
*/
static inline void arch_local_irq_restore(unsigned long flags)
{
- asm volatile(
- "msr daif, %0 // arch_local_irq_restore"
+ unsigned long pmr = ARCH_FLAGS_GET_PMR(flags);
+
+ flags = ARCH_FLAGS_GET_DAIF(flags);
+
+ asm volatile(ALTERNATIVE(
+ "msr daif, %0 // arch_local_irq_restore\n"
+ "nop\n"
+ "nop",
+ "msr daif, %0\n"
+ "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%1\n"
+ "dsb sy",
+ ARM64_HAS_IRQ_PRIO_MASKING)
:
- : "r" (flags)
+ : "r" (flags), "r" (pmr)
: "memory");
}

static inline int arch_irqs_disabled_flags(unsigned long flags)
{
- return flags & PSR_I_BIT;
+ return (ARCH_FLAGS_GET_DAIF(flags) & (PSR_I_BIT)) |
+ !(ARCH_FLAGS_GET_PMR(flags) & ICC_PMR_EL1_EN_BIT);
}
#endif
#endif
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 29ec217..67df46e 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -25,8 +25,11 @@
#define CurrentEL_EL1 (1 << 2)
#define CurrentEL_EL2 (2 << 2)

-/* PMR value use to unmask interrupts */
+/* PMR values used to mask/unmask interrupts */
#define ICC_PMR_EL1_UNMASKED 0xf0
+#define ICC_PMR_EL1_EN_SHIFT 6
+#define ICC_PMR_EL1_EN_BIT (1 << ICC_PMR_EL1_EN_SHIFT) // PMR IRQ enable
+#define ICC_PMR_EL1_MASKED (ICC_PMR_EL1_UNMASKED ^ ICC_PMR_EL1_EN_BIT)

/* AArch32-specific ptrace requests */
#define COMPAT_PTRACE_GETREGS 12
@@ -201,8 +204,9 @@ static inline void forget_syscall(struct pt_regs *regs)
#define processor_mode(regs) \
((regs)->pstate & PSR_MODE_MASK)

-#define interrupts_enabled(regs) \
- (!((regs)->pstate & PSR_I_BIT))
+#define interrupts_enabled(regs) \
+ ((!((regs)->pstate & PSR_I_BIT)) && \
+ ((regs)->pmr_save & ICC_PMR_EL1_EN_BIT))

#define fast_interrupts_enabled(regs) \
(!((regs)->pstate & PSR_F_BIT))
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 79b06af..91e1e3d 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -912,7 +912,7 @@ work_pending:
* "slow" syscall return path.
*/
ret_to_user:
- disable_irq // disable interrupts
+ disable_irq x21 // disable interrupts
ldr x1, [tsk, #TSK_TI_FLAGS]
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
--
1.9.1

2018-08-28 15:54:08

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 16/27] arm64: daifflags: Include PMR in daifflags restore operations

The addition of PMR should not bypass the semantics of daifflags.

When DA_F are set, I bit is also set as no interrupts (even of higher
priority) is allowed.

When DA_F are cleared, I bit is cleared and interrupt enabling/disabling
goes through ICC_PMR_EL1.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: James Morse <[email protected]>
---
arch/arm64/include/asm/daifflags.h | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index 8d91f22..ccd95e8 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -18,8 +18,14 @@

#include <linux/irqflags.h>

-#define DAIF_PROCCTX 0
-#define DAIF_PROCCTX_NOIRQ PSR_I_BIT
+#include <asm/arch_gicv3.h>
+
+#define DAIF_PROCCTX MAKE_ARCH_FLAGS(0, ICC_PMR_EL1_UNMASKED)
+
+#define DAIF_PROCCTX_NOIRQ \
+ (gic_prio_masking_enabled() ? \
+ MAKE_ARCH_FLAGS(0, ICC_PMR_EL1_MASKED) : \
+ MAKE_ARCH_FLAGS(PSR_I_BIT, ICC_PMR_EL1_UNMASKED))

/* mask/save/unmask/restore all exceptions, including interrupts. */
static inline void local_daif_mask(void)
@@ -51,6 +57,10 @@ static inline void local_daif_unmask(void)
:
:
: "memory");
+
+ /* Unmask IRQs in PMR if needed */
+ if (gic_prio_masking_enabled())
+ arch_local_irq_enable();
}

static inline void local_daif_restore(unsigned long flags)
--
1.9.1


2018-08-28 15:54:10

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 17/27] irqchip/gic-v3: Factor group0 detection into functions

The code to detect whether Linux has access to group0 interrupts can
prove useful in other parts of the driver.

Provide a separate function to do this.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 55 +++++++++++++++++++++++++++++---------------
1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index d5912f1..fef6688 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -392,6 +392,39 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
} while (irqnr != ICC_IAR1_EL1_SPURIOUS);
}

+static u32 gic_get_pribits(void)
+{
+ u32 pribits;
+
+ pribits = gic_read_ctlr();
+ pribits &= ICC_CTLR_EL1_PRI_BITS_MASK;
+ pribits >>= ICC_CTLR_EL1_PRI_BITS_SHIFT;
+ pribits++;
+
+ return pribits;
+}
+
+static bool gic_has_group0(void)
+{
+ u32 val;
+
+ /*
+ * Let's find out if Group0 is under control of EL3 or not by
+ * setting the highest possible, non-zero priority in PMR.
+ *
+ * If SCR_EL3.FIQ is set, the priority gets shifted down in
+ * order for the CPU interface to set bit 7, and keep the
+ * actual priority in the non-secure range. In the process, it
+ * looses the least significant bit and the actual priority
+ * becomes 0x80. Reading it back returns 0, indicating that
+ * we're don't have access to Group0.
+ */
+ gic_write_pmr(BIT(8 - gic_get_pribits()));
+ val = gic_read_pmr();
+
+ return val != 0;
+}
+
static void __init gic_dist_init(void)
{
unsigned int i;
@@ -533,7 +566,7 @@ static void gic_cpu_sys_reg_init(void)
u64 mpidr = cpu_logical_map(cpu);
u64 need_rss = MPIDR_RS(mpidr);
bool group0;
- u32 val, pribits;
+ u32 pribits;

/*
* Need to check that the SRE bit has actually been set. If
@@ -545,25 +578,9 @@ static void gic_cpu_sys_reg_init(void)
if (!gic_enable_sre())
pr_err("GIC: unable to set SRE (disabled at EL2), panic ahead\n");

- pribits = gic_read_ctlr();
- pribits &= ICC_CTLR_EL1_PRI_BITS_MASK;
- pribits >>= ICC_CTLR_EL1_PRI_BITS_SHIFT;
- pribits++;
+ pribits = gic_get_pribits();

- /*
- * Let's find out if Group0 is under control of EL3 or not by
- * setting the highest possible, non-zero priority in PMR.
- *
- * If SCR_EL3.FIQ is set, the priority gets shifted down in
- * order for the CPU interface to set bit 7, and keep the
- * actual priority in the non-secure range. In the process, it
- * looses the least significant bit and the actual priority
- * becomes 0x80. Reading it back returns 0, indicating that
- * we're don't have access to Group0.
- */
- write_gicreg(BIT(8 - pribits), ICC_PMR_EL1);
- val = read_gicreg(ICC_PMR_EL1);
- group0 = val != 0;
+ group0 = gic_has_group0();

/* Set priority mask register */
write_gicreg(DEFAULT_PMR_VALUE, ICC_PMR_EL1);
--
1.9.1

2018-08-28 15:54:11

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 18/27] irqchip/gic-v3: Do not overwrite PMR value

If the architecture is using ICC_PMR_EL1 to mask IRQs, do not overwrite
that value.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index fef6688..162c49c 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -407,6 +407,9 @@ static u32 gic_get_pribits(void)
static bool gic_has_group0(void)
{
u32 val;
+ u32 old_pmr;
+
+ old_pmr = gic_read_pmr();

/*
* Let's find out if Group0 is under control of EL3 or not by
@@ -422,6 +425,8 @@ static bool gic_has_group0(void)
gic_write_pmr(BIT(8 - gic_get_pribits()));
val = gic_read_pmr();

+ gic_write_pmr(old_pmr);
+
return val != 0;
}

@@ -583,7 +588,8 @@ static void gic_cpu_sys_reg_init(void)
group0 = gic_has_group0();

/* Set priority mask register */
- write_gicreg(DEFAULT_PMR_VALUE, ICC_PMR_EL1);
+ if (!gic_prio_masking_enabled())
+ write_gicreg(DEFAULT_PMR_VALUE, ICC_PMR_EL1);

/*
* Some firmwares hand over to the kernel with the BPR changed from
--
1.9.1

2018-08-28 15:54:18

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 19/27] irqchip/gic-v3: Remove acknowledge loop

Multiple interrupts pending for a CPU is actually rare. Doing an
acknowledge loop does not give much better performance or even can
deteriorate them.

Do not loop when an interrupt has been acknowledged, just return
from interrupt and wait for another one to be raised.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 65 +++++++++++++++++++++-----------------------
1 file changed, 31 insertions(+), 34 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 162c49c..a467fcf 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -348,48 +348,45 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
{
u32 irqnr;

- do {
- irqnr = gic_read_iar();
+ irqnr = gic_read_iar();

- if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
- int err;
+ if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
+ int err;

- if (static_branch_likely(&supports_deactivate_key))
+ if (static_branch_likely(&supports_deactivate_key))
+ gic_write_eoir(irqnr);
+ else
+ isb();
+
+ err = handle_domain_irq(gic_data.domain, irqnr, regs);
+ if (err) {
+ WARN_ONCE(true, "Unexpected interrupt received!\n");
+ if (static_branch_likely(&supports_deactivate_key)) {
+ if (irqnr < 8192)
+ gic_write_dir(irqnr);
+ } else {
gic_write_eoir(irqnr);
- else
- isb();
-
- err = handle_domain_irq(gic_data.domain, irqnr, regs);
- if (err) {
- WARN_ONCE(true, "Unexpected interrupt received!\n");
- if (static_branch_likely(&supports_deactivate_key)) {
- if (irqnr < 8192)
- gic_write_dir(irqnr);
- } else {
- gic_write_eoir(irqnr);
- }
}
- continue;
}
- if (irqnr < 16) {
- gic_write_eoir(irqnr);
- if (static_branch_likely(&supports_deactivate_key))
- gic_write_dir(irqnr);
+ return;
+ }
+ if (irqnr < 16) {
+ gic_write_eoir(irqnr);
+ if (static_branch_likely(&supports_deactivate_key))
+ gic_write_dir(irqnr);
#ifdef CONFIG_SMP
- /*
- * Unlike GICv2, we don't need an smp_rmb() here.
- * The control dependency from gic_read_iar to
- * the ISB in gic_write_eoir is enough to ensure
- * that any shared data read by handle_IPI will
- * be read after the ACK.
- */
- handle_IPI(irqnr, regs);
+ /*
+ * Unlike GICv2, we don't need an smp_rmb() here.
+ * The control dependency from gic_read_iar to
+ * the ISB in gic_write_eoir is enough to ensure
+ * that any shared data read by handle_IPI will
+ * be read after the ACK.
+ */
+ handle_IPI(irqnr, regs);
#else
- WARN_ONCE(true, "Unexpected SGI received!\n");
+ WARN_ONCE(true, "Unexpected SGI received!\n");
#endif
- continue;
- }
- } while (irqnr != ICC_IAR1_EL1_SPURIOUS);
+ }
}

static u32 gic_get_pribits(void)
--
1.9.1

2018-08-28 15:54:20

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 20/27] irqchip/gic-v3: Switch to PMR masking after IRQ acknowledge

After an interrupt has been acknowledged, mask the IRQ priority through
PMR and clear PSR.I bit, allowing higher priority interrupts to be
received during interrupt handling.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
arch/arm/include/asm/arch_gicv3.h | 6 ++++++
arch/arm64/include/asm/arch_gicv3.h | 6 ++++++
drivers/irqchip/irq-gic-v3.c | 8 +++++++-
3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/arch_gicv3.h b/arch/arm/include/asm/arch_gicv3.h
index 58d5d3e..b39d620 100644
--- a/arch/arm/include/asm/arch_gicv3.h
+++ b/arch/arm/include/asm/arch_gicv3.h
@@ -368,5 +368,11 @@ static inline bool gic_prio_masking_enabled(void)
return false;
}

+static inline void gic_start_pmr_masking(void)
+{
+ /* Should not get called */
+ WARN_ON(true);
+}
+
#endif /* !__ASSEMBLY__ */
#endif /* !__ASM_ARCH_GICV3_H */
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index 19a5b1f..eb55da8 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -161,5 +161,11 @@ static inline bool gic_prio_masking_enabled(void)
&& cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
}

+static inline void gic_start_pmr_masking(void)
+{
+ gic_write_pmr(ICC_PMR_EL1_MASKED);
+ asm volatile ("msr daifclr, #2" : : : "memory");
+}
+
#endif /* __ASSEMBLY__ */
#endif /* __ASM_ARCH_GICV3_H */
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index a467fcf..56d1fb9 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -350,12 +350,18 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs

irqnr = gic_read_iar();

+ if (gic_prio_masking_enabled()) {
+ isb();
+ /* Masking IRQs earlier would prevent to ack the current IRQ */
+ gic_start_pmr_masking();
+ }
+
if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
int err;

if (static_branch_likely(&supports_deactivate_key))
gic_write_eoir(irqnr);
- else
+ else if (!gic_prio_masking_enabled())
isb();

err = handle_domain_irq(gic_data.domain, irqnr, regs);
--
1.9.1

2018-08-28 15:54:23

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 21/27] arm64: Switch to PMR masking when starting CPUs

Once the boot CPU has been prepared or a new secondary CPU has been
brought up, use ICC_PMR_EL1 to mask interrupts on that CPU and clear
PSR.I bit.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Suggested-by: Daniel Thompson <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: James Morse <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
arch/arm64/include/asm/irqflags.h | 3 +++
arch/arm64/kernel/head.S | 35 +++++++++++++++++++++++++++++++++++
arch/arm64/kernel/smp.c | 5 +++++
3 files changed, 43 insertions(+)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 193cfd0..d31e9b6 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -153,5 +153,8 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
return (ARCH_FLAGS_GET_DAIF(flags) & (PSR_I_BIT)) |
!(ARCH_FLAGS_GET_PMR(flags) & ICC_PMR_EL1_EN_BIT);
}
+
+void maybe_switch_to_sysreg_gic_cpuif(void);
+
#endif
#endif
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b085306..ba73690 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -648,6 +648,41 @@ set_cpu_boot_mode_flag:
ENDPROC(set_cpu_boot_mode_flag)

/*
+ * void maybe_switch_to_sysreg_gic_cpuif(void)
+ *
+ * Enable interrupt controller system register access if this feature
+ * has been detected by the alternatives system.
+ *
+ * Before we jump into generic code we must enable interrupt controller system
+ * register access because this is required by the irqflags macros. We must
+ * also mask interrupts at the PMR and unmask them within the PSR. That leaves
+ * us set up and ready for the kernel to make its first call to
+ * arch_local_irq_enable().
+ *
+ */
+ENTRY(maybe_switch_to_sysreg_gic_cpuif)
+alternative_if_not ARM64_HAS_IRQ_PRIO_MASKING
+ b 1f
+alternative_else
+ mrs_s x0, SYS_ICC_SRE_EL1
+alternative_endif
+ orr x0, x0, #1
+ msr_s SYS_ICC_SRE_EL1, x0 // Set ICC_SRE_EL1.SRE==1
+ isb // Make sure SRE is now set
+ mrs x0, daif
+ tbz x0, #7, no_mask_pmr // Are interrupts on?
+ mov x0, ICC_PMR_EL1_MASKED
+ msr_s SYS_ICC_PMR_EL1, x0 // Prepare for unmask of I bit
+ msr daifclr, #2 // Clear the I bit
+ b 1f
+no_mask_pmr:
+ mov x0, ICC_PMR_EL1_UNMASKED
+ msr_s SYS_ICC_PMR_EL1, x0
+1:
+ ret
+ENDPROC(maybe_switch_to_sysreg_gic_cpuif)
+
+/*
* These values are written with the MMU off, but read with the MMU on.
* Writers will invalidate the corresponding address, discarding up to a
* 'Cache Writeback Granule' (CWG) worth of data. The linker script ensures
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 22c9a0a..443fa2b 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -185,6 +185,8 @@ asmlinkage notrace void secondary_start_kernel(void)
struct mm_struct *mm = &init_mm;
unsigned int cpu;

+ maybe_switch_to_sysreg_gic_cpuif();
+
cpu = task_cpu(current);
set_my_cpu_offset(per_cpu_offset(cpu));

@@ -421,6 +423,9 @@ void __init smp_prepare_boot_cpu(void)
* and/or scheduling is enabled.
*/
apply_boot_alternatives();
+
+ /* Conditionally switch to GIC PMR for interrupt masking */
+ maybe_switch_to_sysreg_gic_cpuif();
}

static u64 __init of_get_cpu_mpidr(struct device_node *dn)
--
1.9.1

2018-08-28 15:54:23

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 12/27] arm64: Unmask PMR before going idle

CPU does not received signals for interrupts with a priority masked by
ICC_PMR_EL1. This means the CPU might not come back from a WFI
instruction.

Make sure ICC_PMR_EL1 does not mask interrupts when doing a WFI.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Suggested-by: Daniel Thompson <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/mm/proc.S | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 03646e6..cc1caf7 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -20,6 +20,7 @@

#include <linux/init.h>
#include <linux/linkage.h>
+#include <linux/irqchip/arm-gic-v3.h>
#include <asm/assembler.h>
#include <asm/asm-offsets.h>
#include <asm/hwcap.h>
@@ -53,10 +54,27 @@
* cpu_do_idle()
*
* Idle the processor (wait for interrupt).
+ *
+ * If the CPU supports priority masking we must do additional work to
+ * ensure that interrupts are not masked at the PMR (because the core will
+ * not wake up if we block the wake up signal in the interrupt controller).
*/
ENTRY(cpu_do_idle)
+alternative_if_not ARM64_HAS_IRQ_PRIO_MASKING
+ dsb sy // WFI may enter a low-power mode
+ wfi
+ ret
+alternative_else
+ mrs x0, daif // save I bit
+ msr daifset, #2 // set I bit
+ mrs_s x1, SYS_ICC_PMR_EL1 // save PMR
+alternative_endif
+ mov x2, #ICC_PMR_EL1_UNMASKED
+ msr_s SYS_ICC_PMR_EL1, x2 // unmask at PMR
dsb sy // WFI may enter a low-power mode
wfi
+ msr_s SYS_ICC_PMR_EL1, x1 // restore PMR
+ msr daif, x0 // restore I bit
ret
ENDPROC(cpu_do_idle)

--
1.9.1

2018-08-28 15:54:24

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 22/27] arm64: Add build option for IRQ masking via priority

Provide a build option to enable using GICv3 priorities to enable/disable
interrupts.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Suggested-by: Daniel Thompson <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 29e75b4..d09c6ff 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -944,6 +944,21 @@ config ARM64_SSBD

If unsure, say Y.

+config USE_ICC_SYSREGS_FOR_IRQFLAGS
+ bool "Use ICC system registers for IRQ masking"
+ select CONFIG_ARM_GIC_V3
+ help
+ Using the ICC system registers for IRQ masking makes it possible
+ to simulate NMI on ARM64 systems. This allows several interesting
+ features (especially debug features) to be used on these systems.
+
+ Say Y here to implement IRQ masking using ICC system
+ registers when the GIC System Registers are available. The changes
+ are applied dynamically using the alternatives system so it is safe
+ to enable this option on systems with older interrupt controllers.
+
+ If unsure, say N
+
menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on COMPAT
--
1.9.1

2018-08-28 15:54:28

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 24/27] irqchip/gic-v3: Detect current view of GIC priorities

The values non secure EL1 needs to use for PMR and RPR registers depends on
the value of SCR_EL3.FIQ.

The values non secure EL1 sees from the distributor and redistributor
depend on whether security is enabled for the GIC or not.

Figure out what values we are dealing with to know if the values we use for
PMR and RPR match the priority values that have been configured in the
distributor and redistributors.

Also, add firmware requirements related to SCR_EL3.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
Documentation/arm64/booting.txt | 5 +++++
drivers/irqchip/irq-gic-v3.c | 42 +++++++++++++++++++++++++++++++++++------
2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 8d0df62..e387938 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -188,6 +188,11 @@ Before jumping into the kernel, the following conditions must be met:
the kernel image will be entered must be initialised by software at a
higher exception level to prevent execution in an UNKNOWN state.

+ - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
+ executing on.
+ - The value of SCR_EL3.FIQ must be the same as the one present at boot
+ time whenever the kernel is executing.
+
For systems with a GICv3 interrupt controller to be used in v3 mode:
- If EL3 is present:
ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 56d1fb9..ffe98e8 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -63,6 +63,28 @@ struct gic_chip_data {
static struct gic_chip_data gic_data __read_mostly;
static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key);

+/*
+ * The behaviours of RPR and PMR registers differ depending on the value of
+ * SCR_EL3.FIQ, and the behaviour of non-secure priority registers of the
+ * distributor and redistributors depends on whether security is enabled in the
+ * GIC.
+ *
+ * When security is enabled, non-secure priority values from the (re)distributor
+ * are presented to the GIC CPUIF as follow:
+ * (GIC_(R)DIST_PRI[irq] >> 1) | 0x80;
+ *
+ * If SCR_EL3.FIQ == 1, the values writen to/read from PMR and RPR at non-secure
+ * EL1 are subject to a similar operation thus matching the priorities presented
+ * from the (re)distributor when security is enabled.
+ *
+ * see GICv3/GICv4 Architecture Specification (IHI0069D):
+ * - section 4.8.1 Non-secure accesses to register fields for Secure interrupt
+ * priorities.
+ * - Figure 4-7 Secure read of the priority field for a Non-secure Group 1
+ * interrupt.
+ */
+DEFINE_STATIC_KEY_FALSE(have_non_secure_prio_view);
+
static struct gic_kvm_info gic_v3_kvm_info;
static DEFINE_PER_CPU(bool, has_rss);

@@ -568,6 +590,12 @@ static void gic_update_vlpi_properties(void)
!gic_data.rdists.has_direct_lpi ? "no " : "");
}

+/* Check whether it's single security state view */
+static inline bool gic_dist_security_disabled(void)
+{
+ return readl_relaxed(gic_data.dist_base + GICD_CTLR) & GICD_CTLR_DS;
+}
+
static void gic_cpu_sys_reg_init(void)
{
int i, cpu = smp_processor_id();
@@ -593,6 +621,9 @@ static void gic_cpu_sys_reg_init(void)
/* Set priority mask register */
if (!gic_prio_masking_enabled())
write_gicreg(DEFAULT_PMR_VALUE, ICC_PMR_EL1);
+ else if (static_branch_likely(&have_non_secure_prio_view) && group0)
+ /* Mismatch configuration with boot CPU */
+ WARN_ON(group0 && !gic_dist_security_disabled());

/*
* Some firmwares hand over to the kernel with the BPR changed from
@@ -845,12 +876,6 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
#endif

#ifdef CONFIG_CPU_PM
-/* Check whether it's single security state view */
-static bool gic_dist_security_disabled(void)
-{
- return readl_relaxed(gic_data.dist_base + GICD_CTLR) & GICD_CTLR_DS;
-}
-
static int gic_cpu_pm_notifier(struct notifier_block *self,
unsigned long cmd, void *v)
{
@@ -1161,6 +1186,11 @@ static int __init gic_init_bases(void __iomem *dist_base,
gic_cpu_init();
gic_cpu_pm_init();

+ if (gic_prio_masking_enabled()) {
+ if (!gic_has_group0() || gic_dist_security_disabled())
+ static_branch_enable(&have_non_secure_prio_view);
+ }
+
return 0;

out_free:
--
1.9.1


2018-08-28 15:54:28

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 13/27] arm/arm64: gic-v3: Add helper functions to manage IRQ priorities

Add a function to check if priority masking is supported and accessors
for PMR/RPR.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
arch/arm/include/asm/arch_gicv3.h | 21 +++++++++++++++++++++
arch/arm64/include/asm/arch_gicv3.h | 21 +++++++++++++++++++++
2 files changed, 42 insertions(+)

diff --git a/arch/arm/include/asm/arch_gicv3.h b/arch/arm/include/asm/arch_gicv3.h
index 0bd5307..58d5d3e 100644
--- a/arch/arm/include/asm/arch_gicv3.h
+++ b/arch/arm/include/asm/arch_gicv3.h
@@ -34,6 +34,7 @@
#define ICC_SRE __ACCESS_CP15(c12, 0, c12, 5)
#define ICC_IGRPEN1 __ACCESS_CP15(c12, 0, c12, 7)
#define ICC_BPR1 __ACCESS_CP15(c12, 0, c12, 3)
+#define ICC_RPR __ACCESS_CP15(c12, 0, c11, 3)

#define __ICC_AP0Rx(x) __ACCESS_CP15(c12, 0, c8, 4 | x)
#define ICC_AP0R0 __ICC_AP0Rx(0)
@@ -245,6 +246,21 @@ static inline void gic_write_bpr1(u32 val)
write_sysreg(val, ICC_BPR1);
}

+static inline u32 gic_read_pmr(void)
+{
+ return read_sysreg(ICC_PMR);
+}
+
+static inline void gic_write_pmr(u32 val)
+{
+ write_sysreg(val, ICC_PMR);
+}
+
+static inline u32 gic_read_rpr(void)
+{
+ return read_sysreg(ICC_RPR);
+}
+
/*
* Even in 32bit systems that use LPAE, there is no guarantee that the I/O
* interface provides true 64bit atomic accesses, so using strd/ldrd doesn't
@@ -347,5 +363,10 @@ static inline void gits_write_vpendbaser(u64 val, void * __iomem addr)

#define gits_read_vpendbaser(c) __gic_readq_nonatomic(c)

+static inline bool gic_prio_masking_enabled(void)
+{
+ return false;
+}
+
#endif /* !__ASSEMBLY__ */
#endif /* !__ASM_ARCH_GICV3_H */
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index e278f94..19a5b1f 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -114,6 +114,21 @@ static inline void gic_write_bpr1(u32 val)
write_sysreg_s(val, SYS_ICC_BPR1_EL1);
}

+static inline u32 gic_read_pmr(void)
+{
+ return read_sysreg_s(SYS_ICC_PMR_EL1);
+}
+
+static inline void gic_write_pmr(u32 val)
+{
+ write_sysreg_s(val, SYS_ICC_PMR_EL1);
+}
+
+static inline u32 gic_read_rpr(void)
+{
+ return read_sysreg_s(SYS_ICC_RPR_EL1);
+}
+
#define gic_read_typer(c) readq_relaxed(c)
#define gic_write_irouter(v, c) writeq_relaxed(v, c)
#define gic_read_lpir(c) readq_relaxed(c)
@@ -140,5 +155,11 @@ static inline void gic_write_bpr1(u32 val)
#define gits_write_vpendbaser(v, c) writeq_relaxed(v, c)
#define gits_read_vpendbaser(c) readq_relaxed(c)

+static inline bool gic_prio_masking_enabled(void)
+{
+ return IS_ENABLED(CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS)
+ && cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
+}
+
#endif /* __ASSEMBLY__ */
#endif /* __ASM_ARCH_GICV3_H */
--
1.9.1

2018-08-28 15:54:28

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 14/27] arm64: kvm: Unmask PMR before entering guest

Interrupts masked by ICC_PMR_EL1 will not be signaled to the CPU. This
means that hypervisor will not receive masked interrupts while running a
guest.

Avoid this by making sure ICC_PMR_EL1 is unmasked when we enter a guest.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
---
arch/arm64/include/asm/kvm_host.h | 12 ++++++++++++
arch/arm64/kvm/hyp/switch.c | 17 +++++++++++++++++
2 files changed, 29 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f26055f..9689592 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -24,6 +24,7 @@

#include <linux/types.h>
#include <linux/kvm_types.h>
+#include <asm/arch_gicv3.h>
#include <asm/cpufeature.h>
#include <asm/daifflags.h>
#include <asm/fpsimd.h>
@@ -466,6 +467,17 @@ static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
static inline void kvm_arm_vhe_guest_enter(void)
{
local_daif_mask();
+
+ /*
+ * Having IRQs masked via PMR when entering the guest means the GIC
+ * will not signal the CPU of interrupts of lower priority, and the
+ * only way to get out will be via guest exceptions.
+ * Naturally, we want to avoid this.
+ */
+ if (gic_prio_masking_enabled()) {
+ gic_write_pmr(ICC_PMR_EL1_UNMASKED);
+ dsb(sy);
+ }
}

static inline void kvm_arm_vhe_guest_exit(void)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index d496ef5..074ab21 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -22,6 +22,7 @@

#include <kvm/arm_psci.h>

+#include <asm/arch_gicv3.h>
#include <asm/cpufeature.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
@@ -533,6 +534,19 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
struct kvm_cpu_context *host_ctxt;
struct kvm_cpu_context *guest_ctxt;
u64 exit_code;
+ u32 host_pmr = ICC_PMR_EL1_UNMASKED;
+
+ /*
+ * Having IRQs masked via PMR when entering the guest means the GIC
+ * will not signal the CPU of interrupts of lower priority, and the
+ * only way to get out will be via guest exceptions.
+ * Naturally, we want to avoid this.
+ */
+ if (gic_prio_masking_enabled()) {
+ host_pmr = gic_read_pmr();
+ gic_write_pmr(ICC_PMR_EL1_UNMASKED);
+ dsb(sy);
+ }

vcpu = kern_hyp_va(vcpu);

@@ -586,6 +600,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
*/
__debug_switch_to_host(vcpu);

+ if (gic_prio_masking_enabled())
+ gic_write_pmr(host_pmr);
+
return exit_code;
}

--
1.9.1

2018-08-28 15:54:33

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 27/27] irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI

Implement NMI callbacks for GICv3 irqchip. Install NMI safe handlers
when setting up interrupt line as NMI.

Only SPIs and PPIs are allowed to be set up as NMI.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 1af2fcc..b1b255a 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -311,6 +311,70 @@ static int gic_irq_get_irqchip_state(struct irq_data *d,
return 0;
}

+static int gic_irq_set_irqchip_prio(struct irq_data *d, u8 prio)
+{
+ struct irq_desc *desc = irq_to_desc(d->irq);
+ /* Number of CPUs having PPI (idx + 16) setup as NMI */
+ static uint32_t nb_ppinmi_refs[16] = { 0 };
+
+ if (gic_peek_irq(d, GICD_ISENABLER)) {
+ pr_err("Cannot set NMI property of enabled IRQ %u\n", d->irq);
+ return -EINVAL;
+ }
+
+ /*
+ * A secondary irq_chip should be in charge of LPI request,
+ * it should not be possible to get there
+ */
+ if (WARN_ON(gic_irq(d) >= 8192))
+ return -EINVAL;
+
+ /* desc lock should already be held */
+ if (prio == GICD_INT_NMI_PRI) {
+ if (gic_irq(d) < 32) {
+ /* Setting up NMI, only switch handler for first NMI */
+ if (nb_ppinmi_refs[gic_irq(d) - 16] == 0)
+ desc->handle_irq = handle_percpu_devid_fasteoi_nmi;
+
+ nb_ppinmi_refs[gic_irq(d) - 16]++;
+ } else {
+ desc->handle_irq = handle_fasteoi_nmi;
+ }
+ } else if (prio == GICD_INT_DEF_PRI) {
+ if (gic_irq(d) < 32) {
+ /* Tearing down NMI, only switch handler for last NMI */
+ if (nb_ppinmi_refs[gic_irq(d) - 16] == 1)
+ desc->handle_irq = handle_percpu_devid_irq;
+
+ nb_ppinmi_refs[gic_irq(d) - 16]--;
+ } else {
+ desc->handle_irq = handle_fasteoi_irq;
+ }
+ } else {
+ return -EINVAL;
+ }
+
+ gic_set_irq_prio(gic_irq(d), gic_dist_base(d), prio);
+
+ return 0;
+}
+
+static int gic_irq_nmi_setup(struct irq_data *d)
+{
+ if (!gic_supports_nmi())
+ return -EINVAL;
+
+ return gic_irq_set_irqchip_prio(d, GICD_INT_NMI_PRI);
+}
+
+static void gic_irq_nmi_teardown(struct irq_data *d)
+{
+ if (WARN_ON(!gic_supports_nmi()))
+ return;
+
+ gic_irq_set_irqchip_prio(d, GICD_INT_DEF_PRI);
+}
+
static void gic_eoi_irq(struct irq_data *d)
{
gic_write_eoir(gic_irq(d));
@@ -937,6 +1001,8 @@ static inline void gic_cpu_pm_init(void) { }
.irq_set_affinity = gic_set_affinity,
.irq_get_irqchip_state = gic_irq_get_irqchip_state,
.irq_set_irqchip_state = gic_irq_set_irqchip_state,
+ .irq_nmi_setup = gic_irq_nmi_setup,
+ .irq_nmi_teardown = gic_irq_nmi_teardown,
.flags = IRQCHIP_SET_TYPE_MASKED |
IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_MASK_ON_SUSPEND,
@@ -952,6 +1018,8 @@ static inline void gic_cpu_pm_init(void) { }
.irq_get_irqchip_state = gic_irq_get_irqchip_state,
.irq_set_irqchip_state = gic_irq_set_irqchip_state,
.irq_set_vcpu_affinity = gic_irq_set_vcpu_affinity,
+ .irq_nmi_setup = gic_irq_nmi_setup,
+ .irq_nmi_teardown = gic_irq_nmi_teardown,
.flags = IRQCHIP_SET_TYPE_MASKED |
IRQCHIP_SKIP_SET_WAKE |
IRQCHIP_MASK_ON_SUSPEND,
@@ -1147,6 +1215,11 @@ static int partition_domain_translate(struct irq_domain *d,
static void gic_enable_nmi_support(void)
{
static_branch_enable(&have_non_secure_prio_view);
+
+ if (static_branch_likely(&supports_deactivate_key))
+ gic_eoimode1_chip.flags |= IRQCHIP_SUPPORTS_NMI;
+ else
+ gic_chip.flags |= IRQCHIP_SUPPORTS_NMI;
}

static int __init gic_init_bases(void __iomem *dist_base,
--
1.9.1


2018-08-28 15:55:14

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 26/27] irqchip/gic: Add functions to access irq priorities

Add accessors to the GIC distributor/redistributors priority registers.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-common.c | 10 ++++++++++
drivers/irqchip/irq-gic-common.h | 2 ++
2 files changed, 12 insertions(+)

diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
index 01e673c..910746f 100644
--- a/drivers/irqchip/irq-gic-common.c
+++ b/drivers/irqchip/irq-gic-common.c
@@ -98,6 +98,16 @@ int gic_configure_irq(unsigned int irq, unsigned int type,
return ret;
}

+void gic_set_irq_prio(unsigned int irq, void __iomem *base, u8 prio)
+{
+ writeb_relaxed(prio, base + GIC_DIST_PRI + irq);
+}
+
+u8 gic_get_irq_prio(unsigned int irq, void __iomem *base)
+{
+ return readb_relaxed(base + GIC_DIST_PRI + irq);
+}
+
void gic_dist_config(void __iomem *base, int gic_irqs,
void (*sync_access)(void))
{
diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h
index 3919cd7..1586dbd 100644
--- a/drivers/irqchip/irq-gic-common.h
+++ b/drivers/irqchip/irq-gic-common.h
@@ -35,6 +35,8 @@ void gic_dist_config(void __iomem *base, int gic_irqs,
void gic_cpu_config(void __iomem *base, void (*sync_access)(void));
void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
void *data);
+void gic_set_irq_prio(unsigned int irq, void __iomem *base, u8 prio);
+u8 gic_get_irq_prio(unsigned int irq, void __iomem *base);

void gic_set_kvm_info(const struct gic_kvm_info *info);

--
1.9.1


2018-08-28 15:55:23

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 23/27] arm64: Handle serror in NMI context

Per definition of the daifflags, Serrors can occur during any interrupt
context, that includes NMI contexts. Trying to nmi_enter in an nmi context
will crash.

Skip nmi_enter/nmi_exit when serror occurred during an NMI.

Suggested-by: James Morse <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Dave Martin <[email protected]>
Cc: James Morse <[email protected]>
---
arch/arm64/kernel/traps.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 039e9ff..4955ec6 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -713,13 +713,17 @@ bool arm64_is_fatal_ras_serror(struct pt_regs *regs, unsigned int esr)

asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
{
- nmi_enter();
+ const bool was_in_nmi = in_nmi();
+
+ if (!was_in_nmi)
+ nmi_enter();

/* non-RAS errors are not containable */
if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
arm64_serror_panic(regs, esr);

- nmi_exit();
+ if (!was_in_nmi)
+ nmi_exit();
}

void __pte_error(const char *file, int line, unsigned long val)
--
1.9.1


2018-08-28 15:55:47

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 25/27] irqchip/gic-v3: Add base support for pseudo-NMI

Provide a higher priority to be used for pseudo-NMIs. When such an
interrupt is received, enter the NMI state and prevent other NMIs to
be raised.

When returning from a pseudo-NMI, skip preemption and tracing if the
interrupted context has interrupts disabled.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
arch/arm/include/asm/arch_gicv3.h | 6 ++++++
arch/arm64/include/asm/arch_gicv3.h | 6 ++++++
arch/arm64/kernel/entry.S | 43 +++++++++++++++++++++++++++++++++++++
drivers/irqchip/irq-gic-v3.c | 34 ++++++++++++++++++++++++++++-
4 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/arch_gicv3.h b/arch/arm/include/asm/arch_gicv3.h
index b39d620..1ed0476 100644
--- a/arch/arm/include/asm/arch_gicv3.h
+++ b/arch/arm/include/asm/arch_gicv3.h
@@ -374,5 +374,11 @@ static inline void gic_start_pmr_masking(void)
WARN_ON(true);
}

+static inline void gic_set_nmi_active(void)
+{
+ /* Should not get called */
+ WARN_ON(true);
+}
+
#endif /* !__ASSEMBLY__ */
#endif /* !__ASM_ARCH_GICV3_H */
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index eb55da8..6213d6f 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -167,5 +167,11 @@ static inline void gic_start_pmr_masking(void)
asm volatile ("msr daifclr, #2" : : : "memory");
}

+/* Notify an NMI is active, blocking other NMIs */
+static inline void gic_set_nmi_active(void)
+{
+ asm volatile ("msr daifset, #2" : : : "memory");
+}
+
#endif /* __ASSEMBLY__ */
#endif /* __ASM_ARCH_GICV3_H */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 91e1e3d..39d6ef0 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -411,6 +411,16 @@ alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
mov sp, x19
.endm

+ /* Should be checked on return from irq handlers */
+ .macro branch_if_was_nmi, tmp, target
+ alternative_if ARM64_HAS_IRQ_PRIO_MASKING
+ mrs \tmp, daif
+ alternative_else
+ mov \tmp, #0
+ alternative_endif
+ tbnz \tmp, #7, \target // Exiting an NMI
+ .endm
+
/*
* These are the registers used in the syscall handler, and allow us to
* have in theory up to 7 arguments to a function - x0 to x6.
@@ -631,12 +641,30 @@ ENDPROC(el1_sync)
el1_irq:
kernel_entry 1
enable_da_f
+
#ifdef CONFIG_TRACE_IRQFLAGS
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ ldr x20, [sp, #S_PMR_SAVE]
+ /* Irqs were disabled, don't trace */
+ tbz x20, ICC_PMR_EL1_EN_SHIFT, 1f
+#endif
bl trace_hardirqs_off
+1:
#endif

irq_handler

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ /*
+ * Irqs were disabled, we have an nmi.
+ * We might have interrupted a context with interrupt disabled that set
+ * NEED_RESCHED flag.
+ * Skip preemption and irq tracing if needed.
+ */
+ tbz x20, ICC_PMR_EL1_EN_SHIFT, untraced_irq_exit
+ branch_if_was_nmi x0, skip_preempt
+#endif
+
#ifdef CONFIG_PREEMPT
ldr w24, [tsk, #TSK_TI_PREEMPT] // get preempt count
cbnz w24, 1f // preempt count != 0
@@ -645,9 +673,13 @@ el1_irq:
bl el1_preempt
1:
#endif
+
+skip_preempt:
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_on
#endif
+
+untraced_irq_exit:
kernel_exit 1
ENDPROC(el1_irq)

@@ -873,6 +905,9 @@ el0_irq_naked:
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_on
#endif
+
+ branch_if_was_nmi x2, nmi_ret_to_user
+
b ret_to_user
ENDPROC(el0_irq)

@@ -1269,3 +1304,11 @@ alternative_else_nop_endif
ENDPROC(__sdei_asm_handler)
NOKPROBE(__sdei_asm_handler)
#endif /* CONFIG_ARM_SDE_INTERFACE */
+
+/*
+ * NMI return path to EL0
+ */
+nmi_ret_to_user:
+ ldr x1, [tsk, #TSK_TI_FLAGS]
+ b finish_ret_to_user
+ENDPROC(nmi_ret_to_user)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index ffe98e8..1af2fcc 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -41,6 +41,8 @@

#include "irq-gic-common.h"

+#define GICD_INT_NMI_PRI 0xa0
+
struct redist_region {
void __iomem *redist_base;
phys_addr_t phys_base;
@@ -248,6 +250,12 @@ static void gic_unmask_irq(struct irq_data *d)
gic_poke_irq(d, GICD_ISENABLER);
}

+static inline bool gic_supports_nmi(void)
+{
+ return gic_prio_masking_enabled()
+ && static_branch_likely(&have_non_secure_prio_view);
+}
+
static int gic_irq_set_irqchip_state(struct irq_data *d,
enum irqchip_irq_state which, bool val)
{
@@ -381,6 +389,23 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
int err;

+ if (gic_supports_nmi()
+ && unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
+ /*
+ * We need to prevent other NMIs to occur even after a
+ * priority drop.
+ * We keep I flag set until cpsr is restored from
+ * kernel_exit.
+ */
+ gic_set_nmi_active();
+
+ if (static_branch_likely(&supports_deactivate_key))
+ gic_write_eoir(irqnr);
+
+ err = handle_domain_nmi(gic_data.domain, irqnr, regs);
+ return;
+ }
+
if (static_branch_likely(&supports_deactivate_key))
gic_write_eoir(irqnr);
else if (!gic_prio_masking_enabled())
@@ -1119,6 +1144,11 @@ static int partition_domain_translate(struct irq_domain *d,
.select = gic_irq_domain_select,
};

+static void gic_enable_nmi_support(void)
+{
+ static_branch_enable(&have_non_secure_prio_view);
+}
+
static int __init gic_init_bases(void __iomem *dist_base,
struct redist_region *rdist_regs,
u32 nr_redist_regions,
@@ -1188,7 +1218,9 @@ static int __init gic_init_bases(void __iomem *dist_base,

if (gic_prio_masking_enabled()) {
if (!gic_has_group0() || gic_dist_security_disabled())
- static_branch_enable(&have_non_secure_prio_view);
+ gic_enable_nmi_support();
+ else
+ pr_warn("SCR_EL3.FIQ set, cannot enable use of pseudo-NMIs\n");
}

return 0;
--
1.9.1


2018-08-28 15:56:16

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 09/27] irqchip/gic: Lower priority of GIC interrupts

The current value used for IRQ priorities is high among the
non-secure interrupt priority values.

Lower the default priority of interrupts so there is more flexibility
to define higher priority interrupts.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Suggested-by: Daniel Thompson <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
include/linux/irqchip/arm-gic-common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/irqchip/arm-gic-common.h b/include/linux/irqchip/arm-gic-common.h
index 9a1a479..2c9a4b3 100644
--- a/include/linux/irqchip/arm-gic-common.h
+++ b/include/linux/irqchip/arm-gic-common.h
@@ -13,7 +13,7 @@
#include <linux/types.h>
#include <linux/ioport.h>

-#define GICD_INT_DEF_PRI 0xa0
+#define GICD_INT_DEF_PRI 0xc0
#define GICD_INT_DEF_PRI_X4 ((GICD_INT_DEF_PRI << 24) |\
(GICD_INT_DEF_PRI << 16) |\
(GICD_INT_DEF_PRI << 8) |\
--
1.9.1

2018-08-28 15:56:24

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 06/27] arm64: Delay daif masking for user return

Masking daif flags is done very early before returning to EL0.

Only toggle the interrupt masking while in the vector entry and mask daif
once in kernel_exit.

Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: James Morse <[email protected]>
---
arch/arm64/kernel/entry.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 09dbea22..85ce06ac 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -259,9 +259,9 @@ alternative_else_nop_endif
.endm

.macro kernel_exit, el
- .if \el != 0
disable_daif

+ .if \el != 0
/* Restore the task's original addr_limit. */
ldr x20, [sp, #S_ORIG_ADDR_LIMIT]
str x20, [tsk, #TSK_TI_ADDR_LIMIT]
@@ -896,7 +896,7 @@ work_pending:
* "slow" syscall return path.
*/
ret_to_user:
- disable_daif
+ disable_irq // disable interrupts
ldr x1, [tsk, #TSK_TI_FLAGS]
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
--
1.9.1


2018-08-28 15:56:55

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 08/27] irqchip/gic: Unify GIC priority definitions

LPIs use the same priority value as other GIC interrupts.

Make the GIC default priority definition visible to ITS implementation
and use this same definition for LPI priorities.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 2 +-
include/linux/irqchip/arm-gic-common.h | 6 ++++++
include/linux/irqchip/arm-gic.h | 5 -----
3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 316a575..f5391f2 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -64,7 +64,7 @@
#define LPI_PROPBASE_SZ ALIGN(BIT(LPI_NRBITS), SZ_64K)
#define LPI_PENDBASE_SZ ALIGN(BIT(LPI_NRBITS) / 8, SZ_64K)

-#define LPI_PROP_DEFAULT_PRIO 0xa0
+#define LPI_PROP_DEFAULT_PRIO GICD_INT_DEF_PRI

/*
* Collection structure - just an ID, and a redistributor address to
diff --git a/include/linux/irqchip/arm-gic-common.h b/include/linux/irqchip/arm-gic-common.h
index 0a83b43..9a1a479 100644
--- a/include/linux/irqchip/arm-gic-common.h
+++ b/include/linux/irqchip/arm-gic-common.h
@@ -13,6 +13,12 @@
#include <linux/types.h>
#include <linux/ioport.h>

+#define GICD_INT_DEF_PRI 0xa0
+#define GICD_INT_DEF_PRI_X4 ((GICD_INT_DEF_PRI << 24) |\
+ (GICD_INT_DEF_PRI << 16) |\
+ (GICD_INT_DEF_PRI << 8) |\
+ GICD_INT_DEF_PRI)
+
enum gic_type {
GIC_V2,
GIC_V3,
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 6c4aaf0..6261790 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -65,11 +65,6 @@
#define GICD_INT_EN_CLR_X32 0xffffffff
#define GICD_INT_EN_SET_SGI 0x0000ffff
#define GICD_INT_EN_CLR_PPI 0xffff0000
-#define GICD_INT_DEF_PRI 0xa0
-#define GICD_INT_DEF_PRI_X4 ((GICD_INT_DEF_PRI << 24) |\
- (GICD_INT_DEF_PRI << 16) |\
- (GICD_INT_DEF_PRI << 8) |\
- GICD_INT_DEF_PRI)

#define GICD_IIDR_IMPLEMENTER_SHIFT 0
#define GICD_IIDR_IMPLEMENTER_MASK (0xfff << GICD_IIDR_IMPLEMENTER_SHIFT)
--
1.9.1

2018-08-28 15:57:21

by Julien Thierry

[permalink] [raw]
Subject: [PATCH v5 11/27] arm64: Make PMR part of task context

If ICC_PMR_EL1 is used to mask interrupts, its value should be
saved/restored whenever a task is context switched out/in or
gets an exception.

Add PMR to the registers to save in the pt_regs struct upon kernel entry,
and restore it before ERET. Also, initialize it to a sane value when
creating new tasks.

Tested-by: Daniel Thompson <[email protected]>
Signed-off-by: Julien Thierry <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Dave Martin <[email protected]>
---
arch/arm64/include/asm/processor.h | 1 +
arch/arm64/include/asm/ptrace.h | 5 ++++-
arch/arm64/kernel/asm-offsets.c | 1 +
arch/arm64/kernel/entry.S | 16 ++++++++++++++++
arch/arm64/kernel/process.c | 2 ++
5 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 79657ad..45a2e08 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -167,6 +167,7 @@ static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
memset(regs, 0, sizeof(*regs));
forget_syscall(regs);
regs->pc = pc;
+ regs->pmr_save = ICC_PMR_EL1_UNMASKED;
}

static inline void start_thread(struct pt_regs *regs, unsigned long pc,
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 177b851..29ec217 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -25,6 +25,9 @@
#define CurrentEL_EL1 (1 << 2)
#define CurrentEL_EL2 (2 << 2)

+/* PMR value use to unmask interrupts */
+#define ICC_PMR_EL1_UNMASKED 0xf0
+
/* AArch32-specific ptrace requests */
#define COMPAT_PTRACE_GETREGS 12
#define COMPAT_PTRACE_SETREGS 13
@@ -163,7 +166,7 @@ struct pt_regs {
#endif

u64 orig_addr_limit;
- u64 unused; // maintain 16 byte alignment
+ u64 pmr_save;
u64 stackframe[2];
};

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 323aeb5..bab4122 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -78,6 +78,7 @@ int main(void)
DEFINE(S_ORIG_X0, offsetof(struct pt_regs, orig_x0));
DEFINE(S_SYSCALLNO, offsetof(struct pt_regs, syscallno));
DEFINE(S_ORIG_ADDR_LIMIT, offsetof(struct pt_regs, orig_addr_limit));
+ DEFINE(S_PMR_SAVE, offsetof(struct pt_regs, pmr_save));
DEFINE(S_STACKFRAME, offsetof(struct pt_regs, stackframe));
DEFINE(S_FRAME_SIZE, sizeof(struct pt_regs));
BLANK();
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 85ce06ac..79b06af 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -249,6 +249,15 @@ alternative_else_nop_endif
msr sp_el0, tsk
.endif

+ /* Save pmr */
+alternative_if ARM64_HAS_IRQ_PRIO_MASKING
+ mrs_s x20, SYS_ICC_PMR_EL1
+alternative_else
+ /* Keep a sane value in the task context */
+ mov x20, ICC_PMR_EL1_UNMASKED
+alternative_endif
+ str x20, [sp, #S_PMR_SAVE]
+
/*
* Registers that may be useful after this macro is invoked:
*
@@ -269,6 +278,13 @@ alternative_else_nop_endif
/* No need to restore UAO, it will be restored from SPSR_EL1 */
.endif

+ /* Restore pmr */
+alternative_if ARM64_HAS_IRQ_PRIO_MASKING
+ ldr x20, [sp, #S_PMR_SAVE]
+ msr_s SYS_ICC_PMR_EL1, x20
+ dsb sy
+alternative_else_nop_endif
+
ldp x21, x22, [sp, #S_PC] // load ELR, SPSR
.if \el == 0
ct_user_enter
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 7f1628e..1f6a4d5 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -230,6 +230,7 @@ void __show_regs(struct pt_regs *regs)
}

printk("sp : %016llx\n", sp);
+ printk("pmr_save: %08llx\n", regs->pmr_save);

i = top_reg;

@@ -355,6 +356,7 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
} else {
memset(childregs, 0, sizeof(struct pt_regs));
childregs->pstate = PSR_MODE_EL1h;
+ childregs->pmr_save = ICC_PMR_EL1_UNMASKED;
if (IS_ENABLED(CONFIG_ARM64_UAO) &&
cpus_have_const_cap(ARM64_HAS_UAO))
childregs->pstate |= PSR_UAO_BIT;
--
1.9.1

2018-08-29 11:39:53

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] arm64: provide pseudo NMI with GICv3

On Tue, Aug 28, 2018 at 04:51:10PM +0100, Julien Thierry wrote:
> Hi,
>
> This series is a continuation of the work started by Daniel [1]. The goal
> is to use GICv3 interrupt priorities to simulate an NMI.
>
> The patches depend on the core API for NMIs patches [2].

As before... is there a git tree? With the NMI core API I think I might
be able to get some of my old pseudo-NMI demos working.


> To achieve this, set two priorities, one for standard interrupts and
> another, higher priority, for NMIs. Whenever we want to disable interrupts,
> we mask the standard priority instead so NMIs can still be raised. Some
> corner cases though still require to actually mask all interrupts
> effectively disabling the NMI.
>
> Daniel Thompson ran some benchmarks [3] on the previous version showing a
> small (<1%) performance drop when using interrupt priorities.

IMHO it is very important to disclose which micro-architecture (in this
case I think I was using C-A53 and GIC-500) the performance drop is
observed with.

We know from both micro- and macro-benchmarks that the performance delta
is deeply dependant on core implementation. In fact, my own work in this
area stalled largely because the main device that justified my spending
working-hours on pseudo-NMI was too badly impacted to see it enabled by
default.


Daniel.


> Currently, only PPIs and SPIs can be set as NMIs. IPIs being currently
> hardcoded IRQ numbers, there isn't a generic interface to set SGIs as NMI
> for now. LPIs being controlled by the ITS cannot be delivered as NMI.
> When an NMI is active on a CPU, no other NMI can be triggered on the CPU.
>
> Requirements to use this:
> - Have GICv3
> - SCR_EL3.FIQ is set to 1 when linux runs or have single security state
> - Select Kernel Feature -> Use ICC system registers for IRQ masking
>
>
> * Patches 1 to 3 aim at applying some alternatives early in the boot
> process.
>
> * Patches 4 to 7 ensure the logic of daifflags remains valid
> after arch_local_irq flags use ICC_PMR_EL1.
>
> * Patches 8 and 9 clean up GIC current priority definition to make it
> easier to introduce a new priority
>
> * Patches 10 to 16 prepare arch code for the use of priorities, saving and
> restoring ICC_PMR_EL1 appropriately
>
> * Patches 17 to 20 add the support to GICv3 driver to use priority masking
> if required by the architecture
>
> * Patches 21 to 23 make arm64 code use ICC_PMR_EL1 to enable/disable
> interrupts, leaving PSR.I as often as possible
>
> * Patches 24 to 27 add the support for NMIs to GICv3 driver
>
>
> Changes since V4[4]:
> * Rebased to v4.19-rc1
> * Adapted GIC driver to the core NMI API
> * Added option to disable priority masking on command line
> * Added Daniel's Tested-by on patches related replacing PSR.I toggling with
> PMR masking
> * Fix scope matching for alternative features.
> * Spotted some more places using PSR.I or daif and replaced with generic
> interrupt functions
>
> Changes since V3[5]:
> * Big refactoring. As suggested by Marc Z., some of the bigger patches
> needed to be split into smaller one.
> * Try to reduce the amount of #ifdef for the new feature by introducing
> an individual cpufeature for priority masking
> * Do not track which alternatives have been applied (was a bit dodgy
> anyway), and use an alternative for VHE cpu_enable callback
> * Fix a build failure with arm by adding the correct RPR accessors
> * Added Suggested-by tags for changes from comming or inspired by Daniel's
> series. Do let me know if you feel I missed something and am not giving
> you due credit.
>
> Changes since V2[6]:
> * Series rebase to v4.17-rc6
> * Adapt pathces 1 and 2 to the rework of cpufeatures framework
> * Use the group0 detection scheme in the GICv3 driver to identify
> the priority view, and drop the use of a fake interrupt
> * Add the case for a GIC configured in a single security state
> * Use local_daif_restore instead of local_irq_enable the first time
> we enable interrupts after a bp hardening in the handling of a kernel
> entry. Otherwise PRS.I remains set...
>
> Changes since V1[7]:
> * Series rebased to v4.15-rc8.
> * Check for arm64_early_features in this_cpu_has_cap (spotted by Suzuki).
> * Fix issue where debug exception were not masked when enabling debug in
> mdscr_el1.
>
> Changes since RFC[8]:
> * The series was rebased to v4.15-rc2 which implied some changes mainly
> related to the work on exception entries and daif flags by James Morse.
> - The first patch in the previous series was dropped because no longer
> applicable.
> - With the semantics James introduced of "inheriting" daif flags,
> handling of PMR on exception entry is simplified as PMR is not altered
> by taking an exception and already inherited from previous state.
> - James pointed out that taking a PseudoNMI before reading the FAR_EL1
> register should not be allowed as per the TRM (D10.2.29):
> "FAR_EL1 is made UNKNOWN on an exception return from EL1."
> So in this submission PSR.I bit is cleared only after FAR_EL1 is read.
> * For KVM, only deal with PMR unmasking/restoring in common code, and VHE
> specific code makes sure PSR.I bit is set when necessary.
> * When detecting the GIC priority view (patch 5), wait for an actual
> interrupt instead of trying only once.
>
>
> [1] http://www.spinics.net/lists/arm-kernel/msg525077.html
> [2] https://lkml.org/lkml/2018/8/28/661
> [3] https://lkml.org/lkml/2018/7/20/803
> [4] https://lkml.org/lkml/2018/7/24/321
> [5] https://lkml.org/lkml/2018/5/21/276
> [6] https://lkml.org/lkml/2018/1/17/335
> [7] https://www.spinics.net/lists/arm-kernel/msg620763.html
> [8] https://www.spinics.net/lists/arm-kernel/msg610736.html
>
> Cheers,
>
> Julien
>
> -->
>
> Daniel Thompson (1):
> arm64: alternative: Apply alternatives early in boot process
>
> Julien Thierry (26):
> arm64: cpufeature: Set SYSREG_GIC_CPUIF as a boot system feature
> arm64: cpufeature: Use alternatives for VHE cpu_enable
> arm64: daifflags: Use irqflags functions for daifflags
> arm64: Use daifflag_restore after bp_hardening
> arm64: Delay daif masking for user return
> arm64: xen: Use existing helper to check interrupt status
> irqchip/gic: Unify GIC priority definitions
> irqchip/gic: Lower priority of GIC interrupts
> arm64: cpufeature: Add cpufeature for IRQ priority masking
> arm64: Make PMR part of task context
> arm64: Unmask PMR before going idle
> arm/arm64: gic-v3: Add helper functions to manage IRQ priorities
> arm64: kvm: Unmask PMR before entering guest
> arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking
> arm64: daifflags: Include PMR in daifflags restore operations
> irqchip/gic-v3: Factor group0 detection into functions
> irqchip/gic-v3: Do not overwrite PMR value
> irqchip/gic-v3: Remove acknowledge loop
> irqchip/gic-v3: Switch to PMR masking after IRQ acknowledge
> arm64: Switch to PMR masking when starting CPUs
> arm64: Add build option for IRQ masking via priority
> arm64: Handle serror in NMI context
> irqchip/gic-v3: Detect current view of GIC priorities
> irqchip/gic-v3: Add base support for pseudo-NMI
> irqchip/gic: Add functions to access irq priorities
> irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI
>
> Documentation/admin-guide/kernel-parameters.txt | 3 +
> Documentation/arm64/booting.txt | 5 +
> arch/arm/include/asm/arch_gicv3.h | 33 +++
> arch/arm64/Kconfig | 15 ++
> arch/arm64/include/asm/alternative.h | 3 +-
> arch/arm64/include/asm/arch_gicv3.h | 33 +++
> arch/arm64/include/asm/assembler.h | 17 +-
> arch/arm64/include/asm/cpucaps.h | 3 +-
> arch/arm64/include/asm/cpufeature.h | 2 +
> arch/arm64/include/asm/daifflags.h | 29 ++-
> arch/arm64/include/asm/efi.h | 3 +-
> arch/arm64/include/asm/irqflags.h | 100 +++++++--
> arch/arm64/include/asm/kvm_host.h | 12 +
> arch/arm64/include/asm/processor.h | 1 +
> arch/arm64/include/asm/ptrace.h | 13 +-
> arch/arm64/include/asm/xen/events.h | 2 +-
> arch/arm64/kernel/alternative.c | 28 ++-
> arch/arm64/kernel/asm-offsets.c | 1 +
> arch/arm64/kernel/cpufeature.c | 51 ++++-
> arch/arm64/kernel/entry.S | 63 +++++-
> arch/arm64/kernel/head.S | 35 +++
> arch/arm64/kernel/process.c | 2 +
> arch/arm64/kernel/smp.c | 12 +
> arch/arm64/kernel/traps.c | 8 +-
> arch/arm64/kvm/hyp/switch.c | 17 ++
> arch/arm64/mm/fault.c | 5 +-
> arch/arm64/mm/proc.S | 18 ++
> drivers/irqchip/irq-gic-common.c | 10 +
> drivers/irqchip/irq-gic-common.h | 2 +
> drivers/irqchip/irq-gic-v3-its.c | 2 +-
> drivers/irqchip/irq-gic-v3.c | 279 +++++++++++++++++++-----
> include/linux/irqchip/arm-gic-common.h | 6 +
> include/linux/irqchip/arm-gic.h | 5 -
> 33 files changed, 699 insertions(+), 119 deletions(-)
>
> --
> 1.9.1

2018-08-29 13:00:14

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] arm64: provide pseudo NMI with GICv3



On 29/08/18 12:37, Daniel Thompson wrote:
> On Tue, Aug 28, 2018 at 04:51:10PM +0100, Julien Thierry wrote:
>> Hi,
>>
>> This series is a continuation of the work started by Daniel [1]. The goal
>> is to use GICv3 interrupt priorities to simulate an NMI.
>>
>> The patches depend on the core API for NMIs patches [2].
>
> As before... is there a git tree? With the NMI core API I think I might
> be able to get some of my old pseudo-NMI demos working.
>

Here is the git:

git://linux-arm.org/linux-jt.git v4.19-nmi-pmu-public

there are also a few patches to make the PMU interrupt into an NMI.

>
>> To achieve this, set two priorities, one for standard interrupts and
>> another, higher priority, for NMIs. Whenever we want to disable interrupts,
>> we mask the standard priority instead so NMIs can still be raised. Some
>> corner cases though still require to actually mask all interrupts
>> effectively disabling the NMI.
>>
>> Daniel Thompson ran some benchmarks [3] on the previous version showing a
>> small (<1%) performance drop when using interrupt priorities.
>
> IMHO it is very important to disclose which micro-architecture (in this
> case I think I was using C-A53 and GIC-500) the performance drop is
> observed with.
>
> We know from both micro- and macro-benchmarks that the performance delta
> is deeply dependant on core implementation. In fact, my own work in this
> area stalled largely because the main device that justified my spending
> working-hours on pseudo-NMI was too badly impacted to see it enabled by
> default.
>

Yes, you're right, I haven't had many boards to test on, the one I
tested on had 8 C-A57. Not sure about the GIC (it implemented GICv3
architecture, that's all I know). I had similar results to yours.

I did not have the opportunity to re-run the benchmarks on real hardware
for this spin however.

Thanks,

Julien

>
> Daniel.
>
>
>> Currently, only PPIs and SPIs can be set as NMIs. IPIs being currently
>> hardcoded IRQ numbers, there isn't a generic interface to set SGIs as NMI
>> for now. LPIs being controlled by the ITS cannot be delivered as NMI.
>> When an NMI is active on a CPU, no other NMI can be triggered on the CPU.
>>
>> Requirements to use this:
>> - Have GICv3
>> - SCR_EL3.FIQ is set to 1 when linux runs or have single security state
>> - Select Kernel Feature -> Use ICC system registers for IRQ masking
>>
>>
>> * Patches 1 to 3 aim at applying some alternatives early in the boot
>> process.
>>
>> * Patches 4 to 7 ensure the logic of daifflags remains valid
>> after arch_local_irq flags use ICC_PMR_EL1.
>>
>> * Patches 8 and 9 clean up GIC current priority definition to make it
>> easier to introduce a new priority
>>
>> * Patches 10 to 16 prepare arch code for the use of priorities, saving and
>> restoring ICC_PMR_EL1 appropriately
>>
>> * Patches 17 to 20 add the support to GICv3 driver to use priority masking
>> if required by the architecture
>>
>> * Patches 21 to 23 make arm64 code use ICC_PMR_EL1 to enable/disable
>> interrupts, leaving PSR.I as often as possible
>>
>> * Patches 24 to 27 add the support for NMIs to GICv3 driver
>>
>>
>> Changes since V4[4]:
>> * Rebased to v4.19-rc1
>> * Adapted GIC driver to the core NMI API
>> * Added option to disable priority masking on command line
>> * Added Daniel's Tested-by on patches related replacing PSR.I toggling with
>> PMR masking
>> * Fix scope matching for alternative features.
>> * Spotted some more places using PSR.I or daif and replaced with generic
>> interrupt functions
>>
>> Changes since V3[5]:
>> * Big refactoring. As suggested by Marc Z., some of the bigger patches
>> needed to be split into smaller one.
>> * Try to reduce the amount of #ifdef for the new feature by introducing
>> an individual cpufeature for priority masking
>> * Do not track which alternatives have been applied (was a bit dodgy
>> anyway), and use an alternative for VHE cpu_enable callback
>> * Fix a build failure with arm by adding the correct RPR accessors
>> * Added Suggested-by tags for changes from comming or inspired by Daniel's
>> series. Do let me know if you feel I missed something and am not giving
>> you due credit.
>>
>> Changes since V2[6]:
>> * Series rebase to v4.17-rc6
>> * Adapt pathces 1 and 2 to the rework of cpufeatures framework
>> * Use the group0 detection scheme in the GICv3 driver to identify
>> the priority view, and drop the use of a fake interrupt
>> * Add the case for a GIC configured in a single security state
>> * Use local_daif_restore instead of local_irq_enable the first time
>> we enable interrupts after a bp hardening in the handling of a kernel
>> entry. Otherwise PRS.I remains set...
>>
>> Changes since V1[7]:
>> * Series rebased to v4.15-rc8.
>> * Check for arm64_early_features in this_cpu_has_cap (spotted by Suzuki).
>> * Fix issue where debug exception were not masked when enabling debug in
>> mdscr_el1.
>>
>> Changes since RFC[8]:
>> * The series was rebased to v4.15-rc2 which implied some changes mainly
>> related to the work on exception entries and daif flags by James Morse.
>> - The first patch in the previous series was dropped because no longer
>> applicable.
>> - With the semantics James introduced of "inheriting" daif flags,
>> handling of PMR on exception entry is simplified as PMR is not altered
>> by taking an exception and already inherited from previous state.
>> - James pointed out that taking a PseudoNMI before reading the FAR_EL1
>> register should not be allowed as per the TRM (D10.2.29):
>> "FAR_EL1 is made UNKNOWN on an exception return from EL1."
>> So in this submission PSR.I bit is cleared only after FAR_EL1 is read.
>> * For KVM, only deal with PMR unmasking/restoring in common code, and VHE
>> specific code makes sure PSR.I bit is set when necessary.
>> * When detecting the GIC priority view (patch 5), wait for an actual
>> interrupt instead of trying only once.
>>
>>
>> [1] http://www.spinics.net/lists/arm-kernel/msg525077.html
>> [2] https://lkml.org/lkml/2018/8/28/661
>> [3] https://lkml.org/lkml/2018/7/20/803
>> [4] https://lkml.org/lkml/2018/7/24/321
>> [5] https://lkml.org/lkml/2018/5/21/276
>> [6] https://lkml.org/lkml/2018/1/17/335
>> [7] https://www.spinics.net/lists/arm-kernel/msg620763.html
>> [8] https://www.spinics.net/lists/arm-kernel/msg610736.html
>>
>> Cheers,
>>
>> Julien
>>
>> -->
>>
>> Daniel Thompson (1):
>> arm64: alternative: Apply alternatives early in boot process
>>
>> Julien Thierry (26):
>> arm64: cpufeature: Set SYSREG_GIC_CPUIF as a boot system feature
>> arm64: cpufeature: Use alternatives for VHE cpu_enable
>> arm64: daifflags: Use irqflags functions for daifflags
>> arm64: Use daifflag_restore after bp_hardening
>> arm64: Delay daif masking for user return
>> arm64: xen: Use existing helper to check interrupt status
>> irqchip/gic: Unify GIC priority definitions
>> irqchip/gic: Lower priority of GIC interrupts
>> arm64: cpufeature: Add cpufeature for IRQ priority masking
>> arm64: Make PMR part of task context
>> arm64: Unmask PMR before going idle
>> arm/arm64: gic-v3: Add helper functions to manage IRQ priorities
>> arm64: kvm: Unmask PMR before entering guest
>> arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking
>> arm64: daifflags: Include PMR in daifflags restore operations
>> irqchip/gic-v3: Factor group0 detection into functions
>> irqchip/gic-v3: Do not overwrite PMR value
>> irqchip/gic-v3: Remove acknowledge loop
>> irqchip/gic-v3: Switch to PMR masking after IRQ acknowledge
>> arm64: Switch to PMR masking when starting CPUs
>> arm64: Add build option for IRQ masking via priority
>> arm64: Handle serror in NMI context
>> irqchip/gic-v3: Detect current view of GIC priorities
>> irqchip/gic-v3: Add base support for pseudo-NMI
>> irqchip/gic: Add functions to access irq priorities
>> irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI
>>
>> Documentation/admin-guide/kernel-parameters.txt | 3 +
>> Documentation/arm64/booting.txt | 5 +
>> arch/arm/include/asm/arch_gicv3.h | 33 +++
>> arch/arm64/Kconfig | 15 ++
>> arch/arm64/include/asm/alternative.h | 3 +-
>> arch/arm64/include/asm/arch_gicv3.h | 33 +++
>> arch/arm64/include/asm/assembler.h | 17 +-
>> arch/arm64/include/asm/cpucaps.h | 3 +-
>> arch/arm64/include/asm/cpufeature.h | 2 +
>> arch/arm64/include/asm/daifflags.h | 29 ++-
>> arch/arm64/include/asm/efi.h | 3 +-
>> arch/arm64/include/asm/irqflags.h | 100 +++++++--
>> arch/arm64/include/asm/kvm_host.h | 12 +
>> arch/arm64/include/asm/processor.h | 1 +
>> arch/arm64/include/asm/ptrace.h | 13 +-
>> arch/arm64/include/asm/xen/events.h | 2 +-
>> arch/arm64/kernel/alternative.c | 28 ++-
>> arch/arm64/kernel/asm-offsets.c | 1 +
>> arch/arm64/kernel/cpufeature.c | 51 ++++-
>> arch/arm64/kernel/entry.S | 63 +++++-
>> arch/arm64/kernel/head.S | 35 +++
>> arch/arm64/kernel/process.c | 2 +
>> arch/arm64/kernel/smp.c | 12 +
>> arch/arm64/kernel/traps.c | 8 +-
>> arch/arm64/kvm/hyp/switch.c | 17 ++
>> arch/arm64/mm/fault.c | 5 +-
>> arch/arm64/mm/proc.S | 18 ++
>> drivers/irqchip/irq-gic-common.c | 10 +
>> drivers/irqchip/irq-gic-common.h | 2 +
>> drivers/irqchip/irq-gic-v3-its.c | 2 +-
>> drivers/irqchip/irq-gic-v3.c | 279 +++++++++++++++++++-----
>> include/linux/irqchip/arm-gic-common.h | 6 +
>> include/linux/irqchip/arm-gic.h | 5 -
>> 33 files changed, 699 insertions(+), 119 deletions(-)
>>
>> --
>> 1.9.1

--
Julien Thierry

2018-08-29 21:37:14

by Stefano Stabellini

[permalink] [raw]
Subject: Re: [PATCH v5 07/27] arm64: xen: Use existing helper to check interrupt status

On Tue, 28 Aug 2018, Julien Thierry wrote:
> The status of interrupts might depend on more than just pstate. Use
> interrupts_disabled() instead of raw_irqs_disabled_flags() to take the full
> context into account.
>
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Stefano Stabellini <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>

Acked-by: Stefano Stabellini <[email protected]>


> ---
> arch/arm64/include/asm/xen/events.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/xen/events.h b/arch/arm64/include/asm/xen/events.h
> index 4e22b7a..2788e95 100644
> --- a/arch/arm64/include/asm/xen/events.h
> +++ b/arch/arm64/include/asm/xen/events.h
> @@ -14,7 +14,7 @@ enum ipi_vector {
>
> static inline int xen_irqs_disabled(struct pt_regs *regs)
> {
> - return raw_irqs_disabled_flags((unsigned long) regs->pstate);
> + return !interrupts_enabled(regs);
> }
>
> #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
> --
> 1.9.1
>

2018-09-12 10:28:54

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v5 02/27] arm64: cpufeature: Use alternatives for VHE cpu_enable

Hi Julien,

On 28/08/18 16:51, Julien Thierry wrote:
> The cpu_enable callback for VHE feature requires all alternatives to have
> been applied. This prevents applying VHE alternative separately from the
> rest.
>
> Use an alternative depending on VHE feature to know whether VHE
> alternatives have already been applied.

> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 1e433ac..3bc1c8b 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1022,8 +1024,15 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
> * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
> * do anything here.
> */
> - if (!alternatives_applied)
> - write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
> + asm volatile(ALTERNATIVE(
> + "mrs %0, tpidr_el1\n"
> + "msr tpidr_el2, %0",
> + "nop\n"
> + "nop",
> + ARM64_HAS_VIRT_HOST_EXTN)
> + : "+r" (tmp)
> + :
> + : "memory");
> }
> #endif

Catalin's preference was to keep this all in C:
https://patchwork.kernel.org/patch/10007977/
, for which we need to know if 'the' alternative has been applied.

I suspect there may be more registers in this list if we have to switch to
another EL2 register using alternatives. (but I don't have an example).

Could we make 'alternatives_applied' a macro that takes the cap as an argument?


Thanks,

James

2018-09-12 10:30:08

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v5 03/27] arm64: alternative: Apply alternatives early in boot process

Hi Julien,

On 28/08/18 16:51, Julien Thierry wrote:
> From: Daniel Thompson <[email protected]>
>
> Currently alternatives are applied very late in the boot process (and
> a long time after we enable scheduling). Some alternative sequences,
> such as those that alter the way CPU context is stored, must be applied
> much earlier in the boot sequence.
>
> Introduce apply_boot_alternatives() to allow some alternatives to be
> applied immediately after we detect the CPU features of the boot CPU.


> diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> index b5d6039..70c2604 100644
> --- a/arch/arm64/kernel/alternative.c
> +++ b/arch/arm64/kernel/alternative.c
> @@ -145,7 +145,8 @@ static void clean_dcache_range_nopatch(u64 start, u64 end)
> } while (cur += d_size, cur < end);
> }
>
> -static void __apply_alternatives(void *alt_region, bool is_module)
> +static void __apply_alternatives(void *alt_region, bool is_module,
> + unsigned long feature_mask)

Shouldn't feature_mask be a DECLARE_BITMAP() maybe-array like cpu_hwcaps?
This means it keeps working when NR_CAPS grows over 64, which might happen
sooner than we think for backported errata...


> @@ -155,6 +156,9 @@ static void __apply_alternatives(void *alt_region, bool is_module)
> for (alt = region->begin; alt < region->end; alt++) {
> int nr_inst;
>
> + if ((BIT(alt->cpufeature) & feature_mask) == 0)
> + continue;
> +
> /* Use ARM64_CB_PATCH as an unconditional patch */
> if (alt->cpufeature < ARM64_CB_PATCH &&
> !cpus_have_cap(alt->cpufeature))
> @@ -213,7 +217,7 @@ static int __apply_alternatives_multi_stop(void *unused)
> isb();
> } else {
> BUG_ON(alternatives_applied);
> - __apply_alternatives(&region, false);
> + __apply_alternatives(&region, false, ~boot_capabilities);

Ah, this is tricky. There is a bitmap_complement() for the DECLARE_BITMAP()
stuff, but we'd need a second array...

We could pass the scope around, but then __apply_alternatives() would need to
lookup the struct arm64_cpu_capabilities up every time. This is only a problem
as we have one cap-number-space for errata/features, but separate sparse lists.

(I think applying the alternatives one cap at a time is a bad idea as we would
need to walk the alternative region NR_CAPS times)


> @@ -227,6 +231,24 @@ void __init apply_alternatives_all(void)
> stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);
> }
>
> +/*
> + * This is called very early in the boot process (directly after we run
> + * a feature detect on the boot CPU). No need to worry about other CPUs
> + * here.
> + */
> +void __init apply_boot_alternatives(void)
> +{
> + struct alt_region region = {
> + .begin = (struct alt_instr *)__alt_instructions,
> + .end = (struct alt_instr *)__alt_instructions_end,
> + };
> +
> + /* If called on non-boot cpu things could go wrong */
> + WARN_ON(smp_processor_id() != 0);

Isn't the problem if there are multiple CPUs online?


> + __apply_alternatives(&region, false, boot_capabilities);
> +}
> +
> #ifdef CONFIG_MODULES
> void apply_alternatives_module(void *start, size_t length)
> {

> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 3bc1c8b..0d1e41e 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -52,6 +52,8 @@
> DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
> EXPORT_SYMBOL(cpu_hwcaps);
>
> +unsigned long boot_capabilities;
> +
> /*
> * Flag to indicate if we have computed the system wide
> * capabilities based on the boot time active CPUs. This
> @@ -1375,6 +1377,9 @@ static void __update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
> if (!cpus_have_cap(caps->capability) && caps->desc)
> pr_info("%s %s\n", info, caps->desc);
> cpus_set_cap(caps->capability);

Hmm, the bitmap behind cpus_set_cap() is what cpus_have_cap() in
__apply_alternatives() looks at. If you had a call to __apply_alternatives after
update_cpu_capabilities(SCOPE_BOOT_CPU), but before any others, it would only
apply those alternatives...

(I don't think there is a problem re-applying the same alternative, but I
haven't checked).


> +
> + if (caps->type & SCOPE_BOOT_CPU)
> + __set_bit(caps->capability, &boot_capabilities);
> }
> }


Thanks,

James

2018-09-12 10:32:39

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v5 06/27] arm64: Delay daif masking for user return

Hi Julien,

On 28/08/18 16:51, Julien Thierry wrote:
> Masking daif flags is done very early before returning to EL0.
>
> Only toggle the interrupt masking while in the vector entry and mask daif
> once in kernel_exit.

I had an earlier version that did this, but it showed up as a performance
problem. commit 8d66772e869e ("arm64: Mask all exceptions during kernel_exit")
described it as:
| Adding a naked 'disable_daif' to kernel_exit causes a performance problem
| for micro-benchmarks that do no real work, (e.g. calling getpid() in a
| loop). This is because the ret_to_user loop has already masked IRQs so
| that the TIF_WORK_MASK thread flags can't change underneath it, adding
| disable_daif is an additional self-synchronising operation.
|
| In the future, the RAS APEI code may need to modify the TIF_WORK_MASK
| flags from an SError, in which case the ret_to_user loop must mask SError
| while it examines the flags.


We may decide that the benchmark is silly, and we don't care about this. (At the
time it was easy enough to work around).

We need regular-IRQs masked when we read the TIF flags, and to stay masked until
we return to user-space.
I assume you're changing this so that psuedo-NMI are unmasked for EL0 until
kernel_exit.

I'd like to be able to change the TIF flags from the SError handlers for RAS,
which means masking SError for do_notify_resume too. (The RAS code that does
this doesn't exist today, so you can make this my problem to work out later!)
I think we should have psuedo_NMI masked if SError is masked too.


Is there a strong reason for having psuedo-NMI unmasked during
do_notify_resume(), or is it just for having the maximum amount of code exposed?


Thanks,

James

> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 09dbea22..85ce06ac 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -259,9 +259,9 @@ alternative_else_nop_endif
> .endm
>
> .macro kernel_exit, el
> - .if \el != 0
> disable_daif
>
> + .if \el != 0
> /* Restore the task's original addr_limit. */
> ldr x20, [sp, #S_ORIG_ADDR_LIMIT]
> str x20, [tsk, #TSK_TI_ADDR_LIMIT]
> @@ -896,7 +896,7 @@ work_pending:
> * "slow" syscall return path.
> */
> ret_to_user:
> - disable_daif
> + disable_irq // disable interrupts
> ldr x1, [tsk, #TSK_TI_FLAGS]
> and x2, x1, #_TIF_WORK_MASK
> cbnz x2, work_pending
>


2018-09-12 10:34:30

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v5 05/27] arm64: Use daifflag_restore after bp_hardening

Hi Julien,

On 28/08/18 16:51, Julien Thierry wrote:
> For EL0 entries requiring bp_hardening, daif status is kept at
> DAIF_PROCCTX_NOIRQ until after hardening has been done. Then interrupts
> are enabled through local_irq_enable().
>
> Before using local_irq_* functions, daifflags should be properly restored
> to a state where IRQs are enabled.

> Enable IRQs by restoring DAIF_PROCCTX state after bp hardening.

Is this just for symmetry, or are you going on to add something to the daifflags
state that local_irq_* functions won't change? (if so, could you allude to that
in the commit message)


Either way,

Acked-by: James Morse <[email protected]>

2018-09-12 11:13:48

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v5 05/27] arm64: Use daifflag_restore after bp_hardening

Hi James,

On 12/09/18 11:32, James Morse wrote:
> Hi Julien,
>
> On 28/08/18 16:51, Julien Thierry wrote:
>> For EL0 entries requiring bp_hardening, daif status is kept at
>> DAIF_PROCCTX_NOIRQ until after hardening has been done. Then interrupts
>> are enabled through local_irq_enable().
>>
>> Before using local_irq_* functions, daifflags should be properly restored
>> to a state where IRQs are enabled.
>
>> Enable IRQs by restoring DAIF_PROCCTX state after bp hardening.
>
> Is this just for symmetry, or are you going on to add something to the daifflags
> state that local_irq_* functions won't change? (if so, could you allude to that
> in the commit message)
>

What happens is that once we use ICC_PMR_EL1, local_irq_enable will not
touch PSR.I. And we are coming back from an entry where PSR.I was kept
to 1 so local_irq_enable was not actually enabling the interrupts. On
the otherhand, restore will affect both.

Another option is to have the asm macro "enable_da_f" also switch to PMR
usage (i.e. "just keep normal interrupts disabled"). Overall it would
probably be easier to reason with, but I'm just unsure whether it is
acceptable to receive a Pseudo NMI before having applied the bp_hardening.

If it is possible, I'm happy to solve this with enable_da_f.

Thanks,

>
> Either way,
>
> Acked-by: James Morse <[email protected]>
>

--
Julien Thierry

2018-09-12 12:03:38

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v5 02/27] arm64: cpufeature: Use alternatives for VHE cpu_enable

Hi James,

On 12/09/18 11:28, James Morse wrote:
> Hi Julien,
>
> On 28/08/18 16:51, Julien Thierry wrote:
>> The cpu_enable callback for VHE feature requires all alternatives to have
>> been applied. This prevents applying VHE alternative separately from the
>> rest.
>>
>> Use an alternative depending on VHE feature to know whether VHE
>> alternatives have already been applied.
>
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 1e433ac..3bc1c8b 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -1022,8 +1024,15 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
>> * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
>> * do anything here.
>> */
>> - if (!alternatives_applied)
>> - write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
>> + asm volatile(ALTERNATIVE(
>> + "mrs %0, tpidr_el1\n"
>> + "msr tpidr_el2, %0",
>> + "nop\n"
>> + "nop",
>> + ARM64_HAS_VIRT_HOST_EXTN)
>> + : "+r" (tmp)
>> + :
>> + : "memory");
>> }
>> #endif
>
> Catalin's preference was to keep this all in C:
> https://patchwork.kernel.org/patch/10007977/
> , for which we need to know if 'the' alternative has been applied.
>
> I suspect there may be more registers in this list if we have to switch to
> another EL2 register using alternatives. (but I don't have an example).
>
> Could we make 'alternatives_applied' a macro that takes the cap as an argument?
>

I wanted to do this initially, the issue was that the alternatives
framework works on regions to patch rather than caps to apply. So I
found it a bit odd to associate the "code corresponding to cap was
applied" with the alternative application.

Although in patch 3 with the feature mask I guess that at the end of the
__apply_alternatives loop, if the cap was part of the mask passed, it
can be considered as applied.
If the asm inline is problematic I can go that route.

Thanks,

--
Julien Thierry

2018-09-12 12:28:50

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v5 05/27] arm64: Use daifflag_restore after bp_hardening

Hi Julien,

On 12/09/18 12:11, Julien Thierry wrote:
> On 12/09/18 11:32, James Morse wrote:
>> On 28/08/18 16:51, Julien Thierry wrote:
>>> For EL0 entries requiring bp_hardening, daif status is kept at
>>> DAIF_PROCCTX_NOIRQ until after hardening has been done. Then interrupts
>>> are enabled through local_irq_enable().
>>>
>>> Before using local_irq_* functions, daifflags should be properly restored
>>> to a state where IRQs are enabled.
>>
>>> Enable IRQs by restoring DAIF_PROCCTX state after bp hardening.
>>
>> Is this just for symmetry, or are you going on to add something to the daifflags
>> state that local_irq_* functions won't change? (if so, could you allude to that
>> in the commit message)

> What happens is that once we use ICC_PMR_EL1, local_irq_enable will not touch
> PSR.I. And we are coming back from an entry where PSR.I was kept to 1 so
> local_irq_enable was not actually enabling the interrupts. On the otherhand,
> restore will affect both.

Got it. Thanks!

Does this mean stop_machine()s local_save_flags()/local_irq_restore() will not
be symmetric around __apply_alternatives_multi_stop()?
I see you add alternatives in these in patch 15, but I haven't got that far yet)


> Another option is to have the asm macro "enable_da_f" also switch to PMR usage
> (i.e. "just keep normal interrupts disabled"). Overall it would probably be
> easier to reason with, but I'm just unsure whether it is acceptable to receive a
> Pseudo NMI before having applied the bp_hardening.

Wouldn't this give the interrupt controller a headache? I assume IRQs really are
masked when handle_arch_irq is called. (I know nothing about the gic)


Thanks,

James

2018-09-12 12:30:54

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v5 04/27] arm64: daifflags: Use irqflags functions for daifflags

Hi Julien,

On 28/08/18 16:51, Julien Thierry wrote:
> Some of the work done in daifflags save/restore is already provided
> by irqflags functions. Daifflags should always be a superset of irqflags
> (it handles irq status + status of other flags). Modifying behaviour of
> irqflags should alter the behaviour of daifflags.
>
> Use irqflags_save/restore functions for the corresponding daifflags
> operation.

> diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
> index 22e4c83..8d91f22 100644
> --- a/arch/arm64/include/asm/daifflags.h
> +++ b/arch/arm64/include/asm/daifflags.h
> @@ -36,11 +36,8 @@ static inline unsigned long local_daif_save(void)
> {
> unsigned long flags;
>
> - asm volatile(
> - "mrs %0, daif // local_daif_save\n"
> - : "=r" (flags)
> - :
> - : "memory");
> + flags = arch_local_save_flags();

I clearly should have done this from the beginning!
(I thought there was some header-loop that prevented it, but I can't reproduce
that now!)


> local_daif_mask();
>
> return flags;
> @@ -60,11 +57,9 @@ static inline void local_daif_restore(unsigned long flags)
> {
> if (!arch_irqs_disabled_flags(flags))
> trace_hardirqs_on();
> - asm volatile(
> - "msr daif, %0 // local_daif_restore"
> - :
> - : "r" (flags)
> - : "memory");
> +
> + arch_local_irq_restore(flags);

And I thought this only messed with the I bit, which it clearly doesn't.

Thanks for fixing these!

Reviewed-by: James Morse <[email protected]>

2018-09-12 12:38:06

by Suzuki K Poulose

[permalink] [raw]
Subject: Re: [PATCH v5 02/27] arm64: cpufeature: Use alternatives for VHE cpu_enable

On 12/09/18 11:28, James Morse wrote:
> Hi Julien,
>
> On 28/08/18 16:51, Julien Thierry wrote:
>> The cpu_enable callback for VHE feature requires all alternatives to have
>> been applied. This prevents applying VHE alternative separately from the
>> rest.
>>
>> Use an alternative depending on VHE feature to know whether VHE
>> alternatives have already been applied.
>
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 1e433ac..3bc1c8b 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -1022,8 +1024,15 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
>> * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
>> * do anything here.
>> */
>> - if (!alternatives_applied)
>> - write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
>> + asm volatile(ALTERNATIVE(
>> + "mrs %0, tpidr_el1\n"
>> + "msr tpidr_el2, %0",
>> + "nop\n"
>> + "nop",
>> + ARM64_HAS_VIRT_HOST_EXTN)
>> + : "+r" (tmp)
>> + :
>> + : "memory");
>> }
>> #endif
>
> Catalin's preference was to keep this all in C:
> https://patchwork.kernel.org/patch/10007977/
> , for which we need to know if 'the' alternative has been applied.
>
> I suspect there may be more registers in this list if we have to switch to
> another EL2 register using alternatives. (but I don't have an example).
>
> Could we make 'alternatives_applied' a macro that takes the cap as an argument?

Another crude way of doing this would be, take the action in the matches() of
a boot capability, when scope contains SCOPE_BOOT_CPU (to make sure that we
don't do this always for other "matches()" call backs via "this_cpu_has_cap()")
and get rid of this "enable()". But that looks more hackish.

Suzuki

2018-09-12 13:06:23

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v5 05/27] arm64: Use daifflag_restore after bp_hardening


On 12/09/18 13:28, James Morse wrote:
> On 12/09/18 12:11, Julien Thierry wrote:
>> On 12/09/18 11:32, James Morse wrote:
>>> On 28/08/18 16:51, Julien Thierry wrote:
>>>> For EL0 entries requiring bp_hardening, daif status is kept at
>>>> DAIF_PROCCTX_NOIRQ until after hardening has been done. Then interrupts
>>>> are enabled through local_irq_enable().
>>>>
>>>> Before using local_irq_* functions, daifflags should be properly restored
>>>> to a state where IRQs are enabled.
>>>
>>>> Enable IRQs by restoring DAIF_PROCCTX state after bp hardening.
>>>
>>> Is this just for symmetry, or are you going on to add something to the daifflags
>>> state that local_irq_* functions won't change? (if so, could you allude to that
>>> in the commit message)
>
>> What happens is that once we use ICC_PMR_EL1, local_irq_enable will not touch
>> PSR.I. And we are coming back from an entry where PSR.I was kept to 1 so
>> local_irq_enable was not actually enabling the interrupts. On the otherhand,
>> restore will affect both.
>
> Got it. Thanks!
>
> Does this mean stop_machine()s local_save_flags()/local_irq_restore() will not
> be symmetric around __apply_alternatives_multi_stop()?
> I see you add alternatives in these in patch 15, but I haven't got that far yet)
>

It's a good point but it should be fine.
The changes to the irqflags make save/restore takes everything into
consideration (PMR + PSR.I) because of situtations like this,
enable/disable only toggle the PMR (so the goal is to not have PSR.I set
before reaching path calling enable/disable).
Maybe I should add a comment for this in asm/irqflags.f when I add the
alternatives, so that at least arch code can be wary of this.

>
>> Another option is to have the asm macro "enable_da_f" also switch to PMR usage
>> (i.e. "just keep normal interrupts disabled"). Overall it would probably be
>> easier to reason with, but I'm just unsure whether it is acceptable to receive a
>> Pseudo NMI before having applied the bp_hardening.
>
> Wouldn't this give the interrupt controller a headache? I assume IRQs really are
> masked when handle_arch_irq is called. (I know nothing about the gic)
>

Yes, you're right, I missed that da_f gets unmasked right before the
irq_handler... So unless I do some special case for irqs, enable_da_f is
not the way to go.

Thanks,

--
Julien Thierry

2018-09-12 13:08:14

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v5 06/27] arm64: Delay daif masking for user return

Hi James,

On 12/09/18 11:31, James Morse wrote:
> Hi Julien,
>
> On 28/08/18 16:51, Julien Thierry wrote:
>> Masking daif flags is done very early before returning to EL0.
>>
>> Only toggle the interrupt masking while in the vector entry and mask daif
>> once in kernel_exit.
>
> I had an earlier version that did this, but it showed up as a performance
> problem. commit 8d66772e869e ("arm64: Mask all exceptions during kernel_exit")
> described it as:
> | Adding a naked 'disable_daif' to kernel_exit causes a performance problem
> | for micro-benchmarks that do no real work, (e.g. calling getpid() in a
> | loop). This is because the ret_to_user loop has already masked IRQs so
> | that the TIF_WORK_MASK thread flags can't change underneath it, adding
> | disable_daif is an additional self-synchronising operation.
> |
> | In the future, the RAS APEI code may need to modify the TIF_WORK_MASK
> | flags from an SError, in which case the ret_to_user loop must mask SError
> | while it examines the flags.
>
>
> We may decide that the benchmark is silly, and we don't care about this. (At the
> time it was easy enough to work around).
>
> We need regular-IRQs masked when we read the TIF flags, and to stay masked until
> we return to user-space.
> I assume you're changing this so that psuedo-NMI are unmasked for EL0 until
> kernel_exit.
>

Yes.

> I'd like to be able to change the TIF flags from the SError handlers for RAS,
> which means masking SError for do_notify_resume too. (The RAS code that does
> this doesn't exist today, so you can make this my problem to work out later!)
> I think we should have psuedo_NMI masked if SError is masked too.
>

Yes, my intention in the few daif changes was that PseudoNMI would have
just a little bit more priority than interrupt:

Debug > Abort > FIQ (not used) > NMI (PMR masked, PSR.I == 0) > IRQ
(daif + PMR cleared)

So if at any point I break this just shout. (I did that change because
currently el0_error has everything enabled before returning).

>
> Is there a strong reason for having psuedo-NMI unmasked during
> do_notify_resume(), or is it just for having the maximum amount of code exposed?
>

As you suspected, this is to have the maximum amount of code exposed to
Pseudo-NMIs.

Since it is not a strong requirement for Pseudo-NMI, if the perf issue
is more important I can drop the patch for now. Although it would be
useful to have other opinions to see what makes the most sense.

Thanks,

>
> Thanks,
>
> James
>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 09dbea22..85ce06ac 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -259,9 +259,9 @@ alternative_else_nop_endif
>> .endm
>>
>> .macro kernel_exit, el
>> - .if \el != 0
>> disable_daif
>>
>> + .if \el != 0
>> /* Restore the task's original addr_limit. */
>> ldr x20, [sp, #S_ORIG_ADDR_LIMIT]
>> str x20, [tsk, #TSK_TI_ADDR_LIMIT]
>> @@ -896,7 +896,7 @@ work_pending:
>> * "slow" syscall return path.
>> */
>> ret_to_user:
>> - disable_daif
>> + disable_irq // disable interrupts
>> ldr x1, [tsk, #TSK_TI_FLAGS]
>> and x2, x1, #_TIF_WORK_MASK
>> cbnz x2, work_pending
>>
>

--
Julien Thierry

2018-09-12 16:51:01

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v5 03/27] arm64: alternative: Apply alternatives early in boot process

Hi James,

On 12/09/18 11:29, James Morse wrote:
> Hi Julien,
>
> On 28/08/18 16:51, Julien Thierry wrote:
>> From: Daniel Thompson <[email protected]>
>>
>> Currently alternatives are applied very late in the boot process (and
>> a long time after we enable scheduling). Some alternative sequences,
>> such as those that alter the way CPU context is stored, must be applied
>> much earlier in the boot sequence.
>>
>> Introduce apply_boot_alternatives() to allow some alternatives to be
>> applied immediately after we detect the CPU features of the boot CPU.
>
>
>> diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
>> index b5d6039..70c2604 100644
>> --- a/arch/arm64/kernel/alternative.c
>> +++ b/arch/arm64/kernel/alternative.c
>> @@ -145,7 +145,8 @@ static void clean_dcache_range_nopatch(u64 start, u64 end)
>> } while (cur += d_size, cur < end);
>> }
>>
>> -static void __apply_alternatives(void *alt_region, bool is_module)
>> +static void __apply_alternatives(void *alt_region, bool is_module,
>> + unsigned long feature_mask)
>
> Shouldn't feature_mask be a DECLARE_BITMAP() maybe-array like cpu_hwcaps?
> This means it keeps working when NR_CAPS grows over 64, which might happen
> sooner than we think for backported errata...
>
>
>> @@ -155,6 +156,9 @@ static void __apply_alternatives(void *alt_region, bool is_module)
>> for (alt = region->begin; alt < region->end; alt++) {
>> int nr_inst;
>>
>> + if ((BIT(alt->cpufeature) & feature_mask) == 0)
>> + continue;
>> +
>> /* Use ARM64_CB_PATCH as an unconditional patch */
>> if (alt->cpufeature < ARM64_CB_PATCH &&
>> !cpus_have_cap(alt->cpufeature))
>> @@ -213,7 +217,7 @@ static int __apply_alternatives_multi_stop(void *unused)
>> isb();
>> } else {
>> BUG_ON(alternatives_applied);
>> - __apply_alternatives(&region, false);
>> + __apply_alternatives(&region, false, ~boot_capabilities);
>
> Ah, this is tricky. There is a bitmap_complement() for the DECLARE_BITMAP()
> stuff, but we'd need a second array...
>
> We could pass the scope around, but then __apply_alternatives() would need to
> lookup the struct arm64_cpu_capabilities up every time. This is only a problem
> as we have one cap-number-space for errata/features, but separate sparse lists.
>

Since for each alternative we know the cpufeature associated with it,
the "lookup" is really just accessing an array with the given index, so
that could be an option.

> (I think applying the alternatives one cap at a time is a bad idea as we would
> need to walk the alternative region NR_CAPS times)
>
>
>> @@ -227,6 +231,24 @@ void __init apply_alternatives_all(void)
>> stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);
>> }
>>
>> +/*
>> + * This is called very early in the boot process (directly after we run
>> + * a feature detect on the boot CPU). No need to worry about other CPUs
>> + * here.
>> + */
>> +void __init apply_boot_alternatives(void)
>> +{
>> + struct alt_region region = {
>> + .begin = (struct alt_instr *)__alt_instructions,
>> + .end = (struct alt_instr *)__alt_instructions_end,
>> + };
>> +
>> + /* If called on non-boot cpu things could go wrong */
>> + WARN_ON(smp_processor_id() != 0);
>
> Isn't the problem if there are multiple CPUs online?
>

Yes, that makes more sense. I'll change this.

>
>> + __apply_alternatives(&region, false, boot_capabilities);
>> +}
>> +
>> #ifdef CONFIG_MODULES
>> void apply_alternatives_module(void *start, size_t length)
>> {
>
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 3bc1c8b..0d1e41e 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -52,6 +52,8 @@
>> DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
>> EXPORT_SYMBOL(cpu_hwcaps);
>>
>> +unsigned long boot_capabilities;
>> +
>> /*
>> * Flag to indicate if we have computed the system wide
>> * capabilities based on the boot time active CPUs. This
>> @@ -1375,6 +1377,9 @@ static void __update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
>> if (!cpus_have_cap(caps->capability) && caps->desc)
>> pr_info("%s %s\n", info, caps->desc);
>> cpus_set_cap(caps->capability);
>
> Hmm, the bitmap behind cpus_set_cap() is what cpus_have_cap() in
> __apply_alternatives() looks at. If you had a call to __apply_alternatives after
> update_cpu_capabilities(SCOPE_BOOT_CPU), but before any others, it would only
> apply those alternatives...
>
> (I don't think there is a problem re-applying the same alternative, but I
> haven't checked).
>

Interesting idea. If someone can confirm that patching alternatives
twice is safe, I think it would make things simpler.

Thanks,

--
Julien Thierry

2018-09-17 23:44:26

by Daniel Thompson

[permalink] [raw]
Subject: Re: [PATCH v5 03/27] arm64: alternative: Apply alternatives early in boot process

On Wed, Sep 12, 2018 at 05:49:09PM +0100, Julien Thierry wrote:
> > > + __apply_alternatives(&region, false, boot_capabilities);
> > > +}
> > > +
> > > #ifdef CONFIG_MODULES
> > > void apply_alternatives_module(void *start, size_t length)
> > > {
> >
> > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > index 3bc1c8b..0d1e41e 100644
> > > --- a/arch/arm64/kernel/cpufeature.c
> > > +++ b/arch/arm64/kernel/cpufeature.c
> > > @@ -52,6 +52,8 @@
> > > DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
> > > EXPORT_SYMBOL(cpu_hwcaps);
> > > +unsigned long boot_capabilities;
> > > +
> > > /*
> > > * Flag to indicate if we have computed the system wide
> > > * capabilities based on the boot time active CPUs. This
> > > @@ -1375,6 +1377,9 @@ static void __update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
> > > if (!cpus_have_cap(caps->capability) && caps->desc)
> > > pr_info("%s %s\n", info, caps->desc);
> > > cpus_set_cap(caps->capability);
> >
> > Hmm, the bitmap behind cpus_set_cap() is what cpus_have_cap() in
> > __apply_alternatives() looks at. If you had a call to __apply_alternatives after
> > update_cpu_capabilities(SCOPE_BOOT_CPU), but before any others, it would only
> > apply those alternatives...
> >
> > (I don't think there is a problem re-applying the same alternative, but I
> > haven't checked).
> >
>
> Interesting idea. If someone can confirm that patching alternatives twice is
> safe, I think it would make things simpler.

Early versions of this patch applied the alternatives twice. I never
noticed any problems with double patching (second time round it will
write out code that is identical to what is already there so it is
merely inefficient rather than unsafe.


Daniel.

2018-09-18 07:40:34

by Julien Thierry

[permalink] [raw]
Subject: Re: [PATCH v5 03/27] arm64: alternative: Apply alternatives early in boot process



On 18/09/18 00:44, Daniel Thompson wrote:
> On Wed, Sep 12, 2018 at 05:49:09PM +0100, Julien Thierry wrote:
>>>> + __apply_alternatives(&region, false, boot_capabilities);
>>>> +}
>>>> +
>>>> #ifdef CONFIG_MODULES
>>>> void apply_alternatives_module(void *start, size_t length)
>>>> {
>>>
>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>> index 3bc1c8b..0d1e41e 100644
>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>> @@ -52,6 +52,8 @@
>>>> DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
>>>> EXPORT_SYMBOL(cpu_hwcaps);
>>>> +unsigned long boot_capabilities;
>>>> +
>>>> /*
>>>> * Flag to indicate if we have computed the system wide
>>>> * capabilities based on the boot time active CPUs. This
>>>> @@ -1375,6 +1377,9 @@ static void __update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
>>>> if (!cpus_have_cap(caps->capability) && caps->desc)
>>>> pr_info("%s %s\n", info, caps->desc);
>>>> cpus_set_cap(caps->capability);
>>>
>>> Hmm, the bitmap behind cpus_set_cap() is what cpus_have_cap() in
>>> __apply_alternatives() looks at. If you had a call to __apply_alternatives after
>>> update_cpu_capabilities(SCOPE_BOOT_CPU), but before any others, it would only
>>> apply those alternatives...
>>>
>>> (I don't think there is a problem re-applying the same alternative, but I
>>> haven't checked).
>>>
>>
>> Interesting idea. If someone can confirm that patching alternatives twice is
>> safe, I think it would make things simpler.
>
> Early versions of this patch applied the alternatives twice. I never
> noticed any problems with double patching (second time round it will
> write out code that is identical to what is already there so it is
> merely inefficient rather than unsafe.
>

When you say early version, do you mean the first ones you did? Because
the one I picked up (v4 I believe) had a feature mask to select which
ones to apply early and then which ones to exclude when applying the
rest of the features.

But I admit I have not looked at the previous versions.

Cheers,

--
Julien Thierry

2018-09-18 17:47:15

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v5 02/27] arm64: cpufeature: Use alternatives for VHE cpu_enable

Hi Julien,

On 09/12/2018 01:03 PM, Julien Thierry wrote:
> On 12/09/18 11:28, James Morse wrote:
>> On 28/08/18 16:51, Julien Thierry wrote:
>>> The cpu_enable callback for VHE feature requires all alternatives to have
>>> been applied. This prevents applying VHE alternative separately from the
>>> rest.
>>>
>>> Use an alternative depending on VHE feature to know whether VHE
>>> alternatives have already been applied.

>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>> index 1e433ac..3bc1c8b 100644
>>> --- a/arch/arm64/kernel/cpufeature.c
>>> +++ b/arch/arm64/kernel/cpufeature.c
>>> @@ -1022,8 +1024,15 @@ static void cpu_copy_el2regs(const struct
>>> arm64_cpu_capabilities *__unused)
>>>        * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
>>>        * do anything here.
>>>        */
>>> -    if (!alternatives_applied)
>>> -        write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
>>> +    asm volatile(ALTERNATIVE(
>>> +        "mrs    %0, tpidr_el1\n"
>>> +        "msr    tpidr_el2, %0",
>>> +        "nop\n"
>>> +        "nop",
>>> +        ARM64_HAS_VIRT_HOST_EXTN)
>>> +        : "+r" (tmp)
>>> +        :
>>> +        : "memory");
>>>   }
>>>   #endif
>>
>> Catalin's preference was to keep this all in C:
>> https://patchwork.kernel.org/patch/10007977/
>> , for which we need to know if 'the' alternative has been applied.
>>
>> I suspect there may be more registers in this list if we have to switch to
>> another EL2 register using alternatives. (but I don't have an example).
>>
>> Could we make 'alternatives_applied' a macro that takes the cap as an argument?

> I wanted to do this initially, the issue was that the alternatives framework works on
> regions to patch rather than caps to apply. So I found it a bit odd to associate the
> "code corresponding to cap was applied" with the alternative application.

I agree it looks funny, but for the kernel text, its can only be one region. If we
ever had two we would still have to apply them at the same time as its not safe to
run code with alternatives partially applied.

(modules should be fine too as we apply the same alternatives as the kernel has
before we run any of the code)


I wonder if we can kill-off this function entirely... its only necessary because
set_my_cpu_offset() sets the 'wrong' tpidr register, and we need to copy them before
applying the alternatives.
... we only call set_my_cpu_offset() when we bring a cpu online, we could make it set
both tpidr registers if we're running at EL2.


Thanks,

James

2018-09-18 17:49:29

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v5 03/27] arm64: alternative: Apply alternatives early in boot process

Hi Daniel, Julien,

On 09/18/2018 12:44 AM, Daniel Thompson wrote:
> On Wed, Sep 12, 2018 at 05:49:09PM +0100, Julien Thierry wrote:
>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>> index 3bc1c8b..0d1e41e 100644
>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>> @@ -52,6 +52,8 @@
>>>> DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
>>>> EXPORT_SYMBOL(cpu_hwcaps);
>>>> +unsigned long boot_capabilities;
>>>> +
>>>> /*
>>>> * Flag to indicate if we have computed the system wide
>>>> * capabilities based on the boot time active CPUs. This
>>>> @@ -1375,6 +1377,9 @@ static void __update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
>>>> if (!cpus_have_cap(caps->capability) && caps->desc)
>>>> pr_info("%s %s\n", info, caps->desc);
>>>> cpus_set_cap(caps->capability);
>>>
>>> Hmm, the bitmap behind cpus_set_cap() is what cpus_have_cap() in
>>> __apply_alternatives() looks at. If you had a call to __apply_alternatives after
>>> update_cpu_capabilities(SCOPE_BOOT_CPU), but before any others, it would only
>>> apply those alternatives...
>>>
>>> (I don't think there is a problem re-applying the same alternative, but I
>>> haven't checked).

>> Interesting idea. If someone can confirm that patching alternatives twice is
>> safe, I think it would make things simpler.

Sounds good, I think we need to avoid adding a limit to the number of caps.

The extra-work is inefficient, but if it saves merging those lists as part of this
series its probably fine. (we only do this stuff once during boot)



> Early versions of this patch applied the alternatives twice. I never
> noticed any problems with double patching (second time round it will
> write out code that is identical to what is already there so it is
> merely inefficient rather than unsafe.

For the regular kind, I agree. But we've recently grown some fancy dynamic patching
where the code is generated at runtime, instead of swapping in an alternative
sequence. Details in commit dea5e2a4 ("arm64: alternatives: Add dynamic patching
feature"). Its unlikely we would ever apply these twice as they can't have a scope,
... and they all look safe.


Thanks,

James

2018-09-21 15:57:48

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v5 01/27] arm64: cpufeature: Set SYSREG_GIC_CPUIF as a boot system feature

On Tue, 28 Aug 2018 16:51:11 +0100,
Julien Thierry <[email protected]> wrote:
>
> Signed-off-by: Julien Thierry <[email protected]>
> Suggested-by: Daniel Thompson <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Suzuki K Poulose <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> ---
> arch/arm64/kernel/cpufeature.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index e238b79..1e433ac 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1039,7 +1039,7 @@ static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
> {
> .desc = "GIC system register CPU interface",
> .capability = ARM64_HAS_SYSREG_GIC_CPUIF,
> - .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> + .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
> .matches = has_useable_gicv3_cpuif,
> .sys_reg = SYS_ID_AA64PFR0_EL1,
> .field_pos = ID_AA64PFR0_GIC_SHIFT,
> --
> 1.9.1
>

This definitely deserves a commit message, such as:

"We do not support systems where some CPUs have an operational GICv3
CPU interface, and some don't. Let's make this requirement obvious by
flagging the GICv3 capability as being strict."

Thanks,

M.

--
Jazz is not dead, it just smell funny.

2018-09-21 16:08:18

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v5 03/27] arm64: alternative: Apply alternatives early in boot process

On Wed, 12 Sep 2018 17:49:09 +0100,
Julien Thierry <[email protected]> wrote:
>
> Hi James,
>
> On 12/09/18 11:29, James Morse wrote:
> > Hi Julien,
> >
> > On 28/08/18 16:51, Julien Thierry wrote:
> >> From: Daniel Thompson <[email protected]>
> >>
> >> Currently alternatives are applied very late in the boot process (and
> >> a long time after we enable scheduling). Some alternative sequences,
> >> such as those that alter the way CPU context is stored, must be applied
> >> much earlier in the boot sequence.
> >>
> >> Introduce apply_boot_alternatives() to allow some alternatives to be
> >> applied immediately after we detect the CPU features of the boot CPU.
> >
> >
> >> diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> >> index b5d6039..70c2604 100644
> >> --- a/arch/arm64/kernel/alternative.c
> >> +++ b/arch/arm64/kernel/alternative.c
> >> @@ -145,7 +145,8 @@ static void clean_dcache_range_nopatch(u64 start, u64 end)
> >> } while (cur += d_size, cur < end);
> >> }
> >> -static void __apply_alternatives(void *alt_region, bool
> >> is_module)
> >> +static void __apply_alternatives(void *alt_region, bool is_module,
> >> + unsigned long feature_mask)
> >
> > Shouldn't feature_mask be a DECLARE_BITMAP() maybe-array like cpu_hwcaps?
> > This means it keeps working when NR_CAPS grows over 64, which might happen
> > sooner than we think for backported errata...
> >
> >
> >> @@ -155,6 +156,9 @@ static void __apply_alternatives(void *alt_region, bool is_module)
> >> for (alt = region->begin; alt < region->end; alt++) {
> >> int nr_inst;
> >> + if ((BIT(alt->cpufeature) & feature_mask) == 0)
> >> + continue;
> >> +
> >> /* Use ARM64_CB_PATCH as an unconditional patch */
> >> if (alt->cpufeature < ARM64_CB_PATCH &&
> >> !cpus_have_cap(alt->cpufeature))
> >> @@ -213,7 +217,7 @@ static int __apply_alternatives_multi_stop(void *unused)
> >> isb();
> >> } else {
> >> BUG_ON(alternatives_applied);
> >> - __apply_alternatives(&region, false);
> >> + __apply_alternatives(&region, false, ~boot_capabilities);
> >
> > Ah, this is tricky. There is a bitmap_complement() for the DECLARE_BITMAP()
> > stuff, but we'd need a second array...
> >
> > We could pass the scope around, but then __apply_alternatives() would need to
> > lookup the struct arm64_cpu_capabilities up every time. This is only a problem
> > as we have one cap-number-space for errata/features, but separate sparse lists.
> >
>
> Since for each alternative we know the cpufeature associated with it,
> the "lookup" is really just accessing an array with the given index,
> so that could be an option.
>
> > (I think applying the alternatives one cap at a time is a bad idea as we would
> > need to walk the alternative region NR_CAPS times)
> >
> >
> >> @@ -227,6 +231,24 @@ void __init apply_alternatives_all(void)
> >> stop_machine(__apply_alternatives_multi_stop, NULL, cpu_online_mask);
> >> }
> >> +/*
> >> + * This is called very early in the boot process (directly after we run
> >> + * a feature detect on the boot CPU). No need to worry about other CPUs
> >> + * here.
> >> + */
> >> +void __init apply_boot_alternatives(void)
> >> +{
> >> + struct alt_region region = {
> >> + .begin = (struct alt_instr *)__alt_instructions,
> >> + .end = (struct alt_instr *)__alt_instructions_end,
> >> + };
> >> +
> >> + /* If called on non-boot cpu things could go wrong */
> >> + WARN_ON(smp_processor_id() != 0);
> >
> > Isn't the problem if there are multiple CPUs online?
> >
>
> Yes, that makes more sense. I'll change this.
>
> >
> >> + __apply_alternatives(&region, false, boot_capabilities);
> >> +}
> >> +
> >> #ifdef CONFIG_MODULES
> >> void apply_alternatives_module(void *start, size_t length)
> >> {
> >
> >> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >> index 3bc1c8b..0d1e41e 100644
> >> --- a/arch/arm64/kernel/cpufeature.c
> >> +++ b/arch/arm64/kernel/cpufeature.c
> >> @@ -52,6 +52,8 @@
> >> DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
> >> EXPORT_SYMBOL(cpu_hwcaps);
> >> +unsigned long boot_capabilities;
> >> +
> >> /*
> >> * Flag to indicate if we have computed the system wide
> >> * capabilities based on the boot time active CPUs. This
> >> @@ -1375,6 +1377,9 @@ static void __update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
> >> if (!cpus_have_cap(caps->capability) && caps->desc)
> >> pr_info("%s %s\n", info, caps->desc);
> >> cpus_set_cap(caps->capability);
> >
> > Hmm, the bitmap behind cpus_set_cap() is what cpus_have_cap() in
> > __apply_alternatives() looks at. If you had a call to __apply_alternatives after
> > update_cpu_capabilities(SCOPE_BOOT_CPU), but before any others, it would only
> > apply those alternatives...
> >
> > (I don't think there is a problem re-applying the same alternative, but I
> > haven't checked).
> >
>
> Interesting idea. If someone can confirm that patching alternatives
> twice is safe, I think it would make things simpler.

It may not be safe for dynamic alternatives, where the patch code is
generated at runtime and may rely on the original text (to extract a
register number, for example -- see kvm_update_va_mask).

Thanks,

M.

--
Jazz is not dead, it just smell funny.

2018-09-25 08:14:35

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v5 01/27] arm64: cpufeature: Set SYSREG_GIC_CPUIF as a boot system feature

On 25/09/18 04:10, Yao Lihua wrote:
> Hi Marc, Julien,
>
>
> On 09/21/2018 11:56 PM, Marc Zyngier wrote:
>> On Tue, 28 Aug 2018 16:51:11 +0100,
>> Julien Thierry <[email protected]> wrote:
>>> Signed-off-by: Julien Thierry <[email protected]>
>>> Suggested-by: Daniel Thompson <[email protected]>
>>> Cc: Catalin Marinas <[email protected]>
>>> Cc: Will Deacon <[email protected]>
>>> Cc: Suzuki K Poulose <[email protected]>
>>> Cc: Marc Zyngier <[email protected]>
>>> ---
>>> arch/arm64/kernel/cpufeature.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>> index e238b79..1e433ac 100644
>>> --- a/arch/arm64/kernel/cpufeature.c
>>> +++ b/arch/arm64/kernel/cpufeature.c
>>> @@ -1039,7 +1039,7 @@ static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
>>> {
>>> .desc = "GIC system register CPU interface",
>>> .capability = ARM64_HAS_SYSREG_GIC_CPUIF,
>>> - .type = ARM64_CPUCAP_SYSTEM_FEATURE,
>>> + .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
>>> .matches = has_useable_gicv3_cpuif,
>>> .sys_reg = SYS_ID_AA64PFR0_EL1,
>>> .field_pos = ID_AA64PFR0_GIC_SHIFT,
>>> --
>>> 1.9.1
>>>
>> This definitely deserves a commit message, such as:
>>
>> "We do not support systems where some CPUs have an operational GICv3
>> CPU interface, and some don't. Let's make this requirement obvious by
>> flagging the GICv3 capability as being strict."
> May I ask if it is possible to implement psedue-NMI on a arm64 SoC with GIC-400?

In theory, yes. In practice, this is likely to be both hard to implement
(you need to discover the GIC CPU interface address very early so that
you can patch the the PMR flipping code at the right time), and pretty
bad from a performance point of view (MMIO accesses are likely to be slow).

Given the above, the incentive to support such a configuration is close
to zero.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-10-03 09:24:38

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v5 08/27] irqchip/gic: Unify GIC priority definitions

On 28/08/18 16:51, Julien Thierry wrote:
> LPIs use the same priority value as other GIC interrupts.
>
> Make the GIC default priority definition visible to ITS implementation
> and use this same definition for LPI priorities.
>
> Tested-by: Daniel Thompson <[email protected]>
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Jason Cooper <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 2 +-
> include/linux/irqchip/arm-gic-common.h | 6 ++++++
> include/linux/irqchip/arm-gic.h | 5 -----
> 3 files changed, 7 insertions(+), 6 deletions(-)

I've cherry-picked this for 4.20, as this is a good cleanup.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-10-03 09:26:40

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v5 19/27] irqchip/gic-v3: Remove acknowledge loop

On 28/08/18 16:51, Julien Thierry wrote:
> Multiple interrupts pending for a CPU is actually rare. Doing an
> acknowledge loop does not give much better performance or even can
> deteriorate them.
>
> Do not loop when an interrupt has been acknowledged, just return
> from interrupt and wait for another one to be raised.
>
> Tested-by: Daniel Thompson <[email protected]>
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Jason Cooper <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3.c | 65 +++++++++++++++++++++-----------------------
> 1 file changed, 31 insertions(+), 34 deletions(-)

It would probably be valuable to do the same thing for GICv1/v2, where
the MMIO access can be even more costly.

In the meantime, I've queued this for 4.20.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-10-03 15:11:00

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v5 04/27] arm64: daifflags: Use irqflags functions for daifflags

On Tue, Aug 28, 2018 at 04:51:14PM +0100, Julien Thierry wrote:
> Some of the work done in daifflags save/restore is already provided
> by irqflags functions. Daifflags should always be a superset of irqflags
> (it handles irq status + status of other flags). Modifying behaviour of
> irqflags should alter the behaviour of daifflags.
>
> Use irqflags_save/restore functions for the corresponding daifflags
> operation.
>
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: James Morse <[email protected]>

Queued for 4.20. Thanks.

--
Catalin

2018-10-03 15:13:42

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v5 05/27] arm64: Use daifflag_restore after bp_hardening

On Tue, Aug 28, 2018 at 04:51:15PM +0100, Julien Thierry wrote:
> For EL0 entries requiring bp_hardening, daif status is kept at
> DAIF_PROCCTX_NOIRQ until after hardening has been done. Then interrupts
> are enabled through local_irq_enable().
>
> Before using local_irq_* functions, daifflags should be properly restored
> to a state where IRQs are enabled.
>
> Enable IRQs by restoring DAIF_PROCCTX state after bp hardening.
>
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: James Morse <[email protected]>

Queued for 4.20. Thanks.

--
Catalin

2018-10-03 15:15:12

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v5 07/27] arm64: xen: Use existing helper to check interrupt status

On Tue, Aug 28, 2018 at 04:51:17PM +0100, Julien Thierry wrote:
> The status of interrupts might depend on more than just pstate. Use
> interrupts_disabled() instead of raw_irqs_disabled_flags() to take the full
> context into account.
>
> Signed-off-by: Julien Thierry <[email protected]>
> Cc: Stefano Stabellini <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>

Queued for 4.20. Thanks.

--
Catalin