2018-01-05 13:12:45

by Will Deacon

Subject: [PATCH v2 00/11] arm64 kpti hardening and variant 2 workarounds

Hi again,

This is version two of the patches I posted yesterday:

http://lists.infradead.org/pipermail/linux-arm-kernel/2018-January/551838.html

Changes since v1:

* Added comment to BL/RET sequence in trampoline page
* Removed writeback addressing modes from Hyp PSCI register save/restore
* Avoid save/restore of x18/x19 across Hyp PSCI call
* Always print message when kpti is enabled
* Added tags

Cheers,

Will

--->8

Marc Zyngier (3):
arm64: Move post_ttbr_update_workaround to C code
arm64: KVM: Use per-CPU vector when BP hardening is enabled
arm64: KVM: Make PSCI_VERSION a fast path

Will Deacon (8):
arm64: use RET instruction for exiting the trampoline
arm64: Kconfig: Reword UNMAP_KERNEL_AT_EL0 kconfig entry
arm64: Take into account ID_AA64PFR0_EL1.CSV3
arm64: cpufeature: Pass capability structure to ->enable callback
drivers/firmware: Expose psci_get_version through psci_ops structure
arm64: Add skeleton to harden the branch predictor against aliasing attacks
arm64: cputype: Add missing MIDR values for Cortex-A72 and Cortex-A75
arm64: Implement branch predictor hardening for affected Cortex-A CPUs

arch/arm/include/asm/kvm_mmu.h | 10 ++++
arch/arm64/Kconfig | 30 +++++++---
arch/arm64/include/asm/assembler.h | 13 -----
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/cputype.h | 4 ++
arch/arm64/include/asm/kvm_mmu.h | 38 ++++++++++++
arch/arm64/include/asm/mmu.h | 37 ++++++++++++
arch/arm64/include/asm/sysreg.h | 2 +
arch/arm64/kernel/Makefile | 4 ++
arch/arm64/kernel/bpi.S | 79 +++++++++++++++++++++++++
arch/arm64/kernel/cpu_errata.c | 116 +++++++++++++++++++++++++++++++++++++
arch/arm64/kernel/cpufeature.c | 13 ++++-
arch/arm64/kernel/entry.S | 12 +++-
arch/arm64/kvm/hyp/switch.c | 15 ++++-
arch/arm64/mm/context.c | 11 ++++
arch/arm64/mm/fault.c | 1 +
arch/arm64/mm/proc.S | 3 +-
drivers/firmware/psci.c | 2 +
include/linux/psci.h | 1 +
virt/kvm/arm/arm.c | 8 ++-
20 files changed, 372 insertions(+), 30 deletions(-)
create mode 100644 arch/arm64/kernel/bpi.S

--
2.1.4


2018-01-05 13:12:48

by Will Deacon

Subject: [PATCH v2 02/11] arm64: Kconfig: Reword UNMAP_KERNEL_AT_EL0 kconfig entry

Although CONFIG_UNMAP_KERNEL_AT_EL0 does make KASLR more robust, it's
actually more useful as a mitigation against speculation attacks that
can leak arbitrary kernel data to userspace.

Reword the Kconfig help message to reflect this, and gate the option's
prompt behind EXPERT so that it remains enabled by default for the
majority of users.

Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3af1657fcac3..efaaa3a66b95 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -834,15 +834,14 @@ config FORCE_MAX_ZONEORDER
4M allocations matching the default size used by generic code.

config UNMAP_KERNEL_AT_EL0
- bool "Unmap kernel when running in userspace (aka \"KAISER\")"
+ bool "Unmap kernel when running in userspace (aka \"KAISER\")" if EXPERT
default y
help
- Some attacks against KASLR make use of the timing difference between
- a permission fault which could arise from a page table entry that is
- present in the TLB, and a translation fault which always requires a
- page table walk. This option defends against these attacks by unmapping
- the kernel whilst running in userspace, therefore forcing translation
- faults for all of kernel space.
+ Speculation attacks against some high-performance processors can
+ be used to bypass MMU permission checks and leak kernel data to
+ userspace. This can be defended against by unmapping the kernel
+ when running in userspace, mapping it back in on exception entry
+ via a trampoline page in the vector table.

If unsure, say Y.

--
2.1.4

2018-01-05 13:12:47

by Will Deacon

Subject: [PATCH v2 01/11] arm64: use RET instruction for exiting the trampoline

Speculation attacks against the entry trampoline can potentially resteer
the speculative instruction stream through the indirect branch and into
arbitrary gadgets within the kernel.

This patch defends against these attacks by forcing a misprediction
through the return stack: a dummy BL instruction loads an entry into
the stack, so that the predicted program flow of the subsequent RET
instruction is to a branch-to-self instruction which is finally resolved
as a branch to the kernel vectors with speculation suppressed.
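
In isolation, the idiom looks roughly like this (a sketch mirroring the
hunk below; the label is only illustrative):

	bl	2f	// push a return-stack entry pointing at the "b ."
	b	.	// predicted RET target: spins harmlessly until resolved
2:
	...		// map the kernel, compute the real vector address in x30
	ret		// predicted to hit the "b ."; architecturally goes to x30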

Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/kernel/entry.S | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 031392ee5f47..71092ee09b6b 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -1029,6 +1029,14 @@ alternative_else_nop_endif
.if \regsize == 64
msr tpidrro_el0, x30 // Restored in kernel_ventry
.endif
+ /*
+ * Defend against branch aliasing attacks by pushing a dummy
+ * entry onto the return stack and using a RET instruction to
+ * enter the full-fat kernel vectors.
+ */
+ bl 2f
+ b .
+2:
tramp_map_kernel x30
#ifdef CONFIG_RANDOMIZE_BASE
adr x30, tramp_vectors + PAGE_SIZE
@@ -1041,7 +1049,7 @@ alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
msr vbar_el1, x30
add x30, x30, #(1b - tramp_vectors)
isb
- br x30
+ ret
.endm

.macro tramp_exit, regsize = 64
--
2.1.4

2018-01-05 13:13:19

by Will Deacon

Subject: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

Aliasing attacks against CPU branch predictors can allow an attacker to
redirect speculative control flow on some CPUs and potentially divulge
information from one context to another.

This patch adds initial skeleton code behind a new Kconfig option to
enable implementation-specific mitigations against these attacks for
CPUs that are affected.
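
As a rough sketch (not part of this patch), an affected CPU's errata entry
is expected to install a callback at ->enable time, along the following
lines; the callback name and vector symbols are placeholders, and patch 11
adds the real PSCI-based version:

	/* Sketch only: ->enable hook for an affected MIDR. */
	static int enable_my_bp_hardening(void *data)
	{
		const struct arm64_cpu_capabilities *entry = data;

		/* my_bp_invalidate and the vector bounds are hypothetical */
		install_bp_hardening_cb(entry, my_bp_invalidate,
					my_hyp_vecs_start, my_hyp_vecs_end);
		return 0;
	}

The installed callback is then invoked via arm64_apply_bp_hardening() on
the return-to-user paths touched below (post_ttbr_update_workaround() and
__do_user_fault()).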

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 17 +++++++++
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/mmu.h | 37 ++++++++++++++++++++
arch/arm64/include/asm/sysreg.h | 1 +
arch/arm64/kernel/Makefile | 4 +++
arch/arm64/kernel/bpi.S | 55 +++++++++++++++++++++++++++++
arch/arm64/kernel/cpu_errata.c | 74 ++++++++++++++++++++++++++++++++++++++++
arch/arm64/kernel/cpufeature.c | 1 +
arch/arm64/mm/context.c | 2 ++
arch/arm64/mm/fault.c | 1 +
10 files changed, 194 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/kernel/bpi.S

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index efaaa3a66b95..cea44b95187c 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -845,6 +845,23 @@ config UNMAP_KERNEL_AT_EL0

If unsure, say Y.

+config HARDEN_BRANCH_PREDICTOR
+ bool "Harden the branch predictor against aliasing attacks" if EXPERT
+ default y
+ help
+ Speculation attacks against some high-performance processors rely on
+ being able to manipulate the branch predictor for a victim context by
+ executing aliasing branches in the attacker context. Such attacks
+ can be partially mitigated against by clearing internal branch
+ predictor state and limiting the prediction logic in some situations.
+
+ This config option will take CPU-specific actions to harden the
+ branch predictor against aliasing attacks and may rely on specific
+ instruction sequences or control bits being set by the system
+ firmware.
+
+ If unsure, say Y.
+
menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on COMPAT
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index b4537ffd1018..51616e77fe6b 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -42,7 +42,8 @@
#define ARM64_HAS_DCPOP 21
#define ARM64_SVE 22
#define ARM64_UNMAP_KERNEL_AT_EL0 23
+#define ARM64_HARDEN_BRANCH_PREDICTOR 24

-#define ARM64_NCAPS 24
+#define ARM64_NCAPS 25

#endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 6f7bdb89817f..6dd83d75b82a 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -41,6 +41,43 @@ static inline bool arm64_kernel_unmapped_at_el0(void)
cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0);
}

+typedef void (*bp_hardening_cb_t)(void);
+
+struct bp_hardening_data {
+ int hyp_vectors_slot;
+ bp_hardening_cb_t fn;
+};
+
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+extern char __bp_harden_hyp_vecs_start[], __bp_harden_hyp_vecs_end[];
+
+DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
+
+static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
+{
+ return this_cpu_ptr(&bp_hardening_data);
+}
+
+static inline void arm64_apply_bp_hardening(void)
+{
+ struct bp_hardening_data *d;
+
+ if (!cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR))
+ return;
+
+ d = arm64_get_bp_hardening_data();
+ if (d->fn)
+ d->fn();
+}
+#else
+static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
+{
+ return NULL;
+}
+
+static inline void arm64_apply_bp_hardening(void) { }
+#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
+
extern void paging_init(void);
extern void bootmem_init(void);
extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index ae519bbd3f9e..871744973ece 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -438,6 +438,7 @@

/* id_aa64pfr0 */
#define ID_AA64PFR0_CSV3_SHIFT 60
+#define ID_AA64PFR0_CSV2_SHIFT 56
#define ID_AA64PFR0_SVE_SHIFT 32
#define ID_AA64PFR0_GIC_SHIFT 24
#define ID_AA64PFR0_ASIMD_SHIFT 20
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 067baace74a0..0c760db04858 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -53,6 +53,10 @@ arm64-obj-$(CONFIG_ARM64_RELOC_TEST) += arm64-reloc-test.o
arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o

+ifeq ($(CONFIG_KVM),y)
+arm64-obj-$(CONFIG_HARDEN_BRANCH_PREDICTOR) += bpi.o
+endif
+
obj-y += $(arm64-obj-y) vdso/ probes/
obj-m += $(arm64-obj-m)
head-y := head.o
diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
new file mode 100644
index 000000000000..06a931eb2673
--- /dev/null
+++ b/arch/arm64/kernel/bpi.S
@@ -0,0 +1,55 @@
+/*
+ * Contains CPU specific branch predictor invalidation sequences
+ *
+ * Copyright (C) 2018 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+
+.macro ventry target
+ .rept 31
+ nop
+ .endr
+ b \target
+.endm
+
+.macro vectors target
+ ventry \target + 0x000
+ ventry \target + 0x080
+ ventry \target + 0x100
+ ventry \target + 0x180
+
+ ventry \target + 0x200
+ ventry \target + 0x280
+ ventry \target + 0x300
+ ventry \target + 0x380
+
+ ventry \target + 0x400
+ ventry \target + 0x480
+ ventry \target + 0x500
+ ventry \target + 0x580
+
+ ventry \target + 0x600
+ ventry \target + 0x680
+ ventry \target + 0x700
+ ventry \target + 0x780
+.endm
+
+ .align 11
+ENTRY(__bp_harden_hyp_vecs_start)
+ .rept 4
+ vectors __kvm_hyp_vector
+ .endr
+ENTRY(__bp_harden_hyp_vecs_end)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 0e27f86ee709..16ea5c6f314e 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -46,6 +46,80 @@ static int cpu_enable_trap_ctr_access(void *__unused)
return 0;
}

+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+#include <asm/mmu_context.h>
+#include <asm/cacheflush.h>
+
+DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
+
+#ifdef CONFIG_KVM
+static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
+ const char *hyp_vecs_end)
+{
+ void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K);
+ int i;
+
+ for (i = 0; i < SZ_2K; i += 0x80)
+ memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
+
+ flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
+}
+
+static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
+ const char *hyp_vecs_start,
+ const char *hyp_vecs_end)
+{
+ static int last_slot = -1;
+ static DEFINE_SPINLOCK(bp_lock);
+ int cpu, slot = -1;
+
+ spin_lock(&bp_lock);
+ for_each_possible_cpu(cpu) {
+ if (per_cpu(bp_hardening_data.fn, cpu) == fn) {
+ slot = per_cpu(bp_hardening_data.hyp_vectors_slot, cpu);
+ break;
+ }
+ }
+
+ if (slot == -1) {
+ last_slot++;
+ BUG_ON(((__bp_harden_hyp_vecs_end - __bp_harden_hyp_vecs_start)
+ / SZ_2K) <= last_slot);
+ slot = last_slot;
+ __copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end);
+ }
+
+ __this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
+ __this_cpu_write(bp_hardening_data.fn, fn);
+ spin_unlock(&bp_lock);
+}
+#else
+static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
+ const char *hyp_vecs_start,
+ const char *hyp_vecs_end)
+{
+ __this_cpu_write(bp_hardening_data.fn, fn);
+}
+#endif /* CONFIG_KVM */
+
+static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
+ bp_hardening_cb_t fn,
+ const char *hyp_vecs_start,
+ const char *hyp_vecs_end)
+{
+ u64 pfr0;
+
+ if (!entry->matches(entry, SCOPE_LOCAL_CPU))
+ return;
+
+ pfr0 = read_cpuid(ID_AA64PFR0_EL1);
+ if (cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_CSV2_SHIFT))
+ return;
+
+ __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
+}
+#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
+
#define MIDR_RANGE(model, min, max) \
.def_scope = SCOPE_LOCAL_CPU, \
.matches = is_affected_midr_range, \
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 55712ab4e3bf..9d4d82c11528 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -146,6 +146,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {

static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
+ ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 5f7097d0cd12..d99b36555a16 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -246,6 +246,8 @@ asmlinkage void post_ttbr_update_workaround(void)
"ic iallu; dsb nsh; isb",
ARM64_WORKAROUND_CAVIUM_27456,
CONFIG_CAVIUM_ERRATUM_27456));
+
+ arm64_apply_bp_hardening();
}

static int asids_init(void)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 22168cd0dde7..5203b6040cb6 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -318,6 +318,7 @@ static void __do_user_fault(struct task_struct *tsk, unsigned long addr,
lsb = PAGE_SHIFT;
si.si_addr_lsb = lsb;

+ arm64_apply_bp_hardening();
force_sig_info(sig, &si, tsk);
}

--
2.1.4

2018-01-05 13:13:17

by Will Deacon

Subject: [PATCH v2 08/11] arm64: KVM: Use per-CPU vector when BP hardening is enabled

From: Marc Zyngier <[email protected]>

Now that we have per-CPU vectors, let's plug them into the KVM/arm64 code.

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 10 ++++++++++
arch/arm64/include/asm/kvm_mmu.h | 38 ++++++++++++++++++++++++++++++++++++++
arch/arm64/kvm/hyp/switch.c | 2 +-
virt/kvm/arm/arm.c | 8 +++++++-
4 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index fa6f2174276b..eb46fc81a440 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -221,6 +221,16 @@ static inline unsigned int kvm_get_vmid_bits(void)
return 8;
}

+static inline void *kvm_get_hyp_vector(void)
+{
+ return kvm_ksym_ref(__kvm_hyp_vector);
+}
+
+static inline int kvm_map_vectors(void)
+{
+ return 0;
+}
+
#endif /* !__ASSEMBLY__ */

#endif /* __ARM_KVM_MMU_H__ */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 672c8684d5c2..2d6d4bd9de52 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -309,5 +309,43 @@ static inline unsigned int kvm_get_vmid_bits(void)
return (cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
}

+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+#include <asm/mmu.h>
+
+static inline void *kvm_get_hyp_vector(void)
+{
+ struct bp_hardening_data *data = arm64_get_bp_hardening_data();
+ void *vect = kvm_ksym_ref(__kvm_hyp_vector);
+
+ if (data->fn) {
+ vect = __bp_harden_hyp_vecs_start +
+ data->hyp_vectors_slot * SZ_2K;
+
+ if (!has_vhe())
+ vect = lm_alias(vect);
+ }
+
+ return vect;
+}
+
+static inline int kvm_map_vectors(void)
+{
+ return create_hyp_mappings(kvm_ksym_ref(__bp_harden_hyp_vecs_start),
+ kvm_ksym_ref(__bp_harden_hyp_vecs_end),
+ PAGE_HYP_EXEC);
+}
+
+#else
+static inline void *kvm_get_hyp_vector(void)
+{
+ return kvm_ksym_ref(__kvm_hyp_vector);
+}
+
+static inline int kvm_map_vectors(void)
+{
+ return 0;
+}
+#endif
+
#endif /* __ASSEMBLY__ */
#endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index f7c651f3a8c0..8d4f3c9d6dc4 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -52,7 +52,7 @@ static void __hyp_text __activate_traps_vhe(void)
val &= ~(CPACR_EL1_FPEN | CPACR_EL1_ZEN);
write_sysreg(val, cpacr_el1);

- write_sysreg(__kvm_hyp_vector, vbar_el1);
+ write_sysreg(kvm_get_hyp_vector(), vbar_el1);
}

static void __hyp_text __activate_traps_nvhe(void)
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 6b60c98a6e22..1c9fdb6db124 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1158,7 +1158,7 @@ static void cpu_init_hyp_mode(void *dummy)
pgd_ptr = kvm_mmu_get_httbr();
stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
hyp_stack_ptr = stack_page + PAGE_SIZE;
- vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
+ vector_ptr = (unsigned long)kvm_get_hyp_vector();

__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);
__cpu_init_stage2();
@@ -1403,6 +1403,12 @@ static int init_hyp_mode(void)
goto out_err;
}

+ err = kvm_map_vectors();
+ if (err) {
+ kvm_err("Cannot map vectors\n");
+ goto out_err;
+ }
+
/*
* Map the Hyp stack pages
*/
--
2.1.4

2018-01-05 13:13:16

by Will Deacon

Subject: [PATCH v2 10/11] arm64: cputype: Add missing MIDR values for Cortex-A72 and Cortex-A75

Hook up MIDR values for the Cortex-A72 and Cortex-A75 CPUs, since they
will soon need MIDR matches for hardening the branch predictor.

Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/cputype.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 235e77d98261..84385b94e70b 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -79,8 +79,10 @@
#define ARM_CPU_PART_AEM_V8 0xD0F
#define ARM_CPU_PART_FOUNDATION 0xD00
#define ARM_CPU_PART_CORTEX_A57 0xD07
+#define ARM_CPU_PART_CORTEX_A72 0xD08
#define ARM_CPU_PART_CORTEX_A53 0xD03
#define ARM_CPU_PART_CORTEX_A73 0xD09
+#define ARM_CPU_PART_CORTEX_A75 0xD0A

#define APM_CPU_PART_POTENZA 0x000

@@ -94,7 +96,9 @@

#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
#define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
+#define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
#define MIDR_CORTEX_A73 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A73)
+#define MIDR_CORTEX_A75 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A75)
#define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
#define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
#define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
--
2.1.4

2018-01-05 13:13:14

by Will Deacon

Subject: [PATCH v2 09/11] arm64: KVM: Make PSCI_VERSION a fast path

From: Marc Zyngier <[email protected]>

For those CPUs that require PSCI to perform a BP invalidation,
going all the way to the PSCI code for not much is a waste of
precious cycles. Let's terminate that call as early as possible.

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/kvm/hyp/switch.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 8d4f3c9d6dc4..4d273f6d0e69 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -17,6 +17,7 @@

#include <linux/types.h>
#include <linux/jump_label.h>
+#include <uapi/linux/psci.h>

#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
@@ -341,6 +342,18 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
goto again;

+ if (exit_code == ARM_EXCEPTION_TRAP &&
+ (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 ||
+ kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32) &&
+ vcpu_get_reg(vcpu, 0) == PSCI_0_2_FN_PSCI_VERSION) {
+ u64 val = PSCI_RET_NOT_SUPPORTED;
+ if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
+ val = 2;
+
+ vcpu_set_reg(vcpu, 0, val);
+ goto again;
+ }
+
if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
exit_code == ARM_EXCEPTION_TRAP) {
bool valid;
--
2.1.4

2018-01-05 13:13:13

by Will Deacon

Subject: [PATCH v2 11/11] arm64: Implement branch predictor hardening for affected Cortex-A CPUs

Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing
and can theoretically be attacked by malicious code.

This patch implements a PSCI-based mitigation for these CPUs when available.
The call into firmware will invalidate the branch predictor state, preventing
any malicious entries from affecting other victim contexts.

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/kernel/bpi.S | 24 ++++++++++++++++++++++++
arch/arm64/kernel/cpu_errata.c | 42 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 66 insertions(+)

diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
index 06a931eb2673..2e9146534174 100644
--- a/arch/arm64/kernel/bpi.S
+++ b/arch/arm64/kernel/bpi.S
@@ -53,3 +53,27 @@ ENTRY(__bp_harden_hyp_vecs_start)
vectors __kvm_hyp_vector
.endr
ENTRY(__bp_harden_hyp_vecs_end)
+ENTRY(__psci_hyp_bp_inval_start)
+ sub sp, sp, #(8 * 18)
+ stp x16, x17, [sp, #(16 * 0)]
+ stp x14, x15, [sp, #(16 * 1)]
+ stp x12, x13, [sp, #(16 * 2)]
+ stp x10, x11, [sp, #(16 * 3)]
+ stp x8, x9, [sp, #(16 * 4)]
+ stp x6, x7, [sp, #(16 * 5)]
+ stp x4, x5, [sp, #(16 * 6)]
+ stp x2, x3, [sp, #(16 * 7)]
+ stp x0, x1, [sp, #(18 * 8)]
+ mov x0, #0x84000000
+ smc #0
+ ldp x16, x17, [sp, #(16 * 0)]
+ ldp x14, x15, [sp, #(16 * 1)]
+ ldp x12, x13, [sp, #(16 * 2)]
+ ldp x10, x11, [sp, #(16 * 3)]
+ ldp x8, x9, [sp, #(16 * 4)]
+ ldp x6, x7, [sp, #(16 * 5)]
+ ldp x4, x5, [sp, #(16 * 6)]
+ ldp x2, x3, [sp, #(16 * 7)]
+ ldp x0, x1, [sp, #(18 * 8)]
+ add sp, sp, #(8 * 18)
+ENTRY(__psci_hyp_bp_inval_end)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 16ea5c6f314e..cb0fb3796bb8 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -53,6 +53,8 @@ static int cpu_enable_trap_ctr_access(void *__unused)
DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);

#ifdef CONFIG_KVM
+extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
+
static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
{
@@ -94,6 +96,9 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
spin_unlock(&bp_lock);
}
#else
+#define __psci_hyp_bp_inval_start NULL
+#define __psci_hyp_bp_inval_end NULL
+
static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
const char *hyp_vecs_start,
const char *hyp_vecs_end)
@@ -118,6 +123,21 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,

__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
}
+
+#include <linux/psci.h>
+
+static int enable_psci_bp_hardening(void *data)
+{
+ const struct arm64_cpu_capabilities *entry = data;
+
+ if (psci_ops.get_version)
+ install_bp_hardening_cb(entry,
+ (bp_hardening_cb_t)psci_ops.get_version,
+ __psci_hyp_bp_inval_start,
+ __psci_hyp_bp_inval_end);
+
+ return 0;
+}
#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */

#define MIDR_RANGE(model, min, max) \
@@ -261,6 +281,28 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
},
#endif
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
+ .enable = enable_psci_bp_hardening,
+ },
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
+ .enable = enable_psci_bp_hardening,
+ },
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
+ .enable = enable_psci_bp_hardening,
+ },
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
+ .enable = enable_psci_bp_hardening,
+ },
+#endif
{
}
};
--
2.1.4

2018-01-05 13:14:26

by Will Deacon

Subject: [PATCH v2 04/11] arm64: cpufeature: Pass capability structure to ->enable callback

In order to invoke the CPU capability ->matches callback from the ->enable
callback for applying local-CPU workarounds, we need a handle on the
capability structure.

This patch passes a pointer to the capability structure to the ->enable
callback.
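
Concretely, an ->enable callback can now recover the capability it was
registered against; a minimal sketch (the branch-predictor hardening
patches later in this series do exactly this):

	static int enable_some_workaround(void *data)
	{
		const struct arm64_cpu_capabilities *entry = data;

		/* only act on CPUs that actually match this capability */
		if (!entry->matches(entry, SCOPE_LOCAL_CPU))
			return 0;

		/* ... apply the local-CPU workaround ... */
		return 0;
	}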

Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/kernel/cpufeature.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d723fc071f39..55712ab4e3bf 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1152,7 +1152,7 @@ void __init enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
* uses an IPI, giving us a PSTATE that disappears when
* we return.
*/
- stop_machine(caps->enable, NULL, cpu_online_mask);
+ stop_machine(caps->enable, (void *)caps, cpu_online_mask);
}
}
}
@@ -1195,7 +1195,7 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
cpu_die_early();
}
if (caps->enable)
- caps->enable(NULL);
+ caps->enable((void *)caps);
}
}

--
2.1.4

2018-01-05 13:14:25

by Will Deacon

Subject: [PATCH v2 03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

For non-KASLR kernels where the KPTI behaviour has not been overridden
on the command line we can use ID_AA64PFR0_EL1.CSV3 to determine whether
or not we should unmap the kernel whilst running at EL0.

Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/sysreg.h | 1 +
arch/arm64/kernel/cpufeature.c | 8 +++++++-
2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 08cc88574659..ae519bbd3f9e 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -437,6 +437,7 @@
#define ID_AA64ISAR1_DPB_SHIFT 0

/* id_aa64pfr0 */
+#define ID_AA64PFR0_CSV3_SHIFT 60
#define ID_AA64PFR0_SVE_SHIFT 32
#define ID_AA64PFR0_GIC_SHIFT 24
#define ID_AA64PFR0_ASIMD_SHIFT 20
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9f0545dfe497..d723fc071f39 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -145,6 +145,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
};

static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+ ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
@@ -851,6 +852,8 @@ static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
int __unused)
{
+ u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
+
/* Forced on command line? */
if (__kpti_forced) {
pr_info_once("kernel page table isolation forced %s by command line option\n",
@@ -862,7 +865,9 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
return true;

- return false;
+ /* Defer to CPU feature registers */
+ return !cpuid_feature_extract_unsigned_field(pfr0,
+ ID_AA64PFR0_CSV3_SHIFT);
}

static int __init parse_kpti(char *str)
@@ -967,6 +972,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
},
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
{
+ .desc = "Kernel page table isolation (KPTI)",
.capability = ARM64_UNMAP_KERNEL_AT_EL0,
.def_scope = SCOPE_SYSTEM,
.matches = unmap_kernel_at_el0,
--
2.1.4

2018-01-05 13:14:24

by Will Deacon

Subject: [PATCH v2 05/11] drivers/firmware: Expose psci_get_version through psci_ops structure

Entry into recent versions of ARM Trusted Firmware will invalidate the CPU
branch predictor state in order to protect against aliasing attacks.

This patch exposes the PSCI "VERSION" function via psci_ops, so that it
can be invoked outside of the PSCI driver where necessary.
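
A minimal sketch of such an external caller, assuming the
PSCI_VERSION_MAJOR/MINOR helpers from uapi/linux/psci.h:

	#include <linux/psci.h>
	#include <uapi/linux/psci.h>

	static bool report_psci_version(void)
	{
		u32 ver;

		if (!psci_ops.get_version)	/* e.g. PSCI 0.1, or no PSCI */
			return false;

		ver = psci_ops.get_version();
		pr_info("PSCI v%u.%u detected\n", PSCI_VERSION_MAJOR(ver),
			PSCI_VERSION_MINOR(ver));
		return true;
	}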

Acked-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
drivers/firmware/psci.c | 2 ++
include/linux/psci.h | 1 +
2 files changed, 3 insertions(+)

diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
index d687ca3d5049..8b25d31e8401 100644
--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -496,6 +496,8 @@ static void __init psci_init_migrate(void)
static void __init psci_0_2_set_functions(void)
{
pr_info("Using standard PSCI v0.2 function IDs\n");
+ psci_ops.get_version = psci_get_version;
+
psci_function_id[PSCI_FN_CPU_SUSPEND] =
PSCI_FN_NATIVE(0_2, CPU_SUSPEND);
psci_ops.cpu_suspend = psci_cpu_suspend;
diff --git a/include/linux/psci.h b/include/linux/psci.h
index bdea1cb5e1db..6306ab10af18 100644
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -26,6 +26,7 @@ int psci_cpu_init_idle(unsigned int cpu);
int psci_cpu_suspend_enter(unsigned long index);

struct psci_operations {
+ u32 (*get_version)(void);
int (*cpu_suspend)(u32 state, unsigned long entry_point);
int (*cpu_off)(u32 state);
int (*cpu_on)(unsigned long cpuid, unsigned long entry_point);
--
2.1.4

2018-01-05 13:14:23

by Will Deacon

Subject: [PATCH v2 06/11] arm64: Move post_ttbr_update_workaround to C code

From: Marc Zyngier <[email protected]>

We will soon need to invoke a CPU-specific function pointer after changing
page tables, so move post_ttbr_update_workaround out into C code to make
this possible.

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/assembler.h | 13 -------------
arch/arm64/kernel/entry.S | 2 +-
arch/arm64/mm/context.c | 9 +++++++++
arch/arm64/mm/proc.S | 3 +--
4 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index c45bc94f15d0..cee60ce0da52 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -476,17 +476,4 @@ alternative_endif
mrs \rd, sp_el0
.endm

-/*
- * Errata workaround post TTBRx_EL1 update.
- */
- .macro post_ttbr_update_workaround
-#ifdef CONFIG_CAVIUM_ERRATUM_27456
-alternative_if ARM64_WORKAROUND_CAVIUM_27456
- ic iallu
- dsb nsh
- isb
-alternative_else_nop_endif
-#endif
- .endm
-
#endif /* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 71092ee09b6b..62500d371b06 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -277,7 +277,7 @@ alternative_else_nop_endif
* Cavium erratum 27456 (broadcast TLBI instructions may cause I-cache
* corruption).
*/
- post_ttbr_update_workaround
+ bl post_ttbr_update_workaround
.endif
1:
.if \el != 0
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 1cb3bc92ae5c..5f7097d0cd12 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -239,6 +239,15 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
cpu_switch_mm(mm->pgd, mm);
}

+/* Errata workaround post TTBRx_EL1 update. */
+asmlinkage void post_ttbr_update_workaround(void)
+{
+ asm(ALTERNATIVE("nop; nop; nop",
+ "ic iallu; dsb nsh; isb",
+ ARM64_WORKAROUND_CAVIUM_27456,
+ CONFIG_CAVIUM_ERRATUM_27456));
+}
+
static int asids_init(void)
{
asid_bits = get_cpu_asid_bits();
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 3146dc96f05b..6affb68a9a14 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -145,8 +145,7 @@ ENTRY(cpu_do_switch_mm)
isb
msr ttbr0_el1, x0 // now update TTBR0
isb
- post_ttbr_update_workaround
- ret
+ b post_ttbr_update_workaround // Back to C code...
ENDPROC(cpu_do_switch_mm)

.pushsection ".idmap.text", "ax"
--
2.1.4

2018-01-05 14:48:53

by James Morse

Subject: Re: [PATCH v2 11/11] arm64: Implement branch predictor hardening for affected Cortex-A CPUs

Hi Marc, Will,

(SOB-chain suggests a missing From: tag on this and patch 7)

On 05/01/18 13:12, Will Deacon wrote:
> Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing
> and can theoretically be attacked by malicious code.
>
> This patch implements a PSCI-based mitigation for these CPUs when available.
> The call into firmware will invalidate the branch predictor state, preventing
> any malicious entries from affecting other victim contexts.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>

> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
> index 06a931eb2673..2e9146534174 100644
> --- a/arch/arm64/kernel/bpi.S
> +++ b/arch/arm64/kernel/bpi.S
> @@ -53,3 +53,27 @@ ENTRY(__bp_harden_hyp_vecs_start)
> vectors __kvm_hyp_vector
> .endr
> ENTRY(__bp_harden_hyp_vecs_end)
> +ENTRY(__psci_hyp_bp_inval_start)

> + sub sp, sp, #(8 * 18)

Where does 18 come from? Isn't this storing 9 sets of 16 bytes?


> + stp x16, x17, [sp, #(16 * 0)]
> + stp x14, x15, [sp, #(16 * 1)]
> + stp x12, x13, [sp, #(16 * 2)]
> + stp x10, x11, [sp, #(16 * 3)]
> + stp x8, x9, [sp, #(16 * 4)]
> + stp x6, x7, [sp, #(16 * 5)]
> + stp x4, x5, [sp, #(16 * 6)]
> + stp x2, x3, [sp, #(16 * 7)]

> + stp x0, x1, [sp, #(18 * 8)]

16->18 typo?


> + mov x0, #0x84000000
> + smc #0
> + ldp x16, x17, [sp, #(16 * 0)]
> + ldp x14, x15, [sp, #(16 * 1)]
> + ldp x12, x13, [sp, #(16 * 2)]
> + ldp x10, x11, [sp, #(16 * 3)]
> + ldp x8, x9, [sp, #(16 * 4)]
> + ldp x6, x7, [sp, #(16 * 5)]
> + ldp x4, x5, [sp, #(16 * 6)]
> + ldp x2, x3, [sp, #(16 * 7)]

> + ldp x0, x1, [sp, #(18 * 8)]
> + add sp, sp, #(8 * 18)

(and here?)

> +ENTRY(__psci_hyp_bp_inval_end)


Thanks,

James

2018-01-05 14:57:26

by Marc Zyngier

Subject: Re: [PATCH v2 11/11] arm64: Implement branch predictor hardening for affected Cortex-A CPUs

On 05/01/18 14:46, James Morse wrote:
> Hi Marc, Will,
>
> (SOB-chain suggests a missing From: tag on this and patch 7)
>
> On 05/01/18 13:12, Will Deacon wrote:
>> Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing
>> and can theoretically be attacked by malicious code.
>>
>> This patch implements a PSCI-based mitigation for these CPUs when available.
>> The call into firmware will invalidate the branch predictor state, preventing
>> any malicious entries from affecting other victim contexts.
>>
>> Signed-off-by: Marc Zyngier <[email protected]>
>> Signed-off-by: Will Deacon <[email protected]>
>
>> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
>> index 06a931eb2673..2e9146534174 100644
>> --- a/arch/arm64/kernel/bpi.S
>> +++ b/arch/arm64/kernel/bpi.S
>> @@ -53,3 +53,27 @@ ENTRY(__bp_harden_hyp_vecs_start)
>> vectors __kvm_hyp_vector
>> .endr
>> ENTRY(__bp_harden_hyp_vecs_end)
>> +ENTRY(__psci_hyp_bp_inval_start)
>
>> + sub sp, sp, #(8 * 18)
>
> Where does 18 come from? Isn't this storing 9 sets of 16 bytes?

Or 18 registers of 8 bytes each.
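(Either way the frame is 144 bytes: nine STP pairs of 16 bytes, i.e. the
18 eight-byte registers x0-x17.)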

>
>> + stp x16, x17, [sp, #(16 * 0)]
>> + stp x14, x15, [sp, #(16 * 1)]
>> + stp x12, x13, [sp, #(16 * 2)]
>> + stp x10, x11, [sp, #(16 * 3)]
>> + stp x8, x9, [sp, #(16 * 4)]
>> + stp x6, x7, [sp, #(16 * 5)]
>> + stp x4, x5, [sp, #(16 * 6)]
>> + stp x2, x3, [sp, #(16 * 7)]
>
>> + stp x0, x1, [sp, #(18 * 8)]
>
> 16->18 typo?

/me bashes head against keyboard, as this is likely to generate less
crap than me trying to write something half correct...

The fun part is that the Seattle box I booted yesterday with that crap
is still happily churning along. I guess I'm corrupting something that
really doesn't matter...

>
>
>> + mov x0, #0x84000000
>> + smc #0
>> + ldp x16, x17, [sp, #(16 * 0)]
>> + ldp x14, x15, [sp, #(16 * 1)]
>> + ldp x12, x13, [sp, #(16 * 2)]
>> + ldp x10, x11, [sp, #(16 * 3)]
>> + ldp x8, x9, [sp, #(16 * 4)]
>> + ldp x6, x7, [sp, #(16 * 5)]
>> + ldp x4, x5, [sp, #(16 * 6)]
>> + ldp x2, x3, [sp, #(16 * 7)]
>
>> + ldp x0, x1, [sp, #(18 * 8)]
>> + add sp, sp, #(8 * 18)
>
> (and here?)

Yup.

Thanks for pointing that out. I'll fetch a brown paper bag.

M.
--
Jazz is not dead. It just smells funny...

2018-01-06 13:13:26

by Ard Biesheuvel

Subject: Re: [PATCH v2 01/11] arm64: use RET instruction for exiting the trampoline

On 5 January 2018 at 13:12, Will Deacon <[email protected]> wrote:
> Speculation attacks against the entry trampoline can potentially resteer
> the speculative instruction stream through the indirect branch and into
> arbitrary gadgets within the kernel.
>
> This patch defends against these attacks by forcing a misprediction
> through the return stack: a dummy BL instruction loads an entry into
> the stack, so that the predicted program flow of the subsequent RET
> instruction is to a branch-to-self instruction which is finally resolved
> as a branch to the kernel vectors with speculation suppressed.
>

How safe is it to assume that every microarchitecture will behave as
expected here? Wouldn't it be safer in general not to rely on a memory
load for x30 in the first place? (see below) Or may the speculative
execution still branch anywhere even if the branch target is
guaranteed to be known by that time?


> Signed-off-by: Will Deacon <[email protected]>
> ---
> arch/arm64/kernel/entry.S | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 031392ee5f47..71092ee09b6b 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -1029,6 +1029,14 @@ alternative_else_nop_endif
> .if \regsize == 64
> msr tpidrro_el0, x30 // Restored in kernel_ventry
> .endif
> + /*
> + * Defend against branch aliasing attacks by pushing a dummy
> + * entry onto the return stack and using a RET instruction to
> + * enter the full-fat kernel vectors.
> + */
> + bl 2f
> + b .
> +2:
> tramp_map_kernel x30
> #ifdef CONFIG_RANDOMIZE_BASE
> adr x30, tramp_vectors + PAGE_SIZE
> @@ -1041,7 +1049,7 @@ alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> msr vbar_el1, x30
> add x30, x30, #(1b - tramp_vectors)
> isb
> - br x30
> + ret
> .endm
>
> .macro tramp_exit, regsize = 64
> --
> 2.1.4
>

This uses Marc's callback alternative patching for the KASLR case; the
non-KASLR case just uses a plain movz/movk sequence to load the
address of vectors rather than a literal.

(apologies for the patch soup)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 55d05dacd02e..e8a846335e8e 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -1031,17 +1031,19 @@ alternative_else_nop_endif
.endif
tramp_map_kernel x30
#ifdef CONFIG_RANDOMIZE_BASE
- adr x30, tramp_vectors + PAGE_SIZE
alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
- ldr x30, [x30]
+ b tramp_vectors + PAGE_SIZE + .Ltramp_stage2_size * (1b - tramp_vectors - 0x400) / 0x80
#else
- ldr x30, =vectors
-#endif
+ movz x30, :abs_g3:vectors
+ movk x30, :abs_g2_nc:vectors
+ movk x30, :abs_g1_nc:vectors
+ movk x30, :abs_g0_nc:vectors
prfm plil1strm, [x30, #(1b - tramp_vectors)]
msr vbar_el1, x30
add x30, x30, #(1b - tramp_vectors)
isb
br x30
+#endif
.endm

.macro tramp_exit, regsize = 64
@@ -1080,12 +1082,25 @@ END(tramp_exit_compat)
.ltorg
.popsection // .entry.tramp.text
#ifdef CONFIG_RANDOMIZE_BASE
- .pushsection ".rodata", "a"
+ .pushsection ".text", "ax"
.align PAGE_SHIFT
.globl __entry_tramp_data_start
__entry_tramp_data_start:
- .quad vectors
- .popsection // .rodata
+ .irpc i, 01234567
+alternative_cb tramp_patch_vectors_address
+ movz x30, :abs_g3:0
+ movk x30, :abs_g2_nc:0
+ movk x30, :abs_g1_nc:0
+ movk x30, :abs_g0_nc:0
+alternative_cb_end
+ prfm plil1strm, [x30, #(0x400 + \i * 0x80)]
+ msr vbar_el1, x30
+ add x30, x30, 0x400 + \i * 0x80
+ isb
+ br x30
+ .endr
+ .set .Ltramp_stage2_size, (. - __entry_tramp_data_start) / 8
+ .popsection // .text
#endif /* CONFIG_RANDOMIZE_BASE */
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 916d9ced1c3f..4a9788e76489 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -547,13 +547,38 @@ static int __init map_entry_trampoline(void)
extern char __entry_tramp_data_start[];

__set_fixmap(FIX_ENTRY_TRAMP_DATA,
- __pa_symbol(__entry_tramp_data_start),
- PAGE_KERNEL_RO);
+ __pa_symbol(__entry_tramp_data_start), prot);
}

return 0;
}
core_initcall(map_entry_trampoline);
+
+#ifdef CONFIG_RANDOMIZE_BASE
+void __init tramp_patch_vectors_address(struct alt_instr *alt,
+ __le32 *origptr, __le32 *updptr,
+ int nr_inst)
+{
+ enum aarch64_insn_movewide_type type;
+ int s;
+
+ /* We only expect a 4 instruction sequence */
+ BUG_ON(nr_inst != 4);
+
+ type = AARCH64_INSN_MOVEWIDE_ZERO;
+ for (s = 0; nr_inst--; s += 16) {
+ extern const char vectors[];
+
+ u32 insn = aarch64_insn_gen_movewide(AARCH64_INSN_REG_30,
+ (u16)((u64)vectors >> s),
+ s,
+ AARCH64_INSN_VARIANT_64BIT,
+ type);
+ *updptr++ = cpu_to_le32(insn);
+ type = AARCH64_INSN_MOVEWIDE_KEEP;
+ }
+}
+#endif
#endif

/*

2018-01-08 00:15:34

by Jon Masters

Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

On 01/05/2018 08:12 AM, Will Deacon wrote:

> Aliasing attacks against CPU branch predictors can allow an attacker to
> redirect speculative control flow on some CPUs and potentially divulge
> information from one context to another.
>
> This patch adds initial skeleton code behind a new Kconfig option to
> enable implementation-specific mitigations against these attacks for
> CPUs that are affected.

Thanks to Qualcomm for the (typically prompt and immediate) followup. As
a reminder to the other usual server suspects (all of whom we've spoken
with about mitigations for this so we know there are things coming), I'm
expecting to see your patches for this hit the list within the next 48
hours. You know who you are, and I'll be doing the rounds over the next
24 hours to check your status as to when you'll be posting these.

Thanks,

Jon.

--
Computer Architect | Sent from my Fedora powered laptop

2018-01-08 06:32:10

by Jayachandran C

Subject: Re: [v2, 11/11] arm64: Implement branch predictor hardening for affected Cortex-A CPUs

On Fri, Jan 05, 2018 at 01:12:41PM +0000, Will Deacon wrote:
> Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing
> and can theoretically be attacked by malicious code.
>
> This patch implements a PSCI-based mitigation for these CPUs when available.
> The call into firmware will invalidate the branch predictor state, preventing
> any malicious entries from affecting other victim contexts.
>
> Signed-off-by: Marc Zyngier <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> arch/arm64/kernel/bpi.S | 24 ++++++++++++++++++++++++
> arch/arm64/kernel/cpu_errata.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 66 insertions(+)
>
> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
> index 06a931eb2673..2e9146534174 100644
> --- a/arch/arm64/kernel/bpi.S
> +++ b/arch/arm64/kernel/bpi.S
> @@ -53,3 +53,27 @@ ENTRY(__bp_harden_hyp_vecs_start)
> vectors __kvm_hyp_vector
> .endr
> ENTRY(__bp_harden_hyp_vecs_end)
> +ENTRY(__psci_hyp_bp_inval_start)
> + sub sp, sp, #(8 * 18)
> + stp x16, x17, [sp, #(16 * 0)]
> + stp x14, x15, [sp, #(16 * 1)]
> + stp x12, x13, [sp, #(16 * 2)]
> + stp x10, x11, [sp, #(16 * 3)]
> + stp x8, x9, [sp, #(16 * 4)]
> + stp x6, x7, [sp, #(16 * 5)]
> + stp x4, x5, [sp, #(16 * 6)]
> + stp x2, x3, [sp, #(16 * 7)]
> + stp x0, x1, [sp, #(18 * 8)]
> + mov x0, #0x84000000
> + smc #0
> + ldp x16, x17, [sp, #(16 * 0)]
> + ldp x14, x15, [sp, #(16 * 1)]
> + ldp x12, x13, [sp, #(16 * 2)]
> + ldp x10, x11, [sp, #(16 * 3)]
> + ldp x8, x9, [sp, #(16 * 4)]
> + ldp x6, x7, [sp, #(16 * 5)]
> + ldp x4, x5, [sp, #(16 * 6)]
> + ldp x2, x3, [sp, #(16 * 7)]
> + ldp x0, x1, [sp, #(18 * 8)]
> + add sp, sp, #(8 * 18)
> +ENTRY(__psci_hyp_bp_inval_end)
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 16ea5c6f314e..cb0fb3796bb8 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -53,6 +53,8 @@ static int cpu_enable_trap_ctr_access(void *__unused)
> DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
>
> #ifdef CONFIG_KVM
> +extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
> +
> static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
> const char *hyp_vecs_end)
> {
> @@ -94,6 +96,9 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
> spin_unlock(&bp_lock);
> }
> #else
> +#define __psci_hyp_bp_inval_start NULL
> +#define __psci_hyp_bp_inval_end NULL
> +
> static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
> const char *hyp_vecs_start,
> const char *hyp_vecs_end)
> @@ -118,6 +123,21 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
>
> __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
> }
> +
> +#include <linux/psci.h>
> +
> +static int enable_psci_bp_hardening(void *data)
> +{
> + const struct arm64_cpu_capabilities *entry = data;
> +
> + if (psci_ops.get_version)
> + install_bp_hardening_cb(entry,
> + (bp_hardening_cb_t)psci_ops.get_version,
> + __psci_hyp_bp_inval_start,
> + __psci_hyp_bp_inval_end);
> +
> + return 0;
> +}
> #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
>
> #define MIDR_RANGE(model, min, max) \
> @@ -261,6 +281,28 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
> },
> #endif
> +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
> + {
> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> + MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> + .enable = enable_psci_bp_hardening,
> + },
> + {
> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> + MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
> + .enable = enable_psci_bp_hardening,
> + },
> + {
> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> + MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
> + .enable = enable_psci_bp_hardening,
> + },
> + {
> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> + MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
> + .enable = enable_psci_bp_hardening,
> + },
> +#endif
> {
> }
> };

On ThunderX2, we would like to do something similar, but it is important to
find out if the current firmware has support for BTB invalidate in the
PSCI version call. Otherwise, we will end up doing version calls that do
nothing but return the version (and waste cycles).

I will follow up with ThunderX2 patches, but it would be good to have a
standard way of figuring out if the firmware has BTB invalidate support.

JC.

2018-01-08 06:54:44

by Jayachandran C

Subject: [PATCH 1/2] arm64: cputype: Add MIDR values for Cavium ThunderX2 CPUs

Add the older Broadcom ID as well as the new Cavium ID for ThunderX2
CPUs.

Signed-off-by: Jayachandran C <[email protected]>
---
arch/arm64/include/asm/cputype.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 84385b9..cce5735 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -89,6 +89,7 @@
#define CAVIUM_CPU_PART_THUNDERX 0x0A1
#define CAVIUM_CPU_PART_THUNDERX_81XX 0x0A2
#define CAVIUM_CPU_PART_THUNDERX_83XX 0x0A3
+#define CAVIUM_CPU_PART_THUNDERX2 0x0AF

#define BRCM_CPU_PART_VULCAN 0x516

@@ -102,6 +103,8 @@
#define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
#define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
#define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
+#define MIDR_CAVIUM_THUNDERX2 MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX2)
+#define MIDR_BRCM_VULCAN MIDR_CPU_MODEL(ARM_CPU_IMP_BRCM, BRCM_CPU_PART_VULCAN)
#define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)

#ifndef __ASSEMBLY__
--
2.7.4

2018-01-08 06:54:56

by Jayachandran C

Subject: [PATCH 2/2] arm64: Branch predictor hardening for Cavium ThunderX2

Use a PSCI-based mitigation for speculative execution attacks targeting
the branch predictor. The approach is similar to the one used for
Cortex-A CPUs, but in the case of ThunderX2 we add another SMC call to
test if the firmware supports the capability.

If the secure firmware has been updated with the mitigation code to
invalidate the branch target buffer, we use the PSCI version call to
invoke it.

Signed-off-by: Jayachandran C <[email protected]>
---
arch/arm64/kernel/cpu_errata.c | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index cb0fb37..abceb5d 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -124,6 +124,7 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
}

+#include <linux/arm-smccc.h>
#include <linux/psci.h>

static int enable_psci_bp_hardening(void *data)
@@ -138,6 +139,33 @@ static int enable_psci_bp_hardening(void *data)

return 0;
}
+
+#define CAVIUM_TX2_SIP_SMC_CALL 0xC200FF00
+#define CAVIUM_TX2_BTB_HARDEN_CAP 0xB0A0
+
+static int enable_tx2_psci_bp_hardening(void *data)
+{
+ const struct arm64_cpu_capabilities *entry = data;
+ struct arm_smccc_res res;
+
+ if (!entry->matches(entry, SCOPE_LOCAL_CPU))
+ return 0;
+
+ arm_smccc_smc(CAVIUM_TX2_SIP_SMC_CALL, CAVIUM_TX2_BTB_HARDEN_CAP, 0, 0, 0, 0, 0, 0, &res);
+ if (res.a0 != 0) {
+ pr_warn("Error: CONFIG_HARDEN_BRANCH_PREDICTOR enabled, but firmware does not support it\n");
+ return 0;
+ }
+ if (res.a1 == 1 && psci_ops.get_version) {
+ pr_info("CPU%d: Branch predictor hardening enabled\n", smp_processor_id());
+ install_bp_hardening_cb(entry,
+ (bp_hardening_cb_t)psci_ops.get_version,
+ __psci_hyp_bp_inval_start,
+ __psci_hyp_bp_inval_end);
+ }
+
+ return 0;
+}
#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */

#define MIDR_RANGE(model, min, max) \
@@ -302,6 +330,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
.enable = enable_psci_bp_hardening,
},
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
+ .enable = enable_tx2_psci_bp_hardening,
+ },
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
+ .enable = enable_tx2_psci_bp_hardening,
+ },
#endif
{
}
--
2.7.4

2018-01-08 07:24:20

by Jayachandran C

Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On Fri, Jan 05, 2018 at 01:12:33PM +0000, Will Deacon wrote:
> For non-KASLR kernels where the KPTI behaviour has not been overridden
> on the command line we can use ID_AA64PFR0_EL1.CSV3 to determine whether
> or not we should unmap the kernel whilst running at EL0.
>
> Reviewed-by: Suzuki K Poulose <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> arch/arm64/include/asm/sysreg.h | 1 +
> arch/arm64/kernel/cpufeature.c | 8 +++++++-
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 08cc88574659..ae519bbd3f9e 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -437,6 +437,7 @@
> #define ID_AA64ISAR1_DPB_SHIFT 0
>
> /* id_aa64pfr0 */
> +#define ID_AA64PFR0_CSV3_SHIFT 60
> #define ID_AA64PFR0_SVE_SHIFT 32
> #define ID_AA64PFR0_GIC_SHIFT 24
> #define ID_AA64PFR0_ASIMD_SHIFT 20
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 9f0545dfe497..d723fc071f39 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -145,6 +145,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
> };
>
> static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
> + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
> S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
> @@ -851,6 +852,8 @@ static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
> static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> int __unused)
> {
> + u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> +
> /* Forced on command line? */
> if (__kpti_forced) {
> pr_info_once("kernel page table isolation forced %s by command line option\n",
> @@ -862,7 +865,9 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> return true;
>
> - return false;
> + /* Defer to CPU feature registers */
> + return !cpuid_feature_extract_unsigned_field(pfr0,
> + ID_AA64PFR0_CSV3_SHIFT);

If I read this correctly, this enables KPTI on all processors without the CSV3
set (which seems to be a future capability).

Turning on KPTI has a small but significant overhead, so I think we should turn
it off on processors that are not vulnerable to CVE-2017-5754. Can we add something
like this:

--->8
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 19ed09b..202b037 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
return __kpti_forced > 0;
}

+ /* Don't force KPTI for CPUs that are not vulnerable */
+ switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
+ case MIDR_CAVIUM_THUNDERX2:
+ case MIDR_BRCM_VULCAN:
+ return false;
+ }
+
/* Useful for KASLR robustness */
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
return true;
--
JC

2018-01-08 09:20:18

by Marc Zyngier

Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On 08/01/18 07:24, Jayachandran C wrote:
> On Fri, Jan 05, 2018 at 01:12:33PM +0000, Will Deacon wrote:
>> For non-KASLR kernels where the KPTI behaviour has not been overridden
>> on the command line we can use ID_AA64PFR0_EL1.CSV3 to determine whether
>> or not we should unmap the kernel whilst running at EL0.
>>
>> Reviewed-by: Suzuki K Poulose <[email protected]>
>> Signed-off-by: Will Deacon <[email protected]>
>> ---
>> arch/arm64/include/asm/sysreg.h | 1 +
>> arch/arm64/kernel/cpufeature.c | 8 +++++++-
>> 2 files changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>> index 08cc88574659..ae519bbd3f9e 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -437,6 +437,7 @@
>> #define ID_AA64ISAR1_DPB_SHIFT 0
>>
>> /* id_aa64pfr0 */
>> +#define ID_AA64PFR0_CSV3_SHIFT 60
>> #define ID_AA64PFR0_SVE_SHIFT 32
>> #define ID_AA64PFR0_GIC_SHIFT 24
>> #define ID_AA64PFR0_ASIMD_SHIFT 20
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 9f0545dfe497..d723fc071f39 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -145,6 +145,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
>> };
>>
>> static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
>> + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
>> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
>> ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
>> S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
>> @@ -851,6 +852,8 @@ static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
>> static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>> int __unused)
>> {
>> + u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
>> +
>> /* Forced on command line? */
>> if (__kpti_forced) {
>> pr_info_once("kernel page table isolation forced %s by command line option\n",
>> @@ -862,7 +865,9 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>> return true;
>>
>> - return false;
>> + /* Defer to CPU feature registers */
>> + return !cpuid_feature_extract_unsigned_field(pfr0,
>> + ID_AA64PFR0_CSV3_SHIFT);
>
> If I read this correctly, this enables KPTI on all processors without the CSV3
> set (which seems to be a future capability).
>
> Turning on KPTI has a small but significant overhead, so I think we should turn
> it off on processors that are not vulnerable to CVE-2017-5754. Can we add something
> like this:
>
> --->8
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 19ed09b..202b037 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> return __kpti_forced > 0;
> }
>
> + /* Don't force KPTI for CPUs that are not vulnerable */
> + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> + case MIDR_CAVIUM_THUNDERX2:
> + case MIDR_BRCM_VULCAN:
> + return false;
> + }
> +
> /* Useful for KASLR robustness */
> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> return true;
>

KPTI is also an improvement for KASLR. Why would you deprive a user of
the choice to further secure their system?

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-01-08 12:18:46

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

Hi Will, Marc,

On 05/01/18 13:12, Will Deacon wrote:
> Aliasing attacks against CPU branch predictors can allow an attacker to
> redirect speculative control flow on some CPUs and potentially divulge
> information from one context to another.
>
> This patch adds initial skeleton code behind a new Kconfig option to
> enable implementation-specific mitigations against these attacks for
> CPUs that are affected.

[...]

> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index 6f7bdb89817f..6dd83d75b82a 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -41,6 +41,43 @@ static inline bool arm64_kernel_unmapped_at_el0(void)

> +static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
> +{
> + return this_cpu_ptr(&bp_hardening_data);
> +}
> +
> +static inline void arm64_apply_bp_hardening(void)
> +{
> + struct bp_hardening_data *d;
> +
> + if (!cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR))
> + return;
> +
> + d = arm64_get_bp_hardening_data();
> + if (d->fn)
> + d->fn();
> +}

> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 22168cd0dde7..5203b6040cb6 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -318,6 +318,7 @@ static void __do_user_fault(struct task_struct *tsk, unsigned long addr,
> lsb = PAGE_SHIFT;
> si.si_addr_lsb = lsb;
>
> + arm64_apply_bp_hardening();

Due to the this_cpu_ptr() call:

| BUG: using smp_processor_id() in preemptible [00000000] code: print_my_pa/2093
| caller is debug_smp_processor_id+0x1c/0x24
| CPU: 0 PID: 2093 Comm: print_my_pa Tainted: G W
4.15.0-rc3-00044-g7f0aaec94f27-dirty #8950
| Call trace:
| dump_backtrace+0x0/0x164
| show_stack+0x14/0x1c
| dump_stack+0xa4/0xdc
| check_preemption_disabled+0xfc/0x100
| debug_smp_processor_id+0x1c/0x24
| __do_user_fault+0xcc/0x180
| do_page_fault+0x14c/0x364
| do_translation_fault+0x40/0x48
| do_mem_abort+0x40/0xb8
| el0_da+0x20/0x24

Make it a TIF flag?

(Seen with arm64's kpti-base tag and this series)
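For illustration only, the splat itself can be silenced by pinning the per-CPU
access with get_cpu_ptr()/put_cpu_ptr(), although that alone doesn't guarantee
we invalidate the predictor on the CPU that actually took the fault -- just a
sketch of where the preemptible access sits:

static inline void arm64_apply_bp_hardening(void)
{
	struct bp_hardening_data *d;

	if (!cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR))
		return;

	/* get_cpu_ptr() disables preemption around the per-CPU access */
	d = get_cpu_ptr(&bp_hardening_data);
	if (d->fn)
		d->fn();
	put_cpu_ptr(&bp_hardening_data);
}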


> force_sig_info(sig, &si, tsk);
> }


Thanks,

James

2018-01-08 14:26:58

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

Hi James,

Thanks for having a look.

On Mon, Jan 08, 2018 at 12:16:28PM +0000, James Morse wrote:
> On 05/01/18 13:12, Will Deacon wrote:
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index 22168cd0dde7..5203b6040cb6 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -318,6 +318,7 @@ static void __do_user_fault(struct task_struct *tsk, unsigned long addr,
> > lsb = PAGE_SHIFT;
> > si.si_addr_lsb = lsb;
> >
> > + arm64_apply_bp_hardening();
>
> Due to the this_cpu_ptr() call:
>
> | BUG: using smp_processor_id() in preemptible [00000000] code: print_my_pa/2093
> | caller is debug_smp_processor_id+0x1c/0x24
> | CPU: 0 PID: 2093 Comm: print_my_pa Tainted: G W
> 4.15.0-rc3-00044-g7f0aaec94f27-dirty #8950
> | Call trace:
> | dump_backtrace+0x0/0x164
> | show_stack+0x14/0x1c
> | dump_stack+0xa4/0xdc
> | check_preemption_disabled+0xfc/0x100
> | debug_smp_processor_id+0x1c/0x24
> | __do_user_fault+0xcc/0x180
> | do_page_fault+0x14c/0x364
> | do_translation_fault+0x40/0x48
> | do_mem_abort+0x40/0xb8
> | el0_da+0x20/0x24

Ah bugger, yes, we re-enabled interrupts in the entry code when we took the
fault initially.

> Make it a TIF flag?
>
> (Seen with arm64's kpti-base tag and this series)

A TIF flag is still a bit fiddly, because we need to track that the
predictor could be dirty on this CPU. Instead, I'll postpone the re-enabling
of IRQs on el0_ia until we're in C code -- we can do a quick check on the
address before doing that. See below.

Will

--->8

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 80b539845da6..07a7d4db8ec4 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -721,12 +721,15 @@ el0_ia:
* Instruction abort handling
*/
mrs x26, far_el1
- enable_daif
+ enable_da_f
+#ifdef CONFIG_TRACE_IRQFLAGS
+ bl trace_hardirqs_off
+#endif
ct_user_exit
mov x0, x26
mov x1, x25
mov x2, sp
- bl do_mem_abort
+ bl do_el0_ia_bp_hardening
b ret_to_user
el0_fpsimd_acc:
/*
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 5203b6040cb6..0e671ddf4855 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -318,7 +318,6 @@ static void __do_user_fault(struct task_struct *tsk, unsigned long addr,
lsb = PAGE_SHIFT;
si.si_addr_lsb = lsb;

- arm64_apply_bp_hardening();
force_sig_info(sig, &si, tsk);
}

@@ -709,6 +708,23 @@ asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
arm64_notify_die("", regs, &info, esr);
}

+asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr,
+ unsigned int esr,
+ struct pt_regs *regs)
+{
+ /*
+ * We've taken an instruction abort from userspace and not yet
+ * re-enabled IRQs. If the address is a kernel address, apply
+ * BP hardening prior to enabling IRQs and pre-emption.
+ */
+ if (addr > TASK_SIZE)
+ arm64_apply_bp_hardening();
+
+ local_irq_enable();
+ do_mem_abort(addr, esr, regs);
+}
+
+
asmlinkage void __exception do_sp_pc_abort(unsigned long addr,
unsigned int esr,
struct pt_regs *regs)

2018-01-08 14:33:45

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 01/11] arm64: use RET instruction for exiting the trampoline

On Sat, Jan 06, 2018 at 01:13:23PM +0000, Ard Biesheuvel wrote:
> On 5 January 2018 at 13:12, Will Deacon <[email protected]> wrote:
> > Speculation attacks against the entry trampoline can potentially resteer
> > the speculative instruction stream through the indirect branch and into
> > arbitrary gadgets within the kernel.
> >
> > This patch defends against these attacks by forcing a misprediction
> > through the return stack: a dummy BL instruction loads an entry into
> > the stack, so that the predicted program flow of the subsequent RET
> > instruction is to a branch-to-self instruction which is finally resolved
> > as a branch to the kernel vectors with speculation suppressed.
> >
>
> How safe is it to assume that every microarchitecture will behave as
> expected here? Wouldn't it be safer in general not to rely on a memory
> load for x30 in the first place? (see below) Or may the speculative
> execution still branch anywhere even if the branch target is
> guaranteed to be known by that time?

The main problem with this approach is that EL0 can read out the text and
find the kaslr offset. The memory load is fine, because the data page is
unmapped along with the kernel text. I'm not aware of any
micro-architectures where this patch doesn't do what we need.

Will

2018-01-08 14:38:02

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2 01/11] arm64: use RET instruction for exiting the trampoline

On 8 January 2018 at 14:33, Will Deacon <[email protected]> wrote:
> On Sat, Jan 06, 2018 at 01:13:23PM +0000, Ard Biesheuvel wrote:
>> On 5 January 2018 at 13:12, Will Deacon <[email protected]> wrote:
>> > Speculation attacks against the entry trampoline can potentially resteer
>> > the speculative instruction stream through the indirect branch and into
>> > arbitrary gadgets within the kernel.
>> >
>> > This patch defends against these attacks by forcing a misprediction
>> > through the return stack: a dummy BL instruction loads an entry into
>> > the stack, so that the predicted program flow of the subsequent RET
>> > instruction is to a branch-to-self instruction which is finally resolved
>> > as a branch to the kernel vectors with speculation suppressed.
>> >
>>
>> How safe is it to assume that every microarchitecture will behave as
>> expected here? Wouldn't it be safer in general not to rely on a memory
>> load for x30 in the first place? (see below) Or may the speculative
>> execution still branch anywhere even if the branch target is
>> guaranteed to be known by that time?
>
> The main problem with this approach is that EL0 can read out the text and
> find the kaslr offset.

Not really - the CONFIG_RANDOMIZE_BASE path puts the movz/movk
sequence in the next page, but that does involve an unconditional
branch.

> The memory load is fine, because the data page is
> unmapped along with the kernel text. I'm not aware of any
> micro-architectures where this patch doesn't do what we need.
>

Well, the memory load is what may incur the delay, creating the window
for speculative execution of the indirect branch. What I don't have
enough of a handle on is whether this speculative execution may still
branch to wherever the branch predictor is pointing even if the
register containing the branch target is already available.

2018-01-08 14:45:34

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 01/11] arm64: use RET instruction for exiting the trampoline

On Mon, Jan 08, 2018 at 02:38:00PM +0000, Ard Biesheuvel wrote:
> On 8 January 2018 at 14:33, Will Deacon <[email protected]> wrote:
> > On Sat, Jan 06, 2018 at 01:13:23PM +0000, Ard Biesheuvel wrote:
> >> On 5 January 2018 at 13:12, Will Deacon <[email protected]> wrote:
> >> > Speculation attacks against the entry trampoline can potentially resteer
> >> > the speculative instruction stream through the indirect branch and into
> >> > arbitrary gadgets within the kernel.
> >> >
> >> > This patch defends against these attacks by forcing a misprediction
> >> > through the return stack: a dummy BL instruction loads an entry into
> >> > the stack, so that the predicted program flow of the subsequent RET
> >> > instruction is to a branch-to-self instruction which is finally resolved
> >> > as a branch to the kernel vectors with speculation suppressed.
> >> >
> >>
> >> How safe is it to assume that every microarchitecture will behave as
> >> expected here? Wouldn't it be safer in general not to rely on a memory
> >> load for x30 in the first place? (see below) Or may the speculative
> >> execution still branch anywhere even if the branch target is
> >> guaranteed to be known by that time?
> >
> > The main problem with this approach is that EL0 can read out the text and
> > find the kaslr offset.
>
> Not really - the CONFIG_RANDOMIZE_BASE path puts the movz/movk
> sequence in the next page, but that does involve an unconditional
> branch.

Ah sorry, I had missed that. The unconditional branch may still be attacked,
however.

> > The memory load is fine, because the data page is
> > unmapped along with the kernel text. I'm not aware of any
> > micro-architectures where this patch doesn't do what we need.
> >
>
> Well, the memory load is what may incur the delay, creating the window
> for speculative execution of the indirect branch. What I don't have
> enough of a handle on is whether this speculative execution may still
> branch to wherever the branch predictor is pointing even if the
> register containing the branch target is already available.

For the micro-architectures I'm aware of, the return stack predictor will
always safely mispredict the jump into the kernel vectors with this patch
applied.

Will

2018-01-08 14:56:58

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2 01/11] arm64: use RET instruction for exiting the trampoline

On 8 January 2018 at 14:45, Will Deacon <[email protected]> wrote:
> On Mon, Jan 08, 2018 at 02:38:00PM +0000, Ard Biesheuvel wrote:
>> On 8 January 2018 at 14:33, Will Deacon <[email protected]> wrote:
>> > On Sat, Jan 06, 2018 at 01:13:23PM +0000, Ard Biesheuvel wrote:
>> >> On 5 January 2018 at 13:12, Will Deacon <[email protected]> wrote:
>> >> > Speculation attacks against the entry trampoline can potentially resteer
>> >> > the speculative instruction stream through the indirect branch and into
>> >> > arbitrary gadgets within the kernel.
>> >> >
>> >> > This patch defends against these attacks by forcing a misprediction
>> >> > through the return stack: a dummy BL instruction loads an entry into
>> >> > the stack, so that the predicted program flow of the subsequent RET
>> >> > instruction is to a branch-to-self instruction which is finally resolved
>> >> > as a branch to the kernel vectors with speculation suppressed.
>> >> >
>> >>
>> >> How safe is it to assume that every microarchitecture will behave as
>> >> expected here? Wouldn't it be safer in general not to rely on a memory
>> >> load for x30 in the first place? (see below) Or may the speculative
>> >> execution still branch anywhere even if the branch target is
>> >> guaranteed to be known by that time?
>> >
>> > The main problem with this approach is that EL0 can read out the text and
>> > find the kaslr offset.
>>
>> Not really - the CONFIG_RANDOMIZE_BASE path puts the movz/movk
>> sequence in the next page, but that does involve an unconditional
>> branch.
>
> Ah sorry, I had missed that. The unconditional branch may still be attacked,
> however.
>

Yeah, I was surprised by that. How on earth is there ever a point in
using a branch predictor to [potentially mis]predict unconditional
branches?

>> > The memory load is fine, because the data page is
>> > unmapped along with the kernel text. I'm not aware of any
>> > micro-architectures where this patch doesn't do what we need.
>> >
>>
>> Well, the memory load is what may incur the delay, creating the window
>> for speculative execution of the indirect branch. What I don't have
>> enough of a handle on is whether this speculative execution may still
>> branch to wherever the branch predictor is pointing even if the
>> register containing the branch target is already available.
>
> For the micro-architectures I'm aware of, the return stack predictor will
> always safely mispredict the jump into the kernel vectors with this patch
> applied.
>

OK, fair enough. What I am asking is really whether there is a way
where we don't have to force a misprediction, by ensuring that x30 has
assumed its final value by the time the indirect branch is
[speculatively] executed. But if unconditional branches may be
mispredicted as well, I guess this doesn't fly either.

2018-01-08 15:26:47

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v2 01/11] arm64: use RET instruction for exiting the trampoline

From: Ard Biesheuvel
> Sent: 08 January 2018 14:38
> To: Will Deacon
> Cc: [email protected]; Catalin Marinas; Marc Zyngier; Lorenzo Pieralisi;
> Christoffer Dall; Linux Kernel Mailing List; Laura Abbott
> Subject: Re: [PATCH v2 01/11] arm64: use RET instruction for exiting the trampoline
>
> On 8 January 2018 at 14:33, Will Deacon <[email protected]> wrote:
> > On Sat, Jan 06, 2018 at 01:13:23PM +0000, Ard Biesheuvel wrote:
> >> On 5 January 2018 at 13:12, Will Deacon <[email protected]> wrote:
> >> > Speculation attacks against the entry trampoline can potentially resteer
> >> > the speculative instruction stream through the indirect branch and into
> >> > arbitrary gadgets within the kernel.
> >> >
> >> > This patch defends against these attacks by forcing a misprediction
> >> > through the return stack: a dummy BL instruction loads an entry into
> >> > the stack, so that the predicted program flow of the subsequent RET
> >> > instruction is to a branch-to-self instruction which is finally resolved
> >> > as a branch to the kernel vectors with speculation suppressed.
> >> >
> >>
> >> How safe is it to assume that every microarchitecture will behave as
> >> expected here? Wouldn't it be safer in general not to rely on a memory
> >> load for x30 in the first place? (see below) Or may the speculative
> >> execution still branch anywhere even if the branch target is
> >> guaranteed to be known by that time?
> >
> > The main problem with this approach is that EL0 can read out the text and
> > find the kaslr offset.
>
> Not really - the CONFIG_RANDOMIZE_BASE path puts the movz/movk
> sequence in the next page, but that does involve an unconditional
> branch.
>
> > The memory load is fine, because the data page is
> > unmapped along with the kernel text. I'm not aware of any
> > micro-architectures where this patch doesn't do what we need.
> >
>
> Well, the memory load is what may incur the delay, creating the window
> for speculative execution of the indirect branch. What I don't have
> enough of a handle on is whether this speculative execution may still
> branch to wherever the branch predictor is pointing even if the
> register containing the branch target is already available.

I would expect the predicted address to be used.
Much the same as a conditional branch doesn't use the flags
value at the time the instruction is decoded.

David

2018-01-08 16:46:52

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH 2/2] arm64: Branch predictor hardening for Cavium ThunderX2

On Sun, Jan 07, 2018 at 10:53:36PM -0800, Jayachandran C wrote:
> Use PSCI based mitigation for speculative execution attacks targeting
> the branch predictor. The approach is similar to the one used for
> Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> test if the firmware supports the capability.
>
> If the secure firmware has been updated with the mitigation code to
> invalidate the branch target buffer, we use the PSCI version call to
> invoke it.
>
> Signed-off-by: Jayachandran C <[email protected]>
> ---
> arch/arm64/kernel/cpu_errata.c | 38 ++++++++++++++++++++++++++++++++++++++
> 1 file changed, 38 insertions(+)
>
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index cb0fb37..abceb5d 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -124,6 +124,7 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
> __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
> }
>
> +#include <linux/arm-smccc.h>
> #include <linux/psci.h>
>
> static int enable_psci_bp_hardening(void *data)
> @@ -138,6 +139,33 @@ static int enable_psci_bp_hardening(void *data)
>
> return 0;
> }
> +
> +#define CAVIUM_TX2_SIP_SMC_CALL 0xC200FF00
> +#define CAVIUM_TX2_BTB_HARDEN_CAP 0xB0A0
> +
> +static int enable_tx2_psci_bp_hardening(void *data)
> +{
> + const struct arm64_cpu_capabilities *entry = data;
> + struct arm_smccc_res res;
> +
> + if (!entry->matches(entry, SCOPE_LOCAL_CPU))
> + return;
> +
> + arm_smccc_smc(CAVIUM_TX2_SIP_SMC_CALL, CAVIUM_TX2_BTB_HARDEN_CAP, 0, 0, 0, 0, 0, 0, &res);

One thing to be aware of here is that if somebody configures qemu to emulate
a TX2, this may actually disappear into EL3 and not return. You're better
off sticking with PSCI GET_VERSION in terms of portability, but it's your
call -- I'd expect you to deal with any breakage reports on the list due
to the SMC above. Fair?
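
For comparison, the generic hook earlier in this series keys everything off
psci_ops.get_version rather than a vendor SMC; roughly this (from memory, so
treat it as a sketch):

static int enable_psci_bp_hardening(void *data)
{
	const struct arm64_cpu_capabilities *entry = data;

	/* If PSCI is present at all, install the get_version-based callback */
	if (psci_ops.get_version)
		install_bp_hardening_cb(entry,
					(bp_hardening_cb_t)psci_ops.get_version,
					__psci_hyp_bp_inval_start,
					__psci_hyp_bp_inval_end);

	return 0;
}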

> + if (res.a0 != 0) {
> + pr_warn("Error: CONFIG_HARDEN_BRANCH_PREDICTOR enabled, but firmware does not support it\n");
> + return 0;
> + }

Please don't print this here; see below.

> + if (res.a1 == 1 && psci_ops.get_version) {
> + pr_info("CPU%d: Branch predictor hardening enabled\n", smp_processor_id());

If you want to print a message, please put it in the capability structure
.desc field.
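
Something like the below, i.e. a sketch only, reusing MIDR_CAVIUM_THUNDERX2
from elsewhere in this thread and the usual cpu_errata.c conventions:

	{
		.desc = "Branch predictor hardening (Cavium ThunderX2)",
		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
		MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
		.enable = enable_tx2_psci_bp_hardening,
	},

The framework then prints the .desc string for you when the capability is
enabled.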

Will

2018-01-08 17:06:25

by Will Deacon

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On Sun, Jan 07, 2018 at 11:24:02PM -0800, Jayachandran C wrote:
> On Fri, Jan 05, 2018 at 01:12:33PM +0000, Will Deacon wrote:
> > For non-KASLR kernels where the KPTI behaviour has not been overridden
> > on the command line we can use ID_AA64PFR0_EL1.CSV3 to determine whether
> > or not we should unmap the kernel whilst running at EL0.
> >
> > Reviewed-by: Suzuki K Poulose <[email protected]>
> > Signed-off-by: Will Deacon <[email protected]>
> > ---
> > arch/arm64/include/asm/sysreg.h | 1 +
> > arch/arm64/kernel/cpufeature.c | 8 +++++++-
> > 2 files changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > index 08cc88574659..ae519bbd3f9e 100644
> > --- a/arch/arm64/include/asm/sysreg.h
> > +++ b/arch/arm64/include/asm/sysreg.h
> > @@ -437,6 +437,7 @@
> > #define ID_AA64ISAR1_DPB_SHIFT 0
> >
> > /* id_aa64pfr0 */
> > +#define ID_AA64PFR0_CSV3_SHIFT 60
> > #define ID_AA64PFR0_SVE_SHIFT 32
> > #define ID_AA64PFR0_GIC_SHIFT 24
> > #define ID_AA64PFR0_ASIMD_SHIFT 20
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 9f0545dfe497..d723fc071f39 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -145,6 +145,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
> > };
> >
> > static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
> > + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
> > ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
> > ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
> > S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
> > @@ -851,6 +852,8 @@ static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
> > static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > int __unused)
> > {
> > + u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> > +
> > /* Forced on command line? */
> > if (__kpti_forced) {
> > pr_info_once("kernel page table isolation forced %s by command line option\n",
> > @@ -862,7 +865,9 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > return true;
> >
> > - return false;
> > + /* Defer to CPU feature registers */
> > + return !cpuid_feature_extract_unsigned_field(pfr0,
> > + ID_AA64PFR0_CSV3_SHIFT);
>
> If I read this correctly, this enables KPTI on all processors without the CSV3
> set (which seems to be a future capability).
>
> Turning on KPTI has a small but significant overhead, so I think we should turn
> it off on processors that are not vulnerable to CVE-2017-5754. Can we add something
> like this:
>
> --->8
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 19ed09b..202b037 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> return __kpti_forced > 0;
> }
>
> + /* Don't force KPTI for CPUs that are not vulnerable */
> + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> + case MIDR_CAVIUM_THUNDERX2:
> + case MIDR_BRCM_VULCAN:
> + return false;
> + }
> +

KASLR aside (I agree with Marc on that), I did consider an MIDR whitelist,
but it gets nasty for big.LITTLE systems if maxcpus= is used and we see a
non-whitelisted CPU after we've booted. At this point, we can't actually
bring the thing online.

You could make the argument that if you're passing maxcpus= then you can just
as easily pass kpti= as well, but I wasn't sure.

Will

2018-01-08 17:20:04

by Jayachandran C

[permalink] [raw]
Subject: Re: [PATCH 2/2] arm64: Branch predictor hardening for Cavium ThunderX2

On Mon, Jan 08, 2018 at 04:46:52PM +0000, Will Deacon wrote:
> On Sun, Jan 07, 2018 at 10:53:36PM -0800, Jayachandran C wrote:
> > Use PSCI based mitigation for speculative execution attacks targeting
> > the branch predictor. The approach is similar to the one used for
> > Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> > test if the firmware supports the capability.
> >
> > If the secure firmware has been updated with the mitigation code to
> > invalidate the branch target buffer, we use the PSCI version call to
> > invoke it.
> >
> > Signed-off-by: Jayachandran C <[email protected]>
> > ---
> > arch/arm64/kernel/cpu_errata.c | 38 ++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 38 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> > index cb0fb37..abceb5d 100644
> > --- a/arch/arm64/kernel/cpu_errata.c
> > +++ b/arch/arm64/kernel/cpu_errata.c
> > @@ -124,6 +124,7 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
> > __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
> > }
> >
> > +#include <linux/arm-smccc.h>
> > #include <linux/psci.h>
> >
> > static int enable_psci_bp_hardening(void *data)
> > @@ -138,6 +139,33 @@ static int enable_psci_bp_hardening(void *data)
> >
> > return 0;
> > }
> > +
> > +#define CAVIUM_TX2_SIP_SMC_CALL 0xC200FF00
> > +#define CAVIUM_TX2_BTB_HARDEN_CAP 0xB0A0
> > +
> > +static int enable_tx2_psci_bp_hardening(void *data)
> > +{
> > + const struct arm64_cpu_capabilities *entry = data;
> > + struct arm_smccc_res res;
> > +
> > + if (!entry->matches(entry, SCOPE_LOCAL_CPU))
> > + return;
> > +
> > + arm_smccc_smc(CAVIUM_TX2_SIP_SMC_CALL, CAVIUM_TX2_BTB_HARDEN_CAP, 0, 0, 0, 0, 0, 0, &res);
>
> One thing to be aware of here is that if somebody configures qemu to emulate
> a TX2, this may actually disappear into EL3 and not return. You're better
> off sticking with PSCI GET_VERSION in terms of portability, but it's your
> call -- I'd expect you to deal with any breakage reports on the list due
> to the SMC above. Fair?

I don't like having a custom SMC here either. But overloading PSCI get version
is the problem, as I wrote earlier - there is no way to check whether the firmware
implements BTB hardening with the overloading. There is a good chance that users
with old firmware will just fail without any warning.

Is there a reason for overloading PSCI get version? Allocating a new standard
SMC number would make checking for existence and usage much simpler.

I did not quite understand the possible issue in qemu; unimplemented SMC calls
are expected to return an error code. What am I missing here?

>
> > + if (res.a0 != 0) {
> > + pr_warn("Error: CONFIG_HARDEN_BRANCH_PREDICTOR enabled, but firmware does not support it\n");
> > + return 0;
> > + }
>
> Please don't print this here; see below.
>
> > + if (res.a1 == 1 && psci_ops.get_version) {
> > + pr_info("CPU%d: Branch predictor hardening enabled\n", smp_processor_id());
>
> If you want to print a message, please put it in the capability structure
> .desc field.

Thanks, will fix this in the next rev.

JC.

2018-01-08 17:23:42

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH 2/2] arm64: Branch predictor hardening for Cavium ThunderX2

On Mon, Jan 08, 2018 at 09:19:43AM -0800, Jayachandran C wrote:
> On Mon, Jan 08, 2018 at 04:46:52PM +0000, Will Deacon wrote:
> > On Sun, Jan 07, 2018 at 10:53:36PM -0800, Jayachandran C wrote:
> > > Use PSCI based mitigation for speculative execution attacks targeting
> > > the branch predictor. The approach is similar to the one used for
> > > Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> > > test if the firmware supports the capability.
> > >
> > > If the secure firmware has been updated with the mitigation code to
> > > invalidate the branch target buffer, we use the PSCI version call to
> > > invoke it.
> > >
> > > Signed-off-by: Jayachandran C <[email protected]>
> > > ---
> > > arch/arm64/kernel/cpu_errata.c | 38 ++++++++++++++++++++++++++++++++++++++
> > > 1 file changed, 38 insertions(+)
> > >
> > > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> > > index cb0fb37..abceb5d 100644
> > > --- a/arch/arm64/kernel/cpu_errata.c
> > > +++ b/arch/arm64/kernel/cpu_errata.c
> > > @@ -124,6 +124,7 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
> > > __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
> > > }
> > >
> > > +#include <linux/arm-smccc.h>
> > > #include <linux/psci.h>
> > >
> > > static int enable_psci_bp_hardening(void *data)
> > > @@ -138,6 +139,33 @@ static int enable_psci_bp_hardening(void *data)
> > >
> > > return 0;
> > > }
> > > +
> > > +#define CAVIUM_TX2_SIP_SMC_CALL 0xC200FF00
> > > +#define CAVIUM_TX2_BTB_HARDEN_CAP 0xB0A0
> > > +
> > > +static int enable_tx2_psci_bp_hardening(void *data)
> > > +{
> > > + const struct arm64_cpu_capabilities *entry = data;
> > > + struct arm_smccc_res res;
> > > +
> > > + if (!entry->matches(entry, SCOPE_LOCAL_CPU))
> > > + return;
> > > +
> > > + arm_smccc_smc(CAVIUM_TX2_SIP_SMC_CALL, CAVIUM_TX2_BTB_HARDEN_CAP, 0, 0, 0, 0, 0, 0, &res);
> >
> > One thing to be aware of here is that if somebody configures qemu to emulate
> > a TX2, this may actually disappear into EL3 and not return. You're better
> > off sticking with PSCI GET_VERSION in terms of portability, but it's your
> > call -- I'd expect you to deal with any breakage reports on the list due
> > to the SMC above. Fair?
>
> I don't like having a custom SMC here either. But Overloading PSCI get version
> is the problem as I wrote earlier - there is no way to check if the firmware
> implements BTB hardening with overloading. There is a good chance that users
> with old firmware will just fail without any warning.

That's true, but there is precedent for this elsewhere. For example, CPU
errata that require a firmware change often can't be probed for. Also, your SMC
call won't always work (see the qemu comment below). Note that I'm not
saying I won't take this code, just that you need to be aware of what
you're doing.

> Is there a reason for overloading PSCI get version? Allocating a new standard
> SMC number would make checking for existance and usage much simpler.

PSCI get version is what we have today. We're working on extending PSCI to
allocate a new standard SMC number, but we need something that can be used
with existing firmware too and standardisation doesn't happen overnight.

> I did not quite understand the possible issue in qemu, unimplemented SMC calls
> are expected to return an error code. What am I missing here?

Qemu will inject them into EL3, and there might not be anything there.

Will

2018-01-08 17:40:44

by Jayachandran C

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On Mon, Jan 08, 2018 at 09:20:09AM +0000, Marc Zyngier wrote:
> On 08/01/18 07:24, Jayachandran C wrote:
> > On Fri, Jan 05, 2018 at 01:12:33PM +0000, Will Deacon wrote:
> >> For non-KASLR kernels where the KPTI behaviour has not been overridden
> >> on the command line we can use ID_AA64PFR0_EL1.CSV3 to determine whether
> >> or not we should unmap the kernel whilst running at EL0.
> >>
> >> Reviewed-by: Suzuki K Poulose <[email protected]>
> >> Signed-off-by: Will Deacon <[email protected]>
> >> ---
> >> arch/arm64/include/asm/sysreg.h | 1 +
> >> arch/arm64/kernel/cpufeature.c | 8 +++++++-
> >> 2 files changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> >> index 08cc88574659..ae519bbd3f9e 100644
> >> --- a/arch/arm64/include/asm/sysreg.h
> >> +++ b/arch/arm64/include/asm/sysreg.h
> >> @@ -437,6 +437,7 @@
> >> #define ID_AA64ISAR1_DPB_SHIFT 0
> >>
> >> /* id_aa64pfr0 */
> >> +#define ID_AA64PFR0_CSV3_SHIFT 60
> >> #define ID_AA64PFR0_SVE_SHIFT 32
> >> #define ID_AA64PFR0_GIC_SHIFT 24
> >> #define ID_AA64PFR0_ASIMD_SHIFT 20
> >> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >> index 9f0545dfe497..d723fc071f39 100644
> >> --- a/arch/arm64/kernel/cpufeature.c
> >> +++ b/arch/arm64/kernel/cpufeature.c
> >> @@ -145,6 +145,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
> >> };
> >>
> >> static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
> >> + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
> >> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
> >> ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
> >> S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
> >> @@ -851,6 +852,8 @@ static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
> >> static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> >> int __unused)
> >> {
> >> + u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> >> +
> >> /* Forced on command line? */
> >> if (__kpti_forced) {
> >> pr_info_once("kernel page table isolation forced %s by command line option\n",
> >> @@ -862,7 +865,9 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> >> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> >> return true;
> >>
> >> - return false;
> >> + /* Defer to CPU feature registers */
> >> + return !cpuid_feature_extract_unsigned_field(pfr0,
> >> + ID_AA64PFR0_CSV3_SHIFT);
> >
> > If I read this correctly, this enables KPTI on all processors without the CSV3
> > set (which seems to be a future capability).
> >
> > Turning on KPTI has a small but significant overhead, so I think we should turn
> > it off on processors that are not vulnerable to CVE-2017-5754. Can we add something
> > like this:
> >
> > --->8
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 19ed09b..202b037 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > return __kpti_forced > 0;
> > }
> >
> > + /* Don't force KPTI for CPUs that are not vulnerable */
> > + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> > + case MIDR_CAVIUM_THUNDERX2:
> > + case MIDR_BRCM_VULCAN:
> > + return false;
> > + }
> > +
> > /* Useful for KASLR robustness */
> > if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > return true;
> >
>
> KPTI is also an improvement for KASLR. Why would you deprive a user of
> the choice to further secure their system?

The user has a choice with kpti= at the kernel command line, so we are
not depriving the user of a choice. KASLR is expected to be enabled by
distributions, and KPTI will be enabled by default as well.

On systems that are not vulnerable to variant 3, this is an unnecessary
overhead.

JC

2018-01-08 17:50:54

by Jayachandran C

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On Mon, Jan 08, 2018 at 05:06:24PM +0000, Will Deacon wrote:
> On Sun, Jan 07, 2018 at 11:24:02PM -0800, Jayachandran C wrote:
> > On Fri, Jan 05, 2018 at 01:12:33PM +0000, Will Deacon wrote:
> > > For non-KASLR kernels where the KPTI behaviour has not been overridden
> > > on the command line we can use ID_AA64PFR0_EL1.CSV3 to determine whether
> > > or not we should unmap the kernel whilst running at EL0.
> > >
> > > Reviewed-by: Suzuki K Poulose <[email protected]>
> > > Signed-off-by: Will Deacon <[email protected]>
> > > ---
> > > arch/arm64/include/asm/sysreg.h | 1 +
> > > arch/arm64/kernel/cpufeature.c | 8 +++++++-
> > > 2 files changed, 8 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > > index 08cc88574659..ae519bbd3f9e 100644
> > > --- a/arch/arm64/include/asm/sysreg.h
> > > +++ b/arch/arm64/include/asm/sysreg.h
> > > @@ -437,6 +437,7 @@
> > > #define ID_AA64ISAR1_DPB_SHIFT 0
> > >
> > > /* id_aa64pfr0 */
> > > +#define ID_AA64PFR0_CSV3_SHIFT 60
> > > #define ID_AA64PFR0_SVE_SHIFT 32
> > > #define ID_AA64PFR0_GIC_SHIFT 24
> > > #define ID_AA64PFR0_ASIMD_SHIFT 20
> > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > index 9f0545dfe497..d723fc071f39 100644
> > > --- a/arch/arm64/kernel/cpufeature.c
> > > +++ b/arch/arm64/kernel/cpufeature.c
> > > @@ -145,6 +145,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
> > > };
> > >
> > > static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
> > > + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
> > > ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
> > > ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
> > > S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
> > > @@ -851,6 +852,8 @@ static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
> > > static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > > int __unused)
> > > {
> > > + u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> > > +
> > > /* Forced on command line? */
> > > if (__kpti_forced) {
> > > pr_info_once("kernel page table isolation forced %s by command line option\n",
> > > @@ -862,7 +865,9 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > > if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > return true;
> > >
> > > - return false;
> > > + /* Defer to CPU feature registers */
> > > + return !cpuid_feature_extract_unsigned_field(pfr0,
> > > + ID_AA64PFR0_CSV3_SHIFT);
> >
> > If I read this correctly, this enables KPTI on all processors without the CSV3
> > set (which seems to be a future capability).
> >
> > Turning on KPTI has a small but significant overhead, so I think we should turn
> > it off on processors that are not vulnerable to CVE-2017-5754. Can we add something
> > like this:
> >
> > --->8
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 19ed09b..202b037 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > return __kpti_forced > 0;
> > }
> >
> > + /* Don't force KPTI for CPUs that are not vulnerable */
> > + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> > + case MIDR_CAVIUM_THUNDERX2:
> > + case MIDR_BRCM_VULCAN:
> > + return false;
> > + }
> > +
>
> KASLR aside (I agree with Marc on that), I did consider an MIDR whitelist,
> but it gets nasty for big.LITTLE systems if maxcpus= is used and we see a
> non-whitelisted CPU after we've booted. At this point, we can't actually
> bring the thing online.
>
> You could make the argument that if you're passing maxcpus= then you can just
> easily pass kpti= as well, but I wasn't sure.

The code above should be a reasonable addition for getting the default right.
If by any chance these CPUs are shown to be vulnerable to timing attacks that
can bypass KASLR, we will need to move the change below the line:

if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
return true;

The current code, which is purely based on the future CSV3 capability, does not
handle CPUs without the vulnerability properly. Can we fix this?

Thanks,
JC

2018-01-08 17:51:03

by Will Deacon

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On Mon, Jan 08, 2018 at 09:40:17AM -0800, Jayachandran C wrote:
> On Mon, Jan 08, 2018 at 09:20:09AM +0000, Marc Zyngier wrote:
> > On 08/01/18 07:24, Jayachandran C wrote:
> > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > index 19ed09b..202b037 100644
> > > --- a/arch/arm64/kernel/cpufeature.c
> > > +++ b/arch/arm64/kernel/cpufeature.c
> > > @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > > return __kpti_forced > 0;
> > > }
> > >
> > > + /* Don't force KPTI for CPUs that are not vulnerable */
> > > + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> > > + case MIDR_CAVIUM_THUNDERX2:
> > > + case MIDR_BRCM_VULCAN:
> > > + return false;
> > > + }
> > > +
> > > /* Useful for KASLR robustness */
> > > if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > return true;
> > >
> >
> > KPTI is also an improvement for KASLR. Why would you deprive a user of
> > the choice to further secure their system?
>
> The user has a choice with kpti= at the kernel command line, so we are
> not depriving the user of a choice. KASLR is expected to be enabled by
> distributions, and KPTI will be enabled by default as well.
>
> On systems that are not vulnerable to variant 3, this is an unnecessary
> overhead.

KASLR can be bypassed on CPUs that are not vulnerable to variant 3 simply
by timing how long accesses to kernel addresses from EL0 take -- please read
the original KAISER paper for details about that attack on x86. kpti
mitigates that. If you don't care about KASLR, don't enable it (arguably
it's useless without kpti).
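
To make the timing angle concrete, the probe is conceptually no more than
timing a faulting load from EL0. Below is a deliberately coarse, hypothetical
userspace sketch: the probe address is made up, and a nanosecond clock plus
signal-delivery overhead is nowhere near enough resolution for a real attack.

#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static sigjmp_buf env;

static void segv_handler(int sig)
{
	(void)sig;
	siglongjmp(env, 1);
}

/*
 * Time one faulting load from a kernel address. The hit/miss difference in
 * fault latency is the side channel that kpti removes by unmapping the
 * kernel while running at EL0.
 */
static uint64_t time_faulting_load(volatile uint64_t *addr)
{
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (!sigsetjmp(env, 1))
		(void)*addr;		/* faults; handler longjmps back */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (uint64_t)((t1.tv_sec - t0.tv_sec) * 1000000000LL +
			  (t1.tv_nsec - t0.tv_nsec));
}

int main(void)
{
	/* hypothetical kernel VA to probe -- purely illustrative */
	volatile uint64_t *probe = (volatile uint64_t *)0xffff000008080000UL;

	signal(SIGSEGV, segv_handler);
	printf("fault latency: %llu ns\n",
	       (unsigned long long)time_faulting_load(probe));
	return 0;
}

A real attack would use the cycle counter and repeat the measurement over a
range of candidate addresses, as described in the KAISER paper.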

Will

2018-01-08 17:52:27

by Marc Zyngier

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On 08/01/18 17:40, Jayachandran C wrote:
> On Mon, Jan 08, 2018 at 09:20:09AM +0000, Marc Zyngier wrote:
>> On 08/01/18 07:24, Jayachandran C wrote:
>>> On Fri, Jan 05, 2018 at 01:12:33PM +0000, Will Deacon wrote:
>>>> For non-KASLR kernels where the KPTI behaviour has not been overridden
>>>> on the command line we can use ID_AA64PFR0_EL1.CSV3 to determine whether
>>>> or not we should unmap the kernel whilst running at EL0.
>>>>
>>>> Reviewed-by: Suzuki K Poulose <[email protected]>
>>>> Signed-off-by: Will Deacon <[email protected]>
>>>> ---
>>>> arch/arm64/include/asm/sysreg.h | 1 +
>>>> arch/arm64/kernel/cpufeature.c | 8 +++++++-
>>>> 2 files changed, 8 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>>>> index 08cc88574659..ae519bbd3f9e 100644
>>>> --- a/arch/arm64/include/asm/sysreg.h
>>>> +++ b/arch/arm64/include/asm/sysreg.h
>>>> @@ -437,6 +437,7 @@
>>>> #define ID_AA64ISAR1_DPB_SHIFT 0
>>>>
>>>> /* id_aa64pfr0 */
>>>> +#define ID_AA64PFR0_CSV3_SHIFT 60
>>>> #define ID_AA64PFR0_SVE_SHIFT 32
>>>> #define ID_AA64PFR0_GIC_SHIFT 24
>>>> #define ID_AA64PFR0_ASIMD_SHIFT 20
>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>> index 9f0545dfe497..d723fc071f39 100644
>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>> @@ -145,6 +145,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
>>>> };
>>>>
>>>> static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
>>>> + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
>>>> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
>>>> ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
>>>> S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
>>>> @@ -851,6 +852,8 @@ static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
>>>> static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>>>> int __unused)
>>>> {
>>>> + u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
>>>> +
>>>> /* Forced on command line? */
>>>> if (__kpti_forced) {
>>>> pr_info_once("kernel page table isolation forced %s by command line option\n",
>>>> @@ -862,7 +865,9 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>>>> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>>>> return true;
>>>>
>>>> - return false;
>>>> + /* Defer to CPU feature registers */
>>>> + return !cpuid_feature_extract_unsigned_field(pfr0,
>>>> + ID_AA64PFR0_CSV3_SHIFT);
>>>
>>> If I read this correctly, this enables KPTI on all processors without the CSV3
>>> set (which seems to be a future capability).
>>>
>>> Turning on KPTI has a small but significant overhead, so I think we should turn
>>> it off on processors that are not vulnerable to CVE-2017-5754. Can we add something
>>> like this:
>>>
>>> --->8
>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>> index 19ed09b..202b037 100644
>>> --- a/arch/arm64/kernel/cpufeature.c
>>> +++ b/arch/arm64/kernel/cpufeature.c
>>> @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>>> return __kpti_forced > 0;
>>> }
>>>
>>> + /* Don't force KPTI for CPUs that are not vulnerable */
>>> + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
>>> + case MIDR_CAVIUM_THUNDERX2:
>>> + case MIDR_BRCM_VULCAN:
>>> + return false;
>>> + }
>>> +
>>> /* Useful for KASLR robustness */
>>> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>>> return true;
>>>
>>
>> KPTI is also an improvement for KASLR. Why would you deprive a user of
>> the choice to further secure their system?
>
> The user has a choice with kpti= at the kernel command line, so we are
> not depriving the user of a choice. KASLR is expected to be enabled by
> distributions, and KPTI will be enabled by default as well.
>
> On systems that are not vulnerable to variant 3, this is an unnecessary
> overhead.

KASLR can be defeated if you don't have KPTI enabled. The original
KAISER paper is quite clear about that.

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2018-01-08 18:22:54

by Alan Cox

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

> > On systems that are not vulnerable to variant 3, this is an unnecessary
> > overhead.
>
> KASLR can be bypassed on CPUs that are not vulnerable to variant 3 simply
> by timing how long accesses to kernel addresses from EL0 take -- please read
> the original KAISER paper for details about that attack on x86. kpti
> mitigates that. If you don't care about KASLR, don't enable it (arguably
> it's useless without kpti).

KASLR is primarily of value for remote protection.

Alan

2018-01-09 02:26:36

by Jayachandran C

[permalink] [raw]
Subject: Re: [PATCH 2/2] arm64: Branch predictor hardening for Cavium ThunderX2

On Mon, Jan 08, 2018 at 05:23:41PM +0000, Will Deacon wrote:
> On Mon, Jan 08, 2018 at 09:19:43AM -0800, Jayachandran C wrote:
> > On Mon, Jan 08, 2018 at 04:46:52PM +0000, Will Deacon wrote:
> > > On Sun, Jan 07, 2018 at 10:53:36PM -0800, Jayachandran C wrote:
> > > > Use PSCI based mitigation for speculative execution attacks targeting
> > > > the branch predictor. The approach is similar to the one used for
> > > > Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> > > > test if the firmware supports the capability.
> > > >
> > > > If the secure firmware has been updated with the mitigation code to
> > > > invalidate the branch target buffer, we use the PSCI version call to
> > > > invoke it.
> > > >
> > > > Signed-off-by: Jayachandran C <[email protected]>
> > > > ---
> > > > arch/arm64/kernel/cpu_errata.c | 38 ++++++++++++++++++++++++++++++++++++++
> > > > 1 file changed, 38 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> > > > index cb0fb37..abceb5d 100644
> > > > --- a/arch/arm64/kernel/cpu_errata.c
> > > > +++ b/arch/arm64/kernel/cpu_errata.c
> > > > @@ -124,6 +124,7 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
> > > > __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
> > > > }
> > > >
> > > > +#include <linux/arm-smccc.h>
> > > > #include <linux/psci.h>
> > > >
> > > > static int enable_psci_bp_hardening(void *data)
> > > > @@ -138,6 +139,33 @@ static int enable_psci_bp_hardening(void *data)
> > > >
> > > > return 0;
> > > > }
> > > > +
> > > > +#define CAVIUM_TX2_SIP_SMC_CALL 0xC200FF00
> > > > +#define CAVIUM_TX2_BTB_HARDEN_CAP 0xB0A0
> > > > +
> > > > +static int enable_tx2_psci_bp_hardening(void *data)
> > > > +{
> > > > + const struct arm64_cpu_capabilities *entry = data;
> > > > + struct arm_smccc_res res;
> > > > +
> > > > + if (!entry->matches(entry, SCOPE_LOCAL_CPU))
> > > > + return;
> > > > +
> > > > + arm_smccc_smc(CAVIUM_TX2_SIP_SMC_CALL, CAVIUM_TX2_BTB_HARDEN_CAP, 0, 0, 0, 0, 0, 0, &res);
> > >
> > > One thing to be aware of here is that if somebody configures qemu to emulate
> > > a TX2, this may actually disappear into EL3 and not return. You're better
> > > off sticking with PSCI GET_VERSION in terms of portability, but it's your
> > > call -- I'd expect you to deal with any breakage reports on the list due
> > > to the SMC above. Fair?
> >
> > I don't like having a custom SMC here either. But Overloading PSCI get version
> > is the problem as I wrote earlier - there is no way to check if the firmware
> > implements BTB hardening with overloading. There is a good chance that users
> > with old firmware will just fail without any warning.
>
> That's true, but there is precedent for this elsewhere. For example, CPU
> errata that require a firmware change are often not probable. Also, your SMC
> call won't always work (see the qemu comment below). Note that I'm not
> saying I won't take this code, just that you need to be aware of what
> you're doing.
>
> > Is there a reason for overloading PSCI get version? Allocating a new standard
> > SMC number would make checking for existance and usage much simpler.
>
> PSCI get version is what we have today. We're working on extending PSCI to
> allocate a new standard SMC number, but we need something that can be used
> with existing firmware too and standardisation doesn't happen overnight.

Can you hold this patchset until the SMC number is published? Otherwise we
will end up with two incompatible interfaces, and the mess of supporting
both.

Or if there is a plan to standardize this later, I can pick up a vendor-specific
SMC for now and switch over to the standard one later. Any suggestions here?

JC.

2018-01-09 04:06:36

by Jayachandran C

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On Mon, Jan 08, 2018 at 05:51:00PM +0000, Will Deacon wrote:
> On Mon, Jan 08, 2018 at 09:40:17AM -0800, Jayachandran C wrote:
> > On Mon, Jan 08, 2018 at 09:20:09AM +0000, Marc Zyngier wrote:
> > > On 08/01/18 07:24, Jayachandran C wrote:
> > > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > > index 19ed09b..202b037 100644
> > > > --- a/arch/arm64/kernel/cpufeature.c
> > > > +++ b/arch/arm64/kernel/cpufeature.c
> > > > @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > > > return __kpti_forced > 0;
> > > > }
> > > >
> > > > + /* Don't force KPTI for CPUs that are not vulnerable */
> > > > + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> > > > + case MIDR_CAVIUM_THUNDERX2:
> > > > + case MIDR_BRCM_VULCAN:
> > > > + return false;
> > > > + }
> > > > +
> > > > /* Useful for KASLR robustness */
> > > > if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > > return true;
> > > >
> > >
> > > KPTI is also an improvement for KASLR. Why would you deprive a user of
> > > the choice to further secure their system?
> >
> > The user has a choice with kpti= at the kernel command line, so we are
> > not depriving the user of a choice. KASLR is expected to be enabled by
> > distributions, and KPTI will be enabled by default as well.
> >
> > On systems that are not vulnerable to variant 3, this is an unnecessary
> > overhead.
>
> KASLR can be bypassed on CPUs that are not vulnerable to variant 3 simply
> by timing how long accesses to kernel addresses from EL0 take -- please read
> the original KAISER paper for details about that attack on x86. kpti
> mitigates that. If you don't care about KASLR, don't enable it (arguably
> it's useless without kpti).

The code above assumes that all ARM CPUs (now and future) will be vulnerable
to timing attacks that can bypass KASLR. I don't think that is a correct
assumption to make.

If ThunderX2 is shown to be vulnerable to any timing-based attack, we can
certainly move the MIDR check after the CONFIG_RANDOMIZE_BASE check.
But I don't think that is the case now; if you have any PoC code to check
this, I can run it on the processor and make the change.

It is pretty clear that we need a whitelist check either before or after the
CONFIG_RANDOMIZE_BASE check.

The KAISER paper seems to say that ARM TTBR0/1 made it more immune, and the
prefetch paper (if I understand correctly) showed that prefetch on some ARM
cores can be used for a timing attack. This is probably an area where you will
have better information, so any specific pointers would be appreciated -
especially ones showing that all ARM CPUs are susceptible.

Thanks,
JC.

2018-01-09 09:53:26

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH 2/2] arm64: Branch predictor hardening for Cavium ThunderX2

On Mon, Jan 08, 2018 at 06:26:20PM -0800, Jayachandran C wrote:
> On Mon, Jan 08, 2018 at 05:23:41PM +0000, Will Deacon wrote:
> > On Mon, Jan 08, 2018 at 09:19:43AM -0800, Jayachandran C wrote:
> > > On Mon, Jan 08, 2018 at 04:46:52PM +0000, Will Deacon wrote:
> > > > On Sun, Jan 07, 2018 at 10:53:36PM -0800, Jayachandran C wrote:
> > > > > + arm_smccc_smc(CAVIUM_TX2_SIP_SMC_CALL, CAVIUM_TX2_BTB_HARDEN_CAP, 0, 0, 0, 0, 0, 0, &res);
> > > >
> > > > One thing to be aware of here is that if somebody configures qemu to emulate
> > > > a TX2, this may actually disappear into EL3 and not return. You're better
> > > > off sticking with PSCI GET_VERSION in terms of portability, but it's your
> > > > call -- I'd expect you to deal with any breakage reports on the list due
> > > > to the SMC above. Fair?
> > >
> > > I don't like having a custom SMC here either. But Overloading PSCI get version
> > > is the problem as I wrote earlier - there is no way to check if the firmware
> > > implements BTB hardening with overloading. There is a good chance that users
> > > with old firmware will just fail without any warning.
> >
> > That's true, but there is precedent for this elsewhere. For example, CPU
> > errata that require a firmware change are often not probable. Also, your SMC
> > call won't always work (see the qemu comment below). Note that I'm not
> > saying I won't take this code, just that you need to be aware of what
> > you're doing.
> >
> > > Is there a reason for overloading PSCI get version? Allocating a new standard
> > > SMC number would make checking for existance and usage much simpler.
> >
> > PSCI get version is what we have today. We're working on extending PSCI to
> > allocate a new standard SMC number, but we need something that can be used
> > with existing firmware too and standardisation doesn't happen overnight.
>
> Can you hold this patchset until the SMC number is published? Otherwise we
> will end up with two incompatible interfaces, and the mess of supporting
> both.

This has already been queued, and will be necessary for older PSCI versions
that can be extended to perform the BP invalidation in get version, but
which cannot be upgraded to a newer version of the spec. But that's fine; we
can support both interfaces because the new one will need to be discoverable
anyway.

> Or if there is a plan standardize this later, I can pickup a vendor specific
> SMC for now, and switch over to the standard one later. Any suggestions here?

I would suggest using GET VERSION for now and switching later, but it's
entirely up to you.

Will

2018-01-09 10:00:35

by Will Deacon

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On Mon, Jan 08, 2018 at 08:06:27PM -0800, Jayachandran C wrote:
> On Mon, Jan 08, 2018 at 05:51:00PM +0000, Will Deacon wrote:
> > On Mon, Jan 08, 2018 at 09:40:17AM -0800, Jayachandran C wrote:
> > > On Mon, Jan 08, 2018 at 09:20:09AM +0000, Marc Zyngier wrote:
> > > > On 08/01/18 07:24, Jayachandran C wrote:
> > > > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > > > index 19ed09b..202b037 100644
> > > > > --- a/arch/arm64/kernel/cpufeature.c
> > > > > +++ b/arch/arm64/kernel/cpufeature.c
> > > > > @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> > > > > return __kpti_forced > 0;
> > > > > }
> > > > >
> > > > > + /* Don't force KPTI for CPUs that are not vulnerable */
> > > > > + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> > > > > + case MIDR_CAVIUM_THUNDERX2:
> > > > > + case MIDR_BRCM_VULCAN:
> > > > > + return false;
> > > > > + }
> > > > > +
> > > > > /* Useful for KASLR robustness */
> > > > > if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > > > return true;
> > > > >
> > > >
> > > > KPTI is also an improvement for KASLR. Why would you deprive a user of
> > > > the choice to further secure their system?
> > >
> > > The user has a choice with kpti= at the kernel command line, so we are
> > > not depriving the user of a choice. KASLR is expected to be enabled by
> > > distributions, and KPTI will be enabled by default as well.
> > >
> > > On systems that are not vulnerable to variant 3, this is an unnecessary
> > > overhead.
> >
> > KASLR can be bypassed on CPUs that are not vulnerable to variant 3 simply
> > by timing how long accesses to kernel addresses from EL0 take -- please read
> > the original KAISER paper for details about that attack on x86. kpti
> > mitigates that. If you don't care about KASLR, don't enable it (arguably
> > it's useless without kpti).
>
> The code above assumes that all ARM CPUs (now and future) will be vulnerable
> to timing attacks that can bypass KASLR. I don't think that is a correct
> assumption to make.

Well, the code is assuming that the difference between a TLB hit and a miss
can be measured and that permission faulting entries can be cached in the
TLB. I think that's a safe assumption for the moment. You can also disable
kaslr on the command line and at compile-time if you don't want to use it,
and the same thing applies to kpti. I really see this more as user
preference, rather than something that should be keyed off the MIDR and we
already provide those controls via the command line.

To be clear: I'll take the MIDR whitelisting, but only after the KASLR check
above.
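
Roughly this shape (sketch only, not a patch -- it reuses the helpers
already in unmap_kernel_at_el0(), and the signature is abbreviated the
same way the hunk headers above show it):

static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
                                int __unused)
{
    u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);

    /* kpti= on the command line always wins */
    if (__kpti_forced)
        return __kpti_forced > 0;

    /* Useful for KASLR robustness */
    if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
        return true;

    /* Whitelist of CPUs not vulnerable to variant 3 -- after the KASLR check */
    switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
    case MIDR_CAVIUM_THUNDERX2:
    case MIDR_BRCM_VULCAN:
        return false;
    }

    /* Defer to CPU feature registers */
    return !cpuid_feature_extract_unsigned_field(pfr0,
                                                 ID_AA64PFR0_CSV3_SHIFT);
}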

> If ThunderX2 is shown to be vulnerable to any timing based attack we can
> certainly move the MIDR check after the check for the CONFIG_RANDOMIZE_BASE.
> But I don't think that is the case now, if you have any PoC code to check
> this I can run on the processor and make the change.

I haven't tried, but if you have a TLB worth its salt, I suspect you can
defeat kaslr by timing prefetches or faulting loads to kernel addresses.

> It is pretty clear that we need a whitelist check either before or after the
> CONFIG_RANDOMIZE_BASE check.

Please send a patch implementing this after the check.

> The kaiser paper seems to say that ARM TTBR0/1 made it more immune, and the
> prefetch paper(if I understand correctly) showed that prefetch on some ARM
> cores can be used for timing attack. This is probably and area where you will
> have better information, so any specific pointers would be appreciated -
> especially ones showing that all ARM CPUs are susceptible.

Pretty much all the stuff specific to ARM (and there's not much) in the
paper is incorrect, but the basic premise of the timing attacks is sound.

Will

2018-01-09 12:47:31

by Jayachandran C

[permalink] [raw]
Subject: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

Use PSCI based mitigation for speculative execution attacks targeting
the branch predictor. The approach is similar to the one used for
Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
test if the firmware supports the capability.

If the secure firmware has been updated with the mitigation code to
invalidate the branch target buffer, we use the PSCI version call to
invoke it.

Signed-off-by: Jayachandran C <[email protected]>
---
v2:
- rebased on top of the latest kpti branch
- use pr_info_once/pr_warn_once to avoid excessive prints
- using .desc generated too many prints, dropped plan for using it
- fixed up a return

arch/arm64/kernel/cpu_errata.c | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 70e5f18..c626914 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -128,6 +128,7 @@ static void install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
}

+#include <linux/arm-smccc.h>
#include <linux/psci.h>

static int enable_psci_bp_hardening(void *data)
@@ -165,6 +166,33 @@ static int qcom_enable_link_stack_sanitization(void *data)

return 0;
}
+
+#define CAVIUM_TX2_SIP_SMC_CALL 0xC200FF00
+#define CAVIUM_TX2_BTB_HARDEN_CAP 0xB0A0
+
+static int enable_tx2_psci_bp_hardening(void *data)
+{
+ const struct arm64_cpu_capabilities *entry = data;
+ struct arm_smccc_res res;
+
+ if (!entry->matches(entry, SCOPE_LOCAL_CPU))
+ return 0;
+
+ arm_smccc_smc(CAVIUM_TX2_SIP_SMC_CALL, CAVIUM_TX2_BTB_HARDEN_CAP, 0, 0, 0, 0, 0, 0, &res);
+ if (res.a0 != 0) {
+ pr_warn_once("Error: CONFIG_HARDEN_BRANCH_PREDICTOR enabled, but firmware does not support it\n");
+ return 0;
+ }
+ if (res.a1 == 1 && psci_ops.get_version) {
+ pr_info_once("Branch predictor hardening: Enabled, using PSCI version call.\n");
+ install_bp_hardening_cb(entry,
+ (bp_hardening_cb_t)psci_ops.get_version,
+ __psci_hyp_bp_inval_start,
+ __psci_hyp_bp_inval_end);
+ }
+
+ return 0;
+}
#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */

#define MIDR_RANGE(model, min, max) \
@@ -338,6 +366,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
.capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
},
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
+ .enable = enable_tx2_psci_bp_hardening,
+ },
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
+ .enable = enable_tx2_psci_bp_hardening,
+ },
#endif
{
}
--
2.7.4

2018-01-16 21:50:48

by Jon Masters

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

On 01/09/2018 07:47 AM, Jayachandran C wrote:
> Use PSCI based mitigation for speculative execution attacks targeting
> the branch predictor. The approach is similar to the one used for
> Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> test if the firmware supports the capability.
>
> If the secure firmware has been updated with the mitigation code to
> invalidate the branch target buffer, we use the PSCI version call to
> invoke it.

What's the status of this patch for TX2? Previously you were holding on
the new SMC number, but I think the consensus was to pull this in now?

Jon.

--
Computer Architect

2018-01-16 21:52:57

by Jon Masters

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

On 01/09/2018 07:47 AM, Jayachandran C wrote:

> Use PSCI based mitigation for speculative execution attacks targeting
> the branch predictor. The approach is similar to the one used for
> Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> test if the firmware supports the capability.
>
> If the secure firmware has been updated with the mitigation code to
> invalidate the branch target buffer, we use the PSCI version call to
> invoke it.

What's the status of this patch currently? Previously you had suggested
to hold while the SMC got standardized, but then you seemed happy with
pulling in. What's the latest?

Jon.

2018-01-16 23:46:09

by Jayachandran C

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

On Tue, Jan 16, 2018 at 04:52:53PM -0500, Jon Masters wrote:
> On 01/09/2018 07:47 AM, Jayachandran C wrote:
>
> > Use PSCI based mitigation for speculative execution attacks targeting
> > the branch predictor. The approach is similar to the one used for
> > Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> > test if the firmware supports the capability.
> >
> > If the secure firmware has been updated with the mitigation code to
> > invalidate the branch target buffer, we use the PSCI version call to
> > invoke it.
>
> What's the status of this patch currently? Previously you had suggested
> to hold while the SMC got standardized, but then you seemed happy with
> pulling in. What's the latest?

My understanding is that the SMC standardization is being worked on
but will take more time, and the KPTI current patchset will go to
mainline before that.

Given that, I would expect arm64 maintainers to pick up this patch for
ThunderX2, but I have not seen any comments so far.

Will/Marc, please let me know if you are planning to pick this patch
into the KPTI tree.

Thanks,
JC.

2018-01-17 04:11:00

by Yisheng Xie

[permalink] [raw]
Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

Hi Will,

On 2018/1/5 21:12, Will Deacon wrote:
> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> index 5f7097d0cd12..d99b36555a16 100644
> --- a/arch/arm64/mm/context.c
> +++ b/arch/arm64/mm/context.c
> @@ -246,6 +246,8 @@ asmlinkage void post_ttbr_update_workaround(void)
> "ic iallu; dsb nsh; isb",
> ARM64_WORKAROUND_CAVIUM_27456,
> CONFIG_CAVIUM_ERRATUM_27456));
> +
> + arm64_apply_bp_hardening();
> }

post_ttbr_update_workaround was used to fix Cavium erratum 27456, so does that
mean that, if we do not have this erratum, we do not need arm64_apply_bp_hardening()
at mm_switch and kernel_exit?

From the code logic, it seems it is not only related to erratum 27456 anymore?
Should it be renamed?

Thanks
Yisheng

2018-01-17 10:07:12

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

On Wed, Jan 17, 2018 at 12:10:33PM +0800, Yisheng Xie wrote:
> Hi Will,
>
> On 2018/1/5 21:12, Will Deacon wrote:
> > diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> > index 5f7097d0cd12..d99b36555a16 100644
> > --- a/arch/arm64/mm/context.c
> > +++ b/arch/arm64/mm/context.c
> > @@ -246,6 +246,8 @@ asmlinkage void post_ttbr_update_workaround(void)
> > "ic iallu; dsb nsh; isb",
> > ARM64_WORKAROUND_CAVIUM_27456,
> > CONFIG_CAVIUM_ERRATUM_27456));
> > +
> > + arm64_apply_bp_hardening();
> > }
>
> post_ttbr_update_workaround was used for fix Cavium erratum 2745? so does that
> means, if we do not have this erratum, we do not need arm64_apply_bp_hardening()?
> when mm_swtich and kernel_exit?
>
> From the code logical, it seems not only related to erratum 2745 anymore?
> should it be renamed?

post_ttbr_update_workaround just runs code after a TTBR update, which
includes mitigations against variant 2 of "spectre" and also a workaround
for a Cavium erratum. These are separate issues.

Will

2018-01-17 18:35:28

by Jon Masters

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

On 01/16/2018 06:45 PM, Jayachandran C wrote:
> On Tue, Jan 16, 2018 at 04:52:53PM -0500, Jon Masters wrote:
>> On 01/09/2018 07:47 AM, Jayachandran C wrote:
>>
>>> Use PSCI based mitigation for speculative execution attacks targeting
>>> the branch predictor. The approach is similar to the one used for
>>> Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
>>> test if the firmware supports the capability.
>>>
>>> If the secure firmware has been updated with the mitigation code to
>>> invalidate the branch target buffer, we use the PSCI version call to
>>> invoke it.
>>
>> What's the status of this patch currently? Previously you had suggested
>> to hold while the SMC got standardized, but then you seemed happy with
>> pulling in. What's the latest?
>
> My understanding is that the SMC standardization is being worked on
> but will take more time, and the KPTI current patchset will go to
> mainline before that.
>
> Given that, I would expect arm64 maintainers to pick up this patch for
> ThunderX2, but I have not seen any comments so far.
>
> Will/Marc, please let me know if you are planning to pick this patch
> into the KPTI tree.

We've pulled in mitigations for QCOM Falkor into our internal
development branch (for future releases, this isn't about existing
stuff), but we can't pull in mitigations for other vendors until they're
upstream, and this patch isn't in any tree we track yet.

Therefore, I encourage all of the vendors to get this upstream. Until
that's true, it will be difficult to continue to carry out of tree bits.

Jon.


2018-01-18 08:39:58

by Yisheng Xie

[permalink] [raw]
Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

Hi Will,

On 2018/1/17 18:07, Will Deacon wrote:
> On Wed, Jan 17, 2018 at 12:10:33PM +0800, Yisheng Xie wrote:
>> Hi Will,
>>
>> On 2018/1/5 21:12, Will Deacon wrote:
>>> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
>>> index 5f7097d0cd12..d99b36555a16 100644
>>> --- a/arch/arm64/mm/context.c
>>> +++ b/arch/arm64/mm/context.c
>>> @@ -246,6 +246,8 @@ asmlinkage void post_ttbr_update_workaround(void)
>>> "ic iallu; dsb nsh; isb",
>>> ARM64_WORKAROUND_CAVIUM_27456,
>>> CONFIG_CAVIUM_ERRATUM_27456));
>>> +
>>> + arm64_apply_bp_hardening();
>>> }
>>
>> post_ttbr_update_workaround was used for fix Cavium erratum 2745? so does that
>> means, if we do not have this erratum, we do not need arm64_apply_bp_hardening()?
>> when mm_swtich and kernel_exit?
>>
>> From the code logical, it seems not only related to erratum 2745 anymore?
>> should it be renamed?
>
> post_ttbr_update_workaround just runs code after a TTBR update, which
> includes mitigations against variant 2 of "spectre" and also a workaround
> for a Cavium erratum. These are separate issues.

Got it, thanks for your kind explanation.

Thanks
Yisheng
>
> Will
>
> .
>


2018-01-18 13:54:29

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

Hi JC,

On Tue, Jan 16, 2018 at 03:45:54PM -0800, Jayachandran C wrote:
> On Tue, Jan 16, 2018 at 04:52:53PM -0500, Jon Masters wrote:
> > On 01/09/2018 07:47 AM, Jayachandran C wrote:
> >
> > > Use PSCI based mitigation for speculative execution attacks targeting
> > > the branch predictor. The approach is similar to the one used for
> > > Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> > > test if the firmware supports the capability.
> > >
> > > If the secure firmware has been updated with the mitigation code to
> > > invalidate the branch target buffer, we use the PSCI version call to
> > > invoke it.
> >
> > What's the status of this patch currently? Previously you had suggested
> > to hold while the SMC got standardized, but then you seemed happy with
> > pulling in. What's the latest?
>
> My understanding is that the SMC standardization is being worked on
> but will take more time, and the KPTI current patchset will go to
> mainline before that.
>
> Given that, I would expect arm64 maintainers to pick up this patch for
> ThunderX2, but I have not seen any comments so far.
>
> Will/Marc, please let me know if you are planning to pick this patch
> into the KPTI tree.

Are you really sure you want us to apply this? If we do, then you can't run
KVM guests anymore because your IMPDEF SMC results in an UNDEF being
injected (crash below).

I really think that you should just hook up the enable_psci_bp_hardening
callback like we've done for the Cortex CPUs. We can optimise this later
once the SMC standardisation work has been completed (which is nearly final
now and works in a backwards-compatible manner).

Will

--->8

[ 0.319123] Code: 2a080042 b8236885 29008829 17ffffc0 (d4000003)
[ 0.319125] Code: 2a080042 b8236885 29008829 17ffffc0 (d4000003)
[ 0.319147] Modules linked in:
[ 0.319152] CPU: 2 PID: 19 Comm: migration/2 Not tainted 4.15.0-rc8-00103-g9409c1e175be-dirty #1
[ 0.319154] Hardware name: linux,dummy-virt (DT)
[ 0.319156] pstate: 00000085 (nzcv daIf -PAN -UAO)
[ 0.319163] pc : __arm_smccc_smc+0x0/0x2c
[ 0.319166] lr : enable_tx2_psci_bp_hardening+0x6c/0x108
[ 0.319167] sp : ffff000009dcbd30
[ 0.319168] x29: ffff000009dcbd40 x28: 0000000000000000
[ 0.319171] x27: ffff00000803bc88 x26: 0000000000000001
[ 0.319174] x25: ffff000008d13980 x24: ffff00000907b575
[ 0.319176] x23: 0000000000000001 x22: 0000000000000000
[ 0.319179] x21: ffff00000803bd3c x20: ffff00000803bd18
[ 0.319181] x19: ffff0000089f2438 x18: 0000000000000030
[ 0.319183] x17: 0000000000000000 x16: 0000000000000000
[ 0.319185] x15: 0000000000000000 x14: 0000000000000400
[ 0.319187] x13: 0000000000000400 x12: 0000000000000000
[ 0.319189] x11: 0000000000000000 x10: 0000000000000a00
[ 0.319192] x9 : ffff000009dcbd80 x8 : ffff8001f691b460
[ 0.319194] x7 : 0000000000000000 x6 : 0000000000000000
[ 0.319196] x5 : 0000000000000000 x4 : 0000000000000000
[ 0.319198] x3 : 0000000000000000 x2 : 0000000000000000
[ 0.319200] x1 : 000000000000b0a0 x0 : 00000000c200ff00
[ 0.319203] Process migration/2 (pid: 19, stack limit = 0x000000004aa336a5)
[ 0.319204] Call trace:
[ 0.319207] __arm_smccc_smc+0x0/0x2c
[ 0.319211] multi_cpu_stop+0x8c/0x110
[ 0.319213] cpu_stopper_thread+0xac/0x120
[ 0.319219] smpboot_thread_fn+0x158/0x240
[ 0.319220] kthread+0x128/0x130
[ 0.319223] ret_from_fork+0x10/0x18
[ 0.319226] Code: 2a080042 b8236885 29008829 17ffffc0 (d4000003)
[ 0.319230] ---[ end trace 169f08213b3163bb ]---
[ 0.319234] Internal error: undefined instruction: 0 [#2] PREEMPT SMP
[ 0.319259] note: migration/2[19] exited with preempt_count 1
[ 0.319284] Modules linked in:
[ 0.319288] CPU: 3 PID: 24 Comm: migration/3 Tainted: G D 4.15.0-rc8-00103-g9409c1e175be-dirty #1
[ 0.319289] Hardware name: linux,dummy-virt (DT)
[ 0.319291] pstate: 00000085 (nzcv daIf -PAN -UAO)
[ 0.319295] pc : __arm_smccc_smc+0x0/0x2c
[ 0.319298] lr : enable_tx2_psci_bp_hardening+0x6c/0x108
[ 0.319298] sp : ffff000009df3d30
[ 0.319300] x29: ffff000009df3d40 x28: 0000000000000000
[ 0.319303] x27: ffff00000803bc88 x26: 0000000000000001
[ 0.319305] x25: ffff000008d13980 x24: ffff00000907b575
[ 0.319307] x23: 0000000000000001 x22: 0000000000000000
[ 0.319310] x21: ffff00000803bd3c x20: ffff00000803bd18
[ 0.319312] x19: ffff0000089f2438 x18: 0000000000000030
[ 0.319314] x17: 0000000000000000 x16: 0000000000000000
[ 0.319316] x15: 0000000000000000 x14: 0000000000000400
[ 0.319318] x13: 0000000000000400 x12: 0000000000000001
[ 0.319321] x11: 000000009ad0065e x10: 0000000000000a00
[ 0.319323] x9 : ffff000009df3d80 x8 : ffff8001f691fa60
[ 0.319325] x7 : 0000000000000000 x6 : 0000000000000000
[ 0.319327] x5 : 0000000000000000 x4 : 0000000000000000
[ 0.319329] x3 : 0000000000000000 x2 : 0000000000000000
[ 0.319331] x1 : 000000000000b0a0 x0 : 00000000c200ff00
[ 0.319334] Process migration/3 (pid: 24, stack limit = 0x00000000be13f0f9)
[ 0.319335] Call trace:
[ 0.319338] __arm_smccc_smc+0x0/0x2c
[ 0.319340] multi_cpu_stop+0x8c/0x110
[ 0.319342] cpu_stopper_thread+0xac/0x120
[ 0.319345] smpboot_thread_fn+0x158/0x240
[ 0.319346] kthread+0x128/0x130
[ 0.319348] ret_from_fork+0x10/0x18
[ 0.319351] Code: 2a080042 b8236885 29008829 17ffffc0 (d4000003)
[ 0.319352] ---[ end trace 169f08213b3163bc ]---
[ 0.319355] note: migration/3[24] exited with preempt_count 1
[ 0.319371] Internal error: undefined instruction: 0 [#3] PREEMPT SMP
[ 0.319373] Modules linked in:
[ 0.319376] CPU: 1 PID: 14 Comm: migration/1 Tainted: G D 4.15.0-rc8-00103-g9409c1e175be-dirty #1
[ 0.319377] Hardware name: linux,dummy-virt (DT)
[ 0.319379] pstate: 00000085 (nzcv daIf -PAN -UAO)
[ 0.319383] pc : __arm_smccc_smc+0x0/0x2c
[ 0.319385] lr : enable_tx2_psci_bp_hardening+0x6c/0x108
[ 0.319386] sp : ffff000009da3d30
[ 0.319387] x29: ffff000009da3d40 x28: 0000000000000000
[ 0.319390] x27: ffff00000803bc88 x26: 0000000000000001
[ 0.319393] x25: ffff000008d13980 x24: ffff00000907b575
[ 0.319395] x23: 0000000000000001 x22: 0000000000000000
[ 0.319397] x21: ffff00000803bd3c x20: ffff00000803bd18
[ 0.319399] x19: ffff0000089f2438 x18: 0000000000000030
[ 0.319402] x17: 0000000000000000 x16: 0000000000000000
[ 0.319404] x15: 0000000000000000 x14: 0000000000000400
[ 0.319406] x13: 0000000000000400 x12: 0000000000000000
[ 0.319408] x11: 0000000000000000 x10: 0000000000000a00
[ 0.319410] x9 : ffff000009da3d80 x8 : ffff8001f68e6c60
[ 0.319412] x7 : 0000000000000000 x6 : 0000000000000000
[ 0.319414] x5 : 0000000000000000 x4 : 0000000000000000
[ 0.319416] x3 : 0000000000000000 x2 : 0000000000000000
[ 0.319418] x1 : 000000000000b0a0 x0 : 00000000c200ff00
[ 0.319422] Process migration/1 (pid: 14, stack limit = 0x00000000596b9e92)
[ 0.319423] Call trace:
[ 0.319425] __arm_smccc_smc+0x0/0x2c
[ 0.319427] multi_cpu_stop+0x8c/0x110
[ 0.319429] cpu_stopper_thread+0xac/0x120
[ 0.319431] smpboot_thread_fn+0x158/0x240
[ 0.319433] kthread+0x128/0x130
[ 0.319435] ret_from_fork+0x10/0x18
[ 0.319437] Code: 2a080042 b8236885 29008829 17ffffc0 (d4000003)
[ 0.319439] ---[ end trace 169f08213b3163bd ]---
[ 0.319441] note: migration/1[14] exited with preempt_count 1
[ 0.857389] Code: 2a080042 b8236885 29008829 17ffffc0 (d4000003)
[ 0.863472] Internal error: undefined instruction: 0 [#4] PREEMPT SMP
[ 0.869937] Modules linked in:
[ 0.872969] CPU: 0 PID: 11 Comm: migration/0 Tainted: G D 4.15.0-rc8-00103-g9409c1e175be-dirty #1
[ 0.883064] Hardware name: linux,dummy-virt (DT)
[ 0.887680] pstate: 00000085 (nzcv daIf -PAN -UAO)
[ 0.892453] pc : __arm_smccc_smc+0x0/0x2c
[ 0.896466] lr : enable_tx2_psci_bp_hardening+0x6c/0x108
[ 0.901767] sp : ffff000009d73d30
[ 0.905078] x29: ffff000009d73d40 x28: 0000000000000000
[ 0.910404] x27: ffff00000803bc88 x26: 0000000000000001
[ 0.915706] x25: ffff000008d13980 x24: ffff00000907b575
[ 0.921040] x23: 0000000000000001 x22: 0000000000000000
[ 0.926357] x21: ffff00000803bd3c x20: ffff00000803bd18
[ 0.931660] x19: ffff0000089f2438 x18: 0000000000000010
[ 0.936954] x17: 00000000ffffff80 x16: 00000000bad0c696
[ 0.942280] x15: 0000000000000000 x14: 0000000000000400
[ 0.947567] x13: 0000000000000400 x12: 0000000000000001
[ 0.952861] x11: 0000000002014024 x10: 0000000000000a00
[ 0.958179] x9 : ffff000009d73d80 x8 : ffff8001f68e1860
[ 0.963460] x7 : 0000000000000000 x6 : 0000000000000000
[ 0.968761] x5 : 0000000000000000 x4 : 0000000000000000
[ 0.974071] x3 : 0000000000000000 x2 : 0000000000000000
[ 0.979341] x1 : 000000000000b0a0 x0 : 00000000c200ff00
[ 0.984676] Process migration/0 (pid: 11, stack limit = 0x0000000054c91580)
[ 0.991623] Call trace:
[ 0.994062] __arm_smccc_smc+0x0/0x2c
[ 0.997749] multi_cpu_stop+0x8c/0x110
[ 1.001495] cpu_stopper_thread+0xac/0x120
[ 1.005567] smpboot_thread_fn+0x158/0x240
[ 1.009665] kthread+0x128/0x130
[ 1.012881] ret_from_fork+0x10/0x18
[ 1.016435] Code: 2a080042 b8236885 29008829 17ffffc0 (d4000003)
[ 1.022526] ---[ end trace 169f08213b3163be ]---
[ 1.027146] note: migration/0[11] exited with preempt_count 1

2018-01-18 17:57:03

by Jayachandran C

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

On Thu, Jan 18, 2018 at 01:53:55PM +0000, Will Deacon wrote:
> Hi JC,
>
> On Tue, Jan 16, 2018 at 03:45:54PM -0800, Jayachandran C wrote:
> > On Tue, Jan 16, 2018 at 04:52:53PM -0500, Jon Masters wrote:
> > > On 01/09/2018 07:47 AM, Jayachandran C wrote:
> > >
> > > > Use PSCI based mitigation for speculative execution attacks targeting
> > > > the branch predictor. The approach is similar to the one used for
> > > > Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> > > > test if the firmware supports the capability.
> > > >
> > > > If the secure firmware has been updated with the mitigation code to
> > > > invalidate the branch target buffer, we use the PSCI version call to
> > > > invoke it.
> > >
> > > What's the status of this patch currently? Previously you had suggested
> > > to hold while the SMC got standardized, but then you seemed happy with
> > > pulling in. What's the latest?
> >
> > My understanding is that the SMC standardization is being worked on
> > but will take more time, and the KPTI current patchset will go to
> > mainline before that.
> >
> > Given that, I would expect arm64 maintainers to pick up this patch for
> > ThunderX2, but I have not seen any comments so far.
> >
> > Will/Marc, please let me know if you are planning to pick this patch
> > into the KPTI tree.
>
> Are you really sure you want us to apply this? If we do, then you can't run
> KVM guests anymore because your IMPDEF SMC results in an UNDEF being
> injected (crash below).
>
> I really think that you should just hook up the enable_psci_bp_hardening
> callback like we've done for the Cortex CPUs. We can optimise this later
> once the SMC standarisation work has been completed (which is nearly final
> now and works in a backwards-compatible manner).

I think Marc's patch here:
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm64/kpti&id=d35e77fae4b70331310c3bc1796bb43b93f9a85e
handles returning an error for undefined SMC calls in a guest.

I think in this case we have to choose between crashing and giving a false
sense of security when a guest compiled with HARDEN_BRANCH_PREDICTOR is
booted on a hypervisor that does not support hardening. Crashing may be
a reasonable option.

JC.

2018-01-18 18:29:13

by Jon Masters

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

On 01/18/2018 12:56 PM, Jayachandran C wrote:
> On Thu, Jan 18, 2018 at 01:53:55PM +0000, Will Deacon wrote:
>> Hi JC,
>>
>> On Tue, Jan 16, 2018 at 03:45:54PM -0800, Jayachandran C wrote:
>>> On Tue, Jan 16, 2018 at 04:52:53PM -0500, Jon Masters wrote:
>>>> On 01/09/2018 07:47 AM, Jayachandran C wrote:
>>>>
>>>>> Use PSCI based mitigation for speculative execution attacks targeting
>>>>> the branch predictor. The approach is similar to the one used for
>>>>> Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
>>>>> test if the firmware supports the capability.
>>>>>
>>>>> If the secure firmware has been updated with the mitigation code to
>>>>> invalidate the branch target buffer, we use the PSCI version call to
>>>>> invoke it.
>>>>
>>>> What's the status of this patch currently? Previously you had suggested
>>>> to hold while the SMC got standardized, but then you seemed happy with
>>>> pulling in. What's the latest?
>>>
>>> My understanding is that the SMC standardization is being worked on
>>> but will take more time, and the KPTI current patchset will go to
>>> mainline before that.
>>>
>>> Given that, I would expect arm64 maintainers to pick up this patch for
>>> ThunderX2, but I have not seen any comments so far.
>>>
>>> Will/Marc, please let me know if you are planning to pick this patch
>>> into the KPTI tree.
>>
>> Are you really sure you want us to apply this? If we do, then you can't run
>> KVM guests anymore because your IMPDEF SMC results in an UNDEF being
>> injected (crash below).
>>
>> I really think that you should just hook up the enable_psci_bp_hardening
>> callback like we've done for the Cortex CPUs. We can optimise this later
>> once the SMC standarisation work has been completed (which is nearly final
>> now and works in a backwards-compatible manner).
>
> I think Marc's patch here:
> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm64/kpti&id=d35e77fae4b70331310c3bc1796bb43b93f9a85e
> handles returning for undefined smc calls in guest.
>
> I think in this case we have to choose between crashing or giving a false
> sense of security when a guest compiled with HARDEN_BRANCH_PREDICTOR is
> booted on an hypervisor that does not support hardening. Crashing maybe
> a reasonable option.

Crashing is a completely unreasonable option and is totally
unacceptable. We never do this in enterprise, period.

It's reasonable to give an output in dmesg that a system isn't hardened,
but it's not reasonable to crash. On x86, we added a new qemu machine
type for those guests that would have IBRS exposed, and ask users to
switch that on explicitly, but even if they boot the new kernels on
unpatched infrastructure, we'll detect the lack of the branch predictor
control interface and just log that.

The exact same thing should happen on ARM.

Jon.

2018-01-18 23:29:25

by Jayachandran C

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

Hi Jon,

On Thu, Jan 18, 2018 at 01:27:15PM -0500, Jon Masters wrote:
> On 01/18/2018 12:56 PM, Jayachandran C wrote:
> > On Thu, Jan 18, 2018 at 01:53:55PM +0000, Will Deacon wrote:
> >> Hi JC,
> >>
> >> On Tue, Jan 16, 2018 at 03:45:54PM -0800, Jayachandran C wrote:
> >>> On Tue, Jan 16, 2018 at 04:52:53PM -0500, Jon Masters wrote:
> >>>> On 01/09/2018 07:47 AM, Jayachandran C wrote:
> >>>>
> >>>>> Use PSCI based mitigation for speculative execution attacks targeting
> >>>>> the branch predictor. The approach is similar to the one used for
> >>>>> Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
> >>>>> test if the firmware supports the capability.
> >>>>>
> >>>>> If the secure firmware has been updated with the mitigation code to
> >>>>> invalidate the branch target buffer, we use the PSCI version call to
> >>>>> invoke it.
> >>>>
> >>>> What's the status of this patch currently? Previously you had suggested
> >>>> to hold while the SMC got standardized, but then you seemed happy with
> >>>> pulling in. What's the latest?
> >>>
> >>> My understanding is that the SMC standardization is being worked on
> >>> but will take more time, and the KPTI current patchset will go to
> >>> mainline before that.
> >>>
> >>> Given that, I would expect arm64 maintainers to pick up this patch for
> >>> ThunderX2, but I have not seen any comments so far.
> >>>
> >>> Will/Marc, please let me know if you are planning to pick this patch
> >>> into the KPTI tree.
> >>
> >> Are you really sure you want us to apply this? If we do, then you can't run
> >> KVM guests anymore because your IMPDEF SMC results in an UNDEF being
> >> injected (crash below).
> >>
> >> I really think that you should just hook up the enable_psci_bp_hardening
> >> callback like we've done for the Cortex CPUs. We can optimise this later
> >> once the SMC standarisation work has been completed (which is nearly final
> >> now and works in a backwards-compatible manner).
> >
> > I think Marc's patch here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm64/kpti&id=d35e77fae4b70331310c3bc1796bb43b93f9a85e
> > handles returning for undefined smc calls in guest.
> >
> > I think in this case we have to choose between crashing or giving a false
> > sense of security when a guest compiled with HARDEN_BRANCH_PREDICTOR is
> > booted on an hypervisor that does not support hardening. Crashing maybe
> > a reasonable option.
>
> Crashing is a completely unreasonable option and is totally
> unacceptable. We never do this in enterprise, period.
>
> It's reasonable to give an output in dmesg that a system isn't hardened,
> but it's not reasonable to crash. On x86, we added a new qemu machine
> type for those guests that would have IBRS exposed, and ask users to
> switch that on explicitly, but even if they boot the new kernels on
> unpatched infrastructure, we'll detect the lack of the branch predictor
> control interface and just log that.
>
> The exact same thing should happen on ARM.

With the current patchset from ARM, there is no way of detecting whether the
hypervisor is hardened or not in order to provide the warning. The only other
option I have is to call get version blindly and provide a false sense of
security.

Since both options are bad, I don't have a good solution here. If RedHat
has a preference here on what would be better, I can go with that.

JC.

2018-01-19 01:01:42

by Jon Masters

[permalink] [raw]
Subject: Re: [v2,03/11] arm64: Take into account ID_AA64PFR0_EL1.CSV3

On 01/09/2018 05:00 AM, Will Deacon wrote:
> On Mon, Jan 08, 2018 at 08:06:27PM -0800, Jayachandran C wrote:
>> On Mon, Jan 08, 2018 at 05:51:00PM +0000, Will Deacon wrote:
>>> On Mon, Jan 08, 2018 at 09:40:17AM -0800, Jayachandran C wrote:
>>>> On Mon, Jan 08, 2018 at 09:20:09AM +0000, Marc Zyngier wrote:
>>>>> On 08/01/18 07:24, Jayachandran C wrote:
>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>>>> index 19ed09b..202b037 100644
>>>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>>>> @@ -862,6 +862,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>>>>>> return __kpti_forced > 0;
>>>>>> }
>>>>>>
>>>>>> + /* Don't force KPTI for CPUs that are not vulnerable */
>>>>>> + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
>>>>>> + case MIDR_CAVIUM_THUNDERX2:
>>>>>> + case MIDR_BRCM_VULCAN:
>>>>>> + return false;
>>>>>> + }
>>>>>> +
>>>>>> /* Useful for KASLR robustness */
>>>>>> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>>>>>> return true;
>>>>>>
>>>>>
>>>>> KPTI is also an improvement for KASLR. Why would you deprive a user of
>>>>> the choice to further secure their system?
>>>>
>>>> The user has a choice with kpti= at the kernel command line, so we are
>>>> not depriving the user of a choice. KASLR is expected to be enabled by
>>>> distributions, and KPTI will be enabled by default as well.
>>>>
>>>> On systems that are not vulnerable to variant 3, this is an unnecessary
>>>> overhead.
>>>
>>> KASLR can be bypassed on CPUs that are not vulnerable to variant 3 simply
>>> by timing how long accesses to kernel addresses from EL0 take -- please read
>>> the original KAISER paper for details about that attack on x86. kpti
>>> mitigates that. If you don't care about KASLR, don't enable it (arguably
>>> it's useless without kpti).
>>
>> The code above assumes that all ARM CPUs (now and future) will be vulnerable
>> to timing attacks that can bypass KASLR. I don't think that is a correct
>> assumption to make.
>
> Well, the code is assuming that the difference between a TLB hit and a miss
> can be measured and that permission faulting entries can be cached in the
> TLB. I think that's a safe assumption for the moment. You can also disable
> kaslr on the command line and at compile-time if you don't want to use it,
> and the same thing applies to kpti. I really see this more as user
> preference, rather than something that should be keyed off the MIDR and we
> already provide those controls via the command line.
>
> To be clear: I'll take the MIDR whitelisting, but only after the KASLR check
> above.
>
>> If ThunderX2 is shown to be vulnerable to any timing based attack we can
>> certainly move the MIDR check after the check for the CONFIG_RANDOMIZE_BASE.
>> But I don't think that is the case now, if you have any PoC code to check
>> this I can run on the processor and make the change.
>
> I haven't tried, but if you have a TLB worth its salt, I suspect you can
> defeat kaslr by timing prefetches or faulting loads to kernel addresses.
>
>> It is pretty clear that we need a whitelist check either before or after the
>> CONFIG_RANDOMIZE_BASE check.
>
> Please send a patch implementing this after the check.

JC: what's the plan here from Cavium? I didn't see such a patch (but I
might have missed it). Within Red Hat I've asked that we follow the same
logic as on x86 and default to disabling (k)pti on hardware known not
to be vulnerable to that explicit attack. Sure, KASLR bypass is "not
good"(TM) and ton(ne)s of new ways to do that are found all the time,
but the performance hit is non-zero and there is a difference between
breaking randomization and leaking cache data. HPC folks are among
those who are going to come asking why they need to turn off PTI all
over the place. The x86 equivalent would be that one vendor always has
PTI enabled, while the other enables it only if the user explicitly
requests it at boot time.

Jon.

2018-01-19 01:18:34

by Jon Masters

[permalink] [raw]
Subject: Re: [PATCH v2] arm64: Branch predictor hardening for Cavium ThunderX2

Hi JC, Will,

On 01/18/2018 06:28 PM, Jayachandran C wrote:
> On Thu, Jan 18, 2018 at 01:27:15PM -0500, Jon Masters wrote:
>> On 01/18/2018 12:56 PM, Jayachandran C wrote:
>>> On Thu, Jan 18, 2018 at 01:53:55PM +0000, Will Deacon wrote:

>>> I think in this case we have to choose between crashing or giving a false
>>> sense of security when a guest compiled with HARDEN_BRANCH_PREDICTOR is
>>> booted on an hypervisor that does not support hardening. Crashing maybe
>>> a reasonable option.
>>
>> Crashing is a completely unreasonable option and is totally
>> unacceptable. We never do this in enterprise, period.
>>
>> It's reasonable to give an output in dmesg that a system isn't hardened,
>> but it's not reasonable to crash. On x86, we added a new qemu machine
>> type for those guests that would have IBRS exposed, and ask users to
>> switch that on explicitly, but even if they boot the new kernels on
>> unpatched infrastructure, we'll detect the lack of the branch predictor
>> control interface and just log that.
>>
>> The exact same thing should happen on ARM.
>
> With the current patchset from ARM, there is no way of detecting if the
> hypervisor is hardened or not, to provide the warning. The only other
> option I have call get version blindly and provide a false sense of
> security.

Agreed that, unless I'm missing something, the current arm patchset
doesn't have an enumeration mechanism to see whether firmware supports
branch predictor hardening or not. Am I missing something there?

On the three other affected arches we're tracking, there's an
enumeration mechanism. On x86, there's a new set of CPUID bits. On
POWER, there's a new hcall that tells us whether the millicode supports
what we need, and on z there's a new facility code we can test for that
is also passed into VMs. So we need to have a similar enumeration
mechanism on ARM that is passed into guests as well.
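
To make that concrete, a rough sketch of what I have in mind, assuming
the SMCCC 1.1 style ARCH_FEATURES discovery that is being standardised
(the function IDs below are illustrative, taken from the draft, and are
not defined by the current patchset):

#include <linux/arm-smccc.h>

/* Illustrative IDs from the draft SMCCC 1.1 spec -- not in mainline yet */
#define SMCCC_ARCH_FEATURES_ID          0x80000001
#define SMCCC_ARCH_WORKAROUND_1_ID      0x80008000

/*
 * Ask the firmware/hypervisor whether the BP invalidation call is
 * implemented. A negative a0 means "not supported". Note that on
 * pre-1.1 firmware, or an unpatched KVM host, even this probe may not
 * behave as expected, which is part of the problem being discussed in
 * this thread.
 */
static bool bp_hardening_advertised(void)
{
    struct arm_smccc_res res;

    arm_smccc_smc(SMCCC_ARCH_FEATURES_ID, SMCCC_ARCH_WORKAROUND_1_ID,
                  0, 0, 0, 0, 0, 0, &res);
    return (long)res.a0 >= 0;
}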

> Since both options are bad, I don't have a good solution here. If RedHat
> has a preference here on what would be better, I can go with that.

We need an enumeration mechanism that determines whether the hypervisor
is patched. In the absence of that, blindly calling in and hoping that
the firmware is updated is better than nothing. I'll look to see if
there's a generic upstream solution for enumeration that I've missed (or
that can be added, perhaps a new SMC enumeration mechanism). If there
isn't a short term fix, we'll work with you guys directly to add
something RHEL specific by checking some firmware version somehow.

Jon.


2018-01-19 03:43:04

by Li Kun

[permalink] [raw]
Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

Hi Will,


On 2018/1/17 18:07, Will Deacon wrote:
> On Wed, Jan 17, 2018 at 12:10:33PM +0800, Yisheng Xie wrote:
>> Hi Will,
>>
>> On 2018/1/5 21:12, Will Deacon wrote:
>>> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
>>> index 5f7097d0cd12..d99b36555a16 100644
>>> --- a/arch/arm64/mm/context.c
>>> +++ b/arch/arm64/mm/context.c
>>> @@ -246,6 +246,8 @@ asmlinkage void post_ttbr_update_workaround(void)
>>> "ic iallu; dsb nsh; isb",
>>> ARM64_WORKAROUND_CAVIUM_27456,
>>> CONFIG_CAVIUM_ERRATUM_27456));
>>> +
>>> + arm64_apply_bp_hardening();
>>> }
>> post_ttbr_update_workaround was used for fix Cavium erratum 2745? so does that
>> means, if we do not have this erratum, we do not need arm64_apply_bp_hardening()?
>> when mm_swtich and kernel_exit?
>>
>> From the code logical, it seems not only related to erratum 2745 anymore?
>> should it be renamed?
> post_ttbr_update_workaround just runs code after a TTBR update, which
> includes mitigations against variant 2 of "spectre" and also a workaround
> for a Cavium erratum. These are separate issues.
But AFAIU, according to the theory of spectre, we don't need to clear
the BTB every time we return to user space?
If we enable CONFIG_ARM64_SW_TTBR0_PAN, there will be a call to
arm64_apply_bp_hardening every time the kernel exits to el0:
kernel_exit
  post_ttbr_update_workaround
    arm64_apply_bp_hardening
>
> Will
>

--
Best Regards
Li Kun


2018-01-19 12:23:54

by Jayachandran C

[permalink] [raw]
Subject: [PATCH v3 1/2] arm64: Branch predictor hardening for Cavium ThunderX2

Use PSCI based mitigation for speculative execution attacks targeting
the branch predictor. We use the same mechanism as the one used for
Cortex-A CPUs, we expect the PSCI version call to have a side effect
of clearing the BTBs.

Signed-off-by: Jayachandran C <[email protected]>
---
arch/arm64/kernel/cpu_errata.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 70e5f18..45ff9a2 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -338,6 +338,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
.capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
},
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
+ .enable = enable_psci_bp_hardening,
+ },
+ {
+ .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+ MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
+ .enable = enable_psci_bp_hardening,
+ },
#endif
{
}
--
2.7.4


2018-01-19 12:25:18

by Jayachandran C

[permalink] [raw]
Subject: [PATCH v3 2/2] arm64: Turn on KPTI only on CPUs that need it

Whitelist Broadcom Vulcan/Cavium ThunderX2 processors in
unmap_kernel_at_el0(). These CPUs are not vulnerable to
CVE-2017-5754 and do not need KPTI when KASLR is off.

Signed-off-by: Jayachandran C <[email protected]>
---
arch/arm64/kernel/cpufeature.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 647d44b..fb698ca 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -866,6 +866,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
return true;

+ /* Don't force KPTI for CPUs that are not vulnerable */
+ switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
+ case MIDR_CAVIUM_THUNDERX2:
+ case MIDR_BRCM_VULCAN:
+ return false;
+ }
+
/* Defer to CPU feature registers */
return !cpuid_feature_extract_unsigned_field(pfr0,
ID_AA64PFR0_CSV3_SHIFT);
--
2.7.4


2018-01-19 14:30:46

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

On Fri, Jan 19, 2018 at 11:37:24AM +0800, Li Kun wrote:
> On 2018/1/17 18:07, Will Deacon wrote:
> >On Wed, Jan 17, 2018 at 12:10:33PM +0800, Yisheng Xie wrote:
> >>On 2018/1/5 21:12, Will Deacon wrote:
> >>>diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> >>>index 5f7097d0cd12..d99b36555a16 100644
> >>>--- a/arch/arm64/mm/context.c
> >>>+++ b/arch/arm64/mm/context.c
> >>>@@ -246,6 +246,8 @@ asmlinkage void post_ttbr_update_workaround(void)
> >>> "ic iallu; dsb nsh; isb",
> >>> ARM64_WORKAROUND_CAVIUM_27456,
> >>> CONFIG_CAVIUM_ERRATUM_27456));
> >>>+
> >>>+ arm64_apply_bp_hardening();
> >>> }
> >>post_ttbr_update_workaround was used for fix Cavium erratum 2745? so does that
> >>means, if we do not have this erratum, we do not need arm64_apply_bp_hardening()?
> >>when mm_swtich and kernel_exit?
> >>
> >> From the code logical, it seems not only related to erratum 2745 anymore?
> >>should it be renamed?
> >post_ttbr_update_workaround just runs code after a TTBR update, which
> >includes mitigations against variant 2 of "spectre" and also a workaround
> >for a Cavium erratum. These are separate issues.
> But AFAIU, according to the theory of spectre, we don't need to clear the
> BTB every time we return to user?
> If we enable CONFIG_ARM64_SW_TTBR0_PAN, there will be a call to
> arm64_apply_bp_hardening every time kernel exit to el0.
> kernel_exit
> post_ttbr_update_workaround
> arm64_apply_bp_hardening

That's a really good point, thanks. What it means is that
post_ttbr_update_workaround is actually the wrong place for this, and we
should be doing it more directly on the switch_mm path -- probably in
check_and_switch_context.
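
Something along these lines (sketch of arch/arm64/mm/context.c only;
the existing ASID handling stays where it is):

void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
{
    /*
     * Invalidate the branch predictor state when we are about to run a
     * new user mm, instead of doing it from post_ttbr_update_workaround()
     * (which SW PAN also hits on every kernel exit).
     */
    arm64_apply_bp_hardening();

    /* ... existing ASID check/rollover and the eventual TTBR switch ... */
}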

Will

2018-01-19 19:10:03

by Jon Masters

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] arm64: Branch predictor hardening for Cavium ThunderX2

On 01/19/2018 07:22 AM, Jayachandran C wrote:
> Use PSCI based mitigation for speculative execution attacks targeting
> the branch predictor. We use the same mechanism as the one used for
> Cortex-A CPUs, we expect the PSCI version call to have a side effect
> of clearing the BTBs.
>
> Signed-off-by: Jayachandran C <[email protected]>
> ---
> arch/arm64/kernel/cpu_errata.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 70e5f18..45ff9a2 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -338,6 +338,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> .capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
> MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
> },
> + {
> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> + MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
> + .enable = enable_psci_bp_hardening,
> + },
> + {
> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> + MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
> + .enable = enable_psci_bp_hardening,
> + },
> #endif
> {
> }
>

Both of these patches seem reasonable to me.


2018-01-22 06:53:08

by Li Kun

[permalink] [raw]
Subject: Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks



On 2018/1/19 22:28, Will Deacon wrote:
> On Fri, Jan 19, 2018 at 11:37:24AM +0800, Li Kun wrote:
>> On 2018/1/17 18:07, Will Deacon wrote:
>>> On Wed, Jan 17, 2018 at 12:10:33PM +0800, Yisheng Xie wrote:
>>>> On 2018/1/5 21:12, Will Deacon wrote:
>>>>> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
>>>>> index 5f7097d0cd12..d99b36555a16 100644
>>>>> --- a/arch/arm64/mm/context.c
>>>>> +++ b/arch/arm64/mm/context.c
>>>>> @@ -246,6 +246,8 @@ asmlinkage void post_ttbr_update_workaround(void)
>>>>> "ic iallu; dsb nsh; isb",
>>>>> ARM64_WORKAROUND_CAVIUM_27456,
>>>>> CONFIG_CAVIUM_ERRATUM_27456));
>>>>> +
>>>>> + arm64_apply_bp_hardening();
>>>>> }
>>>> post_ttbr_update_workaround was used for fix Cavium erratum 2745? so does that
>>>> means, if we do not have this erratum, we do not need arm64_apply_bp_hardening()?
>>>> when mm_swtich and kernel_exit?
>>>>
>>>> From the code logical, it seems not only related to erratum 2745 anymore?
>>>> should it be renamed?
>>> post_ttbr_update_workaround just runs code after a TTBR update, which
>>> includes mitigations against variant 2 of "spectre" and also a workaround
>>> for a Cavium erratum. These are separate issues.
>> But AFAIU, according to the theory of spectre, we don't need to clear the
>> BTB every time we return to user?
>> If we enable CONFIG_ARM64_SW_TTBR0_PAN, there will be a call to
>> arm64_apply_bp_hardening every time kernel exit to el0.
>> kernel_exit
>> post_ttbr_update_workaround
>> arm64_apply_bp_hardening
> That's a really good point, thanks. What it means is that
> post_ttbr_update_workaround is actually the wrong place for this, and we
> should be doing it more directly on the switch_mm path -- probably in
> check_and_switch_context.
Yes, that's exactly what I mean. :-)
>
> Will

--
Best Regards
Li Kun


2018-01-22 11:33:58

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] arm64: Branch predictor hardening for Cavium ThunderX2

On Fri, Jan 19, 2018 at 04:22:47AM -0800, Jayachandran C wrote:
> Use PSCI based mitigation for speculative execution attacks targeting
> the branch predictor. We use the same mechanism as the one used for
> Cortex-A CPUs, we expect the PSCI version call to have a side effect
> of clearing the BTBs.
>
> Signed-off-by: Jayachandran C <[email protected]>
> ---
> arch/arm64/kernel/cpu_errata.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 70e5f18..45ff9a2 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -338,6 +338,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> .capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
> MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
> },
> + {
> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> + MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
> + .enable = enable_psci_bp_hardening,
> + },
> + {
> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> + MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
> + .enable = enable_psci_bp_hardening,
> + },
> #endif

Thanks.

Acked-by: Will Deacon <[email protected]>

Will

2018-01-22 11:42:58

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] arm64: Turn on KPTI only on CPUs that need it

On Fri, Jan 19, 2018 at 04:22:48AM -0800, Jayachandran C wrote:
> Whitelist Broadcom Vulcan/Cavium ThunderX2 processors in
> unmap_kernel_at_el0(). These CPUs are not vulnerable to
> CVE-2017-5754 and do not need KPTI when KASLR is off.
>
> Signed-off-by: Jayachandran C <[email protected]>
> ---
> arch/arm64/kernel/cpufeature.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 647d44b..fb698ca 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -866,6 +866,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> return true;
>
> + /* Don't force KPTI for CPUs that are not vulnerable */
> + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> + case MIDR_CAVIUM_THUNDERX2:
> + case MIDR_BRCM_VULCAN:
> + return false;
> + }
> +
> /* Defer to CPU feature registers */
> return !cpuid_feature_extract_unsigned_field(pfr0,
> ID_AA64PFR0_CSV3_SHIFT);

We'll need to re-jig this to work properly with big/little because this is
only called once, but that's ok for now:

Acked-by: Will Deacon <[email protected]>

Suzuki has a series reworking much of the cpufeatures code so that we can
do this properly for 4.17.

Will

2018-01-22 11:52:49

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] arm64: Turn on KPTI only on CPUs that need it

On 22 January 2018 at 11:41, Will Deacon <[email protected]> wrote:
> On Fri, Jan 19, 2018 at 04:22:48AM -0800, Jayachandran C wrote:
>> Whitelist Broadcom Vulcan/Cavium ThunderX2 processors in
>> unmap_kernel_at_el0(). These CPUs are not vulnerable to
>> CVE-2017-5754 and do not need KPTI when KASLR is off.
>>
>> Signed-off-by: Jayachandran C <[email protected]>
>> ---
>> arch/arm64/kernel/cpufeature.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 647d44b..fb698ca 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -866,6 +866,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>> return true;
>>
>> + /* Don't force KPTI for CPUs that are not vulnerable */
>> + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
>> + case MIDR_CAVIUM_THUNDERX2:
>> + case MIDR_BRCM_VULCAN:
>> + return false;
>> + }
>> +
>> /* Defer to CPU feature registers */
>> return !cpuid_feature_extract_unsigned_field(pfr0,
>> ID_AA64PFR0_CSV3_SHIFT);
>
> We'll need to re-jig this to work properly with big/little because this is
> only called once, but that's ok for now:
>
> Acked-by: Will Deacon <[email protected]>
>
> Suzuki has a series reworking much of the cpufeatures code so that we can
> do this properly for 4.17.
>

If we start adding opt outs here, we should at least include A53, and
probably replace

>> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>> return true;

with

>> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0)
>> return true;

so that adding 'nokaslr' to the command line also disables KPTI.

2018-01-22 11:56:59

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] arm64: Turn on KPTI only on CPUs that need it

On Mon, Jan 22, 2018 at 11:51:34AM +0000, Ard Biesheuvel wrote:
> On 22 January 2018 at 11:41, Will Deacon <[email protected]> wrote:
> > On Fri, Jan 19, 2018 at 04:22:48AM -0800, Jayachandran C wrote:
> >> Whitelist Broadcom Vulcan/Cavium ThunderX2 processors in
> >> unmap_kernel_at_el0(). These CPUs are not vulnerable to
> >> CVE-2017-5754 and do not need KPTI when KASLR is off.
> >>
> >> Signed-off-by: Jayachandran C <[email protected]>
> >> ---
> >> arch/arm64/kernel/cpufeature.c | 7 +++++++
> >> 1 file changed, 7 insertions(+)
> >>
> >> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >> index 647d44b..fb698ca 100644
> >> --- a/arch/arm64/kernel/cpufeature.c
> >> +++ b/arch/arm64/kernel/cpufeature.c
> >> @@ -866,6 +866,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> >> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> >> return true;
> >>
> >> + /* Don't force KPTI for CPUs that are not vulnerable */
> >> + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
> >> + case MIDR_CAVIUM_THUNDERX2:
> >> + case MIDR_BRCM_VULCAN:
> >> + return false;
> >> + }
> >> +
> >> /* Defer to CPU feature registers */
> >> return !cpuid_feature_extract_unsigned_field(pfr0,
> >> ID_AA64PFR0_CSV3_SHIFT);
> >
> > We'll need to re-jig this to work properly with big/little because this is
> > only called once, but that's ok for now:
> >
> > Acked-by: Will Deacon <[email protected]>
> >
> > Suzuki has a series reworking much of the cpufeatures code so that we can
> > do this properly for 4.17.
> >
>
> If we start adding opt outs here, we should at least include A53, and
> probably replace
>
> >> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> >> return true;
>
> with
>
> >> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0)
> >> return true;
>
> so that adding 'nokaslr' to the command line also disables KPTI.

Yup, I was going to do this once we have the new cpufeatures code from
Suzuki and can safely whitelist cores that can appear in big/little
configurations.

Will

2018-01-22 19:00:02

by Jon Masters

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] arm64: Turn on KPTI only on CPUs that need it

On 01/22/2018 06:41 AM, Will Deacon wrote:
> On Fri, Jan 19, 2018 at 04:22:48AM -0800, Jayachandran C wrote:
>> Whitelist Broadcom Vulcan/Cavium ThunderX2 processors in
>> unmap_kernel_at_el0(). These CPUs are not vulnerable to
>> CVE-2017-5754 and do not need KPTI when KASLR is off.
>>
>> Signed-off-by: Jayachandran C <[email protected]>
>> ---
>> arch/arm64/kernel/cpufeature.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 647d44b..fb698ca 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -866,6 +866,13 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>> if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>> return true;
>>
>> + /* Don't force KPTI for CPUs that are not vulnerable */
>> + switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
>> + case MIDR_CAVIUM_THUNDERX2:
>> + case MIDR_BRCM_VULCAN:
>> + return false;
>> + }
>> +
>> /* Defer to CPU feature registers */
>> return !cpuid_feature_extract_unsigned_field(pfr0,
>> ID_AA64PFR0_CSV3_SHIFT);
>
> We'll need to re-jig this to work properly with big/little because this is
> only called once, but that's ok for now:
>
> Acked-by: Will Deacon <[email protected]>
>
> Suzuki has a series reworking much of the cpufeatures code so that we can
> do this properly for 4.17.

Thanks, much appreciated.

Jon.


2018-01-22 19:01:47

by Jon Masters

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] arm64: Branch predictor hardening for Cavium ThunderX2

On 01/22/2018 06:33 AM, Will Deacon wrote:
> On Fri, Jan 19, 2018 at 04:22:47AM -0800, Jayachandran C wrote:
>> Use PSCI-based mitigation for speculative execution attacks targeting
>> the branch predictor. We use the same mechanism as the one used for
>> Cortex-A CPUs; we expect the PSCI version call to have a side effect
>> of clearing the BTBs.
>>
>> Signed-off-by: Jayachandran C <[email protected]>
>> ---
>> arch/arm64/kernel/cpu_errata.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
>> index 70e5f18..45ff9a2 100644
>> --- a/arch/arm64/kernel/cpu_errata.c
>> +++ b/arch/arm64/kernel/cpu_errata.c
>> @@ -338,6 +338,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>> .capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
>> MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
>> },
>> + {
>> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
>> + MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
>> + .enable = enable_psci_bp_hardening,
>> + },
>> + {
>> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
>> + MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
>> + .enable = enable_psci_bp_hardening,
>> + },
>> #endif
>
> Thanks.
>
> Acked-by: Will Deacon <[email protected]>

Thanks. I have separately asked for a specification tweak to allow us to
discover whether firmware has been augmented to provide the necessary
support that we need. That applies beyond Cavium.

(for now in RHEL, we've asked the vendors for a temporary patch so that
we can match on DMI or other data later in boot and warn users)

Jon.
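
For anyone following the series, the .enable hook that these new entries
point at is the same one used for the Cortex-A parts: it installs the PSCI
version call as the per-CPU branch-predictor invalidation callback. A
sketch, based on the earlier branch predictor hardening patch in the series
(exact symbol names may differ from what finally lands):

static int enable_psci_bp_hardening(void *data)
{
	const struct arm64_cpu_capabilities *entry = data;

	/* Only useful if firmware is actually reachable via PSCI */
	if (psci_ops.get_version)
		install_bp_hardening_cb(entry,
				(bp_hardening_cb_t)psci_ops.get_version,
				__psci_hyp_bp_inval_start,
				__psci_hyp_bp_inval_end);

	return 0;
}

In other words, the MIDR entries above only select which CPUs get the
callback; the invalidation itself still relies on the firmware's
PSCI_VERSION implementation flushing the branch predictor as a side effect,
which is exactly what the specification tweak aims to make discoverable.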


2018-01-23 09:51:32

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] arm64: Branch predictor hardening for Cavium ThunderX2

On Mon, Jan 22, 2018 at 02:00:59PM -0500, Jon Masters wrote:
> On 01/22/2018 06:33 AM, Will Deacon wrote:
> > On Fri, Jan 19, 2018 at 04:22:47AM -0800, Jayachandran C wrote:
> >> Use PSCI-based mitigation for speculative execution attacks targeting
> >> the branch predictor. We use the same mechanism as the one used for
> >> Cortex-A CPUs; we expect the PSCI version call to have a side effect
> >> of clearing the BTBs.
> >>
> >> Signed-off-by: Jayachandran C <[email protected]>
> >> ---
> >> arch/arm64/kernel/cpu_errata.c | 10 ++++++++++
> >> 1 file changed, 10 insertions(+)
> >>
> >> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> >> index 70e5f18..45ff9a2 100644
> >> --- a/arch/arm64/kernel/cpu_errata.c
> >> +++ b/arch/arm64/kernel/cpu_errata.c
> >> @@ -338,6 +338,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> >> .capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
> >> MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
> >> },
> >> + {
> >> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> >> + MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
> >> + .enable = enable_psci_bp_hardening,
> >> + },
> >> + {
> >> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> >> + MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
> >> + .enable = enable_psci_bp_hardening,
> >> + },
> >> #endif
> >
> > Thanks.
> >
> > Acked-by: Will Deacon <[email protected]>
>
> Thanks. I have separately asked for a specification tweak to allow us to
> discover whether firmware has been augmented to provide the necessary
> support that we need. That applies beyond Cavium.

AFAIK, there's already an SMCCC/PSCI proposal doing the rounds that is
discoverable and does what we need. Have you seen it? We should be posting
code this week.

Will
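
For context, the proposal referred to here is what eventually became the
SMCCC v1.1 ARCH_WORKAROUND_1 interface, where firmware advertises the
branch-predictor invalidation call through ARCH_FEATURES so the kernel can
probe for it rather than relying purely on MIDR lists. A rough sketch of
such a probe, assuming the SMCCC v1.1 helpers that later landed in
<linux/arm-smccc.h> (not code from this thread):

#include <linux/arm-smccc.h>
#include <linux/psci.h>

static bool have_smccc_arch_workaround_1(void)
{
	struct arm_smccc_res res;

	/* ARCH_FEATURES discovery needs SMCCC v1.1 (and hence PSCI 1.0+) */
	if (psci_ops.smccc_version < SMCCC_VERSION_1_1)
		return false;

	switch (psci_ops.conduit) {
	case PSCI_CONDUIT_HVC:
		arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
		break;
	case PSCI_CONDUIT_SMC:
		arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
				  ARM_SMCCC_ARCH_WORKAROUND_1, &res);
		break;
	default:
		return false;
	}

	/* A negative value means the workaround call is not implemented */
	return (int)res.a0 >= 0;
}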