Hi everybody,
This is version three of the patches formerly known as KAISER (♔).
v1: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-November/542751.html
v2: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-November/544817.html
Changes since v2 include:
* Rename command-line option from "kaiser=" to "kpti=" for parity with x86
* Fixed Falkor erratum workaround (missing '~')
* Moved vectors base from literal pool into separate data page
* Added TTBR_ASID_MASK instead of open-coded constants
* Added missing newline to error message
* Fail to probe SPE if KPTI is enabled
* Addressed minor review comments
* Added tags
* Based on -rc2
Patches are also pushed here:
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
Feedback and testing welcome. At this point, I'd like to start thinking
about getting this merged for 4.16.
Cheers,
Will
--->8
Will Deacon (20):
arm64: mm: Use non-global mappings for kernel space
arm64: mm: Temporarily disable ARM64_SW_TTBR0_PAN
arm64: mm: Move ASID from TTBR0 to TTBR1
arm64: mm: Remove pre_ttbr0_update_workaround for Falkor erratum
#E1003
arm64: mm: Rename post_ttbr0_update_workaround
arm64: mm: Fix and re-enable ARM64_SW_TTBR0_PAN
arm64: mm: Allocate ASIDs in pairs
arm64: mm: Add arm64_kernel_unmapped_at_el0 helper
arm64: mm: Invalidate both kernel and user ASIDs when performing TLBI
arm64: entry: Add exception trampoline page for exceptions from EL0
arm64: mm: Map entry trampoline into trampoline and kernel page tables
arm64: entry: Explicitly pass exception level to kernel_ventry macro
arm64: entry: Hook up entry trampoline to exception vectors
arm64: erratum: Work around Falkor erratum #E1003 in trampoline code
arm64: tls: Avoid unconditional zeroing of tpidrro_el0 for native
tasks
arm64: entry: Add fake CPU feature for unmapping the kernel at EL0
arm64: Kconfig: Add CONFIG_UNMAP_KERNEL_AT_EL0
perf: arm_spe: Fail device probe when arm64_kernel_unmapped_at_el0()
arm64: mm: Introduce TTBR_ASID_MASK for getting at the ASID in the
TTBR
arm64: kaslr: Put kernel vectors address in separate data page
arch/arm64/Kconfig | 30 +++--
arch/arm64/include/asm/asm-uaccess.h | 26 ++--
arch/arm64/include/asm/assembler.h | 27 +----
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/fixmap.h | 5 +
arch/arm64/include/asm/kernel-pgtable.h | 12 +-
arch/arm64/include/asm/mmu.h | 11 ++
arch/arm64/include/asm/mmu_context.h | 9 +-
arch/arm64/include/asm/pgtable-hwdef.h | 1 +
arch/arm64/include/asm/pgtable-prot.h | 21 +++-
arch/arm64/include/asm/pgtable.h | 1 +
arch/arm64/include/asm/proc-fns.h | 6 -
arch/arm64/include/asm/tlbflush.h | 16 ++-
arch/arm64/include/asm/uaccess.h | 21 +++-
arch/arm64/kernel/asm-offsets.c | 6 +-
arch/arm64/kernel/cpufeature.c | 41 +++++++
arch/arm64/kernel/entry.S | 203 +++++++++++++++++++++++++++-----
arch/arm64/kernel/process.c | 12 +-
arch/arm64/kernel/vmlinux.lds.S | 40 ++++++-
arch/arm64/lib/clear_user.S | 2 +-
arch/arm64/lib/copy_from_user.S | 2 +-
arch/arm64/lib/copy_in_user.S | 2 +-
arch/arm64/lib/copy_to_user.S | 2 +-
arch/arm64/mm/cache.S | 2 +-
arch/arm64/mm/context.c | 36 +++---
arch/arm64/mm/mmu.c | 31 +++++
arch/arm64/mm/proc.S | 12 +-
arch/arm64/xen/hypercall.S | 2 +-
drivers/perf/arm_spe_pmu.c | 9 ++
29 files changed, 454 insertions(+), 137 deletions(-)
--
2.1.4
We're about to rework the way ASIDs are allocated, switch_mm is
implemented and low-level kernel entry/exit is handled, so keep the
ARM64_SW_TTBR0_PAN code out of the way whilst we do the heavy lifting.
It will be re-enabled in a subsequent patch.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a93339f5178f..7e7d7fd152c4 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -910,6 +910,7 @@ endif
config ARM64_SW_TTBR0_PAN
bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
+ depends on BROKEN # Temporary while switch_mm is reworked
help
Enabling this option prevents the kernel from accessing
user-space memory directly by pointing TTBR0_EL1 to a reserved
--
2.1.4
With the ASID now installed in TTBR1, we can re-enable ARM64_SW_TTBR0_PAN
by ensuring that we switch to a reserved ASID of zero when disabling
user access and restore the active user ASID on the uaccess enable path.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 1 -
arch/arm64/include/asm/asm-uaccess.h | 25 +++++++++++++++++--------
arch/arm64/include/asm/uaccess.h | 21 +++++++++++++++++----
arch/arm64/kernel/entry.S | 4 ++--
arch/arm64/lib/clear_user.S | 2 +-
arch/arm64/lib/copy_from_user.S | 2 +-
arch/arm64/lib/copy_in_user.S | 2 +-
arch/arm64/lib/copy_to_user.S | 2 +-
arch/arm64/mm/cache.S | 2 +-
arch/arm64/xen/hypercall.S | 2 +-
10 files changed, 42 insertions(+), 21 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7e7d7fd152c4..a93339f5178f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -910,7 +910,6 @@ endif
config ARM64_SW_TTBR0_PAN
bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
- depends on BROKEN # Temporary while switch_mm is reworked
help
Enabling this option prevents the kernel from accessing
user-space memory directly by pointing TTBR0_EL1 to a reserved
diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
index b3da6c886835..21b8cf304028 100644
--- a/arch/arm64/include/asm/asm-uaccess.h
+++ b/arch/arm64/include/asm/asm-uaccess.h
@@ -16,11 +16,20 @@
add \tmp1, \tmp1, #SWAPPER_DIR_SIZE // reserved_ttbr0 at the end of swapper_pg_dir
msr ttbr0_el1, \tmp1 // set reserved TTBR0_EL1
isb
+ sub \tmp1, \tmp1, #SWAPPER_DIR_SIZE
+ bic \tmp1, \tmp1, #(0xffff << 48)
+ msr ttbr1_el1, \tmp1 // set reserved ASID
+ isb
.endm
- .macro __uaccess_ttbr0_enable, tmp1
+ .macro __uaccess_ttbr0_enable, tmp1, tmp2
get_thread_info \tmp1
ldr \tmp1, [\tmp1, #TSK_TI_TTBR0] // load saved TTBR0_EL1
+ mrs \tmp2, ttbr1_el1
+ extr \tmp2, \tmp2, \tmp1, #48
+ ror \tmp2, \tmp2, #16
+ msr ttbr1_el1, \tmp2 // set the active ASID
+ isb
msr ttbr0_el1, \tmp1 // set the non-PAN TTBR0_EL1
isb
.endm
@@ -31,18 +40,18 @@ alternative_if_not ARM64_HAS_PAN
alternative_else_nop_endif
.endm
- .macro uaccess_ttbr0_enable, tmp1, tmp2
+ .macro uaccess_ttbr0_enable, tmp1, tmp2, tmp3
alternative_if_not ARM64_HAS_PAN
- save_and_disable_irq \tmp2 // avoid preemption
- __uaccess_ttbr0_enable \tmp1
- restore_irq \tmp2
+ save_and_disable_irq \tmp3 // avoid preemption
+ __uaccess_ttbr0_enable \tmp1, \tmp2
+ restore_irq \tmp3
alternative_else_nop_endif
.endm
#else
.macro uaccess_ttbr0_disable, tmp1
.endm
- .macro uaccess_ttbr0_enable, tmp1, tmp2
+ .macro uaccess_ttbr0_enable, tmp1, tmp2, tmp3
.endm
#endif
@@ -56,8 +65,8 @@ alternative_if ARM64_ALT_PAN_NOT_UAO
alternative_else_nop_endif
.endm
- .macro uaccess_enable_not_uao, tmp1, tmp2
- uaccess_ttbr0_enable \tmp1, \tmp2
+ .macro uaccess_enable_not_uao, tmp1, tmp2, tmp3
+ uaccess_ttbr0_enable \tmp1, \tmp2, \tmp3
alternative_if ARM64_ALT_PAN_NOT_UAO
SET_PSTATE_PAN(0)
alternative_else_nop_endif
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index fc0f9eb66039..750a3b76a01c 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -107,15 +107,19 @@ static inline void __uaccess_ttbr0_disable(void)
{
unsigned long ttbr;
+ ttbr = read_sysreg(ttbr1_el1);
/* reserved_ttbr0 placed at the end of swapper_pg_dir */
- ttbr = read_sysreg(ttbr1_el1) + SWAPPER_DIR_SIZE;
- write_sysreg(ttbr, ttbr0_el1);
+ write_sysreg(ttbr + SWAPPER_DIR_SIZE, ttbr0_el1);
+ isb();
+ /* Set reserved ASID */
+ ttbr &= ~(0xffffUL << 48);
+ write_sysreg(ttbr, ttbr1_el1);
isb();
}
static inline void __uaccess_ttbr0_enable(void)
{
- unsigned long flags;
+ unsigned long flags, ttbr0, ttbr1;
/*
* Disable interrupts to avoid preemption between reading the 'ttbr0'
@@ -123,7 +127,16 @@ static inline void __uaccess_ttbr0_enable(void)
* roll-over and an update of 'ttbr0'.
*/
local_irq_save(flags);
- write_sysreg(current_thread_info()->ttbr0, ttbr0_el1);
+ ttbr0 = current_thread_info()->ttbr0;
+
+ /* Restore active ASID */
+ ttbr1 = read_sysreg(ttbr1_el1);
+ ttbr1 |= ttbr0 & (0xffffUL << 48);
+ write_sysreg(ttbr1, ttbr1_el1);
+ isb();
+
+ /* Restore user page table */
+ write_sysreg(ttbr0, ttbr0_el1);
isb();
local_irq_restore(flags);
}
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 804e43c9cb0b..d454d8ed45e4 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -184,7 +184,7 @@ alternative_if ARM64_HAS_PAN
alternative_else_nop_endif
.if \el != 0
- mrs x21, ttbr0_el1
+ mrs x21, ttbr1_el1
tst x21, #0xffff << 48 // Check for the reserved ASID
orr x23, x23, #PSR_PAN_BIT // Set the emulated PAN in the saved SPSR
b.eq 1f // TTBR0 access already disabled
@@ -248,7 +248,7 @@ alternative_else_nop_endif
tbnz x22, #22, 1f // Skip re-enabling TTBR0 access if the PSR_PAN_BIT is set
.endif
- __uaccess_ttbr0_enable x0
+ __uaccess_ttbr0_enable x0, x1
.if \el == 0
/*
diff --git a/arch/arm64/lib/clear_user.S b/arch/arm64/lib/clear_user.S
index e88fb99c1561..8f9c4641e706 100644
--- a/arch/arm64/lib/clear_user.S
+++ b/arch/arm64/lib/clear_user.S
@@ -30,7 +30,7 @@
* Alignment fixed up by hardware.
*/
ENTRY(__clear_user)
- uaccess_enable_not_uao x2, x3
+ uaccess_enable_not_uao x2, x3, x4
mov x2, x1 // save the size for fixup return
subs x1, x1, #8
b.mi 2f
diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
index 4b5d826895ff..69d86a80f3e2 100644
--- a/arch/arm64/lib/copy_from_user.S
+++ b/arch/arm64/lib/copy_from_user.S
@@ -64,7 +64,7 @@
end .req x5
ENTRY(__arch_copy_from_user)
- uaccess_enable_not_uao x3, x4
+ uaccess_enable_not_uao x3, x4, x5
add end, x0, x2
#include "copy_template.S"
uaccess_disable_not_uao x3
diff --git a/arch/arm64/lib/copy_in_user.S b/arch/arm64/lib/copy_in_user.S
index b24a830419ad..e442b531252a 100644
--- a/arch/arm64/lib/copy_in_user.S
+++ b/arch/arm64/lib/copy_in_user.S
@@ -65,7 +65,7 @@
end .req x5
ENTRY(raw_copy_in_user)
- uaccess_enable_not_uao x3, x4
+ uaccess_enable_not_uao x3, x4, x5
add end, x0, x2
#include "copy_template.S"
uaccess_disable_not_uao x3
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index 351f0766f7a6..318f15d5c336 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -63,7 +63,7 @@
end .req x5
ENTRY(__arch_copy_to_user)
- uaccess_enable_not_uao x3, x4
+ uaccess_enable_not_uao x3, x4, x5
add end, x0, x2
#include "copy_template.S"
uaccess_disable_not_uao x3
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 7f1dbe962cf5..6cd20a8c0952 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -49,7 +49,7 @@ ENTRY(flush_icache_range)
* - end - virtual end address of region
*/
ENTRY(__flush_cache_user_range)
- uaccess_ttbr0_enable x2, x3
+ uaccess_ttbr0_enable x2, x3, x4
dcache_line_size x2, x3
sub x3, x2, #1
bic x4, x0, x3
diff --git a/arch/arm64/xen/hypercall.S b/arch/arm64/xen/hypercall.S
index 401ceb71540c..acdbd2c9e899 100644
--- a/arch/arm64/xen/hypercall.S
+++ b/arch/arm64/xen/hypercall.S
@@ -101,7 +101,7 @@ ENTRY(privcmd_call)
* need the explicit uaccess_enable/disable if the TTBR0 PAN emulation
* is enabled (it implies that hardware UAO and PAN disabled).
*/
- uaccess_ttbr0_enable x6, x7
+ uaccess_ttbr0_enable x6, x7, x8
hvc XEN_IMM
/*
--
2.1.4
In preparation for mapping kernelspace and userspace with different
ASIDs, move the ASID to TTBR1 and update switch_mm to context-switch
TTBR0 via an invalid mapping (the zero page).
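For reference, the resulting flow looks roughly like this at the C level
(illustration only; the TTBR writes themselves stay in cpu_do_switch_mm
in proc.S):

  static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm)
  {
          BUG_ON(pgd == swapper_pg_dir);
          /* 1. Point TTBR0_EL1 at the zero page: no user mappings */
          cpu_set_reserved_ttbr0();
          /* 2. cpu_do_switch_mm() writes the new ASID into TTBR1_EL1
           *    (honoured since TCR.A1 is set), then 3. installs the
           *    new pgd in TTBR0_EL1. */
          cpu_do_switch_mm(virt_to_phys(pgd), mm);
  }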
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/mmu_context.h | 7 +++++++
arch/arm64/include/asm/pgtable-hwdef.h | 1 +
arch/arm64/include/asm/proc-fns.h | 6 ------
arch/arm64/mm/proc.S | 9 ++++++---
4 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 3257895a9b5e..2d63611e4311 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -57,6 +57,13 @@ static inline void cpu_set_reserved_ttbr0(void)
isb();
}
+static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm)
+{
+ BUG_ON(pgd == swapper_pg_dir);
+ cpu_set_reserved_ttbr0();
+ cpu_do_switch_mm(virt_to_phys(pgd),mm);
+}
+
/*
* TCR.T0SZ value to use when the ID map is active. Usually equals
* TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index eb0c2bd90de9..8df4cb6ac6f7 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -272,6 +272,7 @@
#define TCR_TG1_4K (UL(2) << TCR_TG1_SHIFT)
#define TCR_TG1_64K (UL(3) << TCR_TG1_SHIFT)
+#define TCR_A1 (UL(1) << 22)
#define TCR_ASID16 (UL(1) << 36)
#define TCR_TBI0 (UL(1) << 37)
#define TCR_HA (UL(1) << 39)
diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h
index 14ad6e4e87d1..16cef2e8449e 100644
--- a/arch/arm64/include/asm/proc-fns.h
+++ b/arch/arm64/include/asm/proc-fns.h
@@ -35,12 +35,6 @@ extern u64 cpu_do_resume(phys_addr_t ptr, u64 idmap_ttbr);
#include <asm/memory.h>
-#define cpu_switch_mm(pgd,mm) \
-do { \
- BUG_ON(pgd == swapper_pg_dir); \
- cpu_do_switch_mm(virt_to_phys(pgd),mm); \
-} while (0)
-
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* __ASM_PROCFNS_H */
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 95233dfc4c39..a8a64898a2aa 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -139,9 +139,12 @@ ENDPROC(cpu_do_resume)
*/
ENTRY(cpu_do_switch_mm)
pre_ttbr0_update_workaround x0, x2, x3
+ mrs x2, ttbr1_el1
mmid x1, x1 // get mm->context.id
- bfi x0, x1, #48, #16 // set the ASID
- msr ttbr0_el1, x0 // set TTBR0
+ bfi x2, x1, #48, #16 // set the ASID
+ msr ttbr1_el1, x2 // in TTBR1 (since TCR.A1 is set)
+ isb
+ msr ttbr0_el1, x0 // now update TTBR0
isb
post_ttbr0_update_workaround
ret
@@ -224,7 +227,7 @@ ENTRY(__cpu_setup)
* both user and kernel.
*/
ldr x10, =TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
- TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0
+ TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0 | TCR_A1
tcr_set_idmap_t0sz x10, x9
/*
--
2.1.4
In preparation for unmapping the kernel whilst running in userspace,
make the kernel mappings non-global so we can avoid expensive TLB
invalidation on kernel exit to userspace.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/kernel-pgtable.h | 12 ++++++++++--
arch/arm64/include/asm/pgtable-prot.h | 21 +++++++++++++++------
2 files changed, 25 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 7803343e5881..77a27af01371 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -78,8 +78,16 @@
/*
* Initial memory map attributes.
*/
-#define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define _SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
+#define _SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define SWAPPER_PTE_FLAGS (_SWAPPER_PTE_FLAGS | PTE_NG)
+#define SWAPPER_PMD_FLAGS (_SWAPPER_PMD_FLAGS | PMD_SECT_NG)
+#else
+#define SWAPPER_PTE_FLAGS _SWAPPER_PTE_FLAGS
+#define SWAPPER_PMD_FLAGS _SWAPPER_PMD_FLAGS
+#endif
#if ARM64_SWAPPER_USES_SECTION_MAPS
#define SWAPPER_MM_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 0a5635fb0ef9..22a926825e3f 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -34,8 +34,16 @@
#include <asm/pgtable-types.h>
-#define PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define _PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
+#define _PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define PROT_DEFAULT (_PROT_DEFAULT | PTE_NG)
+#define PROT_SECT_DEFAULT (_PROT_SECT_DEFAULT | PMD_SECT_NG)
+#else
+#define PROT_DEFAULT _PROT_DEFAULT
+#define PROT_SECT_DEFAULT _PROT_SECT_DEFAULT
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
#define PROT_DEVICE_nGnRnE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRnE))
#define PROT_DEVICE_nGnRE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRE))
@@ -48,6 +56,7 @@
#define PROT_SECT_NORMAL_EXEC (PROT_SECT_DEFAULT | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL))
#define _PAGE_DEFAULT (PROT_DEFAULT | PTE_ATTRINDX(MT_NORMAL))
+#define _HYP_PAGE_DEFAULT (_PAGE_DEFAULT & ~PTE_NG)
#define PAGE_KERNEL __pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE)
#define PAGE_KERNEL_RO __pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_RDONLY)
@@ -55,15 +64,15 @@
#define PAGE_KERNEL_EXEC __pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE)
#define PAGE_KERNEL_EXEC_CONT __pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_CONT)
-#define PAGE_HYP __pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_HYP_XN)
-#define PAGE_HYP_EXEC __pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY)
-#define PAGE_HYP_RO __pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
+#define PAGE_HYP __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_HYP_XN)
+#define PAGE_HYP_EXEC __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY)
+#define PAGE_HYP_RO __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
#define PAGE_HYP_DEVICE __pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
#define PAGE_S2 __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY)
#define PAGE_S2_DEVICE __pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_UXN)
-#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_PXN | PTE_UXN)
+#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
#define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
#define PAGE_SHARED_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_WRITE)
#define PAGE_READONLY __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
--
2.1.4
The literal pool entry for identifying the vectors base is the only piece
of information in the trampoline page that identifies the true location
of the kernel.
This patch moves it into its own page, which is mapped only by the full
kernel page table, protecting against any accidental leakage of the
trampoline contents.
Suggested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/fixmap.h | 1 +
arch/arm64/kernel/entry.S | 11 +++++++++++
arch/arm64/kernel/vmlinux.lds.S | 35 ++++++++++++++++++++++++++++-------
arch/arm64/mm/mmu.c | 10 +++++++++-
4 files changed, 49 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 8119b49be98d..ec1e6d6fa14c 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -59,6 +59,7 @@ enum fixed_addresses {
#endif /* CONFIG_ACPI_APEI_GHES */
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ FIX_ENTRY_TRAMP_DATA,
FIX_ENTRY_TRAMP_TEXT,
#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT))
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 3eabcb194c87..a70c6dd2cc19 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -1030,7 +1030,13 @@ alternative_else_nop_endif
msr tpidrro_el0, x30 // Restored in kernel_ventry
.endif
tramp_map_kernel x30
+#ifdef CONFIG_RANDOMIZE_BASE
+ adr x30, tramp_vectors + PAGE_SIZE
+alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
+ ldr x30, [x30]
+#else
ldr x30, =vectors
+#endif
prfm plil1strm, [x30, #(1b - tramp_vectors)]
msr vbar_el1, x30
add x30, x30, #(1b - tramp_vectors)
@@ -1073,6 +1079,11 @@ END(tramp_exit_compat)
.ltorg
.popsection // .entry.tramp.text
+#ifdef CONFIG_RANDOMIZE_BASE
+ .pushsection ".entry.tramp.data", "a" // .entry.tramp.data
+ .quad vectors
+ .popsection // .entry.tramp.data
+#endif /* CONFIG_RANDOMIZE_BASE */
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
/*
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 6b4260f22aab..976109b3ae51 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -58,15 +58,28 @@ jiffies = jiffies_64;
#endif
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-#define TRAMP_TEXT \
- . = ALIGN(PAGE_SIZE); \
- VMLINUX_SYMBOL(__entry_tramp_text_start) = .; \
- *(.entry.tramp.text) \
- . = ALIGN(PAGE_SIZE); \
+#define TRAMP_TEXT \
+ . = ALIGN(PAGE_SIZE); \
+ VMLINUX_SYMBOL(__entry_tramp_text_start) = .; \
+ *(.entry.tramp.text) \
+ . = ALIGN(PAGE_SIZE); \
VMLINUX_SYMBOL(__entry_tramp_text_end) = .;
+#ifdef CONFIG_RANDOMIZE_BASE
+#define TRAMP_DATA \
+ .entry.tramp.data : { \
+ . = ALIGN(PAGE_SIZE); \
+ VMLINUX_SYMBOL(__entry_tramp_data_start) = .; \
+ *(.entry.tramp.data) \
+ . = ALIGN(PAGE_SIZE); \
+ VMLINUX_SYMBOL(__entry_tramp_data_end) = .; \
+ }
+#else
+#define TRAMP_DATA
+#endif /* CONFIG_RANDOMIZE_BASE */
#else
#define TRAMP_TEXT
-#endif
+#define TRAMP_DATA
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
/*
* The size of the PE/COFF section that covers the kernel image, which
@@ -137,6 +150,7 @@ SECTIONS
RO_DATA(PAGE_SIZE) /* everything from this point to */
EXCEPTION_TABLE(8) /* __init_begin will be marked RO NX */
NOTES
+ TRAMP_DATA
. = ALIGN(SEGMENT_ALIGN);
__init_begin = .;
@@ -251,7 +265,14 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
<= SZ_4K, "Hibernate exit text too big or misaligned")
#endif
-
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
+ "Entry trampoline text too big")
+#ifdef CONFIG_RANDOMIZE_BASE
+ASSERT((__entry_tramp_data_end - __entry_tramp_data_start) == PAGE_SIZE,
+ "Entry trampoline data too big")
+#endif
+#endif
/*
* If padding is applied before .head.text, virt<->phys conversions will fail.
*/
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index fe68a48c64cb..916d9ced1c3f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -541,8 +541,16 @@ static int __init map_entry_trampoline(void)
__create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
prot, pgd_pgtable_alloc, 0);
- /* ...as well as the kernel page table */
+ /* Map both the text and data into the kernel page table */
__set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+ extern char __entry_tramp_data_start[];
+
+ __set_fixmap(FIX_ENTRY_TRAMP_DATA,
+ __pa_symbol(__entry_tramp_data_start),
+ PAGE_KERNEL_RO);
+ }
+
return 0;
}
core_initcall(map_entry_trampoline);
--
2.1.4
There are now a handful of open-coded masks to extract the ASID from a
TTBR value, so introduce a TTBR_ASID_MASK and use that instead.
Suggested-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/asm-uaccess.h | 3 ++-
arch/arm64/include/asm/mmu.h | 1 +
arch/arm64/include/asm/uaccess.h | 4 ++--
arch/arm64/kernel/entry.S | 2 +-
4 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
index 21b8cf304028..f4f234b6155e 100644
--- a/arch/arm64/include/asm/asm-uaccess.h
+++ b/arch/arm64/include/asm/asm-uaccess.h
@@ -4,6 +4,7 @@
#include <asm/alternative.h>
#include <asm/kernel-pgtable.h>
+#include <asm/mmu.h>
#include <asm/sysreg.h>
#include <asm/assembler.h>
@@ -17,7 +18,7 @@
msr ttbr0_el1, \tmp1 // set reserved TTBR0_EL1
isb
sub \tmp1, \tmp1, #SWAPPER_DIR_SIZE
- bic \tmp1, \tmp1, #(0xffff << 48)
+ bic \tmp1, \tmp1, #TTBR_ASID_MASK
msr ttbr1_el1, \tmp1 // set reserved ASID
isb
.endm
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index da6f12e40714..6f7bdb89817f 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -18,6 +18,7 @@
#define MMCF_AARCH32 0x1 /* mm context flag for AArch32 executables */
#define USER_ASID_FLAG (UL(1) << 48)
+#define TTBR_ASID_MASK (UL(0xffff) << 48)
#ifndef __ASSEMBLY__
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 750a3b76a01c..6eadf55ebaf0 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -112,7 +112,7 @@ static inline void __uaccess_ttbr0_disable(void)
write_sysreg(ttbr + SWAPPER_DIR_SIZE, ttbr0_el1);
isb();
/* Set reserved ASID */
- ttbr &= ~(0xffffUL << 48);
+ ttbr &= ~TTBR_ASID_MASK;
write_sysreg(ttbr, ttbr1_el1);
isb();
}
@@ -131,7 +131,7 @@ static inline void __uaccess_ttbr0_enable(void)
/* Restore active ASID */
ttbr1 = read_sysreg(ttbr1_el1);
- ttbr1 |= ttbr0 & (0xffffUL << 48);
+ ttbr1 |= ttbr0 & TTBR_ASID_MASK;
write_sysreg(ttbr1, ttbr1_el1);
isb();
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 5d51bdbb2131..3eabcb194c87 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -205,7 +205,7 @@ alternative_else_nop_endif
.if \el != 0
mrs x21, ttbr1_el1
- tst x21, #0xffff << 48 // Check for the reserved ASID
+ tst x21, #TTBR_ASID_MASK // Check for the reserved ASID
orr x23, x23, #PSR_PAN_BIT // Set the emulated PAN in the saved SPSR
b.eq 1f // TTBR0 access already disabled
and x23, x23, #~PSR_PAN_BIT // Clear the emulated PAN in the saved SPSR
--
2.1.4
Add a Kconfig entry to control use of the entry trampoline, which allows
us to unmap the kernel whilst running in userspace and improve the
robustness of KASLR.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fdcc7b9bb15d..3af1657fcac3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -833,6 +833,19 @@ config FORCE_MAX_ZONEORDER
However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
4M allocations matching the default size used by generic code.
+config UNMAP_KERNEL_AT_EL0
+ bool "Unmap kernel when running in userspace (aka \"KAISER\")"
+ default y
+ help
+ Some attacks against KASLR make use of the timing difference between
+ a permission fault which could arise from a page table entry that is
+ present in the TLB, and a translation fault which always requires a
+ page table walk. This option defends against these attacks by unmapping
+ the kernel whilst running in userspace, therefore forcing translation
+ faults for all of kernel space.
+
+ If unsure, say Y.
+
menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on COMPAT
--
2.1.4
When running with the kernel unmapped whilst at EL0, the virtually-addressed
SPE buffer is also unmapped. This can lead to buffer faults if userspace
profiling is enabled, and potentially also when writing back kernel samples
unless an expensive drain operation is performed on exception return.
For now, fail the SPE driver probe when arm64_kernel_unmapped_at_el0().
Signed-off-by: Will Deacon <[email protected]>
---
drivers/perf/arm_spe_pmu.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 8ce262fc2561..51b40aecb776 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -1164,6 +1164,15 @@ static int arm_spe_pmu_device_dt_probe(struct platform_device *pdev)
struct arm_spe_pmu *spe_pmu;
struct device *dev = &pdev->dev;
+ /*
+ * If kernelspace is unmapped when running at EL0, then the SPE
+ * buffer will fault and prematurely terminate the AUX session.
+ */
+ if (arm64_kernel_unmapped_at_el0()) {
+ dev_warn_once(dev, "profiling buffer inaccessible. Try passing \"kpti=off\" on the kernel command line\n");
+ return -EPERM;
+ }
+
spe_pmu = devm_kzalloc(dev, sizeof(*spe_pmu), GFP_KERNEL);
if (!spe_pmu) {
dev_err(dev, "failed to allocate spe_pmu\n");
--
2.1.4
Allow explicit disabling of the entry trampoline on the kernel command
line (kpti=off) by adding a fake CPU feature (ARM64_UNMAP_KERNEL_AT_EL0)
that can be used to toggle the alternative sequences in our entry code and
avoid use of the trampoline altogether if desired. This also allows us to
make use of a static key in arm64_kernel_unmapped_at_el0().
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/include/asm/mmu.h | 3 ++-
arch/arm64/kernel/cpufeature.c | 41 ++++++++++++++++++++++++++++++++++++++++
arch/arm64/kernel/entry.S | 9 +++++----
4 files changed, 50 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 2ff7c5e8efab..b4537ffd1018 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -41,7 +41,8 @@
#define ARM64_WORKAROUND_CAVIUM_30115 20
#define ARM64_HAS_DCPOP 21
#define ARM64_SVE 22
+#define ARM64_UNMAP_KERNEL_AT_EL0 23
-#define ARM64_NCAPS 23
+#define ARM64_NCAPS 24
#endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index c07954638658..da6f12e40714 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -36,7 +36,8 @@ typedef struct {
static inline bool arm64_kernel_unmapped_at_el0(void)
{
- return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0);
+ return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) &&
+ cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0);
}
extern void paging_init(void);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index c5ba0097887f..98e6563015a4 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -845,6 +845,40 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
ID_AA64PFR0_FP_SHIFT) < 0;
}
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
+
+static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
+ int __unused)
+{
+ /* Forced on command line? */
+ if (__kpti_forced) {
+ pr_info("kernel page table isolation forced %s by command line option\n",
+ __kpti_forced > 0 ? "ON" : "OFF");
+ return __kpti_forced > 0;
+ }
+
+ /* Useful for KASLR robustness */
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
+ return true;
+
+ return false;
+}
+
+static int __init parse_kpti(char *str)
+{
+ bool enabled;
+ int ret = strtobool(str, &enabled);
+
+ if (ret)
+ return ret;
+
+ __kpti_forced = enabled ? 1 : -1;
+ return 0;
+}
+__setup("kpti=", parse_kpti);
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+
static const struct arm64_cpu_capabilities arm64_features[] = {
{
.desc = "GIC system register CPU interface",
@@ -931,6 +965,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.def_scope = SCOPE_SYSTEM,
.matches = hyp_offset_low,
},
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ {
+ .capability = ARM64_UNMAP_KERNEL_AT_EL0,
+ .def_scope = SCOPE_SYSTEM,
+ .matches = unmap_kernel_at_el0,
+ },
+#endif
{
/* FP/SIMD is not implemented */
.capability = ARM64_HAS_NO_FPSIMD,
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ce56592b5f70..5d51bdbb2131 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -74,6 +74,7 @@
.macro kernel_ventry, el, label, regsize = 64
.align 7
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+alternative_if ARM64_UNMAP_KERNEL_AT_EL0
.if \el == 0
.if \regsize == 64
mrs x30, tpidrro_el0
@@ -82,6 +83,7 @@
mov x30, xzr
.endif
.endif
+alternative_else_nop_endif
#endif
sub sp, sp, #S_FRAME_SIZE
@@ -323,10 +325,9 @@ alternative_else_nop_endif
ldr lr, [sp, #S_LR]
add sp, sp, #S_FRAME_SIZE // restore sp
-#ifndef CONFIG_UNMAP_KERNEL_AT_EL0
- eret
-#else
.if \el == 0
+alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
bne 4f
msr far_el1, x30
tramp_alias x30, tramp_exit_native
@@ -334,10 +335,10 @@ alternative_else_nop_endif
4:
tramp_alias x30, tramp_exit_compat
br x30
+#endif
.else
eret
.endif
-#endif
.endm
.macro irq_stack_entry
--
2.1.4
When unmapping the kernel at EL0, we use tpidrro_el0 as a scratch register
during exception entry from native tasks and subsequently zero it in
the kernel_ventry macro. We can therefore avoid zeroing tpidrro_el0
in the context-switch path for native tasks when the entry trampoline
is in use.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/kernel/process.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index b2adcce7bc18..aba3a1fb492d 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -361,16 +361,14 @@ void tls_preserve_current_state(void)
static void tls_thread_switch(struct task_struct *next)
{
- unsigned long tpidr, tpidrro;
-
tls_preserve_current_state();
- tpidr = *task_user_tls(next);
- tpidrro = is_compat_thread(task_thread_info(next)) ?
- next->thread.tp_value : 0;
+ if (is_compat_thread(task_thread_info(next)))
+ write_sysreg(next->thread.tp_value, tpidrro_el0);
+ else if (!arm64_kernel_unmapped_at_el0())
+ write_sysreg(0, tpidrro_el0);
- write_sysreg(tpidr, tpidr_el0);
- write_sysreg(tpidrro, tpidrro_el0);
+ write_sysreg(*task_user_tls(next), tpidr_el0);
}
/* Restore the UAO state depending on next's addr_limit */
--
2.1.4
We will need to treat exceptions from EL0 differently in kernel_ventry,
so rework the macro to take the exception level as an argument and
construct the branch target using that.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/kernel/entry.S | 46 +++++++++++++++++++++++-----------------------
1 file changed, 23 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 716b5ef42e29..b99fc928119c 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -71,7 +71,7 @@
#define BAD_FIQ 2
#define BAD_ERROR 3
- .macro kernel_ventry label
+ .macro kernel_ventry, el, label, regsize = 64
.align 7
sub sp, sp, #S_FRAME_SIZE
#ifdef CONFIG_VMAP_STACK
@@ -84,7 +84,7 @@
tbnz x0, #THREAD_SHIFT, 0f
sub x0, sp, x0 // x0'' = sp' - x0' = (sp + x0) - sp = x0
sub sp, sp, x0 // sp'' = sp' - x0 = (sp + x0) - x0 = sp
- b \label
+ b el\()\el\()_\label
0:
/*
@@ -116,7 +116,7 @@
sub sp, sp, x0
mrs x0, tpidrro_el0
#endif
- b \label
+ b el\()\el\()_\label
.endm
.macro kernel_entry, el, regsize = 64
@@ -369,31 +369,31 @@ tsk .req x28 // current thread_info
.align 11
ENTRY(vectors)
- kernel_ventry el1_sync_invalid // Synchronous EL1t
- kernel_ventry el1_irq_invalid // IRQ EL1t
- kernel_ventry el1_fiq_invalid // FIQ EL1t
- kernel_ventry el1_error_invalid // Error EL1t
+ kernel_ventry 1, sync_invalid // Synchronous EL1t
+ kernel_ventry 1, irq_invalid // IRQ EL1t
+ kernel_ventry 1, fiq_invalid // FIQ EL1t
+ kernel_ventry 1, error_invalid // Error EL1t
- kernel_ventry el1_sync // Synchronous EL1h
- kernel_ventry el1_irq // IRQ EL1h
- kernel_ventry el1_fiq_invalid // FIQ EL1h
- kernel_ventry el1_error // Error EL1h
+ kernel_ventry 1, sync // Synchronous EL1h
+ kernel_ventry 1, irq // IRQ EL1h
+ kernel_ventry 1, fiq_invalid // FIQ EL1h
+ kernel_ventry 1, error // Error EL1h
- kernel_ventry el0_sync // Synchronous 64-bit EL0
- kernel_ventry el0_irq // IRQ 64-bit EL0
- kernel_ventry el0_fiq_invalid // FIQ 64-bit EL0
- kernel_ventry el0_error // Error 64-bit EL0
+ kernel_ventry 0, sync // Synchronous 64-bit EL0
+ kernel_ventry 0, irq // IRQ 64-bit EL0
+ kernel_ventry 0, fiq_invalid // FIQ 64-bit EL0
+ kernel_ventry 0, error // Error 64-bit EL0
#ifdef CONFIG_COMPAT
- kernel_ventry el0_sync_compat // Synchronous 32-bit EL0
- kernel_ventry el0_irq_compat // IRQ 32-bit EL0
- kernel_ventry el0_fiq_invalid_compat // FIQ 32-bit EL0
- kernel_ventry el0_error_compat // Error 32-bit EL0
+ kernel_ventry 0, sync_compat, 32 // Synchronous 32-bit EL0
+ kernel_ventry 0, irq_compat, 32 // IRQ 32-bit EL0
+ kernel_ventry 0, fiq_invalid_compat, 32 // FIQ 32-bit EL0
+ kernel_ventry 0, error_compat, 32 // Error 32-bit EL0
#else
- kernel_ventry el0_sync_invalid // Synchronous 32-bit EL0
- kernel_ventry el0_irq_invalid // IRQ 32-bit EL0
- kernel_ventry el0_fiq_invalid // FIQ 32-bit EL0
- kernel_ventry el0_error_invalid // Error 32-bit EL0
+ kernel_ventry 0, sync_invalid, 32 // Synchronous 32-bit EL0
+ kernel_ventry 0, irq_invalid, 32 // IRQ 32-bit EL0
+ kernel_ventry 0, fiq_invalid, 32 // FIQ 32-bit EL0
+ kernel_ventry 0, error_invalid, 32 // Error 32-bit EL0
#endif
END(vectors)
--
2.1.4
Hook up the entry trampoline to our exception vectors so that all
exceptions from and returns to EL0 go via the trampoline, which swizzles
the vector base register accordingly. Transitioning to and from the
kernel clobbers x30, so we use tpidrro_el0 and far_el1 as scratch
registers for native tasks.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/kernel/entry.S | 39 ++++++++++++++++++++++++++++++++++++---
1 file changed, 36 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index b99fc928119c..39e3873b8d5a 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -73,6 +73,17 @@
.macro kernel_ventry, el, label, regsize = 64
.align 7
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ .if \el == 0
+ .if \regsize == 64
+ mrs x30, tpidrro_el0
+ msr tpidrro_el0, xzr
+ .else
+ mov x30, xzr
+ .endif
+ .endif
+#endif
+
sub sp, sp, #S_FRAME_SIZE
#ifdef CONFIG_VMAP_STACK
/*
@@ -119,6 +130,11 @@
b el\()\el\()_\label
.endm
+ .macro tramp_alias, dst, sym
+ mov_q \dst, TRAMP_VALIAS
+ add \dst, \dst, #(\sym - .entry.tramp.text)
+ .endm
+
.macro kernel_entry, el, regsize = 64
.if \regsize == 32
mov w0, w0 // zero upper 32 bits of x0
@@ -271,18 +287,20 @@ alternative_else_nop_endif
.if \el == 0
ldr x23, [sp, #S_SP] // load return stack pointer
msr sp_el0, x23
+ tst x22, #PSR_MODE32_BIT // native task?
+ b.eq 3f
+
#ifdef CONFIG_ARM64_ERRATUM_845719
alternative_if ARM64_WORKAROUND_845719
- tbz x22, #4, 1f
#ifdef CONFIG_PID_IN_CONTEXTIDR
mrs x29, contextidr_el1
msr contextidr_el1, x29
#else
msr contextidr_el1, xzr
#endif
-1:
alternative_else_nop_endif
#endif
+3:
.endif
msr elr_el1, x21 // set up the return data
@@ -304,7 +322,22 @@ alternative_else_nop_endif
ldp x28, x29, [sp, #16 * 14]
ldr lr, [sp, #S_LR]
add sp, sp, #S_FRAME_SIZE // restore sp
- eret // return to kernel
+
+#ifndef CONFIG_UNMAP_KERNEL_AT_EL0
+ eret
+#else
+ .if \el == 0
+ bne 4f
+ msr far_el1, x30
+ tramp_alias x30, tramp_exit_native
+ br x30
+4:
+ tramp_alias x30, tramp_exit_compat
+ br x30
+ .else
+ eret
+ .endif
+#endif
.endm
.macro irq_stack_entry
--
2.1.4
To allow unmapping of the kernel whilst running at EL0, we need to
point the exception vectors at an entry trampoline that can map/unmap
the kernel on entry/exit respectively.
This patch adds the trampoline page, although it is not yet plugged
into the vector table and is therefore unused.
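As a rough pseudo-C illustration of what the tramp_map_kernel and
tramp_unmap_kernel macros below do to TTBR1_EL1 (barriers omitted; the
real code is assembly in entry.S):

  /* Illustrative sketch only: tramp_pg_dir is placed immediately after
   * swapper_pg_dir + reserved_ttbr0 by the linker script, so switching
   * page tables is simple arithmetic on the TTBR1 value. */
  static void tramp_map_kernel(void)
  {
          u64 ttbr = read_sysreg(ttbr1_el1);

          ttbr -= SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE; /* -> swapper_pg_dir */
          ttbr &= ~USER_ASID_FLAG;                        /* -> kernel ASID */
          write_sysreg(ttbr, ttbr1_el1);
  }

  static void tramp_unmap_kernel(void)
  {
          u64 ttbr = read_sysreg(ttbr1_el1);

          ttbr += SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE; /* -> tramp_pg_dir */
          ttbr |= USER_ASID_FLAG;                         /* -> user ASID */
          write_sysreg(ttbr, ttbr1_el1);
  }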
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/kernel/entry.S | 86 +++++++++++++++++++++++++++++++++++++++++
arch/arm64/kernel/vmlinux.lds.S | 17 ++++++++
2 files changed, 103 insertions(+)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index d454d8ed45e4..716b5ef42e29 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -28,6 +28,8 @@
#include <asm/errno.h>
#include <asm/esr.h>
#include <asm/irq.h>
+#include <asm/memory.h>
+#include <asm/mmu.h>
#include <asm/processor.h>
#include <asm/ptrace.h>
#include <asm/thread_info.h>
@@ -943,6 +945,90 @@ __ni_sys_trace:
.popsection // .entry.text
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+/*
+ * Exception vectors trampoline.
+ */
+ .pushsection ".entry.tramp.text", "ax"
+
+ .macro tramp_map_kernel, tmp
+ mrs \tmp, ttbr1_el1
+ sub \tmp, \tmp, #(SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
+ bic \tmp, \tmp, #USER_ASID_FLAG
+ msr ttbr1_el1, \tmp
+ .endm
+
+ .macro tramp_unmap_kernel, tmp
+ mrs \tmp, ttbr1_el1
+ add \tmp, \tmp, #(SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
+ orr \tmp, \tmp, #USER_ASID_FLAG
+ msr ttbr1_el1, \tmp
+ /*
+ * We avoid running the post_ttbr_update_workaround here because the
+ * user and kernel ASIDs don't have conflicting mappings, so any
+ * "blessing" as described in:
+ *
+ * http://lkml.kernel.org/r/[email protected]
+ *
+ * will not hurt correctness. Whilst this may partially defeat the
+ * point of using split ASIDs in the first place, it avoids
+ * the hit of invalidating the entire I-cache on every return to
+ * userspace.
+ */
+ .endm
+
+ .macro tramp_ventry, regsize = 64
+ .align 7
+1:
+ .if \regsize == 64
+ msr tpidrro_el0, x30 // Restored in kernel_ventry
+ .endif
+ tramp_map_kernel x30
+ ldr x30, =vectors
+ prfm plil1strm, [x30, #(1b - tramp_vectors)]
+ msr vbar_el1, x30
+ add x30, x30, #(1b - tramp_vectors)
+ isb
+ br x30
+ .endm
+
+ .macro tramp_exit, regsize = 64
+ adr x30, tramp_vectors
+ msr vbar_el1, x30
+ tramp_unmap_kernel x30
+ .if \regsize == 64
+ mrs x30, far_el1
+ .endif
+ eret
+ .endm
+
+ .align 11
+ENTRY(tramp_vectors)
+ .space 0x400
+
+ tramp_ventry
+ tramp_ventry
+ tramp_ventry
+ tramp_ventry
+
+ tramp_ventry 32
+ tramp_ventry 32
+ tramp_ventry 32
+ tramp_ventry 32
+END(tramp_vectors)
+
+ENTRY(tramp_exit_native)
+ tramp_exit
+END(tramp_exit_native)
+
+ENTRY(tramp_exit_compat)
+ tramp_exit 32
+END(tramp_exit_compat)
+
+ .ltorg
+ .popsection // .entry.tramp.text
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+
/*
* Special system call wrappers.
*/
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7da3e5c366a0..6b4260f22aab 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -57,6 +57,17 @@ jiffies = jiffies_64;
#define HIBERNATE_TEXT
#endif
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define TRAMP_TEXT \
+ . = ALIGN(PAGE_SIZE); \
+ VMLINUX_SYMBOL(__entry_tramp_text_start) = .; \
+ *(.entry.tramp.text) \
+ . = ALIGN(PAGE_SIZE); \
+ VMLINUX_SYMBOL(__entry_tramp_text_end) = .;
+#else
+#define TRAMP_TEXT
+#endif
+
/*
* The size of the PE/COFF section that covers the kernel image, which
* runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -113,6 +124,7 @@ SECTIONS
HYPERVISOR_TEXT
IDMAP_TEXT
HIBERNATE_TEXT
+ TRAMP_TEXT
*(.fixup)
*(.gnu.warning)
. = ALIGN(16);
@@ -214,6 +226,11 @@ SECTIONS
. += RESERVED_TTBR0_SIZE;
#endif
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ tramp_pg_dir = .;
+ . += PAGE_SIZE;
+#endif
+
__pecoff_data_size = ABSOLUTE(. - __initdata_begin);
_end = .;
--
2.1.4
Since an mm has both a kernel and a user ASID, we need to ensure that
broadcast TLB maintenance targets both address spaces so that things
like CoW continue to work with the uaccess primitives in the kernel.
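For example, flush_tlb_mm() ends up looking roughly like this after the
patch (the second invalidation is skipped unless the kernel is unmapped
at EL0):

  static inline void flush_tlb_mm(struct mm_struct *mm)
  {
          unsigned long asid = ASID(mm) << 48;

          dsb(ishst);
          __tlbi(aside1is, asid);         /* kernel ASID */
          __tlbi_user(aside1is, asid);    /* user ASID (| USER_ASID_FLAG) */
          dsb(ish);
  }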
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/tlbflush.h | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index af1c76981911..9e82dd79c7db 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -23,6 +23,7 @@
#include <linux/sched.h>
#include <asm/cputype.h>
+#include <asm/mmu.h>
/*
* Raw TLBI operations.
@@ -54,6 +55,11 @@
#define __tlbi(op, ...) __TLBI_N(op, ##__VA_ARGS__, 1, 0)
+#define __tlbi_user(op, arg) do { \
+ if (arm64_kernel_unmapped_at_el0()) \
+ __tlbi(op, (arg) | USER_ASID_FLAG); \
+} while (0)
+
/*
* TLB Management
* ==============
@@ -115,6 +121,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
dsb(ishst);
__tlbi(aside1is, asid);
+ __tlbi_user(aside1is, asid);
dsb(ish);
}
@@ -125,6 +132,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
dsb(ishst);
__tlbi(vale1is, addr);
+ __tlbi_user(vale1is, addr);
dsb(ish);
}
@@ -151,10 +159,13 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
dsb(ishst);
for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) {
- if (last_level)
+ if (last_level) {
__tlbi(vale1is, addr);
- else
+ __tlbi_user(vale1is, addr);
+ } else {
__tlbi(vae1is, addr);
+ __tlbi_user(vae1is, addr);
+ }
}
dsb(ish);
}
@@ -194,6 +205,7 @@ static inline void __flush_tlb_pgtable(struct mm_struct *mm,
unsigned long addr = uaddr >> 12 | (ASID(mm) << 48);
__tlbi(vae1is, addr);
+ __tlbi_user(vae1is, addr);
dsb(ish);
}
--
2.1.4
The exception entry trampoline needs to be mapped at the same virtual
address in both the trampoline page table (which maps nothing else)
and the kernel page table, so that we can swizzle TTBR1_EL1 on
exception entry from, and return to, EL0.
This patch maps the trampoline at a fixed virtual address in the fixmap
area of the kernel virtual address space, which allows the kernel proper
to be randomized with respect to the trampoline when KASLR is enabled.
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/fixmap.h | 4 ++++
arch/arm64/include/asm/pgtable.h | 1 +
arch/arm64/kernel/asm-offsets.c | 6 +++++-
arch/arm64/mm/mmu.c | 23 +++++++++++++++++++++++
4 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 4052ec39e8db..8119b49be98d 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -58,6 +58,10 @@ enum fixed_addresses {
FIX_APEI_GHES_NMI,
#endif /* CONFIG_ACPI_APEI_GHES */
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ FIX_ENTRY_TRAMP_TEXT,
+#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT))
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
__end_of_permanent_fixed_addresses,
/*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 149d05fb9421..774003b247ad 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -680,6 +680,7 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm,
extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
+extern pgd_t tramp_pg_dir[PTRS_PER_PGD];
/*
* Encode and decode a swap entry:
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 71bf088f1e4b..af247d10252f 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -24,6 +24,7 @@
#include <linux/kvm_host.h>
#include <linux/suspend.h>
#include <asm/cpufeature.h>
+#include <asm/fixmap.h>
#include <asm/thread_info.h>
#include <asm/memory.h>
#include <asm/smp_plat.h>
@@ -148,11 +149,14 @@ int main(void)
DEFINE(ARM_SMCCC_RES_X2_OFFS, offsetof(struct arm_smccc_res, a2));
DEFINE(ARM_SMCCC_QUIRK_ID_OFFS, offsetof(struct arm_smccc_quirk, id));
DEFINE(ARM_SMCCC_QUIRK_STATE_OFFS, offsetof(struct arm_smccc_quirk, state));
-
BLANK();
DEFINE(HIBERN_PBE_ORIG, offsetof(struct pbe, orig_address));
DEFINE(HIBERN_PBE_ADDR, offsetof(struct pbe, address));
DEFINE(HIBERN_PBE_NEXT, offsetof(struct pbe, next));
DEFINE(ARM64_FTR_SYSVAL, offsetof(struct arm64_ftr_reg, sys_val));
+ BLANK();
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ DEFINE(TRAMP_VALIAS, TRAMP_VALIAS);
+#endif
return 0;
}
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 267d2b79d52d..fe68a48c64cb 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -525,6 +525,29 @@ static int __init parse_rodata(char *arg)
}
early_param("rodata", parse_rodata);
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+static int __init map_entry_trampoline(void)
+{
+ extern char __entry_tramp_text_start[];
+
+ pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
+ phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
+
+ /* The trampoline is always mapped and can therefore be global */
+ pgprot_val(prot) &= ~PTE_NG;
+
+ /* Map only the text into the trampoline page table */
+ memset(tramp_pg_dir, 0, PGD_SIZE);
+ __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
+ prot, pgd_pgtable_alloc, 0);
+
+ /* ...as well as the kernel page table */
+ __set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
+ return 0;
+}
+core_initcall(map_entry_trampoline);
+#endif
+
/*
* Create fine-grained mappings for the kernel.
*/
--
2.1.4
We rely on an atomic swizzling of TTBR1 when transitioning from the entry
trampoline to the kernel proper on an exception. We can't rely on this
atomicity in the face of Falkor erratum #E1003, so on affected cores we
instead issue a TLB invalidation to remove any stale walk-cache entries
prior to jumping into the kernel. There is still the possibility of a
TLB conflict here due to conflicting walk-cache entries prior to the
invalidation, but this doesn't appear to occur on these CPUs in practice.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 17 +++++------------
arch/arm64/kernel/entry.S | 12 ++++++++++++
2 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a93339f5178f..fdcc7b9bb15d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -522,20 +522,13 @@ config CAVIUM_ERRATUM_30115
config QCOM_FALKOR_ERRATUM_1003
bool "Falkor E1003: Incorrect translation due to ASID change"
default y
- select ARM64_PAN if ARM64_SW_TTBR0_PAN
help
On Falkor v1, an incorrect ASID may be cached in the TLB when ASID
- and BADDR are changed together in TTBRx_EL1. The workaround for this
- issue is to use a reserved ASID in cpu_do_switch_mm() before
- switching to the new ASID. Saying Y here selects ARM64_PAN if
- ARM64_SW_TTBR0_PAN is selected. This is done because implementing and
- maintaining the E1003 workaround in the software PAN emulation code
- would be an unnecessary complication. The affected Falkor v1 CPU
- implements ARMv8.1 hardware PAN support and using hardware PAN
- support versus software PAN emulation is mutually exclusive at
- runtime.
-
- If unsure, say Y.
+ and BADDR are changed together in TTBRx_EL1. Since we keep the ASID
+ in TTBR1_EL1, this situation only occurs in the entry trampoline and
+ then only for entries in the walk cache, since the leaf translation
+ is unchanged. Work around the erratum by invalidating the walk cache
+ entries for the trampoline before entering the kernel proper.
config QCOM_FALKOR_ERRATUM_1009
bool "Falkor E1009: Prematurely complete a DSB after a TLBI"
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 39e3873b8d5a..ce56592b5f70 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -989,6 +989,18 @@ __ni_sys_trace:
sub \tmp, \tmp, #(SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
bic \tmp, \tmp, #USER_ASID_FLAG
msr ttbr1_el1, \tmp
+#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003
+alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1003
+ /* ASID already in \tmp[63:48] */
+ movk \tmp, #:abs_g2_nc:(TRAMP_VALIAS >> 12)
+ movk \tmp, #:abs_g1_nc:(TRAMP_VALIAS >> 12)
+ /* 2MB boundary containing the vectors, so we nobble the walk cache */
+ movk \tmp, #:abs_g0_nc:((TRAMP_VALIAS & ~(SZ_2M - 1)) >> 12)
+ isb
+ tlbi vae1, \tmp
+ dsb nsh
+alternative_else_nop_endif
+#endif /* CONFIG_QCOM_FALKOR_ERRATUM_1003 */
.endm
.macro tramp_unmap_kernel, tmp
--
2.1.4
In preparation for separate kernel/user ASIDs, allocate them in pairs
for each mm_struct. The bottom bit distinguishes the two: if it is set,
then the ASID will map only userspace.
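Roughly, the pairing works like this (sketch only; the patch itself adds
the asid2idx()/idx2asid() macros, and in practice the user ASID is formed
by ORing USER_ASID_FLAG into the TTBR rather than by helpers like the
ones below):

  /* One allocator index covers a kernel/user ASID pair. */
  static unsigned long kernel_asid(unsigned long idx)
  {
          return idx << 1;                /* even: maps kernel and user */
  }

  static unsigned long user_asid(unsigned long idx)
  {
          return (idx << 1) | 1;          /* odd: maps only userspace */
  }

  /* e.g. allocator index 5 -> kernel ASID 10, user ASID 11 */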
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/mmu.h | 1 +
arch/arm64/mm/context.c | 25 +++++++++++++++++--------
2 files changed, 18 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 0d34bf0a89c7..01bfb184f2a8 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -17,6 +17,7 @@
#define __ASM_MMU_H
#define MMCF_AARCH32 0x1 /* mm context flag for AArch32 executables */
+#define USER_ASID_FLAG (UL(1) << 48)
typedef struct {
atomic64_t id;
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 78a2dc596fee..1cb3bc92ae5c 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -39,7 +39,16 @@ static cpumask_t tlb_flush_pending;
#define ASID_MASK (~GENMASK(asid_bits - 1, 0))
#define ASID_FIRST_VERSION (1UL << asid_bits)
-#define NUM_USER_ASIDS ASID_FIRST_VERSION
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define NUM_USER_ASIDS (ASID_FIRST_VERSION >> 1)
+#define asid2idx(asid) (((asid) & ~ASID_MASK) >> 1)
+#define idx2asid(idx) (((idx) << 1) & ~ASID_MASK)
+#else
+#define NUM_USER_ASIDS (ASID_FIRST_VERSION)
+#define asid2idx(asid) ((asid) & ~ASID_MASK)
+#define idx2asid(idx) asid2idx(idx)
+#endif
/* Get the ASIDBits supported by the current CPU */
static u32 get_cpu_asid_bits(void)
@@ -98,7 +107,7 @@ static void flush_context(unsigned int cpu)
*/
if (asid == 0)
asid = per_cpu(reserved_asids, i);
- __set_bit(asid & ~ASID_MASK, asid_map);
+ __set_bit(asid2idx(asid), asid_map);
per_cpu(reserved_asids, i) = asid;
}
@@ -153,16 +162,16 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
* We had a valid ASID in a previous life, so try to re-use
* it if possible.
*/
- asid &= ~ASID_MASK;
- if (!__test_and_set_bit(asid, asid_map))
+ if (!__test_and_set_bit(asid2idx(asid), asid_map))
return newasid;
}
/*
* Allocate a free ASID. If we can't find one, take a note of the
- * currently active ASIDs and mark the TLBs as requiring flushes.
- * We always count from ASID #1, as we use ASID #0 when setting a
- * reserved TTBR0 for the init_mm.
+ * currently active ASIDs and mark the TLBs as requiring flushes. We
+ * always count from ASID #2 (index 1), as we use ASID #0 when setting
+ * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
+ * pairs.
*/
asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, cur_idx);
if (asid != NUM_USER_ASIDS)
@@ -179,7 +188,7 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
set_asid:
__set_bit(asid, asid_map);
cur_idx = asid;
- return asid | generation;
+ return idx2asid(asid) | generation;
}
void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
--
2.1.4
In order for code such as TLB invalidation to operate efficiently when
the decision to map the kernel at EL0 is determined at runtime, this
patch introduces a helper function, arm64_kernel_unmapped_at_el0, to
report whether the kernel is mapped whilst running in userspace.
Currently, this just reports the value of CONFIG_UNMAP_KERNEL_AT_EL0,
but will later be hooked up to a fake CPU capability using a static key.
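A rough sketch of the intended usage pattern (this is effectively what
the TLB invalidation patch later in the series does):

  /* Extend a TLB invalidation to cover the user ASID only when needed */
  if (arm64_kernel_unmapped_at_el0())
          __tlbi(aside1is, asid | USER_ASID_FLAG);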
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/mmu.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 01bfb184f2a8..c07954638658 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -19,6 +19,8 @@
#define MMCF_AARCH32 0x1 /* mm context flag for AArch32 executables */
#define USER_ASID_FLAG (UL(1) << 48)
+#ifndef __ASSEMBLY__
+
typedef struct {
atomic64_t id;
void *vdso;
@@ -32,6 +34,11 @@ typedef struct {
*/
#define ASID(mm) ((mm)->context.id.counter & 0xffff)
+static inline bool arm64_kernel_unmapped_at_el0(void)
+{
+ return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0);
+}
+
extern void paging_init(void);
extern void bootmem_init(void);
extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
@@ -42,4 +49,5 @@ extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
extern void mark_linear_text_alias_ro(void);
+#endif /* !__ASSEMBLY__ */
#endif
--
2.1.4
The post_ttbr0_update_workaround hook applies to any change to TTBRx_EL1.
Since we're using TTBR1 for the ASID, rename the hook to
post_ttbr_update_workaround to make it clear that it applies to more
than just TTBR0_EL1.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/assembler.h | 5 ++---
arch/arm64/kernel/entry.S | 2 +-
arch/arm64/mm/proc.S | 2 +-
3 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index e1fa5db858b7..c45bc94f15d0 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -477,10 +477,9 @@ alternative_endif
.endm
/*
-/*
- * Errata workaround post TTBR0_EL1 update.
+ * Errata workaround post TTBRx_EL1 update.
*/
- .macro post_ttbr0_update_workaround
+ .macro post_ttbr_update_workaround
#ifdef CONFIG_CAVIUM_ERRATUM_27456
alternative_if ARM64_WORKAROUND_CAVIUM_27456
ic iallu
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 6d14b8f29b5f..804e43c9cb0b 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -257,7 +257,7 @@ alternative_else_nop_endif
* Cavium erratum 27456 (broadcast TLBI instructions may cause I-cache
* corruption).
*/
- post_ttbr0_update_workaround
+ post_ttbr_update_workaround
.endif
1:
.if \el != 0
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index f2ff0837577c..3146dc96f05b 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -145,7 +145,7 @@ ENTRY(cpu_do_switch_mm)
isb
msr ttbr0_el1, x0 // now update TTBR0
isb
- post_ttbr0_update_workaround
+ post_ttbr_update_workaround
ret
ENDPROC(cpu_do_switch_mm)
--
2.1.4
The pre_ttbr0_update_workaround hook is called prior to context-switching
TTBR0 because Falkor erratum E1003 can cause TLB allocation with the wrong
ASID if both the ASID and the base address of the TTBR are updated at
the same time.
With the ASID sitting safely in TTBR1, we no longer update the ASID and the
TTBR0 base address at the same time, so the pre_ttbr0_update_workaround
macro is no longer required and can be removed. The erratum infrastructure
and documentation are left around for #E1003, as they will be required by
the entry trampoline code in a future patch.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/arm64/include/asm/assembler.h | 22 ----------------------
arch/arm64/include/asm/mmu_context.h | 2 --
arch/arm64/mm/context.c | 11 -----------
arch/arm64/mm/proc.S | 1 -
4 files changed, 36 deletions(-)
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index aef72d886677..e1fa5db858b7 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -26,7 +26,6 @@
#include <asm/asm-offsets.h>
#include <asm/cpufeature.h>
#include <asm/debug-monitors.h>
-#include <asm/mmu_context.h>
#include <asm/page.h>
#include <asm/pgtable-hwdef.h>
#include <asm/ptrace.h>
@@ -478,27 +477,6 @@ alternative_endif
.endm
/*
- * Errata workaround prior to TTBR0_EL1 update
- *
- * val: TTBR value with new BADDR, preserved
- * tmp0: temporary register, clobbered
- * tmp1: other temporary register, clobbered
- */
- .macro pre_ttbr0_update_workaround, val, tmp0, tmp1
-#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003
-alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1003
- mrs \tmp0, ttbr0_el1
- mov \tmp1, #FALKOR_RESERVED_ASID
- bfi \tmp0, \tmp1, #48, #16 // reserved ASID + old BADDR
- msr ttbr0_el1, \tmp0
- isb
- bfi \tmp0, \val, #0, #48 // reserved ASID + new BADDR
- msr ttbr0_el1, \tmp0
- isb
-alternative_else_nop_endif
-#endif
- .endm
-
/*
* Errata workaround post TTBR0_EL1 update.
*/
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 2d63611e4311..f9744944cf12 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -19,8 +19,6 @@
#ifndef __ASM_MMU_CONTEXT_H
#define __ASM_MMU_CONTEXT_H
-#define FALKOR_RESERVED_ASID 1
-
#ifndef __ASSEMBLY__
#include <linux/compiler.h>
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 6f4017046323..78a2dc596fee 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -79,13 +79,6 @@ void verify_cpu_asid_bits(void)
}
}
-static void set_reserved_asid_bits(void)
-{
- if (IS_ENABLED(CONFIG_QCOM_FALKOR_ERRATUM_1003) &&
- cpus_have_const_cap(ARM64_WORKAROUND_QCOM_FALKOR_E1003))
- __set_bit(FALKOR_RESERVED_ASID, asid_map);
-}
-
static void flush_context(unsigned int cpu)
{
int i;
@@ -94,8 +87,6 @@ static void flush_context(unsigned int cpu)
/* Update the list of reserved ASIDs and the ASID bitmap. */
bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
- set_reserved_asid_bits();
-
for_each_possible_cpu(i) {
asid = atomic64_xchg_relaxed(&per_cpu(active_asids, i), 0);
/*
@@ -254,8 +245,6 @@ static int asids_init(void)
panic("Failed to allocate bitmap for %lu ASIDs\n",
NUM_USER_ASIDS);
- set_reserved_asid_bits();
-
pr_info("ASID allocator initialised with %lu entries\n", NUM_USER_ASIDS);
return 0;
}
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index a8a64898a2aa..f2ff0837577c 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -138,7 +138,6 @@ ENDPROC(cpu_do_resume)
* - pgd_phys - physical address of new TTB
*/
ENTRY(cpu_do_switch_mm)
- pre_ttbr0_update_workaround x0, x2, x3
mrs x2, ttbr1_el1
mmid x1, x1 // get mm->context.id
bfi x2, x1, #48, #16 // set the ASID
--
2.1.4
On 6 December 2017 at 12:35, Will Deacon <[email protected]> wrote:
> The literal pool entry for identifying the vectors base is the only piece
> of information in the trampoline page that identifies the true location
> of the kernel.
>
> This patch moves it into its own page, which is only mapped by the full
> kernel page table, which protects against any accidental leakage of the
> trampoline contents.
>
> Suggested-by: Ard Biesheuvel <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> arch/arm64/include/asm/fixmap.h | 1 +
> arch/arm64/kernel/entry.S | 11 +++++++++++
> arch/arm64/kernel/vmlinux.lds.S | 35 ++++++++++++++++++++++++++++-------
> arch/arm64/mm/mmu.c | 10 +++++++++-
> 4 files changed, 49 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
> index 8119b49be98d..ec1e6d6fa14c 100644
> --- a/arch/arm64/include/asm/fixmap.h
> +++ b/arch/arm64/include/asm/fixmap.h
> @@ -59,6 +59,7 @@ enum fixed_addresses {
> #endif /* CONFIG_ACPI_APEI_GHES */
>
> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> + FIX_ENTRY_TRAMP_DATA,
> FIX_ENTRY_TRAMP_TEXT,
> #define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT))
> #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 3eabcb194c87..a70c6dd2cc19 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -1030,7 +1030,13 @@ alternative_else_nop_endif
> msr tpidrro_el0, x30 // Restored in kernel_ventry
> .endif
> tramp_map_kernel x30
> +#ifdef CONFIG_RANDOMIZE_BASE
> + adr x30, tramp_vectors + PAGE_SIZE
> +alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> + ldr x30, [x30]
> +#else
> ldr x30, =vectors
> +#endif
> prfm plil1strm, [x30, #(1b - tramp_vectors)]
> msr vbar_el1, x30
> add x30, x30, #(1b - tramp_vectors)
> @@ -1073,6 +1079,11 @@ END(tramp_exit_compat)
>
> .ltorg
> .popsection // .entry.tramp.text
> +#ifdef CONFIG_RANDOMIZE_BASE
> + .pushsection ".entry.tramp.data", "a" // .entry.tramp.data
> + .quad vectors
> + .popsection // .entry.tramp.data
This does not need to be in a section of its own, and doesn't need to
be padded to a full page.
If you just stick this in .rodata but align it to page size, you can
just map whichever page it ends up in into the TRAMP_DATA fixmap slot
(which is a r/o mapping anyway). You could then drop most of the
changes below. And actually, I'm not entirely sure whether it still
makes sense then to do this only for CONFIG_RANDOMIZE_BASE.
> +#endif /* CONFIG_RANDOMIZE_BASE */
> #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
>
> /*
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 6b4260f22aab..976109b3ae51 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -58,15 +58,28 @@ jiffies = jiffies_64;
> #endif
>
> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> -#define TRAMP_TEXT \
> - . = ALIGN(PAGE_SIZE); \
> - VMLINUX_SYMBOL(__entry_tramp_text_start) = .; \
> - *(.entry.tramp.text) \
> - . = ALIGN(PAGE_SIZE); \
> +#define TRAMP_TEXT \
> + . = ALIGN(PAGE_SIZE); \
> + VMLINUX_SYMBOL(__entry_tramp_text_start) = .; \
> + *(.entry.tramp.text) \
> + . = ALIGN(PAGE_SIZE); \
> VMLINUX_SYMBOL(__entry_tramp_text_end) = .;
> +#ifdef CONFIG_RANDOMIZE_BASE
> +#define TRAMP_DATA \
> + .entry.tramp.data : { \
> + . = ALIGN(PAGE_SIZE); \
> + VMLINUX_SYMBOL(__entry_tramp_data_start) = .; \
> + *(.entry.tramp.data) \
> + . = ALIGN(PAGE_SIZE); \
> + VMLINUX_SYMBOL(__entry_tramp_data_end) = .; \
> + }
> +#else
> +#define TRAMP_DATA
> +#endif /* CONFIG_RANDOMIZE_BASE */
> #else
> #define TRAMP_TEXT
> -#endif
> +#define TRAMP_DATA
> +#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
>
> /*
> * The size of the PE/COFF section that covers the kernel image, which
> @@ -137,6 +150,7 @@ SECTIONS
> RO_DATA(PAGE_SIZE) /* everything from this point to */
> EXCEPTION_TABLE(8) /* __init_begin will be marked RO NX */
> NOTES
> + TRAMP_DATA
>
> . = ALIGN(SEGMENT_ALIGN);
> __init_begin = .;
> @@ -251,7 +265,14 @@ ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
> ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
> <= SZ_4K, "Hibernate exit text too big or misaligned")
> #endif
> -
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> +ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
> + "Entry trampoline text too big")
> +#ifdef CONFIG_RANDOMIZE_BASE
> +ASSERT((__entry_tramp_data_end - __entry_tramp_data_start) == PAGE_SIZE,
> + "Entry trampoline data too big")
> +#endif
> +#endif
> /*
> * If padding is applied before .head.text, virt<->phys conversions will fail.
> */
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index fe68a48c64cb..916d9ced1c3f 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -541,8 +541,16 @@ static int __init map_entry_trampoline(void)
> __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
> prot, pgd_pgtable_alloc, 0);
>
> - /* ...as well as the kernel page table */
> + /* Map both the text and data into the kernel page table */
> __set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
> + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
> + extern char __entry_tramp_data_start[];
> +
> + __set_fixmap(FIX_ENTRY_TRAMP_DATA,
> + __pa_symbol(__entry_tramp_data_start),
> + PAGE_KERNEL_RO);
> + }
> +
> return 0;
> }
> core_initcall(map_entry_trampoline);
> --
> 2.1.4
>
Hi Ard,
On Wed, Dec 06, 2017 at 12:59:40PM +0000, Ard Biesheuvel wrote:
> On 6 December 2017 at 12:35, Will Deacon <[email protected]> wrote:
> > The literal pool entry for identifying the vectors base is the only piece
> > of information in the trampoline page that identifies the true location
> > of the kernel.
> >
> > This patch moves it into its own page, which is only mapped by the full
> > kernel page table, which protects against any accidental leakage of the
> > trampoline contents.
[...]
> > @@ -1073,6 +1079,11 @@ END(tramp_exit_compat)
> >
> > .ltorg
> > .popsection // .entry.tramp.text
> > +#ifdef CONFIG_RANDOMIZE_BASE
> > + .pushsection ".entry.tramp.data", "a" // .entry.tramp.data
> > + .quad vectors
> > + .popsection // .entry.tramp.data
>
> This does not need to be in a section of its own, and doesn't need to
> be padded to a full page.
>
> If you just stick this in .rodata but align it to page size, you can
> just map whichever page it ends up in into the TRAMP_DATA fixmap slot
> (which is a r/o mapping anyway). You could then drop most of the
> changes below. And actually, I'm not entirely sure whether it still
> makes sense then to do this only for CONFIG_RANDOMIZE_BASE.
Good point; I momentarily forgot this was in the kernel page table anyway.
How about something like the diff below merged on top (so this basically
undoes a bunch of the patch)?
I'd prefer to keep the CONFIG_RANDOMIZE_BASE dependency, at least for now.
Will
--->8
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index a70c6dd2cc19..031392ee5f47 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -1080,9 +1080,12 @@ END(tramp_exit_compat)
.ltorg
.popsection // .entry.tramp.text
#ifdef CONFIG_RANDOMIZE_BASE
- .pushsection ".entry.tramp.data", "a" // .entry.tramp.data
+ .pushsection ".rodata", "a"
+ .align PAGE_SHIFT
+ .globl __entry_tramp_data_start
+__entry_tramp_data_start:
.quad vectors
- .popsection // .entry.tramp.data
+ .popsection // .rodata
#endif /* CONFIG_RANDOMIZE_BASE */
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 976109b3ae51..27cf9be20a00 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -64,21 +64,8 @@ jiffies = jiffies_64;
*(.entry.tramp.text) \
. = ALIGN(PAGE_SIZE); \
VMLINUX_SYMBOL(__entry_tramp_text_end) = .;
-#ifdef CONFIG_RANDOMIZE_BASE
-#define TRAMP_DATA \
- .entry.tramp.data : { \
- . = ALIGN(PAGE_SIZE); \
- VMLINUX_SYMBOL(__entry_tramp_data_start) = .; \
- *(.entry.tramp.data) \
- . = ALIGN(PAGE_SIZE); \
- VMLINUX_SYMBOL(__entry_tramp_data_end) = .; \
- }
-#else
-#define TRAMP_DATA
-#endif /* CONFIG_RANDOMIZE_BASE */
#else
#define TRAMP_TEXT
-#define TRAMP_DATA
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
/*
@@ -150,7 +137,6 @@ SECTIONS
RO_DATA(PAGE_SIZE) /* everything from this point to */
EXCEPTION_TABLE(8) /* __init_begin will be marked RO NX */
NOTES
- TRAMP_DATA
. = ALIGN(SEGMENT_ALIGN);
__init_begin = .;
@@ -268,10 +254,6 @@ ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
"Entry trampoline text too big")
-#ifdef CONFIG_RANDOMIZE_BASE
-ASSERT((__entry_tramp_data_end - __entry_tramp_data_start) == PAGE_SIZE,
- "Entry trampoline data too big")
-#endif
#endif
/*
* If padding is applied before .head.text, virt<->phys conversions will fail.
On Wed, Dec 06, 2017 at 12:35:37PM +0000, Will Deacon wrote:
> When running with the kernel unmapped whilst at EL0, the virtually-addressed
> SPE buffer is also unmapped, which can lead to buffer faults if userspace
> profiling is enabled and potentially also when writing back kernel samples
> unless an expensive drain operation is performed on exception return.
>
> For now, fail the SPE driver probe when arm64_kernel_unmapped_at_el0().
>
> Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Mark.
> ---
> drivers/perf/arm_spe_pmu.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
> index 8ce262fc2561..51b40aecb776 100644
> --- a/drivers/perf/arm_spe_pmu.c
> +++ b/drivers/perf/arm_spe_pmu.c
> @@ -1164,6 +1164,15 @@ static int arm_spe_pmu_device_dt_probe(struct platform_device *pdev)
> struct arm_spe_pmu *spe_pmu;
> struct device *dev = &pdev->dev;
>
> + /*
> + * If kernelspace is unmapped when running at EL0, then the SPE
> + * buffer will fault and prematurely terminate the AUX session.
> + */
> + if (arm64_kernel_unmapped_at_el0()) {
> + dev_warn_once(dev, "profiling buffer inaccessible. Try passing \"kpti=off\" on the kernel command line\n");
> + return -EPERM;
> + }
> +
> spe_pmu = devm_kzalloc(dev, sizeof(*spe_pmu), GFP_KERNEL);
> if (!spe_pmu) {
> dev_err(dev, "failed to allocate spe_pmu\n");
> --
> 2.1.4
>
On 6 December 2017 at 13:27, Will Deacon <[email protected]> wrote:
> Hi Ard,
>
> On Wed, Dec 06, 2017 at 12:59:40PM +0000, Ard Biesheuvel wrote:
>> On 6 December 2017 at 12:35, Will Deacon <[email protected]> wrote:
>> > The literal pool entry for identifying the vectors base is the only piece
>> > of information in the trampoline page that identifies the true location
>> > of the kernel.
>> >
>> > This patch moves it into its own page, which is only mapped by the full
>> > kernel page table, which protects against any accidental leakage of the
>> > trampoline contents.
>
> [...]
>
>> > @@ -1073,6 +1079,11 @@ END(tramp_exit_compat)
>> >
>> > .ltorg
>> > .popsection // .entry.tramp.text
>> > +#ifdef CONFIG_RANDOMIZE_BASE
>> > + .pushsection ".entry.tramp.data", "a" // .entry.tramp.data
>> > + .quad vectors
>> > + .popsection // .entry.tramp.data
>>
>> This does not need to be in a section of its own, and doesn't need to
>> be padded to a full page.
>>
>> If you just stick this in .rodata but align it to page size, you can
>> just map whichever page it ends up in into the TRAMP_DATA fixmap slot
>> (which is a r/o mapping anyway). You could then drop most of the
>> changes below. And actually, I'm not entirely sure whether it still
>> makes sense then to do this only for CONFIG_RANDOMIZE_BASE.
>
> Good point; I momentarily forgot this was in the kernel page table anyway.
> How about something like the diff below merged on top (so this basically
> undoes a bunch of the patch)?
>
Yes, that looks much better.
> I'd prefer to keep the CONFIG_RANDOMIZE_BASE dependency, at least for now.
>
Fair enough.
> --->8
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index a70c6dd2cc19..031392ee5f47 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -1080,9 +1080,12 @@ END(tramp_exit_compat)
> .ltorg
> .popsection // .entry.tramp.text
> #ifdef CONFIG_RANDOMIZE_BASE
> - .pushsection ".entry.tramp.data", "a" // .entry.tramp.data
> + .pushsection ".rodata", "a"
> + .align PAGE_SHIFT
> + .globl __entry_tramp_data_start
> +__entry_tramp_data_start:
> .quad vectors
> - .popsection // .entry.tramp.data
> + .popsection // .rodata
> #endif /* CONFIG_RANDOMIZE_BASE */
> #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
>
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 976109b3ae51..27cf9be20a00 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -64,21 +64,8 @@ jiffies = jiffies_64;
> *(.entry.tramp.text) \
> . = ALIGN(PAGE_SIZE); \
> VMLINUX_SYMBOL(__entry_tramp_text_end) = .;
> -#ifdef CONFIG_RANDOMIZE_BASE
> -#define TRAMP_DATA \
> - .entry.tramp.data : { \
> - . = ALIGN(PAGE_SIZE); \
> - VMLINUX_SYMBOL(__entry_tramp_data_start) = .; \
> - *(.entry.tramp.data) \
> - . = ALIGN(PAGE_SIZE); \
> - VMLINUX_SYMBOL(__entry_tramp_data_end) = .; \
> - }
> -#else
> -#define TRAMP_DATA
> -#endif /* CONFIG_RANDOMIZE_BASE */
> #else
> #define TRAMP_TEXT
> -#define TRAMP_DATA
> #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
>
> /*
> @@ -150,7 +137,6 @@ SECTIONS
> RO_DATA(PAGE_SIZE) /* everything from this point to */
> EXCEPTION_TABLE(8) /* __init_begin will be marked RO NX */
> NOTES
> - TRAMP_DATA
>
> . = ALIGN(SEGMENT_ALIGN);
> __init_begin = .;
> @@ -268,10 +254,6 @@ ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
> #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
> "Entry trampoline text too big")
> -#ifdef CONFIG_RANDOMIZE_BASE
> -ASSERT((__entry_tramp_data_end - __entry_tramp_data_start) == PAGE_SIZE,
> - "Entry trampoline data too big")
> -#endif
> #endif
> /*
> * If padding is applied before .head.text, virt<->phys conversions will fail.
On Wed, Dec 06, 2017 at 12:35:35PM +0000, Will Deacon wrote:
> +static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> + int __unused)
> +{
> + /* Forced on command line? */
> + if (__kpti_forced) {
> + pr_info("kernel page table isolation forced %s by command line option\n",
> + __kpti_forced > 0 ? "ON" : "OFF");
> + return __kpti_forced > 0;
> + }
I think we want this to be a pr_info_once() so that we don't print this
for late-onlined secondaries due to verify_local_cpu_features().
With that changed:
Reviewed-by: Mark Rutland <[email protected]>
Thanks,
Mark.
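For concreteness, the quoted hunk with the pr_info_once() suggestion applied
would look roughly like this; the tail of the function isn't quoted above, so
the KASLR default shown here is only a sketch of the posted patch:

static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
				int __unused)
{
	/* Forced on command line? */
	if (__kpti_forced) {
		/* _once: avoid re-printing for late-onlined secondary CPUs */
		pr_info_once("kernel page table isolation forced %s by command line option\n",
			     __kpti_forced > 0 ? "ON" : "OFF");
		return __kpti_forced > 0;
	}

	/* Useful for KASLR robustness */
	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
		return true;

	return false;
}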
On Wed, Dec 06, 2017 at 12:35:38PM +0000, Will Deacon wrote:
> There are now a handful of open-coded masks to extract the ASID from a
> TTBR value, so introduce a TTBR_ASID_MASK and use that instead.
>
> Suggested-by: Mark Rutland <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>
Thanks!
Reviewed-by: Mark Rutland <[email protected]>
Mark.
> ---
> arch/arm64/include/asm/asm-uaccess.h | 3 ++-
> arch/arm64/include/asm/mmu.h | 1 +
> arch/arm64/include/asm/uaccess.h | 4 ++--
> arch/arm64/kernel/entry.S | 2 +-
> 4 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
> index 21b8cf304028..f4f234b6155e 100644
> --- a/arch/arm64/include/asm/asm-uaccess.h
> +++ b/arch/arm64/include/asm/asm-uaccess.h
> @@ -4,6 +4,7 @@
>
> #include <asm/alternative.h>
> #include <asm/kernel-pgtable.h>
> +#include <asm/mmu.h>
> #include <asm/sysreg.h>
> #include <asm/assembler.h>
>
> @@ -17,7 +18,7 @@
> msr ttbr0_el1, \tmp1 // set reserved TTBR0_EL1
> isb
> sub \tmp1, \tmp1, #SWAPPER_DIR_SIZE
> - bic \tmp1, \tmp1, #(0xffff << 48)
> + bic \tmp1, \tmp1, #TTBR_ASID_MASK
> msr ttbr1_el1, \tmp1 // set reserved ASID
> isb
> .endm
> diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> index da6f12e40714..6f7bdb89817f 100644
> --- a/arch/arm64/include/asm/mmu.h
> +++ b/arch/arm64/include/asm/mmu.h
> @@ -18,6 +18,7 @@
>
> #define MMCF_AARCH32 0x1 /* mm context flag for AArch32 executables */
> #define USER_ASID_FLAG (UL(1) << 48)
> +#define TTBR_ASID_MASK (UL(0xffff) << 48)
>
> #ifndef __ASSEMBLY__
>
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 750a3b76a01c..6eadf55ebaf0 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -112,7 +112,7 @@ static inline void __uaccess_ttbr0_disable(void)
> write_sysreg(ttbr + SWAPPER_DIR_SIZE, ttbr0_el1);
> isb();
> /* Set reserved ASID */
> - ttbr &= ~(0xffffUL << 48);
> + ttbr &= ~TTBR_ASID_MASK;
> write_sysreg(ttbr, ttbr1_el1);
> isb();
> }
> @@ -131,7 +131,7 @@ static inline void __uaccess_ttbr0_enable(void)
>
> /* Restore active ASID */
> ttbr1 = read_sysreg(ttbr1_el1);
> - ttbr1 |= ttbr0 & (0xffffUL << 48);
> + ttbr1 |= ttbr0 & TTBR_ASID_MASK;
> write_sysreg(ttbr1, ttbr1_el1);
> isb();
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 5d51bdbb2131..3eabcb194c87 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -205,7 +205,7 @@ alternative_else_nop_endif
>
> .if \el != 0
> mrs x21, ttbr1_el1
> - tst x21, #0xffff << 48 // Check for the reserved ASID
> + tst x21, #TTBR_ASID_MASK // Check for the reserved ASID
> orr x23, x23, #PSR_PAN_BIT // Set the emulated PAN in the saved SPSR
> b.eq 1f // TTBR0 access already disabled
> and x23, x23, #~PSR_PAN_BIT // Clear the emulated PAN in the saved SPSR
> --
> 2.1.4
>
On Wed, Dec 06, 2017 at 12:35:30PM +0000, Will Deacon wrote:
> The exception entry trampoline needs to be mapped at the same virtual
> address in both the trampoline page table (which maps nothing else)
> and also the kernel page table, so that we can swizzle TTBR1_EL1 on
> exceptions from and return to EL0.
>
> This patch maps the trampoline at a fixed virtual address in the fixmap
> area of the kernel virtual address space, which allows the kernel proper
> to be randomized with respect to the trampoline when KASLR is enabled.
>
> Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Mark.
> ---
> arch/arm64/include/asm/fixmap.h | 4 ++++
> arch/arm64/include/asm/pgtable.h | 1 +
> arch/arm64/kernel/asm-offsets.c | 6 +++++-
> arch/arm64/mm/mmu.c | 23 +++++++++++++++++++++++
> 4 files changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
> index 4052ec39e8db..8119b49be98d 100644
> --- a/arch/arm64/include/asm/fixmap.h
> +++ b/arch/arm64/include/asm/fixmap.h
> @@ -58,6 +58,10 @@ enum fixed_addresses {
> FIX_APEI_GHES_NMI,
> #endif /* CONFIG_ACPI_APEI_GHES */
>
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> + FIX_ENTRY_TRAMP_TEXT,
> +#define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT))
> +#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
> __end_of_permanent_fixed_addresses,
>
> /*
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 149d05fb9421..774003b247ad 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -680,6 +680,7 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm,
>
> extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
> extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
> +extern pgd_t tramp_pg_dir[PTRS_PER_PGD];
>
> /*
> * Encode and decode a swap entry:
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 71bf088f1e4b..af247d10252f 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -24,6 +24,7 @@
> #include <linux/kvm_host.h>
> #include <linux/suspend.h>
> #include <asm/cpufeature.h>
> +#include <asm/fixmap.h>
> #include <asm/thread_info.h>
> #include <asm/memory.h>
> #include <asm/smp_plat.h>
> @@ -148,11 +149,14 @@ int main(void)
> DEFINE(ARM_SMCCC_RES_X2_OFFS, offsetof(struct arm_smccc_res, a2));
> DEFINE(ARM_SMCCC_QUIRK_ID_OFFS, offsetof(struct arm_smccc_quirk, id));
> DEFINE(ARM_SMCCC_QUIRK_STATE_OFFS, offsetof(struct arm_smccc_quirk, state));
> -
> BLANK();
> DEFINE(HIBERN_PBE_ORIG, offsetof(struct pbe, orig_address));
> DEFINE(HIBERN_PBE_ADDR, offsetof(struct pbe, address));
> DEFINE(HIBERN_PBE_NEXT, offsetof(struct pbe, next));
> DEFINE(ARM64_FTR_SYSVAL, offsetof(struct arm64_ftr_reg, sys_val));
> + BLANK();
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> + DEFINE(TRAMP_VALIAS, TRAMP_VALIAS);
> +#endif
> return 0;
> }
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 267d2b79d52d..fe68a48c64cb 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -525,6 +525,29 @@ static int __init parse_rodata(char *arg)
> }
> early_param("rodata", parse_rodata);
>
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> +static int __init map_entry_trampoline(void)
> +{
> + extern char __entry_tramp_text_start[];
> +
> + pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
> + phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
> +
> + /* The trampoline is always mapped and can therefore be global */
> + pgprot_val(prot) &= ~PTE_NG;
> +
> + /* Map only the text into the trampoline page table */
> + memset(tramp_pg_dir, 0, PGD_SIZE);
> + __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
> + prot, pgd_pgtable_alloc, 0);
> +
> + /* ...as well as the kernel page table */
> + __set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
> + return 0;
> +}
> +core_initcall(map_entry_trampoline);
> +#endif
> +
> /*
> * Create fine-grained mappings for the kernel.
> */
> --
> 2.1.4
>
On 12/06/2017 04:35 AM, Will Deacon wrote:
> Hi everybody,
>
> This is version three of the patches formerly known as KAISER (♔).
>
> v1: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-November/542751.html
> v2: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-November/544817.html
>
> Changes since v2 include:
>
> * Rename command-line option from "kaiser=" to "kpti=" for parity with x86
> * Fixed Falkor erratum workaround (missing '~')
> * Moved vectors base from literal pool into separate data page
> * Added TTBR_ASID_MASK instead of open-coded constants
> * Added missing newline to error message
> * Fail to probe SPE if KPTI is enabled
> * Addressed minor review comments
> * Added tags
> * Based on -rc2
>
> Patches are also pushed here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
>
> Feedback and testing welcome. At this point, I'd like to start thinking
> about getting this merged for 4.16.
>
You can add
Tested-by: Laura Abbott <[email protected]>
> Cheers,
>
> Will
>
> --->8
>
> Will Deacon (20):
> arm64: mm: Use non-global mappings for kernel space
> arm64: mm: Temporarily disable ARM64_SW_TTBR0_PAN
> arm64: mm: Move ASID from TTBR0 to TTBR1
> arm64: mm: Remove pre_ttbr0_update_workaround for Falkor erratum
> #E1003
> arm64: mm: Rename post_ttbr0_update_workaround
> arm64: mm: Fix and re-enable ARM64_SW_TTBR0_PAN
> arm64: mm: Allocate ASIDs in pairs
> arm64: mm: Add arm64_kernel_unmapped_at_el0 helper
> arm64: mm: Invalidate both kernel and user ASIDs when performing TLBI
> arm64: entry: Add exception trampoline page for exceptions from EL0
> arm64: mm: Map entry trampoline into trampoline and kernel page tables
> arm64: entry: Explicitly pass exception level to kernel_ventry macro
> arm64: entry: Hook up entry trampoline to exception vectors
> arm64: erratum: Work around Falkor erratum #E1003 in trampoline code
> arm64: tls: Avoid unconditional zeroing of tpidrro_el0 for native
> tasks
> arm64: entry: Add fake CPU feature for unmapping the kernel at EL0
> arm64: Kconfig: Add CONFIG_UNMAP_KERNEL_AT_EL0
> perf: arm_spe: Fail device probe when arm64_kernel_unmapped_at_el0()
> arm64: mm: Introduce TTBR_ASID_MASK for getting at the ASID in the
> TTBR
> arm64: kaslr: Put kernel vectors address in separate data page
>
> arch/arm64/Kconfig | 30 +++--
> arch/arm64/include/asm/asm-uaccess.h | 26 ++--
> arch/arm64/include/asm/assembler.h | 27 +----
> arch/arm64/include/asm/cpucaps.h | 3 +-
> arch/arm64/include/asm/fixmap.h | 5 +
> arch/arm64/include/asm/kernel-pgtable.h | 12 +-
> arch/arm64/include/asm/mmu.h | 11 ++
> arch/arm64/include/asm/mmu_context.h | 9 +-
> arch/arm64/include/asm/pgtable-hwdef.h | 1 +
> arch/arm64/include/asm/pgtable-prot.h | 21 +++-
> arch/arm64/include/asm/pgtable.h | 1 +
> arch/arm64/include/asm/proc-fns.h | 6 -
> arch/arm64/include/asm/tlbflush.h | 16 ++-
> arch/arm64/include/asm/uaccess.h | 21 +++-
> arch/arm64/kernel/asm-offsets.c | 6 +-
> arch/arm64/kernel/cpufeature.c | 41 +++++++
> arch/arm64/kernel/entry.S | 203 +++++++++++++++++++++++++++-----
> arch/arm64/kernel/process.c | 12 +-
> arch/arm64/kernel/vmlinux.lds.S | 40 ++++++-
> arch/arm64/lib/clear_user.S | 2 +-
> arch/arm64/lib/copy_from_user.S | 2 +-
> arch/arm64/lib/copy_in_user.S | 2 +-
> arch/arm64/lib/copy_to_user.S | 2 +-
> arch/arm64/mm/cache.S | 2 +-
> arch/arm64/mm/context.c | 36 +++---
> arch/arm64/mm/mmu.c | 31 +++++
> arch/arm64/mm/proc.S | 12 +-
> arch/arm64/xen/hypercall.S | 2 +-
> drivers/perf/arm_spe_pmu.c | 9 ++
> 29 files changed, 454 insertions(+), 137 deletions(-)
>
On Thu, Dec 07, 2017 at 04:40:05PM -0800, Laura Abbott wrote:
> On 12/06/2017 04:35 AM, Will Deacon wrote:
> >Hi everybody,
> >
> >This is version three of the patches formerly known as KAISER (♔).
> >
> > v1: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-November/542751.html
> > v2: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-November/544817.html
> >
> >Changes since v2 include:
> >
> > * Rename command-line option from "kaiser=" to "kpti=" for parity with x86
> > * Fixed Falkor erratum workaround (missing '~')
> > * Moved vectors base from literal pool into separate data page
> > * Added TTBR_ASID_MASK instead of open-coded constants
> > * Added missing newline to error message
> > * Fail to probe SPE if KPTI is enabled
> > * Addressed minor review comments
> > * Added tags
> > * Based on -rc2
> >
> >Patches are also pushed here:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
> >
> >Feedback and testing welcome. At this point, I'd like to start thinking
> >about getting this merged for 4.16.
> >
>
> You can add
>
> Tested-by: Laura Abbott <[email protected]>
Thanks, Laura!
Will
On Wed, Dec 06, 2017 at 12:35:19PM +0000, Will Deacon wrote:
> Patches are also pushed here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
>
> Feedback and testing welcome. At this point, I'd like to start thinking
> about getting this merged for 4.16.
For the record, the fixed up version was pushed by Will here:
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git kpti
and I queued it for 4.16 in the arm64 for-next/core branch (same tree as
above).
--
Catalin
On 12/11/2017 09:59 AM, Catalin Marinas wrote:
> On Wed, Dec 06, 2017 at 12:35:19PM +0000, Will Deacon wrote:
>> Patches are also pushed here:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
>>
>> Feedback and testing welcome. At this point, I'd like to start thinking
>> about getting this merged for 4.16.
>
> For the record, the fixed up version was pushed by Will here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git kpti
>
> and I queued it for 4.16 in the arm64 for-next/core branch (same tree as
> above).
Greg proposed the x86/KPTI patches for the stable-4.9.75 queue, is there
a plan to get the ARM64/KPTI patches backported towards stable trees as
well?
Thanks!
--
Florian
On Wed, Jan 03, 2018 at 09:17:26PM -0800, Florian Fainelli wrote:
> On 12/11/2017 09:59 AM, Catalin Marinas wrote:
> > On Wed, Dec 06, 2017 at 12:35:19PM +0000, Will Deacon wrote:
> >> Patches are also pushed here:
> >>
> >> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
> >>
> >> Feedback and testing welcome. At this point, I'd like to start thinking
> >> about getting this merged for 4.16.
> >
> > For the record, the fixed up version was pushed by Will here:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git kpti
> >
> > and I queued it for 4.16 in the arm64 for-next/core branch (same tree as
> > above).
>
> Greg proposed the x86/KPTI patches for the stable-4.9.75 queue, is there
> a plan to get the ARM64/KPTI patches backported towards stable trees as
> well?
Stable tree patches have to get into Linus's tree first before I can do
anything :)
Anyway, once that happens, yes, there is a plan, but it's a bit
"different", and I'll talk about it once these are merged.
thanks,
greg k-h
On 01/03/2018 10:50 PM, Greg Kroah-Hartman wrote:
> On Wed, Jan 03, 2018 at 09:17:26PM -0800, Florian Fainelli wrote:
>> On 12/11/2017 09:59 AM, Catalin Marinas wrote:
>>> On Wed, Dec 06, 2017 at 12:35:19PM +0000, Will Deacon wrote:
>>>> Patches are also pushed here:
>>>>
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
>>>>
>>>> Feedback and testing welcome. At this point, I'd like to start thinking
>>>> about getting this merged for 4.16.
>>>
>>> For the record, the fixed up version was pushed by Will here:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git kpti
>>>
>>> and I queued it for 4.16 in the arm64 for-next/core branch (same tree as
>>> above).
>>
>> Greg proposed the x86/KPTI patches for the stable-4.9.75 queue, is there
>> a plan to get the ARM64/KPTI patches backported towards stable trees as
>> well?
>
> Stable tree patches have to get into Linus's tree first before I can do
> anything :)
>
> Anyway, once that happens, yes, there is a plan, but it's a bit
> "different", and I'll talk about it once these are merged.
Great, thanks! Bonus question, if someone is using any of the affected
devices in AArch32, should we be expecting to see ARM/Linux changes as
well, that is, is there a plan to come up with a kpti implementation for
ARM?
--
Florian
On Thu, Jan 04, 2018 at 10:23:40AM -0800, Florian Fainelli wrote:
> Great, thanks! Bonus question, if someone is using any of the affected
> devices in AArch32, should we be expecting to see ARM/Linux changes as
> well, that is, is there a plan to come up with a kpti implementation for
> ARM?
Given what little information there is, I've been trying today to see
whether I can detect whether a userspace address is cached or uncached
- the results suggest that I have code that works with an error rate of
between 2 and 20 in 10000 in a 32-bit VM on Cortex A72. Whether that
translates to Cortex A15, I don't know yet - I need a working Cortex
A15 platform for that. Unfortunately, my only Cortex A15 platform does
not setup the architected timer, and so the kernel doesn't make it
available to userspace. I will be raising this with those concerned on
Monday, in the hope of getting it resolved.
Based on this and the information on developer.arm.com, my gut feeling
is that 32-bit kernels running on a CPU with an architected timer _or_
with some other high resolution timer available to non-privileged
userspace are likely to be vulnerable in some way, as it seems to be
possible to measure whether a specific load results in data being
sourced from the cache or from memory.
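For illustration only, that kind of userspace probe boils down to timing a
single load against a calibrated threshold, along the lines of the sketch
below; the timer source (clock_gettime() here, as a portable stand-in for the
architected timer) and the threshold are platform-specific, and this is not
the actual test code referred to above.

#include <stdint.h>
#include <time.h>

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Returns non-zero if the load from *p completes faster than the threshold,
 * i.e. it was probably served from the cache rather than from memory. */
static int load_looks_cached(const volatile uint8_t *p, uint64_t threshold_ns)
{
	uint64_t t0, t1;

	t0 = now_ns();
	(void)*p;
	t1 = now_ns();

	return (t1 - t0) < threshold_ns;
}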
That all said, what I read about Chrome OS is that Google believes
this isn't exploitable - which seems to contradict the information ARM
have published. I'm not sure what the reasoning is there; maybe there's
just no working exploit yet.
So, the message to take away is that 32-bit kernels are rather behind
on this issue, there are no known patches in development, and it is
not really known whether there is an exploitable problem for 32-bit
kernels or not.
Not really where I'd like 32-bit kernels to be.
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
On Thu, Jan 04, 2018 at 10:23:40AM -0800, Florian Fainelli wrote:
> On 01/03/2018 10:50 PM, Greg Kroah-Hartman wrote:
> > On Wed, Jan 03, 2018 at 09:17:26PM -0800, Florian Fainelli wrote:
> >> On 12/11/2017 09:59 AM, Catalin Marinas wrote:
> >>> On Wed, Dec 06, 2017 at 12:35:19PM +0000, Will Deacon wrote:
> >>>> Patches are also pushed here:
> >>>>
> >>>> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
> >>>>
> >>>> Feedback and testing welcome. At this point, I'd like to start thinking
> >>>> about getting this merged for 4.16.
> >>>
> >>> For the record, the fixed up version was pushed by Will here:
> >>>
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git kpti
> >>>
> >>> and I queued it for 4.16 in the arm64 for-next/core branch (same tree as
> >>> above).
> >>
> >> Greg proposed the x86/KPTI patches for the stable-4.9.75 queue, is there
> >> a plan to get the ARM64/KPTI patches backported towards stable trees as
> >> well?
> >
> > Stable tree patches have to get into Linus's tree first before I can do
> > anything :)
> >
> > Anyway, once that happens, yes, there is a plan, but it's a bit
> > "different", and I'll talk about it once these are merged.
>
> Great, thanks! Bonus question, if someone is using any of the affected
> devices in AArch32, should we be expecting to see ARM/Linux changes as
> well, that is, is there a plan to come up with a kpti implementation for
> ARM?
I have not heard of anyone working on this for any arm32 platforms,
as of this time, sorry.
Which makes me worry about my android tv, glad I don't connect it to the
network :(
thanks,
greg k-h
On 5 January 2018 at 16:06, Greg Kroah-Hartman
<[email protected]> wrote:
> On Thu, Jan 04, 2018 at 10:23:40AM -0800, Florian Fainelli wrote:
>> On 01/03/2018 10:50 PM, Greg Kroah-Hartman wrote:
>> > On Wed, Jan 03, 2018 at 09:17:26PM -0800, Florian Fainelli wrote:
>> >> On 12/11/2017 09:59 AM, Catalin Marinas wrote:
>> >>> On Wed, Dec 06, 2017 at 12:35:19PM +0000, Will Deacon wrote:
>> >>>> Patches are also pushed here:
>> >>>>
>> >>>> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git kpti
>> >>>>
>> >>>> Feedback and testing welcome. At this point, I'd like to start thinking
>> >>>> about getting this merged for 4.16.
>> >>>
>> >>> For the record, the fixed up version was pushed by Will here:
>> >>>
>> >>> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git kpti
>> >>>
>> >>> and I queued it for 4.16 in the arm64 for-next/core branch (same tree as
>> >>> above).
>> >>
>> >> Greg proposed the x86/KPTI patches for the stable-4.9.75 queue, is there
>> >> a plan to get the ARM64/KPTI patches backported towards stable trees as
>> >> well?
>> >
>> > Stable tree patches have to get into Linus's tree first before I can do
>> > anything :)
>> >
>> > Anyway, once that happens, yes, there is a plan, but it's a bit
>> > "different", and I'll talk about it once these are merged.
>>
>> Great, thanks! Bonus question, if someone is using any of the affected
>> devices in AArch32, should we be expecting to see ARM/Linux changes as
>> well, that is, is there a plan to come up with a kpti implementation for
>> ARM?
>
> I have not heard of anyone working on this for any arm32 platforms,
> as of this time, sorry.
>
> Which makes me worry about my android tv, glad I don't connect it to the
> network :(
>
The only ARM variant that is currently known to be affected by
Meltdown/variant 3 (which is what KPTI addresses) is the Cortex-A75,
which is a 64-bit core. That still means 32-bit guests running under
KVM will be affected, as well as a 32-bit kernel running on the bare
metal, but in practice, 32-bit ARM simply doesn't need KPTI. (My KASLR
patches for ARM are a bit in limbo atm, but those would benefit from
unmapping the kernel while running in userland as well)
As for variants 1/2 aka Spectre, I suppose ARM will need to implement
the same nospec/retpoline primitives that are being proposed for other
arches, but that work is not as fleshed out yet.
Hi Will,
On 2017/12/6 20:35, Will Deacon wrote:
> config ARM64_SW_TTBR0_PAN
> bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
> - depends on BROKEN # Temporary while switch_mm is reworked
> help
> Enabling this option prevents the kernel from accessing
> user-space memory directly by pointing TTBR0_EL1 to a reserved
I have a question not related to this patch itself, but to ARM64_SW_TTBR0_PAN:
What is ARM64_SW_TTBR0_PAN used for? I mean, if the hardware supports PAN, do we
still need SW_TTBR0_PAN?
And if the hardware does not support PAN, is SW_TTBR0_PAN a *must* option, or
is there a security risk without it?
Thanks
Yisheng Xie
Hi Will,
On 2017/12/6 20:35, Will Deacon wrote:
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> +static int __init map_entry_trampoline(void)
> +{
> + extern char __entry_tramp_text_start[];
> +
> + pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
> + phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
> +
> + /* The trampoline is always mapped and can therefore be global */
> + pgprot_val(prot) &= ~PTE_NG;
> +
> + /* Map only the text into the trampoline page table */
> + memset(tramp_pg_dir, 0, PGD_SIZE);
> + __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
> + prot, pgd_pgtable_alloc, 0);
How is tramp_pg_dir used? Should it be set in ttbr1 when exiting the kernel?
Sorry, I could not find where it is used.
Thanks
Yisheng
> +
> + /* ...as well as the kernel page table */
> + __set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
> + return 0;
> +}
> +core_initcall(map_entry_trampoline);
> +#endif
> +
> /*
> * Create fine-grained mappings for the kernel.
> */
>
On Tue, Jan 23, 2018 at 04:28:45PM +0800, Yisheng Xie wrote:
> On 2017/12/6 20:35, Will Deacon wrote:
> > +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> > +static int __init map_entry_trampoline(void)
> > +{
> > + extern char __entry_tramp_text_start[];
> > +
> > + pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
> > + phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
> > +
> > + /* The trampoline is always mapped and can therefore be global */
> > + pgprot_val(prot) &= ~PTE_NG;
> > +
> > + /* Map only the text into the trampoline page table */
> > + memset(tramp_pg_dir, 0, PGD_SIZE);
> > + __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
> > + prot, pgd_pgtable_alloc, 0);
>
> How is tramp_pg_dir used? Should it be set in ttbr1 when exiting the kernel?
> Sorry, I could not find where it is used.
Yes, that's what happens when we return to userspace. The code is a little
convoluted, but the tramp_pg_dir is placed at a fixed offset from swapper
(see the linker script) so the sub instruction in tramp_unmap_kernel is what
gives us the ttbr1 value we need.
Will
Hi Will,
On 2018/1/23 18:04, Will Deacon wrote:
> On Tue, Jan 23, 2018 at 04:28:45PM +0800, Yisheng Xie wrote:
>> On 2017/12/6 20:35, Will Deacon wrote:
>>> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
>>> +static int __init map_entry_trampoline(void)
>>> +{
>>> + extern char __entry_tramp_text_start[];
>>> +
>>> + pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
>>> + phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
>>> +
>>> + /* The trampoline is always mapped and can therefore be global */
>>> + pgprot_val(prot) &= ~PTE_NG;
>>> +
>>> + /* Map only the text into the trampoline page table */
>>> + memset(tramp_pg_dir, 0, PGD_SIZE);
>>> + __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
>>> + prot, pgd_pgtable_alloc, 0);
>>
>> How is tramp_pg_dir used? Should it be set in ttbr1 when exiting the kernel?
>> Sorry, I could not find where it is used.
>
> Yes, that's what happens when we return to userspace. The code is a little
> convoluted, but the tramp_pg_dir is placed at a fixed offset from swapper
> (see the linker script) so the sub instruction in tramp_unmap_kernel is what
> gives us the ttbr1 value we need.
Oh, I missed that. Maybe an inline comment would make it easier to understand.
Thanks once more for your help and the explanation :)
Thanks
Yisheng
>
> Will
>
>