Hi James,
Booting stable kernels <= v5.4 on arm64 with CONFIG_HARDEN_BRANCH_PREDICTOR=n
results in a NULL pointer dereference during boot due to kvm_get_hyp_vector()
dereferencing a NULL pointer from arm64_get_bp_hardening_data():
[ 2.384444] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 2.384461] pstate: 20400085 (nzCv daIf +PAN -UAO)
[ 2.384472] pc : cpu_hyp_reinit+0x114/0x30c
[ 2.384476] lr : cpu_hyp_reinit+0x80/0x30c
[ 2.384477] sp : ffff800010013e20
[ 2.384479] x29: ffff800010013e20 x28: ffff0000df033f00
[ 2.384484] x27: ffff800011caa000 x26: ffff0000de58e000
[ 2.384487] x25: ffff800011e57000 x24: ffff800011caad90
[ 2.384490] x23: ffff800011c778e0 x22: ffff800011a56000
[ 2.384494] x21: ffff800011a56000 x20: 0000b471121a6000
[ 2.384497] x19: 00000000de588000 x18: 0000000000000000
[ 2.384500] x17: 0000000000000000 x16: 0000000000000400
[ 2.384503] x15: 0000000000001000 x14: 00000000000007f2
[ 2.384507] x13: 0000000000000004 x12: 0000000000000002
[ 2.384510] x11: 0000000000000000 x10: 00000000121a6000
[ 2.384513] x9 : ffff800011c77fd8 x8 : 0000000000000010
[ 2.384516] x7 : 4f422effff306b64 x6 : 0000008080000000
[ 2.384519] x5 : 0000008080000000 x4 : ffff800011fabc94
[ 2.384522] x3 : ffff800011f5bde0 x2 : 0000000000000001
[ 2.384526] x1 : 00000000121a0000 x0 : ffff8000118f962d
[ 2.384529] Call trace:
[ 2.384533] cpu_hyp_reinit+0x114/0x30c
[ 2.384537] _kvm_arch_hardware_enable+0x30/0x54
[ 2.384541] flush_smp_call_function_queue+0xe4/0x154
[ 2.384544] generic_smp_call_function_single_interrupt+0x10/0x18
[ 2.384549] ipi_handler+0x170/0x2b0
[ 2.384555] handle_percpu_devid_fasteoi_ipi+0x120/0x1cc
[ 2.384560] __handle_domain_irq+0x9c/0xf4
[ 2.384563] gic_handle_irq+0x6c/0xe4
[ 2.384566] el1_irq+0xf0/0x1c0
[ 2.384570] arch_cpu_idle+0x28/0x44
[ 2.384574] do_idle+0x100/0x2a8
[ 2.384577] cpu_startup_entry+0x20/0x24
[ 2.384581] secondary_start_kernel+0x1b0/0x1cc
[ 2.384589] Code: b9469d08 7100011f 540003ad 52800208 (f9400108)
[ 2.384600] ---[ end trace 266d08dbf96ff143 ]---
[ 2.385171] Kernel panic - not syncing: Fatal exception in interrupt
The faulting load is the access to 'data->template_start', for which the
compiler emits the hilarious:
mov w8, #0x10
ldr x8, [x8]
I can bodge this as below (untested), but it's pretty grotty. Please can you
take a look?
Cheers,
Will
--->8
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 33b33416fea4..3bc1380f5524 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -939,6 +939,7 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
.cpu_enable = cpu_enable_ssbd_mitigation,
.midr_range_list = arm64_ssb_cpus,
},
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
{
.desc = "Spectre-BHB",
.capability = ARM64_SPECTRE_BHB,
@@ -946,6 +947,7 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
.matches = is_spectre_bhb_affected,
.cpu_enable = spectre_bhb_enable_mitigation,
},
+#endif
#ifdef CONFIG_ARM64_ERRATUM_1418040
{
.desc = "ARM erratum 1418040",
Hi Will,
On 08/04/2022 13:00, Will Deacon wrote:
> Booting stable kernels <= v5.4 on arm64 with CONFIG_HARDEN_BRANCH_PREDICTOR=n
> results in a NULL pointer dereference during boot due to kvm_get_hyp_vector()
> dereferencing a NULL pointer from arm64_get_bp_hardening_data():
>
> [ 2.384444] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [ 2.384461] pstate: 20400085 (nzCv daIf +PAN -UAO)
> [ 2.384472] pc : cpu_hyp_reinit+0x114/0x30c
> [ 2.384476] lr : cpu_hyp_reinit+0x80/0x30c
> [ 2.384477] sp : ffff800010013e20
> [ 2.384479] x29: ffff800010013e20 x28: ffff0000df033f00
> [ 2.384484] x27: ffff800011caa000 x26: ffff0000de58e000
> [ 2.384487] x25: ffff800011e57000 x24: ffff800011caad90
> [ 2.384490] x23: ffff800011c778e0 x22: ffff800011a56000
> [ 2.384494] x21: ffff800011a56000 x20: 0000b471121a6000
> [ 2.384497] x19: 00000000de588000 x18: 0000000000000000
> [ 2.384500] x17: 0000000000000000 x16: 0000000000000400
> [ 2.384503] x15: 0000000000001000 x14: 00000000000007f2
> [ 2.384507] x13: 0000000000000004 x12: 0000000000000002
> [ 2.384510] x11: 0000000000000000 x10: 00000000121a6000
> [ 2.384513] x9 : ffff800011c77fd8 x8 : 0000000000000010
> [ 2.384516] x7 : 4f422effff306b64 x6 : 0000008080000000
> [ 2.384519] x5 : 0000008080000000 x4 : ffff800011fabc94
> [ 2.384522] x3 : ffff800011f5bde0 x2 : 0000000000000001
> [ 2.384526] x1 : 00000000121a0000 x0 : ffff8000118f962d
> [ 2.384529] Call trace:
> [ 2.384533] cpu_hyp_reinit+0x114/0x30c
> [ 2.384537] _kvm_arch_hardware_enable+0x30/0x54
> [ 2.384541] flush_smp_call_function_queue+0xe4/0x154
> [ 2.384544] generic_smp_call_function_single_interrupt+0x10/0x18
> [ 2.384549] ipi_handler+0x170/0x2b0
> [ 2.384555] handle_percpu_devid_fasteoi_ipi+0x120/0x1cc
> [ 2.384560] __handle_domain_irq+0x9c/0xf4
> [ 2.384563] gic_handle_irq+0x6c/0xe4
> [ 2.384566] el1_irq+0xf0/0x1c0
> [ 2.384570] arch_cpu_idle+0x28/0x44
> [ 2.384574] do_idle+0x100/0x2a8
> [ 2.384577] cpu_startup_entry+0x20/0x24
> [ 2.384581] secondary_start_kernel+0x1b0/0x1cc
> [ 2.384589] Code: b9469d08 7100011f 540003ad 52800208 (f9400108)
> [ 2.384600] ---[ end trace 266d08dbf96ff143 ]---
> [ 2.385171] Kernel panic - not syncing: Fatal exception in interrupt
Yikes!
Interesting to know that stuff behind CONFIG_EXPERT has someone who cares about it.
(I was going to propose dropping the Kconfig option after a while).
> The faulting load is the access to 'data->template_start', for which the
> compiler emits the hilarious:
>
> mov w8, #0x10
> ldr x8, [x8]
This is nasal-daemon territory, so I guess the compiler can do whatever it likes.
> I can bodge this as below (untested), but it's pretty grotty.
I wanted to keep the detection code even if the feature is disabled so the sysfs reporting
is always correct.
> Please can you take a look?
Ugh, arm64_get_bp_hardening_data() returns NULL with that Kconfig setup.
For v5.4, this fixes it for me:
--------------------%<--------------------
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 78d110667c0c..ffe0aad96b17 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -479,7 +479,8 @@ static inline void *kvm_get_hyp_vector(void)
int slot = -1;
if ((cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) ||
- cpus_have_const_cap(ARM64_SPECTRE_BHB)) && data->template_start) {
+ cpus_have_const_cap(ARM64_SPECTRE_BHB)) &&
+ data && data->template_start) {
vect = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs_start));
slot = data->hyp_vectors_slot;
}
--------------------%<--------------------
I'll check the other versions and post patches to the stable list. Earlier stable
backports grew a dependency between these features as it wasn't possible to unpick the
dependencies.
Thanks,
James
Hi Will,
On 08/04/2022 17:21, Will Deacon wrote:
> On Fri, Apr 08, 2022 at 05:08:00PM +0100, James Morse wrote:
>> On 08/04/2022 13:00, Will Deacon wrote:
>>> Booting stable kernels <= v5.4 on arm64 with CONFIG_HARDEN_BRANCH_PREDICTOR=n
>>> results in a NULL pointer dereference during boot due to kvm_get_hyp_vector()
>>> dereferencing a NULL pointer from arm64_get_bp_hardening_data():
>>>
>>> [ 2.384444] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>>> [ 2.384461] pstate: 20400085 (nzCv daIf +PAN -UAO)
>>> [ 2.384472] pc : cpu_hyp_reinit+0x114/0x30c
>>> [ 2.384476] lr : cpu_hyp_reinit+0x80/0x30c
>>> [ 2.385171] Kernel panic - not syncing: Fatal exception in interrupt
>>
>> Yikes!
>>
>> Interesting to know that stuff behind CONFIG_EXPERT has someone who cares about it.
>> (I was going to propose dropping the Kconfig option after a while).
> Yup! FWIW, the hardening options are enabled in Android (GKI), but this was
> reported to us externally by somebody using a custom config.
>>> I can bodge this as below (untested), but it's pretty grotty.
>>
>> I wanted to keep the detection code even if the feature is disabled so the sysfs reporting
>> is always correct.
>
> Makes sense. Another option is to check for ARM64_HARDEN_BRANCH_PREDICTOR in
> spectre_bhb_enable_mitigation(), but then I think the KVM code would need
> to query the mitigation state rather than just the cap.
It already does, but as you say KVM is only using the cap here.
>>> Please can you take a look?
>>
>> Ugh, arm64_get_bp_hardening_data() returns NULL with that Kconfig setup.
>>
>>
>> For v5.4, this fixes it for me:
>> --------------------%<--------------------
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 78d110667c0c..ffe0aad96b17 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -479,7 +479,8 @@ static inline void *kvm_get_hyp_vector(void)
>> int slot = -1;
>>
>> if ((cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) ||
>> - cpus_have_const_cap(ARM64_SPECTRE_BHB)) && data->template_start) {
>> + cpus_have_const_cap(ARM64_SPECTRE_BHB)) &&
>> + data && data->template_start) {
>> vect = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs_start));
>> slot = data->hyp_vectors_slot;
>> }
>
> That'll work, but will sysfs report that BHB is mitigated even if
> !ARM64_HARDEN_BRANCH_PREDICTOR?
The (!IS_ENABLED(CONFIG_HARDEN_BRANCH_PREDICTOR) in check_branch_predictor() will set
__hardenbp_enab to false, which get_spectre_v2_workaround_state() picks up and causes
spectre_bhb_enable_mitigation() to skip all the mitigations, leaving state =
SPECTRE_VULNERABLE.
(The interactions with the other Spectre mitigations across the stable kernels
were/continue-to-be a pain in the neck)
>> --------------------%<--------------------
>>
>> I'll check the other versions and post patches to the stable list. Earlier stable
>> backports grew a dependency between these features as it wasn't possible to unpick the
>> dependencies.
>
> Cheers. I know 4.19 is busted too, but I didn't check 4.14.
Yup, I've just reproduced that one. I suspect v4.14 is where I added the Kconfig dependency.
Thanks,
James
Hi James,
On Fri, Apr 08, 2022 at 05:08:00PM +0100, James Morse wrote:
> On 08/04/2022 13:00, Will Deacon wrote:
> > Booting stable kernels <= v5.4 on arm64 with CONFIG_HARDEN_BRANCH_PREDICTOR=n
> > results in a NULL pointer dereference during boot due to kvm_get_hyp_vector()
> > dereferencing a NULL pointer from arm64_get_bp_hardening_data():
> >
> > [ 2.384444] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> > [ 2.384461] pstate: 20400085 (nzCv daIf +PAN -UAO)
> > [ 2.384472] pc : cpu_hyp_reinit+0x114/0x30c
> > [ 2.384476] lr : cpu_hyp_reinit+0x80/0x30c
> > [ 2.384477] sp : ffff800010013e20
> > [ 2.384479] x29: ffff800010013e20 x28: ffff0000df033f00
> > [ 2.384484] x27: ffff800011caa000 x26: ffff0000de58e000
> > [ 2.384487] x25: ffff800011e57000 x24: ffff800011caad90
> > [ 2.384490] x23: ffff800011c778e0 x22: ffff800011a56000
> > [ 2.384494] x21: ffff800011a56000 x20: 0000b471121a6000
> > [ 2.384497] x19: 00000000de588000 x18: 0000000000000000
> > [ 2.384500] x17: 0000000000000000 x16: 0000000000000400
> > [ 2.384503] x15: 0000000000001000 x14: 00000000000007f2
> > [ 2.384507] x13: 0000000000000004 x12: 0000000000000002
> > [ 2.384510] x11: 0000000000000000 x10: 00000000121a6000
> > [ 2.384513] x9 : ffff800011c77fd8 x8 : 0000000000000010
> > [ 2.384516] x7 : 4f422effff306b64 x6 : 0000008080000000
> > [ 2.384519] x5 : 0000008080000000 x4 : ffff800011fabc94
> > [ 2.384522] x3 : ffff800011f5bde0 x2 : 0000000000000001
> > [ 2.384526] x1 : 00000000121a0000 x0 : ffff8000118f962d
> > [ 2.384529] Call trace:
> > [ 2.384533] cpu_hyp_reinit+0x114/0x30c
> > [ 2.384537] _kvm_arch_hardware_enable+0x30/0x54
> > [ 2.384541] flush_smp_call_function_queue+0xe4/0x154
> > [ 2.384544] generic_smp_call_function_single_interrupt+0x10/0x18
> > [ 2.384549] ipi_handler+0x170/0x2b0
> > [ 2.384555] handle_percpu_devid_fasteoi_ipi+0x120/0x1cc
> > [ 2.384560] __handle_domain_irq+0x9c/0xf4
> > [ 2.384563] gic_handle_irq+0x6c/0xe4
> > [ 2.384566] el1_irq+0xf0/0x1c0
> > [ 2.384570] arch_cpu_idle+0x28/0x44
> > [ 2.384574] do_idle+0x100/0x2a8
> > [ 2.384577] cpu_startup_entry+0x20/0x24
> > [ 2.384581] secondary_start_kernel+0x1b0/0x1cc
> > [ 2.384589] Code: b9469d08 7100011f 540003ad 52800208 (f9400108)
> > [ 2.384600] ---[ end trace 266d08dbf96ff143 ]---
> > [ 2.385171] Kernel panic - not syncing: Fatal exception in interrupt
>
> Yikes!
>
> Interesting to know that stuff behind CONFIG_EXPERT has someone who cares about it.
> (I was going to propose dropping the Kconfig option after a while).
Yup! FWIW, the hardening options are enabled in Android (GKI), but this was
reported to us externally by somebody using a custom config.
> > I can bodge this as below (untested), but it's pretty grotty.
>
> I wanted to keep the detection code even if the feature is disabled so the sysfs reporting
> is always correct.
Makes sense. Another option is to check for ARM64_HARDEN_BRANCH_PREDICTOR in
spectre_bhb_enable_mitigation(), but then I think the KVM code would need
to query the mitigation state rather than just the cap.
> > Please can you take a look?
>
> Ugh, arm64_get_bp_hardening_data() returns NULL with that Kconfig setup.
>
>
> For v5.4, this fixes it for me:
> --------------------%<--------------------
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 78d110667c0c..ffe0aad96b17 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -479,7 +479,8 @@ static inline void *kvm_get_hyp_vector(void)
> int slot = -1;
>
> if ((cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) ||
> - cpus_have_const_cap(ARM64_SPECTRE_BHB)) && data->template_start) {
> + cpus_have_const_cap(ARM64_SPECTRE_BHB)) &&
> + data && data->template_start) {
> vect = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs_start));
> slot = data->hyp_vectors_slot;
> }
That'll work, but will sysfs report that BHB is mitigated even if
!ARM64_HARDEN_BRANCH_PREDICTOR?
> --------------------%<--------------------
>
> I'll check the other versions and post patches to the stable list. Earlier stable
> backports grew a dependency between these features as it wasn't possible to unpick the
> dependencies.
Cheers. I know 4.19 is busted too, but I didn't check 4.14.
Will