Explicitly set X86_FEATURE_OSPKE via set_cpu_cap() instead of calling
get_cpu_cap() to pull the feature bit from CPUID after enabling CR4.PKE.
Invoking get_cpu_cap() effectively wipes out any {set,clear}_cpu_cap()
changes that were made between this_cpu->c_init() and setup_pku(), as
all non-synthetic feature words are reinitialized from the CPU's CPUID
values.
Blasting away capability updates manifests most visibility when running
on a VMX capable CPU, but with VMX disabled by BIOS. To indicate that
VMX is disabled, init_ia32_feat_ctl() clears X86_FEATURE_VMX, using
clear_cpu_cap() instead of setup_clear_cpu_cap() so that KVM can report
which CPU is misconfigured (KVM needs to probe every CPU anyways).
Restoring X86_FEATURE_VMX from CPUID causes KVM to think VMX is enabled,
ultimately leading to an unexpected #GP when KVM attempts to do VMXON.
Arguably, init_ia32_feat_ctl() should use setup_clear_cpu_cap() and let
KVM figure out a different way to report the misconfigured CPU, but VMX
is not the only feature bit that is affected, i.e. there is precedent
that tweaking feature bits via {set,clear}_cpu_cap() after ->c_init()
is expected to work. Most notably, x86_init_rdrand()'s clearing of
X86_FEATURE_RDRAND when RDRAND malfunctions is also overwritten.
Fixes: 0697694564c8 ("x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU")
Cc: [email protected]
Cc: Dave Hansen <[email protected]>
Reported-by: Jacob Keller <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
I considered adding a WARN_ON check of cpuid_ecx(7) to ensure OSPKE is set
and bail if it's not, but that seemed like useless paranoia.
I don't love that using clear_cpu_cap() is a bit fragile, especially where
init_ia32_feat_ctl() is concerned, but the existing RDRAND usage, which
predates setup_pku(), makes a pretty strong argument that setup_pku() is
in the wrong.
The other potential issue is late microcode updates, which refreshes the
feature bits from CPUID, but at least in that case the kernel will print a
warning if something changed.
arch/x86/kernel/cpu/common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 52c9bfbbdb2a..4cdb123ff66a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -445,7 +445,7 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
* cpuid bit to be set. We need to ensure that we
* update that bit in this CPU's "cpu_info".
*/
- get_cpu_cap(c);
+ set_cpu_cap(c, X86_FEATURE_OSPKE);
}
#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
--
2.24.1
On 2/26/20 3:16 PM, Sean Christopherson wrote:
> Explicitly set X86_FEATURE_OSPKE via set_cpu_cap() instead of calling
> get_cpu_cap() to pull the feature bit from CPUID after enabling CR4.PKE.
First of all,
Acked-by: Dave Hansen <[email protected]>
I don't remember whether it was you or someone else inside Intel, but
someone was tripping over this recently.
I do think we need a bit more care in how we deal with dynamic CPUID
bits. I think you'd agree that it's a bit haphazard. For instance, I
went looking for where we set X86_FEATURE_OSXSAVE after the
cr4_set_bits(X86_CR4_OSXSAVE);
inside fpu__init_cpu_xstate() makes it appear. I couldn't find one, not
that we would notice since we suppress it from /proc/cpuinfo.
On 2/26/2020 3:32 PM, Dave Hansen wrote:
> On 2/26/20 3:16 PM, Sean Christopherson wrote:
>> Explicitly set X86_FEATURE_OSPKE via set_cpu_cap() instead of calling
>> get_cpu_cap() to pull the feature bit from CPUID after enabling CR4.PKE.
>
> First of all,
>
> Acked-by: Dave Hansen <[email protected]>
>
> I don't remember whether it was you or someone else inside Intel, but
> someone was tripping over this recently.
>
> I do think we need a bit more care in how we deal with dynamic CPUID
> bits. I think you'd agree that it's a bit haphazard. For instance, I
> went looking for where we set X86_FEATURE_OSXSAVE after the
>
> cr4_set_bits(X86_CR4_OSXSAVE);
>
> inside fpu__init_cpu_xstate() makes it appear. I couldn't find one, not
> that we would notice since we suppress it from /proc/cpuinfo.
>
I tripped over it. It's possible someone else has as well, but I haven't
heard from anyone else.
Thanks,
Jake
On 2/26/2020 3:16 PM, Sean Christopherson wrote:
> Explicitly set X86_FEATURE_OSPKE via set_cpu_cap() instead of calling
> get_cpu_cap() to pull the feature bit from CPUID after enabling CR4.PKE.
> Invoking get_cpu_cap() effectively wipes out any {set,clear}_cpu_cap()
> changes that were made between this_cpu->c_init() and setup_pku(), as
> all non-synthetic feature words are reinitialized from the CPU's CPUID
> values.
>
> Blasting away capability updates manifests most visibility when running
> on a VMX capable CPU, but with VMX disabled by BIOS. To indicate that
> VMX is disabled, init_ia32_feat_ctl() clears X86_FEATURE_VMX, using
> clear_cpu_cap() instead of setup_clear_cpu_cap() so that KVM can report
> which CPU is misconfigured (KVM needs to probe every CPU anyways).
> Restoring X86_FEATURE_VMX from CPUID causes KVM to think VMX is enabled,
> ultimately leading to an unexpected #GP when KVM attempts to do VMXON.
>
> Arguably, init_ia32_feat_ctl() should use setup_clear_cpu_cap() and let
> KVM figure out a different way to report the misconfigured CPU, but VMX
> is not the only feature bit that is affected, i.e. there is precedent
> that tweaking feature bits via {set,clear}_cpu_cap() after ->c_init()
> is expected to work. Most notably, x86_init_rdrand()'s clearing of
> X86_FEATURE_RDRAND when RDRAND malfunctions is also overwritten.
>
> Fixes: 0697694564c8 ("x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU")
> Cc: [email protected]
> Cc: Dave Hansen <[email protected]>
> Reported-by: Jacob Keller <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
>
I tested this and it resolves my report! Thanks for a timely fix.
I agree with the analysis. Perhaps it would make sense in the long term
to find a solution where get_cpu_cap can remember what was cleared for
each CPU and restore those? It already does this using the global
variables...
Thanks,
Jake
On Wed, Feb 26, 2020 at 04:16:13PM -0800, Jacob Keller wrote:
> I tested this and it resolves my report! Thanks for a timely fix.
Adding your Tested-by.
> I agree with the analysis. Perhaps it would make sense in the long term
> to find a solution where get_cpu_cap can remember what was cleared for
> each CPU and restore those? It already does this using the global
> variables...
get_cpu_cap() remembers if you use setup_force_cpu_cap() but I agree
that that whole feature bit handling is a bit, hm, "strange". It had its
raisins.
FWIW, we had started cleaning those up but then the security nightmares
happened so on the backburner it went...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 735a6dd02222d8d070c7bb748f25895239ca8c92
Gitweb: https://git.kernel.org/tip/735a6dd02222d8d070c7bb748f25895239ca8c92
Author: Sean Christopherson <[email protected]>
AuthorDate: Wed, 26 Feb 2020 15:16:15 -08:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Thu, 27 Feb 2020 19:02:45 +01:00
x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes
Explicitly set X86_FEATURE_OSPKE via set_cpu_cap() instead of calling
get_cpu_cap() to pull the feature bit from CPUID after enabling CR4.PKE.
Invoking get_cpu_cap() effectively wipes out any {set,clear}_cpu_cap()
changes that were made between this_cpu->c_init() and setup_pku(), as
all non-synthetic feature words are reinitialized from the CPU's CPUID
values.
Blasting away capability updates manifests most visibility when running
on a VMX capable CPU, but with VMX disabled by BIOS. To indicate that
VMX is disabled, init_ia32_feat_ctl() clears X86_FEATURE_VMX, using
clear_cpu_cap() instead of setup_clear_cpu_cap() so that KVM can report
which CPU is misconfigured (KVM needs to probe every CPU anyways).
Restoring X86_FEATURE_VMX from CPUID causes KVM to think VMX is enabled,
ultimately leading to an unexpected #GP when KVM attempts to do VMXON.
Arguably, init_ia32_feat_ctl() should use setup_clear_cpu_cap() and let
KVM figure out a different way to report the misconfigured CPU, but VMX
is not the only feature bit that is affected, i.e. there is precedent
that tweaking feature bits via {set,clear}_cpu_cap() after ->c_init()
is expected to work. Most notably, x86_init_rdrand()'s clearing of
X86_FEATURE_RDRAND when RDRAND malfunctions is also overwritten.
Fixes: 0697694564c8 ("x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU")
Reported-by: Jacob Keller <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Tested-by: Jacob Keller <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 52c9bfb..4cdb123 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -445,7 +445,7 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
* cpuid bit to be set. We need to ensure that we
* update that bit in this CPU's "cpu_info".
*/
- get_cpu_cap(c);
+ set_cpu_cap(c, X86_FEATURE_OSPKE);
}
#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
On 2/27/2020 10:01 AM, Borislav Petkov wrote:
> On Wed, Feb 26, 2020 at 04:16:13PM -0800, Jacob Keller wrote:
>> I tested this and it resolves my report! Thanks for a timely fix.
>
> Adding your Tested-by.
>
Yep, thanks.
>> I agree with the analysis. Perhaps it would make sense in the long term
>> to find a solution where get_cpu_cap can remember what was cleared for
>> each CPU and restore those? It already does this using the global
>> variables...
>
> get_cpu_cap() remembers if you use setup_force_cpu_cap() but I agree
> that that whole feature bit handling is a bit, hm, "strange". It had its
> raisins.
Right. Nothing quite equivalent to that for per-CPU changes though.
>
> FWIW, we had started cleaning those up but then the security nightmares
> happened so on the backburner it went...
>
> Thx.
>
Completely understandable. Especially since changes here are tricky to
get right, this being a case in point.
Thanks,
Jake