2024-04-11 14:45:09

by Xi Ruoyao

[permalink] [raw]
Subject: [PATCH v7 1/2] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode

Per the "Processor Specification Update" documentations referred by the
intel-microcode-20240312 release note, this microcode release has fixed
the issue for all affected models.

So don't disable INVLPG if the microcode is new enough. The precise
minimum microcode revision fixing the issue is provided by engineer from
Intel.

Cc: Dave Hansen <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Andrew Cooper <[email protected]>
Link: https://lore.kernel.org/all/168436059559.404.13934972543631851306.tip-bot2@tip-bot2/
Link: https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases/tag/microcode-20240312
Link: https://cdrdv2.intel.com/v1/dl/getContent/740518 # RPL042, rev. 13
Link: https://cdrdv2.intel.com/v1/dl/getContent/682436 # ADL063, rev. 24
Link: https://lore.kernel.org/all/20240325231300.qrltbzf6twm43ftb@desk/
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/x86/mm/init.c | 34 ++++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 679893ea5e68..c318cdc35467 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -261,33 +261,43 @@ static void __init probe_page_size_mask(void)
}
}

-#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
- .family = 6, \
- .model = _model, \
- }
+#define INTEL_MATCH(_model, _fixed_microcode) \
+ { \
+ .vendor = X86_VENDOR_INTEL, \
+ .family = 6, \
+ .model = _model, \
+ .driver_data = _fixed_microcode, \
+ }
+
/*
* INVLPG may not properly flush Global entries
- * on these CPUs when PCIDs are enabled.
+ * on these CPUs when PCIDs are enabled and the
+ * microcode is not updated to fix the issue.
*/
static const struct x86_cpu_id invlpg_miss_ids[] = {
- INTEL_MATCH(INTEL_FAM6_ALDERLAKE ),
- INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
- INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT ),
- INTEL_MATCH(INTEL_FAM6_RAPTORLAKE ),
- INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
- INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE, 0x2e),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L, 0x42c),
+ INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT, 0x11),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE, 0x118),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P, 0x4117),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S, 0x2e),
{}
};

static void setup_pcid(void)
{
+ const struct x86_cpu_id *invlpg_miss_match;
+
if (!IS_ENABLED(CONFIG_X86_64))
return;

if (!boot_cpu_has(X86_FEATURE_PCID))
return;

- if (x86_match_cpu(invlpg_miss_ids)) {
+ invlpg_miss_match = x86_match_cpu(invlpg_miss_ids);
+
+ if (invlpg_miss_match &&
+ boot_cpu_data.microcode < invlpg_miss_match->driver_data) {
pr_info("Incomplete global flushes, disabling PCID");
setup_clear_cpu_cap(X86_FEATURE_PCID);
return;
--
2.44.0



2024-04-11 14:45:30

by Xi Ruoyao

[permalink] [raw]
Subject: [PATCH v7 2/2] x86/mm: Don't disable INVLPG if the kernel is running on a hypervisor

The Intel erratum for "incomplete Global INVLPG flushes" says:

This erratum does not apply in VMX non-root operation. It applies
only when PCIDs are enabled and either in VMX root operation or
outside VMX operation.

So if the kernel is running in a hypervisor, we are in VMX non-root
operation and we should be safe to use INVLPG.

Cc: Dave Hansen <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Andrew Cooper <[email protected]>
Link: https://lore.kernel.org/all/168436059559.404.13934972543631851306.tip-bot2@tip-bot2/
Link: https://cdrdv2.intel.com/v1/dl/getContent/740518 # RPL042, rev. 13
Link: https://cdrdv2.intel.com/v1/dl/getContent/682436 # ADL063, rev. 24
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/x86/mm/init.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index c318cdc35467..e69d227ea123 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -296,7 +296,14 @@ static void setup_pcid(void)

invlpg_miss_match = x86_match_cpu(invlpg_miss_ids);

- if (invlpg_miss_match &&
+ /*
+ * The Intel errata claims: "this erratum does not apply in VMX
+ * non-root operation. It applies only when PCIDs are enabled
+ * and either in VMX root operation or outside VMX operation."
+ * So we are safe if we are surely running in a hypervisor.
+ */
+ if (!boot_cpu_has(X86_FEATURE_HYPERVISOR) &&
+ invlpg_miss_match &&
boot_cpu_data.microcode < invlpg_miss_match->driver_data) {
pr_info("Incomplete global flushes, disabling PCID");
setup_clear_cpu_cap(X86_FEATURE_PCID);
--
2.44.0


2024-04-11 17:42:09

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v7 1/2] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode

Don't disable *PCID*.

On Thu, Apr 11, 2024, Xi Ruoyao wrote:
> Per the "Processor Specification Update" documentations referred by the
> intel-microcode-20240312 release note, this microcode release has fixed
> the issue for all affected models.
>
> So don't disable INVLPG if the microcode is new enough. The precise

Same thing here. INVLPG is very much still a thing, it's only PCID that gets
disabled.

2024-04-11 18:14:49

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v7 2/2] x86/mm: Don't disable INVLPG if the kernel is running on a hypervisor

On 4/11/24 09:22, Sean Christopherson wrote:
> In other words, simply checking HYPERVISOR *might* be safe, but it might not.
> If we wanted to be paranoid, this could also check X86_FEATURE_VMX, which also
> doesn't guarantee VMX non-root mode and would unnecessarily restrict PCID usage
> to setups that allow nested VMX, but AFAIK there aren't any hypervisors which
> fully emulate VMX.

X86_FEATURE_HYPERVISOR is most commonly used for vulnerabilities to see
if the data coming out of CPUID is likely to be garbage or not. I think
that's the most important thing to focus on.

It's arguable that x86_match_cpu() itself should just have a:

/*
* Don't even waste our time when running under a hypervisor.
* They lie.
*/
if (boot_cpu_bas(X86_FEATURE_HYPERVISOR))
return NULL;

(well, it should probably actually be in the for() loop because folks
might be looking for an X86_FEATURE_* that is set by software or derived
from actually agreed-upon host<->guest ABI, but you get my point...)

If the hypervisor is duplicitous enough to keep X86_FEATURE_HYPERVISOR
from getting set, then the hypervisor gets to clean up the mess. The
kernel can just wash its hands of the whole thing.

So, there are two broad cases and a few sub-cases:

1. "Nice" hypervisor. Kernel sees X86_FEATURE_HYPERVISOR and knows that
x86_match_cpu() and invlpg_miss_ids[] are irrelevant because:
1a. It is running in VMX non-root mode and is not vulnerable, or
1b. CPUID is a lie and x86_match_cpu() is meaningless, or
1c. The kernel is in ring3 and can't execute INVLPG anyway. Whatever
is running in ring0 will have to deal with it.
2. X86_FEATURE_HYPERVISOR is unset.
2a. "Mean" hypervisor. All bets are off anyway.
2b. Actual bare metal. Actually look for the bug.

I _think_ I'm OK with skipping the mitigation in all of the #1 cases and
applying it in both of the #2 cases. I don't think that checking for
VMX makes it much better.

Am I missing anything?

2024-04-11 18:34:07

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v7 2/2] x86/mm: Don't disable INVLPG if the kernel is running on a hypervisor

s/INVLPG/PCID again

On Thu, Apr 11, 2024, Xi Ruoyao wrote:
> The Intel erratum for "incomplete Global INVLPG flushes" says:
>
> This erratum does not apply in VMX non-root operation. It applies
> only when PCIDs are enabled and either in VMX root operation or
> outside VMX operation.
>
> So if the kernel is running in a hypervisor, we are in VMX non-root

No, this is not strictly true. The HYPERVISOR flag only states that the kernel
is running as a guest, it doesn't say anything about what mode the guest is run
in. E.g. a fully PV guest running at CPL3 isn't in VMX non-root mode. Ditto for
a fully emulated environment.

Of course, in such a setup the hypervisor really shouldn't be advertising PCID
support, and I've no idea if Xen PV (or any other PV mode) even shoves guest PCIDs
into hardware, i.e. PCID might be emulated and thus not subject to the hardware
bug.

In other words, simply checking HYPERVISOR *might* be safe, but it might not.
If we wanted to be paranoid, this could also check X86_FEATURE_VMX, which also
doesn't guarantee VMX non-root mode and would unnecessarily restrict PCID usage
to setups that allow nested VMX, but AFAIK there aren't any hypervisors which
fully emulate VMX.

> operation and we should be safe to use INVLPG.

s/INVLPG/PCID

2024-04-16 23:22:07

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v7 2/2] x86/mm: Don't disable INVLPG if the kernel is running on a hypervisor

On Thu, Apr 11, 2024, Dave Hansen wrote:
> On 4/11/24 09:22, Sean Christopherson wrote:
> > In other words, simply checking HYPERVISOR *might* be safe, but it might not.
> > If we wanted to be paranoid, this could also check X86_FEATURE_VMX, which also
> > doesn't guarantee VMX non-root mode and would unnecessarily restrict PCID usage
> > to setups that allow nested VMX, but AFAIK there aren't any hypervisors which
> > fully emulate VMX.
>
> X86_FEATURE_HYPERVISOR is most commonly used for vulnerabilities to see
> if the data coming out of CPUID is likely to be garbage or not. I think
> that's the most important thing to focus on.
>
> It's arguable that x86_match_cpu() itself should just have a:
>
> /*
> * Don't even waste our time when running under a hypervisor.
> * They lie.
> */
> if (boot_cpu_bas(X86_FEATURE_HYPERVISOR))
> return NULL;
>
> (well, it should probably actually be in the for() loop because folks
> might be looking for an X86_FEATURE_* that is set by software or derived
> from actually agreed-upon host<->guest ABI, but you get my point...)
>
> If the hypervisor is duplicitous enough to keep X86_FEATURE_HYPERVISOR
> from getting set, then the hypervisor gets to clean up the mess. The
> kernel can just wash its hands of the whole thing.
>
> So, there are two broad cases and a few sub-cases:
>
> 1. "Nice" hypervisor. Kernel sees X86_FEATURE_HYPERVISOR and knows that
> x86_match_cpu() and invlpg_miss_ids[] are irrelevant because:
> 1a. It is running in VMX non-root mode and is not vulnerable, or
> 1b. CPUID is a lie and x86_match_cpu() is meaningless, or
> 1c. The kernel is in ring3 and can't execute INVLPG anyway. Whatever
> is running in ring0 will have to deal with it.
> 2. X86_FEATURE_HYPERVISOR is unset.
> 2a. "Mean" hypervisor. All bets are off anyway.
> 2b. Actual bare metal. Actually look for the bug.
>
> I _think_ I'm OK with skipping the mitigation in all of the #1 cases and
> applying it in both of the #2 cases. I don't think that checking for
> VMX makes it much better.
>
> Am I missing anything?

I'm a-ok with just checking HYPERVISOR, I agree that the hypervisor is fully
responsible for correctly emulating PCID and INVLPG stuff for (1c).

My reaction was really just to the changelog equating HYPERVSIOR to VMX non-root.