2024-04-11 10:49:09

by Xi Ruoyao

[permalink] [raw]
Subject: [PATCH v6] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode or the kernel is running in a hypervisor

Per the "Processor Specification Update" documentations referred by the
intel-microcode-20240312 release note, this microcode release has fixed
the issue for all affected models.

So don't disable INVLPG if the microcode is new enough. The precise
minimum microcode revision fixing the issue is provided by engineer from
Intel.

And the erratum says:

This erratum does not apply in VMX non-root operation. It applies
only when PCIDs are enabled and either in VMX root operation or
outside VMX operation.

So if the kernel is running in a hypervisor, we are in VMX non-root
operation and we should be safe.

Cc: Dave Hansen <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Andrew Cooper <[email protected]>
Link: https://lore.kernel.org/all/168436059559.404.13934972543631851306.tip-bot2@tip-bot2/
Link: https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases/tag/microcode-20240312
Link: https://cdrdv2.intel.com/v1/dl/getContent/740518 # RPL042, rev. 13
Link: https://cdrdv2.intel.com/v1/dl/getContent/682436 # ADL063, rev. 24
Link: https://lore.kernel.org/all/20240325231300.qrltbzf6twm43ftb@desk/
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/x86/mm/init.c | 41 +++++++++++++++++++++++++++++------------
1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 679893ea5e68..e69d227ea123 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -261,33 +261,50 @@ static void __init probe_page_size_mask(void)
}
}

-#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
- .family = 6, \
- .model = _model, \
- }
+#define INTEL_MATCH(_model, _fixed_microcode) \
+ { \
+ .vendor = X86_VENDOR_INTEL, \
+ .family = 6, \
+ .model = _model, \
+ .driver_data = _fixed_microcode, \
+ }
+
/*
* INVLPG may not properly flush Global entries
- * on these CPUs when PCIDs are enabled.
+ * on these CPUs when PCIDs are enabled and the
+ * microcode is not updated to fix the issue.
*/
static const struct x86_cpu_id invlpg_miss_ids[] = {
- INTEL_MATCH(INTEL_FAM6_ALDERLAKE ),
- INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
- INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT ),
- INTEL_MATCH(INTEL_FAM6_RAPTORLAKE ),
- INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
- INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE, 0x2e),
+ INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L, 0x42c),
+ INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT, 0x11),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE, 0x118),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P, 0x4117),
+ INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S, 0x2e),
{}
};

static void setup_pcid(void)
{
+ const struct x86_cpu_id *invlpg_miss_match;
+
if (!IS_ENABLED(CONFIG_X86_64))
return;

if (!boot_cpu_has(X86_FEATURE_PCID))
return;

- if (x86_match_cpu(invlpg_miss_ids)) {
+ invlpg_miss_match = x86_match_cpu(invlpg_miss_ids);
+
+ /*
+ * The Intel errata claims: "this erratum does not apply in VMX
+ * non-root operation. It applies only when PCIDs are enabled
+ * and either in VMX root operation or outside VMX operation."
+ * So we are safe if we are surely running in a hypervisor.
+ */
+ if (!boot_cpu_has(X86_FEATURE_HYPERVISOR) &&
+ invlpg_miss_match &&
+ boot_cpu_data.microcode < invlpg_miss_match->driver_data) {
pr_info("Incomplete global flushes, disabling PCID");
setup_clear_cpu_cap(X86_FEATURE_PCID);
return;
--
2.44.0



2024-04-11 14:29:55

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v6] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode or the kernel is running in a hypervisor

On Thu, Apr 11, 2024, Xi Ruoyao wrote:
> Per the "Processor Specification Update" documentations referred by the
> intel-microcode-20240312 release note, this microcode release has fixed
> the issue for all affected models.
>
> So don't disable INVLPG if the microcode is new enough. The precise
> minimum microcode revision fixing the issue is provided by engineer from
> Intel.
>
> And the erratum says:
>
> This erratum does not apply in VMX non-root operation. It applies
> only when PCIDs are enabled and either in VMX root operation or
> outside VMX operation.
>
> So if the kernel is running in a hypervisor, we are in VMX non-root
> operation and we should be safe.

This should be two separate patches, one to do the precise ucode update check,
and one to keep PCID enabled for guests.

2024-04-11 14:45:48

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v6] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode or the kernel is running in a hypervisor

On 4/11/24 03:48, Xi Ruoyao wrote:
> + /*
> + * The Intel errata claims: "this erratum does not apply in VMX
> + * non-root operation. It applies only when PCIDs are enabled
> + * and either in VMX root operation or outside VMX operation."
> + * So we are safe if we are surely running in a hypervisor.
> + */

When you revise this, could you please work to make this more succinct?
The Intel language on these things tends to be a bit flowery and is not
always well-suited for the kernel.

Also, saying that the erratum "claims" this casts doubt on it. That's
counterproductive. I believe the current documentation is correct. My
original ce0b15d11ad8 ("x86/mm: Avoid incomplete Global INVLPG flushes")
should have considered virtualized systems immune to this issue.

I agree that it sounds weird. It _is_ weird that systems running under
hypervisors aren't affected. But that's all it is: a weird bug. The
documentation is correct.

2024-04-11 14:48:25

by Xi Ruoyao

[permalink] [raw]
Subject: Re: [PATCH v6] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode or the kernel is running in a hypervisor

On Thu, 2024-04-11 at 07:44 -0700, Dave Hansen wrote:
> On 4/11/24 03:48, Xi Ruoyao wrote:
> > + /*
> > + * The Intel errata claims: "this erratum does not apply in VMX
> > + * non-root operation.  It applies only when PCIDs are enabled
> > + * and either in VMX root operation or outside VMX operation."
> > + * So we are safe if we are surely running in a hypervisor.
> > + */
>
> When you revise this, could you please work to make this more succinct?
> The Intel language on these things tends to be a bit flowery and is not
> always well-suited for the kernel.

Oops, bad timing. I just sent v7 before getting this reply.

I'm not a native English speaker, so could you give some hint about how
to write this comment clearly?

> Also, saying that the erratum "claims" this casts doubt on it.  That's
> counterproductive.  I believe the current documentation is correct.  My
> original ce0b15d11ad8 ("x86/mm: Avoid incomplete Global INVLPG flushes")
> should have considered virtualized systems immune to this issue.

Then do we need a "Fixes: ce0b15d11ad8" for the patch keeping PCID
enabled for guests?

> I agree that it sounds weird.  It _is_ weird that systems running under
> hypervisors aren't affected.  But that's all it is: a weird bug.  The
> documentation is correct.

Yes, these hardware issues are just weird to me...

--
Xi Ruoyao <[email protected]>
School of Aerospace Science and Technology, Xidian University

2024-04-11 15:00:19

by Andrew Cooper

[permalink] [raw]
Subject: Re: [PATCH v6] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode or the kernel is running in a hypervisor

On 11/04/2024 3:48 pm, Xi Ruoyao wrote:
> On Thu, 2024-04-11 at 07:44 -0700, Dave Hansen wrote:
>> I agree that it sounds weird.  It _is_ weird that systems running under
>> hypervisors aren't affected.  But that's all it is: a weird bug.  The
>> documentation is correct.
> Yes, these hardware issues are just weird to me...
>

I have no inside knowledge here, but this isn't surprising to me.

VPID tags are unique to VMX operation.  At a guess, VMXON reconfigures
the tagging in the TLB to include VPID (and EPTP tags for that matter),
and works around the buggy selection for which PCIDs to drop.

~Andrew

2024-04-11 15:29:13

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v6] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode or the kernel is running in a hypervisor

On 4/11/24 07:48, Xi Ruoyao wrote:
> On Thu, 2024-04-11 at 07:44 -0700, Dave Hansen wrote:
>> On 4/11/24 03:48, Xi Ruoyao wrote:
>>> + /*
>>> + * The Intel errata claims: "this erratum does not apply in VMX
>>> + * non-root operation.  It applies only when PCIDs are enabled
>>> + * and either in VMX root operation or outside VMX operation."
>>> + * So we are safe if we are surely running in a hypervisor.
>>> + */
>> When you revise this, could you please work to make this more succinct?
>> The Intel language on these things tends to be a bit flowery and is not
>> always well-suited for the kernel.
> Oops, bad timing. I just sent v7 before getting this reply.

One way to avoid bad timing like this is to wait more than 4 hours
between patch revisions.

> I'm not a native English speaker, so could you give some hint about how
> to write this comment clearly?

Something like this would be fine:

/* Only bare-metal is affected. PCIDs in guests are OK. */



2024-04-11 17:17:25

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v6] x86/mm: Don't disable INVLPG if "incomplete Global INVLPG flushes" is fixed by microcode or the kernel is running in a hypervisor

On 4/11/24 08:18, Dave Hansen wrote:
>
>> I'm not a native English speaker, so could you give some hint about how
>> to write this comment clearly?
> Something like this would be fine:
>
> /* Only bare-metal is affected. PCIDs in guests are OK. */

One more thing... Instead if cluttering up the main PCID setup path
that everybody has to read, you can put this in the macro:

#define INTEL_MATCH(_model) \
{ \
.vendor = X86_VENDOR_INTEL, \
.feature = X86_FEATURE_HYPERVISOR \
.family = 6, \
.model = _model, \
}

Then you can supplement the "INVLPG may not properly flush Global
entries..." comment.

/*
* INVLPG may not properly flush Global entries. But
* it only misbehaves when PCIDs are in use on bare
* metal on specific CPU models. It is completely
* fixed with updated microcode.
*/