2024-04-13 04:42:03

by Xi Ruoyao

Subject: [PATCH v8 1/2] x86/mm: Don't disable PCID if "incomplete Global INVLPG flushes" is fixed by microcode

Per the "Processor Specification Update" documents referenced by the
intel-microcode-20240312 release notes, this microcode release fixes
the issue on all affected models.

So don't disable PCID if the microcode is new enough. The precise
minimum microcode revisions that fix the issue were provided by an
engineer from Intel.

Cc: Dave Hansen <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Andrew Cooper <[email protected]>
Link: https://lore.kernel.org/all/168436059559.404.13934972543631851306.tip-bot2@tip-bot2/
Link: https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases/tag/microcode-20240312
Link: https://cdrdv2.intel.com/v1/dl/getContent/740518 # RPL042, rev. 13
Link: https://cdrdv2.intel.com/v1/dl/getContent/682436 # ADL063, rev. 24
Link: https://lore.kernel.org/all/20240325231300.qrltbzf6twm43ftb@desk/
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/x86/mm/init.c | 34 ++++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 679893ea5e68..c318cdc35467 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -261,33 +261,43 @@ static void __init probe_page_size_mask(void)
 	}
 }

-#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
-			      .family = 6, \
-			      .model = _model, \
-			    }
+#define INTEL_MATCH(_model, _fixed_microcode) \
+	{ \
+		.vendor = X86_VENDOR_INTEL, \
+		.family = 6, \
+		.model = _model, \
+		.driver_data = _fixed_microcode, \
+	}
+
 /*
  * INVLPG may not properly flush Global entries
- * on these CPUs when PCIDs are enabled.
+ * on these CPUs when PCIDs are enabled and the
+ * microcode is not updated to fix the issue.
  */
 static const struct x86_cpu_id invlpg_miss_ids[] = {
-	INTEL_MATCH(INTEL_FAM6_ALDERLAKE ),
-	INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
-	INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT ),
-	INTEL_MATCH(INTEL_FAM6_RAPTORLAKE ),
-	INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
-	INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+	INTEL_MATCH(INTEL_FAM6_ALDERLAKE, 0x2e),
+	INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L, 0x42c),
+	INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT, 0x11),
+	INTEL_MATCH(INTEL_FAM6_RAPTORLAKE, 0x118),
+	INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P, 0x4117),
+	INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S, 0x2e),
 	{}
 };

 static void setup_pcid(void)
 {
+	const struct x86_cpu_id *invlpg_miss_match;
+
 	if (!IS_ENABLED(CONFIG_X86_64))
 		return;

 	if (!boot_cpu_has(X86_FEATURE_PCID))
 		return;

-	if (x86_match_cpu(invlpg_miss_ids)) {
+	invlpg_miss_match = x86_match_cpu(invlpg_miss_ids);
+
+	if (invlpg_miss_match &&
+	    boot_cpu_data.microcode < invlpg_miss_match->driver_data) {
 		pr_info("Incomplete global flushes, disabling PCID");
 		setup_clear_cpu_cap(X86_FEATURE_PCID);
 		return;
--
2.44.0
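
For readers following the thread, the check added by this patch can be modeled outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `struct fix_entry`, `fix_table`, `needs_pcid_disabled()` and the `MODEL_*` values are hypothetical stand-ins, and only the microcode revisions are copied from the patch above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder model IDs; these are NOT the real INTEL_FAM6_* values. */
enum model_id {
	MODEL_UNLISTED = 0,
	MODEL_ALDERLAKE,
	MODEL_ALDERLAKE_L,
	MODEL_ATOM_GRACEMONT,
	MODEL_RAPTORLAKE,
	MODEL_RAPTORLAKE_P,
	MODEL_RAPTORLAKE_S,
};

/* Hypothetical stand-in for an x86_cpu_id entry and its driver_data. */
struct fix_entry {
	enum model_id model;
	uint32_t fixed_microcode;	/* minimum revision containing the fix */
};

/* Revisions copied from the patch; everything else is illustrative. */
const struct fix_entry fix_table[] = {
	{ MODEL_ALDERLAKE,	0x2e   },
	{ MODEL_ALDERLAKE_L,	0x42c  },
	{ MODEL_ATOM_GRACEMONT,	0x11   },
	{ MODEL_RAPTORLAKE,	0x118  },
	{ MODEL_RAPTORLAKE_P,	0x4117 },
	{ MODEL_RAPTORLAKE_S,	0x2e   },
};

/*
 * Returns true when PCID would have to be disabled: the model is on the
 * affected list and the running microcode is older than the fixed revision.
 */
bool needs_pcid_disabled(enum model_id model, uint32_t microcode)
{
	size_t i;

	for (i = 0; i < sizeof(fix_table) / sizeof(fix_table[0]); i++) {
		if (fix_table[i].model == model)
			return microcode < fix_table[i].fixed_microcode;
	}

	return false;	/* model is not on the affected list */
}
```

In this model a revision equal to the listed value counts as fixed, which mirrors the strict `<` comparison in the patch.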



2024-04-13 04:42:42

by Xi Ruoyao

Subject: [PATCH v8 2/2] x86/mm: Don't disable PCID if the kernel is running on a hypervisor

The Intel erratum for "incomplete Global INVLPG flushes" says:

This erratum does not apply in VMX non-root operation. It applies
only when PCIDs are enabled and either in VMX root operation or
outside VMX operation.

So if the kernel is running in a hypervisor, we are in VMX non-root
operation and we should be safe to use PCID.

Cc: Dave Hansen <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Andrew Cooper <[email protected]>
Link: https://lore.kernel.org/all/168436059559.404.13934972543631851306.tip-bot2@tip-bot2/
Link: https://cdrdv2.intel.com/v1/dl/getContent/740518 # RPL042, rev. 13
Link: https://cdrdv2.intel.com/v1/dl/getContent/682436 # ADL063, rev. 24
Signed-off-by: Xi Ruoyao <[email protected]>
---
arch/x86/mm/init.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index c318cdc35467..6010f86c5acd 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -275,6 +275,14 @@ static void __init probe_page_size_mask(void)
  * microcode is not updated to fix the issue.
  */
 static const struct x86_cpu_id invlpg_miss_ids[] = {
+	/* Only bare-metal is affected. PCIDs in guests are OK. */
+	{
+		.vendor = X86_VENDOR_INTEL,
+		.family = 6,
+		.model = INTEL_FAM6_ANY,
+		.feature = X86_FEATURE_HYPERVISOR,
+		.driver_data = 0,
+	},
 	INTEL_MATCH(INTEL_FAM6_ALDERLAKE, 0x2e),
 	INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L, 0x42c),
 	INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT, 0x11),
--
2.44.0


2024-04-16 23:49:54

by Sean Christopherson

Subject: Re: [PATCH v8 2/2] x86/mm: Don't disable PCID if the kernel is running on a hypervisor

On Sat, Apr 13, 2024, Xi Ruoyao wrote:
> The Intel erratum for "incomplete Global INVLPG flushes" says:
>
> This erratum does not apply in VMX non-root operation. It applies
> only when PCIDs are enabled and either in VMX root operation or
> outside VMX operation.
>
> So if the kernel is running in a hypervisor, we are in VMX non-root
> operation and we should be safe to use PCID.
>
> Cc: Dave Hansen <[email protected]>
> Cc: Michael Kelley <[email protected]>
> Cc: Pawan Gupta <[email protected]>
> Cc: Sean Christopherson <[email protected]>
> Cc: Andrew Cooper <[email protected]>
> Link: https://lore.kernel.org/all/168436059559.404.13934972543631851306.tip-bot2@tip-bot2/
> Link: https://cdrdv2.intel.com/v1/dl/getContent/740518 # RPL042, rev. 13
> Link: https://cdrdv2.intel.com/v1/dl/getContent/682436 # ADL063, rev. 24
> Signed-off-by: Xi Ruoyao <[email protected]>
> ---
> arch/x86/mm/init.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index c318cdc35467..6010f86c5acd 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -275,6 +275,14 @@ static void __init probe_page_size_mask(void)
>   * microcode is not updated to fix the issue.
>   */
>  static const struct x86_cpu_id invlpg_miss_ids[] = {
> +	/* Only bare-metal is affected. PCIDs in guests are OK. */
> +	{
> +		.vendor = X86_VENDOR_INTEL,
> +		.family = 6,
> +		.model = INTEL_FAM6_ANY,
> +		.feature = X86_FEATURE_HYPERVISOR,

Isn't this inverted? x86_match_cpu() will return NULL if the CPU doesn't have
HYPERVISOR. We want it to return NULL if the CPU *does* have HYPERVISOR.

> +		.driver_data = 0,
> +	},
>  	INTEL_MATCH(INTEL_FAM6_ALDERLAKE, 0x2e),
>  	INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L, 0x42c),
>  	INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT, 0x11),
> --
> 2.44.0
>

2024-04-17 17:23:01

by Pawan Gupta

Subject: Re: [PATCH v8 2/2] x86/mm: Don't disable PCID if the kernel is running on a hypervisor

On Tue, Apr 16, 2024 at 04:49:42PM -0700, Sean Christopherson wrote:
> On Sat, Apr 13, 2024, Xi Ruoyao wrote:
> > The Intel erratum for "incomplete Global INVLPG flushes" says:
> >
> > This erratum does not apply in VMX non-root operation. It applies
> > only when PCIDs are enabled and either in VMX root operation or
> > outside VMX operation.
> >
> > So if the kernel is running in a hypervisor, we are in VMX non-root
> > operation and we should be safe to use PCID.
> >
> > Cc: Dave Hansen <[email protected]>
> > Cc: Michael Kelley <[email protected]>
> > Cc: Pawan Gupta <[email protected]>
> > Cc: Sean Christopherson <[email protected]>
> > Cc: Andrew Cooper <[email protected]>
> > Link: https://lore.kernel.org/all/168436059559.404.13934972543631851306.tip-bot2@tip-bot2/
> > Link: https://cdrdv2.intel.com/v1/dl/getContent/740518 # RPL042, rev. 13
> > Link: https://cdrdv2.intel.com/v1/dl/getContent/682436 # ADL063, rev. 24
> > Signed-off-by: Xi Ruoyao <[email protected]>
> > ---
> > arch/x86/mm/init.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> > index c318cdc35467..6010f86c5acd 100644
> > --- a/arch/x86/mm/init.c
> > +++ b/arch/x86/mm/init.c
> > @@ -275,6 +275,14 @@ static void __init probe_page_size_mask(void)
> >   * microcode is not updated to fix the issue.
> >   */
> >  static const struct x86_cpu_id invlpg_miss_ids[] = {
> > +	/* Only bare-metal is affected. PCIDs in guests are OK. */
> > +	{
> > +		.vendor = X86_VENDOR_INTEL,
> > +		.family = 6,
> > +		.model = INTEL_FAM6_ANY,
> > +		.feature = X86_FEATURE_HYPERVISOR,
>
> Isn't this inverted? x86_match_cpu() will return NULL if the CPU doesn't have
> HYPERVISOR. We want it to return NULL if the CPU *does* have HYPERVISOR.

I think the implementation is correct: x86_match_cpu() will not return
NULL if the CPU doesn't have the HYPERVISOR feature *and* matches one of
the CPUs below. It will only return NULL if none of the entries match.

> > +		.driver_data = 0,
> > +	},
> >  	INTEL_MATCH(INTEL_FAM6_ALDERLAKE, 0x2e),
> >  	INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L, 0x42c),
> >  	INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT, 0x11),
> > --
> > 2.44.0
> >

2024-04-17 17:45:15

by Dave Hansen

Subject: Re: [PATCH v8 2/2] x86/mm: Don't disable PCID if the kernel is running on a hypervisor

On 4/17/24 10:22, Pawan Gupta wrote:
>>>  static const struct x86_cpu_id invlpg_miss_ids[] = {
>>> +	/* Only bare-metal is affected. PCIDs in guests are OK. */
>>> +	{
>>> +		.vendor = X86_VENDOR_INTEL,
>>> +		.family = 6,
>>> +		.model = INTEL_FAM6_ANY,
>>> +		.feature = X86_FEATURE_HYPERVISOR,
>> Isn't this inverted? x86_match_cpu() will return NULL if the CPU doesn't have
>> HYPERVISOR. We want it to return NULL if the CPU *does* have HYPERVISOR.
> I think the implementation is correct, x86_match_cpu() will not return
> NULL if the CPU doesn't have HYPERVISOR feature *and* matches one of the
> CPUs below. It will only return NULL if none of the entries match.

I think I gave a crappy suggestion here.

Let's just do the X86_FEATURE_HYPERVISOR check explicitly in the code
instead of trying to cram it into the invlpg_miss_ids[] check. It's way
easier to understand with an explicit code check.
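
One way to picture the explicit check suggested here is the user-space sketch below. It is an illustration under stated assumptions, not a proposed kernel change: `struct cpu_facts` and `pcid_stays_enabled()` are hypothetical names, and the fields only stand in for what setup_pcid() would read from boot_cpu_has(), x86_match_cpu() and boot_cpu_data.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the facts setup_pcid() would consult. */
struct cpu_facts {
	bool has_pcid;			/* stands in for X86_FEATURE_PCID */
	bool has_hypervisor;		/* stands in for X86_FEATURE_HYPERVISOR */
	bool on_affected_list;		/* model matched the erratum table */
	uint32_t microcode;		/* running microcode revision */
	uint32_t fixed_microcode;	/* revision that contains the fix */
};

/* Returns true when PCID can remain enabled. */
bool pcid_stays_enabled(const struct cpu_facts *cpu)
{
	if (!cpu->has_pcid)
		return false;

	/* Explicit guest check: the erratum does not apply in VMX non-root. */
	if (cpu->has_hypervisor)
		return true;

	/* Bare metal: affected model with microcode older than the fix. */
	if (cpu->on_affected_list && cpu->microcode < cpu->fixed_microcode)
		return false;

	return true;
}
```

With the guest case handled on its own line, the table lookup only has to answer one question (is this model affected, and from which revision is it fixed), and the result no longer depends on entry order.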

2024-04-17 18:27:04

by Sean Christopherson

Subject: Re: [PATCH v8 2/2] x86/mm: Don't disable PCID if the kernel is running on a hypervisor

On Wed, Apr 17, 2024, Dave Hansen wrote:
> On 4/17/24 10:22, Pawan Gupta wrote:
> >>>  static const struct x86_cpu_id invlpg_miss_ids[] = {
> >>> +	/* Only bare-metal is affected. PCIDs in guests are OK. */
> >>> +	{
> >>> +		.vendor = X86_VENDOR_INTEL,
> >>> +		.family = 6,
> >>> +		.model = INTEL_FAM6_ANY,

Just in case we go this route (I hope we don't), this should probably be:

	/* Only bare-metal is affected. PCIDs in guests are OK. */
	{
		.vendor = X86_VENDOR_ANY,
		.feature = X86_FEATURE_HYPERVISOR,
		.driver_data = 0,
	},

to make it clear that the goal is to match only the feature. Matching Intel P6
suffices because that's what the other entries in the array all check, but it
makes subtle, confusing code even more subtle and confusing.

> >>> +		.feature = X86_FEATURE_HYPERVISOR,
> >> Isn't this inverted? x86_match_cpu() will return NULL if the CPU doesn't have
> >> HYPERVISOR. We want it to return NULL if the CPU *does* have HYPERVISOR.
> > I think the implementation is correct, x86_match_cpu() will not return
> > NULL if the CPU doesn't have HYPERVISOR feature *and* matches one of the
> > CPUs below. It will only return NULL if none of the entries match.

Oooh, and because it's the first entry it will always be found even if a different
entry would match the FMS. Oof.

> I think I gave a crappy suggestion here.
>
> Let's just do the X86_FEATURE_HYPERVISOR explicitly in the code instead
> of trying to cram it into the invlpg_miss_ids[] check. It's way easier
> to understand with an explicit code check.

+1. And it doesn't rely on the HYPERVISOR entry being the first entry, which
is doubly evil.
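
The ordering dependence discussed above can be reproduced with a small first-match lookup. This is a simplified user-space model under stated assumptions, not x86_match_cpu() itself: `struct match_entry`, `first_match()`, the two tables and all constants are hypothetical, and only the 0x2e revision and the "first matching entry wins" behaviour come from the thread.

```c
#include <stddef.h>

#define MODEL_ANY	0	/* wildcard: matches every model */
#define FEATURE_ANY	0	/* wildcard: no feature required */
#define FEAT_HYPERVISOR	1	/* placeholder feature bit */
#define MODEL_A		0x10	/* placeholder for one affected model */

/* Hypothetical, reduced version of a CPU match-table entry. */
struct match_entry {
	int model;
	unsigned int feature;
	unsigned int driver_data;	/* fixed microcode revision, 0 = none */
};

/* Hypervisor entry first, as in the v8 patch. */
const struct match_entry hv_first[] = {
	{ MODEL_ANY,	FEAT_HYPERVISOR,	0    },
	{ MODEL_A,	FEATURE_ANY,		0x2e },
};

/* Same entries, hypervisor entry moved to the end. */
const struct match_entry hv_last[] = {
	{ MODEL_A,	FEATURE_ANY,		0x2e },
	{ MODEL_ANY,	FEAT_HYPERVISOR,	0    },
};

/* Returns the first entry whose model and feature both match, else NULL. */
const struct match_entry *first_match(const struct match_entry *tbl, size_t n,
				      int model, unsigned int features)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].model != MODEL_ANY && tbl[i].model != model)
			continue;
		if (tbl[i].feature != FEATURE_ANY && !(features & tbl[i].feature))
			continue;
		return &tbl[i];
	}

	return NULL;
}
```

In this model a guest on MODEL_A gets driver_data 0 from hv_first, so a `microcode < driver_data` test can never be true, but it gets 0x2e from hv_last and would lose PCID with older microcode. That is the sense in which the table approach depends on the hypervisor entry coming first.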