2023-01-03 18:05:24

by Ashok Raj

[permalink] [raw]
Subject: [PATCH v3 2/6] x86/microcode/core: Take a snapshot before and after applying microcode

The kernel caches features about each CPU's features at boot in an
x86_capability[] structure. The microcode update takes one snapshot and
compares it with the saved copy at boot.

However, the capabilities in the boot copy can be turned off as a result of
certain command line parameters or configuration restrictions. This can
cause a mismatch when comparing the values before and after the microcode
update.

microcode_check() is called after an update to report any previously
cached CPUID bits might have changed due to the update.

microcode_store_cpu_caps() basically stores the original CPU reported
values and not the OS modified values. This will avoid giving a false
warning even if no capabilities have changed.

Ignore the capabilities recorded at boot. Take a new snapshot before the
update and compare with a snapshot after the update to eliminate the false
warning.

Fixes: 1008c52c09dc ("x86/CPU: Add a microcode loader callback")
Signed-off-by: Ashok Raj <[email protected]>
Cc: LKML <[email protected]>
Cc: x86 <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Alison Schofield <[email protected]>
Cc: Reinette Chatre <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
---
Changes since last post

- Boris
- Change function from copy_cpu_caps() -> store_cpu_caps()
- Keep microcode_check() inside cpu/common.c and not bleed
get_cpu_caps() outside of core code.

- Thomas : Commit log changes.
---
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/common.c | 31 +++++++++++++++++++++-------
arch/x86/kernel/cpu/microcode/core.c | 7 +++++++
3 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 387578049de0..ac2e67156b9b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -697,6 +697,7 @@ bool xen_set_default_idle(void);
#endif

void __noreturn stop_this_cpu(void *dummy);
+void microcode_store_cpu_caps(struct cpuinfo_x86 *info);
void microcode_check(struct cpuinfo_x86 *info);

enum l1tf_mitigations {
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index b9c7529c920e..7c86c6fd07ae 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2297,28 +2297,43 @@ void cpu_init_secondary(void)
#endif

#ifdef CONFIG_MICROCODE_LATE_LOADING
+
+void microcode_store_cpu_caps(struct cpuinfo_x86 *info)
+{
+ /* Reload CPUID max function as it might've changed. */
+ info->cpuid_level = cpuid_eax(0);
+
+ /*
+ * Copy all capability leafs to pick up the synthetic ones so that
+ * memcmp() below doesn't fail on that. The ones coming from CPUID will
+ * get overwritten in get_cpu_cap().
+ */
+ memcpy(info->x86_capability, &boot_cpu_data.x86_capability,
+ sizeof(info->x86_capability));
+
+ get_cpu_cap(info);
+}
+
/*
* The microcode loader calls this upon late microcode load to recheck features,
* only when microcode has been updated. Caller holds microcode_mutex and CPU
* hotplug lock.
*/
-void microcode_check(struct cpuinfo_x86 *info)
+void microcode_check(struct cpuinfo_x86 *orig)
{
- perf_check_microcode();
+ struct cpuinfo_x86 info;

- /* Reload CPUID max function as it might've changed. */
- info->cpuid_level = cpuid_eax(0);
+ perf_check_microcode();

/*
* Copy all capability leafs to pick up the synthetic ones so that
* memcmp() below doesn't fail on that. The ones coming from CPUID will
* get overwritten in get_cpu_cap().
*/
- memcpy(&info->x86_capability, &boot_cpu_data.x86_capability, sizeof(info->x86_capability));
-
- get_cpu_cap(info);
+ microcode_store_cpu_caps(&info);

- if (!memcmp(&info->x86_capability, &boot_cpu_data.x86_capability, sizeof(info->x86_capability)))
+ if (!memcmp(&info.x86_capability, &orig->x86_capability,
+ sizeof(info.x86_capability)))
return;

pr_warn("x86/CPU: CPU features have changed after loading microcode, but might not take effect.\n");
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index d86a4f910a6b..14d9031ed68a 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -447,6 +447,13 @@ static int microcode_reload_late(void)
atomic_set(&late_cpus_in, 0);
atomic_set(&late_cpus_out, 0);

+ /*
+ * Take a snapshot before the microcode update, so we can compare
+ * them after the update is successful to check for any bits
+ * changed.
+ */
+ microcode_store_cpu_caps(&info);
+
ret = stop_machine_cpuslocked(__reload_late, NULL, cpu_online_mask);
if (ret == 0)
microcode_check(&info);
--
2.34.1


2023-01-03 19:53:48

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v3 2/6] x86/microcode/core: Take a snapshot before and after applying microcode

On 1/3/23 10:02, Ashok Raj wrote:
> The kernel caches features about each CPU's features at boot in an
> x86_capability[] structure. The microcode update takes one snapshot and
> compares it with the saved copy at boot.
>
> However, the capabilities in the boot copy can be turned off as a result of
> certain command line parameters or configuration restrictions. This can
> cause a mismatch when comparing the values before and after the microcode
> update.
>
> microcode_check() is called after an update to report any previously
> cached CPUID bits might have changed due to the update.
>
> microcode_store_cpu_caps() basically stores the original CPU reported
> values and not the OS modified values. This will avoid giving a false
> warning even if no capabilities have changed.
>
> Ignore the capabilities recorded at boot. Take a new snapshot before the
> update and compare with a snapshot after the update to eliminate the false
> warning.
...

It took me a moment to square this "Ignore the capabilities recorded at
boot." statement with the continued existence of:

memcpy(info->x86_capability, &boot_cpu_data.x86_capability, ...

I think just adding "hardware" might help:

Ignore all hardware capabilities recorded at boot.

Or even adding one more sentence:

Only consult the synthetic capabilities recorded at boot so that
a simple memcmp() can be used for comparisons.

> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 387578049de0..ac2e67156b9b 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -697,6 +697,7 @@ bool xen_set_default_idle(void);
> #endif
>
> void __noreturn stop_this_cpu(void *dummy);
> +void microcode_store_cpu_caps(struct cpuinfo_x86 *info);
> void microcode_check(struct cpuinfo_x86 *info);
>
> enum l1tf_mitigations {
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index b9c7529c920e..7c86c6fd07ae 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2297,28 +2297,43 @@ void cpu_init_secondary(void)
> #endif
>
> #ifdef CONFIG_MICROCODE_LATE_LOADING
> +
> +void microcode_store_cpu_caps(struct cpuinfo_x86 *info)
> +{
> + /* Reload CPUID max function as it might've changed. */
> + info->cpuid_level = cpuid_eax(0);
> +
> + /*
> + * Copy all capability leafs to pick up the synthetic ones so that
> + * memcmp() below doesn't fail on that. The ones coming from CPUID will
> + * get overwritten in get_cpu_cap().
> + */
> + memcpy(info->x86_capability, &boot_cpu_data.x86_capability,
> + sizeof(info->x86_capability));
> +
> + get_cpu_cap(info);
> +}
> +
> /*
> * The microcode loader calls this upon late microcode load to recheck features,
> * only when microcode has been updated. Caller holds microcode_mutex and CPU
> * hotplug lock.
> */
> -void microcode_check(struct cpuinfo_x86 *info)
> +void microcode_check(struct cpuinfo_x86 *orig)
> {
> - perf_check_microcode();
> + struct cpuinfo_x86 info;

'info' is kinda a throwaway name. would this be better as:

prev_info / new_info

instead of:

orig / info

?

> - /* Reload CPUID max function as it might've changed. */
> - info->cpuid_level = cpuid_eax(0);
> + perf_check_microcode();
>
> /*
> * Copy all capability leafs to pick up the synthetic ones so that
> * memcmp() below doesn't fail on that. The ones coming from CPUID will
> * get overwritten in get_cpu_cap().
> */

This comment got copied to microcode_store_cpu_caps(). Does this
instance still need to be here?

> - memcpy(&info->x86_capability, &boot_cpu_data.x86_capability, sizeof(info->x86_capability));
> -
> - get_cpu_cap(info);
> + microcode_store_cpu_caps(&info);
>
> - if (!memcmp(&info->x86_capability, &boot_cpu_data.x86_capability, sizeof(info->x86_capability)))
> + if (!memcmp(&info.x86_capability, &orig->x86_capability,
> + sizeof(info.x86_capability)))
> return;
>
> pr_warn("x86/CPU: CPU features have changed after loading microcode, but might not take effect.\n");
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index d86a4f910a6b..14d9031ed68a 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -447,6 +447,13 @@ static int microcode_reload_late(void)
> atomic_set(&late_cpus_in, 0);
> atomic_set(&late_cpus_out, 0);
>
> + /*
> + * Take a snapshot before the microcode update, so we can compare
> + * them after the update is successful to check for any bits
> + * changed.
> + */
> + microcode_store_cpu_caps(&info);

A "we" snuck in there. How about this?

/*
* Take a snapshot before the microcode update. This enables
* a later comparison to see any bits changed after an update.
*/

I do think some better naming of 'info' here would be nice too.
'old_info' or 'prev_info' seem like good alternatives.

> ret = stop_machine_cpuslocked(__reload_late, NULL, cpu_online_mask);
> if (ret == 0)
> microcode_check(&info);

2023-01-03 20:06:44

by Ashok Raj

[permalink] [raw]
Subject: Re: [PATCH v3 2/6] x86/microcode/core: Take a snapshot before and after applying microcode

On Tue, Jan 03, 2023 at 10:46:56AM -0800, Dave Hansen wrote:
> On 1/3/23 10:02, Ashok Raj wrote:
> > The kernel caches features about each CPU's features at boot in an
> > x86_capability[] structure. The microcode update takes one snapshot and
> > compares it with the saved copy at boot.
> >
> > However, the capabilities in the boot copy can be turned off as a result of
> > certain command line parameters or configuration restrictions. This can
> > cause a mismatch when comparing the values before and after the microcode
> > update.
> >
> > microcode_check() is called after an update to report any previously
> > cached CPUID bits might have changed due to the update.
> >
> > microcode_store_cpu_caps() basically stores the original CPU reported
> > values and not the OS modified values. This will avoid giving a false
> > warning even if no capabilities have changed.
> >
> > Ignore the capabilities recorded at boot. Take a new snapshot before the
> > update and compare with a snapshot after the update to eliminate the false
> > warning.
> ...
>
> It took me a moment to square this "Ignore the capabilities recorded at
> boot." statement with the continued existence of:
>
> memcpy(info->x86_capability, &boot_cpu_data.x86_capability, ...
>
> I think just adding "hardware" might help:
>
> Ignore all hardware capabilities recorded at boot.
>
> Or even adding one more sentence:
>
> Only consult the synthetic capabilities recorded at boot so that
> a simple memcmp() can be used for comparisons.

Rewritten this multiple times, but it still seems confusing, however hard I
try :-(

Its not consulting just the synthetic capabilties, but all real
capabilities reported by hardware including synthetic ones. For e.g.
if we turn off SGX, but HW still exposes it, the new snapshot will
have SGX but previous copy doesn't as it has gone through some more
filtering during enabling. That is what resulted in the miscompare.

Does this replacement sound better?

Ignore any changes to capabilities applied by the kernel during any
feature enabling and check against raw capabilities exposed by the
hardware


>
> > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > index 387578049de0..ac2e67156b9b 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -697,6 +697,7 @@ bool xen_set_default_idle(void);
> > #endif
> >
> > void __noreturn stop_this_cpu(void *dummy);
> > +void microcode_store_cpu_caps(struct cpuinfo_x86 *info);
> > void microcode_check(struct cpuinfo_x86 *info);
> >
> > enum l1tf_mitigations {
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index b9c7529c920e..7c86c6fd07ae 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -2297,28 +2297,43 @@ void cpu_init_secondary(void)
> > #endif
> >
> > #ifdef CONFIG_MICROCODE_LATE_LOADING
> > +
> > +void microcode_store_cpu_caps(struct cpuinfo_x86 *info)
> > +{
> > + /* Reload CPUID max function as it might've changed. */
> > + info->cpuid_level = cpuid_eax(0);
> > +
> > + /*
> > + * Copy all capability leafs to pick up the synthetic ones so that
> > + * memcmp() below doesn't fail on that. The ones coming from CPUID will
> > + * get overwritten in get_cpu_cap().
> > + */
> > + memcpy(info->x86_capability, &boot_cpu_data.x86_capability,
> > + sizeof(info->x86_capability));
> > +
> > + get_cpu_cap(info);
> > +}
> > +
> > /*
> > * The microcode loader calls this upon late microcode load to recheck features,
> > * only when microcode has been updated. Caller holds microcode_mutex and CPU
> > * hotplug lock.
> > */
> > -void microcode_check(struct cpuinfo_x86 *info)
> > +void microcode_check(struct cpuinfo_x86 *orig)
> > {
> > - perf_check_microcode();
> > + struct cpuinfo_x86 info;
>
> 'info' is kinda a throwaway name. would this be better as:
>
> prev_info / new_info
>
> instead of:
>
> orig / info
>
> ?

Yes, that sounds better.

>
> > - /* Reload CPUID max function as it might've changed. */
> > - info->cpuid_level = cpuid_eax(0);
> > + perf_check_microcode();
> >
> > /*
> > * Copy all capability leafs to pick up the synthetic ones so that
> > * memcmp() below doesn't fail on that. The ones coming from CPUID will
> > * get overwritten in get_cpu_cap().
> > */
>
> This comment got copied to microcode_store_cpu_caps(). Does this
> instance still need to be here?

I think it still applies.. get_cpu_cap() copies all reported capabilities
and adds the synthetic ones too.

>
> > - memcpy(&info->x86_capability, &boot_cpu_data.x86_capability, sizeof(info->x86_capability));
> > -
> > - get_cpu_cap(info);
> > + microcode_store_cpu_caps(&info);
> >
> > - if (!memcmp(&info->x86_capability, &boot_cpu_data.x86_capability, sizeof(info->x86_capability)))
> > + if (!memcmp(&info.x86_capability, &orig->x86_capability,
> > + sizeof(info.x86_capability)))
> > return;
> >
> > pr_warn("x86/CPU: CPU features have changed after loading microcode, but might not take effect.\n");
> > diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> > index d86a4f910a6b..14d9031ed68a 100644
> > --- a/arch/x86/kernel/cpu/microcode/core.c
> > +++ b/arch/x86/kernel/cpu/microcode/core.c
> > @@ -447,6 +447,13 @@ static int microcode_reload_late(void)
> > atomic_set(&late_cpus_in, 0);
> > atomic_set(&late_cpus_out, 0);
> >
> > + /*
> > + * Take a snapshot before the microcode update, so we can compare
> > + * them after the update is successful to check for any bits
> > + * changed.
> > + */
> > + microcode_store_cpu_caps(&info);
>
> A "we" snuck in there. How about this?
>
> /*
> * Take a snapshot before the microcode update. This enables
> * a later comparison to see any bits changed after an update.
> */
>
> I do think some better naming of 'info' here would be nice too.
> 'old_info' or 'prev_info' seem like good alternatives.

Sounds good.

Cheers,
Ashok

2023-01-04 19:13:05

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 2/6] x86/microcode/core: Take a snapshot before and after applying microcode

On Tue, Jan 03, 2023 at 10:02:08AM -0800, Ashok Raj wrote:
> Fixes: 1008c52c09dc ("x86/CPU: Add a microcode loader callback")

Why a Fixes tag? Do you have a failure scenario for current kernels?

If so, then it would need stable backporting.

If so, it would need the previous patch too.

> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 387578049de0..ac2e67156b9b 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -697,6 +697,7 @@ bool xen_set_default_idle(void);
> #endif
>
> void __noreturn stop_this_cpu(void *dummy);
> +void microcode_store_cpu_caps(struct cpuinfo_x86 *info);

s/microcode_store_cpu_caps/store_cpu_caps/g

> void microcode_check(struct cpuinfo_x86 *info);
>
> enum l1tf_mitigations {
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index b9c7529c920e..7c86c6fd07ae 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2297,28 +2297,43 @@ void cpu_init_secondary(void)
> #endif
>
> #ifdef CONFIG_MICROCODE_LATE_LOADING
> +
> +void microcode_store_cpu_caps(struct cpuinfo_x86 *info)
> +{
> + /* Reload CPUID max function as it might've changed. */

Might've changed how?

> + info->cpuid_level = cpuid_eax(0);
> +
> + /*
> + * Copy all capability leafs to pick up the synthetic ones so that
> + * memcmp() below doesn't fail on that...

split that comment and put the second part...

> + */
> + memcpy(info->x86_capability, &boot_cpu_data.x86_capability,
> + sizeof(info->x86_capability));
> +

... here:

/*
* ... the ones coming from CPUID will get overwritten here:
*/

> + get_cpu_cap(info);
> +}
> +
> /*
> * The microcode loader calls this upon late microcode load to recheck features,
> * only when microcode has been updated. Caller holds microcode_mutex and CPU
> * hotplug lock.
> */
> -void microcode_check(struct cpuinfo_x86 *info)
> +void microcode_check(struct cpuinfo_x86 *orig)
^^^^^

Yeah, what dhansen said.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-01-06 21:56:20

by Ashok Raj

[permalink] [raw]
Subject: Re: [PATCH v3 2/6] x86/microcode/core: Take a snapshot before and after applying microcode

On Wed, Jan 04, 2023 at 07:56:52PM +0100, Borislav Petkov wrote:
> On Tue, Jan 03, 2023 at 10:02:08AM -0800, Ashok Raj wrote:
> > Fixes: 1008c52c09dc ("x86/CPU: Add a microcode loader callback")
>
> Why a Fixes tag? Do you have a failure scenario for current kernels?

When we reload the same microcode there is no change in CPUID bits, but if
the kernel has turned off a feature, in my case it was SGX. So bsp copy has
SGX off, but new get_cpu_cap() seems to think SGX is there since this is
unfiltered by the kernel.

This results in a miscompare. I have noticed even when i load a brand new
patch but has no change in CPUID, report there might have been a change in
CPUID.

>
> If so, then it would need stable backporting.
>
> If so, it would need the previous patch too.

I think so, but your call.


>
> > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> > index 387578049de0..ac2e67156b9b 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -697,6 +697,7 @@ bool xen_set_default_idle(void);
> > #endif
> >
> > void __noreturn stop_this_cpu(void *dummy);
> > +void microcode_store_cpu_caps(struct cpuinfo_x86 *info);
>
> s/microcode_store_cpu_caps/store_cpu_caps/g

Yes, i'll change.

>
> > void microcode_check(struct cpuinfo_x86 *info);
> >
> > enum l1tf_mitigations {
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index b9c7529c920e..7c86c6fd07ae 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -2297,28 +2297,43 @@ void cpu_init_secondary(void)
> > #endif
> >
> > #ifdef CONFIG_MICROCODE_LATE_LOADING
> > +
> > +void microcode_store_cpu_caps(struct cpuinfo_x86 *info)
> > +{
> > + /* Reload CPUID max function as it might've changed. */
>
> Might've changed how?

This comment existed in the previous microcode_check(). I just moved it
around during the refactor.

I suppose new microcode can bring new features and this max function
enumeration can also change?

>
> > + info->cpuid_level = cpuid_eax(0);
> > +
> > + /*
> > + * Copy all capability leafs to pick up the synthetic ones so that
> > + * memcmp() below doesn't fail on that...
>
> split that comment and put the second part...
>
> > + */
> > + memcpy(info->x86_capability, &boot_cpu_data.x86_capability,
> > + sizeof(info->x86_capability));
> > +
>
> ... here:

Will do.

>
> /*
> * ... the ones coming from CPUID will get overwritten here:
> */
>
> > + get_cpu_cap(info);
> > +}
> > +
> > /*
> > * The microcode loader calls this upon late microcode load to recheck features,
> > * only when microcode has been updated. Caller holds microcode_mutex and CPU
> > * hotplug lock.
> > */
> > -void microcode_check(struct cpuinfo_x86 *info)
> > +void microcode_check(struct cpuinfo_x86 *orig)
> ^^^^^
>
> Yeah, what dhansen said.

Yes, I'm changing that as well.