Code in v6.9 arch/x86/kernel/smpboot.c was changed by commit 4db64279bc2b
("x86/cpu: Switch to new Intel CPU model defines") from old code:
440 static const struct x86_cpu_id intel_cod_cpu[] = {
441 X86_MATCH_INTEL_FAM6_MODEL(HASWELL_X, 0), /* COD */
442 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X, 0), /* COD */
443 X86_MATCH_INTEL_FAM6_MODEL(ANY, 1), /* SNC */
444 {}
445 };
446
447 static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
448 {
449 const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu);
new code:
440 static const struct x86_cpu_id intel_cod_cpu[] = {
441 X86_MATCH_VFM(INTEL_HASWELL_X, 0), /* COD */
442 X86_MATCH_VFM(INTEL_BROADWELL_X, 0), /* COD */
443 X86_MATCH_VFM(INTEL_ANY, 1), /* SNC */
444 {}
445 };
446
447 static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
448 {
449 const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu);
On an Intel CPU with SNC enabled this code previously matched the rule
on line 443 to avoid printing messages about insane cache configuration.
The new code did not match any rules.
Expanding the macros for the intel_cod_cpu[] array shows that the old
code is equivalent to:
static const struct x86_cpu_id intel_cod_cpu[] = {
[0] = { .vendor = 0, .family = 6, .model = 0x3F, .steppings = 0, .feature = 0, .driver_data = 0 },
[1] = { .vendor = 0, .family = 6, .model = 0x4F, .steppings = 0, .feature = 0, .driver_data = 0 },
[2] = { .vendor = 0, .family = 6, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 1 },
[3] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 0 }
};
while the new code expands to:
static const struct x86_cpu_id intel_cod_cpu[] = {
[0] = { .vendor = 0, .family = 6, .model = 0x3F, .steppings = 0, .feature = 0, .driver_data = 0 },
[1] = { .vendor = 0, .family = 6, .model = 0x4F, .steppings = 0, .feature = 0, .driver_data = 0 },
[2] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 1 },
[3] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 0 }
};
Looking at the code for x86_match_cpu():
36 const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
37 {
38 const struct x86_cpu_id *m;
39 struct cpuinfo_x86 *c = &boot_cpu_data;
40
41 for (m = match;
42 m->vendor | m->family | m->model | m->steppings | m->feature;
43 m++) {
...
56 }
57 return NULL;
58 }
59 EXPORT_SYMBOL(x86_match_cpu);
it is clear that there was no match because the ANY entry in the table
(array index 2) is now the loop termination condition (all of vendor,
family, model, steppings, and feature are zero).
So this code worked before only because the "ANY" check was looking for
any Intel CPU in family 6. It fails now because the family is a
wildcard. The root cause is that x86_match_cpu() has never been able to
match on a rule with just X86_VENDOR_INTEL and all other fields set to
wildcards: X86_VENDOR_INTEL is 0, so such a rule is indistinguishable
from the all-zero end marker.
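The failure can be reproduced outside the kernel with a small user-space
sketch. Everything here is a simplified stand-in, not the kernel code:
struct cpu_id, match_cpu(), the two tables and the snc_cpu value (an
arbitrary family 6 model) are names made up for illustration, and the
checks inside the loop are an assumed, reduced version of the elided body
of x86_match_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL	0	/* same value as X86_VENDOR_INTEL */
#define VENDOR_ANY	0xffff
#define FAMILY_ANY	0
#define MODEL_ANY	0

/* Simplified stand-in for struct x86_cpu_id, before any flags field. */
struct cpu_id {
	uint16_t vendor, family, model, steppings, feature;
	unsigned long driver_data;
};

/* Same loop shape as x86_match_cpu(): an all-zero entry ends the table. */
static const struct cpu_id *match_cpu(const struct cpu_id *match,
				      const struct cpu_id *cpu)
{
	const struct cpu_id *m;

	assert(match && cpu);
	for (m = match;
	     m->vendor | m->family | m->model | m->steppings | m->feature;
	     m++) {
		if (m->vendor != VENDOR_ANY && cpu->vendor != m->vendor)
			continue;
		if (m->family != FAMILY_ANY && cpu->family != m->family)
			continue;
		if (m->model != MODEL_ANY && cpu->model != m->model)
			continue;
		return m;
	}
	return NULL;
}

/* Old table: the SNC rule still carries family 6, so it is non-zero. */
static const struct cpu_id old_table[] = {
	{ .vendor = VENDOR_INTEL, .family = 6, .model = 0x3F },
	{ .vendor = VENDOR_INTEL, .family = 6, .model = 0x4F },
	{ .vendor = VENDOR_INTEL, .family = 6, .driver_data = 1 },
	{ 0 }
};

/* New table: vendor 0 plus wildcard family makes the SNC rule all zero. */
static const struct cpu_id new_table[] = {
	{ .vendor = VENDOR_INTEL, .family = 6, .model = 0x3F },
	{ .vendor = VENDOR_INTEL, .family = 6, .model = 0x4F },
	{ .vendor = VENDOR_INTEL, .driver_data = 1 },
	{ 0 }
};

/* Arbitrary Intel family 6 CPU that is neither Haswell-X nor Broadwell-X. */
static const struct cpu_id snc_cpu = {
	.vendor = VENDOR_INTEL, .family = 6, .model = 0x8F
};
```

Calling match_cpu(old_table, &snc_cpu) returns the entry at index 2, while
match_cpu(new_table, &snc_cpu) returns NULL because the loop stops at
index 2 before testing it. Note that driver_data is not part of the
termination test, so setting it to 1 does not keep the entry alive.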
Fix by adding a new flags field to struct x86_cpu_id that has a bit set
to indicate that the vendor field is valid. Update X86_MATCH*() macros
to set that bit. Extend the end-marker check in x86_match_cpu() to
check the flags field too.
Suggested-by: Thomas Gleixner <[email protected]>
Suggested-by: Borislav Petkov <[email protected]>
Fixes: 644e9cbbe3fc ("Add driver auto probing for x86 features v4")
Signed-off-by: Tony Luck <[email protected]>
---
Changes since v1:
1) More detailed commit description.
2) Changed "Fixes" tag. Commit 4db64279bc2b merely revealed a twelve
year old gap in the implementation of x86_match_cpu().
arch/x86/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index cb4f6c513c48..271c4c95bc37 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -175,10 +175,10 @@ struct cpuinfo_x86 {
unsigned initialized : 1;
} __randomize_layout;
-#define X86_VENDOR_INTEL 0
#define X86_VENDOR_CYRIX 1
#define X86_VENDOR_AMD 2
#define X86_VENDOR_UMC 3
+#define X86_VENDOR_INTEL 4
#define X86_VENDOR_CENTAUR 5
#define X86_VENDOR_TRANSMETA 7
#define X86_VENDOR_NSC 8
--
2.44.0
---
include/linux/mod_devicetable.h | 4 ++++
arch/x86/include/asm/cpu_device_id.h | 2 ++
arch/x86/kernel/cpu/match.c | 2 +-
3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
index 7a9a07ea451b..ede1bd8bc4b1 100644
--- a/include/linux/mod_devicetable.h
+++ b/include/linux/mod_devicetable.h
@@ -690,6 +690,7 @@ struct x86_cpu_id {
__u16 model;
__u16 steppings;
__u16 feature; /* bit index */
+ __u16 flags;
kernel_ulong_t driver_data;
};
@@ -700,6 +701,9 @@ struct x86_cpu_id {
#define X86_STEPPING_ANY 0
#define X86_FEATURE_ANY 0 /* Same as FPU, you can't test for that */
+/* x86_cpu_id::flags */
+#define X86_CPU_ID_FLAG_VENDOR_VALID BIT(0)
+
/*
* Generic table type for matching CPU features.
* @feature: the bit number of the feature (0 - 65535)
diff --git a/arch/x86/include/asm/cpu_device_id.h b/arch/x86/include/asm/cpu_device_id.h
index 970a232009c3..7fde9bd896d3 100644
--- a/arch/x86/include/asm/cpu_device_id.h
+++ b/arch/x86/include/asm/cpu_device_id.h
@@ -79,6 +79,7 @@
.model = _model, \
.steppings = _steppings, \
.feature = _feature, \
+ .flags = X86_CPU_ID_FLAG_VENDOR_VALID, \
.driver_data = (unsigned long) _data \
}
@@ -89,6 +90,7 @@
.model = _model, \
.steppings = _steppings, \
.feature = _feature, \
+ .flags = X86_CPU_ID_FLAG_VENDOR_VALID, \
.driver_data = (unsigned long) _data \
}
diff --git a/arch/x86/kernel/cpu/match.c b/arch/x86/kernel/cpu/match.c
index 8651643bddae..996f96cfce68 100644
--- a/arch/x86/kernel/cpu/match.c
+++ b/arch/x86/kernel/cpu/match.c
@@ -39,7 +39,7 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
struct cpuinfo_x86 *c = &boot_cpu_data;
for (m = match;
- m->vendor | m->family | m->model | m->steppings | m->feature;
+ m->vendor | m->family | m->model | m->steppings | m->feature | m->flags;
m++) {
if (m->vendor != X86_VENDOR_ANY && c->x86_vendor != m->vendor)
continue;
--
2.45.0
> Changes since v1:
> 1) More detailed commit description.
> 2) Changed "Fixes" tag. Commit 4db64279bc2b merely revealed a twelve
> year old gap in the implementation of x86_match_cpu().
Changes since v2:
Use fix suggested by Thomas & Boris that doesn't risk breakage from
changing the value of X86_VENDOR_INTEL #define.
-Tony
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index cb4f6c513c48..271c4c95bc37 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -175,10 +175,10 @@ struct cpuinfo_x86 {
> unsigned initialized : 1;
> } __randomize_layout;
>
> -#define X86_VENDOR_INTEL 0
> #define X86_VENDOR_CYRIX 1
> #define X86_VENDOR_AMD 2
> #define X86_VENDOR_UMC 3
> +#define X86_VENDOR_INTEL 4
> #define X86_VENDOR_CENTAUR 5
> #define X86_VENDOR_TRANSMETA 7
> #define X86_VENDOR_NSC 8
Bother ... I pasted in the whole of the old patch to get the change log.
Obviously this part isn't needed in v3.
Sorry.
-Tony
On Fri, May 17, 2024 at 10:21:34AM -0700, Tony Luck wrote:
> diff --git a/arch/x86/kernel/cpu/match.c b/arch/x86/kernel/cpu/match.c
> index 8651643bddae..996f96cfce68 100644
> --- a/arch/x86/kernel/cpu/match.c
> +++ b/arch/x86/kernel/cpu/match.c
> @@ -39,7 +39,7 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
> struct cpuinfo_x86 *c = &boot_cpu_data;
>
> for (m = match;
> - m->vendor | m->family | m->model | m->steppings | m->feature;
> + m->vendor | m->family | m->model | m->steppings | m->feature | m->flags;
I think this should not do anything implicit even if it is correct but
should explicitly check
if (!(m->flags & X86_CPU_ID_FLAG_VENDOR_VALID))
continue;
I don't have a clear idea how exactly yet - I need to play with it.
Maybe this stupid flow in the loop should be finally fixed into
something more readable and sensible...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On May 17, 2024 10:38:11 AM PDT, Borislav Petkov <[email protected]> wrote:
>On Fri, May 17, 2024 at 10:21:34AM -0700, Tony Luck wrote:
>> diff --git a/arch/x86/kernel/cpu/match.c b/arch/x86/kernel/cpu/match.c
>> index 8651643bddae..996f96cfce68 100644
>> --- a/arch/x86/kernel/cpu/match.c
>> +++ b/arch/x86/kernel/cpu/match.c
>> @@ -39,7 +39,7 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
>> struct cpuinfo_x86 *c = &boot_cpu_data;
>>
>> for (m = match;
>> - m->vendor | m->family | m->model | m->steppings | m->feature;
>> + m->vendor | m->family | m->model | m->steppings | m->feature | m->flags;
>
>I think this should not do anything implicit even if it is correct but
>should explicitly check
>
> if (!(m->flags & X86_CPU_ID_FLAG_VENDOR_VALID))
> continue;
>
>I don't have a clear idea how exactly yet - I need to play with it.
>
>Maybe this stupid flow in the loop should be finally fixed into
>something more readable and sensible...
>
>Thx.
>
Thought: why don't we add VENDOR and CPUID as synthetic CPU feature flags as well? Not saying it necessarily solves this specific problem but it might make some other code more uniform.
Obviously on x86-64 CPUID is baseline; VENDOR might not be known, however.
> > for (m = match;
> > - m->vendor | m->family | m->model | m->steppings | m->feature;
> > + m->vendor | m->family | m->model | m->steppings | m->feature | m->flags;
>
> I think this should not do anything implicit even if it is correct but
> should explicitly check
>
> if (!(m->flags & X86_CPU_ID_FLAG_VENDOR_VALID))
> continue;
>
> I don't have a clear idea how exactly yet - I need to play with it.
>
> Maybe this stupid flow in the loop should be finally fixed into
> something more readable and sensible...
What if the bit in flags was named "X86_CPU_ID_FLAG_ENTRY_VALID"?
Then the loop in x86_match_cpu() could just be:
for (m = match; m->flags & X86_CPU_ID_FLAG_ENTRY_VALID; m++) {
...
}
-Tony
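For reference, a user-space sketch of how that loop could behave. This is
only an illustration under assumptions: struct cpu_id, the MATCH() macro,
match_cpu_valid(), cod_table and snc_cpu are hypothetical stand-ins for
struct x86_cpu_id, the X86_MATCH*() macros and x86_match_cpu(), and the
per-field checks are a reduced guess at the real loop body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL	 0	/* same value as X86_VENDOR_INTEL */
#define VENDOR_ANY	 0xffff
#define FAMILY_ANY	 0
#define MODEL_ANY	 0
#define FLAG_ENTRY_VALID (1u << 0)	/* stand-in for the proposed flag */

/* Simplified stand-in for struct x86_cpu_id with the new flags field. */
struct cpu_id {
	uint16_t vendor, family, model, steppings, feature, flags;
	unsigned long driver_data;
};

/* Hypothetical macro in the role of X86_MATCH*(): always marks the entry valid. */
#define MATCH(_vendor, _family, _model, _data)				\
	{ .vendor = (_vendor), .family = (_family), .model = (_model),	\
	  .flags = FLAG_ENTRY_VALID, .driver_data = (_data) }

/* The table ends at the first entry without the valid bit, i.e. at "{}". */
static const struct cpu_id *match_cpu_valid(const struct cpu_id *match,
					    const struct cpu_id *cpu)
{
	const struct cpu_id *m;

	assert(match && cpu);
	for (m = match; m->flags & FLAG_ENTRY_VALID; m++) {
		if (m->vendor != VENDOR_ANY && cpu->vendor != m->vendor)
			continue;
		if (m->family != FAMILY_ANY && cpu->family != m->family)
			continue;
		if (m->model != MODEL_ANY && cpu->model != m->model)
			continue;
		return m;
	}
	return NULL;
}

static const struct cpu_id cod_table[] = {
	MATCH(VENDOR_INTEL, 6, 0x3F, 0),			/* COD */
	MATCH(VENDOR_INTEL, 6, 0x4F, 0),			/* COD */
	MATCH(VENDOR_INTEL, FAMILY_ANY, MODEL_ANY, 1),	/* SNC */
	{ 0 }
};

/* Arbitrary Intel family 6 CPU that matches neither COD entry. */
static const struct cpu_id snc_cpu = {
	.vendor = VENDOR_INTEL, .family = 6, .model = 0x8F
};
```

With this shape the "Intel, everything else wildcard" rule at index 2 is
reached and matched, since the only thing that ends the table is a clear
valid bit.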
On Fri, May 17, 2024 at 05:43:10PM +0000, Luck, Tony wrote:
> What if the bit in flags was named "X86_CPU_ID_FLAG_ENTRY_VALID"?
>
> Then the loop in x86_match_cpu() could just be:
>
> for (m = match; m->flags & X86_CPU_ID_FLAG_ENTRY_VALID; m++) {
Yeah, makes sense at first glance.
This'll keep the terminators "{}" unchanged so that we don't have to
touch all those gazillion places and it'll explicitly state that an
entry is valid or not.
But the devil's in the detail, as always...
--
Regards/Gruss,
Boris.
>> for (m = match; m->flags & X86_CPU_ID_FLAG_ENTRY_VALID; m++) {
>
> Yeah, makes sense at a first glance.
>
> This'll keep the terminators "{}" unchanged so that we don't have to
> touch all those gazillion places and it'll explicitly state that an
> entry is valid or not.
> But the devil's in the detail, as always...
Yes. One detail is that there are places not using the X86_MATCH macros.
E.g. in arch/x86/crypto/aesni-intel_glue.c there is:
static const struct x86_cpu_id zmm_exclusion_list[] = {
{ .vendor = X86_VENDOR_INTEL, .family = 6, .model = INTEL_FAM6_SKYLAKE_X },
...
};
This one (and likely most/all others) will be fixed by the remaining patches in my new families[1] series.
But I'll need to audit to check that I got them all before changing x86_match_cpu() to
only look at m->flags.
-Tony
[1] I'll work on rebasing the remaining patches in that series. I think all but a couple of trees
that have conflicting changes in linux-next have now been pulled into mainline.
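To make the audit concern concrete, here is a small user-space sketch of
what a flags-only loop would see for an open-coded table. The names
(struct cpu_id, visible_legacy(), visible_flags_only(), open_coded) are
invented for illustration, and 0x55 stands in for INTEL_FAM6_SKYLAKE_X.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_ENTRY_VALID (1u << 0)	/* stand-in for the proposed flag */

/* Simplified stand-in for struct x86_cpu_id with a flags field. */
struct cpu_id {
	uint16_t vendor, family, model, steppings, feature, flags;
	unsigned long driver_data;
};

/* Entries visible to the existing end-marker test (five fields OR-ed). */
static size_t visible_legacy(const struct cpu_id *m)
{
	size_t n = 0;

	assert(m);
	while (m->vendor | m->family | m->model | m->steppings | m->feature) {
		n++;
		m++;
	}
	return n;
}

/* Entries visible if the loop only looks at the valid bit in flags. */
static size_t visible_flags_only(const struct cpu_id *m)
{
	size_t n = 0;

	assert(m);
	while (m->flags & FLAG_ENTRY_VALID) {
		n++;
		m++;
	}
	return n;
}

/* Open-coded like zmm_exclusion_list[]: no macro, so flags stays zero. */
static const struct cpu_id open_coded[] = {
	{ .vendor = 0 /* Intel */, .family = 6, .model = 0x55 },
	{ 0 }
};
```

Here visible_legacy(open_coded) is 1 but visible_flags_only(open_coded) is
0: the table would silently look empty, which is why every open-coded
initializer has to be converted before the loop condition changes.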
On Fri, May 17, 2024 at 10:46:29AM -0700, H. Peter Anvin wrote:
> Thought: why don't we add VENDOR and CPUID as synthetic CPU feature flags as well? Not saying it necessarily solves this specific problem but it might make some other code more uniform.
>
> Obviously on x86-64 CPUID is baseline; VENDOR might not be known, however.
Well, there's only a handful of X86_VENDOR_UNKNOWN usages in the kernel
so meh. Or do you mean something else?
--
Regards/Gruss,
Boris.
On Fri, May 17 2024 at 18:13, Luck, Tony wrote:
>>> for (m = match; m->flags & X86_CPU_ID_FLAG_ENTRY_VALID; m++) {
>>
>> Yeah, makes sense at a first glance.
>>
>> This'll keep the terminators "{}" unchanged so that we don't have to
>> touch all those gazillion places and it'll explicitly state that an
>> entry is valid or not.
>
>> But the devil's in the detail, as always...
>
> Yes. One detail is that there are places not using the X86_MATCH
> macros.
Groan.
> E.g. in arch/x86/crypto/aesni-intel_glue.c there is:
>
> static const struct x86_cpu_id zmm_exclusion_list[] = {
> { .vendor = X86_VENDOR_INTEL, .family = 6, .model = INTEL_FAM6_SKYLAKE_X },
> ...
> };
>
> This one (and likely most/all others) will be fixed by the remaining
> patches in my new families[1] series.
AFAICT, that's the only one.
# git grep -C5 'struct x86_cpu_id' | grep '\.vendor' | awk '{ print $1; }' | uniq
arch/x86/crypto/aesni-intel_glue.c-