2024-05-27 07:34:33

by Peter Schneider

Subject: Kernel 6.9 regression: X86: Bogus messages from topology detection

Hi all,

I have noticed strange messages in kernel version 6.9, obviously from CPU topology
detection, which were not present in 6.8.y and earlier kernels.

This is coming from an older server machine: 2-socket Ivy Bridge Xeon E5-2697 v2 (24C/48T)
in an Asus Z9PE-D16/2L motherboard (Intel C-602A chipset); BIOS patched to the latest
available from Asus. All memory slots occupied, so 256 GB RAM in total.


From a "good boot", e.g. kernel 6.8.11, dmesg output looks like this:

[ 1.823797] smpboot: x86: Booting SMP configuration:
[ 1.823799] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11
[ 1.827514] .... node #1, CPUs: #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23
[ 0.011462] smpboot: CPU 12 Converting physical 0 to logical die 1

[ 1.875532] .... node #0, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35
[ 1.882453] .... node #1, CPUs: #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47
[ 1.887532] MDS CPU bug present and SMT on, data leak possible. See
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
[ 1.933640] smp: Brought up 2 nodes, 48 CPUs
[ 1.933640] smpboot: Max logical packages: 2
[ 1.933640] smpboot: Total of 48 processors activated (259199.61 BogoMIPS)


From a "bad" boot, e.g. kernel 6.9.2, dmesg output has these messages in it:

[ 1.785937] smpboot: x86: Booting SMP configuration:
[ 1.785939] .... node #0, CPUs: #4
[ 1.786215] .... node #1, CPUs: #12 #16
[ 1.793547] MDS CPU bug present and SMT on, data leak possible. See
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.

[ 1.797547] .... node #0, CPUs: #1 #2 #3 #5 #6 #7 #8 #9 #10 #11
[ 1.801858] .... node #1, CPUs: #13 #14 #15 #17 #18 #19 #20 #21 #22 #23
[ 1.804687] .... node #0, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35
[ 1.810728] .... node #1, CPUs: #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47
[ 1.901547] smp: Brought up 2 nodes, 48 CPUs
[ 1.901547] smpboot: Total of 48 processors activated (259207.87 BogoMIPS)
[ 1.903803] BUG: arch topology borken
[ 1.903879] the SMT domain not a subset of the CLS domain
[ 1.903970] BUG: arch topology borken
[ 1.904040] the SMT domain not a subset of the CLS domain
[ 1.904128] BUG: arch topology borken
[ 1.904198] the SMT domain not a subset of the CLS domain

... and this "BUG" line and the one following it repeat 48 times, which is the number of
logical CPUs this machine has. Also, there is a funny typo in the message, but I guess that
might be intentional?! Moreover, I noticed that the detection message for CPU #12 on
node #1 is missing, so the counting may be wrong?!
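
For context, the repeated message pair reads like a containment check between two topology levels: the set of CPUs in the lower level (SMT siblings) is expected to be a subset of the set in the next level up (CLS, the cluster level). A minimal sketch of such a check in C follows. The bitmask type, the helper names, and the example sibling pairing of CPU 0 with CPU 24 are assumptions made for illustration and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per logical CPU; 64 bits are enough for the 48 CPUs reported here. */
typedef uint64_t cpu_mask_t;

/* Hypothetical helper: build a mask with the bits for two CPU numbers set. */
cpu_mask_t mask_of2(unsigned int a, unsigned int b)
{
	assert(a < 64 && b < 64);
	return ((cpu_mask_t)1 << a) | ((cpu_mask_t)1 << b);
}

/* True if every CPU in 'child' is also present in 'parent'. */
bool mask_is_subset(cpu_mask_t child, cpu_mask_t parent)
{
	return (child & ~parent) == 0;
}
```

With an assumed sibling set {0, 24} and a parent set that wrongly contains only CPU 0, `mask_is_subset()` returns false, which is the kind of condition the message complains about once per CPU.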

However, the machine boots, and apart from these strange messages, I cannot detect any
other abnormal behaviour. It is running ~15 QEMU/KVM virtual machines just fine. Because
these messages look unusual and a bit scary, though, I have bisected the issue so that I
can report it here. The first bad commit I found is this one:


22d63660c35eb751c63a709bf901a64c1726592a is the first bad commit
commit 22d63660c35eb751c63a709bf901a64c1726592a
Author: Thomas Gleixner <[email protected]>
Date: Tue Feb 13 22:04:08 2024 +0100

x86/cpu: Use common topology code for Intel

Intel CPUs use either topology leaf 0xb/0x1f evaluation or the legacy
SMP/HT evaluation based on CPUID leaf 0x1/0x4.

Move it over to the consolidated topology code and remove the random
topology hacks which are sprinkled into the Intel and the common code.

No functional change intended.

Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Juergen Gross <[email protected]>
Tested-by: Sohil Mehta <[email protected]>
Tested-by: Michael Kelley <[email protected]>
Tested-by: Zhang Rui <[email protected]>
Tested-by: Wang Wendy <[email protected]>
Tested-by: K Prateek Nayak <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

arch/x86/kernel/cpu/common.c | 65 -----------------------------------
arch/x86/kernel/cpu/cpu.h | 4 ---
arch/x86/kernel/cpu/intel.c | 25 --------------
arch/x86/kernel/cpu/topology.c | 22 ------------
arch/x86/kernel/cpu/topology_common.c | 5 ++-
5 files changed, 4 insertions(+), 117 deletions(-)
root@linus:/usr/src/linux#


I attach my bisect log, and full dmesg output from a good and from a bad kernel version.

Moreover, the last 3 bad kernels from my bisect session did not boot at all, including the
one with commit SHA1 from the first bad commit above. These kernels also had the series of
"BUG" messages scrolling through on the console, and then additionally a kernel panic,
seemingly coming from a divide exception from function init_intel_microcode:


<5>[ 5.968685] Key type dns_resolver registered
<4>[ 5.974402] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
<4>[ 5.977017] divide error: 0000 [#1] PREEMPT SMP PTI
<4>[ 5.977116] CPU: 9 PID: 1 Comm: swapper/0 Not tainted 6.8.0-rc4+ #1
<4>[ 5.977213] Hardware name: ASUSTeK COMPUTER INC. Z9PE-D16 Series/Z9PE-D16 Series,
BIOS 5601 06/11/2015
<4>[ 5.977337] RIP: 0010:init_intel_microcode+0x3c/0x80
<4>[ 5.977436] Code: ff 75 44 40 80 fe 05 76 3e 48 8b 05 b6 45 f7 ff a9 00 00 00 40 75
30 8b 05 85 46 f7 ff 0f b7 0d aa 46 f7 ff 31 d2 48 c1 e0 0a <48> f7 f1 89 05 9b f9 46 ff
48 c7 c0 c0 98 e4 a8 31 d2 31 c9 31 f6
<4>[ 5.977602] RSP: 0000:ffffb79b8008fd80 EFLAGS: 00010206
<4>[ 5.977697] RAX: 0000000001e00000 RBX: 0000000000000000 RCX: 0000000000000000
<4>[ 5.977795] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000000
<4>[ 5.977894] RBP: ffffb79b8008fdf8 R08: 0000000000000000 R09: 0000000000000000
<4>[ 5.977992] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[ 5.978090] R13: 000000000000019a R14: ffffb79b8008fe08 R15: ffff96ad4026cf00
<4>[ 5.978187] FS: 0000000000000000(0000) GS:ffff96cc3fa40000(0000) knlGS:0000000000000000
<4>[ 5.978308] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 5.978402] CR2: 0000000000000000 CR3: 0000000e6d236001 CR4: 00000000001706f0
<4>[ 5.978500] Call Trace:
<4>[ 5.978588] <TASK>
<4>[ 5.978675] ? show_regs+0x6d/0x80
<4>[ 5.978767] ? die+0x37/0xa0
<4>[ 5.978857] ? do_trap+0xd4/0xf0
<4>[ 5.978948] ? do_error_trap+0x71/0xb0
<4>[ 5.979040] ? init_intel_microcode+0x3c/0x80
<4>[ 5.979131] ? exc_divide_error+0x3a/0x70
<4>[ 5.979226] ? init_intel_microcode+0x3c/0x80
<4>[ 5.979317] ? asm_exc_divide_error+0x1b/0x20
<4>[ 5.979427] ? init_intel_microcode+0x3c/0x80
<4>[ 5.979520] ? microcode_init+0x196/0x260
<4>[ 5.979612] ? __pfx_microcode_init+0x10/0x10
<4>[ 5.979718] do_one_initcall+0x5e/0x340
<4>[ 5.979813] kernel_init_freeable+0x322/0x490
<4>[ 5.979906] ? __pfx_kernel_init+0x10/0x10
<4>[ 5.979998] kernel_init+0x1b/0x200
<4>[ 5.980089] ret_from_fork+0x47/0x70
<4>[ 5.980180] ? __pfx_kernel_init+0x10/0x10
<4>[ 5.980272] ret_from_fork_asm+0x1b/0x30
<4>[ 5.980364] </TASK>
<4>[ 5.980450] Modules linked in:
<4>[ 5.980544] ---[ end trace 0000000000000000 ]---
<4>[ 6.959943] RIP: 0010:init_intel_microcode+0x3c/0x80
<4>[ 6.960041] Code: ff 75 44 40 80 fe 05 76 3e 48 8b 05 b6 45 f7 ff a9 00 00 00 40 75
30 8b 05 85 46 f7 ff 0f b7 0d aa 46 f7 ff 31 d2 48 c1 e0 0a <48> f7 f1 89 05 9b f9 46 ff
48 c7 c0 c0 98 e4 a8 31 d2 31 c9 31 f6
<4>[ 6.960207] RSP: 0000:ffffb79b8008fd80 EFLAGS: 00010206
<4>[ 6.960316] RAX: 0000000001e00000 RBX: 0000000000000000 RCX: 0000000000000000
<4>[ 6.960414] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000000
<4>[ 6.960512] RBP: ffffb79b8008fdf8 R08: 0000000000000000 R09: 0000000000000000
<4>[ 6.960610] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[ 6.960708] R13: 000000000000019a R14: ffffb79b8008fe08 R15: ffff96ad4026cf00
<4>[ 6.960806] FS: 0000000000000000(0000) GS:ffff96cc3fa40000(0000) knlGS:0000000000000000
<4>[ 6.960927] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 6.961021] CR2: 0000000000000000 CR3: 0000000e6d236001 CR4: 00000000001706f0
<0>[ 6.961120] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
<0>[ 6.961312] Kernel Offset: 0x25c00000 from 0xffffffff81000000 (relocation range:
0xffffffff80000000-0xffffffffbfffffff)


I also attached full dmesg log file "dmesg-erst-7373208397568540677" of this panic which I
could find in /var/lib/systemd/pstore.


Best regards,
Peter Schneider

--
Climb the mountain not to plant your flag, but to embrace the challenge,
enjoy the air and behold the view. Climb it so you can see the world,
not so the world can see you. -- David McCullough Jr.

OpenPGP: 0xA3828BD796CCE11A8CADE8866E3A92C92C3FF244
Download: https://www.peters-netzplatz.de/download/pschneider1968_pub.asc
https://keys.mailvelope.com/pks/lookup?op=get&[email protected]
https://keys.mailvelope.com/pks/lookup?op=get&[email protected]


Attachments:
git_bisect.log (3.04 kB)
dmesg_v6.9.2_Bad.txt (123.41 kB)
dmesg_v6.8.11_Good.txt (117.79 kB)
dmesg-erst-7373208397568540677 (12.95 kB)
OpenPGP_signature.asc (243.00 B)

2024-05-27 13:15:43

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Mon, May 27 2024 at 09:29, Peter Schneider wrote:
> This is coming from an older server machine: 2-socket Ivy Bridge Xeon E5-2697 v2 (24C/48T)
> in an Asus Z9PE-D16/2L motherboard (Intel C-602A chipset); BIOS patched to the latest
> available from Asus. All memory slots occupied, so 256 GB RAM in total.
>
> From a "good boot", e.g. kernel 6.8.11, dmesg output looks like this:
>
> [ 1.823797] smpboot: x86: Booting SMP configuration:
> [ 1.823799] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11
> [ 1.827514] .... node #1, CPUs: #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23
> [ 0.011462] smpboot: CPU 12 Converting physical 0 to logical die 1
>
> [ 1.875532] .... node #0, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35
> [ 1.882453] .... node #1, CPUs: #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47
> [ 1.887532] MDS CPU bug present and SMT on, data leak possible. See
> https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
> [ 1.933640] smp: Brought up 2 nodes, 48 CPUs
> [ 1.933640] smpboot: Max logical packages: 2
> [ 1.933640] smpboot: Total of 48 processors activated (259199.61 BogoMIPS)
>
>
> From a "bad" boot, e.g. kernel 6.9.2, dmesg output has these messages in it:
>
> [ 1.785937] smpboot: x86: Booting SMP configuration:
> [ 1.785939] .... node #0, CPUs: #4
> [ 1.786215] .... node #1, CPUs: #12 #16

Yuck. That does not make any sense.

> [ 1.797547] .... node #0, CPUs: #1 #2 #3 #5 #6 #7 #8 #9 #10 #11
> [ 1.801858] .... node #1, CPUs: #13 #14 #15 #17 #18 #19 #20 #21 #22 #23
> [ 1.804687] .... node #0, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35
> [ 1.810728] .... node #1, CPUs: #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47

> However the machine boots, and except from these strange messages, I cannot detect any
> other abnormal behaviour. It is running ~15 QEMU/KVM virtual machines just fine. Because
> these messages look unusual and a bit scary though, I have bisected the issue, to be able
> to report it here. The first bad commit I found is this one:

Ok. So as the machine is booting, can you please provide the output of:

cat /sys/kernel/debug/x86/topo/cpus/*

on the 6.9 kernel and

cat /proc/cpuinfo

for both 6.8 and 6.9?

Thanks,

tglx

2024-05-27 20:49:48

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Mon, May 27 2024 at 15:14, Thomas Gleixner wrote:
> On Mon, May 27 2024 at 09:29, Peter Schneider wrote:
>> This is coming from an older server machine: 2-socket Ivy Bridge Xeon E5-2697 v2 (24C/48T)
>> in an Asus Z9PE-D16/2L motherboard (Intel C-602A chipset); BIOS patched to the latest
>> available from Asus. All memory slots occupied, so 256 GB RAM in total.
>>
>> From a "good boot", e.g. kernel 6.8.11, dmesg output looks like this:
>>
>> [ 1.823797] smpboot: x86: Booting SMP configuration:
>> [ 1.823799] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11
>> [ 1.827514] .... node #1, CPUs: #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23
>> [ 0.011462] smpboot: CPU 12 Converting physical 0 to logical die 1
>>
>> [ 1.875532] .... node #0, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35
>> [ 1.882453] .... node #1, CPUs: #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47
>> [ 1.887532] MDS CPU bug present and SMT on, data leak possible. See
>> https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
>> [ 1.933640] smp: Brought up 2 nodes, 48 CPUs
>> [ 1.933640] smpboot: Max logical packages: 2
>> [ 1.933640] smpboot: Total of 48 processors activated (259199.61 BogoMIPS)
>>
>>
>> From a "bad" boot, e.g. kernel 6.9.2, dmesg output has these messages in it:
>>
>> [ 1.785937] smpboot: x86: Booting SMP configuration:
>> [ 1.785939] .... node #0, CPUs: #4
>> [ 1.786215] .... node #1, CPUs: #12 #16
>
> Yuck. That does not make any sense.
>
>> [ 1.797547] .... node #0, CPUs: #1 #2 #3 #5 #6 #7 #8 #9 #10 #11
>> [ 1.801858] .... node #1, CPUs: #13 #14 #15 #17 #18 #19 #20 #21 #22 #23
>> [ 1.804687] .... node #0, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35
>> [ 1.810728] .... node #1, CPUs: #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47
>
>> However the machine boots, and except from these strange messages, I cannot detect any
>> other abnormal behaviour. It is running ~15 QEMU/KVM virtual machines just fine. Because
>> these messages look unusual and a bit scary though, I have bisected the issue, to be able
>> to report it here. The first bad commit I found is this one:
>
> Ok. So as the machine is booting, can you please provide the output of:
>
> cat /sys/kernel/debug/x86/topo/cpus/*
>
> on the 6.9 kernel and
>
> cat /proc/cpuinfo
>
> for both 6.8 and 6.9?

And once the output of:

cpuid -r

no matter on which kernel please?

Thanks,

tglx

2024-05-27 21:14:20

by Peter Schneider

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Hello Thomas,

thanks very much for looking into this issue!


On 27.05.2024 at 15:14, Thomas Gleixner wrote:

> Ok. So as the machine is booting, can you please provide the output of:
>
> cat /sys/kernel/debug/x86/topo/cpus/*
>
> on the 6.9 kernel and

Please find attached the files topo_cpus_RAW_6_8_11.txt, topo_cpus_RAW_6_9_2.txt,
topo_cpus_SORTED_6_8_11.txt and topo_cpus_SORTED_6_9_2.txt. For each kernel there is one
raw file, as requested, and one sorted file for easier navigation.


> cat /proc/cpuinfo
>
> for both 6.8 and 6.9?

Please find attached files cpuinfo_6_8_11.txt and cpuinfo_6_9_2.txt


> And once the output of:
>
> cpuid -r
>
> no matter on which kernel please?

Please find attached files cpuid.txt and cpuid-r.txt.

Best regards,
Peter Schneider



Attachments:
topo_cpus_RAW_6_8_11.txt (15.38 kB)
topo_cpus_RAW_6_9_2.txt (18.65 kB)
topo_cpus_SORTED_6_8_11.txt (17.06 kB)
topo_cpus_SORTED_6_9_2.txt (20.32 kB)
cpuinfo_6_8_11.txt (58.96 kB)
cpuinfo_6_9_2.txt (58.43 kB)
cpuid.txt (1.20 MB)
cpuid-r.txt (127.87 kB)
OpenPGP_signature.asc (243.00 B)

2024-05-27 21:16:54

by Peter Schneider

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Thomas,

On 27.05.2024 at 23:06, Peter Schneider wrote:
> Hello Thomas,
>
> thanks very much for looking into this issue!

[...]


I want to add one thing: there is a log entry in the dmesg output of a "bad" kernel, which
I initially overlooked, because it is way up, and I noticed this just now. I guess this
might be relevant:

[ 1.683564] [Firmware Bug]: CPU0: Topology domain 0 shift 1 != 5

This does not appear in the 6.8 kernel dmesg.

What do you think?

Best regards,
Peter Schneider



Attachments:
OpenPGP_signature.asc (243.00 B)

2024-05-27 21:48:28

by Christian Heusel

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Hey Peter,

On 24/05/27 11:15PM, Peter Schneider wrote:
>
> I want to add one thing: there is a log entry in the dmesg output of a "bad"
> kernel, which I initially overlooked, because it is way up, and I noticed
> this just now. I guess this might be relevant:
>
> [ 1.683564] [Firmware Bug]: CPU0: Topology domain 0 shift 1 != 5
>
> This does not appear in the 6.8 kernel dmesg.
>

I also can't comment on whether this is relevant or not, but I have
noticed this in more places:

- https://bugzilla.kernel.org/show_bug.cgi?id=218879
- https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/57

Cheers,
Chris


Attachments:
(No filename) (664.00 B)
signature.asc (849.00 B)

2024-05-30 08:30:23

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Peter!

On Mon, May 27 2024 at 23:15, Peter Schneider wrote:

Thanks for providing all the information!

> I want to add one thing: there is a log entry in the dmesg output of a "bad" kernel, which
> I initially overlooked, because it is way up, and I noticed this just now. I guess this
> might be relevant:
>
> [ 1.683564] [Firmware Bug]: CPU0: Topology domain 0 shift 1 != 5

Yes. That's absolutely related. I can see what goes wrong, but I have
absolutely no idea how that happens.

Can you please apply the debug patch below and provide the full dmesg
after boot?

Thanks,

tglx
---
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -65,6 +65,7 @@ static void parse_legacy(struct topo_sca
 		cores <<= smt_shift;
 	}

+	pr_info("Legacy: %u %u %u\n", c->cpuid_level, smt_shift, core_shift);
 	topology_set_dom(tscan, TOPO_SMT_DOMAIN, smt_shift, 1U << smt_shift);
 	topology_set_dom(tscan, TOPO_CORE_DOMAIN, core_shift, cores);
 }
--- a/arch/x86/kernel/cpu/topology_ext.c
+++ b/arch/x86/kernel/cpu/topology_ext.c
@@ -72,6 +72,9 @@ static inline bool topo_subleaf(struct t

 	cpuid_subleaf(leaf, subleaf, &sl);

+	pr_info("L:%0x %0x %0x S:%u N:%u T:%u\n", leaf, subleaf, sl.level, sl.x2apic_shift,
+		sl.num_processors, sl.type);
+
 	if (!sl.num_processors || sl.type == INVALID_TYPE)
 		return false;

@@ -97,6 +100,7 @@ static inline bool topo_subleaf(struct t
 		       leaf, subleaf, tscan->c->topo.initial_apicid, sl.x2apic_id);
 	}

+	pr_info("D: %u\n", dom);
 	topology_set_dom(tscan, dom, sl.x2apic_shift, sl.num_processors);
 	return true;
 }

2024-05-30 10:07:02

by Peter Schneider

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Hi Thomas,

On 30.05.24 at 10:30, Thomas Gleixner wrote:

> Can you please apply the debug patch below and provide the full dmesg
> after boot?

Here you go... The patch applied cleanly against 6.9.3, which I saw was just released by
Greg, so I used that. If you want, I can repeat the test against 6.9.2, too.

Please note: to be able to boot any kernel >= 6.8.4 on my machine, I also had to apply
this patch by Martin Petersen, fixing another (unrelated SCSI) regression I reported some
time ago, see here:

https://lore.kernel.org/all/[email protected]/

But I think these two issues are not connected in any way. It was during testing the above
patch by Martin that I noticed this new issue in 6.9 BTW.

I have attached resulting file dmesg_6.9.3-dirty_Bad_wDebugInfo.txt, and I hope you can
make some sense of it.

Best regards,
Peter Schneider



Attachments:
dmesg_v6.9.3-dirty_Bad_wDebugInfo.txt (124.89 kB)
OpenPGP_signature.asc (243.00 B)

2024-05-30 13:36:15

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Peter!

On Thu, May 30 2024 at 12:06, Peter Schneider wrote:
> On 30.05.24 at 10:30, Thomas Gleixner wrote:
>
>> Can you please apply the debug patch below and provide the full dmesg
>> after boot?
>
> Here you go... The patch applied cleanly against 6.9.3, which I saw
> was just released by Greg, so I used that. If you want, I can repeat
> the test against 6.9.2, too.

6.9.3 is fine.

> Please note: to be able to boot any kernel >= 6.8.4 on my machine, I also had to apply
> this patch by Martin Petersen, fixing another (unrelated SCSI) regression I reported some
> time ago, see here:
>
> https://lore.kernel.org/all/[email protected]/
>
> But I think these two issues are not connected in any way. It was during testing the above
> patch by Martin that I noticed this new issue in 6.9 BTW.

Right. It's a separate problem.

> I have attached resulting file dmesg_6.9.3-dirty_Bad_wDebugInfo.txt,
> and I hope you can make some sense of it.

It's exactly what I expected but it does not make any sense at all.

> [ 0.000000] Legacy: 2 5 5

So that means that during early boot, when the topology parameters are
decoded from CPUID, the CPUID evaluation code sees that the maximum
supported CPUID leaf is 0x02 and therefore reads complete nonsense.

Later on, when the full CPUID evaluation happens, it sees the full leaf
space and uses leaf 0xb.
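
To illustrate how a capped maximum leaf can produce the "Legacy: 2 5 5" line, here is a simplified model of the legacy shift calculation in C. It is a sketch under stated assumptions, not the kernel's code: the function and struct names are hypothetical, hyperthreading is assumed to be reported, and the input values (32 logical processors per package from leaf 1, 16 core IDs per package from leaf 4) are assumptions that the thread itself does not state.

```c
#include <assert.h>

/* Smallest order such that (1 << order) >= n, i.e. ceil(log2(n)). */
unsigned int count_order(unsigned int n)
{
	unsigned int order = 0;

	assert(n > 0);
	while ((1u << order) < n)
		order++;
	return order;
}

struct legacy_shifts {
	unsigned int smt_shift;
	unsigned int core_shift;
};

/*
 * Simplified model of a legacy (leaf 0x1/0x4) evaluation.
 *   max_leaf     - CPUID leaf 0 EAX as visible at that point in boot
 *   leaf1_nproc  - logical processors per package from leaf 1 (assumed input)
 *   leaf4_ncores - cores per package from leaf 4 (assumed input)
 * If leaf 4 is not visible, the model falls back to a single core.
 */
struct legacy_shifts compute_legacy_shifts(unsigned int max_leaf,
					   unsigned int leaf1_nproc,
					   unsigned int leaf4_ncores)
{
	unsigned int cores = (max_leaf < 4) ? 1 : leaf4_ncores;
	unsigned int nproc_shift = count_order(leaf1_nproc);
	struct legacy_shifts s;

	s.core_shift = count_order(cores);
	/* Whatever the cores do not account for is attributed to SMT. */
	s.smt_shift = (nproc_shift >= s.core_shift) ?
		      nproc_shift - s.core_shift : 0;
	/* Core-level shift counts threads as well, as in leaf 0xb. */
	s.core_shift += s.smt_shift;
	return s;
}
```

Under these assumptions, a maximum leaf of 2 yields shifts 5 and 5, matching the debug line, while an unrestricted maximum leaf yields 1 and 5, matching what leaf 0xb reports later in boot.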

> [ 1.687649] L:b 0 0 S:1 N:2 T:1
> [ 1.687652] D: 0
> [ 1.687653] L:b 1 1 S:5 N:24 T:2
> [ 1.687655] D: 1
> [ 1.687656] L:b 2 2 S:0 N:0 T:0
> [ 1.687658] [Firmware Bug]: CPU0: Topology domain 0 shift 1 != 5

And this obviously sees the proper numbers and complains about the
inconsistency.
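
The debug lines above print, per sub-leaf of CPUID leaf 0xb, the shift (S), the number of logical processors (N) and the level type (T). A minimal C sketch of that decoding follows, using the register layout documented for leaf 0xb (EAX[4:0] shift, EBX[15:0] processor count, ECX[7:0] level number, ECX[15:8] level type). The struct and function names are hypothetical, and the functions take raw register values as arguments so that the example does not depend on the machine it runs on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct topo_level {
	unsigned int shift;	/* EAX[4:0]: APIC ID bits covered by this level */
	unsigned int nproc;	/* EBX[15:0]: logical processors at this level */
	unsigned int level;	/* ECX[7:0]: level number (echoes the sub-leaf) */
	unsigned int type;	/* ECX[15:8]: 0 = invalid, 1 = SMT, 2 = core */
};

/* Split the raw registers of one leaf 0xb sub-leaf into named fields. */
struct topo_level decode_topo_subleaf(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
	struct topo_level t;

	t.shift = eax & 0x1f;
	t.nproc = ebx & 0xffff;
	t.level = ecx & 0xff;
	t.type = (ecx >> 8) & 0xff;
	return t;
}

/* A sub-leaf with no processors or an invalid type ends the enumeration. */
bool topo_level_valid(const struct topo_level *t)
{
	assert(t != 0);
	return t->nproc != 0 && t->type != 0;
}

/* True if a shift recorded earlier agrees with a freshly decoded one. */
bool shift_consistent(unsigned int recorded, unsigned int fresh)
{
	return recorded == fresh;
}
```

Feeding in register values that correspond to "S:1 N:2 T:1" gives an SMT level with shift 1. Comparing that against the shift of 5 recorded by the early legacy evaluation fails the consistency check, which is the situation the "[Firmware Bug]" line describes.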

So something on this CPU is broken. The same problem exists on all APs:

> [ 1.790035] .... node #0, CPUs: #4
> [ 1.790312] .... node #1, CPUs: #12 #16
> [ 0.011992] Legacy: 2 5 5
> [ 0.011992] Legacy: 2 5 5
> [ 0.011992] Legacy: 2 5 5
> [ 0.011992] Legacy: 2 5 5
.....

Now the million-dollar question is what unlocks CPUID to read the proper
value of EAX of leaf 0. All I could come up with is to sprinkle a dozen
printks into that code. Updated debug patch below.

Thanks,

tglx
---
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -65,6 +65,7 @@ static void parse_legacy(struct topo_sca
 		cores <<= smt_shift;
 	}

+	pr_info("Legacy: %u %u %u\n", c->cpuid_level, smt_shift, core_shift);
 	topology_set_dom(tscan, TOPO_SMT_DOMAIN, smt_shift, 1U << smt_shift);
 	topology_set_dom(tscan, TOPO_CORE_DOMAIN, core_shift, cores);
 }
--- a/arch/x86/kernel/cpu/topology_ext.c
+++ b/arch/x86/kernel/cpu/topology_ext.c
@@ -72,6 +72,9 @@ static inline bool topo_subleaf(struct t

 	cpuid_subleaf(leaf, subleaf, &sl);

+	pr_info("L:%0x %0x %0x S:%u N:%u T:%u\n", leaf, subleaf, sl.level, sl.x2apic_shift,
+		sl.num_processors, sl.type);
+
 	if (!sl.num_processors || sl.type == INVALID_TYPE)
 		return false;

@@ -97,6 +100,7 @@ static inline bool topo_subleaf(struct t
 		       leaf, subleaf, tscan->c->topo.initial_apicid, sl.x2apic_id);
 	}

+	pr_info("D: %u\n", dom);
 	topology_set_dom(tscan, dom, sl.x2apic_shift, sl.num_processors);
 	return true;
 }
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1584,22 +1584,30 @@ static void __init early_identify_cpu(st
 	/* cyrix could have cpuid enabled via c_identify()*/
 	if (have_cpuid_p()) {
 		cpu_detect(c);
+		pr_info("MAXL1: %x\n", cpuid_eax(0));
 		get_cpu_vendor(c);
+		pr_info("MAXL2: %x\n", cpuid_eax(0));
 		get_cpu_cap(c);
+		pr_info("MAXL3: %x\n", cpuid_eax(0));
 		setup_force_cpu_cap(X86_FEATURE_CPUID);
 		get_cpu_address_sizes(c);
+		pr_info("MAXL4: %x\n", cpuid_eax(0));
 		cpu_parse_early_param();
+		pr_info("MAXL5: %x\n", cpuid_eax(0));

 		cpu_init_topology(c);
+		pr_info("MAXL6: %x\n", cpuid_eax(0));

 		if (this_cpu->c_early_init)
 			this_cpu->c_early_init(c);
+		pr_info("MAXL7: %x\n", cpuid_eax(0));

 		c->cpu_index = 0;
 		filter_cpuid_features(c, false);

 		if (this_cpu->c_bsp_init)
 			this_cpu->c_bsp_init(c);
+		pr_info("MAXL8: %x\n", cpuid_eax(0));
 	} else {
 		setup_clear_cpu_cap(X86_FEATURE_CPUID);
 		get_cpu_address_sizes(c);
@@ -1797,9 +1805,12 @@ static void identify_cpu(struct cpuinfo_
 #ifdef CONFIG_X86_VMX_FEATURE_NAMES
 	memset(&c->vmx_capability, 0, sizeof(c->vmx_capability));
 #endif
+	pr_info("MAXLG1: %x\n", cpuid_eax(0));

 	generic_identify(c);

+	pr_info("MAXLG2: %x\n", cpuid_eax(0));
+
 	cpu_parse_topology(c);

 	if (this_cpu->c_identify)

2024-05-30 16:17:47

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Thu, May 30 2024 at 15:35, Thomas Gleixner wrote:
> On Thu, May 30 2024 at 12:06, Peter Schneider wrote:
> Now the million-dollar question is what unlocks CPUID to read the proper
> value of EAX of leaf 0. All I could come up with is to sprinkle a dozen
> of printks into that code. Updated debug patch below.

Don't bother. Dave pointed out to me that this is unlocked in
early_init_intel() via MSR_IA32_MISC_ENABLE_LIMIT_CPUID...

Let me figure out how to fix that sanely.

Thanks,

tglx

2024-05-30 16:24:53

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Thu, May 30 2024 at 17:53, Thomas Gleixner wrote:
> On Thu, May 30 2024 at 15:35, Thomas Gleixner wrote:
>> On Thu, May 30 2024 at 12:06, Peter Schneider wrote:
>> Now the million-dollar question is what unlocks CPUID to read the proper
>> value of EAX of leaf 0. All I could come up with is to sprinkle a dozen
>> of printks into that code. Updated debug patch below.
>
> Don't bother. Dave pointed out to me that this is unlocked in
> early_init_intel() via MSR_IA32_MISC_ENABLE_LIMIT_CPUID...
>
> Let me figure out how to fix that sanely.

The original code just worked because it was reevaluating this stuff
over and over until it magically became "correct".

The proper fix is obviously to unlock CPUID on Intel _before_ anything
which depends on cpuid_level is evaluated.
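
The unlock step can be modelled in plain C as two small decisions: whether the family/model gate applies, and whether the limit bit was actually set before it is cleared. This is a sketch that operates on an in-memory copy of the register value. The function names are hypothetical, and real code would have to read and write the MSR in kernel context instead of touching a variable. The bit position (bit 22 of IA32_MISC_ENABLE, MSR 0x1a0) is the documented "Limit CPUID Maxval" bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 22 of IA32_MISC_ENABLE (MSR 0x1a0): limit CPUID maximum leaf. */
#define MISC_ENABLE_LIMIT_CPUID_BIT	22

/* Family/model gate: only family 6 model >= 0xd, or any later family. */
bool cpuid_limit_applies(unsigned int family, unsigned int model)
{
	if (family < 6 || (family == 6 && model < 0xd))
		return false;
	return true;
}

/*
 * Clear the limit bit in a copy of the MSR value.
 * Returns true if the bit was set, which is the case in which the
 * cached maximum leaf has to be re-read from CPUID leaf 0 afterwards.
 */
bool clear_cpuid_limit(uint64_t *msr_val)
{
	const uint64_t bit = (uint64_t)1 << MISC_ENABLE_LIMIT_CPUID_BIT;
	bool was_set;

	assert(msr_val != NULL);
	was_set = (*msr_val & bit) != 0;
	*msr_val &= ~bit;
	return was_set;
}
```

The ordering is the point: anything that consults the cached maximum leaf before this step sees the capped value of 2, so the model only helps if it runs before the first such consumer.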

Thanks,

tglx
---
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -969,7 +969,7 @@ static void init_speculation_control(str
 	}
 }

-void get_cpu_cap(struct cpuinfo_x86 *c)
+static void get_cpu_cap(struct cpuinfo_x86 *c)
 {
 	u32 eax, ebx, ecx, edx;

@@ -1585,6 +1585,7 @@ static void __init early_identify_cpu(st
 	if (have_cpuid_p()) {
 		cpu_detect(c);
 		get_cpu_vendor(c);
+		intel_unlock_cpuid_leafs(c);
 		get_cpu_cap(c);
 		setup_force_cpu_cap(X86_FEATURE_CPUID);
 		get_cpu_address_sizes(c);
@@ -1744,7 +1745,7 @@ static void generic_identify(struct cpui
 	cpu_detect(c);

 	get_cpu_vendor(c);
-
+	intel_unlock_cpuid_leafs(c);
 	get_cpu_cap(c);

 	get_cpu_address_sizes(c);
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -61,14 +61,15 @@ extern __ro_after_init enum tsx_ctrl_sta

 extern void __init tsx_init(void);
 void tsx_ap_init(void);
+void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c);
 #else
 static inline void tsx_init(void) { }
 static inline void tsx_ap_init(void) { }
+static inline void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c) { }
 #endif /* CONFIG_CPU_SUP_INTEL */

 extern void init_spectral_chicken(struct cpuinfo_x86 *c);

-extern void get_cpu_cap(struct cpuinfo_x86 *c);
 extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
 extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
 extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -269,19 +269,26 @@ static void detect_tme_early(struct cpui
 	c->x86_phys_bits -= keyid_bits;
 }

+void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c)
+{
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+		return;
+
+	if (c->x86 < 6 || (c->x86 == 6 && c->x86_model < 0xd))
+		return;
+
+	/*
+	 * The BIOS can have limited CPUID to leaf 2, which breaks feature
+	 * enumeration. Unlock it and update the maximum leaf info.
+	 */
+	if (msr_clear_bit(MSR_IA32_MISC_ENABLE, MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0)
+		c->cpuid_level = cpuid_eax(0);
+}
+
 static void early_init_intel(struct cpuinfo_x86 *c)
 {
 	u64 misc_enable;

-	/* Unmask CPUID levels if masked: */
-	if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
-		if (msr_clear_bit(MSR_IA32_MISC_ENABLE,
-				  MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0) {
-			c->cpuid_level = cpuid_eax(0);
-			get_cpu_cap(c);
-		}
-	}
-
 	if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
 	    (c->x86 == 0x6 && c->x86_model >= 0x0e))
 		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);





2024-05-31 06:53:11

by Peter Schneider

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Hi Thomas,


On 30.05.2024 at 18:24, Thomas Gleixner wrote:

>
> The proper fix is obviously to unlock CPUID on Intel _before_ anything
> which depends on cpuid_level is evaluated.
>
> Thanks,
>
> tglx
> ---
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -969,7 +969,7 @@ static void init_speculation_control(str
> }
> }
>
> -void get_cpu_cap(struct cpuinfo_x86 *c)
> +static void get_cpu_cap(struct cpuinfo_x86 *c)
> {
> u32 eax, ebx, ecx, edx;
>
> @@ -1585,6 +1585,7 @@ static void __init early_identify_cpu(st
> if (have_cpuid_p()) {
> cpu_detect(c);
> get_cpu_vendor(c);
> + intel_unlock_cpuid_leafs(c);
> get_cpu_cap(c);
> setup_force_cpu_cap(X86_FEATURE_CPUID);
> get_cpu_address_sizes(c);
> @@ -1744,7 +1745,7 @@ static void generic_identify(struct cpui
> cpu_detect(c);
>
> get_cpu_vendor(c);
> -
> + intel_unlock_cpuid_leafs(c);
> get_cpu_cap(c);
>
> get_cpu_address_sizes(c);
> --- a/arch/x86/kernel/cpu/cpu.h
> +++ b/arch/x86/kernel/cpu/cpu.h
> @@ -61,14 +61,15 @@ extern __ro_after_init enum tsx_ctrl_sta
>
> extern void __init tsx_init(void);
> void tsx_ap_init(void);
> +void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c);
> #else
> static inline void tsx_init(void) { }
> static inline void tsx_ap_init(void) { }
> +static inline void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c) { }
> #endif /* CONFIG_CPU_SUP_INTEL */
>
> extern void init_spectral_chicken(struct cpuinfo_x86 *c);
>
> -extern void get_cpu_cap(struct cpuinfo_x86 *c);
> extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
> extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
> extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -269,19 +269,26 @@ static void detect_tme_early(struct cpui
> c->x86_phys_bits -= keyid_bits;
> }
>
> +void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c)
> +{
> + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
> + return;
> +
> + if (c->x86 < 6 || (c->x86 == 6 && c->x86_model < 0xd))
> + return;
> +
> + /*
> + * The BIOS can have limited CPUID to leaf 2, which breaks feature
> + * enumeration. Unlock it and update the maximum leaf info.
> + */
> + if (msr_clear_bit(MSR_IA32_MISC_ENABLE, MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0)
> + c->cpuid_level = cpuid_eax(0);
> +}
> +
> static void early_init_intel(struct cpuinfo_x86 *c)
> {
> u64 misc_enable;
>
> - /* Unmask CPUID levels if masked: */
> - if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
> - if (msr_clear_bit(MSR_IA32_MISC_ENABLE,
> - MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0) {
> - c->cpuid_level = cpuid_eax(0);
> - get_cpu_cap(c);
> - }
> - }
> -
> if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
> (c->x86 == 0x6 && c->x86_model >= 0x0e))
> set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
>


With that patch applied, I now get a build error:

CC [M] drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp.o
CC [M] drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp1_execution.o
CC [M] drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp1_transition.o
CC [M] drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp2_execution.o
CC [M] drivers/gpu/drm/amd/amdgpu/../display/modules/hdcp/hdcp2_transition.o
LD [M] drivers/gpu/drm/amd/amdgpu/amdgpu.o
AR drivers/gpu/built-in.a
AR drivers/built-in.a
make[1]: *** [/usr/src/linux/Makefile:1919: .] Error 2
make: *** [Makefile:240: __sub-make] Error 2
root@linus:/usr/src/linux# make
CALL scripts/checksyscalls.sh
DESCEND objtool
INSTALL libsubcmd_headers
DESCEND bpf/resolve_btfids
INSTALL libsubcmd_headers
CC arch/x86/xen/enlighten_pv.o
arch/x86/xen/enlighten_pv.c: In function ‘xen_start_kernel’:
arch/x86/xen/enlighten_pv.c:1388:9: error: implicit declaration of function
‘get_cpu_cap’; did you mean ‘set_cpu_cap’? [-Werror=implicit-function-declaration]
1388 | get_cpu_cap(&boot_cpu_data);
| ^~~~~~~~~~~
| set_cpu_cap
cc1: some warnings being treated as errors
make[4]: *** [scripts/Makefile.build:244: arch/x86/xen/enlighten_pv.o] Error 1
make[3]: *** [scripts/Makefile.build:485: arch/x86/xen] Error 2
make[2]: *** [scripts/Makefile.build:485: arch/x86] Error 2
make[1]: *** [/usr/src/linux/Makefile:1919: .] Error 2
make: *** [Makefile:240: __sub-make] Error 2
root@linus:/usr/src/linux#


I used the kernel config of my Proxmox VE kernel, like so:

root@linus:/usr/src/linux# cp /boot/config-6.5.13-5-pve .config

and then ran "make olddefconfig", and then "make -j 48". That's how I tested all these
patches, including Martin's previously mentioned SCSI patch, and this used to work. I
have attached the .config file.

I am not a C programmer, let alone a kernel dev, so please bear with me if this is
nonsense, but: could the reason be that with your change, you have removed the declaration
of get_cpu_cap from the cpu.h header file, while it is still being referenced in
arch/x86/xen/enlighten_pv.c like so:

#include "../kernel/cpu/cpu.h" /* get_cpu_cap() */

Should I try to just add it back in, and see if that works? Or would you prefer to look
more deeply at this first, and then send me a reworked patch?

Best regards,
Peter Schneider

--
Climb the mountain not to plant your flag, but to embrace the challenge,
enjoy the air and behold the view. Climb it so you can see the world,
not so the world can see you. -- David McCullough Jr.

OpenPGP: 0xA3828BD796CCE11A8CADE8866E3A92C92C3FF244
Download: https://www.peters-netzplatz.de/download/pschneider1968_pub.asc
https://keys.mailvelope.com/pks/lookup?op=get&[email protected]


Attachments:
.config (280.83 kB)
OpenPGP_signature.asc (243.00 B)
OpenPGP digital signature

2024-05-31 08:15:26

by Christian Heusel

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On 24/05/30 06:24PM, Thomas Gleixner wrote:
> On Thu, May 30 2024 at 17:53, Thomas Gleixner wrote:
>
> > Let me figure out how to fix that sanely.
>
> The proper fix is obviously to unlock CPUID on Intel _before_ anything
> which depends on cpuid_level is evaluated.
>
> Thanks,
>
> tglx

Hey Thomas,

as reported in the other mail, the proposed fix broke the build (see
below) because get_cpu_cap() became static while still being used in
other parts of the code.

One of the reporters in the Arch bug tracker, who has an Intel Core
i7-7700k, tested a modified version of this fix[0] (with the static change
reverted) on top of the 6.9.2 stable kernel and reports that the patch
does not fix the issue for them. I have attached their output for the
patched (dmesg6.9.2-1.5.log) and unpatched (dmesg6.9.2-1.log) kernel.

Should we also get them to test the mainline version or do you need any
other debug output?

Cheers,
gromit

[0]: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/57#note_189079

> ---
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -969,7 +969,7 @@ static void init_speculation_control(str
> }
> }
>
> -void get_cpu_cap(struct cpuinfo_x86 *c)
> +static void get_cpu_cap(struct cpuinfo_x86 *c)

making this function static breaks the build for me:

arch/x86/xen/enlighten_pv.c: In function ‘xen_start_kernel’:
arch/x86/xen/enlighten_pv.c:1388:9: error: implicit declaration of function ‘get_cpu_cap’; did you mean ‘set_cpu_cap’? [-Wimplicit-function-declaration]
1388 | get_cpu_cap(&boot_cpu_data);
| ^~~~~~~~~~~
| set_cpu_cap


> {
> u32 eax, ebx, ecx, edx;
>
> @@ -1585,6 +1585,7 @@ static void __init early_identify_cpu(st
> if (have_cpuid_p()) {
> cpu_detect(c);
> get_cpu_vendor(c);
> + intel_unlock_cpuid_leafs(c);
> get_cpu_cap(c);
> setup_force_cpu_cap(X86_FEATURE_CPUID);
> get_cpu_address_sizes(c);
> @@ -1744,7 +1745,7 @@ static void generic_identify(struct cpui
> cpu_detect(c);
>
> get_cpu_vendor(c);
> -
> + intel_unlock_cpuid_leafs(c);
> get_cpu_cap(c);
>
> get_cpu_address_sizes(c);
> --- a/arch/x86/kernel/cpu/cpu.h
> +++ b/arch/x86/kernel/cpu/cpu.h
> @@ -61,14 +61,15 @@ extern __ro_after_init enum tsx_ctrl_sta
>
> extern void __init tsx_init(void);
> void tsx_ap_init(void);
> +void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c);
> #else
> static inline void tsx_init(void) { }
> static inline void tsx_ap_init(void) { }
> +static inline void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c) { }
> #endif /* CONFIG_CPU_SUP_INTEL */
>
> extern void init_spectral_chicken(struct cpuinfo_x86 *c);
>
> -extern void get_cpu_cap(struct cpuinfo_x86 *c);
> extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
> extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
> extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -269,19 +269,26 @@ static void detect_tme_early(struct cpui
> c->x86_phys_bits -= keyid_bits;
> }
>
> +void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c)
> +{
> + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
> + return;
> +
> + if (c->x86 < 6 || (c->x86 == 6 && c->x86_model < 0xd))
> + return;
> +
> + /*
> + * The BIOS can have limited CPUID to leaf 2, which breaks feature
> + * enumeration. Unlock it and update the maximum leaf info.
> + */
> + if (msr_clear_bit(MSR_IA32_MISC_ENABLE, MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0)
> + c->cpuid_level = cpuid_eax(0);
> +}
> +
> static void early_init_intel(struct cpuinfo_x86 *c)
> {
> u64 misc_enable;
>
> - /* Unmask CPUID levels if masked: */
> - if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
> - if (msr_clear_bit(MSR_IA32_MISC_ENABLE,
> - MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0) {
> - c->cpuid_level = cpuid_eax(0);
> - get_cpu_cap(c);
> - }
> - }
> -
> if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
> (c->x86 == 0x6 && c->x86_model >= 0x0e))
> set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
>


Attachments:
(No filename) (4.16 kB)
signature.asc (849.00 B)

2024-05-31 08:18:33

by Christian Heusel

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On 24/05/31 10:13AM, Christian Heusel wrote:
> On 24/05/30 06:24PM, Thomas Gleixner wrote:
> > On Thu, May 30 2024 at 17:53, Thomas Gleixner wrote:
> >
> > > Let me figure out how to fix that sanely.
> >
> > The proper fix is obviously to unlock CPUID on Intel _before_ anything
> > which depends on cpuid_level is evaluated.
> >
> > Thanks,
> >
> > tglx
>
> Hey Thomas,
>
> as reported on the other mail the proposed fix broke the build (see
> below) due to get_cpu_cap() becoming static but still being used in
> other parts of the code.
>
> One of the reporters in the Arch Bugtracker with an Intel Core i7-7700k
> has tested a modified version of this fix[0] with the static change
> reversed on top of the 6.9.2 stable kernel and reports that the patch
> does not fix the issue for them. I have attached their output for the
> patched (dmesg6.9.2-1.5.log) and nonpatched (dmesg6.9.2-1.log) kernel.
>
> Should we also get them to test the mainline version or do you need any
> other debug output?
>
> Cheers,
> gromit
>
> [0]: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/57#note_189079

Now with the logs really attached!

Cheers,
Chris


Attachments:
(No filename) (0.00 B)
signature.asc (849.00 B)

2024-05-31 08:33:26

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Peter!

On Fri, May 31 2024 at 08:52, Peter Schneider wrote:
> On 30.05.2024 at 18:24, Thomas Gleixner wrote:
> With that patch applied, I now get a build error:
>
> arch/x86/xen/enlighten_pv.c: In function ‘xen_start_kernel’:
> arch/x86/xen/enlighten_pv.c:1388:9: error: implicit declaration of function
> ‘get_cpu_cap’; did you mean ‘set_cpu_cap’? [-Werror=implicit-function-declaration]
> 1388 | get_cpu_cap(&boot_cpu_data);

Bah. Updated patch below.

Thanks,

tglx
---
Subject: x86/topology/intel: Unlock CPUID before evaluating anything
From: Thomas Gleixner <[email protected]>
Date: Thu, 30 May 2024 17:29:18 +0200

Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If this
bit is set by the BIOS then CPUID evaluation including topology enumeration
does not work correctly as the evaluation code does not try to analyze any
leaf greater than two.

This went unnoticed before because the original topology code just repeated
evaluation several times and managed to overwrite the initial limited
information with the correct one later. The new evaluation code does it
once and therefore ends up with the limited and wrong information.

Cure this by unlocking CPUID right before evaluating anything which depends
on the maximum CPUID leaf being greater than two instead of rereading stuff
after unlock.

Fixes: 22d63660c35e ("x86/cpu: Use common topology code for Intel")
Reported-by: Peter Schneider <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/common.c | 3 ++-
arch/x86/kernel/cpu/intel.c | 25 ++++++++++++++++---------
2 files changed, 18 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1585,6 +1585,7 @@ static void __init early_identify_cpu(st
if (have_cpuid_p()) {
cpu_detect(c);
get_cpu_vendor(c);
+ intel_unlock_cpuid_leafs(c);
get_cpu_cap(c);
setup_force_cpu_cap(X86_FEATURE_CPUID);
get_cpu_address_sizes(c);
@@ -1744,7 +1745,7 @@ static void generic_identify(struct cpui
cpu_detect(c);

get_cpu_vendor(c);
-
+ intel_unlock_cpuid_leafs(c);
get_cpu_cap(c);

get_cpu_address_sizes(c);
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -269,19 +269,26 @@ static void detect_tme_early(struct cpui
c->x86_phys_bits -= keyid_bits;
}

+void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c)
+{
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return;
+
+ if (c->x86 < 6 || (c->x86 == 6 && c->x86_model < 0xd))
+ return;
+
+ /*
+ * The BIOS can have limited CPUID to leaf 2, which breaks feature
+ * enumeration. Unlock it and update the maximum leaf info.
+ */
+ if (msr_clear_bit(MSR_IA32_MISC_ENABLE, MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0)
+ c->cpuid_level = cpuid_eax(0);
+}
+
static void early_init_intel(struct cpuinfo_x86 *c)
{
u64 misc_enable;

- /* Unmask CPUID levels if masked: */
- if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
- if (msr_clear_bit(MSR_IA32_MISC_ENABLE,
- MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0) {
- c->cpuid_level = cpuid_eax(0);
- get_cpu_cap(c);
- }
- }
-
if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
(c->x86 == 0x6 && c->x86_model >= 0x0e))
set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);

2024-05-31 08:49:50

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Christian!

On Fri, May 31 2024 at 10:16, Christian Heusel wrote:
>> One of the reporters in the Arch Bugtracker with an Intel Core i7-7700k
>> has tested a modified version of this fix[0] with the static change
>> reversed on top of the 6.9.2 stable kernel and reports that the patch
>> does not fix the issue for them. I have attached their output for the
>> patched (dmesg6.9.2-1.5.log) and nonpatched (dmesg6.9.2-1.log) kernel.
>>
>> Should we also get them to test the mainline version or do you need any
>> other debug output?

Can I get:

- dmesg from 6.8.y kernel
- output of cpuid -r
- content of /sys/kernel/debug/x86/topo/cpus/* (on 6.9.y)

please?

Thanks,

tglx

2024-05-31 08:57:22

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Fri, May 31 2024 at 10:33, Thomas Gleixner wrote:

Clearly coffee did not set in yet.

Thanks,

tglx
---
Subject: x86/topology/intel: Unlock CPUID before evaluating anything
From: Thomas Gleixner <[email protected]>
Date: Thu, 30 May 2024 17:29:18 +0200

Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If this
bit is set by the BIOS then CPUID evaluation including topology enumeration
does not work correctly as the evaluation code does not try to analyze any
leaf greater than two.

This went unnoticed before because the original topology code just repeated
evaluation several times and managed to overwrite the initial limited
information with the correct one later. The new evaluation code does it
once and therefore ends up with the limited and wrong information.

Cure this by unlocking CPUID right before evaluating anything which depends
on the maximum CPUID leaf being greater than two instead of rereading stuff
after unlock.

Fixes: 22d63660c35e ("x86/cpu: Use common topology code for Intel")
Reported-by: Peter Schneider <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/cpu/common.c | 3 ++-
arch/x86/kernel/cpu/cpu.h | 2 ++
arch/x86/kernel/cpu/intel.c | 25 ++++++++++++++++---------
3 files changed, 20 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1585,6 +1585,7 @@ static void __init early_identify_cpu(st
if (have_cpuid_p()) {
cpu_detect(c);
get_cpu_vendor(c);
+ intel_unlock_cpuid_leafs(c);
get_cpu_cap(c);
setup_force_cpu_cap(X86_FEATURE_CPUID);
get_cpu_address_sizes(c);
@@ -1744,7 +1745,7 @@ static void generic_identify(struct cpui
cpu_detect(c);

get_cpu_vendor(c);
-
+ intel_unlock_cpuid_leafs(c);
get_cpu_cap(c);

get_cpu_address_sizes(c);
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -61,9 +61,11 @@ extern __ro_after_init enum tsx_ctrl_sta

extern void __init tsx_init(void);
void tsx_ap_init(void);
+void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c);
#else
static inline void tsx_init(void) { }
static inline void tsx_ap_init(void) { }
+static inline void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c) { }
#endif /* CONFIG_CPU_SUP_INTEL */

extern void init_spectral_chicken(struct cpuinfo_x86 *c);
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -269,19 +269,26 @@ static void detect_tme_early(struct cpui
c->x86_phys_bits -= keyid_bits;
}

+void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c)
+{
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return;
+
+ if (c->x86 < 6 || (c->x86 == 6 && c->x86_model < 0xd))
+ return;
+
+ /*
+ * The BIOS can have limited CPUID to leaf 2, which breaks feature
+ * enumeration. Unlock it and update the maximum leaf info.
+ */
+ if (msr_clear_bit(MSR_IA32_MISC_ENABLE, MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0)
+ c->cpuid_level = cpuid_eax(0);
+}
+
static void early_init_intel(struct cpuinfo_x86 *c)
{
u64 misc_enable;

- /* Unmask CPUID levels if masked: */
- if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
- if (msr_clear_bit(MSR_IA32_MISC_ENABLE,
- MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0) {
- c->cpuid_level = cpuid_eax(0);
- get_cpu_cap(c);
- }
- }
-
if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
(c->x86 == 0x6 && c->x86_model >= 0x0e))
set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
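
The ordering problem described in the changelog above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct fake_cpu`, `fake_cpuid_max_leaf()`, `fake_msr_clear_bit()`, `evaluate_once()` and the two `identify_*()` helpers are hypothetical names, the MSR is simulated by a plain variable, and the single-pass evaluation is reduced to one "is leaf 0xb visible" check. Bit 22 is the LIMIT_CPUID bit the patch clears in MSR_IA32_MISC_ENABLE.

```c
#include <stdbool.h>
#include <stdint.h>

#define LIMIT_CPUID_BIT 22 /* MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT */

/* Hypothetical stand-in for the hardware state of one CPU. */
struct fake_cpu {
	uint64_t misc_enable;   /* simulated MSR_IA32_MISC_ENABLE */
	uint32_t real_max_leaf; /* CPUID.0:EAX once the limit is lifted */
};

/* Hypothetical stand-in for the subset of cpuinfo_x86 used here. */
struct cpu_info {
	uint32_t cpuid_level;
	bool topo_from_leaf_b; /* did the single evaluation see leaf 0xb? */
};

/* While the limit bit is set, CPUID reports a maximum leaf of 2. */
static uint32_t fake_cpuid_max_leaf(const struct fake_cpu *cpu)
{
	bool limited = (cpu->misc_enable & (1ULL << LIMIT_CPUID_BIT)) != 0;

	return limited ? 2 : cpu->real_max_leaf;
}

/* Mimics the return convention the patch relies on: > 0 if the bit changed. */
static int fake_msr_clear_bit(struct fake_cpu *cpu, unsigned int bit)
{
	uint64_t mask = 1ULL << bit;

	if (!(cpu->misc_enable & mask))
		return 0;
	cpu->misc_enable &= ~mask;
	return 1;
}

/* Single-pass evaluation: leaf 0xb is only looked at if cpuid_level allows. */
static void evaluate_once(struct cpu_info *c)
{
	c->topo_from_leaf_b = c->cpuid_level >= 0xb;
}

/* Problematic order: evaluate first, unlock afterwards. */
static struct cpu_info identify_unlock_late(struct fake_cpu *cpu)
{
	struct cpu_info c = { .cpuid_level = fake_cpuid_max_leaf(cpu) };

	evaluate_once(&c);
	if (fake_msr_clear_bit(cpu, LIMIT_CPUID_BIT) > 0)
		c.cpuid_level = fake_cpuid_max_leaf(cpu);
	return c;
}

/* Order used by the patch: unlock first, then evaluate once. */
static struct cpu_info identify_unlock_first(struct fake_cpu *cpu)
{
	struct cpu_info c = { .cpuid_level = fake_cpuid_max_leaf(cpu) };

	if (fake_msr_clear_bit(cpu, LIMIT_CPUID_BIT) > 0)
		c.cpuid_level = fake_cpuid_max_leaf(cpu);
	evaluate_once(&c);
	return c;
}
```

In this model both orders end up with the correct `cpuid_level`, but only the unlock-first order lets the one-shot evaluation see the higher leaves. The old code got away with the late unlock only because it re-ran the evaluation afterwards.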

2024-05-31 09:12:08

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Fri, May 31 2024 at 10:48, Thomas Gleixner wrote:

> Christian!
>
> On Fri, May 31 2024 at 10:16, Christian Heusel wrote:
>>> One of the reporters in the Arch Bugtracker with an Intel Core i7-7700k
>>> has tested a modified version of this fix[0] with the static change
>>> reversed on top of the 6.9.2 stable kernel and reports that the patch
>>> does not fix the issue for them. I have attached their output for the
>>> patched (dmesg6.9.2-1.5.log) and nonpatched (dmesg6.9.2-1.log) kernel.
>>>
>>> Should we also get them to test the mainline version or do you need any
>>> other debug output?
>
> Can I get:
>
> - dmesg from 6.8.y kernel
> - output of cpuid -r
> - content of /sys/kernel/debug/x86/topo/cpus/* (on 6.9.y)
>
> please?

It seems there are two different issues here. The dmesg you provided is
from an i7-1255U, which is a hybrid CPU. The i7-7700k has 4 cores (8
threads), so the root cause is not necessarily the same.

Thanks,

tglx

2024-05-31 09:42:28

by Peter Schneider

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Hey Thomas,

On 31.05.2024 at 10:42, Thomas Gleixner wrote:
> Bah. Updated patch below.
> Clearly coffee did not set in yet.
[...]

;-)

There seems to be an absolute lower limit of caffeine per ml blood serum, below which you
just can't get things done right... I know that too!

Anyway, this last version of your patch fixes things for me, please see attached dmesg
output. Thanks very much for investigating and fixing this issue!

Tested-by: Peter Schneider <[email protected]>

If you like, I can retest with your first patch (the one with additional debug output)
applied on top of this one and send you the output, if that would be useful.
Just let me know.

Best regards,
Peter Schneider

--
Climb the mountain not to plant your flag, but to embrace the challenge,
enjoy the air and behold the view. Climb it so you can see the world,
not so the world can see you. -- David McCullough Jr.

OpenPGP: 0xA3828BD796CCE11A8CADE8866E3A92C92C3FF244
Download: https://www.peters-netzplatz.de/download/pschneider1968_pub.asc
https://keys.mailvelope.com/pks/lookup?op=get&[email protected]



Attachments:
dmesg_v6.9.3-dirty_Good_w_tglx_topo_patch.txt (118.26 kB)
OpenPGP_signature.asc (243.00 B)
OpenPGP digital signature

2024-05-31 10:07:23

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

Peter!

On Fri, May 31 2024 at 11:41, Peter Schneider wrote:
> Anyway, this last version of your patch fixes things for me, please see attached dmesg
> output. Thanks very much for investigating and fixing this issue!
>
> Tested-by: Peter Schneider <[email protected]>
>
> If you like, I can retest with your first patch (with additional debug
> info output) additionally applied on top of that and send the output,
> if that would be useful for you.

No need. I'm properly caffeinated and confident enough that this cures
it. :)

Thanks a lot for testing and providing all the information!

tglx

2024-05-31 10:23:04

by Peter Schneider

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On 31.05.2024 at 12:07, Thomas Gleixner wrote:
> Thanks a lot for testing and providing all the information!

Refactoring messy legacy code is not an easy task. I'm glad I could help a tiny little bit
so that you can get this done right!

Best regards,
Peter Schneider

--
Climb the mountain not to plant your flag, but to embrace the challenge,
enjoy the air and behold the view. Climb it so you can see the world,
not so the world can see you. -- David McCullough Jr.

OpenPGP: 0xA3828BD796CCE11A8CADE8866E3A92C92C3FF244
Download: https://www.peters-netzplatz.de/download/pschneider1968_pub.asc
https://keys.mailvelope.com/pks/lookup?op=get&[email protected]


Attachments:
OpenPGP_signature.asc (243.00 B)
OpenPGP digital signature

2024-05-31 11:07:22

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Fri, May 31 2024 at 10:16, Christian Heusel wrote:
> On 24/05/31 10:13AM, Christian Heusel wrote:
> [ 0.046127] TSC deadline timer available
> [ 0.046129] CPU topo: Max. logical packages: 1
> [ 0.046129] CPU topo: Max. logical dies: 1
> [ 0.046129] CPU topo: Max. dies per package: 1
> [ 0.046131] CPU topo: Max. threads per core: 2
> [ 0.046132] CPU topo: Num. cores per package: 10
> [ 0.046132] CPU topo: Num. threads per package: 12
> [ 0.046132] CPU topo: Allowing 12 present CPUs plus 0 hotplug CPUs

This looks correct.

> [ 0.117308] smpboot: x86: Booting SMP configuration:
> [ 0.117308] .... node #0, CPUs: #2 #4 #5 #6 #7 #8 #9 #10 #11
> [ 0.009676] [Firmware Bug]: CPU4: Topology domain 1 shift 7 != 6

So the E-Cores report different topology information for the CORE shift
value than the P-Cores, which is definitely wrong.

Let's see what cpuid -r reports.

Thanks,

tglx



2024-05-31 13:10:05

by Christian Heusel

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On 24/05/31 11:11AM, Thomas Gleixner wrote:
> On Fri, May 31 2024 at 10:48, Thomas Gleixner wrote:
>
> It seems there are two different issues here. The dmesg you provided is
> from a i7-1255U, which is a hybrid CPU. The i7-7700k has 4 cores (8
> threads) and there is not necessarily the same root cause.

It seems like I was also below my needed caffeine levels :p The person
reporting (in the same thread) with the i7-7700k reports the problem
fixed[1] as well, so this is in line with Peter's observations!

The other person with the i7-1255U in the meantime got back to me with
the needed outputs:

> Can I get:
>
> - dmesg from 6.8.y kernel

See attachment (dmesg6.8.9-arch1-2.log)

> - output of cpuid -r

Basic Leafs :
================
0x00000000: EAX=0x00000020, EBX=0x756e6547, ECX=0x6c65746e, EDX=0x49656e69
0x00000001: EAX=0x000906a4, EBX=0x12400800, ECX=0x7ffafbff, EDX=0xbfebfbff
0x00000002: EAX=0x00feff01, EBX=0x00000000, ECX=0x00000000, EDX=0x00000000
0x00000004: subleafs:
0: EAX=0x7c004121, EBX=0x01c0003f, ECX=0x0000003f, EDX=0x00000000
1: EAX=0x7c004122, EBX=0x01c0003f, ECX=0x0000007f, EDX=0x00000000
2: EAX=0x7c01c143, EBX=0x03c0003f, ECX=0x000007ff, EDX=0x00000000
3: EAX=0x7c0fc163, EBX=0x02c0003f, ECX=0x00003fff, EDX=0x00000004
0x00000005: EAX=0x00000040, EBX=0x00000040, ECX=0x00000003, EDX=0x10102020
0x00000006: EAX=0x00df8ff7, EBX=0x00000002, ECX=0x00000409, EDX=0x00020003
0x00000007: subleafs:
0: EAX=0x00000002, EBX=0x239ca7eb, ECX=0x984007bc, EDX=0xfc18c410
1: EAX=0x00400810, EBX=0x00000000, ECX=0x00000000, EDX=0x00040000
2: EAX=0x00000000, EBX=0x00000000, ECX=0x00000000, EDX=0x00000017
0x0000000a: EAX=0x07300605, EBX=0x00000000, ECX=0x00000007, EDX=0x00008603
0x0000000b: subleafs:
0: EAX=0x00000001, EBX=0x00000001, ECX=0x00000100, EDX=0x00000012
1: EAX=0x00000006, EBX=0x0000000c, ECX=0x00000201, EDX=0x00000012
0x0000000d: subleafs:
0: EAX=0x00000207, EBX=0x00000a88, ECX=0x00000a88, EDX=0x00000000
1: EAX=0x0000000f, EBX=0x00000680, ECX=0x00009900, EDX=0x00000000
2: EAX=0x00000100, EBX=0x00000240, ECX=0x00000000, EDX=0x00000000
8: EAX=0x00000080, EBX=0x00000000, ECX=0x00000001, EDX=0x00000000
9: EAX=0x00000008, EBX=0x00000a80, ECX=0x00000000, EDX=0x00000000
11: EAX=0x00000010, EBX=0x00000000, ECX=0x00000001, EDX=0x00000000
12: EAX=0x00000018, EBX=0x00000000, ECX=0x00000001, EDX=0x00000000
15: EAX=0x00000328, EBX=0x00000000, ECX=0x00000001, EDX=0x00000000
0x00000010: subleafs:
0: EAX=0x00000000, EBX=0x00000004, ECX=0x00000000, EDX=0x00000000
2: EAX=0x0000000f, EBX=0x00000000, ECX=0x00000004, EDX=0x0000000f
0x00000014: subleafs:
0: EAX=0x00000001, EBX=0x0000005f, ECX=0x80000007, EDX=0x00000000
1: EAX=0x02490002, EBX=0x003f003f, ECX=0x00000000, EDX=0x00000000
0x00000015: EAX=0x00000002, EBX=0x00000088, ECX=0x0249f000, EDX=0x00000000
0x00000016: EAX=0x00000a28, EBX=0x00000dac, ECX=0x00000064, EDX=0x00000000
0x00000018: subleafs:
0: EAX=0x00000004, EBX=0x00000000, ECX=0x00000000, EDX=0x00000000
1: EAX=0x00000000, EBX=0x00300001, ECX=0x00000001, EDX=0x00000121
2: EAX=0x00000000, EBX=0x00040003, ECX=0x00000200, EDX=0x00000043
3: EAX=0x00000000, EBX=0x00400001, ECX=0x00000001, EDX=0x00000122
4: EAX=0x00000000, EBX=0x00080008, ECX=0x00000001, EDX=0x00000143
0x0000001a: EAX=0x20000001, EBX=0x00000000, ECX=0x00000000, EDX=0x00000000
0x0000001c: EAX=0xc000000b, EBX=0x00000007, ECX=0x00000007, EDX=0x00000000
0x0000001f: subleafs:
0: EAX=0x00000001, EBX=0x00000001, ECX=0x00000100, EDX=0x00000012
1: EAX=0x00000007, EBX=0x0000000c, ECX=0x00000201, EDX=0x00000012
2: EAX=0x00000000, EBX=0x00000000, ECX=0x00000002, EDX=0x00000012
3: EAX=0x00000000, EBX=0x00000000, ECX=0x00000003, EDX=0x00000012
4: EAX=0x00000000, EBX=0x00000000, ECX=0x00000004, EDX=0x00000012
5: EAX=0x00000000, EBX=0x00000000, ECX=0x00000005, EDX=0x00000012
6: EAX=0x00000000, EBX=0x00000000, ECX=0x00000006, EDX=0x00000012
7: EAX=0x00000000, EBX=0x00000000, ECX=0x00000007, EDX=0x00000012
8: EAX=0x00000000, EBX=0x00000000, ECX=0x00000008, EDX=0x00000012
9: EAX=0x00000000, EBX=0x00000000, ECX=0x00000009, EDX=0x00000012
10: EAX=0x00000000, EBX=0x00000000, ECX=0x0000000a, EDX=0x00000012
11: EAX=0x00000000, EBX=0x00000000, ECX=0x0000000b, EDX=0x00000012
12: EAX=0x00000000, EBX=0x00000000, ECX=0x0000000c, EDX=0x00000012
13: EAX=0x00000000, EBX=0x00000000, ECX=0x0000000d, EDX=0x00000012
14: EAX=0x00000000, EBX=0x00000000, ECX=0x0000000e, EDX=0x00000012
15: EAX=0x00000000, EBX=0x00000000, ECX=0x0000000f, EDX=0x00000012
16: EAX=0x00000000, EBX=0x00000000, ECX=0x00000010, EDX=0x00000012
17: EAX=0x00000000, EBX=0x00000000, ECX=0x00000011, EDX=0x00000012
18: EAX=0x00000000, EBX=0x00000000, ECX=0x00000012, EDX=0x00000012
19: EAX=0x00000000, EBX=0x00000000, ECX=0x00000013, EDX=0x00000012
20: EAX=0x00000000, EBX=0x00000000, ECX=0x00000014, EDX=0x00000012
21: EAX=0x00000000, EBX=0x00000000, ECX=0x00000015, EDX=0x00000012
22: EAX=0x00000000, EBX=0x00000000, ECX=0x00000016, EDX=0x00000012
23: EAX=0x00000000, EBX=0x00000000, ECX=0x00000017, EDX=0x00000012
24: EAX=0x00000000, EBX=0x00000000, ECX=0x00000018, EDX=0x00000012
25: EAX=0x00000000, EBX=0x00000000, ECX=0x00000019, EDX=0x00000012
26: EAX=0x00000000, EBX=0x00000000, ECX=0x0000001a, EDX=0x00000012
27: EAX=0x00000000, EBX=0x00000000, ECX=0x0000001b, EDX=0x00000012
28: EAX=0x00000000, EBX=0x00000000, ECX=0x0000001c, EDX=0x00000012
29: EAX=0x00000000, EBX=0x00000000, ECX=0x0000001d, EDX=0x00000012
30: EAX=0x00000000, EBX=0x00000000, ECX=0x0000001e, EDX=0x00000012
31: EAX=0x00000000, EBX=0x00000000, ECX=0x0000001f, EDX=0x00000012
0x00000020: EAX=0x00000000, EBX=0x00000001, ECX=0x00000000, EDX=0x00000000
Extended Leafs :
================
0x80000000: EAX=0x80000008, EBX=0x00000000, ECX=0x00000000, EDX=0x00000000
0x80000001: EAX=0x00000000, EBX=0x00000000, ECX=0x00000121, EDX=0x2c100800
0x80000002: EAX=0x68743231, EBX=0x6e654720, ECX=0x746e4920, EDX=0x52286c65
0x80000003: EAX=0x6f432029, EBX=0x54286572, ECX=0x6920294d, EDX=0x32312d37
0x80000004: EAX=0x00553535, EBX=0x00000000, ECX=0x00000000, EDX=0x00000000
0x80000006: EAX=0x00000000, EBX=0x00000000, ECX=0x08008040, EDX=0x00000000
0x80000007: EAX=0x00000000, EBX=0x00000000, ECX=0x00000000, EDX=0x00000100
0x80000008: EAX=0x00003027, EBX=0x00000000, ECX=0x00000000, EDX=0x00000000
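
For readers who want to check the raw values above by hand, here is a minimal sketch that decodes the vendor string from leaf 0x0 and the family/model/stepping from leaf 0x1 EAX, using the standard CPUID register layout. The helper names (`cpuid_vendor()`, `decode_signature()`, `intel_unlock_check_passes()`) are hypothetical and not taken from the kernel; the last one only restates the family/model condition from the patch in this thread.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct cpu_sig {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

/* Store a 32-bit register value as four characters, lowest byte first. */
static void put_le32(char *dst, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		dst[i] = (char)((v >> (8 * i)) & 0xff);
}

/* Leaf 0x0 returns the 12-character vendor string in EBX, EDX, ECX order. */
static void cpuid_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx, char out[13])
{
	put_le32(out, ebx);
	put_le32(out + 4, edx);
	put_le32(out + 8, ecx);
	out[12] = '\0';
}

/* Leaf 0x1 EAX: stepping [3:0], model [7:4], family [11:8],
 * extended model [19:16], extended family [27:20]. */
static struct cpu_sig decode_signature(uint32_t eax)
{
	unsigned int base_family = (eax >> 8) & 0xf;
	unsigned int base_model = (eax >> 4) & 0xf;
	unsigned int ext_model = (eax >> 16) & 0xf;
	unsigned int ext_family = (eax >> 20) & 0xff;
	struct cpu_sig s;

	s.stepping = eax & 0xf;
	s.family = base_family;
	if (base_family == 0xf)
		s.family += ext_family;
	s.model = base_model;
	if (base_family == 0x6 || base_family == 0xf)
		s.model |= ext_model << 4;
	return s;
}

/* Restates the family/model condition from intel_unlock_cpuid_leafs(). */
static bool intel_unlock_check_passes(unsigned int family, unsigned int model)
{
	return !(family < 6 || (family == 6 && model < 0xd));
}

/* Convenience wrapper for string comparison in callers. */
static bool vendor_is(const char *vendor, const char *expected)
{
	return strcmp(vendor, expected) == 0;
}
```

Fed with the leaf 0x0 and leaf 0x1 values from the dump, this yields the "GenuineIntel" vendor string and family 6, model 0x9a, stepping 4, which passes the family/model condition used in the patch.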


> - content of /sys/kernel/debug/x86/topo/cpus/* (on 6.9.y)

See attachment (cat_debug.log)

Cheers,
chris


[1]: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/57#note_189134


Attachments:
(No filename) (0.00 B)
signature.asc (849.00 B)

2024-05-31 13:44:04

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Fri, May 31 2024 at 15:08, Christian Heusel wrote:
> On 24/05/31 11:11AM, Thomas Gleixner wrote:
>> On Fri, May 31 2024 at 10:48, Thomas Gleixner wrote:
>>
>> It seems there are two different issues here. The dmesg you provided is
>> from a i7-1255U, which is a hybrid CPU. The i7-7700k has 4 cores (8
>> threads) and there is not necessarily the same root cause.
>
> It seems like I was also below my needed caffeine levels :p The person
> reporting (in the same thread) with the i7-7700k reports the problem
> fixed[1] as well, so this is in line with Peter's observations!

Cool!

> The other person with the i7-1255U in the meantime got back to me with
> the needed outputs:
>> - output of cpuid -r

> 0x0000000b: subleafs:
> 0: EAX=0x00000001, EBX=0x00000001, ECX=0x00000100, EDX=0x00000012
> 1: EAX=0x00000006, EBX=0x0000000c, ECX=0x00000201, EDX=0x00000012

> 0x0000001f: subleafs:
> 0: EAX=0x00000001, EBX=0x00000001, ECX=0x00000100, EDX=0x00000012
> 1: EAX=0x00000007, EBX=0x0000000c, ECX=0x00000201, EDX=0x00000012

So this is inconsistent already. Both leafs should describe the same
topology. See the differing EAX values (6/7) in subleaf 1, which are
exactly the values the kernel complains about :)

That should not be an issue in itself, because the kernel prefers 0x1f
over 0xb and never evaluates both. However, this output is from just one
randomly picked CPU.

I wonder which variant of the cpuid tool that is. cpuid -r usually gives
you just the plain values and collects them for all CPUs.

I really need to have the values for all CPUs to see whether there are
differences at the relevant places. The above is probably from one of
the E-Cores.

Thanks,

tglx

2024-05-31 14:30:57

by Christian Heusel

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On 24/05/31 03:42PM, Thomas Gleixner wrote:
> On Fri, May 31 2024 at 15:08, Christian Heusel wrote:

> > The other person with the i7-1255U in the meantime got back to me with
> > the needed outputs:
> >> - output of cpuid -r
>
> > 0x0000000b: subleafs:
> > 0: EAX=0x00000001, EBX=0x00000001, ECX=0x00000100, EDX=0x00000012
> > 1: EAX=0x00000006, EBX=0x0000000c, ECX=0x00000201, EDX=0x00000012
>
> > 0x0000001f: subleafs:
> > 0: EAX=0x00000001, EBX=0x00000001, ECX=0x00000100, EDX=0x00000012
> > 1: EAX=0x00000007, EBX=0x0000000c, ECX=0x00000201, EDX=0x00000012
>
> So this is inconsistent already. Both leafs should describe the same
> topology. See the differing EAX values (6/7) in subleaf 1, which are
> exactly the values the kernel complains about :)
>
> But that should not be an issue because the kernel prefers 0x1f over
> 0xb and will never evaluate both, but this is just from one randomly
> picked CPU.
>
> I wonder which variant of the cpuid tool that is. cpuid -r gives you
> usually just the plain values and collects them for all CPUs.

The previously attached one is output from the version located here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/arch/x86/kcpuid

The one I have now attached is the one being built from this:
https://www.etallen.com/cpuid.html

Cheers,
Chris


Attachments:
(No filename) (0.00 B)
signature.asc (849.00 B)

2024-05-31 15:25:15

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Fri, May 31 2024 at 16:29, Christian Heusel wrote:

P-Cores are consistent:

> CPU 0:
> 0x0000000b 0x01: eax=0x00000006 ebx=0x0000000c ecx=0x00000201 edx=0x00000000

> 0x0000001f 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000000

E-Cores are not:

> CPU 4:
> 0x0000000b 0x01: eax=0x00000006 ebx=0x0000000c ecx=0x00000201 edx=0x00000010

> 0x0000001f 0x01: eax=0x00000007 ebx=0x0000000c ecx=0x00000201 edx=0x00000010

As the topology is evaluated from CPU0 CPUID leaf 0x1f it's obvious that
CPU4...11 will trigger the sanity checks because their CPUID leaf 0x1f
subleaf 1 entries are bogus.
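
The kind of check described above can be sketched roughly as follows: the per-domain shift values seen on a secondary CPU are compared against the reference values recorded from the boot CPU, and every mismatch is reported. This is only an illustration under stated assumptions; check_topo_shifts() and its parameters are hypothetical names, the output format is copied from the "[Firmware Bug]" lines reported in this thread, and the real kernel code is organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Compare the per-domain shifts seen on one CPU ("seen") with the
 * reference shifts taken from the boot CPU ("ref"). Prints one line per
 * mismatching domain and returns the number of mismatches.
 */
static int check_topo_shifts(int cpu, const unsigned int *seen,
			     const unsigned int *ref, size_t ndom)
{
	int mismatches = 0;
	size_t dom;

	assert(seen != NULL && ref != NULL);

	for (dom = 0; dom < ndom; dom++) {
		if (seen[dom] == ref[dom])
			continue;
		printf("[Firmware Bug]: CPU%d: Topology domain %zu shift %u != %u\n",
		       cpu, dom, seen[dom], ref[dom]);
		mismatches++;
	}
	return mismatches;
}
```

With reference shifts {1, 6, 6, 6} and an E-core reporting {1, 7, 7, 7}, the function flags domains 1 to 3, which matches the shape of the messages seen on the i7-1255U.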

IOW it's a firmware bug and there is nothing the kernel can or will do
about it beyond what it already does: complain about the inconsistency.

Thanks for providing all the information!

tglx


Subject: [tip: x86/urgent] x86/topology/intel: Unlock CPUID before evaluating anything

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID: 0c2f6d04619ec2b53ad4b0b591eafc9389786e86
Gitweb: https://git.kernel.org/tip/0c2f6d04619ec2b53ad4b0b591eafc9389786e86
Author: Thomas Gleixner <[email protected]>
AuthorDate: Thu, 30 May 2024 17:29:18 +02:00
Committer: Borislav Petkov (AMD) <[email protected]>
CommitterDate: Fri, 31 May 2024 20:25:56 +02:00

x86/topology/intel: Unlock CPUID before evaluating anything

Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If
this bit is set by the BIOS then CPUID evaluation including topology
enumeration does not work correctly as the evaluation code does not try
to analyze any leaf greater than two.

This went unnoticed before because the original topology code just
repeated evaluation several times and managed to overwrite the initial
limited information with the correct one later. The new evaluation code
does it once and therefore ends up with the limited and wrong
information.

Cure this by unlocking CPUID right before evaluating anything which
depends on the maximum CPUID leaf being greater than two instead of
rereading stuff after unlock.

Fixes: 22d63660c35e ("x86/cpu: Use common topology code for Intel")
Reported-by: Peter Schneider <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Tested-by: Peter Schneider <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/common.c | 3 ++-
arch/x86/kernel/cpu/cpu.h | 2 ++
arch/x86/kernel/cpu/intel.c | 25 ++++++++++++++++---------
3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index e31293c..d4e539d 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1589,6 +1589,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 	if (have_cpuid_p()) {
 		cpu_detect(c);
 		get_cpu_vendor(c);
+		intel_unlock_cpuid_leafs(c);
 		get_cpu_cap(c);
 		setup_force_cpu_cap(X86_FEATURE_CPUID);
 		get_cpu_address_sizes(c);
@@ -1748,7 +1749,7 @@ static void generic_identify(struct cpuinfo_x86 *c)
 	cpu_detect(c);

 	get_cpu_vendor(c);
-
+	intel_unlock_cpuid_leafs(c);
 	get_cpu_cap(c);

 	get_cpu_address_sizes(c);
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index ea9e07d..1beccef 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -61,9 +61,11 @@ extern __ro_after_init enum tsx_ctrl_states tsx_ctrl_state;

 extern void __init tsx_init(void);
 void tsx_ap_init(void);
+void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c);
 #else
 static inline void tsx_init(void) { }
 static inline void tsx_ap_init(void) { }
+static inline void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c) { }
 #endif /* CONFIG_CPU_SUP_INTEL */

 extern void init_spectral_chicken(struct cpuinfo_x86 *c);
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 3c3e7e5..fdf3489 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -269,19 +269,26 @@ detect_keyid_bits:
 	c->x86_phys_bits -= keyid_bits;
 }

+void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c)
+{
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+		return;
+
+	if (c->x86 < 6 || (c->x86 == 6 && c->x86_model < 0xd))
+		return;
+
+	/*
+	 * The BIOS can have limited CPUID to leaf 2, which breaks feature
+	 * enumeration. Unlock it and update the maximum leaf info.
+	 */
+	if (msr_clear_bit(MSR_IA32_MISC_ENABLE, MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0)
+		c->cpuid_level = cpuid_eax(0);
+}
+
 static void early_init_intel(struct cpuinfo_x86 *c)
 {
 	u64 misc_enable;

-	/* Unmask CPUID levels if masked: */
-	if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
-		if (msr_clear_bit(MSR_IA32_MISC_ENABLE,
-				  MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0) {
-			c->cpuid_level = cpuid_eax(0);
-			get_cpu_cap(c);
-		}
-	}
-
 	if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
 		(c->x86 == 0x6 && c->x86_model >= 0x0e))
 		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
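
To illustrate why the ordering matters, here is a small user-space model of the problem the patch addresses. It is a sketch under stated assumptions, not kernel code: struct fake_cpu and all fake_*/pick_*/gate_* helpers are hypothetical, the MSR is just a struct field, and the "evaluation" is reduced to picking a topology leaf from the reported maximum leaf. Only the bit number (22, the value of MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) and the family/model conditions are taken from the kernel and from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LIMIT_CPUID_BIT 22	/* value of MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT */

/* Hypothetical stand-in for one CPU; not a kernel structure. */
struct fake_cpu {
	unsigned int family, model;
	uint64_t misc_enable;	/* modelled IA32_MISC_ENABLE contents */
	uint32_t real_max_leaf;	/* max basic leaf once the limit is lifted */
};

/* What CPUID(0).EAX would report: capped at 2 while the limit bit is set. */
static uint32_t fake_max_leaf(const struct fake_cpu *cpu)
{
	assert(cpu != NULL);
	if ((cpu->misc_enable >> LIMIT_CPUID_BIT) & 1)
		return cpu->real_max_leaf < 2 ? cpu->real_max_leaf : 2;
	return cpu->real_max_leaf;
}

/* Same family/model gate as the patch; returns true if the bit was cleared. */
static bool fake_unlock(struct fake_cpu *cpu)
{
	assert(cpu != NULL);
	if (cpu->family < 6 || (cpu->family == 6 && cpu->model < 0xd))
		return false;
	if (!((cpu->misc_enable >> LIMIT_CPUID_BIT) & 1))
		return false;	/* nothing to clear */
	cpu->misc_enable &= ~(1ULL << LIMIT_CPUID_BIT);
	return true;
}

/* Toy single-pass evaluation: choose the topology leaf from the max leaf. */
static uint32_t pick_topo_leaf(uint32_t max_leaf)
{
	if (max_leaf >= 0x1f)
		return 0x1f;
	if (max_leaf >= 0xb)
		return 0xb;
	return 0;	/* no extended topology leaf visible */
}

/* Count (family, model) pairs where the old and new gates are not complementary. */
static unsigned int gate_mismatches(void)
{
	unsigned int f, m, bad = 0;

	for (f = 0; f <= 0xff; f++) {
		for (m = 0; m <= 0xff; m++) {
			bool old_unlocks = f > 6 || (f == 6 && m >= 0xd);
			bool new_returns = f < 6 || (f == 6 && m < 0xd);

			if (old_unlocks == new_returns)
				bad++;
		}
	}
	return bad;
}
```

For a model CPU with the limit bit set and an illustrative real maximum leaf of 0xd, pick_topo_leaf(fake_max_leaf(&cpu)) finds no topology leaf before fake_unlock() and selects leaf 0xb afterwards, which is the difference between evaluating once before and once after the unlock. gate_mismatches() checks that the early-return condition in intel_unlock_cpuid_leafs() is the exact negation of the condition in the removed block; the vendor check is new, since the helper is now called from vendor-independent code.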

2024-06-01 07:08:15

by Thorsten Leemhuis

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On 31.05.24 10:42, Thomas Gleixner wrote:
> On Fri, May 31 2024 at 10:33, Thomas Gleixner wrote:

> ---
> Subject: x86/topology/intel: Unlock CPUID before evaluating anything
> From: Thomas Gleixner <[email protected]>
> Date: Thu, 30 May 2024 17:29:18 +0200
>
> Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If this
> bit is set by the BIOS then CPUID evaluation including topology enumeration
> does not work correctly as the evaluation code does not try to analyze any
> leaf greater than two.
> [...]

TWIMC, I noticed a bug report with a "12 × 12th Gen Intel® Core™
i7-1255U" where the reporter also noticed a lot of messages like these:

archlinux kernel: [Firmware Bug]: CPU4: Topology domain 1 shift 7 != 6
archlinux kernel: [Firmware Bug]: CPU4: Topology domain 2 shift 7 != 6
archlinux kernel: [Firmware Bug]: CPU4: Topology domain 3 shift 7 != 6

Asked the reporter to test this patch. For details see:
https://bugzilla.kernel.org/show_bug.cgi?id=218879
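
For anyone who wants to scan logs for these lines, a minimal parsing sketch follows. It assumes the message format exactly as quoted above; struct topo_fw_bug and parse_topo_fw_bug() are hypothetical names. Going by the CPUID dumps elsewhere in this thread, the first number is the shift found on the reporting CPU and the second is the value it was compared against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Fields of one "[Firmware Bug]: CPUn: Topology domain ..." line. */
struct topo_fw_bug {
	int cpu;
	unsigned int domain;
	unsigned int cpu_shift;	/* shift reported by this CPU */
	unsigned int ref_shift;	/* shift it was compared against */
};

/* Returns 1 and fills *out if the line contains the message, else 0. */
static int parse_topo_fw_bug(const char *line, struct topo_fw_bug *out)
{
	const char *p;

	assert(line != NULL && out != NULL);

	/* Skip any journal/dmesg prefix in front of the message itself. */
	p = strstr(line, "[Firmware Bug]: CPU");
	if (p == NULL)
		return 0;

	return sscanf(p, "[Firmware Bug]: CPU%d: Topology domain %u shift %u != %u",
		      &out->cpu, &out->domain, &out->cpu_shift, &out->ref_shift) == 4;
}
```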

Ciao, Thorsten

#regzbot fix: x86/topology/intel: Unlock CPUID before evaluating anything

2024-06-01 07:24:48

by Thomas Gleixner

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On Sat, Jun 01 2024 at 09:06, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 31.05.24 10:42, Thomas Gleixner wrote:
>> On Fri, May 31 2024 at 10:33, Thomas Gleixner wrote:
>
>> ---
>> Subject: x86/topology/intel: Unlock CPUID before evaluating anything
>> From: Thomas Gleixner <[email protected]>
>> Date: Thu, 30 May 2024 17:29:18 +0200
>>
>> Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If this
>> bit is set by the BIOS then CPUID evaluation including topology enumeration
>> does not work correctly as the evaluation code does not try to analyze any
>> leaf greater than two.
>> [...]
>
> TWIMC, I noticed a bug report with a "12 × 12th Gen Intel® Core™
> i7-1255U" where the reporter also noticed a lot of messages like these:
>
> archlinux kernel: [Firmware Bug]: CPU4: Topology domain 1 shift 7 != 6
> archlinux kernel: [Firmware Bug]: CPU4: Topology domain 2 shift 7 != 6
> archlinux kernel: [Firmware Bug]: CPU4: Topology domain 3 shift 7 != 6
>
> Asked the reporter to test this patch. For details see:
> https://bugzilla.kernel.org/show_bug.cgi?id=218879

Won't help. See: https://lore.kernel.org/all/87plt26m2b.ffs@tglx/

Thanks,

tglx


2024-06-01 07:27:03

by Thorsten Leemhuis

Subject: Re: Kernel 6.9 regression: X86: Bogus messages from topology detection

On 01.06.24 09:20, Thomas Gleixner wrote:
> On Sat, Jun 01 2024 at 09:06, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 31.05.24 10:42, Thomas Gleixner wrote:
>>> On Fri, May 31 2024 at 10:33, Thomas Gleixner wrote:
>>
>>> ---
>>> Subject: x86/topology/intel: Unlock CPUID before evaluating anything
>>> From: Thomas Gleixner <[email protected]>
>>> Date: Thu, 30 May 2024 17:29:18 +0200
>>>
>>> Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If this
>>> bit is set by the BIOS then CPUID evaluation including topology enumeration
>>> does not work correctly as the evaluation code does not try to analyze any
>>> leaf greater than two.
>>> [...]
>>
>> TWIMC, I noticed a bug report with a "12 × 12th Gen Intel® Core™
>> i7-1255U" where the reporter also noticed a lot of messages like these:
>>
>> archlinux kernel: [Firmware Bug]: CPU4: Topology domain 1 shift 7 != 6
>> archlinux kernel: [Firmware Bug]: CPU4: Topology domain 2 shift 7 != 6
>> archlinux kernel: [Firmware Bug]: CPU4: Topology domain 3 shift 7 != 6
>>
>> Asked the reporter to test this patch. For details see:
>> https://bugzilla.kernel.org/show_bug.cgi?id=218879
>
> Won't help. See: https://lore.kernel.org/all/87plt26m2b.ffs@tglx/

Ahh, it was the other problem in this thread. Sorry for not noticing
that, had not followed things that closely. Forwarded that info to the
ticket. Many thx! Ciao, Thorsten