2024-04-30 15:04:03

by Oliver Sang

[permalink] [raw]
Subject: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap



Hello,

kernel test robot noticed "WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap" on:

commit: ee8962082a4413dba1a1b3d3d23490c5221f3b8a ("x86/alternatives: Catch late X86_FEATURE modifiers")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git x86/alternatives

[test failed on linux-next/master bb7a2467e6beef44a80a17d45ebf2931e7631083]

in testcase: lkvs
version: lkvs-x86_64-b07d44a-1_20240401
with following parameters:

test: xsave



compiler: gcc-13
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]


[ 0.055225][ T0] ------------[ cut here ]------------
[ 0.055225][ T0] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/cpu/cpuid-deps.c:118 do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.055225][ T0] Modules linked in:
[ 0.055225][ T0] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.9.0-rc3-00001-gee8962082a44 #1
[ 0.055225][ T0] Hardware name: Gigabyte Technology Co., Ltd. X299 UD4 Pro/X299 UD4 Pro-CF, BIOS F8a 04/27/2021
[ 0.055225][ T0] RIP: 0010:do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.055225][ T0] Code: 89 c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 08 84 d2 0f 85 b7 00 00 00 8b 15 f4 f7 7c 04 85 d2 0f 84 f2 fd ff ff <0f> 0b e9 eb fd ff ff 48 c7 c7 c0 eb 89 85 e8 41 fd ff ff 49 8d bf
All code
========
0: 89 c1 mov %eax,%ecx
2: 83 e0 07 and $0x7,%eax
5: 48 c1 e9 03 shr $0x3,%rcx
9: 83 c0 03 add $0x3,%eax
c: 0f b6 14 11 movzbl (%rcx,%rdx,1),%edx
10: 38 d0 cmp %dl,%al
12: 7c 08 jl 0x1c
14: 84 d2 test %dl,%dl
16: 0f 85 b7 00 00 00 jne 0xd3
1c: 8b 15 f4 f7 7c 04 mov 0x47cf7f4(%rip),%edx # 0x47cf816
22: 85 d2 test %edx,%edx
24: 0f 84 f2 fd ff ff je 0xfffffffffffffe1c
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 eb fd ff ff jmpq 0xfffffffffffffe1c
31: 48 c7 c7 c0 eb 89 85 mov $0xffffffff8589ebc0,%rdi
38: e8 41 fd ff ff callq 0xfffffffffffffd7e
3d: 49 rex.WB
3e: 8d .byte 0x8d
3f: bf .byte 0xbf

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 eb fd ff ff jmpq 0xfffffffffffffdf2
7: 48 c7 c7 c0 eb 89 85 mov $0xffffffff8589ebc0,%rdi
e: e8 41 fd ff ff callq 0xfffffffffffffd54
13: 49 rex.WB
14: 8d .byte 0x8d
15: bf .byte 0xbf
[ 0.055225][ T0] RSP: 0000:ffffc900001f7cd0 EFLAGS: 00010002
[ 0.055225][ T0] RAX: 0000000000000003 RBX: ffff888817ca9020 RCX: 1ffffffff0b13da3
[ 0.055225][ T0] RDX: 0000000000000001 RSI: 0000000000000085 RDI: ffffc900001f7d68
[ 0.055225][ T0] RBP: ffffc900001f7d08 R08: 0000000000000000 R09: fffffbfff0962288
[ 0.055225][ T0] R10: ffffffff84b11443 R11: 0000000000000001 R12: 0000000000000085
[ 0.055225][ T0] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff8683dee0
[ 0.055225][ T0] FS: 0000000000000000(0000) GS:ffff888817c80000(0000) knlGS:0000000000000000
[ 0.055225][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.055225][ T0] CR2: 0000000000000000 CR3: 000000089c85a001 CR4: 00000000003706b0
[ 0.055225][ T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.055225][ T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.055225][ T0] Call Trace:
[ 0.055225][ T0] <TASK>
[ 0.055225][ T0] ? __warn (kernel/panic.c:694)
[ 0.055225][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.055225][ T0] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 0.055225][ T0] ? handle_bug (arch/x86/kernel/traps.c:239 (discriminator 1))
[ 0.055225][ T0] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
[ 0.055225][ T0] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
[ 0.055225][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.055225][ T0] ? __pfx_do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:109)
[ 0.055225][ T0] init_ia32_feat_ctl (arch/x86/kernel/cpu/feat_ctl.c:181)
[ 0.055225][ T0] init_intel (arch/x86/include/asm/msr.h:146 arch/x86/include/asm/msr.h:300 arch/x86/kernel/cpu/intel.c:583 arch/x86/kernel/cpu/intel.c:687)
[ 0.055225][ T0] identify_cpu (arch/x86/kernel/cpu/common.c:1824)
[ 0.055225][ T0] identify_secondary_cpu (arch/x86/kernel/cpu/common.c:1949)
[ 0.055225][ T0] smp_store_cpu_info (arch/x86/kernel/smpboot.c:333)
[ 0.055225][ T0] start_secondary (arch/x86/kernel/smpboot.c:197 arch/x86/kernel/smpboot.c:281)
[ 0.055225][ T0] ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:231)
[ 0.055225][ T0] common_startup_64 (arch/x86/kernel/head_64.S:421)
[ 0.055225][ T0] </TASK>
[ 0.055225][ T0] ---[ end trace 0000000000000000 ]---


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240430/[email protected]



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



2024-04-30 17:23:47

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

+ Sean.

On Tue, Apr 30, 2024 at 11:00:52PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap" on:
>
> commit: ee8962082a4413dba1a1b3d3d23490c5221f3b8a ("x86/alternatives: Catch late X86_FEATURE modifiers")
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git x86/alternatives
>
> [test failed on linux-next/master bb7a2467e6beef44a80a17d45ebf2931e7631083]
>
> in testcase: lkvs
> version: lkvs-x86_64-b07d44a-1_20240401
> with following parameters:
>
> test: xsave
>
>
>
> compiler: gcc-13
> test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <[email protected]>
> | Closes: https://lore.kernel.org/oe-lkp/[email protected]
>
>
> [ 0.055225][ T0] ------------[ cut here ]------------
> [ 0.055225][ T0] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/cpu/cpuid-deps.c:118 do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
> [ 0.055225][ T0] Modules linked in:
> [ 0.055225][ T0] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.9.0-rc3-00001-gee8962082a44 #1
> [ 0.055225][ T0] Hardware name: Gigabyte Technology Co., Ltd. X299 UD4 Pro/X299 UD4 Pro-CF, BIOS F8a 04/27/2021
> [ 0.055225][ T0] RIP: 0010:do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
> [ 0.055225][ T0] Code: 89 c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 08 84 d2 0f 85 b7 00 00 00 8b 15 f4 f7 7c 04 85 d2 0f 84 f2 fd ff ff <0f> 0b e9 eb fd ff ff 48 c7 c7 c0 eb 89 85 e8 41 fd ff ff 49 8d bf
> All code
> ========
> 0: 89 c1 mov %eax,%ecx
> 2: 83 e0 07 and $0x7,%eax
> 5: 48 c1 e9 03 shr $0x3,%rcx
> 9: 83 c0 03 add $0x3,%eax
> c: 0f b6 14 11 movzbl (%rcx,%rdx,1),%edx
> 10: 38 d0 cmp %dl,%al
> 12: 7c 08 jl 0x1c
> 14: 84 d2 test %dl,%dl
> 16: 0f 85 b7 00 00 00 jne 0xd3
> 1c: 8b 15 f4 f7 7c 04 mov 0x47cf7f4(%rip),%edx # 0x47cf816
> 22: 85 d2 test %edx,%edx
> 24: 0f 84 f2 fd ff ff je 0xfffffffffffffe1c
> 2a:* 0f 0b ud2 <-- trapping instruction
> 2c: e9 eb fd ff ff jmpq 0xfffffffffffffe1c
> 31: 48 c7 c7 c0 eb 89 85 mov $0xffffffff8589ebc0,%rdi
> 38: e8 41 fd ff ff callq 0xfffffffffffffd7e
> 3d: 49 rex.WB
> 3e: 8d .byte 0x8d
> 3f: bf .byte 0xbf
>
> Code starting with the faulting instruction
> ===========================================
> 0: 0f 0b ud2
> 2: e9 eb fd ff ff jmpq 0xfffffffffffffdf2
> 7: 48 c7 c7 c0 eb 89 85 mov $0xffffffff8589ebc0,%rdi
> e: e8 41 fd ff ff callq 0xfffffffffffffd54
> 13: 49 rex.WB
> 14: 8d .byte 0x8d
> 15: bf .byte 0xbf
> [ 0.055225][ T0] RSP: 0000:ffffc900001f7cd0 EFLAGS: 00010002
> [ 0.055225][ T0] RAX: 0000000000000003 RBX: ffff888817ca9020 RCX: 1ffffffff0b13da3
> [ 0.055225][ T0] RDX: 0000000000000001 RSI: 0000000000000085 RDI: ffffc900001f7d68
> [ 0.055225][ T0] RBP: ffffc900001f7d08 R08: 0000000000000000 R09: fffffbfff0962288
> [ 0.055225][ T0] R10: ffffffff84b11443 R11: 0000000000000001 R12: 0000000000000085
> [ 0.055225][ T0] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff8683dee0
> [ 0.055225][ T0] FS: 0000000000000000(0000) GS:ffff888817c80000(0000) knlGS:0000000000000000
> [ 0.055225][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.055225][ T0] CR2: 0000000000000000 CR3: 000000089c85a001 CR4: 00000000003706b0
> [ 0.055225][ T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.055225][ T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 0.055225][ T0] Call Trace:
> [ 0.055225][ T0] <TASK>
> [ 0.055225][ T0] ? __warn (kernel/panic.c:694)
> [ 0.055225][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
> [ 0.055225][ T0] ? report_bug (lib/bug.c:180 lib/bug.c:219)
> [ 0.055225][ T0] ? handle_bug (arch/x86/kernel/traps.c:239 (discriminator 1))
> [ 0.055225][ T0] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
> [ 0.055225][ T0] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
> [ 0.055225][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
> [ 0.055225][ T0] ? __pfx_do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:109)
> [ 0.055225][ T0] init_ia32_feat_ctl (arch/x86/kernel/cpu/feat_ctl.c:181)

Yap, works as designed:

..
goto update_sgx;

if ( (tboot && !(msr & FEAT_CTL_VMX_ENABLED_INSIDE_SMX)) ||
(!tboot && !(msr & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX))) {
if (IS_ENABLED(CONFIG_KVM_INTEL))
pr_err_once("VMX (%s TXT) disabled by BIOS\n",
tboot ? "inside" : "outside");
clear_cpu_cap(c, X86_FEATURE_VMX); <--- here
} else {

Clearing feature flags after alternatives have been applied means that
code which does

alternative(, ... X86_FEATURE_VMX, ...)

won't work as expected because the patching has already happened.

And I'm not sure even the dynamic testing *cpu_has() does will always
work as we do this dance in get_cpu_cap() with forced flags.

So, I'm thinking init_ia32_feat_ctl() should run in early_init_intel()
which is before alternatives.

And looking at init_ia32_feat_ctl(), all it does is set and clear
a bunch of bits so I think it should be ok.

Sean?

> [ 0.055225][ T0] init_intel (arch/x86/include/asm/msr.h:146 arch/x86/include/asm/msr.h:300 arch/x86/kernel/cpu/intel.c:583 arch/x86/kernel/cpu/intel.c:687)
> [ 0.055225][ T0] identify_cpu (arch/x86/kernel/cpu/common.c:1824)
> [ 0.055225][ T0] identify_secondary_cpu (arch/x86/kernel/cpu/common.c:1949)
> [ 0.055225][ T0] smp_store_cpu_info (arch/x86/kernel/smpboot.c:333)
> [ 0.055225][ T0] start_secondary (arch/x86/kernel/smpboot.c:197 arch/x86/kernel/smpboot.c:281)
> [ 0.055225][ T0] ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:231)
> [ 0.055225][ T0] common_startup_64 (arch/x86/kernel/head_64.S:421)
> [ 0.055225][ T0] </TASK>
> [ 0.055225][ T0] ---[ end trace 0000000000000000 ]---
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240430/[email protected]
>
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-04-30 19:33:55

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Tue, Apr 30, 2024 at 11:40:14AM -0700, Sean Christopherson wrote:
> Hmm, I don't think the problem is that init_ia32_feat_ctl() is called too late.
> It too is called from the BSP prior to alternative_instructions():
>
> arch_cpu_finalize_init()
> |
> -> identify_boot_cpu()
> |
> -> identify_cpu()
> |
> -> .c_init() => init_intel()

Yeah, but look at the his stacktrace:

[ 0.055225][ T0] init_intel (arch/x86/include/asm/msr.h:146 arch/x86/include/asm/msr.h:300 arch/x86/kernel/cpu/intel.c:583
+arch/x86/kernel/cpu/intel.c:687)
[ 0.055225][ T0] identify_cpu (arch/x86/kernel/cpu/common.c:1824)
[ 0.055225][ T0] identify_secondary_cpu (arch/x86/kernel/cpu/common.c:1949)
[ 0.055225][ T0] smp_store_cpu_info (arch/x86/kernel/smpboot.c:333)

That's after alternatives.

> Ah, and the WARN even specifically checks for the case where there's divergence
> from the boot CPU:
>
> if (boot_cpu_has(feature))
> WARN_ON(alternatives_patched);

Funny you should mention that - I have this check in
setup_force_cpu_cap() too which works on boot_cpu_data *BUT*, actually,
the test in do_clear_cpu_cap() should be:

if (c && cpu_has(c, feature))
WARN_ON(alternatives_patched);

because setting a feature flag in *any* CPU's cap field is wrong after
alternatives because as explained earlier.

I know, our feature flags handling is a major mess.

> So I think this is a "real" warning about a misconfigured system, where VMX is
> fully configured in MSR_IA32_FEAT_CTL on the boot CPU, but is disabled on a
> secondary CPU.

And that's yet another issue. And it already warns about it:

[ 0.835741][ T1] smpboot: x86: Booting SMP configuration:
[ 0.836040][ T1] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17
[ 0.055225][ T0] masked ExtINT on CPU#1
[ 0.055225][ T0] x86/cpu: VMX (outside TXT) disabled by BIOS
^^^^^^^^^^^^^^^^^^^^

Oliver, does the second warning go away if you do this?

---
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 5dd427c6feb2..93fa2afc0c67 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -114,7 +114,7 @@ static void do_clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature)
if (WARN_ON(feature >= MAX_FEATURE_BITS))
return;

- if (boot_cpu_has(feature))
+ if (c && cpu_has(c, feature))
WARN_ON(alternatives_patched);

clear_feature(c, feature);

--

my guess would be no and that init_ia32_feat_ctl() really needs to go
before alternatives have been patched because it clears flags.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-04-30 22:33:39

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Tue, Apr 30, 2024 at 12:51:02PM -0700, Sean Christopherson wrote:
> But that would just mask the underlying problem, it wouldn't actually fix anything
> other than making the WARN go away. Unless I'm misreading the splat+code, the
> issue isn't that init_ia32_feat_ctl() clears VMX late, it's that the BSP sees
> VMX as fully enabled, but at least one AP sees VMX as disabled.
>
> I don't see how the kernel can expect to function correctly with divergent feature
> support across CPUs, i.e. the WARN is a _good_ thing in this case, because it
> alerts the user that their system is messed up, e.g. has a bad BIOS or something.

Yes, and yes.

There are two issues. Clearing feature flags after alternatives have
been applied should not happen, and this particular issue with that box.

Lemme cook up something in the coming days for the former.

As to the box having misconfigured VMX settings, dmesg says:

[ 0.055225][ T0] x86/cpu: VMX (outside TXT) disabled by BIOS

which means this:

(!tboot && !(msr & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX))) {

so... I have no clue what needs to happen here. BIOS update...

I hope we won't try to do some nasty fix for b0rked BIOSes again...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-04 12:48:56

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Wed, May 01, 2024 at 12:33:05AM +0200, Borislav Petkov wrote:
> On Tue, Apr 30, 2024 at 12:51:02PM -0700, Sean Christopherson wrote:
> > But that would just mask the underlying problem, it wouldn't actually fix anything
> > other than making the WARN go away. Unless I'm misreading the splat+code, the
> > issue isn't that init_ia32_feat_ctl() clears VMX late, it's that the BSP sees
> > VMX as fully enabled, but at least one AP sees VMX as disabled.
> >
> > I don't see how the kernel can expect to function correctly with divergent feature
> > support across CPUs, i.e. the WARN is a _good_ thing in this case, because it
> > alerts the user that their system is messed up, e.g. has a bad BIOS or something.
>
> Yes, and yes.
>
> There are two issues. Clearing feature flags after alternatives have
> been applied should not happen, and this particular issue with that box.
>
> Lemme cook up something in the coming days for the former.

Two simple patches as a reply to this.

Oliver, can you run them on your box pls?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-04 12:57:21

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 1/2] x86/alternatives: Check the correct cpu_data's caps

From: "Borislav Petkov (AMD)" <[email protected]>
Date: Wed, 1 May 2024 11:26:16 +0200

Commit

ee8962082a44 ("x86/alternatives: Catch late X86_FEATURE modifiers")

added warns for when code modifies feature flags after alternatives have
been patched.

The clearing, however, checks whether the feature flag is previously
set on the boot CPU but the @c argument supplied can be any CPU.

Make sure that other CPU's feature flag is checked instead, as it should
be.

Fixes: ee8962082a44 ("x86/alternatives: Catch late X86_FEATURE modifiers")
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
---
arch/x86/kernel/cpu/cpuid-deps.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 5dd427c6feb2..93fa2afc0c67 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -114,7 +114,7 @@ static void do_clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature)
if (WARN_ON(feature >= MAX_FEATURE_BITS))
return;

- if (boot_cpu_has(feature))
+ if (c && cpu_has(c, feature))
WARN_ON(alternatives_patched);

clear_feature(c, feature);
--
2.43.0


--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-04 12:57:42

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 2/2] x86/CPU/Intel: Do the MSR_IA32_FEAT_CTL setup before alternatives

From: "Borislav Petkov (AMD)" <[email protected]>
Date: Thu, 2 May 2024 15:15:41 +0200

init_ia32_feat_ctl() goes through the MSR_IA32_FEAT_CTL settings and
sanity-checks the configuration on each logical CPU, while setting or
clearing X86_FEATURE flags in the process.

However, it does that after alternatives have run, leading to settings
which have been patched by the alternatives to become invalid and
irreversible.

Move the settings detection to an earlier path, before the alternatives.

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/oe-lkp/[email protected]
---
arch/x86/kernel/cpu/intel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index be30d7fa2e66..d8575511a143 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -406,6 +406,8 @@ static void early_init_intel(struct cpuinfo_x86 *c)
*/
if (cpu_has(c, X86_FEATURE_TME))
detect_tme_early(c);
+
+ init_ia32_feat_ctl(c);
}

static void bsp_init_intel(struct cpuinfo_x86 *c)
@@ -682,8 +684,6 @@ static void init_intel(struct cpuinfo_x86 *c)
/* Work around errata */
srat_detect_node(c);

- init_ia32_feat_ctl(c);
-
init_intel_misc_features(c);

split_lock_init();
--
2.43.0

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-06 07:09:56

by Oliver Sang

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

hi, Boris,

On Sat, May 04, 2024 at 02:48:22PM +0200, Borislav Petkov wrote:
> On Wed, May 01, 2024 at 12:33:05AM +0200, Borislav Petkov wrote:
> > On Tue, Apr 30, 2024 at 12:51:02PM -0700, Sean Christopherson wrote:
> > > But that would just mask the underlying problem, it wouldn't actually fix anything
> > > other than making the WARN go away. Unless I'm misreading the splat+code, the
> > > issue isn't that init_ia32_feat_ctl() clears VMX late, it's that the BSP sees
> > > VMX as fully enabled, but at least one AP sees VMX as disabled.
> > >
> > > I don't see how the kernel can expect to function correctly with divergent feature
> > > support across CPUs, i.e. the WARN is a _good_ thing in this case, because it
> > > alerts the user that their system is messed up, e.g. has a bad BIOS or something.
> >
> > Yes, and yes.
> >
> > There are two issues. Clearing feature flags after alternatives have
> > been applied should not happen, and this particular issue with that box.
> >
> > Lemme cook up something in the coming days for the former.
>
> Two simple patches as a reply to this.
>
> Oliver, can you run them on your box pls?

we confirmed after applying them upon ee8962082a, the WARNING which was reported
in our original report cannot be reproduced any longer.

Tested-by: kernel test robot <[email protected]>

>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

2024-05-06 07:39:35

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Mon, May 06, 2024 at 03:09:27PM +0800, Oliver Sang wrote:
> we confirmed after applying them upon ee8962082a, the WARNING which was reported
> in our original report cannot be reproduced any longer.
>
> Tested-by: kernel test robot <[email protected]>

Thanks a lot for testing - much appreciated.

Lemme queue them.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-06 08:05:06

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Mon, May 06, 2024 at 09:39:03AM +0200, Borislav Petkov wrote:
> On Mon, May 06, 2024 at 03:09:27PM +0800, Oliver Sang wrote:
> > we confirmed after applying them upon ee8962082a, the WARNING which was reported
> > in our original report cannot be reproduced any longer.
> >
> > Tested-by: kernel test robot <[email protected]>
>
> Thanks a lot for testing - much appreciated.

Hm, ok, after looking at this more, I think I'm going to go with patch
2 only.

And patch 1 is wrong. At least for now, lemme explain:

So before it, we'd do

if (boot_cpu_has(feature))
WARN_ON(alternatives_patched);

when clearing feature flags, meaning: if the flag is set on the BSP,
that means, alternatives have patched already and we're clearing
potentially on another CPU and it'll warn because it will be wrong.

If we do:

if (c && cpu_has(c, feature))
WARN_ON(alternatives_patched);

then it would warn on *every* CPU but it doesn't need to *BECAUSE* the
alternatives patching is controlled by the boot_cpu_data checks for the
BSP - not the AP ones.

So if the BSP doesn't have the feature, we can just as well clear it on
the APs as it wouldn't have any effect on conditionals further on.

Otherwise, we'll have to go and move every clear_cpu_cap() call after
alternatives have patched to before that, which would be insane.

I mean, we'll do that eventually but can't now.

Lemme write that as a comment in the function so that we don't forget.

Oh boy, I love our feature flags infra. :-\

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-06 08:13:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

Oliver,

does the warning go away when you run *only* the second patch?

I.e., this one:

https://lore.kernel.org/r/20240504125040.GCZjYvIAK9_DzKuHXh@fat_crate.local

?

I think in this case, it'll make sure that the misconfigured system is
"reconfigured" properly wrt VMX etc...

Can you pls send dmesg and /proc/cpuinfo from that run?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-06 15:43:49

by Sean Christopherson

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Mon, May 06, 2024, Oliver Sang wrote:
> hi, Boris,
>
> On Sat, May 04, 2024 at 02:48:22PM +0200, Borislav Petkov wrote:
> > On Wed, May 01, 2024 at 12:33:05AM +0200, Borislav Petkov wrote:
> > > On Tue, Apr 30, 2024 at 12:51:02PM -0700, Sean Christopherson wrote:
> > > > But that would just mask the underlying problem, it wouldn't actually fix anything
> > > > other than making the WARN go away. Unless I'm misreading the splat+code, the
> > > > issue isn't that init_ia32_feat_ctl() clears VMX late, it's that the BSP sees
> > > > VMX as fully enabled, but at least one AP sees VMX as disabled.
> > > >
> > > > I don't see how the kernel can expect to function correctly with divergent feature
> > > > support across CPUs, i.e. the WARN is a _good_ thing in this case, because it
> > > > alerts the user that their system is messed up, e.g. has a bad BIOS or something.
> > >
> > > Yes, and yes.
> > >
> > > There are two issues. Clearing feature flags after alternatives have
> > > been applied should not happen, and this particular issue with that box.
> > >
> > > Lemme cook up something in the coming days for the former.
> >
> > Two simple patches as a reply to this.
> >
> > Oliver, can you run them on your box pls?
>
> we confirmed after applying them upon ee8962082a, the WARNING which was reported
> in our original report cannot be reproduced any longer.

I am so confused. With both patches applied, simulating VMX being disabled by
BIOS, which is a _legal_ configuration, yields:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 0 at arch/x86/kernel/cpu/cpuid-deps.c:118 do_clear_cpu_cap+0xf6/0x120
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.9.0-rc3-28ed6849f6ae-rev/boris-vm #389
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:do_clear_cpu_cap+0xf6/0x120
Call Trace:
<TASK>
init_ia32_feat_ctl+0x133/0x420
init_intel+0x11/0x330
identify_cpu+0x242/0x670
identify_secondary_cpu+0xe/0x40
smp_store_cpu_info+0x48/0x60
start_secondary+0x73/0x120
common_startup_64+0x13b/0x141
</TASK>
---[ end trace 0000000000000000 ]---

That's completely expected (by me at least), because as I said in the original
thread, secondary CPUs are identified after alternatives are applied, and when VMX
is disabled by BIOS, the feature flag will be initially set based on CPUID, and
then cleared by init_ia32_feat_ctl(). I.e. patch 1 is wrong on multiple levels.

And without _either_ patch applied, no WARN occurs, which is again expected (by
me), because init_ia32_feat_ctl() runs _before_ alternative_instructions() on the
boot CPU, i.e. alternatives_patched will be false when do_clear_cpu_cap() is
called by the boot CPU, and the boot_cpu_has(feature) that guards the WARN will
be false when do_clear_cpu_cap() is called by secondary CPUs.

The only way the WARN could have fired without this series is if VMX is enabled
in BIOS on the boot CPU, but disabled by BIOS on one more secondary CPUs. And
_that_ is a bogus setup that (a) the kernel absolutely should WARN about, and
(b) _still_ occurs with one or both patches applied.

So I don't see how this series could possibly have fixed the issue Oliver
encountered, nor do I see any value in moving init_ia32_feat_ctl() into
early_init_intel().

2024-05-06 15:58:27

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Mon, May 06, 2024 at 08:28:46AM -0700, Sean Christopherson wrote:
> The only way the WARN could have fired without this series is if VMX is enabled
> in BIOS on the boot CPU, but disabled by BIOS on one more secondary CPUs. And
> _that_ is a bogus setup that (a) the kernel absolutely should WARN about, and
> (b) _still_ occurs with one or both patches applied.

Right, that's my suspicion too.

> So I don't see how this series could possibly have fixed the issue Oliver
> encountered, nor do I see any value in moving init_ia32_feat_ctl() into
> early_init_intel().

Hm, right. I should've done this from the very beginning:

Oliver, can you please run the below debugging patch *without* any other
patches ontop of latest Linus master?

Also pls send /proc/cpuinfo and dmesg.

Thx.

---
diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
index 1640ae76548f..74d2f0a351aa 100644
--- a/arch/x86/kernel/cpu/feat_ctl.c
+++ b/arch/x86/kernel/cpu/feat_ctl.c
@@ -117,8 +117,14 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
bool tboot = tboot_enabled();
bool enable_vmx;
u64 msr;
+ int ret;

- if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
+ ret = rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr);
+
+ pr_info("%s: CPU%d: FEAT_CTL: 0x%llx, tboot: %d\n",
+ __func__, c->cpu_index, msr, tboot);
+
+ if (ret) {
clear_cpu_cap(c, X86_FEATURE_VMX);
clear_cpu_cap(c, X86_FEATURE_SGX);
return;
@@ -165,6 +171,9 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
msr |= FEAT_CTL_SGX_LC_ENABLED;
}

+ pr_info("%s: CPU%d: Write FEAT_CTL: 0x%llx\n",
+ __func__, c->cpu_index, msr);
+
wrmsrl(MSR_IA32_FEAT_CTL, msr);

update_caps:

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-07 02:30:07

by Oliver Sang

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

hi, Boris,

On Mon, May 06, 2024 at 10:12:35AM +0200, Borislav Petkov wrote:
> Oliver,
>
> does the warning go away when you run *only* the second patch?
>
> I.e., this one:
>
> https://lore.kernel.org/r/20240504125040.GCZjYvIAK9_DzKuHXh@fat_crate.local
>
> ?
>
> I think in this case, it'll make sure that the misconfigured system is
> "reconfigured" properly wrt VMX etc...
>
> Can you pls send dmesg and /proc/cpuinfo from that run?

not sure if this is still needed, anyway, our tests done, just update you the
status (we are testing your debug patch in a later thread now)

yes, the WARNING disappears when only running the second patch.

dmesg and /proc/cpuinfo are attached.

>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>


Attachments:
(No filename) (834.00 B)
dmesg.xz (24.19 kB)
cpuinfo (66.08 kB)
Download all attachments

2024-05-07 07:09:08

by Oliver Sang

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

hi, Boris,

On Mon, May 06, 2024 at 05:57:59PM +0200, Borislav Petkov wrote:
> On Mon, May 06, 2024 at 08:28:46AM -0700, Sean Christopherson wrote:
> > The only way the WARN could have fired without this series is if VMX is enabled
> > in BIOS on the boot CPU, but disabled by BIOS on one more secondary CPUs. And
> > _that_ is a bogus setup that (a) the kernel absolutely should WARN about, and
> > (b) _still_ occurs with one or both patches applied.
>
> Right, that's my suspicion too.
>
> > So I don't see how this series could possibly have fixed the issue Oliver
> > encountered, nor do I see any value in moving init_ia32_feat_ctl() into
> > early_init_intel().
>
> Hm, right. I should've done this from the very beginning:
>
> Oliver, can you please run the below debugging patch *without* any other
> patches ontop of latest Linus master?
>
> Also pls send /proc/cpuinfo and dmesg.

I applied the debug pach ontop of lastest Linus master:

1621a826233a7 debug patch from Boris for ee8962082a
dccb07f2914cd (HEAD, linus/master) Merge tag 'for-6.9-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

attached dmesg and cpuinfo (a little diff, so I attached it again)

>
> Thx.
>
> ---
> diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
> index 1640ae76548f..74d2f0a351aa 100644
> --- a/arch/x86/kernel/cpu/feat_ctl.c
> +++ b/arch/x86/kernel/cpu/feat_ctl.c
> @@ -117,8 +117,14 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> bool tboot = tboot_enabled();
> bool enable_vmx;
> u64 msr;
> + int ret;
>
> - if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
> + ret = rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr);
> +
> + pr_info("%s: CPU%d: FEAT_CTL: 0x%llx, tboot: %d\n",
> + __func__, c->cpu_index, msr, tboot);
> +
> + if (ret) {
> clear_cpu_cap(c, X86_FEATURE_VMX);
> clear_cpu_cap(c, X86_FEATURE_SGX);
> return;
> @@ -165,6 +171,9 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> msr |= FEAT_CTL_SGX_LC_ENABLED;
> }
>
> + pr_info("%s: CPU%d: Write FEAT_CTL: 0x%llx\n",
> + __func__, c->cpu_index, msr);
> +
> wrmsrl(MSR_IA32_FEAT_CTL, msr);
>
> update_caps:
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>


Attachments:
(No filename) (2.26 kB)
dmesg.xz (24.33 kB)
cpuinfo-2 (66.24 kB)
Download all attachments

2024-05-07 11:49:02

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

From: Borislav Petkov <[email protected]>
To: Oliver Sang <[email protected]>
Cc: Sean Christopherson <[email protected]>, [email protected],
[email protected], [email protected], [email protected],
Ingo Molnar <[email protected]>, Srikanth Aithal <[email protected]>
Bcc: [email protected]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a:
WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap
Reply-To:
In-Reply-To: <ZjnTW4XQwVHEiSaW@xsang-OptiPlex-9020>

On Tue, May 07, 2024 at 03:08:11PM +0800, Oliver Sang wrote:
> I applied the debug pach ontop of lastest Linus master:
>
> 1621a826233a7 debug patch from Boris for ee8962082a
> dccb07f2914cd (HEAD, linus/master) Merge tag 'for-6.9-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
>
> attached dmesg and cpuinfo (a little diff, so I attached it again)

Thanks, now what are we seeing here:

[ 0.763720][ T0] x86/cpu: init_ia32_feat_ctl: CPU0: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU1: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU2: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU3: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU4: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU5: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU6: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU7: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU8: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU9: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU10: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU11: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU12: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU13: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU14: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU15: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU16: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU17: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU18: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU19: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU20: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU21: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU22: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU23: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU24: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU25: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU26: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU27: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU28: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU29: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU30: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU31: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU32: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU33: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU34: FEAT_CTL: 0x5, tboot: 0
[ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU35: FEAT_CTL: 0x5, tboot: 0

So following the code in init_ia32_feat_ctl(), the BSP'll get to

if (msr & FEAT_CTL_LOCKED)
goto update_caps;

and that is the case - FEAT_CTL_LOCKED, bit 0, is set.

It'll go to the update_caps label and there it'll do:

if (!cpu_has(c, X86_FEATURE_VMX))
goto update_sgx;

VMX is set if I judge by the attached cpuinfo-2 so the next check comes:

if ( (tboot && !(msr & FEAT_CTL_VMX_ENABLED_INSIDE_SMX)) ||
(!tboot && !(msr & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX))) {
if (IS_ENABLED(CONFIG_KVM_INTEL))
pr_err_once("VMX (%s TXT) disabled by BIOS\n",
tboot ? "inside" : "outside");
clear_cpu_cap(c, X86_FEATURE_VMX);

tboot is 0, so the second conditional:

(!tboot && !(msr & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX))

FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX, bit 2 is set. So that conditional is
not true either. And the pr_err_once() doesn't appear in dmesg.

BUT(!), look what the original dmesg said:

[ 0.055225][ T0] x86/cpu: VMX (outside TXT) disabled by BIOS

So that FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX bit was not set back then. Why?

Oliver, have you done any BIOS config changes in the meantime?

This all looks really weird.

The other possibility would be if something changed between -rc3
(the branch x86/alternatives is based on) and -rc7. Unlikely but by now
everything's possible.

What could also be the case is, the BSP's FEAT_CTL is 0x0 (unconfigured,
whatever), we'd go in, set FEAT_CTL_LOCKED and that'll lock the bit in
all FEAT_CTLs on all cores, then it'll set
FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX. I'm presuming microcode forces this
and am obviously grasping at straws...

Then CPU1 will come, FEAT_CTL_LOCKED will be set but
FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX won't be set, leading to the warn.

But then again, SDM says that that MSR's scope is "Thread" which means,
locking that MSR on the BSP won't have effect on the same MSR on the
other HT thread.

Weird.

Ok, here's a bit modified debug patch ontop of the alternatives branch:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=feat_ctl

please run it and send me dmesg again.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-08 08:16:53

by Oliver Sang

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap



hi, Boris,


very sorry that this could be a wrong report due to BIOS issues.
please just ignore it.


thanks a lot for reminding us below:

> BUT(!), look what the original dmesg said:
>
> [ 0.055225][ T0] x86/cpu: VMX (outside TXT) disabled by BIOS
>
> So that FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX bit was not set back then. Why?
>
> Oliver, have you done any BIOS config changes in the meantime?

we really don't do any manual change recently, however, per your reminding,
we rerun tests on both ee8962082a and its parent v6.9-rc3 again.

while we made original report, we always saw below for both commits.
"x86/cpu: VMX (outside TXT) disabled by BIOS"

by rerun today, we cannot see it again, on both commits.

then for ee8962082a, the reported
WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap
disappeared also while no "x86/cpu: VMX (outside TXT) disabled by BIOS"

we are using kexec to boot into kernel, and the kernel images keep same,
so we are sure this strage phenomenon is not caused by kernel build.

now we doubt it's really a bios issue, we found another similar machine,
which also show "x86/cpu: VMX (outside TXT) disabled by BIOS", but after
several rounds of cold reboot, the message disappeared, too.

we will investigate this BIOS further and avoid this kind of misleading
report in the future.

sorry again :(


On Tue, May 07, 2024 at 01:48:14PM +0200, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
> To: Oliver Sang <[email protected]>
> Cc: Sean Christopherson <[email protected]>, [email protected],
> [email protected], [email protected], [email protected],
> Ingo Molnar <[email protected]>, Srikanth Aithal <[email protected]>
> Bcc: [email protected]
> Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a:
> WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap
> Reply-To:
> In-Reply-To: <ZjnTW4XQwVHEiSaW@xsang-OptiPlex-9020>
>
> On Tue, May 07, 2024 at 03:08:11PM +0800, Oliver Sang wrote:
> > I applied the debug pach ontop of lastest Linus master:
> >
> > 1621a826233a7 debug patch from Boris for ee8962082a
> > dccb07f2914cd (HEAD, linus/master) Merge tag 'for-6.9-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
> >
> > attached dmesg and cpuinfo (a little diff, so I attached it again)
>
> Thanks, now what are we seeing here:
>
> [ 0.763720][ T0] x86/cpu: init_ia32_feat_ctl: CPU0: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU1: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU2: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU3: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU4: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU5: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU6: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU7: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU8: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU9: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU10: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU11: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU12: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU13: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU14: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU15: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU16: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU17: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU18: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU19: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU20: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU21: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU22: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU23: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU24: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU25: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU26: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU27: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU28: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU29: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU30: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU31: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU32: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU33: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU34: FEAT_CTL: 0x5, tboot: 0
> [ 0.055231][ T0] x86/cpu: init_ia32_feat_ctl: CPU35: FEAT_CTL: 0x5, tboot: 0
>
> So following the code in init_ia32_feat_ctl(), the BSP'll get to
>
> if (msr & FEAT_CTL_LOCKED)
> goto update_caps;
>
> and that is the case - FEAT_CTL_LOCKED, bit 0, is set.
>
> It'll go to the update_caps label and there it'll do:
>
> if (!cpu_has(c, X86_FEATURE_VMX))
> goto update_sgx;
>
> VMX is set if I judge by the attached cpuinfo-2 so the next check comes:
>
> if ( (tboot && !(msr & FEAT_CTL_VMX_ENABLED_INSIDE_SMX)) ||
> (!tboot && !(msr & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX))) {
> if (IS_ENABLED(CONFIG_KVM_INTEL))
> pr_err_once("VMX (%s TXT) disabled by BIOS\n",
> tboot ? "inside" : "outside");
> clear_cpu_cap(c, X86_FEATURE_VMX);
>
> tboot is 0, so the second conditional:
>
> (!tboot && !(msr & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX))
>
> FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX, bit 2 is set. So that conditional is
> not true either. And the pr_err_once() doesn't appear in dmesg.
>
> BUT(!), look what the original dmesg said:
>
> [ 0.055225][ T0] x86/cpu: VMX (outside TXT) disabled by BIOS
>
> So that FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX bit was not set back then. Why?
>
> Oliver, have you done any BIOS config changes in the meantime?
>
> This all looks really weird.
>
> The other possibility would be if something changed between -rc3
> (the branch x86/alternatives is based on) and -rc7. Unlikely but by now
> everything's possible.
>
> What could also be the case is, the BSP's FEAT_CTL is 0x0 (unconfigured,
> whatever), we'd go in, set FEAT_CTL_LOCKED and that'll lock the bit in
> all FEAT_CTLs on all cores, then it'll set
> FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX. I'm presuming microcode forces this
> and am obviously grasping at straws...
>
> Then CPU1 will come, FEAT_CTL_LOCKED will be set but
> FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX won't be set, leading to the warn.
>
> But then again, SDM says that that MSR's scope is "Thread" which means,
> locking that MSR on the BSP won't have effect on the same MSR on the
> other HT thread.
>
> Weird.
>
> Ok, here's a bit modified debug patch ontop of the alternatives branch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=feat_ctl
>
> please run it and send me dmesg again.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

2024-05-08 08:25:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Wed, May 08, 2024 at 04:08:18PM +0800, Oliver Sang wrote:
> we really don't do any manual change recently, however, per your reminding,
> we rerun tests on both ee8962082a and its parent v6.9-rc3 again.
>
> while we made original report, we always saw below for both commits.
> "x86/cpu: VMX (outside TXT) disabled by BIOS"
>
> by rerun today, we cannot see it again, on both commits.

Yeah, from the last dump you shouldn't be seeing it.

> then for ee8962082a, the reported
> WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap
> disappeared also while no "x86/cpu: VMX (outside TXT) disabled by BIOS"

Yeah, it does feel like something underneath the OS has changed/is
changing for this to happen.

> we are using kexec to boot into kernel, and the kernel images keep same,
> so we are sure this strage phenomenon is not caused by kernel build.

Hmm, kexec won't go through BIOS so I can imagine if the kernel has left
FEAT_CTL in some weird state... but then if it has been locked, nothing
should change it anymore.

> now we doubt it's really a bios issue, we found another similar machine,
> which also show "x86/cpu: VMX (outside TXT) disabled by BIOS", but after
> several rounds of cold reboot, the message disappeared, too.

If you see that again, please run my branch with the debug patch - that
should at least tell us what's weird in FEAT_CTL.

> we will investigate this BIOS further and avoid this kind of misleading
> report in the future.
>
> sorry again :(

No worries, thanks for testing kernels and thanks Sean too. :-)

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-05-08 08:41:30

by Oliver Sang

[permalink] [raw]
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

hi, Boris,

On Wed, May 08, 2024 at 10:24:38AM +0200, Borislav Petkov wrote:
> On Wed, May 08, 2024 at 04:08:18PM +0800, Oliver Sang wrote:
> > we really don't do any manual change recently, however, per your reminding,
> > we rerun tests on both ee8962082a and its parent v6.9-rc3 again.
> >
> > while we made original report, we always saw below for both commits.
> > "x86/cpu: VMX (outside TXT) disabled by BIOS"
> >
> > by rerun today, we cannot see it again, on both commits.
>
> Yeah, from the last dump you shouldn't be seeing it.
>
> > then for ee8962082a, the reported
> > WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap
> > disappeared also while no "x86/cpu: VMX (outside TXT) disabled by BIOS"
>
> Yeah, it does feel like something underneath the OS has changed/is
> changing for this to happen.
>
> > we are using kexec to boot into kernel, and the kernel images keep same,
> > so we are sure this strage phenomenon is not caused by kernel build.
>
> Hmm, kexec won't go through BIOS so I can imagine if the kernel has left
> FEAT_CTL in some weird state... but then if it has been locked, nothing
> should change it anymore.
>
> > now we doubt it's really a bios issue, we found another similar machine,
> > which also show "x86/cpu: VMX (outside TXT) disabled by BIOS", but after
> > several rounds of cold reboot, the message disappeared, too.
>
> If you see that again, please run my branch with the debug patch - that
> should at least tell us what's weird in FEAT_CTL.

thanks a lot for education! sure, we will do that.

>
> > we will investigate this BIOS further and avoid this kind of misleading
> > report in the future.
> >
> > sorry again :(
>
> No worries, thanks for testing kernels and thanks Sean too. :-)
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>