2023-10-02 21:13:58

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] x86/sev-es: Only set x86_virt_bits to correct value

Hi Adam,

On Mon, Sep 11, 2023 at 05:27:03PM -0700, Adam Dunlap wrote:
> Instead of setting x86_virt_bits to a possibly-correct value and then
> correcting it later, do all the necessary checks before setting it.
>
> At this point, the #VC handler references boot_cpu_data.x86_virt_bits,
> and in the previous version, it would be triggered by the cpuids between
> the point at which it is set to 48 and when it is set to the correct
> value.
>
> Suggested-by: Dave Hansen <[email protected]>
> Signed-off-by: Adam Dunlap <[email protected]>

Our continuous integration started seeing panics when booting ARCH=i386
without KVM after this change landed in -tip as commit fbf6449f84bf
("x86/sev-es: Set x86_virt_bits to the correct value straight away,
instead of a two-phase approach"):

https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/6374441341/job/17299441736

I can reproduce this with GCC 13.2.0 from kernel.org and QEMU 8.1.1 (in
case those versions matter):

https://mirrors.edge.kernel.org/pub/tools/crosstool/

$ make -skj"$(nproc)" ARCH=i386 CROSS_COMPILE=i386-linux- defconfig bzImage

$ qemu-system-i386 \
-display none \
-nodefaults \
-append 'console=ttyS0 earlycon=uart8250,io,0x3f8' \
-kernel arch/x86/boot/bzImage \
-initrd rootfs.cpio \
-m 512m \
-serial mon:stdio
[ 0.000000] Linux version 6.6.0-rc1-00008-gfbf6449f84bf ([email protected]) (i386-linux-gcc (GCC) 13.2.0, GNU ld (GNU Binutils) 2.41) #1 SMP PREEMPT_DYNAMIC Mon Oct 2 12:55:06 MST 2023
...
[ 0.061831] BUG: kernel NULL pointer dereference, address: 00000020
[ 0.062135] #PF: supervisor write access in kernel mode
[ 0.062279] #PF: error_code(0x0002) - not-present page
[ 0.062430] *pde = 00000000
[ 0.062602] Oops: 0002 [#1] PREEMPT SMP
[ 0.062839] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc1-00008-gfbf6449f84bf #1
[ 0.063086] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[ 0.063515] EIP: __ring_buffer_alloc+0x3a/0x1a4
[ 0.064135] Code: 8b 15 74 da c2 ce 8d 42 6f f7 da 21 d0 ba c0 0d 00 00 e8 2d d0 07 00 85 c0 0f 84 15 01 00 00 8b 4d f0 89 c3 81 c6 f3 0f 00 00 <c7> 40 10 00 00 00 00 8d 40 4c ba 2d a7 8b ce 89 78 b4 c7 40 ec 60
[ 0.064795] EAX: 00000010 EBX: 00000010 ECX: ced82e8d EDX: 00000dc0
[ 0.064970] ESI: 00001ff3 EDI: 00000001 EBP: cea3df54 ESP: cea3df44
[ 0.065151] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00200006
[ 0.065347] CR0: 80050033 CR2: 00000020 CR3: 0ed2d000 CR4: 00000090
[ 0.065596] Call Trace:
[ 0.066142] ? show_regs+0x4d/0x54
[ 0.066293] ? __die+0x21/0x68
[ 0.066383] ? page_fault_oops+0x18c/0x368
[ 0.066502] ? kernelmode_fixup_or_oops.constprop.0+0x84/0xdc
[ 0.066663] ? __bad_area_nosemaphore.constprop.0+0x125/0x1dc
[ 0.066811] ? __alloc_pages+0x156/0xe34
[ 0.066920] ? bad_area_nosemaphore+0x12/0x18
[ 0.067036] ? do_user_addr_fault+0x133/0x3e8
[ 0.067156] ? exc_page_fault+0x51/0x13c
[ 0.067269] ? pvclock_clocksource_read_nowd+0x110/0x110
[ 0.067421] ? handle_exception+0x133/0x133
[ 0.067553] ? pvclock_clocksource_read_nowd+0x110/0x110
[ 0.067724] ? __ring_buffer_alloc+0x3a/0x1a4
[ 0.067846] ? pvclock_clocksource_read_nowd+0x110/0x110
[ 0.067983] ? __ring_buffer_alloc+0x3a/0x1a4
[ 0.068113] early_trace_init+0xab/0x390
[ 0.068296] ? ring_buffer_reset_online_cpus+0xfc/0xfc
[ 0.068443] start_kernel+0x217/0x650
[ 0.068542] ? set_init_arg+0x70/0x70
[ 0.068643] i386_start_kernel+0x43/0x44
[ 0.068754] startup_32_smp+0x156/0x158
[ 0.068960] Modules linked in:
[ 0.069166] CR2: 0000000000000020
[ 0.069495] ---[ end trace 0000000000000000 ]---
[ 0.069653] EIP: __ring_buffer_alloc+0x3a/0x1a4
[ 0.069781] Code: 8b 15 74 da c2 ce 8d 42 6f f7 da 21 d0 ba c0 0d 00 00 e8 2d d0 07 00 85 c0 0f 84 15 01 00 00 8b 4d f0 89 c3 81 c6 f3 0f 00 00 <c7> 40 10 00 00 00 00 8d 40 4c ba 2d a7 8b ce 89 78 b4 c7 40 ec 60
[ 0.070249] EAX: 00000010 EBX: 00000010 ECX: ced82e8d EDX: 00000dc0
[ 0.070412] ESI: 00001ff3 EDI: 00000001 EBP: cea3df54 ESP: cea3df44
[ 0.070577] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00200006
[ 0.070761] CR0: 80050033 CR2: 00000020 CR3: 0ed2d000 CR4: 00000090
[ 0.071023] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.071643] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
...

In case it is necessary, our rootfs image is available at
https://github.com/ClangBuiltLinux/boot-utils/releases.

As mentioned before, this interestingly does not reproduce with
'-enable-kvm', so it could be potentially related to QEMU's TCG. If
there is any additional information I can provide or patches I can test,
I am more than happy to do so.

Cheers,
Nathan

# bad: [18a1f37f57ff28fa544124151ef0171ffdafd162] Merge branch into tip/master: 'x86/tdx'
# good: [9f3ebbef746f89f860a90ced99a359202ea86fde] Merge tag '6.6-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd
git bisect start '18a1f37f57ff28fa544124151ef0171ffdafd162' '9f3ebbef746f89f860a90ced99a359202ea86fde'
# good: [6395aa368403d595c8aab4ee1ac451b84fb37640] Merge branch into tip/master: 'sched/core'
git bisect good 6395aa368403d595c8aab4ee1ac451b84fb37640
# good: [7eddb9db1104079c5f5ac6b17a46f7a7860cf444] Merge branch into tip/master: 'x86/boot'
git bisect good 7eddb9db1104079c5f5ac6b17a46f7a7860cf444
# good: [3fb58599f5a9efaa422c9c9e78e3e799a58d4f63] Merge branch into tip/master: 'x86/entry'
git bisect good 3fb58599f5a9efaa422c9c9e78e3e799a58d4f63
# bad: [062ce2831f0526dec8930a35f1624c0ae50dd667] Merge branch into tip/master: 'x86/platform'
git bisect bad 062ce2831f0526dec8930a35f1624c0ae50dd667
# good: [f4c5ca9850124fb5715eff06cffb1beed837500c] x86_64: Show CR4.PSE on auxiliaries like on BSP
git bisect good f4c5ca9850124fb5715eff06cffb1beed837500c
# good: [24775700eaa93ff83b2a0f1e005879cdf186cdd9] x86/amd_nb: Add AMD Family MI300 PCI IDs
git bisect good 24775700eaa93ff83b2a0f1e005879cdf186cdd9
# bad: [fbf6449f84bf5e4ad09f2c09ee70ed7d629b5ff6] x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach
git bisect bad fbf6449f84bf5e4ad09f2c09ee70ed7d629b5ff6
# good: [f79936545fb122856bd78b189d3c7ee59928c751] x86/sev-es: Allow copy_from_kernel_nofault() in earlier boot
git bisect good f79936545fb122856bd78b189d3c7ee59928c751
# first bad commit: [fbf6449f84bf5e4ad09f2c09ee70ed7d629b5ff6] x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach


2023-10-02 21:54:08

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] x86/sev-es: Only set x86_virt_bits to correct value

On 10/2/23 13:04, Nathan Chancellor wrote:
> On Mon, Sep 11, 2023 at 05:27:03PM -0700, Adam Dunlap wrote:
>> Instead of setting x86_virt_bits to a possibly-correct value and then
>> correcting it later, do all the necessary checks before setting it.
>>
>> At this point, the #VC handler references boot_cpu_data.x86_virt_bits,
>> and in the previous version, it would be triggered by the cpuids between
>> the point at which it is set to 48 and when it is set to the correct
>> value.
>>
>> Suggested-by: Dave Hansen <[email protected]>
>> Signed-off-by: Adam Dunlap <[email protected]>
> Our continuous integration started seeing panics when booting ARCH=i386
> without KVM after this change landed in -tip as commit fbf6449f84bf
> ("x86/sev-es: Set x86_virt_bits to the correct value straight away,
> instead of a two-phase approach"):

I can't reproduce this, but I'm running a gcc-built kernel and I haven't
tried very hard to replicate your qemu setup.

I did notice, though, that the patch in question forgot to move one
assignment. Could you see if the attached patch fixes things for you?


Attachments:
cache-bits-debug.patch (750.00 B)

2023-10-02 21:58:26

by Adam Dunlap

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] x86/sev-es: Only set x86_virt_bits to correct value

On Mon, Oct 2, 2023 at 2:41 PM Dave Hansen <[email protected]> wrote:
>
> On 10/2/23 13:04, Nathan Chancellor wrote:
> > On Mon, Sep 11, 2023 at 05:27:03PM -0700, Adam Dunlap wrote:
> >> Instead of setting x86_virt_bits to a possibly-correct value and then
> >> correcting it later, do all the necessary checks before setting it.
> >>
> >> At this point, the #VC handler references boot_cpu_data.x86_virt_bits,
> >> and in the previous version, it would be triggered by the cpuids between
> >> the point at which it is set to 48 and when it is set to the correct
> >> value.
> >>
> >> Suggested-by: Dave Hansen <[email protected]>
> >> Signed-off-by: Adam Dunlap <[email protected]>
> > Our continuous integration started seeing panics when booting ARCH=i386
> > without KVM after this change landed in -tip as commit fbf6449f84bf
> > ("x86/sev-es: Set x86_virt_bits to the correct value straight away,
> > instead of a two-phase approach"):
>
> I can't reproduce this, but I'm running a gcc-built kernel and I haven't
> tried very hard to replicate your qemu setup.
>
> I did notice, though, that the patch in question forgot to move one
> assignment. Could you see if the attached patch fixes things for you?

I reproduced the issue as Nathan described and your attached patch fixes it
for me.

2023-10-02 22:03:56

by Dave Hansen

[permalink] [raw]
Subject: [PATCH] x86/boot: Move x86_cache_alignment initialization to correct spot

c->x86_cache_alignment is initialized from c->x86_clflush_size.
However, commit fbf6449f84bf moved c->x86_clflush_size initialization
to later in boot without moving the c->x86_cache_alignment assignment.

This presumably left c->x86_cache_alignment set to zero for longer
than it should be.

The result was an oops on 32-bit kernels while accessing a pointer
at 0x20. The 0x20 came from accessing a structure member at offset
0x10 (buffer->cpumask) from a ZERO_SIZE_PTR=0x10. kmalloc() can
evidently return ZERO_SIZE_PTR when it's given 0 as its alignment
requirement.

Move the c->x86_cache_alignment initialization to be after
c->x86_clflush_size has an actual value.

Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
Cc: Adam Dunlap <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jacob Xu <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
---
arch/x86/kernel/cpu/common.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8d7063e4f63c9..9c51ad5bbf319 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1141,6 +1141,7 @@ void get_cpu_address_sizes(struct cpuinfo_x86 *c)
}
}
c->x86_cache_bits = c->x86_phys_bits;
+ c->x86_cache_alignment = c->x86_clflush_size;
}

static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
@@ -1594,8 +1595,6 @@ static void __init cpu_parse_early_param(void)
*/
static void __init early_identify_cpu(struct cpuinfo_x86 *c)
{
- c->x86_cache_alignment = c->x86_clflush_size;
-
memset(&c->x86_capability, 0, sizeof(c->x86_capability));
c->extended_cpuid_level = 0;

--
2.34.1

2023-10-02 22:24:10

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH] x86/boot: Move x86_cache_alignment initialization to correct spot

On Mon, Oct 02, 2023 at 03:00:45PM -0700, Dave Hansen wrote:
> c->x86_cache_alignment is initialized from c->x86_clflush_size.
> However, commit fbf6449f84bf moved c->x86_clflush_size initialization
> to later in boot without moving the c->x86_cache_alignment assignment.
>
> This presumably left c->x86_cache_alignment set to zero for longer
> than it should be.
>
> The result was an oops on 32-bit kernels while accessing a pointer
> at 0x20. The 0x20 came from accessing a structure member at offset
> 0x10 (buffer->cpumask) from a ZERO_SIZE_PTR=0x10. kmalloc() can
> evidently return ZERO_SIZE_PTR when it's given 0 as its alignment
> requirement.
>
> Move the c->x86_cache_alignment initialization to be after
> c->x86_clflush_size has an actual value.
>
> Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
> Cc: Adam Dunlap <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Jacob Xu <[email protected]>
> Link: https://lore.kernel.org/all/[email protected]/

Tested-by: Nathan Chancellor <[email protected]>

Thanks for the quick fix!

> ---
> arch/x86/kernel/cpu/common.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 8d7063e4f63c9..9c51ad5bbf319 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1141,6 +1141,7 @@ void get_cpu_address_sizes(struct cpuinfo_x86 *c)
> }
> }
> c->x86_cache_bits = c->x86_phys_bits;
> + c->x86_cache_alignment = c->x86_clflush_size;
> }
>
> static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
> @@ -1594,8 +1595,6 @@ static void __init cpu_parse_early_param(void)
> */
> static void __init early_identify_cpu(struct cpuinfo_x86 *c)
> {
> - c->x86_cache_alignment = c->x86_clflush_size;
> -
> memset(&c->x86_capability, 0, sizeof(c->x86_capability));
> c->extended_cpuid_level = 0;
>
> --
> 2.34.1
>

2023-10-03 07:28:49

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86/boot: Move x86_cache_alignment initialization to correct spot


* Nathan Chancellor <[email protected]> wrote:

> On Mon, Oct 02, 2023 at 03:00:45PM -0700, Dave Hansen wrote:
> > c->x86_cache_alignment is initialized from c->x86_clflush_size.
> > However, commit fbf6449f84bf moved c->x86_clflush_size initialization
> > to later in boot without moving the c->x86_cache_alignment assignment.
> >
> > This presumably left c->x86_cache_alignment set to zero for longer
> > than it should be.
> >
> > The result was an oops on 32-bit kernels while accessing a pointer
> > at 0x20. The 0x20 came from accessing a structure member at offset
> > 0x10 (buffer->cpumask) from a ZERO_SIZE_PTR=0x10. kmalloc() can
> > evidently return ZERO_SIZE_PTR when it's given 0 as its alignment
> > requirement.
> >
> > Move the c->x86_cache_alignment initialization to be after
> > c->x86_clflush_size has an actual value.
> >
> > Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
> > Cc: Adam Dunlap <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Jacob Xu <[email protected]>
> > Link: https://lore.kernel.org/all/[email protected]/
>
> Tested-by: Nathan Chancellor <[email protected]>
>
> Thanks for the quick fix!

Thanks for the quick testing - I've applied this fix on top
of fbf6449f84bf in tip:x86/mm.

Dave, I've added your SOB - let me know if that's not OK:

Signed-off-by: Dave Hansen <[email protected]>

Thanks,

Ingo

2023-10-03 07:38:56

by tip-bot2 for Jacob Pan

[permalink] [raw]
Subject: [tip: x86/mm] x86/boot: Move x86_cache_alignment initialization to correct spot

The following commit has been merged into the x86/mm branch of tip:

Commit-ID: 3e32552652917f10c0aa8ac75cdc8f0b8d257dec
Gitweb: https://git.kernel.org/tip/3e32552652917f10c0aa8ac75cdc8f0b8d257dec
Author: Dave Hansen <[email protected]>
AuthorDate: Mon, 02 Oct 2023 15:00:45 -07:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Tue, 03 Oct 2023 09:27:12 +02:00

x86/boot: Move x86_cache_alignment initialization to correct spot

c->x86_cache_alignment is initialized from c->x86_clflush_size.
However, commit fbf6449f84bf moved c->x86_clflush_size initialization
to later in boot without moving the c->x86_cache_alignment assignment:

fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")

This presumably left c->x86_cache_alignment set to zero for longer
than it should be.

The result was an oops on 32-bit kernels while accessing a pointer
at 0x20. The 0x20 came from accessing a structure member at offset
0x10 (buffer->cpumask) from a ZERO_SIZE_PTR=0x10. kmalloc() can
evidently return ZERO_SIZE_PTR when it's given 0 as its alignment
requirement.

Move the c->x86_cache_alignment initialization to be after
c->x86_clflush_size has an actual value.

Fixes: fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
Signed-off-by: Dave Hansen <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/common.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8d7063e..9c51ad5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1141,6 +1141,7 @@ void get_cpu_address_sizes(struct cpuinfo_x86 *c)
}
}
c->x86_cache_bits = c->x86_phys_bits;
+ c->x86_cache_alignment = c->x86_clflush_size;
}

static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
@@ -1594,8 +1595,6 @@ static void __init cpu_parse_early_param(void)
*/
static void __init early_identify_cpu(struct cpuinfo_x86 *c)
{
- c->x86_cache_alignment = c->x86_clflush_size;
-
memset(&c->x86_capability, 0, sizeof(c->x86_capability));
c->extended_cpuid_level = 0;