2023-10-18 22:27:19

by Compostella, Jeremy

Subject: Reserved bits and commit x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach

Hi,

On both AMD and Intel platforms, when memory encryption is enabled (TME
on Intel, SME or SEV on AMD), the number of physical address bits
should be lowered. Both the AMD code (arch/x86/kernel/cpu/amd.c) and the
Intel code (arch/x86/kernel/cpu/intel.c) support this.

I recently noticed, though, that the Intel code does not lower the number
of physical address bits during early CPU initialization
(c_early_init), which leads to an MTRR sanity check failure in
generic_get_mtrr() with the following logs:

mtrr: your BIOS has configured an incorrect mask, fixing it.
mtrr: your BIOS has configured an incorrect mask, fixing it.
[...]

I have been working on fixing this following a similar approach to
what AMD code does: lower the number of physical address bits at early
initialization.
- AMD: early_init_amd() -> early_detect_mem_encrypt() -> c->x86_phys_bits -= [...]
- Intel: early_init_intel() -> detect_tme() -> c->x86_phys_bits -= [...]

I posted the patch on the LKML (cf. <https://lore.kernel.org/lkml/[email protected]/T/>)

It works just fine on v6.6-rc6. However, this morning Kirill brought
commit fbf6449f84bf ("x86/sev-es: Set x86_virt_bits to the correct
value straight away, instead of a two-phase approach") on the tip
branch to my attention. I believe it should break the AMD early flow,
and it does break the patch I submitted in my local tests.

This commit moves the get_cpu_address_sizes() call after
the this_cpu->c_early_init() call.

@@ -1601,7 +1607,6 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
cpu_detect(c);
get_cpu_vendor(c);
get_cpu_cap(c);
- get_cpu_address_sizes(c);
setup_force_cpu_cap(X86_FEATURE_CPUID);
cpu_parse_early_param();

@@ -1617,6 +1622,8 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
setup_clear_cpu_cap(X86_FEATURE_CPUID);
}

+ get_cpu_address_sizes(c);
+
setup_force_cpu_cap(X86_FEATURE_ALWAYS);

cpu_set_bug_bits(c);
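
The effect of that reordering can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct cpu_model`, the `model_*` helpers and all three numeric constants are assumptions made up for illustration. It shows that an in-place adjustment made by an early-init hook survives only if the address sizes are read before the hook runs.

```c
#include <assert.h>

/* Simplified stand-in for struct cpuinfo_x86; only the field discussed here. */
struct cpu_model {
	unsigned int x86_phys_bits;
};

/* Hypothetical values, chosen only for illustration. */
#define MODEL_BOOT_DEFAULT_BITS 46u /* value before any CPUID-based setup */
#define MODEL_CPUID_PHYS_BITS   46u /* width that CPUID would report */
#define MODEL_RESERVED_BITS      6u /* bits taken by memory encryption */

/* Models get_cpu_address_sizes(): overwrites the field with the CPUID width. */
static void model_get_cpu_address_sizes(struct cpu_model *c)
{
	c->x86_phys_bits = MODEL_CPUID_PHYS_BITS;
}

/* Models a vendor c_early_init() hook that lowers the field in place. */
static void model_c_early_init(struct cpu_model *c)
{
	assert(c->x86_phys_bits >= MODEL_RESERVED_BITS); /* guard against wrap-around */
	c->x86_phys_bits -= MODEL_RESERVED_BITS;
}

/* Ordering as in v6.6-rc6: read the sizes first, then adjust. */
static unsigned int phys_bits_sizes_then_early_init(void)
{
	struct cpu_model c = { .x86_phys_bits = MODEL_BOOT_DEFAULT_BITS };

	model_get_cpu_address_sizes(&c);
	model_c_early_init(&c);
	return c.x86_phys_bits;
}

/* Ordering after fbf6449f84bf: adjust first, then the sizes overwrite it. */
static unsigned int phys_bits_early_init_then_sizes(void)
{
	struct cpu_model c = { .x86_phys_bits = MODEL_BOOT_DEFAULT_BITS };

	model_c_early_init(&c);
	model_get_cpu_address_sizes(&c);
	return c.x86_phys_bits;
}
```

With these made-up numbers the first ordering ends at 40 bits, while the second ends at the unadjusted 46 bits because the overwrite discards the earlier subtraction.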

In the light of commit fbf6449f84bf I am wondering what is the right
approach to fix the regression for AMD and then fix the MTRR check for
Intel. Should we introduce a new cpu_dev callback to read the number
of reserved bits and take it into account in get_cpu_address_sizes() ?

Regards,

--
*Jeremy*
/One Emacs to rule them all/



2023-10-18 23:03:04

by Adam Dunlap

Subject: Re: Reserved bits and commit x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach

On Wed, Oct 18, 2023 at 3:27 PM Compostella, Jeremy
<[email protected]> wrote:
> In the light of commit fbf6449f84bf I am wondering what is the right
> approach to fix the regression for AMD and then fix the MTRR check for
> Intel. Should we introduce a new cpu_dev callback to read the number
> of reserved bits and take it into account in get_cpu_address_sizes() ?

I think this approach makes sense. It seems better to have one
function that simply sets it to the right thing rather than setting it
to one value and then adjusting it (fbf6449f84bf did that for
x86_virt_bits, although it caused some other problems). However, I'm
not sure it would solve the problem your original patch tried to fix,
since x86_phys_bits would still be set after intel_init, which
apparently uses the value. Would it work to move the call to
get_cpu_address_sizes() to nearer the start of early_identify_cpu()?
We could also add a cpu_dev callback so it doesn't need the 2-phase
approach, but this would at least bring it back into parity with
v6.6-rc6.

Ex (untested):

---
arch/x86/kernel/cpu/common.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bcd3b2df83bb..cdbe8241e250 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1592,6 +1592,8 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
if (!have_cpuid_p())
identify_cpu_without_cpuid(c);

+ get_cpu_address_sizes(c);
+
/* cyrix could have cpuid enabled via c_identify()*/
if (have_cpuid_p()) {
cpu_detect(c);
@@ -1612,8 +1614,6 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
setup_clear_cpu_cap(X86_FEATURE_CPUID);
}

- get_cpu_address_sizes(c);
-
setup_force_cpu_cap(X86_FEATURE_ALWAYS);

cpu_set_bug_bits(c);
--

Thanks for finding this problem!
Adam

2023-10-18 23:40:03

by Compostella, Jeremy

Subject: Re: Reserved bits and commit x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach

Adam Dunlap <[email protected]> writes:

> On Wed, Oct 18, 2023 at 3:27 PM Compostella, Jeremy
> <[email protected]> wrote:
>> In the light of commit fbf6449f84bf I am wondering what is the right
>> approach to fix the regression for AMD and then fix the MTRR check for
>> Intel. Should we introduce a new cpu_dev callback to read the number
>> of reserved bits and take it into account in get_cpu_address_sizes() ?
>
> I think this approach makes sense. It seems better to have one
> function that simply sets it to the right thing rather than setting
> it to one value and then adjusting it (fbf6449f84bf did that for
> x86_virt_bits, although it caused some other problems). However, I'm
> not sure it would solve the problem your original patch tried to
> fix, since x86_phys_bits would still be set after intel_init, which
> apparently uses the value.

Using cscope, I don't see any evidence of any vendor init code using
`x86_phys_bits'. To my knowledge, they seem to be only setting
x86_phys_bits or adjusting it.


> Would it work to move the call to get_cpu_address_sizes() to nearer
> the start of early_identify_cpu()? We could also add a cpu_dev
> callback so it doesn't need the 2-phase approach, but this would at
> least bring it back into parity with v6.6-rc6.

Such a change should resolve the issue I reported on this thread. I
can run a quick smoke test later tonight or tomorrow.

> Ex (untested):
>
> ---
> arch/x86/kernel/cpu/common.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index bcd3b2df83bb..cdbe8241e250 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1592,6 +1592,8 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
> if (!have_cpuid_p())
> identify_cpu_without_cpuid(c);
>
> + get_cpu_address_sizes(c);
> +
> /* cyrix could have cpuid enabled via c_identify()*/
> if (have_cpuid_p()) {
> cpu_detect(c);
> @@ -1612,8 +1614,6 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
> setup_clear_cpu_cap(X86_FEATURE_CPUID);
> }
>
> - get_cpu_address_sizes(c);
> -
> setup_force_cpu_cap(X86_FEATURE_ALWAYS);
>
> cpu_set_bug_bits(c);



2023-10-19 20:01:48

by Compostella, Jeremy

Subject: Re: Reserved bits and commit x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach

"Compostella, Jeremy" <[email protected]> writes:

> Adam Dunlap <[email protected]> writes:
>
>> On Wed, Oct 18, 2023 at 3:27 PM Compostella, Jeremy
>> <[email protected]> wrote:
>>> In the light of commit fbf6449f84bf I am wondering what is the right
>>> approach to fix the regression for AMD and then fix the MTRR check for
>>> Intel. Should we introduce a new cpu_dev callback to read the number
>>> of reserved bits and take it into account in get_cpu_address_sizes() ?
>>
>> I think this approach makes sense. It seems better to have one
>> function that simply sets it to the right thing rather than setting
>> it to one value and then adjusting it (fbf6449f84bf did that for
>> x86_virt_bits, although it caused some other problems). However, I'm
>> not sure it would solve the problem your original patch tried to
>> fix, since x86_phys_bits would still be set after intel_init, which
>> apparently uses the value.
>
> Using cscope, I don't see any evidence of any vendor init code using
> `x86_phys_bits'. To my knowledge, they seem to be only setting
> x86_phys_bits or adjusting it.
>
>
>> Would it work to move the call to get_cpu_address_sizes() to nearer
>> the start of early_identify_cpu()? We could also add a cpu_dev
>> callback so it doesn't need the 2-phase approach, but this would at
>> least bring it back into parity with v6.6-rc6.
>
> Such a change should resolve the issue I reported on this thread. I
> can run a quick smoke test later tonight or tomorrow.

It turns out that your suggestion does not work because
`get_cpu_address_sizes()' relies on `c->extended_cpuid_level' (set by
`get_cpu_cap(c)') and the `X86_FEATURE_CPUID' cpu capability (set by
`setup_force_cpu_cap(X86_FEATURE_CPUID)').
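
To make that dependency concrete, here is a minimal user-space sketch. It is not the kernel function: `struct cpu_model`, `model_get_cpu_address_sizes()` and the two fallback constants are assumptions for illustration. The only part taken as fact is the layout of EAX for CPUID leaf 0x80000008 (bits 7:0 physical width, bits 15:8 linear width).

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct cpuinfo_x86. */
struct cpu_model {
	uint32_t extended_cpuid_level; /* filled in by the get_cpu_cap() step */
	bool     has_cpuid_cap;        /* models X86_FEATURE_CPUID being set */
	uint8_t  x86_phys_bits;
	uint8_t  x86_virt_bits;
};

/* Hypothetical fallbacks used when the leaf cannot be consulted. */
#define MODEL_FALLBACK_PHYS_BITS 36
#define MODEL_FALLBACK_VIRT_BITS 48

/*
 * Decode EAX of CPUID leaf 0x80000008, but only when both prerequisites
 * hold. Returns true when the widths came from the (modelled) CPUID value.
 */
static bool model_get_cpu_address_sizes(struct cpu_model *c, uint32_t leaf_eax)
{
	if (!c->has_cpuid_cap || c->extended_cpuid_level < 0x80000008u) {
		c->x86_phys_bits = MODEL_FALLBACK_PHYS_BITS;
		c->x86_virt_bits = MODEL_FALLBACK_VIRT_BITS;
		return false;
	}

	c->x86_phys_bits = leaf_eax & 0xff;        /* EAX[7:0]  */
	c->x86_virt_bits = (leaf_eax >> 8) & 0xff; /* EAX[15:8] */
	return true;
}
```

Calling the model on a zero-initialized structure, as would happen if it ran before the two setup steps, takes the fallback path regardless of what the leaf contains.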

The following change works perfectly well for me:

,----
| @@ -1589,6 +1591,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
| get_cpu_vendor(c);
| get_cpu_cap(c);
| setup_force_cpu_cap(X86_FEATURE_CPUID);
| + get_cpu_address_sizes(c);
| cpu_parse_early_param();
|
| if (this_cpu->c_early_init)
| @@ -1603,7 +1606,6 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
| setup_clear_cpu_cap(X86_FEATURE_CPUID);
| }
|
| - get_cpu_address_sizes(c);
|
| setup_force_cpu_cap(X86_FEATURE_ALWAYS);
`----

Looking at fbf6449f84bf I am under the impression it should not hurt
it either but I'll let you verify.

--
*Jeremy*
/One Emacs to rule them all/



2024-01-30 21:34:07

by Jacob Xu

Subject: Re: Reserved bits and commit x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach

Adding some AMD folk to the thread here.

For AMD CPUs, initialization of c->x86_phys_bits occurs in
get_cpu_address_sizes() which is called from early_identify_cpu().

However, early_identify_cpu() will first call early_init_amd() which adjusts
x86_phys_bits based on the PhysAddrReduction CPUID field.

c->x86_phys_bits -= (cpuid_ebx(0x8000001f) >> 6) & 0x3f;

Thus, this adjustment is ignored.
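
For reference, the bit extraction in that statement can be sketched in isolation. The helper names and the sample register value used in the note below are hypothetical; the field positions (EBX[5:0] for the C-bit position, EBX[11:6] for the physical address reduction) follow the expression quoted above.

```c
#include <stdint.h>

/* CPUID 0x8000001f EBX[11:6]: number of physical address bits lost. */
static inline uint32_t model_phys_addr_reduction(uint32_t ebx)
{
	return (ebx >> 6) & 0x3f;
}

/* CPUID 0x8000001f EBX[5:0]: position of the encryption (C) bit. */
static inline uint32_t model_c_bit_position(uint32_t ebx)
{
	return ebx & 0x3f;
}
```

As an example, a made-up EBX value of 0x173 decodes to a reduction of 5 bits and a C-bit position of 51.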

Adding a new cpu_dev callback to calculate the number of reserved bits makes
sense to me; hopefully the AMD folk can chime in here, though.

Jacob

2024-01-30 21:44:38

by Jacob Xu

Subject: Re: Reserved bits and commit x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach

Oops, I left out one of my questions (besides the request to chime in on
the fix).

What's the consequence of leaving this field unadjusted? We noticed it's
zero on Milan but non-zero on Genoa.

Jacob

2024-01-31 21:34:28

by Tom Lendacky

Subject: Re: Reserved bits and commit x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach

On 1/30/24 15:33, Jacob Xu wrote:
> Adding some AMD folk to the thread here.
>
> For AMD CPUs, initialization of c->x86_phys_bits occurs in
> get_cpu_address_sizes() which is called from early_identify_cpu().
>
> However, early_identify_cpu() will first call early_init_amd() which adjusts
> x86_phys_bits based on the PhysAddrReduction CPUID field.
>
> c->x86_phys_bits -= (cpuid_ebx(0x8000001f) >> 6) & 0x3f;
>
> Thus, this adjustment is ignored.
>
> Adding a new cpu_dev callback to calculate num reserved_cpu_bits makes sense to
> me, hopefully the AMD folk can chime in here though.

Later identify_cpu() calls init_amd() which then makes the adjustment. So
there is a window between when the value is at 48 and when it gets reduced
to 43 (on my Milan system).

The actual flow has setup_arch() set the value to MAX_PHYSMEM_BITS, which
is 46. Then early_detect_mem_encrypt() reduces that to 41. Then
get_cpu_address_sizes() resets it to 48. Then a bit later, identify_cpu()
calls init_amd() which calls early_init_amd() which calls
early_detect_mem_encrypt() which reduces x86_phys_bits to 43.
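
The sequence above can be replayed as plain arithmetic. This is a sketch only: the constants are the values quoted in this message (the 5-bit reduction is inferred from 46 -> 41 and 48 -> 43), and `trace_phys_bits()` is a hypothetical helper, not a kernel function.

```c
#define MODEL_MAX_PHYSMEM_BITS 46u /* value set by setup_arch() */
#define MODEL_CPUID_PHYS_BITS  48u /* value read by get_cpu_address_sizes() */
#define MODEL_REDUCTION         5u /* inferred from the numbers above */

enum { MODEL_STAGES = 4 };

/* Records x86_phys_bits after each of the four steps described above. */
static void trace_phys_bits(unsigned int out[MODEL_STAGES])
{
	unsigned int bits;

	bits = MODEL_MAX_PHYSMEM_BITS; /* setup_arch() */
	out[0] = bits;

	bits -= MODEL_REDUCTION;       /* first early_detect_mem_encrypt() */
	out[1] = bits;

	bits = MODEL_CPUID_PHYS_BITS;  /* get_cpu_address_sizes() overwrites */
	out[2] = bits;

	bits -= MODEL_REDUCTION;       /* init_amd() -> early_init_amd() */
	out[3] = bits;
}
```

The recorded values are 46, 41, 48 and 43, so the window in question is the span between the third and fourth steps.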

Looking closer, if mem_encrypt=off is specified, then X86_FEATURE_SME is
cleared and it is X86_FEATURE_SEV that causes the adjustment. If
X86_FEATURE_SEV also gets cleared, we won't make the adjustment even
though we should.

So I like the idea of a callback to calculate the number of reserved
physical address bits.
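
One possible shape for such a callback, as a user-space sketch only. Everything here is hypothetical: the `c_reserved_phys_bits` hook, `struct cpu_dev_model`, `struct cpu_model` and the helper names do not exist in the kernel; they just illustrate a single-step computation where the vendor reports the reserved bits and the common code subtracts them once.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct cpu_model;

/* Hypothetical vendor ops table with the proposed hook. */
struct cpu_dev_model {
	unsigned int (*c_reserved_phys_bits)(const struct cpu_model *c);
};

/* Simplified stand-in for struct cpuinfo_x86. */
struct cpu_model {
	unsigned int x86_phys_bits;
	uint32_t enc_ebx;            /* modelled CPUID 0x8000001f EBX */
	bool mem_encrypt_enabled;    /* modelled hardware enable state */
	const struct cpu_dev_model *vendor;
};

/* Hypothetical AMD hook: depends on hardware state, not on feature flags. */
static unsigned int model_amd_reserved_phys_bits(const struct cpu_model *c)
{
	if (!c->mem_encrypt_enabled)
		return 0;
	return (c->enc_ebx >> 6) & 0x3f;
}

static const struct cpu_dev_model model_amd_dev = {
	.c_reserved_phys_bits = model_amd_reserved_phys_bits,
};

/* Common code: set the final value in one step, with no later adjustment. */
static void model_get_cpu_address_sizes(struct cpu_model *c,
					unsigned int cpuid_phys_bits)
{
	unsigned int reserved = 0;

	if (c->vendor != NULL && c->vendor->c_reserved_phys_bits != NULL)
		reserved = c->vendor->c_reserved_phys_bits(c);
	if (reserved > cpuid_phys_bits) /* ignore an implausible report */
		reserved = 0;

	c->x86_phys_bits = cpuid_phys_bits - reserved;
}
```

In this sketch the result no longer depends on when the vendor early-init code runs relative to the size detection, which is the ordering problem discussed in this thread.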

Thanks,
Tom

>
> Jacob