2024-01-31 23:27:59

by Paolo Bonzini

Subject: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

Supersedes: <[email protected]>

MKTME repurposes the high bits of the physical address as a key ID for
the encryption key and, even though MAXPHYADDR in CPUID[0x80000008]
remains the same, the valid bits in the MTRR mask register are based on
the reduced number of physical address bits. This breaks boot on machines
that have TME enabled and do anything to clean up the MTRRs, unless
"disable_mtrr_cleanup" is passed on the command line. The fix is to move
the detection to early CPU initialization, which runs before Linux sets
up MTRRs.
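
As a hedged illustration of the failure mode (a standalone sketch, not
code from this series; physmask_ok() is a made-up helper): the MTRR
PHYSMASK MSRs treat bits at or above the effective MAXPHYADDR as
reserved, so a mask computed for 52 physical bits faults once TME
reduces MAXPHYADDR to 46:

    /* Hedged sketch, not kernel code: why a 52-bit-wide PHYSMASK value
     * is rejected once TME key ID bits reduce MAXPHYADDR to 46. */
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static bool physmask_ok(uint64_t physmask, int phys_bits)
    {
            /* Bits at or above the effective MAXPHYADDR are reserved;
             * setting any of them makes the WRMSR fault with #GP. */
            uint64_t reserved = ~0ULL << phys_bits;

            return !(physmask & reserved);
    }

    int main(void)
    {
            uint64_t mask = 0x000fffff80000800ULL; /* value from patch 2/2 */

            printf("MAXPHYADDR 52: %s\n", physmask_ok(mask, 52) ? "ok" : "#GP");
            printf("MAXPHYADDR 46: %s\n", physmask_ok(mask, 46) ? "ok" : "#GP");
            return 0;
    }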

However, as noticed by Kirill, the patch I sent as v1 actually works only
up to Linux 6.6. In Linux 6.7, commit fbf6449f84bf ("x86/sev-es: Set
x86_virt_bits to the correct value straight away, instead of a two-phase
approach") reorganized the initialization of c->x86_phys_bits in a way
that broke the patch. But even in 6.7, AMD processors, which did try to
reduce it in this_cpu->c_early_init(c), had their x86_phys_bits value
overwritten by get_cpu_address_sizes(), so that early_identify_cpu()
left the wrong value in x86_phys_bits. This probably went unnoticed
because on AMD processors the reduced MAXPHYADDR need not be applied to
the MTRR masks.

Therefore, this v2 prepends a fix for the issue introduced by commit
fbf6449f84bf. Apologies for the oversight.

Tested on an AMD EPYC machine (where I resorted to dumping mtrr_state) and
on the problematic Intel Emerald Rapids machine.

Thanks,

Paolo

Paolo Bonzini (2):
x86/cpu: allow reducing x86_phys_bits during early_identify_cpu()
x86/cpu/intel: Detect TME keyid bits before setting MTRR mask
registers

arch/x86/kernel/cpu/common.c | 4 +-
arch/x86/kernel/cpu/intel.c | 178 ++++++++++++++++++-----------------
2 files changed, 93 insertions(+), 89 deletions(-)

--
2.43.0



2024-01-31 23:35:13

by Paolo Bonzini

Subject: [PATCH v2 2/2] x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers

MKTME repurposes the high bits of the physical address as a key ID for
the encryption key and, even though MAXPHYADDR in CPUID[0x80000008]
remains the same, the valid bits in the MTRR mask register are based on
the reduced number of physical address bits.

detect_tme() in arch/x86/kernel/cpu/intel.c detects TME and subtracts
the keyid bits from the total number of usable physical address bits,
but it is called too late. Move the call to early_init_intel() so that
it runs in setup_arch(), before the MTRRs are set up.

This fixes boot on TDX-enabled systems, which until now only worked with
"disable_mtrr_cleanup". Without the patch, the values written to the
MTRR mask registers were 52 bits wide (e.g. 0x000fffff_80000800) and
the writes failed; with the patch, the values are 46 bits wide, which
matches the reduced MAXPHYADDR that is shown in /proc/cpuinfo.
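
As a worked example (a hedged userspace sketch, not part of the patch;
the MSR value is invented) of the keyid-bit arithmetic that detect_tme()
performs, using the same bit layout as the helpers in the patch below:

    /* Hedged sketch: decode an invented IA32_TME_ACTIVATE value the way
     * detect_tme() does and derive the effective MAXPHYADDR. */
    #include <stdint.h>
    #include <stdio.h>

    #define TME_ACTIVATE_KEYID_BITS(x) (((x) >> 32) & 0xf)  /* Bits 35:32 */

    int main(void)
    {
            /* Invented value: locked (bit 0), enabled (bit 1), 6 keyid bits. */
            uint64_t tme_activate = 0x0000000600000003ULL;
            int keyid_bits = TME_ACTIVATE_KEYID_BITS(tme_activate);

            /* KeyID bits come out of the usable physical address bits:
             * a 52-bit MAXPHYADDR becomes the 46 bits quoted above. */
            printf("keyid_bits=%d: MAXPHYADDR 52 -> %d\n",
                   keyid_bits, 52 - keyid_bits);
            return 0;
    }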

Reported-by: Zixi Chen <[email protected]>
Cc: Adam Dunlap <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Xiaoyao Li <[email protected]>
Cc: Kai Huang <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kernel/cpu/intel.c | 178 ++++++++++++++++++------------------
1 file changed, 91 insertions(+), 87 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index a927a8fc9624..40dec9b56f87 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -184,6 +184,90 @@ static bool bad_spectre_microcode(struct cpuinfo_x86 *c)
         return false;
 }
 
+#define MSR_IA32_TME_ACTIVATE           0x982
+
+/* Helpers to access TME_ACTIVATE MSR */
+#define TME_ACTIVATE_LOCKED(x)          (x & 0x1)
+#define TME_ACTIVATE_ENABLED(x)         (x & 0x2)
+
+#define TME_ACTIVATE_POLICY(x)          ((x >> 4) & 0xf)        /* Bits 7:4 */
+#define TME_ACTIVATE_POLICY_AES_XTS_128 0
+
+#define TME_ACTIVATE_KEYID_BITS(x)      ((x >> 32) & 0xf)       /* Bits 35:32 */
+
+#define TME_ACTIVATE_CRYPTO_ALGS(x)     ((x >> 48) & 0xffff)    /* Bits 63:48 */
+#define TME_ACTIVATE_CRYPTO_AES_XTS_128 1
+
+/* Values for mktme_status (SW only construct) */
+#define MKTME_ENABLED                   0
+#define MKTME_DISABLED                  1
+#define MKTME_UNINITIALIZED             2
+static int mktme_status = MKTME_UNINITIALIZED;
+
+static void detect_tme_early(struct cpuinfo_x86 *c)
+{
+        u64 tme_activate, tme_policy, tme_crypto_algs;
+        int keyid_bits = 0, nr_keyids = 0;
+        static u64 tme_activate_cpu0 = 0;
+
+        rdmsrl(MSR_IA32_TME_ACTIVATE, tme_activate);
+
+        if (mktme_status != MKTME_UNINITIALIZED) {
+                if (tme_activate != tme_activate_cpu0) {
+                        /* Broken BIOS? */
+                        pr_err_once("x86/tme: configuration is inconsistent between CPUs\n");
+                        pr_err_once("x86/tme: MKTME is not usable\n");
+                        mktme_status = MKTME_DISABLED;
+
+                        /* Proceed. We may need to exclude bits from x86_phys_bits. */
+                }
+        } else {
+                tme_activate_cpu0 = tme_activate;
+        }
+
+        if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
+                pr_info_once("x86/tme: not enabled by BIOS\n");
+                mktme_status = MKTME_DISABLED;
+                return;
+        }
+
+        if (mktme_status != MKTME_UNINITIALIZED)
+                goto detect_keyid_bits;
+
+        pr_info("x86/tme: enabled by BIOS\n");
+
+        tme_policy = TME_ACTIVATE_POLICY(tme_activate);
+        if (tme_policy != TME_ACTIVATE_POLICY_AES_XTS_128)
+                pr_warn("x86/tme: Unknown policy is active: %#llx\n", tme_policy);
+
+        tme_crypto_algs = TME_ACTIVATE_CRYPTO_ALGS(tme_activate);
+        if (!(tme_crypto_algs & TME_ACTIVATE_CRYPTO_AES_XTS_128)) {
+                pr_err("x86/mktme: No known encryption algorithm is supported: %#llx\n",
+                       tme_crypto_algs);
+                mktme_status = MKTME_DISABLED;
+        }
+detect_keyid_bits:
+        keyid_bits = TME_ACTIVATE_KEYID_BITS(tme_activate);
+        nr_keyids = (1UL << keyid_bits) - 1;
+        if (nr_keyids) {
+                pr_info_once("x86/mktme: enabled by BIOS\n");
+                pr_info_once("x86/mktme: %d KeyIDs available\n", nr_keyids);
+        } else {
+                pr_info_once("x86/mktme: disabled by BIOS\n");
+        }
+
+        if (mktme_status == MKTME_UNINITIALIZED) {
+                /* MKTME is usable */
+                mktme_status = MKTME_ENABLED;
+        }
+
+        /*
+         * KeyID bits effectively lower the number of physical address
+         * bits. Update cpuinfo_x86::x86_phys_bits accordingly.
+         */
+        c->x86_phys_bits -= keyid_bits;
+}
+
 static void early_init_intel(struct cpuinfo_x86 *c)
 {
         u64 misc_enable;
@@ -322,6 +406,13 @@ static void early_init_intel(struct cpuinfo_x86 *c)
          */
         if (detect_extended_topology_early(c) < 0)
                 detect_ht_early(c);
+
+        /*
+         * Adjust the number of physical bits early because it affects the
+         * valid bits of the MTRR mask registers.
+         */
+        if (cpu_has(c, X86_FEATURE_TME))
+                detect_tme_early(c);
 }
 
 static void bsp_init_intel(struct cpuinfo_x86 *c)
@@ -482,90 +573,6 @@ static void srat_detect_node(struct cpuinfo_x86 *c)
 #endif
 }
 
-#define MSR_IA32_TME_ACTIVATE           0x982
-
-/* Helpers to access TME_ACTIVATE MSR */
-#define TME_ACTIVATE_LOCKED(x)          (x & 0x1)
-#define TME_ACTIVATE_ENABLED(x)         (x & 0x2)
-
-#define TME_ACTIVATE_POLICY(x)          ((x >> 4) & 0xf)        /* Bits 7:4 */
-#define TME_ACTIVATE_POLICY_AES_XTS_128 0
-
-#define TME_ACTIVATE_KEYID_BITS(x)      ((x >> 32) & 0xf)       /* Bits 35:32 */
-
-#define TME_ACTIVATE_CRYPTO_ALGS(x)     ((x >> 48) & 0xffff)    /* Bits 63:48 */
-#define TME_ACTIVATE_CRYPTO_AES_XTS_128 1
-
-/* Values for mktme_status (SW only construct) */
-#define MKTME_ENABLED                   0
-#define MKTME_DISABLED                  1
-#define MKTME_UNINITIALIZED             2
-static int mktme_status = MKTME_UNINITIALIZED;
-
-static void detect_tme(struct cpuinfo_x86 *c)
-{
-        u64 tme_activate, tme_policy, tme_crypto_algs;
-        int keyid_bits = 0, nr_keyids = 0;
-        static u64 tme_activate_cpu0 = 0;
-
-        rdmsrl(MSR_IA32_TME_ACTIVATE, tme_activate);
-
-        if (mktme_status != MKTME_UNINITIALIZED) {
-                if (tme_activate != tme_activate_cpu0) {
-                        /* Broken BIOS? */
-                        pr_err_once("x86/tme: configuration is inconsistent between CPUs\n");
-                        pr_err_once("x86/tme: MKTME is not usable\n");
-                        mktme_status = MKTME_DISABLED;
-
-                        /* Proceed. We may need to exclude bits from x86_phys_bits. */
-                }
-        } else {
-                tme_activate_cpu0 = tme_activate;
-        }
-
-        if (!TME_ACTIVATE_LOCKED(tme_activate) || !TME_ACTIVATE_ENABLED(tme_activate)) {
-                pr_info_once("x86/tme: not enabled by BIOS\n");
-                mktme_status = MKTME_DISABLED;
-                return;
-        }
-
-        if (mktme_status != MKTME_UNINITIALIZED)
-                goto detect_keyid_bits;
-
-        pr_info("x86/tme: enabled by BIOS\n");
-
-        tme_policy = TME_ACTIVATE_POLICY(tme_activate);
-        if (tme_policy != TME_ACTIVATE_POLICY_AES_XTS_128)
-                pr_warn("x86/tme: Unknown policy is active: %#llx\n", tme_policy);
-
-        tme_crypto_algs = TME_ACTIVATE_CRYPTO_ALGS(tme_activate);
-        if (!(tme_crypto_algs & TME_ACTIVATE_CRYPTO_AES_XTS_128)) {
-                pr_err("x86/mktme: No known encryption algorithm is supported: %#llx\n",
-                       tme_crypto_algs);
-                mktme_status = MKTME_DISABLED;
-        }
-detect_keyid_bits:
-        keyid_bits = TME_ACTIVATE_KEYID_BITS(tme_activate);
-        nr_keyids = (1UL << keyid_bits) - 1;
-        if (nr_keyids) {
-                pr_info_once("x86/mktme: enabled by BIOS\n");
-                pr_info_once("x86/mktme: %d KeyIDs available\n", nr_keyids);
-        } else {
-                pr_info_once("x86/mktme: disabled by BIOS\n");
-        }
-
-        if (mktme_status == MKTME_UNINITIALIZED) {
-                /* MKTME is usable */
-                mktme_status = MKTME_ENABLED;
-        }
-
-        /*
-         * KeyID bits effectively lower the number of physical address
-         * bits. Update cpuinfo_x86::x86_phys_bits accordingly.
-         */
-        c->x86_phys_bits -= keyid_bits;
-}
-
 static void init_cpuid_fault(struct cpuinfo_x86 *c)
 {
         u64 msr;
@@ -702,9 +709,6 @@ static void init_intel(struct cpuinfo_x86 *c)
 
         init_ia32_feat_ctl(c);
 
-        if (cpu_has(c, X86_FEATURE_TME))
-                detect_tme(c);
-
         init_intel_misc_features(c);
 
         split_lock_init();
--
2.43.0


2024-02-01 13:26:37

by Kirill A. Shutemov

Subject: Re: [PATCH v2 2/2] x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers

On Thu, Feb 01, 2024 at 12:09:02AM +0100, Paolo Bonzini wrote:
> MKTME repurposes the high bits of the physical address as a key ID for
> the encryption key and, even though MAXPHYADDR in CPUID[0x80000008]
> remains the same, the valid bits in the MTRR mask register are based on
> the reduced number of physical address bits.
>
> detect_tme() in arch/x86/kernel/cpu/intel.c detects TME and subtracts
> the keyid bits from the total number of usable physical address bits,
> but it is called too late. Move the call to early_init_intel() so that
> it runs in setup_arch(), before the MTRRs are set up.
>
> This fixes boot on TDX-enabled systems, which until now only worked with
> "disable_mtrr_cleanup". Without the patch, the values written to the
> MTRR mask registers were 52 bits wide (e.g. 0x000fffff_80000800) and
> the writes failed; with the patch, the values are 46 bits wide, which
> matches the reduced MAXPHYADDR that is shown in /proc/cpuinfo.
>
> Reported-by: Zixi Chen <[email protected]>
> Cc: Adam Dunlap <[email protected]>
> Cc: Kirill A. Shutemov <[email protected]>
> Cc: Xiaoyao Li <[email protected]>
> Cc: Kai Huang <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Paolo Bonzini <[email protected]>

Reviewed-by: Kirill A. Shutemov <[email protected]>

--
Kiryl Shutsemau / Kirill A. Shutemov

2024-02-01 15:44:03

by Huang, Kai

Subject: Re: [PATCH v2 2/2] x86/cpu/intel: Detect TME keyid bits before setting MTRR mask registers


>  static void early_init_intel(struct cpuinfo_x86 *c)
>  {
>          u64 misc_enable;
> @@ -322,6 +406,13 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>           */
>          if (detect_extended_topology_early(c) < 0)
>                  detect_ht_early(c);
> +
> +        /*
> +         * Adjust the number of physical bits early because it affects the
> +         * valid bits of the MTRR mask registers.
> +         */
> +        if (cpu_has(c, X86_FEATURE_TME))
> +                detect_tme_early(c);
>  }
>

early_init_intel() is also called by init_intel(), so IIUC for the BSP
detect_tme_early() will be called twice.

But this is harmless AFAICT, so:

Acked-by: Kai Huang <[email protected]>

2024-02-01 18:29:21

by Dave Hansen

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On 1/31/24 15:09, Paolo Bonzini wrote:
> However, as noticed by Kirill, the patch I sent as v1 actually works only
> up to Linux 6.6. In Linux 6.7, commit fbf6449f84bf ("x86/sev-es: Set
> x86_virt_bits to the correct value straight away, instead of a two-phase
> approach") reorganized the initialization of c->x86_phys_bits in a way
> that broke the patch. But even in 6.7, AMD processors, which did try to
> reduce it in this_cpu->c_early_init(c), had their x86_phys_bits value
> overwritten by get_cpu_address_sizes(), so that early_identify_cpu()
> left the wrong value in x86_phys_bits. This probably went unnoticed
> because on AMD processors the reduced MAXPHYADDR need not be applied to
> the MTRR masks.

I really wanted get_cpu_address_sizes() to be the one and only spot
where c->x86_phys_bits is established. That way, we don't get a bunch
of code all over the place tweaking it and fighting for who "wins".

We're not there yet, but the approach in this patch moves it back in the
wrong direction because it permits the random tweaking of c->x86_phys_bits.

Could we instead do something more like the (completely untested)
attached patch?

BTW, I'm pretty sure the WARN_ON() in the patch won't actually work, but
it'd be nice to work toward a point when it does.

2024-02-01 21:42:41

by Dave Hansen

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On 2/1/24 10:29, Dave Hansen wrote:
> Could we instead do something more like the (completely untested)
> attached patch?

.. actually attaching it here


Attachments:
enc_phys_bits.patch (2.02 kB)

2024-02-01 23:09:34

by Paolo Bonzini

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On Thu, Feb 1, 2024 at 7:29 PM Dave Hansen <[email protected]> wrote:
> I really wanted get_cpu_address_sizes() to be the one and only spot
> where c->x86_phys_bits is established. That way, we don't get a bunch
> of code all of the place tweaking it and fighting for who "wins".
> We're not there yet, but the approach in this patch moves it back in the
> wrong direction because it permits the random tweaking of c->x86_phys_bits.

I see your point; one of my earlier attempts added a
->c_detect_mem_encrypt() callback that basically amounted to either
amd_detect_mem_encrypt() or detect_tme(). I bailed out of it mostly
because these functions do more than adjust phys_bits, and it
seemed arbitrary to call them from get_cpu_address_sizes(). The two
approaches share the idea of centralizing the computation of
x86_phys_bits in get_cpu_address_sizes().

There is unfortunately an important hurdle for your patch, in that
currently the BSP and AP flows are completely different. For the BSP
the flow is ->c_early_init(), then get_cpu_address_sizes(), then again
->c_early_init() called by ->c_init(), then ->c_init(). For APs it is
get_cpu_address_sizes(), then ->c_early_init() called by ->c_init(),
then the rest of ->c_init(). And let's not even look at
->c_identify().
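
Here is a hedged mock of that ordering (standalone C stubs, not the
real init code; the function names just mirror the hooks above) so the
difference between the two sequences is easy to see:

    /* Hedged mock of the described flows; the stubs only print the
     * order in which the real hooks run. */
    #include <stdio.h>

    static void c_early_init(void)          { puts("  ->c_early_init()"); }
    static void get_cpu_address_sizes(void) { puts("  get_cpu_address_sizes()"); }

    static void c_init(void)
    {
            c_early_init();                 /* ->c_init() re-runs early init */
            puts("  rest of ->c_init()");
    }

    int main(void)
    {
            puts("BSP:");
            c_early_init();                 /* runs before address sizes */
            get_cpu_address_sizes();
            c_init();

            puts("AP:");
            get_cpu_address_sizes();        /* runs before any ->c_early_init() */
            c_init();
            return 0;
    }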

This difference is bad for your patch, because get_cpu_address_sizes()
is called too early to see enc_phys_bits on APs. But it was also
something that fbf6449f84bf didn't take into account, because it left
behind the tentative initialization of x86_*_bits in identify_cpu(),
while removing it from early_identify_cpu(). And

TBH my first reaction after Kirill pointed me to fbf6449f84bf was to
revert it. It's not like the code before was much less of a dumpster
fire, but that commit made the BSP vs. AP mess even worse. But it
fixed a real-world bug and it did remove most of the "first set then
adjust" logic, at least for the BSP, so a revert wasn't on the table
and patch 1 was what came out of it.

I know that in general in Linux we prefer to fix things for good.
Dancing one step behind and two ahead comes with the risk that you
only do the step behind. But in the end something like this patch 1
would have to be posted for stable branches (and Greg doesn't like
one-off patches), and I am not even sure it's a step behind because it
removes _some_ of the BSP vs. AP differences introduced by
fbf6449f84bf.

In a nutshell: I don't dislike the idea behind your patch, but the
code is just not ready for it.

I can look into cleaning up cpu/common.c though. I'm a natural
procrastinator; it's my kind of thing to embark on a series of 30
patches to clean up 20-year-old code...

Paolo

> Could we instead do something more like the (completely untested)
> attached patch?
>
> BTW, I'm pretty sure the WARN_ON() in the patch won't actually work, but
> it'd be nice to work toward a point when it does.
>


2024-02-04 17:21:37

by Paolo Bonzini

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On Fri, Feb 2, 2024 at 12:08 AM Paolo Bonzini <[email protected]> wrote:
>
> On Thu, Feb 1, 2024 at 7:29 PM Dave Hansen <[email protected]> wrote:
> > I really wanted get_cpu_address_sizes() to be the one and only spot
> > where c->x86_phys_bits is established. That way, we don't get a bunch
> > of code all over the place tweaking it and fighting for who "wins".
> > We're not there yet, but the approach in this patch moves it back in the
> > wrong direction because it permits the random tweaking of c->x86_phys_bits.
>
> I see your point; one of my earlier attempts added a
> ->c_detect_mem_encrypt() callback that basically amounted to either
> amd_detect_mem_encrypt() or detect_tme(). I bailed out of it mostly
> because these functions do more than adjust phys_bits, and it
> seemed arbitrary to call them from get_cpu_address_sizes(). The two
> approaches share the idea of centralizing the computation of
> x86_phys_bits in get_cpu_address_sizes().
>
> There is unfortunately an important hurdle for your patch, in that
> currently the BSP and AP flows are completely different. For the BSP
> the flow is ->c_early_init(), then get_cpu_address_sizes(), then again
> ->c_early_init() called by ->c_init(), then ->c_init(). For APs it is
> get_cpu_address_sizes(), then ->c_early_init() called by ->c_init(),
> then the rest of ->c_init(). And let's not even look at
> ->c_identify().
>
> This difference is bad for your patch, because get_cpu_address_sizes()
> is called too early to see enc_phys_bits on APs. But it was also
> something that fbf6449f84bf didn't take into account, because it left
> behind the tentative initialization of x86_*_bits in identify_cpu(),
> while removing it from early_identify_cpu(). And
>
> TBH my first reaction after Kirill pointed me to fbf6449f84bf was to
> revert it. It's not like the code before was much less of a dumpster
> fire, but that commit made the BSP vs. AP mess even worse. But it
> fixed a real-world bug and it did remove most of the "first set then
> adjust" logic, at least for the BSP, so a revert wasn't on the table
> and patch 1 was what came out of it.
>
> I know that in general in Linux we prefer to fix things for good.
> Dancing one step behind and two ahead comes with the risk that you
> only do the step behind. But in the end something like this patch 1
> would have to be posted for stable branches (and Greg doesn't like
> one-off patches), and I am not even sure it's a step behind because it
> removes _some_ of the BSP vs. AP differences introduced by
> fbf6449f84bf.
>
> In a nutshell: I don't dislike the idea behind your patch, but the
> code is just not ready for it.

This is the diffstat of the prep series needed before your patch can be
applied more or less as is:

$ git diffstat origin/master
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/amd.c | 12 ++--
arch/x86/kernel/cpu/centaur.c | 13 ++---
arch/x86/kernel/cpu/common.c | 103 +++++++++++++++++++----------------
arch/x86/kernel/cpu/hygon.c | 2 -
arch/x86/kernel/cpu/intel.c | 4 +-
arch/x86/kernel/cpu/transmeta.c | 2 -
arch/x86/kernel/cpu/zhaoxin.c | 1 -
8 files changed, 69 insertions(+), 69 deletions(-)

$ git log --oneline --reverse origin/master..
d639afed02aa x86/cpu/common: move code up to early get_cpu_address_sizes() to a new function
40d34260a4ba x86/cpu/common: pull get_cpu_*() calls common to BSPs and APs to a new function
67b9839f9c38 x86/cpu/common: put all setup_force/clear capabilities together
ebeae7f91cbc x86/cpu/centaur: do everything before early_init_centaur() in early_init_centaur()
d62fa7400885 x86/cpu/cyrix: call early init function from init function
0aa0916cd7e0 x86/cpu/common: call early_init function on APs from common code
d656b651d217 (HEAD) dave


I still haven't tested it very thoroughly, but anyway, what do you think?

Paolo


2024-02-13 15:45:46

by Paolo Bonzini

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On Sun, Feb 4, 2024 at 6:21 PM Paolo Bonzini <[email protected]> wrote:
> On Fri, Feb 2, 2024 at 12:08 AM Paolo Bonzini <[email protected]> wrote:
> > On Thu, Feb 1, 2024 at 7:29 PM Dave Hansen <[email protected]> wrote:
> > > I really wanted get_cpu_address_sizes() to be the one and only spot
> > > where c->x86_phys_bits is established. That way, we don't get a bunch
> > > of code all over the place tweaking it and fighting for who "wins".
> > > We're not there yet, but the approach in this patch moves it back in the
> > > wrong direction because it permits the random tweaking of c->x86_phys_bits.
> >
> > There is unfortunately an important hurdle [...] in that
> > currently the BSP and AP flows are completely different. For the BSP
> > the flow is ->c_early_init(), then get_cpu_address_sizes(), then again
> > ->c_early_init() called by ->c_init(), then ->c_init(). For APs it is
> > get_cpu_address_sizes(), then ->c_early_init() called by ->c_init(),
> > then the rest of ->c_init(). And let's not even look at
> > ->c_identify(). [...] get_cpu_address_sizes()
> > is called too early to see enc_phys_bits on APs. But it was also
> > something that fbf6449f84bf didn't take into account, because it left
> > behind the tentative initialization of x86_*_bits in identify_cpu(),
> > while removing it from early_identify_cpu(). And

Ping, either for applying the original patches or for guidance on how
to proceed.

Paolo


2024-02-13 17:08:08

by Dave Hansen

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On 2/1/24 15:08, Paolo Bonzini wrote:
> There is unfortunately an important hurdle for your patch, in that
> currently the BSP and AP flows are completely different.

Do we even _need_ c->x86_phys_bits for APs? I need to do a bit more
grepping, but I only see it being read in show_cpuinfo(). Everything
else seems to use boot_cpu_data.x86_phys_bits.

2024-02-13 22:35:09

by Paolo Bonzini

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On Tue, Feb 13, 2024 at 11:02 PM Dave Hansen <[email protected]> wrote:
>
> On 2/13/24 07:45, Paolo Bonzini wrote:
> > Ping, either for applying the original patches or for guidance on how
> > to proceed.
>
> Gah, all of the gunk that get_cpu_address_sizes() touches is out of
> control.
>
> They (phys/virt_bits and clflush) need to get consolidated back to a
> single copy that gets set up *once* in early boot and then read by
> everyone else.
>
> I've got a series to do that, but it's got its tentacles
> in quite a few places. They're not great backporting material.

Yes, same for mine. I probably digressed in a different direction than
you, but there will almost surely be conflicts. I can post my stuff in
the next few days.

Paolo


2024-02-13 22:42:21

by Dave Hansen

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On 2/13/24 07:45, Paolo Bonzini wrote:
> Ping, either for applying the original patches or for guidance on how
> to proceed.

Gah, all of the gunk that get_cpu_address_sizes() touches is out of
control.

They (phys/virt_bits and clflush) need to get consolidated back to a
single copy that gets set up *once* in early boot and then read by
everyone else. I've got a series to do that, but it's got its tentacles
in quite a few places. They're not great backporting material.

Your patches make things a wee bit worse in the meantime, but they pale
in comparison to the random spaghetti that we've already got. Also, we
probably need the early TME stuff regardless.

I think I'll probably suck it up, apply them, then fix them up along
with the greater mess.

Anybody have any better ideas?


Attachments:
x86cai.patch (19.88 kB)

2024-02-20 12:22:57

by Paolo Bonzini

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On Tue, Feb 13, 2024 at 11:02 PM Dave Hansen <[email protected]> wrote:
> Your patches make things a wee bit worse in the meantime, but they pale
> in comparison to the random spaghetti that we've already got. Also, we
> probably need the early TME stuff regardless.
>
> I think I'll probably suck it up, apply them, then fix them up along
> with the greater mess.
>
> Anybody have any better ideas?

Ping, in the end are we applying these patches for either 6.8 or 6.9?

Paolo


2024-02-22 18:07:30

by Dave Hansen

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On 2/20/24 04:22, Paolo Bonzini wrote:
> On Tue, Feb 13, 2024 at 11:02 PM Dave Hansen <[email protected]> wrote:
>> Your patches make things a wee bit worse in the meantime, but they pale
>> in comparison to the random spaghetti that we've already got. Also, we
>> probably need the early TME stuff regardless.
>>
>> I think I'll probably suck it up, apply them, then fix them up along
>> with the greater mess.
>>
>> Anybody have any better ideas?
> Ping, in the end are we applying these patches for either 6.8 or 6.9?

Let me poke at them and see if we can stick them in x86/urgent early
next week. They do fix an actual bug that's biting people, right?

2024-02-22 18:09:06

by Paolo Bonzini

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On Thu, Feb 22, 2024 at 7:07 PM Dave Hansen <[email protected]> wrote:
> > Ping, in the end are we applying these patches for either 6.8 or 6.9?
>
> Let me poke at them and see if we can stick them in x86/urgent early
> next week. They do fix an actual bug that's biting people, right?

Yes, I have gotten reports of {Sapphire,Emerald} Rapids machines that
don't boot at all without either these patches or
"disable_mtrr_cleanup".

Paolo


2024-02-26 01:58:16

by Yin, Fengwei

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME


On 2/23/24 02:08, Paolo Bonzini wrote:
> On Thu, Feb 22, 2024 at 7:07 PM Dave Hansen <[email protected]> wrote:
>>> Ping, in the end are we applying these patches for either 6.8 or 6.9?
>>
>> Let me poke at them and see if we can stick them in x86/urgent early
>> next week. They do fix an actual bug that's biting people, right?
>
> Yes, I have gotten reports of {Sapphire,Emerald} Rapids machines that
> don't boot at all without either these patches or
> "disable_mtrr_cleanup".
We tried a platform other than Sapphire and Emerald Rapids. This patchset
fixes the boot issue on that platform as well.


Regards
Yin, Fengwei

>
> Paolo
>
>

2024-02-26 16:21:41

by Dave Hansen

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On 2/25/24 17:57, Yin Fengwei wrote:
> On 2/23/24 02:08, Paolo Bonzini wrote:
>> On Thu, Feb 22, 2024 at 7:07 PM Dave Hansen <[email protected]> wrote:
>>>> Ping, in the end are we applying these patches for either 6.8 or 6.9?
>>> Let me poke at them and see if we can stick them in x86/urgent early
>>> next week. They do fix an actual bug that's biting people, right?
>> Yes, I have gotten reports of {Sapphire,Emerald} Rapids machines that
>> don't boot at all without either these patches or
>> "disable_mtrr_cleanup".
> We tried a platform other than Sapphire and Emerald Rapids. This patchset
> fixes the boot issue on that platform as well.

Fengwei, could you also test this series on the troublesome hardware,
please?

> https://lore.kernel.org/all/[email protected]/

If it _also_ fixes the problem, it'll be a strong indication that it's
the right long-term approach.

2024-02-27 02:03:20

by Yin, Fengwei

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME



On 2/27/24 00:21, Dave Hansen wrote:
> On 2/25/24 17:57, Yin Fengwei wrote:
>> On 2/23/24 02:08, Paolo Bonzini wrote:
>>> On Thu, Feb 22, 2024 at 7:07 PM Dave Hansen <[email protected]> wrote:
>>>>> Ping, in the end are we applying these patches for either 6.8 or 6.9?
>>>> Let me poke at them and see if we can stick them in x86/urgent early
>>>> next week. They do fix an actual bug that's biting people, right?
>>> Yes, I have gotten reports of {Sapphire,Emerald} Rapids machines that
>>> don't boot at all without either these patches or
>>> "disable_mtrr_cleanup".
>> We tried a platform other than Sapphire and Emerald Rapids. This patchset
>> fixes the boot issue on that platform as well.
>
> Fengwei, could you also test this series on the troublesome hardware,
> please?
Sure. I will try it on my env. I just hadn't connected your patchset to this
boot issue. :(


Regards
Yin, Fengwei

>
>> https://lore.kernel.org/all/[email protected]/
>
> If it _also_ fixes the problem, it'll be a strong indication that it's
> the right long-term approach.
>

2024-02-27 06:08:59

by Yin, Fengwei

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

Hi Dave,

On 2/27/24 00:21, Dave Hansen wrote:
> On 2/25/24 17:57, Yin Fengwei wrote:
>> On 2/23/24 02:08, Paolo Bonzini wrote:
>>> On Thu, Feb 22, 2024 at 7:07 PM Dave Hansen <[email protected]> wrote:
>>>>> Ping, in the end are we applying these patches for either 6.8 or 6.9?
>>>> Let me poke at them and see if we can stick them in x86/urgent early
>>>> next week. They do fix an actual bug that's biting people, right?
>>> Yes, I have gotten reports of {Sapphire,Emerald} Rapids machines that
>>> don't boot at all without either these patches or
>>> "disable_mtrr_cleanup".
>> We tried a platform other than Sapphire and Emerald Rapids. This patchset
>> fixes the boot issue on that platform as well.
>
> Fengwei, could you also test this series on the troublesome hardware,
> please?
>
>> https://lore.kernel.org/all/[email protected]/
>
> If it _also_ fixes the problem, it'll be a strong indication that it's
> the right long-term approach.
I tried your patchset on a Sapphire Rapids machine, which is the only broken
machine I can access today. The base commit is 45ec2f5f6ed3 from the latest
Linus tree.

Without your patchset, the system hangs at boot.
With your patchset, the system boots successfully.


Regards
Yin, Fengwei

2024-02-27 13:32:44

by Dave Hansen

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME

On 2/26/24 22:08, Yin Fengwei wrote:
>>> https://lore.kernel.org/all/[email protected]/
>> If it _also_ fixes the problem, it'll be a strong indication that it's
>> the right long-term approach.
> I tried your patchset on a Sapphire Rapids machine, which is the only broken
> machine I can access today. The base commit is 45ec2f5f6ed3 from the latest
> Linus tree.
>
> Without your patchset, the system hangs at boot.
> With your patchset, the system boots successfully.

Yay! Thanks for testing.

2024-02-28 00:08:29

by Yin, Fengwei

Subject: Re: [PATCH v2 0/2] x86/cpu: fix invalid MTRR mask values for SEV or TME



On 2/27/24 21:30, Dave Hansen wrote:
> On 2/26/24 22:08, Yin Fengwei wrote:
>>>> https://lore.kernel.org/all/[email protected]/
>>> If it _also_ fixes the problem, it'll be a strong indication that it's
>>> the right long-term approach.
>> I tried your patchset on a Sapphire Rapids machine, which is the only broken
>> machine I can access today. The base commit is 45ec2f5f6ed3 from the latest
>> Linus tree.
>>
>> Without your patchset, the system hangs at boot.
>> With your patchset, the system boots successfully.
>
> Yay! Thanks for testing.
My pleasure.


Regards
Yin, Fengwei