2023-11-18 19:33:32

by Yazen Ghannam

Subject: [PATCH 02/20] x86/mce: Define mce_setup() helpers for global and per-CPU fields

Generally, MCA information for an error is gathered on the CPU that
reported the error. In this case, CPU-specific information from the
running CPU will be correct.

However, the CPU-specific information will be incorrect if the MCA
information is gathered while running on a CPU that didn't report the
error. One example is creating an MCA record using mce_setup() for
errors reported from ACPI.

Split mce_setup() so that there is a helper function to gather global,
i.e. not CPU-specific, information and another helper for CPU-specific
information.

Don't set the CPU number in either helper function. Each call site of
the helpers sets it appropriately.

Leave mce_setup() defined as-is for the common case when running on the
reporting CPU.

Get MCG_CAP in the global helper even though the register is per-CPU.
Unlike the other values, it is not already cached per-CPU, and it does
not assist with any per-CPU decoding or handling.

Signed-off-by: Yazen Ghannam <[email protected]>
---
arch/x86/kernel/cpu/mce/core.c | 31 +++++++++++++++++++++----------
1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 1642018dd6c9..7e86086aa19c 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -115,20 +115,31 @@ static struct irq_work mce_irq_work;
*/
BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);

+void mce_setup_global(struct mce *m)
+{
+ memset(m, 0, sizeof(struct mce));
+
+ m->cpuid = cpuid_eax(1);
+ m->cpuvendor = boot_cpu_data.x86_vendor;
+ m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
+ /* need the internal __ version to avoid deadlocks */
+ m->time = __ktime_get_real_seconds();
+}
+
+void mce_setup_per_cpu(struct mce *m)
+{
+ m->apicid = cpu_data(m->extcpu).topo.initial_apicid;
+ m->microcode = cpu_data(m->extcpu).microcode;
+ m->ppin = cpu_data(m->extcpu).ppin;
+ m->socketid = cpu_data(m->extcpu).topo.pkg_id;
+}
+
/* Do initial initialization of a struct mce */
void mce_setup(struct mce *m)
{
- memset(m, 0, sizeof(struct mce));
+ mce_setup_global(m);
m->cpu = m->extcpu = smp_processor_id();
- /* need the internal __ version to avoid deadlocks */
- m->time = __ktime_get_real_seconds();
- m->cpuvendor = boot_cpu_data.x86_vendor;
- m->cpuid = cpuid_eax(1);
- m->socketid = cpu_data(m->extcpu).topo.pkg_id;
- m->apicid = cpu_data(m->extcpu).topo.initial_apicid;
- m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
- m->ppin = cpu_data(m->extcpu).ppin;
- m->microcode = boot_cpu_data.microcode;
+ mce_setup_per_cpu(m);
}

DEFINE_PER_CPU(struct mce, injectm);
--
2.34.1


2023-11-22 18:26:57

by Borislav Petkov

Subject: Re: [PATCH 02/20] x86/mce: Define mce_setup() helpers for global and per-CPU fields

On Sat, Nov 18, 2023 at 01:32:30PM -0600, Yazen Ghannam wrote:
> +void mce_setup_global(struct mce *m)

We usually call those things "common":

mce_setup_common().

> +{
> + memset(m, 0, sizeof(struct mce));
> +
> + m->cpuid = cpuid_eax(1);
> + m->cpuvendor = boot_cpu_data.x86_vendor;
> + m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
> + /* need the internal __ version to avoid deadlocks */
> + m->time = __ktime_get_real_seconds();
> +}
> +
> +void mce_setup_per_cpu(struct mce *m)

And call this

mce_setup_for_cpu(unsigned int cpu, struct mce *m);

so that it doesn't look like some per_cpu helper.

And yes, you should supply the CPU number as an argument. Because
otherwise, when you look at your next change:


+ mce_setup_global(&m);
+ m.cpu = m.extcpu = cpu;
+ mce_setup_per_cpu(&m);

This contains the "hidden" requirement that m.extcpu happens *always*
*before* the mce_setup_per_cpu() call and that is flaky and error prone.

So make that:

mce_setup_common(&m);
mce_setup_for_cpu(m.extcpu, &m);

and do m.cpu = m.extcpu = cpu inside the second function.

And then it JustWorks(tm) and you can't "forget" assigning m.extcpu and
there's no subtlety.

Ok?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-11-27 14:53:05

by Yazen Ghannam

Subject: Re: [PATCH 02/20] x86/mce: Define mce_setup() helpers for global and per-CPU fields

On 11/22/2023 1:24 PM, Borislav Petkov wrote:
> On Sat, Nov 18, 2023 at 01:32:30PM -0600, Yazen Ghannam wrote:
>> +void mce_setup_global(struct mce *m)
>
> We usually call those things "common":
>
> mce_setup_common().
>
>> +{
>> + memset(m, 0, sizeof(struct mce));
>> +
>> + m->cpuid = cpuid_eax(1);
>> + m->cpuvendor = boot_cpu_data.x86_vendor;
>> + m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
>> + /* need the internal __ version to avoid deadlocks */
>> + m->time = __ktime_get_real_seconds();
>> +}
>> +
>> +void mce_setup_per_cpu(struct mce *m)
>
> And call this
>
> mce_setup_for_cpu(unsigned int cpu, struct mce *m);
>
> so that it doesn't look like some per_cpu helper.
>
> And yes, you should supply the CPU number as an argument. Because
> otherwise, when you look at your next change:
>
>
> + mce_setup_global(&m);
> + m.cpu = m.extcpu = cpu;
> + mce_setup_per_cpu(&m);
>
> This contains the "hidden" requirement that m.extcpu happens *always*
> *before* the mce_setup_per_cpu() call and that is flaky and error prone.
>
> So make that:
>
> mce_setup_common(&m);
> mce_setup_for_cpu(m.extcpu, &m);
>
> and do m.cpu = m.extcpu = cpu inside the second function.
>
> And then it JustWorks(tm) and you can't "forget" assigning m.extcpu and
> there's no subtlety.
>
> Ok?
>

Yep, understood. Thanks!

-Yazen
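
For illustration, here is a minimal userspace sketch of the interface shape
agreed on in the review: a common helper plus a helper that takes the CPU
number as an argument and assigns cpu/extcpu itself. It is not the kernel
implementation. The struct mce_sketch type, the cpu_table array, its values
and NR_CPUS_SKETCH are all hypothetical stand-ins for struct mce and
cpu_data(), and the caller is assumed to pass the reporting CPU number
directly.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count, illustration only */

/* Cut-down stand-in for struct mce: only fields discussed in the thread. */
struct mce_sketch {
	unsigned int cpu;
	unsigned int extcpu;
	unsigned int apicid;
	unsigned int socketid;
	unsigned int microcode;
	unsigned int cpuvendor;
};

/* Stand-in for cpu_data(cpu); the values are made up. */
struct cpu_info_sketch {
	unsigned int apicid;
	unsigned int pkg_id;
	unsigned int microcode;
};

const struct cpu_info_sketch cpu_table[NR_CPUS_SKETCH] = {
	{ 0, 0, 0x10 }, { 1, 0, 0x10 }, { 8, 1, 0x12 }, { 9, 1, 0x12 },
};

/* Zero the record and fill the fields that do not depend on a CPU. */
void mce_setup_common(struct mce_sketch *m)
{
	memset(m, 0, sizeof(*m));
	m->cpuvendor = 2;	/* placeholder for boot_cpu_data.x86_vendor */
}

/*
 * Fill the CPU-specific fields. The CPU number is an explicit argument
 * and cpu/extcpu are assigned here, so a caller cannot forget to set
 * them before the per-CPU lookups.
 */
void mce_setup_for_cpu(unsigned int cpu, struct mce_sketch *m)
{
	assert(cpu < NR_CPUS_SKETCH);

	m->cpu = m->extcpu = cpu;
	m->apicid = cpu_table[cpu].apicid;
	m->socketid = cpu_table[cpu].pkg_id;
	m->microcode = cpu_table[cpu].microcode;
}
```

With this shape, a call site that builds a record for a CPU other than the
running one only needs mce_setup_common(&m) followed by
mce_setup_for_cpu(cpu, &m); there is no separate assignment whose ordering
matters.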