2018-06-01 11:31:51

by Filippo Sironi

[permalink] [raw]
Subject: [PATCH] x86/MCE: Get microcode revision from cpu_info instead of boot_cpu_data

Commit fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check
records") extended MCE entries to report the microcode revision taken
from boot_cpu_data. Unfortunately, boot_cpu_data isn't updated on late
microcode loading, thus making MCE entries slightly incorrect.
Use cpu_info instead, which is updated on late microcode loading.

Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Signed-off-by: Filippo Sironi <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
arch/x86/kernel/cpu/mcheck/mce.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 42cf2880d0ed..4be323f9b390 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -134,7 +134,7 @@ void mce_setup(struct mce *m)
if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
rdmsrl(MSR_PPIN, m->ppin);

- m->microcode = boot_cpu_data.microcode;
+ m->microcode = cpu_data(m->extcpu).microcode;
}

DEFINE_PER_CPU(struct mce, injectm);
--
2.7.4



2018-06-01 12:23:35

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/MCE: Get microcode revision from cpu_info instead of boot_cpu_data

On Fri, Jun 01, 2018 at 01:30:26PM +0200, Filippo Sironi wrote:
> Commit fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check
> records") extended MCE entries to report the microcode revision taken
> from boot_cpu_data. Unfortunately, boot_cpu_data isn't updated on late
> microcode loading,

Actually, I'd prefer if we fixed *that* instead by adding:

/* Update boot_cpu_data's revision too, if we're on the BSP: */
if (c->cpu_index == boot_cpu_data.cpu_index)
boot_cpu_data.microcode = <new rev>;

to the end of ->apply_microcode() functions so that boot_cpu_data has
the correct revision too, in case something else queries it.

Thx.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2018-06-01 12:32:45

by Filippo Sironi

[permalink] [raw]
Subject: Re: [PATCH] x86/MCE: Get microcode revision from cpu_info instead of boot_cpu_data

> On 1. Jun 2018, at 14:19, Borislav Petkov <[email protected]> wrote:
>
> On Fri, Jun 01, 2018 at 01:30:26PM +0200, Filippo Sironi wrote:
>> Commit fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check
>> records") extended MCE entries to report the microcode revision taken
>> from boot_cpu_data. Unfortunately, boot_cpu_data isn't updated on late
>> microcode loading,
>
> Actually, I'd prefer if we fixed *that* instead by adding:
>
> /* Update boot_cpu_data's revision too, if we're on the BSP: */
> if (c->cpu_index == boot_cpu_data.cpu_index)
> boot_cpu_data.microcode = <new rev>;
>
> to the end of ->apply_microcode() functions so that boot_cpu_data has
> the correct revision too, in case something else queries it.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> ECO tip #101: Trim your mails when you reply.
> --

I've that patch in my tree already, I can post it.
I'm still curious on why you'd prefer to use boot_cpu_data for all
CPUs instead of using cpu_data(m->extcpu) though.

Regards,
Filippo


Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B


2018-06-01 12:45:14

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/MCE: Get microcode revision from cpu_info instead of boot_cpu_data

On Fri, Jun 01, 2018 at 12:32:02PM +0000, Sironi, Filippo wrote:
> I've that patch in my tree already, I can post it.
> I'm still curious on why you'd prefer to use boot_cpu_data for all
> CPUs instead of using cpu_data(m->extcpu) though.

Well, for starters, boot_cpu_data having stale revision is a bug in
itself already. So fixing that is already the Right Thing(tm) to do.

Then, one fine day when the whole cpu_data mess is cleaned up,
boot_cpu_data will morph into the global descriptor representing CPU
features and keeping it uptodate now will facilitate the replacement
without introducing bugs.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2018-07-30 17:51:16

by Prarit Bhargava

[permalink] [raw]
Subject: [PATCH] arch/x86: Fix boot_cpu_data.microcode version output

On systems where a runtime microcode update has occurred the microcode
version output in a MCE log record is wrong because
boot_cpu_data.microcode is not updated during runtime.

Update boot_cpu_data.microcode when the BSP's microcode is updated.

Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Prarit Bhargava <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
arch/x86/kernel/cpu/microcode/amd.c | 4 ++++
arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
2 files changed, 8 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 0624957aa068..7f5b32535ac7 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
uci->cpu_sig.rev = mc_amd->hdr.patch_id;
c->microcode = mc_amd->hdr.patch_id;

+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = rev;
+
return UCODE_UPDATED;
}

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..256d336cbc04 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
uci->cpu_sig.rev = rev;
c->microcode = rev;

+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = rev;
+
return UCODE_UPDATED;
}

--
2.17.0


2018-07-30 17:56:54

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [PATCH] arch/x86: Fix boot_cpu_data.microcode version output



On 07/30/2018 01:49 PM, Prarit Bhargava wrote:
> On systems where a runtime microcode update has occurred the microcode
> version output in a MCE log record is wrong because
> boot_cpu_data.microcode is not updated during runtime.
>
> Update boot_cpu_data.microcode when the BSP's microcode is updated.
>
> Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
> Suggested-by: Borislav Petkov <[email protected]>
> Signed-off-by: Prarit Bhargava <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> arch/x86/kernel/cpu/microcode/amd.c | 4 ++++
> arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
> index 0624957aa068..7f5b32535ac7 100644
> --- a/arch/x86/kernel/cpu/microcode/amd.c
> +++ b/arch/x86/kernel/cpu/microcode/amd.c
> @@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
> uci->cpu_sig.rev = mc_amd->hdr.patch_id;
> c->microcode = mc_amd->hdr.patch_id;
>
> + /* Update boot_cpu_data's revision too, if we're on the BSP: */
> + if (c->cpu_index == boot_cpu_data.cpu_index)
> + boot_cpu_data.microcode = rev;
> +

Borislav, hold off on this. I want to double check something with AMD.

I think this has to be

boot_cpu_data.microcode = mc_amd->hdr.patch_id;

I'm going to grab some old/new microcode and test. I do know that the Intel
update below works.

P.

> return UCODE_UPDATED;
> }
>
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
> index 97ccf4c3b45b..256d336cbc04 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
> uci->cpu_sig.rev = rev;
> c->microcode = rev;
>
> + /* Update boot_cpu_data's revision too, if we're on the BSP: */
> + if (c->cpu_index == boot_cpu_data.cpu_index)
> + boot_cpu_data.microcode = rev;
> +
> return UCODE_UPDATED;
> }
>
>

2018-07-31 04:04:07

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] arch/x86: Fix boot_cpu_data.microcode version output

On Mon, Jul 30, 2018 at 01:53:50PM -0400, Prarit Bhargava wrote:
> I think this has to be
>
> boot_cpu_data.microcode = mc_amd->hdr.patch_id;

Yes, it does.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2018-07-31 11:28:59

by Prarit Bhargava

[permalink] [raw]
Subject: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

I tested this on AMD Ryzen & Intel Broadwell system and dumped the
boot_cpu_data before and after a microcode update. On the Intel
system I also did a fatal MCE using mce-inject to confirm the output
from the mce handling code.

P.

---8<---

On systems where a runtime microcode update has occurred the microcode
version output in a MCE log record is wrong because
boot_cpu_data.microcode is not updated during runtime.

Update boot_cpu_data.microcode when the BSP's microcode is updated.

Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Prarit Bhargava <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
Changes in v2: Use mc_amd->hdr.patch_id on AMD

arch/x86/kernel/cpu/microcode/amd.c | 4 ++++
arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
2 files changed, 8 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 0624957aa068..63b072377ba4 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
uci->cpu_sig.rev = mc_amd->hdr.patch_id;
c->microcode = mc_amd->hdr.patch_id;

+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = mc_amd->hdr.patch_id;
+
return UCODE_UPDATED;
}

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..256d336cbc04 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
uci->cpu_sig.rev = rev;
c->microcode = rev;

+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = rev;
+
return UCODE_UPDATED;
}

--
2.17.0


2018-07-31 11:51:03

by Filippo Sironi

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output


> On 31. Jul 2018, at 13:27, Prarit Bhargava <[email protected]> wrote:
>
> I tested this on AMD Ryzen & Intel Broadwell system and dumped the
> boot_cpu_data before and after a microcode update. On the Intel
> system I also did a fatal MCE using mce-inject to confirm the output
> from the mce handling code.
>
> P.
>
> ---8<---
>
> On systems where a runtime microcode update has occurred the microcode
> version output in a MCE log record is wrong because
> boot_cpu_data.microcode is not updated during runtime.
>
> Update boot_cpu_data.microcode when the BSP's microcode is updated.
>
> Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
> Suggested-by: Borislav Petkov <[email protected]>
> Signed-off-by: Prarit Bhargava <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> Changes in v2: Use mc_amd->hdr.patch_id on AMD
>
> arch/x86/kernel/cpu/microcode/amd.c | 4 ++++
> arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
> index 0624957aa068..63b072377ba4 100644
> --- a/arch/x86/kernel/cpu/microcode/amd.c
> +++ b/arch/x86/kernel/cpu/microcode/amd.c
> @@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
> uci->cpu_sig.rev = mc_amd->hdr.patch_id;
> c->microcode = mc_amd->hdr.patch_id;
>
> + /* Update boot_cpu_data's revision too, if we're on the BSP: */
> + if (c->cpu_index == boot_cpu_data.cpu_index)
> + boot_cpu_data.microcode = mc_amd->hdr.patch_id;
> +
> return UCODE_UPDATED;
> }
>
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
> index 97ccf4c3b45b..256d336cbc04 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
> uci->cpu_sig.rev = rev;
> c->microcode = rev;
>
> + /* Update boot_cpu_data's revision too, if we're on the BSP: */
> + if (c->cpu_index == boot_cpu_data.cpu_index)
> + boot_cpu_data.microcode = rev;
> +
> return UCODE_UPDATED;
> }

There may be a chance of skipping this code, I think.

If the microcode is loaded on the hyperthread sibling of the boot cpu
before being loaded on the boot cpu, the boot cpu will exit earlier
from apply_microcode_intel() - in if (rev >= mc->hdr.rev) { ... }.

(This seems to be possible in apply_microcode_amd() as well.)

In my tree with the aforementioned change - Intel only - I also had
the following patch:

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..4bc869e829eb 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -797,6 +797,7 @@ static enum ucode_state apply_microcode_intel(int cpu)
struct microcode_intel *mc;
static int prev_rev;
u32 rev;
+ enum ucode_state ret;

/* We should bind the task to the CPU */
if (WARN_ON(raw_smp_processor_id() != cpu))
@@ -817,9 +818,8 @@ static enum ucode_state apply_microcode_intel(int cpu)
*/
rev = intel_get_microcode_revision();
if (rev >= mc->hdr.rev) {
- uci->cpu_sig.rev = rev;
- c->microcode = rev;
- return UCODE_OK;
+ ret = UCODE_OK;
+ goto out;
}

/*
@@ -848,10 +848,12 @@ static enum ucode_state apply_microcode_intel(int cpu)
prev_rev = rev;
}

+ ret = UCODE_UPDATED;
+out:
uci->cpu_sig.rev = rev;
c->microcode = rev;

- return UCODE_UPDATED;
+ return ret;
}

static enum ucode_state generic_load_microcode(int cpu, void *data, size_t size,

which prevents the issue.

> --
> 2.17.0
>
>

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B


2018-07-31 13:01:52

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

On Tue, Jul 31, 2018 at 11:46:09AM +0000, Sironi, Filippo wrote:
> There may be a chance of skipping this code, I think.
>
> If the microcode is loaded on the hyperthread sibling of the boot cpu
> before being loaded on the boot cpu, the boot cpu will exit earlier
> from apply_microcode_intel() - in if (rev >= mc->hdr.rev) { ... }.
>
> (This seems to be possible in apply_microcode_amd() as well.)
>
> In my tree with the aforementioned change - Intel only - I also had
> the following patch:

Yap, good catch.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2018-07-31 15:36:48

by Filippo Sironi

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output


> On 31. Jul 2018, at 15:00, Borislav Petkov <[email protected]> wrote:
>
> On Tue, Jul 31, 2018 at 11:46:09AM +0000, Sironi, Filippo wrote:
>> There may be a chance of skipping this code, I think.
>>
>> If the microcode is loaded on the hyperthread sibling of the boot cpu
>> before being loaded on the boot cpu, the boot cpu will exit earlier
>> from apply_microcode_intel() - in if (rev >= mc->hdr.rev) { ... }.
>>
>> (This seems to be possible in apply_microcode_amd() as well.)
>>
>> In my tree with the aforementioned change - Intel only - I also had
>> the following patch:
>
> Yap, good catch.
>
> --
> Regards/Gruss,
> Boris.
>
> ECO tip #101: Trim your mails when you reply.
> --
>

Boris, Prarit,

I've sent the patch the I had in my tree and extended it to cover
apply_microcode_amd(). I don't have an AMD host available for testing,
can one of you give that a spin?

Thanks,
Filippo


Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B


2018-08-01 06:39:33

by Oleksandr Natalenko

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

Hi.

> I tested this on AMD Ryzen & Intel Broadwell system and dumped the
> boot_cpu_data before and after a microcode update. On the Intel
> system I also did a fatal MCE using mce-inject to confirm the output
> from the mce handling code.
>
> P.
>
> ---8<---
>
> On systems where a runtime microcode update has occurred the microcode
> version output in a MCE log record is wrong because
> boot_cpu_data.microcode is not updated during runtime.
>
> Update boot_cpu_data.microcode when the BSP's microcode is updated.
>
> Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check
> records")
> Suggested-by: Borislav Petkov <[email protected]>
> Signed-off-by: Prarit Bhargava <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> Changes in v2: Use mc_amd->hdr.patch_id on AMD
>
> arch/x86/kernel/cpu/microcode/amd.c | 4 ++++
> arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/microcode/amd.c
> b/arch/x86/kernel/cpu/microcode/amd.c
> index 0624957aa068..63b072377ba4 100644
> --- a/arch/x86/kernel/cpu/microcode/amd.c
> +++ b/arch/x86/kernel/cpu/microcode/amd.c
> @@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int
> cpu)
> uci->cpu_sig.rev = mc_amd->hdr.patch_id;
> c->microcode = mc_amd->hdr.patch_id;
>
> + /* Update boot_cpu_data's revision too, if we're on the BSP: */
> + if (c->cpu_index == boot_cpu_data.cpu_index)
> + boot_cpu_data.microcode = mc_amd->hdr.patch_id;
> +
> return UCODE_UPDATED;
> }
>
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c
> b/arch/x86/kernel/cpu/microcode/intel.c
> index 97ccf4c3b45b..256d336cbc04 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int
> cpu)
> uci->cpu_sig.rev = rev;
> c->microcode = rev;
>
> + /* Update boot_cpu_data's revision too, if we're on the BSP: */
> + if (c->cpu_index == boot_cpu_data.cpu_index)
> + boot_cpu_data.microcode = rev;
> +
> return UCODE_UPDATED;
> }
>
> --
> 2.17.0

After this patch, do we preserve an original microcode version
somewhere? If no, why? Sometimes it is useful while debugging another
crash because of faulty microcode.

Thanks.

--
Oleksandr Natalenko (post-factum)

2018-08-01 11:39:20

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output



On 08/01/2018 02:38 AM, Oleksandr Natalenko wrote:
> Hi.
>
>> I tested this on AMD Ryzen & Intel Broadwell system and dumped the
>> boot_cpu_data before and after a microcode update.  On the Intel
>> system I also did a fatal MCE using mce-inject to confirm the output
>> from the mce handling code.
>>
>> P.
>>
>> ---8<---
>>
>> On systems where a runtime microcode update has occurred the microcode
>> version output in a MCE log record is wrong because
>> boot_cpu_data.microcode is not updated during runtime.
>>
>> Update boot_cpu_data.microcode when the BSP's microcode is updated.
>>
>> Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
>> Suggested-by: Borislav Petkov <[email protected]>
>> Signed-off-by: Prarit Bhargava <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> ---
>> Changes in v2: Use mc_amd->hdr.patch_id on AMD
>>
>>  arch/x86/kernel/cpu/microcode/amd.c   | 4 ++++
>>  arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
>>  2 files changed, 8 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/microcode/amd.c
>> b/arch/x86/kernel/cpu/microcode/amd.c
>> index 0624957aa068..63b072377ba4 100644
>> --- a/arch/x86/kernel/cpu/microcode/amd.c
>> +++ b/arch/x86/kernel/cpu/microcode/amd.c
>> @@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
>>      uci->cpu_sig.rev = mc_amd->hdr.patch_id;
>>      c->microcode = mc_amd->hdr.patch_id;
>>
>> +    /* Update boot_cpu_data's revision too, if we're on the BSP: */
>> +    if (c->cpu_index == boot_cpu_data.cpu_index)
>> +        boot_cpu_data.microcode =  mc_amd->hdr.patch_id;
>> +
>>      return UCODE_UPDATED;
>>  }
>>
>> diff --git a/arch/x86/kernel/cpu/microcode/intel.c
>> b/arch/x86/kernel/cpu/microcode/intel.c
>> index 97ccf4c3b45b..256d336cbc04 100644
>> --- a/arch/x86/kernel/cpu/microcode/intel.c
>> +++ b/arch/x86/kernel/cpu/microcode/intel.c
>> @@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
>>      uci->cpu_sig.rev = rev;
>>      c->microcode = rev;
>>
>> +    /* Update boot_cpu_data's revision too, if we're on the BSP: */
>> +    if (c->cpu_index == boot_cpu_data.cpu_index)
>> +        boot_cpu_data.microcode = rev;
>> +
>>      return UCODE_UPDATED;
>>  }
>>
>> --
>> 2.17.0
>
> After this patch, do we preserve an original microcode version somewhere? If no,
> why? Sometimes it is useful while debugging another crash because of faulty
> microcode.
>

Interesting, and thanks for bringing this up. Oleksandr, under what
circumstances would you want to know what the old version was. AFAICS it is no
longer running and should not have an impact on the system?

P.

> Thanks.
>

2018-08-01 14:18:52

by Oleksandr Natalenko

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

Hi.

On 01.08.2018 13:38, Prarit Bhargava wrote:
>> After this patch, do we preserve an original microcode version
>> somewhere? If no,
>> why? Sometimes it is useful while debugging another crash because of
>> faulty
>> microcode.
>>
> Interesting, and thanks for bringing this up. Oleksandr, under what
> circumstances would you want to know what the old version was. AFAICS
> it is no
> longer running and should not have an impact on the system?

Once the kernel log does not contain a printout regarding updated
microcode anymore (because the log buffer is limited in size and can be
overwritten) and once you have a vmcore, it is handy to use
boot_cpu_data to compare the microcode version with the per-CPU value to
find out that is was updated at all. Or, maybe, that can be inspected in
another way now?

--
Oleksandr Natalenko (post-factum)

2018-08-01 15:31:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

(drop stable@ from CC)

On Wed, Aug 01, 2018 at 04:16:42PM +0200, Oleksandr Natalenko wrote:
> Once the kernel log does not contain a printout regarding updated microcode
> anymore (because the log buffer is limited in size and can be overwritten)

There's no reliable way to get the old microcode revision which was
overwritten during the upgrade. If dmesg gets overwritten you lose, like
in all the other gazillion cases where you lose information due to that.
Sorry.

This becomes an even bigger problem if you have a long-running system
which upgrades microcode a couple of times before being rebooted again.
In that case, your only log which contains the microcode revisions being
upgraded is dmesg.

> once you have a vmcore, it is handy to use boot_cpu_data to compare
> the microcode version with the per-CPU value to find out that is was
> updated at all.

boot_cpu_data.microcode was never meant to contain the *previous*
microcode revision AFAICT - it just didn't get updated, which we're
fixing now.

> Or, maybe, that can be inspected in another way now?

Dunno, does kdump collect /proc/cpuinfo? If so:

$ grep microcode /proc/cpuinfo

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2018-08-01 15:44:44

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

(dropping stable@)

On Tue, Jul 31, 2018 at 07:27:39AM -0400, Prarit Bhargava wrote:
> I tested this on AMD Ryzen & Intel Broadwell system and dumped the
> boot_cpu_data before and after a microcode update. On the Intel
> system I also did a fatal MCE using mce-inject to confirm the output
> from the mce handling code.

Ok, sending the two patches I've massaged into submission as a reply to
this message - please run them on your boxes. You can also inject MCEs
on the AMD box to verify - simply enable CONFIG_X86_MCE_INJECT, go to
debugfs and do, for example:

cd /sys/kernel/debug/mce-inject/
echo 10 > /sys/devices/system/machinecheck/machinecheck0/check_interval
echo 0x9c7d410092080813 > status; echo 0x000000006d3d483b > addr; echo 2 > cpu; echo 4 > bank; echo hw > flags
echo 0xbc00200000010015 > status; echo 0x00000000942e4d98 > addr; echo 6 > cpu; echo 2 > bank; echo hw > flags
echo 0xfe000010000b0c0f > status; echo 0xe1101add1e550012 > addr; echo 0xcafebeef > misc; echo 2 > cpu; echo hw > flags; echo 4 > bank

Thanks!

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2018-08-01 15:46:42

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 1/2] x86/microcode: Make sure boot_cpu_data.microcode is up-to-date

From: Prarit Bhargava <[email protected]>

The MCE code uses boot_cpu_data.microcode to prepare the MCE record.
However, on systems where late microcode update has occurred, the
microcode revision output in a MCE log record is wrong because
boot_cpu_data.microcode is not updated when the microcode gets updated.

However, the microcode revision saved in boot_cpu_data's microcode
member should be kept up-to-date, regardless, for consistency.

Make it so.

Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Signed-off-by: Prarit Bhargava <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: x86-ml <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Make commit message more precise. ]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/microcode/amd.c | 4 ++++
arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
2 files changed, 8 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 0624957aa068..63b072377ba4 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
uci->cpu_sig.rev = mc_amd->hdr.patch_id;
c->microcode = mc_amd->hdr.patch_id;

+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = mc_amd->hdr.patch_id;
+
return UCODE_UPDATED;
}

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..256d336cbc04 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
uci->cpu_sig.rev = rev;
c->microcode = rev;

+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = rev;
+
return UCODE_UPDATED;
}

--
2.18.0.321.gffc6fa0e3962


--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2018-08-01 15:48:07

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 2/2] x86/microcode: Update the new microcode revision unconditionally

From: Filippo Sironi <[email protected]>

Handle the case where microcode gets loaded on the BSP's hyperthread
sibling first and the boot_cpu_data's microcode revision doesn't get
updated because of early exit due to the siblings sharing a microcode
engine.

For that, simply write the updated revision on all CPUs unconditionally.

Signed-off-by: Filippo Sironi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: x86-ml <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Massage everything a bit. ]
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/cpu/microcode/amd.c | 22 +++++++++++++---------
arch/x86/kernel/cpu/microcode/intel.c | 13 ++++++++-----
2 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 63b072377ba4..07b5fc00b188 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -504,6 +504,7 @@ static enum ucode_state apply_microcode_amd(int cpu)
struct microcode_amd *mc_amd;
struct ucode_cpu_info *uci;
struct ucode_patch *p;
+ enum ucode_state ret;
u32 rev, dummy;

BUG_ON(raw_smp_processor_id() != cpu);
@@ -521,9 +522,8 @@ static enum ucode_state apply_microcode_amd(int cpu)

/* need to apply patch? */
if (rev >= mc_amd->hdr.patch_id) {
- c->microcode = rev;
- uci->cpu_sig.rev = rev;
- return UCODE_OK;
+ ret = UCODE_OK;
+ goto out;
}

if (__apply_microcode_amd(mc_amd)) {
@@ -531,17 +531,21 @@ static enum ucode_state apply_microcode_amd(int cpu)
cpu, mc_amd->hdr.patch_id);
return UCODE_ERROR;
}
- pr_info("CPU%d: new patch_level=0x%08x\n", cpu,
- mc_amd->hdr.patch_id);

- uci->cpu_sig.rev = mc_amd->hdr.patch_id;
- c->microcode = mc_amd->hdr.patch_id;
+ rev = mc_amd->hdr.patch_id;
+ ret = UCODE_UPDATED;
+
+ pr_info("CPU%d: new patch_level=0x%08x\n", cpu, rev);
+
+out:
+ uci->cpu_sig.rev = rev;
+ c->microcode = rev;

/* Update boot_cpu_data's revision too, if we're on the BSP: */
if (c->cpu_index == boot_cpu_data.cpu_index)
- boot_cpu_data.microcode = mc_amd->hdr.patch_id;
+ boot_cpu_data.microcode = rev;

- return UCODE_UPDATED;
+ return ret;
}

static int install_equiv_cpu_table(const u8 *buf)
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 256d336cbc04..16936a24795c 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -795,6 +795,7 @@ static enum ucode_state apply_microcode_intel(int cpu)
struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
struct cpuinfo_x86 *c = &cpu_data(cpu);
struct microcode_intel *mc;
+ enum ucode_state ret;
static int prev_rev;
u32 rev;

@@ -817,9 +818,8 @@ static enum ucode_state apply_microcode_intel(int cpu)
*/
rev = intel_get_microcode_revision();
if (rev >= mc->hdr.rev) {
- uci->cpu_sig.rev = rev;
- c->microcode = rev;
- return UCODE_OK;
+ ret = UCODE_OK;
+ goto out;
}

/*
@@ -848,14 +848,17 @@ static enum ucode_state apply_microcode_intel(int cpu)
prev_rev = rev;
}

+ ret = UCODE_UPDATED;
+
+out:
uci->cpu_sig.rev = rev;
- c->microcode = rev;
+ c->microcode = rev;

/* Update boot_cpu_data's revision too, if we're on the BSP: */
if (c->cpu_index == boot_cpu_data.cpu_index)
boot_cpu_data.microcode = rev;

- return UCODE_UPDATED;
+ return ret;
}

static enum ucode_state generic_load_microcode(int cpu, void *data, size_t size,
--
2.18.0.321.gffc6fa0e3962


--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2018-08-01 15:56:55

by Prarit Bhargava

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output



On 08/01/2018 11:43 AM, Borislav Petkov wrote:
> (dropping stable@)
>
> On Tue, Jul 31, 2018 at 07:27:39AM -0400, Prarit Bhargava wrote:
>> I tested this on AMD Ryzen & Intel Broadwell system and dumped the
>> boot_cpu_data before and after a microcode update. On the Intel
>> system I also did a fatal MCE using mce-inject to confirm the output
>> from the mce handling code.
>
> Ok, sending the two patches I've massaged into submission as a reply to
> this message - please run them on your boxes. You can also inject MCEs
> on the AMD box to verify - simply enable CONFIG_X86_MCE_INJECT, go to
> debugfs and do, for example:
>
> cd /sys/kernel/debug/mce-inject/
> echo 10 > /sys/devices/system/machinecheck/machinecheck0/check_interval
> echo 0x9c7d410092080813 > status; echo 0x000000006d3d483b > addr; echo 2 > cpu; echo 4 > bank; echo hw > flags
> echo 0xbc00200000010015 > status; echo 0x00000000942e4d98 > addr; echo 6 > cpu; echo 2 > bank; echo hw > flags
> echo 0xfe000010000b0c0f > status; echo 0xe1101add1e550012 > addr; echo 0xcafebeef > misc; echo 2 > cpu; echo hw > flags; echo 4 > bank
>

Borislav, I'm not sure if you saw his comments but Oleksandr, cc'd, has an
objection to overwriting the stored microcode version in boot_cpu_data. He has
found that information useful in debugging microcode issues.

P.

> Thanks!
>

2018-08-01 16:01:48

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

> There's no reliable way to get the old microcode revision which was
> overwritten during the upgrade. If dmesg gets overwritten you lose, like
> in all the other gazillion cases where you lose information due to that.

The primary requirement here is that we report the version of the microcode
in use at the time of a crash. Keeping history of all updates seems to me to
beyond the scope of the kernel's responsibilities.

It's not like these updates appear out of the ether. You have to go out and
grab a new package and install it. User land can keep track of this much
more easily than the kernel.

-Tony

2018-08-01 16:03:01

by Oleksandr Natalenko

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

Hi.

On 01.08.2018 17:59, Luck, Tony wrote:
>> There's no reliable way to get the old microcode revision which was
>> overwritten during the upgrade. If dmesg gets overwritten you lose,
>> like
>> in all the other gazillion cases where you lose information due to
>> that.
>
> The primary requirement here is that we report the version of the
> microcode
> in use at the time of a crash. Keeping history of all updates seems to
> me to
> beyond the scope of the kernel's responsibilities.
>
> It's not like these updates appear out of the ether. You have to go out
> and
> grab a new package and install it. User land can keep track of this
> much
> more easily than the kernel.

I don't mind doing the right thing at all. It is just to inform you that
it was found to be useful.

Also, [1].

Thanks ☺.

[1] https://xkcd.com/1172/

--
Oleksandr Natalenko (post-factum)

2018-08-01 16:16:58

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] arch/x86: Fix boot_cpu_data.microcode version output

On August 1, 2018 7:01:50 PM GMT+03:00, Oleksandr Natalenko <[email protected]> wrote:
>I don't mind doing the right thing at all. It is just to inform you
>that
>it was found to be useful.

I don't think it would've worked if you did a second microcode upgrade on the system.

--
Sent from a small device: formatting sux and brevity is inevitable.

Subject: [tip:x86/urgent] x86/microcode: Make sure boot_cpu_data.microcode is up-to-date

Commit-ID: 370a132bb2227ff76278f98370e0e701d86ff752
Gitweb: https://git.kernel.org/tip/370a132bb2227ff76278f98370e0e701d86ff752
Author: Prarit Bhargava <[email protected]>
AuthorDate: Tue, 31 Jul 2018 07:27:39 -0400
Committer: Thomas Gleixner <[email protected]>
CommitDate: Sun, 2 Sep 2018 14:10:54 +0200

x86/microcode: Make sure boot_cpu_data.microcode is up-to-date

When preparing an MCE record for logging, boot_cpu_data.microcode is used
to read out the microcode revision on the box.

However, on systems where late microcode update has happened, the microcode
revision output in a MCE log record is wrong because
boot_cpu_data.microcode is not updated when the microcode gets updated.

But, the microcode revision saved in boot_cpu_data's microcode member
should be kept up-to-date, regardless, for consistency.

Make it so.

Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Signed-off-by: Prarit Bhargava <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/microcode/amd.c | 4 ++++
arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
2 files changed, 8 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 0624957aa068..602f17134103 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
uci->cpu_sig.rev = mc_amd->hdr.patch_id;
c->microcode = mc_amd->hdr.patch_id;

+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = mc_amd->hdr.patch_id;
+
return UCODE_UPDATED;
}

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..256d336cbc04 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
uci->cpu_sig.rev = rev;
c->microcode = rev;

+ /* Update boot_cpu_data's revision too, if we're on the BSP: */
+ if (c->cpu_index == boot_cpu_data.cpu_index)
+ boot_cpu_data.microcode = rev;
+
return UCODE_UPDATED;
}