LinuxLists.cc - Re: [PATCH 2/8] x86, microcode, intel: don't update each HT core twice

2014-10-20 13:33:18

Subject: Re: [PATCH 2/8] x86, microcode, intel: don't update each HT core twice

On Mon, Sep 08, 2014 at 02:37:48PM -0300, Henrique de Moraes Holschuh wrote:
> Fix a regression introduced by 506ed6b53e00ba303ad778122f08e1fca7cf5efb,
> "x86, intel: Output microcode revision in /proc/cpuinfo", which added a
> cache of the thread microcode revision to cpu_data()->microcode and
> switched the microcode driver to using the cached value.
>
> This caused the driver to needlessly update each processor core twice
> when hyper-threading is enabled (once per hardware thread). The early
> microcode update code that runs during BSP/AP setup does not have this
> problem.
>
> Intel microcode update operations are extremely expensive. The WRMSR
> 79H instruction could take anywhere from a hundred-thousand to several
> million cycles to successfully apply a microcode update, depending on
> processor model and microcode update size.
>
> To avoid updating the same core twice per microcode update run, refresh
> the microcode revision of each CPU (hardware thread) before deciding
> whether it needs an update or not.
>
> A silent version of collect_cpu_info() is required for this fix,
> otherwise the logs produced by a microcode update run would be twice as
> long and very confusing.
>
> Signed-off-by: Henrique de Moraes Holschuh <[email protected]>
> ---
> arch/x86/kernel/cpu/microcode/intel.c | 39 ++++++++++++++++++++++-----------
> 1 file changed, 26 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
> index c6826d1..2c629d1 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -87,7 +87,7 @@ MODULE_DESCRIPTION("Microcode Update Driver");
> MODULE_AUTHOR("Tigran Aivazian <[email protected]>");
> MODULE_LICENSE("GPL");
>
> -static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> +static void __collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> {
> struct cpuinfo_x86 *c = &cpu_data(cpu_num);
> unsigned int val[2];
> @@ -102,7 +102,19 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> csig->pf = 1 << ((val[1] >> 18) & 7);
> }
>
> - csig->rev = c->microcode;
> + /* get the current microcode revision from MSR 0x8B */
> + wrmsr(MSR_IA32_UCODE_REV, 0, 0);
> + sync_core();
> + rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
> +
> + csig->rev = val[1];
> + c->microcode = val[1]; /* re-sync */
> +}
> +
> +static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> +{
> + __collect_cpu_info(cpu_num, csig);
> +
> pr_info("CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
> cpu_num, csig->sig, csig->pf, csig->rev);

We probably should downgrade this to pr_debug and use collect_cpu_info()
everywhere instead of having a __ version.

>
> @@ -118,7 +130,10 @@ static int get_matching_mc(struct microcode_intel *mc_intel, int cpu)
> struct cpu_signature cpu_sig;
> unsigned int csig, cpf, crev;
>
> - collect_cpu_info(cpu, &cpu_sig);
> + /* NOTE: cpu_data()->microcode will be outdated on HT
> + * processors during an update run, it must be refreshed
> + * from MSR 0x8B */
> + __collect_cpu_info(cpu, &cpu_sig);
>
> csig = cpu_sig.sig;
> cpf = cpu_sig.pf;
> @@ -145,23 +160,21 @@ static int apply_microcode_intel(int cpu)
> return 0;
>
> /*
> - * Microcode on this CPU could be updated earlier. Only apply the
> - * microcode patch in mc_intel when it is newer than the one on this
> - * CPU.
> + * Microcode on this CPU might be already up-to-date. Only apply
> + * the microcode patch in mc_intel when it is newer than the one
> + * on this CPU.
> */
> if (get_matching_mc(mc_intel, cpu) == 0)
> return 0;
>
> - /* write microcode via MSR 0x79 */
> + /* write microcode via MSR 0x79. THIS IS VERY EXPENSIVE */

No need for screaming here - we know MSR accesses are expensive. This
comment is totally useless here so drop it altogether.

> wrmsr(MSR_IA32_UCODE_WRITE,
> - (unsigned long) mc_intel->bits,
> - (unsigned long) mc_intel->bits >> 16 >> 16);
> - wrmsr(MSR_IA32_UCODE_REV, 0, 0);
> -
> - /* As documented in the SDM: Do a CPUID 1 here */
> - sync_core();
> + lower_32_bits((unsigned long) mc_intel->bits),
> + upper_32_bits((unsigned long) mc_intel->bits));

wrmsrl() takes u64 directly - no need for the splitting.

> /* get the current revision from MSR 0x8B */
> + wrmsr(MSR_IA32_UCODE_REV, 0, 0);
> + sync_core();
> rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
>
> if (val[1] != mc_intel->hdr.rev) {
> --
> 1.7.10.4
>
>

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-10-20 18:24:42

by Henrique de Moraes Holschuh

[permalink] [raw]

Subject: Re: [PATCH 2/8] x86, microcode, intel: don't update each HT core twice

On Mon, 20 Oct 2014, Borislav Petkov wrote:
> On Mon, Sep 08, 2014 at 02:37:48PM -0300, Henrique de Moraes Holschuh wrote:
> > -static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> > +static void __collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> > {
> > struct cpuinfo_x86 *c = &cpu_data(cpu_num);
> > unsigned int val[2];
> > @@ -102,7 +102,19 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> > csig->pf = 1 << ((val[1] >> 18) & 7);
> > }
> >
> > - csig->rev = c->microcode;
> > + /* get the current microcode revision from MSR 0x8B */
> > + wrmsr(MSR_IA32_UCODE_REV, 0, 0);
> > + sync_core();
> > + rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
> > +
> > + csig->rev = val[1];
> > + c->microcode = val[1]; /* re-sync */
> > +}
> > +
> > +static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> > +{
> > + __collect_cpu_info(cpu_num, csig);
> > +
> > pr_info("CPU%d sig=0x%x, pf=0x%x, revision=0x%x\n",
> > cpu_num, csig->sig, csig->pf, csig->rev);
>
> We probably should downgrade this to pr_debug and use collect_cpu_info()
> everywhere instead of having a __ version.

Over time, grepping for that information on reports and logs all over the
net has helped me a great deal. In fact, it is in my backlog to add it to
the early microcode driver as well.

I really miss the full microcode ID information in /proc/cpuinfo, in fact.

If you want, I can modify the logging it in a future patch so that we print
it only once when all cores have the same sig, pf and revision (which should
cover 95% of the systems out there).

> > - /* write microcode via MSR 0x79 */
> > + /* write microcode via MSR 0x79. THIS IS VERY EXPENSIVE */
>
> No need for screaming here - we know MSR accesses are expensive. This
> comment is totally useless here so drop it altogether.

MSR 79H writes are on a class of their own as far as "expensive" goes... On
a modern i3/i5/i7, it will take approximately one million cycles to complete
(the larger the microcode update, the longer it takes).

I don't think people usually associate MSR write with "takes one million
cycles to complete"...

That said, I will remove the comment.

> > wrmsr(MSR_IA32_UCODE_WRITE,
> > - (unsigned long) mc_intel->bits,
> > - (unsigned long) mc_intel->bits >> 16 >> 16);
> > - wrmsr(MSR_IA32_UCODE_REV, 0, 0);
> > -
> > - /* As documented in the SDM: Do a CPUID 1 here */
> > - sync_core();
> > + lower_32_bits((unsigned long) mc_intel->bits),
> > + upper_32_bits((unsigned long) mc_intel->bits));
>
> wrmsrl() takes u64 directly - no need for the splitting.

This is old code, I guess it predates wrmsrl()...

Should I replace the old split version with wrmsrl() in this patch, or as a
separate patch?

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2014-10-28 17:31:14

by Borislav Petkov

[permalink] [raw]

Subject: Re: [PATCH 2/8] x86, microcode, intel: don't update each HT core twice

On Mon, Oct 20, 2014 at 04:24:27PM -0200, Henrique de Moraes Holschuh wrote:
> Over time, grepping for that information on reports and logs all over the
> net has helped me a great deal.

Helped you how, for what? I still am searching for a justification to
bother the user with the fact that her microcode just got upgraded. I
mean, she can simply do:

$ grep microcode /proc/cpuinfo | head -1
microcode : 0x6000822

if needed.

Now, the error cases where the upgrade fails for some unexpected reason
is what we want to know.

> I really miss the full microcode ID information in /proc/cpuinfo, in fact.

Full ID, you mean all fields of struct cpu_signature on Intel?

If so,

->sig - CPUID_EAX(1) which is in /proc/cpuinfo

->pf - processor flags in MSR_0x17[52:50] - I guess you can read that
out with rdmsr 0x17. Why do we need to know that one except maybe to
verify why a patch doesn't get accepted by the loader?

-> rev - that's in MSR_IA32_UCODE_REV

I'm not really sure we absolutely need those except for debugging. Thus
the pr_debug() suggestion from my side.

> MSR 79H writes are on a class of their own as far as "expensive" goes... On
> a modern i3/i5/i7, it will take approximately one million cycles to complete
> (the larger the microcode update, the longer it takes).
>
> I don't think people usually associate MSR write with "takes one million
> cycles to complete"...

So? You don't do microcode updates all the time - it is done once during
boot and when cores come back online.

> This is old code, I guess it predates wrmsrl()...
>
> Should I replace the old split version with wrmsrl() in this patch, or as a
> separate patch?

Yes please. And then add to the commit message something of the sorts
"While at it, ..."

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-10-31 18:43:59

by Henrique de Moraes Holschuh

[permalink] [raw]

Subject: Re: [PATCH 2/8] x86, microcode, intel: don't update each HT core twice

On Tue, 28 Oct 2014, Borislav Petkov wrote:
> On Mon, Oct 20, 2014 at 04:24:27PM -0200, Henrique de Moraes Holschuh wrote:
> > Over time, grepping for that information on reports and logs all over the
> > net has helped me a great deal.
>
> Helped you how, for what? I still am searching for a justification to
> bother the user with the fact that her microcode just got upgraded. I
> mean, she can simply do:

Every time I issued a warning to users that they need to upgrade their
microcode due to either security errata or other nasty issue, the fact that
you can just grep for microcode in the logs to know whether it worked or not
*and* the reasons why it did/did not work has been helpful.

The logging also tells me whether the early or regular microcode driver was
the one that did the update. This matters to me, as it helps me detect when
the regular microcode update driver is doing work that should have been done
by the early driver.

It was also really useful when I was searching the net for reports about a
specific processor signature, /proc/cpuinfo is really unhelpful for that.

It also gives me one interesting information that just looking at
/proc/cpuinfo doesn't: the version of the microcode installed by the user's
BIOS/UEFI. This kind of information is relevant to some of the Linux
distros (at least to Debian) due to the non-free nature of the microcode
updates [note: this assumes the regular microcode driver was loaded]. It is
the only metric I have that tells me whether the effort is worth it.

Let me propose a way forward that would improve the status quo re. reducing
the logging, and still log the information I find useful: If you agree, I
will write code to greatly reduce the number of log lines generated by the
combination of the early microcode and regular microcode intel driver.

I think I would be able to reduce it to *two* lines total at most, per boot,
in the general case. And it will likely be just one line, depends on
whether the early microcode driver was compiled in or not, and whether it or
the regular microcode driver managed to update the processor.

I will write that code for the next patchset (after I update this one until
it gets into a shape you can accept).

> > I really miss the full microcode ID information in /proc/cpuinfo, in fact.
>
> Full ID, you mean all fields of struct cpu_signature on Intel?

cpu signature, pf mask and revision. Since we already log the date, might
as well keep it too.

> ->sig - CPUID_EAX(1) which is in /proc/cpuinfo

Not in a format amicable to finding it through a search engine. Besides, we
lose the processor type bits. Given that we have a brand new i586 chip, who
is to say we won't see once again processors that have one of those bits set
in their signature?

> ->pf - processor flags in MSR_0x17[52:50] - I guess you can read that
> out with rdmsr 0x17. Why do we need to know that one except maybe to
> verify why a patch doesn't get accepted by the loader?

This one has very limited use right now, yes. But without it, I cannot even
tell whether the user has an up-to-date microcode or not when Intel releases
microcode updates that do not apply to all processors with the same
CPUID_EAX(1). They haven't done this in the last couple years, but it is
needed for processors that are not that old, such as cpuid 0x30661 (Intel
Atom D25xx/N26xx,N28xx, from 2011). It is likely it will be needed again.

> -> rev - that's in MSR_IA32_UCODE_REV

> I'm not really sure we absolutely need those except for debugging. Thus
> the pr_debug() suggestion from my side.

It is for debugging, indeed. But too much stuff never logs at the pr_debug
level, so you need to take special action. If you accept my proposal for
log size reduction while still logging the stuff I find useful, this could
be pr_info() as it would *not* be per-core anymore.

> > This is old code, I guess it predates wrmsrl()...
> >
> > Should I replace the old split version with wrmsrl() in this patch, or as a
> > separate patch?
>
> Yes please. And then add to the commit message something of the sorts
> "While at it, ..."

Will do.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2014-11-01 11:06:32

by Borislav Petkov

[permalink] [raw]

Subject: Re: [PATCH 2/8] x86, microcode, intel: don't update each HT core twice

On Fri, Oct 31, 2014 at 04:43:45PM -0200, Henrique de Moraes Holschuh wrote:
> Every time I issued a warning to users that they need to upgrade their
> microcode due to either security errata or other nasty issue, the fact that
> you can just grep for microcode in the logs to know whether it worked or not
> *and* the reasons why it did/did not work has been helpful.

Ok, I see what you mean.

A couple of thoughs/suggestions for that:

/proc/cpuinfo is the place which gives you, well, CPU info. So all
the information pertinent to the CPU should be there. And since on bug
reports we do ask for /proc/cpuinfo along with dmesg, I think that
should be fine.

Now, we're not about reducing the log info in general but rather only
saying something when we really have to. Microcode updates should work
in the background and the user should not be bothered at all about it in
the normal, success case. Only when there are issues with it, only then
we want to say something.

> The logging also tells me whether the early or regular microcode driver was
> the one that did the update. This matters to me, as it helps me detect when
> the regular microcode update driver is doing work that should have been done
> by the early driver.

This is wrong: if! the early loader is enabled and initrd is supplied,
it should apply the microcode. That's why it is an early loader in the
first place.

The late loader is for the cases where you don't want to reboot the
system when applying microcode.

But all systems out there should enable the early loader - microcode
should be applied as early as possible.

> It was also really useful when I was searching the net for reports about a
> specific processor signature, /proc/cpuinfo is really unhelpful for that.

Why is that so? Can we improve it?

> It also gives me one interesting information that just looking at
> /proc/cpuinfo doesn't: the version of the microcode installed by the user's
> BIOS/UEFI. This kind of information is relevant to some of the Linux
> distros (at least to Debian) due to the non-free nature of the microcode
> updates [note: this assumes the regular microcode driver was loaded]. It is
> the only metric I have that tells me whether the effort is worth it.

Well, if you really need it, we could add it there:

cat /proc/cpuinfo:

...
vendor_id : AuthenticAMD
cpu family : 21
model : 2
model name : AMD FX(tm)-8350 Eight-Core Processor
stepping : 0
microcode : 0x6000832
BIOS microcode : 0x6000822

but I don't see why you need it. Regardless of the effort, you want to have
microcode loader running on all systems, that's it.

> > > I really miss the full microcode ID information in /proc/cpuinfo, in fact.
> >
> > Full ID, you mean all fields of struct cpu_signature on Intel?
>
> cpu signature, pf mask and revision. Since we already log the date, might
> as well keep it too.
>
> > ->sig - CPUID_EAX(1) which is in /proc/cpuinfo
>
> Not in a format amicable to finding it through a search engine. Besides, we
> lose the processor type bits. Given that we have a brand new i586 chip, who

processor type bits? What is that?

> is to say we won't see once again processors that have one of those bits set
> in their signature?
>
> > ->pf - processor flags in MSR_0x17[52:50] - I guess you can read that
> > out with rdmsr 0x17. Why do we need to know that one except maybe to
> > verify why a patch doesn't get accepted by the loader?
>
> This one has very limited use right now, yes. But without it, I cannot even
> tell whether the user has an up-to-date microcode or not when Intel releases
> microcode updates that do not apply to all processors with the same
> CPUID_EAX(1). They haven't done this in the last couple years, but it is
> needed for processors that are not that old, such as cpuid 0x30661 (Intel
> Atom D25xx/N26xx,N28xx, from 2011). It is likely it will be needed again.

Well, you have all that info:

grep . /sys/devices/system/cpu/cpu?/microcode/*
/sys/devices/system/cpu/cpu0/microcode/processor_flags:0x10
/sys/devices/system/cpu/cpu0/microcode/version:0x1b
/sys/devices/system/cpu/cpu1/microcode/processor_flags:0x10
/sys/devices/system/cpu/cpu1/microcode/version:0x1b
/sys/devices/system/cpu/cpu2/microcode/processor_flags:0x10
/sys/devices/system/cpu/cpu2/microcode/version:0x1b
/sys/devices/system/cpu/cpu3/microcode/processor_flags:0x10
/sys/devices/system/cpu/cpu3/microcode/version:0x1b

We can easily extend sysfs with more info required instead of spewing it
into dmesg.

You can access sysfs then in your scripts and query everything you need.
And use cpuid.ko and msr.ko too. I think with that you get all the info
needed.

> It is for debugging, indeed. But too much stuff never logs at the pr_debug
> level, so you need to take special action. If you accept my proposal for
> log size reduction while still logging the stuff I find useful, this could
> be pr_info() as it would *not* be per-core anymore.

See above.

Bottom line is: dmesg should contain only error information when
microcode fails applying. All the remaining info should be in sysfs and
/proc/cpuinfo.

This is much cleaner IMO and will save us a lot of useless bug reports.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-11-01 19:20:42

by Henrique de Moraes Holschuh

[permalink] [raw]

Subject: Re: [PATCH 2/8] x86, microcode, intel: don't update each HT core twice

On Sat, 01 Nov 2014, Borislav Petkov wrote:
> Now, we're not about reducing the log info in general but rather only
> saying something when we really have to. Microcode updates should work
> in the background and the user should not be bothered at all about it in
> the normal, success case. Only when there are issues with it, only then
> we want to say something.

I understand your objective, yes. I just don't agree with your reasoning
entirely. I fully agree it should not bother the user by inducing doubts,
or being too verbose, though.

Suppose I do the work so that we get something like this:

<microcode driver>: sig XXX, pf XXX, rev YYY

or, if an update is done, we get this *instead*:

<microcode driver>: sig XXX, pf XXX, updated from rev YYY to rev ZZZ (YYYYMMDD)

And that will be all, unless you actually have multiple [different]
microcodes in the system (which is rare), in which case you will get more
lines.

You'd get the same one line if an update is triggered at runtime and it
manages to update at least one processor.

> The late loader is for the cases where you don't want to reboot the
> system when applying microcode.

I used it also as a fallback in Debian/Ubuntu should the early update
driver fail, but I have no idea whether it was effective or not because the
early microcode driver currently fails silently.

Anyway, the use of the late driver in initramfs (even as a fallback) is gone
from Debian since two weeks ago, and it will be gone from Ubuntu when they
sync to Debian once again.

> But all systems out there should enable the early loader - microcode
> should be applied as early as possible.

Agreed.

> > It was also really useful when I was searching the net for reports about a
> > specific processor signature, /proc/cpuinfo is really unhelpful for that.
>
> Why is that so? Can we improve it?

Yes, we can, but I'd still like to have my one log line per boot, please.

The log line has one feature /proc/cpuinfo won't ever have: it is always
there in the boot logs, even if you didn't ask for it in advance.

And the log currently has one feature that is valuable IMO: it is complete,
it has all the errors, warnings and useful information, and you just need
"grep microcode" to get it all.

> Well, if you really need it, we could add it there:
>
> cat /proc/cpuinfo:
>
> ...
> vendor_id : AuthenticAMD
> cpu family : 21
> model : 2
> model name : AMD FX(tm)-8350 Eight-Core Processor
> stepping : 0
> microcode : 0x6000832
> BIOS microcode : 0x6000822

For /proc/cpuinfo to be complete for *tool use*, I'd really need
cpuid_eax(1), and on intel, the three bits from MSR 0x17 [52:50] in pf_mask
format (1 << (MSR 0x17[52:50])).

microcode sig: 0x0000000
microcode pfm: 0x00 [only on intel]

We could make the "BIOS microcode" line optional, and supress it when no
microcode update was applied (i.e. BIOS microcode == current microcode).

However, that's two new lines for AMD, and three for Intel.

Somewhat related, but independent of what we do to /proc/cpuinfo,
microcode.ko really should grow a signature field in sysfs, so that you
don't need both microcode.ko *and* cpuid.ko to know what microcode the box
needs. I even have a patch that does this ready, if you want it.

And it is much better to deal with sysfs than parse /proc/cpuinfo...
cpuinfo is for humans, and it is quite important, but machine-parsing it
properly is annoying (requires a state machine, etc).

> but I don't see why you need it. Regardless of the effort, you want to have
> microcode loader running on all systems, that's it.

Eh, I don't _need_ it because right now the kernel log has everything I
want (as long as the late microcode module gets loaded) other than error
messages from the early loader.

> > > > I really miss the full microcode ID information in /proc/cpuinfo, in fact.
> > >
> > > Full ID, you mean all fields of struct cpu_signature on Intel?
> >
> > cpu signature, pf mask and revision. Since we already log the date, might
> > as well keep it too.
> >
> > > ->sig - CPUID_EAX(1) which is in /proc/cpuinfo
> >
> > Not in a format amicable to finding it through a search engine. Besides, we
> > lose the processor type bits. Given that we have a brand new i586 chip, who
>
> processor type bits? What is that?

It is a field composed by cpuid_eax(1) [13:12]. According to the Intel
SDM:

00b: Original OEM processor
01b: overdrive processor
10b: dual processor
11b: Intel reserved

This is the microcode for the Pentium Overdrive processor:
sig 0x00001632, pf mask 0x00, 1998-06-10, rev 0x0002, size 2048

where you can see that bits 13:12 of the signture are 01b (type: overdrive
processor).

Anyway, AFAIK the Intel SDM defines the microcode signature to be all 32
bits in cpuid_eax(1), including the reserved bits (which doesn't seem
right, you're usually told to mask these to zero, but whatever).

> You can access sysfs then in your scripts and query everything you need.
> And use cpuid.ko and msr.ko too. I think with that you get all the info
> needed.

msr.ko is a security nightmare. The less it is used, the better. I'd like
it to either go away entirely, or (more realistically) that it switched to
a processor model-aware whitelist of safe, in-use-by-userspace MSRs, and
reject everything else no matter what (even to processes with
CAP_SYS_RAWIO).

But yes, I can get that information as-is, and I don't even need msr.ko to
do it. It may not be convenient at all, but the information is reachable
from userspace through cpuid.ko and the late microcode loader
(microcode.ko).

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2014-11-04 15:53:56

by Borislav Petkov

[permalink] [raw]

Subject: Re: [PATCH 2/8] x86, microcode, intel: don't update each HT core twice

On Sat, Nov 01, 2014 at 05:20:26PM -0200, Henrique de Moraes Holschuh wrote:
> I understand your objective, yes. I just don't agree with your reasoning
> entirely. I fully agree it should not bother the user by inducing doubts,
> or being too verbose, though.
>
> Suppose I do the work so that we get something like this:
>
> <microcode driver>: sig XXX, pf XXX, rev YYY
>
> or, if an update is done, we get this *instead*:
>
> <microcode driver>: sig XXX, pf XXX, updated from rev YYY to rev ZZZ (YYYYMMDD)
>
> And that will be all, unless you actually have multiple [different]
> microcodes in the system (which is rare), in which case you will get more
> lines.
>
> You'd get the same one line if an update is triggered at runtime and it
> manages to update at least one processor.

And what is wrong with putting that info in /proc/cpuinfo and not saying
anything in dmesg?

> I used it also as a fallback in Debian/Ubuntu should the early update
> driver fail, but I have no idea whether it was effective or not because the
> early microcode driver currently fails silently.

The early driver should not fail - if it does we need to fix it. Feel
free to CC me on bug reports so that this gets sorted out and fixed.

And it fails silently because that early we cannot call anything
printk-like anyway as a debugging aid.

> Yes, we can, but I'd still like to have my one log line per boot, please.
>
> The log line has one feature /proc/cpuinfo won't ever have: it is always
> there in the boot logs, even if you didn't ask for it in advance.

Yeah right. Even if dmesg ring buffer has wrapped and overwritten stuff.

The lines which are always there are the /proc/cpuinfo ones. And CPU
microcode belongs to the CPU so it is only natural to put it there.
Oh, and removing the dmesg likes will save you silly bug reports like
microcode being updated only on the even-numbered cores.

> For /proc/cpuinfo to be complete for *tool use*, I'd really need
> cpuid_eax(1),

No, you don't. You can *call* CPUID in your tool.

> and on intel, the three bits from MSR 0x17 [52:50] in pf_mask
> format (1 << (MSR 0x17[52:50])).
>
> microcode sig: 0x0000000
> microcode pfm: 0x00 [only on intel]
>
> We could make the "BIOS microcode" line optional, and supress it when no
> microcode update was applied (i.e. BIOS microcode == current microcode).
>
> However, that's two new lines for AMD, and three for Intel.

Not a problem.

> Somewhat related, but independent of what we do to /proc/cpuinfo,
> microcode.ko really should grow a signature field in sysfs, so that you
> don't need both microcode.ko *and* cpuid.ko to know what microcode the box
> needs. I even have a patch that does this ready, if you want it.

You can call CPUID in your tool - it is not a privileged instruction
like accessing the MSRs, for example.

> And it is much better to deal with sysfs than parse /proc/cpuinfo...
> cpuinfo is for humans, and it is quite important, but machine-parsing it
> properly is annoying (requires a state machine, etc).

Really?! I'd use awk :-)

But whatever, we can put stuff in sysfs if you prefer.

> It is a field composed by cpuid_eax(1) [13:12]. According to the Intel
> SDM:
>
> 00b: Original OEM processor
> 01b: overdrive processor
> 10b: dual processor
> 11b: Intel reserved
>
> This is the microcode for the Pentium Overdrive processor:
> sig 0x00001632, pf mask 0x00, 1998-06-10, rev 0x0002, size 2048
>
> where you can see that bits 13:12 of the signture are 01b (type: overdrive
> processor).

Ok.

> Anyway, AFAIK the Intel SDM defines the microcode signature to be all 32
> bits in cpuid_eax(1), including the reserved bits (which doesn't seem
> right, you're usually told to mask these to zero, but whatever).

They probably say somewhere too that those are Read-As-Zero too...

> msr.ko is a security nightmare. The less it is used, the better. I'd like
> it to either go away entirely, or (more realistically) that it switched to
> a processor model-aware whitelist of safe, in-use-by-userspace MSRs, and
> reject everything else no matter what (even to processes with
> CAP_SYS_RAWIO).

Right, I think people even blacklist it on production systems.

> But yes, I can get that information as-is, and I don't even need msr.ko to
> do it. It may not be convenient at all, but the information is reachable
> from userspace through cpuid.ko and the late microcode loader
> (microcode.ko).

Ok, so far I gather all you need is CPUID which you can call - you
don't necessarily need cpuid.ko - and stuff exported through sysfs or
/proc/cpuinfo.

Sounds like a much better interface to me than grepping to fragile logs
which might get overwritten.

Btw, on bug reports, you can do a small script which goes and collects
all that info from the user's machine. I know alsa does something like
that.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--