2017-06-01 09:36:04

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 2/3] x86/apic: Add TSC_DEADLINE quirk due to errata

On Wed, May 31, 2017 at 06:58:55PM -0300, Henrique de Moraes Holschuh wrote:
> On Wed, 31 May 2017, Peter Zijlstra wrote:
> > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_X, 0x02000014),
> ...
> > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_MOBILE, 0xb2),
> > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_DESKTOP, 0xb2),
> ...
> > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_KABYLAKE_MOBILE, 0x52),
> > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_KABYLAKE_DESKTOP, 0x52),
>
> Maybe these revisions need to be decreased by one, so that the SGX
> behavior of messing with the microcode revision is taken into account?

If anything, we should fix whatever sets our boot_cpu_data.microcode
number, not muck with this table. However..

> When the processor microcode is auto-updated from the (signed) FIT, and
> SGX's PRMRR feature is supported, the processor will report its
> microcode revision as one less. Therefore, a microcode update with
> revision 0xb2 auto-loaded from FIT would be reported by RDMSR(0x8b) as
> revision 0xb1.
>
> I know about this SGX-related behavior from a coreboot commit from ~two
> years ago (link below). As far as I can tell, the Intel 64/IA32 SDM is
> *still* missing any mention about this behavior in vol 3A section 9.11
> (where it describes microcode updates).
>
> https://review.coreboot.org/cgit/coreboot.git/commit/?id=5042aad4ded1651638ae9b60e34114b65e4f211e

That commit also states that is avoids a superfluous microcode load. And
we've verified on affected systems (both Thomas and I have a SKL system
with the PRMRR bit set in MTRRCAP) that when we manually load the
microcode image the reported revision number matches the one from the
image.

[ 0.000000] microcode: microcode updated early to revision 0xba, date = 2017-04-09
[ 2.297894] microcode: sig=0x506e3, pf=0x2, revision=0xba

And:

# hexdump /lib/firmware/intel-ucode/06-5e-03 | head -1
0000000 0001 0000 00ba 0000 2017 0409 06e3 0005
^^^^^^^^^

So aside from a possible OS re-load of the microcode, the issue doesn't
appear to have any negative effect.


The microcode people (Cc'ed) might want to further look into this is
they care about avoiding the superfluous reload, but for the purpose of
this patch all is well.


2017-06-01 09:58:19

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 2/3] x86/apic: Add TSC_DEADLINE quirk due to errata

On Thu, 1 Jun 2017, Peter Zijlstra wrote:
> That commit also states that is avoids a superfluous microcode load. And
> we've verified on affected systems (both Thomas and I have a SKL system
> with the PRMRR bit set in MTRRCAP) that when we manually load the
> microcode image the reported revision number matches the one from the
> image.
>
> [ 0.000000] microcode: microcode updated early to revision 0xba, date = 2017-04-09
> [ 2.297894] microcode: sig=0x506e3, pf=0x2, revision=0xba
>
> And:
>
> # hexdump /lib/firmware/intel-ucode/06-5e-03 | head -1
> 0000000 0001 0000 00ba 0000 2017 0409 06e3 0005
> ^^^^^^^^^
>
> So aside from a possible OS re-load of the microcode, the issue doesn't
> appear to have any negative effect.
>
>
> The microcode people (Cc'ed) might want to further look into this is
> they care about avoiding the superfluous reload, but for the purpose of
> this patch all is well.

Avoiding the superflous reload is dangerous. If the BIOS/FIT nonsense gets
fixed (however unlikely that is) then we run into an even worse problem
because then the kernel will not load version 5 if the CPU says version 4.
That's way worse than reloading the same version for nothing.

Thanks,

tglx

Subject: Re: [PATCH 2/3] x86/apic: Add TSC_DEADLINE quirk due to errata

On Thu, 01 Jun 2017, Peter Zijlstra wrote:
> On Wed, May 31, 2017 at 06:58:55PM -0300, Henrique de Moraes Holschuh wrote:
> > On Wed, 31 May 2017, Peter Zijlstra wrote:
> > > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_X, 0x02000014),
> > ...
> > > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_MOBILE, 0xb2),
> > > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_DESKTOP, 0xb2),
> > ...
> > > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_KABYLAKE_MOBILE, 0x52),
> > > + DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_KABYLAKE_DESKTOP, 0x52),
> >
> > Maybe these revisions need to be decreased by one, so that the SGX
> > behavior of messing with the microcode revision is taken into account?
>
> If anything, we should fix whatever sets our boot_cpu_data.microcode
> number, not muck with this table. However..

That would work, yes. But that would breed confusion, it is visible to
userspace, and it and might create unforeseen issues in the future. I
feel it is best to not go that way.

> > When the processor microcode is auto-updated from the (signed) FIT, and
> > SGX's PRMRR feature is supported, the processor will report its
> > microcode revision as one less. Therefore, a microcode update with
> > revision 0xb2 auto-loaded from FIT would be reported by RDMSR(0x8b) as
> > revision 0xb1.
> >
> > I know about this SGX-related behavior from a coreboot commit from ~two
> > years ago (link below). As far as I can tell, the Intel 64/IA32 SDM is
> > *still* missing any mention about this behavior in vol 3A section 9.11
> > (where it describes microcode updates).
> >
> > https://review.coreboot.org/cgit/coreboot.git/commit/?id=5042aad4ded1651638ae9b60e34114b65e4f211e
>
> That commit also states that is avoids a superfluous microcode load. And

In the case of your patch, it would avoid issuing an incorrect warning
to the user and needlessly triggering the errata workaround if the
system UEFI has exactly the revision of the microcode that first fixed
the issue, and it was loaded from FIT.

OTOH, it will add a further (if rather small) nail to the coffin of any
attempts to depend on the fact that microcode was loaded from FIT -- and
anything that does so would get in the way of users and distros fixing
systems with an operating system microcode update.

On those grounds, I retract my suggestion to handle the microcode
revision oddity related to SGX.

Please keep the patch as is.

> we've verified on affected systems (both Thomas and I have a SKL system
> with the PRMRR bit set in MTRRCAP) that when we manually load the
> microcode image the reported revision number matches the one from the
> image.

Yes, it works. That is the reason why I never proposed a fix for our
microcode loader in these two years, actually. It works just fine right
now, and if Intel really wanted a different behavior, they should have
requested such a change or sent in a patch in the first place. Instead,
the changed behavior wasn't even documented in the Intel SDM, since it
is backwards-compatible.

In fact, as far as I am concerned, loading the same microcode update
over itself just to destroy the "loaded from FIT" marker IS the lesser
evil, at least as things currently stand.

> The microcode people (Cc'ed) might want to further look into this is
> they care about avoiding the superfluous reload, but for the purpose of
> this patch all is well.

I think we want to keep our current behavior in the microcode loader as
well. It is not there by mistake: it is exactly what the Intel SDM
recommends.

I believe it was no coincidence that this microcode-from-FIT marker was
designed by Intel in a way that non-modified operating systems (and
advanced boot loaders capable of microcode updates) in the field would
by default always attempt to override the FIT microcode and reset that
flag. Just like we're doing.

--
Henrique Holschuh