2016-12-09 18:24:15

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements

*** PATCHES 1 and 2 MAY BE 4.9 MATERIAL ***

Alan Cox pointed out that the 486 isn't the only supported CPU that
doesn't have CPUID. Let's clean up the mess and make everything
faster while we're at it.

Patch 1 is intended to be an easy fix: it makes sync_core() work
without CPUID on all 32-bit kernels. It should be quite safe. This
will have a negligible performance cost during boot on kernels built
for newer CPUs. With this in place, patch 2 reverts the buggy 486
check I added.

Patches 3-4 are meant to improve the situation. Patch 3 cleans up
the Intel microcode loader and the patch 4 (which depends on patch 3
to work correctly) stops using CPUID in sync_core() altogether.

Changes from v3:
- Improve sync_core() comments.
- Tidy up sync_core() asm.

Changes from v2:
- Switch to IRET-to-self and get rid of all the paravirt code.
- Further immprove the sync_core() comment.

Changes from v1:
- Fix Xen
- Add timing info to the changelog (hint: 2x speedup)
- Document patch 1 a bit better.

Andy Lutomirski (4):
x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit
kernels
Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
x86/microcode/intel: Replace sync_core() with native_cpuid()
x86/asm: Rewrite sync_core() to use IRET-to-self

arch/x86/boot/cpu.c | 6 ---
arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++----------
arch/x86/kernel/cpu/microcode/intel.c | 26 ++++++++++--
3 files changed, 81 insertions(+), 31 deletions(-)

--
2.9.3


2016-12-09 18:24:17

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels

We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong. For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.

Reported-by: One Thousand Gnomes <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf17f6a..64fbc937d586 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -595,7 +595,7 @@ static inline void sync_core(void)
{
int tmp;

-#ifdef CONFIG_M486
+#ifdef CONFIG_X86_32
/*
* Do a CPUID if available, otherwise do a jump. The jump
* can conveniently enough be the jump around CPUID.
--
2.9.3

2016-12-09 18:24:22

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"

This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.

The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID. Since we no
longer require CPUID for sync_core(), just revert the patch.

I think the relevant CPUs are Geode and Elan, but I'm not sure.

In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.

Reported-by: One Thousand Gnomes <[email protected]>
Cc: Matthew Whitehead <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/boot/cpu.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 4224ede43b4e..26240dde081e 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,12 +87,6 @@ int validate_cpu(void)
return -1;
}

- if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
- !has_eflag(X86_EFLAGS_ID)) {
- printf("This kernel requires a CPU with the CPUID instruction. Build with CONFIG_M486=y to run on this CPU.\n");
- return -1;
- }
-
if (err_flags) {
puts("This kernel requires the following features "
"not present on the CPU:\n");
--
2.9.3

2016-12-09 18:24:25

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self

Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID. Use IRET-to-self
instead. IRET-to-self works everywhere, so it makes testing easy.

For reference, On my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.

While we're at it: sync_core() serves a very specific purpose.
Document it.

Cc: "H. Peter Anvin" <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
1 file changed, 58 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 64fbc937d586..ceb1f4d3f3fa 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -590,33 +590,69 @@ static __always_inline void cpu_relax(void)

#define cpu_relax_lowlatency() cpu_relax()

-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ * a) Text was modified using one virtual address and is about to be executed
+ * from the same physical page at a different virtual address.
+ *
+ * b) Text was modified on a different CPU, may subsequently be
+ * executed on this CPU, and you want to make sure the new version
+ * gets executed. This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
static inline void sync_core(void)
{
- int tmp;
-
-#ifdef CONFIG_X86_32
/*
- * Do a CPUID if available, otherwise do a jump. The jump
- * can conveniently enough be the jump around CPUID.
+ * There are quite a few ways to do this. IRET-to-self is nice
+ * because it works on every CPU, at any CPL (so it's compatible
+ * with paravirtualization), and it never exits to a hypervisor.
+ * The only down sides are that it's a bit slow (it seems to be
+ * a bit more than 2x slower than the fastest options) and that
+ * it unmasks NMIs. The "push %cs" is needed because, in
+ * paravirtual environments, __KERNEL_CS may not be a valid CS
+ * value when we do IRET directly.
+ *
+ * In case NMI unmasking or performance ever becomes a problem,
+ * the next best option appears to be MOV-to-CR2 and an
+ * unconditional jump. That sequence also works on all CPUs,
+ * but it will fault at CPL3 (i.e. Xen PV and lguest).
+ *
+ * CPUID is the conventional way, but it's nasty: it doesn't
+ * exist on some 486-like CPUs, and it usually exits to a
+ * hypervisor.
+ *
+ * Like all of Linux's memory ordering operations, this is a
+ * compiler barrier as well.
*/
- asm volatile("cmpl %2,%1\n\t"
- "jl 1f\n\t"
- "cpuid\n"
- "1:"
- : "=a" (tmp)
- : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
- : "ebx", "ecx", "edx", "memory");
+ register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+ asm volatile (
+ "pushfl\n\t"
+ "pushl %%cs\n\t"
+ "pushl $1f\n\t"
+ "iret\n\t"
+ "1:"
+ : "+r" (__sp) : : "memory");
#else
- /*
- * CPUID is a barrier to speculative execution.
- * Prefetched instructions are automatically
- * invalidated when modified.
- */
- asm volatile("cpuid"
- : "=a" (tmp)
- : "0" (1)
- : "ebx", "ecx", "edx", "memory");
+ unsigned int tmp;
+
+ asm volatile (
+ "mov %%ss, %0\n\t"
+ "pushq %q0\n\t"
+ "pushq %%rsp\n\t"
+ "addq $8, (%%rsp)\n\t"
+ "pushfq\n\t"
+ "mov %%cs, %0\n\t"
+ "pushq %q0\n\t"
+ "pushq $1f\n\t"
+ "iretq\n\t"
+ "1:"
+ : "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
#endif
}

--
2.9.3

2016-12-09 18:25:04

by Andy Lutomirski

[permalink] [raw]
Subject: [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid()

The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1". I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.

Reported-by: Henrique de Moraes Holschuh <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/kernel/cpu/microcode/intel.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index cdc0deab00c9..e0981bb2a351 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -356,6 +356,26 @@ get_matching_model_microcode(unsigned long start, void *data, size_t size,
return state;
}

+static void cpuid_1(void)
+{
+ /*
+ * According to the Intel SDM, Volume 3, 9.11.7:
+ *
+ * CPUID returns a value in a model specific register in
+ * addition to its usual register return values. The
+ * semantics of CPUID cause it to deposit an update ID value
+ * in the 64-bit model-specific register at address 08BH
+ * (IA32_BIOS_SIGN_ID). If no update is present in the
+ * processor, the value in the MSR remains unmodified.
+ *
+ * Use native_cpuid -- this code runs very early and we don't
+ * want to mess with paravirt.
+ */
+ unsigned int eax = 1, ebx, ecx = 0, edx;
+
+ native_cpuid(&eax, &ebx, &ecx, &edx);
+}
+
static int collect_cpu_info_early(struct ucode_cpu_info *uci)
{
unsigned int val[2];
@@ -385,7 +405,7 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
native_wrmsrl(MSR_IA32_UCODE_REV, 0);

/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();

/* get the current revision from MSR 0x8B */
native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -627,7 +647,7 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
native_wrmsrl(MSR_IA32_UCODE_REV, 0);

/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();

/* get the current revision from MSR 0x8B */
native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -927,7 +947,7 @@ static int apply_microcode_intel(int cpu)
wrmsrl(MSR_IA32_UCODE_REV, 0);

/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();

/* get the current revision from MSR 0x8B */
rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
--
2.9.3

2016-12-15 18:06:26

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements

On Fri, Dec 9, 2016 at 10:24 AM, Andy Lutomirski <[email protected]> wrote:
> *** PATCHES 1 and 2 MAY BE 4.9 MATERIAL ***
>
> Alan Cox pointed out that the 486 isn't the only supported CPU that
> doesn't have CPUID. Let's clean up the mess and make everything
> faster while we're at it.
>

Ping? Any chance of getting these in for 4.10?

--Andy

Subject: [tip:x86/urgent] x86/asm: Rewrite sync_core() to use IRET-to-self

Commit-ID: c198b121b1a1d7a7171770c634cd49191bac4477
Gitweb: http://git.kernel.org/tip/c198b121b1a1d7a7171770c634cd49191bac4477
Author: Andy Lutomirski <[email protected]>
AuthorDate: Fri, 9 Dec 2016 10:24:08 -0800
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100

x86/asm: Rewrite sync_core() to use IRET-to-self

Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID. Use IRET-to-self
instead. IRET-to-self works everywhere, so it makes testing easy.

For reference, On my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.

While we're at it: sync_core() serves a very specific purpose.
Document it.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: One Thousand Gnomes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Matthew Whitehead <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Henrique de Moraes Holschuh <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: xen-devel <[email protected]>
Link: http://lkml.kernel.org/r/5c79f0225f68bc8c40335612bf624511abb78941.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
1 file changed, 58 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b934871..eaf1005 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -602,33 +602,69 @@ static __always_inline void cpu_relax(void)
rep_nop();
}

-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ * a) Text was modified using one virtual address and is about to be executed
+ * from the same physical page at a different virtual address.
+ *
+ * b) Text was modified on a different CPU, may subsequently be
+ * executed on this CPU, and you want to make sure the new version
+ * gets executed. This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
static inline void sync_core(void)
{
- int tmp;
-
-#ifdef CONFIG_X86_32
/*
- * Do a CPUID if available, otherwise do a jump. The jump
- * can conveniently enough be the jump around CPUID.
+ * There are quite a few ways to do this. IRET-to-self is nice
+ * because it works on every CPU, at any CPL (so it's compatible
+ * with paravirtualization), and it never exits to a hypervisor.
+ * The only down sides are that it's a bit slow (it seems to be
+ * a bit more than 2x slower than the fastest options) and that
+ * it unmasks NMIs. The "push %cs" is needed because, in
+ * paravirtual environments, __KERNEL_CS may not be a valid CS
+ * value when we do IRET directly.
+ *
+ * In case NMI unmasking or performance ever becomes a problem,
+ * the next best option appears to be MOV-to-CR2 and an
+ * unconditional jump. That sequence also works on all CPUs,
+ * but it will fault at CPL3 (i.e. Xen PV and lguest).
+ *
+ * CPUID is the conventional way, but it's nasty: it doesn't
+ * exist on some 486-like CPUs, and it usually exits to a
+ * hypervisor.
+ *
+ * Like all of Linux's memory ordering operations, this is a
+ * compiler barrier as well.
*/
- asm volatile("cmpl %2,%1\n\t"
- "jl 1f\n\t"
- "cpuid\n"
- "1:"
- : "=a" (tmp)
- : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
- : "ebx", "ecx", "edx", "memory");
+ register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+ asm volatile (
+ "pushfl\n\t"
+ "pushl %%cs\n\t"
+ "pushl $1f\n\t"
+ "iret\n\t"
+ "1:"
+ : "+r" (__sp) : : "memory");
#else
- /*
- * CPUID is a barrier to speculative execution.
- * Prefetched instructions are automatically
- * invalidated when modified.
- */
- asm volatile("cpuid"
- : "=a" (tmp)
- : "0" (1)
- : "ebx", "ecx", "edx", "memory");
+ unsigned int tmp;
+
+ asm volatile (
+ "mov %%ss, %0\n\t"
+ "pushq %q0\n\t"
+ "pushq %%rsp\n\t"
+ "addq $8, (%%rsp)\n\t"
+ "pushfq\n\t"
+ "mov %%cs, %0\n\t"
+ "pushq %q0\n\t"
+ "pushq $1f\n\t"
+ "iretq\n\t"
+ "1:"
+ : "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
#endif
}


Subject: [tip:x86/urgent] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"

Commit-ID: 426d1aff3138cf38da14e912df3c75e312f96e9e
Gitweb: http://git.kernel.org/tip/426d1aff3138cf38da14e912df3c75e312f96e9e
Author: Andy Lutomirski <[email protected]>
AuthorDate: Fri, 9 Dec 2016 10:24:06 -0800
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 19 Dec 2016 11:54:20 +0100

Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"

This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.

The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID. Since we no
longer require CPUID for sync_core(), just revert the patch.

I think the relevant CPUs are Geode and Elan, but I'm not sure.

In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.

Reported-by: One Thousand Gnomes <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Matthew Whitehead <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Henrique de Moraes Holschuh <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: xen-devel <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/82acde18a108b8e353180dd6febcc2876df33f24.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/boot/cpu.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 4224ede..26240dd 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,12 +87,6 @@ int validate_cpu(void)
return -1;
}

- if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
- !has_eflag(X86_EFLAGS_ID)) {
- printf("This kernel requires a CPU with the CPUID instruction. Build with CONFIG_M486=y to run on this CPU.\n");
- return -1;
- }
-
if (err_flags) {
puts("This kernel requires the following features "
"not present on the CPU:\n");

Subject: [tip:x86/urgent] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels

Commit-ID: 1c52d859cb2d417e7216d3e56bb7fea88444cec9
Gitweb: http://git.kernel.org/tip/1c52d859cb2d417e7216d3e56bb7fea88444cec9
Author: Andy Lutomirski <[email protected]>
AuthorDate: Fri, 9 Dec 2016 10:24:05 -0800
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 19 Dec 2016 11:54:20 +0100

x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels

We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong. For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.

Reported-by: One Thousand Gnomes <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Matthew Whitehead <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Henrique de Moraes Holschuh <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: xen-devel <[email protected]>
Link: http://lkml.kernel.org/r/685bd083a7c036f7769510b6846315b17d6ba71f.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 6aa741f..b934871 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -607,7 +607,7 @@ static inline void sync_core(void)
{
int tmp;

-#ifdef CONFIG_M486
+#ifdef CONFIG_X86_32
/*
* Do a CPUID if available, otherwise do a jump. The jump
* can conveniently enough be the jump around CPUID.

Subject: [tip:x86/urgent] x86/microcode/intel: Replace sync_core() with native_cpuid()

Commit-ID: 484d0e5c7943644cc46e7308a8f9d83be598f2b9
Gitweb: http://git.kernel.org/tip/484d0e5c7943644cc46e7308a8f9d83be598f2b9
Author: Andy Lutomirski <[email protected]>
AuthorDate: Fri, 9 Dec 2016 10:24:07 -0800
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100

x86/microcode/intel: Replace sync_core() with native_cpuid()

The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1". I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.

Reported-by: Henrique de Moraes Holschuh <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: One Thousand Gnomes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Matthew Whitehead <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: xen-devel <[email protected]>
Link: http://lkml.kernel.org/r/535a025bb91fed1a019c5412b036337ad239e5bb.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/kernel/cpu/microcode/intel.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 54d50c3..b624b54 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -368,6 +368,26 @@ next:
return patch;
}

+static void cpuid_1(void)
+{
+ /*
+ * According to the Intel SDM, Volume 3, 9.11.7:
+ *
+ * CPUID returns a value in a model specific register in
+ * addition to its usual register return values. The
+ * semantics of CPUID cause it to deposit an update ID value
+ * in the 64-bit model-specific register at address 08BH
+ * (IA32_BIOS_SIGN_ID). If no update is present in the
+ * processor, the value in the MSR remains unmodified.
+ *
+ * Use native_cpuid -- this code runs very early and we don't
+ * want to mess with paravirt.
+ */
+ unsigned int eax = 1, ebx, ecx = 0, edx;
+
+ native_cpuid(&eax, &ebx, &ecx, &edx);
+}
+
static int collect_cpu_info_early(struct ucode_cpu_info *uci)
{
unsigned int val[2];
@@ -393,7 +413,7 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
native_wrmsrl(MSR_IA32_UCODE_REV, 0);

/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();

/* get the current revision from MSR 0x8B */
native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -593,7 +613,7 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
native_wrmsrl(MSR_IA32_UCODE_REV, 0);

/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();

/* get the current revision from MSR 0x8B */
native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -805,7 +825,7 @@ static int apply_microcode_intel(int cpu)
wrmsrl(MSR_IA32_UCODE_REV, 0);

/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();

/* get the current revision from MSR 0x8B */
rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);