Linux will have all kinds of sporadic problems on systems that don't
have the CPUID instruction unless CONFIG_M486=y. In particular,
sync_core() will explode.
I believe that these kernels had a better chance of working before
commit 05fb3c199bb0 ("x86/boot: Initialize FPU and X86_FEATURE_ALWAYS
even if we don't have CPUID"). That commit inadvertently fixed a
serious bug: we used to fail to detect the FPU if CPUID wasn't
present. Because we also used to forget to set X86_FEATURE_ALWAYS, we
end up with no cpu feature bits set at all. This meant that
alternative patching didn't do anything and, if paravirt was disabled,
we could plausibly finish the entire boot process without calling
sync_core().
Rather than trying to work around these issues, just have the kernel
fail loudly if it's running on a CPUID-less 486, doesn't have CPUID,
and doesn't have CONFIG_M486 set.
Reported-by: Matthew Whitehead <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
---
arch/x86/boot/cpu.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 26240dde081e..4224ede43b4e 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,6 +87,12 @@ int validate_cpu(void)
return -1;
}
+ if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
+ !has_eflag(X86_EFLAGS_ID)) {
+ printf("This kernel requires a CPU with the CPUID instruction. Build with CONFIG_M486=y to run on this CPU.\n");
+ return -1;
+ }
+
if (err_flags) {
puts("This kernel requires the following features "
"not present on the CPU:\n");
--
2.7.4
On Sat, Nov 19, 2016 at 03:37:30PM -0800, Andy Lutomirski wrote:
> Linux will have all kinds of sporadic problems on systems that don't
> have the CPUID instruction unless CONFIG_M486=y. In particular,
> sync_core() will explode.
Btw, I think we should do something like this, in addition:
---
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 50425dd7e113..ee9de769941f 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -105,6 +105,7 @@
#define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */
#define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
#define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
+#define X86_FEATURE_CPUID ( 3*32+31) /* CPU has CPUID instruction itself */
/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 ( 4*32+ 0) /* "pni" SSE-3 */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 1f6a92903b09..63aa4842c0ae 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -602,34 +602,21 @@ static __always_inline void cpu_relax(void)
rep_nop();
}
-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * Stop speculative execution and prefetching of modified code. CPUID is a
+ * barrier to speculative execution. Prefetched instructions are automatically
+ * invalidated when modified.
+ */
static inline void sync_core(void)
{
int tmp;
-#ifdef CONFIG_M486
- /*
- * Do a CPUID if available, otherwise do a jump. The jump
- * can conveniently enough be the jump around CPUID.
- */
- asm volatile("cmpl %2,%1\n\t"
- "jl 1f\n\t"
- "cpuid\n"
- "1:"
- : "=a" (tmp)
- : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
- : "ebx", "ecx", "edx", "memory");
-#else
- /*
- * CPUID is a barrier to speculative execution.
- * Prefetched instructions are automatically
- * invalidated when modified.
- */
- asm volatile("cpuid"
- : "=a" (tmp)
- : "0" (1)
- : "ebx", "ecx", "edx", "memory");
-#endif
+ /* Do a CPUID if available, otherwise do a forward jump. */
+ alternative_io("jmp 1f\n\t1:", "cpuid",
+ X86_FEATURE_CPUID,
+ "=a" (tmp),
+ "0" (1)
+ : "ebx", "ecx", "edx", "memory");
}
extern void select_idle_routine(const struct cpuinfo_x86 *c);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 90c007447507..8dcdcdeec569 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -800,14 +800,12 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
memset(&c->x86_capability, 0, sizeof c->x86_capability);
c->extended_cpuid_level = 0;
- if (!have_cpuid_p())
- identify_cpu_without_cpuid(c);
-
/* cyrix could have cpuid enabled via c_identify()*/
if (have_cpuid_p()) {
cpu_detect(c);
get_cpu_vendor(c);
get_cpu_cap(c);
+ setup_force_cpu_cap(X86_FEATURE_CPUID);
if (this_cpu->c_early_init)
this_cpu->c_early_init(c);
@@ -817,6 +815,9 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
if (this_cpu->c_bsp_init)
this_cpu->c_bsp_init(c);
+ } else {
+ identify_cpu_without_cpuid(c);
+ setup_clear_cpu_cap(X86_FEATURE_CPUID);
}
setup_force_cpu_cap(X86_FEATURE_ALWAYS);
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Nov 20, 2016 3:19 AM, "Borislav Petkov" <[email protected]> wrote:
>
> On Sat, Nov 19, 2016 at 03:37:30PM -0800, Andy Lutomirski wrote:
> > Linux will have all kinds of sporadic problems on systems that don't
> > have the CPUID instruction unless CONFIG_M486=y. In particular,
> > sync_core() will explode.
>
> Btw, I think we should do something like this, in addition:
>
> ---
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 50425dd7e113..ee9de769941f 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -105,6 +105,7 @@
> #define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */
> #define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
> #define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
> +#define X86_FEATURE_CPUID ( 3*32+31) /* CPU has CPUID instruction itself */
>
> /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
> #define X86_FEATURE_XMM3 ( 4*32+ 0) /* "pni" SSE-3 */
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 1f6a92903b09..63aa4842c0ae 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -602,34 +602,21 @@ static __always_inline void cpu_relax(void)
> rep_nop();
> }
>
> -/* Stop speculative execution and prefetching of modified code. */
> +/*
> + * Stop speculative execution and prefetching of modified code. CPUID is a
> + * barrier to speculative execution. Prefetched instructions are automatically
> + * invalidated when modified.
> + */
> static inline void sync_core(void)
> {
> int tmp;
>
> -#ifdef CONFIG_M486
> - /*
> - * Do a CPUID if available, otherwise do a jump. The jump
> - * can conveniently enough be the jump around CPUID.
> - */
> - asm volatile("cmpl %2,%1\n\t"
> - "jl 1f\n\t"
> - "cpuid\n"
> - "1:"
> - : "=a" (tmp)
> - : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
> - : "ebx", "ecx", "edx", "memory");
> -#else
> - /*
> - * CPUID is a barrier to speculative execution.
> - * Prefetched instructions are automatically
> - * invalidated when modified.
> - */
> - asm volatile("cpuid"
> - : "=a" (tmp)
> - : "0" (1)
> - : "ebx", "ecx", "edx", "memory");
> -#endif
> + /* Do a CPUID if available, otherwise do a forward jump. */
> + alternative_io("jmp 1f\n\t1:", "cpuid",
> + X86_FEATURE_CPUID,
> + "=a" (tmp),
> + "0" (1)
> + : "ebx", "ecx", "edx", "memory");
> }
This makes me nervous: don't some CPUs actually need the cpuid
instruction when patching alternatives? And with this approach, we
won't have the cpuid instruction there until after patching.
Why not change this function entirely:
write_cr2(0);
CR2 should be available on all 32-bit CPUs. It clobbers fewer
registers. More usefully, CPUID causes an exit when running under
most hypervisors, and that's quite slow. The only case I can think of
where CPUID should be faster than MOV to CR2 is on Xen PV before Ivy
Bridge, and I'm not sure I care about performance there.
(On Xen PV, it will do a hypercall instead, but the hypercall should
be good enough to serialize, too.)
Or we could do it dynamically:
bt $X86_FEATURE_CPUID, CPU_FLAGS(boot_cpu_data) # or whatever -- I
think we need to add an asm offset
jnc 1f # here's our jump
cpuid
1:
It's not like CPUID is fast enough that avoiding a predictable branch
will help measurably.
--Andy
On Sun, Nov 20, 2016 at 08:22:25AM -0800, Andy Lutomirski wrote:
> This makes me nervous: don't some CPUs actually need the cpuid
> instruction when patching alternatives?
Nope, we use boot_cpu_has() in apply_alternatives() if that is what you
mean.
> And with this approach, we won't have the cpuid instruction there
> until after patching.
We will have set (or not) the X86_FEATURE_CPUID bit at
early_identify_cpu() time. Looking at the code, we do call sync_core()
pretty early. :-\
Hmm, that #ifdef CONFIG_M486 is there for a reason.
> Why not change this function entirely:
>
> write_cr2(0);
>
> CR2 should be available on all 32-bit CPUs. It clobbers fewer
> registers.
Yap, just one. And not only that - it clobbers a register which gcc
doesn't have to reload at all.
> More usefully, CPUID causes an exit when running under
> most hypervisors, and that's quite slow. The only case I can think of
> where CPUID should be faster than MOV to CR2 is on Xen PV before Ivy
> Bridge, and I'm not sure I care about performance there.
>
> (On Xen PV, it will do a hypercall instead, but the hypercall should
> be good enough to serialize, too.)
A nop hypercall or whatever...
But yeah, pending a nod from hw people, this one sounds nice too. You
can do it basically on every CPU which supports paging. And that should
be all we support in Linux anyway.
> Or we could do it dynamically:
>
> bt $X86_FEATURE_CPUID, CPU_FLAGS(boot_cpu_data) # or whatever -- I
> think we need to add an asm offset
> jnc 1f # here's our jump
> cpuid
> 1:
We could... we did move the X86_FEATURE* things to a separate header so
that they can be used in asm too.
write_cr2(0) doesn't sound so bad either. Except what happens if
someone decides to sync_core() before the first line of do_page_fault()
executes... I know, it is unlikely but we do unlikely things :)
But yeah, the write_cr2() sounds better if one considers the lower
register pressure. Which is nice.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Sun, 20 Nov 2016, Borislav Petkov wrote:
> We will have set (or not) the X86_FEATURE_CPUID bit at
> early_identify_cpu() time. Looking at the code, we do call sync_core()
> pretty early. :-\
Hmm, watch out for the early microcode update driver for Intel
processors should something get changed in the implementation, or in the
behavior of sync_core().
That driver absolutely needs to issue a cpuid (with EAX = 1) before each
rdmsr(MSR_IA32_UCODE_REV). And it uses sync_core() calls to do it. A
CR2 access just won't do in this extremely specific case.
This kind of pitfall is why I wanted to replace all use of sync_core()
in arch/x86/kernel/cpu/microcode/intel.c with an explicit use of an
inconditional cpuid(eax = 1)...
(note: this protocol to read MSR_IA32_UCODE_REV was made an
architectural requirement a while ago -- it was once considered an
erratum workaround. It is documented in the "Intel 64 and IA‐32
Architectures Software Developer's Manual", Volume 3A: System
Programming Guide, Part 1, section 9.11).
--
Henrique Holschuh
On Sun, Nov 20, 2016 at 05:34:43PM -0200, Henrique de Moraes Holschuh wrote:
> On Sun, 20 Nov 2016, Borislav Petkov wrote:
> > We will have set (or not) the X86_FEATURE_CPUID bit at
> > early_identify_cpu() time. Looking at the code, we do call sync_core()
> > pretty early. :-\
>
> Hmm, watch out for the early microcode update driver for Intel
> processors should something get changed in the implementation, or in the
> behavior of sync_core().
That's exactly what I had in mind when I wrote the above sentence...
> That driver absolutely needs to issue a cpuid (with EAX = 1) before each
> rdmsr(MSR_IA32_UCODE_REV). And it uses sync_core() calls to do it. A
> CR2 access just won't do in this extremely specific case.
>
> This kind of pitfall is why I wanted to replace all use of sync_core()
> in arch/x86/kernel/cpu/microcode/intel.c with an explicit use of an
> inconditional cpuid(eax = 1)...
>
> (note: this protocol to read MSR_IA32_UCODE_REV was made an
> architectural requirement a while ago -- it was once considered an
> erratum workaround. It is documented in the "Intel 64 and IA‐32
> Architectures Software Developer's Manual", Volume 3A: System
> Programming Guide, Part 1, section 9.11).
Well, I'm looking at this:
"CPUID returns a value in a model specific register in addition to its
usual register return values. The semantics of CPUID cause it to deposit
an update ID value in the 64-bit model-specific register at address 08BH
(IA32_BIOS_SIGN_ID)."
And I think yes, we should do an explicit CPUID(1) regardless of
what happens to sync_core(). Feel free to send a patch against
tip:x86/microcode.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Commit-ID: ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98
Gitweb: http://git.kernel.org/tip/ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98
Author: Andy Lutomirski <[email protected]>
AuthorDate: Sat, 19 Nov 2016 15:37:30 -0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 21 Nov 2016 09:04:32 +0100
x86/boot: Fail the boot if !M486 and CPUID is missing
Linux will have all kinds of sporadic problems on systems that don't
have the CPUID instruction unless CONFIG_M486=y. In particular,
sync_core() will explode.
I believe that these kernels had a better chance of working before
commit 05fb3c199bb0 ("x86/boot: Initialize FPU and X86_FEATURE_ALWAYS
even if we don't have CPUID"). That commit inadvertently fixed a
serious bug: we used to fail to detect the FPU if CPUID wasn't
present. Because we also used to forget to set X86_FEATURE_ALWAYS, we
end up with no cpu feature bits set at all. This meant that
alternative patching didn't do anything and, if paravirt was disabled,
we could plausibly finish the entire boot process without calling
sync_core().
Rather than trying to work around these issues, just have the kernel
fail loudly if it's running on a CPUID-less 486, doesn't have CPUID,
and doesn't have CONFIG_M486 set.
Reported-by: Matthew Whitehead <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/70eac6639f23df8be5fe03fa1984aedd5d40077a.1479598603.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/boot/cpu.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 26240dd..4224ede 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,6 +87,12 @@ int validate_cpu(void)
return -1;
}
+ if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
+ !has_eflag(X86_EFLAGS_ID)) {
+ printf("This kernel requires a CPU with the CPUID instruction. Build with CONFIG_M486=y to run on this CPU.\n");
+ return -1;
+ }
+
if (err_flags) {
puts("This kernel requires the following features "
"not present on the CPU:\n");
> Rather than trying to work around these issues, just have the kernel
> fail loudly if it's running on a CPUID-less 486, doesn't have CPUID,
> and doesn't have CONFIG_M486 set.
NAK
This still breaks the Geode at the very least and I think the ELAN and
some of the other older socket 7 devices. These have their own config CPU
type (and in some cases *need* it selected) but do not have CPUID.
Given the cores without CPUID are often post 486 in other respects this
seems a bit odd. In fact I can't help thinking that the problem is that
sync_core tests CONFIG_M486 whereas we should have and test
HAVE_CPUID
and set this by processor type (M586/M486/GEODEGX1/GEODE_LX1/Cyrix plus I
think ELAN not having it)
I'd been wondering why Geode wasn't working on modern kernels, now I
think I know 8)
Alan
On Wed, Nov 30, 2016 at 5:18 AM, One Thousand Gnomes
<[email protected]> wrote:
>> Rather than trying to work around these issues, just have the kernel
>> fail loudly if it's running on a CPUID-less 486, doesn't have CPUID,
>> and doesn't have CONFIG_M486 set.
>
> NAK
>
> This still breaks the Geode at the very least and I think the ELAN and
> some of the other older socket 7 devices. These have their own config CPU
> type (and in some cases *need* it selected) but do not have CPUID.
>
> Given the cores without CPUID are often post 486 in other respects this
> seems a bit odd. In fact I can't help thinking that the problem is that
> sync_core tests CONFIG_M486 whereas we should have and test
>
> HAVE_CPUID
>
> and set this by processor type (M586/M486/GEODEGX1/GEODE_LX1/Cyrix plus I
> think ELAN not having it)
>
> I'd been wondering why Geode wasn't working on modern kernels, now I
> think I know 8)
>
Ick. Am I understanding correctly that this isn't a regression per se
since the affected machines were already broken? If so, let's fix it
for real rather than just reverting this patch.
--Andy