2001-11-16 21:30:21

by Dave Jones

Subject: [PATCH] AMD SMP capability sanity checking.


In the wake of the recent fallout of "are Athlon XP's SMP capable or not",
the following patch adds some sanity checking to the SMP boot up code.
This code is based upon information from the folks at AMD. There are
no exceptions to these rules.

Before sending this to Linus, I want to make sure I didn't do something
dumb, like misplace a bracket, isolating a valid config.
It works on systems I've tested it on so far, but obviously there are
some combinations that are not tested.

Any "But my system is fine in SMP and isn't in the list" whinges won't
get it added to the list. The list is compiled from AMD approved
valid systems, added to by any system which reports itself as
multiprocessor capable in its cpu flags.

Note, this code will not stop you from continuing to use unsupported
configurations, but will..
a. Print a boot time warning.
b. Taint any oopses so that SMP problem oopses can be isolated easily.

I repeat, there is *no* loss of functionality.

Patch against 2.4.15pre5 follows.

regards,

Dave.


--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

diff -urN --exclude-from=/home/davej/.exclude linux-2.4.15-pre5/arch/i386/kernel/setup.c linux-2.4.15-pre5-dj/arch/i386/kernel/setup.c
--- linux-2.4.15-pre5/arch/i386/kernel/setup.c Fri Nov 16 18:14:11 2001
+++ linux-2.4.15-pre5-dj/arch/i386/kernel/setup.c Fri Nov 16 18:30:29 2001
@@ -2707,7 +2707,7 @@
/* AMD-defined */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
NULL, NULL, NULL, "syscall", NULL, NULL, NULL, NULL,
- NULL, NULL, NULL, NULL, NULL, NULL, "mmxext", NULL,
+ NULL, NULL, NULL, "mp", NULL, NULL, "mmxext", NULL,
NULL, NULL, NULL, NULL, NULL, "lm", "3dnowext", "3dnow",

/* Transmeta-defined */
diff -urN --exclude-from=/home/davej/.exclude linux-2.4.15-pre5/arch/i386/kernel/smpboot.c linux-2.4.15-pre5-dj/arch/i386/kernel/smpboot.c
--- linux-2.4.15-pre5/arch/i386/kernel/smpboot.c Fri Oct 5 01:42:54 2001
+++ linux-2.4.15-pre5-dj/arch/i386/kernel/smpboot.c Fri Nov 16 21:09:33 2001
@@ -30,10 +30,12 @@
* Tigran Aivazian : fixed "0.00 in /proc/uptime on SMP" bug.
* Maciej W. Rozycki : Bits for genuine 82489DX APICs
* Martin J. Bligh : Added support for multi-quad systems
+ * Dave Jones : Report invalid combinations of Athlon CPUs.
*/

#include <linux/config.h>
#include <linux/init.h>
+#include <linux/kernel.h>

#include <linux/mm.h>
#include <linux/kernel_stat.h>
@@ -156,6 +158,35 @@
* Remember we have B step Pentia with bugs
*/
smp_b_stepping = 1;
+
+ /*
+ * Certain Athlons might work (for various values of 'work') in SMP
+ * but they are not certified as MP capable.
+ */
+ if ((c->x86_vendor == X86_VENDOR_AMD) && (c->x86 == 6)) {
+
+ /* Athlon 660/661 is valid. */
+ if ((c->x86_model==6) && ((c->x86_mask==0) || (c->x86_mask==1)))
+ goto valid_athlon;
+
+ /* Duron 670 is valid */
+ if ((c->x86_model==7) && (c->x86_mask==0))
+ goto valid_athlon;
+
+ /* Athlon 662, Duron 671, and Athlon >model 7 have capability bit */
+ if (((c->x86_model==6) && (c->x86_mask>=2)) ||
+ ((c->x86_model==7) && (c->x86_mask>=1)) ||
+ (c->x86_model> 7))
+ if (cpu_has_mp)
+ goto valid_athlon;
+
+ /* If we get here, it's not a certified SMP capable AMD system. */
+ printk (KERN_INFO "WARNING: This combination of AMD processors is not suitable for SMP.\n");
+ tainted |= (1<<2);
+
+ }
+valid_athlon:
+
}

/*
diff -urN --exclude-from=/home/davej/.exclude linux-2.4.15-pre5/include/asm-i386/cpufeature.h linux-2.4.15-pre5-dj/include/asm-i386/cpufeature.h
--- linux-2.4.15-pre5/include/asm-i386/cpufeature.h Mon Nov 13 05:55:50 2000
+++ linux-2.4.15-pre5-dj/include/asm-i386/cpufeature.h Fri Nov 16 18:29:24 2001
@@ -46,6 +46,7 @@
/* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
/* Don't duplicate feature flags which are redundant with Intel! */
#define X86_FEATURE_SYSCALL (1*32+11) /* SYSCALL/SYSRET */
+#define X86_FEATURE_MP (1*32+19) /* MP Capable. */
#define X86_FEATURE_MMXEXT (1*32+22) /* AMD MMX extensions */
#define X86_FEATURE_LM (1*32+29) /* Long Mode (x86-64) */
#define X86_FEATURE_3DNOWEXT (1*32+30) /* AMD 3DNow! extensions */
diff -urN --exclude-from=/home/davej/.exclude linux-2.4.15-pre5/include/asm-i386/processor.h linux-2.4.15-pre5-dj/include/asm-i386/processor.h
--- linux-2.4.15-pre5/include/asm-i386/processor.h Fri Nov 16 18:14:14 2001
+++ linux-2.4.15-pre5-dj/include/asm-i386/processor.h Fri Nov 16 19:08:34 2001
@@ -90,6 +90,7 @@
#define cpu_has_xmm (test_bit(X86_FEATURE_XMM, boot_cpu_data.x86_capability))
#define cpu_has_fpu (test_bit(X86_FEATURE_FPU, boot_cpu_data.x86_capability))
#define cpu_has_apic (test_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability))
+#define cpu_has_mp (test_bit(X86_FEATURE_MP, boot_cpu_data.x86_capability))

extern char ignore_irq13;

diff -urN --exclude-from=/home/davej/.exclude linux-2.4.15-pre5/kernel/panic.c linux-2.4.15-pre5-dj/kernel/panic.c
--- linux-2.4.15-pre5/kernel/panic.c Sun Sep 30 19:26:08 2001
+++ linux-2.4.15-pre5-dj/kernel/panic.c Fri Nov 16 20:46:17 2001
@@ -103,6 +103,10 @@
/**
* print_tainted - return a string to represent the kernel taint state.
*
+ * 'P' - Proprietory module has been loaded.
+ * 'F' - Module has been forcibly loaded.
+ * 'S' - SMP with CPUs not designed for SMP.
+ *
* The string is overwritten by the next call to print_taint().
*/

@@ -112,7 +116,8 @@
if (tainted) {
snprintf(buf, sizeof(buf), "Tainted: %c%c",
tainted & 1 ? 'P' : 'G',
- tainted & 2 ? 'F' : ' ');
+ tainted & 2 ? 'F' : ' ',
+ tainted & 4 ? 'S' : ' ');
}
else
snprintf(buf, sizeof(buf), "Not tainted");


2001-11-16 21:43:02

by Randy.Dunlap

Subject: Re: [PATCH] AMD SMP capability sanity checking.

Dave-
A couple of minor comments below.

~Randy

Dave Jones wrote:
>
> Note, this code will not stop you from continuing to use unsupported
> configurations, but will..
> a. Print a boot time warning.
> b. Taint any oopses so that SMP problem oopses can be isolated easily.
>
> I repeat, there is *no* loss of functionality.
>
> Patch against 2.4.15pre5 follows.
>
> diff -urN --exclude-from=/home/davej/.exclude linux-2.4.15-pre5/arch/i386/kernel/smpboot.c linux-2.4.15-pre5-dj/arch/i386/kernel/smpboot.c
> --- linux-2.4.15-pre5/arch/i386/kernel/smpboot.c Fri Oct 5 01:42:54 2001
> +++ linux-2.4.15-pre5-dj/arch/i386/kernel/smpboot.c Fri Nov 16 21:09:33 2001
> + printk (KERN_INFO "WARNING: This combination of AMD processors is not suitable for SMP.\n");
> + tainted |= (1<<2);

Some bit #defines for <tainted> would be nice (instead of magic
numbers).


> diff -urN --exclude-from=/home/davej/.exclude linux-2.4.15-pre5/kernel/panic.c linux-2.4.15-pre5-dj/kernel/panic.c
> --- linux-2.4.15-pre5/kernel/panic.c Sun Sep 30 19:26:08 2001
> +++ linux-2.4.15-pre5-dj/kernel/panic.c Fri Nov 16 20:46:17 2001
> @@ -103,6 +103,10 @@
> /**
> * print_tainted - return a string to represent the kernel taint state.
> *
> + * 'P' - Proprietory module has been loaded.
> + * 'F' - Module has been forcibly loaded.
> + * 'S' - SMP with CPUs not designed for SMP.
> + *
> * The string is overwritten by the next call to print_taint().
> */
>
> @@ -112,7 +116,8 @@
> if (tainted) {
> snprintf(buf, sizeof(buf), "Tainted: %c%c",
>>>>>>>>>>>>>>>>>> %c%c%c <<<<<<<<<<<<<

> tainted & 1 ? 'P' : 'G',
> - tainted & 2 ? 'F' : ' ');
> + tainted & 2 ? 'F' : ' ',
> + tainted & 4 ? 'S' : ' ');
> }
> else
> snprintf(buf, sizeof(buf), "Not tainted");
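
Both review comments (named constants for the taint bits, and a third %c to match the third argument) could be addressed along these lines. This is only a sketch under assumptions: the TAINT_* names and format_tainted() are hypothetical and are not taken from the posted patch, and the function is written for userspace so it can be compiled on its own.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names for the three taint bits used in the patch. */
#define TAINT_PROPRIETARY_MODULE (1 << 0)   /* 'P' */
#define TAINT_FORCED_MODULE      (1 << 1)   /* 'F' */
#define TAINT_UNSAFE_SMP         (1 << 2)   /* 'S' */

/* Writes the taint string into buf and returns buf. */
static const char *format_tainted(int tainted, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (tainted) {
        /* Three %c conversions, one per flag argument. */
        snprintf(buf, len, "Tainted: %c%c%c",
                 tainted & TAINT_PROPRIETARY_MODULE ? 'P' : 'G',
                 tainted & TAINT_FORCED_MODULE ? 'F' : ' ',
                 tainted & TAINT_UNSAFE_SMP ? 'S' : ' ');
    } else {
        snprintf(buf, len, "Not tainted");
    }
    return buf;
}
```

The smpboot.c hunk would then read "tainted |= TAINT_UNSAFE_SMP;" instead of "tainted |= (1<<2);".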

2001-11-16 21:45:12

by Dan Hollis

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Fri, 16 Nov 2001, Dave Jones wrote:
> Any "But my system is fine in SMP and isn't in the list" whinges won't
> get it added to the list.

Presumably you will add the same for intel chips (eg celerons).

-Dan
--
[-] Omae no subete no kichi wa ore no mono da. [-]

2001-11-16 21:47:22

by Dave Jones

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Fri, 16 Nov 2001, Randy.Dunlap wrote:

> Dave-
> A couple of minor comments below.
> Some bit #defines for <tainted> would be nice (instead of magic
> numbers).

Agreed.

> > snprintf(buf, sizeof(buf), "Tainted: %c%c",
> >>>>>>>>>>>>>>>>>> %c%c%c <<<<<<<<<<<<<

Yup. Thanks for that.

Those fixes, plus fixing borken indentation I introduced will be in
-2 at http://www.codemonkey.org.uk/cruft/
in a while.

regards,

Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2001-11-16 21:47:22

by Jeff Garzik

Subject: Re: [PATCH] AMD SMP capability sanity checking.

Dave Jones wrote:
> + /* If we get here, it's not a certified SMP capable AMD system. */
> + printk (KERN_INFO "WARNING: This combination of AMD processors is not suitable for SMP.\n");
> + tainted |= (1<<2);
> +

having a constant instead of setting magic bit 2 would be nice

--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno

2001-11-16 21:59:02

by Stefan Smietanowski

Subject: Re: [PATCH] AMD SMP capability sanity checking.

Hi.

<snip description of code that checks for valid AMD SMP capable CPUs>

Would you mind writing what each of these actually is?

Athlon 661 doesn't tell me much, neither does Duron 671.

That's just an example, which is which?

// Stefan


2001-11-16 22:02:32

by Dave Jones

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Fri, 16 Nov 2001, Dan Hollis wrote:

> > Any "But my system is fine in SMP and isn't in the list" whinges won't
> > get it added to the list.
> Presumably you will add the same for intel chips (eg celerons).

As Intel never announced they were capable either, by rights they
should also get tainted imo. For now I'm working on getting the
Athlons sorted out though. There's enough different models of those
to keep me busy at the moment.

regards,
Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2001-11-16 22:12:22

by Dave Jones

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Fri, 16 Nov 2001, Stefan Smietanowski wrote:

> Would you mind writing what each of these actually is?
> Athlon 661 doesn't tell me much, neither does Duron 671.
> That's just an example, which is which?

The numbers translate to the family/model/stepping fields
of /proc/cpuinfo.

The only older models certified as safe for SMP are.

Athlon model 6, stepping 0 CPUID = 660
Athlon model 6, stepping 1 CPUID = 661
Duron model 7, stepping 0 CPUID = 670

The newer models..
model 6 stepping 2 and above 662
model 7 stepping 1 and above 671

have a cpuid flag that must be compared to find out if they
are capable or not. Note that these id's tally with XP's and MP's.
The capability bit is the only way to distinguish between these models.

Hope this makes it clearer.

regards,

Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2001-11-16 22:28:03

by Gérard Roudier

Subject: Re: [PATCH] AMD SMP capability sanity checking.



On Fri, 16 Nov 2001, Dave Jones wrote:

> In the wake of the recent fallout of "are Athlon XP's SMP capable or not",
> the following patch adds some sanity checking to the SMP boot up code.
> This code is based upon information from the folks at AMD. There are
> no exceptions to these rules.

Hmmm.... What about odd CPUs replacement?

Years ago, Intel replaced my Pentium 90, which wasn't suitable :) for FDIV,
free of charge, and they didn't ask me whether I needed FDIV to give
correct results or not.

'Not certified', 'not suitable for', ...: why such delicate wording for
what is just a 'bug' or 'erratum', the wording that has been used for years
for this kind of defect? If we want to use wording targeted at idiots, we
should apply it equally to all parties, and in particular not write that
CPU A is 'not suitable for something' in one place but that CPU B is 'bogus
regarding something' in another.

Just my 0.02 euros.

Btw, my two 1.2GHz Athlons cost me much more, and I will soon know if I
paid that much just for sand. :-)

Gérard.


2001-11-16 22:35:43

by Jeff Golds

Subject: Re: [PATCH] AMD SMP capability sanity checking.

Dave Jones wrote:
>
> On Fri, 16 Nov 2001, Stefan Smietanowski wrote:
>
> > Would you mind writing what each of these actually is?
> > Athlon 661 doesn't tell me much, neither does Duron 671.
> > That's just an example, which is which?
>
> The numbers translate to the family/model/stepping fields
> of /proc/cpuinfo.
>
> The only older models certified as safe for SMP are.
>
> Athlon model 6, stepping 0 CPUID = 660
> Athlon model 6, stepping 1 CPUID = 661
> Duron model 7, stepping 0 CPUID = 670
>
> The newer models..
> model 6 stepping 2 and above 662
> model 7 stepping 1 and above 671
>
> have a cpuid flag that must be compared to find out if they
> are capable or not. Note that these id's tally with XP's and MP's.
> The capability bit is the only way to distinguish between these models.
>

So the MP has the SMP capable bit set and the XP does not?

If so, I'm not convinced this is the correct way to approach this
issue. My reasoning is based on the fact that AMD is not exactly an
impartial source of information. AMD wants to sell more MP chips, so
they can say that only MP chips are SMP capable even if XP chips work
just fine.

Now, with your patch, if people successfully use XP chips in an SMP
configuration, you're giving the maintainers of the Linux kernel the
opportunity to ignore oopses reported from these people and I think
that's a bad thing. If someone can show that XPs are truly not SMP
capable, then, by all means, implement your patch as written.

The way I'd prefer to see this handled is that things are assumed to
work until proven otherwise. Sort of like the SMP Celeron systems
people have been using: Is there _any_ reason to believe that Celerons
can't do SMP? Sure doesn't seem like it except for Intel's statement
that Celerons aren't SMP capable. And if you decide to taint oopses
from people with such configurations, I think you'll be doing the Linux
community a disservice.

-Jeff

P.S. BTW, I don't know all the Athlon steppings, but it sure looks like
_a lot_ of older Athlons/Durons are SMP capable. Does it seem likely
that this suddenly changed when AMD stamped XP or MP on the chip?

--
Jeff Golds
Sr. Software Engineer
[email protected]

2001-11-16 23:04:33

by Dave Jones

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Fri, 16 Nov 2001, Jeff Golds wrote:

> So the MP has the SMP capable bit set and the XP does not?

Yes.

> If so, I'm not convinced this is the correct way to approach this
> issue. My reasoning is based on the fact that AMD is not exactly an
> impartial source of information. AMD wants to sell more MP chips, so
> they can say that only MP chips are SMP capable even if XP chips work
> just fine.

What's probably closer to the truth is..

        make cpu
           |
 smp tests run ok ? ------> No, sell as XP
           |
    yes, sell as MP

It's the same way tests are done to see if they can run at 2GHz. If any
test fails, it's retried at 1.9GHz, then 1.8GHz, until the tests pass.
One chip from a yield may run fine at certain speeds, whilst others don't.
How is this relevant?
Well, overclockers found that a sample from a yield wasn't representative
of all CPU silicon from that yield, and that some 1800's run at 1900 with no
problem. Just as SOME XP users are reporting problems in SMP whilst
some are not.

Burning out a fuse to make the switch from MP->XP may affect more
than just the cpuid capabilities. The fact is _we don't know_

> The way I'd prefer to see this handled is that things are assumed to
> work until proven otherwise. Sort of like the SMP Celeron systems
> people have been using: Is there _any_ reason to believe that Celerons
> can't do SMP?

I've yet to see a socket 370 dual processor motherboard that
I'd put faith in for a mission critical environment.
"I had no problems" means _nothing_ when there's as few as 1 other
user reporting SMP related problems with the same setup.

> P.S. BTW, I don't know all the Athlon steppings, but it sure looks like
> _a lot_ of older Athlons/Durons are SMP capable.

From my original message:

> > The only older models certified as safe for SMP are.
> > Athlon model 6, stepping 0 CPUID = 660
> > Athlon model 6, stepping 1 CPUID = 661
> > Duron model 7, stepping 0 CPUID = 670

Three models. There are considerably more out there.
Note that model 6 isn't really 'old'; a Thunderbird, for example, is
model 4. "Old" was used relatively in my original mail.

regards,
Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2001-11-16 23:40:27

by Stefan Smietanowski

Subject: Re: [PATCH] AMD SMP capability sanity checking.

Hi Dave.

>>Would you mind writing what each of these actually is?
>>Athlon 661 doesn't tell me much, neither does Duron 671.
>>That's just an example, which is which?
>
> The numbers translate to the family/model/stepping fields
> of /proc/cpuinfo.


Yeah, I know. That was the easy bit.

> The only older models certified as safe for SMP are.
>
> Athlon model 6, stepping 0 CPUID = 660
> Athlon model 6, stepping 1 CPUID = 661
> Duron model 7, stepping 0 CPUID = 670


Ok, since you're misunderstanding me, where do I find out which is
which, ie CPUID 660 is an ... and CPUID 670 is an ...

Point me to some good place to find out and I'm happy.

I'll try looking on http://www.amd.com to see if I can find it myself :)

> The newer models..
> model 6 stepping 2 and above 662
> model 7 stepping 1 and above 671
>
> have a cpuid flag that must be compared to find out if they
> are capable or not. Note that these id's tally with XP's and MP's.
> The capability bit is the only way to distinguish between these models.


Right, all I'd need is a way to match these numbers to core names. :)


// Stefan


2001-11-16 23:47:37

by Dave Jones

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Sat, 17 Nov 2001, Stefan Smietanowski wrote:

> Ok, since you're misunderstanding me, where do I find out which is
> which, ie CPUID 660 is an ... and CPUID 670 is an ...

Ah, gotcha. Not sure off hand of any resource.
My x86info program has them documented in source form..
ftp://ftp.suse.com/pub/people/davej/x86info/

I'll extrapolate those into a human readable table, and put
it on my webpage sometime.. I've been meaning to put up
x86info dumps from various cpu's on there actually.
(I'll take this opportunity to ask anyone with a few spare
minutes to send x86info -a output to me (NOT to linux-kernel btw))

> Point me to some good place to find out and I'm happy.

If you want to look at that source, it's in AMD/identify.c
The last released version isn't aware of MP/XP's, but has the
earlier models covered.

regards,

Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2001-11-17 00:31:03

by Jeff Golds

Subject: Re: [PATCH] AMD SMP capability sanity checking.

Dave Jones wrote:
>
> Burning out a fuse to make the switch from MP->XP may affect more
> than just the cpuid capabilities. The fact is _we don't know_
>

Right, so why assume it doesn't work?

> I've yet to see a socket 370 dual processor motherboard that
> I'd put faith in for a mission critical environment.
> "I had no problems" means _nothing_ when there's as few as 1 other
> user reporting SMP related problems with the same setup.
>

People with "true" SMP CPUs can have problems as well. Does this mean
SMP CPUs are not SMP capable? If only one person is having problems,
chances are there's a problem someplace. Could it be a faulty
motherboard? Mismatched CPUs? Bad memory? Bad CPU? Bad power
supply? There's an awful lot of variables here.

-Jeff

--
Jeff Golds
Sr. Software Engineer
[email protected]

2001-11-17 00:39:43

by Dave Jones

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Fri, 16 Nov 2001, Jeff Golds wrote:

> > Burning out a fuse to make the switch from MP->XP may affect more
> > than just the cpuid capabilities. The fact is _we don't know_
> Right, so why assume it doesn't work?

Because there are cases where it. does. not. work.

> People with "true" SMP CPUs can have problems as well. Does this mean
> SMP CPUs are not SMP capable? If only one person is having problems,
> chances are there's a problem someplace.

Such problems get researched, and warnings such as the one in
my patch get added. Take a look through smpboot.c and friends.
We support a lot of broken hardware, but there's a difference between
broken (buggy) and running something outside of its specification.

> Could it be a faulty motherboard?

The reason we have DMI table scanning on boot up.

> Mismatched CPUs?

Another unsupported configuration we should at least warn about.
Note however, that some quad systems allow 2 different pairs.

> Bad memory?

The reason memtest86 came to be.

> Bad CPU?

See errata workarounds in smpboot.c & setup.c

> Bad power supply?

Running underrated PSUs on modern hw is asking for trouble.
Unless you think AMD approved PSUs are another marketing gimmick
to make people pay out more.

> There's an awful lot of variables here.

Sure. And this eliminates one such variable.

regards,

Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2001-11-17 01:00:06

by Andreas Boman

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Sat, 17 Nov 2001 01:39:21 +0100 (CET)
Dave Jones <[email protected]> wrote:

> On Fri, 16 Nov 2001, Jeff Golds wrote:
>
> > > Burning out a fuse to make the switch from MP->XP may affect more
> > > than just the cpuid capabilities. The fact is _we don't know_
> > Right, so why assume it doesn't work?
>
> Because there are cases where it. does. not. work.
>

Well, the case you cited in your first mail turned out to be a GPM glitch,
solved by plugging in a mouse. Nothing to do with SMP whatsoever. I have
yet to hear of any K7 CPU(s) that will _not_ work in SMP mode. Care to
share a few real cases where that is the case (i.e. with all other factors
ruled out)?

Andreas

2001-11-17 01:05:25

by Stefan Smietanowski

Subject: Re: [PATCH] AMD SMP capability sanity checking.

Hi.

>>Ok, since you're misunderstanding me, where do I find out which is
>>which, ie CPUID 660 is an ... and CPUID 670 is an ...
>>
>
> Ah, gotcha. Not sure off hand of any resource.
> My x86info program has them documented in source form..
> ftp://ftp.suse.com/pub/people/davej/x86info/


Good enough for me.

> I'll extrapolate those into a human readable table, and put
> it on my webpage sometime.. I've been meaning to put up
> x86info dumps from various cpu's on there actually.
> (I'll take this opportunity to ask anyone with a few spare
> minutes to send -a output to me (NOT to linux-kernel btw))
>
>
>>Point me to some good place to find out and I'm happy.
>>
>
> If you want to look at that source, its in AMD/identify.c
> The last released version isn't aware of MP/XP's, but has the
> earlier models covered.

Umm. Tell me I'm wrong, but didn't your patch say the 670 was ok for SMP?

The 670, according to your program, is a Duron (Morgan core).

So the Morgan Duron is ok for SMP and the Palomino Athlon XP is not?

*bashes hand against head*

// Stefan


2001-11-17 01:10:27

by Dan Hollis

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Sat, 17 Nov 2001, Stefan Smietanowski wrote:
> Umm. Tell me I'm wrong, but didn't your patch say the 670 was ok for SMP?
> The 670, according to your program, is a Duron (Morgan core).
> So the Morgan Duron is ok for SMP and the Palomino Athlon XP is not?

I thought AMD publicly says the Duron is not OK for SMP... they sure say it
for the XP.

-Dan
--
[-] Omae no subete no kichi wa ore no mono da. [-]

2001-11-17 01:11:47

by Dave Jones

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Sat, 17 Nov 2001, Stefan Smietanowski wrote:

> Umm. Tell me I'm wrong, but didn't your patch say the 670 was ok for SMP ?

correct.

> So the Morgan Duron is ok for SMP and the Palomino Athlon XP is not?

Looks that way.

regards,
Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2001-11-17 01:13:06

by Dave Jones

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Fri, 16 Nov 2001, Dan Hollis wrote:

> I thought AMD publicly says the Duron is not OK for SMP... they sure say it
> for the XP.

So far, only two types of Duron. Model 7 stepping 0 and stepping 1.

regards,
Dave.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2001-11-17 01:54:44

by Mike Fedyk

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Sat, Nov 17, 2001 at 01:39:21AM +0100, Dave Jones wrote:
> On Fri, 16 Nov 2001, Jeff Golds wrote:
> > Mismatched CPUs?
>
> Another unsupported configuration we should at least warn about.

This I would like to see.

2001-11-17 04:14:20

by Nathan Walp

Subject: Re: [PATCH] AMD SMP capability sanity checking.

> What's probably closer to the truth is..
>
>         make cpu
>            |
>  smp tests run ok ? ------> No, sell as XP
>            |
>     yes, sell as MP
>

Actually, it's probably closer to:

        make cpu
           |
   smp tests run ok? -------> No, sell as XP
           |
yes, do we have more demand
for XPs than we have supply
of those that didn't pass? -------> Yes, sell as XP
           |
     No, sell as MP

Remember, AMD is just trying to make a buck. If they've got a bunch of
MP CPUs "sitting on the shelves" while no one can get their hands on the
XPs, some of those MPs are going to "become" XPs. For those of us on a
budget, we can only hope to get one of *those* variety of XPs.

Now, that said, I'm probably going to buy MPs when I build my machine,
as long as the price difference stays at the current low levels.
Consider it a "warranty" or something.

Just my $0.02

Nathan

--
Nathan Walp || [email protected]
GPG Fingerprint: || http://faceprint.com/
5509 6EF3 928B 2363 9B2B DA17 3E46 2CDC 492D DB7E



2001-11-17 04:59:39

by Mark Orr

Subject: Re: [PATCH] AMD SMP capability sanity checking.

On Fri, 16 Nov 2001 23:11:41 -0500
[email protected] (Nathan Walp) wrote:

> Actually, it's probably closer to:
>
>         make cpu
>            |
>    smp tests run ok? -------> No, sell as XP
>            |
> yes, do we have more demand
> for XPs than we have supply
> of those that didn't pass? -------> Yes, sell as XP
>            |
>      No, sell as MP

Yes, this is much closer to what's happening. I'd bet that most
Palomino chips would pass the smp tests, meaning many more MPs than
they'd ever need. They're probably just putting a bucket in the
manufacturing stream, testing those, and putting the rejects back
in the stream.

> Remember, AMD is just trying to make a buck. If they've got a bunch of
> MP CPUs "sitting on the shelves" while no one can get their hands on the
> XPs, some of those MPs are going to "become" XPs. For those of us on a
> budget, we can only hope to get one of *those* variety of XPs.

Umm... I can't see chips that have already been marked as MPs being
converted to XPs. Odds are the ratio of XP to MP is probably 10:1
or greater.

> Now, that said, I'm probably going to buy MPs when I build my machine,
> as long as the price difference stays at the current low levels.
> Consider it a "warranty" or something.

...and considering AMD doesn't lag in bringing out MPs. Right now
XP 1900s are widely available, but the highest speed MPs are 1800s.

I've heard that the AMD 762 northbridge only works up to 12.5x133
(1666MHz) so they'll hit the wall with the MPs pretty soon unless they
have an updated stepping.

2001-11-22 10:42:01

by Pavel Machek

Subject: Re: [PATCH] AMD SMP capability sanity checking.

Hi!

> In the wake of the recent fallout of "are Athlon XP's SMP capable or not",
> the following patch adds some sanity checking to the SMP boot up code.
> This code is based upon information from the folks at AMD. There are
> no exceptions to these rules.

Are there public errata documents which say what's wrong with older
Athlons? Without that info I think this patch is a bad idea.
Pavel
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.