2002-01-01 02:12:38

by Chris Croswhite

[permalink] [raw]
Subject: Dual athlon XP 1800 problems

I am having problems with dual Athlons and more than 512M RAM. I have
compiled several kernels with ATHLON, i386, or i686 support, all with the
same effect: I get a kernel that fails to boot properly. Sometimes
I get a kernel panic that drops to kdb, sometimes I get a freeze, and
sometimes I get "failed to mount root partition", but this kernel has
never come up successfully. I am quite certain it is not the memory or
the system (I can get windblows 2k to run successfully with up to 3.5G
RAM).

Here is the configuration:

Tyan S2460
Dual Athlon XP 1800
512M DDR DIMMS (also used 128, 256, and 1G)
Western Digital 20G Drive

Kernels 2.4.9, 2.4.16, 2.4.17, 2.5.2
configured for 4G
configured with i386, i686, and Athlon processor support
configured with XFS support
configured with kdb support

Is there a patch for this? Am I configuring something wrong in the
kernel?

TIA
Chris Croswhite


2002-01-01 03:21:48

by Dave Jones

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Mon, Dec 31, 2001 at 06:12:16PM -0800, [email protected] wrote:
> I am having problems with dual athlons and more than 512M RAM.
> Here is the configuration:
> Dual Athlon XP 1800
^^
Not a valid SMP configuration. Some people are getting away with
running XP's in SMP boxes, others aren't. And soon, it looks like
no new XP's will run in SMP at all.

Take one out/boot a UP kernel, and see if the problems go away.

Dave.

--
Dave Jones. http://www.codemonkey.org.uk
SuSE Labs.

2002-01-01 03:53:44

by Jeffrey H. Ingber

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

I'm using a pair of XP 1700's just fine in an SMP configuration, so it
appears to be valid (and does in fact function in SMP).

Jeffrey H. Ingber (jhingber _at_ ix.netcom.com)

On Mon, 2001-12-31 at 22:23, Dave Jones wrote:
> On Mon, Dec 31, 2001 at 06:12:16PM -0800, [email protected] wrote:
> > I am having problems with dual athlons and more than 512M RAM.
> > Here is the configuration:
> > Dual Athlon XP 1800
> ^^
> Not a valid SMP configuration. Some people are getting away with
> running XP's in SMP boxes, others aren't. And soon, it looks like
> no new XP's will run in SMP at all.
>
> Take one out/boot a UP kernel, and see if the problems go away.
>
> Dave.
>
> --
> Dave Jones. http://www.codemonkey.org.uk
> SuSE Labs.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/


2002-01-01 04:46:53

by Adam Schrotenboer

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Monday 31 December 2001 22:40, Jeffrey H. Ingber wrote:
> I'm using a pair of XP 1700's just fine in an SMP configuration, so it
> appears to be valid (and does in fact function in SMP).
>

Irrelevant. This issue was discussed to death a couple of weeks ago. Some people
got it working, some didn't. No discernible stepping issues. Don't argue the
facts. PLEASE

2002-01-01 06:59:41

by Shaya Potter

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

I missed the original message. I'm running a dual Athlon XP 1800+ system
(both CPUs actually get identified as MPs for some reason, but they are
retail XP chips).

I assume you are using a Tyan Tiger or Thunder motherboard (as I don't
know of any others in retail yet). Is your RAM registered ECC RAM? If
it's not, that is most likely your problem.

Is your RAM from Tyan's approved list? Mine isn't (it's Crucial; I have
never had bad luck with them, even though others seem to have had some
problems), but I would see whether anything changes with RAM from
suppliers Tyan recommends.

On the issue of future motherboards not supporting XPs at all, I've read
in multiple places that the rumor to that effect was false, and that the
motherboards will not lock out XPs; they just won't be supported (much
like motherboard manufacturers don't support overclocking, but many give
you the means to do it). If you have information direct from the
manufacturers, then I have no reason not to believe you.

On Mon, 2001-12-31 at 22:23, Dave Jones wrote:
> On Mon, Dec 31, 2001 at 06:12:16PM -0800, [email protected] wrote:
> > I am having problems with dual athlons and more than 512M RAM.
> > Here is the configuration:
> > Dual Athlon XP 1800
> ^^
> Not a valid SMP configuration. Some people are getting away with
> running XP's in SMP boxes, others aren't. And soon, it looks like
> no new XP's will run in SMP at all.
>
> Take one out/boot a UP kernel, and see if the problems go away.
>
> Dave.
>
> --
> Dave Jones. http://www.codemonkey.org.uk
> SuSE Labs.


2002-01-01 15:22:16

by Dave Gilbert (Home)

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Mon, Dec 31, 2001 at 06:12:16PM -0800, [email protected] wrote:

> I am having problems with dual Athlons and more than 512M RAM. I have
> compiled several kernels with ATHLON, i386, or i686 support, all with the
> same effect: I get a kernel that fails to boot properly. Sometimes
> I get a kernel panic that drops to kdb, sometimes I get a freeze, and
> sometimes I get "failed to mount root partition", but this kernel has
> never come up successfully. I am quite certain it is not the memory or
> the system (I can get windblows 2k to run successfully with up to 3.5G
> RAM).
>
> Here is the configuration:
>
> Tyan S2460
> Dual Athlon XP 1800
> 512M DDR DIMMS (also used 128, 256, and 1G)
> Western Digital 20G Drive

I have a similar system running fine. It has a Tyan S2460, a pair of
Athlon MP 1800s, 512M (2x256) and a pair of IBM 60G drives.

I haven't seen any signs of kernel instability. However:

1) When I first got it I had the BIOS do some very odd things; at one
point the CMOS got cleared and then everything worked from there on
in. So a good CMOS clean could be in order. I had to use the Debian
safe boot set prior to this.

2) Are you saying the problem only affects greater than 512M ? I only
have the 512M so don't know - but it is probably worth booting with
mem=512M as an option with more RAM in and see if it is stable.

3) The guys who put the machine together had lots of problems getting
it stable; the type of RAM they used was critical; it was stable
enough for them to boot NT and get it through a lot of tests before
they hit problems.

4) COOL IT - these things generate tons of heat (mine run at 75degC
normal operation). I have it in a big Supermicro 760 case with damn
big fans on.

5) I bought Athlon MPs because I didn't want the hassle of knowing
whether XPs would work or not. Now sure, it could be AMD just trying
to squeeze some more money out of us; but it is entirely possible that
a) the chips could be different, b) that the critical timing path in
the device could be in the cache snooping/consistency stuff (that
stuff is probably pretty hairy!). I mean there must be a reason why
it took them a month and a half longer to release the Athlon MP 1.9GHz
than the XP 1.9GHz.

6) I'm currently on 2.4.17 and have used most of the later 2.4.1x's on
it.
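
A rough sketch of how point 2 could be turned into a series of test
boots. This is an illustration only: the helper name mem_limits and the
256M step are assumptions, not anything from the thread; only the
mem=512M starting point comes from the suggestion above.

```python
def mem_limits(installed_mb, start_mb=512, step_mb=256):
    """Return kernel 'mem=' arguments from start_mb up to installed_mb.

    The step size is an arbitrary choice for illustration; pick one
    that matches the DIMM sizes actually fitted.
    """
    sizes = list(range(start_mb, installed_mb, step_mb)) + [installed_mb]
    return [f"mem={mb}M" for mb in sizes]


# Example: a box with 1G fitted.
for arg in mem_limits(1024):
    print(arg)
```

Booting with each value in turn, with all the RAM left installed, should
show roughly where the instability starts, if it depends on memory size
at all.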

Dave
---------------- Have a happy GNU millennium! ----------------------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM, SPARC and HP-PA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/

2002-01-03 16:54:33

by Andreas Bombe

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Tue, Jan 01, 2002 at 01:58:23AM -0500, Shaya Potter wrote:
> I missed the original message, I'm running a dual athlon 1800+ XP system
> (actually both CPU for some reason get identified as MPs, but they are
> retail XP chips).

The identification string is written by the BIOS. Yours didn't know
about XPs so it misidentified them as MPs. Upgrade your BIOS if this
bugs you.

If the ID string contradicts what you think you bought, don't trust the ID
string.

--
Andreas Bombe <[email protected]> DSA key 0x04880A44

2002-01-03 17:08:19

by Dave Jones

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Thu, 3 Jan 2002, Andreas Bombe wrote:

> The identification string is written by the BIOS. Yours didn't know
> about XPs so it misidentified them as MPs. Upgrade your BIOS if this
> bugs you.
>
> If ID string contradicts what you think you bought, don't trust the ID
> string.

x86info, and 2.5.2-dj11 both have code to correctly determine XP / MP.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-01-03 17:09:20

by Andreas Bombe

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Tue, Jan 01, 2002 at 03:21:56PM +0000, Dr. David Alan Gilbert wrote:
> 5) I bought Athlon MPs because I didn't want the hassle of knowing
> whether XPs would work or not. Now sure, it could be AMD just trying
> to squeeze some more money out of us; but it is entirely possible that
> a) the chips could be different,

Unlikely. The way from silicon wafer to chip is 4 to 8 weeks, so it's
more efficient (and makes it easier to follow the market) to produce one
line and select/configure them after production.

The differences are more like blown fuse links or external jumpers on
the carrier board that select a mode of operation. I think the Celerons
actually have the same cache on the die as the Pentiums, which is just
halved by a configuration fuse.

> b) that the critical timing path in
> the device could be in the cache snooping/consistency stuff (that
> stuff is probably pretty hairy!). I mean there must be a reason why
> it took them a month and a half longer to release the Athlon MP 1.9GHz
> than the XP 1.9GHz.

More likely. There are essentially three types of possible XPs:

a) Passed MP, but market needs no more MPs
b) Not tested for MP, market asks for lots of XPs
c) Failed MP test, sold as XP

If you're lucky, you get a pair of type a) and it works. Types b) and
c) may appear to work 99.9% of the time ("it works for me") but that
does not make a stable system. Type c) may even work for months, until
the first summer sun shines into your room and gives the extra few
degrees of heat that crashes your system.

--
Andreas Bombe <[email protected]> DSA key 0x04880A44

2002-01-03 17:48:54

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

Followup to: <[email protected]>
By author: Andreas Bombe <[email protected]>
In newsgroup: linux.dev.kernel
>
> The identification string is written by the BIOS. Yours didn't know
> about XPs so it misidentified them as MPs. Upgrade your BIOS if this
> bugs you.
>
> If ID string contradicts what you think you bought, don't trust the ID
> string.
>

This seems very odd. I thought in Athlon processors the ID string
came from the *CPU* (via CPUID), not the BIOS...

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-01-03 18:02:54

by Dave Jones

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On 3 Jan 2002, H. Peter Anvin wrote:

> This seems very odd. I thought in Athlon processors the ID string
> came from the *CPU* (via CPUID), not the BIOS...

Software-overridable cpuid strings are getting quite commonplace
in CPUs from several vendors.

I believe the reasoning is that sometimes the lead time from
manufacture to marketing is long enough that the default power-on string
may not be correct, hence the BIOS can do the XP/MP discrimination and
set it accordingly. (Unless you've got a crap BIOS).

If anyone believes they have a BIOS which doesn't do this correctly,
(/proc/cpuinfo reports XP, and x86info reports MP or vice versa),
let me know, and I'll see what I can dig up.
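
For anyone wanting to script that comparison, here is a minimal sketch.
It is not part of x86info. It assumes /proc/cpuinfo carries the name in
its "model name" field and that x86info prints its own classification
in brackets on the "Family:" line (the "[Athlon MP]" form is the only
one shown in this thread; "[Athlon XP]" is assumed by analogy). The
function names are made up for illustration.

```python
import re


def athlon_variant(text):
    """Return 'XP', 'MP', or None depending on what the text names."""
    match = re.search(r"Athlon(?:\(tm\))?\s+(XP|MP)\b", text)
    return match.group(1) if match else None


def cpuinfo_variants(cpuinfo_text):
    """Variants named in the 'model name' lines of /proc/cpuinfo text."""
    return {athlon_variant(line)
            for line in cpuinfo_text.splitlines()
            if line.startswith("model name")}


def x86info_variants(x86info_text):
    """Variants named in the bracketed tag on x86info 'Family:' lines."""
    return {athlon_variant(line)
            for line in x86info_text.splitlines()
            if line.lstrip().startswith("Family:")}


def variants_disagree(cpuinfo_text, x86info_text):
    """True when the kernel's string and x86info's tag name different parts."""
    return cpuinfo_variants(cpuinfo_text) != x86info_variants(x86info_text)
```

Feed it the contents of /proc/cpuinfo and the captured output of
x86info; a True result is the XP/MP mismatch described above.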

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-01-03 19:09:31

by Shaya Potter

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Thu, 2002-01-03 at 12:07, Dave Jones wrote:
> On Thu, 3 Jan 2002, Andreas Bombe wrote:
>
> > The identification string is written by the BIOS. Yours didn't know
> > about XPs so it misidentified them as MPs. Upgrade your BIOS if this
> > bugs you.
> >
> > If ID string contradicts what you think you bought, don't trust the ID
> > string.
>
> x86info, and 2.5.2-dj11 both have code to correctly determine XP / MP.

Just getting linux up and running on this machine, so with the original
debian kernel (2.2.20 UP) and x86info from its unstable, these retail
XPs were identified as MPs

trillian:/home/spotter# uname -a
Linux trillian 2.2.20 #1 Sun Nov 4 15:44:23 EST 2001 i686 unknown
trillian:/home/spotter# x86info
x86info v1.7. Dave Jones 2001
Feedback to <[email protected]>.

Found 1 CPU, but found 2 CPUs in MPTable.
/dev/cpu/0/cpuid: No such device
Family: 6 Model: 6 Stepping: 2 [Athlon MP]
Processor name string: AMD Athlon(tm) MP Processor 1800+

PowerNOW! Technology information
Available features:
Temperature sensing diode present.

Perhaps 2.4.17 (compiling now) will make a difference, or does one need
your kernel + x86info to do it correctly?

thanks,

shaya

--
spotter@{cs.columbia.edu,yucs.org}
http://yucs.org/~spotter/

2002-01-04 00:37:21

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

Dave Jones wrote:

> On 3 Jan 2002, H. Peter Anvin wrote:
>
>
>>This seems very odd. I thought in Athlon processors the ID string
>>came from the *CPU* (via CPUID), not the BIOS...
>>
>
> Software-overridable cpuid strings are getting quite commonplace
> in CPUs from several vendors.
>


Not applicable here, though.

-hpa

2002-01-04 02:24:21

by Andreas Bombe

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Thu, Jan 03, 2002 at 09:48:15AM -0800, H. Peter Anvin wrote:
> Followup to: <[email protected]>
> By author: Andreas Bombe <[email protected]>
> In newsgroup: linux.dev.kernel
> >
> > The identification string is written by the BIOS. Yours didn't know
> > about XPs so it misidentified them as MPs. Upgrade your BIOS if this
> > bugs you.
> >
> > If ID string contradicts what you think you bought, don't trust the ID
> > string.
> >
>
> This seems very odd. I thought in Athlon processors the ID string
> came from the *CPU* (via CPUID), not the BIOS...

It comes from there, but it is written there by the BIOS for Athlon (and
I guess Duron, too).

http://www.heise.de/newsticker/data/jow-18.10.01-000/ (in German)

I searched a bit with Google but couldn't find an English page with that
info right now.

--
Andreas Bombe <[email protected]> DSA key 0x04880A44

2002-11-15 10:24:43

by David Crooke

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems


Not sure if this is an apposite list, but found this thread ....
apologise in advance.

I just built a similar dual AMD machine tonight and it also has
issues....... never used any AMD stuff before so venturing into new turf.

Tyan S2460 - motherboard sticker says "C", BIOS rev 1.03
2 x Athlon MP 2000+ (1666 MHz) - retail packaged, AMD clip-on heatsinks
and fans
1Gb PC2100 ECC registered memory (2 x 512Mb DIMMs, Kingston branded)
Red Hat 7.2 but downloaded and built 2.4.19, configured for SMP, Athlon
target
Single IDE drives, no RAID or stuff like that
Tekram DC390 (8 bit SCSI)
3Com Cyclone 3C905
D-Link DE-590TX (??) - uses ne2k-pci driver, has worked since early
2.0.x and probably before
ATI Rage 3D Pro (AGP 2x)

Big Antec case with a ton of fans :-)

1. When I first put it together, it would consistently run OK for a
period of 4-5 minutes, quite precisely - no less than 4, no more than 5
and then just lock up HARD - no Ctrl-Alt-Del, no kernel panics, nothing.
Once or twice it seemed like it stuttered - as if the load was like
10.00 or higher, the keystroke echo would take 2-3 seconds.

2. First try - I pulled the Tekram (it's ancient and has bootable BIOS)
- no difference

3. Tried some BIOS settings (e.g. SMP 1.1 mode) - it DOES NOT like this;
any BIOS changes AT ALL (even seemingly harmless ones like Num Lock)
appear to mess it up totally, and LILO hangs at "LI" when trying to
start. Restored factory defaults.

4. Then I noticed that the CPU1 heatsink was quite warm (maybe 70C
feeling around the thick bit of the aluminium) whereas CPU0 heatsink is
just above room temp.

5. Checking the Winbond monitoring in the BIOS** menu, it comes up
showing both CPU's at 77C, then as you hit keys it takes proper
readings, and claims both CPUs within 1-2 degrees of each other (??). It
seems accurate on fan speeds though. Both fans running pretty fast,
5500-6200 RPM.

6. Pulled CPU1, messed around some - same behaviour, lock ups after 4-5 mins

7. Brought it up to single user mode console, to see if it was video
card etc. - did some testing of just letting it mostly idle (while true
- uptime - sleep 1 - etc.) and locked up 1-2 more times.

8. Rebooted again, now it's up and running and appears stable (still 1
CPU), so I took it up to full init 5 and it stayed up (and so I'm
writing this email :-) Once or twice seemed to stall again for 1-2
seconds (interrupt storm ???) but recovered.
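
The idle test in step 7 (uptime, sleep 1, repeat) can also be made to
leave a record on disk, so that after a hard lockup the last line
written shows roughly when the machine froze. A minimal sketch, assuming
a writable log path; the function name and log format are made up for
illustration.

```python
import os
import time


def heartbeat(path, beats, interval=1.0):
    """Append one timestamped line per beat and force each one to disk."""
    with open(path, "a") as log:
        for n in range(beats):
            log.write(f"{time.time():.0f} beat {n}\n")
            log.flush()
            # Ask the OS to push the line to disk now, so it is not
            # still sitting in a cache when the machine locks up.
            os.fsync(log.fileno())
            time.sleep(interval)
```

Something like heartbeat("/var/tmp/heartbeat.log", 600) covers ten
minutes, which is longer than the 4-5 minute window described above.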

Anyone have suggestions? I'm thinking to leave it running and see if it
stays up. Smells of a hardware issue, but also the BIOS seems a bit
funny (there is a message in the Help which says "this setting for debug
only - remove for production" !!)

Other observation, possibly unrelated: the unpacking of the kernel seems
very slow for an otherwise pretty quick machine - the dots when it says
"Loading xxx..." tick at about 1 per second, much like a laptop with
PC-66 memory, compared with 4-5 per second for the Pentium III
800/PC-133 motherboard I just hauled out.

** The temperature sensor driver stuff didn't seem to come with the
kernel ??

2002-11-15 11:49:05

by Alastair MacGregor

[permalink] [raw]
Subject: RE: Dual athlon XP 1800 problems

Don't know about the kernel issues you're experiencing, but I have that
board and the BIOS does seem a bit flaky. I can confirm, however, that
the temperature monitoring, as you noticed, doesn't give sane readings
until you hit update for the first time.

I have dual Athlon MP 1.2GHz, 768MB Crucial ECC RAM and kernel 2.4.19
optimized for athlonmp using Gentoo patches, and it's solid as a rock.
The only problems I've had were full lock-ups when compiling, but I
traced that to a faulty DIMM.

Cheers
ali


2002-11-15 14:21:32

by Alan

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Fri, 2002-11-15 at 10:31, David Crooke wrote:
> 1. When I first put it together, it would consistently run OK for a
> period of 4-5 minutes, quite precisely - no less than 4, no more than 5
> and then just lock up HARD - no Ctrl-Alt-Del, no kernel panics, nothing.

Turn off ACPI and APM in the bios as a starter.

> 3. Tried some BIOS settings (e.g. SMP 1.1 mode) - it DOES NOT like this;
> any BIOS changes AT ALL (even seemingly harmless ones like Num Lock)
> appear to mess it up totally, and LILO hangs at "LI" when trying to
> start. Restored factory defaults.

Make sure you have a current BIOS on dual Athlon boxes; the earlier
BIOSes were not terribly good on the whole. Make sure you have a PS/2
mouse in the mouse port even if you aren't going to use it.

2002-11-15 14:35:36

by Dave Jones

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Fri, Nov 15, 2002 at 02:54:48PM +0000, Alan Cox wrote:
> Make sure you have a current BIOS on dual athlon boxes, the earlier
> bioses were not terribly good on the whole. Make sure you have a PS/2
> mouse in the mouse port even if you aren't going to use it

Unless he's lucky with steppings, it's also possible he's being
bitten by running XP's instead of MPs.

Dave

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-11-15 17:13:22

by Willy Tarreau

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Fri, Nov 15, 2002 at 02:40:20PM +0000, Dave Jones wrote:
> On Fri, Nov 15, 2002 at 02:54:48PM +0000, Alan Cox wrote:
> > Make sure you have a current BIOS on dual athlon boxes, the earlier
> > bioses were not terribly good on the whole. Make sure you have a PS/2
> > mouse in the mouse port even if you aren't going to use it
>
> Unless he's lucky with steppings, it's also possible he's being
> bitten by running XP's instead of MPs.

BTW, since I upgraded my Asus BIOS, my 2 XP1800s are reported as MP1800.
There's absolutely no way for me to tell that they're in fact XPs, except
by removing the cooling fans and reading the chips. Although they're quite
stable even at very high temperatures (I could compile a complete kernel with
fans unplugged, but the case was as hot as a pizza oven), I know that there
are people out there with unstable dual XP setups, and I frankly don't know
how they could tell that they're XPs if their reseller sold them as MPs and
installed the fans himself.

Cheers,
Willy

2002-11-15 17:19:32

by Ken Witherow

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

> 1. When I first put it together, it would consistently run OK for a
> period of 4-5 minutes, quite precisely - no less than 4, no more than 5
> and then just lock up HARD - no Ctrl-Alt-Del, no kernel panics, nothing.
> Once or twice it seemed like it stuttered - as if the load was like
> 10.00 or higher, the keystroke echo would take 2-3 seconds.
>
> 2. First try - I pulled the Tekram (it's ancient and has bootable BIOS)
> - no difference
>
> 3. Tried some BIOS settings (e.g. SMP 1.1 mode) - it DOES NOT like this;
> any BIOS changes AT ALL (even seemingly harmless ones like Num Lock)
> appear to mess it up totally, and LILO hangs at "LI" when trying to
> start. Restored factory defaults.

I have a S2460 with dual 1800MPs using BIOS rev 1.04. I had very similar
problems (random hangs, sometimes after 2 minutes, sometimes after 36
hours). Here's what I did to solve them:

1) Turn off power management in the BIOS. I still have power management
enabled in linux and all is fine.

2) (this is the most important one) Make sure you have a minimum of a 500
watt power supply. Each CPU alone is rated for 66 watts of consumption.

3) I still get random hangs at boot (usually after rebooting linux) and I
believe this is due to some ACPI problem. A hard reboot (turn the power
supply off and on) fixes it for me.

4) There are a couple bugs with the 760MP chipset and APICs. To see if
they're affecting you, add "mem=nopentium noapic" to your kernel
parameters (I can run fine without them).
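
Since LILO is the boot loader mentioned earlier in the thread, one way
to make the parameters from point 4 stick is an append= line in
lilo.conf. The helper below is only a sketch for building that line; its
name is made up, and it does not edit lilo.conf itself.

```python
def lilo_append(params):
    """Format kernel parameters as a lilo.conf 'append=' line."""
    for param in params:
        # Quotes or whitespace inside one parameter would break the
        # quoting of the append= value, so reject them up front.
        if not param or '"' in param or any(c.isspace() for c in param):
            raise ValueError(f"unusable kernel parameter: {param!r}")
    return 'append="' + " ".join(params) + '"'


print(lilo_append(["mem=nopentium", "noapic"]))
```

The resulting line goes inside the relevant image= section, and
/sbin/lilo has to be rerun afterwards for the change to take effect.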

> 4. Then I noticed that the CPU1 heatsink was quite warm (maybe 70C
> feeling around the thick bit of the aluminium) whereas CPU0 heatsink is
> just above room temp.
>
> 5. Checking the Winbond monitoring in the BIOS** menu, it comes up
> showing both CPU's at 77C, then as you hit keys it takes proper
> readings, and claims both CPUs within 1-2 degrees of each other (??). It
> seems accurate on fan speeds though. Both fans running pretty fast,
> 5500-6200 RPM.

My BIOS reports the right temps but lm_sensors didn't. I too was getting
temps in the 75C+ range. To fix lm_sensors, do the following:

echo "2" > /proc/sys/dev/sensors/w83782d-i2c-0-2d/sensor1
echo "2" > /proc/sys/dev/sensors/w83782d-i2c-0-2d/sensor2
echo "2" > /proc/sys/dev/sensors/w83782d-i2c-0-2d/sensor3
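
The same three writes as a loop, in case they need to be redone at every
boot. This is a sketch only: the function name is made up, the chip
directory is the one from the commands above and may differ on other
boards, and entries that do not exist are skipped rather than created.

```python
import os


def set_sensor_types(chip_dir, sensor_type="2",
                     names=("sensor1", "sensor2", "sensor3")):
    """Write sensor_type to each existing sensorN file under chip_dir.

    Returns the list of paths that were actually written.
    """
    written = []
    for name in names:
        path = os.path.join(chip_dir, name)
        if not os.path.exists(path):
            continue  # this chip or kernel does not expose the entry
        with open(path, "w") as entry:
            entry.write(sensor_type + "\n")  # same bytes as: echo "2"
        written.append(path)
    return written
```

Run as root, the call would be
set_sensor_types("/proc/sys/dev/sensors/w83782d-i2c-0-2d").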

> 7. Brought it up to single user mode console, to see if it was video
> card etc. - did some testing of just letting it mostly idle (while true
> - uptime - sleep 1 - etc.) and locked up 1-2 more times.

I thought it was my video card too... so I went out and spent $90 on a new
one only to find it does the same thing.

> 8. Rebooted again, now it's up and running and appears stable (still 1
> CPU), so I took it up to full init 5 and it stayed up (and so I'm
> writing this email :-) Once or twice seemed to stall again for 1-2
> seconds (interrupt storm ???) but recovered.

I notice this sometimes too... I chalk it up to some SMP locking
somewhere. Currently up 6 days, 3:53 with the maximum around 40 days
(rebooted to upgrade kernel).

> Other observation, possibly unrelated: the unpacking of the kernel seems
> very slow for an otherwise pretty quick machine - the dots when it says
> "Loading xxx..." tick at about 1 per second, much like a laptop with
> PC-66 memory, compared with 4-5 per second for the Pentium III
> 800/PC-133 motherboard I just hauled out.

When mine hasn't reset right (the aforementioned ACPI lockup), mine does
this. It was especially prevalent before I upgraded my power supply from
400 to 550 watts
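
For a sense of scale, the figures quoted in this message work out as
follows. This is plain arithmetic on the numbers above (66 watts per
CPU, 400 and 550 watt supplies); the function name is made up, and the
total rating of a supply does not show how much it can deliver on each
individual rail.

```python
def cpu_power_share(cpu_watts, cpu_count, psu_watts):
    """Return (total CPU watts, fraction of the supply's total rating)."""
    total = cpu_watts * cpu_count
    return total, total / psu_watts


for psu in (400, 550):
    total, share = cpu_power_share(66, 2, psu)
    print(f"{total} W of CPU load is {share:.0%} of a {psu} W supply")
```

Everything else in the box (board, RAM, drives, cards, fans) comes out
of what is left.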

> ** The temperature sensor driver stuff didn't seem to come with the
> kernel ??

pick up the lm_sensors package

--
Ken Witherow <phantoml AT rochester.rr.com>
ICQ: 21840670 AIM: phantomlordken
http://www.krwtech.com/ken


2002-11-15 17:36:52

by steve roemen

[permalink] [raw]
Subject: RE: Dual athlon XP 1800 problems

i had similar issues on my old 2460 board.

i found out that a huge power supply doesn't cut it; you need a QUALITY
power supply of ~400 watts (more specifically, the 5 volt bus).

i also found out the hard way that, i believe, tyan didn't design that board
properly, because the 5 volt part of the connectors was burned up on the PS
and MB.

i've since replaced with a s2466n-4m and am very happy.

i'd check your power supply connector before it burns up yours too...

-steve



2002-11-15 17:31:39

by Erich Boleyn

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems


Willy Tarreau <[email protected]> wrote:

> On Fri, Nov 15, 2002 at 02:40:20PM +0000, Dave Jones wrote:
> > On Fri, Nov 15, 2002 at 02:54:48PM +0000, Alan Cox wrote:
> > > Make sure you have a current BIOS on dual athlon boxes, the earlier
> > > bioses were not terribly good on the whole. Make sure you have a PS/2
> > > mouse in the mouse port even if you aren't going to use it
> >
> > Unless he's lucky with steppings, it's also possible he's being
> > bitten by running XP's instead of MPs.
>
> BTW, since I've upgraded my Asus bios, my 2 XP1800 are reported as MP1800.
> There's absolutely no way for me to tell that they're in fact XPs, except
> from dismounting the cooling fans and reading the chips. Although they're quite
> stable even at very high temperatures (I could compile a complete kernel with
> fans unplugged, but the case was as hot as a pizza oven), I know that there
> are people out there with unstable dual XP setups and I frankly don't know
> how they could tell that they're XP if their reseller sold them as MPs and
> installed the fan himself.

I have the ASUS A7M266-D board with 2 XP1800 as well with the same issue
(though it reported them as MP with the early and the later BIOS).

The deal here was that the early steppings of the XP processors,
through about the 1800 series, didn't have the "MP disable" bridge
cut. When they released the 1900's and higher, this was changed, but
you can still find XP 1800's and lower which just appear as MP processors.

However, the machine runs rock stable with the stock i686 patch kernel
from RedHat on a 350W power supply (I even tweaked the voltages down quite
a bit to get the processors to run cooler, works great).

--
Erich Stefan Boleyn <[email protected]> http://www.uruk.org/
"Reality is truly stranger than fiction; Probably why fiction is so popular"

2002-11-15 18:14:29

by Ken Witherow

[permalink] [raw]
Subject: RE: Dual athlon XP 1800 problems

On Fri, 15 Nov 2002, steve roemen wrote:

> i had similar issues on my old 2460 board.
>
> i found out that a huge power supply doesn't cut it, you need a QUALITY
> power supply of ~400 watts (more specifically, the 5 volt bus).
>
> i also found out the hard way that i believe tyan didn't design that board
> properly because the 5 volt part of the connectors were burned up on the PS
> and MB.
>
> i've since replaced with a s2466n-4m and am very happy.
>
> i'd check your power supply connector before it burns up yours too...

PSU connector is fine... and I definitely need the extra juice since I'm
running quite a bit of power hungry hardware in this box (5 SCSI drives, a
GF4, etc). Both power supplies are pretty good Antecs. It does bother me
that they didn't use the AUX connector for the extra 5V power though

--
Ken Witherow <phantoml AT rochester.rr.com>
ICQ: 21840670 AIM: phantomlordken
http://www.krwtech.com/ken


2002-11-15 18:44:55

by Alastair MacGregor

[permalink] [raw]
Subject: RE: Dual athlon XP 1800 problems

I'm powering two 1.2GHz MPs with this board from a 300 watt power supply
and it's rock solid, so unless you've got a lot of power hungry devices
(SCSI hard drives, I guess) you should get away with a lot less than
500W.



>
> When mine hasn't reset right (the aforementioned ACPI lockup), mine does
> this. It was especially prevalent before I upgraded my power supply from
> 400 to 550 watts
>


2002-11-16 02:31:09

by Alan

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

On Fri, 2002-11-15 at 17:26, Ken Witherow wrote:
> 4) There are a couple bugs with the 760MP chipset and APICs. To see if
> they're affecting you, add "mem=nopentium noapic" to your kernel
> parameters (I can run fine without them).

mem=nopentium isn't related to any AMD760MP/MPX stuff. Some boxes seem to
need noapic, although a PS/2 mouse may cure that.

2002-11-16 15:01:02

by Kevin Brosius

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

Kevin Brosius wrote:
>
> >
> > 8. Rebooted again, now it's up and running and appears stable (still 1
> > CPU), so I took it up to full init 5 and it stayed up (and so I'm
> > writing this email :-) Once or twice seemed to stall again for 1-2
> > seconds (interrupt storm ???) but recovered.
> >
> > Anyone have suggestions? I'm thinking to leave it running and see if it
> > stays up. Smells of a hardware issue, but also the BIOS seems a bit
> > funny (there is a message in the Help which says "this setting for debug
> > only - remove for production" !!)
>
> I've noticed some oddities on 2.4.19 with a dual Athlon Tyan S2462 that
> look like stalls under heavy load. If you're really curious, you might
> try 2.4.18, as this was not a problem there. (I'm running SuSE kernels
> shipped with SuSE 8.0 and 8.1, although I saw similar trouble with a
> stock 2.4.19 build and stopped using it. The stalls are only minor
> though, so I haven't investigated. Maybe they are worse on that
> motherboard.)

Oh, and there's a Beta BIOS available from Tyan, which mentions a fix
for IRQ routing. Don't know if that would help or not.

http://www.tyan.com/support/html/b_tg_mp.html

--
Kevin

2002-11-16 14:58:23

by Kevin Brosius

[permalink] [raw]
Subject: Re: Dual athlon XP 1800 problems

>
> 8. Rebooted again, now it's up and running and appears stable (still 1
> CPU), so I took it up to full init 5 and it stayed up (and so I'm
> writing this email :-) Once or twice seemed to stall again for 1-2
> seconds (interrupt storm ???) but recovered.
>
> Anyone have suggestions? I'm thinking to leave it running and see if it
> stays up. Smells of a hardware issue, but also the BIOS seems a bit
> funny (there is a message in the Help which says "this setting for debug
> only - remove for production" !!)


I've noticed some oddities on 2.4.19 with a dual Athlon Tyan S2462 that
look like stalls under heavy load. If you're really curious, you might
try 2.4.18, as this was not a problem there. (I'm running SuSE kernels
shipped with SuSE 8.0 and 8.1, although I saw similar trouble with a
stock 2.4.19 build and stopped using it. The stalls are only minor
though, so I haven't investigated. Maybe they are worse on that
motherboard.)

--
Kevin