2002-12-11 16:02:06

by Orion Poplawski

Subject: Reliable hardware

scott thomason wrote:

>On Tuesday 10 December 2002 03:00 pm, Alan Cox wrote:
>
>
>>Random lockups on dual athlons are a notorious problem under all
>>OS's. Start by checking it passes memtest86, that will verify the
>>RAM is ok - and the AMD is -very- picky about RAM.
>>
>>If that's ok then let me know which board you have, what is plugged
>>into it and what PSU you are using.
>>
>>
>
>I have two AMD MP 2000+ cpus in an ASUS A7M266-D. Even after returning
>my memory for new chips the store owner memtest86'd, my combo of cpus
>and mobo was finding the occasional error. I finally ended up
>resolving it by simply underclocking the bus by about 6MHz :(
>
>Next time, I'm buying ECC memory.
>---scott
>
>
Is there a good site for pointers towards assembling reliable Linux
machines? It seems to me the trickiest part of the whole operation is
choosing good hardware in the first place. I just started a new job and
inherited a bunch of new but flaky machines, and I'd like to avoid doing
that in the future.



2002-12-11 16:16:42

by Alan

Subject: Re: Reliable hardware

On Wed, 2002-12-11 at 16:08, Orion Poplawski wrote:
> Is there a good site for pointers towards assembling reliable Linux
> machines? It seems to me the trickiest part of the whole operation is
> choosing good hardware in the first place. I just started a new job and
> inherited a bunch of new but flaky machines, and I'd like to avoid doing
> that in the future.

The AMD duals have been a disaster in my experience. It's a shame because
when they do go they really are very fast boxes. The biggest factor I've
found is chipsets.

2002-12-11 16:18:33

by John Bradford

Subject: Re: Reliable hardware

> >>Random lockups on dual athlons are a notorious problem under all
> >>OS's. Start by checking it passes memtest86, that will verify the
> >>RAM is ok - and the AMD is -very- picky about RAM.
> >>
> >>If that's ok then let me know which board you have, what is plugged
> >>into it and what PSU you are using.
> >>
> >>
> >
> >I have two AMD MP 2000+ cpus in an ASUS A7M266-D. Even after returning
> >my memory for new chips the store owner memtest86'd, my combo of cpus
> >and mobo was finding the occasional error. I finally ended up
> >resolving it by simply underclocking the bus by about 6MHz :(
> >
> >Next time, I'm buying ECC memory.

Why? ECC memory guards against a single-bit error in the RAM and nothing
else (except that it also reports double-bit errors).
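To illustrate the correct-one, detect-two behaviour described above, here is a minimal sketch of a SECDED (single error correction, double error detection) code in Python. It is a toy Hamming(7,4) code plus one overall parity bit protecting a 4-bit value; real ECC DIMMs typically keep 8 check bits per 64 data bits, and this is not a model of any particular memory controller. The function names `encode` and `decode` are made up for the example.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit SECDED codeword (list of 0/1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity, 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # overall parity over positions 1..7
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    # The syndrome is the XOR of the positions of all set bits in 1..7.
    # It is 0 for a valid codeword and equals the position of a single flipped bit.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_bad = sum(code) % 2 == 1     # overall parity mismatch

    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        # Odd number of flips: assume exactly one and repair it.
        # syndrome == 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with good overall parity: two bits flipped.
        # The error is reported but cannot be repaired.
        return "uncorrectable", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

Flipping any one bit of `encode(n)` gives `("corrected", n)` from `decode`, and flipping any two bits gives `("uncorrectable", None)`. Three or more flips can be silently miscorrected, and nothing here protects data that was already wrong when it reached the memory.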

John.

2002-12-11 17:14:16

by Jason L Tibbitts III

Subject: Re: Reliable hardware

>>>>> "AC" == Alan Cox <[email protected]> writes:

AC> The AMD duals have been a disaster in my experience.

I do have a bunch of these running reliably (RH 7.3 plus the latest
OpenMosix kernel). I had to go through a few combinations of
motherboard and RAM (four different manufacturers of RAM) before I got
something that works. Processors are MP 1900+ or 2000+, boards are
Tyan S2466, memory is in PC2100 ECC registered 512MB sticks from
Corsair. Case and power supply are PC Power and Cooling, mid tower,
450W PS, every fan bay filled. These machines have been rock
stable for months except for a failed IBM deathstar drive and an
over-temp shutdown when the room AC failed.

I still have a couple of the 760MP boards (as opposed to the MPX
boards) which I just can't get to run properly with two processors.

- J<

2002-12-11 23:28:48

by Patrick Finnegan

Subject: Re: Reliable hardware

On 11 Dec 2002, Alan Cox wrote:

> On Wed, 2002-12-11 at 16:08, Orion Poplawski wrote:
> > Is there a good site for pointers towards assembling reliable Linux
> > machines? It seems to me the trickiest part of the whole operation is
> > choosing good hardware in the first place. I just started a new job and
> > inherited a bunch of new but flaky machines, and I'd like to avoid doing
> > that in the future.
>
> The AMD duals have been a disaster in my experience. It's a shame because
> when they do go they really are very fast boxes. The biggest factor I've
> found is chipsets.

Which chipset - the new or the old one? I've got an ASUS A7M266D (or
something) that's based on the AMD 760MPX chipset and has 512MB of
Registered ECC memory, and a pair of XP 1800+'s... and it works just
beautifully. Truly rock solid.

Pat
--
Purdue University ITAP/RCS
Information Technology at Purdue
Research Computing and Storage
http://www-rcd.cc.purdue.edu


2002-12-12 00:39:38

by Alan

Subject: Re: Reliable hardware

On Wed, 2002-12-11 at 23:35, Patrick Finnegan wrote:
> Which chipset - the new or the old one? I've got an ASUS A7M266D (or
> something) that's based on the AMD 760MPX chipset and has 512MB of
> Registered ECC memory, and a pair of XP 1800+'s... and it works just
> beautifully. Truly rock solid.

Same board you have.