2003-08-18 08:08:17

by kenton.groombridge

Subject: Re: nforce2 lockups

It seems that the kernel recognizes all nforce2 chipsets as revision 162. That is my bad since I found that seemed to be a common denominator. Taking shots in the dark. :^)

I will tell you that I know it isn't related to bad hardware. I used a program that I borrowed from my office (not a cheap program, and it is thorough). It has loopback plugs for USB, serial, parallel, CDs, etc. I put in all the loopback plugs and CD, and ran burnin diagnostics nightly for about five days. It made a total of about 30 complete loops through the hardware. No problems. Loaded "the other OS" (I like that) and ran it for three days straight with two different hardware banging programs running and there were no problems. Ran memtest86 overnight, cycled through the memory a good dozen times, no problems.

When I start loading Mandrake 9.1, it locks up during the install. If I am lucky enough to get it installed, it is doomed to lock up within three to five minutes of use. I have tried every kernel/patch that I know of (2.4, 2.6, all with acpi, akmp, pre, bk). Nothing has helped. Longest run time: about six hours (because I didn't touch it).

This is a hunch: is it possible that gcc is compiling something a bit wrong? I know that some instructions, when processed in a certain order, can do some wacky things. Maybe a gcc bug is causing the Athlon processor to calculate some instructions in the right sequence where it sometimes works, and other times doesn't.

The reason I say this, is that I have read a few posts where one person had lock-ups with one distro and not the other. Kernels are pretty much the same (I think we are all downloading the latest kernel source and building our own kernels), but gcc is different.

Haven't tried it yet, since I am working a project 24/7 that will keep me until the end of the month. Purchased the Athlon XP Gentoo 1.4 CDs, so will load then and may get some different results.

Ken

----- Original Message -----
From: Patrick Dreker <[email protected]>
Date: Monday, August 18, 2003 5:02 am
Subject: Re: nforce2 lockups

> >
> > I have ASUS A7N8X Deluxe mobo with nForce2 rev 162 without any
> > problems (if not counting inability to enable SiI SATA DMA mode
> > with attached Seagate Barracuda drive).
>
> I have the exact same Board (except I'm not using SATA), and it's
> a nightmare. Best uptime so far: a little more than 16 hours.
> Usually it locks up a lot earlier. When I do network transfers I
> can cause it to lock within a few minutes. Under "the other OS"
> it runs without any problems.
>



2003-08-18 09:00:18

by Karel Kulhavy

Subject: Re: nforce2 lockups

> This is a hunch: is it possible that gcc is compiling something a bit wrong?
> I know that some instructions when processed in a certain order, can do some
> wacky things. Maybe a gcc bug is causing the Athlon processor to calculate
> some instructions in the right sequence where it sometimes works, and other
> times doesn't.
>
> The reason I say this, is that I have read a few posts where one person had
> lock-ups with one distro and not the other. Kernels are pretty much the same
> (I think we are all downloading the latest kernel source and building our own
> kernels), but gcc is different.

I found that after switching the kernel from 2.4.21 to 2.6.0, I could
still trigger the crash on demand. But when I put 2.4.21 back, it wouldn't
crash. In the meantime, when I swapped the IDE disk for another one while
keeping the same kernel, the crash could still be triggered on demand.

To check whether the crash depends on the content being read, I copied the
swap verbatim from the crashdisk to the noncrashdisk (disk map: 1G swap at the
beginning, ext2 the rest; the crash was within the first 10 seconds, which at
40MB/s is less than 400M from the beginning of the disk). It didn't help. It
seems the crash is highly dependent on a sequence of some seemingly irrelevant
operations during the startup of the kernel.
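
The bound used above can be checked with a quick calculation. This is only a minimal sketch, assuming a constant 40 MB/s sequential read starting at the beginning of the disk and taking "1G" as 1024 MB; the function name is made up for illustration:

```python
# Upper bound on how much data could have been read before the crash,
# assuming a constant sequential read rate from the start of the disk.
def max_mb_read(seconds, rate_mb_per_s):
    return seconds * rate_mb_per_s

SWAP_MB = 1024  # assumption: "1G swap" taken as 1024 MB

read_before_crash = max_mb_read(10, 40)
print(read_before_crash)             # MB read within the first 10 seconds
print(read_before_crash <= SWAP_MB)  # True if it all falls inside the swap area
```

Under those assumptions everything read before the crash lies inside the swap area at the start of the disk, which is why copying only the swap was enough for this test.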

>
> Haven't tried it yet, since I am working a project 24/7 that will keep me
> until the end of the month. Purchased the Athlon XP Gentoo 1.4 CDs, so will
> load then and may get some different results.
>
> Ken
>

2003-08-18 09:31:37

by Ookhoi

Subject: Re: nforce2 lockups

[email protected] wrote (ao):
> It seems that the kernel recognizes all nforce2 chipsets as revision
> 162. That is my bad since I found that seemed to be a common
> denominator. Taking shots in the dark. :^)

My Shuttle SN41G2 also has "NFORCE2: chipset revision 162"

> I will tell you that I know it isn't related to bad hardware. I used a
> program that I borrowed from my office (not a cheap program, and it is
> thorough).

> The reason I say this, is that I have read a few posts where one
> person had lock-ups with one distro and not the other. Kernels are
> pretty much the same (I think we are all downloading the latest kernel
> source and building our own kernels), but gcc is different.

I've had a few different 2.5 kernels on this one, compiled under an always
up-to-date Debian sid (unstable). The system is (and has been) rock
solid, now running 2.5.70 with an uptime of 70 days. Running
http://www.stanford.edu/group/pandegroup/folding

I get this now and then:
favonius kernel: Bank 2: 940040000000017a

favonius kernel: MCE: The hardware reports a non fatal, correctable
incident occurred on CPU 0.

but I don't notice any effect from it.
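
For what it's worth, the bank status word in that log line can be picked apart. A rough sketch, assuming this CPU follows the architectural x86 MCi_STATUS layout for the top flag bits (VAL in bit 63, OVER 62, UC 61, EN 60, MISCV 59, ADDRV 58, PCC 57) with the error code in the low 16 bits; the function name is made up, and the model-specific fields are left undecoded:

```python
# Flag bits of an x86 machine-check bank status register (MCi_STATUS).
# Assumption: the architectural layout applies to this CPU.
MCI_STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register holds extra information
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context may be corrupt
}

def decode_mci_status(status):
    """Return the set flag names (highest bit first) and the low 16-bit error code."""
    flags = [
        name
        for bit, name in sorted(MCI_STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {"flags": flags, "error_code": status & 0xFFFF}

decoded = decode_mci_status(0x940040000000017A)
print(decoded["flags"])
print(hex(decoded["error_code"]))
```

For 940040000000017a this gives VAL, EN and ADDRV set with UC and PCC clear, which is consistent with the kernel calling the event non-fatal and correctable.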

This is all with an Athlon XP 3000+, btw.