LinuxLists.cc - sym53c1010 issues

2001-10-05 23:45:50

Subject: sym53c1010 issues

Am currently getting the following errors in dmesg:

sym53c1010-33-0:0: ERROR (81:0) (8-0-0) (3e/18) @ (mem 6d40383c:ffffffff).
sym53c1010-33-0: regdump: da 00 00 18 47 3e 00 0f 04 08 80 00 00 00 0f 0a 28 96
7e 12 02 00 00 00.
sym53c1010-33-0: ctest4/sist original 0x8/0x0 mod: 0x18/0x0
sym53c1010-33-0: restart (scsi reset).
sym53c1010-33-0: handling phase mismatch from SCRIPTS.
sym53c1010-33-0: Downloading SCSI SCRIPTS.
sym53c1010-33-0-<0,*>: FAST-80 WIDE SCSI 160.0 MB/s (12.5 ns, offset 62)
sym53c1010-33-0:0: ERROR (81:0) (8-0-0) (3e/18) @ (script 50:f31c0004).
sym53c1010-33-0: script cmd = 90080000
sym53c1010-33-0: regdump: da 00 00 18 47 3e 00 0f 00 08 80 00 00 00 0f 0a 0a 4d
5f 49 02 00 00 00.
sym53c1010-33-0: ctest4/sist original 0x8/0x0 mod: 0x18/0x0
sym53c1010-33-0: restart (scsi reset).
sym53c1010-33-0: handling phase mismatch from SCRIPTS.
sym53c1010-33-0: Downloading SCSI SCRIPTS.
sym53c1010-33-0-<0,*>: FAST-80 WIDE SCSI 160.0 MB/s (12.5 ns, offset 62)

and, after about 24hrs, the box seizes up on me, requiring a reboot. The
kernel I'm using is 2.2.19 with raid0.9 and the latest sym* drivers (ie
not the ones that come with the kernel).

The main thing here is that I'm not sure if this is a h/w or s/w issue. Would
msgs like the above be due to the driver, the onboard scsi card or the HD?

Or am I asking the wrong questions? :)

There are other errors usually reported also and they deal with the 1st
SCA drive (with 1 or 2 of the 3rd drive). There are currently 3 SCA drives
on the first controller and 4 LVD drives (external array) on the 2nd. No
hotswapping is being done and there have been no errors reported re the
4 LVD drives as far as I know.

Anyone able to help?

--
CaT "As you can expect it's really affecting my sex life. I can't help
it. Each time my wife initiates sex, these ejaculating hippos keep
floating through my mind."
- Mohd. Binatang bin Goncang, Singapore Zoological Gardens

2001-10-07 09:16:16

by Gérard Roudier

Subject: Re: sym53c1010 issues

On Sat, 6 Oct 2001, CaT wrote:

> Am currently getting the following errors in dmesg:
>
> sym53c1010-33-0:0: ERROR (81:0) (8-0-0) (3e/18) @ (mem 6d40383c:ffffffff).
                                                         ^^^^^^^^ ^^^^^^^^
The chip reported 'Illegal Script Instruction detected'.
This one shows the SCRIPTS processor jumping to some wrong memory
address.

> sym53c1010-33-0: regdump: da 00 00 18 47 3e 00 0f 04 08 80 00 00 00 0f 0a 28 96
> 7e 12 02 00 00 00.
> sym53c1010-33-0: ctest4/sist original 0x8/0x0 mod: 0x18/0x0
> sym53c1010-33-0: restart (scsi reset).
> sym53c1010-33-0: handling phase mismatch from SCRIPTS.
> sym53c1010-33-0: Downloading SCSI SCRIPTS.
> sym53c1010-33-0-<0,*>: FAST-80 WIDE SCSI 160.0 MB/s (12.5 ns, offset 62)
> sym53c1010-33-0:0: ERROR (81:0) (8-0-0) (3e/18) @ (script 50:f31c0004).
                                                               ^^^^^^^^
'Illegal Instruction' on a DSA-relative LOAD from memory. This happens if
the DSA (base address) is not properly aligned for the size to load
(4 bytes).

> sym53c1010-33-0: script cmd = 90080000

> sym53c1010-33-0: regdump: da 00 00 18 47 3e 00 0f 00 08 80 00 00 00 0f 0a 0a 4d
> 5f 49 02 00 00 00.

The register dump shows DSA to be 495f4d0a which is indeed not a properly
aligned bus address. Probably some bogus value coming from screwed memory.
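For readers who want to repeat this check on their own logs, here is a minimal Python sketch. It assumes the regdump line lists the chip registers in order from offset 0x00 and that DSA sits at offsets 0x10 to 0x13, least significant byte first; the function names are hypothetical and not part of the driver.

```python
def dsa_from_regdump(dump: str) -> int:
    """Extract the 32-bit DSA value from a 'regdump' byte string."""
    # Drop the trailing '.' and all whitespace, then decode the hex bytes.
    data = bytes.fromhex("".join(dump.replace(".", "").split()))
    # Assumption: DSA occupies register offsets 0x10..0x13, little-endian.
    return int.from_bytes(data[0x10:0x14], "little")


def is_aligned(addr: int, size: int = 4) -> bool:
    """True if addr is a multiple of size (4 bytes for the LOAD discussed here)."""
    return addr % size == 0


# Second regdump from the report above, both lines joined.
SECOND_DUMP = ("da 00 00 18 47 3e 00 0f 00 08 80 00 00 00 0f 0a 0a 4d "
               "5f 49 02 00 00 00.")

dsa = dsa_from_regdump(SECOND_DUMP)
print(f"DSA = {dsa:08x}, 4-byte aligned: {is_aligned(dsa)}")
```

With the second dump this gives 495f4d0a, which fails the 4-byte alignment test.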

Both reports from the driver make me think that the driver's internal
data have been corrupted for some obscure reason.

> sym53c1010-33-0: ctest4/sist original 0x8/0x0 mod: 0x18/0x0
> sym53c1010-33-0: restart (scsi reset).
> sym53c1010-33-0: handling phase mismatch from SCRIPTS.
> sym53c1010-33-0: Downloading SCSI SCRIPTS.
> sym53c1010-33-0-<0,*>: FAST-80 WIDE SCSI 160.0 MB/s (12.5 ns, offset 62)
>
> and, after about 24hrs, the box seizes up on me, requiring a reboot. The
> kernel I'm using is 2.2.19 with raid0.9 and the latest sym* drivers (ie
> not the ones that come with the kernel).

You probably mean you are using sym53c8xx-1.7.3c. Btw, there is another
driver available called SYM-2. Current revision is 2.1.15, IIRC.
IMO, SYM-2 will also be victimized by the memory corruption that likely
originates from some other part of the kernel.

> The main thing here is that I'm not sure if this is a h/w or s/w issue. Would
> msgs like the above be due to the driver, the onboard scsi card or the HD?
>
> Or am I asking the wrong questions? :)

> There are other errors usually reported also and they deal with the 1st
> SCA drive (with 1 or 2 of the 3rd drive). There are currently 3 SCA drives
> on the first controller and 4 LVD drives (external array) on the 2nd. No
> hotswapping is being done and there have been no errors reported re the
> 4 LVD drives as far as I know.

The problem does not look SCSI-related. A SCSI error could be an
indirect cause that triggers a bug in the upper layers, but it does not
look like the direct cause at all.

> Anyone able to help?

I can't do more.

Regards,
Gérard.