2008-06-05 13:58:19

by Bart Van Assche

Subject: Re: Strange mptbase / mptscsih kernel messages

On Thu, May 8, 2008 at 10:33 AM, Prakash, Sathya <[email protected]> wrote:
> The first message means that the hardware encountered a frame
> transmit error and the firmware aborted the I/O request because of
> that error.
> The second message indicates that an I/O timed out and the SML tried
> to abort the request, but the firmware completed the I/O before the
> abort took effect. The firmware therefore returns the "IO executed"
> message and the driver reports the abort as successful.
> I suspect some bad hardware in the topology (cables?).

Hello Sathya,

It took some time before I could have a closer look at the system on
which I observed the strange kernel messages. Apparently the RAID
controller (LSISAS3081E ?) is not connected directly to the 16 disks
but via a SAS expander (Super Micro SC836 SAS Backplane with two LSI
SASX28 Expander Chips --
http://www.supermicro.com/products/chassis/3U/836/SC836E2-R800.cfm).
It will be a challenge to find out which component triggered the
kernel messages and how to make the storage subsystem work perfectly.
Any hint is welcome.

Bart.