Hello,
I know that this issue has been brought up before, but still...
I have an ABIT KT7-RAID motherboard with a HPT370 IDE controller on it.
I have two SAMSUNG SP0802N drives attached, one on each channel, with a
software RAID1 setup.
Under both 2.4.22 and 2.6.4, the same thing happens. The system boots up
all right and works for a random period of time. Then it locks up
completely with the disk LED stuck on. No keystrokes work and there is no
error message that I could see. The crash can be triggered by
disk-intensive operations; however, it seems like a random phenomenon
that happens sooner or later for sure. It is likely that the problem is
connected with DMA handling, and that it only occurs if both IDE channels
are utilized heavily (as is the case with RAID1).
I've read reports from others with the same symptom on this list, but I
could find no solution so far. None of the suggestions or patches that I
tried have worked (including the new patch from Andre Hedrick, which has
no effect in this case since the HPT370 is a rev. 3 controller).
However, since my last attempt with 2.4.22 last August, I have been using
the "opensource" driver from HighPoint, which has been rock stable for my
setup. Now I have started to experiment again with the hpt366 driver, this
time under 2.6.4, and it's the same lockup situation. I would be rather
happy to see the hpt366 driver working, as then I (and others) would not be
forced to use the "opensource" driver from HighPoint, which, besides being
a partly binary driver, has other disadvantages (it needs an initrd, it
does not support S.M.A.R.T., and it does not yet compile with the 2.6
kernel).
In case someone has any idea, I would be glad to send specific logs and/or
test patches (preferably with 2.6.4).
--
Balazs REE
On Tuesday 30 March 2004 14:36, Balazs Ree wrote:
> Hello,
>
> I know that this issue has been brought up before, but still...
>
> I have an ABIT KT7-RAID motherboard with a HPT370 IDE controller on it.
> I have two SAMSUNG SP0802N drives attached, one on each channel, with a
> software RAID1 setup.
>
> Under both 2.4.22 and 2.6.4, the same thing happens. The system boots up
> all right and works for a random period of time. Then it locks up
> completely with the disk LED stuck on. No keystrokes work and there is no
> error message that I could see. The crash can be triggered
Is SysRq not working either? Look into the interrupt handler of this
driver. Are there any potentially endless loops? Modify them to have
a timeout, and make them printk loudly if the timeout triggers.
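As a rough illustration of the bounded-loop idea, here is a minimal
userspace sketch. It is not taken from the hpt366 driver: the names
wait_ready_bounded, ready_fn, ready_after and never_ready are hypothetical,
the callback stands in for reading a controller status register, and
fprintf stands in for printk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for "read the status register and test a bit". */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll ready() at most max_polls times instead of looping forever.
 * Returns the number of polls used on success, or -1 on timeout.
 */
static int wait_ready_bounded(ready_fn ready, void *ctx, int max_polls)
{
    assert(max_polls > 0);
    for (int i = 1; i <= max_polls; i++) {
        if (ready(ctx))
            return i;
    }
    /* In kernel code this would be a printk(); fprintf stands in here. */
    fprintf(stderr, "wait_ready_bounded: timeout after %d polls\n",
            max_polls);
    return -1;
}

/* Demo callback: reports "not ready" until the counter reaches zero. */
static bool ready_after(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return false;
    }
    return true;
}

/* Demo callback: models hardware that never becomes ready. */
static bool never_ready(void *ctx)
{
    (void)ctx;
    return false;
}
```

With a callback that never reports ready, the function gives up and logs
instead of spinning, which is the behaviour you would want to see in the
log if the handler is where the machine gets stuck.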
> by disk-intensive operations; however, it seems like a random
> phenomenon that happens sooner or later for sure. It is likely that
> the problem is connected with DMA handling, and that it only occurs if
> both IDE channels are utilized heavily (as is the case with RAID1).
Do parallel reads with dd. Does it happen? Do the same with DMA off.
Does it happen now? Same with writes, etc.
You may need to serialize channel usage in driver code if it indeed
happens when both channels are working at the same time.
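If a small C program is more convenient than two dd processes, the
parallel-read test could be sketched roughly as below. This is only an
assumption about how one might script it: read_stream and parallel_read
are hypothetical helper names, and the paths you pass in (one device per
IDE channel) depend on your setup. Reads served from the page cache will
not touch the disks, so use raw devices or large regions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read up to max_bytes sequentially from path; 0 on success, 1 on error. */
static int read_stream(const char *path, long long max_bytes)
{
    static char buf[1 << 16];
    long long total = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        perror(path);
        return 1;
    }
    while (total < max_bytes) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n < 0) {
            perror(path);
            close(fd);
            return 1;
        }
        if (n == 0)     /* end of file or device */
            break;
        total += n;
    }
    close(fd);
    return 0;
}

/*
 * Fork one reader per path so that both streams are active at once.
 * Returns the number of readers that failed (0 means both completed).
 */
static int parallel_read(const char *path_a, const char *path_b,
                         long long max_bytes)
{
    const char *paths[2] = { path_a, path_b };
    pid_t pids[2];
    int failed = 0;

    assert(max_bytes >= 0);
    for (int i = 0; i < 2; i++) {
        pids[i] = fork();
        if (pids[i] < 0) {          /* fork failed: count it, keep going */
            perror("fork");
            failed++;
        } else if (pids[i] == 0) {  /* child: read, then exit with status */
            _exit(read_stream(paths[i], max_bytes));
        }
    }
    for (int i = 0; i < 2; i++) {
        int status;

        if (pids[i] < 0)
            continue;
        if (waitpid(pids[i], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed++;
    }
    return failed;
}
```

Call parallel_read once with DMA on and once with DMA off, then compare
against running read_stream on each device one after the other.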
> I've read reports from others with the same symptom on this list, but I
> could find no solution so far. None of the suggestions or patches that I
> tried have worked (including the new patch from Andre Hedrick, which has
> no effect in this case since the HPT370 is a rev. 3 controller).
>
> However, since my last attempt with 2.4.22 last August, I have been using
> the "opensource" driver from HighPoint, which has been rock stable for my
> setup. Now I have started to experiment again with the hpt366 driver, this
> time under 2.6.4, and it's the same lockup situation. I would be rather
> happy to see the hpt366 driver working, as then I (and others) would not be
> forced to use the "opensource" driver from HighPoint, which, besides being
> a partly binary driver, has other disadvantages (it needs an initrd, it
> does not support S.M.A.R.T., and it does not yet compile with the 2.6
> kernel).
>
> In case someone has any idea, I would be glad to send specific logs and/or
> test patches (preferably with 2.6.4).
--
vda
On Wed, 31 Mar 2004 10:54:01 +0200, Denis Vlasenko wrote:
> You may need to serialize channel usage in driver code if it indeed
> happens when both channels are working at the same time.
Thank you, this tip was really useful.
Setting #define HPT_SERIALIZE_IO in hpt366.h solves the lockups on my
machine (ABIT KT7-RAID). There seems to be a small (10-20%) performance
penalty involved according to the benchmarks on my RAID1 setup, but that's
acceptable.
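
For anyone wanting to try the same thing, the change can be sketched
roughly as follows. The "before" line is an assumption about how the
option appears in an unmodified hpt366.h; check your own tree, since the
surrounding lines may differ between kernel versions.

```c
/* hpt366.h */

/* Before (assumed default in the unmodified header):
 *   #undef HPT_SERIALIZE_IO
 */

/* After: enable the serialization option, then rebuild the driver. */
#define HPT_SERIALIZE_IO
```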
If this solves the "hdd led stays on" freezes for others with HPT370
(rev. 3) on motherboards with possibly buggy IRQ handling, then maybe this
option could even be made settable through kernel config, together with an
appropriate explanation.
--
Balazs REE
Weird question, but do you perhaps have the APIC enabled?
--
Cohen's Law:
There is no bottom to worse.
On Thu, 01 Apr 2004 13:40:42 +0200, Jaco Kroon wrote:
> Weird question, but do you perhaps have the APIC enabled?
Yes.
But I have tried switching it off earlier - no success. Seems that at
least this particular problem is not APIC related.
--
Balazs REE