2002-04-05 05:36:33

by Albert Max Lai

Subject: 2.4.x and DAC960 issues

I have had problems using the DAC960 driver under any 2.4.x kernel
(currently using 2.4.18). I have not experienced these problems under
2.2.x.

1. Under reasonably low loads the driver will hang. It is very
reproducible when using the benchmark program "Bonnie." This
problem is exacerbated when using the ext3 filing system, locking
up almost immediately.

2. I am not sure if this is related, but I believe that it is. I see
the message "spurious 8259A interrupt: IRQ7." Most of the
examples of this that I have read about involve having APIC
enabled. But, for me, this is not the case.

3. If I boot without appending the "noapic" option, the driver hangs
after scanning the bus for cards, but before getting to
configuring the card. This is minor. I can live with this
workaround.

The machine is a Tyan S1836DLUAN-BX, dual PIII 600MHz. The controller
is a DAC1164P w/ Firmware Version: 5.08-0-87. The problem exists with
prior versions of the firmware and either RAID 0 or RAID 5 setups.
Kernels are compiled w/ egcs-2.91.66.

Please let me know if any additional information is needed. Any help
debugging these problems would be appreciated. Thanks in advance.

-Albert


2002-04-05 20:14:04

by Marc A. Volovic

Subject: Re: 2.4.x and DAC960 issues

Albert Max Lai said:

> On Friday, 5 April 2002, Marc A. Volovic wrote:
> > What is your interrupt breakdown? Could your machine be doing something
> > naughty with the interrupts?
>
>            CPU0       CPU1
>   9:     128424          0    XT-PIC  aic7xxx, aic7xxx
>  10:      52035          0    XT-PIC  Mylex DAC1164P
[snips]

Ouch, this does not look healthy. You're at 'noapic'? Seems so.

Mine is:
           CPU0       CPU1
 17:         22         17   IO-APIC-level  BusLogic BT-958
 18:      83988      84319   IO-APIC-level  Mylex AcceleRAID 160

However, I can see no indication of misbehaviour. Let's take it off
the list. Can you send me a driver startup dmesg?


---MAV
Linguists Do It Cunningly
Marc A. Volovic [email protected]

2002-04-05 18:28:02

by Marc A. Volovic

Subject: Re: 2.4.x and DAC960 issues

Albert Max Lai said:

> I have had problems using the DAC960 driver under any 2.4.x kernel
> (currently using 2.4.18). I have not experienced these problems under
>
> 1. Under reasonably low loads the driver will hang. It is very
> reproducible when using the benchmark program "Bonnie." This
[snip]
> problem is exacerbated when using the ext3 filing system, locking

I am using a Mylex 170LP with no problem, running on a dual 550MHz
MSI 6120S under quite a few 2.4.x kernels, lately 2.4.18 and 2.4.19pre5,
all under reiserfs. The firmware is 6.00-15, carrying 6 9GB disks
(5 RAID5 + 1 spare).

There have been NO lockups under any version of the kernel, not under
multiple bonnie runs.

What is your interrupt breakdown? Could your machine be doing something
naughty with the interrupts?

---MAV
Linguists Do It Cunningly
Marc A. Volovic [email protected]

2002-04-05 22:22:41

by Marcelo Tosatti

Subject: Re: 2.4.x and DAC960 issues



On Fri, 5 Apr 2002, Albert Max Lai wrote:

> On Friday, 5 April 2002, Marc A. Volovic wrote:
>
> > I am using a Mylex 170LP with no problem, running on a dual 550MHz
> > MSI 6120S under quite a few 2.4.x kernels, lately 2.4.18 and 2.4.19pre5,
> > all under reiserfs. The firmware is 6.00-15, carrying 6 9GB disks
> > (5 RAID5 + 1 spare).
> >
> > There have been NO lockups under any version of the kernel, not under
> > multiple bonnie runs.
> >
> > What is your interrupt breakdown? Could your machine be doing something
> > naughty with the interrupts?
>
> This is what /proc/interrupts says:
>            CPU0       CPU1
>   0:    3874524          0    XT-PIC  timer
>   1:      18836          0    XT-PIC  keyboard
>   2:          0          0    XT-PIC  cascade
>   4:          4          0    XT-PIC  serial
>   5:      46479          0    XT-PIC  soundblaster
>   8:     218807          0    XT-PIC  rtc
>   9:     128424          0    XT-PIC  aic7xxx, aic7xxx
>  10:      52035          0    XT-PIC  Mylex DAC1164P
>  12:     342261          0    XT-PIC  PS/2 Mouse
>  14:     209669          0    XT-PIC  eth0
>  15:      44772          0    XT-PIC  eth1, usb-uhci
> NMI:          0          0
> LOC:    3874766    3874768
> ERR:         16
> MIS:          0

I've forwarded your first message to Leonard (the driver author)... we'll
probably get some feedback soon.

2002-04-05 18:46:23

by Albert Max Lai

Subject: Re: 2.4.x and DAC960 issues

On Friday, 5 April 2002, Marc A. Volovic wrote:

> I am using a Mylex 170LP with no problem, running on a dual 550MHz
> MSI 6120S under quite a few 2.4.x kernels, lately 2.4.18 and 2.4.19pre5,
> all under reiserfs. The firmware is 6.00-15, carrying 6 9GB disks
> (5 RAID5 + 1 spare).
>
> There have been NO lockups under any version of the kernel, not under
> multiple bonnie runs.
>
> What is your interrupt breakdown? Could your machine be doing something
> naughty with the interrupts?

This is what /proc/interrupts says:
           CPU0       CPU1
  0:    3874524          0    XT-PIC  timer
  1:      18836          0    XT-PIC  keyboard
  2:          0          0    XT-PIC  cascade
  4:          4          0    XT-PIC  serial
  5:      46479          0    XT-PIC  soundblaster
  8:     218807          0    XT-PIC  rtc
  9:     128424          0    XT-PIC  aic7xxx, aic7xxx
 10:      52035          0    XT-PIC  Mylex DAC1164P
 12:     342261          0    XT-PIC  PS/2 Mouse
 14:     209669          0    XT-PIC  eth0
 15:      44772          0    XT-PIC  eth1, usb-uhci
NMI:          0          0
LOC:    3874766    3874768
ERR:         16
MIS:          0

-Albert
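
A listing like the one above can also be checked mechanically. The following is only a rough sketch in Python, not anything taken from the driver or the kernel: the function names (parse_interrupts, summarize) are made up for illustration, and the parser assumes the plain "IRQ: per-CPU counts, controller, device list" layout seen in this thread. It reports which interrupt controllers are in use, how many interrupts each CPU handled, and which IRQ lines are shared.

```python
# Sample copied from the /proc/interrupts listing posted in this thread.
SAMPLE = """\
           CPU0       CPU1
  0:    3874524          0    XT-PIC  timer
  1:      18836          0    XT-PIC  keyboard
  2:          0          0    XT-PIC  cascade
  4:          4          0    XT-PIC  serial
  5:      46479          0    XT-PIC  soundblaster
  8:     218807          0    XT-PIC  rtc
  9:     128424          0    XT-PIC  aic7xxx, aic7xxx
 10:      52035          0    XT-PIC  Mylex DAC1164P
 12:     342261          0    XT-PIC  PS/2 Mouse
 14:     209669          0    XT-PIC  eth0
 15:      44772          0    XT-PIC  eth1, usb-uhci
NMI:          0          0
LOC:    3874766    3874768
ERR:         16
MIS:          0
"""


def parse_interrupts(text):
    """Return (cpu_names, rows); each row is (irq, counts, controller, devices)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()
    rows = []
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        if not label.strip().isdigit():
            continue  # skip the NMI/LOC/ERR/MIS summary lines
        fields = rest.split()
        counts = [int(x) for x in fields[:len(cpus)]]
        controller = fields[len(cpus)]
        # Devices sharing an IRQ are separated by commas.
        devices = [d.strip() for d in " ".join(fields[len(cpus) + 1:]).split(",")]
        rows.append((int(label), counts, controller, devices))
    return cpus, rows


def summarize(text):
    """Collect controller types, per-CPU totals, and shared IRQ lines."""
    cpus, rows = parse_interrupts(text)
    totals = [0] * len(cpus)
    for _, counts, _, _ in rows:
        for i, count in enumerate(counts):
            totals[i] += count
    return {
        "controllers": {row[2] for row in rows},
        "per_cpu": dict(zip(cpus, totals)),
        "shared": {row[0]: row[3] for row in rows if len(row[3]) > 1},
    }


if __name__ == "__main__":
    report = summarize(SAMPLE)
    print("controllers:", sorted(report["controllers"]))
    print("per-CPU totals:", report["per_cpu"])
    print("shared IRQs:", report["shared"])
```

Every numbered line being XT-PIC with nothing delivered to CPU1 is consistent with booting under "noapic"; the summary says nothing by itself about why the driver hangs.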

2002-04-16 22:17:01

by Albert Max Lai

Subject: Re: 2.4.x and DAC960 issues

This is the summary of the off-list discussion and solution to my
DAC960 problem.

-Albert

On Saturday, 6 April 2002, Marc A. Volovic wrote:

> Quoth Leonard N. Zubkoff:
>
> > From: Albert Max Lai <[email protected]>
> >
> > I moved the card into the slot closest to the CPU that I could, and
> > voila! everything works correctly; no lockups, ext3 works, even
> >
> > Excellent news. That's not a fix I've heard of before.
>
> Hi,
>
> Alas, it is very simple. Many (most? in my experience - Tyan, MSI, ASUS)
> motherboards leave their outer (farthest from the CPU area) PCI slots
> ___NON-bus mastering___. In most cases, this is the outermost slot, but
> sometimes it is more than one slot, but again, the outermost, leftmost.
>
> Sometimes, the masterlessness is dynamic - i.e. based on the number of
> populated slots, counting from the CPU. This is EXTREMELY rare. I saw
> it only once, I think, and on a board I did not trust even as far as
> I could toss it. (Well, I could toss it some reasonable distance, which
> I did ;-)...
>
> In some rare cases (errrr... ummmm... SOME Tyan board, I cannot
> currently remember the model) the ordering ran in the reverse direction.
>
> Populating these masterless slots with anything but a sound card (and in
> many cases even with a sound card) leads to loss of stability.
>
> Moving a board INWARD (i.e. toward the CPU) in many cases solves the
> problem.
>
> --
> ---MAV
> Linguists Do It Cunningly
> Marc A. Volovic [email protected]
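
For anyone wanting to look at the bus-mastering side of this, the one thing software can read directly is the Bus Master bit (bit 2) in the 16-bit command register at offset 0x04 of a card's PCI configuration space. The sketch below, in Python, decodes that bit from a raw configuration-space dump. The function names are hypothetical, and where the dump comes from is an assumption: on a 2.4 kernel a per-device file under /proc/bus/pci/ is the likely source, but the path is left as a parameter. Note the limit of this check: the bit only shows that the driver has enabled the card as a bus master. It does not show whether the slot is actually wired to grant the bus, which is the condition described above, so a set bit does not rule that problem out.

```python
import struct

PCI_COMMAND_OFFSET = 0x04   # 16-bit command register in the config header
PCI_COMMAND_MASTER = 0x0004  # bit 2: Bus Master enable


def bus_master_enabled(config: bytes) -> bool:
    """Return True if the Bus Master bit is set in a raw config-space dump."""
    if len(config) < PCI_COMMAND_OFFSET + 2:
        raise ValueError("config space dump is too short")
    # PCI configuration space is little-endian.
    (command,) = struct.unpack_from("<H", config, PCI_COMMAND_OFFSET)
    return bool(command & PCI_COMMAND_MASTER)


def check_device(path: str) -> bool:
    """Read the first bytes of a config-space file (path is an assumption)."""
    with open(path, "rb") as fh:
        return bus_master_enabled(fh.read(64))
```

For example, check_device() could be pointed at whichever configuration-space file corresponds to the RAID controller on the machine in question.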