Hello...
Every kernel from 2.4.21 onward (including 2.6.5 and 2.6.7; others not
tested) produces the following errors quite often (once or twice per minute,
with a corresponding delay), and the hard disk drops out of DMA.
-------------------------------------------------
hda: dma_timer_expiry: dma status == 0x20
hda: timeout waiting for DMA
hda: timeout waiting for DMA
hda: (__ide_dma_test_irq) called while not waiting
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
hda: drive not ready for command
-----------------------------------------------
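For readers decoding the two hex values in the log above, here is a minimal sketch. The ATA status names follow what the driver prints in the log; the bus-master status names (`Drive0DMACapable` and so on) are labels chosen here for illustration, based on the usual bus-master IDE status register layout, and are not taken from the kernel source.

```python
# ATA status register bits, named as the Linux IDE driver prints them.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Bus-master IDE status register bits (illustrative names, assumed layout).
BMIDE_STATUS_BITS = [
    (0x80, "SimplexOnly"),
    (0x40, "Drive1DMACapable"),
    (0x20, "Drive0DMACapable"),
    (0x04, "Interrupt"),
    (0x02, "Error"),
    (0x01, "Active"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


print(decode(0x58, ATA_STATUS_BITS))    # drive status from the log
print(decode(0x20, BMIDE_STATUS_BITS))  # dma status from the log
```

Under these assumptions, 0x58 decodes to exactly the three flags the driver printed, and a dma status of 0x20 has neither the active nor the interrupt bit set, which would be consistent with the driver still waiting for a completion interrupt when the timer expired.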
I have checked that the drive and cable are OK (tested in another machine) and
no matter what drive I connect to the IDE controller, they *all* produce the
above error and drop DMA after some seconds.
Currently I am stuck at kernel version 2.4.20, as any later kernel severely
degrades the performance of the machine (Pentium Pro 200).
At http://www.uwsg.iu.edu/hypermail/linux/kernel/0304.1/0332.html I found a
thread which hints that IRQ sharing may be the culprit, but /proc/interrupts
shows that the ide interrupt is not shared...
lspci and /proc/interrupts are included at the end of this mail.
What can I do to debug this problem?
Thanks,
Patrick
lspci:
-----------------------
0000:00:00.0 Host bridge: Intel Corp. 440FX - 82441FX PMC [Natoma] (rev 02)
0000:00:07.0 ISA bridge: Intel Corp. 82371SB PIIX3 ISA [Natoma/Triton II] (rev 01)
0000:00:07.1 IDE interface: Intel Corp. 82371SB PIIX3 IDE [Natoma/Triton II]
0000:00:0a.0 VGA compatible controller: Texas Instruments TVP4020 [Permedia 2] (rev 01)
0000:00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
0000:00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)
--------------------------
/proc/interrupts:
-------------------------
CPU0
0: 55623224 IO-APIC-edge timer
1: 2 IO-APIC-edge keyboard
2: 0 XT-PIC cascade
8: 4 IO-APIC-edge rtc
10: 59404185 IO-APIC-level eth1
11: 64467911 IO-APIC-level eth0
14: 9031698 IO-APIC-edge ide0
NMI: 0
LOC: 55623074
ERR: 0
MIS: 13142
----------------------------
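The check for a shared IDE interrupt can also be done mechanically. The sketch below is only an illustration: the function names `parse_interrupts` and `shared_irqs` are made up here, and it assumes the usual layout of one or more per-CPU counters, then the controller type, then a comma-separated list of devices.

```python
def parse_interrupts(text):
    """Map each numeric IRQ to the list of device names registered on it."""
    irqs = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # skips the CPU header and the NMI/LOC/ERR/MIS rows
        fields = rest.split()
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1  # step over the per-CPU interrupt counters
        # fields[i] is the controller type; everything after it names devices
        names = " ".join(fields[i + 1:])
        irqs[int(label)] = [d.strip() for d in names.split(",") if d.strip()]
    return irqs


def shared_irqs(text):
    """Return only the IRQs that have more than one device attached."""
    return {irq: devs for irq, devs in parse_interrupts(text).items()
            if len(devs) > 1}


# The output posted above, pasted in as a string.
POSTED = """\
CPU0
0: 55623224 IO-APIC-edge timer
1: 2 IO-APIC-edge keyboard
2: 0 XT-PIC cascade
8: 4 IO-APIC-edge rtc
10: 59404185 IO-APIC-level eth1
11: 64467911 IO-APIC-level eth0
14: 9031698 IO-APIC-edge ide0
NMI: 0
LOC: 55623074
ERR: 0
MIS: 13142
"""

print(shared_irqs(POSTED))
```

On a live system the same functions could be fed the contents of /proc/interrupts read from the file instead of the pasted string.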
--
Patrick Dreker
GPG KeyID : 0xFCC2F7A7 (Patrick Dreker)
Fingerprint: 7A21 FC7F 707A C498 F370 1008 7044 66DA FCC2 F7A7
Key available from keyservers
> Any kernel above and including 2.4.21 (including 2.6.5 and 2.6.7, others not
> tested) produces the following errors quite often (once or twice per minute,
> with the corresponding delay) and the harddisk drops out of DMA.
Same here. 3 computers with PIIX4 and 2 different disks (or maybe even 3; I
need to check the 850M Conner too), all pre-UDMA but multiword-DMA capable:
a 2.5G Seagate Medalist and an 850M WD Caviar. The common denominator seems
to be the PIIX chip (PIIX3 and PIIX4 reported so far) and multiword DMA.
It came with 2.4.19 for me - 2.4.18 (and thus also Debian Woody) is fine
but anything with a newer kernel (incl. 2.6.*) is broken - DMA timeouts.
So maybe it is a little different (since your 2.4.20 works) but still
very similar.
--
Meelis Roos
On Monday 28 of June 2004 14:48, Patrick Dreker wrote:
> Hello...
Hi,
> Any kernel above and including 2.4.21 (including 2.6.5 and 2.6.7, others
> not tested) produces the following errors quite often (once or twice per
> minute, with the corresponding delay) and the harddisk drops out of DMA.
>
> -------------------------------------------------
> hda: dma_timer_expiry: dma status == 0x20
> hda: timeout waiting for DMA
> hda: timeout waiting for DMA
> hda: (__ide_dma_test_irq) called while not waiting
> hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> hda: drive not ready for command
> -----------------------------------------------
>
> I have checked that the drive and cable are OK (tested in another machine)
> and no matter what drive I connect to the IDE controller, they *all*
> produce the above error and drop DMA after some seconds.
>
> Currently I am stuck at kernel version 2.4.20, as any later kernel severely
> degrades the performance of the machine (Pentium Pro 200).
>
> At http://www.uwsg.iu.edu/hypermail/linux/kernel/0304.1/0332.html I found a
> thread which hints that IRQ sharing may be the culprit, but /proc/interrupts
> shows that the ide interrupt is not shared...
That was an HPT-specific problem.
> lspci and /proc/interrupts are included at the end of this mail.
>
> What can I do to debug this problem?
"diff -u" on "lspci -s 07.1 -xxx" outputs for 2.4.20 and 2.4.21 kernels.
Doing bisection search on 2.4.21-pre kernels would also help.
> Thanks,
> Patrick
>
> lspci:
> -----------------------
> 0000:00:00.0 Host bridge: Intel Corp. 440FX - 82441FX PMC [Natoma] (rev 02)
> 0000:00:07.0 ISA bridge: Intel Corp. 82371SB PIIX3 ISA [Natoma/Triton II] (rev 01)
> 0000:00:07.1 IDE interface: Intel Corp. 82371SB PIIX3 IDE [Natoma/Triton II]
> 0000:00:0a.0 VGA compatible controller: Texas Instruments TVP4020 [Permedia 2] (rev 01)
> 0000:00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
> 0000:00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS)
> --------------------------
>
> /proc/interrupts:
> -------------------------
> CPU0
> 0: 55623224 IO-APIC-edge timer
> 1: 2 IO-APIC-edge keyboard
> 2: 0 XT-PIC cascade
> 8: 4 IO-APIC-edge rtc
> 10: 59404185 IO-APIC-level eth1
> 11: 64467911 IO-APIC-level eth0
> 14: 9031698 IO-APIC-edge ide0
> NMI: 0
> LOC: 55623074
> ERR: 0
> MIS: 13142
> ----------------------------
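The bisection search suggested above can be sketched as follows. This is only an illustration: the list of pre-releases is a hypothetical stand-in for whatever kernels are actually available, and `is_bad` stands for the manual step of booting a kernel and watching for DMA timeouts.

```python
def first_bad(versions, is_bad):
    """Return the first version for which is_bad() is true.

    Assumes versions[0] is known good, versions[-1] is known bad, and that
    every version after the first bad one is bad as well.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return versions[hi]


# Hypothetical candidate list; the exact set of pre-releases is an assumption.
VERSIONS = ["2.4.20"] + [f"2.4.21-pre{n}" for n in range(1, 8)] + ["2.4.21"]


def ask_user(version):
    """Manual test step: boot `version`, check the log, answer y or n."""
    return input(f"Does {version} show DMA timeouts? [y/n] ").strip() == "y"


if __name__ == "__main__":
    print("First bad kernel:", first_bad(VERSIONS, ask_user))
```

With nine candidates this needs at most three test boots instead of seven, which matters when every step means recompiling a kernel on a Pentium Pro 200.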
On Monday, 28 June 2004 22:21, Bartlomiej Zolnierkiewicz wrote:
> On Monday 28 of June 2004 14:48, Patrick Dreker wrote:
> > What can I do to debug this problem?
> "diff -u" on "lspci -s 07.1 -xxx" outputs for 2.4.20 and 2.4.21 kernels.
>
> Doing bisection search on 2.4.21-pre kernels would also help.
I will do the bisection search first and then post the lspci diff between the
last working version and the first non-working version, as I have to
recompile everything anyway.
Thanks,
Patrick
On Monday, 28 June 2004 22:21, Bartlomiej Zolnierkiewicz wrote:
> > What can I do to debug this problem?
> "diff -u" on "lspci -s 07.1 -xxx" outputs for 2.4.20 and 2.4.21 kernels.
>
> Doing bisection search on 2.4.21-pre kernels would also help.
2.4.21-pre1 is the first non-working kernel, 2.4.20 works. When generating the
configs for 2.4.21-pre1 (make oldconfig based on the working 2.4.20 config) I
was asked "Use IDE Taskfile I/O" which defaulted to no and I kept that
default (i.e. "Don't use Taskfile I/O").
lspci -s 07.1 -xxx shows no difference between a working (2.4.20) kernel and a
non-working (2.4.21-pre1) kernel.
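For anyone repeating the comparison, here is a small sketch of a "diff -u" style check on two saved dumps. The function name `config_space_diff` is made up for this example, and the hex lines in the demonstration are placeholder bytes, not real output from this machine.

```python
import difflib


def config_space_diff(before, after, before_name="2.4.20",
                      after_name="2.4.21-pre1"):
    """Return unified-diff lines for two text dumps; empty if identical."""
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile=before_name, tofile=after_name, lineterm=""))


# Placeholder dumps (NOT real register contents), only to show the output.
DUMP_A = "00: 86 80 10 70\n40: 00 00 00 00\n"
DUMP_B = "00: 86 80 10 70\n40: 07 a3 00 00\n"

if __name__ == "__main__":
    for line in config_space_diff(DUMP_A, DUMP_B):
        print(line)
    if not config_space_diff(DUMP_A, DUMP_A):
        print("identical dumps: no difference")
```

In practice the two strings would be the saved output of "lspci -s 07.1 -xxx" under each kernel. An empty result, as reported here, suggests both kernels program the controller's PCI configuration space identically, so the difference would have to lie elsewhere.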
This caught my eye, but is probably obvious to you: 2.4.21-pre1 reports
IDE version 7.00beta-2.4, while 2.4.20 reports version 6.31.
Patrick