2003-05-13 19:35:29

by Maciej Soltysiak

Subject: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]

Hi,

On 2.5.69-dj1 (so it's a 2.5.69-bk5 kernel) I found these two in my kernel
log, which I have not seen before. There are just 2 occurrences of that.
Is that a sign of a hardware failure, or something else?


15:04:01 pysiak kernel: hdb: dma_timer_expiry: dma status == 0x64
15:04:01 pysiak kernel: hdb: lost interrupt
15:04:01 pysiak kernel: hdb: dma_intr: bad DMA status (dma_stat=70)
15:04:01 pysiak kernel: hdb: dma_intr: status=0x50 { DriveReady SeekComplete }
15:04:01 pysiak kernel:
17:14:04 pysiak kernel: hdb: dma_timer_expiry: dma status == 0x64
17:14:04 pysiak kernel: hdb: lost interrupt
17:14:04 pysiak kernel: hdb: dma_intr: bad DMA status (dma_stat=70)
17:14:04 pysiak kernel: hdb: dma_intr: status=0x50 { DriveReady SeekComplete }
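
For reading the "dma status" values in the log above, here is a minimal sketch that decodes the bus-master IDE status byte. It assumes the standard SFF-8038i bit layout; the flag names and the helper bm_status_decode() are made up for this illustration and are not kernel identifiers. Bits 0x08 and 0x10 are reserved in that layout, so the sketch ignores them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One bus-master IDE status bit and a label for it (labels are this
 * sketch's own, chosen for readability). */
struct bm_flag {
    unsigned char bit;
    const char *name;
};

/* Assumed SFF-8038i layout of the bus-master status register. */
static const struct bm_flag bm_flags[] = {
    { 0x01, "Active"    },  /* DMA engine still transferring */
    { 0x02, "Error"     },  /* controller reported a transfer error */
    { 0x04, "Interrupt" },  /* drive asserted its interrupt line */
    { 0x20, "Drive0DMA" },  /* drive 0 flagged as DMA capable */
    { 0x40, "Drive1DMA" },  /* drive 1 flagged as DMA capable */
    { 0x80, "Simplex"   },  /* only one channel may do DMA at a time */
};

/* Write the names of all set bits into out, separated by spaces.
 * Output is truncated if out is too small. Returns out. */
static char *bm_status_decode(unsigned char stat, char *out, size_t len)
{
    size_t i;

    assert(out != NULL && len > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof bm_flags / sizeof bm_flags[0]; i++) {
        if (!(stat & bm_flags[i].bit))
            continue;
        if (out[0] != '\0')
            strncat(out, " ", len - strlen(out) - 1);
        strncat(out, bm_flags[i].name, len - strlen(out) - 1);
    }
    return out;
}
```

Under that layout, 0x64 means the interrupt bit is set, the active bit is clear, and both drives are marked DMA capable. That would fit a drive that finished its transfer while the driver never saw the interrupt, though the byte alone does not say why.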


/dev/hdb:
multcount = 16 (on)
IO_support = 0 (default 16-bit)
unmaskirq = 0 (off)
using_dma = 1 (on)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 256 (on)
geometry = 38792/16/63, sectors = 39102336, start = 0

Regards,
Maciej


2003-05-14 13:34:27

by Zephaniah E. Hull

Subject: Re: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]

On Tue, May 13, 2003 at 09:48:13PM +0200, Maciej Soltysiak wrote:
> Hi,
>
> On 2.5.69-dj1 (so it's a 2.5.69-bk5 kernel) I found these two in my kernel
> log, which I have not seen before. There are just 2 occurrences of that.
> Is that a sign of a hardware failure, or something else?

The first reaction is that this is a hardware thing.

EXCEPT.

I'm seeing it too, only with recent kernels.

May 14 07:43:43 agamemnon kernel: hda: dma_timer_expiry: dma status == 0x64
May 14 07:43:43 agamemnon kernel: hda: lost interrupt
May 14 07:43:43 agamemnon kernel: hda: dma_intr: bad DMA status (dma_stat=70)
May 14 07:43:43 agamemnon kernel: hda: dma_intr: status=0x50 { DriveReady SeekComplete }

It happens only under heavy disk IO. I'm running 2.5.69-mm3; it also happened
with a few earlier kernels, and sadly I don't remember which kernel it started on.

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

<Electro> LordHavoc: i got black lines on stuff in realtime mode, i'll
take a pic, and it runs slow
<LordHavoc> this is why I LOVE ATI drivers, they're so creative with the
geometry I give them...
<LordHavoc> they turn a refined and very specific standard for the pixel
by pixel handling of polygons into an interpretive artform



2003-05-14 14:02:25

by dth

Subject: Re: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]

Zephaniah E. Hull <[email protected]> wrote:
>It happens only under heavy disk IO. I'm running 2.5.69-mm3; it also happened
>with a few earlier kernels, and sadly I don't remember which kernel it started on.

I had similar problems on a uni-processor machine.

Try this:

Disable IO-APIC support in the kernel configuration.

E.g.:
Processor type and features ->
[*] Local APIC support on uniprocessors
[ ] IO-APIC support on uniprocessors

With this configuration I no longer see these errors.
I can only guess at what causes them.

Danny
--
Miguel | "I can't tell if I have worked all my life or if
de Icaza | I have never worked a single day of my life,"

2003-05-14 14:02:25

by Maciej Soltysiak

Subject: Re: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]

> I'm seeing it too, only with recent kernels.
Exactly like me.
Someone suggested Bartlomiej Zolnierkiewicz's patch.
Try this on for size. I haven't tested it yet, but please give it a shot.

Regards,
Maciej

# Fix masked_irq arg handling for ide_do_request().
# Solves "hdx: lost interrupt" bug.
#
# Bartlomiej Zolnierkiewicz <[email protected]>

--- linux-2.5.68-bk6/drivers/ide/ide-io.c Fri Apr 25 16:08:53 2003
+++ linux/drivers/ide/ide-io.c Fri Apr 25 16:13:37 2003
@@ -850,14 +850,14 @@
 		 * happens anyway when any interrupt comes in, IDE or otherwise
 		 * -- the kernel masks the IRQ while it is being handled.
 		 */
-		if (hwif->irq != masked_irq)
+		if (masked_irq != IDE_NO_IRQ && hwif->irq != masked_irq)
 			disable_irq_nosync(hwif->irq);
 		spin_unlock(&ide_lock);
 		local_irq_enable();
 		/* allow other IRQs while we start this request */
 		startstop = start_request(drive, rq);
 		spin_lock_irq(&ide_lock);
-		if (hwif->irq != masked_irq)
+		if (masked_irq != IDE_NO_IRQ && hwif->irq != masked_irq)
 			enable_irq(hwif->irq);
 		if (startstop == ide_released)
 			goto queue_next;
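
To make the effect of the two changed conditions easier to see, here is a minimal stand-alone sketch of just the predicate, before and after the patch. It is not kernel code: the helper names should_mask_old() and should_mask_new() are hypothetical, and the value -1 for IDE_NO_IRQ is an assumption about the kernel's sentinel for "no IRQ is currently masked".

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed sentinel meaning "the caller has no IRQ masked". */
#define IDE_NO_IRQ (-1)

/* Pre-patch condition: mask the interface IRQ whenever it differs from
 * masked_irq. With masked_irq == IDE_NO_IRQ this is true for every real
 * IRQ number, so the IRQ gets disabled around start_request(). */
static bool should_mask_old(int hwif_irq, int masked_irq)
{
    assert(hwif_irq >= 0);  /* a real IRQ line, not the sentinel */
    return hwif_irq != masked_irq;
}

/* Patched condition: same test, but skipped entirely when the caller
 * passed IDE_NO_IRQ. */
static bool should_mask_new(int hwif_irq, int masked_irq)
{
    assert(hwif_irq >= 0);
    return masked_irq != IDE_NO_IRQ && hwif_irq != masked_irq;
}
```

The only case where the two differ is masked_irq == IDE_NO_IRQ: the old test disables and re-enables the interface IRQ, the new one leaves it alone. The other cases (same IRQ, or a different real IRQ) behave as before.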


Attachments:
masked_irq.diff (926.00 B)

2003-05-14 15:09:51

by Zephaniah E. Hull

Subject: Re: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]

On Wed, May 14, 2003 at 04:13:15PM +0200, Maciej Soltysiak wrote:
> > I'm seeing it too, only with recent kernels.
> Exactly like me.
> Someone suggested Bartlomiej Zolnierkiewicz's patch.
> Try this on for size. I haven't tested it yet, but please give it a shot.

The patch seems to be in -mm4 and -mm5 already. Since rebooting from -mm3
to -mm5 I have not seen the error, but the box has only been up 2 hours,
so we will know for sure if it happens again, or after a few quiet days.

Thanks.
>
> Regards,
> Maciej

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

}>No. I just point out to troublemakers that I have an English degree,
}>which means that I am allowed to make changes to the English language.
}>(What _else_ could it possibly be for?)
}Wow; in that case, my physics degree is *WAY* more useful than I
}had thought.
This just proves how useless a computer science degree is: there is hardly
any useful science involved at all. I want my computer black magic degree!
-- Victoria Swann, Jonathan Dursi, and D. Joseph Creighton on ASR



2003-05-14 16:04:13

by Rafal Bujnowski

Subject: Re: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]

Maciej Soltysiak <[email protected]> wrote:

> Exactly like me.
> Someone suggested Bartlomiej Zolnierkiewicz's patch.
> Try this on for size. I haven't tested it yet, but please give it a
> shot.

Hello!

It doesn't work. I still get:

hda: task_no_data_intr: status=0x51 { DriveReady SeekComplete Error }
hda: task_no_data_intr: error=0x04 { DriveStatusError }

I'll try Danny's hint.

rafal


--

[ Rafal Bujnowski ][ e-mail: bujnor<at>go2.pl ]
[ http://www.bujnor.iq.pl/ ][ e-mail: bujnor<at>poczta.onet.pl ]
[ ICQ: 85602025 GG: 4174829 ][ Jabber: [email protected] ]

2003-05-14 17:16:22

by Mudama, Eric

Subject: RE: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]

0x5104 is a different can of worms from the other stuff you guys were
reporting.

5104 (status register = 0x51, error register 0x04) is the all-encompassing
"command abort" which is what the drive does any time you issue a command
with bad parameters, an invalid (immoral?) command, or some of the security
stuff out of sequence. Most commonly it is seen attempting to enable
features on a drive that doesn't support them.
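
As a rough illustration of how the two register values combine, here is a small sketch that decodes the ATA status byte and recognizes the "command abort" pair. The labels follow the names visible in the kernel log lines quoted in this thread; the helpers ata_status_decode() and is_command_abort() are hypothetical names for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ATA_STAT_ERR 0x01  /* status register: error bit */
#define ATA_ERR_ABRT 0x04  /* error register: command aborted */

struct ata_flag {
    unsigned char bit;
    const char *name;
};

/* ATA status register bits, highest bit first. */
static const struct ata_flag ata_status_flags[] = {
    { 0x80, "Busy"           },
    { 0x40, "DriveReady"     },
    { 0x20, "DeviceFault"    },
    { 0x10, "SeekComplete"   },
    { 0x08, "DataRequest"    },
    { 0x04, "CorrectedError" },
    { 0x02, "Index"          },
    { 0x01, "Error"          },
};

/* Write the names of all set status bits into out, separated by spaces.
 * Output is truncated if out is too small. Returns out. */
static char *ata_status_decode(unsigned char status, char *out, size_t len)
{
    size_t i;

    assert(out != NULL && len > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof ata_status_flags / sizeof ata_status_flags[0]; i++) {
        if (!(status & ata_status_flags[i].bit))
            continue;
        if (out[0] != '\0')
            strncat(out, " ", len - strlen(out) - 1);
        strncat(out, ata_status_flags[i].name, len - strlen(out) - 1);
    }
    return out;
}

/* True when the drive flagged an error and the error register says the
 * command was aborted, e.g. status 0x51 with error 0x04. */
static bool is_command_abort(unsigned char status, unsigned char error)
{
    return (status & ATA_STAT_ERR) && (error & ATA_ERR_ABRT);
}
```

With this, status 0x50 from the DMA reports has no error bit set at all, while 0x51 with error 0x04 is the abort case described above, which is why the two reports are different problems.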

--eric


2003-05-14 17:28:15

by Jens Axboe

Subject: Re: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]

On Wed, May 14 2003, Mudama, Eric wrote:
> 0x5104 is a different can of worms from the other stuff you guys were
> reporting.
>
> 5104 (status register = 0x51, error register 0x04) is the all-encompassing
> "command abort" which is what the drive does any time you issue a command
> with bad parameters, an invalid (immoral?) command, or some of the security
> stuff out of sequence. Most commonly it is seen attempting to enable
> features on a drive that doesn't support them.

Which reminds me that it has always annoyed me that Linux doesn't print
the failed command. That just leaves a lot of guesswork... I'll try to
remedy that.

--
Jens Axboe

Subject: Re: hdb: dma_timer_expiry: dma status == 0x64 [2.5.69]


On Wed, 14 May 2003, Jens Axboe wrote:

> On Wed, May 14 2003, Mudama, Eric wrote:
> > 0x5104 is a different can of worms from the other stuff you guys were
> > reporting.
> >
> > 5104 (status register = 0x51, error register 0x04) is the all-encompassing
> > "command abort" which is what the drive does any time you issue a command
> > with bad parameters, an invalid (immoral?) command, or some of the security
> > stuff out of sequence. Most commonly it is seen attempting to enable
> > features on a drive that doesn't support them.

In reality it's harmless, only noisy: the IDE driver tried to do something
like checking the max native address, and the drive doesn't support it.

> Which reminds me that it has always annoyed me that Linux doesn't print
> the failed command. That just leaves a lot of guesswork... I'll try to
> remedy that.

Which reminds me that somebody (me?) should fix hdparm so that it does not use
WIN_IDENTIFY with the HDIO_DRIVE_CMD ioctl but uses the HDIO_GET_IDENTIFY ioctl.
That command will be aborted by an ATAPI device, and hdparm doesn't know
whether the device is ATA or ATAPI. It currently only works because the ioctl
handler reads data from the device *before* checking status (or something like that).

Back on topic: Jens, you can also check the current status of the per-driver
->abort() introduced by Alan.

BTW, I will be quite busy for the next 3 weeks (exams), so expect long
delays in replies and slower IDE progress.

--
Bartlomiej

> --
> Jens Axboe