2010-01-14 23:20:45

by Neil Schemenauer

Subject: ATA error with Asus A7VX board

Hi,

I have what seems to be a poorly designed board (based on the number
of problems that a Google search turns up). However, since I guess
there are still people that have it and I have some capability to do
some debugging I thought I should try to fix it rather than throwing
it out.

It is possible that my problem is caused by a bad cable, bad memory,
or a bad drive, but I suspect the board is just temperamental
(specifically, the way interrupts are handled). The drive had a lot
of bad sectors a while ago but writing over them seemed to have
fixed it. The drive had been working without errors with another
board for a while.

I've tried booting with and without the "noapic" command line option
and with the old ide driver and the pata_via driver. Each
configuration seems to have its own problems. Certain configurations
generated errors like "IRQ nobody cared" and "spurious interrupt".
The noapic and ide driver combination results in the following error:

[ 366.463909] spurious 8259A interrupt: IRQ7.
[ 5432.078546] hda: task_no_data_intr: status=0x30 { DeviceFault SeekComplete }
[ 5432.078555] hda: possibly failed opcode: 0xb0
[ 5462.224012] hda: lost interrupt

The "noapic" and pata_via combination is what I'm running now. It
works for a while and then I get an error like the following:

ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata4.00: BMDMA stat 0x65
ata4.00: failed command: READ DMA
ata4.00: cmd c8/00:18:4f:27:1b/00:00:00:00:00/e0 tag 0 dma 12288 in
res 51/10:18:4f:27:1b/00:00:00:00:00/a0 Emask 0x81 (invalid argument)
ata4.00: status: { DRDY ERR }
ata4.00: error: { IDNF }
ata4.00: configured for UDMA/100
ata4.01: configured for UDMA/100
sd 3:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
sd 3:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
00 1b 27 4f
sd 3:0:0:0: [sda] ASC=0x14 ASCQ=0x0
sd 3:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 00 1b 27 4f 00 00 18 00
end_request: I/O error, dev sda, sector 1779535
ata4: EH complete
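
As a cross-check on the log above, the SCSI command block can be decoded by hand. This is only an illustrative sketch and not part of the original report. It assumes the standard READ(10) layout (opcode, flags, 4-byte big-endian LBA, group number, 2-byte transfer length, control); the function name `decode_read10` is made up for the example.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as a string of hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: start LBA
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return lba, blocks


# CDB copied from the "[sda] CDB:" log line.
lba, blocks = decode_read10("28 00 00 1b 27 4f 00 00 18 00")
print(lba, blocks)
```

The decoded start LBA is the same sector reported by the end_request line, and, assuming 512-byte sectors, the block count accounts for the 12288 bytes of the failed DMA transfer.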

I'm attaching the kernel log for the pata_via and ide setups and the
output from lspci.

Best regards,

Neil


Attachments:
(No filename) (2.16 kB)
lspci-v.txt (5.30 kB)
smartctl.txt (10.23 kB)
kernlog-ide.txt.gz (10.37 kB)
kernlog-pata.txt.gz (9.65 kB)

2010-01-14 23:31:51

by Alan

Subject: Re: ATA error with Asus A7VX board

> [ 366.463909] spurious 8259A interrupt: IRQ7.
> [ 5432.078546] hda: task_no_data_intr: status=0x30 { DeviceFault SeekComplete }
> [ 5432.078555] hda: possibly failed opcode: 0xb0
> [ 5462.224012] hda: lost interrupt

DeviceFault comes from the drive
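
For readers who want to check the status byte themselves, here is a minimal sketch. It assumes the classic ATA status register bit layout and uses the long names the old IDE driver prints; the table and the helper `decode_status` are illustrative, not taken from the kernel source.

```python
# Classic ATA status register bits (assumed layout), highest bit first.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


# status=0x30 from the task_no_data_intr line.
print(decode_status(0x30))
```

Bit 0x20 is set by the device itself, which is why a DeviceFault points at the drive rather than at the controller or the cable.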

> ata4.00: failed command: READ DMA
> ata4.00: cmd c8/00:18:4f:27:1b/00:00:00:00:00/e0 tag 0 dma 12288 in
> res 51/10:18:4f:27:1b/00:00:00:00:00/a0 Emask 0x81 (invalid argument)
> ata4.00: status: { DRDY ERR }
> ata4.00: error: { IDNF }

IDNF - ID not found. The drive couldn't find the sector you requested.
Some drives do this if they get an invalid sector number in the command
but that doesn't look to be the case this time.
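
The sector number can be recovered from the "cmd" line to confirm this. The sketch below assumes the libata print order command/feature:nsect:lbal:lbam:lbah/(five high-order bytes)/device and LBA28 addressing for READ DMA (0xc8); `decode_taskfile` is a hypothetical helper written for this illustration.

```python
def decode_taskfile(tf):
    """Decode an LBA28 taskfile string as printed on the libata 'cmd' line."""
    cmd_hex, low, _high, dev_hex = tf.split("/")
    command = int(cmd_hex, 16)
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    device = int(dev_hex, 16)
    # LBA28: the low nibble of the device register holds LBA bits 24..27.
    lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    # In LBA28 commands a sector count of 0 means 256 sectors.
    count = nsect or 256
    return command, lba, count


command, lba, count = decode_taskfile("c8/00:18:4f:27:1b/00:00:00:00:00/e0")
print(hex(command), lba, count)
```

The result is LBA 1779535 for 24 sectors, the same sector the end_request line reports, so the request itself looks well formed.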

> I'm attaching the kernel log for the pata_via and ide setups and the
> output from lspci.

I'd say it's a busted drive. If you had bad cables I'd expect to see CRC
errors, not IDNF.
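
The distinction can be read straight off the "res" line. A rough sketch, assuming the first two bytes of "res" are the status and error registers and covering only the four error bits libata names in its logs (ICRC, UNC, IDNF, ABRT); the helper names are invented for the example.

```python
# Subset of ATA error register bits (assumed layout); other bits are omitted.
ERROR_BITS = [
    (0x80, "ICRC"),  # interface CRC error during a UDMA transfer
    (0x40, "UNC"),   # uncorrectable data error
    (0x10, "IDNF"),  # requested sector ID not found
    (0x04, "ABRT"),  # command aborted
]


def parse_res(res):
    """Split a libata 'res' string into (status, error) register values."""
    status_hex, rest = res.split("/", 1)
    error_hex = rest.split(":", 1)[0]
    return int(status_hex, 16), int(error_hex, 16)


def decode_error(error):
    """Return the names of the known bits set in an ATA error byte."""
    return [name for mask, name in ERROR_BITS if error & mask]


status, error = parse_res("51/10:18:4f:27:1b/00:00:00:00:00/a0")
names = decode_error(error)
print(names)
print("CRC error, check the cable" if "ICRC" in names else "no CRC error reported")
```

Here only the IDNF bit is set. A marginal cable would normally show up as ICRC instead.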

Alan

2010-01-16 18:21:23

by Bill Davidsen

Subject: Re: ATA error with Asus A7VX board

Neil Schemenauer wrote:
> Hi,
>
> I have what seems to be a poorly designed board (based on the number
> of problems that a Google search turns up). However, since I guess
> there are still people that have it and I have some capability to do
> some debugging I thought I should try to fix it rather than throwing
> it out.
>
> It is possible that my problem is caused by a bad cable, bad memory,
> or a bad drive, but I suspect the board is just temperamental
> (specifically, the way interrupts are handled). The drive had a lot
> of bad sectors a while ago but writing over them seemed to have
> fixed it. The drive was working without errors with another board
> for a while now.
>
I would reseat the cables (replace them if you can). After that, either the
drive or the power supply is suspect. Since changing the P/S is a PITA for a
test, if you have some monster video card you might pull it and put in
something low power.

Sure, it could be the board, but other things are more likely.

> I've tried booting with and without the "noapic" command line option
> and with the old ide driver and the pata_via driver. Each
> configuration seems to have its own problems. Certain configurations
> generated errors like "IRQ nobody cared" and "spurious interrupt".
> The noapic and ide driver combination results in the following error:
>
> [ 366.463909] spurious 8259A interrupt: IRQ7.
> [ 5432.078546] hda: task_no_data_intr: status=0x30 { DeviceFault SeekComplete }
> [ 5432.078555] hda: possibly failed opcode: 0xb0
> [ 5462.224012] hda: lost interrupt
>
> The "noapic" and pata_via is what I'm running now. It works for a
> while and then I get an error like the following:
>
> ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> ata4.00: BMDMA stat 0x65
> ata4.00: failed command: READ DMA
> ata4.00: cmd c8/00:18:4f:27:1b/00:00:00:00:00/e0 tag 0 dma 12288 in
> res 51/10:18:4f:27:1b/00:00:00:00:00/a0 Emask 0x81 (invalid argument)
> ata4.00: status: { DRDY ERR }
> ata4.00: error: { IDNF }
> ata4.00: configured for UDMA/100
> ata4.01: configured for UDMA/100
> sd 3:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
> sd 3:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
> Descriptor sense data with sense descriptors (in hex):
> 72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> 00 1b 27 4f
> sd 3:0:0:0: [sda] ASC=0x14 ASCQ=0x0
> sd 3:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 00 1b 27 4f 00 00 18 00
> end_request: I/O error, dev sda, sector 1779535
> ata4: EH complete
>
> I'm attaching the kernel log for the pata_via and ide setups and the
> output from lspci.
>
> Best regards,
>
> Neil
>


--
Bill Davidsen
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

2010-01-16 19:02:52

by Neil Schemenauer

Subject: Re: ATA error with Asus A7VX board

On Sat, Jan 16, 2010 at 01:21:03PM -0500, Bill Davidsen wrote:
> I would reseat the cables (replace if you can), after that either
> the drive or the power supply are suspects. since changing P/S is
> a PITA for test, if you have some monster video card you might
> pull it and put in something low power.

I will try changing the cable since it is possible it got damaged
when I changed the board. The power supply is a brand new Antec so I
think it's okay. Alan suggested it is a problem with the drive. I'm
willing to accept that but it's odd that the drive worked perfectly
fine for months with the old board.

Neil