2002-09-04 09:43:21

by Florian Hinzmann

Subject: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

Hi!

I have problems with DMA mode on one of my boxes (more technical
details at the end of this mail).

It has one small disk (hda) and three disks bigger than 32GB. The machine
does not boot with such big hard disks connected, even if they are not
listed in the BIOS. To circumvent this, the three Maxtor disks (hdb,hdc,hdd)
have jumpers set which reduce them to 4092 cylinders. I used the setmax utility
before to "unclip" them, but now the STROKE kernel option does this job and
that works fine.

But I currently issue a "hdparm -d0" for each of them at bootup and they
run fine then. Enabling DMA with "hdparm -d1" (or not using hdparm at all)
leads to errors like the following, quite quickly and reproducibly:

kernel: hdb: dma_timer_expiry: dma status == 0x60
kernel: hdb: timeout waiting for DMA
kernel: hdb: timeout waiting for DMA
kernel: hdb: (__ide_dma_test_irq) called while not waiting
kernel: hdb: status error: status=0x58 { DriveReady SeekComplete DataRequest }
kernel:
kernel: hdb: drive not ready for command

Turning DMA off again stops these.
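
The two hex values in the log above can be checked by hand. The following is a minimal sketch, not the driver's code: it assumes the standard ATA status register layout and the usual bus-master IDE status register layout. The ATA bit names follow the strings the kernel prints; the bus-master bit names are labels made up for this example.

```python
# ATA status register bits, named after the strings seen in the kernel log.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Bus-master IDE status register bits (reserved bits 3 and 4 omitted).
# These labels are illustrative only, they are not kernel identifiers.
BM_STATUS_BITS = [
    (0x80, "SimplexOnly"),
    (0x40, "Drive1DmaCapable"),
    (0x20, "Drive0DmaCapable"),
    (0x04, "Interrupt"),
    (0x02, "Error"),
    (0x01, "Active"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


if __name__ == "__main__":
    print(hex(0x58), decode(0x58, ATA_STATUS_BITS))
    print(hex(0x60), decode(0x60, BM_STATUS_BITS))
```

Under these assumptions, 0x58 matches the three flags printed in the log, and a DMA status of 0x60 has only the two "DMA capable" bits set, with neither the interrupt nor the error bit raised. That would fit a transfer for which no completion interrupt was seen, but it does not say why.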


I'd love to hear about any experience other people have with this mainboard,
or even a statement on whether DMA is supposed to work with my setup.


Technical details below. If anything is missing please say so and I
will get it.


Regards

Florian

-------------------------------------------------------------------------------
Mainboard: Asus P/I-XP55T2P4

--- part of dmesg -------------------------------------------------------------
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha1
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX3: IDE controller at PCI slot 00:07.1
PIIX3: chipset revision 0
PIIX3: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xe800-0xe807, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xe808-0xe80f, BIOS settings: hdc:DMA, hdd:DMA
keyboard: Timeout - AT keyboard not present?(ed)
keyboard: Timeout - AT keyboard not present?(f4)
hda: Maxtor 82560A4, ATA DISK drive
hdb: Maxtor 4D080H4, ATA DISK drive
hdc: Maxtor 4W080H6, ATA DISK drive
hdd: Maxtor 4D060H3, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: host protected area => 1
hda: 5001728 sectors (2561 MB) w/256KiB Cache, CHS=620/128/63, DMA
hdb: host protected area => 1
hdb: 160086527 sectors (81964 MB) w/2048KiB Cache, CHS=9964/255/63, (U)DMA
hdc: host protected area => 1
hdc: 160086528 sectors (81964 MB) w/2048KiB Cache, CHS=158816/16/63, (U)DMA
hdd: host protected area => 1
hdd: 120069935 sectors (61476 MB) w/2048KiB Cache, CHS=7474/255/63, (U)DMA
Partition check:
hda: hda1 hda2
hdb: hdb1
hdc: hdc1
hdd: hdd1

--- part of lspci -vv ---------------------------------------------------------
00:07.1 IDE interface: Intel Corp. 82371SB PIIX3 IDE [Natoma/Triton II] (prog-if 80 [Master])
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32
Region 4: I/O ports at e800 [size=16]

--- atapci (v0.50) ------------------------------------------------------------
pcibus = 33333
00:07.1 vendor=8086 device=7010 class=0101 irq=0 base4=e801
----------PIIX BusMastering IDE Configuration---------------
Driver Version: 1.3
South Bridge: 28688
Revision: IDE 0
Highest DMA rate: MWDMA16
BM-DMA base: 0xe800
PCI clock: 33.3MHz
-----------------------Primary IDE-------Secondary IDE------
Enabled:                       yes                 yes
Simplex only:                   no                  no
Cable Type:                    40w                 40w
-------------------drive0----drive1----drive2----drive3-----
Prefetch+Post:        yes       yes       yes       yes
Transfer Mode:        PIO       PIO       PIO       PIO
Address Setup:       90ns      90ns      90ns      90ns
Cmd Active:         360ns     360ns     360ns     360ns
Cmd Recovery:       540ns     540ns     540ns     540ns
Data Active:         90ns      90ns      90ns      90ns
Data Recovery:       30ns      30ns      30ns      30ns
Cycle Time:         120ns     120ns     120ns     120ns
Transfer Rate:   16.6MB/s  16.6MB/s  16.6MB/s  16.6MB/s
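
The timing figures in the atapci table are consistent with each other. As a rough check, assuming one 16-bit (2-byte) transfer per data cycle and a cycle made up of the active plus the recovery time, the arithmetic looks like this (the function name is made up for the example):

```python
def transfer_rate_mb_s(cycle_ns, bytes_per_cycle=2):
    """Peak rate in MB/s (10**6 bytes/s), assuming one transfer per cycle."""
    return bytes_per_cycle / (cycle_ns * 1e-9) / 1e6


# Values copied from the atapci output above.
data_active_ns = 90
data_recovery_ns = 30
cycle_ns = data_active_ns + data_recovery_ns

if __name__ == "__main__":
    print(cycle_ns, "ns per cycle")
    print(round(transfer_rate_mb_s(cycle_ns), 2), "MB/s peak")
```

This gives a 120ns cycle and a peak of about 16.67 MB/s, which the tool shows as 16.6MB/s. It is a theoretical ceiling for the programmed timings, not a measurement.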

--- IDE part of .config --------------------------------------------------------
#
# ATA/IDE/MFM/RLL support
#
CONFIG_IDE=y

#
# IDE, ATA and ATAPI Block devices
#
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_HD_IDE is not set
# CONFIG_BLK_DEV_HD is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_IDEDISK_STROKE=y
# CONFIG_BLK_DEV_IDEDISK_VENDOR is not set
# CONFIG_BLK_DEV_IDEDISK_FUJITSU is not set
# CONFIG_BLK_DEV_IDEDISK_IBM is not set
# CONFIG_BLK_DEV_IDEDISK_MAXTOR is not set
# CONFIG_BLK_DEV_IDEDISK_QUANTUM is not set
# CONFIG_BLK_DEV_IDEDISK_SEAGATE is not set
# CONFIG_BLK_DEV_IDEDISK_WD is not set
# CONFIG_BLK_DEV_COMMERIAL is not set
# CONFIG_BLK_DEV_TIVO is not set
# CONFIG_BLK_DEV_IDECS is not set
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDECD_BAILOUT=y
CONFIG_BLK_DEV_IDETAPE=m
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
# CONFIG_IDE_TASKFILE_IO is not set

#
# IDE chipset support/bugfixes
#
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_ISAPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_OFFBOARD is not set
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_PCI_WIP=y
CONFIG_IDEDMA_NEW_DRIVE_LISTINGS=y
CONFIG_BLK_DEV_ADMA=y
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_WDC_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_AMD74XX_OVERRIDE is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_HPT34X_AUTODMA is not set
# CONFIG_BLK_DEV_HPT366 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_NFORCE is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_PDC202XX_BURST is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_PDC202XX_FORCE is not set
# CONFIG_BLK_DEV_RZ1000 is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_IDEDMA_IVB is not set
# CONFIG_DMA_NONPCI is not set
CONFIG_BLK_DEV_IDE_MODES=y
# CONFIG_BLK_DEV_ATARAID is not set
# CONFIG_BLK_DEV_ATARAID_PDC is not set
# CONFIG_BLK_DEV_ATARAID_HPT is not set


--
Florian Hinzmann private: [email protected]
Debian: [email protected]
PGP Key / ID: 1024D/B4071A65
Fingerprint : F9AB 00C1 3E3A 8125 DD3F DF1C DF79 A374 B407 1A65


2002-09-14 14:31:06

by Jan-Hinnerk Reichert

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

Florian Hinzmann wrote:

> Hi!
>
> I have problems with DMA mode at one of my boxes ( more technical
> details at the end of this mail ).
[...]
> But I do issue a "hdparm -d0" for each of them at bootup currently and
> they are running fine then. Enabling DMA with "hdparm -d1" (or not using
> hdparm at all) leads to errors like the following quite fast and
> reproducable:
>
> kernel: hdb: dma_timer_expiry: dma status == 0x60
> kernel: hdb: timeout waiting for DMA
> kernel: hdb: timeout waiting for DMA
> kernel: hdb: (__ide_dma_test_irq) called while not waiting
> kernel: hdb: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> kernel:
> kernel: hdb: drive not ready for command
>
> Turning DMA off again stops these.
>
>
> I'd love to hear any experience other people have with this mainboard
> or even some statement if DMA is supposed to work with my setup.

I had some problems like this using 2.4.17 on a PIIX3 board (don't know the
board type). The problems disappeared after switching to 2.4.19.
Unfortunately I had to change the processor and processor fan about the same
time.

I tend to believe that this problem was related to CPU temperature not to
kernel bugs. I don't have any means of measuring CPU temperature on this
board. But the old CPU certainly burnt out, because the fan was not
working too well ;-(((

2002-09-14 22:38:15

by Andre Hedrick

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2


Yep I had that problem too and fixed it.
Please try a newer pre5-acX


On Sat, 14 Sep 2002, Jan-Hinnerk Reichert wrote:

> Florian Hinzmann wrote:
>
> > Hi!
> >
> > I have problems with DMA mode at one of my boxes ( more technical
> > details at the end of this mail ).
> [...]
> > But I do issue a "hdparm -d0" for each of them at bootup currently and
> > they are running fine then. Enabling DMA with "hdparm -d1" (or not using
> > hdparm at all) leads to errors like the following quite fast and
> > reproducable:
> >
> > kernel: hdb: dma_timer_expiry: dma status == 0x60
> > kernel: hdb: timeout waiting for DMA
> > kernel: hdb: timeout waiting for DMA
> > kernel: hdb: (__ide_dma_test_irq) called while not waiting
> > kernel: hdb: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> > kernel:
> > kernel: hdb: drive not ready for command
> >
> > Turning DMA off again stops these.
> >
> >
> > I'd love to hear any experience other people have with this mainboard
> > or even some statement if DMA is supposed to work with my setup.
>
> I had some problems like this using 2.4.17 on a PIIX3 board (don't know the
> board type). The problems disappeared after switching to 2.4.19.
> Unforunately I had to change the processor and processor fan about the same
> time.
>
> I tend to believe that this problem was related to CPU temperature not to
> kernel bugs. I don't have any means of measuring CPU temperature on this
> board. But the old CPU certainly burnt out, because the fan was not
> working too well ;-(((
>

Andre Hedrick
LAD Storage Consulting Group

2002-09-16 08:53:12

by Florian Hinzmann

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

Hello!


On 14-Sep-2002 Jan-Hinnerk Reichert wrote:

> Florian Hinzmann wrote:

>> I have problems with DMA mode at one of my boxes ( more technical
>> details at the end of this mail ).
> [...]
>> But I do issue a "hdparm -d0" for each of them at bootup currently and
>> they are running fine then. Enabling DMA with "hdparm -d1" (or not using
>> hdparm at all) leads to errors like the following quite fast and
>> reproducable:
>>
>> kernel: hdb: dma_timer_expiry: dma status == 0x60
>> kernel: hdb: timeout waiting for DMA
>> kernel: hdb: timeout waiting for DMA
>> kernel: hdb: (__ide_dma_test_irq) called while not waiting
>> kernel: hdb: status error: status=0x58 { DriveReady SeekComplete DataRequest }
>> kernel:
>> kernel: hdb: drive not ready for command
>>
>> Turning DMA off again stops these.
>>
>>
>> I'd love to hear any experience other people have with this mainboard
>> or even some statement if DMA is supposed to work with my setup.
>
> I had some problems like this using 2.4.17 on a PIIX3 board (don't know the
> board type). The problems disappeared after switching to 2.4.19.
> Unforunately I had to change the processor and processor fan about the same
> time.
>
> I tend to believe that this problem was related to CPU temperature not to

I don't think CPU temperature is the problem in my case. The fan is
working well and the box runs stable under high load for hours or days
as long as there is no heavy disk activity. But copying files for a few
minutes kills it reliably.
I will try to watch this more closely, though. Might running the CPU at
100MHz instead of 200MHz to keep it cooler be worth a try? I could also try
to find another CPU to put into my machine.


Greetings
Florian



--
Florian Hinzmann private: [email protected]
Debian: [email protected]
PGP Key / ID: 1024D/B4071A65
Fingerprint : F9AB 00C1 3E3A 8125 DD3F DF1C DF79 A374 B407 1A65

2002-09-16 11:12:36

by Florian Hinzmann

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2


On 14-Sep-2002 Andre Hedrick wrote:

> Yep I had that problem too and fixed it.
> Please try a newer pre5-acX

Problem is still there with 2.4.20-pre5-ac6:

kernel: hdd: dma_timer_expiry: dma status == 0x60
kernel: hdd: timeout waiting for DMA
kernel: hdd: timeout waiting for DMA
kernel: hdd: (__ide_dma_test_irq) called while not waiting
kernel: hdd: status error: status=0x58 { DriveReady SeekComplete DataRequest }
kernel:
kernel: hdd: drive not ready for command
kernel: blk: queue c02e50e0, I/O limit 4095Mb (mask 0xffffffff)


No high load (whether CPU or disk I/O) is needed for this to happen.


In my initial mail I said the machine is running stable with DMA turned off.
This is not true for the latest 2.4.20-pre5 kernels. When I start one or two
bigger file copy operations, it usually takes less than one minute until I get
errors like these (running 2.4.20-pre5-ac6 for this output):

kernel: hdb: status timeout: status=0xd0 { Busy }
kernel:
kernel: hdb: no DRQ after issuing WRITE
kernel: ide0: reset: success
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: ide0: reset: success
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: ide0: reset: success
kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
kernel: end_request: I/O error, dev 03:41 (hdb), sector 97567008
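
A few numbers in this log can be cross-checked. The sketch below assumes the standard ATA error register layout (shown with ATA mnemonics; the kernel prints IDNF as "SectorIdNotFound") and the IDE minor numbering with 6 partition bits per drive. The helper names are made up for the example. Reading the difference between the two sector numbers as a partition start offset is a guess, not something the log states.

```python
# ATA error register bits, using the ATA mnemonics.
ATA_ERROR_BITS = [
    (0x80, "BBK/ICRC"),
    (0x40, "UNC"),
    (0x20, "MC"),
    (0x10, "IDNF"),   # printed by the kernel as SectorIdNotFound
    (0x08, "MCR"),
    (0x04, "ABRT"),
    (0x02, "TK0NF"),
    (0x01, "AMNF"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def split_ide_minor(minor, partn_bits=6):
    """Split an IDE minor number into (drive unit on the interface, partition)."""
    return minor >> partn_bits, minor & ((1 << partn_bits) - 1)


# Values copied from the log above.
lba_from_drive = 97567071            # LBAsect= in the read_intr line
sector_from_end_request = 97567008   # sector in the end_request line
offset = lba_from_drive - sector_from_end_request

if __name__ == "__main__":
    print(decode(0x10, ATA_ERROR_BITS))   # error=0x10
    print(split_ide_minor(0x41))          # dev 03:41
    print(offset)
```

With these assumptions, minor 0x41 is partition 1 on the second drive of the interface (hdb1), and the two sector numbers differ by 63. That would be consistent with one value being relative to a partition starting at sector 63 and the other being the absolute LBA, but the partition table is not shown in this thread.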


2.4.20-pre5-ac6 does not work for me with or without DMA. Using 2.4.19 that
machine is running stable with DMA turned off. Would it be interesting to hear
which 2.4.20-preX-acY kernel was the first to break PIO mode on my machine?


Regards
Florian



--
Florian Hinzmann private: [email protected]
Debian: [email protected]
PGP Key / ID: 1024D/B4071A65
Fingerprint : F9AB 00C1 3E3A 8125 DD3F DF1C DF79 A374 B407 1A65

2002-09-16 12:26:39

by Alan

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

On Mon, 2002-09-16 at 12:17, Florian Hinzmann wrote:
> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }

Which is the drive reporting a physical media error

2002-09-16 14:21:35

by Florian Hinzmann

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2


On 16-Sep-2002 Alan Cox wrote:
> On Mon, 2002-09-16 at 12:17, Florian Hinzmann wrote:
>> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
>> kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
>> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
>
> Which is the drive reporting a physical media error

Which seems to exist only with the named combinations of DMA access
and kernel versions. Using e.g. 2.4.19 without DMA I can access the same data,
dd the whole disk to /dev/null or run badblocks checks without finding
any physical media errors.

2.4.19 should complain, too, if there really is a physical error, right?


Regards
Florian



--
Florian Hinzmann private: [email protected]
Debian: [email protected]
PGP Key / ID: 1024D/B4071A65
Fingerprint : F9AB 00C1 3E3A 8125 DD3F DF1C DF79 A374 B407 1A65

2002-09-16 14:39:46

by Alan

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

On Mon, 2002-09-16 at 15:26, Florian Hinzmann wrote:
>
> On 16-Sep-2002 Alan Cox wrote:
> > On Mon, 2002-09-16 at 12:17, Florian Hinzmann wrote:
> >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> >> kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
> >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> >
> > Which is the drive reporting a physical media error
>
> Which seems to exist only while using the named combinations of DMA access
> and kernel versions. While using i.e. 2.4.19 without DMA I can access the same data,
> dd the whole disk to /dev/null or run badblock checks without finding
> any physical media errors.
>
> 2.4.19 should complain, too, if there is a physical error indeed, right?

The "sectoridnotfound" return is from the drive. That makes it very hard
to believe it isn't a physical error

2002-09-16 15:25:05

by Florian Hinzmann

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

Hello Alan!


On 16-Sep-2002 Alan Cox wrote:
> On Mon, 2002-09-16 at 15:26, Florian Hinzmann wrote:
>>
>> On 16-Sep-2002 Alan Cox wrote:
>> > On Mon, 2002-09-16 at 12:17, Florian Hinzmann wrote:
>> >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
>> >> kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
>> >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
>> >
>> > Which is the drive reporting a physical media error
>>
>> Which seems to exist only while using the named combinations of DMA access
>> and kernel versions. While using i.e. 2.4.19 without DMA I can access the same data,
>> dd the whole disk to /dev/null or run badblock checks without finding
>> any physical media errors.
>>
>> 2.4.19 should complain, too, if there is a physical error indeed, right?
>
> The "sectoridnotfound" return is from the drive. That makes it very hard
> to believe it isnt a physical error

This is what makes me believe it:

- I get "SectorIdNotFound" with differing sector numbers on three physical hard disk drives.
At least two of these drives are not very old. One of them was bought only a few weeks
ago.
- On several occasions I got this error while copying one given file. After rebooting
(the same kernel with the same settings) accessing the same file was fine.
- I can reproduce these errors within one or two minutes by starting to copy random groups
of files from those disks. That makes me think it is not one particular area of the disk(s).
On the other hand, I can start several copy commands at once, using the same commands
as before, without getting any errors with e.g. 2.4.19 without DMA. It then runs
stable for a long time (I think I tried times up to one hour).


My BIOS doesn't like disks larger than 32GB. Therefore the three disks are jumpered to
appear as 32GB (Maxtor disks) and I am using setmax or CONFIG_IDEDISK_STROKE,
respectively, to unclip them.
Maybe this setup has something to do with the possibly bogus SectorIdNotFound message?


I am eager to sort this out. I would like to do the following:

- First, prove the drives don't have physical media errors. Is running "e2fsck -c"
sufficient? Do you have other suggestions to stress test my drives?

- Second, try hard to exclude other possible reasons for my drives returning that
message.
While digging through the archives I found one mail stating that an old/broken/weak
power supply caused those messages and a new one made them go away. I have already
built in a stronger and newer power supply - it did not change anything.
I have changed nearly every part of that box in the past. Any ideas what else might
produce these errors? Too much heat? Cabling?

- Third, take a look at the software again. Is there anything I could try to help
debug this, e.g. trying every -ac release since 2.4.19 to find out which
version is the first with these errors?


I would highly appreciate some guidance, as I have been trying to solve this
for months now and am somewhat at a loss at this point.


Regards
Florian


--
Florian Hinzmann private: [email protected]
Debian: [email protected]
PGP Key / ID: 1024D/B4071A65
Fingerprint : F9AB 00C1 3E3A 8125 DD3F DF1C DF79 A374 B407 1A65

2002-09-16 15:27:14

by Russell King

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

On Mon, Sep 16, 2002 at 04:26:24PM +0200, Florian Hinzmann wrote:
>
> On 16-Sep-2002 Alan Cox wrote:
> > On Mon, 2002-09-16 at 12:17, Florian Hinzmann wrote:
> >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> >> kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
> >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> >
> > Which is the drive reporting a physical media error
>
> Which seems to exist only while using the named combinations of DMA access
> and kernel versions. While using i.e. 2.4.19 without DMA I can access the same data,
> dd the whole disk to /dev/null or run badblock checks without finding
> any physical media errors.

Let me get this straight. If you run this exact procedure, what happens?

1. start with DMA turned on
2. dd the whole disk to /dev/null
3. disable DMA
4. dd the whole disk to /dev/null
5. re-enable DMA
6. dd the whole disk to /dev/null
7. disable DMA
8. dd the whole disk to /dev/null

Are you saying that steps 2 and 6 produce a sectoridnotfound error, while
steps 4 and 8 work without problems?
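
The eight steps above could be scripted along these lines. This is a sketch, not a tested tool: it uses the "hdparm -d1" / "hdparm -d0" switches mentioned earlier in the thread and a plain dd read, while the device name, the block size and the function names are choices made for the example. Run as written it only prints the plan; actually executing it needs root and reads the whole disk four times.

```python
import subprocess


def build_plan(device, rounds=2):
    """Return the command list: (DMA on, read disk, DMA off, read disk) x rounds."""
    plan = []
    for _ in range(rounds):
        for dma in (1, 0):
            plan.append(["hdparm", f"-d{dma}", device])
            plan.append(["dd", f"if={device}", "of=/dev/null", "bs=1M"])
    return plan


def run_plan(plan):
    """Run each command in order and record its exit status.

    A non-zero status from dd indicates a read error; the kernel log
    should be checked after each step as well.
    """
    results = []
    for cmd in plan:
        proc = subprocess.run(cmd)
        results.append((cmd, proc.returncode))
    return results


if __name__ == "__main__":
    # Dry run: show the commands without executing them.
    for cmd in build_plan("/dev/hdb"):
        print(" ".join(cmd))
```

Keeping the plan separate from its execution makes it easy to review the exact commands before anything touches the disk.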

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-09-16 16:08:30

by Jan-Hinnerk Reichert

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

On Monday, 16 September 2002 16:46, Alan Cox wrote:
> On Mon, 2002-09-16 at 15:26, Florian Hinzmann wrote:
> > On 16-Sep-2002 Alan Cox wrote:
> > > On Mon, 2002-09-16 at 12:17, Florian Hinzmann wrote:
> > >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> > >> kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
> > >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> > >
> > > Which is the drive reporting a physical media error
> >
> > Which seems to exist only while using the named combinations of DMA
> > access and kernel versions. While using i.e. 2.4.19 without DMA I can
> > access the same data, dd the whole disk to /dev/null or run badblock
> > checks without finding any physical media errors.
> >
> > 2.4.19 should complain, too, if there is a physical error indeed, right?
>
> The "sectoridnotfound" return is from the drive. That makes it very hard
> to believe it isnt a physical error

Is the LBA sector number in the error coming from the drive?

Is the drive addressed with LBA or CHS?

Is it possible that this error occurs if the LBA (or CHS) is out of bounds?
(e.g. a 40GB drive shouldn't have sector 97567071)
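
The out-of-bounds question can be checked against the numbers already posted. The sketch below takes the sector count for hdb from the dmesg output in the first mail and the failing LBA from the error log. The clipped size is an assumption: it simply takes "32GB" as 32 GiB, since the exact capacity the jumper reports is not given in this thread.

```python
SECTOR_BYTES = 512

native_sectors = 160086527   # hdb, from dmesg in the first mail
failing_lba = 97567071       # LBAsect= from the read_intr error

# Assumption: the jumper clips the drive to 32 GiB. The real clipped
# sector count is not stated in the thread and may be somewhat lower.
clipped_sectors = 32 * 2**30 // SECTOR_BYTES


def classify(lba, clipped, native):
    """Say where an LBA falls relative to the clipped and native sizes."""
    if lba >= native:
        return "beyond native capacity"
    if lba >= clipped:
        return "valid only while unclipped"
    return "valid even when clipped"


if __name__ == "__main__":
    print(clipped_sectors)
    print(classify(failing_lba, clipped_sectors, native_sectors))
```

Under that assumption the failing sector lies inside the drive's full size but above the clipped size. So the address would only be out of bounds if the drive were operating at its clipped capacity at that moment. Whether that can happen here, for example after the "ide0: reset" seen in the log, is not established in this thread.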

2002-09-16 16:14:56

by Daniela Engert

Subject: Re: DMA problems w/ PIIX3 IDE, 2.4.20-pre4-ac2

On 16 Sep 2002 15:46:35 +0100, Alan Cox wrote:

>On Mon, 2002-09-16 at 15:26, Florian Hinzmann wrote:

>> > On Mon, 2002-09-16 at 12:17, Florian Hinzmann wrote:
>> >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
>> >> kernel: hdb: read_intr: error=0x10 { SectorIdNotFound }, LBAsect=97567071, high=5, lo
>> >> kernel: hdb: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
>> >
>> > Which is the drive reporting a physical media error
>>
>> Which seems to exist only while using the named combinations of DMA access
>> and kernel versions. While using i.e. 2.4.19 without DMA I can access the same data,
>> dd the whole disk to /dev/null or run badblock checks without finding
>> any physical media errors.
>>
>> 2.4.19 should complain, too, if there is a physical error indeed, right?
>
>The "sectoridnotfound" return is from the drive. That makes it very hard
>to believe it isnt a physical error

It might be a seek beyond end-of-media error as well (see ATA spec for
error reporting in this case: either ABRT or IDNF shall be set)!

Ciao,
Dani