2003-03-22 15:38:22

by Alan

Subject: IDE todo list

(Minus some stuff which is NDA'd because it involves unreleased chips
etc)

- Promise 20376
- Audit Promise drivers
- BIOS timing stuff
- Simplex mode reassignment intelligence
- IDE-SCSI crashes on 2.5
- IDE-SCSI/reset race on 2.4/2.5
- Forward port remaining drivers to 2.5
- Add ATAPI virtual DMA
- Add DMA active irq poll trick
- Clock switching for Highpoint 372N
- Support for SATA bridge on HPT
- Intel ICH5 errata audit
- Intel Centrino idents and errata audit [Merged for 2.5]
- Explain rather than just fix the CMD680 mmio collision problem
- Finish hotplug handling
- Revert identify hacks now ide-default is present
- Allow multiple driver binding for ide-cd/ide-scsi etc
- Locking for unload driver
- Locking for modular load onto a busy interface
- ADMA full support
- Mark Lord/Andre ideas on LBA28/LBA48
- Finish verifying 256 sector I/O or larger on LBA48
[How to handle change dynamically on hotplug ?]
- Clean up ide_unregister paths
- Finish ide pcmcia code hot unplug interface registers
- Finish ide pcmcia unregister path retry logic
- IRQ detect broken for some setups
- Check full PCI clocking info on HPT37x
- Rewrite HPT37x controller type logic
- Debug TRM290
- Fix up the hwif based sectors per transfer limit
- How to handle generic class IDE devices by class only
- Opti support for the 558 ?
- Can we resolve NDA's with SiS ?
- Audit ALi driver use of config register bits on bridge
- Document the calling properties for each driver function
- Work out how to fix up all the TCQ crashes
- Does taskfile I/O now work after the bug fixes ?
[do we care 8))]
- Multiple taskfile load support for controllers that have it
[big performance win]
- IDE specification issue - mishandling error abort on ATA6
- 20276 with i960 SX6000 mishandling
- Investigate breakage in ide-floppy on 2.4
- Merge new ACPI + relax into the -ac tree
- Get Arjan's info on IDE violations in simulator




--
Alan Cox <[email protected]>


2003-03-22 17:13:58

by Petr Vandrovec

Subject: Re: IDE todo list

On Sat, Mar 22, 2003 at 05:01:32PM +0000, Alan Cox wrote:
> (Minus some stuff which is NDA'd because it involves unreleased chips
> etc)
>
> - Audit Promise drivers
> - BIOS timing stuff

Hi,
Is there any hope that the Promise 20265 driver could detect a
non-UDMA66 cable and run at UDMA2 only? The BIOS detects this properly,
but the Linux driver wants to use UDMA100, and usually dies hard with
CRC errors while reading the PTBL extended chain (it also should not
lock up when a CRC error happens 5 times in a row, but ...).
Thanks,
Petr Vandrovec
[email protected]


2003-03-22 17:44:10

by Alan

Subject: Re: IDE todo list

On Sat, 2003-03-22 at 17:24, Petr Vandrovec wrote:
> any hope that promise 20265 driver could detect non-udma66 cable
> and run at udma2 only? BIOS properly detect this, but Linux driver
> wants to use udma100, and usually dies hard with CRC errors during
> reading of PTBL extended chain (it also should not lockup when
> CRC error happens 5 times in a row, but ...).

Five CRC errors in a row is what causes the DMA->PIO changedown. That
implies there is a real bug in the error handling locking, or in
the driver's handling of that.

Can you throw some printks into the IDE code and see what kind of
a death you get when it tries to change back to PIO?

As to the cable stuff, I'll take a look at it in time, but both
need fixing.
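
[Editor's illustration, not part of the original mail.] The "five CRC
errors in a row" rule can be modelled in userspace roughly as below.
This is only a sketch: struct drive_state and note_dma_result() are
hypothetical names, not the kernel's data structures, and resetting the
counter on a good transfer is an assumption drawn from the words
"in a row".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Threshold taken from the mail above: five CRC errors in a row. */
#define CRC_ERROR_LIMIT 5

/* Hypothetical per-drive state; this is not the kernel's ide_drive_t. */
struct drive_state {
    unsigned int crc_errors;  /* consecutive CRC errors seen so far */
    bool using_dma;           /* false once we have fallen back to PIO */
};

/*
 * Record the outcome of one DMA transfer.
 * Returns true when this call triggers the DMA->PIO changedown.
 */
static bool note_dma_result(struct drive_state *d, bool crc_error)
{
    assert(d != NULL);

    if (!d->using_dma)
        return false;           /* already in PIO, nothing to change */

    if (!crc_error) {
        d->crc_errors = 0;      /* assumption: a good transfer ends the run */
        return false;
    }

    if (++d->crc_errors < CRC_ERROR_LIMIT)
        return false;           /* not enough errors in a row yet */

    d->using_dma = false;       /* fall back to PIO */
    d->crc_errors = 0;
    return true;
}
```

In this model the fifth consecutive failing call returns true and every
later call is a no-op, which is the point at which the real driver was
reported to lock up.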

2003-03-22 19:00:49

by Nicholas Wourms

Subject: Re: IDE todo list

Alan Cox wrote:
> On Sat, 2003-03-22 at 17:24, Petr Vandrovec wrote:
[SNIP]
> As to the cable stuff, I'll take a look at it in time, but both
> need fixing
>
Alan,

The AMD Opus ide driver is also displaying symptoms of the
same problem I had in 2.4.21-ac with UDMA100. To refresh,
it was detecting 80w as 40w and 40w as 80w [reverse logic].
I am going to try the same fix which was posted for my
2.4.21-ac problem. I'll let you know if it worked...

Cheers,
Nicholas


2003-03-22 19:13:12

by Alan

Subject: Re: IDE todo list

On Sat, 2003-03-22 at 18:03, Nicholas Wourms wrote:
> The AMD Opus ide driver is also displaying symptoms of the
> same problem I had in 2.4.21-ac with UDMA100. To refresh,
> it was detecting 80w as 40w and 40w as 80w [reverse logic].
> I am going to try the same fix which was posted for my
> 2.4.21-ac problem. I'll let you know if it worked...

The cable detect stuff for AMD is fixed in the current driver,
I believe. It's not, however, fully resynced into 2.5.6x yet. I
need to finish merging the proc fixes into 2.4 before I do that.

2003-03-22 20:26:38

by Nicholas Wourms

Subject: Re: IDE todo list

Alan Cox wrote:
> On Sat, 2003-03-22 at 18:03, Nicholas Wourms wrote:
>
>>The AMD Opus ide driver is also displaying symptoms of the
>>same problem I had in 2.4.21-ac with UDMA100. To refresh,
>>it was detecting 80w as 40w and 40w as 80w [reverse logic].
>> I am going to try the same fix which was posted for my
>>2.4.21-ac problem. I'll let you know if it worked...
>
>
> The cable detect stuff for AMD is fixed in the current driver
> I believe. Its not however full resynched into 2.5.6x yet. I
> need to finish merging the proc fixes into 2.4 before I do that
>

Well, unfortunately, even without the little modification,
bk-current now locks up right after printing out the hdd
geometry/partition info. Apparently, something in the slew
of patches which went in today has broken my setup.
Unfortunately, I'm using the -mm3 patchset, so I can't say
for certain that it isn't causing problems with the recently
checked in code. Any suggestions on how to go about
debugging this? I'm using a USB keyboard, so I don't think
the Magic SysRq key will work at this point [alas, KDB
is currently non-functional for 2.5].

Cheers,
Nicholas


2003-03-22 22:09:40

by Alan

Subject: Re: IDE todo list

On Sat, 2003-03-22 at 20:33, Nicholas Wourms wrote:
> Well unfortunately, even with out the little modification,
> bk-current now locks right after printing out the hdd
> geometry/partition info. Apparently, something in the slew
> of patches which went in today has broke my setup.

I'm currently trying to work this out. However, all my boxes
work beautifully, both single and multiprocessor, which is
making it a bit tricky to work out.


2003-03-23 03:39:05

by Petr Vandrovec

Subject: Re: IDE todo list

On Sat, Mar 22, 2003 at 07:07:21PM +0000, Alan Cox wrote:
> On Sat, 2003-03-22 at 17:24, Petr Vandrovec wrote:
> > any hope that promise 20265 driver could detect non-udma66 cable
> > and run at udma2 only? BIOS properly detect this, but Linux driver
> > wants to use udma100, and usually dies hard with CRC errors during
> > reading of PTBL extended chain (it also should not lockup when
> > CRC error happens 5 times in a row, but ...).
>
> The five CRC in a row is what causes the DMA->PIO changedown. That
> implies there is a real bug in the error handling locking, or in
> the driver handling of that.
>
> Can you throw some printks into the ide code and see what kind of
> a death you get when it tries to change back to PIO.
>
> As to the cable stuff, I'll take a look at it in time, but both
> need fixing

This one fixes the lockup, but I'm not actually sure that it is better
than before, as both channels are downgraded to PIO:

... CRC error stuff ...
PDC202XX: Secondary channel reset
PDC202XX: Primary channel reset
ide3: reset: master: error (0x00?)
... USB stuff & so on, before mounting root ...
... repeat 4 times:
hde: dma_timer_expiry: dma status = 0x21
hde: timeout waiting for DMA
PDC202XX: Primary channel reset
PDC202XX: Secondary channel reset
hde: timeout waiting for DMA
... end of repeat
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 292k freed
init-2.04#

It looks like resurrecting hde did not cause any damage to the
filesystems (and obviously hde is a completely innocent UDMA100 citizen;
hdg is the only one which should be turned down to UDMA2 or PIO).

And a strange thing: once I was lucky enough to read the partition table
from hdg, so I could run atapci, and it reports that the hde host interface
is set up for UDMA4, while hdg is set up for UDMA2. So the host hardware is
probably set up OK; it is just the "hdg: 234441648 sectors (120034 MB)
w/2048KiB Cache, CHS=232581/16/63, UDMA(100)" line which contains the wrong
value, and hdg itself, which says that UDMA5 is selected (and does not work).
Petr Vandrovec
[email protected]


--- linux-2.5.65-c1197.dist/drivers/ide/ide-iops.c 2003-03-22 15:29:16.000000000 +0100
+++ linux-2.5.65-c1197/drivers/ide/ide-iops.c 2003-03-23 04:26:14.000000000 +0100
@@ -902,7 +902,7 @@
 	/*
 	 * Select the drive, and issue the SETFEATURES command
 	 */
-	disable_irq(hwif->irq);	/* disable_irq_nosync ?? */
+	disable_irq_nosync(hwif->irq);
 	udelay(1);
 	SELECT_DRIVE(drive);
 	SELECT_MASK(drive, 0);

2003-03-23 09:16:16

by Matthias Andree

Subject: Re: IDE todo list

On Sat, 22 Mar 2003, Alan Cox wrote:

> - Add ATAPI virtual DMA
> - Add DMA active irq poll trick
...
> - ADMA full support
> - Mark Lord/Andre ideas on LBA28/LBA48

Will any of this enable SG_IO via IDE-SCSI to do DMA with 2448 or
2352 bytes per block, or provide other means to reduce the system load
when writing a CD?

2003-03-23 10:38:10

by Jens Axboe

Subject: Re: IDE todo list

On Sat, Mar 22 2003, Alan Cox wrote:
> - Finish verifying 256 sector I/O or larger on LBA48
> [How to handle change dynamically on hotplug ?]

That is basically impossible. How are you going to handle the case where
you have a queue full of 256 request writes, and the plugged in disk
chokes on them? It is unsolvable unless you start setting aside requests
simply for this purpose. It also breaks the pseudo atomic segments that a
single request represents. This is just way beyond ugly...

This is a generic problem of course, and the typical answer is to go by
the rules of the lowest common denominator if hot plug can cause your
queue limits to be violated (there may be other problems than simply max
sector count).

--
Jens Axboe
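
[Editor's illustration, not part of the original mail.] The "lowest
common denominator" rule can be sketched as taking the minimum over the
limits of every device that might end up behind the queue. The function
name and the array-of-limits interface are hypothetical. The two
constants follow from the ATA command format: an LBA28 command has an
8-bit sector count (0 means 256) and an LBA48 command a 16-bit one
(0 means 65536).

```c
#include <assert.h>
#include <stddef.h>

/* Largest sector count a single command can carry. */
#define LBA28_MAX_SECTORS 256u
#define LBA48_MAX_SECTORS 65536u

/*
 * Hypothetical helper: given the per-command limits of every device that
 * could be plugged into an interface, return the limit the queue should
 * advertise so that no already-queued request can exceed what a newly
 * plugged device accepts.
 */
static unsigned int lowest_common_max_sectors(const unsigned int *limits,
                                              size_t n)
{
    assert(limits != NULL && n > 0);

    unsigned int lowest = limits[0];
    for (size_t i = 1; i < n; i++) {
        if (limits[i] < lowest)
            lowest = limits[i];
    }
    return lowest;
}
```

For an interface that might see either an LBA48 or an LBA28 drive, the
result is 256 sectors, so the larger LBA48 transfers are given up in
exchange for never having to resize queued requests.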

2003-03-23 11:08:10

by Jens Axboe

Subject: Re: IDE todo list

On Sun, Mar 23 2003, Matthias Andree wrote:
> On Sat, 22 Mar 2003, Alan Cox wrote:
>
> > - Add ATAPI virtual DMA
> > - Add DMA active irq poll trick
> ...
> > - ADMA full support
> > - Mark Lord/Andre ideas on LBA28/LBA48
>
> Will anything of this enable SG_IO via IDE-SCSI to do DMA with 2448 or
> 2352 bytes per block, or have other means to reduce the system load
> when writing a CD?

SG_IO can already do that, it's been able to do that for a long time in
2.5.

--
Jens Axboe

2003-03-23 11:09:47

by Jens Axboe

Subject: Re: IDE todo list

On Sun, Mar 23 2003, Jens Axboe wrote:
> On Sun, Mar 23 2003, Matthias Andree wrote:
> > On Sat, 22 Mar 2003, Alan Cox wrote:
> >
> > > - Add ATAPI virtual DMA
> > > - Add DMA active irq poll trick
> > ...
> > > - ADMA full support
> > > - Mark Lord/Andre ideas on LBA28/LBA48
> >
> > Will anything of this enable SG_IO via IDE-SCSI to do DMA with 2448 or
> > 2352 bytes per block, or have other means to reduce the system load
> > when writing a CD?
>
> SG_IO can already do that, it's been able to do that for a long time in
> 2.5.

Correction, use ide-cd in 2.5 and it will do that. Don't use ide-scsi
for anything but tapes in 2.5

--
Jens Axboe

2003-03-23 12:01:22

by Allan Duncan

Subject: Re: IDE todo list

Alan Cox wrote:
> On Sat, 2003-03-22 at 17:24, Petr Vandrovec wrote:
>
>> any hope that promise 20265 driver could detect non-udma66 cable
>>and run at udma2 only? BIOS properly detect this, but Linux driver
>>wants to use udma100, and usually dies hard with CRC errors during
>>reading of PTBL extended chain (it also should not lockup when
>>CRC error happens 5 times in a row, but ...).
>
>
> The five CRC in a row is what causes the DMA->PIO changedown. That
> implies there is a real bug in the error handling locking, or in
> the driver handling of that.
>
> Can you throw some printks into the ide code and see what kind of
> a death you get when it tries to change back to PIO.
>
> As to the cable stuff, I'll take a look at it in time, but both
> need fixing

I've got a slightly different take on cables.

First an outline of the hardware:
Epox 8KHA+ with VIA KT266A
Promise Ultra100 TX2 (PDC20268)
Working ATA100 drive on Promise. 80 conductor cable, kernel has OFFBOARD set.
40 conductor cables on VIA IDE primary and secondary. Each has cdrom (slave).
Cable is correctly flagged by BIOS on boot screen as 40 conductor.
Second ATA100 HD in cradle MAY be plugged into VIA primary IDE.
Kernel 2.4.21-pre5-ac3

W/o second HD all operates correctly.
With second HD I get:

hda: attached ide-disk driver.
hda: host protected area => 1
hda: 80418240 sectors (41174 MB) w/1916KiB Cache, CHS=79780/16/63, UDMA(100)
hde: attached ide-disk driver.
hde: host protected area => 1
hde: 80418240 sectors (41174 MB) w/1916KiB Cache, CHS=79780/16/63, UDMA(100)
Partition check:
hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 >
hde:hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hde: dma_intr: error=0x84 { DriveStatusError BadCRC }
hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hde: dma_intr: error=0x84 { DriveStatusError BadCRC }
hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hde: dma_intr: error=0x84 { DriveStatusError BadCRC }
hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hde: dma_intr: error=0x84 { DriveStatusError BadCRC }
hdf: DMA disabled
ide2: reset: success
hde1 hde2 hde3 hde4 < hde5 hde6 hde7 hde8 hde9 hde10 >


Checking with "hdparm -d" I find that the HD has DMA on,
and the blameless cdrom has it disabled!
The HD then runs w/o error at UDMA2.

I had expected that by using a 40 conductor cable it would be sensed
and UDMA2 then used as the top setting, matching the cdrom capability.
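
[Editor's illustration, not part of the original mail.] The behaviour
expected here, capping the transfer mode at UDMA2 when a 40 conductor
cable is detected, can be sketched as a simple clamp. The function name
and parameters are hypothetical; the XFER_UDMA_* values are the standard
ATA SET FEATURES transfer-mode encodings (0x40 plus the UDMA mode
number). How a given chipset driver actually senses the cable is not
modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ATA SET FEATURES transfer-mode encodings for Ultra DMA. */
#define XFER_UDMA_0 0x40
#define XFER_UDMA_2 0x42   /* fastest mode allowed on a 40 conductor cable */
#define XFER_UDMA_5 0x45   /* UDMA100 */

/*
 * Hypothetical helper: pick the fastest UDMA mode that the drive, the
 * host controller and the cable all support.
 */
static uint8_t clamp_udma_for_cable(uint8_t drive_max, uint8_t host_max,
                                    bool cable_80wire)
{
    assert(drive_max >= XFER_UDMA_0 && drive_max <= XFER_UDMA_5);
    assert(host_max >= XFER_UDMA_0 && host_max <= XFER_UDMA_5);

    uint8_t mode = drive_max < host_max ? drive_max : host_max;

    /* Without an 80 conductor cable, anything above UDMA2 is unsafe. */
    if (!cable_80wire && mode > XFER_UDMA_2)
        mode = XFER_UDMA_2;

    return mode;
}
```

With a UDMA100 drive and controller on a 40 conductor cable the result
is XFER_UDMA_2, which matches the mode the drive ended up running at
without errors in the report above.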

2003-03-23 12:22:37

by Matthias Andree

Subject: Re: IDE todo list

On Sun, 23 Mar 2003, Jens Axboe wrote:

> Correction, use ide-cd in 2.5 and it will do that. Don't use ide-scsi
> for anything but tapes in 2.5

Hum, 2.5 is not yet up for replacing 2.4 on my production machines.

2003-03-23 13:59:28

by Alan

Subject: Re: IDE todo list

On Sun, 2003-03-23 at 10:49, Jens Axboe wrote:
> On Sat, Mar 22 2003, Alan Cox wrote:
> > - Finish verifying 256 sector I/O or larger on LBA48
> > [How to handle change dynamically on hotplug ?]
>
> That is basically impossible. How are you going to handle the case where
> you have a queue full of 256 request writes, and the plugged in disk
> chokes on them? And insolvable unless you start setting aside requests
> simply for this purpose. Also breaks the pseudo atomic segments that a
> single request represents. This is just way beyond ugly...

I don't think it's impossible at all. Remember, if you hotplug a drive you
*don't* want the pending I/O to hit the new drive!

Alan


2003-03-23 14:02:31

by Jens Axboe

Subject: Re: IDE todo list

On Sun, Mar 23 2003, Alan Cox wrote:
> On Sun, 2003-03-23 at 10:49, Jens Axboe wrote:
> > On Sat, Mar 22 2003, Alan Cox wrote:
> > > - Finish verifying 256 sector I/O or larger on LBA48
> > > [How to handle change dynamically on hotplug ?]
> >
> > That is basically impossible. How are you going to handle the case where
> > you have a queue full of 256 request writes, and the plugged in disk
> > chokes on them? And insolvable unless you start setting aside requests
> > simply for this purpose. Also breaks the pseudo atomic segments that a
> > single request represents. This is just way beyond ugly...
>
> I don't think its impossible at all. Remember if you hotplug a drive you
> *dont* want the pending I/O to hit the new drive!

In that case it could be done, the key point is that no resizing needs
to be done. The rest is purely driver implementation :)

--
Jens Axboe

2003-03-23 14:17:44

by Alan

Subject: Re: IDE todo list

On Sun, 2003-03-23 at 03:49, Petr Vandrovec wrote:
> This one fixes lockup, but I'm not actually sure that it is better
> than before, as both channels are downgraded to the PIO:

It's a real fix. We are in the IRQ handler, so we cannot do disable_irq
because we would deadlock waiting for ourselves to exit the interrupt
handler.

How we change mode downwards is a policy thing. We must avoid going into
MWDMA on error, but going to UDMA0 is fine.

Alan
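
[Editor's illustration, not part of the original mail.] One possible
reading of the downgrade policy, as a userspace sketch: step down
through the UDMA modes one at a time, and once UDMA0 has also failed go
straight to PIO without passing through MWDMA. The function name is
hypothetical, and the choice of PIO4 as the landing mode is an
assumption; the mail only says that MWDMA must be avoided on error. The
constants are the standard ATA SET FEATURES transfer-mode encodings.

```c
#include <assert.h>
#include <stdint.h>

/* ATA SET FEATURES transfer-mode encodings. */
#define XFER_PIO_0    0x08
#define XFER_PIO_4    0x0c
#define XFER_MW_DMA_0 0x20
#define XFER_MW_DMA_2 0x22
#define XFER_UDMA_0   0x40
#define XFER_UDMA_5   0x45

/*
 * Hypothetical policy helper: return the mode to try after a transfer
 * error in 'mode'. UDMA steps down one mode at a time; leaving UDMA0
 * skips MWDMA entirely and lands on PIO4 (an assumption, see above).
 */
static uint8_t next_mode_on_error(uint8_t mode)
{
    assert(mode >= XFER_PIO_0 && mode <= XFER_UDMA_5);

    if (mode > XFER_UDMA_0)
        return (uint8_t)(mode - 1);     /* UDMA5 -> UDMA4 -> ... -> UDMA0 */
    if (mode == XFER_UDMA_0)
        return XFER_PIO_4;              /* never drop into MWDMA */
    if (mode >= XFER_MW_DMA_0 && mode <= XFER_MW_DMA_2)
        return XFER_PIO_4;              /* already in MWDMA: leave it */
    if (mode > XFER_PIO_0 && mode <= XFER_PIO_4)
        return (uint8_t)(mode - 1);     /* PIO4 -> PIO3 -> ... -> PIO0 */
    return XFER_PIO_0;                  /* nothing slower to fall back to */
}
```

Starting from UDMA100 (0x45), repeated errors walk 0x44, 0x43, 0x42,
0x41, 0x40 and then 0x0c, so no value in the MWDMA range 0x20 to 0x22 is
ever selected.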

2003-03-23 20:58:40

by Andre Hedrick

Subject: Re: IDE todo list

On 22 Mar 2003, Alan Cox wrote:

> (Minus some stuff which is NDA'd because it involves unreleased chips
> etc)
>
> - Promise 20376
> - Audit Promise drivers
> - BIOS timing stuff
> - Simplex mode reassignment intelligence
> - IDE-SCSI crashes on 2.5
> - IDE-SCSI/reset race on 2.4/2.5
> - Forward port remaining drivers to 2.5
> - Add ATAPI virtual DMA
> - Add DMA active irq poll trick
> - Clock switching for Highpoint 372N
> - Support for SATA bridge on HPT
> - Intel ICH5 errata audit
> - Intel Centrino idents and errata audit [Merged for 2.5]
> - Explain rather than just fix the CMD680 mmio collision problem
> - Finish hotplug handling
> - Revert identify hacks now ide-default is present
> - Allow multiple driver binding for ide-cd/ide-scsi etc
> - Locking for unload driver
> - Locking for modular load onto a busy interface
> - ADMA full support
> - Mark Lord/Andre ideas on LBA28/LBA48
> - Finish verifying 256 sector I/O or larger on LBA48
> [How to handle change dynamically on hotplug ?]
> - Clean up ide_unregister paths
> - Finish ide pcmcia code hot unplug interface registers
> - Finish ide pcmcia unregister path retry logic
> - IRQ detect broken for some setups
> - Check full PCI clocking info on HPT37x
> - Rewrite HPT37x controller type logic
> - Debug TRM290
> - Fix up the hwif based sectors per transfer limit
> - How to handle generic class IDE devices by class only
> - Opti support for the 558 ?
> - Can we resolve NDA's with SiS ?
> - Audit ALi driver use of config register bits on bridge
> - Document the calling properties for each driver function
> - Work out how to fix up all the TCQ crashes
> - Does taskfile I/O now work after the bug fixes ?
> [do we care 8))]
> - Multiple taskfile load support for controllers that have it
> [big performance win]
> - IDE specification issue - mishandling error abort on ATA6
> - 20276 with i960 SX6000 mishandling
> - Investigate breakage in ide-floppy on 2.4
> - Merge new ACPI + relax into the -ac tree
> - Get Arjan's info on IDE violations in simulator

Erm, where is this simulator?


Andre Hedrick
LAD Storage Consulting Group