Subject: IDE-flash device and hard disk on same controller


The IDE driver (file ide-probe.c) currently contains this test to prevent
hard drives and IDE-flash devices (e.g. CompactFlash) from co-existing on the
same IDE controller.

/*
 * Prevent long system lockup probing later for non-existant
 * slave drive if the hwif is actually a flash memory card of some
 * variety:
 */
if (drive_is_flashcard(drive)) {
        ide_drive_t *mate =
                &HWIF(drive)->drives[1^drive->select.b.unit];
        if (!mate->ata_flash) {
                mate->present = 0;
                mate->noprobe = 1;
        }
}

This test's assumption that a spinning hard drive cannot coexist on the same
controller as an IDE-flash device is incorrect. I have a working setup with
such a configuration. I don't think that the IDE subsystem should punish
everyone because _some_ hardware cannot tolerate this configuration.

One solution may be to remove this test from the IDE subsystem and force
users with buggy hardware to explicitly disable probing for a second
device. I think the parameters hdx=none or hdx=noprobe should work for
them.
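
As a rough illustration of the proposed policy, here is a minimal standalone sketch in C. It is a model, not kernel code: `struct drive_model`, `mate_of()` and `should_probe_mate()` are hypothetical names invented for this example. Only the `1 ^ unit` mate lookup and the `noprobe` flag mirror the snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's ide_drive_t. */
struct drive_model {
    unsigned unit;   /* 0 = master, 1 = slave */
    bool is_flash;   /* device identified itself as a flash card */
    bool noprobe;    /* user asked for no probing (hdx=none / hdx=noprobe) */
};

/* Two drives per interface, so XOR with 1 flips master <-> slave. */
struct drive_model *mate_of(struct drive_model drives[2],
                            const struct drive_model *drive)
{
    assert(drive->unit <= 1);
    return &drives[1 ^ drive->unit];
}

/*
 * Proposed policy (an assumption of this sketch): detecting a flash card
 * no longer disables the mate. The mate is skipped only when the user
 * explicitly requested that.
 */
bool should_probe_mate(struct drive_model drives[2],
                       const struct drive_model *drive)
{
    return !mate_of(drives, drive)->noprobe;
}
```

With this model, a flash master and a hard disk slave are both probed by default, and setting `noprobe` on the slave restores the old behaviour for hardware that hangs.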

Comments??



2002-08-19 19:17:25

by Alan

Subject: Re: IDE-flash device and hard disk on same controller

On Mon, 2002-08-19 at 19:31, Heater, Daniel (IndSys, GEFanuc, VMIC)
wrote:
> One solution may be to remove this test from the IDE subsystem and force
> users with buggy hardware to explicitly disable probing for a second
> device. I think the parameters hdx=none or hdx=noprobe should work for
> them.

I'm inclined to agree about this in the absence of very good reasons why
not. The combination is found on several systems nowadays.

2002-08-20 08:40:54

by Padraig Brady

Subject: Re: IDE-flash device and hard disk on same controller

Heater, Daniel (IndSys, GEFanuc, VMIC) wrote:
> The IDE driver, file ide-probe.c currently contains this test do prevent
> hard drives and IDE-flash devices (ex CompactFlash) from co-existing on the
> same IDE controller.
>
> /*
> * Prevent long system lockup probing later for non-existant
> * slave drive if the hwif is actually a flash memory card of some
> variety:
> */
> if (drive_is_flashcard(drive)) {
> ide_drive_t *mate =
> &HWIF(drive)->drives[1^drive->select.b.unit];
> if (!mate->ata_flash) {
> mate->present = 0;
> mate->noprobe = 1;
> }
> }
>
> This test's assumption that a spinning hard drive cannot coexist on the same
> controller as an IDE-flash device is incorrect. I have a working setup with
> such a configuration. I don't think that the IDE subsystem should punish
> everyone because _some_ hardware cannot tolerate this configuration.
>
> One solution may be to remove this test from the IDE subsystem and force
> users with buggy hardware to explicitly disable probing for a second
> device. I think the parameters hdx=none or hdx=noprobe should work for
> them.
>
> Comments??

Mentioned several times (and there is a workaround), see:
http://marc.theaimsgroup.com/?l=linux-kernel&m=100446144028502&w=2

I really think some of the default CF logic is bogus.

Pádraig.

Subject: RE: IDE-flash device and hard disk on same controller

> Heater, Daniel (IndSys, GEFanuc, VMIC) wrote:
> > The IDE driver, file ide-probe.c currently contains this test do prevent
> > hard drives and IDE-flash devices (ex CompactFlash) from co-existing on the
> > same IDE controller.
> >
> > /*
> > * Prevent long system lockup probing later for non-existant
> > * slave drive if the hwif is actually a flash memory card of some
> > variety:
> > */
> > if (drive_is_flashcard(drive)) {
> > ide_drive_t *mate =
> > &HWIF(drive)->drives[1^drive->select.b.unit];
> > if (!mate->ata_flash) {
> > mate->present = 0;
> > mate->noprobe = 1;
> > }
> > }
> >
> > This test's assumption that a spinning hard drive cannot coexist on the same
> > controller as an IDE-flash device is incorrect. I have a working setup with
> > such a configuration. I don't think that the IDE subsystem should punish
> > everyone because _some_ hardware cannot tolerate this configuration.
> >
> > One solution may be to remove this test from the IDE subsystem and force
> > users with buggy hardware to explicitly disable probing for a second
> > device. I think the parameters hdx=none or hdx=noprobe should work for
> > them.
> >
> > Comments??
>
> Mentioned several times (and there is a workaround), see:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=100446144028502&w=2
>
> I really think some of the default CF logic is bogus.
>
> Pádraig.
>


OK. hdc=flash works where hdc=hard drive and hdd=CompactFlash.

Thanks Padraig.

I guess it's 6 of one, half-dozen the other, but telling the kernel that my
hard drive is a flash drive just makes me feel squidgy! I'm still inclined
to suggest that the test that _prevents_ hard drive + CF configuration is no
longer appropriate now that _some_ (most??) hardware vendors have figured
out how to get ide-flash devices to work without "hanging" when no second
device is present. Users with incompatible hardware can still prevent the
long system hang by using hdx=none.

I also used this workaround (hdb=flash) to configure a system with hda=flash
and hdb=cdrom. This seems to work also. Are there any side effects to
telling the kernel that the hard drive or cdrom is a flash device (such as
marking it removable or not marking it removable)?

2002-08-20 21:53:41

by Andre Hedrick

Subject: RE: IDE-flash device and hard disk on same controller

On Tue, 20 Aug 2002, Heater, Daniel (IndSys, GEFanuc, VMIC) wrote:

>
> OK. hdc=flash works where hdc=hard drive and hdd=CompactFlash.
>
> Thanks Padraig.
>
> I guess it's 6 of one, half-dozen the other, but telling the kernel that my
> hard drive is a flash drive just makes me feel squidgy! I'm still inclined
> to suggest that the test that _prevents_ hard drive + CF configuration is no
> longer appropriate now that _some_ (most??) hardware vendors have figured
> out how to get ide-flash devices to work without "hanging" when no second
> device is present. Users with incompatible hardware can still prevent the
> long system hang by using hdx=none.

That sounds reasonable, and is something for just before final 2.6.

> I also used this workaround (hdb=flash) to configure a system with hda=flash
> and hdb=cdrom. This seems to work also. Are there any side effects to
> telling the kernel that the hard drive or cdrom is a flash device (such as
> marking it removable or not marking it removable)?

EWW, that is nasty, and there is another ace I will put out for this
occasion.

JG! This is where those long nights of explaining the spec to you and getting
that one opcode functional will go.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

2002-08-20 21:56:49

by Jeff Garzik

Subject: Re: IDE-flash device and hard disk on same controller

Andre Hedrick wrote:
> JG! this is where those long nights of explain the spec to you and getting
> that one opcode functional will go.


What, the IDE rewrite I tease people about in private?

Attached is the ATA core...


Attachments:
ata.tar.bz2 (9.65 kB)

2002-08-20 22:24:49

by Jeff Garzik

Subject: Re: IDE-flash device and hard disk on same controller

Jeff Garzik wrote:
> Attached is the ATA core...

Just to give a little bit more information about the previously attached
code, it is merely a module that does two things: (a) demonstrates
proper [and sometimes faster-than-current-linus] ATA bus probing, and
(b) demonstrates generic registration and initialization of ATA devices
and channels. All other tasks can be left to "personality" (a.k.a.
class) drivers, such as 'disk', 'cdrom', 'floppy', ... types.

Jeff



2002-08-21 06:37:06

by Geert Uytterhoeven

Subject: Re: IDE-flash device and hard disk on same controller

On Tue, 20 Aug 2002, Jeff Garzik wrote:
> Jeff Garzik wrote:
> > Attached is the ATA core...
>
> Just to give a little bit more information about the previously attached
> code, it is merely a module that does two things: (a) demonstrates
> proper [and sometimes faster-than-current-linus] ATA bus probing, and
> (b) demonstrates generic registration and initialization of ATA devices
> and channels. All other tasks can be left to "personality" (a.k.a.
> class) drivers, such as 'disk', 'cdrom', 'floppy', ... types.

Looks nice (at first sight)!

But one limitation is that it always assumes the IDE ports are located in I/O
space :-(
What about architectures where IDE ports are located in MMIO space? Or worse,
have some ports in I/O space (e.g. PCI IDE card) and some in MMIO space (e.g.
SOC or mainboard IDE host interface)?
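
One common way to handle mixed I/O-space and MMIO channels is to give each channel its own table of register accessors. The sketch below is only an illustration under stated assumptions: `struct ata_chan_model`, `struct ata_io_ops` and all function names are hypothetical and are not taken from the attached ata.tar.bz2, and port I/O is simulated with an array so the code can run in user space. The one hardware fact relied on is that the ATA status register sits at offset 7 of the command block.

```c
#include <assert.h>
#include <stdint.h>

struct ata_chan_model;

/* Hypothetical per-channel accessor table. */
struct ata_io_ops {
    uint8_t (*read8)(struct ata_chan_model *chan, unsigned reg);
    void (*write8)(struct ata_chan_model *chan, unsigned reg, uint8_t val);
};

struct ata_chan_model {
    const struct ata_io_ops *ops;
    volatile uint8_t *mmio_base; /* used by the MMIO backend only */
    unsigned port_base;          /* used by the port I/O backend only */
};

/* Stand-in for the x86 I/O port space; a kernel would use inb()/outb(). */
static uint8_t fake_port_space[0x10000];

uint8_t mmio_read8(struct ata_chan_model *chan, unsigned reg)
{
    return chan->mmio_base[reg];
}

void mmio_write8(struct ata_chan_model *chan, unsigned reg, uint8_t val)
{
    chan->mmio_base[reg] = val;
}

uint8_t pio_read8(struct ata_chan_model *chan, unsigned reg)
{
    return fake_port_space[(chan->port_base + reg) & 0xffff];
}

void pio_write8(struct ata_chan_model *chan, unsigned reg, uint8_t val)
{
    fake_port_space[(chan->port_base + reg) & 0xffff] = val;
}

const struct ata_io_ops mmio_ops = { mmio_read8, mmio_write8 };
const struct ata_io_ops pio_ops = { pio_read8, pio_write8 };

#define ATA_REG_STATUS 7 /* offset of the status register in the command block */

/* Generic code never needs to know which address space a channel lives in. */
uint8_t ata_read_status(struct ata_chan_model *chan)
{
    assert(chan->ops != 0);
    return chan->ops->read8(chan, ATA_REG_STATUS);
}
```

With this layout a PCI card at port 0x1f0 and an on-chip MMIO host interface can coexist: each channel simply points `ops` at the matching table. The cost is an indirect call per register access.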

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2002-08-21 06:52:33

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller


Geert,

Look at 2.4.20-pre2-ac5.

I fixed that problem.

On Wed, 21 Aug 2002, Geert Uytterhoeven wrote:

> On Tue, 20 Aug 2002, Jeff Garzik wrote:
> > Jeff Garzik wrote:
> > > Attached is the ATA core...
> >
> > Just to give a little bit more information about the previously attached
> > code, it is merely a module that does two things: (a) demonstrates
> > proper [and sometimes faster-than-current-linus] ATA bus probing, and
> > (b) demonstrates generic registration and initialization of ATA devices
> > and channels. All other tasks can be left to "personality" (a.k.a.
> > class) drivers, such as 'disk', 'cdrom', 'floppy', ... types.
>
> Looks nice (at first sight)!
>
> But one limitation is that it always assumes the IDE ports are located in I/O
> space :-(
> What about architectures where IDE ports are located in MMIO space? Or worse,
> have some ports in I/O space (e.g. PCI IDE card) and some in MMIO space (e.g.
> SOC or mainboard IDE host interface)?
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
>

Andre Hedrick
LAD Storage Consulting Group

2002-08-21 06:54:27

by Andre Hedrick

Subject: MMIO {Re: IDE-flash device and hard disk on same controller}


Geert,

The proof.


ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SiI680: IDE controller on PCI bus 00 dev 90
SiI680: chipset revision 1
SiI680: not 100% native mode: will probe irqs later
SiI680: BASE CLOCK == 133
ide0: MMIO-DMA at 0xe080df00-0xe080df07, BIOS settings: hda:pio, hdb:pio
ide1: MMIO-DMA at 0xe080df08-0xe080df0f, BIOS settings: hdc:pio, hdd:pio
PIIX3: IDE controller on PCI bus 00 dev 39
PIIX3: chipset revision 0
PIIX3: not 100% native mode: will probe irqs later
ide2: BM-DMA at 0xffa0-0xffa7, BIOS settings: hde:DMA, hdf:DMA
ide3: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdg:pio, hdh:pio
hda: Maxtor 4G160J8, ATA DISK drive
hdb: Maxtor 4G160J8, ATA DISK drive
hdc: Maxtor 4G160J8, ATA DISK drive
hdd: Maxtor 4G160J8, ATA DISK drive
hde: ATAPI 44X CDROM, ATAPI CD/DVD-ROM drive
hdf: CREATIVEDVD5240E-1, ATAPI CD/DVD-ROM drive
ide0 at 0xe080df80-0xe080df87,0xe080df8a on irq 9
ide1 at 0xe080dfc0-0xe080dfc7,0xe080dfca on irq 9
ide2 at 0x1f0-0x1f7,0x3f6 on irq 14

On Wed, 21 Aug 2002, Geert Uytterhoeven wrote:

> On Tue, 20 Aug 2002, Jeff Garzik wrote:
> > Jeff Garzik wrote:
> > > Attached is the ATA core...
> >
> > Just to give a little bit more information about the previously attached
> > code, it is merely a module that does two things: (a) demonstrates
> > proper [and sometimes faster-than-current-linus] ATA bus probing, and
> > (b) demonstrates generic registration and initialization of ATA devices
> > and channels. All other tasks can be left to "personality" (a.k.a.
> > class) drivers, such as 'disk', 'cdrom', 'floppy', ... types.
>
> Looks nice (at first sight)!
>
> But one limitation is that it always assumes the IDE ports are located in I/O
> space :-(
> What about architectures where IDE ports are located in MMIO space? Or worse,
> have some ports in I/O space (e.g. PCI IDE card) and some in MMIO space (e.g.
> SOC or mainboard IDE host interface)?
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

Andre Hedrick
LAD Storage Consulting Group

2002-08-21 07:07:49

by Andre Hedrick

Subject: Re: MMIO {Re: IDE-flash device and hard disk on same controller}


More proof:
p6dnf:/proc # cat iomem
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000c8000-000c8fff : Extension ROM
000f0000-000fffff : System ROM
00100000-1fffffff : System RAM
00100000-002e44cb : Kernel code
002e44cc-0039b59f : Kernel data
e080df00-e080df07 : ide0
e080df08-e080df0f : ide1
e080df10-e080df17 : ide0
e080df18-e080df1f : ide1
e080df80-e080df87 : ide0
e080df8a-e080df8a : ide0
e080dfc0-e080dfc7 : ide1
e080dfca-e080dfca : ide1
fd000000-fd3fffff : Number 9 Computer Company Imagine 128
fd400000-fd7fffff : Number 9 Computer Company Imagine 128
fe3e7e00-fe3e7eff : Lite-On Communications Inc LNE100TX
fe3e7e00-fe3e7eff : tulip
fe3e7f00-fe3e7fff : CMD Technology Inc PCI0680
fe3f0000-fe3fffff : Number 9 Computer Company Imagine 128
fe400000-fe7fffff : Number 9 Computer Company Imagine 128
fe800000-febfffff : Number 9 Computer Company Imagine 128
fec00000-fec00fff : reserved
fee00000-fee00fff : reserved
fffe0000-ffffffff : reserved


On Tue, 20 Aug 2002, Andre Hedrick wrote:

>
> Geert,
>
> The proof.
>
>
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> SiI680: IDE controller on PCI bus 00 dev 90
> SiI680: chipset revision 1
> SiI680: not 100% native mode: will probe irqs later
> SiI680: BASE CLOCK == 133
> ide0: MMIO-DMA at 0xe080df00-0xe080df07, BIOS settings: hda:pio, hdb:pio
> ide1: MMIO-DMA at 0xe080df08-0xe080df0f, BIOS settings: hdc:pio, hdd:pio
> PIIX3: IDE controller on PCI bus 00 dev 39
> PIIX3: chipset revision 0
> PIIX3: not 100% native mode: will probe irqs later
> ide2: BM-DMA at 0xffa0-0xffa7, BIOS settings: hde:DMA, hdf:DMA
> ide3: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdg:pio, hdh:pio
> hda: Maxtor 4G160J8, ATA DISK drive
> hdb: Maxtor 4G160J8, ATA DISK drive
> hdc: Maxtor 4G160J8, ATA DISK drive
> hdd: Maxtor 4G160J8, ATA DISK drive
> hde: ATAPI 44X CDROM, ATAPI CD/DVD-ROM drive
> hdf: CREATIVEDVD5240E-1, ATAPI CD/DVD-ROM drive
> ide0 at 0xe080df80-0xe080df87,0xe080df8a on irq 9
> ide1 at 0xe080dfc0-0xe080dfc7,0xe080dfca on irq 9
> ide2 at 0x1f0-0x1f7,0x3f6 on irq 14

Andre Hedrick
LAD Storage Consulting Group

2002-08-21 07:14:12

by Geert Uytterhoeven

Subject: Re: IDE-flash device and hard disk on same controller

On Tue, 20 Aug 2002, Andre Hedrick wrote:
> Look at 2.4.20-pre2-ac5.
>
> I fixed that problem.

OK, thanks!

> On Wed, 21 Aug 2002, Geert Uytterhoeven wrote:
> > On Tue, 20 Aug 2002, Jeff Garzik wrote:
> > > Jeff Garzik wrote:
> > > > Attached is the ATA core...
> > >
> > > Just to give a little bit more information about the previously attached
> > > code, it is merely a module that does two things: (a) demonstrates
> > > proper [and sometimes faster-than-current-linus] ATA bus probing, and
> > > (b) demonstrates generic registration and initialization of ATA devices
> > > and channels. All other tasks can be left to "personality" (a.k.a.
> > > class) drivers, such as 'disk', 'cdrom', 'floppy', ... types.
> >
> > Looks nice (at first sight)!
> >
> > But one limitation is that it always assumes the IDE ports are located in I/O
> > space :-(
> > What about architectures where IDE ports are located in MMIO space? Or worse,
> > have some ports in I/O space (e.g. PCI IDE card) and some in MMIO space (e.g.
> > SOC or mainboard IDE host interface)?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Subject: RE: IDE-flash device and hard disk on same controller

Please take me off of this discussion thread.

Thank you.

Bill Warner, CPU Design Engineer
VMIC - A GE Fanuc Company
[email protected]
12090 S. Memorial Pkwy, Huntsville, AL 35803
ph (256) 382-8230, fax (256) 650-5472
>
> -----Original Message-----
> From: Geert Uytterhoeven [SMTP:[email protected]]
> Sent: Wednesday, August 21, 2002 2:17 AM
> To: Andre Hedrick
> Cc: Jeff Garzik; Heater, Daniel (IndSys, GEFanuc, VMIC); 'Padraig
> Brady'; 'Linux Kernel'; Warner, Bill (IndSys, GEFanuc, VMIC)
> Subject: Re: IDE-flash device and hard disk on same controller
>
> On Tue, 20 Aug 2002, Andre Hedrick wrote:
> > Look at 2.4.20-pre2-ac5.
> >
> > I fixed that problem.
>
> OK, thanks!
>
> > On Wed, 21 Aug 2002, Geert Uytterhoeven wrote:
> > > On Tue, 20 Aug 2002, Jeff Garzik wrote:
> > > > Jeff Garzik wrote:
> > > > > Attached is the ATA core...
> > > >
> > > > > Just to give a little bit more information about the previously attached
> > > > > code, it is merely a module that does two things: (a) demonstrates
> > > > > proper [and sometimes faster-than-current-linus] ATA bus probing, and
> > > > > (b) demonstrates generic registration and initialization of ATA devices
> > > > > and channels. All other tasks can be left to "personality" (a.k.a.
> > > > > class) drivers, such as 'disk', 'cdrom', 'floppy', ... types.
> > > >
> > > > Looks nice (at first sight)!
> > > >
> > > > But one limitation is that it always assumes the IDE ports are located in I/O
> > > > space :-(
> > > > What about architectures where IDE ports are located in MMIO space? Or worse,
> > > > have some ports in I/O space (e.g. PCI IDE card) and some in MMIO space (e.g.
> > > > SOC or mainboard IDE host interface)?
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds

2002-08-22 05:44:01

by Eric W. Biederman

Subject: Re: IDE-flash device and hard disk on same controller

Jeff Garzik <[email protected]> writes:

> Jeff Garzik wrote:
> > Attached is the ATA core...
>
> Just to give a little bit more information about the previously attached code,
> it is merely a module that does two things: (a) demonstrates proper [and
> sometimes faster-than-current-linus] ATA bus probing, and

I am assuming ata_chan_init is the function that does this
demonstration.

I don't see any checking for the ATA bsy flag before you start sending
commands. I have seen the current IDE code fail too many times if I
boot too fast, because of a lack of this one simple test. So I don't
see how this could be considered a proper probe.
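
The kind of test meant here can be sketched as a bounded poll on the status register. This is a minimal user-space illustration, not code from the attached module: `wait_not_busy()`, `read_status_fn` and the `fake_drive` test double are hypothetical names. The only hardware facts assumed are that BSY is bit 7 (0x80) of the ATA status register and that 0x50 (DRDY | DSC) is a typical idle status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ATA_STATUS_BSY 0x80 /* bit 7 of the ATA status register */

/* Hypothetical callback that returns the current status register value. */
typedef uint8_t (*read_status_fn)(void *ctx);

/*
 * Poll until BSY clears or the poll budget runs out.
 * Returns true once the device reports it is no longer busy.
 * A real driver would sleep between polls and derive the budget
 * from a timeout instead of a raw iteration count.
 */
bool wait_not_busy(read_status_fn read_status, void *ctx, unsigned max_polls)
{
    assert(read_status != 0);
    for (unsigned i = 0; i < max_polls; i++) {
        if (!(read_status(ctx) & ATA_STATUS_BSY))
            return true;
    }
    return false;
}

/* Test double: reports BSY for a fixed number of reads, then idle. */
struct fake_drive {
    unsigned busy_reads_left;
};

uint8_t fake_read_status(void *ctx)
{
    struct fake_drive *d = ctx;

    if (d->busy_reads_left > 0) {
        d->busy_reads_left--;
        return ATA_STATUS_BSY;
    }
    return 0x50; /* DRDY | DSC */
}
```

A probe would call `wait_not_busy()` first and only issue commands when it returns true. On a false return it can either retry later or treat the device as absent.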

Eric

2002-08-22 13:43:40

by Bill Davidsen

Subject: Re: IDE-flash device and hard disk on same controller

In article <[email protected]>,
Heater, Daniel (IndSys, GEFanuc, VMIC) <[email protected]> wrote:

| OK. hdc=flash works where hdc=hard drive and hdd=CompactFlash.
|
| Thanks Padraig.
|
| I guess it's 6 of one, half-dozen the other, but telling the kernel that my
| hard drive is a flash drive just makes me feel squidgy! I'm still inclined
| to suggest that the test that _prevents_ hard drive + CF configuration is no
| longer appropriate now that _some_ (most??) hardware vendors have figured
| out how to get ide-flash devices to work without "hanging" when no second
| device is present. Users with incompatible hardware can still prevent the
| long system hang by using hdx=none.

I think that traditionally people with broken hardware have been the
ones to use parameters to warn the kernel about it. I certainly have
run some ill-behaved hardware that way ;-) Since there are now, and will
be in the future, more correct systems than broken ones, it would seem that
the default should be to work with correct systems.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-08-22 23:08:24

by Jeff Garzik

Subject: Re: IDE-flash device and hard disk on same controller

Eric W. Biederman wrote:
> I don't see any checking for the ATA bsy flag before you start sending
> commands. I have seen the current IDE code fail too many times if I
> boot to fast, because of a lack of this one simple test. So I don't
> see how this could be considered a proper probe.


The ATA bsy flag check is missing at only one point, and that is before
EXECUTE DEVICE DIAGNOSTIC is issued. The idea with this command is that
it pretty much stomps up and down the ATA bus, trouncing ongoing
activity in the process.

Jeff



2002-08-23 00:59:50

by Eric W. Biederman

Subject: Re: IDE-flash device and hard disk on same controller

Jeff Garzik <[email protected]> writes:

> Eric W. Biederman wrote:
> > I don't see any checking for the ATA bsy flag before you start sending
> > commands. I have seen the current IDE code fail too many times if I
> > boot to fast, because of a lack of this one simple test. So I don't
> > see how this could be considered a proper probe.
>
>
> There is no ATA bsy flag check at only one point, and that is before EXECUTE
> DEVICE DIAGNOSTIC is issued. The idea with this command is that it pretty much
> stomps up and down the ATA bus, trouncing ongoing activity in the process.

The problem is that immediately after bootup ATA devices do not respond until
their media has spun up, which is both required by the spec and observed in
practice. That is likely a problem if this code is run a few seconds after
bootup: it is quite possible the drive will ignore the EXECUTE DEVICE
DIAGNOSTIC and your error code won't be valid when the bsy flag
clears. I don't know how serious that would be.

I can test and find out but I would rather confine my testing to
commands that look like they will stay within the realms of
predictable behavior.

And yes with LinuxBIOS I can reliably boot up fast enough to make this
problem show up in practice.

Eric

2002-08-23 01:22:01

by Jeff Garzik

Subject: Re: IDE-flash device and hard disk on same controller

Eric W. Biederman wrote:
> The problem is that immediately after bootup ATA devices do not respond until
> their media has spun up. Which is both required by the spec, and observed in
> practice. Which is likely a problem if this code is run a few seconds after
> bootup. Which makes it quite possible the drive will ignore the EXECUTE DEVICE
> DIAGNOSTICS and your error code won't be valid when the bsy flag
> clears. I don't know how serious that would be.


Well, this only applies if you are slack and letting the kernel init
your ATA from scratch, instead of doing proper ATA initialization in
firmware ;-)

Seriously, if you are handed an ATA device that is actually in
operation when the kernel boots, you are already out of spec. I would
prefer to barf if the BSY or DRDY bits are set, because taking over the
ATA bus while a device is in the middle of a command shouldn't be
happening at Linux kernel boot, ever.

Jeff



2002-08-23 03:12:23

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller

On 22 Aug 2002, Eric W. Biederman wrote:

> Jeff Garzik <[email protected]> writes:
>
> > Eric W. Biederman wrote:
> > > I don't see any checking for the ATA bsy flag before you start sending
> > > commands. I have seen the current IDE code fail too many times if I
> > > boot to fast, because of a lack of this one simple test. So I don't
> > > see how this could be considered a proper probe.
> >
> >
> > There is no ATA bsy flag check at only one point, and that is before EXECUTE
> > DEVICE DIAGNOSTIC is issued. The idea with this command is that it pretty much
> > stomps up and down the ATA bus, trouncing ongoing activity in the process.
>
> The problem is that immediately after bootup ATA devices do not respond until
> their media has spun up. Which is both required by the spec, and observed in
> practice. Which is likely a problem if this code is run a few seconds after
> bootup. Which makes it quite possible the drive will ignore the EXECUTE DEVICE
> DIAGNOSTICS and your error code won't be valid when the bsy flag
> clears. I don't know how serious that would be.

We did POST already.

> I can test and find out but I would rather confine my testing to
> commands that look like they will stay within the realms of
> predictable behavior.
>
> And yes with LinuxBIOS I can reliably boot up fast enough to make this
> problem show up in practice.
>
> Eric
>

Andre Hedrick
LAD Storage Consulting Group

2002-08-23 03:16:31

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller


Oh and it is only useful for borken things like LINBIOS and other
braindead systems like ARM that violate the 31 second rule of POST.

RMK, don't take it personally, but ARM is a headache of the nth degree.
You and I know it, otherwise I would not have a raw TTL IDE PCI card to
mimic your arch.

Cheers,

On 22 Aug 2002, Eric W. Biederman wrote:

> Jeff Garzik <[email protected]> writes:
>
> > Eric W. Biederman wrote:
> > > I don't see any checking for the ATA bsy flag before you start sending
> > > commands. I have seen the current IDE code fail too many times if I
> > > boot to fast, because of a lack of this one simple test. So I don't
> > > see how this could be considered a proper probe.
> >
> >
> > There is no ATA bsy flag check at only one point, and that is before EXECUTE
> > DEVICE DIAGNOSTIC is issued. The idea with this command is that it pretty much
> > stomps up and down the ATA bus, trouncing ongoing activity in the process.
>
> The problem is that immediately after bootup ATA devices do not respond until
> their media has spun up. Which is both required by the spec, and observed in
> practice. Which is likely a problem if this code is run a few seconds after
> bootup. Which makes it quite possible the drive will ignore the EXECUTE DEVICE
> DIAGNOSTICS and your error code won't be valid when the bsy flag
> clears. I don't know how serious that would be.
>
> I can test and find out but I would rather confine my testing to
> commands that look like they will stay within the realms of
> predictable behavior.
>
> And yes with LinuxBIOS I can reliably boot up fast enough to make this
> problem show up in practice.
>
> Eric
>

Andre Hedrick
LAD Storage Consulting Group

2002-08-23 06:50:03

by Adam J. Richter

Subject: Re: IDE-flash device and hard disk on same controller

Jeff Garzik wrote:
>Eric W. Biederman wrote:
>>The problem is that immediately after bootup ATA devices do not respond until
>>their media has spun up. Which is both required by the spec, and observed in
>>practice. Which is likely a problem if this code is run a few seconds after
>>bootup. Which makes it quite possible the drive will ignore the
>>EXECUTE DEVICE DIAGNOSTICS and your error code won't be valid when
>>the bsy flag clears. I don't know how serious that would be.
>
>
>Well, this only applies if you are slack and letting the kernel init
>your ATA from scratch, instead of doing proper ATA initialization in
>firmware ;-)
>
>Seriously, if you are a handed an ATA device that is actually in
>operation when the kernel boots, you are already out of spec.

1. Regardless of whatever specification you are referring to
or Andre's "31 second rule of [Power On Self Test]", it is genuinely
useful to boot faster by overlapping some other kernel work before the
drive is ready. Specifications ultimately exist only to serve this
usefulness. When a specification impedes usefulness, sometimes it's
the right decision to violate it. Of course, we're not talking about
your IDE code violating such a specification, but rather not relying
on this particular guarantee.

2. Besides, if this code is supposed to be a generic IDE core,
it may need to run on platforms that do not provide that guarantee or
where the boot code is not even capable of finding where all of the
IDE controllers are.

3. In the hierarchy of upgradability, it is generally easier
to replace the kernel than the Power On Self Test, which is more often
in flash or ROM, and which may require help from an unenthusiastic
hardware vendor. So, it is better to weigh trade-offs a few notches
in favor of avoiding reliance on guarantees about the Power On Self Test.

If I understand correctly, the cost of this trade off would be
adding one or two lines that add perhaps 20 bytes and as many CPU
cycles at initialization (except when this change really is necessary).

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

2002-08-23 07:08:25

by Helge Hafting

Subject: Re: IDE-flash device and hard disk on same controller

Andre Hedrick wrote:
>
> Oh and it is only useful for borken things like LINBIOS and other
> braindead systems like ARM that violate the 31 second rule of POST.
>
31s of POST is uselessly slow. Perhaps it is needed when
the drives _are_ spinning up, but not for the common
case of rebooting to activate a new kernel
(or reset button when the dev-kernel hung.) The disk
is spinning already in those cases, and there should
be no bootup delay.

Helge Hafting

2002-08-23 07:42:05

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller

On Thu, 22 Aug 2002, Adam J. Richter wrote:

> 1. Regardless of whatever specification you are referring to
> or Andre's "31 second rule of [Power On Self Test]", it is genuinely
> useful to boot faster by overlapping some other kernel work before the
> drive is. Specifications ultimately exist only to serve this
> usefulness. When a specification impedes usefulness, sometimes it's
> the right decision to violate it. Of course, we're not talking about

Listen to yourself, and understand why 2.5 failed.

"When a specification impedes usefulness, sometimes it's the right
decision to violate it."

"Gee there is no traffic in the on coming lanes, maybe I should use them."

There are rules for how the hardware works, and if everything out there
comes up in 4 seconds great. If everything returns faster than the worst
case great. You start assuming everything behaves that way and you repeat
history.

You guys in 2.5 walked away from the rules because you thought you knew
better, and where did it get you? Lost interrupts, PIO command block
execution failures, dropping EOT on PRDs because of reading something into
the published documents which is not there, etc ...

> your IDE code violating such a specification, but rather not relying
> on this particular guarantee.
>
> 2. Besides, if this code is supposed to be a generic IDE core,
> it many need to run on platforms that do not provide that guarantee or
> where the boot code is not even capable of finding where all of the
> IDE controllers.

It is a means for probing signatures w/o identify to test for presence.
It too has a 31 second rule. Break the worst case and devices get lost.

We already have a problem with PPC and losing devices.
This is where JG's hard work and my time with him explaining it will help
most. There are also cases where RMK's ARM toys do fun things and the
assumption by the driver that POST is valid is DEAD WRONG. I will repeat:
the assumption of my code about POST is DEAD WRONG! POST-like events happen
at different times for various archs.

****

So if we were in the network stack and decided to chuck the "D-gram"
because it got in the way is that cool?

Better, if we were in the SCSI stack and blew off the queue list and
rammed an immediate SCB down the pipes that wastes the device's internal
queue tags, is that okay? The rules got in the way for this command I
wanted to beat down the controller.

> 3. In the hierarchy of upgradability, it is generally easier
> to replace the kernel than the Power On Self Test, which is more often
> in flash or ROM, and which may require help from an unenthusiastic
> hardware vendor. So, it is better to weight trade-offs a few notches
> in favor of avoid reliance on guarantees about the Power On Self Test.
>
> If I understand correctly, the cost of this trade off would be
> adding one or two lines that add perhaps 20 bytes and as many CPU
> cycles at initiailzation (except when this change really is necessary).

Please do not take this personally, because it is a technical argument.
We do it by the books and then we cheat when we can, but only after we
have all the proper stuff in place for compliance.

Cheers,


Andre Hedrick
LAD Storage Consulting Group

2002-08-23 07:46:42

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, 23 Aug 2002, Helge Hafting wrote:

> Andre Hedrick wrote:
> >
> > Oh and it is only useful for borken things like LINBIOS and other
> > braindead systems like ARM that violate the 31 second rule of POST.
> >
> 31s of POST is uselessly slow. Perhaps it is needed when
> the drives _are_ spinning up, but not for the common
> case of rebooting to activate a new kernel
> (or reset button when the dev-kernel hung.) The disk
> is spinning already in those cases, and there should
> be no bootup delay.

Correct, and your case is different from a cold power-on.
Regardless, you issue EXECUTE DIAGNOSTICS and there are devices which wait
until the last minute to respond.

There are things called shadow registers.

That is where the slave device answers for or as the master device in this
special case. Now if you have a master (atapi) without shadow registers but
a slave (atapi) with shadow registers, guess what: sometimes the master
negates the slave's attempt to report. So this command fails here even
after a warm boot.

Now what?


Andre Hedrick
LAD Storage Consulting Group

2002-08-23 08:27:11

by Adam J. Richter

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, 23 Aug 2002, Andre Hedrick wrote:
>On Thu, 22 Aug 2002, Adam J. Richter wrote:

>> 1. Regardless of whatever specification you are referring to
>> or Andre's "31 second rule of [Power On Self Test]", it is genuinely
>> useful to boot faster by overlapping some other kernel work before the
>> drive is. Specifications ultimately exist only to serve this
>> usefulness. When a specification impedes usefulness, sometimes it's
>> the right decision to violate it. Of course, we're not talking about

>Listen to yourself, and understand why 2.5 failed.

>"When a specification impedes usefulness, sometimes it's the right
>decision to violate it."

>"Gee there is no traffic in the on coming lanes, maybe I should use them."

>There are rules for how the hardware works, and if everything out there
>comes up in 4 seconds great. If everything returns faster than the worst
>case great. You start assuming everything behaves that way and you repeat
>history.

>You guys in 2.5 walked away from the rules because you thought you knew
>better, where did it get you? Lost interrupts, PIO command block
>exectution failures, dropping EOT on PRD's because reading something into
>the published documents which is not there, etc ...


Those are not examples of cases where the specification impedes
usefulness. Those are examples where a specification helped usefulness
by helping compatibility (not driving in oncoming lanes, not having timing
problems), and a decision made either out of recklessness (driving into
oncoming lanes) or, I infer, out of a misunderstanding of some document.

However, your point is well taken in that when I said
"sometimes it's the right decision to violate [a specification]," I
did not mean by "sometimes" that this is a random event, like
"sometimes it rains." I meant that it's a very careful decision, the
details of which differ from case to case, and that the result is not
always against violating it.


>> your IDE code violating such a specification, but rather not relying
>> on this particular guarantee.
>>
>> 2. Besides, if this code is supposed to be a generic IDE core,
>> it many need to run on platforms that do not provide that guarantee or
>> where the boot code is not even capable of finding where all of the
>> IDE controllers.

>It is a means for probing signatures w/o identify to test for presence.
>It to has a 31 second rule. Break the worst case and device get lost.

Can you provide a reference for this "31 second rule?" If
your reference does not directly discuss how it would impede the test
that you refer to, then you might want to explain that too. Thanks in
advance.

>Please do not take this personal, because it is a technical arguemnet.

Of course, likewise.

>We do it by the books and then we cheat when we can, but only after we
>have all the proper stuff in place for compliance.

Let's keep in perspective that I am talking about adding a test that
would cause a delay until it is safe to proceed (unless you're saying
that waiting for the busy bit to clear is insufficient, in which case
I'd like to know how, but it would still be no more dangerous), and
you are advocating skipping that test.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."



2002-08-23 08:56:33

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller


Section 6.10

EXECUTE DIAGNOSTICS: Host Shall assert Reset (SRST) prior to issuing.

Section 6.2 SoftReset (SRST)

Device side State Diagram: D0SR3 "Set_Status_State"

second paragraph

"All actions required in this state shall be completed within 31 s"

Page 86 Volume 3 ATA/ATAPI 7 rev 0, 5 November 2001.

Sorry for the old reference; it was the quickest hard copy I could find.

The world of the HOST driver does not live in software alone.
You have to read the device side rules to obtain the rest of the solution.
Yeah this stinks, but very few of us are nutty enough to get the entire
process.

What I need most is people like yourself who are skilled in the kernel api
to help me get that part correct. This stuff is easy and can drive you
insane. Since I am there, you are welcome to come but I don't think you
are weird enough yet :-). If you want to visit my world and stay, jump in
and bring a barf bag for the first part of the ride. Also borrow Linus's
crack pipe too, mine is in use.


Cheers,

Andre Hedrick
LAD Storage Consulting Group

2002-08-23 09:30:26

by Benjamin Herrenschmidt

Subject: Re: IDE-flash device and hard disk on same controller

>There is no ATA bsy flag check at only one point, and that is before
>EXECUTE DEVICE DIAGNOSTIC is issued. The idea with this command is that
>it pretty much stomps up and down the ATA bus, trouncing ongoing
>activity in the process.

After mucking around with problematic drives that needed such
a busy wait on the Xserve, it seems that you still need at least
a busy wait with timeout before doing anything on the bus in
some cases. Typically what happens to me is that the disks
were being reset (via the control register) by the firmware
just prior to booting the kernel, and those disks (or maybe
it's a Promise controller issue?) appear to need up to 30
seconds before being usable again. Waiting for the busy bit
appears to be a working solution; it worked for me at least
and is what Apple did both in Darwin and in the firmware,
prior to sending the execute diag. command.

Though I can still try to send it here and see if it helps...

The actual scenario followed by Apple's firmware apparently
is:

- wait busy to go away
- select 0
- write 8 to control register (clearing any possible
residual reset)
- delay 2ms
- wait busy to go away
- select 1
- write 8 to control register (clearing any possible
residual reset)
- delay 2ms
- wait busy to go away

Then do the normal probe, which in their case involves the
diagnostics command, checking signatures, etc...

I implemented that in ide-probe and this seems to fix the
problem I have with the Xserve and a few other machines,
though I need to get some user reports before I can tell if
it helps with some other problems reported to me with some
ATAPI combo drives.

The only thing I added to it was to have the busy wait loop
exit when reading 0xff from the status reg, assuming some
controllers (especially hand-made embedded stuffs) would
return that when nothing is plugged.
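
For readers following along, the pre-probe sequence listed above, with the
0xff early exit, could be sketched roughly as below. This is only an
illustration under stated assumptions, not the code that went into
ide-probe: struct ata_bus and its accessors are hypothetical, the fake
device exists only so the sketch can be exercised, and the timeout value
is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define ATA_BSY      0x80u   /* Status register, bit 7 */
#define ATA_CTL_IDLE 0x08u   /* the "8" from the list: SRST (bit 2) clear */

enum settle_result { SETTLE_READY, SETTLE_ABSENT, SETTLE_TIMEOUT };

/* Hypothetical bus accessors; a real driver would use its own port I/O. */
struct ata_bus {
    uint8_t (*read_status)(void *ctx);
    void (*select_unit)(void *ctx, int unit);
    void (*write_control)(void *ctx, uint8_t value);
    void (*delay_ms)(void *ctx, unsigned ms);
    void *ctx;
};

/* Poll once per millisecond until BSY clears, 0xff is read, or time runs out. */
static enum settle_result wait_not_busy(const struct ata_bus *bus,
                                        unsigned timeout_ms)
{
    for (unsigned waited = 0; ; waited++) {
        uint8_t status = bus->read_status(bus->ctx);
        if (status == 0xff)
            return SETTLE_ABSENT;      /* assumed: nothing plugged in */
        if (!(status & ATA_BSY))
            return SETTLE_READY;
        if (waited >= timeout_ms)
            return SETTLE_TIMEOUT;
        bus->delay_ms(bus->ctx, 1);
    }
}

/* Wait, then for unit 0 and unit 1: select, write 8 to control, 2ms, wait. */
static enum settle_result settle_before_probe(const struct ata_bus *bus,
                                              unsigned timeout_ms)
{
    enum settle_result rc = wait_not_busy(bus, timeout_ms);
    for (int unit = 0; unit < 2 && rc == SETTLE_READY; unit++) {
        bus->select_unit(bus->ctx, unit);
        bus->write_control(bus->ctx, ATA_CTL_IDLE);
        bus->delay_ms(bus->ctx, 2);
        rc = wait_not_busy(bus, timeout_ms);
    }
    return rc;
}

/* Simulated device for illustration: reports BSY for a number of reads. */
struct fake_dev { unsigned busy_reads_left; uint8_t idle_status; };

static uint8_t fake_status(void *ctx)
{
    struct fake_dev *dev = ctx;
    if (dev->busy_reads_left > 0) {
        dev->busy_reads_left--;
        return ATA_BSY;
    }
    return dev->idle_status;
}
static void fake_select(void *ctx, int unit) { (void)ctx; (void)unit; }
static void fake_control(void *ctx, uint8_t value) { (void)ctx; (void)value; }
static void fake_delay(void *ctx, unsigned ms) { (void)ctx; (void)ms; }

/* Run the sequence against the simulated device. */
enum settle_result demo_settle(unsigned busy_reads, uint8_t idle_status,
                               unsigned timeout_ms)
{
    struct fake_dev dev = { busy_reads, idle_status };
    struct ata_bus bus = { fake_status, fake_select, fake_control,
                           fake_delay, &dev };
    return settle_before_probe(&bus, timeout_ms);
}
```

Returning three distinct results keeps "nothing there" separate from
"still busy after the timeout", so a caller can skip an empty port quickly
while still giving a slow drive its full wait.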

I will try the full execute diag. on the Xserve tonight or
tomorrow and see if it works without the above, but I doubt
it as the drive seems to be totally unresponsive during this
period when it gets out of reset.

Ben.


2002-08-23 09:36:25

by Benjamin Herrenschmidt

Subject: Re: IDE-flash device and hard disk on same controller

>Well, this only applies if you are slack and letting the kernel init
>your ATA from scratch, instead of doing proper ATA initialization in
>firmware ;-)

That will happen. Recent Apple OF will reset all ATA devices before
booting the kernel, thus triggering the problem with some of them,
and ide-pmac will hard-reset (via the reset line) devices on boot
as well to avoid problems caused by bogus firmware or machines booted
from MacOS, which leave the devices in whatever bogus/unknown state
(possibly SLEEP state).

I saw that happening on some embedded platforms as well.

I really think the kernel should be able to do it all, and waiting
around the busy bit is neither complicated nor harmful, so...
>
>Seriously, if you are a handed an ATA device that is actually in
>operation when the kernel boots, you are already out of spec. I would
>prefer to barf if the BSY or DRDY bits are set, because taking over the
>ATA bus while a device is in the middle of a command shouldn't be
>happening at Linux kernel boot, ever.

It will happen when the device just got reset or powered up. It's really
a couple of lines to do that properly (see my other mail about the full
procedure I copied from Apple firmware that seems to work fine on all
HW I've tested so far).

Also, another issue we didn't deal with properly yet is PM. With non-APM
power management (like pmac, but probably also ACPI and some embedded
devices), the devices will be basically powered off during suspend, and
no firmware is here to put them back into life on wakeup. So you have to
redo the bringup, which, in some cases (like hotswap IDE bays on some
PowerBooks) probably involves re-running the probe procedure at least,
then re-setting up the device (SET_FEATURE dance)

Ben.


2002-08-23 09:38:53

by Benjamin Herrenschmidt

Subject: Re: IDE-flash device and hard disk on same controller

>> The problem is that immediately after bootup ATA devices do not respond
>until
>> their media has spun up. Which is both required by the spec, and
>observed in
>> practice. Which is likely a problem if this code is run a few seconds
>after
>> bootup. Which makes it quite possible the drive will ignore the
>EXECUTE DEVICE
>> DIAGNOSTICS and your error code won't be valid when the bsy flag
>> clears. I don't know how serious that would be.
>
>We did POST already.

Well... x86 PCs with ordinary BIOSes did. Other firmwares,
embedded devices, whatever.... may not, or eventually the firmware
will have reset everything prior to booting the kernel (go figure
why, but that happens).

It's not difficult nor harmful to wait for that dawn busy bit to
go away, so why not do it ?

Ben.


2002-08-23 09:50:18

by Andries Brouwer

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, Aug 23, 2002 at 01:31:09AM -0700, Adam J. Richter wrote:

> Can you provide a reference for this "31 second rule?"

Read e.g. ATA-6.

[If things work well, they may well be fast. But you are only entitled to conclude
that something is wrong after waiting for 31 s. In particular, if you want to
detect device 1 (the slave device), then only after 31 s you know that it is absent.]

See for example the Device power-on or hardware reset state diagram in 9.1.
Some random quotes:

1) Following a hardware reset or software reset, the following steps may be
used to reselect Device 1:
a) Write to the Device register with DEV bit set to one;
b) Using one or more of the Command Block registers that may be both written and
read, such as the Sector Count or LBA Low, write a data pattern other than 00h or
FFh to the register(s);
c) Read the register(s) written in step (b). If the data read is the same as the
data written, proceed to step (e);
d) Repeat steps (a) to (c) until the data matches in step (c) or until 31 s has past.
After 31 s the host may assume that Device 1 is not functioning properly;
e) Read the Status register and Error registers. Check the Status and Error register
contents for any error conditions that Device 1 may have posted.
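
Steps (a) to (e) above might be sketched as follows. This is a rough
illustration rather than driver code: struct ata_regs, its accessors, the
55h test pattern, the 100 ms poll interval and the fake Device 1 are all
assumptions made for the example; only the step order and the 31 s limit
come from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ATA_DEV_BIT  0x10u  /* DEV bit of the Device register: 1 = Device 1 */
#define TEST_PATTERN 0x55u  /* any value other than 00h or FFh */

/* Hypothetical register accessors; a real driver has its own I/O layer. */
struct ata_regs {
    void (*write_device)(void *ctx, uint8_t value);
    void (*write_sector_count)(void *ctx, uint8_t value);
    uint8_t (*read_sector_count)(void *ctx);
    uint8_t (*read_status)(void *ctx);
    uint8_t (*read_error)(void *ctx);
    void (*delay_ms)(void *ctx, unsigned ms);
    void *ctx;
};

/* Returns true once Device 1 echoes the pattern; poll_ms must be nonzero. */
static bool reselect_device1(const struct ata_regs *r, unsigned timeout_ms,
                             unsigned poll_ms, uint8_t *status, uint8_t *error)
{
    for (unsigned elapsed = 0; ; elapsed += poll_ms) {
        r->write_device(r->ctx, ATA_DEV_BIT);              /* step (a) */
        r->write_sector_count(r->ctx, TEST_PATTERN);       /* step (b) */
        if (r->read_sector_count(r->ctx) == TEST_PATTERN)  /* step (c) */
            break;
        if (elapsed >= timeout_ms)                         /* step (d) */
            return false;
        r->delay_ms(r->ctx, poll_ms);
    }
    *status = r->read_status(r->ctx);                      /* step (e) */
    *error = r->read_error(r->ctx);
    return true;
}

/* Fake Device 1 that ignores register writes until ready_at_ms. */
struct fake_slave { unsigned now_ms; unsigned ready_at_ms; uint8_t sector_count; };

static void fk_write_device(void *ctx, uint8_t v) { (void)ctx; (void)v; }
static void fk_write_sc(void *ctx, uint8_t v)
{
    struct fake_slave *s = ctx;
    if (s->now_ms >= s->ready_at_ms)
        s->sector_count = v;
}
static uint8_t fk_read_sc(void *ctx)
{
    return ((struct fake_slave *)ctx)->sector_count;
}
static uint8_t fk_read_status(void *ctx) { (void)ctx; return 0x50; } /* made-up value */
static uint8_t fk_read_error(void *ctx) { (void)ctx; return 0x01; }  /* made-up value */
static void fk_delay(void *ctx, unsigned ms)
{
    ((struct fake_slave *)ctx)->now_ms += ms;
}

/* Run the procedure against the fake, polling every 100 ms. */
bool demo_reselect(unsigned ready_at_ms, unsigned timeout_ms)
{
    struct fake_slave dev = { 0, ready_at_ms, 0 };
    struct ata_regs regs = { fk_write_device, fk_write_sc, fk_read_sc,
                             fk_read_status, fk_read_error, fk_delay, &dev };
    uint8_t status = 0, error = 0;
    return reselect_device1(&regs, timeout_ms, 100, &status, &error);
}
```

The loop returns as soon as the pattern reads back, so a responsive device
costs almost nothing; only a missing or broken Device 1 pays the full wait.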

In Standby mode the device is capable of responding to commands but the device
may take longer to complete commands than in the Idle mode. The time to respond
could be as long as 30 s.

In Sleep mode the device requires a hardware or software reset or a DEVICE RESET
command to be activated. The time to respond could be as long as 30 s.

Transition D0HR2b:D0HR3: When the sample indicates that PDIAG- is not asserted
and 31 s have elapsed since the negation of RESET- (SRST), the device shall set
bit 7 in the Error register and make a transition to the D0HR3: Set_status state.

D0SR3: Set_status State: All actions required in this state shall be completed
within 31 s.

D1HR1: Set_status State: This state is entered when the device has asserted DASP-.
When in this state the device shall complete any hardware initialization and
self-diagnostic testing begun in the Set DASP- state if not already completed.
The diagnostic code shall be placed in the Error register (see Table 26).
If the device passed self-diagnostics, the device shall assert PDIAG-.
All actions required in this state shall be completed in 30 s.

D1SR2: Set_status State: (idem for SRST)

DI1: Device_Idle_S State:
When entering this state from a power-on, hardware, or software reset, if the device
does not implement the PACKET command feature set, the device shall set DRDY to one
within 30 s of entering this state. When entering this state from a power-on, hardware,
or software reset, if the device does implement the PACKET command feature set, the
device shall not set DRDY to one.

DI2: Device_Idle_NS State:
When entering this state from a power-on, hardware, or software reset, if the device
does not implement the PACKET command feature set, the device shall set DRDY to one
within 30 s of entering this state and shall release DASP- and PDIAG- within 31 s of
entering this state. When entering this state from a power-on, hardware, or software
reset, if the device does implement the PACKET command feature set, the device shall
not set DRDY to one.

2002-08-23 10:08:32

by Alan

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, 2002-08-23 at 12:41, Benjamin Herrenschmidt wrote:
> Also, another issue we didn't deal with properly yet is PM. With non-APM
> power management (like pmac, but probably also ACPI and some embedded
> devices), the devices will be basically powered off during suspend, and
> no firmware is here to put them back into life on wakeup. So you have to
> redo the bringup, which, in some cases (like hotswap IDE bays on some
> PowerBooks) probably involves re-running the probe procedure at least,

ACPI deals with this itself. The problem however can occur in other
cases (hot plug ATA disks for one).

2002-08-23 10:10:46

by Adam J. Richter

Subject: Re: IDE-flash device and hard disk on same controller

Andre Hedrick wrote:
>Section 6.10

>EXECUTE DIAGNOSTICS: Host Shall assert Reset (SRST) prior to issuing.

>Section 6.2 SoftReset (SRST)

>Device side State Diagram: D0SR3 "Set_Status_State"

>second paragraph

>"All actions required in this state shall be completed within 31 s"

>Page 86 Volume 3 ATA/ATAPI 7 rev 0, 5 November 2001.

>Sorry for the old reference it was the quickest hard copy I could find.


Okay. I found it on page 90 Volume 2 ATA/ATAPI 7 rev 0d,
8 July 2002 (http://www.t13.org/docs2002/d1532v2r0d.pdf). Thanks for
the pointer. I don't think I would have found it otherwise.

As far as I can tell from looking at the state diagram, if the
BSY bit is clear, the D0SR3-->idle_S transition has occurred, as that
is the only transition in the entire diagram that clears the BSY bit.
The transition is labelled with BSY=0, and all other states are explicitly
labelled with BSY=1. Likewise for the device 1 reset state diagram on
page 92 of the same document.

There should also be no race before the busy bit is set
because "the device shall set BSY to one within 400ns after entering
this state [D0SR0: SRST]" (I think SRST means "software reset start"),
at least if we assume that it takes at least 400 *nanoseconds* to get
this far in the boot process.

31 seconds seems to be the maximum amount of time that the drive
will take to reach the idle_S state, but it looks OK to poll BSY to see
if it has gotten there sooner.

It looks to me like the scenario that Eric W. Biederman wanted
(the POST takes less than 31 seconds and the IDE driver checks
BSY until it clears) is "in spec" with respect to those device 0 and
device 1 software reset state diagrams.

Do you concur? Do you see another problem?

By the way, it might still be useful to have a timeout after
31 seconds, and fail the initialization if BSY is still set at that
point, so that the computer might be able to boot up far enough to
call for help if it's a non-critical drive.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

2002-08-23 10:44:04

by Adam J. Richter

Subject: Re: IDE-flash device and hard disk on same controller

>Date: Fri, 23 Aug 2002 11:54:21 +0200
>From: Andries Brouwer <[email protected]>

>Read e.g. ATA-6.

> [If things work well, they may well be fast. But you are only entitled
> to conclude that something is wrong after waiting for 31 s. In
> particular, if you want to detect device 1 (the slave device), then
> only after 31 s you know that it is absent.]

We're only talking about going fast if we detect that things have gone well.


>See for example the Device power-on or hardware reset state diagram in 9.1.

Thanks for the reference. Pointing to the actual standards
makes these discussions a lot more efficient.

What I said to Andre about software reset also apparently
applies to hardware reset.

The state diagram that you refer to for hardware reset is on
page 312 of ATA/ATAPI-6 revision 3b, 28 February 2002 and
page 85 of ATA/ATAPI-7 revision 0d, 8 July 2002. As with
software reset, BSY is asserted within 400ns of reset being asserted
and remains set until the transition at the end of the diagram
(D0HR3:Set_Status --> Device_idle_S).

So, I think that Eric Biederman's suggestion about waiting for
BSY to clear, so as to accommodate Power On Self Test that can complete
in under 31 seconds should be OK.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

2002-08-23 10:48:23

by Geert Uytterhoeven

Subject: Re: IDE-flash device and hard disk on same controller

On Thu, 22 Aug 2002, Andre Hedrick wrote:
> Oh and it is only useful for borken things like LINBIOS and other
> braindead systems like ARM that violate the 31 second rule of POST.

Is the 31 second rule defined for the PC or for IDE?

This would explain why my Amiga doesn't identify my Conner CP30540A after a
cold boot, because the disk isn't spun up yet. Apparently AmigaOS doesn't
follow the `31 second rule'...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2002-08-23 10:59:50

by Russell King

Subject: Re: IDE-flash device and hard disk on same controller

On Thu, Aug 22, 2002 at 09:26:04PM -0400, Jeff Garzik wrote:
> Well, this only applies if you are slack and letting the kernel init
> your ATA from scratch, instead of doing proper ATA initialization in
> firmware ;-)

Assuming that you have firmware. What about the case of PCMCIA drives
that you plug in after the kernel has booted and get registered with
IDE almost immediately?

> Seriously, if you are a handed an ATA device that is actually in
> operation when the kernel boots, you are already out of spec. I would
> prefer to barf if the BSY or DRDY bits are set, because taking over the
> ATA bus while a device is in the middle of a command shouldn't be
> happening at Linux kernel boot, ever.

Erm, no. Read the spec. When the drive is spinning up from power on,
BSY is set. BSY may be set for up to 30 seconds or so until the platters
are at full speed. (Some drives take even longer, maybe 40 seconds.)
Once this is so, there are magic bytes you can read from the drive
register that tell you if the device is AT or ATA. These aren't valid
until that BSY bit has cleared.
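
A rough sketch of that check is below. It assumes the "magic bytes" are the
post-reset signature that separates plain ATA devices from ATAPI (PACKET)
ones, and it uses the values 01h/01h/00h/00h and 01h/01h/14h/EBh from
memory of the ATA/ATAPI documents, so please verify them against the spec.
The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ATA_BSY 0x80u  /* Status register, bit 7 */

enum ata_kind { ATA_KIND_NOT_READY, ATA_KIND_ATA, ATA_KIND_ATAPI, ATA_KIND_UNKNOWN };

/* Snapshot of the registers a driver would read after a reset. */
struct ata_signature {
    uint8_t status, sector_count, lba_low, lba_mid, lba_high;
};

/* The signature bytes are only looked at once BSY has cleared. */
enum ata_kind classify_signature(const struct ata_signature *s)
{
    if (s->status & ATA_BSY)
        return ATA_KIND_NOT_READY;
    if (s->sector_count != 0x01 || s->lba_low != 0x01)
        return ATA_KIND_UNKNOWN;
    if (s->lba_mid == 0x00 && s->lba_high == 0x00)
        return ATA_KIND_ATA;
    if (s->lba_mid == 0x14 && s->lba_high == 0xEB)
        return ATA_KIND_ATAPI;
    return ATA_KIND_UNKNOWN;
}

/* Convenience wrapper so the check can be tried with literal values. */
enum ata_kind classify(uint8_t status, uint8_t sector_count, uint8_t lba_low,
                       uint8_t lba_mid, uint8_t lba_high)
{
    struct ata_signature s = { status, sector_count, lba_low, lba_mid, lba_high };
    return classify_signature(&s);
}
```

Checking BSY first matters more than the byte values themselves: while BSY
is set, whatever is read from the other registers says nothing about the
device.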

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-08-23 11:03:19

by Russell King

Subject: Re: IDE-flash device and hard disk on same controller

On Thu, Aug 22, 2002 at 08:19:18PM -0700, Andre Hedrick wrote:
> Oh and it is only useful for borken things like LINBIOS and other
> braindead systems like ARM that violate the 31 second rule of POST.

Umm, there are no such ARM based Linux devices that violate this.
Certainly none using boot loaders I've written.

> RMK, don't take it personally, but ARM is a headache of the nth degree.
> You and I know it, otherwise I would not have a raw TTL IDE PCI card to
> mimic your arch.

You're talking crap here.

That machine doesn't get anywhere near Linux until the drives are fully
up and running. This is more or less guaranteed since the kernel comes
off its one and only hard drive. Therefore, by definition the drive
must have completed its diagnostics before Linux boots.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-08-23 11:04:50

by Russell King

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, Aug 23, 2002 at 12:50:49PM +0200, Geert Uytterhoeven wrote:
> On Thu, 22 Aug 2002, Andre Hedrick wrote:
> > Oh and it is only useful for borken things like LINBIOS and other
> > braindead systems like ARM that violate the 31 second rule of POST.
>
> Is the 31 second rule defined for the PC or for IDE?

It's in the ATA specs.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-08-23 11:06:53

by Russell King

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, Aug 23, 2002 at 12:45:03AM -0700, Andre Hedrick wrote:
> This is where JG's hard work and my time with him explaining it will help
> most. Also case where RMK's ARM toys do fun things and the assumption by
> the driver that POST is valid is DEAD WRONG. I will repeat the assumption
> of my code about POST is DEAD WRONG! POST like events happen at different
> times for various archs.

Yet more FUD. Andre - go away and come back once you've calmed down.

Maybe it's because you don't actually understand my IDE hardware. I
dunno. But you are "DEAD WRONG" about the crap you've written above.

Completely.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-08-23 13:18:23

by Eric W. Biederman

Subject: Re: IDE-flash device and hard disk on same controller

Jeff Garzik <[email protected]> writes:

> Eric W. Biederman wrote:
> > The problem is that immediately after bootup ATA devices do not respond until
> > their media has spun up. Which is both required by the spec, and observed in
> > practice. Which is likely a problem if this code is run a few seconds after
> > bootup. Which makes it quite possible the drive will ignore the EXECUTE
> DEVICE
>
> > DIAGNOSTICS and your error code won't be valid when the bsy flag
> > clears. I don't know how serious that would be.
>
>
> Well, this only applies if you are slack and letting the kernel init your ATA
> from scratch, instead of doing proper ATA initialization in firmware ;-)

That would be nice. I do admit it is hard to trigger if you don't do
it deliberately.

The x86 BIOS specifications say only the boot devices must be
initialized, before the BIOS gives up control. A more likely
reproducer is a plug-in ata controller that the BIOS does not
recognize, and the kernel does.

> Seriously, if you are a handed an ATA device that is actually in operation when
> the kernel boots, you are already out of spec. I would prefer to barf if the
> BSY or DRDY bits are set, because taking over the ATA bus while a device is in
> the middle of a command shouldn't be happening at Linux kernel boot, ever.

Throwing an error and giving up would certainly be a safe response,
though it is a strange way to handle in-spec hardware and firmware
behavior. On the other hand, it is a rare enough case that deliberately
not coping with it is probably fine.

Eric

2002-08-23 17:19:12

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, 23 Aug 2002, Russell King wrote:

> On Fri, Aug 23, 2002 at 12:45:03AM -0700, Andre Hedrick wrote:
> > This is where JG's hard work and my time with him explaining it will help
> > most. Also case where RMK's ARM toys do fun things and the assumption by
> > the driver that POST is valid is DEAD WRONG. I will repeat the assumption
> > of my code about POST is DEAD WRONG! POST like events happen at different
> > times for various archs.
>
> Yet more FUD. Andre - go away and come back once you've calmed down.
>
> Maybe its because you don't actually understand my IDE hardware. I
> dunno. But you are "DEAD WRONG" about the crap you've written above.
>
> Completely.

NO, I just misreferenced you because you were asked to help in a situation
with G Britton, as you have refreshed my memory with a hammer.

So I apologize for mixing events and people.

The point still stands: there are cases where the OS is launched,
regardless of bootloading, where devices can be lost if the time limits
are not followed. In the case where my memory failed, you reminded me the
device took 40 secs to spin up.

Not dead wrong just dead tired again.

Cheers,


Andre Hedrick
LAD Storage Consulting Group

2002-08-23 17:53:02

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, 23 Aug 2002, Andre Hedrick wrote:

> On Fri, 23 Aug 2002, Russell King wrote:
>
> > On Fri, Aug 23, 2002 at 12:45:03AM -0700, Andre Hedrick wrote:
> > > This is where JG's hard work and my time with him explaining it will help
> > > most. Also case where RMK's ARM toys do fun things and the assumption by
> > > the driver that POST is valid is DEAD WRONG. I will repeat the assumption
> > > of my code about POST is DEAD WRONG! POST like events happen at different
> > > times for various archs.
> >
> > Yet more FUD. Andre - go away and come back once you've calmed down.
> >
> > Maybe it's because you don't actually understand my IDE hardware. I
> > dunno. But you are "DEAD WRONG" about the crap you've written above.
> >
> > Completely.
>
> NO, I just misreferenced you because you were asked to help in a situation
> with G Britton, as you have refreshed my memory with a hammer.
>
> So I apologize for mixing events and people.
>
> The point still stands: there are cases where the OS is launched,
> regardless of bootloading, where devices can be lost if the time limits
> are not followed. In the case where my memory failed, you reminded me the
> device took 40 secs to spinup.

Oh, I forgot to add, I said "POST-like" events.
If Execute Diagnostics is never issued with the prior reset of the
devices/bus, we do not have a quantified state the system should be left in
at bootloading.

Therefore I suspect the device is compliant and the HOST is bad.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

2002-08-24 01:58:39

by Jeff Garzik

Subject: Re: IDE-flash device and hard disk on same controller

Benjamin Herrenschmidt wrote:
> Well... x86 PCs with ordinary BIOSes did. Other firmwares,
> embedded devices, whatever.... may not, or eventually the firmware
> will have reset everything prior to booting the kernel (go figure
> why, but that happens).
>
> It's not difficult nor harmful to wait for that damn busy bit to
> go away, so why not do it?


Basically think about the consequences of trying to handle a completely
unknown state -- if you are going to attempt to handle this you would
need to check for data, not just the BSY bit. And read the data into a
throwaway buffer, if there is data to be read, or write the data it's
expecting.

So it's not just the busy bit :)
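
For illustration only, here is a minimal sketch of what "more than the busy
bit" could look like, run against a mock device rather than real hardware.
The names mock_dev, dev_status, dev_read_data and drain_stale_data are
hypothetical and do not exist in the IDE driver; only the BSY (0x80) and
DRQ (0x08) status bit values are standard ATA. It covers just the read
direction (discarding data the device still wants to hand over), not the
case where the device is waiting to be written to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard ATA status register bits. */
#define ATA_BSY 0x80 /* device busy */
#define ATA_DRQ 0x08 /* device has data to transfer */

/* Hypothetical mock of a device left mid-transfer; not a kernel structure. */
struct mock_dev {
    uint8_t status;       /* simulated status register */
    size_t words_pending; /* 16-bit words the device still wants to hand over */
};

/* Simulated status register read. */
static uint8_t dev_status(const struct mock_dev *dev)
{
    return dev->status;
}

/* Simulated data register read: DRQ drops once the last word is taken. */
static uint16_t dev_read_data(struct mock_dev *dev)
{
    if (dev->words_pending > 0 && --dev->words_pending == 0)
        dev->status &= (uint8_t)~ATA_DRQ;
    return 0;
}

/*
 * Discard stale read data. Returns the number of words thrown away,
 * or -1 if the device is busy or keeps asserting DRQ past max_words.
 */
static long drain_stale_data(struct mock_dev *dev, size_t max_words)
{
    long drained = 0;

    assert(dev != NULL);
    if (dev_status(dev) & ATA_BSY)
        return -1; /* registers are not valid while BSY is set */

    while (dev_status(dev) & ATA_DRQ) {
        if ((size_t)drained >= max_words)
            return -1; /* give up rather than spin forever */
        (void)dev_read_data(dev); /* throwaway read */
        drained++;
    }
    return drained;
}
```

A caller would treat -1 as "leave this device alone" and any other value
as the amount of leftover data that had to be discarded before probing.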

2002-08-24 09:04:27

by Benjamin Herrenschmidt

Subject: Re: IDE-flash device and hard disk on same controller

>> Well... x86 PCs with ordinary BIOSes did. Other firmwares,
>> embedded devices, whatever.... may not, or eventually the firmware
>> will have reset everything prior to booting the kernel (go figure
>> why, but that happens).
>>
>> It's not difficult nor harmful to wait for that damn busy bit to
>> go away, so why not do it?
>
>
>Basically think about the consequences of trying to handle a completely
>unknown state -- if you are going to attempt to handle this you would
>need to check for data, not just the BSY bit. And read the data into a
>throwaway buffer, if there is data to be read, or write the data it's
>expecting.
>
>So it's not just the busy bit :)

But are we dealing with completely unknown states? That's not
what I'm saying. We are dealing with:

- Hotswap in (pcmcia, ...)
- Firmware that doesn't wait after reset
- Interfaces (like ide-pmac) that hard reset the disk
- no POST code (power-on reset)

A completely unknown state doesn't work today and I don't think
we should really care about it. If an arch wants to deal with
such a state (which is +/- the case of ide-pmac when booting
from MacOS), then that arch has to do the specifics of dealing
with that (in the ide-pmac case, toggling the hard reset line of
the drive).

So I still think it would make sense to wait. Now, what I
suggested to Andre on IRC is that we could eventually make
that wait conditional on some HWIF flag set by the arch
(or rather the HBA driver) if you really don't want to do it in
the generic case.

Ben.

2002-08-24 20:26:22

by Andre Hedrick

Subject: Re: IDE-flash device and hard disk on same controller


Please respect the wish of those other two guys who asked to be removed
from the thread.

Russell, look at what T13 did. It gave CFA/PCMCIA their own set of opcodes
and rules. Meaning they can do it different. What is so hard to see that
they do it different and they have the resources to do so.

It means, gee we need to go resolve the pcmcia issues in native calls.

On Sat, 24 Aug 2002, Russell King wrote:

> I notice everyone decided to miss replying to my mail about PCMCIA
> IDE devices, which will trip you up here. Could it be because I've
> identified a real problem here?
>
> - You plug the IDE device in.
> - Power gets applied.
> - cardmgr loads ide_cs.
> - cardmgr binds ide_cs, which registers with the IDE layer.
>
> The above happens in 10s of milliseconds, well before the hard drive
> platters have been spun up. Meanwhile, as defined by the T13 specs,
> the BSY bit can be set for up to 31 seconds.
>
> You're saying "completely unknown state". I say "T13 defines this
> state extremely well, and defines what happens from the drive's point
> of view at the end of the power on reset sequence extremely well."
>
> I also say that your implementation above is, in andrespeak, a "bad
> host" because it doesn't follow the T13 power on reset sequence
> properly.
>
> And yes, people _do_ use PCMCIA IDE drives with Linux.
>
> --
> Russell King ([email protected]) The developer of ARM Linux
> http://www.arm.linux.org.uk/personal/aboutme.html
>

Andre Hedrick
LAD Storage Consulting Group

2002-08-24 08:37:27

by Russell King

Subject: Re: IDE-flash device and hard disk on same controller

On Fri, Aug 23, 2002 at 10:02:44PM -0400, Jeff Garzik wrote:
> Basically think about the consequences of trying to handle a completely
> unknown state -- if you are going to attempt to handle this you would
> need to check for data, not just the BSY bit. And read the data into a
> throwaway buffer, if there is data to be read, or write the data it's
> expecting.
>
> So it's not just the busy bit :)

I notice everyone decided to miss replying to my mail about PCMCIA
IDE devices, which will trip you up here. Could it be because I've
identified a real problem here?

- You plug the IDE device in.
- Power gets applied.
- cardmgr loads ide_cs.
- cardmgr binds ide_cs, which registers with the IDE layer.

The above happens in 10s of milliseconds, well before the hard drive
platters have been spun up. Meanwhile, as defined by the T13 specs,
the BSY bit can be set for up to 31 seconds.

You're saying "completely unknown state". I say "T13 defines this
state extremely well, and defines what happens from the drive's point
of view at the end of the power on reset sequence extremely well."

I also say that your implementation above is, in andrespeak, a "bad
host" because it doesn't follow the T13 power on reset sequence
properly.

And yes, people _do_ use PCMCIA IDE drives with Linux.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html