2006-09-11 13:22:54

by Jeff Garzik

Subject: What's in libata-dev.git


The following libata changes are queued for 2.6.19:

General
-------
* Move libata to drivers/ata
* Serial Attached SCSI (SAS) attachment API
* Increase lba28 max sectors from 200 to 256
* Take the opportunity to rename a bunch of functions, and one filename
* More error handling improvements

Driver-specific
---------------
* ahci: suspend/resume support
* ahci: support some new SiS controllers
* sata_via: new PCI ID
* sata_sil: remove unaffected configurations from mod15write blacklist


The 'upstream' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git

contains the following updates:

Documentation/DocBook/libata.tmpl | 12
drivers/Kconfig | 2
drivers/Makefile | 1
drivers/ata/Kconfig | 493 ++
drivers/ata/Makefile | 63
drivers/ata/ahci.c | 1684 +++++++++
drivers/ata/ata_generic.c | 252 +
drivers/ata/ata_piix.c | 1258 +++++++
drivers/ata/libata-core.c | 6143 +++++++++++++++++++++++++++++++++++
drivers/ata/libata-eh.c | 2246 ++++++++++++
drivers/ata/libata-scsi.c | 3322 ++++++++++++++++++
drivers/ata/libata-sff.c | 1109 ++++++
drivers/ata/libata.h | 122
drivers/ata/pata_ali.c | 679 +++
drivers/ata/pata_amd.c | 707 ++++
drivers/ata/pata_artop.c | 518 ++
drivers/ata/pata_atiixp.c | 306 +
drivers/ata/pata_cmd64x.c | 505 ++
drivers/ata/pata_cs5520.c | 336 +
drivers/ata/pata_cs5530.c | 387 ++
drivers/ata/pata_cs5535.c | 291 +
drivers/ata/pata_cypress.c | 227 +
drivers/ata/pata_efar.c | 342 +
drivers/ata/pata_hpt366.c | 478 ++
drivers/ata/pata_hpt37x.c | 1257 +++++++
drivers/ata/pata_hpt3x2n.c | 597 +++
drivers/ata/pata_hpt3x3.c | 226 +
drivers/ata/pata_isapnp.c | 156
drivers/ata/pata_it8172.c | 288 +
drivers/ata/pata_it821x.c | 847 ++++
drivers/ata/pata_jmicron.c | 266 +
drivers/ata/pata_legacy.c | 949 +++++
drivers/ata/pata_mpiix.c | 313 +
drivers/ata/pata_netcell.c | 175
drivers/ata/pata_ns87410.c | 236 +
drivers/ata/pata_oldpiix.c | 339 +
drivers/ata/pata_opti.c | 292 +
drivers/ata/pata_optidma.c | 547 +++
drivers/ata/pata_pcmcia.c | 393 ++
drivers/ata/pata_pdc2027x.c | 869 ++++
drivers/ata/pata_pdc202xx_old.c | 423 ++
drivers/ata/pata_qdi.c | 403 ++
drivers/ata/pata_radisys.c | 335 +
drivers/ata/pata_rz1000.c | 205 +
drivers/ata/pata_sc1200.c | 287 +
drivers/ata/pata_serverworks.c | 587 +++
drivers/ata/pata_sil680.c | 381 ++
drivers/ata/pata_sis.c | 1030 +++++
drivers/ata/pata_sl82c105.c | 388 ++
drivers/ata/pata_triflex.c | 285 +
drivers/ata/pata_via.c | 568 +++
drivers/ata/pdc_adma.c | 740 ++++
drivers/ata/sata_mv.c | 2465 ++++++++++++++
drivers/ata/sata_nv.c | 595 +++
drivers/ata/sata_promise.c | 844 ++++
drivers/ata/sata_promise.h | 157
drivers/ata/sata_qstor.c | 730 ++++
drivers/ata/sata_sil.c | 728 ++++
drivers/ata/sata_sil24.c | 1227 ++++++
drivers/ata/sata_sis.c | 347 +
drivers/ata/sata_svw.c | 508 ++
drivers/ata/sata_sx4.c | 1502 ++++++++
drivers/ata/sata_uli.c | 300 +
drivers/ata/sata_via.c | 502 ++
drivers/ata/sata_vsc.c | 482 ++
drivers/pci/quirks.c | 6
drivers/scsi/Kconfig | 138
drivers/scsi/Makefile | 16
drivers/scsi/ahci.c | 1473 --------
drivers/scsi/ata_piix.c | 1008 -----
drivers/scsi/libata-bmdma.c | 1149 ------
drivers/scsi/libata-core.c | 6015 ----------------------------------
drivers/scsi/libata-eh.c | 2246 ------------
drivers/scsi/libata-scsi.c | 3173 ------------------
drivers/scsi/libata.h | 117
drivers/scsi/pdc_adma.c | 740 ----
drivers/scsi/sata_mv.c | 2468 --------------
drivers/scsi/sata_nv.c | 595 ---
drivers/scsi/sata_promise.c | 844 ----
drivers/scsi/sata_promise.h | 157
drivers/scsi/sata_qstor.c | 730 ----
drivers/scsi/sata_sil.c | 727 ----
drivers/scsi/sata_sil24.c | 1222 ------
drivers/scsi/sata_sis.c | 347 -
drivers/scsi/sata_svw.c | 508 --
drivers/scsi/sata_sx4.c | 1502 --------
drivers/scsi/sata_uli.c | 300 -
drivers/scsi/sata_via.c | 501 --
drivers/scsi/sata_vsc.c | 482 --
include/asm-alpha/libata-portmap.h | 1
include/asm-generic/libata-portmap.h | 12
include/asm-i386/libata-portmap.h | 1
include/asm-ia64/libata-portmap.h | 1
include/asm-powerpc/libata-portmap.h | 1
include/asm-sparc/libata-portmap.h | 1
include/asm-sparc64/libata-portmap.h | 1
include/asm-x86_64/libata-portmap.h | 1
include/linux/ata.h | 26
include/linux/libata.h | 76
99 files changed, 45337 insertions(+), 26500 deletions(-)

Alan Cox:
libata: rework legacy handling to remove much of the cruft
libata: Add CompactFlash support

Alexey Dobriyan:
CONFIG_PM=n slim: drivers/scsi/sata_sil*

Andres Salomon:
[libata] sata_mv: errata check buglet fix

Brian King:
libata: Add ata_host_set_init
libata: Add ata_port_init
libata: Move ata_probe_ent_alloc to libata_core
libata: Add support for SATA attachment to SAS adapters

Henrik Kretzschmar:
libata: change path to libata in libata.tmpl

Jay Cliburn:
sata_via: Add SATA support for vt8237a

Jeff Garzik:
[libata] ahci: add SiS PCI IDs
[libata] some function renaming
[libata] Kill 'count' var in ata_device_add()
[ATA] Increase lba48 max-sectors from 200 to 256.
Move libata to drivers/ata.
libata: Remove SCSI_ prefix from Kconfig symbols
libata: Separate libata.ko build from individual driver builds
[libata] ata_piix: add missing kfree()
libata: Make sure drivers/ata is a separate Kconfig menu
Clean up drivers/ata/Kconfig a bit.
libata: Grand renaming.
Rename libata-bmdma.c to libata-sff.c.
[libata] Add a bunch of PATA drivers.
[libata] Trim trailing whitespace.
[libata #pata-drivers] Trim trailing whitespace.
[libata] Add pata_jmicron driver to Kconfig, Makefile

Pavel Roskin:
libata: replace pci_module_init() with pci_register_driver()

Tejun Heo:
sata_sil: remove unaffected drives from m15w blacklist
ahci: relocate several internal functions
ahci: cosmetic changes to ahci_start/stop_engine()
ahci: simplify ahci_start_engine()
libata: improve driver initialization and deinitialization
ahci: separate out ahci_reset_controller() and ahci_init_controller()
ahci: implement Power Management support
libata: cosmetic changes to PM functions
ahci: remove IRQ mask clearing from init_controller()
libata: update ata_host_init() and rename it to ata_port_init_shost()
libata: implement per-dev xfermask
libata: implement dummy port
libata: use dummy port for stolen legacy ports
libata: replace ap->hard_port_no with ap->port_no
libata: kill unused hard_port_no and legacy_mode
libata: s/CONFIG_SCSI_SATA/CONFIG_[S]ATA/g in pci/quirks.c
ata_piix: add map 01b for ICH7M

zhao, forrest:
The redefinition of ahci_start_engine() and ahci_stop_engine()


2006-09-11 13:33:34

by Sergei Shtylyov

Subject: Re: What's in libata-dev.git

Hello.

Jeff Garzik wrote:
> The following libata changes are queued for 2.6.19:
>
> General
> -------
> * Increase lba28 max sectors from 200 to 256

[...]

> Jeff Garzik:
[...]
> [ATA] Increase lba48 max-sectors from 200 to 256.

So was it for LBA28 or for LBA48?
As for LBA28, it might be quite dangerous. Particularly, I know that IBM
drives used to mistreat 256 as 0 in the past (bumped into that on an 8-year
old drive which is still alive though).

WBR, Sergei

2006-09-11 13:37:41

by Jeff Garzik

Subject: Re: What's in libata-dev.git

Sergei Shtylyov wrote:
> Hello.
>
> Jeff Garzik wrote:
>> The following libata changes are queued for 2.6.19:
>>
>> General
>> -------
>> * Increase lba28 max sectors from 200 to 256
>
> [...]
>
>> Jeff Garzik:
> [...]
>> [ATA] Increase lba48 max-sectors from 200 to 256.
>
> So was it for LBA28 or for LBA48?
> As for LBA28, it might be quite dangerous. Particularly, I know that
> IBM drives used to mistreat 256 as 0 in the past (bumped into that on
> an 8-year old drive which is still alive though).

That's a typo. The first description ("lba28") is correct.

Let me know if your IBM drive has problems with current
libata-dev.git#upstream...

Jeff



2006-09-11 13:45:22

by Sergei Shtylyov

Subject: Re: What's in libata-dev.git

Hello.

Jeff Garzik wrote:

>>> The following libata changes are queued for 2.6.19:
>>>
>>> General
>>> -------
>>> * Increase lba28 max sectors from 200 to 256
>>
>>
>> [...]
>>
>>> Jeff Garzik:
>>
>> [...]
>>
>>> [ATA] Increase lba48 max-sectors from 200 to 256.

>> So was it for LBA28 or for LBA48?
>> As for LBA28, it might be quite dangerous. Particularly, I know
>> that IBM drives used to mistreat 256 as 0 in the past (bumped into
>> that on an 8-year old drive which is still alive though).

> That's a typo. The first description ("lba28") is correct.

> Let me know if your IBM drive has problems with current
> libata-dev.git#upstream...

It's not likely I'll be able to try it. But I'm absolutely sure that drive
aborted the read commands with the sector count of 0 (i.e. 256 actually). The
exact model was IBM DHEA-34331.
255 sectors actually seems a safer bet.

WBR, Sergei

2006-09-11 13:49:07

by Jeff Garzik

Subject: Re: What's in libata-dev.git

Sergei Shtylyov wrote:
> It's not likely I'll be able to try it. But I'm absolutely sure that
> drive aborted the read commands with the sector count of 0 (i.e. 256
> actually). The exact model was IBM DHEA-34331.
> 255 sectors actually seems more safe bet.

This sort of thing should be handled by quirks, depending on the
controller and drive.

That's why I was asking for testing, to see if the current code already
handles this.

Jeff


2006-09-11 14:40:14

by Alan

Subject: Re: What's in libata-dev.git

Ar Llu, 2006-09-11 am 17:47 +0400, ysgrifennodd Sergei Shtylyov:
> It's not likely I'll be able to try it. But I'm absolutely sure that drive
> aborted the read commands with the sector count of 0 (i.e. 256 actually). The
> exact model was IBM DHEA-34331.

Several people reported this problem when we tried 256, years ago, in
drivers/ide. You might want to do 256 for SATA, Jeff, but please don't do
256 for PATA. Reading specs is too hard for some people ;)

Some drives aborted the xfer, some just choked.

2006-09-11 14:44:40

by Jeff Garzik

Subject: Re: What's in libata-dev.git

Alan Cox wrote:
> Ar Llu, 2006-09-11 am 17:47 +0400, ysgrifennodd Sergei Shtylyov:
>> It's not likely I'll be able to try it. But I'm absolutely sure that drive
>> aborted the read commands with the sector count of 0 (i.e. 256 actually). The
>> exact model was IBM DHEA-34331.
>
> Several people reported this problem when we tried 256 years ago in
> drivers/ide. You might want to do 256 for SATA Jeff but please don't do
> 256 for PATA. Reading specs is too hard for some people ;)
>
> Some drives abort the xfer, some just choked.

Where in drivers/ide is it limited to 255?

Jeff



2006-09-11 14:53:35

by Linus Torvalds

Subject: Re: What's in libata-dev.git



On Mon, 11 Sep 2006, Jeff Garzik wrote:

> Sergei Shtylyov wrote:
> > It's not likely I'll be able to try it. But I'm absolutely sure that
> > drive aborted the read commands with the sector count of 0 (i.e. 256
> > actually). The exact model was IBM DHEA-34331.
> > 255 sectors actually seems more safe bet.
>
> This sort of thing should be handled by quirks, depending on the controller
> and drive.

Please don't play games with people's data-safety.

It is absolutely INCORRECT to think that "things should work as
documented, let's fix it up with quirks".

It's a hell of a lot better to instead say "people f*ck up, this is a
known point of trouble, and let's just not push the envelope that hard".

Making max-sectors be 255 instead of 256 just _avoids_ the problem that
the ATA protocol uses a single-byte control register for the sector
number, and that "0" is supposed to mean "256", but people have been
_known_ to get it wrong several times.
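
To make the encoding concrete, here is a minimal C sketch of the register behaviour described above. The function names are hypothetical (they are not libata's), and the "buggy" decoder is only an assumed model of a drive that takes zero literally.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a transfer length (1..256 sectors) into the 8-bit sector
 * count register of an LBA28 command.  256 does not fit in 8 bits,
 * so the protocol encodes it as 0. */
uint8_t lba28_encode_count(unsigned int nsect)
{
	assert(nsect >= 1 && nsect <= 256);
	return (uint8_t)(nsect & 0xff);	/* 256 -> 0 */
}

/* What a spec-conforming drive does with the register value. */
unsigned int lba28_decode_count(uint8_t reg)
{
	return reg ? reg : 256;
}

/* Assumed model of a broken drive: "zero means zero". */
unsigned int lba28_decode_count_buggy(uint8_t reg)
{
	return reg;
}
```

With a cap of 255 the encoder never emits 0, so the conforming and the buggy decoder agree on every command that is actually sent.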

It's not like it's even strange and inexplicable that some drive
controller would think that "zero means zero". Quite the reverse. It's a
strange special case, and it's not surprising at all that people would
have gotten it wrong several times independently.

It's not even like you'd get magically higher performance by using 256
sectors, so there's simply no win from living dangerously. Only losses.

Linus

2006-09-11 15:02:59

by Sergei Shtylyov

Subject: Re: What's in libata-dev.git

Hello.

Jeff Garzik wrote:
>> Ar Llu, 2006-09-11 am 17:47 +0400, ysgrifennodd Sergei Shtylyov:

>>> It's not likely I'll be able to try it. But I'm absolutely sure
>>> that drive aborted the read commands with the sector count of 0 (i.e.
>>> 256 actually). The exact model was IBM DHEA-34331.

>> Several people reported this problem when we tried 256 years ago in
>> drivers/ide. You might want to do 256 for SATA Jeff but please don't do
>> 256 for PATA. Reading specs is too hard for some people ;)

>> Some drives abort the xfer, some just choked.

> Where in drivers/ide is it limited to 255?

Hm, indeed, it's 256 there...
But the changelog in ide-probe.c suggests they were limited to 255 once
upon a time. Also, hd.c still has this limit, and a changelog telling why it was
so...

WBR, Sergei

2006-09-11 15:05:29

by Alan

Subject: Re: What's in libata-dev.git

Ar Llu, 2006-09-11 am 10:44 -0400, ysgrifennodd Jeff Garzik:
> > drivers/ide. You might want to do 256 for SATA Jeff but please don't do
> > 256 for PATA. Reading specs is too hard for some people ;)
> >
> > Some drives abort the xfer, some just choked.
>
> Where in drivers/ide is it limited to 255?

Being a sensible sanity check it was removed, and that was a small
mistake. Some 2.4 also has a 256 limit and it broke various transparent
RAID units, older Maxtors (1Gb or so), some IBM drives etc. Got fixed in
-ac but never in base.

The failure pattern is pretty ugly too, your box runs and runs and
eventually you get a linear 256 sector I/O and it all blows up,
sometimes. The IBMs abort the xfer but the Maxtors may or may not get
it right (it's as if half the firmware has the right test).

We could perhaps do it by ATA version - 255 for ATA < 3, 256 for ATA 3+,
lots for LBA48? That's assuming you can show 256 sectors is faster than
255. I'd bet for normal I/O it's unmeasurably small.
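
A rough sketch of what such a per-version policy could look like in C. This is only an illustration of the suggestion: the function name is hypothetical, and the LBA48 value is a placeholder for "lots", not a figure taken from any existing driver.

```c
#include <stdbool.h>

/* Hypothetical policy helper: pick a per-device transfer limit from
 * the ATA major version reported in the IDENTIFY data. */
unsigned int pick_max_sectors(unsigned int ata_major, bool lba48)
{
	if (lba48)
		return 65535;	/* placeholder for "lots" */
	if (ata_major < 3)
		return 255;	/* old drives: never send count == 0 */
	return 256;		/* ATA 3+: trust the 0 == 256 encoding */
}
```

The weak point of any such cutoff is a drive that reports a new enough version but still mishandles a zero count.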

Alan

2006-09-11 15:08:00

by Jens Axboe

Subject: Re: What's in libata-dev.git

On Mon, Sep 11 2006, Alan Cox wrote:
> Ar Llu, 2006-09-11 am 17:47 +0400, ysgrifennodd Sergei Shtylyov:
> > It's not likely I'll be able to try it. But I'm absolutely sure that drive
> > aborted the read commands with the sector count of 0 (i.e. 256 actually). The
> > exact model was IBM DHEA-34331.
>
> Several people reported this problem when we tried 256 years ago in
> drivers/ide. You might want to do 256 for SATA Jeff but please don't do
> 256 for PATA. Reading specs is too hard for some people ;)
>
> Some drives abort the xfer, some just choked.

Ehm it's 256 now and it has been for a looong time. The few cases I've
seen where people claimed it broke, turned out to be something else.
I still haven't seen a valid report on this.

It might sound obscure that 0 means 256 sectors, but it's really not a
hidden obscure fact - people do know. I'm all for being conservative
where it matters, but I'm siding with Jeff on this one. I suspect that
Windows uses 256 as well, which basically means that we're in the clear.

--
Jens Axboe

2006-09-11 15:19:04

by Sergei Shtylyov

Subject: Re: What's in libata-dev.git

Hello.

Alan Cox wrote:

>>>drivers/ide. You might want to do 256 for SATA Jeff but please don't do
>>>256 for PATA. Reading specs is too hard for some people ;)

>>>Some drives abort the xfer, some just choked.

>>Where in drivers/ide is it limited to 255?

> Being a sensible sanity check it was removed, and that was a small
> mistake. Some 2.4 also has a 256 limit and it broke various transparent
> raid units, older Maxtors(1Gb or so), some IBM drives etc. Got fixed in
> -ac but never in base.

> The failure pattern is pretty ugly too, your box runs and runs and
> eventually you get a linear 256 sector I/O and it all blows up,
> sometimes. The IBM's abort the xfer but the maxtors may or may not get
> it right (its as if half the firmware has the right test).

So, this seems to have a long history... :-)
I've also heard several years ago of the drives not getting anything over
128 sectors right, but those should be really brain-damaged...

> We could perhaps do it by ATA version - 255 for ATA < 3 256 for ATA 3+,

Wouldn't work, I'm afraid. That IBM drive is UltraATA/33, so no less than
ATA-4...
Well, after having referred to the ID data read from it, it's ATA-3 actually.

> lots for LBA48 ? Thats assuming you can show 256 sectors is faster than
> 255. I'd bet for normal I/O its unmeasurably small.

> Alan

WBR, Sergei

2006-09-11 15:24:26

by Jeff Garzik

Subject: Re: What's in libata-dev.git

Linus Torvalds wrote:
> It's not even like you'd get magically higher performance by using 256
> sectors, so there's simply no win from living dangerously. Only losses.

It's easy enough to change. Does this mean you want drivers/ide changed
too? It's apparently been living dangerously for years and years.

Jeff


2006-09-11 15:38:50

by Jens Axboe

Subject: Re: What's in libata-dev.git

On Mon, Sep 11 2006, Alan Cox wrote:
> Ar Llu, 2006-09-11 am 10:44 -0400, ysgrifennodd Jeff Garzik:
> > > drivers/ide. You might want to do 256 for SATA Jeff but please don't do
> > > 256 for PATA. Reading specs is too hard for some people ;)
> > >
> > > Some drives abort the xfer, some just choked.
> >
> > Where in drivers/ide is it limited to 255?
>
> Being a sensible sanity check it was removed, and that was a small
> mistake. Some 2.4 also has a 256 limit and it broke various transparent
> raid units, older Maxtors(1Gb or so), some IBM drives etc. Got fixed in
> -ac but never in base.
>
> The failure pattern is pretty ugly too, your box runs and runs and
> eventually you get a linear 256 sector I/O and it all blows up,
> sometimes. The IBM's abort the xfer but the maxtors may or may not get
> it right (its as if half the firmware has the right test).

So this is a confirmed, broken case? Why has no one complained for 2.4
and 2.6?

> We could perhaps do it by ATA version - 255 for ATA < 3 256 for ATA 3+,

Might be sane, yep.

> lots for LBA48 ? Thats assuming you can show 256 sectors is faster than
> 255. I'd bet for normal I/O its unmeasurably small.

255 isn't faster than 256, measurably. But the alignment for "natural"
transfer sizes is much nicer with 256, that's the problem. You really
don't want 248 + 8 going down all the time, for instance. Perhaps it's
not a real problem, but it could be.
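
The "248 + 8" case can be reproduced with a small model. The sketch below assumes 4 KiB pages, 512-byte sectors, and that a request is only ever split on a page boundary; the names are made up for the example and do not correspond to block-layer functions.

```c
#include <assert.h>

#define SECTORS_PER_PAGE 8u	/* 4 KiB page / 512-byte sector (assumed) */

/* Size of the next command when a request of nsect sectors may only
 * be split on page boundaries and no command may exceed max_sectors. */
unsigned int next_chunk(unsigned int nsect, unsigned int max_sectors)
{
	unsigned int cap;

	assert(max_sectors >= SECTORS_PER_PAGE);
	cap = max_sectors - (max_sectors % SECTORS_PER_PAGE);
	return nsect < cap ? nsect : cap;
}

/* Number of commands needed to transfer nsect sectors. */
unsigned int count_commands(unsigned int nsect, unsigned int max_sectors)
{
	unsigned int n = 0;

	while (nsect) {
		nsect -= next_chunk(nsect, max_sectors);
		n++;
	}
	return n;
}
```

Under these assumptions a 128 KiB request goes out as one command with a limit of 256, and as a 248-sector command followed by an 8-sector one with a limit of 255.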

--
Jens Axboe

2006-09-11 15:51:06

by Jeff Garzik

Subject: Re: What's in libata-dev.git

Jens Axboe wrote:
> On Mon, Sep 11 2006, Alan Cox wrote:
>> We could perhaps do it by ATA version - 255 for ATA < 3 256 for ATA 3+,
>
> Might be sane, yep.


Since we're doing this just for paranoia, and nobody can actually
produce a problem case, it's safer just to hardcode 255 for all cases,
than try to come up with a heuristic that won't be exercised for another
decade...

Most new disks are lba48 anyway. (should we use 65535 there too???)

Jeff


2006-09-11 16:04:42

by Linus Torvalds

Subject: Re: What's in libata-dev.git



On Mon, 11 Sep 2006, Jens Axboe wrote:
>
> So this is a confirmed, broken case? Why has no one complained for 2.4
> and 2.6?

Oh, I didn't even notice that we do that by default already. That's a bit
scary - I remember people having their disks trashed.

Maybe the broken disks are old enough to not be an issue any more, or
maybe something else makes it effectively impossible to trigger in
practice?

You do need to get 32 pages of contiguous IO for it to happen, and while I
don't see anything else that would limit it, maybe there is something that
does? (Some other limiter like max_phys_segments might, but that
particular one defaults to much more than 32)
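
The "32 pages" figure follows from simple arithmetic, sketched here in C under the assumption of 512-byte sectors and 4 KiB pages (helper names are illustrative only):

```c
enum {
	SKETCH_SECTOR_SIZE = 512,	/* bytes per sector (assumed) */
	SKETCH_PAGE_SIZE   = 4096	/* bytes per page (assumed) */
};

/* Size in bytes of a transfer of nsect sectors. */
unsigned int sectors_to_bytes(unsigned int nsect)
{
	return nsect * SKETCH_SECTOR_SIZE;
}

/* Number of pages touched by a page-aligned transfer, rounded up. */
unsigned int sectors_to_pages(unsigned int nsect)
{
	return (sectors_to_bytes(nsect) + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

So a full 256-sector command is 131072 bytes (128 KiB), which is exactly 32 contiguous pages.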

Of course, we do hopefully handle requests that fail a lot more
gracefully these days, so if the drive says it didn't do it, maybe we just
fix it up properly, in a way we didn't use to.. Ie we may have fixed the
thing that caused corruption just by fixing something else ;)

Linus

2006-09-11 16:04:27

by Alan

Subject: Re: What's in libata-dev.git

Ar Llu, 2006-09-11 am 17:37 +0200, ysgrifennodd Jens Axboe:
> On Mon, Sep 11 2006, Alan Cox wrote:
> > sometimes. The IBM's abort the xfer but the maxtors may or may not get
> > it right (its as if half the firmware has the right test).
>
> So this is a confirmed, broken case? Why has no one complained for 2.4
> and 2.6?

They did and proposed patches.

Alan

2006-09-11 19:53:13

by Jens Axboe

Subject: Re: What's in libata-dev.git

On Mon, Sep 11 2006, Alan Cox wrote:
> Ar Llu, 2006-09-11 am 17:37 +0200, ysgrifennodd Jens Axboe:
> > On Mon, Sep 11 2006, Alan Cox wrote:
> > > sometimes. The IBM's abort the xfer but the maxtors may or may not get
> > > it right (its as if half the firmware has the right test).
> >
> > So this is a confirmed, broken case? Why has no one complained for 2.4
> > and 2.6?
>
> They did and proposed patches.

Link?

--
Jens Axboe

2006-09-11 19:52:53

by Jens Axboe

Subject: Re: What's in libata-dev.git

On Mon, Sep 11 2006, Linus Torvalds wrote:
>
>
> On Mon, 11 Sep 2006, Jens Axboe wrote:
> >
> > So this is a confirmed, broken case? Why has no one complained for 2.4
> > and 2.6?
>
> Oh, I didn't even notice that we do that by default already. That's a bit
> scary - I remember people having their disks trashed.
>
> Maybe the broken disks are old enough to not be an issue any more, or
> maybe something else makes it effectively impossible to trigger in
> practice?

Well, as I said, I don't think we ever saw a case that was demonstrably
due to the 256 sector issue. And I really don't think it is as obscure a
fact as people seem to think it is.

> You do need to get 32 pages of contiguous IO for it to happen, and while I
> don't see anything else that would limit it, maybe there is something that
> does? (Some other limiter like max_phys_segments might, but that
> particular one defaults to much more than 32)

It should be pretty trivial to reach, the other IDE limits are basically
way beyond 128kb of contig io. People are hitting this during boot even
I bet, so...

> Of course, we do hopefully handle requests that fail a lot more
> gracefully these days, so if the drive says it didn't do it, maybe we just
> fix it up properly, in a way we didn't use to.. Ie we may have fixed the
> thing that caused corruption just by fixing something else ;)

If the firmware is really buggy in that it doesn't recognise the 0 case
as being 256, you'd see immediate transfer errors. This going by
unnoticed is highly unlikely.

--
Jens Axboe

2006-09-11 20:02:56

by Jens Axboe

Subject: Re: What's in libata-dev.git

On Mon, Sep 11 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Mon, Sep 11 2006, Alan Cox wrote:
> >>We could perhaps do it by ATA version - 255 for ATA < 3 256 for ATA 3+,
> >
> >Might be sane, yep.
>
>
> Since we're doing this just for paranoia, and nobody can actually
> produce a problem case, it's safer just to hardcode 255 for all cases,
> than try to come up with a heuristic that won't be exercised for another
> decade...

If it's a real problem, yes I agree. If it's just hand waving, then no.
The fact that 2.4 and 2.6 have been using 256 for ages really tells me
that no one has been affected by this. The SUSE bugzilla certainly
hasn't seen any entries on it either.

> Most new disks are lba48 anyway. (should we use 65535 there too???)

Heh, good question. Given that the limit is so high, we might as well
just use 65535. It's not nearly as sensitive as the lba28 case.
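
The 65535 question is the same encoding issue one size up: LBA48 commands carry a 16-bit sector count in which 0 stands for 65536. A minimal sketch, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Encode a transfer length (1..65536 sectors) into the 16-bit sector
 * count of an LBA48 command.  65536 does not fit, so it becomes 0. */
uint16_t lba48_encode_count(uint32_t nsect)
{
	assert(nsect >= 1 && nsect <= 65536);
	return (uint16_t)(nsect & 0xffff);	/* 65536 -> 0 */
}

/* Decode as a spec-conforming drive would. */
uint32_t lba48_decode_count(uint16_t reg)
{
	return reg ? reg : 65536u;
}
```

Capping at 65535 means the zero encoding is never sent, at the cost of one sector per maximum-size command.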

--
Jens Axboe

2006-09-11 20:15:06

by Jeff Garzik

Subject: Re: What's in libata-dev.git

Jens Axboe wrote:
> On Mon, Sep 11 2006, Jeff Garzik wrote:
>> Jens Axboe wrote:
>>> On Mon, Sep 11 2006, Alan Cox wrote:
>>>> We could perhaps do it by ATA version - 255 for ATA < 3 256 for ATA 3+,
>>> Might be sane, yep.
>>
>> Since we're doing this just for paranoia, and nobody can actually
>> produce a problem case, it's safer just to hardcode 255 for all cases,
>> than try to come up with a heuristic that won't be exercised for another
>> decade...
>
> If it's a real problem, yes I agree. If it's just hand waving, then no.
> The fact that 2.4 and 2.6 has been using 256 for ages really tells me
> that no one has been affected by this. The SUSE bugzilla certainly
> hasn't seen any entries on it either.
>
>> Most new disks are lba48 anyway. (should we use 65535 there too???)
>
> Heh, good question. Given that the limit is so high, we might as well
> just use 65535. It's not nearly as sensitive as the lba28 case.

Well, I _do_ think it's just hand waving, but OTOH I don't see much harm
in using 255. Contiguous 256-sector reads and writes have gotta be
pretty rare. But that's just a hand-waving guess too ;-)

Jeff



2006-09-11 20:24:44

by Jens Axboe

Subject: Re: What's in libata-dev.git

On Mon, Sep 11 2006, Jeff Garzik wrote:
> Jens Axboe wrote:
> >On Mon, Sep 11 2006, Jeff Garzik wrote:
> >>Jens Axboe wrote:
> >>>On Mon, Sep 11 2006, Alan Cox wrote:
> >>>>We could perhaps do it by ATA version - 255 for ATA < 3 256 for ATA 3+,
> >>>Might be sane, yep.
> >>
> >>Since we're doing this just for paranoia, and nobody can actually
> >>produce a problem case, it's safer just to hardcode 255 for all cases,
> >>than try to come up with a heuristic that won't be exercised for another
> >>decade...
> >
> >If it's a real problem, yes I agree. If it's just hand waving, then no.
> >The fact that 2.4 and 2.6 has been using 256 for ages really tells me
> >that no one has been affected by this. The SUSE bugzilla certainly
> >hasn't seen any entries on it either.
> >
> >>Most new disks are lba48 anyway. (should we use 65535 there too???)
> >
> >Heh, good question. Given that the limit is so high, we might as well
> >just use 65535. It's not nearly as sensitive as the lba28 case.
>
> Well, I _do_ think it's just hand waving, but OTOH I don't see much harm
> in using 255. Contiguous 256-sector reads and writes have gotta be
> pretty rare. But that's just a hand-waving guess too ;-)

Just check the default read-ahead size - it's 256 sectors. It's really
not that rare. The read-ahead case can be made a little more clever (and
it really should be), but still. I did some numbers on this when I wrote
the fcache code, and just a regular boot does generate some really big
requests. Big writes are trivial.

248 sector contig requests will in reality be just as fast as 256, I'm
more concerned about the alignment aspect of it. ATA does hit platter
speed fairly quickly. When I last measured a few months ago, for
sequential reads you are already there at 16 sectors (again rounded to
actual observed io in real life, raw it was around 6KiB). The really
nasty case is cache set to write through, there you really want every
millimeter of extra io size to get the performance up on writes.

--
Jens Axboe

2006-09-11 22:40:08

by Alan

Subject: Re: What's in libata-dev.git

Ar Llu, 2006-09-11 am 21:51 +0200, ysgrifennodd Jens Axboe:
> Well, as I said, I don't think we ever saw a case that was demonstrably
> due to the 256 sector issue. And I really don't think it is as obscure a
> fact that people seem to think it is.

One of the ones I've got saved here is this thread. Paul goes on to
demonstrate that changing the 255<->256 limit makes 2.0/2.2/2.4 break or
not break.

--------

There is a potentially serious bug in ide-probe.c in which max_sectors
is set to 256 instead of 255. I am surprised that this hasn't bit anyone
else yet. Perhaps because you need a disk that is slow in comparison to
the host in order for the queue to climb up to and then hit the 256, at
which point it then falls over.


For example, with an old 700MB Maxtor on a "fast" 486, VL-bus, PIO,
hdparm -c1 -m8 -u1, I could pretty much on demand generate the following
error by multiple builds, or by the final linking of any big project:

hdc: lost interrupt
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
hdc: drive not ready for command
<user space sees binary cruft in source files, etc etc...>


(Note that nothing in the status is really an error). With the following
patch, everything works as it should & no errors even under high load.
Patch is against 2.4.3pre2.


Paul.


--- drivers/ide/ide-probe.c~ Sat Mar 17 16:50:14 2001
+++ drivers/ide/ide-probe.c Sat Mar 17 16:58:33 2001
@@ -1,5 +1,5 @@
/*
- * linux/drivers/ide/ide-probe.c Version 1.06 June 9, 2000
+ * linux/drivers/ide/ide-probe.c Version 1.07 March 18, 2001
*
* Copyright (C) 1994-1998 Linus Torvalds & authors (see below)
*/
@@ -25,6 +25,8 @@
* allowed for secondary flash card to be detectable
* with new flag : drive->ata_flash : 1;
* Version 1.06 stream line request queue and prep for cascade project.
+ * Version 1.07 max_sect <= 255; slower disks would get behind and
+ * then fall over when they get to 256. Paul G.
*/

#undef REALLY_SLOW_IO /* most systems can safely undef this */
@@ -772,10 +774,10 @@
for (unit = 0; unit < minors; ++unit) {
*bs++ = BLOCK_SIZE;
#ifdef CONFIG_BLK_DEV_PDC4030
- *max_sect++ = ((hwif->chipset == ide_pdc4030) ? 127 : 256);
+ *max_sect++ = ((hwif->chipset == ide_pdc4030) ? 127 : 255);
#else
/* IDE can do up to 128K per request. */
- *max_sect++ = 256;
+ *max_sect++ = 255;
#endif
*max_ra++ = MAX_READAHEAD;
}


2006-09-11 22:53:55

by Greg Freemyer

Subject: Re: What's in libata-dev.git

On 9/11/06, Alan Cox wrote:
> Ar Llu, 2006-09-11 am 21:51 +0200, ysgrifennodd Jens Axboe:
> > Well, as I said, I don't think we ever saw a case that was demonstrably
> > due to the 256 sector issue. And I really don't think it is as obscure a
> > fact that people seem to think it is.
>
> One of the ones I've got saved here is this thread. Paul goes on to
> demonstrate that changing the 255<->256 limit makes 2.0/2.2/2.4 break or
> not break.
>
<snip>

The whole thread is at http://lkml.org/lkml/2001/3/18/29

Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century

2006-09-12 05:24:30

by Jens Axboe

Subject: Re: What's in libata-dev.git

On Tue, Sep 12 2006, Alan Cox wrote:
> Ar Llu, 2006-09-11 am 21:51 +0200, ysgrifennodd Jens Axboe:
> > Well, as I said, I don't think we ever saw a case that was demonstrably
> > due to the 256 sector issue. And I really don't think it is as obscure a
> > fact that people seem to think it is.
>
> One of the ones I've got saved here is this thread. Paul goes on to
> demonstrate that changing the 255<->256 limit makes 2.0/2.2/2.4 break or
> not break.

I remember Paul's mails, and I'm pretty sure that the 256 sectors wasn't
the issue. This is one of the only cases I remember being reported to
lkml, unfortunately I cannot seem to locate the 2nd part of that
thread...

--
Jens Axboe

2006-09-12 08:46:21

by Helge Hafting

Subject: Re: What's in libata-dev.git

Linus Torvalds wrote:
> On Mon, 11 Sep 2006, Jeff Garzik wrote:
>
>
>> Sergei Shtylyov wrote:
>>
>>> It's not likely I'll be able to try it. But I'm absolutely sure that
>>> drive aborted the read commands with the sector count of 0 (i.e. 256
>>> actually). The exact model was IBM DHEA-34331.
>>> 255 sectors actually seems more safe bet.
>>>
>> This sort of thing should be handled by quirks, depending on the controller
>> and drive.
>>
>
> Please don't play games with people's data-safety.
>
> It is absolutely INCORRECT to think that "things should work as
> documented, let's fix it up with quirks".
>
How about a simple and harmless test?
When an IDE disk is accessed for the first time, perhaps when
the partition table is read - issue a 256-sector read and see
what happens. If it works - fine. If not, tag the thing as
supporting max 255 sectors.

No wrecking of file systems, and full performance for
the vast majority.
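
The proposed probe could be sketched in plain C roughly as follows. This is a user-space illustration only, not libata code: `probe_max_sectors`, `read_fn_t`, the two constants and the two fake drives are all hypothetical names invented for the example, and the callback stands in for whatever would actually issue the read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits for an LBA28 device (illustrative names only). */
#define MAX_SECTORS_LBA28      256u
#define MAX_SECTORS_LBA28_SAFE 255u

/*
 * Hypothetical callback: issue a read of `nsect` sectors and report
 * whether the device completed it without error.
 */
typedef bool (*read_fn_t)(void *dev, unsigned int nsect);

/*
 * Try one 256-sector read; if the device rejects it, fall back to 255.
 * The probe only reads, so it writes nothing to the disk.
 */
unsigned int probe_max_sectors(void *dev, read_fn_t do_read)
{
    if (do_read(dev, MAX_SECTORS_LBA28))
        return MAX_SECTORS_LBA28;
    return MAX_SECTORS_LBA28_SAFE;
}

/* Fake drive that accepts any transfer length. */
bool fake_good_drive(void *dev, unsigned int nsect)
{
    (void)dev;
    (void)nsect;
    return true;
}

/* Fake drive that aborts 256-sector reads, as described for the DHEA-34331. */
bool fake_picky_drive(void *dev, unsigned int nsect)
{
    (void)dev;
    return nsect < 256u;
}
```

With these stubs, `probe_max_sectors(NULL, fake_good_drive)` yields 256 and `probe_max_sectors(NULL, fake_picky_drive)` yields 255. The sketch assumes the drive fails cleanly; it says nothing about a drive that hangs or misbehaves silently.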

Helge Hafting

2006-09-13 01:51:29

by Tejun Heo

Subject: Re: What's in libata-dev.git

Helge Hafting wrote:
> How about a simple and harmless test?
> When an IDE disk is accessed for the first time, perhaps when
> the partition table is read - issue a 256-sector read and see
> what happens. If it works - fine. If not, tag the thing as
> supporting max 255 sectors.
>
> No wrecking of file systems, and full performance for
> the vast majority.

Before implementing anything like that, we need a test case. We don't
know how a faulty drive reacts in such cases. If it actively aborts the
command, we can reduce the limit to 255 sectors after the upper layer issues
such a command; no need to do it earlier. If it times out, we can't do it
during boot and it will suck later too. If it silently corrupts data
(highly unlikely), we need to detect the condition during boot.

I don't think it matters all that much anyway. IDE has been running w/
256 sectors for a loooong time and someone who seeks performance from
LBA28 only drive has bigger problems (also I don't think 255 would be
noticeably slower than 256).
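
The "actively aborts" case above could be handled lazily along these lines. This is a minimal sketch under stated assumptions, not the libata error handler: `cmd_result`, `dev_limits` and `handle_result` are hypothetical names, and the timeout and silent-corruption cases are deliberately left unhandled because, as noted, they cannot be fixed up this way.

```c
#include <stdbool.h>

/* Hypothetical completion status of one read/write command. */
enum cmd_result { CMD_OK, CMD_ABORTED, CMD_TIMEOUT };

/* Hypothetical per-device limit, starting at 256 for LBA28. */
struct dev_limits {
    unsigned int max_sectors;
};

/*
 * Called after a command completes. If a 256-sector command was aborted
 * while the limit was still 256, drop the limit to 255 and ask the caller
 * to retry. Returns true when a retry with the new limit makes sense.
 */
bool handle_result(struct dev_limits *lim, unsigned int nsect,
                   enum cmd_result res)
{
    if (res == CMD_ABORTED && nsect == 256u && lim->max_sectors == 256u) {
        lim->max_sectors = 255u;
        return true;
    }
    /* Anything else (including timeouts) is left to normal error handling. */
    return false;
}
```

The limit is only lowered once, and only for the exact request size under suspicion, so an unrelated abort on a smaller transfer does not change the device's limit.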

--
tejun

2006-10-04 17:57:26

by Mark Lord

Subject: Re: What's in libata-dev.git

Sergei Shtylyov wrote:
> ..
>> Jeff Garzik:
> [...]
>> [ATA] Increase lba48 max-sectors from 200 to 256.
>
> So was it for LBA28 or for LBA48?
> As for LBA28, it might be quite dangerous. Particularly, I know that
> IBM drives used to mistreat 256 as 0 in the past (bumped into that on
> an 8-year-old drive which is still alive though).
..
>The exact model was IBM DHEA-34331.

I've been travelling for the past month, so pardon the late tuning in here.
I've *never* encountered a drive that had this problem.
Controllers, yes, and those are easily dealt with in the chipset drivers.

But never drives. Not since 1992 when I first took up Linux IDE stuff.

I have some 7-year old IBM drives here, and they certainly don't have
this problem either (but they do have working TCQ etc..).

I suspect Sergei simply had a bad controller card at the time.

Cheers

2006-10-04 18:03:49

by Sergei Shtylyov

Subject: Re: What's in libata-dev.git

Hello.

Mark Lord wrote:

>>> [ATA] Increase lba48 max-sectors from 200 to 256.

>> So was it for LBA28 or for LBA48?
>> As for LBA28, it might be quite dangerous. Particularly, I know
>> that IBM drives used to mistreat 256 as 0 in the past (bumped into
>> that on an 8-year-old drive which is still alive though).

> ..

>> The exact model was IBM DHEA-34331.

> I've been travelling for the past month, so pardon the late tuning in here.
> I've *never* encountered a drive that had this problem.
> Controllers, yes, and those are easily dealt with in the chipset drivers.
>
> But never drives. Not since 1992 when I first took up Linux IDE stuff.
>
> I have some 7-year old IBM drives here, and they certainly don't have
> this problem either (but they do have working TCQ etc..).

That was an 8-year-old Ultra33 drive, what TCQ? :-)

> I suspect Sergei simply had a bad controller card at the time.

I can hardly imagine the reason why a PCI IDE controller (that was
something like VT82C586 I think) would need to mess with the sector count reg.
in PIO mode and return "command aborted" in the error reg... That was the
exact symptom IIRC.
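
For reference, the sector count register in question is 8 bits wide for LBA28 read/write commands, and a value of 0 is defined to mean 256 sectors. A small sketch of that encoding follows; the function names are made up for illustration and are not taken from any kernel source.

```c
#include <stdint.h>

/*
 * Encode a transfer length of 1..256 sectors into the 8-bit count register.
 * 256 does not fit in 8 bits, so it wraps to 0, which stands for 256.
 */
uint8_t lba28_encode_count(unsigned int nsect)
{
    return (uint8_t)(nsect & 0xffu);
}

/* Decode the register the way a correctly behaving device should. */
unsigned int lba28_decode_count(uint8_t reg)
{
    return reg == 0 ? 256u : (unsigned int)reg;
}
```

A device (or a snooping controller) that reads the register value 0 literally as "zero sectors" would be expected to reject the command, which matches the "command aborted" behaviour described above.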

> Cheers

WBR, Sergei

2006-10-04 18:48:27

by Mark Lord

Subject: Re: What's in libata-dev.git

Sergei Shtylyov wrote:
>..
>> I suspect Sergei simply had a bad controller card at the time.
>
> I can hardly imagine the reason why a PCI IDE controller (that was
> something like VT82C586 I think) would need to mess with the sector
> count reg. in PIO mode and return "command aborted" in the error reg...
> That was the exact symptom IIRC.

Ahh.. well, if it just returned command aborted, then Jeff's original
change would present no real danger --> any occurrences would be detected.

But to answer the imaginative question, the *reason* why a PCI (or VLB) IDE
controller would mess with the registers, is because the makers have this
nasty habit of wanting to do data prefetching (and posting) to speed up
transfers, particularly PIO transfers. And the only way they can do the
prefetching/posting "safely", is to snoop the taskfile registers and have
the controller "know" their meanings.

This has led to all kinds of lunacies, like the RZ1000, CMD640, and other
memorable disasters of mis-implementation.

Cheers!