There is a potentially serious bug in ide-probe.c in which max_sectors
is set to 256 instead of 255. I am surprised that this hasn't bit anyone
else yet. Perhaps because you need a disk that is slow in comparison to
the host in order for the queue to climb up to and then hit the 256, at
which point it then falls over.
For example, with an old 700MB Maxtor on a "fast" 486, VL-bus, PIO,
hdparm -c1 -m8 -u1, I could pretty much on demand generate the following
error by multiple builds, or by the final linking of any big project:
hdc: lost interrupt
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
hdc: drive not ready for command
<user space sees binary cruft in source files, etc etc...>
(Note that nothing in the status is really an error). With the following
patch, everything works as it should & no errors even under high load.
Patch is against 2.4.3pre2.
Paul.
--- drivers/ide/ide-probe.c~ Sat Mar 17 16:50:14 2001
+++ drivers/ide/ide-probe.c Sat Mar 17 16:58:33 2001
@@ -1,5 +1,5 @@
/*
- * linux/drivers/ide/ide-probe.c Version 1.06 June 9, 2000
+ * linux/drivers/ide/ide-probe.c Version 1.07 March 18, 2001
*
* Copyright (C) 1994-1998 Linus Torvalds & authors (see below)
*/
@@ -25,6 +25,8 @@
* allowed for secondary flash card to be detectable
* with new flag : drive->ata_flash : 1;
* Version 1.06 stream line request queue and prep for cascade project.
+ * Version 1.07 max_sect <= 255; slower disks would get behind and
+ * then fall over when they get to 256. Paul G.
*/
#undef REALLY_SLOW_IO /* most systems can safely undef this */
@@ -772,10 +774,10 @@
for (unit = 0; unit < minors; ++unit) {
*bs++ = BLOCK_SIZE;
#ifdef CONFIG_BLK_DEV_PDC4030
- *max_sect++ = ((hwif->chipset == ide_pdc4030) ? 127 : 256);
+ *max_sect++ = ((hwif->chipset == ide_pdc4030) ? 127 : 255);
#else
/* IDE can do up to 128K per request. */
- *max_sect++ = 256;
+ *max_sect++ = 255;
#endif
*max_ra++ = MAX_READAHEAD;
}
On Sun, Mar 18 2001, Paul Gortmaker wrote:
> There is a potentially serious bug in ide-probe.c in which max_sectors
> is set to 256 instead of 255. I am surprised that this hasn't bit anyone
> else yet. Perhaps because you need a disk that is slow in comparison to
> the host in order for the queue to climb up to and then hit the 256, at
> which point it then falls over.
You don't need a slow disk, it's trivial to provoke 256 sector sized
request on even the fastest disk available. People hit it all the time,
just with working drives...
> For example, with an old 700MB Maxtor on a "fast" 486, VL-bus, PIO,
> hdparm -c1 -m8 -u1, I could pretty much on demand generate the following
> error by multiple builds, or by the final linking of any big project:
The 256 is _not_ a bug in the driver, it's more likely a bug in your
drive. 256 is a perfectly legal transfer size. That said, maybe it is
a good idea to leave it at 255 just for safety on drives not handling
0 sectors == 128kB transfer.
--
Jens Axboe
On Sun, 18 Mar 2001, Jens Axboe wrote:
>
> The 256 is _not_ a bug in the driver, it's more likely a bug in your
> drive. 256 is a perfectly legal transfer size. That said, maybe it is
> a good idea to leave it at 255 just for safety on drives not handling
> 0 sectors == 128kB transfer.
Agreed. That would be a trivially easy bug in the firmware, limiting to
255 sectors seems safer.
Linus
On Sun, 18 Mar 2001, Linus Torvalds wrote:
>
>
> On Sun, 18 Mar 2001, Jens Axboe wrote:
> >
> > The 256 is _not_ a bug in the driver, it's more likely a bug in your
> > drive. 256 is a perfectly legal transfer size. That said, maybe it is
> > a good idea to leave it at 255 just for safety on drives not handling
> > 0 sectors == 128kB transfer.
>
> Agreed. That would be a trivially easy bug in the firmware, limiting to
> 255 sectors seems safer.
Guys which side of the counter is the decrementer?
Telling the drive to transfer 256 sectors in PIO is filling the
sector_count register with '0' == 'zero'.
As long as 255 == 255 and 0 == 256 for total sectors to transfer all is
cool.
Andre Hedrick
Linux ATA Development
On Sun, 18 Mar 2001, Jens Axboe wrote:
>
> The 256 is _not_ a bug in the driver, it's more likely a bug in your
> drive. 256 is a perfectly legal transfer size. That said, maybe it is
> a good idea to leave it at 255 just for safety on drives not handling
> 0 sectors == 128kB transfer.
Agreed. That would be a trivially easy bug in the firmware, limiting to
255 sectors seems safer.
Linus
Yes, possibly.
I checked old standards, and see that "0 means 256 as a sector count"
is already in ATA-1.
Is there any evidence that other people have been hit by this?
Unfortunately, the
"status error: status=0x58 { DriveReady SeekComplete DataRequest }"
is reported frequently these days, and has many causes.
In old reports it is rare. (E.g. none in lk for 1997.)
Paul: is there only one disk that you can make fail this way?
Andries
On Mon, 19 Mar 2001 [email protected] wrote:
>
> Agreed. That would be a trivially easy bug in the firmware, limiting to
> 255 sectors seems safer.
>
> Linus
>
> Yes, possibly.
> I checked old standards, and see that "0 means 256 as a sector count"
> is already in ATA-1.
Yes. But we could have some silly bug in the Linux drivers too, if some
part of the driver reads back the sector count and doesn't do the 0==256
conversion. So let's not blame the disk quite yet. Although it would be
interesting to hear if the problem only happens for a specific disk or
manufacturer...
Linus
On Sun, 18 Mar 2001, Linus Torvalds wrote:
>
>
> On Mon, 19 Mar 2001 [email protected] wrote:
> >
> > Agreed. That would be a trivially easy bug in the firmware, limiting to
> > 255 sectors seems safer.
> >
> > Linus
> >
> > Yes, possibly.
> > I checked old standards, and see that "0 means 256 as a sector count"
> > is already in ATA-1.
>
> Yes. But we could have some silly bug in the Linux drivers too, if some
> part of the driver reads back the sector count and doesn't do the 0==256
> conversion. So let's not blame the disk quite yet. Although it would be
> interesting to hear if the problem only happens for a specific disk or
> manufacturer...
LT,
This is why I want to standardize the data-phase rules, but we have agreed
to postpone for 2.5. Since the glue for the main loops is spread like hot
butter, it covers up a lot of issues and threads get messy. I had all but
given up on chasing them down and then resolved to start from scratch.
Andre Hedrick
Linux ATA Development
Jens Axboe wrote:
> You don't need a slow disk, it's trivial to provoke 256 sector sized
> request on even the fastest disk available. People hit it all the time,
> just with working drives...
Here is an update on the 255 vs 256 IDE issue. As Jens said, if it
screws up on every 256, then I shouldn't have to try at all to trip
it up - a simple dd should do it (but it doesn't). So I thought I'd
test some more & collect up the details before posting again.
Offending disk:
--------------
Model=Maxtor 7850 AV, FwRev=OA7X5B57, SerialNo=********
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>5Mbs FmtGapReq }
RawCHS=1654/16/63, TrkSize=0, SectSize=0, ECCbytes=11
BuffType=3(DualPortCache), BuffSize=64kB, MaxMultSect=8, MultSect=8
DblWordIO=yes, maxPIO=2(fast), DMA=yes, maxDMA=1(medium)
CurCHS=1654/16/63, CurSects=1667232, LBA=yes, LBAsects=1667232
tDMA={min:150,rec:150}, DMA modes: sword0 sword1 *sword2 *mword0
IORDY=on/off, tPIO={min:180,w/IORDY:150}, PIO modes: mode3
System Details:
--------------
Installed as primary (and only device) on ide1, which is a VL-bus
PIO card on 40MHz bus, booting with idebus=40, Am5x86-WT (i.e. 486)
in a Gigabyte SiS '471 chipset board. Disk tuned with "-c1 -m8 -u1".
Interface ide0 on same card has 2Gig Maxtor on it, which does not
appear to have problems (easier to show a failure happens than prove a
failure will never happen...) ide0 is not in use during failures, as
root fs is on SCSI.
Failure Cases:
--------------
My "test" at the moment consists of:
nice -n20 make bootstrap > make.log 2>&1 & tail -f make.log
while also doing a:
while [ 1 ]; do gunzip -vt some-7MB-o-cruft.tar.gz ; sleep 60 ; done
The GCC src & the tarball are on problem disk. The sleep ensures the
gcc build has pushed the tarball out of the 24MB of RAM. This then gives:
hdc: lost interrupt
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
in less than an hour (it varies of course). The one case where I happened
to "see" it fail, the drive LED stayed on (drive quiet, no seeks) until
the timer fired off and printed the message(s). Some times (3 out of 13
failures so far) I have seen the more suspect:
hdc: irq timeout: status=0x58 { DriveReady SeekComplete DataRequest }
ide1: unexpected interrupt, status=0x58, count=1
hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
hdc: drive not ready for command
all with the same timestamp in the syslog. (IRQ timeout followed by
unexpected IRQ ... ??? Hrrm.)
Most times gzip will get invalid data when a failure happens, but on
occasion random cruft has appeared in the GCC source that stops the
build. Forcing a re-read from the disk will return the correct data
in both cases. I haven't seen corrupt writes (at least none I've
detected yet...)
Compiler/Kernel:
----------------
I went back and double checked (2.4.3-pre4) breaks on 256 but works
reliably on 255. I have confirmed this for both gcc-2.95.2 and
gcc-2.7.2 (with appropriate workarounds for old gcc bugs) so it
shouldn't be a compiler issue.
More interesting is that I took a bone stock 2.2.18, which still uses
MAX_SECTORS (==128) in ide-probe.c and changed that to 256. This
caused 2.2.18 to fail in exactly the same way under the same test.
The 2.2.18 kernel without this change will complete the whole gcc
build without fail (on a 486-24MB, with gzip eating CPU, this takes
a while...)
Conclusions:
------------
I don't see this problem on any other machines, which includes some
pretty lame hardware (even an early 40MB *stepper motor* based WD IDE
seems to behave OK on 256).
So the 255 (or even the old 128) fixes things vs. 256, but I'd feel
better being 100% sure why. (i.e. is 255 a "fix" or a perturbation that
happens to paper over something else). If time permits, I will try
swapping hda with hdc - that will get the problem disk off of the
possibly noisy IRQ15 - and see what happens then. If it truly is a
one-off buggy disk, then maybe the fix is a large hammer, not a patch.
Ok, I think that is about as detailed as I can get. Ask if I left
something out. Commentary welcome, even an <aol> Me too! </aol>.
Paul.
Hi!
> hdc: irq timeout: status=0x58 { DriveReady SeekComplete DataRequest }
> ide1: unexpected interrupt, status=0x58, count=1
> hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> hdc: drive not ready for command
>
> all with the same timestamp in the syslog. (IRQ timeout followed by
> unexpected IRQ ... ??? Hrrm.)
Not *so* strange.
I'm waiting for interrupt. It does not come, so I start interrupt routine,
anyway.It finds something strange in registers, and says "unexpected interrupt".
--
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.