2000-11-21 08:10:24

by Hakan Lennestal

Subject: 2.4.0, test10, test11: HPT366 problem


Hi !

I'm having problems when booting 2.4.0 test10 and test11 kernels
(perhaps some earlier kernels too).
Approximately nine out of ten times the kernel hangs when
trying to detect partitions on the first HPT366 disk.

It looks something like this:

Nov 21 08:08:40 t kernel: Uniform Multi-Platform E-IDE driver Revision: 6.31
Nov 21 08:08:40 t kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Nov 21 08:08:40 t kernel: PIIX4: IDE controller on PCI bus 00 dev 39
Nov 21 08:08:40 t kernel: PIIX4: chipset revision 1
Nov 21 08:08:40 t kernel: PIIX4: not 100%% native mode: will probe irqs later
Nov 21 08:08:40 t kernel: ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
Nov 21 08:08:40 t kernel: ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
Nov 21 08:08:40 t kernel: HPT366: IDE controller on PCI bus 00 dev 48
Nov 21 08:08:40 t kernel: HPT366: chipset revision 1
Nov 21 08:08:40 t kernel: HPT366: not 100%% native mode: will probe irqs later
Nov 21 08:08:40 t kernel: ide2: BM-DMA at 0xac00-0xac07, BIOS settings: hde:DMA, hdf:pio
Nov 21 08:08:40 t kernel: HPT366: IDE controller on PCI bus 00 dev 49
Nov 21 08:08:40 t kernel: HPT366: chipset revision 1
Nov 21 08:08:40 t kernel: HPT366: not 100%% native mode: will probe irqs later
Nov 21 08:08:40 t kernel: ide3: BM-DMA at 0xb800-0xb807, BIOS settings: hdg:DMA, hdh:pio
Nov 21 08:08:40 t kernel: hda: FUJITSU MPD3064AT, ATA DISK drive
Nov 21 08:08:40 t kernel: hdc: Hewlett-Packard CD-Writer Plus 8200, ATAPI CDROM drive
Nov 21 08:08:40 t kernel: hdd: IOMEGA ZIP 100 ATAPI, ATAPI FLOPPY drive
Nov 21 08:08:40 t kernel: hde: IBM-DTLA-307030, ATA DISK drive
Nov 21 08:08:40 t kernel: hdg: QUANTUM Bigfoot TX12.0AT, ATA DISK drive
Nov 21 08:08:40 t kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Nov 21 08:08:40 t kernel: ide1 at 0x170-0x177,0x376 on irq 15
Nov 21 08:08:40 t kernel: ide2 at 0xa400-0xa407,0xa802 on irq 9
Nov 21 08:08:40 t kernel: ide3 at 0xb000-0xb007,0xb402 on irq 9
Nov 21 08:08:40 t kernel: hda: 12672450 sectors (6488 MB) w/512KiB Cache, CHS=788/255/63, UDMA(33)
Nov 21 08:08:40 t kernel: hde: 60036480 sectors (30739 MB) w/1916KiB Cache, CHS=59560/16/63, UDMA(66)
Nov 21 08:08:40 t kernel: hdg: 23547888 sectors (12057 MB) w/69KiB Cache, CHS=23361/16/63, UDMA(33)
Nov 21 08:08:40 t kernel: Partition check:
Nov 21 08:08:40 t kernel: hda: hda1 hda2 < hda5 hda6 hda7 >
Nov 21 08:08:40 t kernel: hde: hde1 hde2 < hde5

And then after a while it gets a DMA timeout and hangs hard.

The hang can occur anywhere during the partition detection and it can for
instance also fail at once and look like:

hde:

or fail even after the last partition:

hde: hde1 hde2 < hde5 hde6 hde7 hde8

Approximately one out of ten reboots, the detection succeeds and I'm able
to boot up the kernel and then everything works smoothly.

There are no problems when booting 2.2.*-kernels with the HPT366-patch.

Regards.

/Håkan


---------------------------------------
e-mail: [email protected] |
or [email protected] |
---------------------------------------


2000-11-21 10:00:27

by Andre Hedrick

Subject: Re: 2.4.0, test10, test11: HPT366 problem


2.2.x and 2.4.0-xxx do not share the same interrupt pin hack.

This is in 2.2.x patches.

printk("%s: onboard version of chipset, pin1=%d pin2=%d\n", d->name, pin1, pin2);
#if 1
/* I forgot why I did this once, but it fixed something. */
pci_write_config_byte(dev2, PCI_INTERRUPT_PIN, dev->irq);
printk("PCI: %s: Fixing interrupt %d pin %d to ZERO \n", d->name, dev2->irq, pin2);
pci_write_config_byte(dev2, PCI_INTERRUPT_LINE, 0);
#endif

It does the undocumented "mode 3"; I had to explain to HighPoint that
ABIT violated the OEM docs.

It is a PCI add-on style chipset deployed like a legacy chipset.

Primary channel is set to IRQ X and PIN A
Secondary channel is set to IRQ X++ and PIN B

This is not allowed by the guidelines, but if you do the big nasty above,
it will fix it 99% of the time.
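
To make the effect of the two config writes in the stub concrete, here is a small userland toy model. It is not kernel code: `struct toy_pci_fn` and the `toy_*` helpers are invented for illustration, and only the two register offsets (0x3c and 0x3d) are the standard PCI ones. It simply mirrors what the quoted stub writes and says nothing about why the hardware responds to it.

```c
#include <stdint.h>
#include <string.h>

/* Standard PCI configuration-space offsets. */
#define PCI_INTERRUPT_LINE 0x3c
#define PCI_INTERRUPT_PIN  0x3d

/* Hypothetical stand-in for one PCI function of the HPT366. */
struct toy_pci_fn {
	uint8_t cfg[256];	/* fake configuration space */
	uint8_t irq;		/* IRQ assigned to this function */
};

/* Set up a fake function with a given IRQ line and interrupt pin. */
static void toy_init(struct toy_pci_fn *fn, uint8_t irq, uint8_t pin)
{
	memset(fn->cfg, 0, sizeof(fn->cfg));
	fn->irq = irq;
	fn->cfg[PCI_INTERRUPT_LINE] = irq;
	fn->cfg[PCI_INTERRUPT_PIN] = pin;
}

/* Toy equivalent of a one-byte config-space write. */
static void toy_write_config_byte(struct toy_pci_fn *fn, int where, uint8_t val)
{
	fn->cfg[where] = val;
}

/*
 * Mirrors the two writes in the 2.2.x stub: the second function's pin
 * register receives the first function's IRQ, and its line register
 * is set to zero.
 */
static void toy_apply_pin_hack(const struct toy_pci_fn *dev,
			       struct toy_pci_fn *dev2)
{
	toy_write_config_byte(dev2, PCI_INTERRUPT_PIN, dev->irq);
	toy_write_config_byte(dev2, PCI_INTERRUPT_LINE, 0);
}
```

With both functions on IRQ 9 (as in the log above) and pins A and B, calling `toy_apply_pin_hack(&primary, &secondary)` leaves the primary untouched and rewrites only the secondary's two registers.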

Add the above stub to ide-pci.c near or at line 756 to look like 2.2, then
retry and see if it fixes it. Then you bitch at Linus, not me, because it
is a functional kludge, but a "kludge".

Cheers,

On Tue, 21 Nov 2000, Hakan Lennestal wrote:

>
> Hi !
>
> I'm having problems when booting 2.4.0 test10 and test11 kernels
> (perhaps some earlier kernels too).
> Approximately nine out of ten times the kernel hangs when
> trying to detect partitions on the first HPT366 disk.
>
> It looks something like this:
>
> Nov 21 08:08:40 t kernel: Uniform Multi-Platform E-IDE driver Revision: 6.31
> Nov 21 08:08:40 t kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> Nov 21 08:08:40 t kernel: PIIX4: IDE controller on PCI bus 00 dev 39
> Nov 21 08:08:40 t kernel: PIIX4: chipset revision 1
> Nov 21 08:08:40 t kernel: PIIX4: not 100%% native mode: will probe irqs later
> Nov 21 08:08:40 t kernel: ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
> Nov 21 08:08:40 t kernel: ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
> Nov 21 08:08:40 t kernel: HPT366: IDE controller on PCI bus 00 dev 48
> Nov 21 08:08:40 t kernel: HPT366: chipset revision 1
> Nov 21 08:08:40 t kernel: HPT366: not 100%% native mode: will probe irqs later
> Nov 21 08:08:40 t kernel: ide2: BM-DMA at 0xac00-0xac07, BIOS settings: hde:DMA, hdf:pio
> Nov 21 08:08:40 t kernel: HPT366: IDE controller on PCI bus 00 dev 49
> Nov 21 08:08:40 t kernel: HPT366: chipset revision 1
> Nov 21 08:08:40 t kernel: HPT366: not 100%% native mode: will probe irqs later
> Nov 21 08:08:40 t kernel: ide3: BM-DMA at 0xb800-0xb807, BIOS settings: hdg:DMA, hdh:pio
> Nov 21 08:08:40 t kernel: hda: FUJITSU MPD3064AT, ATA DISK drive
> Nov 21 08:08:40 t kernel: hdc: Hewlett-Packard CD-Writer Plus 8200, ATAPI CDROM drive
> Nov 21 08:08:40 t kernel: hdd: IOMEGA ZIP 100 ATAPI, ATAPI FLOPPY drive
> Nov 21 08:08:40 t kernel: hde: IBM-DTLA-307030, ATA DISK drive
> Nov 21 08:08:40 t kernel: hdg: QUANTUM Bigfoot TX12.0AT, ATA DISK drive
> Nov 21 08:08:40 t kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> Nov 21 08:08:40 t kernel: ide1 at 0x170-0x177,0x376 on irq 15
> Nov 21 08:08:40 t kernel: ide2 at 0xa400-0xa407,0xa802 on irq 9
> Nov 21 08:08:40 t kernel: ide3 at 0xb000-0xb007,0xb402 on irq 9
> Nov 21 08:08:40 t kernel: hda: 12672450 sectors (6488 MB) w/512KiB Cache, CHS=788/255/63, UDMA(33)
> Nov 21 08:08:40 t kernel: hde: 60036480 sectors (30739 MB) w/1916KiB Cache, CHS=59560/16/63, UDMA(66)
> Nov 21 08:08:40 t kernel: hdg: 23547888 sectors (12057 MB) w/69KiB Cache, CHS=23361/16/63, UDMA(33)
> Nov 21 08:08:40 t kernel: Partition check:
> Nov 21 08:08:40 t kernel: hda: hda1 hda2 < hda5 hda6 hda7 >
> Nov 21 08:08:40 t kernel: hde: hde1 hde2 < hde5
>
> And then after a while it gets a DMA timeout and hangs hard.
>
> The hang can occur anywhere during the partition detection and it can for
> instance also fail at once and look like:
>
> hde:
>
> or fail even after the last partition:
>
> hde: hde1 hde2 < hde5 hde6 hde7 hde8
>
> Approximately one out of ten reboots, the detection succeeds and I'm able
> to boot up the kernel and then everything works smoothly.
>
> There are no problems when booting 2.2.*-kernels with the HPT366-patch.
>
> Regards.
>
> /Håkan
>
>
> ---------------------------------------
> e-mail: [email protected] |
> or [email protected] |
> ---------------------------------------
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/
>

Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development

2000-11-21 11:05:30

by David Woodhouse

Subject: Re: 2.4.0, test10, test11: HPT366 problem


[email protected] said:
> Nov 21 08:08:40 t kernel: hde: IBM-DTLA-307030, ATA DISK drive

> Nov 21 08:08:40 t kernel: hde: hde1 hde2 < hde5

> And then after a while it gets a DMA timeout and hangs hard.

You mean this?

hde: timeout waiting for DMA
ide_dmaproc: chipset supported ide_dma_timeout func only: 14

I see identical hangs when I have a similar IBM-DTLA drive attached
anywhere on the HPT366. But I also see it hang on 2.2.17 if I try:

hdparm -t /dev/hde & hdparm -t /dev/hde & hdparm -t /dev/hde

a few times. This is even with Andre's latest patch.
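
For anyone who wants to reproduce this kind of load without hdparm, here is a rough C sketch of several concurrent sequential readers on one device. It is only an approximation: `read_some` and `parallel_read` are names made up for this example, and unlike `hdparm -t` the reads go through the page cache, so the timing behaviour is not the same.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read up to `total` bytes sequentially from `path`; 0 on success, 1 on error. */
static int read_some(const char *path, size_t total)
{
	char buf[65536];
	size_t done = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return 1;
	while (done < total) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n < 0) {
			close(fd);
			return 1;
		}
		if (n == 0)
			break;	/* end of device or file */
		done += (size_t)n;
	}
	close(fd);
	return 0;
}

/*
 * Fork `nproc` readers that run at the same time.
 * Returns how many of them failed, or -1 if not all could be started.
 */
static int parallel_read(const char *path, int nproc, size_t total)
{
	int i, failed = 0, started = 0;

	for (i = 0; i < nproc; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;
		if (pid == 0)
			_exit(read_some(path, total));
		started++;
	}
	for (i = 0; i < started; i++) {
		int status;

		if (wait(&status) < 0 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			failed++;
	}
	return started == nproc ? failed : -1;
}
```

Something like `parallel_read("/dev/hde", 3, 64UL << 20)` would roughly correspond to the three backgrounded hdparm runs above; on an affected setup the expectation is a hang rather than a clean return value.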

For now, I've just moved the IBM drive onto the PIIX4, where it's stable. My
other UDMA66 drive has been rock solid on the BP6 for months:
hda: SAMSUNG SV0432D, ATA DISK drive

Andre, un{less,til} we can work out what the problem is, can we add the IBM
drives to the HPT366 udma4 blacklist? I'll also try it in udma3 mode to see
what happens.

--
dwmw2


2000-11-21 11:23:05

by Hakan Lennestal

Subject: Re: 2.4.0, test10, test11: HPT366 problem

In message <[email protected]>, David Woodhouse writes:

> You mean this?
>
> hde: timeout waiting for DMA
> ide_dmaproc: chipset supported ide_dma_timeout func only: 14

Indeed.

> I see identical hangs when I have a similar IBM-DTLA drive attached
> anywhere on the HPT366. But I also see it hang on 2.2.17 if I try:
>
> hdparm -t /dev/hde & hdparm -t /dev/hde & hdparm -t /dev/hde

Yes, with udma4 it hangs sooner or later for me also, both under
2.2.* and 2.4.0.

Udma3 seems to be rock solid, though, as long as it manages to pass the partition
detection during boot up.

Regards.

/Håkan

2000-11-21 11:31:48

by David Woodhouse

Subject: Re: 2.4.0, test10, test11: HPT366 problem


[email protected] said:
> 2.2.x and 2.4.0-xxx, do not share the same interrupt pin hack.

> Add the above stub to ide-pci.c near or at line 756 to look like 2.2,
> then retry and see if it fixes it. Then you bitch at Linus, not me,
> because it is a functional kludge, but a "kludge".


But:

1) 2.2 with your latest patches also falls over, even with the DMA timeout
workaround enabled.

2) This happens even when the offending IBM-DTLA drive is the master
on the primary HPT366 controller, and nothing else is connected.

[email protected] said:
> Udma3 seems to be rock solid, though, as long as it manages to pass the
> partition detection during boot up.

Strange. If it sometimes fails during the partition detection, then I'd
expect it to also fail in stress testing. Can you try repeatedly doing
BLKRRPART on it {instead of,as well as} parallel 'hdparm -t'?
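
A minimal sketch of such a BLKRRPART loop, assuming a Linux system with the kernel headers installed. `reread_partitions` is a name invented here; the ioctl itself comes from `<linux/fs.h>`. It normally needs root, and it fails with EBUSY while any partition on the disk is in use.

```c
#include <fcntl.h>
#include <linux/fs.h>	/* BLKRRPART */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Ask the kernel to re-read the partition table `count` times.
 * Returns the number of failed ioctls, or -1 if the device
 * cannot be opened at all.
 */
static int reread_partitions(const char *dev, int count)
{
	int i, failures = 0;
	int fd = open(dev, O_RDONLY);

	if (fd < 0)
		return -1;
	for (i = 0; i < count; i++) {
		if (ioctl(fd, BLKRRPART) < 0)
			failures++;	/* e.g. EBUSY if a partition is mounted */
	}
	close(fd);
	return failures;
}
```

Calling `reread_partitions("/dev/hde", 100)` on an otherwise idle disk would repeat the same partition scan that the boot-time check performs, which is the point of the test suggested above.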

If it falls over at udma3, perhaps we should blacklist it all the way down
to udma2?

--
dwmw2


2000-11-21 12:45:55

by Peter Samuelson

Subject: Re: 2.4.0, test10, test11: HPT366 problem


[[email protected]]
> > Udma3 seems to be rock solid, though, as long as it manages to pass
> > the partition detection during boot up.

[David Woodhouse]
> If it falls over at udma3, perhaps we should blacklist it all the way
> down to udma2?

The way I understood Hakan was: "it boots in udma4, and if it gets all
the way to userland I immediately hdparm it down to udma3, and then it
works fine".

Hakan, is this what you meant? If so, forcing it <= udma3 should be ok.

Peter

2000-11-21 12:56:36

by Hakan Lennestal

Subject: Re: 2.4.0, test10, test11: HPT366 problem

In message <[email protected]>, Peter Samuelson writes:

> The way I understood Hakan was: "it boots in udma4, and if it gets all
> the way to userland I immediately hdparm it down to udma3, and then it
> works fine".
>
> Hakan, is this what you meant? If so, forcing it <= udma3 should be ok.

Yes.

When it comes to the partition detection during bootup, udma4 or udma3
doesn't seem to matter. It passes approx. one out of ten times either
way. When the system is up and running, using udma4 seems to hang the
system sooner or later. Udma3, on the other hand, seems to be quite stable.

/Håkan

2000-11-21 14:00:28

by David Woodhouse

Subject: Re: 2.4.0, test10, test11: HPT366 problem


[email protected] said:
> When it comes to the partition detection during bootup, udma4 or
> udma3 doesn't seem to matter. It passes approx. one out of ten times
> either way.

How have you made it use udma3 at bootup? Something like the patch below?

Index: drivers/ide/hpt366.c
===================================================================
RCS file: /inst/cvs/linux/drivers/ide/Attic/hpt366.c,v
retrieving revision 1.1.2.10
diff -u -r1.1.2.10 hpt366.c
--- drivers/ide/hpt366.c 2000/11/10 14:56:31 1.1.2.10
+++ drivers/ide/hpt366.c 2000/11/21 13:27:32
@@ -55,6 +55,8 @@
 };
 
 const char *bad_ata66_4[] = {
+	"IBM-DTLA-307045",
+	"IBM-DTLA-307030",
 	"WDC AC310200R",
 	NULL
 };
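
For readers unfamiliar with how such a list is used, here is a standalone sketch of an exact-match lookup against a NULL-terminated model list, plus a cap on the UDMA mode for listed drives. The helper names `model_in_list` and `max_udma_mode` are hypothetical and the real driver code may be organised differently; the only thing taken from the thread is that drives on this list should not run at udma4.

```c
#include <stddef.h>
#include <string.h>

/* The list as it would look with the patch applied (NULL-terminated). */
static const char *bad_ata66_4[] = {
	"IBM-DTLA-307045",
	"IBM-DTLA-307030",
	"WDC AC310200R",
	NULL
};

/* Hypothetical helper: 1 if `model` matches an entry exactly, else 0. */
static int model_in_list(const char *model, const char **list)
{
	for (; *list; list++)
		if (strcmp(*list, model) == 0)
			return 1;
	return 0;
}

/* Hypothetical policy: listed drives are capped at UDMA mode 3, others at 4. */
static int max_udma_mode(const char *model)
{
	return model_in_list(model, bad_ata66_4) ? 3 : 4;
}
```

Because the match is exact, `max_udma_mode("IBM-DTLA-307030")` gives 3, while a sibling model such as the IBM-DTLA-305020 reported later in this thread would still get 4 unless it is added to the list as well.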

--
dwmw2


2000-11-21 19:04:00

by Andre Hedrick

Subject: Re: 2.4.0, test10, test11: HPT366 problem

On Tue, 21 Nov 2000, Peter Samuelson wrote:

> The way I understood Hakan was: "it boots in udma4, and if it gets all
> the way to userland I immediately hdparm it down to udma3, and then it
> works fine".

No, if it does not hang and we get iCRC errors it will downgrade
automatically, but if it is a transfer rate issue then it must be hard coded
to force an upper threshold limit.

Cheers,

Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development

2000-11-21 19:04:40

by Andre Hedrick

Subject: Re: 2.4.0, test10, test11: HPT366 problem


Does that fix it?

On Tue, 21 Nov 2000, David Woodhouse wrote:

>
> [email protected] said:
> > When it comes to the partition detection during bootup, udma4 or
> > udma3 doesn't seem to matter. It passes approx. one out of ten times
> > either way.
>
> How have you made it use udma3 at bootup? Something like the patch below?
>
> Index: drivers/ide/hpt366.c
> ===================================================================
> RCS file: /inst/cvs/linux/drivers/ide/Attic/hpt366.c,v
> retrieving revision 1.1.2.10
> diff -u -r1.1.2.10 hpt366.c
> --- drivers/ide/hpt366.c 2000/11/10 14:56:31 1.1.2.10
> +++ drivers/ide/hpt366.c 2000/11/21 13:27:32
> @@ -55,6 +55,8 @@
> };
>
> const char *bad_ata66_4[] = {
> + "IBM-DTLA-307045",
> + "IBM-DTLA-307030",
> "WDC AC310200R",
> NULL
> };
>
> --
> dwmw2
>
>

Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development

2000-11-21 19:45:56

by David Woodhouse

Subject: Re: 2.4.0, test10, test11: HPT366 problem

On Tue, 21 Nov 2000, Andre Hedrick wrote:

> No, if it does not hang and we get iCRC errors it will downgrade
> automatically, but if it is a transfer rate issue then it must be hard coded
> to force an upper threshold limit.

Do we downgrade gracefully, or do we just drop directly to non-DMA mode?

--
dwmw2


2000-11-21 21:26:49

by David Woodhouse

Subject: Re: 2.4.0, test10, test11: HPT366 problem

On Tue, 21 Nov 2000, Andre Hedrick wrote:
>
> Does that fix it?

WorksForMe(tm)

Grrr. I specifically went and read the HPT366 blacklist before buying my
shiny new hard drive.

> On Tue, 21 Nov 2000, David Woodhouse wrote:
> > Index: drivers/ide/hpt366.c
> > ===================================================================
> > RCS file: /inst/cvs/linux/drivers/ide/Attic/hpt366.c,v
> > retrieving revision 1.1.2.10
> > diff -u -r1.1.2.10 hpt366.c
> > --- drivers/ide/hpt366.c 2000/11/10 14:56:31 1.1.2.10
> > +++ drivers/ide/hpt366.c 2000/11/21 13:27:32
> > @@ -55,6 +55,8 @@
> > };
> >
> > const char *bad_ata66_4[] = {
> > + "IBM-DTLA-307045",
> > + "IBM-DTLA-307030",
> > "WDC AC310200R",
> > NULL
> > };

--
dwmw2


2000-11-21 21:35:06

by Mohammad A. Haque

Subject: Re: 2.4.0, test10, test11: HPT366 problem

I read somewhere that hpt366 bios 1.26 will fix the problem with this
particular drive. I'll try and dig up the reference.

David Woodhouse wrote:
>
> WorksForMe(tm)
>
> Grrr. I specifically went and read the HPT366 blacklist before buying my
> shiny new hard drive.
>

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-21 22:08:22

by Hakan Lennestal

Subject: Re: 2.4.0, test10, test11: HPT366 problem

In message <[email protected]>, "Mohammad A. Haque" writes:
> I read somewhere that hpt366 bios 1.26 will fix the problem with this
> particular drive. I'll try and dig up the reference.

From the 1.26beta BIOS readme file (at http://www.highpoint-tech.com):

1.26.0 08Aug00
. Fix compatibility problem with IBM DTLA ATA-100 hard disk

/Håkan


---------------------------------------
e-mail: [email protected] |
or [email protected] |
---------------------------------------

2000-11-22 01:42:24

by Andre Hedrick

Subject: Re: 2.4.0, test10, test11: HPT366 problem

On Tue, 21 Nov 2000, David Woodhouse wrote:

> On Tue, 21 Nov 2000, Andre Hedrick wrote:
>
> > No, if it does not hang and we get iCRC errors it will downgrade
> > automatically, but if it is a transfer rate issue then it must be hard coded
> > to force an upper threshold limit.
>
> Do we downgrade gracefully, or do we just drop directly to non-DMA mode?

With Grace, and now you blessed, go in peace my son.

grep crc ./drivers/ide/*

Cheers,


Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development

2000-11-24 21:17:46

by Nathan A. Ferch

Subject: Re: 2.4.0, test10, test11: HPT366 problem

I have similar problems with an IBM-DTLA-305020 and the HPT-366 on an ABIT BP6.
I'm not sure what the BIOS version is; I'll check it once I return home.
Changing to udma3 seems to fix the problems. However, I can always pass the
partition check fine.

On Tue, Nov 21, 2000 at 01:29:51PM +0000, David Woodhouse wrote:
>
> [email protected] said:
> > When it comes to the partition detection during bootup, udma4 or
> > udma3 doesn't seem to matter. It passes approx. one out of ten times
> > either way.
>
> How have you made it use udma3 at bootup? Something like the patch below?
>
> Index: drivers/ide/hpt366.c
> ===================================================================
> RCS file: /inst/cvs/linux/drivers/ide/Attic/hpt366.c,v
> retrieving revision 1.1.2.10
> diff -u -r1.1.2.10 hpt366.c
> --- drivers/ide/hpt366.c 2000/11/10 14:56:31 1.1.2.10
> +++ drivers/ide/hpt366.c 2000/11/21 13:27:32
> @@ -55,6 +55,8 @@
> };
>
> const char *bad_ata66_4[] = {
> + "IBM-DTLA-307045",
> + "IBM-DTLA-307030",
> "WDC AC310200R",
> NULL
> };
>
> --
> dwmw2
>
>
--
nathan a ferch
[email protected]
"No! Nobody ever built them like this! The architect was either an authentic whacko or a certified genius. The whole building is like a huge antenna for pulling in and concentrating psychokinetic energy." -Stantz