2002-04-02 23:39:07

by jim

Subject: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

FYI, here's an IDE update. I'm probably going to post a note on the
kernel mailing list about this to maybe help someone else avoid a
nasty problem, or maybe someone will have an "Ah Ha!" moment.

1. Linux 2.2.19 (and .20) + the Serverworks IDE chipset on the
Supermicro P3TDLE is definitely hosed. Copying files from hda to hdc
causes both hda and hdc to be corrupted. I even took out the Promise
cards altogether to make sure there wasn't some interaction going on
there. I don't think your patches have anything to do with this
because it broke before without your patches, and also breaks with
your patches.

2. The MB IDE ports transfer data at about 18000K/sec while doing
cat /dev/hda >/dev/null and looking at vmstat.

3. The Promise card does about 31300K/sec doing the same thing.

4. If 2 Promise cards are installed and each of 4 Maxtor 5T060H6
drives get their own IDE port (ide2-5), the drives are hde, hdg, hdi,
hdk. In a 32-bit slot, cat /dev/hdx >/dev/null shows 31300K/sec. But
doing cat /dev/hde4 (a specific partition) for example gives
8400K/sec. That makes no sense to me.

5. If 1 Promise card has a drive installed as slave on the second
port, that drive will cause CRC errors on the console. I didn't try
slaving on the first port - it may be broken too. Switching drives
doesn't make this problem go away, and drives that generate errors
work perfectly as master. Setting drives to master/slave or using
cable select doesn't affect the problem.

6. If the Promise card is installed in one of the two 64-bit/66MHz
slots on the Supermicro MB, then hde (the first ide port) behaves the
same: 31300K/sec if catting /dev/hde, but only 8400K/sec if catting
/dev/hde4. HOWEVER, the master drive on the second port, hdg, yields
31300K/sec for both cat /dev/hdg and cat /dev/hdg4. I have swapped
drives around and verified this on combinations of drives and
controllers. I know it doesn't make sense, but thought I'd report it
in case there is something flakey in the ide driver. Maybe burst mode
is only getting used for this one port or something??
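
The throughput figures in items 2 through 6 come from running cat on a device and watching vmstat. A self-timed variant could be sketched as below, assuming GNU dd (bs=1M) and GNU date (%N). read_rate_kbs is a hypothetical helper name, not something from the original test setup, and its numbers are only meaningful when the data is not already in the page cache.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original test setup):
# read the first N MiB of a device or file and print the rate in KB/sec.
# Assumes GNU dd (bs=1M) and GNU date (%N for nanoseconds).
read_rate_kbs() {
    target=$1          # e.g. /dev/hde or /dev/hde4
    mib=${2:-256}      # how much to read, in MiB
    start=$(date +%s.%N)
    dd if="$target" of=/dev/null bs=1M count="$mib" 2>/dev/null || return 1
    end=$(date +%s.%N)
    # KB read divided by elapsed seconds; guard against a zero interval.
    awk -v s="$start" -v e="$end" -v m="$mib" \
        'BEGIN { d = e - s; if (d <= 0) d = 0.001; printf "%d\n", m * 1024 / d }'
}
```

Running it once as "read_rate_kbs /dev/hde 256" and once as "read_rate_kbs /dev/hde4 256" would give the whole-disk and partition figures side by side without having to read them off vmstat.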

I dunno. This is all getting way too confusing so I am going to find
a configuration that works and stop trying to make it work perfectly
and understand the ins and outs.

My current config is:
hda = MB ide port 0 master
hdg = Promise #1 port 1 master
hde = Promise #1 port 0 master
hdk = Promise #2 port 1 master

All the drives on the Promise boards are running at 31300K now, even
hde (which again, I don't understand). hda+hdg will be a RAID1 pair,
hde+hdk will be a RAID1 pair. Seems to be working - I give up
understanding it.

Appreciate all your IDE efforts, and after this recent nosebleed, I
have much more empathy about how much of a pain this all must be for
you sometimes. :)

Jim Wilcoxson
Owner, http://www.rubylane.com

(I'm not on the kernel mailing list but comments/suggestions via email
are welcome.)


2002-04-03 00:49:28

by Trent Piepho

Subject: Re: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

On Tue, 2 Apr 2002 [email protected] wrote:
> 2. The MB IDE ports transfer data at about 18000K/sec while doing
> cat /dev/hda >/dev/null and looking at vmstat.

I think the serverworks IDE is only mode4, not even UDMA33. I heard a lot of
bad things about it, and removed all the IDE drives from our serverworks
system's controller.

> hdk. In a 32-bit slot, cat /dev/hdx >/dev/null shows 31300K/sec. But
> doing cat /dev/hde4 (a specific partition) for example gives
> 8400K/sec. That makes no sense to me.

The outer cylinders of a drive are faster than the inner cylinders. Try
repartitioning the drive so that hde4 starts at cylinder 1, and see if that
changes the speed.
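
One way to test the inner/outer cylinder theory without repartitioning is to time reads at different offsets on the raw disk. A rough sketch, assuming GNU dd and GNU date; zone_rate_kbs is a made-up name, and the offsets to use depend on the drive's size, which is not assumed here.

```shell
#!/bin/sh
# Hypothetical helper: time a read of COUNT MiB starting OFFSET MiB into a
# device or file and print the rate in KB/sec.
# Assumes GNU dd (bs=1M, skip counted in bs-sized blocks) and GNU date (%N).
zone_rate_kbs() {
    target=$1
    offset_mib=$2      # where to start reading, in MiB from the start
    count_mib=$3       # how much to read, in MiB
    start=$(date +%s.%N)
    dd if="$target" of=/dev/null bs=1M skip="$offset_mib" count="$count_mib" \
        2>/dev/null || return 1
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" -v m="$count_mib" \
        'BEGIN { d = e - s; if (d <= 0) d = 0.001; printf "%d\n", m * 1024 / d }'
}
```

Comparing "zone_rate_kbs /dev/hde 0 256" with the same call at an offset near the end of the disk shows how much of a gap the drive geometry alone accounts for.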

> 6. If the Promise card is installed in one of the two 64-bit/66MHz
> slots on the Supermicro MB, then hde (the first ide port) behaves the
> same: 31300K/sec if catting /dev/hde, but only 8400K/sec if catting
> /dev/hde4. HOWEVER, the master drive on the second port, hdg, yields
> 31300K/sec for both cat /dev/hdg and cat /dev/hdg4. I have swapped

How is hdg partitioned? You should expect to see a significant speed
difference between the inner and outer cylinders of a drive.

> I dunno. This is all getting way too confusing so I am going to find
> a configuration that works and stop trying to make it work perfectly
> and understand the ins and outs.

We have a supermicro 370DE6 (serverworks HE-sl) with a 3ware escalade 7850.
It's very fast, faster than the mylex extremeraid 2000 with much more
expensive drives in the same computer. The 3ware hasn't been used much yet in
that machine, but so far we have had no problems. Maybe the promise cards and
the serverworks IDE controller are just crappy hardware, and are never going
to work correctly?

2002-04-03 01:01:43

by Alan

Subject: Re: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

> I think the serverworks IDE is only mode4, not even UDMA33. I heard a lot of
> bad things about it, and removed all the IDE drives from our serverworks
> system's controller.

Serverworks OSB4 IDE will do UDMA33 but seems to have problems with certain
combinations of drives, controllers and unknown influences. The newer CSB5
seems to work beautifully.

2002-04-03 02:14:32

by Jeff Nguyen

Subject: Re: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

There are ATAPI devices using UDMA25 such as HP 9300i. These UDMA25
devices are the problem maker on OSB4. Unless DMA is disabled, the system
will lock up when accessing the drive.

If you have UDMA33 ATAPI devices, they work great in OSB4.

Jeff

----- Original Message -----
From: "Alan Cox" <[email protected]>
To: "Trent Piepho" <[email protected]>
Cc: <[email protected]>; <[email protected]>;
<[email protected]>
Sent: Tuesday, April 02, 2002 5:18 PM
Subject: Re: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20


> > I think the serverworks IDE is only mode4, not even UDMA33. I heard a lot of
> > bad things about it, and removed all the IDE drives from our serverworks
> > system's controller.
>
> Serverworks OSB4 IDE will do UDMA33 but seems to have problems with certain
> combinations of drives, controllers and unknown influences. The newer CSB5
> seems to work beautifully
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2002-04-03 02:42:44

by Alan

Subject: Re: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

> devices are the problem maker on OSB4. Unless DMA is disabled, the system
> will lock up when accessing the drive.
>
> If you have UDMA33 ATAPI devices, they work great in OSB4.

Except when they don't. There are definite problems with some specific
combinations and ones I know are not one offs because we've seen them over
an entire render farm for example.

The current driver panics and asks people to email me if it spots the
pattern that precedes UDMA disk corruption. I get little mail, but some.

2002-04-03 02:58:41

by jim

Subject: Re: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

FWIW, I have Maxtor 5T060H6 UDMA100 drives connected as hda and hdc,
and this command will reliably trash both hda and hdc on a Supermicro
P3TDLE mobo using the built-in IDE ports:

mount /dev/hdc4 /mnt
cd /home    # /home is on /dev/hda4
tar -cf - *|tar -C /mnt -xpf -

Within seconds of hdc's light coming on, all kinds of filesystem
errors will occur, and then BOTH drives are corrupted. This is one of
two different Supermicro P3TDLE mobos, both purchased in Nov 2001. I
removed the Promise controllers completely to eliminate them as a
problem, and it still happens. It happens with regular 2.2.19 and
2.2.20 with Andre Hedrick's patches.
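
For anyone repeating the tar pipeline above on hardware they suspect, the copy can be checked by checksumming both trees. This is only a sketch, assuming GNU find, sort, xargs and md5sum; tree_sums and verify_copy are hypothetical names. The destination should be unmounted and remounted first so the reads come from the disk rather than the page cache.

```shell
#!/bin/sh
# Hypothetical check: compare every regular file under two trees by MD5.
# Assumes GNU find/sort/xargs (-print0, -z, -0, -r) and md5sum.
tree_sums() {
    # Paths are printed relative to the tree root so two trees can be compared.
    (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r md5sum)
}

verify_copy() {
    src=$1
    dst=$2
    a=$(tree_sums "$src") || return 2
    b=$(tree_sums "$dst") || return 2
    if [ "$a" = "$b" ]; then
        echo "OK: trees match"
    else
        echo "MISMATCH between $src and $dst"
        return 1
    fi
}
```

With the mounts used above, "verify_copy /home /mnt" would report a mismatch as soon as any file differs, though it cannot say which side was damaged.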

This board does claim to support UDMA33 and Linux says the MB IDE
ports are in UDMA33 mode. Works fine in just PIO mode. Slower, but
at least it doesn't trash drives.
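
Falling back to PIO as described above is normally done with hdparm's -d0 flag, which clears the using_dma setting for a drive. A minimal sketch, assuming hdparm is installed and that hda and hdc are the affected drives; disable_dma and the RUN variable are names made up for this example, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Print (or, with RUN=1, execute) the hdparm calls that turn DMA off.
# hdparm -d0 clears the using_dma flag; the drive list is an assumption.
disable_dma() {
    for dev in "$@"; do
        if [ "${RUN:-0}" = 1 ]; then
            hdparm -d0 "$dev"
        else
            echo "hdparm -d0 $dev"
        fi
    done
}
```

Calling "disable_dma /dev/hda /dev/hdc" from a boot script with RUN=1 set would apply the setting before any filesystems on those drives see heavy traffic.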

One reason for posting this to the list is that it cost us 34 hours of
downtime of a production site, delayed a site upgrade for 4 months,
and took me several days of testing to narrow it down to crummy
motherboards. If there is a tweak to say "don't ever do DMA on
Supermicro boards", at least on this one, I'd recommend it. I never
tried putting two drives on ide0; that may work or may also trash
both drives. And it never locked up our machine, although I think
it did panic once because of all the damage to the filesystem.

I can reliably duplicate this if anyone wants me to do some testing.
I have 2 dual-CPU test machines and 12 drives, so no shortage of
hardware to beat on.

This board says:

ServerWorks OSB4: IDE controller on PCI bus 00 dev 79
ServerWorks OSB4: chipset revision 0

Jim

>
> > devices are the problem maker on OSB4. Unless DMA is disabled, the system
> > will lock up when accessing the drive.
> >
> > If you have UDMA33 ATAPI devices, they work great in OSB4.
>
> Except when they don't. There are definite problems with some specific
> combinations and ones I know are not one offs because we've seen them over
> an entire render farm for example.
>
> The current driver panics and asks people to email me if it spots the UDMA
> disk corruption about to occur pattern. I get little mail but some
>

2002-04-03 03:10:41

by jim

Subject: Re: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

>
> On Tue, 2 Apr 2002 [email protected] wrote:
> > 2. The MB IDE ports transfer data at about 18000K/sec while doing
> > cat /dev/hda >/dev/null and looking at vmstat.
>
> I think the serverworks IDE is only mode4, not even UDMA33. I heard a lot of
> bad things about it, and removed all the IDE drives from our serverworks
> system's controller.
>
> > hdk. In a 32-bit slot, cat /dev/hdx >/dev/null shows 31300K/sec. But
> > doing cat /dev/hde4 (a specific partition) for example gives
> > 8400K/sec. That makes no sense to me.
>
> The outer cylinders of a drive are faster than the inner cylinders. Try
> repartitioning the drive so that hde4 starts at cylinder 1, and see if that
> changes the speed.

That's not it because all 4 drives are partitioned the same, yet hde4
gives 8400K/sec and hdk4 gives 31000K. Also there are the same number
of interrupts/sec and context switches per sec according to vmstat in
the 8400 and 31000 case.

> Maybe the promise cards and
> the serverworks IDE controller are just crappy hardware, and are never going
> to work correctly?

I certainly can relate to that!

Thanks for the feedback,
Jim

2002-04-03 14:52:28

by jim

Subject: Re: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

>
> On Tue, 2 Apr 2002 [email protected] wrote:
>
> > >
> > > On Tue, 2 Apr 2002 [email protected] wrote:
> > > > 2. The MB IDE ports transfer data at about 18000K/sec while doing
> > > > cat /dev/hda >/dev/null and looking at vmstat.
> > >
> > > I think the serverworks IDE is only mode4, not even UDMA33. I heard a lot of
> > > bad things about it, and removed all the IDE drives from our serverworks
> > > system's controller.
> > >
> > > > hdk. In a 32-bit slot, cat /dev/hdx >/dev/null shows 31300K/sec. But
> > > > doing cat /dev/hde4 (a specific partition) for example gives
> > > > 8400K/sec. That makes no sense to me.
> > >
> > > The outer cylinders of a drive are faster than the inner cylinders. Try
> > > repartitioning the drive so that hde4 starts at cylinder 1, and see if that
> > > changes the speed.
> >
> > That's not it because all 4 drives are partitioned the same, yet hde4
> > gives 8400K/sec and hdk4 gives 31000K. Also there are the same number
> > of interrupts/sec and context switches per sec according to vmstat in
> > the 8400 and 31000 case.
>
> sure you've got the same block size? 8400 * 4 is suspiciously close to
> 31000 ... I don't use vmstat, so I'm not sure if it reports blocks or
> kbyte /sec

Yeah, I even remade the file systems on both drives and got the same
result.

After moving the drives around on the Promise controller and moving 1
drive back to the MB IDE port, it looks like all drives are running at
high speed. I dunno - it's weird. I wouldn't be surprised if one of
the drives arbitrarily reverts back to the slower speed at some point
or after a future reboot, though it hasn't yet.

I was thinking maybe burst mode was not getting turned on, not working,
or getting turned off on some of the Promise ports for some reason.

vmstat reports KB/sec.
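
Since both figures are in KB/sec, the block-size theory can be checked against the numbers quoted in this thread: an exact factor of 4 would point to a unit mismatch. A small sketch of that arithmetic, with rate_ratio as a hypothetical helper name:

```shell
#!/bin/sh
# Ratio of a fast rate to a slow rate, to two decimal places.
rate_ratio() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.2f\n", fast / slow }'
}

rate_ratio 31300 8400   # whole-disk figure vs. hde4 figure
rate_ratio 31000 8400   # hdk4 figure vs. hde4 figure
```

Both ratios come out somewhat under 4 (about 3.73 and 3.69), so the gap is close to, but not exactly, a factor-of-four difference.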

Jim

2002-04-04 01:40:29

by Nerijus Baliūnas

Subject: Re[2]: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20

On Tue, 2 Apr 2002 18:58:20 -0800 (PST) "[email protected]" <[email protected]> wrote:

j> This board does claim to support UDMA33 and Linux says the MB IDE
j> ports are in UDMA33 mode. Works fine in just PIO mode. Slower, but
j> at least it doesn't trash drives.

j> This board says:
j>
j> ServerWorks OSB4: IDE controller on PCI bus 00 dev 79
j> ServerWorks OSB4: chipset revision 0

IIRC downgrading DMA to MDMA2 should help.
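
Forcing a lower transfer mode is usually done with hdparm -X, whose numeric argument per the hdparm man page is 8+n for PIO mode n, 32+n for multiword DMA mode n, and 64+n for UltraDMA mode n. A small sketch of that mapping; xfer_code is a hypothetical helper, and whether MDMA2 actually avoids the OSB4 corruption is not something this example can confirm.

```shell
#!/bin/sh
# Map a transfer-mode name to the numeric argument of hdparm -X.
# Per the hdparm man page: PIO modes are 8+n, multiword DMA modes are 32+n,
# and UltraDMA modes are 64+n.
xfer_code() {
    case $1 in
        pio[0-9])  echo $((8  + ${1#pio}))  ;;
        mdma[0-9]) echo $((32 + ${1#mdma})) ;;
        udma[0-9]) echo $((64 + ${1#udma})) ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

For MDMA2 this yields 34, so the command would be along the lines of "hdparm -d1 -X34 /dev/hda", to be tried on a drive whose contents are expendable.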

Regards,
Nerijus

2002-04-04 10:04:35

by Andre Hedrick

Subject: Re: Re[2]: Update on Promise 100TX2 + Serverworks IDE issues -- 2.2.20


I need to find time to publish and break down a new 352K patch for
2.4.19-pre4 that covers this issue also.

Andre Hedrick
LAD Storage Consulting Group

On Thu, 4 Apr 2002, Nerijus Baliunas wrote:

> On Tue, 2 Apr 2002 18:58:20 -0800 (PST) "[email protected]" <[email protected]> wrote:
>
> j> This board does claim to support UDMA33 and Linux says the MB IDE
> j> ports are in UDMA33 mode. Works fine in just PIO mode. Slower, but
> j> at least it doesn't trash drives.
>
> j> This board says:
> j>
> j> ServerWorks OSB4: IDE controller on PCI bus 00 dev 79
> j> ServerWorks OSB4: chipset revision 0
>
> IIRC downgrading DMA to MDMA2 should help.
>
> Regards,
> Nerijus