2001-04-29 18:48:39

by Steffen Persvold

Subject: ServerWorks LE and MTRR

Hi all,

I just compiled 2.4.4 and am running it on a ServerWorks LE motherboard.
Whenever I try to add a write-combining region, it gets rejected. I took a peek
in the arch/i386/kernel/mtrr.c and found that this is just as expected with
v1.40 of the code. It is great that the mtrr code checks and prevents the user
from doing something that could eventually lead to data corruption. Using
write-combining on PCI accesses can lead to this on certain LE revisions but
_not_ all (only rev < 5). Therefore please consider my small patch to allow the
good ones to be able to use write-combining. I have several rev 06 and they are
working fine with this patch.

Best regards,
--
Steffen Persvold Systems Engineer
Email : mailto:[email protected] Scali AS (http://www.scali.com)
Norway : Tel : (+47) 2262 8950 Olaf Helsets vei 6
Fax : (+47) 2262 8951 N-0621 Oslo, Norway

USA : Tel : (+1) 713 706 0544 10500 Richmond Avenue, Suite 190
Houston, Texas 77042, USA

diff -Nur linux/arch/i386/kernel/mtrr.c.~1~ linux/arch/i386/kernel/mtrr.c
--- linux/arch/i386/kernel/mtrr.c.~1~ Wed Apr 11 21:02:27 2001
+++ linux/arch/i386/kernel/mtrr.c Sun Apr 29 10:18:06 2001
@@ -480,6 +480,7 @@
{
unsigned long config, dummy;
struct pci_dev *dev = NULL;
+ u8 rev;

/* ServerWorks LE chipsets have problems with write-combining
Don't allow it and leave room for other chipsets to be tagged */
@@ -489,7 +490,9 @@
case PCI_VENDOR_ID_SERVERWORKS:
switch (dev->device) {
case PCI_DEVICE_ID_SERVERWORKS_LE:
- return 0;
+ pci_read_config_byte(dev, PCI_CLASS_REVISION, &rev);
+ if (rev <= 5)
+ return 0;
break;
default:
break;
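
The decision this patch adds can be looked at in isolation. What follows is a minimal standalone sketch, not the kernel code: the helper name `le_rev_allows_wc` and the macro `LE_LAST_UNSAFE_REV` are hypothetical, and the cutoff of 5 is only the empirical value reported in this thread, not a figure confirmed by ServerWorks.

```c
#include <stdint.h>

/* Assumption taken from this thread: ServerWorks LE revisions up to and
 * including 5 are treated as unsafe for write-combining. */
#define LE_LAST_UNSAFE_REV 5

/* Hypothetical helper mirroring the patched check.
 * Returns 1 if write-combining may be allowed, 0 if it must be refused. */
int le_rev_allows_wc(uint8_t rev)
{
    return rev > LE_LAST_UNSAFE_REV ? 1 : 0;
}
```

With this rule a rev 5 part is still refused and a rev 6 part is allowed, which matches the `rev <= 5` test in the diff.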


2001-04-29 20:16:42

by Gérard Roudier

Subject: Re: ServerWorks LE and MTRR



On Sun, 29 Apr 2001, Steffen Persvold wrote:

> Hi all,
>
> I just compiled 2.4.4 and am running it on a ServerWorks LE motherboard.
> Whenever I try to add a write-combining region, it gets rejected. I took a peek
> in the arch/i386/kernel/mtrr.c and found that this is just as expected with
> v1.40 of the code. It is great that the mtrr code checks and prevents the user
> from doing something that could eventually lead to data corruption. Using
> write-combining on PCI accesses can lead to this on certain LE revisions but
> _not_ all (only rev < 5). Therefore please consider my small patch to allow the
> good ones to be able to use write-combining. I have several rev 06 and they are
> working fine with this patch.

You wrote that 'only rev < 5' can lead to data corruption, but your patch
seems to disallow use of write combining for rev 5 too.

Could you clarify?

Gérard.

PS:
From what hat did you get this information? It seems that ServerWorks
requires an NDA before disclosing technical information on their chipsets.

> Best regards,
> --
> Steffen Persvold Systems Engineer
> Email : mailto:[email protected] Scali AS (http://www.scali.com)
> Norway : Tel : (+47) 2262 8950 Olaf Helsets vei 6
> Fax : (+47) 2262 8951 N-0621 Oslo, Norway
>
> USA : Tel : (+1) 713 706 0544 10500 Richmond Avenue, Suite 190
> Houston, Texas 77042, USA
>
> diff -Nur linux/arch/i386/kernel/mtrr.c.~1~ linux/arch/i386/kernel/mtrr.c
> --- linux/arch/i386/kernel/mtrr.c.~1~ Wed Apr 11 21:02:27 2001
> +++ linux/arch/i386/kernel/mtrr.c Sun Apr 29 10:18:06 2001
> @@ -480,6 +480,7 @@
> {
> unsigned long config, dummy;
> struct pci_dev *dev = NULL;
> + u8 rev;
>
> /* ServerWorks LE chipsets have problems with write-combining
> Don't allow it and leave room for other chipsets to be tagged */
> @@ -489,7 +490,9 @@
> case PCI_VENDOR_ID_SERVERWORKS:
> switch (dev->device) {
> case PCI_DEVICE_ID_SERVERWORKS_LE:
> - return 0;
> + pci_read_config_byte(dev, PCI_CLASS_REVISION, &rev);
> + if (rev <= 5)
> + return 0;
> break;
> default:
> break;

2001-04-29 20:59:17

by Steffen Persvold

Subject: Re: ServerWorks LE and MTRR

Gérard Roudier wrote:
>
> On Sun, 29 Apr 2001, Steffen Persvold wrote:
>
> > Hi all,
> >
> > I just compiled 2.4.4 and am running it on a ServerWorks LE motherboard.
> > Whenever I try to add a write-combining region, it gets rejected. I took a peek
> > in the arch/i386/kernel/mtrr.c and found that this is just as expected with
> > v1.40 of the code. It is great that the mtrr code checks and prevents the user
> > from doing something that could eventually lead to data corruption. Using
> > write-combining on PCI accesses can lead to this on certain LE revisions but
> > _not_ all (only rev < 5). Therefore please consider my small patch to allow the
> > good ones to be able to use write-combining. I have several rev 06 and they are
> > working fine with this patch.
>
> You wrote that 'only rev < 5' can lead to data corruption, but your patch
> seems to disallow use of write combining for rev 5 too.
>
> Could you clarify?

Oops, just a typo in my message: it should be <= 5. The patch is correct.

>
> Gérard.
>
> PS:
> From what hat did you get this information ? as it seems that ServerWorks
> require NDA for letting know technical information on their chipsets.
>

I learned it the hard way. I have two types: a Compaq DL360 (rev 5) and a
Tyan S2510 (rev 6). On the Compaq machine I constantly get data corruption on
the last double word (4 bytes) of a 64-byte PCI burst when I use write
combining on the CPU. On the Tyan, however, the transfer is always OK.
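
The failure signature described above (only the last double word of each 64-byte burst is wrong) suggests a readback check that looks at exactly those positions. The following is a minimal sketch under stated assumptions: the function names and the test pattern are hypothetical, and in a real test the buffer would be read back from the remote memory after the write-combined transfer rather than filled locally.

```c
#include <stddef.h>
#include <stdint.h>

#define BURST_BYTES 64   /* burst size reported in the thread */
#define DWORD_BYTES 4    /* one double word */

/* Hypothetical pattern: every 32-bit word encodes its own index,
 * so a stale or misplaced word is easy to spot (valid for < 2^24 words). */
void fill_pattern(uint32_t *buf, size_t words)
{
    for (size_t i = 0; i < words; i++)
        buf[i] = 0xA5000000u | (uint32_t)i;
}

/* Count the 64-byte blocks whose last double word does not match
 * the pattern written by fill_pattern(). Other words are ignored. */
size_t count_bad_tail_dwords(const uint32_t *buf, size_t words)
{
    const size_t words_per_burst = BURST_BYTES / DWORD_BYTES; /* 16 */
    size_t bad = 0;

    for (size_t last = words_per_burst - 1; last < words;
         last += words_per_burst) {
        uint32_t expected = 0xA5000000u | (uint32_t)last;
        if (buf[last] != expected)
            bad++;
    }
    return bad;
}
```

A nonzero count on one board and zero on another, with the same adapter and the same transfer, would reproduce the difference reported here between the two machines.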

--
Steffen Persvold Systems Engineer
Email : mailto:[email protected] Scali AS (http://www.scali.com)
Norway : Tel : (+47) 2262 8950 Olaf Helsets vei 6
Fax : (+47) 2262 8951 N-0621 Oslo, Norway

USA : Tel : (+1) 713 706 0544 10500 Richmond Avenue, Suite 190
Houston, Texas 77042, USA

2001-04-29 22:15:08

by Nick

Subject: Re: ServerWorks LE and MTRR

Are you sure that is not due to board design differences?
Nick

On Sun, 29 Apr 2001, Steffen Persvold wrote:

> Gérard Roudier wrote:
> >
> > On Sun, 29 Apr 2001, Steffen Persvold wrote:
> >
> > > Hi all,
> > >
> > > I just compiled 2.4.4 and am running it on a ServerWorks LE motherboard.
> > > Whenever I try to add a write-combining region, it gets rejected. I took a peek
> > > in the arch/i386/kernel/mtrr.c and found that this is just as expected with
> > > v1.40 of the code. It is great that the mtrr code checks and prevents the user
> > > from doing something that could eventually lead to data corruption. Using
> > > write-combining on PCI accesses can lead to this on certain LE revisions but
> > > _not_ all (only rev < 5). Therefore please consider my small patch to allow the
> > > good ones to be able to use write-combining. I have several rev 06 and they are
> > > working fine with this patch.
> >
> > You wrote that 'only rev < 5' can lead to data corruption, but your patch
> > seems to disallow use of write combining for rev 5 too.
> >
> > Could you clarify?
>
> Oops just a typo, it should be <= 5. The patch is correct.
>
> >
> > Gérard.
> >
> > PS:
> > From what hat did you get this information ? as it seems that ServerWorks
> > require NDA for letting know technical information on their chipsets.
> >
>
> I've learned it the hard way, I have two types : Compaq DL360 (rev 5) and a
> Tyan S2510 (rev 6). On the compaq machine I constantly get data corruption on
> the last double word (4 bytes) in a 64 byte PCI burst when I use write
> combining on the CPU. On the Tyan however the transfer is always ok.
>
> --
> Steffen Persvold Systems Engineer
> Email : mailto:[email protected] Scali AS (http://www.scali.com)
> Norway : Tel : (+47) 2262 8950 Olaf Helsets vei 6
> Fax : (+47) 2262 8951 N-0621 Oslo, Norway
>
> USA : Tel : (+1) 713 706 0544 10500 Richmond Avenue, Suite 190
> Houston, Texas 77042, USA

2001-04-29 22:33:14

by Steffen Persvold

Subject: Re: ServerWorks LE and MTRR

[email protected] wrote:
> On Sun, 29 Apr 2001, Steffen Persvold wrote:
>
> > I've learned it the hard way, I have two types : Compaq DL360 (rev 5) and a
> > Tyan S2510 (rev 6). On the compaq machine I constantly get data corruption on
> > the last double word (4 bytes) in a 64 byte PCI burst when I use write
> > combining on the CPU. On the Tyan however the transfer is always ok.
> >
>
> Are you sure that is not due to board design differences?

No, I can't be 100% certain that the layout of the board isn't the reason, since
I haven't asked ServerWorks about this and their docs don't say anything about it
(yes, my company has the NDA, so I shouldn't go into too much detail here),
but if that were the case it would be totally wrong to disable write combining
on any LE chipset.

The test case that I have been using to trigger this is somewhat special because
we are using SCI shared memory adapters to write (with PIO) into remote nodes'
memory, and the bandwidth tends to get quite high (approx. 170 MByte/sec on LE
with write combining). I've been able to run this case on 5 different
motherboards using the LE and HE-SL ServerWorks chipsets, but only two of them
are LE (the DL360 and the S2510). Everything works fine with write-combining on
every motherboard except the DL360 (which has rev 5).

One basic test case that I haven't tried would be to enable write-combining on
your PCI graphics adapter memory and see if the X display gets corrupted.
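
A write-combining region of this kind is normally requested by writing a text command of the form `base=... size=... type=write-combining` to /proc/mtrr. Below is a minimal sketch in C of building and sending such a command. The function names are hypothetical, and the base and size in the usage note are placeholders, not values tested in this thread; on an affected LE revision the kernel would be expected to reject the request.

```c
#include <stdio.h>
#include <string.h>

/* Build the text command accepted by /proc/mtrr for a write-combining
 * region. Returns the command length, or -1 if it does not fit in out. */
int format_mtrr_wc_request(char *out, size_t out_len,
                           unsigned long base, unsigned long size)
{
    int n = snprintf(out, out_len,
                     "base=0x%lx size=0x%lx type=write-combining\n",
                     base, size);
    return (n > 0 && (size_t)n < out_len) ? n : -1;
}

/* Write one command line to the given path (normally "/proc/mtrr").
 * Returns 0 on success, -1 if the file cannot be opened or written. */
int send_mtrr_request(const char *path, const char *line)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    size_t len = strlen(line);
    size_t written = fwrite(line, 1, len, f);

    /* fclose() flushes the buffer, so its result matters too. */
    if (fclose(f) != 0 || written != len)
        return -1;
    return 0;
}
```

For example, `format_mtrr_wc_request(buf, sizeof buf, 0xf8000000UL, 0x400000UL)` followed by `send_mtrr_request("/proc/mtrr", buf)` as root would request a 4 MB region at a placeholder frame buffer address; the real base and size have to be taken from the adapter's prefetchable memory range.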

I will try to get some information from ServerWorks about this problem, but I'm
not sure if ServerWorks would be happy if I told you the answer (because of the
NDA).

Regards,
--
Steffen Persvold Systems Engineer
Email : mailto:[email protected] Scali AS (http://www.scali.com)
Norway : Tel : (+47) 2262 8950 Olaf Helsets vei 6
Fax : (+47) 2262 8951 N-0621 Oslo, Norway

USA : Tel : (+1) 713 706 0544 10500 Richmond Avenue, Suite 190
Houston, Texas 77042, USA

2001-04-30 01:08:17

by Dave Jones

Subject: Re: ServerWorks LE and MTRR

On Sun, 29 Apr 2001, Steffen Persvold wrote:

> ...
> Therefore please consider my small patch to allow the
> good ones to be able to use write-combining. I have several rev 06 and they are
> working fine with this patch.
> ...

ObPedant:
Can you make a note of this in the comment a few lines above also,
so others who stumble across this code know why the check is there.
afaik, this chipset info isn't public, so it may not be obvious
in the future why the check has been added.

Just something simple like..

- /* ServerWorks LE chipsets have problems with write-combining
+ /* ServerWorks LE chipsets < rev 6 have problems with write-combining
Don't allow it and leave room for other chipsets to be tagged */


Otherwise, if this works for everyone else with rev 6+ serverworks
chipsets, looks ok to me.

regards,

Dave.

--
| Dave Jones. http://www.suse.de/~davej
| SuSE Labs

2001-04-30 06:46:52

by Gérard Roudier

Subject: Re: ServerWorks LE and MTRR



On Sun, 29 Apr 2001, Steffen Persvold wrote:

> [email protected] wrote:
> > On Sun, 29 Apr 2001, Steffen Persvold wrote:
> >
> > > I've learned it the hard way, I have two types : Compaq DL360 (rev 5) and a
> > > Tyan S2510 (rev 6). On the compaq machine I constantly get data corruption on
> > > the last double word (4 bytes) in a 64 byte PCI burst when I use write
> > > combining on the CPU. On the Tyan however the transfer is always ok.
> > >
> >
> > Are you sure that is not due to board design differences?
>
> No I can't be 100% certain that the layout of the board isn't the reason since
> I haven't asked ServerWorks about this and it doesn't say anything in their
> docs (yes my company has the NDA, so I shouldn't get too much in detail here),
> but if this was the case it would be totally wrong to disable write combining
> on any LE chipset.
>
> The test case that I have been using to trigger this is sort of special because
> we are using SCI shared memory adapters to write (with PIO) into remote nodes
> memory, and the bandwidth tends to get quite high (approx 170 MByte/sec on LE
> with write combining). I've been able to run this case on 5 different
> motherboards using the LE and HE-SL ServerWorks chipsets, but only two of them
> are LE (the DL360 and the S2510). Everything works fine with write-combining on
> every motherboard except the DL360 (which has rev 5).
>
> One basic test case that I haven't tried, could be to enable write-combining on
> your PCI graphics adapter memory and see if the X display gets screwed up.

I have been doing that for 8 months on my Supermicro 370 DLE board. /proc/pci reports
2 PCI bridges, rev. 5. The 64-bit PCI bus (bus 1) hosts an LSI53C1010
33 MHz 64-bit PCI-SCSI controller. The other devices (3dfx, SYM53C895, ...)
are on PCI bus #0. The machine's only network connection is through an external modem.
I never got a single glitch (linux-2.2.18), but the machine is not a server;
it is the workstation I use at home.

Here is /proc/pci layout:
PCI devices found:
Bus 0, device 0, function 1:
Host bridge: Unknown vendor CNB30LE PCI Bridge (rev 5).
Medium devsel. Master Capable. Latency=16.
Bus 0, device 0, function 0:
Host bridge: Unknown vendor CNB30LE PCI Bridge (rev 5).
Medium devsel. Master Capable. Latency=32.
Bus 0, device 1, function 0:
SCSI storage controller: NCR 53c895 (rev 1).
Medium devsel. IRQ 16. Master Capable. Latency=72. Min Gnt=30.Max Lat=64.
I/O at 0xde00 [0xde01].
Non-prefetchable 32 bit memory at 0xfeaefe00 [0xfeaefe00].
Non-prefetchable 32 bit memory at 0xfeaec000 [0xfeaec000].
Bus 0, device 2, function 0:
SCSI storage controller: NCR 53c810 (rev 18).
Medium devsel. IRQ 18. Master Capable. Latency=64. Min Gnt=8.Max Lat=64.
I/O at 0xd400 [0xd401].
Non-prefetchable 32 bit memory at 0xfeaeff00 [0xfeaeff00].
Bus 0, device 3, function 0:
VGA compatible controller: 3Dfx Unknown device (rev 1).
Vendor id=121a. Device id=5.
Fast devsel. Fast back-to-back capable. IRQ 20.
Non-prefetchable 32 bit memory at 0xfc000000 [0xfc000000].
Prefetchable 32 bit memory at 0xf8000000 [0xf8000008].
I/O at 0xd800 [0xd801].
Bus 0, device 6, function 0:
Ethernet controller: Intel 82557 (rev 8).
Medium devsel. Fast back-to-back capable. IRQ 31. Master Capable. Latency=64. Min Gnt=8.Max Lat=56.
Non-prefetchable 32 bit memory at 0xfeaed000 [0xfeaed000].
I/O at 0xd000 [0xd001].
Non-prefetchable 32 bit memory at 0xfe900000 [0xfe900000].
Bus 0, device 15, function 0:
ISA bridge: Unknown vendor Unknown device (rev 79).
Vendor id=1166. Device id=200.
Medium devsel. Master Capable. No bursts.
Bus 0, device 15, function 1:
IDE interface: Unknown vendor Unknown device (rev 0).
Vendor id=1166. Device id=211.
Medium devsel. Master Capable. Latency=64.
I/O at 0xffa0 [0xffa1].
Bus 0, device 15, function 2:
USB Controller: Unknown vendor Unknown device (rev 4).
Vendor id=1166. Device id=220.
Medium devsel. Fast back-to-back capable. IRQ 10. Master Capable. Latency=64. Max Lat=80.
Non-prefetchable 32 bit memory at 0xfeaee000 [0xfeaee000].
Bus 1, device 1, function 1:
SCSI storage controller: NCR Unknown device (rev 1).
Vendor id=1000. Device id=20.
Medium devsel. IRQ 25. Master Capable. Latency=72. Min Gnt=17.Max Lat=18.
I/O at 0xe800 [0xe801].
Non-prefetchable 64 bit memory at 0xfebffc00 [0xfebffc04].
Non-prefetchable 64 bit memory at 0xfebfc000 [0xfebfc004].
Bus 1, device 1, function 0:
SCSI storage controller: NCR Unknown device (rev 1).
Vendor id=1000. Device id=20.
Medium devsel. IRQ 24. Master Capable. Latency=72. Min Gnt=17.Max Lat=18.
I/O at 0xe400 [0xe401].
Non-prefetchable 64 bit memory at 0xfebff800 [0xfebff804].
Non-prefetchable 64 bit memory at 0xfebfa000 [0xfebfa004].

> I will try to get some information from ServerWorks about this problem, but I'm
> not sure if ServerWorks would be happy if I told you the answer (because of the
> NDA).

I hope this thread will prompt them to answer properly. You should tell them
about it, in my opinion ...
You may then tell the answer to mtrr.c, which is unable to sign NDAs... :-)

Regards,
Gérard.

> Regards,
> --
> Steffen Persvold Systems Engineer
> Email : mailto:[email protected] Scali AS (http://www.scali.com)
> Norway : Tel : (+47) 2262 8950 Olaf Helsets vei 6
> Fax : (+47) 2262 8951 N-0621 Oslo, Norway
>
> USA : Tel : (+1) 713 706 0544 10500 Richmond Avenue, Suite 190
> Houston, Texas 77042, USA
>

2001-04-30 08:57:45

by Eric W. Biederman

Subject: Re: ServerWorks LE and MTRR

Steffen Persvold <[email protected]> writes:

> [email protected] wrote:
> > On Sun, 29 Apr 2001, Steffen Persvold wrote:
> >
> > > I've learned it the hard way, I have two types : Compaq DL360 (rev 5) and a
> > > Tyan S2510 (rev 6). On the compaq machine I constantly get data corruption on
> > > the last double word (4 bytes) in a 64 byte PCI burst when I use write
> > > combining on the CPU. On the Tyan however the transfer is always ok.
> > >
> >
> > Are you sure that is not due to board design differences?
>
> No I can't be 100% certain that the layout of the board isn't the reason since
> I haven't asked ServerWorks about this and it doesn't say anything in their
> docs (yes my company has the NDA, so I shouldn't get too much in detail here),
> but if this was the case it would be totally wrong to disable write combining
> on any LE chipset.
>
> The test case that I have been using to trigger this is sort of special because
> we are using SCI shared memory adapters to write (with PIO) into remote nodes
> memory, and the bandwidth tends to get quite high (approx 170 MByte/sec on LE
> with write combining). I've been able to run this case on 5 different
> motherboards using the LE and HE-SL ServerWorks chipsets, but only two of them
> are LE (the DL360 and the S2510). Everything works fine with write-combining on
> every motherboard except the DL360 (which has rev 5).
>
> One basic test case that I haven't tried, could be to enable write-combining on
> your PCI graphics adapter memory and see if the X display gets screwed up.
>
> I will try to get some information from ServerWorks about this problem, but I'm
> not sure if ServerWorks would be happy if I told you the answer (because of the
> NDA).

I'd like to put in my small plug that this makes me a little nervous.
It could also be a problem with the firmware (aka BIOS) setting
something up incorrectly. Working with LinuxBIOS I have seen burst writes
(enabled with write-combining or write-back) cause data corruption
when non-burst writes to memory don't cause problems, when the memory
controller is set up wrong. (This was with Intel 440GX & 440BX
chipsets.)

Eric

2001-04-30 13:46:05

by Mark_Rusk

Subject: RE: ServerWorks LE and MTRR

I sent the original patch to Alan back in February to correct problems
with XFree86 4.0.2 and LE chipsets. I will check whether the problem is
corrected with revisions > 5. We would see complete system lockups when the
XFree86 server used write-combining MTRR segments.

-----Original Message-----
From: Dave Jones [mailto:[email protected]]
Sent: Sunday, April 29, 2001 8:06 PM
To: Steffen Persvold
Cc: lkml; [email protected]
Subject: Re: ServerWorks LE and MTRR


On Sun, 29 Apr 2001, Steffen Persvold wrote:

> ...
> Therefore please consider my small patch to allow the
> good ones to be able to use write-combining. I have several rev 06 and they are
> working fine with this patch.
> ...

ObPedant:
Can you make a note of this in the comment a few lines above also,
so others who stumble across this code know why the check is there.
afaik, this chipset info isn't public, so it may not be obvious
in the future why the check has been added.

Just something simple like..

- /* ServerWorks LE chipsets have problems with write-combining
+ /* ServerWorks LE chipsets < rev 6 have problems with write-combining
Don't allow it and leave room for other chipsets to be tagged */


Otherwise, if this works for everyone else with rev 6+ serverworks
chipsets, looks ok to me.

regards,

Dave.

--
| Dave Jones. http://www.suse.de/~davej
| SuSE Labs