2003-02-26 13:38:51

by Dave Jones

Subject: Tighten up serverworks workaround.

Apparently on rev6 of the LE and above, this workaround
isn't needed. Let's give it a try, and see what happens.

Dave

diff -urpN --exclude-from=/home/davej/.exclude bk-linus/arch/i386/kernel/cpu/mtrr/main.c linux-2.5/arch/i386/kernel/cpu/mtrr/main.c
--- bk-linus/arch/i386/kernel/cpu/mtrr/main.c 2003-02-25 13:10:08.000000000 -0100
+++ linux-2.5/arch/i386/kernel/cpu/mtrr/main.c 2003-02-24 16:36:06.000000000 -0100
@@ -75,20 +75,24 @@ void set_mtrr_ops(struct mtrr_ops * ops)
 static int have_wrcomb(void)
 {
 	struct pci_dev *dev = NULL;
-
-	/* WTF is this?
-	 * Someone, please shoot me.
-	 */
-
-	/* ServerWorks LE chipsets have problems with write-combining
-	   Don't allow it and leave room for other chipsets to be tagged */
+	u8 rev;

 	if ((dev = pci_find_class(PCI_CLASS_BRIDGE_HOST << 8, NULL)) != NULL) {
 		if ((dev->vendor == PCI_VENDOR_ID_SERVERWORKS) &&
 		    (dev->device == PCI_DEVICE_ID_SERVERWORKS_LE)) {
-			printk(KERN_INFO
-			       "mtrr: Serverworks LE detected. Write-combining disabled.\n");
-			return 0;
+
+			/* ServerWorks LE chipsets have problems with write-combining
+			   Don't allow it and leave room for other chipsets to be tagged.
+			   Rumour has it that rev6 and above are ok. */
+			pci_read_config_byte(dev, PCI_CLASS_REVISION, &rev);
+			if (rev > 5) {
+				printk ("mtrr: Serverworks LE rev %d detected. Earlier versions of this chipset had mtrr bugs\n", rev);
+				printk ("mtrr: Please send mail to [email protected] if this seems stable.\n");
+				return 1;
+			} else {
+				printk(KERN_INFO "mtrr: Serverworks LE detected. Write-combining disabled.\n");
+				return 0;
+			}
 		}
 	}
 	return (mtrr_if->have_wrcomb ? mtrr_if->have_wrcomb() : 0);
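
The decision the patch makes can be restated outside the kernel as a small pure function. The following is a minimal sketch, not the kernel code itself: chipset_permits_wrcomb() is a hypothetical name, and the two ID values are assumed to match PCI_VENDOR_ID_SERVERWORKS and PCI_DEVICE_ID_SERVERWORKS_LE from the kernel's pci_ids.h. A true result here only means "no chipset objection"; in the patch the final answer still comes from mtrr_if->have_wrcomb().

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's pci_ids.h definitions. */
#define PCI_VENDOR_ID_SERVERWORKS    0x1166
#define PCI_DEVICE_ID_SERVERWORKS_LE 0x0009

/*
 * Hypothetical helper mirroring the patch's logic:
 * a ServerWorks LE host bridge is trusted with write-combining
 * only from revision 6 upwards ("rev > 5" in the patch).
 * Any other chipset is not vetoed here.
 */
static bool chipset_permits_wrcomb(uint16_t vendor, uint16_t device,
                                   uint8_t rev)
{
	if (vendor == PCI_VENDOR_ID_SERVERWORKS &&
	    device == PCI_DEVICE_ID_SERVERWORKS_LE)
		return rev > 5;

	return true;
}
```

Keeping the check in one function like this also leaves room for other chipsets to be tagged, as the comment in the patch asks.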


2003-02-26 16:17:05

by Alan Cox

Subject: Re: Tighten up serverworks workaround.

> Apparently on rev6 of the LE and above, this workaround
> isn't needed. Let's give it a try, and see what happens

Only if serverworks confirm the rumour. This is a corruptor.

2003-02-26 16:19:32

by Dave Jones

Subject: Re: Tighten up serverworks workaround.

On Wed, Feb 26, 2003 at 11:27:14AM -0500, Alan Cox wrote:
> > Apparently on rev6 of the LE and above, this workaround
> > isn't needed. Let's give it a try, and see what happens
>
> Only if serverworks confirm the rumour. This is a corruptor.

I've reports of people with rev6's who have reported success
with that workaround commented out. Could be they never
pushed the machine hard enough to trigger a bug, but I'd
have thought this breakage would show up pretty quickly.

My attempts to contact serverworks in the past have fallen on
deaf ears. Maybe you'll have better luck?

Dave

2003-02-26 16:26:02

by Alan Cox

Subject: Re: Tighten up serverworks workaround.

> I've reports of people with rev6's who have reported success
> with that workaround commented out. Could be they never
> pushed the machine hard enough to trigger a bug, but I'd
> have thought this breakage would show up pretty quickly.

It doesn't. It requires the right patterns and took Dell quite
some time to identify and pin down. If it's like the 450NX one,
which I suspect it is, then you have to have pending misordered
stores to a write gathering target evicted by another read.

> My attempts to contact serverworks in the past have fallen on
> deaf ears. Maybe you'll have better luck?

I'll try. I got on ok with them for the OSB4 stuff but that's
a long time ago and they've been eaten since then.

[Bcc'd to the person I suspect is the right starting point]

Alan

2003-02-26 17:15:16

by Kimball Brown

Subject: RE: Tighten up serverworks workaround.

How can we help? Please give me a configuration and how the bug manifests
itself.

Kim Brown
VP, Business Development
[email protected]
2451 Mission College Blvd
Santa Clara, CA 95054
(408)922-3174
(408)799-3500 (Mobile)
(408)922-3192 (FAX)

-----Original Message-----
From: Alan Cox [mailto:[email protected]]
Sent: Wednesday, February 26, 2003 8:36 AM
To: [email protected]
Cc: [email protected]; [email protected]; [email protected]
Subject: Re: Tighten up serverworks workaround.

> I've reports of people with rev6's who have reported success
> with that workaround commented out. Could be they never
> pushed the machine hard enough to trigger a bug, but I'd
> have thought this breakage would show up pretty quickly.

It doesn't. It requires the right patterns and took Dell quite
some time to identify and pin down. If it's like the 450NX one,
which I suspect it is, then you have to have pending misordered
stores to a write gathering target evicted by another read.

> My attempts to contact serverworks in the past have fallen on
> deaf ears. Maybe you'll have better luck?

I'll try. I got on ok with them for the OSB4 stuff but that's
a long time ago and they've been eaten since then.

[Bcc'd to the person I suspect is the right starting point]

Alan


2003-02-26 17:53:02

by Alan Cox

Subject: Re: Tighten up serverworks workaround.

> How can we help? Please give me a configuration and how the bug manifests
> itself.

OSB4 chipset system, some memory areas marked write combining with the
processor memory type range registers. A long time ago Dell (I
think) reported corruption from this and submitted changes to block the
use of write combining on OSB4. The question has arisen as to whether
that's a known thing, and if so which release of the chipset fixed it, so that
people can apply such a restriction only to problem cases, not all OSB4.

Alan

2003-02-26 18:38:22

by Jonathan Lundell

Subject: Re: Tighten up serverworks workaround.

At 1:03pm -0500 2/26/03, Alan Cox wrote:
> > How can we help? Please give me a configuration and how the bug manifests
>> itself.
>
>OSB4 chipset system, some memory areas marked write combining with the
>processor memory type range registers. A long time ago Dell (I
>think) reported corruption from this and submitted changes to block the
>use of write combining on OSB4. The question has arisen as to whether
>that's a known thing, and if so which release of the chipset fixed it, so that
>people can apply such a restriction only to problem cases, not all OSB4.

Presumably we're talking about CNB30 (the north bridge) rather than
OSB4 (the south bridge).
--
/Jonathan Lundell.

2003-02-27 13:01:14

by Ingo Oeser

Subject: Re: Tighten up serverworks workaround.

Hello there,

On Wed, Feb 26, 2003 at 10:47:52AM -0800, Jonathan Lundell wrote:
> At 1:03pm -0500 2/26/03, Alan Cox wrote:
> > > How can we help? Please give me a configuration and how the bug manifests
> >> itself.
> >
> >OSB4 chipset system, some memory areas marked write combining with the
> >processor memory type range registers. A long time ago Dell (I
> >think) reported corruption from this and submitted changes to block the
> >use of write combining on OSB4. The question has arisen as to whether
> >that's a known thing, and if so which release of the chipset fixed it, so that
> >people can apply such a restriction only to problem cases, not all OSB4.
>
> Presumably we're talking about CNB30 (the north bridge) rather than
> OSB4 (the south bridge).

No, it's about the CNB20LE. I'm one of the low performance victims.
And this is an ASUS board (CUR-DLS) (which proved not worth its
price recently).

Regards

Ingo Oeser
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth

2003-03-03 08:52:18

by Mike A. Harris

Subject: Re: Tighten up serverworks workaround.

On Wed, 26 Feb 2003, Alan Cox wrote:

>> How can we help? Please give me a configuration and how the bug manifests
>> itself.
>
>OSB4 chipset system, some memory areas marked write combining with the
>processor memory type range registers. A long time ago Dell (I
>think) reported corruption from this and submitted changes to block the
>use of write combining on OSB4. The question has arisen as to whether
>that's a known thing, and if so which release of the chipset fixed it, so that
>people can apply such a restriction only to problem cases, not all OSB4.

I've got 2 OSB4 machines here, one a Tyan HEsl 2567 board. MTRRs
have been disabled on this board for a couple years now with
every kernel release, which I'm told is due to the MTRR problem
described in this thread.

00:00.1 PCI bridge: ServerWorks CNB20LE (rev 01)
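
The "rev 01" that lspci prints is the revision ID byte at offset 0x08 of the device's PCI configuration space, which is the byte the proposed patch reads with pci_read_config_byte(). A minimal sketch of pulling the identifying fields out of a raw copy of the config header follows. It assumes the first bytes of config space are already in a buffer (how they were obtained is outside the sketch); struct pci_ident and pci_ident_from_config() are hypothetical names, not kernel or pciutils API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields the workaround cares about. */
struct pci_ident {
	uint16_t vendor;    /* config offset 0x00, little-endian */
	uint16_t device;    /* config offset 0x02, little-endian */
	uint8_t  revision;  /* config offset 0x08 */
};

/*
 * Decode vendor, device and revision from a raw config-space buffer.
 * Returns 0 on success, -1 if the buffer is too short to contain
 * the revision byte.
 */
static int pci_ident_from_config(const uint8_t *cfg, size_t len,
                                 struct pci_ident *out)
{
	if (len < 9)
		return -1;

	out->vendor   = (uint16_t)(cfg[0] | (cfg[1] << 8));
	out->device   = (uint16_t)(cfg[2] | (cfg[3] << 8));
	out->revision = cfg[8];
	return 0;
}
```

With a rev 01 part like the one above, the patch's "rev > 5" test would fail and write-combining would stay disabled.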

Kimball, we chatted before about AGP on this board and a few
other issues, but I don't know if we discussed the MTRR issue.
Could you confirm this problem? If the problem is anything
workaroundable, it would be nice to have MTRRs working on this
box sometime as video is quite slow. I'm willing to test any
potential workarounds if something creeps up.
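
When testing a workaround along these lines, one quick check is whether any write-combining range actually got registered. The sketch below assumes the text of /proc/mtrr has already been read into a string and that write-combining entries carry the literal type name "write-combining"; count_wrcomb_entries() is a hypothetical helper, and the sample lines in the usage note are illustrative, not captured from a real machine.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count occurrences of the "write-combining" type name in the text
 * of /proc/mtrr. Each register line carries a single type name, so
 * the result equals the number of write-combining entries.
 */
static size_t count_wrcomb_entries(const char *text)
{
	static const char needle[] = "write-combining";
	size_t count = 0;
	const char *p = text;

	while ((p = strstr(p, needle)) != NULL) {
		count++;
		p += sizeof needle - 1;  /* skip past this match */
	}
	return count;
}
```

For example, given the two lines "reg00: base=0x00000000 (   0MB), size= 256MB: write-back, count=1" and "reg01: base=0xf8000000 (3968MB), size=  64MB: write-combining, count=1", the helper reports one write-combining entry.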

TIA



--
Mike A. Harris ftp://people.redhat.com/mharris
OS Systems Engineer - XFree86 maintainer - Red Hat

2003-03-03 10:35:10

by Stephan von Krawczynski

Subject: Re: Tighten up serverworks workaround.

On Wed, 26 Feb 2003 13:03:10 -0500 (EST)
Alan Cox <[email protected]> wrote:

> > How can we help? Please give me a configuration and how the bug manifests
> > itself.
>
> OSB4 chipset system, some memory areas marked write combining with the
> processor memory type range registers. A long time ago Dell (I
> think) reported corruption from this and submitted changes to block the
> use of write combining on OSB4. The question has arisen as to whether
> that's a known thing, and if so which release of the chipset fixed it, so that
> people can apply such a restriction only to problem cases, not all OSB4.
>
> Alan

While we are on the topic, what exactly is this:

00:00.2 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-

00:00.3 Host bridge: ServerWorks: Unknown device 0006 (rev 01)
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-

00:0f.3 Host bridge: ServerWorks: Unknown device 0225
Subsystem: ServerWorks: Unknown device 0230
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0

Taken from an Asus TRL-DLS.

--
Regards,
Stephan

2003-03-03 11:52:22

by Alan Cox

Subject: Re: Tighten up serverworks workaround.

You might want to take Kimball out of unrelated followups

2003-03-03 12:36:36

by Stephan von Krawczynski

Subject: Re: Tighten up serverworks workaround.

On Mon, 3 Mar 2003 07:02:47 -0500 (EST)
Alan Cox <[email protected]> wrote:

> You might want to take Kimball out of unrelated followups

Hm, I am not all that sure that it is completely unrelated. From my point of
view the "big picture" looks like this:

Coming from a system based on VIA-chipset and working perfectly well, we
changed mb to serverworks based TRL-DLS. From that time we experienced and
discussed here:

- strange mtrr settings (solved)
- interrupt sharing problem ide/tg3 (solved)
- reproducibly oops'ing cd mounts (on internal ide, with ide-scsi) (not solved)
- latest news: reproducibly cold-booting during tar-backup on _second_ streamer
device (/dev/st1) on board-internal adaptec controller.

Please note that basic installation/distribution is the same since the VIA
setup.
We are a bit astonished since we expected serverworks-based hardware to perform
_better_ than VIA...
The email you commented on is only a small hint that within -pre5 there are still
declared-unknown parts of the chipset. Based on the theory that they are named
"unknown" because nobody around here knows them, it might have been an adequate
idea to ask someone from serverworks, or not? This is in no way meant to be offensive.

--
Regards,
Stephan

2003-03-03 14:32:12

by Alan Cox

Subject: Re: Tighten up serverworks workaround.

> We are a bit astonished since we expected serverworks-based hardware to perform
> _better_ than VIA...

My experience is that in general it does.

> The email you commented on is only a small hint that within -pre5 there are still
> declared-unknown parts of the chipset. Based on the theory that they are named
> "unknown" because nobody around here knows them, it might have been an adequate
> idea to ask someone from serverworks, or not? This is in no way meant to be offensive.

Sure, but let's not give senior folks at Serverworks a full blast of l/k.
It's better to summarise the issues. In some cases vendors do have docs,
so the unknown device ids missing from lspci for example can be dealt with
outside already.