2007-02-16 22:48:27

by Lennart Sorensen

Subject: MediaGX/GeodeGX1 requires X86_OOSTORE. (Was: Re: Strange connection slowdown on pcnet32)

On Fri, Feb 16, 2007 at 05:27:28PM -0500, Lennart Sorensen wrote:
> On Fri, Feb 16, 2007 at 04:01:57PM -0500, Lennart Sorensen wrote:
> > It seems whenever it gets stuck, it is always the same descriptor it is
> > stuck on. Here is my current log:
> >
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0340
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: 0310 next->status: 0340
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: pcnet32_poll: pcnet32_rx() got 16 packets
> > eth1: base: 0x05215812 status: 0310 next->status: 0310
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: netif_receive_skb(skb)
> > eth1: pcnet32_poll: pcnet32_rx() got 16 packets
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0310
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x6f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0310
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0033, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0310
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> > eth1: interrupt csr0=0x4f3 new csr=0x33, csr3=0x0000.
> > eth1: exiting interrupt, csr0=0x0433, csr3=0x5f00.
> > eth1: base: 0x04c51812 status: ffff8000 next->status: 0310
> > eth1: pcnet32_poll: pcnet32_rx() got 0 packets
> >
> > So somehow it ends up that when it reads the status of the descriptor at
> > address 0x04c51812, it sees the status as 0x8000 (which means owned by
> > the MAC I believe), even though the next descriptor in the ring has a
> > sensible status, indicating that the descriptor is ready to be handled
> > by the driver. Since the descriptor isn't ready, we exit without
> > handling anything and NAPI reschedules us the next time we get an
> > interrupt, and after some random number of tries, we finally see the
> > right status and handle the packet, along with a bunch of other packets
> > waiting in the descriptor ring. Then we seem to hit the exact same
> > descriptor address again, with the same problem in the status we read,
> > and again we are stuck for a while, until finally we see the right
> > status, and another pile of packets get handled, and we again hit the
> > same descriptor address and get stuck.
> >
> > I believe doing ifconfig eth1 down;ifconfig eth1 up actually
> > reinitialized the port with a new descriptor ring, but I could have got
> > that part wrong looking at the code. Either way it clears things up
> > again for a while, until we get stuck again.
> >
> > It makes me think the cpu somehow reads the memory location when the
> > descriptor ring is empty, notes it is owned by the MAC, and stops
> > handling packets (as it should), but then when the next receive occurs,
> > it doesn't read physical memory again for some odd reason, and sees the
> > previous status rather than the updated status.
> >
> > Now for some reason it seems when the driver is not using NAPI, and
> > hence not enabling and disabling interrupt masks, this problem somehow
> > never occurs (well at least I haven't managed to make it occur yet, so
> > who knows for sure. Certainly with NAPI enabled I can kill it in
> > seconds).
> >
> > Is there a way I can flush the CPU cache and force it to reread memory
> > for an address range, or for all cache, or some other way to ensure the
> > memory value I see is really the memory value that is in system memory
> > for a given address?
>
> It seems so far that if I change Kconfig.cpu and force it to enable
> X86_OOSTORE for the MGEODEGX1 then things behave properly. Perhaps the
> out of order memory access crap the MediaGX/Geode processors do is
> having some effect, and enabling the code for wmb, etc, actually does
> something.

Well so far it really looks like enabling OOSTORE on the Geode
SC1200/GX1 really does make a difference. A bit of searching seems to
indicate the person that originally submitted the patch that enabled
load/store reordering on the MediaGX/Geode thought it might need OOSTORE,
but was convinced by others it didn't. Looks like it really does need
it. The failure that occurred before within a few seconds of starting a
large transfer no longer happens, and all I did was enable
CONFIG_X86_OOSTORE, recompile pcnet32.ko and load the new module on
the running system. Moving back to the pcnet32.ko built without OOSTORE
enabled hits the failure again within seconds, until ifconfig eth1
down/up reinitializes its descriptor ring, after which it survives
another bit of transfer and then fails again.

I hate this CPU.

--
Len Sorensen


2007-02-17 00:00:22

by Lennart Sorensen

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE. (Was: Re: Strange connection slowdown on pcnet32)

On Fri, Feb 16, 2007 at 05:48:24PM -0500, Lennart Sorensen wrote:
> Well so far it really looks like enabling OOSTORE on the Geode
> SC1200/GX1 really does make a difference. A bit of searching seems to
> indicate the person that originally submitted the patch that enabled
> load/store reordering on the MediaGX/Geode thought it might need OOSTORE,
> but was convinced by others it didn't. Looks like it really does need
> it. The failure that occurred before within a few seconds of starting a
> large transfer no longer happens, and all I did was enable
> CONFIG_X86_OOSTORE, recompile pcnet32.ko and load the new module on
> the running system. Moving back to the pcnet32.ko built without OOSTORE
> enabled hits the failure again within seconds, until ifconfig eth1
> down/up reinitializes its descriptor ring, after which it survives
> another bit of transfer and then fails again.

Well forcing load/store serialize on the CPU doesn't help, and disabling
memory bypass doesn't help. Enabling X86_OOSTORE does help. What a
stupid CPU design.

So far nothing has managed to fix the __memcpy_toio in the jsm driver
getting data out of order when sending on an exar pci uart chip. Only
calling memcpy with one byte at a time seems to work there. Works fine
on every other cpu of course. What else am I going to discover is wrong
with this CPU?
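
The byte-at-a-time workaround mentioned above can be sketched roughly as
follows. This is only an illustration, not the jsm driver's code: the helper
name copy_bytewise_toio is hypothetical, and in kernel code the stores would
normally go through writeb() on an __iomem pointer rather than a plain
volatile pointer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not from the jsm driver): copy a
 * buffer to a device window one byte at a time, in ascending address
 * order. The volatile qualifier keeps the compiler from merging or
 * reordering the individual stores; it does not by itself constrain
 * what the CPU does with them.
 */
static void copy_bytewise_toio(volatile uint8_t *dst, const uint8_t *src,
                               size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        dst[i] = src[i];
}
```

Issuing one store per byte gives up any speed benefit of wider writes, which
is presumably why it is only a workaround here.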

--
Len Sorensen

2007-02-17 14:12:06

by TAKADA Yoshihito

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

From: Lennart Sorensen
Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE. (Was: Re: Strange connection slowdown on pcnet32)
Date: Fri, 16 Feb 2007 19:00:19 -0500

> On Fri, Feb 16, 2007 at 05:48:24PM -0500, Lennart Sorensen wrote:
> > Well so far it really looks like enabling OOSTORE on the Geode
> > SC1200/GX1 really does make a difference. A bit of searching seems to
> > indicate the person that originally submitted the patch that enabled
> > load/store reordering on the MediaGX/Geode thought it might need OOSTORE,
> > but was convinced by others it didn't. Looks like it really does need
> > it. The failure that occurred before within a few seconds of starting a
> > large transfer no longer happens, and all I did was enable
> > CONFIG_X86_OOSTORE, recompile pcnet32.ko and load the new module on
> > the running system. Moving back to the pcnet32.ko built without OOSTORE
> > enabled hits the failure again within seconds, until ifconfig eth1
> > down/up reinitializes its descriptor ring, after which it survives
> > another bit of transfer and then fails again.
>
> Well forcing load/store serialize on the CPU doesn't help, and disabling
> memory bypass doesn't help. Enabling X86_OOSTORE does help. What a
> stupid CPU design.

Do you mean it doesn't help even if set_cx86_reorder() isn't called?
That function disables reordering for 0x4000:0000 to 0xffff:ffff.
Does pcnet32 access memory outside that range?

--- arch/i386/Kconfig.cpu~ 2007-02-05 03:44:54.000000000 +0900
+++ arch/i386/Kconfig.cpu 2007-02-17 21:25:52.000000000 +0900
@@ -322,7 +322,7 @@ config X86_USE_3DNOW

 config X86_OOSTORE
 	bool
-	depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR
+	depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR || MGEODEGX1
 	default y

 config X86_TSC

2007-02-17 15:07:22

by Lennart Sorensen

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

On Sat, Feb 17, 2007 at 11:11:13PM +0900, takada wrote:
> Do you mean it doesn't help even if set_cx86_reorder() isn't called?
> That function disables reordering for 0x4000:0000 to 0xffff:ffff.
> Does pcnet32 access memory outside that range?

No it is accessing system memory by DMA to transfer frames. Since the
system has 128MB ram, the addresses are probably all in the first 128MB
range.

I tried changing cyrix.c to explicitly set the serialize bit (0x8000 in
PCR0) rather than explicitly clearing it as is done now. Didn't make a
difference. I tried reversing the memory bypass setting, which also did
nothing. Enabling CONFIG_X86_OOSTORE and recompiling however does make
a difference.

> --- arch/i386/Kconfig.cpu~ 2007-02-05 03:44:54.000000000 +0900
> +++ arch/i386/Kconfig.cpu 2007-02-17 21:25:52.000000000 +0900
> @@ -322,7 +322,7 @@ config X86_USE_3DNOW
>
> config X86_OOSTORE
> bool
> - depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR
> + depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR || MGEODEGX1
> default y
>
> config X86_TSC

I did:
depends on ((MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR) || MGEODEGX1
since I wasn't sure of the precedence in the Kconfig files.

--
Len Sorensen

2007-02-19 14:55:17

by Lennart Sorensen

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

On Sat, Feb 17, 2007 at 11:11:13PM +0900, takada wrote:
> Do you mean it doesn't help even if set_cx86_reorder() isn't called?
> That function disables reordering for 0x4000:0000 to 0xffff:ffff.
> Does pcnet32 access memory outside that range?
>
> --- arch/i386/Kconfig.cpu~ 2007-02-05 03:44:54.000000000 +0900
> +++ arch/i386/Kconfig.cpu 2007-02-17 21:25:52.000000000 +0900
> @@ -322,7 +322,7 @@ config X86_USE_3DNOW
>
> config X86_OOSTORE
> bool
> - depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR
> + depends on (MWINCHIP3D || MWINCHIP2 || MWINCHIPC6) && MTRR || MGEODEGX1
> default y
>
> config X86_TSC

Well it turns out that enabling OOSTORE doesn't eliminate the problem,
but it does make it go from occurring within seconds to occurring within
many hours. I am off to investigate some more.

Does anyone know if there is any way to flush a cache line of the cpu to
force rereading system memory for a given address or address range?

--
Len Sorensen

2007-02-19 19:48:32

by Roland Dreier

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

> Does anyone know if there is any way to flush a cache line of the cpu to
> force rereading system memory for a given address or address range?

There is the "clflush" instruction, but not all x86 CPUs support it.
You need to check the CPUID flag to know for sure (/proc/cpuinfo will
show a "clflush" flag if it is supported).
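
As a rough illustration of the /proc/cpuinfo check, the sketch below tests
whether a given flag appears as a whole word in a flags string. The helper
name has_cpu_flag is hypothetical, and the caller is assumed to have already
extracted the text after "flags :" from /proc/cpuinfo.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (assumption): return true if `flag` appears as a
 * whole whitespace-separated token in a /proc/cpuinfo style flags
 * string. A substring hit such as "mmx" inside "cxmmx" does not count.
 */
static bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    if (n == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[n] == '\0' || p[n] == ' ' ||
                    p[n] == '\t' || p[n] == '\n';

        if (starts && ends)
            return true;
        p++;    /* keep scanning after a partial match */
    }
    return false;
}
```

Calling has_cpu_flag(flags, "clflush") on the flags line is the test that
matters here; other lines of /proc/cpuinfo are not consulted by this sketch.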

2007-02-19 19:57:08

by Lennart Sorensen

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

On Mon, Feb 19, 2007 at 11:48:27AM -0800, Roland Dreier wrote:
> > Does anyone know if there is any way to flush a cache line of the cpu to
> > force rereading system memory for a given address or address range?
>
> There is the "clflush" instruction, but not all x86 CPUs support it.
> You need to check the CPUID flag to know for sure (/proc/cpuinfo will
> show a "clflush" flag if it is supported).

Well I will check for that. Of course it is still possible that it is
actually the network chip screwing up somehow.

--
Len Sorensen

2007-02-19 23:58:48

by TAKADA Yoshihito

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

From: Roland Dreier
Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.
Date: Mon, 19 Feb 2007 11:48:27 -0800

> > Does anyone know if there is any way to flush a cache line of the cpu to
> > force rereading system memory for a given address or address range?
>
> There is the "clflush" instruction, but not all x86 CPUs support it.
> You need to check the CPUID flag to know for sure (/proc/cpuinfo will
> show a "clflush" flag if it is supported).

/proc/cpuinfo with MediaGXm :

processor : 0
vendor_id : CyrixInstead
cpu family : 5
model : 5
model name : Cyrix MediaGXtm MMXtm Enhanced
stepping : 2
cpu MHz : 199.750
cache size : 16 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu tsc msr cx8 cmov mmx cxmmx
bogomips : 401.00
clflush size : 32

2007-02-20 00:03:20

by Lennart Sorensen

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

On Tue, Feb 20, 2007 at 08:56:39AM +0900, takada wrote:
> /proc/cpuinfo with MediaGXm :
>
> processor : 0
> vendor_id : CyrixInstead
> cpu family : 5
> model : 5
> model name : Cyrix MediaGXtm MMXtm Enhanced
> stepping : 2
> cpu MHz : 199.750
> cache size : 16 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu tsc msr cx8 cmov mmx cxmmx
> bogomips : 401.00
> clflush size : 32

Hmm with 2.6.18 I am seeing:

processor : 0
vendor_id : CyrixInstead
cpu family : 5
model : 9
model name : Geode(TM) Integrated Processor by National Semi
stepping : 1
cpu MHz : 266.648
cache size : 16 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu tsc msr cx8 cmov mmx cxmmx
bogomips : 534.50

Similar, but the last line isn't there. It looks like 2.6.18 doesn't
actually have code to print that information though.

--
Len Sorensen

2007-02-20 11:35:21

by TAKADA Yoshihito

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

From: Lennart Sorensen
Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.
Date: Mon, 19 Feb 2007 19:02:31 -0500

> On Tue, Feb 20, 2007 at 08:56:39AM +0900, takada wrote:
> > /proc/cpuinfo with MediaGXm :

:

> > flags : fpu tsc msr cx8 cmov mmx cxmmx
> > bogomips : 401.00
> > clflush size : 32
>
> Hmm with 2.6.18 I am seeing:

I posted output from 2.6.20 with X86_OOSTORE enabled.
The "clflush size" line is in /proc/cpuinfo, but clflush is not in the flags line.

BTW, can we use the WBINVD instruction? I have only compile-tested this.
Do you know a way to select this dynamically, without #ifdef, when running
on MediaGX/GeodeGX?

diff -Narup a/include/asm-i386/io.h b/include/asm-i386/io.h
--- a/include/asm-i386/io.h 2007-02-20 16:23:25.000000000 +0900
+++ b/include/asm-i386/io.h 2007-02-20 17:07:14.000000000 +0900
@@ -232,7 +232,19 @@ static inline void memcpy_toio(volatile
  * 2. Accidentally out of order processors (PPro errata #51)
  */

-#if defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
+#ifdef CONFIG_MGEODEGX1
+
+static inline void dma_flush_cache(void)
+{
+	__asm__ __volatile__ ("wbinvd": : :"memory");
+}
+
+#define dma_cache_inv(_start,_size) dma_flush_cache()
+#define dma_cache_wback(_start,_size) dma_flush_cache()
+#define dma_cache_wback_inv(_start,_size) dma_flush_cache()
+#define flush_write_buffers()
+
+#elif defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)

 static inline void flush_write_buffers(void)
 {

2007-02-20 14:48:25

by Lennart Sorensen

Subject: Re: MediaGX/GeodeGX1 requires X86_OOSTORE.

On Tue, Feb 20, 2007 at 08:34:13PM +0900, takada wrote:
> I posted output from 2.6.20 with X86_OOSTORE enabled.
> The "clflush size" line is in /proc/cpuinfo, but clflush is not in the flags line.
>
> BTW, can we use the WBINVD instruction? I have only compile-tested this.
> Do you know a way to select this dynamically, without #ifdef, when running
> on MediaGX/GeodeGX?
>
> diff -Narup a/include/asm-i386/io.h b/include/asm-i386/io.h
> --- a/include/asm-i386/io.h 2007-02-20 16:23:25.000000000 +0900
> +++ b/include/asm-i386/io.h 2007-02-20 17:07:14.000000000 +0900
> @@ -232,7 +232,19 @@ static inline void memcpy_toio(volatile
> * 2. Accidentally out of order processors (PPro errata #51)
> */
>
> -#if defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
> +#ifdef CONFIG_MGEODEGX1
> +
> +static inline void dma_flush_cache(void)
> +{
> + __asm__ __volatile__ ("wbinvd": : :"memory");
> +}
> +
> +#define dma_cache_inv(_start,_size) dma_flush_cache()
> +#define dma_cache_wback(_start,_size) dma_flush_cache()
> +#define dma_cache_wback_inv(_start,_size) dma_flush_cache()
> +#define flush_write_buffers()
> +
> +#elif defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)
>
> static inline void flush_write_buffers(void)
> {
> -

Well it is starting to look like it isn't a caching issue, but more
likely an issue of which order writes are performed in. I think the MAC
might be seeing the ownership bit change before the rest of the
descriptor, which shouldn't happen. With X86_OOSTORE, wmb() is called
between setting the fields in the descriptor and setting the ownership
bit to the MAC. I still have to investigate a bit more to find out for
sure, but that could certainly explain why X86_OOSTORE makes the problem
become much less frequent. It doesn't completely eliminate it though.
Of course maybe there are two different problems with the same symptoms.
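
The ordering described above (fill in the descriptor fields, barrier, then
set the ownership bit) might look roughly like the userspace sketch below.
Everything in it is an assumption for illustration: struct rx_desc is a
simplified stand-in rather than the pcnet32 descriptor layout, give_to_mac is
a hypothetical name, DESC_OWN uses the 0x8000 value mentioned earlier in the
thread, and the C11 release fence only stands in for the kernel's wmb(); it
is not a substitute for it against a DMA device.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed ownership bit: set means the MAC owns the descriptor. */
#define DESC_OWN 0x8000u

/* Hypothetical, simplified receive descriptor; not the pcnet32 layout. */
struct rx_desc {
    uint32_t base;              /* buffer address */
    uint16_t buf_length;        /* buffer length field */
    volatile uint16_t status;   /* carries the ownership bit */
};

/*
 * Fill in the descriptor, then hand it to the MAC. The release fence
 * stands in for wmb(): the stores to base and buf_length are ordered
 * before the store to status that follows.
 */
static void give_to_mac(struct rx_desc *d, uint32_t base, uint16_t len)
{
    d->base = base;
    d->buf_length = len;
    atomic_thread_fence(memory_order_release);
    d->status = DESC_OWN;
}
```

If the barrier were missing or compiled out, nothing would stop the status
store from becoming visible first, which is the failure suspected here.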

--
Len Sorensen