2004-06-12 22:01:04

by Roland Dreier

Subject: [PATCH] Fix ppc64 out_be64

I notice that the latest BK has a fix for the bugs in ppc64's out_le64
that I just spent a few hours tracking down in 2.6.6, but out_be64 is
still broken.

This patch fixes two problems with out_be64:
- The val parameter has to be unsigned long (not int), since it's 64 bits.
- Since we're passing *addr into the asm as an output parameter, we
should just use %0 instead of 0(%0) -- what's written won't even
compile.
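
The first point can be illustrated in portable C. This is only a sketch, not kernel code: store_int_val and store_ulong_val are hypothetical stand-ins for the old and new out_be64 prototypes, uint64_t is assumed in place of ppc64's 64-bit unsigned long, and a plain assignment replaces the std instruction.

```c
#include <stdint.h>

/* Hypothetical stand-in for the old prototype: val is an int, so a 64-bit
 * argument is narrowed at the call site, before the body ever runs. */
static inline void store_int_val(volatile uint64_t *addr, int val)
{
	*addr = (uint64_t)val;	/* only the narrowed value is left to store */
}

/* Hypothetical stand-in for the fixed prototype: val keeps all 64 bits. */
static inline void store_ulong_val(volatile uint64_t *addr, uint64_t val)
{
	*addr = val;
}
```

With gcc, passing 0x1122334455667788 through store_int_val loses the upper 32 bits of the argument, while store_ulong_val stores the value unchanged.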

My ppc64 asm skills are nearly nonexistent but I'm pretty sure this
fix is needed and correct.

Thanks,
Roland

Signed-off-by: Roland Dreier <[email protected]>

===== include/asm-ppc64/io.h 1.18 vs edited =====
--- 1.18/include/asm-ppc64/io.h 2004-05-21 00:50:11 -07:00
+++ edited/include/asm-ppc64/io.h 2004-06-12 14:55:49 -07:00
@@ -356,9 +356,9 @@
: "=&r" (tmp) , "=&r" (val) : "1" (val) , "b" (addr) , "m" (*addr));
}

-static inline void out_be64(volatile unsigned long *addr, int val)
+static inline void out_be64(volatile unsigned long *addr, unsigned long val)
{
- __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
+ __asm__ __volatile__("std %1,%0; sync" : "=m" (*addr) : "r" (val));
}

#ifndef CONFIG_PPC_ISERIES


2004-06-13 15:54:20

by Benjamin Herrenschmidt

Subject: Re: [PATCH] Fix ppc64 out_be64


> -static inline void out_be64(volatile unsigned long *addr, int val)
> +static inline void out_be64(volatile unsigned long *addr, unsigned long val)
> {
> - __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
> + __asm__ __volatile__("std %1,%0; sync" : "=m" (*addr) : "r" (val));
> }

Ugh ? The syntax of std is std rS, ds(rA), so your fix doesn't look good to me,
and it definitely builds with the current syntax, though I agree the type
is indeed wrong. I also spotted another bug where we forgot to change an
eieio into sync in there though.

Does this totally untested patch work for you?

===== include/asm-ppc64/io.h 1.18 vs edited =====
--- 1.18/include/asm-ppc64/io.h 2004-05-21 02:50:11 -05:00
+++ edited/include/asm-ppc64/io.h 2004-06-12 19:01:41 -05:00
@@ -307,7 +307,7 @@

static inline void out_be32(volatile unsigned *addr, int val)
{
- __asm__ __volatile__("stw%U0%X0 %1,%0; eieio"
+ __asm__ __volatile__("stw%U0%X0 %1,%0; sync"
: "=m" (*addr) : "r" (val));
}

@@ -358,7 +358,7 @@

static inline void out_be64(volatile unsigned long *addr, int val)
{
- __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
+ __asm__ __volatile__("std%U0%X0 %1,%0; sync" : "=m" (*addr) : "r" (val));
}

#ifndef CONFIG_PPC_ISERIES


2004-06-13 16:23:31

by Anton Blanchard

Subject: Re: [PATCH] Fix ppc64 out_be64


Hi,

> Ugh ? The syntax of std is std rS, ds(rA), so your fix doesn't look
> good to me, and it definitely builds with the current syntax, though I
> agree the type is indeed wrong. I also spotted another bug where we
> forgot to change an eieio into sync in there though.
>
> Does this totally untested patch work for you?

It would be nice to make val unsigned long too :)

Anton

> @@ -358,7 +358,7 @@
>
> static inline void out_be64(volatile unsigned long *addr, int val)
> {
> - __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
> + __asm__ __volatile__("std%U0%X0 %1,%0; sync" : "=m" (*addr) : "r" (val));
> }
>
> #ifndef CONFIG_PPC_ISERIES

2004-06-13 16:48:10

by Roland Dreier

Subject: Re: [PATCH] Fix ppc64 out_be64

Benjamin> Ugh ? The syntax of std is std rS, ds(rA), so your fix
Benjamin> doesn't look good to me, and it definitely builds with
Benjamin> the current syntax, though I agree the type is indeed
Benjamin> wrong. I also spotted another bug where we forgot to
Benjamin> change an eieio into sync in there though.

Although the kernel builds, it's only because no one actually uses
out_be64. You can try the old version and see:

$ cat foo.c
static inline void out_be64(volatile unsigned long *addr, unsigned long val)
{
	__asm__ __volatile__("std %1,0(%0); eieio" : "=m" (*addr) : "r" (val));
}

void foo(void *x, unsigned long y)
{
	out_be64(x, y);
}
$ ${CROSS_COMPILE}gcc -save-temps -c foo.c
foo.s: Assembler messages:
foo.s:49: Error: syntax error; found `(' but expected `)'
foo.s:49: Error: junk at end of line: `(9))'

Looking at foo.s, it's pretty obvious that %0 is already in the ds(rA)
form, and adding 0() around it breaks things. out_be64 expands to:

#APP
std 0,0(0(9)); eieio
#NO_APP
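
The failure above can be reproduced with a toy model of operand substitution. This is only a sketch under stated assumptions: expand_template is a hypothetical helper, not how gcc is implemented, and it handles only plain %N operands (no modifiers such as %U0 or %X0). The operand strings "0(9)" and "0" are taken from the foo.s output above.

```c
#include <stdio.h>
#include <string.h>

/* Toy model, not gcc's real implementation: replace each "%N" (N a single
 * digit below nops) in tmpl with ops[N], writing the result into out. */
static inline void expand_template(const char *tmpl, const char *const ops[],
				   size_t nops, char *out, size_t outlen)
{
	size_t used = 0;

	if (outlen == 0)
		return;
	out[0] = '\0';

	while (*tmpl && used + 1 < outlen) {
		if (tmpl[0] == '%' && tmpl[1] >= '0' && tmpl[1] <= '9' &&
		    (size_t)(tmpl[1] - '0') < nops) {
			/* Paste the operand text exactly as the compiler printed it. */
			used += (size_t)snprintf(out + used, outlen - used,
						 "%s", ops[tmpl[1] - '0']);
			tmpl += 2;
		} else {
			/* Everything else is copied through unchanged. */
			out[used++] = *tmpl++;
			out[used] = '\0';
		}
	}
}
```

With ops = { "0(9)", "0" }, the template "std %1,0(%0); eieio" expands to the rejected "std 0,0(0(9)); eieio", whereas "std %1,%0; sync" expands to "std 0,0(9); sync".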

It's possible this is an artifact of my cross-toolchain (gcc
3.4.0/binutils 2.15 built with Dan Kegel's crosstool).

Benjamin> Does this totally untested patch work for you?

Yes, that looks fine (after fixing val to be unsigned long in
out_be64). You know infinitely more about ppc64 asm than I do so I'm
sure your version is better.

Thanks,
Roland

2004-06-13 17:13:57

by Benjamin Herrenschmidt

Subject: Re: [PATCH] Fix ppc64 out_be64

On Sun, 2004-06-13 at 11:21, Anton Blanchard wrote:
> Hi,
>
> > Ugh ? The syntax of std is std rS, ds(rA), so your fix doesn't look
> > good to me, and it definitely builds with the current syntax, though I
> > agree the type is indeed wrong. I also spotted another bug where we
> > forgot to change an eieio into sync in there though.
> >
> > Does this totally untested patch work for you?
>
> It would be nice to make val unsigned long too :)

Oops... that's what happens with patches written before breakfast ;)

Ben.


2004-06-13 17:17:30

by Benjamin Herrenschmidt

Subject: Re: [PATCH] Fix ppc64 out_be64


> Yes, that looks fine (after fixing val to be unsigned long in
> out_be64). You know infinitely more about ppc64 asm than I do so I'm
> sure your version is better.

Well, I may know ppc asm, but gcc inline asm still drives me nuts :)

Here's a fixed version (Andrew, please apply)

----

This patch fixes the out_be64 implementation on ppc64, along with a glitch in
out_be32 (inconsistent use of the barrier).

Signed-off-by: Roland Dreier <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>

===== include/asm-ppc64/io.h 1.18 vs edited =====
--- 1.18/include/asm-ppc64/io.h 2004-05-21 02:50:11 -05:00
+++ edited/include/asm-ppc64/io.h 2004-06-13 12:09:16 -05:00
@@ -307,7 +307,7 @@

static inline void out_be32(volatile unsigned *addr, int val)
{
- __asm__ __volatile__("stw%U0%X0 %1,%0; eieio"
+ __asm__ __volatile__("stw%U0%X0 %1,%0; sync"
: "=m" (*addr) : "r" (val));
}

@@ -356,9 +356,9 @@
: "=&r" (tmp) , "=&r" (val) : "1" (val) , "b" (addr) , "m" (*addr));
}

-static inline void out_be64(volatile unsigned long *addr, int val)
+static inline void out_be64(volatile unsigned long *addr, unsigned long val)
{
- __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
+ __asm__ __volatile__("std%U0%X0 %1,%0; sync" : "=m" (*addr) : "r" (val));
}

#ifndef CONFIG_PPC_ISERIES


2004-06-14 20:28:59

by Roland Dreier

Subject: Re: [PATCH] Fix ppc64 out_be64

Benjamin> Well, I may know ppc asm, but gcc inline asm still
Benjamin> drives me nuts :)

Speaking of gcc asm, is there a reason why out_le64 (specifically the
constraints) isn't written in this (simpler) way? It seems to me we
can just let val be an input, as long as the "&" constraint for tmp
makes sure it doesn't share the same register. This seems to generate
the same code for me as the current kernel version, at least with gcc
3.4.0/binutils 2.15.

static inline void out_le64(volatile unsigned long *addr, unsigned long val)
{
unsigned long tmp;

__asm__ __volatile__(
"rldimi %0,%2,5*8,1*8\n"
"rldimi %0,%2,3*8,2*8\n"
"rldimi %0,%2,1*8,3*8\n"
"rldimi %0,%2,7*8,4*8\n"
"rldicl %2,%2,32,0\n"
"rlwimi %0,%2,8,8,31\n"
"rlwimi %0,%2,24,16,23\n"
"std %0,%1\n"
"sync"
: "=&r" (tmp), "=m" (*addr) : "r" (val));
}
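
For reference, the rotate-and-insert sequence is meant to leave a byte-reversed copy of val in tmp before the store. The portable C below is only a sketch of that intended result, not a translation of the instructions; swab64_sketch is a hypothetical name and uint64_t is assumed in place of ppc64's 64-bit unsigned long.

```c
#include <stdint.h>

/* Portable byte reversal of a 64-bit value: peel bytes off the low end of
 * val and push them onto the low end of out, so the order is reversed. */
static inline uint64_t swab64_sketch(uint64_t val)
{
	uint64_t out = 0;
	int i;

	for (i = 0; i < 8; i++) {
		out = (out << 8) | (val & 0xff);
		val >>= 8;
	}
	return out;
}
```

Under this sketch, 0x0102030405060708 becomes 0x0807060504030201, and applying the function twice returns the original value.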

2004-06-14 21:16:06

by Benjamin Herrenschmidt

Subject: Re: [PATCH] Fix ppc64 out_be64

On Mon, 2004-06-14 at 15:26, Roland Dreier wrote:
> Benjamin> Well, I may know ppc asm, but gcc inline asm still
> Benjamin> drives me nuts :)
>
> Speaking of gcc asm, is there a reason why out_le64 (specifically the
> constraints) isn't written in this (simpler) way? It seems to me we
> can just let val be an input, as long as the "&" constraint for tmp
> makes sure it doesn't share the same register. This seems to generate
> the same code for me as the current kernel version, at least with gcc
> 3.4.0/binutils 2.15.

Hrm... and addr too .. well, I'm just paranoid about those constraints,
I never took the time to fully understand how gcc deals with them
and got bitten by them often enough. I'd rather keep the version that
just works at this point ;)

Ben.