I notice that the latest BK has a fix for the bugs in ppc64's out_le64
that I just spent a few hours tracking down in 2.6.6, but out_be64 is
still broken.
This patch fixes two problems with out_be64:
- The val parameter has to be unsigned long (not int), since it's 64 bits.
- Since we're passing *addr into the asm as an output parameter, we
should just use %0 instead of 0(%0) -- what's written won't even
compile.
My ppc64 asm skills are nearly nonexistent but I'm pretty sure this
fix is needed and correct.
Thanks,
Roland
Signed-off-by: Roland Dreier <[email protected]>
===== include/asm-ppc64/io.h 1.18 vs edited =====
--- 1.18/include/asm-ppc64/io.h 2004-05-21 00:50:11 -07:00
+++ edited/include/asm-ppc64/io.h 2004-06-12 14:55:49 -07:00
@@ -356,9 +356,9 @@
: "=&r" (tmp) , "=&r" (val) : "1" (val) , "b" (addr) , "m" (*addr));
}
-static inline void out_be64(volatile unsigned long *addr, int val)
+static inline void out_be64(volatile unsigned long *addr, unsigned long val)
{
- __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
+ __asm__ __volatile__("std %1,%0; sync" : "=m" (*addr) : "r" (val));
}
#ifndef CONFIG_PPC_ISERIES
> -static inline void out_be64(volatile unsigned long *addr, int val)
> +static inline void out_be64(volatile unsigned long *addr, unsigned long val)
> {
> - __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
> + __asm__ __volatile__("std %1,%0; sync" : "=m" (*addr) : "r" (val));
> }
Ugh ? The syntax of std is std rS, ds(rA), so your fix doesn't look good to me,
and it definitely builds with the current syntax, though I agree the type
is indeed wrong. I also spotted another bug where we forgot to change an
eieio into sync in there though.
Does this totally untested patch works for you ?
===== include/asm-ppc64/io.h 1.18 vs edited =====
--- 1.18/include/asm-ppc64/io.h 2004-05-21 02:50:11 -05:00
+++ edited/include/asm-ppc64/io.h 2004-06-12 19:01:41 -05:00
@@ -307,7 +307,7 @@
static inline void out_be32(volatile unsigned *addr, int val)
{
- __asm__ __volatile__("stw%U0%X0 %1,%0; eieio"
+ __asm__ __volatile__("stw%U0%X0 %1,%0; sync"
: "=m" (*addr) : "r" (val));
}
@@ -358,7 +358,7 @@
static inline void out_be64(volatile unsigned long *addr, int val)
{
- __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
+ __asm__ __volatile__("std%U0%X0 %1,%0; sync" : "=m" (*addr) : "r" (val));
}
#ifndef CONFIG_PPC_ISERIES
Hi,
> Ugh ? The syntax of std is std rS, ds(rA), so your fix doesn't look
> good to me, and it definitely builds with the current syntax, though I
> agree the type is indeed wrong. I also spotted another bug where we
> forgot to change an eieio into sync in there though.
>
> Does this totally untested patch works for you ?
It would be nice to make val unsigned long too :)
Anton
> @@ -358,7 +358,7 @@
>
> static inline void out_be64(volatile unsigned long *addr, int val)
> {
> - __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
> + __asm__ __volatile__("std%U0%X0 %1,%0; sync" : "=m" (*addr) : "r" (val));
> }
>
> #ifndef CONFIG_PPC_ISERIES
Benjamin> Ugh ? The syntax of std is std rS, ds(rA), so your fix
Benjamin> doesn't look good to me, and it definitely builds with
Benjamin> the current syntax, though I agree the type is indeed
Benjamin> wrong. I also spotted another bug where we forgot to
Benjamin> change an eieio into sync in there though.
Although the kernel builds, it's only because no one actually uses
out_be64. You can try the old version and see:
> cat foo.c
static inline void out_be64(volatile unsigned long *addr, unsigned
long val)
{
__asm__ __volatile__("std %1,0(%0); eieio" : "=m" (*addr) :
"r" (val));
}
void foo(void *x, unsigned long y)
{
out_be64(x, y);
}
$ ${CROSS_COMPILE}gcc -save-temps -c foo.c
foo.s: Assembler messages:
foo.s:49: Error: syntax error; found `(' but expected `)'
foo.s:49: Error: junk at end of line: `(9))'
Looking at foo.s, it's pretty obvious that %0 is already in the ds(rA)
form, and adding 0() around it breaks things. out_be64 expands to:
#APP
std 0,0(0(9)); eieio
#NO_APP
It's possible this is an artifact of my cross-toolchain (gcc
3.4.0/binutils 2.15 built with Dan Kegel's crosstool),
Benjamin> Does this totally untested patch works for you ?
Yes, that looks fine (after fixing val to be unsigned long in
out_be64). You know infinitely more about ppc64 asm than I do so I'm
sure your version is better.
Thanks,
Roland
On Sun, 2004-06-13 at 11:21, Anton Blanchard wrote:
> Hi,
>
> > Ugh ? The syntax of std is std rS, ds(rA), so your fix doesn't look
> > good to me, and it definitely builds with the current syntax, though I
> > agree the type is indeed wrong. I also spotted another bug where we
> > forgot to change an eieio into sync in there though.
> >
> > Does this totally untested patch works for you ?
>
> It would be nice to make val unsigned long too :)
Ooops... that what happens with patches written before breakfast ;)
Ben.
> Yes, that looks fine (after fixing val to be unsigned long in
> out_be64). You know infinitely more about ppc64 asm than I do so I'm
> sure your version is better.
Well, I may know ppc asm, but gcc inline asm still drives me nuts :)
Here's a fixed version (Andrew, please apply)
----
Patch fixes out_be64 implementation on ppc64 along with a glich in
out_be32 (inconsistent) use of barrier.
Signed-off-by: Roland Dreier <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
===== include/asm-ppc64/io.h 1.18 vs edited =====
--- 1.18/include/asm-ppc64/io.h 2004-05-21 02:50:11 -05:00
+++ edited/include/asm-ppc64/io.h 2004-06-13 12:09:16 -05:00
@@ -307,7 +307,7 @@
static inline void out_be32(volatile unsigned *addr, int val)
{
- __asm__ __volatile__("stw%U0%X0 %1,%0; eieio"
+ __asm__ __volatile__("stw%U0%X0 %1,%0; sync"
: "=m" (*addr) : "r" (val));
}
@@ -356,9 +356,9 @@
: "=&r" (tmp) , "=&r" (val) : "1" (val) , "b" (addr) , "m" (*addr));
}
-static inline void out_be64(volatile unsigned long *addr, int val)
+static inline void out_be64(volatile unsigned long *addr, unsigned long val)
{
- __asm__ __volatile__("std %1,0(%0); sync" : "=m" (*addr) : "r" (val));
+ __asm__ __volatile__("std%U0%X0 %1,%0; sync" : "=m" (*addr) : "r" (val));
}
#ifndef CONFIG_PPC_ISERIES
Benjamin> Well, I may know ppc asm, but gcc inline asm still
Benjamin> drives me nuts :)
Speaking of gcc asm, is there a reason why out_le64 (specifically the
constraints) isn't written in this (simpler) way? It seems to me we
can just let val be an input, as long as the "&" constraint for tmp
makes sure it doesn't share the same register. This seems to generate
the same code for me as the current kernel version, at least with gcc
3.4.0/binutils 2.15.
static inline void out_le64(volatile unsigned long *addr, unsigned long val)
{
unsigned long tmp;
__asm__ __volatile__(
"rldimi %0,%2,5*8,1*8\n"
"rldimi %0,%2,3*8,2*8\n"
"rldimi %0,%2,1*8,3*8\n"
"rldimi %0,%2,7*8,4*8\n"
"rldicl %2,%2,32,0\n"
"rlwimi %0,%2,8,8,31\n"
"rlwimi %0,%2,24,16,23\n"
"std %0,%1\n"
"sync"
: "=&r" (tmp), "=m" (*addr) : "r" (val));
}
On Mon, 2004-06-14 at 15:26, Roland Dreier wrote:
> Benjamin> Well, I may know ppc asm, but gcc inline asm still
> Benjamin> drives me nuts :)
>
> Speaking of gcc asm, is there a reason why out_le64 (specifically the
> constraints) isn't written in this (simpler) way? It seems to me we
> can just let val be an input, as long as the "&" constraint for tmp
> makes sure it doesn't share the same register. This seems to generate
> the same code for me as the current kernel version, at least with gcc
> 3.4.0/binutils 2.15.
Hrm... and addr too .. well, I'm just paranoid about those constraints,
I never took the time to fully understand how gcc deals with them
and got bitten by them often enough. I'd rather keep the version that
just works at this point ;)
Ben.