Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
that are summed to obtain the target address. Using '%y0' argument
gives GCC the opportunity to use both registers instead of only one
with the second being forced to 0.
Suggested-by: Segher Boessenkool <[email protected]>
Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/cache.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 40ea5b3781c6..5a22a869a20b 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -85,22 +85,22 @@ extern void _set_L3CR(unsigned long);
static inline void dcbz(void *addr)
{
- __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
+ __asm__ __volatile__ ("dcbz %y0" : : "m"(*(u8 *)addr) : "memory");
}
static inline void dcbi(void *addr)
{
- __asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
+ __asm__ __volatile__ ("dcbi %y0" : : "m"(*(u8 *)addr) : "memory");
}
static inline void dcbf(void *addr)
{
- __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
+ __asm__ __volatile__ ("dcbf %y0" : : "m"(*(u8 *)addr) : "memory");
}
static inline void dcbst(void *addr)
{
- __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
+ __asm__ __volatile__ ("dcbst %y0" : : "m"(*(u8 *)addr) : "memory");
}
#endif /* !__ASSEMBLY__ */
#endif /* __KERNEL__ */
--
2.13.3
Hi Christophe,
On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
> Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
> that are summed to obtain the target address. Using '%y0' argument
> gives GCC the opportunity to use both registers instead of only one
> with the second being forced to 0.
That's not quite right. Sorry if I didn't explain it properly.
"m" allows all memory. But this instruction only allows reg,reg and
0,reg addressing. For that you need to use constraint "Z".
The output modifier "%y0" just makes [reg] (i.e. simple indirect addressing)
print as "0,reg" instead of "0(reg)" as it would by default (for just "%0").
Segher
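For illustration, here is a minimal sketch of how the "Z" suggestion above might be applied to one of the helpers. It is not the posted patch. The names dcbst_sketch, dcbst_range_sketch and LINE_SIZE_SKETCH are hypothetical, the u8 typedef stands in for the kernel type, and the non-PowerPC branch is only there so the sketch builds on other hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;             /* stand-in for the kernel's u8 */
#define LINE_SIZE_SKETCH 32     /* hypothetical cache line size in bytes */

static inline void dcbst_sketch(void *addr)
{
#if defined(__powerpc__) || defined(__powerpc64__)
	/*
	 * "Z" restricts the operand to indexed (reg,reg) or indirect (reg)
	 * addressing; "%y0" prints the indirect case as "0,reg".
	 */
	__asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
#else
	(void)addr;             /* no-op so the sketch compiles elsewhere */
#endif
}

/*
 * Write back each cache line of buf[0..len) and return how many lines
 * were touched. Assumes buf is aligned to LINE_SIZE_SKETCH.
 */
static size_t dcbst_range_sketch(void *buf, size_t len)
{
	u8 *p = buf;
	size_t n = 0;

	assert(buf != NULL);
	for (size_t off = 0; off < len; off += LINE_SIZE_SKETCH, n++)
		dcbst_sketch(p + off);  /* p + off may be emitted as reg,reg */
	return n;
}
```

In a loop like this, the compiler is free to keep the base and the offset in separate registers and pass both to the instruction, rather than adding them into one register first.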
On 07/05/2019 at 17:10, Segher Boessenkool wrote:
> Hi Christophe,
>
> On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
>> Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
>> that are summed to obtain the target address. Using '%y0' argument
>> gives GCC the opportunity to use both registers instead of only one
>> with the second being forced to 0.
>
> That's not quite right. Sorry if I didn't explain it properly.
>
> "m" allows all memory. But this instruction only allows reg,reg and
> 0,reg addressing. For that you need to use constraint "Z".
But gcc help
(https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints)
says it is better to use 'm':
Z
Memory operand that is an indexed or indirect from a register (it
is usually better to use ‘m’ or ‘es’ in asm statements)
That's the reason why I used 'm', I thought it was equivalent.
Christophe
>
> The output modifier "%y0" just makes [reg] (i.e. simple indirect addressing)
> print as "0,reg" instead of "0(reg)" as it would by default (for just "%0").
>
>
> Segher
>
On Tue, May 07, 2019 at 06:53:30PM +0200, Christophe Leroy wrote:
> On 07/05/2019 at 17:10, Segher Boessenkool wrote:
> >On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
> >>Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
> >>that are summed to obtain the target address. Using '%y0' argument
> >>gives GCC the opportunity to use both registers instead of only one
> >>with the second being forced to 0.
> >
> >That's not quite right. Sorry if I didn't explain it properly.
> >
> >"m" allows all memory. But this instruction only allows reg,reg and
> >0,reg addressing. For that you need to use constraint "Z".
>
> But gcc help
> (https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints)
> says it is better to use 'm':
It says it *usually* is better to use "m". What it really should say is
it is better to use "m" _when that is valid_. It is not valid for the
cache block instructions.
I'll fix up the comment... "es" is ancient, too; nowadays it is
equivalent to just "m" (and you need "m<>" to allow pre-modify addressing).
> Z
>
> Memory operand that is an indexed or indirect from a register (it
> is usually better to use ‘m’ or ‘es’ in asm statements)
>
> That's the reason why I used 'm', I thought it was equivalent.
Yeah, the manual text could be clearer.
Segher
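To make the "m" versus "Z" point in this thread concrete, here is a hedged sketch of a case where the difference would likely matter: an address that is a constant offset from a pointer. The struct frame_sketch and the functions dcbf_sketch and flush_payload_sketch are hypothetical names used only for this example, and the non-PowerPC branch exists only so the sketch builds on other hosts.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;             /* stand-in for the kernel's u8 */

/* Hypothetical layout: payload sits at a constant offset of 64 bytes. */
struct frame_sketch {
	u8 header[64];
	u8 payload[64];
};

static inline void dcbf_sketch(void *addr)
{
#if defined(__powerpc__) || defined(__powerpc64__)
	/*
	 * With "m", the operand here could be a displacement form such as
	 * 64(reg), which the cache block instructions cannot encode.
	 * With "Z", only reg,reg or 0,reg is allowed, so the offset has to
	 * be folded into a register first.
	 */
	__asm__ __volatile__ ("dcbf %y0" : : "Z"(*(u8 *)addr) : "memory");
#else
	(void)addr;             /* no-op so the sketch compiles elsewhere */
#endif
}

/* Flush the first cache line of the payload and return its address. */
static inline void *flush_payload_sketch(struct frame_sketch *f)
{
	dcbf_sketch(f->payload);
	return f->payload;
}
```

Building this for PowerPC once with "Z" and once with "m" and comparing the generated assembly is one way to check the behaviour described in the thread.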