2019-08-17 09:06:02

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH] powerpc: optimise WARN_ON()

Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger
of the t(d/w)nei instruction instead of using directly the
value of x.

This leads to GCC adding unnecessary pair of addic/subfe. This was
revealed after adding a WARN_ON() on top of clear_page() in order
to detect misaligned destination:

@@ -49,6 +51,8 @@ static inline void clear_page(void *addr)
{
unsigned int i;

+ WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1));
+
for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES)
dcbz(addr);
}

This resulted on:

0000019c <clear_user_page>:
19c: 54 68 06 fe clrlwi r8,r3,27
1a0: 31 48 ff ff addic r10,r8,-1
1a4: 7d 4a 41 10 subfe r10,r10,r8
1a8: 0f 0a 00 00 twnei r10,0
1ac: 39 20 00 80 li r9,128
1b0: 7d 29 03 a6 mtctr r9
1b4: 7c 00 1f ec dcbz 0,r3
1b8: 38 63 00 20 addi r3,r3,32
1bc: 42 00 ff f8 bdnz 1b4 <clear_user_page+0x18>
1c0: 7c a3 2b 78 mr r3,r5
1c4: 48 00 00 00 b 1c4 <clear_user_page+0x28>
1c4: R_PPC_REL24 flush_dcache_page

By using (x) instead of !!(x) like BUG_ON() does, the additional
instructions go away:

0000019c <clear_user_page>:
19c: 54 6a 06 fe clrlwi r10,r3,27
1a0: 0f 0a 00 00 twnei r10,0
1a4: 39 20 00 80 li r9,128
1a8: 7d 29 03 a6 mtctr r9
1ac: 7c 00 1f ec dcbz 0,r3
1b0: 38 63 00 20 addi r3,r3,32
1b4: 42 00 ff f8 bdnz 1ac <clear_user_page+0x10>
1b8: 7c a3 2b 78 mr r3,r5
1bc: 48 00 00 00 b 1bc <clear_user_page+0x20>
1bc: R_PPC_REL24 flush_dcache_page

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/bug.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index fed7e6241349..77074582fe65 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -107,7 +107,7 @@
: : "i" (__FILE__), "i" (__LINE__), \
"i" (BUGFLAG_WARNING|BUGFLAG_TAINT(TAINT_WARN)),\
"i" (sizeof(struct bug_entry)), \
- "r" (__ret_warn_on)); \
+ "r" ((__force long)(x))); \
} \
unlikely(__ret_warn_on); \
})
--
2.17.1


2019-08-18 12:05:15

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH] powerpc: optimise WARN_ON()

On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote:
> Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger
> of the t(d/w)nei instruction instead of using directly the
> value of x.
>
> This leads to GCC adding unnecessary pair of addic/subfe.

And it has to, it is passed as an "r" to an asm, GCC has to put the "!!"
value into a register.

> By using (x) instead of !!(x) like BUG_ON() does, the additional
> instructions go away:

But is it correct? What happens if you pass an int to WARN_ON, on a
64-bit kernel?

(You might want to have 64-bit generate either tw or td. But, with
your __builtin_trap patch, all that will be automatic).


Segher

2019-08-19 05:42:07

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] powerpc: optimise WARN_ON()



Le 18/08/2019 à 14:01, Segher Boessenkool a écrit :
> On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote:
>> Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger
>> of the t(d/w)nei instruction instead of using directly the
>> value of x.
>>
>> This leads to GCC adding unnecessary pair of addic/subfe.
>
> And it has to, it is passed as an "r" to an asm, GCC has to put the "!!"
> value into a register.
>
>> By using (x) instead of !!(x) like BUG_ON() does, the additional
>> instructions go away:
>
> But is it correct? What happens if you pass an int to WARN_ON, on a
> 64-bit kernel?

On a 64-bit kernel, an int is still in a 64-bit register, so there would
be no problem with tdnei, would it ? an int 0 is the same as an long 0,
right ?

It is on 32-bit kernel that I see a problem, if one passes a long long
to WARN_ON(), the forced cast to long will just drop the upper size of
it. So as of today, BUG_ON() is buggy for that.

>
> (You might want to have 64-bit generate either tw or td. But, with
> your __builtin_trap patch, all that will be automatic).
>

Yes I'll discard this patch and focus on the __builtin_trap() one which
should solve most issues.

Christophe

2019-08-19 07:59:30

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH] powerpc: optimise WARN_ON()

On Mon, Aug 19, 2019 at 07:40:42AM +0200, Christophe Leroy wrote:
> Le 18/08/2019 ? 14:01, Segher Boessenkool a ?crit?:
> >On Sat, Aug 17, 2019 at 09:04:42AM +0000, Christophe Leroy wrote:
> >>Unlike BUG_ON(x), WARN_ON(x) uses !!(x) as the trigger
> >>of the t(d/w)nei instruction instead of using directly the
> >>value of x.
> >>
> >>This leads to GCC adding unnecessary pair of addic/subfe.
> >
> >And it has to, it is passed as an "r" to an asm, GCC has to put the "!!"
> >value into a register.
> >
> >>By using (x) instead of !!(x) like BUG_ON() does, the additional
> >>instructions go away:
> >
> >But is it correct? What happens if you pass an int to WARN_ON, on a
> >64-bit kernel?
>
> On a 64-bit kernel, an int is still in a 64-bit register, so there would
> be no problem with tdnei, would it ? an int 0 is the same as an long 0,
> right ?

The top 32 bits of a 64-bit register holding an int are undefined. Take
as example

int x = 42;
x = ~x;

which may put ffff_ffff_ffff_ffd5 into the reg, not 0000_0000_ffff_ffd5
as you might expect or want. For tw instructions this makes no difference
(they only look at the low 32 bits anyway); for td insns, it does.

> It is on 32-bit kernel that I see a problem, if one passes a long long
> to WARN_ON(), the forced cast to long will just drop the upper size of
> it. So as of today, BUG_ON() is buggy for that.

Sure, it isn't defined what types you can pass to that macro. Another
thing that makes inline functions much saner to use.

> >(You might want to have 64-bit generate either tw or td. But, with
> >your __builtin_trap patch, all that will be automatic).
>
> Yes I'll discard this patch and focus on the __builtin_trap() one which
> should solve most issues.

But see my comment there about the compiler knowing all code after an
unconditional trap is dead.


Segher