2018-11-05 20:41:01

by Bart Van Assche

[permalink] [raw]
Subject: [PATCH] slab.h: Avoid using & for logical and of booleans

This patch suppresses the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Cc: Vlastimil Babka <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Roman Gushchin <[email protected]>
Signed-off-by: Bart Van Assche <[email protected]>
---
include/linux/slab.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..97d0599ddb7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
}

/*
--
2.19.1.930.g4563a0d9d0-goog



2018-11-05 21:14:00

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon, 5 Nov 2018 12:40:00 -0800 Bart Van Assche <[email protected]> wrote:

> This patch suppresses the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> ...
>
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> }
>
> /*

I suppose so.

That function seems too clever for its own good :(. I wonder if these
branch-avoiding tricks are really worthwhile.


2018-11-05 21:48:56

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon, 2018-11-05 at 13:13 -0800, Andrew Morton wrote:
+AD4 On Mon, 5 Nov 2018 12:40:00 -0800 Bart Van Assche +ADw-bvanassche+AEA-acm.org+AD4 wrote:
+AD4
+AD4 +AD4 This patch suppresses the following sparse warning:
+AD4 +AD4
+AD4 +AD4 ./include/linux/slab.h:332:43: warning: dubious: x +ACY +ACE-y
+AD4 +AD4
+AD4 +AD4 ...
+AD4 +AD4
+AD4 +AD4 --- a/include/linux/slab.h
+AD4 +AD4 +-+-+- b/include/linux/slab.h
+AD4 +AD4 +AEAAQA -329,7 +-329,7 +AEAAQA static +AF8AXw-always+AF8-inline enum kmalloc+AF8-cache+AF8-type kmalloc+AF8-type(gfp+AF8-t flags)
+AD4 +AD4 +ACo If an allocation is both +AF8AXw-GFP+AF8-DMA and +AF8AXw-GFP+AF8-RECLAIMABLE, return
+AD4 +AD4 +ACo KMALLOC+AF8-DMA and effectively ignore +AF8AXw-GFP+AF8-RECLAIMABLE
+AD4 +AD4 +ACo-/
+AD4 +AD4 - return type+AF8-dma +- (is+AF8-reclaimable +ACY +ACE-is+AF8-dma) +ACo KMALLOC+AF8-RECLAIM+ADs
+AD4 +AD4 +- return type+AF8-dma +- is+AF8-reclaimable +ACo +ACE-is+AF8-dma +ACo KMALLOC+AF8-RECLAIM+ADs
+AD4 +AD4 +AH0
+AD4 +AD4
+AD4 +AD4 /+ACo
+AD4
+AD4 I suppose so.
+AD4
+AD4 That function seems too clever for its own good :(. I wonder if these
+AD4 branch-avoiding tricks are really worthwhile.

From what I have seen in gcc disassembly it seems to me like gcc uses the
cmov instruction to implement e.g. the ternary operator (?:). So I think none
of the cleverness in kmalloc+AF8-type() is really necessary to avoid conditional
branches. I think this function would become much more readable when using a
switch statement or when rewriting it as follows (untested):

static +AF8AXw-always+AF8-inline enum kmalloc+AF8-cache+AF8-type kmalloc+AF8-type(gfp+AF8-t flags)
+AHs
- int is+AF8-dma +AD0 0+ADs
- int type+AF8-dma +AD0 0+ADs
- int is+AF8-reclaimable+ADs
-
-+ACM-ifdef CONFIG+AF8-ZONE+AF8-DMA
- is+AF8-dma +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-DMA)+ADs
- type+AF8-dma +AD0 is+AF8-dma +ACo KMALLOC+AF8-DMA+ADs
-+ACM-endif
-
- is+AF8-reclaimable +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE)+ADs
-
/+ACo
+ACo If an allocation is both +AF8AXw-GFP+AF8-DMA and +AF8AXw-GFP+AF8-RECLAIMABLE, return
+ACo KMALLOC+AF8-DMA and effectively ignore +AF8AXw-GFP+AF8-RECLAIMABLE
+ACo-/
- return type+AF8-dma +- (is+AF8-reclaimable +ACY +ACE-is+AF8-dma) +ACo KMALLOC+AF8-RECLAIM+ADs
+- static const enum kmalloc+AF8-cache+AF8-type flags+AF8-to+AF8-type+AFs-2+AF0AWw-2+AF0 +AD0 +AHs
+- +AHs 0, KMALLOC+AF8-RECLAIM +AH0,
+- +AHs KMALLOC+AF8-DMA, KMALLOC+AF8-DMA +AH0,
+- +AH0AOw
+-+ACM-ifdef CONFIG+AF8-ZONE+AF8-DMA
+- bool is+AF8-dma +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-DMA)+ADs
+-+ACM-endif
+- bool is+AF8-reclaimable +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE)+ADs
+-
+- return flags+AF8-to+AF8-type+AFs-is+AF8-dma+AF0AWw-is+AF8-reclaimable+AF0AOw
+AH0


2018-11-05 22:16:16

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 2018-11-05 22:48, Bart Van Assche wrote:
> On Mon, 2018-11-05 at 13:13 -0800, Andrew Morton wrote:
>> On Mon, 5 Nov 2018 12:40:00 -0800 Bart Van Assche <[email protected]> wrote:
>>
>>> This patch suppresses the following sparse warning:
>>>
>>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>>
>>> ...
>>>
>>> --- a/include/linux/slab.h
>>> +++ b/include/linux/slab.h
>>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>>> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>>> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>>> */
>>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>> }
>>>
>>> /*
>>
>> I suppose so.
>>
>> That function seems too clever for its own good :(. I wonder if these
>> branch-avoiding tricks are really worthwhile.
>
> From what I have seen in gcc disassembly it seems to me like gcc uses the
> cmov instruction to implement e.g. the ternary operator (?:). So I think none
> of the cleverness in kmalloc_type() is really necessary to avoid conditional
> branches. I think this function would become much more readable when using a
> switch statement or when rewriting it as follows (untested):
>
> static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> {
> - int is_dma = 0;
> - int type_dma = 0;
> - int is_reclaimable;
> -
> -#ifdef CONFIG_ZONE_DMA
> - is_dma = !!(flags & __GFP_DMA);
> - type_dma = is_dma * KMALLOC_DMA;
> -#endif
> -
> - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> -
> /*
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + static const enum kmalloc_cache_type flags_to_type[2][2] = {
> + { 0, KMALLOC_RECLAIM },
> + { KMALLOC_DMA, KMALLOC_DMA },
> + };
> +#ifdef CONFIG_ZONE_DMA
> + bool is_dma = !!(flags & __GFP_DMA);
> +#endif
> + bool is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> +
> + return flags_to_type[is_dma][is_reclaimable];
> }
>

Won't that pessimize the cases where gfp is a constant to actually do
the table lookup, and add 16 bytes to every translation unit?

Another option is to add a fake KMALLOC_DMA_RECLAIM so the
kmalloc_caches[] array has size 4, then assign the same dma
kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
dozen pointers in .data), and then just compute kmalloc_type() as

((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
someothershift).

Perhaps one could even shuffle the GFP flags so the two shifts are the same.

Rasmus

2018-11-05 22:48:05

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon, 2018-11-05 at 23:14 +-0100, Rasmus Villemoes wrote:
+AD4 Won't that pessimize the cases where gfp is a constant to actually do
+AD4 the table lookup, and add 16 bytes to every translation unit?
+AD4
+AD4 Another option is to add a fake KMALLOC+AF8-DMA+AF8-RECLAIM so the
+AD4 kmalloc+AF8-caches+AFsAXQ array has size 4, then assign the same dma
+AD4 kmalloc+AF8-cache pointer to +AFs-2+AF0AWw-i+AF0 and +AFs-3+AF0AWw-i+AF0 (so that costs perhaps a
+AD4 dozen pointers in .data), and then just compute kmalloc+AF8-type() as
+AD4
+AD4 ((flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE) +AD4APg someshift) +AHw ((flags +ACY +AF8AXw-GFP+AF8-DMA) +AD4APg
+AD4 someothershift).
+AD4
+AD4 Perhaps one could even shuffle the GFP flags so the two shifts are the same.

How about this version, still untested? My compiler is able to evaluate
the switch expression if the argument is constant.

static +AF8AXw-always+AF8-inline enum kmalloc+AF8-cache+AF8-type kmalloc+AF8-type(gfp+AF8-t flags)
+AHs
- int is+AF8-dma +AD0 0+ADs
- int type+AF8-dma +AD0 0+ADs
- int is+AF8-reclaimable+ADs
+- unsigned int dr +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE)+ADs

+ACM-ifdef CONFIG+AF8-ZONE+AF8-DMA
- is+AF8-dma +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-DMA)+ADs
- type+AF8-dma +AD0 is+AF8-dma +ACo KMALLOC+AF8-DMA+ADs
+- dr +AHwAPQ +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-DMA) +ADwAPA 1+ADs
+ACM-endif

- is+AF8-reclaimable +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE)+ADs
-
/+ACo
+ACo If an allocation is both +AF8AXw-GFP+AF8-DMA and +AF8AXw-GFP+AF8-RECLAIMABLE, return
+ACo KMALLOC+AF8-DMA and effectively ignore +AF8AXw-GFP+AF8-RECLAIMABLE
+ACo-/
- return type+AF8-dma +- (is+AF8-reclaimable +ACY +ACE-is+AF8-dma) +ACo KMALLOC+AF8-RECLAIM+ADs
+- switch (dr) +AHs
+- default:
+- case 0:
+- return 0+ADs
+- case 1:
+- return KMALLOC+AF8-RECLAIM+ADs
+- case 2:
+- case 3:
+- return KMALLOC+AF8-DMA+ADs
+- +AH0
+AH0

Bart.

2018-11-05 22:49:12

by Alexander Duyck

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche <[email protected]> wrote:
>
> On Mon, 2018-11-05 at 23:14 +0100, Rasmus Villemoes wrote:
> > Won't that pessimize the cases where gfp is a constant to actually do
> > the table lookup, and add 16 bytes to every translation unit?
> >
> > Another option is to add a fake KMALLOC_DMA_RECLAIM so the
> > kmalloc_caches[] array has size 4, then assign the same dma
> > kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
> > dozen pointers in .data), and then just compute kmalloc_type() as
> >
> > ((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
> > someothershift).
> >
> > Perhaps one could even shuffle the GFP flags so the two shifts are the same.
>
> How about this version, still untested? My compiler is able to evaluate
> the switch expression if the argument is constant.
>
> static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> {
> - int is_dma = 0;
> - int type_dma = 0;
> - int is_reclaimable;
> + unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
>
> #ifdef CONFIG_ZONE_DMA
> - is_dma = !!(flags & __GFP_DMA);
> - type_dma = is_dma * KMALLOC_DMA;
> + dr |= !!(flags & __GFP_DMA) << 1;
> #endif
>
> - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> -
> /*
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + switch (dr) {
> + default:
> + case 0:
> + return 0;
> + case 1:
> + return KMALLOC_RECLAIM;
> + case 2:
> + case 3:
> + return KMALLOC_DMA;
> + }
> }
>
> Bart.

Doesn't this defeat the whole point of the code which I thought was to
avoid conditional jumps and branches? Also why would you bother with
the "dr" value when you could just mask the flags value and switch on
that directly?

2018-11-06 00:01:57

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon, 2018-11-05 at 14:48 -0800, Alexander Duyck wrote:
+AD4 On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche +ADw-bvanassche+AEA-acm.org+AD4 wrote:
+AD4 +AD4 How about this version, still untested? My compiler is able to evaluate
+AD4 +AD4 the switch expression if the argument is constant.
+AD4 +AD4
+AD4 +AD4 static +AF8AXw-always+AF8-inline enum kmalloc+AF8-cache+AF8-type kmalloc+AF8-type(gfp+AF8-t flags)
+AD4 +AD4 +AHs
+AD4 +AD4 - int is+AF8-dma +AD0 0+ADs
+AD4 +AD4 - int type+AF8-dma +AD0 0+ADs
+AD4 +AD4 - int is+AF8-reclaimable+ADs
+AD4 +AD4 +- unsigned int dr +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE)+ADs
+AD4 +AD4
+AD4 +AD4 +ACM-ifdef CONFIG+AF8-ZONE+AF8-DMA
+AD4 +AD4 - is+AF8-dma +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-DMA)+ADs
+AD4 +AD4 - type+AF8-dma +AD0 is+AF8-dma +ACo KMALLOC+AF8-DMA+ADs
+AD4 +AD4 +- dr +AHwAPQ +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-DMA) +ADwAPA 1+ADs
+AD4 +AD4 +ACM-endif
+AD4 +AD4
+AD4 +AD4 - is+AF8-reclaimable +AD0 +ACEAIQ(flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE)+ADs
+AD4 +AD4 -
+AD4 +AD4 /+ACo
+AD4 +AD4 +ACo If an allocation is both +AF8AXw-GFP+AF8-DMA and +AF8AXw-GFP+AF8-RECLAIMABLE, return
+AD4 +AD4 +ACo KMALLOC+AF8-DMA and effectively ignore +AF8AXw-GFP+AF8-RECLAIMABLE
+AD4 +AD4 +ACo-/
+AD4 +AD4 - return type+AF8-dma +- (is+AF8-reclaimable +ACY +ACE-is+AF8-dma) +ACo KMALLOC+AF8-RECLAIM+ADs
+AD4 +AD4 +- switch (dr) +AHs
+AD4 +AD4 +- default:
+AD4 +AD4 +- case 0:
+AD4 +AD4 +- return 0+ADs
+AD4 +AD4 +- case 1:
+AD4 +AD4 +- return KMALLOC+AF8-RECLAIM+ADs
+AD4 +AD4 +- case 2:
+AD4 +AD4 +- case 3:
+AD4 +AD4 +- return KMALLOC+AF8-DMA+ADs
+AD4 +AD4 +- +AH0
+AD4 +AD4 +AH0
+AD4
+AD4 Doesn't this defeat the whole point of the code which I thought was to
+AD4 avoid conditional jumps and branches? Also why would you bother with
+AD4 the +ACI-dr+ACI value when you could just mask the flags value and switch on
+AD4 that directly?

Storing the relevant bits of 'flags' in the 'dr' variable avoids that the
bit selection expressions have to be repeated and allows to use a switch
statement instead of multiple if / else statements.

Most kmalloc() calls pass a constant to the gfp argument. That allows the
compiler to evaluate kmalloc+AF8-type() at compile time. So the conditional jumps
and branches only appear when the gfp argument is not a constant. What makes
you think it is important to optimize for that case?

Bart.

2018-11-06 00:12:17

by Alexander Duyck

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon, Nov 5, 2018 at 4:01 PM Bart Van Assche <[email protected]> wrote:
>
> On Mon, 2018-11-05 at 14:48 -0800, Alexander Duyck wrote:
> > On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche <[email protected]> wrote:
> > > How about this version, still untested? My compiler is able to evaluate
> > > the switch expression if the argument is constant.
> > >
> > > static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> > > {
> > > - int is_dma = 0;
> > > - int type_dma = 0;
> > > - int is_reclaimable;
> > > + unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
> > >
> > > #ifdef CONFIG_ZONE_DMA
> > > - is_dma = !!(flags & __GFP_DMA);
> > > - type_dma = is_dma * KMALLOC_DMA;
> > > + dr |= !!(flags & __GFP_DMA) << 1;
> > > #endif
> > >
> > > - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> > > -
> > > /*
> > > * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> > > * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> > > */
> > > - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > > + switch (dr) {
> > > + default:
> > > + case 0:
> > > + return 0;
> > > + case 1:
> > > + return KMALLOC_RECLAIM;
> > > + case 2:
> > > + case 3:
> > > + return KMALLOC_DMA;
> > > + }
> > > }
> >
> > Doesn't this defeat the whole point of the code which I thought was to
> > avoid conditional jumps and branches? Also why would you bother with
> > the "dr" value when you could just mask the flags value and switch on
> > that directly?
>
> Storing the relevant bits of 'flags' in the 'dr' variable avoids that the
> bit selection expressions have to be repeated and allows to use a switch
> statement instead of multiple if / else statements.

Really they shouldn't have to be repeated. You essentially have just 3
cases. 0, __GFP_RECLAIMABLE, and the default case.

> Most kmalloc() calls pass a constant to the gfp argument. That allows the
> compiler to evaluate kmalloc_type() at compile time. So the conditional jumps
> and branches only appear when the gfp argument is not a constant. What makes
> you think it is important to optimize for that case?
>
> Bart.

I didn't really think it was all that important to optimize, but I
thought that was what you were trying to maintain with the earlier
patch since it was converting things to a table lookup.

If we really don't care then why even bother with the switch statement
anyway? It seems like you could just do one ternary operator and be
done with it. Basically all you need is:
return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
(flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;

Why bother with all the extra complexity of the switch statement?

Thanks.

- Alex

2018-11-06 00:34:51

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
+AD4 If we really don't care then why even bother with the switch statement
+AD4 anyway? It seems like you could just do one ternary operator and be
+AD4 done with it. Basically all you need is:
+AD4 return (defined(CONFIG+AF8-ZONE+AF8-DMA) +ACYAJg (flags +ACY +AF8AXw-GFP+AF8-DMA)) ? KMALLOC+AF8-DMA :
+AD4 (flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE) ? KMALLOC+AF8-RECLAIM : 0+ADs
+AD4
+AD4 Why bother with all the extra complexity of the switch statement?

I don't think that defined() can be used in a C expression. Hence the
IS+AF8-ENABLED() macro. If you fix that, leave out four superfluous parentheses,
test your patch, post that patch and cc me then I will add my Reviewed-by.

Bart.



2018-11-06 08:41:13

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/5/18 9:40 PM, Bart Van Assche wrote:
> This patch suppresses the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Roman Gushchin <[email protected]>
> Signed-off-by: Bart Van Assche <[email protected]>

Thanks.

Acked-by: Vlastimil Babka <[email protected]>

> ---
> include/linux/slab.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> }
>
> /*
>


2018-11-06 09:46:46

by William Kucharski

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans



> On Nov 5, 2018, at 14:13, Andrew Morton <[email protected]> wrote:
>
>> On Mon, 5 Nov 2018 12:40:00 -0800 Bart Van Assche <[email protected]> wrote:
>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>> }
>>
>> /*
>
> I suppose so.
>
> That function seems too clever for its own good :(. I wonder if these
> branch-avoiding tricks are really worthwhile.

At the very least I'd like to see some comments added as to why that approach was taken for the sake of future maintainers.

William Kucharski



2018-11-06 10:09:00

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] slab.h: Avoid using & for logical and of booleans

From: Bart Van Assche
> Sent: 05 November 2018 20:40
>
> This patch suppresses the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Roman Gushchin <[email protected]>
> Signed-off-by: Bart Van Assche <[email protected]>
> ---
> include/linux/slab.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;

ISTM that changing is_dma and is_reclaimable from int to bool will stop the bleating.

It is also strange that this code is trying so hard here to avoid conditional instructions
and then uses several to generate the boolean values in the first place.

OTOH I'd probably write:
int gfp_dma = 0;

#ifdef CONFIG_ZONE_DMA
gfp_dma = __GFP_DMA;
#endif

return flags & gfp_dma ? KMALLOC_DMA : flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : 0;

That might generate cmovs, but is may be better to put unlikely() around both
conditional expressions. Or redo as:

return !unlikely(flags & (dfp_dma | __GFP_RECLAIMABLE)) ? 0 : flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


2018-11-06 10:23:13

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/6/18 11:08 AM, David Laight wrote:
> From: Bart Van Assche
>> Sent: 05 November 2018 20:40
>>
>> This patch suppresses the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y

BTW, I wonder why the warnings appeared only now, after maybe months in
linux-next. Don't the various automated testing bots run sparse also on
linux-next?

>>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Cc: Vlastimil Babka <[email protected]>
>> Cc: Mel Gorman <[email protected]>
>> Cc: Christoph Lameter <[email protected]>
>> Cc: Roman Gushchin <[email protected]>
>> Signed-off-by: Bart Van Assche <[email protected]>
>> ---
>> include/linux/slab.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..97d0599ddb7b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>> */
>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>
> ISTM that changing is_dma and is_reclaimable from int to bool will stop the bleating.
>
> It is also strange that this code is trying so hard here to avoid conditional instructions

I primarily wanted to avoid branches in a hot path, not cmovs. Note
those are also not "free" (latency-wise) if the result of cmov is
immediately used for further computation.

> and then uses several to generate the boolean values in the first place.

I'm not sure where exactly?

> OTOH I'd probably write:
> int gfp_dma = 0;
>
> #ifdef CONFIG_ZONE_DMA
> gfp_dma = __GFP_DMA;
> #endif
>
> return flags & gfp_dma ? KMALLOC_DMA : flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : 0;

I'm not opposed to this. Christoph might :)

>
> That might generate cmovs, but is may be better to put unlikely() around both
> conditional expressions. Or redo as:
>
> return !unlikely(flags & (dfp_dma | __GFP_RECLAIMABLE)) ? 0 : flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;

I guess it should be structured so that the fast path is for gfp without
both __GFP_DMA and __GFP_RECLAIMABLE, with a single test+branch. IIRC
that's what Christoph originally requested.

> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>


2018-11-06 11:08:47

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] slab.h: Avoid using & for logical and of booleans

From: Vlastimil Babka [mailto:[email protected]]
> Sent: 06 November 2018 10:22
...
> >> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> >> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> >
> > ISTM that changing is_dma and is_reclaimable from int to bool will stop the bleating.
> >
> > It is also strange that this code is trying so hard here to avoid conditional instructions

I've done some experiments, compiled with gcc 4.7.3 and -O2
The constants match those from the kernel headers.

It is noticable that there isn't a cmov in sight.
The code would also be better if the KMALLOC constants matched the GFP ones.


#define __GFP_DMA 1u
#define __GFP_RECLAIM 0x10u

#define KMALLOC_DMA 2
#define KMALLOC_RECLAIM 1

unsigned int f(unsigned int flags)
{
return flags & __GFP_DMA ? KMALLOC_DMA : flags & __GFP_RECLAIM ? KMALLOC_RECLAIM : 0;
}

unsigned int f1(unsigned int flags)
{
return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 : flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
}

unsigned int f2(unsigned int flags)
{
int is_dma, type_dma, is_rec;

is_dma = !!(flags & __GFP_DMA);
type_dma = is_dma * KMALLOC_DMA;
is_rec = !!(flags & __GFP_RECLAIM);

return type_dma + (is_rec & !is_dma) * KMALLOC_RECLAIM;
}

unsigned int f3(unsigned int flags)
{
int is_dma, type_dma, is_rec;

is_dma = !!(flags & __GFP_DMA);
type_dma = is_dma * KMALLOC_DMA;
is_rec = !!(flags & __GFP_RECLAIM);

return type_dma + (is_rec * !is_dma) * KMALLOC_RECLAIM;
}

Disassembly of section .text:

0000000000000000 <f>:
0: 40 f6 c7 01 test $0x1,%dil
4: b8 02 00 00 00 mov $0x2,%eax
9: 74 05 je 10 <f+0x10>
b: f3 c3 repz retq
d: 0f 1f 00 nopl (%rax)
10: 89 f8 mov %edi,%eax
12: c1 e8 04 shr $0x4,%eax
15: 83 e0 01 and $0x1,%eax
18: c3 retq

This one has a misprediced branch in the common path.

0000000000000020 <f1>:
20: 40 f6 c7 11 test $0x11,%dil
24: 75 03 jne 29 <f1+0x9>
26: 31 c0 xor %eax,%eax
28: c3 retq
29: 83 e7 01 and $0x1,%edi
2c: 83 ff 01 cmp $0x1,%edi
2f: 19 c0 sbb %eax,%eax
31: 83 c0 02 add $0x2,%eax
34: c3 retq

The jne will be predicted not taken and the retq predicted.
So this might only be 1 clock in the normal case.

0000000000000040 <f2>:
40: 89 f8 mov %edi,%eax
42: c1 ef 04 shr $0x4,%edi
45: 83 e0 01 and $0x1,%eax
48: 89 c2 mov %eax,%edx
4a: 83 f2 01 xor $0x1,%edx
4d: 21 d7 and %edx,%edi
4f: 8d 04 47 lea (%rdi,%rax,2),%eax
52: c3 retq

No conditionals, but a 4 deep dependency chain.

0000000000000060 <f3>:
60: 89 fa mov %edi,%edx
62: c1 ef 04 shr $0x4,%edi
65: 83 e2 01 and $0x1,%edx
68: 83 e7 01 and $0x1,%edi
6b: 89 d0 mov %edx,%eax
6d: 83 f0 01 xor $0x1,%eax
70: 0f af c7 imul %edi,%eax
73: 8d 04 50 lea (%rax,%rdx,2),%eax
76: c3 retq

You really don't want the multiply.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2018-11-06 12:52:55

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/6/18 12:07 PM, David Laight wrote:
> From: Vlastimil Babka [mailto:[email protected]]
>> Sent: 06 November 2018 10:22
> ...
>>>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>>>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>>
>>> ISTM that changing is_dma and is_reclaimable from int to bool will stop the bleating.
>>>
>>> It is also strange that this code is trying so hard here to avoid conditional instructions
>
> I've done some experiments, compiled with gcc 4.7.3 and -O2
> The constants match those from the kernel headers.
>
> It is noticable that there isn't a cmov in sight.

There is with newer gcc: https://godbolt.org/z/qFdByQ

But even that didn't remove the imul in f3() so that's indeed a bust.

> The code would also be better if the KMALLOC constants matched the GFP ones.

That would be hard, as __GFP flags have also other constraints
(especially __GFP_DMA relative to other zone restricting __GFP flags)
and KMALLOC_* are used as array index.

> unsigned int f1(unsigned int flags)
> {
> return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 : flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
> }
>

...

> 0000000000000020 <f1>:
> 20: 40 f6 c7 11 test $0x11,%dil
> 24: 75 03 jne 29 <f1+0x9>
> 26: 31 c0 xor %eax,%eax
> 28: c3 retq
> 29: 83 e7 01 and $0x1,%edi
> 2c: 83 ff 01 cmp $0x1,%edi
> 2f: 19 c0 sbb %eax,%eax
> 31: 83 c0 02 add $0x2,%eax
> 34: c3 retq
>
> The jne will be predicted not taken and the retq predicted.
> So this might only be 1 clock in the normal case.

I think this is the winner. It's also a single branch and not two,
because the compiler could figure out some of the "clever arithmetics"
itself. Care to send a full patch?


2018-11-06 17:21:23

by Alexander Duyck

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche <[email protected]> wrote:
>
> On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > If we really don't care then why even bother with the switch statement
> > anyway? It seems like you could just do one ternary operator and be
> > done with it. Basically all you need is:
> > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> >
> > Why bother with all the extra complexity of the switch statement?
>
> I don't think that defined() can be used in a C expression. Hence the
> IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> test your patch, post that patch and cc me then I will add my Reviewed-by.
>
> Bart.

Actually the defined macro is used multiple spots in if statements
throughout the kernel.

The reason for IS_ENABLED is to address the fact that we can be
dealing with macros that indicate if they are built in or a module
since those end up being two different defines depending on if you
select 'y' or 'm'.

Thanks.

- Alex

2018-11-06 18:19:16

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Tue, 2018-11-06 at 09:20 -0800, Alexander Duyck wrote:
+AD4 On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche +ADw-bvanassche+AEA-acm.org+AD4 wrote:
+AD4 +AD4
+AD4 +AD4 On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
+AD4 +AD4 +AD4 If we really don't care then why even bother with the switch statement
+AD4 +AD4 +AD4 anyway? It seems like you could just do one ternary operator and be
+AD4 +AD4 +AD4 done with it. Basically all you need is:
+AD4 +AD4 +AD4 return (defined(CONFIG+AF8-ZONE+AF8-DMA) +ACYAJg (flags +ACY +AF8AXw-GFP+AF8-DMA)) ? KMALLOC+AF8-DMA :
+AD4 +AD4 +AD4 (flags +ACY +AF8AXw-GFP+AF8-RECLAIMABLE) ? KMALLOC+AF8-RECLAIM : 0+ADs
+AD4 +AD4 +AD4
+AD4 +AD4 +AD4 Why bother with all the extra complexity of the switch statement?
+AD4 +AD4
+AD4 +AD4 I don't think that defined() can be used in a C expression. Hence the
+AD4 +AD4 IS+AF8-ENABLED() macro. If you fix that, leave out four superfluous parentheses,
+AD4 +AD4 test your patch, post that patch and cc me then I will add my Reviewed-by.
+AD4
+AD4 Actually the defined macro is used multiple spots in if statements
+AD4 throughout the kernel.

The only 'if (defined(' matches I found in the kernel tree that are not
preprocessor statements occur in Perl code. Maybe I overlooked something?

+AD4 The reason for IS+AF8-ENABLED is to address the fact that we can be
+AD4 dealing with macros that indicate if they are built in or a module
+AD4 since those end up being two different defines depending on if you
+AD4 select 'y' or 'm'.

From Documentation/process/coding-style.rst:

Within code, where possible, use the IS+AF8-ENABLED macro to convert a Kconfig
symbol into a C boolean expression, and use it in a normal C conditional:

.. code-block:: c

if (IS+AF8-ENABLED(CONFIG+AF8-SOMETHING)) +AHs
...
+AH0

Bart.

2018-11-06 18:39:58

by Alexander Duyck

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Tue, Nov 6, 2018 at 9:48 AM Bart Van Assche <[email protected]> wrote:
>
> On Tue, 2018-11-06 at 09:20 -0800, Alexander Duyck wrote:
> > On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche <[email protected]> wrote:
> > >
> > > On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > > > If we really don't care then why even bother with the switch statement
> > > > anyway? It seems like you could just do one ternary operator and be
> > > > done with it. Basically all you need is:
> > > > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > > > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> > > >
> > > > Why bother with all the extra complexity of the switch statement?
> > >
> > > I don't think that defined() can be used in a C expression. Hence the
> > > IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> > > test your patch, post that patch and cc me then I will add my Reviewed-by.
> >
> > Actually the defined macro is used multiple spots in if statements
> > throughout the kernel.
>
> The only 'if (defined(' matches I found in the kernel tree that are not
> preprocessor statements occur in Perl code. Maybe I overlooked something?

You may be right. I think I was thinking of "__is_defined", not "defined".

> > The reason for IS_ENABLED is to address the fact that we can be
> > dealing with macros that indicate if they are built in or a module
> > since those end up being two different defines depending on if you
> > select 'y' or 'm'.
>
> From Documentation/process/coding-style.rst:
>
> Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
> symbol into a C boolean expression, and use it in a normal C conditional:
>
> .. code-block:: c
>
> if (IS_ENABLED(CONFIG_SOMETHING)) {
> ...
> }
>
> Bart.

Right. Part of the reason for suggesting that is that depending on how
you define "CONFIG_SOMETHING" it can actually be defined as
"CONFIG_SOMETHING" or "CONFIG_SOMETHING_MODULE". I was operating
under the assumption that CONFIG_ZONE_DMA wasn't ever going to be
built as a module.

Thanks.

- Alex

2018-11-07 10:41:53

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] slab.h: Avoid using & for logical and of booleans

From: Vlastimil Babka
> Sent: 06 November 2018 12:51
>
> On 11/6/18 12:07 PM, David Laight wrote:
> > From: Vlastimil Babka [mailto:[email protected]]
> >> Sent: 06 November 2018 10:22
> > ...
> >>>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> >>>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
...
> > I've done some experiments, compiled with gcc 4.7.3 and -O2
> > The constants match those from the kernel headers.
> >
> > It is noticable that there isn't a cmov in sight.
>
> There is with newer gcc: https://godbolt.org/z/qFdByQ
>
> But even that didn't remove the imul in f3() so that's indeed a bust.
>
> > The code would also be better if the KMALLOC constants matched the GFP ones.
>
> That would be hard, as __GFP flags have also other constraints
> (especially __GFP_DMA relative to other zone restricting __GFP flags)
> and KMALLOC_* are used as array index.

Hmmm...
With only 2 or three values conditionals are probably better than
table lookups - especially if they are function pointers.

>
> > unsigned int f1(unsigned int flags)
> > {
> > return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 : flags & __GFP_DMA ?
> KMALLOC_DMA : KMALLOC_RECLAIM;
> > }
> >
>
> ...
>
> > 0000000000000020 <f1>:
> > 20: 40 f6 c7 11 test $0x11,%dil
> > 24: 75 03 jne 29 <f1+0x9>
> > 26: 31 c0 xor %eax,%eax
> > 28: c3 retq
> > 29: 83 e7 01 and $0x1,%edi
> > 2c: 83 ff 01 cmp $0x1,%edi
> > 2f: 19 c0 sbb %eax,%eax
> > 31: 83 c0 02 add $0x2,%eax
> > 34: c3 retq
> >
> > The jne will be predicted not taken and the retq predicted.
> > So this might only be 1 clock in the normal case.
>
> I think this is the winner. It's also a single branch and not two,
> because the compiler could figure out some of the "clever arithmetics"
> itself. Care to send a full patch?

I've not got a suitable source tree lurking.
So someone else would need to do it.
I'll waive any copyright that could plausibly be assigned to the above!

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2018-11-09 08:12:59

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/7/18 11:41 AM, David Laight wrote:
> From: Vlastimil Babka
>> Sent: 06 November 2018 12:51
>>
>> On 11/6/18 12:07 PM, David Laight wrote:
>>> From: Vlastimil Babka [mailto:[email protected]]
>>> 0000000000000020 <f1>:
>>> 20: 40 f6 c7 11 test $0x11,%dil
>>> 24: 75 03 jne 29 <f1+0x9>
>>> 26: 31 c0 xor %eax,%eax
>>> 28: c3 retq
>>> 29: 83 e7 01 and $0x1,%edi
>>> 2c: 83 ff 01 cmp $0x1,%edi
>>> 2f: 19 c0 sbb %eax,%eax
>>> 31: 83 c0 02 add $0x2,%eax
>>> 34: c3 retq
>>>
>>> The jne will be predicted not taken and the retq predicted.
>>> So this might only be 1 clock in the normal case.
>>
>> I think this is the winner. It's also a single branch and not two,
>> because the compiler could figure out some of the "clever arithmetics"
>> itself. Care to send a full patch?
>
> I've not got a suitable source tree lurking.
> So someone else would need to do it.
> I'll waive any copyright that could plausibly be assigned to the above!

There we go. This is to replace the current fix by Bart (sorry) which seems
to add an extra IMUL. Apparently current mainline is spamming anyone running
sparse with lots of warning, so it should be merged soon.

----8<----
From ddd2fc6fcba425733f8320413a1451410687c9c3 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <[email protected]>
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche <[email protected]>
Reported-by: Darryl T. Agostinelli <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Suggested-by: David Laight <[email protected]>
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka <[email protected]>
---
include/linux/slab.h | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..18c6920c2803 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -304,6 +304,8 @@ enum kmalloc_cache_type {
KMALLOC_RECLAIM,
#ifdef CONFIG_ZONE_DMA
KMALLOC_DMA,
+#else
+ KMALLOC_DMA = KMALLOC_NORMAL,
#endif
NR_KMALLOC_TYPES
};
@@ -314,22 +316,20 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
-
-#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
+ int gfp_dma = IS_ENABLED(CONFIG_ZONE_DMA) ? __GFP_DMA : 0;

- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (gfp_dma | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;

/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
}

/*
--
2.19.1


2018-11-09 19:01:03

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka <[email protected]> wrote:

> Multiple people have reported the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> The minimal fix would be to change the logical & to boolean &&, which emits the
> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
> not worthwile. David Laight provided a nice comparison of disassembly of
> multiple variants, which shows that the current version produces a 4 deep
> dependency chain, and fixing the sparse warning by changing logical and to
> multiplication emits an IMUL, making it even more expensive.
>
> The code as rewritten by this patch yielded the best disassembly, with a single
> predictable branch for the most common case, and a ternary operator for the
> rest, which gcc seems to compile without a branch or cmov by itself.
>
> The result should be more readable, without a sparse warning and probably also
> faster for the common case.
>
> Reported-by: Bart Van Assche <[email protected]>
> Reported-by: Darryl T. Agostinelli <[email protected]>
> Suggested-by: Andrew Morton <[email protected]>
> Suggested-by: David Laight <[email protected]>
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Signed-off-by: Vlastimil Babka <[email protected]>
> ---
> include/linux/slab.h | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..18c6920c2803 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
> KMALLOC_RECLAIM,
> #ifdef CONFIG_ZONE_DMA
> KMALLOC_DMA,
> +#else
> + KMALLOC_DMA = KMALLOC_NORMAL,
> #endif
> NR_KMALLOC_TYPES
> };

I don't think this works correctly. Resetting KMALLOC_DMA to 0 will
cause NR_KMALLOC_TYPES to have value 1.


enum foo {
a = 0,
b,
c = 0,
d
}

main()
{
printf("%d %d %d %d\n", a, b, c, d);
}

akpm3> ./a.out
0 1 0 1


2018-11-09 19:19:54

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/9/18 8:00 PM, Andrew Morton wrote:
> On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka <[email protected]> wrote:
>
>> Multiple people have reported the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>
>> The minimal fix would be to change the logical & to boolean &&, which emits the
>> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
>> not worthwile. David Laight provided a nice comparison of disassembly of
>> multiple variants, which shows that the current version produces a 4 deep
>> dependency chain, and fixing the sparse warning by changing logical and to
>> multiplication emits an IMUL, making it even more expensive.
>>
>> The code as rewritten by this patch yielded the best disassembly, with a single
>> predictable branch for the most common case, and a ternary operator for the
>> rest, which gcc seems to compile without a branch or cmov by itself.
>>
>> The result should be more readable, without a sparse warning and probably also
>> faster for the common case.
>>
>> Reported-by: Bart Van Assche <[email protected]>
>> Reported-by: Darryl T. Agostinelli <[email protected]>
>> Suggested-by: Andrew Morton <[email protected]>
>> Suggested-by: David Laight <[email protected]>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Signed-off-by: Vlastimil Babka <[email protected]>
>> ---
>> include/linux/slab.h | 24 ++++++++++++------------
>> 1 file changed, 12 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..18c6920c2803 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
>> KMALLOC_RECLAIM,
>> #ifdef CONFIG_ZONE_DMA
>> KMALLOC_DMA,
>> +#else
>> + KMALLOC_DMA = KMALLOC_NORMAL,
>> #endif
>> NR_KMALLOC_TYPES
>> };
>
> I don't think this works correctly. Resetting KMALLOC_DMA to 0 will
> cause NR_KMALLOC_TYPES to have value 1.

Doh, right! Thanks for catching this.

This? Not terribly elegant, but I don't see a nicer way right now...

----8<----
From 40b84707e1b5aeccff11bd5f0563bb938e2c22d6 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <[email protected]>
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche <[email protected]>
Reported-by: Darryl T. Agostinelli <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Suggested-by: David Laight <[email protected]>
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka <[email protected]>
---
include/linux/slab.h | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..ca113d4ecc6f 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,24 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
+ int gfp_dma = IS_ENABLED(CONFIG_ZONE_DMA) ? __GFP_DMA : 0;

-#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
-
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (gfp_dma | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;

/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+#ifdef CONFIG_ZONE_DMA
+ return flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+ return KMALLOC_RECLAIM;
+#endif
}

/*
--
2.19.1


2018-11-09 19:48:23

by Darryl T. Agostinelli

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Fri, Nov 09, 2018 at 08:16:07PM +0100, Vlastimil Babka wrote:
> On 11/9/18 8:00 PM, Andrew Morton wrote:
> > On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka <[email protected]> wrote:
> >
> >> Multiple people have reported the following sparse warning:
> >>
> >> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> >>
> >> The minimal fix would be to change the logical & to boolean &&, which emits the
> >> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
> >> not worthwile. David Laight provided a nice comparison of disassembly of
> >> multiple variants, which shows that the current version produces a 4 deep
> >> dependency chain, and fixing the sparse warning by changing logical and to
> >> multiplication emits an IMUL, making it even more expensive.
> >>
> >> The code as rewritten by this patch yielded the best disassembly, with a single
> >> predictable branch for the most common case, and a ternary operator for the
> >> rest, which gcc seems to compile without a branch or cmov by itself.
> >>
> >> The result should be more readable, without a sparse warning and probably also
> >> faster for the common case.
> >>
> >> Reported-by: Bart Van Assche <[email protected]>
> >> Reported-by: Darryl T. Agostinelli <[email protected]>
> >> Suggested-by: Andrew Morton <[email protected]>
> >> Suggested-by: David Laight <[email protected]>
> >> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> >> Signed-off-by: Vlastimil Babka <[email protected]>
> >> ---
> >> include/linux/slab.h | 24 ++++++++++++------------
> >> 1 file changed, 12 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> index 918f374e7156..18c6920c2803 100644
> >> --- a/include/linux/slab.h
> >> +++ b/include/linux/slab.h
> >> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
> >> KMALLOC_RECLAIM,
> >> #ifdef CONFIG_ZONE_DMA
> >> KMALLOC_DMA,
> >> +#else
> >> + KMALLOC_DMA = KMALLOC_NORMAL,
> >> #endif
> >> NR_KMALLOC_TYPES
> >> };
> >
> > I don't think this works correctly. Resetting KMALLOC_DMA to 0 will
> > cause NR_KMALLOC_TYPES to have value 1.
>
> Doh, right! Thanks for catching this.
>
> This? Not terribly elegant, but I don't see a nicer way right now...
>

How about the solution I proposed yesterday?

https://lkml.org/lkml/2018/11/9/750

It doesn't involve any tricks.

As it is, this sparse warning is begging for a trick. Let's not
oblidge it to much.


2018-11-09 21:35:20

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/9/18 8:47 PM, Darryl T. Agostinelli wrote:
> On Fri, Nov 09, 2018 at 08:16:07PM +0100, Vlastimil Babka wrote:
>> On 11/9/18 8:00 PM, Andrew Morton wrote:
>>> On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka <[email protected]> wrote:
>>>
>>>> Multiple people have reported the following sparse warning:
>>>>
>>>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>>>
>>>> The minimal fix would be to change the logical & to boolean &&, which emits the
>>>> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
>>>> not worthwile. David Laight provided a nice comparison of disassembly of
>>>> multiple variants, which shows that the current version produces a 4 deep
>>>> dependency chain, and fixing the sparse warning by changing logical and to
>>>> multiplication emits an IMUL, making it even more expensive.
>>>>
>>>> The code as rewritten by this patch yielded the best disassembly, with a single
>>>> predictable branch for the most common case, and a ternary operator for the
>>>> rest, which gcc seems to compile without a branch or cmov by itself.
>>>>
>>>> The result should be more readable, without a sparse warning and probably also
>>>> faster for the common case.
>>>>
>>>> Reported-by: Bart Van Assche <[email protected]>
>>>> Reported-by: Darryl T. Agostinelli <[email protected]>
>>>> Suggested-by: Andrew Morton <[email protected]>
>>>> Suggested-by: David Laight <[email protected]>
>>>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>>>> Signed-off-by: Vlastimil Babka <[email protected]>
>>>> ---
>>>> include/linux/slab.h | 24 ++++++++++++------------
>>>> 1 file changed, 12 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>>>> index 918f374e7156..18c6920c2803 100644
>>>> --- a/include/linux/slab.h
>>>> +++ b/include/linux/slab.h
>>>> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
>>>> KMALLOC_RECLAIM,
>>>> #ifdef CONFIG_ZONE_DMA
>>>> KMALLOC_DMA,
>>>> +#else
>>>> + KMALLOC_DMA = KMALLOC_NORMAL,
>>>> #endif
>>>> NR_KMALLOC_TYPES
>>>> };
>>>
>>> I don't think this works correctly. Resetting KMALLOC_DMA to 0 will
>>> cause NR_KMALLOC_TYPES to have value 1.
>>
>> Doh, right! Thanks for catching this.
>>
>> This? Not terribly elegant, but I don't see a nicer way right now...
>>
>
> How about the solution I proposed yesterday?
>
> https://lkml.org/lkml/2018/11/9/750
>
> It doesn't involve any tricks.

It doesn't remove the "trick" that calculates return value as a sum of
booleans multiplying constants. The patch converts one part of the
expression of those booleans to a ternary operator. I think the result
is even harder to follow and meanwhile Andrew's suggestion was to remove
all the tricks.

> As it is, this sparse warning is begging for a trick. Let's not
> oblidge it to much.

The sparse warning could be silenced just by changing '&' to '&&' which
would emit the same code. But we decided to untrick the code.

2018-11-12 09:57:19

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] slab.h: Avoid using & for logical and of booleans

From: Vlastimil Babka [mailto:[email protected]]
> Sent: 09 November 2018 19:16
...
> This? Not terribly elegant, but I don't see a nicer way right now...

Maybe just have two copies of the function body?

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
#ifndef CONFIG_ZONE_DMA
return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
#else
if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
return KMALLOC_NORMAL;
return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
#endif
}

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2018-11-13 18:26:52

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/12/18 10:55 AM, David Laight wrote:
> From: Vlastimil Babka [mailto:[email protected]]
>> Sent: 09 November 2018 19:16
> ...
>> This? Not terribly elegant, but I don't see a nicer way right now...
>
> Maybe just have two copies of the function body?
>
> static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> {
> #ifndef CONFIG_ZONE_DMA
> return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
> #else
> if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
> return KMALLOC_NORMAL;
> return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
> #endif
> }

OK that's probably the most straightforward to follow, thanks.
Note that for CONFIG_ZONE_DMA=n the result is identical to original code and
all other attempts. flags & __GFP_DMA is converted to 1/0 index without branches
or cmovs or whatnot.

----8<----
From 40735b637b28c3e5798bc7e90f72f349050c2045 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <[email protected]>
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche <[email protected]>
Reported-by: Darryl T. Agostinelli <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Suggested-by: David Laight <[email protected]>
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka <[email protected]>
---
include/linux/slab.h | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..6d5009f29ce5 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,22 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
-
#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
-
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;

/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+ return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
}

/*
--
2.19.1




2018-11-19 11:06:07

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On Mon 2018-11-05 12:40:00, Bart Van Assche wrote:
> This patch suppresses the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Roman Gushchin <[email protected]>
> Signed-off-by: Bart Van Assche <[email protected]>
> ---
> include/linux/slab.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> }
>

What is wrong with && ? If logical and is better done by multiply,
that's compiler job, and compiler should be fixed to do it...


Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Attachments:
(No filename) (1.37 kB)
signature.asc (188.00 B)
Digital signature
Download all attachments

2018-11-19 12:56:50

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/19/18 12:04 PM, Pavel Machek wrote:
> On Mon 2018-11-05 12:40:00, Bart Van Assche wrote:
>> This patch suppresses the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Cc: Vlastimil Babka <[email protected]>
>> Cc: Mel Gorman <[email protected]>
>> Cc: Christoph Lameter <[email protected]>
>> Cc: Roman Gushchin <[email protected]>
>> Signed-off-by: Bart Van Assche <[email protected]>
>> ---
>> include/linux/slab.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..97d0599ddb7b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>> */
>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>> }
>>
>
> What is wrong with && ?

Nothing, it would work and generate the same assembly as '&'. But Andrew
noted that this code is probably too clever for its own good, and he has
a point. The single predictable branch is also likely faster than the
chain of arithmetic calculations anyway. Nobody has actually measured
it, so I'd go with the easier-to-read variant.

> If logical and is better done by multiply,
> that's compiler job, and compiler should be fixed to do it...

Multiply was just another way (equivalent to '&&' semantically) to shut
up sparse warning. But gcc actually emits IMUL in that case, which is
wasteful, so yeah there's a bug report now:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87954

>
> Pavel
>


2018-11-21 13:34:13

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] slab.h: Avoid using & for logical and of booleans

On 11/13/18 7:22 PM, Vlastimil Babka wrote:
> On 11/12/18 10:55 AM, David Laight wrote:
>> From: Vlastimil Babka [mailto:[email protected]]
>>> Sent: 09 November 2018 19:16
>> ...
>>> This? Not terribly elegant, but I don't see a nicer way right now...
>>
>> Maybe just have two copies of the function body?
>>
>> static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>> {
>> #ifndef CONFIG_ZONE_DMA
>> return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
>> #else
>> if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
>> return KMALLOC_NORMAL;
>> return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
>> #endif
>> }
>
> OK that's probably the most straightforward to follow, thanks.
> Note that for CONFIG_ZONE_DMA=n the result is identical to original code and
> all other attempts. flags & __GFP_DMA is converted to 1/0 index without branches
> or cmovs or whatnot.

Ping? Seems like people will report duplicates until the sparse warning
is gone in mainline...

Also CC linux-mm which was somehow lost.


----8<----
From 40735b637b28c3e5798bc7e90f72f349050c2045 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <[email protected]>
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche <[email protected]>
Reported-by: Darryl T. Agostinelli <[email protected]>
Reported-by: Masahiro Yamada <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Suggested-by: David Laight <[email protected]>
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka <[email protected]>
---
include/linux/slab.h | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..6d5009f29ce5 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,22 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
-
#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
-
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;

/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+ return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
}

/*
--
2.19.1