2022-06-08 06:15:36

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] include/uapi/linux/swab.h: add __u16 cast to __swab16 conditional

On Tue, Jun 7, 2022 at 3:27 PM Andrew Morton <[email protected]> wrote:
>
> On Tue, 7 Jun 2022 15:20:06 -0700 Justin Stitt <[email protected]> wrote:
>
> > if __HAVE_BUILTIN_BSWAP16__ is defined then __swab16 utilizes a __u16 cast.
> > This same cast should be used if __HAVE_BUILTIN_BSWAP16__ is not defined as
> > well. This should fix loads (at least a few) clang -Wformat warnings
> > specifically with `ntohs()`
> >
> > ...
> >
> > --- a/include/uapi/linux/swab.h
> > +++ b/include/uapi/linux/swab.h
> > @@ -102,7 +102,7 @@ static inline __attribute_const__ __u32 __fswahb32(__u32 val)
> > #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
> > #else
> > #define __swab16(x) \
> > - (__builtin_constant_p((__u16)(x)) ? \
> > + (__u16)(__builtin_constant_p((__u16)(x)) ? \
> > ___constant_swab16(x) : \
> > __fswab16(x))
> > #endif
>
> More explanation, please? Both ___constant_swab16() and __fswab16()
> return __u16, so why does this patch have any effect?
>

See this example:
https://godbolt.org/z/fzE73jn13
And the ImplicitCastExpr nodes adding to the AST:
https://godbolt.org/z/oYeYxYdKW

Both the second and third operand are promoted to int.

C11: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf

6.5.15/5
>> If both the second and third operands have arithmetic type, the result type that would be determined by the usual arithmetic conversions, were they applied to those two operands, is the type of the result.
6.3.1.8/1
>> Otherwise, the integer promotions are performed on both operands.
6.3.1.1/2
>> If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.
--
Thanks,
~Nick Desaulniers


2022-06-08 07:34:30

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] include/uapi/linux/swab.h: add __u16 cast to __swab16 conditional

On Tue, 7 Jun 2022 15:42:56 -0700 Nick Desaulniers <[email protected]> wrote:

> On Tue, Jun 7, 2022 at 3:27 PM Andrew Morton <[email protected]> wrote:
> >
> > On Tue, 7 Jun 2022 15:20:06 -0700 Justin Stitt <[email protected]> wrote:
> >
> > > if __HAVE_BUILTIN_BSWAP16__ is defined then __swab16 utilizes a __u16 cast.
> > > This same cast should be used if __HAVE_BUILTIN_BSWAP16__ is not defined as
> > > well. This should fix loads (at least a few) clang -Wformat warnings
> > > specifically with `ntohs()`
> > >
> > > ...
> > >
> > > --- a/include/uapi/linux/swab.h
> > > +++ b/include/uapi/linux/swab.h
> > > @@ -102,7 +102,7 @@ static inline __attribute_const__ __u32 __fswahb32(__u32 val)
> > > #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
> > > #else
> > > #define __swab16(x) \
> > > - (__builtin_constant_p((__u16)(x)) ? \
> > > + (__u16)(__builtin_constant_p((__u16)(x)) ? \
> > > ___constant_swab16(x) : \
> > > __fswab16(x))
> > > #endif
> >
> > More explanation, please? Both ___constant_swab16() and __fswab16()
> > return __u16, so why does this patch have any effect?
> >
>
> See this example:
> https://godbolt.org/z/fzE73jn13
> And the ImplicitCastExpr nodes adding to the AST:
> https://godbolt.org/z/oYeYxYdKW
>
> Both the second and third operand are promoted to int.
>
> C11: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
>
> 6.5.15/5
> >> If both the second and third operands have arithmetic type, the result type that would be determined by the usual arithmetic conversions, were they applied to those two operands, is the type of the result.
> 6.3.1.8/1
> >> Otherwise, the integer promotions are performed on both operands.
> 6.3.1.1/2
> >> If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.

Geeze. Can we please turn this into English and add it to the changelog?

Is it saying that an expression

int ? u16 : u16

has type int? Or something else? What did we do wrong here and is it
possible to correct our types rather than adding a cast?

2022-06-08 08:34:44

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] include/uapi/linux/swab.h: add __u16 cast to __swab16 conditional

On Tue, Jun 07, 2022 at 04:21:28PM -0700, Andrew Morton wrote:

> > 6.5.15/5
> > >> If both the second and third operands have arithmetic type, the result type that would be determined by the usual arithmetic conversions, were they applied to those two operands, is the type of the result.
> > 6.3.1.8/1
> > >> Otherwise, the integer promotions are performed on both operands.
> > 6.3.1.1/2
> > >> If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.
>
> Geeze. Can we please turn this into English and add it to the changelog?
>
> Is it saying that an expression
>
> int ? u16 : u16
>
> has type int? Or something else? What did we do wrong here and is it
> possible to correct our types rather than adding a cast?

Not quite. Same rules as u16 + u16 - on architectures where int is wider
than 16 bits it's (int)u16 + (int)u16 and yields int, on 16bit ones it's
(unsigned int)u16 + (unsigned int)u16 and yields unsigned int.

You *can't* get smaller-than-int out of ? :, same as you can't get it
out of addition, etc.

__builtin_choose_expr() would do it, but I would take a cast over that
ugliness.

FWIW, it might make sense for clang to keep track of the following
property: expression has the same value as it would if integer promotions
in it had been replaced with integer promotion of result.

Example: with
unsigned short x, y, mask;

expresion "x & y" is interpreted as and_int((int)x, (int)y), which is equal
to (int)and_u16(x, y), so that expression has the property in question.
"x != 12 ? x : y" has the same property. "x + y", OTOH, doesn't - if x and y
are both 32768, x + y is add_int((int)x, (int)y), i.e. 65536, while
(int)add_u16(x, y) would be 0.

For a somewhat more subtle example,
(x & ~mask) | (y & mask)
is interpreted as
or_int(and_int((int)x, not_int((int)mask)), and_int((int)y, (int)mask))
which is equal to
(int)or_u16(and_u16(x,not_u16(mask)), and_u16(y, mask))
IOW, the property in question holds for that one, despite having a subexpression
(~mask) that does *NOT* have that property. (int)not_u16(0) is 0xffff and
not_int((int)0) is (assuming 32bit int) 0xffffffff. Upper 16 bits get fouled;
applying & with known-16bit launders them off...

That predicate is behind the handling of small bitwise types in sparse;
otherwise all operations on __be16 would trigger warnings due to promotions
from __be16 to int. And aforementioned subtle example is common enough, so we
had to deal with it. See commit d24967cb847b "[PATCH] handle fouled-bitwise"
in sparse git...

2022-06-08 19:57:24

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] include/uapi/linux/swab.h: add __u16 cast to __swab16 conditional

On Tue, Jun 7, 2022 at 9:54 PM Al Viro <[email protected]> wrote:
>
> On Tue, Jun 07, 2022 at 04:21:28PM -0700, Andrew Morton wrote:
>
> > > 6.5.15/5
> > > >> If both the second and third operands have arithmetic type, the result type that would be determined by the usual arithmetic conversions, were they applied to those two operands, is the type of the result.
> > > 6.3.1.8/1
> > > >> Otherwise, the integer promotions are performed on both operands.
> > > 6.3.1.1/2
> > > >> If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.
> >
> > Geeze. Can we please turn this into English and add it to the changelog?
> >
> > Is it saying that an expression
> >
> > int ? u16 : u16
> >
> > has type int? Or something else? What did we do wrong here and is it
> > possible to correct our types rather than adding a cast?
>
> Not quite. Same rules as u16 + u16 - on architectures where int is wider
> than 16 bits it's (int)u16 + (int)u16 and yields int, on 16bit ones it's
> (unsigned int)u16 + (unsigned int)u16 and yields unsigned int.
>
> You *can't* get smaller-than-int out of ? :, same as you can't get it
> out of addition, etc.

Exactly, and well put. More concise than I was able to express. I
think that description will satisfy Andrew's request for additional
context, so I'll recommend Justin add a blurb derived from what you
said when sending a v3.

>
> __builtin_choose_expr() would do it, but I would take a cast over that
> ugliness.
>
> FWIW, it might make sense for clang to keep track of the following
> property: expression has the same value as it would if integer promotions
> in it had been replaced with integer promotion of result.

I'm not sure that's precisely the same issue here.

The issue we're facing is more so that `ntohs` is being used in
printf-like expressions; clang's -Wformat warns about default argument
promotion so we need to clean up cases where smaller-than-int format
flags are being used for promoted-to-int params. While looking at
that, Nathan noticed that __swab16 will return either a __u16 or an
int based on whether __HAVE_BUILTIN_BSWAP16__ is defined, which
depends on BOTH the compiler being used and target architecture. This
patch from Justin just cleans that up.

>
> Example: with
> unsigned short x, y, mask;
>
> expresion "x & y" is interpreted as and_int((int)x, (int)y), which is equal
> to (int)and_u16(x, y), so that expression has the property in question.
> "x != 12 ? x : y" has the same property. "x + y", OTOH, doesn't - if x and y
> are both 32768, x + y is add_int((int)x, (int)y), i.e. 65536, while
> (int)add_u16(x, y) would be 0.
>
> For a somewhat more subtle example,
> (x & ~mask) | (y & mask)
> is interpreted as
> or_int(and_int((int)x, not_int((int)mask)), and_int((int)y, (int)mask))
> which is equal to
> (int)or_u16(and_u16(x,not_u16(mask)), and_u16(y, mask))
> IOW, the property in question holds for that one, despite having a subexpression
> (~mask) that does *NOT* have that property. (int)not_u16(0) is 0xffff and
> not_int((int)0) is (assuming 32bit int) 0xffffffff. Upper 16 bits get fouled;
> applying & with known-16bit launders them off...
>
> That predicate is behind the handling of small bitwise types in sparse;
> otherwise all operations on __be16 would trigger warnings due to promotions
> from __be16 to int. And aforementioned subtle example is common enough, so we
> had to deal with it. See commit d24967cb847b "[PATCH] handle fouled-bitwise"
> in sparse git...

https://git.kernel.org/pub/scm/devel/sparse/sparse.git/commit/?id=d24967cb847b7a04920698a9053ea8195046a831
(For others' reference)
--
Thanks,
~Nick Desaulniers

2022-06-10 07:24:35

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] include/uapi/linux/swab.h: add __u16 cast to __swab16 conditional

From: Nick Desaulniers
> Sent: 08 June 2022 20:35
....
> The issue we're facing is more so that `ntohs` is being used in
> printf-like expressions; clang's -Wformat warns about default argument
> promotion so we need to clean up cases where smaller-than-int format
> flags are being used for promoted-to-int params. While looking at
> that, Nathan noticed that __swab16 will return either a __u16 or an
> int based on whether __HAVE_BUILTIN_BSWAP16__ is defined, which
> depends on BOTH the compiler being used and target architecture. This
> patch from Justin just cleans that up.

The 'problem' is that the (__u16) cast is likely to add an
extra '& 0xffff' instruction that is almost certainly not
required.

OTOH the lack of this masking has always been a difference between
htons() on BE and LE systems.

But clang is also being over-enthusiastic with its warnings.
IIRC varargs parameters always get integer promotion.
So if %hd ever makes sense for printf then it implies a
mask inside printf.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)