2024-03-01 04:44:54

by Kees Cook

[permalink] [raw]
Subject: [PATCH] compiler.h: Explain how __is_constexpr() works

The __is_constexpr() macro is dark magic. Shed some light on it with
a comment to explain how and why it works.

Acked-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
---
Cc: Rasmus Villemoes <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: David Laight <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Martin Uecker <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: [email protected]
v2: *thread necromancy* rewrite based on feedback to v1
v1: https://lore.kernel.org/all/[email protected]/
---
include/linux/compiler.h | 39 +++++++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index bb1339c7057b..38cd9f3c8f6a 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -231,6 +231,45 @@ static inline void *offset_to_ptr(const int *off)
* This returns a constant expression while determining if an argument is
* a constant expression, most importantly without evaluating the argument.
* Glory to Martin Uecker <[email protected]>
+ *
+ * Details:
+ * - sizeof() return an integer constant expression, and does not evaluate
+ * the value of its operand; it only examines the type of its operand.
+ * - The results of comparing two integer constant expressions is also
+ * an integer constant expression.
+ * - The first literal "8" isn't important. It could be any literal value.
+ * - The second literal "8" is to avoid warnings about unaligned pointers;
+ * this could otherwise just be "1".
+ * - (long)(x) is used to avoid warnings about 64-bit types on 32-bit
+ * architectures.
+ * - The C Standard defines "null pointer constant", "(void *)0", as
+ * distinct from other void pointers.
+ * - If (x) is an integer constant expression, then the "* 0l" resolves
+ * it into an integer constant expression of value 0. Since it is cast to
+ * "void *", this makes the second operand a null pointer constant.
+ * - If (x) is not an integer constant expression, then the second operand
+ * resolves to a void pointer (but not a null pointer constant: the value
+ * is not an integer constant 0).
+ * - The conditional operator's third operand, "(int *)8", is an object
+ * pointer (to type "int").
+ * - The behavior (including the return type) of the conditional operator
+ * ("operand1 ? operand2 : operand3") depends on the kind of expressions
+ * given for the second and third operands. This is the central mechanism
+ * of the macro:
+ * - When one operand is a null pointer constant (i.e. when x is an integer
+ * constant expression) and the other is an object pointer (i.e. our
+ * third operand), the conditional operator returns the type of the
+ * object pointer operand (i.e. "int *). Here, within the sizeof(), we
+ * would then get:
+ * sizeof(*((int *)(...)) == sizeof(int) == 4
+ * - When one operand is a void pointer (i.e. when x is not an integer
+ * constant expression) and the other is an object pointer (i.e. our
+ * third operand), the conditional operator returns a "void *" type.
+ * Here, within the sizeof(), we would then get:
+ * sizeof(*((void *)(...)) == sizeof(void) == 1
+ * - The equality comparison to "sizeof(int)" therefore depends on (x):
+ * sizeof(int) == sizeof(int) (x) was a constant expression
+ * sizeof(int) != sizeof(void) (x) was not a constant expression
*/
#define __is_constexpr(x) \
(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))
--
2.34.1



2024-03-01 08:17:09

by Jani Nikula

[permalink] [raw]
Subject: Re: [PATCH] compiler.h: Explain how __is_constexpr() works

On Thu, 29 Feb 2024, Kees Cook <[email protected]> wrote:
> The __is_constexpr() macro is dark magic. Shed some light on it with
> a comment to explain how and why it works.

Hey, it was a fun little puzzle to figure out using the C standard. Now
you're ruining it for everyone! ;)

The description matches my recollection of how it works. Especially the
meaning of the first 8 threw me off way back when. And looks like I've
replied to that effect for v1.

FWIW,

Reviewed-by: Jani Nikula <[email protected]>

but I'm sure there are more pedantic reviewers for all the minor
details.

>
> Acked-by: Gustavo A. R. Silva <[email protected]>
> Signed-off-by: Kees Cook <[email protected]>
> ---
> Cc: Rasmus Villemoes <[email protected]>
> Cc: Miguel Ojeda <[email protected]>
> Cc: Jani Nikula <[email protected]>
> Cc: David Laight <[email protected]>
> Cc: Nick Desaulniers <[email protected]>
> Cc: Martin Uecker <[email protected]>
> Cc: Jonathan Corbet <[email protected]>
> Cc: [email protected]
> v2: *thread necromancy* rewrite based on feedback to v1
> v1: https://lore.kernel.org/all/[email protected]/
> ---
> include/linux/compiler.h | 39 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 39 insertions(+)
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index bb1339c7057b..38cd9f3c8f6a 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -231,6 +231,45 @@ static inline void *offset_to_ptr(const int *off)
> * This returns a constant expression while determining if an argument is
> * a constant expression, most importantly without evaluating the argument.
> * Glory to Martin Uecker <[email protected]>
> + *
> + * Details:
> + * - sizeof() return an integer constant expression, and does not evaluate
> + * the value of its operand; it only examines the type of its operand.
> + * - The results of comparing two integer constant expressions is also
> + * an integer constant expression.
> + * - The first literal "8" isn't important. It could be any literal value.
> + * - The second literal "8" is to avoid warnings about unaligned pointers;
> + * this could otherwise just be "1".
> + * - (long)(x) is used to avoid warnings about 64-bit types on 32-bit
> + * architectures.
> + * - The C Standard defines "null pointer constant", "(void *)0", as
> + * distinct from other void pointers.
> + * - If (x) is an integer constant expression, then the "* 0l" resolves
> + * it into an integer constant expression of value 0. Since it is cast to
> + * "void *", this makes the second operand a null pointer constant.
> + * - If (x) is not an integer constant expression, then the second operand
> + * resolves to a void pointer (but not a null pointer constant: the value
> + * is not an integer constant 0).
> + * - The conditional operator's third operand, "(int *)8", is an object
> + * pointer (to type "int").
> + * - The behavior (including the return type) of the conditional operator
> + * ("operand1 ? operand2 : operand3") depends on the kind of expressions
> + * given for the second and third operands. This is the central mechanism
> + * of the macro:
> + * - When one operand is a null pointer constant (i.e. when x is an integer
> + * constant expression) and the other is an object pointer (i.e. our
> + * third operand), the conditional operator returns the type of the
> + * object pointer operand (i.e. "int *). Here, within the sizeof(), we
> + * would then get:
> + * sizeof(*((int *)(...)) == sizeof(int) == 4
> + * - When one operand is a void pointer (i.e. when x is not an integer
> + * constant expression) and the other is an object pointer (i.e. our
> + * third operand), the conditional operator returns a "void *" type.
> + * Here, within the sizeof(), we would then get:
> + * sizeof(*((void *)(...)) == sizeof(void) == 1
> + * - The equality comparison to "sizeof(int)" therefore depends on (x):
> + * sizeof(int) == sizeof(int) (x) was a constant expression
> + * sizeof(int) != sizeof(void) (x) was not a constant expression
> */
> #define __is_constexpr(x) \
> (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

--
Jani Nikula, Intel

2024-03-01 09:32:36

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] compiler.h: Explain how __is_constexpr() works

From: Kees Cook
> Sent: 01 March 2024 04:45
> To: Rasmus Villemoes <[email protected]>
>
> The __is_constexpr() macro is dark magic. Shed some light on it with
> a comment to explain how and why it works.

All the 8s don't help...

I don't think you need that much explanation.

Perhaps just saying that the type of ?: depends on the types
of the values and is independent of the condition.
The type of (0 ? (void *)p : (foo *)q) is normally 'void *'
(so that both values can be assigned to it).
But if 'p' is 'an integer constant expression with value 0'
then (void *)p is NULL and the type is 'foo *'.

The type can then be checked to find out it 'p' is constant 0.
A non-zero constant 'p' can be multiples by 0.

I need to replace the definition with (the more portable):
#define __if_constexpr(cond, if_const, if_not_const) \
_Generic(0 ? (void *)((long)(cond) * 0) : (char *)0, \
char *: (if_const), \
void *: (if_not_const))
which is arguably less cryptic.

#define __is_constexpr(cond) __if_constexpr(cond, 1, 0)

So that I can write:
#define is_non_neg_const(x) (__if_constexpr(x, x , -1) >= 0)
and avoid the compiler bleating about some comparisons
in unreachable code.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


2024-03-01 13:24:14

by Uecker, Martin

[permalink] [raw]
Subject: Re: [+externe Mail+] RE: [PATCH] compiler.h: Explain how __is_constexpr() works


BTW my main email addess is now: [email protected]

My suggestion would also to limit explanation. Nobody should
write such code and if you need to, you can find explanations
all over the internet.

Finally, I still think the motivation for this macro (removing
VLAs) is misguided if security is the goal because VLAs provide
precise bounds and larger worst-case fixed-size arrays do not.  

It would be better to use the compiler options that detect
possibly use of VLAs of unbounded size and if there a problems
with this, improve this on the compiler side.

Martin


Am Freitag, dem 01.03.2024 um 09:32 +0000 schrieb David Laight:
> From: Kees Cook
> > Sent: 01 March 2024 04:45
> > To: Rasmus Villemoes <[email protected]>
> >
> > The __is_constexpr() macro is dark magic. Shed some light on it with
> > a comment to explain how and why it works.
>
> All the 8s don't help...
>
> I don't think you need that much explanation.
>
> Perhaps just saying that the type of ?: depends on the types
> of the values and is independent of the condition.
> The type of (0 ? (void *)p : (foo *)q) is normally 'void *'
> (so that both values can be assigned to it).
> But if 'p' is 'an integer constant expression with value 0'
> then (void *)p is NULL and the type is 'foo *'.
>
> The type can then be checked to find out it 'p' is constant 0.
> A non-zero constant 'p' can be multiples by 0.
>
> I need to replace the definition with (the more portable):
> #define __if_constexpr(cond, if_const, if_not_const) \
> _Generic(0 ? (void *)((long)(cond) * 0) : (char *)0, \
> char *: (if_const), \
> void *: (if_not_const))
> which is arguably less cryptic.
>
> #define __is_constexpr(cond) __if_constexpr(cond, 1, 0)
>
> So that I can write:
> #define is_non_neg_const(x) (__if_constexpr(x, x , -1) >= 0)
> and avoid the compiler bleating about some comparisons
> in unreachable code.
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>

2024-03-01 13:45:21

by David Laight

[permalink] [raw]
Subject: RE: [+externe Mail+] RE: [PATCH] compiler.h: Explain how __is_constexpr() works

From: Uecker, Martin
> Sent: 01 March 2024 13:22
>
> My suggestion would also to limit explanation. Nobody should
> write such code and if you need to, you can find explanations
> all over the internet.
>
> Finally, I still think the motivation for this macro (removing
> VLAs) is misguided if security is the goal because VLAs provide
> precise bounds and larger worst-case fixed-size arrays do not.
>
> It would be better to use the compiler options that detect
> possibly use of VLAs of unbounded size and if there a problems
> with this, improve this on the compiler side.

In kernel code (with limited stack) there has to be enough room
for the largest possible 'VLA' so you might as well allocate one.

Allowing VLA also makes it pretty much impossible to do any
kind of static stack use analysis.
The fine IBT tags can be used identify valid indirect calls
which pretty much only leaves recursion stopping full static
stack analysis - and that could be banned except for a few
limited cases where 1 level could be permittd.

is_constexpr() has other uses - there are places where
__builtin_constant_p() isn't strong enough.
Particularly if you need to use builtin_choose_expr()
or _Generic() to get select a type.

For instance, if you can a constant value between 0 and MAXINT
it is safe to cast to/from unsigned in order change any
implicit integer promotion cast that may be grief some.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2024-03-01 13:56:49

by Uecker, Martin

[permalink] [raw]
Subject: Re: [+externe Mail+] RE: [PATCH] compiler.h: Explain how __is_constexpr() works

Am Freitag, dem 01.03.2024 um 13:43 +0000 schrieb David Laight:
> From: Uecker, Martin
> > Sent: 01 March 2024 13:22
> >
> > My suggestion would also to limit explanation. Nobody should
> > write such code and if you need to, you can find explanations
> > all over the internet.
> >
> > Finally, I still think the motivation for this macro (removing
> > VLAs) is misguided if security is the goal because VLAs provide
> > precise bounds and larger worst-case fixed-size arrays do not.
> >
> > It would be better to use the compiler options that detect
> > possibly use of VLAs of unbounded size and if there a problems
> > with this, improve this on the compiler side.
>
> In kernel code (with limited stack) there has to be enough room
> for the largest possible 'VLA' so you might as well allocate one.
>
> Allowing VLA also makes it pretty much impossible to do any
> kind of static stack use analysis.

If you limit VLAs to a certain maximum size, then you could use
this for analysis and it would not be worse than using worst case
fixed-size array on the stack. But you can also use the *precise*
run-time bound of the VLA if your static analysis is smart enough.
You can also use the precise run-time bound for run-time bounds
checking. It is strictly more expressive to use VLAs (or dependent
types in general) and therefor *good* for static analysis.

> The fine IBT tags can be used identify valid indirect calls
> which pretty much only leaves recursion stopping full static
> stack analysis - and that could be banned except for a few
> limited cases where 1 level could be permittd.
>
> is_constexpr() has other uses - there are places where
> __builtin_constant_p() isn't strong enough.
> Particularly if you need to use builtin_choose_expr()
> or _Generic() to get select a type.
>
> For instance, if you can a constant value between 0 and MAXINT
> it is safe to cast to/from unsigned in order change any
> implicit integer promotion cast that may be grief some.

glad to hear it is useful.

Martin

>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)

2024-03-02 01:47:06

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] compiler.h: Explain how __is_constexpr() works

On Sat, Mar 02, 2024 at 03:17:32AM +0200, Andy Shevchenko wrote:
> Thu, Feb 29, 2024 at 08:44:37PM -0800, Kees Cook kirjoitti:
> > The __is_constexpr() macro is dark magic. Shed some light on it with
> > a comment to explain how and why it works.
>
> I was under impression that somebody did it already once and it fell through
> cracks when has been moved (?) to compiler.h.

I tried to do it before (see the v1).

>
> Ah, now I see it, https://lore.kernel.org/all/[email protected]/.
> It was asked, but till now never fulfilled (maybe Reported-by:/Closes: tag?).

Sure! akpm was hardly the first to ask about it, but yeah, makes for
some good tags.

Reported-by: Andrew Morton <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/

:)

> And explanation before was given here:
> https://stackoverflow.com/questions/49481217/linux-kernels-is-constexpr-macro.

Sure, but I wanted something that lived with the macro and everyone was
happy with the details.

-Kees

--
Kees Cook

2024-03-04 16:53:42

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] compiler.h: Explain how __is_constexpr() works

On Thu, Feb 29, 2024 at 8:44 PM Kees Cook <[email protected]> wrote:
>
> The __is_constexpr() macro is dark magic. Shed some light on it with
> a comment to explain how and why it works.
>
> Acked-by: Gustavo A. R. Silva <[email protected]>
> Signed-off-by: Kees Cook <[email protected]>

Is Documentation/kernel-hacking/hacking.rst perhaps a more appropriate
place for this block of text? Perhaps as another :c:macro: in that
file?

You know I'm not a big fan of increasing header size, even if it is
just a comment.
--
Thanks,
~Nick Desaulniers

2024-03-02 01:18:58

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH] compiler.h: Explain how __is_constexpr() works

Thu, Feb 29, 2024 at 08:44:37PM -0800, Kees Cook kirjoitti:
> The __is_constexpr() macro is dark magic. Shed some light on it with
> a comment to explain how and why it works.

I was under impression that somebody did it already once and it fell through
cracks when has been moved (?) to compiler.h.

Ah, now I see it, https://lore.kernel.org/all/[email protected]/.
It was asked, but till now never fulfilled (maybe Reported-by:/Closes: tag?).

And explanation before was given here:
https://stackoverflow.com/questions/49481217/linux-kernels-is-constexpr-macro.

--
With Best Regards,
Andy Shevchenko