2016-03-17 02:48:41

by Vinicius Tinti

Subject: [PATCH] x86: Avoid undefined behavior in macro expansion

C11 standard (at 6.10.3.3) says that ## operator (paste) has undefined
behavior when one of the result operands is not a valid preprocessing
token.

Therefore the macro expansion may depend on compiler implementation
which may or may not preserve the leading white space.

Moreover, other places in the kernel use CONCAT(a,b) instead of
CONCAT(a, b). Changing this favors the more concise usage.

Signed-off-by: Vinicius Tinti <[email protected]>
Acked-by: Behan Webster <[email protected]>
---
arch/x86/crypto/aes_ctrby8_avx-x86_64.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index a916c4a..7a71553 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -93,7 +93,7 @@

 #define tmp %r10
 #define DDQ(i) CONCAT(ddq_add_,i)
-#define XMM(i) CONCAT(%xmm, i)
+#define XMM(i) CONCAT(%xmm,i)
 #define DDQ_DATA 0
 #define XDATA 1
 #define KEY_128 1
--
2.7.3


2016-03-19 02:16:03

by Al Viro

Subject: Re: [PATCH] x86: Avoid undefined behavior in macro expansion

On Wed, Mar 16, 2016 at 11:48:49PM -0300, Vinicius Tinti wrote:
> C11 standard (at 6.10.3.3) says that ## operator (paste) has undefined
> behavior when one of the result operands is not a valid preprocessing
> token.
>
> Therefore the macro expansion may depend on compiler implementation
> which may or may not preserve the leading white space.
>
> Moreover, other places in the kernel use CONCAT(a,b) instead of
> CONCAT(a, b). Changing this favors the more concise usage.

Huh?

> -#define XMM(i) CONCAT(%xmm, i)
> +#define XMM(i) CONCAT(%xmm,i)

What are you talking about? Undefined behaviour is when the result of
concatenation of adjacent tokens is not a valid preprocessor token.
It says nothing about either argument being a single token.

In this case after the substitution of e.g. XMM(42) we get 3 tokens:
Punctuator[%] Identifier[xmm] Pp-number[42]
with ## instructing us to replace the last two with the preprocessor token
that would be represented as concatenation of their representations. Which is
to say, concatenation of xmm and 42, i.e. xmm42. Which *is* a
representation of a valid preprocessor token - namely, Identifier[xmm42].
No undefined behaviour at all. And yes, you get two preprocessor tokens
in the expansion - % and xmm42. Preprocessor works in terms of tokens,
not strings...

If you know of any compiler where these two variants would produce different
expansions of XMM(<sequence of digits>), please report it to maintainers of
the compiler in question; it's a bug, plain and simple. And no, there's
no undefined behaviour in that.
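
The token-level argument above can be checked with a small C sketch. It
is only an illustration: CONCAT is assumed to be the plain a##b helper,
and XMM_SPACED, XMM_TIGHT, STR, STR_ and the three functions are
hypothetical names that do not appear in the kernel source. Both
spellings are expanded and stringified so the results can be compared.

```c
#include <assert.h>
#include <string.h>

/* Assumed definition of the pasting helper: a plain a##b. */
#define CONCAT(a,b) a##b

/* The two spellings from the patch; the names are hypothetical. */
#define XMM_SPACED(i) CONCAT(%xmm, i)
#define XMM_TIGHT(i)  CONCAT(%xmm,i)

/* Two-level stringification, so the argument is expanded before '#'. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

/* Spelling of the expansion of each variant for the argument 42. */
static const char *xmm_spaced(void) { return STR(XMM_SPACED(42)); }
static const char *xmm_tight(void)  { return STR(XMM_TIGHT(42)); }

/* Abort if the two variants do not expand to the same spelling. */
static void check_xmm_spellings(void)
{
    assert(strcmp(xmm_spaced(), xmm_tight()) == 0);
}
```

Calling check_xmm_spellings() from a test program should pass on a
conforming preprocessor: white space around a macro argument is not part
of the argument's tokens, so ## pastes xmm and 42 in both variants.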

2016-03-22 02:18:46

by Vinicius Tinti

Subject: Re: [PATCH] x86: Avoid undefined behavior in macro expansion

On Fri, Mar 18, 2016 at 11:15 PM, Al Viro <[email protected]> wrote:
> On Wed, Mar 16, 2016 at 11:48:49PM -0300, Vinicius Tinti wrote:
>> C11 standard (at 6.10.3.3) says that ## operator (paste) has undefined
>> behavior when one of the result operands is not a valid preprocessing
>> token.
>>
>> Therefore the macro expansion may depend on compiler implementation
>> which may or may not preserve the leading white space.
>>
>> Moreover, other places in the kernel use CONCAT(a,b) instead of
>> CONCAT(a, b). Changing this favors the more concise usage.
>
> Huh?
>
>> -#define XMM(i) CONCAT(%xmm, i)
>> +#define XMM(i) CONCAT(%xmm,i)
>
> What are you talking about? Undefined behaviour is when the result of
> concatenation of adjacent tokens is not a valid preprocessor token.
> It says nothing about either argument being a single token.

Please check the example below; otherwise it will be hard to explain.

The problem is that _i_ can itself be a macro that needs expanding, and
it can end up as an operand of the _paste_ operator.

// tricky code
#define CONCAT(a,b) a##b
#define XMM(i) CONCAT(%xmm, i)
.macro foo n
x = XMM(\n)
.endm

_%xmm_ is not a problem but _i_ is.

> In this case after the substitution of e.g. XMM(42) we get 3 tokens:
> Punctuator[%] Identifier[xmm] Pp-number[42]
> with ## instructing us to replace the last two with the preprocessor token
> that would be represented as concatenation of their representations. Which is
> to say, concatenation of xmm and 42, i.e. xmm42. Which *is* a
> representation of a valid preprocessor token - namely, Identifier[xmm42].

Agreed, but that is not the case here. I will add the code above to the
commit message and describe it. That should make it easier to explain
what I am trying to solve.

> No undefined behaviour at all. And yes, you get two preprocessor tokens
> in the expansion - % and xmm42. Preprocessor works in terms of tokens,
> not strings...

Understood.

> If you know of any compiler where these two variants would produce different
> expansions of XMM(<sequence of digits>), please report it to maintainers of
> the compiler in question; it's a bug, plain and simple. And no, there's
> no undefined behaviour in that.

I reported a bug and discussed it, and I too believe that the tricky
code I have just sent triggers undefined behavior.

What do you think?

--
Simplicity is the ultimate sophistication