2020-05-05 17:46:33

by Nick Desaulniers

Subject: [PATCH] x86: bitops: fix build regression

From: Sedat Dilek <[email protected]>

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, this now produces invalid
assembly:

$ cat foo.c
long a(long b, long c) {
asm("orb\t%1, %0" : "+q"(c): "r"(b));
return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

The "q" constraint only has meanting on -m32 otherwise is treated as
"r".

This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
or Clang+allyesconfig.

Keep the masking operation to appease sparse (`make C=1`), and add back
the cast in order to select the proper 8-bit register alias.

[Nick: reworded]

Cc: [email protected]
Cc: Jesse Brandeburg <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <[email protected]>
Reported-by: kernelci.org bot <[email protected]>
Suggested-by: Andy Shevchenko <[email protected]>
Suggested-by: Ilie Halip <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Signed-off-by: Sedat Dilek <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
arch/x86/include/asm/bitops.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
if (__builtin_constant_p(nr)) {
asm volatile(LOCK_PREFIX "orb %1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) & 0xff)
+ : "iq" ((u8)(CONST_MASK(nr) & 0xff))
: "memory");
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
if (__builtin_constant_p(nr)) {
asm volatile(LOCK_PREFIX "andb %1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) ^ 0xff));
+ : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
--
2.26.2.526.g744177e7f7-goog


2020-05-05 18:10:23

by H. Peter Anvin

Subject: Re: [PATCH] x86: bitops: fix build regression

On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <[email protected]> wrote:
>From: Sedat Dilek <[email protected]>
>
>It turns out that if your config tickles __builtin_constant_p via
>differences in choices to inline or not, this now produces invalid
>assembly:
>
>$ cat foo.c
>long a(long b, long c) {
> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> return c;
>}
>$ gcc foo.c
>foo.c: Assembler messages:
>foo.c:2: Error: `%rax' not allowed with `orb'
>
>The "q" constraint only has meanting on -m32 otherwise is treated as
>"r".
>
>This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
>or Clang+allyesconfig.
>
>Keep the masking operation to appease sparse (`make C=1`), add back the
>cast in order to properly select the proper 8b register alias.
>
> [Nick: reworded]
>
>Cc: [email protected]
>Cc: Jesse Brandeburg <[email protected]>
>Link: https://github.com/ClangBuiltLinux/linux/issues/961
>Link: https://lore.kernel.org/lkml/[email protected]/
>Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
>Reported-by: Sedat Dilek <[email protected]>
>Reported-by: kernelci.org bot <[email protected]>
>Suggested-by: Andy Shevchenko <[email protected]>
>Suggested-by: Ilie Halip <[email protected]>
>Tested-by: Sedat Dilek <[email protected]>
>Signed-off-by: Sedat Dilek <[email protected]>
>Signed-off-by: Nick Desaulniers <[email protected]>
>---
> arch/x86/include/asm/bitops.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/arch/x86/include/asm/bitops.h
>b/arch/x86/include/asm/bitops.h
>index b392571c1f1d..139122e5b25b 100644
>--- a/arch/x86/include/asm/bitops.h
>+++ b/arch/x86/include/asm/bitops.h
>@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> if (__builtin_constant_p(nr)) {
> asm volatile(LOCK_PREFIX "orb %1,%0"
> : CONST_MASK_ADDR(nr, addr)
>- : "iq" (CONST_MASK(nr) & 0xff)
>+ : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> : "memory");
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
>@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> if (__builtin_constant_p(nr)) {
> asm volatile(LOCK_PREFIX "andb %1,%0"
> : CONST_MASK_ADDR(nr, addr)
>- : "iq" (CONST_MASK(nr) ^ 0xff));
>+ : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> : : RLONG_ADDR(addr), "Ir" (nr) : "memory");

Drop & 0xff and change ^ 0xff to ~.

The redundancy is confusing.
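
Roughly, an untested sketch of what that would look like:

  : "iq" (CONST_MASK(nr))
and
  : "iq" (~CONST_MASK(nr))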
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-05-05 18:27:11

by Nick Desaulniers

Subject: Re: [PATCH] x86: bitops: fix build regression

On Tue, May 5, 2020 at 11:07 AM <[email protected]> wrote:
>
> On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <[email protected]> wrote:
> >From: Sedat Dilek <[email protected]>
> >
> >It turns out that if your config tickles __builtin_constant_p via
> >differences in choices to inline or not, this now produces invalid
> >assembly:
> >
> >$ cat foo.c
> >long a(long b, long c) {
> > asm("orb\t%1, %0" : "+q"(c): "r"(b));
> > return c;
> >}
> >$ gcc foo.c
> >foo.c: Assembler messages:
> >foo.c:2: Error: `%rax' not allowed with `orb'
> >
> >The "q" constraint only has meanting on -m32 otherwise is treated as
> >"r".
> >
> >This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> >or Clang+allyesconfig.
> >
> >Keep the masking operation to appease sparse (`make C=1`), add back the
> >cast in order to properly select the proper 8b register alias.
> >
> > [Nick: reworded]
> >
> >Cc: [email protected]
> >Cc: Jesse Brandeburg <[email protected]>
> >Link: https://github.com/ClangBuiltLinux/linux/issues/961
> >Link: https://lore.kernel.org/lkml/[email protected]/
> >Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> >Reported-by: Sedat Dilek <[email protected]>
> >Reported-by: kernelci.org bot <[email protected]>
> >Suggested-by: Andy Shevchenko <[email protected]>
> >Suggested-by: Ilie Halip <[email protected]>
> >Tested-by: Sedat Dilek <[email protected]>
> >Signed-off-by: Sedat Dilek <[email protected]>
> >Signed-off-by: Nick Desaulniers <[email protected]>
> >---
> > arch/x86/include/asm/bitops.h | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> >diff --git a/arch/x86/include/asm/bitops.h
> >b/arch/x86/include/asm/bitops.h
> >index b392571c1f1d..139122e5b25b 100644
> >--- a/arch/x86/include/asm/bitops.h
> >+++ b/arch/x86/include/asm/bitops.h
> >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "orb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> >- : "iq" (CONST_MASK(nr) & 0xff)
> >+ : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> > : "memory");
> > } else {
> > asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "andb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> >- : "iq" (CONST_MASK(nr) ^ 0xff));
> >+ : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> > } else {
> > asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> > : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
>
> Drop & 0xff and change ^ 0xff to ~.
>
> The redundancy is confusing.

Thanks for the review. While I would also like to have less
redundancy, we have ourselves a catch-22 that that change won't resolve.

Without the cast to u8, gcc and clang will not select low-8-bit
registers required for the `b` suffix on `orb` and `andb`, which
results in an assembler error.
Without the mask, sparse will warn about the upper bytes of the value
being truncated.
(I guess that would have been a more concise commit message).
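
A rough standalone illustration of the two constraints (untested sketch;
the kernel operand is a constant mask, this just mirrors the shape):

  unsigned char or_mask(unsigned char c, int mask)
  {
          /* int-typed input: the template prints a 32/64-bit register
           * name for %1, which `orb' rejects (assembler error). */
          /* asm("orb %1, %0" : "+m"(c) : "q"(mask)); */

          /* u8-typed input: the low-8-bit register alias is printed.
           * In the kernel the value is a constant, and keeping & 0xff
           * next to the cast means the cast no longer discards set
           * bits, so sparse (make C=1) stays quiet. */
          asm("orb %1, %0" : "+m"(c) : "q"((unsigned char)(mask & 0xff)));
          return c;
  }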
--
Thanks,
~Nick Desaulniers

2020-05-06 04:32:51

by Nathan Chancellor

Subject: Re: [PATCH] x86: bitops: fix build regression

On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
> From: Sedat Dilek <[email protected]>
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, this now produces invalid
> assembly:
>
> $ cat foo.c
> long a(long b, long c) {
> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> The "q" constraint only has meanting on -m32 otherwise is treated as
> "r".
>
> This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> or Clang+allyesconfig.

For what it's worth, I don't see this with allyesconfig.

> Keep the masking operation to appease sparse (`make C=1`), add back the
> cast in order to properly select the proper 8b register alias.
>
> [Nick: reworded]
>
> Cc: [email protected]

The offending commit was added in 5.7-rc1; we shouldn't need to
Cc stable since this should be picked up as an -rc fix.

> Cc: Jesse Brandeburg <[email protected]>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/[email protected]/
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <[email protected]>
> Reported-by: kernelci.org bot <[email protected]>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Ilie Halip <[email protected]>

Not to split hairs, but this is Ilie's diff; he should probably be the
author, with Sedat's Reported-by/Tested-by.

https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458

But eh, it's all a team effort plus that can only happen with Ilie's
explicit consent for a Signed-off-by.

I am currently doing a set of builds with clang-11 with this patch on
top of 5.7-rc4 to make sure that all of the cases I have found work.
Once that is done, I'll comment back with a tag.

> Tested-by: Sedat Dilek <[email protected]>
> Signed-off-by: Sedat Dilek <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---
> arch/x86/include/asm/bitops.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..139122e5b25b 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> if (__builtin_constant_p(nr)) {
> asm volatile(LOCK_PREFIX "orb %1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) & 0xff)
> + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> : "memory");
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> if (__builtin_constant_p(nr)) {
> asm volatile(LOCK_PREFIX "andb %1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) ^ 0xff));
> + : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> --
> 2.26.2.526.g744177e7f7-goog
>

Cheers,
Nathan

2020-05-06 09:27:45

by Sedat Dilek

Subject: Re: [PATCH] x86: bitops: fix build regression

On Wed, May 6, 2020 at 6:30 AM Nathan Chancellor
<[email protected]> wrote:
>
> On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
> > From: Sedat Dilek <[email protected]>
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, this now produces invalid
> > assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> > asm("orb\t%1, %0" : "+q"(c): "r"(b));
> > return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > The "q" constraint only has meanting on -m32 otherwise is treated as
> > "r".
> >
> > This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> > or Clang+allyesconfig.
>
> For what it's worth, I don't see this with allyesconfig.
>
> > Keep the masking operation to appease sparse (`make C=1`), add back the
> > cast in order to properly select the proper 8b register alias.
> >
> > [Nick: reworded]
> >
> > Cc: [email protected]
>
> The offending commit was added in 5.7-rc1; we shouldn't need to
> Cc stable since this should be picked up as an -rc fix.
>
> > Cc: Jesse Brandeburg <[email protected]>
> > Link: https://github.com/ClangBuiltLinux/linux/issues/961
> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> > Reported-by: Sedat Dilek <[email protected]>
> > Reported-by: kernelci.org bot <[email protected]>
> > Suggested-by: Andy Shevchenko <[email protected]>
> > Suggested-by: Ilie Halip <[email protected]>
>
> Not to split hairs but this is Ilie's diff, he should probably be the
> author with Sedat's Reported-by/Tested-by.
>
> https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458
>
> But eh, it's all a team effort plus that can only happen with Ilie's
> explicit consent for a Signed-off-by.
>

Digital dementia... Looking back 3 weeks, I had put all the relevant
information into the patches in [1], mentioning that the diff is from
Ilie.
For whatever reason, Ilie did not react to any response in the CBL
issue tracker for 3 weeks.
I think Nick wants to quickly fix the Kernel-CI bot issue seen with Clang.
Personally, I hope this patch will be upstreamed in (one of) the next
RC releases.

I agree that Cc: stable can be dropped.
Checking the commit that caused this:

$ git describe --contains 1651e700664b4
v5.7-rc1~122^2

Thanks.

Regards,
- Sedat -

[1] https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-613207374

> I am currently doing a set of builds with clang-11 with this patch on
> top of 5.7-rc4 to make sure that all of the cases I have found work.
> Once that is done, I'll comment back with a tag.
>
> > Tested-by: Sedat Dilek <[email protected]>
> > Signed-off-by: Sedat Dilek <[email protected]>
> > Signed-off-by: Nick Desaulniers <[email protected]>
> > ---
> > arch/x86/include/asm/bitops.h | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> > index b392571c1f1d..139122e5b25b 100644
> > --- a/arch/x86/include/asm/bitops.h
> > +++ b/arch/x86/include/asm/bitops.h
> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "orb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> > - : "iq" (CONST_MASK(nr) & 0xff)
> > + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> > : "memory");
> > } else {
> > asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> > @@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "andb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> > - : "iq" (CONST_MASK(nr) ^ 0xff));
> > + : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> > } else {
> > asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> > : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> > --
> > 2.26.2.526.g744177e7f7-goog
> >
>
> Cheers,
> Nathan

2020-05-06 15:44:49

by Nathan Chancellor

Subject: Re: [PATCH] x86: bitops: fix build regression

On Tue, May 05, 2020 at 09:30:28PM -0700, Nathan Chancellor wrote:
> On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
> > From: Sedat Dilek <[email protected]>
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, this now produces invalid
> > assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> > asm("orb\t%1, %0" : "+q"(c): "r"(b));
> > return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > The "q" constraint only has meanting on -m32 otherwise is treated as
> > "r".
> >
> > This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> > or Clang+allyesconfig.
>
> For what it's worth, I don't see this with allyesconfig.
>
> > Keep the masking operation to appease sparse (`make C=1`), add back the
> > cast in order to properly select the proper 8b register alias.
> >
> > [Nick: reworded]
> >
> > Cc: [email protected]
>
> The offending commit was added in 5.7-rc1; we shouldn't need to
> Cc stable since this should be picked up as an -rc fix.
>
> > Cc: Jesse Brandeburg <[email protected]>
> > Link: https://github.com/ClangBuiltLinux/linux/issues/961
> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> > Reported-by: Sedat Dilek <[email protected]>
> > Reported-by: kernelci.org bot <[email protected]>
> > Suggested-by: Andy Shevchenko <[email protected]>
> > Suggested-by: Ilie Halip <[email protected]>
>
> Not to split hairs but this is Ilie's diff, he should probably be the
> author with Sedat's Reported-by/Tested-by.
>
> https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458
>
> But eh, it's all a team effort plus that can only happen with Ilie's
> explicit consent for a Signed-off-by.
>
> I am currently doing a set of builds with clang-11 with this patch on
> top of 5.7-rc4 to make sure that all of the cases I have found work.
> Once that is done, I'll comment back with a tag.

Reviewed-by: Nathan Chancellor <[email protected]>
Tested-by: Nathan Chancellor <[email protected]> # build

> > Tested-by: Sedat Dilek <[email protected]>
> > Signed-off-by: Sedat Dilek <[email protected]>
> > Signed-off-by: Nick Desaulniers <[email protected]>
> > ---
> > arch/x86/include/asm/bitops.h | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> > index b392571c1f1d..139122e5b25b 100644
> > --- a/arch/x86/include/asm/bitops.h
> > +++ b/arch/x86/include/asm/bitops.h
> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "orb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> > - : "iq" (CONST_MASK(nr) & 0xff)
> > + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> > : "memory");
> > } else {
> > asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> > @@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "andb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> > - : "iq" (CONST_MASK(nr) ^ 0xff));
> > + : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> > } else {
> > asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> > : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> > --
> > 2.26.2.526.g744177e7f7-goog
> >
>
> Cheers,
> Nathan

2020-05-06 16:40:05

by Nick Desaulniers

Subject: Re: [PATCH] x86: bitops: fix build regression

On Tue, May 5, 2020 at 9:30 PM Nathan Chancellor
<[email protected]> wrote:
>
> On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
> > This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> > or Clang+allyesconfig.
>
> For what it's worth, I don't see this with allyesconfig.

Oops, ok, I'll drop that from the commit message in v2. I was testing
with the former.

>
> > Keep the masking operation to appease sparse (`make C=1`), add back the
> > cast in order to properly select the proper 8b register alias.
> >
> > [Nick: reworded]
> >
> > Cc: [email protected]
>
> The offending commit was added in 5.7-rc1; we shouldn't need to
> Cc stable since this should be picked up as an -rc fix.

Got it, will drop in v2.

>
> > Cc: Jesse Brandeburg <[email protected]>
> > Link: https://github.com/ClangBuiltLinux/linux/issues/961
> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> > Reported-by: Sedat Dilek <[email protected]>
> > Reported-by: kernelci.org bot <[email protected]>
> > Suggested-by: Andy Shevchenko <[email protected]>
> > Suggested-by: Ilie Halip <[email protected]>
>
> Not to split hairs but this is Ilie's diff, he should probably be the
> author with Sedat's Reported-by/Tested-by.
>
> https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458

Ooh, you're right. Sorry about that Ilie. I'm usually pretty pedantic
about getting that right; my mistake. I'll fix that in v2. As Sedat
noted, the issue tracker has been a little quiet on this issue, but
I'll note that there are extraordinary circumstances going on in the
world these days (COVID) so delays should be anticipated.

Ilie, may I put your authorship and signed off by tag on the V2?

>
> But eh, it's all a team effort plus that can only happen with Ilie's
> explicit consent for a Signed-off-by.
--
Thanks,
~Nick Desaulniers

2020-05-06 16:57:58

by Ilie Halip

Subject: Re: [PATCH] x86: bitops: fix build regression

Hi Nick,

> Ilie, may I put your authorship and signed off by tag on the V2?

Yes, of course. With the current global situation I took some time off
and didn't follow the latest discussion.

Feel free to credit/rework the code as you see fit.

Thanks,
I.H.

2020-05-06 17:08:00

by Nick Desaulniers

Subject: [PATCH v2] x86: bitops: fix build regression

From: Ilie Halip <[email protected]>

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, this now produces invalid
assembly:

$ cat foo.c
long a(long b, long c) {
asm("orb\t%1, %0" : "+q"(c): "r"(b));
return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

The "q" constraint only has meanting on -m32 otherwise is treated as
"r".

This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

Without the cast to u8, gcc and clang will not select low-8-bit
registers required for the `b` suffix on `orb` and `andb`, which results
in an assembler error. Without the mask, sparse will warn about the
upper bytes of the value being truncated.

[Nick: reworded]

Cc: Jesse Brandeburg <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <[email protected]>
Reported-by: kernelci.org bot <[email protected]>
Suggested-by: Andy Shevchenko <[email protected]>
Suggested-by: Ilie Halip <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Ilie Halip <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
Changes V1 -> V2:
* change authorship/signed-off-by to Ilie
* add Nathan's Tested by/reviewed by
* update commit message slightly with info sent to HPA.

arch/x86/include/asm/bitops.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
if (__builtin_constant_p(nr)) {
asm volatile(LOCK_PREFIX "orb %1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) & 0xff)
+ : "iq" ((u8)(CONST_MASK(nr) & 0xff))
: "memory");
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
if (__builtin_constant_p(nr)) {
asm volatile(LOCK_PREFIX "andb %1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) ^ 0xff));
+ : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
--
2.26.2.526.g744177e7f7-goog

2020-05-07 06:20:06

by Brian Gerst

Subject: Re: [PATCH] x86: bitops: fix build regression

On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers <[email protected]> wrote:
>
> From: Sedat Dilek <[email protected]>
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, this now produces invalid
> assembly:
>
> $ cat foo.c
> long a(long b, long c) {
> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> The "q" constraint only has meanting on -m32 otherwise is treated as
> "r".
>
> This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> or Clang+allyesconfig.
>
> Keep the masking operation to appease sparse (`make C=1`), add back the
> cast in order to properly select the proper 8b register alias.
>
> [Nick: reworded]
>
> Cc: [email protected]
> Cc: Jesse Brandeburg <[email protected]>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/[email protected]/
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <[email protected]>
> Reported-by: kernelci.org bot <[email protected]>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Ilie Halip <[email protected]>
> Tested-by: Sedat Dilek <[email protected]>
> Signed-off-by: Sedat Dilek <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---
> arch/x86/include/asm/bitops.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..139122e5b25b 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> if (__builtin_constant_p(nr)) {
> asm volatile(LOCK_PREFIX "orb %1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) & 0xff)
> + : "iq" ((u8)(CONST_MASK(nr) & 0xff))

I think a better fix would be to make CONST_MASK() return a u8 value
rather than have to cast on every use.

Also I question the need for the "q" constraint. It was added in
commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum
GCC version is 4.6, is this still necessary?

--
Brian Gerst

2020-05-07 07:07:25

by H. Peter Anvin

Subject: Re: [PATCH] x86: bitops: fix build regression

On May 6, 2020 11:18:09 PM PDT, Brian Gerst <[email protected]> wrote:
>On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers
><[email protected]> wrote:
>>
>> From: Sedat Dilek <[email protected]>
>>
>> It turns out that if your config tickles __builtin_constant_p via
>> differences in choices to inline or not, this now produces invalid
>> assembly:
>>
>> $ cat foo.c
>> long a(long b, long c) {
>> asm("orb\t%1, %0" : "+q"(c): "r"(b));
>> return c;
>> }
>> $ gcc foo.c
>> foo.c: Assembler messages:
>> foo.c:2: Error: `%rax' not allowed with `orb'
>>
>> The "q" constraint only has meanting on -m32 otherwise is treated as
>> "r".
>>
>> This is easily reproducible via
>Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
>> or Clang+allyesconfig.
>>
>> Keep the masking operation to appease sparse (`make C=1`), add back
>the
>> cast in order to properly select the proper 8b register alias.
>>
>> [Nick: reworded]
>>
>> Cc: [email protected]
>> Cc: Jesse Brandeburg <[email protected]>
>> Link: https://github.com/ClangBuiltLinux/linux/issues/961
>> Link:
>https://lore.kernel.org/lkml/[email protected]/
>> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
>> Reported-by: Sedat Dilek <[email protected]>
>> Reported-by: kernelci.org bot <[email protected]>
>> Suggested-by: Andy Shevchenko <[email protected]>
>> Suggested-by: Ilie Halip <[email protected]>
>> Tested-by: Sedat Dilek <[email protected]>
>> Signed-off-by: Sedat Dilek <[email protected]>
>> Signed-off-by: Nick Desaulniers <[email protected]>
>> ---
>> arch/x86/include/asm/bitops.h | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/bitops.h
>b/arch/x86/include/asm/bitops.h
>> index b392571c1f1d..139122e5b25b 100644
>> --- a/arch/x86/include/asm/bitops.h
>> +++ b/arch/x86/include/asm/bitops.h
>> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
>> if (__builtin_constant_p(nr)) {
>> asm volatile(LOCK_PREFIX "orb %1,%0"
>> : CONST_MASK_ADDR(nr, addr)
>> - : "iq" (CONST_MASK(nr) & 0xff)
>> + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>
>I think a better fix would be to make CONST_MASK() return a u8 value
>rather than have to cast on every use.
>
>Also I question the need for the "q" constraint. It was added in
>commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum
>GCC version is 4.6, is this still necessary?
>
>--
>Brian Gerst

Yes, "q" is needed on i386.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-05-07 07:48:04

by David Laight

Subject: RE: [PATCH] x86: bitops: fix build regression

From: Brian Gerst
> Sent: 07 May 2020 07:18
...
> > --- a/arch/x86/include/asm/bitops.h
> > +++ b/arch/x86/include/asm/bitops.h
> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "orb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> > - : "iq" (CONST_MASK(nr) & 0xff)
> > + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>
> I think a better fix would be to make CONST_MASK() return a u8 value
> rather than have to cast on every use.

Or assign to a local variable - then it doesn't matter how
the value is actually calculated. So:
	u8 mask = CONST_MASK(nr);
	asm volatile(LOCK_PREFIX "orb %1,%0"
		: CONST_MASK_ADDR(nr, addr)
		: "iq" (mask)
		: "memory");

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-05-07 08:03:42

by H. Peter Anvin

Subject: RE: [PATCH] x86: bitops: fix build regression

On May 7, 2020 12:44:44 AM PDT, David Laight <[email protected]> wrote:
>From: Brian Gerst
>> Sent: 07 May 2020 07:18
>...
>> > --- a/arch/x86/include/asm/bitops.h
>> > +++ b/arch/x86/include/asm/bitops.h
>> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
>*addr)
>> > if (__builtin_constant_p(nr)) {
>> > asm volatile(LOCK_PREFIX "orb %1,%0"
>> > : CONST_MASK_ADDR(nr, addr)
>> > - : "iq" (CONST_MASK(nr) & 0xff)
>> > + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>>
>> I think a better fix would be to make CONST_MASK() return a u8 value
>> rather than have to cast on every use.
>
>Or assign to a local variable - then it doesn't matter how
>the value is actually calculated. So:
> u8 mask = CONST_MASK(nr);
> asm volatile(LOCK_PREFIX "orb %1,%0"
> : CONST_MASK_ADDR(nr, addr)
> : "iq" mask
>
> David
>
>-
>Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
>MK1 1PT, UK
>Registration No: 1397386 (Wales)

"const u8" please...
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-05-07 08:37:24

by David Laight

Subject: RE: [PATCH] x86: bitops: fix build regression

From: [email protected]
> Sent: 07 May 2020 08:59
> On May 7, 2020 12:44:44 AM PDT, David Laight <[email protected]> wrote:
> >From: Brian Gerst
> >> Sent: 07 May 2020 07:18
> >...
> >> > --- a/arch/x86/include/asm/bitops.h
> >> > +++ b/arch/x86/include/asm/bitops.h
> >> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
> >*addr)
> >> > if (__builtin_constant_p(nr)) {
> >> > asm volatile(LOCK_PREFIX "orb %1,%0"
> >> > : CONST_MASK_ADDR(nr, addr)
> >> > - : "iq" (CONST_MASK(nr) & 0xff)
> >> > + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> >>
> >> I think a better fix would be to make CONST_MASK() return a u8 value
> >> rather than have to cast on every use.
> >
> >Or assign to a local variable - then it doesn't matter how
> >the value is actually calculated. So:
> > u8 mask = CONST_MASK(nr);
> > asm volatile(LOCK_PREFIX "orb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> > : "iq" mask
> >
> > David
> >
> >-
> >Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
> >MK1 1PT, UK
> >Registration No: 1397386 (Wales)
>
> "const u8" please...

Why, just a waste of disk space.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-05-07 08:40:47

by H. Peter Anvin

Subject: RE: [PATCH] x86: bitops: fix build regression

On May 7, 2020 1:35:01 AM PDT, David Laight <[email protected]> wrote:
>From: [email protected]
>> Sent: 07 May 2020 08:59
>> On May 7, 2020 12:44:44 AM PDT, David Laight
><[email protected]> wrote:
>> >From: Brian Gerst
>> >> Sent: 07 May 2020 07:18
>> >...
>> >> > --- a/arch/x86/include/asm/bitops.h
>> >> > +++ b/arch/x86/include/asm/bitops.h
>> >> > @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
>> >*addr)
>> >> > if (__builtin_constant_p(nr)) {
>> >> > asm volatile(LOCK_PREFIX "orb %1,%0"
>> >> > : CONST_MASK_ADDR(nr, addr)
>> >> > - : "iq" (CONST_MASK(nr) & 0xff)
>> >> > + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>> >>
>> >> I think a better fix would be to make CONST_MASK() return a u8
>value
>> >> rather than have to cast on every use.
>> >
>> >Or assign to a local variable - then it doesn't matter how
>> >the value is actually calculated. So:
>> > u8 mask = CONST_MASK(nr);
>> > asm volatile(LOCK_PREFIX "orb %1,%0"
>> > : CONST_MASK_ADDR(nr, addr)
>> > : "iq" mask
>> >
>> > David
>> >
>> >-
>> >Registered Address Lakeside, Bramley Road, Mount Farm, Milton
>Keynes,
>> >MK1 1PT, UK
>> >Registration No: 1397386 (Wales)
>>
>> "const u8" please...
>
>Why, just a waste of disk space.
>
> David
>
>-
>Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
>MK1 1PT, UK
>Registration No: 1397386 (Wales)

Descriptive.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-05-07 09:21:02

by Andy Shevchenko

Subject: Re: [PATCH] x86: bitops: fix build regression

On Thu, May 7, 2020 at 10:50 AM David Laight <[email protected]> wrote:
> From: Brian Gerst
> > Sent: 07 May 2020 07:18

> > I think a better fix would be to make CONST_MASK() return a u8 value
> > rather than have to cast on every use.
>
> Or assign to a local variable - then it doesn't matter how
> the value is actually calculated. So:
> u8 mask = CONST_MASK(nr);

The other case, with the negation, won't work like this, I believe.
So, I think the patch we have is good enough; no need to go looking for
trouble.

--
With Best Regards,
Andy Shevchenko

2020-05-07 11:39:52

by Peter Zijlstra

Subject: Re: [PATCH] x86: bitops: fix build regression

On Tue, May 05, 2020 at 11:07:24AM -0700, [email protected] wrote:
> On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <[email protected]> wrote:

> >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "orb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> >- : "iq" (CONST_MASK(nr) & 0xff)
> >+ : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> > : "memory");
> > } else {
> > asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> > if (__builtin_constant_p(nr)) {
> > asm volatile(LOCK_PREFIX "andb %1,%0"
> > : CONST_MASK_ADDR(nr, addr)
> >- : "iq" (CONST_MASK(nr) ^ 0xff));
> >+ : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> > } else {
> > asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> > : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
>
> Drop & 0xff and change ^ 0xff to ~.

But then we're back to sparse being unhappy, no? The thing with ~ is
that it will set high bits which will be truncated, which makes sparse
sad.
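
(Rough worked example: for a bit where CONST_MASK(nr) == 0x80,
~CONST_MASK(nr) is the int 0xffffff7f, and narrowing that to the 8-bit
operand is exactly the truncation sparse warns about; ^ 0xff yields
plain 0x7f instead.)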

2020-05-07 13:35:05

by Brian Gerst

Subject: Re: [PATCH] x86: bitops: fix build regression

On Thu, May 7, 2020 at 3:02 AM <[email protected]> wrote:
>
> On May 6, 2020 11:18:09 PM PDT, Brian Gerst <[email protected]> wrote:
> >On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers
> ><[email protected]> wrote:
> >>
> >> From: Sedat Dilek <[email protected]>
> >>
> >> It turns out that if your config tickles __builtin_constant_p via
> >> differences in choices to inline or not, this now produces invalid
> >> assembly:
> >>
> >> $ cat foo.c
> >> long a(long b, long c) {
> >> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> >> return c;
> >> }
> >> $ gcc foo.c
> >> foo.c: Assembler messages:
> >> foo.c:2: Error: `%rax' not allowed with `orb'
> >>
> >> The "q" constraint only has meanting on -m32 otherwise is treated as
> >> "r".
> >>
> >> This is easily reproducible via
> >Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
> >> or Clang+allyesconfig.
> >>
> >> Keep the masking operation to appease sparse (`make C=1`), add back
> >the
> >> cast in order to properly select the proper 8b register alias.
> >>
> >> [Nick: reworded]
> >>
> >> Cc: [email protected]
> >> Cc: Jesse Brandeburg <[email protected]>
> >> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> >> Link:
> >https://lore.kernel.org/lkml/[email protected]/
> >> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> >> Reported-by: Sedat Dilek <[email protected]>
> >> Reported-by: kernelci.org bot <[email protected]>
> >> Suggested-by: Andy Shevchenko <[email protected]>
> >> Suggested-by: Ilie Halip <[email protected]>
> >> Tested-by: Sedat Dilek <[email protected]>
> >> Signed-off-by: Sedat Dilek <[email protected]>
> >> Signed-off-by: Nick Desaulniers <[email protected]>
> >> ---
> >> arch/x86/include/asm/bitops.h | 4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/x86/include/asm/bitops.h
> >b/arch/x86/include/asm/bitops.h
> >> index b392571c1f1d..139122e5b25b 100644
> >> --- a/arch/x86/include/asm/bitops.h
> >> +++ b/arch/x86/include/asm/bitops.h
> >> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> >> if (__builtin_constant_p(nr)) {
> >> asm volatile(LOCK_PREFIX "orb %1,%0"
> >> : CONST_MASK_ADDR(nr, addr)
> >> - : "iq" (CONST_MASK(nr) & 0xff)
> >> + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> >
> >I think a better fix would be to make CONST_MASK() return a u8 value
> >rather than have to cast on every use.
> >
> >Also I question the need for the "q" constraint. It was added in
> >commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum
> >GCC version is 4.6, is this still necessary?
> >
> >--
> >Brian Gerst
>
> Yes, "q" is needed on i386.

I think the bug this worked around was that the compiler didn't detect
that CONST_MASK(nr) was also constant and didn't need to be put into
a register. The question is: does that bug still exist on compiler
versions we care about?

--
Brian Gerst

2020-05-07 14:02:40

by Brian Gerst

Subject: Re: [PATCH] x86: bitops: fix build regression

On Thu, May 7, 2020 at 7:38 AM Peter Zijlstra <[email protected]> wrote:
>
> On Tue, May 05, 2020 at 11:07:24AM -0700, [email protected] wrote:
> > On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers <[email protected]> wrote:
>
> > >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
> > > if (__builtin_constant_p(nr)) {
> > > asm volatile(LOCK_PREFIX "orb %1,%0"
> > > : CONST_MASK_ADDR(nr, addr)
> > >- : "iq" (CONST_MASK(nr) & 0xff)
> > >+ : "iq" ((u8)(CONST_MASK(nr) & 0xff))
> > > : "memory");
> > > } else {
> > > asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> > >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
> > > if (__builtin_constant_p(nr)) {
> > > asm volatile(LOCK_PREFIX "andb %1,%0"
> > > : CONST_MASK_ADDR(nr, addr)
> > >- : "iq" (CONST_MASK(nr) ^ 0xff));
> > >+ : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
> > > } else {
> > > asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> > > : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> >
> > Drop & 0xff and change ^ 0xff to ~.
>
> But then we're back to sparse being unhappy, no? The thing with ~ is
> that it will set high bits which will be truncated, which makes sparse
> sad.

This change will make sparse happy and allow these cleanups:
#define CONST_MASK(nr) ((u8)1 << ((nr) & 7))

Tested with GCC 9.3.1.

--
Brian Gerst

2020-05-07 15:12:17

by David Laight

Subject: RE: [PATCH] x86: bitops: fix build regression

From: Brian Gerst
> Sent: 07 May 2020 14:32
...
> I think the bug this worked around was that the compiler didn't detect
> that CONST_MASK(nr) was also constant and doesn't need to be put into
> a register. The question is does that bug still exist on compiler
> versions we care about?

Hmmm...
That ought to have been fixed instead of worrying about the fact
that an invalid register was used.

Alternatively, is there any reason not to use the bts/btc instructions?
Yes, I know they'll do wider accesses, but variable bit numbers do.
It is also possible that the assembler will support constant bit
numbers >= 32 by adding to the address offset.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-05-07 19:22:23

by Nick Desaulniers

Subject: Re: [PATCH] x86: bitops: fix build regression

On Thu, May 7, 2020 at 7:00 AM Brian Gerst <[email protected]> wrote:
>
> This change will make sparse happy and allow these cleanups:
> #define CONST_MASK(nr) ((u8)1 << ((nr) & 7))

yep, this is more elegant, IMO. Will send a v3 later with this
change. Looking at the uses of CONST_MASK, I noticed
arch_change_bit() currently has the (u8) cast from commit
838e8bb71dc0c ("x86: Implement change_bit with immediate operand as
"lock xorb""), so that instance can get cleaned up with the above
suggestion.

--
Thanks,
~Nick Desaulniers

2020-05-07 19:26:55

by Nick Desaulniers

Subject: Re: [PATCH] x86: bitops: fix build regression

On Wed, May 6, 2020 at 11:18 PM Brian Gerst <[email protected]> wrote:
>
> I think a better fix would be to make CONST_MASK() return a u8 value
> rather than have to cast on every use.
>
> Also I question the need for the "q" constraint. It was added in
> commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum
> GCC version is 4.6, is this still necessary?

TL;DR yes

ah, thanks for the git archeology, it's useful. I don't think this is
a compiler bug, however, just the compiler being stricter in that the
`b` suffix on `orb` requires an 8-bit register operand. For 32-bit x86,
the `q` asm constraint is required because not all GPRs have lower
8-bit register aliases, as Arnd found and HPA noted as well.
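
(Concretely, as a rough note: in 32-bit mode only %al/%bl/%cl/%dl exist
as low-8-bit aliases, so "q" restricts allocation to eax/ebx/ecx/edx; a
plain "r" could hand the template e.g. %esi, whose 8-bit form needs a
REX prefix that doesn't exist on -m32.)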

I like your suggested change to CONST_MASK, and will send that in a bit.
--
Thanks,
~Nick Desaulniers

2020-05-07 19:32:09

by H. Peter Anvin

Subject: Re: [PATCH] x86: bitops: fix build regression

On May 7, 2020 4:34:22 AM PDT, Peter Zijlstra <[email protected]> wrote:
>On Tue, May 05, 2020 at 11:07:24AM -0700, [email protected] wrote:
>> On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers
><[email protected]> wrote:
>
>> >@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
>*addr)
>> > if (__builtin_constant_p(nr)) {
>> > asm volatile(LOCK_PREFIX "orb %1,%0"
>> > : CONST_MASK_ADDR(nr, addr)
>> >- : "iq" (CONST_MASK(nr) & 0xff)
>> >+ : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>> > : "memory");
>> > } else {
>> > asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
>> >@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long
>*addr)
>> > if (__builtin_constant_p(nr)) {
>> > asm volatile(LOCK_PREFIX "andb %1,%0"
>> > : CONST_MASK_ADDR(nr, addr)
>> >- : "iq" (CONST_MASK(nr) ^ 0xff));
>> >+ : "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
>> > } else {
>> > asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
>> > : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
>>
>> Drop & 0xff and change ^ 0xff to ~.
>
>But then we're back to sparse being unhappy, no? The thing with ~ is
>that it will set high bits which will be truncated, which makes sparse
>sad.

In that case, sparse is just broken.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-05-07 19:33:02

by H. Peter Anvin

Subject: Re: [PATCH] x86: bitops: fix build regression

On May 7, 2020 6:32:24 AM PDT, Brian Gerst <[email protected]> wrote:
>On Thu, May 7, 2020 at 3:02 AM <[email protected]> wrote:
>>
>> On May 6, 2020 11:18:09 PM PDT, Brian Gerst <[email protected]>
>wrote:
>> >On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers
>> ><[email protected]> wrote:
>> >>
>> >> From: Sedat Dilek <[email protected]>
>> >>
>> >> It turns out that if your config tickles __builtin_constant_p via
>> >> differences in choices to inline or not, this now produces invalid
>> >> assembly:
>> >>
>> >> $ cat foo.c
>> >> long a(long b, long c) {
>> >> asm("orb\t%1, %0" : "+q"(c): "r"(b));
>> >> return c;
>> >> }
>> >> $ gcc foo.c
>> >> foo.c: Assembler messages:
>> >> foo.c:2: Error: `%rax' not allowed with `orb'
>> >>
>> >> The "q" constraint only has meanting on -m32 otherwise is treated
>as
>> >> "r".
>> >>
>> >> This is easily reproducible via
>> >Clang+CONFIG_STAGING=y+CONFIG_VT6656=m,
>> >> or Clang+allyesconfig.
>> >>
>> >> Keep the masking operation to appease sparse (`make C=1`), add
>back
>> >the
>> >> cast in order to properly select the proper 8b register alias.
>> >>
>> >> [Nick: reworded]
>> >>
>> >> Cc: [email protected]
>> >> Cc: Jesse Brandeburg <[email protected]>
>> >> Link: https://github.com/ClangBuiltLinux/linux/issues/961
>> >> Link:
>> >https://lore.kernel.org/lkml/[email protected]/
>> >> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved
>cast")
>> >> Reported-by: Sedat Dilek <[email protected]>
>> >> Reported-by: kernelci.org bot <[email protected]>
>> >> Suggested-by: Andy Shevchenko <[email protected]>
>> >> Suggested-by: Ilie Halip <[email protected]>
>> >> Tested-by: Sedat Dilek <[email protected]>
>> >> Signed-off-by: Sedat Dilek <[email protected]>
>> >> Signed-off-by: Nick Desaulniers <[email protected]>
>> >> ---
>> >> arch/x86/include/asm/bitops.h | 4 ++--
>> >> 1 file changed, 2 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/arch/x86/include/asm/bitops.h
>> >b/arch/x86/include/asm/bitops.h
>> >> index b392571c1f1d..139122e5b25b 100644
>> >> --- a/arch/x86/include/asm/bitops.h
>> >> +++ b/arch/x86/include/asm/bitops.h
>> >> @@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long
>*addr)
>> >> if (__builtin_constant_p(nr)) {
>> >> asm volatile(LOCK_PREFIX "orb %1,%0"
>> >> : CONST_MASK_ADDR(nr, addr)
>> >> - : "iq" (CONST_MASK(nr) & 0xff)
>> >> + : "iq" ((u8)(CONST_MASK(nr) & 0xff))
>> >
>> >I think a better fix would be to make CONST_MASK() return a u8 value
>> >rather than have to cast on every use.
>> >
>> >Also I question the need for the "q" constraint. It was added in
>> >commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum
>> >GCC version is 4.6, is this still necessary?
>> >
>> >--
>> >Brian Gerst
>>
>> Yes, "q" is needed on i386.
>
>I think the bug this worked around was that the compiler didn't detect
>that CONST_MASK(nr) was also constant and doesn't need to be put into
>a register. The question is does that bug still exist on compiler
>versions we care about?
>
>--
>Brian Gerst

The compiler is free to do that, including for legit reasons (common subexpression elimination, especially). So yes.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-05-07 19:35:40

by H. Peter Anvin

Subject: RE: [PATCH] x86: bitops: fix build regression

On May 7, 2020 8:09:35 AM PDT, David Laight <[email protected]> wrote:
>From: Brian Gerst
>> Sent: 07 May 2020 14:32
>...
>> I think the bug this worked around was that the compiler didn't
>detect
>> that CONST_MASK(nr) was also constant and doesn't need to be put into
>> a register. The question is does that bug still exist on compiler
>> versions we care about?
>
>Hmmm...
>That ought to have been fixed instead of worrying about the fact
>that an invalid register was used.
>
>Alternatively is there any reason not to use the bts/btc instructions?
>Yes, I know they'll do wider accesses, but variable bit numbers do.
>It is also possible that the assembler will support constant bit
>numbers >= 32 by adding to the address offset.
>
> David
>
>-
>Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
>MK1 1PT, UK
>Registration No: 1397386 (Wales)

They're slower, and for unaligned locked fields can be severely so.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-05-07 22:33:56

by Nick Desaulniers

Subject: Re: [PATCH] x86: bitops: fix build regression

On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers
<[email protected]> wrote:
>
> On Thu, May 7, 2020 at 7:00 AM Brian Gerst <[email protected]> wrote:
> >
> > This change will make sparse happy and allow these cleanups:
> > #define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
>
> yep, this is more elegant, IMO. Will send a v3 later with this
> change. Looking at the uses of CONST_MASK, I noticed
> arch_change_bit() currently has the (u8) cast from commit
> 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as
> "lock xorb""), so that instance can get cleaned up with the above
> suggestion.

Oh, we need the cast to be the final operation. The binary AND and
XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands
of the binary operator to int, so the type of the evaluated
subexpression is int.
https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+promotion+when+performing+bitwise+operations+on+integer+types+smaller+than+int
So I think this version (v2) is the most precise fix, and would be better
than defining more macros or (worse) using metaprogramming.
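
A quick illustration (just a sketch, using the CONST_MASK definition
suggested above): even with
  #define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
the (u8) operand of << is promoted to int before the shift, and the same
promotion applies to the operands of & and ^, so the asm input ends up
typed as int again unless the cast is the outermost operation.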
--
Thanks,
~Nick Desaulniers

2020-05-08 01:59:40

by Brian Gerst

Subject: Re: [PATCH] x86: bitops: fix build regression

On Thu, May 7, 2020 at 6:29 PM Nick Desaulniers <[email protected]> wrote:
>
> On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers
> <[email protected]> wrote:
> >
> > On Thu, May 7, 2020 at 7:00 AM Brian Gerst <[email protected]> wrote:
> > >
> > > This change will make sparse happy and allow these cleanups:
> > > #define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
> >
> > yep, this is more elegant, IMO. Will send a v3 later with this
> > change. Looking at the uses of CONST_MASK, I noticed
> > arch_change_bit() currently has the (u8) cast from commit
> > 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as
> > "lock xorb""), so that instance can get cleaned up with the above
> > suggestion.
>
> Oh, we need the cast to be the final operation. The binary AND and
> XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands
> of the binary operand to int, so the type of the evaluated
> subexpression is int.
> https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+promotion+when+performing+bitwise+operations+on+integer+types+smaller+than+int
> So I think this version (v2) is most precise fix, and would be better
> than defining more macros or (worse) using metaprogramming.

One last suggestion. Add the "b" modifier to the mask operand: "orb
%b1, %0". That forces the compiler to use the 8-bit register name
instead of trying to deduce the width from the input.
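
E.g. (untested sketch, names just for illustration):

  unsigned char set_msb(unsigned char c)
  {
          int mask = 0x80;
          /* %b1 makes the template print the 8-bit register name
           * (%al, %bl, ...) even though mask has type int. */
          asm("orb %b1, %0" : "+m"(c) : "q"(mask));
          return c;
  }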

--
Brian Gerst

2020-05-08 17:24:10

by Nick Desaulniers

Subject: Re: [PATCH] x86: bitops: fix build regression

On Thu, May 7, 2020 at 6:57 PM Brian Gerst <[email protected]> wrote:
>
> On Thu, May 7, 2020 at 6:29 PM Nick Desaulniers <[email protected]> wrote:
> >
> > On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers
> > <[email protected]> wrote:
> > >
> > > On Thu, May 7, 2020 at 7:00 AM Brian Gerst <[email protected]> wrote:
> > > >
> > > > This change will make sparse happy and allow these cleanups:
> > > > #define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
> > >
> > > yep, this is more elegant, IMO. Will send a v3 later with this
> > > change. Looking at the uses of CONST_MASK, I noticed
> > > arch_change_bit() currently has the (u8) cast from commit
> > > 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as
> > > "lock xorb""), so that instance can get cleaned up with the above
> > > suggestion.
> >
> > Oh, we need the cast to be the final operation. The binary AND and
> > XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands
> > of the binary operand to int, so the type of the evaluated
> > subexpression is int.
> > https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+promotion+when+performing+bitwise+operations+on+integer+types+smaller+than+int
> > So I think this version (v2) is most precise fix, and would be better
> > than defining more macros or (worse) using metaprogramming.
>
> One last suggestion. Add the "b" modifier to the mask operand: "orb
> %b1, %0". That forces the compiler to use the 8-bit register name
> instead of trying to deduce the width from the input.

Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers

Looks like that works for both compilers. In that case, we can likely
drop the `& 0xff`, too. Let me play with that, then I'll hopefully
send a v3 today.
--
Thanks,
~Nick Desaulniers

2020-05-08 17:34:29

by H. Peter Anvin

Subject: Re: [PATCH] x86: bitops: fix build regression

On 2020-05-08 10:21, Nick Desaulniers wrote:
>>
>> One last suggestion. Add the "b" modifier to the mask operand: "orb
>> %b1, %0". That forces the compiler to use the 8-bit register name
>> instead of trying to deduce the width from the input.
>
> Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
>
> Looks like that works for both compilers. In that case, we can likely
> drop the `& 0xff`, too. Let me play with that, then I'll hopefully
> send a v3 today.
>

Good idea. I requested a while ago that they document these modifiers; they
chose not to document them all, which in some ways is good: it shows what
they are willing to commit to indefinitely.

-hpa

2020-05-08 18:07:55

by Nick Desaulniers

Subject: [PATCH v3] x86: bitops: fix build regression

This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce
invalid assembly:

$ cat foo.c
long a(long b, long c) {
asm("orb\t%1, %0" : "+q"(c): "r"(b));
return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force the assembly
template to emit the lower-8-bit register alias for the operand.

The "q" constraint only has meaning on -m32; otherwise it is treated as
"r". Not all GPRs have low-8-bit aliases on -m32.

Cc: Jesse Brandeburg <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <[email protected]>
Reported-by: kernelci.org bot <[email protected]>
Suggested-by: Andy Shevchenko <[email protected]>
Suggested-by: Brian Gerst <[email protected]>
Suggested-by: H. Peter Anvin <[email protected]>
Suggested-by: Ilie Halip <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
Changes V2 -> V3:
* use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
* reword commit message.
* add Brian and HPA suggested by tags
* drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
enough).

Changes V1 -> V2:
* change authorship/signed-off-by to Ilie
* add Nathan's Tested by/reviewed by
* update commit message slightly with info sent to HPA.

arch/x86/include/asm/bitops.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..03e24286e4eb 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -52,9 +52,9 @@ static __always_inline void
arch_set_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "orb %1,%0"
+ asm volatile(LOCK_PREFIX "orb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) & 0xff)
+ : "iq" (CONST_MASK(nr))
: "memory");
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -72,9 +72,9 @@ static __always_inline void
arch_clear_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "andb %1,%0"
+ asm volatile(LOCK_PREFIX "andb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) ^ 0xff));
+ : "iq" (~CONST_MASK(nr)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
--
2.26.2.645.ge9eca65c58-goog
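
For readers without bitops.h open: CONST_MASK(nr) and CONST_MASK_ADDR(nr, addr)
reduce a constant bit number to an 8-bit mask plus a byte address. A rough,
non-atomic user-space sketch of the constant-nr path above (helper name made
up, no LOCK prefix, not the kernel macros themselves):

#include <stdint.h>

static inline void set_bit_byte(unsigned long *addr, unsigned int nr)
{
	/* byte that contains the bit, as the byte-address macro computes it */
	volatile uint8_t *byte = (volatile uint8_t *)addr + (nr >> 3);
	/* 8-bit mask for the bit within that byte */
	uint8_t mask = 1u << (nr & 7);

	asm volatile("orb %b1,%0"
		     : "+m" (*byte)
		     : "iq" (mask)
		     : "memory");
}

For nr == 21, for example, this ORs 0x20 into byte 2 of the word, which is
exactly the operand shape the fixed asm statement expects.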

2020-05-08 18:12:55

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v3] x86: bitops: fix build regression

On Fri, May 8, 2020 at 11:06 AM Nick Desaulniers
<[email protected]> wrote:
>
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32 otherwise is treated as
> "r". Not all GPRs have low-8-bit aliases for -m32.
>
> Cc: Jesse Brandeburg <[email protected]>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <[email protected]>
> Reported-by: kernelci.org bot <[email protected]>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Brian Gerst <[email protected]>
> Suggested-by: H. Peter Anvin <[email protected]>
> Suggested-by: Ilie Halip <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---
> Changes V2 -> V3:
> * use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
> * reword commit message.
> * add Brian and HPA suggested by tags
> * drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
> enough).

Oh, and I took over authorship for this version.

>
> Changes V1 -> V2:
> * change authorship/signed-off-by to Ilie
> * add Nathan's Tested by/reviewed by
> * update commit message slightly with info sent to HPA.
>
> arch/x86/include/asm/bitops.h | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..03e24286e4eb 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
> arch_set_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "orb %1,%0"
> + asm volatile(LOCK_PREFIX "orb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) & 0xff)
> + : "iq" (CONST_MASK(nr))
> : "memory");
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -72,9 +72,9 @@ static __always_inline void
> arch_clear_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "andb %1,%0"
> + asm volatile(LOCK_PREFIX "andb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) ^ 0xff));
> + : "iq" (~CONST_MASK(nr)));
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> --
> 2.26.2.645.ge9eca65c58-goog
>


--
Thanks,
~Nick Desaulniers

2020-05-08 18:24:42

by Brian Gerst

[permalink] [raw]
Subject: Re: [PATCH v3] x86: bitops: fix build regression

On Fri, May 8, 2020 at 2:06 PM Nick Desaulniers <[email protected]> wrote:
>
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32 otherwise is treated as
> "r". Not all GPRs have low-8-bit aliases for -m32.
>
> Cc: Jesse Brandeburg <[email protected]>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <[email protected]>
> Reported-by: kernelci.org bot <[email protected]>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Brian Gerst <[email protected]>
> Suggested-by: H. Peter Anvin <[email protected]>
> Suggested-by: Ilie Halip <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---
> Changes V2 -> V3:
> * use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
> * reword commit message.
> * add Brian and HPA suggested by tags
> * drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
> enough).
>
> Changes V1 -> V2:
> * change authorship/signed-off-by to Ilie
> * add Nathan's Tested by/reviewed by
> * update commit message slightly with info sent to HPA.
>
> arch/x86/include/asm/bitops.h | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..03e24286e4eb 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
> arch_set_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "orb %1,%0"
> + asm volatile(LOCK_PREFIX "orb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) & 0xff)
> + : "iq" (CONST_MASK(nr))
> : "memory");
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -72,9 +72,9 @@ static __always_inline void
> arch_clear_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "andb %1,%0"
> + asm volatile(LOCK_PREFIX "andb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) ^ 0xff));
> + : "iq" (~CONST_MASK(nr)));
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> : : RLONG_ADDR(addr), "Ir" (nr) : "memory");

arch_change_bit() should also be changed, but otherwise looks good.

--
Brian Gerst

2020-05-08 18:30:44

by Nick Desaulniers

[permalink] [raw]
Subject: [PATCH v4] x86: bitops: fix build regression

This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce
invalid assembly:

$ cat foo.c
long a(long b, long c) {
asm("orb\t%1, %0" : "+q"(c): "r"(b));
return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Cc: Jesse Brandeburg <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <[email protected]>
Reported-by: kernelci.org bot <[email protected]>
Suggested-by: Andy Shevchenko <[email protected]>
Suggested-by: Brian Gerst <[email protected]>
Suggested-by: H. Peter Anvin <[email protected]>
Suggested-by: Ilie Halip <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
Changes V3 -> V4:
* drop (u8) cast from arch_change_bit() as well.

Changes V2 -> V3:
* use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
* reword commit message.
* add Brian and HPA suggested by tags
* drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
enough).
* Take over authorship.

Changes V1 -> V2:
* change authorship/signed-off-by to Ilie
* add Nathan's Tested by/reviewed by
* update commit message slightly with info sent to HPA.

arch/x86/include/asm/bitops.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..8a8b7bb9677b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -52,9 +52,9 @@ static __always_inline void
arch_set_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "orb %1,%0"
+ asm volatile(LOCK_PREFIX "orb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) & 0xff)
+ : "iq" (CONST_MASK(nr))
: "memory");
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -72,9 +72,9 @@ static __always_inline void
arch_clear_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "andb %1,%0"
+ asm volatile(LOCK_PREFIX "andb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) ^ 0xff));
+ : "iq" (~CONST_MASK(nr)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
@@ -125,7 +125,7 @@ arch_change_bit(long nr, volatile unsigned long *addr)
if (__builtin_constant_p(nr)) {
asm volatile(LOCK_PREFIX "xorb %1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" ((u8)CONST_MASK(nr)));
+ : "iq" (CONST_MASK(nr)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
--
2.26.2.645.ge9eca65c58-goog

2020-05-08 18:35:20

by Nick Desaulniers

[permalink] [raw]
Subject: [PATCH v5] x86: bitops: fix build regression

This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.

It turns out that if your config tickles __builtin_constant_p via
differences in choices to inline or not, these statements produce
invalid assembly:

$ cat foo.c
long a(long b, long c) {
asm("orb\t%1, %0" : "+q"(c): "r"(b));
return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'

Use the `%b` "x86 Operand Modifier" to instead force register allocation
to select a lower-8-bit GPR operand.

The "q" constraint only has meaning on -m32 otherwise is treated as
"r". Not all GPRs have low-8-bit aliases for -m32.

Cc: Jesse Brandeburg <[email protected]>
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek <[email protected]>
Reported-by: kernelci.org bot <[email protected]>
Suggested-by: Andy Shevchenko <[email protected]>
Suggested-by: Brian Gerst <[email protected]>
Suggested-by: H. Peter Anvin <[email protected]>
Suggested-by: Ilie Halip <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
Changes V4 -> V5:
* actually use `%b` in arch_change_bit().

Changes V3 -> V4:
* drop (u8) cast from arch_change_bit() as well.

Changes V2 -> V3:
* use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
* reword commit message.
* add Brian and HPA suggested by tags
* drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
enough).
* Take over authorship.

Changes V1 -> V2:
* change authorship/signed-off-by to Ilie
* add Nathan's Tested by/reviewed by
* update commit message slightly with info sent to HPA.

arch/x86/include/asm/bitops.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..35460fef39b8 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -52,9 +52,9 @@ static __always_inline void
arch_set_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "orb %1,%0"
+ asm volatile(LOCK_PREFIX "orb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) & 0xff)
+ : "iq" (CONST_MASK(nr))
: "memory");
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -72,9 +72,9 @@ static __always_inline void
arch_clear_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "andb %1,%0"
+ asm volatile(LOCK_PREFIX "andb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) ^ 0xff));
+ : "iq" (~CONST_MASK(nr)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
@@ -123,9 +123,9 @@ static __always_inline void
arch_change_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "xorb %1,%0"
+ asm volatile(LOCK_PREFIX "xorb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" ((u8)CONST_MASK(nr)));
+ : "iq" (CONST_MASK(nr)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
--
2.26.2.645.ge9eca65c58-goog

2020-05-08 20:32:04

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH v5] x86: bitops: fix build regression

On Fri, May 08, 2020 at 11:32:29AM -0700, Nick Desaulniers wrote:
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32 otherwise is treated as
> "r". Not all GPRs have low-8-bit aliases for -m32.
>
> Cc: Jesse Brandeburg <[email protected]>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <[email protected]>
> Reported-by: kernelci.org bot <[email protected]>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Brian Gerst <[email protected]>
> Suggested-by: H. Peter Anvin <[email protected]>
> Suggested-by: Ilie Halip <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>

Reviewed-by: Nathan Chancellor <[email protected]>
Tested-by: Nathan Chancellor <[email protected]> # build, clang-11

> ---
> Changes V4 -> V5:
> * actually use `%b` in arch_change_bit().
>
> Changes V3 -> V4:
> * drop (u8) cast from arch_change_bit() as well.
>
> Changes V2 -> V3:
> * use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
> * reword commit message.
> * add Brian and HPA suggested by tags
> * drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
> enough).
> * Take over authorship.
>
> Changes V1 -> V2:
> * change authorship/signed-off-by to Ilie
> * add Nathan's Tested by/reviewed by
> * update commit message slightly with info sent to HPA.
> arch/x86/include/asm/bitops.h | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..35460fef39b8 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
> arch_set_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "orb %1,%0"
> + asm volatile(LOCK_PREFIX "orb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) & 0xff)
> + : "iq" (CONST_MASK(nr))
> : "memory");
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -72,9 +72,9 @@ static __always_inline void
> arch_clear_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "andb %1,%0"
> + asm volatile(LOCK_PREFIX "andb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) ^ 0xff));
> + : "iq" (~CONST_MASK(nr)));
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> @@ -123,9 +123,9 @@ static __always_inline void
> arch_change_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "xorb %1,%0"
> + asm volatile(LOCK_PREFIX "xorb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" ((u8)CONST_MASK(nr)));
> + : "iq" (CONST_MASK(nr)));
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
> : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> --
> 2.26.2.645.ge9eca65c58-goog
>

2020-05-08 23:48:53

by Jesse Brandeburg

[permalink] [raw]
Subject: Re: [PATCH v5] x86: bitops: fix build regression

On Fri, 8 May 2020 13:28:35 -0700
Nathan Chancellor <[email protected]> wrote:

> On Fri, May 08, 2020 at 11:32:29AM -0700, Nick Desaulniers wrote:
> > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > to select a lower-8-bit GPR operand.

This looks OK to me; I appreciate the work done to find the right
fix and clean up the code while not breaking sparse! I had a look at
the assembly from gcc 9.3.1 and it looks good. Thanks!

Reviewed-by: Jesse Brandeburg <[email protected]>
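
For anyone who wants to repeat that check without a kernel build, a
stand-alone sketch (the function name and the spelled-out lock prefix are
illustrative, not the kernel's LOCK_PREFIX machinery) that can be fed to
"gcc -O2 -S" or "clang -O2 -S":

/* Mirrors the fixed arch_set_bit() constant path for nr == 21; the
 * generated code should contain a byte-wide locked OR along the lines
 * of "lock orb $32, 2(%rdi)". */
void set_bit_21(volatile unsigned long *addr)
{
	asm volatile("lock; orb %b1,%0"
		     : "+m" (*((volatile char *)addr + (21 >> 3)))
		     : "iq" (1 << (21 & 7))
		     : "memory");
}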

2020-05-09 04:46:36

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH v5] x86: bitops: fix build regression

On Sat, May 9, 2020 at 1:47 AM Jesse Brandeburg
<[email protected]> wrote:
>
> On Fri, 8 May 2020 13:28:35 -0700
> Nathan Chancellor <[email protected]> wrote:
>
> > On Fri, May 08, 2020 at 11:32:29AM -0700, Nick Desaulniers wrote:
> > > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > > to select a lower-8-bit GPR operand.
>
> This looks OK to me, I appreciate the work done to find the right
> fix and clean up the code while not breaking sparse! I had a look at
> the assembly from gcc 9.3.1 and it looks good. Thanks!
>
> Reviewed-by: Jesse Brandeburg <[email protected]>
>

Tested v5 on Debian/testing AMD64 with a self-built llvm-toolchain
(LLVM/Clang/LLD) v10.0.1+git92d5c1be9ee9.

Please add:

Tested-by: Sedat Dilek <[email protected]>

For details see
<https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-626104287>

Thanks to all involved people.

- Sedat -

2020-05-09 12:22:48

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v5] x86: bitops: fix build regression

On Fri, May 8, 2020 at 9:35 PM Nick Desaulniers <[email protected]> wrote:
>
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32 otherwise is treated as
> "r". Not all GPRs have low-8-bit aliases for -m32.
>

Looks very good!
One question though: does it work with the minimum supported version of gcc?

> Cc: Jesse Brandeburg <[email protected]>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <[email protected]>
> Reported-by: kernelci.org bot <[email protected]>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Brian Gerst <[email protected]>
> Suggested-by: H. Peter Anvin <[email protected]>
> Suggested-by: Ilie Halip <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---
> Changes V4 -> V5:
> * actually use `%b` in arch_change_bit().
>
> Changes V3 -> V4:
> * drop (u8) cast from arch_change_bit() as well.
>
> Changes V2 -> V3:
> * use `%b` "x86 Operand Modifier" instead of bitwise op then cast.
> * reword commit message.
> * add Brian and HPA suggested by tags
> * drop Nathan & Sedat Tested by/reviewed by tags (new patch is different
> enough).
> * Take over authorship.
>
> Changes V1 -> V2:
> * change authorship/signed-off-by to Ilie
> * add Nathan's Tested by/reviewed by
> * update commit message slightly with info sent to HPA.
> arch/x86/include/asm/bitops.h | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..35460fef39b8 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
> arch_set_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "orb %1,%0"
> + asm volatile(LOCK_PREFIX "orb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) & 0xff)
> + : "iq" (CONST_MASK(nr))
> : "memory");
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
> @@ -72,9 +72,9 @@ static __always_inline void
> arch_clear_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "andb %1,%0"
> + asm volatile(LOCK_PREFIX "andb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) ^ 0xff));
> + : "iq" (~CONST_MASK(nr)));
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
> : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> @@ -123,9 +123,9 @@ static __always_inline void
> arch_change_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "xorb %1,%0"
> + asm volatile(LOCK_PREFIX "xorb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" ((u8)CONST_MASK(nr)));
> + : "iq" (CONST_MASK(nr)));
> } else {
> asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
> : : RLONG_ADDR(addr), "Ir" (nr) : "memory");
> --
> 2.26.2.645.ge9eca65c58-goog
>


--
With Best Regards,
Andy Shevchenko

2020-05-09 15:45:33

by Brian Gerst

[permalink] [raw]
Subject: Re: [PATCH v5] x86: bitops: fix build regression

On Sat, May 9, 2020 at 8:20 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Fri, May 8, 2020 at 9:35 PM Nick Desaulniers <[email protected]> wrote:
> >
> > This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, these statements produce
> > invalid assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> > asm("orb\t%1, %0" : "+q"(c): "r"(b));
> > return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > to select a lower-8-bit GPR operand.
> >
> > The "q" constraint only has meaning on -m32 otherwise is treated as
> > "r". Not all GPRs have low-8-bit aliases for -m32.
> >
>
> Looks very good!
> One question though, does it work with minimum supported version of gcc?

Yes. The operand width modifiers have been around a long time, but were
not well documented until more recently.

--
Brian Gerst

2020-05-10 12:03:37

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] x86: bitops: fix build regression

From: Peter Anvin
> Sent: 08 May 2020 18:32
> On 2020-05-08 10:21, Nick Desaulniers wrote:
> >>
> >> One last suggestion. Add the "b" modifier to the mask operand: "orb
> >> %b1, %0". That forces the compiler to use the 8-bit register name
> >> instead of trying to deduce the width from the input.
> >
> > Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> >
> > Looks like that works for both compilers. In that case, we can likely
> > drop the `& 0xff`, too. Let me play with that, then I'll hopefully
> > send a v3 today.
> >
>
> Good idea. I requested a while ago that they document these modifiers; they
> chose not to document them all which in some ways is good; it shows what they
> are willing to commit to indefinitely.

I thought the intention here was to explicitly do a byte access.
If the constant bit number has had a div/mod by 8 done on it, then
the address can be misaligned - so you mustn't do a non-byte-sized
locked access.

OTOH the original base address must be aligned.

Looking at some instruction timing, BTS/BTR aren't too bad if the
bit number is a constant, but they are 6 or 7 clocks slower if it is in %cl.
Given these are locked RMW bus cycles, they'll always be slow!

How about an asm multi-part alternative that uses a byte offset
and byte constant if the compiler thinks the mask is constant,
or a 4-byte offset and 32-bit mask if it doesn't?

The other alternative is to just use BTS/BTR and (maybe) rely on the
assembler to add in the word offset to the base address.

David
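
To make the alignment point concrete, a small hypothetical example (made-up
names, user space, nothing locked) of the address/mask pair the constant
path derives:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	unsigned long word[2] = { 0, 0 };
	const unsigned int nr = 21;			/* constant bit number */
	uint8_t *byte = (uint8_t *)word + (nr >> 3);	/* word + 2 bytes */
	uint8_t mask = 1u << (nr & 7);			/* 0x20 */

	/* The derived address is only byte-aligned, so a "lock orb" there
	 * is fine, while a 32-bit locked RMW at the same address would be
	 * a misaligned (split) locked access. */
	printf("byte offset %td, mask %#x\n",
	       byte - (uint8_t *)word, (unsigned int)mask);
	return 0;
}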


2020-05-10 12:36:35

by H. Peter Anvin

[permalink] [raw]
Subject: RE: [PATCH] x86: bitops: fix build regression

On May 10, 2020 4:59:17 AM PDT, David Laight <[email protected]> wrote:
>From: Peter Anvin
>> Sent: 08 May 2020 18:32
>> On 2020-05-08 10:21, Nick Desaulniers wrote:
>> >>
>> >> One last suggestion. Add the "b" modifier to the mask operand:
>"orb
>> >> %b1, %0". That forces the compiler to use the 8-bit register name
>> >> instead of trying to deduce the width from the input.
>> >
>> > Ah right:
>https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
>> >
>> > Looks like that works for both compilers. In that case, we can
>likely
>> > drop the `& 0xff`, too. Let me play with that, then I'll hopefully
>> > send a v3 today.
>> >
>>
>> Good idea. I requested a while ago that they document these
>modifiers; they
>> chose not to document them all which in some ways is good; it shows
>what they
>> are willing to commit to indefinitely.
>
>I thought the intention here was to explicitly do a byte access.
>If the constant bit number has had a div/mod by 8 done on it then
>the address can be misaligned - so you mustn't do a non-byte sized
>locked access.
>
>OTOH the original base address must be aligned.
>
>Looking at some instruction timing, BTS/BTR aren't too bad if the
>bit number is a constant. But are 6 or 7 clocks slower if it is in %cl.
>Given these are locked RMW bus cycles they'll always be slow!
>
>How about an asm multi-part alternative that uses a byte offset
>and byte constant if the compiler thinks the mask is constant
>or a 4-byte offset and 32bit mask if it doesn't.
>
>The other alternative is to just use BTS/BTS and (maybe) rely on the
>assembler to add in the word offset to the base address.
>
> David
>
>-
>Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes,
>MK1 1PT, UK
>Registration No: 1397386 (Wales)

I don't understand what you are getting at here.

The intent is to do a byte access. The "multi-part asm" you are talking about is also already there...
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2020-05-10 13:58:02

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v5] x86: bitops: fix build regression

From: Nick Desaulniers
> Sent: 08 May 2020 19:32
..
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
...
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b392571c1f1d..35460fef39b8 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -52,9 +52,9 @@ static __always_inline void
> arch_set_bit(long nr, volatile unsigned long *addr)
> {
> if (__builtin_constant_p(nr)) {
> - asm volatile(LOCK_PREFIX "orb %1,%0"
> + asm volatile(LOCK_PREFIX "orb %b1,%0"
> : CONST_MASK_ADDR(nr, addr)
> - : "iq" (CONST_MASK(nr) & 0xff)
> + : "iq" (CONST_MASK(nr))
> : "memory");

What happens if CONST_MASK() is changed to:
#define CONST_MASK_(n) (n == 0 ? 1 : n == 1 ? 2 : n ....)
#define CONST_MASK(n) CONST_MASK_(((n) & 7))

and a separate definition for the inverse mask.

The lack of arithmetic promotion may mean that only the
"i" constraint is needed.

David
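
A hypothetical spelled-out version of that CONST_MASK_() table, purely to
illustrate the idea (a guess at the intent, not a tested or proposed kernel
change):

#define CONST_MASK_(n)						\
	((n) == 0 ? 0x01 : (n) == 1 ? 0x02 :			\
	 (n) == 2 ? 0x04 : (n) == 3 ? 0x08 :			\
	 (n) == 4 ? 0x10 : (n) == 5 ? 0x20 :			\
	 (n) == 6 ? 0x40 : 0x80)
#define CONST_MASK(n)		CONST_MASK_((n) & 7)
/* separate inverse-mask definition, as suggested above */
#define CONST_MASK_INV(n)	(CONST_MASK(n) ^ 0xff)

For example, CONST_MASK(21) evaluates to 0x20 and CONST_MASK_INV(21) to 0xdf,
both of which already fit in 8 bits, so the argument is that a plain "i"
constraint could suffice.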


2020-05-11 17:27:44

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v5] x86: bitops: fix build regression

On Sat, May 9, 2020 at 5:20 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Fri, May 8, 2020 at 9:35 PM Nick Desaulniers <[email protected]> wrote:
> >
> > This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, these statements produce
> > invalid assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> > asm("orb\t%1, %0" : "+q"(c): "r"(b));
> > return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > to select a lower-8-bit GPR operand.
> >
> > The "q" constraint only has meaning on -m32 otherwise is treated as
> > "r". Not all GPRs have low-8-bit aliases for -m32.
> >
>
> Looks very good!
> One question though, does it work with minimum supported version of gcc?

Yes; the oldest version of GCC on godbolt.org is GCC 4.1.2, which
supports the syntax and generates the expected lower-8-bit register
operands.
--
Thanks,
~Nick Desaulniers

2020-05-11 18:54:26

by Brian Gerst

[permalink] [raw]
Subject: Re: [PATCH v5] x86: bitops: fix build regression

On Fri, May 8, 2020 at 2:32 PM Nick Desaulniers <[email protected]> wrote:
>
> This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
>
> It turns out that if your config tickles __builtin_constant_p via
> differences in choices to inline or not, these statements produce
> invalid assembly:
>
> $ cat foo.c
> long a(long b, long c) {
> asm("orb\t%1, %0" : "+q"(c): "r"(b));
> return c;
> }
> $ gcc foo.c
> foo.c: Assembler messages:
> foo.c:2: Error: `%rax' not allowed with `orb'
>
> Use the `%b` "x86 Operand Modifier" to instead force register allocation
> to select a lower-8-bit GPR operand.
>
> The "q" constraint only has meaning on -m32 otherwise is treated as
> "r". Not all GPRs have low-8-bit aliases for -m32.
>
> Cc: Jesse Brandeburg <[email protected]>
> Link: https://github.com/ClangBuiltLinux/linux/issues/961
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> Reported-by: Sedat Dilek <[email protected]>
> Reported-by: kernelci.org bot <[email protected]>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Brian Gerst <[email protected]>
> Suggested-by: H. Peter Anvin <[email protected]>
> Suggested-by: Ilie Halip <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>

Reviewed-By: Brian Gerst <[email protected]>

2020-05-14 23:59:09

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v5] x86: bitops: fix build regression

Bumping for this to get reviewed and picked up; it fixes a regression I
would prefer didn't ship in 5.7.

On Mon, May 11, 2020 at 11:52 AM Brian Gerst <[email protected]> wrote:
>
> On Fri, May 8, 2020 at 2:32 PM Nick Desaulniers <[email protected]> wrote:
> >
> > This is easily reproducible via CC=clang+CONFIG_STAGING=y+CONFIG_VT6656=m.
> >
> > It turns out that if your config tickles __builtin_constant_p via
> > differences in choices to inline or not, these statements produce
> > invalid assembly:
> >
> > $ cat foo.c
> > long a(long b, long c) {
> > asm("orb\t%1, %0" : "+q"(c): "r"(b));
> > return c;
> > }
> > $ gcc foo.c
> > foo.c: Assembler messages:
> > foo.c:2: Error: `%rax' not allowed with `orb'
> >
> > Use the `%b` "x86 Operand Modifier" to instead force register allocation
> > to select a lower-8-bit GPR operand.
> >
> > The "q" constraint only has meaning on -m32 otherwise is treated as
> > "r". Not all GPRs have low-8-bit aliases for -m32.
> >
> > Cc: Jesse Brandeburg <[email protected]>
> > Link: https://github.com/ClangBuiltLinux/linux/issues/961
> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Link: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
> > Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
> > Reported-by: Sedat Dilek <[email protected]>
> > Reported-by: kernelci.org bot <[email protected]>
> > Suggested-by: Andy Shevchenko <[email protected]>
> > Suggested-by: Brian Gerst <[email protected]>
> > Suggested-by: H. Peter Anvin <[email protected]>
> > Suggested-by: Ilie Halip <[email protected]>
> > Signed-off-by: Nick Desaulniers <[email protected]>
>
> Reviewed-By: Brian Gerst <[email protected]>



--
Thanks,
~Nick Desaulniers