2023-08-01 23:25:43

by Nick Desaulniers

[permalink] [raw]
Subject: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

Compiling big-endian targets with Clang produces the diagnostic:

fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands
[-Wbitwise-instead-of-logical]
} while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants)));
~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
||
fs/namei.c:2173:13: note: cast one or both operands to int to silence
this warning

It appears that when has_zero was introduced, two definitions were
produced with different signatures (in particular different return types).

Looking at the usage in hash_name() in fs/namei.c, I suspect that
has_zero() is meant to be invoked twice per while loop iteration; using
logical-or would not update `bdata` when `a` did not have zeros. So I
think it's preferred to always return an unsigned long rather than a
bool then update the while loop in hash_name() to use a logical-or
rather than bitwise-or.

Link: https://github.com/ClangBuiltLinux/linux/issues/1832
Fixes: 36126f8f2ed8 ("word-at-a-time: make the interfaces truly generic")
Debugged-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
include/asm-generic/word-at-a-time.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/asm-generic/word-at-a-time.h b/include/asm-generic/word-at-a-time.h
index 20c93f08c993..95a1d214108a 100644
--- a/include/asm-generic/word-at-a-time.h
+++ b/include/asm-generic/word-at-a-time.h
@@ -38,7 +38,7 @@ static inline long find_zero(unsigned long mask)
return (mask >> 8) ? byte : byte + 1;
}

-static inline bool has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
+static inline unsigned long has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
{
unsigned long rhs = val | c->low_bits;
*data = rhs;

---
base-commit: 18b44bc5a67275641fb26f2c54ba7eef80ac5950
change-id: 20230801-bitwise-7812b11e5fb7

Best regards,
--
Nick Desaulniers <[email protected]>



2023-08-02 01:14:57

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

On Tue, Aug 1, 2023 at 3:22 PM <[email protected]> wrote:
>
> Compiling big-endian targets with Clang produces the diagnostic:
>
> fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands
> [-Wbitwise-instead-of-logical]
> } while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants)));
> ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ||
> fs/namei.c:2173:13: note: cast one or both operands to int to silence
> this warning
>
> It appears that when has_zero was introduced, two definitions were
> produced with different signatures (in particular different return types).
>
> Looking at the usage in hash_name() in fs/namei.c, I suspect that
> has_zero() is meant to be invoked twice per while loop iteration; using
> logical-or would not update `bdata` when `a` did not have zeros. So I
> think it's preferred to always return an unsigned long rather than a
> bool then update the while loop in hash_name() to use a logical-or

s/then/than/

> rather than bitwise-or.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/1832
> Fixes: 36126f8f2ed8 ("word-at-a-time: make the interfaces truly generic")
> Debugged-by: Nathan Chancellor <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---
> include/asm-generic/word-at-a-time.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/word-at-a-time.h b/include/asm-generic/word-at-a-time.h
> index 20c93f08c993..95a1d214108a 100644
> --- a/include/asm-generic/word-at-a-time.h
> +++ b/include/asm-generic/word-at-a-time.h
> @@ -38,7 +38,7 @@ static inline long find_zero(unsigned long mask)
> return (mask >> 8) ? byte : byte + 1;
> }
>
> -static inline bool has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
> +static inline unsigned long has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
> {
> unsigned long rhs = val | c->low_bits;
> *data = rhs;
>
> ---
> base-commit: 18b44bc5a67275641fb26f2c54ba7eef80ac5950
> change-id: 20230801-bitwise-7812b11e5fb7
>
> Best regards,
> --
> Nick Desaulniers <[email protected]>
>


--
Thanks,
~Nick Desaulniers

2023-08-02 01:42:56

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

On Tue, 1 Aug 2023 at 15:22, <[email protected]> wrote:
>
> Compiling big-endian targets with Clang produces the diagnostic:
>
> fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands
> [-Wbitwise-instead-of-logical]
> } while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants)));

Gaah.

Yes, I think that 'has_zero()' should return the 'unsigned long' bits
on big-endian too, because I do think we always want the bit ops, and
turn it into a boolean only at the very end.

> It appears that when has_zero was introduced, two definitions were
> produced with different signatures (in particular different return types).

Big-endian was kind of a later addition, and while that file is called
"generic", it's really "little-endian has an easier time of this all,
but let's do the 'generic' file for the more complicated case".

Who ends up being affected by this? Powerpc does its own
word-at-a-time thing because the big-endian case is nasty and you can
do better with special instructions that they have.

Who else is even BE any more? Some old 32-bit arm setup?

I think the patch is fine, but I guess I'd like to know that people
who are affected actually don't see any code generation changes (or
possibly see improvements from not turning it into a bool until later)

Linus

2023-08-02 07:29:11

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

On Wed, Aug 2, 2023, at 03:07, Linus Torvalds wrote:
> On Tue, 1 Aug 2023 at 15:22, <[email protected]> wrote:

> Who ends up being affected by this? Powerpc does its own
> word-at-a-time thing because the big-endian case is nasty and you can
> do better with special instructions that they have.

powerpc needs the same patch though.

> Who else is even BE any more? Some old 32-bit arm setup?
>
> I think the patch is fine, but I guess I'd like to know that people
> who are affected actually don't see any code generation changes (or
> possibly see improvements from not turning it into a bool until later)

s390 is the main one here, maintainers added to Cc.

Atheros and Realtek MIPS are older but still shipping in low-end
network equipment, and there are still users on old embedded
big-endian mips (broadcom, cavium, intel/lantiq and arm
(intel ixp4xx) hardware. All modern Arm hardware (v6/v7/v8/v9)
can do both big-endian and little-endian, but actual user numbers
for big-endian are close to zero -- I have not heard from anyone
actually using it in years.

And then there was a lot of the old-school workstation/server
hardware (m68k, mips, parisc and sparc, not alpha/ia64) that is
mostly big-endian and still maintained to some degree.

Arnd

2023-08-02 09:09:17

by Heiko Carstens

[permalink] [raw]
Subject: Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

On Wed, Aug 02, 2023 at 07:59:48AM +0200, Arnd Bergmann wrote:
> On Wed, Aug 2, 2023, at 03:07, Linus Torvalds wrote:
> > On Tue, 1 Aug 2023 at 15:22, <[email protected]> wrote:
>
> > Who ends up being affected by this? Powerpc does its own
> > word-at-a-time thing because the big-endian case is nasty and you can
> > do better with special instructions that they have.
>
> powerpc needs the same patch though.
>
> > Who else is even BE any more? Some old 32-bit arm setup?
> >
> > I think the patch is fine, but I guess I'd like to know that people
> > who are affected actually don't see any code generation changes (or
> > possibly see improvements from not turning it into a bool until later)
>
> s390 is the main one here, maintainers added to Cc.

The generated code on s390 is identical with and without the patch
using gcc 13.2.
So this looks fine from my point of view.

2023-08-02 16:51:53

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

On Tue, Aug 01, 2023 at 06:07:08PM -0700, Linus Torvalds wrote:
> I think the patch is fine, but I guess I'd like to know that people
> who are affected actually don't see any code generation changes (or
> possibly see improvements from not turning it into a bool until later)

We see this warning with ARCH=arm64 defconfig + CONFIG_CPU_BIG_ENDIAN=y.
With both clang 18.0.0 (tip of tree) and GCC 13.1.0, I don't see any
actual code generation changes in fs/namei.o with this configuration.
I'd be pretty surprised if any of the other uses of has_zero() show any
changes, I at least checked lib/string.o with that configuration and
s390 and did not see anything.

As far as I can tell, arm and arm64 with CONFIG_CPU_BIG_ENDIAN=y are the
only configurations that can hit the particular bit of code with the
generic big endian has_zero() implementation because the version of
hash_name() that uses has_zero() in this manner is only used when
CONFIG_DCACHE_WORD_ACCESS is set, which only arm, arm64, powerpc (little
endian), and x86 select.

arch/arm/Kconfig:49: select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/arm64/Kconfig:121: select DCACHE_WORD_ACCESS
arch/powerpc/Kconfig:183: select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN
arch/x86/Kconfig:140: select DCACHE_WORD_ACCESS if !KMSAN
arch/x86/um/Kconfig:12: select DCACHE_WORD_ACCESS

So seems like a pretty low risk patch to me but I could be missing
something.

Cheers,
Nathan

2023-08-02 19:10:31

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

On Wed, 2 Aug 2023 at 09:16, Nathan Chancellor <[email protected]> wrote:
>
> We see this warning with ARCH=arm64 defconfig + CONFIG_CPU_BIG_ENDIAN=y.

Oh Christ. I didn't even realize that arm64 allowed a BE config.

The config option goes back to 2013 - are there actually BE user space
implementations around?

People, why do we do that? That's positively crazy. BE is dead and
should be relegated to legacy platforms. There are no advantages to
being different just for the sake of being different - any "security
by obscurity" argument would be far outweighed by the inconvenience to
actual users.

Yes, yes, I know the aarch64 architecture technically allows BE
implementations - and apparently you can even do it by exception
level, which I had to look up. But do any actually exist?

Does the kernel even work right in BE mode? It's really easy to miss
some endianness check when all the actual hardware and use is LE, and
when (for example) instruction encoding and IO is then always LE
anyway.

> With both clang 18.0.0 (tip of tree) and GCC 13.1.0, I don't see any
> actual code generation changes in fs/namei.o with this configuration.

Ok, since the main legacy platform confirmed that, I'll just apply
that patch directly.

I'll also do the powerpc version that Arnd pointed to at the same
time, since it seems silly to pick these off one at a time. It too
should just be 'unsigned long', so that the two values can be bitwise
or'ed together without any questions.

Linus

2023-08-02 20:13:08

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

On Wed, Aug 2, 2023, at 19:37, Linus Torvalds wrote:
> On Wed, 2 Aug 2023 at 09:16, Nathan Chancellor <[email protected]> wrote:
>>
>> We see this warning with ARCH=arm64 defconfig + CONFIG_CPU_BIG_ENDIAN=y.
>
> Oh Christ. I didn't even realize that arm64 allowed a BE config.
>
> The config option goes back to 2013 - are there actually BE user space
> implementations around?

At least NXP's Layerscape and Huawei's SoCs ended up in big-endian
appliances, running legacy software ported from mips or powerpc.
I agree this was a mistake, but that wasn't nearly as obvious ten
years ago when there were still new BE-only sparc, mips and powerpc
put on the market -- that really only ended in 2017.

> People, why do we do that? That's positively crazy. BE is dead and
> should be relegated to legacy platforms. There are no advantages to
> being different just for the sake of being different - any "security
> by obscurity" argument would be far outweighed by the inconvenience to
> actual users.
>
> Yes, yes, I know the aarch64 architecture technically allows BE
> implementations - and apparently you can even do it by exception
> level, which I had to look up. But do any actually exist?
>
> Does the kernel even work right in BE mode? It's really easy to miss
> some endianness check when all the actual hardware and use is LE, and
> when (for example) instruction encoding and IO is then always LE
> anyway.

This was always only done for compatibility with non-portable
software when companies with large custom network stacks argued
that it was cheaper to build the entire open source software to
big-endian than port their own product to little-endian. ;-)

We (Linaro) used to test all toolchain and kernel releases in
big-endian mode as member companies had customers that asked
for it, but that stopped a while ago as those legacy software
stacks either got more portable or got replaced over time.

Many Arm systems won't boot BE kernels any more because UEFI
firmware only supports LE, or because of driver bugs.
Virtual machines are still likely to work fine though.
I'm fairly sure that all Arm Cortex and Neoverse cores still\
support BE mode in all exception levels, OTOH at least Apple's
custom CPUs do not implement it at all.

Arnd

2023-08-04 13:17:52

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness

On Wed, Aug 02, 2023 at 08:17:32PM +0200, Arnd Bergmann wrote:
> On Wed, Aug 2, 2023, at 19:37, Linus Torvalds wrote:
> > On Wed, 2 Aug 2023 at 09:16, Nathan Chancellor <[email protected]> wrote:
> >>
> >> We see this warning with ARCH=arm64 defconfig + CONFIG_CPU_BIG_ENDIAN=y.
> >
> > Oh Christ. I didn't even realize that arm64 allowed a BE config.
> >
> > The config option goes back to 2013 - are there actually BE user space
> > implementations around?
>
> At least NXP's Layerscape and Huawei's SoCs ended up in big-endian
> appliances, running legacy software ported from mips or powerpc.
> I agree this was a mistake, but that wasn't nearly as obvious ten
> years ago when there were still new BE-only sparc, mips and powerpc
> put on the market -- that really only ended in 2017.
>
> > People, why do we do that? That's positively crazy. BE is dead and
> > should be relegated to legacy platforms. There are no advantages to
> > being different just for the sake of being different - any "security
> > by obscurity" argument would be far outweighed by the inconvenience to
> > actual users.
> >
> > Yes, yes, I know the aarch64 architecture technically allows BE
> > implementations - and apparently you can even do it by exception
> > level, which I had to look up. But do any actually exist?
> >
> > Does the kernel even work right in BE mode? It's really easy to miss
> > some endianness check when all the actual hardware and use is LE, and
> > when (for example) instruction encoding and IO is then always LE
> > anyway.
>
> This was always only done for compatibility with non-portable
> software when companies with large custom network stacks argued
> that it was cheaper to build the entire open source software to
> big-endian than port their own product to little-endian. ;-)
>
> We (Linaro) used to test all toolchain and kernel releases in
> big-endian mode as member companies had customers that asked
> for it, but that stopped a while ago as those legacy software
> stacks either got more portable or got replaced over time.
>
> Many Arm systems won't boot BE kernels any more because UEFI
> firmware only supports LE, or because of driver bugs.
> Virtual machines are still likely to work fine though.
> I'm fairly sure that all Arm Cortex and Neoverse cores still\
> support BE mode in all exception levels, OTOH at least Apple's
> custom CPUs do not implement it at all.

Yes, that's right. The CPUs we have *do* tend to support BE, but
practically there isn't any software to run on them. I asked about
removing it a few years ago:

https://lore.kernel.org/linux-arm-kernel/20191011102747.lpbaur2e4nqyf7sw@willie-the-truck/

but Hanjun said that Huawei are using it, so it stayed.

Will