LinuxLists.cc - Re: arch/riscv doesn't support xchg() on bool

2019-10-22 01:23:56

Subject: Re: arch/riscv doesn't support xchg() on bool

Hi Eric,

On Mon, 21 Oct 2019, Eric Biggers wrote:

> The kbuild test robot reported a build error on RISC-V in this patch:
>
> https://patchwork.kernel.org/patch/11182389/
>
> ... because of the line:
>
> if (!xchg(&mode->logged_impl_name, true)) {
>
> where logged_impl_name is a 'bool'. The problem is that unlike most (or
> all?) other kernel architectures, arch/riscv/ doesn't support xchg() on
> bytes.

When I looked at this in August, it looked like several other Linux
architectures - SPARC, Microblaze, C-SKY, and Hexagon - also didn't
support xchg() on anything other than 32-bit types:

https://lore.kernel.org/lkml/[email protected]/

Examples:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/sparc/include/asm/cmpxchg_32.h#n18

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/sparc/include/asm/cmpxchg_32.h#n41

> Is there any chance this could be implemented, to avoid this
> architecture-specific quirk?

It is certainly possible. I wonder whether it is wise. Several of the
other architectures implement a software workaround for this operation,
and I guess you're advocating that we do the same. We could copy one of
these implementations. However, the workarounds balloon into quite a lot
of code. Here is an example from MIPS:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/mips/kernel/cmpxchg.c#n10
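To give a rough sense of why such a workaround grows, here is a minimal
userspace sketch of the general technique: exchanging a single byte using
only a 32-bit compare-and-swap on the containing word. It is not the MIPS
code linked above. It uses C11 <stdatomic.h> in place of kernel primitives,
and the name xchg_u8_emulated and the "lane" parameter are hypothetical. A
real implementation would also have to derive the aligned word address and
the shift from the byte pointer, and account for endianness.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: exchange one byte "lane" (0..3,
 * counted from the least significant byte) inside a 32-bit word, using
 * only a 32-bit compare-and-swap.
 */
static uint8_t xchg_u8_emulated(_Atomic uint32_t *word, unsigned lane,
                                uint8_t newval)
{
    unsigned shift = lane * 8;
    uint32_t mask = (uint32_t)0xff << shift;
    uint32_t old = atomic_load(word);
    uint32_t new32;

    do {
        /* Keep the three neighbouring bytes, splice in the new one. */
        new32 = (old & ~mask) | ((uint32_t)newval << shift);
        /* On failure, 'old' is refreshed with the current word value. */
    } while (!atomic_compare_exchange_weak(word, &old, new32));

    /* Extract the byte that was in the lane before the swap. */
    return (uint8_t)((old & mask) >> shift);
}
```

Even in this simplified form, a single exchange becomes a load, a mask, a
shift, and a retry loop, and the loop can spin when neighbouring bytes in
the same word are being modified concurrently.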

I could be wrong, but I think this expansion would be pretty surprising
for most users of xchg(). I suspect most xchg() users are looking for
something performant, and would be better served by simply using a
variable with a 32-bit type.

In the case of your patch, it appears that struct
fscrypt_mode.logged_impl_name is only used in the patched function. It
looks like it could be promoted into a u32 without much difficulty.
Would you be willing to consider that approach of solving the problem?
Then the code would be able to take advantage of the fast hardware
implementation that's available on many architectures (including RISC-V).
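As an illustration of that approach, a log-once flag held in a 32-bit
variable might look roughly like the sketch below. This is a userspace
approximation, not the fscrypt code: C11 atomic_exchange() stands in for
the kernel's xchg(), and struct mode_sketch and should_log_impl_name() are
hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct; only the flag is modelled. */
struct mode_sketch {
    _Atomic uint32_t logged_impl_name; /* a 'bool' in the original patch */
};

/* Returns true for exactly one caller: the first to set the flag. */
static bool should_log_impl_name(struct mode_sketch *mode)
{
    /* atomic_exchange() stands in for the kernel's xchg() here. */
    return atomic_exchange(&mode->logged_impl_name, 1) == 0;
}
```

The caller would log only when should_log_impl_name() returns true, which
matches the intent of the original "if (!xchg(..., true))" test while
keeping the operand 32 bits wide.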

> Note, there's at least one other place in the kernel that also uses
> xchg() on a bool.

Given the nasty compatibility code, I wonder if we'd be better served by
removing most of this compatibility code across the kernel, and just
requiring callers to use a 32-bit type? For most callers that I've seen,
this doesn't seem to be much of an issue; and it would avoid the nasty
code involved in software emulations of xchg().

- Paul

2019-10-22 01:49:16

by Eric Biggers

Subject: Re: arch/riscv doesn't support xchg() on bool

Hi Paul,

On Mon, Oct 21, 2019 at 06:23:11PM -0700, Paul Walmsley wrote:
> Hi Eric,
>
> On Mon, 21 Oct 2019, Eric Biggers wrote:
>
> > The kbuild test robot reported a build error on RISC-V in this patch:
> >
> > https://patchwork.kernel.org/patch/11182389/
> >
> > ... because of the line:
> >
> > if (!xchg(&mode->logged_impl_name, true)) {
> >
> > where logged_impl_name is a 'bool'. The problem is that unlike most (or
> > all?) other kernel architectures, arch/riscv/ doesn't support xchg() on
> > bytes.
>
> When I looked at this in August, it looked like several other Linux
> architectures - SPARC, Microblaze, C-SKY, and Hexagon - also didn't
> support xchg() on anything other than 32-bit types:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Examples:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/sparc/include/asm/cmpxchg_32.h#n18
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/sparc/include/asm/cmpxchg_32.h#n41
>
> > Is there any chance this could be implemented, to avoid this
> > architecture-specific quirk?
>
> It is certainly possible. I wonder whether it is wise. Several of the
> other architectures implement a software workaround for this operation,
> and I guess you're advocating that we do the same. We could copy one of
> these implementations. However, the workarounds balloon into quite a lot
> of code. Here is an example from MIPS:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/mips/kernel/cmpxchg.c#n10
>
> I could be wrong, but I think this expansion would be pretty surprising
> for most users of xchg(). I suspect most xchg() users are looking for
> something performant, and would be better served by simply using a
> variable with a 32-bit type.
>
> In the case of your patch, it appears that struct
> fscrypt_mode.logged_impl_name is only used in the patched function. It
> looks like it could be promoted into a u32 without much difficulty.
> Would you be willing to consider that approach of solving the problem?
> Then the code would be able to take advantage of the fast hardware
> implementation that's available on many architectures (including RISC-V).

Yes, I already sent a new version of the patch, which changes the variable to an
int: https://patchwork.kernel.org/patch/11203003/. I was wondering more about
how to stop other people from running into this.

>
> > Note, there's at least one other place in the kernel that also uses
> > xchg() on a bool.
>
> Given the nasty compatibility code, I wonder if we'd be better served by
> removing most of this compatibility code across the kernel, and just
> requiring callers to use a 32-bit type? For most callers that I've seen,
> this doesn't seem to be much of an issue; and it would avoid the nasty
> code involved in software emulations of xchg().
>

It's possible that's the better approach; someone would need to go through all
the xchg() users and check whether any truly need the 8- or 16-bit support. My
main concern was just the annoyance of code that only fails to compile on
certain architectures. It should either be one way or the other everywhere.
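One way the "same everywhere" behaviour could be enforced, if the 32-bit-only
route were taken, is a compile-time width check in a common wrapper, so that
a 'bool' operand fails to build on every architecture rather than only on
some. The sketch below is a userspace illustration under that assumption.
xchg32_checked and set_flag_once are hypothetical names, not existing kernel
macros, and C11 atomic_exchange() again stands in for xchg().

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical wrapper, not an existing kernel macro. The array size is
 * negative, and compilation fails, unless *ptr is exactly 4 bytes wide.
 */
#define xchg32_checked(ptr, val)                               \
    ((void)sizeof(char[sizeof(*(ptr)) == 4 ? 1 : -1]),         \
     atomic_exchange((ptr), (val)))

/* Sets the flag to 1 and returns its previous value. */
static uint32_t set_flag_once(_Atomic uint32_t *flag)
{
    return xchg32_checked(flag, 1u);
}
```

With a wrapper of this kind, passing a pointer to a one-byte object would be
rejected by the compiler regardless of the target, instead of surfacing later
as an architecture-specific build failure.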

- Eric