2020-04-09 23:35:00

by Jian Cai

[permalink] [raw]
Subject: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
kernel.

Signed-off-by: Jian Cai <[email protected]>
---
arch/arm/Kconfig | 2 +-
init/Kconfig | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 66a04f6f4775..39de8fc64a73 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -804,7 +804,7 @@ source "arch/arm/mm/Kconfig"

config IWMMXT
bool "Enable iWMMXt support"
- depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
+ depends on !AS_IS_CLANG && !LD_IS_LLD && (CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B)
default y if PXA27x || PXA3xx || ARCH_MMP || CPU_PJ4 || CPU_PJ4B
help
Enable support for iWMMXt context switching at run time if
diff --git a/init/Kconfig b/init/Kconfig
index 1c12059e0f7e..b0ab3271e900 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -19,6 +19,12 @@ config GCC_VERSION
config CC_IS_CLANG
def_bool $(success,$(CC) --version | head -n 1 | grep -q clang)

+config AS_IS_CLANG
+ def_bool $(success,$(AS) --version | head -n 1 | grep -q clang)
+
+config LD_IS_LLD
+ def_bool $(success,$(LD) --version | head -n 1 | grep -q LLD)
+
config CLANG_VERSION
int
default $(shell,$(srctree)/scripts/clang-version.sh $(CC))
--
2.26.0.110.g2183baf09c-goog


2020-04-10 00:21:57

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Thu, Apr 9, 2020 at 4:28 PM Jian Cai <[email protected]> wrote:
>
> iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> Skip this file if LLVM integrated assemmbler or LLD is used to build ARM

Hi Jian, thank you for the patch. It's pretty much spot on for what I
was looking for. Some notes below:

s/assemmbler/assembler

:set spell

;)

Also, could use a link tag, ie.

Link: https://github.com/ClangBuiltLinux/linux/issues/975

(Always include the link tag to help us track when and where patches land).

Finally, I think the hunks for the two different files should be
split; the init/Kconfig change should be it's own dedicated change
that goes to Masahiro, the maintainer of the Kbuild tree. Then when
we have that, the dependent configs should go separately. Would you
mind splitting this patch in two, and re-sending just the Kbuild patch
for now? Since you used Sami's patch for inspiration
(https://github.com/ClangBuiltLinux/linux/issues/975#issuecomment-611694153,
https://github.com/samitolvanen/linux/commit/fe9786cb52a0100273c75117dcea8b8e07006fde),
it would be polite to Sami to add a tag like:

Suggested-by: Sami Tolvanen <[email protected]>

or maybe better yet, take Sami's patch, add your signed off by tag
(maintaining him as the git author, see `git log --pretty=fuller`),
then rebase your AS_IS_CLANG hunk on top, make that a second patch,
then finally have the arm change as a third patch.

This change is exactly what I'm looking for, so these are just process concerns.

> kernel.
>
> Signed-off-by: Jian Cai <[email protected]>
> ---
> arch/arm/Kconfig | 2 +-
> init/Kconfig | 6 ++++++
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 66a04f6f4775..39de8fc64a73 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -804,7 +804,7 @@ source "arch/arm/mm/Kconfig"
>
> config IWMMXT
> bool "Enable iWMMXt support"
> - depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
> + depends on !AS_IS_CLANG && !LD_IS_LLD && (CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B)
> default y if PXA27x || PXA3xx || ARCH_MMP || CPU_PJ4 || CPU_PJ4B
> help
> Enable support for iWMMXt context switching at run time if
> diff --git a/init/Kconfig b/init/Kconfig
> index 1c12059e0f7e..b0ab3271e900 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -19,6 +19,12 @@ config GCC_VERSION
> config CC_IS_CLANG
> def_bool $(success,$(CC) --version | head -n 1 | grep -q clang)
>
> +config AS_IS_CLANG
> + def_bool $(success,$(AS) --version | head -n 1 | grep -q clang)
> +
> +config LD_IS_LLD
> + def_bool $(success,$(LD) --version | head -n 1 | grep -q LLD)
> +
> config CLANG_VERSION
> int
> default $(shell,$(srctree)/scripts/clang-version.sh $(CC))
> --

--
Thanks,
~Nick Desaulniers

2020-04-10 00:41:07

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Thu, Apr 09, 2020 at 05:01:33PM -0700, 'Nick Desaulniers' via Clang Built Linux wrote:
> On Thu, Apr 9, 2020 at 4:28 PM Jian Cai <[email protected]> wrote:
> >
> > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
>
> Hi Jian, thank you for the patch. It's pretty much spot on for what I
> was looking for. Some notes below:
>
> s/assemmbler/assembler
>
> :set spell
>
> ;)
>
> Also, could use a link tag, ie.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/975
>
> (Always include the link tag to help us track when and where patches land).
>
> Finally, I think the hunks for the two different files should be
> split; the init/Kconfig change should be it's own dedicated change
> that goes to Masahiro, the maintainer of the Kbuild tree. Then when
> we have that, the dependent configs should go separately. Would you
> mind splitting this patch in two, and re-sending just the Kbuild patch
> for now? Since you used Sami's patch for inspiration
> (https://github.com/ClangBuiltLinux/linux/issues/975#issuecomment-611694153,
> https://github.com/samitolvanen/linux/commit/fe9786cb52a0100273c75117dcea8b8e07006fde),
> it would be polite to Sami to add a tag like:
>
> Suggested-by: Sami Tolvanen <[email protected]>
>
> or maybe better yet, take Sami's patch, add your signed off by tag
> (maintaining him as the git author, see `git log --pretty=fuller`),
> then rebase your AS_IS_CLANG hunk on top, make that a second patch,
> then finally have the arm change as a third patch.
>
> This change is exactly what I'm looking for, so these are just process concerns.

I think that they can be sent as a series that is picked up by Masahiro
with an ack from an ARM maintainer.

> > kernel.
> >
> > Signed-off-by: Jian Cai <[email protected]>
> > ---
> > arch/arm/Kconfig | 2 +-
> > init/Kconfig | 6 ++++++
> > 2 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 66a04f6f4775..39de8fc64a73 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -804,7 +804,7 @@ source "arch/arm/mm/Kconfig"
> >
> > config IWMMXT
> > bool "Enable iWMMXt support"
> > - depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
> > + depends on !AS_IS_CLANG && !LD_IS_LLD && (CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B)
> > default y if PXA27x || PXA3xx || ARCH_MMP || CPU_PJ4 || CPU_PJ4B
> > help
> > Enable support for iWMMXt context switching at run time if
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 1c12059e0f7e..b0ab3271e900 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -19,6 +19,12 @@ config GCC_VERSION
> > config CC_IS_CLANG
> > def_bool $(success,$(CC) --version | head -n 1 | grep -q clang)
> >
> > +config AS_IS_CLANG
> > + def_bool $(success,$(AS) --version | head -n 1 | grep -q clang)

$(AS) is being replaced with $(LLVM_IAS). That line should be:

def_bool $(success,test $(LLVM_IAS) -eq 1)

I think. That depends on a commit in Masahiro's for-next branch [1] so
it should be based/tested against that.

Otherwise, I agree with Nick's comment about being split into two
patches (one for the init/Kconfig change and one for the ARM change) and
adding the Link tag.

I ran about 75 randconfigs with LD=ld.lld and LLVM_IAS=1 and I did not
see any Kconfig warnings from this and CONFIG_IWMMXT was unset in every
one. Should prevent the errors that you encountered. Feel free to apply
the following tags to any follow up revisions.

Tested-by: Nathan Chancellor <[email protected]> # randconfig
Reviewed-by: Nathan Chancellor <[email protected]>

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git

> > +config LD_IS_LLD
> > + def_bool $(success,$(LD) --version | head -n 1 | grep -q LLD)
> > +
> > config CLANG_VERSION
> > int
> > default $(shell,$(srctree)/scripts/clang-version.sh $(CC))
> > --
>
> --
> Thanks,
> ~Nick Desaulniers
>

Cheers,
Nathan

2020-04-10 06:40:03

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
>
> iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> kernel.
>
> Signed-off-by: Jian Cai <[email protected]>
> ---
> arch/arm/Kconfig | 2 +-
> init/Kconfig | 6 ++++++
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 66a04f6f4775..39de8fc64a73 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -804,7 +804,7 @@ source "arch/arm/mm/Kconfig"
>
> config IWMMXT
> bool "Enable iWMMXt support"
> - depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
> + depends on !AS_IS_CLANG && !LD_IS_LLD && (CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B)
> default y if PXA27x || PXA3xx || ARCH_MMP || CPU_PJ4 || CPU_PJ4B
> help
> Enable support for iWMMXt context switching at run time if
> diff --git a/init/Kconfig b/init/Kconfig
> index 1c12059e0f7e..b0ab3271e900 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -19,6 +19,12 @@ config GCC_VERSION
> config CC_IS_CLANG
> def_bool $(success,$(CC) --version | head -n 1 | grep -q clang)
>
> +config AS_IS_CLANG
> + def_bool $(success,$(AS) --version | head -n 1 | grep -q clang)
> +
> +config LD_IS_LLD
> + def_bool $(success,$(LD) --version | head -n 1 | grep -q LLD)
> +
> config CLANG_VERSION
> int
> default $(shell,$(srctree)/scripts/clang-version.sh $(CC))
> --
> 2.26.0.110.g2183baf09c-goog

Yesterday, when looking trough commits in Linus tree, I saw:

"init/kconfig: Add LD_VERSION Kconfig"

Nick had a patchset to distinguish LINKER via Kconfig (I cannot find
it right now).

So we should do all this the way CC_IS_XXX CC_VERSION handling is done.

I just want to point to [2] where we can rework (simplify) this
handling for CC and LD handling in a further step.
In one of Peter Z. tree someone started to do so (I was inspired by that).

Unfortunately, the hunk from [1] is IMHO a bit mis-placed and CC and
LD handling should stay together:

CC_IS_XXX where XXX is GCC or CLANG
CC_VERSION where CC is GCC or CLANG

LD_IS_XXX where XXX is BFD or GOLD or LLD
LD_VERSION

Just my €0,02.

Regards,
- Sedat -

[1] https://git.kernel.org/linus/9553d16fa671b9621c5e2847d08bd90d3be3349c
[2] https://github.com/ClangBuiltLinux/linux/issues/941

2020-04-10 07:47:10

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 08:38:05AM +0200, Sedat Dilek wrote:
> On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
> >
> > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> > kernel.
> >
> > Signed-off-by: Jian Cai <[email protected]>
> > ---
> > arch/arm/Kconfig | 2 +-
> > init/Kconfig | 6 ++++++
> > 2 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 66a04f6f4775..39de8fc64a73 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -804,7 +804,7 @@ source "arch/arm/mm/Kconfig"
> >
> > config IWMMXT
> > bool "Enable iWMMXt support"
> > - depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
> > + depends on !AS_IS_CLANG && !LD_IS_LLD && (CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B)
> > default y if PXA27x || PXA3xx || ARCH_MMP || CPU_PJ4 || CPU_PJ4B
> > help
> > Enable support for iWMMXt context switching at run time if
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 1c12059e0f7e..b0ab3271e900 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -19,6 +19,12 @@ config GCC_VERSION
> > config CC_IS_CLANG
> > def_bool $(success,$(CC) --version | head -n 1 | grep -q clang)
> >
> > +config AS_IS_CLANG
> > + def_bool $(success,$(AS) --version | head -n 1 | grep -q clang)
> > +
> > +config LD_IS_LLD
> > + def_bool $(success,$(LD) --version | head -n 1 | grep -q LLD)
> > +
> > config CLANG_VERSION
> > int
> > default $(shell,$(srctree)/scripts/clang-version.sh $(CC))
> > --
> > 2.26.0.110.g2183baf09c-goog
>
> Yesterday, when looking trough commits in Linus tree, I saw:
>
> "init/kconfig: Add LD_VERSION Kconfig"
>
> Nick had a patchset to distinguish LINKER via Kconfig (I cannot find
> it right now).

Probably referring to this?

https://github.com/samitolvanen/linux/commit/61889e01f0ed4f07a9d631f163bba6c6637bfa46

> So we should do all this the way CC_IS_XXX CC_VERSION handling is done.
>
> I just want to point to [2] where we can rework (simplify) this
> handling for CC and LD handling in a further step.
> In one of Peter Z. tree someone started to do so (I was inspired by that).
>
> Unfortunately, the hunk from [1] is IMHO a bit mis-placed and CC and
> LD handling should stay together:
>
> CC_IS_XXX where XXX is GCC or CLANG
> CC_VERSION where CC is GCC or CLANG

Are you suggesting unifying GCC_VERSION and CLANG_VERSION or am I
misunderstanding what you wrote here? Do you mean XXX_VERSION where XXX
is GCC or CLANG?

> LD_IS_XXX where XXX is BFD or GOLD or LLD
> LD_VERSION

ld.gold is no longer allowed to link the kernel so there is no point in
accounting for it in Kconfig. That leaves only ld.bfd and ld.lld to
account for. I do not think there is a point in adding LD_IS_BFD;
!LD_IS_LLD covers that since there is not another linker (at least that
I am aware of) that links the kernel.

Compiler is different because it technically has three options if icc
even still works to build the kernel.

LD_VERISON is explicitly an ld.bfd thing due to the way ld-version.sh
is written:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/ld-version.sh

There is not much of a reason to try and make LLD work with that given
we do not need it now. I am of the mindset that proactively changing
something only makes life more difficult down the road and makes things
harder to maintain.

We could suggest renaming that config to GNU_LD_VERSION and gnu-ld-version.sh
to be slightly more accurate but I am not sure that is necessary since
again, CONFIG_LD_IS_LLD will handle any incompatibilities that we
encounter with LD_VERSION, just like we do with CLANG_VERSION/GCC_VERSION.
See my commit for the __gnu_mcount_nc thing in ARM for an example of
that (CONFIG_CC_IS_GCC still needs to be specified).

https://git.kernel.org/linus/b0fe66cf095016e0b238374c10ae366e1f087d11

> Just my €0,02.
>
> Regards,
> - Sedat -
>
> [1] https://git.kernel.org/linus/9553d16fa671b9621c5e2847d08bd90d3be3349c
> [2] https://github.com/ClangBuiltLinux/linux/issues/941
>

Cheers,
Nathan

2020-04-10 09:58:59

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
>
> iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> kernel.
>
> Signed-off-by: Jian Cai <[email protected]>

It clearly makes sense to limit the Kconfig option to compilers that
can actually build it.
A few questions though:

- Given that Armada XP with its PJ4B was still marketed until fairly
recently[1],
wouldn't it make sense to still add support for it? Is it a lot of work?

- Why does the linker have to understand it, rather than just the assembler?

> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 66a04f6f4775..39de8fc64a73 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -804,7 +804,7 @@ source "arch/arm/mm/Kconfig"
>
> config IWMMXT
> bool "Enable iWMMXt support"
> - depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
> + depends on !AS_IS_CLANG && !LD_IS_LLD && (CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B)

I would suggest splitting it into two lines for readability:

depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
depends on !AS_IS_CLANG && !LD_IS_LLD

Arnd

[1] http://web.archive.org/web/20191015165247/https://www.marvell.com/embedded-processors/armada/index.jsp

2020-04-10 11:16:35

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, 10 Apr 2020 at 11:56, Arnd Bergmann <[email protected]> wrote:
>
> On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
> >
> > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> > kernel.
> >
> > Signed-off-by: Jian Cai <[email protected]>
>
> It clearly makes sense to limit the Kconfig option to compilers that
> can actually build it.
> A few questions though:
>
> - Given that Armada XP with its PJ4B was still marketed until fairly
> recently[1],
> wouldn't it make sense to still add support for it? Is it a lot of work?
>

The part of that file that the assembler chokes on hasn't been touched
by anyone since Nico added it 15+ years ago. It can only be built in
ARM mode, and it disassembles to the sequence below (the ld/st fe/fp
mnemonics are not document in recent versions of the ARM ARM, and
aren't understood by Clang either)

Instead of playing all these tricks with Kconfig, couldn't we simply
insert the bare opcodes and be done with it?



00000054 <concan_dump>:
54: fd812120 stc2 1, cr2, [r1, #128] ; 0x80
58: fd813121 stc2 1, cr3, [r1, #132] ; 0x84
5c: fd818122 stc2 1, cr8, [r1, #136] ; 0x88
60: fd819123 stc2 1, cr9, [r1, #140] ; 0x8c
64: fd81a124 stc2 1, cr10, [r1, #144] ; 0x90
68: fd81b125 stc2 1, cr11, [r1, #148] ; 0x94
6c: e3120002 tst r2, #2
70: 0a00000f beq b4 <concan_dump+0x60>
74: edc10100 stfe f0, [r1]
78: edc11102 stfe f1, [r1, #8]
7c: edc12104 stfe f2, [r1, #16]
80: edc13106 stfe f3, [r1, #24]
84: edc14108 stfe f4, [r1, #32]
88: edc1510a stfe f5, [r1, #40] ; 0x28
8c: edc1610c stfe f6, [r1, #48] ; 0x30
90: edc1710e stfe f7, [r1, #56] ; 0x38
94: edc18110 stfp f0, [r1, #64] ; 0x40
98: edc19112 stfp f1, [r1, #72] ; 0x48
9c: edc1a114 stfp f2, [r1, #80] ; 0x50
a0: edc1b116 stfp f3, [r1, #88] ; 0x58
a4: edc1c118 stfp f4, [r1, #96] ; 0x60
a8: edc1d11a stfp f5, [r1, #104] ; 0x68
ac: edc1e11c stfp f6, [r1, #112] ; 0x70
b0: edc1f11e stfp f7, [r1, #120] ; 0x78
b4: e3300000 teq r0, #0
b8: 012fff1e bxeq lr

000000bc <concan_load>:
bc: edd00100 ldfe f0, [r0]
c0: edd01102 ldfe f1, [r0, #8]
c4: edd02104 ldfe f2, [r0, #16]
c8: edd03106 ldfe f3, [r0, #24]
cc: edd04108 ldfe f4, [r0, #32]
d0: edd0510a ldfe f5, [r0, #40] ; 0x28
d4: edd0610c ldfe f6, [r0, #48] ; 0x30
d8: edd0710e ldfe f7, [r0, #56] ; 0x38
dc: edd08110 ldfp f0, [r0, #64] ; 0x40
e0: edd09112 ldfp f1, [r0, #72] ; 0x48
e4: edd0a114 ldfp f2, [r0, #80] ; 0x50
e8: edd0b116 ldfp f3, [r0, #88] ; 0x58
ec: edd0c118 ldfp f4, [r0, #96] ; 0x60
f0: edd0d11a ldfp f5, [r0, #104] ; 0x68
f4: edd0e11c ldfp f6, [r0, #112] ; 0x70
f8: edd0f11e ldfp f7, [r0, #120] ; 0x78
fc: fd902120 ldc2 1, cr2, [r0, #128] ; 0x80
100: fd903121 ldc2 1, cr3, [r0, #132] ; 0x84
104: fd908122 ldc2 1, cr8, [r0, #136] ; 0x88
108: fd909123 ldc2 1, cr9, [r0, #140] ; 0x8c
10c: fd90a124 ldc2 1, cr10, [r0, #144] ; 0x90
110: fd90b125 ldc2 1, cr11, [r0, #148] ; 0x94
114: e3310000 teq r1, #0
118: e3a02000 mov r2, #0
11c: 012fff1e bxeq lr
120: ee012110 flts f1, r2
124: e12fff1e bx lr

> - Why does the linker have to understand it, rather than just the assembler?
>
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 66a04f6f4775..39de8fc64a73 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -804,7 +804,7 @@ source "arch/arm/mm/Kconfig"
> >
> > config IWMMXT
> > bool "Enable iWMMXt support"
> > - depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
> > + depends on !AS_IS_CLANG && !LD_IS_LLD && (CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B)
>
> I would suggest splitting it into two lines for readability:
>
> depends on CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_PJ4 || CPU_PJ4B
> depends on !AS_IS_CLANG && !LD_IS_LLD
>
> Arnd
>
> [1] http://web.archive.org/web/20191015165247/https://www.marvell.com/embedded-processors/armada/index.jsp
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2020-04-10 12:36:43

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 01:15:08PM +0200, Ard Biesheuvel wrote:
> On Fri, 10 Apr 2020 at 11:56, Arnd Bergmann <[email protected]> wrote:
> >
> > On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
> > >
> > > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> > > kernel.
> > >
> > > Signed-off-by: Jian Cai <[email protected]>
> >
> > It clearly makes sense to limit the Kconfig option to compilers that
> > can actually build it.
> > A few questions though:
> >
> > - Given that Armada XP with its PJ4B was still marketed until fairly
> > recently[1],
> > wouldn't it make sense to still add support for it? Is it a lot of work?
> >
>
> The part of that file that the assembler chokes on hasn't been touched
> by anyone since Nico added it 15+ years ago. It can only be built in
> ARM mode, and it disassembles to the sequence below (the ld/st fe/fp
> mnemonics are not document in recent versions of the ARM ARM, and
> aren't understood by Clang either)

For older CPUs, it doesn't matter what the latest ARM ARM says, the
appropriate version of the ARM ARM is the one relevant for the CPU
architecture. This is a mistake frequently made, and it's been pointed
out by Arm Ltd in the past (before ARMv6 even came on the scene) that
keeping older revisions is necessary if you want to be interested in
the older architectures.

However, there's an additional complication here: DEC's license from
Arm Ltd back in the days of StrongARM allowed them to make changes to
the architecture - that was passed over to Intel when they bought that
part of DEC. Consequently, these "non-Arm vendor" cores contain
extensions that are not part of the ARM ARM. iWMMXT is one such
example, which first appeared in the Intel PXA270 SoC (an ARMv5
derived CPU).

In fact, several of the features found in later versions of the ARM
architecture came from DEC and Intel enhancements.

If your compiler/assembler only implements what is in the latest ARM
ARM, then it is not going to be suitable for these older CPUs and
alternate vendor "ARM compatible" CPUs.

> Instead of playing all these tricks with Kconfig, couldn't we simply
> insert the bare opcodes and be done with it?

That gets close to a GPL violation; the GPL requires that source code
be in the preferred form for making modifications. Encoding raw opcodes
can in no way be argued to be the preferred form. Arguing that raw
opcodes is acceptable sets a precedent that makes it acceptable for
other "works" to do the same, which makes arguments against firmware
supplied as a hexdump null and void.

Using macros to emulate the instructions and create the appropriate
opcodes is an alternative; we already have that for some of the VFP
code as early toolchains had no support for the VFP instructions.

So no, bare opcodes are unacceptable.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

2020-04-10 13:11:03

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, 10 Apr 2020 at 14:33, Russell King - ARM Linux admin
<[email protected]> wrote:
>
> On Fri, Apr 10, 2020 at 01:15:08PM +0200, Ard Biesheuvel wrote:
> > On Fri, 10 Apr 2020 at 11:56, Arnd Bergmann <[email protected]> wrote:
> > >
> > > On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
> > > >
> > > > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > > > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> > > > kernel.
> > > >
> > > > Signed-off-by: Jian Cai <[email protected]>
> > >
> > > It clearly makes sense to limit the Kconfig option to compilers that
> > > can actually build it.
> > > A few questions though:
> > >
> > > - Given that Armada XP with its PJ4B was still marketed until fairly
> > > recently[1],
> > > wouldn't it make sense to still add support for it? Is it a lot of work?
> > >
> >
> > The part of that file that the assembler chokes on hasn't been touched
> > by anyone since Nico added it 15+ years ago. It can only be built in
> > ARM mode, and it disassembles to the sequence below (the ld/st fe/fp
> > mnemonics are not document in recent versions of the ARM ARM, and
> > aren't understood by Clang either)
>
> For older CPUs, it doesn't matter what the latest ARM ARM says, the
> appropriate version of the ARM ARM is the one relevant for the CPU
> architecture. This is a mistake frequently made, and it's been pointed
> out by Arm Ltd in the past (before ARMv6 even came on the scene) that
> keeping older revisions is necessary if you want to be interested in
> the older architectures.
>
> However, there's an additional complication here: DEC's license from
> Arm Ltd back in the days of StrongARM allowed them to make changes to
> the architecture - that was passed over to Intel when they bought that
> part of DEC. Consequently, these "non-Arm vendor" cores contain
> extensions that are not part of the ARM ARM. iWMMXT is one such
> example, which first appeared in the Intel PXA270 SoC (an ARMv5
> derived CPU).
>
> In fact, several of the features found in later versions of the ARM
> architecture came from DEC and Intel enhancements.
>
> If your compiler/assembler only implements what is in the latest ARM
> ARM, then it is not going to be suitable for these older CPUs and
> alternate vendor "ARM compatible" CPUs.
>

Indeed, and I'm a bit disappointed at the willingness to leave stuff
by the wayside, especially since Clang's integrated assembler has no
other benefit to it than being built into the compiler.

> > Instead of playing all these tricks with Kconfig, couldn't we simply
> > insert the bare opcodes and be done with it?
>
> That gets close to a GPL violation; the GPL requires that source code
> be in the preferred form for making modifications. Encoding raw opcodes
> can in no way be argued to be the preferred form. Arguing that raw
> opcodes is acceptable sets a precedent that makes it acceptable for
> other "works" to do the same, which makes arguments against firmware
> supplied as a hexdump null and void.
>
> Using macros to emulate the instructions and create the appropriate
> opcodes is an alternative; we already have that for some of the VFP
> code as early toolchains had no support for the VFP instructions.
>
> So no, bare opcodes are unacceptable.
>

Fair enough.

The following set of macros appears to emit the opcodes correctly,
assuming we're willing to tweak the source code somewhat, i.e., drop
square brackets and leading # for immediate offsets. (The tmcr/tmrc
instructions are left as an exercise for the reader)


.irp b, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
.set .LwR\b, \b
.set .Lr\b, \b
.endr

.set .LwCSSF, 0x2
.set .LwCASF, 0x3
.set .LwCGR0, 0x8
.set .LwCGR1, 0x9
.set .LwCGR2, 0xa
.set .LwCGR3, 0xb

.macro wldrd, reg:req, base:req, offset:req
.inst 0xedd00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
.endm

.macro wldrw, reg:req, base:req, offset:req
.inst 0xfd900100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
.endm

.macro wstrd, reg:req, base:req, offset:req
.inst 0xedc00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
.endm

.macro wstrw, reg:req, base:req, offset:req
.inst 0xfd800100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
.endm

2020-04-10 17:03:50

by Andrew Lunn

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Thu, Apr 09, 2020 at 04:27:26PM -0700, Jian Cai wrote:
> iwmmxt.S contains XScale instructions

Dumb question....

Are these Xscale instructions? My understanding is that they are an
instruction set of their own, implementing something similar to IA-32
MMX.

Would it be more accurate to say CLANG does not support the iwmmxt
instruction set?

Andrew

2020-04-10 18:36:25

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 06:59:48PM +0200, Andrew Lunn wrote:
> On Thu, Apr 09, 2020 at 04:27:26PM -0700, Jian Cai wrote:
> > iwmmxt.S contains XScale instructions
>
> Dumb question....
>
> Are these Xscale instructions? My understanding is that they are an
> instruction set of their own, implementing something similar to IA-32
> MMX.
>
> Would it be more accurate to say CLANG does not support the iwmmxt
> instruction set?

Yes, because the XScale core on its own (otherwise known as 80200)
doesn't support iWMMXT.

It's worth pointing out that the iWMMXT instruction set uses the
co-processor #1 instruction space as defined by the ARMv5 ARM ARM,
which is also the FPA (floating point accelerator) instruction
space - which is the FP instruction set prior to VFP.

The LDFP and similar instructions that binutils decodes the opcodes
as are FPA instructions, and the LDC2 instructions are their "generic
co-processor" versions where there's no FPA instruction that matches
the op-code.

I'll also point out that the reason the iWMMXT code has never been
ported to Thumb2 is because there are no equivalents for the
co-processor instructions in the Thumb2 instruction set defined in
ARMv5. Hence why the file has a .arm. So, the fact the file hasn't
changed for a long time and hasn't been updated with "improvements"
such as Thumb2 kernels is because that's completely irrelevent to
the ISA.

It is an example of code that has become stable and mature, and
requires no maintanence with GNU toolchains.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

2020-04-14 13:31:17

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 2:56 AM Arnd Bergmann <[email protected]> wrote:
>
> On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
> >
> > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> > kernel.
> >
> > Signed-off-by: Jian Cai <[email protected]>
>
> It clearly makes sense to limit the Kconfig option to compilers that
> can actually build it.
> A few questions though:
>
> - Given that Armada XP with its PJ4B was still marketed until fairly
> recently[1],
> wouldn't it make sense to still add support for it? Is it a lot of work?

Sorry, can you help me verify from that link that PJ4B uses XSCALE? I
didn't see references to either in the link provided. Also, given the
history of XSCALE as noted by Russell, I'm surprised to see Marvell in
the mix.

>
> - Why does the linker have to understand it, rather than just the assembler?

It doesn't, just the assembler is the problem.

>
> [1] http://web.archive.org/web/20191015165247/https://www.marvell.com/embedded-processors/armada/index.jsp



--
Thanks,
~Nick Desaulniers

2020-04-14 13:31:18

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 11:34 AM Russell King - ARM Linux admin
<[email protected]> wrote:
>
> On Fri, Apr 10, 2020 at 06:59:48PM +0200, Andrew Lunn wrote:
> > On Thu, Apr 09, 2020 at 04:27:26PM -0700, Jian Cai wrote:
> > > iwmmxt.S contains XScale instructions
> >
> > Dumb question....
> >
> > Are these Xscale instructions? My understanding is that they are an
> > instruction set of their own, implementing something similar to IA-32
> > MMX.
> >
> > Would it be more accurate to say CLANG does not support the iwmmxt
> > instruction set?
>
> Yes, because the XScale core on its own (otherwise known as 80200)
> doesn't support iWMMXT.
>
> It's worth pointing out that the iWMMXT instruction set uses the
> co-processor #1 instruction space as defined by the ARMv5 ARM ARM,
> which is also the FPA (floating point accelerator) instruction
> space - which is the FP instruction set prior to VFP.

Reusing instruction encoding space? :-X I'll have to look and see how
we define cases like this in LLVM.

>
> The LDFP and similar instructions that binutils decodes the opcodes
> as are FPA instructions, and the LDC2 instructions are their "generic
> co-processor" versions where there's no FPA instruction that matches
> the op-code.
>
> I'll also point out that the reason the iWMMXT code has never been
> ported to Thumb2 is because there are no equivalents for the
> co-processor instructions in the Thumb2 instruction set defined in
> ARMv5. Hence why the file has a .arm. So, the fact the file hasn't
> changed for a long time and hasn't been updated with "improvements"
> such as Thumb2 kernels is because that's completely irrelevent to
> the ISA.
>
> It is an example of code that has become stable and mature, and
> requires no maintanence with GNU toolchains.

I agree. I think this is something we can mark broken for our
toolchain in Kconfig for the time being, and revisit once we have more
resources to implement (leaving the sources as is).

--
Thanks,
~Nick Desaulniers

2020-04-14 13:33:27

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 5:33 AM Russell King - ARM Linux admin
<[email protected]> wrote:
>
> For older CPUs, it doesn't matter what the latest ARM ARM says, the
> appropriate version of the ARM ARM is the one relevant for the CPU
> architecture. This is a mistake frequently made, and it's been pointed
> out by Arm Ltd in the past (before ARMv6 even came on the scene) that
> keeping older revisions is necessary if you want to be interested in
> the older architectures.

As if it never existed *waves hands*. Interesting. Does ARM still
distribute these older reference manuals? Do you keep copies of the
older revisions?

>
> However, there's an additional complication here: DEC's license from
> Arm Ltd back in the days of StrongARM allowed them to make changes to
> the architecture - that was passed over to Intel when they bought that
> part of DEC. Consequently, these "non-Arm vendor" cores contain
> extensions that are not part of the ARM ARM. iWMMXT is one such
> example, which first appeared in the Intel PXA270 SoC (an ARMv5
> derived CPU).
>
> In fact, several of the features found in later versions of the ARM
> architecture came from DEC and Intel enhancements.
>
> If your compiler/assembler only implements what is in the latest ARM
> ARM, then it is not going to be suitable for these older CPUs and
> alternate vendor "ARM compatible" CPUs.

This is a neat piece of history, thanks for sharing.
--
Thanks,
~Nick Desaulniers

2020-04-14 13:37:15

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Fri, Apr 10, 2020 at 6:09 AM Ard Biesheuvel <[email protected]> wrote:
>
> On Fri, 10 Apr 2020 at 14:33, Russell King - ARM Linux admin
> <[email protected]> wrote:
> >
> > On Fri, Apr 10, 2020 at 01:15:08PM +0200, Ard Biesheuvel wrote:
> > > On Fri, 10 Apr 2020 at 11:56, Arnd Bergmann <[email protected]> wrote:
> > > >
> > > > On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
> > > > >
> > > > > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > > > > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> > > > > kernel.
> > > > >
> > > > > Signed-off-by: Jian Cai <[email protected]>
> > > >
> > > > It clearly makes sense to limit the Kconfig option to compilers that
> > > > can actually build it.
> > > > A few questions though:
> > > >
> > > > - Given that Armada XP with its PJ4B was still marketed until fairly
> > > > recently[1],
> > > > wouldn't it make sense to still add support for it? Is it a lot of work?
> > > >
> > >
> > > The part of that file that the assembler chokes on hasn't been touched
> > > by anyone since Nico added it 15+ years ago. It can only be built in
> > > ARM mode, and it disassembles to the sequence below (the ld/st fe/fp
> > > mnemonics are not document in recent versions of the ARM ARM, and
> > > aren't understood by Clang either)
> >
> > For older CPUs, it doesn't matter what the latest ARM ARM says, the
> > appropriate version of the ARM ARM is the one relevant for the CPU
> > architecture. This is a mistake frequently made, and it's been pointed
> > out by Arm Ltd in the past (before ARMv6 even came on the scene) that
> > keeping older revisions is necessary if you want to be interested in
> > the older architectures.
> >
> > However, there's an additional complication here: DEC's license from
> > Arm Ltd back in the days of StrongARM allowed them to make changes to
> > the architecture - that was passed over to Intel when they bought that
> > part of DEC. Consequently, these "non-Arm vendor" cores contain
> > extensions that are not part of the ARM ARM. iWMMXT is one such
> > example, which first appeared in the Intel PXA270 SoC (an ARMv5
> > derived CPU).
> >
> > In fact, several of the features found in later versions of the ARM
> > architecture came from DEC and Intel enhancements.
> >
> > If your compiler/assembler only implements what is in the latest ARM
> > ARM, then it is not going to be suitable for these older CPUs and
> > alternate vendor "ARM compatible" CPUs.
> >
>
> Indeed, and I'm a bit disappointed at the willingness to leave stuff
> by the wayside, especially since Clang's integrated assembler has no
> other benefit to it than being built into the compiler.

I don't disagree. I also wish LLVM had a backend for every
architecture that GCC does. But resources are finite and there are
more fires than firemen. It gets really hard to justify a high
priority for certain things over others. Doubly-so for hardware no
longer in production. Triply-so when the ISA vendor doesn't provide
information in available reference manuals. I'm happy to push for
more investment in LLVM to support the Linux kernel from Google
internally; maybe you can help do so from ARM? That was my appeal to
ARM back in February; support for newest ISA extensions is great,
support for existing instructions is great, too. (And not having to
choose between one or the other is preferrable, given the amount of
resources available).

My thoughts on the benefits of this approach to using Clang's
integrated assembler:
1. continuous integration and randconfigs. We need CI to help us spot
where things are still broken, and help us from regressing the ground
we've fought for. We can't expect kernel developers to test with
LLVM. Currently, we have LLVM builds in numerous kernel continuous
integration services (KernelCI, Kbuild test robot "0day bot", Linaro's
TCWG, Syzcaller, and our own CI). For services that are bisecting and
notifying authors, they are currently harassing authors for
pre-existing conditions that the service uncovered via randconfig.
This is very very dangerous territory to be in. If authors start
ignoring build reports due to false positives or false negatives, it
becomes a weak signal that tends to be ignored. Then when real bugs
are uncovered, the actual bugs get ignored. We don't want that. If a
canary dies in a coal mine, but no one notices, was it for naught?

2. It helps us quantify how broken we are, via `grep` and `wc`. LLVM
is in no way a perfect substitute for GNU utilities, but it's not too
far off either. As an imperfect substitute, there's a lot of work
both on the toolchain side and sources of various codebases in terms
of toolchain portability. Being able to quantify what doesn't work
let's us be clear in which ways LLVM is not a perfect substitute, but
also where and how much resources we need to get closer. That helps
then with planning and prioritization. The proper thing to do is to
bury the dead but at this point we only have resources to collect dog
tags and keep moving. That doesn't rule out revisiting implementing
iWMMXT in the future.

3. Testing Clang's assembler allows for us to do kernel builds without
binutils. This work is helping uncover places within the kernel where
the build is not hermetic. We're still a long ways away from hermetic
reproducible kernel builds I suspect, but my main worry is when people
have multiple versions of a toolchain in their path, that only one is
used. Otherwise, it leads to spooky hard to reproduce bug reports. I
don't think I need to argue about build hermiticity, but it's
important for user trust and verification.

4. Improving toolchain portability of assembler in LLVM itself.
There's plenty of subtle differences, but missing full on instructions
(or are they psuedo's?) is pretty bad.

I value the feedback from you, Russell, and Arnd even when I disagree.
These are just my thoughts on *why* things are the way they are, FWIW.
If there's thoughts on how we might better prioritize one thing over
another, I would appreciate it.
--
Thanks,
~Nick Desaulniers

2020-04-14 15:09:53

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Mon, 13 Apr 2020 at 22:45, Nick Desaulniers <[email protected]> wrote:
>
> On Fri, Apr 10, 2020 at 6:09 AM Ard Biesheuvel <[email protected]> wrote:
> >
> > On Fri, 10 Apr 2020 at 14:33, Russell King - ARM Linux admin
> > <[email protected]> wrote:
...
> > > If your compiler/assembler only implements what is in the latest ARM
> > > ARM, then it is not going to be suitable for these older CPUs and
> > > alternate vendor "ARM compatible" CPUs.
> > >
> >
> > Indeed, and I'm a bit disappointed at the willingness to leave stuff
> > by the wayside, especially since Clang's integrated assembler has no
> > other benefit to it than being built into the compiler.
>
> I don't disagree. I also wish LLVM had a backend for every
> architecture that GCC does. But resources are finite and there are
> more fires than firemen. It gets really hard to justify a high
> priority for certain things over others. Doubly-so for hardware no
> longer in production. Triply-so when the ISA vendor doesn't provide
> information in available reference manuals. I'm happy to push for
> more investment in LLVM to support the Linux kernel from Google
> internally; maybe you can help do so from ARM? That was my appeal to
> ARM back in February; support for newest ISA extensions is great,
> support for existing instructions is great, too. (And not having to
> choose between one or the other is preferrable, given the amount of
> resources available).
>

Sure. But my point was really that disabling stuff left and right just
so we can get to the finish line is fine for internal kernel-on-clang
development, but I'd expect the contributions upstream to be a bit
more considerate of other concerns, such as not regressing in terms of
functionality.

> My thoughts on the benefits of this approach to using Clang's
> integrated assembler:
> 1. continuous integration and randconfigs. We need CI to help us spot
> where things are still broken, and help us from regressing the ground
> we've fought for. We can't expect kernel developers to test with
> LLVM. Currently, we have LLVM builds in numerous kernel continuous
> integration services (KernelCI, Kbuild test robot "0day bot", Linaro's
> TCWG, Syzcaller, and our own CI). For services that are bisecting and
> notifying authors, they are currently harassing authors for
> pre-existing conditions that the service uncovered via randconfig.
> This is very very dangerous territory to be in. If authors start
> ignoring build reports due to false positives or false negatives, it
> becomes a weak signal that tends to be ignored. Then when real bugs
> are uncovered, the actual bugs get ignored. We don't want that. If a
> canary dies in a coal mine, but no one notices, was it for naught?
>

OK, so you are saying you need the Clang *assembler* to perform CI on
C pieces that we can now build with the Clang compiler, and we don't
want to regress on that? Is this because the cross-assemblers are
missing from the CI build hosts?

> 2. It helps us quantify how broken we are, via `grep` and `wc`. LLVM
> is in no way a perfect substitute for GNU utilities, but it's not too
> far off either. As an imperfect substitute, there's a lot of work
> both on the toolchain side and sources of various codebases in terms
> of toolchain portability. Being able to quantify what doesn't work
> let's us be clear in which ways LLVM is not a perfect substitute, but
> also where and how much resources we need to get closer. That helps
> then with planning and prioritization. The proper thing to do is to
> bury the dead but at this point we only have resources to collect dog
> tags and keep moving. That doesn't rule out revisiting implementing
> iWMMXT in the future.
>

To be honest with you, I don't actually think iwmmxt is that
important. But I have already demonstrated how we can use a couple of
macros to emit the same instructions without resorting to bare
opcodes, so there is really no need to disable pieces left and right
because the Clang assembler does not support them outright - it just
needs someone to care enough about this, rather than rush through the
list with a tick the box attitude, and rip out the pieces that look a
bit too complicated.

> 3. Testing Clang's assembler allows for us to do kernel builds without
> binutils. This work is helping uncover places within the kernel where
> the build is not hermetic. We're still a long ways away from hermetic
> reproducible kernel builds I suspect, but my main worry is when people
> have multiple versions of a toolchain in their path, that only one is
> used. Otherwise, it leads to spooky hard to reproduce bug reports. I
> don't think I need to argue about build hermiticity, but it's
> important for user trust and verification.
>

So we need the Clang assembler for reproducible builds?

> 4. Improving toolchain portability of assembler in LLVM itself.
> There's plenty of subtle differences, but missing full on instructions
> (or are they psuedo's?) is pretty bad.
>

I don't think this point belongs in the 'why should we care about the
Clang assembler' list :-)

> I value the feedback from you, Russell, and Arnd even when I disagree.
> These are just my thoughts on *why* things are the way they are, FWIW.
> If there's thoughts on how we might better prioritize one thing over
> another, I would appreciate it.

I think the 'all legacy needs to die' attitude is not particularly
helpful here. In the 32-bit Linux/ARM community, there are many people
who care about older systems, and spend a lot of time on keeping
things in a working order on platforms that Google or ARM have stopped
caring about long ago.

2020-04-15 04:19:59

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Mon, Apr 13, 2020 at 12:23:38PM -0700, Nick Desaulniers wrote:
> On Fri, Apr 10, 2020 at 5:33 AM Russell King - ARM Linux admin
> <[email protected]> wrote:
> >
> > For older CPUs, it doesn't matter what the latest ARM ARM says, the
> > appropriate version of the ARM ARM is the one relevant for the CPU
> > architecture. This is a mistake frequently made, and it's been pointed
> > out by Arm Ltd in the past (before ARMv6 even came on the scene) that
> > keeping older revisions is necessary if you want to be interested in
> > the older architectures.
>
> As if it never existed *waves hands*. Interesting. Does ARM still
> distribute these older reference manuals? Do you keep copies of the
> older revisions?

I keep copies of every document I've needed that I'm allowed to keep
as a general rule, including the early paper copies of the ARM
architecture reference manual. I even have the original VLSI ARM2
databook.

For the ARMv5TE architecture, you're looking for DDI0100E which can be
found via google.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

2020-04-15 04:23:21

by Andrew Lunn

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Mon, Apr 13, 2020 at 12:20:57PM -0700, Nick Desaulniers wrote:
> On Fri, Apr 10, 2020 at 2:56 AM Arnd Bergmann <[email protected]> wrote:
> >
> > On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <[email protected]> wrote:
> > >
> > > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> > > kernel.
> > >
> > > Signed-off-by: Jian Cai <[email protected]>
> >
> > It clearly makes sense to limit the Kconfig option to compilers that
> > can actually build it.
> > A few questions though:
> >
> > - Given that Armada XP with its PJ4B was still marketed until fairly
> > recently[1],
> > wouldn't it make sense to still add support for it? Is it a lot of work?
>
> Sorry, can you help me verify from that link that PJ4B uses XSCALE?

I think you missed my email. iwmmxt is not Xscale. iwmmxt is an
instruction set of its own, which any ARM processor could implement.

> I
> didn't see references to either in the link provided. Also, given the
> history of XSCALE as noted by Russell, I'm surprised to see Marvell in
> the mix.

https://en.wikipedia.org/wiki/XScale

XScale comprises several distinct families: IXP, IXC, IOP, PXA
and CE.

Intel sold the PXA family to Marvell Technology Group
in June 2006.[1] Marvell then extended the brand to include
processors with other microarchitectures, like ARM's Cortex.

Andrew

2020-04-15 04:52:34

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Mon, Apr 13, 2020 at 12:26:16PM -0700, Nick Desaulniers wrote:
> On Fri, Apr 10, 2020 at 11:34 AM Russell King - ARM Linux admin
> <[email protected]> wrote:
> >
> > On Fri, Apr 10, 2020 at 06:59:48PM +0200, Andrew Lunn wrote:
> > > On Thu, Apr 09, 2020 at 04:27:26PM -0700, Jian Cai wrote:
> > > > iwmmxt.S contains XScale instructions
> > >
> > > Dumb question....
> > >
> > > Are these Xscale instructions? My understanding is that they are an
> > > instruction set of their own, implementing something similar to IA-32
> > > MMX.
> > >
> > > Would it be more accurate to say CLANG does not support the iwmmxt
> > > instruction set?
> >
> > Yes, because the XScale core on its own (otherwise known as 80200)
> > doesn't support iWMMXT.
> >
> > It's worth pointing out that the iWMMXT instruction set uses the
> > co-processor #1 instruction space as defined by the ARMv5 ARM ARM,
> > which is also the FPA (floating point accelerator) instruction
> > space - which is the FP instruction set prior to VFP.
>
> Reusing instruction encoding space? :-X I'll have to look and see how
> we define cases like this in LLVM.

Unfortunately yes.

The ARM CPU was originally an integer-only CPU, and the design was
to allow up to 16 "co-processors" to be added for bolt-on facilities.
The ARM architecture reserves various instructions in its instruction
space for the co-processors, with instructions to load/store from the
co-processor to memory (the ARM works in unison with the co-processor
to provide the address for the memory access), transfer data between
the ARM integer register set and co-processor, and instruct the co-
processor to perform various data operations.

Back in the days of the ARM2 CPU, the ARM2 on its own had no co-
processors, but had external pins to external co-processors to be added
to the system, such as the FPA (floating point accelerator) hardware.
Early Acorn ARM-based computers had a separate socket to allow the FPA
chip to be plugged in. The FPA used the co-processor 0/1 instruction
space.

The ARM3 CPU added a cache, and with it the need to control that
cache, which is where the origins of the CP15 control co-processor
comes from (although ARM3 has a totally different CP15 layout.)

When a co-processor is addressed by an instruction, if it doesn't
respond, the ARM takes an undefined instruction exception, which allows
software to emulate the co-processor - and this is how software FP
emulation had been done - userspace continues to use the FPA ISA,
with the instructions trapping to an emulator.

Eventually, VFP came along, which uses the co-processor 10/11 space,
superseding FPA. However, the principle is still there - it is
entirely _possible_ if one had enough motivation to implement a VFP
software emulator on ARM2 and "execute" VFP instructions.

By the time that Intel decided to add iWMMXT, the ARM CPUs no longer
had support for external co-processors, so FPA had likely been long
forgotten (despite lots of Linux systems using it for their FP), and
they picked the same co-processor instruction space, which means FPA
and iWMMXT are mutually exclusive: you can't have a kernel supporting
both.

In all cases, the co-processor instructions have always had two
definitions: there's the ARM integer CPU naming of the instructions
which uses things like "CDP", "LDC", "STC", "MRC", "MCR" and so forth,
and the co-processor specific naming. In fact, the early VFP
documentation referred to the ARM integer CPU naming of the
instructions, talking about the FMRX being a MRC instruction etc.

So, for example, with VFP:

fmrx r0, fpsid

and:

mrc p10, 7, r0, c0, c0, 0

are exactly the same opcode. The former is defined in what used to be
the separate VFP architecture document (such as DDI0238A), the latter
in the ARM reference manual.

It's just the same with stuff like CP15 - ARM tried in ARMv7 to rename
various CP15 instructions such as "MCR p15, 0, c7, c5, 0" (which I can
tell you off the top of my head invalidates the instruction cache to
the point of unification) to be "ICIALLU" - a neumonic I still have to
look up. These are just different names for the same opcode.

I'm sure there's a document somewhere that defines the iWMMXT
instruction set (I don't seem to have it), but ultimately it can be
described in terms of the ARM co-processor instruction set, just like
any of the other ARM co-processors.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

2020-04-15 13:07:09

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Tue, Apr 14, 2020 at 1:59 AM Ard Biesheuvel <[email protected]> wrote:
>
> On Mon, 13 Apr 2020 at 22:45, Nick Desaulniers <[email protected]> wrote:
> >
> > On Fri, Apr 10, 2020 at 6:09 AM Ard Biesheuvel <[email protected]> wrote:
> > >
> > > On Fri, 10 Apr 2020 at 14:33, Russell King - ARM Linux admin
> > > <[email protected]> wrote:
> ...
> > > > If your compiler/assembler only implements what is in the latest ARM
> > > > ARM, then it is not going to be suitable for these older CPUs and
> > > > alternate vendor "ARM compatible" CPUs.
> > > >
> > >
> > > Indeed, and I'm a bit disappointed at the willingness to leave stuff
> > > by the wayside, especially since Clang's integrated assembler has no
> > > other benefit to it than being built into the compiler.
> >
> > I don't disagree. I also wish LLVM had a backend for every
> > architecture that GCC does. But resources are finite and there are
> > more fires than firemen. It gets really hard to justify a high
> > priority for certain things over others. Doubly-so for hardware no
> > longer in production. Triply-so when the ISA vendor doesn't provide
> > information in available reference manuals. I'm happy to push for
> > more investment in LLVM to support the Linux kernel from Google
> > internally; maybe you can help do so from ARM? That was my appeal to
> > ARM back in February; support for newest ISA extensions is great,
> > support for existing instructions is great, too. (And not having to
> > choose between one or the other is preferrable, given the amount of
> > resources available).
> >
>
> Sure. But my point was really that disabling stuff left and right just
> so we can get to the finish line is fine for internal kernel-on-clang
> development, but I'd expect the contributions upstream to be a bit
> more considerate of other concerns, such as not regressing in terms of
> functionality.

Does this or any of the proposed patches regress anything for GCC? Do
they regress anything for Clang? Is it a regression if it never
worked in the first place? ;)

Disabling things isn't crossing the finish line. It's admitting where
your weaknesses lie, and where you need to improve. Crossing the
finish line is implementing and filling out those weaknesses. We're
running a race we're behind by 10 years! We may never catch up!

>
> > My thoughts on the benefits of this approach to using Clang's
> > integrated assembler:
> > 1. continuous integration and randconfigs. We need CI to help us spot
> > where things are still broken, and help us from regressing the ground
> > we've fought for. We can't expect kernel developers to test with
> > LLVM. Currently, we have LLVM builds in numerous kernel continuous
> > integration services (KernelCI, Kbuild test robot "0day bot", Linaro's
> > TCWG, Syzcaller, and our own CI). For services that are bisecting and
> > notifying authors, they are currently harassing authors for
> > pre-existing conditions that the service uncovered via randconfig.
> > This is very very dangerous territory to be in. If authors start
> > ignoring build reports due to false positives or false negatives, it
> > becomes a weak signal that tends to be ignored. Then when real bugs
> > are uncovered, the actual bugs get ignored. We don't want that. If a
> > canary dies in a coal mine, but no one notices, was it for naught?
> >
>
> OK, so you are saying you need the Clang *assembler* to perform CI on
> C pieces that we can now build with the Clang compiler, and we don't
> want to regress on that? Is this because the cross-assemblers are
> missing from the CI build hosts?

We need to disable configs that we know are broken when using certain
LLVM tools until we can dedicate more resources to fixing them. This
prevents curious failures when doing randconfig builds, which some of
the CI systems do. It then gives us a nice list we can grep for
configs we need to fix, whether that's via improvements to the
toolchain to changes to the source. And I think Arnd agreed with me
when he said "It clearly makes sense to limit the Kconfig option to
compilers that can actually build it."

>
> > 2. It helps us quantify how broken we are, via `grep` and `wc`. LLVM
> > is in no way a perfect substitute for GNU utilities, but it's not too
> > far off either. As an imperfect substitute, there's a lot of work
> > both on the toolchain side and sources of various codebases in terms
> > of toolchain portability. Being able to quantify what doesn't work
> > let's us be clear in which ways LLVM is not a perfect substitute, but
> > also where and how much resources we need to get closer. That helps
> > then with planning and prioritization. The proper thing to do is to
> > bury the dead but at this point we only have resources to collect dog
> > tags and keep moving. That doesn't rule out revisiting implementing
> > iWMMXT in the future.
> >
>
> To be honest with you, I don't actually think iwmmxt is that
> important. But I have already demonstrated how we can use a couple of
> macros to emit the same instructions without resorting to bare
> opcodes, so there is really no need to disable pieces left and right
> because the Clang assembler does not support them outright - it just
> needs someone to care enough about this, rather than rush through the
> list with a tick the box attitude, and rip out the pieces that look a
> bit too complicated.

I don't think we have a long list of configs to mark broken; it's not
like we're flooding the list with patches marking tons of things
broken.

In my town where I live, they spray paint the sidewalks that need
repair so that people don't trip. "Why not just replace the
sidewalk?" you might ask. Eventually they do, but it takes much more
time and effort, and there are a lot more broken sidewalks than people
and money to repair them. Was it a waste to highlight the areas where
people might trip?

All I'm suggesting is that we mark the way for future travelers that
this doesn't work, as you'll get a potentially very confusing error
message if you try. Then `git log` probably has a Link: and more
context as to why. Then you can `grep` for all of the places that are
broken, and figure out which sidewalk is most important to repair
first, and better estimate the cost to repair all of them.

>
> > 3. Testing Clang's assembler allows for us to do kernel builds without
> > binutils. This work is helping uncover places within the kernel where
> > the build is not hermetic. We're still a long ways away from hermetic
> > reproducible kernel builds I suspect, but my main worry is when people
> > have multiple versions of a toolchain in their path, that only one is
> > used. Otherwise, it leads to spooky hard to reproduce bug reports. I
> > don't think I need to argue about build hermiticity, but it's
> > important for user trust and verification.
> >
>
> So we need the Clang assembler for reproducible builds?

No; it allows us to not even install binutils on the host, and know
that the Clang assembler and only the Clang assembler is being used;
no other host utilities that happen to be in the $PATH.

When I do a build with any LLVM utility, I'd like to ensure that the
equivalent from binutils isn't invoked. This is something folks can
already start testing today with multiple installs of GNU or LLVM
utilities, though I don't know of anyone leading a conscious effort.
This is something we've been finding the hard way; we've been doing
builds on hosts with no GCC/binutils, and finding places where as an
example `nm` or `objcopy` is hardcoded, rather than
`$(NM)`/`$(OBCOPY)`, which is the first step towards making the build
more hermetic.

The next step is ensuring the tools produce deterministic output,
which is quite painful.

>
> > 4. Improving toolchain portability of assembler in LLVM itself.
> > There's plenty of subtle differences, but missing full on instructions
> > (or are they psuedo's?) is pretty bad.
> >
>
> I don't think this point belongs in the 'why should we care about the
> Clang assembler' list :-)

If LLVM improves, thanks to efforts to support the Linux kernel, then
I'd call that a win that kernel developers can celebrate.

>
> > I value the feedback from you, Russell, and Arnd even when I disagree.
> > These are just my thoughts on *why* things are the way they are, FWIW.
> > If there's thoughts on how we might better prioritize one thing over
> > another, I would appreciate it.
>
> I think the 'all legacy needs to die' attitude is not particularly
> helpful here. In the 32-bit Linux/ARM community, there are many people
> who care about older systems, and spend a lot of time on keeping
> things in a working order on platforms that Google or ARM have stopped
> caring about long ago.

I think if anyone on this thread had the position that "_all_ legacy
needs to die," then no one here would have anything to work on. :P
--
Thanks,
~Nick Desaulniers

2020-04-15 21:57:18

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

I don't know if this will help, but I feel like folks might be talking
past each other a little here. I see two primary issues that are
colliding, and I just want to call them out specifically...

On Tue, Apr 14, 2020 at 1:59 AM Ard Biesheuvel <[email protected]> wrote:
> On Mon, 13 Apr 2020 at 22:45, Nick Desaulniers <[email protected]> wrote:
> > 1. continuous integration and randconfigs. We need CI to help us spot
> > where things are still broken, and help us from regressing the ground
> > we've fought for. We can't expect kernel developers to test with
> > LLVM. Currently, we have LLVM builds in numerous kernel continuous

First, is this one. To paraphrase, "we don't want to lose hard-won
ground."

> To be honest with you, I don't actually think iwmmxt is that
> important. But I have already demonstrated how we can use a couple of
> macros to emit the same instructions without resorting to bare
> opcodes, so there is really no need to disable pieces left and right
> because the Clang assembler does not support them outright - it just
> needs someone to care enough about this, rather than rush through the
> list with a tick the box attitude, and rip out the pieces that look a
> bit too complicated.

The second is this one. To paraphrase, "we can just fix things instead
of disabling them."

This feels a lot to me like the pain I (and plenty of others) have felt
when doing treewide changes (adding full compiler support is kind of a
treewide change). The above two points were really strongly felt when we
were trying to remove VLAs. In our case, we didn't even have the option
to disable stuff, so the pain was even worse: VLAs were being added to
the kernel faster than we could remove them.

What's a good middle ground here? For VLAs, I ended up getting akpm's
help by having him add -Wvla to his local builds and send nag emails
to people when they added VLAs. That's not really a thing here, but it
seems like there should be a way to avoid losing ground (in this case,
it's the erosion of attention: repeated known-issue warnings means the
CI gets ignored for the warnings on newly added issues). It does seem
to me like adding the negative depends is a reasonable first step: it
marks what hard things need fixing later without losing coverage for
things that can be more easily fixed now with available resources.

For the specific iwmmxt.S case, perhaps the solution is the suggested
changes? I imagine it should be possible to do a binary diff to see zero
changes before/after.

For others, is a negative depends acceptable? Given how responsive
Nick, Nathan, and others are, I don't think there is any real risk of a
"slippery slope" of things just getting swept under the rug forever.
Everyone here wants to see the kernel be awesome. :)

-Kees

--
Kees Cook

2020-04-15 22:58:17

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Tue, 14 Apr 2020 at 22:53, Kees Cook <[email protected]> wrote:
>
> I don't know if this will help, but I feel like folks might be talking
> past each other a little here. I see two primary issues that are
> colliding, and I just want to call them out specifically...
>

Thanks Kees.

I don't think there is a huge difference of opinion here, and I hope I
managed to strike the right tone, which was not meant to be a grumpy
one.

To reiterate my point: I strongly prefer minor asm surgery over
elaborate Kconfig plumbing if it means we can retain the functionality
even when using LLVM tools. In particular, the use of macros to
implement missing ISA support should be considered before any other
solution, as these are already being used widely across architectures
to fill in such gaps.


> On Tue, Apr 14, 2020 at 1:59 AM Ard Biesheuvel <[email protected]> wrote:
> > On Mon, 13 Apr 2020 at 22:45, Nick Desaulniers <[email protected]> wrote:
> > > 1. continuous integration and randconfigs. We need CI to help us spot
> > > where things are still broken, and help us from regressing the ground
> > > we've fought for. We can't expect kernel developers to test with
> > > LLVM. Currently, we have LLVM builds in numerous kernel continuous
>
> First, is this one. To paraphrase, "we don't want to lose hard-won
> ground."
>
> > To be honest with you, I don't actually think iwmmxt is that
> > important. But I have already demonstrated how we can use a couple of
> > macros to emit the same instructions without resorting to bare
> > opcodes, so there is really no need to disable pieces left and right
> > because the Clang assembler does not support them outright - it just
> > needs someone to care enough about this, rather than rush through the
> > list with a tick the box attitude, and rip out the pieces that look a
> > bit too complicated.
>
> The second is this one. To paraphrase, "we can just fix things instead
> of disabling them."
>
> This feels a lot to me like the pain I (and plenty of others) have felt
> when doing treewide changes (adding full compiler support is kind of a
> treewide change). The above two points were really strongly felt when we
> were trying to remove VLAs. In our case, we didn't even have the option
> to disable stuff, so the pain was even worse: VLAs were being added to
> the kernel faster than we could remove them.
>
> What's a good middle ground here? For VLAs, I ended up getting akpm's
> help by having him add -Wvla to his local builds and send nag emails
> to people when they added VLAs. That's not really a thing here, but it
> seems like there should be a way to avoid losing ground (in this case,
> it's the erosion of attention: repeated known-issue warnings means the
> CI gets ignored for the warnings on newly added issues). It does seem
> to me like adding the negative depends is a reasonable first step: it
> marks what hard things need fixing later without losing coverage for
> things that can be more easily fixed now with available resources.
>
> For the specific iwmmxt.S case, perhaps the solution is the suggested
> changes? I imagine it should be possible to do a binary diff to see zero
> changes before/after.
>

This code has been around since 2004. It has never been possible to
assemble it with Clang's assembler. So the only thing this patch gives
you is the ability to switch from a .config where IWMMXT was disabled
by hand to one where it gets disabled automatically by Kconfig.

So what hard-won ground are we losing here? Did IWMMXT recently get
enabled in a defconfig that you care about?

> For others, is a negative depends acceptable? Given how responsive
> Nick, Nathan, and others are, I don't think there is any real risk of a
> "slippery slope" of things just getting swept under the rug forever.
> Everyone here wants to see the kernel be awesome. :)
>

I am not disagreeing with you here, and I have worked with Nick,
Nathan and Stefan on numerous occasions to get Clang related build
issues solved.

2020-04-16 00:00:41

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Wed, Apr 15, 2020 at 12:32 PM Ard Biesheuvel <[email protected]> wrote:
>
> On Tue, 14 Apr 2020 at 22:53, Kees Cook <[email protected]> wrote:
> >
> > I don't know if this will help, but I feel like folks might be talking
> > past each other a little here. I see two primary issues that are
> > colliding, and I just want to call them out specifically...
>
> To reiterate my point: I strongly prefer minor asm surgery over
> elaborate Kconfig plumbing if it means we can retain the functionality
> even when using LLVM tools. In particular, the use of macros to
> implement missing ISA support should be considered before any other
> solution, as these are already being used widely across architectures
> to fill in such gaps.

+1

> > What's a good middle ground here? For VLAs, I ended up getting akpm's
> > help by having him add -Wvla to his local builds and send nag emails
> > to people when they added VLAs. That's not really a thing here, but it
> > seems like there should be a way to avoid losing ground (in this case,
> > it's the erosion of attention: repeated known-issue warnings means the
> > CI gets ignored for the warnings on newly added issues). It does seem
> > to me like adding the negative depends is a reasonable first step: it
> > marks what hard things need fixing later without losing coverage for
> > things that can be more easily fixed now with available resources.
> >
> > For the specific iwmmxt.S case, perhaps the solution is the suggested
> > changes? I imagine it should be possible to do a binary diff to see zero
> > changes before/after.
>
> This code has been around since 2004. It has never been possible to
> assemble it with Clang's assembler. So the only thing this patch gives
> you is the ability to switch from a .config where IWMMXT was disabled
> by hand to one where it gets disabled automatically by Kconfig.
>
> So what hard-won ground are we losing here? Did IWMMXT recently get
> enabled in a defconfig that you care about?

I mainly care about the build testing aspect here, it seems we are getting
close to having the clang integrated assembler working with all .S files
in the kernel (it used to work for none), and I'd like to do randconfig and
allmodconfig tests that include these as well. Disabling the option works
for me, but your suggestion with the added macros is clearly better.

Arnd

2020-04-16 00:13:21

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Wed, Apr 15, 2020 at 02:58:05PM +0200, Arnd Bergmann wrote:
> On Wed, Apr 15, 2020 at 12:32 PM Ard Biesheuvel <[email protected]> wrote:
> >
> > On Tue, 14 Apr 2020 at 22:53, Kees Cook <[email protected]> wrote:
> > >
> > > I don't know if this will help, but I feel like folks might be talking
> > > past each other a little here. I see two primary issues that are
> > > colliding, and I just want to call them out specifically...
> >
> > To reiterate my point: I strongly prefer minor asm surgery over
> > elaborate Kconfig plumbing if it means we can retain the functionality
> > even when using LLVM tools. In particular, the use of macros to
> > implement missing ISA support should be considered before any other
> > solution, as these are already being used widely across architectures
> > to fill in such gaps.
>
> +1
>
> > > What's a good middle ground here? For VLAs, I ended up getting akpm's
> > > help by having him add -Wvla to his local builds and send nag emails
> > > to people when they added VLAs. That's not really a thing here, but it
> > > seems like there should be a way to avoid losing ground (in this case,
> > > it's the erosion of attention: repeated known-issue warnings means the
> > > CI gets ignored for the warnings on newly added issues). It does seem
> > > to me like adding the negative depends is a reasonable first step: it
> > > marks what hard things need fixing later without losing coverage for
> > > things that can be more easily fixed now with available resources.
> > >
> > > For the specific iwmmxt.S case, perhaps the solution is the suggested
> > > changes? I imagine it should be possible to do a binary diff to see zero
> > > changes before/after.
> >
> > This code has been around since 2004. It has never been possible to
> > assemble it with Clang's assembler. So the only thing this patch gives
> > you is the ability to switch from a .config where IWMMXT was disabled
> > by hand to one where it gets disabled automatically by Kconfig.
> >
> > So what hard-won ground are we losing here? Did IWMMXT recently get
> > enabled in a defconfig that you care about?
>
> I mainly care about the build testing aspect here, it seems we are getting
> close to having the clang integrated assembler working with all .S files
> in the kernel (it used to work for none), and I'd like to do randconfig and
> allmodconfig tests that include these as well. Disabling the option works
> for me, but your suggestion with the added macros is clearly better.

However, to me it seems the approach has been "clang doesn't like X,
the kernel has to change to suit clang" - sometimes at the expense
of either functionality or maintainability of the kernel.

Some of the changes have been good (provoking modernisation) but that's
not true of everything - and I see nothing happening subsequently to
rectify the situation.

Had we gone down the path of disabling the build of iWMMXT, if anyone
builds a kernel with clang for ARMv5 PXA and relies on iWMMXT, their
userspace suddenly breaks because they used a different compiler and
lost the necessary iWMMXT support in the kernel to allow userspace to
operate, which isn't a nice approach.

Using macros is the best solution to work around clang, but should not
be done at the expense of GNU AS which has proper support.

I'd say this: if clang wants to support building ARMv5, then it needs
to support iWMMXT and stop forcing the kernel to adapt to Clang's
incomplete implementations (which are no direct fault of the clang
developers.)

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

2020-04-16 00:15:44

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Wed, Apr 15, 2020 at 12:32:17PM +0200, Ard Biesheuvel wrote:
> To reiterate my point: I strongly prefer minor asm surgery over
> elaborate Kconfig plumbing if it means we can retain the functionality
> even when using LLVM tools. In particular, the use of macros to
> implement missing ISA support should be considered before any other
> solution, as these are already being used widely across architectures
> to fill in such gaps.

Yeah, this seems like the right place to start from. It sounded like
there were cases where the people with knowledge needed to accomplish
the macro creation were not always immediately available. But, yes,
let's get iwmmxt fixed up.

> This code has been around since 2004. It has never been possible to
> assemble it with Clang's assembler. So the only thing this patch gives
> you is the ability to switch from a .config where IWMMXT was disabled
> by hand to one where it gets disabled automatically by Kconfig.

Right -- I meant "let's fix iwmmxt with macro magic" not "let's disable
it". I did want to point out the Kconfig disabling may be needed in
other cases.

> So what hard-won ground are we losing here? Did IWMMXT recently get
> enabled in a defconfig that you care about?

It's a CI's ability to do randconfig builds to catch new stuff. (i.e.
where "disabled by hand" isn't part of the process.) Since there are
multiple CIs doing multi-architecture builds we need to get these things
fixed upstream, not a CI's local patch stacks or Kconfig whitelists,
etc. And when the expertise isn't available to fix arch-specific stuff,
Kconfig negative depends seems like a reasonable middle ground. I, too,
prefer fixes that allow Clang to do its work without wrecking things
for GNU as.

> I am not disagreeing with you here, and I have worked with Nick,
> Nathan and Stefan on numerous occasions to get Clang related build
> issues solved.

Yup! Totally; this thread just looked very familiar to me from doing
treewide stuff and I didn't want what I thought looked like the core
points to get lost in the details. :)

--
Kees Cook

2020-04-17 14:17:56

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

Hi

On Thu, Apr 16, 2020 at 12:44 AM Kees Cook <[email protected]> wrote:
>
> On Wed, Apr 15, 2020 at 12:32:17PM +0200, Ard Biesheuvel wrote:
> > To reiterate my point: I strongly prefer minor asm surgery over
> > elaborate Kconfig plumbing if it means we can retain the functionality
> > even when using LLVM tools. In particular, the use of macros to
> > implement missing ISA support should be considered before any other
> > solution, as these are already being used widely across architectures
> > to fill in such gaps.
>
> Yeah, this seems like the right place to start from. It sounded like
> there were cases where the people with knowledge needed to accomplish
> the macro creation were not always immediately available. But, yes,
> let's get iwmmxt fixed up.
>
> > This code has been around since 2004. It has never been possible to
> > assemble it with Clang's assembler. So the only thing this patch gives
> > you is the ability to switch from a .config where IWMMXT was disabled
> > by hand to one where it gets disabled automatically by Kconfig.
>
> Right -- I meant "let's fix iwmmxt with macro magic" not "let's disable
> it". I did want to point out the Kconfig disabling may be needed in
> other cases.
>
> > So what hard-won ground are we losing here? Did IWMMXT recently get
> > enabled in a defconfig that you care about?
>
> It's a CI's ability to do randconfig builds to catch new stuff. (i.e.
> where "disabled by hand" isn't part of the process.) Since there are
> multiple CIs doing multi-architecture builds we need to get these things
> fixed upstream, not a CI's local patch stacks or Kconfig whitelists,
> etc. And when the expertise isn't available to fix arch-specific stuff,
> Kconfig negative depends seems like a reasonable middle ground. I, too,
> prefer fixes that allow Clang to do its work without wrecking things
> for GNU as.
>
> > I am not disagreeing with you here, and I have worked with Nick,
> > Nathan and Stefan on numerous occasions to get Clang related build
> > issues solved.
>
> Yup! Totally; this thread just looked very familiar to me from doing
> treewide stuff and I didn't want what I thought looked like the core
> points to get lost in the details. :)
>
> --
> Kees Cook



When I started to read this thread,
I just slightly preferred
depends on $(as-instr, <some iwmmxt instructions>)
over
depends on AS_IS_CLANG

because it is what we recently did for x86.
commit 5e8ebd841a44b895e2bab5e874ff7d333ca31135


But, Ard's macro approach seems even better.


I do not know how difficult to replace
the arch/x86/Kconfig.assembler with .macro

Anyway, arch/x86/Kconfig.assembler will be gone
when we raise the binutils version next time.


--
Best Regards
Masahiro Yamada

2020-04-19 11:09:48

by Stefan Agner

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On 2020-04-15 16:44, Russell King - ARM Linux admin wrote:
> On Wed, Apr 15, 2020 at 02:58:05PM +0200, Arnd Bergmann wrote:
>> On Wed, Apr 15, 2020 at 12:32 PM Ard Biesheuvel <[email protected]> wrote:
>> >
>> > On Tue, 14 Apr 2020 at 22:53, Kees Cook <[email protected]> wrote:
>> > >
>> > > I don't know if this will help, but I feel like folks might be talking
>> > > past each other a little here. I see two primary issues that are
>> > > colliding, and I just want to call them out specifically...
>> >
>> > To reiterate my point: I strongly prefer minor asm surgery over
>> > elaborate Kconfig plumbing if it means we can retain the functionality
>> > even when using LLVM tools. In particular, the use of macros to
>> > implement missing ISA support should be considered before any other
>> > solution, as these are already being used widely across architectures
>> > to fill in such gaps.
>>
>> +1
>>
>> > > What's a good middle ground here? For VLAs, I ended up getting akpm's
>> > > help by having him add -Wvla to his local builds and send nag emails
>> > > to people when they added VLAs. That's not really a thing here, but it
>> > > seems like there should be a way to avoid losing ground (in this case,
>> > > it's the erosion of attention: repeated known-issue warnings means the
>> > > CI gets ignored for the warnings on newly added issues). It does seem
>> > > to me like adding the negative depends is a reasonable first step: it
>> > > marks what hard things need fixing later without losing coverage for
>> > > things that can be more easily fixed now with available resources.
>> > >
>> > > For the specific iwmmxt.S case, perhaps the solution is the suggested
>> > > changes? I imagine it should be possible to do a binary diff to see zero
>> > > changes before/after.
>> >
>> > This code has been around since 2004. It has never been possible to
>> > assemble it with Clang's assembler. So the only thing this patch gives
>> > you is the ability to switch from a .config where IWMMXT was disabled
>> > by hand to one where it gets disabled automatically by Kconfig.
>> >
>> > So what hard-won ground are we losing here? Did IWMMXT recently get
>> > enabled in a defconfig that you care about?
>>
>> I mainly care about the build testing aspect here, it seems we are getting
>> close to having the clang integrated assembler working with all .S files
>> in the kernel (it used to work for none), and I'd like to do randconfig and
>> allmodconfig tests that include these as well. Disabling the option works
>> for me, but your suggestion with the added macros is clearly better.
>
> However, to me it seems the approach has been "clang doesn't like X,
> the kernel has to change to suit clang" - sometimes at the expense
> of either functionality or maintainability of the kernel.

There are also quite some quirks which work around gcc/binutils
weirdness in the kernel. E.g. there are macros to make VFP support work
with older binutils versions.

I understand, Clang is the newcomer here. Breaking gcc/binutils is a
nogo, and functionality or maintainability should not suffer.

I think the important thing here is that whatever workarounds are
introduced that they are properly documented: Why is this required, how
does it work, and under what circumstances can it be removed again. This
should be in commit logs as well as inline.

>
> Some of the changes have been good (provoking modernisation) but that's
> not true of everything - and I see nothing happening subsequently to
> rectify the situation.

And that is true with gcc/binutils work arounds which have been
introduced long time ago.

From my perspective, the kernel always tried to be very user oriented
when it comes to toolchain support: Rather than blacklist a known broken
toolchain version, work arounds have been introduced. And I think we
should treat Clang no different.

That being said, I am not saying we should hack up the kernel to make
Clang work no matter what. There are fixes made in Clang so we can avoid
introducing hacks in the kernel. There needs to be a balance.

Again, I think more important is that when we introduce work arounds in
Linux, that we make sure that such changes are properly document. This
will make cleanup a *lot* easier and therefor more likely down the road.

>
> Had we gone down the path of disabling the build of iWMMXT, if anyone
> builds a kernel with clang for ARMv5 PXA and relies on iWMMXT, their
> userspace suddenly breaks because they used a different compiler and
> lost the necessary iWMMXT support in the kernel to allow userspace to
> operate, which isn't a nice approach.

That is actually a very good point and hasn't really been taken into
account in this discussion.

Currently the default behavior is that iWMMXT is enabled for all CPUs
supporting it. With the patch as-is, users who might try out Clang (with
integrated assembler) likely will not notice that iWMMXT is not
supported. They will be in for a surprise once they try to use user
space applications making use of iWMMXT instructions.

Avoiding such surprises would mean we either disable all iWMMXT
CPUs/architectures when using Clang's integrated assembler or make sure
they work with the integrated assembler.

Is there a nice way to print a warning at build time in this case?

>
> Using macros is the best solution to work around clang, but should not
> be done at the expense of GNU AS which has proper support.

I guess making the macros a Clang only thing should be doable.

>
> I'd say this: if clang wants to support building ARMv5, then it needs
> to support iWMMXT and stop forcing the kernel to adapt to Clang's
> incomplete implementations (which are no direct fault of the clang
> developers.)

So far I at least did not really look into supporting ARMv5
architectures when building with Clang. I would be fine to just disable
ARMv5 architecture when using Clang's integrated assembler, at least for
now.

However, iWMMXT is also supported by Marvell's PJ4 microarchitecture,
which is an ARMv7-A architecture. Hence this file is assembled when
building multi_v7_defconfig (since ARCH_DOVE is enabled there). So if we
consider iWMMXT mandatory to support an architecture, we would have to
disable more than "just" ARMv5.

That said, I still would prefer this patch over disabling all the
architectures. Keep in mind that integrated assembler needs to be
explicitly enabled (using LLVM_IAS=1).

While I fear that the asm macro surgery will not be that clean/trivial,
I still think its the best option. It side steps the problem of
accidentally breaking user space and ultimately allows to build more
configurations with Clang's integrated assembler.

--
Stefan

2020-11-04 21:24:24

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

On Mon, Apr 13, 2020 at 2:01 PM Russell King - ARM Linux admin
<[email protected]> wrote:
>
> On Mon, Apr 13, 2020 at 12:23:38PM -0700, Nick Desaulniers wrote:
> > On Fri, Apr 10, 2020 at 5:33 AM Russell King - ARM Linux admin
> > <[email protected]> wrote:
> > >
> > > For older CPUs, it doesn't matter what the latest ARM ARM says, the
> > > appropriate version of the ARM ARM is the one relevant for the CPU
> > > architecture. This is a mistake frequently made, and it's been pointed
> > > out by Arm Ltd in the past (before ARMv6 even came on the scene) that
> > > keeping older revisions is necessary if you want to be interested in
> > > the older architectures.
> >
> > As if it never existed *waves hands*. Interesting. Does ARM still
> > distribute these older reference manuals? Do you keep copies of the
> > older revisions?
>
> I keep copies of every document I've needed that I'm allowed to keep
> as a general rule, including the early paper copies of the ARM
> architecture reference manual. I even have the original VLSI ARM2
> databook.
>
> For the ARMv5TE architecture, you're looking for DDI0100E which can be
> found via google.

Russell, Arnd found
http://read.pudn.com/downloads78/ebook/297953/WirlessMMX251669_devguide.pdf
"Intel Wireless MMX Technology - Developer Guide - August, 2002".

I thought you might be interested in fetching a copy for historical reference.
--
Thanks,
~Nick Desaulniers

2020-11-07 00:13:50

by Jian Cai

[permalink] [raw]
Subject: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

This patch replaces 6 IWMMXT instructions Clang's integrated assembler
does not support in iwmmxt.S using macros, while making sure GNU
assembler still emit the same instructions. This should be easier than
providing full IWMMXT support in Clang.

"Intel Wireless MMX Technology - Developer Guide - August, 2002" should
be referenced for the encoding schemes of these extensions.

Link: https://github.com/ClangBuiltLinux/linux/issues/975

Suggested-by: Nick Desaulniers <[email protected]>
Suggested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Jian Cai <[email protected]>
---
arch/arm/kernel/iwmmxt.S | 89 ++++++++++++++++++++--------------------
arch/arm/kernel/iwmmxt.h | 47 +++++++++++++++++++++
2 files changed, 92 insertions(+), 44 deletions(-)
create mode 100644 arch/arm/kernel/iwmmxt.h

diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index 0dcae787b004..d2b4ac06e4ed 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -16,6 +16,7 @@
#include <asm/thread_info.h>
#include <asm/asm-offsets.h>
#include <asm/assembler.h>
+#include "iwmmxt.h"

#if defined(CONFIG_CPU_PJ4) || defined(CONFIG_CPU_PJ4B)
#define PJ4(code...) code
@@ -113,33 +114,33 @@ concan_save:

concan_dump:

- wstrw wCSSF, [r1, #MMX_WCSSF]
- wstrw wCASF, [r1, #MMX_WCASF]
- wstrw wCGR0, [r1, #MMX_WCGR0]
- wstrw wCGR1, [r1, #MMX_WCGR1]
- wstrw wCGR2, [r1, #MMX_WCGR2]
- wstrw wCGR3, [r1, #MMX_WCGR3]
+ wstrw wCSSF, r1, MMX_WCSSF
+ wstrw wCASF, r1, MMX_WCASF
+ wstrw wCGR0, r1, MMX_WCGR0
+ wstrw wCGR1, r1, MMX_WCGR1
+ wstrw wCGR2, r1, MMX_WCGR2
+ wstrw wCGR3, r1, MMX_WCGR3

1: @ MUP? wRn
tst r2, #0x2
beq 2f

- wstrd wR0, [r1, #MMX_WR0]
- wstrd wR1, [r1, #MMX_WR1]
- wstrd wR2, [r1, #MMX_WR2]
- wstrd wR3, [r1, #MMX_WR3]
- wstrd wR4, [r1, #MMX_WR4]
- wstrd wR5, [r1, #MMX_WR5]
- wstrd wR6, [r1, #MMX_WR6]
- wstrd wR7, [r1, #MMX_WR7]
- wstrd wR8, [r1, #MMX_WR8]
- wstrd wR9, [r1, #MMX_WR9]
- wstrd wR10, [r1, #MMX_WR10]
- wstrd wR11, [r1, #MMX_WR11]
- wstrd wR12, [r1, #MMX_WR12]
- wstrd wR13, [r1, #MMX_WR13]
- wstrd wR14, [r1, #MMX_WR14]
- wstrd wR15, [r1, #MMX_WR15]
+ wstrd wR0, r1, MMX_WR0
+ wstrd wR1, r1, MMX_WR1
+ wstrd wR2, r1, MMX_WR2
+ wstrd wR3, r1, MMX_WR3
+ wstrd wR4, r1, MMX_WR4
+ wstrd wR5, r1, MMX_WR5
+ wstrd wR6, r1, MMX_WR6
+ wstrd wR7, r1, MMX_WR7
+ wstrd wR8, r1, MMX_WR8
+ wstrd wR9, r1, MMX_WR9
+ wstrd wR10, r1, MMX_WR10
+ wstrd wR11, r1, MMX_WR11
+ wstrd wR12, r1, MMX_WR12
+ wstrd wR13, r1, MMX_WR13
+ wstrd wR14, r1, MMX_WR14
+ wstrd wR15, r1, MMX_WR15

2: teq r0, #0 @ anything to load?
reteq lr @ if not, return
@@ -147,30 +148,30 @@ concan_dump:
concan_load:

@ Load wRn
- wldrd wR0, [r0, #MMX_WR0]
- wldrd wR1, [r0, #MMX_WR1]
- wldrd wR2, [r0, #MMX_WR2]
- wldrd wR3, [r0, #MMX_WR3]
- wldrd wR4, [r0, #MMX_WR4]
- wldrd wR5, [r0, #MMX_WR5]
- wldrd wR6, [r0, #MMX_WR6]
- wldrd wR7, [r0, #MMX_WR7]
- wldrd wR8, [r0, #MMX_WR8]
- wldrd wR9, [r0, #MMX_WR9]
- wldrd wR10, [r0, #MMX_WR10]
- wldrd wR11, [r0, #MMX_WR11]
- wldrd wR12, [r0, #MMX_WR12]
- wldrd wR13, [r0, #MMX_WR13]
- wldrd wR14, [r0, #MMX_WR14]
- wldrd wR15, [r0, #MMX_WR15]
+ wldrd wR0, r0, MMX_WR0
+ wldrd wR1, r0, MMX_WR1
+ wldrd wR2, r0, MMX_WR2
+ wldrd wR3, r0, MMX_WR3
+ wldrd wR4, r0, MMX_WR4
+ wldrd wR5, r0, MMX_WR5
+ wldrd wR6, r0, MMX_WR6
+ wldrd wR7, r0, MMX_WR7
+ wldrd wR8, r0, MMX_WR8
+ wldrd wR9, r0, MMX_WR9
+ wldrd wR10, r0, MMX_WR10
+ wldrd wR11, r0, MMX_WR11
+ wldrd wR12, r0, MMX_WR12
+ wldrd wR13, r0, MMX_WR13
+ wldrd wR14, r0, MMX_WR14
+ wldrd wR15, r0, MMX_WR15

@ Load wCx
- wldrw wCSSF, [r0, #MMX_WCSSF]
- wldrw wCASF, [r0, #MMX_WCASF]
- wldrw wCGR0, [r0, #MMX_WCGR0]
- wldrw wCGR1, [r0, #MMX_WCGR1]
- wldrw wCGR2, [r0, #MMX_WCGR2]
- wldrw wCGR3, [r0, #MMX_WCGR3]
+ wldrw wCSSF, r0, MMX_WCSSF
+ wldrw wCASF, r0, MMX_WCASF
+ wldrw wCGR0, r0, MMX_WCGR0
+ wldrw wCGR1, r0, MMX_WCGR1
+ wldrw wCGR2, r0, MMX_WCGR2
+ wldrw wCGR3, r0, MMX_WCGR3

@ clear CUP/MUP (only if r1 != 0)
teq r1, #0
diff --git a/arch/arm/kernel/iwmmxt.h b/arch/arm/kernel/iwmmxt.h
new file mode 100644
index 000000000000..fb627286f5bb
--- /dev/null
+++ b/arch/arm/kernel/iwmmxt.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __IWMMXT_H__
+#define __IWMMXT_H__
+
+.irp b, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
+.set .LwR\b, \b
+.set .Lr\b, \b
+.endr
+
+.set .LwCSSF, 0x2
+.set .LwCASF, 0x3
+.set .LwCGR0, 0x8
+.set .LwCGR1, 0x9
+.set .LwCGR2, 0xa
+.set .LwCGR3, 0xb
+
+.macro wldrd, reg:req, base:req, offset:req
+.inst 0xedd00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
+.endm
+
+.macro wldrw, reg:req, base:req, offset:req
+.inst 0xfd900100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
+.endm
+
+.macro wstrd, reg:req, base:req, offset:req
+.inst 0xedc00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
+.endm
+
+.macro wstrw, reg:req, base:req, offset:req
+.inst 0xfd800100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
+.endm
+
+#ifdef __clang__
+
+#define wCon c1
+
+.macro tmrc, dest:req, control:req
+mrc p1, 0, \dest, \control, c0, 0
+.endm
+
+.macro tmcr, control:req, src:req
+mcr p1, 0, \src, \control, c0, 0
+.endm
+#endif
+
+#endif
--
2.29.1.341.ge80a0c044ae-goog

2020-11-07 08:32:28

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Sat, 7 Nov 2020 at 01:11, Jian Cai <[email protected]> wrote:
>
> This patch replaces 6 IWMMXT instructions Clang's integrated assembler
> does not support in iwmmxt.S using macros, while making sure GNU
> assembler still emit the same instructions. This should be easier than
> providing full IWMMXT support in Clang.
>
> "Intel Wireless MMX Technology - Developer Guide - August, 2002" should
> be referenced for the encoding schemes of these extensions.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/975
>
> Suggested-by: Nick Desaulniers <[email protected]>
> Suggested-by: Ard Biesheuvel <[email protected]>
> Signed-off-by: Jian Cai <[email protected]>

Please make sure you test this carefully on BE32, as the instruction
byte order used by .inst is LE IIRC

> ---
> arch/arm/kernel/iwmmxt.S | 89 ++++++++++++++++++++--------------------
> arch/arm/kernel/iwmmxt.h | 47 +++++++++++++++++++++
> 2 files changed, 92 insertions(+), 44 deletions(-)
> create mode 100644 arch/arm/kernel/iwmmxt.h
>
> diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
> index 0dcae787b004..d2b4ac06e4ed 100644
> --- a/arch/arm/kernel/iwmmxt.S
> +++ b/arch/arm/kernel/iwmmxt.S
> @@ -16,6 +16,7 @@
> #include <asm/thread_info.h>
> #include <asm/asm-offsets.h>
> #include <asm/assembler.h>
> +#include "iwmmxt.h"
>
> #if defined(CONFIG_CPU_PJ4) || defined(CONFIG_CPU_PJ4B)
> #define PJ4(code...) code
> @@ -113,33 +114,33 @@ concan_save:
>
> concan_dump:
>
> - wstrw wCSSF, [r1, #MMX_WCSSF]
> - wstrw wCASF, [r1, #MMX_WCASF]
> - wstrw wCGR0, [r1, #MMX_WCGR0]
> - wstrw wCGR1, [r1, #MMX_WCGR1]
> - wstrw wCGR2, [r1, #MMX_WCGR2]
> - wstrw wCGR3, [r1, #MMX_WCGR3]
> + wstrw wCSSF, r1, MMX_WCSSF
> + wstrw wCASF, r1, MMX_WCASF
> + wstrw wCGR0, r1, MMX_WCGR0
> + wstrw wCGR1, r1, MMX_WCGR1
> + wstrw wCGR2, r1, MMX_WCGR2
> + wstrw wCGR3, r1, MMX_WCGR3
>
> 1: @ MUP? wRn
> tst r2, #0x2
> beq 2f
>
> - wstrd wR0, [r1, #MMX_WR0]
> - wstrd wR1, [r1, #MMX_WR1]
> - wstrd wR2, [r1, #MMX_WR2]
> - wstrd wR3, [r1, #MMX_WR3]
> - wstrd wR4, [r1, #MMX_WR4]
> - wstrd wR5, [r1, #MMX_WR5]
> - wstrd wR6, [r1, #MMX_WR6]
> - wstrd wR7, [r1, #MMX_WR7]
> - wstrd wR8, [r1, #MMX_WR8]
> - wstrd wR9, [r1, #MMX_WR9]
> - wstrd wR10, [r1, #MMX_WR10]
> - wstrd wR11, [r1, #MMX_WR11]
> - wstrd wR12, [r1, #MMX_WR12]
> - wstrd wR13, [r1, #MMX_WR13]
> - wstrd wR14, [r1, #MMX_WR14]
> - wstrd wR15, [r1, #MMX_WR15]
> + wstrd wR0, r1, MMX_WR0
> + wstrd wR1, r1, MMX_WR1
> + wstrd wR2, r1, MMX_WR2
> + wstrd wR3, r1, MMX_WR3
> + wstrd wR4, r1, MMX_WR4
> + wstrd wR5, r1, MMX_WR5
> + wstrd wR6, r1, MMX_WR6
> + wstrd wR7, r1, MMX_WR7
> + wstrd wR8, r1, MMX_WR8
> + wstrd wR9, r1, MMX_WR9
> + wstrd wR10, r1, MMX_WR10
> + wstrd wR11, r1, MMX_WR11
> + wstrd wR12, r1, MMX_WR12
> + wstrd wR13, r1, MMX_WR13
> + wstrd wR14, r1, MMX_WR14
> + wstrd wR15, r1, MMX_WR15
>
> 2: teq r0, #0 @ anything to load?
> reteq lr @ if not, return
> @@ -147,30 +148,30 @@ concan_dump:
> concan_load:
>
> @ Load wRn
> - wldrd wR0, [r0, #MMX_WR0]
> - wldrd wR1, [r0, #MMX_WR1]
> - wldrd wR2, [r0, #MMX_WR2]
> - wldrd wR3, [r0, #MMX_WR3]
> - wldrd wR4, [r0, #MMX_WR4]
> - wldrd wR5, [r0, #MMX_WR5]
> - wldrd wR6, [r0, #MMX_WR6]
> - wldrd wR7, [r0, #MMX_WR7]
> - wldrd wR8, [r0, #MMX_WR8]
> - wldrd wR9, [r0, #MMX_WR9]
> - wldrd wR10, [r0, #MMX_WR10]
> - wldrd wR11, [r0, #MMX_WR11]
> - wldrd wR12, [r0, #MMX_WR12]
> - wldrd wR13, [r0, #MMX_WR13]
> - wldrd wR14, [r0, #MMX_WR14]
> - wldrd wR15, [r0, #MMX_WR15]
> + wldrd wR0, r0, MMX_WR0
> + wldrd wR1, r0, MMX_WR1
> + wldrd wR2, r0, MMX_WR2
> + wldrd wR3, r0, MMX_WR3
> + wldrd wR4, r0, MMX_WR4
> + wldrd wR5, r0, MMX_WR5
> + wldrd wR6, r0, MMX_WR6
> + wldrd wR7, r0, MMX_WR7
> + wldrd wR8, r0, MMX_WR8
> + wldrd wR9, r0, MMX_WR9
> + wldrd wR10, r0, MMX_WR10
> + wldrd wR11, r0, MMX_WR11
> + wldrd wR12, r0, MMX_WR12
> + wldrd wR13, r0, MMX_WR13
> + wldrd wR14, r0, MMX_WR14
> + wldrd wR15, r0, MMX_WR15
>
> @ Load wCx
> - wldrw wCSSF, [r0, #MMX_WCSSF]
> - wldrw wCASF, [r0, #MMX_WCASF]
> - wldrw wCGR0, [r0, #MMX_WCGR0]
> - wldrw wCGR1, [r0, #MMX_WCGR1]
> - wldrw wCGR2, [r0, #MMX_WCGR2]
> - wldrw wCGR3, [r0, #MMX_WCGR3]
> + wldrw wCSSF, r0, MMX_WCSSF
> + wldrw wCASF, r0, MMX_WCASF
> + wldrw wCGR0, r0, MMX_WCGR0
> + wldrw wCGR1, r0, MMX_WCGR1
> + wldrw wCGR2, r0, MMX_WCGR2
> + wldrw wCGR3, r0, MMX_WCGR3
>
> @ clear CUP/MUP (only if r1 != 0)
> teq r1, #0
> diff --git a/arch/arm/kernel/iwmmxt.h b/arch/arm/kernel/iwmmxt.h
> new file mode 100644
> index 000000000000..fb627286f5bb
> --- /dev/null
> +++ b/arch/arm/kernel/iwmmxt.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __IWMMXT_H__
> +#define __IWMMXT_H__
> +
> +.irp b, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
> +.set .LwR\b, \b
> +.set .Lr\b, \b
> +.endr
> +
> +.set .LwCSSF, 0x2
> +.set .LwCASF, 0x3
> +.set .LwCGR0, 0x8
> +.set .LwCGR1, 0x9
> +.set .LwCGR2, 0xa
> +.set .LwCGR3, 0xb
> +
> +.macro wldrd, reg:req, base:req, offset:req
> +.inst 0xedd00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
> +.endm
> +
> +.macro wldrw, reg:req, base:req, offset:req
> +.inst 0xfd900100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
> +.endm
> +
> +.macro wstrd, reg:req, base:req, offset:req
> +.inst 0xedc00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
> +.endm
> +
> +.macro wstrw, reg:req, base:req, offset:req
> +.inst 0xfd800100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
> +.endm
> +
> +#ifdef __clang__
> +
> +#define wCon c1
> +
> +.macro tmrc, dest:req, control:req
> +mrc p1, 0, \dest, \control, c0, 0
> +.endm
> +
> +.macro tmcr, control:req, src:req
> +mcr p1, 0, \src, \control, c0, 0
> +.endm
> +#endif
> +
> +#endif
> --
> 2.29.1.341.ge80a0c044ae-goog
>

2020-11-10 01:57:59

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Sat, Nov 7, 2020 at 12:29 AM Ard Biesheuvel <[email protected]> wrote:
>
> On Sat, 7 Nov 2020 at 01:11, Jian Cai <[email protected]> wrote:
> >
> > This patch replaces 6 IWMMXT instructions Clang's integrated assembler
> > does not support in iwmmxt.S using macros, while making sure GNU
> > assembler still emit the same instructions. This should be easier than
> > providing full IWMMXT support in Clang.
> >
> > "Intel Wireless MMX Technology - Developer Guide - August, 2002" should
> > be referenced for the encoding schemes of these extensions.
> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/975
> >
> > Suggested-by: Nick Desaulniers <[email protected]>
> > Suggested-by: Ard Biesheuvel <[email protected]>
> > Signed-off-by: Jian Cai <[email protected]>
>
> Please make sure you test this carefully on BE32, as the instruction
> byte order used by .inst is LE IIRC

Unless that was a recent-ish change in GAS, it looks like I get the
same disassembly/encodings before/after this patch for
CONFIG_BIG_ENDIAN with GNU as.

$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make CC=clang -j71 defconfig
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make CC=clang -j71 menuconfig
<enable CPU_BIG_ENDIAN>
$ grep BIG_ENDIAN .config
CONFIG_CPU_BIG_ENDIAN=y
CONFIG_ARCH_SUPPORTS_BIG_ENDIAN=y
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make CC=clang -j71
arch/arm/kernel/iwmmxt.o
$ arm-linux-gnueabi-objdump -dr arch/arm/kernel/iwmmxt.o > gas_before.txt
$ b4 am https://lore.kernel.org/lkml/[email protected]/
-o - | git am
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make CC=clang -j71
arch/arm/kernel/iwmmxt.o
$ arm-linux-gnueabi-objdump -dr arch/arm/kernel/iwmmxt.o > gas_after.txt
$ diff -u gas_before.txt gas_after.txt
$ echo $?
0

(Orthogonal, it looks like llvm-objdump has issues decoding
elf32-bigarm that we'll need to fix.)

Either way the patch LGTM for IWMMXT, thanks Jian for the patch and
Ard for the suggestions.

Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
--
Thanks,
~Nick Desaulniers

2020-11-10 07:38:48

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Tue, 10 Nov 2020 at 02:55, Nick Desaulniers <[email protected]> wrote:
>
> On Sat, Nov 7, 2020 at 12:29 AM Ard Biesheuvel <[email protected]> wrote:
> >
> > On Sat, 7 Nov 2020 at 01:11, Jian Cai <[email protected]> wrote:
> > >
> > > This patch replaces 6 IWMMXT instructions Clang's integrated assembler
> > > does not support in iwmmxt.S using macros, while making sure GNU
> > > assembler still emit the same instructions. This should be easier than
> > > providing full IWMMXT support in Clang.
> > >
> > > "Intel Wireless MMX Technology - Developer Guide - August, 2002" should
> > > be referenced for the encoding schemes of these extensions.
> > >
> > > Link: https://github.com/ClangBuiltLinux/linux/issues/975
> > >
> > > Suggested-by: Nick Desaulniers <[email protected]>
> > > Suggested-by: Ard Biesheuvel <[email protected]>
> > > Signed-off-by: Jian Cai <[email protected]>
> >
> > Please make sure you test this carefully on BE32, as the instruction
> > byte order used by .inst is LE IIRC
>
> Unless that was a recent-ish change in GAS, it looks like I get the
> same disassembly/encodings before/after this patch for
> CONFIG_BIG_ENDIAN with GNU as.
>
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make CC=clang -j71 defconfig
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make CC=clang -j71 menuconfig
> <enable CPU_BIG_ENDIAN>
> $ grep BIG_ENDIAN .config
> CONFIG_CPU_BIG_ENDIAN=y
> CONFIG_ARCH_SUPPORTS_BIG_ENDIAN=y
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make CC=clang -j71
> arch/arm/kernel/iwmmxt.o
> $ arm-linux-gnueabi-objdump -dr arch/arm/kernel/iwmmxt.o > gas_before.txt
> $ b4 am https://lore.kernel.org/lkml/[email protected]/
> -o - | git am
> $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- make CC=clang -j71
> arch/arm/kernel/iwmmxt.o
> $ arm-linux-gnueabi-objdump -dr arch/arm/kernel/iwmmxt.o > gas_after.txt
> $ diff -u gas_before.txt gas_after.txt
> $ echo $?
> 0
>
> (Orthogonal, it looks like llvm-objdump has issues decoding
> elf32-bigarm that we'll need to fix.)
>
> Either way the patch LGTM for IWMMXT, thanks Jian for the patch and
> Ard for the suggestions.
>
> Reviewed-by: Nick Desaulniers <[email protected]>
> Tested-by: Nick Desaulniers <[email protected]>


BE32 != BE8

Please use, e.g., ixp4xx_defconfig with IWMMXT and BE enabled. It
seems like .inst does the right thing here, i.e., the assembler knows
that it should emit BE for BE32 and LE for BE8, but it needs to be
confirmed.

2020-11-10 19:28:52

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Mon, Nov 9, 2020 at 11:36 PM Ard Biesheuvel <[email protected]> wrote:
>
> BE32 != BE8

Oh? Sorry, what does BE8 stand for? arch/arm/mm/Kconfig says:

CONFIG_CPU_ENDIAN_BE8
Support for the BE-8 (big-endian) mode on ARMv6 and ARMv7 processors.

vs:
CPU_ENDIAN_BE32
Support for the BE-32 (big-endian) mode on pre-ARMv6 processors.

So BE8 seems newer? It looks like in my tests, enabling
CPU_BIG_ENDIAN via menuconfig enabled CPU_ENDIAN_BE8.

>
> Please use, e.g., ixp4xx_defconfig with IWMMXT and BE enabled. It
> seems like .inst does the right thing here, i.e., the assembler knows
> that it should emit BE for BE32 and LE for BE8, but it needs to be
> confirmed.

Ah and ixp4xx_defconfig selects CPU_ENDIAN_BE32. Yep, and the
disassemblies of those match, too.
--
Thanks,
~Nick Desaulniers

2020-11-10 20:16:04

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Tue, Nov 10, 2020 at 12:10 PM Jian Cai <[email protected]> wrote:
>
> I tried to verify with ixp4xx_defconfig, and I noticed it also used CONFIG_CPU_BIG_ENDIAN=y to enable big endianness as follows,
>
> linux$ grep ENDIAN arch/arm/configs/ixp4xx_defconfig
> CONFIG_CPU_BIG_ENDIAN=y
>
> Also it appeared arch/arm/kernel/iwmmxt.o was not built with ixp4xx_defconfig. The commands I used
>
> linux$ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make CC=clang ixp4xx_defconfig
> linux$ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make -j32 CC=clang
> linux$ ls arch/arm/kernel/iwmmxt.o
> ls: cannot access 'arch/arm/kernel/iwmmxt.o': No such file or directory
>
> Did I miss any steps?

Yes, you need to manually enable CONFIG_IWMMXT in menuconfig or via
`scripts/configs -e`.

--
Thanks,
~Nick Desaulniers

2020-11-10 20:43:00

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Tue, Nov 10, 2020 at 9:11 PM 'Nick Desaulniers' via Clang Built
Linux <[email protected]> wrote:
>
> On Tue, Nov 10, 2020 at 12:10 PM Jian Cai <[email protected]> wrote:
> >
> > I tried to verify with ixp4xx_defconfig, and I noticed it also used CONFIG_CPU_BIG_ENDIAN=y to enable big endianness as follows,
> >
> > linux$ grep ENDIAN arch/arm/configs/ixp4xx_defconfig
> > CONFIG_CPU_BIG_ENDIAN=y
> >
> > Also it appeared arch/arm/kernel/iwmmxt.o was not built with ixp4xx_defconfig. The commands I used
> >
> > linux$ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make CC=clang ixp4xx_defconfig
> > linux$ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make -j32 CC=clang
> > linux$ ls arch/arm/kernel/iwmmxt.o
> > ls: cannot access 'arch/arm/kernel/iwmmxt.o': No such file or directory
> >
> > Did I miss any steps?
>
> Yes, you need to manually enable CONFIG_IWMMXT in menuconfig or via
> `scripts/configs -e`.

To clarify: ixp4xx and pxa3xx were two platforms based on the XScale core.
ixp4xx was commonly used in big-endian mode but lacked iWMMXt. pxa3xx
had iWMMXt but doesn't allow enabling big-endian mode because of a Kconfig
dependency, meaning that nobody has ever tried it, and it's likely broken.

Later 'mvebu' parts (Armada 510) do have iWMMXt and allow big-endian
mode, but those are BE8, with non-reversed byteorder for the instructions.

So none of this matters in practice, but it's very satifiying to know
it is finally
all working with the integrated assembler in all those combinations, at
least in theory!

Arnd

2020-11-10 21:40:10

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Tue, 10 Nov 2020 at 21:38, Arnd Bergmann <[email protected]> wrote:
>
> On Tue, Nov 10, 2020 at 9:11 PM 'Nick Desaulniers' via Clang Built
> Linux <[email protected]> wrote:
> >
> > On Tue, Nov 10, 2020 at 12:10 PM Jian Cai <[email protected]> wrote:
> > >
> > > I tried to verify with ixp4xx_defconfig, and I noticed it also used CONFIG_CPU_BIG_ENDIAN=y to enable big endianness as follows,
> > >
> > > linux$ grep ENDIAN arch/arm/configs/ixp4xx_defconfig
> > > CONFIG_CPU_BIG_ENDIAN=y
> > >
> > > Also it appeared arch/arm/kernel/iwmmxt.o was not built with ixp4xx_defconfig. The commands I used
> > >
> > > linux$ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make CC=clang ixp4xx_defconfig
> > > linux$ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make -j32 CC=clang
> > > linux$ ls arch/arm/kernel/iwmmxt.o
> > > ls: cannot access 'arch/arm/kernel/iwmmxt.o': No such file or directory
> > >
> > > Did I miss any steps?
> >
> > Yes, you need to manually enable CONFIG_IWMMXT in menuconfig or via
> > `scripts/configs -e`.
>
> To clarify: ixp4xx and pxa3xx were two platforms based on the XScale core.
> ixp4xx was commonly used in big-endian mode but lacked iWMMXt. pxa3xx
> had iWMMXt but doesn't allow enabling big-endian mode because of a Kconfig
> dependency, meaning that nobody has ever tried it, and it's likely broken.
>
> Later 'mvebu' parts (Armada 510) do have iWMMXt and allow big-endian
> mode, but those are BE8, with non-reversed byteorder for the instructions.
>
> So none of this matters in practice, but it's very satifiying to know
> it is finally
> all working with the integrated assembler in all those combinations, at
> least in theory!
>

Thanks for the perspective. I wasn't aware of all these details, but I
just wanted to confirm that .inst works as expected in all
configurations.

2020-11-12 08:47:02

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Sat, 7 Nov 2020 at 01:11, Jian Cai <[email protected]> wrote:
>
> This patch replaces 6 IWMMXT instructions Clang's integrated assembler
> does not support in iwmmxt.S using macros, while making sure GNU
> assembler still emit the same instructions. This should be easier than
> providing full IWMMXT support in Clang.
>
> "Intel Wireless MMX Technology - Developer Guide - August, 2002" should
> be referenced for the encoding schemes of these extensions.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/975
>
> Suggested-by: Nick Desaulniers <[email protected]>
> Suggested-by: Ard Biesheuvel <[email protected]>
> Signed-off-by: Jian Cai <[email protected]>

For the change itself,

Acked-by: Ard Biesheuvel <[email protected]>

Although I must admit I am still on the fence when it comes to the
policy around rewriting perfectly valid code like this to accommodate
a toolchain that nobody is likely to use to build the code in
question.

Perhaps we should at least add some rationale to the commit log why
Clang's integrated assembler is something we should care about? I take
it this is not about diagnostics or CFI but simply about avoiding the
need to carry cross-binutils in the first place?

> ---
> arch/arm/kernel/iwmmxt.S | 89 ++++++++++++++++++++--------------------
> arch/arm/kernel/iwmmxt.h | 47 +++++++++++++++++++++
> 2 files changed, 92 insertions(+), 44 deletions(-)
> create mode 100644 arch/arm/kernel/iwmmxt.h
>
> diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
> index 0dcae787b004..d2b4ac06e4ed 100644
> --- a/arch/arm/kernel/iwmmxt.S
> +++ b/arch/arm/kernel/iwmmxt.S
> @@ -16,6 +16,7 @@
> #include <asm/thread_info.h>
> #include <asm/asm-offsets.h>
> #include <asm/assembler.h>
> +#include "iwmmxt.h"
>
> #if defined(CONFIG_CPU_PJ4) || defined(CONFIG_CPU_PJ4B)
> #define PJ4(code...) code
> @@ -113,33 +114,33 @@ concan_save:
>
> concan_dump:
>
> - wstrw wCSSF, [r1, #MMX_WCSSF]
> - wstrw wCASF, [r1, #MMX_WCASF]
> - wstrw wCGR0, [r1, #MMX_WCGR0]
> - wstrw wCGR1, [r1, #MMX_WCGR1]
> - wstrw wCGR2, [r1, #MMX_WCGR2]
> - wstrw wCGR3, [r1, #MMX_WCGR3]
> + wstrw wCSSF, r1, MMX_WCSSF
> + wstrw wCASF, r1, MMX_WCASF
> + wstrw wCGR0, r1, MMX_WCGR0
> + wstrw wCGR1, r1, MMX_WCGR1
> + wstrw wCGR2, r1, MMX_WCGR2
> + wstrw wCGR3, r1, MMX_WCGR3
>
> 1: @ MUP? wRn
> tst r2, #0x2
> beq 2f
>
> - wstrd wR0, [r1, #MMX_WR0]
> - wstrd wR1, [r1, #MMX_WR1]
> - wstrd wR2, [r1, #MMX_WR2]
> - wstrd wR3, [r1, #MMX_WR3]
> - wstrd wR4, [r1, #MMX_WR4]
> - wstrd wR5, [r1, #MMX_WR5]
> - wstrd wR6, [r1, #MMX_WR6]
> - wstrd wR7, [r1, #MMX_WR7]
> - wstrd wR8, [r1, #MMX_WR8]
> - wstrd wR9, [r1, #MMX_WR9]
> - wstrd wR10, [r1, #MMX_WR10]
> - wstrd wR11, [r1, #MMX_WR11]
> - wstrd wR12, [r1, #MMX_WR12]
> - wstrd wR13, [r1, #MMX_WR13]
> - wstrd wR14, [r1, #MMX_WR14]
> - wstrd wR15, [r1, #MMX_WR15]
> + wstrd wR0, r1, MMX_WR0
> + wstrd wR1, r1, MMX_WR1
> + wstrd wR2, r1, MMX_WR2
> + wstrd wR3, r1, MMX_WR3
> + wstrd wR4, r1, MMX_WR4
> + wstrd wR5, r1, MMX_WR5
> + wstrd wR6, r1, MMX_WR6
> + wstrd wR7, r1, MMX_WR7
> + wstrd wR8, r1, MMX_WR8
> + wstrd wR9, r1, MMX_WR9
> + wstrd wR10, r1, MMX_WR10
> + wstrd wR11, r1, MMX_WR11
> + wstrd wR12, r1, MMX_WR12
> + wstrd wR13, r1, MMX_WR13
> + wstrd wR14, r1, MMX_WR14
> + wstrd wR15, r1, MMX_WR15
>
> 2: teq r0, #0 @ anything to load?
> reteq lr @ if not, return
> @@ -147,30 +148,30 @@ concan_dump:
> concan_load:
>
> @ Load wRn
> - wldrd wR0, [r0, #MMX_WR0]
> - wldrd wR1, [r0, #MMX_WR1]
> - wldrd wR2, [r0, #MMX_WR2]
> - wldrd wR3, [r0, #MMX_WR3]
> - wldrd wR4, [r0, #MMX_WR4]
> - wldrd wR5, [r0, #MMX_WR5]
> - wldrd wR6, [r0, #MMX_WR6]
> - wldrd wR7, [r0, #MMX_WR7]
> - wldrd wR8, [r0, #MMX_WR8]
> - wldrd wR9, [r0, #MMX_WR9]
> - wldrd wR10, [r0, #MMX_WR10]
> - wldrd wR11, [r0, #MMX_WR11]
> - wldrd wR12, [r0, #MMX_WR12]
> - wldrd wR13, [r0, #MMX_WR13]
> - wldrd wR14, [r0, #MMX_WR14]
> - wldrd wR15, [r0, #MMX_WR15]
> + wldrd wR0, r0, MMX_WR0
> + wldrd wR1, r0, MMX_WR1
> + wldrd wR2, r0, MMX_WR2
> + wldrd wR3, r0, MMX_WR3
> + wldrd wR4, r0, MMX_WR4
> + wldrd wR5, r0, MMX_WR5
> + wldrd wR6, r0, MMX_WR6
> + wldrd wR7, r0, MMX_WR7
> + wldrd wR8, r0, MMX_WR8
> + wldrd wR9, r0, MMX_WR9
> + wldrd wR10, r0, MMX_WR10
> + wldrd wR11, r0, MMX_WR11
> + wldrd wR12, r0, MMX_WR12
> + wldrd wR13, r0, MMX_WR13
> + wldrd wR14, r0, MMX_WR14
> + wldrd wR15, r0, MMX_WR15
>
> @ Load wCx
> - wldrw wCSSF, [r0, #MMX_WCSSF]
> - wldrw wCASF, [r0, #MMX_WCASF]
> - wldrw wCGR0, [r0, #MMX_WCGR0]
> - wldrw wCGR1, [r0, #MMX_WCGR1]
> - wldrw wCGR2, [r0, #MMX_WCGR2]
> - wldrw wCGR3, [r0, #MMX_WCGR3]
> + wldrw wCSSF, r0, MMX_WCSSF
> + wldrw wCASF, r0, MMX_WCASF
> + wldrw wCGR0, r0, MMX_WCGR0
> + wldrw wCGR1, r0, MMX_WCGR1
> + wldrw wCGR2, r0, MMX_WCGR2
> + wldrw wCGR3, r0, MMX_WCGR3
>
> @ clear CUP/MUP (only if r1 != 0)
> teq r1, #0
> diff --git a/arch/arm/kernel/iwmmxt.h b/arch/arm/kernel/iwmmxt.h
> new file mode 100644
> index 000000000000..fb627286f5bb
> --- /dev/null
> +++ b/arch/arm/kernel/iwmmxt.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __IWMMXT_H__
> +#define __IWMMXT_H__
> +
> +.irp b, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
> +.set .LwR\b, \b
> +.set .Lr\b, \b
> +.endr
> +
> +.set .LwCSSF, 0x2
> +.set .LwCASF, 0x3
> +.set .LwCGR0, 0x8
> +.set .LwCGR1, 0x9
> +.set .LwCGR2, 0xa
> +.set .LwCGR3, 0xb
> +
> +.macro wldrd, reg:req, base:req, offset:req
> +.inst 0xedd00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
> +.endm
> +
> +.macro wldrw, reg:req, base:req, offset:req
> +.inst 0xfd900100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
> +.endm
> +
> +.macro wstrd, reg:req, base:req, offset:req
> +.inst 0xedc00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
> +.endm
> +
> +.macro wstrw, reg:req, base:req, offset:req
> +.inst 0xfd800100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
> +.endm
> +
> +#ifdef __clang__
> +
> +#define wCon c1
> +
> +.macro tmrc, dest:req, control:req
> +mrc p1, 0, \dest, \control, c0, 0
> +.endm
> +
> +.macro tmcr, control:req, src:req
> +mcr p1, 0, \src, \control, c0, 0
> +.endm
> +#endif
> +
> +#endif
> --
> 2.29.1.341.ge80a0c044ae-goog
>

2020-11-12 13:11:02

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v2] Make iwmmxt.S support Clang's integrated assembler

On Thu, Nov 12, 2020 at 9:42 AM Ard Biesheuvel <[email protected]> wrote:

> Although I must admit I am still on the fence when it comes to the
> policy around rewriting perfectly valid code like this to accommodate
> a toolchain that nobody is likely to use to build the code in
> question.
>
> Perhaps we should at least add some rationale to the commit log why
> Clang's integrated assembler is something we should care about?

This is one of the last bits of kernel code that could be compiled but
not assembled with clang. Once all of it works with IAS, we no longer
need to special-case 32-bit Arm in Kbuild, or turn off CONFIG_IWMMXT
when build-testing.

Arnd

2020-11-12 20:09:30

by Jian Cai

[permalink] [raw]
Subject: [PATCH v3] Make iwmmxt.S support Clang's integrated assembler

This patch replaces 6 IWMMXT instructions Clang's integrated assembler
does not support in iwmmxt.S using macros, while making sure GNU
assembler still emit the same instructions. This should be easier than
providing full IWMMXT support in Clang. This is one of the last bits of
kernel code that could be compiled but not assembled with clang. Once
all of it works with IAS, we no longer need to special-case 32-bit Arm
in Kbuild, or turn off CONFIG_IWMMXT when build-testing.

"Intel Wireless MMX Technology - Developer Guide - August, 2002" should
be referenced for the encoding schemes of these extensions.

Link: https://github.com/ClangBuiltLinux/linux/issues/975

Suggested-by: Nick Desaulniers <[email protected]>
Suggested-by: Ard Biesheuvel <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Jian Cai <[email protected]>
---

Thanks Ard for the writeup.

Changes v2 -> v3:
Add rationale in the commit log to further highlight the necessity of
this change based on [email protected]'s suggestion.

arch/arm/kernel/iwmmxt.S | 89 ++++++++++++++++++++--------------------
1 file changed, 45 insertions(+), 44 deletions(-)

diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index 0dcae787b004..d2b4ac06e4ed 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -16,6 +16,7 @@
#include <asm/thread_info.h>
#include <asm/asm-offsets.h>
#include <asm/assembler.h>
+#include "iwmmxt.h"

#if defined(CONFIG_CPU_PJ4) || defined(CONFIG_CPU_PJ4B)
#define PJ4(code...) code
@@ -113,33 +114,33 @@ concan_save:

concan_dump:

- wstrw wCSSF, [r1, #MMX_WCSSF]
- wstrw wCASF, [r1, #MMX_WCASF]
- wstrw wCGR0, [r1, #MMX_WCGR0]
- wstrw wCGR1, [r1, #MMX_WCGR1]
- wstrw wCGR2, [r1, #MMX_WCGR2]
- wstrw wCGR3, [r1, #MMX_WCGR3]
+ wstrw wCSSF, r1, MMX_WCSSF
+ wstrw wCASF, r1, MMX_WCASF
+ wstrw wCGR0, r1, MMX_WCGR0
+ wstrw wCGR1, r1, MMX_WCGR1
+ wstrw wCGR2, r1, MMX_WCGR2
+ wstrw wCGR3, r1, MMX_WCGR3

1: @ MUP? wRn
tst r2, #0x2
beq 2f

- wstrd wR0, [r1, #MMX_WR0]
- wstrd wR1, [r1, #MMX_WR1]
- wstrd wR2, [r1, #MMX_WR2]
- wstrd wR3, [r1, #MMX_WR3]
- wstrd wR4, [r1, #MMX_WR4]
- wstrd wR5, [r1, #MMX_WR5]
- wstrd wR6, [r1, #MMX_WR6]
- wstrd wR7, [r1, #MMX_WR7]
- wstrd wR8, [r1, #MMX_WR8]
- wstrd wR9, [r1, #MMX_WR9]
- wstrd wR10, [r1, #MMX_WR10]
- wstrd wR11, [r1, #MMX_WR11]
- wstrd wR12, [r1, #MMX_WR12]
- wstrd wR13, [r1, #MMX_WR13]
- wstrd wR14, [r1, #MMX_WR14]
- wstrd wR15, [r1, #MMX_WR15]
+ wstrd wR0, r1, MMX_WR0
+ wstrd wR1, r1, MMX_WR1
+ wstrd wR2, r1, MMX_WR2
+ wstrd wR3, r1, MMX_WR3
+ wstrd wR4, r1, MMX_WR4
+ wstrd wR5, r1, MMX_WR5
+ wstrd wR6, r1, MMX_WR6
+ wstrd wR7, r1, MMX_WR7
+ wstrd wR8, r1, MMX_WR8
+ wstrd wR9, r1, MMX_WR9
+ wstrd wR10, r1, MMX_WR10
+ wstrd wR11, r1, MMX_WR11
+ wstrd wR12, r1, MMX_WR12
+ wstrd wR13, r1, MMX_WR13
+ wstrd wR14, r1, MMX_WR14
+ wstrd wR15, r1, MMX_WR15

2: teq r0, #0 @ anything to load?
reteq lr @ if not, return
@@ -147,30 +148,30 @@ concan_dump:
concan_load:

@ Load wRn
- wldrd wR0, [r0, #MMX_WR0]
- wldrd wR1, [r0, #MMX_WR1]
- wldrd wR2, [r0, #MMX_WR2]
- wldrd wR3, [r0, #MMX_WR3]
- wldrd wR4, [r0, #MMX_WR4]
- wldrd wR5, [r0, #MMX_WR5]
- wldrd wR6, [r0, #MMX_WR6]
- wldrd wR7, [r0, #MMX_WR7]
- wldrd wR8, [r0, #MMX_WR8]
- wldrd wR9, [r0, #MMX_WR9]
- wldrd wR10, [r0, #MMX_WR10]
- wldrd wR11, [r0, #MMX_WR11]
- wldrd wR12, [r0, #MMX_WR12]
- wldrd wR13, [r0, #MMX_WR13]
- wldrd wR14, [r0, #MMX_WR14]
- wldrd wR15, [r0, #MMX_WR15]
+ wldrd wR0, r0, MMX_WR0
+ wldrd wR1, r0, MMX_WR1
+ wldrd wR2, r0, MMX_WR2
+ wldrd wR3, r0, MMX_WR3
+ wldrd wR4, r0, MMX_WR4
+ wldrd wR5, r0, MMX_WR5
+ wldrd wR6, r0, MMX_WR6
+ wldrd wR7, r0, MMX_WR7
+ wldrd wR8, r0, MMX_WR8
+ wldrd wR9, r0, MMX_WR9
+ wldrd wR10, r0, MMX_WR10
+ wldrd wR11, r0, MMX_WR11
+ wldrd wR12, r0, MMX_WR12
+ wldrd wR13, r0, MMX_WR13
+ wldrd wR14, r0, MMX_WR14
+ wldrd wR15, r0, MMX_WR15

@ Load wCx
- wldrw wCSSF, [r0, #MMX_WCSSF]
- wldrw wCASF, [r0, #MMX_WCASF]
- wldrw wCGR0, [r0, #MMX_WCGR0]
- wldrw wCGR1, [r0, #MMX_WCGR1]
- wldrw wCGR2, [r0, #MMX_WCGR2]
- wldrw wCGR3, [r0, #MMX_WCGR3]
+ wldrw wCSSF, r0, MMX_WCSSF
+ wldrw wCASF, r0, MMX_WCASF
+ wldrw wCGR0, r0, MMX_WCGR0
+ wldrw wCGR1, r0, MMX_WCGR1
+ wldrw wCGR2, r0, MMX_WCGR2
+ wldrw wCGR3, r0, MMX_WCGR3

@ clear CUP/MUP (only if r1 != 0)
teq r1, #0
--
2.29.2.299.gdc1121823c-goog

2020-11-17 02:41:51

by Jian Cai

[permalink] [raw]
Subject: [PATCH v4] Make iwmmxt.S support Clang's integrated assembler

This patch replaces 6 IWMMXT instructions Clang's integrated assembler
does not support in iwmmxt.S using macros, while making sure GNU
assembler still emit the same instructions. This should be easier than
providing full IWMMXT support in Clang. This is one of the last bits of
kernel code that could be compiled but not assembled with clang. Once
all of it works with IAS, we no longer need to special-case 32-bit Arm
in Kbuild, or turn off CONFIG_IWMMXT when build-testing.

"Intel Wireless MMX Technology - Developer Guide - August, 2002" should
be referenced for the encoding schemes of these extensions.

Link: https://github.com/ClangBuiltLinux/linux/issues/975

Suggested-by: Nick Desaulniers <[email protected]>
Suggested-by: Ard Biesheuvel <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Jian Cai <[email protected]>
---

Changes v3 -> v4:
Forgot to include iwmmxt.h in v3.

arch/arm/kernel/iwmmxt.S | 89 ++++++++++++++++++++--------------------
arch/arm/kernel/iwmmxt.h | 47 +++++++++++++++++++++
2 files changed, 92 insertions(+), 44 deletions(-)
create mode 100644 arch/arm/kernel/iwmmxt.h

diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index 0dcae787b004..d2b4ac06e4ed 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -16,6 +16,7 @@
#include <asm/thread_info.h>
#include <asm/asm-offsets.h>
#include <asm/assembler.h>
+#include "iwmmxt.h"

#if defined(CONFIG_CPU_PJ4) || defined(CONFIG_CPU_PJ4B)
#define PJ4(code...) code
@@ -113,33 +114,33 @@ concan_save:

concan_dump:

- wstrw wCSSF, [r1, #MMX_WCSSF]
- wstrw wCASF, [r1, #MMX_WCASF]
- wstrw wCGR0, [r1, #MMX_WCGR0]
- wstrw wCGR1, [r1, #MMX_WCGR1]
- wstrw wCGR2, [r1, #MMX_WCGR2]
- wstrw wCGR3, [r1, #MMX_WCGR3]
+ wstrw wCSSF, r1, MMX_WCSSF
+ wstrw wCASF, r1, MMX_WCASF
+ wstrw wCGR0, r1, MMX_WCGR0
+ wstrw wCGR1, r1, MMX_WCGR1
+ wstrw wCGR2, r1, MMX_WCGR2
+ wstrw wCGR3, r1, MMX_WCGR3

1: @ MUP? wRn
tst r2, #0x2
beq 2f

- wstrd wR0, [r1, #MMX_WR0]
- wstrd wR1, [r1, #MMX_WR1]
- wstrd wR2, [r1, #MMX_WR2]
- wstrd wR3, [r1, #MMX_WR3]
- wstrd wR4, [r1, #MMX_WR4]
- wstrd wR5, [r1, #MMX_WR5]
- wstrd wR6, [r1, #MMX_WR6]
- wstrd wR7, [r1, #MMX_WR7]
- wstrd wR8, [r1, #MMX_WR8]
- wstrd wR9, [r1, #MMX_WR9]
- wstrd wR10, [r1, #MMX_WR10]
- wstrd wR11, [r1, #MMX_WR11]
- wstrd wR12, [r1, #MMX_WR12]
- wstrd wR13, [r1, #MMX_WR13]
- wstrd wR14, [r1, #MMX_WR14]
- wstrd wR15, [r1, #MMX_WR15]
+ wstrd wR0, r1, MMX_WR0
+ wstrd wR1, r1, MMX_WR1
+ wstrd wR2, r1, MMX_WR2
+ wstrd wR3, r1, MMX_WR3
+ wstrd wR4, r1, MMX_WR4
+ wstrd wR5, r1, MMX_WR5
+ wstrd wR6, r1, MMX_WR6
+ wstrd wR7, r1, MMX_WR7
+ wstrd wR8, r1, MMX_WR8
+ wstrd wR9, r1, MMX_WR9
+ wstrd wR10, r1, MMX_WR10
+ wstrd wR11, r1, MMX_WR11
+ wstrd wR12, r1, MMX_WR12
+ wstrd wR13, r1, MMX_WR13
+ wstrd wR14, r1, MMX_WR14
+ wstrd wR15, r1, MMX_WR15

2: teq r0, #0 @ anything to load?
reteq lr @ if not, return
@@ -147,30 +148,30 @@ concan_dump:
concan_load:

@ Load wRn
- wldrd wR0, [r0, #MMX_WR0]
- wldrd wR1, [r0, #MMX_WR1]
- wldrd wR2, [r0, #MMX_WR2]
- wldrd wR3, [r0, #MMX_WR3]
- wldrd wR4, [r0, #MMX_WR4]
- wldrd wR5, [r0, #MMX_WR5]
- wldrd wR6, [r0, #MMX_WR6]
- wldrd wR7, [r0, #MMX_WR7]
- wldrd wR8, [r0, #MMX_WR8]
- wldrd wR9, [r0, #MMX_WR9]
- wldrd wR10, [r0, #MMX_WR10]
- wldrd wR11, [r0, #MMX_WR11]
- wldrd wR12, [r0, #MMX_WR12]
- wldrd wR13, [r0, #MMX_WR13]
- wldrd wR14, [r0, #MMX_WR14]
- wldrd wR15, [r0, #MMX_WR15]
+ wldrd wR0, r0, MMX_WR0
+ wldrd wR1, r0, MMX_WR1
+ wldrd wR2, r0, MMX_WR2
+ wldrd wR3, r0, MMX_WR3
+ wldrd wR4, r0, MMX_WR4
+ wldrd wR5, r0, MMX_WR5
+ wldrd wR6, r0, MMX_WR6
+ wldrd wR7, r0, MMX_WR7
+ wldrd wR8, r0, MMX_WR8
+ wldrd wR9, r0, MMX_WR9
+ wldrd wR10, r0, MMX_WR10
+ wldrd wR11, r0, MMX_WR11
+ wldrd wR12, r0, MMX_WR12
+ wldrd wR13, r0, MMX_WR13
+ wldrd wR14, r0, MMX_WR14
+ wldrd wR15, r0, MMX_WR15

@ Load wCx
- wldrw wCSSF, [r0, #MMX_WCSSF]
- wldrw wCASF, [r0, #MMX_WCASF]
- wldrw wCGR0, [r0, #MMX_WCGR0]
- wldrw wCGR1, [r0, #MMX_WCGR1]
- wldrw wCGR2, [r0, #MMX_WCGR2]
- wldrw wCGR3, [r0, #MMX_WCGR3]
+ wldrw wCSSF, r0, MMX_WCSSF
+ wldrw wCASF, r0, MMX_WCASF
+ wldrw wCGR0, r0, MMX_WCGR0
+ wldrw wCGR1, r0, MMX_WCGR1
+ wldrw wCGR2, r0, MMX_WCGR2
+ wldrw wCGR3, r0, MMX_WCGR3

@ clear CUP/MUP (only if r1 != 0)
teq r1, #0
diff --git a/arch/arm/kernel/iwmmxt.h b/arch/arm/kernel/iwmmxt.h
new file mode 100644
index 000000000000..fb627286f5bb
--- /dev/null
+++ b/arch/arm/kernel/iwmmxt.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __IWMMXT_H__
+#define __IWMMXT_H__
+
+.irp b, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
+.set .LwR\b, \b
+.set .Lr\b, \b
+.endr
+
+.set .LwCSSF, 0x2
+.set .LwCASF, 0x3
+.set .LwCGR0, 0x8
+.set .LwCGR1, 0x9
+.set .LwCGR2, 0xa
+.set .LwCGR3, 0xb
+
+.macro wldrd, reg:req, base:req, offset:req
+.inst 0xedd00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
+.endm
+
+.macro wldrw, reg:req, base:req, offset:req
+.inst 0xfd900100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
+.endm
+
+.macro wstrd, reg:req, base:req, offset:req
+.inst 0xedc00100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
+.endm
+
+.macro wstrw, reg:req, base:req, offset:req
+.inst 0xfd800100 | (.L\reg << 12) | (.L\base << 16) | (\offset >> 2)
+.endm
+
+#ifdef __clang__
+
+#define wCon c1
+
+.macro tmrc, dest:req, control:req
+mrc p1, 0, \dest, \control, c0, 0
+.endm
+
+.macro tmcr, control:req, src:req
+mcr p1, 0, \src, \control, c0, 0
+.endm
+#endif
+
+#endif
--
2.29.2.299.gdc1121823c-goog