2020-02-24 23:58:31

by Stefan Agner

Subject: [PATCH] ARM: use assembly mnemonics for VFP register access

Clang's integrated assembler does not allow using the mcr
instruction to access floating point co-processor registers:
arch/arm/vfp/vfpmodule.c:342:2: error: invalid operand for instruction
fmxr(FPEXC, fpexc & ~(FPEXC_EX|FPEXC_DEX|FPEXC_FP2V|FPEXC_VV|FPEXC_TRAP_MASK));
^
arch/arm/vfp/vfpinstr.h:79:6: note: expanded from macro 'fmxr'
asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr " #_vfp_ ", %0" \
^
<inline asm>:1:6: note: instantiated into assembly here
mcr p10, 7, r0, cr8, cr0, 0 @ fmxr FPEXC, r0
^

The GNU assembler has supported the .fpu directive since at least 2.17
(when its documentation was added). Since Linux requires binutils 2.21, it
is safe to use the .fpu directive. Use the .fpu directive and mnemonics for
VFP register access.

This allows building vfpmodule.c with Clang and its integrated assembler.

Link: https://github.com/ClangBuiltLinux/linux/issues/905
Signed-off-by: Stefan Agner <[email protected]>
---
arch/arm/vfp/vfpinstr.h | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
index 38dc154e39ff..799ccf065406 100644
--- a/arch/arm/vfp/vfpinstr.h
+++ b/arch/arm/vfp/vfpinstr.h
@@ -62,21 +62,17 @@
#define FPSCR_C (1 << 29)
#define FPSCR_V (1 << 28)

-/*
- * Since we aren't building with -mfpu=vfp, we need to code
- * these instructions using their MRC/MCR equivalents.
- */
-#define vfpreg(_vfp_) #_vfp_
-
#define fmrx(_vfp_) ({ \
u32 __v; \
- asm("mrc p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmrx %0, " #_vfp_ \
+ asm(".fpu vfpv2\n" \
+ "vmrs %0, " #_vfp_ \
: "=r" (__v) : : "cc"); \
__v; \
})

#define fmxr(_vfp_,_var_) \
- asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr " #_vfp_ ", %0" \
+ asm(".fpu vfpv2\n" \
+ "vmsr " #_vfp_ ", %0" \
: : "r" (_var_) : "cc")

u32 vfp_single_cpdo(u32 inst, u32 fpscr);
--
2.25.1


2020-02-25 19:34:26

by Nick Desaulniers

Subject: Re: [PATCH] ARM: use assembly mnemonics for VFP register access

On Mon, Feb 24, 2020 at 9:22 PM Stefan Agner <[email protected]> wrote:
>
> Clang's integrated assembler does not allow using the mcr
> instruction to access floating point co-processor registers:
> arch/arm/vfp/vfpmodule.c:342:2: error: invalid operand for instruction
> fmxr(FPEXC, fpexc & ~(FPEXC_EX|FPEXC_DEX|FPEXC_FP2V|FPEXC_VV|FPEXC_TRAP_MASK));
> ^
> arch/arm/vfp/vfpinstr.h:79:6: note: expanded from macro 'fmxr'
> asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr " #_vfp_ ", %0" \
> ^
> <inline asm>:1:6: note: instantiated into assembly here
> mcr p10, 7, r0, cr8, cr0, 0 @ fmxr FPEXC, r0
> ^
>
> The GNU assembler has supported the .fpu directive since at least 2.17
> (when its documentation was added). Since Linux requires binutils 2.21, it
> is safe to use the .fpu directive. Use the .fpu directive and mnemonics for
> VFP register access.
>
> This allows building vfpmodule.c with Clang and its integrated assembler.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/905
> Signed-off-by: Stefan Agner <[email protected]>
> ---
> arch/arm/vfp/vfpinstr.h | 12 ++++--------
> 1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
> index 38dc154e39ff..799ccf065406 100644
> --- a/arch/arm/vfp/vfpinstr.h
> +++ b/arch/arm/vfp/vfpinstr.h
> @@ -62,21 +62,17 @@
> #define FPSCR_C (1 << 29)
> #define FPSCR_V (1 << 28)
>
> -/*
> - * Since we aren't building with -mfpu=vfp, we need to code
> - * these instructions using their MRC/MCR equivalents.
> - */
> -#define vfpreg(_vfp_) #_vfp_
> -
> #define fmrx(_vfp_) ({ \
> u32 __v; \
> - asm("mrc p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmrx %0, " #_vfp_ \
> + asm(".fpu vfpv2\n" \
> + "vmrs %0, " #_vfp_ \
> : "=r" (__v) : : "cc"); \
> __v; \
> })
>
> #define fmxr(_vfp_,_var_) \
> - asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr " #_vfp_ ", %0" \
> + asm(".fpu vfpv2\n" \
> + "vmsr " #_vfp_ ", %0" \
> : : "r" (_var_) : "cc")
>
> u32 vfp_single_cpdo(u32 inst, u32 fpscr);
> --

Hi Stefan,
Thanks for the patch. Reading through:
- FMRX, FMXR, and FMSTAT:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Bcfbdihi.html
- VMRS and VMSR:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfbdihi.html

Should a macro called `fmrx` that had a comment about `fmrx` be using
`vmrs` in place of `fmrx`?

It looks like Clang treats them the same, but GCC keeps them separate:
https://godbolt.org/z/YKmSAs
Ah, this is only when streaming to assembly. Looks like they have the
same encoding, and produce the same disassembly. (Godbolt emits
assembly by default, and has the option to compile, then disassemble).
If I take my case from godbolt above:

➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
➜ /tmp llvm-objdump -dr x.o

x.o: file format elf32-arm-little


Disassembly of section .text:

00000000 bar:
0: f1 ee 10 0a vmrs r0, fpscr
4: 70 47 bx lr
6: 00 bf nop

00000008 baz:
8: f1 ee 10 0a vmrs r0, fpscr
c: 70 47 bx lr
e: 00 bf nop

So indeed the two different assembler mnemonics produce the same
encoding.
Reviewed-by: Nick Desaulniers <[email protected]>


--
Thanks,
~Nick Desaulniers

2020-02-25 19:39:22

by Ard Biesheuvel

Subject: Re: [PATCH] ARM: use assembly mnemonics for VFP register access

On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <[email protected]> wrote:
>
> Hi Stefan,
> Thanks for the patch. Reading through:
> - FMRX, FMXR, and FMSTAT:
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Bcfbdihi.html
> - VMRS and VMSR:
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfbdihi.html
>
> Should a macro called `fmrx` that had a comment about `fmrx` be using
> `vmrs` in place of `fmrx`?
>
> It looks like Clang treats them the same, but GCC keeps them separate:
> https://godbolt.org/z/YKmSAs
> Ah, this is only when streaming to assembly. Looks like they have the
> same encoding, and produce the same disassembly. (Godbolt emits
> assembly by default, and has the option to compile, then disassemble).
> If I take my case from godbolt above:
>
> ➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
> ➜ /tmp llvm-objdump -dr x.o
>
> x.o: file format elf32-arm-little
>
>
> Disassembly of section .text:
>
> 00000000 bar:
> 0: f1 ee 10 0a vmrs r0, fpscr
> 4: 70 47 bx lr
> 6: 00 bf nop
>
> 00000008 baz:
> 8: f1 ee 10 0a vmrs r0, fpscr
> c: 70 47 bx lr
> e: 00 bf nop
>
> So indeed a similar encoding exists for the two different assembler
> instructions.

Does that hold for ARM (A32) instructions as well?

2020-02-25 19:46:33

by Robin Murphy

Subject: Re: [PATCH] ARM: use assembly mnemonics for VFP register access

On 2020-02-25 7:33 pm, Ard Biesheuvel wrote:
> On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <[email protected]> wrote:
>>
>> Hi Stefan,
>> Thanks for the patch. Reading through:
>> - FMRX, FMXR, and FMSTAT:
>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Bcfbdihi.html
>> - VMRS and VMSR:
>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfbdihi.html
>>
>> Should a macro called `fmrx` that had a comment about `fmrx` be using
>> `vmrs` in place of `fmrx`?
>>
>> It looks like Clang treats them the same, but GCC keeps them separate:
>> https://godbolt.org/z/YKmSAs
>> Ah, this is only when streaming to assembly. Looks like they have the
>> same encoding, and produce the same disassembly. (Godbolt emits
>> assembly by default, and has the option to compile, then disassemble).
>> If I take my case from godbolt above:
>>
>> ➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
>> ➜ /tmp llvm-objdump -dr x.o
>>
>> x.o: file format elf32-arm-little
>>
>>
>> Disassembly of section .text:
>>
>> 00000000 bar:
>> 0: f1 ee 10 0a vmrs r0, fpscr
>> 4: 70 47 bx lr
>> 6: 00 bf nop
>>
>> 00000008 baz:
>> 8: f1 ee 10 0a vmrs r0, fpscr
>> c: 70 47 bx lr
>> e: 00 bf nop
>>
>> So indeed a similar encoding exists for the two different assembler
>> instructions.
>
> Does that hold for ARM (A32) instructions as well?

It should do - they're all the same thing underneath. The UAL syntax
just renamed all the legacy VFP mnemonics from Fxxx to Vxxx form, apart
from a couple of things that were already deprecated. GAS still accepts
both regardless of ".syntax unified", and as a result GCC never saw a
reason to stop emitting the old mnemonics.

Robin.

2020-02-25 20:01:27

by Stefan Agner

Subject: Re: [PATCH] ARM: use assembly mnemonics for VFP register access

On 2020-02-25 20:45, Robin Murphy wrote:
> On 2020-02-25 7:33 pm, Ard Biesheuvel wrote:
>> On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <[email protected]> wrote:
>>>
>>> Hi Stefan,
>>> Thanks for the patch. Reading through:
>>> - FMRX, FMXR, and FMSTAT:
>>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Bcfbdihi.html
>>> - VMRS and VMSR:
>>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfbdihi.html
>>>
>>> Should a macro called `fmrx` that had a comment about `fmrx` be using
>>> `vmrs` in place of `fmrx`?
>>>
>>> It looks like Clang treats them the same, but GCC keeps them separate:
>>> https://godbolt.org/z/YKmSAs
>>> Ah, this is only when streaming to assembly. Looks like they have the
>>> same encoding, and produce the same disassembly. (Godbolt emits
>>> assembly by default, and has the option to compile, then disassemble).
>>> If I take my case from godbolt above:
>>>
>>> ➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
>>> ➜ /tmp llvm-objdump -dr x.o
>>>
>>> x.o: file format elf32-arm-little
>>>
>>>
>>> Disassembly of section .text:
>>>
>>> 00000000 bar:
>>> 0: f1 ee 10 0a vmrs r0, fpscr
>>> 4: 70 47 bx lr
>>> 6: 00 bf nop
>>>
>>> 00000008 baz:
>>> 8: f1 ee 10 0a vmrs r0, fpscr
>>> c: 70 47 bx lr
>>> e: 00 bf nop
>>>
>>> So indeed a similar encoding exists for the two different assembler
>>> instructions.
>>
>> Does that hold for ARM (A32) instructions as well?
>
> It should do - they're all the same thing underneath. The UAL syntax
> just renamed all the legacy VFP mnemonics from Fxxx to Vxxx form,
> apart from a couple of things that were already deprecated. GAS still
> accepts both regardless of ".syntax unified", and as a result GCC
> never saw a reason to stop emitting the old mnemonics.
>

Yes, this is really only a mnemonic change that came with the introduction
of the unified assembler language (UAL). The ARM ARM has a list of the
mnemonic changes in its appendix.

Just to make sure, I also compared the disassembled object file of
vfpmodule.c before and after this change.

I guess we could (should?) also change the macro name, but that
should be a separate commit anyway.

--
Stefan

2020-02-25 20:28:11

by Nick Desaulniers

Subject: Re: [PATCH] ARM: use assembly mnemonics for VFP register access

On Tue, Feb 25, 2020 at 11:33 AM Ard Biesheuvel
<[email protected]> wrote:
>
> On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <[email protected]> wrote:
> > Ah, this is only when streaming to assembly. Looks like they have the
> > same encoding, and produce the same disassembly. (Godbolt emits
> > assembly by default, and has the option to compile, then disassemble).
> > If I take my case from godbolt above:
> >
> > ➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
> > ➜ /tmp llvm-objdump -dr x.o
> >
> > x.o: file format elf32-arm-little
> >
> >
> > Disassembly of section .text:
> >
> > 00000000 bar:
> > 0: f1 ee 10 0a vmrs r0, fpscr
> > 4: 70 47 bx lr
> > 6: 00 bf nop
> >
> > 00000008 baz:
> > 8: f1 ee 10 0a vmrs r0, fpscr
> > c: 70 47 bx lr
> > e: 00 bf nop
> >
> > So indeed a similar encoding exists for the two different assembler
> > instructions.
>
> Does that hold for ARM (A32) instructions as well?

TIL -mthumb is the default for arm-linux-gnueabihf-gcc -O2.

➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c -marm
➜ /tmp llvm-objdump -dr x.o

x.o: file format elf32-arm-little


Disassembly of section .text:

00000000 bar:
0: 10 0a f1 ee vmrs r0, fpscr
4: 1e ff 2f e1 bx lr

00000008 baz:
8: 10 0a f1 ee vmrs r0, fpscr
c: 1e ff 2f e1 bx lr

^ Just to show the matching encoding.
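
The same bytes can be reproduced by hand. The following is a small sketch, assuming the coprocessor register number 1 for FPSCR and the A32 field layout for MRC and VMRS; the function names are hypothetical. It packs `mrc p10, 7, r0, cr1, cr0, 0` and `vmrs r0, fpscr` into 32-bit words and lays them out the way the two disassemblies above print them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed VFP system register numbers (cr8 for FPEXC matches the patch's error output). */
enum { VFP_FPSCR = 1, VFP_FPEXC = 8 };

/* A32 encoding of "mrc p<coproc>, <opc1>, r<rt>, c<crn>, c<crm>, <opc2>", condition AL. */
uint32_t encode_mrc(unsigned coproc, unsigned opc1, unsigned rt,
		    unsigned crn, unsigned crm, unsigned opc2)
{
	return (0xEu << 28) | (0xEu << 24) | ((opc1 & 7u) << 21) | (1u << 20) |
	       ((crn & 0xFu) << 16) | ((rt & 0xFu) << 12) |
	       ((coproc & 0xFu) << 8) | ((opc2 & 7u) << 5) | (1u << 4) |
	       (crm & 0xFu);
}

/* A32 encoding of "vmrs r<rt>, <reg>", condition AL. */
uint32_t encode_vmrs(unsigned rt, unsigned reg)
{
	return 0xEEF00A10u | ((reg & 0xFu) << 16) | ((rt & 0xFu) << 12);
}

/* A32 code is stored as one little-endian 32-bit word. */
void store_a32(uint32_t insn, uint8_t out[4])
{
	out[0] = insn & 0xFF;
	out[1] = (insn >> 8) & 0xFF;
	out[2] = (insn >> 16) & 0xFF;
	out[3] = (insn >> 24) & 0xFF;
}

/* 32-bit Thumb code is stored as two little-endian halfwords, high half first. */
void store_t32(uint32_t insn, uint8_t out[4])
{
	out[0] = (insn >> 16) & 0xFF;
	out[1] = (insn >> 24) & 0xFF;
	out[2] = insn & 0xFF;
	out[3] = (insn >> 8) & 0xFF;
}

/* The generic coprocessor form and the VFP mnemonic give the same word. */
void check_vmrs_matches_mrc(void)
{
	assert(encode_mrc(10, 7, 0, VFP_FPSCR, 0, 0) == encode_vmrs(0, VFP_FPSCR));
}
```

Both encoders yield 0xeef10a10 for FPSCR and r0, which `store_a32` writes as `10 0a f1 ee` and `store_t32` as `f1 ee 10 0a`.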
--
Thanks,
~Nick Desaulniers

2020-02-25 22:48:41

by Nick Desaulniers

Subject: Re: [PATCH] ARM: use assembly mnemonics for VFP register access

On Tue, Feb 25, 2020 at 12:27 PM Nick Desaulniers
<[email protected]> wrote:
>
> On Tue, Feb 25, 2020 at 11:33 AM Ard Biesheuvel
> <[email protected]> wrote:
> >
> > On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <[email protected]> wrote:
> > > Ah, this is only when streaming to assembly. Looks like they have the
> > > same encoding, and produce the same disassembly. (Godbolt emits
> > > assembly by default, and has the option to compile, then disassemble).
> > > If I take my case from godbolt above:
> > >
> > > ➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
> > > ➜ /tmp llvm-objdump -dr x.o
> > >
> > > x.o: file format elf32-arm-little
> > >
> > >
> > > Disassembly of section .text:
> > >
> > > 00000000 bar:
> > > 0: f1 ee 10 0a vmrs r0, fpscr
> > > 4: 70 47 bx lr
> > > 6: 00 bf nop
> > >
> > > 00000008 baz:
> > > 8: f1 ee 10 0a vmrs r0, fpscr
> > > c: 70 47 bx lr
> > > e: 00 bf nop
> > >
> > > So indeed a similar encoding exists for the two different assembler
> > > instructions.
> >
> > Does that hold for ARM (A32) instructions as well?
>
> TIL -mthumb is the default for arm-linux-gnueabihf-gcc -O2.
>
> ➜ /tmp arm-linux-gnueabihf-gcc -O2 -c x.c -marm
> ➜ /tmp llvm-objdump -dr x.o
>
> x.o: file format elf32-arm-little
>
>
> Disassembly of section .text:
>
> 00000000 bar:
> 0: 10 0a f1 ee vmrs r0, fpscr
> 4: 1e ff 2f e1 bx lr
>
> 00000008 baz:
> 8: 10 0a f1 ee vmrs r0, fpscr
> c: 1e ff 2f e1 bx lr
>
> ^ Just to show the matching encoding.

Further, Peter just sent me this response off thread, which I thought
I'd share. Thanks Peter. Bookmarked.
```
FWIW the Arm ARM reference manual
https://static.docs.arm.com/ddi0487/ea/DDI0487E_a_armv8_arm.pdf has a
table that maps the pre-UAL syntax to the UAL syntax.

K6.1.2 Pre-UAL instruction syntax for the A32 floating-point instructions
This has an entry mapping pre-UAL (FMRX) to UAL (VMRS)

So they are the same instruction with the modern name being VMRS. If
it is possible to use the new name it will probably confuse fewer
people, but other than that it won't do any harm.
```
--
Thanks,
~Nick Desaulniers

2020-02-29 23:01:02

by Stefan Agner

Subject: Re: [PATCH] ARM: use assembly mnemonics for VFP register access

On 2020-02-21 07:34, Stefan Agner wrote:
> Clang's integrated assembler does not allow using the mcr
> instruction to access floating point co-processor registers:
> arch/arm/vfp/vfpmodule.c:342:2: error: invalid operand for instruction
> fmxr(FPEXC, fpexc &
> ~(FPEXC_EX|FPEXC_DEX|FPEXC_FP2V|FPEXC_VV|FPEXC_TRAP_MASK));
> ^
> arch/arm/vfp/vfpinstr.h:79:6: note: expanded from macro 'fmxr'
> asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr "
> #_vfp_ ", %0" \
> ^
> <inline asm>:1:6: note: instantiated into assembly here
> mcr p10, 7, r0, cr8, cr0, 0 @ fmxr FPEXC, r0
> ^
>
> The GNU assembler has supported the .fpu directive since at least 2.17
> (when its documentation was added). Since Linux requires binutils 2.21, it
> is safe to use the .fpu directive. Use the .fpu directive and mnemonics for
> VFP register access.
>
> This allows building vfpmodule.c with Clang and its integrated assembler.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/905
> Signed-off-by: Stefan Agner <[email protected]>
> ---
> arch/arm/vfp/vfpinstr.h | 12 ++++--------
> 1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
> index 38dc154e39ff..799ccf065406 100644
> --- a/arch/arm/vfp/vfpinstr.h
> +++ b/arch/arm/vfp/vfpinstr.h
> @@ -62,21 +62,17 @@
> #define FPSCR_C (1 << 29)
> #define FPSCR_V (1 << 28)
>
> -/*
> - * Since we aren't building with -mfpu=vfp, we need to code
> - * these instructions using their MRC/MCR equivalents.
> - */
> -#define vfpreg(_vfp_) #_vfp_
> -
> #define fmrx(_vfp_) ({ \
> u32 __v; \
> - asm("mrc p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmrx %0, " #_vfp_ \
> + asm(".fpu vfpv2\n" \
> + "vmrs %0, " #_vfp_ \
> : "=r" (__v) : : "cc"); \
> __v; \
> })
>
> #define fmxr(_vfp_,_var_) \
> - asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr " #_vfp_ ", %0" \
> + asm(".fpu vfpv2\n" \
> + "vmsr " #_vfp_ ", %0" \
> : : "r" (_var_) : "cc")
>
> u32 vfp_single_cpdo(u32 inst, u32 fpscr);

I just found out that this fails with binutils 2.23.1. Since we support
binutils back to 2.21, I guess that is not OK?

CC arch/arm/vfp/vfpmodule.o
/tmp/cc2Vcw98.s: Assembler messages:
/tmp/cc2Vcw98.s:920: Error: operand 1 must be a VFP extension System
Register -- `vmrs r6,FPINST'
/tmp/cc2Vcw98.s:948: Error: operand 1 must be a VFP extension System
Register -- `vmrs r6,FPINST2'

Looking into the binutils history reveals that FPINST/FPINST2 have been
accepted since commit 16d02dc907c5717b5f47076bb90ae3795e73b59f
("gas/config/tc-arm.c (do_vmrs): Accept all control registers"), which
made it into binutils 2.24...

I don't have a particularly good idea how to make this work for both Clang
and GCC other than some ifdefs...
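
For illustration only, the ifdef idea could look roughly like the sketch below. It is an assumption, not a tested kernel change: it keys on `__clang__`, which does not distinguish Clang's integrated assembler from Clang driving GNU as, and FMRX_TEMPLATE and fmrx_template_for_fpinst are hypothetical names. The `FPINST` definition as cr9 is also an assumption. The sketch only selects the asm template string, so it builds on any host.

```c
#include <assert.h>
#include <string.h>

/* Assumption: FPINST is coprocessor register cr9. */
#define FPINST cr9

#define vfpreg(_vfp_) #_vfp_

#ifdef __clang__
/* Clang: use the UAL mnemonic, which its assembler accepts. */
#define FMRX_TEMPLATE(_vfp_) \
	".fpu vfpv2\n" \
	"vmrs %0, " #_vfp_
#else
/* Other compilers: keep the MRC form that older binutils can assemble. */
#define FMRX_TEMPLATE(_vfp_) \
	"mrc p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmrx %0, " #_vfp_
#endif

/* Returns the template that would be handed to asm() for an FPINST read. */
const char *fmrx_template_for_fpinst(void)
{
	const char *tmpl = FMRX_TEMPLATE(FPINST);

	/* Either variant names the register being read. */
	assert(strstr(tmpl, "FPINST") != NULL);
	return tmpl;
}
```

A check based on what the assembler actually accepts, rather than on the compiler, would avoid penalizing GCC with binutils 2.24 or later, at the cost of extra build-system logic.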

--
Stefan