2023-09-13 21:08:18

by Willy Tarreau

Subject: Re: aarch64 binaries using nolibc segfault before reaching the entry point

On Wed, Sep 13, 2023 at 10:19:00PM +0200, Thomas Weißschuh wrote:
> > All on aarch64 running fedora37 + upstream kernel. Any hints on what could
> > be borken here or how to actually fix it?
>
> I reduced it to the following reproducer:
>
> $ cat test.c
> int foo; /* It works when deleting this variable */
>
> void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) _start(void)
> {
>     __asm__ volatile (
>         "mov x8, 93\n" /* NR_exit == 93 */
>         "svc #0\n"
>     );
>     __builtin_unreachable();
> }
>
> $ aarch64-linux-gnu-gcc -Os -static -fno-stack-protector -Wall -nostdlib test.c
> $ ./a.out
> Segmentation fault
>
> Also when running under gdb the error message is:
>
> During startup program terminated with signal SIGSEGV, Segmentation fault.
>
> So it seems the error already happens during loading.
>
> Could be a compiler or kernel bug?

I tried here with gcc-11.4.0 native on an ubuntu-22.04 and using a
cross gcc-9.5 executed on another box and couldn't reproduce the issue
at all. It could be that the compiler inserts something unexpected. Did
someone try to disassemble the resulting program to see what it looks
like? Maybe we're even dealing with issues related to random stack
alignment, where some garbage placed at the wrong place in the stack
causes trouble past a function call. Also, dmesg should generally report
what segv happened and where. Similarly, gdb with "info reg" and
"disassemble $pc" should report some info.

In my case, I just have this:

$ objdump -d a.out

a.out: file format elf64-littleaarch64


Disassembly of section .text:

0000000000400144 <_start>:
400144: d2800ba8 mov x8, #0x5d // #93
400148: d4000001 svc #0x0

The kernel is a 6.2:

$ uname -a
Linux ampere 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 20:49:15 UTC 2 aarch64 aarch64 aarch64 GNU/Linux

Cheers,
Willy


2023-09-14 01:20:52

by Thomas Weißschuh

Subject: Re: aarch64 binaries using nolibc segfault before reaching the entry point

On 2023-09-13 22:58:38+0200, Willy Tarreau wrote:
> On Wed, Sep 13, 2023 at 10:19:00PM +0200, Thomas Weißschuh wrote:
> > > All on aarch64 running fedora37 + upstream kernel. Any hints on what could
> > > be borken here or how to actually fix it?
> >
> > I reduced it to the following reproducer:
> >
> > $ cat test.c
> > int foo; /* It works when deleting this variable */
> >
> > void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) _start(void)
> > {
> >     __asm__ volatile (
> >         "mov x8, 93\n" /* NR_exit == 93 */
> >         "svc #0\n"
> >     );
> >     __builtin_unreachable();
> > }
> >
> > $ aarch64-linux-gnu-gcc -Os -static -fno-stack-protector -Wall -nostdlib test.c
> > $ ./a.out
> > Segmentation fault
> >
> > Also when running under gdb the error message is:
> >
> > During startup program terminated with signal SIGSEGV, Segmentation fault.
> >
> > So it seems the error already happens during loading.
> >
> > Could be a compiler or kernel bug?
>
> I tried here with gcc-11.4.0 native on an ubuntu-22.04 and using a
> cross gcc-9.5 executed on another box and couldn't reproduce the issue
> at all. It could be that the compiler inserts whatever, did someone
> try to disassemble the resulting program to see what it looks like?
> Maybe we're even dealing with issues related to random stack alignment
> that causes issues past a function call due to some garbage being placed
> at the wrong place in the stack. Also, dmesg should generally report
> what (and where) the segv happened. Similarly, gdb with "info reg"
> and "disassemble $pc" should report some info.

I'm using:

aarch64-linux-gnu-gcc (GCC) 13.2.0

It's reproducible reliably.

No output in dmesg, binary works in qemu-user.
There should be no function calls at all, right?
GDB also doesn't show any registers, it seems to fail before anything is
executed.

>
> In my case, I just have this:
>
> $ objdump -d a.out
>
> a.out: file format elf64-littleaarch64
>
>
> Disassembly of section .text:
>
> 0000000000400144 <_start>:
> 400144: d2800ba8 mov x8, #0x5d // #93
> 400148: d4000001 svc #0x0

Mine looks absolutely identical.

>
> The kernel is a 6.2:
>
> $ uname -a
> Linux ampere 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 20:49:15 UTC 2 aarch64 aarch64 aarch64 GNU/Linux

Linux fedora-4gb-fsn1-1 6.4.11-200.fc38.aarch64 #1 SMP PREEMPT_DYNAMIC Wed Aug 16 18:01:59 UTC 2023 aarch64 GNU/Linux

(Just a default ARM VM on Hetzner with Fedora 38)
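
Since the crash goes away when `foo` is deleted and nothing seems to execute, one thing that could be compared between a working and a failing build is the set of PT_LOAD program headers the kernel has to map. The following is only a sketch of such a check, not something posted in the thread. `dump_load_segments()` and `dump_load_segments_file()` are hypothetical helper names, and the code assumes a 64-bit ELF file read on a host with the same byte order (for example natively on the aarch64 box).

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the thread): walk the program headers of a
 * 64-bit ELF image held in memory and print every PT_LOAD segment.
 * Assumes the image uses the host's byte order.
 * Returns the number of PT_LOAD segments whose memory size exceeds their
 * file size (a zero-filled, .bss-style tail), or -1 if the buffer does not
 * look like a usable ELF64 image.
 */
static int dump_load_segments(const unsigned char *img, size_t len)
{
    Elf64_Ehdr eh;
    int bss_segments = 0;

    if (len < sizeof(eh))
        return -1;
    memcpy(&eh, img, sizeof(eh));
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr) ||
        eh.e_phoff > len)
        return -1;

    for (unsigned int i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = (size_t)eh.e_phoff + (size_t)i * sizeof(ph);

        /* Refuse program headers that run past the end of the buffer. */
        if (off > len || len - off < sizeof(ph))
            return -1;
        memcpy(&ph, img + off, sizeof(ph));
        if (ph.p_type != PT_LOAD)
            continue;

        printf("LOAD off=0x%llx vaddr=0x%llx filesz=0x%llx memsz=0x%llx align=0x%llx\n",
               (unsigned long long)ph.p_offset,
               (unsigned long long)ph.p_vaddr,
               (unsigned long long)ph.p_filesz,
               (unsigned long long)ph.p_memsz,
               (unsigned long long)ph.p_align);
        if (ph.p_memsz > ph.p_filesz)
            bss_segments++;
    }
    return bss_segments;
}

/* Hypothetical convenience wrapper: read a whole file and inspect it. */
static int dump_load_segments_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    unsigned char *img;
    long size;
    int ret = -1;

    if (!f)
        return -1;
    if (fseek(f, 0, SEEK_END) == 0 && (size = ftell(f)) > 0 &&
        fseek(f, 0, SEEK_SET) == 0 &&
        (img = malloc((size_t)size)) != NULL) {
        if (fread(img, 1, (size_t)size, f) == (size_t)size)
            ret = dump_load_segments(img, (size_t)size);
        free(img);
    }
    fclose(f);
    return ret;
}
```

Calling `dump_load_segments_file("a.out")` on both the failing binary and one built without `foo` would show whether the two differ in a segment where memsz is larger than filesz, which is what an uninitialized global such as `foo` normally produces. Whether that difference is actually related to the segfault is not established in the thread.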

2023-09-14 05:26:20

by Sebastian Ott

Subject: Re: aarch64 binaries using nolibc segfault before reaching the entry point

On Wed, 13 Sep 2023, Willy Tarreau wrote:
> On Wed, Sep 13, 2023 at 10:19:00PM +0200, Thomas Weißschuh wrote:
>>> All on aarch64 running fedora37 + upstream kernel. Any hints on what could
>>> be borken here or how to actually fix it?
>>
>> I reduced it to the following reproducer:
>>
>> $ cat test.c
>> int foo; /* It works when deleting this variable */
>>
>> void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) _start(void)
>> {
>>     __asm__ volatile (
>>         "mov x8, 93\n" /* NR_exit == 93 */
>>         "svc #0\n"
>>     );
>>     __builtin_unreachable();
>> }
>>
>> $ aarch64-linux-gnu-gcc -Os -static -fno-stack-protector -Wall -nostdlib test.c
>> $ ./a.out
>> Segmentation fault
>>
>> Also when running under gdb the error message is:
>>
>> During startup program terminated with signal SIGSEGV, Segmentation fault.
>>
>> So it seems the error already happens during loading.
>>
>> Could be a compiler or kernel bug?
>
> I tried here with gcc-11.4.0 native on an ubuntu-22.04 and using a
> cross gcc-9.5 executed on another box and couldn't reproduce the issue
> at all. It could be that the compiler inserts whatever, did someone
> try to disassemble the resulting program to see what it looks like?
> Maybe we're even dealing with issues related to random stack alignment
> that causes issues past a function call due to some garbage being placed
> at the wrong place in the stack. Also, dmesg should generally report
> what (and where) the segv happened. Similarly, gdb with "info reg"
> and "disassemble $pc" should report some info.

Sadly there is no kernel output for this; I guess this happens just too
early. gdb just reports "The program has no registers now."

Sebastian