Hi,
the tpidr2 selftest on an arm box segfaults before reaching the entry point.
I have no clue what is to blame for this or how to debug it but for a
statically linked binary there shouldn't be much stuff going on besides the
elf loader?
I can reproduce this with a program using an empty main function. Also checked
for other nolibc users - same result for init.c from rcutorture.
tools/testing/selftests/arm64/fp/za-fork is working though - the only
difference I could spot here is that it is linked together with another object
file. I also looked at the elf headers but didn't find anything obvious (but
I'm a bit out of my comfort zone here..)
After playing around with linker options I found that using -static-pie
lets the binaries run successfully.
[root@arm abi]# cat test.c
int main(void)
{
	return 1;
}
[root@arm abi]# gcc -Os -static -Wall -lgcc -nostdlib -ffreestanding -include ../../../../include/nolibc/nolibc.h test.c
[root@arm abi]# ./a.out
Segmentation fault
[root@arm abi]# gcc -Os -static -Wall -lgcc -nostdlib -ffreestanding -static-pie -include ../../../../include/nolibc/nolibc.h test.c
[root@arm abi]# ./a.out
[root@arm abi]#
All on aarch64 running Fedora 37 + upstream kernel. Any hints on what could
be broken here or how to actually fix it?
Sebastian
On 2023-09-13 22:19:00+0200, Thomas Weißschuh wrote:
> On 2023-09-13 20:44:59+0200, Sebastian Ott wrote:
> > the tpidr2 selftest on an arm box segfaults before reaching the entry point.
> > I have no clue what is to blame for this or how to debug it but for a
> > statically linked binary there shouldn't be much stuff going on besides the
> > elf loader?
> [..]
>
> I reduced it to the following reproducer:
>
> $ cat test.c
> int foo; /* It works when deleting this variable */
>
> void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) _start(void)
> {
> 	__asm__ volatile (
> 		"mov x8, 93\n" /* NR_exit == 93 */
> 		"svc #0\n"
> 	);
> 	__builtin_unreachable();
> }
>
> $ aarch64-linux-gnu-gcc -Os -static -fno-stack-protector -Wall -nostdlib test.c
> $ ./a.out
> Segmentation fault
>
> Also when running under gdb the error message is:
>
> During startup program terminated with signal SIGSEGV, Segmentation fault.
>
> So it seems the error already happens during loading.
>
> Could be a compiler or kernel bug?
Callchain for the failure:
load_elf_binary()
  -> if (likely(elf_bss != elf_brk) && unlikely(padzero(elf_bss)))
    -> padzero()
      -> clear_user()
        -> __arch_clear_user()
          -> failure in arch/arm64/lib/clear_user.S
This results in an EFAULT, which gets translated to SIGSEGV somewhere.
The following patch, which seems sensible to me, fixes it for me.
But as this is really old, heavily used code I'm a bit hesitant.
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 7b3d2d491407..13f71733ba63 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -112,7 +112,7 @@ static struct linux_binfmt elf_format = {
 static int set_brk(unsigned long start, unsigned long end, int prot)
 {
-	start = ELF_PAGEALIGN(start);
+	start = ELF_PAGESTART(start);
 	end = ELF_PAGEALIGN(end);
 	if (end > start) {
 		/*
On 2023-09-13 20:44:59+0200, Sebastian Ott wrote:
> Hi,
>
> the tpidr2 selftest on an arm box segfaults before reaching the entry point.
> I have no clue what is to blame for this or how to debug it but for a
> statically linked binary there shouldn't be much stuff going on besides the
> elf loader?
>
> I can reproduce this with a program using an empty main function. Also checked
> for other nolibc users - same result for init.c from rcutorture.
>
> tools/testing/selftests/arm64/fp/za-fork is working though - the only
> difference I could spot here is that it is linked together with another object
> file. I also looked at the elf headers but didn't find anything obvious (but
> I'm a bit out of my comfort zone here..)
>
> After playing around with linker options I found that using -static-pie
> lets the binaries run successfully.
>
> [root@arm abi]# cat test.c
> int main(void)
> {
> 	return 1;
> }
> [root@arm abi]# gcc -Os -static -Wall -lgcc -nostdlib -ffreestanding -include ../../../../include/nolibc/nolibc.h test.c
> [root@arm abi]# ./a.out
> Segmentation fault
> [root@arm abi]# gcc -Os -static -Wall -lgcc -nostdlib -ffreestanding -static-pie -include ../../../../include/nolibc/nolibc.h test.c
> [root@arm abi]# ./a.out
> [root@arm abi]#
>
> All on aarch64 running Fedora 37 + upstream kernel. Any hints on what could
> be broken here or how to actually fix it?
I reduced it to the following reproducer:
$ cat test.c
int foo; /* It works when deleting this variable */
void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) _start(void)
{
	__asm__ volatile (
		"mov x8, 93\n" /* NR_exit == 93 */
		"svc #0\n"
	);
	__builtin_unreachable();
}
$ aarch64-linux-gnu-gcc -Os -static -fno-stack-protector -Wall -nostdlib test.c
$ ./a.out
Segmentation fault
Also when running under gdb the error message is:
During startup program terminated with signal SIGSEGV, Segmentation fault.
So it seems the error already happens during loading.
Could be a compiler or kernel bug?
Thomas