2014-07-01 15:06:06

by Catalin Marinas

Subject: Re: [PATCH 22/24] ARM64:ILP32: Use a seperate syscall table as a few syscalls need to be using the compat syscalls.

On Sat, May 24, 2014 at 12:02:17AM -0700, Andrew Pinski wrote:
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 1e1ebfc..8241ffe 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -620,9 +620,14 @@ ENDPROC(ret_from_fork)
> */
> .align 6
> el0_svc:
> - adrp stbl, sys_call_table // load syscall table pointer
> uxtw scno, w8 // syscall number in w8
> mov sc_nr, #__NR_syscalls
> +#ifdef CONFIG_ARM64_ILP32
> + get_thread_info tsk
> + ldr x16, [tsk, #TI_FLAGS]
> + tbnz x16, #TIF_32BIT_AARCH64, el0_ilp32_svc // We are using ILP32
> +#endif
> + adrp stbl, sys_call_table // load syscall table pointer

This adds a slight penalty on the AArch64 SVC entry path. I can't tell
whether that's visible or not but I think the x86 guys decided to set an
extra bit to the syscall number to distinguish it from native calls.

> diff --git a/arch/arm64/kernel/sys_ilp32.c b/arch/arm64/kernel/sys_ilp32.c
> new file mode 100644
> index 0000000..1da1d11
> --- /dev/null
> +++ b/arch/arm64/kernel/sys_ilp32.c
[...]
> +/*
> + * Wrappers to pass the pt_regs argument.
> + */
> +#define sys_rt_sigreturn sys_rt_sigreturn_wrapper
> +
> +
> +/* Using Compat syscalls where necessary */
> +#define sys_ioctl compat_sys_ioctl
> +/* iovec */
> +#define sys_readv compat_sys_readv
> +#define sys_writev compat_sys_writev
> +#define sys_preadv compat_sys_preadv64
> +#define sys_pwritev compat_sys_pwritev64
> +#define sys_vmsplice compat_sys_vmsplice

Do these actually work? compat_iovec has two members of 32-bit each
while the ILP32 iovec has a void * (32-bit) and a __kernel_size_t which
is 64-bit.

> +/* robust_list_head */
> +#define sys_set_robust_list compat_sys_set_robust_list
> +#define sys_get_robust_list compat_sys_get_robust_list

Same here, we have a size_t * argument. The compat function would write
back 32-bit but size_t is 64-bit for ILP32.

> +/* kexec_segment */
> +#define sys_kexec_load compat_sys_kexec_load

More size_t members in the kexec_segment structure (but we don't yet
have kexec on arm64).

> +/* struct msghdr */
> +#define sys_recvfrom compat_sys_recvfrom

Why compat here? struct sockaddr seems to be the same as the native one.

> +#define sys_recvmmsg compat_sys_recvmmsg
> +#define sys_sendmmsg compat_sys_sendmmsg
> +#define sys_sendmsg compat_sys_sendmsg
> +#define sys_recvmsg compat_sys_recvmsg

These get messier as well with a different size_t affecting struct
msghdr.

> +#define sys_setsockopt compat_sys_setsockopt
> +#define sys_getsockopt compat_sys_getsockopt

Looking at the sock_getsockopt() function, we have a union v copied
to/from user. However, such a union contains a struct timeval which for
ILP32 would be different than the compat one.

> +/* iovec */
> +#define sys_process_vm_readv compat_sys_process_vm_readv
> +#define sys_process_vm_writev compat_sys_process_vm_writev

See above for iovec.

> +/* Pointer in struct */
> +#define sys_mount compat_sys_mount

Which structure is this?

> +/* Scheduler */
> +/* unsigned long bitmaps */
> +#define sys_sched_setaffinity compat_sys_sched_setaffinity
> +#define sys_sched_getaffinity compat_sys_sched_getaffinity

Does the long bitmask matter here? I can see the length is passed in
bytes.

> +/* iov usage */
> +#define sys_keyctl compat_sys_keyctl

Same problem as iovec above.

> +/* aio */
> +/* Pointer to Pointer */
> +#define sys_io_setup compat_sys_io_setup

sys_io_setup takes a pointer to aio_context_t which is defined as
__kernel_ulong_t (same as LP64).

--
Catalin


2014-07-01 15:30:58

by Andrew Pinski

Subject: Re: [PATCH 22/24] ARM64:ILP32: Use a seperate syscall table as a few syscalls need to be using the compat syscalls.



> On Jul 1, 2014, at 8:07 AM, Catalin Marinas wrote:
>
>> On Sat, May 24, 2014 at 12:02:17AM -0700, Andrew Pinski wrote:
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 1e1ebfc..8241ffe 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -620,9 +620,14 @@ ENDPROC(ret_from_fork)
>> */
>> .align 6
>> el0_svc:
>> - adrp stbl, sys_call_table // load syscall table pointer
>> uxtw scno, w8 // syscall number in w8
>> mov sc_nr, #__NR_syscalls
>> +#ifdef CONFIG_ARM64_ILP32
>> + get_thread_info tsk
>> + ldr x16, [tsk, #TI_FLAGS]
>> + tbnz x16, #TIF_32BIT_AARCH64, el0_ilp32_svc // We are using ILP32
>> +#endif
>> + adrp stbl, sys_call_table // load syscall table pointer
>
> This adds a slight penalty on the AArch64 SVC entry path. I can't tell
> whether that's visible or not but I think the x86 guys decided to set an
> extra bit to the syscall number to distinguish it from native calls.
>
>> diff --git a/arch/arm64/kernel/sys_ilp32.c b/arch/arm64/kernel/sys_ilp32.c
>> new file mode 100644
>> index 0000000..1da1d11
>> --- /dev/null
>> +++ b/arch/arm64/kernel/sys_ilp32.c
> [...]
>> +/*
>> + * Wrappers to pass the pt_regs argument.
>> + */
>> +#define sys_rt_sigreturn sys_rt_sigreturn_wrapper
>> +
>> +
>> +/* Using Compat syscalls where necessary */
>> +#define sys_ioctl compat_sys_ioctl
>> +/* iovec */
>> +#define sys_readv compat_sys_readv
>> +#define sys_writev compat_sys_writev
>> +#define sys_preadv compat_sys_preadv64
>> +#define sys_pwritev compat_sys_pwritev64
>> +#define sys_vmsplice compat_sys_vmsplice
>
> Do these actually work? compat_iovec has two members of 32-bit each
> while the ILP32 iovec has a void * (32-bit) and a __kernel_size_t which
> is 64-bit.

size_t should be unsigned long in ILP32, so it is a 32-bit unsigned integer type. That part of the ABI was already defined in the ARM ABI documents. Are you now saying we should pass size_t differently between user and kernel space?

>
>> +/* robust_list_head */
>> +#define sys_set_robust_list compat_sys_set_robust_list
>> +#define sys_get_robust_list compat_sys_get_robust_list
>
> Same here, we have a size_t * argument. The compat function would write
> back 32-bit but size_t is 64-bit for ILP32.

See above. size_t is 32 bits.

>
>> +/* kexec_segment */
>> +#define sys_kexec_load compat_sys_kexec_load
>
> More size_t members in the kexec_segment structure (but we don't yet
> have kexec on arm64).

See above. size_t is 32 bits.

>
>> +/* struct msghdr */
>> +#define sys_recvfrom compat_sys_recvfrom
>
> Why compat here? struct sockaddr seems to be the same as the native one.
>
>> +#define sys_recvmmsg compat_sys_recvmmsg
>> +#define sys_sendmmsg compat_sys_sendmmsg
>> +#define sys_sendmsg compat_sys_sendmsg
>> +#define sys_recvmsg compat_sys_recvmsg
>
> These get messier as well with a different size_t affecting struct
> msghdr.

See above about size_t.


>
>> +#define sys_setsockopt compat_sys_setsockopt
>> +#define sys_getsockopt compat_sys_getsockopt
>
> Looking at the sock_getsockopt() function, we have a union v copied
> to/from user. However, such a union contains a struct timeval which for
> ILP32 would be different than the compat one.

I will look into this one, but it might already be taken care of by the "compat uses 64-bit time" define. I will add a comment saying so if that turns out to be true.


>
>> +/* iovec */
>> +#define sys_process_vm_readv compat_sys_process_vm_readv
>> +#define sys_process_vm_writev compat_sys_process_vm_writev
>
> See above for iovec.

See above for my size_t question.

>
>> +/* Pointer in struct */
>> +#define sys_mount compat_sys_mount
>
> Which structure is this?

NFS structure, I can expand out the comment if needed.


>
>> +/* Scheduler */
>> +/* unsigned long bitmaps */
>> +#define sys_sched_setaffinity compat_sys_sched_setaffinity
>> +#define sys_sched_getaffinity compat_sys_sched_getaffinity
>
> Does the long bitmask matter here? I can see the length is passed in
> bytes.

Yes, for big endian. If we were only supporting little endian, the bitmaps would not matter.
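
The big-endian point can be illustrated with a small, self-contained C sketch. It is not kernel code: `store_word` and `byte_of_cpu` are hypothetical helpers that serialize a CPU mask as an array of words in a chosen byte order, assuming the native mask is built from 64-bit words and the compat mask from 32-bit words. The sketch reports which byte of the user buffer ends up holding a given CPU's bit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write `word` into out[0..nbytes) in the given
 * byte order (big_endian != 0 means most significant byte first). */
static void store_word(uint8_t *out, uint64_t word, unsigned nbytes,
                       int big_endian)
{
    for (unsigned i = 0; i < nbytes; i++) {
        unsigned shift = big_endian ? 8 * (nbytes - 1 - i) : 8 * i;
        out[i] = (uint8_t)(word >> shift);
    }
}

/* Hypothetical helper: build an 8-byte mask as an array of words of
 * `word_bytes` bytes each (4 or 8), set the bit for `cpu`, and return
 * the offset of the byte that holds it. Returns 8 if cpu is out of range. */
static unsigned byte_of_cpu(unsigned cpu, unsigned word_bytes, int big_endian)
{
    uint8_t mask[8];
    unsigned word_bits = 8 * word_bytes;

    memset(mask, 0, sizeof(mask));
    for (unsigned w = 0; w < 8 / word_bytes; w++) {
        uint64_t word = 0;

        if (cpu / word_bits == w)
            word = (uint64_t)1 << (cpu % word_bits);
        store_word(mask + w * word_bytes, word, word_bytes, big_endian);
    }
    for (unsigned i = 0; i < 8; i++)
        if (mask[i])
            return i;
    return 8;
}
```

With little-endian storage, `byte_of_cpu(0, 8, 0)` and `byte_of_cpu(0, 4, 0)` agree, so the word size is invisible to user space. With big-endian storage they differ, which is why a mask written as 32-bit words cannot be read as 64-bit words without conversion.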

>
>> +/* iov usage */
>> +#define sys_keyctl compat_sys_keyctl
>
> Same problem as iovec above.
>
>> +/* aio */
>> +/* Pointer to Pointer */
>> +#define sys_io_setup compat_sys_io_setup
>
> sys_io_setup takes a pointer to aio_context_t which is defined as
> __kernel_ulong_t (same as LP64).

Let me look at why I did this one. I think the code that used aio was not in glibc, which is why I used the compat version.

Thanks,
Andrew

>
> --
> Catalin

2014-07-01 16:38:26

by Arnd Bergmann

Subject: Re: [PATCH 22/24] ARM64:ILP32: Use a seperate syscall table as a few syscalls need to be using the compat syscalls.

On Tuesday 01 July 2014 15:30:51 Pinski, Andrew wrote:
> > On Jul 1, 2014, at 8:07 AM, Catalin Marinas wrote:
> >> On Sat, May 24, 2014 at 12:02:17AM -0700, Andrew Pinski wrote:
> >> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> >> index 1e1ebfc..8241ffe 100644
> >> --- a/arch/arm64/kernel/entry.S
> >> +++ b/arch/arm64/kernel/entry.S
> >> @@ -620,9 +620,14 @@ ENDPROC(ret_from_fork)
> >> */
> >> .align 6
> >> el0_svc:
> >> - adrp stbl, sys_call_table // load syscall table pointer
> >> uxtw scno, w8 // syscall number in w8
> >> mov sc_nr, #__NR_syscalls
> >> +#ifdef CONFIG_ARM64_ILP32
> >> + get_thread_info tsk
> >> + ldr x16, [tsk, #TI_FLAGS]
> >> + tbnz x16, #TIF_32BIT_AARCH64, el0_ilp32_svc // We are using ILP32
> >> +#endif
> >> + adrp stbl, sys_call_table // load syscall table pointer
> >
> > This adds a slight penalty on the AArch64 SVC entry path. I can't tell
> > whether that's visible or not but I think the x86 guys decided to set an
> > extra bit to the syscall number to distinguish it from native calls.

IIRC the intention on x86 was that you should always be able to call
any of the three syscall ABIs (x86-32, x86-64, x32) from any process
by passing the right number, for flexibility.

Arnd
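
The two ways of choosing a syscall table discussed above can be contrasted in a short C sketch. Everything here is a hypothetical model rather than arm64 or x86 code: `MODEL_NR_SYSCALLS`, the `model_*` types, and both selector functions are invented for illustration. `MODEL_ABI_BIT` uses the same value as x86's `__X32_SYSCALL_BIT` (0x40000000).

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_SYSCALLS 300u        /* hypothetical table size */
#define MODEL_ABI_BIT     0x40000000u /* same value as __X32_SYSCALL_BIT */

enum model_table {
    MODEL_TABLE_NONE = -1,  /* out of range, would return -ENOSYS */
    MODEL_TABLE_NATIVE = 0,
    MODEL_TABLE_ILP32 = 1
};

struct model_target {
    enum model_table table;
    uint32_t index;
};

/* Approach in the patch: a per-thread flag picks the table, so the
 * entry path has to load and test the thread flags on every call. */
static struct model_target select_by_thread_flag(uint32_t scno,
                                                 int thread_is_ilp32)
{
    struct model_target t = { MODEL_TABLE_NONE, 0 };

    if (scno >= MODEL_NR_SYSCALLS)
        return t;
    t.table = thread_is_ilp32 ? MODEL_TABLE_ILP32 : MODEL_TABLE_NATIVE;
    t.index = scno;
    return t;
}

/* Alternative: a bit in the syscall number picks the table, so any
 * thread can reach either table by passing the right number. */
static struct model_target select_by_number_bit(uint32_t scno)
{
    struct model_target t = { MODEL_TABLE_NONE, 0 };
    uint32_t index = scno & ~MODEL_ABI_BIT;

    if (index >= MODEL_NR_SYSCALLS)
        return t;
    t.table = (scno & MODEL_ABI_BIT) ? MODEL_TABLE_ILP32
                                     : MODEL_TABLE_NATIVE;
    t.index = index;
    return t;
}
```

In this model, `select_by_number_bit(MODEL_ABI_BIT | 5u)` reaches entry 5 of the ILP32 table from any thread, while `select_by_thread_flag(5u, 0)` can only ever reach the native table.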

2014-07-01 16:49:14

by Catalin Marinas

Subject: Re: [PATCH 22/24] ARM64:ILP32: Use a seperate syscall table as a few syscalls need to be using the compat syscalls.

On Tue, Jul 01, 2014 at 04:30:51PM +0100, Pinski, Andrew wrote:
> On Jul 1, 2014, at 8:07 AM, Catalin Marinas wrote:
> > On Sat, May 24, 2014 at 12:02:17AM -0700, Andrew Pinski wrote:
> >> +/* Using Compat syscalls where necessary */
> >> +#define sys_ioctl compat_sys_ioctl
> >> +/* iovec */
> >> +#define sys_readv compat_sys_readv
> >> +#define sys_writev compat_sys_writev
> >> +#define sys_preadv compat_sys_preadv64
> >> +#define sys_pwritev compat_sys_pwritev64
> >> +#define sys_vmsplice compat_sys_vmsplice
> >
> > Do these actually work? compat_iovec has two members of 32-bit each
> > while the ILP32 iovec has a void * (32-bit) and a __kernel_size_t which
> > is 64-bit.
>
> size_t should be unsigned long in ilp32 so a 32bit unsigned integer
> type. That part of the abi was already defined in the arm abi
> documents. Are you now saying we should pass size_t differently
> between user and kernel space?

OK, I think you are right here. The ILP32 would not see __kernel_size_t
defined as __kernel_ulong_t because __BITS_PER_LONG != 64.
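
The layout question settled here can be checked with a small C sketch. The structures are hypothetical stand-ins with fixed-width fields, not the kernel's `struct iovec` or `struct compat_iovec`; they model a compat iovec, an ILP32 iovec with a 32-bit size_t, and the case originally worried about, where the length would be 64-bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the compat (AArch32) iovec:
 * 32-bit pointer, 32-bit length. */
struct model_compat_iovec {
    uint32_t iov_base;
    uint32_t iov_len;
};

/* Hypothetical model of the ILP32 iovec when size_t is a 32-bit
 * unsigned long, as the ABI documents define it. */
struct model_ilp32_iovec {
    uint32_t iov_base;
    uint32_t iov_len;
};

/* Hypothetical model of the case originally feared:
 * 32-bit pointer with a 64-bit length. */
struct model_ilp32_wide_iovec {
    uint32_t iov_base;
    uint64_t iov_len;
};

/* Two layouts are interchangeable here if they have the same size and
 * place the length at the same offset. */
#define SAME_LAYOUT(a, b) \
    (sizeof(struct a) == sizeof(struct b) && \
     offsetof(struct a, iov_len) == offsetof(struct b, iov_len))

static int ilp32_matches_compat(void)
{
    return SAME_LAYOUT(model_ilp32_iovec, model_compat_iovec);
}

static int wide_len_matches_compat(void)
{
    return SAME_LAYOUT(model_ilp32_wide_iovec, model_compat_iovec);
}
```

Under these assumptions `ilp32_matches_compat()` returns 1, so the compat iovec handlers can parse an ILP32 iovec unchanged, while `wide_len_matches_compat()` returns 0, which is the mismatch that would have arisen had the length field been 64-bit.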

> >> +/* Pointer in struct */
> >> +#define sys_mount compat_sys_mount
> >
> > Which structure is this?
>
> NFS structure, I can expand out the comment if needed.

That would be good for future reference.

Thanks.

--
Catalin

2014-07-01 16:50:52

by Catalin Marinas

Subject: Re: [PATCH 22/24] ARM64:ILP32: Use a seperate syscall table as a few syscalls need to be using the compat syscalls.

On Tue, Jul 01, 2014 at 05:38:12PM +0100, Arnd Bergmann wrote:
> On Tuesday 01 July 2014 15:30:51 Pinski, Andrew wrote:
> > > On Jul 1, 2014, at 8:07 AM, Catalin Marinas wrote:
> > >> On Sat, May 24, 2014 at 12:02:17AM -0700, Andrew Pinski wrote:
> > >> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > >> index 1e1ebfc..8241ffe 100644
> > >> --- a/arch/arm64/kernel/entry.S
> > >> +++ b/arch/arm64/kernel/entry.S
> > >> @@ -620,9 +620,14 @@ ENDPROC(ret_from_fork)
> > >> */
> > >> .align 6
> > >> el0_svc:
> > >> - adrp stbl, sys_call_table // load syscall table pointer
> > >> uxtw scno, w8 // syscall number in w8
> > >> mov sc_nr, #__NR_syscalls
> > >> +#ifdef CONFIG_ARM64_ILP32
> > >> + get_thread_info tsk
> > >> + ldr x16, [tsk, #TI_FLAGS]
> > >> + tbnz x16, #TIF_32BIT_AARCH64, el0_ilp32_svc // We are using ILP32
> > >> +#endif
> > >> + adrp stbl, sys_call_table // load syscall table pointer
> > >
> > > This adds a slight penalty on the AArch64 SVC entry path. I can't tell
> > > whether that's visible or not but I think the x86 guys decided to set an
> > > extra bit to the syscall number to distinguish it from native calls.
>
> IIRC the intention on x86 was that you should always be able to call
> any of the three syscall ABIs (x86-32, x86-64, x32) from any process
> by passing the right number, for flexibility.

I don't see how this is useful though. Do you happen to have more
information?

--
Catalin

2014-07-01 17:04:50

by Arnd Bergmann

Subject: Re: [PATCH 22/24] ARM64:ILP32: Use a seperate syscall table as a few syscalls need to be using the compat syscalls.

On Tuesday 01 July 2014 17:50:41 Catalin Marinas wrote:
> On Tue, Jul 01, 2014 at 05:38:12PM +0100, Arnd Bergmann wrote:
> > On Tuesday 01 July 2014 15:30:51 Pinski, Andrew wrote:
> > > > On Jul 1, 2014, at 8:07 AM, Catalin Marinas wrote:
> > > >> On Sat, May 24, 2014 at 12:02:17AM -0700, Andrew Pinski wrote:
> > > >> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > >> index 1e1ebfc..8241ffe 100644
> > > >> --- a/arch/arm64/kernel/entry.S
> > > >> +++ b/arch/arm64/kernel/entry.S
> > > >> @@ -620,9 +620,14 @@ ENDPROC(ret_from_fork)
> > > >> */
> > > >> .align 6
> > > >> el0_svc:
> > > >> - adrp stbl, sys_call_table // load syscall table pointer
> > > >> uxtw scno, w8 // syscall number in w8
> > > >> mov sc_nr, #__NR_syscalls
> > > >> +#ifdef CONFIG_ARM64_ILP32
> > > >> + get_thread_info tsk
> > > >> + ldr x16, [tsk, #TI_FLAGS]
> > > >> + tbnz x16, #TIF_32BIT_AARCH64, el0_ilp32_svc // We are using ILP32
> > > >> +#endif
> > > >> + adrp stbl, sys_call_table // load syscall table pointer
> > > >
> > > > This adds a slight penalty on the AArch64 SVC entry path. I can't tell
> > > > whether that's visible or not but I think the x86 guys decided to set an
> > > > extra bit to the syscall number to distinguish it from native calls.
> >
> > IIRC the intention on x86 was that you should always be able to call
> > any of the three syscall ABIs (x86-32, x86-64, x32) from any process
> > by passing the right number, for flexibility.
>
> I don't see how this is useful though. Do you happen to have more
> information?

It's been a decade since this code was merged, so my memory isn't very
good here. I believe one of the main reasons was being able to run
emulation layers in user space that make use of the kernel helpers.

Another use case might be an application that wants to use the native
ioctl interface for a driver that does not have an (efficient) compat
ioctl handler.

Arnd