2017-09-12 22:58:02

by Dmitry V. Levin

[permalink] [raw]
Subject: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

Before this change, CONFIG_X86_X32=y fastpath behaviour was different
from slowpath:

$ gcc -xc -Wall -O2 - <<'EOF'
#include <unistd.h>
#include <asm/unistd.h>
int main(void) {
unsigned long nr = ~0xffffffffUL | __NR_exit;
return !!syscall(nr, 42, 1, 2, 3, 4, 5);
}
EOF
$ ./a.out; echo \$?=$?
$?=42
$ strace -enone ./a.out
syscall_18446744069414584380(0x2a, 0x1, 0x2, 0x3, 0x4, 0x5) = -1 (errno 38)
+++ exited with 1 +++

This change syncs CONFIG_X86_X32=y fastpath behaviour with the case
when CONFIG_X86_X32 is not enabled.

Fixes: fca460f95e92 ("x32: Handle the x32 system call flag")
Cc: [email protected]
Signed-off-by: Dmitry V. Levin <[email protected]>
---
arch/x86/entry/entry_64.S | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 4916725..3bab6af 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
*/
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
-#if __SYSCALL_MASK == ~0
- cmpq $__NR_syscall_max, %rax
-#else
- andl $__SYSCALL_MASK, %eax
- cmpl $__NR_syscall_max, %eax
+#if __SYSCALL_MASK != ~0
+ andq $__SYSCALL_MASK, %rax
#endif
+ cmpq $__NR_syscall_max, %rax
ja 1f /* return -ENOSYS (already in pt_regs->ax) */
movq %r10, %rcx

--
ldv


2017-09-13 16:49:30

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y


On 09/13, Dmitry V. Levin wrote:
>
> Before this change, CONFIG_X86_X32=y fastpath behaviour was different
> from slowpath:

and even with this change they differ if CONFIG_X86_X32=n? do_syscall_64()
does "nr & __SYSCALL_MASK" unconditionally, this clears the upper bits, no?

And why __SYSCALL_MASK is not "unsigned long" ? IOW, why do we want to silently
ignore the upper bits in $rax ?

Or I am totally confused?

Oleg.

> $ gcc -xc -Wall -O2 - <<'EOF'
> #include <unistd.h>
> #include <asm/unistd.h>
> int main(void) {
> unsigned long nr = ~0xffffffffUL | __NR_exit;
> return !!syscall(nr, 42, 1, 2, 3, 4, 5);
> }
> EOF
> $ ./a.out; echo \$?=$?
> $?=42
> $ strace -enone ./a.out
> syscall_18446744069414584380(0x2a, 0x1, 0x2, 0x3, 0x4, 0x5) = -1 (errno 38)
> +++ exited with 1 +++
>
> This change syncs CONFIG_X86_X32=y fastpath behaviour with the case
> when CONFIG_X86_X32 is not enabled.
>
> Fixes: fca460f95e92 ("x32: Handle the x32 system call flag")
> Cc: [email protected]
> Signed-off-by: Dmitry V. Levin <[email protected]>
> ---
> arch/x86/entry/entry_64.S | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 4916725..3bab6af 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
> */
> TRACE_IRQS_ON
> ENABLE_INTERRUPTS(CLBR_NONE)
> -#if __SYSCALL_MASK == ~0
> - cmpq $__NR_syscall_max, %rax
> -#else
> - andl $__SYSCALL_MASK, %eax
> - cmpl $__NR_syscall_max, %eax
> +#if __SYSCALL_MASK != ~0
> + andq $__SYSCALL_MASK, %rax
> #endif
> + cmpq $__NR_syscall_max, %rax
> ja 1f /* return -ENOSYS (already in pt_regs->ax) */
> movq %r10, %rcx
>
> --
> ldv

2017-09-14 19:41:05

by Eugene Syromyatnikov

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On Wed, Sep 13, 2017 at 4:49 PM, Oleg Nesterov <[email protected]> wrote:
> IOW, why do we want to silently ignore the upper bits in $rax ?

By the way, they are ignored elsewhere, in audit[1] or seccomp[2], for example.

[1] include/linux/audit.h
[2] include/uapi/linux/seccomp.h, definition of struct seccomp_data

--
Eugene Syromyatnikov
mailto:[email protected]
xmpp:esyr@jabber.{ru|org}

2017-09-14 20:21:17

by Dmitry V. Levin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On Wed, Sep 13, 2017 at 06:49:21PM +0200, Oleg Nesterov wrote:
> On 09/13, Dmitry V. Levin wrote:
> >
> > Before this change, CONFIG_X86_X32=y fastpath behaviour was different
> > from slowpath:
>
> and even with this change they differ if CONFIG_X86_X32=n?

No, I don't think so.

> do_syscall_64() does "nr & __SYSCALL_MASK" unconditionally,

yes

> this clears the upper bits, no?

Why? As "nr" is of type "unsigned long" and __SYSCALL_MASK is either
(~(__X32_SYSCALL_BIT)) or (~0), that is, an integer with the sign bit set,
in "nr & __SYSCALL_MASK" expression __SYSCALL_MASK is sign-extended
to unsigned long. When __SYSCALL_MASK is defined to (~0),
"nr & __SYSCALL_MASK" is optimized to "nr" at compilation time:

$ echo 'unsigned long foo(unsigned long nr) { return nr & (~0); }' |
gcc -Wall -O2 -xc -S -o - - |
sed -n '/cfi_/,/cfi_/p'
.cfi_startproc
movq %rdi, %rax
ret
.cfi_endproc

> And why __SYSCALL_MASK is not "unsigned long" ? IOW, why do we want to silently
> ignore the upper bits in $rax ?

__SYSCALL_MASK is "int" but it is being sign-extended to unsigned long in all
(two) places of arch/x86/entry/common.c where it is used.

> Or I am totally confused?

The thing looks like it was designed to confuse people.


--
ldv


Attachments:
(No filename) (1.22 kB)
signature.asc (801.00 B)
Download all attachments

2017-09-14 20:24:32

by Dmitry V. Levin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On Thu, Sep 14, 2017 at 07:40:43PM +0000, Eugene Syromyatnikov wrote:
> On Wed, Sep 13, 2017 at 4:49 PM, Oleg Nesterov <[email protected]> wrote:
> > IOW, why do we want to silently ignore the upper bits in $rax ?
>
> By the way, they are ignored elsewhere, in audit[1] or seccomp[2], for example.
>
> [1] include/linux/audit.h
> [2] include/uapi/linux/seccomp.h, definition of struct seccomp_data

Yes, unfortunately, they are ignored later, but that is another story.


--
ldv


Attachments:
(No filename) (480.00 B)
signature.asc (801.00 B)
Download all attachments

2017-09-14 21:05:27

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On Tue, Sep 12, 2017 at 3:57 PM, Dmitry V. Levin <[email protected]> wrote:
> Before this change, CONFIG_X86_X32=y fastpath behaviour was different
> from slowpath:
>
> $ gcc -xc -Wall -O2 - <<'EOF'
> #include <unistd.h>
> #include <asm/unistd.h>
> int main(void) {
> unsigned long nr = ~0xffffffffUL | __NR_exit;
> return !!syscall(nr, 42, 1, 2, 3, 4, 5);
> }
> EOF
> $ ./a.out; echo \$?=$?
> $?=42
> $ strace -enone ./a.out
> syscall_18446744069414584380(0x2a, 0x1, 0x2, 0x3, 0x4, 0x5) = -1 (errno 38)
> +++ exited with 1 +++
>
> This change syncs CONFIG_X86_X32=y fastpath behaviour with the case
> when CONFIG_X86_X32 is not enabled.

Do you see real brokenness here, or is it just weird?

>
> Fixes: fca460f95e92 ("x32: Handle the x32 system call flag")
> Cc: [email protected]
> Signed-off-by: Dmitry V. Levin <[email protected]>
> ---
> arch/x86/entry/entry_64.S | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 4916725..3bab6af 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
> */
> TRACE_IRQS_ON
> ENABLE_INTERRUPTS(CLBR_NONE)
> -#if __SYSCALL_MASK == ~0
> - cmpq $__NR_syscall_max, %rax
> -#else
> - andl $__SYSCALL_MASK, %eax
> - cmpl $__NR_syscall_max, %eax
> +#if __SYSCALL_MASK != ~0
> + andq $__SYSCALL_MASK, %rax
> #endif
> + cmpq $__NR_syscall_max, %rax

I don't know much about x32 userspace, but there's an argument that
the high bits *should* be masked off if the x32 bit is set. Of
course, that's slower, but it could be done without performance loss,
I think.

--Andy

2017-09-14 21:33:19

by Dmitry V. Levin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On Thu, Sep 14, 2017 at 02:05:04PM -0700, Andy Lutomirski wrote:
> On Tue, Sep 12, 2017 at 3:57 PM, Dmitry V. Levin wrote:
> > Before this change, CONFIG_X86_X32=y fastpath behaviour was different
> > from slowpath:
> >
> > $ gcc -xc -Wall -O2 - <<'EOF'
> > #include <unistd.h>
> > #include <asm/unistd.h>
> > int main(void) {
> > unsigned long nr = ~0xffffffffUL | __NR_exit;
> > return !!syscall(nr, 42, 1, 2, 3, 4, 5);
> > }
> > EOF
> > $ ./a.out; echo \$?=$?
> > $?=42
> > $ strace -enone ./a.out
> > syscall_18446744069414584380(0x2a, 0x1, 0x2, 0x3, 0x4, 0x5) = -1 (errno 38)
> > +++ exited with 1 +++
> >
> > This change syncs CONFIG_X86_X32=y fastpath behaviour with the case
> > when CONFIG_X86_X32 is not enabled.
>
> Do you see real brokenness here, or is it just weird?

It's definitely broken. A syscall should be either valid or invalid
regardless of implementation peculiarities like fastpath vs slowpath.

> > Fixes: fca460f95e92 ("x32: Handle the x32 system call flag")
> > Cc: [email protected]
> > Signed-off-by: Dmitry V. Levin <[email protected]>
> > ---
> > arch/x86/entry/entry_64.S | 8 +++-----
> > 1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > index 4916725..3bab6af 100644
> > --- a/arch/x86/entry/entry_64.S
> > +++ b/arch/x86/entry/entry_64.S
> > @@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
> > */
> > TRACE_IRQS_ON
> > ENABLE_INTERRUPTS(CLBR_NONE)
> > -#if __SYSCALL_MASK == ~0
> > - cmpq $__NR_syscall_max, %rax
> > -#else
> > - andl $__SYSCALL_MASK, %eax
> > - cmpl $__NR_syscall_max, %eax
> > +#if __SYSCALL_MASK != ~0
> > + andq $__SYSCALL_MASK, %rax
> > #endif
> > + cmpq $__NR_syscall_max, %rax
>
> I don't know much about x32 userspace, but there's an argument that
> the high bits *should* be masked off if the x32 bit is set.

Why?


--
ldv


Attachments:
(No filename) (1.92 kB)
signature.asc (801.00 B)
Download all attachments

2017-09-14 21:38:47

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On Thu, Sep 14, 2017 at 2:33 PM, Dmitry V. Levin <[email protected]> wrote:
> On Thu, Sep 14, 2017 at 02:05:04PM -0700, Andy Lutomirski wrote:
>> On Tue, Sep 12, 2017 at 3:57 PM, Dmitry V. Levin wrote:
>> > Before this change, CONFIG_X86_X32=y fastpath behaviour was different
>> > from slowpath:
>> >
>> > $ gcc -xc -Wall -O2 - <<'EOF'
>> > #include <unistd.h>
>> > #include <asm/unistd.h>
>> > int main(void) {
>> > unsigned long nr = ~0xffffffffUL | __NR_exit;
>> > return !!syscall(nr, 42, 1, 2, 3, 4, 5);
>> > }
>> > EOF
>> > $ ./a.out; echo \$?=$?
>> > $?=42
>> > $ strace -enone ./a.out
>> > syscall_18446744069414584380(0x2a, 0x1, 0x2, 0x3, 0x4, 0x5) = -1 (errno 38)
>> > +++ exited with 1 +++
>> >
>> > This change syncs CONFIG_X86_X32=y fastpath behaviour with the case
>> > when CONFIG_X86_X32 is not enabled.
>>
>> Do you see real brokenness here, or is it just weird?
>
> It's definitely broken. A syscall should be either valid or invalid
> regardless of implementation peculiarities like fastpath vs slowpath.
>
>> > Fixes: fca460f95e92 ("x32: Handle the x32 system call flag")
>> > Cc: [email protected]
>> > Signed-off-by: Dmitry V. Levin <[email protected]>
>> > ---
>> > arch/x86/entry/entry_64.S | 8 +++-----
>> > 1 file changed, 3 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>> > index 4916725..3bab6af 100644
>> > --- a/arch/x86/entry/entry_64.S
>> > +++ b/arch/x86/entry/entry_64.S
>> > @@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
>> > */
>> > TRACE_IRQS_ON
>> > ENABLE_INTERRUPTS(CLBR_NONE)
>> > -#if __SYSCALL_MASK == ~0
>> > - cmpq $__NR_syscall_max, %rax
>> > -#else
>> > - andl $__SYSCALL_MASK, %eax
>> > - cmpl $__NR_syscall_max, %eax
>> > +#if __SYSCALL_MASK != ~0
>> > + andq $__SYSCALL_MASK, %rax
>> > #endif
>> > + cmpq $__NR_syscall_max, %rax
>>
>> I don't know much about x32 userspace, but there's an argument that
>> the high bits *should* be masked off if the x32 bit is set.
>
> Why?

Because it always worked that way.

That being said, I'd be okay with applying your patch and seeing
whether anything breaks. Ingo?

>
>
> --
> ldv

2017-09-15 05:32:00

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y


* Andy Lutomirski <[email protected]> wrote:

> >> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> >> > index 4916725..3bab6af 100644
> >> > --- a/arch/x86/entry/entry_64.S
> >> > +++ b/arch/x86/entry/entry_64.S
> >> > @@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
> >> > */
> >> > TRACE_IRQS_ON
> >> > ENABLE_INTERRUPTS(CLBR_NONE)
> >> > -#if __SYSCALL_MASK == ~0
> >> > - cmpq $__NR_syscall_max, %rax
> >> > -#else
> >> > - andl $__SYSCALL_MASK, %eax
> >> > - cmpl $__NR_syscall_max, %eax
> >> > +#if __SYSCALL_MASK != ~0
> >> > + andq $__SYSCALL_MASK, %rax
> >> > #endif
> >> > + cmpq $__NR_syscall_max, %rax
> >>
> >> I don't know much about x32 userspace, but there's an argument that
> >> the high bits *should* be masked off if the x32 bit is set.
> >
> > Why?
>
> Because it always worked that way.
>
> That being said, I'd be okay with applying your patch and seeing
> whether anything breaks. Ingo?

So I believe this was introduced with x32 as a 'fresh, modern syscall ABI'
behavioral aspect, because we wanted to protect the overly complex syscall entry
code from 'weird' input values. IIRC there was an old bug where we'd overflow the
syscall table in certain circumstances ...

But our new, redesigned entry code is a lot less complex, a lot more readable and
a lot more maintainable (not to mention a lot more robust), so if invalid RAX
values with high bits set get reliably turned into -ENOSYS or such then I'd not
mind the patch per se either, as a general consistency improvement.

Of course if something in x32 user-land breaks then this turns into an ABI and we
have to reintroduce this aspect, as a quirk :-/

It should also improve x32 syscall performance a tiny bit, right? So might be
worth a try on various grounds.

( Another future advantage would be that _maybe_ we could use the high RAX
component as an extra (64-bit only) special argument of sorts. Not that I can
think of any such use right now. )

Thanks,

Ingo

2017-09-15 05:53:30

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On September 14, 2017 10:31:55 PM PDT, Ingo Molnar <[email protected]> wrote:
>
>* Andy Lutomirski <[email protected]> wrote:
>
>> >> > diff --git a/arch/x86/entry/entry_64.S
>b/arch/x86/entry/entry_64.S
>> >> > index 4916725..3bab6af 100644
>> >> > --- a/arch/x86/entry/entry_64.S
>> >> > +++ b/arch/x86/entry/entry_64.S
>> >> > @@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
>> >> > */
>> >> > TRACE_IRQS_ON
>> >> > ENABLE_INTERRUPTS(CLBR_NONE)
>> >> > -#if __SYSCALL_MASK == ~0
>> >> > - cmpq $__NR_syscall_max, %rax
>> >> > -#else
>> >> > - andl $__SYSCALL_MASK, %eax
>> >> > - cmpl $__NR_syscall_max, %eax
>> >> > +#if __SYSCALL_MASK != ~0
>> >> > + andq $__SYSCALL_MASK, %rax
>> >> > #endif
>> >> > + cmpq $__NR_syscall_max, %rax
>> >>
>> >> I don't know much about x32 userspace, but there's an argument
>that
>> >> the high bits *should* be masked off if the x32 bit is set.
>> >
>> > Why?
>>
>> Because it always worked that way.
>>
>> That being said, I'd be okay with applying your patch and seeing
>> whether anything breaks. Ingo?
>
>So I believe this was introduced with x32 as a 'fresh, modern syscall
>ABI'
>behavioral aspect, because we wanted to protect the overly complex
>syscall entry
>code from 'weird' input values. IIRC there was an old bug where we'd
>overflow the
>syscall table in certain circumstances ...
>
>But our new, redesigned entry code is a lot less complex, a lot more
>readable and
>a lot more maintainable (not to mention a lot more robust), so if
>invalid RAX
>values with high bits set get reliably turned into -ENOSYS or such then
>I'd not
>mind the patch per se either, as a general consistency improvement.
>
>Of course if something in x32 user-land breaks then this turns into an
>ABI and we
>have to reintroduce this aspect, as a quirk :-/
>
>It should also improve x32 syscall performance a tiny bit, right? So
>might be
>worth a try on various grounds.
>
>( Another future advantage would be that _maybe_ we could use the high
>RAX
>component as an extra (64-bit only) special argument of sorts. Not that
>I can
> think of any such use right now. )
>
>Thanks,
>
> Ingo

x32 should be sharing the native 64-but entry code, by design. We have had the latter mask the upper 32 bits, so we should do that for x32 as well.

There seems to be little if no motivation to ever change both these ABIs.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2017-09-15 05:54:39

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On September 14, 2017 10:31:55 PM PDT, Ingo Molnar <[email protected]> wrote:
>
>* Andy Lutomirski <[email protected]> wrote:
>
>> >> > diff --git a/arch/x86/entry/entry_64.S
>b/arch/x86/entry/entry_64.S
>> >> > index 4916725..3bab6af 100644
>> >> > --- a/arch/x86/entry/entry_64.S
>> >> > +++ b/arch/x86/entry/entry_64.S
>> >> > @@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
>> >> > */
>> >> > TRACE_IRQS_ON
>> >> > ENABLE_INTERRUPTS(CLBR_NONE)
>> >> > -#if __SYSCALL_MASK == ~0
>> >> > - cmpq $__NR_syscall_max, %rax
>> >> > -#else
>> >> > - andl $__SYSCALL_MASK, %eax
>> >> > - cmpl $__NR_syscall_max, %eax
>> >> > +#if __SYSCALL_MASK != ~0
>> >> > + andq $__SYSCALL_MASK, %rax
>> >> > #endif
>> >> > + cmpq $__NR_syscall_max, %rax
>> >>
>> >> I don't know much about x32 userspace, but there's an argument
>that
>> >> the high bits *should* be masked off if the x32 bit is set.
>> >
>> > Why?
>>
>> Because it always worked that way.
>>
>> That being said, I'd be okay with applying your patch and seeing
>> whether anything breaks. Ingo?
>
>So I believe this was introduced with x32 as a 'fresh, modern syscall
>ABI'
>behavioral aspect, because we wanted to protect the overly complex
>syscall entry
>code from 'weird' input values. IIRC there was an old bug where we'd
>overflow the
>syscall table in certain circumstances ...
>
>But our new, redesigned entry code is a lot less complex, a lot more
>readable and
>a lot more maintainable (not to mention a lot more robust), so if
>invalid RAX
>values with high bits set get reliably turned into -ENOSYS or such then
>I'd not
>mind the patch per se either, as a general consistency improvement.
>
>Of course if something in x32 user-land breaks then this turns into an
>ABI and we
>have to reintroduce this aspect, as a quirk :-/
>
>It should also improve x32 syscall performance a tiny bit, right? So
>might be
>worth a try on various grounds.
>
>( Another future advantage would be that _maybe_ we could use the high
>RAX
>component as an extra (64-bit only) special argument of sorts. Not that
>I can
> think of any such use right now. )
>
>Thanks,
>
> Ingo

If the consensus is that we should change the x86-64 ABI then we should change the x32 ABI to match, though.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2017-09-15 16:12:47

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On 09/14, Dmitry V. Levin wrote:
>
> On Wed, Sep 13, 2017 at 06:49:21PM +0200, Oleg Nesterov wrote:
>
> > do_syscall_64() does "nr & __SYSCALL_MASK" unconditionally,
>
> yes
>
> > this clears the upper bits, no?
>
> Why? As "nr" is of type "unsigned long" and __SYSCALL_MASK is either
> (~(__X32_SYSCALL_BIT)) or (~0), that is, an integer with the sign bit set,

Yes, thanks for correcting me, it is "int" but somehow I thought it is
"unsigned int".

Oleg.

2017-09-17 16:45:28

by Dmitry V. Levin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/64: do not clear high 32 bits of syscall number when CONFIG_X86_X32=y

On Thu, Sep 14, 2017 at 10:46:37PM -0700, [email protected] wrote:
> On September 14, 2017 10:31:55 PM PDT, Ingo Molnar <[email protected]> wrote:
> >
> >* Andy Lutomirski <[email protected]> wrote:
> >
> >> >> > diff --git a/arch/x86/entry/entry_64.S
> >b/arch/x86/entry/entry_64.S
> >> >> > index 4916725..3bab6af 100644
> >> >> > --- a/arch/x86/entry/entry_64.S
> >> >> > +++ b/arch/x86/entry/entry_64.S
> >> >> > @@ -185,12 +185,10 @@ entry_SYSCALL_64_fastpath:
> >> >> > */
> >> >> > TRACE_IRQS_ON
> >> >> > ENABLE_INTERRUPTS(CLBR_NONE)
> >> >> > -#if __SYSCALL_MASK == ~0
> >> >> > - cmpq $__NR_syscall_max, %rax
> >> >> > -#else
> >> >> > - andl $__SYSCALL_MASK, %eax
> >> >> > - cmpl $__NR_syscall_max, %eax
> >> >> > +#if __SYSCALL_MASK != ~0
> >> >> > + andq $__SYSCALL_MASK, %rax
> >> >> > #endif
> >> >> > + cmpq $__NR_syscall_max, %rax
> >> >>
> >> >> I don't know much about x32 userspace, but there's an argument
> >that
> >> >> the high bits *should* be masked off if the x32 bit is set.
> >> >
> >> > Why?
> >>
> >> Because it always worked that way.
> >>
> >> That being said, I'd be okay with applying your patch and seeing
> >> whether anything breaks. Ingo?
> >
> >So I believe this was introduced with x32 as a 'fresh, modern syscall
> >ABI'
> >behavioral aspect, because we wanted to protect the overly complex
> >syscall entry
> >code from 'weird' input values. IIRC there was an old bug where we'd
> >overflow the
> >syscall table in certain circumstances ...
> >
> >But our new, redesigned entry code is a lot less complex, a lot more
> >readable and
> >a lot more maintainable (not to mention a lot more robust), so if
> >invalid RAX
> >values with high bits set get reliably turned into -ENOSYS or such then
> >I'd not
> >mind the patch per se either, as a general consistency improvement.
> >
> >Of course if something in x32 user-land breaks then this turns into an
> >ABI and we
> >have to reintroduce this aspect, as a quirk :-/
> >
> >It should also improve x32 syscall performance a tiny bit, right? So
> >might be
> >worth a try on various grounds.
> >
> >( Another future advantage would be that _maybe_ we could use the high
> >RAX
> >component as an extra (64-bit only) special argument of sorts. Not that
> >I can
> > think of any such use right now. )
> >
> >Thanks,
> >
> > Ingo
>
> x32 should be sharing the native 64-but entry code, by design.
> We have had the latter mask the upper 32 bits,

There must be some misunderstanding on your side: the clearing
of the upper 32 bits of 64-bit syscall numbers currently happens
if and only if fastpath and CONFIG_X86_X32_ABI is enabled.


--
ldv


Attachments:
(No filename) (2.65 kB)
signature.asc (801.00 B)
Download all attachments