When we're stopped at syscall entry tracing, ptrace can change the %eax
value from -ENOSYS to something else. If no system call is actually made
because the syscall number (now in orig_eax) is bad, then the %eax value
set by ptrace should be returned to the user. But, instead it gets reset
to -ENOSYS again. This is a regression from the native 32-bit kernel.
This change fixes it by leaving the return value alone after entry tracing.
Signed-off-by: Roland McGrath <[email protected]>
---
arch/x86/ia32/ia32entry.S | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index b42d009..eafb0b9 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -325,7 +325,7 @@ ENTRY(ia32_syscall)
jnz ia32_tracesys
ia32_do_syscall:
cmpl $(IA32_NR_syscalls-1),%eax
- ja ia32_badsys
+ ja int_ret_from_sys_call /* ia32_tracesys has set RAX(%rsp) */
IA32_ARG_FIXUP
call *ia32_sys_call_table(,%rax,8) # xxx: rip relative
ia32_sysret:
@@ -335,7 +335,7 @@ ia32_sysret:
ia32_tracesys:
SAVE_REST
CLEAR_RREGS
- movq $-ENOSYS,RAX(%rsp) /* really needed? */
+ movq $-ENOSYS,RAX(%rsp) /* ptrace can change this for a bad syscall */
movq %rsp,%rdi /* &pt_regs -> arg1 */
call syscall_trace_enter
LOAD_ARGS32 ARGOFFSET /* reload args from stack in case ptrace changed it */
When we're stopped at syscall entry tracing, ptrace can change the %rax
value from -ENOSYS to something else. If no system call is actually made
because the syscall number (now in orig_rax) is bad, then we now always
reset %rax to -ENOSYS again.
This changes it to leave the return value alone after entry tracing.
That way, the %rax value set by ptrace is there to be seen in user mode
(or in syscall exit tracing). This is consistent with what the 32-bit
kernel does.
Signed-off-by: Roland McGrath <[email protected]>
---
arch/x86/kernel/entry_64.S | 8 +++-----
1 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index c20c9e7..556a8df 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -319,19 +319,17 @@ badsys:
/* Do syscall tracing */
tracesys:
SAVE_REST
- movq $-ENOSYS,RAX(%rsp)
+ movq $-ENOSYS,RAX(%rsp) /* ptrace can change this for a bad syscall */
FIXUP_TOP_OF_STACK %rdi
movq %rsp,%rdi
call syscall_trace_enter
LOAD_ARGS ARGOFFSET /* reload args from stack in case ptrace changed it */
RESTORE_REST
cmpq $__NR_syscall_max,%rax
- movq $-ENOSYS,%rcx
- cmova %rcx,%rax
- ja 1f
+ ja int_ret_from_sys_call /* RAX(%rsp) set to -ENOSYS above */
movq %r10,%rcx /* fixup for C */
call *sys_call_table(,%rax,8)
-1: movq %rax,RAX-ARGOFFSET(%rsp)
+ movq %rax,RAX-ARGOFFSET(%rsp)
/* Use IRET because user could have changed frame */
/*
The previous "x86_64 ia32 ptrace vs -ENOSYS" fix only covered
the int $0x80 system call entries. This does the same fix
for the sysenter and syscall instruction paths.
Signed-off-by: Roland McGrath <[email protected]>
---
arch/x86/ia32/ia32entry.S | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index eafb0b9..b5e329d 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -162,12 +162,14 @@ sysenter_tracesys:
SAVE_REST
CLEAR_RREGS
movq %r9,R9(%rsp)
- movq $-ENOSYS,RAX(%rsp) /* really needed? */
+ movq $-ENOSYS,RAX(%rsp)/* ptrace can change this for a bad syscall */
movq %rsp,%rdi /* &pt_regs -> arg1 */
call syscall_trace_enter
LOAD_ARGS32 ARGOFFSET /* reload args from stack in case ptrace changed it */
RESTORE_REST
xchgl %ebp,%r9d
+ cmpl $(IA32_NR_syscalls-1),%eax
+ ja int_ret_from_sys_call /* sysenter_tracesys has set RAX(%rsp) */
jmp sysenter_do_call
CFI_ENDPROC
ENDPROC(ia32_sysenter_target)
@@ -261,13 +263,15 @@ cstar_tracesys:
SAVE_REST
CLEAR_RREGS
movq %r9,R9(%rsp)
- movq $-ENOSYS,RAX(%rsp) /* really needed? */
+ movq $-ENOSYS,RAX(%rsp) /* ptrace can change this for a bad syscall */
movq %rsp,%rdi /* &pt_regs -> arg1 */
call syscall_trace_enter
LOAD_ARGS32 ARGOFFSET /* reload args from stack in case ptrace changed it */
RESTORE_REST
xchgl %ebp,%r9d
movl RSP-ARGOFFSET(%rsp), %r8d
+ cmpl $(IA32_NR_syscalls-1),%eax
+ ja int_ret_from_sys_call /* cstar_tracesys has set RAX(%rsp) */
jmp cstar_do_call
END(ia32_cstar_target)
On Sun, 16 Mar 2008 21:57:41 -0700 (PDT)
Roland McGrath <[email protected]> wrote:
> When we're stopped at syscall entry tracing, ptrace can change the %eax
> value from -ENOSYS to something else. If no system call is actually made
> because the syscall number (now in orig_eax) is bad, then the %eax value
> set by ptrace should be returned to the user. But, instead it gets reset
> to -ENOSYS again. This is a regression from the native 32-bit kernel.
>
> This change fixes it by leaving the return value alone after entry tracing.
It is unclear (to me) what are the consequences of this problem? What
are the user-visible effects of the fix?
IOW: is it a 2.6.25 thing and if so, why?
Thanks.
> It is unclear (to me) what are the consequences of this problem? What
> are the user-visible effects of the fix?
I don't know off hand of something that cares. (I have a test case, but
it's just contrived for the purpose.) Something like UML could use this to
approximate PTRACE_SYSEMU when it's not there. But we would have heard
before now if UML cared about this particular behavior on x86_64.
> IOW: is it a 2.6.25 thing and if so, why?
No hurry. It's been broken forever (regression vs native 32-bit).
(OTOH, the fixes are quite safe.)
Thanks,
Roland
* Roland McGrath <[email protected]> wrote:
> > IOW: is it a 2.6.25 thing and if so, why?
>
> No hurry. It's been broken forever (regression vs native 32-bit).
> (OTOH, the fixes are quite safe.)
thanks Roland, i've applied your fixes.
Ingo