This patch fixes a small problem that occurs when sys_clone is called from a
vsyscall and a new stack for the child is specified.
Some registers and the return address are pushed onto the stack before sysenter,
so when the syscall returns, the vsyscall code pops those registers and returns.
If a new stack was set up for the thread, that data must be copied to the new
stack or the thread will segfault.
Sorry for using the 0xffffe410 constant; I didn't find another way to reference it.
And sorry for my English :S
Signed-off-by: Daniel F <[email protected]>
--- /usr/src/linux-2.6.14/arch/i386/kernel/process.c.orig 2005-12-13 17:06:46.000000000 +0000
+++ /usr/src/linux-2.6.14/arch/i386/kernel/process.c 2006-01-11 15:19:01.000000000 +0000
@@ -446,6 +446,7 @@ int copy_thread(int nr, unsigned long cl
{
struct pt_regs * childregs;
struct task_struct *tsk;
+ unsigned long vsyscall_stack[4] ;
int err;
childregs = ((struct pt_regs *) (THREAD_SIZE + (unsigned long) p->thread_info)) - 1;
@@ -462,6 +463,17 @@ int copy_thread(int nr, unsigned long cl
childregs = (struct pt_regs *) ((unsigned long) childregs - 8);
*childregs = *regs;
childregs->eax = 0;
+ /*
+ * When we were called from a vsyscall some things are pushed on the stack,
+ * we need to copy the stuff to the new stack.
+ */
+ if ((regs->esp != esp) && (regs->eip == 0xffffe410)) {
+ if (copy_from_user(vsyscall_stack,(void *)regs->esp,sizeof(vsyscall_stack)))
+ return -EFAULT ;
+ if (copy_to_user((void *)esp-sizeof(vsyscall_stack),vsyscall_stack,sizeof(vsyscall_stack)))
+ return -EFAULT ;
+ esp -= sizeof(vsyscall_stack) ;
+ }
childregs->esp = esp;
p->thread.esp = (unsigned long) childregs;
In-Reply-To: <[email protected]>
On Wed, 11 Jan 2006 at 17:53:10 +0000, erg0t wrote:
> This patch fixes a small problem that occurs when sys_clone is called from
> a vsyscall and a new stack for the child is specified.
> Some registers and the return address are pushed onto the stack before
> sysenter, so when the syscall returns, the vsyscall code pops those
> registers and returns. If a new stack was set up for the thread, that data
> must be copied to the new stack or the thread will segfault.
glibc works around this bug by hardcoding "int 0x80" at
glibc-2.3.5/sysdeps/unix/sysv/linux/i386/clone.S line 99.
> Sorry for using the 0xffffe410 constant; I didn't find another way
> to reference it.
It's SYSENTER_RETURN.
> Signed-off-by: Daniel F <[email protected]>
Your patch almost works but it copies the stack into the parent's address space.
Using access_process_vm() fixes it. However, that still leaves unfixed the case
where vsyscall-int80 is used.
[patch] i386: fix sys_clone when using vsyscall-sysenter
Fix a problem when sys_clone is called from sysenter vsyscall and a new stack
for the child is specified. Some data needs to be copied from the parent
to the child stack or the child will segfault.
Bug report and initial patch from Daniel F <[email protected]>
Signed-off-by: Chuck Ebbert <[email protected]>
--- 2.6.15a.orig/arch/i386/kernel/process.c
+++ 2.6.15a/arch/i386/kernel/process.c
@@ -429,12 +429,15 @@ void prepare_to_copy(struct task_struct
unlazy_fpu(tsk);
}
+void SYSENTER_RETURN(void);
+
int copy_thread(int nr, unsigned long clone_flags, unsigned long esp,
unsigned long unused,
struct task_struct * p, struct pt_regs * regs)
{
struct pt_regs * childregs;
struct task_struct *tsk;
+ unsigned long vsyscall_stack[4];
int err;
childregs = ((struct pt_regs *) (THREAD_SIZE + (unsigned long) p->thread_info)) - 1;
@@ -451,6 +454,19 @@ int copy_thread(int nr, unsigned long cl
childregs = (struct pt_regs *) ((unsigned long) childregs - 8);
*childregs = *regs;
childregs->eax = 0;
+ /*
+ * When we were called from a vsyscall some things are pushed on the stack;
+ * we need to copy the stuff to the new stack.
+ */
+ if (regs->esp != esp && (void *)regs->eip == SYSENTER_RETURN) {
+ int size = sizeof(vsyscall_stack);
+
+ if (copy_from_user(vsyscall_stack, (void *)regs->esp, size))
+ return -EFAULT;
+ if (access_process_vm(p, esp - size, vsyscall_stack, size, 1) != size)
+ return -EFAULT;
+ esp -= size;
+ }
childregs->esp = esp;
p->thread.esp = (unsigned long) childregs;
--
Chuck
Thanks for your work on the patch
> Your patch almost works but it copies the stack into the parent's address space.
> Using access_process_vm() fixes it. However, that still leaves unfixed the case
> where vsyscall-int80 is used.
I copied the stack into the parent's address space because in this case
the memory is shared, but access_process_vm() is more elegant :).
About vsyscall-int80: I don't know how to test that case on my
computer, but I think a solution could be to
add an INT80H_RETURN symbol:
.LSTART_vsyscall:
int $0x80
.globl INT80H_RETURN
INT80H_RETURN:
ret
.LEND_vsyscall:
and then in process.c:
int size = 0 ;
childregs->eax = 0;
if ((void *)regs->eip == SYSENTER_RETURN)
size = sizeof(vsyscall_stack) ;
if ((void *)regs->eip == INT80H_RETURN)
size = sizeof(unsigned long) ;
if (regs->esp != esp && size) {
if (copy_from_user(vsyscall_stack, (void *)regs->esp, size))
return -EFAULT;
if (access_process_vm(p, esp - size, vsyscall_stack, size, 1) != size)
return -EFAULT;
esp -= size;
}
childregs->esp = esp;
I hope somebody can test if it works that way.
Regards
In-Reply-To: <[email protected]>
On Thu, 26 Jan 2006 at 15:12:08 +0000, Daniel fernandez wrote:
> > Your patch almost works but it copies the stack into the parent's address space.
> > Using access_process_vm() fixes it. However, that still leaves unfixed the case
> > where vsyscall-int80 is used.
>
> I copied the stack into the parent's address space because in this case
> the memory is shared, but access_process_vm() is more elegant :).
My test program (below) doesn't use CLONE_VM. With your patch the stack
data showed up only in the parent process, probably due to copy-on-write,
and the child gets SIGSEGV trying to transfer control to address 0.
> About vsyscall-int80: I don't know how to test that case on my
> computer, but I think a solution could be to
I patched arch/i386/kernel/cpu/common.c to add a boot option "nox86sep"
similar to "nofxsr" and sure enough the test program dies when booting
with that option:
$ ./test_clone2.ex
SIGSEGV accessing 0x00000000 from EIP 0x00000000
cloned; ret = 621
Your fix for this should work fine.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <ucontext.h>
#include <unistd.h>
#ifdef INT80
# define SYSCALL_STR "int $0x80\n\t"
#else
# define SYSCALL_STR "call 0xffffe400\n\t"
#endif
#define CLONE 120
#define FLAGS 0
unsigned long child_stack[4096] __attribute__((__aligned__(4096)));
unsigned long *child_stack_ptr = &child_stack[2047];
struct sigaction sa;
int ret;
static void handler(int nr, siginfo_t *si, void *vuc)
{
ucontext_t *uc = (ucontext_t *)vuc;
struct sigcontext *sc = (struct sigcontext *)&uc->uc_mcontext;
printf("SIGSEGV accessing 0x%08lx from EIP 0x%08lx\n",
(unsigned long)si->si_addr, (unsigned long)sc->eip);
sa.sa_handler = SIG_DFL;
sa.sa_flags = 0;
sigaction(SIGSEGV, &sa, NULL);
}
int main(int argc, char * const argv[])
{
sa.sa_sigaction = handler;
sa.sa_flags = SA_SIGINFO;
sigaction(SIGSEGV, &sa, NULL);
asm volatile(
SYSCALL_STR
: "=a"(ret)
: "a"(CLONE), "b"(FLAGS), "c"(child_stack_ptr)
: "memory"
);
printf("cloned; ret = %d\n", ret);
_exit(0);
}
--
Chuck