kill(..., SIGKILL) doesn't work to kill host-OS processes created in
the exec path in TT mode --- for this we need PTRACE_KILL (it did work
in previous kernels, but not by design). Without this process will
accumulate on the host-OS (although the won't be visible inside UML).
Signed-off-by: Chris Wedgwood <[email protected]>
---
Yes, there are other fixes along these lines which are needed but one
at a time as we test these...
Index: cw-current/arch/um/kernel/tt/exec_user.c
===================================================================
--- cw-current.orig/arch/um/kernel/tt/exec_user.c 2004-11-03 02:10:18.064830204 -0800
+++ cw-current/arch/um/kernel/tt/exec_user.c 2004-11-03 02:12:10.447716745 -0800
@@ -35,7 +35,8 @@
tracer_panic("do_exec failed to get registers - errno = %d",
errno);
- kill(old_pid, SIGKILL);
+ if (ptrace(PTRACE_KILL, old_pid, NULL, NULL))
+ printk("Warning: ptrace(PTRACE_KILL, %d, ...) saw %d\n", errno);
if(ptrace_setregs(new_pid, regs) < 0)
tracer_panic("do_exec failed to start new proc - errno = %d",
On Wed, 2004-11-03 at 11:37, Chris Wedgwood wrote:
> kill(..., SIGKILL) doesn't work to kill host-OS processes created in
> the exec path in TT mode --- for this we need PTRACE_KILL (it did work
> in previous kernels, but not by design). Without this process will
> accumulate on the host-OS (although the won't be visible inside UML).
>
> Signed-off-by: Chris Wedgwood <[email protected]>
> ---
>
> Yes, there are other fixes along these lines which are needed but one
> at a time as we test these...
>
> Index: cw-current/arch/um/kernel/tt/exec_user.c
> ===================================================================
> --- cw-current.orig/arch/um/kernel/tt/exec_user.c 2004-11-03 02:10:18.064830204 -0800
> +++ cw-current/arch/um/kernel/tt/exec_user.c 2004-11-03 02:12:10.447716745 -0800
> @@ -35,7 +35,8 @@
> tracer_panic("do_exec failed to get registers - errno = %d",
> errno);
>
> - kill(old_pid, SIGKILL);
> + if (ptrace(PTRACE_KILL, old_pid, NULL, NULL))
> + printk("Warning: ptrace(PTRACE_KILL, %d, ...) saw %d\n", errno);
You have two %d but only one argument. You seem to have forgotten an
"old_pid, " in there.
>
> if(ptrace_setregs(new_pid, regs) < 0)
> tracer_panic("do_exec failed to start new proc - errno = %d",
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/, http://www-stu.christs.cam.ac.uk/~aia21/
On Wed, Nov 03, 2004 at 11:47:38AM +0000, Anton Altaparmakov wrote:
> You have two %d but only one argument. You seem to have forgotten
> an "old_pid, " in there.
doh! it's a warning if something iffy happens (which so far for me it
hasn't) which explains why i missed it... not sure why i didn't get a
build warning though... thanks!
---
kill(..., SIGKILL) doesn't work to kill host-OS processes created in
the exec path in TT mode --- for this we need PTRACE_KILL (it did work
in previous kernels, but not by design). Without this process will
accumulate on the host-OS (although the won't be visible inside UML).
Signed-off-by: Chris Wedgwood <[email protected]>
---
Index: cw-current/arch/um/kernel/tt/exec_user.c
===================================================================
--- cw-current.orig/arch/um/kernel/tt/exec_user.c 2004-11-03 02:10:18.064830204 -0800
+++ cw-current/arch/um/kernel/tt/exec_user.c 2004-11-03 04:05:00.435843464 -0800
@@ -35,7 +35,8 @@
tracer_panic("do_exec failed to get registers - errno = %d",
errno);
- kill(old_pid, SIGKILL);
+ if (ptrace(PTRACE_KILL, old_pid, NULL, NULL))
+ printk("Warning: ptrace(PTRACE_KILL, %d, ...) saw %d\n", old_pid, errno);
if(ptrace_setregs(new_pid, regs) < 0)
tracer_panic("do_exec failed to start new proc - errno = %d",
On Wednesday 03 November 2004 13:08, Chris Wedgwood wrote:
> On Wed, Nov 03, 2004 at 11:47:38AM +0000, Anton Altaparmakov wrote:
> > You have two %d but only one argument. You seem to have forgotten
> > an "old_pid, " in there.
> kill(..., SIGKILL) doesn't work to kill host-OS processes created in
> the exec path in TT mode --- for this we need PTRACE_KILL (it did work
> in previous kernels, but not by design). Without this process will
> accumulate on the host-OS (although the won't be visible inside UML).
> Signed-off-by: Chris Wedgwood <[email protected]>
> ---
>
> Index: cw-current/arch/um/kernel/tt/exec_user.c
> ===================================================================
> --- cw-current.orig/arch/um/kernel/tt/exec_user.c 2004-11-03
> 02:10:18.064830204 -0800 +++
> cw-current/arch/um/kernel/tt/exec_user.c 2004-11-03 04:05:00.435843464
> -0800 @@ -35,7 +35,8 @@
> tracer_panic("do_exec failed to get registers - errno = %d",
> errno);
>
> - kill(old_pid, SIGKILL);
> + if (ptrace(PTRACE_KILL, old_pid, NULL, NULL))
> + printk("Warning: ptrace(PTRACE_KILL, %d, ...) saw %d\n", old_pid,
> errno);
>
> if(ptrace_setregs(new_pid, regs) < 0)
> tracer_panic("do_exec failed to start new proc - errno = %d",
I'm going to test this. I thought that Gerd Knorr patch (which I sent cc'ing
LKML and most of you) already solved this (I actually modified that one,
replacing his SIGCONT kill()ing with a PTRACE_KILL, but I did this in the
places he identified). I guess that old_pid should either already be dead
there or going to die after a little, but I'm going to check (after I get UML
to run in the current snapshot...)
For now, please hold on this.
--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
On Wed, Nov 03, 2004 at 08:28:44PM +0100, Blaisorblade wrote:
> I'm going to test this.
please do
> I thought that Gerd Knorr patch (which I sent cc'ing LKML and most
> of you) already solved this (I actually modified that one, replacing
> his SIGCONT kill()ing with a PTRACE_KILL, but I did this in the
> places he identified).
it might, i'm going to check soon
what worries me is that two very different code paths might be fixing
the same problem which makes me think the flow of execution here is
very vague and needs cleaninng up
also, check your return values for errors --- i bet you will see
some. os_kill_process has this problem too --- many invocations of it
are pointless and fail, especially those from relase_thread_tt (i need
to check the details here, this is all from memory and im getting old)
> I guess that old_pid should either already be dead there or going to
> die after a little, but I'm going to check (after I get UML to run
> in the current snapshot...)
it should build pretty close to as-is, if not let me know and i'll
sent you what i have
> I'm going to test this. I thought that Gerd Knorr patch (which I sent cc'ing
> LKML and most of you) already solved this (I actually modified that one,
Not sure whenever tt is fixed with my patch, I've tested skas only (I'm
building skas-only dynamically linked kernels these days because due to
working on x11 framebuffer stuff which needs dynamically linked libX11).
So if Chris actually tested TT then his patch probably is ok and needed
as well ...
Gerd
--
#define printk(args...) fprintf(stderr, ## args)
On Wed, Nov 03, 2004 at 09:18:36PM +0100, Gerd Knorr wrote:
> Not sure whenever tt is fixed with my patch, I've tested skas only
> (I'm building skas-only dynamically linked kernels these days
> because due to working on x11 framebuffer stuff which needs
> dynamically linked libX11).
it would be nice to find a way to make this work TT mode for you
> So if Chris actually tested TT then his patch probably is ok and
> needed as well ...
i'm only using TT mode at present, i don't check esoteric modes that
require host-OS patches at present
On Wednesday 03 November 2004 21:09, Chris Wedgwood wrote:
> On Wed, Nov 03, 2004 at 08:28:44PM +0100, Blaisorblade wrote:
> > I'm going to test this.
>
> please do
The Gerd / mine patch seem to work nicely. No thread is left over.
> > I thought that Gerd Knorr patch (which I sent cc'ing LKML and most
> > of you) already solved this (I actually modified that one, replacing
> > his SIGCONT kill()ing with a PTRACE_KILL, but I did this in the
> > places he identified).
> it might, i'm going to check soon
> what worries me is that two very different code paths might be fixing
> the same problem which makes me think the flow of execution here is
> very vague and needs cleaninng up
Yes, but the cleanup should first be revierwed by Jeff Dike and put in his
tree. I too don't understand the code. Don't CC Andrew on the first release
of the cleanup at least.
To go to the actual fact, the code path I'm fixing didn't exist till a few
releases ago.
Follows is the patch containing this fix against 2.4.27, with the changelog,
from Jeff Dike.
This fixes a use-after-free bug in the context switching. A process going out
of context after exiting wakes up the next process and then kills itself. The
problem is that when it gets around to killing itself is up to the host and
can happen a long time later, including after the incoming process has freed
its stack, and that memory is possibly being used for something else.
The fix is to have the incoming process kill the exiting process just to
make sure it can't be running at the point that its stack is freed.
um-linux-2.4.27-paolo/arch/um/kernel/tt/process_kern.c | 13 ++++++++++++-
1 files changed, 12 insertions(+), 1 deletion(-)
diff -puN arch/um/kernel/tt/process_kern.c~scheduler
arch/um/kernel/tt/process_kern.c
--- um-linux-2.4.27/arch/um/kernel/tt/process_kern.c~scheduler 2004-11-02
11:12:28.805815264 +0100
+++ um-linux-2.4.27-paolo/arch/um/kernel/tt/process_kern.c 2004-11-02
11:12:28.807814960 +0100
@@ -25,7 +25,7 @@
void *_switch_to_tt(void *prev, void *next)
{
- struct task_struct *from, *to;
+ struct task_struct *from, *to, *prev_sched;
unsigned long flags;
int err, vtalrm, alrm, prof, cpu;
char c;
@@ -67,6 +67,17 @@ void *_switch_to_tt(void *prev, void *ne
if(err != sizeof(c))
panic("read of switch_pipe failed, errno = %d", -err);
+ /* If the process that we have just scheduled away from has exited,
+ * then it needs to be killed here. The reason is that, even though
+ * it will kill itself when it next runs, that may be too late. Its
+ * stack will be freed, possibly before then, and if that happens,
+ * we have a use-after-free situation. So, it gets killed here
+ * in case it has not already killed itself.
+ */
+ prev_sched = current->thread.prev_sched;
+ if(prev_sched->state == TASK_ZOMBIE)
+ os_kill_process(prev_sched->thread.mode.tt.extern_pid, 1);
+
/* This works around a nasty race with 'jail'. If we are switching
* between two threads of a threaded app and the incoming process
* runs before the outgoing process reaches the read, and it makes
> also, check your return values for errors --- i bet you will see
> some.
That is a separate, on-top of this patch.
> os_kill_process has this problem too --- many invocations of it
> are pointless and fail, especially those from relase_thread_tt (i need
> to check the details here, this is all from memory and im getting old)
> > I guess that old_pid should either already be dead there or going to
> > die after a little, but I'm going to check (after I get UML to run
> > in the current snapshot...)
> it should build pretty close to as-is, if not let me know and i'll
> sent you what i have
No problem actually in UML, the build environment instead is very complicate.
It refuses to work with NPTL, so I must build it in my old Mandrake chroot
(it was a distro, now I keep it just for compilation).
Now, I must also run it on the Gentoo, or it gives problems... however, don't
worry.
I'm also going to merge some patches (21 it should be).
Bye
--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
On Wednesday 03 November 2004 21:18, Gerd Knorr wrote:
> > I'm going to test this. I thought that Gerd Knorr patch (which I sent
> > cc'ing LKML and most of you) already solved this (I actually modified
> > that one,
> Not sure whenever tt is fixed with my patch, I've tested skas only
I've tested it, and it works in fact.
> (I'm
> building skas-only dynamically linked kernels these days because due to
> working on x11 framebuffer stuff which needs dynamically linked libX11).
> So if Chris actually tested TT then his patch probably is ok and needed
> as well ...
Actually, from looking at Jeff Dike comment, it seem that the new kill (the
one you've changed) is executed earlier and more reliably than the 1st one,
the one Chris identified.
So we could ask Jeff to see if he can remove the earlier kill, the one
affected by the Chris Wedgwood.
--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
On Wednesday 03 November 2004 21:48, Chris Wedgwood wrote:
> On Wed, Nov 03, 2004 at 09:18:36PM +0100, Gerd Knorr wrote:
> > Not sure whenever tt is fixed with my patch, I've tested skas only
> > (I'm building skas-only dynamically linked kernels these days
> > because due to working on x11 framebuffer stuff which needs
> > dynamically linked libX11).
> it would be nice to find a way to make this work TT mode for you
> > So if Chris actually tested TT then his patch probably is ok and
> > needed as well ...
> i'm only using TT mode at present, i don't check esoteric modes that
> require host-OS patches at present
SKAS mode is currently more used and tested than TT mode. And the SKAS patch
is included at least in WOLK, the SuSE kernel, and the -cko patchset (Con
Kolivas Overloaded).
That said, feel free not to use it, but at least test that it compiles in SKAS
mode.
Bye
--
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729