2002-01-31 10:42:26

by Nathan Field

[permalink] [raw]
Subject: BUG: PTRACE_POKETEXT modifies memory in related processes

I believe I have found a bug in recent 2.4 linux kernels (at least 2.4.17)
running on x86 related to using ptrace to modify a processes memory. From
a quick scan of the 2.4.17 code it looks like ptrace.c:access_process_vm
is not checking to see if the page of memory it is writing to has write
permission, so it is not properly copy-on-write'ing. This means that if I
have a simple program like this:

main() {
int pid = fork();
while(1) {
printf("pid is: %d\n", pid);
sleep(1);
}
}

and I set a breakpoint on the printf() line in the parent both the parent
and the child will hit that breakpoint.

This bug was not present in any 2.2 kernel that I've tested or in 2.4.2 or
2.4.7 (RH stock kernels for 7.1 and 7.2). Looking at the code for 2.4.14 I
noticed that access_one_page, which is called through access_process_vm,
does a lot more work to make sure that page faults are handled correctly.
I haven't tested 2.4.14 though, so I can't say that it does the correct
thing. I'm not familiar with kernel source, but it looks to me like
get_user_pages is supposed to handle the fault, and it is being passed a
len of 1 rather than the len passed into access_process_vm. Not that I
think that's the bug, just an observation.

I'm more than willing to test out patches, provide more information, or
just have a fireside chat. Please CC replies to my address as I'm on the
daily digest list.

nathan

------------
Nathan Field Root is not something to be shared with strangers.

"It is commonly the case with technologies that you can get the best
insight about how they work by watching them fail."
-- Neal Stephenson


2002-01-31 16:07:14

by Daniel Jacobowitz

[permalink] [raw]
Subject: Re: BUG: PTRACE_POKETEXT modifies memory in related processes

On Thu, Jan 31, 2002 at 02:43:00AM -0800, Nathan Field wrote:
> I believe I have found a bug in recent 2.4 linux kernels (at least 2.4.17)
> running on x86 related to using ptrace to modify a processes memory. From
> a quick scan of the 2.4.17 code it looks like ptrace.c:access_process_vm
> is not checking to see if the page of memory it is writing to has write
> permission, so it is not properly copy-on-write'ing. This means that if I
> have a simple program like this:

I don't think you're correctly understanding the circumstances. Are
you setting the breakpoint after they fork (seems unlikely, given this
test case - not much time to do so)? Otherwise, the breakpoint is
simply being carried over by your forking. Why you see it now and did
not before I don't know.

I see the same behavior with both software and hardware watchpoints.

Basically, GDB on Linux does not support fork very well. It doesn't
show up terribly often, because exec() clears breakpoints.

COW doesn't come into it. The page is being modified before it is
copied.

--
Daniel Jacobowitz Carnegie Mellon University
MontaVista Software Debian GNU/Linux Developer

2002-02-02 04:29:11

by Nathan Field

[permalink] [raw]
Subject: PATCH: Re: BUG: PTRACE_POKETEXT modifies memory in related processes

> > I believe I have found a bug in recent 2.4 linux kernels (at least 2.4.17)
> > running on x86 related to using ptrace to modify a processes memory. From
[snip incorrect guess at cause of problem]
>
> I don't think you're correctly understanding the circumstances. Are
> you setting the breakpoint after they fork (seems unlikely, given this
> test case - not much time to do so)? Otherwise, the breakpoint is
> simply being carried over by your forking. Why you see it now and did
> not before I don't know.
Actually I do understand the circumstances. I am setting the
breakpoint _AFTER_ the fork, when the parent and the child should exist in
different memory spaces.

> Basically, GDB on Linux does not support fork very well. It doesn't
> show up terribly often, because exec() clears breakpoints.
You are right on that, GDB sucks when debugging forks, which is
why I have written my own debugger that can correctly handle forks. This
is why I've come across this problem. Basically I detect a fork, do some
magic, catch the child process before it continues to run and attach to
it. Then I remove all the breakpoints from the child that were in the
parent. The problem is that in 2.4.17 when I remove the breakpoints from
the child they are also removed from the parent. I do know what I'm
talking about here, I'm not trying to remove the breakpoints before the
fork system call, it is after the fork system call (I actually remove the
breakpoints after the fork library call has returned), and the bug is
related to COW because I figured out exactly what's wrong and here's a
patch that fixes it:

--- ptrace.c.orig Fri Feb 1 20:17:18 2002
+++ ptrace.c Fri Feb 1 19:54:58 2002
@@ -148,16 +148,16 @@
int bytes, ret, offset;
void *maddr;

- ret = get_user_pages(current, mm, addr, 1,
- write, 1, &page, &vma);
- if (ret <= 0)
- break;
-
bytes = len;
offset = addr & (PAGE_SIZE-1);
if (bytes > PAGE_SIZE-offset)
bytes = PAGE_SIZE-offset;

+ ret = get_user_pages(current, mm, addr, 1,
+ write, 1, &page, &vma);
+ if (ret <= 0)
+ break;
+
flush_cache_page(vma, addr);

maddr = kmap(page);
@@ -173,6 +173,7 @@
put_page(page);
len -= bytes;
buf += bytes;
+ addr += bytes;
}
up_read(&mm->mmap_sem);
mmput(mm);


Basically the kernel was calling get_user_pages on the same address, even
as it moved through different addresses in memory. I have tested this
change on my own 2.4.17 kernel and it now works correctly.

nathan

------------
Nathan Field Root is not something to be shared with strangers.

"The nice thing about standards is that there's so many to choose from."
-- Andrew S. Tanenbaum

2002-02-02 07:39:51

by Daniel Jacobowitz

[permalink] [raw]
Subject: Re: PATCH: Re: BUG: PTRACE_POKETEXT modifies memory in related processes

On Fri, Feb 01, 2002 at 08:32:21PM -0800, Nathan Field wrote:
> > > I believe I have found a bug in recent 2.4 linux kernels (at least 2.4.17)
> > > running on x86 related to using ptrace to modify a processes memory. From
> [snip incorrect guess at cause of problem]
> >
> > I don't think you're correctly understanding the circumstances. Are
> > you setting the breakpoint after they fork (seems unlikely, given this
> > test case - not much time to do so)? Otherwise, the breakpoint is
> > simply being carried over by your forking. Why you see it now and did
> > not before I don't know.
> Actually I do understand the circumstances. I am setting the
> breakpoint _AFTER_ the fork, when the parent and the child should exist in
> different memory spaces.
>
> > Basically, GDB on Linux does not support fork very well. It doesn't
> > show up terribly often, because exec() clears breakpoints.
> You are right on that, GDB sucks when debugging forks, which is
> why I have written my own debugger that can correctly handle forks. This
> is why I've come across this problem. Basically I detect a fork, do some
> magic, catch the child process before it continues to run and attach to
> it. Then I remove all the breakpoints from the child that were in the
> parent. The problem is that in 2.4.17 when I remove the breakpoints from
> the child they are also removed from the parent. I do know what I'm
> talking about here, I'm not trying to remove the breakpoints before the
> fork system call, it is after the fork system call (I actually remove the
> breakpoints after the fork library call has returned), and the bug is
> related to COW because I figured out exactly what's wrong and here's a
> patch that fixes it:

Sorry then. You didn't give any of this context in the original
message. Had you been using GDB - a reasonable assumption - my
explanation would have been accurate.

I've got the design mostly worked out to make GDB handle fork() better.
I just need to settle on a few details and find some time to finish it.

> --- ptrace.c.orig Fri Feb 1 20:17:18 2002
> +++ ptrace.c Fri Feb 1 19:54:58 2002
> @@ -148,16 +148,16 @@
> int bytes, ret, offset;
> void *maddr;
>
> - ret = get_user_pages(current, mm, addr, 1,
> - write, 1, &page, &vma);
> - if (ret <= 0)
> - break;
> -
> bytes = len;
> offset = addr & (PAGE_SIZE-1);
> if (bytes > PAGE_SIZE-offset)
> bytes = PAGE_SIZE-offset;
>
> + ret = get_user_pages(current, mm, addr, 1,
> + write, 1, &page, &vma);
> + if (ret <= 0)
> + break;
> +
> flush_cache_page(vma, addr);
>
> maddr = kmap(page);

Why is this first half even necessary? I don't see that it makes any
difference. Maybe I'm missing something.

> @@ -173,6 +173,7 @@
> put_page(page);
> len -= bytes;
> buf += bytes;
> + addr += bytes;
> }
> up_read(&mm->mmap_sem);
> mmput(mm);
>
>
> Basically the kernel was calling get_user_pages on the same address, even
> as it moved through different addresses in memory. I have tested this
> change on my own 2.4.17 kernel and it now works correctly.

That'll do it all right. Might want to forward this patch (at least
the latter bit) to Linus and Marcello to make sure they see it.

--
Daniel Jacobowitz Carnegie Mellon University
MontaVista Software Debian GNU/Linux Developer

2002-02-02 09:23:30

by Nathan Field

[permalink] [raw]
Subject: Re: PATCH: Re: BUG: PTRACE_POKETEXT modifies memory in related processes

> Sorry then. You didn't give any of this context in the original
> message. Had you been using GDB - a reasonable assumption - my
> explanation would have been accurate.
Thats okay, years ago I used to lurk on the lkml and I saw tons of
stupid questions. Had I seen my own message without knowing the background
I probably would have assumed the same things. The problem was when I
wrote it I didn't really have the context, I just knew the parent and
child were sharing the same memory space since. After I saw your response
I found out that it was almost impossible to get the problem to reproduce
under gdb. I was all set to write a mini-debugger to show the problem when
a friend suggested we just scan the kernel source.

> I've got the design mostly worked out to make GDB handle fork()
> better. I just need to settle on a few details and find some time to
> finish it.
Personally I'd love to offer to help you out, there are tons of
special cases that are really nasty to deal with that I could help you out
on. Unfortunatly I don't think my company would appreciate it, so all I
can really say is best of luck. It's a fun project, and it'll be
interesting to see how you do it. I've heard that strace can do it by
patching in an infinite loop after the syscall. I do it in a way that is
more arch indep, though only if you have a really capable debugger behind
you.

Do you know if anyone has looked at improving the general
debugging interface? The solaris interface is just beautiful for things
like this. You can catch forks, vforks, exec's and so on without having to
do any crazy run-time modifications. It also works when you don't have
debugging symbols, which is my Achilles heel. If I can't tell where fork
is (think of a shell with a statically linked libc that's been stripped)
then I can't hope to catch it. Since many apps spawn shells which then
spawn other programs I'm just out of luck. Have you come up with a way to
work around this?

[snip silly part of patch]
> Why is this first half even necessary? I don't see that it makes any
> difference. Maybe I'm missing something.
Oops, my mistake. That is a leftover from my first pass at fixing
the problem, and shouldn't be there. Sorry about that, I just submitted a
patch for what it took to get my system to work. I can take that out and
submit a new patch.

In fact, how about this:

--- ptrace.c.orig Fri Feb 1 20:17:18 2002
+++ ptrace.c Sat Feb 2 00:53:43 2002
@@ -173,6 +173,7 @@
put_page(page);
len -= bytes;
buf += bytes;
+ addr += bytes;
}
up_read(&mm->mmap_sem);
mmput(mm);

> That'll do it all right. Might want to forward this patch (at least
> the latter bit) to Linus and Marcello to make sure they see it.
Sure, I'm not a regular on the lkml, what email addresses should I
use to send the patches? Someone else suggested that I also send it to
Dave Jones, but I haven't been able to find an email address for Marcello
through google or in the one daily digest I've gotten so far.
Dave Jones <[email protected]>
Linus Torvalds <[email protected]>
The lkml FAQ at http://www.tux.ork/lkml seems to be a bit out of date on
this :)

nathan

------------
Nathan Field Root is not something to be shared with strangers.

"It is commonly the case with technologies that you can get the best
insight about how they work by watching them fail."
-- Neal Stephenson