2004-11-15 22:55:26

by Roland McGrath

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

> No, TIF_SINGLESTEP gets set even when the _user_ set TF. It is just a flag
> saying that we should re-enable TF when we get back to user space.
>
> So TIF_SINGLESTEP in no way implies that TF was set by a debugger.

Ok, whatever. I'm not really sure its use for the single-step stuff in
Davide Libenzi's changes doesn't change the expected behavior for the
nondebugger case, but it's too early in the morning to think hard about that.

Your change hit only one spot of three in arch/i386/kernel/signal.c where
PT_PTRACED is now tested and it should be a "is PTRACE_SINGLESTEP in effect?"
test. Also the same spots in native and 32-bit emul for x86-64.


Thanks,
Roland


2004-11-19 19:00:09

by Eric Pouech

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

Roland McGrath a ?crit :
>>No, TIF_SINGLESTEP gets set even when the _user_ set TF. It is just a flag
>>saying that we should re-enable TF when we get back to user space.
>>
>>So TIF_SINGLESTEP in no way implies that TF was set by a debugger.
>
>
> Ok, whatever. I'm not really sure its use for the single-step stuff in
> Davide Libenzi's changes doesn't change the expected behavior for the
> nondebugger case, but it's too early in the morning to think hard about that.
>
> Your change hit only one spot of three in arch/i386/kernel/signal.c where
> PT_PTRACED is now tested and it should be a "is PTRACE_SINGLESTEP in effect?"
> test. Also the same spots in native and 32-bit emul for x86-64.
>
>
> Thanks,
> Roland
>
the first patch put in BK by Linus doesn't fix the problem. Any plan to fix the
two other spots Roland mentionned ?
A+

2004-11-19 19:21:03

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Fri, 19 Nov 2004, Eric Pouech wrote:
>
> the first patch put in BK by Linus doesn't fix the problem. Any plan to fix the
> two other spots Roland mentionned ?

Can you just try it? I don't have wine, and since my main machine is
ppc64, and I don't actually have any windows programs to test even on any
of my laptops...

Linus

2004-11-19 19:33:13

by Eric Pouech

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

Linus Torvalds a ?crit :
>
> On Fri, 19 Nov 2004, Eric Pouech wrote:
>
>>the first patch put in BK by Linus doesn't fix the problem. Any plan to fix the
>>two other spots Roland mentionned ?
>
>
> Can you just try it? I don't have wine, and since my main machine is
> ppc64, and I don't actually have any windows programs to test even on any
> of my laptops...
I don't have 2.6.9 installed here, I'm just reporting & interpreting bug reports
we have from end users. I'll try to make on the bug reporters try to fix the
other spots, but that's always easier from them the get the source from one spot.
Thanx for the quick answer anyhow.
A+

2004-11-19 19:57:38

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Fri, 19 Nov 2004, Eric Pouech wrote:
>
> I don't have 2.6.9 installed here, I'm just reporting & interpreting bug reports
> we have from end users. I'll try to make on the bug reporters try to fix the
> other spots, but that's always easier from them the get the source from one spot.

Btw, does wine ever _use_ PTRACE_SINGLESTEP for any of the things it does?

If it does, then that woulc certainly explain why my "fix" made no
difference: my fix _only_ handles the case where the ptracer never
actually asks for single-stepping, and single-stepping was started
entirely by the program being run (ie by setting TF in eflags from within
the program itself).

But if wine ends up using PTRACE_SINGESTEP because wine actually wants to
single-step over some instructions, then the kernel will set the PT_DTRACE
bit, and start tracing through signal handlers too. The way Wine doesn't
want..

Linus

2004-11-19 20:41:47

by Eric Pouech

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

> Btw, does wine ever _use_ PTRACE_SINGLESTEP for any of the things it does?
>
> If it does, then that woulc certainly explain why my "fix" made no
> difference: my fix _only_ handles the case where the ptracer never
> actually asks for single-stepping, and single-stepping was started
> entirely by the program being run (ie by setting TF in eflags from within
> the program itself).
>
> But if wine ends up using PTRACE_SINGESTEP because wine actually wants to
> single-step over some instructions, then the kernel will set the PT_DTRACE
> bit, and start tracing through signal handlers too. The way Wine doesn't
> want..

wine mixes both approches, we have (to control what's generated inside the
various exception) to ptrace from our NT-kernel-like process (the ptracer) to
get the context of the exception. Restart from the ptracer is done with
PTRACE_SINGLESTEP.

(BTW: I also CC:ed wine-devel ML, that might be of interest to them too)

A+

2004-11-19 20:59:47

by Grzegorz Kulewski

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Fri, 19 Nov 2004, Linus Torvalds wrote:
> On Fri, 19 Nov 2004, Eric Pouech wrote:
>>
>> the first patch put in BK by Linus doesn't fix the problem. Any plan to fix the
>> two other spots Roland mentionned ?
>
> Can you just try it? I don't have wine, and since my main machine is
> ppc64, and I don't actually have any windows programs to test even on any
> of my laptops...

You could probably use QEMU to run windows binaries on ppc. It has some
kind of user-mode (per process) emulation and it was designed (at the
begining) exactly to run wine on !x86. I do not know if the wine emulation
is still supported (because Fabrice is mainly working on whole-system
emulation), but you can fix any issues with never wine versions in 5
minutes I will bet two beers... :-)

And some windows programs to test can be found on the Internet.


Thanks,

Grzegorz Kulewski


PS. Thanks for your work Fabrice!

2004-11-19 21:22:35

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Fri, 19 Nov 2004, Eric Pouech wrote:
>
> wine mixes both approches, we have (to control what's generated inside the
> various exception) to ptrace from our NT-kernel-like process (the ptracer) to
> get the context of the exception. Restart from the ptracer is done with
> PTRACE_SINGLESTEP.

Here's a new patch to try. Totally untested.

It is more careful about clearing PT_DTRACED (which by now should probably
be renamed PT_PRACE_SINGLESTEP or something on x86, since we should never
be lazy about this thing any more), and it may or may not help.

Pls test _together_ with the previous patch (which is already applied in
the current top-of-tree for anybody with really recent kernels).

Linus

-----
===== arch/i386/kernel/ptrace.c 1.27 vs edited =====
--- 1.27/arch/i386/kernel/ptrace.c 2004-11-07 18:10:34 -08:00
+++ edited/arch/i386/kernel/ptrace.c 2004-11-19 13:18:56 -08:00
@@ -138,6 +138,26 @@
return retval;
}

+static void set_singlestep(struct task_struct *child)
+{
+ long eflags;
+
+ set_tsk_thread_flag(child, TIF_SINGLESTEP);
+ eflags = get_stack_long(child, EFL_OFFSET);
+ put_stack_long(child, EFL_OFFSET, eflags | TRAP_FLAG);
+ child->ptrace |= PT_DTRACE;
+}
+
+static void clear_singlestep(struct task_struct *child)
+{
+ long eflags;
+
+ clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+ eflags = get_stack_long(child, EFL_OFFSET);
+ put_stack_long(child, EFL_OFFSET, eflags & ~TRAP_FLAG);
+ child->ptrace &= ~PT_DTRACE;
+}
+
/*
* Called by kernel/ptrace.c when detaching..
*
@@ -145,11 +165,7 @@
*/
void ptrace_disable(struct task_struct *child)
{
- long tmp;
-
- clear_tsk_thread_flag(child, TIF_SINGLESTEP);
- tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
- put_stack_long(child, EFL_OFFSET, tmp);
+ clear_singlestep(child);
}

/*
@@ -388,10 +404,8 @@
}
break;

- case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall */
- case PTRACE_CONT: { /* restart after signal. */
- long tmp;
-
+ case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall */
+ case PTRACE_CONT: /* restart after signal. */
ret = -EIO;
if ((unsigned long) data > _NSIG)
break;
@@ -401,56 +415,39 @@
else {
clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
}
- clear_tsk_thread_flag(child, TIF_SINGLESTEP);
child->exit_code = data;
- /* make sure the single step bit is not set. */
- tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
- put_stack_long(child, EFL_OFFSET,tmp);
+ /* make sure the single step bit is not set. */
+ clear_singlestep(child);
wake_up_process(child);
ret = 0;
break;
- }

/*
* make the child exit. Best I can do is send it a sigkill.
* perhaps it should be put in the status that it wants to
* exit.
*/
- case PTRACE_KILL: {
- long tmp;
-
+ case PTRACE_KILL:
ret = 0;
if (child->exit_state == EXIT_ZOMBIE) /* already dead */
break;
child->exit_code = SIGKILL;
- clear_tsk_thread_flag(child, TIF_SINGLESTEP);
/* make sure the single step bit is not set. */
- tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
- put_stack_long(child, EFL_OFFSET, tmp);
+ clear_singlestep(child);
wake_up_process(child);
break;
- }
-
- case PTRACE_SINGLESTEP: { /* set the trap flag. */
- long tmp;

+ case PTRACE_SINGLESTEP: /* set the trap flag. */
ret = -EIO;
if ((unsigned long) data > _NSIG)
break;
clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
- if ((child->ptrace & PT_DTRACE) == 0) {
- /* Spurious delayed TF traps may occur */
- child->ptrace |= PT_DTRACE;
- }
- tmp = get_stack_long(child, EFL_OFFSET) | TRAP_FLAG;
- put_stack_long(child, EFL_OFFSET, tmp);
- set_tsk_thread_flag(child, TIF_SINGLESTEP);
+ set_singlestep(child);
child->exit_code = data;
/* give it a chance to run. */
wake_up_process(child);
ret = 0;
break;
- }

case PTRACE_DETACH:
/* detach a process that was attached. */

2004-11-19 21:24:42

by Daniel Jacobowitz

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Fri, Nov 19, 2004 at 09:41:44PM +0100, Eric Pouech wrote:
> >Btw, does wine ever _use_ PTRACE_SINGLESTEP for any of the things it does?
> >
> >If it does, then that woulc certainly explain why my "fix" made no
> >difference: my fix _only_ handles the case where the ptracer never
> >actually asks for single-stepping, and single-stepping was started
> >entirely by the program being run (ie by setting TF in eflags from within
> >the program itself).
> >
> >But if wine ends up using PTRACE_SINGESTEP because wine actually wants to
> >single-step over some instructions, then the kernel will set the PT_DTRACE
> >bit, and start tracing through signal handlers too. The way Wine doesn't
> >want..
>
> wine mixes both approches, we have (to control what's generated inside the
> various exception) to ptrace from our NT-kernel-like process (the ptracer)
> to get the context of the exception. Restart from the ptracer is done with
> PTRACE_SINGLESTEP.

I'm getting the feeling that the question of whether to step into
signal handlers is orthogonal to single-stepping; maybe it should be a
separate ptrace operation.

Platforms which don't implement PTRACE_SINGLESTEP would probably
appreciate this. A "single step" which stops you after setting up the
signal trampoline and adjusting the PC, before executing any
instructions in the handler.

--
Daniel Jacobowitz

2004-11-19 22:02:52

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Fri, 19 Nov 2004, Daniel Jacobowitz wrote:
>
> I'm getting the feeling that the question of whether to step into
> signal handlers is orthogonal to single-stepping; maybe it should be a
> separate ptrace operation.

I really don't see why. If a controlling process is asking for
single-stepping, then it damn well should get it. It it doesn't want to
single-step through a signal handler, then it could decide to just put a
breakpoint on the return point (possibly by modifying the signal handler
save area).

It's not like single-stepping into the signal handler in any way removes
any information (while _not_ single-stepping into it clearly does).

With the patch I just posted (assuming it works for people), Wine should
at least have the choice. The behaviour now should be:

- if the app sets TF on its own, it will cause a SIGTRAP which it can
catch.
- if the debugger sets TF with SINGLESTEP, it will single-step into a
signal handler.
- it the app sets TF _and_ you ptrace it, you the ptracer will see the
debug event and catch it. However, doing a "continue" at that point
will remove the TF flag (and always has), the app will normally then
never see the trap. You can do a "signal SIGTRAP" to actually force
the trap handler to tun, but that one won't actually single-step (it's
a "continue" in all other senses).

It sounds like the third case is what wine wants.

Linus

2004-11-20 03:43:51

by Roland McGrath

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

> I'm getting the feeling that the question of whether to step into
> signal handlers is orthogonal to single-stepping;

No, it's not. You only ever step into a handler when you ask to.
That's already the interface.

> Platforms which don't implement PTRACE_SINGLESTEP would probably
> appreciate this. A "single step" which stops you after setting up the
> signal trampoline and adjusting the PC, before executing any
> instructions in the handler.

That's what PTRACE_SINGLESTEP with a nonzero signal number does since it
was fixed.

2004-11-20 21:47:41

by Jesse Allen

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Fri, Nov 19, 2004 at 01:53:38PM -0800, Linus Torvalds wrote:
>
>
> On Fri, 19 Nov 2004, Daniel Jacobowitz wrote:
> >
> > I'm getting the feeling that the question of whether to step into
> > signal handlers is orthogonal to single-stepping; maybe it should be a
> > separate ptrace operation.
>
> I really don't see why. If a controlling process is asking for
> single-stepping, then it damn well should get it. It it doesn't want to
> single-step through a signal handler, then it could decide to just put a
> breakpoint on the return point (possibly by modifying the signal handler
> save area).
>
> It's not like single-stepping into the signal handler in any way removes
> any information (while _not_ single-stepping into it clearly does).
>
> With the patch I just posted (assuming it works for people), Wine should
> at least have the choice. The behaviour now should be:
>
> - if the app sets TF on its own, it will cause a SIGTRAP which it can
> catch.
> - if the debugger sets TF with SINGLESTEP, it will single-step into a
> signal handler.
> - it the app sets TF _and_ you ptrace it, you the ptracer will see the
> debug event and catch it. However, doing a "continue" at that point
> will remove the TF flag (and always has), the app will normally then
> never see the trap. You can do a "signal SIGTRAP" to actually force
> the trap handler to tun, but that one won't actually single-step (it's
> a "continue" in all other senses).
>
> It sounds like the third case is what wine wants.
>
> Linus


So an app running through wine could set TF on it's own? It would be a
good idea to find out what it is doing in the first place. There has to be
a reason why War3 is so picky after the original change was introduced and
a reason why the latest changes don't seem to fix it.

I've built a kernel 2.6.10-rc2 with the new ptrace patch. The program still
says "please insert disc". I haven't been able to get winedbg to tell me
anything useful -- the program never crashes anyways. So I went ahead and I
captured a debug log.

the command:
WINEDEBUG=+all wine war3/Warcraft\ III.exe -opengl >& war3_and_2.6.10-rc2.log

I scanned for the part right before it calls up to display the error. I found
that after loading advapi32.dll, the thread 000c creates a mutex and wakes up
000a. Then 000c gets killed and right after that starts calling up the
resources for the "insert disc" message box. I put the log up on my ftp, and
the interesting part in a seperate file:
ftp://resnet.dnip.net/

Any clue on what may be happening here? Or maybe another idea on where else to search?


Jesse

2004-11-21 04:53:56

by Jesse Allen

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Sat, Nov 20, 2004 at 02:49:15PM -0700, Jesse Allen wrote:
> the interesting part in a seperate file:
> ftp://resnet.dnip.net/
>

I took another look at the section I found. It doesn't describe much, but it
shows "000c: *signal* signal=5" for example, which are probably SIGTRAP's.

I decided to capture a log running under a kernel before the change, and see
if I could find the same section again, based on the mutex name. Well I did,
and found alot more tracing. The thread 000c didn't get killed either so it
shows something is different. Of course under the old kernels I don't get the
"insert disc" message. I put up the working version log on the ftp.


Jesse

2004-11-21 21:33:01

by Davide Libenzi

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Fri, Nov 19, 2004 at 01:53:38PM -0800, Linus Torvalds wrote:
>
> On Fri, 19 Nov 2004, Daniel Jacobowitz wrote:
> >
> > I'm getting the feeling that the question of whether to step into
> > signal handlers is orthogonal to single-stepping; maybe it should be a
> > separate ptrace operation.
>
> I really don't see why. If a controlling process is asking for
> single-stepping, then it damn well should get it. It it doesn't want to
> single-step through a signal handler, then it could decide to just put a
> breakpoint on the return point (possibly by modifying the signal handler
> save area).

I'd agree with Linus here. A signal handler is part of the application, so
it should be single stepped in the same way other application code does.
My original patch simply reenabled the flag before returning to userspace,
and this had the consequence to single step into signal handlers too.



PS: I still cannot find the beginning this thread on lkml.


- Davide

2004-11-21 22:34:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Sun, 21 Nov 2004, Davide Libenzi wrote:
>
> I'd agree with Linus here. A signal handler is part of the application, so
> it should be single stepped in the same way other application code does.
> My original patch simply reenabled the flag before returning to userspace,
> and this had the consequence to single step into signal handlers too.

Hmmm.. I think I may have a test-case for the problem.

Lookie here:

#include <signal.h>
#include <sys/mman.h>

void function(void)
{
printf("Copy protected: ok\n");
}

void handler(int signo)
{
extern char smc;
smc++;
}

#define TF 0x100

int main(int argc, char **argv)
{
void (*fnp)(void);

signal(SIGTRAP, handler);
mprotect((void *)(0xfffff000 & (unsigned long)main), 4096, PROT_READ | PROT_WRITE);
asm volatile("pushfl ; orl %0,(%%esp) ; popfl"
: :"i" (TF):"memory");
asm volatile("pushfl ; andl %0,(%%esp) ; popfl"
: :"i" (~TF):"memory");
asm volatile("\nsmc:\n\t"
".byte 0xb7\n\t"
".long function"
:"=d" (fnp));
fnp();
exit(1);
}

Compile it, run it, and it should say

Copy protected: ok

Now, try to "strace" it, or debug it with gdb, and see if you can repeat
the behaviour.

Roland? Think of it as a challenge,

Linus

2004-11-21 23:15:49

by Davide Libenzi

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Sun, 21 Nov 2004, Linus Torvalds wrote:

> void handler(int signo)
> {
> extern char smc;
> smc++;
> }
>
> asm volatile("\nsmc:\n\t"
> ".byte 0xb7\n\t"
> ".long function"
> :"=d" (fnp));
> fnp();

You know you're sick, don't you? Making traps inc's to get you in the
correct opcode to move function in edx->fnp, is indeed fruit of a sick
mind :=)



- Davide

2004-11-22 00:16:53

by Andreas Schwab

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

Linus Torvalds <[email protected]> writes:

> Now, try to "strace" it, or debug it with gdb, and see if you can repeat
> the behaviour.

You'll always have hard time repeating that under strace or gdb, since a
debugger uses SIGTRAP for it's own purpose and does not pass it to the
program.

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux Products GmbH, Maxfeldstra?e 5, 90409 N?rnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2004-11-22 01:12:22

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Mon, 22 Nov 2004, Andreas Schwab wrote:
>
> Linus Torvalds <[email protected]> writes:
>
> > Now, try to "strace" it, or debug it with gdb, and see if you can repeat
> > the behaviour.
>
> You'll always have hard time repeating that under strace or gdb, since a
> debugger uses SIGTRAP for it's own purpose and does not pass it to the
> program.

Sure. But "hard time" and "impossible" are two different things.

It _should_ be perfectly easy to debug this, by using

signal SIGTRAP

instead of "continue" when you get a SIGTRAP that wasn't due to anything
you did.

But try it. It doesn't work. Why? Because the kernel will have cleared TF
on the signal stack, so even when you do a "signal SIGTRAP", it will only
run the trap handler _once_, even though it should run it three times
(once for each instruction in between the "popfl"s.

I do think this is actually a bug, although whether it's the bug that
causes problems for Wine is not clear at all. I'mm too lazy to build
and boot an older kernel, but I bet that on an older kernel you _can_ do
"signal SIGTRAP" three times, and it will work correctly. That would
indeed make this a regression.

Linus

2004-11-22 01:14:50

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Sun, 21 Nov 2004, Davide Libenzi wrote:
>
> You know you're sick, don't you? Making traps inc's to get you in the
> correct opcode to move function in edx->fnp, is indeed fruit of a sick
> mind :=)

I prefer "creative" over "sick". It's so much less judgemental.

I thought it was a fun way to illustrate the issue, and I'm sure somebody
had fun trying to figure out what the thing did.

I could have _obfuscated_ the thing if I had wanted to make it hard to
follow, but instead I tried to make it as obvious as possible so that it's
quite clear from reading the code what it actually does.

But imagine hitting that thing without seeing the source code. NOT a lot
of fun to debug, even if the debugger _worked_ on it.

Linus

2004-11-22 04:06:43

by Davide Libenzi

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Sun, 21 Nov 2004, Linus Torvalds wrote:

> On Mon, 22 Nov 2004, Andreas Schwab wrote:
> >
> > Linus Torvalds <[email protected]> writes:
> >
> > > Now, try to "strace" it, or debug it with gdb, and see if you can repeat
> > > the behaviour.
> >
> > You'll always have hard time repeating that under strace or gdb, since a
> > debugger uses SIGTRAP for it's own purpose and does not pass it to the
> > program.
>
> Sure. But "hard time" and "impossible" are two different things.
>
> It _should_ be perfectly easy to debug this, by using
>
> signal SIGTRAP
>
> instead of "continue" when you get a SIGTRAP that wasn't due to anything
> you did.
>
> But try it. It doesn't work. Why? Because the kernel will have cleared TF
> on the signal stack, so even when you do a "signal SIGTRAP", it will only
> run the trap handler _once_, even though it should run it three times
> (once for each instruction in between the "popfl"s.
>
> I do think this is actually a bug, although whether it's the bug that
> causes problems for Wine is not clear at all. I'mm too lazy to build
> and boot an older kernel, but I bet that on an older kernel you _can_ do
> "signal SIGTRAP" three times, and it will work correctly. That would
> indeed make this a regression.

Hmmm ..., this is 2.4.27:


[davide@bigblue davide]$ gcc -g -o zzzzzzzz zzzzzzzz.c
[davide@bigblue davide]$ ./zzzzzzzz
Copy protected: ok
[davide@bigblue davide]$ gdb ./zzzzzzzz
GNU gdb Red Hat Linux (5.3.90-0.20030710.41rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db
library "/lib/libthread_db.so.1".

(gdb) r
Starting program: /home/davide/zzzzzzzz

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048454 in main (argc=1, argv=0xbffff9c4) at zzzzzzzz.c:26
26 asm volatile("pushfl ; andl %0,(%%esp) ; popfl"
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.

Program received signal SIGSEGV, Segmentation fault.
0x00000003 in ?? ()
(gdb) bt
#0 0x00000003 in ?? ()
#1 0x0804846b in smc () at zzzzzzzz.c:32
#2 0x46649b7f in __libc_start_main () from /lib/i686/libc.so.6
#3 0x08048359 in _start ()


So it seems it did not work even before, the gdb-SIGTRAP stepping. In
2.6.8 I get a straight segfault just for running it.



- Davide

2004-11-22 04:29:52

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Sun, 21 Nov 2004, Davide Libenzi wrote:
>
> So it seems it did not work even before, the gdb-SIGTRAP stepping. In
> 2.6.8 I get a straight segfault just for running it.

Ok, that at least means it's not a regression, although it may be that the
altered behaviour is enough to make some program work/not-work depending
on exactly what it is testing. My example is certainly not the only way to
try to mess up a debugger.

I'm by no means 100% sure that we should encourage the kind of programming
"skills" I showed with that example program, so in that sense this may not
be worth worrying about. That said, I do hate the notion of having
programs that are basically undebuggable, so from a QoI standpoint I'd
really like to say that you can run my horrid little program under the
debugger and see it work...

Linus

2004-11-22 06:25:37

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Sun, 21 Nov 2004, Linus Torvalds wrote:
>
> I'm by no means 100% sure that we should encourage the kind of programming
> "skills" I showed with that example program, so in that sense this may not
> be worth worrying about. That said, I do hate the notion of having
> programs that are basically undebuggable, so from a QoI standpoint I'd
> really like to say that you can run my horrid little program under the
> debugger and see it work...

Ok, how about this patch?

It does basically two things:

- it makes the x86 version of ptrace be a lot more careful about the TF
bit in eflags, and in particular it never touches it _unless_ the
tracer has explicitly asked for it (ie we set TF only when doing a
PTRACE_SINGESTEP, and we clear it only when it has been set by us, not
if it has been set by the program itself).

This patch also cleans up the codepaths by doing all the common stuff
in set_singlestep()/clear_singlestep().

- It clarifies signal handling, and makes it clear that we always push
the full eflags onto the signal stack, _except_ if the TF bit was set
by an external ptrace user, in which case we hide it so that the tracee
doesn't see it when it looks at its stack contents.

It also adds a few comments, and makes it clear that the signal handler
itself is always set up with TF _clear_. But if we were single-stepped
into it, we will have notified the debugger, so the debugger obviously
can (and often will) decide to continue single-stepping.

IMHO, this is a nice cleanup, and it also means that I can actually debug
my "program from hell":

[torvalds@evo ~]$ gdb ./a.out
GNU gdb Red Hat Linux (6.1post-1.20040607.41rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) run
Starting program: /home/torvalds/a.out
Reading symbols from shared object read from target memory...(no debugging
symbols found)...done.
Loaded system supplied DSO at 0xffffe000
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048480 in main ()
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048487 in main ()
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048488 in smc ()
(gdb) signal SIGTRAP
Continuing with signal SIGTRAP.
Copy protected: ok

Program exited with code 01.
(gdb)

which I think is a sign that this patch actually fixes ptrace.

Does this help with wine? I dunno. Maybe some wine people can comment..

Roland, mind take a look? I assume you have some gdb test-suite that you
use to test the things?

Linus

----

===== arch/i386/kernel/ptrace.c 1.27 vs edited =====
--- 1.27/arch/i386/kernel/ptrace.c 2004-11-07 18:10:34 -08:00
+++ edited/arch/i386/kernel/ptrace.c 2004-11-21 21:34:58 -08:00
@@ -138,6 +138,28 @@
return retval;
}

+static void set_singlestep(struct task_struct *child)
+{
+ long eflags;
+
+ set_tsk_thread_flag(child, TIF_SINGLESTEP);
+ eflags = get_stack_long(child, EFL_OFFSET);
+ put_stack_long(child, EFL_OFFSET, eflags | TRAP_FLAG);
+ child->ptrace |= PT_DTRACE;
+}
+
+static void clear_singlestep(struct task_struct *child)
+{
+ if (child->ptrace & PT_DTRACE) {
+ long eflags;
+
+ clear_tsk_thread_flag(child, TIF_SINGLESTEP);
+ eflags = get_stack_long(child, EFL_OFFSET);
+ put_stack_long(child, EFL_OFFSET, eflags & ~TRAP_FLAG);
+ child->ptrace &= ~PT_DTRACE;
+ }
+}
+
/*
* Called by kernel/ptrace.c when detaching..
*
@@ -145,11 +167,7 @@
*/
void ptrace_disable(struct task_struct *child)
{
- long tmp;
-
- clear_tsk_thread_flag(child, TIF_SINGLESTEP);
- tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
- put_stack_long(child, EFL_OFFSET, tmp);
+ clear_singlestep(child);
}

/*
@@ -388,10 +406,8 @@
}
break;

- case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall */
- case PTRACE_CONT: { /* restart after signal. */
- long tmp;
-
+ case PTRACE_SYSCALL: /* continue and stop at next (return from) syscall */
+ case PTRACE_CONT: /* restart after signal. */
ret = -EIO;
if ((unsigned long) data > _NSIG)
break;
@@ -401,56 +417,39 @@
else {
clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
}
- clear_tsk_thread_flag(child, TIF_SINGLESTEP);
child->exit_code = data;
- /* make sure the single step bit is not set. */
- tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
- put_stack_long(child, EFL_OFFSET,tmp);
+ /* make sure the single step bit is not set. */
+ clear_singlestep(child);
wake_up_process(child);
ret = 0;
break;
- }

/*
* make the child exit. Best I can do is send it a sigkill.
* perhaps it should be put in the status that it wants to
* exit.
*/
- case PTRACE_KILL: {
- long tmp;
-
+ case PTRACE_KILL:
ret = 0;
if (child->exit_state == EXIT_ZOMBIE) /* already dead */
break;
child->exit_code = SIGKILL;
- clear_tsk_thread_flag(child, TIF_SINGLESTEP);
/* make sure the single step bit is not set. */
- tmp = get_stack_long(child, EFL_OFFSET) & ~TRAP_FLAG;
- put_stack_long(child, EFL_OFFSET, tmp);
+ clear_singlestep(child);
wake_up_process(child);
break;
- }
-
- case PTRACE_SINGLESTEP: { /* set the trap flag. */
- long tmp;

+ case PTRACE_SINGLESTEP: /* set the trap flag. */
ret = -EIO;
if ((unsigned long) data > _NSIG)
break;
clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
- if ((child->ptrace & PT_DTRACE) == 0) {
- /* Spurious delayed TF traps may occur */
- child->ptrace |= PT_DTRACE;
- }
- tmp = get_stack_long(child, EFL_OFFSET) | TRAP_FLAG;
- put_stack_long(child, EFL_OFFSET, tmp);
- set_tsk_thread_flag(child, TIF_SINGLESTEP);
+ set_singlestep(child);
child->exit_code = data;
/* give it a chance to run. */
wake_up_process(child);
ret = 0;
break;
- }

case PTRACE_DETACH:
/* detach a process that was attached. */
===== arch/i386/kernel/signal.c 1.48 vs edited =====
--- 1.48/arch/i386/kernel/signal.c 2004-11-15 00:56:24 -08:00
+++ edited/arch/i386/kernel/signal.c 2004-11-21 21:33:21 -08:00
@@ -292,10 +292,15 @@
err |= __put_user(current->thread.error_code, &sc->err);
err |= __put_user(regs->eip, &sc->eip);
err |= __put_user(regs->xcs, (unsigned int __user *)&sc->cs);
+
+ /*
+ * Iff TF was set because the program is being single-stepped by a
+ * debugger, don't save that information on the signal stack.. We
+ * don't want debugging to change state.
+ */
eflags = regs->eflags;
- if (current->ptrace & PT_PTRACED) {
+ if (current->ptrace & PT_DTRACE)
eflags &= ~TF_MASK;
- }
err |= __put_user(eflags, &sc->eflags);
err |= __put_user(regs->esp, &sc->esp_at_signal);
err |= __put_user(regs->xss, (unsigned int __user *)&sc->ss);
@@ -412,12 +417,17 @@
regs->xes = __USER_DS;
regs->xss = __USER_DS;
regs->xcs = __USER_CS;
+
+ /*
+ * Clear TF when entering the signal handler, but
+ * notify any tracer that was single-stepping it.
+ * The tracer may want to single-step inside the
+ * handler too.
+ */
if (regs->eflags & TF_MASK) {
- if ((current->ptrace & (PT_PTRACED | PT_DTRACE)) == (PT_PTRACED | PT_DTRACE)) {
+ regs->eflags &= ~TF_MASK;
+ if (current->ptrace & PT_DTRACE)
ptrace_notify(SIGTRAP);
- } else {
- regs->eflags &= ~TF_MASK;
- }
}

#if DEBUG_SIG
@@ -502,12 +512,17 @@
regs->xes = __USER_DS;
regs->xss = __USER_DS;
regs->xcs = __USER_CS;
+
+ /*
+ * Clear TF when entering the signal handler, but
+ * notify any tracer that was single-stepping it.
+ * The tracer may want to single-step inside the
+ * handler too.
+ */
if (regs->eflags & TF_MASK) {
- if (current->ptrace & PT_PTRACED) {
+ regs->eflags &= ~TF_MASK;
+ if (current->ptrace & PT_DTRACE)
ptrace_notify(SIGTRAP);
- } else {
- regs->eflags &= ~TF_MASK;
- }
}

#if DEBUG_SIG

2004-11-22 11:12:31

by Andreas Schwab

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

Linus Torvalds <[email protected]> writes:

> IMHO, this is a nice cleanup, and it also means that I can actually debug
> my "program from hell":

Does it also work when trying to single step over it? I guess all bets
are off then.

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux Products GmbH, Maxfeldstra?e 5, 90409 N?rnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2004-11-22 13:50:16

by Davide Libenzi

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Sun, 21 Nov 2004, Linus Torvalds wrote:

> Ok, how about this patch?
>
> It does basically two things:
>
> - it makes the x86 version of ptrace be a lot more careful about the TF
> bit in eflags, and in particular it never touches it _unless_ the
> tracer has explicitly asked for it (ie we set TF only when doing a
> PTRACE_SINGESTEP, and we clear it only when it has been set by us, not
> if it has been set by the program itself).
>
> This patch also cleans up the codepaths by doing all the common stuff
> in set_singlestep()/clear_singlestep().
>
> - It clarifies signal handling, and makes it clear that we always push
> the full eflags onto the signal stack, _except_ if the TF bit was set
> by an external ptrace user, in which case we hide it so that the tracee
> doesn't see it when it looks at its stack contents.
>
> It also adds a few comments, and makes it clear that the signal handler
> itself is always set up with TF _clear_. But if we were single-stepped
> into it, we will have notified the debugger, so the debugger obviously
> can (and often will) decide to continue single-stepping.

Looks like a nice cleanup. What does the test program below print for you?



- Davide



#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <linux/user.h>
#include <linux/unistd.h>


int main(int ac, char **av) {
int i, status, res;
long start, end;
pid_t cpid, pid;
struct user_regs_struct ur;
struct sigaction sa;

sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sa.sa_handler = SIG_DFL;
sigaction(SIGCHLD, &sa, NULL);

printf("nargs=%d\n", ac);
if (ac == 1)
goto tracer;

printf("arg=%s\n", av[1]);
loop:
__asm__ volatile ("int $0x80"
: "=a" (res)
: "0" (__NR_getpid));
goto loop;
endloop:
exit(0);


tracer:
if ((cpid = fork()) != 0)
goto parent;

printf("child=%d\n", getpid());
ptrace(PTRACE_TRACEME, 0, NULL, NULL);

execl(av[0], av[0], "child", NULL);

exit(0);

parent:
start = (long) &&loop;
end = (long) &&endloop;

printf("pchild=%d\n", cpid);

for (;;) {
pid = wait(&status);
if (pid != cpid)
continue;
res = WSTOPSIG(status);
if (ptrace(PTRACE_GETREGS, pid, NULL, &ur)) {
printf("[%d] error: ptrace(PTRACE_GETREGS, %d)\n",
pid, pid);
return 1;
}

if (ptrace(PTRACE_SINGLESTEP, pid, NULL, res != SIGTRAP ? res: 0)) {
perror("ptrace(PTRACE_SINGLESTEP)");
return 1;
}

if (ur.eip >= start && ur.eip <= end)
break;
}


for (i = 0; i < 15; i++) {
printf("waiting ...\n");
pid = wait(&status);
printf("done: pid=%d status=%d\n", pid, status);
if (pid != cpid)
continue;
res = WSTOPSIG(status);
printf("sig=%d\n", res);
if (ptrace(PTRACE_GETREGS, pid, NULL, &ur)) {
printf("[%d] error: ptrace(PTRACE_GETREGS, %d)\n",
pid, pid);
return 1;
}

printf("EIP=0x%08x\n", ur.eip);

if (ptrace(PTRACE_SINGLESTEP, pid, NULL, res != SIGTRAP ? res: 0)) {
perror("ptrace(PTRACE_SINGLESTEP)");
return 1;
}
}

if (ptrace(PTRACE_CONT, cpid, NULL, SIGKILL)) {
perror("ptrace(PTRACE_SINGLESTEP)");
return 1;
}

return 0;
}


2004-11-22 16:45:34

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Mon, 22 Nov 2004, Andreas Schwab wrote:

> Linus Torvalds <[email protected]> writes:
>
> > IMHO, this is a nice cleanup, and it also means that I can actually debug
> > my "program from hell":
>
> Does it also work when trying to single step over it? I guess all bets
> are off then.

If you single-step over the "popfl", then you need to generate the
SIGTRAP's by hand too. IOW, it's _possible_ to emulate the behaviour from
within the debugger, but it gets really really nasty very quickly.

I think the nastyness in that case is at least acceptable, since if you
single-step, you actually _see_ what is happening, and thus you have a
chance in hell of figuring it out. Practical? No. But debuggable at least
in theory, which it really wasn't before.

Linus

2004-11-22 21:14:29

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Mon, 22 Nov 2004, Eric Pouech wrote:
>
> For the linux folks, here a small comparison of what happens in the working
> (old) case and in the non-working (new) case:
>
> In both case
>
> - Wine gets a first SIGTRAP (in it's sig_trap handler)
> + Wine converts it into a Windows exception (w-exception in short).
> This includes creating a context for the generic CPU registers
> + This w-exception is sent to the w-exception handler the program
> installed (this one can modifiy the all registers)
> o this handler touches one of the i386 debug registers
> + as we need to update the debug registers values (and we don't do in
> the signal handler return), this task is delegated to the Wine server
> (our central process, which is in charge of synchronisation...)
> > the Wine server ptrace-attach:es to the process which handled
> the SIGTRAP.
> > Wine server wait4:s on the SIGSTOP (after ptrace:attach)
> > modify (with ptrace) the debug registers
> > and resumes excution (ptrace: cont)
> + wine terminates the sig trap handler and resumes the execution with
> the modified basic registers (from the saved context), and the
> modified debug registers (from the Wine server round trip)
> - a second sig trap is generated
> > since the wine server is still ptrace:attached, it gets the signal.
>
> What differs then in both execution:
> - in the working case, the sig trap handler is called on the client side,
> whereas it's never called in the non-working case. We do have a couple of
> protection (to avoid some misbehaving apps), but none of them get triggered. So
> it seems like the trap handler is not called (ugh).

This actually implies that the current -bk tree with my latest patch may
actually fix it.

One of the things that 2.6.9 did wrong was exactly that it cleared TF too
much in the ptrace interface. The current development tree is a lot more
careful about that, and it fixes the horrid test-case that I used to debug
it. The test-case (when run under gdb) actually does something similar to
what Wine appears to do.

> - in Windows trap handling, the TF is explictly reset before calling the windows
> exception handler (which is what Wine does, before calling the window exception
> handler). Of course the handler can set it back if it wants to continue single
> stepping.

TF will be still set in Linux when ptrace gets access, but the ptracer can
choose to clear it with PTRACE_PEEKUSR/PTRACE_POKEUSR (or with
PTRACE_GETREGS/SETREGS). I assume you already do that, since I think that
has been true forever (although maybe you don't: PTRACE_CONTINUE used to
unconditionally clear TF, so it may be that Wine may need some minor
modification to not do that - but the good news is that mod should be
backwards-compatible, so it should be pretty easy).

I actually broke down and am downloading the latest source tree of wine,
let's see if I can find the place you do this.

Linus

2004-11-22 20:55:45

by Eric Pouech

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

Jesse Allen a ?crit :
> On Fri, Nov 19, 2004 at 01:53:38PM -0800, Linus Torvalds wrote:
>
>>
>>On Fri, 19 Nov 2004, Daniel Jacobowitz wrote:
>>
>>>I'm getting the feeling that the question of whether to step into
>>>signal handlers is orthogonal to single-stepping; maybe it should be a
>>>separate ptrace operation.
>>
>>I really don't see why. If a controlling process is asking for
>>single-stepping, then it damn well should get it. It it doesn't want to
>>single-step through a signal handler, then it could decide to just put a
>>breakpoint on the return point (possibly by modifying the signal handler
>>save area).
>>
>>It's not like single-stepping into the signal handler in any way removes
>>any information (while _not_ single-stepping into it clearly does).
>>
>>With the patch I just posted (assuming it works for people), Wine should
>>at least have the choice. The behaviour now should be:
>>
>> - if the app sets TF on its own, it will cause a SIGTRAP which it can
>> catch.
>> - if the debugger sets TF with SINGLESTEP, it will single-step into a
>> signal handler.
>> - it the app sets TF _and_ you ptrace it, you the ptracer will see the
>> debug event and catch it. However, doing a "continue" at that point
>> will remove the TF flag (and always has), the app will normally then
>> never see the trap. You can do a "signal SIGTRAP" to actually force
>> the trap handler to tun, but that one won't actually single-step (it's
>> a "continue" in all other senses).
>>
>>It sounds like the third case is what wine wants.
>>
>> Linus
>
>
>
> So an app running through wine could set TF on it's own? It would be a
> good idea to find out what it is doing in the first place. There has to be
> a reason why War3 is so picky after the original change was introduced and
> a reason why the latest changes don't seem to fix it.
>
> I've built a kernel 2.6.10-rc2 with the new ptrace patch. The program still
> says "please insert disc". I haven't been able to get winedbg to tell me
> anything useful -- the program never crashes anyways. So I went ahead and I
> captured a debug log.
>
> the command:
> WINEDEBUG=+all wine war3/Warcraft\ III.exe -opengl >& war3_and_2.6.10-rc2.log
>
> I scanned for the part right before it calls up to display the error. I found
> that after loading advapi32.dll, the thread 000c creates a mutex and wakes up
> 000a. Then 000c gets killed and right after that starts calling up the
> resources for the "insert disc" message box. I put the log up on my ftp, and
> the interesting part in a seperate file:
> ftp://resnet.dnip.net/
>
> Any clue on what may be happening here? Or maybe another idea on where else to search?
>
>
> Jesse
>
>
>
For the linux folks, here a small comparison of what happens in the working
(old) case and in the non-working (new) case:

In both case

- Wine gets a first SIGTRAP (in it's sig_trap handler)
+ Wine converts it into a Windows exception (w-exception in short).
This includes creating a context for the generic CPU registers
+ This w-exception is sent to the w-exception handler the program
installed (this one can modifiy the all registers)
o this handler touches one of the i386 debug registers
+ as we need to update the debug registers values (and we don't do in
the signal handler return), this task is delegated to the Wine server
(our central process, which is in charge of synchronisation...)
> the Wine server ptrace-attach:es to the process which handled
the SIGTRAP.
> Wine server wait4:s on the SIGSTOP (after ptrace:attach)
> modify (with ptrace) the debug registers
> and resumes excution (ptrace: cont)
+ wine terminates the sig trap handler and resumes the execution with
the modified basic registers (from the saved context), and the
modified debug registers (from the Wine server round trip)
- a second sig trap is generated
> since the wine server is still ptrace:attached, it gets the signal.

What differs then in both execution:
- in the working case, the sig trap handler is called on the client side,
whereas it's never called in the non-working case. We do have a couple of
protection (to avoid some misbehaving apps), but none of them get triggered. So
it seems like the trap handler is not called (ugh).

A couple of notes:
- as the program tested is copy protected, and as it seems that the copy
protection is what causes the harm, it can be interesting to know that the
programe seems to set the TF flag (some copy protection schemes directly do an
"int 1", but given the exception code we get, this isn't the case).
- in Windows trap handling, the TF is explictly reset before calling the windows
exception handler (which is what Wine does, before calling the window exception
handler). Of course the handler can set it back if it wants to continue single
stepping.

HTH
A+

2004-11-22 22:20:37

by Mike Hearn

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Mon, 2004-11-22 at 13:10 -0800, Linus Torvalds wrote:
> I actually broke down and am downloading the latest source tree of wine,
> let's see if I can find the place you do this.

The Wine source tree is organised in the same way Windows is, which is
bizarre and unintuitive but we don't really have much choice. So here
are the files you'd be looking for.

this is where signals are converted to SEH exceptions (w-exceptions as
Eric called them):

dlls/ntdll/signal_i386.c

this is where the wineserver does context related things:

server/context_i386.c

this is where the server does ptracing:

server/ptrace.c

There may be one or two other places that are related, I only ever
looked at this code to deal with some other copy protection system that
wasn't happy (not kernels fault though).

thanks -mike

--
Mike Hearn <[email protected]>
Codeweavers, Inc

2004-11-22 22:27:33

by Linus Torvalds

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine



On Mon, 22 Nov 2004, Mike Hearn wrote:
>
> this is where signals are converted to SEH exceptions (w-exceptions as
> Eric called them):
>
> dlls/ntdll/signal_i386.c

Looks like it clears TF there already:

if (context->EFlags & 0x100)
{
context->EFlags &= ~0x100; /* clear single-step flag */
}

so I'll just assume it's ok.

Linus

2004-11-22 23:51:02

by Jesse Allen

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Mon, Nov 22, 2004 at 04:15:21PM -0700, Jesse Allen wrote:
>
> For the wine app in question, it does make a difference. However, there is
> a new problem. The program gets stuck at the loading screen at 100% CPU.
> It's making a call to select, timing out, and then tries select again,
> repeats. It's waiting for something that seems to never happen.
>
> I've captured a log, "loop.log.bz2", and shortened to have only the relevent
> information after the CMS32_MUTEX is created. Look for occurances of
> "select()" -- I think the second one is where it starts. It's on my ftp if
> anyone wants to take a look. It probably can be compared to the working-
> version log where this doesn't occur, but it might be a pain to spot the
> working particular instance.
>
>


Actually it's pretty obvious. In the working version, it's supposed to be
getting SIGTRAPs while it's calling select(). So something's up there. But
it's doing part of what it should be doing now.


2004-11-22 23:19:48

by Jesse Allen

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

On Sun, Nov 21, 2004 at 10:23:41PM -0800, Linus Torvalds wrote:
>
> Ok, how about this patch?
>
> It does basically two things:
>
> - it makes the x86 version of ptrace be a lot more careful about the TF
> bit in eflags, and in particular it never touches it _unless_ the
> tracer has explicitly asked for it (ie we set TF only when doing a
> PTRACE_SINGESTEP, and we clear it only when it has been set by us, not
> if it has been set by the program itself).
>
> This patch also cleans up the codepaths by doing all the common stuff
> in set_singlestep()/clear_singlestep().
>
> - It clarifies signal handling, and makes it clear that we always push
> the full eflags onto the signal stack, _except_ if the TF bit was set
> by an external ptrace user, in which case we hide it so that the tracee
> doesn't see it when it looks at its stack contents.
>
> It also adds a few comments, and makes it clear that the signal handler
> itself is always set up with TF _clear_. But if we were single-stepped
> into it, we will have notified the debugger, so the debugger obviously
> can (and often will) decide to continue single-stepping.
>
> IMHO, this is a nice cleanup, and it also means that I can actually debug
> my "program from hell":
>
> [torvalds@evo ~]$ gdb ./a.out
> GNU gdb Red Hat Linux (6.1post-1.20040607.41rh)
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
> found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
>
> (gdb) run
> Starting program: /home/torvalds/a.out
> Reading symbols from shared object read from target memory...(no debugging
> symbols found)...done.
> Loaded system supplied DSO at 0xffffe000
> (no debugging symbols found)...(no debugging symbols found)...
> Program received signal SIGTRAP, Trace/breakpoint trap.
> 0x08048480 in main ()
> (gdb) signal SIGTRAP
> Continuing with signal SIGTRAP.
>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> 0x08048487 in main ()
> (gdb) signal SIGTRAP
> Continuing with signal SIGTRAP.
>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> 0x08048488 in smc ()
> (gdb) signal SIGTRAP
> Continuing with signal SIGTRAP.
> Copy protected: ok
>
> Program exited with code 01.
> (gdb)
>
> which I think is a sign that this patch actually fixes ptrace.
>
> Does this help with wine? I dunno. Maybe some wine people can comment..
>


For the wine app in question, it does make a difference. However, there is
a new problem. The program gets stuck at the loading screen at 100% CPU.
It's making a call to select, timing out, and then tries select again,
repeats. It's waiting for something that seems to never happen.

I've captured a log, "loop.log.bz2", and shortened to have only the relevent
information after the CMS32_MUTEX is created. Look for occurances of
"select()" -- I think the second one is where it starts. It's on my ftp if
anyone wants to take a look. It probably can be compared to the working-
version log where this doesn't occur, but it might be a pain to spot the
working particular instance.


Jesse

2004-11-28 17:04:14

by Eric Pouech

[permalink] [raw]
Subject: Re: ptrace single-stepping change breaks Wine

Jesse Allen a ?crit :
> On Sun, Nov 21, 2004 at 10:23:41PM -0800, Linus Torvalds wrote:
>
>>Ok, how about this patch?
>>
>>It does basically two things:
>>
>> - it makes the x86 version of ptrace be a lot more careful about the TF
>> bit in eflags, and in particular it never touches it _unless_ the
>> tracer has explicitly asked for it (ie we set TF only when doing a
>> PTRACE_SINGESTEP, and we clear it only when it has been set by us, not
>> if it has been set by the program itself).
>>
>> This patch also cleans up the codepaths by doing all the common stuff
>> in set_singlestep()/clear_singlestep().
>>
>> - It clarifies signal handling, and makes it clear that we always push
>> the full eflags onto the signal stack, _except_ if the TF bit was set
>> by an external ptrace user, in which case we hide it so that the tracee
>> doesn't see it when it looks at its stack contents.
>>
>> It also adds a few comments, and makes it clear that the signal handler
>> itself is always set up with TF _clear_. But if we were single-stepped
>> into it, we will have notified the debugger, so the debugger obviously
>> can (and often will) decide to continue single-stepping.
>>
>>IMHO, this is a nice cleanup, and it also means that I can actually debug
>>my "program from hell":
>>
>> [torvalds@evo ~]$ gdb ./a.out
>> GNU gdb Red Hat Linux (6.1post-1.20040607.41rh)
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols
>> found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
>>
>> (gdb) run
>> Starting program: /home/torvalds/a.out
>> Reading symbols from shared object read from target memory...(no debugging
>> symbols found)...done.
>> Loaded system supplied DSO at 0xffffe000
>> (no debugging symbols found)...(no debugging symbols found)...
>> Program received signal SIGTRAP, Trace/breakpoint trap.
>> 0x08048480 in main ()
>> (gdb) signal SIGTRAP
>> Continuing with signal SIGTRAP.
>>
>> Program received signal SIGTRAP, Trace/breakpoint trap.
>> 0x08048487 in main ()
>> (gdb) signal SIGTRAP
>> Continuing with signal SIGTRAP.
>>
>> Program received signal SIGTRAP, Trace/breakpoint trap.
>> 0x08048488 in smc ()
>> (gdb) signal SIGTRAP
>> Continuing with signal SIGTRAP.
>> Copy protected: ok
>>
>> Program exited with code 01.
>> (gdb)
>>
>>which I think is a sign that this patch actually fixes ptrace.
>>
>>Does this help with wine? I dunno. Maybe some wine people can comment..
>>
>
>
>
> For the wine app in question, it does make a difference. However, there is
> a new problem. The program gets stuck at the loading screen at 100% CPU.
> It's making a call to select, timing out, and then tries select again,
> repeats. It's waiting for something that seems to never happen.
>
> I've captured a log, "loop.log.bz2", and shortened to have only the relevent
> information after the CMS32_MUTEX is created. Look for occurances of
> "select()" -- I think the second one is where it starts. It's on my ftp if
> anyone wants to take a look. It probably can be compared to the working-
> version log where this doesn't occur, but it might be a pain to spot the
> working particular instance.

Hi Jesse
Any network issue on your side? I tried to traceroute resnet.disp.net, but no avail.
So I cannot have a look to you newest trace.

A+